WO2024055805A1 - Data retrieval method, image data retrieval method, and apparatus - Google Patents
Data retrieval method, image data retrieval method, and apparatus
- Publication number
- WO2024055805A1 (PCT/CN2023/113590)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- disease
- hash code
- image sample
- text
- image
- Prior art date
Links
- 238000000034 method Methods 0.000 title claims abstract description 116
- 238000000605 extraction Methods 0.000 claims abstract description 226
- 238000004590 computer program Methods 0.000 claims abstract description 34
- 201000010099 disease Diseases 0.000 claims description 409
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 claims description 409
- 238000003745 diagnosis Methods 0.000 claims description 168
- 238000012549 training Methods 0.000 claims description 66
- 238000003384 imaging method Methods 0.000 claims description 22
- 230000003902 lesion Effects 0.000 claims description 14
- 239000000284 extract Substances 0.000 claims description 7
- 206010035664 Pneumonia Diseases 0.000 description 37
- 238000010586 diagram Methods 0.000 description 10
- 230000006870 function Effects 0.000 description 9
- 238000004891 communication Methods 0.000 description 7
- 230000008569 process Effects 0.000 description 7
- 238000012545 processing Methods 0.000 description 4
- 241001465754 Metazoa Species 0.000 description 3
- 238000004364 calculation method Methods 0.000 description 3
- 230000000052 comparative effect Effects 0.000 description 3
- 230000002457 bidirectional effect Effects 0.000 description 2
- 230000008859 change Effects 0.000 description 2
- 238000002591 computed tomography Methods 0.000 description 2
- 238000002059 diagnostic imaging Methods 0.000 description 2
- 238000005516 engineering process Methods 0.000 description 2
- 238000002372 labelling Methods 0.000 description 2
- 239000000463 material Substances 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 208000023146 Pre-existing disease Diseases 0.000 description 1
- 230000004913 activation Effects 0.000 description 1
- 238000003491 array Methods 0.000 description 1
- 238000013473 artificial intelligence Methods 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 238000013500 data storage Methods 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 230000014509 gene expression Effects 0.000 description 1
- 230000036541 health Effects 0.000 description 1
- 230000010354 integration Effects 0.000 description 1
- 238000002595 magnetic resonance imaging Methods 0.000 description 1
- 238000009206 nuclear medicine Methods 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 230000002093 peripheral effect Effects 0.000 description 1
- 230000001737 promoting effect Effects 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 238000002604 ultrasonography Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H10/00—ICT specially adapted for the handling or processing of patient-related medical or healthcare data
- G16H10/60—ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Definitions
- The present disclosure relates to the fields of artificial intelligence and medical/health technology, and specifically to a data retrieval method, an image data retrieval method, devices for the above methods, a storage medium, a computer program product, and a computer program.
- Hash codes are very effective in multi-modal retrieval applications. For example, in the medical field, text information can be used to search a database for medical imaging data that matches the text description, or medical imaging data can be used to query the database for the corresponding disease diagnosis text.
- a pre-trained hash code extraction model is usually used to process the data to be retrieved (such as pictures or text) to obtain the hash code corresponding to the data to be retrieved. How to make the hash code extraction model accurately determine the hash code corresponding to the data to be retrieved is very important for multi-modal retrieval.
- Sample data is usually used to train a hash code extraction model. For example, in the medical field, professional doctors can precisely annotate image data and the corresponding disease diagnosis texts, and the hash code extraction model is then trained on the annotated sample data. However, manually labeling sample data results in high training costs for the hash code extraction model, which in turn makes the data retrieval process cumbersome, complex, and costly.
- the present disclosure proposes a data retrieval method, an image data retrieval method, and devices, storage media, computer program products, and computer programs for the above methods.
- An embodiment of the present disclosure proposes a hash retrieval method based on a hash code extraction model.
- The method includes: obtaining data to be retrieved of the target part, wherein the modality of the data to be retrieved is an image modality or a text modality; inputting the data to be retrieved into the hash code extraction model to obtain the target hash code corresponding to the data to be retrieved; and obtaining, from a database whose modality differs from that of the data to be retrieved, the retrieval result matching the target hash code. The hash code extraction model is obtained through the following steps: obtaining the normal image sample, the first disease image sample, and the second disease image sample of the target part, as well as the disease diagnosis text corresponding to the first disease image sample, wherein the disease names corresponding to the first disease image sample, the second disease image sample, and the disease diagnosis text are the same; determining the respective hash codes of the normal image sample, the first disease image sample, the second disease image sample, and the disease diagnosis text through the hash code extraction model; determining the cross-modal contrast loss value of the hash code extraction model according to the distance between the hash code of the normal image sample and the hash code of the disease diagnosis text and the distance between the hash code of the first disease image sample and the hash code of the disease diagnosis text; determining the same-modal contrast loss value of the hash code extraction model according to the distance between the hash code of the normal image sample and the hash code of the first disease image sample and the distance between the hash code of the first disease image sample and the hash code of the second disease image sample; and training the hash code extraction model according to the cross-modal contrast loss value and the same-modal contrast loss value.
- Another embodiment of the present disclosure proposes an image retrieval method, including: acquiring disease image data of a target part; inputting the disease image data into a hash code extraction model to obtain the target hash code corresponding to the disease image data; and obtaining the target diagnosis text corresponding to the target hash code from the disease diagnosis text library corresponding to the target part. The hash code extraction model is obtained through the following steps: obtaining the normal image sample, the first disease image sample, and the second disease image sample of the target part, as well as the disease diagnosis text corresponding to the first disease image sample, wherein the disease names corresponding to the first disease image sample, the second disease image sample, and the disease diagnosis text are the same; determining the respective hash codes of the normal image sample, the first disease image sample, the second disease image sample, and the disease diagnosis text through the hash code extraction model; determining the cross-modal contrast loss value of the hash code extraction model according to the distance between the hash code of the normal image sample and the hash code of the disease diagnosis text and the distance between the hash code of the first disease image sample and the hash code of the disease diagnosis text; determining the same-modal contrast loss value of the hash code extraction model according to the distance between the hash code of the normal image sample and the hash code of the first disease image sample and the distance between the hash code of the first disease image sample and the hash code of the second disease image sample; and training the hash code extraction model according to the cross-modal contrast loss value and the same-modal contrast loss value.
- The device includes: a first acquisition module, configured to acquire the data to be retrieved of the target part, wherein the modality of the data to be retrieved is the image modality or the text modality; a hash code determination module, configured to input the data to be retrieved into the hash code extraction model to obtain the target hash code corresponding to the data to be retrieved; and a second acquisition module, configured to acquire, from a database whose modality differs from that of the data to be retrieved, the retrieval result matching the target hash code. The hash code extraction model is obtained through a training device of the hash code extraction model, and the training device includes: a first acquisition module, configured to acquire the normal image sample, the first disease image sample, and the second disease image sample of the target part, as well as the disease diagnosis text corresponding to the first disease image sample; a first determination module, configured to determine, through the hash code extraction model, the respective hash codes of the normal image sample, the first disease image sample, the second disease image sample, and the disease diagnosis text; a second determination module, configured to determine the cross-modal contrast loss value of the hash code extraction model according to the distance between the hash code of the normal image sample and the hash code of the disease diagnosis text and the distance between the hash code of the first disease image sample and the hash code of the disease diagnosis text; a third determination module, configured to determine the same-modal contrast loss value of the hash code extraction model according to the distance between the hash code of the normal image sample and the hash code of the first disease image sample and the distance between the hash code of the first disease image sample and the hash code of the second disease image sample; and a training module, configured to train the hash code extraction model according to the cross-modal contrast loss value and the same-modal contrast loss value.
- Another embodiment of the present disclosure provides an image retrieval device, including: a first acquisition module, configured to acquire disease image data of a target part; a hash code determination module, configured to input the disease image data into a hash code extraction model to obtain the target hash code corresponding to the disease image data; and a second acquisition module, configured to obtain the target diagnosis text corresponding to the target hash code from the disease diagnosis text library corresponding to the target part.
- After acquiring the disease image data of the target part, the image retrieval device can determine the target hash code corresponding to the disease image data through the pre-trained hash code extraction model, and obtain the target diagnosis text corresponding to the target hash code from the disease diagnosis text library corresponding to the target part.
- The hash code extraction model is obtained through a training device of the hash code extraction model. The training device includes: a first acquisition module, configured to acquire the normal image sample, the first disease image sample, and the second disease image sample of the target part, as well as the disease diagnosis text corresponding to the first disease image sample, wherein the disease names corresponding to the first disease image sample, the second disease image sample, and the disease diagnosis text are the same; a first determination module, configured to determine, through the hash code extraction model, the respective hash codes of the normal image sample, the first disease image sample, the second disease image sample, and the disease diagnosis text; a second determination module, configured to determine the cross-modal contrast loss value of the hash code extraction model according to the distance between the hash code of the normal image sample and the hash code of the disease diagnosis text and the distance between the hash code of the first disease image sample and the hash code of the disease diagnosis text; a third determination module, configured to determine the same-modal contrast loss value of the hash code extraction model according to the distance between the hash code of the normal image sample and the hash code of the first disease image sample and the distance between the hash code of the first disease image sample and the hash code of the second disease image sample; and a training module, configured to train the hash code extraction model according to the cross-modal contrast loss value and the same-modal contrast loss value.
- Another embodiment of the present disclosure provides an electronic device, including: a memory, a processor, and a computer program stored on the memory and executable on the processor. When the processor executes the program, the hash retrieval method based on the hash code extraction model, or the image retrieval method, of the embodiments of the present disclosure is implemented.
- Another aspect of the present disclosure provides a computer-readable storage medium on which a computer program is stored.
- When the program is executed by a processor, the hash retrieval method based on the hash code extraction model of the embodiments of the present disclosure, or the image retrieval method, is implemented.
- Another embodiment of the present disclosure proposes a computer program product, including a computer program.
- When the computer program is executed by a processor, the hash retrieval method based on the hash code extraction model of the embodiments of the present disclosure, or the image retrieval method, is implemented.
- Another aspect of the present disclosure provides a computer program, including computer program code.
- When the computer program code is run on a computer, the computer performs the hash retrieval method based on the hash code extraction model of the embodiments of the present disclosure, or the image retrieval method.
- Figure 1 is a schematic flowchart of a training method for a hash code extraction model according to an embodiment of the present disclosure
- Figure 2 is a schematic flowchart of a training method of a hash code extraction model according to another embodiment of the present disclosure
- Figure 3 is an example diagram of the network structure of a hash code extraction model according to an embodiment of the present disclosure
- Figure 4 is a schematic flowchart of a training method of a hash code extraction model according to another embodiment of the present disclosure
- Figure 5 is a schematic flowchart of a hash retrieval method based on a hash code extraction model according to an embodiment of the present disclosure
- Figure 6 is a schematic flow chart of a hash retrieval method based on a hash code extraction model according to another embodiment of the present disclosure
- Figure 7 is a schematic flowchart of an image retrieval method according to an embodiment of the present disclosure.
- Figure 8 is a schematic structural diagram of a training device for a hash code extraction model according to an embodiment of the present disclosure
- Figure 9 is a schematic structural diagram of a hash retrieval device based on a hash code extraction model according to an embodiment of the present disclosure
- Figure 10 is a schematic structural diagram of an image retrieval device according to another embodiment of the present disclosure.
- Figure 11 is a block diagram of an electronic device according to one embodiment of the present disclosure.
- The training method of the hash code extraction model, the hash retrieval method based on the hash code extraction model, the image data retrieval method, and the devices, electronic equipment, storage media, computer program products, and computer programs for the above methods according to the embodiments of the present disclosure are described below with reference to the accompanying drawings.
- Figure 1 is a schematic flowchart of a training method of a hash code extraction model according to an embodiment of the present disclosure.
- the training method of the hash code extraction model provided in this embodiment is executed by a training device of the hash code extraction model.
- The training device of the hash code extraction model in this embodiment can be implemented in the form of software and/or hardware; the training device can be an electronic device, or can be configured in an electronic device.
- the electronic device in this example embodiment may include a terminal device, a server, etc., where the terminal device may be a PC (Personal Computer, personal computer), a mobile device, a tablet computer, etc., which is not specifically limited in this embodiment.
- the training method of the hash code extraction model may include: steps 101 to 105.
- Step 101 Obtain the normal image sample, the first disease image sample, and the second disease image sample of the target part, as well as the disease diagnosis text corresponding to the first disease image sample, where the disease names corresponding to the first disease image sample, the second disease image sample, and the disease diagnosis text are the same.
- the target part in this embodiment may be any part of the human body or an animal.
- For ease of description, this embodiment takes a part of the human body as the target part as an example.
- the target part may be the chest of the human body.
- the disease name in this example can be the name corresponding to any disease.
- the disease name could be pneumonia.
- The first disease image sample and the second disease image sample can be two different disease image samples corresponding to the same disease name, for example, disease images acquired for patients suffering from that disease.
- Step 102 Determine respective hash codes of the normal image sample, the first disease image sample, the second disease image sample and the disease diagnosis text through the hash code extraction model.
- Specifically, the normal image sample, the first disease image sample, the second disease image sample, and the disease diagnosis text can each be input into the hash code extraction model, so that the hash code extraction model processes them and outputs the hash code corresponding to the normal image sample, the first disease image sample, the second disease image sample, and the disease diagnosis text, respectively.
- the hash code extraction model at this time refers to the initial hash code extraction model that has not been trained.
- Step 103 Determine the cross-modal contrast loss value of the hash code extraction model based on the distance between the hash code of the normal image sample and the hash code of the disease diagnosis text and the distance between the hash code of the first disease image sample and the hash code of the disease diagnosis text.
- the two determined distances can be input into the cross-modal contrast loss function of the hash code extraction model, so as to determine the cross-modal contrast loss value of the hash code extraction model through the cross-modal contrast loss function.
- a distance between a hash code of the normal imaging sample and a hash code of the disease diagnosis text and a distance between a hash code of the first disease imaging sample and a hash code of the disease diagnosis text are determined.
- the two determined distances can be weighted and summed to obtain the cross-modal contrast loss value of the hash code extraction model.
- the distance in this example can be the Hamming distance.
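- A minimal sketch of the Hamming distance between two binary hash codes follows; the bit-list representation is an assumption chosen only for illustration.

```python
# Minimal sketch (not from the patent text): Hamming distance between two binary
# hash codes represented as equal-length lists of 0/1 bits.
def hamming_distance(code_a, code_b):
    """Count the positions at which the two hash codes differ."""
    assert len(code_a) == len(code_b), "hash codes must have the same length"
    return sum(bit_a != bit_b for bit_a, bit_b in zip(code_a, code_b))

# Example: two 8-bit hash codes that differ in 3 positions.
print(hamming_distance([1, 0, 1, 1, 0, 0, 1, 0],
                       [1, 1, 1, 0, 0, 0, 1, 1]))  # -> 3
```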
- Step 104 Based on the distance between the hash code of the normal image sample and the hash code of the first disease image sample and the distance between the hash code of the first disease image sample and the hash code of the second disease image sample, Determine the same-modal contrast loss value of the hash code extraction model.
- Specifically, the two determined distances can be input into the same-modal contrast loss function of the hash code extraction model, so as to determine the same-modal contrast loss value of the hash code extraction model through the same-modal contrast loss function.
- Alternatively, after the distance between the hash code of the normal image sample and the hash code of the first disease image sample and the distance between the hash code of the first disease image sample and the hash code of the second disease image sample are determined, the two distances can be weighted and summed to obtain the same-modal contrast loss value of the hash code extraction model.
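- A hypothetical sketch of the two contrast loss values computed as weighted sums of hash-code distances follows. The weights `w_pos` and `w_neg` are illustrative assumptions (the text above only states that the distances are weighted and summed); in practice a margin-based form is often used instead to keep the loss bounded.

```python
# Hypothetical sketch of the two contrast loss values described above, built from
# weighted sums of hash-code distances. The weights are illustrative, not the patent's.
def cross_modal_contrast_loss(d_normal_vs_text, d_disease_vs_text,
                              w_pos=1.0, w_neg=-0.5):
    # d_disease_vs_text: distance between the first disease image hash code and the
    #   disease diagnosis text hash code (positive pair, to be pulled together).
    # d_normal_vs_text: distance between the normal image hash code and the disease
    #   diagnosis text hash code (negative pair, to be pushed apart).
    return w_pos * d_disease_vs_text + w_neg * d_normal_vs_text

def same_modal_contrast_loss(d_normal_vs_disease1, d_disease1_vs_disease2,
                             w_pos=1.0, w_neg=-0.5):
    # d_disease1_vs_disease2: distance between the two disease image hash codes that
    #   share the same disease name (positive pair).
    # d_normal_vs_disease1: distance between the normal image hash code and the first
    #   disease image hash code (negative pair).
    return w_pos * d_disease1_vs_disease2 + w_neg * d_normal_vs_disease1
```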
- Step 105 Train the hash code extraction model based on the cross-modal comparison loss value and the same-modal comparison loss value.
- Exemplarily, the total loss value of the hash code extraction model can be determined based on the cross-modal contrast loss value and the same-modal contrast loss value, the model parameters of the hash code extraction model can be adjusted based on the total loss value, and the adjusted hash code extraction model continues to be trained until the total loss value meets the preset condition.
- a weighted summation of cross-modal contrast loss values and intra-modal contrast loss values may be performed to obtain a total loss value of the hash code extraction model.
- the preset condition is the condition for the end of model training.
- the preset conditions can be configured accordingly according to actual needs.
- Exemplarily, satisfying the preset condition can mean that the total loss value is less than a preset value, or that the change in the total loss value has stabilized, that is, the difference between the total loss values of two or more consecutive training iterations is less than a set value, meaning the total loss value essentially no longer changes.
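- The following is an illustrative sketch, under assumed weights and thresholds, of combining the two loss values into a total loss and of the two stopping conditions described above.

```python
# Illustrative sketch of the total loss and stopping check; alpha, beta, and the
# thresholds are hypothetical values, not taken from the patent.
def total_loss(cross_modal_loss, same_modal_loss, alpha=1.0, beta=1.0):
    # Weighted summation of the cross-modal and same-modal contrast loss values.
    return alpha * cross_modal_loss + beta * same_modal_loss

def training_should_stop(loss_history, min_loss=0.01, stable_delta=1e-4, window=3):
    """Stop when the total loss is below a preset value, or when the total loss of
    several consecutive iterations differs by less than a set value (i.e. it has
    stabilized)."""
    if not loss_history:
        return False
    if loss_history[-1] < min_loss:
        return True
    if len(loss_history) >= window:
        recent = loss_history[-window:]
        return max(recent) - min(recent) < stable_delta
    return False
```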
- In the training method of the hash code extraction model of the embodiment of the present disclosure, after the normal image sample of the target part, the first disease image sample, the second disease image sample, and the disease diagnosis text corresponding to the first disease image sample are obtained, the hash code extraction model determines the respective hash codes of the normal image sample, the first disease image sample, the second disease image sample, and the disease diagnosis text; the cross-modal contrast loss value of the hash code extraction model is determined according to the distance between the hash code of the normal image sample and the hash code of the disease diagnosis text and the distance between the hash code of the first disease image sample and the hash code of the disease diagnosis text; the same-modal contrast loss value of the hash code extraction model is determined according to the distance between the hash code of the normal image sample and the hash code of the first disease image sample and the distance between the hash code of the first disease image sample and the hash code of the second disease image sample; and the hash code extraction model is trained according to the cross-modal contrast loss value and the same-modal contrast loss value, thereby reducing the cost of manually labeling sample data during model training.
- the hash code extraction model in this example includes an image depth network, a text depth network, a first hash layer connected to the image depth network, and a second hash layer connected to the text depth network.
- Based on this, this embodiment also proposes a training method of the hash code extraction model, which is exemplarily described below with reference to Figure 2.
- FIG. 2 is a schematic flowchart of a training method of a hash code extraction model according to another embodiment of the present disclosure.
- the training method of the hash code extraction model may include: steps 201 to 206.
- Step 201 Obtain the normal image sample, the first disease image sample, the second disease image sample of the target part, and the disease diagnosis text corresponding to the first disease image sample, where the first disease image sample, the second disease image sample, and the disease diagnosis text are obtained.
- the disease names corresponding to the diagnosis text are the same.
- step 201 refers to the relevant descriptions of the above embodiments, which will not be described again here.
- Step 202 Determine the second image features of the first disease image sample through the image depth network, and determine the text features of the disease diagnosis text through the text depth network.
- The image depth network in this example embodiment may be a residual deep network, for example, ResNet50. It can be understood that in practical applications, the image depth network in this embodiment can also be another type of deep network that can perform feature extraction on image data, and this embodiment does not specifically limit this.
- the text deep network in this example can be a pre-trained language representation model.
- For example, the language representation model can be a BERT (Bidirectional Encoder Representations from Transformers) model.
- the language representation model can be a knowledge-enhanced semantic representation model (Enhanced Representation through Knowledge Integration, ERNIE). It can be understood that in practical applications, the text deep network in this embodiment can also be other types of deep networks that can perform feature extraction on disease diagnosis text, and this embodiment does not specifically limit this.
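- As a hedged illustration of how such a pretrained language representation model can produce a semantic representation vector for a diagnosis text, the sketch below uses the Hugging Face transformers library; the checkpoint name is an assumption chosen only for illustration.

```python
# Hedged sketch: encoding a diagnosis text into a semantic representation vector with a
# pretrained BERT encoder. The checkpoint name is an illustrative assumption.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
text_encoder = AutoModel.from_pretrained("bert-base-chinese")

def encode_diagnosis_text(text: str) -> torch.Tensor:
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = text_encoder(**inputs)
    # Use the [CLS] token embedding as the semantic representation vector.
    return outputs.last_hidden_state[:, 0, :]
```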
- The hash code extraction model in this example embodiment may also include a self-attention layer, which is set between the image depth network and the text depth network.
- One possible implementation of determining the second image features of the first disease image sample through the image depth network and determining the text features of the disease diagnosis text through the text depth network is as follows: the text depth network determines the text features of the disease diagnosis text; the text features are input into the self-attention layer to obtain the attention features of the disease diagnosis text; and the attention features are input into the image depth network, so that the image depth network extracts the lesion features of the first disease image sample based on the attention features to obtain the second image features of the first disease image sample.
- An example of the network structure of the hash code extraction model is shown in Figure 3.
- In this example, the target part is the chest of the human body, the disease name is pneumonia, the first disease image sample is a pneumonia chest image sample, and the disease diagnosis text is a pneumonia diagnosis description text.
- Specifically, the pneumonia chest image sample can be input into the image depth network in Figure 3, and the pneumonia diagnosis description text can be input into the BERT network in Figure 3 to obtain the semantic representation vector of the pneumonia diagnosis description text. The semantic representation vector is input into the self-attention layer, and the attention features output by the self-attention layer are input into the first several convolutional layers of the image depth network (it should be noted that the multiple convolutional layers of the image depth network are not shown in Figure 3; for example, the image depth network can include five convolutional layers connected in sequence, and the attention features can be input into the first three convolutional layers). The image features output by the last convolutional layer are input into the first hash layer to obtain the hash code of the pneumonia chest image sample through the first hash layer.
- the semantic representation vector is also input into the second hash layer to obtain the hash code of the pneumonia diagnosis description text through the second hash layer.
- the first hash code is used to represent the hash code of the pneumonia chest image sample
- the second hash code is used to represent the hash code of the pneumonia diagnosis description text.
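- The simplified sketch below mirrors this structure (text encoder → self-attention → attention features guiding the early convolutional stages of the image network → two hash layers mapping into a shared hash space). The layer sizes, the channel-wise modulation scheme, and all module names are illustrative assumptions, not the exact design of Figure 3.

```python
# Hypothetical sketch of an image/text hash code extraction model in the spirit of
# Figure 3; the modulation of early conv features by text attention is one possible choice.
import torch
import torch.nn as nn

class HashCodeExtractionModel(nn.Module):
    def __init__(self, text_dim=768, hash_bits=64):
        super().__init__()
        # Self-attention layer set between the text network and the image network.
        self.self_attention = nn.MultiheadAttention(embed_dim=text_dim, num_heads=8,
                                                    batch_first=True)
        # Stand-in for the early convolutional stages of an image backbone (e.g. ResNet50).
        self.early_convs = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        self.late_convs = nn.Sequential(
            nn.Conv2d(128, 256, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.text_to_gate = nn.Linear(text_dim, 128)            # project attention features
        self.image_hash_layer = nn.Linear(256, hash_bits)       # "first hash layer"
        self.text_hash_layer = nn.Linear(text_dim, hash_bits)   # "second hash layer"

    def forward(self, image, text_vector):
        # Self-attention over the text semantic vector (treated as a length-1 sequence).
        seq = text_vector.unsqueeze(1)
        attn_features, _ = self.self_attention(seq, seq, seq)
        attn_features = attn_features.squeeze(1)

        # Early image features, modulated channel-wise by the text attention features.
        feat = self.early_convs(image)
        gate = torch.sigmoid(self.text_to_gate(attn_features))[:, :, None, None]
        image_feat = self.late_convs(feat * gate)

        image_hash = torch.tanh(self.image_hash_layer(image_feat))
        text_hash = torch.tanh(self.text_hash_layer(text_vector))
        return image_hash, text_hash
```

- A forward pass of this sketch takes an image tensor of shape (batch, 3, H, W) and the (batch, 768) semantic vector from the text encoder, and returns the continuous image and text hash codes.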
- Step 203 Input the second image feature into the first hash layer to obtain the hash code of the first disease image sample, and input the text feature into the second hash layer to obtain the hash code of the disease diagnosis text.
- After the second image feature is input into the first hash layer, the first hash layer correspondingly performs a hash calculation based on the second image feature to obtain the hash code of the first disease image sample.
- text features are input into the second hash layer, and correspondingly, the second hash layer performs hash calculations based on the text features to obtain a hash code of the disease diagnosis text.
- the first hash layer and the second hash layer in this example embodiment can be the same.
- The first hash layer and the second hash layer in this embodiment can encode image features and text features into the same hash code encoding space.
- Step 204 Determine the cross-modal contrast loss value of the hash code extraction model based on the distance between the hash code of the normal image sample and the hash code of the disease diagnosis text and the distance between the hash code of the first disease image sample and the hash code of the disease diagnosis text.
- Step 205 Based on the distance between the hash code of the normal image sample and the hash code of the first disease image sample and the distance between the hash code of the first disease image sample and the hash code of the second disease image sample, Determine the same-modal contrast loss value of the hash code extraction model.
- Step 206 Train the hash code extraction model based on the cross-modal comparison loss value and the same-modal comparison loss value.
- In this way, the normal image sample and the corresponding second disease image sample are processed by the image depth network in the hash code extraction model, and their corresponding hash codes are accurately determined from the output of the image depth network through the first hash layer connected to it; the image depth network and the text depth network in the hash code extraction model are used to extract features from the first disease image sample and the disease diagnosis text, respectively, and the image features are hashed through the first hash layer while the text features are hashed through the second hash layer, so as to accurately determine the hash codes corresponding to the first disease image sample and the disease diagnosis text.
- The training method of the hash code extraction model of this embodiment is described below in conjunction with Figure 4. It should be noted that this example is described by taking the chest of the human body as the target part and pneumonia as the disease name.
- Step 401 Obtain the normal image sample of the chest, the first pneumonia image sample A, the second pneumonia image sample B, and the pneumonia diagnosis text A corresponding to the first pneumonia image sample A.
- Step 402 Use the hash code extraction model to determine the hash codes corresponding to the normal image sample, the first pneumonia image sample A, the second pneumonia image sample B, and the pneumonia diagnosis text A corresponding to the first pneumonia image sample A.
- For the manner in which the respective hash codes of the normal image sample, the first pneumonia image sample A, the second pneumonia image sample B, and the pneumonia diagnosis text A corresponding to the first pneumonia image sample A are determined through the hash code extraction model, please refer to the relevant descriptions in the embodiments of the present disclosure, which will not be repeated here.
- Step 403 Determine the cross-modal contrast loss value of the hash code extraction model based on the distance between the hash code nV of the normal image sample and the hash code FAT of the pneumonia diagnosis text A and the distance between the hash code FAV of the first pneumonia image sample A and the hash code FAT of the pneumonia diagnosis text A.
- Step 404 Determine the same-modal contrast loss value of the hash code extraction model based on the distance between the hash code nV of the normal image sample and the hash code FAV of the first pneumonia image sample A and the distance between the hash code FAV of the first pneumonia image sample A and the hash code FBV of the second pneumonia image sample B.
- Step 405 Train the hash code extraction model based on the cross-modal comparison loss value and the same-modal comparison loss value.
- the hash code extraction model is trained based on the two-stage contrast learning method.
- The two-stage contrastive learning method mainly includes contrastive learning between image data and contrastive learning across the image and diagnostic-text modalities. Training is performed by designing a two-level contrastive learning loss, including the same-modal contrast loss value and the cross-modal contrast loss value.
- The same-modal contrast loss is designed to shorten the distance between features of the same lesion and, conversely, to enlarge the distance between the normal image and the pneumonia image, so that the network learns to extract pneumonia lesion-related features and lesion-irrelevant features.
- For the cross-modal image and diagnostic-text data, a contrastive learning loss between modalities is designed to shorten the distance between the features of the lesion diagnostic text and the lesion image data, thereby promoting the ability to extract lesion representations.
- In addition, the outputs of the first hash layer and the second hash layer can be passed through the tanh activation function to obtain the final hash codes corresponding to the text sample or image sample; the same-modal contrast loss value and the cross-modal contrast loss value of the hash code extraction model are determined based on the final hash codes, and the model parameters of the hash code extraction model are adjusted based on the same-modal contrast loss value and the cross-modal contrast loss value to achieve training of the hash code extraction model.
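- As a hedged illustration of one common way to handle such tanh outputs (an assumption, not the patent's stated procedure): during training the continuous tanh-valued codes can be compared with a differentiable surrogate of the Hamming distance, and at retrieval time they can be binarized to obtain the stored bit codes.

```python
# Hedged sketch: binarizing tanh outputs and a differentiable Hamming-distance surrogate.
import torch

def to_binary_hash(hash_layer_output: torch.Tensor) -> torch.Tensor:
    """Map tanh outputs in (-1, 1) to {0, 1} bits for storage and Hamming comparison."""
    return (hash_layer_output > 0).to(torch.uint8)

def relaxed_hamming_distance(code_a: torch.Tensor, code_b: torch.Tensor) -> torch.Tensor:
    """For ±1 codes of length K the Hamming distance equals (K - <a, b>) / 2; using the
    tanh-valued codes directly keeps this surrogate differentiable during training."""
    k = code_a.shape[-1]
    return 0.5 * (k - (code_a * code_b).sum(dim=-1))
```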
- In this way, contrastive learning over the normal image sample, the pneumonia image samples A and B, and the pneumonia diagnosis text A is used to train the hash code extraction model, thereby reducing the cost of manually labeling sample data during the model training process and reducing the model training cost.
- Figure 5 is a schematic flowchart of a hash retrieval method based on a hash code extraction model according to an embodiment of the present disclosure.
- the hash retrieval method based on the hash code extraction model provided in this embodiment is executed by a hash retrieval device based on the hash code extraction model.
- the hash retrieval device can be implemented by software and/or hardware.
- the hash retrieval device based on the hash code extraction model can be an electronic device, or can be configured in the electronic device.
- the electronic device in this example embodiment may include a terminal device, a server, etc., where the terminal device may be a PC (Personal Computer, personal computer), a mobile device, a tablet computer, etc., which is not specifically limited in this embodiment.
- the hash retrieval method based on the hash code extraction model may include: steps 501 to 503.
- Step 501 Obtain data to be retrieved of the target part, where the modality of the data to be retrieved is image mode or text mode.
- the target part in this embodiment may be any part of the human body or an animal.
- the target part is a part of the human body for description.
- the target part may be the chest of the human body.
- If the modality of the data to be retrieved is the image modality, the data to be retrieved is image data to be retrieved; if the modality of the data to be retrieved is the text modality, the data to be retrieved is text data to be retrieved.
- Step 502 Input the data to be retrieved into the hash code extraction model to obtain the target hash code corresponding to the data to be retrieved.
- hash code extraction model used in this example embodiment is trained by the training method disclosed in this disclosure.
- Step 503 Obtain retrieval results matching the target hash code from a database in a mode different from the data to be retrieved.
- When the data to be retrieved is in the image modality, the database whose modality differs from that of the data to be retrieved is a database corresponding to the text modality, and the data stored in that database are the hash codes of existing disease diagnosis texts for the target part.
- the retrieval results matching the target hash code can be obtained from the database.
- When the data to be retrieved is in the text modality, the database whose modality differs from that of the data to be retrieved is a database corresponding to the image modality, and the data stored in that database are the hash codes of existing disease images of the target part.
- the retrieval results matching the target hash code can be obtained from the database.
- For example, assuming that the target part is the chest and the modality of the data to be retrieved is the image modality, that is, the data to be retrieved is a pneumonia image to be retrieved, the pre-trained hash code extraction model can be used to determine the target hash code of the pneumonia image to be retrieved, and the database used to store text-modality data is then searched based on the target hash code to obtain the target pneumonia diagnosis text that matches the target hash code.
- Conversely, assuming that the target part is the chest and the modality of the data to be retrieved is the text modality, that is, the data to be retrieved is a pneumonia diagnosis text to be retrieved, the pre-trained hash code extraction model can be used to determine the target hash code of the pneumonia diagnosis text to be retrieved, and the database used to store image-modality data is then searched based on the target hash code to obtain the target pneumonia image data that matches the target hash code. This enables mutual retrieval between diagnostic text and image data through hash codes, improving retrieval efficiency.
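- An illustrative end-to-end sketch of this cross-modal retrieval flow is given below; `extract_hash_code` stands in for the trained hash code extraction model, and the (text, hash code) database layout is a hypothetical assumption.

```python
# Illustrative sketch of retrieving the matching diagnosis text for a disease image.
def retrieve_diagnosis_text(disease_image, extract_hash_code, text_hash_database):
    """text_hash_database: list of (diagnosis_text, hash_code) pairs for the target part."""
    target_code = extract_hash_code(disease_image)  # image -> target hash code

    def hamming(code_a, code_b):
        return sum(bit_a != bit_b for bit_a, bit_b in zip(code_a, code_b))

    # Return the diagnosis text whose stored hash code is closest to the target hash code.
    best_text, _ = min(((text, hamming(code, target_code))
                        for text, code in text_hash_database),
                       key=lambda pair: pair[1])
    return best_text
```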
- The hash retrieval method based on the hash code extraction model provided by the embodiment of the present disclosure determines the target hash code of the data to be retrieved for the target part through the pre-trained hash code extraction model, and obtains, from a database whose modality differs from that of the data to be retrieved, the retrieval results matching the target hash code. This enables mutual retrieval between data of different modalities and effectively improves retrieval efficiency.
- Figure 6 is a schematic flowchart of a hash retrieval method based on a hash code extraction model according to another embodiment of the present disclosure. It should be noted that the hash retrieval method based on the hash code extraction model provided in this embodiment is a further refinement of the foregoing embodiment.
- the hash retrieval method based on the hash code extraction model may include: steps 601 to 604.
- Step 601 Obtain data to be retrieved of the target part, where the modality of the data to be retrieved is image mode or text mode.
- Step 602 Input the data to be retrieved into the hash code extraction model to obtain the target hash code corresponding to the data to be retrieved.
- hash code extraction model used in this example embodiment is trained by the training method disclosed in this disclosure.
- Step 603 Determine the distance between the target hash code and the hash code of each data in the database.
- the Hamming distance between the target hash code and the hash code of each data in the database can be calculated.
- Step 604 Obtain search results matching the target hash code from each data based on the distance.
- Depending on the distance, the retrieval result matching the target hash code can be obtained from the data in different ways; exemplary methods are as follows (a short sketch is given after this list):
- the target data with the shortest distance is selected from each data as the retrieval result.
- target data whose distance is smaller than a preset distance threshold is selected from each data as the retrieval result.
- sort each data in order from low to high distance and select the top N data from the sorting results as the retrieval results, where N is an integer greater than or equal to 1.
- the distance in this example embodiment may be Hamming distance.
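- A minimal sketch of the three exemplary matching strategies above, applied to (item, distance) pairs that have already been computed for the target hash code:

```python
# Minimal sketch of the three matching strategies; item_distances is a list of
# (item, distance) pairs, e.g. [(record_id, hamming_distance), ...].
def nearest(item_distances):
    # Select the single item with the shortest distance as the retrieval result.
    return min(item_distances, key=lambda pair: pair[1])[0]

def within_threshold(item_distances, threshold):
    # Select every item whose distance is smaller than a preset distance threshold.
    return [item for item, dist in item_distances if dist < threshold]

def top_n(item_distances, n=1):
    # Sort by distance from low to high and keep the top N items (N >= 1).
    assert n >= 1
    return [item for item, _ in sorted(item_distances, key=lambda pair: pair[1])[:n]]
```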
- The hash code of each piece of data in the database in this example can be obtained in the following manner: for each piece of data, the data is input into the hash code extraction model, and the hash code of the data is obtained through the hash code extraction model.
- FIG. 7 is a schematic flowchart of an image retrieval method according to an embodiment of the present disclosure. It should be noted that the image retrieval method provided in this embodiment is executed by an image retrieval device.
- the image retrieval device in this embodiment can be implemented by software and/or hardware.
- The image retrieval device can be an electronic device, or can be configured in an electronic device.
- the electronic device in this example embodiment may include a terminal device, a server, etc., where the terminal device may be a PC (Personal Computer, personal computer), a mobile device, a tablet computer, etc., which is not specifically limited in this embodiment.
- the image retrieval method may include: steps 701 to 703.
- Step 701 Obtain disease image data of the target site.
- the target part in this embodiment may be any part of the human body or an animal.
- the target part is a part of the human body for description.
- the target part may be the chest of the human body.
- Step 702 Input the disease image data into the hash code extraction model to obtain the target hash code corresponding to the disease image data.
- hash code extraction model used in this example embodiment is trained by the training method disclosed in this disclosure.
- Step 703 Obtain the target diagnosis text corresponding to the target hash code from the disease diagnosis text database corresponding to the target part.
- the disease diagnosis text database in this example stores existing disease diagnosis texts and their corresponding hash codes.
- An exemplary way of obtaining the hash code corresponding to each existing disease diagnosis text in the disease diagnosis text library is: for each existing disease diagnosis text, the existing disease diagnosis text is input into the hash code extraction model, so as to determine the hash code corresponding to that existing disease diagnosis text through the hash code extraction model.
- One possible implementation of obtaining the target diagnosis text corresponding to the target hash code from the disease diagnosis text library corresponding to the target part is: determine the distance between the target hash code and the hash code of each existing disease diagnosis text in the disease diagnosis text library; and, according to the distances, obtain the disease diagnosis text that matches the target hash code from the existing disease diagnosis texts.
- the implementation methods of obtaining disease diagnosis text matching the target hash code from each existing disease diagnosis text are different according to the distance.
- the exemplary methods are as follows:
- the target existing disease diagnosis text with the shortest distance is selected from each existing disease diagnosis text as the disease diagnosis text that matches the target hash code.
- a target existing disease diagnosis text whose distance is less than a preset distance threshold is selected from each existing disease diagnosis text as the disease diagnosis text that matches the target hash code.
- the existing disease diagnosis texts are sorted in order of distance from low to high, and the existing disease diagnosis texts ranked in the top N positions are selected from the sorting results as the disease diagnosis texts matching the target hash code, where N is an integer greater than or equal to 1.
- the distance in this example embodiment may be Hamming distance.
- In this way, after the disease image data of the target part is acquired, the target hash code corresponding to the disease image data can be determined through the pre-trained hash code extraction model, and the target diagnosis text corresponding to the target hash code can be obtained from the disease diagnosis text library corresponding to the target part. Therefore, based on the hash code of the disease image data, the disease diagnosis text matching the hash code can be quickly retrieved, which improves the efficiency of obtaining the disease diagnosis text.
- One embodiment of the present disclosure also provides a training device for the hash code extraction model. Since the training device of the hash code extraction model provided by the embodiment of the present disclosure corresponds to the training methods of the hash code extraction model provided in the above embodiments, the implementation of the training method of the hash code extraction model is also applicable to the training device of the hash code extraction model of this embodiment and will not be described in detail here.
- Figure 8 is a schematic structural diagram of a training device for a hash code extraction model according to an embodiment of the present disclosure.
- the training device 800 of the hash code extraction model includes: a first acquisition module 801 , a first determination module 802 , a second determination module 803 , a third determination module 804 and a training module 805 .
- The first acquisition module 801 is used to acquire the normal image sample, the first disease image sample, and the second disease image sample of the target part, as well as the disease diagnosis text corresponding to the first disease image sample, wherein the disease names corresponding to the first disease image sample, the second disease image sample, and the disease diagnosis text are the same.
- the first determination module 802 is used to determine the hash codes of the normal image sample, the first disease image sample, the second disease image sample and the disease diagnosis text respectively through the hash code extraction model.
- The second determination module 803 is configured to determine the cross-modal contrast loss value of the hash code extraction model based on the distance between the hash code of the normal image sample and the hash code of the disease diagnosis text and the distance between the hash code of the first disease image sample and the hash code of the disease diagnosis text.
- The third determination module 804 is configured to determine the same-modal contrast loss value of the hash code extraction model based on the distance between the hash code of the normal image sample and the hash code of the first disease image sample and the distance between the hash code of the first disease image sample and the hash code of the second disease image sample.
- the training module 805 is used to train the hash code extraction model based on the cross-modal comparison loss value and the same-modal comparison loss value.
- the hash code extraction model includes an image depth network, a text depth network, a first hash layer connected to the image depth network, and a second hash layer connected to the text depth network.
- The first determination module 802 includes:
- a first determination unit, used to determine the first image features corresponding to the normal image sample and the second disease image sample respectively through the image depth network, and to input the first image features into the first hash layer to obtain the respective hash codes of the normal image sample and the second disease image sample;
- a second determination unit configured to determine the second image features of the first disease image sample through the image depth network, and determine the text features of the disease diagnosis text through the text depth network;
- a third determination unit, used to input the second image feature into the first hash layer to obtain the hash code of the first disease image sample, and to input the text feature into the second hash layer to obtain the hash code of the disease diagnosis text.
- the hash code extraction model also includes a self-attention layer.
- the self-attention layer is set between the image depth network and the text depth network.
- The second determination unit is specifically used to: determine the text features of the disease diagnosis text through the text depth network; input the text features into the self-attention layer to obtain the attention features of the disease diagnosis text; and input the attention features into the image depth network, so that the image depth network extracts lesion features from the first disease image sample based on the attention features to obtain the second image features of the first disease image sample.
- After acquiring the normal image sample of the target part, the first disease image sample, the second disease image sample, and the disease diagnosis text corresponding to the first disease image sample, the training device of the hash code extraction model in the embodiment of the present disclosure determines, through the hash code extraction model, the respective hash codes of the normal image sample, the first disease image sample, the second disease image sample, and the disease diagnosis text; determines the cross-modal contrast loss value of the hash code extraction model according to the distance between the hash code of the normal image sample and the hash code of the disease diagnosis text and the distance between the hash code of the first disease image sample and the hash code of the disease diagnosis text; determines the same-modal contrast loss value of the hash code extraction model according to the distance between the hash code of the normal image sample and the hash code of the first disease image sample and the distance between the hash code of the first disease image sample and the hash code of the second disease image sample; and trains the hash code extraction model according to the cross-modal contrast loss value and the same-modal contrast loss value.
- An embodiment of the present disclosure also provides a hash retrieval device based on the hash code extraction model. Since the hash retrieval device based on the hash code extraction model provided by the embodiment of the present disclosure corresponds to the hash retrieval methods based on the hash code extraction model provided in the above embodiments, the implementation of the hash retrieval method is also applicable to the hash retrieval device of this embodiment and will not be described in detail here.
- Figure 9 is a schematic structural diagram of a hash retrieval device based on a hash code extraction model according to an embodiment of the present disclosure.
- the hash retrieval device 900 based on the hash code extraction model includes: a first acquisition module 901, a hash code determination module 902 and a second acquisition module 903.
- the first acquisition module 901 is used to acquire the data to be retrieved of the target part, where the modality of the data to be retrieved is image modality or text modality.
- the hash code determination module 902 is used to input the data to be retrieved into the hash code extraction model to obtain the target hash code corresponding to the data to be retrieved.
- the hash code extraction model in this embodiment is trained through the training method of the hash code extraction model proposed in the embodiment of the present disclosure.
- the second acquisition module 903 is used to acquire retrieval results matching the target hash code from a database in a mode different from the data to be retrieved.
- the second acquisition module 903 includes: a determining unit and an acquisition unit.
- the determination unit is used to determine the distance between the target hash code and the hash code of each data in the database.
- the acquisition unit is used to obtain the retrieval results matching the target hash code from each data according to the distance.
- The acquisition unit is specifically used to obtain, according to the distance, the retrieval result matching the target hash code from the data, for example, in any of the exemplary manners described above.
- The hash code of each piece of data in the database is obtained in the following manner: for each piece of data, the data is input into the hash code extraction model, so as to obtain the hash code of the data through the hash code extraction model.
- The hash retrieval device based on the hash code extraction model of the embodiment of the present disclosure determines the target hash code of the data to be retrieved for the target part through the pre-trained hash code extraction model, and obtains, from a database whose modality differs from that of the data to be retrieved, the retrieval results matching the target hash code. This enables mutual retrieval between data of different modalities and effectively improves retrieval efficiency.
- Figure 10 is a schematic structural diagram of an image retrieval device according to an embodiment of the present disclosure.
- the image retrieval device 1000 includes: a first acquisition module 1001, a hash code determination module 1002 and a second acquisition module 1003.
- the first acquisition module 1001 is used to acquire disease image data of the target site.
- the hash code determination module 1002 is used to input the disease image data into the hash code extraction model to obtain the target hash code corresponding to the disease image data.
- the hash code extraction model is trained by the hash code extraction model training method provided by the embodiment of the present disclosure.
- the second acquisition module 1003 is used to acquire the target diagnosis text corresponding to the target hash code from the disease diagnosis text database corresponding to the target part.
- after acquiring the disease image data of the target part, the image retrieval device can determine the target hash code corresponding to the disease image data through the pre-trained hash code extraction model, and obtain the target diagnosis text corresponding to the target hash code from the disease diagnosis text library corresponding to the target part. Therefore, based on the hash code of the disease image data, the disease diagnosis text matching that hash code can be quickly retrieved, which improves the efficiency of obtaining the disease diagnosis text.
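- Put together, the image retrieval flow reduces to: run the disease image through the hash code extraction model, binarize the output, and look up the nearest entries in the diagnosis text library. A hedged end-to-end sketch, in which the model object, the binarization rule and the library layout are assumptions rather than the disclosed implementation:

```python
import numpy as np

def to_binary(code_vector: np.ndarray) -> np.ndarray:
    """Binarize a relaxed code vector (e.g. tanh outputs) into a 0/1 hash code."""
    return (code_vector > 0).astype(np.uint8)

def retrieve_diagnosis_text(image, model, text_library, top_n=1):
    """image        : preprocessed disease image of the target body part
    model        : trained hash code extraction model (assumed to be a callable
                   returning a relaxed code vector for the image)
    text_library : list of (diagnosis_text, binary_hash_code) pairs for the target part
    """
    query = to_binary(model(image))
    ranked = sorted(text_library,
                    key=lambda pair: int(np.count_nonzero(query != pair[1])))
    return [text for text, _ in ranked[:top_n]]
```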
- the present disclosure also provides an electronic device and a readable storage medium.
- FIG. 11 is a structural block diagram of an electronic device according to an embodiment of the present disclosure.
- the electronic device 1100 includes a memory 1110 , a processor 1120 , and computer instructions stored on the memory 1110 and executable on the processor 1120 .
- the electronic device is used for hash retrieval or image retrieval.
- when the processor 1120 executes the instructions, it implements the hash retrieval method based on the hash code extraction model provided in the above embodiments, or the training method of the hash code extraction model, or the image retrieval method.
- in some embodiments, the electronic device 1100 further includes:
- Communication interface 1130 is used for communication between the memory 1110 and the processor 1120.
- Memory 1110 is used to store computer instructions that can be executed on processor 1120.
- in some embodiments, the processor may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a field-programmable gate array (FPGA), or another programmable logic device.
- the memory 1110 may include high-speed RAM, and may also include non-volatile memory, such as at least one disk memory.
- the memory may include a program storage area and a data storage area, where the program storage area may store an operating system and an application program required for at least one function (such as image acquisition and processing functions); the data storage area may store data created during the use of the computer, such as image or text data.
- the electronic device further includes an image collector.
- the image collector may be a camera, an X-ray imaging device, a computed tomography (CT) device, a magnetic resonance imaging (MRI) device, a nuclear medicine imaging device, an ultrasound device, an endoscopic device, etc.
- the processor 1120 is configured to implement the hash retrieval method based on the hash code extraction model, or the training method of the hash code extraction model, or the image retrieval method in the above embodiment when executing the program.
- if the memory 1110, the processor 1120 and the communication interface 1130 are implemented independently, the communication interface 1130, the memory 1110 and the processor 1120 may be connected to each other through a bus and communicate with each other. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, etc.
- the bus can be divided into an address bus, a data bus, a control bus, etc. For ease of representation, only one thick line is shown in Figure 8, but this does not mean that there is only one bus or only one type of bus.
- in some embodiments, if the memory 1110, the processor 1120 and the communication interface 1130 are integrated on one chip, the memory 1110, the processor 1120 and the communication interface 1130 can communicate with each other through an internal interface.
- the processor 1120 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present disclosure.
- Another aspect of the present disclosure provides a computer-readable storage medium on which a computer program is stored.
- when the program is executed by a processor, the hash retrieval method based on the hash code extraction model of any embodiment of the present disclosure, or the training method of the hash code extraction model, or the image retrieval method is implemented.
- the present disclosure also provides a computer program product, including a computer program.
- when executed by a processor, the computer program implements the training method of the hash code extraction model of any embodiment of the present disclosure, or the hash retrieval method based on the hash code extraction model, or the image retrieval method.
- the present disclosure also provides a computer program.
- the computer program includes computer program code. When the computer program code is run on a computer, it causes the computer to perform the training method of the hash code extraction model of any embodiment of the present disclosure, or the hash retrieval method based on the hash code extraction model, or the image retrieval method.
- reference to the terms "one embodiment," "some embodiments," "an example," "specific examples," or "some examples" and the like means that a specific feature, structure, material or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine the different embodiments or examples described in this specification, and the features of the different embodiments or examples, provided they are not inconsistent with each other.
- the terms "first" and "second" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Therefore, features defined with "first" and "second" may explicitly or implicitly include at least one of such features.
- “plurality” means at least two, such as two, three, etc., unless otherwise expressly and specifically limited.
Abstract
提供了一种数据检索方法、影像数据检索方法及上述方法的装置、存储介质、计算机程序产品和计算机程序。该数据检索方法包括获取所述目标部位的待检索数据,其中所述待检索数据的模态为影像模态或者文本模态;将所述待检索数据输入到哈希码提取模型,以得到所述待检索数据对应的目标哈希码;和从不同于所述待检索数据的模态的数据库中,获取与所述目标哈希码匹配的检索结果。
Description
相关申请的交叉引用
本申请要求在2022年09月15日在中国提交的中国专利申请号2022111229324的优先权,其全部内容通过引用并入本文。
本公开涉及人工智能和医疗健康技术领域,具体涉及一种数据检索方法、影像数据检索方法及上述方法的装置、存储介质、计算机程序产品和计算机程序。
哈希码在多模态检索应用中非常有效，例如，在医学领域中，可通过文本信息在数据库中查找符合文字描述的医学影像数据，或者，可通过医学影像数据在数据库中查询对应的疾病诊断文本等。
在对多模态检索应用中,通常采用预先训练好的哈希码提取模型对待检索数据(例如图片或者文本)进行处理,以得到待检索数据对应的哈希码。如何使得哈希码提取模型可以准确确定出待检索数据对应的哈希码对于多模态检索是十分重要的。相关技术中,通常采用样本数据对哈希码提取模型进行训练,例如,在医学领域中,可通过专业医师对影像数据以及对应的疾病诊断文本进行精细标注,并基于标注后的样本数据对哈希码提取模型进行训练。然而,采用人工的方式对样本数据进行标记,导致哈希码提取模型的训练成本较高,进而造成检索数据的过程繁琐复杂,成本过高。
发明内容
本公开提出一种数据检索方法、影像数据检索方法及上述方法的装置、存储介质、计算机程序产品和计算机程序。
本公开一方面实施例提出一种基于哈希码提取模型的哈希检索方法,所述方法包括:获取所述目标部位的待检索数据,其中,所述待检索数据的模态为影像模态或者文本模态;将所述待检索数据输入到哈希码提取模型,以得到所述待检索数据对应的目标哈希码;从不同于所述待检索数据的模态的数据库中,获取与所述目标哈希码匹配的检索结果,其中所述哈希码提取模型是通过以下步骤得到的:获取所述目标部位的正常影像样本、第一疾病影像样本、第二疾病影像样本以及与所述第一疾病影像样本对应的疾病诊断文本,其中,所述第一疾病影像样本、所述第二疾病影像样本和所述疾病诊断文本所对应的疾病名称是
相同的;通过所述哈希码提取模型分别确定所述正常影像样本、所述第一疾病影像样本、所述第二疾病影像样本和所述疾病诊断文本各自的哈希码;根据所述正常影像样本的哈希码和所述疾病诊断文本的哈希码之间的距离以及所述第一疾病影像样本的哈希码和所述疾病诊断文本的哈希码之间的距离,确定所述哈希码提取模型的跨模态对比损失值;根据所述正常影像样本的哈希码和第一疾病影像样本的哈希码之间的距离以及所述第一疾病影像样本的哈希码和所述第二疾病影像样本的哈希码之间的距离,确定所述哈希码提取模型的同模态对比损失值;和根据所述跨模态对比损失值和所述同模态对比损失值,对所述哈希码提取模型进行训练。
本公开另一方面实施例提出一种影像检索方法,包括:获取目标部位的疾病影像数据;将所述疾病影像数据输入到哈希码提取模型,以得到所述疾病影像数据对应的目标哈希码;从所述目标部位对应的疾病诊断文本库中获取与所述目标哈希码对应的目标诊断文本,其中所述哈希码提取模型是通过以下步骤得到的:获取所述目标部位的正常影像样本、第一疾病影像样本、第二疾病影像样本以及与所述第一疾病影像样本对应的疾病诊断文本,其中,所述第一疾病影像样本、所述第二疾病影像样本和所述疾病诊断文本所对应的疾病名称是相同的;通过所述哈希码提取模型分别确定所述正常影像样本、所述第一疾病影像样本、所述第二疾病影像样本和所述疾病诊断文本各自的哈希码;根据所述正常影像样本的哈希码和所述疾病诊断文本的哈希码之间的距离以及所述第一疾病影像样本的哈希码和所述疾病诊断文本的哈希码之间的距离,确定所述哈希码提取模型的跨模态对比损失值;根据所述正常影像样本的哈希码和第一疾病影像样本的哈希码之间的距离以及所述第一疾病影像样本的哈希码和所述第二疾病影像样本的哈希码之间的距离,确定所述哈希码提取模型的同模态对比损失值;和根据所述跨模态对比损失值和所述同模态对比损失值,对所述哈希码提取模型进行训练。
本公开另一方面实施例提出一种基于哈希码提取模型的哈希检索装置,所述装置包括:第一获取模块,用于获取所述目标部位的待检索数据,其中,所述待检索数据的模态为影像模态或者文本模态;哈希码确定模块,用于将所述待检索数据输入到哈希码提取模型,以得到所述待检索数据对应的目标哈希码;第二获取模块,用于从不同于所述待检索数据的模态的数据库中,获取与所述目标哈希码匹配的检索结果,其中,所述哈希码提取模型是通过哈希码提取模型的训练装置得到的,所述训练装置包括:第一获取模块,用于获取所述目标部位的正常影像样本、第一疾病影像样本、第二疾病影像样本以及与所述第一疾病影像样本对应的疾病诊断文本,其中,所述第一疾病影像样本、所述第二疾病影像样本和所述疾病诊断文本所对应的疾病名称是相同的;第一确定模块,用于通过所述哈希码提取模型分别确定所述正常影像样本、所述第一疾病影像样本、所述第二疾病影像样本和所
述疾病诊断文本各自的哈希码;第二确定模块,用于根据所述正常影像样本的哈希码和所述疾病诊断文本的哈希码之间的距离以及所述第一疾病影像样本的哈希码和所述疾病诊断文本的哈希码之间的距离,确定所述哈希码提取模型的跨模态对比损失值;第三确定模块,用于根据所述正常影像样本的哈希码和第一疾病影像样本的哈希码之间的距离以及所述第一疾病影像样本的哈希码和所述第二疾病影像样本的哈希码之间的距离,确定所述哈希码提取模型的同模态对比损失值;和训练模块,用于根据所述跨模态对比损失值和所述同模态对比损失值,对所述哈希码提取模型进行训练。
本公开另一方面实施例提出了一种影像检索装置,包括:第一获取模块,用于获取目标部位的疾病影像数据;哈希码确定模块,用于将所述疾病影像数据输入到哈希码提取模型,以得到所述疾病影像数据对应的目标哈希码;第二获取模块,用于从所述目标部位对应的疾病诊断文本库中获取与所述目标哈希码对应的目标诊断文本。本公开实施例的影像检索装置,在获取目标部位的疾病影像数据,通过预先训练的哈希码提取模型即可确定出疾病影像数据所对应的目标哈希码,并从目标部位对应的疾病诊断文本库中获取与所述目标哈希码对应的目标诊断文本,其中,所述哈希码提取模型是通过哈希码提取模型的训练装置得到的,所述训练装置包括:第一获取模块,用于获取所述目标部位的正常影像样本、第一疾病影像样本、第二疾病影像样本以及与所述第一疾病影像样本对应的疾病诊断文本,其中,所述第一疾病影像样本、所述第二疾病影像样本和所述疾病诊断文本所对应的疾病名称是相同的;第一确定模块,用于通过所述哈希码提取模型分别确定所述正常影像样本、所述第一疾病影像样本、所述第二疾病影像样本和所述疾病诊断文本各自的哈希码;第二确定模块,用于根据所述正常影像样本的哈希码和所述疾病诊断文本的哈希码之间的距离以及所述第一疾病影像样本的哈希码和所述疾病诊断文本的哈希码之间的距离,确定所述哈希码提取模型的跨模态对比损失值;第三确定模块,用于根据所述正常影像样本的哈希码和第一疾病影像样本的哈希码之间的距离以及所述第一疾病影像样本的哈希码和所述第二疾病影像样本的哈希码之间的距离,确定所述哈希码提取模型的同模态对比损失值;和训练模块,用于根据所述跨模态对比损失值和所述同模态对比损失值,对所述哈希码提取模型进行训练。
本公开另一方面实施例提出了一种电子设备,包括:存储器、处理器及存储在存储器上并可在处理器上运行的计算机程序,所述处理器执行所述程序时实现本公开实施例的基于哈希码提取模型的哈希检索方法,或者,影像检索方法。
本公开另一方面实施例提出了一种计算机可读存储介质,其上存储有计算机程序,该程序被处理器执行时实现本公开实施例的基于哈希码提取模型的哈希检索方法,或者,影像检索方法。
本公开另一方面实施例提出了一种计算机程序产品,包括计算机程序,所述计算机程序被处理器执行时实现本公开实施例的基于哈希码提取模型的哈希检索方法,或者,影像检索方法。
本公开另一方面实施例提出了一种计算机程序,包括计算机程序代码,当所述计算机程序代码在计算机上运行时,以使得计算机执行本公开实施例的基于哈希码提取模型的哈希检索方法,或者,影像检索方法。
附图用于更好地理解本方案,不构成对本公开的限定。其中:
图1是根据本公开一个实施例的哈希码提取模型的训练方法的流程示意图;
图2是根据本公开另一个实施例的哈希码提取模型的训练方法的流程示意图;
图3是本公开一个实施例的哈希码提取模型的网络结构的示例图;
图4是根据本公开另一个实施例的哈希码提取模型的训练方法的流程示意图;
图5是根据本公开一个实施例的基于哈希码提取模型的哈希检索方法的流程示意图;
图6是根据本公开另一个实施例的基于哈希码提取模型的哈希检索方法的流程示意图;
图7是根据本公开一个实施例的影像检索方法的流程示意图;
图8是根据本公开一个实施例的哈希码提取模型的训练装置的结构示意图;
图9是根据本公开一个实施例的基于哈希码提取模型的哈希检索装置的结构示意图;
图10是根据本公开另一个实施例的影像检索装置的结构示意图;
图11是根据本公开一个实施例的电子设备的框图。
下面详细描述本发明的实施例,实施例的示例在附图中示出,其中自始至终相同或类似的标号表示相同或类似的元件或具有相同或类似功能的元件。下面通过参考附图描述的实施例是示例性的,旨在用于解释本发明,而不能理解为对本发明的限制。
下面参考附图描述本公开实施例的哈希码提取模型的训练方法、基于哈希码提取模型的哈希检索方法、影像数据检索方法及上述方法的装置、电子设备、存储介质、计算机程序产品和计算机程序。
图1是根据本公开一个实施例的哈希码提取模型的训练方法的流程示意图。其中,需要说明的是,本实施例提供的哈希码提取模型的训练方法由哈希码提取模型的训练装置执行,本实施例中的哈希码提取模型的训练装置可以由软件和/或者硬件的方式实现,该哈希码提取模型的训练装置可以为电子设备,或者可以配置在电子设备中。
其中,本示例实施例中的电子设备可以包括终端设备、服务器等,其中,终端设备可以为PC(Personal Computer,个人计算机)、移动设备、平板电脑等,该实施例对此不做具体限定。
如图1所示,该哈希码提取模型的训练方法可以包括:步骤101至步骤105。
步骤101,获取目标部位的正常影像样本、第一疾病影像样本、第二疾病影像样本以及与第一疾病影像样本对应的疾病诊断文本,其中,第一疾病影像样本、第二疾病影像样本和疾病诊断文本所对应的疾病名称是相同的。
其中,本实施例中的目标部位可以为人体或者动物中的任意一个部位。其中,本示例实施例中以目标部位为人体中的一个部位为例进行描述,例如,目标部位可以为人体的胸部。
其中,本示例中的疾病名称可以任意一种疾病所对应的名称。例如,疾病名称可以为肺炎。
其中,需要说明的是,第一疾病影像样本和第二疾病影像样本可以是同一种疾病名称所对应的两个不同的疾病影像样本,例如,第一疾病影像样本和第二疾病影像样本可以为患肺炎的两个病例各自对应的胸部影像。
其中,需要说明的是,本示例实施例对各种数据的获取、存储、使用、处理等均符合国家法律法规的相关规定。
步骤102,通过哈希码提取模型分别确定正常影像样本、第一疾病影像样本、第二疾病影像样本和疾病诊断文本各自的哈希码。
在一些示例中，可分别将正常影像样本、第一疾病影像样本、第二疾病影像样本和疾病诊断文本输入到哈希码提取模型中，以通过哈希码提取模型分别对正常影像样本、第一疾病影像样本、第二疾病影像样本和疾病诊断文本进行处理，以得到正常影像样本、第一疾病影像样本、第二疾病影像样本和疾病诊断文本各自对应的哈希码。
其中,需要说明的是,此时的哈希码提取模型是指还未经过训练的初始的哈希码提取模型。
步骤103,根据正常影像样本的哈希码和疾病诊断文本的哈希码之间的距离以及第一疾病影像样本的哈希码和疾病诊断文本的哈希码之间的距离,确定哈希码提取模型的跨模态对比损失值。
在一些示例中，在确定出正常影像样本的哈希码和疾病诊断文本的哈希码之间的距离以及第一疾病影像样本的哈希码和疾病诊断文本的哈希码之间的距离后，可将所确定出的两个距离输入至哈希码提取模型的跨模态对比损失函数中，以通过跨模态对比损失函数确定哈希码提取模型的跨模态对比损失值。
在另一些示例中，在确定出正常影像样本的哈希码和疾病诊断文本的哈希码之间的距离以及第一疾病影像样本的哈希码和疾病诊断文本的哈希码之间的距离后，可对确定出的两个距离进行加权求和，以得到哈希码提取模型的跨模态对比损失值。
其中,需要说明的是,本示例中的距离可以为汉明距离,汉明距离越小表示两者之间的哈希码越接近,反之则表示两者之间的哈希码差异越大。
步骤104,根据正常影像样本的哈希码和第一疾病影像样本的哈希码之间的距离以及第一疾病影像样本的哈希码和第二疾病影像样本的哈希码之间的距离,确定哈希码提取模型的同模态对比损失值。
在一些示例中,在确定出正常影像样本的哈希码和第一疾病影像样本的哈希码之间的距离以及第一疾病影像样本的哈希码和第二疾病影像样本的哈希码之间的距离后,可将所确定出的两个距离输入到哈希码提取模型的同模态对比损失函数中,以通过同模态对比损失函数确定出哈希码提取模型的同模态对比损失值。
在另一些示例中，在确定出正常影像样本的哈希码和第一疾病影像样本的哈希码之间的距离以及第一疾病影像样本的哈希码和第二疾病影像样本的哈希码之间的距离后，可对所确定出的两个距离进行加权求和，以得到哈希码提取模型的同模态对比损失值。
步骤105,根据跨模态对比损失值和同模态对比损失值,对哈希码提取模型进行训练。
在一些示例性的实施方式中,可根据跨模态对比损失值和同模态对比损失值,确定出哈希码提取模型的总损失值,并根据总损失值对哈希码提取模型的模型参数进行调整,并对调整后的哈希码提取模型继续训练,直至总损失值满足预设条件。
在一些示例性的实施方式中,可对跨模态对比损失值和同模态对比损失值进行加权求和,以得到哈希码提取模型的总损失值。
其中,预设条件即为模型训练结束的条件。预设条件可以根据实际需求进行相应的配置。例如,总损失值满足预设条件可以是总损失值小于预设值,也可以是总损失值的变化趋近于平稳,即相邻两次或多次训练对应的总损失值的差值小于设定值,也就是总损失值基本不再变化。
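作为示意，下面给出一个按照上述预设条件停止训练的简化训练循环。其中损失阈值、平稳判断的设定值以及 PyTorch 风格的优化器/反向传播接口均为示例性假设，并非本公开限定的实现：

```python
def train(model, data_loader, optimizer, total_loss_fn,
          loss_threshold=0.01, plateau_eps=1e-4, max_epochs=100):
    """示意性训练循环：当总损失值小于阈值，或相邻两轮的总损失差值小于设定值时停止。"""
    prev_loss = None
    for epoch in range(max_epochs):
        epoch_loss = 0.0
        for batch in data_loader:
            optimizer.zero_grad()
            loss = total_loss_fn(model, batch)   # 跨模态对比损失与同模态对比损失的加权和
            loss.backward()
            optimizer.step()
            epoch_loss += float(loss)
        epoch_loss /= max(len(data_loader), 1)

        # 预设条件一：总损失值小于预设值
        if epoch_loss < loss_threshold:
            break
        # 预设条件二：总损失值基本不再变化（相邻两次训练的差值小于设定值）
        if prev_loss is not None and abs(prev_loss - epoch_loss) < plateau_eps:
            break
        prev_loss = epoch_loss
    return model
```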
本公开实施例的哈希码提取模型的训练方法,在获取目标部位的正常影像样本、第一疾病影像样本、第二疾病影像样本以及与第一疾病影像样本对应的疾病诊断文本后,通过哈希码提取模型分别确定正常影像样本、第一疾病影像样本、第二疾病影像样本和疾病诊断文本各自的哈希码;根据正常影像样本的哈希码和疾病诊断文本的哈希码之间的距离以及第一疾病影像样本的哈希码和疾病诊断文本的哈希码之间的距离,确定哈希码提取模型的跨模态对比损失值;根据正常影像样本的哈希码和第一疾病影像样本的哈希码之间的距
离以及第一疾病影像样本的哈希码和第二疾病影像样本的哈希码之间的距离,确定哈希码提取模型的同模态对比损失值;根据跨模态对比损失值和同模态对比损失值,对哈希码提取模型进行训练。由此,无需对样本数据进行人工标注,通过对正常影像样本、疾病影像样本以及疾病诊断文本进行对比学习,即可实现对哈希码提取模型的训练,降低了哈希码提取模型的训练成本。
在上述实施例的基础上，在本示例中的哈希码提取模型包括影像深度网络、文本深度网络、与影像深度网络连接的第一哈希层和与文本深度网络连接的第二哈希层的情况下，为了可以清楚理解通过哈希码提取模型如何分别确定正常影像样本、第一疾病影像样本、第二疾病影像样本和疾病诊断文本各自的哈希码，本实施例还提出了一种哈希码提取模型的训练方法，下面结合图2对该过程进行示例性描述。
图2是根据本公开另一个实施例的哈希码提取模型的训练方法的流程示意图。
如图2所示,该哈希码提取模型的训练方法,可以包括:步骤201至步骤206。
步骤201,获取目标部位的正常影像样本、第一疾病影像样本、第二疾病影像样本以及与第一疾病影像样本对应的疾病诊断文本,其中,第一疾病影像样本、第二疾病影像样本和疾病诊断文本所对应的疾病名称是相同的。
其中,需要说明的是,关于步骤201的具体实现方式,可参见上述实施例的相关描述,此处不再赘述。
步骤202,通过影像深度网络确定第一疾病影像样本的第二图像特征,并通过文本深度网络确定疾病诊断文本的文本特征。
其中，本示例实施例中的影像深度网络可以为残差深度网络，例如，影像深度网络可以为残差网络Resnet50。可以理解的是，在实际应用中，本实施例中的影像深度网络还可以是其他能够对影像数据进行特征提取的类型的深度网络，该实施例对此不作具体限定。
在一些示例性的实施方式中,为了准确确定出疾病诊断文本的文本特征,本示例中的文本深度网络可以为预先训练好的语言表示模型,例如,语言表示模型可以为双向变换器的编码器(Bidirectional Encoder Representations from Transformers)BERT模型。又例如,语言表示模型可以为知识增强的语义表示模型(Enhanced Representation through Knowledge Integration,ERNIE)。可以理解的是,在实际应用中,本实施例中的文本深度网络还可以其他能够对疾病诊断文本进行特征提取的其他类型的深度网络,该实施例对此不作具体限定。
在本公开的一个实施例中,为了使得影像深度网络可以专注于第一疾病影像样本的病灶区域的特征提取,本示例实施例中的哈希码提取模型还可以包括自注意力层,自注意力
层设置在影像深度网络和文本深度网络之间,通过影像深度网络确定第一疾病影像样本的第二图像特征,并通过文本深度网络确定疾病诊断文本的文本特征的一种可能实现方式为:通过文本深度网络确定疾病诊断文本的文本特征;将文本特征输入到自注意力层中,以得到疾病诊断文本的注意力特征;将注意力特征输入到影像深度网络中,以使得影像深度网络基于注意力特征对第一疾病影像样本进行病灶特征提取,得到第一疾病影像样本的第二图像特征。
例如，哈希码提取模型的网络结构的示例图如图3所示，目标部位为人体的胸部，疾病名称为肺炎，第一疾病影像样本为肺炎胸部影像样本，疾病诊断文本为肺炎诊断描述文本。对应地，可将肺炎胸部影像样本输入到图3中的影像深度网络，并将肺炎诊断描述文本输入到图3中的BERT网络中，以得到肺炎诊断描述文本的语义表示向量，并将语义表示向量输入到自注意力层，并将自注意力层输出的注意力特征输入到影像深度网络中前几个卷积层中（其中，需要说明的是，图3中未示意出影像深度网络中的多个卷积层；例如影像深度网络中可以包括依次连接的五个卷积层，可将注意力特征输入到影像深度网络中的前三个卷积层中），并将最后一个卷积层输出的图像特征输入到第一哈希层中，以通过第一哈希层得到肺炎胸部影像样本的哈希码。对应地，语义表示向量还被输入到第二哈希层中，以通过第二哈希层得到肺炎诊断描述文本的哈希码。其中，图3中用第一哈希码表示肺炎胸部影像样本的哈希码，用第二哈希码表示肺炎诊断描述文本的哈希码。
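下面给出与图3所示网络结构对应的一个简化 PyTorch 草图：文本特征（假定来自 BERT 等文本深度网络的语义表示向量）经自注意力层得到注意力特征，并以逐通道加权的方式注入影像分支的前几个卷积层；两个哈希层将影像特征和文本特征映射为相同长度的哈希码。其中卷积层数、注入方式和各特征维度均为示例性假设，并非本公开的确切实现：

```python
import torch
import torch.nn as nn

class HashExtractionSketch(nn.Module):
    """示意性网络：文本特征经自注意力得到注意力特征，注入影像分支的前几个卷积层；
    两个哈希层输出同一长度的哈希码（映射到同一编码空间）。"""

    def __init__(self, text_dim=768, code_len=64):
        super().__init__()
        # 文本分支：自注意力层（文本特征假定来自 BERT 等文本深度网络）
        self.self_attn = nn.MultiheadAttention(text_dim, num_heads=8, batch_first=True)
        # 影像分支：用少量卷积层近似影像深度网络（实际可为 Resnet50）
        self.conv1 = nn.Conv2d(1, 32, 3, stride=2, padding=1)
        self.conv2 = nn.Conv2d(32, 64, 3, stride=2, padding=1)
        self.conv3 = nn.Conv2d(64, 128, 3, stride=2, padding=1)
        self.pool = nn.AdaptiveAvgPool2d(1)
        # 注意力特征 -> 前几个卷积层的逐通道权重（注入方式为示例性假设）
        self.attn_to_c1 = nn.Linear(text_dim, 32)
        self.attn_to_c2 = nn.Linear(text_dim, 64)
        # 两个哈希层映射到同一哈希码长度
        self.image_hash = nn.Linear(128, code_len)      # 第一哈希层（影像）
        self.text_hash = nn.Linear(text_dim, code_len)  # 第二哈希层（文本）

    def forward(self, image, text_feat):
        # text_feat: (batch, seq_len, text_dim)，例如 BERT 输出的语义表示向量
        attn_out, _ = self.self_attn(text_feat, text_feat, text_feat)
        attn_vec = attn_out.mean(dim=1)                 # (batch, text_dim) 注意力特征

        x = torch.relu(self.conv1(image))
        x = x * torch.sigmoid(self.attn_to_c1(attn_vec))[:, :, None, None]  # 注入第1个卷积层
        x = torch.relu(self.conv2(x))
        x = x * torch.sigmoid(self.attn_to_c2(attn_vec))[:, :, None, None]  # 注入第2个卷积层
        x = torch.relu(self.conv3(x))
        img_feat = self.pool(x).flatten(1)

        image_code = torch.tanh(self.image_hash(img_feat))  # 第一哈希码（影像）
        text_code = torch.tanh(self.text_hash(attn_vec))    # 第二哈希码（文本）
        return image_code, text_code
```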
步骤203,将第二图像特征输入到第一哈希层中以得到第一疾病影像样本的哈希码,并将文本特征输入到第二哈希层中以得到疾病诊断文本的哈希码。
在一些示例中，将第二图像特征输入到第一哈希层中，对应地，第一哈希层基于第二图像特征进行哈希计算，以得到第一疾病影像样本的哈希码。
在一些示例中,将文本特征输入到第二哈希层中,对应地,第二哈希层基于文本特征进行哈希计算,以得到疾病诊断文本的哈希码。
其中，需要说明的是，为了方便后续实现两种模态的数据之间可通过哈希码进行相互检索，提高检索效率，本示例实施例中的第一哈希层和第二哈希层进行哈希计算时所使用的哈希码编码空间可以是相同的。也就是说，本实施例中的第一哈希层和第二哈希层可以将影像和文本映射到同一个哈希码编码空间中。
步骤204,根据正常影像样本的哈希码和疾病诊断文本的哈希码之间的距离以及第一疾病影像样本的哈希码和疾病诊断文本的哈希码之间的距离,确定哈希码提取模型的跨模态对比损失值。
步骤205,根据正常影像样本的哈希码和第一疾病影像样本的哈希码之间的距离以及第一疾病影像样本的哈希码和第二疾病影像样本的哈希码之间的距离,确定哈希码提取模型的同模态对比损失值。
步骤206,根据跨模态对比损失值和同模态对比损失值,对哈希码提取模型进行训练。
其中,需要说明的是,关于步骤204至步骤206的具体实现方式,可参见上述实施例中的相关描述,此处不再赘述。
在本示例实施例中,通过哈希码提取模型中的影像深度网络对正常影像样本以及对应的第二疾病影像样本进行处理,并通过与影像深度网络连接的第一哈希层基于影像深度网络的输出准确确定出了正常影像样本以及第二疾病影像样本各自对应的哈希码,并通过哈希码提取模型中的影像深度网络和文本深度网络对第一疾病影像样本和疾病诊断文本分别进行特征提取,并通过第一哈希层对图像特征进行哈希计算,并通过第二哈希层对文本特征进行哈希计算准确确定出了第一疾病影像样本和疾病诊断文本各自对应的哈希码,由此,方便后续可基于所确定出的哈希码,准确确定出哈希码提取模型的模态对比损失值和同模态对比损失值,继而可对哈希码提取模型进行准确训练,有利于哈希码提取模型的训练。
为了可以清楚理解本公开,下面结合图4对该实施例的哈希码提取模型的训练方法进行示例性描述,其中,需说明的是,本示例中以目标部位为人体的胸部,疾病名称为肺炎为例进行描述。
如图4所示,可以包括:步骤401至步骤405。
步骤401,获取胸部的正常影像样本、第一肺炎影像样本A、第二肺炎影像样本B以及与第一肺炎影像样本A对应的肺炎诊断文本A。
步骤402,通过哈希码提取模型分别确定出正常影像样本、第一肺炎影像样本A、第二肺炎影像样本B以及与第一肺炎影像样本A对应的肺炎诊断文本A各自对应的哈希码。
其中,需要说明的是,关于通过哈希码提取模型分别确定出正常影像样本、第一肺炎影像样本A、第二肺炎影像样本B以及与第一肺炎影像样本A对应的肺炎诊断文本A各自对应的哈希码的具体实现方式,可参见本公开实施例中的相关描述,此处不再赘述。
步骤403,根据正常影像样本的哈希码nV和疾病诊断文本A的哈希码FAT之间的距离以及第一疾病影像样本A的哈希码FAV和疾病诊断文本A的哈希码FAT之间的距离,确定哈希码提取模型的跨模态对比损失值。
步骤404,根据正常影像样本的哈希码nV和第一疾病影像样本的哈希码FAV之间的距离以及第一疾病影像样本的哈希码FAV和第二疾病影像样本的哈希码FBV之间的距离,确定哈希码提取模型的同模态对比损失值。
步骤405,根据跨模态对比损失值和同模态对比损失值,对哈希码提取模型进行训练。
也就是说,本示例中基于双阶段对比学习方式对哈希码提取模型进行训练,其中,双阶段学习对比方式主要是包括由影像数据之间的对比学习和跨影像-诊断文本之间的对比学习方法。通过设计双层次的对比学习损失进行训练,包括:同模态对比损失值和跨模态对比损失值。对于同模态的影像数据:设计模态内对比损失,拉近相同病灶特征之间的距离,相反拉远正常影像与肺炎影像之间的距离。由此,使网络学习提取肺炎病灶相关特征和病灶无关特征。对于跨影像和诊断文档数据:设计模态间的对比学习损失,拉近病灶诊断文档与病灶影像数据特征之间的距离,从而促进病灶表征提取能力。
在本公开的一个实施例中,为了解决符号函数sign导致深度网络梯度无法反传进行优化的问题,在对哈希提取模型进行训练的过程中,可将第一哈希层以及第二哈希层的输出通过tanh激活函数以得到对应文本样本或者影像样本的最终的哈希码,并基于最终的哈希码来确定哈希码提取模型的同模态对比损失值和跨模态对比损失值,并基于同模态对比损失值和跨模态对比损失值,对哈希码提取模型的模型参数进行调整,以实现对哈希码提取模型的训练。
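作为示意，下面的片段展示了训练阶段用 tanh 对哈希层输出进行松弛以保证梯度可以反传、而在建库与检索阶段再用符号函数得到二值哈希码的常见做法。其中具体的松弛与二值化方式仅为示例性假设，并非本公开限定：

```python
import torch

def relaxed_code(hash_layer_output: torch.Tensor) -> torch.Tensor:
    """训练阶段：用 tanh 代替 sign，使梯度可以反传（取值在 (-1, 1) 之间）。"""
    return torch.tanh(hash_layer_output)

def binary_code(hash_layer_output: torch.Tensor) -> torch.Tensor:
    """检索/建库阶段：用符号函数得到最终的二值哈希码（+1 / -1），该步骤不可导。"""
    return torch.sign(hash_layer_output)

# 示例：训练时对 relaxed_code 计算对比损失并反传梯度；
# 建库与检索时再用 binary_code 得到二值哈希码并计算汉明距离。
x = torch.randn(2, 64, requires_grad=True)
loss = relaxed_code(x).pow(2).sum()
loss.backward()                 # 梯度可以顺利反传
codes = binary_code(x.detach())
```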
在本示例中,在对哈希码提取模型进行训练的过程中,采用胸部的正常影像样本、第一肺炎影像样本A、第二肺炎影像样本B以及与第一肺炎影像样本A对应的肺炎诊断文本A进行对比学习的方式对哈希码提取模型进行训练,从而降低模型训练过程中人工标注样本数据的成本,降低了模型的训练成本。
图5是根据本公开一个实施例的基于哈希码提取模型的哈希检索方法的流程示意图。其中,需要说明的是,本实施例提供的基于哈希码提取模型的哈希检索方法由基于哈希码提取模型的哈希检索装置执行,本实施例中的基于哈希码提取模型的哈希检索装置可以由软件和/或者硬件的方式实现,该基于哈希码提取模型的哈希检索装置可以为电子设备,或者可以配置在电子设备中。
其中,本示例实施例中的电子设备可以包括终端设备、服务器等,其中,终端设备可以为PC(Personal Computer,个人计算机)、移动设备、平板电脑等,该实施例对此不做具体限定。
如图5所示,该基于哈希码提取模型的哈希检索方法可以包括:步骤501至步骤503。
步骤501,获取目标部位的待检索数据,其中,待检索数据的模态为影像模态或者文本模态。
其中,本实施例中的目标部位可以为人体或者动物中的任意一个部位。其中,本示例实施例中以目标部位为人体中的一个部位为例进行描述,例如,目标部位可以为人体的胸部。
其中,可以理解的是,在待检索数据的模态为影像模态的情况下,说明待检索数据为待检索影像数据。对应地,在待检索数据的模态为文本模态的情况下,说明待检索数据为待检索文本数据。
步骤502,将待检索数据输入到哈希码提取模型,以得到待检索数据对应的目标哈希码。
其中，需要说明的是，本示例实施例中所使用的哈希码提取模型是通过本公开所公开的训练方法训练得到的。
其中,需要说明的是,关于训练哈希码提取模型的过程,可参见本公开所公开的相关描述,此处不再赘述。
步骤503,从不同于待检索数据的模态的数据库中,获取与目标哈希码匹配的检索结果。
作为一种示例,在待检索数据的模态为影像模态的情况下,不同于待检索数据的模态的数据库为文本模态所对应的数据库,其中,该数据库中所保存的数据为目标部位的已有疾病诊断文本的哈希码。对应地,可从该数据库中获取与目标哈希码匹配的检索结果。
作为另一种示例,在待检索数据的模态为文本模态的情况下,不同于待检索数据的模态的数据库为影像模态所对应的数据库,其中,该数据库中所保存的数据为目标部位的已有疾病影像的哈希码。对应地,可从该数据库中获取与目标哈希码匹配的检索结果。
例如,在目标部位为胸部,在待检索数据的模态为影像模态,即,待检索数据为待检索肺炎影像,对应的,可通过预先训练好的哈希码提取模型来确定出待检索肺炎影像的目标哈希码,然后,从用于保存文本模态的数据库中,基于目标哈希码进行检索,以得到与目标哈希码匹配的目标肺炎诊断文本。
又例如,在目标部位为胸部,在待检索数据的模态为文本模态,即,待检索数据为待检索肺炎诊断文本,对应的,可通过预先训练好的哈希码提取模型来确定出待检索肺炎诊断文本的目标哈希码,然后,从用于保存影像模态的数据库中,基于目标哈希码进行检索,以得到与目标哈希码匹配的目标肺炎影像数据。由此,实现了诊断文本与影像数据之间可通过哈希码进行相互检索,提高检索效率。
本公开实施例提供的基于哈希码提取模型的哈希检索方法,通过预先训练好的哈希码提取模型来确定目标部位的待检索数据的目标哈希码,并从不同于待检索数据的模态的数据库中,获取与目标哈希码匹配的检索结果。由此,实现了不同模态数据之间可进行相互检索的同时,有效提高检索效率。
图6是根据本公开另一个实施例的基于哈希码提取模型的哈希检索方法的流程示意图。其中,需要说明的是,本实施例提供的基于哈希码提取模型的哈希检索方法是对前述实施例的进一步细化。
如图6所示,该基于哈希码提取模型的哈希检索方法可以包括:步骤601至步骤604。
步骤601,获取目标部位的待检索数据,其中,待检索数据的模态为影像模态或者文本模态。
步骤602,将待检索数据输入到哈希码提取模型,以得到待检索数据对应的目标哈希码。
其中，需要说明的是，本示例实施例中所使用的哈希码提取模型是通过本公开所公开的训练方法训练得到的。
其中,需要说明的是,关于训练哈希码提取模型的过程,可参见本公开所公开的相关描述,此处不再赘述。
步骤603,确定目标哈希码与数据库中各个数据的哈希码之间的距离。
在一些示例中,可计算目标哈希码与数据库中各个数据的哈希码之间的汉明距离。
步骤604,根据距离,从各个数据中获取与目标哈希码匹配的检索结果。
在本公开的一个实施例中,在不同应用场景中,根据距离,从各个数据中获取与目标哈希码匹配的检索结果的实现方式不同,示例性方式如下:
作为一种示例,根据距离,从各个数据中选择出距离最短的目标数据作为检索结果。
作为另一种示例,根据距离,从各个数据中选择出距离小于预设距离阈值的目标数据作为检索结果。
作为一种示例,按照距离从低到高的顺序对各个数据进行排序,并从排序结果中选择出排序在前N位的数据作为检索结果,其中,N为大于或者等于1的整数。
其中,本示例实施例中的距离可以为汉明距离。
在本公开的一个实施例中,为了可以准确确定出各个数据的哈希码,本示例中的数据库中各个数据的哈希码可以通过下述方式得到:针对各个数据,将数据输入到哈希码提取模型中,以通过哈希码提取模型得到数据的哈希码。
图7是根据本公开一个实施例的影像检索方法的流程示意图。其中,需要说明的是,本实施例提供的影像检索方法由影像检索装置执行,本实施例中的影像检索装置可以由软件和/或者硬件的方式实现,该影像检索装置可以为电子设备,或者可以配置在电子设备中。
其中,本示例实施例中的电子设备可以包括终端设备、服务器等,其中,终端设备可以为PC(Personal Computer,个人计算机)、移动设备、平板电脑等,该实施例对此不做具体限定。
如图7所示,该影像检索方法可以包括:步骤701至步骤703。
步骤701,获取目标部位的疾病影像数据。
其中,本实施例中的目标部位可以为人体或者动物中的任意一个部位。其中,本示例实施例中以目标部位为人体中的一个部位为例进行描述,例如,目标部位可以为人体的胸部。
步骤702,将疾病影像数据输入到哈希码提取模型,以得到疾病影像数据对应的目标哈希码。
其中，需要说明的是，本示例实施例中所使用的哈希码提取模型是通过本公开所公开的训练方法训练得到的。
其中,需要说明的是,关于训练哈希码提取模型的过程,可参见本公开所公开的相关描述,此处不再赘述。
步骤703,从目标部位对应的疾病诊断文本库中获取与目标哈希码对应的目标诊断文本。
其中,本示例中的疾病诊断文本库中保存已有疾病诊断文本与其对应的哈希码。
作为一种示例性的实施方式中,本示例中疾病诊断文本库中各个已有疾病诊断文本所对应的哈希码的示例性获取方式可以为:针对各个已有疾病诊断文本,可将该已有疾病诊断文本输入到哈希码提取模型,以通过哈希码提取模型确定出该已有疾病诊断文本所对应的哈希码。
在本公开的一个实施例中,从目标部位对应的疾病诊断文本库中获取与目标哈希码对应的目标诊断文本的一种可能实现方式为:确定出目标哈希码与疾病诊断文本库中各个已有疾病诊断文本的哈希码之间的距离;根据距离,从各个已有疾病诊断文本中获取与目标哈希码匹配的疾病诊断文本。
在本公开的一个实施例中,在不同应用场景中,根据距离,从各个已有疾病诊断文本中获取与目标哈希码匹配的疾病诊断文本的实现方式不同,示例性方式如下:
作为一种示例,根据距离,从各个已有疾病诊断文本中选择出距离最短的目标已有疾病诊断文本作为与目标哈希码匹配的疾病诊断文本。
作为另一种示例,根据距离,从各个已有疾病诊断文本中选择出距离小于预设距离阈值的目标已有疾病诊断文本作为与目标哈希码匹配的疾病诊断文本。
作为一种示例,按照距离从低到高的顺序对各个已有疾病诊断文本进行排序,并从排序结果中选择出排序在前N位的已有疾病诊断文本作为与目标哈希码匹配的疾病诊断文本,其中,N为大于或者等于1的整数。
其中,本示例实施例中的距离可以为汉明距离。
本公开实施例的影像检索方法,在获取目标部位的疾病影像数据,通过预先训练的哈希码提取模型即可确定出疾病影像数据所对应的目标哈希码,并从目标部位对应的疾病诊断文本库中获取与所述目标哈希码对应的目标诊断文本。由此,基于疾病影像数据的哈希码,即可快速检索出与该哈希码匹配的疾病诊断文本,提高了获取疾病诊断文本的效率。
与上述几种实施例提供的哈希码提取模型的训练方法相对应,本公开的一种实施例还提供一种哈希码提取模型的训练装置,由于本公开实施例提供的哈希码提取模型的训练装置与上述几种实施例提供的哈希码提取模型的训练方法相对应,因此在哈希码提取模型的训练方法的实施方式也适用于本实施例的哈希码提取模型的训练装置,在本实施例中不再详细描述。
图8是根据本公开一个实施例的哈希码提取模型的训练装置的结构示意图。
如图8所示,该哈希码提取模型的训练装置800包括:第一获取模块801、第一确定模块802、第二确定模块803、第三确定模块804和训练模块805。
第一获取模块801,用于获取目标部位的正常影像样本、第一疾病影像样本、第二疾病影像样本以及与第一疾病影像样本对应的疾病诊断文本,其中,第一疾病影像样本、第二疾病影像样本和疾病诊断文本所对应的疾病名称是相同的。
第一确定模块802,用于通过哈希码提取模型分别确定正常影像样本、第一疾病影像样本、第二疾病影像样本和疾病诊断文本各自的哈希码。
第二确定模块803,用于根据正常影像样本的哈希码和疾病诊断文本的哈希码之间的距离以及第一疾病影像样本的哈希码和疾病诊断文本的哈希码之间的距离,确定哈希码提取模型的跨模态对比损失值。
第三确定模块804,用于根据正常影像样本的哈希码和第一疾病影像样本的哈希码之间的距离以及第一疾病影像样本的哈希码和第二疾病影像样本的哈希码之间的距离,确定哈希码提取模型的同模态对比损失值。
训练模块805,用于根据跨模态对比损失值和同模态对比损失值,对哈希码提取模型进行训练。
在本公开的一个实施例中,哈希码提取模型包括影像深度网络、文本深度网络、与影像深度网络连接的第一哈希层和与文本深度网络连接的第二哈希层,第一确定模块802,包括:
第一确定单元,用于通过影像深度网络分别确定正常影像样本和第二疾病影像样本各自对应的第一图像特征,并将第一图像特征输入到第一哈希层中,以得到正常影像样本和第二疾病影像样本各自对应的哈希码;
第二确定单元,用于通过影像深度网络确定第一疾病影像样本的第二图像特征,并通过文本深度网络确定疾病诊断文本的文本特征;
第三确定单元,用于将第二图像特征输入到第一哈希层中以得到第一疾病影像样本的哈希码,并将文本特征输入到第二哈希层中以得到疾病诊断文本的哈希码。
在本公开的一个实施例中,哈希码提取模型还包括自注意力层,自注意力层设置在影像深度网络和文本深度网络之间,第二确定单元,具体用于:通过文本深度网络确定疾病诊断文本的文本特征;将文本特征输入到自注意力层中,以得到疾病诊断文本的注意力特征;将注意力特征输入到影像深度网络中,以使得影像深度网络基于注意力特征对第一疾病影像样本进行病灶特征提取,得到第一疾病影像样本的第二图像特征。
本公开实施例的哈希码提取模型的训练装置,在获取目标部位的正常影像样本、第一疾病影像样本、第二疾病影像样本以及与第一疾病影像样本对应的疾病诊断文本后,通过哈希码提取模型分别确定正常影像样本、第一疾病影像样本、第二疾病影像样本和疾病诊断文本各自的哈希码;根据正常影像样本的哈希码和疾病诊断文本的哈希码之间的距离以及第一疾病影像样本的哈希码和疾病诊断文本的哈希码之间的距离,确定哈希码提取模型的跨模态对比损失值;根据正常影像样本的哈希码和第一疾病影像样本的哈希码之间的距离以及第一疾病影像样本的哈希码和第二疾病影像样本的哈希码之间的距离,确定哈希码提取模型的同模态对比损失值;根据跨模态对比损失值和同模态对比损失值,对哈希码提取模型进行训练。由此,无需对样本数据进行人工标注,通过对正常影像样本、疾病影像样本以及疾病诊断文本进行对比学习,即可实现对哈希码提取模型的训练,降低了哈希码提取模型的训练成本。
与上述几种实施例提供的哈希码提取模型的哈希码检索方法相对应,本公开的一种实施例还提供一种哈希码提取模型的哈希码检索装置,由于本公开实施例提供的哈希码提取模型的哈希码检索装置与上述几种实施例提供的哈希码提取模型的哈希码检索方法相对应,因此在哈希码提取模型的哈希码检索方法的实施方式也适用于本实施例的哈希码提取模型的哈希码检索装置,在本实施例中不再详细描述。
图9是根据本公开一个实施例的基于哈希码提取模型的哈希检索装置的结构示意图。
如图9所示,该基于哈希码提取模型的哈希检索装置900包括:第一获取模块901、哈希码确定模块902和第二获取模块903。
第一获取模块901,用于获取目标部位的待检索数据,其中,待检索数据的模态为影像模态或者文本模态。
哈希码确定模块902,用于将待检索数据输入到哈希码提取模型,以得到待检索数据对应的目标哈希码。
其中,本实施例中的哈希码提取模型是通过本公开实施例所提出的哈希码提取模型的训练方法训练得到的。
第二获取模块903,用于从不同于待检索数据的模态的数据库中,获取与目标哈希码匹配的检索结果。
在本公开的一个实施例中,第二获取模块903,包括:确定单元和获取单元。
确定单元,用于确定目标哈希码与数据库中各个数据的哈希码之间的距离。
获取单元,用于根据距离,从各个数据中获取与目标哈希码匹配的检索结果。
在本公开的一个实施例中获取单元,具体用于:
根据距离,从各个数据中选择出距离最短的目标数据作为检索结果;或者,
按照距离从低到高的顺序对各个数据进行排序,并从排序结果中选择出排序在前N位的数据作为检索结果,其中,N为大于或者等于1的整数。
在本公开的一个实施例中数据库中各个数据的哈希码是通过下述方式得到的:针对各个数据,将数据输入到哈希码提取模型中,以通过哈希码提取模型得到数据的哈希码。
本公开实施例的基于哈希码提取模型的哈希检索装置,通过预先训练好的哈希码提取模型来确定目标部位的待检索数据的目标哈希码,并从不同于待检索数据的模态的数据库中,获取与目标哈希码匹配的检索结果。由此,实现了不同模态数据之间可进行相互检索的同时,有效提高检索效率。
图10是根据本公开一个实施例的影像检索装置的结构示意图。
如图10所示,该影像检索装置1000包括:第一获取模块1001、哈希码确定模块1002和第二获取模块1003。
第一获取模块1001,用于获取目标部位的疾病影像数据。
哈希码确定模块1002,用于将疾病影像数据输入到哈希码提取模型,以得到疾病影像数据对应的目标哈希码。
其中,哈希码提取模型是本公开实施例所提供的哈希码提取模型的训练方法训练得到的。
其中,需要说明的是,关于哈希码提取模型的训练方法的具体描述,可参见本公开实施例的相关描述,此处不再赘述。
第二获取模块1003,用于从目标部位对应的疾病诊断文本库中获取与目标哈希码对应的目标诊断文本。
其中,需要说明的是,前述对影像检索方法实施例的解释说明也适用于该影像检索装置,此处不再赘述。
本公开实施例的影像检索装置,在获取目标部位的疾病影像数据,通过预先训练的哈希码提取模型即可确定出疾病影像数据所对应的目标哈希码,并从目标部位对应的疾病诊断文本库中获取与目标哈希码对应的目标诊断文本。由此,基于疾病影像数据的哈希码,即可快速检索出与该哈希码匹配的疾病诊断文本,提高了获取疾病诊断文本的效率。
根据本公开的实施例,本公开还提供了一种电子设备和一种可读存储介质。
图11是根据本公开一个实施例的电子设备的结构框图。
如图11所示,该电子设备1100包括:存储器1110、处理器1120及存储在存储器1110上并可在处理器1120上运行的计算机指令。所述电子设备用于哈希检索或影像检索。
处理器1120执行指令时实现上述实施例中提供的基于哈希码提取模型的哈希检索方法,或者哈希码提取模型的训练方法,或者,影像检索方法。
在一些实施例中,电子设备1100还包括:
通信接口1130,用于存储器1110和处理器1120之间的通信。
存储器1110,用于存放可在处理器1120上运行的计算机指令。在一些实施例中,处理器可以为中央处理器(Central Processing Unit,CPU)、特定应用集成电路(application-specific integrated circuit,ASIC)、数字信号处理器(DSP)、专用集成电路(ASIC)、现成可编程门阵列(FPGA)或者其他可编程逻辑器件等。
存储器1110可能包含高速RAM存储器,也可能还包括非易失性存储器(non-volatile memory),例如至少一个磁盘存储器。在一些实施例中,存储器可包括存储程序区和存储数据区,其中,存储程序区可存储操作系统、以及至少一个功能(比如影像采集和处理功能等)所需的应用程序等;存储数据区可存储根据计算机的使用过程中所创建的数据,比如,影像或文本数据等。
在一些实施例中,所述电子设备还包括影像采集器。所述影像采集器可以为摄像机、X光成像类、计算机断层扫描(CT)、磁共振成像(MRI)和核医学类、超声设备和内镜设备等。
处理器1120,用于执行程序时实现上述实施例的基于哈希码提取模型的哈希检索方法,或者哈希码提取模型的训练方法,或者,影像检索方法。
如果存储器1110、处理器1120和通信接口1130独立实现,则通信接口1130、存储器1110和处理器1120可以通过总线相互连接并完成相互间的通信。总线可以是工业标准体系结构(Industry Standard Architecture,简称为ISA)总线、外部设备互连(Peripheral Component,简称为PCI)总线或扩展工业标准体系结构(Extended Industry Standard Architecture,简称为EISA)总线等。总线可以分为地址总线、数据总线、控制总线等。为便于表示,图8中仅用一条粗线表示,但并不表示仅有一根总线或一种类型的总线。
在一些实施例中,在具体实现上,如果存储器1110、处理器1120及通信接口1130,集成在一块芯片上实现,则存储器1110、处理器1120及通信接口1130可以通过内部接口完成相互间的通信。
处理器1120可能是一个中央处理器(Central Processing Unit,简称为CPU),或者是特定集成电路(Application Specific Integrated Circuit,简称为ASIC),或者是被配置成实施本公开实施例的一个或多个集成电路。
本公开另一方面实施例提出了一种计算机可读存储介质,其上存储有计算机程序,该程序被处理器执行时实现本公开实施例任一的基于哈希码提取模型的哈希检索方法,或者哈希码提取模型的训练方法,或者,影像检索方法。
根据本公开的实施例,本公开还提供了一种计算机程序产品,包括计算机程序,计算机程序在被处理器执行时实现如本公开任一实施例的哈希码提取模型的训练方法、或者,基于哈希码提取模型的哈希检索方法,或者,影像检索方法。
根据本公开的实施例,本公开还提供了一种计算机程序,该计算机程序包括计算机程序代码,当该计算机程序代码在计算机上运行时,使得计算机执行本公开任一实施例的哈希码提取模型的训练方法、或者,基于哈希码提取模型的哈希检索方法,或者,影像检索方法。
需要说明的是,前述对方法、装置实施例的解释说明也适用于上述实施例的电子设备、计算机可读存储介质、计算机程序产品和计算机程序,此处不再赘述。
在本说明书的描述中,参考术语“一个实施例”、“一些实施例”、“示例”、“具体示例”、或“一些示例”等的描述意指结合该实施例或示例描述的具体特征、结构、材料或者特点包含于本发明的至少一个实施例或示例中。在本说明书中,对上述术语的示意性表述不必须针对的是相同的实施例或示例。而且,描述的具体特征、结构、材料或者特点可以在任一个或多个实施例或示例中以合适的方式结合。此外,在不相互矛盾的情况下,本领域的技术人员可以将本说明书中描述的不同实施例或示例以及不同实施例或示例的特征进行结合和组合。
此外,术语“第一”、“第二”仅用于描述目的,而不能理解为指示或暗示相对重要性或者隐含指明所指示的技术特征的数量。由此,限定有“第一”、“第二”的特征可以明示或者隐含地包括至少一个该特征。在本发明的描述中,“多个”的含义是至少两个,例如两个,三个等,除非另有明确具体的限定。
尽管上面已经示出和描述了本发明的实施例,可以理解的是,上述实施例是示例性的,不能理解为对本发明的限制,本领域的普通技术人员在本发明的范围内可以对上述实施例进行变化、修改、替换和变型。
本公开所有实施例均可以单独被执行,也可以与其他实施例相结合被执行,均视为本公开要求的保护范围。
Claims (20)
- 一种基于哈希码提取模型的哈希检索方法,包括:获取目标部位的待检索数据,其中,所述待检索数据的模态为影像模态或者文本模态;将所述待检索数据输入到哈希码提取模型,以得到所述待检索数据对应的目标哈希码;和从不同于所述待检索数据的模态的数据库中,获取与所述目标哈希码匹配的检索结果,其中所述哈希码提取模型是通过以下步骤得到的:获取所述目标部位的正常影像样本、第一疾病影像样本、第二疾病影像样本以及与所述第一疾病影像样本对应的疾病诊断文本,其中,所述第一疾病影像样本、所述第二疾病影像样本和所述疾病诊断文本所对应的疾病名称是相同的;通过所述哈希码提取模型分别确定所述正常影像样本、所述第一疾病影像样本、所述第二疾病影像样本和所述疾病诊断文本各自的哈希码;根据所述正常影像样本的哈希码和所述疾病诊断文本的哈希码之间的距离以及所述第一疾病影像样本的哈希码和所述疾病诊断文本的哈希码之间的距离,确定所述哈希码提取模型的跨模态对比损失值;根据所述正常影像样本的哈希码和第一疾病影像样本的哈希码之间的距离以及所述第一疾病影像样本的哈希码和所述第二疾病影像样本的哈希码之间的距离,确定所述哈希码提取模型的同模态对比损失值;和根据所述跨模态对比损失值和所述同模态对比损失值,对所述哈希码提取模型进行训练。
- 如权利要求1所述的方法,其中所述从不同于所述待检索数据的模态的数据库中,获取与所述目标哈希码匹配的检索结果,包括:确定所述目标哈希码与所述数据库中各个数据的哈希码之间的距离;根据所述距离,从各个所述数据中获取与所述目标哈希码匹配的检索结果。
- 如权利要求2所述的方法,其中所述根据所述距离,从各个所述数据中获取与所述目标哈希码匹配的检索结果,包括:根据所述距离,从各个所述数据中选择出距离最短的目标数据作为所述检索结果;或者,按照距离从低到高的顺序对各个所述数据进行排序,并从排序结果中选择出排序在前N位的数据作为所述检索结果,其中,N为大于或者等于1的整数。
- 如权利要求1至3中任一项所述的方法,其中所述哈希码提取模型包括影像深度网络、文本深度网络、与所述影像深度网络连接的第一哈希层和与所述文本深度网络连接的第二哈希层,所述通过所述哈希码提取模型分别确定所述正常影像样本、所述第一疾病影像样本、所述第二疾病影像样本和所述疾病诊断文本各自的哈希码,包括:通过所述影像深度网络分别确定所述正常影像样本和所述第二疾病影像样本各自对应的第一图像特征,并将所述第一图像特征输入到所述第一哈希层中,以得到所述正常影像样本和所述第二疾病影像样本各自对应的哈希码;通过所述影像深度网络确定所述第一疾病影像样本的第二图像特征,并通过所述文本深度网络确定所述疾病诊断文本的文本特征;将所述第二图像特征输入到所述第一哈希层中以得到第一疾病影像样本的哈希码,并将所述文本特征输入到所述第二哈希层中以得到所述疾病诊断文本的哈希码。
- 如权利要求4所述的方法,其中所述哈希码提取模型还包括自注意力层,所述自注意力层设置在所述影像深度网络和所述文本深度网络之间,所述通过所述影像深度网络确定所述第一疾病影像样本的第二图像特征,并通过所述文本深度网络确定所述疾病诊断文本的文本特征,包括:通过所述文本深度网络确定所述疾病诊断文本的文本特征;将所述文本特征输入到所述自注意力层中,以得到所述疾病诊断文本的注意力特征;将所述注意力特征输入到所述影像深度网络中,以使得所述影像深度网络基于所述注意力特征对所述第一疾病影像样本进行病灶特征提取,得到所述第一疾病影像样本的第二图像特征。
- 如权利要求1至5中任一项所述的方法,其中所述数据库中各个数据的哈希码是通过下述方式得到的:针对各个数据,将所述数据输入到所述哈希码提取模型中,以通过所述哈希码提取模型得到所述数据的哈希码。
- 一种影像检索方法,包括:获取目标部位的疾病影像数据;将所述疾病影像数据输入到哈希码提取模型,以得到所述疾病影像数据对应的目标哈希码;和从所述目标部位对应的疾病诊断文本库中获取与所述目标哈希码对应的目标诊断文本,其中所述哈希码提取模型是通过以下步骤得到的:获取所述目标部位的正常影像样本、第一疾病影像样本、第二疾病影像样本以及与所述第一疾病影像样本对应的疾病诊断文本,其中,所述第一疾病影像样本、所述第二疾病影像样本和所述疾病诊断文本所对应的疾病名称是相同的;通过所述哈希码提取模型分别确定所述正常影像样本、所述第一疾病影像样本、所述第二疾病影像样本和所述疾病诊断文本各自的哈希码;根据所述正常影像样本的哈希码和所述疾病诊断文本的哈希码之间的距离以及所述第一疾病影像样本的哈希码和所述疾病诊断文本的哈希码之间的距离,确定所述哈希码提取模型的跨模态对比损失值;根据所述正常影像样本的哈希码和第一疾病影像样本的哈希码之间的距离以及所述第一疾病影像样本的哈希码和所述第二疾病影像样本的哈希码之间的距离,确定所述哈希码提取模型的同模态对比损失值;和根据所述跨模态对比损失值和所述同模态对比损失值,对所述哈希码提取模型进行训练。
- 如权利要求7所述的方法,其中所述哈希码提取模型包括影像深度网络、文本深度网络、与所述影像深度网络连接的第一哈希层和与所述文本深度网络连接的第二哈希层,所述通过所述哈希码提取模型分别确定所述正常影像样本、所述第一疾病影像样本、所述第二疾病影像样本和所述疾病诊断文本各自的哈希码,包括:通过所述影像深度网络分别确定所述正常影像样本和所述第二疾病影像样本各自对应的第一图像特征,并将所述第一图像特征输入到所述第一哈希层中,以得到所述正常影像样本和所述第二疾病影像样本各自对应的哈希码;通过所述影像深度网络确定所述第一疾病影像样本的第二图像特征,并通过所述文本深度网络确定所述疾病诊断文本的文本特征;将所述第二图像特征输入到所述第一哈希层中以得到第一疾病影像样本的哈希码,并将所述文本特征输入到所述第二哈希层中以得到所述疾病诊断文本的哈希码。
- 如权利要求8所述的方法,其中所述哈希码提取模型还包括自注意力层,所述自注意力层设置在所述影像深度网络和所述文本深度网络之间,所述通过所述影像深度网络确定所述第一疾病影像样本的第二图像特征,并通过所述文本深度网络确定所述疾病诊断文本的文本特征,包括:通过所述文本深度网络确定所述疾病诊断文本的文本特征;将所述文本特征输入到所述自注意力层中,以得到所述疾病诊断文本的注意力特征;将所述注意力特征输入到所述影像深度网络中,以使得所述影像深度网络基于所述注意力特征对所述第一疾病影像样本进行病灶特征提取,得到所述第一疾病影像样本的第二图像特征。
- 一种基于哈希码提取模型的哈希检索装置,包括:第一获取模块,用于获取所述目标部位的待检索数据,其中,所述待检索数据的模态为影像模态或者文本模态;哈希码确定模块,用于将所述待检索数据输入到哈希码提取模型,以得到所述待检索数据对应的目标哈希码;第二获取模块,用于从不同于所述待检索数据的模态的数据库中,获取与所述目标哈希码匹配的检索结果,其中,所述哈希码提取模型是通过哈希码提取模型的训练装置得到的,所述训练装置包括:第一获取模块,用于获取所述目标部位的正常影像样本、第一疾病影像样本、第二疾病影像样本以及与所述第一疾病影像样本对应的疾病诊断文本,其中,所述第一疾病影像样本、所述第二疾病影像样本和所述疾病诊断文本所对应的疾病名称是相同的;第一确定模块,用于通过所述哈希码提取模型分别确定所述正常影像样本、所述第一疾病影像样本、所述第二疾病影像样本和所述疾病诊断文本各自的哈希码;第二确定模块,用于根据所述正常影像样本的哈希码和所述疾病诊断文本的哈希码之间的距离以及所述第一疾病影像样本的哈希码和所述疾病诊断文本的哈希码之间的距离,确定所述哈希码提取模型的跨模态对比损失值;第三确定模块,用于根据所述正常影像样本的哈希码和第一疾病影像样本的哈希码之间的距离以及所述第一疾病影像样本的哈希码和所述第二疾病影像样本的哈希码之间的距离,确定所述哈希码提取模型的同模态对比损失值;和训练模块,用于根据所述跨模态对比损失值和所述同模态对比损失值,对所述哈希码提取模型进行训练。
- 如权利要求10所述的装置,其中所述第二获取模块包括:确定单元,用于确定所述目标哈希码与所述数据库中各个数据的哈希码之间的距离;获取单元,用于根据所述距离,从各个所述数据中获取与所述目标哈希码匹配的检索结果。
- 如权利要求11所述的装置,其中所述获取单元具体用于:根据所述距离,从各个所述数据中选择出距离最短的目标数据作为所述检索结果;或者,按照距离从低到高的顺序对各个所述数据进行排序,并从排序结果中选择出排序在前N位的数据作为所述检索结果,其中,N为大于或者等于1的整数。
- 如权利要求10至12中任一项所述的装置,其中所述哈希码提取模型包括影像深度网络、文本深度网络、与所述影像深度网络连接的第一哈希层和与所述文本深度网络连接的第二哈希层,所述第一确定模块,包括:第一确定单元,用于通过所述影像深度网络分别确定所述正常影像样本和所述第二疾病影像样本各自对应的第一图像特征,并将所述第一图像特征输入到所述第一哈希层中,以得到所述正常影像样本和所述第二疾病影像样本各自对应的哈希码;第二确定单元,用于通过所述影像深度网络确定所述第一疾病影像样本的第二图像特征,并通过所述文本深度网络确定所述疾病诊断文本的文本特征;第三确定单元,用于将所述第二图像特征输入到所述第一哈希层中以得到第一疾病影像样本的哈希码,并将所述文本特征输入到所述第二哈希层中以得到所述疾病诊断文本的哈希码。
- 如权利要求13所述的装置,其中所述哈希码提取模型还包括自注意力层,所述自注意力层设置在所述影像深度网络和所述文本深度网络之间,所述第二确定单元,具体用于:通过所述文本深度网络确定所述疾病诊断文本的文本特征;将所述文本特征输入到所述自注意力层中,以得到所述疾病诊断文本的注意力特征;将所述注意力特征输入到所述影像深度网络中,以使得所述影像深度网络基于所述注意力特征对所述第一疾病影像样本进行病灶特征提取,得到所述第一疾病影像样本的第二图像特征。
- 如权利要求10至14中任一项所述的装置,其中所述数据库中各个数据的哈希码是通过下述方式得到的:针对各个数据,将所述数据输入到所述哈希码提取模型中,以通过所述哈希码提取模型得到所述数据的哈希码。
- 一种影像检索装置,包括:第一获取模块,用于获取目标部位的疾病影像数据;哈希码确定模块,用于将所述疾病影像数据输入到哈希码提取模型,以得到所述疾病影像数据对应的目标哈希码;第二获取模块,用于从所述目标部位对应的疾病诊断文本库中获取与所述目标哈希码对应的目标诊断文本,其中,所述哈希码提取模型是通过哈希码提取模型的训练装置得到的,所述训练装置包括:第一获取模块,用于获取所述目标部位的正常影像样本、第一疾病影像样本、第二疾病影像样本以及与所述第一疾病影像样本对应的疾病诊断文本,其中,所述第一疾病影像样本、所述第二疾病影像样本和所述疾病诊断文本所对应的疾病名称是相同的;第一确定模块,用于通过所述哈希码提取模型分别确定所述正常影像样本、所述第一疾病影像样本、所述第二疾病影像样本和所述疾病诊断文本各自的哈希码;第二确定模块,用于根据所述正常影像样本的哈希码和所述疾病诊断文本的哈希码之间的距离以及所述第一疾病影像样本的哈希码和所述疾病诊断文本的哈希码之间的距离,确定所述哈希码提取模型的跨模态对比损失值;第三确定模块,用于根据所述正常影像样本的哈希码和第一疾病影像样本的哈希码之间的距离以及所述第一疾病影像样本的哈希码和所述第二疾病影像样本的哈希码之间的距离,确定所述哈希码提取模型的同模态对比损失值;和训练模块,用于根据所述跨模态对比损失值和所述同模态对比损失值,对所述哈希码提取模型进行训练。
- 一种电子设备,包括:存储器、处理器及存储在存储器上并可在处理器上运行的计算机程序,所述处理器执行所述程序时实现如权利要求1至6中任一所述的方法,或者,如权利要求7至9中任一所述的方法。
- 一种计算机可读存储介质,其上存储有计算机程序,该程序被处理器执行时实现如权利要求1至6中任一所述的方法,或者,如权利要求7至9中任一所述的方法。
- 一种计算机程序产品,包括计算机程序,所述计算机程序在被处理器执行时实现如权利要求1至6中任一所述的方法,或者,如权利要求7至9中任一所述的方法。
- 一种计算机程序,所述计算机程序包括计算机程序代码,当所述计算机程序代码在计算机上运行时,以使得计算机执行如权利要求1至6中任一所述的方法,或者,如权利要求7至9中任一所述的方法。
Applications Claiming Priority (2)

| Application Number | Priority Date | Filing Date | Title |
| --- | --- | --- | --- |
| CN202211122932.4 | 2022-09-15 | | |
| CN202211122932.4A (CN115410717B) | 2022-09-15 | 2022-09-15 | 模型训练方法、数据检索方法、影像数据检索方法和装置 |

Publications (1)

| Publication Number | Publication Date |
| --- | --- |
| WO2024055805A1 | 2024-03-21 |