CN111428072A - Ophthalmologic multimodal image retrieval method, apparatus, server and storage medium - Google Patents


Info

Publication number
CN111428072A
Authority
CN
China
Prior art keywords: image, eye, images, modal, ophthalmic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010242450.7A
Other languages
Chinese (zh)
Inventor
方建生 (Fang Jiansheng)
刘江 (Liu Jiang)
Current Assignee
Southwest University of Science and Technology
Southern University of Science and Technology
Original Assignee
Southwest University of Science and Technology
Priority date
Filing date
Publication date
Application filed by Southwest University of Science and Technology
Priority to CN202010242450.7A
Publication of CN111428072A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50: Information retrieval of still image data
    • G06F16/58: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583: Retrieval using metadata automatically derived from the content
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50: Information retrieval of still image data
    • G06F16/55: Clustering; Classification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/22: Matching criteria, e.g. proximity measures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/25: Fusion techniques
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18: Eye characteristics, e.g. of the iris
    • G06V40/193: Preprocessing; Feature extraction
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18: Eye characteristics, e.g. of the iris
    • G06V40/197: Matching; Classification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00: Indexing scheme relating to image or video recognition or understanding
    • G06V2201/03: Recognition of patterns in medical or anatomical images

Abstract

The embodiment of the invention discloses a method, an apparatus, a server and a storage medium for retrieving ophthalmic multi-modal images. The method comprises the following steps: acquiring a current eye image of a user, wherein the eye image is an ophthalmic digital image and comprises eye images of multiple different modalities; correspondingly inputting the eye images of the multiple different modalities into a plurality of pre-trained deep learning models, fusing and training the similarity and dissimilarity of the eye images across modalities, outputting a plurality of feature vectors, and generating a comparative analysis result from the plurality of feature vectors. By acquiring a single-modality eye image of the user, the system performs recognition in the deep learning model and outputs a multi-modality recognition result. This solves the prior-art problem that other potential eye diseases of the user cannot be discovered from a single-modality eye image, achieves the retrieval of other possible eye diseases from a single-modality eye image, and improves the user experience.

Description

Ophthalmologic multimodal image retrieval method, apparatus, server and storage medium
Technical Field
The present invention relates to retrieval technologies, and in particular, to a method, an apparatus, a server, and a storage medium for retrieving an ophthalmic multimodal image.
Background
With the development of imaging technology, digital ophthalmic images have become the primary data of ophthalmology, a trend that drives the construction of ophthalmic image retrieval functions to assist doctors' clinical decisions. Traditionally, the ophthalmic image retrieval function has used a text-based method: an image is first described in text (a correspondence between the text and the image is established), a keyword query is entered at retrieval time, and a ranked result is returned. This method of finding images by words suffers from the semantic gap between text descriptions and image content, which degrades retrieval quality. With the development of computer vision, Content-Based Image Retrieval (CBIR) methods have begun to be applied in ophthalmology. CBIR, which combines information retrieval, computer vision and related fields, retrieves the most similar images from the content of the image itself, searching by features such as color, shape and texture, and thus avoids the semantic gap between text description and image content. In recent years, in the field of medical imaging, deep learning algorithms represented by deep convolutional neural networks (CNNs) have achieved excellent performance in disease classification and lesion segmentation of ophthalmic images, and outperform traditional classifiers (such as the Support Vector Machine (SVM) and Random Forest (RF)) at extracting features such as texture, color and morphology, providing a technical basis for the construction of an image retrieval function.
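As a toy illustration of the content-based retrieval idea described above, the following sketch ranks images by a simple per-channel color histogram, a stand-in for the color, shape and texture features the text mentions; the feature choice, function names and data are illustrative assumptions, not from the patent:

```python
import numpy as np

def color_histogram(image, bins=8):
    """Flattened per-channel intensity histogram, L1-normalised; a very
    simple CBIR feature vector of length 3 * bins for an H x W x 3 image."""
    feats = [np.histogram(image[..., c], bins=bins, range=(0, 256))[0]
             for c in range(3)]
    v = np.concatenate(feats).astype(np.float64)
    return v / v.sum()

def retrieve(query, database, top_n=3):
    """Rank database images by L1 histogram distance (smallest first)."""
    q = color_histogram(query)
    dists = [np.abs(q - color_histogram(img)).sum() for img in database]
    return [int(i) for i in np.argsort(dists)[:top_n]]

# Toy demo: a dark query should rank the dark database image first.
rng = np.random.default_rng(0)
bright = rng.integers(192, 256, size=(32, 32, 3), dtype=np.uint8)
dark = rng.integers(0, 64, size=(32, 32, 3), dtype=np.uint8)
query = rng.integers(0, 64, size=(32, 32, 3), dtype=np.uint8)
ranking = retrieve(query, [bright, dark], top_n=2)
print(ranking)  # prints [1, 0]
```

Real CBIR systems replace the hand-crafted histogram with learned features, which is exactly the direction the rest of this document takes.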
Diseases of many parts of the human body manifest themselves through pathological changes of the eye, so academia and industry have widely devoted themselves to screening diseases automatically by analysing digital ophthalmic images with artificial-intelligence algorithms, and related results have been published. Examples include the White Eye Detector, free software developed at Baylor University (Texas, USA) that screens for eye cancer from photographs; BiliScreen, software developed at the University of Washington that screens for liver cancer from eye color; and the intelligent screening fundus cameras introduced by domestic health companies. However, automatic disease screening based on ophthalmic medical imaging and image-processing technology still has problems: the algorithms face challenges of interpretability and accuracy, and the samples used to train the models are hard to acquire and suffer from subjective, ambiguous labelling, so clinical application still has a long way to go. More importantly, although a computer screening result serves only as an auxiliary reference for the doctor, it more or less influences the doctor's judgment and may thus affect the final diagnosis.
Disclosure of Invention
The invention provides a method, an apparatus, a server and a storage medium for retrieving ophthalmic multi-modal images, so as to achieve the effect of retrieving other potential ophthalmic problems through a single-modality eye image.
In a first aspect, an embodiment of the present invention provides a method for retrieving ophthalmic multi-modal images, including: acquiring a current eye image of a user, wherein the eye image is an ophthalmic digital image and comprises eye images of multiple different modalities;
correspondingly inputting the eye images of the multiple different modalities into a plurality of pre-trained deep learning models, fusing and training the similarity and dissimilarity of the eye images across modalities, outputting a plurality of feature vectors, and generating a comparative analysis result from the plurality of feature vectors.
Optionally, the ophthalmic digital image includes: fundus, corneal nerve and OCT images.
Optionally, the deep learning model is a multi-modal convolutional neural network model.
Optionally, before the single-modality eye image is input into a pre-trained deep learning model and a multi-modality recognition result is output, the method further includes:
marking the sample image by using various labels and establishing a database;
and establishing a multi-modal convolutional neural network and training the multi-modal convolutional neural network model by using the sample image to obtain a trained multi-modal convolutional neural network model.
Optionally, the marking the sample image with a plurality of labels and establishing a database includes:
if the sample images belong to the same patient, marking them with a first label;
if the sample images belong to the same case, marking them with a second label;
if the sample images are unrelated, marking them with a third label;
and establishing a database of sample images marked with the first label, the second label and the third label.
Optionally, the multi-modal convolutional neural network model includes: a fundus image convolutional neural network model, a corneal nerve image convolutional neural network model and an OCT image convolutional neural network model.
Optionally, the comparative analysis result includes: the most similar sample images and their detailed case records.
In a second aspect, an embodiment of the present invention further provides an apparatus for retrieving ophthalmic multi-modal images, the apparatus including:
a data acquisition module, configured to acquire a current eye image of a user, wherein the eye image is an ophthalmic digital image and comprises eye images of multiple different modalities;
and a data recognition module, configured to correspondingly input the eye images of the multiple different modalities into a plurality of pre-trained deep learning models, fuse and train the similarity and dissimilarity of the eye images across modalities, output a plurality of feature vectors, and generate a comparative analysis result from the plurality of feature vectors.
In a third aspect, an embodiment of the present invention further provides a server, where the server includes:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors implement the method for retrieving an ophthalmic multimodal image as described in any of the above.
In a fourth aspect, an embodiment of the present invention further provides a storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method for retrieving an ophthalmic multimodal image as described in any one of the above.
The embodiment of the invention discloses a method, an apparatus, a server and a storage medium for retrieving ophthalmic multi-modal images, the method comprising the following steps:
acquiring a current eye image of a user, wherein the eye image is an ophthalmic digital image and comprises eye images of multiple different modalities; correspondingly inputting the eye images of the multiple different modalities into a plurality of pre-trained deep learning models, fusing and training the similarity and dissimilarity of the eye images across modalities, outputting a plurality of feature vectors, and generating a comparative analysis result from the plurality of feature vectors. With the method for retrieving multi-modal ophthalmic images provided by the embodiment of the invention, the system acquires a single-modality eye image of the user, performs recognition in the deep learning model and outputs a multi-modality recognition result. This solves the prior-art problem that other potential eye diseases of the user cannot be discovered from a single-modality eye image, achieves the retrieval of other possible eye diseases from a single-modality eye image, and improves the user experience.
Drawings
Fig. 1 is a flowchart of a method for retrieving multi-modal ophthalmic images according to an embodiment of the present invention;
fig. 2 is a flowchart of a method for retrieving multi-modal ophthalmic images according to a second embodiment of the present invention;
fig. 3 is a schematic structural diagram of a retrieval apparatus for multi-modal ophthalmic images according to a third embodiment of the present invention;
fig. 4 is a schematic structural diagram of a server according to a fourth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the steps as a sequential process, many of the steps can be performed in parallel, concurrently or simultaneously. In addition, the order of the steps may be rearranged. A process may be terminated when its operations are completed, but may have additional steps not included in the figure. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc.
Furthermore, the terms "first," "second," and the like may be used herein to describe various orientations, actions, steps, elements, or the like, but the orientations, actions, steps, or elements are not limited by these terms. These terms are only used to distinguish one direction, action, step or element from another direction, action, step or element. For example, a first label may be referred to as a second label, and similarly, a second label may be referred to as a first label, without departing from the scope of the present application. The first label and the second label are both labels, but they are not the same label. The terms "first", "second", etc. are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
Example one
Fig. 1 is a flowchart of a method for retrieving ophthalmic multi-modal images according to the first embodiment of the present invention. The method is applicable where a user performs ophthalmic disease retrieval online, and specifically includes the following steps:
step 100, obtaining a current eye image of a user, wherein the eye image is an ophthalmic digital image and comprises eye images of multiple different modes.
In this embodiment, an eye image uploaded or collected by the user is acquired. The eye image is an ophthalmic digital image; ophthalmic digital imaging methods include ocular surface color photography, fundus color photography, Optical Coherence Tomography (OCT), anterior segment OCT, In Vivo Confocal Microscopy (IVCM), Fundus Fluorescein Angiography (FFA) and Indocyanine Green Angiography (ICGA). In this embodiment, the ophthalmic digital image includes: fundus, corneal nerve and OCT images. Imaging devices, among them the ophthalmoscope, the slit lamp and optical coherence tomography, output digital images by observing the morphology of tissue structures of the human eye such as blood vessels, nerves, the cornea, the crystalline lens and the iris. The ophthalmic digital images produced by different imaging devices differ in resolution and in the region imaged, and serve the diagnosis of different disease types; posterior segment optical coherence tomography, for example, is of great value in the clinical examination and diagnosis of retinal diseases, macular diseases, optic nerve diseases, glaucoma and the like. In terms of data form, each imaging method produces one modality, so multiple imaging methods produce multiple digital images, i.e. multiple modalities.
Step 110, correspondingly inputting the eye images of the multiple different modalities into a plurality of pre-trained deep learning models, fusing and training the similarity and dissimilarity of the eye images across modalities, outputting a plurality of feature vectors, and generating a comparative analysis result from the plurality of feature vectors.
In this embodiment, the deep learning model is a multi-modal convolutional neural network model. The eye images of the different modalities are input into the plurality of pre-trained deep learning models, in which the similarity and dissimilarity of images across modalities have been fused and trained; a plurality of feature vectors are output, and a comparative analysis result is generated from them. The comparative analysis result includes: the most similar sample images and their detailed case records. Returning multi-modal sample images together with their case descriptions provides the user or the doctor with a reference, making it easier to judge whether the user may have other eye diseases that the single-modality eye image cannot reveal.
Illustratively, certain ocular diseases do not always occur alone but appear together, and an elderly patient often has several age-related eye diseases at once, for example a fundus disease together with a cataract; the fundus is generally examined with an ophthalmoscope, while the cataract is examined with OCT, i.e. imaging methods of different modalities suit different diseases. When the patient in a case has multiple eye lesions, digital images of multiple modalities may be needed for diagnosis, so constructing similarity relations between digital images of different modalities belonging to the same case has application value. Through case-level retrieval, the likelihood of multiple lesions across multiple modalities can be retrieved; in such scenarios, multi-modal retrieval genuinely helps discover multiple lesions. If a patient is known to have disease A, the multi-modal retrieval function takes a digital image of one of the patient's modalities as the query and may return a digital image of another modality; if the returned image is related to disease B, the patient may have both diseases A and B. Such retrieval is valuable when the need is partly definite and partly vague, and the vague part becomes definite through the retrieval results. In text search, for example, a user who wants a book and knows only part of its title can query with the known part; the search engine returns many similar results, which may include the wanted book, so the user learns its full title. Retrieval helps users clarify their needs amid vast amounts of information and often yields unexpected findings.
For scenarios with multiple lesions, multi-modal image retrieval builds on such unexpected results and assists doctors in mining further information to judge the lesions.
For example, when making a diagnosis, a doctor pulls up an ophthalmic image of a patient; when the image makes a decision difficult, the doctor deliberately searches for similar cases to consult. If the search is restricted to the modality of the query image alone, it may face a shortage of cases in that modality, and the similar cases found may still not support a diagnostic conclusion; the search can then be extended to other modalities, whose results assist the diagnosis. If the patient in a case has taken a digital eye image in only one modality, multi-modal retrieval can assist the doctor. If the patient has taken digital images in multiple modalities, the images could be retrieved modality by modality, but that makes the diagnosis time-consuming for the doctor. When facing a difficult and complicated condition with only one modality of ophthalmic digital image available, multi-modal retrieval shows its functional value; even when the patient has taken several ophthalmic digital images, multi-modal retrieval spares the doctor multiple per-modality searches and improves diagnostic efficiency.
This embodiment discloses a method for retrieving multi-modal ophthalmic images, comprising: acquiring a current eye image of a user, wherein the eye image is an ophthalmic digital image and comprises eye images of multiple different modalities; correspondingly inputting the eye images of the multiple different modalities into a plurality of pre-trained deep learning models, fusing and training their similarity and dissimilarity, outputting a plurality of feature vectors, and generating a comparative analysis result from the feature vectors. With this method, the system recognizes a single-modality eye image of the user in the deep learning model and outputs a multi-modality recognition result, solving the prior-art problem that other potential eye diseases of the user cannot be discovered from a single-modality eye image, achieving the retrieval of other possible eye diseases from a single-modality eye image, and improving the user experience.
Example two
Fig. 2 is a flowchart of a method for retrieving ophthalmic multi-modal images according to the second embodiment of the present invention. The method is applicable where a user performs ophthalmic disease retrieval online, and specifically includes the following steps:
step 200, acquiring a current eye image of a user, wherein the eye image is an ophthalmic digital image and comprises eye images of multiple different modes.
Step 210, labeling the sample image with a plurality of labels and building a database.
Specifically, step 210 includes: if the sample images belong to the same patient, marking the sample images as a first label;
if the sample image belongs to the same case, marking the sample image as a second label;
if the sample image is not relevant, marking as a third label;
and establishing a database of sample images marked with the first label, the second label and the third label.
In this embodiment, the multi-modal ophthalmic digital images are labelled at three levels: the third label, 0, means unrelated; the second label, 1, means related at the case level; and the first label, 2, means related at the disease level, so relevance increases with the label. Assume three modalities A, B and C, namely a corneal nerve map, a fundus map and an anterior segment OCT; labelling is applied mainly to the training samples. Images of the three modalities are labelled 1 if they belong to the same case (the same patient), 2 if they belong to the same disease, and 0 if they are unrelated; if only two modalities exist, the digital images of the two modalities are labelled. Image-pair samples are generated mainly according to the correlation of images across modalities; in theory, the higher the correlation, the more similar the learned hash codes.
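A minimal sketch of this three-level pair labelling, assuming hypothetical `case_id` and `disease` fields on each sample record (the field names and data layout are assumptions, not specified in the patent):

```python
def pair_label(a, b):
    """Three-level relevance label for a cross-modal image pair, following
    the scheme above: 0 = unrelated, 1 = same case (same patient),
    2 = same disease (the highest relevance)."""
    if a["disease"] == b["disease"]:
        return 2
    if a["case_id"] == b["case_id"]:
        return 1
    return 0

# Toy records spanning three modalities (fundus, OCT, corneal nerve map).
samples = [
    {"id": "fundus_01", "case_id": "c1", "disease": "glaucoma"},
    {"id": "oct_01",    "case_id": "c1", "disease": "cataract"},
    {"id": "oct_02",    "case_id": "c2", "disease": "glaucoma"},
    {"id": "cornea_01", "case_id": "c3", "disease": "keratitis"},
]
pairs = [(a["id"], b["id"], pair_label(a, b))
         for i, a in enumerate(samples) for b in samples[i + 1:]]
print(pairs)
```

Here the disease-level check is applied before the case-level check, so the higher label takes precedence when both conditions hold.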
And step 220, establishing a multi-modal convolutional neural network and training the multi-modal convolutional neural network model by using the sample image to obtain a trained multi-modal convolutional neural network model.
In the present embodiment, the multi-modal convolutional neural network model includes: a fundus image convolutional neural network model, a corneal nerve image convolutional neural network model and an OCT image convolutional neural network model. The plurality of neural network models are trained by computing a loss function over the generated hash codes.
Illustratively, taking the two modalities of the fundus map and the OCT map as an example, the image pair of the two modalities (a fundus map and an OCT map) is input at the leftmost side; the intermediate network structure layers are identical, and the last layer is a hash layer that generates a hash code of length K. During training, a multi-modal loss is trained on the labels of image pairs across modalities, and the loss function is designed in a discriminative manner: the distance between the hash codes of images of different modalities sharing the same label should be smaller. This yields a model trained with a multi-modal discriminative loss function, i.e. a model-fusion method. The network structures of the different modalities are the same, but the weights are not shared (avoiding interference between modality-specific features); the loss function nevertheless fuses and trains the correlation of the two modalities, so that images of different modalities acquire a certain similarity. After training, each modality generates a discrete hash representation for its images with its own model, i.e. binarization.
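The arrangement described above (identical tower architectures per modality, independently initialised unshared weights, and one pairwise loss that ties the modalities together) can be sketched in plain NumPy; the layer sizes, the margin-based loss and all names are illustrative assumptions, not the patent's actual network:

```python
import numpy as np

rng = np.random.default_rng(42)
K = 16    # hash code length (the patent's K)
D = 128   # flattened input feature dimension (illustrative)

def make_tower():
    """One modality tower: the same two-layer architecture for every
    modality, but independently initialised (weights are NOT shared)."""
    return {"W1": rng.normal(0.0, 0.1, (D, 64)),
            "W2": rng.normal(0.0, 0.1, (64, K))}

def forward(tower, x):
    h = np.tanh(x @ tower["W1"])       # shared architecture, private weights
    return np.tanh(h @ tower["W2"])    # hash layer: values in (-1, 1)

def pairwise_loss(code_a, code_b, label, margin=float(K)):
    """Discriminative cross-modal loss: pull related pairs (label > 0)
    together, push unrelated pairs (label == 0) at least `margin` apart."""
    d = float(np.sum((code_a - code_b) ** 2))
    return d if label > 0 else max(0.0, margin - d)

fundus_tower, oct_tower = make_tower(), make_tower()
x_fundus, x_oct = rng.normal(size=D), rng.normal(size=D)
loss_related = pairwise_loss(forward(fundus_tower, x_fundus),
                             forward(oct_tower, x_oct), label=1)
print(loss_related >= 0.0)
```

Minimising this loss over many labelled pairs is what makes codes from different modalities comparable in one space; a real implementation would use an autograd framework rather than hand-rolled NumPy.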
During training, the output of the hash layer lies in [-1, 1], using a tanh activation function. When the discrete hash code representation is generated after training, the hash layer is symbolized: a sign function is applied on top of the tanh function, so that each of the K hash bits is 1 or -1, and after binarization distances can be computed in Hamming space. In brief, during training the K hash values are numbers in [-1, 1] used to train the multi-modal loss function; when hash codes are generated for images with the trained model, the K hash bits take values in {-1, 1}. Retrieval of the lesion position is crucial for case retrieval. In general, the discrimination of an image depends mainly on identifying and comparing key regions (i.e. lesion regions). If the lesion region occupies only a small fraction of two images, comparing the whole images would overlook the feature representation of the lesion region and introduce similarity errors; in short, the information of the lesion region should carry more weight than that of other regions of the image. In feature extraction, the model design therefore focuses on the feature representation of the lesion region, and the scheme introduces a spatial attention mechanism into the model to capture the features of the lesion region. Based on this CNN model with spatial attention, high-dimensional images are mapped to low-dimensional hash codes. The model-fusion method based on the multi-modal discriminative loss constructs similarity between images across modalities, and the distance between hash codes generated from images with the same label is smaller.
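A minimal sketch of the tanh-then-sign behaviour of the hash layer (the function name and inputs are illustrative):

```python
import numpy as np

def hash_layer(logits, training):
    """Hash layer output: tanh values in (-1, 1) during training, and
    sign(tanh(.)) at inference so that each of the K bits is -1 or +1."""
    relaxed = np.tanh(logits)
    if training:
        return relaxed                 # continuous and differentiable
    code = np.sign(relaxed)
    code[code == 0] = 1                # map the rare exact zero to +1
    return code.astype(int)

logits = np.array([2.3, -0.7, 0.05, -3.1])
print(hash_layer(logits, training=True))   # continuous values in (-1, 1)
print(hash_layer(logits, training=False))  # prints [ 1 -1  1 -1]
```

The continuous tanh output keeps the loss differentiable during training, while the sign step yields the discrete ±1 code that enables Hamming-distance comparison at retrieval time.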
The method based on multi-modal model fusion generates a hash code for each digital image and thus supports Hamming distance computation. The concrete scenario is as follows: a new digital image is input, a string of K hash bits is generated for it through the same network, and Hamming distances are then computed against the database samples whose hash codes have already been generated; the smaller the distance, the higher the similarity. Suppose models for the OCT and fundus modalities have been trained and the database samples have generated hash codes through these models. A doctor now inputs a fundus image for retrieval: the image generates K hash bits through the fundus-modality model, Hamming distances against the hash codes of the database samples are computed, and the n most similar results are returned. The n results may include images of both modalities, because after model fusion the digital images across modalities establish a certain similarity, which is reflected in the K hash bits.
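The Hamming-space search step described above can be sketched as follows, with toy ±1 codes standing in for model outputs:

```python
import numpy as np

def hamming(a, b):
    """Hamming distance between two ±1 hash codes of equal length."""
    return int(np.sum(a != b))

def search(query_code, db_codes, n=3):
    """Indices of the n database codes closest to the query in Hamming
    space; a smaller distance means higher similarity. The database may
    mix modalities, since fused training puts all codes in one space."""
    dists = [hamming(query_code, c) for c in db_codes]
    return sorted(range(len(db_codes)), key=lambda i: dists[i])[:n]

# Toy database of K = 8 codes; entry 2 matches the query exactly and
# entry 1 is the query's exact opposite.
db = [np.array([ 1,  1, -1, -1,  1, -1,  1, -1]),
      np.array([-1,  1,  1, -1, -1,  1,  1, -1]),
      np.array([ 1, -1, -1,  1,  1, -1, -1,  1])]
query = np.array([1, -1, -1, 1, 1, -1, -1, 1])
print(search(query, db, n=2))  # prints [2, 0]
```

In a production system the linear scan over `db_codes` would typically be replaced by bitwise XOR and popcount over packed codes, which is what makes hash-based retrieval fast at scale.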
Step 230, correspondingly inputting the eye images of the multiple different modalities into a plurality of pre-trained deep learning models, fusing and training the similarity and dissimilarity of the eye images across modalities, outputting a plurality of feature vectors, and generating a comparative analysis result from the plurality of feature vectors.
This embodiment discloses a method for retrieving multi-modal ophthalmic images, comprising: acquiring a current eye image of a user, wherein the eye image is an ophthalmic digital image and comprises eye images of multiple different modalities; marking sample images with a plurality of labels and establishing a database; establishing a multi-modal convolutional neural network and training it with the sample images to obtain a trained multi-modal convolutional neural network model; correspondingly inputting the eye images of the multiple different modalities into a plurality of pre-trained deep learning models, fusing and training their similarity and dissimilarity, outputting a plurality of feature vectors, and generating a comparative analysis result from the feature vectors. With this method, the system recognizes a single-modality eye image of the user in the deep learning model and outputs a multi-modality recognition result, solving the prior-art problem that other potential eye diseases of the user cannot be discovered from a single-modality eye image, achieving the retrieval of other possible eye diseases from a single-modality eye image, and improving the user experience.
Example Three
The apparatus for retrieving ophthalmic multi-modal images provided by this embodiment of the invention can execute the method for retrieving ophthalmic multi-modal images provided by any embodiment of the invention, and has the functional modules and beneficial effects corresponding to the executed method. Fig. 3 is a schematic structural diagram of an apparatus 300 for retrieving ophthalmic multi-modal images according to an embodiment of the present invention. Referring to fig. 3, the apparatus 300 may include:
a data acquisition module 310, configured to acquire a current eye image of a user, wherein the eye image is an ophthalmic digital image and comprises eye images of a plurality of different modalities;
and a data identification module 320, configured to input the eye images of the plurality of different modalities into the corresponding pre-trained deep learning models, fuse and train on the similarity and dissimilarity of the eye images of the different modalities, output a plurality of feature vectors, and generate a comparative analysis result from the plurality of feature vectors.
Further, the ophthalmic digital image comprises: fundus images, corneal nerve images, and OCT images.
Further, the deep learning model is a multi-modal convolutional neural network model.
Further, before the acquiring of the current eye image of the user, wherein the eye image is an ophthalmic digital image and comprises eye images of a plurality of different modalities, the method further comprises:
marking sample images with a plurality of labels and establishing a database;
and establishing a multi-modal convolutional neural network and training the multi-modal convolutional neural network model with the sample images to obtain a trained multi-modal convolutional neural network model.
Further, the marking sample images with a plurality of labels and establishing a database comprises:
if sample images belong to the same patient, marking them with a first label;
if sample images belong to the same case, marking them with a second label;
if sample images are unrelated, marking them with a third label;
and establishing a database of the sample images marked with the first, second and third labels.
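The three-label scheme above (same patient / same case / unrelated) can be sketched with a small labeling function. The `patient_id` and `case_id` fields and the sample dictionaries are hypothetical, introduced only for illustration:

```python
def assign_label(img_a: dict, img_b: dict) -> int:
    """Label a pair of sample images for database construction:
    1 = same patient (first label), 2 = same case (second label),
    3 = unrelated (third label)."""
    if img_a["patient_id"] == img_b["patient_id"]:
        return 1  # first label: images of the same patient
    if img_a["case_id"] == img_b["case_id"]:
        return 2  # second label: images of the same case
    return 3      # third label: unrelated images

# Hypothetical sample records.
a = {"patient_id": "P001", "case_id": "glaucoma"}
b = {"patient_id": "P002", "case_id": "glaucoma"}
c = {"patient_id": "P003", "case_id": "cataract"}
print(assign_label(a, b), assign_label(a, c), assign_label(a, a))  # → 2 3 1
```

Pairs labeled 1 or 2 would serve as "similar" examples and pairs labeled 3 as "dissimilar" examples when training on similarity and dissimilarity.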
Further, the multi-modal convolutional neural network model comprises: a fundus image convolutional neural network model, a corneal nerve image convolutional neural network model, and an OCT image convolutional neural network model.
Further, the comparative analysis result comprises: the most similar sample images and their detailed case records.
This embodiment discloses an apparatus for retrieving ophthalmic multi-modal images, the apparatus comprising: a data acquisition module, configured to acquire a current eye image of a user, wherein the eye image is an ophthalmic digital image and comprises eye images of a plurality of different modalities; and a data identification module, configured to input the eye images of the plurality of different modalities into the corresponding pre-trained deep learning models, fuse and train on the similarity and dissimilarity of the eye images of the different modalities, output a plurality of feature vectors, and generate a comparative analysis result from the plurality of feature vectors. With the apparatus for retrieving ophthalmic multi-modal images provided by this embodiment of the invention, the system acquires a single-modality eye image of the user, performs recognition with the deep learning models, and outputs a multi-modality recognition result. This solves the problem in the prior art that other potential eye diseases of the user cannot be discovered from a single-modality eye image, achieves the effect of retrieving other possible eye diseases from a single-modality eye image, and improves the user experience.
Example Four
Fig. 4 is a schematic structural diagram of a computer server according to an embodiment of the present invention. As shown in fig. 4, the computer server includes a memory 410 and a processor 420; the number of processors 420 in the computer server may be one or more, and one processor 420 is taken as an example in fig. 4. The memory 410 and the processor 420 in the device may be connected by a bus or other means; fig. 4 takes connection by a bus as an example.
The memory 410, as a computer-readable storage medium, can be used to store software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the method for retrieving ophthalmic multi-modal images in the embodiments of the present invention (e.g., the data acquisition module 310 and the data identification module 320 in the retrieval apparatus 300). The processor 420 executes the various functional applications and data processing of the device/terminal/apparatus by running the software programs, instructions, and modules stored in the memory 410, thereby implementing the method for retrieving ophthalmic multi-modal images described above.
Wherein the processor 420 is configured to run the computer program stored in the memory 410 to implement the following steps:
acquiring a current eye image of a user, wherein the eye image is an ophthalmic digital image and comprises eye images of a plurality of different modalities;
inputting the eye images of the plurality of different modalities into the corresponding pre-trained deep learning models, fusing and training on the similarity and dissimilarity of the eye images of the different modalities, outputting a plurality of feature vectors, and generating a comparative analysis result from the plurality of feature vectors.
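The step of generating a comparative analysis result from the feature vectors could, for example, rank database samples by cosine similarity and return each match with its case record. This is a sketch of one plausible realization; the feature vectors and case records below are illustrative assumptions:

```python
import numpy as np

def comparison_result(query_vec: np.ndarray, db_vecs: np.ndarray,
                      case_records: list, top_n: int = 2):
    """Rank database feature vectors by cosine similarity to the query
    and return the most similar samples with their case records."""
    q = query_vec / np.linalg.norm(query_vec)
    m = db_vecs / np.linalg.norm(db_vecs, axis=1, keepdims=True)
    sims = m @ q                       # cosine similarity to each sample
    order = np.argsort(-sims)[:top_n]  # indices of the top-n matches
    return [(case_records[int(i)], float(sims[i])) for i in order]

# Hypothetical case records and 2-D feature vectors.
records = ["case A: diabetic retinopathy", "case B: glaucoma", "case C: healthy"]
db = np.array([[1.0, 0.1], [0.2, 1.0], [0.9, 0.2]])
query = np.array([1.0, 0.15])

result = comparison_result(query, db, records)
for rec, s in result:
    print(rec, round(s, 3))
```

Returning the case record alongside each similarity score matches the comparative analysis result described in the description, i.e. the most similar sample images together with their detailed case records.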
In one embodiment, the computer program of the computer device provided by the embodiments of the present invention is not limited to the method operations described above, and may also perform related operations in the method for retrieving ophthalmic multi-modal images provided by any embodiment of the present invention.
The memory 410 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created according to the use of the terminal, and the like. Further, the memory 410 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the memory 410 may further include memory located remotely from the processor 420, which may be connected to the device/terminal/apparatus through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
This embodiment discloses a server for retrieving ophthalmic multi-modal images, configured to execute the following method: acquiring a current eye image of a user, wherein the eye image is an ophthalmic digital image and comprises eye images of a plurality of different modalities; inputting the eye images of the plurality of different modalities into the corresponding pre-trained deep learning models, fusing and training on the similarity and dissimilarity of the eye images of the different modalities, outputting a plurality of feature vectors, and generating a comparative analysis result from the plurality of feature vectors. In the method for retrieving ophthalmic multi-modal images provided by this embodiment of the invention, the system acquires a single-modality eye image of the user, performs recognition with the deep learning models, and outputs a multi-modality recognition result. This solves the problem in the prior art that other potential eye diseases of the user cannot be discovered from a single-modality eye image, achieves the effect of retrieving other possible eye diseases from a single-modality eye image, and improves the user experience.
Example Five
An embodiment of the present invention further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, perform a method for retrieving ophthalmic multi-modal images, the method comprising:
acquiring a current eye image of a user, wherein the eye image is an ophthalmic digital image and comprises eye images of a plurality of different modalities;
inputting the eye images of the plurality of different modalities into the corresponding pre-trained deep learning models, fusing and training on the similarity and dissimilarity of the eye images of the different modalities, outputting a plurality of feature vectors, and generating a comparative analysis result from the plurality of feature vectors.
Of course, the storage medium containing computer-executable instructions provided by the embodiments of the present invention is not limited to the method operations described above, and may also perform related operations in the method for retrieving ophthalmic multi-modal images provided by any embodiment of the present invention.
The computer-readable storage media of embodiments of the invention may take any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a storage medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations of aspects of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, or C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages.
This embodiment discloses a storage medium for retrieving ophthalmic multi-modal images, configured to execute the following method: acquiring a current eye image of a user, wherein the eye image is an ophthalmic digital image and comprises eye images of a plurality of different modalities; inputting the eye images of the plurality of different modalities into the corresponding pre-trained deep learning models, fusing and training on the similarity and dissimilarity of the eye images of the different modalities, outputting a plurality of feature vectors, and generating a comparative analysis result from the plurality of feature vectors. In the method for retrieving ophthalmic multi-modal images provided by this embodiment of the invention, the system acquires a single-modality eye image of the user, performs recognition with the deep learning models, and outputs a multi-modality recognition result. This solves the problem in the prior art that other potential eye diseases of the user cannot be discovered from a single-modality eye image, achieves the effect of retrieving other possible eye diseases from a single-modality eye image, and improves the user experience.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A method for retrieving ophthalmic multi-modal images, characterized by comprising:
acquiring a current eye image of a user, wherein the eye image is an ophthalmic digital image and comprises eye images of a plurality of different modalities;
inputting the eye images of the plurality of different modalities into the corresponding pre-trained deep learning models, fusing and training on the similarity and dissimilarity of the eye images of the different modalities, outputting a plurality of feature vectors, and generating a comparative analysis result from the plurality of feature vectors.
2. The method for retrieving ophthalmic multi-modal images according to claim 1, wherein the ophthalmic digital image comprises: fundus images, corneal nerve images, and OCT images.
3. The method for retrieving ophthalmic multi-modal images according to claim 1, wherein the deep learning model is a multi-modal convolutional neural network model.
4. The method for retrieving ophthalmic multi-modal images according to claim 3, wherein before the acquiring of the current eye image of the user, wherein the eye image is an ophthalmic digital image and comprises eye images of a plurality of different modalities, the method further comprises:
marking sample images with a plurality of labels and establishing a database;
and establishing a multi-modal convolutional neural network and training the multi-modal convolutional neural network model with the sample images to obtain a trained multi-modal convolutional neural network model.
5. The method for retrieving ophthalmic multi-modal images according to claim 4, wherein the marking sample images with a plurality of labels and establishing a database comprises:
if sample images belong to the same patient, marking them with a first label;
if sample images belong to the same case, marking them with a second label;
if sample images are unrelated, marking them with a third label;
and establishing a database of the sample images marked with the first, second and third labels.
6. The method for retrieving ophthalmic multi-modal images according to claim 4, wherein the multi-modal convolutional neural network model comprises: a fundus image convolutional neural network model, a corneal nerve image convolutional neural network model, and an OCT image convolutional neural network model.
7. The method for retrieving ophthalmic multi-modal images according to claim 1, wherein the comparative analysis result comprises: the most similar sample images and their detailed case records.
8. An apparatus for retrieving ophthalmic multi-modal images, characterized by comprising:
a data acquisition module, configured to acquire a current eye image of a user, wherein the eye image is an ophthalmic digital image and comprises eye images of a plurality of different modalities;
and a data identification module, configured to input the eye images of the plurality of different modalities into the corresponding pre-trained deep learning models, fuse and train on the similarity and dissimilarity of the eye images of the different modalities, output a plurality of feature vectors, and generate a comparative analysis result from the plurality of feature vectors.
9. A server, characterized in that the server comprises:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method for retrieving ophthalmic multi-modal images according to any one of claims 1 to 7.
10. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method for retrieving ophthalmic multi-modal images according to any one of claims 1 to 7.
CN202010242450.7A 2020-03-31 2020-03-31 Ophthalmologic multimodal image retrieval method, apparatus, server and storage medium Pending CN111428072A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010242450.7A CN111428072A (en) 2020-03-31 2020-03-31 Ophthalmologic multimodal image retrieval method, apparatus, server and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010242450.7A CN111428072A (en) 2020-03-31 2020-03-31 Ophthalmologic multimodal image retrieval method, apparatus, server and storage medium

Publications (1)

Publication Number Publication Date
CN111428072A true CN111428072A (en) 2020-07-17

Family

ID=71549253

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010242450.7A Pending CN111428072A (en) 2020-03-31 2020-03-31 Ophthalmologic multimodal image retrieval method, apparatus, server and storage medium

Country Status (1)

Country Link
CN (1) CN111428072A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112102940A (en) * 2020-09-08 2020-12-18 南方科技大学 Refractive detection method, device, computer equipment and storage medium
CN112101438A (en) * 2020-09-08 2020-12-18 南方科技大学 Left and right eye classification method, device, server and storage medium
CN112884729A (en) * 2021-02-04 2021-06-01 北京邮电大学 Auxiliary diagnosis method and device for fundus diseases based on bimodal deep learning
CN113011485A (en) * 2021-03-12 2021-06-22 北京邮电大学 Multi-mode multi-disease long-tail distribution ophthalmic disease classification model training method and device
CN113658683A (en) * 2021-08-05 2021-11-16 重庆金山医疗技术研究院有限公司 Disease diagnosis system and data recommendation method
CN114661936A (en) * 2022-05-19 2022-06-24 中山大学深圳研究院 Image retrieval method applied to industrial vision and electronic equipment
WO2022205779A1 (en) * 2021-03-29 2022-10-06 中国科学院深圳先进技术研究院 Processing method and apparatus based on multi-modal eye detection data, and terminal device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107506797A (en) * 2017-08-25 2017-12-22 电子科技大学 One kind is based on deep neural network and multi-modal image alzheimer disease sorting technique
CN107562812A (en) * 2017-08-11 2018-01-09 北京大学 A kind of cross-module state similarity-based learning method based on the modeling of modality-specific semantic space
CN109902714A (en) * 2019-01-18 2019-06-18 重庆邮电大学 A kind of multi-modality medical image search method based on more figure regularization depth Hash
CN110009623A (en) * 2019-04-10 2019-07-12 腾讯科技(深圳)有限公司 A kind of image recognition model training and image-recognizing method, apparatus and system
WO2019207800A1 (en) * 2018-04-27 2019-10-31 株式会社ニデック Ophthalmic image processing device and ophthalmic image processing program
CN110765281A (en) * 2019-11-04 2020-02-07 山东浪潮人工智能研究院有限公司 Multi-semantic depth supervision cross-modal Hash retrieval method
Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112102940A (en) * 2020-09-08 2020-12-18 南方科技大学 Refractive detection method, device, computer equipment and storage medium
CN112101438A (en) * 2020-09-08 2020-12-18 南方科技大学 Left and right eye classification method, device, server and storage medium
CN112102940B (en) * 2020-09-08 2024-04-16 南方科技大学 Refraction detection method, refraction detection device, computer equipment and storage medium
CN112101438B (en) * 2020-09-08 2024-04-16 南方科技大学 Left-right eye classification method, device, server and storage medium
CN112884729A (en) * 2021-02-04 2021-06-01 北京邮电大学 Auxiliary diagnosis method and device for fundus diseases based on bimodal deep learning
CN113011485A (en) * 2021-03-12 2021-06-22 北京邮电大学 Multi-mode multi-disease long-tail distribution ophthalmic disease classification model training method and device
WO2022188489A1 (en) * 2021-03-12 2022-09-15 北京邮电大学 Training method and apparatus for multi-mode multi-disease long-tail distribution ophthalmic disease classification model
WO2022205779A1 (en) * 2021-03-29 2022-10-06 中国科学院深圳先进技术研究院 Processing method and apparatus based on multi-modal eye detection data, and terminal device
CN113658683A (en) * 2021-08-05 2021-11-16 重庆金山医疗技术研究院有限公司 Disease diagnosis system and data recommendation method
CN114661936A (en) * 2022-05-19 2022-06-24 中山大学深圳研究院 Image retrieval method applied to industrial vision and electronic equipment
CN114661936B (en) * 2022-05-19 2022-10-14 中山大学深圳研究院 Image retrieval method applied to industrial vision and electronic equipment

Similar Documents

Publication Publication Date Title
CN111428072A (en) Ophthalmologic multimodal image retrieval method, apparatus, server and storage medium
Islam et al. Applying supervised contrastive learning for the detection of diabetic retinopathy and its severity levels from fundus images
Ishtiaq et al. Diabetic retinopathy detection through artificial intelligent techniques: a review and open issues
Gayathri et al. Diabetic retinopathy classification based on multipath CNN and machine learning classifiers
Akbar et al. Automated techniques for blood vessels segmentation through fundus retinal images: A review
KR20210158853A (en) Cloud server and diagnostic assistant systems based on cloud server
JP7152513B2 (en) Image recognition method, device, terminal equipment and medical system, and computer program thereof
Zulkifley et al. Pterygium-Net: a deep learning approach to pterygium detection and localization
CN111428737B (en) Instance retrieval method, device, server and storage medium for ophthalmic image
CN111428070A (en) Ophthalmologic case retrieval method, ophthalmologic case retrieval device, ophthalmologic case retrieval server and storage medium
Goel et al. Deep learning approach for stages of severity classification in diabetic retinopathy using color fundus retinal images
KR102596534B1 (en) Diagnosis assistance method and apparatus
Shoukat et al. Artificial intelligence techniques for glaucoma detection through retinal images: State of the art
Alghamdi et al. A comparative study of deep learning models for diagnosing glaucoma from fundus images
WO2022166399A1 (en) Fundus oculi disease auxiliary diagnosis method and apparatus based on bimodal deep learning
CN113962311A (en) Knowledge data and artificial intelligence driven ophthalmic multi-disease identification system
Strzelecki et al. Artificial Intelligence in the detection of skin cancer: state of the art
CN117237711A (en) Bimodal fundus image classification method based on countermeasure learning
Biswas et al. DFU_XAI: a deep learning-based approach to diabetic foot ulcer detection using feature explainability
Adinehvand et al. An efficient multistage segmentation method for accurate hard exudates and lesion detection in digital retinal images
Ashtari-Majlan et al. Deep Learning and Computer Vision for Glaucoma Detection: A Review
Danao et al. Machine learning-based glaucoma detection through frontal eye features analysis
Ali et al. Classifying Three Stages of Cataract Disease using CNN
Soofi Exploring Deep Learning Techniques for Glaucoma Detection: A Comprehensive Review
Dhinakaran et al. Keratoviz-A multistage keratoconus severity analysis and visualization using deep learning and class activated maps

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination