CN116092667A - Disease detection method, system, device and storage medium based on multi-modal images

Disease detection method, system, device and storage medium based on multi-modal images

Info

Publication number
CN116092667A
Authority
CN
China
Prior art keywords
data
characteristic
mode image
inputting
disease detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211727966.6A
Other languages
Chinese (zh)
Inventor
钟培勋
林克
曾金梁
戴少椰
贺飞飞
程绪猛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianyi IoT Technology Co Ltd
Original Assignee
Tianyi IoT Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianyi IoT Technology Co Ltd filed Critical Tianyi IoT Technology Co Ltd
Priority to CN202211727966.6A priority Critical patent/CN116092667A/en
Publication of CN116092667A publication Critical patent/CN116092667A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/40 Image enhancement or restoration using histogram techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/90 Dynamic range modification of images or parts thereof
    • G06T5/94 Dynamic range modification of images or parts thereof based on local image properties, e.g. for local contrast enhancement
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V10/765 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects using rules for classification or partitioning the feature space
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30041 Eye; Retina; Ophthalmic
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Public Health (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Epidemiology (AREA)
  • Quality & Reliability (AREA)
  • Primary Health Care (AREA)
  • Radiology & Medical Imaging (AREA)
  • Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a disease detection method, system, device and storage medium based on multi-modal images. The method comprises the following steps: acquiring multi-modal image data; preprocessing the multi-modal image data to obtain preprocessed data; inputting the preprocessed data into a pre-feature extraction model to obtain pre-feature data, the pre-feature extraction model extracting features from the multi-modal image data through shared convolution weights; and inputting the pre-feature data into a feature fusion model to obtain a disease classification result corresponding to the multi-modal image data. The method enables disease detection on small samples of multi-modal images and improves detection efficiency and accuracy. It can be widely applied in the field of computer technology.

Description

Disease detection method, system, device and storage medium based on multi-modal images
Technical Field
The application relates to the field of computer technology, and in particular to a disease detection method, system, device, and storage medium based on multi-modal images.
Background
Traditional screening for fundus retinal diseases is usually performed by a doctor with rich clinical experience, who manually reads and compares fundus photographs of several modalities (fundus color photographs, fundus optical coherence tomography (OCT) images, and fundus fluorescein angiography (FFA) images) to reach a diagnosis; this is inefficient and prone to misdiagnosis. In the related art, automatic fundus disease detection methods mostly use single-modality images together with image processing techniques or neural network models. Such methods amount to single-modality, large-sample supervised learning: they require a large amount of labeled fundus image data, annotation consumes considerable manpower, and accuracy is low for disease detection on small samples of multi-modal data.
Disclosure of Invention
The object of the present application is to solve, at least to some extent, one of the technical problems in the related art.
To this end, the present application aims to provide an efficient and accurate disease detection method, system, device, and storage medium based on multi-modal images.
To achieve this technical object, the technical solution adopted by the embodiments of the present application comprises the following aspects:
in one aspect, an embodiment of the present application provides a disease detection method based on multi-modal images, comprising the following steps:
acquiring multi-modal image data; preprocessing the multi-modal image data to obtain preprocessed data; inputting the preprocessed data into a pre-feature extraction model to obtain pre-feature data, where the pre-feature extraction model extracts features from the multi-modal image data through shared convolution weights; and inputting the pre-feature data into a feature fusion model to obtain a disease classification result corresponding to the multi-modal image data. In the embodiments of the present application, the pre-feature extraction model shares weights across the multi-modal image data, and the feature fusion model then performs channel-splicing fusion; this improves detection efficiency without reducing detection accuracy in the small-sample case. This disease detection method enables disease detection on small samples of multi-modal images and improves detection efficiency and accuracy.
In addition, the disease detection method based on multi-modal images according to the above embodiment of the present application may further have the following additional technical features:
Further, in the disease detection method based on multi-modal images according to an embodiment of the present application, preprocessing the multi-modal image data to obtain preprocessed data comprises:
scaling the multi-modal image data to obtain first data;
performing locally adaptive contrast stretching on the first data to obtain second data;
performing contrast-limited adaptive histogram equalization on the second data to obtain third data;
and performing histogram-matching fusion on the third data to obtain the preprocessed data.
Further, in an embodiment of the present application, performing locally adaptive contrast stretching on the first data to obtain second data comprises:
obtaining a first gray value from the first data, the first gray value representing a gray-level extremum of the first data;
partitioning the first data into a plurality of sub-blocks;
determining a pixel gray value according to the mean gray value of each sub-block;
if the pixel gray value is smaller than a gray threshold, updating the pixel gray value to the first gray value;
if the pixel gray value is greater than or equal to the gray threshold, performing gamma gray stretching on the pixel gray value and updating it;
and determining the second data from the updated pixel gray values.
Further, in an embodiment of the present application, the pre-feature extraction model comprises a ResNet50 network model, and the method further comprises:
training the ResNet50 network model on the ImageNet dataset to obtain a trained ResNet50 network model.
Further, in an embodiment of the present application, inputting the preprocessed data into the pre-feature extraction model to obtain pre-feature data comprises:
extracting features from the preprocessed data through the trained ResNet50 network model and sharing the training weights of the ResNet50 network model to obtain the pre-feature data.
Further, in an embodiment of the present application, inputting the pre-feature data into the feature fusion model to obtain the disease classification result corresponding to the multi-modal image data comprises:
performing convolution on the pre-feature data to obtain depth feature data;
performing feature fusion on the depth feature data to obtain fourth data;
inputting the fourth data into an attention-mechanism module to obtain fifth data, the attention-mechanism module performing weighted emphasis on the fourth data;
and inputting the fifth data into a classifier module to obtain the disease classification result corresponding to the multi-modal image data.
Further, in an embodiment of the present application, inputting the fourth data into the attention-mechanism module to obtain fifth data comprises:
performing a convolution operation on the fourth data to obtain first feature data and corresponding first weights;
pooling the first feature data to obtain compressed data;
training the first weights through fully connected layers to obtain weighted data;
and obtaining the fifth data from the compressed data and the weighted data.
In another aspect, an embodiment of the present application provides a disease detection system based on multi-modal images, comprising:
a first module for acquiring multi-modal image data;
a second module for preprocessing the multi-modal image data to obtain preprocessed data;
a third module for inputting the preprocessed data into a pre-feature extraction model to obtain pre-feature data, the pre-feature extraction model extracting features from the multi-modal image data through shared convolution weights;
and a fourth module for inputting the pre-feature data into a feature fusion model to obtain a disease classification result corresponding to the multi-modal image data.
In another aspect, an embodiment of the present application provides a disease detection device based on multi-modal images, comprising:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causing the at least one processor to implement any of the multi-modal image-based disease detection methods described above.
In another aspect, an embodiment of the present application provides a storage medium storing a processor-executable program which, when executed by a processor, implements any of the multi-modal image-based disease detection methods described above.
In the embodiments of the present application, the pre-feature extraction model shares weights across the multi-modal image data, and the feature fusion model then performs channel-splicing fusion; this improves detection efficiency without reducing detection accuracy in the small-sample case. This disease detection method enables disease detection on small samples of multi-modal images and improves detection efficiency and accuracy.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the related art more clearly, the following description refers to the accompanying drawings of the embodiments of the present application or of the related technical solutions. It should be understood that the drawings described below illustrate only some embodiments of the technical solutions of the present application for convenience and clarity, and that those skilled in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a schematic diagram of an embodiment of an image-based disease detection method in the related art;
FIG. 2 is a flow chart of an embodiment of the multi-modal image-based disease detection method provided herein;
FIG. 3 is a schematic structural diagram of an embodiment of the multi-modal image-based disease detection method provided herein;
FIG. 4 is a flow diagram of an embodiment of the attention-mechanism module provided herein;
FIG. 5 is a flowchart of another embodiment of the multi-modal image-based disease detection method provided herein;
FIG. 6 is a schematic structural diagram of an embodiment of the multi-modal image-based disease detection system provided herein;
FIG. 7 is a schematic structural diagram of an embodiment of the multi-modal image-based disease detection device provided herein.
Detailed Description
Embodiments of the present application are described in detail below, and examples of the embodiments are illustrated in the accompanying drawings, in which the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the drawings are exemplary, intended only to explain the present application, and are not to be construed as limiting it. The step numbers in the following embodiments are provided for convenience of description only; they impose no restriction on the order of the steps, and the execution order of the steps in each embodiment may be adapted as understood by those skilled in the art.
Traditional screening for fundus retinal diseases is usually performed by an ophthalmologist with rich clinical experience, who manually reads and compares fundus photographs of several modalities (fundus color photographs, fundus OCT, fundus fluorescein angiography (FFA), and the like); it offers a low degree of automation and low screening efficiency, and fatigue leads to misdiagnosis. Referring to the embodiment shown in FIG. 1, most existing automatic fundus disease detection methods use single-modality images (such as fundus color photographs) together with image processing techniques or AI neural network models. They belong to single-modality, large-sample supervised learning, require a large amount of labeled fundus image data and considerable manual annotation, and have low accuracy for multi-modal disease detection on small-sample data.
In research on cross-modal screening of fundus retinal diseases, the related art performs joint screening through machine learning algorithms and deep learning (DL) neural networks such as CNNs. Some examples follow:
In one embodiment of the related art, an infrared macular-area fundus image and an optical coherence tomography (OCT) image are obtained from electronic medical records to form a bimodal image sample; the two image samples are then input simultaneously into a neural network for training to obtain the feature information of the two modality images; the overall image feature information is then computed through weights and fed into a fully connected network to obtain a prediction result. This joint screening prediction uses few image modalities, performs no preprocessing on the fundus images before fusion, and relies on a plain, simple neural network model with poor performance. Moreover, the fusion of the bimodal images uses simple weighted superposition, which cannot highlight the specific contribution of each modality image.
One embodiment of the related art jointly trains a DL network on fundus OCT and fundus photography bimodal images. A joint screening network model, designed and realized with a pre-trained VGG-19 and a transfer learning method combined with random forests, proved more effective for early diagnosis of age-related macular degeneration: jointly using fundus photography and OCT improves the diagnosis rate, with an area under the ROC curve (AUC) of 0.969 and an accuracy of 90.5%, outperforming the use of a single-modality fundus image alone. This method uses transfer learning, reducing dependence on large samples and shortening network convergence time, but it simplifies fundus disease detection into a simple binary classification problem (only normal versus age-related macular degeneration can be detected), which reduces the generality of the model and prevents its application to other types of fundus disease.
The related art has also used anterior-segment photography, fundus color photography, and B-mode ultrasound data, applying several AI algorithms such as SVM, ANN, and CNN to design a diagnosis and grading system for early cataract whose accuracy, sensitivity, and specificity can reach the screening level of human ophthalmologists. This method uses machine learning algorithms such as SVM for feature compression and extraction from the multi-modal images before fusion; it requires human intervention and cannot achieve end-to-end automatic screening. Meanwhile, using a CNN as the feature extractor requires a large amount of labeled data in the parameter-training stage, consuming substantial manpower and material resources; the annotation process is time-consuming, demands expert domain knowledge, and is extremely difficult, so the model does not generalize well.
To address these problems, the method provided by the present application targets cross-modal fundus disease screening in the absence of a large number of labeled fundus images: using a multi-modal AI network model with a relatively small parameter scale and high computation speed, it achieves automatic screening and classification of small-sample, cross-modal fundus images covering several fundus diseases, and, under the same conditions, attains a screening accuracy higher than that of an AI network trained on single-modality data.
The disease detection method and system based on multi-modal images according to the embodiments of the present application are described in detail below with reference to the accompanying drawings; the method is described first.
Referring to FIG. 2, an embodiment of the present application provides a disease detection method based on multi-modal images. The method may be applied to a terminal, a server, or software running in a terminal or server. The terminal may be, but is not limited to, a tablet computer, a notebook computer, a desktop computer, or the like. The server may be an independent physical server, a server cluster or distributed system formed by multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs, big data, and artificial intelligence platforms. The disease detection method based on multi-modal images in the embodiment of the present application mainly comprises the following steps:
S100: acquiring multi-modal image data;
In this step, the acquired multi-modal image data may be at least one of fundus color photographs, fundus OCT images, and fundus fluorescein angiography (FFA) images. Of course, those skilled in the art may select the modality image data relevant to the disease to be examined as required and perform disease detection with the method provided by the present application.
S200: preprocessing the multi-modal image data to obtain preprocessed data;
In this step, the multi-modal image data may be preprocessed as follows: the multi-modal fundus image data undergo preprocessing operations such as scaling, locally adaptive contrast stretching, contrast-limited adaptive histogram equalization, and histogram-matching fusion. These operations further highlight the feature-rich foreground region and attenuate background noise, effectively improving the quality of the ophthalmic images.
S300: inputting the preprocessed data into a pre-feature extraction model to obtain pre-feature data, where the pre-feature extraction model extracts features from the multi-modal image data through shared convolution weights;
In this step, a pre-trained pre-feature extraction model serves as the front-end feature extractor, and the cross-modal image branches are fused by channel splicing with attention, realizing multi-modal classification screening. This effectively simulates the real-world situation of an ophthalmologist jointly reading images across modalities, and reaches, with a limited small sample of labeled multi-modal data, the screening accuracy of training on massive single-modality data under the same conditions. It addresses the low accuracy of multi-class ophthalmic disease detection under supervised learning on small-sample data, reduces the required number of labeled samples and the model training time, compresses the scale of the model parameters, and improves the feasibility of automatic detection on portable devices. Specifically, the pre-feature extraction model may be a ResNet50 network model.
S400: inputting the pre-feature data into a feature fusion model to obtain a disease classification result corresponding to the multi-modal image data.
To achieve fusion screening of multi-modal images, accelerate the convergence of multi-modal fundus image training, and reduce the work of collecting and labeling large numbers of annotations, the embodiment of the present application provides a multi-modal lesion screening method that uses a pre-trained ResNet50 as the front-end feature extractor and performs channel splicing and fusion of the cross-modal image branches, i.e., a weight-sharing multi-branch splicing-fusion model. Specifically, as shown in FIG. 3, a ResNet50 network pre-trained on the ImageNet dataset serves as the front-end feature extractor, with its pre-trained weights used as initialization weights for the forward propagation that extracts features; in effect, the modality image branches share the convolution weights. After the features of the three modalities are initially extracted, each modality branch passes through several further convolution layers for deep feature extraction; feature-level fusion is then performed by channel splicing; finally, the channel features of the fused multi-modal feature set are weighted and emphasized by an SE attention module and sent to the classifier for classification.
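As an illustration, the following PyTorch sketch mirrors the shape of this weight-sharing multi-branch splicing-fusion model. It is a minimal sketch, not the patented implementation: the branch convolution widths, the class count of four (per the classifier discussion below), and the names MultiModalNet and branch are assumptions, and the SE attention module is omitted here (a sketch of it follows the SE-module discussion later in this description).

```python
import torch
import torch.nn as nn
import torchvision

class MultiModalNet(nn.Module):
    """Weight-sharing multi-branch splicing-fusion model (illustrative sketch)."""
    def __init__(self, num_classes: int = 4):
        super().__init__()
        # Front-end feature extractor: ResNet50 pre-trained on ImageNet, truncated
        # before its pooling/FC head. The SAME instance serves every modality
        # branch, so the convolution weights are shared.
        backbone = torchvision.models.resnet50(weights="IMAGENET1K_V1")
        self.shared_stem = nn.Sequential(*list(backbone.children())[:-2])  # (B, 2048, H/32, W/32)

        def branch() -> nn.Sequential:
            # Per-modality branch: several further convolutions for deep features.
            return nn.Sequential(
                nn.Conv2d(2048, 256, 3, padding=1), nn.BatchNorm2d(256), nn.ReLU(inplace=True),
                nn.Conv2d(256, 256, 3, padding=1), nn.BatchNorm2d(256), nn.ReLU(inplace=True),
            )
        self.branch_cfp, self.branch_oct, self.branch_ffa = branch(), branch(), branch()

        # Simple head over the channel-spliced fusion (SE attention omitted here).
        self.head = nn.Sequential(
            nn.Conv2d(3 * 256, 256, 3, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(256, num_classes),
        )

    def forward(self, cfp, oct_img, ffa):
        # One shared stem, three forward passes: the modality branches share weights.
        f_cfp = self.branch_cfp(self.shared_stem(cfp))
        f_oct = self.branch_oct(self.shared_stem(oct_img))
        f_ffa = self.branch_ffa(self.shared_stem(ffa))
        fused = torch.cat([f_cfp, f_oct, f_ffa], dim=1)  # feature-level fusion by channel splicing
        return self.head(fused)                          # logits; softmax at inference

model = MultiModalNet()
x = torch.randn(1, 3, 256, 256)
logits = model(x, x.clone(), x.clone())  # one tensor per modality
```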
Optionally, in the disease detection method based on multi-modal images according to the embodiment of the present application, preprocessing the multi-modal image data to obtain preprocessed data comprises:
scaling the multi-modal image data to obtain first data;
performing locally adaptive contrast stretching on the first data to obtain second data;
performing contrast-limited adaptive histogram equalization on the second data to obtain third data;
and performing histogram-matching fusion on the third data to obtain the preprocessed data.
In this step, the preprocessing may proceed as follows. The multi-modal image data (for example, fundus images of three modalities) are scaled to a given size (for example, 256×256) to obtain the first data. The first data are processed by locally adaptive contrast stretching (SACS), reducing the influence of uneven illumination on the image. The SACS-processed image (i.e., the second data) then undergoes contrast-limited adaptive histogram equalization (CLAHE), highlighting the inter-layer information of the OCT image and the contour information of vessels and focal plaques in the fundus color photograph and angiography (FFA) images. Finally, the enhanced, denoised image is fused by histogram matching with the CLAHE-processed image, further highlighting the foreground region, attenuating background noise, and effectively improving image quality.
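A minimal sketch of this preprocessing chain, assuming OpenCV and scikit-image, is shown below. The sacs() stub stands in for the locally adaptive contrast stretching detailed next, and the 256×256 size, the CLAHE parameters, and the particular blend used for "histogram-matching fusion" are illustrative assumptions rather than values fixed by the application.

```python
import cv2
import numpy as np
from skimage.exposure import match_histograms

def sacs(img: np.ndarray) -> np.ndarray:
    # Placeholder: the full SACS procedure is sketched after the
    # algorithm description below.
    return img

def preprocess(img_gray: np.ndarray) -> np.ndarray:
    """Scale -> SACS -> CLAHE -> histogram-matching fusion (illustrative chain)."""
    first = cv2.resize(img_gray, (256, 256))        # scaling -> first data
    second = sacs(first)                            # locally adaptive contrast stretching -> second data
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    third = clahe.apply(second)                     # CLAHE -> third data
    # Histogram-matching fusion: match the CLAHE result to the enhanced image's
    # gray distribution, then blend the two (one plausible reading of "fusion").
    matched = match_histograms(third.astype(np.float32), second.astype(np.float32))
    fused = np.clip((matched + third.astype(np.float32)) / 2.0, 0, 255)
    return fused.astype(np.uint8)                   # preprocessed data
```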
Optionally, in an embodiment of the present application, performing locally adaptive contrast stretching on the first data to obtain second data comprises:
obtaining a first gray value from the first data, the first gray value representing a gray-level extremum of the first data;
partitioning the first data into a plurality of sub-blocks;
determining a pixel gray value according to the mean gray value of each sub-block;
if the pixel gray value is smaller than a gray threshold, updating the pixel gray value to the first gray value;
if the pixel gray value is greater than or equal to the gray threshold, performing gamma gray stretching on the pixel gray value and updating it;
and determining the second data from the updated pixel gray values.
In this step, the locally adaptive contrast stretching (SACS) algorithm proceeds as follows. First, traverse the image to obtain its global gray minimum $I_{\min}$ (i.e., the first gray value). Then divide the image (256×256) into M×N blocks (for example, M = N = 16) and compute the mean gray value of each block. Traverse the current image block and compare the gray value of each pixel with a gray threshold (which may be, for example, half of the mean gray value of the current block): if the pixel gray value is smaller than the threshold, replace it with the global gray minimum $I_{\min}$; otherwise, apply gamma (γ) gray stretching to the current pixel gray value, with γ = 1.5, and replace the original gray value. Specifically, with $I_{ij}$ denoting the current (i, j)-th image block, $I_{ij}(x,y)$ the gray value at pixel $(x,y)$ of that block, and $T_{ij}$ the gray threshold of the block, the update is:

$$
I'_{ij}(x,y)=\begin{cases}
I_{\min}, & I_{ij}(x,y) < T_{ij}\\
255\left(I_{ij}(x,y)/255\right)^{\gamma}, & I_{ij}(x,y)\ge T_{ij}
\end{cases}
\qquad \gamma = 1.5
$$
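The NumPy sketch below implements these SACS steps. The 16×16 block grid, the half-mean threshold, and γ = 1.5 follow the example values above; the function name and the normalized form of the gamma stretch are assumptions of this sketch.

```python
import numpy as np

def sacs(img: np.ndarray, m: int = 16, n: int = 16, gamma: float = 1.5) -> np.ndarray:
    """Locally adaptive contrast stretching for an 8-bit grayscale image (sketch)."""
    out = img.astype(np.float64)
    g_min = out.min()                         # global gray minimum (the first gray value)
    bh, bw = img.shape[0] // m, img.shape[1] // n
    for i in range(m):
        for j in range(n):
            block = out[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw]  # view into out
            thresh = block.mean() / 2.0       # gray threshold: half the block mean
            low = block < thresh
            block[low] = g_min                # dark pixels fall to the global minimum
            block[~low] = 255.0 * (block[~low] / 255.0) ** gamma   # gamma stretch
    return out.astype(np.uint8)

stretched = sacs(np.random.randint(0, 256, (256, 256), dtype=np.uint8))
```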
in some possible implementations, the limited contrast adaptive histogram equalization (CLAHE) algorithm steps may be: and performing image blocking, and calculating a gray level histogram dictionary of each image area. Then defining a cut-out threshold Cliplimit (gray level frequency/duty ratio), traversing the current area histogram dictionary, cutting out the gray level frequency/duty ratio (amplitude) if the frequency or duty ratio of the current gray level is higher than the Cliplimit threshold, and averaging to each gray level; then, a cumulative distribution histogram (CDF) of the region is calculated, and histogram equalization processing is performed. And finally, bilinear interpolation is carried out on adjacent image blocks, so that the area equalization speed is increased, the boundary jump between areas is eliminated, and the gray level conversion is more gentle.
Optionally, in an embodiment of the present application, the pre-feature extraction model comprises a ResNet50 network model, and the method further comprises:
training the ResNet50 network model on the ImageNet dataset to obtain a trained ResNet50 network model.
In this step, the pre-feature extraction model comprises a ResNet50 network model. Specifically, a pre-trained ResNet50 network model serves as the feature extractor: the weights of a ResNet50 pre-trained on the ImageNet dataset are used as initialization weights, and features are extracted by forward propagation. That is, the pre-trained ResNet50 network model extracts the features of the fundus OCT, fundus color photograph, and fundus angiography images while sharing the training weights of the ResNet50, thereby playing the role of a feature extractor; disease detection based on multi-modal images is realized through the ResNet50 network model.
Optionally, in an embodiment of the present application, inputting the preprocessed data into the pre-feature extraction model to obtain pre-feature data comprises:
extracting features from the preprocessed data through the trained ResNet50 network model and sharing the training weights of the ResNet50 network model to obtain the pre-feature data.
Optionally, in an embodiment of the present application, inputting the pre-feature data into the feature fusion model to obtain the disease classification result corresponding to the multi-modal image data comprises:
performing convolution on the pre-feature data to obtain depth feature data;
performing feature fusion on the depth feature data to obtain fourth data;
inputting the fourth data into an attention-mechanism module to obtain fifth data, the attention-mechanism module performing weighted emphasis on the fourth data;
and inputting the fifth data into a classifier module to obtain the disease classification result corresponding to the multi-modal image data.
In this step, after the front-end feature extractor initially extracts the features of the three modalities, each modality branch passes through several convolution layers that further extract deep features, and feature-level fusion is then performed by channel splicing. A channel attention-mechanism module (an SE module, as an example) is introduced before the features are sent to the classifier, applying channel-wise weighted emphasis to the fused feature set: it emphasizes the contribution of useful features to the classification result and weakens the influence of useless ones, improving the accuracy of the classification result.
This feature fusion method requires that the feature maps of the stacked modality images agree in width and height, while their channel counts may differ, as illustrated in the sketch below. Its advantages are simple operation and that the image feature information of each modality is preserved as far as possible during fusion.
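The constraint can be seen directly with torch.cat: the spatial dimensions must agree across branches while the channel counts simply add up (the tensor shapes are made up for illustration).

```python
import torch

f_cfp = torch.randn(1, 256, 8, 8)  # fundus color photograph branch
f_oct = torch.randn(1, 128, 8, 8)  # OCT branch: a different channel count is fine
f_ffa = torch.randn(1, 256, 8, 8)  # FFA branch
# Channel splicing: width and height must match; channels concatenate.
fused = torch.cat([f_cfp, f_oct, f_ffa], dim=1)
print(fused.shape)  # torch.Size([1, 640, 8, 8])
```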
In some possible implementations, the classifier module may be a common stack of four convolution layers plus one softmax output layer, which compresses the result to between 0 and 1 and ultimately yields, among four common types of fundus disease, the one with the highest confidence score. It should be understood that the present application does not limit the implementation of the specific classifier module.
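One minimal rendering of such a head, under the four-convolutions-plus-softmax reading above, is sketched here; the layer widths and the input channel count are illustrative assumptions.

```python
import torch
import torch.nn as nn

classifier = nn.Sequential(
    nn.Conv2d(768, 256, 3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(256, 128, 3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(64, 4, 3, padding=1),        # fourth convolution maps to 4 disease classes
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Softmax(dim=1),                     # confidence scores compressed to (0, 1)
)
scores = classifier(torch.randn(1, 768, 8, 8))  # highest score gives the predicted disease
```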
Optionally, in an embodiment of the present application, inputting the fourth data into the attention-mechanism module to obtain fifth data comprises:
performing a convolution operation on the fourth data to obtain first feature data and corresponding first weights;
pooling the first feature data to obtain compressed data;
training the first weights through fully connected layers to obtain weighted data;
and obtaining the fifth data from the compressed data and the weighted data.
In this step, the attention-mechanism module may be implemented as an SE attention module. The SE module (Squeeze-and-Excitation block) was first proposed in the SENet model to explicitly model correlations along the input feature's channel dimension and thereby improve network performance. In essence, it applies weighted emphasis to the channel features of an input sample: more important channel features, which contribute strongly to classification and recognition, are assigned larger weights, while less important, weakly contributing channel features are assigned smaller weights; it is, in effect, an attention mechanism over channels. As shown in FIG. 4, the SE module can be split into three steps: squeeze, excitation, and recalibration (scale). First, an input sample x passes through a series of convolution operations to obtain features with C channels. The squeeze operation then compresses the input feature of C channels and spatial size W×H into one real number per channel through a global average pooling layer; in a certain sense each such number has a global receptive field, i.e., it encodes the spatial dependencies of the input feature. Excitation follows: a weight parameter W is learned by training two fully connected layers, generating one weight per feature channel; important channels receive larger weights and lesser channels smaller ones, and so on until the weighting of all channels is complete. This substantially models the dependencies among feature channels, and the excitation operation is therefore the core of the SE module. Finally, the recalibration (scale) operation multiplies the per-channel weighting coefficients obtained from the preceding squeeze and excitation operations back onto the original input feature, channel by channel, completing the recalibration of the feature channels.
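A standard PyTorch rendering of these three steps is sketched below; the reduction ratio r = 16 is the conventional SENet default, assumed here rather than specified by the application.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation block: squeeze -> excitation -> recalibration (scale)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool2d(1)   # (B, C, W, H) -> (B, C, 1, 1): one real number per channel
        self.excitation = nn.Sequential(         # two fully connected layers learn the channel weights
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.squeeze(x).view(b, c)           # squeeze: global spatial context per channel
        w = self.excitation(w).view(b, c, 1, 1)  # excitation: one weight per channel
        return x * w                             # scale: recalibrate the input channel by channel

se = SEBlock(channels=768)
y = se(torch.randn(1, 768, 8, 8))  # e.g., the fused multi-modal feature set
```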
In summary, referring to FIG. 5, compared with traditional manual image reading, the method provided by the embodiments of the present application offers a more automated detection flow, produces relatively stable diagnostic outputs, and achieves end-to-end automatic screening and detection. When traditional manual reading by doctors is limited by resources, the method can assist in improving detection efficiency, reduce doctors' workload, and avoid fatigue-induced misdiagnosis and the situation where each reader produces a different result. Compared with existing ophthalmic-image AI schemes, the method suits the small-sample data case: the embodiments of the present application use a pre-trained ResNet50 as the front-end feature extractor and exploit the complementary information in cross-modal image data, which reduces the data-labeling workload, saves manpower and material resources, and shortens the model training cycle, so the method generalizes well. By contrast, most existing ophthalmic AI automatic detection techniques are supervised learning and require massive labeled datasets.
Meanwhile, compared with existing ophthalmic-image AI schemes, the embodiments provided by the present application suit the scenario of joint screening of multi-modal ophthalmic images. Most automatic end-to-end AI screening and classification models for ophthalmic images accept only single-modality input and cannot truly simulate a real ophthalmologist combining cross-modal ophthalmic images to complement information, assist diagnosis, and improve diagnostic accuracy.
In some possible implementations, the embodiments provided herein are smaller in scale than existing ophthalmic-image AI schemes, suit portable devices, and support automatic screening (multi-class classification) of several types of ophthalmic disease. Under the same conditions, the method can reach the detection accuracy of large-sample supervised learning with shorter pre-training time and fewer training samples, can detect more types of ophthalmic disease, and is more general.
Next, a disease detection system based on multi-modal images according to an embodiment of the present application is described with reference to FIG. 6.
FIG. 6 is a schematic structural diagram of a disease detection system based on multi-modal images according to an embodiment of the present application; the system specifically comprises:
a first module 610 for acquiring multi-modal image data;
a second module 620 for preprocessing the multi-modal image data to obtain preprocessed data;
a third module 630 for inputting the preprocessed data into the pre-feature extraction model to obtain pre-feature data, the pre-feature extraction model extracting features from the multi-modal image data through shared convolution weights;
and a fourth module 640 for inputting the pre-feature data into the feature fusion model to obtain a disease classification result corresponding to the multi-modal image data.
It can be seen that the content of the above method embodiment applies to this system embodiment: the functions specifically implemented by the system embodiment are the same as those of the above method embodiment, and the beneficial effects achieved are the same as those achieved by the above method embodiment.
Referring to FIG. 7, an embodiment of the present application provides a disease detection device based on multi-modal images, comprising:
at least one processor 710;
at least one memory 720 for storing at least one program;
the at least one program, when executed by the at least one processor 710, causing the at least one processor 710 to implement the multi-modal image-based disease detection method.
Similarly, the content of the above method embodiment applies to this device embodiment: the functions specifically implemented by the device embodiment are the same as those of the above method embodiment, and the beneficial effects achieved are the same as those achieved by the above method embodiment.
An embodiment of the present application also provides a computer-readable storage medium storing a program executable by the processor 710; when executed by the processor 710, the program performs the above disease detection method based on multi-modal images.
Similarly, the content of the above method embodiment applies to this storage-medium embodiment: the functions specifically implemented by this embodiment are the same as those of the above method embodiment, and the beneficial effects achieved are the same as those achieved by the above method embodiment.
In some alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flowcharts of this application are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed, and in which sub-operations described as part of a larger operation are performed independently.
Furthermore, while the present application is described in the context of functional modules, it should be appreciated that, unless otherwise indicated, one or more of the functions and/or features may be integrated in a single physical device and/or software module or one or more of the functions and/or features may be implemented in separate physical devices or software modules. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary to an understanding of the present application. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be apparent to those skilled in the art from consideration of their attributes, functions and internal relationships. Thus, those of ordinary skill in the art will be able to implement the present application as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative and are not intended to be limiting upon the scope of the application, which is to be defined by the appended claims and their full scope of equivalents.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, including several programs for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable programs for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with a program-execution system, apparatus, or device, such as a computer-based system, a processor-containing system, or another system that can fetch the programs from the program-execution system, apparatus, or device and execute them. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the program-execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer readable medium may even be paper or other suitable medium on which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.
It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable program-execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented using any one or a combination of the following techniques well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
In the foregoing description of the present specification, descriptions of the terms "one embodiment/example", "another embodiment/example", "certain embodiments/examples", and the like, are intended to mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present application have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the principles and spirit of the application, the scope of which is defined by the claims and their equivalents.
While the preferred embodiments of the present invention have been described in detail, the present invention is not limited to the above embodiments, and those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of the present invention; these equivalent modifications or substitutions are intended to be included within the scope of the present invention as defined by the appended claims.

Claims (10)

1. A disease detection method based on multi-modal images, characterized by comprising the following steps:
acquiring multi-modal image data;
preprocessing the multi-modal image data to obtain preprocessed data;
inputting the preprocessed data into a pre-feature extraction model to obtain pre-feature data, wherein the pre-feature extraction model extracts features from the multi-modal image data through shared convolution weights;
and inputting the pre-feature data into a feature fusion model to obtain a disease classification result corresponding to the multi-modal image data.
2. The disease detection method based on multi-modal images according to claim 1, wherein preprocessing the multi-modal image data to obtain preprocessed data comprises:
scaling the multi-modal image data to obtain first data;
performing locally adaptive contrast stretching on the first data to obtain second data;
performing contrast-limited adaptive histogram equalization on the second data to obtain third data;
and performing histogram-matching fusion on the third data to obtain the preprocessed data.
3. The disease detection method based on multi-modal images according to claim 2, wherein performing locally adaptive contrast stretching on the first data to obtain second data comprises:
obtaining a first gray value from the first data, the first gray value representing a gray-level extremum of the first data;
partitioning the first data into a plurality of sub-blocks;
determining a pixel gray value according to the mean gray value of each sub-block;
if the pixel gray value is smaller than a gray threshold, updating the pixel gray value to the first gray value;
if the pixel gray value is greater than or equal to the gray threshold, performing gamma gray stretching on the pixel gray value and updating it;
and determining the second data from the updated pixel gray values.
4. The disease detection method based on multi-modal images according to claim 1, wherein the pre-feature extraction model comprises a ResNet50 network model, and the method further comprises:
training the ResNet50 network model on the ImageNet dataset to obtain a trained ResNet50 network model.
5. The disease detection method based on multi-modal images according to claim 4, wherein inputting the preprocessed data into the pre-feature extraction model to obtain pre-feature data comprises:
extracting features from the preprocessed data through the trained ResNet50 network model and sharing the training weights of the ResNet50 network model to obtain the pre-feature data.
6. The disease detection method based on multi-modal images according to claim 1, wherein inputting the pre-feature data into the feature fusion model to obtain the disease classification result corresponding to the multi-modal image data comprises:
performing convolution on the pre-feature data to obtain depth feature data;
performing feature fusion on the depth feature data to obtain fourth data;
inputting the fourth data into an attention-mechanism module to obtain fifth data, the attention-mechanism module performing weighted emphasis on the fourth data;
and inputting the fifth data into a classifier module to obtain the disease classification result corresponding to the multi-modal image data.
7. The disease detection method based on multi-modal images according to claim 6, wherein inputting the fourth data into the attention-mechanism module to obtain fifth data comprises:
performing a convolution operation on the fourth data to obtain first feature data and corresponding first weights;
pooling the first feature data to obtain compressed data;
training the first weights through fully connected layers to obtain weighted data;
and obtaining the fifth data from the compressed data and the weighted data.
8. A disease detection system based on multi-modal images, characterized by comprising:
a first module for acquiring multi-modal image data;
a second module for preprocessing the multi-modal image data to obtain preprocessed data;
a third module for inputting the preprocessed data into a pre-feature extraction model to obtain pre-feature data, the pre-feature extraction model extracting features from the multi-modal image data through shared convolution weights;
and a fourth module for inputting the pre-feature data into a feature fusion model to obtain a disease classification result corresponding to the multi-modal image data.
9. A disease detection device based on multi-modal images, characterized by comprising:
at least one processor;
at least one memory for storing at least one program;
wherein the at least one program, when executed by the at least one processor, causes the at least one processor to implement the disease detection method based on multi-modal images according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a processor-executable program, characterized in that the processor-executable program, when executed by a processor, implements the disease detection method based on multi-modal images according to any one of claims 1 to 7.
CN202211727966.6A 2022-12-29 2022-12-29 Disease detection method, system, device and storage medium based on multi-mode images Pending CN116092667A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211727966.6A CN116092667A (en) 2022-12-29 2022-12-29 Disease detection method, system, device and storage medium based on multi-mode images

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211727966.6A CN116092667A (en) 2022-12-29 2022-12-29 Disease detection method, system, device and storage medium based on multi-mode images

Publications (1)

Publication Number Publication Date
CN116092667A true CN116092667A (en) 2023-05-09

Family

ID=86205708

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211727966.6A Pending CN116092667A (en) 2022-12-29 2022-12-29 Disease detection method, system, device and storage medium based on multi-mode images

Country Status (1)

Country Link
CN (1) CN116092667A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117350982A (en) * 2023-10-23 2024-01-05 郑州大学 Multi-medical image-based diabetic nephropathy analysis method and system
CN117350982B (en) * 2023-10-23 2024-05-14 郑州大学 Multi-medical image-based diabetic nephropathy analysis method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination