CN118334469A - Training method for predicting intracranial tumor image model and related device thereof - Google Patents

Training method for predicting intracranial tumor image model and related device thereof

Info

Publication number
CN118334469A
Authority
CN
China
Prior art keywords: training, eye, data set, image, intracranial tumor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410466483.8A
Other languages
Chinese (zh)
Inventor
蒋帅
王克冰
何孟贤
刘博文
求佳宁
袁武
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shijia Smart Technology Shenzhen Co ltd
Original Assignee
Shijia Smart Technology Shenzhen Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shijia Smart Technology Shenzhen Co ltd filed Critical Shijia Smart Technology Shenzhen Co ltd
Publication of CN118334469A
Legal status: Pending


Abstract

The embodiment of the invention provides a training method for an image model for predicting intracranial tumors and a related device. The method addresses the problem that the eye images currently used for deep-learning-based prediction of intracranial tumors are time-consuming and costly to acquire.

Description

Training method for predicting intracranial tumor image model and related device thereof
Technical Field
The embodiment of the invention relates to the technical field of machine learning, and in particular to a training method for an image model for predicting intracranial tumors and a related device.
Background
Existing diagnosis of craniocerebral tumors relies on craniocerebral CT and MRI, which must be performed at large medical institutions or physical examination centers and are expensive for a single examination. Patients often seek medical attention only when typical symptoms such as headache, vomiting, vision loss, or motor and sensory disturbances of the limbs appear, by which time the tumor has already progressed to an intermediate or late stage, so the optimal treatment window is missed.
Because craniocerebral tumors often cause intracranial space-occupying lesions and raised intracranial pressure, lesion features such as optic disc congestion, swelling, and blurred margins appear in the eye region. These lesion features can therefore be captured in an eye image by photographing the patient's eyes or fundus.
However, current deep-learning methods for predicting intracranial tumors from eye images rely on images in the CT data modality and/or the MRI data modality. Images in these two modalities are obtained through computed tomography (CT) and magnetic resonance imaging (MRI) examinations, which are time-consuming, expensive, difficult to use for large-scale screening, and sometimes inaccurate; as a result, they cannot be applied to early screening of craniocerebral tumors, and patients miss the optimal treatment window.
Disclosure of Invention
The embodiment of the invention provides a training method for an image model for predicting intracranial tumors and a related device, to solve the problem that the eye images currently used for deep-learning-based prediction of intracranial tumors are time-consuming and costly to acquire.
In a first aspect, an embodiment of the present invention provides a training method for predicting an image model of an intracranial tumor, including:
Acquiring a first training data set and a second training data set; the first training data set comprises a plurality of eye images, the second training data set comprises an eye training data set and intracranial tumor examination results matched with the eye training data set, the eye training data set comprises a plurality of synthetic eye images, the synthetic eye images are generated by a diffusion model trained on real eye images of sample users with intracranial tumors, and the intracranial tumor examination results comprise examination results indicating an intracranial tumor and examination results indicating no intracranial tumor;
Pre-training the coding module in a preset initial model by using the first training data set until a preset first convergence condition is met, and determining that the pre-training of the coding module is completed; the coding module is used for carrying out feature coding on the eye image;
While keeping the network parameters of the pre-trained coding module unchanged, training a decoding module in the initial model by taking the eye training data set in the second training data set as samples and the intracranial tumor examination results matched with the eye training data set as expected true values; the decoding module is cascaded on the pre-trained coding module during training and is used for decoding the eye coding features obtained after the coding module encodes the eye training data set, so as to obtain an intracranial tumor prediction result;
When the training of the decoding modules in the initial model is completed, the initial model formed by cascading the trained decoding modules and the pre-trained encoding modules is determined to be an image model for predicting intracranial tumors.
In a second aspect, embodiments of the present invention also provide a method for predicting an intracranial tumor, comprising:
acquiring an ophthalmic medical image of a user to be predicted;
determining an image model according to the training method of an image model as described in the first aspect;
And inputting the ophthalmic medical image into the image model, and outputting an intracranial tumor prediction result for the user to be predicted.
In a third aspect, an embodiment of the present invention further provides a training apparatus for predicting an image model of an intracranial tumor, including:
The data acquisition module is used for acquiring a first training data set and a second training data set; the first training data set comprises a plurality of eye images, the second training data set comprises an eye training data set and intracranial tumor examination results matched with the eye training data set, the eye training data set comprises a plurality of synthetic eye images, the synthetic eye images are generated by a diffusion model trained on real eye images of sample users with intracranial tumors, and the intracranial tumor examination results comprise examination results indicating an intracranial tumor and examination results indicating no intracranial tumor;
the coding training module is used for pre-training the coding module in the preset initial model by utilizing the first training data set until a preset first convergence condition is met, and determining that the pre-training of the coding module is completed; the coding module is used for carrying out feature coding on the eye image;
The decoding training module is used for training the decoding module in the initial model, while keeping the network parameters of the pre-trained coding module unchanged, by taking the eye training data set in the second training data set as samples and the intracranial tumor examination results matched with the eye training data set as expected true values; the decoding module is cascaded on the pre-trained coding module during training and is used for decoding the eye coding features obtained after the coding module encodes the eye training data set, so as to obtain an intracranial tumor prediction result;
And the image model confirmation module is used for determining an initial model formed by cascading the trained decoding module and the pre-trained encoding module as an image model for predicting intracranial tumor when the training of the decoding module in the initial model is completed.
In a fourth aspect, embodiments of the present invention also provide a medical device for predicting an intracranial tumor, comprising:
The image acquisition module is used for acquiring an ophthalmic medical image of a user to be predicted;
a model determination module for determining an image model according to the training method of an image model as described in the first aspect;
And the prediction module is used for inputting the ophthalmic medical image into the image model and outputting an intracranial tumor prediction result aiming at the user to be predicted.
In a fifth aspect, embodiments of the present invention further provide a computer apparatus, the computer apparatus including:
One or more processors;
a memory for storing one or more programs,
The one or more programs, when executed by the one or more processors, cause the one or more processors to implement the training method for predicting an image model of an intracranial tumor as described in the first aspect or the method for predicting an intracranial tumor as described in the second aspect.
In the embodiment of the application, a first training data set and a second training data set are acquired; the first training data set comprises a plurality of eye images, the second training data set comprises an eye training data set and intracranial tumor examination results matched with the eye training data set, the eye training data set comprises a plurality of synthetic eye images, the synthetic eye images are generated by a diffusion model trained on real eye images of sample users with intracranial tumors, and the intracranial tumor examination results comprise examination results indicating an intracranial tumor and examination results indicating no intracranial tumor. The coding module in a preset initial model is pre-trained with the first training data set until a preset first convergence condition is met, at which point the pre-training of the coding module is determined to be complete; the coding module is used for feature-encoding the eye images. While the network parameters of the pre-trained coding module are kept unchanged, the decoding module in the initial model is trained by taking the eye training data set in the second training data set as samples and the intracranial tumor examination results matched with the eye training data set as expected true values; the decoding module is cascaded on the pre-trained coding module during training and is used for decoding the eye coding features obtained after the coding module encodes the eye training data set, so as to obtain an intracranial tumor prediction result. When the training of the decoding module in the initial model is completed, the initial model formed by cascading the trained decoding module and the pre-trained encoding module is determined to be the image model for predicting intracranial tumors.
During training, the image model learns the association between eye features in eye images and intracranial tumors, so that when an ophthalmic medical image of a user to be predicted is input into the image model, the corresponding intracranial tumor prediction result for that user is output. The image model is suitable for application scenarios that require large-scale intracranial tumor screening and classification, and is especially suitable for scenarios in which a preliminary diagnosis of intracranial tumors must be made for patients in regions with underdeveloped medical care: the image model can output an intracranial tumor prediction result from only the input eye image (for example, an eye image photographed for the patient in real time), and the prediction result output by the model can serve as a preliminary diagnosis to assist professional medical staff in further clinical diagnosis of the patient, thereby reducing medical costs and improving medical efficiency. Meanwhile, because eye images of real intracranial tumor patients are difficult to acquire, there are few labeled data samples available for training the model; the embodiment of the application therefore also uses synthetic eye images generated by the diffusion model to characterize intracranial tumors, which expands the training data set and makes the trained model more accurate. Moreover, the embodiment of the application addresses the problems that existing intracranial tumor/craniocerebral tumor screening and classification rely on craniocerebral CT and MRI, have high detection costs, and can only be performed at large medical institutions or physical examination centers, and further addresses the problem that the eye images (CT and MRI images) currently used for deep-learning-based prediction of intracranial tumors are time-consuming and costly to acquire.
Drawings
FIG. 1 is a flowchart of a training method for predicting an image model of an intracranial tumor according to an embodiment of the present invention;
FIG. 2 is a flow chart of a method for predicting an intracranial tumor according to a second embodiment of the invention;
FIG. 3 is a schematic structural diagram of a training device for predicting an image model of an intracranial tumor according to a third embodiment of the present invention;
FIG. 4 is a schematic structural view of a medical device for predicting intracranial tumors according to a fourth embodiment of the present invention;
fig. 5 is a schematic structural diagram of a computer device according to a fifth embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting thereof. It should be further noted that, for convenience of description, only some, but not all of the structures related to the present invention are shown in the drawings.
It should be noted that intracranial tumors often cause intracranial space-occupying lesions and raised intracranial pressure, so lesion features such as optic disc congestion, swelling, and blurred margins appear in the eye region and can be reflected in an eye image. It is therefore conceivable that the embodiment of the present invention can use multi-modal eye images to train a coding module capable of extracting eye coding features from various eye images, and combine it with a decoding module that is trained with eye images of craniocerebral tumor patients (synthetic eye images are also used because real data samples are insufficient) and the corresponding intracranial tumor examination results as training labels, and that decodes the eye coding features of an eye image into an intracranial tumor prediction result. Together they form an image model for predicting intracranial tumors, and the trained image model can be used in the medical field or the image recognition field to assist professional doctors in the clinical diagnosis of patients.
Example 1
Fig. 1 is a flowchart of a training method for predicting an image model of an intracranial tumor according to an embodiment of the present invention, where the method may be performed by a training device for predicting an image model of an intracranial tumor, and the training device for predicting an image model of an intracranial tumor may be implemented by software and/or hardware, and may be configured in a computer device, for example, a personal notebook computer, a desktop computer, a server, an industrial personal computer, a computer all-in-one machine, a medical device, and so on, and specifically includes the following steps:
S110, acquiring a first training data set and a second training data set.
The first training data set comprises a plurality of eye images; the second training data set comprises an eye training data set and intracranial tumor examination results matched with the eye training data set, and the eye training data set comprises a plurality of eye images and a plurality of synthetic eye images. The synthetic eye images are generated by a diffusion model trained on real eye images of sample users with intracranial tumors, and the intracranial tumor examination results include examination results indicating an intracranial tumor and examination results indicating no intracranial tumor. The first training data set is used for training the coding module in the preset initial model, and the second training data set is used for training the decoding module in the preset initial model. In essence, the first training data set and the second training data set may each include eye images of multiple data modalities; the synthetic eye images in this embodiment are also eye images in nature, and are referred to as synthetic eye images only to distinguish those generated by the diffusion model.
In one implementation of this embodiment, S110 may include the following steps:
s1101, acquiring eye images of multiple data modes as a first training data set.
The acquired eye images of multiple data modalities may include real eye images of user samples with intracranial tumors and real eye images of user samples without intracranial tumors, so that the coding module trained on the first training data set learns not only the eye features of patients without intracranial tumors across the data modalities but also the eye features of patients with intracranial tumors; the diversity of the eye images makes the coding module trained on them more robust and generalizable.
In this embodiment, the multiple data modalities may specifically include an ophthalmoscopy photograph data modality, an optical coherence tomography (OCT) data modality, a magnetic resonance imaging (MRI) data modality, and an ophthalmic ultrasound biomicroscopy (UBM) data modality, among others.
The eye image in this embodiment may be a fundus image or an external eye image. In one embodiment, the first training data set may include both fundus images and external eye images, so that the coding module in the initial model can learn more ocular features during training, which enhances its generalization and robustness and lets it learn the ability to extract more, and more effective, ocular features.
The eyeball is delicately constructed and is similar to a conventional camera, with the fundus acting like the camera's film; it includes the optic disc (optic nerve), blood vessels, retinal tissue, and the choroid. The fundus is also the only part of the body where arteries, veins, and capillaries can be directly observed with the naked eye; these blood vessels reflect the dynamics and health of the body's circulation, and many systemic diseases leave signs on the fundus. For example, fundus hemorrhage is a serious complication of diabetes, and hypertension, coronary heart disease, nephropathy, and the like can also leave traces on the fundus. Fundus images can be used for diagnosing and grading eye diseases and for segmenting lesion points and important biomarkers, corresponding to deep-learning tasks such as classification, segmentation, detection, and synthesis. The fundus image in this embodiment may be a 2D fundus image captured by a monocular camera, an image obtained by examining the eye with an ophthalmoscope, or another type of image acquired by other means; the embodiment of the present invention is not limited in this respect.
Unlike the fundus image, which reflects the optic disc (optic nerve), blood vessels, retinal tissue, choroid, and so on, the external eye image may be an external photograph of the eye acquired by a camera, and mainly covers the eyeball and the orbit, eyelids, lacrimal apparatus, conjunctiva, cornea, sclera, anterior chamber, iris, pupil, lens, and the like.
In this embodiment, because the volume of training data is huge, all eye images in the training data set may be randomly cropped to a preset resolution; the embodiment of the present invention is not limited in this respect.
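As an illustrative sketch only (assuming PyTorch/torchvision, which the patent does not name), the random cropping to a preset resolution described above could look like the following; the 224×224 target size is a placeholder value.

from torchvision import transforms

# Randomly crop each eye image and resize it to the preset resolution.
preprocess = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.ToTensor(),
])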
S1102, acquiring real eye images of user samples with intracranial tumors and the corresponding examination results indicating an intracranial tumor.
S1103, training a diffusion model by using a real eye image of a user sample with intracranial tumor, so that the diffusion model generates a synthetic eye image.
The synthetic eye image is an eye image exhibiting intracranial tumor features, and each synthetic eye image is matched with an examination result indicating an intracranial tumor.
In a specific implementation, S1103 may include the following steps:
s1, determining a diffusion model to be trained to comprise a basic diffusion model and a super-resolution diffusion model.
The base diffusion model is used for generating a low-resolution image, and the super-resolution diffusion model is used for improving the resolution of the low-resolution image generated by the base diffusion model to obtain an image with target resolution.
In one example, an existing image-generation diffusion model could be selected as the diffusion model to be trained; however, because such a model needs text prompts to guide image generation, the diffusion model here is instead improved on that basis and is formed by cascade training of several basic models, which are trained and sampled unconditionally in advance, so that the final cascaded diffusion model can generate high-quality images. For example, one type of basic model in the diffusion model may be the basic diffusion model, whose backbone may be U-Net or Efficient U-Net; compared with U-Net, Efficient U-Net considerably improves computational efficiency and processing speed. Another type of basic model in the diffusion model may be the super-resolution diffusion model, which raises the resolution of the low-resolution images generated by the basic diffusion model and may likewise be built as U-Net or Efficient U-Net. This embodiment is not limited in this respect.
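For illustration, a conceptual sketch of such a cascade is given below, assuming PyTorch-style modules; the base and super-resolution wrappers with a sample method that runs the reverse (denoising) process are hypothetical helpers, not APIs defined by the patent.

import torch

class CascadedDiffusion(torch.nn.Module):
    def __init__(self, base_model, sr_model):
        super().__init__()
        self.base = base_model   # generates low-resolution images (e.g. 64x64)
        self.sr = sr_model       # raises them to the target resolution (e.g. 256x256)

    @torch.no_grad()
    def generate(self, num_images):
        low_res = self.base.sample(num_images)       # unconditional low-resolution generation
        return self.sr.sample(condition=low_res)     # super-resolve to the target size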
S2, setting the same training conditions, and separately training the basic diffusion model and the super-resolution diffusion model, with noise enhancement, using the real eye images of user samples with intracranial tumors as training samples.
The training conditions may include the number of training iterations (e.g., 100K epochs), the single-batch training data amount (e.g., a batch size of 64), the learning rate (e.g., a learning rate of 1e-4), and the optimizer. Because the model weights are randomly initialized at the start of training, choosing a large learning rate to train the basic diffusion model and the super-resolution diffusion model may make the models unstable (oscillate). Therefore, before formally training the basic diffusion model and the super-resolution diffusion model, each of them can be warmed up, for example with learning-rate warm-up: the learning rate is kept small during the first few training iteration periods (epochs) or training steps, so the model slowly stabilizes under the small warm-up learning rate, and the preset learning rate is used only after the model is relatively stable. This yields faster convergence and a better model.
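A minimal sketch of such learning-rate warm-up, assuming PyTorch, is shown below; the warm-up length is an illustrative value, and the 1e-4 target learning rate follows the example above.

import torch

def make_warmup_scheduler(optimizer, warmup_steps=1000):
    # Scale the learning rate linearly from near zero up to the preset value,
    # then keep it constant once warm-up is finished.
    def lr_lambda(step):
        return min(1.0, (step + 1) / warmup_steps)
    return torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

# optimizer = torch.optim.Adam(diffusion_model.parameters(), lr=1e-4)
# scheduler = make_warmup_scheduler(optimizer)
# call scheduler.step() once per training step, after optimizer.step()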
In a specific implementation, the essence of a diffusion model is to repeatedly predict, starting from pure noise, the noise that was added, and a loss value is computed between the added noise and the predicted noise to improve the model's ability to synthesize images. This embodiment can therefore train the basic diffusion model and the super-resolution diffusion model under the same training conditions, and the fidelity of the images generated by the diffusion model is ensured by continually adding and enhancing noise during training. For example, the diffusion model may include a basic diffusion model that generates images at a resolution of 64×64 and a super-resolution diffusion model that upsamples the 64×64 images generated by the basic diffusion model to a resolution of 256×256 using noise enhancement. Such cascaded diffusion models can effectively generate high-fidelity images step by step through noise enhancement. Noise enhancement improves the robustness of the super-resolution diffusion model to image artifacts produced by the low-resolution basic diffusion model: by conditioning on the noise level, i.e., making the model aware of how much noise was added, the quality of the samples used to train the diffusion model is significantly improved.
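A simplified sketch of one such training step for the super-resolution diffusion model, with noise conditioning augmentation, is given below, assuming a DDPM-style formulation in PyTorch; sr_unet is a hypothetical network that takes the noisy high-resolution image, the noise-augmented low-resolution condition, the timestep, and the augmentation level, and is not an API defined by the patent.

import torch
import torch.nn.functional as F

def sr_training_step(sr_unet, hi_res, lo_res, alphas_cumprod, optimizer):
    b = hi_res.size(0)
    t = torch.randint(0, alphas_cumprod.size(0), (b,), device=hi_res.device)
    noise = torch.randn_like(hi_res)
    a = alphas_cumprod[t].view(b, 1, 1, 1)
    noisy_hi = a.sqrt() * hi_res + (1 - a).sqrt() * noise   # forward diffusion (add noise)

    # Noise conditioning augmentation: corrupt the low-resolution conditioning image
    # with a random amount of noise and pass that amount to the network, which makes
    # the super-resolution model robust to artifacts produced by the basic model.
    aug_level = torch.rand(b, device=hi_res.device)
    lo_aug = lo_res + aug_level.view(b, 1, 1, 1) * torch.randn_like(lo_res)

    pred_noise = sr_unet(noisy_hi, lo_aug, t, aug_level)
    loss = F.mse_loss(pred_noise, noise)                    # loss between added and predicted noise
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()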
And S3, when the basic diffusion model and the super-resolution diffusion model meet the preset training termination conditions, confirming that training of the diffusion model formed by cascading the basic diffusion model and the super-resolution diffusion model is completed, so that the diffusion model can generate synthetic eye images.
S1104, aggregating the synthetic eye images, the real eye images of user samples with intracranial tumors, and the acquired real eye images of user samples without intracranial tumors as the eye training data set.
S1105, aggregating the examination results indicating an intracranial tumor that correspond to the synthetic eye images, the examination results indicating an intracranial tumor that correspond to the real eye images of user samples with intracranial tumors, and the examination results indicating no intracranial tumor that correspond to the acquired real eye images of user samples without intracranial tumors, as the intracranial tumor examination results matched with the eye training data set.
And S1106, taking the eye training data set and the intracranial tumor examination result correspondingly matched with the eye training data set as a second training data set.
In one embodiment, before the first training data set is used to pre-train the coding module in the preset initial model, data preprocessing may also be performed on the first training data set. Specifically, the preprocessing may include the following steps:
Performing data enhancement processing on the first training data set to obtain an enhanced first training data set, wherein the enhanced first training data set comprises a first type enhanced eye image and a second type enhanced eye image;
Pairing the first enhanced eye images with the first original eye images to obtain a first paired image pair; the first type of original eye image is an eye image corresponding to the first type of enhanced eye image when the first training data set is not subjected to data enhancement processing;
pairing the second enhanced eye images with the second original eye images to obtain a second paired image pair; the second type of original eye image is an eye image corresponding to the second type of enhanced eye image when the data enhancement processing is not performed in the first training data set;
the first paired image pair and the second paired image pair are summarized as a first training data set for pre-training the coding modules in a preset initial model.
The data enhancement processing in this embodiment may include, but is not limited to, color dithering (color jittering), random Gaussian blur, random graying (grayscale conversion), random exposure (solarization), random quantization operations, and other data enhancement techniques.
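As an illustrative sketch (assuming torchvision; the specific parameters are placeholders, not values given by the patent), the enhancement and pairing steps above might be implemented as follows, with one pipeline standing in for the first type of enhancement and another for the second type.

from torchvision import transforms

augment_type1 = transforms.Compose([
    transforms.ColorJitter(0.4, 0.4, 0.4, 0.1),                 # color dithering
    transforms.GaussianBlur(kernel_size=23, sigma=(0.1, 2.0)),  # random Gaussian blur
])
augment_type2 = transforms.Compose([
    transforms.RandomGrayscale(p=0.2),                          # random graying
    transforms.RandomSolarize(threshold=128, p=0.2),            # random exposure (solarization)
])

def build_paired_dataset(eye_images):
    # Pair each enhanced image with its original counterpart, as described above.
    first_pairs = [(augment_type1(img), img) for img in eye_images]
    second_pairs = [(augment_type2(img), img) for img in eye_images]
    return first_pairs + second_pairs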
S120, pre-training the coding module in the preset initial model by using the first training data set until a preset first convergence condition is met, and determining that the pre-training of the coding module is completed.
In this embodiment, the encoding module is configured to perform feature encoding on the eye image in the first training data set. The preset first convergence condition may be a threshold value of the training iteration number, or may be that the calculated loss value is smaller than a preset loss threshold value, which is not limited in this embodiment.
In one implementation of this embodiment, S120 may include the following steps:
S1201, determining that the coding module in the preset initial model comprises a teacher network and a student network with the same initial network parameters and network structures.
S1202, training the teacher network and the student network in parallel using the first training data set, and determining that the pre-training of the coding module is completed once a preset first convergence condition is met; wherein the teacher network provides supervisory signals for training the student network during pre-training.
In one embodiment, the first training data set for training the teacher network and the student network in parallel may be a data-preprocessed first training data set, which includes: a first paired image pair formed by pairing the first enhanced eye image and the first original eye image, and a second paired image pair formed by pairing the second enhanced eye image and the second original eye image. The first type original eye image is an eye image corresponding to the first type enhanced eye image when the first training data set is not subjected to data enhancement processing; the second type of original eye image is an eye image corresponding to the second type of enhanced eye image when the data enhancement processing is not performed in the first training data set; the first type of enhanced eye images and the second type of enhanced eye images are eye images in the first training data set after data enhancement processing.
In a specific implementation of this embodiment, S1202 may include the following steps:
S12021, training a teacher network by taking a first-type original eye image in a first paired image pair and a second-type original eye image in a second paired image pair as training samples;
S12022, training the student network by taking the first-type enhanced eye images in the first paired image pairs and the second-type enhanced eye images in the second paired image pairs, after random masking, as training samples, and taking the output of the teacher network as the expected output value of the student network;
In the parallel training of the teacher network and the student network, within each round of parallel training the teacher network passes to the student network the first output results corresponding to the first-type and second-type original eye images in each batch of received input data. The first output result serves as the expected output value for the second output result that the student network produces for the first-type and second-type enhanced eye images in the same batch of input data, and the network parameters of the student network are updated by back-propagating the loss value between the expected output value and the second output result. Each batch of input data is part of the first training data set and includes first paired image pairs and second paired image pairs. At the end of each training iteration period, the student network transfers its updated network parameters to the teacher network by exponential moving average, so as to update the network parameters of the teacher network;
S12023, when both the teacher network and the student network meet the preset first convergence condition, taking the teacher network that, at the end of the last training iteration period, received by exponential moving average the network parameters updated by the student network in that period, as the coding module whose pre-training is completed.
The preset first convergence condition may be a threshold on the number of training iterations, or the calculated loss value being smaller than a preset loss threshold, which is not limited in this embodiment. In addition, the exponential moving average (EMA) is a weighted moving average that gives more weight to recent data; migrating the network parameters of the student network to the teacher network by exponential moving average makes the teacher model, updated with the student network's parameters from the last training iteration period, more robust as the final pre-trained coding module.
In one example, cross-entropy loss may be employed as the loss function with a learning rate of 1e-3 for the pre-training process described above, and AdamW may be employed as the optimizer.
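A hedged sketch of one pre-training step of this teacher-student scheme, assuming PyTorch, is given below. Here student and teacher are two copies of the encoding module with identical structure and initial parameters; random_mask is a hypothetical helper that randomly masks parts of the enhanced images. The cross-entropy loss, the 1e-3 learning rate, and the AdamW optimizer follow the example above, and the EMA decay value is an illustrative placeholder.

import torch

def pretrain_step(student, teacher, original, enhanced, optimizer):
    with torch.no_grad():
        target = teacher(original).softmax(dim=-1)    # teacher output used as the supervisory signal

    masked = random_mask(enhanced)                    # hypothetical random-masking helper
    pred = student(masked).log_softmax(dim=-1)
    loss = -(target * pred).sum(dim=-1).mean()        # cross-entropy against the teacher output

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                                  # only the student is updated directly
    return loss.item()

def ema_update(student, teacher, decay=0.996):
    # At the end of each training iteration period, transfer the student's updated
    # parameters to the teacher by exponential moving average.
    with torch.no_grad():
        for p_s, p_t in zip(student.parameters(), teacher.parameters()):
            p_t.mul_(decay).add_(p_s, alpha=1 - decay)

# teacher = copy.deepcopy(student)   # teacher starts with the same parameters and structure
# optimizer = torch.optim.AdamW(student.parameters(), lr=1e-3)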
And S130, while keeping the network parameters of the pre-trained coding module unchanged, training the decoding module in the initial model by taking the eye training data set in the second training data set as samples and the intracranial tumor examination results matched with the eye training data set as expected true values.
The decoding module is cascaded on the pre-trained coding module during training and is used for decoding the eye coding features obtained after the coding module encodes the eye training data set, so as to obtain an intracranial tumor prediction result. In one example, a single linear layer may be cascaded on the encoding module as the decoding module for predicting whether the subject has an intracranial tumor. When fine-tuning the decoding module, binary cross entropy can be used as the loss function with a learning rate of 1e-3, and SGD can be used as the optimizer.
In a specific implementation, S130 may include the following steps:
S1301, inputting an eye training data set in the second training data set into an initial model for forward propagation to obtain an intracranial tumor prediction result; the initial model comprises an encoding module after pre-training and a decoding module cascaded on the encoding module;
s1302, calculating a loss value between an intracranial tumor prediction result and an intracranial tumor examination result which is correspondingly matched with the eye training data set;
s1303, inputting the loss value into a decoding module of the initial model for back propagation, and updating network parameters in the decoding module by using the gradient change value of the loss value in the back propagation process until a preset second convergence condition is met, so that the decoding module is determined to be trained.
The preset second convergence condition may be a threshold value of training iteration times, or may be that the calculated loss value is smaller than a preset loss threshold value, which is not limited in this embodiment.
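A minimal sketch of this fine-tuning stage, assuming PyTorch, is shown below. The encoder is the pre-trained coding module with its parameters frozen; the decoder is a single linear layer trained with binary cross entropy and SGD at a learning rate of 1e-3, as in the example above. feature_dim and loader are illustrative placeholders.

import torch
import torch.nn as nn

def finetune_decoder(encoder, feature_dim, loader, epochs=10):
    for p in encoder.parameters():
        p.requires_grad = False                      # freeze the pre-trained coding module
    encoder.eval()

    decoder = nn.Linear(feature_dim, 1)              # single linear layer as the decoding module
    criterion = nn.BCEWithLogitsLoss()               # binary cross entropy
    optimizer = torch.optim.SGD(decoder.parameters(), lr=1e-3)

    for _ in range(epochs):
        for eye_images, labels in loader:            # labels: 1 = has intracranial tumor, 0 = does not
            with torch.no_grad():
                features = encoder(eye_images)       # forward propagation through the frozen encoder
            logits = decoder(features).squeeze(-1)
            loss = criterion(logits, labels.float()) # loss between prediction and examination result
            optimizer.zero_grad()
            loss.backward()                          # gradients only reach the decoding module
            optimizer.step()
    return decoder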
And S140, when the training of the decoding modules in the initial model is completed, determining the initial model formed by cascading the trained decoding modules and the pre-trained encoding modules as an image model for predicting intracranial tumors.
It should be noted that the preset initial model uses an encoding-decoding architecture comprising an encoding module and a decoding module. The process of training the initial model into the image model can be divided into two stages: first the coding module is pre-trained in an unsupervised manner, and then the decoding module is fine-tuned for the intracranial tumor prediction task while the network parameters of the coding module are frozen (i.e., parameter updates are not back-propagated into the coding module).
This embodiment acquires a first training data set and a second training data set; the first training data set comprises a plurality of eye images, the second training data set comprises an eye training data set and intracranial tumor examination results matched with the eye training data set, the eye training data set comprises a plurality of synthetic eye images, the synthetic eye images are generated by a diffusion model trained on real eye images of sample users with intracranial tumors, and the intracranial tumor examination results comprise examination results indicating an intracranial tumor and examination results indicating no intracranial tumor. The coding module in the preset initial model is pre-trained with the first training data set until a preset first convergence condition is met, at which point the pre-training of the coding module is determined to be complete; the coding module is used for feature-encoding the eye images. While the network parameters of the pre-trained coding module are kept unchanged, the decoding module in the initial model is trained by taking the eye training data set in the second training data set as samples and the intracranial tumor examination results matched with the eye training data set as expected true values; the decoding module is cascaded on the pre-trained coding module during training and is used for decoding the eye coding features obtained after the coding module encodes the eye training data set, so as to obtain an intracranial tumor prediction result. When the training of the decoding module in the initial model is completed, the initial model formed by cascading the trained decoding module and the pre-trained encoding module is determined to be the image model for predicting intracranial tumors.
During training, the image model learns the association between eye features in eye images and intracranial tumors, so that when an ophthalmic medical image of a user to be predicted is input into the image model, the corresponding intracranial tumor prediction result for that user is output. The image model is suitable for application scenarios that require large-scale intracranial tumor screening and classification, and is especially suitable for scenarios in which a preliminary diagnosis of intracranial tumors must be made for patients in regions with underdeveloped medical care: the image model can output an intracranial tumor prediction result from only the input eye image (for example, an eye image photographed for the patient in real time), and the prediction result output by the model can serve as a preliminary diagnosis to assist professional medical staff in further clinical diagnosis of the patient, thereby reducing medical costs and improving medical efficiency. Meanwhile, because eye images of real intracranial tumor patients are difficult to acquire, there are few labeled data samples available for training the model; the embodiment of the invention therefore also uses synthetic eye images generated by the diffusion model to characterize intracranial tumors, which expands the training data set and makes the trained model more accurate. Moreover, the embodiment of the invention addresses the problems that existing intracranial tumor/craniocerebral tumor screening and classification rely on craniocerebral CT and MRI, have high detection costs, and can only be performed at large medical institutions or physical examination centers, and further addresses the problem that the eye images (CT and MRI images) currently used for deep-learning-based prediction of intracranial tumors are time-consuming and costly to acquire.
Example two
Fig. 2 is a flowchart of a method for predicting an intracranial tumor according to a second embodiment of the present invention. The method is applicable to cases where an eye image is analyzed with a deep learning model in the medical field or the image recognition field: by inputting an ophthalmic medical image of a user to be predicted into the model, an intracranial tumor prediction is made for that user. The method may be performed by a medical device for predicting intracranial tumors, which may be implemented in software and/or hardware and configured in a computer device, such as a server, a workstation, or a personal computer. The method specifically includes the following steps:
s210, acquiring an ophthalmic medical image of a user to be predicted.
In this embodiment, the ophthalmic medical image of the user to be predicted may include a fundus image photograph and/or an external eye image photograph of the user, the ophthalmic medical image being substantially an eye image.
S220, determining an image model for predicting intracranial tumors.
In this embodiment, the training method for the image model for predicting intracranial tumors may include:
Acquiring a first training data set and a second training data set; the first training data set comprises a plurality of eye images, the second training data set comprises an eye training data set and an intracranial tumor examination result correspondingly matched with the eye training data set, the eye training data set comprises a plurality of synthesized eye images, the synthesized eye images are generated by a diffusion model obtained by training based on the real eye images of a sample user with the intracranial tumor, and the intracranial tumor examination result comprises an examination result with the intracranial tumor and an examination result without the intracranial tumor;
Pre-training the coding module in the preset initial model by using the first training data set until a preset first convergence condition is met, and determining that the pre-training of the coding module is completed; the coding module is used for carrying out feature coding on the eye image;
Under the condition of ensuring that network parameters in the coding module after the pre-training is unchanged, taking an eye training data set in the second training data set as a sample, taking an intracranial tumor checking result correspondingly matched with the eye training data set as an expected true value, and training a decoding module in the initial model; the decoding module is cascaded on the coding module after the pre-training is completed in the training process and is used for decoding the eye coding features obtained after the coding module codes the eye training data set to obtain an intracranial tumor prediction result;
When the training of the decoding modules in the initial model is completed, the initial model formed by cascading the trained decoding modules and the pre-trained encoding modules is determined to be an image model for predicting intracranial tumors.
S230, inputting the ophthalmic medical image into the image model, and outputting an intracranial tumor prediction result for the user to be predicted.
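For illustration only, assuming PyTorch and the encoder-decoder image model described in the first embodiment, the prediction step might look like the following; preprocess is a hypothetical transform matching the training-time preprocessing.

import torch

def predict_intracranial_tumor(encoder, decoder, ophthalmic_image, preprocess):
    x = preprocess(ophthalmic_image).unsqueeze(0)            # add a batch dimension
    with torch.no_grad():
        prob = torch.sigmoid(decoder(encoder(x))).item()
    # Return the binary prediction result together with the model's confidence.
    return {"has_intracranial_tumor": prob >= 0.5, "probability": prob}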
According to the embodiment of the invention, the ophthalmic medical image of the user to be predicted is acquired, the image model for predicting the intracranial tumor is determined, the ophthalmic medical image is input into the image model, and the intracranial tumor prediction result aiming at the user to be predicted is output. The image model acquires the association relation between the eye characteristics in the eye images and the intracranial tumor in the training process, and when the ophthalmic medical image of the user to be predicted is input into the image model, the intracranial tumor prediction result of the user to be predicted is correspondingly output; the image model can be suitable for application scenes in which large-scale intracranial tumor screening and classification are required to be carried out, and is very suitable for application scenes in which primary diagnosis of intracranial tumors is required to be carried out on patients in areas with underdeveloped medical treatment, the image model can output a prediction result of the intracranial tumors only according to input eye images (such as eye images shot on the patients in real time), and the prediction result of the intracranial tumors output by the model can be used as a primary diagnosis result for assisting professional medical staff in carrying out deeper clinical diagnosis on the patients, so that the medical cost is reduced, and the medical efficiency is improved; meanwhile, because the eye images of the real intracranial tumor patients are difficult to acquire, the data samples with intracranial tumor labels for training the image model are fewer, and the embodiment of the invention also adopts the synthetic eye images which are generated by the diffusion model and are used for representing intracranial tumor, so that the training data set for training the image model can be expanded, and the trained model is more accurate; moreover, the embodiment of the invention can solve the problems that the screening and classification of intracranial tumor/craniocerebral tumor in the prior art need to rely on craniocerebral CT and MRI, the detection cost is high and the detection can be carried out only by a large medical institution/physical examination center, and further can solve the problems that the acquisition mode is time-consuming and the cost is high in the prior art based on the eye images (CT and MRI images) used for predicting the intracranial tumor by deep learning.
It should be noted that, for simplicity of description, the method embodiments are shown as a series of acts, but it should be understood by those skilled in the art that the embodiments are not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred embodiments, and that the acts are not necessarily required by the embodiments of the invention.
Example III
Fig. 3 is a block diagram of a training device for predicting an image model of an intracranial tumor according to a third embodiment of the present invention, where the training device may specifically include the following modules:
A data acquisition module 310, configured to acquire a first training data set and a second training data set; the first training data set comprises a plurality of eye images, the second training data set comprises an eye training data set and intracranial tumor examination results matched with the eye training data set, the eye training data set comprises a plurality of synthetic eye images, the synthetic eye images are generated by a diffusion model trained on real eye images of sample users with intracranial tumors, and the intracranial tumor examination results comprise examination results indicating an intracranial tumor and examination results indicating no intracranial tumor;
The code training module 320 is configured to pre-train the code module in the preset initial model by using the first training data set, and determine that the pre-training of the code module is completed until a preset first convergence condition is met; the coding module is used for carrying out feature coding on the eye image;
The decoding training module 330 is configured to train the decoding module in the initial model, while keeping the network parameters of the pre-trained coding module unchanged, by taking the eye training data set in the second training data set as samples and the intracranial tumor examination results matched with the eye training data set as expected true values; the decoding module is cascaded on the pre-trained coding module during training and is used for decoding the eye coding features obtained after the coding module encodes the eye training data set, so as to obtain an intracranial tumor prediction result;
And the image model confirming module 340 is used for determining that the initial model formed by the cascade of the trained decoding module and the pre-trained encoding module is an image model for predicting intracranial tumor when the training of the decoding module in the initial model is completed.
In one embodiment of the present invention, the data acquisition module 310 includes:
The first training data set acquisition sub-module is used for acquiring eye images of various data modes and is used as a first training data set;
the real image acquisition sub-module is used for acquiring a real eye image of a user sample with intracranial tumor and a corresponding checking result with intracranial tumor;
a diffusion model training sub-module for training a diffusion model using the real eye image of the user sample with intracranial tumor, such that the diffusion model generates a synthetic eye image; the synthesized eye image is an eye image characterized by intracranial tumor characteristics, and the synthesized eye image is correspondingly matched with an inspection result of the intracranial tumor;
The image summarizing sub-module is used for summarizing the synthesized eye image, the real eye image of the user sample with the intracranial tumor and the obtained real eye image of the user sample without the intracranial tumor as an eye training data set;
The examination result summarizing sub-module is used for aggregating the examination results indicating an intracranial tumor that correspond to the synthetic eye images, the examination results indicating an intracranial tumor that correspond to the real eye images of user samples with intracranial tumors, and the examination results indicating no intracranial tumor that correspond to the acquired real eye images of user samples without intracranial tumors, as the intracranial tumor examination results matched with the eye training data set;
and the second training data set acquisition sub-module is used for taking the eye training data set and the intracranial tumor examination result correspondingly matched with the eye training data set as a second training data set.
In one embodiment of the invention, the diffusion model training submodule includes:
The diffusion model confirming unit is used for confirming that the diffusion model to be trained comprises a basic diffusion model and a super-resolution diffusion model; the basic diffusion model is used for generating a low-resolution image, and the super-resolution diffusion model is used for improving the resolution of the low-resolution image generated by the basic diffusion model to obtain an image with target resolution;
The diffusion model training unit is used for setting the same training conditions, and respectively training the basic diffusion model and the super-resolution diffusion model by taking the real eye image of the user sample with intracranial tumor as a training sample through a noise enhancement technology; the training conditions comprise training iteration times, single batch training data amount (batch size) and learning rate; before the basic diffusion model and the super-resolution diffusion model are trained, preheating training is respectively carried out on the basic diffusion model and the super-resolution diffusion model;
and the diffusion model training completion unit is used for confirming, when the basic diffusion model and the super-resolution diffusion model meet the preset training termination condition, that training of the diffusion model formed by cascading the basic diffusion model and the super-resolution diffusion model is completed, so that the diffusion model generates synthetic eye images.
In an embodiment of the present invention, before the pre-training the coding module in the preset initial model using the first training data set until the pre-training of the coding module is determined to be completed if the preset first convergence condition is met, the training apparatus further includes a data enhancement module, where the data enhancement module may include:
The data enhancement processing sub-module is used for carrying out data enhancement processing on the first training data set to obtain an enhanced first training data set, wherein the enhanced first training data set comprises a first type enhanced eye image and a second type enhanced eye image;
The first paired image pair acquisition sub-module is used for pairing the first type enhanced eye images with the first type original eye images to obtain a first paired image pair; the first type original eye image is an eye image corresponding to the first type enhanced eye image when the first training data set is not subjected to data enhancement processing;
the second paired image pair acquisition sub-module is used for pairing the second type of enhanced eye images with the second type of original eye images to obtain a second paired image pair; the second type of original eye images are eye images corresponding to the second type of enhanced eye images when the second type of enhanced eye images are not subjected to data enhancement processing in the first training data set;
And the image data summarization sub-module is used for summarizing the first pairing image pair and the second pairing image pair to be used as a first training data set for pre-training the coding module in the preset initial model.
In one embodiment of the present invention, the code training module 320 includes:
The coding module confirming sub-module is used for confirming that the coding module in the preset initial model comprises a teacher network and a student network, wherein the initial network parameters and the network structures of the teacher network and the student network are the same;
The coding module training sub-module is used for carrying out parallel training on the teacher network and the student network by utilizing the first training data set, and determining that the coding module pre-training is completed until a preset first convergence condition is met; wherein the teacher network provides supervisory signals for training of the student network during pre-training.
In one embodiment of the present invention, the coding module training submodule includes:
The teacher network training unit is used for training the teacher network by taking the first-type original eye images in the first paired image pair and the second-type original eye images in the second paired image pair as training samples;
The student network training unit is used for applying random masking to the first type enhanced eye images in the first paired image pair and the second type enhanced eye images in the second paired image pair, training the student network with the masked images as training samples, and taking the output result of the teacher network as the expected output value of the student network;
In the parallel training process of the teacher network and the student network, the teacher network transmits, to the student network in the current parallel training, the first output results corresponding to the first type original eye images and the second type original eye images in each batch of received input data; the first output result serves as the expected output value of the second output result that the student network outputs for the first type enhanced eye images and the second type enhanced eye images in the same batch of input data, and the network parameters of the student network are updated by back-propagating the loss value between the expected output value and the second output result; each batch of input data is part of the data in the first training data set and comprises first paired image pairs and second paired image pairs; at the end of each training iteration period, the student network migrates its updated network parameters into the teacher network by exponential moving average, so as to update the network parameters of the teacher network;
and the coding module training completion confirmation unit is used for, when the teacher network and the student network both meet the preset first convergence condition, determining the teacher network that has received, by exponential moving average, the network parameters updated by the student network in the last training iteration period as the coding module that has completed pre-training.
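A minimal sketch of one pre-training step of the teacher-student scheme described above, assuming PyTorch modules with identical architecture, a pixel-level random mask, and a mean-squared-error distillation loss; the mask ratio, the loss choice, and the EMA momentum are illustrative assumptions, and the EMA transfer is shown per step rather than per iteration period for brevity.

```python
import copy
import torch
import torch.nn.functional as F

def make_teacher(student):
    teacher = copy.deepcopy(student)      # same structure and initial parameters
    for p in teacher.parameters():
        p.requires_grad_(False)           # teacher is never updated by gradients
    return teacher

def random_mask(images, mask_ratio=0.6):
    """Assumed masking: zero out a random fraction of pixels."""
    mask = (torch.rand_like(images[:, :1]) > mask_ratio).float()
    return images * mask

def train_step(student, teacher, optimizer, batch, ema_momentum=0.996):
    originals, enhanced = batch           # paired original / enhanced eye images
    with torch.no_grad():
        target = teacher(originals)       # first output result (supervisory signal)
    pred = student(random_mask(enhanced)) # second output result from masked input
    loss = F.mse_loss(pred, target)       # assumed distillation loss
    optimizer.zero_grad()
    loss.backward()                       # back-propagate to update the student only
    optimizer.step()
    # Exponential moving average: migrate updated student parameters to the teacher.
    with torch.no_grad():
        for pt, ps in zip(teacher.parameters(), student.parameters()):
            pt.mul_(ema_momentum).add_(ps, alpha=1.0 - ema_momentum)
    return loss.item()
```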
In one embodiment of the present invention, decode training module 330 comprises:
The forward propagation submodule is used for inputting the eye training data set in the second training data set into the initial model for forward propagation to obtain an intracranial tumor prediction result; the initial model comprises an encoding module after pre-training is completed and a decoding module cascaded on the encoding module;
a loss value calculation sub-module for calculating a loss value between the intracranial tumor prediction result and the intracranial tumor examination result correspondingly matched with the eye training data set;
And the back propagation sub-module is used for inputting the loss value into the decoding module of the initial model for back propagation, and updating the network parameters in the decoding module by using the gradient change value of the loss value in the back propagation process until a preset second convergence condition is met, so that the decoding module is determined to be trained.
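For illustration, a sketch of the decoder training loop under the stated constraint that the pre-trained encoder's parameters stay fixed; the encoder/decoder interfaces, the binary cross-entropy loss, and the optimizer settings are assumptions rather than details given by this description.

```python
import torch
import torch.nn.functional as F

def train_decoder(encoder, decoder, loader, epochs=10, lr=1e-4, device="cpu"):
    encoder.eval()
    for p in encoder.parameters():
        p.requires_grad_(False)                    # keep pre-trained encoder unchanged
    optimizer = torch.optim.Adam(decoder.parameters(), lr=lr)

    for _ in range(epochs):
        for eye_images, exam_labels in loader:     # labels: 1 = intracranial tumor, 0 = none
            eye_images = eye_images.to(device)
            exam_labels = exam_labels.float().to(device)
            with torch.no_grad():
                features = encoder(eye_images)     # eye coding features (forward propagation)
            logits = decoder(features).squeeze(-1) # intracranial tumor prediction result
            loss = F.binary_cross_entropy_with_logits(logits, exam_labels)
            optimizer.zero_grad()
            loss.backward()                        # back propagation through the decoder only
            optimizer.step()
    return decoder
```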
The training device for predicting an intracranial tumor image model provided by the embodiment of the present invention can execute the training method for predicting an intracranial tumor image model provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to the executed method.
Example IV
Fig. 4 is a block diagram of a medical device for predicting intracranial tumors according to a fourth embodiment of the present invention, where the medical device may specifically include the following modules:
An image acquisition module 410 for acquiring an ophthalmic medical image of a user to be predicted;
a model determining module 420, configured to determine an image model according to the training method of the image model provided in any embodiment of the present invention;
The prediction module 430 is configured to input the ophthalmic medical image into the image model, and output an intracranial tumor prediction result for the user to be predicted.
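A hedged end-to-end usage sketch mirroring the three modules above (image acquisition, model determination, prediction); the preprocessing pipeline, the single-logit output head, and the decision threshold are assumptions.

```python
import torch
from PIL import Image
import torchvision.transforms as T

preprocess = T.Compose([T.Resize((224, 224)), T.ToTensor()])  # assumed preprocessing

def predict_intracranial_tumor(model, image_path, threshold=0.5):
    """Image acquisition -> image model -> prediction result, mirroring modules 410-430."""
    model.eval()
    eye_image = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        prob = torch.sigmoid(model(eye_image)).item()   # assumed single-logit output
    return {"probability": prob,
            "prediction": "intracranial tumor suspected" if prob >= threshold
                          else "no intracranial tumor detected"}
```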
The medical device for predicting an intracranial tumor provided by the embodiment of the present invention can execute the medical method for predicting an intracranial tumor provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to the executed method.
Example five
Fig. 5 is a schematic structural diagram of a computer device according to a fifth embodiment of the present invention. Fig. 5 illustrates a block diagram of an exemplary computer device 12 suitable for use in implementing embodiments of the present invention. The computer device 12 shown in fig. 5 is merely an example and should not be construed as limiting the functionality and scope of use of embodiments of the present invention.
As shown in FIG. 5, the computer device 12 is in the form of a general purpose computing device. Components of computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, a bus 18 that connects the various system components, including the system memory 28 and the processing units 16.
Bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor, or a local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include the Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
Computer device 12 typically includes a variety of computer system readable media. Such media can be any available media that is accessible by computer device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. The computer device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from or write to non-removable, nonvolatile magnetic media (not shown in FIG. 5, commonly referred to as a "hard disk drive"). Although not shown in fig. 5, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive for reading from or writing to a removable non-volatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In such cases, each drive may be coupled to bus 18 through one or more data medium interfaces. Memory 28 may include at least one program product having a set (e.g., at least one) of program modules configured to carry out the functions of embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored in, for example, memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment. Program modules 42 generally perform the functions and/or methods of the embodiments described herein.
The computer device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), one or more devices that enable a user to interact with the computer device 12, and/or any devices (e.g., network card, modem, etc.) that enable the computer device 12 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 22. Moreover, computer device 12 may also communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network, such as the Internet, through network adapter 20. As shown, network adapter 20 communicates with other modules of computer device 12 via bus 18. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with computer device 12, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
The processing unit 16 executes various functional applications and data processing by running programs stored in the system memory 28, for example, to implement the medical method for predicting intracranial tumors provided by the embodiments of the present invention.
Example six
The sixth embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements each process of the medical method for predicting an intracranial tumor described above and achieves the same technical effects; to avoid repetition, a detailed description is omitted here.
The computer readable storage medium may include, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
Note that the above is only a preferred embodiment of the present invention and the technical principle applied. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, while the invention has been described in connection with the above embodiments, the invention is not limited to the embodiments, but may be embodied in many other equivalent forms without departing from the spirit or scope of the invention, which is set forth in the following claims.

Claims (11)

1. A training method for predicting an image model of an intracranial tumor, comprising:
Acquiring a first training data set and a second training data set; the first training data set comprises a plurality of eye images, the second training data set comprises an eye training data set and an intracranial tumor examination result correspondingly matched with the eye training data set, the eye training data set comprises a plurality of synthetic eye images, the synthetic eye images are generated by a diffusion model obtained by training based on a real eye image of a sample user with the intracranial tumor, and the intracranial tumor examination result comprises an examination result with the intracranial tumor and an examination result without the intracranial tumor;
Pre-training the coding module in a preset initial model by using the first training data set until a preset first convergence condition is met, and determining that the pre-training of the coding module is completed; the coding module is used for carrying out feature coding on the eye image;
Under the condition of ensuring that the network parameters in the pre-trained coding module remain unchanged, taking the eye training data set in the second training data set as samples and the intracranial tumor examination results correspondingly matched with the eye training data set as expected true values, and training the decoding module in the initial model; the decoding module is cascaded on the pre-trained coding module during training and is used for decoding the eye coding features obtained after the coding module encodes the eye training data set, so as to obtain an intracranial tumor prediction result;
When the training of the decoding module in the initial model is completed, the initial model formed by cascading the trained decoding module and the pre-trained coding module is determined to be the image model for predicting intracranial tumors.
2. The method of claim 1, wherein the acquiring the first training data set and the second training data set comprises:
acquiring eye images of multiple data modalities as a first training data set;
Acquiring a real eye image of a user sample with intracranial tumor and the corresponding examination result with intracranial tumor;
Training a diffusion model using the real eye image of the user sample with intracranial tumor, such that the diffusion model generates a synthetic eye image; the synthetic eye image is an eye image characterized by intracranial tumor features, and the synthetic eye image is correspondingly matched with an examination result with intracranial tumor;
Summarizing the synthetic eye image, the real eye image of the user sample with intracranial tumor, and the acquired real eye image of the user sample without intracranial tumor as the eye training data set;
Summarizing the examination result with intracranial tumor corresponding to the synthetic eye image, the examination result with intracranial tumor corresponding to the real eye image of the user sample with intracranial tumor, and the examination result without intracranial tumor corresponding to the acquired real eye image of the user sample without intracranial tumor, as the intracranial tumor examination results correspondingly matched with the eye training data set;
And taking the eye training data set and intracranial tumor examination results correspondingly matched with the eye training data set as a second training data set.
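As a plain illustration of how the eye training data set and its matched examination results could be summarized into the second training data set, the sketch below labels diffusion-generated and real tumor-positive images with 1 and tumor-free images with 0; the directory layout, file format, and label encoding are assumptions.

```python
from pathlib import Path

def build_second_training_set(synthetic_dir, real_tumor_dir, real_healthy_dir):
    """Summarize synthetic and real eye images with matched examination results.
    Label 1 = examination result with intracranial tumor, 0 = without."""
    samples = []
    for p in Path(synthetic_dir).glob("*.png"):     # diffusion-generated, tumor-positive
        samples.append((str(p), 1))
    for p in Path(real_tumor_dir).glob("*.png"):    # real eye images of tumor patients
        samples.append((str(p), 1))
    for p in Path(real_healthy_dir).glob("*.png"):  # real eye images without tumor
        samples.append((str(p), 0))
    return samples  # eye training data set + matched intracranial tumor examination results
```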
3. The method of claim 2, wherein training a diffusion model using the real eye image of the user sample with intracranial tumor, such that the diffusion model generates a synthetic eye image, comprises:
Determining that a diffusion model to be trained comprises a basic diffusion model and a super-resolution diffusion model; the basic diffusion model is used for generating a low-resolution image, and the super-resolution diffusion model is used for improving the resolution of the low-resolution image generated by the basic diffusion model to obtain an image with target resolution;
Setting the same training conditions, and respectively training the basic diffusion model and the super-resolution diffusion model, using the real eye images of the user samples with intracranial tumor as training samples, by means of a noise augmentation technique; the training conditions comprise the number of training iterations, the single-batch training data amount (batch size), and the learning rate; before the basic diffusion model and the super-resolution diffusion model are trained, warm-up training is respectively carried out on the basic diffusion model and the super-resolution diffusion model;
When the basic diffusion model and the super-resolution diffusion model both meet the preset training termination condition, confirming that the diffusion model formed by cascading the basic diffusion model and the super-resolution diffusion model has completed training, so that the diffusion model generates synthetic eye images.
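A hedged configuration sketch of "the same training conditions" applied to both diffusion models, with warm-up training realized as a linear learning-rate ramp before the main schedule; the numeric values and the use of LambdaLR for warm-up are illustrative assumptions.

```python
import torch

TRAIN_CONDITIONS = {          # identical for both diffusion models (assumed values)
    "iterations": 100_000,
    "batch_size": 32,
    "learning_rate": 1e-4,
    "warmup_steps": 1_000,    # warm-up training before the main schedule
}

def make_optimizer_and_warmup(model, cond=TRAIN_CONDITIONS):
    optimizer = torch.optim.AdamW(model.parameters(), lr=cond["learning_rate"])
    # Linearly ramp the learning rate from 0 to its target over the warm-up steps.
    warmup = torch.optim.lr_scheduler.LambdaLR(
        optimizer, lambda step: min(1.0, (step + 1) / cond["warmup_steps"]))
    return optimizer, warmup
```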
4. The method according to claim 1, 2 or 3, wherein before pre-training the coding module in the preset initial model using the first training data set until the preset first convergence condition is met and the pre-training of the coding module is determined to be complete, the method further comprises:
Performing data enhancement processing on the first training data set to obtain an enhanced first training data set, wherein the enhanced first training data set comprises a first type enhanced eye image and a second type enhanced eye image;
pairing the first enhanced eye image with the first original eye image to obtain a first paired image pair; the first type original eye image is an eye image corresponding to the first type enhanced eye image when the first training data set is not subjected to data enhancement processing;
Pairing the second type enhanced eye images with the second type original eye images to obtain a second paired image pair; the second type of original eye images are eye images corresponding to the second type of enhanced eye images when the second type of enhanced eye images are not subjected to data enhancement processing in the first training data set;
Summarizing the first pairing image pair and the second pairing image pair to be used as a first training data set for pre-training a coding module in a preset initial model.
5. The method of claim 4, wherein the pre-training the coding module in the preset initial model using the first training data set until a preset first convergence condition is met determines that the pre-training of the coding module is complete, comprising:
determining that a coding module in a preset initial model comprises a teacher network and a student network, wherein initial network parameters and network structures of the teacher network and the student network are the same;
Performing parallel training on the teacher network and the student network by using the first training data set until a preset first convergence condition is met, and determining that the pre-training of the coding module is completed; wherein the teacher network provides supervisory signals for training of the student network during pre-training.
6. The method of claim 5, wherein the training the teacher network and the student network in parallel using the first training data set until a preset first convergence condition is met determines that the pre-training of the encoding module is complete, comprising:
Training the teacher network by taking the first-type original eye images in the first paired image pair and the second-type original eye images in the second paired image pair as training samples;
Applying random masking to the first type enhanced eye images in the first paired image pair and the second type enhanced eye images in the second paired image pair, training the student network with the masked images as training samples, and taking the output result of the teacher network as the expected output value of the student network;
In the parallel training process of the teacher network and the student network, the teacher network transmits, to the student network in the current parallel training, the first output results corresponding to the first type original eye images and the second type original eye images in each batch of received input data; the first output result serves as the expected output value of the second output result that the student network outputs for the first type enhanced eye images and the second type enhanced eye images in the same batch of input data, and the network parameters of the student network are updated by back-propagating the loss value between the expected output value and the second output result; each batch of input data is part of the data in the first training data set and comprises first paired image pairs and second paired image pairs; at the end of each training iteration period, the student network migrates its updated network parameters into the teacher network by exponential moving average, so as to update the network parameters of the teacher network;
And when the teacher network and the student network both meet the preset first convergence condition, determining the teacher network that has received, by exponential moving average, the network parameters updated by the student network at the end of the last training iteration period as the coding module that has completed pre-training.
7. The method according to claim 1, 2 or 3, wherein, under the condition of ensuring that the network parameters in the pre-trained coding module remain unchanged, training the decoding module in the initial model by taking the eye training data set in the second training data set as samples and the intracranial tumor examination results correspondingly matched with the eye training data set as expected true values, the decoding module being cascaded on the pre-trained coding module during training and being used for decoding the eye coding features obtained after the coding module encodes the eye training data set to obtain an intracranial tumor prediction result, comprises the following steps:
Inputting the eye training data set in the second training data set into the initial model for forward propagation to obtain an intracranial tumor prediction result; the initial model comprises an encoding module after pre-training is completed and a decoding module cascaded on the encoding module;
Calculating a loss value between the intracranial tumor prediction result and an intracranial tumor examination result correspondingly matched with the eye training data set;
And inputting the loss value into a decoding module of the initial model for back propagation, and updating network parameters in the decoding module by using the gradient change value of the loss value in the back propagation process until a preset second convergence condition is met, so that the decoding module is determined to be trained.
8. A method for predicting an intracranial tumor, comprising:
acquiring an ophthalmic medical image of a user to be predicted;
determining an image model according to the training method of any one of claims 1-7;
And inputting the ophthalmic medical image into the image model, and outputting an intracranial tumor prediction result aiming at the user to be predicted.
9. A training device for predicting an image model of an intracranial tumor, comprising:
The data acquisition module is used for acquiring a first training data set and a second training data set; the first training data set comprises a plurality of eye images, the second training data set comprises an eye training data set and an intracranial tumor examination result correspondingly matched with the eye training data set, the eye training data set comprises a plurality of synthetic eye images, the synthetic eye images are generated by a diffusion model obtained by training based on a real eye image of a sample user with the intracranial tumor, and the intracranial tumor examination result comprises an examination result with the intracranial tumor and an examination result without the intracranial tumor;
the coding training module is used for pre-training the coding module in the preset initial model by utilizing the first training data set until a preset first convergence condition is met, and determining that the pre-training of the coding module is completed; the coding module is used for carrying out feature coding on the eye image;
The decoding training module is used for training the decoding module in the initial model, under the condition of ensuring that the network parameters in the pre-trained coding module remain unchanged, by taking the eye training data set in the second training data set as samples and the intracranial tumor examination results correspondingly matched with the eye training data set as expected true values; the decoding module is cascaded on the pre-trained coding module during training and is used for decoding the eye coding features obtained after the coding module encodes the eye training data set, so as to obtain an intracranial tumor prediction result;
And the image model confirmation module is used for determining an initial model formed by cascading the trained decoding module and the pre-trained encoding module as an image model for predicting intracranial tumor when the training of the decoding module in the initial model is completed.
10. A medical device for predicting an intracranial tumor, comprising:
The image acquisition module is used for acquiring an ophthalmic medical image of a user to be predicted;
A model determination module for determining an image model according to the training method of any one of claims 1-7;
And the prediction module is used for inputting the ophthalmic medical image into the image model and outputting an intracranial tumor prediction result aiming at the user to be predicted.
11. A computer device, the computer device comprising:
One or more processors;
a memory for storing one or more programs,
The one or more programs, when executed by the one or more processors, cause the one or more processors to implement the training method for predicting an image model of an intracranial tumor as recited in any one of claims 1-7 or the method for predicting an intracranial tumor as recited in claim 8.
CN202410466483.8A 2024-01-24 2024-04-18 Training method for predicting intracranial tumor image model and related device thereof Pending CN118334469A (en)

Applications Claiming Priority (1)

Application Number: CN2024100969432

Publications (1)

Publication Number: CN118334469A
Publication Date: 2024-07-12


