WO2021114818A1

WO2021114818A1 - Method, system, and device for oct image quality evaluation based on fourier transform

Info

Publication number: WO2021114818A1
Application number: PCT/CN2020/117943
Authority: WO
Inventors: 王瑞; 王立龙; 吕传峰
Original assignee: 平安科技（深圳）有限公司
Priority date: 2020-06-30
Filing date: 2020-09-25
Publication date: 2021-06-17
Also published as: CN111784665B; CN111784665A

Abstract

A method, system, and device for OCT image quality assessment based on Fourier transform. The method comprises: Fourier transforming fundus OCT image samples in a fundus OCT image sample set having a known image tag to create a corresponding spectral image sample set (S110); creating a multimodal classification network model and training the multimodal classification network model via the fundus OCT image sample set and the spectral image sample set (S120); when the training of the multimodal classification network model is completed, inputting a fundus OCT image to be classified and a spectral image to be classified corresponding to said fundus OCT image into the multimodal classification network model, and assessing the quality of said fundus OCT image via the multimodal classification network model (S130). The solution also relates to blockchain technology; the fundus OCT image samples are stored in a blockchain. The technical solution not only automates the quality assessment of a fundus OCT image, but also significantly increases assessment precision.

Description

OCT image quality evaluation method, system and device based on Fourier transform

This application claims priority to Chinese patent application No. 202010618087.4, filed with the Chinese Patent Office on June 30, 2020 and entitled "Fourier Transform-based OCT Image Quality Evaluation Method, System and Device", the entire content of which is incorporated in this application by reference.

Technical field

This application relates to the field of image recognition technology, and in particular to an OCT image quality assessment method, system, device and storage medium based on Fourier transform.

Background technique

Optical coherence tomography (OCT) is an imaging technique that can be used to diagnose fundus diseases. Because it accurately reflects the condition of the patient's fundus and imaging is convenient and fast, it is widely used in artificial intelligence (AI) screening and auxiliary diagnosis. The inventor realized that current fundus OCT quality evaluation methods mainly rely on the Quality Index (QI) and the Signal Strength Index (SSI) to determine whether the quality of a fundus OCT image is acceptable. However, such methods reflect only the overall quality of an OCT image sequence and cannot determine whether the quality of a single fundus OCT image is acceptable, and they are difficult to apply in the field of artificial intelligence.

Moreover, during the inventor's research, it was discovered that traditional AI image quality evaluation methods usually input images into a neural network for image classification. This approach considers only the spatial domain information of the image and ignores its frequency domain information. Because OCT images are relatively simple and their image domain information is relatively limited, it is difficult for traditional AI image quality evaluation methods to obtain good results. For the quality evaluation of fundus OCT images, such traditional methods therefore cannot produce a good quality evaluation result.

Based on the above problems, an efficient and high-quality method for evaluating the quality of fundus OCT images is urgently needed.

Summary of the invention

The present application provides a Fourier transform-based OCT image quality evaluation method, system, electronic device, and computer storage medium, the main purpose of which is to solve the problem of low efficiency and poor quality of existing fundus OCT image quality evaluation methods.

To achieve the above objective, this application provides a Fourier transform-based OCT image quality evaluation method, which includes the following steps:

Performing Fourier transform on each fundus OCT image sample in a fundus OCT image sample set with known image labels to establish a corresponding spectrum image sample set;

Creating a multi-modal classification network model, and training the multi-modal classification network model through the fundus OCT image sample set and the spectrum image sample set;

After the multi-modal classification network model is trained, inputting the fundus OCT image to be classified and the spectrum image to be classified corresponding to the fundus OCT image to be classified into the multi-modal classification network model, and evaluating the quality of the fundus OCT image to be classified through the multi-modal classification network model.

In addition, the present application also provides an OCT image quality evaluation system based on Fourier transform, the system includes:

A sample set establishment unit, configured to perform Fourier transform on each fundus OCT image sample in a fundus OCT image sample set with known image labels to establish a corresponding spectrum image sample set;

A model training unit, configured to create a multi-modal classification network model, and train the multi-modal classification network model through the fundus OCT image sample set and the spectrum image sample set;

A model application unit, configured to input the fundus OCT image to be classified and the spectrum image to be classified corresponding to the fundus OCT image to be classified into the multi-modal classification network model after the training of the multi-modal classification network model is completed, and to perform quality evaluation on the fundus OCT image to be classified through the multi-modal classification network model.

In addition, to achieve the above objective, the present application also provides an electronic device, which includes a memory, a processor, and a Fourier transform-based OCT image quality evaluation program stored in the memory and executable on the processor. When the Fourier transform-based OCT image quality evaluation program is executed by the processor, the following steps are implemented:

Performing Fourier transform on each fundus OCT image sample in a fundus OCT image sample set with known image labels to establish a corresponding spectrum image sample set;

In addition, to achieve the above objective, the present application also provides a computer-readable storage medium storing a Fourier transform-based OCT image quality evaluation program. When the Fourier transform-based OCT image quality evaluation program is executed by a processor, the steps of the Fourier transform-based OCT image quality evaluation method described above are implemented.

This application uses the Fourier transform to obtain the spectrum image sample set of the fundus OCT image sample set, trains the multimodal classification network model on the fundus OCT image sample set and the spectrum image sample set, and uses image recognition technology in artificial intelligence to perform OCT image quality evaluation automatically. This not only improves the efficiency of OCT image quality evaluation, but also significantly improves the model's classification performance on the images, thereby improving OCT image quality evaluation.

Description of the drawings

FIG. 1 is a flowchart of a preferred embodiment of an OCT image quality evaluation method based on Fourier transform according to an embodiment of the present application;

FIG. 2 is a schematic structural diagram of a preferred embodiment of an electronic device according to an embodiment of the present application;

Fig. 3 is a schematic diagram of internal logic of an OCT image quality assessment program based on Fourier transform according to an embodiment of the present application.

The realization, functional characteristics, and advantages of the purpose of this application will be further described in conjunction with the embodiments and with reference to the accompanying drawings.

Detailed description

In the following description, for illustrative purposes, in order to provide a comprehensive understanding of one or more embodiments, many specific details are set forth. However, it is obvious that these embodiments can also be implemented without these specific details.

The technical solution of this application can be applied to the fields of artificial intelligence, blockchain and/or big data technology. Optionally, the data involved in this application, such as image samples, can be stored in a database, or can be stored in a blockchain.

The specific embodiments of the present application will be described in detail below with reference to the accompanying drawings.

Example 1

To illustrate the Fourier transform-based OCT image quality evaluation method provided by this application, FIG. 1 shows the flow of the Fourier transform-based OCT image quality evaluation method provided according to this application.

As shown in Fig. 1, the OCT image quality evaluation method based on Fourier transform provided by this application includes:

S110: Perform Fourier transform on each fundus OCT image sample in a fundus OCT image sample set with known image labels to establish a corresponding spectrum image sample set.

It should be noted that, to better implement the quality assessment of fundus OCT images, at least three category labels can be set for fundus OCT image data, such as good, usable, and poor. Good corresponds to an image of good quality in which the retina and choroid are clear and which does not affect the doctor's diagnosis of disease; usable corresponds to a barely usable image in which the retina and choroid are blurred or missing but the doctor's diagnosis is not affected; poor corresponds to an image of poor quality in which the retina and choroid are blurred or missing and the doctor's diagnosis is affected.

It should be noted that the label of a fundus OCT image sample with a known image label is generally assigned after evaluation by medical experts. Generally, a probability value is evaluated for each of the three labels, and the label with the largest probability value is taken as the type of the fundus OCT image sample with a known image label. Specifically, for example, multiple (generally no fewer than twenty) medical experts can be selected in advance to score the fundus OCT image. A score of 0 to 50 indicates an image of poor quality, in which the retina and choroid are blurred or missing and the doctor's diagnosis is affected (the lower the score, the worse the image quality); a score of 50 to 80 indicates an image in which the retina and choroid are blurred or missing but the doctor's diagnosis is not affected; a score of 80 to 100 indicates an image of good quality in which the retina and choroid are clear. Finally, the label corresponding to the segment in which the average score falls is taken as the known label of the sample, and the proportion of doctors scoring in each segment is the probability value of the corresponding segment (that is, of each label). It should be noted that the innovation of this application lies in the construction of the subsequent model, which uses artificial intelligence to simulate the process by which doctors assess images, thereby automating the quality assessment of fundus OCT images and eliminating the need for the doctor's manual assessment.
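The scoring rule described above can be sketched as follows. This is only an illustration: the function names are hypothetical, and the text does not say to which segment the shared boundary scores (50 and 80) belong, so the sketch assumes they fall into the higher segment.

```python
from collections import Counter

LABELS = ("good", "usable", "poor")

def segment(score):
    """Map a 0-100 score to a label (boundary handling is an assumption)."""
    if score < 50:
        return "poor"
    if score < 80:
        return "usable"
    return "good"

def label_from_scores(scores):
    """Return (known_label, per-label probability) for one image.

    known_label: segment in which the average expert score falls.
    probability: share of experts whose score falls in each segment.
    """
    counts = Counter(segment(s) for s in scores)
    probs = {name: counts.get(name, 0) / len(scores) for name in LABELS}
    mean_score = sum(scores) / len(scores)
    return segment(mean_score), probs

label, probs = label_from_scores([90, 85, 70, 40])
print(label, probs)
```

With the four example scores, the average is 71.25, so the known label is "usable", while half of the experts scored the image in the "good" segment.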

It should be further noted that, since different doctors have different discriminating abilities, doctors with higher professional titles can be selected where possible for the preliminary sample labeling process in order to improve the accuracy of the final model. In addition, the types of labels and the evaluation criteria can be adjusted according to the actual situation. This application mainly uses the samples to train the newly designed model to simulate doctors' ability to assess the quality of OCT images, and its focus is on the learning and training process of the subsequent model. The label setting process is not the focus of this application and will not be described further here.

The fundus OCT image sample set with known image labels involved in step S110 is a sample set composed of fundus OCT image samples whose labels are determined after evaluation according to a preset evaluation rule, and it is intended to better simulate the label ratio of fundus OCT images in real life. In this sample set, the total number of samples is generally no less than 10,000, and the ratio of good, poor, and usable OCT image samples is 3:4:3. This ratio was determined from calculations on actual data.

Specifically, the process of performing Fourier transform on the fundus OCT image sample set with known image tags includes the following steps:

Step 1: Perform grayscale processing on each fundus OCT image in turn to improve the data capture accuracy and conversion efficiency during the subsequent Fourier transform.

Specifically, the gray-scale processing method adopted in this solution is the component method. In image processing, the three RGB components (R: Red, G: Green, B: Blue) are generally used to represent true color, and the R, G, and B components each range from 0 to 255. For example, the three component values of a red pixel on a computer screen are 255, 0, 0. A pixel is the smallest image unit, and a picture is composed of many pixels. Because the color of a pixel is represented by the three RGB values, a pixel matrix corresponds to three color component matrices: the R matrix, the G matrix, and the B matrix. Taking an image of size 800*800 as an example, the three corresponding matrices are also 800*800 matrices, and the value at a given row and column of each matrix is the corresponding component value of the pixel at that position. For example, if the values in the first row and first column of the three matrices are R: 240, G: 223, B: 204, then the color of that pixel is (240, 223, 204).

The specific gray-scale processing is to make each pixel in the pixel matrix satisfy the relationship R = G = B (that is, the values of the red, green, and blue variables are equal); this common value is the gray value. It can be assigned as follows, where R, G, and B on the right-hand side are the values before processing: R after grayscale = G after grayscale = B after grayscale = R × 0.3 + G × 0.59 + B × 0.11.
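As a minimal sketch of this assignment, assuming the image is held as an H x W x 3 NumPy array of 8-bit RGB values (the function name and the rounding to the nearest integer are assumptions, not part of the original text):

```python
import numpy as np

def to_gray(rgb):
    """Weighted grayscale of an H x W x 3 uint8 RGB array.

    Uses the weights given in the text: 0.3 for R, 0.59 for G, 0.11 for B.
    Returns an H x W uint8 array holding the common gray value.
    """
    weights = np.array([0.3, 0.59, 0.11])
    gray = rgb.astype(np.float64) @ weights   # weighted sum over the color axis
    return np.rint(gray).astype(np.uint8)     # rounding is an assumption

# The example pixel (240, 223, 204) from the text.
pixel = np.array([[[240, 223, 204]]], dtype=np.uint8)
print(to_gray(pixel))
```

For the example pixel, the weighted sum is 72 + 131.57 + 22.44 = 226.01, so the gray value is 226.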

In addition, to improve the grayscale processing effect, the grayscale image can also be binarized. The specific process is: set a threshold, such as 127; calculate the average value avg of the gray values of all pixels in the pixel matrix; then compare the average value with the threshold. If the average value is greater than the threshold, the pixel is finally set to white; if the average value is less than the threshold, the pixel is finally set to black.
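The description of binarization leaves some room for interpretation. The sketch below assumes the most common reading, in which each pixel's gray value is compared with the threshold; the function name and this per-pixel reading are assumptions rather than something the text states outright.

```python
import numpy as np

def binarize(gray, threshold=127):
    """Set pixels above the threshold to white (255) and all others to black (0).

    Assumes per-pixel comparison; 127 is the example threshold from the text.
    """
    return np.where(gray > threshold, 255, 0).astype(np.uint8)

row = np.array([[0, 127, 128, 255]], dtype=np.uint8)
print(binarize(row))
```

If the average gray value is preferred as the cut-off, it can be passed in directly, for example `binarize(gray, threshold=gray.mean())`.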

Step 2: Perform fast Fourier transform on each fundus OCT image after gray-scale processing to generate corresponding frequency domain samples.

Step 3: Establish the spectrum image sample set according to the frequency domain samples.

It should be noted that performing a Fourier transform on the fundus OCT image sample set (to improve efficiency, the fast Fourier transform is generally used) is a common technical means in the field of image processing. The innovation of this application lies in the use of the frequency domain information of OCT images; therefore, the specific process of the Fourier transform will not be described further here.
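Although the text does not spell out the transform, Steps 2 and 3 could look roughly like the following sketch, which uses NumPy's 2-D fast Fourier transform. Centering the zero frequency with `fftshift`, taking the log of the magnitude, and scaling to 8 bits are common display conventions assumed here, not details given in the original; the function name is hypothetical.

```python
import numpy as np

def spectrum_image(gray):
    """Return an 8-bit log-magnitude spectrum of a 2-D grayscale image."""
    freq = np.fft.fft2(gray.astype(np.float64))   # 2-D fast Fourier transform
    freq = np.fft.fftshift(freq)                  # move zero frequency to the center
    mag = np.log1p(np.abs(freq))                  # compress the dynamic range
    peak = mag.max()
    if peak == 0:                                 # all-black input
        return np.zeros(gray.shape, dtype=np.uint8)
    return np.rint(255 * mag / peak).astype(np.uint8)

def build_spectrum_set(gray_images):
    """Apply spectrum_image to every grayscale sample (Step 3)."""
    return [spectrum_image(img) for img in gray_images]
```

Each spectrum image has the same height and width as its source image, so the two sample sets stay aligned index by index.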

In addition, it should be emphasized that, to further ensure the privacy and security of the above data, the fundus OCT image samples can be stored in the nodes of a blockchain.

S120: Create a multi-modal classification network model, and train the multi-modal classification network model through the fundus OCT image sample set and the spectrum image sample set.

Specifically, the multi-modal classification network model designed in this application includes at least three branches: the Deep stream branch, the shallow stream branch, and the Simple Modal Image stream branch (the first two are referred to as the trunk branches). These three branches are arranged in parallel. The model is a commonly used type of classification model.

The Deep stream branch has many convolutional layers. It is trained on the fundus OCT image sample set to extract the deep features of medical images (such as fundus OCT images), and then performs a first classification of the fundus OCT image samples according to the deep features to obtain a corresponding first classification result. The shallow stream branch has fewer convolutional layers. It is trained on the fundus OCT image sample set to extract the shallow features of the medical images, and then performs a second classification of the fundus OCT image samples according to the shallow features to obtain a corresponding second classification result.

In addition, in order to improve the recognition accuracy of the model, an attention module can be added to the convolutional layer of the shallow stream branch to focus on extracting the shallow features of the image. Through the cooperation of the deep stream branch and the shallow stream branch, the effect of enriching the dimension of image features can be achieved, and the accuracy of feature acquisition can be improved.

The Simple Modal Image stream branch is trained on the spectrum image sample set to extract the frequency domain shallow features of each spectrum image sample, and then performs a third classification of the spectrum image samples according to the frequency domain shallow features to obtain a corresponding third classification result. It should be noted that the Simple Modal Image stream branch only needs to extract the shallow features of the image in the frequency domain in order to classify it.

It should be noted that the Deep stream branch is a deep feature extraction network with a classic convolutional neural network as its backbone, similar to existing networks such as ResNet and DenseNet. It takes the image as input, outputs a multi-dimensional image feature map, and produces a deep branch prediction probability. The shallow stream branch is a shallow feature extraction network composed mainly of two modules, a down-sampling module and an attention module. The down-sampling module consists of a convolutional layer, an activation layer, and a normalization layer, and performs the down-sampling operation while extracting image features. The attention module consists of a spatial attention module and a channel attention module, which focus on the spatial and channel features of the image respectively. After the data samples pass through this branch, a shallow feature map of the image is output and a shallow branch prediction probability is obtained. The Simple Modal Image stream branch has the same structure as the shallow stream branch and differs only in its input: it takes the frequency domain sample data obtained by Fourier transform.

Specifically, taking the Deep stream branch as an example (the other branches work in the same way), after the fundus OCT image passes through the Deep stream branch, the network outputs n probability values, where n is the preset number of categories. For example, with three categories (corresponding to good, poor, and usable above), the Deep stream branch first extracts the deep features of the fundus OCT image and then performs a first classification of the image according to these features, outputting three probability values (one for each of good, poor, and usable) that sum to 1. The category with the highest probability value is generally taken as the first classification result output by the branch.
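The step from a branch's raw outputs to a classification result can be illustrated as follows. The text only says that the probabilities sum to 1; using a softmax to obtain them is an assumption, as are the function names and the order of the labels.

```python
import math

LABELS = ("good", "poor", "usable")  # label order is an assumption

def softmax(logits):
    """Turn raw branch outputs into probabilities that sum to 1."""
    m = max(logits)                               # subtract max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits):
    """Return (label with the highest probability, all probabilities)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs

label, probs = classify([2.0, 1.0, 0.1])
print(label, [round(p, 3) for p in probs])
```

The same routine applies unchanged to the shallow stream branch and the Simple Modal Image stream branch, since each outputs n probability values in the same way.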

In addition, to further improve the recognition accuracy of the multi-modal classification network model, the three branches are also cascaded and fused. Specifically, the deep features, the shallow features, and the frequency domain shallow features are first fused by cascading to obtain cascaded features; the multi-modal classification network model then performs cascade classification on the fundus OCT image samples according to the cascaded features to obtain corresponding cascade classification results.

It should be noted that cascade fusion is a concatenate operation: the feature maps (the features extracted by each branch) are concatenated along the channel dimension. After concatenation, a new set of cascaded features of the fundus OCT image is formed, and the corresponding cascade classification results are output according to the cascaded features.
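A minimal sketch of channel-wise concatenation, assuming each branch produces a feature map in channels x height x width layout with matching spatial size. The channel counts (64, 32, 32) and the 7 x 7 size are made up for illustration and do not come from the original.

```python
import numpy as np

def cascade_fuse(deep, shallow, freq):
    """Concatenate C x H x W feature maps from the three branches along the channel axis."""
    return np.concatenate([deep, shallow, freq], axis=0)

deep = np.zeros((64, 7, 7))        # hypothetical Deep stream feature map
shallow = np.ones((32, 7, 7))      # hypothetical shallow stream feature map
freq = np.full((32, 7, 7), 2.0)    # hypothetical Simple Modal Image stream feature map

fused = cascade_fuse(deep, shallow, freq)
print(fused.shape)  # (128, 7, 7)
```

Concatenation keeps each branch's features intact and simply stacks them, so the cascade classifier sees all three feature sets side by side.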

After the multi-modal classification network model is trained, when a fundus OCT image to be classified is classified and recognized by the model, the above cascade classification result is used as the final classification result output by the multi-modal classification network model.

At the end of the training of the multi-modal classification network model, the corresponding loss function Loss is calculated according to the classification results of the Deep stream branch, the shallow stream branch, and the Simple Modal Image stream branch (and, if needed in practice, the cascade branch).

Specifically, a loss function is calculated for each of the first classification result, the second classification result, and the third classification result; the total loss function of the multimodal classification network model is then determined from the calculated loss functions. When the total loss function converges to its minimum, the training of the multi-modal classification network model is determined to be complete.

In the actual calculation process, the calculation formula of the loss function is:

Among them, p is the probability value of the label, q is the predicted probability value output by the classification result, x_i represents the i-th category, and n represents the number of categories;

The calculation formula of the total loss function is:

Total Loss = 0.3 × Loss_D + 0.3 × Loss_s + 0.4 × Loss_P, where

Loss_D is the loss function of the first classification result, Loss_s is the loss function of the second classification result, and Loss_P is the loss function of the third classification result.
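The per-branch loss formula is not reproduced in this text. Given the symbols it defines (label probability p, predicted probability q, categories x_i, category count n), one plausible reading is the standard cross-entropy, Loss = -Σ p(x_i) · log q(x_i) summed over the n categories. The sketch below works under that assumption; the function names and the small epsilon guard are illustrative only, while the 0.3 / 0.3 / 0.4 weights are those given above.

```python
import math

def cross_entropy(p, q, eps=1e-12):
    """Assumed per-branch loss: -sum_i p(x_i) * log q(x_i).

    eps guards against log(0) and is not part of the original description.
    """
    return -sum(pi * math.log(max(qi, eps)) for pi, qi in zip(p, q))

def total_loss(p, q_deep, q_shallow, q_freq):
    """Weighted sum of the three branch losses, using the weights from the text."""
    loss_d = cross_entropy(p, q_deep)       # first classification result
    loss_s = cross_entropy(p, q_shallow)    # second classification result
    loss_p = cross_entropy(p, q_freq)       # third classification result
    return 0.3 * loss_d + 0.3 * loss_s + 0.4 * loss_p

p = [1.0, 0.0, 0.0]                         # label probabilities for one sample
q = [0.5, 0.25, 0.25]                       # the same prediction used for every branch
print(total_loss(p, q, q, q))
```

Because the three weights sum to 1, identical branch predictions give a total loss equal to the single-branch loss, which makes the weighting easy to sanity-check.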

Among them, the coefficients before each loss function are set according to observed effect and medical experience. In the back propagation process, the weight of the first classification result is highlighted, and the weights of the other two classification results are considered at the same time. After the entire network converges to the minimum (that is, the loss function converges), the first classification result (or the cascade classification result, when cascade classification is used) is finally selected as the final result of the model output.

In this way, using the fundus OCT image sample set together with the spectrum image sample set, the multimodal classification network model can be trained. When all samples of the fundus OCT image sample set and of the spectrum image sample set have been used, the model training is considered complete by default.

S130: After the training of the multi-modal classification network model is completed, input the fundus OCT image to be classified and the corresponding spectrum image to be classified into the multi-modal classification network model; the multi-modal classification network model classifies the fundus OCT image to be classified, and the quality is evaluated according to the classification result. It should be noted that, since the classification categories are defined in terms of the quality of the fundus OCT image, the classification result can be used directly for quality evaluation once classification is complete. Because performing quality evaluation from a known classification result is a common method in the field, it will not be described further here.

Wherein, the fundus OCT image to be classified is an OCT image that has no label and needs to be automatically classified, and the spectrum image to be classified is a frequency domain image obtained by Fourier transform of the fundus OCT image to be classified.

It should be noted that the Fourier transform process of the OCT image of the fundus to be classified is the same as the Fourier transform process in step S110, so it will not be repeated here.

It should be noted that the fundus OCT image to be classified is input to the Deep stream branch and the shallow stream branch, and the spectrum image to be classified is input to the Simple Modal Image stream branch. After processing by the multi-modal classification network model, multiple corresponding classification results are obtained (including the first classification result, the second classification result, the third classification result, and the cascade classification result). Usually, the category with the highest probability in the cascade classification result is taken as the final classification recognition result.

In addition, the data processing performed by the multi-modal classification network model here is similar to that in step S120, except that the training process using the loss function is omitted; therefore, the specific process by which the multi-modal classification network model processes the fundus OCT image to be classified will not be described further here.

It should be noted that, in actual application, before recognizing the fundus OCT image to be classified, the multimodal classification network model can be tested using fundus OCT images with unknown category information. The specific test process is similar to step S130 and will not be repeated here. After the test result is obtained, its correctness is judged by means such as a doctor's evaluation. If the test result is the same as the doctor's evaluation result, the multi-modal classification network model is used to recognize the fundus OCT images to be classified. If the test result differs from the doctor's evaluation result, training samples are added and the multi-modal classification network model is trained further until the test result is the same as the doctor's evaluation result.

Finally, it should be noted that when no cascade classification result is set, the first classification result is used as the output of the model; when a cascade classification result is set, the cascade classification result is the final classification result of the model.

In addition, in another embodiment, only the deep stream branch and the shallow stream branch may be merged and connected, and the output result is used as the classification result of the deep stream branch, and the subsequent process is the same as step S120.

It can be seen from the above technical solution that the Fourier transform-based OCT image quality evaluation method provided in this application performs OCT image quality evaluation automatically by using image recognition technology in artificial intelligence, which can greatly save doctors' working time and improve their work efficiency. In addition, by introducing the fast Fourier transform, image features can be extracted from multiple dimensions, significantly improving the accuracy of image quality evaluation. In addition, by setting multiple branches in the multi-modal classification network model, different branches extract different feature information, and the final model is determined by calculating the total loss function, which can significantly improve the recognition accuracy of the model. In addition, by cascading the features, a cascade classification result is obtained and used as the final model output, which can further improve the classification performance of the model and thus the OCT image quality evaluation. Finally, in the field of intelligent fundus screening, since the quality assessment of fundus OCT images is the key to whether the fundus inspection is meaningful, the quality assessment method of fundus OCT images can significantly improve work efficiency during intelligent fundus screening.

It should be understood that the size of the sequence number of each step in the foregoing embodiment does not mean the order of execution, and the execution sequence of each process should be determined by its function and internal logic, and should not constitute any limitation to the implementation process of the embodiment of the present application.

Example 2

Corresponding to the above method, this application also provides an OCT image quality assessment system based on Fourier transform, which includes:

A model training unit, configured to create a multi-modal classification network model and to train the multi-modal classification network model through the fundus OCT image sample set and the spectrum image sample set;

Example 3

The application also provides an electronic device 70. Referring to FIG. 2, this figure is a schematic structural diagram of a preferred embodiment of the electronic device 70 provided by this application.

In this embodiment, the electronic device 70 may be a terminal device with a computing function, such as a server, a smart phone, a tablet computer, a portable computer, a desktop computer, and the like.

The electronic device 70 includes a processor 71 and a memory 72.

The memory 72 includes at least one type of readable storage medium. The at least one type of readable storage medium may be a non-volatile storage medium such as flash memory, hard disk, multimedia card, card-type memory, and the like. In some embodiments, the readable storage medium may be an internal storage unit of the electronic device 70, such as a hard disk of the electronic device 70. In other embodiments, the readable storage medium may also be an external memory of the electronic device 70, such as a plug-in hard disk equipped on the electronic device 70, a smart memory card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, a flash card (Flash Card), etc.

In this embodiment, the readable storage medium of the memory 72 is generally used to store the Fourier transform-based OCT image quality evaluation program 73 installed in the electronic device 70. The memory 72 can also be used to temporarily store data that has been output or will be output.

In some embodiments, the processor 71 may be a central processing unit (CPU), a microprocessor, or another data processing chip, used to run program code or process data stored in the memory 72, for example the Fourier transform-based OCT image quality evaluation program 73.

In some embodiments, the electronic device 70 is a terminal device such as a smart phone, a tablet computer, and a portable computer. In other embodiments, the electronic device 70 may be a server.

FIG. 2 only shows the electronic device 70 with the components 71-73, but it should be understood that it is not required to implement all the illustrated components, and more or fewer components may be implemented instead.

Optionally, the electronic device 70 may also include a user interface. The user interface may include an input unit such as a keyboard, a voice input device such as a microphone or another device with a voice recognition function, and a voice output device such as a speaker or earphones. Optionally, the user interface may also include a standard wired interface and a wireless interface.

Optionally, the electronic device 70 may also include a display, and the display may also be referred to as a display screen or a display unit. In some embodiments, it may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an organic light-emitting diode (OLED) touch device, and the like. The display is used for displaying information processed in the electronic device 70 and for displaying a visualized user interface.

Optionally, the electronic device 70 may also include a touch sensor. The area provided by the touch sensor for the user to perform touch operations is called the touch area. In addition, the touch sensor here may be a resistive touch sensor, a capacitive touch sensor, or the like. Moreover, the touch sensor includes not only a contact type touch sensor, but also a proximity type touch sensor and the like. In addition, the touch sensor may be a single sensor, or may be, for example, a plurality of sensors arranged in an array.

In addition, the area of the display of the electronic device 70 may be the same as or different from the area of the touch sensor. Optionally, the display and the touch sensor are stacked to form a touch display screen, and the device detects touch operations triggered by the user based on the touch display screen.

Optionally, the electronic device 70 may also include a radio frequency (RF) circuit, a sensor, an audio circuit, etc., which will not be repeated here.

In the device embodiment shown in FIG. 2, the memory 72, which is a computer storage medium, may include an operating system and the Fourier transform-based OCT image quality evaluation program 73; when the processor 71 executes the Fourier transform-based OCT image quality evaluation program 73 stored in the memory 72, the following steps are implemented:

In this embodiment, FIG. 3 is a schematic diagram of the internal logic of the Fourier transform-based OCT image quality evaluation program 73 according to a preferred embodiment of the present application, showing its program modules. As shown in FIG. 3, the Fourier transform-based OCT image quality evaluation program 73 can also be divided into one or more modules, which are stored in the memory 72 and executed by the processor 71 to implement this application. A module, as referred to in this application, is a series of computer program instruction segments that can complete a specific function. The Fourier transform-based OCT image quality evaluation program 73 can be divided into a sample set establishment module 74, a model training module 75, and a model application module 76. The functions or operation steps implemented by modules 74-76 are similar to those described above and are not detailed again here. For example:

The sample set establishment module 74 is configured to perform Fourier transform on each fundus OCT image sample in a fundus OCT image sample set with known image labels to establish a corresponding spectrum image sample set;

The model training module 75 is configured to create a multi-modal classification network model, and train the multi-modal classification network model through the fundus OCT image sample set and the spectrum image sample set;

The model application module 76 is configured to input the fundus OCT image to be classified and the spectrum image to be classified corresponding to the fundus OCT image to be classified into the multi-modal classification network model after the training of the multi-modal classification network model is completed, and to evaluate the quality of the fundus OCT image to be classified through the multi-modal classification network model.
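As an illustration of the work done by the sample set establishment module 74, the following is a minimal sketch of turning labelled images into spectrum images. It assumes NumPy, images supplied as arrays, BT.601 weights for the gray-scale step, and a centered log-magnitude spectrum scaled to 8 bits; the function names and these choices are hypothetical and are not specified by this application.

```python
import numpy as np


def to_grayscale(image):
    """Collapse an H x W x 3 RGB array to H x W; pass 2-D arrays through."""
    image = np.asarray(image, dtype=np.float64)
    if image.ndim == 3:
        # ITU-R BT.601 luma weights: an assumed choice of gray-scale conversion
        return image[..., :3] @ np.array([0.299, 0.587, 0.114])
    return image


def spectrum_image(image):
    """Return a centered log-magnitude spectrum scaled to [0, 255] as uint8."""
    gray = to_grayscale(image)
    freq = np.fft.fftshift(np.fft.fft2(gray))  # 2-D FFT, zero frequency moved to the center
    magnitude = np.log1p(np.abs(freq))         # compress the dynamic range
    peak = magnitude.max()
    if peak > 0:
        magnitude = magnitude / peak
    return (magnitude * 255).astype(np.uint8)


def build_spectrum_sample_set(samples):
    """Map (image, label) pairs to (spectrum image, label) pairs."""
    return [(spectrum_image(image), label) for image, label in samples]
```

Each spectrum image keeps the label of the fundus OCT image it was derived from, so the two sample sets stay aligned for training.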

Example 4

The present application also provides a computer-readable storage medium. The computer-readable storage medium stores a Fourier transform-based OCT image quality evaluation program 73. When the Fourier transform-based OCT image quality evaluation program 73 is executed by a processor, the following operations are implemented:

After the multi-modal classification network model is trained, the fundus OCT image to be classified and the spectrum image to be classified corresponding to the fundus OCT image to be classified are input to the multi-modal classification network model, and the multi-modal classification network model evaluates the quality of the fundus OCT image to be classified.

The specific implementation of the computer-readable storage medium provided by the present application is substantially the same as the specific implementation of the Fourier transform-based OCT image quality evaluation method and the electronic device, and will not be repeated here.

Optionally, the medium involved in this application, such as a computer-readable storage medium, may be non-volatile or volatile.

It should be noted that the blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks linked by cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of the information (anti-counterfeiting) and to generate the next block. The blockchain can include the underlying blockchain platform, the platform product service layer, and the application service layer.
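The idea of data blocks linked by cryptographic methods can be illustrated with a minimal hash-chain sketch. It uses only the Python standard library, covers only the linking and validity check, and leaves out distributed storage, point-to-point transmission, and consensus; the block layout and the function names are assumptions made for illustration.

```python
import hashlib
import json


def make_block(index, payload, prev_hash):
    """Build one block whose hash covers its payload and the previous hash."""
    body = {"index": index, "payload": payload, "prev_hash": prev_hash}
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return {**body, "hash": hashlib.sha256(encoded).hexdigest()}


def build_chain(payloads):
    """Link payloads (for example, image sample identifiers) into a hash chain."""
    chain = []
    prev_hash = "0" * 64  # placeholder hash for the first block
    for index, payload in enumerate(payloads):
        block = make_block(index, payload, prev_hash)
        chain.append(block)
        prev_hash = block["hash"]
    return chain


def is_valid(chain):
    """Recompute every hash and check that each block points at its predecessor."""
    prev_hash = "0" * 64
    for block in chain:
        expected = make_block(block["index"], block["payload"], block["prev_hash"])
        if block["prev_hash"] != prev_hash or block["hash"] != expected["hash"]:
            return False
        prev_hash = block["hash"]
    return True
```

Changing the payload of any stored block changes its recomputed hash, so `is_valid` returns `False` for a tampered chain.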

It should be further explained that, in this document, the terms "comprise", "include" or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, device, article or method that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to the process, device, article, or method. Unless further restricted, an element defined by the phrase "including a..." does not exclude the existence of other identical elements in the process, device, article, or method that includes the element.

The serial numbers of the foregoing embodiments of the present application are for description only and do not represent the superiority or inferiority of the embodiments. Through the description of the above implementations, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform; they can of course also be implemented by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the existing technology, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disk) and includes several instructions to make a terminal device (which may be a mobile phone, a computer, a server, a network device, etc.) execute the methods of the various embodiments of the present application.

The above are only preferred embodiments of the application and do not limit the patent scope of this application. Any equivalent structure or equivalent process transformation made using the content of the description and drawings of the application, or applied directly or indirectly in other related technical fields, is likewise included in the scope of patent protection of this application.

Claims

An OCT image quality evaluation method based on Fourier transform, applied to an electronic device, wherein the method includes:

Performing Fourier transform on each fundus OCT image sample in a fundus OCT image sample set with known image labels to establish a corresponding spectrum image sample set;

Creating a multi-modal classification network model, and training the multi-modal classification network model through the fundus OCT image sample set and the spectrum image sample set;

After the multi-modal classification network model is trained, the fundus OCT image to be classified and the spectrum image to be classified corresponding to the fundus OCT image to be classified are input to the multi-modal classification network model, and the multi-modal classification network model evaluates the quality of the fundus OCT image to be classified.
The Fourier transform-based OCT image quality assessment method according to claim 1, wherein the fundus OCT image sample set is stored in a blockchain, and the process of performing Fourier transform on the fundus OCT image samples comprises:

Perform gray-scale processing on each fundus OCT image in turn;

Perform fast Fourier transform on each fundus OCT image after gray-scale processing to generate corresponding frequency domain samples;

The spectrum image sample set is established according to the frequency domain samples.
The method for evaluating OCT image quality based on Fourier transform according to claim 1 or 2, wherein the multi-modal classification network model includes a Deep stream branch, a Shallow stream branch, and a Simple Modal Image stream branch; wherein,

In the process of training the multi-modal classification network model,

The Deep stream branch is used for training through the fundus OCT image sample set, so as to extract the deep features of each fundus OCT image sample in the fundus OCT image sample set;

The Shallow stream branch is used for training through the fundus OCT image sample set, so as to extract the shallow features of each fundus OCT image sample in the fundus OCT image sample set;

The Simple Modal Image stream branch is used for training through the spectrum image sample set, so as to extract the frequency-domain shallow features of each spectrum image sample in the spectrum image sample set.
The OCT image quality assessment method based on Fourier transform according to claim 3, wherein, in the process of training the multi-modal classification network model,

The Deep stream branch is also used to perform a first classification on the fundus OCT image samples according to the deep features to obtain a corresponding first classification result;

The Shallow stream branch is also used to perform a second classification on the fundus OCT image samples according to the shallow features to obtain a corresponding second classification result;

The Simple Modal Image stream branch is also used to perform a third classification on the spectrum image samples according to the frequency-domain shallow features to obtain a corresponding third classification result.
The OCT image quality assessment method based on Fourier transform according to claim 4, wherein, in the process of training the multi-modal classification network model,

Calculating a corresponding loss function according to the first classification result, the second classification result, and the third classification result;

Calculate the total loss function of the multimodal classification network model according to the loss function, and when the total loss function converges to a minimum, it is determined that the training of the multimodal classification network model is completed.
The OCT image quality evaluation method based on Fourier transform according to claim 5, wherein:

The calculation formula of the loss function is:

$$H(p, q) = -\sum_{i=1}^{n} p(x_i) \log q(x_i)$$

where p is the probability value of the label, q is the predicted probability value output by the classification result, $x_i$ represents the i-th category, and n represents the number of categories;

The calculation formula of the total loss function is:

$$\text{Total Loss} = 0.3 \times \text{Loss}_D + 0.3 \times \text{Loss}_S + 0.4 \times \text{Loss}_P$$

where $\text{Loss}_D$ is the loss function of the first classification result, $\text{Loss}_S$ is the loss function of the second classification result, and $\text{Loss}_P$ is the loss function of the third classification result.
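The weighted combination above can be checked numerically with a short sketch. It assumes that each per-branch loss takes the standard cross-entropy form over the label distribution p and the predicted distribution q; the two-class label, the predicted probabilities, and the function names are hypothetical values chosen for illustration.

```python
import math


def cross_entropy(p, q, eps=1e-12):
    """Cross-entropy -sum_i p(x_i) * log q(x_i) over the n categories."""
    return -sum(p_i * math.log(max(q_i, eps)) for p_i, q_i in zip(p, q))


def total_loss(loss_d, loss_s, loss_p):
    """Weighted sum using the 0.3 / 0.3 / 0.4 weights of the total loss formula."""
    return 0.3 * loss_d + 0.3 * loss_s + 0.4 * loss_p


label = [0.0, 1.0]                          # one-hot label p (hypothetical two-class case)
loss_d = cross_entropy(label, [0.2, 0.8])   # first classification result (Deep stream)
loss_s = cross_entropy(label, [0.4, 0.6])   # second classification result (Shallow stream)
loss_p = cross_entropy(label, [0.1, 0.9])   # third classification result (Simple Modal Image stream)
total = total_loss(loss_d, loss_s, loss_p)
```

Because the three weights sum to 1, the total loss stays on the same scale as the individual branch losses, with the frequency-domain branch weighted slightly more heavily.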
The OCT image quality assessment method based on Fourier transform according to claim 6, wherein, in the process of training the multi-modal classification network model,

Cascading the deep features, the shallow features, and the frequency-domain shallow features to obtain cascaded features;

The multi-modal classification network model performs cascaded classification of the fundus OCT image samples according to the cascaded features to obtain a cascaded classification result corresponding to the fundus OCT image samples;

After the multi-modal classification network model is trained, when the multi-modal classification network model is used to classify and recognize the fundus OCT image to be classified, the cascaded classification result is used as the final classification result output by the multi-modal classification network model.
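The cascading step can be sketched as follows, reading "cascading" as concatenation of the three feature vectors followed by a single classifier. This is one possible reading, not a statement of the claimed network: the feature widths, the two-class output, the linear classifier with softmax, and all names are assumptions, and NumPy is used for brevity.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def cascade_classify(deep, shallow, freq, weights, bias):
    """Concatenate the three feature vectors and apply one linear classifier."""
    cascaded = np.concatenate([deep, shallow, freq], axis=-1)
    return softmax(cascaded @ weights + bias)


rng = np.random.default_rng(0)
deep = rng.normal(size=(4, 16))      # deep features, batch of 4 (hypothetical width)
shallow = rng.normal(size=(4, 8))    # shallow features
freq = rng.normal(size=(4, 8))       # frequency-domain shallow features
weights = rng.normal(size=(32, 2))   # 16 + 8 + 8 inputs, 2 quality classes (assumed)
bias = np.zeros(2)
probs = cascade_classify(deep, shallow, freq, weights, bias)
```

Each row of `probs` is a probability distribution over the assumed quality classes; at inference time this cascaded output would be the one reported as the final result.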
An OCT image quality assessment system based on Fourier transform, wherein the system includes:

A sample set establishment unit, configured to perform Fourier transform on each fundus OCT image sample in a fundus OCT image sample set with known image labels to establish a corresponding spectrum image sample set;

A model training unit, configured to create a multi-modal classification network model, and train the multi-modal classification network model through the fundus OCT image sample set and the spectrum image sample set;

A model application unit, configured to input the fundus OCT image to be classified and the spectrum image to be classified corresponding to the fundus OCT image to be classified into the multi-modal classification network model after the training of the multi-modal classification network model is completed, and to perform quality evaluation on the fundus OCT image to be classified through the multi-modal classification network model.
An electronic device, wherein the electronic device includes: a memory, a processor, and a Fourier transform-based OCT image quality evaluation program stored in the memory and runnable on the processor, and when the Fourier transform-based OCT image quality evaluation program is executed by the processor, the following steps are implemented:

Acquire fundus OCT image sample sets with known image tags and perform Fourier transform on each fundus OCT image sample to establish a corresponding spectrum image sample set;

Creating a multi-modal classification network model, and training the multi-modal classification network model through the fundus OCT image sample set and the spectrum image sample set;

After the multi-modal classification network model is trained, the fundus OCT image to be classified and the spectrum image to be classified corresponding to the fundus OCT image to be classified are input to the multi-modal classification network model, and the multi-modal classification network model classifies and recognizes the fundus OCT image to be classified.
The electronic device according to claim 9, wherein the fundus OCT image sample set is stored in a blockchain, and the process of performing Fourier transform on the fundus OCT image sample comprises:

Perform gray-scale processing on each fundus OCT image in turn;

Perform fast Fourier transform on each fundus OCT image after gray-scale processing to generate corresponding frequency domain samples;

The spectrum image sample set is established according to the frequency domain samples.
The electronic device according to claim 9 or 10, wherein the multi-modal classification network model includes a Deep stream branch, a Shallow stream branch, and a Simple Modal Image stream branch; wherein,

In the process of training the multi-modal classification network model,

The Deep stream branch is used for training through the fundus OCT image sample set, so as to extract the deep features of each fundus OCT image sample in the fundus OCT image sample set;

The Shallow stream branch is used for training through the fundus OCT image sample set, so as to extract the shallow features of each fundus OCT image sample in the fundus OCT image sample set;

The Simple Modal Image stream branch is used for training through the spectrum image sample set, so as to extract the frequency-domain shallow features of each spectrum image sample in the spectrum image sample set.
The electronic device according to claim 11, wherein, in the process of training the multi-modal classification network model,

The Deep stream branch is also used to perform a first classification on the fundus OCT image samples according to the deep features to obtain a corresponding first classification result;

The Shallow stream branch is also used to perform a second classification on the fundus OCT image samples according to the shallow features to obtain a corresponding second classification result;

The Simple Modal Image stream branch is also used to perform a third classification on the spectrum image samples according to the frequency-domain shallow features to obtain a corresponding third classification result.
The electronic device according to claim 12, wherein, in the process of training the multi-modal classification network model,

Calculating a corresponding loss function according to the first classification result, the second classification result, and the third classification result;

Calculate the total loss function of the multimodal classification network model according to the loss function, and when the total loss function converges to a minimum, it is determined that the training of the multimodal classification network model is completed.
The electronic device according to claim 13, wherein:

The calculation formula of the loss function is:

$$H(p, q) = -\sum_{i=1}^{n} p(x_i) \log q(x_i)$$

where p is the probability value of the label, q is the predicted probability value output by the classification result, $x_i$ represents the i-th category, and n represents the number of categories;

The calculation formula of the total loss function is:

$$\text{Total Loss} = 0.3 \times \text{Loss}_D + 0.3 \times \text{Loss}_S + 0.4 \times \text{Loss}_P$$

where $\text{Loss}_D$ is the loss function of the first classification result, $\text{Loss}_S$ is the loss function of the second classification result, and $\text{Loss}_P$ is the loss function of the third classification result.
The electronic device according to claim 14, wherein, in the process of training the multi-modal classification network model,

Cascading the deep features, the shallow features, and the frequency-domain shallow features to obtain cascaded features;

The multi-modal classification network model performs cascaded classification of the fundus OCT image samples according to the cascaded features to obtain a cascaded classification result corresponding to the fundus OCT image samples;

After the multi-modal classification network model is trained, when the multi-modal classification network model is used to classify and recognize the fundus OCT image to be classified, the cascaded classification result is used as the final classification result output by the multi-modal classification network model.
A computer-readable storage medium, wherein a Fourier transform-based OCT image quality evaluation program is stored in the computer-readable storage medium, and when the Fourier transform-based OCT image quality evaluation program is executed by a processor, the following steps are implemented:

Performing Fourier transform on each fundus OCT image sample in a fundus OCT image sample set with known image labels to establish a corresponding spectrum image sample set;

Creating a multi-modal classification network model, and training the multi-modal classification network model through the fundus OCT image sample set and the spectrum image sample set;

After the multi-modal classification network model is trained, the fundus OCT image to be classified and the spectrum image to be classified corresponding to the fundus OCT image to be classified are input to the multi-modal classification network model, and the multi-modal classification network model evaluates the quality of the fundus OCT image to be classified.
The computer-readable storage medium according to claim 16, wherein the multi-modal classification network model includes a Deep stream branch, a Shallow stream branch, and a Simple Modal Image stream branch; wherein,

In the process of training the multi-modal classification network model,

The Deep stream branch is used for training through the fundus OCT image sample set, so as to extract the deep features of each fundus OCT image sample in the fundus OCT image sample set;

The Shallow stream branch is used for training through the fundus OCT image sample set, so as to extract the shallow features of each fundus OCT image sample in the fundus OCT image sample set;

The Simple Modal Image stream branch is used for training through the spectrum image sample set, so as to extract the frequency-domain shallow features of each spectrum image sample in the spectrum image sample set.
The computer-readable storage medium according to claim 17, wherein, in the process of training the multi-modal classification network model,

The Deep stream branch is also used to perform a first classification on the fundus OCT image samples according to the deep features to obtain a corresponding first classification result;

The Shallow stream branch is also used to perform a second classification on the fundus OCT image samples according to the shallow features to obtain a corresponding second classification result;

The Simple Modal Image stream branch is also used to perform a third classification on the spectrum image samples according to the frequency-domain shallow features to obtain a corresponding third classification result.
The computer-readable storage medium according to claim 18, wherein, in the process of training the multi-modal classification network model,

Calculating a corresponding loss function according to the first classification result, the second classification result, and the third classification result;

Calculate the total loss function of the multimodal classification network model according to the loss function, and when the total loss function converges to a minimum, it is determined that the training of the multimodal classification network model is completed.
The computer-readable storage medium of claim 19, wherein:

The calculation formula of the loss function is:

$$H(p, q) = -\sum_{i=1}^{n} p(x_i) \log q(x_i)$$

where p is the probability value of the label, q is the predicted probability value output by the classification result, $x_i$ represents the i-th category, and n represents the number of categories;

The calculation formula of the total loss function is:

$$\text{Total Loss} = 0.3 \times \text{Loss}_D + 0.3 \times \text{Loss}_S + 0.4 \times \text{Loss}_P$$

where $\text{Loss}_D$ is the loss function of the first classification result, $\text{Loss}_S$ is the loss function of the second classification result, and $\text{Loss}_P$ is the loss function of the third classification result.