WO2021082691A1 - Segmentation method and apparatus for lesion area of eye OCT image, and terminal device - Google Patents

Info

Publication number
WO2021082691A1
Authority
WO
WIPO (PCT)
Prior art keywords
result
upsampling
lesion area
OCT image
eye
Prior art date
2019-10-30
Application number
PCT/CN2020/111734
Other languages
French (fr)
Chinese (zh)
Inventor
周侠
郭晏
王玥
吕彬
吕传峰
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021082691A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0012 Biomedical image inspection
    • G06T 7/12 Edge-based segmentation
    • G06T 7/136 Segmentation; Edge detection involving thresholding
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10101 Optical tomography; Optical coherence tomography [OCT]
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G06T 2207/30041 Eye; Retina; Ophthalmic

Definitions

  • This application belongs to the field of artificial intelligence technology, and in particular relates to a method, a device, a terminal device, and a computer-readable storage medium for segmenting a lesion area of an OCT image of an eye.
  • Optical coherence tomography (OCT) is one of the most promising new tomographic imaging technologies to have developed rapidly in recent years, with particularly attractive application prospects in the in vivo detection and imaging of biological tissue.
  • This imaging technology has been applied experimentally in clinical diagnosis in ophthalmology, dentistry, and dermatology. It is another major technological breakthrough after computed tomography (CT) and magnetic resonance imaging (MRI), and it has developed rapidly in recent years.
  • Segmenting lesion areas in ophthalmic OCT images, such as subretinal fluid, intraretinal fluid, subretinal hyperreflective material, and pigment epithelial detachment, is the basis for reliable diagnosis of fundus diseases. Therefore, the inventor realized that a solution for segmenting the lesion area of an eye OCT image is urgently needed.
  • One of the objectives of the embodiments of the present application is to provide a method, a device, a terminal device, and a computer-readable storage medium for segmenting a lesion area of an OCT image of an eye, so as to achieve accurate and efficient segmentation of the lesion area.
  • In a first aspect, an embodiment of the present application provides a method for segmenting the lesion area of an eye OCT image, including:
  • acquiring an eye OCT image to be segmented;
  • detecting the eye OCT image, and determining a bounding box of the lesion area in the eye OCT image;
  • performing edge extraction on the lesion area in the bounding box to obtain a segmentation result of the lesion area.
  • In a second aspect, an embodiment of the present application provides a device for segmenting the lesion area of an eye OCT image, including:
  • an acquisition module, configured to acquire an eye OCT image to be segmented;
  • a detection module, configured to detect the eye OCT image and determine a bounding box of the lesion area in the eye OCT image;
  • an extraction module, configured to perform edge extraction on the lesion area in the bounding box to obtain a segmentation result of the lesion area.
  • In a third aspect, an embodiment of the present application provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the method described in the first aspect when executing the computer program.
  • In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the method described in the first aspect.
  • In a fifth aspect, the embodiments of the present application provide a computer program product which, when run on a terminal device, causes the terminal device to execute the method described in the first aspect.
  • In the embodiments of the present application, the bounding box of the lesion area in the eye OCT image is determined first, and edge extraction is then performed on the lesion area in the bounding box to obtain the segmentation result of the lesion area of the eye OCT image.
  • On the one hand, determining the lesion-area bounding box first and then performing edge extraction on the image area within it segments the lesion area more accurately; on the other hand, because edge extraction targets only the image area within the bounding box, segmentation efficiency is improved, the amount of data processing is reduced, and system resource occupation is reduced.
  • FIG. 1 is a schematic flowchart of a method for segmenting a lesion area of an eye OCT image according to an embodiment of the present application
  • FIG. 2 is a schematic diagram of the results of step S110 and step S120 in the method for segmenting the lesion area of the eye OCT image provided by an embodiment of the present application;
  • FIG. 3 is a schematic flowchart of step S120 in a method for segmenting a lesion area of an eye OCT image according to an embodiment of the present application;
  • FIG. 4 is a schematic structural diagram of a deep learning neural network model used in a method for segmenting a lesion area of an OCT image of an eye provided by an embodiment of the present application;
  • FIG. 5 is a schematic structural diagram of a first sub-network used in a method for segmenting a lesion area of an OCT image of an eye provided by an embodiment of the present application;
  • FIG. 6 is a schematic structural diagram of an attention module used in a method for segmenting a lesion area of an eye OCT image provided by an embodiment of the present application;
  • FIG. 7 is a schematic diagram of the results of step S120 and step S130 in the method for segmenting the lesion area of the eye OCT image provided by an embodiment of the present application;
  • FIG. 8 is a schematic flowchart of step S130 in a method for segmenting a lesion area of an eye OCT image according to an embodiment of the present application
  • FIG. 9 is a schematic structural diagram of a device for segmenting a lesion area of an OCT image of an eye provided by an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of a terminal device to which the method for segmenting a lesion area of an eye OCT image provided by an embodiment of the present application is applicable.
  • The term "if" can be construed, depending on the context, as "when", "once", "in response to determining", or "in response to detecting".
  • Similarly, the phrase "if it is determined" or "if [the described condition or event] is detected" can be interpreted, depending on the context, as "once determined", "in response to determining", "once [the described condition or event] is detected", or "in response to detecting [the described condition or event]".
  • Segmenting the lesion area in the ophthalmological OCT image is the basis for a reliable diagnosis of fundus diseases. Therefore, the embodiments of the present application provide a solution for segmenting the lesion area of the OCT image of the eye, so as to perform accurate and reliable segmentation of the lesion area in the OCT image of the eye.
  • the segmentation scheme of the lesion area of the OCT image of the eye provided by the embodiment of the present application is suitable for the field of artificial intelligence, the field of image processing technology, the field of digital medical treatment, and the like.
  • FIG. 1 shows an implementation flowchart of a method for segmenting a lesion area of an OCT image of an eye provided by an embodiment of the present application.
  • the segmentation method is applied to terminal equipment.
  • The method for segmenting eye OCT image lesion areas provided by the embodiments of the present application can be applied to ophthalmic OCT devices, mobile phones, tablet computers, wearable devices, vehicle-mounted devices, augmented reality (AR)/virtual reality (VR) devices, laptop computers, ultra-mobile personal computers (UMPC), netbooks, personal digital assistants (PDA), standalone servers, distributed servers, server clusters, or cloud servers and other terminal devices; the embodiments of this application place no restriction on the specific type of terminal device.
  • the segmentation method includes step S110 to step S130.
  • the specific implementation principle of each step is as follows.
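Read as a whole, steps S110 to S130 chain into a short pipeline. The hypothetical driver below previews that flow before each step is discussed; every function named here is illustrative (not from the patent) and corresponds to a sketch given further down.

```python
def segment_eye_oct(image):
    """Top-level flow of steps S110-S130; all helper names are illustrative."""
    image = preprocess_oct(image)                        # S110 (optional preprocessing)
    top, bottom, left, right = detect_lesion_box(image)  # S120: deep-learning detection
    fa = image[top:bottom, left:right]                   # region image FA in the box
    return segment_lesion_region(fa)                     # S130: edge extraction
```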
  • S110 Acquire an OCT image of the eye to be segmented.
  • The eye OCT image to be segmented is the object on which lesion-area segmentation needs to be performed; it may be an original frame of eye OCT image.
  • When the terminal device is an OCT device, the eye OCT image may be an image obtained by the OCT device scanning the eye of the person under test in real time.
  • When the terminal device is not an OCT device, the eye OCT image may be an image obtained by the terminal device from an OCT device in real time, or a pre-stored eye OCT image obtained from the internal or external memory of the terminal device.
  • In one non-limiting example, an OCT device collects an OCT image of the eye of the person under test in real time and sends the OCT image to the terminal device; the terminal device obtains the OCT image and uses it as the image to be segmented.
  • In another non-limiting example, the OCT device collects an OCT image of the eye of the person under test and sends it to the terminal device; the terminal device first stores the OCT image in a database, and then obtains it from the database as the image to be segmented.
  • the terminal device obtains the ocular OCT image to be segmented, and after obtaining the ocular OCT image, directly performs the subsequent step S120, that is, detects the ocular OCT image.
  • In other embodiments of the present application, the terminal device preprocesses the acquired eye OCT image.
  • Understandably, preprocessing includes but is not limited to operations such as noise reduction and cropping; these operations reduce noise and the amount of data to be processed, which improves the accuracy of the segmentation result and saves computing power.
  • Exemplarily, the noise-reduction operation may be a filtering operation, including but not limited to nonlinear filtering, median filtering, bilateral filtering, and the like.
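As an illustration of such preprocessing, the sketch below applies median and bilateral filtering followed by a fixed-margin crop. It is a minimal sketch, not part of the patent: OpenCV is assumed as the image library, and the filter parameters and crop margins are illustrative assumptions.

```python
import cv2  # OpenCV, assumed available


def preprocess_oct(image):
    """Illustrative preprocessing of an eye OCT frame: noise reduction, then cropping.

    The filter sizes and crop margins below are assumptions, not values from
    the patent, which leaves these parameters open.
    """
    denoised = cv2.medianBlur(image, 5)                  # median filtering
    denoised = cv2.bilateralFilter(denoised, 9, 75, 75)  # bilateral filtering
    h, w = denoised.shape[:2]
    # Crop a 10% margin on every side to discard border artifacts.
    return denoised[h // 10: h - h // 10, w // 10: w - w // 10]
```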
  • In one non-limiting use scenario of the present application, when the user wants to perform lesion-area segmentation on a selected frame of eye OCT image, the lesion-area segmentation function of the terminal device is enabled by clicking a specific physical button and/or virtual button of the terminal device; the terminal device then automatically processes the selected frame of eye OCT image according to step S120 and step S130 to obtain the segmentation result.
  • In another non-limiting use scenario of the present application, when the user wants to perform lesion-area segmentation on a frame of eye OCT image, the lesion-area segmentation function of the terminal device can be enabled by clicking a specific physical button and/or virtual button, and a frame of eye OCT image is selected; the terminal device then automatically processes the eye OCT image according to step S120 and step S130 to obtain the segmentation result. It can be understood that the order of clicking the button and selecting a frame of eye OCT image is interchangeable; the embodiments of the present application are applicable to, but not limited to, these two usage scenarios.
  • S120 Detect the OCT image of the eye, and determine the bounding box of the lesion area in the OCT image of the eye.
  • Step S120 is the detection step: the eye OCT image is detected, and the bounding box of the lesion area in the eye OCT image is determined.
  • a deep learning network model is used to detect the ocular OCT image, and the bounding box of the lesion area in the ocular OCT image is determined, and the area enclosed by the bounding box is the bounded lesion area.
  • the deep learning network model is used to frame the lesion area in the OCT image of the eye, specifically, the lesion area is framed by a bounding box. As shown in Figure 2, the OCT image of the eye is detected to determine the bounding box A of the lesion area in the OCT image of the eye.
  • When the eye OCT image to be segmented is input into the deep learning network model, the model outputs the eye OCT image marked with a bounding box, and the area framed by the bounding box is the lesion area of the eye OCT image.
  • The training process of the deep learning network model includes: acquiring a large number of eye OCT sample images, each annotated with its lesion area; dividing the sample images into a training sample set, a verification sample set, and a test sample set; and training the deep learning network model with a back-propagation algorithm according to the training sample set, the verification sample set, and the test sample set.
  • During training, a large number of annotated eye OCT sample images are needed; exemplarily, on the basis of the original sample set, the eye OCT sample images in the sample set can be cropped, rotated, and so on to generate new sample images and expand the sample set.
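A minimal sketch of this sample-set expansion and split follows. It assumes PyTorch/torchvision tensors and applies the same rotation to the image and its lesion annotation; the split ratios and rotation angles are illustrative assumptions, since the patent does not fix them.

```python
import random

import torchvision.transforms.functional as TF  # assumed; torchvision >= 0.9


def expand_and_split(samples, train=0.7, val=0.15, seed=0):
    """Expand annotated eye OCT samples by rotation, then split into
    training / verification / test sets.

    `samples` is a list of (image, lesion_mask) tensor pairs; the ratios
    and angles are assumptions, not values from the patent.
    """
    expanded = []
    for img, mask in samples:
        expanded.append((img, mask))
        for angle in (-10, 10):                      # small rotations; the same
            expanded.append((TF.rotate(img, angle),  # transform is applied to
                             TF.rotate(mask, angle)))  # image and annotation
    random.Random(seed).shuffle(expanded)
    i = int(train * len(expanded))
    j = int((train + val) * len(expanded))
    return expanded[:i], expanded[i:j], expanded[j:]
```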
  • It should be noted that the process of training the deep learning network model can be implemented locally on the terminal device, or on another device communicatively connected to the terminal device. Once the trained deep learning network model has been deployed on the terminal device side, or another device has pushed the trained model to the terminal device and it has been deployed successfully, lesion-area segmentation of the acquired eye OCT image to be segmented can be performed on the terminal device.
  • It should also be noted that eye OCT images obtained during lesion-area segmentation can be used to enlarge the training sample set; the deep learning network model is then further optimized on the terminal device or on another device, and the further optimized model is deployed to the terminal device to replace the previous one. Optimizing the deep learning network model in this way further improves the adaptability of the solution of this application.
  • In the process of training the neural network model, the loss function used can be one of the 0-1 loss function, the absolute-value loss function, the logarithmic loss function, the exponential loss function, and the hinge loss function, or a combination of at least two of them.
  • The deep learning network model can be a deep learning network model based on machine learning technology in artificial intelligence, including but not limited to AlexNet, VGGNet, GoogLeNet, ResNet, ResNeXt, R-CNN, YOLO, SqueezeNet, SegNet, or GAN.
  • step S120 includes step S121 to step S123.
  • S121 Perform feature extraction on the OCT image of the eye to obtain multiple feature maps of different scales.
  • S122 Fuse the multiple feature maps of different scales based on an attention mechanism to obtain a fusion result.
  • S123 Perform region extraction on the fusion result, and determine the bounding box of the lesion area in the eye OCT image.
  • the deep learning network model includes two cascaded deep learning network models, a first sub-network and a second sub-network.
  • the first sub-network includes a feature extraction network and an attention network.
  • The feature extraction network of the first sub-network is used to extract multiple feature maps of different scales from the eye OCT image;
  • the attention network of the first sub-network is used to fuse the multiple feature maps of different scales based on the attention mechanism to obtain the fusion result.
  • The second sub-network is used to perform region extraction on the fusion result and determine the bounding box of the lesion area in the eye OCT image.
  • In this example, the feature extraction network of the first sub-network includes four cascaded downsamplings and four cascaded upsamplings.
  • The four downsamplings are, in order, the first downsampling, the second downsampling, the third downsampling, and the fourth downsampling; the four upsamplings are, in order, the first upsampling, the second upsampling, the third upsampling, and the fourth upsampling.
  • The result of the fourth downsampling is used as the input of the first upsampling; the result of the third downsampling and the result of the first upsampling are spliced as the input of the second upsampling; the result of the second downsampling and the result of the second upsampling are spliced as the input of the third upsampling; and the result of the first downsampling and the result of the third upsampling are spliced as the input of the fourth upsampling.
  • The attention network of the first sub-network includes four attention modules; the first upsampling result, the second upsampling result, the third upsampling result, and the fourth upsampling result obtained by the feature extraction network are each input into one attention module, and the outputs of all the attention modules are spliced to obtain the fusion result.
  • down-sampling may be implemented through a convolutional layer, and up-sampling may be implemented through a deconvolutional layer.
  • down-sampling can be implemented by a convolutional layer and a pooling layer
  • up-sampling can be implemented by a deconvolutional layer and a de-pooling layer.
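The following is a minimal sketch of such a feature extraction network, assuming PyTorch; the layer widths and the conv/pool realization of each stage are illustrative assumptions. Each downsampling stage is a convolution plus pooling, each upsampling stage a deconvolution, and the splicing is channel-wise concatenation as described above.

```python
import torch
import torch.nn as nn


def down(cin, cout):
    # One downsampling stage: convolutional layer + pooling layer.
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1),
        nn.BatchNorm2d(cout),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
    )


def up(cin, cout):
    # One upsampling stage: deconvolutional (transposed conv) layer.
    return nn.Sequential(nn.ConvTranspose2d(cin, cout, 2, stride=2),
                         nn.ReLU(inplace=True))


class FeatureExtractionNet(nn.Module):
    """Four cascaded downsamplings and four cascaded upsamplings with the
    splicing pattern described in the text (channel widths are assumptions)."""

    def __init__(self, c=16):
        super().__init__()
        self.d1, self.d2 = down(1, c), down(c, 2 * c)
        self.d3, self.d4 = down(2 * c, 4 * c), down(4 * c, 8 * c)
        self.u1 = up(8 * c, 4 * c)   # input: 4th downsampling result
        self.u2 = up(8 * c, 2 * c)   # input: concat(3rd down, 1st up)
        self.u3 = up(4 * c, c)       # input: concat(2nd down, 2nd up)
        self.u4 = up(2 * c, c)       # input: concat(1st down, 3rd up)

    def forward(self, x):            # x: (N, 1, H, W), H and W divisible by 16
        x1 = self.d1(x)
        x2 = self.d2(x1)
        x3 = self.d3(x2)
        x4 = self.d4(x3)
        y1 = self.u1(x4)
        y2 = self.u2(torch.cat([x3, y1], dim=1))
        y3 = self.u3(torch.cat([x2, y2], dim=1))
        y4 = self.u4(torch.cat([x1, y3], dim=1))
        return y1, y2, y3, y4        # feature maps at four different scales
```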
  • The attention module includes one global pooling layer and one convolutional layer, and the convolutional layer is provided with BN and a softmax function.
  • The softmax is used for normalization, so that the scores corresponding to the input feature information sum to 1.
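One possible reading of this module, sketched below in the same assumed PyTorch setting: the globally pooled features pass through a 1x1 convolution with BN, the softmax turns the result into per-channel scores that sum to 1, and the scores reweight the input feature map. The reweighting step is an interpretation, since the patent text only names the layers.

```python
import torch
import torch.nn as nn


class AttentionModule(nn.Module):
    """One global pooling layer plus one convolutional layer with BN and
    softmax; using the scores to reweight the input channels is an assumption."""

    def __init__(self, channels):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)           # global pooling layer
        self.conv = nn.Conv2d(channels, channels, 1)  # convolutional layer
        self.bn = nn.BatchNorm2d(channels)            # BN on the conv output

    def forward(self, x):                             # x: (N, C, H, W)
        scores = self.bn(self.conv(self.pool(x)))     # (N, C, 1, 1) channel scores
        weights = torch.softmax(scores.flatten(1), dim=1)  # scores sum to 1
        return x * weights.view(x.size(0), -1, 1, 1)  # reweight the features
```

Splicing the outputs of the four attention modules requires bringing the four scales to a common resolution first; the patent does not specify this step, so interpolating to the finest scale before concatenation would be one assumption.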
  • In this example, the feature extraction network fuses the deep and shallow features of the eye OCT image, which greatly improves the accuracy of the features the model extracts and thereby improves the accuracy of the subsequent segmentation results.
  • In addition, an attention module is added after each upsampling; the attention module increases the weight of relatively important features, which further improves the accuracy of feature extraction.
  • The second sub-network can be the original Mask R-CNN model or Faster R-CNN model with its feature-map extraction module removed; that is to say, in the example of this application, the first sub-network replaces the feature extraction module (the CNN module) of the original Mask R-CNN or Faster R-CNN model, and the lesion area in the fusion result is marked by the second sub-network.
  • Taking the original Mask R-CNN model without its feature extraction module (that is, the second sub-network) as an example, the second sub-network is connected to the fusion result output by the attention network in the first sub-network, so that the output of the attention network can be used as the input of the second sub-network, thereby adding an attention mechanism to the Mask R-CNN model.
  • A Mask R-CNN model with an attention mechanism can better express lesions of different types and sizes; therefore, by using a Mask R-CNN model with an attention mechanism, the accuracy of identifying and detecting the lesion area can be improved, which is especially beneficial to the detection of small target lesion areas.
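To make the wiring concrete, the sketch below plugs a first sub-network into torchvision's Mask R-CNN implementation as its backbone, following torchvision's documented custom-backbone pattern. `FirstSubNetwork` is a hypothetical module combining the feature extraction and attention sketches above and returning a single fused feature map; the channel count, anchor sizes, and class count are assumptions (four lesion categories plus background, per the category list below).

```python
from torchvision.models.detection import MaskRCNN
from torchvision.models.detection.anchor_utils import AnchorGenerator

# Hypothetical first sub-network: feature extraction + attention fusion,
# returning one fused feature map of shape (N, 64, H', W').
backbone = FirstSubNetwork()
backbone.out_channels = 64  # MaskRCNN reads this attribute from the backbone

# A single-level backbone takes one tuple of anchor sizes (torchvision pattern).
anchor_generator = AnchorGenerator(sizes=((16, 32, 64, 128, 256),),
                                   aspect_ratios=((0.5, 1.0, 2.0),))

model = MaskRCNN(backbone,
                 num_classes=5,  # 4 lesion categories + background (assumed)
                 rpn_anchor_generator=anchor_generator)
```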
  • Exemplarily, the categories of the lesion area can be set to include four categories: intraretinal fluid, subretinal fluid, subretinal hyperreflective material, and pigment epithelial detachment.
  • S130 Perform edge extraction on the lesion area in the bounding box to obtain a segmentation result of the lesion area.
  • In the embodiment of the present application, the bounding box of the lesion area is detected in step S120, and refined segmentation is then performed on the basis of the bounding box; that is, the bounding box initially determined in step S120 is a coarse positioning area, such as the image area enclosed by the bounding box A shown in FIG. 2.
  • Edge extraction is performed on the lesion area in the coarse positioning area to obtain the segmentation result of the lesion area.
  • FIG. 7 is a schematic diagram of performing edge extraction on the lesion area within the bounding box A of the lesion area in the eye OCT image.
  • step S130 includes step S131 to step S134.
  • The horizontal convolution factor and the vertical convolution factor may be preset in the system, adjusted by the user according to requirements, or set as the system default value after being adjusted by the user; the examples of this application do not specifically limit these two convolution factors.
  • Exemplarily, the convolution factor may be the Sobel convolution factor, the Prewitt convolution factor, the Roberts convolution factor, and so on.
  • In one example, the system presets the Sobel convolution factor. The horizontal convolution factor of the Sobel operator is:

        [ -1  0  +1 ]
        [ -2  0  +2 ]
        [ -1  0  +1 ]

  • The vertical convolution factor of the Sobel operator is:

        [ +1  +2  +1 ]
        [  0   0   0 ]
        [ -1  -2  -1 ]
  • The horizontal convolution factor and the vertical convolution factor are each convolved with the region image enclosed by the bounding box to obtain the horizontal gradient and the vertical gradient.
  • The image of the area enclosed by the bounding box is denoted FA;
  • Gx and Gy represent the gray values of the image after horizontal and vertical edge detection respectively, that is, Gx represents the horizontal gradient and Gy represents the vertical gradient, calculated as follows (with Sx and Sy denoting the horizontal and vertical convolution factors and * denoting two-dimensional convolution):

        Gx = Sx * FA,    Gy = Sy * FA

  • The horizontal gradient and the vertical gradient are calculated for each pixel (x, y) in the region image.
  • S133 Determine the edge of the lesion area in the bounding box according to the horizontal gradient and the vertical gradient.
  • the edge of the lesion area in the bounding box of the OCT image of the eye is determined by the calculated horizontal gradient and the vertical gradient.
  • Optionally, step S133 includes: calculating a gradient magnitude from the horizontal gradient and the vertical gradient, and determining the edge of the lesion area in the bounding box based on that magnitude. Any of the following criteria may be used.
  • In one example, the edge of the lesion area in the bounding box of the eye OCT image is determined based on the sum of absolute values: when the sum of the absolute values exceeds the first preset threshold SHR1, that is, when |Gx| + |Gy| > SHR1, the pixel point (x, y) is an edge point.
  • In another example, the edge is determined based on the average of the absolute values: when the average exceeds the second preset threshold SHR2, that is, when (|Gx| + |Gy|) / 2 > SHR2, the pixel point (x, y) is an edge point.
  • In another example, the edge is determined based on the root mean square: when the root mean square exceeds the third preset threshold SHR3, that is, when (Gx^2 + Gy^2)^(1/2) > SHR3, the pixel point (x, y) is an edge point.
  • In another example, the edge is determined based on the sum of squares: when the sum of squares exceeds the fourth preset threshold SHR4, that is, when Gx^2 + Gy^2 > SHR4, the pixel point (x, y) is an edge point.
  • the first preset threshold is a value set for the sum of absolute values
  • the second preset threshold is a value set for the mean value of absolute values
  • the third preset threshold is a value set for the root mean square.
  • the fourth preset threshold is a value set for the sum of squares.
  • The values of these four preset thresholds are empirical values; they can be preset in the system, adjusted by the user according to needs, or set as the system default value after being adjusted by the user. This application does not place specific restrictions on the values of these four thresholds.
  • S134 Obtain a segmentation result of the lesion area based on the determined edge.
  • The edge points of the lesion area are determined in step S133, and the pixel-connected areas enclosed by the edge points constitute the segmentation result of the lesion area.
  • The segmentation result may include more than one pixel-connected area; how many there are depends on the number of areas enclosed by the detected edge points. Continuing to refer to FIG. 7, the segmentation result there includes multiple pixel-connected regions.
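The sketch below walks through steps S131 to S134 on the region image FA enclosed by the bounding box, using the Sobel factors given above and the root-mean-square criterion with threshold SHR3 (one of the four criteria described earlier). NumPy and SciPy are assumed, the threshold value is illustrative, and treating the areas enclosed by the edge points as fillable holes is an interpretation.

```python
import numpy as np
from scipy.ndimage import convolve, binary_fill_holes, label  # assumed available

# Sobel horizontal and vertical convolution factors from the text above.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]], dtype=float)


def segment_lesion_region(fa, shr3=120.0):
    """Edge extraction on the region image FA inside the bounding box.

    `shr3` is the third preset threshold; its value here is an assumption.
    """
    fa = fa.astype(float)
    gx = convolve(fa, SOBEL_X)                   # horizontal gradient Gx
    gy = convolve(fa, SOBEL_Y)                   # vertical gradient Gy
    edges = np.sqrt(gx ** 2 + gy ** 2) > shr3    # (Gx^2 + Gy^2)^(1/2) > SHR3
    filled = binary_fill_holes(edges)            # areas enclosed by edge points
    regions, count = label(filled)               # pixel-connected areas
    return edges, regions, count                 # segmentation result per region
```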
  • To sum up, in the embodiments of the present application, the bounding box of the lesion area in the eye OCT image is determined first, and edge extraction is then performed on the lesion area in the bounding box to obtain the segmentation result of the lesion area of the eye OCT image.
  • On the one hand, determining the lesion-area bounding box first and then performing edge extraction on the image area within it segments the lesion area more accurately; on the other hand, because edge extraction targets only the image area within the bounding box, segmentation efficiency is improved, the amount of data processing is reduced, and system resource occupation is reduced.
  • FIG. 9 shows a structural block diagram of the device for segmenting the lesion area of the eye OCT image provided in an embodiment of the present application; for ease of description, only the parts related to the embodiments of this application are shown.
  • the device includes:
  • the obtaining module 91 is used to obtain the OCT image of the eye to be segmented
  • the detection module 92 is configured to detect the OCT image of the eye, and determine the bounding box of the lesion area in the OCT image of the eye;
  • the extraction module 93 is configured to perform edge extraction on the lesion area in the bounding box to obtain a segmentation result of the lesion area.
  • The detection module 92 is specifically configured to: perform feature extraction on the eye OCT image to obtain multiple feature maps of different scales; fuse the multiple feature maps of different scales based on an attention mechanism to obtain a fusion result; and perform region extraction on the fusion result to determine the bounding box of the lesion area in the eye OCT image.
  • The performing feature extraction on the eye OCT image to obtain multiple feature maps of different scales, and fusing the multiple feature maps of different scales based on an attention mechanism to obtain a fusion result, includes:
  • inputting each of the feature maps of different scales into one attention module, and splicing the outputs of all attention modules to obtain the fusion result.
  • The feature extraction network includes multiple cascaded downsamplings and multiple cascaded upsamplings, and the results obtained by the multiple upsamplings are the multiple feature maps of different scales.
  • Optionally, the feature extraction network includes four downsamplings and four upsamplings; the four downsamplings are, in order, the first downsampling, the second downsampling, the third downsampling, and the fourth downsampling, and the four upsamplings are, in order, the first upsampling, the second upsampling, the third upsampling, and the fourth upsampling.
  • The result of the fourth downsampling is used as the input of the first upsampling; the result of the third downsampling and the result of the first upsampling are spliced as the input of the second upsampling; the result of the second downsampling and the result of the second upsampling are spliced as the input of the third upsampling; the result of the first downsampling and the result of the third upsampling are spliced as the input of the fourth upsampling; and the result of the first upsampling, the result of the second upsampling, the result of the third upsampling, and the result of the fourth upsampling are the multiple feature maps of different scales.
  • The extraction module 93 is specifically configured to: convolve the horizontal convolution factor and the vertical convolution factor with the region image enclosed by the bounding box to obtain the horizontal gradient and the vertical gradient; determine the edge of the lesion area in the bounding box according to the horizontal gradient and the vertical gradient; and obtain the segmentation result of the lesion area based on the determined edge.
  • The determining the edge of the lesion area in the bounding box according to the horizontal gradient and the vertical gradient includes: calculating a gradient magnitude from the horizontal gradient and the vertical gradient, such as a sum of absolute values, an average of absolute values, a root mean square, or a sum of squares, and determining the edge of the lesion area in the bounding box based on that magnitude.
  • FIG. 10 is a schematic structural diagram of a terminal device provided by an embodiment of this application.
  • As shown in FIG. 10, the terminal device 10 of this embodiment includes: at least one processor 100 (only one processor is shown in FIG. 10), a memory 101, and a computer program 102 stored in the memory 101 and executable on the at least one processor 100.
  • When the processor 100 executes the computer program 102, the steps in the foregoing method embodiments are implemented, for example, steps S110 to S130 shown in FIG. 1.
  • the terminal device may include, but is not limited to, the processor 100 and the memory 101.
  • FIG. 10 is only an example of the terminal device 10, and does not constitute a limitation on the terminal device 10. It may include more or fewer components than shown in the figure, or a combination of certain components, or different components.
  • Exemplarily, the terminal device may also include input and output devices, network access devices, buses, and so on.
  • the so-called processor 100 may be a central processing unit (Central Processing Unit, CPU), other general-purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), Field-Programmable Gate Array (FPGA) or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, etc.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the memory 101 may be an internal storage unit of the terminal device 10, such as a hard disk or a memory of the terminal device 10.
  • the memory 101 may also be an external storage device of the terminal device 10, such as a plug-in hard disk equipped on the terminal device 10, a smart memory card (Smart Media Card, SMC), and a Secure Digital (SD) Card, Flash Card, etc. Further, the memory 101 may also include both an internal storage unit of the terminal device 10 and an external storage device.
  • the memory 101 is used to store the computer program and other programs and data required by the terminal device 10.
  • the memory 101 can also be used to temporarily store data that has been output or will be output.
  • the embodiment of the present application also provides a computer-readable storage medium.
  • the computer-readable storage medium may be non-volatile or volatile.
  • The computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the steps in the foregoing method embodiments can be realized.
  • The embodiments of the present application also provide a computer program product; when the computer program product runs on a mobile terminal, the steps in the foregoing method embodiments can be realized when the mobile terminal executes it.
  • If the integrated module/unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. On this understanding, the computer program can be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the foregoing method embodiments can be implemented.
  • the computer program includes computer program code, and the computer program code may be in the form of source code, object code, executable file, or some intermediate forms.
  • The computer-readable medium may at least include: any entity or device capable of carrying the computer program code to the photographing device/terminal device, a recording medium, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electric carrier signal, a telecommunications signal, and a software distribution medium, such as a USB flash drive, a removable hard disk, a floppy disk, or a CD-ROM.
  • It should be noted that, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electric carrier signals and telecommunications signals.
  • the disclosed terminal device and method may be implemented in other ways.
  • the terminal device embodiments described above are only illustrative.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Quality & Reliability (AREA)
  • Radiology & Medical Imaging (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)
  • Eye Examination Apparatus (AREA)

Abstract

The present application is applicable to the technical field of image processing, and provided are a segmentation method and apparatus for a lesion area of an eye OCT image, and a terminal device. The method comprises: acquiring an eye OCT image to be segmented; detecting the eye OCT image, and determining a bounding box of a lesion area in the eye OCT image; and performing edge extraction on the lesion area in the bounding box, and obtaining a segmentation result of the lesion area. The present application provides a segmentation scheme of the lesion area of the eye OCT image, and accurate and efficient segmentation of the lesion area is implemented.

Description

Method, device and terminal equipment for segmenting the lesion area of an eye OCT image
This application claims priority to the Chinese patent application with application number 201911043286.0, entitled "Method, device and terminal equipment for segmenting the lesion area of an eye OCT image", filed with the Patent Office of the State Intellectual Property Office of the People's Republic of China on October 30, 2019, the entire content of which is incorporated into this application by reference.
Technical field
This application belongs to the field of artificial intelligence technology, and in particular relates to a method, a device, a terminal device, and a computer-readable storage medium for segmenting the lesion area of an eye OCT image.
Background
Optical coherence tomography (OCT) is one of the most promising new tomographic imaging technologies to have developed rapidly in recent years, with particularly attractive application prospects in the in vivo detection and imaging of biological tissue. The technology has been applied experimentally in clinical diagnosis in ophthalmology, dentistry, and dermatology; it is another major technological breakthrough after computed tomography (CT) and magnetic resonance imaging (MRI), and it has developed rapidly in recent years.
Segmenting lesion areas in ophthalmic OCT images, such as subretinal fluid, intraretinal fluid, subretinal hyperreflective material, and pigment epithelial detachment, is the basis for reliable diagnosis of fundus diseases. Therefore, the inventor realized that a solution for segmenting the lesion area of an eye OCT image is urgently needed.
Technical problem
One of the objectives of the embodiments of the present application is to provide a method, a device, a terminal device, and a computer-readable storage medium for segmenting the lesion area of an eye OCT image, so as to achieve accurate and efficient segmentation of the lesion area.
Technical solution
In order to solve the above technical problem, the technical solutions adopted in the embodiments of this application are as follows:
In a first aspect, an embodiment of the present application provides a method for segmenting the lesion area of an eye OCT image, including:
acquiring an eye OCT image to be segmented;
detecting the eye OCT image, and determining a bounding box of the lesion area in the eye OCT image;
performing edge extraction on the lesion area in the bounding box to obtain a segmentation result of the lesion area.
In a second aspect, an embodiment of the present application provides a device for segmenting the lesion area of an eye OCT image, including:
an acquisition module, configured to acquire an eye OCT image to be segmented;
a detection module, configured to detect the eye OCT image and determine a bounding box of the lesion area in the eye OCT image;
an extraction module, configured to perform edge extraction on the lesion area in the bounding box to obtain a segmentation result of the lesion area.
In a third aspect, an embodiment of the present application provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the method described in the first aspect when executing the computer program.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the method described in the first aspect.
In a fifth aspect, the embodiments of the present application provide a computer program product which, when run on a terminal device, causes the terminal device to execute the method described in the first aspect.
Beneficial effects
In the embodiments of this application, the bounding box of the lesion area in the eye OCT image is determined first, and edge extraction is then performed on the lesion area in the bounding box to obtain the segmentation result of the lesion area of the eye OCT image. On the one hand, determining the lesion-area bounding box first and then performing edge extraction on the image area within it segments the lesion area more accurately; on the other hand, because edge extraction targets only the image area within the bounding box, segmentation efficiency is improved, the amount of data processing is reduced, and system resource occupation is reduced.
Brief description of the drawings
In order to describe the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments or the exemplary technology are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained based on these drawings without creative work.
FIG. 1 is a schematic flowchart of a method for segmenting the lesion area of an eye OCT image according to an embodiment of the present application;
FIG. 2 is a schematic diagram of the results of step S110 and step S120 in the method for segmenting the lesion area of an eye OCT image provided by an embodiment of the present application;
FIG. 3 is a schematic flowchart of step S120 in the method for segmenting the lesion area of an eye OCT image according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of the deep learning neural network model used in the method for segmenting the lesion area of an eye OCT image provided by an embodiment of the present application;
FIG. 5 is a schematic structural diagram of the first sub-network used in the method for segmenting the lesion area of an eye OCT image provided by an embodiment of the present application;
FIG. 6 is a schematic structural diagram of the attention module used in the method for segmenting the lesion area of an eye OCT image provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of the results of step S120 and step S130 in the method for segmenting the lesion area of an eye OCT image provided by an embodiment of the present application;
FIG. 8 is a schematic flowchart of step S130 in the method for segmenting the lesion area of an eye OCT image according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of the device for segmenting the lesion area of an eye OCT image provided by an embodiment of the present application;
FIG. 10 is a schematic structural diagram of a terminal device to which the method for segmenting the lesion area of an eye OCT image provided by an embodiment of the present application is applicable.
Detailed description
In the following description, for the purpose of illustration rather than limitation, specific details such as particular system structures and technologies are set forth for a thorough understanding of the embodiments of the present application.
In order to enable those skilled in the art to better understand the solution of this application, the technical solutions in the embodiments of this application will be described clearly and completely below in conjunction with the drawings in the embodiments of this application. Obviously, the described embodiments are only a part of the embodiments of this application, not all of them. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of this application. It should be noted that, provided there is no conflict, the embodiments in this application and the features in the embodiments can be combined with each other.
However, it should be clear to those skilled in the art that the present application can also be implemented in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, devices, circuits, and methods are omitted to prevent unnecessary details from obscuring the description of this application.
It should be understood that, when used in the specification and the appended claims of this application, the term "comprising" indicates the existence of the described features, wholes, steps, operations, elements and/or components, but does not exclude the existence or addition of one or more other features, wholes, steps, operations, elements, components and/or collections thereof.
It should also be understood that the term "and/or" used in the specification and the appended claims of this application refers to any and all possible combinations of one or more of the associated listed items, and includes these combinations.
As used in the specification of this application and the appended claims, the term "if" can be construed, depending on the context, as "when", "once", "in response to determining", or "in response to detecting". Similarly, the phrase "if it is determined" or "if [the described condition or event] is detected" can be interpreted, depending on the context, as "once determined", "in response to determining", "once [the described condition or event] is detected", or "in response to detecting [the described condition or event]".
In addition, in the description of the specification of this application and the appended claims, the terms "first", "second", "third", etc. are only used to distinguish the descriptions and cannot be understood as indicating or implying relative importance.
Reference to "one embodiment" or "some embodiments" in the specification of this application means that one or more embodiments of this application include a specific feature, structure, or characteristic described in combination with that embodiment. Therefore, the phrases "in one embodiment", "in some embodiments", "in some other embodiments", "in other embodiments", etc. appearing in different places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments", unless otherwise specifically emphasized. The terms "including", "comprising", "having" and their variants all mean "including but not limited to", unless otherwise specifically emphasized.
Segmenting the lesion area in an ophthalmic OCT image is the basis for reliable diagnosis of fundus diseases. Therefore, the embodiments of the present application provide a solution for segmenting the lesion area of an eye OCT image, so as to perform accurate and reliable segmentation of the lesion area in the eye OCT image.
The solution for segmenting the lesion area of an eye OCT image provided by the embodiments of the present application is suitable for the field of artificial intelligence, the field of image processing technology, the field of digital medicine, and the like.
FIG. 1 shows an implementation flowchart of a method for segmenting the lesion area of an eye OCT image provided by an embodiment of the present application. The segmentation method is applied to a terminal device. The method can be applied to ophthalmic OCT devices, mobile phones, tablet computers, wearable devices, vehicle-mounted devices, augmented reality (AR)/virtual reality (VR) devices, laptop computers, ultra-mobile personal computers (UMPC), netbooks, personal digital assistants (PDA), standalone servers, distributed servers, server clusters, or cloud servers and other terminal devices; the embodiments of this application place no restriction on the specific type of terminal device. As shown in FIG. 1, the segmentation method includes step S110 to step S130. The specific implementation principle of each step is as follows.
S110: Acquire an eye OCT image to be segmented.
The eye OCT image to be segmented is the object on which lesion-area segmentation needs to be performed; it may be an original frame of eye OCT image.
When the terminal device is an OCT device, the eye OCT image may be an image obtained by the OCT device scanning the eye of the person under test in real time.
When the terminal device is not an OCT device, the eye OCT image may be an image obtained by the terminal device from an OCT device in real time, or a pre-stored eye OCT image obtained from the internal or external memory of the terminal device.
In one non-limiting example, an OCT device collects an OCT image of the eye of the person under test in real time and sends the OCT image to the terminal device; the terminal device obtains the OCT image and uses it as the image to be segmented.
In another non-limiting example, the OCT device collects an OCT image of the eye of the person under test and sends it to the terminal device; the terminal device first stores the OCT image in a database, and then obtains it from the database as the image to be segmented.
In some embodiments of the present application, the terminal device obtains the eye OCT image to be segmented and, after obtaining it, directly performs the subsequent step S120, that is, detects the eye OCT image.
In other embodiments of the present application, the terminal device preprocesses the acquired eye OCT image. Understandably, preprocessing includes but is not limited to operations such as noise reduction and cropping; these operations reduce noise and the amount of data to be processed, which improves the accuracy of the segmentation result and saves computing power. Exemplarily, the noise-reduction operation may be a filtering operation, including but not limited to nonlinear filtering, median filtering, bilateral filtering, and the like.
In one non-limiting use scenario of this application, when the user wants to perform lesion-area segmentation on a selected frame of eye OCT image, the lesion-area segmentation function of the terminal device is enabled by clicking a specific physical button and/or virtual button of the terminal device; the terminal device then automatically processes the selected frame of eye OCT image according to step S120 and step S130 to obtain the segmentation result.
In another non-limiting use scenario of this application, when the user wants to perform lesion-area segmentation on a frame of eye OCT image, the lesion-area segmentation function of the terminal device can be enabled by clicking a specific physical button and/or virtual button, and a frame of eye OCT image is selected; the terminal device then automatically processes the eye OCT image according to step S120 and step S130 to obtain the segmentation result.
It can be understood here that the order of clicking the button and selecting a frame of eye OCT image is interchangeable; the embodiments of the present application are applicable to, but not limited to, these two usage scenarios.
S120,对所述眼部OCT图像进行检测,确定所述眼部OCT图像中病灶区域的边界框。S120: Detect the OCT image of the eye, and determine the bounding box of the lesion area in the OCT image of the eye.
步骤S120为对眼部OCT图像进行检测的步骤,确定所述眼部OCT图像中病灶区域的边界框。Step S120 is a step of detecting the OCT image of the eye, and determining the bounding box of the lesion area in the OCT image of the eye.
在本申请实施例中利用深度学习网络模型对眼部OCT图像进行检测,确定所述眼部OCT图像中病灶区域的边界框,边界框围成的区域为框定出的病灶区域。In the embodiment of the present application, a deep learning network model is used to detect the ocular OCT image, and the bounding box of the lesion area in the ocular OCT image is determined, and the area enclosed by the bounding box is the bounded lesion area.
深度学习网络模型用于框出眼部OCT图像中的病灶区域,具体地,通过边界框框出该病灶区域。如图2所示,对眼部OCT图像进行检测,确定眼部OCT图像中病灶区域的边界框A。The deep learning network model is used to frame the lesion area in the OCT image of the eye, specifically, the lesion area is framed by a bounding box. As shown in Figure 2, the OCT image of the eye is detected to determine the bounding box A of the lesion area in the OCT image of the eye.
当待分割的眼部OCT图像输入深度学习网络模型,深度学习网络模型输出标记有边界框的眼部OCT图像,边界框框出的区域为眼部OCT图像的病灶区域。When the OCT image of the eye to be segmented is input to the deep learning network model, the deep learning network model outputs the OCT image of the eye marked with a bounding box, and the area framed by the bounding box is the focus area of the OCT image of the eye.
The training process of the deep learning network model includes: acquiring a large number of eye OCT sample images, each annotated with the lesion area; dividing the sample images into a training sample set, a verification sample set, and a test sample set; and training the deep learning network model with a back propagation algorithm according to the training sample set, the verification sample set, and the test sample set.

During training, a large number of eye OCT sample images annotated with lesion areas need to be acquired. Exemplarily, on the basis of the original sample set, the eye OCT sample images in the sample set may be cropped, rotated, or otherwise transformed to generate new sample images and expand the sample set.
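A minimal augmentation sketch along these lines, assuming OpenCV and illustrative rotation/crop parameters (the image and its lesion-area mask must be transformed identically):

```python
import random
import cv2
import numpy as np

def augment(image: np.ndarray, mask: np.ndarray):
    """Return a randomly rotated and cropped copy of a sample image and its lesion mask."""
    h, w = image.shape[:2]
    angle = random.uniform(-15, 15)                       # small random rotation
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    img_r = cv2.warpAffine(image, M, (w, h))
    msk_r = cv2.warpAffine(mask, M, (w, h), flags=cv2.INTER_NEAREST)
    y0, x0 = random.randint(0, h // 10), random.randint(0, w // 10)
    return img_r[y0:h - y0, x0:w - x0], msk_r[y0:h - y0, x0:w - x0]
```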
It should be noted that the process of training the deep learning network model may be implemented locally on the terminal device, or on another device communicatively connected to the terminal device. After the trained deep learning network model is deployed on the terminal device side, or after another device pushes the trained model to the terminal device and it is deployed successfully, the terminal device can segment the lesion area of the acquired eye OCT image to be segmented. It should also be noted that the eye OCT images obtained during lesion area segmentation can be used to enlarge the training sample set; the deep learning network model can then be further optimized on the terminal device or on another device, and the further optimized model deployed to the terminal device to replace the previous one. Optimizing the deep learning network model in this way further improves the adaptability of the solution of the present application.
In the process of training the neural network model, the loss function used may be one of a 0-1 loss function, an absolute value loss function, a logarithmic loss function, an exponential loss function, and a hinge loss function, or a combination of at least two of them.
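For instance, a combination of two of the listed losses might be sketched as follows (the pairing of the logarithmic and absolute value losses and the equal weighting are assumptions for illustration only):

```python
import torch.nn as nn

class CombinedLoss(nn.Module):
    """Weighted sum of a logarithmic (BCE) loss and an absolute value (L1) loss."""
    def __init__(self, w_log: float = 0.5, w_abs: float = 0.5):
        super().__init__()
        self.log_loss = nn.BCELoss()   # logarithmic loss on probabilities in [0, 1]
        self.abs_loss = nn.L1Loss()    # absolute value loss
        self.w_log, self.w_abs = w_log, w_abs

    def forward(self, pred, target):
        return self.w_log * self.log_loss(pred, target) + self.w_abs * self.abs_loss(pred, target)
```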
The deep learning network model may be a deep learning network model based on machine learning technology in artificial intelligence, including but not limited to AlexNet, VGG Net, GoogleNet, ResNet, ResNeXt, R-CNN, YOLO, SqueezeNet, SegNet, GAN, and the like.
Optionally, in a non-limiting example of the present application, as shown in FIG. 3, step S120 includes steps S121 to S123.

S121: Perform feature extraction on the eye OCT image to obtain multiple feature maps of different scales.

S122: Fuse the multiple feature maps of different scales based on an attention mechanism to obtain a fusion result.

S123: Perform region extraction on the fusion result, and determine the bounding box of the lesion area in the eye OCT image.
In this example, as shown in FIG. 4, the deep learning network model includes two cascaded sub-networks: a first sub-network and a second sub-network.

The first sub-network includes a feature extraction network and an attention network. The feature extraction network of the first sub-network extracts multiple feature maps of different scales from the eye OCT image; the attention network of the first sub-network fuses the multiple feature maps of different scales based on the attention mechanism to obtain the fusion result. The second sub-network performs region extraction on the fusion result and determines the bounding box of the lesion area in the eye OCT image.

As shown in FIG. 5, the feature extraction network of the first sub-network includes four cascaded downsamplings and four cascaded upsamplings. The four downsamplings are, in order, the first downsampling, the second downsampling, the third downsampling, and the fourth downsampling; the four upsamplings are, in order, the first upsampling, the second upsampling, the third upsampling, and the fourth upsampling. The result of the fourth downsampling serves as the input of the first upsampling; the result of the third downsampling and the result of the first upsampling are concatenated as the input of the second upsampling; the result of the second downsampling and the result of the second upsampling are concatenated as the input of the third upsampling; and the result of the first downsampling and the result of the third upsampling are concatenated as the input of the fourth upsampling. The attention network of the first sub-network includes four attention modules; the results of the first, second, third, and fourth upsamplings obtained by the feature extraction network are each input into one attention module, and the outputs of the attention modules are concatenated to obtain the fusion result.

Exemplarily, the downsampling may be implemented by a convolutional layer and the upsampling by a deconvolutional layer; alternatively, the downsampling may be implemented by a convolutional layer plus a pooling layer, and the upsampling by a deconvolutional layer plus an unpooling layer.
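A PyTorch sketch of the feature extraction network as described is given below; the channel widths, kernel sizes, and the choice of strided convolutions for downsampling are illustrative assumptions rather than values fixed by the present application:

```python
import torch
import torch.nn as nn

def down(cin, cout):  # one downsampling: strided convolution (halves H and W)
    return nn.Sequential(nn.Conv2d(cin, cout, 3, stride=2, padding=1), nn.ReLU(inplace=True))

def up(cin, cout):    # one upsampling: transposed ("deconvolution") layer (doubles H and W)
    return nn.Sequential(nn.ConvTranspose2d(cin, cout, 2, stride=2), nn.ReLU(inplace=True))

class FeatureExtractor(nn.Module):
    """Four cascaded downsamplings and four cascaded upsamplings with skip concatenation."""
    def __init__(self, c: int = 16):
        super().__init__()
        self.d1, self.d2 = down(1, c), down(c, 2 * c)
        self.d3, self.d4 = down(2 * c, 4 * c), down(4 * c, 8 * c)
        self.u1 = up(8 * c, 4 * c)   # input: result of the fourth downsampling
        self.u2 = up(8 * c, 2 * c)   # input: concat(third downsampling, first upsampling)
        self.u3 = up(4 * c, c)       # input: concat(second downsampling, second upsampling)
        self.u4 = up(2 * c, c)       # input: concat(first downsampling, third upsampling)

    def forward(self, x):            # x: (N, 1, H, W) with H, W divisible by 16
        f1 = self.d1(x); f2 = self.d2(f1); f3 = self.d3(f2); f4 = self.d4(f3)
        u1 = self.u1(f4)
        u2 = self.u2(torch.cat([f3, u1], dim=1))
        u3 = self.u3(torch.cat([f2, u2], dim=1))
        u4 = self.u4(torch.cat([f1, u3], dim=1))
        return [u1, u2, u3, u4]      # four feature maps of different scales
```

Each successive upsampling result here has twice the spatial resolution of the previous one, which yields the four different scales referred to above.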
Exemplarily, as shown in FIG. 6, the attention module includes one global pooling layer and one convolutional layer, the convolutional layer being provided with batch normalization (BN) and a softmax function. When the attention module scores the features of each layer, the softmax normalizes the scores so that the scores corresponding to each input feature sum to 1.
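One plausible realization of such an attention module, under the assumption that the softmax scores are computed per channel and used to reweight the input feature map:

```python
import torch
import torch.nn as nn

class AttentionModule(nn.Module):
    """Global pooling + 1x1 convolution with BN; softmax makes the scores sum to 1."""
    def __init__(self, channels: int):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)               # global pooling -> (N, C, 1, 1)
        self.conv = nn.Conv2d(channels, channels, kernel_size=1)
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x):
        s = self.bn(self.conv(self.pool(x)))              # one raw score per channel
        s = torch.softmax(s.flatten(1), dim=1).view_as(s) # normalized scores sum to 1
        return x * s                                      # upweight important features
```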
By fusing the deep and shallow features of the eye OCT image in the feature extraction network, the accuracy of feature extraction with the model is greatly improved, which in turn improves the accuracy of the subsequent segmentation result. In addition, an attention module is added after each upsampling; the attention module increases the weight of relatively important features, further improving the accuracy of feature extraction.

The second sub-network may be an original MASK R-CNN model or Faster R-CNN model with its feature map extraction module removed; that is, in this example of the present application, the first sub-network replaces the feature extraction module, i.e. the CNN module, of the original MASK R-CNN model or Faster R-CNN model. The lesion area in the fusion result is marked by the second sub-network.

As an example of the present application, the original MASK R-CNN model with the feature extraction module removed, i.e. the second sub-network, is connected to the fusion result output by the attention network of the first sub-network, so that the output parameters of the attention network can serve as the input parameters of the second sub-network, thereby adding an attention mechanism to the MASK R-CNN model.

Since the MASK R-CNN model itself has relatively stable performance with relatively high generalization and accuracy, and since the attention mechanism is added in the embodiments of the present application, the MASK R-CNN model with the attention mechanism can better express lesions of different categories and sizes. Using the MASK R-CNN model with the attention mechanism therefore improves the accuracy of lesion area recognition and detection, and is especially beneficial to the detection of small target lesion areas.

It should be noted that, after entering the fully connected layer of the MASK R-CNN model, the category and the boundary of the lesion area to be segmented can also be classified, identified, and regressed for positioning based on a preset category loss function and a preset bounding box loss function. The categories of the lesion area may be set to include four classes: intraretinal fluid, subretinal fluid, subretinal hyper-reflective material, and pigment epithelial detachment.
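For illustration only, the following sketch wires a backbone emitting the fusion result into torchvision's Mask R-CNN implementation; the stand-in backbone, channel count, anchor sizes, and the "+1 background class" convention are all assumptions, not the method of the present application:

```python
import torch.nn as nn
from torchvision.models.detection import MaskRCNN
from torchvision.models.detection.anchor_utils import AnchorGenerator
from torchvision.ops import MultiScaleRoIAlign

class FusionBackbone(nn.Module):
    """Stand-in for the first sub-network (feature extraction + attention network)."""
    def __init__(self, out_channels: int = 64):
        super().__init__()
        self.out_channels = out_channels              # contract expected by MaskRCNN
        self.body = nn.Conv2d(3, out_channels, 3, padding=1)

    def forward(self, x):
        return self.body(x)                           # fusion result fed to the RCNN heads

anchors = AnchorGenerator(sizes=((32, 64, 128, 256),), aspect_ratios=((0.5, 1.0, 2.0),))
roi_pool = MultiScaleRoIAlign(featmap_names=["0"], output_size=7, sampling_ratio=2)
mask_pool = MultiScaleRoIAlign(featmap_names=["0"], output_size=14, sampling_ratio=2)

# 4 lesion categories (IRF, SRF, SHRM, PED) plus one background class
model = MaskRCNN(FusionBackbone(), num_classes=5,
                 rpn_anchor_generator=anchors,
                 box_roi_pool=roi_pool, mask_roi_pool=mask_pool)
```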
It can be understood that the deep learning network model described here is only exemplary and should not be construed as a specific limitation on the invention.

S130: Perform edge extraction on the lesion area in the bounding box to obtain the segmentation result of the lesion area.

In the embodiments of the present application, the bounding box of the lesion area is detected in step S120, and refined segmentation is then performed on the basis of this bounding box. That is, the initial bounding box determined in step S120 is a coarse positioning area, such as the image area enclosed by bounding box A in FIG. 2. In step S130, edge extraction is performed on the lesion area within the coarse positioning area to obtain the segmentation result of the lesion area.

FIG. 7 is a schematic diagram of performing edge extraction on the lesion area within bounding box A of the lesion area in the eye OCT image.

As a non-limiting example of the present application, as shown in FIG. 8, step S130 includes steps S131 to S134.
S131: Obtain a horizontal convolution factor and a vertical convolution factor.

In this example of the present application, the horizontal and vertical convolution factors may be preset in the system, may be adjusted by the user as required, or may be set as the system default after user adjustment. This example of the present application does not specifically limit these two convolution factors.

For example, the convolution factors may be Sobel convolution factors, Prewitt convolution factors, Roberts convolution factors, and so on.
Exemplarily, the system presets Sobel convolution factors. The horizontal convolution factor of the Sobel operator is the standard kernel:

S_x = \begin{bmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{bmatrix}
The vertical convolution factor of the Sobel operator is:

S_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ +1 & +2 & +1 \end{bmatrix}
S132: Perform convolution calculation on the region image enclosed by the bounding box with the horizontal convolution factor to obtain a horizontal gradient, and perform convolution calculation on the region image enclosed by the bounding box with the vertical convolution factor to obtain a vertical gradient.

Specifically, the horizontal convolution factor and the vertical convolution factor are each convolved with the region image enclosed by the bounding box, yielding the horizontal gradient and the vertical gradient, respectively.
Exemplarily, if the system presets the Sobel convolution factors and the region image enclosed by the bounding box is denoted FA, then Gx and Gy denote the image gray values after horizontal and vertical edge detection; that is, Gx denotes the horizontal gradient and Gy the vertical gradient, calculated (in the standard Sobel form) as:

G_x = S_x * FA, G_y = S_y * FA

where * denotes two-dimensional convolution.

It should be noted that the horizontal gradient and the vertical gradient are calculated here for each pixel (x, y) in the region image.
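Step S132 can be sketched as follows, assuming NumPy/SciPy as dependencies:

```python
import numpy as np
from scipy.signal import convolve2d

SX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float32)  # horizontal factor
SY = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=np.float32)  # vertical factor

def gradients(fa: np.ndarray):
    """Return (Gx, Gy) for the region image FA enclosed by the bounding box."""
    gx = convolve2d(fa, SX, mode="same", boundary="symm")  # horizontal gradient per pixel
    gy = convolve2d(fa, SY, mode="same", boundary="symm")  # vertical gradient per pixel
    return gx, gy
```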
S133: Determine the edge of the lesion area in the bounding box according to the horizontal gradient and the vertical gradient.

Specifically, the edge of the lesion area in the bounding box of the eye OCT image is determined from the calculated horizontal gradient and vertical gradient.

Optionally, step S133 includes:
    summing the absolute value of the horizontal gradient and the absolute value of the vertical gradient, and determining the edge of the lesion area in the bounding box based on the sum; or

    averaging the absolute value of the horizontal gradient and the absolute value of the vertical gradient, and determining the edge of the lesion area in the bounding box based on the average; or

    taking the root mean square of the horizontal gradient and the vertical gradient, and determining the edge of the lesion area in the bounding box based on the root mean square; or

    taking the sum of squares of the horizontal gradient and the vertical gradient, and determining the edge of the lesion area in the bounding box based on the sum of squares.
As an example, the sum of the absolute values of the horizontal and vertical gradients is calculated, and the edge of the lesion area in the bounding box of the eye OCT image is determined based on this sum: when the sum exceeds a first preset threshold SHR1, that is, when |Gx| + |Gy| > SHR1, the pixel (x, y) is an edge point.

As another example, the average of the absolute values of the horizontal and vertical gradients is calculated, and the edge is determined based on this average: when the average exceeds a second preset threshold SHR2, that is, when (|Gx| + |Gy|)/2 > SHR2, the pixel (x, y) is an edge point.

As another example, the root mean square of the horizontal and vertical gradients is calculated, and the edge is determined based on it: when the root mean square exceeds a third preset threshold SHR3, that is, when (Gx^2 + Gy^2)^(1/2) > SHR3, the pixel (x, y) is an edge point.

As another example, the sum of squares of the horizontal and vertical gradients is calculated, and the edge is determined based on it: when the sum of squares exceeds a fourth preset threshold SHR4, that is, when Gx^2 + Gy^2 > SHR4, the pixel (x, y) is an edge point.

It should be noted that the first preset threshold is set for the sum of absolute values, the second for the average of absolute values, the third for the root mean square, and the fourth for the sum of squares. These four preset thresholds are empirical values; they may be preset in the system, adjusted by the user as required, or set as the system default after user adjustment. The present application does not specifically limit the values of these four thresholds.
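The four alternatives of step S133 can be sketched in one function; the default threshold below is a placeholder, since, as noted above, the thresholds are empirical settings:

```python
import numpy as np

def edge_points(gx: np.ndarray, gy: np.ndarray, mode: str = "sum", thr: float = 100.0) -> np.ndarray:
    """Return a boolean edge map from the horizontal/vertical gradients."""
    if mode == "sum":            # |Gx| + |Gy| > SHR1
        stat = np.abs(gx) + np.abs(gy)
    elif mode == "mean":         # (|Gx| + |Gy|) / 2 > SHR2
        stat = (np.abs(gx) + np.abs(gy)) / 2
    elif mode == "rms":          # (Gx^2 + Gy^2)^(1/2) > SHR3
        stat = np.sqrt(gx ** 2 + gy ** 2)
    else:                        # Gx^2 + Gy^2 > SHR4
        stat = gx ** 2 + gy ** 2
    return stat > thr            # True marks an edge point (x, y)
```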
S134: Obtain the segmentation result of the lesion area based on the determined edge.

The edge points of the lesion area are determined in step S133; the pixel-connected regions enclosed by the edge points constitute the segmentation result of the lesion area. It should be noted that the segmentation result may include more than one pixel-connected region; the number of pixel-connected regions depends on how many regions the detected edge points enclose. Continuing with FIG. 7, the segmentation result there includes multiple pixel-connected regions.
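A sketch of step S134, assuming scipy.ndimage as one way to fill and enumerate the regions enclosed by the edge points:

```python
import numpy as np
from scipy import ndimage

def segmentation_result(edge_map: np.ndarray):
    """Label the pixel-connected regions enclosed by the detected edge points."""
    filled = ndimage.binary_fill_holes(edge_map)   # close regions bounded by edges
    labels, n_regions = ndimage.label(filled)      # one label per connected region
    return labels, n_regions
```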
In the embodiments of the present application, the bounding box of the lesion area in the eye OCT image is determined first, and edge extraction is then performed on the lesion area in the bounding box to obtain the segmentation result of the lesion area of the eye OCT image. On the one hand, determining the lesion area bounding box first and then performing edge extraction on the image area within it achieves more accurate segmentation of the lesion area; on the other hand, since edge extraction is performed only on the image area within the bounding box, segmentation efficiency is improved, the amount of data processing is reduced, and system resource occupation is reduced.

It should be understood that the sequence numbers of the steps in the above embodiments do not imply an execution order; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
Corresponding to the method for segmenting the lesion area of an eye OCT image described in the above embodiments, FIG. 9 shows a structural block diagram of the apparatus for segmenting the lesion area of an eye OCT image provided in an embodiment of the present application. For ease of description, only the parts related to the embodiments of the present application are shown.

Referring to FIG. 9, the apparatus includes:

an acquisition module 91, configured to acquire an eye OCT image to be segmented;

a detection module 92, configured to detect the eye OCT image and determine the bounding box of the lesion area in the eye OCT image;

an extraction module 93, configured to perform edge extraction on the lesion area in the bounding box to obtain the segmentation result of the lesion area.
Optionally, the detection module 92 is specifically configured to:

    perform feature extraction on the eye OCT image to obtain multiple feature maps of different scales;

    fuse the multiple feature maps of different scales based on the attention mechanism to obtain a fusion result;

    perform region extraction on the fusion result, and determine the bounding box of the lesion area in the eye OCT image.

Optionally, performing feature extraction on the eye OCT image to obtain multiple feature maps of different scales, and fusing the multiple feature maps of different scales based on the attention mechanism to obtain the fusion result, includes:

    performing feature extraction on the eye OCT image with a feature extraction network to obtain multiple feature maps of different scales of the eye OCT image;

    inputting each feature map of a different scale into one attention module, and concatenating the outputs of all the attention modules to obtain the fusion result.

Optionally, the feature extraction network includes multiple cascaded downsamplings and multiple cascaded upsamplings, and the results of the multiple upsamplings are multiple feature maps of different scales.

Optionally, the feature extraction network includes four cascaded downsamplings and four cascaded upsamplings. The four downsamplings are, in order, the first downsampling, the second downsampling, the third downsampling, and the fourth downsampling; the four upsamplings are, in order, the first upsampling, the second upsampling, the third upsampling, and the fourth upsampling. The result of the fourth downsampling serves as the input of the first upsampling; the result of the third downsampling and the result of the first upsampling are concatenated as the input of the second upsampling; the result of the second downsampling and the result of the second upsampling are concatenated as the input of the third upsampling; and the result of the first downsampling and the result of the third upsampling are concatenated as the input of the fourth upsampling. The results of the first, second, third, and fourth upsamplings are four feature maps of different scales.
Optionally, the extraction module 93 is specifically configured to:

    obtain a horizontal convolution factor and a vertical convolution factor;

    perform convolution calculation on the region image enclosed by the bounding box with the horizontal convolution factor to obtain a horizontal gradient, and perform convolution calculation on the region image enclosed by the bounding box with the vertical convolution factor to obtain a vertical gradient;

    determine the edge of the lesion area in the bounding box according to the horizontal gradient and the vertical gradient;

    obtain the segmentation result of the lesion area based on the determined edge.

Optionally, determining the edge of the lesion area in the bounding box according to the horizontal gradient and the vertical gradient includes:

    summing the absolute value of the horizontal gradient and the absolute value of the vertical gradient, and determining the edge of the lesion area in the bounding box based on the sum; or

    averaging the absolute value of the horizontal gradient and the absolute value of the vertical gradient, and determining the edge of the lesion area in the bounding box based on the average; or

    taking the root mean square of the horizontal gradient and the vertical gradient, and determining the edge of the lesion area in the bounding box based on the root mean square; or

    taking the sum of squares of the horizontal gradient and the vertical gradient, and determining the edge of the lesion area in the bounding box based on the sum of squares.
It should be noted that, since the information exchange and execution processes between the above modules/units are based on the same concept as the method embodiments of the present application, their specific functions and the technical effects they bring can be found in the method embodiment section and will not be repeated here.

Those skilled in the art can clearly understand that, for convenience and brevity of description, only the division of the above functional units and modules is used as an example. In practical applications, the above functions may be assigned to different functional units and modules as required; that is, the internal structure of the apparatus may be divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit; the above integrated units may be implemented in the form of hardware or in the form of software functional units. In addition, the specific names of the functional units and modules are only used to facilitate distinguishing them from each other and are not used to limit the protection scope of the present application. For the specific working processes of the units and modules in the above system, reference may be made to the corresponding processes in the foregoing method embodiments, which will not be repeated here.
FIG. 10 is a schematic structural diagram of a terminal device provided by an embodiment of the present application. As shown in FIG. 10, the terminal device 10 of this embodiment includes: at least one processor 100 (only one is shown in FIG. 10), a memory 101, and a computer program 102 stored in the memory 101 and executable on the at least one processor 100. When executing the computer program 102, the processor 100 implements the steps in the above method embodiments, for example, steps S110 to S130 shown in FIG. 1.

The terminal device may include, but is not limited to, the processor 100 and the memory 101. Those skilled in the art can understand that FIG. 10 is only an example of the terminal device 10 and does not constitute a limitation on it; the terminal device may include more or fewer components than shown, a combination of certain components, or different components. For example, the terminal device may also include input/output devices, network access devices, buses, and the like.

The processor 100 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.

The memory 101 may be an internal storage unit of the terminal device 10, such as a hard disk or memory of the terminal device 10. The memory 101 may also be an external storage device of the terminal device 10, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the terminal device 10. Further, the memory 101 may include both an internal storage unit and an external storage device of the terminal device 10. The memory 101 is used to store the computer program and other programs and data required by the terminal device 10, and may also be used to temporarily store data that has been output or will be output.
An embodiment of the present application also provides a computer-readable storage medium, which may be non-volatile or volatile. The computer-readable storage medium stores a computer program; when the computer program is executed by a processor, the steps in the above method embodiments are implemented.

An embodiment of the present application provides a computer program product; when the computer program product runs on a mobile terminal, the mobile terminal is caused to implement the steps in the above method embodiments.

If the integrated module/unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments of the present application may be completed by instructing relevant hardware through a computer program. The computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the above method embodiments. The computer program includes computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include at least: any entity or apparatus capable of carrying the computer program code to the photographing apparatus/terminal device, a recording medium, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electric carrier signal, a telecommunication signal, and a software distribution medium, for example, a USB flash drive, a removable hard disk, a magnetic disk, or an optical disc. In some jurisdictions, according to legislation and patent practice, computer-readable media may not be electric carrier signals or telecommunication signals.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not detailed or described in one embodiment, reference may be made to the related descriptions of other embodiments.

Those of ordinary skill in the art may realize that the units and algorithm steps of the examples described in combination with the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of the present application.

In the embodiments provided in the present application, it should be understood that the disclosed terminal device and method may be implemented in other ways. For example, the terminal device embodiments described above are merely illustrative. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be in electrical, mechanical, or other forms.

The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

The above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of the technical features may be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and shall all be included within the protection scope of the present application.

Claims (20)

  1. A method for segmenting a lesion area of an eye OCT image, comprising:

    acquiring an eye OCT image to be segmented;

    detecting the eye OCT image, and determining a bounding box of a lesion area in the eye OCT image;

    performing edge extraction on the lesion area in the bounding box to obtain a segmentation result of the lesion area.
  2. The segmentation method according to claim 1, wherein the detecting the eye OCT image and determining the bounding box of the lesion area in the eye OCT image comprises:

    performing feature extraction on the eye OCT image to obtain multiple feature maps of different scales;

    fusing the multiple feature maps of different scales based on an attention mechanism to obtain a fusion result;

    performing region extraction on the fusion result, and determining the bounding box of the lesion area in the eye OCT image.
  3. The segmentation method according to claim 2, wherein the performing feature extraction on the eye OCT image to obtain multiple feature maps of different scales, and fusing the multiple feature maps of different scales based on the attention mechanism to obtain the fusion result, comprises:

    performing feature extraction on the eye OCT image with a feature extraction network to obtain multiple feature maps of different scales of the eye OCT image;

    inputting each feature map of a different scale into one attention module, and concatenating the outputs of all the attention modules to obtain the fusion result.
  4. The segmentation method according to claim 3, wherein the feature extraction network comprises multiple cascaded downsamplings and multiple cascaded upsamplings, and the results of the multiple upsamplings are multiple feature maps of different scales.

  5. The segmentation method according to claim 4, wherein the feature extraction network comprises four cascaded downsamplings and four cascaded upsamplings; the four downsamplings are, in order, a first downsampling, a second downsampling, a third downsampling, and a fourth downsampling; the four upsamplings are, in order, a first upsampling, a second upsampling, a third upsampling, and a fourth upsampling; the result of the fourth downsampling serves as the input of the first upsampling; the result of the third downsampling and the result of the first upsampling are concatenated as the input of the second upsampling; the result of the second downsampling and the result of the second upsampling are concatenated as the input of the third upsampling; the result of the first downsampling and the result of the third upsampling are concatenated as the input of the fourth upsampling; and the results of the first, second, third, and fourth upsamplings are four feature maps of different scales.
  6. The segmentation method according to claim 2, wherein the performing edge extraction on the lesion area in the bounding box to obtain the segmentation result of the lesion area comprises:

    obtaining a horizontal convolution factor and a vertical convolution factor;

    performing convolution calculation on a region image enclosed by the bounding box with the horizontal convolution factor to obtain a horizontal gradient, and performing convolution calculation on the region image enclosed by the bounding box with the vertical convolution factor to obtain a vertical gradient;

    determining an edge of the lesion area in the bounding box according to the horizontal gradient and the vertical gradient;

    obtaining the segmentation result of the lesion area based on the determined edge.
  7. The segmentation method according to claim 6, wherein the determining the edge of the lesion area in the bounding box according to the horizontal gradient and the vertical gradient comprises:

    summing the absolute value of the horizontal gradient and the absolute value of the vertical gradient, and determining the edge of the lesion area in the bounding box based on the sum; or

    averaging the absolute value of the horizontal gradient and the absolute value of the vertical gradient, and determining the edge of the lesion area in the bounding box based on the average; or

    taking the root mean square of the horizontal gradient and the vertical gradient, and determining the edge of the lesion area in the bounding box based on the root mean square; or

    taking the sum of squares of the horizontal gradient and the vertical gradient, and determining the edge of the lesion area in the bounding box based on the sum of squares.
  8. An apparatus for segmenting a lesion area of an eye OCT image, comprising:

    an acquisition module, configured to acquire an eye OCT image to be segmented;

    a detection module, configured to detect the eye OCT image and determine a bounding box of a lesion area in the eye OCT image;

    an extraction module, configured to perform edge extraction on the lesion area in the bounding box to obtain a segmentation result of the lesion area.
  9. The segmentation apparatus according to claim 8, wherein the detection module is specifically configured to:

    perform feature extraction on the eye OCT image to obtain multiple feature maps of different scales;

    fuse the multiple feature maps of different scales based on an attention mechanism to obtain a fusion result;

    perform region extraction on the fusion result, and determine the bounding box of the lesion area in the eye OCT image.
  10. The segmentation apparatus according to claim 9, wherein the performing feature extraction on the eye OCT image to obtain multiple feature maps of different scales, and fusing the multiple feature maps of different scales based on the attention mechanism to obtain the fusion result, comprises:

    performing feature extraction on the eye OCT image with a feature extraction network to obtain multiple feature maps of different scales of the eye OCT image;

    inputting each feature map of a different scale into one attention module, and concatenating the outputs of all the attention modules to obtain the fusion result.
  11. The segmentation apparatus according to claim 10, wherein the feature extraction network comprises multiple cascaded downsamplings and multiple cascaded upsamplings, and the results of the multiple upsamplings are multiple feature maps of different scales.

  12. The segmentation apparatus according to claim 11, wherein the feature extraction network comprises four cascaded downsamplings and four cascaded upsamplings; the four downsamplings are, in order, a first downsampling, a second downsampling, a third downsampling, and a fourth downsampling; the four upsamplings are, in order, a first upsampling, a second upsampling, a third upsampling, and a fourth upsampling; the result of the fourth downsampling serves as the input of the first upsampling; the result of the third downsampling and the result of the first upsampling are concatenated as the input of the second upsampling; the result of the second downsampling and the result of the second upsampling are concatenated as the input of the third upsampling; the result of the first downsampling and the result of the third upsampling are concatenated as the input of the fourth upsampling; and the results of the first, second, third, and fourth upsamplings are four feature maps of different scales.
  13. The segmentation apparatus according to claim 9, wherein the extraction module is specifically configured to:

    obtain a horizontal convolution factor and a vertical convolution factor;

    perform convolution calculation on a region image enclosed by the bounding box with the horizontal convolution factor to obtain a horizontal gradient, and perform convolution calculation on the region image enclosed by the bounding box with the vertical convolution factor to obtain a vertical gradient;

    determine an edge of the lesion area in the bounding box according to the horizontal gradient and the vertical gradient;

    obtain the segmentation result of the lesion area based on the determined edge.
  14. The segmentation apparatus according to claim 13, wherein the determining the edge of the lesion area in the bounding box according to the horizontal gradient and the vertical gradient comprises:

    summing the absolute value of the horizontal gradient and the absolute value of the vertical gradient, and determining the edge of the lesion area in the bounding box based on the sum; or

    averaging the absolute value of the horizontal gradient and the absolute value of the vertical gradient, and determining the edge of the lesion area in the bounding box based on the average; or

    taking the root mean square of the horizontal gradient and the vertical gradient, and determining the edge of the lesion area in the bounding box based on the root mean square; or

    taking the sum of squares of the horizontal gradient and the vertical gradient, and determining the edge of the lesion area in the bounding box based on the sum of squares.
  15. A terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the following steps:

    acquiring an eye OCT image to be segmented;

    detecting the eye OCT image, and determining a bounding box of a lesion area in the eye OCT image;

    performing edge extraction on the lesion area in the bounding box to obtain a segmentation result of the lesion area.
  16. The terminal device according to claim 15, wherein the detecting the eye OCT image and determining the bounding box of the lesion area in the eye OCT image comprises:

    performing feature extraction on the eye OCT image to obtain multiple feature maps of different scales;

    fusing the multiple feature maps of different scales based on an attention mechanism to obtain a fusion result;

    performing region extraction on the fusion result, and determining the bounding box of the lesion area in the eye OCT image.
  17. The terminal device according to claim 16, wherein the performing feature extraction on the eye OCT image to obtain multiple feature maps of different scales, and fusing the multiple feature maps of different scales based on the attention mechanism to obtain the fusion result, comprises:

    performing feature extraction on the eye OCT image with a feature extraction network to obtain multiple feature maps of different scales of the eye OCT image;

    inputting each feature map of a different scale into one attention module, and concatenating the outputs of all the attention modules to obtain the fusion result.
  18. The terminal device according to claim 17, wherein the feature extraction network comprises multiple cascaded downsamplings and multiple cascaded upsamplings, and the results of the multiple upsamplings are multiple feature maps of different scales.

  19. The terminal device according to claim 18, wherein the feature extraction network comprises four cascaded downsamplings and four cascaded upsamplings; the four downsamplings are, in order, a first downsampling, a second downsampling, a third downsampling, and a fourth downsampling; the four upsamplings are, in order, a first upsampling, a second upsampling, a third upsampling, and a fourth upsampling; the result of the fourth downsampling serves as the input of the first upsampling; the result of the third downsampling and the result of the first upsampling are concatenated as the input of the second upsampling; the result of the second downsampling and the result of the second upsampling are concatenated as the input of the third upsampling; the result of the first downsampling and the result of the third upsampling are concatenated as the input of the fourth upsampling; and the results of the first, second, third, and fourth upsamplings are four feature maps of different scales.
  20. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1 to 7.

Applications Claiming Priority (2)

Application Number: CN201911043286.0 (filed as CN201911043286.0A; granted as CN110889826B) · Priority Date: 2019-10-30 · Filing Date: 2019-10-30 · Title: Eye OCT image focus region segmentation method, device and terminal equipment

Publications (1)

Publication Number Publication Date
WO2021082691A1 true WO2021082691A1 (en) 2021-05-06

Family

ID=69746571

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/111734 WO2021082691A1 (en) 2019-10-30 2020-08-27 Segmentation method and apparatus for lesion area of eye oct image, and terminal device

Country Status (2)

Country Link
CN (1) CN110889826B (en)
WO (1) WO2021082691A1 (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110889826B (en) * 2019-10-30 2024-04-19 平安科技(深圳)有限公司 Eye OCT image focus region segmentation method, device and terminal equipment
CN111311565A (en) * 2020-02-11 2020-06-19 平安科技(深圳)有限公司 Eye OCT image-based detection method and device for positioning points of optic cups and optic discs
CN112053348A (en) * 2020-09-03 2020-12-08 宁波市眼科医院 Eye ground image processing system and method for cataract diagnosis
CN112365973B (en) * 2020-11-02 2022-04-19 太原理工大学 Pulmonary nodule auxiliary diagnosis system based on countermeasure network and fast R-CNN
CN113140291B (en) * 2020-12-17 2022-05-10 慧影医疗科技(北京)股份有限公司 Image segmentation method and device, model training method and electronic equipment
CN112734787B (en) * 2020-12-31 2022-07-15 山东大学 Ophthalmological SD-OCT high-reflection point segmentation method based on image decomposition and implementation system
CN113158821B (en) * 2021-03-29 2024-04-12 中国科学院深圳先进技术研究院 Method and device for processing eye detection data based on multiple modes and terminal equipment
CN112906658A (en) * 2021-03-30 2021-06-04 航天时代飞鸿技术有限公司 Lightweight automatic detection method for ground target investigation by unmanned aerial vehicle
CN113520317A (en) * 2021-07-05 2021-10-22 汤姆飞思(香港)有限公司 OCT-based endometrial detection and analysis method, device, equipment and storage medium
CN113570556A (en) * 2021-07-08 2021-10-29 北京大学第三医院(北京大学第三临床医学院) Method and device for grading eye dyeing image
CN115187579B (en) * 2022-08-11 2023-05-02 北京医准智能科技有限公司 Image category judging method and device and electronic equipment

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100220898A1 (en) * 2009-03-02 2010-09-02 Honeywell International Inc. Feature-based method and system for blur estimation in eye images
CN109493954A (en) * 2018-12-20 2019-03-19 广东工业大学 A kind of SD-OCT image retinopathy detection system differentiating positioning based on classification
CN109829894A (en) * 2019-01-09 2019-05-31 平安科技(深圳)有限公司 Parted pattern training method, OCT image dividing method, device, equipment and medium
CN109919954A (en) * 2019-03-08 2019-06-21 广州视源电子科技股份有限公司 target object identification method and device
CN110033456A (en) * 2019-03-07 2019-07-19 腾讯科技(深圳)有限公司 A kind of processing method of medical imaging, device, equipment and system
CN110120047A (en) * 2019-04-04 2019-08-13 平安科技(深圳)有限公司 Image Segmentation Model training method, image partition method, device, equipment and medium
CN110148111A (en) * 2019-04-01 2019-08-20 江西比格威医疗科技有限公司 The automatic testing method of a variety of retina lesions in a kind of retina OCT image
CN110378877A (en) * 2019-06-24 2019-10-25 南京理工大学 SD-OCT image CNV lesion detection method based on depth convolutional network model
CN110889826A (en) * 2019-10-30 2020-03-17 平安科技(深圳)有限公司 Segmentation method and device for eye OCT image focal region and terminal equipment

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10285581B2 (en) * 2015-12-02 2019-05-14 Nidek Co., Ltd. Ophthalmologic information processing apparatus and ophthalmologic information processing method
CN108229531B (en) * 2017-09-29 2021-02-26 北京市商汤科技开发有限公司 Object feature extraction method and device, storage medium and electronic equipment
CN108198185B (en) * 2017-11-20 2020-10-16 海纳医信(北京)软件科技有限责任公司 Segmentation method and device for fundus focus image, storage medium and processor
CN110097559B (en) * 2019-04-29 2024-02-23 李洪刚 Fundus image focus region labeling method based on deep learning

Also Published As

Publication number Publication date
CN110889826B (en) 2024-04-19
CN110889826A (en) 2020-03-17

Similar Documents

Publication Title
WO2021082691A1 (en) Segmentation method and apparatus for lesion area of eye OCT image, and terminal device
WO2020199593A1 (en) Image segmentation model training method and apparatus, image segmentation method and apparatus, and device and medium
Sopharak et al. Simple hybrid method for fine microaneurysm detection from non-dilated diabetic retinopathy retinal images
CN111008984B (en) Automatic contour line drawing method for normal organs in medical images
CN107665491A (en) Pathological image recognition method and system
Kolar et al. Hybrid retinal image registration using phase correlation
WO2022088665A1 (en) Lesion segmentation method and apparatus, and storage medium
US8737703B2 (en) Systems and methods for detecting retinal abnormalities
Al-Fahdawi et al. A fully automated cell segmentation and morphometric parameter system for quantifying corneal endothelial cell morphology
Xiao et al. Major automatic diabetic retinopathy screening systems and related core algorithms: a review
WO2021136368A1 (en) Method and apparatus for automatically detecting pectoralis major region in molybdenum target image
CN113768461B (en) Fundus image analysis method, fundus image analysis system and electronic equipment
WO2020001236A1 (en) Method and apparatus for extracting annotation in medical image
Xi et al. Automated segmentation of choroidal neovascularization in optical coherence tomography images using multi-scale convolutional neural networks with structure prior
CN114627067A (en) Wound area measurement and auxiliary diagnosis and treatment method based on image processing
Gao et al. Automatic optic disc segmentation based on modified local image fitting model with shape prior information
Liu et al. Small sample color fundus image quality assessment based on gcForest
CN114782337B (en) OCT image recommendation method, device, equipment and medium based on artificial intelligence
CN114332132A (en) Image segmentation method and device and computer equipment
Singh et al. Optimized convolutional neural network for glaucoma detection with improved optic-cup segmentation
WO2021159643A1 (en) Eye OCT image-based optic cup and optic disc positioning point detection method and apparatus
Faisal et al. Computer assisted diagnostic system in tumor radiography
WO2024037587A1 (en) Palpebral fissure height measurement method and apparatus, and storage medium
Gupta et al. A novel method for automatic retinal detachment detection and estimation using ocular ultrasound image
JP5740403B2 (en) System and method for detecting retinal abnormalities

Legal Events

Code Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application
    Ref document number: 20882081
    Country of ref document: EP
    Kind code of ref document: A1
NENP Non-entry into the national phase
    Ref country code: DE
122 EP: PCT application non-entry in European phase
    Ref document number: 20882081
    Country of ref document: EP
    Kind code of ref document: A1