CN114897831A - Ultra-wide-angle fundus image optic disc extraction method combining target positioning and semantic segmentation - Google Patents

Ultra-wide-angle fundus image optic disc extraction method combining target positioning and semantic segmentation

Info

Publication number
CN114897831A
CN114897831A CN202210521293.2A
Authority
CN
China
Prior art keywords
optic disc
image
ultra-wide-angle
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210521293.2A
Other languages
Chinese (zh)
Inventor
徐光柱
林文杰
陈莎
刘鸣
雷帮军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Three Gorges University CTGU
Original Assignee
China Three Gorges University CTGU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Three Gorges University CTGU
Priority to CN202210521293.2A
Publication of CN114897831A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20092 Interactive image processing based on input by user
    • G06T2207/20104 Interactive definition of region of interest [ROI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30041 Eye; Retina; Ophthalmic

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Medical Informatics (AREA)
  • Eye Examination Apparatus (AREA)

Abstract

The method for extracting the optic disc from ultra-wide-angle fundus images by combining target positioning and semantic segmentation comprises the following steps. Step 1: establish a YOLOv4 model for coarse positioning of the optic disc region. Step 2: extract the optic disc region according to the coarse positioning result and remove the periocular region. Step 3: interactively segment the optic disc region with the Snake active contour model to construct the dataset with optic disc labels required for U²-Net model training. Step 4: use the U²-Net model to extract the optic disc region. The invention not only effectively improves optic disc extraction precision but also yields the optic disc annotation data required by supervised segmentation methods. The optic disc positioning accuracy of the whole method reaches 99.7%, providing stable and reliable input for the subsequent segmentation stage; the segmentation precision exceeds that of the widely used PCNN, U-Net, DeepLabV3 and SegNet models, so the method has good application value.

Description

Ultra-wide-angle fundus image optic disc extraction method combining target positioning and semantic segmentation
Technical Field
The invention relates to the technical field of ultra-wide-angle fundus imaging, in particular to an ultra-wide-angle fundus image optic disc extraction method combining target positioning and semantic segmentation.
Background
Glaucoma is a serious hazard: it irreversibly damages a person's vision and can even lead to blindness. Changes in structures such as the size and shape of the optic disc are closely related to the degree of glaucomatous pathology, and detecting ocular structures such as the optic disc and optic cup allows glaucoma to be found in time. See document [1] Amsa Shabbir, Aqsa Rasheed, Huma Shehraz, et al. Detection of glaucoma using retinal fundus images: a comprehensive review. Mathematical Biosciences and Engineering, 2021, 18(3): 2033-2076.
The number of glaucoma patients in China is huge, but professional ophthalmologists are in severe shortage. Moreover, manually extracting the fundus optic disc region takes a long time and is inefficient, making it difficult to meet clinical demand. Automatic extraction of the fundus optic disc region with AI technology is therefore an effective way to address the shortage of ophthalmologists and to improve early-screening participation among glaucoma patients, and it has attracted wide attention from researchers at home and abroad. Document [2], the Chinese patent "A retinal optic disc segmentation method based on deep learning" (CN113240677A), introduces dilated (atrous) convolution into a U-Net model to segment the optic disc and adds an element-wise subtraction operation to the skip connections: the features sampled at each layer are subtracted from the corresponding lower-layer features and from the upper-layer features brought to the same size by a 4×4 max-pooling operation.
Document [3], Fu Huazhu, Cheng Jun, Xu Yanwu, et al. Joint optic disc and cup segmentation based on multi-label deep network and polar transformation. IEEE Transactions on Medical Imaging, 2018, 37(7): 1597-1605, proposes a deep learning model named M-Net that uses multiple labels to solve joint optic disc and cup segmentation. The network mainly comprises a multi-scale input layer, a U-shaped convolutional network, a side-output layer and a multi-label loss function. The multi-scale input layer builds an image pyramid to obtain receptive fields of multiple sizes. A U-Net model serves as the main network structure to learn multi-scale feature information, and the side-output layer acts as an early classifier that generates local prediction maps for the different scale layers. Finally, a multi-label loss function is used to generate the final segmentation map.
Document [4], the Chinese patent "A cup and optic disc segmentation method based on fundus image dataset transfer learning" (CN112541923B), proposes a transfer-learning-based cup and disc segmentation method targeting the poor generalization of segmentation models caused by large feature differences between fundus image datasets. Through adversarial training of a backbone segmentation network, a feature-domain discriminator and an attention-domain discriminator, the method extracts features shared across fundus image datasets and weights them with an attention module, effectively alleviating the problem of blurred cup and disc boundaries.
Compared with traditional fundus imaging, shown in fig. 1(a), the ultra-wide-angle fundus imaging technique, shown in fig. 1(b), has the advantages of a large field of view, fast imaging, no need for mydriasis and good patient compliance. However, although the resolution is high, the brightness and sharpness of the resulting images are inferior to those of traditional fundus images, so research on automatic interpretation of such images is still at an early stage.
In the area of optic disc extraction from ultra-wide-angle fundus images, dedicated studies are still rare. In document [5], Li Zhongwen, Guo Chong, Nie Danyao, et al. A deep learning system for identifying lattice degeneration and retinal breaks using ultra-widefield fundus images. Annals of Translational Medicine, 2019, 7(22): 618, the optic disc is sent, after certain preprocessing, to a U-Net model trained on a traditional fundus image dataset to realize disc extraction. The reason for this workaround is that optic disc annotation data for ultra-wide-angle fundus images are lacking, and the accuracy suffers as a result.
Disclosure of Invention
The method addresses the problems that, in ultra-wide-angle fundus images, the optic disc is a small target, label data are lacking, illumination is uneven and noise interference is severe, so that optic disc extraction precision is low and an optic disc segmentation model cannot be trained directly. The invention provides an ultra-wide-angle fundus image optic disc extraction method combining target positioning and semantic segmentation. First, the visual target detection network YOLOv4 is trained to achieve coarse positioning of the optic disc region, converting the small-target problem of locating the disc in an ultra-wide-angle fundus image into the detection of a larger candidate region containing the disc; this simplifies the problem and improves detection precision. Second, the Snake algorithm of the active contour model is used to interactively segment disc regions approved by the human eye, obtaining the labels used for subsequent neural network training. Finally, a U²-Net model is trained to accurately extract the optic disc within the candidate region.
The technical scheme adopted by the invention is as follows:
the method for extracting the ultra-wide-angle fundus image optic disk by combining target positioning and semantic segmentation comprises the following steps:
step 1: establishing a YOLOv4 model for coarse positioning of the optic disc region;
step 2: extracting the optic disc region according to the coarse positioning result, and removing the periocular region;
step 3: interactively segmenting the optic disc region with the Snake active contour model to construct the dataset with optic disc labels required for U²-Net model training;
step 4: using the U²-Net model to extract the optic disc region.
In step 2, according to the optic disc positioning result of the YOLOv4 model, a region of interest containing the optic disc is extracted and the periocular region that interferes with optic disc extraction is removed; with the optic disc center located toward the left or right of the cropped image, two 256×256 regions of interest are cropped and covered with a mask.
The step 3 comprises the following steps:
firstly, carrying out gray-scale normalization and Gaussian filtering on the ultra-wide-angle fundus image to remove noise; then manually selecting the coordinate points of an initial contour curve and obtaining a closed contour curve by interpolation; then computing the energy function E, deriving from it its gradient F and the pentadiagonal banded matrix A, and updating each point position on the curve accordingly; finally, filling the blank area inside the closed curve to obtain the optic disc label;
as shown in formulas (1) to (6):

E_snake = ∫E(v(s))ds = E_int(v(s)) + E_ext(v(s)) (1);

wherein: v(s) represents the curve function, E_snake the energy of the Snake curve, E_int the internal energy and E_ext the external energy;

E_int(v(s)) = ∫(|v'(s)|^2 + |v''(s)|^2)ds (2);

wherein: v'(s) is the first derivative and v''(s) the second derivative of the curve function;

E_ext(v(s)) = E_img(v(s)) + E_constraint(v(s)) (3);

wherein: E_img(v(s)) and E_constraint(v(s)) represent the image energy and the constraint energy, respectively;

E_img(v(s)) = -(I_x^2 + I_y^2) (4);

wherein: I is the input image; I_x and I_y are the partial derivatives of the image along the x and y axes, with x and y the coordinate values; the second-order derivatives of the image are I_xx = ∂^2I/∂x^2, I_xy = ∂^2I/∂x∂y and I_yy = ∂^2I/∂y^2;

α·x_ss - β·x_ssss = f_x, α·y_ss - β·y_ssss = f_y (5);

wherein: x_ss and x_ssss are the second- and fourth-order derivatives of the x component of v(s) with respect to s, and y_ss and y_ssss the corresponding derivatives of the y component; α and β are weights;

x_t = (A + γI)^(-1)(γ·x_(t-1) - f_x(x_(t-1), y_(t-1))), y_t = (A + γI)^(-1)(γ·y_(t-1) - f_y(x_(t-1), y_(t-1))) (6);

wherein: f_x and f_y are the partial derivatives of E_snake with respect to the x and y axes, and t is the iteration index; x_t, y_t are the coordinates of the points on the current iteration's curve and x_(t-1), y_(t-1) those of the previous iteration; A is the pentadiagonal banded matrix, I here denotes the identity matrix, and γ is a weight coefficient.
In step 3, before U²-Net model training, the contrast between the optic disc and the background is enhanced by contrast-limited adaptive histogram equalization (CLAHE) to improve the training effect of the model.
The step 4 comprises the following steps:
S4.1: inputting the enhanced pictures into the U²-Net model and extracting the optic disc region;
S4.2: binarizing the output of the U²-Net model to obtain a binary optic disc region image.
The invention relates to an ultra-wide-angle fundus image optic disc extraction method combining target positioning and semantic segmentation, which has the following technical effects:
1) The YOLOv4 network model, which balances speed and detection precision, is used to locate the region of interest containing the optic disc, effectively reducing the difficulty that the small proportion of the disc region within the whole ultra-wide-angle fundus image poses for subsequent segmentation.
2) The invention then crops the located region of interest and uses the Snake algorithm of the active contour model to segment, in an interactive way, the optic disc region recognized by the human eye, which serves as the label for subsequent neural network training. This solves the lack of optic disc region labels in ultra-wide-angle fundus image datasets.
3) After obtaining the optic disc region label data, the invention trains U²-Net; before training, the contrast between the optic disc and the background is enhanced by contrast-limited adaptive histogram equalization (CLAHE) to improve the training effect, and finally the U²-Net output is binarized to obtain the segmented optic disc region.
Drawings
Fig. 1(a) is a conventional fundus image;
fig. 1(b) is an ultra-wide-angle fundus image.
Fig. 2 is a schematic diagram of the ultra-wide-angle image optic disc extraction process.
Fig. 3 is a diagram of the YOLOv4 model structure.
FIG. 4(a) shows optic disc positioning result one for YOLOv4;
FIG. 4(b) shows optic disc positioning result two for YOLOv4.
Fig. 5 is a schematic diagram of extracting a region of interest containing an optic disc.
Fig. 6 is a flow chart of optic disc segmentation with the Snake model.
Fig. 7(a) is an original image before image enhancement by the CLAHE method;
fig. 7(b) is a diagram after image enhancement by the CLAHE method.
FIG. 8 is the U²-Net network architecture diagram.
Fig. 9 is a flowchart of disc region extraction.
Fig. 10 is a comparison of optic disc region extraction results on ultra-wide-angle fundus images, in which:
column a contains the original images;
column b contains the label maps;
column c contains the PCNN model optic disc extraction results;
column d contains the U-Net model optic disc extraction results;
column e contains the DeepLabv3 model optic disc extraction results;
column f contains the SegNet model optic disc extraction results;
column g contains the optic disc extraction results of the method of the invention.
FIG. 11 is a diagram illustrating the quantitative comparison of the disk region extraction results of the method of the present invention with other methods.
Fig. 12 is a Snake evolution diagram.
Detailed Description
The method addresses the problems that, in ultra-wide-angle fundus images, the optic disc region is small and lacks annotation information, illumination is uneven and noise interference is severe, so that extraction precision is low and an optic disc segmentation model cannot be trained directly. The invention provides an ultra-wide-angle fundus image optic disc region extraction algorithm combining a target detection model and a semantic segmentation model. Because the optic disc region is a small target relative to the whole ultra-wide-angle fundus image and cannot be fed directly to a semantic segmentation network for extraction, the method first uses the YOLOv4 network to coarsely position the optic disc, reaching a positioning accuracy of 99.7%, and extracts the optic disc region of interest. Then, since ultra-wide-angle fundus images lack optic disc annotation information, the Snake algorithm of the active contour model is used to segment the optic disc region interactively, constructing the optic-disc-labeled dataset required by subsequent neural network training. Finally, to address the low precision of optic disc region extraction in ultra-wide-angle fundus images, a U²-Net model extracts the optic disc accurately. The optic disc region extraction performance on ultra-wide-angle fundus images is superior to other current optic disc extraction methods, such as:
the scheme is described in the literature [8] Xuguan column, Wang Yao, Hu Song, etc.. PCNN and form matching enhancement combined retina blood vessel segmentation, photoelectric engineering, 2019,46(04): 74-85;
the protocol described in the literature [9] Ronneberger Olaf, Fischer Philipp, Brox Thomas. U-net: volumetric networks for biological image segmentation. International Conference on Medical image computing and computer-assisted interaction 2015: 234-241;
the protocol described in the document [10] Chen Liang-Chieh, Papandrou George, Schroff Florian, et al.
The protocol described in the document [11] Badrinarayanan Vijay, Kendall Alex, Cipola Robert.Segnet: A deep connected encoder-decoder architecture for image segmentation. IEEE transactions on pattern analysis and machine understanding, 2017,39(12):2481 and 2495.
The method for extracting the ultra-wide-angle fundus image optic disc by combining target positioning and semantic segmentation comprises the following steps:
the method comprises the following steps: because the optic disc area is a small target relative to the whole ultra-wide angle fundus image, the optic disc area cannot be directly sent to a semantic segmentation network for extraction, and in addition, the ultra-wide angle fundus image has poor quality and comprises eye peripheral areas such as eyelids and eyelashes which are not beneficial to optic disc area extraction. The invention aims at the problem of coarse positioning of the optic disc area by using a single-stage target detection model YOLOv4, and the structure of the invention is shown in FIG. 3
The rough positioning of the optic disc area by YOLOv4 specifically includes:
the Yolov4 model takes a CSPDarknet53 module as a backbone network, is used for extracting deep semantic features of an optic disc in an ultra-wide angle fundus image, then fuses multi-scale features of the optic disc by using a space pyramid pooling and path aggregation network, and finally obtains optic disc position information by using a multi-scale output module. It can be seen from fig. 4(a) and 4(b) that the YOLOv4 model can accurately locate the optic disc position with a confidence of over 90%.
Step two: according to the optic disc positioning result of the YOLOv4 model, the region of interest containing the optic disc is extracted and the periocular region that interferes with optic disc region extraction is removed. As shown in fig. 5, the optic disc center lies toward the left or right of the cropped image, and two 256×256 regions of interest are cropped and covered with a mask.
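A minimal sketch of this cropping step, assuming the (x, y, w, h) box produced by a detector such as the one above; the patent does not state the horizontal offsets, so placing the disc center at roughly one quarter and three quarters of the window width is an illustrative choice.

```python
import cv2
import numpy as np

def crop_disc_rois(image, box, size=256):
    """Crop two size x size ROIs with the disc center placed toward the
    left and toward the right of the window, then cover each with a mask."""
    x, y, w, h = box
    cx, cy = x + w // 2, y + h // 2
    rois = []
    # Offsets are illustrative: disc center at ~1/4 and ~3/4 of the window.
    for off in (size // 4, 3 * size // 4):
        x0 = int(np.clip(cx - off, 0, image.shape[1] - size))
        y0 = int(np.clip(cy - size // 2, 0, image.shape[0] - size))
        roi = image[y0:y0 + size, x0:x0 + size].copy()
        # Circular mask around the disc, zeroing periocular structures
        # near the ROI border (one plausible masking choice).
        mask = np.zeros((size, size), dtype=np.uint8)
        cv2.circle(mask, (off, size // 2), size // 2, 255, -1)
        rois.append(cv2.bitwise_and(roi, roi, mask=mask))
    return rois
```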
Step three: because ultra-wide-angle fundus images lack optic disc region labels, the active contour Snake model is used to segment the optic disc region interactively, producing the pixel-level labels required for subsequent deep learning model training; the process is shown in FIG. 6. Firstly, gray-scale normalization and Gaussian filtering are applied to the ultra-wide-angle fundus image to remove noise; then the coordinate points of an initial contour curve are selected manually and a closed contour curve is obtained by interpolation; the energy function E is computed, its gradient F and the pentadiagonal banded matrix A are derived from it, and each point position on the curve is updated; finally, the blank area inside the closed curve is filled to obtain the optic disc label.
As shown in formulas (1) to (6):

E_snake = ∫E(v(s))ds = E_int(v(s)) + E_ext(v(s)) (1);

wherein: v(s) represents the curve function, E_snake the energy of the Snake curve, E_int the internal energy and E_ext the external energy;

E_int(v(s)) = ∫(|v'(s)|^2 + |v''(s)|^2)ds (2);

wherein: v'(s) is the first derivative and v''(s) the second derivative of the curve function;

E_ext(v(s)) = E_img(v(s)) + E_constraint(v(s)) (3);

wherein: E_img(v(s)) and E_constraint(v(s)) represent the image energy and the constraint energy, respectively;

E_img(v(s)) = -(I_x^2 + I_y^2) (4);

wherein: I is the input image; I_x and I_y are the partial derivatives of the image along the x and y axes, with x and y the coordinate values; the second-order derivatives of the image are I_xx = ∂^2I/∂x^2, I_xy = ∂^2I/∂x∂y and I_yy = ∂^2I/∂y^2;

α·x_ss - β·x_ssss = f_x, α·y_ss - β·y_ssss = f_y (5);

wherein: x_ss and x_ssss are the second- and fourth-order derivatives of the x component of v(s) with respect to s, and y_ss and y_ssss the corresponding derivatives of the y component; α and β are weights;

x_t = (A + γI)^(-1)(γ·x_(t-1) - f_x(x_(t-1), y_(t-1))), y_t = (A + γI)^(-1)(γ·y_(t-1) - f_y(x_(t-1), y_(t-1))) (6);

wherein: f_x and f_y are the partial derivatives of E_snake with respect to the x and y axes, and t is the iteration index; x_t, y_t are the coordinates of the points on the current iteration's curve and x_(t-1), y_(t-1) those of the previous iteration; A is the pentadiagonal banded matrix, I here denotes the identity matrix, and γ is a weight coefficient.
Step four: to cope with the relatively scarce data and poor quality of ultra-wide-angle fundus images, the invention uses image augmentation and image contrast enhancement to increase the data volume and enhance the images. Rotation and flipping increase the number of pictures in the dataset to 4 times the original. A gray-level normalization operation is then applied to the optic disc images, after which the CLAHE method attenuates noise and increases the contrast between the optic disc and the background, as shown in fig. 7(a) and 7(b).
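A small sketch of this augmentation and enhancement step, assuming grayscale ROI images; the particular rotation and flips are one illustrative way to quadruple the set, and the CLAHE parameters are assumptions.

```python
import cv2

def augment(img):
    """Quadruple the data: original, 180-degree rotation and two flips
    (one illustrative choice of transforms that yields a 4x set)."""
    return [img,
            cv2.rotate(img, cv2.ROTATE_180),
            cv2.flip(img, 0),   # vertical flip
            cv2.flip(img, 1)]   # horizontal flip

def enhance(gray):
    """Gray-level normalization followed by CLAHE enhancement."""
    norm = cv2.normalize(gray, None, 0, 255, cv2.NORM_MINMAX).astype('uint8')
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    return clahe.apply(norm)
```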
Step five: to address the low precision of optic disc region extraction in ultra-wide-angle fundus images, the invention uses the U²-Net model for optic disc region extraction. As shown in FIG. 8, the U²-Net model obtains strong global and local feature extraction capability through RSU modules, which capture context information at different scales, and skip connections, which fuse shallow spatial position information with deep semantic information. The invention inputs the enhanced images directly to the U²-Net model for optic disc extraction training.
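A sketch of the extraction and binarization of steps S4.1 and S4.2, assuming a trained U²-Net is available as a PyTorch module u2net (a placeholder name; the official U²-Net implementation returns several side outputs, the first being the fused map).

```python
import cv2
import numpy as np
import torch

def extract_disc(u2net, roi_gray, device="cpu"):
    """S4.1: run U2-Net on a 256x256 ROI; S4.2: binarize its output."""
    u2net.eval().to(device)
    x = torch.from_numpy(roi_gray.astype(np.float32) / 255.0)
    x = x.unsqueeze(0).unsqueeze(0).repeat(1, 3, 1, 1).to(device)  # 1x3xHxW
    with torch.no_grad():
        out = u2net(x)
        prob = out[0] if isinstance(out, (tuple, list)) else out
        prob = prob.squeeze().cpu().numpy()
    # If the model returns logits rather than probabilities (this is
    # implementation-dependent), apply a sigmoid first:
    # prob = 1.0 / (1.0 + np.exp(-prob))
    prob8 = cv2.normalize(prob, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    # S4.2: Otsu thresholding yields the binary optic disc region image.
    _, mask = cv2.threshold(prob8, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return mask
```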
The invention provides an ultra-wide-angle fundus image optic disc region extraction algorithm combining visual target positioning and semantic segmentation. To solve the problems that the optic disc occupies a small proportion of the ultra-wide-angle fundus image, that it cannot be extracted directly and effectively, and that optic disc annotation information is lacking, the invention divides precise optic disc extraction into two stages: optic disc target detection and optic disc region segmentation. In the first stage, the small-target problem of optic disc positioning is converted into the detection of a larger image region containing the optic disc, and precise positioning is achieved by training the YOLOv4 network model. The Snake algorithm of the active contour model then interactively segments the optic disc within the detected region, constructing the training image set with optic disc annotation information required by the subsequent segmentation model. In the second stage, the optic-disc-labeled dataset obtained in the first stage is used to train U²-Net, enabling accurate extraction of the optic disc region. The two-stage optic disc extraction method and the interactive optic disc labeling method not only effectively improve optic disc extraction precision, but also yield the optic disc annotation data required by supervised segmentation methods. The optic disc positioning accuracy of the whole method reaches 99.7%, providing stable and reliable input for the subsequent segmentation stage; the segmentation precision is higher than that of the widely used PCNN, U-Net, DeepLabV3 and SegNet models, giving the method good application value.
As can be seen from FIG. 10, compared with the PCNN, U-Net, DeepLabV3 and SegNet models widely used for optic disc extraction, the algorithm of the invention extracts a more complete optic disc region, which is especially evident on lesion images with blurred optic disc boundaries.
As can be seen from FIG. 11, of the four quantitative indicators of sensitivity, specificity, accuracy and Dice coefficient, the algorithm of the invention achieves the best results on all but sensitivity when compared with the PCNN, U-Net, DeepLabV3 and SegNet models, demonstrating a strong optic disc region extraction capability.
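For reference, the four indicators named above can be computed from a predicted binary mask and its ground-truth label as follows; this is standard bookkeeping rather than code from the patent.

```python
import numpy as np

def disc_metrics(pred, label):
    """Sensitivity, specificity, accuracy and Dice for non-empty binary masks."""
    p = pred.astype(bool)
    g = label.astype(bool)
    tp = np.sum(p & g)    # disc pixels correctly extracted
    tn = np.sum(~p & ~g)  # background pixels correctly rejected
    fp = np.sum(p & ~g)
    fn = np.sum(~p & g)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "dice": 2 * tp / (2 * tp + fp + fn),
    }
```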

Claims (5)

1. The method for extracting the ultra-wide-angle fundus image optic disc by combining target positioning and semantic segmentation, characterized by comprising the following steps:
step 1: establishing a YOLOv4 model for coarse positioning of the optic disc region;
step 2: extracting the optic disc region according to the coarse positioning result, and removing the periocular region;
step 3: interactively segmenting the optic disc region with the Snake active contour model to construct the dataset with optic disc labels required for U²-Net model training;
step 4: using the U²-Net model to extract the optic disc region.
2. The method for extracting the optic disc of an ultra-wide-angle fundus image by combining target positioning and semantic segmentation according to claim 1, characterized in that: in step 2, according to the optic disc positioning result of the YOLOv4 model, a region of interest containing the optic disc is extracted and the periocular region that interferes with optic disc extraction is removed; with the optic disc center located toward the left or right of the cropped image, two 256×256 regions of interest are cropped and covered with a mask.
3. The method for extracting the optic disc of an ultra-wide-angle fundus image by combining target positioning and semantic segmentation according to claim 1, characterized in that step 3 comprises the following steps:
firstly, carrying out gray-scale normalization and Gaussian filtering on the ultra-wide-angle fundus image to remove noise; then manually selecting the coordinate points of an initial contour curve and obtaining a closed contour curve by interpolation; then computing the energy function E, deriving from it its gradient F and the pentadiagonal banded matrix A, and updating each point position on the curve accordingly; finally, filling the blank area inside the closed curve to obtain the optic disc label;
as shown in formulas (1) to (6):

E_snake = ∫E(v(s))ds = E_int(v(s)) + E_ext(v(s)) (1);

wherein: v(s) represents the curve function, E_snake the energy of the Snake curve, E_int the internal energy and E_ext the external energy;

E_int(v(s)) = ∫(|v'(s)|^2 + |v''(s)|^2)ds (2);

wherein: v'(s) is the first derivative and v''(s) the second derivative of the curve function;

E_ext(v(s)) = E_img(v(s)) + E_constraint(v(s)) (3);

wherein: E_img(v(s)) and E_constraint(v(s)) represent the image energy and the constraint energy, respectively;

E_img(v(s)) = -(I_x^2 + I_y^2) (4);

wherein: I is the input image; I_x and I_y are the partial derivatives of the image along the x and y axes, with x and y the coordinate values; the second-order derivatives of the image are I_xx = ∂^2I/∂x^2, I_xy = ∂^2I/∂x∂y and I_yy = ∂^2I/∂y^2;

α·x_ss - β·x_ssss = f_x, α·y_ss - β·y_ssss = f_y (5);

wherein: x_ss and x_ssss are the second- and fourth-order derivatives of the x component of v(s) with respect to s, and y_ss and y_ssss the corresponding derivatives of the y component; α and β are weights;

x_t = (A + γI)^(-1)(γ·x_(t-1) - f_x(x_(t-1), y_(t-1))), y_t = (A + γI)^(-1)(γ·y_(t-1) - f_y(x_(t-1), y_(t-1))) (6);

wherein: f_x and f_y are the partial derivatives of E_snake with respect to the x and y axes, and t is the iteration index; x_t, y_t are the coordinates of the points on the current iteration's curve and x_(t-1), y_(t-1) those of the previous iteration; A is the pentadiagonal banded matrix, I here denotes the identity matrix, and γ is a weight coefficient.
4. The method for extracting the optic disc of an ultra-wide-angle fundus image by combining target positioning and semantic segmentation according to claim 1, characterized in that: in step 3, before U²-Net model training, the contrast between the optic disc and the background is enhanced by contrast-limited adaptive histogram equalization (CLAHE) to improve the training effect of the model.
5. The method for extracting the optic disc of an ultra-wide-angle fundus image by combining target positioning and semantic segmentation according to claim 1, characterized in that step 4 comprises the following steps:
S4.1: inputting the enhanced pictures into the U²-Net model and extracting the optic disc region;
S4.2: binarizing the output of the U²-Net model to obtain a binary optic disc region image.
CN202210521293.2A 2022-05-13 2022-05-13 Ultra-wide-angle fundus image optic disc extraction method combining target positioning and semantic segmentation Pending CN114897831A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210521293.2A CN114897831A (en) 2022-05-13 2022-05-13 Ultra-wide-angle fundus image optic disc extraction method combining target positioning and semantic segmentation


Publications (1)

Publication Number Publication Date
CN114897831A 2022-08-12

Family

ID=82722473

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210521293.2A Pending CN114897831A (en) 2022-05-13 2022-05-13 Ultra-wide-angle eyeground image optic disk extraction method combining target positioning and semantic segmentation

Country Status (1)

Country Link
CN (1) CN114897831A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116309549A (en) * 2023-05-11 2023-06-23 爱尔眼科医院集团股份有限公司 Fundus region detection method, fundus region detection device, fundus region detection equipment and readable storage medium
CN116309549B (en) * 2023-05-11 2023-10-03 爱尔眼科医院集团股份有限公司 Fundus region detection method, fundus region detection device, fundus region detection equipment and readable storage medium

Similar Documents

Publication Publication Date Title
CN110992382B (en) Fundus image optic cup optic disc segmentation method and system for assisting glaucoma screening
CN110428432B (en) Deep neural network algorithm for automatically segmenting colon gland image
CN112132817B (en) Retina blood vessel segmentation method for fundus image based on mixed attention mechanism
CN110276356A (en) Eye fundus image aneurysms recognition methods based on R-CNN
CN111046835A (en) Eyeground illumination multiple disease detection system based on regional feature set neural network
Xu et al. Ffu-net: Feature fusion u-net for lesion segmentation of diabetic retinopathy
CN109886946B (en) Deep learning-based early senile maculopathy weakening supervision and classification method
Lin et al. A high resolution representation network with multi-path scale for retinal vessel segmentation
CN108305241B (en) SD-OCT image GA lesion segmentation method based on depth voting model
CN112288720A (en) Deep learning-based color fundus image glaucoma screening method and system
CN110689526B (en) Retinal blood vessel segmentation method and system based on retinal fundus image
Xu et al. Dual-channel asymmetric convolutional neural network for an efficient retinal blood vessel segmentation in eye fundus images
CN109670489B (en) Weak supervision type early senile macular degeneration classification method based on multi-instance learning
CN112884788B (en) Cup optic disk segmentation method and imaging method based on rich context network
Bhatkalkar et al. Automated fundus image quality assessment and segmentation of optic disc using convolutional neural networks
CN114648806A (en) Multi-mechanism self-adaptive fundus image segmentation method
Jian et al. Triple-DRNet: A triple-cascade convolution neural network for diabetic retinopathy grading using fundus images
CN110889846A (en) Diabetes retina image optic disk segmentation method based on FCM
CN114897831A (en) Ultra-wide-angle fundus image optic disc extraction method combining target positioning and semantic segmentation
Wang et al. Towards an extended EfficientNet-based U-Net framework for joint optic disc and cup segmentation in the fundus image
CN110610480A (en) MCASPP neural network eyeground image optic cup optic disc segmentation model based on Attention mechanism
CN114334124A (en) Pathological myopia detection system based on deep neural network
CN113191352A (en) Water meter pointer reading identification method based on target detection and binary image detection
Huang et al. DBFU-Net: Double branch fusion U-Net with hard example weighting train strategy to segment retinal vessel
CN116597424A (en) Fatigue driving detection system based on face recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination