CN114332002A - Method and device for glaucoma image detection and storage medium - Google Patents

Method and device for glaucoma image detection and storage medium

Info

Publication number
CN114332002A
Authority
CN
China
Prior art keywords
image
sample
feature
features
convolutional neural
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111609512.4A
Other languages
Chinese (zh)
Inventor
李静雯
闵栋
李曼
赵阳光
任海英
滕依杉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Academy of Information and Communications Technology CAICT
Original Assignee
China Academy of Information and Communications Technology CAICT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Academy of Information and Communications Technology CAICT filed Critical China Academy of Information and Communications Technology CAICT
Priority to CN202111609512.4A
Publication of CN114332002A
Legal status: Pending

Landscapes

  • Image Analysis (AREA)

Abstract

The application relates to the technical field of image recognition, and discloses a method for glaucoma image detection, which comprises the following steps: obtaining a retinal fundus image, and intercepting the retinal fundus image to obtain an optic disc image; extracting attention features from the retinal fundus image and extracting local features from the optic disc image; acquiring dominant factor features and recessive factor features corresponding to the retinal fundus image according to the attention features; acquiring global features corresponding to the retinal fundus image according to the dominant factor features and the recessive factor features; performing feature fusion on the global features and the local features to obtain fusion features; and inputting the fusion features into a preset image detection model to obtain the image type corresponding to the retinal fundus image. This can improve the accuracy of glaucoma image detection. The application also discloses a device and a storage medium for glaucoma image detection.

Description

Method and device for glaucoma image detection and storage medium
Technical Field
The present application relates to the field of image recognition technologies, and for example, to a method and an apparatus for glaucoma image detection, and a storage medium.
Background
With growing attention to physical health, people increasingly need to monitor the condition of each part of the body, and the eyes are an important part of human health. The retinal fundus image is an important image for recording the health condition of the eyes, and most current deep-learning-based glaucoma image detection methods determine whether a retinal fundus image is a glaucoma-type image by detecting local features in the image, such as the cup-to-disc ratio (C/D).
In the process of implementing the embodiments of the present disclosure, it is found that at least the following problems exist in the related art:
in the prior art, whether a retinal fundus image is a glaucoma-type image is identified only by detecting local features such as the cup-to-disc ratio; however, retinal fundus images vary considerably from one another, so the detection accuracy of such methods is poor.
Disclosure of Invention
The following presents a simplified summary in order to provide a basic understanding of some aspects of the disclosed embodiments. This summary is not an extensive overview, nor is it intended to identify key or critical elements or to delineate the scope of such embodiments, but rather serves as a prelude to the more detailed description that is presented later.
The embodiments of the present disclosure provide a method, an apparatus, and a storage medium for glaucoma image detection, so as to improve the detection accuracy of retinal fundus images.
In some embodiments, the method for glaucoma image detection comprises: obtaining a retinal fundus image, and intercepting the retinal fundus image to obtain an optic disc image; extracting attention features from the retinal fundus image and extracting local features from the optic disc image; acquiring dominant factor features and recessive factor features corresponding to the retinal fundus image according to the attention features, wherein the dominant factor features are used to characterize salient features in the retinal fundus image and the recessive factor features are used to characterize secondary features in the retinal fundus image; acquiring global features corresponding to the retinal fundus image according to the dominant factor features and the recessive factor features; performing feature fusion on the global features and the local features to obtain fusion features; and inputting the fusion features into a preset image detection model to obtain the image type corresponding to the retinal fundus image.
In some embodiments, the apparatus for glaucoma image detection comprises a processor and a memory storing program instructions, the processor being configured to, when executing the program instructions, perform the above-described method for glaucoma image detection.
In some embodiments, the storage medium stores program instructions that, when executed, perform the above-described method for glaucoma image detection.
The method, the apparatus, and the storage medium for glaucoma image detection provided by the embodiments of the present disclosure can achieve the following technical effects: obtaining a retinal fundus image, and intercepting the retinal fundus image to obtain an optic disc image; extracting attention features from the retinal fundus image and extracting local features from the optic disc image; acquiring dominant factor features and recessive factor features corresponding to the retinal fundus image according to the attention features; acquiring global features corresponding to the retinal fundus image according to the dominant factor features and the recessive factor features; performing feature fusion on the global features and the local features to obtain fusion features; and inputting the fusion features into a preset image detection model to obtain the image type corresponding to the retinal fundus image. In this way, by acquiring both the global features and the local features of the retinal fundus image, and because the global features include the secondary features, richer fundus image features can be obtained from the retinal fundus image; the fusion features are obtained by fusing the global features and the local features, and the retinal fundus image is detected according to the fusion features, so that the detection accuracy of the retinal fundus image can be improved.
The foregoing general description and the following description are exemplary and explanatory only and are not restrictive of the application.
Drawings
One or more embodiments are illustrated by way of example in the accompanying drawings, which are not limiting, and in which elements having the same reference numeral designations denote like elements.
fig. 1 is a schematic diagram of a method for glaucoma image detection provided by an embodiment of the present disclosure;
fig. 2 is a schematic diagram of a network structure for extracting global features according to an embodiment of the present disclosure;
fig. 3 is a schematic diagram of another method for glaucoma image detection provided by an embodiment of the present disclosure;
fig. 4 is a schematic diagram of a method for acquiring an image detection model according to an embodiment of the present disclosure;
fig. 5 is a schematic diagram of a mutual learning network structure for glaucoma image detection according to an embodiment of the present disclosure;
fig. 6 is a schematic diagram of an apparatus for glaucoma image detection according to an embodiment of the present disclosure.
Detailed Description
So that the manner in which the features and elements of the disclosed embodiments can be understood in detail, a more particular description of the disclosed embodiments, briefly summarized above, may be had by reference to the embodiments, some of which are illustrated in the appended drawings. In the following description of the technology, for purposes of explanation, numerous details are set forth in order to provide a thorough understanding of the disclosed embodiments. However, one or more embodiments may be practiced without these details. In other instances, well-known structures and devices may be shown in simplified form in order to simplify the drawing.
The terms "first," "second," and the like in the description, in the claims, and in the above-described drawings of the embodiments of the present disclosure are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It should be understood that the data so used may be interchanged under appropriate circumstances, so that the embodiments of the present disclosure described herein can be implemented in sequences other than those illustrated or described herein. Furthermore, the terms "comprising" and "having," as well as any variations thereof, are intended to cover non-exclusive inclusions.
The term "plurality" means two or more unless otherwise specified.
In the embodiments of the present disclosure, the character "/" indicates that the preceding and following objects are in an "or" relationship. For example, A/B represents: A or B.
The term "and/or" describes an association relationship between objects, meaning that three relationships may exist. For example, A and/or B represents: A, or B, or A and B.
The term "correspond" may refer to an association or binding relationship; A corresponds to B refers to an association or binding relationship between A and B.
In conjunction with fig. 1, an embodiment of the present disclosure provides a method for glaucoma image detection, where the method includes:
step S101, obtaining a retina fundus image, and intercepting the retina fundus image to obtain a optic disc image.
In step S102, attention features are extracted from the retinal fundus image, and local features are extracted from the optic disk image.
And step S103, acquiring an explicit factor characteristic and an implicit factor characteristic corresponding to the retinal fundus image according to the attention characteristic.
And step S104, acquiring global characteristics corresponding to the retinal fundus image according to the dominant factor characteristics and the recessive factor characteristics.
And step S105, performing feature fusion on the global features and the local features to obtain fusion features.
And step S106, inputting the fusion characteristics into a preset image detection model to obtain an image type corresponding to the retinal fundus image.
By adopting the method for glaucoma image detection provided by the embodiment of the present disclosure, a retinal fundus image is obtained and intercepted to obtain an optic disc image; attention features are extracted from the retinal fundus image and local features are extracted from the optic disc image; dominant factor features and recessive factor features corresponding to the retinal fundus image are acquired according to the attention features; global features corresponding to the retinal fundus image are acquired according to the dominant factor features and the recessive factor features; feature fusion is performed on the global features and the local features to obtain fusion features; and the fusion features are input into a preset image detection model to obtain the image type corresponding to the retinal fundus image. In this way, by acquiring both the global features and the local features of the retinal fundus image, and because the global features include the secondary features, richer fundus image features can be obtained from the retinal fundus image; the fusion features are obtained by fusing the global features and the local features, and the retinal fundus image is detected according to the fusion features, so that the detection accuracy of the retinal fundus image can be improved.
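The inference flow of steps S101 to S106 can be summarized with a minimal sketch; the module names (detector, backbone, decomposer, classifier), the use of PyTorch, and concatenation as the fusion operation are illustrative assumptions and not prescribed by the present application:

```python
import torch

def detect_glaucoma(fundus_img, detector, backbone, decomposer, classifier):
    """Illustrative inference flow for steps S101-S106 (module names are assumptions).

    fundus_img: retinal fundus image tensor of shape (1, 3, H, W)
    detector:   crops the optic disc image out of the fundus image (step S101)
    backbone:   shared feature extractor, e.g. a Swin Transformer
    decomposer: splits attention features into dominant / recessive factor features
    classifier: fully connected head mapping fused features to the two image types
    """
    disc_img = detector(fundus_img)                         # step S101: optic disc image
    attention_feat = backbone(fundus_img)                   # step S102: attention features
    local_feat = backbone(disc_img)                         # step S102: local features
    dominant_factor, recessive_factor = decomposer(attention_feat)   # step S103
    dominant = dominant_factor * attention_feat              # step S104: fuse with attention
    recessive = recessive_factor * attention_feat
    global_feat = torch.cat([dominant, recessive], dim=-1)
    fused = torch.cat([global_feat, local_feat], dim=-1)     # step S105: fusion features
    probs = torch.softmax(classifier(fused), dim=-1)         # step S106: image type
    return "abnormal (glaucoma)" if probs[0, 1] > probs[0, 0] else "normal"
```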
Optionally, intercepting the retinal fundus image to obtain an optic disc image includes: identifying the optic disc position in the retinal fundus image through a preset detector, and intercepting the retinal fundus image according to the optic disc position to obtain the optic disc image. In this way, by separately capturing and analyzing the optic disc image, the features of the optic disc and the optic cup can be easily extracted, making it easier to detect whether the retinal fundus image is a glaucoma-type image.
Optionally, the preset detector is Faster R-CNN (Region-based Convolutional Neural Network). Faster R-CNN is chosen as the target detector because of its flexibility and robustness.
In some embodiments, the retinal fundus image is input into the preset detector; a shared feature map is generated by the convolutional layers in the detector, sparse rectangular candidate target boxes are generated from the shared feature map, target scores and regression region boundaries are predicted through the fully connected layers of the detector, and the bounding box with the highest confidence is then retained as the output of the detector. Optionally, the output of the detector is the bounding box with the highest confidence for each foreground class; in the retinal fundus image, the optic disc and optic cup to be detected belong to the foreground class, i.e., the objects to be detected, and the remainder belongs to the background class, i.e., other factors in the retinal fundus image unrelated to the optic disc and optic cup.
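As one way to realize the preset detector, a torchvision Faster R-CNN could be fine-tuned for the optic disc/optic cup foreground classes and its highest-confidence box used to intercept the optic disc image; the following is a minimal sketch under those assumptions, not the exact detector configuration of the present application:

```python
import torch
import torchvision

# Assumption: Faster R-CNN fine-tuned with 3 classes (background, optic disc, optic cup).
detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights=None, num_classes=3)
detector.eval()

def crop_optic_disc(fundus_img):
    """Keep the highest-confidence foreground box and crop it from the fundus image.

    fundus_img: float tensor of shape (3, H, W) with values in [0, 1].
    """
    with torch.no_grad():
        pred = detector([fundus_img])[0]        # dict with "boxes", "labels", "scores"
    if len(pred["scores"]) == 0:
        return fundus_img                       # no box found: fall back to the full image
    best = pred["scores"].argmax()
    x1, y1, x2, y2 = pred["boxes"][best].int().tolist()
    return fundus_img[:, y1:y2, x1:x2]          # optic disc crop, shape (3, h, w)
```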
Optionally, extracting the attention features from the retinal fundus image and extracting the local features from the optic disc image includes: extracting the attention features from the retinal fundus image and the local features from the optic disc image through a feature extraction backbone network in the image detection model. Optionally, the feature extraction backbone network is a Swin Transformer (hierarchical vision Transformer). In this way, by extracting features of the retinal fundus image and the optic disc image through the Swin Transformer network structure, the relationship between the optic disc/optic cup region and other regions in the retinal fundus image can be attended to more effectively.
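A hedged sketch of using a Swin Transformer backbone to produce the attention features and the local features; the timm model variant and the pooled-feature usage are illustrative assumptions:

```python
import timm
import torch

# Assumption: timm's Swin Transformer stands in for the feature extraction backbone;
# num_classes=0 drops the classification head so the forward pass returns pooled features.
backbone = timm.create_model("swin_tiny_patch4_window7_224", pretrained=False, num_classes=0)
backbone.eval()

fundus_img = torch.randn(1, 3, 224, 224)   # resized retinal fundus image
disc_img = torch.randn(1, 3, 224, 224)     # resized optic disc crop

with torch.no_grad():
    attention_feat = backbone(fundus_img)  # attention features from the full fundus image
    local_feat = backbone(disc_img)        # local features from the optic disc image
print(attention_feat.shape, local_feat.shape)   # e.g. torch.Size([1, 768]) each
```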
Optionally, the local features include optic disc/cup area ratio, optic disc shape, optic disc color, and the like.
Optionally, the global features include the shape of the optic nerve overlying the retina, the color of the optic nerve, the overall color of the fundus image, the neurovascular morphology, and the like. Optionally, the secondary features are the shape of the optic nerve overlying the retina, the color of the optic nerve, the overall color of the fundus image, the neurovascular morphology, and the like. Optionally, the secondary features are implicit features.
In some embodiments, the basic idea of the attention mechanism in computer vision is to let the system learn attention, that is, to let the system focus on the regions it should focus on. In the embodiments of the present application, the image detection model learns to attend to the target detection object, i.e., the retinal fundus image, so that it can ignore irrelevant information and focus on important information; the attention mechanism thus concentrates on the features of the retinal fundus image, and these features are taken as the attention features.
Optionally, acquiring the dominant factor features and the recessive factor features corresponding to the retinal fundus image according to the attention features includes: decomposing the attention features to obtain the dominant factor features and the recessive factor features corresponding to the retinal fundus image.
Optionally, the dominant factor features comprise features of the optic disc and optic cup portions, such as the optic disc/optic cup area ratio, the optic disc shape, and the optic disc color.
Optionally, the recessive factor features include the shape of the optic nerve overlying the retina, the color of the optic nerve, the overall color of the fundus image, the neurovascular morphology, and the like.
Optionally, acquiring the global features corresponding to the retinal fundus image according to the dominant factor features and the recessive factor features includes: fusing the dominant factor features with the attention features to obtain dominant features, and fusing the recessive factor features with the attention features to obtain recessive features; and fusing the dominant features and the recessive features to obtain the global features. Because the local features comprise the features of the optic disc and optic cup, which are fine-grained features for glaucoma detection, and the global features comprise richer coarse-grained features, fusing the global features with the local features allows the model to attend to the key local features in the fundus image, to the global features in the fundus image, and to the internal relations at the feature-granularity level, thereby improving the detection accuracy of the retinal fundus image.
With reference to fig. 2, fig. 2 is a schematic diagram of a network structure for extracting global features according to an embodiment of the present disclosure. In some embodiments, the attention features are extracted through the stage modules of the Swin Transformer, the attention features are decomposed into dominant factor features and recessive factor features, the dominant factor features are fused with the attention features to obtain the dominant features, the recessive factor features are fused with the attention features to obtain the recessive features, and the dominant features and the recessive features are fused to obtain the global features.
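The decomposition-and-fusion structure of fig. 2 could be sketched as follows; the linear projections, the element-wise fusion with the attention features, and the concatenation-based final fusion are assumptions made for illustration only:

```python
import torch
import torch.nn as nn

class GlobalFeatureHead(nn.Module):
    """Sketch of fig. 2: decompose attention features into dominant / recessive factor
    features, fuse each with the attention features, then fuse the two branches."""

    def __init__(self, dim):
        super().__init__()
        self.dominant_proj = nn.Linear(dim, dim)    # produces dominant factor features
        self.recessive_proj = nn.Linear(dim, dim)   # produces recessive factor features
        self.fuse = nn.Linear(2 * dim, dim)         # fuses both branches into the global feature

    def forward(self, attention_feat):
        dominant_factor = self.dominant_proj(attention_feat)
        recessive_factor = self.recessive_proj(attention_feat)
        dominant = dominant_factor * attention_feat      # fuse with attention features
        recessive = recessive_factor * attention_feat    # fuse with attention features
        return self.fuse(torch.cat([dominant, recessive], dim=-1))

head = GlobalFeatureHead(dim=768)
global_feat = head(torch.randn(1, 768))   # global features for one fundus image
```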
In conjunction with fig. 3, an embodiment of the present disclosure provides a method for glaucoma image detection, where the method includes:
step S301, obtaining a retinal fundus image, and intercepting the retinal fundus image to obtain a optic disc image.
In step S302, attention features are extracted from the retinal fundus image, and local features are extracted from the optic disc image.
And step S303, decomposing the attention characteristics to obtain the dominant factor characteristics and the recessive factor characteristics corresponding to the retina fundus image.
And S304, fusing the dominant factor characteristic and the attention characteristic to obtain a dominant characteristic, and fusing the recessive factor characteristic and the attention characteristic to obtain a recessive characteristic.
And S305, fusing the dominant characteristic and the recessive characteristic to obtain a global characteristic.
And S306, performing feature fusion on the global features and the local features to obtain fusion features.
And step S307, inputting the fusion characteristics into a preset image detection model, and obtaining an image type corresponding to the retinal fundus image.
By adopting the method for glaucoma image detection provided by the embodiment of the present disclosure, the attention features of the retinal fundus image are extracted, the recessive factor features and the dominant factor features are decomposed from the attention features, and the recessive factor features and the dominant factor features are each fused with the attention features to obtain the recessive features and the dominant features; the dominant features and the recessive features are fused to obtain the global features. By acquiring both the global features and the local features of the retinal fundus image, richer fundus image features can be obtained from the retinal fundus image; the fusion features are obtained by fusing the global features and the local features, and the retinal fundus image is detected according to the fusion features, so that the detection accuracy of the retinal fundus image can be improved.
Optionally, the image detection model is obtained by: obtaining a plurality of first training samples and a plurality of sample labels; the first training sample is a retinal fundus sample image, and the sample label is used for representing the image type corresponding to the retinal fundus sample image; and training a preset mutual learning network according to the first training sample with the sample label to obtain an image detection model.
Optionally, the first training sample is a retinal fundus image.
Optionally, obtaining a plurality of first training samples comprises: acquiring a plurality of retinal fundus images; and preprocessing each retinal fundus image to obtain a plurality of first training samples.
Optionally, preprocessing each retinal fundus image includes: performing horizontal flipping, vertical flipping, random cropping, and other processing on each retinal fundus image. In this way, by preprocessing the retinal fundus images, a plurality of first training samples can be obtained from each image, thereby realizing data augmentation and increasing the number of training samples.
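One way to realize this preprocessing is with standard torchvision transforms; the crop size and flip probabilities below are illustrative assumptions:

```python
from torchvision import transforms

# Each pass through this pipeline yields a differently augmented first training sample.
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),   # horizontal flipping
    transforms.RandomVerticalFlip(p=0.5),     # vertical flipping
    transforms.RandomResizedCrop(224),        # random cropping to the network input size
    transforms.ToTensor(),
])

# Example usage: generate several first training samples from one fundus image.
# samples = [augment(fundus_pil_image) for _ in range(4)]
```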
Optionally, the mutual learning network includes a first convolutional neural network and a second convolutional neural network, the second convolutional neural network being a peer-to-peer network of the first convolutional neural network. Training a preset mutual learning network according to the first training samples with sample labels to obtain the image detection model includes: performing feature extraction on each first training sample to obtain the sample global features corresponding to each first training sample, and intercepting each first training sample to obtain the second training sample corresponding to each first training sample; inputting each second training sample into the preset first convolutional neural network and the preset second convolutional neural network respectively for feature extraction, to obtain first sample local features and second sample local features; performing feature fusion on the sample global features with the first sample local features and with the second sample local features respectively, to obtain first sample fusion features and second sample fusion features; and training the mutual learning network according to the first sample fusion features and the second sample fusion features to obtain the image detection model.
Optionally, the mutual learning network includes a fully connected network, and training the mutual learning network according to the first sample fusion features and the second sample fusion features to obtain the image detection model includes: inputting the first sample fusion features and the second sample fusion features into the fully connected network to obtain, for each training sample, a first probability value and a second probability value that the training sample is of a preset type; acquiring, according to the first probability values and the second probability values, a first feature distribution similarity corresponding to the first convolutional neural network and a second feature distribution similarity corresponding to the second convolutional neural network; determining a first loss function corresponding to the first convolutional neural network according to the first feature distribution similarity, and determining a second loss function corresponding to the second convolutional neural network according to the second feature distribution similarity; and iteratively training the first convolutional neural network and the second convolutional neural network according to the first loss function and the second loss function, respectively, to obtain the image detection model.
Optionally, the first feature distribution similarity is used to characterize the similarity between the feature distribution output by the second convolutional neural network and the feature distribution output by the first convolutional neural network.
Optionally, the second feature distribution similarity is used to characterize the similarity between the feature distribution output by the first convolutional neural network and the feature distribution output by the second convolutional neural network.
Optionally, the preset type includes a normal type and an abnormal type.
Optionally, each of the first convolutional neural network and the second convolutional neural network comprises a Swin Transformer network structure.
Optionally, obtaining the first feature distribution similarity corresponding to the first convolutional neural network according to the first probability values and the second probability values includes: calculating the first feature distribution similarity from the first probability values and the second probability values according to a third preset algorithm.
Optionally, calculating the first feature distribution similarity from the first probability values and the second probability values according to the third preset algorithm includes computing

$$D_{KL}(P_2 \| P_1) = \sum_{i=1}^{N} \sum_{m=1}^{M} p_2^m(x_i) \log \frac{p_2^m(x_i)}{p_1^m(x_i)}$$

to obtain the first feature distribution similarity, where $D_{KL}(P_2 \| P_1)$ is the first feature distribution similarity, $P_1$ denotes the feature distribution corresponding to the first convolutional neural network, $P_2$ denotes the feature distribution corresponding to the second convolutional neural network, $N$ is the number of first training samples, $M = 2$, $x_i$ is the $i$-th first training sample, $p_1^m(x_i)$ is the probability value, output by the first convolutional neural network, that the $i$-th first training sample is of type $m$, and $p_2^m(x_i)$ is the probability value, output by the second convolutional neural network, that the $i$-th first training sample is of type $m$.
Optionally, obtaining the second feature distribution similarity corresponding to the second convolutional neural network according to the first probability values and the second probability values includes: calculating the second feature distribution similarity from the first probability values and the second probability values according to a fourth preset algorithm.
Optionally, calculating the second feature distribution similarity from the first probability values and the second probability values according to the fourth preset algorithm includes computing

$$D_{KL}(P_1 \| P_2) = \sum_{i=1}^{N} \sum_{m=1}^{M} p_1^m(x_i) \log \frac{p_1^m(x_i)}{p_2^m(x_i)}$$

to obtain the second feature distribution similarity, where $D_{KL}(P_1 \| P_2)$ is the second feature distribution similarity, and $P_1$, $P_2$, $N$, $M$, $x_i$, $p_1^m(x_i)$, and $p_2^m(x_i)$ are defined as above.
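The two similarities above are ordinary KL divergences between the class probability distributions produced by the two peer networks over a batch; the following is a minimal sketch, where the epsilon term and the batch-wise summation are implementation assumptions:

```python
import torch
import torch.nn.functional as F

def kl_similarity(p_ref, p_other, eps=1e-8):
    """D_KL(P_ref || P_other) summed over the N samples and the M = 2 types.

    p_ref, p_other: (N, 2) probability values from the two peer networks.
    """
    return (p_ref * torch.log((p_ref + eps) / (p_other + eps))).sum()

p1 = F.softmax(torch.randn(4, 2), dim=1)   # first probability values (first network)
p2 = F.softmax(torch.randn(4, 2), dim=1)   # second probability values (second network)
first_similarity = kl_similarity(p2, p1)   # D_KL(P2 || P1)
second_similarity = kl_similarity(p1, p2)  # D_KL(P1 || P2)
```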
Optionally, determining the first loss function corresponding to the first convolutional neural network according to the first feature distribution similarity includes: calculating a first prediction result from the first probability values according to a first preset algorithm; and determining the first loss function according to the first prediction result and the first feature distribution similarity.
Optionally, calculating the first prediction result from the first probability values according to the first preset algorithm includes computing

$$L_{c1} = -\sum_{i=1}^{N} \sum_{m=1}^{M} I(y_i, m) \log p_1^m(x_i)$$

to obtain the first prediction result, where $L_{c1}$ is the first prediction result, $N$ is the number of first training samples, $M = 2$, $x_i$ is the $i$-th first training sample, and $p_1^m(x_i)$ is the probability value, output by the first convolutional neural network, that the $i$-th first training sample is of type $m$. Optionally, $m = 1$ represents "normal" and $m = 2$ represents "abnormal". The indicator $I(y_i, m)$ equals 1 when the prediction type $y_i$ of the $i$-th first training sample is of type $m$, and equals 0 when $y_i$ is not of type $m$. Optionally, the probability of the normal type is used to characterize the detection result as non-glaucoma, and the probability of the abnormal type is used to characterize the detection result as glaucoma.
Optionally, the first loss function is $L_{\theta 1} = L_{c1} + D_{KL}(P_2 \| P_1)$, where $L_{\theta 1}$ is the loss value corresponding to the first loss function, $L_{c1}$ is the first prediction result, $P_2$ is the feature distribution of the features output by the second convolutional neural network, $P_1$ is the feature distribution of the features output by the first convolutional neural network, and $D_{KL}(P_2 \| P_1)$ is the first feature distribution similarity, i.e., the similarity between the feature distribution output by the second convolutional neural network and the feature distribution output by the first convolutional neural network.
Optionally, determining the second loss function corresponding to the second convolutional neural network according to the second feature distribution similarity includes: calculating a second prediction result from the second probability values according to a second preset algorithm; and determining the second loss function according to the second prediction result and the second feature distribution similarity.
Optionally, calculating the second prediction result from the second probability values according to the second preset algorithm includes computing

$$L_{c2} = -\sum_{i=1}^{N} \sum_{m=1}^{M} I(y_i, m) \log p_2^m(x_i)$$

to obtain the second prediction result, where $L_{c2}$ is the second prediction result, $N$ is the number of first training samples, $M = 2$, $x_i$ is the $i$-th first training sample, and $p_2^m(x_i)$ is the probability value, output by the second convolutional neural network, that the $i$-th first training sample is of type $m$. Optionally, $m = 1$ represents "normal" and $m = 2$ represents "abnormal". The indicator $I(y_i, m)$ equals 1 when the prediction type $y_i$ of the $i$-th first training sample is of type $m$, and equals 0 when $y_i$ is not of type $m$. Optionally, the probability of the normal type is used to characterize the detection result as non-glaucoma, and the probability of the abnormal type is used to characterize the detection result as glaucoma.
Optionally, the second loss function is $L_{\theta 2} = L_{c2} + D_{KL}(P_1 \| P_2)$, where $L_{\theta 2}$ is the loss value corresponding to the second loss function, $L_{c2}$ is the second prediction result, $P_1$ is the feature distribution of the features output by the first convolutional neural network, $P_2$ is the feature distribution of the features output by the second convolutional neural network, and $D_{KL}(P_1 \| P_2)$ is the second feature distribution similarity, i.e., the similarity between the feature distribution output by the first convolutional neural network and the feature distribution output by the second convolutional neural network.
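Putting the two prediction results and the two similarities together, the losses $L_{\theta 1}$ and $L_{\theta 2}$ could be computed as below; working from raw logits and treating the peer network's probabilities as fixed (detached) inside each KL term are assumptions of this sketch:

```python
import torch
import torch.nn.functional as F

def mutual_learning_losses(logits1, logits2, labels, eps=1e-8):
    """Sketch of L_theta1 = L_c1 + D_KL(P2||P1) and L_theta2 = L_c2 + D_KL(P1||P2).

    logits1, logits2: (N, 2) raw outputs of the two peer networks.
    labels: (N,) ground-truth types, 0 = normal (non-glaucoma), 1 = abnormal (glaucoma).
    """
    p1 = F.softmax(logits1, dim=1)
    p2 = F.softmax(logits2, dim=1)
    lc1 = F.cross_entropy(logits1, labels, reduction="sum")    # first prediction result L_c1
    lc2 = F.cross_entropy(logits2, labels, reduction="sum")    # second prediction result L_c2
    # Peer distributions are detached: each network only adapts its own output.
    d21 = (p2.detach() * torch.log((p2.detach() + eps) / (p1 + eps))).sum()  # D_KL(P2||P1)
    d12 = (p1.detach() * torch.log((p1.detach() + eps) / (p2 + eps))).sum()  # D_KL(P1||P2)
    return lc1 + d21, lc2 + d12    # L_theta1, L_theta2

loss1, loss2 = mutual_learning_losses(torch.randn(4, 2), torch.randn(4, 2),
                                      torch.tensor([0, 1, 0, 1]))
```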
As shown in fig. 4, an embodiment of the present disclosure provides a method for acquiring an image detection model, where the method includes:
step S401, obtaining a plurality of first training samples and a plurality of sample labels; the sample label is used for representing the image type corresponding to the first training sample.
Step S402, extracting the features of each first training sample to obtain the sample global features corresponding to each first training sample, and intercepting each first training sample to obtain the second training sample corresponding to each first training sample.
Step S403, inputting each second training sample into a preset first convolutional neural network and a preset second convolutional neural network respectively for feature extraction, to obtain first sample local features and second sample local features.
Step S404, performing feature fusion on the sample global features with the first sample local features and with the second sample local features respectively, to obtain first sample fusion features and second sample fusion features.
Step S405, training the mutual learning network according to the first sample fusion features and the second sample fusion features to obtain the image detection model.
By adopting the method for obtaining the image detection model provided by the embodiment of the present disclosure, a second convolutional neural network is set up as a peer-to-peer network of the first convolutional neural network, and the two networks are trained on the first training samples so that they learn from each other, thereby obtaining the image detection model.
Optionally, the image type includes a normal type or an abnormal type.
Fig. 5 is a schematic diagram of a mutual learning network structure for glaucoma image detection according to an embodiment of the present disclosure. In some embodiments, when the mutual learning network is trained, global features are extracted from the retinal fundus image, and the retinal fundus image is intercepted to obtain an optic disc image; the optic disc image is input into the first convolutional neural network and the second convolutional neural network respectively to obtain a first local feature and a second local feature; the global features are fused with the first local feature and the second local feature respectively to obtain a first fusion feature Z1 and a second fusion feature Z2; a first probability value that the retinal fundus image is of a preset type is predicted according to the first fusion feature, and a second probability value that the retinal fundus image is of the preset type is predicted according to the second fusion feature. The first feature distribution similarity and the second feature distribution similarity are then acquired so that the two networks learn from each other, making the feature distribution corresponding to the first convolutional neural network as similar as possible to the feature distribution corresponding to the second convolutional neural network.
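The forward pass of fig. 5 could be sketched as follows, reusing whatever backbone produces the global features; the module names and concatenation as the fusion operation are assumptions:

```python
import torch

def mutual_learning_forward(global_feat, disc_img, net1, net2, fc1, fc2):
    """Sketch of the fig. 5 forward pass (module names are assumptions).

    global_feat: global features extracted from the retinal fundus image
    disc_img:    optic disc image intercepted from the fundus image
    net1, net2:  the two peer networks extracting local features
    fc1, fc2:    fully connected heads producing the preset-type probabilities
    """
    local1 = net1(disc_img)                          # first local feature
    local2 = net2(disc_img)                          # second local feature
    z1 = torch.cat([global_feat, local1], dim=-1)    # first fusion feature Z1
    z2 = torch.cat([global_feat, local2], dim=-1)    # second fusion feature Z2
    p1 = torch.softmax(fc1(z1), dim=-1)              # first probability values
    p2 = torch.softmax(fc2(z2), dim=-1)              # second probability values
    return p1, p2
```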
Optionally, when the retinal fundus image is detected through the image detection model, the prediction result corresponding to the second convolutional neural network is discarded, and the prediction result corresponding to the first convolutional neural network is determined as the image type corresponding to the retinal fundus image.
In some embodiments, if the first probability values of the retinal fundus image output by the first convolutional neural network are {probability of the normal type: 99%, probability of the abnormal type: 1%}, the type with the larger probability value is determined as the prediction result; that is, the prediction result is the normal type, and the normal type is determined as the image type corresponding to the retinal fundus image.
In some embodiments, when the first convolutional neural network and the second convolutional neural network are trained on the training samples, the network parameter weights of the second convolutional neural network are first held fixed. The features output by the first convolutional neural network, i.e., the probabilities that the training samples are of the preset types, are obtained; the first prediction result corresponding to the first convolutional neural network is obtained; the first feature distribution similarity between the features output by the second convolutional neural network and the features output by the first convolutional neural network is obtained; the first loss function is determined according to the first prediction result and the first feature distribution similarity; gradient back-propagation is performed on the first convolutional neural network through the first loss function, and the network parameter weights of the first convolutional neural network are updated while the network parameter weights of the second convolutional neural network remain unchanged. Then, the network parameter weights of the first convolutional neural network are held fixed, and the features output by the second convolutional neural network, i.e., the probabilities that the training samples are of the preset types, are obtained; the second prediction result corresponding to the second convolutional neural network is obtained; the second feature distribution similarity between the features output by the first convolutional neural network and the features output by the second convolutional neural network is obtained; the second loss function is determined according to the second prediction result and the second feature distribution similarity; gradient back-propagation is performed on the second convolutional neural network through the second loss function, and the network parameter weights of the second convolutional neural network are updated while the network parameter weights of the first convolutional neural network remain unchanged.
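The alternating update scheme described above could look like the sketch below, reusing `mutual_learning_losses` from the earlier sketch; the optimizer setup, the freezing mechanism, and recomputing the logits before each update are assumptions of this illustration:

```python
def mutual_learning_step(forward_fn, labels, net1, net2, opt1, opt2):
    """One training step: update net1 with net2 frozen, then net2 with net1 frozen.

    forward_fn(): recomputes (logits1, logits2) for the current batch from both networks.
    """
    # Update the first network while the second network's weights stay fixed.
    for p in net2.parameters():
        p.requires_grad_(False)
    logits1, logits2 = forward_fn()
    loss1, _ = mutual_learning_losses(logits1, logits2, labels)   # L_theta1
    opt1.zero_grad()
    loss1.backward()
    opt1.step()
    for p in net2.parameters():
        p.requires_grad_(True)

    # Update the second network while the first network's weights stay fixed.
    for p in net1.parameters():
        p.requires_grad_(False)
    logits1, logits2 = forward_fn()
    _, loss2 = mutual_learning_losses(logits1, logits2, labels)   # L_theta2
    opt2.zero_grad()
    loss2.backward()
    opt2.step()
    for p in net1.parameters():
        p.requires_grad_(True)
```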
In this way, when the type of a training sample is predicted by the first convolutional neural network, the second feature distribution learned by the second convolutional neural network is used as a reference, and when the type of a training sample is predicted by the second convolutional neural network, the first feature distribution learned by the first convolutional neural network is likewise used as a reference. The similarity between the first feature distribution and the second feature distribution is measured by the KL divergence; in each iteration of training, the network parameter weights of the first convolutional neural network are adjusted according to the first loss function and the network parameter weights of the second convolutional neural network are adjusted according to the second loss function, so that the feature distributions output by the two networks become as similar as possible and the two networks learn from each other.
In some embodiments, the second convolutional neural network can provide experience for the first convolutional neural network in the form of a posterior probability, that is, measure the similarity of feature distribution between the first feature distribution corresponding to the first convolutional neural network and the second feature distribution corresponding to the second convolutional neural network by using KL divergence.
In the prior art, dimensionality reduction is mostly carried out by adding a down-sampling layer, which reduces the number of parameters the network has to learn and can also prevent overfitting. In the embodiments disclosed in the present application, no down-sampling layer is used; instead, a dilated convolutional layer is used in place of the down-sampling layer, which preserves the sharpness of the image.
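A minimal illustration of the trade-off described above: a stride-2 convolution halves the feature-map resolution, while a dilated convolution enlarges the receptive field without shrinking the feature map; the channel counts and kernel sizes are arbitrary examples:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 64, 56, 56)   # intermediate feature map

# Conventional down-sampling: a stride-2 convolution halves the spatial resolution.
downsample = nn.Conv2d(64, 64, kernel_size=3, stride=2, padding=1)
print(downsample(x).shape)       # torch.Size([1, 64, 28, 28])

# Dilated convolution: comparable receptive-field growth, but the 56x56 resolution
# (and hence image sharpness) is preserved.
dilated = nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=2, dilation=2)
print(dilated(x).shape)          # torch.Size([1, 64, 56, 56])
```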
As shown in fig. 6, an apparatus for glaucoma image detection according to an embodiment of the present disclosure includes a processor (processor) 600 and a memory (memory) 601. Optionally, the apparatus may also include a communication interface (Communication Interface) 602 and a bus 603. The processor 600, the communication interface 602, and the memory 601 may communicate with each other via the bus 603. The communication interface 602 may be used for information transfer. The processor 600 may invoke logic instructions in the memory 601 to perform the method for glaucoma image detection of the above-described embodiments.
By adopting the apparatus for glaucoma image detection provided by the embodiment of the present disclosure, a retinal fundus image is obtained and intercepted to obtain an optic disc image; attention features are extracted from the retinal fundus image and local features are extracted from the optic disc image; dominant factor features and recessive factor features corresponding to the retinal fundus image are acquired according to the attention features; global features corresponding to the retinal fundus image are acquired according to the dominant factor features and the recessive factor features; feature fusion is performed on the global features and the local features to obtain fusion features; and the fusion features are input into a preset image detection model to obtain the image type corresponding to the retinal fundus image. In this way, by acquiring both the global features and the local features of the retinal fundus image, and because the global features include the secondary features, richer fundus image features can be obtained from the retinal fundus image; the fusion features are obtained by fusing the global features and the local features, and the retinal fundus image is detected according to the fusion features, so that the detection accuracy of the retinal fundus image can be improved.
In addition, when sold or used as an independent product, the logic instructions in the memory 601 may be implemented in the form of software functional units and stored in a computer-readable storage medium.
The memory 601 is a computer-readable storage medium, and can be used for storing software programs, computer-executable programs, such as program instructions/modules corresponding to the methods in the embodiments of the present disclosure. The processor 600 executes functional applications and data processing, i.e., implements the method for cyan-eye image detection in the above-described embodiments, by executing program instructions/modules stored in the memory 601.
The memory 601 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal device, and the like. In addition, the memory 601 may include a high speed random access memory, and may also include a non-volatile memory.
The disclosed embodiments provide a storage medium storing program instructions that, when executed, perform the above-described method for glaucoma image detection.
The disclosed embodiments provide a computer program product comprising a computer program stored on a computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the above-described method for glaucoma image detection.
The computer-readable storage medium described above may be a transitory computer-readable storage medium or a non-transitory computer-readable storage medium.
The technical solution of the embodiments of the present disclosure may be embodied in the form of a software product, where the computer software product is stored in a storage medium and includes one or more instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods of the embodiments of the present disclosure. The aforementioned storage medium may be a non-transitory storage medium, including: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or various other media capable of storing program code, and may also be a transient storage medium.
The above description and drawings sufficiently illustrate embodiments of the disclosure to enable those skilled in the art to practice them. Other embodiments may incorporate structural, logical, electrical, process, and other changes. The examples merely typify possible variations. Individual components and functions are optional unless explicitly required, and the sequence of operations may vary. Portions and features of some embodiments may be included in or substituted for those of others. Furthermore, the words used in the specification are words of description only and are not intended to limit the claims. As used in the description of the embodiments and the claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Similarly, the term "and/or" as used in this application is meant to encompass any and all possible combinations of one or more of the associated listed. Furthermore, the terms "comprises" and/or "comprising," when used in this application, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. Without further limitation, an element defined by the phrase "comprising an …" does not exclude the presence of other like elements in a process, method or apparatus that comprises the element. In this document, each embodiment may be described with emphasis on differences from other embodiments, and the same and similar parts between the respective embodiments may be referred to each other. For methods, products, etc. of the embodiment disclosures, reference may be made to the description of the method section for relevance if it corresponds to the method section of the embodiment disclosure.
Those of skill in the art would appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software may depend upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the disclosed embodiments. It can be clearly understood by the skilled person that, for convenience and brevity of description, the specific working processes of the system, the apparatus and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments disclosed herein, the disclosed methods, products (including but not limited to devices, apparatuses, etc.) may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units may be merely a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form. The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to implement the present embodiment. In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. In the description corresponding to the flowcharts and block diagrams in the figures, operations or steps corresponding to different blocks may also occur in different orders than disclosed in the description, and sometimes there is no specific order between the different operations or steps. For example, two sequential operations or steps may in fact be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved. Each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

Claims (10)

1. A method for glaucoma image detection, comprising:
obtaining a retinal fundus image, and intercepting the retinal fundus image to obtain an optic disc image;
extracting attention features from the retinal fundus image and extracting local features from the optic disc image;
acquiring dominant factor features and recessive factor features corresponding to the retinal fundus image according to the attention features; the dominant factor features are used for characterizing the salient features required for detecting glaucoma in the retinal fundus image, and the recessive factor features are used for characterizing the secondary features required for detecting glaucoma in the retinal fundus image;
acquiring global features corresponding to the retinal fundus image according to the dominant factor features and the recessive factor features;
performing feature fusion on the global features and the local features to obtain fusion features;
and inputting the fusion features into a preset image detection model to obtain the image type corresponding to the retinal fundus image.
2. The method according to claim 1, wherein acquiring the dominant factor features and the recessive factor features corresponding to the retinal fundus image according to the attention features comprises:
decomposing the attention features to obtain the dominant factor features and the recessive factor features corresponding to the retinal fundus image.
3. The method of claim 1, wherein acquiring the global features corresponding to the retinal fundus image according to the dominant factor features and the recessive factor features comprises:
fusing the dominant factor features with the attention features to obtain dominant features, and fusing the recessive factor features with the attention features to obtain recessive features;
and fusing the dominant features and the recessive features to obtain the global features.
4. The method of claim 1, wherein the image detection model is obtained by:
obtaining a plurality of first training samples and a plurality of sample labels; the first training sample is a retinal fundus sample image, and the sample label is used for representing the image type corresponding to the retinal fundus sample image;
and training a preset mutual learning network according to the first training sample with the sample label to obtain the image detection model.
5. The method of claim 4, wherein the mutual learning network comprises a first convolutional neural network and a second convolutional neural network, the second convolutional neural network being a peer-to-peer network of the first convolutional neural network; training a preset mutual learning network according to a first training sample with the sample label to obtain the image detection model, wherein the training comprises the following steps:
performing feature extraction on each first training sample to obtain sample global features corresponding to each first training sample, and intercepting each first training sample to obtain second training samples corresponding to each first training sample;
respectively inputting each second training sample into a preset first convolutional neural network and a preset second convolutional neural network for feature extraction to obtain a first sample local feature and a second sample local feature;
performing feature fusion on each sample global feature and each first sample local feature and each second sample local feature to obtain a first sample fusion feature and a second sample fusion feature;
and training the mutual learning network according to the first sample fusion characteristics and the second sample fusion characteristics to obtain the image detection model.
6. The method of claim 4, wherein the mutual learning network comprises a fully connected network, and wherein training the mutual learning network according to each of the first sample fusion features and each of the second sample fusion features to obtain the image detection model comprises:
inputting each first sample fusion feature and each second sample fusion feature into the fully-connected network to obtain a first probability value that each training sample is of a preset type and a second probability value that each training sample is of the preset type;
respectively acquiring a first feature distribution similarity corresponding to the first convolutional neural network and a second feature distribution similarity corresponding to the second convolutional neural network according to the first probability values and the second probability values;
determining a first loss function corresponding to the first convolutional neural network according to the first feature distribution similarity, and determining a second loss function corresponding to the second convolutional neural network according to the second feature distribution similarity;
and performing iterative training on the first convolutional neural network and the second convolutional neural network respectively according to the first loss function and the second loss function to obtain the image detection model.
7. The method of claim 5, wherein determining a first loss function corresponding to the first convolutional neural network according to the first feature distribution similarity comprises:
calculating by utilizing each first probability value according to a first preset algorithm to obtain a first prediction result;
and determining the first loss function according to the first prediction result and the first feature distribution similarity.
8. The method of claim 5, wherein determining a second loss function corresponding to the second convolutional neural network according to the second feature distribution similarity comprises:
calculating by utilizing the second probability values according to a second preset algorithm to obtain a second prediction result;
and determining the second loss function according to the second prediction result and the second feature distribution similarity.
9. An apparatus for glaucoma image detection comprising a processor and a memory having stored thereon program instructions, characterized in that the processor is configured to carry out the method for glaucoma image detection according to any one of claims 1 to 8 when executing the program instructions.
10. A storage medium storing program instructions which, when executed, perform the method for glaucoma image detection according to any one of claims 1 to 8.
CN202111609512.4A 2021-12-27 2021-12-27 Method and device for glaucoma image detection and storage medium Pending CN114332002A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111609512.4A CN114332002A (en) 2021-12-27 2021-12-27 Method and device for glaucoma image detection and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111609512.4A CN114332002A (en) 2021-12-27 2021-12-27 Method and device for glaucoma image detection and storage medium

Publications (1)

Publication Number Publication Date
CN114332002A (en) 2022-04-12

Family

ID=81012719

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111609512.4A Pending CN114332002A (en) 2021-12-27 2021-12-27 Method and device for detecting cyan eye image and storage medium

Country Status (1)

Country Link
CN (1) CN114332002A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116912203A (en) * 2023-07-13 2023-10-20 桂林电子科技大学 Abnormal fundus image low-consumption detection method and system based on combination of multiple intelligent models
CN116912203B (en) * 2023-07-13 2024-04-05 桂林电子科技大学 Abnormal fundus image low-consumption detection method and system based on combination of multiple intelligent models

Similar Documents

Publication Publication Date Title
EP3779774B1 (en) Training method for image semantic segmentation model and server
Math et al. Adaptive machine learning classification for diabetic retinopathy
CN110569721B (en) Recognition model training method, image recognition method, device, equipment and medium
Elangovan et al. Glaucoma assessment from color fundus images using convolutional neural network
CN110826519A (en) Face occlusion detection method and device, computer equipment and storage medium
CN109615632B (en) Fundus image optic disc and optic cup segmentation method based on semi-supervision condition generation type countermeasure network
CN103136504B (en) Face identification method and device
US10853642B2 (en) Fusing multi-spectral images for identity authentication
CN112464809A (en) Face key point detection method and device, electronic equipment and storage medium
CN111914665B (en) Face shielding detection method, device, equipment and storage medium
CN111325319B (en) Neural network model detection method, device, equipment and storage medium
CN110287813A (en) Personal identification method and system
CN112036514B (en) Image classification method, device, server and computer readable storage medium
CN114287878A (en) Diabetic retinopathy focus image identification method based on attention model
CN111723841A (en) Text detection method and device, electronic equipment and storage medium
CN114330499A (en) Method, device, equipment, storage medium and program product for training classification model
CN109271957B (en) Face gender identification method and device
CN113240655B (en) Method, storage medium and device for automatically detecting type of fundus image
CN112613471B (en) Face living body detection method, device and computer readable storage medium
CN114332002A (en) Method and device for detecting cyan eye image and storage medium
CN114898273A (en) Video monitoring abnormity detection method, device and equipment
CN114004289A (en) Vehicle damage identification model establishing method, damage identification method and device
CN113627233A (en) Visual semantic information-based face counterfeiting detection method and device
CN117454426A (en) Method, device and system for desensitizing and collecting information of claim settlement data
CN112802010A (en) Cancer cell detection method, system and medium based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination