CN115393362A - Method, equipment and medium for selecting automatic glaucoma recognition model - Google Patents

Method, equipment and medium for selecting automatic glaucoma recognition model

Info

Publication number
CN115393362A
CN115393362A
Authority
CN
China
Prior art keywords
model
feature
glaucoma
source domain
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211332335.4A
Other languages
Chinese (zh)
Other versions
CN115393362B (en)
Inventor
张健
戴梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central South University
Original Assignee
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central South University filed Critical Central South University
Priority to CN202211332335.4A priority Critical patent/CN115393362B/en
Publication of CN115393362A publication Critical patent/CN115393362A/en
Application granted granted Critical
Publication of CN115393362B publication Critical patent/CN115393362B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06T 7/0012 — Image analysis; inspection of images, e.g. flaw detection; biomedical image inspection
    • G06N 3/08 — Computing arrangements based on biological models; neural networks; learning methods
    • G06V 10/761 — Image or video pattern matching; proximity, similarity or dissimilarity measures in feature spaces
    • G06V 10/7715 — Feature extraction, e.g. by transforming the feature space; mappings, e.g. subspace methods
    • G06V 10/774 — Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V 10/82 — Image or video recognition or understanding using neural networks
    • G06V 10/87 — Selection of the recognition techniques, e.g. of a classifier in a multiple classifier system
    • G06T 2207/20081 — Indexing scheme: training; learning
    • G06T 2207/20084 — Indexing scheme: artificial neural networks [ANN]
    • G06T 2207/30041 — Indexing scheme: biomedical image processing; eye; retina; ophthalmic

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method, equipment and a medium for selecting an automatic glaucoma identification model, wherein the method comprises the following steps: acquiring a pre-training model library; selecting a public retinal image dataset as a source domain dataset D_s and taking a glaucoma fundus image dataset as a target domain dataset D_t; measuring, for each model, the migratability between D_s and D_t: using the model to extract the feature vectors of the samples in D_s and D_t, subjecting them to bilinear transformation, and mapping the resulting high-dimensional feature vectors to a low dimension to obtain feature sets F_s and F_t; computing the distance between F_s and F_t, which characterizes the migratability of the current prediction model for automatic identification from the source domain to the target domain; and selecting the model with the strongest migratability for training and automatically identifying glaucoma. The prediction model selected by the invention requires no glaucoma sample labels and achieves a better automatic glaucoma identification effect.

Description

Method, equipment and medium for selecting automatic glaucoma recognition model
Technical Field
The invention belongs to the technical field of deep learning, and particularly relates to a method, equipment and medium for selecting an automatic glaucoma identification model based on mobility measurement.
Background
Recent advances in deep learning have been applied in different medical fields for the early detection or prediction of certain abnormalities. In ophthalmology, medical image analysis using deep learning methods has made significant progress. Among the major ophthalmic abnormalities, glaucoma is a common and serious one that can lead to irreversible loss of vision. At present, research on automatic glaucoma identification at home and abroad is based either on prior, manually designed glaucoma features or on latent features learned by deep models. Classification based on a large number of manually screened features has the advantage of being targeted to some extent, but it also has drawbacks. First, it incurs a high time cost, since a large amount of labor is required to screen the classification features. Second, the screened features may be affected by the subjective factors of the screening personnel, so they are not accurate enough. Third, such models are difficult to generalize, and large-scale glaucoma fundus data are hard to obtain, because medical data involve patient privacy and the data-barrier problem among hospitals in China remains serious.
To address the lack of effectively labeled medical data, researchers have proposed a new solution: transfer learning. Transfer learning is a learning method that mimics the human visual system by exploiting a large amount of prior knowledge from other relevant domains when performing a new task in a particular domain. The expectation is that a model can be trained to an ideal recognition performance even when the medical image dataset is small, and that it will also retain high automatic identification performance on a new test dataset. However, different retinal fundus image datasets exhibit significant differences in appearance due to different scanners, image resolutions, light source intensities and parameter settings, so a deep learning model that performs well on a source domain dataset can suffer a large performance degradation on a target domain dataset. In current transfer learning applications, finding the optimal migration strategy still requires time-consuming experimentation and domain knowledge. A migratability metric for a model can quantitatively reveal how easy it is to migrate the knowledge learned from source domain data to target domain data, and thereby provide guidance for selecting a transfer learning model. Therefore, research on model migratability measurement is of great significance for the wide and efficient application of transfer learning in automatic glaucoma identification. Current research on model migratability measurement mainly comprises the following methods, each of which has made breakthroughs but also faces certain limitations:
(1) Model migratability metrics based on empirical studies: Taskonomy evaluates migration performance by retraining the source model for each target task, which requires expensive training computation.
(2) Model migratability metrics based on analytical methods: H-score assesses migratability analytically by solving the HGR maximum correlation problem. NCE measures migratability in a particular setting using conditional entropy. LEEP constructs an empirical predictor by estimating the joint distribution of the pre-training and target label spaces: it predicts the virtual label distribution of the target data in the source label space and computes the empirical conditional distribution of the target label given a virtual label; the performance of this empirical predictor is then used to evaluate the pre-trained model. Although fast to compute, these analytical methods are not accurate and apply specifically to image classification with supervised pre-trained models. On the one hand, they make strict assumptions about the data; on the other hand, they work poorly in cross-domain settings.
(3) Migratability metrics based on Optimal Transport (OT): migratability is described as a linear combination of domain differences and task differences. The calculation requires part of the target dataset with known labels to serve as observation samples.
(4) Attribution-map-based migratability metrics for heterogeneous deep models: using an existing deep model attribution method, a data attribution map of each trained model in the model library is computed on a detection dataset, and model migratability is measured by the similarity of the attribution maps. This approach needs a purpose-built detection dataset (the authors collected it through various web image search engines), which adds substantial extra cost; the requirements on the detection data are strict, and its quality greatly affects the migratability measurement; moreover, an attribution map must be computed for each model in the library on the target data, so when the number of models is large, the storage of the attribution maps and the cost of computing their distances are not negligible.
(5) A zero-shot image retrieval method and device based on hash coding and a graph attention mechanism: at the macro level, this method uses a selected model to extract image features; in the model application stage it compares an unknown-label image one by one with all known-label images in the database, and takes the label of the database image with the smallest Hamming distance between hash codes as the predicted classification of the unknown-label image. This style of model selection lacks a prior estimate: it cannot measure a model's own migratability before the model is used for actual migration. At the micro level, classification is realized by comparing against every image in the database one by one, so the classification performance in practical applications is greatly limited by the richness of the database's known-label image library.
Due to the characteristics of transfer learning, the effectiveness of transfer learning methods in automatic identification of glaucoma fundus images is closely tied to model selection. Searching for the model with optimal recognition performance before carrying out the migration requires a great deal of experimentation and information, and a model that ends up with imperfect recognition accuracy after consuming large computing resources is a waste of those resources. This lack of an evaluation-based model selection approach leaves the field of automatic glaucoma identification with considerable uncertainty. Therefore, measuring the migratability of deep learning models has guiding significance for the application of fundus image recognition models. Model migratability is affected by many factors, such as the size of the dataset and the model optimization method; the capability of a deep learning model's feature extraction module is a main factor influencing its migration capability. How to measure the capability of a deep learning model's feature extraction module, and how to compare different models under the same criterion, is thus a key issue in evaluating model migratability. Current research on model migratability measurement mainly faces the following problems: the source model needs to be retrained, consuming expensive training computation; strict assumptions are made on the data, so the methods work poorly in cross-domain settings; and part of the target domain sample labels must be known.
Disclosure of Invention
Aiming at the defects of existing model migratability measurement methods, the invention provides a method, equipment and medium for selecting an automatic glaucoma identification model based on a migratability metric, which require no glaucoma sample labels and achieve a better automatic glaucoma identification effect.
In order to achieve the technical purpose, the invention adopts the following technical scheme:
a method for glaucoma auto-identification model selection based on migratability metrics, comprising:
Step 1, obtaining a pre-training model library Ω = {φ_1, φ_2, …, φ_N} trained on a standard dataset, wherein φ_1, …, φ_N are N different pre-training models;

Step 2, selecting a public retinal image dataset as a source domain dataset D_s comprising m public retinal image samples, noted D_s = {x_1^s, x_2^s, …, x_m^s}; taking a glaucoma fundus image dataset as a target domain dataset D_t comprising n glaucoma fundus image samples, noted D_t = {x_1^t, x_2^t, …, x_n^t}; and, for each pre-training model, measuring through steps 3-5 its migratability for automatic identification from the source domain dataset D_s to the target domain dataset D_t;

Step 3, using the pre-training model to extract the feature vectors of the samples in the source domain dataset D_s and the target domain dataset D_t, and performing a bilinear transformation on each extracted feature vector to obtain a high-dimensional feature vector;

Step 4, performing a count sketch mapping on each high-dimensional feature vector obtained in step 3 to obtain a source domain feature set F_s characterizing D_s and a target domain feature set F_t characterizing D_t;

Step 5, calculating the central feature μ_s of the source domain feature set F_s and the central feature μ_t of the target domain feature set F_t, then calculating the distance between μ_s and μ_t, and using the distance to characterize the migratability of the current prediction model for automatic identification from the source domain dataset D_s to the target domain dataset D_t, the model with the smallest distance having the strongest migratability;

Step 6, selecting the pre-training model with the strongest migratability, training it using the source domain dataset D_s, and taking the trained model as the automatic glaucoma identification model.
Further, the N different pre-training models are all heterogeneous deep learning models.
Further, the computation formula for performing the bilinear transformation on a feature vector is:

B = x · xᵀ

where x is the feature vector extracted by the pre-training model, xᵀ is the transpose of the feature vector x, and B is the matrix obtained by the bilinear transformation. All elements of the matrix B are then concatenated into a high-dimensional feature vector z of length s², where s is the dimension of x.
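For illustration only, the bilinear step above can be sketched in a few lines of NumPy; this is a minimal sketch of the outer-product construction, and the 512-dimensional input is an assumed example size rather than a value taken from the patent:

```python
import numpy as np

def bilinear_feature(x: np.ndarray) -> np.ndarray:
    # B = x x^T is the s x s bilinear matrix; flattening it gives the
    # length-s^2 high-dimensional feature vector z described above.
    B = np.outer(x, x)
    return B.reshape(-1)

# Example with an assumed 512-dimensional feature from a convolutional base:
x = np.random.randn(512).astype(np.float32)
z = bilinear_feature(x)
print(z.shape)  # (262144,)
```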
Further, the specific process of performing the count sketch mapping on each high-dimensional feature vector is:

(1) Define the projection dimension d of the count sketch transformation function.

(2) Randomly generate two arrays h and σ, where each entry of h is assigned a value randomly drawn from {1, 2, …, d} and each entry of σ is assigned a value randomly drawn from {1, −1}; initialize a d-dimensional zero vector y.

(3) Compute y_{h(i)} = y_{h(i)} + σ(i) · z_i for every component i, where z_i is the i-th component of the high-dimensional feature vector z; the resulting d-dimensional vector y is the mapped feature vector.
Further, the central feature μ_s of the source domain feature set F_s is computed as follows:

(1) For the source domain feature set F_s = {f_1^s, f_2^s, …, f_m^s}, set the number of clusters k = 1 and let the cluster C = ∅; here f_1^s, …, f_m^s are the m feature vectors obtained from the m samples of the source domain dataset D_s through the count sketch mapping.

(2) Randomly select 1 feature vector from F_s as the initial mean vector μ_0.

(3) Randomly select 1 feature vector f_j from F_s and add it to the cluster C.

(4) Update the mean vector, μ = ((t − 1) · μ_old + f_j) / t, where t is the number of feature vectors processed so far, and at the same time remove f_j from the source domain feature set F_s; μ_old and μ are the mean vectors before and after the update.

(5) Repeat steps (3) and (4) until the source domain feature set F_s is empty; the mean vector at this point is the central feature μ_s of the source domain feature set F_s.

The central feature μ_t of the target domain feature set F_t is computed in the same way as the central feature μ_s of the source domain feature set F_s.
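The randomized running-mean procedure can be sketched as follows; as the assertion at the end shows, it reduces to the plain arithmetic mean of the feature set, which is one way to read steps (1)-(5) (the sample count of 101 is just an assumed example matching the Drishti-GS embodiment below):

```python
import numpy as np

def central_feature(features: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    # features: (num_samples, d). Consume the set in random order, keeping
    # a running mean; the result equals the plain mean of all rows.
    order = rng.permutation(len(features))
    mu = features[order[0]].astype(np.float64)
    for t, idx in enumerate(order[1:], start=2):
        mu += (features[idx] - mu) / t  # mu <- ((t - 1) * mu + f_j) / t
    return mu

rng = np.random.default_rng(0)
F_s = np.random.randn(101, 8000)  # e.g. 101 samples after sketching
mu_s = central_feature(F_s, rng)
assert np.allclose(mu_s, F_s.mean(axis=0))
```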
Further, the Canberra distance between μ_s and μ_t is calculated as:

dist(μ_s, μ_t) = Σ_{i=1}^{d} |μ_s^(i) − μ_t^(i)| / (|μ_s^(i)| + |μ_t^(i)|)

where dist(μ_s, μ_t) is the Canberra distance between μ_s and μ_t, μ_s^(i) and μ_t^(i) respectively denote the i-th dimension features of μ_s and μ_t, and d is the dimension of the feature vectors obtained by the count sketch mapping.
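A direct NumPy rendering of this formula follows; treating coordinates where both entries are zero as contributing nothing is an assumption of this sketch that the patent text does not spell out:

```python
import numpy as np

def canberra_distance(mu_s: np.ndarray, mu_t: np.ndarray) -> float:
    # Sum over coordinates of |a - b| / (|a| + |b|), skipping coordinates
    # where both entries are zero (the 0/0 case contributes nothing).
    num = np.abs(mu_s - mu_t)
    den = np.abs(mu_s) + np.abs(mu_t)
    mask = den > 0
    return float(np.sum(num[mask] / den[mask]))
```

The same quantity is also available off the shelf as scipy.spatial.distance.canberra.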
An electronic device comprises a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to implement the method for selecting an automatic glaucoma identification model based on a migratability metric according to any of the above technical solutions.
A computer-readable storage medium stores a computer program which, when executed by a processor, implements the method for selecting an automatic glaucoma identification model based on a migratability metric according to any of the above technical solutions.
Advantageous effects
1. In the prior art, performing transfer learning requires retraining multiple source models, and the migratability of each model is evaluated according to the recognition accuracy on target images after the model has actually been migrated. In the medical imaging field, however, fundus datasets are small and differ greatly from one another, which limits the application effect of transfer learning in automatic glaucoma identification. The present method measures model migratability at the stage where the convolutional base of the deep learning model extracts image features: the source models need not be retrained and no actual migration is needed, yet a deep learning model with better migration performance can be selected for automatic identification of glaucoma fundus images; no glaucoma sample labels are required, and a better automatic glaucoma identification effect is achieved.
2. Aiming at the problems that the features extracted from the convolutional bases of different deep learning models are high-dimensional, weak in characterization capability and impossible to compare directly, the invention proposes to generate a joint representation with bilinear features, enhancing the characterization capability of the feature vectors; to approximate a kernel function with the count sketch transformation function, mapping the high-dimensional bilinear features into the same vector space of relatively low dimensionality; and to measure the distance between the transformed image feature vectors with the Canberra distance, which reflects model migratability and provides guidance for model selection in transfer learning applications. On the one hand, compared with other measurement methods, the Canberra distance is suited to measuring the distance between two points in a vector space and is sensitive to changes of values close to 0 (and ≥ 0), which fits the model migratability measurement scenario, as the worked example below shows; on the other hand, the method has a low computational cost and needs no extra storage.
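A worked example of that near-zero sensitivity (illustrative numbers, not from the original): a coordinate pair (0.01, 0.02) contributes |0.01 − 0.02| / (0.01 + 0.02) ≈ 0.33 to the Canberra distance, while the pair (1.01, 1.02), with the same absolute difference, contributes only |1.01 − 1.02| / (1.01 + 1.02) ≈ 0.005.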
3. Aiming at the problems in the field of model migratability measurement that strict assumptions are made on the source domain and target domain data and that the measurement effect is poor in cross-domain settings, the method performs migratability measurement on multiple pre-training models without making strict assumptions on the source and target domain data, and its measurement effect is unaffected in cross-domain settings.
Drawings
FIG. 1 is an overall flow diagram of the method according to an embodiment of the present application.
Detailed Description
The following describes embodiments of the present invention in detail. The embodiments are developed on the basis of the technical solutions of the invention, and give detailed implementation manners and specific operation procedures to further explain those technical solutions.
Example 1
The embodiment provides a method for selecting an automatic glaucoma identification model based on a migratability metric; referring to FIG. 1, the method includes the following steps.

Step 1, obtain a pre-training model library Ω = {φ_1, φ_2, …, φ_N} trained on the standard dataset ImageNet, where φ_1, …, φ_N are N different pre-training models; in this embodiment, the N models are heterogeneous deep learning models.

Step 2, select a public retinal image dataset as the source domain dataset D_s, comprising m public retinal image samples and noted D_s = {x_1^s, x_2^s, …, x_m^s}; take a glaucoma fundus image dataset as the target domain dataset D_t, comprising n glaucoma fundus image samples and noted D_t = {x_1^t, x_2^t, …, x_n^t}. For each pre-training model, steps 3-5 measure its migratability for automatic identification from the source domain dataset D_s to the target domain dataset D_t.
The public retinal image dataset can be Drishti-GS, RIM-ONE (R1, R2, R3), REFUGE, etc.; this embodiment selects Drishti-GS, which comprises 101 retinal images together with mask annotations of the optic disc and optic cup used for glaucoma detection.
Step 3, use the pre-training model to extract the feature vectors of the samples in the source domain dataset D_s and the target domain dataset D_t, and then apply the bilinear transformation to each extracted feature vector to obtain a high-dimensional feature vector.

The feature vector of each public retinal image and each glaucoma fundus image is noted x = (x_1, x_2, …, x_s), where x_i denotes the i-th dimension feature of the feature vector x.

Performing the bilinear transformation on the feature vector x yields an s × s matrix B, namely B = x · xᵀ. All elements of the matrix B are then concatenated to obtain a high-dimensional feature vector z of length s².
Step 4, apply the count sketch mapping to each high-dimensional feature vector obtained in step 3 to obtain the source domain feature set F_s characterizing D_s and the target domain feature set F_t characterizing D_t. This embodiment specifically includes:

(1) Define the projection dimension d of the count sketch transformation function. The appropriate setting of d depends on the amount of training data, the memory budget and the task difficulty. In this embodiment, d = 8000 is sufficient to achieve near-maximum accuracy.

(2) Randomly generate two arrays h and σ, where each entry of h is assigned a value randomly drawn from {1, 2, …, d} and each entry of σ is assigned a value randomly drawn from {1, −1}; initialize a d-dimensional zero vector y.

(3) Compute y_{h(i)} = y_{h(i)} + σ(i) · z_i for every component i, where z_i is the i-th component of the high-dimensional feature vector z; the mapped feature vector y obtained in this way is low in dimensionality yet strong in characterization.

Passing the source domain dataset D_s through the convolutional base of model φ_k to extract features and then through the count sketch mapping finally yields the feature set F_s = {f_1^s, f_2^s, …, f_m^s}, where f_j^s denotes the feature of source domain sample x_j^s. Likewise, passing the target domain dataset D_t through the convolutional base of model φ_k and the count sketch mapping yields the feature set F_t = {f_1^t, f_2^t, …, f_n^t}, where f_j^t denotes the feature of target domain sample x_j^t.
Step 5, compute the central feature μ_s of the source domain feature set F_s and the central feature μ_t of the target domain feature set F_t, then compute the Canberra distance between μ_s and μ_t, and use this distance to characterize the migratability of the current prediction model for automatic identification from the source domain dataset D_s to the target domain dataset D_t; the model with the smallest distance has the strongest migratability.
The central feature μ_s of the source domain feature set F_s is computed as follows:

(1) For the source domain feature set F_s = {f_1^s, f_2^s, …, f_m^s}, set the number of clusters k = 1 and let the cluster C = ∅; here f_1^s, …, f_m^s are the m feature vectors obtained from the m samples of the source domain dataset D_s through the count sketch mapping.

(2) Randomly select 1 feature vector from F_s as the initial mean vector μ_0.

(3) Randomly select 1 feature vector f_j from F_s and add it to the cluster C.

(4) Update the mean vector, μ = ((t − 1) · μ_old + f_j) / t, where t is the number of feature vectors processed so far, and at the same time remove f_j from the source domain feature set F_s; μ_old and μ are the mean vectors before and after the update.

(5) Repeat steps (3) and (4) until the source domain feature set F_s is empty; the mean vector at this point is the central feature μ_s of the source domain feature set F_s.

The central feature μ_t of the target domain feature set F_t is computed in the same way:

(1) For the target domain feature set F_t = {f_1^t, f_2^t, …, f_n^t}, set the number of clusters k = 1 and let the cluster C = ∅; here f_1^t, …, f_n^t are the n feature vectors obtained from the n samples of the target domain dataset D_t through the count sketch mapping.

(2) Randomly select 1 feature vector from F_t as the initial mean vector μ_0.

(3) Randomly select 1 feature vector f_j from F_t and add it to the cluster C.

(4) Update the mean vector, μ = ((t − 1) · μ_old + f_j) / t, and at the same time remove f_j from the target domain feature set F_t; μ_old and μ are the mean vectors before and after the update.

(5) Repeat steps (3) and (4) until the target domain feature set F_t is empty; the mean vector at this point is the central feature μ_t of the target domain feature set F_t.
In addition, the Canberra distance between μ_s and μ_t is calculated as:

dist(μ_s, μ_t) = Σ_{i=1}^{d} |μ_s^(i) − μ_t^(i)| / (|μ_s^(i)| + |μ_t^(i)|)

where dist(μ_s, μ_t) is the Canberra distance between μ_s and μ_t, μ_s^(i) and μ_t^(i) respectively denote the i-th dimension features of μ_s and μ_t, and d is the dimension of the feature vectors obtained by the count sketch mapping.

For each pre-training model, the Canberra distance between μ_s and μ_t obtained according to steps 3-5 characterizes the migratability of the current prediction model for automatic identification from the source domain dataset D_s to the target domain dataset D_t; the smaller the Canberra distance, the stronger the migratability of the pre-training model.
Step 6, select the pre-training model with the strongest migratability, train it using the labeled source domain dataset D_s, and take the trained model as the automatic glaucoma identification model.
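Putting steps 3-6 together, the scoring of a single pre-trained model can be sketched in Python as below. This is a minimal illustration under stated assumptions: `extract` stands in for the convolutional base of one pre-training model, `model_library`, `D_s` and `D_t` are hypothetical names, and the plain mean replaces the randomized running mean of step 5, since the two coincide.

```python
import numpy as np

def transferability_distance(extract, source_images, target_images,
                             d: int = 8000, seed: int = 0) -> float:
    # `extract(image)` is an assumed interface returning an s-dimensional
    # feature vector from one model's convolutional base, not a real API.
    rng = np.random.default_rng(seed)
    s = extract(source_images[0]).shape[0]
    h = rng.integers(0, d, size=s * s)                    # shared by both domains
    sigma = rng.choice(np.array([1.0, -1.0]), size=s * s)

    def sketch(img):
        x = extract(img)
        z = np.outer(x, x).ravel()                        # bilinear feature
        y = np.zeros(d)
        np.add.at(y, h, sigma * z)                        # count sketch mapping
        return y

    mu_s = np.mean([sketch(im) for im in source_images], axis=0)
    mu_t = np.mean([sketch(im) for im in target_images], axis=0)
    num, den = np.abs(mu_s - mu_t), np.abs(mu_s) + np.abs(mu_t)
    mask = den > 0
    return float(np.sum(num[mask] / den[mask]))           # Canberra distance

# Step 6 then reduces to picking the minimum-distance model, e.g.:
# best = min(model_library, key=lambda f: transferability_distance(f, D_s, D_t))
```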
Example 2
An electronic device comprising a memory and a processor, the memory having stored therein a computer program that, when executed by the processor, causes the processor to implement the method for automatic glaucoma recognition model selection based on migratability metrics of embodiment 1.
Example 3
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method for automatic glaucoma identification model selection based on migratability metrics according to embodiment 1.
The above embodiments are preferred embodiments of the present application, and those skilled in the art can make various changes or modifications without departing from the general concept of the present application, and such changes or modifications should fall within the scope of the claims of the present application.

Claims (8)

1. A method for selecting an automatic glaucoma identification model based on a migratability metric, comprising:

step 1, obtaining a pre-training model library Ω = {φ_1, φ_2, …, φ_N} trained on a standard dataset, wherein φ_1, …, φ_N are N different pre-training models;

step 2, selecting a public retinal image dataset as a source domain dataset D_s comprising m public retinal image samples, noted D_s = {x_1^s, x_2^s, …, x_m^s}; taking a glaucoma fundus image dataset as a target domain dataset D_t comprising n glaucoma fundus image samples, noted D_t = {x_1^t, x_2^t, …, x_n^t}; and, for each pre-training model, measuring through steps 3-5 its migratability for automatic identification from the source domain dataset D_s to the target domain dataset D_t;

step 3, using the pre-training model to extract feature vectors of the samples in the source domain dataset D_s and the target domain dataset D_t, and performing a bilinear transformation on each extracted feature vector to obtain a high-dimensional feature vector;

step 4, performing a count sketch mapping on each high-dimensional feature vector obtained in step 3 to obtain a source domain feature set F_s characterizing D_s and a target domain feature set F_t characterizing D_t;

step 5, calculating a central feature μ_s of the source domain feature set F_s and a central feature μ_t of the target domain feature set F_t, then calculating the Canberra distance between μ_s and μ_t, and using the distance to characterize the migratability of the current prediction model for automatic identification from the source domain dataset D_s to the target domain dataset D_t, the model with the smallest distance having the strongest migratability;

step 6, selecting the pre-training model with the strongest migratability, training it using the source domain dataset D_s, and taking the trained model as the automatic glaucoma identification model.
2. The method for selecting an automatic glaucoma identification model according to claim 1, wherein the N different pre-training models are all heterogeneous deep learning models.
3. The method for selecting an automatic glaucoma identification model according to claim 1, wherein the computation formula for performing the bilinear transformation on a feature vector is:

B = x · xᵀ

wherein x is the feature vector extracted by the pre-training model, xᵀ is the transpose of the feature vector x, and B is the matrix obtained by the bilinear transformation; all elements of the matrix B are then concatenated into a high-dimensional feature vector z of length s², where s is the dimension of x.
4. The method for selecting an automatic glaucoma identification model according to claim 1, wherein the specific process of performing the count sketch mapping on each high-dimensional feature vector is:

(1) defining a projection dimension d of the count sketch transformation function;

(2) randomly generating two arrays h and σ, wherein each entry of h is assigned a value randomly drawn from {1, 2, …, d} and each entry of σ is assigned a value randomly drawn from {1, −1}, and initializing a d-dimensional zero vector y;

(3) computing y_{h(i)} = y_{h(i)} + σ(i) · z_i for every component i, wherein z_i is the i-th component of the high-dimensional feature vector z; the resulting d-dimensional vector y is the mapped feature vector.
5. The method for selecting an automatic glaucoma identification model according to claim 1, wherein the central feature μ_s of the source domain feature set F_s is computed as follows:

(1) for the source domain feature set F_s = {f_1^s, f_2^s, …, f_m^s}, setting the number of clusters k = 1 and letting the cluster C = ∅, wherein f_1^s, …, f_m^s are the m feature vectors obtained from the m samples of the source domain dataset D_s through the count sketch mapping;

(2) randomly selecting 1 feature vector from F_s as the initial mean vector μ_0;

(3) randomly selecting 1 feature vector f_j from F_s and adding it to the cluster C;

(4) updating the mean vector, μ = ((t − 1) · μ_old + f_j) / t, wherein t is the number of feature vectors processed so far, and at the same time removing f_j from the source domain feature set F_s, μ_old and μ being the mean vectors before and after the update;

(5) repeating steps (3) and (4) until the source domain feature set F_s is empty, the mean vector at this point being the central feature μ_s of the source domain feature set F_s;

wherein the central feature μ_t of the target domain feature set F_t is computed in the same way as the central feature μ_s of the source domain feature set F_s.
6. The method for selecting an automatic glaucoma identification model according to claim 1, wherein the Canberra distance between μ_s and μ_t is calculated as:

dist(μ_s, μ_t) = Σ_{i=1}^{d} |μ_s^(i) − μ_t^(i)| / (|μ_s^(i)| + |μ_t^(i)|)

wherein dist(μ_s, μ_t) is the Canberra distance between μ_s and μ_t, μ_s^(i) and μ_t^(i) respectively denote the i-th dimension features of μ_s and μ_t, and d is the dimension of the feature vectors obtained by the count sketch mapping.
7. An electronic device comprising a memory and a processor, the memory having stored therein a computer program, wherein the computer program, when executed by the processor, causes the processor to carry out the method according to any one of claims 1 to 6.
8. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1 to 6.
CN202211332335.4A 2022-10-28 2022-10-28 Method, equipment and medium for selecting automatic glaucoma recognition model Active CN115393362B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211332335.4A CN115393362B (en) 2022-10-28 2022-10-28 Method, equipment and medium for selecting automatic glaucoma recognition model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211332335.4A CN115393362B (en) 2022-10-28 2022-10-28 Method, equipment and medium for selecting automatic glaucoma recognition model

Publications (2)

Publication Number Publication Date
CN115393362A true CN115393362A (en) 2022-11-25
CN115393362B CN115393362B (en) 2023-02-03

Family

ID=84115167

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211332335.4A Active CN115393362B (en) 2022-10-28 2022-10-28 Method, equipment and medium for selecting automatic glaucoma recognition model

Country Status (1)

Country Link
CN (1) CN115393362B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210369195A1 (en) * 2018-04-26 2021-12-02 Voxeleron, LLC Method and system for disease analysis and interpretation
CN110070535A (en) * 2019-04-23 2019-07-30 东北大学 A kind of retinal vascular images dividing method of Case-based Reasoning transfer learning
CN110378366A (en) * 2019-06-04 2019-10-25 广东工业大学 A kind of cross-domain image classification method based on coupling knowledge migration
CN113344016A (en) * 2020-02-18 2021-09-03 深圳云天励飞技术有限公司 Deep migration learning method and device, electronic equipment and storage medium
CN114724231A (en) * 2022-04-13 2022-07-08 东北大学 Glaucoma multi-modal intelligent recognition system based on transfer learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Wu Xing (吴星) et al., "Research on the diagnostic value of artificial intelligence fundus analysis techniques for glaucoma lesions", Academic Journal of Chinese PLA Medical School *
Xu Zhijing (徐志京) et al., "Transfer learning classification method for glaucoma fundus images", Computer Engineering and Applications *

Also Published As

Publication number Publication date
CN115393362B (en) 2023-02-03

Similar Documents

Publication Publication Date Title
Stacke et al. Measuring domain shift for deep learning in histopathology
Chen et al. Source-free domain adaptive fundus image segmentation with denoised pseudo-labeling
Zhang et al. Quantifying facial age by posterior of age comparisons
US20220237788A1 (en) Multiple instance learner for tissue image classification
Putra et al. Enhanced skin condition prediction through machine learning using dynamic training and testing augmentation
Prabhu et al. Few-shot learning for dermatological disease diagnosis
CN113454733A (en) Multi-instance learner for prognostic tissue pattern recognition
Sainju et al. Automated bleeding detection in capsule endoscopy videos using statistical features and region growing
WO2019015246A1 (en) Image feature acquisition
Filipovych et al. Semi-supervised cluster analysis of imaging data
Zakazov et al. Anatomy of domain shift impact on U-Net layers in MRI segmentation
Mitchell-Heggs et al. Neural manifold analysis of brain circuit dynamics in health and disease
Zhang et al. Feature-transfer network and local background suppression for microaneurysm detection
Naqvi et al. Feature quality-based dynamic feature selection for improving salient object detection
Zhou et al. Adaptive weighted locality-constrained sparse coding for glaucoma diagnosis
He et al. A selective overview of feature screening methods with applications to neuroimaging data
Marinescu et al. A vertex clustering model for disease progression: application to cortical thickness images
Voon et al. Evaluating the effectiveness of stain normalization techniques in automated grading of invasive ductal carcinoma histopathological images
CN115393362B (en) Method, equipment and medium for selecting automatic glaucoma recognition model
Wang et al. Signal subgraph estimation via vertex screening
Fernández et al. Diffusion methods for aligning medical datasets: Location prediction in CT scan images
Kamraoui et al. Popcorn: Progressive pseudo-labeling with consistency regularization and neighboring
Yesilbek11 et al. SVM-based sketch recognition: which hyperparameter interval to try?
Islam et al. Unic-net: Uncertainty aware involution-convolution hybrid network for two-level disease identification
Ghoshal et al. Bayesian deep active learning for medical image analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant