CN114091621A

CN114091621A - BPPV eye shake signal labeling method

Info

Publication number: CN114091621A
Application number: CN202111455667.7A
Authority: CN
Inventors: 时海波
Original assignee: Shanghai Sixth Peoples Hospital
Current assignee: Shanghai Sixth Peoples Hospital
Priority date: 2021-12-01
Filing date: 2021-12-01
Publication date: 2022-02-25

Abstract

The invention relates to a BPPV eye shake signal labeling method comprising the following steps: experts label a selected subset of high-quality video samples according to clinical diagnostic experience; based on the expert-labeled data, data processing schemes such as data enhancement and positive sample expansion are used to expand the sample size until it is sufficient, and a basic deep learning model is trained to serve as the Teacher Model; the unlabeled data is used as a test set, and the Teacher Model predicts on it to produce the corresponding pseudo labels; samples whose pseudo labels are negative or low-confidence positive, as well as isolated positive samples, are discarded; among consecutive positive samples, those with high confidence are marked as positive and used as new training samples. The invention trains the deep learning model with a limited labeled data set and automatically labels new patient samples, thereby reducing the labeling cost.

Description

BPPV eye shake signal labeling method

[ technical field ]

The invention relates to data modeling in the medical field, and in particular to a method for labeling eye shake (nystagmus) signals in BPPV (benign paroxysmal positional vertigo).

[ background art ]

When modeling real video data, a certain amount of labeled data is needed for supervised learning. In the medical field, however, labeling samples usually requires the participation of professional doctors, and the required labeling accuracy is extremely high. At the same time, because of privacy, legal, and other related factors, real patient data cannot be obtained from outside sources, so the cost for a hospital to obtain labeled, trainable samples is very high.

Deep learning model training requires a large amount of labeled data, which is very costly in the medical field. For example, the deep learning model proposed in the paper "Developing a Diagnostic Decision Support System for Benign Paroxysmal Positional Vertigo Using a Deep-Learning Model" requires labeled video data from 70,000 patients. When modeling at a Grade III Class A hospital, fewer than 200 labeled video samples were available, which became the biggest obstacle to training a deep learning model. For organizations that do not specialize in data, the cost of obtaining labeled samples is likewise very high.

Furthermore, after a detailed study of the etiology of BPPV and of the video samples, the following conclusions were drawn: (1) the number of positive samples (an eye shake is present) is much smaller than the number of negative samples; (2) the eye shake signal consists of the pupil moving regularly in alternating fast and slow phases, and it shows up very clearly in the time series of the pupil trajectory; (3) positive samples cost more to label than negative samples. Based on these conclusions, the invention holds that, for a high-accuracy one-dimensional convolutional neural network model, the model's prediction score for a new unlabeled sample can be used as the basis for labeling it as a positive sample.

[ summary of the invention ]

The invention aims to overcome the above shortcomings by providing a BPPV eye shake signal labeling method that trains a deep learning model with a limited labeled data set and automatically labels new patient samples, thereby reducing the labeling cost.

The BPPV eye shake signal labeling method designed to achieve this purpose comprises the following steps: 1) expert labeling of a small number of samples: hospital experts label a selected subset of high-quality video samples according to clinical diagnostic experience; 2) training a deep learning model: based on the expert-labeled data, data processing schemes such as data enhancement and positive sample expansion are used to expand the sample size until it is sufficient, and a basic deep learning model is trained to serve as the Teacher Model; 3) prediction on unlabeled samples: the unlabeled data is used as a test set, and the Teacher Model from step 2) predicts on it to produce the corresponding pseudo labels; 4) pseudo label screening: samples whose pseudo labels are negative or low-confidence positive (confidence <0.95) are discarded; isolated positive samples are discarded; among consecutive positive samples, those with high confidence (>0.95) are marked as positive and used as new training samples.

Further, step 4) is followed by a model updating step: the positive samples obtained in step 4) are combined with the training set data from step 2), data enhancement and positive/negative sample rebalancing are performed, a new training set and a new validation set are split off, a new model is trained on the new data set, and the iteration is repeated until the loss function converges or no longer improves.

Further, in step 2), the sample size is expanded by means including but not limited to data inversion, noise addition, and the SMOTE sample construction algorithm.

Further, in step 3), when an eye shake signal prediction is needed for a new sample, the eye shake curve is cut with a rolling window into sub-samples 400 frames long during the prediction stage, and the model predicts on each sub-sample.

Further, in step 3), the sub-sample is divided into a plurality of sub-sample segments, and the predicted pseudo label of each sub-sample segment is represented by 1 or 0, where 1 indicates that an eye shake is present in the segment and 0 indicates that no eye shake was detected in the segment.

Compared with the prior art, the method relies mainly on the high accuracy of the BPPV eye shake detection model and the universality of eye shake signals. Through the stages of model training, model prediction, and model updating, it replaces the original scheme of learning from fully labeled data with a combination of manual expert labeling and model self-training labeling, thereby reducing cost. The invention thus solves the problem of how to train a deep learning model with a limited labeled data set, automatically label new patient samples, and reduce the labeling cost.

[ description of the drawings ]

FIG. 1 is an overall flow diagram of the present invention;

FIG. 2 is an example diagram of an eye shake video sample according to an embodiment of the invention.

[ detailed description of the invention ]

The invention is further illustrated below with reference to specific examples:

As shown in FIG. 1, the present invention provides a BPPV eye shake signal labeling method whose overall process includes the following steps:

Step one, expert labeling of a small number of samples:

Hospital experts label a selected subset of high-quality video samples according to clinical diagnostic experience. In general, the expert labels can be assumed to be accurate.

Step two, training a deep learning model:

Based on the expert-labeled data, data processing schemes such as data enhancement and positive sample expansion are used to expand the sample size (for example, strategies such as data flipping, inversion, and noise addition, and sample construction algorithms such as SMOTE) until the number of samples is sufficient, and a basic deep learning model is trained to serve as the Teacher Model.
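The expansion strategies named in step two can be sketched as follows. This is only an illustration under stated assumptions: each sample is taken to be a one-dimensional pupil trajectory stored as a list of floats, the function names are hypothetical, and `smote_like` is a simplified interpolation in the spirit of SMOTE rather than a reference implementation of that algorithm.

```python
import random

def reverse_time(curve):
    """Flip the curve along the time axis."""
    return curve[::-1]

def invert_sign(curve):
    """Mirror the curve around zero (assumes a zero-centred trajectory)."""
    return [-x for x in curve]

def add_noise(curve, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    return [x + rng.gauss(0.0, sigma) for x in curve]

def smote_like(positives, n_new, k, rng):
    """Build n_new synthetic positives by interpolating between a positive
    sample and one of its k nearest positive neighbours."""
    if len(positives) < 2:
        raise ValueError("need at least two positive samples")

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(positives)
        others = [p for p in positives if p is not base]
        neighbours = sorted(others, key=lambda p: dist(base, p))[:k]
        other = rng.choice(neighbours)
        lam = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + lam * (o - b) for b, o in zip(base, other)])
    return synthetic

# Toy positive curves (made-up values, much shorter than real samples).
rng = random.Random(0)
positives = [[0.0, 1.0, 0.0, -1.0], [0.0, 2.0, 0.0, -2.0], [0.5, 1.5, 0.5, -0.5]]
expanded = (positives
            + [reverse_time(c) for c in positives]
            + [invert_sign(c) for c in positives]
            + [add_noise(c, 0.05, rng) for c in positives]
            + smote_like(positives, 3, 2, rng))
```

In this toy run the three expert-labeled positives grow to fifteen training curves. Whether sign inversion preserves the label depends on how eye shake direction is encoded, which the text does not specify.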

Step three, predicting an unlabeled sample:

The unlabeled data is used as a test set, and the Teacher Model from step two predicts on it to produce the corresponding pseudo labels. When an eye shake signal prediction is needed for a new sample, the eye shake curve is cut with a rolling window into sub-samples 400 frames long during the prediction stage, and the model predicts on each sub-sample, as shown in the eye shake video sample diagram of FIG. 2.
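A minimal sketch of the rolling slicing is given here. The 400-frame window length comes from the text. The stride is not stated, so non-overlapping windows (stride equal to the window length) are assumed, and the function name `roll_slice` is hypothetical.

```python
def roll_slice(curve, window=400, stride=400):
    """Cut a curve into consecutive sub-samples of `window` frames.

    Trailing frames that do not fill a whole window are dropped.
    The default stride (no overlap) is an assumption, not from the source.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    last_start = len(curve) - window
    return [curve[s:s + window] for s in range(0, last_start + 1, stride)]

# A 6400-frame curve yields 16 sub-samples of 400 frames each.
curve = [0.0] * 6400
sub_samples = roll_slice(curve)
```

Passing a smaller stride, for example `roll_slice(curve, 400, 200)`, produces overlapping windows instead.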

As shown in FIG. 2, the eye shake sequence of the video has 16 sub-sample segments, and the corresponding predicted pseudo labels are:

[1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1]

Here 1 indicates that an eye shake is present in the segment (400 frames long) and 0 indicates that no eye shake was detected in the segment. The interval marked by dotted lines in the figure is the detected eye shake sequence, corresponding to the positions of the 11 consecutive positive labels.
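The grouping of the pseudo labels into consecutive positive runs can be illustrated with a short sketch. The label sequence is the one given above, and the helper name `positive_runs` is hypothetical.

```python
def positive_runs(labels):
    """Return (start_index, length) for every maximal run of 1s."""
    runs = []
    start = None
    for i, value in enumerate(labels):
        if value == 1 and start is None:
            start = i                       # a new run begins
        elif value != 1 and start is not None:
            runs.append((start, i - start))  # the current run ends
            start = None
    if start is not None:                    # run reaching the end of the list
        runs.append((start, len(labels) - start))
    return runs

labels = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1]
runs = positive_runs(labels)  # [(0, 11), (15, 1)]
```

The first run covers the 11 consecutive positive segments, and the second is the single positive label at the end of the sequence.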

Step four, pseudo label screening:

In BPPV cases, the patient sample distribution has the following characteristics: (1) positive and negative samples are highly imbalanced, with far more negative samples (no eye shake) than positive samples (eye shake present); (2) the classification of BPPV symptoms is complex, so correctly labeled positive samples are more valuable.

In view of these two points, the present invention screens the pseudo labels as follows:

samples whose pseudo labels are negative or low-confidence positive (<0.95) are discarded;

isolated positive samples (such as the last label in the FIG. 2 example) are discarded;

among consecutive positive samples, those with high confidence (>0.95) are marked as positive and used as new training samples.

The screening step requires no manual intervention and can be run in batches.
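The screening rules can be sketched as below. The 0.95 confidence threshold comes from the text. Two points are assumptions: a segment is treated as predicted positive when its score is at least 0.5, and a positive counts as isolated when neither neighbouring segment is predicted positive. The function name is hypothetical.

```python
def screen_pseudo_labels(scores, threshold=0.95, boundary=0.5):
    """Return indices of sub-samples kept as new positive training samples.

    scores    -- Teacher Model confidence per sub-sample, in [0, 1]
    threshold -- high-confidence cut-off (0.95 in the text)
    boundary  -- positive/negative decision boundary (assumed, not in the text)
    """
    predicted = [s >= boundary for s in scores]
    kept = []
    for i, s in enumerate(scores):
        if not predicted[i]:
            continue  # negative pseudo label: discard
        left = i > 0 and predicted[i - 1]
        right = i + 1 < len(scores) and predicted[i + 1]
        if not (left or right):
            continue  # isolated positive: discard
        if s > threshold:
            kept.append(i)  # consecutive, high-confidence positive: keep
    return kept

# Made-up confidences for eight sub-samples.
scores = [0.99, 0.97, 0.80, 0.10, 0.99, 0.20, 0.96, 0.98]
kept = screen_pseudo_labels(scores)  # [0, 1, 6, 7]
```

Index 2 is dropped for low confidence, index 4 for being isolated, and indices 3 and 5 for being negative.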

Step five, updating the model:

(1) combining the obtained positive sample with training set data;

(2) performing data enhancement and positive and negative sample rebalancing, and dividing a new training set and a new verification set;

(3) training a new model using the new data set;

(4) repeating the iteration until the loss function converges or no longer improves.
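The update loop of step five can be outlined as follows. This is a sketch under several assumptions: `fit`, `score`, `loss`, and `select` are caller-supplied placeholders, the toy versions shown are a scalar threshold classifier rather than the one-dimensional convolutional network, and the data enhancement, rebalancing, and training/validation split are omitted, so the loss is evaluated on the training set.

```python
import math

def self_train(train_set, unlabeled, fit, score, loss, select,
               max_rounds=10, tol=1e-4):
    """Alternate training and pseudo-labeling until the loss stops improving."""
    model = fit(train_set)
    best = loss(model, train_set)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        keep = set(select([score(model, x) for x in pool]))
        if not keep:
            break  # nothing passed the screening
        train_set = train_set + [(pool[i], 1) for i in sorted(keep)]
        pool = [x for i, x in enumerate(pool) if i not in keep]
        model = fit(train_set)
        current = loss(model, train_set)
        if best - current < tol:
            break  # loss converged or no longer improves
        best = current
    return model, train_set

# Toy stand-ins: each sample is one number, the model is a threshold.
def fit(data):
    pos = [x for x, y in data if y == 1]
    neg = [x for x, y in data if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0

def score(threshold, x):
    return 1.0 / (1.0 + math.exp(-10.0 * (x - threshold)))

def loss(threshold, data):
    total = 0.0
    for x, y in data:
        p = min(max(score(threshold, x), 1e-12), 1.0 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(data)

def select(scores):
    return [i for i, s in enumerate(scores) if s > 0.95]

labeled = [(1.0, 1), (0.9, 1), (0.0, 0), (0.1, 0)]
model, final_set = self_train(labeled, [0.95, 1.2, 0.05, 0.5],
                              fit, score, loss, select)
```

In this toy run the two unlabeled values scored above 0.95 are added as positives and the remaining two are left unlabeled.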

The existing solution most similar to the present invention is the self-training model in semi-supervised learning. Semi-supervised learning is a form of learning that uses both labeled and unlabeled data for training. In real scenarios it is very difficult to obtain a large amount of high-quality labeled data. In the medical field in particular, labeling requires the participation of professionals, and the resources required to label 100 samples are of a different order from those required to label tens of millions. Semi-supervised learning aims to train a model with strong generalization capability from a small amount of expert-labeled data combined with a large amount of unlabeled data, thereby solving this practical problem. The method relies mainly on the high accuracy of the BPPV eye shake detection model and the universality of eye shake signals. Through the stages of model training, model prediction, and model updating, it replaces the original scheme of learning from fully labeled data with a combination of manual expert labeling and model self-training labeling, thereby reducing cost.

In conclusion, the method trains a baseline model from a small amount of expert-labeled data together with data enhancement, and then performs self-training semi-supervised learning, replacing the scheme of full expert labeling followed by model training and saving cost. In addition, the pseudo label screening strategy used in the semi-supervised learning is designed around the actual symptoms and data distribution.

In the industrial image labeling field, practitioners need to obtain labeled data for specific business scenarios, and some scenarios require the manual participation of experts. However, labeling data of a single, uniform form by hand takes a great deal of effort and is inefficient. For the prediction of BPPV eye shake signals, there is no need to label a massive number of patients. Instead, model updating and the labeling of new samples can be completed through data enhancement and automatic labeling, which saves a large amount of human resources, and high-accuracy model scoring can reduce the errors introduced by manual labeling.

The present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents and are included in the scope of the present invention.

Claims

1. A method for labeling BPPV eye shake signals is characterized by comprising the following steps:

1) expert labeling of a small number of samples: hospital experts label a selected subset of high-quality video samples according to clinical diagnostic experience;

2) training a deep learning model: based on the expert-labeled data, data processing schemes such as data enhancement and positive sample expansion are used to expand the sample size until it is sufficient, and a basic deep learning model is trained to serve as the Teacher Model;

3) prediction on unlabeled samples: the unlabeled data is used as a test set, and the Teacher Model from step 2) predicts on it to produce the corresponding pseudo labels;

4) pseudo label screening: samples whose pseudo labels are negative or low-confidence positive (confidence <0.95) are discarded; isolated positive samples are discarded; among consecutive positive samples, those with high confidence (>0.95) are marked as positive and used as new training samples.

2. The method for labeling BPPV eye shake signals according to claim 1, further comprising, after step 4), a model updating step: the positive samples obtained in step 4) are combined with the training set data from step 2), data enhancement and positive/negative sample rebalancing are performed, a new training set and a new validation set are split off, a new model is trained on the new data set, and the iteration is repeated until the loss function converges or no longer improves.

3. The method for labeling BPPV eye shake signals according to claim 1 or 2, characterized in that: in step 2), the sample size is expanded by means including but not limited to data inversion, noise addition, and the SMOTE sample construction algorithm.

4. The method for labeling BPPV eye shake signals according to claim 1 or 2, characterized in that: in step 3), when an eye shake signal prediction is needed for a new sample, the eye shake curve is cut with a rolling window into sub-samples 400 frames long during the prediction stage, and the model predicts on each sub-sample.

5. The method for labeling BPPV eye shake signals according to claim 4, characterized in that: in step 3), the sub-sample is divided into a plurality of sub-sample segments, and the predicted pseudo label of each sub-sample segment is represented by 1 or 0, where 1 indicates that an eye shake is present in the segment and 0 indicates that no eye shake was detected in the segment.