CN113657449A - Traditional Chinese medicine tongue picture greasy classification method containing noise labeling data - Google Patents
Info
- Publication number: CN113657449A (application CN202110797875.9A)
- Authority: CN (China)
- Prior art keywords: data, tongue, samples, classification, model
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F18/214—Pattern recognition: generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06F18/241—Pattern recognition: classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06N3/045—Neural networks: combinations of networks
- G06N3/08—Neural networks: learning methods
Abstract
A traditional Chinese medicine (TCM) tongue-image greasy classification method for noisily labeled data, in the field of computer vision. Several machine-learning classification models evaluate and update the label confidence of a noisily labeled greasy tongue-image data set; updating the label quality of the labeled data set and updating the classification network parameters are placed in one iterative loop, and an active-learning idea is introduced, giving a novel method in which the labeling quality of greasy TCM tongue-image samples and the classification model improve each other through interactive iteration. The method raises the confidence of labeled samples, so that high-quality labeled samples can be screened out. It can be applied to other tongue-image samples with noisy labels, effectively addresses the strong subjectivity and noisy labels of TCM tongue-image annotation, and generalizes well.
Description
Technical Field
The invention belongs to the field of computer vision, and particularly relates to a traditional Chinese medicine tongue picture greasy classification method containing noise labeling data.
Background
Tongue diagnosis is an examination method that assesses the physiological functions and pathological conditions of the human body by observing changes in the tongue manifestation, and is one of the most widely used of the four diagnostic methods of traditional Chinese medicine. Analysis of tongue-manifestation features is the core of objectifying tongue diagnosis; these features reflect pathological changes, such as cold, heat, deficiency and excess, of the qi, blood and internal organs. In tongue analysis, different tongue characteristics are examined as required, such as coating color, coating thickness and moisture, tongue texture, prickles, cracks, and tongue morphology (tooth marks, deviation, plumpness or thinness). The greasy characteristic reflects the texture of the tongue coating, mainly changes in its granularity and density, and is generally divided into 3 classes: non-greasy, curdy, and greasy. The extracted greasy feature is used mainly by physicians to diagnose disease and is one of the key links of treatment based on syndrome differentiation in traditional Chinese medicine.
Tongue diagnosis is usually performed by visual inspection and experience-based differentiation; compared with Western medicine it lacks quantitative, objective measurement and analysis. Influenced by subjective factors such as a physician's knowledge and experience, the result varies from person to person, is highly subjective and poorly repeatable, which directly affects the standardization of treatment based on syndrome differentiation. Extracting and classifying tongue-image features with computer algorithms is therefore of great significance for objectifying tongue diagnosis.
Labeling greasy-feature tongue-image samples in TCM is costly, so labeled greasy samples are few, and unlabeled samples must be used to distinguish the classes effectively. The greasy classes are somewhat similar to one another, and physician labeling is subjective, so inconsistent, noisily labeled samples exist. Extracting high-quality labeled samples is the prerequisite for efficient training of the classification model.
Deep learning effectively extracts deep image features and learns the intrinsic regularities and deep feature representations of sample data, achieving higher classification accuracy and stronger model generalization. Deep models, however, usually require a large amount of labeled data for supervised training, so the noisy, limited labeled tongue-image samples become an important factor limiting model accuracy and generalization.
Therefore, according to the characteristics of the greasy tongue feature, the invention combines active learning and semi-supervised learning to provide an automatic TCM tongue-image greasy classification method for noisily labeled data. Several classification networks are used as base networks to extract deep tongue-image features and predict the noisy labeled data; prediction consistency is counted, and samples with large information content are selected for manual re-labeling, achieving data cleaning and updating the high-quality labeled samples. Unlabeled samples are then introduced and their implicit information is mined, so that semi-supervised learning effectively assists the training of the final classifier and improves classification performance.
Disclosure of Invention
The invention provides a traditional Chinese medicine tongue picture greasy classification method containing noise labeling data, which comprises the following steps:
1) construction of a tongue greasy dataset
The data set used by the invention is acquired with a TCM tongue-image instrument, and the greasy classes are calibrated by professional physicians. Whether a single tongue image is greasy is judged mainly from the central image block of the tongue body. Constructing the training and test sets requires three steps: tongue region extraction, tongue center-block extraction, and sliding-window processing of the tongue image blocks:
step 1.1: extracting a tongue body area from an original image to be used as a basic tongue image;
step 1.2: extracting a central block of the tongue body to be used as a representative block of greasy characteristics of the tongue body;
step 1.3: carrying out slide block processing on the extracted tongue picture blocks to obtain tongue picture blocks for constructing a large-batch tongue picture block data set;
2) sample class prediction and consistency statistics based on multi-classification model
Noisily labeled samples show varying degrees of consistency across the predictions of multiple models. Hard-to-classify samples with large information content are screened out and manually re-labeled, the labeled data set is updated, and the confidence of the data labels is improved.
Step 2.1: training a plurality of classification models by using noisy labeled tongue picture greasy data;
step 2.2: and respectively adopting a plurality of classification models to carry out class prediction of labeled data, and estimating the confidence coefficient of the data label according to the multi-model prediction result. Samples were classified into the following 3 classes:
class 1: for samples with consistent multi-model prediction results and the same labeling type, classifying the samples into samples with consistent labeling, and directly using the samples with high label reliability for optimization training of subsequent classification models;
class 2: classifying the samples with consistent prediction results but inconsistent labeling results into suspected labeling error samples; submitting manual work to check the labeling result;
class 3: and for the samples with inconsistent multi-model prediction results, calculating the confidence degree of the label according to the specific prediction result condition and the relation with the labeled type. Performing empowerment application or manual calibration on the sample according to the confidence coefficient;
step 2.3: updating annotated samples
The 3 classes of samples are processed accordingly and the labeled data set is updated. Class-1 samples are used directly for training and testing; class-2 samples are added to the training and test sets after manual re-labeling and verification; class-3 samples are removed, manually re-labeled, or weighted, according to their proportion and the training mode;
3) classification model training
Step 3.1: dividing a training set and a testing set by using the updated labeled sample; and carrying out network training on the multi-classification model. The retraining model can be adopted or the optimization training can be carried out on the existing network model;
step 3.2: training a plurality of classification models until the models are converged, namely the test classification accuracy does not change greatly any more, and recording the test accuracy;
step 3.3: repeating the steps 2) and 3), and performing a new round of label prediction and consistency statistics, data label updating and network model training on the labeled sample;
step 3.4: and stopping iteration when the proportion of the consistency label samples in the labeled data is obviously improved and the integral classification performance of the network is not improved any more, namely the accuracy of the record is not changed any more. Selecting a classifier with optimal performance to realize a tongue picture greasy classification network according to the requirements of system precision and complexity;
4) optionally, the multi-model label-prediction scheme of the invention can also predict labels for unlabeled data: samples with consistent predictions are directly given the predicted pseudo label and added to the training set; samples with poorly consistent predictions are selected for manual labeling (active learning); samples of intermediate confidence are applied flexibly. The technique thus conveniently supports cleaning of noisily labeled data, pseudo-label prediction for semi-supervised learning, selection of hard samples for active learning, and the like.
A traditional Chinese medicine tongue picture greasy classification method containing noise labeling data comprises the following steps:
1) construction of a tongue greasy dataset
The first step is as follows: extracting a tongue body area from an original image to be used as a basic tongue image; the tongue body is segmented by adopting the existing traditional Chinese medicine tongue picture segmentation method, and the extracted tongue pictures are unified in standard size;
the second step is that: extracting a central block of the tongue body to be used as a representative block of greasy characteristics of the tongue body; the central part of each segmented tongue body reserves the characteristic of greasy tongue quality, so that the central tongue image block of each tongue image is extracted, a small region filtering method is adopted, a minimum area threshold value is set, the threshold value is set to be 100 multiplied by 100 pixels in an experiment, whether the area of each region is smaller than the threshold value or not is judged by acquiring all communication regions in the image, the region with the area smaller than the threshold value is filtered, the region equal to or larger than the threshold value is reserved, the maximum value and the minimum value of the pixels of a target region are acquired, the target region is determined, the position of a central point is positioned, a cutting region is set, and the central block of the tongue body is obtained by sliding;
the third step: constructing a tongue picture block training data set; carrying out sliding block processing on the extracted tongue picture blocks, setting the size of the obtained target tongue picture block, and setting a uniform step length to slide to obtain the tongue picture block;
2) multi-model integrated predictive statistics
The first step is as follows: training a plurality of classification models by using the first batch of noisy labeled tongue picture greasy data; convolutional network InceptitionV 3_ v1, MobileNet _ v1, ResNet50 and Effect Net b4 network structure training are adopted, 4 classification model training is adopted to train network model parameters by adopting an SGD gradient descent algorithm, the range of learning rate is set to be 0.00001-1, the range of momentum is set to be 0.5-0.99, and the range of decay coefficient of decay is set to be 1 e-9-1 e-2;
the second step is that: selecting Inception V3, MobileNet _ v1 and ResNet50 network models as a basic network, and uniformly predicting all labeled samples in the first batch according to a plurality of model integrated prediction modes, wherein as shown in formula (1), a represents the confidence coefficient of the samples, the quantity of the models with consistent prediction is set to be K, the quantity of the models with consistent prediction is set to be K, Z is a prediction label, and Z is an original label, if 3 models with consistent prediction and the same type as the original labeled samples are classified into a label set, the confidence coefficient is set to be 1, the label reliability is high, and the label reliability is directly used for subsequent cyclic training; if the prediction result is consistent with the original labeling type, setting the confidence coefficient as 1/2, classifying the samples into suspected labeling error samples, introducing an active learning idea, and submitting manual labeling result verification; if the 3 models are not consistent in prediction, according to the specific prediction result condition and the relation with the labeled type: setting a threshold value to be 1/2, and judging whether the sample is consistent with the original label or not when the K/K ratio is larger than the threshold value, namely, most models in the models are predicted to be consistent, and only a few models are inconsistent; if the confidence degrees are consistent, the confidence degrees are set to be 1, and the annotation sets are classified; if the two are inconsistent, the confidence coefficient is set to 1/2, and label verification is carried out manually; when the K/K ratio is equal to or smaller than the threshold value, namely the predictions of a few models are consistent, the predictions of most models are inconsistent, and the confidence coefficient is set to be 0, the models are classified into unmarked data sets; when the 
K/K ratio is 0, namely all models are predicted to be inconsistent, representing a high-probability noise label sample, and if the confidence coefficient is set to be 0, classifying the high-probability noise label sample into an unlabeled data set; updating the labeled sample set after all labeling is finished to obtain a denoising and labeling sample set X participated manually;
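The confidence rules of formula (1) can be sketched as a small routine; the function name and the handling of ties among predictions are assumptions for illustration:

```python
from collections import Counter

def label_confidence(predictions, original_label, threshold=0.5):
    """Confidence a of formula (1) from K model predictions (illustrative).

    1   -> trusted label, kept in the labelled set
    0.5 -> suspected labelling error, sent for manual verification
    0   -> weak or no agreement, sample moved to the unlabelled pool
    """
    K = len(predictions)
    # k = size of the largest group of models that agree on one label
    majority_label, k = Counter(predictions).most_common(1)[0]
    if k / K > threshold:                 # a clear majority of models agree
        return 1.0 if majority_label == original_label else 0.5
    return 0.0                            # minority or no agreement
```

For K = 3, a 2-vs-1 split counts as a majority; three distinct predictions give k/K = 1/3 and confidence 0.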
the third step: dividing the denoising labeling sample into a training test set, performing integrated prediction according to a new trained model, keeping parameters unchanged, and recording the classification precision of the optimal model; updating the labeled sample set again according to the criterion in the second step, and iterating and circulating until the classification precision of the sample test set is kept unchanged;
3) semi-supervised model training
The first step is as follows: on the basis of data screening in 2), taking a denoising labeling set as a labeled sample set, taking the rest unlabeled tongue picture data as an unlabeled sample set, applying data enhancement to both labeled and unlabeled samples, taking batch as an enhanced data batch, performing data enhancement on labeled data of one batch once, performing data enhancement on unlabeled data of one batch for M times, setting the experiment M to be 2, selecting a standard enhancing mode for data enhancement to be random cutting and horizontal turning respectively, randomly cutting into different sizes and aspect ratios, and then scaling the cut image into a target size to obtain an enhanced labeled data set X and unlabeled data U;
the second step is that: comparing results of the K (K is 3) classification model in the step 2), adopting a training Efficient Net b4 classification model as a semi-supervised learning classifier, carrying out prediction classification on M enhancement results of single data by the classifier, and finally determining a final pseudo label by the classification result by adopting an average method, wherein a specific calculation method is shown as a formula (2);
q̄_b = (1/M) · Σ_{m=1}^{M} P(y | u_{b,m}; θ) #(2)
where u_b denotes an unlabeled sample, y the predicted label, m the augmentation index, M the total number of augmentations, θ the model parameters, and P the probability the classifier assigns to class y; q̄_b is the final averaged prediction probability. During pseudo-label generation a minimum-entropy operation is introduced into the model by a "sharpen" function, given in formula (3):
Sharpen(Q, T)_i = Q_i^{1/T} / Σ_{j=1}^{L} Q_j^{1/T} #(3)
where Q is the averaged prediction probability of the augmented data, corresponding to q̄_b in formula (2); T is a temperature hyper-parameter (set to 0.5 in the experiments) that adjusts the entropy of the class distribution; L is the total number of classes; and i and j index single classes. As T tends to 0 the output approaches a one-hot distribution, so lowering T encourages the model to make low-entropy predictions on the augmented unlabeled data and improves prediction accuracy. Through these steps the pseudo labels of the augmented unlabeled data are obtained;
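Formulas (2) and (3) can be sketched together: average the classifier's class probabilities over the M augmented copies, then sharpen with temperature T; the function names are illustrative:

```python
import numpy as np

def sharpen(q, T=0.5):
    # formula (3): temperature sharpening; as T -> 0 this approaches one-hot
    q = q ** (1.0 / T)
    return q / q.sum()

def pseudo_label(prob_per_aug, T=0.5):
    # formula (2): average predictions over the M augmentations of one sample,
    # then apply the minimum-entropy "sharpen" of formula (3)
    q_bar = np.asarray(prob_per_aug).mean(axis=0)
    return sharpen(q_bar, T)
```

With T = 0.5 sharpening squares each averaged probability before renormalizing, pushing mass toward the dominant class.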
the third step: in order to relatively enhance the generalization capability of the model, a mix up method is used for data mixing, a marked data set is recorded as X, an unmarked data set is recorded as U, the mixing operation is realized by mixing the marked data set X and the unmarked data set U to form a mixed data set W, then the former X data mix up in the marked data X and the W data set forms a new labeled data set as X ', the latter U data set mix up in the unmarked data and the W data set forms a new unlabeled data set as U', and a mixed marked and unmarked greasy tongue data set is constructed;
the fourth step: for the labeled data set X', calculating the cross entropy loss between the label and the model prediction, wherein the calculation method is specifically shown as formula (4),
wherein X 'represents all marked samples, | X' | is the quantity value of all marked samples, X is a single marked sample, P is the distribution of real sample labels, P is the probability that a sample X is predicted to be a y category by a classifier, and the H function is the cross entropy between the distribution P and P,
for the unmarked data set U', calculating the mean square loss function between the model prediction and the pseudo label, subtracting the estimation value from the target value and then squaring to obtain the error, as shown in formula (5),
wherein, U 'represents all the unlabeled samples, | U' | is the quantity value of all the unlabeled samples, U is a single unlabeled sample, q is the label distribution of the real sample, P is the probability that the sample U is predicted to be the y category by the classifier, theta is the model parameter, and L is the total amount of the categories;
the total Loss term is shown in formula (6), Loss represents the total Loss, which is the sum of two losses,
Loss=Lx+λLu#(6)
λ is a hyper-parameter, set to 100; introducing the loss constraint network on the constructed new data set, continuously carrying out iterative training on the network, setting the learning rate to be 0.00001-1, setting the momentum to be 0.5-0.99, and setting the decay coefficient to be 1 e-9-1 e-2; and training until the loss of the classification model is converged and the greasy classification indexes are unchanged, and selecting the model with the highest verification result index as the optimal model for the classification network training.
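Formulas (4)-(6) combine into one total loss; a sketch over batches of row-wise class-probability arrays, with illustrative names:

```python
import numpy as np

def semi_supervised_loss(p_true, p_pred_x, q_pseudo, p_pred_u, lam=100.0):
    """Total loss of formulas (4)-(6).

    p_true / p_pred_x : labels and predictions on the mixed labelled set X'
    q_pseudo / p_pred_u : pseudo labels and predictions on the mixed set U'
    """
    eps = 1e-12                                                     # log floor
    # formula (4): mean cross entropy over X'
    l_x = -np.mean(np.sum(p_true * np.log(p_pred_x + eps), axis=1))
    # formula (5): squared error over U', normalised by the class count L
    num_classes = q_pseudo.shape[1]
    l_u = np.mean(np.sum((q_pseudo - p_pred_u) ** 2, axis=1)) / num_classes
    # formula (6): weighted sum with trade-off lambda
    return l_x + lam * l_u
```

When predictions match labels and pseudo labels exactly, both terms vanish and the total loss is zero.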
The technical scheme of the application provides a traditional Chinese medicine tongue picture greasy classification method containing noise labeling data, and is used for computer-assisted tongue picture greasy characteristic classification. The method carries out label confidence evaluation and updating processing on a noisy labeled tongue picture greasy data set through a plurality of machine learning classification models, label quality updating of the label data set and updating of classification network model parameters are put into an iteration process, and an active learning idea is introduced, so that the novel method for interactive iteration improvement of the labeling quality and the classification model of the traditional Chinese medicine tongue picture greasy sample is provided.
Compared with single-model decision making in deep learning, the TCM tongue-image greasy classification method for noisily labeled data provided by the invention improves the confidence of labeled samples, so that high-quality labeled samples can be screened out. It can be applied to other tongue-image problems with noisy labels, effectively addresses the strong subjectivity and noisy labels of TCM tongue-image annotation, and generalizes well. The proposed scheme of evaluating and updating the labeling quality together with updating and training the classification model can be extended to other similar TCM tongue-image feature classification tasks and applied widely to other classification tasks with noisily labeled data.
The above description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the technical solutions of the present invention and the objects, features, and advantages thereof more clearly understandable.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 illustrates a flowchart of the generation of the greasy tongue-image block data set for classification provided by the present invention;
FIG. 2 illustrates a schematic diagram of data cleansing provided by the present invention;
FIG. 3 is a diagram illustrating a semi-supervised classification model framework for tongue greasiness according to the present invention;
the specific implementation mode is as follows:
exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
In the method, all labeled data are predicted by several models; data with inconsistent predictions are screened out with the active-learning idea and manually re-labeled, producing a high-quality denoised labeled data set and an optimized network training scheme. The feature information of the unlabeled data is fully used, so that the model learns more highly discriminative greasy features and the network clearly distinguishes the 3 classes: non-greasy, curdy and greasy. The invention provides an automatic TCM tongue-image greasy classification method for noisily labeled data, shown in FIGS. 2 and 3. Several trained models predict the noisy labeled data; labeled and unlabeled data sets are separated by prediction consistency and updated after manual labeling. For the remaining unlabeled data, data augmentation is applied and a minimum-entropy operation is introduced when averaging the predictions on the augmented data. Finally the labeled and unlabeled data are fused, the data are updated, and the network is trained iteratively, strengthening its robustness; the resulting semi-supervised model performs the class test on greasy tongue-image samples.
The following description of the embodiments of the present invention is provided in conjunction with the accompanying drawings:
the data set adopted by the invention is acquired by a tongue manifestation apparatus of traditional Chinese medicine, and the category of the data set is calibrated by a doctor of traditional Chinese medicine. The tongue manifestation sample used is 1423 full tongue images collected in TCM department of TCM, and the tongue nature is classified into non-greasy, greasy and greasy 3 types. Dividing the training set into marked samples and unmarked samples, wherein the marked samples are 618 tongue pictures, and preprocessing the tongue pictures into 2472 tongue picture blocks comprising non-greasy 1512 blocks according to the tongue picture data set manufacturing process in the step 1); a rotting 432 block; and (4) 528 blocks of putty, wherein 10000 blocks of samples are not marked. The test set has 405 tongue images corresponding to 405 tongue body center blocks, wherein the tongue body center blocks are non-greasy 249 blocks, greasy 9 blocks and greasy 147 blocks.
1) Construction of a tongue greasy dataset
The first step is as follows: the tongue body area is extracted from the original image and used as a base tongue image. Because the tongue fur area needs to be focused on when the specific tongue fur information is analyzed, the tongue image collected by the traditional Chinese medicine tongue picture instrument usually contains the human face and lip interference information, the tongue body part needs to be extracted from the original image, the tongue body is segmented by adopting the traditional Chinese medicine tongue picture segmentation method, and the extracted tongue image has the unified standard size of 3456 x 3456 pixels.
The second step is that: and extracting a center block of the tongue body to be used as a representative block of the greasy characteristic of the tongue body. The central part of each segmented tongue body keeps the characteristic of greasy tongue, so that the central tongue picture block is extracted aiming at each tongue image. Firstly, a small region filtering method is adopted, a minimum area threshold value is set, the threshold value is set to be 100 multiplied by 100 pixels through experiments, whether the area of each region is smaller than the threshold value or not is judged by obtaining all connected regions in an image, the region with the area smaller than the threshold value is filtered, and the region equal to or larger than the threshold value is reserved, so that an interference region in the background of the image is distinguished through the method, and deviation in positioning of a target region is prevented. Obtaining the maximum value and the minimum value of the pixels of the target area, determining the target area, then positioning the position of the central point, setting the size of the cutting area to be 512 multiplied by 512 pixels, and obtaining the central block of the tongue body.
The third step: and constructing a large batch of tongue picture block training data sets. And performing sliding block processing on the extracted tongue picture blocks, setting the size of the obtained target tongue picture block to be 224 multiplied by 224 pixels, setting the step length to be 65, and obtaining 2472 tongue picture blocks in a sliding mode to meet the characteristic of large demand of a training set in the deep learning network.
2) Multi-model integrated prediction statistics
The first step is as follows: and training a plurality of classification models by using the first batch of noisy labeled tongue picture greasy data. The convolutional network commonly used in the classification field at present has an inclusion V3, MobileNet and ResNet series model, and the effective Net network structure is also applied according to the advantage that the convolutional network can keep higher classification precision and has short time. Convolutional network InceptitionV 3_ v1, MobileNet _ v1, ResNet50 and effective Net b4 network structure training are adopted, 4-classification model training is adopted to train network model parameters by adopting an SGD gradient descent algorithm, the initial learning rate in the experiment is set to be 0.001, momentum is set to be 0.9, and the decay coefficient is set to be 1 e-6.
The second step is that: selecting Inception V3, MobileNet _ v1 and ResNet50 network models as a basic network, and uniformly predicting all labeled samples in the first batch according to a plurality of model integrated prediction modes, wherein as shown in formula (1), a represents the confidence coefficient of the samples, the quantity of the models with consistent prediction is set to be K, the quantity of the models with consistent prediction is set to be K, Z is a prediction label, and Z is an original label, if 3 models with consistent prediction and the same type as the original labeled samples are classified into a label set, the confidence coefficient is set to be 1, the label reliability is high, and the label reliability is directly used for subsequent cyclic training; if the prediction result is consistent with the original labeling type, setting the confidence coefficient as 1/2, classifying the samples into suspected labeling error samples, introducing an active learning idea, and submitting manual labeling result verification; if the 3 models are not consistent in prediction, according to the specific prediction result condition and the relation with the labeled type: the threshold value is set to 1/2, when the K/K ratio is larger than the threshold value, namely when most models in the models are predicted to be consistent and only a few models are inconsistent, whether the sample is consistent with the original label or not is judged; if the confidence degrees are consistent, the confidence degrees are set to be 1, and the annotation sets are classified; if the two are inconsistent, the confidence coefficient is set to 1/2, and label verification is carried out manually; when the K/K ratio is equal to or smaller than a threshold value, namely, the predictions of a few models are consistent, the predictions of most models are inconsistent, and the confidence coefficient is set to be 0, the models are classified into unmarked data sets; and when the 
K/K ratio is 0, namely all models are predicted to be inconsistent, representing a high-probability noise label sample, and if the confidence coefficient is set to be 0, classifying the data set as an unlabeled data set. And updating the labeled sample set after all labeling is finished to obtain the artificially involved denoising and labeling sample set X.
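The confidence rule above can be sketched as a plain function. This is a reading of the patent's rule, not its code; the returned action strings ("keep", "manual-check", "unlabel") are hypothetical names for the three dispositions.

```python
def label_confidence(preds, Z, threshold=0.5):
    """Confidence a for one sample under the multi-model rule.
    preds: list of K model predictions; Z: original (possibly noisy)
    label. Returns (a, action)."""
    K = len(preds)
    # k: size of the largest group of models agreeing on one class
    majority = max(set(preds), key=preds.count)
    k = preds.count(majority)
    if k == K:                       # all models predict consistently
        if majority == Z:
            return 1.0, "keep"       # agrees with original label
        return 0.5, "manual-check"   # suspected labeling error
    if k / K > threshold:            # most models agree, a few differ
        return (1.0, "keep") if majority == Z else (0.5, "manual-check")
    # few or no models agree: low confidence, move to unlabeled pool
    return 0.0, "unlabel"
```

For K = 3 this reproduces the cases in the text: 3/3 agreement matching the label keeps the sample, 3/3 agreement contradicting it triggers manual verification, 2/3 agreement is judged against the label, and 1/3 (no majority beyond a single model) is demoted to unlabeled data.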
The third step: dividing the denoising labeling sample into a training test set, performing integrated prediction according to a new trained model, keeping parameters unchanged, and recording the classification precision of the optimal model; and updating the labeled sample set according to the criterion in the second step, and iterating and circulating until the classification precision of the sample test set is basically kept unchanged.
3) Semi-supervised model training
The first step is as follows: on the basis of data screening in 2), taking a denoising labeling set as a labeled sample set, taking the rest unlabeled tongue picture data as an unlabeled sample set, applying data enhancement to both labeled and unlabeled samples, setting a batch with a specific size, setting the batch of the experiment to be 4, performing data enhancement on labeled data of one batch once, performing data enhancement on unlabeled data of one batch M times, setting the experiment to be 2, selecting a standard enhancing mode for data enhancement to be random cutting and horizontal overturning respectively, randomly cutting to different sizes and aspect ratios, then scaling the cut image to be 380 × 380 pixels, and obtaining the enhanced labeled data set X and unlabeled data U.
The second step is that: comparing results of the three classification models in the step 2), the effective Net b4 network model has a good greasy classification effect, the experiment adopts the training effective Net b4 classification model as a semi-supervised learning classifier, the classifier performs predictive classification on M enhanced results of single data, the classification result finally adopts an averaging method to determine a final pseudo label, and a specific calculation method is shown in a formula (2).
where u_b denotes an unlabeled sample, û_{b,m} its m-th enhanced version, y the predicted label, m the augmentation index, M the total number of augmentations, θ the model parameters, P the probability that the classifier predicts class y, and q̄_b the resulting average predicted probability value. In the pseudo-label generation process, to make the model's prediction confidence on the unlabeled samples as high as possible, the model introduces a minimum-entropy operation, specifically a "sharpen" function, as shown in formula (3).
Sharpen(Q, T)_i = Q_i^{1/T} / Σ_{j=1}^{L} Q_j^{1/T} #(3)
where Q is the average predicted probability value of the enhanced data, corresponding to q̄_b in formula (2); T is a temperature hyper-parameter (set to 0.5 by experiment) that adjusts the classification entropy; L is the total number of categories; and i and j index single categories. As T tends to 0 the output approaches a one-hot distribution, so reducing T encourages the model to make low-entropy predictions on the enhanced unlabeled data and improves prediction accuracy. Through this step, the pseudo labels of the enhanced unlabeled data are obtained.
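The averaging and sharpening steps can be sketched in NumPy, assuming the per-augmentation class probabilities have already been produced by the classifier:

```python
import numpy as np

def sharpen(q, T=0.5):
    """Minimum-entropy 'sharpen': raise each class probability to
    1/T and renormalise, pushing the distribution toward one-hot."""
    q = np.asarray(q, dtype=float) ** (1.0 / T)
    return q / q.sum()

def pseudo_label(probs_per_aug, T=0.5):
    """Average the classifier's predictions over the M augmented
    copies of one sample, then sharpen the averaged distribution."""
    return sharpen(np.mean(probs_per_aug, axis=0), T)
```

With T = 0.5 the sharpening squares each averaged probability before renormalising, so the dominant class's probability grows and the prediction entropy drops.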
The third step: in order to relatively enhance the generalization capability of the model, a mix up method is used for data mixing, a labeled data set is recorded as X, an unlabeled data set is recorded as U, the mixing operation is specifically that the labeled data set X and the unlabeled data set U are mixed to form a mixed data set W, then the previous X data mix up in the labeled data set X and the W data set form a new labeled data set as X ', the unlabeled data set and the next U data set mix up in the W data set form a new unlabeled data set as U', and a mixed labeled and unlabeled greasy tongue data set is constructed.
The fourth step: for the labeled data set X', calculating the cross entropy loss between the label and the model prediction, wherein the calculation method is specifically shown as formula (4),
where X' denotes all labeled samples, |X'| is their number, x is a single labeled sample, p is the distribution of the true sample label, P is the probability that the classifier predicts sample x as class y, and the H function is the cross entropy between the distributions p and P.
for the unmarked data set U', calculating the mean square loss function between the model prediction and the pseudo label, subtracting the estimation value from the target value and then squaring to obtain the error, as shown in formula (5),
where U' denotes all unlabeled samples, |U'| is their number, u is a single unlabeled sample, q is the pseudo-label distribution of the sample, P is the probability that the classifier predicts sample u as class y, θ are the model parameters, and L is the total number of categories.
The total Loss term is shown in formula (6), Loss represents the total Loss, which is the sum of two losses,
Loss=Lx+λLu#(6)
λ is a hyper-parameter, set to 100 in the experiment. The above loss constrains the network on the newly constructed data set, and the network is trained iteratively: the initial learning rate is set to 0.001, the decay coefficient to 0.999 and the number of epochs (iterations) to 100. Training continues until the loss of the classification model converges and the greasy classification index no longer changes appreciably, and the model with the highest verification-result index is selected as the optimal model of the classification network training.
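The two loss terms and their weighted sum can be sketched as follows, assuming the label distributions and model predictions are already available as probability vectors (the batching and model calls are omitted):

```python
import numpy as np

def cross_entropy(p, pred, eps=1e-12):
    """H(p, pred) for one sample, per formula (4)."""
    return -np.sum(p * np.log(pred + eps))

def total_loss(X_batch, U_batch, lam=100.0):
    """Loss = Lx + lam * Lu, per formula (6).
    X_batch: (true label distribution p, model prediction P) pairs;
    U_batch: (pseudo-label q, model prediction P) pairs."""
    Lx = np.mean([cross_entropy(p, pr) for p, pr in X_batch])
    # Formula (5): squared error between pseudo-label and prediction,
    # averaged over the L classes and over the unlabeled batch.
    Lu = np.mean([np.mean((q - pr) ** 2) for q, pr in U_batch])
    return Lx + lam * Lu
```

With λ = 100 the unlabeled consistency term dominates whenever the predictions on U' drift from their pseudo labels, which is what constrains the network during the iterative training described above.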
618 labeled greasy tongue images are selected from the database as the initial labeled training samples and preprocessed into 2472 tongue image blocks; the unlabeled training samples number 10000 tongue image blocks, and the test samples number 405 tongue images. With the same training strategy, fully supervised training is first performed on the 4 models Inception V3, MobileNet, ResNet and EfficientNet-B0, giving average classification precisions of 82.47%, 89.62%, 90.86% and 91.11%, with the EfficientNet model performing best. Training is then carried out on the EfficientNet B0–B4 network architectures, and the best classification result of 93.09% is obtained with the B4 model. Introducing the 10000 unlabeled samples for semi-supervised learning raises the classification result to 94.50%, an accuracy improvement of 1.41%. This preliminarily verifies that the features of the introduced unlabeled data increase sample diversity, allowing the classifier to extract more effective information, and that exploiting unlabeled data gives the model stronger generalization capability.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the invention and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that: that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim.
Claims (2)
1. A traditional Chinese medicine tongue picture greasy classification method containing noise labeling data is characterized by comprising the following steps:
1) construction of a tongue greasy dataset
The applied data set is acquired by a traditional Chinese medicine tongue manifestation apparatus, and the greasy categories are calibrated; to construct the training and test data sets, three steps are required: tongue region extraction, tongue center block extraction and sliding-window processing of the tongue image blocks:
step 1.1: extracting a tongue body area from an original image to be used as a basic tongue image;
step 1.2: extracting a central block of the tongue body to be used as a representative block of greasy characteristics of the tongue body;
step 1.3: carrying out slide block processing on the extracted tongue picture blocks to obtain tongue picture blocks for constructing a tongue picture block data set;
2) sample class prediction and consistency statistics based on multi-classification model
The noise-containing labeled samples are predicted by a plurality of models, the results are unified, the labeled data set is updated, and the confidence of the data labeling is improved;
step 2.1: training a plurality of classification models by using noisy labeled tongue picture greasy data;
step 2.2: respectively adopting a plurality of classification models to carry out class prediction of labeled data, and estimating the confidence coefficient of the data label according to the multi-model prediction result; samples were classified into the following 3 classes:
class 1: for samples with consistent multi-model prediction results and the same labeling type, classifying the samples into samples with consistent labeling, and directly using the samples with high label reliability for optimization training of subsequent classification models;
class 2: classifying the samples with consistent prediction results but inconsistent labeling results into suspected labeling error samples; submitting manual work to check the labeling result;
class 3: for samples with inconsistent multi-model prediction results, calculating the confidence coefficient of the label according to the specific prediction result condition and the relation with the labeled category; performing empowerment application or manual calibration on the sample according to the confidence coefficient;
step 2.3: updating annotated samples
Respectively carrying out corresponding processing on the 3 classes of samples and updating the labeled data set; the samples of class 1 are used directly for training and testing; the samples of class 2 are added to the training and test sets after manual labeling and verification; for class 3, removal, manual calibration or weighting is applied according to the sample proportion and the training mode;
3) classification model training
Step 3.1: dividing a training set and a testing set by using the updated labeled sample; performing network training on the multi-classification model; adopting a retraining model or carrying out optimization training on an existing network model;
step 3.2: training a plurality of classification models until the models are converged, and recording the test accuracy;
step 3.3: repeating the steps 2) and 3), and performing a new round of label prediction and consistency statistics, data label updating and network model training on the labeled sample;
step 3.4: when the proportion of the consistency label samples in the labeled data is improved and the integral classification performance of the network is not improved any more, namely the accuracy of the record is not changed any more, stopping iteration; and selecting a classifier with optimal performance to realize tongue picture greasy classification.
2. The method of claim 1, wherein:
1) construction of a tongue greasy dataset
The first step is as follows: extracting a tongue body area from an original image to be used as a basic tongue image; the tongue body is segmented by adopting the existing traditional Chinese medicine tongue picture segmentation method, and the extracted tongue pictures are unified in standard size;
the second step is that: extracting a central block of the tongue body to be used as a representative block of greasy characteristics of the tongue body; the central part of each segmented tongue body reserves the characteristic of greasy tongue quality, so that the central tongue image block of each tongue image is extracted, a small region filtering method is adopted, a minimum area threshold value is set, the threshold value is set to be 100 multiplied by 100 pixels in an experiment, whether the area of each region is smaller than the threshold value or not is judged by acquiring all communication regions in the image, the region with the area smaller than the threshold value is filtered, the region equal to or larger than the threshold value is reserved, the maximum value and the minimum value of the pixels of a target region are acquired, the target region is determined, the position of a central point is positioned, a cutting region is set, and the central block of the tongue body is obtained by sliding;
the third step: constructing a tongue picture block training data set; carrying out sliding block processing on the extracted tongue picture blocks, setting the size of the obtained target tongue picture block, and setting a uniform step length to slide to obtain the tongue picture block;
2) multi-model integrated predictive statistics
The first step is as follows: training a plurality of classification models by using the first batch of noisy labeled tongue picture greasy data; convolutional network InceptitionV 3_ v1, MobileNet _ v1, ResNet50 and Effect Net b4 network structure training are adopted, 4 classification model training is adopted to train network model parameters by adopting an SGD gradient descent algorithm, the range of learning rate is set to be 0.00001-1, the range of momentum is set to be 0.5-0.99, and the range of decay coefficient of decay is set to be 1 e-9-1 e-2;
the second step is that: selecting Inception V3, MobileNet _ v1 and ResNet50 network models as a basic network, and uniformly predicting all labeled samples in the first batch according to a plurality of model integrated prediction modes, wherein as shown in formula (1), a represents the confidence coefficient of the samples, the quantity of the models with consistent prediction is set to be K, the quantity of the models with consistent prediction is set to be K, Z is a prediction label, and Z is an original label, if 3 models with consistent prediction and the same type as the original labeled samples are classified into a label set, the confidence coefficient is set to be 1, the label reliability is high, and the label reliability is directly used for subsequent cyclic training; if the prediction result is consistent with the original labeling type, setting the confidence coefficient as 1/2, classifying the samples into suspected labeling error samples, introducing an active learning idea, and submitting manual labeling result verification; if the 3 models are not consistent in prediction, according to the specific prediction result condition and the relation with the labeled type: setting a threshold value to be 1/2, and judging whether the sample is consistent with the original label or not when the K/K ratio is larger than the threshold value, namely, most models in the models are predicted to be consistent, and only a few models are inconsistent; if the confidence degrees are consistent, the confidence degrees are set to be 1, and the annotation sets are classified; if the two are inconsistent, the confidence coefficient is set to 1/2, and label verification is carried out manually; when the K/K ratio is equal to or smaller than the threshold value, namely the predictions of a few models are consistent, the predictions of most models are inconsistent, and the confidence coefficient is set to be 0, the models are classified into unmarked data sets; when the 
K/K ratio is 0, namely all models are predicted to be inconsistent, representing a high-probability noise label sample, and if the confidence coefficient is set to be 0, classifying the high-probability noise label sample into an unlabeled data set; updating the labeled sample set after all labeling is finished to obtain a denoising and labeling sample set X participated manually;
the third step: dividing the denoising labeling sample into a training test set, performing integrated prediction according to a new trained model, keeping parameters unchanged, and recording the classification precision of the optimal model; updating the labeled sample set again according to the criterion in the second step, and iterating and circulating until the classification precision of the sample test set is kept unchanged;
3) semi-supervised model training
The first step is as follows: on the basis of data screening in 2), taking a denoising labeling set as a labeled sample set, taking the rest unlabeled tongue picture data as an unlabeled sample set, applying data enhancement to both labeled and unlabeled samples, taking batch as an enhanced data batch, performing data enhancement on labeled data of one batch once, performing data enhancement on unlabeled data of one batch for M times, setting the experiment M to be 2, selecting a standard enhancing mode for data enhancement to be random cutting and horizontal turning respectively, randomly cutting into different sizes and aspect ratios, and then scaling the cut image into a target size to obtain an enhanced labeled data set X and unlabeled data U;
the second step is that: comparing results of the K (K is 3) classification model in the step 2), adopting a training Efficient Net b4 classification model as a semi-supervised learning classifier, carrying out prediction classification on M enhancement results of single data by the classifier, and finally determining a final pseudo label by the classification result by adopting an average method, wherein a specific calculation method is shown as a formula (2);
where u_b denotes an unlabeled sample, y the predicted label, m the augmentation index, M the total number of augmentations, θ the model parameters, P the probability that the classifier predicts class y, and q̄_b the resulting average predicted probability value; in the pseudo-label generation process a minimum-entropy operation is introduced into the model, specifically the "sharpen" function shown in formula (3);
where Q is the average predicted probability value of the enhanced data, corresponding to q̄_b in formula (2); T is a temperature hyper-parameter (set to 0.5 in the experiment) that adjusts the classification entropy; L is the total number of categories; i and j index single categories; as T tends to 0 the output approaches a one-hot distribution, so reducing T encourages the model to make low-entropy predictions on the enhanced unlabeled data and improves prediction accuracy; through these steps, the pseudo labels of the enhanced unlabeled data are obtained;
the third step: in order to relatively enhance the generalization capability of the model, a mix up method is used for data mixing, a marked data set is recorded as X, an unmarked data set is recorded as U, the mixing operation is realized by mixing the marked data set X and the unmarked data set U to form a mixed data set W, then the former X data mix up in the marked data X and the W data set forms a new labeled data set as X ', the latter U data set mix up in the unmarked data and the W data set forms a new unlabeled data set as U', and a mixed marked and unmarked greasy tongue data set is constructed;
the fourth step: for the labeled data set X', calculating the cross entropy loss between the label and the model prediction, wherein the calculation method is specifically shown as formula (4),
where X' denotes all labeled samples, |X'| is their number, x is a single labeled sample, p is the distribution of the true sample label, P is the probability that the classifier predicts sample x as class y, and the H function is the cross entropy between the distributions p and P,
for the unmarked data set U', calculating the mean square loss function between the model prediction and the pseudo label, subtracting the estimation value from the target value and then squaring to obtain the error, as shown in formula (5),
where U' denotes all unlabeled samples, |U'| is their number, u is a single unlabeled sample, q is the pseudo-label distribution of the sample, P is the probability that the classifier predicts sample u as class y, θ are the model parameters, and L is the total number of categories;
the total Loss term is shown in formula (6), Loss represents the total Loss, which is the sum of two losses,
Loss=Lx+λLu#(6)
λ is a hyper-parameter, set to 100; the loss constrains the network on the newly constructed data set, the network is trained iteratively, the learning rate is set to 0.00001–1, the momentum to 0.5–0.99 and the decay coefficient to 1e-9–1e-2; training continues until the loss of the classification model converges and the greasy classification indexes no longer change, and the model with the highest verification-result index is selected as the optimal model of the classification network training.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110797875.9A CN113657449A (en) | 2021-07-15 | 2021-07-15 | Traditional Chinese medicine tongue picture greasy classification method containing noise labeling data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113657449A true CN113657449A (en) | 2021-11-16 |
Family
ID=78477399
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110797875.9A Pending CN113657449A (en) | 2021-07-15 | 2021-07-15 | Traditional Chinese medicine tongue picture greasy classification method containing noise labeling data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113657449A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115082461A (en) * | 2022-08-19 | 2022-09-20 | 成都中医药大学 | Edge calculation-based pre-judgment filtering method and device |
CN115564960A (en) * | 2022-11-10 | 2023-01-03 | 南京码极客科技有限公司 | Network image label denoising method combining sample selection and label correction |
CN115953392A (en) * | 2023-03-09 | 2023-04-11 | 四川博瑞客信息技术有限公司 | Tongue body coating quality evaluation method based on artificial intelligence |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107977671A (en) * | 2017-10-27 | 2018-05-01 | 浙江工业大学 | A kind of tongue picture sorting technique based on multitask convolutional neural networks |
CN109766916A (en) * | 2018-12-17 | 2019-05-17 | 新绎健康科技有限公司 | A kind of method and system determining tongue picture sample database based on deep learning model |
CN112633301A (en) * | 2021-01-14 | 2021-04-09 | 北京工业大学 | Traditional Chinese medicine tongue image greasy feature classification method based on depth metric learning |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112101451B (en) | | Breast cancer histopathological type classification method based on generative adversarial network screening of image patches |
CN113657449A (en) | | Traditional Chinese medicine tongue picture greasy classification method containing noise labeling data |
CN113256641B (en) | | Skin lesion image segmentation method based on deep learning |
CN109191472A (en) | | Thymocyte image segmentation method based on an improved U-Net network |
CN109086799A (en) | | Crop leaf disease recognition method based on an improved convolutional neural network model AlexNet |
CN110633758A (en) | | Method for detecting and locating cancer regions under small-sample or sample-imbalance conditions |
CN109670510A (en) | | Deep learning-based gastroscopic biopsy pathological data screening system and method |
Pan et al. | | Mitosis detection techniques in H&E stained breast cancer pathological images: A comprehensive review |
CN111242233B (en) | | Alzheimer's disease classification method based on a fusion network |
CN110097974A (en) | | Nasopharyngeal carcinoma distant metastasis prediction system based on a deep learning algorithm |
CN112733961A (en) | | Method and system for classifying diabetic retinopathy based on an attention mechanism |
CN110111895A (en) | | Method for building a nasopharyngeal carcinoma distant metastasis prediction model |
CN108305253A (en) | | Whole-slide pathology diagnosis method based on multi-magnification deep learning |
CN113706434B (en) | | Deep learning-based post-processing method for contrast-enhanced chest CT images |
CN115546605A (en) | | Training method and device based on image labeling and a segmentation model |
CN114037011B (en) | | Automatic identification and cleaning method for noisily labeled tongue color samples in traditional Chinese medicine |
CN111738997A (en) | | Method for calculating COVID-19 lesion area ratio based on deep learning |
CN112183557A (en) | | MSI prediction model construction method based on gastric cancer histopathology image texture features |
CN117541844B (en) | | Weakly supervised whole-slide histopathology image analysis method based on hypergraph learning |
CN113420793A (en) | | Gastric signet ring cell carcinoma classification method based on the improved convolutional neural network ResNeSt50 |
CN115496720A (en) | | Gastrointestinal cancer pathological image segmentation method based on a ViT model and related equipment |
Li et al. | | Speckle noise removal based on structural convolutional neural networks with feature fusion for medical image |
CN117036288A (en) | | Tumor subtype diagnosis method for whole-slide pathological images |
CN115564997A (en) | | Integrated pathological slide scanning and analysis method and system based on reinforcement learning |
CN116883341A (en) | | Automatic segmentation method for liver tumors in CT images based on deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |