CN116543414A - Tongue color classification and tongue redness and purple quantification method based on multi-model fusion


Info

Publication number
CN116543414A
Authority
CN
China
Prior art keywords
tongue
picture
model
color
red
Prior art date
Legal status
Pending
Application number
CN202310305163.XA
Other languages
Chinese (zh)
Inventor
董秀
陈虹
郭朋
Current Assignee
Guangdong Xinhuangpu Joint Innovation Institute Of Traditional Chinese Medicine
Original Assignee
Guangdong Xinhuangpu Joint Innovation Institute Of Traditional Chinese Medicine
Priority date
Filing date
Publication date
Application filed by Guangdong Xinhuangpu Joint Innovation Institute Of Traditional Chinese Medicine
Priority to CN202310305163.XA
Publication of CN116543414A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/56 - Extraction of image or video features relating to colour
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • G06V10/763 - Non-hierarchical techniques, e.g. based on statistics of modelling distributions
    • G06V10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/817 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level by voting
    • G06V10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 - ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A - TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 - Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 - Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Public Health (AREA)
  • Primary Health Care (AREA)
  • Epidemiology (AREA)
  • Data Mining & Analysis (AREA)
  • Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a tongue color classification and tongue redness and purple quantification method based on multi-model fusion, which comprises the following steps: generating a tongue color dataset; generating a multi-model fused tongue color classification model for classifying the tongue color dataset; quantifying the tongue redness of the tongue pictures of the four categories of pale tongue, pale red tongue, red tongue and deep-red tongue, and quantifying the tongue purple of the tongue pictures of the two categories of dark tongue and blue-purple tongue; and outputting the category and the quantified value of each tongue picture according to the classification and quantification results. The tongue color classification method abandons the existing single-model approach and improves the accuracy and generalization capability of the model through multi-model fusion. The invention further provides a tongue redness and purple quantification method based on the classification method, providing an electronic, data-based basis for tongue diagnosis in traditional Chinese medicine.

Description

Tongue color classification and tongue redness and purple quantification method based on multi-model fusion
Technical Field
The invention relates to the technical field of image processing, in particular to a multi-model fusion tongue color classification and tongue redness and purple quantification method.
Background
Tongue diagnosis mainly relies on the doctor's visual observation and experience to judge the symptoms. In tongue diagnosis, the physiological and pathological changes of the body are observed through changes in the tongue. The tongue manifestation includes the tongue body and the tongue coating, and abnormal changes of the tongue coating and the tongue body form pathological tongue manifestations, which are an important basis for diagnosis by doctors of traditional Chinese medicine. However, the judgment of the tongue manifestation is affected by human factors, and this influence is uncontrollable.
Nowadays, with the continuous development of deep learning, deep learning theory has been applied to tongue color classification in traditional Chinese medicine. Tongue color can be divided into six categories, namely pale tongue, pale red tongue, red tongue, deep-red tongue, dark tongue and blue-purple tongue. In the prior art, a single model is used to classify tongue color; however, this classification approach has certain defects, such as insufficient generalization capability and low accuracy of the model. The prior art provides neither a tongue color classification method based on multi-model fusion nor a tongue redness and purple quantification method based on such a classification method.
Disclosure of Invention
In order to solve one or more of the above problems, the invention provides a tongue color classification method based on multi-model fusion and a tongue redness and purple quantification method, in which deep learning theory is applied to tongue color classification in traditional Chinese medicine to construct a tongue color classification model. The multi-model fusion approach makes up for the defects of a single model, improves the generalization capability and accuracy of the model, realizes the digitization and automation of tongue diagnosis, and improves the efficiency of the doctor's tongue diagnosis. Meanwhile, the invention provides a tongue redness and purple quantification method based on the classification method, which provides a data reference for tongue diagnosis in traditional Chinese medicine.
According to a first aspect of the present invention, there is provided a multi-model fused tongue color classification and tongue redness and purple quantification method, comprising:
generating a tongue color dataset; the tongue color dataset at least comprises tongue picture data labeled with six categories, namely: pale tongue, pale red tongue, red tongue, deep-red tongue, dark tongue and blue-purple tongue;
generating a multi-model fused tongue color classification model, wherein the multi-model fused tongue color classification model is used for classifying the tongue color dataset;
quantifying the tongue redness of the tongue pictures of the four categories of pale tongue, pale red tongue, red tongue and deep-red tongue, and quantifying the tongue purple of the tongue pictures of the two categories of dark tongue and blue-purple tongue;
and outputting the category and the quantized value of the tongue picture according to the classification result and the quantized result.
In some embodiments, the generating the tongue color dataset specifically includes: collecting tongue picture data marked with six categories; and screening the collected tongue picture data, and taking the screened data as a tongue color data set.
In some embodiments, the generating a multi-model fused tongue color classification model for classifying the tongue color dataset comprises:
constructing an abnormality detection model, and distinguishing tongue pictures of the normal category from tongue pictures of the abnormal category according to the abnormality detection model; the tongue pictures of the normal category are tongue pictures of the four categories of pale tongue, pale red tongue, red tongue and pale dark tongue, and the tongue pictures of the abnormal category are tongue pictures of the two categories of deep-red tongue and blue-purple tongue;
after distinguishing the tongue pictures of the normal category from the tongue pictures of the abnormal category, inputting the tongue pictures of the normal category into a four-classification model and inputting the tongue pictures of the abnormal category into a red-violet classification model; the four-classification model classifies the four categories of tongue pictures, namely pale tongue, pale red tongue, red tongue and pale dark tongue, and then inputs the tongue pictures of the pale dark tongue category among the four categories into the red-violet classification model;
the red-violet classification model classifies the tongue pictures of the abnormal category and the tongue pictures of the pale dark tongue category according to the redness and the purple degree of the tongue pictures, respectively.
In some embodiments, the constructing the abnormality detection model and distinguishing the tongue picture of the normal category from the tongue picture of the abnormal category according to the abnormality detection model further includes:
dividing a tongue color data set into a training set and a testing set;
carrying out data enhancement on the tongue pictures of the training set; the data enhancement modes at least comprise one of the following: rotating the tongue picture, flipping it horizontally, adding random noise, blurring, or converting the picture mode;
extracting features of the data-enhanced tongue picture data, and distinguishing normal class tongue pictures from abnormal class tongue pictures according to feature extraction results;
selecting a loss function for optimizing the anomaly detection model; wherein the loss function is an improved Pseudo-Huber loss function;
selecting an optimizer for minimizing the loss function according to the result of the loss function; the optimizer is an Adam optimizer, and its learning rate is set to 1e-2;
training to generate the abnormality detection model; the data used for training are the tongue picture data after data enhancement, and the different data enhancement modes change the details of the pictures and make the scenes more diversified.
In some embodiments, the inputting of the normal-category tongue pictures into the four-classification model includes constructing a four-classification model for classifying tongue pictures of the four categories of pale tongue, pale red tongue, red tongue and pale dark tongue;
the building of the four-classification model specifically comprises the following steps:
constructing a four-classification model data set;
data enhancement is carried out on the data required to be used in the used data set; the data enhancement method at least comprises one of the following steps: rotation, horizontal overturning, random noise addition, fuzzy processing and normalization;
extracting features from the data-enhanced tongue picture data, and distinguishing the four categories of tongue pictures, namely pale tongue, pale red tongue, red tongue and pale dark tongue, according to the feature extraction results;
creating a loss function according to the four-classification model; wherein the loss function is an LDAM loss function;
selecting an optimizer according to the result of the loss function; the optimizer is an adaptive moment estimation (Adam) optimizer, and its learning rate is set to 4e-5;
training to generate the four-classification model; the number of training epochs is set to 500, wherein the LDAM loss function is used to optimize the model for the first 200 epochs, and the cross entropy loss function is used to optimize the model for the last 300 epochs.
In some embodiments, the inputting the tongue picture of the anomaly class into the red-violet classification model further comprises: constructing a red-violet classification model;
the construction of the red and violet classification model specifically comprises the following steps:
constructing a red-violet classification model data set;
carrying out data enhancement on the red-violet classification model data set;
extracting features of the data-enhanced tongue picture data, and distinguishing a reddish tongue picture from a purple tongue picture according to the feature extraction result;
creating a loss function according to the red-violet classification model; wherein the loss function is a cross entropy loss function;
selecting an optimizer according to the result of the loss function; the optimizer is an adaptive moment estimation (Adam) optimizer, and its learning rate is set to 2e-3;
training is carried out, and a red and violet classification model is generated.
In some embodiments, quantifying the tongue redness of the tongue pictures of the four categories of pale tongue, pale red tongue, red tongue and deep-red tongue, and quantifying the tongue purple of the tongue pictures of the two categories of dark tongue and blue-purple tongue, further comprises:
separating tongue coating from tongue picture;
sampling the tongue picture with separated tongue coating;
and carrying out tongue color quantification according to the sampling result.
In some embodiments, the separating of the tongue coating from the tongue picture further comprises:
converting the tongue picture from the RGB color model to a Lab color model;
dividing the tongue body region and the tongue coating region in the tongue picture by the Kmeans clustering method according to the a value of all pixel points in the tongue picture, removing the tongue coating region and keeping only the tongue body region.
In some specific embodiments, the tongue coating separation further comprises: eliminating the pixel points of the tongue coating region in the tongue picture, and carrying out subsequent processing on the coating-separated tongue picture to remove the discrete pixel points and small regions of the tongue coating area.
In some embodiments, the quantifying tongue purple according to the sampling result further comprises:
drawing a least square fitting straight line;
dividing quantization levels according to the fitted straight line;
and quantifying the tongue color of the tongue picture according to the sampling result.
Tongue purple is quantified according to the sampling result as follows: the Euclidean distance is used to calculate how all the sampling points vote across the quantization levels of the matching standard of the corresponding category; the level with the most votes is kept, and the average value within that level is calculated as the quantization value of the picture.
According to a second aspect of the present invention, a multi-model fused tongue color classification and tongue redness and purple quantification apparatus at least includes:
a first generation unit for generating a tongue color dataset;
the second generation unit is used for generating a tongue color classification model and distinguishing six classes of tongue picture according to the tongue color classification model;
the quantization unit is used for quantifying the tongue redness of the tongue pictures of the four categories of pale tongue, pale red tongue, red tongue and deep-red tongue, and for quantifying the tongue purple of the tongue pictures of the two categories of dark tongue and blue-purple tongue;
and the output unit outputs the category and the quantized value of the tongue picture according to the classification result and the quantized result.
According to a third aspect of the invention, a computer device comprises a memory storing a computer program and a processor implementing the steps of the above method when executing the computer program.
The beneficial effects are that:
according to the tongue color classification model, the tongue image is classified according to the tongue color by utilizing the tongue color classification model, so that the tongue of the tongue image is automatically and rapidly classified, the subjective influence is eliminated by the classification result, the tongue color classification model is more accurate, and the defects of strong subjectivity and low efficiency in the manual judgment process are overcome. The invention provides a method for quantifying tongue redness and tongue purple based on the model, which provides a more specific and reliable diagnosis basis for tongue diagnosis.
Drawings
FIG. 1 is a flow chart of a multi-model fusion tongue color classification and tongue redness and purple quantification method according to an embodiment of the present invention;
FIG. 2 is a tongue diagram of one embodiment of the present invention;
FIG. 3 is a schematic diagram of FIG. 2 after being subjected to a CenterCrop function;
FIG. 4 is a schematic diagram of FIG. 2 after being processed by a ColorJitter function;
FIG. 5 is a graph of the fitted straight lines of the four categories of pale tongue, pale red tongue, red tongue and deep-red tongue according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of the redness matching standard visual color bars of the four categories of FIG. 5;
FIG. 7 is a graph showing the visual distribution of tongue redness bars drawn from the four categories of fitted straight lines of FIG. 5 and the redness matching criteria visual bars of FIG. 6;
FIG. 8 is a diagram of a purple matching standard visual color bar according to an embodiment of the present invention;
FIG. 9 is a schematic diagram showing the visual distribution of the purple tongue color bars according to an embodiment of the present invention;
FIG. 10 is a schematic tongue diagram of another embodiment of the present invention;
FIG. 11 is a tongue with the tongue coating removed of FIG. 10;
FIG. 12 is a schematic illustration of the post-treatment operation of FIG. 11;
FIG. 13 is a schematic diagram of sampling of FIG. 12;
fig. 14 is a schematic diagram of redness quantization according to another embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawings.
In light of the foregoing, the following is a specific experimental procedure, but the scope of protection of the present patent is not limited to this implementation procedure, and the flow chart is shown in fig. 1. The specific implementation process is as follows:
step 1: a tongue color dataset is generated.
The tongue picture data set used in this example is shown in table 1.
TABLE 1
Tongue color category | Pale tongue | Pale red tongue | Red tongue | Deep-red tongue | Dark tongue | Blue-purple tongue | Total
Number of tongue pictures | 108 | 489 | 364 | 10 | 310 | 21 | 1302
Step 1.1: tongue picture data labeled with six categories are collected. Specifically, 4400 Zhang She pictorial views are collected first, wherein each collected tongue pictorial view is marked by 30 doctors according to six categories of tongue colors. The six categories are respectively: pale tongue, reddish tongue, deep red tongue, pale dark tongue, and blue-purple tongue. Wherein, as shown in fig. 2, the tongue picture is collected.
Step 1.2: the collected tongue picture data is screened. Screening the collected tongue picture data further comprises: and (5) retaining effective tongue picture data and eliminating ineffective tongue picture data. The same picture is marked by 30 doctors, and when the marking result of the picture is higher than the total number of the same category, namely forty-five percent of 30 doctors are regarded as effective tongue picture data, and less than forty-five percent are regarded as ineffective tongue picture data. The retained tongue picture can be shown in table 1 as a tongue color dataset.
Step 2: and generating a multi-model fused tongue color classification model, wherein the multi-model fused tongue color classification model is used for classifying the tongue color data set.
Step 2.1: and constructing an abnormality detection model, and distinguishing a tongue picture of a normal category from a tongue picture of an abnormal category according to the abnormality detection model.
Among the six categories, the tongue pictures of pale tongue, pale red tongue, red tongue and pale dark tongue are of the normal category, and the tongue pictures of deep-red tongue and blue-purple tongue are of the abnormal category. This model is used to separate the tongue pictures of the abnormal category from those of the normal category.
Step 2.1.1: dividing tongue color dataset into training set and test set
The data of the normal category are obtained by adding the numbers of tongue pictures of the pale tongue, pale red tongue, red tongue and pale dark tongue, giving 1271 tongue pictures. The 1271 tongue pictures are divided into a training set and a test set at a ratio of seven to three: the training set has 890 tongue pictures and the test set has 381 tongue pictures.
Step 2.1.2: data enhancement of tongue picture of training set
Specifically, the data enhancement mode includes, but is not limited to, rotation, horizontal overturn, random noise addition, fuzzy processing, picture mode conversion and the like, so that the number of pictures of the training set is increased, and detail changes and scenes of the pictures are more diversified.
Preferably, the existing data enhancement methods are optimized: the CenterCrop function and the ColorJitter function are removed. As shown in fig. 3, the picture cropped by the CenterCrop function may no longer contain the tongue pixels, so it is unsuitable for tongue color detection.
As shown in fig. 4, the ColorJitter function adjusts parameters such as brightness, contrast, saturation and hue, which changes the actual color of the picture and therefore does not meet the requirements of tongue color detection. The CenterCrop function and the ColorJitter function are therefore both eliminated.
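A hedged sketch of such an augmentation pipeline is given below (parameter values are illustrative assumptions): rotation, horizontal flipping, blurring and random noise are kept, while CenterCrop and ColorJitter are deliberately left out for the reasons above.

```python
import torch
from torchvision import transforms

class AddGaussianNoise:
    """Adds small random noise to a tensor image in [0, 1]."""
    def __init__(self, std=0.01):
        self.std = std
    def __call__(self, img):
        return (img + torch.randn_like(img) * self.std).clamp(0.0, 1.0)

train_transform = transforms.Compose([
    transforms.RandomRotation(degrees=10),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.GaussianBlur(kernel_size=3),
    transforms.ToTensor(),
    AddGaussianNoise(std=0.01),
])
```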
The objects classified by each model are not identical, so the data enhancement modes also differ between models.
Step 2.1.3: feature extraction is carried out on the data of the tongue picture after data enhancement, and the tongue picture of the normal category and the tongue picture of the abnormal category are distinguished according to the feature extraction result
Specifically, feature extraction is performed on the tongue picture to produce a feature map, and the tongue picture is then assigned to the normal or the abnormal category according to whether its anomaly score exceeds a boundary threshold. Feature extraction on the training set pictures uses fully convolutional data description (FCDD). In FCDD, the tongue picture is mapped by a fully convolutional network (FCN) to a feature map of fixed size; each element of the output feature map is influenced by a corresponding region of the input image, so the anomaly scores on the feature map can be mapped back to positions in the original picture and spatial information is preserved. A boundary threshold on the anomaly score separates normal from abnormal: the tongue picture is passed through the FCN, its anomaly score is calculated, and the picture is regarded as abnormal if the score exceeds the boundary threshold and as normal if the score is below it. In this embodiment, the boundary threshold is set to 0.5.
Step 2.1.4: creating a loss function for optimizing an anomaly detection model based on the anomaly detection model
Specifically, the FCDD loss applies the Pseudo-Huber function to the output matrix of the fully convolutional network:

$$A(X) = \sqrt{\phi(X;W)^{2} + 1} - 1$$

The loss function is:

$$\mathcal{L} = \frac{1}{n}\sum_{i=1}^{n}\left[(1-y_i)\,\frac{1}{u\cdot v}\,\lVert A(X_i)\rVert_{1} - y_i\,\log\!\left(1-\exp\!\left(-\frac{1}{u\cdot v}\lVert A(X_i)\rVert_{1}\right)\right)\right]$$

wherein X is the input data, W is the weight parameter of the network, \(\phi(X;W)\) is the network output, A(X) is the output matrix with size u × v, X_i is the i-th input sample, y_i is the label value of the i-th input sample, n is the total number of samples, and \(\lVert A(X_i)\rVert_{1}\) is the anomaly score of the i-th input sample. When y_i = 0 the sample is normal; when y_i = 1 the sample is abnormal. Here \(\lVert A(X)\rVert_{1}\) is the sum of all elements of A(X), which are all non-negative. The loss maximizes \(\lVert A(X)\rVert_{1}\) for abnormal samples and minimizes it for normal samples, so \(\lVert A(X)\rVert_{1}\) is used as the anomaly score.
The above function is used as the loss function. A loss function measures the degree of difference between the predicted value and the actual value of the model, and training continuously reduces this difference.
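A minimal PyTorch sketch of this loss and of the anomaly score is shown below; it follows the published FCDD formulation and assumes the fully convolutional network outputs an (n, u, v) map, so it is an illustration rather than the patent's own code.

```python
import torch

def fcdd_loss(feature_map: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """feature_map: (n, u, v) output phi(X; W) of the fully convolutional network.
    labels: (n,) tensor with 0 = normal sample, 1 = abnormal sample."""
    labels = labels.float()
    a = torch.sqrt(feature_map ** 2 + 1.0) - 1.0            # pseudo-Huber map A(X)
    score = a.flatten(1).sum(dim=1) / a[0].numel()           # (1/(u*v)) * ||A(X_i)||_1
    normal_term = (1.0 - labels) * score                     # pull normal scores down
    anomaly_term = -labels * torch.log1p(-torch.exp(-score) + 1e-12)  # push anomalies up
    return (normal_term + anomaly_term).mean()

def anomaly_score(feature_map: torch.Tensor) -> torch.Tensor:
    """Per-sample anomaly score compared against the 0.5 boundary threshold at inference."""
    a = torch.sqrt(feature_map ** 2 + 1.0) - 1.0
    return a.flatten(1).sum(dim=1) / a[0].numel()
```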
Step 2.1.5: creating an optimizer from loss function results
Specifically, an Adam optimizer is used and the learning rate is set to 1e-2. This step minimizes the loss function by optimizing the model parameters during training: the feedback of the loss function penalizes the model so that it gradually improves.
Step 2.1.6: setting training parameters according to the model optimized by the optimizer, and starting training
Specifically, the number of training epochs is set to 500. Optionally, the number of epochs can be adjusted according to the category. During training, the weight file with the highest f1 score value, calculated as (precision × recall)/(precision + recall), is kept: the score is calculated for each epoch, and if it is higher than the historical best, the best score is replaced and the current weight file is saved.
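A small sketch of this checkpointing rule follows; the macro averaging of precision and recall and the file name are assumptions for illustration.

```python
import torch
from sklearn.metrics import precision_score, recall_score

def epoch_score(y_true, y_pred) -> float:
    """(precision x recall) / (precision + recall), macro-averaged over the classes."""
    p = precision_score(y_true, y_pred, average="macro", zero_division=0)
    r = recall_score(y_true, y_pred, average="macro", zero_division=0)
    return p * r / (p + r) if (p + r) > 0 else 0.0

def save_if_best(model, score, best_score, path="best_weights.pth"):
    """Replace the stored weight file whenever the current score beats the best one."""
    if score > best_score:
        torch.save(model.state_dict(), path)
        return score
    return best_score

# e.g. epoch_score([0, 1, 1, 0], [0, 1, 0, 0]) is about 0.39
```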
Step 2.1.7: after training is completed, model reasoning is carried out
Specifically, the completion of training means that the construction of the abnormality detection model is completed at the same time. Then, new unlabeled tongue picture data is input into an anomaly detection model for reasoning. The model learns the characteristics of the abnormal pictures and the normal pictures in the training stage, and reads the weight files in the training stage in the model reasoning stage to perform reasoning prediction.
The test results use AUC as the evaluation index. AUC is an evaluation index for classification models: given a randomly chosen positive sample and a randomly chosen negative sample, with p1 the probability of predicting the positive sample as positive and p2 the probability of predicting the negative sample as positive, the AUC is the probability that p1 > p2. The AUC of the prior art method is 0.81 and the AUC of the present method is 0.96; the larger the AUC, the better the classification effect of the model. The improved method is therefore significantly better.
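The AUC can be estimated directly from the anomaly scores, for example with scikit-learn; the labels and scores below are illustrative values, not the patent's data.

```python
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 0, 1, 1]              # 0 = normal tongue picture, 1 = abnormal
scores = [0.1, 0.3, 0.8, 0.7, 0.9]    # anomaly scores from the detection model
print(roc_auc_score(y_true, scores))  # 0.833..., 1 of the 6 positive/negative pairs is misordered
```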
Step 2.2: classifying the tongue picture of the normal category and the tongue picture of the abnormal category respectively, inputting the tongue picture of the normal category into a four-classification model, and inputting the tongue picture of the abnormal category into a red-violet classification model.
Step 2.2.1: building a four-classification model to classify tongue images of four categories of pale tongue, red tongue and pale dark tongue
This model is used for distinguishing among pale tongue, pale red tongue, red tongue and pale dark tongue, with data volumes of 108, 489, 364 and 310 pictures respectively.
Step 2.2.1.1: construction of a four-classification model dataset
Specifically, the training set and the test set are divided at a ratio of 7:3: the training set has 890 pictures and the test set has 381 pictures.
Step 2.2.1.2: data enhancement of four-class model dataset
Specifically, the data enhancement method is one or a combination of rotation, horizontal overturn, random noise addition, blurring processing and normalization. The method comprises the step of amplifying the number of training set data divided in the previous step, so that the model learns more robust features, and the generalization capability of the model is effectively improved.
Step 2.2.1.3: extracting features of the data-enhanced tongue picture data, and distinguishing four types of tongue picture of pale tongue, red tongue and dark tongue according to the feature extraction result
Specifically, a DenseNet201 network is used as the feature extraction layer of the four-classification model, and on the output feature map each feature value is multiplied by the labeling threshold of the tongue picture corresponding to that feature map. The network mainly comprises three core layers: the DenseLayer, the most basic atomic unit of the whole model, which completes one basic feature extraction; the DenseBlock, the basic densely connected unit of the model; and the Transition layer, the transition unit between different dense blocks. The DenseNet201 network is an existing network; in the present application it is used for four-class classification, and the output dimension of its last fully connected layer is modified to 4.
The whole model is built by splicing these three layers with the classification layer. The labeling threshold is obtained as follows: each picture mentioned above is labeled by 30 doctors, and the ratio of the number of doctors labeling a given category to the total number of labeling doctors (30) is used as the threshold for that category.
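A hedged sketch of this backbone setup (assumed torchvision usage, not the patent's own code): the stock DenseNet201 is reused and only the output dimension of its classifier is changed to 4; the learning rate follows the text.

```python
import torch.nn as nn
import torch.optim as optim
from torchvision import models

model = models.densenet201(weights=None)                        # existing DenseNet201 backbone
model.classifier = nn.Linear(model.classifier.in_features, 4)   # 4 tongue color categories

optimizer = optim.Adam(model.parameters(), lr=4e-5)             # Adam, learning rate 4e-5
```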
Step 2.2.1.4: creating a loss function from a four-class model
Specifically, the loss function is created for the four categories of tongue pictures: pale tongue, pale red tongue, red tongue and pale dark tongue. In view of the large difference in the number of pictures across the four categories, i.e. the uneven distribution among categories, the LDAM loss function is introduced to prevent the long-tail condition. The long-tail condition means that the total numbers of pictures in different categories of the dataset differ greatly, i.e. the distribution among categories is uneven and the data are unbalanced. The LDAM loss function is based on the idea of the support vector machine (SVM): features of samples of the same category are relatively close in the feature space, while features of samples of different categories are relatively far apart. The essence of classification is then to find a boundary so that the feature points can be correctly divided into the corresponding classes.
The LDAM loss function is as follows:

$$\mathcal{L}_{\mathrm{LDAM}}(x,y) = -\log\frac{e^{\,z_y-\Delta_y}}{e^{\,z_y-\Delta_y}+\sum_{j\neq y}e^{\,z_j}}$$

wherein j ∈ {1, …, k}, y is the label, x is a sample with label y, z_y is the model's predicted output for the correct class y of sample x, z_j is the output value of the j-th class obtained for the same sample, and Δ_y is the class-dependent margin (in the original LDAM formulation, Δ_y = C / n_y^{1/4}, where n_y is the number of samples of class y and C is a constant).

For comparison, the cross entropy loss function is:

$$\mathcal{L}_{\mathrm{CE}}(x,y) = -\log\frac{e^{\,z_y}}{\sum_{j}e^{\,z_j}}$$

The LDAM loss replaces z_y with z_y − Δ_y while the outputs of the other classes remain unchanged. When the model output for the correct class is z_y + Δ_y, the LDAM loss value equals the cross entropy loss value at output z_y; that is, LDAM requires the output of the desired correct class to be higher by Δ_y than the cross entropy loss function does, so as to achieve the optimal classification effect.
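As a hedged illustration (not the patent's own code), the margin adjustment above can be implemented in PyTorch roughly as follows; the class counts are taken from Table 1, while the margin scaling and the omission of the logit rescaling used in the reference LDAM implementation are simplifying assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LDAMLoss(nn.Module):
    """Cross entropy on margin-adjusted logits: z_y is replaced by z_y - delta_y,
    with delta_y proportional to 1 / n_y^(1/4) (larger margins for rarer classes)."""
    def __init__(self, class_counts, max_margin=0.5):
        super().__init__()
        margins = 1.0 / torch.tensor(class_counts, dtype=torch.float32) ** 0.25
        self.margins = margins * (max_margin / margins.max())

    def forward(self, logits, targets):
        delta = self.margins.to(logits.device)[targets]           # delta_y per sample
        adjusted = logits.clone()
        adjusted[torch.arange(logits.size(0)), targets] -= delta  # z_y -> z_y - delta_y
        return F.cross_entropy(adjusted, targets)

# Class counts of the four categories (pale, pale red, red, pale dark) from Table 1.
ldam_criterion = LDAMLoss(class_counts=[108, 489, 364, 310])
ce_criterion = nn.CrossEntropyLoss()  # used for the last 300 of the 500 epochs
```

In the training schedule described below, ldam_criterion would be used for the first 200 epochs and ce_criterion for the remaining 300.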
Step 2.2.1.5: creating an optimizer from the results of a loss function
Specifically, the learning rate was set to 4e-5 using an adaptive moment estimation (Adam, adaptive Moment Estimation) optimizer. The application of the optimizer has the advantages that: not only can adapt to sparse gradients, but also can alleviate the problem of gradient oscillation.
Step 2.2.1.6: setting training parameters according to the model optimized by the optimizer, and starting training
Specifically, the number of training epochs is set to 500. For the first 200 epochs the model is optimized with the LDAM loss function; the cross entropy loss function is then used for the last 300 epochs. The first two hundred epochs reduce the influence of the unbalanced data on training, and the last three hundred epochs return to the normal loss, so that high-precision classification is achieved.
Step 2.2.1.7: after training is completed, reasoning is carried out on the model
The inference results show that using the DenseNet201 network as the feature extraction layer performs best compared with existing backbone networks such as ResNet152 and VGG19. In addition, comparing training with and without data enhancement shows that the model trained on the enhanced data achieves higher accuracy.
Step 2.2.1.8: Judging whether the current tongue picture belongs to the pale dark tongue category among the four categories; if so, the tongue picture of the pale dark tongue category is input into the red-violet classification model; if not, the tongue picture is input into step 3.1.
Specifically, the tongue picture of the dark tongue is marked as reddish tongue picture and purple tongue picture by doctor, and the number is 130 and 180 respectively.
Step 2.2.2: building a red-violet classification model according to the tongue image of the dark tongue and the tongue image of the abnormal class
The model is used for distinguishing whether the tongue color of the tongue picture is purple or red.
Step 2.2.2.1: construction of red and violet classification model dataset
Specifically, the numbers of deep-red tongue pictures and blue-purple tongue pictures passing through the abnormality detection model are 10 and 21, and the numbers of reddish and purplish tongue pictures within the pale dark tongue are 130 and 180. The reddish tongue pictures therefore total 140 and the purplish tongue pictures total 201. These data are divided into a training set and a test set at a ratio of 7:3 as the dataset of this model. The test set data in this embodiment are used to test the performance of the model after training is completed.
Step 2.2.2.2: data enhancement of red-violet classification model datasets
In particular, the data enhancement modes include, but are not limited to, rotation, horizontal flipping, normalization operations. The step can not only increase the picture data quantity, but also improve the generalization capability of the model.
Step 2.2.2.3, extracting features from the data-enhanced tongue picture data, and distinguishing reddish tongue picture from purple tongue picture according to the feature extraction result
Specifically, resNet18 network is used as a feature extraction layer of the red and violet classification model. The basic architecture of the network is ResNet. Where 18 represents the depth of the network, referring to 18 layers. The 18 layers comprise convolution layers and full connection layers, wherein the last 1 layer is the full connection layer, the first 17 layers are convolution layers, the convolution layer of the first layer is 7*7 in size, the other convolution layers are 3*3 in size, the number of channels of the convolution layers from the first layer to the fifth layer is 64, the number of channels from the sixth layer to the ninth layer is 128, the number of channels from the tenth layer to the tenth layer is 256, and the number of channels from the fourteenth layer to the seventeenth layer is 512. The ResNet18 network is an existing network, and is used for two classification in the present application, the output dimension of the last fully connected layer is modified to be 2.
Step 2.2.2.4 constructs a loss function based on the created red-violet classification model
Specifically, the numbers of reddish tongue pictures and purplish tongue pictures in this model are balanced, so the cross entropy loss (CrossEntropyLoss) function is selected as the loss function. The loss function represents the degree of gap between the predicted and the actual data. Optimizing the model according to the loss function works as follows: the value of the loss function indicates the performance of the model, and the model parameters are adjusted to reduce the loss, thereby improving the predictive ability and accuracy of the model.
For a single sample, assume the true distribution is y and the network output distribution is ŷ. If the total number of classes is n and i indexes the classes, the cross entropy loss function is calculated as:

$$\mathcal{L}_{\mathrm{CE}} = -\sum_{i=1}^{n} y_i \log \hat{y}_i$$
Step 2.2.2.5: creating optimizers from loss functions
Specifically, the learning rate was set to 2e-3 using an adaptive moment estimation (Adam, adaptive Moment Estimation) optimizer. The application of the optimizer has the advantages that: not only can adapt to sparse gradients, but also can alleviate the problem of gradient oscillation.
Step 2.2.2.6: setting training parameters according to the model optimized by the optimizer, and starting training
Specifically, the number of training epochs is 500. During training, the optimal weight file is saved. The criterion for the optimal weight is a high value of (precision × recall)/(precision + recall): this score is calculated for each epoch, and if the current score is higher than the stored one, the stored score is replaced and the current weight file is saved.
Step 2.2.2.7: after training is completed, reasoning is carried out on the model
The inference results show that red and purple have a high degree of distinguishability and the picture features are obvious and easy to classify, so a small network is selected for feature extraction, which achieves high accuracy in less time.
Step 2.3: abnormal tongue picture and dark tongue picture are classified according to redness and purple degree of tongue picture through red and purple classification model
This step classifies the reddish and purplish tongue pictures within the pale dark tongue, and also classifies the deep-red tongue pictures and blue-purple tongue pictures separated out by the abnormality detection model.
The step comprises: after classifying the reddish and purplish tongue pictures within the pale dark tongue, inputting the purplish tongue pictures into step 3.2 and inputting the reddish tongue pictures into step 4 without quantifying their tongue redness; and after classifying the deep-red tongue pictures and the blue-purple tongue pictures, inputting the deep-red tongue pictures into step 3.1 and the blue-purple tongue pictures into step 3.2.
Step 3: quantifying the redness of the tongue picture of the four categories of pale tongue, reddish tongue and dark tongue and quantifying the redness of the tongue picture of the two categories of dark tongue and dark purple tongue
Step 3.1: quantification of tongue redness of tongue images of four categories, pale tongue, reddish tongue, dark red tongue
Step 3.1.1: tongue picture with tongue coating separation
Specifically, the original image is first converted from the RGB color model to the Lab color model. The Lab color model is a color model established by the International Commission on Illumination: L represents lightness, a represents the green-red axis, and b represents the blue-yellow axis. The original image can be seen in fig. 10. The pixel points of the tongue picture are converted from RGB format to Lab format; then, according to the a value of all pixel points in the tongue picture, the tongue body region and the tongue coating region are divided by the Kmeans clustering method, the tongue coating region is removed, and only the tongue body region is kept. The k value of the clustering method is set to 2, which realizes the separation of tongue body and tongue coating. As shown in fig. 11, some tongue coating pixels still remain in the tongue picture after coating separation.
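A hedged sketch of this coating separation (not the patent's exact code): the picture is converted to Lab, all pixels are clustered on the a channel with k = 2, and the cluster with the higher mean a value is kept as the tongue body, under the assumptions that the input already contains only the segmented tongue and that the tongue body is redder than the coating.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

img = cv2.imread("tongue.png")                          # BGR image as loaded by OpenCV
lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
a = lab[:, :, 1].reshape(-1, 1).astype(np.float32)      # a channel (green-red axis)

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(a)
means = [a[labels == k].mean() for k in (0, 1)]
tongue_cluster = int(np.argmax(means))                  # redder cluster = tongue body

mask = (labels == tongue_cluster).reshape(img.shape[:2]).astype(np.uint8)
tongue_only = cv2.bitwise_and(img, img, mask=mask)      # coating pixels become black (0, 0, 0)
```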
Preferably, the tongue picture after coating separation is post-processed; the post-processing removes the discrete pixel points and small regions left by the tongue coating area. Specifically, the contours on the coating-separated tongue picture are found with the findContours function, contourArea calculates the area of each contour, any contour with an area smaller than 200 is determined to be an interference point, and it is replaced with the background color. As shown in fig. 12, after this post-processing the discrete pixel points of the tongue coating area are effectively removed, and step 3.1.2 can be carried out.
Step 3.1.2: Sampling the tongue picture with the tongue coating separated
Specifically, the coordinates (x, y) of the center point between the highest and the lowest point of the tongue surface are first calculated on the vertical line through the midpoint of the horizontal axis of the tongue picture segmented in step 3.1.1; the coating-separated tongue picture is then sampled with reference to fig. 12. Traversal is performed over (0:w, y), (0:w, y+1), (0:w, y+2) of fig. 12, where (0:w, y) refers to the coordinates (0, y), (1, y), (2, y), (3, y), …, (w, y). 2000 pixel values whose RGB value is not (0, 0, 0) are selected as sampling points; a pixel value of RGB (0, 0, 0) is black, i.e. a background pixel. As shown in fig. 13, the light areas on both sides of the tongue region are the sampling areas.
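A minimal sketch of the sampling rule under the assumptions that rows are traversed downward from the vertical center of the tongue until 2000 non-background pixels are found; the function name and the exact traversal extent are illustrative.

```python
import numpy as np

def sample_pixels(rgb_img: np.ndarray, n_samples: int = 2000) -> np.ndarray:
    """rgb_img: (h, w, 3) array of the coating-separated tongue picture."""
    rows = np.where(rgb_img.any(axis=(1, 2)))[0]          # rows that contain tongue pixels
    y = (rows.min() + rows.max()) // 2                     # vertical center of the tongue
    samples = []
    for row in range(y, rgb_img.shape[0]):                 # (0:w, y), (0:w, y+1), ...
        for pixel in rgb_img[row]:
            if pixel.any():                                # skip background pixels (0, 0, 0)
                samples.append(pixel)
                if len(samples) == n_samples:
                    return np.array(samples)
    return np.array(samples)
```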
Step 3.1.3: tongue redness quantization based on sampling results
Step 3.1.3.1: drawing a least square fitting straight line
Specifically, the average R, G and B values of the collected pixel points are first calculated; the r value and g value of each tongue picture are then calculated as r = R/(R+G+B) and g = G/(R+G+B); finally, for each category, a straight line is fitted by the least squares method from the r and g values of all tongue pictures of that category. The fitted straight lines, with the r value on the abscissa and the g value on the ordinate, are shown in fig. 5. Theoretically, the pale red tongue fitting line should be located to the left of the red tongue fitting line, and as shown in fig. 5, the pale red tongue fitting line and the red tongue fitting line overlap in the vertical direction.
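A hedged sketch of these two steps is given below; the variable names are illustrative, and the line is fitted as g = k·r + c by ordinary least squares.

```python
import numpy as np

def chromaticity(sampled_pixels: np.ndarray) -> tuple[float, float]:
    """sampled_pixels: (n, 3) RGB samples of one tongue picture -> (r, g)."""
    R, G, B = sampled_pixels.mean(axis=0)
    return R / (R + G + B), G / (R + G + B)

def fit_category_line(r_values: np.ndarray, g_values: np.ndarray) -> tuple[float, float]:
    """Least-squares straight line g = k * r + c over all pictures of one category."""
    k, c = np.polyfit(r_values, g_values, deg=1)
    return k, c
```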
Step 3.1.3.2: dividing the grade of redness according to the fitted straight line
Preferably, the fitted straight lines of the four categories are divided into equal parts: the fitting line of the pale tongue is divided into 30 points, that of the pale red tongue into 40 points, that of the red tongue into 50 points, and that of the deep-red tongue into 10 points. Fault-tolerant areas are added between categories according to the overlap of the redness distributions of the four categories: the pale tongue fitting line overlaps the pale red tongue fitting line by 10 points, and the pale red tongue fitting line overlaps the red tongue fitting line by 20 points. These data are used as the redness matching standard of each category.
As shown in fig. 6, these are the redness matching standard visual color bars of the four categories: pale tongue, pale red tongue, red tongue and deep-red tongue. The pale tongue range is from level one to level three, the pale red tongue range is from level three to level six, the red tongue range is from level five to level nine, and the deep-red tongue range is level ten. As shown in fig. 7, the visual distribution of the tongue redness bars can be seen with the g value on the ordinate and the r value on the abscissa; fig. 7 corresponds to the fitting lines of the four categories in fig. 5.
Step 3.1.3.3: quantifying tongue redness of tongue picture according to sampling result
Specifically, the Euclidean distance is used to calculate how all the sampling points vote across the quantization levels of the matching standard of the corresponding category; the level with the most votes is kept, and the average value within that level is calculated as the quantization value of the picture.
Illustratively, the picture of fig. 14 is classified as a red tongue by the above model, and the quantization range of the red tongue is between level five and level nine. The Euclidean distance between each sampling point of the picture and the points of levels five to nine is then calculated, and each pixel point is assigned to the quantized value of the nearest level. The voting distribution of the 2000 sampling points over the redness quantization standard in fig. 14 is as follows: after the background pixel points are removed, the numbers of collected pixel points are 300, 356, 461, 517 and 366 for levels five, six, seven, eight and nine respectively, and the sums of the quantized values at each level are 1371, 1953, 2982, 3785 and 3123 respectively. The background pixels correspond to black in the picture.
Next, the level with the most votes, i.e. level eight, is kept for calculating the average quantization value. The calculation is as follows: 3785 divided by 517 is approximately 7.3, so the redness quantization value of fig. 14 is 7.3. In particular, the result can be rounded so that only one decimal place is kept.
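A hedged sketch of this voting rule follows. The matching standard is assumed to be given as arrays of standard points, each with an (r, g) position, an integer level and a continuous quantization value; how those points are generated is described in step 3.1.3.2.

```python
import numpy as np

def quantify(samples_rg: np.ndarray, std_rg: np.ndarray,
             std_level: np.ndarray, std_value: np.ndarray) -> float:
    """samples_rg: (n, 2) sampled (r, g) pairs; std_rg: (m, 2) standard points;
    std_level: (m,) integer level of each point; std_value: (m,) quantization value."""
    d = np.linalg.norm(samples_rg[:, None, :] - std_rg[None, :, :], axis=2)  # (n, m) distances
    nearest = d.argmin(axis=1)                     # nearest standard point per sample
    levels = std_level[nearest]
    values = std_value[nearest]
    winning = np.bincount(levels).argmax()         # level with the most votes
    return round(float(values[levels == winning].mean()), 1)

# Consistency check with the worked example: 517 votes at level eight whose quantization
# values sum to 3785 give 3785 / 517, i.e. about 7.3.
```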
Step 3.2: quantifying tongue purple in tongue image
This step quantifies the purplish tongue pictures within the pale dark tongue and the tongue pictures of the blue-purple tongue.
Step 3.2.1: the tongue picture is subjected to tongue coating separation, and the tongue picture after tongue coating separation is sampled
This step is identical to the procedure carried out in steps 3.1.1 and 3.1.2 above. The purplish tongue pictures can be processed together with the reddish tongue pictures for coating separation and sampling, or processed separately.
Step 3.2.2: quantification of tongue purple is performed according to the sampling result
Step 3.2.2.1: drawing a least square fitting straight line
This step is identical to the procedure of step 3.1.3.1, but different values are calculated: since the degree of purple is being quantified, the b value and g value are calculated, where b = B/(R+G+B) and g = G/(R+G+B). As shown in fig. 9, the data distribution of the two categories is relatively uniform and the fitted straight lines do not overlap end to end, so no fault-tolerant area is needed.
Step 3.1.3.2: dividing the level of the purple according to the fitted straight line
Preferably, the fit lines of the two categories are equally divided. Wherein, the fitting straight line of the dark tongue is equally divided into 30 points, and the fitting straight line of the blue-violet tongue is equally divided into 70 points. This data was used as a purple match criterion for each category.
As shown in fig. 8, these are the purple matching standard visual color bars of the two categories: dark tongue and blue-purple tongue. The dark tongue range is from level one to level three, and the blue-purple tongue range is from level four to level ten. As shown in fig. 9, the visual distribution of the tongue purple bars can be seen with the g value on the ordinate and the b value on the abscissa; the graph contains the fitted straight line of each of the two categories.
Step 4: outputting the class and quantization value of the tongue picture according to the classification result and quantization result
The step obtains the classification result of the tongue picture according to the model, namely which class the tongue picture belongs to, and meanwhile, the quantized value of the class can be obtained according to the step 3. In this example, the reddish tongue picture in the dark tongue was not quantified.
The classification method and the redness and purple quantification methods are tested on the test set. The test results are as follows: the accuracy rates for pale tongue, pale red tongue, red tongue, deep-red tongue, dark tongue and blue-purple tongue are 82%, 88%, 89%, 95%, 86% and 95% respectively, with an average accuracy of 87%; the recall rates are 77%, 87%, 92%, 85% and 85% respectively, with an average recall of 87%.
The application also provides a corresponding device based on the method, which comprises the following steps:
a first generation unit for generating a tongue color dataset;
the second generation unit is used for generating a tongue color classification model and distinguishing six classes of tongue picture according to the tongue color classification model;
the quantization unit is used for quantifying the tongue redness of the tongue pictures of the four categories of pale tongue, pale red tongue, red tongue and deep-red tongue, and for quantifying the tongue purple of the tongue pictures of the two categories of dark tongue and blue-purple tongue;
and the output unit outputs the category and the quantized value of the tongue picture according to the classification result and the quantized result.
According to a third aspect of the invention, a computer device comprises a memory storing a computer program and a processor implementing the steps of the above method when the processor executes the computer program.
The foregoing are merely some embodiments of the invention. It will be apparent to those skilled in the art that various modifications and improvements can be made without departing from the spirit of the invention.

Claims (10)

1. The tongue color classification and tongue redness and purple quantification method based on multi-model fusion is characterized by comprising the following steps of:
generating a tongue color dataset; the tongue color dataset at least comprises tongue picture data labeled with six categories, namely: pale tongue, pale red tongue, red tongue, deep-red tongue, dark tongue and blue-purple tongue;
generating a multi-model fused tongue color classification model, wherein the multi-model fused tongue color classification model is used for classifying the tongue color dataset;
quantifying the tongue redness of the tongue pictures of the four categories of pale tongue, pale red tongue, red tongue and deep-red tongue, and quantifying the tongue purple of the tongue pictures of the two categories of dark tongue and blue-purple tongue;
and outputting the category and the quantized value of the tongue picture according to the classification result and the quantized result.
2. The multi-model fused tongue color classification and tongue redness and purple quantification method of claim 1, wherein the generating a multi-model fused tongue color classification model for classifying the tongue color dataset comprises:
constructing an abnormality detection model, and distinguishing tongue pictures of the normal category from tongue pictures of the abnormal category according to the abnormality detection model; the tongue pictures of the normal category are tongue pictures of the four categories of pale tongue, pale red tongue, red tongue and pale dark tongue, and the tongue pictures of the abnormal category are tongue pictures of the two categories of deep-red tongue and blue-purple tongue;
after distinguishing the tongue pictures of the normal category from the tongue pictures of the abnormal category, inputting the tongue pictures of the normal category into a four-classification model and inputting the tongue pictures of the abnormal category into a red-violet classification model; the four-classification model classifies the four categories of tongue pictures, namely pale tongue, pale red tongue, red tongue and pale dark tongue, and then inputs the tongue pictures of the pale dark tongue category among the four categories into the red-violet classification model;
the red-violet classification model classifies the tongue pictures of the abnormal category and the tongue pictures of the pale dark tongue category according to the redness and the purple degree of the tongue pictures, respectively.
3. The tongue color classification and tongue redness and purple quantification method based on multi-model fusion according to claim 2, wherein constructing the anomaly detection model and distinguishing tongue pictures of the normal class from tongue pictures of the abnormal class according to the anomaly detection model further comprises:
dividing the tongue color dataset into a training set and a test set;
performing data enhancement on the tongue picture data of the training set;
extracting features from the data-enhanced tongue picture data, and distinguishing tongue pictures of the normal class from tongue pictures of the abnormal class according to the feature extraction result;
selecting a loss function for optimizing the anomaly detection model;
selecting an optimizer that updates the model according to the result of the loss function;
and performing training to generate the anomaly detection model.
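The training recipe of claim 3 (split, augment, extract features, choose a loss, choose an optimizer, train) maps onto a standard supervised pipeline. The following is a minimal PyTorch sketch under assumptions not stated in the claim: a ResNet-18 backbone as the feature extractor, cross-entropy as the loss, Adam as the optimizer, and a hypothetical `tongue_color_dataset/` folder laid out with `normal/` and `abnormal/` sub-folders.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, models, transforms

# Data enhancement (augmentation) applied when loading the tongue pictures.
train_tf = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.1, contrast=0.1),
    transforms.ToTensor(),
])

# Hypothetical ImageFolder layout: tongue_color_dataset/normal, tongue_color_dataset/abnormal.
dataset = datasets.ImageFolder("tongue_color_dataset", transform=train_tf)
n_train = int(0.8 * len(dataset))
train_set, test_set = random_split(dataset, [n_train, len(dataset) - n_train])
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)

# Feature extractor plus a binary head (normal vs. abnormal).
model = models.resnet18(weights=None)
model.fc = nn.Linear(model.fc.in_features, 2)

criterion = nn.CrossEntropyLoss()                           # the selected loss function
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)   # the selected optimizer

for epoch in range(10):                                     # training yields the anomaly detection model
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```

The same skeleton also covers the four-classification model of claim 4 and the red-purple classification model of claim 5; only the dataset folders and the number of output units of the final layer change.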
4. The tongue color classification and tongue redness and purple quantification method based on multi-model fusion according to claim 2, wherein inputting the tongue pictures of the normal class into the four-classification model further comprises constructing a four-classification model for classifying tongue pictures of the four categories of pale tongue, reddish tongue, red tongue and dark tongue;
wherein constructing the four-classification model specifically comprises the following steps:
constructing a four-classification model dataset;
performing data enhancement on the tongue pictures of the four-classification model dataset;
extracting features from the data-enhanced tongue picture data, and distinguishing the four categories of pale tongue, reddish tongue, red tongue and dark tongue according to the feature extraction result;
creating a loss function for the four-classification model;
selecting an optimizer according to the result of the loss function;
and performing training to generate the four-classification model.
5. The tongue color classification and tongue redness and purple quantification method based on multi-model fusion according to claim 2, wherein inputting the tongue pictures of the abnormal class into the red-purple classification model further comprises constructing the red-purple classification model;
wherein constructing the red-purple classification model specifically comprises the following steps:
constructing a red-purple classification model dataset;
performing data enhancement on the red-purple classification model dataset;
extracting features from the data-enhanced tongue picture data, and distinguishing red-leaning tongue pictures from purple-leaning tongue pictures according to the feature extraction result;
creating a loss function for the red-purple classification model;
selecting an optimizer according to the result of the loss function;
and performing training to generate the red-purple classification model.
6. The tongue color classification and tongue redness and purple quantification method based on multi-model fusion according to claim 1, wherein quantifying the tongue redness of tongue pictures of the four categories of pale tongue, reddish tongue, red tongue and dark tongue and quantifying the tongue purpleness of tongue pictures of the two categories of dark tongue and dark purple tongue further comprises:
separating the tongue coating from the tongue picture;
sampling the tongue picture from which the tongue coating has been separated;
and performing tongue color quantization according to the sampling result.
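Claim 6 leaves the sampling step open; one plausible reading, sketched below, is to draw a fixed number of pixels at random from the tongue-body region that remains after coating separation. The uniform sampling strategy, the sample size and the function name are assumptions for illustration only.

```python
import numpy as np

def sample_tongue_pixels(lab_image, body_mask, n_samples=500, seed=0):
    """Illustrative sampling step of claim 6: draw pixels from the tongue-body
    region left after the coating has been removed. Uniform random sampling
    and the sample size of 500 are assumptions, not part of the claim."""
    rng = np.random.default_rng(seed)
    ys, xs = np.nonzero(body_mask)                     # coordinates of tongue-body pixels
    idx = rng.choice(len(ys), size=min(n_samples, len(ys)), replace=False)
    return lab_image[ys[idx], xs[idx]]                 # sampled Lab colour values
```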
7. The tongue color classification and tongue redness and purple quantification method based on multi-model fusion according to claim 6, wherein separating the tongue coating from the tongue picture further comprises:
converting the tongue picture from the RGB color model to the Lab color model;
and dividing the tongue picture into a tongue body region and a tongue coating region by K-means clustering according to the a values of all pixel points in the tongue picture, removing the tongue coating region, and keeping only the tongue body region.
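The coating separation of claim 7 can be sketched with OpenCV and scikit-learn: convert to Lab, cluster the a values of all pixels into two groups with K-means, and keep the tongue-body cluster. Which cluster is the body is not fixed by the claim; the sketch assumes it is the cluster with the larger mean a value (the redder one), and that choice is an assumption.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def separate_tongue_body(bgr_image):
    """Remove the tongue-coating region by two-cluster K-means on the Lab a channel,
    following claim 7. The rule for picking the body cluster is an assumption."""
    lab = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2LAB)
    a_channel = lab[:, :, 1].reshape(-1, 1).astype(np.float32)

    # K-means with two clusters on the a values of all pixel points.
    kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(a_channel)
    labels = kmeans.labels_.reshape(lab.shape[:2])

    # Assumed heuristic: the tongue body is the cluster with the larger mean a
    # (more red); the coating cluster is discarded.
    body_cluster = int(np.argmax(kmeans.cluster_centers_.ravel()))
    mask = (labels == body_cluster).astype(np.uint8)
    return cv2.bitwise_and(bgr_image, bgr_image, mask=mask)
```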
8. The tongue color classification and tongue redness and purple quantification method based on multi-model fusion according to claim 6, wherein performing tongue color quantization according to the sampling result further comprises:
drawing a least-squares fitted straight line;
dividing quantization levels according to the fitted straight line;
and quantifying the tongue color of the tongue picture according to the sampling result.
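Claim 8 does not spell out what the least-squares line is fitted to or how the levels are derived. The sketch below is one illustrative reading: fit a line through the sampled (a, b) colour coordinates, project each sample onto that line, and split the projection range into equal quantization levels. The choice of (a, b) coordinates, the number of levels and the reported statistic are assumptions, not part of the claim.

```python
import numpy as np

def quantize_tongue_color(a_vals, b_vals, n_levels=10):
    """One possible quantization scheme in the spirit of claim 8."""
    a_vals = np.asarray(a_vals, dtype=float)
    b_vals = np.asarray(b_vals, dtype=float)

    # Least-squares fitted straight line b = k*a + c through the samples.
    k, c = np.polyfit(a_vals, b_vals, deg=1)

    # Project each sample onto the direction of the fitted line.
    direction = np.array([1.0, k]) / np.hypot(1.0, k)
    proj = a_vals * direction[0] + b_vals * direction[1]

    # Divide the projection range into equal quantization levels.
    edges = np.linspace(proj.min(), proj.max(), n_levels + 1)
    levels = np.clip(np.digitize(proj, edges) - 1, 0, n_levels - 1)

    # Report the mean level as the quantized colour value of the picture.
    return float(levels.mean())
```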
9. A multi-model fused tongue color classification and tongue redness and purple quantification device, the device comprising:
a first generation unit for generating a tongue color dataset;
a second generation unit for generating a tongue color classification model and distinguishing the six categories of tongue pictures according to the tongue color classification model;
a quantization unit for quantifying the tongue redness of tongue pictures of the four categories of pale tongue, reddish tongue, red tongue and dark tongue, and for quantifying the tongue purpleness of tongue pictures of the two categories of dark tongue and dark purple tongue;
and an output unit for outputting the category and the quantized value of the tongue picture according to the classification result and the quantization result.
10. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 8 when the computer program is executed.
CN202310305163.XA 2023-03-24 2023-03-24 Tongue color classification and tongue redness and purple quantification method based on multi-model fusion Pending CN116543414A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310305163.XA CN116543414A (en) 2023-03-24 2023-03-24 Tongue color classification and tongue redness and purple quantification method based on multi-model fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310305163.XA CN116543414A (en) 2023-03-24 2023-03-24 Tongue color classification and tongue redness and purple quantification method based on multi-model fusion

Publications (1)

Publication Number Publication Date
CN116543414A true CN116543414A (en) 2023-08-04

Family

ID=87454965

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310305163.XA Pending CN116543414A (en) 2023-03-24 2023-03-24 Tongue color classification and tongue redness and purple quantification method based on multi-model fusion

Country Status (1)

Country Link
CN (1) CN116543414A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117315357A (en) * 2023-09-27 2023-12-29 广东省新黄埔中医药联合创新研究院 Image recognition method and related device based on traditional Chinese medicine deficiency-excess syndrome differentiation classification
CN117315357B (en) * 2023-09-27 2024-04-30 广东省新黄埔中医药联合创新研究院 Image recognition method and related device based on traditional Chinese medicine deficiency-excess syndrome differentiation classification

Similar Documents

Publication Publication Date Title
CN106934418B (en) Insulator infrared diagnosis method based on convolution recursive network
CN108229576B (en) Cross-magnification pathological image feature learning method
CN111986125B (en) Method for multi-target task instance segmentation
CN110070008A (en) Bridge disease identification method adopting unmanned aerial vehicle image
CN108305253B (en) Pathological image classification method based on multiple-time rate deep learning
CN111815564B (en) Method and device for detecting silk ingots and silk ingot sorting system
CN105809121A (en) Multi-characteristic synergic traffic sign detection and identification method
CN106340016A (en) DNA quantitative analysis method based on cell microscope image
CN115994907B (en) Intelligent processing system and method for comprehensive information of food detection mechanism
CN111161224A (en) Casting internal defect grading evaluation system and method based on deep learning
WO2024021461A1 (en) Defect detection method and apparatus, device, and storage medium
CN114170418A (en) Automobile wire harness connector multi-feature fusion image retrieval method by searching images through images
CN116543414A (en) Tongue color classification and tongue redness and purple quantification method based on multi-model fusion
CN113962994A (en) Method for detecting cleanliness of lock pin on three-connecting-rod based on image processing
CN115272350A (en) Method for detecting production quality of computer PCB mainboard
CN117557557B (en) Thyroid pathological section cell detection method based on cell nucleus segmentation model
CN114267434A (en) Cancer patient tumor image delineation method based on image omics
CN110992301A (en) Gas contour identification method
CN113920378A (en) Attention mechanism-based radix bupleuri seed identification method
CN116597029B (en) Image re-coloring method for achromatopsia
CN115222732B (en) Injection molding process anomaly detection method based on big data analysis and color difference detection
CN111401485A (en) Practical texture classification method
CN113177602B (en) Image classification method, device, electronic equipment and storage medium
CN115082741A (en) Waste textile classifying method based on image processing
CN113901865A (en) Mold identification method and system based on cervical cell fluid-based slide preparation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination