CN116311380A - Skin typing method for autonomous learning of small sample data - Google Patents

Skin typing method for autonomous learning of small sample data

Info

Publication number
CN116311380A
CN116311380A (application CN202310377219.2A)
Authority
CN
China
Prior art keywords
samples
model
skin
reliability
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310377219.2A
Other languages
Chinese (zh)
Inventor
王曦
华薇
舒晓红
熊丽丹
唐洁
李利
李朝霞
霍维
邹琳
汤莹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
West China Hospital of Sichuan University
Original Assignee
West China Hospital of Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by West China Hospital of Sichuan University filed Critical West China Hospital of Sichuan University
Priority to CN202310377219.2A
Publication of CN116311380A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103 Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Computer Interaction (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a skin typing prediction method for autonomous learning of small sample data, and relates to the technical fields of deep learning and medical skin typing. In the method, an initial model is obtained by training a convolutional neural network on a small number of labeled samples; subsequent new samples are received by the initial model, and whether a new sample is worth labeling is judged from the model's prediction result, so that only samples the model cannot judge accurately are selected for labeling and retraining. Compared with existing skin typing methods, a single model is sufficient to obtain a more complete result; the model is continuously and iteratively optimized by the enhancement unit during actual use, and the idea of active learning is used to select the samples with greater uncertainty for labeling, which greatly reduces the number of samples requiring expert labeling and lowers labor cost. During retraining, the reliability module biases sample expansion toward the skin typings whose reliability is insufficient, thereby improving the final typing effect.

Description

Skin typing method for autonomous learning of small sample data
Technical Field
The invention relates to the technical field of deep learning and medical skin typing, in particular to a skin typing method for autonomous learning of small sample data.
Background
In recent years, with the improvement of living standards and quality of life, people have paid more attention to their personal image and appearance. This has driven the vigorous development of skin care and the continued prosperity of the cosmetics and medical aesthetics industries, but alongside the rise in skin-care awareness it has also brought misuse of skin-care products. The care methods and products required by different skin types generally differ, and only by selecting suitable skin-care products according to one's skin type can a real beautifying and whitening effect be achieved, so skin typing is in great demand in the medical aesthetics field.
At present, skin typing is performed mainly by manual identification or by skin testers. Manual identification is extremely inefficient and wastes labor, although its typing is generally more accurate. Skin testers are more efficient, but simple instruments have few functions; a moisture-content test pen, for example, mainly determines the water content of the skin from the fact that skin conductivity differs with water content. Complex instruments such as visual skin analyzers or the Dermavision PRO skin detection tool mainly combine a number of small sensors with high-definition cameras; they type the skin by taking high-definition pictures of it, such as 4K ultra-high-definition images, and applying conventional digital image processing methods, which achieves a certain accuracy.
With the advent of machine learning, the development of image recognition and multi-class classification applications has been greatly promoted, and new methods have also appeared in the field of skin typing, but they are few in number and the following problems remain:
1. existing deep-learning-based skin classification methods mainly focus on a single detail and perform binary classification, for example whether the skin is oily, and cannot perform multi-label classification such as identifying dry and sensitive skin;
2. skin samples are difficult to obtain: existing skin typing schemes require a large number of samples at the start of modeling, yet the current level of skin sample data collection is not high and the number of available samples is very limited, so their feasibility for small-sample skin typing is low;
3. existing schemes require a large number of samples to be labeled before the model is trained, a process that consumes substantial manpower and material resources, and manual operation can introduce labeling errors.
Disclosure of Invention
The invention aims to provide a skin typing prediction method and system for autonomous learning of small sample data, in which an initial model is obtained by training a convolutional neural network on a small number of labeled samples, subsequent new samples are received by the initial model, and whether a new sample is worth labeling is judged from the model prediction result, so that only samples the model cannot judge accurately are selected for labeling and retraining; this greatly reduces the workload while improving the prediction accuracy of the model.
In order to achieve the technical purpose and the technical effect, the invention is realized by the following technical scheme:
a skin parting prediction method for autonomous learning of small sample data is characterized in that an initial model is obtained by training a small number of marked samples through a convolutional neural network, a subsequent new sample is received through the initial model, whether the new sample has marking value is judged according to a model prediction result, and therefore only samples with inaccurate model judgment are selected for marking and retraining.
A skin typing prediction method for autonomous learning of small sample data specifically comprises the following steps:
S1: a small number of labeled samples are input, the reliability of each typing label is preset, and an initial classification model with multiple output layers is established using the labeled data;
S2: new unlabeled samples are input, the model makes predictions, the reliability module is updated according to the prediction results, and the samples are labeled and classified;
S3: if an input sample carries an uncertain label, it enters the manual labeling queue; it is then judged whether the model has reached the retraining threshold. If so, the samples are expanded according to a certain rule based on the labeled samples and the reliability module and the method proceeds to S4; otherwise it is judged whether the reliability module meets the final model standard, returning to S2 if not and proceeding to S5 if so;
S4: the model is trained again by combining the historical training samples with the newly labeled and expanded samples, and the method returns to S2 after training;
S5: the model closes its optimization channel and is no longer updated, so that it can be replicated for mass production and application.
Further, the S1 specifically includes:
S101: selecting a small number of skin sample images and manually labeling the skin typing of the samples;
S102: initializing the reliability evaluation module, where the mutually exclusive skin types, namely dry, oily, neutral and combination, form group A, and the probability-independent skin types, namely sensitivity, aging and pigmentation spots, form groups B, C and D respectively; each group has an independent reliability with an initial value of 0;
S103: unifying the format and normalizing the size of the labeled images, expanding the number of samples to 3 times the original number with geometric enhancement methods, and dividing the training set and the test set;
S104: modeling with the data set divided in S103, based on the VGG16 deep neural network structure, and setting multiple output layers, where output layer O1 gives the classification result for the group A mutually exclusive skin types and output layer O2 gives the classification results for the probability-independent skin types of groups B, C and D.
Further, the step S2 specifically includes:
S201: new samples are fed into the model in batches; the samples may come from image resources reserved in advance or from clinical application, and the probability of each attribute of a sample is calculated;
S202: the reliability evaluation group to which each calculated attribute belongs is updated according to the calculation: if the maximum output of O1 is below 0.5, the confidence is considered insufficient and the number of unreliable samples of group A is increased by 1; if the result of any attribute of O2 lies between 0.35 and 0.65, the confidence of that attribute is considered insufficient and the number of unreliable samples of the corresponding group is increased by 1, otherwise the number of reliable samples is increased by 1; finally, the reliability of the group containing the unreliable samples is calculated from the number of unreliable samples;
a maximum O1 output below 0.5 means that predicting the several mutually exclusive attributes has not produced an answer with higher confidence, so the result is considered unreliable; likewise, a predicted value of one of the probability-independent attributes of O2 falling within the interval 0.5 ± 0.15 indicates that the result is ambiguous and it is considered unreliable; groups A, B, C and D then each update their reliability by the following formula:
R = (r_1 + r_2 + … + r_n) / max(n, m)
where m is the minimum number of samples required for the reliability evaluation (for example, m = 1000 means that at least the 1000 most recently predicted samples are used to calculate reliability); r_i denotes the reliability of sample i, with r_i ∈ {0, 1}, 1 meaning that sample i is reliable and 0 the opposite; n is the number of all samples predicted so far; and R is the reliability of the group;
combining the above with real conditions, n < m occurs at the beginning of application, so the minimum value of n is set to m and the missing samples are directly assigned r_i = 0, meaning that the reliability is insufficient whenever the number of predicted samples has not reached the basic requirement;
S203: if the typing result is reliable, the corresponding typing label is attached to the sample; otherwise the sample's typing is marked with an uncertain label;
further, the step S3 specifically includes:
s301: if the sample transmitted by the S2 has an uncertain label, entering a manual labeling queue, and waiting for labeling of field experts;
s302: if the number of marked samples accords with the retraining threshold, namely the number of samples manually marked for the current model reaches a value N, and the reliability module does not reach the standard, entering S303, otherwise entering S304;
s303: expanding the number of new labeling samples by using a geometric enhancement method according to the reliability of the A, B, C, D group, and then entering step S4;
for the number expansion of new samples, firstly expanding the number of the new samples to 3 times of the original data by adopting a geometric enhancement mode, and on the basis, obtaining an additional expansion multiple of A group according to the reliability ratio of each group, if the reliability ratio of A, B, C, D is a:b:c:d
Figure BDA0004170861890000041
Wherein M is the coefficient of the extra expansion, and the value is {1,2,3}, which means that the extra expansion is M times.
S304: if the reliability module meets the final model standard, the method proceeds to S5, otherwise, the method proceeds to S2 to continue prediction.
Further, the step S4 specifically includes:
s401: integrating the historical sample data and the data after the new sample expansion, and re-dividing the training set and the testing set;
s402: retraining the model according to the mode of training the model in the step S1, and entering the step S2 after training is finished;
further, the step S5 includes: the model closes the optimizing channel, and is not updated any more, so that mass production and application of the model can be replicated.
Another object of the present invention is to provide a prediction system based on a skin typing prediction method for autonomous learning of small sample data;
the prediction system includes: the system comprises a model embryonic unit, a model enhancement unit and a final application unit;
the model embryonic unit is used for preprocessing a small amount of skin image sample data, realizing multi-classification and multi-label classification of skin images based on a prediction model of a convolutional neural network, and simultaneously establishing reliability evaluation modules aiming at various classifications;
the model enhancing unit applies the preliminary model generated by the embryonic unit and updates the reliability evaluating module, namely: inputting a new skin image for prediction, updating reliability according to the classification probability, manually labeling samples meeting labeling conditions, then retraining by using the history labeled samples, and continuously iterating the flow of the unit;
the final application unit is that after the model is iterated by the enhancement unit to a set degree, the model can be directly popularized and applied in batches without manually labeling samples, and the accepted input is a skin image and the output is the parting of the skin.
The invention has the beneficial effects that:
(1) In real applications, the collection of skin sample data is very difficult and the number of collected samples is very limited. Compared with existing skin typing modeling methods, the method needs only a small number of labeled samples to obtain an initial model through convolutional neural network training, which reduces the requirement for collecting initial samples;
(2) The invention has multiple output layers and can therefore achieve multi-label, multi-class classification. Its skin typing results are more comprehensive and more accurate than those of existing skin typing methods with only a single label, remedying the defect that existing methods can only perform a single (i.e. binary) classification;
(3) After the preliminary model is obtained, the invention can receive subsequent new samples and judge from the model's predictions whether a new sample is worth labeling, so that only samples the model cannot judge accurately are selected for labeling and retraining. This greatly reduces the workload and improves the classification accuracy of the model, whereas constructing a traditional skin typing model requires a great amount of manual labor to label the collected image data;
(4) The model training method can absorb new knowledge while retaining and optimizing old knowledge. Based on the idea of incremental learning, historical labeled samples and newly labeled samples are combined for retraining, so the model's performance improves. An existing skin typing model, once initialized and trained, is neither changed nor updated, whereas the present model can keep improving on continuously arriving new samples up to a set threshold; continuous iterative training strengthens the training effect of the model and improves the accuracy of the typing prediction.
Of course, it is not necessary for any one product to practice the invention to achieve all of the advantages set forth above at the same time.
Drawings
Fig. 1 is a schematic flow chart of a skin typing method for autonomous learning of small sample data according to an embodiment of the present invention;
fig. 2 is a schematic diagram of an autonomous learning method of a skin typing method for autonomous learning of small sample data according to an embodiment of the present invention;
fig. 3 is a schematic diagram of a prediction system structure of a skin typing method based on autonomous learning of small sample data according to an embodiment of the present invention;
Detailed Description
In order to more clearly describe the technical scheme of the embodiment of the present invention, the embodiment of the present invention will be described in detail below with reference to the accompanying drawings.
A skin typing prediction method for autonomous learning of small sample data: an initial model is obtained by training a convolutional neural network on a small number of labeled samples, subsequent new samples are received by the initial model, and whether a new sample is worth labeling is judged from the model prediction result, so that only samples the model cannot judge accurately are selected for labeling and retraining.
The invention is illustrated below with reference to specific examples:
example 1
As shown in fig. 1 and 2:
a skin typing prediction method for autonomous learning of small sample data specifically comprises the following steps:
s1: a small amount of marked samples are transmitted, the reliability of each parting label is preset, and an initial classification model with multiple output layers is established by using marked data;
in this embodiment, 600-800 skin sample images are selected, and the specific steps are as follows:
s101: selecting 600 skin sample images, and manually labeling the skin typing of the samples;
in this embodiment, the modeling is based on the idea of active learning, meaning that the optimization of the model depends on the subsequent samples rather than a large number of initial samples, so that only a small number of manually labeled samples are needed to train out the basic model during initial modeling, which is obvious.
S102: initializing a reliability evaluation module, wherein the skin types with mutually exclusive probabilities, namely dryness, oiliness, neutrality and mixedness, are A groups, the skin types with independent probabilities, namely sensitivity, aging and color spots, are B, C, D groups respectively, and each group has independent reliability and an initial value of 0;
in this embodiment, the reliability module mainly provides a measurement standard for the subsequent retraining of the expanded samples, and if the reliability of a certain type is low, the samples of the type are expanded in a focused manner, so that the model obtained by retraining performs better.
S103: unifying formats and normalizing the sizes of the marked images, expanding the number of samples to be 3 times of the original number by using a geometric enhancement method, and dividing a training set and a testing set;
in the embodiment, the marked images are uniformly converted into a 227×227×3 format, the sizes are normalized, the number of samples is increased to 3 times of the original data by using geometric enhancement methods such as translation, rotation and shearing, and a training set and a testing set are divided according to a principle of 7:3;
in this embodiment, the unified format and size normalization are performed according to the standard neural network training model step, and the data is preprocessed, so that the geometric enhancement and expansion of the sample is performed to make up for the problem of too little initialization data, and the sample is properly expanded.
S104: modeling with the data set divided in S103, based on the VGG16 deep neural network structure, and setting multiple output layers, where output layer O1 gives the classification result for the group A mutually exclusive skin types and output layer O2 gives the classification results for the probability-independent skin types of groups B, C and D.
In this embodiment, the Sigmoid and Softmax activation functions are applied together. The Sigmoid function maps each result to between 0 and 1, and the three labels correspond to three such mappings, giving predicted values for sensitivity, aging and pigmentation spots and thereby achieving multi-label classification. The output layer of existing skin type models uses only a Softmax activation function, which normalizes a numerical vector into a probability distribution whose values sum to 1; such a model can therefore distinguish only the oil-content characteristic of the skin and cannot mark contents such as sensitivity, aging or pigmentation spots.
In this embodiment, the input image undergoes the first and second convolution stages with max pooling, then the third to fifth convolution stages with max pooling after the fifth, followed by a fully connected layer; 25% of the neurons are randomly discarded with the Dropout technique, the fully connected and Dropout operations are repeated once, and a final fully connected layer is applied. At the same time, output layer O1 uses the Softmax activation function to calculate the probabilities of the several mutually exclusive skin types; the results lie between 0 and 1 and sum to 1 across the categories, achieving mutually exclusive classification and giving predicted values for dry, oily, neutral and combination skin. Output layer O2 is activated with the Sigmoid function, which maps each result to between 0 and 1, the three labels mapping separately to give predicted values for sensitivity, aging and pigmentation spots, achieving multi-label classification.
In this embodiment, the VGG16 deep neural network used in this scheme contains 13 convolutional layers and 3 fully connected layers and is a deep learning network widely applied in the field of image recognition; the output layer is improved on the basic VGG16 structure as follows: after processing by the Softmax function, output layer O1 gives results for the four types dry, oily, neutral and combination that lie between 0 and 1 and sum to 1, achieving mutually exclusive classification; after processing by the Sigmoid function, output layer O2 maps each result to between 0 and 1, with the three labels sensitivity, aging and pigmentation spots corresponding to the three mappings, so the model can perform multi-label classification, an obvious advantage over existing single-output-layer schemes;
in this embodiment: the existing skin parting model needs a large number of marked samples in construction, only a small number of marked samples are selected for training and initializing the neural network, the difficulty of collecting the samples is reduced, meanwhile, multiple classification and multiple label classification are realized by using Softmax and Sigmoid functions, the problem of single existing skin parting result is solved, and more comprehensive parting can be carried out on skin.
S2: new unlabeled samples are input, the model makes predictions, the reliability module is updated according to the prediction results, and the samples are labeled and classified;
in this embodiment, step S2 is shown in fig. 2, and specifically includes:
s201: new samples are transmitted to the model in batches, and the probability of each attribute is calculated;
s202: updating the reliability evaluation group to which the calculated attribute belongs according to the calculated attribute: if O 1 If the number of the attributes of the group A unreliable samples is lower than 0.5, the confidence is considered to be insufficient, and the number of the unreliable samples of the group A is increased by 1; if O 2 Several attribute junctions of (2)If the result is between 0.35 and 0.65, the confidence of the attribute is considered to be insufficient, the unreliable sample number of the corresponding group is increased by 1, and otherwise, the reliable sample number is increased by 1. Finally, calculating the reliability of the group in which the unreliable sample is positioned according to the number of the unreliable samples;
if O 1 Is below 0.5, meaning that predicting several mutually exclusive attributes does not result in a higher confidence answer, and therefore the result is considered unreliable, while O 2 The predicted value of the several probability independent attributes of (a) is within the interval of 0.5+/-0.15, which indicates that the result is ambiguous and is considered unreliable; the A, B, C, D group then updates the reliability of the group, respectively, by the following calculation formula:
Figure BDA0004170861890000091
where m is the minimum number of samples required for reliability evaluation, for example: m is 1000 means that at least the most recently predicted 1000 samples are used to calculate reliability; r is (r) i Representing the reliability of sample i, r i E {0,1},1 representing that sample i is reliable, 0 vice versa; n is the number of all samples currently predicted; r is the reliability of the group;
the initial value of n is set to 1000, corresponding r i All are 0, and if the number of predicted samples does not reach the most basic requirement m, the reliability is insufficient.
S203: if the parting result is reliable, marking a corresponding parting label for the sample, otherwise marking an uncertain label for the parting of the sample;
in this embodiment, the screening of the new sample with higher uncertainty can be automatically completed, the process can be iterated continuously, and the reliability module is updated, so that the skin typing prediction result is more accurate.
S3: if an input sample carries an uncertain label, it enters the manual labeling queue; it is then judged whether the model has reached the retraining threshold. If so, the samples are expanded according to a certain rule based on the labeled samples and the reliability module and the method proceeds to S4; otherwise it is judged whether the reliability module meets the final model standard, returning to S2 if not and proceeding to S5 if so;
in this embodiment:
s301: if the sample transmitted by the S2 has an uncertain label, entering a manual labeling queue, and waiting for labeling of field experts;
in this embodiment, the core content of active learning is that sample data which are difficult to classify are obtained through a machine learning method, so that manual verification and auditing are performed again, then the data obtained through manual labeling are trained again by using a supervised learning model or a semi-supervised learning model, the prediction effect of the model is gradually improved, and manual experience is integrated into the machine learning model. The prior skin type modeling scheme can train the model only by requiring a large amount of skin data and labeled classification labels when training the model, which means that a large amount of time is consumed by experts in the medical field, but the invention is based on the principle of active learning, and only when S2 is input and a sample is marked as uncertain, the model in the stage is proved to have insufficient reliability on the prediction result of the part of the sample, the model is difficult to distinguish the relevant labels of the sample data, and the expert label queue is required to be accessed only by manual labeling, thus the burden of the expert label can be greatly reduced, the label data with larger value can be obtained by using less cost, and the effect of an algorithm is further improved.
S302: if the number of marked samples accords with the retraining threshold, namely the number of samples manually marked for the current model reaches a value N, and the reliability module does not reach the standard, entering S303, otherwise entering S304;
in this embodiment, the value N may be determined according to practical situations, if the requirement on the retraining threshold is high and the sample resources are more, N may be set to 800, and if the requirement is low, N may be set to 500; the standard range of reliability is (0, 1), which again depends on the situation, should not be too low, otherwise the final model is poor, while too high may lead to difficult realization of the final model.
S303: expanding the number of new labeling samples by using a geometric enhancement method according to the reliability of the A, B, C, D group, and then entering step S4;
s304: if the reliability module meets the final model standard, the method proceeds to S5, otherwise, the method proceeds to S2 to continue prediction.
In this embodiment, the specific steps are as follows:
after entering a manual marking queue, waiting for marking, and reminding a mark responsible person if the sample waiting time in the queue is longer than 24 hours and is not marked in a task starting interval;
if the retraining threshold is met, namely the number of samples manually marked for the current model reaches 500, and the reliability of each group does not reach 0.85, entering S303, otherwise entering S304;
expanding the number of new labeling samples by using a geometric enhancement method according to the reliability of the A, B, C, D group, and then entering step S4;
for the expansion of the number of new samples, firstly, expanding the number of the new samples to 3 times of the original data by using a geometric enhancement mode, and on the basis, setting the reliability ratio of A, B, C, D as a:b:c:d according to the reliability ratio of each group, wherein the A group has the additional expansion multiple as follows:
Figure BDA0004170861890000111
the remaining groups are pushed in this way:
where 8 represents the product of the additional expansion coefficient m=2 and the number of groups 4, meaning an additional expansion of 2 times.
If the reliability of each group of the reliability module reaches 0.85, and the model passes the manual random verification, entering S5, otherwise entering S2 to continue prediction;
in this embodiment, the manual verification scheme is: randomly extracting 100 samples which are predicted by the model and are not marked with uncertainty, and manually verifying, wherein if the error is lower than 10%, the manual verification is passed;
in this embodiment, according to the idea of active learning, only the samples with labeling values are manually labeled, so that the manual labeling cost can be greatly reduced, the problem that a large amount of manpower is required to be consumed for labeling samples during the initial training of the existing skin parting model is solved, and meanwhile, when a new sample is expanded, the grouping number with insufficient reliability can be increased according to the reliability ratio, so that the parting with insufficient reliability is emphasized during the subsequent retraining of the model.
S4: training the model again by combining the historical training sample and the new label expanded sample, and entering S2 after training the model;
in this embodiment:
s401: integrating the historical sample data and the data after the new sample expansion, and re-dividing the training set and the testing set;
in this embodiment, the training set and the test set are randomly re-divided according to 7:3;
s402: retraining the model according to the mode of training the model in the step S1, and entering the step S2 after training is finished;
in this embodiment, as time goes by, more data gradually expands the labeled samples, so that the data set available for training is increased, and therefore, the model can be continuously updated before the model reaches the shaping threshold by using the data to retrain the model, and the existing skin parting model is shaped after the initial training and does not have any modification later, so that compared with the existing scheme, the invention has the capability of absorbing new knowledge and simultaneously retaining and optimizing old knowledge.
In the embodiment, based on the thought of incremental learning, the history labeling sample and the new labeling sample are fused for retraining, so that the performance of the model is improved, the existing skin parting model is not changed or updated once the model is initialized and trained, and the accuracy of the model prediction result can be improved by continuously training the new sample which comes continuously.
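A minimal sketch of this incremental retraining step (S401 to S402): the historical labeled data and the newly labeled, expanded samples are merged, randomly re-split 7:3 and the model is retrained with the same routine as in S1. The function and parameter names are illustrative, not taken from the patent.

import random

def retrain(model, history_samples, new_expanded_samples, train_fn, train_ratio=0.7):
    # Merge old and new labeled data, re-split 7:3, and retrain with the S1 training routine.
    merged = list(history_samples) + list(new_expanded_samples)
    random.shuffle(merged)
    cut = int(len(merged) * train_ratio)
    train_set, test_set = merged[:cut], merged[cut:]
    train_fn(model, train_set, test_set)
    return model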
S5: the model closes its optimization channel and is no longer updated, so that it can be replicated for mass production and application.
In this embodiment:
only 600 marked samples are needed for initializing the model, and 10000 samples are needed for initializing the model according to the embodiment of the prior scheme, which is 16.67 times as large as the embodiment of the invention, so that the demand of the initial samples is less and one of the advantages of the scheme; meanwhile, the scheme model can output dryness, oiliness, neutrality, miscibility, sensitivity, aging property, color spots and the like of the skin when in use, and the embodiment of the prior scheme can only distinguish oiliness level; according to the method, based on the thought of active learning, samples with labeling values are selected for manual labeling in continuous practice, 10000 samples are required to be manually labeled before a model is initialized by the existing scheme, a large amount of manpower is consumed, and the embodiment of the method only needs about 3000 manual labeling times in total, so that the final application standard can be achieved, and the labeling cost is saved by about 70% compared with that of the existing scheme.
Example 2
A prediction system based on the skin typing prediction method of embodiment 1 for autonomous learning of small sample data;
the prediction system includes: the system comprises a model embryonic unit, a model enhancement unit and a final application unit;
the model embryonic unit is used for preprocessing a small amount of skin image sample data, realizing multi-classification and multi-label classification of skin images based on a prediction model of a convolutional neural network, and simultaneously establishing reliability evaluation modules aiming at various classifications;
the model enhancing unit applies the preliminary model generated by the embryonic unit and updates the reliability evaluating module, namely: inputting a new skin image for prediction, updating reliability according to the classification probability, manually labeling samples meeting labeling conditions, then retraining by using the history labeled samples, and continuously iterating the flow of the unit;
the final application unit is that after the model is iterated by the enhancement unit to a set degree, the model can be directly popularized and applied in batches without manually labeling samples, and the accepted input is a skin image and the output is the parting of the skin.
Compared with existing skin typing models, the requirement on initial data is markedly lower. Once realized, the model can classify skin images with multiple classes and multiple labels, so a more complete result is obtained with only a single model; during actual use the model is then optimized continuously and iteratively by the enhancement unit, while the idea of active learning is used to select the samples with greater uncertainty for labeling, which greatly reduces the number of samples requiring expert labeling compared with existing skin typing models and lowers labor cost. During retraining, the reliability module biases sample expansion toward the typings whose reliability is insufficient, thereby improving the final typing effect.
The preferred embodiments of the invention disclosed above are intended only to assist in explaining the invention. The preferred embodiments are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and its practical application, so that others skilled in the art can best understand and utilize the invention. The invention is limited only by the claims and their full scope and equivalents.

Claims (8)

1. A skin typing prediction method for autonomous learning of small sample data, characterized in that: an initial model is obtained by training a convolutional neural network on a small number of labeled samples, subsequent new samples are received by the initial model, whether a new sample is worth labeling is judged from the model prediction result, and only samples the model cannot judge accurately are selected for labeling and retraining.
2. The skin typing prediction method for autonomous learning of small sample data according to claim 1, wherein the method specifically comprises the following steps:
S1: a small number of labeled samples are input, the reliability of each typing label is preset, and an initial classification model with multiple output layers is established using the labeled data;
S2: new unlabeled samples are input, the model makes predictions, the reliability module is updated according to the prediction results, and the samples are labeled and classified;
S3: if an input sample carries an uncertain label, it enters the manual labeling queue; it is then judged whether the model has reached the retraining threshold. If so, the samples are expanded according to a certain rule based on the labeled samples and the reliability module and the method proceeds to S4; otherwise it is judged whether the reliability module meets the final model standard, returning to S2 if not and proceeding to S5 if so;
S4: the model is trained again by combining the historical training samples with the newly labeled and expanded samples, and the method returns to S2 after training;
S5: the model closes its optimization channel and is no longer updated, so that it can be replicated for mass production and application.
3. The skin typing prediction method for autonomous learning of small sample data according to claim 2, wherein S1 specifically comprises:
S101: selecting a small number of skin sample images and manually labeling the skin typing of the samples;
S102: initializing the reliability evaluation module, where the mutually exclusive skin types, namely dry, oily, neutral and combination, form group A, and the probability-independent skin types, namely sensitivity, aging and pigmentation spots, form groups B, C and D respectively; each group has an independent reliability with an initial value of 0;
S103: unifying the format and normalizing the size of the labeled images, expanding the number of samples to 3 times the original number with geometric enhancement methods, and dividing the training set and the test set;
S104: modeling with the data set divided in S103, based on the VGG16 deep neural network structure, and setting multiple output layers, where output layer O1 gives the classification result for the group A mutually exclusive skin types and output layer O2 gives the classification results for the probability-independent skin types of groups B, C and D.
4. The skin typing prediction method for autonomous learning of small sample data according to claim 2, wherein the step S2 specifically comprises:
S201: new samples are fed into the model in batches; the samples may come from image resources reserved in advance or from clinical application, and the probability of each attribute of a sample is calculated;
S202: the reliability evaluation group to which each calculated attribute belongs is updated according to the calculation: if the maximum output of O1 is below 0.5, the confidence is considered insufficient and the number of unreliable samples of group A is increased by 1; if the result of any attribute of O2 lies between 0.35 and 0.65, the confidence of that attribute is considered insufficient and the number of unreliable samples of the corresponding group is increased by 1, otherwise the number of reliable samples is increased by 1; finally, the reliability of the group containing the unreliable samples is calculated from the number of unreliable samples;
a maximum O1 output below 0.5 means that predicting the several mutually exclusive attributes has not produced an answer with higher confidence, so the result is considered unreliable; likewise, a predicted value of one of the probability-independent attributes of O2 falling within the interval 0.5 ± 0.15 indicates that the result is ambiguous and it is considered unreliable; groups A, B, C and D then each update their reliability by the following formula:
R = (r_1 + r_2 + … + r_n) / max(n, m)
where m is the minimum number of samples required for the reliability evaluation (for example, m = 1000 means that at least the 1000 most recently predicted samples are used to calculate reliability); r_i denotes the reliability of sample i, with r_i ∈ {0, 1}, 1 meaning that sample i is reliable and 0 the opposite; n is the number of all samples predicted so far; and R is the reliability of the group;
combining the above with real conditions, n < m occurs at the beginning of application, so the minimum value of n is set to m and the missing samples are directly assigned r_i = 0, meaning that the reliability is insufficient whenever the number of predicted samples has not reached the basic requirement;
s203: if the result of the typing is reliable, labeling the corresponding typing of the sample, otherwise labeling the sample with an uncertain label.
5. The skin typing prediction method for autonomous learning of small sample data according to claim 2, wherein the step S3 specifically comprises:
S301: if a sample passed on from S2 carries an uncertain label, it enters the manual labeling queue and waits to be labeled by domain experts;
S302: if the number of labeled samples meets the retraining threshold, that is, the number of samples manually labeled for the current model reaches a value N, and the reliability module has not reached the standard, the method proceeds to S303, otherwise to S304;
S303: expanding the number of newly labeled samples with geometric enhancement methods according to the reliability of groups A, B, C and D, and then proceeding to step S4;
for the expansion of the number of new samples, the number is first expanded to 3 times the original data by geometric enhancement; on that basis, an additional expansion multiple is obtained for each group from the reliability ratio of the groups: if the reliability ratio of A, B, C and D is a : b : c : d, the additional expansion multiple of group A is
E_A = 4M · (1/a) / (1/a + 1/b + 1/c + 1/d)
where M is the additional expansion coefficient, taking a value in {1, 2, 3} and meaning an additional expansion of M times;
s304: if the reliability module meets the final model standard, the method proceeds to S5, otherwise, the method proceeds to S2 to continue prediction.
6. The skin typing prediction method for autonomous learning of small sample data according to claim 2, wherein S4 specifically comprises:
s401: integrating the historical sample data and the data after the new sample expansion, and re-dividing the training set and the testing set;
s402: and (2) retraining the model according to the mode of training the model in the step (S1), and entering the step (S2) after training is finished.
7. The skin typing prediction method for autonomous learning of small sample data according to claim 1, wherein S5 comprises: the model closes its optimization channel and is no longer updated, so that it can be replicated for mass production and application.
8. A prediction system based on the skin typing prediction method for autonomous learning of small sample data according to any one of claims 1 to 7, characterized in that:
the prediction system includes: a model prototype unit, a model enhancement unit and a final application unit;
the model prototype unit preprocesses a small amount of skin image sample data, realizes multi-class and multi-label classification of skin images with a prediction model based on a convolutional neural network, and at the same time establishes reliability evaluation modules for the various classifications;
the model enhancement unit applies the preliminary model generated by the prototype unit and updates the reliability evaluation module, that is: a new skin image is input for prediction, the reliability is updated according to the classification probabilities, samples meeting the labeling condition are labeled manually, the model is then retrained together with the historically labeled samples, and this flow of the unit is iterated continuously;
the final application unit means that, once the model has been iterated by the enhancement unit to a set degree, it can be popularized and applied in batches directly without manually labeling samples; the accepted input is a skin image and the output is the typing of the skin.
Application CN202310377219.2A, priority and filing date 2023-04-10: Skin typing method for autonomous learning of small sample data; status: Pending; publication: CN116311380A

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310377219.2A CN116311380A (en) 2023-04-10 2023-04-10 Skin typing method for autonomous learning of small sample data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310377219.2A CN116311380A (en) 2023-04-10 2023-04-10 Skin typing method for autonomous learning of small sample data

Publications (1)

Publication Number Publication Date
CN116311380A true CN116311380A (en) 2023-06-23

Family

ID=86832459

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310377219.2A Pending CN116311380A (en) 2023-04-10 2023-04-10 Skin typing method for autonomous learning of small sample data

Country Status (1)

Country Link
CN (1) CN116311380A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116935388A (en) * 2023-09-18 2023-10-24 四川大学 Skin acne image auxiliary labeling method and system, and grading method and system
CN116935388B (en) * 2023-09-18 2023-11-21 四川大学 Skin acne image auxiliary labeling method and system, and grading method and system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination