CN111612133A - Internal organ feature coding method based on face image multi-stage relation learning

Internal organ feature coding method based on face image multi-stage relation learning

Info

Publication number
CN111612133A
Authority
CN
China
Prior art keywords: organ, feature, face image, internal organ, label
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010430517.XA
Other languages
Chinese (zh)
Other versions
CN111612133B (en)
Inventor
文鹏程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Huajian Intelligent Technology Co ltd
Original Assignee
Guangzhou Huajian Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Huajian Intelligent Technology Co ltd filed Critical Guangzhou Huajian Intelligent Technology Co ltd
Priority to CN202010430517.XA priority Critical patent/CN111612133B/en
Publication of CN111612133A publication Critical patent/CN111612133A/en
Application granted granted Critical
Publication of CN111612133B publication Critical patent/CN111612133B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2431 Multiple classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses an internal organ feature coding method based on multi-stage relation learning from face images. The method comprises: collecting face images and obtaining the labels annotated for each face image, the labels comprising internal organ labels and the organ feature labels associated with each internal organ; after data augmentation of the face images, normalizing and standardizing the R, G, B channels respectively to obtain a training set; and performing supervised learning of two subtask branches simultaneously on the face image training set using the internal organ labels and the organ feature labels, so as to embed prior guiding knowledge of the visceral features, finally obtaining a visceral feature coding model embedded with that prior knowledge. The invention fully considers the correlation between the face image and the internal organ and organ feature labels, models and analyzes it with a multi-stage relation learning model, and the resulting coding of human visceral features provides intuitive, objective basic support for personal health care and health preservation.

Description

Internal organ feature coding method based on face image multi-stage relation learning
Technical Field
The invention relates to the technical field of machine learning, in particular to an internal organ feature coding method based on face image multi-stage relation learning.
Background
The ancient Chinese medical classic Huangdi Neijing records that "the twelve meridians and the three hundred and sixty-five collaterals all carry their blood and qi upward to the face and into the sense orifices", indicating that the condition of the human zang-fu organs (the five zang organs and six fu organs) shows in corresponding areas of the face. By observing a person's facial complexion, the condition of the zang-fu organs can therefore be assessed, and the organs can then be conditioned by improving diet, exercise and living habits, achieving the purpose of health preservation. For example, people with internal dampness typically show patches, acne, an oily sheen on the face, a red nose tip, and so on, so the body's dampness can be read from facial features. Internal dampness is associated with particular zang-fu organs: it may reside in the lung or in the spleen, and only when the organ harboring the dampness is known can a reasonable diet conditioning scheme, exercise conditioning scheme and living habit improvement scheme be formulated. Doing so, however, requires rich health-care expertise that ordinary people do not possess, so they cannot formulate a suitable conditioning scheme on their own.
With the development of science and technology, big data and deep learning have advanced rapidly in recent years. Machine learning techniques use multi-layer neural networks and, through training on massive data, enable a computer to learn to understand complex data such as images and sounds and to act accordingly. Such networks can extract highly complex discriminative features that humans find difficult to interpret or design by hand.
However, current deep supervised learning methods require a large amount of labeled data; otherwise model training easily falls into overfitting, and the generalization ability of the model is insufficient to effectively express the distinguishing characteristics of the data. Collecting labeled data is very expensive, and it is almost impossible to collect sufficient training data for every task. By learning several related tasks within a multi-task framework and embedding prior knowledge through the label data of the related tasks, knowledge from the related tasks can be transferred to the main task, ultimately assisting the modeling of the main task. This reduces the dependence on label data, improves the utilization of the existing label data, and improves the generalization ability of the main task model.
Therefore, how to provide a method that effectively encodes human internal organ features from a face image on the basis of limited label data is a problem to be solved by those skilled in the art.
Disclosure of Invention
In view of the above, the invention provides an internal organ feature coding method based on face image multi-stage relationship learning, which fully considers the correlation between the face image and the internal organ and organ feature labels, performs modeling and analysis with a multi-stage relationship learning model, and provides basic support for the predictive coding of the internal organ features of the human body.
In order to achieve the above purpose, the invention provides the following technical scheme:
an internal organ feature coding method based on face image multi-stage relation learning comprises the following steps:
step 1, data acquisition: collecting face images and obtaining labels marked for each face image, wherein the labels comprise internal organ labels and organ feature labels corresponding to each internal organ;
step 2, data processing: the face image is subjected to data augmentation, and then the R, G, B channels are respectively normalized and standardized to obtain a training set D_tr = {(x_n, y_n), n = 1…N}, x_n ∈ X, y_n ∈ Y, wherein X is the processed face image sample set and Y is the label set;
step 3, constructing a multi-stage relation learning network model: supervised learning of two subtask branches is performed simultaneously on the face image sample set using the internal organ labels and the organ feature labels to embed prior guiding knowledge of the visceral features, finally obtaining a visceral feature coding model embedded with the prior knowledge.
Preferably, in step 1, the internal organ labels include large intestine, gallbladder, lung, liver, bladder, spleen, kidney, stomach, small intestine, heart, and unknown; organ characteristic labels include qi deficiency, blood deficiency, yin deficiency, yang deficiency, qi stagnation, blood stasis, phlegm, wind, heat, cold, dryness, dampness, and unknown.
Preferably, the process of data augmentation in step 2 includes scaling the face image to 256 × 256 size, randomly cutting out an image with 224 × 224 size from the scaled image, and randomly horizontally flipping the cut image to perform data augmentation.
Preferably, the step 3 comprises:
step 31, constructing a backbone network: the internal organ labels and the organ feature labels are used as the subtask set of a multi-task learning framework, and the face image sample x_n is passed through the backbone network to extract the common features shared among the tasks;
step 32, constructing a dynamic sequence module DSM, modeling two tasks of the internal organ label and the organ feature label by taking the common feature as input through a Recurrent Neural Network (RNN), and outputting task related features of the internal organ and the organ feature;
step 33, constructing classifiers: the outputs of step 32 are passed through fully connected layers and fed into the respective multi-label classifiers f_loc and f_nat, and a set of decision vectors is obtained that includes the predictions of the internal organs and the organ characteristics.
Preferably, the step 32 specifically includes:
step 321, modeling the internal organs: the common features extracted by the backbone network are used as the input of the node at each time step of the DSM in the task-related layer, the node at each time step corresponding to one subtask branch, namely the internal organ task and the organ feature task respectively, so that the number of tasks equals the number of time steps;
step 322, the hidden state feature and the common feature of a time step are fed into the node of the next time step in the RNN to model the human internal organ features, the hidden state feature h_t^s of time step t being expressed as:

h_t^s = F_h(M ⊙ h_{t-1}^s, c_n^s),

wherein M ~ Bernoulli(r_s) is a mask of the same size as h_{t-1}^s, M ∈ {0,1}^{D_h}, D_h is the feature dimension of the hidden state feature, and h_0^s is a randomly initialized initial state; the output o_loc of the organ branch and the output o_nat of the organ feature branch are obtained at the respective time steps, wherein o_nat is influenced by the input common feature and, with probability r_s, by the hidden state of the previous time step.
Preferably, the step 3 further includes a process of constructing a semantic regularization loss function:
step 34, setting a hyper-parameter threshold z and, according to the magnitude of the L2 loss, calculating activation values a_loc and a_nat for the internal organ task and the organ feature task, which respectively represent the model's recognition results for the internal organs and the organ characteristics;
step 35, introducing a penalty term into the loss function, setting a gating term G to control the activation of the penalty term, and calculating the semantic constraint loss L_sem; for the output/label pair of each branch, the binary cross entropy loss is calculated, yielding the two multi-label classification losses L_loc and L_nat; the final loss is obtained by adding the three losses:

L_total = L_loc + L_nat + L_sem,

wherein L_loc and L_nat are the multi-label cross entropy classification losses of the human internal organs and of the human internal organ characteristics, respectively;
step 36, calculating the loss L_total and performing back propagation, adjusting the parameters of the model, and finally embedding the prior knowledge of the internal organs and the organ characteristics into the internal organ feature coding to obtain the trained internal organ feature coding model.
Compared with the prior art, the internal organ feature coding method based on face image multi-stage relation learning of the invention has the following advantages:
The method encodes the internal organ and organ feature labels, assists the recognition of the internal organ characteristics, embeds prior guiding knowledge into the visceral features, and improves the feature coding of the visceral features corresponding to the face image. A multi-task learning model is constructed, supervised learning of the two subtask branches is performed simultaneously using the two groups of labels (internal organs and organ characteristics) to embed the prior knowledge, and a visceral feature coding embedded with the prior knowledge is finally obtained. Experiments show that improved visceral feature codes are obtained by inputting the face images of test samples into the disclosed model, and the obtained codes can serve as basic technical support for diet conditioning schemes, exercise conditioning schemes and living habit improvement schemes, improving the pertinence of such schemes.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a flow chart of an internal organ feature coding method based on face image multi-stage relationship learning according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention discloses an internal organ feature coding method based on face image multi-stage relation learning, which comprises the following steps:
s1, data acquisition: the method comprises the steps of collecting face images of a user, and labeling two groups of labels related to internal organs and corresponding features of the human body to each image, wherein each group of labels is provided with one or more category labels. An internal organ label labeled for each face image and an organ feature label associated with each internal organ are acquired. Wherein, the label types of the human internal organs are 11: large intestine, gallbladder, lung, liver, bladder, spleen, kidney, stomach, small intestine, heart, and unknown; there are 13 categories of human internal organ feature labels: qi deficiency, blood deficiency, yin deficiency, yang deficiency, qi stagnation, blood stasis, phlegm, wind, heat, cold, dryness, dampness, and unknown. Finally, a face image from 7153 users was acquired, constituting a face image data set consisting of 10337 pictures. The acquired face images are obtained by shooting with a common digital camera or a smart phone. And the label of each picture is manually labeled by the health care experts according to the actual conditions of the users, and two groups of label information need to be acquired in the step.
S2, data processing: the face data collected in S1 is divided into a training set and a test set at a ratio of 4:1. For the training set, each face image is scaled to 256 × 256, an image of size 224 × 224 is randomly cropped from the scaled image, the cropped image is randomly horizontally flipped for data augmentation, and the R, G, B channels of each image are respectively normalized and standardized, yielding the training set D_tr = {(x_n, y_n), n = 1…N}, x_n ∈ X, y_n ∈ Y, where X is the processed face image sample set and Y is the label set. For the test set, all face images are scaled to 256 × 256, a 224 × 224 picture is cropped from the center, and the R, G, B channels are likewise normalized and standardized; the test set is used to evaluate the coding model.
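For reference, the preprocessing pipeline described above can be sketched with torchvision as follows. This is a minimal illustration: the per-channel normalization statistics are placeholders, since the patent does not specify the constants used.

```python
# Minimal preprocessing sketch of S2 using torchvision.
# Assumption: ImageNet channel statistics stand in for the unspecified
# per-channel normalization constants.
from torchvision import transforms

CHANNEL_MEAN = [0.485, 0.456, 0.406]  # placeholder values
CHANNEL_STD = [0.229, 0.224, 0.225]   # placeholder values

# Training set: scale to 256x256, random 224x224 crop, random horizontal
# flip, then per-channel normalization of R, G, B.
train_transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.RandomCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),                       # scales pixels to [0, 1]
    transforms.Normalize(CHANNEL_MEAN, CHANNEL_STD),
])

# Test set: scale to 256x256, center 224x224 crop, same normalization.
test_transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(CHANNEL_MEAN, CHANNEL_STD),
])
```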
The training set has N labeled samples. The internal organ labels have C_loc (specifically 12) classes, with the internal organ label set Y_loc = {y_n^loc, n = 1…N}; the internal organ feature labels have C_nat (specifically 14) classes, with the organ feature label set Y_nat = {y_n^nat, n = 1…N}. The number of tasks of the multi-task learning framework is T; for the organ feature coding here, T = 2, namely the internal organ task and the organ feature task. That is to say, one sample x_n corresponds to two groups of labels from the two tasks, y_n = {y_n^loc, y_n^nat}, wherein y_n^loc ∈ {0,1}^{C_loc} and y_n^nat ∈ {0,1}^{C_nat}; that is, the label set Y comprises the labels of the two tasks, Y = {Y_loc, Y_nat}.
S3, constructing a multi-stage relation learning network model: supervised learning of the two subtask branches is performed simultaneously on the face image sample set using the internal organ labels and the organ feature labels to embed prior guiding knowledge of the visceral features, finally obtaining the visceral feature coding embedded with the prior knowledge. Referring to figure 1, the specific construction steps are as follows:
s31, constructing a backbone network, using the internal organ labels and the organ feature labels as a task set of a multi-task learning frame, and giving a human face image sample xnIt will go through the backbone network fbk
Figure BDA0002500370990000071
Extracting common characteristics shared between tasks
Figure BDA0002500370990000072
Wherein
Figure BDA0002500370990000073
And m (specifically set to 1024) is the feature dimension of the task commonality feature.
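A minimal sketch of such a backbone is given below, assuming a ResNet-50 trunk (as used in the training description later) whose pooled 2048-dimensional feature is projected to the m = 1024 commonality feature. The projection layer is an assumption, since the patent only fixes m:

```python
# Hedged sketch of the backbone f_bk: ResNet-50 trunk plus a linear
# projection to the m = 1024 task-commonality feature c_n.
import torch
import torch.nn as nn
from torchvision import models

class Backbone(nn.Module):
    def __init__(self, feat_dim=1024):
        super().__init__()
        resnet = models.resnet50(weights=None)
        # Keep everything up to (and including) global average pooling.
        self.trunk = nn.Sequential(*list(resnet.children())[:-1])
        self.proj = nn.Linear(2048, feat_dim)  # assumed projection to m = 1024

    def forward(self, x):
        c = self.trunk(x).flatten(1)  # (B, 2048) pooled feature
        return self.proj(c)           # (B, 1024) common feature c_n

backbone = Backbone()
c_n = backbone(torch.randn(2, 3, 224, 224))  # two 224x224 RGB samples
```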
S32, constructing a dynamic sequence module DSM, modeling two tasks of the internal organ label and the organ feature label by taking the common feature as input through a recurrent neural network RNN, and outputting task related features of the internal organ and the organ feature.
Step 32 specifically includes:
step 321, modeling the internal organs and the organ characteristics by a Recurrent Neural Network (RNN) method. Modeling internal organs, assuming that the internal organs are currently in a training stage s, and extracting the common characteristics of the trunk network extracted in the step 3.1
Figure BDA0002500370990000074
As the input of each time step T corresponding node of DSM at stage s of task related layer, the node of each time step T corresponds to a subtask branch, so the task number T in the multi-task learning frame is also the time step number and
Figure BDA0002500370990000075
the common characteristics extracted by the main network are used as the input of nodes corresponding to each time step of DSM of a task related layer, the nodes of each time step correspond to a subtask branch which is two tasks of internal organs and organ characteristics, and the task number is the time step number.
DSM is a timing structure fdsm
Figure BDA0002500370990000076
And is
Figure BDA0002500370990000077
Figure BDA0002500370990000078
Thus in DSM of training phase s, input is given
Figure BDA0002500370990000079
Each time step outputs a set of task related features, respectively
Figure BDA00025003709900000710
And
Figure BDA00025003709900000711
wherein the output of the organ characteristic branches
Figure BDA00025003709900000712
Is a task commonality feature entered
Figure BDA00025003709900000713
And hidden state of last time step
Figure BDA00025003709900000714
With probability rsThe influence of the magnetic field.
Step 322, the hidden state feature and the common feature of a time step are fed into the node of the next time step in the RNN to model the organ features. In this embodiment one RNN layer is used, so the hidden state feature h_t^s of time step t can be calculated according to F_h as:

h_t^s = F_h(h_{t-1}^s, c_n^s),

wherein h_t^s ∈ R^{D_h} and D_h is the feature dimension of the hidden state feature. Specifically, in the internal organ feature training task the RNN layer comprises two time steps, and each time step computes its own hidden feature through the above formula. The two groups of hidden features computed at the two time steps correspond to the two subtasks, and can be re-expressed as:

h_1^s = F_h(h_0^s, c_n^s), h_2^s = F_h(h'_1^s, c_n^s),

wherein h_0^s is a randomly initialized initial state. In order to control the strength of the relational modeling, this embodiment introduces on the time-sequence connection a mask M ~ Bernoulli(r_s) of the same size as h_t^s, with r_s the retention probability:

h'_t^s = M ⊙ h_t^s,

wherein ⊙ denotes element-wise multiplication and M ∈ {0,1}^{D_h}; that is, each node value of the hidden state h_t^s is kept with probability r_s, and the adjusted h'_t^s is fed into the node of the next time step. Specifically, the hidden state feature of the organ features may be calculated as:

h_2^s = F_h(M ⊙ h_1^s, c_n^s).

After the respective hidden state feature h_t^s is obtained at each time step t, the output o_t^s of each time step can be calculated according to F_v as:

o_t^s = g(F_v(h_t^s)),

wherein g is an activation function. The DSM thus yields two outputs, o_loc^s and o_nat^s, which are the task-related features corresponding to the internal organs and to the organ characteristics.
S33, constructing classifiers: the outputs of step 32 are passed through fully connected layers and fed into the respective multi-label classifiers f_loc and f_nat, obtaining a set of decision vectors comprising the predictions of the internal organs and the organ characteristics, ŷ_n = {ŷ_n^loc, ŷ_n^nat}, wherein each element of ŷ_n is a 0/1 binary vector.
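A sketch of the two multi-label classifier heads follows, assuming sigmoid outputs thresholded at 0.5 to produce the 0/1 decision vectors; the threshold value and layer shapes are assumptions:

```python
# Hedged sketch of the two multi-label classifier heads f_loc and f_nat.
import torch
import torch.nn as nn

class MultiLabelHead(nn.Module):
    def __init__(self, in_dim=512, num_classes=12):
        super().__init__()
        self.fc = nn.Linear(in_dim, num_classes)  # fully connected layer

    def forward(self, o):
        logits = self.fc(o)
        probs = torch.sigmoid(logits)          # independent per-class scores
        return logits, (probs > 0.5).float()   # logits for the loss, plus the
                                               # 0/1 decision vector

head_loc = MultiLabelHead(num_classes=12)  # internal organ classes (C_loc)
head_nat = MultiLabelHead(num_classes=14)  # organ feature classes (C_nat)
```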
As seen from S3, the prediction of the human internal organ characteristics is influenced by the model's prediction of the human internal organs. Suppose the model predicts the human internal organ characteristics correctly on the basis of a wrong internal organ prediction, so that the loss on the organ characteristics is small; this is called an abnormal condition. Such a condition can produce erroneous semantic guidance. To reduce it, the situation is semantically constrained by introducing a penalty value into the loss function. The specific process is as follows:
s34, constructing a semantic regularized loss function: setting a hyper-parameter threshold z to define whether the model predicts the correct or incorrect result:
Figure BDA0002500370990000094
wherein sigma is sigmoid function. Output/tag pair for nth sample
Figure BDA0002500370990000095
FactBy setting a hyper-parameter threshold z, an activation value close to 0 or 1 is calculated for the visceral organ and organ characterization task, respectively, depending on the magnitude of the L2 loss
Figure BDA0002500370990000096
$ and
Figure BDA0002500370990000097
respectively, the model predicts the 'correct' or 'wrong' of the human internal organ and the human internal organ characteristics.
Step 35, a penalty term is introduced into the loss function, and a gating term G is set to control the activation of the penalty term, so that the network obtains a larger gradient return to correct the network parameters. The gating term G is computed from the activation values over the output/label pairs of all samples in D_tr, where [·]+ denotes max(0, ·) and ε is a constant that avoids the denominator being 0. The penalty value L_p in this case is calculated as:

L_p = G · (Q - L_nat),

wherein L_nat is the classification loss of the organ characteristics and Q is a constant (typically set to an upper bound of L_nat). The penalty term is therefore activated when the above abnormal condition occurs, and the smaller the loss of the organ features, the heavier the penalty. Meanwhile, a hyper-parameter ω is set to balance the penalty value and can be regarded as the weight coefficient of the penalty value L_p, with ω_s denoting the penalty factor in training stage s. The semantic constraint loss L_sem can then be summarized as:

L_sem = ω_s · L_p.
furthermore, for classification loss, output/tag pairs for each branch
Figure BDA0002500370990000103
Calculating binary cross entropy loss to respectively obtain two multi-label classification losses LlocAnd Lnat. Finally, the three losses are added to obtain the final loss:
Figure BDA0002500370990000104
wherein L islocAnd LnatMulti-label cross entropy classification loss of visceral organ, organ features respectively.
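As a sketch of the overall objective, the function below combines the two binary cross entropy losses with the gated, stage-weighted penalty as reconstructed above. The gating computation is a simplified stand-in for the patent's G (it fires when the organ prediction is judged wrong while the organ feature prediction is judged right), and all names are assumptions:

```python
# Hedged sketch of the semantically regularized total loss
# L_total = L_loc + L_nat + omega_s * L_p, following the description above.
import torch
import torch.nn.functional as F

def total_loss(logits_loc, y_loc, logits_nat, y_nat, a_loc, a_nat,
               omega_s=0.1, Q=1.0):
    # a_loc, a_nat: tensors in [0, 1] judging each prediction correct (1)
    # or wrong (0), as produced in step 34.
    L_loc = F.binary_cross_entropy_with_logits(logits_loc, y_loc)
    L_nat = F.binary_cross_entropy_with_logits(logits_nat, y_nat)
    # Simplified gate: active in the "abnormal condition" (organ wrong,
    # organ feature judged correct).
    G = torch.clamp((1.0 - a_loc) * a_nat, min=0.0)
    L_p = G * torch.clamp(Q - L_nat, min=0.0)  # heavier penalty as L_nat shrinks
    return L_loc + L_nat + omega_s * L_p
```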
Step 36, the loss L_total is calculated and back-propagated, the parameters of the model are adjusted, and the prior knowledge of the internal organs and the organ characteristics is finally embedded into the internal organ feature coding, which is output at the end of training.
After the network is constructed, multi-stage training is performed with it to obtain the coding model. The training process is as follows:
1) a set of retention probabilities {r_s, s = 1…S} is set, for example an arithmetic sequence of retention probabilities from 0.1 to 1.0 over 10 stages (i.e., S = 10): r_1 is initialized to 0.1 and, starting from the 20th training round, increases by 0.1 every 10 rounds;
2) a set of penalty factors {ω_s, s = 1…S} is set, for example an arithmetic sequence of penalty factors from 0.1 to 1.0 over 10 stages (i.e., S = 10): ω_1 is initialized to 0.1 and, starting from the 20th training round, increases by 0.1 every 10 rounds;
3) the retention probability and the penalty factor corresponding to the current stage are set, the multi-stage relation learning network model is constructed according to the method of step three, and a residual network (ResNet-50) is used as the backbone network;
4) the multi-stage relation learning network constructed in 3) is trained with the face image internal organ and organ feature data set constructed in step two:
the network is trained with a stochastic gradient descent optimization algorithm with momentum 0.9;
the initial learning rate is set to 0.1, is divided by 10 at iteration rounds 120 and 160, and training runs for 170 rounds in total;
the weight decay is set to 0.00001 and the batch size is set to 32;
the hyper-parameter z in the semantic regularization loss function is set to 0.1 and Q is set to 1.0.
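These settings map onto a standard PyTorch training configuration roughly as follows; `model`, `train_loader` and `compute_total_loss` are hypothetical stand-ins for the modules and loss sketched above:

```python
# Hedged sketch of the training configuration described above.
import torch

# `model` and `train_loader` are assumed to be built from the earlier
# sketches (backbone + DSM + heads; batch size 32).
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=1e-5)
# Learning rate divided by 10 at rounds 120 and 160; 170 rounds in total.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer,
                                                 milestones=[120, 160],
                                                 gamma=0.1)

for epoch in range(170):
    for images, y_loc, y_nat in train_loader:  # hypothetical batch format
        optimizer.zero_grad()
        loss = compute_total_loss(model, images, y_loc, y_nat)  # hypothetical
        loss.backward()
        optimizer.step()
    scheduler.step()
```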
After the model training is finished, the test set is fed into the model to test the recognition results: the preprocessed face images of the test set are passed into the trained model, the two groups of features o_loc and o_nat computed in step 32 are output, and the two groups of features are concatenated along the channel dimension to obtain one group of features as the output, namely the visceral feature coding embedded with the prior knowledge.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (6)

1. An internal organ feature coding method based on face image multi-stage relation learning is characterized by comprising the following steps:
step 1, data acquisition: collecting face images and obtaining labels marked for each face image, wherein the labels comprise internal organ labels and organ feature labels corresponding to each internal organ;
step 2, data processing: after the face image is subjected to data augmentation, the R, G, B channels are respectively normalized and standardized to obtain a training set D_tr = {(x_n, y_n), n = 1…N}, x_n ∈ X, y_n ∈ Y, wherein X is the processed face image sample set and Y is the label set;
step 3, constructing a multi-stage relation learning network model: supervised learning of two subtask branches is performed simultaneously on the face image sample set using the internal organ labels and the organ feature labels to embed prior guiding knowledge of the visceral features, finally obtaining the visceral feature coding embedded with the prior knowledge.
2. The internal organ feature coding method based on face image multi-stage relation learning according to claim 1, wherein in step 1 the internal organ labels comprise large intestine, gallbladder, lung, liver, bladder, spleen, kidney, stomach, small intestine, heart, and unknown, and the organ feature labels comprise qi deficiency, blood deficiency, yin deficiency, yang deficiency, qi stagnation, blood stasis, phlegm, wind, heat, cold, dryness, dampness, and unknown.
3. The method as claimed in claim 1, wherein the step 2 of data augmentation comprises scaling the face image to 256 x 256 size, randomly cutting out 224 x 224 size image from the scaled image, and randomly horizontally flipping the cut image for data augmentation.
4. The internal organ feature coding method based on face image multi-stage relation learning according to claim 1, wherein the step 3 comprises:
step 31, constructing a backbone network: the internal organ labels and the organ feature labels are used as the subtask set of a multi-task learning framework, and the face image sample x_n is passed through the backbone network to extract the common features shared among the tasks;
step 32, constructing a dynamic sequence module DSM, modeling two tasks of the internal organ label and the organ feature label by taking the common feature as input through a Recurrent Neural Network (RNN), and outputting task related features of the internal organ and the organ feature;
step 33, constructing classifiers: the outputs of step 32 are passed through fully connected layers and fed into the respective multi-label classifiers f_loc and f_nat, and a set of decision vectors is obtained that includes the predictions of the internal organs and the organ characteristics.
5. The method for encoding internal organ features based on multi-stage relationship learning of face images as claimed in claim 4, wherein the step 32 specifically comprises:
step 321, modeling the internal organs: the common features extracted by the backbone network are used as the input of the node at each time step of the DSM in the task-related layer, the node at each time step corresponding to one task branch, namely the internal organ task and the organ feature task respectively, so that the number of tasks equals the number of time steps;
step 322, the hidden state feature and the common feature of a time step are fed into the node of the next time step in the RNN to model the human internal organ features, the hidden state feature h_t^s of time step t being expressed as:

h_t^s = F_h(M ⊙ h_{t-1}^s, c_n^s),

wherein M ~ Bernoulli(r_s) is a mask of the same size as h_{t-1}^s, M ∈ {0,1}^{D_h}, D_h is the feature dimension of the hidden state feature, and h_0^s is a randomly initialized initial state; the output o_loc of the organ branch and the output o_nat of the organ feature branch are obtained at the respective time steps, wherein o_nat is influenced by the input common feature and, with probability r_s, by the hidden state of the previous time step.
6. The internal organ feature coding method based on face image multi-stage relation learning according to claim 4, wherein the step 3 further comprises a process of constructing a semantic regularization loss function:
step 34, setting a hyper-parameter threshold z and, according to the magnitude of the L2 loss, calculating activation values a_loc and a_nat for the internal organ task and the organ feature task, which respectively represent the model's recognition results for the internal organs and the organ characteristics;
step 35, introducing a penalty term into the loss function, setting a gating term G to control the activation of the penalty term, and calculating the semantic constraint loss L_sem; for the output/label pair of each branch, the binary cross entropy loss is calculated, yielding the two multi-label classification losses L_loc and L_nat; the final loss is obtained by adding the three losses:

L_total = L_loc + L_nat + L_sem,

wherein L_loc and L_nat are the multi-label cross entropy classification losses of the human internal organs and of the human internal organ characteristics, respectively;
step 36, calculating the loss L_total and performing back propagation, adjusting the parameters of the model, and finally embedding the prior knowledge of the internal organs and the organ characteristics into the internal organ feature coding to obtain the final internal organ feature coding model.
CN202010430517.XA 2020-05-20 2020-05-20 Internal organ feature coding method based on face image multi-stage relation learning Active CN111612133B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010430517.XA CN111612133B (en) 2020-05-20 2020-05-20 Internal organ feature coding method based on face image multi-stage relation learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010430517.XA CN111612133B (en) 2020-05-20 2020-05-20 Internal organ feature coding method based on face image multi-stage relation learning

Publications (2)

Publication Number Publication Date
CN111612133A true CN111612133A (en) 2020-09-01
CN111612133B CN111612133B (en) 2021-10-19

Family

ID=72198781

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010430517.XA Active CN111612133B (en) 2020-05-20 2020-05-20 Internal organ feature coding method based on face image multi-stage relation learning

Country Status (1)

Country Link
CN (1) CN111612133B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112818146A (en) * 2021-01-26 2021-05-18 山西三友和智慧信息技术股份有限公司 Recommendation method based on product image style
CN112837275A (en) * 2021-01-14 2021-05-25 长春大学 Capsule endoscope image organ classification method, device, equipment and storage medium
WO2022105118A1 (en) * 2020-11-17 2022-05-27 平安科技(深圳)有限公司 Image-based health status identification method and apparatus, device and storage medium

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103093087A (en) * 2013-01-05 2013-05-08 电子科技大学 Multimodal brain network feature fusion method based on multi-task learning
CN105760859A (en) * 2016-03-22 2016-07-13 中国科学院自动化研究所 Method and device for identifying reticulate pattern face image based on multi-task convolutional neural network
US20160371539A1 (en) * 2014-04-03 2016-12-22 Tencent Technology (Shenzhen) Company Limited Method and system for extracting characteristic of three-dimensional face image
CN106845421A (en) * 2017-01-22 2017-06-13 北京飞搜科技有限公司 Face characteristic recognition methods and system based on multi-region feature and metric learning
CN107038429A (en) * 2017-05-03 2017-08-11 四川云图睿视科技有限公司 A kind of multitask cascade face alignment method based on deep learning
CN107977671A (en) * 2017-10-27 2018-05-01 浙江工业大学 A kind of tongue picture sorting technique based on multitask convolutional neural networks
CN108596338A (en) * 2018-05-09 2018-09-28 四川斐讯信息技术有限公司 A kind of acquisition methods and its system of neural metwork training collection
CN110263756A (en) * 2019-06-28 2019-09-20 东北大学 A kind of human face super-resolution reconstructing system based on joint multi-task learning
CN110348416A (en) * 2019-07-17 2019-10-18 北方工业大学 Multi-task face recognition method based on multi-scale feature fusion convolutional neural network
CN110443189A (en) * 2019-07-31 2019-11-12 厦门大学 Face character recognition methods based on multitask multi-tag study convolutional neural networks
CN110945125A (en) * 2017-06-06 2020-03-31 齐默尔根公司 HTP genetic engineering modification platform for improving escherichia coli

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103093087A (en) * 2013-01-05 2013-05-08 电子科技大学 Multimodal brain network feature fusion method based on multi-task learning
US20160371539A1 (en) * 2014-04-03 2016-12-22 Tencent Technology (Shenzhen) Company Limited Method and system for extracting characteristic of three-dimensional face image
CN105760859A (en) * 2016-03-22 2016-07-13 中国科学院自动化研究所 Method and device for identifying reticulate pattern face image based on multi-task convolutional neural network
CN106845421A (en) * 2017-01-22 2017-06-13 北京飞搜科技有限公司 Face characteristic recognition methods and system based on multi-region feature and metric learning
CN107038429A (en) * 2017-05-03 2017-08-11 四川云图睿视科技有限公司 A kind of multitask cascade face alignment method based on deep learning
CN110945125A (en) * 2017-06-06 2020-03-31 齐默尔根公司 HTP genetic engineering modification platform for improving escherichia coli
CN107977671A (en) * 2017-10-27 2018-05-01 浙江工业大学 A kind of tongue picture sorting technique based on multitask convolutional neural networks
CN108596338A (en) * 2018-05-09 2018-09-28 四川斐讯信息技术有限公司 A kind of acquisition methods and its system of neural metwork training collection
CN110263756A (en) * 2019-06-28 2019-09-20 东北大学 A kind of human face super-resolution reconstructing system based on joint multi-task learning
CN110348416A (en) * 2019-07-17 2019-10-18 北方工业大学 Multi-task face recognition method based on multi-scale feature fusion convolutional neural network
CN110443189A (en) * 2019-07-31 2019-11-12 厦门大学 Face character recognition methods based on multitask multi-tag study convolutional neural networks

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LIU MINGJIA: "Research on Disease Diagnosis Based on Color Face Images" (基于人脸彩色图像的疾病诊断研究), China Master's Theses Full-text Database, Information Science and Technology Series *
JIANG JUNZHAO et al.: "Multi-label Classification Algorithm of Convolutional Neural Networks Based on Label Correlation" (基于标签相关性的卷积神经网络多标签分类算法), Industrial Control Computer (工业控制计算机) *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022105118A1 (en) * 2020-11-17 2022-05-27 平安科技(深圳)有限公司 Image-based health status identification method and apparatus, device and storage medium
CN112837275A (en) * 2021-01-14 2021-05-25 长春大学 Capsule endoscope image organ classification method, device, equipment and storage medium
CN112837275B (en) * 2021-01-14 2023-10-24 长春大学 Capsule endoscope image organ classification method, device, equipment and storage medium
CN112818146A (en) * 2021-01-26 2021-05-18 山西三友和智慧信息技术股份有限公司 Recommendation method based on product image style

Also Published As

Publication number Publication date
CN111612133B (en) 2021-10-19

Similar Documents

Publication Publication Date Title
CN111612133B (en) Internal organ feature coding method based on face image multi-stage relation learning
CN110147457B (en) Image-text matching method, device, storage medium and equipment
CN111930992B (en) Neural network training method and device and electronic equipment
Gkioxari et al. Chained predictions using convolutional neural networks
CN113312500B (en) Method for constructing event map for safe operation of dam
CN111444960A (en) Skin disease image classification system based on multi-mode data input
CN111414461B (en) Intelligent question-answering method and system fusing knowledge base and user modeling
CN111191660A (en) Rectal cancer pathology image classification method based on multi-channel collaborative capsule network
CN111144448A (en) Video barrage emotion analysis method based on multi-scale attention convolutional coding network
CN110993094A (en) Intelligent auxiliary diagnosis method and terminal based on medical images
Massiceti et al. Flipdial: A generative model for two-way visual dialogue
CN109255289B (en) Cross-aging face recognition method based on unified generation model
CN109743642B (en) Video abstract generation method based on hierarchical recurrent neural network
CN111400494B (en) Emotion analysis method based on GCN-Attention
CN112784929B (en) Small sample image classification method and device based on double-element group expansion
CN112465120A (en) Fast attention neural network architecture searching method based on evolution method
CN110263174B (en) Topic category analysis method based on focus attention
CN110163131B (en) Human body action classification method based on hybrid convolutional neural network and ecological niche wolf optimization
CN113486987A (en) Multi-source domain adaptation method based on feature decoupling
Zhao et al. Distilling ordinal relation and dark knowledge for facial age estimation
CN113012811B (en) Traditional Chinese medicine syndrome diagnosis and health evaluation method combining deep convolutional network and graph neural network
Lin et al. R 2-resnext: A resnext-based regression model with relative ranking for facial beauty prediction
CN112668486A (en) Method, device and carrier for identifying facial expressions of pre-activated residual depth separable convolutional network
CN113220891A (en) Unsupervised concept-to-sentence based generation confrontation network image description algorithm
CN112786160A (en) Multi-image input multi-label gastroscope image classification method based on graph neural network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant