CN111612133A - Internal organ feature coding method based on face image multi-stage relation learning

Internal organ feature coding method based on face image multi-stage relation learning

Info

Publication number
CN111612133A
Authority
CN
China
Prior art keywords: organ, feature, face image, internal organ, label
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010430517.XA
Other languages
Chinese (zh)
Other versions
CN111612133B (en)
Inventor
文鹏程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Huajian Intelligent Technology Co ltd
Original Assignee
Guangzhou Huajian Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Huajian Intelligent Technology Co ltd filed Critical Guangzhou Huajian Intelligent Technology Co ltd
Priority to CN202010430517.XA priority Critical patent/CN111612133B/en
Publication of CN111612133A publication Critical patent/CN111612133A/en
Application granted granted Critical
Publication of CN111612133B publication Critical patent/CN111612133B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2431 Multiple classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses an internal organ feature coding method based on multi-stage relation learning from face images. The method comprises: collecting face images and obtaining the labels annotated for each face image, the labels comprising internal organ labels and the organ feature labels associated with each internal organ; after data augmentation of the face images, normalizing and standardizing the R, G, B channels respectively to obtain a training set; and performing supervised learning of two subtask branches simultaneously on the face image training set using the internal organ labels and the organ feature labels, so as to embed prior guiding knowledge of the visceral features, finally obtaining a visceral feature coding model embedded with that prior knowledge. The invention fully considers the correlation between the face image and the internal organ and organ feature labels, models and analyzes it with a multi-stage relation learning model, and the resulting coding of human visceral features provides intuitive, objective basic support for personal health care and health preservation.

Description

Internal organ feature coding method based on face image multi-stage relation learning
Technical Field
The invention relates to the technical field of machine learning, in particular to an internal organ feature coding method based on face image multi-stage relation learning.
Background
The ancient Chinese medical classic Huangdi Neijing records that "the twelve meridians and the three hundred and sixty-five collaterals all carry their blood and qi upward to the face and into the sense orifices", indicating that the condition of the human zang-fu organs (the five zang organs and six fu organs) shows in corresponding areas of the face. By observing a person's facial complexion, the condition of the zang-fu organs can therefore be assessed, and the organs can then be conditioned by improving diet, exercise and living habits, achieving the purpose of health preservation. For example, people with internal dampness typically show patches, acne, an oily sheen on the face, a red nose tip, and so on, so the body's dampness can be read from facial features. Internal dampness is associated with particular zang-fu organs: it may reside in the lung or in the spleen, and only when the organ harboring the dampness is known can a reasonable diet conditioning scheme, exercise conditioning scheme and living habit improvement scheme be formulated. Doing so, however, requires rich health-care expertise that ordinary people do not possess, so they cannot formulate a suitable conditioning scheme on their own.
With the development of science and technology, big data and deep learning have advanced rapidly in recent years. Machine learning techniques use multi-layer neural networks and, through training on massive data, enable a computer to learn to understand complex data such as images and sounds and to act accordingly. Such networks can extract highly complex discriminative features that humans find difficult to interpret or design by hand.
However, current deep supervised learning methods require a large amount of labeled data; otherwise model training easily falls into overfitting, and the generalization ability of the model is insufficient to effectively express the distinguishing characteristics of the data. Collecting labeled data is very expensive, and it is almost impossible to collect sufficient training data for every task. By learning several related tasks within a multi-task framework and embedding prior knowledge through the label data of the related tasks, knowledge from the related tasks can be transferred to the main task, ultimately assisting the modeling of the main task. This reduces the dependence on label data, improves the utilization of the existing label data, and improves the generalization ability of the main task model.
Therefore, how to provide a method that effectively encodes human internal organ features from a face image on the basis of limited label data is a problem to be solved by those skilled in the art.
Disclosure of Invention
In view of the above, the invention provides an internal organ feature coding method based on face image multi-stage relationship learning, which fully considers the correlation between the face image and the internal organ and organ feature labels, performs modeling and analysis with a multi-stage relationship learning model, and provides basic support for the predictive coding of the internal organ features of the human body.
In order to achieve the above purpose, the invention provides the following technical scheme:
an internal organ feature coding method based on face image multi-stage relation learning comprises the following steps:
step 1, data acquisition: collecting face images and obtaining labels marked for each face image, wherein the labels comprise internal organ labels and organ feature labels corresponding to each internal organ;
step 2, data processing: the face image is subjected to data augmentation, and then the R, G, B channels are respectively normalized and standardized to obtain a training set D_tr = {(x_n, y_n), n = 1…N}, x_n ∈ X, y_n ∈ Y, wherein X is the processed face image sample set and Y is the label set;
step 3, constructing a multi-stage relation learning network model: supervised learning of two subtask branches is performed simultaneously on the face image sample set using the internal organ labels and the organ feature labels to embed prior guiding knowledge of the visceral features, finally obtaining a visceral feature coding model embedded with the prior knowledge.
Preferably, in step 1, the internal organ labels include large intestine, gallbladder, lung, liver, bladder, spleen, kidney, stomach, small intestine, heart, and unknown; organ characteristic labels include qi deficiency, blood deficiency, yin deficiency, yang deficiency, qi stagnation, blood stasis, phlegm, wind, heat, cold, dryness, dampness, and unknown.
Preferably, the process of data augmentation in step 2 includes scaling the face image to 256 × 256 size, randomly cutting out an image with 224 × 224 size from the scaled image, and randomly horizontally flipping the cut image to perform data augmentation.
Preferably, the step 3 comprises:
step 31, constructing a backbone network: the internal organ labels and the organ feature labels are used as the subtask set of a multi-task learning framework, and the face image sample x_n is passed through the backbone network to extract the common features shared among the tasks;
step 32, constructing a dynamic sequence module DSM, modeling two tasks of the internal organ label and the organ feature label by taking the common feature as input through a Recurrent Neural Network (RNN), and outputting task related features of the internal organ and the organ feature;
step 33, constructing classifiers: the outputs of step 32 are passed through fully connected layers and fed into the respective multi-label classifiers f_loc and f_nat, and a set of decision vectors is obtained that includes the predictions of the internal organs and the organ characteristics.
Preferably, the step 32 specifically includes:
step 321, modeling the internal organs: the common features extracted by the backbone network are used as the input of the node at each time step of the DSM in the task-related layer, the node at each time step corresponding to one subtask branch, namely the internal organ task and the organ feature task respectively, so that the number of tasks equals the number of time steps;
step 322, the hidden state feature and the common feature of a time step are fed into the node of the next time step in the RNN to model the human internal organ features, the hidden state feature h_t^s of time step t being expressed as:

h_t^s = F_h(M ⊙ h_{t-1}^s, c_n^s),

wherein M ~ Bernoulli(r_s) is a mask of the same size as h_{t-1}^s, M ∈ {0,1}^{D_h}, D_h is the feature dimension of the hidden state feature, and h_0^s is a randomly initialized initial state; the output o_loc of the organ branch and the output o_nat of the organ feature branch are obtained at the respective time steps, wherein o_nat is influenced by the input common feature and, with probability r_s, by the hidden state of the previous time step.
Preferably, the step 3 further includes a process of constructing a semantic regularization loss function:
step 34, setting a hyper-parameter threshold z and, according to the magnitude of the L2 loss, calculating activation values a_loc and a_nat for the internal organ task and the organ feature task, which respectively represent the model's recognition results for the internal organs and the organ characteristics;
step 35, introducing a penalty term into the loss function, setting a gating term G to control the activation of the penalty term, and calculating the semantic constraint loss L_sem; for the output/label pair of each branch, the binary cross entropy loss is calculated, yielding the two multi-label classification losses L_loc and L_nat; the final loss is obtained by adding the three losses:

L_total = L_loc + L_nat + L_sem,

wherein L_loc and L_nat are the multi-label cross entropy classification losses of the human internal organs and of the human internal organ characteristics, respectively;
step 36, calculating the loss L_total and performing back propagation, adjusting the parameters of the model, and finally embedding the prior knowledge of the internal organs and the organ characteristics into the internal organ feature coding to obtain the trained internal organ feature coding model.
Compared with the prior art, the internal organ feature coding method based on face image multi-stage relation learning of the invention has the following advantages:
The method encodes the internal organ and organ feature labels, assists the recognition of the internal organ characteristics, embeds prior guiding knowledge into the visceral features, and improves the feature coding of the visceral features corresponding to the face image. A multi-task learning model is constructed, supervised learning of the two subtask branches is performed simultaneously using the two groups of labels (internal organs and organ characteristics) to embed the prior knowledge, and a visceral feature coding embedded with the prior knowledge is finally obtained. Experiments show that improved visceral feature codes are obtained by inputting the face images of test samples into the disclosed model, and the obtained codes can serve as basic technical support for diet conditioning schemes, exercise conditioning schemes and living habit improvement schemes, improving the pertinence of such schemes.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a flow chart of an internal organ feature coding method based on face image multi-stage relationship learning according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention discloses an internal organ feature coding method based on face image multi-stage relation learning, which comprises the following steps:
s1, data acquisition: the method comprises the steps of collecting face images of a user, and labeling two groups of labels related to internal organs and corresponding features of the human body to each image, wherein each group of labels is provided with one or more category labels. An internal organ label labeled for each face image and an organ feature label associated with each internal organ are acquired. Wherein, the label types of the human internal organs are 11: large intestine, gallbladder, lung, liver, bladder, spleen, kidney, stomach, small intestine, heart, and unknown; there are 13 categories of human internal organ feature labels: qi deficiency, blood deficiency, yin deficiency, yang deficiency, qi stagnation, blood stasis, phlegm, wind, heat, cold, dryness, dampness, and unknown. Finally, a face image from 7153 users was acquired, constituting a face image data set consisting of 10337 pictures. The acquired face images are obtained by shooting with a common digital camera or a smart phone. And the label of each picture is manually labeled by the health care experts according to the actual conditions of the users, and two groups of label information need to be acquired in the step.
S2, data processing: the face data collected in S1 is divided into a training set and a test set at a ratio of 4:1. For the training set, each face image is scaled to 256 × 256, an image of size 224 × 224 is randomly cropped from the scaled image, the cropped image is randomly horizontally flipped for data augmentation, and the R, G, B channels of each image are respectively normalized and standardized, yielding the training set D_tr = {(x_n, y_n), n = 1…N}, x_n ∈ X, y_n ∈ Y, where X is the processed face image sample set and Y is the label set. For the test set, all face images are scaled to 256 × 256, a 224 × 224 picture is cropped from the center, and the R, G, B channels are likewise normalized and standardized; the test set is used to evaluate the coding model.
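For reference, the preprocessing pipeline described above can be sketched with torchvision as follows. This is a minimal illustration: the per-channel normalization statistics are placeholders, since the patent does not specify the constants used.

```python
# Minimal preprocessing sketch of S2 using torchvision.
# Assumption: ImageNet channel statistics stand in for the unspecified
# per-channel normalization constants.
from torchvision import transforms

CHANNEL_MEAN = [0.485, 0.456, 0.406]  # placeholder values
CHANNEL_STD = [0.229, 0.224, 0.225]   # placeholder values

# Training set: scale to 256x256, random 224x224 crop, random horizontal
# flip, then per-channel normalization of R, G, B.
train_transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.RandomCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),                       # scales pixels to [0, 1]
    transforms.Normalize(CHANNEL_MEAN, CHANNEL_STD),
])

# Test set: scale to 256x256, center 224x224 crop, same normalization.
test_transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(CHANNEL_MEAN, CHANNEL_STD),
])
```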
The training set has N labeled samples. The internal organ labels have C_loc (specifically 12) classes, with the internal organ label set Y_loc = {y_n^loc, n = 1…N}; the internal organ feature labels have C_nat (specifically 14) classes, with the organ feature label set Y_nat = {y_n^nat, n = 1…N}. The number of tasks of the multi-task learning framework is T; for the organ feature coding here, T = 2, namely the internal organ task and the organ feature task. That is to say, one sample x_n corresponds to two groups of labels from the two tasks, y_n = {y_n^loc, y_n^nat}, wherein y_n^loc ∈ {0,1}^{C_loc} and y_n^nat ∈ {0,1}^{C_nat}; that is, the label set Y comprises the labels of the two tasks, Y = {Y_loc, Y_nat}.
S3, constructing a multi-stage relation learning network model: supervised learning of the two subtask branches is performed simultaneously on the face image sample set using the internal organ labels and the organ feature labels to embed prior guiding knowledge of the visceral features, finally obtaining the visceral feature coding embedded with the prior knowledge. Referring to figure 1, the specific construction steps are as follows:
s31, constructing a backbone network, using the internal organ labels and the organ feature labels as a task set of a multi-task learning frame, and giving a human face image sample xnIt will go through the backbone network fbk
Figure BDA0002500370990000071
Extracting common characteristics shared between tasks
Figure BDA0002500370990000072
Wherein
Figure BDA0002500370990000073
And m (specifically set to 1024) is the feature dimension of the task commonality feature.
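A minimal sketch of such a backbone is given below, assuming a ResNet-50 trunk (as used in the training description later) whose pooled 2048-dimensional feature is projected to the m = 1024 commonality feature. The projection layer is an assumption, since the patent only fixes m:

```python
# Hedged sketch of the backbone f_bk: ResNet-50 trunk plus a linear
# projection to the m = 1024 task-commonality feature c_n.
import torch
import torch.nn as nn
from torchvision import models

class Backbone(nn.Module):
    def __init__(self, feat_dim=1024):
        super().__init__()
        resnet = models.resnet50(weights=None)
        # Keep everything up to (and including) global average pooling.
        self.trunk = nn.Sequential(*list(resnet.children())[:-1])
        self.proj = nn.Linear(2048, feat_dim)  # assumed projection to m = 1024

    def forward(self, x):
        c = self.trunk(x).flatten(1)  # (B, 2048) pooled feature
        return self.proj(c)           # (B, 1024) common feature c_n

backbone = Backbone()
c_n = backbone(torch.randn(2, 3, 224, 224))  # two 224x224 RGB samples
```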
S32, constructing a dynamic sequence module DSM, modeling two tasks of the internal organ label and the organ feature label by taking the common feature as input through a recurrent neural network RNN, and outputting task related features of the internal organ and the organ feature.
Step 32 specifically includes:
step 321, modeling the internal organs and the organ characteristics by a Recurrent Neural Network (RNN) method. Modeling internal organs, assuming that the internal organs are currently in a training stage s, and extracting the common characteristics of the trunk network extracted in the step 3.1
Figure BDA0002500370990000074
As the input of each time step T corresponding node of DSM at stage s of task related layer, the node of each time step T corresponds to a subtask branch, so the task number T in the multi-task learning frame is also the time step number and
Figure BDA0002500370990000075
the common characteristics extracted by the main network are used as the input of nodes corresponding to each time step of DSM of a task related layer, the nodes of each time step correspond to a subtask branch which is two tasks of internal organs and organ characteristics, and the task number is the time step number.
DSM is a timing structure fdsm
Figure BDA0002500370990000076
And is
Figure BDA0002500370990000077
Figure BDA0002500370990000078
Thus in DSM of training phase s, input is given
Figure BDA0002500370990000079
Each time step outputs a set of task related features, respectively
Figure BDA00025003709900000710
And
Figure BDA00025003709900000711
wherein the output of the organ characteristic branches
Figure BDA00025003709900000712
Is a task commonality feature entered
Figure BDA00025003709900000713
And hidden state of last time step
Figure BDA00025003709900000714
With probability rsThe influence of the magnetic field.
Step 322, the hidden state feature and the common feature of a time step are fed into the node of the next time step in the RNN to model the organ features. In this embodiment one RNN layer is used, so the hidden state feature h_t^s of time step t can be calculated according to F_h as:

h_t^s = F_h(h_{t-1}^s, c_n^s),

wherein h_t^s ∈ R^{D_h} and D_h is the feature dimension of the hidden state feature. Specifically, in the internal organ feature training task the RNN layer comprises two time steps, and each time step computes its own hidden feature through the above formula. The two groups of hidden features computed at the two time steps correspond to the two subtasks, and can be re-expressed as:

h_1^s = F_h(h_0^s, c_n^s), h_2^s = F_h(h'_1^s, c_n^s),

wherein h_0^s is a randomly initialized initial state. In order to control the strength of the relational modeling, this embodiment introduces on the time-sequence connection a mask M ~ Bernoulli(r_s) of the same size as h_t^s, with r_s the retention probability:

h'_t^s = M ⊙ h_t^s,

wherein ⊙ denotes element-wise multiplication and M ∈ {0,1}^{D_h}; that is, each node value of the hidden state h_t^s is kept with probability r_s, and the adjusted h'_t^s is fed into the node of the next time step. Specifically, the hidden state feature of the organ features may be calculated as:

h_2^s = F_h(M ⊙ h_1^s, c_n^s).

After the respective hidden state feature h_t^s is obtained at each time step t, the output o_t^s of each time step can be calculated according to F_v as:

o_t^s = g(F_v(h_t^s)),

wherein g is an activation function. The DSM thus yields two outputs, o_loc^s and o_nat^s, which are the task-related features corresponding to the internal organs and to the organ characteristics.
S33, constructing classifiers: the outputs of step 32 are passed through fully connected layers and fed into the respective multi-label classifiers f_loc and f_nat, obtaining a set of decision vectors comprising the predictions of the internal organs and the organ characteristics, ŷ_n = {ŷ_n^loc, ŷ_n^nat}, wherein each element of ŷ_n is a 0/1 binary vector.
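A sketch of the two multi-label classifier heads follows, assuming sigmoid outputs thresholded at 0.5 to produce the 0/1 decision vectors; the threshold value and layer shapes are assumptions:

```python
# Hedged sketch of the two multi-label classifier heads f_loc and f_nat.
import torch
import torch.nn as nn

class MultiLabelHead(nn.Module):
    def __init__(self, in_dim=512, num_classes=12):
        super().__init__()
        self.fc = nn.Linear(in_dim, num_classes)  # fully connected layer

    def forward(self, o):
        logits = self.fc(o)
        probs = torch.sigmoid(logits)          # independent per-class scores
        return logits, (probs > 0.5).float()   # logits for the loss, plus the
                                               # 0/1 decision vector

head_loc = MultiLabelHead(num_classes=12)  # internal organ classes (C_loc)
head_nat = MultiLabelHead(num_classes=14)  # organ feature classes (C_nat)
```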
As seen from S3, the prediction of the human internal organ characteristics is influenced by the model's prediction of the human internal organs. Suppose the model predicts the human internal organ characteristics correctly on the basis of a wrong internal organ prediction, so that the loss on the organ characteristics is small; this is called an abnormal condition. Such a condition can produce erroneous semantic guidance. To reduce it, the situation is semantically constrained by introducing a penalty value into the loss function. The specific process is as follows:
s34, constructing a semantic regularized loss function: setting a hyper-parameter threshold z to define whether the model predicts the correct or incorrect result:
Figure BDA0002500370990000094
wherein sigma is sigmoid function. Output/tag pair for nth sample
Figure BDA0002500370990000095
FactBy setting a hyper-parameter threshold z, an activation value close to 0 or 1 is calculated for the visceral organ and organ characterization task, respectively, depending on the magnitude of the L2 loss
Figure BDA0002500370990000096
$ and
Figure BDA0002500370990000097
respectively, the model predicts the 'correct' or 'wrong' of the human internal organ and the human internal organ characteristics.
Step 35, a penalty term is introduced into the loss function, and a gating term G is set to control the activation of the penalty term, so that the network obtains a larger gradient return to correct the network parameters. The gating term G is computed from the activation values over the output/label pairs of all samples in D_tr, where [·]+ denotes max(0, ·) and ε is a constant that avoids the denominator being 0. The penalty value L_p in this case is calculated as:

L_p = G · (Q - L_nat),

wherein L_nat is the classification loss of the organ characteristics and Q is a constant (typically set to an upper bound of L_nat). The penalty term is therefore activated when the above abnormal condition occurs, and the smaller the loss of the organ features, the heavier the penalty. Meanwhile, a hyper-parameter ω is set to balance the penalty value and can be regarded as the weight coefficient of the penalty value L_p, with ω_s denoting the penalty factor in training stage s. The semantic constraint loss L_sem can then be summarized as:

L_sem = ω_s · L_p.
furthermore, for classification loss, output/tag pairs for each branch
Figure BDA0002500370990000103
Calculating binary cross entropy loss to respectively obtain two multi-label classification losses LlocAnd Lnat. Finally, the three losses are added to obtain the final loss:
Figure BDA0002500370990000104
wherein L islocAnd LnatMulti-label cross entropy classification loss of visceral organ, organ features respectively.
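As a sketch of the overall objective, the function below combines the two binary cross entropy losses with the gated, stage-weighted penalty as reconstructed above. The gating computation is a simplified stand-in for the patent's G (it fires when the organ prediction is judged wrong while the organ feature prediction is judged right), and all names are assumptions:

```python
# Hedged sketch of the semantically regularized total loss
# L_total = L_loc + L_nat + omega_s * L_p, following the description above.
import torch
import torch.nn.functional as F

def total_loss(logits_loc, y_loc, logits_nat, y_nat, a_loc, a_nat,
               omega_s=0.1, Q=1.0):
    # a_loc, a_nat: tensors in [0, 1] judging each prediction correct (1)
    # or wrong (0), as produced in step 34.
    L_loc = F.binary_cross_entropy_with_logits(logits_loc, y_loc)
    L_nat = F.binary_cross_entropy_with_logits(logits_nat, y_nat)
    # Simplified gate: active in the "abnormal condition" (organ wrong,
    # organ feature judged correct).
    G = torch.clamp((1.0 - a_loc) * a_nat, min=0.0)
    L_p = G * torch.clamp(Q - L_nat, min=0.0)  # heavier penalty as L_nat shrinks
    return L_loc + L_nat + omega_s * L_p
```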
Step 36, the loss L_total is calculated and back-propagated, the parameters of the model are adjusted, and the prior knowledge of the internal organs and the organ characteristics is finally embedded into the internal organ feature coding, which is output at the end of training.
After the network is constructed, multi-stage training is performed with it to obtain the coding model. The training process is as follows:
1) a set of retention probabilities {r_s, s = 1…S} is set, for example an arithmetic sequence of retention probabilities from 0.1 to 1.0 over 10 stages (i.e., S = 10): r_1 is initialized to 0.1 and, starting from the 20th training round, increases by 0.1 every 10 rounds;
2) a set of penalty factors {ω_s, s = 1…S} is set, for example an arithmetic sequence of penalty factors from 0.1 to 1.0 over 10 stages (i.e., S = 10): ω_1 is initialized to 0.1 and, starting from the 20th training round, increases by 0.1 every 10 rounds;
3) the retention probability and the penalty factor corresponding to the current stage are set, the multi-stage relation learning network model is constructed according to the method of step three, and a residual network (ResNet-50) is used as the backbone network;
4) the multi-stage relation learning network constructed in 3) is trained with the face image internal organ and organ feature data set constructed in step two:
the network is trained with a stochastic gradient descent optimization algorithm with momentum 0.9;
the initial learning rate is set to 0.1, is divided by 10 at iteration rounds 120 and 160, and training runs for 170 rounds in total;
the weight decay is set to 0.00001 and the batch size is set to 32;
the hyper-parameter z in the semantic regularization loss function is set to 0.1 and Q is set to 1.0.
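These settings map onto a standard PyTorch training configuration roughly as follows; `model`, `train_loader` and `compute_total_loss` are hypothetical stand-ins for the modules and loss sketched above:

```python
# Hedged sketch of the training configuration described above.
import torch

# `model` and `train_loader` are assumed to be built from the earlier
# sketches (backbone + DSM + heads; batch size 32).
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=1e-5)
# Learning rate divided by 10 at rounds 120 and 160; 170 rounds in total.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer,
                                                 milestones=[120, 160],
                                                 gamma=0.1)

for epoch in range(170):
    for images, y_loc, y_nat in train_loader:  # hypothetical batch format
        optimizer.zero_grad()
        loss = compute_total_loss(model, images, y_loc, y_nat)  # hypothetical
        loss.backward()
        optimizer.step()
    scheduler.step()
```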
After the model training is finished, the test set is fed into the model to test the recognition results: the preprocessed face images of the test set are passed into the trained model, the two groups of features o_loc and o_nat computed in step 32 are output, and the two groups of features are concatenated along the channel dimension to obtain one group of features as the output, namely the visceral feature coding embedded with the prior knowledge.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (6)

1. An internal organ feature coding method based on face image multi-stage relation learning is characterized by comprising the following steps:
step 1, data acquisition: collecting face images and obtaining labels marked for each face image, wherein the labels comprise internal organ labels and organ feature labels corresponding to each internal organ;
step 2, data processing: after the face image is subjected to data augmentation, the R, G, B channels are respectively normalized and standardized to obtain a training set D_tr = {(x_n, y_n), n = 1…N}, x_n ∈ X, y_n ∈ Y, wherein X is the processed face image sample set and Y is the label set;
step 3, constructing a multi-stage relation learning network model: supervised learning of two subtask branches is performed simultaneously on the face image sample set using the internal organ labels and the organ feature labels to embed prior guiding knowledge of the visceral features, finally obtaining the visceral feature coding embedded with the prior knowledge.
2. The internal organ feature coding method based on face image multi-stage relation learning according to claim 1, wherein in step 1 the internal organ labels comprise large intestine, gallbladder, lung, liver, bladder, spleen, kidney, stomach, small intestine, heart, and unknown, and the organ feature labels comprise qi deficiency, blood deficiency, yin deficiency, yang deficiency, qi stagnation, blood stasis, phlegm, wind, heat, cold, dryness, dampness, and unknown.
3. The method as claimed in claim 1, wherein the step 2 of data augmentation comprises scaling the face image to 256 x 256 size, randomly cutting out 224 x 224 size image from the scaled image, and randomly horizontally flipping the cut image for data augmentation.
4. The internal organ feature coding method based on face image multi-stage relation learning according to claim 1, wherein the step 3 comprises:
step 31, constructing a backbone network: the internal organ labels and the organ feature labels are used as the subtask set of a multi-task learning framework, and the face image sample x_n is passed through the backbone network to extract the common features shared among the tasks;
step 32, constructing a dynamic sequence module DSM, modeling two tasks of the internal organ label and the organ feature label by taking the common feature as input through a Recurrent Neural Network (RNN), and outputting task related features of the internal organ and the organ feature;
step 33, constructing classifiers: the outputs of step 32 are passed through fully connected layers and fed into the respective multi-label classifiers f_loc and f_nat, and a set of decision vectors is obtained that includes the predictions of the internal organs and the organ characteristics.
5. The method for encoding internal organ features based on multi-stage relationship learning of face images as claimed in claim 4, wherein the step 32 specifically comprises:
step 321, modeling the internal organs: the common features extracted by the backbone network are used as the input of the node at each time step of the DSM in the task-related layer, the node at each time step corresponding to one task branch, namely the internal organ task and the organ feature task respectively, so that the number of tasks equals the number of time steps;
step 322, the hidden state feature and the common feature of a time step are fed into the node of the next time step in the RNN to model the human internal organ features, the hidden state feature h_t^s of time step t being expressed as:

h_t^s = F_h(M ⊙ h_{t-1}^s, c_n^s),

wherein M ~ Bernoulli(r_s) is a mask of the same size as h_{t-1}^s, M ∈ {0,1}^{D_h}, D_h is the feature dimension of the hidden state feature, and h_0^s is a randomly initialized initial state; the output o_loc of the organ branch and the output o_nat of the organ feature branch are obtained at the respective time steps, wherein o_nat is influenced by the input common feature and, with probability r_s, by the hidden state of the previous time step.
6. The internal organ feature coding method based on face image multi-stage relation learning according to claim 4, wherein the step 3 further comprises a process of constructing a semantic regularization loss function:
step 34, setting a hyper-parameter threshold z and, according to the magnitude of the L2 loss, calculating activation values a_loc and a_nat for the internal organ task and the organ feature task, which respectively represent the model's recognition results for the internal organs and the organ characteristics;
step 35, introducing a penalty term into the loss function, setting a gating term G to control the activation of the penalty term, and calculating the semantic constraint loss L_sem; for the output/label pair of each branch, the binary cross entropy loss is calculated, yielding the two multi-label classification losses L_loc and L_nat; the final loss is obtained by adding the three losses:

L_total = L_loc + L_nat + L_sem,

wherein L_loc and L_nat are the multi-label cross entropy classification losses of the human internal organs and of the human internal organ characteristics, respectively;
step 36, calculating the loss L_total and performing back propagation, adjusting the parameters of the model, and finally embedding the prior knowledge of the internal organs and the organ characteristics into the internal organ feature coding to obtain the final internal organ feature coding model.
CN202010430517.XA 2020-05-20 2020-05-20 Internal organ feature coding method based on face image multi-stage relation learning Active CN111612133B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010430517.XA CN111612133B (en) 2020-05-20 2020-05-20 Internal organ feature coding method based on face image multi-stage relation learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010430517.XA CN111612133B (en) 2020-05-20 2020-05-20 Internal organ feature coding method based on face image multi-stage relation learning

Publications (2)

Publication Number Publication Date
CN111612133A true CN111612133A (en) 2020-09-01
CN111612133B CN111612133B (en) 2021-10-19

Family

ID=72198781

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010430517.XA Active CN111612133B (en) 2020-05-20 2020-05-20 Internal organ feature coding method based on face image multi-stage relation learning

Country Status (1)

Country Link
CN (1) CN111612133B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112818146A (en) * 2021-01-26 2021-05-18 山西三友和智慧信息技术股份有限公司 Recommendation method based on product image style
CN112837275A (en) * 2021-01-14 2021-05-25 长春大学 Capsule endoscope image organ classification method, device, equipment and storage medium
WO2022105118A1 (en) * 2020-11-17 2022-05-27 平安科技(深圳)有限公司 Image-based health status identification method and apparatus, device and storage medium

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103093087A (en) * 2013-01-05 2013-05-08 电子科技大学 Multimodal brain network feature fusion method based on multi-task learning
CN105760859A (en) * 2016-03-22 2016-07-13 中国科学院自动化研究所 Method and device for identifying reticulate pattern face image based on multi-task convolutional neural network
US20160371539A1 (en) * 2014-04-03 2016-12-22 Tencent Technology (Shenzhen) Company Limited Method and system for extracting characteristic of three-dimensional face image
CN106845421A (en) * 2017-01-22 2017-06-13 北京飞搜科技有限公司 Face characteristic recognition methods and system based on multi-region feature and metric learning
CN107038429A (en) * 2017-05-03 2017-08-11 四川云图睿视科技有限公司 A kind of multitask cascade face alignment method based on deep learning
CN107977671A (en) * 2017-10-27 2018-05-01 浙江工业大学 A kind of tongue picture sorting technique based on multitask convolutional neural networks
CN108596338A (en) * 2018-05-09 2018-09-28 四川斐讯信息技术有限公司 A kind of acquisition methods and its system of neural metwork training collection
CN110263756A (en) * 2019-06-28 2019-09-20 东北大学 A kind of human face super-resolution reconstructing system based on joint multi-task learning
CN110348416A (en) * 2019-07-17 2019-10-18 北方工业大学 Multi-task face recognition method based on multi-scale feature fusion convolutional neural network
CN110443189A (en) * 2019-07-31 2019-11-12 厦门大学 Face character recognition methods based on multitask multi-tag study convolutional neural networks
CN110945125A (en) * 2017-06-06 2020-03-31 齐默尔根公司 HTP genetic engineering modification platform for improving escherichia coli

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103093087A (en) * 2013-01-05 2013-05-08 电子科技大学 Multimodal brain network feature fusion method based on multi-task learning
US20160371539A1 (en) * 2014-04-03 2016-12-22 Tencent Technology (Shenzhen) Company Limited Method and system for extracting characteristic of three-dimensional face image
CN105760859A (en) * 2016-03-22 2016-07-13 中国科学院自动化研究所 Method and device for identifying reticulate pattern face image based on multi-task convolutional neural network
CN106845421A (en) * 2017-01-22 2017-06-13 北京飞搜科技有限公司 Face characteristic recognition methods and system based on multi-region feature and metric learning
CN107038429A (en) * 2017-05-03 2017-08-11 四川云图睿视科技有限公司 A kind of multitask cascade face alignment method based on deep learning
CN110945125A (en) * 2017-06-06 2020-03-31 齐默尔根公司 HTP genetic engineering modification platform for improving escherichia coli
CN107977671A (en) * 2017-10-27 2018-05-01 浙江工业大学 A kind of tongue picture sorting technique based on multitask convolutional neural networks
CN108596338A (en) * 2018-05-09 2018-09-28 四川斐讯信息技术有限公司 A kind of acquisition methods and its system of neural metwork training collection
CN110263756A (en) * 2019-06-28 2019-09-20 东北大学 A kind of human face super-resolution reconstructing system based on joint multi-task learning
CN110348416A (en) * 2019-07-17 2019-10-18 北方工业大学 Multi-task face recognition method based on multi-scale feature fusion convolutional neural network
CN110443189A (en) * 2019-07-31 2019-11-12 厦门大学 Face character recognition methods based on multitask multi-tag study convolutional neural networks

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LIU MINGJIA: "Research on Disease Diagnosis Based on Color Face Images" (基于人脸彩色图像的疾病诊断研究), China Master's Theses Full-text Database, Information Science and Technology Series *
JIANG JUNZHAO et al.: "Multi-label Classification Algorithm of Convolutional Neural Networks Based on Label Correlation" (基于标签相关性的卷积神经网络多标签分类算法), Industrial Control Computer (工业控制计算机) *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022105118A1 (en) * 2020-11-17 2022-05-27 平安科技(深圳)有限公司 Image-based health status identification method and apparatus, device and storage medium
CN112837275A (en) * 2021-01-14 2021-05-25 长春大学 Capsule endoscope image organ classification method, device, equipment and storage medium
CN112837275B (en) * 2021-01-14 2023-10-24 长春大学 Capsule endoscope image organ classification method, device, equipment and storage medium
CN112818146A (en) * 2021-01-26 2021-05-18 山西三友和智慧信息技术股份有限公司 Recommendation method based on product image style

Also Published As

Publication number Publication date
CN111612133B (en) 2021-10-19

Similar Documents

Publication Publication Date Title
CN111612133B (en) Internal organ feature coding method based on face image multi-stage relation learning
CN110147457B (en) Image-text matching method, device, storage medium and equipment
CN111930992B (en) Neural network training method and device and electronic equipment
Gkioxari et al. Chained predictions using convolutional neural networks
CN113312500B (en) Method for constructing event map for safe operation of dam
CN111444960A (en) Skin disease image classification system based on multi-mode data input
CN111414461B (en) Intelligent question-answering method and system fusing knowledge base and user modeling
CN111191660A (en) Rectal cancer pathology image classification method based on multi-channel collaborative capsule network
CN111144448A (en) Video barrage emotion analysis method based on multi-scale attention convolutional coding network
CN110993094A (en) Intelligent auxiliary diagnosis method and terminal based on medical images
Massiceti et al. Flipdial: A generative model for two-way visual dialogue
CN109255289B (en) Cross-aging face recognition method based on unified generation model
CN109743642B (en) Video abstract generation method based on hierarchical recurrent neural network
CN111400494B (en) Emotion analysis method based on GCN-Attention
CN112784929B (en) Small sample image classification method and device based on double-element group expansion
CN112465120A (en) Fast attention neural network architecture searching method based on evolution method
CN110263174B (en) Topic category analysis method based on focus attention
CN110163131B (en) Human body action classification method based on hybrid convolutional neural network and ecological niche wolf optimization
CN113486987A (en) Multi-source domain adaptation method based on feature decoupling
Zhao et al. Distilling ordinal relation and dark knowledge for facial age estimation
CN113012811B (en) Traditional Chinese medicine syndrome diagnosis and health evaluation method combining deep convolutional network and graph neural network
Lin et al. R 2-resnext: A resnext-based regression model with relative ranking for facial beauty prediction
CN112668486A (en) Method, device and carrier for identifying facial expressions of pre-activated residual depth separable convolutional network
CN113220891A (en) Unsupervised concept-to-sentence based generation confrontation network image description algorithm
CN112786160A (en) Multi-image input multi-label gastroscope image classification method based on graph neural network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant