WO2020215697A1 - Tongue image extraction method and device, and a computer readable storage medium - Google Patents

Tongue image extraction method and device, and a computer readable storage medium Download PDF

Info

Publication number
WO2020215697A1
Authority: WIPO (PCT)
Prior art keywords: image, tongue, feature, training, matrix
Application number: PCT/CN2019/118413
Other languages: French (fr), Chinese (zh)
Inventors: 曹靖康, 王健宗, 王义文
Original Assignee: 平安科技(深圳)有限公司
Application filed by 平安科技(深圳)有限公司
Priority to SG11202008404RA
Publication of WO2020215697A1

Classifications

    • G06F18/21322: Pattern recognition; feature extraction based on discrimination criteria, e.g. discriminant analysis; rendering the within-class scatter matrix non-singular
    • G06F18/21328: Rendering the within-class scatter matrix non-singular involving subspace restrictions, e.g. nullspace techniques
    • G06F18/214: Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/22: Matching criteria, e.g. proximity measures
    • G06V10/267: Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions

Definitions

  • This application relates to a tongue image extraction method, device and computer readable storage medium.
  • Existing tongue image detection methods usually adopt a target detection approach: a sliding window slides across the image horizontally and vertically, a CNN model extracts spatial features of the objects inside the window, and an SVM classifier classifies the extracted features to determine whether the window contains a tongue image.
  • The coordinates of the four corner points of the sliding window are then output, and the position of the tongue image is calibrated with those coordinates.
  • However, because the size and pose of the tongue vary widely between images, the size of the target frame is uncertain, so sliding recognition must be repeated with target frames of various sizes, which makes target detection complex to a certain degree.
  • A tongue image extraction method is provided, which is applied to an electronic device, and the method includes the following steps:
  • convert the training images containing tongues into a matrix V, where all the non-negative gray values of one image correspond to one column of V, and train with the LNMF algorithm to decompose the matrix V into the product of a non-negative feature matrix W and a weight matrix H, that is, V = WH;
  • the dimension of the non-negative feature matrix W is n*r, and its r columns are feature base images; the feature base images refer to the non-negative feature matrix W representing tongue features, and the non-negative feature matrix W forms a non-negative subspace;
  • the dimension of the weight matrix H is r*m, and each of its columns is an encoding;
  • use the EHMM model to identify whether the test image contains a face image; if it does, project the training images and the test image onto the non-negative subspace to obtain feature coefficients for each, use the nearest-neighbor criterion to compute the similarity between the feature coefficients of the training images and the test image, and extract the tongue-representing features whose similarity exceeds the similarity threshold as tongue features;
  • after projection, the feature areas containing tongue features and the non-feature areas without tongue features are identified with different labels; the label set corresponds to the boundary information of the feature area, and the extreme values in the up, down, left, and right directions are extracted from the boundary information to determine the border containing the feature area.
  • This application also provides a tongue image extraction device, including:
  • a matrix decomposition module, used to convert the training images containing tongues into a matrix V, where all the non-negative gray values of one image correspond to one column of V, and to train with the LNMF algorithm to decompose the matrix V into the product of a non-negative feature matrix W and a weight matrix H, that is, V = WH;
  • the dimension of the non-negative feature matrix W is n*r, and its r columns are feature base images; the feature base images refer to the non-negative feature matrix W representing tongue features, and the non-negative feature matrix W constitutes a non-negative subspace;
  • the dimension of the weight matrix H is r*m, and each of its columns is an encoding;
  • a tongue feature extraction module, which uses the EHMM model to identify whether the test image contains a face image and, if it does, projects the training images and the test image onto the non-negative subspace to obtain feature coefficients for each, uses the nearest-neighbor criterion to compute the similarity between the feature coefficients of the training images and the test image, and extracts the tongue-representing features whose similarity exceeds the similarity threshold as tongue features;
  • a tongue image segmentation module, which uses different labels to mark the feature areas containing tongue features and the non-feature areas without them; the label set corresponds to the boundary information of the feature area, and the extreme values in the up, down, left, and right directions are extracted from the boundary information to determine the border containing the feature area.
  • The present application also provides an electronic device, which includes a memory and a processor; a tongue image extraction program is stored in the memory, and when the tongue image extraction program is executed by the processor, the following steps are implemented:
  • convert the training images containing tongues into a matrix V and train with the LNMF algorithm to decompose it into the product of a non-negative feature matrix W and a weight matrix H, that is, V = WH;
  • the dimension of the non-negative feature matrix W is n*r, and its r columns are feature base images; the feature base images refer to the non-negative feature matrix W representing tongue features, and the non-negative feature matrix W forms a non-negative subspace;
  • the dimension of the weight matrix H is r*m, and each of its columns is an encoding;
  • use the EHMM model to identify whether the test image contains a face image and, if so, project the training images and the test image onto the non-negative subspace, compute the similarity of the corresponding feature coefficients with the nearest-neighbor criterion, and extract the features whose similarity exceeds the similarity threshold as tongue features;
  • after projection, the feature areas containing tongue features and the non-feature areas without tongue features are identified with different labels; the label set corresponds to the boundary information of the feature area, and the extreme values in the up, down, left, and right directions are extracted from the boundary information to determine the border containing the feature area.
  • A computer non-volatile readable storage medium is also provided, which stores a computer program; the computer program includes program instructions that, when executed by a processor, implement any of the tongue image extraction methods described above.
  • FIG. 1 is a schematic flowchart of a tongue image extraction method according to an embodiment of the present application;
  • FIG. 2 is a first schematic diagram of the superstates and embedded states of the EHMM corresponding to the slices of an image in an embodiment of the present application;
  • FIG. 3 is a second schematic diagram of the superstates and embedded states of the EHMM corresponding to the slices of an image in an embodiment of the present application;
  • FIG. 4 is a third schematic diagram of the superstates and embedded states of the EHMM corresponding to the slices of an image in an embodiment of the present application;
  • FIG. 5 is a schematic diagram of the hardware architecture of an electronic device according to an embodiment of the present application;
  • FIG. 6 is a block diagram of the modules of a tongue image extraction program according to an embodiment of the present application;
  • FIG. 7 is a schematic diagram of border adjustment by the linear regression model according to an embodiment of the present application.
  • FIG. 1 is a schematic flowchart of a tongue image extraction method provided by an embodiment of the application, which is applied to an electronic device, and the method includes the following steps:
  • S110: Use the LNMF (local non-negative matrix factorization) algorithm for training to obtain feature base images of different dimensions. For example, 1000 tongue images (that is, images that contain a tongue and reflect features such as its shape and color) are used as the training set, and the tongue images have been annotated in advance. Preferably, each tongue image may first be compressed, for example to 56*64 pixels, then de-meaned and normalized, before the LNMF algorithm is trained to obtain feature base images of different dimensions; a preprocessing sketch follows below. The feature base images refer to the non-negative feature matrix W representing the characteristics of the tongue, and this non-negative feature matrix W constitutes a non-negative subspace.
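A minimal Python sketch of this preprocessing, as an illustration rather than the patent's implementation: after de-meaning, the values are shifted back to non-negative so the matrix V remains valid LNMF input (`build_training_matrix` and its parameters are hypothetical names).

```python
import numpy as np
from PIL import Image

def build_training_matrix(image_paths, size=(56, 64)):
    """Compress each annotated tongue image, de-mean and normalize it,
    and stack the gray values of each image as one column of V."""
    columns = []
    for path in image_paths:
        img = Image.open(path).convert('L').resize(size)  # compress to 56*64 gray
        v = np.asarray(img, dtype=float).ravel()
        v = v - v.mean()                  # de-mean
        v = v - v.min()                   # shift back so all values stay non-negative
        v = v / (v.max() + 1e-9)          # normalize to [0, 1]
        columns.append(v)
    return np.stack(columns, axis=1)      # V has size n*m (n pixels, m images)
```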
  • LNMF is an improvement upon NMF. The training images are assembled into a matrix V and decomposed as V = WH: the dimension of the feature matrix W is n*r, and its r columns are the base images; the dimension of the weight matrix H is r*m, and each of its columns is an encoding that corresponds one-to-one to a tongue image in V. A training image can therefore be expressed as a linear combination of base images.
  • S120: The non-negative feature matrix W representing the characteristics of the tongue constitutes a non-negative subspace. The training images and the test images are each projected onto the non-negative subspace obtained from the training image set, yielding feature coefficients for each, and the nearest-neighbor criterion is used to compute the similarity between the feature coefficients corresponding to the training images and the test images. The tongue-representative features whose feature-coefficient similarity exceeds the set threshold are extracted as tongue features, so that images with tongue features are screened out of the test images.
  • The characteristics of the tongue include the shape, angle, and color of the tongue, the state of the tongue coating, and the positional relationship between the tongue and the facial organs.
  • S130: The test image is projected onto the non-negative subspace. The projection is equivalent to transforming the test image into the non-negative subspace; the result is still an image, one composed of the learned features. A sketch of the projection and similarity screening follows below.
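A minimal sketch of the projection and screening, assuming non-negative least squares for the projection onto the subspace spanned by W and cosine similarity as the nearest-neighbor measure (both are illustrative choices; `sim_threshold` is a hypothetical parameter, not a value from the patent).

```python
import numpy as np
from scipy.optimize import nnls

def project(W, images):
    """Project each image (one column of `images`) onto the non-negative
    subspace spanned by the columns of W; returns the r*m coefficient matrix."""
    return np.stack([nnls(W, v)[0] for v in images.T], axis=1)

def screen_tongue_images(W, train_imgs, test_imgs, sim_threshold=0.9):
    """Keep the test images whose nearest training neighbor (by cosine
    similarity of the feature coefficients) exceeds the threshold."""
    H_train = project(W, train_imgs)
    H_test = project(W, test_imgs)
    Ht = H_train / (np.linalg.norm(H_train, axis=0, keepdims=True) + 1e-9)
    Hs = H_test / (np.linalg.norm(H_test, axis=0, keepdims=True) + 1e-9)
    similarity = Ht.T @ Hs               # pairwise train-vs-test similarities
    best = similarity.max(axis=0)        # nearest neighbor per test image
    return np.where(best > sim_threshold)[0]   # indices deemed to contain a tongue
```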
  • After projection, the feature areas containing tongue features and the non-feature areas without tongue features are identified with different labels, so that the feature areas containing tongue features can be segmented from the test image.
  • The label set corresponds to the boundary information of the feature area, and the extreme values in the up, down, left, and right directions are extracted from the boundary information to determine the smallest border containing the feature area.
  • The point of using the smallest border is that linear regression will later be used to adjust its position to eliminate or reduce the position error.
  • The non-feature area and the feature area carry different labels; for example, the non-feature area is 0 and the feature area is non-zero, as in the sketch below.
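As a sketch, determining the border from the zero/non-zero labels amounts to taking the extreme coordinates of the non-zero pixels (illustrative code, with the mask convention assumed from the text above).

```python
import numpy as np

def minimal_border(label_mask):
    """Return (left, top, right, bottom) of the smallest border enclosing
    the feature area, where feature pixels are non-zero and others are 0."""
    rows, cols = np.nonzero(label_mask)
    if rows.size == 0:
        return None                      # no tongue feature area found
    return cols.min(), rows.min(), cols.max(), rows.max()
```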
  • Based on the zero and non-zero labels, the image area representing the tongue feature can be segmented from each test image with a border. Further, the method also includes step S140: using SVM classifiers to classify the features extracted from the test image. The extracted features are sent to k SVM classifiers for recognition, where the value of k equals the number of categories. For example, the features may be classified into "tongue" and "non-tongue", or according to the characteristics of the pathological condition of the tongue.
  • Specifically, classification may follow the characteristics of the different tongue images corresponding to a person's physical condition, which can include damp-heat, yin deficiency, normal, excess heat, qi-blood stagnation, and blood stasis.
  • The class with the highest score among the k SVM classifiers is taken as the classification result, as in the sketch below.
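A minimal sketch of the k-classifier scheme, assuming one binary one-vs-rest SVM per category and scikit-learn's `LinearSVC` (an assumed implementation choice; the patent does not name a library).

```python
import numpy as np
from sklearn.svm import LinearSVC

CATEGORIES = ["damp-heat", "yin deficiency", "normal",
              "excess heat", "qi-blood stagnation", "blood stasis"]

def train_k_svms(features, labels):
    """One binary SVM per category, so k equals the number of categories."""
    return {c: LinearSVC().fit(features, (labels == c).astype(int))
            for c in CATEGORIES}

def classify(svms, feature_vec):
    """Send the feature vector to all k SVMs and keep the highest-scoring class."""
    scores = {c: clf.decision_function(feature_vec.reshape(1, -1))[0]
              for c, clf in svms.items()}
    return max(scores, key=scores.get)
```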
  • Step S150: adjust the border position of the tongue image through a linear regression model.
  • A linear regression model is trained separately for each category, for example damp-heat, yin deficiency, normal, excess heat, qi-blood stagnation, and blood stasis.
  • The input is the features of the image within the border, and the output is the translation values (left-right and up-down) and the scaling values of the border.
  • The linear regression model is used to calculate the translation and scaling values of the border, and a loss function is used to constrain the position error of the border, so that the border is continuously adjusted toward a suitable position.
  • Position prediction: let the predicted border be P = (P_x, P_y, P_w, P_h) and the true position be G = (G_x, G_y, G_w, G_h). Bounding-box regression learns the four accurate transformation values d_x(P), d_y(P), d_w(P), d_h(P) such that the adjusted border satisfies G^_x = P_w d_x(P) + P_x, G^_y = P_h d_y(P) + P_y, G^_w = P_w exp(d_w(P)), G^_h = P_h exp(d_h(P)).
  • The regression targets are t_x = (G_x − P_x)/P_w and t_y = (G_y − P_y)/P_h for the translation, and for the width and height (t_w, t_h), t_w = log(G_w/P_w) and t_h = log(G_h/P_h).
  • Construct the objective function w_* = argmin Σ_{i=1..N} (t_*^i − ŵ_*^T K(P^i))², where w_* is the parameter to be learned (* stands for x, y, w, h, that is, one objective function is set for each transformation), d_*(P) = w_*^T K(P) is the predicted transformation value, K(P) is the feature vector corresponding to the feature region, i indexes the i-th sample, and N is the number of samples.
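These transformations can be sketched directly from the formulas above (illustrative code; borders are (x, y, w, h) with (x, y) the center, and fitting w_* is left to any linear regressor).

```python
import numpy as np

def regression_targets(P, G):
    """Targets (t_x, t_y, t_w, t_h) for a proposed border P and ground truth G."""
    Px, Py, Pw, Ph = P
    Gx, Gy, Gw, Gh = G
    return np.array([(Gx - Px) / Pw, (Gy - Py) / Ph,
                     np.log(Gw / Pw), np.log(Gh / Ph)])

def adjust_border(P, d):
    """Apply predicted d_x, d_y, d_w, d_h to translate and scale the border."""
    Px, Py, Pw, Ph = P
    dx, dy, dw, dh = d
    return (Pw * dx + Px, Ph * dy + Py, Pw * np.exp(dw), Ph * np.exp(dh))
```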
  • NMF is a subspace projection method. Since the features extracted by the NMF algorithm are global, there is no locality restriction on the feature space; LNMF emphasizes the localization of the basic feature components in the process of decomposing the original image. The LNMF algorithm minimizes the divergence D(V‖WH) = Σ_{i,j} [ V_ij ln( V_ij / (WH)_ij ) − V_ij + (WH)_ij ] + α Σ_{i,j} (WᵀW)_ij − β Σ_i (HHᵀ)_ii subject to W, H ≥ 0, where:
  • α and β are positive constants;
  • W_j denotes the j-th column vector of the feature base matrix W, and each column of the feature base matrix W is normalized;
  • V = [V_1, V_2, ..., V_i, ..., V_m] denotes a set of m training images, where V_i is the column vector of the i-th training image and V_ij is the j-th gray value of the i-th image;
  • each training image has n pixels, so the size of V is n*m;
  • W = [W_1, W_2, ..., W_j, ..., W_r] is the feature matrix, of size n*r;
  • H = [H_1, H_2, ..., H_j, ..., H_m] is the weight matrix, H_j being the j-th column vector of H, of size r*m. A sketch of the multiplicative updates follows below.
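A minimal sketch of the LNMF factorization, following the published multiplicative update rules (Li et al.); the iteration count and random initialization here are assumptions.

```python
import numpy as np

def lnmf(V, r, n_iter=200, eps=1e-9):
    """Factor the n*m non-negative matrix V into W (n*r) and H (r*m);
    the columns of W are the localized base images."""
    n, m = V.shape
    rng = np.random.default_rng(0)
    W = rng.random((n, r)) + eps
    H = rng.random((r, m)) + eps
    for _ in range(n_iter):
        R = V / (W @ H + eps)                         # elementwise V_ij / (WH)_ij
        H = np.sqrt(H * (W.T @ R))                    # localized encoding update
        W = W * (R @ H.T) / (H.sum(axis=1) + eps)     # base image update
        W = W / (W.sum(axis=0, keepdims=True) + eps)  # normalize each column W_j
    return W, H
```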
  • Preferably, both the training images and the test images are first binarized. Binarization means setting the gray value of each pixel to only 0 or 255, that is, rendering the entire image in pure black and white; compared with the three channels of a color image, the resulting single channel is more conducive to model optimization, so the tongue area can be obtained more accurately.
  • Specifically, the gray value of each pixel is set to 0 or 255 according to a set gray threshold, which can be the middle value between 0 and 255: a value below the threshold is set to 0, and a value greater than or equal to the threshold is set to 255, as in the sketch below.
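A one-function sketch of the thresholding just described (the mid-range default of 128 follows the text's suggestion of the middle value between 0 and 255).

```python
import numpy as np

def binarize(gray_image, threshold=128):
    """Pixels below the gray threshold become 0; the rest become 255."""
    return np.where(gray_image < threshold, 0, 255).astype(np.uint8)
```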
  • Preferably, the EHMM (Embedded Hidden Markov Model) algorithm is first used to classify the test images into the two categories "face" and "no face", to optimize recognition accuracy.
  • The specific classification process includes the following steps:
  • The EHMM model scans the test image from top to bottom and from left to right with a moving window. It first scans from left to right; each window position yields a set of feature vectors, which is one feature extraction of the face region at that moment. After the scanning window computes its feature vector, it moves right by a fixed distance and continues the feature extraction; when it reaches the right edge of the image, it moves down to the next row and continues scanning from left to right. When the window reaches the bottom-right of the image, the entire scanning process ends; multiple sets of feature vectors have been obtained, and together they form an observation sequence.
  • The EHMM model contains a set of superstates; the number of superstates in the set is the same as the number of vertical slices of a human face.
  • Each superstate encapsulates a set of embedded states, and the number of embedded states is the same as the number of horizontal slices of the face.
  • The EHMM model scans the image from left to right and top to bottom through a fixed-size window, so the facial features correspond to the superstates from top to bottom and to the embedded states from left to right.
  • The image slices corresponding to the longitudinal superstates are the forehead, eye, nose, mouth, and chin areas; from top to bottom, the positional relationship of these areas is fixed, which is the common feature of human faces. The individuality of the face in the vertical direction is reflected by the characteristics of each superstate (that is, each region) and the relationships between the superstates.
  • From left to right, the face is divided into the left face, left eye, the area between the two eyes, right eye, and right face; this positional relationship is also fixed, and the individuality of the face in the horizontal direction is reflected by each embedded state and the mutual relationships between the embedded states.
  • In recognition, the forward algorithm is used to compute the probability that the observation sequence matches the feature sequence composed of multiple facial feature points; if this similarity probability is greater than the decision threshold, the detected image is considered to contain a face. A sketch of the window scan that produces the observation sequence follows below.
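The window scan can be sketched as follows. The window size, step, and the use of low-frequency 2D-DCT coefficients as the per-window feature vector are assumptions (DCT features are a common choice for EHMM face models, but the patent does not specify them).

```python
import numpy as np
from scipy.fftpack import dct

def observation_sequence(gray_image, window=(16, 16), step=(4, 4), n_coef=6):
    """Slide a fixed-size window left-to-right, then top-to-bottom, and
    extract one feature vector per position; the rows/columns of the grid
    line up with the vertical superstates / horizontal embedded states."""
    wh, ww = window
    sy, sx = step
    grid = []
    for y in range(0, gray_image.shape[0] - wh + 1, sy):
        row = []
        for x in range(0, gray_image.shape[1] - ww + 1, sx):
            block = gray_image[y:y + wh, x:x + ww].astype(float)
            coef = dct(dct(block, axis=0, norm='ortho'), axis=1, norm='ortho')
            row.append(coef[:n_coef, :n_coef].ravel())  # keep low frequencies
        grid.append(row)
    return np.array(grid)   # shape: (n_rows, n_cols, feature_dim)
```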
  • The training process of the EHMM model is as follows:
  • Transition matrix: A_0 = {a_0,ij}, where a_0,ij is the probability of transitioning from superstate i to superstate j.
  • Because scanning proceeds in one direction, the only allowed transitions are from a state to itself or to the next state, so the probability of transitioning from a state back to a previous state is 0.
  • B_k denotes the observation probability matrix; its entries give the probability that embedded state j of superstate k produces a given observation. The two state indices correspond to the vertical and horizontal dimensions respectively.
  • Image segmentation: the training image is uniformly divided; the observation sequence obtained from the image is evenly divided into N_0 longitudinal slices corresponding to the longitudinal superstates, and each longitudinal slice can be divided from left to right into multiple embedded states.
  • Parameter initialization: after segmentation, the initial values of the model parameters are obtained from the initialization probabilities and the state transition probabilities.
  • Each EHMM state uses K-means clustering to compute the observation probabilities, where K is the number of Gaussian distributions in each state; all the observation vectors extracted for an embedded state can then be described by a Gaussian mixture model as the observation probability density function.
  • The state initialization rule of each superstate is as follows: the initialization probability of the first state of each EHMM is set to 1.0, and the initialization probability of the other states is 0.
  • Embedded Viterbi segmentation: after the first iteration, the doubly embedded Viterbi algorithm is used instead of uniform segmentation, and a new set of initialization and transition probabilities is determined from the new segmentation by event frequency counting.
  • K-means clustering is then used to compute the observation vectors corresponding to the new states and the new observation probability density functions; in the next iteration, these values are used as the initial values for a new round of doubly embedded Viterbi segmentation. A sketch of the uniform segmentation and initialization follows below.
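A sketch of the uniform first-iteration segmentation and the initialization rule described above. The equal self/next transition split is an assumed starting value; the text only fixes which transitions are allowed and that the first state's initial probability is 1.0.

```python
import numpy as np

def uniform_segmentation(n_rows, n_cols, n_super, n_embedded):
    """Assign each observation row to one of N0 = n_super vertical
    superstates and each column to one of n_embedded embedded states."""
    super_of_row = np.minimum(np.arange(n_rows) * n_super // n_rows, n_super - 1)
    embed_of_col = np.minimum(np.arange(n_cols) * n_embedded // n_cols, n_embedded - 1)
    return super_of_row, embed_of_col

def initial_parameters(n_states):
    """First state starts with probability 1.0, others 0; only self and
    next-state transitions are allowed (backward transitions stay at 0)."""
    pi = np.zeros(n_states)
    pi[0] = 1.0
    A = np.zeros((n_states, n_states))
    for i in range(n_states - 1):
        A[i, i] = A[i, i + 1] = 0.5      # assumed equal split between stay/advance
    A[-1, -1] = 1.0                      # last state can only stay
    return pi, A
```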
  • Preferably, the longitudinal slices of the face also include the hair area.
  • The hair area provides additional facial features, which helps recognize faces more accurately.
  • The calculation process is basically the same as the process above, so it is not repeated here.
  • The above divides the entire face into vertical slices corresponding to the superstates and horizontal slices corresponding to the embedded states.
  • Since the purpose of this application is tongue image extraction, face recognition may use only some of the longitudinal slices as superstates; for example, as shown in Fig. 4, only the chin and mouth areas correspond to superstates. The calculation process is basically the same as the process above, so it is not repeated here.
  • Preferably, the images containing a human face are further classified, including identifying the gender and age of the person in the image.
  • Although a person's age and gender cannot be accurately determined from the state of the tongue, the condition of a person's tongue is related to his or her age and gender. For example, the number of taste buds (papillary protrusions distributed on the tongue) differs across age stages and shows a decreasing trend.
  • There are about 10,000 taste buds in childhood; as a person ages, the cells slowly degenerate, and in old age only about 20% of the childhood taste buds remain. The younger the person, the more tender and redder the tongue; the older the person, the darker the tongue.
  • Female tongues are usually smaller than male tongues.
  • Therefore, the test images can first be classified according to the age and gender of the person, and images of the tongue area can then be extracted from the images within each age-gender category. The images in each category correlate more closely with the tongues that the category should actually contain; that is to say, if the tongue in an image belongs to an elderly person, the image is classified into the older category.
  • The tongue in that image then has the characteristics an elderly tongue should have, such as a small number of taste buds (recognized only roughly, of course) and a dim tongue color, so it can be recognized faster, which effectively reduces the computation of the model.
  • This requires training in advance an LNMF model corresponding to each age group. That is to say, the images in the training set are first classified according to age and gender, and the images in each category are labeled as with or without tongue, which forms age-gender-tongue labels; a separate LNMF model is trained for each category.
  • Specifically, six categories are set according to age group and gender, such as 0-20-male, 20-40-male, 40-70-male, 0-20-female, 20-40-female, and 40-70-female.
  • A CNN (convolutional neural network) is used to recognize the training images and classify them into the above six age-gender categories;
  • the images in these six categories are labeled, the label of each image being its age, gender, and tongue label;
  • LNMF models are trained for these six categories respectively, yielding six LNMF models corresponding to the above six categories.
  • For example, the LNMF model corresponding to 0-20-male tongues is used to recognize 0-20-male tongues, and the LNMF model corresponding to 20-40-female tongues is used to recognize 20-40-female tongues.
  • The tongue image area is then extracted by the LNMF model trained for the corresponding age group, gender, and tongue label, as in the routing sketch below.
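Routing a test image to the right per-category LNMF model amounts to a dictionary lookup; `predict_category` and `extract` are hypothetical interfaces standing in for the CNN classifier and the trained LNMF pipeline.

```python
def extract_tongue(test_image, cnn, lnmf_models):
    """First classify the test image by age group and gender, then apply
    the LNMF model trained for that category to extract the tongue area.
    `cnn.predict_category` and `model.extract` are hypothetical interfaces."""
    category = cnn.predict_category(test_image)   # e.g. "20-40-female"
    model = lnmf_models[category]                 # one trained LNMF model per category
    return model.extract(test_image)              # framed tongue image area, if any

# lnmf_models maps the six categories to their trained models, e.g.
# {"0-20-male": ..., "20-40-male": ..., "40-70-male": ...,
#  "0-20-female": ..., "20-40-female": ..., "40-70-female": ...}
```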
  • In this way, the test images are also divided into the corresponding age and gender categories, and tongues of a given age and gender have corresponding characteristics, which makes them easier for the LNMF models to recognize.
  • Dividing the test images into multiple categories that are recognized simultaneously also speeds up recognition.
  • Moreover, since each tongue image corresponds to an age group and gender, this also helps the accuracy and speed of the later classification into damp-heat, yin deficiency, normal, excess heat, qi-blood stagnation, and blood stasis.
  • the electronic device 2 is a device that can automatically perform numerical calculation and/or information processing according to pre-set or stored instructions.
  • the electronic device 2 can be a smart phone, a tablet computer, a notebook computer, a desktop computer, a server, etc.
  • the electronic device 2 at least includes, but is not limited to, a memory 21, a processor 22, and a network interface 23 that can be communicatively connected to each other through a system bus.
  • the memory 21 includes at least one type of computer non-volatile readable storage medium, and the readable storage medium includes flash memory, hard disk, multimedia card, memory, magnetic disk, optical disk, and the like.
  • the memory 21 may be an internal storage unit of the electronic device 2, such as a hard disk or a memory of the electronic device 2.
  • the memory 21 may also be an external storage device of the electronic device 2, for example, a plug-in hard disk equipped on the electronic device 2, a smart media card (SMC), a secure digital (Secure Digital, SD) card, flash card (Flash Card), etc.
  • the memory 21 is generally used to store the operating system and various application software installed in the electronic device 2, such as the tongue image extraction program code.
  • the processor 22 may be a central processing unit (Central Processing Unit, CPU), a controller, a microcontroller, a microprocessor, or other data processing chips in some embodiments.
  • the processor 22 is used to run the program code or processing data stored in the memory 21, for example, to run the tongue image extraction program.
  • the network interface 23 may include a wireless network interface or a wired network interface, and the network interface 23 is generally used to establish a communication connection between the electronic device 2 and other electronic devices.
  • the electronic device 2 may also include a display, and the display may also be called a display screen or a display unit.
  • the memory 21 containing a readable storage medium may include an operating system, a tongue image extraction program 50, and the like.
  • When the processor 22 executes the tongue image extraction program 50 in the memory 21, the steps S110 to S150 described above are implemented, which will not be repeated here.
  • The tongue image extraction program 50 stored in the memory 21 can be divided into one or more program modules, which are stored in the memory 21 and executed by one or more processors (in this embodiment, the processor 22) to complete this application.
  • FIG. 6 shows a schematic diagram of the program modules of the tongue image extraction program.
  • In this embodiment, the tongue image extraction program 50 can be divided into a matrix decomposition module 501, a tongue feature extraction module 502, and a tongue image segmentation module 503.
  • the following description will specifically introduce the specific functions of the program modules.
  • The matrix factorization module 501 is used for training with the LNMF (local non-negative matrix factorization) algorithm to obtain feature base images of different dimensions.
  • For example, 1000 tongue images (that is, images that contain a tongue and reflect features such as its shape and color) are used as the training set, and the tongue images have been annotated in advance.
  • LNMF is an improvement upon NMF. The dimension of the feature matrix W is n*r, and its r columns are the base images; the dimension of the weight matrix H is r*m, and each of its columns is an encoding that corresponds one-to-one to a tongue image in V. A training image can therefore be expressed as a linear combination of base images.
  • NMF is a subspace projection method. Since the features extracted by the NMF algorithm are global, there is no locality restriction on the feature space; LNMF emphasizes the localization of the basic feature components in the process of decomposing the original image. The formula of the LNMF algorithm is as described above and will not be detailed here.
  • The tongue feature extraction module 502 uses the EHMM model to identify whether the test image contains a face image, and if it does, performs feature extraction on the test image.
  • The non-negative feature matrix W representing the characteristics of the tongue constitutes a non-negative subspace; the training images and the test images are each projected onto the non-negative subspace obtained from the training image set, yielding feature coefficients for each, and the nearest-neighbor criterion is used to compute the similarity between the feature coefficients corresponding to the training images and the test images, so as to extract the features in the test images.
  • If the similarity of the feature coefficients is higher than the set threshold, the feature base in the test image is considered to be a tongue, so that images with tongue features can be selected from the test images.
  • The tongue image segmentation module 503 is used to project the test image onto the non-negative subspace.
  • The projection is equivalent to transforming the test image into the non-negative subspace; the result is still an image, one composed of the learned features.
  • The non-feature area and the feature area carry different labels, for example 0 for the non-feature area and non-zero for the feature area; based on zero and non-zero, the image area representing the tongue feature can be segmented from each test image with a border.
  • a classification module 504 is further included.
  • The classification module 504 is used to classify the features extracted from the test image using SVM classifiers: the extracted features are sent to k SVM classifiers for recognition, where the value of k equals the number of categories.
  • For example, the features may be classified into "tongue" and "non-tongue", or according to the characteristics of the pathological condition of the tongue, so as to obtain a framed tongue image. Specifically, classification may follow the characteristics of the different tongue images corresponding to a person's physical condition, which can include damp-heat, yin deficiency, normal, excess heat, qi-blood stagnation, and blood stasis; the class with the highest score among the k SVM classifiers is taken as the classification result.
  • a frame adjustment module 505 is further included.
  • The frame adjustment module 505 is used to adjust the frame position of the tongue image through a linear regression model.
  • A linear regression model is trained separately for each category, for example damp-heat, yin deficiency, normal, excess heat, qi-blood stagnation, and blood stasis.
  • The input is the features of the image within the frame, and the output is the translation values (left-right and up-down) and the scaling values of the frame.
  • The linear regression model is used to calculate the translation and scaling values of the frame, and a loss function is used to constrain the position error of the frame, so that the frame is continuously adjusted toward a suitable position.
  • a binarization module 506 is further included.
  • The binarization module 506 is used to binarize both the training images and the test images (that is, to set the gray value of each pixel to 0 or 255, rendering the entire image in pure black and white). Color images (such as RGB images) obtain their many colors by superimposing the three color channels red (R), green (G), and blue (B); the tongue areas obtained from them contain more hollow (missing) regions, while black-and-white images have only a single channel, and a single channel is more conducive to model optimization than three channels, so the tongue area is obtained more accurately.
  • The face recognition module 507 is used to first classify the test images with the EHMM (Embedded Hidden Markov Model) algorithm; specifically, the images are classified into the two categories "face" and "no face" to optimize recognition accuracy.
  • Using the EHMM model for face recognition includes the following steps:
  • The EHMM model scans the test image from top to bottom and from left to right through a moving window. The EHMM model contains a set of superstates, where the number of superstates in the set is the same as the number of vertical slices of a human face; each superstate encapsulates a set of embedded states, and the number of embedded states in the set is the same as the number of horizontal slices of the face.
  • The EHMM model scans the image from left to right and top to bottom through a fixed-size window (facial features correspond to the superstates from top to bottom and to the embedded states from left to right).
  • Each window position yields a set of feature vectors, which is one feature extraction of the face region at that moment.
  • After the scanning window computes its feature vector, it moves right by a fixed distance and continues the feature extraction.
  • When it reaches the right edge of the image, it moves down to the next row and continues scanning from left to right.
  • When the window reaches the bottom-right of the image, the entire scanning process ends; multiple sets of feature vectors have been obtained, and together they form an observation sequence.
  • In recognition, the forward algorithm is used to compute the probability that the observation sequence matches the feature sequence composed of multiple facial feature points; if this similarity probability is greater than the decision threshold, the detected image is considered to contain a face.
  • The training process of the EHMM model is as follows:
  • Transition matrix: A_0 = {a_0,ij}, where a_0,ij is the probability of transitioning from superstate i to superstate j.
  • Because scanning proceeds in one direction, the only allowed transitions are from a state to itself or to the next state, so the probability of transitioning from a state back to a previous state is 0.
  • B_k denotes the observation probability matrix; its entries give the probability that embedded state j of superstate k produces a given observation. The two state indices correspond to the vertical and horizontal dimensions respectively.
  • Image segmentation: the training image is uniformly divided; the observation sequence obtained from the image is evenly divided into N_0 longitudinal slices corresponding to the longitudinal superstates, and each longitudinal slice can be divided from left to right into multiple embedded states.
  • Parameter initialization: after segmentation, the initial values of the model parameters are obtained from the initialization probabilities and the state transition probabilities.
  • Each EHMM state uses K-means clustering to compute the observation probabilities, where K is the number of Gaussian distributions in each state; all the observation vectors extracted for an embedded state can then be described by a Gaussian mixture model as the observation probability density function.
  • The state initialization rule of each superstate is as follows: the initialization probability of the first state of each EHMM is set to 1.0, and the initialization probability of the other states is 0.
  • Embedded Viterbi segmentation: after the first iteration, the doubly embedded Viterbi algorithm is used instead of uniform segmentation, and a new set of initialization and transition probabilities is determined from the new segmentation by event frequency counting.
  • K-means clustering is then used to compute the observation vectors corresponding to the new states and the new observation probability density functions; in the next iteration, these values are used as the initial values for a new round of doubly embedded Viterbi segmentation.
  • a reclassification module 508 is further included.
  • The reclassification module 508 is used, after the embedded hidden Markov algorithm has classified the test images into face and non-face, to further classify the face images, including identifying the gender and age of the person in the image.
  • Specifically, six categories are set according to age group and gender, such as 0-20-male, 20-40-male, 40-70-male, 0-20-female, 20-40-female, and 40-70-female.
  • A CNN (convolutional neural network) is used to recognize the training images and classify them into the above six categories;
  • LNMF models are trained for these six categories respectively, yielding six LNMF models corresponding to the above six categories.
  • For example, the LNMF model corresponding to 0-20-male tongues is used to recognize 0-20-male tongues, and the LNMF model corresponding to 20-40-female tongues is used to recognize 20-40-female tongues;
  • the corresponding LNMF model is then used to extract the tongue image area.
  • In this way, the test images are also divided into the corresponding age and gender categories, and tongues of a given age and gender have corresponding characteristics, which makes them easier for the LNMF models to recognize.
  • Dividing the test images into multiple categories that are recognized simultaneously also speeds up recognition.
  • Moreover, since each tongue image corresponds to an age group and gender, this also helps the accuracy and speed of the later classification into damp-heat, yin deficiency, normal, excess heat, qi-blood stagnation, and blood stasis.
  • an embodiment of the present application also provides a tongue image extraction device, which includes a matrix decomposition module 501, a tongue feature extraction module 502, and a tongue image segmentation module 503.
  • The matrix decomposition module 501 is used to convert the training images containing tongues into a matrix V, where all the non-negative gray values of one image correspond to one column of V, and to train with the LNMF algorithm to decompose the matrix V into the product of the non-negative feature matrix W and the weight matrix H, that is, V = WH. The dimension of the non-negative feature matrix W is n*r, and its r columns are the feature base images; the feature base images refer to the non-negative feature matrix W representing tongue features, and the non-negative feature matrix W constitutes a non-negative subspace. The dimension of the weight matrix H is r*m, and each of its columns is an encoding.
  • The tongue feature extraction module 502 uses the EHMM model to identify whether the test image contains a face image; if it does, the training images and the test image are projected onto the non-negative subspace to obtain feature coefficients for each, the nearest-neighbor criterion is used to compute the similarity between the feature coefficients corresponding to the training images and the test image, and the features in the test image whose similarity exceeds the similarity threshold are extracted as tongue features;
  • the tongue image segmentation module 503 uses different labels to distinguish the feature area containing tongue features from the non-feature area without them, and determines the smallest border containing the feature area by reading the labels, thereby segmenting the feature area representing the tongue features from the test image.
  • a classification module 504 is further included.
  • The classification module 504 is used to classify the features extracted from the test image using SVM classifiers; the extracted features are sent to k SVM classifiers for recognition, where the value of k equals the number of categories.
  • a frame adjustment module 505 is further included.
  • The frame adjustment module 505 is used to adjust the frame position of the tongue image through a linear regression model.
  • A linear regression model is trained separately for each category, for example damp-heat, yin deficiency, normal, excess heat, qi-blood stagnation, and blood stasis.
  • The input is the features of the image within the frame, and the output is the translation values (left-right and up-down) and the scaling values of the frame.
  • The linear regression model is used to calculate the translation and scaling values of the frame, and a loss function is used to constrain the position error of the frame, so that the frame is continuously adjusted toward a suitable position.
  • a binarization module 506 is further included.
  • the binarization module 506 is used to perform binarization on both the training image and the test image.
  • The face recognition module 507 is used to first classify the test images with the EHMM (Embedded Hidden Markov Model) algorithm; specifically, the images are classified into the two categories "face" and "no face" to optimize recognition accuracy.
  • the embodiment of the present application also proposes a computer non-volatile readable storage medium.
  • the computer non-volatile readable storage medium may be a hard disk, a multimedia card, an SD card, a flash memory card, an SMC, a read-only memory (ROM), erasable programmable read-only memory (EPROM), portable compact disk read-only memory (CD-ROM), USB memory, etc., or any combination of several.
  • The computer non-volatile readable storage medium includes a tongue image extraction program and the like; when the tongue image extraction program 50 is executed by the processor 22, the following operations are implemented:
  • Use the LNMF (local non-negative matrix factorization) algorithm for training to obtain feature base images of different dimensions. For example, 1000 tongue images (that is, images that contain a tongue and reflect features such as its shape and color) are used as the training set, and the tongue images have been annotated in advance.
  • Preferably, each tongue image may first be compressed, for example to 56*64 pixels, then de-meaned and normalized, before the LNMF algorithm is trained to obtain feature base images of different dimensions.
  • The feature base images refer to the non-negative feature matrix W representing the characteristics of the tongue, and this non-negative feature matrix W constitutes a non-negative subspace.
  • LNMF is an improvement upon NMF. The dimension of the feature matrix W is n*r, and its r columns are the base images; the dimension of the weight matrix H is r*m, and each of its columns is an encoding that corresponds one-to-one to a tongue image in V. A training image can therefore be expressed as a linear combination of base images.
  • The EHMM model is used to identify whether the test image contains a face image, and if it does, feature extraction is performed on the test image.
  • The non-negative feature matrix W representing the characteristics of the tongue constitutes a non-negative subspace; the training images and the test images are each projected onto the non-negative subspace obtained from the training image set, yielding feature coefficients for each, and the nearest-neighbor criterion is used to compute the similarity between the feature coefficients corresponding to the training images and the test images, so as to extract the features in the test images.
  • If the similarity of the feature coefficients is higher than the set threshold, the feature base in the test image is considered to be a tongue, so that images with tongue features can be selected from the test images.
  • The test image is projected onto the non-negative subspace; the projection is equivalent to transforming the test image into the non-negative subspace, and the result is still an image, one composed of the learned features.
  • Different labels are used to mark the feature areas containing tongue features and the non-feature areas without them; the smallest borders containing the feature areas are determined by reading the labels, and the feature areas containing tongue features are segmented from the test image.
  • For example, the non-feature area is 0 and the feature area is non-zero; based on zero and non-zero, the image area representing the tongue feature can be segmented from each test image with a border.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Multimedia (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

A tongue image extraction method and device, and a computer readable storage medium. In the invention, an LNMF algorithm is used to perform training, and the matrix V corresponding to the training images is decomposed into the product of a non-negative feature matrix W and a weight matrix H; the dimensions of the non-negative feature matrix W are n*r, its r columns being feature base images, and the non-negative feature matrix W forming a non-negative subspace; the dimensions of the weight matrix H are r*m. The training images and the test images are projected onto the non-negative subspace to obtain a feature coefficient for each; the nearest-neighbor rule is used to compute the degree of similarity between the feature coefficients corresponding to the training images and the test images; and features in the test images for which the degree of similarity is higher than a threshold value are extracted, so that frames may be used to separate out from each test image the regions representing tongue features.

Description

Tongue image extraction method, device and computer readable storage medium
Cross-reference to related applications:
This application claims priority to the Chinese patent application No. 201910733855.8, filed with the China Patent Office on August 9, 2019 and titled "Tongue Image Extraction Method, Apparatus, and Computer-readable Storage Medium", the entire content of which is incorporated herein by reference.
Technical field
This application relates to a tongue image extraction method, device and computer readable storage medium.
Background
Existing tongue image detection methods usually adopt a target detection approach: a sliding window slides across the image horizontally and vertically, a CNN model extracts spatial features of the objects inside the window, and an SVM classifier classifies the extracted features to determine whether the window contains a tongue image. The coordinates of the four corner points of the sliding window are then output, and the position of the tongue image is calibrated with those coordinates. However, because the size and pose of the tongue vary widely between images, the size of the target frame is uncertain, so sliding recognition must be repeated with target frames of various sizes, which makes target detection complex to a certain degree.
Therefore, the inventor realized that how to quickly obtain a correctly posed, complete, and clear tongue image is an urgent problem to be solved.
Summary of the invention
According to various embodiments disclosed in this application, a tongue image extraction method is provided, which is applied to an electronic device, and the method includes the following steps:
S110: Convert the training images containing tongues into a matrix V, where all the non-negative gray values of one image correspond to one column of V, and train with the LNMF algorithm to decompose the matrix V into the product of a non-negative feature matrix W and a weight matrix H, that is, V = WH.
The dimension of the non-negative feature matrix W is n*r, and its r columns are feature base images; the feature base images refer to the non-negative feature matrix W representing tongue features, and the non-negative feature matrix W forms a non-negative subspace.
The dimension of the weight matrix H is r*m, and each of its columns is an encoding.
S120: Use the EHMM model to identify whether the test image contains a face image; if it does, project the training images and the test image onto the non-negative subspace to obtain feature coefficients for each, use the nearest-neighbor criterion to compute the similarity between the feature coefficients corresponding to the training images and the test image, and extract the tongue-representing features in the test image whose similarity exceeds the similarity threshold as tongue features.
S130: After projection, the feature areas containing tongue features and the non-feature areas without tongue features are identified with different labels; the label set corresponds to the boundary information of the feature area, and the extreme values in the up, down, left, and right directions are extracted from the boundary information to determine the border containing the feature area.
This application also provides a tongue image extraction device, including:
a matrix decomposition module, used to convert the training images containing tongues into a matrix V, where all the non-negative gray values of one image correspond to one column of V, and to train with the LNMF algorithm to decompose the matrix V into the product of a non-negative feature matrix W and a weight matrix H, that is, V = WH; the dimension of the non-negative feature matrix W is n*r, its r columns are the feature base images, the feature base images refer to the non-negative feature matrix W representing tongue features, and the non-negative feature matrix W constitutes a non-negative subspace; the dimension of the weight matrix H is r*m, and each of its columns is an encoding;
a tongue feature extraction module, which uses the EHMM model to identify whether the test image contains a face image and, if it does, projects the training images and the test image onto the non-negative subspace to obtain feature coefficients for each, uses the nearest-neighbor criterion to compute the similarity between the feature coefficients corresponding to the training images and the test image, and extracts the tongue-representing features in the test image whose similarity exceeds the similarity threshold as tongue features;
a tongue image segmentation module, which uses different labels to mark the feature areas containing tongue features and the non-feature areas without them; the label set corresponds to the boundary information of the feature area, and the extreme values in the up, down, left, and right directions are extracted from the boundary information to determine the border containing the feature area.
This application also provides an electronic device, which includes a memory and a processor; a tongue image extraction program is stored in the memory, and when the tongue image extraction program is executed by the processor, the following steps are implemented:
S110: Convert the training images containing tongues into a matrix V, where all the non-negative gray values of one image correspond to one column of V, and train with the LNMF algorithm to decompose the matrix V into the product of a non-negative feature matrix W and a weight matrix H, that is, V = WH.
The dimension of the non-negative feature matrix W is n*r, and its r columns are feature base images; the feature base images refer to the non-negative feature matrix W representing tongue features, and the non-negative feature matrix W forms a non-negative subspace.
The dimension of the weight matrix H is r*m, and each of its columns is an encoding.
S120: Use the EHMM model to identify whether the test image contains a face image; if it does, project the training images and the test image onto the non-negative subspace to obtain feature coefficients for each, use the nearest-neighbor criterion to compute the similarity between the feature coefficients corresponding to the training images and the test image, and extract the tongue-representing features in the test image whose similarity exceeds the similarity threshold as tongue features.
S130: After projection, the feature areas containing tongue features and the non-feature areas without tongue features are identified with different labels; the label set corresponds to the boundary information of the feature area, and the extreme values in the up, down, left, and right directions are extracted from the boundary information to determine the border containing the feature area.
In addition, a computer non-volatile readable storage medium is provided, which stores a computer program comprising program instructions that, when executed by a processor, implement any of the tongue image extraction methods described above.

The details of one or more embodiments of the present application are set forth in the drawings and description below. Other features and advantages of the application will become apparent from the description, the drawings, and the claims.
Description of the Drawings
The above features and technical advantages of the present application will become clearer and easier to understand from the following description of its embodiments with reference to the drawings.

Fig. 1 is a schematic flowchart of the tongue image extraction method according to an embodiment of the present application;

Fig. 2 is a first schematic diagram of the superstates and embedded states of the EHMM corresponding to the slices of an image according to an embodiment of the present application;

Fig. 3 is a second schematic diagram of the superstates and embedded states of the EHMM corresponding to the slices of an image according to an embodiment of the present application;

Fig. 4 is a third schematic diagram of the superstates and embedded states of the EHMM corresponding to the slices of an image according to an embodiment of the present application;

Fig. 5 is a schematic diagram of the hardware architecture of the electronic device according to an embodiment of the present application;

Fig. 6 is a block diagram of the tongue image extraction program according to an embodiment of the present application;

Fig. 7 is a schematic diagram of the linear regression model adjusting the frame according to an embodiment of the present application.
Detailed Description of the Embodiments
Embodiments of the tongue image extraction method, device, and computer non-volatile readable storage medium described in this application are described below with reference to the drawings.

In one embodiment, Fig. 1 is a schematic flowchart of the tongue image extraction method provided by an embodiment of the application, applied to an electronic device. The method comprises the following steps:
S110: train with the LNMF (local non-negative matrix factorization) algorithm to obtain feature basis images of different dimensions. For example, 1000 tongue images (images that contain a tongue and show its shape, color, and other characteristics) are used as the training image set, and the tongue images have been annotated in advance. Preferably, each tongue image is first compressed, for example to 56*64 pixels, then de-meaned and normalized, and the LNMF algorithm is trained to obtain feature basis images of different dimensions. A feature basis image is the non-negative feature matrix W representing the features of the tongue; W spans a non-negative subspace.

LNMF is an improvement on NMF. The LNMF algorithm decomposes the matrix V corresponding to the training images into the product of a feature matrix W and a weight matrix H, i.e. V = WH.

Here V is an n*m matrix, V = (V1, V2, ..., Vm); all the non-negative gray values of one image correspond to one column of V, so the data in V are the gray values of the training images.

The dimension of the feature matrix W is n*r, and its r columns are basis images.

The dimension of the weight matrix H is r*m, and each of its columns is an encoding that corresponds one-to-one to a tongue image in V; a training image can therefore be expressed as a linear combination of the basis images.
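For concreteness, the following is a minimal numpy sketch of the decomposition shapes; the sizes n, m, r and the random data are illustrative assumptions, not values taken from the patent:

```python
import numpy as np

# Illustrative shapes only: n pixels per image, m training images, r basis images.
n, m, r = 56 * 64, 1000, 49            # assumed values for illustration

images = np.random.rand(m, 56, 64)     # stand-in for the annotated tongue images
V = images.reshape(m, n).T             # each column of V is one flattened image; V is n*m

W = np.abs(np.random.rand(n, r))       # non-negative feature matrix, n*r (basis images)
H = np.abs(np.random.rand(r, m))       # weight matrix, r*m (one encoding column per image)

assert (W @ H).shape == V.shape        # after training, V is approximated by W H
```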
S120: use an EHMM model to identify whether the test image contains a face image; if so, perform feature extraction on the test image. Specifically, the non-negative feature matrix W representing the tongue features spans a non-negative subspace. The training images and the test image are projected onto the non-negative subspace obtained from the training image set, yielding feature coefficients for each; the nearest-neighbor criterion is used to compute the similarity between the feature coefficients of the training images and those of the test image, and the features representing the tongue whose coefficient similarity exceeds the set threshold are extracted as tongue features, thereby screening out the images with tongue features from the test images. The tongue features include the shape, angle, color, and coating state of the tongue, as well as the positional relationship between the tongue and the facial organs.
S130: project the test image onto the non-negative subspace. Projection amounts to transforming the test image into the non-negative subspace; the result is still an image, now composed of the learned features. After projection, the feature regions containing tongue features and the non-feature regions without tongue features carry different labels, so the feature regions containing tongue features can be segmented out of the test image. The label set carries the boundary information of the feature region, and extracting the extreme values in the up, down, left, and right directions from this boundary information determines the minimal frame enclosing the feature region; a sketch of this rule follows below. A minimal frame is used because linear regression will later adjust its position to eliminate or reduce the position error. Non-feature regions and feature regions have different labels, for example 0 for non-feature regions and non-zero for feature regions; based on zero versus non-zero, the image region representing the tongue features can be segmented out of each test image with a frame. The method further comprises step S140: classify the features extracted from the test image with SVM classifiers, sending the extracted features to k SVM classifiers for recognition, where k equals the number of classes. The classes may be, for example, "tongue" and "non-tongue", or classes based on the pathological condition of the tongue, where a person's bodily condition may include damp-heat, yin deficiency, normal, excess heat, obstructed qi and blood, and blood stasis; the class with the highest score among the k SVM classifiers is taken as the classification result.
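As a sketch of the extreme-value rule for the minimal frame, assuming the projected labels are available as a 2-D integer mask (a representation the patent does not specify):

```python
import numpy as np

def minimal_bbox(label_mask: np.ndarray):
    """Smallest frame enclosing the feature region, where non-zero labels
    mark tongue-feature pixels and 0 marks non-feature pixels."""
    rows, cols = np.nonzero(label_mask)
    if rows.size == 0:
        return None                       # no tongue features found
    top, bottom = rows.min(), rows.max()  # extreme values in the up/down direction
    left, right = cols.min(), cols.max()  # extreme values in the left/right direction
    return left, top, right, bottom
```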
In one embodiment, the method further comprises step S150: adjust the frame position of the tongue image with a linear regression model. For each class, e.g. damp-heat, yin deficiency, normal, excess heat, obstructed qi and blood, and blood stasis, a separate linear regression model is trained; the input is the features of the image inside the frame, and the output is the translation values (horizontal and vertical) and scaling values of the frame. The translation and scaling values of the frame are computed by the linear regression model, and a loss function constrains the position error of the frame, so that the frame is continually adjusted to a suitable position.
As shown in Fig. 7, the linear regression model is given the original position P = (P_x, P_y, P_w, P_h), where P_x and P_y are the coordinates of the frame and P_w and P_h are its width and height. A mapping f is obtained by machine learning such that

f(P_x, P_y, P_w, P_h) = (Ĝ_x, Ĝ_y, Ĝ_w, Ĝ_h)

and the predicted position (Ĝ_x, Ĝ_y, Ĝ_w, Ĝ_h) ≈ the true position (G_x, G_y, G_w, G_h).
Assuming the translation is (Δx, Δy) with Δx = P_w·d_x(P) and Δy = P_h·d_y(P), then

Ĝ_x = P_w·d_x(P) + P_x    (1)

Ĝ_y = P_h·d_y(P) + P_y    (2)

Assuming the scaling is (S_w, S_h) with S_w = exp(d_w(P)) and S_h = exp(d_h(P)), then

Ĝ_w = P_w·exp(d_w(P))    (3)

Ĝ_h = P_h·exp(d_h(P))    (4)

Frame regression is learning accurate values of the four transformations d_x(P), d_y(P), d_w(P), d_h(P).
The input is P = (P_x, P_y, P_w, P_h) and the output is the predicted position (Ĝ_x, Ĝ_y, Ĝ_w, Ĝ_h). Transforming the original position into the true position G requires the true transformation values t_* = (t_x, t_y, t_w, t_h), where the true translation is (t_x, t_y) and the true scaling of the width and height is (t_w, t_h):

t_x = (G_x − P_x)/P_w    (5)

t_y = (G_y − P_y)/P_h    (6)

t_w = log(G_w/P_w)    (7)

t_h = log(G_h/P_h)    (8)
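Equations (5) to (8) can be computed directly; the following sketch assumes frames are given as (x, y, w, h) tuples, a convention not fixed by the patent:

```python
import math

def regression_targets(P, G):
    """True transformation t* = (t_x, t_y, t_w, t_h) from a proposed
    frame P = (Px, Py, Pw, Ph) to the ground truth G, per Eqs. (5)-(8)."""
    Px, Py, Pw, Ph = P
    Gx, Gy, Gw, Gh = G
    tx = (Gx - Px) / Pw        # Eq. (5)
    ty = (Gy - Py) / Ph        # Eq. (6)
    tw = math.log(Gw / Pw)     # Eq. (7)
    th = math.log(Gh / Ph)     # Eq. (8)
    return tx, ty, tw, th
```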
The objective function is constructed as

d_*(P) = w_*ᵀ·K(P)

where w_* is the parameter to be learned (* stands for x, y, w, h, i.e. one objective function is set for each transformation), d_*(P) is the resulting transformation prediction, and K(P) is the feature vector of the feature region. To minimize the gap between the predicted transformation and the true transformation t_* = (t_x, t_y, t_w, t_h), the loss function Loss is constructed and minimized:

Loss = Σ_{i=1..N} ( t_*^i − w_*ᵀ·K(P^i) )²

where i indexes the i-th training sample and N is the number of samples.

Training on the samples minimizes the loss function and yields w_*, from which d_*(P), i.e. the values d_x(P), d_y(P), d_w(P), d_h(P), is obtained.
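A minimal sketch of this fit, assuming ordinary least squares (the patent does not name a solver) and feature vectors K(P) stacked row-wise:

```python
import numpy as np

def fit_bbox_regressor(K, t):
    """Learn one weight vector w* by minimizing
    sum_i (t_i - w*^T K(P_i))^2 over N samples.
    K: (N, d) matrix of feature vectors; t: (N,) true transformation values."""
    w, *_ = np.linalg.lstsq(K, t, rcond=None)
    return w

# One regressor per transformation: w_x = fit_bbox_regressor(K, t_x), and
# likewise for t_y, t_w, t_h.
```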
In one embodiment: NMF is a subspace projection method, but since the features extracted by the NMF algorithm are global, it places no constraint on the locality of the feature space. To strengthen the localization of the principal components of the feature matrix W, LNMF emphasizes the localization of the basic feature components in the decomposition of the original image. The LNMF algorithm is formulated as follows:
The objective function is constructed as

D(V‖WH) = Σ_{l=1..n} Σ_{i=1..m} [ V_{li}·ln( V_{li}/(WH)_{li} ) − V_{li} + (WH)_{li} ] + α·Σ_{j,k} (WᵀW)_{jk} − β·Σ_{j} (HHᵀ)_{jj}

where α and β are positive constants;

V, W, H ≥ 0;

‖W_j‖ = 1, where W_j denotes the j-th column vector of the feature basis matrix W, meaning every column of W is normalized;

V = [V_1, V_2, ..., V_i, ..., V_m] denotes the set of m training images, the column vector V_i denotes the i-th training image, and V_{ij} denotes the j-th gray value of the i-th image; each training image has size n, and V has size n*m.

W = [W_1, W_2, ..., W_j, ..., W_r] is the feature matrix, of size n*r;

H = [H_1, H_2, ..., H_j, ..., H_m] is the weight matrix, H_j being the j-th column vector of H, of size r*m.

W and H are updated iteratively by the following rules to minimize the objective function:

H_{ji} ← sqrt( H_{ji} · Σ_l W_{lj}·V_{li}/(WH)_{li} )

W_{lj} ← W_{lj} · ( Σ_i V_{li}·H_{ji}/(WH)_{li} ) / ( Σ_i H_{ji} )

W_{lj} ← W_{lj} / Σ_l W_{lj}

where i = 1, 2, ..., m; j = 1, 2, ..., r; l = 1, 2, ..., n, and W and H remain non-negative throughout the iteration.
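A numpy sketch of one round of these multiplicative updates; the small epsilon is an implementation detail added for numerical stability, not part of the formulas above:

```python
import numpy as np

def lnmf_step(V, W, H, eps=1e-9):
    """One iteration of the LNMF updates: refresh the encodings H,
    then the basis W, then renormalize each column of W."""
    WH = W @ H + eps
    H = np.sqrt(H * (W.T @ (V / WH)))                    # H_ji update
    WH = W @ H + eps
    W = W * ((V / WH) @ H.T) / (H.sum(axis=1) + eps)     # W_lj update
    return W / (W.sum(axis=0, keepdims=True) + eps), H   # column normalization
```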
In one embodiment, both the training images and the test images are first binarized. Binarization means setting the gray value of every pixel in the image to either 0 or 255, i.e. rendering the whole image in plain black and white, which allows the tongue region to be obtained more precisely. Specifically, the gray value of each pixel is set to 0 or 255 according to a set gray-level threshold; the threshold may be the midpoint of 0 to 255, with values below the threshold set to 0 and values greater than or equal to it set to 255.
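A one-line sketch of this thresholding, using 128 as the midpoint of 0 to 255:

```python
import numpy as np

def binarize(gray: np.ndarray, threshold: int = 128) -> np.ndarray:
    """Pixels below the gray-level threshold become 0, the rest 255."""
    return np.where(gray < threshold, 0, 255).astype(np.uint8)
```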
In one embodiment, the test images are first classified with an EHMM (embedded hidden Markov model); specifically, classifying the images into the two categories "face" and "no face" improves recognition accuracy. The classification process comprises the following steps:

Select multiple feature points of the face to form a feature sequence.
Input the test image into the EHMM model. The EHMM model scans the test image from top to bottom and from left to right with a moving window. It first scans from left to right; each window position yields a set of feature vectors, a feature extraction of the face region at that position. After the feature vector of a window is computed, the window moves right by a fixed step and feature extraction continues; on reaching the right edge of the image, it moves to the next row and scans again from left to right. When the window reaches the lower right of the image, the scan ends and multiple sets of feature vectors have been obtained; together they form the observation sequence.
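A sketch of this raster scan; the window size and step are illustrative assumptions, and raw pixels stand in for the per-window features, which the patent leaves open:

```python
import numpy as np

def scan_windows(image, win=(8, 10), step=(4, 4)):
    """Scan the image top-to-bottom, left-to-right with a fixed-size
    window, yielding one feature vector per window position."""
    h, w = image.shape
    wh, ww = win
    rows = []
    for y in range(0, h - wh + 1, step[0]):           # top to bottom
        row = [image[y:y + wh, x:x + ww].ravel()      # left to right
               for x in range(0, w - ww + 1, step[1])]
        rows.append(np.stack(row))
    return rows    # observation sequence: one array of vectors per scan line
```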
The EHMM model contains a set of superstates; the number of superstates equals the number of vertical slices of the face, and each superstate encapsulates a set of embedded states whose number equals the number of horizontal slices of the face. The EHMM model scans the image with a fixed-size window from left to right and top to bottom; the facial features correspond to superstates from top to bottom and to embedded states from left to right. As shown in Fig. 2, the vertical superstates correspond to image slices of the forehead region, eye region, nose region, mouth region, and chin region. Viewed from top to bottom, the positional relationship of these regions is fixed; that is what all faces have in common. The individuality of a face in the vertical direction is reflected by the characteristics of each superstate (each region) and by the relations among the superstates. Viewed from left to right, the face is divided into the left face, left eye, between the eyes, right eye, and right face; this positional relationship is likewise fixed, and the individuality of a face in the horizontal direction is reflected by each embedded state and the relations among the embedded states.
The forward algorithm is used to compute the probability that the observation sequence matches the feature sequence formed by the multiple feature points of the face; if this probability exceeds the decision threshold, the tested image is considered to contain a face.
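For reference, a sketch of the forward algorithm for a flat HMM in the log domain; the EHMM applies the same recursion at two nested levels, which is omitted here for brevity:

```python
import numpy as np

def forward_log_likelihood(pi, A, log_b):
    """Forward algorithm for a flat HMM.
    pi: (S,) initial state probabilities; A: (S, S) transition matrix;
    log_b: (T, S) log-probability of each observation under each state."""
    alpha = np.log(pi + 1e-300) + log_b[0]
    for t in range(1, log_b.shape[0]):
        m = alpha.max()                                   # log-sum-exp guard
        alpha = np.log(np.exp(alpha - m) @ A + 1e-300) + m + log_b[t]
    m = alpha.max()
    return m + np.log(np.exp(alpha - m).sum())            # log P(observations)
```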
In one embodiment, the training process of the EHMM model is as follows:
1) EHMM modeling: an EHMM can be defined as the triplet λ = (P_0, A_0, Λ), where Λ = {λ^(k), 1 ≤ k ≤ N_0} is the set of embedded HMMs of the superstates. The basic elements of the EHMM model include:

(1) the initial superstate probabilities P_0 = {π_{0,i}}, where π_{0,i} is the probability of superstate i at time 0, 1 ≤ i ≤ N_0, and N_0 is the number of superstates;

(2) the superstate transition matrix A_0 = {a_{0,ij}}, where a_{0,ij} is the probability of moving from superstate i to superstate j; in a left-to-right EHMM the only permitted transition is from a state to its successor, so the probability of transitioning back to a previous state is 0;
(3) λ^(k) = (Π_1^(k), A_1^(k), B^(k)), the parameter set of the k-th superstate, 1 ≤ k ≤ N_0;

where Π_1^(k) is the initial probability distribution of the embedded states;

A_1^(k) is the embedded-state transition probability matrix;

B^(k) is the observation probability matrix, in which b_j^(k)(O_{t0,t1}) denotes the probability that embedded state j of superstate k generates the observation O_{t0,t1}, the two indices t0 and t1 corresponding to the vertical and horizontal dimensions respectively, modeled by a mixture of Gaussians:

b_j^(k)(O_{t0,t1}) = Σ_{m=1..M_j^(k)} c_{jm}^(k) · N( O_{t0,t1}; μ_{jm}^(k), Σ_{jm}^(k) )

where M_j^(k) is the number of Gaussian mixtures, c_{jm}^(k) is the mixture coefficient of the m-th mixture component of embedded state j of superstate k, and N(O; μ_{jm}^(k), Σ_{jm}^(k)) is the Gaussian density with mean vector μ_{jm}^(k) and covariance matrix Σ_{jm}^(k).
2) Image segmentation: the training images are segmented uniformly; the observation sequence obtained from an image is divided evenly into N_0 vertical slices corresponding to the vertical superstates, and each vertical slice can in turn be divided from left to right among the embedded states.

3) Parameter initialization: after segmentation, the initial values of the model parameters are obtained from the state initialization and transition probabilities. K-means clustering is used to compute the observation probabilities for each state of the EHMM, where K is the number of Gaussian distributions per state; the observation probability density function of the observation vectors extracted in every embedded state can then be explained by the Gaussian mixture model. The state initialization rule for each superstate is: the initialization probability of the first state of each EHMM is set to 1.0 and that of all other states to 0.

4) Embedded Viterbi segmentation: after the first iteration step, the doubly embedded Viterbi algorithm replaces uniform segmentation; from the new segmentation and the event frequency counts, a new set of initialization and transition probabilities is determined.

5) Segmental K-means clustering: based on the segmentation result of step 4, K-means clustering computes the observation vectors corresponding to the new states and the new observation probability density functions; in the next iteration these values serve as the initial values for a new round of doubly embedded Viterbi segmentation.

6) Steps 4 and 5 are repeated until the change between successive iterations is smaller than the set convergence threshold.
In one embodiment, as shown in Fig. 3, the vertical slices of the face also include a hair region. Although not everyone has hair, the hair region provides an additional facial feature and does help recognize faces more precisely. The computation is essentially the same as above and is not repeated here.
In one embodiment, the description above maps vertical slices of the whole face to superstates and horizontal slices to embedded states. However, since the purpose of this application is tongue image extraction, face recognition may also use only some of the vertical slices for the superstates; for example, as shown in Fig. 4, only the chin and mouth regions correspond to superstates. There is then no need to recognize the other regions of the face; training can still identify whether a face is present, and the amount of computation is reduced. The computation is essentially the same as above and is not repeated here.
In one embodiment, after the embedded hidden Markov model has classified the test images into face and non-face, the face images are classified further, including recognizing the gender and age in the image. Although a person's age and gender cannot be determined accurately from the state of the tongue, the condition of the tongue is correlated with age and gender. For example, the number of taste buds (the papillae distributed over the tongue) differs across age groups and tends to decrease: a child has roughly 10,000 taste buds, the cells age gradually over time, and in old age only about 20% of the childhood taste buds remain. Moreover, the younger the person, the more tender and the redder the tongue; the older the person, the darker the tongue; and a woman's tongue is usually smaller than a man's. The test images can therefore first be classified by age and gender, and the tongue-region images then extracted from the images within each age-gender category. The images in a category are more strongly associated with the tongues that the category should actually contain; for example, if the tongue in an image belongs to an elderly person, the image is classified into the older category, and its tongue shows the characteristics an elderly tongue should have, such as fewer taste buds (recognized only approximately, of course) and a dull color, so it can be recognized faster, which amounts to reducing the model's computation. Of course, this requires training an LNMF model for the corresponding age group in advance: the images in the training set are first classified by age and gender, the images in each category are annotated as containing or not containing a tongue, forming age-group/gender/tongue annotations, and a separate LNMF model is trained for each category.
In detail: first, a CNN (convolutional neural network) model is obtained and trained to recognize the gender and age of a face. Suppose six categories are set by age group and gender, e.g. 0-20 male, 20-40 male, 40-70 male, 0-20 female, 20-40 female, and 40-70 female; the CNN is used to classify the training images into these six age-gender categories.

Then the images in the six categories are annotated; the label of each image is its age group, gender, and tongue annotation.

Then an LNMF model is trained for each of the six categories, yielding six LNMF models, one per category. For example, the LNMF model for 0-20 male tongues is used to recognize 0-20 male tongues, and the LNMF model for 20-40 female tongues is used to recognize 20-40 female tongues.

Then the previously trained CNN model is used to recognize the gender and age of the test images, and the test images are likewise classified by gender and age group into the same categories as the training images.

Then the tongue image region is extracted by the LNMF model trained for the corresponding age group, gender, and tongue.
Since each LNMF model is trained specifically to recognize tongue images of the corresponding age group and gender, and the test images are divided into the corresponding age-gender categories whose tongues share certain characteristics, recognition by the LNMF model is made easier. Moreover, dividing the test images into multiple categories recognized in parallel speeds up recognition. In addition, since each tongue image is annotated with an age group and gender, this helps the accuracy and speed of the later classification into damp-heat, yin deficiency, normal, excess heat, obstructed qi and blood, and blood stasis.
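A sketch of this routing; the classifier interface, the `extract_region` method, and the model container are assumptions for illustration, not interfaces defined by the patent:

```python
CATEGORIES = ["0-20-male", "20-40-male", "40-70-male",
              "0-20-female", "20-40-female", "40-70-female"]

def extract_tongue(image, age_gender_cnn, lnmf_models):
    """Route the test image to the LNMF model trained for its
    age-gender category, then extract the tongue region with it."""
    category = age_gender_cnn(image)                 # returns one of CATEGORIES
    return lnmf_models[category].extract_region(image)
```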
Refer to Fig. 5, a schematic diagram of the hardware architecture of an embodiment of the electronic device of the present application. In this embodiment, the electronic device 2 is a device capable of performing numerical computation and/or information processing automatically according to preset or stored instructions, for example a smartphone, tablet computer, notebook computer, desktop computer, or server. As shown in Fig. 5, the electronic device 2 comprises at least, but is not limited to, a memory 21, a processor 22, and a network interface 23, which can be communicatively connected to one another via a system bus. The memory 21 comprises at least one type of computer non-volatile readable storage medium, including flash memory, hard disks, multimedia cards, memory, magnetic disks, optical disks, and the like. In some embodiments the memory 21 is an internal storage unit of the electronic device 2, such as its hard disk or internal memory; in other embodiments it may be an external storage device of the electronic device 2, such as a plug-in hard disk, smart media card (SMC), secure digital (SD) card, or flash card fitted to the electronic device 2. In this embodiment the memory 21 is generally used to store the operating system and the various application software installed on the electronic device 2, such as the tongue image extraction program code.
The processor 22 may in some embodiments be a central processing unit (CPU), controller, microcontroller, microprocessor, or other data processing chip. In this embodiment the processor 22 is used to run the program code stored in the memory 21 or to process data, for example to run the tongue image extraction program.

The network interface 23 may comprise a wireless or wired network interface and is generally used to establish communication connections between the electronic device 2 and other electronic devices.

Optionally, the electronic device 2 may further comprise a display, which may also be called a display screen or display unit.
The memory 21 containing the readable storage medium may include an operating system, a tongue image extraction program 50, and so on. When the processor 22 executes the tongue image extraction program 50 in the memory 21, the steps S110 to S130 described above are implemented and are not repeated here. In this embodiment, the tongue image extraction program 50 stored in the memory 21 can be divided into one or more program modules, which are stored in the memory 21 and executed by one or more processors (in this embodiment, the processor 22) to complete the application. For example, Fig. 6 shows a schematic diagram of the program modules of the tongue image extraction program; in this embodiment the tongue image extraction program 50 can be divided into a matrix decomposition module 501, a tongue feature extraction module 502, and a tongue image segmentation module 503. The following description introduces the specific functions of these program modules.
The matrix decomposition module 501 is used to train with the LNMF (local non-negative matrix factorization) algorithm to obtain feature basis images of different dimensions. For example, 1000 tongue images (images that contain a tongue and show its shape, color, and other characteristics) are used as the training image set, and the tongue images have been annotated in advance.

LNMF is an improvement on NMF. The LNMF algorithm decomposes the matrix V corresponding to the training images into the product of a feature matrix W and a weight matrix H, i.e. V = WH.

Here V is an n*m matrix, V = (V1, V2, ..., Vm); all the non-negative gray values of one image correspond to one column of V, so the data in V are the gray values of the training images.

The dimension of the feature matrix W is n*r, and its r columns are basis images.

The dimension of the weight matrix H is r*m, and each of its columns is an encoding that corresponds one-to-one to a tongue image in V; a training image can therefore be expressed as a linear combination of the basis images.

NMF is a subspace projection method, but since the features it extracts are global, it places no constraint on the locality of the feature space. To strengthen the localization of the principal components of the feature matrix W, LNMF emphasizes the localization of the basic feature components in the decomposition of the original image. The LNMF formulas are as given above and are not repeated here.
The tongue feature extraction module 502 uses the EHMM model to identify whether the test image contains a face image and, if so, performs feature extraction on the test image. Specifically, the non-negative feature matrix W representing the tongue features spans a non-negative subspace; the training images and the test image are projected onto the non-negative subspace obtained from the training image set, yielding feature coefficients for each, and the nearest-neighbor criterion is used to compute the similarity between the feature coefficients of the training images and those of the test image, thereby extracting the features in the test image. That is, if the similarity of the feature coefficients is above the set threshold, the feature basis in the test image is a tongue, so the images with tongue features are screened out of the test images.
The tongue image segmentation module 503 is used to project the test image onto the non-negative subspace. Projection amounts to transforming the test image into the non-negative subspace; the result is still an image, now composed of the learned features, in which non-feature regions and feature regions carry different labels, for example 0 for non-feature regions and non-zero for feature regions. Based on zero versus non-zero, the image region representing the tongue features can be segmented out of each test image with a frame.
In one embodiment, a classification module 504 is further included, which classifies the features extracted from the test image with SVM classifiers, sending the extracted features to k SVM classifiers for recognition, where k equals the number of classes. The classes may be, for example, "tongue" and "non-tongue", or classes based on the pathological condition of the tongue, yielding the framed tongue image. Specifically, classification may follow the tongue image characteristics corresponding to different bodily conditions, which may include damp-heat, yin deficiency, normal, excess heat, obstructed qi and blood, and blood stasis; the class with the highest score among the k SVM classifiers is taken as the classification result.
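A sketch of the winner-takes-all decision over the k per-class SVMs; the `decision_function` interface follows the scikit-learn convention and is an assumption here, not an interface named by the patent:

```python
import numpy as np

def classify(feature, svms, classes):
    """Score the feature with one SVM per class and return the class
    whose classifier gives the highest score."""
    scores = [svm.decision_function([feature])[0] for svm in svms]
    return classes[int(np.argmax(scores))]
```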
In one embodiment, a frame adjustment module 505 is further included, which adjusts the frame position of the tongue image with a linear regression model. For each class, e.g. damp-heat, yin deficiency, normal, excess heat, obstructed qi and blood, and blood stasis, a separate linear regression model is trained; the input is the features of the image inside the frame, and the output is the translation values (horizontal and vertical) and scaling values of the frame. The translation and scaling values of the frame are computed by the linear regression model, and a loss function constrains the position error of the frame, so that the frame is continually adjusted to a suitable position.
In one embodiment, a binarization module 506 is further included, which first binarizes both the training images and the test images (i.e. sets the gray value of each pixel to 0 or 255, rendering the whole image in plain black and white). Because a color image (e.g. an RGB image) obtains its range of colors by varying and superimposing the red (R), green (G), and blue (B) channels, the tongue region obtained from it contains many holes (missed areas), whereas a black-and-white image has only a single channel; a single channel is more conducive to optimizing the model than three channels and allows the tongue region to be obtained more precisely.
In one embodiment, a face recognition module 507 is further included, which first classifies the test images with an EHMM (embedded hidden Markov model); specifically, classifying the images into the two categories "face" and "no face" improves recognition accuracy. Face recognition with the EHMM model comprises the following steps:

Select multiple feature points of the face to form a feature sequence.

Input the test image into the EHMM model, which scans the test image from top to bottom and from left to right with a moving window. The EHMM model contains a set of superstates whose number equals the number of vertical slices of the face, and each superstate encapsulates a set of embedded states whose number equals the number of horizontal slices of the face. The EHMM model scans the image with a fixed-size window from left to right and top to bottom (the facial features correspond to superstates from top to bottom and to embedded states from left to right). For example, it first scans from left to right; each window position yields a set of feature vectors, a feature extraction of the face region at that position. After the feature vector of a window is computed, the window moves right by a fixed step and feature extraction continues; on reaching the right edge of the image, it moves to the next row and scans again from left to right. When the window reaches the lower right of the image, the scan ends and multiple sets of feature vectors have been obtained; together they form the observation sequence.

The forward algorithm is used to compute the probability that the observation sequence matches the feature sequence formed by the multiple feature points of the face; if this probability exceeds the decision threshold, the tested image is considered to contain a face.
In one embodiment, the training process of the EHMM model is the same as steps 1) to 6) described above and is not repeated here.
In one embodiment, a reclassification module 508 is further included, which, after the embedded hidden Markov model has classified the test images into face and non-face, further classifies the face images, including recognizing the gender and age in the image.
The specific process is the same as described above: the trained CNN model classifies the training and test images into the six age-gender categories, a separate LNMF model is trained for each category on images labeled with age group, gender, and tongue, and the tongue image region is extracted by the LNMF model corresponding to each category; this is not repeated here.
In addition, an embodiment of the present application also provides a tongue image extraction device comprising a matrix decomposition module 501, a tongue feature extraction module 502, and a tongue image segmentation module 503.

The matrix decomposition module 501 is used to convert training images containing tongues into a matrix V, in which all the non-negative gray values of one image correspond to one column of V, and to train with the LNMF algorithm, decomposing V into the product of a non-negative feature matrix W and a weight matrix H, i.e. V = WH. The dimension of the non-negative feature matrix W is n*r, and its r columns are feature basis images; a feature basis image is the non-negative feature matrix W representing tongue features, and W spans a non-negative subspace. The dimension of the weight matrix H is r*m, and each of its columns is an encoding.

The tongue feature extraction module 502 uses the EHMM model to identify whether the test image contains a face image and, if so, projects the training images and the test image onto the non-negative subspace to obtain their feature coefficients, uses the nearest-neighbor criterion to compute the similarity between the feature coefficients of the training images and those of the test image, and extracts the features from test images whose similarity exceeds the similarity threshold as tongue features.

The tongue image segmentation module 503 distinguishes the feature regions containing tongue features from the non-feature regions without tongue features by their different labels, and determines the minimal frame enclosing the feature region by reading the labels, thereby segmenting the feature region representing the tongue features out of the test image.
在其中一个实施例中,还包括分类模块504,分类模块504用于使用SVM分类器对测试图像中提取的特征进行分类,将提取到的特征送到k个svm分类器中识别,k的取值与类别数相等。In one of the embodiments, a classification module 504 is further included. The classification module 504 is used to classify the features extracted from the test image using an SVM classifier, and send the extracted features to k svm classifiers for identification, and the value of k The value is equal to the number of categories.
在其中一个实施例中,还包括边框调整模块505,边框调整模块505用于通过线性回归模型调整舌头图像的边框位置,对于每一个类,例如,湿热、阴虚、正常、热盛、气血不通、血瘀分别训练一个线性回归模型,输入为边框中的图像的特征,而输出为边框的平移(左右平移和上下平移)值、缩放值。通过线性回归模型计算得到边框的平移和缩放值,并利用损失函数约束边框的位置误差,从而不断调整边框移动到合适的位置。In one of the embodiments, a frame adjustment module 505 is further included. The frame adjustment module 505 is used to adjust the frame position of the tongue image through a linear regression model. For each category, for example, damp heat, yin deficiency, normal, heat and blood, Train a linear regression model for the unreasonable and blood stasis respectively. The input is the characteristics of the image in the frame, and the output is the translation (left-right translation and up-down translation) value and zoom value of the border. The linear regression model is used to calculate the translation and zoom values of the border, and the loss function is used to constrain the position error of the border, so as to continuously adjust the border to move to a suitable position.
在其中一个实施例中,还包括二值化模块506,二值化模块506用于对训练图像和测试图像都先进行二值化。In one of the embodiments, a binarization module 506 is further included. The binarization module 506 is used to perform binarization on both the training image and the test image.
In one embodiment, the device further includes a face recognition module 507, which first classifies the test images with an EHMM (embedded hidden Markov model); specifically, the images are sorted into two categories, "contains a face" and "contains no face", which improves the recognition accuracy.
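The likelihood score behind this face/no-face decision comes from the HMM forward algorithm. The sketch below shows only the flat forward pass for a discrete-emission HMM; an EHMM nests a second such recursion inside each super-state, which is omitted here for brevity:

```python
import numpy as np

def forward_log_likelihood(pi, A, B, obs):
    """log P(obs | model) for a discrete HMM.
    pi: (S,) initial state probabilities; A: (S, S) transition matrix;
    B: (S, K) emission probabilities over K symbols; obs: symbol indices."""
    alpha = pi * B[:, obs[0]]
    log_p = 0.0
    for t in obs[1:]:
        alpha = (alpha @ A) * B[:, t]
        scale = alpha.sum()           # rescale to avoid numeric underflow
        log_p += np.log(scale)
        alpha /= scale
    return log_p + np.log(alpha.sum())

# Decision rule of the description: the image is taken to contain a face
# when this likelihood exceeds the decision threshold.
```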
In addition, an embodiment of the present application further provides a computer non-volatile readable storage medium, which may be any one of, or any combination of, a hard disk, a multimedia card, an SD card, a flash memory card, an SMC, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a portable compact disc read-only memory (CD-ROM), a USB memory, and the like. The computer non-volatile readable storage medium stores a tongue image extraction program 50 and the like; when executed by the processor 22, the tongue image extraction program 50 implements the following operations:
S110: Train with the LNMF (local non-negative matrix factorization) algorithm to obtain feature basis images of different dimensions. For example, 1000 tongue images (images that contain a tongue and show its shape, color, and other characteristics) are used as the training image set, and the tongue images have been annotated in advance. Preferably, each tongue image may first be compressed, for example to 56*64 pixels, and de-meaned and normalized; the LNMF algorithm is then trained to obtain feature basis images of different dimensions, where a feature basis image refers to the non-negative feature matrix W representing the tongue's features, and this non-negative feature matrix W spans a non-negative subspace.
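The compression, de-meaning, and normalization of a single training image might look as follows; Pillow is an assumed imaging library here, and applying the column-wise scheme per image mirrors the fact that each image is one column of V:

```python
import numpy as np
from PIL import Image

def preprocess(path, size=(56, 64)):
    """Compress a tongue image to 56*64 pixels, de-mean it, and min-max
    normalize it (one column of V, per the description and claim 5)."""
    gray = np.asarray(Image.open(path).convert('L').resize(size), float)
    x = gray.ravel()
    x = x - x.mean()                              # subtract the column mean
    return (x - x.min()) / (x.max() - x.min())    # (x - min) / (max - min)
```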
LNMF is an improvement on NMF; the LNMF algorithm factorizes the matrix V corresponding to the training images into the product of a feature matrix W and a weight matrix H, i.e., V=WH.
V is an n*m matrix, V=(V1, V2, ..., Vm); all of the non-negative gray values of one image correspond to one column of V, so the data in V are the gray values of the training images.
The feature matrix W has dimensions n*r, and its r columns are the basis images.
The weight matrix H has dimensions r*m; each of its columns is an encoding that corresponds one-to-one with a tongue image in V, so a training image can be expressed as a linear combination of the basis images.
S120: Use the EHMM model to determine whether the test image contains a face image; if it does, perform feature extraction on the test image. Specifically, the non-negative feature matrix W representing the tongue's features spans a non-negative subspace; the training images and the test image are each projected onto the non-negative subspace obtained from the training image set to obtain their respective feature coefficients, and the nearest-neighbor criterion measures the similarity between the feature coefficients of the training images and those of the test image, from which the features in the test image are extracted. That is, if the similarity of the feature coefficients exceeds the set threshold, the feature basis present in the test image is a tongue, so images with tongue features can be selected from among the test images.
S130: Project the test image onto the non-negative subspace; the projection transforms the test image into the non-negative subspace, where it is still an image, namely an image composed of the learned features. Different labels mark the feature regions containing tongue features and the non-feature regions without them, and reading the labels determines the minimum bounding box enclosing a feature region, so the feature region containing tongue features is segmented from the test image. For example, if non-feature regions are labeled 0 and feature regions non-zero, the image region representing the tongue features can be cut out of each test image with a bounding box according to the 0/non-zero labels.
The specific implementation of the computer non-volatile readable storage medium of the present application is substantially the same as that of the tongue image extraction method and the electronic device 2 described above, and is not repeated here.
The above are merely preferred embodiments of the present application and are not intended to limit it; for those skilled in the art, various modifications and variations of the present application are possible. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present application shall fall within its scope of protection.

Claims (20)

  1. A tongue image extraction method, applied to an electronic device, characterized in that the method comprises the following steps:
    S110: converting training images containing tongues into a matrix V, where all of the non-negative gray values of one image correspond to one column of V, and training with the LNMF algorithm to factorize the matrix V into the product of a non-negative feature matrix W and a weight matrix H, i.e., V=WH;
    the non-negative feature matrix W having dimensions n*r, its r columns being feature basis images, where a feature basis image refers to the non-negative feature matrix W representing tongue features, and the non-negative feature matrix W spans a non-negative subspace;
    the weight matrix H having dimensions r*m, each of its columns being an encoding;
    S120: using an EHMM model to determine whether a test image contains a face image and, if so, projecting the training images and the test image onto the non-negative subspace to obtain their respective feature coefficients, measuring the similarity between the feature coefficients of the training images and those of the test image with the nearest-neighbor criterion, and extracting the tongue-representing features of test images whose similarity exceeds a similarity threshold as tongue features;
    S130: after the projection, marking the feature regions containing tongue features and the non-feature regions without them with different labels, where the label set corresponds to the boundary information of the feature regions, and extracting the extreme values in the up, down, left, and right directions from the boundary information to determine the bounding box enclosing the feature region.
  2. The tongue image extraction method according to claim 1, characterized in that
    the method further comprises step S150: computing the translation and scaling values of the bounding box with a linear regression model, constraining the position error of the box with a loss function, and adjusting the box until it moves to a suitable position.
  3. The tongue image extraction method according to claim 1, characterized in that
    before step S110, both the training images and the test images are first binarized, the gray value of every pixel being set to only 0 or 255 according to a set gray-value threshold.
  4. The tongue image extraction method according to claim 1, characterized in that
    using the EHMM model to recognize the images containing faces among the test images comprises the following steps:
    selecting a plurality of feature points of a face to form a feature sequence;
    inputting a test image into the EHMM model, which scans it from top to bottom and from left to right with a moving window to obtain multiple groups of feature vectors, the multiple groups of feature vectors forming an observation sequence;
    using the forward algorithm to compute the probability that the observation sequence is similar to the feature sequence formed by the plurality of feature points of the face, the detected image being considered to contain a face when the similarity probability exceeds a decision threshold, wherein the EHMM model comprises a set of super-states whose number equals the number of vertical slices of the face, each super-state encapsulating a corresponding set of embedded states whose number equals the number of horizontal slices of the face.
  5. The tongue image extraction method according to claim 1, characterized in that, in S110, the training images are first compressed, de-meaned, and normalized, and the LNMF algorithm is then trained to obtain the feature basis images, wherein
    de-meaning subtracts from every element of each column of the matrix V the mean of that column; and
    normalization sets every element of a column to the ratio of the difference between that element and the column minimum to the difference between the column maximum and the column minimum.
  6. The tongue image extraction method according to claim 4, characterized in that
    after the images containing faces are recognized, the test images are classified into corresponding age-group/gender categories, and tongue images are extracted within each age-group/gender category with an LNMF model trained for that category, comprising the following steps:
    obtaining a CNN model, the CNN model having been trained to determine gender and age group and classifying the training images into the age-group/gender categories;
    annotating the training images in every age-group/gender category, each training image obtaining age-group, gender, and tongue labels;
    training LNMF models separately according to the age-group, gender, and tongue labels to obtain the corresponding trained LNMF models;
    recognizing the gender and age group of the test images with the trained CNN model and classifying the test images by gender and age group; and
    extracting the tongue images with the LNMF model trained for the corresponding age group, gender, and tongue.
  7. A tongue image extraction device, characterized by comprising:
    a matrix decomposition module configured to convert training images containing tongues into a matrix V, where all of the non-negative gray values of one image correspond to one column of V, and to train with the LNMF algorithm to factorize the matrix V into the product of a non-negative feature matrix W and a weight matrix H, i.e., V=WH, the non-negative feature matrix W having dimensions n*r with its r columns being feature basis images, a feature basis image referring to the non-negative feature matrix W representing tongue features, the non-negative feature matrix W spanning a non-negative subspace, and the weight matrix H having dimensions r*m with each of its columns being an encoding;
    a tongue feature extraction module configured to use an EHMM model to determine whether a test image contains a face image and, if so, to project the training images and the test image onto the non-negative subspace to obtain their respective feature coefficients, to measure the similarity between the feature coefficients of the training images and those of the test image with the nearest-neighbor criterion, and to extract the tongue-representing features of test images whose similarity exceeds a similarity threshold as tongue features; and
    a tongue image segmentation module configured to mark the feature regions containing tongue features and the non-feature regions without them with different labels, where the label set corresponds to the boundary information of the feature regions, and to extract the extreme values in the up, down, left, and right directions from the boundary information to determine the bounding box enclosing the feature region.
  8. The tongue image extraction device according to claim 7, characterized in that
    the device further comprises a bounding-box adjustment module configured to compute the translation and scaling values of the bounding box with a linear regression model, to constrain the position error of the box with a loss function, and to adjust the box until it moves to a suitable position.
  9. The tongue image extraction device according to claim 7, characterized in that
    the device further comprises a face recognition module configured to use the EHMM model to recognize the images containing faces among the test images, comprising the following steps:
    selecting a plurality of feature points of a face to form a feature sequence;
    inputting a test image into the EHMM model, which scans it from top to bottom and from left to right with a moving window to obtain multiple groups of feature vectors, the multiple groups of feature vectors forming an observation sequence;
    using the forward algorithm to compute the probability that the observation sequence is similar to the feature sequence formed by the plurality of feature points of the face, the detected image being considered to contain a face when the similarity probability exceeds a decision threshold, wherein the EHMM model comprises a set of super-states whose number equals the number of vertical slices of the face, each super-state encapsulating a corresponding set of embedded states whose number equals the number of horizontal slices of the face.
  10. The tongue image extraction device according to claim 7, characterized in that
    the device further comprises a binarization module configured to first binarize both the training images and the test images, the gray value of every pixel being set to only 0 or 255 according to a set gray-value threshold.
  11. An electronic device, characterized in that the electronic device comprises a memory and a processor, the memory storing a tongue image extraction program which, when executed by the processor, implements the following steps:
    S110: converting training images containing tongues into a matrix V, where all of the non-negative gray values of one image correspond to one column of V, and training with the LNMF algorithm to factorize the matrix V into the product of a non-negative feature matrix W and a weight matrix H, i.e., V=WH;
    the non-negative feature matrix W having dimensions n*r, its r columns being feature basis images, where a feature basis image refers to the non-negative feature matrix W representing tongue features, and the non-negative feature matrix W spans a non-negative subspace;
    the weight matrix H having dimensions r*m, each of its columns being an encoding;
    S120: using an EHMM model to determine whether a test image contains a face image and, if so, projecting the training images and the test image onto the non-negative subspace to obtain their respective feature coefficients, measuring the similarity between the feature coefficients of the training images and those of the test image with the nearest-neighbor criterion, and extracting the tongue-representing features of test images whose similarity exceeds a similarity threshold as tongue features;
    S130: after the projection, marking the feature regions containing tongue features and the non-feature regions without them with different labels, where the label set corresponds to the boundary information of the feature regions, and extracting the extreme values in the up, down, left, and right directions from the boundary information to determine the bounding box enclosing the feature region.
  12. The electronic device according to claim 11, characterized in that the tongue image extraction program, when executed by the processor, further implements:
    before step S110, binarizing both the training images and the test images, the gray value of every pixel being set to only 0 or 255 according to a set gray-value threshold.
  13. The electronic device according to claim 11, characterized in that the tongue image extraction program, when executed by the processor, further implements:
    step S150: computing the translation and scaling values of the bounding box with a linear regression model, constraining the position error of the box with a loss function, and adjusting the box until it moves to a suitable position.
  14. The electronic device according to claim 12, characterized in that the tongue image extraction program, when executed by the processor, further implements:
    using the EHMM model to recognize the images containing faces among the test images, comprising the following steps:
    selecting a plurality of feature points of a face to form a feature sequence;
    inputting a test image into the EHMM model, which scans it from top to bottom and from left to right with a moving window to obtain multiple groups of feature vectors, the multiple groups of feature vectors forming an observation sequence;
    using the forward algorithm to compute the probability that the observation sequence is similar to the feature sequence formed by the plurality of feature points of the face, the detected image being considered to contain a face when the similarity probability exceeds a decision threshold, wherein the EHMM model comprises a set of super-states whose number equals the number of vertical slices of the face, each super-state encapsulating a corresponding set of embedded states whose number equals the number of horizontal slices of the face.
  15. The electronic device according to claim 12, characterized in that the tongue image extraction program, when executed by the processor, further implements:
    in S110, first compressing the training images, de-meaning and normalizing them, and then training with the LNMF algorithm to obtain the feature basis images, wherein
    de-meaning subtracts from every element of each column of the matrix V the mean of that column; and
    normalization sets every element of a column to the ratio of the difference between that element and the column minimum to the difference between the column maximum and the column minimum.
  16. The electronic device according to claim 12, characterized in that the tongue image extraction program, when executed by the processor, further implements:
    after the images containing faces are recognized, classifying the test images into corresponding age-group/gender categories and extracting tongue images within each age-group/gender category with an LNMF model trained for that category, comprising the following steps:
    obtaining a CNN model, the CNN model having been trained to determine gender and age group and classifying the training images into the age-group/gender categories;
    annotating the training images in every age-group/gender category, each training image obtaining age-group, gender, and tongue labels;
    training LNMF models separately according to the age-group, gender, and tongue labels to obtain the corresponding trained LNMF models;
    recognizing the gender and age group of the test images with the trained CNN model and classifying the test images by gender and age group; and
    extracting the tongue images with the LNMF model trained for the corresponding age group, gender, and tongue.
  17. A computer non-volatile readable storage medium, characterized in that the computer non-volatile readable storage medium stores a computer program, the computer program comprising program instructions which, when executed by a processor, implement the tongue image extraction method according to claim 1.
  18. The computer non-volatile readable storage medium according to claim 17, characterized in that the program instructions, when executed by the processor, further implement:
    using the EHMM model to recognize the images containing faces among the test images, comprising the following steps:
    selecting a plurality of feature points of a face to form a feature sequence;
    inputting a test image into the EHMM model, which scans it from top to bottom and from left to right with a moving window to obtain multiple groups of feature vectors, the multiple groups of feature vectors forming an observation sequence;
    using the forward algorithm to compute the probability that the observation sequence is similar to the feature sequence formed by the plurality of feature points of the face, the detected image being considered to contain a face when the similarity probability exceeds a decision threshold, wherein the EHMM model comprises a set of super-states whose number equals the number of vertical slices of the face, each super-state encapsulating a corresponding set of embedded states whose number equals the number of horizontal slices of the face.
  19. The computer non-volatile readable storage medium according to claim 17, characterized in that the program instructions, when executed by the processor, further implement:
    in S110, first compressing the training images, de-meaning and normalizing them, and then training with the LNMF algorithm to obtain the feature basis images, wherein
    de-meaning subtracts from every element of each column of the matrix V the mean of that column; and
    normalization sets every element of a column to the ratio of the difference between that element and the column minimum to the difference between the column maximum and the column minimum.
  20. The computer non-volatile readable storage medium according to claim 17, characterized in that the program instructions, when executed by the processor, further implement:
    after the images containing faces are recognized, classifying the test images into corresponding age-group/gender categories and extracting tongue images within each age-group/gender category with an LNMF model trained for that category, comprising the following steps:
    obtaining a CNN model, the CNN model having been trained to determine gender and age group and classifying the training images into the age-group/gender categories;
    annotating the training images in every age-group/gender category, each training image obtaining age-group, gender, and tongue labels;
    training LNMF models separately according to the age-group, gender, and tongue labels to obtain the corresponding trained LNMF models;
    recognizing the gender and age group of the test images with the trained CNN model and classifying the test images by gender and age group; and
    extracting the tongue images with the LNMF model trained for the corresponding age group, gender, and tongue.
PCT/CN2019/118413 2019-08-09 2019-11-14 Tongue image extraction method and device, and a computer readable storage medium WO2020215697A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
SG11202008404RA SG11202008404RA (en) 2019-08-09 2019-11-14 Method and device for tongue image extraction and computer readable storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910733855.8 2019-08-09
CN201910733855.8A CN110569879B (en) 2019-08-09 2019-08-09 Tongue image extraction method, tongue image extraction device and computer readable storage medium

Publications (1)

Publication Number Publication Date
WO2020215697A1 true WO2020215697A1 (en) 2020-10-29

Family

ID=68774935

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/118413 WO2020215697A1 (en) 2019-08-09 2019-11-14 Tongue image extraction method and device, and a computer readable storage medium

Country Status (3)

Country Link
CN (1) CN110569879B (en)
SG (1) SG11202008404RA (en)
WO (1) WO2020215697A1 (en)

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8805653B2 (en) * 2010-08-11 2014-08-12 Seiko Epson Corporation Supervised nonnegative matrix factorization
CN102393910B (en) * 2011-06-29 2013-04-24 浙江工业大学 Human behavior identification method based on non-negative matrix decomposition and hidden Markov model
CN102592148A (en) * 2011-12-29 2012-07-18 华南师范大学 Face identification method based on non-negative matrix factorization and a plurality of distance functions
CN105335732B (en) * 2015-11-17 2018-08-21 西安电子科技大学 Based on piecemeal and differentiate that Non-negative Matrix Factorization blocks face identification method
CN105893954B (en) * 2016-03-30 2019-04-23 深圳大学 A kind of Non-negative Matrix Factorization face identification method and system based on nuclear machine learning
CN107451545B (en) * 2017-07-15 2019-11-15 西安电子科技大学 The face identification method of Non-negative Matrix Factorization is differentiated based on multichannel under soft label
CN108268872B (en) * 2018-02-28 2021-06-08 电子科技大学 Robust nonnegative matrix factorization method based on incremental learning
CN109829481B (en) * 2019-01-04 2020-10-30 北京邮电大学 Image classification method and device, electronic equipment and readable storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060215854A1 (en) * 2005-03-23 2006-09-28 Kaoru Suzuki Apparatus, method and program for processing acoustic signal, and recording medium in which acoustic signal, processing program is recorded
CN105335719A (en) * 2015-10-29 2016-02-17 北京汉王智远科技有限公司 Living body detection method and device
CN108198576A (en) * 2018-02-11 2018-06-22 华南理工大学 A kind of Alzheimer's disease prescreening method based on phonetic feature Non-negative Matrix Factorization
CN108415883A (en) * 2018-02-13 2018-08-17 中国科学院西安光学精密机械研究所 Convex non-negative matrix factorization method based on subspace clustering
CN109657611A (en) * 2018-12-19 2019-04-19 河南科技大学 A kind of adaptive figure regularization non-negative matrix factorization method for recognition of face

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113536986A (en) * 2021-06-29 2021-10-22 南京逸智网络空间技术创新研究院有限公司 Representative feature-based dense target detection method in remote sensing image
CN113808075A (en) * 2021-08-04 2021-12-17 上海大学 Two-stage tongue picture identification method based on deep learning
CN113947140A (en) * 2021-10-13 2022-01-18 北京百度网讯科技有限公司 Training method of face feature extraction model and face feature extraction method
CN114943735A (en) * 2022-07-22 2022-08-26 浙江省肿瘤医院 Tongue picture image-based tumor prediction system and method and application thereof
CN114972354A (en) * 2022-08-02 2022-08-30 济宁金筑新型建材科技有限公司 Image processing-based autoclaved aerated concrete block production control method and system

Also Published As

Publication number Publication date
SG11202008404RA (en) 2020-10-29
CN110569879B (en) 2024-03-15
CN110569879A (en) 2019-12-13

Similar Documents

Publication Publication Date Title
WO2020215697A1 (en) Tongue image extraction method and device, and a computer readable storage medium
Dai et al. A 3d morphable model of craniofacial shape and texture variation
US7254256B2 (en) Method and computer program product for locating facial features
Ding et al. Features versus context: An approach for precise and detailed detection and delineation of faces and facial features
WO2020001082A1 (en) Face attribute analysis method based on transfer learning
Wan et al. An accurate active shape model for facial feature extraction
Kotsia et al. Texture and shape information fusion for facial expression and facial action unit recognition
CN100410963C (en) Two-dimensional linear discrimination human face analysis identificating method based on interblock correlation
Sahbi et al. A Hierarchy of Support Vector Machines for Pattern Detection.
US20140307063A1 (en) Method and apparatus for generating viewer face-tracing information, recording medium for same, and three-dimensional display apparatus
CN113705371B (en) Water visual scene segmentation method and device
CN110991258B (en) Face fusion feature extraction method and system
JP2021193610A (en) Information processing method, information processing device, electronic apparatus and storage medium
Zhou et al. Shape-appearance-correlated active appearance model
Barbu An automatic face detection system for RGB images
Tran et al. Disentangling geometry and appearance with regularised geometry-aware generative adversarial networks
Chen et al. 3D shape constraint for facial feature localization using probabilistic-like output
Xu et al. Face recognition using spatially constrained earth mover's distance
RU2768797C1 (en) Method and system for determining synthetically modified face images on video
CN110968735B (en) Unsupervised pedestrian re-identification method based on spherical similarity hierarchical clustering
Hong et al. Efficient facial landmark localization using spatial–contextual AdaBoost algorithm
Liu et al. Multi-view face alignment guided by several facial feature points
Yang Hand gesture recognition and face detection in images
Nie et al. The facial features analysis method based on human star-structured model
CN111444860A (en) Expression recognition method and system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19926126

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19926126

Country of ref document: EP

Kind code of ref document: A1