CN109711283B - Occlusion expression recognition method combining double dictionaries and error matrix


Info

Publication number
CN109711283B
CN109711283B
Authority
CN
China
Prior art keywords
expression
dictionary
matrix
identity
error
Prior art date
Legal status
Active
Application number
CN201811506374.5A
Other languages
Chinese (zh)
Other versions
CN109711283A (en)
Inventor
董俊兰 (Dong Junlan)
张灵 (Zhang Ling)
Current Assignee
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date
Filing date
Publication date
Application filed by Guangdong University of Technology filed Critical Guangdong University of Technology
Priority to CN201811506374.5A
Publication of CN109711283A
Application granted
Publication of CN109711283B

Landscapes

  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses an occluded facial expression recognition method that combines a double dictionary with an error matrix, comprising the following steps. First, low-rank decomposition separates the expression features from the identity features in each class of expression images, and dictionary learning is performed on the low-rank matrix and the sparse matrix respectively to obtain an intra-class correlation dictionary and a difference-structure dictionary. Second, because the original sparse coding does not model coding error when classifying occluded images and therefore cannot accurately describe the coding error introduced by occlusion, the method represents the occlusion-induced error with a single matrix, which can be separated from the feature matrix of the unoccluded training images. A clean image is recovered by subtracting the error matrix from the test sample; the clean image sample is then low-rank decomposed into identity features and expression features by a double-dictionary collaborative representation, and classification is finally performed according to the contribution of each class of expression features to the joint sparse representation. The invention is robust to randomly occluded expression recognition.

Description

An occluded facial expression recognition method combining a double dictionary and an error matrix

Technical Field

The present invention relates to the field of image processing, and more specifically to a method for recognizing occluded facial expressions that combines a double dictionary with an error matrix.

Background Art

Facial expression recognition is a very active topic in physiology, pattern recognition, and computer vision, and the related techniques are increasingly mature. To ensure the integrity of expression information, most studies are conducted under controlled experimental conditions. In practical applications, however, facial expression recognition must cope with various random factors such as illumination, occlusion, and pose, so it remains a challenging problem. For occluded facial expression recognition, researchers have proposed many methods to reduce the impact of occlusion, the most common being feature reconstruction, sub-region analysis, sparse representation, and deep learning. Xia et al. combined RPCA and saliency detection for occluded expression recognition: saliency detection first locates the occluder, and RPCA then reconstructs the occluded region of the image. Zhang et al. combined the iterative closest point (ICP) and fuzzy C-means (FCM) algorithms to reconstruct 54 fiducial points in occluded face images. Such recovery of occluded regions from facial contour features depends heavily on reliable face detection and facial feature tracking, and requires the occluded region to be located in advance together with precise face alignment, which is difficult to achieve in practice. Lin et al. used geometric features of facial landmark displacement to predict action units (AUs) under mouth occlusion, modeling the gray-level pixel distribution of the face region with a Gaussian mixture model (GMM) to detect the occluded area. Methods based on sub-region analysis all assume that occlusion covers only a small part of the face, and they usually perform satisfactorily for small occlusions. However, the granularity with which the face is subdivided into local regions, and its effect on performance, remains an open problem, especially for random occlusions with no fixed position, shape, or size. Sub-region methods are also sensitive to noise arising from inaccurate face localization, alignment, and normalization. Cheng et al. proposed a deep neural network structure for partial occlusion: Gabor filters extract multi-scale, multi-orientation Gabor magnitudes from face images, which are fed into a three-layer deep Boltzmann machine (DBM) for emotion classification. Yao et al. proposed completing the occluded region of a face image with a Wasserstein generative adversarial network, which alleviates the loss of local expression information. Although deep learning can automatically learn the most discriminative facial expression feature patterns from raw face data, usually without a separate occlusion detection or reconstruction stage, deep architectures require large amounts of training data for proper feature learning, are difficult to tune because of their many system parameters, and are computationally expensive. Zhu Minghan et al. obtained occlusion base images at several scales by partitioning the image into blocks, built a non-orthogonal occlusion dictionary from these images, sparsely decomposed the test image to obtain coefficients, and judged the expression class in the subspace of the test image. Liu Shuaishi et al. proposed robust regularized coding for randomly occluded expression recognition: each pixel of the expression image is assigned a weight, the weights are iterated until they converge to a threshold, and the sparse representation of the test image is finally computed with the optimal weight matrix. This method achieves good recognition results, but it does not avoid the interference of identity features with expression classification. The greatest advantage of sparse representation methods is that they are not only robust to occlusion and corruption but can also be used to estimate the occluded or corrupted parts of the face.

The traditional sparse representation methods SRC and CRC treat each dictionary atom separately and independently, without considering the relationships among atoms, so the resulting sparse representation is unstructured. Moreover, both SRC and CRC use the original training data set as the classification dictionary; when the training samples are corrupted, occluded, or deformed, the algorithms lack robustness and classify poorly.

Summary of the Invention

To overcome the lack of robustness and the poor classification performance of the prior-art methods described above, the present invention provides an occluded facial expression recognition method that combines a double dictionary with an error matrix.

The present invention aims to solve the above technical problems at least to a certain extent.

The technical solution of the present invention is as follows:

An occluded facial expression recognition method combining a double dictionary and an error matrix, the method comprising the following steps:

S1: Data preprocessing. Obtain a data sample set from a database, rotate the samples so that the eyes are horizontally aligned, and crop from the original expression image, according to the distance between the two eyes, a rectangular region containing only the frontal facial expression, removing the redundant information of hair and neck;

S2: Occlusion simulation. Simulate occlusion on the preprocessed data sample set by adding black blocks of different sizes at different positions; the occlusion-simulated data samples are used as training samples;
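As a minimal sketch of this step, the following Python snippet pastes one black square of a chosen size at a random position in a grayscale image; the function name and interface are illustrative assumptions, since the invention specifies only black blocks of varying size and position:

```python
import numpy as np

def simulate_occlusion(img, block_size, rng=None):
    """Paste one black square of side block_size at a random position.

    img is an H x W grayscale array; the interface is illustrative, as the
    invention only specifies black blocks of varying size and position.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = img.shape
    top = int(rng.integers(0, h - block_size + 1))
    left = int(rng.integers(0, w - block_size + 1))
    occluded = img.copy()
    occluded[top:top + block_size, left:left + block_size] = 0
    return occluded
```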

S3: Training stage. Perform low-rank decomposition on the several classes of expression images in the training samples to obtain expression features and identity features, both represented as matrices. Given a training sample set X = [X1, X2, ..., Xc], the decomposition function is as follows:

min_{Li,Si} rank(Li) + γ||Si||_0   s.t.  Xi = Li + Si

Low-rank decomposition of this set of training samples yields L and S:

L = [L1, L2, ..., Lc] ∈ R^(n×mc)
S = [S1, S2, ..., Sc] ∈ R^(n×mc)

where L denotes the expression feature matrix, S the identity feature matrix, and c the number of classes. The expression features and identity features of the expression images are separated and dictionary learning is performed on each, yielding a double-dictionary mode consisting of an intra-class correlation dictionary and a difference-structure dictionary; the expression feature matrix is a low-rank matrix, and the identity feature matrix is a sparse matrix;
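For illustration, the per-class decomposition can be computed with the usual convex relaxation of this objective (nuclear norm in place of rank, l1 in place of l0) solved by inexact ALM. The sketch below assumes vectorized images as the columns of Xi; the default γ = 1/sqrt(max(n, m)) and the ALM schedule are conventional choices assumed here, not values fixed by the invention:

```python
import numpy as np

def rpca_inexact_alm(X, gamma=None, tol=1e-7, max_iter=500):
    """Split X into a low-rank part L and a sparse part S with X = L + S.

    Convex-relaxation sketch of the rank + sparsity objective in S3; the
    default gamma and the mu/rho schedule are conventional assumptions.
    """
    n, m = X.shape
    gamma = 1.0 / np.sqrt(max(n, m)) if gamma is None else gamma
    norm_X = np.linalg.norm(X, "fro")
    Y = X / max(np.linalg.norm(X, 2), np.abs(X).max() / gamma)  # dual init
    mu, rho = 1.25 / np.linalg.norm(X, 2), 1.5
    L, S = np.zeros_like(X), np.zeros_like(X)
    for _ in range(max_iter):
        # Low-rank update: singular value thresholding.
        U, sig, Vt = np.linalg.svd(X - S + Y / mu, full_matrices=False)
        L = U @ np.diag(np.maximum(sig - 1.0 / mu, 0.0)) @ Vt
        # Sparse update: element-wise soft thresholding.
        T = X - L + Y / mu
        S = np.sign(T) * np.maximum(np.abs(T) - gamma / mu, 0.0)
        R = X - L - S
        Y = Y + mu * R
        mu = min(mu * rho, 1e7)
        if np.linalg.norm(R, "fro") / norm_X < tol:
            break
    return L, S
```

Calling Li, Si = rpca_inexact_alm(Xi) for each class matrix Xi yields the blocks of L and S above.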

S4: Reconstruct the identity features and the expression features of the test image through the double dictionary, introduce a unified coding error to represent the noise information, and finally perform classification according to the contribution of each class of expression features to the joint sparse representation.

Further, the specific process of step S4 is as follows:

S4.1: Construct the classification model based on the double dictionary and collaborative error representation as follows:

min_{x1,x2,e} λ||x1||_1 + η||x2||_1 + β||e||_1

s.t.  y = A x1 + B x2 + e

where e is expressed as e = ea + eb + es; x1 is the sparse coefficient vector of dictionary A and x2 the sparse coefficient vector of dictionary B; dictionary A represents the dictionary of expression features and dictionary B the dictionary of identity features; ea is the error of reconstructing the expression features, eb the error of reconstructing the identity features, and es the error caused by occlusion; λ, η, and β are weight coefficients that can be adjusted as appropriate;

S4.2: Solve and optimize the above classification model based on the double dictionary and collaborative error representation with the inexact augmented Lagrange multiplier (ALM) method; the optimization function is as follows:

F(x1, x2, e, φ) = λ||x1||_1 + η||x2||_1 + β||e||_1 + <φ, y − A x1 − B x2 − e> + (ξ/2)||y − A x1 − B x2 − e||_2^2

where φ is the Lagrange multiplier vector and ξ is the penalty parameter;

The coefficients and the noise information of the test expression image are obtained through the optimization function; by reconstructing the information of the identity part and the occlusion part of the test image, the parts irrelevant to the expression are removed, reducing data redundancy;

dy = y − B x2 − e

where dy denotes the clean expression feature of the test image, B x2 denotes the reconstructed identity information, and e denotes the noise information, which also contains the error information caused by occlusion;

S4.3: Compute the class label of the test sample and determine its class with the following formula:

identity(y) = argmin_i ||dy − A δ(i) x1||_2

where i denotes the class, i ∈ (1, c); x1 is the sparse coefficient vector of dictionary A; identity(y) outputs the label of the test sample; δ(i) is a matrix whose i-th element is 1 and whose remaining elements are 0;

S4.4: Assign each sample image to its class according to the computed class label, and compute the recognition rate of the test set from the classification results.
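A sketch of the decision rule of S4.3 and the recognition rate of S4.4 follows: the effect of δ(i) is emulated by zeroing every coefficient of x1 outside class i, and the sample is assigned to the class whose expression sub-dictionary best reconstructs the clean feature dy. The contiguous per-class block layout of the atoms of dictionary A is an assumed convention, not something the invention prescribes:

```python
import numpy as np

def classify(dy, A, x1, atoms_per_class, c):
    """Return the class index i minimizing ||dy - A delta(i) x1||_2.

    Assumes the atoms of class i occupy a contiguous block of
    atoms_per_class columns of A (an illustrative layout convention).
    """
    residuals = []
    for i in range(c):
        coeff_i = np.zeros_like(x1)
        s = i * atoms_per_class
        coeff_i[s:s + atoms_per_class] = x1[s:s + atoms_per_class]  # delta(i) x1
        residuals.append(np.linalg.norm(dy - A @ coeff_i))
    return int(np.argmin(residuals))

def recognition_rate(predictions, labels):
    """Fraction of test samples whose predicted class matches the label."""
    return float(np.mean(np.asarray(predictions) == np.asarray(labels)))
```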

Further, the databases are the CK+ database and the KDEF database.

Further, 6 kinds of expression images are selected in step S3.

Compared with the prior art, the beneficial effects of the technical solution of the present invention are:

1. By decomposing the internal structural information of the image, the present invention improves the precision of the sparse representation while enhancing the robustness of recognition.

2. Dictionary learning on the low-rank part and the sparse part yields an intra-class correlation dictionary and a difference-structure dictionary, which reduces the dictionary size and alleviates the overfitting and instability problems of dictionary training. In addition, the sparse representation built on the double dictionary is more adaptive, and its sparse structure is more effective, compact, and expressive.

3. During sparse coding classification, a single error matrix is defined to represent the error introduced by occlusion information, while the double dictionary collaboratively represents the test image and completes the facial expression features of the expression image, thereby alleviating the impact of missing local expression information.

Brief Description of the Drawings

Fig. 1 is a framework diagram of the algorithm of the present invention.

Fig. 2 is a schematic diagram of the simulated random occlusion of the present invention.

Fig. 3 is a schematic diagram of the training samples of the present invention.

Fig. 4 shows the recognition rate of the algorithm of the present invention and a comparison with the recognition rates of similar algorithms.

Detailed Description of the Embodiments

The technical solution of the present invention is further described below with reference to the accompanying drawings and embodiments.

Embodiment 1

Referring to Fig. 1, an occluded facial expression recognition method combining a double dictionary and an error matrix comprises the following steps:

Step 1: Data preprocessing. This embodiment selects the CK+ expression database and the KDEF expression database as sample sets. The CK+ database consists of expression image sequences of 210 subjects; the six expressions anger, disgust, fear, happiness, sadness, and surprise are selected as experimental data. For each expression, the last frame of the video sequences of 30 subjects randomly chosen from the database forms the training set, 180 images in total; the six expressions of 50 subjects then form the test set, 300 images in total. The KDEF data set contains 70 people, 35 male and 35 female, with each expression photographed from 5 angles; the experimental data selected for this method are all frontal face images. In the KDEF database, the six expression images of 30 randomly chosen subjects form the training set, and the six expression images of the remaining 40 subjects form the test set. Because the expression images in the CK+ database show slight head tilt and vary in size, the method rotates each image so that the eyes are horizontally aligned and, according to the distance between the two eyes, crops from the original expression image a rectangular region containing only the frontal facial expression, removing redundant information such as hair and neck. All experiments in this embodiment are carried out with unoccluded training-set images and occluded test images; the simulated random occlusion is processed as shown in Fig. 2.
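A minimal sketch of this rotation-and-crop preprocessing, assuming the eye coordinates are supplied by a landmark detector; the crop proportions relative to the inter-eye distance are illustrative assumptions, as the embodiment does not fix them:

```python
import numpy as np
from PIL import Image

def align_and_crop(img, left_eye, right_eye, width_scale=1.6, height_scale=2.2):
    """Rotate so the eye line is horizontal, then crop a face rectangle
    proportional to the inter-eye distance.

    left_eye and right_eye are (x, y) pixel coordinates assumed to come
    from a landmark detector; the scale factors are illustrative.
    """
    (lx, ly), (rx, ry) = left_eye, right_eye
    angle = float(np.degrees(np.arctan2(ry - ly, rx - lx)))
    cx, cy = (lx + rx) / 2.0, (ly + ry) / 2.0
    rotated = img.rotate(angle, center=(cx, cy), resample=Image.BILINEAR)
    d = float(np.hypot(rx - lx, ry - ly))
    w, h = width_scale * d, height_scale * d
    # Keep the eyes roughly in the upper third of the crop box.
    box = (int(cx - w / 2), int(cy - h / 3), int(cx + w / 2), int(cy + 2 * h / 3))
    return rotated.crop(box)
```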

Step 2: Occlusion simulation. Occlusion is simulated on the preprocessed data set by adding black blocks of different sizes at different positions;

Step 3: Training stage. The six classes of expression images in the training samples are low-rank decomposed to obtain expression features and identity features, both represented as matrices; the training sample data are shown schematically in Fig. 3. Given a set of training samples X = [X1, X2, ..., Xc], each training sample is low-rank decomposed into a low-rank matrix L and a sparse matrix S. The objective function is as follows:

min_{Li,Si} rank(Li) + γ||Si||_0   s.t.  Xi = Li + Si

Low-rank decomposition is performed on each class of training samples, yielding L and S:

L = [L1, L2, ..., Lc] ∈ R^(n×mc)
S = [S1, S2, ..., Sc] ∈ R^(n×mc)

where L denotes the expression feature matrix, S the identity feature matrix, and c the number of classes; rank() is a function, rank(L) computing the rank of the matrix L; γ is a constant greater than 0 that balances the rank of the matrix against the sparsity of the noise matrix; n and m denote the image size; R^(n×mc) is a set of real numbers indicating the matrix dimensions;

Step 4: Classification based on the double dictionary and collaborative error representation. Any expression image contains not only expression information and identity information but also other information such as noise and illumination. Dictionary learning on the low-rank matrix and the sparse matrix of Step 3 yields the double dictionary: the K-SVD dictionary learning algorithm is applied to L and S to obtain a dictionary A that captures class-discriminative features and a dictionary B that reflects within-class variation (a sketch of this dictionary-learning step is given after the model below); the K-SVD algorithm mainly speeds up dictionary updates and reduces complexity. Because dictionaries A and B are built from unoccluded facial expression images, they represent facial expressions well, and a partially occluded test image therefore causes a large coding deviation. This embodiment accordingly introduces a unified discriminative reconstruction error constraint into the sparse coding, which effectively reduces the interference of noise and occlusion while better preserving the structural information of the original data. The method proposes a new model to represent the test image, as follows:

min_{x1,x2,e} λ||x1||_1 + η||x2||_1 + β||e||_1

s.t.  y = A x1 + B x2 + e

where e is expressed as e = ea + eb + es; x1 is the sparse coefficient vector of dictionary A and x2 the sparse coefficient vector of dictionary B; dictionary A represents the dictionary of expression features and dictionary B the dictionary of identity features; ea is the error of reconstructing the expression features, eb the error of reconstructing the identity features, and es the error caused by occlusion; λ, η, and β are weight coefficients that can be adjusted as appropriate.
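The dictionary-learning step referenced above can be sketched as follows. The embodiment uses K-SVD; scikit-learn ships no K-SVD implementation, so this sketch openly substitutes sklearn's DictionaryLearning (alternating sparse coding and dictionary update) in the same role, and the per-class construction of A is likewise an assumption made so that A keeps a class-block layout:

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

def learn_dictionary(F, n_atoms):
    """Learn n_atoms dictionary atoms from feature matrix F (columns =
    vectorized images); a stand-in for K-SVD using sklearn."""
    dl = DictionaryLearning(n_components=n_atoms, transform_algorithm="omp")
    return dl.fit(F.T).components_.T  # shape: n x n_atoms

def learn_double_dictionary(L_blocks, S, atoms_per_class, n_atoms_B):
    """Expression dictionary A: per-class sub-dictionaries learned from the
    low-rank blocks Li and concatenated, so A keeps a class-block layout;
    identity dictionary B: learned from the sparse features S as a whole."""
    A = np.hstack([learn_dictionary(Li, atoms_per_class) for Li in L_blocks])
    B = learn_dictionary(S, n_atoms_B)
    return A, B
```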

This embodiment uses the inexact augmented Lagrange multiplier method to solve the above optimization problem; the optimization function is as follows:

F(x1, x2, e, φ) = λ||x1||_1 + η||x2||_1 + β||e||_1 + <φ, y − A x1 − B x2 − e> + (ξ/2)||y − A x1 − B x2 − e||_2^2

where φ is the Lagrange multiplier vector and ξ is the penalty parameter. The specific solution procedure, by inexact ALM, is given below.

Input: A, B, y, λ, β, η

Solution process: inexact ALM

While not converged do:

1. Fix the other variables and update e:

e = argmin_e β||e||_1 + (ξ/2)||y − A x1 − B x2 − e + φ/ξ||_2^2

2. Fix the other variables and update x1:

x1 = argmin_{x1} λ||x1||_1 + (ξ/2)||y − A x1 − B x2 − e + φ/ξ||_2^2

3. Fix the other variables and update x2:

x2 = argmin_{x2} η||x2||_1 + (ξ/2)||y − A x1 − B x2 − e + φ/ξ||_2^2

4. Update φ:

φ = φ + ξ(y − A x1 − B x2 − e)

5. Update ξ, with coefficient p = 1.5:

ξ = min(pξ, ξ_max)

Check whether the convergence condition ε = 10^-6 is satisfied:

||y − A x1 − B x2 − e||_2 / ||y||_2 < ε

End

Output: x1, x2, e.
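The listing above translates into the following Python sketch. The e-update has the closed-form soft-thresholding solution of its subproblem; the x1 and x2 subproblems are not spelled out by the embodiment, so this sketch assumes one linearized proximal (soft-thresholding) step each per outer iteration, with step sizes taken from the spectral norms of A and B:

```python
import numpy as np

def soft(v, t):
    """Element-wise soft-thresholding: the prox of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def solve_dderc(y, A, B, lam, eta, beta, xi=1.0, xi_max=1e6, p=1.5,
                eps=1e-6, max_iter=500):
    """Inexact-ALM sketch for
        min lam ||x1||_1 + eta ||x2||_1 + beta ||e||_1
        s.t. y = A x1 + B x2 + e.
    The x1/x2 sub-solvers are an assumption of this sketch.
    """
    x1, x2 = np.zeros(A.shape[1]), np.zeros(B.shape[1])
    e, phi = np.zeros_like(y), np.zeros_like(y)
    tA = 1.0 / np.linalg.norm(A, 2) ** 2  # step sizes from Lipschitz bounds
    tB = 1.0 / np.linalg.norm(B, 2) ** 2
    for _ in range(max_iter):
        # Step 1: e-update, closed-form soft-thresholding of the residual.
        e = soft(y - A @ x1 - B @ x2 + phi / xi, beta / xi)
        # Step 2: x1-update, one proximal gradient step on its subproblem.
        g1 = A.T @ (A @ x1 + B @ x2 + e - y - phi / xi)
        x1 = soft(x1 - tA * g1, lam * tA / xi)
        # Step 3: x2-update, same scheme with dictionary B.
        g2 = B.T @ (A @ x1 + B @ x2 + e - y - phi / xi)
        x2 = soft(x2 - tB * g2, eta * tB / xi)
        # Steps 4-5: multiplier and penalty updates, then convergence check.
        res = y - A @ x1 - B @ x2 - e
        phi = phi + xi * res
        xi = min(p * xi, xi_max)
        if np.linalg.norm(res) / np.linalg.norm(y) < eps:
            break
    return x1, x2, e
```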

The coefficients and the noise information of the test expression image can be obtained through the optimization function, and the information of the identity part and the occlusion part of the test image is reconstructed; the parts irrelevant to the expression are removed, reducing data redundancy.

dy = y − B x2 − e

where dy denotes the clean expression feature of the test image, B x2 denotes the reconstructed identity information, and e denotes the noise information, which also contains the error information caused by occlusion.

To determine the class of the test sample, the expression dictionary A is used to reconstruct the expression information of the test sample; the test sample is assigned to the class whose dictionary reconstructs it best. The specific formula is as follows:

identity(y) = argmin_i ||dy − A δ(i) x1||_2

where i denotes the class, i ∈ (1, c); x1 is the sparse coefficient vector of dictionary A; identity(y) outputs the label of the test sample; δ(i) is a matrix whose i-th element is 1 and whose remaining elements are 0. The final recognition rate of the whole test set is computed from the recognition result of each test sample.
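Chaining the sketches from the preceding steps gives the following end-to-end usage example; all shapes, atom counts, and weight values here are illustrative assumptions, not settings reported by the embodiment:

```python
import numpy as np

# X_by_class: list of c arrays, each n x m (vectorized images of one class);
# Y_test (n x num_test) and test_labels are hypothetical occluded test data.
L_blocks, S_blocks = zip(*(rpca_inexact_alm(Xi) for Xi in X_by_class))
A, B = learn_double_dictionary(L_blocks, np.hstack(S_blocks),
                               atoms_per_class=20, n_atoms_B=60)

predictions = []
for y in Y_test.T:
    x1, x2, e = solve_dderc(y, A, B, lam=0.01, eta=0.01, beta=0.1)
    dy = y - B @ x2 - e  # clean expression feature of the test image
    predictions.append(classify(dy, A, x1, atoms_per_class=20, c=6))

print("recognition rate:", recognition_rate(predictions, test_labels))
```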

To verify the effectiveness of the joint double-dictionary and collaborative error representation classification method, it was compared experimentally with classic sparse representation classification methods on randomly occluded expression data sets; the results are shown in Fig. 4.

The same or similar reference numerals correspond to the same or similar components;

The terms describing positional relationships in the drawings are for illustration only and are not to be construed as limiting this patent;

Apparently, the above embodiments of the present invention are merely examples given to illustrate the present invention clearly, and are not intended to limit the implementation of the present invention. For those of ordinary skill in the art, changes or variations in other different forms can be made on the basis of the above description. It is neither necessary nor possible to exhaustively list all implementations here. Any modification, equivalent replacement, and improvement made within the spirit and principles of the present invention shall be included within the protection scope of the claims of the present invention.

Claims (4)

1. A method for recognizing occluded expressions by combining double dictionaries and an error matrix, characterized by comprising the following steps:
S1: data preprocessing: acquiring a data sample set from a database, rotating each sample so that the eyes are horizontally aligned, cropping a rectangular region containing only the facial expression from the original expression image according to the distance between the two eyes, and removing the redundant information of hair and neck;
S2: occlusion simulation: performing occlusion simulation on the preprocessed data sample set by adding black blocks of different sizes at different positions, the occlusion-simulated data samples being used as training samples;
S3: in the training stage, performing low-rank decomposition on a plurality of expression image training samples to obtain expression features and identity features, wherein the expression features and the identity features are both expressed as matrices; given a set of training samples X = [X1, X2, ..., Xc], the decomposition function is as follows:

min_{Li,Si} rank(Li) + γ||Si||_0   s.t.  Xi = Li + Si

respectively performing low-rank decomposition on each class of training samples to obtain L and S after decomposition:

L = [L1, L2, ..., Lc] ∈ R^(n×mc)
S = [S1, S2, ..., Sc] ∈ R^(n×mc)

wherein L represents the expression feature matrix, S represents the identity feature matrix, and c represents the number of classes; the expression features and the identity features of the expression images are separated and dictionary learning is performed on each respectively, obtaining a double-dictionary mode of an intra-class correlation dictionary and a difference-structure dictionary; the expression feature matrix is a low-rank matrix, and the identity feature matrix is a sparse matrix;
S4: reconstructing the identity features and the expression features of the test image through the double dictionary, introducing a coding error to represent noise information, and finally achieving classification according to the contribution of each class of expression features in the joint sparse representation.
2. The method for recognizing occluded expressions by combining double dictionaries and an error matrix according to claim 1, wherein step S4 specifically comprises the following steps:
S4.1: constructing a classification model based on the double dictionary and collaborative error representation as follows:

min_{x1,x2,e} λ||x1||_1 + η||x2||_1 + β||e||_1

s.t.  y = A x1 + B x2 + e

wherein e is expressed as e = ea + eb + es; x1 is the sparse coefficient vector of dictionary A and x2 is the sparse coefficient vector of dictionary B; dictionary A represents the dictionary of expression features and dictionary B represents the dictionary of identity features; ea represents the error of reconstructing the expression features, eb represents the error of reconstructing the identity features, and es represents the error caused by occlusion; λ, η, and β all represent weight coefficients, which can be adjusted; y represents a test sample;

S4.2: solving and optimizing the above classification model based on the double dictionary and collaborative error representation by means of the inexact augmented Lagrange multiplier method, the optimization function being:

F(x1, x2, e, φ) = λ||x1||_1 + η||x2||_1 + β||e||_1 + <φ, y − A x1 − B x2 − e> + (ξ/2)||y − A x1 − B x2 − e||_2^2

wherein φ is the Lagrange multiplier vector and ξ is the penalty parameter;

obtaining the coefficients and the noise information of the test expression image through the optimization function, and removing the parts irrelevant to the expression by reconstructing the information of the identity part and the occlusion part of the test image, so as to reduce data redundancy;

dy = y − B x2 − e

wherein dy represents the clean expression feature of the test image, B x2 represents the reconstructed identity information, and e represents the noise information, which also contains the error information caused by occlusion;

S4.3: calculating the class label of the test sample and determining the class of the test sample with the following formula:

identity(y) = argmin_i ||dy − A δ(i) x1||_2

wherein i represents the class, i ∈ (1, c); x1 is the sparse coefficient vector of dictionary A; identity(y) outputs the label of the test sample; δ(i) is a matrix in which the i-th element is 1 and the rest are 0;
S4.4: classifying the sample pictures into their classes according to the calculated class labels, and calculating the recognition rate of the test set according to the classification results.
3. The method for recognizing occluded expressions by combining double dictionaries and an error matrix according to claim 1, wherein the databases are the CK+ database and the KDEF database.
4. The method for recognizing occluded expressions by combining double dictionaries and an error matrix according to claim 1, wherein 6 expression images are selected in step S3.
CN201811506374.5A 2018-12-10 2018-12-10 Occlusion expression recognition method combining double dictionaries and error matrix Active CN109711283B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811506374.5A 2018-12-10 2018-12-10 Occlusion expression recognition method combining double dictionaries and error matrix

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811506374.5A 2018-12-10 2018-12-10 Occlusion expression recognition method combining double dictionaries and error matrix

Publications (2)

Publication Number Publication Date
CN109711283A CN109711283A (en) 2019-05-03
CN109711283B (en) 2022-11-15

Family

ID=66256262

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811506374.5A 2018-12-10 2018-12-10 Occlusion expression recognition method combining double dictionaries and error matrix (Active)

Country Status (1)

Country Link
CN (1) CN109711283B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110188667B (en) * 2019-05-28 2020-10-30 复旦大学 A Face Correction Method Based on Tripartite Adversarial Generative Network
CN110265039B (en) * 2019-06-03 2021-07-02 南京邮电大学 A speaker recognition method based on dictionary learning and low-rank matrix factorization
CN110222668B (en) * 2019-06-17 2020-12-22 苏州大学 Multi-pose facial expression recognition method based on generative adversarial network
CN110659566B (en) * 2019-08-15 2020-12-18 重庆特斯联智慧科技股份有限公司 Target tracking method and system in shielding state
CN110728628B (en) * 2019-08-30 2022-06-17 南京航空航天大学 A face de-occlusion method based on conditional generative adversarial network
CN110717550A (en) * 2019-10-18 2020-01-21 山东大学 A Classification Method Based on Multimodal Image Missing Completion
CN111027576B (en) * 2019-12-26 2020-10-30 郑州轻工业大学 Co-saliency detection method based on co-saliency generative adversarial network
CN111079715B (en) * 2020-01-02 2023-04-25 华南理工大学 An Occlusion Robust Face Alignment Method Based on Dual Dictionary Learning
CN113344829A (en) * 2021-04-21 2021-09-03 复旦大学 Portable ultrasonic video optimization reconstruction method for multi-channel generation countermeasure network
CN113673325B (en) * 2021-07-14 2023-08-15 南京邮电大学 Multi-feature character emotion recognition method
CN116935471A (en) * 2023-07-24 2023-10-24 山东睿芯半导体科技有限公司 A face recognition method, device, chip and terminal based on dual dictionary learning


Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009128784A1 (en) * 2008-04-14 2009-10-22 Xid Technologies Pte Ltd Face expressions identification
US8842891B2 (en) * 2009-06-09 2014-09-23 Arizona Board Of Regents On Behalf Of Arizona State University Ultra-low dimensional representation for face recognition under varying expressions
KR101549645B1 (en) * 2014-01-28 2015-09-03 영남대학교 산학협력단 Method and apparatus of recognizing facial expression using motion dictionary

Patent Citations (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000357231A (en) * 1999-06-15 2000-12-26 Victor Co Of Japan Ltd Image discrimination device
JP2007328572A (en) * 2006-06-08 2007-12-20 Matsushita Electric Ind Co Ltd Face authentication device and face authentication method
CN102750546A (en) * 2012-06-07 2012-10-24 中山大学 Face shielding detection method based on structured error code
CN103425998A (en) * 2013-08-23 2013-12-04 西安电子科技大学 Method for identifying SAR target under shielding conditions
CN105590091A (en) * 2014-11-06 2016-05-18 Tcl集团股份有限公司 Face Recognition System And Method
CN105740790A (en) * 2016-01-25 2016-07-06 南京信息工程大学 Multicore dictionary learning-based color face recognition method
KR101742797B1 (en) * 2016-01-26 2017-06-02 영남대학교 산학협력단 Apparatus and method for facial expression recognition method based on latent motion dictionary
CN105825183A (en) * 2016-03-14 2016-08-03 合肥工业大学 Face expression identification method based on partially shielded image
CN105844223A (en) * 2016-03-18 2016-08-10 常州大学 Face expression algorithm combining class characteristic dictionary learning and shared dictionary learning
CN106096517A (en) * 2016-06-01 2016-11-09 北京联合大学 A kind of face identification method based on low-rank matrix Yu eigenface
CN106056103A (en) * 2016-07-05 2016-10-26 长春工业大学 Regular coded expression identification method based on robustness and data monitoring system
CN106295609A (en) * 2016-08-22 2017-01-04 河海大学 The single sample face recognition method represented based on block sparsity structure low-rank
CN106372595A (en) * 2016-08-31 2017-02-01 重庆大学 Shielded face identification method and device
CN106570464A (en) * 2016-10-31 2017-04-19 华南理工大学 Human face recognition method and device for quickly processing human face shading
CN108090409A (en) * 2017-11-06 2018-05-29 深圳大学 Face identification method, device and storage medium
CN107784664A (en) * 2017-12-05 2018-03-09 韶关学院 A kind of fast robust method for tracking target sparse based on K
CN108052932A (en) * 2018-01-10 2018-05-18 重庆邮电大学 One kind blocks adaptive face identification method
CN108509843A (en) * 2018-02-06 2018-09-07 重庆邮电大学 A kind of face identification method of the Huber constraint sparse codings based on weighting
CN108681725A (en) * 2018-05-31 2018-10-19 西安理工大学 A kind of weighting sparse representation face identification method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Cui Yifeng et al., "Face recognition with joint learning of a discriminative low-rank class dictionary and a sparse error dictionary," Journal of Image and Graphics, 2017-09-16, pp. 1222-1228 *

Also Published As

Publication number Publication date
CN109711283A (en) 2019-05-03

Similar Documents

Publication Publication Date Title
CN109711283B (en) Occlusion expression recognition method combining double dictionaries and error matrix
CN111680614B (en) A Method of Abnormal Behavior Detection Based on Video Surveillance
Chen et al. A cascaded convolutional neural network for age estimation of unconstrained faces
Wang et al. Appearance-based gaze estimation using deep features and random forest regression
Tian et al. Ear recognition based on deep convolutional network
CN107977661B (en) Region-of-interest detection method based on FCN and low-rank sparse decomposition
CN104778457A (en) Video face identification algorithm on basis of multi-instance learning
CN106980848A (en) Facial expression recognizing method based on warp wavelet and sparse study
CN105825183A (en) Face expression identification method based on partially shielded image
Georgopoulos et al. Modeling of facial aging and kinship: A survey
CN108960201A (en) A kind of expression recognition method extracted based on face key point and sparse expression is classified
CN106096517A (en) A kind of face identification method based on low-rank matrix Yu eigenface
CN113592894A (en) Image segmentation method based on bounding box and co-occurrence feature prediction
CN115482595A (en) Specific character visual sense counterfeiting detection and identification method based on semantic segmentation
CN110796022A (en) A low-resolution face recognition method based on multi-manifold coupled mapping
CN105930878B (en) Micro-expression recognition method based on differential slice energy diagram and sparse coding
Li et al. Feature extraction based on deep‐convolutional neural network for face recognition
CN110969101A (en) A Face Detection and Tracking Method Based on HOG and Feature Descriptors
Dong et al. Occlusion expression recognition based on non-convex low-rank double dictionaries and occlusion error model
CN114372926A (en) Traditional Chinese medicine tongue tenderness identification method based on image restoration and convolutional neural network
Chen et al. Face recognition with masks based on spatial fine-grained frequency domain broadening
Qiu et al. Learning transformations for classification forests
CN117275063A (en) Face depth counterfeiting detection method and system based on three-dimensional information time sequence consistency
CN109598262A (en) A kind of children's facial expression recognizing method
Chun-man et al. Face expression recognition based on improved MobileNeXt

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant