CN113420591B - Emotion-based OCC-PAD-OCEAN federal cognitive modeling method - Google Patents

Emotion-based OCC-PAD-OCEAN federal cognitive modeling method

Info

Publication number
CN113420591B
CN113420591B (application CN202110523544.6A)
Authority
CN
China
Prior art keywords
space
emotion
personality
occ
mood
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110523544.6A
Other languages
Chinese (zh)
Other versions
CN113420591A (en)
Inventor
刘峰
张嘉淏
王晗阳
沈思源
贾迅
胡静怡
周爱民
齐佳音
李志斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
East China Normal University
Original Assignee
East China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by East China Normal University filed Critical East China Normal University
Priority to CN202110523544.6A priority Critical patent/CN113420591B/en
Publication of CN113420591A publication Critical patent/CN113420591A/en
Application granted granted Critical
Publication of CN113420591B publication Critical patent/CN113420591B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

The invention provides an emotion-based OCC-PAD-OCEAN federated cognitive modeling method, which comprises the following steps: constructing a VGG-FACS-OCC model and computing the emotion space vectors of the subject's video; mapping the emotion space vectors into the PAD mood space, according to the quantized parameter mapping between the OCC emotion space and the PAD mood space in the OCC-PAD-OCEAN model, to obtain mood space vectors; and mapping the mood space vectors into the OCEAN personality space to obtain a personality space vector. The invention maps facial expressions into the PAD mood space through an established expression-mood mapping, then maps the average mood over a period of time through an established mood-personality mapping, thereby extracting personality features and finally obtaining information in the personality space with a degree of statistical reliability and validity.

Description

An Emotion-Based OCC-PAD-OCEAN Federated Cognitive Modeling Method

Technical Field

The invention relates to the technical field of psychological cognitive modeling, and in particular to an emotion-based OCC-PAD-OCEAN federated cognitive modeling method.

Background

Psychology is an experimental science: the vast majority of progress in psychological research is based on the experimental paradigm, collecting and analyzing subjective and objective data from subjects. Experiments require the experimenter to observe and record all aspects of the subjects' behavior under strictly controlled variables and to analyze the collected data. Even so, psychological experiments still suffer from low reliability and validity of results and poor reproducibility. These problems stem partly from the bystander effect and partly from the limitations of laboratory settings, which confine psychological experiments to particular scenarios and prevent extrapolation. In response, computer technology can provide precise control and quantified data. Such modern psychometric techniques offer many benefits, including avoiding complex reliability measurement, improved construct validity, avoidance of exposure effects, and high measurement efficiency. The main purpose of most psychological experiments is to explore the principles of human behavior or human cognitive patterns. From the perspective of data collection, capturing observational data with computer technology allows precise digital control of the entire experimental environment, such as the accurate acquisition of video signals, audio signals, sensor data, and human motion information. From the perspective of constructing the experimental environment, computer assistance can give subjects an immersive experience; for example, virtual reality (VR) in emotion psychology research can evoke emotions more effectively than ordinary picture or verbal stimuli. In addition, computer technology can simulate hypothetical models to explain observed behavior, and where experimental conditions are limited, computer simulation can provide a preliminary, idealized verification of hypotheses. Over the past decade, research on human emotional behavior has attracted increasing attention. Affective computing is an interdisciplinary field grounded in psychology and computer science that studies how computers can recognize, model, and even express human emotions. Personality computing, which extends from affective computing, can advance all technologies concerned with understanding and predicting human behavior.

In psychological research on human behavior and its prediction, personality is a crucial determinant: it describes stable personal characteristics that can usually be measured quantitatively and used to explain and predict observable behavioral differences. The five-factor model (FFM), or Big Five personality model, is an important theory in contemporary personality psychology and one of the most influential models in psychological research. Its five factors are openness, conscientiousness, extraversion, agreeableness, and neuroticism. The social psychologist Harry Reis described the FFM as "the most scientifically rigorous taxonomy in the behavioral sciences." The Big Five model provides a structure for classifying the personality traits of most people, describing most individual differences concisely and comprehensively through a set of highly replicable dimensions. From a computational standpoint, the trait model represents personality numerically, making it suitable for computer processing. Most current personality assessments, however, take the form of self-reports, evaluating personality through statements or adjectives on a scale. The self-report paradigm is simple and clear but cannot control the truthfulness of subjects' answers, and its results are affected by many irrelevant factors and prone to large deviations. One significant limitation of self-assessment is that subjects may bias their ratings toward social expectations; especially when an assessment may have negative consequences, subjects may hide negative traits, yielding results that do not reflect their real personality.

Overall, although some cross-disciplinary research is advancing the computation and quantification of psychological theory, psychological theories remain largely qualitative and struggle to provide direct quantitative model support for computer algorithm implementation. Conversely, computer algorithms cannot accurately express the emotion theories and emotion models of psychology, leaving a substantial barrier between the two fields. Most existing research considers only the computer science or the psychology perspective, rather than a cross-integrated view. Meanwhile, although deep-learning-based facial expression recognition is now fairly mature, research that uses deep learning to process psychological signals is still in its infancy. Cognitive modeling methods that start from the basic theory of emotion psychology and deeply integrate deep learning and related algorithms are therefore still lacking, and improving model interpretability while handling cognitive problems efficiently remains a key issue.

Summary of the Invention

The present invention provides an emotion-based OCC-PAD-OCEAN federated cognitive modeling method that uses deep learning to process psychological signals, addressing the prior art's lack of cognitive modeling methods that deeply integrate deep learning and related algorithms.

To solve the above technical problems, the method provided by the present invention comprises: S1, constructing a VGG-FACS-OCC model and computing the emotion space vectors of the subject's video; S2, mapping the emotion space vectors into the PAD mood space, according to the quantized parameter mapping between the OCC emotion space and the PAD mood space in the OCC-PAD-OCEAN model, to obtain mood space vectors; S3, mapping the mood space vectors into the OCEAN personality space to obtain the personality space vector.

Specifically, S1 comprises the following steps: S11, splitting the subject's video into picture frames Image_t by time and sampling Image_t at a fixed frequency to obtain sampled frames I_i (i = 1, 2, 3, ..., n); S12, preprocessing each sampled frame I_i to remove interfering information; S13, mapping the preprocessed sampled frame I_i into the OCC emotion space to obtain the emotion space vector E_i.

Further, S12 preprocesses the sampled frame I_i with a preprocessing function Pre, and specifically comprises: S121, performing face detection on the sampled frame I_i with the MTCNN face recognition algorithm to obtain the target-box set B = {b_1, b_2, ..., b_m}, where b_i = (x_i, y_i, h_i, w_i, p_i), x_i is the abscissa of the top-left corner of the box, y_i the ordinate of the top-left corner, h_i the height, w_i the width, and p_i the confidence of the box; S122, determining a height threshold h_t, a width threshold w_t, and a confidence threshold p_t, and keeping, for the boxes b_i ∈ B, the subset B′ = {b_i ∈ B : h_i > h_t, w_i > w_t, p_i > p_t}; S123, taking from B′ the box b* with the highest confidence p_i, cropping I_i according to b*, and resizing the cropped I_i to a specific size to obtain Pre(I_i), so that the emotion space vector is E_i = VGG(Pre(I_i)).
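
A minimal sketch of the preprocessing function Pre described above, assuming the MTCNN detections have already been converted into the (x, y, h, w, p) tuples of B; the threshold values and the output size are illustrative assumptions, not values fixed by the method:

```python
from PIL import Image

def pre(frame: Image.Image, boxes, h_t=80, w_t=80, p_t=0.9, size=(224, 224)):
    """Pre from step S12: filter MTCNN boxes, crop the best face, resize."""
    # S122: keep only boxes whose size and confidence exceed the thresholds
    kept = [b for b in boxes if b[2] > h_t and b[3] > w_t and b[4] > p_t]
    if not kept:
        return None  # no usable face in this sampled frame
    # S123: take the highest-confidence box b*, crop I_i, and resize
    x, y, h, w, _ = max(kept, key=lambda b: b[4])
    return frame.crop((x, y, x + w, y + h)).resize(size)
```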

Specifically, S2 comprises: S21, computing the mood space vector M_i = K × E_i, where K is the transformation matrix of the quantized parameter mapping between the OCC emotion space and the PAD mood space; S22, making the emotion-to-mood mapping continuous, giving M_i = K × E′_i; S23, computing the average mood space vector M_v = (1/n) Σ_{i=1}^{n} M_i.

Specifically, S3 computes the personality space vector P_e = Z × M_v, where Z is the transformation matrix from the PAD mood space to the OCEAN personality space.
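
Steps S2 and S3 reduce to two matrix multiplications and an arithmetic mean. A minimal sketch, with the per-frame emotion vectors E′_i and the matrices K and Z supplied from the mappings defined elsewhere in the method:

```python
import numpy as np

def personality_from_emotions(E_frames, K, Z):
    """S2-S3 sketch: emotion vectors -> mood vectors -> personality vector."""
    M = [K @ E for E in E_frames]   # S21/S22: M_i = K x E'_i per frame
    M_v = np.mean(M, axis=0)        # S23: average mood space vector
    return Z @ M_v                  # S3:  P_e = Z x M_v
```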

The above technical solution has the following advantages or beneficial effects: the invention maps facial expressions into the PAD mood space through an established expression-mood mapping, then maps the average mood over a period of time through an established mood-personality mapping, using deep learning to process psychological signals and extract personality features, finally obtaining information in the personality space with a degree of statistical reliability and validity, thereby addressing the prior art's lack of cognitive modeling methods that deeply integrate deep learning and related algorithms.

Brief Description of the Drawings

The invention and its features, forms, and advantages will become more apparent from the detailed description of non-limiting embodiments read with reference to the following drawings. Like reference numbers designate like parts throughout the drawings. The drawings are not drawn to scale; emphasis is placed on illustrating the gist of the invention.

Fig. 1 is a brief flowchart of the emotion-based OCC-PAD-OCEAN federated cognitive modeling method provided in Embodiment 1 of the present invention;

Fig. 2 is a schematic diagram of model processing in the emotion-based OCC-PAD-OCEAN federated cognitive modeling method provided in Embodiment 1;

Fig. 3 shows the function expressions of each layer of VGG-19 provided in Embodiment 1;

Fig. 4 is the FACS-OCC emotion mapping table based on CK+ expression features provided in Embodiment 1;

Fig. 5 is the quantitative mapping table between the PAD mood space and emotions provided in Embodiment 1;

Fig. 6 is a schematic diagram of the data processing flow of the model in the emotion-based OCC-PAD-OCEAN federated cognitive modeling method provided in Embodiment 1;

Fig. 7 shows the time-series data of the six OCC emotion dimensions over all frames of the video in the method provided in Embodiment 1;

Fig. 8 shows the time-series data of the PAD mood space over all frames of the video in the method provided in Embodiment 1;

Fig. 9 is the personality radar chart output by the method provided in Embodiment 1;

Fig. 10 shows the five personality deviation rates across all subjects for the emotion-based OCC-PAD-OCEAN federated cognitive modeling method provided in Embodiment 1.

Detailed Description

The present invention is further described below with reference to the drawings and specific embodiments, which are not intended to limit the invention.

Embodiment 1:

Emotion reflects a person's short-term state; under changing external conditions or stimuli, emotion can vary considerably within a short time. The OCC emotion model, a standard model of emotion synthesis, specifies 22 emotion categories. Based on Ekman's theory that all non-basic emotions can be synthesized from basic emotions, the present invention uses the six basic emotions it defines to construct the OCC emotion space: anger, disgust, fear, happiness, sadness, and surprise, represented in vector form as E = [e_angry, e_disgust, e_fear, e_happy, e_sad, e_surprise]^T, where each element takes a value in [0, 1] indicating the intensity of the emotion.

Mood, an intermediate quantity between emotion and personality, reflects a person's emotional state over a period of time. From a measurement standpoint, mood can be obtained by averaging an individual's emotional states over a period of time; however, because combinations of discrete emotional states (such as anger, disgust, fear, happiness, sadness) cannot be averaged meaningfully, a conceptual system is needed to construct the basic dimensions of mood. The invention therefore introduces the PAD mood space, composed of three mutually independent dimensions, Pleasure (P), Arousal (A), and Dominance (D), represented in vector form as M = [m_P, m_A, m_D]^T, with elements in [-1, 1]. Pleasure (P) describes the relative dominance of positive over negative emotional states; Arousal (A) measures the degree to which a person is aroused by "high-information" (complex, changing, unexpected) stimuli, and the speed of return to baseline; Dominance (D) assesses a person's sense of control and influence over their life circumstances, and the feeling of being controlled and influenced by others or by events.

Personality reflects differences in psychological characteristics between individuals and does not change greatly over the long term. The invention therefore uses the Big Five model to construct the OCEAN personality space, whose five factors are openness, conscientiousness, extraversion, agreeableness, and neuroticism, represented in vector form as P = [p_O, p_C, p_E, p_A, p_N]^T, with elements in [-1, 1]. Openness describes a person's cognitive style, the pursuit of understanding through experience, and tolerance for and exploration of unfamiliar situations; conscientiousness refers to how one controls, manages, and regulates one's impulses, assessing organization, persistence, and motivation in goal-directed behavior; extraversion indicates the quantity and intensity of interpersonal interaction, the need for stimulation, and the capacity for pleasure; agreeableness examines an individual's attitudes toward other people; neuroticism reflects the process of emotional regulation, capturing the tendency to experience negative emotions and emotional instability.

Since VGG is a class of classic deep convolutional neural networks with strong image feature extraction capability, VGG-19 trained on the CK+ dataset is used here as the video-to-emotion-space inference model to build the VGG-FACS-OCC model. Referring to Fig. 1 and Fig. 2, the subject's natural-conversation video V is split into picture frames Image_t by time, and Image_t is sampled at a fixed frequency to obtain sampled frames I_i (i = 1, 2, 3, ..., n). Let the frame preprocessing function be Pre: face detection is performed on the sampled frame I_i with the MTCNN face recognition algorithm to obtain the target-box set B = {b_1, b_2, ..., b_m}, where b_i = (x_i, y_i, h_i, w_i, p_i), x_i is the abscissa of the top-left corner of the box, y_i the ordinate of the top-left corner, h_i the height, w_i the width, and p_i the confidence; height, width, and confidence thresholds h_t, w_t, and p_t are determined, and for the boxes b_i ∈ B the subset B′ = {b_i ∈ B : h_i > h_t, w_i > w_t, p_i > p_t} is kept; finally, the box b* with the highest confidence p_i is taken from B′, I_i is cropped according to b* and resized to a specific size to obtain Pre(I_i), giving the emotion space vector E_i = VGG(Pre(I_i)).
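
The fixed-frequency sampling of V into frames I_i can be sketched with OpenCV as below; the sampling rate `hz` is an illustrative choice, since the method only requires that the frequency be fixed:

```python
import cv2

def sample_frames(video_path, hz=2.0):
    """Split video V into frames Image_t and keep every I_i at a fixed rate."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0   # fall back if FPS is unknown
    step = max(int(round(fps / hz)), 1)
    frames, t = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if t % step == 0:
            frames.append(frame)              # sampled frame I_i
        t += 1
    cap.release()
    return frames
```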

Specifically, a single frame image is converted into an emotion space vector by VGG as follows. Let the frame to be inferred be a pixel matrix I. VGG mainly involves three kinds of computation: convolutional layers, fully connected layers, and pooling layers. The convolution operation can be formalized as

Z^{l+1}(i, j) = σ( Σ_{k=1}^{K} Σ_{x=1}^{F} Σ_{y=1}^{F} Z_k^l(s·i + x, s·j + y) · w_k^{l+1}(x, y) + b ),

where s and p are the stride and the number of zero-padding layers respectively, Z^l is the input of layer l with Z^0 = I, K is the number of channels of the convolutional layer, F is the height and width of the convolution kernel, and L_{l+1} is the input size of convolutional layer l+1, with L_{l+1} = (L_l + 2p − F)/s + 1. σ(·) denotes a nonlinear activation function, usually the rectified linear unit (ReLU): ReLU(x) = max(0, x). When the number of output channels is changed to K′, K′ different convolution kernels are generally applied in separate two-dimensional convolution operations and all the results are concatenated along the channel dimension.

The fully connected layer can be formalized as: z^{l+1} = σ(W^{l+1} · z^l + b^{l+1}).

The pooling layer can be formalized as: A_k^{l+1}(i, j) = [ Σ_{x=1}^{F} Σ_{y=1}^{F} A_k^l(s·i + x, s·j + y)^p ]^{1/p}. When p → ∞, the pooling operation becomes max pooling, denoted MaxPool_F, which takes the pixel with the largest gray value in the pooling region: MaxPool_F(A_k^l)(i, j) = max_{x, y ∈ [1, F]} A_k^l(s·i + x, s·j + y).

The model parameters of VGG are generally determined by a pre-training process, and its forward-propagation inference can be formalized as VGG(I) = (f_n ∘ ... ∘ f_2 ∘ f_1)(I), where f_1, f_2, ..., f_n denote the functions corresponding to the different layers of the neural network and ∘ denotes function composition, meaning that forward propagation is completed by composing the functions of the network's layers. For one implementation of VGG, namely VGG-19, the function expression of each layer is shown in Fig. 3.
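
A sketch of the inference step E_i = VGG(Pre(I_i)) using torchvision's stock VGG-19 with a six-way output head; the method's exact layer configuration follows Fig. 3 and may differ in detail, and the softmax normalization is an assumption that keeps each intensity in [0, 1]:

```python
import torch
import torchvision.models as models

vgg = models.vgg19(num_classes=6)  # six OCC emotion outputs

def emotion_vector(pre_I: torch.Tensor) -> torch.Tensor:
    """E_i = VGG(Pre(I_i)) for a preprocessed 3x224x224 float tensor."""
    with torch.no_grad():
        logits = vgg(pre_I.unsqueeze(0))         # add a batch dimension
        return torch.softmax(logits, dim=1)[0]   # intensities in [0, 1]
```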

To tune the parameters of VGG-19 so that it has good FACS feature extraction and emotion classification capability, the VGG-19 model is trained on the CK+ dataset. The CK+ dataset provides OCC emotion annotations of face pictures based on FACS; the FACS-OCC conversion is shown in Fig. 4. Let the FACS-AU intensity vector be f, each dimension of which represents the intensity of a FACS AU relevant to emotion recognition, with values in [0, 1], and denote the FACS-OCC conversion by the function F2O(f). For a picture I and its FACS-AU feature label f in the training set, the optimization objective of the VGG model is L(VGG(I), F2O(f)) = CrossEntropy(VGG(I), F2O(f)), where the cross-entropy loss is CrossEntropy(p, q) = −Σ_{i=1}^{n} p_i · log(q_i), with n the number of labels; for the OCC emotion classification problem here, n = 6. The VGG model is trained on the CK+ dataset by batch gradient descent to minimize the objective function L; as the model parameters are adjusted, the hidden layers of VGG learn to extract FACS features, finally producing a concrete OCC emotion vector.
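
One batch-gradient-descent step on this objective can be sketched as follows; `F2O` (the Fig. 4 conversion) is assumed to be implemented elsewhere, and the soft-label cross entropy is written out explicitly:

```python
import torch.nn.functional as F

def train_step(vgg, optimizer, I_batch, f_batch, F2O):
    """Minimise L = CrossEntropy(VGG(I), F2O(f)) over one batch."""
    target = F2O(f_batch)                        # soft OCC labels from FACS-AU
    log_q = F.log_softmax(vgg(I_batch), dim=1)   # predicted log-probabilities
    loss = -(target * log_q).sum(dim=1).mean()   # -sum_i p_i * log(q_i)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```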

After obtaining the emotion space vector E_i = VGG(Pre(I_i)), the mapping from the emotion space to the PAD mood space is computed with reference to the quantitative mapping between the PAD mood space and emotions shown in Fig. 5. The emotion surprise has no corresponding PAD value in the original PAD scale; by examining the PAD values of emotions similar to surprise in the original scale, its PAD values are assumed after analysis to be 0.20, 0.45, and −0.45 respectively. The emotions in the scale are assumed to be mutually independent: when one emotion takes its maximum value 1 and all other emotions are 0, it maps to the corresponding PAD values. Formalizing the scale yields the mapping f_e(e = 1.0) = [m_Pe, m_Ae, m_De]^T, e ∈ {e_angry, e_disgust, e_fear, e_happy, e_sad, e_surprise}, which is further written as the matrix multiplication M_i = K × E_i, convenient for computer calculation, where K is the 3×6 transformation matrix from the emotion space to the mood space and E_i is a binary vector of modulus 1. Facial expression recognition yields the emotion vector E_i = [e_angry, e_disgust, e_fear, e_happy, e_sad, e_surprise]^T, but since the table gives only discrete correspondences, E can only take values with one 1 and five 0s, such as [1, 0, 0, 0, 0, 0] or [0, 1, 0, 0, 0, 0]; this must be turned into a continuous mapping function to obtain the mood vector M_i corresponding to the emotion vector E_i. The formula is therefore extended to the mood space vector M_i = K × E′_i, where E′_i is the emotion vector produced by the recognizer and may take any value within its range. The arithmetic mean of the mood space vectors then gives the average mood space vector M_v = (1/n) Σ_{i=1}^{n} M_i.
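
The one-hot-to-continuous extension can be checked numerically. In the sketch below only the surprise column of K is filled with the (0.20, 0.45, −0.45) values stated above; the remaining columns, which come from the Fig. 5 table, are left as zero placeholders:

```python
import numpy as np

K = np.zeros((3, 6))                 # rows: P, A, D; columns follow E
K[:, 5] = [0.20, 0.45, -0.45]        # surprise column, per the text

E_onehot = np.array([0, 0, 0, 0, 0, 1.0])   # "pure surprise"
assert np.allclose(K @ E_onehot, K[:, 5])   # one-hot input recovers the table entry

E_cont = np.array([0.1, 0.0, 0.05, 0.6, 0.15, 0.1])  # recognised intensities E'_i
M_i = K @ E_cont                     # continuous extension M_i = K x E'_i
```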

Finally, the average mood space vector M_v is mapped into the OCEAN personality space. The transformation from mood space to personality space is established by linear regression as:

Sophistication = .16P + .24A + .46D

Conscientiousness = .25P + .19D

Extraversion = .29P + .59D

Agreeableness = .74P + .13A − .18D

Emotional stability = .43P − .49A

Since Sophistication and Openness both derive from the Culture factor, Sophistication is assumed here to be approximately synonymous with Openness. Emotional stability describes the stability of an individual's emotions, while Neuroticism describes their instability, so Emotional stability and Neuroticism are assumed to be antonyms. That is: Sophistication = Openness; Emotional stability = −Neuroticism. This gives the transformation from the PAD mood space to the OCEAN personality space:

Openness = .16P + .24A + .46D

Conscientiousness = .25P + .19D

Extraversion = .29P + .59D

Agreeableness = .74P + .13A − .18D

Neuroticism = −.43P + .49A

That is, the personality space vector is P_e = Z × M_v, where Z is the transformation matrix from the mood space vector to the personality space vector:

Z = [ .16  .24  .46
      .25  .00  .19
      .29  .00  .59
      .74  .13 −.18
     −.43  .49  .00 ]
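
Assembling Z from the regression coefficients above and applying it to an average mood vector is then a single matrix product; the example M_v is hypothetical:

```python
import numpy as np

# Rows: O, C, E, A, N; columns: P, A, D (from the equations above).
Z = np.array([
    [ 0.16, 0.24,  0.46],   # Openness
    [ 0.25, 0.00,  0.19],   # Conscientiousness
    [ 0.29, 0.00,  0.59],   # Extraversion
    [ 0.74, 0.13, -0.18],   # Agreeableness
    [-0.43, 0.49,  0.00],   # Neuroticism
])

M_v = np.array([0.3, 0.1, 0.2])  # example average mood vector (P, A, D)
P_e = Z @ M_v                    # personality space vector
```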

To verify the feasibility of the proposed cognitive modeling and its computer implementation, related experiments were carried out with 31 college students as subjects, following the standard procedures and paradigms of psychological experiments. The procedure was as follows: first a 5-10 minute interview was conducted with each subject, with content intended mainly to evoke memories and emotions, while a camera recorded the subject's facial expressions. After the interview, the subject filled out a traditional Big Five personality scale. Finally, the effectiveness of the model was verified empirically by separately analyzing the model's results and the results of the traditional Big Five scale. The experimental materials, raw data, and analysis results are openly accessible on GitHub. The hardware environment for the video algorithms and result analysis was: memory 8 GB; CPU Intel(R) Core(TM) i5-7300HQ 2.50 GHz, 4 cores; operating system Debian 10. Taking the video data of the subject with ID "lf" as an example, the processing flow is shown in Fig. 6.

The main data-processing flow of the federated cognitive model (EFCM) is as follows: in the Big Five personality experiment, the subject is first interviewed to obtain a video stream; VGG-19 with FACS-OCC emotion modeling then processes the video stream into OCC emotion feature data; combined with the OCC-OCEAN cognitive modeling process, the subject's time-series personality data are obtained, and finally the weighted personality data.

The specific processing flow is as follows. First, FACS-OCC emotion modeling based on CK+ expression features extracts the time-series data of the six OCC emotion dimensions from the subject's video stream; the resulting six-dimensional OCC emotion time series is shown in Fig. 7. During the experiment, the subject's OCC emotion activation fluctuated fairly frequently and to varying degrees in the happy and sad emotions, while activation of the four emotions anger, disgust, fear, and surprise was comparatively limited.

From the six-dimensional OCC emotion time series output above, the PAD time-series data are obtained through the mapping from the emotion space to the mood space, as shown in Fig. 8.

Combining Mehrabian's theory of the transformation from mood space to personality space then yields the dynamic personality recognition data, shown as the small dots in Fig. 9, alongside the traditional Big Five personality scale of psychology, shown as the large dots in Fig. 9.

Fig. 9 shows that the traditional Big Five data represented by the large dots fall within the algorithm's credible recognition region. To quantify the deviation of each trait, the five personality deviation rates over all subjects are computed to obtain the accuracy of the EFCM cognitive modeling. The deviation-rate calculation is illustrated with Openness:

Bias_Openness = | (1/n) Σ_{t=1}^{n} Openness_rec_t − Openness | / (Openness_recmax − Openness_recmin),

where Openness_rec_t is the Openness value obtained by the algorithm for frame t, n is the total number of frames, Openness is the value computed from the personality scale, Openness_recmax is the largest Openness value computed by the algorithm over all frames, and Openness_recmin is the smallest; the absolute value in the numerator avoids negative results.

The formula above expresses the Openness deviation rate as the ratio of the absolute difference between the arithmetic mean of the algorithm's Openness values and the scale's Openness value to the difference between the algorithm's minimum and maximum Openness values. The deviation rates for Conscientiousness, Extraversion, Agreeableness, and Neuroticism are obtained analogously.
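
The per-trait deviation rate is a one-line computation; a sketch with a hypothetical per-frame Openness trace and scale score:

```python
import numpy as np

def deviation_rate(rec: np.ndarray, scale_value: float) -> float:
    """|mean(rec) - scale_value| / (max(rec) - min(rec)) for one trait."""
    return abs(rec.mean() - scale_value) / (rec.max() - rec.min())

openness_rec = np.array([0.21, 0.40, 0.33, 0.55, 0.47])  # Openness_rec_t values
print(deviation_rate(openness_rec, 0.35))                # scale score 0.35
```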

With this calculation, the table of the five personality deviation rates over all subjects for the EFCM cognitive model, shown in Fig. 10, is obtained.

The federated cognitive model provided by this method theoretically completes the full pipeline from visual information input to the final Big Five personality output, together with a detailed formal derivation. In the experimental stage the model showed that, apart from neuroticism, which objectively could not be verified, the four validly tested personality traits had an average deviation rate of about 20.41%, i.e., an average accuracy of about 79.56%.

For the large deviation in neuroticism, it is considered that during the actual test it is relatively difficult to truly capture, through recall, the negative emotions that are comparatively rare in daily life, which affected the subjects' neuroticism results and deviation rate in the subsequent personality comparison. Basic psychological theory holds that neuroticism is associated with a large amount of negative emotion, yet in a standard Big Five test negative emotions neither arise nor may be deliberately stimulated, so the subjects' neurotic characteristics could not be accurately measured and recorded, objectively causing the abnormal neuroticism results. If the model were applied in real-world scenarios, with recording and analysis performed while subjects are unaware, it should in theory be able to test neurotic personality effectively. The next step of the research is therefore to observe and experiment on massive data in large-scale scenarios, with the subjects' permission, to verify the effectiveness of the federated cognitive model under large-scale observation.

The preferred embodiments of the present invention have been described above. It should be understood that the invention is not limited to the specific embodiments described; equipment and structures not described in detail should be understood as implemented in the ordinary manner of the art. Any person skilled in the art may make many possible changes and modifications, or modify them into equivalent embodiments of equivalent changes, without departing from the technical solution of the invention, and this does not affect its essence. Therefore, any simple modification, equivalent change, or modification made to the above embodiments according to the technical essence of the invention, without departing from the content of the technical solution, still falls within the protection scope of the technical solution of the present invention.

Claims (1)

1. An emotion-based OCC-PAD-OCEAN federated cognitive modeling method, characterized in that the method comprises the following steps:

S1: construct a VGG-FACS-OCC model; let the FACS-AU intensity vector be f with value range [0, 1], and denote the FACS-OCC conversion as the function F2O(f);

set the optimization objective of the VGG model as:

L(VGG(I), F2O(f)) = CrossEntropy(VGG(I), F2O(f)),

with the cross-entropy loss set to CrossEntropy(p, q) = −Σ_{i=1}^{n} p_i · log(q_i);

train the VGG model on the CK+ dataset by batch gradient descent to minimize the objective function L;

split the subject's video into picture frames Image_t by time, and sample Image_t at a fixed frequency to obtain sampled frames I_i (i = 1, 2, 3, ..., n);

perform face detection on each sampled frame I_i with the MTCNN face recognition algorithm to obtain the target-box set B = {b_1, b_2, ..., b_m}, where b_i = (x_i, y_i, h_i, w_i, p_i), x_i is the abscissa of the top-left corner of the box, y_i the ordinate of the top-left corner, h_i the height, w_i the width, and p_i the confidence of the box;

determine a height threshold h_t, a width threshold w_t, and a confidence threshold p_t; for the boxes b_i ∈ B, keep B′ = {b_i ∈ B : h_i > h_t, w_i > w_t, p_i > p_t};

take from B′ the box b* with the highest confidence p_i, crop I_i according to b*, and resize the cropped I_i to a specific size to obtain Pre(I_i);

extract FACS features with the hidden layers of VGG and map the preprocessed sampled frame I_i into the OCC emotion space to obtain the emotion space vector E_i = VGG(Pre(I_i));

S2: compute the mood space vector M_i = K × E_i, where K is the transformation matrix of the quantized parameter mapping between the OCC emotion space and the PAD mood space;

make the emotion-to-mood mapping continuous, giving M_i = K × E′_i;

compute the average mood space vector M_v = (1/n) Σ_{i=1}^{n} M_i;

S3: map the average mood space vector into the OCEAN personality space to obtain the personality space vector P_e = Z × M_v, where Z is the transformation matrix from the PAD mood space to the OCEAN personality space.
CN202110523544.6A 2021-05-13 2021-05-13 Emotion-based OCC-PAD-OCEAN federal cognitive modeling method Active CN113420591B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110523544.6A CN113420591B (en) 2021-05-13 2021-05-13 Emotion-based OCC-PAD-OCEAN federal cognitive modeling method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110523544.6A CN113420591B (en) 2021-05-13 2021-05-13 Emotion-based OCC-PAD-OCEAN federal cognitive modeling method

Publications (2)

Publication Number Publication Date
CN113420591A CN113420591A (en) 2021-09-21
CN113420591B true CN113420591B (en) 2023-08-22

Family

ID=77712383

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110523544.6A Active CN113420591B (en) 2021-05-13 2021-05-13 Emotion-based OCC-PAD-OCEAN federal cognitive modeling method

Country Status (1)

Country Link
CN (1) CN113420591B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117809354B (en) * 2024-02-29 2024-06-21 华南理工大学 Emotion recognition method, medium and device based on head wearable device perception
CN118916546A (en) * 2024-07-25 2024-11-08 广州众阅文化科技有限公司 Personalized recommendation method and related device for electronic book

Citations (8)


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9177060B1 (en) * 2011-03-18 2015-11-03 Michele Bennett Method, system and apparatus for identifying and parsing social media information for providing business intelligence
CN105469065A (en) * 2015-12-07 2016-04-06 中国科学院自动化研究所 Recurrent neural network-based discrete emotion recognition method
CN106970703A (en) * 2017-02-10 2017-07-21 南京威卡尔软件有限公司 Multilayer affection computation method based on mood index
CN108376234A (en) * 2018-01-11 2018-08-07 中国科学院自动化研究所 emotion recognition system and method for video image
CN109730701A (en) * 2019-01-03 2019-05-10 中国电子科技集团公司电子科学研究院 Method and device for acquiring emotional data
CN109815903A (en) * 2019-01-24 2019-05-28 同济大学 A Video Sentiment Classification Method Based on Adaptive Fusion Network
CN110110671A (en) * 2019-05-09 2019-08-09 谷泽丰 A kind of character analysis method, apparatus and electronic equipment
KR20210027769A (en) * 2019-09-03 2021-03-11 한국항공대학교산학협력단 Neural network based sentiment analysis and therapy system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
* 苏文超 (Su Wenchao). Facial Action Unit Detection and Micro-expression Analysis. China Master's Theses Full-text Database, Information Science and Technology, 2019, pp. 13-33. *

Also Published As

Publication number Publication date
CN113420591A (en) 2021-09-21

Similar Documents

Publication Publication Date Title
Gunawan et al. Development of video-based emotion recognition using deep learning with Google Colab
Bishay et al. Schinet: Automatic estimation of symptoms of schizophrenia from facial behaviour analysis
Nicolaou et al. Continuous prediction of spontaneous affect from multiple cues and modalities in valence-arousal space
Hyland et al. Real-valued (medical) time series generation with recurrent conditional gans
Gomez et al. Exploring facial expressions and action unit domains for Parkinson detection
WO2022067524A1 (en) Automatic emotion recognition method and system, computing device and computer readable storage medium
CN113420591B (en) Emotion-based OCC-PAD-OCEAN federal cognitive modeling method
Li et al. Automatic classification of ASD children using appearance-based features from videos
Liu et al. OPO-FCM: a computational affection based OCC-PAD-OCEAN federation cognitive modeling approach
CN110135357A (en) A real-time detection method of happiness based on remote sensing
Zhang et al. Smart classrooms: How sensors and ai are shaping educational paradigms
CN112529054B (en) Multi-dimensional convolution neural network learner modeling method for multi-source heterogeneous data
CN116383618A (en) A method and device for evaluating learning concentration based on multimodal data
CN119848608A (en) Electroencephalogram and synchronous physiological signal emotion recognition method based on cross-modal contrast learning and multi-scale characterization
Dapogny et al. On automatically assessing children's facial expressions quality: A study, database, and protocol
Goyal et al. Minimum Annotation identification of facial affects for Video Advertisement
CN110598607B (en) Non-contact and contact cooperative real-time emotion intelligent monitoring system
Khorrami How deep learning can help emotion recognition
Dia et al. Paying attention to uncertainty: A stochastic multimodal transformers for post-traumatic stress disorder detection using video
Cowen et al. Facial movements have over twenty dimensions of perceived meaning that are only partially captured with traditional methods
Guo et al. Brain visual image signal classification via hybrid dilation residual shrinkage network with spatio-temporal feature fusion
Sorci et al. Modelling human perception of static facial expressions
Karimah et al. A real-time engagement assessment in online learning process using convolutional neural network
Chitteti et al. ML-driven Emotion Identification For Feedback Analysis In E-learning Platforms
Aruna et al. Original Research Article Emotion sensitive analysis of learners’ cognitive state using deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant