CN102184232A - Chinese vocabulary emotion modeling method based on pleasure, arousal and dominance (PAD) - Google Patents

Info

Publication number
CN102184232A
CN102184232A CN2011101218630A CN201110121863A CN102184232A CN 102184232 A CN102184232 A CN 102184232A CN 2011101218630 A CN2011101218630 A CN 2011101218630A CN 201110121863 A CN201110121863 A CN 201110121863A CN 102184232 A CN102184232 A CN 102184232A
Authority
CN
China
Prior art keywords
vocabulary
cluster
emotion
chinese
euclidean distance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2011101218630A
Other languages
Chinese (zh)
Inventor
毛峡
江琳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University
Priority to CN2011101218630A
Publication of CN102184232A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a Chinese vocabulary emotion modeling method based on pleasure, arousal and dominance (PAD). The method comprises the following steps: (1) establishing an original database of Chinese emotion vocabulary; (2) labeling each word in the database along the three dimensions P, A and D on a scale from -4 to +4, wherein, to keep the labeling objective and accurate, each dimension is labeled by three different evaluators, so that each word receives three evaluations per dimension; the P, A and D values of each word are the averages of the three evaluations, normalized to the range -1 to +1; (3) performing hierarchical cluster analysis on all labeled emotion words according to their P, A and D values, wherein, to obtain a better clustering result, Euclidean distance is used as the distance measure and the weighted average distance method is used as the clustering algorithm, and the number of clusters N is selected according to actual requirements; and (4) if a new word is absent from the original database, first labeling it in the P, A and D dimensions, then calculating the Euclidean distance between the new word and each final cluster, and assigning the new word to the cluster with the shortest Euclidean distance.

Description

A Chinese vocabulary emotion modeling method based on PAD
(1) Technical field
The present invention relates to an emotion modeling method, in particular an emotion modeling method for Chinese vocabulary based on the PAD model, and belongs to the field of affective computing.
(2) Background art
Much of the interaction between people and computers is text-based. Text carries rich emotional information that reflects the writer's psychological state. Research on extracting emotion from text is therefore significant for affective computing and intelligent interaction. Extracting emotion from text depends on a good emotion model, which allows the user's emotional state to be recognized more accurately.
Chinese has a large number of words that describe human moods and emotions, such as happiness, optimism and melancholy. These emotion words reflect people's psychological states from different angles. People can distinguish them through personal psychological experience, but for a computer to distinguish them accurately, the words must be quantified and cluster-analyzed, which amounts to building an emotion model.
The PAD model is a dimensional measurement model proposed by Mehrabian and Russell. It describes emotion along three dimensions: pleasure (P), the positive or negative quality of an individual's emotional state; arousal (A), the individual's level of neurophysiological activation; and dominance (D), the individual's degree of control over the situation and over other people. The PAD model not only provides a theoretical framework for describing the emotional space but also uses a quantitative method to establish the positions of, and relations among, the various emotion categories in that space.
In the field of emotion modeling, no model has yet been built specifically for Chinese vocabulary, which to some extent restricts the further development of research on Chinese text emotion recognition. A Chinese vocabulary emotion modeling method combined with the PAD model can address this lack of an emotion model in text emotion recognition. Proposing an effective Chinese vocabulary emotion modeling method is therefore of strong practical significance.
(3) Summary of the invention
The object of the invention is to propose a method for emotion modeling of Chinese vocabulary, so that a computer can recognize the emotion of words quantitatively.
The invention provides a Chinese vocabulary emotion modeling method based on PAD, comprising the following steps:
Step 1: Establish an original database of Chinese emotion vocabulary by collecting words that express emotion from multiple channels such as newspapers, digests, blogs, social networking sites and BBS.
Step 2: Label each word in the database along the three dimensions P, A and D, on a scale from -4 to +4. To keep the labeling objective and accurate, each dimension is labeled by three different evaluators, so each word receives three separate evaluations per dimension. The P, A and D values of each word are the averages of these three evaluations, normalized so that they lie between -1 and +1.
Step 3: Perform hierarchical cluster analysis on all labeled emotion words according to their P, A and D values. To obtain a better clustering result, Euclidean distance is used as the distance measure. Let the P, A and D values of the i-th word be (p_i, a_i, d_i); the Euclidean distance between words i and j is then:
$$ dis_{ij} = \sqrt{(p_i - p_j)^2 + (a_i - a_j)^2 + (d_i - d_j)^2} $$
For the clustering algorithm, the weighted average distance method (WPGMA) is used to merge clusters. When clusters C_i and C_j are merged into a new cluster C_q, the distance from C_q to any other cluster C_s is:
$$ d(C_q, C_s) = \frac{1}{2}\left[ d(C_i, C_s) + d(C_j, C_s) \right] $$
Finally, the number of clusters N can be selected according to actual requirements.
Step 4: For a new word that is not in the original database, first label it along the PAD dimensions, then calculate its Euclidean distance to each of the final clusters, and assign it to the cluster with the smallest Euclidean distance.
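The clustering in Step 3 can be illustrated with a minimal pure-Python sketch. It assumes the standard WPGMA update, in which the distance from a merged cluster to any other cluster is the plain average of the two pre-merge distances. The function name `wpgma` and the five PAD vectors are hypothetical and chosen only for illustration; they are not data from the patent.

```python
import math
from itertools import combinations


def wpgma(points, n_clusters):
    """Agglomerative clustering with the WPGMA update rule.

    points: list of (p, a, d) tuples.
    Returns a sorted list of clusters, each a sorted list of point indices.
    """
    # Each active cluster id maps to the indices of its members.
    clusters = {i: [i] for i in range(len(points))}
    # Pairwise Euclidean distances between the initial singleton clusters.
    dist = {
        frozenset((i, j)): math.dist(points[i], points[j])
        for i, j in combinations(clusters, 2)
    }
    next_id = len(points)
    while len(clusters) > n_clusters:
        # Merge the closest pair of active clusters.
        pair = min(dist, key=dist.get)
        i, j = tuple(pair)
        merged = clusters.pop(i) + clusters.pop(j)
        del dist[pair]
        # WPGMA: d(q, s) = 0.5 * (d(i, s) + d(j, s)) for every other cluster s.
        for s in clusters:
            d_is = dist.pop(frozenset((i, s)))
            d_js = dist.pop(frozenset((j, s)))
            dist[frozenset((next_id, s))] = 0.5 * (d_is + d_js)
        clusters[next_id] = merged
        next_id += 1
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical normalized PAD vectors (not taken from the patent).
points = [
    (0.8, 0.7, 0.8),
    (0.7, 0.6, 0.9),
    (-0.4, -0.6, -0.6),
    (-0.3, -0.5, -0.5),
    (-0.75, 0.75, -0.5),
]
print(wpgma(points, 3))
```

Stopping the merge loop once N clusters remain is one simple way to realize "select the number of clusters N"; cutting a full dendrogram at the desired level would be an equivalent alternative.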
The advantages and positive effects of the Chinese vocabulary emotion modeling method provided by the invention are:
1. The method is grounded in the psychological theory of emotion and distinguishes the emotion of words from multiple angles.
2. The method addresses the problem that Chinese emotion vocabulary lacks accurate measurement and classification and is therefore hard for a computer to recognize.
(4) Description of the drawings:
Fig. 1: Flow chart of Chinese vocabulary emotion modeling
Fig. 2: Clustering result in three-dimensional space
(5) Embodiment:
The basic idea of the invention is to quantify the emotional characteristics of words by labeling emotion words along the three dimensions P, A and D, and then to cluster them by hierarchical clustering, thereby completing the emotion model of the vocabulary.
Following this idea, the flow of the invention is shown in Fig. 1.
An embodiment of the modeling method is described below through a concrete example:
1. Collect 88 emotion words through various channels, as follows:
Proud/as to suspect/as to envy/joyful/awkward/as to be willing to/as to show loving care for/gratified/as to treasure/cherish the memory of/feel grateful/sympathize with/glad/unhappy
In a depressed state/boring/as puzzle/to worry/worry/as to be ready/sad/as to treasure/oppressiveness/discrimination/sorry/grieved/as to feel uncertain/feel oneself inferior
Sad/sadness/trust/depression/panic/as to have a guilty conscience/anxiety/fear/as to show consideration for/regret deeply/as to make allowances for/uneven/bitterly disappointed/as to envy and hate
Disappointed/as to loosen/show off/disdain/take pride in/as to hate/grievance/as to abhor/confidence/downhearted/complacency/grief and indignation/peacefulness/anxiety
Relieved/despair/animosity/an innocent person/decadence/as to despise/filled with anger/wrathful/as to regret/respect/indignation/pride/detestation/ashamed
Arrogant/as to advocate/as to worry/surprised/excitement/worship/impetuousness/emotion/madness/arrogant and imperious/shyness/poverty-stricken/agitation/shock
Worried/disagreeable/sorry/as to be weary of
2. Label the 88 words above along the P, A and D dimensions. For example, the labeling results for proud, boring and suspect are:
Word      Dimension   Evaluator 1   Evaluator 2   Evaluator 3
Proud     P           3             3             4
Boring    P           -1            -1            -2
Suspect   P           -2            -1            -2
Word      Dimension   Evaluator 4   Evaluator 5   Evaluator 6
Proud     A           3             2             4
Boring    A           -2            -2            -3
Suspect   A           1             1             2
Word      Dimension   Evaluator 7   Evaluator 8   Evaluator 9
Proud     D           4             2             4
Boring    D           -1            -2            -4
Suspect   D           -2            -2            -2
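As a worked illustration of Step 2, the sketch below averages the three ratings per dimension from the example above and scales them to the range -1 to +1. The text does not spell out the normalization formula; dividing the mean by the scale maximum of 4 is an assumption, though it agrees with the normalized values quoted later for the new word. The names `ratings`, `SCALE_MAX` and `pad_vector` are hypothetical.

```python
from statistics import mean

# Raw ratings (scale -4 to +4) from the three evaluators of each dimension,
# copied from the example table.
ratings = {
    "proud":   {"P": [3, 3, 4],    "A": [3, 2, 4],    "D": [4, 2, 4]},
    "boring":  {"P": [-1, -1, -2], "A": [-2, -2, -3], "D": [-1, -2, -4]},
    "suspect": {"P": [-2, -1, -2], "A": [1, 1, 2],    "D": [-2, -2, -2]},
}

SCALE_MAX = 4  # assumption: normalize by dividing the mean by the scale maximum


def pad_vector(word_ratings):
    """Average the three ratings per dimension and scale to [-1, +1]."""
    return tuple(mean(word_ratings[dim]) / SCALE_MAX for dim in ("P", "A", "D"))


pad = {word: pad_vector(r) for word, r in ratings.items()}
for word, (p, a, d) in pad.items():
    print(f"{word:8s} P={p:+.3f} A={a:+.3f} D={d:+.3f}")
```

Under this assumption, "proud" maps to roughly (0.833, 0.750, 0.833), which places it near the high-pleasure, high-arousal, high-dominance corner of the PAD space.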
3. Perform hierarchical cluster analysis on the 88 emotion words, obtaining the following result:
[Figure BDA0000060564980000041: clustering result for the 88 emotion words (image not reproduced)]
Fig. 2 shows the clustering result in three-dimensional space.
A closer analysis of the clustering result shows that the classification reflects well both what different emotion words have in common and how they differ. Take, for example, the two words discrimination and self-satisfied in the 4th cluster: they do not appear closely related, yet personal experience makes it clear that both imply a high regard for oneself and a belittling of others. This suggests that the PAD labeling model is consistent with how emotions arise psychologically.
4. Take a new word that is not in the original database: terrified. First label it along P, A and D; the final normalized result is:
Word        P       A      D
Terrified   -0.75   0.75   -0.5
Then, according to its Euclidean distance to the existing clusters, it is assigned to the 7th cluster. The result shows that emotion modeling based on the PAD model works well.
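The assignment of a new word can be sketched as a nearest-cluster lookup. The text does not say how the distance between a word and a cluster is measured; the sketch assumes each cluster is represented by its centroid. The centroid values and the ids `c1` to `c3` are hypothetical, since the patent's cluster table is an image and is not available in the text; only the PAD vector for the new word comes from the example.

```python
import math

# Hypothetical cluster centroids in normalized PAD space (not from the patent).
centroids = {
    "c1": (0.80, 0.70, 0.80),
    "c2": (-0.35, -0.55, -0.55),
    "c3": (-0.70, 0.65, -0.45),
}


def assign_cluster(pad, centroids):
    """Return the id of the centroid nearest to `pad` by Euclidean distance."""
    return min(centroids, key=lambda cid: math.dist(pad, centroids[cid]))


terrified = (-0.75, 0.75, -0.5)  # normalized P, A, D values given in the example
print(assign_cluster(terrified, centroids))
```

Other choices, such as the average distance to all members of a cluster, would fit the description equally well; the centroid is used here only because it keeps the example short.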

Claims (1)

1. A Chinese vocabulary emotion modeling method based on PAD, characterized by the following steps:
Step 1: Establish an original database of Chinese emotion vocabulary by collecting words that express emotion from multiple channels such as newspapers, digests, blogs, social networking sites and BBS.
Step 2: Label each word in the database along the three dimensions P, A and D, on a scale from -4 to +4; to keep the labeling objective and accurate, each dimension is labeled by three different evaluators, so each word receives three separate evaluations per dimension; the P, A and D values of each word are the averages of these three evaluations, normalized so that they lie between -1 and +1.
Step 3: Perform hierarchical cluster analysis on all labeled emotion words according to their P, A and D values; to obtain a better clustering result, Euclidean distance is used as the distance measure and the weighted average distance method is used as the clustering algorithm; the number of clusters N can be selected according to actual requirements.
Step 4: For a new word that is not in the original database, first label it along the PAD dimensions, then calculate its Euclidean distance to each of the final clusters, and assign it to the cluster with the smallest Euclidean distance.
CN2011101218630A 2011-05-11 2011-05-11 Chinese vocabulary emotion modeling method based on pleasure, arousal and dominance (PAD) Pending CN102184232A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2011101218630A CN102184232A (en) 2011-05-11 2011-05-11 Chinese vocabulary emotion modeling method based on pleasure, arousal and dominance (PAD)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2011101218630A CN102184232A (en) 2011-05-11 2011-05-11 Chinese vocabulary emotion modeling method based on pleasure, arousal and dominance (PAD)

Publications (1)

Publication Number Publication Date
CN102184232A true CN102184232A (en) 2011-09-14

Family

ID=44570409

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2011101218630A Pending CN102184232A (en) 2011-05-11 2011-05-11 Chinese vocabulary emotion modeling method based on pleasure, arousal and dominance (PAD)

Country Status (1)

Country Link
CN (1) CN102184232A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9152625B2 (en) 2011-11-14 2015-10-06 Microsoft Technology Licensing, Llc Microblog summarization
CN107895027A (en) * 2017-11-17 2018-04-10 合肥工业大学 Individual feelings and emotions knowledge mapping method for building up and device
CN109509486A (en) * 2018-07-31 2019-03-22 苏州大学 A kind of Emotional Corpus construction method embodying emotion detailed information
CN111538834A (en) * 2020-01-21 2020-08-14 中国银联股份有限公司 Emotion dictionary construction method and system, emotion recognition method and system and storage medium
CN114547240A (en) * 2022-01-28 2022-05-27 同济大学 PAD-based image emotion labeling method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
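ZE-JING CHUANG et al.: "Multi-Modal Emotion Recognition from Speech and Text", 《COMPUTATIONAL LINGUISTICS AND CHINESE LANGUAGE PROCESSING》 *
孙佩宏 (Sun Peihong): "Analysis of the probabilistic characteristics of emotion data in the PAD emotional space", 《Proceedings of the 4th Joint Conference on Harmonious Human-Machine Environment》 *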

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9152625B2 (en) 2011-11-14 2015-10-06 Microsoft Technology Licensing, Llc Microblog summarization
CN107895027A (en) * 2017-11-17 2018-04-10 合肥工业大学 Individual feelings and emotions knowledge mapping method for building up and device
CN109509486A (en) * 2018-07-31 2019-03-22 苏州大学 A kind of Emotional Corpus construction method embodying emotion detailed information
CN109509486B (en) * 2018-07-31 2021-04-09 苏州大学 Emotion corpus construction method for embodying emotion detail information
CN111538834A (en) * 2020-01-21 2020-08-14 中国银联股份有限公司 Emotion dictionary construction method and system, emotion recognition method and system and storage medium
CN114547240A (en) * 2022-01-28 2022-05-27 同济大学 PAD-based image emotion labeling method

Similar Documents

Publication Publication Date Title
Neuendorf Content analysis and thematic analysis
Heck et al. Multilevel modeling of categorical outcomes using IBM SPSS
CN101763401B (en) Network public sentiment hotspot prediction and analysis method
Ha et al. National identity, national pride, and happiness: The case of South Korea
CN103699626B (en) Method and system for analysing individual emotion tendency of microblog user
CN104182805B (en) Dangerous tendency Forecasting Methodology based on inmate's behavioural characteristic integrated study model
CN103207913B (en) The acquisition methods of commercial fine granularity semantic relation and system
CN106683688B (en) Emotion detection method and device
US9443193B2 (en) Systems and methods for generating automated evaluation models
CN102184232A (en) Chinese vocabulary emotion modeling method based on pleasure, arousal and dominance (PAD)
KR102244938B1 (en) Artificial intelligence employment system and employing method of thereof
da Silva et al. Personality recognition from Facebook text
Bakhtiyari et al. Fuzzy model of dominance emotions in affective computing
CN102592593A (en) Emotional-characteristic extraction method implemented through considering sparsity of multilinear group in speech
Kim Analysis of standard vocabulary use of the open government data: the case of the public data portal of Korea
Suls Psychological perspectives on the self, Volume 4: The self in social perspective
CN110188671A (en) A method of handwriting characteristic is analyzed using machine learning algorithm
Rajamanickam Statistical methods in psychological and educational research
CN107578785A (en) The continuous emotional feature analysis evaluation method of music based on Gamma distributional analysis
Karl et al. More than yes and no: Predicting the magnitude of non-invariance between countries from systematic features
Masalimova et al. Exploring Preservice STEM Teachers' Smartphone Addiction.
JP2012098921A (en) User classification system
CN110222262A (en) A kind of network user's personality automatic identifying method using news comment behavior
US20120082964A1 (en) Enhanced graphological detection of deception using control questions
Saliya Data and Variables

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20110914