CN107729320A - Emoticon recommendation method based on time-series analysis of user conversation emotion trends - Google Patents

Info

Publication number
CN107729320A
Authority
CN
China
Prior art keywords
emotion
user
emoticon
dictionary
conversation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710976797.2A
Other languages
Chinese (zh)
Other versions
CN107729320B (en)
Inventor
高岭
周俊鹏
曹瑞
杨旭东
郑杰
杨建峰
高全力
王海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwest University
Original Assignee
Northwest University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwest University
Priority to CN201710976797.2A
Publication of CN107729320A
Application granted
Publication of CN107729320B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/14 Session management
    • H04L67/141 Setup of application sessions

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Mathematical Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Physics (AREA)
  • Computational Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Algebra (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

An emoticon recommendation method based on time-series analysis of user conversation emotion trends. The method mines the user's chat records to analyze the emotion values of the dialogue and uses them to build the mapping of emoticons in an emotion matrix; analyzes the conversation history with an emotion dictionary to compute emotion keywords; computes a 21-dimensional emotion vector for the conversation from the emotion keywords and calculation rules; performs single-step prediction of the development of the user's current conversation emotion vector with a time-series model (ARMA/ARIMA); and, from the mapping, uses the nearest neighbor algorithm (KNN) to select the group of emoticons closest to the user's emotion trend and generate a recommendation list. With the technical solution provided by the invention, a user of a chat tool can be recommended, promptly and accurately, emoticons that match the user's current emotion and the conversational context. This greatly simplifies the operation of selecting an emoticon, improves recommendation coverage, and enhances the user experience.

Description

Emoticon recommendation method based on time-series analysis of user conversation emotion trends
Technical field
The present invention relates to the technical field of intelligent recommendation, and in particular to an emoticon recommendation method based on time-series analysis of user conversation emotion trends.
Background
Besides spoken and written language, emoticons are now one of the most important ways in which people express their own feelings and interact in everyday online chat, and different emoticons carry different, rich meanings. The Oxford Dictionary, a representative English-language dictionary, chose an emoji for the first time as its word of the year at the end of 2015: Face with Tears of Joy, officially explained as "a smiling face so happy that it weeps". This shows that emoticons, as a product of the 21st-century demand for fast, concentrated visual communication, are highly representative. Pictographic images such as emoji can fill the emotional gaps in flat text and give words an expressive intonation, making communication richer and more varied. Emoticons have therefore gone beyond the limits of language to become independent entities that can exist apart from it, and they play a very important role on the network.
As the number of emoticons keeps growing, users suffer from "choice paralysis" when using them. Since big data began to attract wide attention, recommender systems have been widely recognized, and extensively developed and applied, as an effective means of alleviating information overload. The need for emoticon recommendation emerges when users have too many emoticons to choose from, when users urgently want a suitable emoticon for a quick reply, and when emoticons no longer attract users.
Emoticon recommendation not only solves the promotion problem, helping emoticon creators obtain greater economic returns and letting more works be accepted by users; it also fits users' habits more closely, makes chatting easier, faster and more convenient, and makes chat more personalized.
At present, emoticon recommendation is mainly based on usage frequency: the user's daily usage is counted and the most frequently used emoticons are placed in the recommendation list. This approach does not embody any real recommendation function and also limits the promotion of the emoticons themselves. Chinese input methods mostly predict the emoticon corresponding to the word the user is about to enter from the incomplete pinyin typed so far, or recommend by matching the Chinese label attached to each emoticon. Strictly speaking, none of these is a recommendation algorithm; they merely replay the usage history or match emoticons by label.
The ultimate goal of a mature recommender system is to recommend to users products they may be interested in but that seldom enter their field of view, which both satisfies users' desire for novelty and maximizes potential profit. In emoticon use, most recommendations are still produced by the traditional frequency-based method and cannot adapt to users' differing needs. The present invention therefore takes full account of users' everyday operating experience and the needs arising from it, and proposes a new emoticon recommendation method: emotion keywords are computed by analyzing the conversation history with an emotion dictionary; the change in emotion before and after each emoticon the user sends is analyzed to compute an emoticon-to-emotion-value mapping dictionary; using the time information, the emotion value of the next period is computed with an autoregressive integrated moving average model; and finally the recommended emoticons are obtained by querying the emoticon emotion dictionary.
Summary of the invention
To overcome the above deficiencies of the prior art, the object of the present invention is to provide an emoticon recommendation method based on time-series analysis of user conversation emotion trends, which promptly and accurately recommends an emoticon to the user of a chat tool, compensating for the vagueness of language and expressing richer emotion. The recommendation method mainly mines the latent emotional value present in the user's dialogue records. Its purpose is to extract the information units that carry emotional content and convert that content into structured data that a computer can recognize. The basic emotions are divided into seven classes, "liking (love, respect), disgust, joy (pleasure), anger, sorrow, fear and desire", and quantified, and an emoticon-emotion index matrix is built on this basis. The user's dialogue history is analyzed with an emotion dictionary to compute emotion keywords, so that the change in emotion before and after the user sends each emoticon can be analyzed, and the emoticon needed during the chat can be computed more accurately from that change. A suitable time-series model is then established and used to predict the emotion trend of the user's current dialogue; the group of emoticons closest to the user's emotion trend is selected from the emoticon-emotion matrix and a recommendation list is generated for the user. The invention also provides examples of the emotion dictionaries constructed, mainly an extended emotion dictionary, a modal particle auxiliary emotion dictionary and a punctuation mark auxiliary emotion dictionary.
To achieve the above object, the technical solution adopted by the present invention is:
An emoticon recommendation method based on time-series analysis of user conversation emotion trends, comprising the following steps:
1) mining and preprocessing the user's chat records and analyzing the emotion values of the dialogue, and on this basis building the mapping of emoticons in the emotion matrix;
2) analyzing the conversation history with an emotion dictionary to compute emotion keywords, the dictionary being divided into an emotion dictionary, a modal particle emotion dictionary and a punctuation mark emotion dictionary; forward maximum matching is used with the emotion dictionary: words of different lengths that contain the same characters are placed in the user dictionary and ordered from longest to shortest, so that the longest phrase or word that can be found directly is matched first;
3) computing the 21-dimensional emotion vector of the conversation from the emotion keywords and the calculation rules;
4) performing single-step prediction of the development of the user's current conversation emotion vector with a time-series model, and using the nearest neighbor algorithm to select from the mapping the group of emoticons closest to the user's emotion trend and generate a recommendation list.
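The forward maximum matching mentioned in step 2) can be sketched as follows. This is only a minimal illustration: the function name `forward_max_match`, the toy dictionary and the rule that an unmatched character becomes a one-character token are assumptions, not details taken from the patent.

```python
def forward_max_match(text, dictionary, max_len=None):
    """Segment text left to right, always taking the longest dictionary entry
    that starts at the current position (hypothetical helper)."""
    if max_len is None:
        max_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink one character at a time.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is accepted even when it is not in the dictionary.
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


# Toy user dictionary: "不开心" (unhappy) contains "开心" (happy).
user_dictionary = {"开心", "不开心", "非常"}
print(forward_max_match("我非常不开心", user_dictionary))
```

Because the longer entry is tried first, "不开心" is matched as one word instead of being split into "不" and "开心", which would reverse the polarity seen by the emotion dictionary.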
In step 1), obtaining the user's chat records and analyzing the emotion values of the dialogue includes: mining the user's chat record information, which is divided into text information and voice information; filtering, segmenting and removing stop words from the user's existing chat records and similar information; and matching the result against the emotion dictionary to build an emotion dictionary unique to the individual, which is used to label the emotion values of emoticons.
In step 1), building the mapping of emoticons in the emotion matrix includes computing the user's emotion values to obtain an emoticon-emotion index matrix, which records the two-dimensional relation between each emoticon and the emotion values it can express. The emoticon-emotion value mapping mainly describes how the emoticons sent by each user are expressed in terms of emotion values. Since not every utterance contains computable emotion, the k emotion-bearing utterances preceding the emoticon should be extracted when the emotion values are computed, and the user dictionary used for segmentation and the emotion dictionary used for computing emotion values must contain the same entries, so as to maximize dictionary matching.
In step 2), computing emotion values from the user's conversation records includes: determining the emotion classification standard according to Ekman's classification and expanding it into 21 groups; establishing, through this classification, a reference standard that quantifies the actual emotional expression of specific words; and analyzing the conversation history with the emotion dictionary to compute emotion keywords, which includes building the corresponding emotion dictionary. The user's historical conversation records are segmented and extracted, and an auxiliary emotion modal particle table and an emotion punctuation mark table are also built for modal particles and punctuation marks;
In step 3), computing the 21-dimensional emotion vector of the conversation from the emotion keywords and the calculation rules includes extracting the n preceding emotion-bearing utterances from the user's conversation records and preprocessing them by segmentation, filtering and so on; the processed sentences are looked up and matched in the emotion dictionary and the total expectation of their emotional tendency is computed, which gives the 21-dimensional emotion vector of the corresponding emoticon.
In the preprocessing of chat record information in step 1), the emotion data of the dialogue are arranged in chronological order to form a random time series:
$\{Emotion_i,\ i = t_1, t_2, t_3, \ldots, t_n\}$
Sentences repeated in the conversation records are deduplicated, and incomplete information is handled by curve fitting followed by resampling.
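The preprocessing described above (chronological ordering, deduplication, then fitting and resampling) might look like the sketch below. The record layout `(timestamp, sentence, emotion_value)`, the function name `prepare_series` and the use of linear interpolation in place of the unspecified curve fit are all assumptions made for illustration.

```python
def prepare_series(records, step):
    """Order records by time, drop repeated sentences, and resample the
    emotion values on a regular grid (hypothetical helper).

    records: iterable of (timestamp, sentence, emotion_value)
    step:    positive spacing of the output grid, in timestamp units
    """
    seen = set()
    points = []
    for ts, sentence, value in sorted(records, key=lambda r: r[0]):
        if sentence in seen:          # keep only the first occurrence
            continue
        seen.add(sentence)
        points.append((ts, value))
    if not points:
        return []

    grid = []
    t, end, j = points[0][0], points[-1][0], 0
    while t <= end:
        # Advance to the last known point at or before t.
        while j + 1 < len(points) and points[j + 1][0] <= t:
            j += 1
        t0, v0 = points[j]
        if j + 1 < len(points) and t > t0:
            t1, v1 = points[j + 1]
            v = v0 + (v1 - v0) * (t - t0) / (t1 - t0)   # linear interpolation
        else:
            v = v0
        grid.append((t, v))
        t += step
    return grid


records = [(0, "hi", 0.0), (20, "great", 2.0), (10, "hi", 5.0), (40, "sad", -2.0)]
print(prepare_series(records, step=10))
```

In this toy run the second "hi" is discarded as a repeat, and the gaps at t = 10 and t = 30 are filled from their neighbors so that the series is evenly spaced before it is passed to the time-series model.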
In step 4), performing single-step prediction of the development of the user's current conversation emotion vector with a time-series model includes extracting the conversation history, that is, computing the pattern of change from historical data along the time dimension and extending that pattern into the future so as to predict how things will change. A time-series analysis model is established from the AR model, the MA model and their combination ARMA, where the general formula of ARMA(p, q) is:
$Y_t = \beta_0 + \beta_1 Y_{t-1} + \beta_2 Y_{t-2} + \cdots + \beta_p Y_{t-p} + \varepsilon_t + \alpha_1 \varepsilon_{t-1} + \cdots + \alpha_q \varepsilon_{t-q}$
In the formula, p and q are the autoregressive order and the moving-average order of the model; β and α are the autoregressive coefficients and the moving-average coefficients, respectively; $\varepsilon_t$ is the error term; and $Y_t$ is a stationary, normally distributed, zero-mean time series. Let the difference operator be $\nabla = 1 - B$, where $B$ is the backshift operator ($B X_t = X_{t-1}$). Applying a d-th order difference to a non-stationary sequence $\{X_t\}$ gives a new sequence $\nabla^d X_t$ that is stationary. If this sequence is assumed to fit an ARMA(p, q) model, then in operator form the autoregressive coefficient polynomial is:
$\varphi(B) = 1 - \varphi_1 B - \varphi_2 B^2 - \cdots - \varphi_p B^p$
and the moving-average coefficient polynomial is:
$\theta(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q$
If the data are not stationary, they are differenced and then modeled with an ARIMA(p, d, q) model, whose formula becomes:
$\varphi(B) \nabla^d X_t = \theta(B) \varepsilon_t$
Here d is the number of differencing operations actually performed to make the series stationary, and it does not exceed 2. The parameters of the ARMA(p, q) model are estimated by least squares, that is, by the set of parameter values that minimizes the residual sum of squares. Let the parameter set be:
$\delta = (\beta_1, \beta_2, \ldots, \beta_p, \alpha_1, \alpha_2, \ldots, \alpha_q)^T$
The value $\hat{\delta}$ at which the residual sum of squares $Q(\delta) = \sum_t \varepsilon_t^2$ reaches its minimum is the least-squares estimate of the parameter set. The least-squares estimate $\hat{\sigma}_\varepsilon^2$ of the white-noise variance is: [formula missing in the source]
The stationarity of the data is verified and the time-series analysis procedure is constructed: after the data pass through stationarity detection, a stationary sequence is fitted directly with an ARMA(p, q) model, whereas a non-stationary sequence is differenced and tested for stationarity again, and is finally fitted with an ARIMA(p, d, q) model.
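As a rough illustration of least-squares estimation followed by single-step prediction, the sketch below fits only the simplest special case, AR(1), i.e. ARMA(1, 0) with an intercept. The moving-average part and the choice of p and q are left out, and the function names `fit_ar1` and `predict_next` are assumptions rather than anything named in the patent.

```python
def fit_ar1(series):
    """Least-squares fit of y_t = b0 + b1 * y_{t-1} + e_t.

    Returns (b0, b1), the pair that minimizes the residual sum of squares.
    """
    x = series[:-1]   # lagged values y_{t-1}
    y = series[1:]    # current values y_t
    n = len(x)
    if n < 2:
        raise ValueError("need at least three observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("lagged series is constant; slope is undefined")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def predict_next(series, b0, b1):
    """Single-step prediction from the last observed value."""
    return b0 + b1 * series[-1]


history = [1.0, 2.0, 4.0, 8.0]       # toy values for one emotion dimension
b0, b1 = fit_ar1(history)
print(round(b1, 6), round(predict_next(history, b0, b1), 6))
```

In a full system each of the 21 emotion dimensions would presumably be fitted and predicted separately (or jointly), and a library implementation of ARMA/ARIMA would normally replace this hand-written fit.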
In step 4), selecting the group of emoticons closest to the user's emotion trend and generating the emoticon recommendation list includes using the user's historical conversation records and the emotion trend to analyze the user's understanding of, and usage habits for, each emoticon, and, with reference to the emoticon-emotion mapping table, recommending to the user the next emoticon that matches the user's emotion trend.
In building the emotion dictionary in step 2), each word in the emotion vocabulary ontology is represented by a triple: info represents the basic information of the word, including its number, explanation, corresponding English translation, part of speech and the information of the person who entered it; relation represents the relations between words, including synonymy, antonymy and so on; emotion represents the emotion information of the word. The triple is expressed as:
$Lexicon_i = (info, relation, emotion)$
The emotion information of each word includes the part-of-speech type, number of senses, emotion category, intensity, polarity, secondary emotion category, secondary intensity, secondary polarity, and so on;
Building the modal particle emotion dictionary includes dividing emotion intensity into 7 levels from weak to strong and dividing polarity into 0 neutral, 1 commendatory, 2 derogatory and 3 both commendatory and derogatory, with corresponding calculation optimizations for the modal particles that appear in the dialogue system;
The modal particle auxiliary emotion table is built following the construction rules of the conventional verb and noun emotion dictionary, and for each modal particle three cases are considered: the particle itself, the character or word that precedes it, and the character or word that follows it;
Building the punctuation mark emotion dictionary includes building the punctuation mark tone emotion auxiliary table by consulting relevant Chinese literature, dictionaries and similar sources to obtain the usage and expressive effect of each mark, and then manually constructing, from that effect, the way it influences the emotion value.
The beneficial effects of the invention are:
An emoticon is promptly and accurately recommended to the user of a chat tool, compensating for the vagueness of language and expressing richer emotion. The recommendation method mainly mines the latent emotional value present in the user's dialogue records. Its purpose is to extract the information units that carry emotional content and convert that content into structured data that a computer can recognize. The basic emotions are divided into seven classes, "liking (love, respect), disgust, joy (pleasure), anger, sorrow, fear and desire", and quantified, and an emoticon-emotion index matrix is built on this basis. The user's dialogue history is analyzed with an emotion dictionary to compute emotion keywords, so that the change in emotion before and after the user sends each emoticon can be analyzed, and the emoticon needed during the chat can be computed more accurately from that change. A suitable time-series model is then established and used to predict the emotion trend of the user's current dialogue; the group of emoticons closest to the user's emotion trend is selected from the emoticon-emotion matrix and a recommendation list is generated for the user. The invention also provides examples of the emotion dictionaries constructed, mainly an extended emotion dictionary, a modal particle auxiliary emotion dictionary and a punctuation mark auxiliary emotion dictionary. The method helps the user choose emoticons that better suit the current context in a conversation, and thus gives the user more accurate emoticon recommendations.
Brief description of the drawings
Fig. 1 is the overall framework diagram of the emoticon recommendation method of the present invention.
Fig. 2 is the overall flow chart of the emoticon recommendation method of the present invention.
Fig. 3 is the flow chart of the emoticon-emotion value matrix of the present invention.
Fig. 4 is the time-series analysis flow chart of the emoticon recommendation method of the present invention.
Embodiments
The present invention is further described below with reference to the accompanying drawings, but it is not limited to the following embodiments.
The invention provides an overall framework for an emoticon recommendation method. As shown in Fig. 1, the framework comprises the following stages:
S11, mine the user's chat records and analyze the emotion values of the dialogue;
S12, build the mapping of the emoticon-emotion value matrix and build the emotion dictionary;
S13, analyze the conversation history with the emotion dictionary to compute emotion keywords;
S14, compute the 21-dimensional emotion matrix of the conversation from the emotion keywords;
S15, perform single-step prediction of the emotion of the user's current conversation with a time-series model;
S16, from the single-step prediction result, generate the recommendation list by nearest neighbor search in the emoticon-emotion value matrix.
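Stage S16 can be pictured with the following sketch, which ranks emoticons by their distance to the predicted emotion vector and keeps the k closest. The Euclidean metric, the function name `recommend`, the emoticon names and the 3-dimensional toy vectors (instead of 21 dimensions) are assumptions chosen to keep the example short; the patent does not state which distance is used.

```python
import math


def recommend(predicted, emoticon_matrix, k=3):
    """Return the k emoticons whose emotion vectors are nearest to `predicted`.

    emoticon_matrix: dict mapping emoticon name -> emotion vector (one row each).
    """
    def distance(a, b):
        # Euclidean distance; an assumed metric.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    ranked = sorted(emoticon_matrix,
                    key=lambda name: distance(predicted, emoticon_matrix[name]))
    return ranked[:k]


# Toy emoticon-emotion value matrix with three emotion dimensions.
matrix = {
    "smile": [1.0, 0.0, 0.0],
    "cry":   [0.0, 1.0, 0.0],
    "angry": [0.0, 0.0, 1.0],
    "grin":  [0.8, 0.1, 0.0],
}
print(recommend([0.9, 0.0, 0.0], matrix, k=2))
```

The predicted vector here would come from the single-step forecast of stage S15, and the returned names form the recommendation list shown to the user.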
With the technical solution provided by the present invention, a user of a chat tool can be recommended, promptly and accurately, emoticons that match the user's current emotion and the conversational context, which greatly simplifies the operation of selecting an emoticon and enhances the user experience.
Fig. 2 is the overall flow chart of the emoticon recommendation method provided by the present invention. The method mainly comprises the following steps:
Initialize the emoticon-emotion value matrix, obtain the user's chat data, and filter and clean the data;
According to the extraction rule, select the k rows of data preceding the appearance of an emoticon;
Preprocess the k selected records, including filtering, segmentation and stop-word removal;
Build the emotion dictionary and match the segmentation results against it; the emotion dictionary includes the extended emotion dictionary, the modal particle auxiliary emotion dictionary and the punctuation mark auxiliary emotion dictionary;
Compute the 21-dimensional emotion value vector and preprocess it for time-series modeling;
Predict the next emotional tendency with the time-series model;
Produce the recommendation by querying the emoticon-emotion value matrix, and judge whether the recommendation succeeded; if it failed, update the emoticon-emotion value matrix.
Before the emotion dictionary is built, emotions must be classified, as in the example of Table 1. Classifying emotions establishes a reference standard and quantifies the actual emotional expression of specific words; it also puts each emoticon into a one-to-one mapping with its corresponding emotion values, which makes it easier to build the emoticon-emotion value matrix later.
Table 1: Example of emotion classification
In building the emotion dictionary, besides adding new words that do not appear in the original dictionary, two auxiliary tables, an emotion modal particle table and an emotion punctuation mark table, are built for modal particles and punctuation marks. They assist the matching results computed with the original dictionary and make them more accurate. The details are as follows:
(1) Conventional verbs and nouns
Each word in the emotion vocabulary ontology is represented by a triple: info represents the basic information of the word, including its number, explanation, corresponding English translation, part of speech and the information of the person who entered it; relation represents the relations between words, including synonymy, antonymy and so on; emotion represents the emotion information of the word, which is the part mainly considered and used here.
$Lexicon_i = (info, relation, emotion)$
The emotion information of each word includes the part-of-speech type, number of senses, emotion category, intensity, polarity, secondary emotion category, secondary intensity, secondary polarity, and so on. On the basis of the original 27,466 emotion words, newly added words are entered by the following steps:
For each newly added word:
1. If the word already appears in the original emotion dictionary, nothing is done;
2. If the word does not appear in the emotion dictionary:
a. The synonyms of the word are looked up in info, and each synonym is searched for in the original emotion dictionary; if it is found, its emotion classification is assigned to the newly added word;
b. If the synonyms are not in the emotion dictionary either, the Chinese explanation corresponding to the word's English translation is looked up in the emotion dictionary, and its emotion classification is assigned to the newly added word;
3. The emotion intensity is initially determined by computing the word's mutual information with standard words; unreasonable results are adjusted manually.
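The triple and the entry procedure for new words can be sketched as below. The class name `LexiconEntry`, the dictionary keys (`"english"`, `"synonyms"`, `"category"` and so on) and the function `add_word` are hypothetical; step 3 (intensity from mutual information plus manual adjustment) is not implemented, and the inherited intensity is simply copied as a placeholder.

```python
from dataclasses import dataclass


@dataclass
class LexiconEntry:
    """One word as the triple (info, relation, emotion); field contents are assumed."""
    info: dict        # e.g. number, explanation, English translation, part of speech
    relation: dict    # e.g. {"synonyms": [...], "antonyms": [...]}
    emotion: dict     # e.g. {"category": "joy", "intensity": 5, "polarity": 1}


def add_word(lexicon, word, synonyms, english):
    """Enter a new word following steps 1, 2a and 2b; returns the entry or None."""
    if word in lexicon:                       # step 1: already known, do nothing
        return lexicon[word]
    emotion = None
    for synonym in synonyms:                  # step 2a: inherit from a known synonym
        if synonym in lexicon:
            emotion = dict(lexicon[synonym].emotion)
            break
    if emotion is None:                       # step 2b: fall back to the English translation
        for entry in lexicon.values():
            if entry.info.get("english") == english:
                emotion = dict(entry.emotion)
                break
    if emotion is None:
        return None                           # left for manual annotation
    entry = LexiconEntry(info={"english": english},
                         relation={"synonyms": list(synonyms)},
                         emotion=emotion)
    lexicon[word] = entry
    return entry


lexicon = {"高兴": LexiconEntry({"english": "happy"}, {"synonyms": []},
                               {"category": "joy", "intensity": 5, "polarity": 1})}
print(add_word(lexicon, "开森", ["高兴"], "glad").emotion["category"])
```

Words that cannot be resolved by either route are returned as `None` here so that they can be queued for manual labeling, which matches the manual adjustment mentioned in step 3 only in spirit.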
(2) Modal particle auxiliary emotion dictionary
A modal particle auxiliary emotion table consisting of common modal particles is constructed in simple form from relevant reference material; see the example in Table 2. The emotion information is written in the format {emotion symbol | emotion intensity | commendatory/derogatory tendency}. Emotion intensity is divided into 7 levels from weak to strong, and polarity is divided into 0 neutral, 1 commendatory, 2 derogatory and 3 both commendatory and derogatory. Corresponding calculation optimizations are made for the modal particles that appear in the dialogue system.
Table 2: Example structure of the modal particle auxiliary emotion table
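A record in the {emotion symbol | emotion intensity | commendatory/derogatory tendency} format could be parsed as in the sketch below. The sample string, the emotion symbol "joy", the function name `parse_emotion_info` and the assumption that the 7 intensity levels are numbered 1 to 7 are illustrative only, since the contents of Table 2 are not available here.

```python
# Polarity codes as listed in the text.
POLARITY = {0: "neutral", 1: "commendatory", 2: "derogatory", 3: "both"}


def parse_emotion_info(text):
    """Parse '{symbol|intensity|polarity}' into a dict (hypothetical helper)."""
    symbol, intensity, polarity = (part.strip() for part in text.strip("{}").split("|"))
    intensity = int(intensity)
    polarity = int(polarity)
    if not 1 <= intensity <= 7:      # assumed numbering of the 7 levels
        raise ValueError("intensity must be between 1 and 7")
    if polarity not in POLARITY:
        raise ValueError("polarity must be 0, 1, 2 or 3")
    return {"symbol": symbol, "intensity": intensity, "polarity": POLARITY[polarity]}


print(parse_emotion_info("{joy | 5 | 1}"))
```

Validating the two numeric fields at load time keeps a malformed table row from silently distorting the emotion vector later on.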
(3) Punctuation mark auxiliary emotion dictionary
The punctuation mark emotion auxiliary table is built by consulting relevant Chinese literature, dictionaries and similar sources to obtain the usage and expressive effect of each punctuation mark, and then manually constructing, from that effect, the way it influences the emotion value. A calculation rule is built at the same time and is taken into account when emotion values are computed, to optimize the result. A simple punctuation mark emotion dictionary is constructed from the ways several common punctuation marks are used to express emotion, as in the example of Table 3 below.
Table 3: Example of the punctuation mark tone emotion auxiliary table
Fig. 3 is the flow chart for updating the emoticon-emotion value matrix in the emoticon recommendation method provided by the present invention. The matrix mainly describes how the emoticons sent by each user are expressed in terms of emotion values, and its calculation can be briefly described as follows:
Calculation of the emoticon-emotion value matrix:
Preprocessing:
The corpus is divided according to emoticon usage: for each emoticon, the k records preceding the sending of that emoticon are selected as one example for computing the emotion value of the emoticon.
Calculation steps:
For each emoticon:
1. Segment each line of each example into words;
2. Match each segmented word against the emotion dictionary; if it is found, add the result to the 21-dimensional emotion vector of that sentence of the example;
3. Look up the use of modal particles and punctuation marks according to the modal particle matching rules and the punctuation mark matching rules, and add the results to the emotion vector of the word;
4. Sum the vectors in the example dimension by dimension to obtain the 21-dimensional sum vector of the example;
5. Compute the mean of the vectors of all examples of the emoticon to obtain the emoticon's entry in the emoticon-emotion value matrix.
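Steps 1, 2, 4 and 5 above can be sketched as follows. Step 3 (modal particle and punctuation rules) is omitted, whitespace splitting stands in for a real Chinese segmenter, the vectors have 3 dimensions instead of 21, and the names `emoticon_vector`, `segment` and `emotion_dict` are assumptions.

```python
def emoticon_vector(examples, emotion_dict, dims, segment):
    """Average emotion vector of one emoticon.

    examples:     list of examples; each example is the list of k sentences
                  that preceded one use of the emoticon
    emotion_dict: word -> emotion vector of length `dims`
    segment:      function turning a sentence into a list of words
    """
    example_sums = []
    for example in examples:
        total = [0.0] * dims
        for sentence in example:
            for word in segment(sentence):            # step 1: segmentation
                vector = emotion_dict.get(word)       # step 2: dictionary match
                if vector is not None:
                    # step 4: accumulate dimension by dimension
                    total = [a + b for a, b in zip(total, vector)]
        example_sums.append(total)
    if not example_sums:
        return [0.0] * dims
    count = len(example_sums)
    # step 5: mean over all examples of this emoticon
    return [sum(column) / count for column in zip(*example_sums)]


emotion_dict = {"happy": [1.0, 0.0, 0.0], "sad": [0.0, 1.0, 0.0]}
examples = [["so happy today", "happy"], ["a bit sad", "but happy"]]
print(emoticon_vector(examples, emotion_dict, dims=3, segment=str.split))
```

Repeating this for every emoticon and stacking the resulting vectors row by row gives the emoticon-emotion value matrix that the nearest neighbor search later queries.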
Fig. 4 is the time-series analysis flow chart of the emoticon recommendation method provided by the present invention. The basic idea of time-series analysis is to predict the future from the history of how things have changed: the pattern of change is computed from historical data along the time dimension and extended into the future, so as to predict how things will change. The three main models of time-series analysis are the AR (autoregressive) model, the MA (moving average) model and their combination ARMA, where the general formula of ARMA(p, q) is:
$Y_t = \beta_0 + \beta_1 Y_{t-1} + \beta_2 Y_{t-2} + \cdots + \beta_p Y_{t-p} + \varepsilon_t + \alpha_1 \varepsilon_{t-1} + \cdots + \alpha_q \varepsilon_{t-q}$
In the formula, p and q are the autoregressive order and the moving-average order of the model; β and α are the autoregressive coefficients and the moving-average coefficients, respectively; $\varepsilon_t$ is the error term; and $Y_t$ is a stationary, normally distributed, zero-mean time series. Let the difference operator be $\nabla = 1 - B$, where $B$ is the backshift operator ($B X_t = X_{t-1}$). Applying a d-th order difference to a non-stationary sequence $\{X_t\}$ gives a new sequence $\nabla^d X_t$ that is stationary. If this sequence is assumed to fit an ARMA(p, q) model, then in operator form the autoregressive coefficient polynomial is:
$\varphi(B) = 1 - \varphi_1 B - \varphi_2 B^2 - \cdots - \varphi_p B^p$
and the moving-average coefficient polynomial is:
$\theta(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q$
If the data are not stationary, they are differenced and then modeled with an ARIMA(p, d, q) model, whose formula becomes:
$\varphi(B) \nabla^d X_t = \theta(B) \varepsilon_t$
Here d is the number of differencing operations actually performed to make the series stationary, and it usually does not exceed 2.
The parameters of the ARMA(p, q) model are estimated by least squares, that is, by the set of parameter values that minimizes the residual sum of squares. Let the parameter set be:
$\delta = (\beta_1, \beta_2, \ldots, \beta_p, \alpha_1, \alpha_2, \ldots, \alpha_q)^T$
The value $\hat{\delta}$ at which the residual sum of squares $Q(\delta) = \sum_t \varepsilon_t^2$ reaches its minimum is the least-squares estimate of the parameter set. The least-squares estimate $\hat{\sigma}_\varepsilon^2$ of the white-noise variance is: [formula missing in the source]
Because the stationarity of the data needs to be verified, the time-series analysis procedure shown in Fig. 4 is constructed. After the data pass through stationarity detection, a stationary sequence is fitted directly with an ARMA(p, q) model; a non-stationary sequence is differenced and tested for stationarity again, and is finally fitted with an ARIMA(p, d, q) model.
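The differencing branch of this flow can be illustrated with the sketch below, which applies the d-th order difference (d limited to at most 2, as in the text) and maps a one-step forecast of the differenced series back to the original scale. The stationarity test itself is not shown because the text does not name one, and the function names `difference` and `undifference_forecast` are assumptions.

```python
def difference(series, d=1):
    """Apply the difference operator d times: each pass maps X_t to X_t - X_{t-1}."""
    if not 0 <= d <= 2:
        raise ValueError("d is limited to 0, 1 or 2 here")
    out = list(series)
    for _ in range(d):
        out = [later - earlier for earlier, later in zip(out, out[1:])]
    return out


def undifference_forecast(series, diff_forecast, d):
    """Convert a one-step forecast of the d-times differenced series back
    into a forecast of the original series."""
    value = diff_forecast
    # Add back the last value of each lower-order differenced series,
    # from order d-1 down to order 0 (the original series).
    for order in range(d - 1, -1, -1):
        value += difference(series, order)[-1]
    return value


series = [1, 4, 9, 16, 25]                 # toy non-stationary sequence
print(difference(series, 1))               # first differences
print(difference(series, 2))               # second differences are constant
print(undifference_forecast(series, 2, d=2))
```

Here the second difference is constant, so forecasting it as 2 and undoing the two differencing passes continues the original sequence with 36; with real emotion data the forecast of the differenced series would come from the fitted ARMA model.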
The method described above improves on frequency-based recommendation to a certain extent. More importantly, the technical method provided by the present invention incorporates the user's emotion trend: time series analysis is used to predict the user's next emotional change, and the emoticon recommendation list is generated more accurately on that basis. In addition, the technical method in this specification is described progressively, the embodiments of the modules mentioned are closely related to one another, and the key technical methods mentioned in the claims are described in detail in this specification.

Claims (9)

1. An emoticon recommendation method based on time series analysis of the emotion trend of user conversations, characterized by comprising the following steps:
1) mining and preprocessing the user's chat records, analyzing the emotion values of the dialogue, and building from them the mapping of emoticons in an emotion matrix;
2) analyzing the conversation history with emotion dictionaries to compute emotion keywords, the dictionaries being divided into an emotion dictionary, a modal particle emotion dictionary, and a punctuation mark emotion dictionary; when matching against the emotion dictionary, the forward maximum matching method is used: words of different lengths that contain the same characters are placed in the user dictionary and ordered from longest to shortest, so that the phrase or word that can be located most directly is matched first;
3) computing a 21-dimensional emotion vector of the session from the emotion keywords and the computation rules;
4) performing single-step prediction of the development of the user's current conversation emotion vector with a time series model, and using a nearest neighbor algorithm to select from the mapping the group of emoticons closest to the predicted user emotion trend and generate a recommendation list.
2. The emoticon recommendation method based on time series analysis of the emotion trend of user conversations according to claim 1, characterized in that obtaining the user's chat records and analyzing the emotion values of the dialogue in step 1) comprises: mining the user's chat record information, which is divided into text information and voice information; and filtering, segmenting, and removing stop words from the user's existing chat records and similar information, then matching the result against the emotion dictionary to build an emotion dictionary unique to the individual user, which is used to label the emotion values of emoticons.
3. The emoticon recommendation method based on time series analysis of the emotion trend of user conversations according to claim 1, characterized in that building the mapping of emoticons in the emotion matrix in step 1) comprises: computing the user's emotion values to obtain an emoticon-emotion index calculation matrix; and counting the two-dimensional relationship between each emoticon and the emotion values it can express. The emoticon-emotion value matrix, that is, the emoticon-emotion value mapping, is mainly used to describe how each emoticon sent by the user is expressed in terms of emotion values. Because not every sentence contains computable emotion, during emotion value calculation the k emotion-bearing sentences preceding the emoticon should be extracted, and the user dictionary used for word segmentation and the emotion dictionary used for calculating emotion values should contain identical entries, so as to maximize the dictionary matching effect.
4. The emoticon recommendation method based on time series analysis of the emotion trend of user conversations according to claim 1, characterized in that computing the emotion values of the user's conversation records in step 2) comprises: determining the classification standard for emotion according to Ekman's division method and expanding it to 21 groups; establishing a reference standard through this emotion division to quantify the actual emotional expression of specific words; analyzing the conversation history with the emotion dictionary to compute emotion keywords, which includes building the corresponding emotion dictionary; and segmenting the user's historical conversation records and extracting from them, while also building an auxiliary modal particle emotion table and an emotion punctuation mark table for modal particles and punctuation marks.
5. The emoticon recommendation method based on time series analysis of the emotion trend of user conversations according to claim 1, characterized in that computing the 21-dimensional emotion vector of the session from the emotion keywords and the computation rules in step 3) comprises: extracting the preceding n emotion-bearing sentences of the user's conversation record and preprocessing them by segmentation, filtering, and the like; and looking up the processed sentences in the emotion dictionary and computing the total expectation of their emotional tendency, thereby obtaining the 21-dimensional emotion vector of the corresponding emoticon.
6. The emoticon recommendation method based on time series analysis of the emotion trend of user conversations according to claim 1, characterized in that the preprocessing of chat record information in step 1) comprises arranging the dialogue emotion data in chronological order to form a random time series:
$$\{\mathrm{Emotion}_i\},\quad i = t_1, t_2, t_3, \ldots, t_n$$
and deduplicating sentences repeated in the conversation record and processing incomplete information by curve fitting and resampling.
7. The emoticon recommendation method based on time series analysis of the emotion trend of user conversations according to claim 1, characterized in that performing single-step prediction of the development of the user's current conversation emotion vector with the time series model in step 4) comprises: extracting the conversation history, that is, computing the pattern of change from historical data along the time dimension and extrapolating that pattern into the future so that upcoming changes can be predicted; and establishing a time series analysis model from the AR model, the MA model, and their combination ARMA, where the general form of ARMA(p, q) is:
$$Y_t = \beta_0 + \beta_1 Y_{t-1} + \beta_2 Y_{t-2} + \cdots + \beta_p Y_{t-p} + \varepsilon_t + \alpha_1 \varepsilon_{t-1} + \cdots + \alpha_q \varepsilon_{t-q}$$
In the formula, p and q are the autoregressive order and the moving-average order of the model; β and α are the autoregressive coefficients and the moving-average coefficients, respectively; $\varepsilon_t$ is the error term; and $Y_t$ is a stationary, normally distributed, zero-mean time series. Let the difference operator be $\nabla = 1 - B$, where B is the backshift operator ($B X_t = X_{t-1}$). Suppose that applying the d-th order difference to a non-stationary sequence $\{X_t\}$ yields a new sequence $\nabla^d X_t$ that is stationary, and that this sequence suits an ARMA(p, q) model. Written in operator form, the autoregressive coefficient polynomial is:
$$\varphi(B) = 1 - \varphi_1 B - \varphi_2 B^2 - \cdots - \varphi_p B^p$$
The moving-average coefficient polynomial is:
$$\theta(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q$$
If the data are non-stationary, they are differenced first and then modeled with ARIMA(p, d, q), whose formula is:
$$\varphi(B)\,\nabla^d X_t = \theta(B)\,\varepsilon_t$$
Here d is the number of differencing passes actually applied to make the series stationary, and it does not exceed 2. The parameters of the ARMA(p, q) model are estimated by least squares, that is, by choosing the set of parameter values that minimizes the residual sum of squares. Let the parameter set be:
$$\delta = (\beta_1, \beta_2, \ldots, \beta_p, \alpha_1, \alpha_2, \ldots, \alpha_q)^T$$
Then the value $\hat{\delta}$ for which the residual sum of squares $\sum_t \varepsilon_t^2(\delta)$ reaches its minimum is the least-squares estimate of the parameter set, and the least-squares estimate of the white-noise variance $\sigma_\varepsilon^2$ is computed from this minimized residual sum of squares.
The stationarity of the data is verified and a time series analysis procedure is constructed: the data first undergo a stationarity test; a stationary sequence is fitted directly with an ARMA(p, q) model, and a non-stationary sequence is differenced, tested for stationarity again, and finally fitted with an ARIMA(p, d, q) model.
8. The emoticon recommendation method based on time series analysis of the emotion trend of user conversations according to claim 1, characterized in that selecting the group of emoticons closest to the user's emotion trend and generating the emoticon recommendation list in step 4) comprises: analyzing, from the user's historical session records and by means of emotion trend analysis, the user's understanding of and usage habits for each emoticon; and, with reference to the emoticon-emotion mapping table, recommending to the user the next emoticon that fits the user's emotion trend.
9. The emoticon recommendation method based on time series analysis of the emotion trend of user conversations according to claim 1, characterized in that building the emotion dictionary in step 2) comprises: in the emotion vocabulary ontology, a triple is used to represent a word, in which info represents the basic information of the word, including its number, explanation, corresponding English translation, part of speech, and information on the person who entered it; relation represents the direct relationships between words, including synonymy, antonymy, and the like; and emotion represents the emotion information of the word. The triple is expressed as:
$$\mathrm{Lexicon}_i = (\mathrm{info}, \mathrm{relation}, \mathrm{emotion})$$
The emotion information of each word includes part-of-speech category, number of senses, emotion classification, intensity, polarity, sub-emotion classification, sub-intensity, sub-polarity, and the like;
building the modal particle emotion dictionary comprises: dividing emotion intensity into 7 levels from weak to strong, and dividing polarity into 0 for neutral, 1 for commendatory, 2 for derogatory, and 3 for both commendatory and derogatory, with corresponding calculation optimizations made for the modal particles that appear in the conversation system;
the modal particle emotion auxiliary table is built following the construction rules of the conventional gerund emotion dictionary, and each modal particle is considered in three situations: on its own, with a character or word appearing before it, and with a character or word appearing after it;
building the punctuation mark emotion dictionary comprises: building the punctuation mark tone emotion auxiliary table by consulting relevant Chinese literature, dictionaries, and the like to obtain each mark's usage and expressive effect, and then manually constructing, according to that effect, the way it influences the emotion value.
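Two of the claimed steps, forward maximum matching in step 2) and nearest-neighbor selection in step 4) of claim 1, can be illustrated with a minimal sketch. Everything concrete in it is an assumption introduced for illustration: the function names, the tiny dictionary, the emoticon names, the use of Euclidean distance, and the 3-dimensional vectors, which stand in for the 21-dimensional emotion vectors of the claims.

```python
def forward_max_match(text, dictionary):
    """Segment `text` left to right, always trying the longest dictionary
    entry first; unmatched characters fall through as single tokens."""
    max_len = max(len(word) for word in dictionary)
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


def recommend(predicted, emoticon_vectors, k=2):
    """Return the k emoticons whose emotion vectors lie closest to the
    predicted vector (Euclidean distance is an assumption)."""
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    ranked = sorted(emoticon_vectors,
                    key=lambda name: distance(predicted, emoticon_vectors[name]))
    return ranked[:k]


# Hypothetical emotion dictionary: "不开心" (unhappy) contains "开心" (happy),
# so the longer entry must win.
dictionary = {"开心", "不开心", "今天"}
tokens = forward_max_match("今天不开心", dictionary)   # ['今天', '不开心']

# Hypothetical emoticon-to-emotion mapping (joy, sadness, anger).
mapping = {
    "smile": (0.9, 0.1, 0.0),
    "cry": (0.0, 0.9, 0.1),
    "angry": (0.0, 0.2, 0.9),
}
top = recommend((0.1, 0.8, 0.2), mapping, k=2)         # ['cry', 'angry']
print(tokens, top)
```

Trying the longest candidate first is what keeps "不开心" from being split into "不" and "开心", which would flip the detected emotion.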
CN201710976797.2A 2017-10-19 2017-10-19 Emoticon recommendation method based on time sequence analysis of user session emotion trend Active CN107729320B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710976797.2A CN107729320B (en) 2017-10-19 2017-10-19 Emoticon recommendation method based on time sequence analysis of user session emotion trend

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710976797.2A CN107729320B (en) 2017-10-19 2017-10-19 Emoticon recommendation method based on time sequence analysis of user session emotion trend

Publications (2)

Publication Number Publication Date
CN107729320A true CN107729320A (en) 2018-02-23
CN107729320B CN107729320B (en) 2021-04-13

Family

ID=61212056

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710976797.2A Active CN107729320B (en) 2017-10-19 2017-10-19 Emoticon recommendation method based on time sequence analysis of user session emotion trend

Country Status (1)

Country Link
CN (1) CN107729320B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170052946A1 (en) * 2014-06-06 2017-02-23 Siyu Gu Semantic understanding based emoji input method and device
KR20160056994A (en) * 2014-11-12 2016-05-23 한양대학교 산학협력단 Method for Recommending Emoticon and User Device for Recommending Emoticon
CN104618222A (en) * 2015-01-07 2015-05-13 腾讯科技(深圳)有限公司 Method and device for matching expression image
CN105975563A (en) * 2016-04-29 2016-09-28 腾讯科技(深圳)有限公司 Facial expression recommendation method and apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
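Cao Rui: "Research and Application of Mobile User Emotion Prediction Based on Time Series Analysis", China Master's Theses Full-text Database, Information Science and Technology *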

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108520046A (en) * 2018-03-30 2018-09-11 上海掌门科技有限公司 Search for the method and apparatus of chat record
CN108733838A (en) * 2018-05-29 2018-11-02 东北电力大学 User's behavior prediction system and method based on multipole sentiment analysis
CN108733838B (en) * 2018-05-29 2021-04-23 东北电力大学 User behavior prediction system and method based on multi-polar emotion analysis
CN109325112A (en) * 2018-06-27 2019-02-12 北京大学 A kind of across language sentiment analysis method and apparatus based on emoji
CN110895558B (en) * 2018-08-23 2024-01-30 北京搜狗科技发展有限公司 Dialogue reply method and related device
CN110895558A (en) * 2018-08-23 2020-03-20 北京搜狗科技发展有限公司 Dialog reply method and related device
CN109145306A (en) * 2018-09-11 2019-01-04 刘瑞军 The three-dimensional expression generation method of text-driven
CN111190493A (en) * 2018-11-15 2020-05-22 中兴通讯股份有限公司 Expression input method, device, equipment and storage medium
WO2020098669A1 (en) * 2018-11-15 2020-05-22 中兴通讯股份有限公司 Expression input method and apparatus, and device and storage medium
CN109783800A (en) * 2018-12-13 2019-05-21 北京百度网讯科技有限公司 Acquisition methods, device, equipment and the storage medium of emotion keyword
CN109783800B (en) * 2018-12-13 2024-04-12 北京百度网讯科技有限公司 Emotion keyword acquisition method, device, equipment and storage medium
CN109977409A (en) * 2019-03-28 2019-07-05 北京科技大学 A kind of intelligent expression recommended method and system based on user's chat habit
CN111897990A (en) * 2019-05-06 2020-11-06 阿里巴巴集团控股有限公司 Method, device and system for acquiring expression information
CN110189742A (en) * 2019-05-30 2019-08-30 芋头科技(杭州)有限公司 Determine emotion audio, affect display, the method for text-to-speech and relevant apparatus
CN110619073B (en) * 2019-08-30 2022-04-22 北京影谱科技股份有限公司 Method and device for constructing video subtitle network expression dictionary based on Apriori algorithm
CN110619073A (en) * 2019-08-30 2019-12-27 北京影谱科技股份有限公司 Method and device for constructing video subtitle network expression dictionary based on Apriori algorithm
CN110717109A (en) * 2019-09-30 2020-01-21 北京达佳互联信息技术有限公司 Method and device for recommending data, electronic equipment and storage medium
CN110717109B (en) * 2019-09-30 2024-03-15 北京达佳互联信息技术有限公司 Method, device, electronic equipment and storage medium for recommending data
CN113360615A (en) * 2021-06-02 2021-09-07 首都师范大学 Dialog recommendation method, system and equipment based on knowledge graph and time sequence characteristics
CN113360615B (en) * 2021-06-02 2024-03-08 首都师范大学 Dialogue recommendation method, system and equipment based on knowledge graph and time sequence characteristics
CN113360003A (en) * 2021-06-30 2021-09-07 北京海纳数聚科技有限公司 Intelligent text input method association method based on dynamic session scene
CN113360003B (en) * 2021-06-30 2023-12-05 北京海纳数聚科技有限公司 Intelligent text input method association method based on dynamic session scene

Also Published As

Publication number Publication date
CN107729320B (en) 2021-04-13

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant