CN108491393B - Method and device for determining emotion intensity of emotion words - Google Patents

Method and device for determining emotion intensity of emotion words

Info

Publication number
CN108491393B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810272426.0A
Other languages
Chinese (zh)
Other versions
CN108491393A (en
Inventor
杨涛
李建丽
王肃
卢洪志
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guoxin Youe Data Co Ltd
Original Assignee
Guoxin Youe Data Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guoxin Youe Data Co Ltd
Priority to CN201810272426.0A
Publication of CN108491393A
Application granted
Publication of CN108491393B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The application provides a method and a device for determining the emotion intensity of emotion words. The method comprises: obtaining a pre-generated emotion word set that comprises a plurality of emotion words; and, for each emotion word in the emotion word set, calculating an emotion weight of the emotion word based on the relevance between that emotion word and each of a first preset number of emotion words in the emotion word set, wherein the emotion weight measures the emotion intensity represented by the emotion word.

Description

Method and device for determining emotion intensity of emotion words
Technical Field
The application relates to the technical field of data analysis, and in particular to a method and a device for determining the emotion intensity of emotion words.
Background
Spoken or written expression can present a rich variety of emotions, reflecting different moods, tendencies, or attitudes. For example, one may hold a supporting, objecting, or neutral attitude toward a certain view; "beauty" and "hero" usually represent positive emotions, whereas "ugly" and "thief" represent negative emotions. Moreover, different words or tones can reflect different emotional levels; for example, "nausea" is more intense in emotional expression than "aversion".
At present, the words representing emotions in an emotion dictionary are generally screened manually from a large amount of data, and emotion weights representing the emotion intensity of each word are then labeled manually for the screened words. This approach is labor-intensive and time-consuming. In addition, because operators differ in their understanding, the words selected for the emotion dictionary are neither uniform nor reasonable, and the emotion intensities assigned to them are also unreasonable. As a result, the emotion dictionary often performs unsatisfactorily and cannot meet the requirements of practical applications.
Disclosure of Invention
In view of the above, an object of the present application is to provide a method and an apparatus for determining emotion intensity of an emotion word, which are used to solve the problem that the emotion intensity of an emotion word determined in the prior art is unreasonable.
In a first aspect, an embodiment of the present application provides a method for determining emotion intensity of an emotion word, where the method includes:
obtaining a pre-generated emotion word set, wherein the emotion word set comprises a plurality of emotion words;
and, for each emotion word in the emotion word set, calculating an emotion weight of the emotion word based on the relevance between the emotion word and each of a first preset number of emotion words in the emotion word set, wherein the emotion weight is used for measuring the emotion intensity represented by the emotion word.
Optionally, calculating the emotion weight of the emotion word based on the relevance between the emotion word and each of a first preset number of emotion words in the emotion word set includes:
performing a weighted calculation on the relevance between the emotion word and each of the first preset number of emotion words in the emotion word set to obtain the emotion weight of the emotion word.
Optionally, the emotion word set is constructed in the following manner:
obtaining corpora from a preset platform;
performing word segmentation processing on the corpus and converting words into word vector representations to obtain an initial word set;
determining emotion seed words representing emotion;
and, for each emotion seed word, calculating the correlation degree between each word in the initial word set and the emotion seed word, and selecting a second preset number of words in descending order of correlation degree to construct the emotion word set.
Optionally, the degree of correlation between words is calculated based on a word vector.
Optionally, before calculating, for each emotion word in the emotion word set, an emotion weight of the emotion word based on a correlation between the emotion word and each emotion word in a first preset number of emotion words in the emotion word set, the method further includes the following steps:
performing deduplication on the emotion words in the emotion word set;
removing useless words from the emotion word set; and
calculating the correlation degree between every two emotion words in the emotion word set.
Optionally, performing the weighted calculation on the relevance between the emotion word and each of a first preset number of emotion words in the emotion word set includes:
calculating the average of the correlation degrees between the emotion word and each of the first preset number of emotion words in the emotion word set, that is, their sum divided by their number.
In a second aspect, an embodiment of the present application provides an emotion intensity determination apparatus for emotion words, where the apparatus includes:
the obtaining module is used for obtaining a pre-generated emotion word set, wherein the emotion word set comprises a plurality of emotion words;
and the processing module is used for calculating an emotion weight of each emotion word in the emotion word set based on the relevance between the emotion word and each emotion word in a first preset number of emotion words in the emotion word set, and the emotion weight is used for measuring the emotion intensity represented by the emotion word.
Optionally, the apparatus further comprises a building module configured for:
obtaining corpora from a preset platform;
performing word segmentation processing on the corpus and converting words into word vector representations to obtain an initial word set;
determining emotion seed words representing emotion;
and, for each emotion seed word, calculating the correlation degree between each word in the initial word set and the emotion seed word, and selecting a second preset number of words in descending order of correlation degree to construct the emotion word set.
In a third aspect, an embodiment of the present application provides a computer device including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the steps of the method when executing the computer program.
In a fourth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, performs the steps of the above method.
According to the method for determining the emotion intensity of the emotion words, when the emotion intensity of the emotion words is determined, the emotion word set generated in advance is adopted, the time consumed by manually marking the emotion words in the prior art is reduced, weighting calculation is further carried out on the emotion weight of each emotion word based on the relevance between the emotion words in the emotion word set, and the accuracy of the emotion intensity of the emotion words is improved.
In order to make the aforementioned objects, features and advantages of the present application comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and therefore should not be considered as limiting its scope; those skilled in the art can obtain other related drawings from these drawings without inventive effort.
Fig. 1 is a schematic flowchart of a method for determining emotion intensity of an emotion word according to an embodiment of the present application;
Fig. 2 is a first structural diagram of an emotion word emotion intensity determination apparatus according to an embodiment of the present application;
Fig. 3 is a second structural diagram of an emotion word emotion intensity determination apparatus according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
The embodiment of the application provides a method for determining emotion intensity of an emotion word, which specifically comprises the following steps as shown in fig. 1:
S101: obtaining a pre-generated emotion word set, wherein the emotion word set comprises a plurality of emotion words. The emotion word set is generated in advance according to actual needs. In the word set, each emotion word exists in the form of a word vector, which is usually a multidimensional vector, so the emotion word set is expressed as a word vector matrix.
Here, the emotion word set may be constructed in the following manner:
First, corpora are obtained from a preset platform, including but not limited to encyclopedias, Wikipedia, libraries, forums, posts, word stocks, etc.
After the corpus is obtained, a word segmentation model may be used to segment it, for example with the jieba toolkit. A pre-trained model, for example a trained word2vec model, is then used to convert the segmented words into word vector representations, and a word vector matrix is constructed to obtain the initial word set. Word segmentation and word vector representation can also be carried out together with a pre-trained model, for example by using a pre-trained GloVe model to perform word segmentation and construct the word vector matrix, thereby obtaining the initial word set.
In fact, the present application does not limit the word segmentation method or the word vector matrix construction method. Word segmentation can be implemented with a hidden Markov word segmentation model, a maximum entropy word segmentation model, a probabilistic word segmentation model, a trained word segmentation model, and the like, and the word vector matrix can be constructed with machine learning or natural language processing models such as word2vec, CBOW, and GloVe. These are described in detail in the prior art and are not elaborated here. In the constructed initial word set, each word vector maps a word into the word vector space and represents that word. A word vector matrix places words with similar semantics or similar emotions close together in the word vector space. The quality of the word vectors can therefore be evaluated by displaying a group of words that are close in the word vector space and judging whether their semantics are similar, and the word vector construction model can be corrected or retrained accordingly.
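The quality check described above (listing the words closest to a given word and judging whether they are semantically similar) can be sketched as follows. This is a minimal illustration, not the implementation of the application: the toy vectors, the helper names `build_matrix` and `nearest_words`, and the use of cosine similarity as the closeness measure are all assumptions made for the example. In practice the vectors would come from a trained model such as word2vec.

```python
import math

# Toy word vectors invented for illustration; real vectors would come
# from a trained model such as word2vec.
WORD_VECTORS = {
    "ugly": [0.9, 0.1, 0.0],
    "disgust": [0.8, 0.2, 0.1],
    "nausea": [0.85, 0.15, 0.05],
    "beauty": [-0.7, 0.6, 0.1],
    "hero": [-0.6, 0.7, 0.2],
}


def cosine(u, v):
    """Cosine similarity between two vectors (assumed closeness measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_matrix(word_vectors):
    """Return (vocab, matrix): row i of the matrix is the vector of vocab[i]."""
    vocab = sorted(word_vectors)
    matrix = [list(word_vectors[w]) for w in vocab]
    return vocab, matrix


def nearest_words(query, vocab, matrix, k=2):
    """List the k words whose vectors are closest to the query word."""
    q = matrix[vocab.index(query)]
    scored = [(cosine(q, row), w) for w, row in zip(vocab, matrix) if w != query]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [w for _, w in scored[:k]]


vocab, matrix = build_matrix(WORD_VECTORS)
neighbours = nearest_words("ugly", vocab, matrix, k=2)
```

If the listed neighbours are not semantically close to the query word, that suggests the word vector model needs correction or further training.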
After the initial word set is obtained through processing, the emotion seed words which represent emotions can be determined, and the emotion seed words can be from the initial word set, or from other word sets or self-defined words. The selection of the emotional seed words can be specified, and can also be selected through a certain rule, such as words with high occurrence frequency, common words, and the like. After the emotion seed words are selected, for each emotion seed word, calculating the correlation degree between each word in the initial word set and the emotion seed word by using a word vector, and selecting a second preset number of words according to the sequence of the correlation degrees from high to low to construct an emotion word set.
The relevancy may be represented by the distance between word vectors in the word vector space, by the difference between the distances from different word vectors to the center of the word vector space, by the difference between feature values of different word vectors, and the like. In general, a smaller distance or difference indicates a greater degree of correlation between two or more words, so the correlation may be expressed as a reciprocal or by other means. The present application does not limit the way in which the correlation is calculated.
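Two of the options named above can be sketched as follows, under stated assumptions. The text only says that a smaller distance or difference means a greater correlation and that a reciprocal may be used; the specific form 1 / (1 + d), the Euclidean distance, and the function names are assumptions chosen here so that identical vectors do not cause a division by zero.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two word vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def relevance_from_distance(u, v):
    """Relevance as a reciprocal of distance.

    The +1 in the denominator is an assumption that keeps the value
    finite (and equal to 1.0) when the two vectors coincide.
    """
    return 1.0 / (1.0 + euclidean_distance(u, v))


def center(vectors):
    """Component-wise mean of a list of vectors (center of the space)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def relevance_from_center_gap(u, v, c):
    """Relevance from the difference of the two distances to the center c."""
    gap = abs(euclidean_distance(u, c) - euclidean_distance(v, c))
    return 1.0 / (1.0 + gap)
```

Both functions return values in (0, 1], with larger values for more closely related words.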
The second predetermined number is generally set according to actual requirements, for example, the second predetermined number may be 10, 20, 30, 50, etc., so that for each emotional seed word, a second predetermined number of similar words are respectively selected to construct an emotional word set.
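The seed-based construction described above can be sketched as follows. The toy vectors, the seed words, and the name `build_emotion_word_set` are hypothetical, and cosine similarity is assumed as the correlation measure, since the application does not fix one.

```python
import math


def cosine(u, v):
    """Cosine similarity, used here as an assumed correlation measure."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_emotion_word_set(initial_words, seed_vectors, second_preset_number):
    """For each seed word, keep the most correlated words of the initial set.

    initial_words: dict mapping word -> vector (the initial word set).
    seed_vectors: dict mapping seed word -> vector.
    Returns the merged list; duplicates may remain and can be removed later.
    """
    emotion_words = []
    for seed_vec in seed_vectors.values():
        ranked = sorted(
            initial_words,
            key=lambda w: cosine(initial_words[w], seed_vec),
            reverse=True,
        )
        emotion_words.extend(ranked[:second_preset_number])
    return emotion_words


# Toy data invented for illustration.
initial = {
    "ugly": [0.9, 0.1],
    "nausea": [0.8, 0.3],
    "beauty": [-0.7, 0.6],
    "hero": [-0.5, 0.8],
    "table": [0.1, -0.9],
}
seeds = {"disgust": [1.0, 0.0], "praise": [-1.0, 1.0]}
emotion_set = build_emotion_word_set(initial, seeds, second_preset_number=2)
```

With these toy values, the two words closest to each seed are kept and the per-seed lists are concatenated in seed order.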
In another embodiment, after the emotion word set is constructed, the correlation between every two words in the word set can be calculated in advance, so as to provide a basis for the following calculation.
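Precomputing the correlation between every two words might look like the sketch below. It assumes cosine similarity as the correlation measure and stores the results in a symmetric table so that each pair is computed only once. The function name `pairwise_relevance` is hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity, used here as an assumed correlation measure."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pairwise_relevance(vectors):
    """Return an n x n table where table[t][i] is the correlation of words t and i."""
    n = len(vectors)
    table = [[0.0] * n for _ in range(n)]
    for t in range(n):
        for i in range(t, n):
            r = cosine(vectors[t], vectors[i])
            table[t][i] = r
            table[i][t] = r  # the measure is symmetric, so mirror the value
    return table


relevance_table = pairwise_relevance([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

The diagonal holds each word's correlation with itself, which a later weight calculation would normally skip.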
In one embodiment, the selected emotion seed words may include: bingbi, unprepared, impatience, cachexia, retching of the column, aggression, grayness, defience and dirtinence. For each emotion seed word, the words in the initial word set are sorted by their correlation degree with that seed word from high to low, and the first 5 words are selected to form the set for that seed word; finally, the sets for all emotion seed words are merged to obtain the emotion word set. The emotion word set may be: dead, dying, poisoned suicide, poisoned death, feelings, cynanchum delicicum, giant knotweed, cockroaches, inaudible, checked, criminal, vigorous, clary, holiday, hutussian, emotional, help, urban, trapped, stolen, fish-eye mixed beads, rosewood, confusing audio-visual, disorganized, intimidation, tragic, small action, tampering, dim, caging, pale, depressed, dim-light, dark-light, no-lift, heart-path, morning-ending, liu-sunset, morning-yang, morning sickness, ugly, foul, vergence, nausea, disgust, common, boredom, narrow-narrow, heartburn, improper, etc. It should be understood that the above examples are merely illustrative; the initial word set and the emotion word set obtained in actual applications may include a huge number of words, and the examples do not limit the present application.
Optionally, after obtaining the pre-generated emotion word set and before calculating, for each emotion word in the emotion word set, the emotion weight of the emotion word based on the correlation between the emotion word and each of a first preset number of emotion words in the emotion word set, the method further includes: performing deduplication on the emotion words in the emotion word set, and removing useless words from the emotion word set. The deduplication algorithm and the useless word removal algorithm are described in detail in the prior art and are not elaborated here.
Specifically, after the emotion word set is obtained, repeated emotion words and useless words in the set can be removed; useless words can be removed by means of a stop word dictionary. Note that deduplication and stop word removal may be performed in either order.
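A minimal sketch of this cleaning step is given below. The function name `clean_emotion_words` and the sample stop word dictionary are hypothetical; the sketch performs both operations in a single pass, which is consistent with the two steps having no required order.

```python
def clean_emotion_words(words, stop_words):
    """Remove repeated words and stop words while keeping first-seen order."""
    seen = set()
    cleaned = []
    for word in words:
        if word in stop_words or word in seen:
            continue  # skip useless words and repeats
        seen.add(word)
        cleaned.append(word)
    return cleaned


# Toy input invented for illustration.
raw_words = ["ugly", "nausea", "ugly", "the", "foul", "nausea"]
stop_word_dictionary = {"the", "of"}
cleaned_words = clean_emotion_words(raw_words, stop_word_dictionary)
```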
In an implementation mode, the emotion word set can be selected to obtain an emotion keyword set, and the emotion keyword set is used for replacing the emotion word set to calculate the subsequent emotion weight.
After the emotion word set is obtained, step S102 may be executed to calculate, for each emotion word in the emotion word set, an emotion weight of the emotion word based on a correlation between the emotion word and each emotion word in the first preset number of emotion words in the emotion word set. Also, the emotion word set can be replaced by the emotion keyword set. Here, the first preset number is generally preset, for example, the first preset number may be 10, 20, 30, 50, and the like, and the preset number may also be the total number of all words in the word set minus 1, so that for each emotional word in the emotional word set, the correlation between the emotional word and all other emotional words in the emotional word set is calculated, and then the emotion weight representing the emotional word is calculated.
It should be noted that the first predetermined number may be the same as the second predetermined number, or may be different from the second predetermined number.
Here, the emotion weight is used to represent or measure the emotion, mood, tendency, emotional color, or attitude of a word, together with its degree of strength. For example, a word may express positive emotion, negative emotion, positive energy, negative energy, emotional color, and the like, and the weight reflects both the emotion expressed and its intensity.
Calculating the emotion weight of the emotion word based on the relevance between the emotion word and each emotion word in a first preset number of emotion words in the emotion word set, wherein the calculation comprises the following steps: and carrying out weighted calculation on the relevance between the emotional words and each emotional word in the first preset number of emotional words in the emotional word set to obtain the emotional weight of the emotional words. The weighting calculation may be: and calculating the average value of the sum of the correlation degrees between the emotional words and the emotional words in the first preset number of emotional words in the emotional word set. Specifically, the formula for calculating the emotion weight of the tth emotion word in the emotion word set is as follows:
$$W_t = \frac{1}{n}\sum_{i=1}^{n} m_{ti}$$
wherein $W_t$ represents the emotion weight of the t-th emotion word in the emotion word set; $m_{ti}$ represents the relevance between the t-th emotion word and the i-th emotion word in the emotion word set; and n is a positive integer.
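The averaging described above can be sketched as follows. The text does not say which words make up the first preset number, so this sketch makes two labeled assumptions: when `n` is `None`, every other word in the set is used (the "total number minus 1" case), and otherwise the `n` most related other words are used. The function name `emotion_weights` and the toy relevance table are hypothetical.

```python
def emotion_weights(words, relevance, n=None):
    """Average each word's relevance to other words in the set.

    words: list of emotion words.
    relevance: table where relevance[t][i] relates words[t] and words[i].
    n: first preset number; None means all other words (assumption), and
       a number means the n most related other words (also an assumption).
    """
    weights = {}
    for t, word in enumerate(words):
        others = [relevance[t][i] for i in range(len(words)) if i != t]
        if n is not None:
            others = sorted(others, reverse=True)[:n]
        weights[word] = sum(others) / len(others) if others else 0.0
    return weights


# Toy relevance table invented for illustration.
toy_words = ["ugly", "nausea", "table"]
toy_relevance = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.1],
    [0.2, 0.1, 1.0],
]
all_weights = emotion_weights(toy_words, toy_relevance)
top1_weights = emotion_weights(toy_words, toy_relevance, n=1)
```

With the toy table, "ugly" averages 0.8 and 0.2 to a weight of 0.5 when all other words are used, while "table", which is only weakly related to the other words, receives the smallest weight.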
In a specific embodiment, continuing with the example in step S101, after calculation, the emotion weight of each emotion word finally determined is as follows:
not necessarily: 0.225055308974;
and (4) lattice drawing: 0.237463076273;
puzzlement: 0.280854065627;
and (3) invisible lifting: 0.0995836868698;
joker: 0.358425231714;
death: 0.210787240306;
heartache: 0.237443471888;
taking poison and suicide: 0.168410165478;
dimming: 0.149310689373;
contamination: 0.31575428613;
depression of yin: 0.17931641767;
giant knotweed rhizome: 0.192788279296;
aversion: 0.340582611321;
in-pants frightening: 0.270607901397;
dull and dull: 0.191998418473;
this can be done: 0.127667602763;
strange qi of yin and yang: 0.167638729948;
roughly: 0.275876841611;
and (3) tensioning: 0.155195724218;
city mansion: 0.184038034523;
gram day: 0.0281753637595;
radix Cynanchi Paniculati: 0.134982406048;
and (3) moving path: 0.368638767845;
killing chicken and scaring monkeys: 0.0258759585254;
narrowness: 0.0287116162871;
covering with a cage: 0.157211849688;
coating with mustache: 0.0822772808212;
however, the situation is as follows: 0.0190850172395;
carrying and assisting: 0.0129267696506;
pallor: 0.252874239679;
and (3) killing by poison: 0.206415688939;
Stealing and replacing: 0.220298256574;
small actions are played: 0.160591664622;
ugly: 0.340536015607;
short aspiration: 0.126220243866;
transferring flowers and grafting trees: 0.179029835062;
carrying: 0.189277123663;
hei is: 0.0713146527683;
death: 0.104256561249;
counterfeiting: 0.109098715068;
threatening: 0.176020835399;
getting: 0.117048612834;
liu Hua: 0.0292226205072;
the Chinese herbal medicine can not be heard: 0.296912292653;
the prisoner: 0.0725393154002;
mixing the fish eyes: 0.25995818035;
confusing audio-visual: 0.289359445867;
case: 0.0706871038805;
and death: 0.18535992685;
heart channel: 0.0641814580904.
According to the method for determining the emotion intensity of emotion words provided by the application, a pre-generated emotion word set is used when determining the emotion intensity, which reduces the time consumed by manually labeling emotion words in the prior art. The emotion weight of each emotion word is further determined by a weighted calculation based on the relevance between the emotion words in the emotion word set, which improves the accuracy of the emotion intensity of the emotion words and the prediction accuracy when the emotion words are used for emotion prediction.
An embodiment of the present application provides an emotion word emotion intensity determination apparatus, as shown in fig. 2, the apparatus includes:
the obtaining module 21 is configured to obtain a pre-generated emotion word set, where the emotion word set includes a plurality of emotion words;
and the processing module 22 is configured to calculate, for each emotion word in the emotion word set, an emotion weight of the emotion word based on a correlation between the emotion word and each emotion word in a first preset number of emotion words in the emotion word set, where the emotion weight is used to measure an emotion intensity represented by the emotion word.
Optionally, the processing module 22 is specifically configured to:
and carrying out weighted calculation on the relevance between the emotional words and each emotional word in the first preset number of emotional words in the emotional word set to obtain the emotional weight of the emotional words.
Optionally, the processing module 22 is further configured to:
and calculating the average value of the sum of the correlation degrees between the emotional words and the emotional words in the first preset number of emotional words in the emotional word set.
Another emotion word emotion intensity determination apparatus is provided in an embodiment of the present application, as shown in fig. 3, the apparatus further includes, compared with the apparatus in fig. 2: a building module 23 for:
obtaining corpora from a preset platform;
performing word segmentation processing on the corpus and converting words into word vector representations to obtain an initial word set;
determining emotion seed words representing emotion;
and calculating the correlation degree between each word in the initial word set and the emotion seed word aiming at each emotion seed word, and selecting a second preset number of words from high to low according to the correlation degree to construct an emotion word set.
Optionally, the relevance between words is calculated based on the word vectors of the words.
Optionally, the processing module 22 is further configured to:
performing deduplication on the emotion words in the emotion word set;
removing useless words from the emotion word set; and
calculating the correlation degree between every two emotion words in the emotion word set.
Corresponding to the method for determining emotional intensity of emotional words in fig. 1, an embodiment of the present application further provides a computer device, as shown in fig. 4, the device includes a memory 1000, a processor 2000 and a computer program stored in the memory 1000 and operable on the processor 2000, where the processor 2000 implements the steps of the method for determining emotional intensity of emotional words when executing the computer program.
Specifically, the memory 1000 and the processor 2000 can be general memories and processors, and are not specifically limited herein, and when the processor 2000 runs a computer program stored in the memory 1000, the method for determining the emotion intensity of an emotion word can be executed, so as to solve the problem that the emotion intensity of an emotion word is unreasonable in the prior art.
Corresponding to the method for determining emotion intensity of emotion words in fig. 1, an embodiment of the present application further provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the steps of the method for determining emotion intensity of emotion words.
Specifically, the storage medium can be a general storage medium, such as a removable disk or a hard disk. When the computer program on the storage medium is run, the method for determining the emotion intensity of emotion words can be executed, which solves the problem in the prior art that the determined emotion intensity of emotion words is unreasonable. When the emotion intensity is determined, a pre-generated emotion word set is used, which reduces the time consumed by manually labeling emotion words in the prior art. The emotion weight of each emotion word is determined by a weighted calculation based on the relevance between the emotion words in the emotion word set, which improves the accuracy of the emotion intensity of the emotion words and the prediction accuracy when the emotion words are used for emotion prediction.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed coupling or direct coupling or communication connection between each other may be through some communication interfaces, indirect coupling or communication connection between devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments provided in the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus once an item is defined in one figure, it need not be further defined and explained in subsequent figures, and moreover, the terms "first", "second", "third", etc. are used merely to distinguish one description from another and are not to be construed as indicating or implying relative importance.
Finally, it should be noted that the above embodiments are only specific embodiments of the present application, used to illustrate its technical solutions rather than to limit them, and the scope of protection of the present application is not limited to them. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone skilled in the art may, within the technical scope disclosed in the present application, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features. Such modifications, changes, or substitutions do not depart from the spirit and scope of the embodiments of the present application and are intended to be covered by the scope of protection of the present application. Therefore, the scope of protection of the present application shall be subject to the scope of protection of the claims.

Claims (9)

1. A method for determining the emotion intensity of emotion words, characterized by comprising the following steps:
obtaining a pre-generated emotion word set, wherein the emotion word set comprises a plurality of emotion words, the emotion word set is represented as a word vector matrix, and each emotion word is represented as a word vector;
calculating the correlation degree between every two emotion words in the emotion word set, wherein the correlation degree is determined based on the distance between the word vectors of the two emotion words, and/or the difference between the distances from the respective word vectors to the center of the word vector space, and/or the difference between the feature values of the respective word vectors; and
for each emotion word in the emotion word set, calculating an emotion weight of the emotion word based on the correlation degree between the emotion word and each emotion word in a first preset number of emotion words in the emotion word set, wherein the emotion weight is used to measure the emotion intensity expressed by the emotion word.
2. The method of claim 1, wherein calculating the emotion weight of the emotion word based on the correlation degree between the emotion word and each emotion word in the first preset number of emotion words in the emotion word set comprises:
performing a weighted calculation on the correlation degrees between the emotion word and each emotion word in the first preset number of emotion words in the emotion word set to obtain the emotion weight of the emotion word.
3. The method of claim 1, wherein the emotion word set is constructed as follows:
obtaining corpora from a preset platform;
performing word segmentation on the corpora and converting the resulting words into word vector representations to obtain an initial word set;
determining emotion seed words that represent emotions; and
for each emotion seed word, calculating the correlation degree between each word in the initial word set and the emotion seed word, and selecting a second preset number of words in descending order of correlation degree to construct the emotion word set.
4. The method of claim 1, wherein, before calculating, for each emotion word in the emotion word set, an emotion weight of the emotion word based on the correlation degree between the emotion word and each emotion word in a first preset number of emotion words in the emotion word set, the method further comprises the following steps:
de-duplicating the emotion words in the emotion word set; and
removing useless words from the emotion word set.
5. The method of claim 2, wherein performing the weighted calculation on the correlation degrees between the emotion word and each emotion word in the first preset number of emotion words in the emotion word set comprises:
calculating the average of the correlation degrees between the emotion word and each emotion word in the first preset number of emotion words in the emotion word set.
6. An apparatus for determining the emotion intensity of emotion words, characterized in that the apparatus comprises:
an obtaining module, configured to obtain a pre-generated emotion word set, wherein the emotion word set comprises a plurality of emotion words, the emotion word set is represented as a word vector matrix, and each emotion word is represented as a word vector; and
a processing module, configured to calculate the correlation degree between every two emotion words in the emotion word set, and, for each emotion word in the emotion word set, to calculate an emotion weight of the emotion word based on the correlation degree between the emotion word and each emotion word in a first preset number of emotion words in the emotion word set, wherein the emotion weight is used to measure the emotion intensity expressed by the emotion word, and wherein the correlation degree is determined based on the distance between the word vectors of the two emotion words, and/or the difference between the distances from the respective word vectors to the center of the word vector space, and/or the difference between the feature values of the respective word vectors.
7. The apparatus of claim 6, further comprising a construction module configured to:
obtain corpora from a preset platform;
perform word segmentation on the corpora and convert the resulting words into word vector representations to obtain an initial word set;
determine emotion seed words that represent emotions; and
for each emotion seed word, calculate the correlation degree between each word in the initial word set and the emotion seed word, and select a second preset number of words in descending order of correlation degree to construct the emotion word set.
8. A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 5.
9. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, carries out the steps of the method according to any one of claims 1 to 5.
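
The steps recited in claims 1 to 5 can be illustrated with a minimal Python sketch. It is not the patented implementation. Several choices are assumptions made only for illustration: cosine similarity stands in for the correlation degree (the claims only require that it be based on vector distance and/or related differences), the "first preset number" of emotion words is taken to be the `top_k` other words most correlated with the word in question, seed words are assumed to be present in the initial word set, and the names `cosine_similarity`, `build_emotion_word_set`, `emotion_weights`, `per_seed`, and `top_k` are hypothetical. The toy two-dimensional vectors are made up and do not come from any corpus.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Assumed correlation degree: cosine similarity of two word vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def build_emotion_word_set(
    vocab: Dict[str, Vector], seed_words: List[str], per_seed: int
) -> Dict[str, Vector]:
    """Claim 3 sketch: for each seed, keep the `per_seed` most correlated words.

    Assumes every seed word has a vector in `vocab`.
    """
    selected: Dict[str, Vector] = {}
    for seed in seed_words:
        ranked = sorted(
            vocab,
            key=lambda w: cosine_similarity(vocab[w], vocab[seed]),
            reverse=True,
        )
        for word in ranked[:per_seed]:
            # Using a dict means a word picked by two seeds is stored once.
            selected[word] = vocab[word]
    return selected


def emotion_weights(word_set: Dict[str, Vector], top_k: int) -> Dict[str, float]:
    """Claims 1, 2 and 5 sketch: mean correlation with the top_k closest words."""
    weights: Dict[str, float] = {}
    for word, vec in word_set.items():
        scores = sorted(
            (
                cosine_similarity(vec, other_vec)
                for other, other_vec in word_set.items()
                if other != word
            ),
            reverse=True,
        )[:top_k]
        # Average of the selected correlation degrees; 0.0 if there are none.
        weights[word] = sum(scores) / len(scores) if scores else 0.0
    return weights


if __name__ == "__main__":
    # Made-up toy vectors, for illustration only.
    toy_vocab = {
        "happy": [1.0, 0.0],
        "joyful": [0.9, 0.1],
        "glad": [0.8, 0.3],
        "table": [0.0, 1.0],
    }
    emotion_set = build_emotion_word_set(toy_vocab, ["happy"], per_seed=3)
    for word, weight in emotion_weights(emotion_set, top_k=2).items():
        print(f"{word}: {weight:.3f}")
```

In this sketch a word that sits close to many other emotion words receives a higher weight, which is one plausible reading of how the averaged correlation degree measures emotion intensity. Other correlation measures named in claim 1, such as the difference in distance to the center of the vector space, could be substituted for `cosine_similarity` without changing the rest of the code.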

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810272426.0A CN108491393B (en) 2018-03-29 2018-03-29 Emotion strength determining party and device for emotion words

Publications (2)

Publication Number Publication Date
CN108491393A CN108491393A (en) 2018-09-04
CN108491393B true CN108491393B (en) 2022-05-20

Family

ID=63317059

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810272426.0A Active CN108491393B (en) 2018-03-29 2018-03-29 Emotion strength determining party and device for emotion words

Country Status (1)

Country Link
CN (1) CN108491393B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104809108A (en) * 2015-05-20 2015-07-29 成都布林特信息技术有限公司 Information monitoring and analyzing system
CN104899267A (en) * 2015-05-22 2015-09-09 中国电子科技集团公司第二十八研究所 Integrated data mining method for similarity of accounts on social network sites
CN105608130A (en) * 2015-12-16 2016-05-25 小米科技有限责任公司 Method and device for obtaining sentiment word knowledge base as well as terminal
KR20160099127A (en) * 2015-02-11 2016-08-22 중앙대학교 산학협력단 Method and apparatus for selecting feature used to classify multi-label
CN107688630A (en) * 2017-08-21 2018-02-13 北京工业大学 A kind of more sentiment dictionary extending methods of Weakly supervised microblogging based on semanteme
CN107766331A (en) * 2017-11-10 2018-03-06 云南大学 The method that automatic Calibration is carried out to word emotion value

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102955837A (en) * 2011-12-13 2013-03-06 华东师范大学 Analogy retrieval control method based on Chinese word pair relationship similarity
CN102663139B (en) * 2012-05-07 2013-04-03 苏州大学 Method and system for constructing emotional dictionary
CN106649783B (en) * 2016-12-28 2022-12-06 上海智臻智能网络科技股份有限公司 Synonym mining method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
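Method for constructing a fine-grained emotion lexicon with weight factors (具有权重因子的细粒度情感词库构建方法); Huang Gaofeng (黄高峰) et al.; Computer Engineering (《计算机工程》); 20141115; Vol. 40, No. 11; pp. 211-213 *
Research on emotion dictionary construction based on HowNet (《知网》) concept definitions (基于《知网》概念定义的情感词典构建研究); Zhang Sen (张森) et al.; Computer Engineering and Applications (《计算机工程与应用》); 20140224; Vol. 51, No. 17; pp. 118, 121 *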

Also Published As

Publication number Publication date
CN108491393A (en) 2018-09-04

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100070, No. 101-8, building 1, 31, zone 188, South Fourth Ring Road, Beijing, Fengtai District

Applicant after: Guoxin Youyi Data Co., Ltd

Address before: 100070, No. 188, building 31, headquarters square, South Fourth Ring Road West, Fengtai District, Beijing

Applicant before: SIC YOUE DATA Co.,Ltd.

GR01 Patent grant