CN107807920A - Method, device and server for constructing a mood dictionary based on big data - Google Patents

Info

Publication number
CN107807920A
Authority
CN
China
Prior art keywords
mood
word
dictionary
classification
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711148610.6A
Other languages
Chinese (zh)
Inventor
赵立永
吴新丽
姚笛
李云飞
王文文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
XINHUA NETWORK CO Ltd
Original Assignee
XINHUA NETWORK CO Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by XINHUA NETWORK CO Ltd
Priority to CN201711148610.6A
Publication of CN107807920A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/247 Thesauruses; Synonyms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/374 Thesaurus
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking

Abstract

The invention provides a method, device and server for constructing a mood dictionary based on big data. The construction method includes: obtaining original text information and performing word segmentation on it to obtain pending words; determining the similarity between any pending word and each mood word in a pre-established basic mood dictionary; and updating the basic mood dictionary, according to the determined similarity and the mood category and corresponding emotional intensity of each mood word in the basic mood dictionary, to build the mood dictionary. When a mood dictionary built in this way is used to analyze text information, it identifies not only the mood category to which the text belongs but also the emotional intensity of the text within that category. The embodiments of the invention therefore allow text information to be analyzed at a finer granularity, so that the user's mood tendency toward the text information is determined more accurately.

Description

Method, device and server for constructing a mood dictionary based on big data
Technical field
The present invention relates to the fields of text mining and natural language processing, and in particular to a method, device and server for constructing a mood dictionary based on big data.
Background
With the continuing development of Internet technology, users can post personal opinions (text information) online about events, products and the like, and thereby express their moods. Mining and analyzing the mood in such text information reveals users' mood tendency toward an event or product, which helps with event handling and product improvement and is of high practical value.
In the prior art, mood analysis of text information usually works as follows: the words in the text information are matched against a preset mood dictionary to determine which mood words the text contains; then, according to the mood category of each mood word in the preset mood dictionary, the mood category of each successfully matched mood word is determined. The categories determined are, for example, active or passive, positive or negative, or commendatory or derogatory.
As the above shows, the mood dictionary plays a central role in analyzing text information: it directly determines how reasonable and accurate the mood analysis is. Existing methods for constructing a mood dictionary, however, are very simple: they merely preset a category for each mood word. The resulting analysis is coarse-grained, sometimes fails to capture accurately the mood tendency of the text a user posts, and is not very reusable.
A method for constructing a mood dictionary based on big data is therefore needed, so that the resulting dictionary supports finer-grained mood analysis of text information and, in turn, a more accurate analysis of the mood tendency of the text a user posts.
Summary of the invention
In view of the above, the invention provides a method, device and server for constructing a mood dictionary based on big data. Compared with the prior art, a mood dictionary constructed according to the invention allows text information to be analyzed at a finer granularity, and thus the mood tendency of the text a user posts to be analyzed more accurately.
An embodiment of the invention provides a method for constructing a mood dictionary based on big data, including:
obtaining original text information, and performing word segmentation on the original text information to obtain pending words;
determining the similarity between any pending word and each mood word in a pre-established basic mood dictionary;
updating the basic mood dictionary, according to the determined similarity and the mood category and corresponding emotional intensity of each mood word in the basic mood dictionary, to build the mood dictionary.
Preferably, the step of determining the similarity between any pending word and each mood word in the pre-established basic mood dictionary includes:
determining a word vector for each word according to the context, in the original text information, of the pending words and of each mood word in the basic mood dictionary;
calculating, from the determined word vectors, the similarity between each pending word and each mood word.
Preferably, performing word segmentation on the original text information to obtain pending words further includes:
deleting the stop words from the segmentation result.
Preferably, the step of updating the basic mood dictionary to build the mood dictionary, according to the determined similarity and the mood category and corresponding emotional intensity of each mood word in the basic mood dictionary, includes:
sorting the similarities between the pending words and any given mood word, and selecting the pending words ranked within a preset number of top positions as candidate extension words of that mood word;
determining the mood category and corresponding emotional intensity of any candidate extension word according to the similarity between the candidate extension word and the mood word, the mood category of the mood word, and the emotional intensity corresponding to that mood category;
updating the basic mood dictionary according to the mood category and corresponding emotional intensity of the candidate extension word, to build the mood dictionary.
Preferably, when a candidate extension word corresponds to multiple mood words, the method further includes, after the sorting and selecting step:
determining the maximum similarity between the candidate extension word and each of its corresponding mood words;
wherein the step of determining the mood category and corresponding emotional intensity of the candidate extension word includes:
determining the mood category and corresponding emotional intensity of the candidate extension word according to the mood category and emotional intensity of the mood word that has the maximum similarity.
Preferably, after the basic mood dictionary is updated, the method further includes:
obtaining a first quantity, namely the number of pieces of text information within a preset time period that contain a given mood word of the mood dictionary, and a second quantity, namely the number of pieces of text information within the preset time period that contain the mood words of the mood dictionary as a whole;
determining the effective utilization rate of the mood word from the first quantity and the second quantity;
deleting the mood word from the basic mood dictionary when its effective utilization rate is judged to be lower than a utilization threshold.
An embodiment of the invention provides a device for constructing a mood dictionary based on big data, including:
an acquiring unit, configured to obtain original text information and perform word segmentation on the original text information to obtain pending words;
a determining unit, configured to determine the similarity between any pending word and each mood word in a pre-established basic mood dictionary;
an updating unit, configured to update the basic mood dictionary, according to the determined similarity and the mood category and corresponding emotional intensity of each mood word in the basic mood dictionary, to build the mood dictionary.
Preferably, the determining unit is specifically configured to:
determine a word vector for each word according to the context, in the original text information, of the pending words and of each mood word in the basic mood dictionary;
calculate, from the determined word vectors, the similarity between each pending word and each mood word.
Preferably, the acquiring unit is further configured to:
delete the stop words from the segmentation result.
Preferably, the updating unit is specifically configured to:
sort the similarities between the pending words and any given mood word, and select the pending words ranked within a preset number of top positions as candidate extension words of that mood word;
determine the mood category and corresponding emotional intensity of any candidate extension word according to the similarity between the candidate extension word and the mood word, the mood category of the mood word, and the emotional intensity corresponding to that mood category;
update the basic mood dictionary according to the mood category and corresponding emotional intensity of the candidate extension word, to build the mood dictionary.
Preferably, when a candidate extension word corresponds to multiple mood words, the updating unit is further configured to:
determine, after the candidate extension words have been selected as described above, the maximum similarity between the candidate extension word and each of its corresponding mood words;
determine the mood category and corresponding emotional intensity of the candidate extension word according to the mood category and emotional intensity of the mood word that has the maximum similarity.
Preferably, the device further includes a deleting unit, configured to:
obtain, after the basic mood dictionary is updated, a first quantity, namely the number of pieces of text information within a preset time period that contain a given mood word of the mood dictionary, and a second quantity, namely the number of pieces of text information within the preset time period that contain the mood words of the mood dictionary as a whole;
determine the effective utilization rate of the mood word from the first quantity and the second quantity;
delete the mood word from the basic mood dictionary when its effective utilization rate is judged to be lower than a utilization threshold.
An embodiment of the invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of any of the methods provided by the embodiments of the invention.
An embodiment of the invention further provides a server including a memory and a processor. The memory stores information including program instructions, and the processor controls the execution of the program instructions; when the program is executed by the processor, the steps of any of the methods provided by the embodiments of the invention are implemented.
The embodiments of the invention provide the following beneficial effects:
In the embodiments of the invention, word segmentation is first performed on the original text information to obtain pending words; the similarity between any pending word and each mood word in the pre-established basic mood dictionary is determined; and the basic mood dictionary is updated, according to the determined similarity and the mood category and corresponding emotional intensity of each mood word in the basic mood dictionary, to build the mood dictionary. When a mood dictionary built in this way is used to analyze text information, it identifies not only the mood category to which the text belongs but also the emotional intensity of the text within that category. The analysis result is therefore not limited to simple dimensions such as active or passive, or positive or negative; within any mood category it can also be divided into multiple levels according to the emotional intensity of the mood words the category contains. In other words, a mood dictionary constructed according to the embodiments of the invention allows text information to be analyzed at a finer granularity, so that the user's mood tendency toward the text information is determined more accurately.
Additional aspects and advantages of the invention are set out in part in the description below; they will become apparent from that description or be learned through practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the invention will become apparent and easy to understand from the following description of the embodiments taken together with the accompanying drawings, in which:
Fig. 1 is a schematic flowchart of a method for constructing a mood dictionary based on big data, provided by Embodiment 1 of the invention;
Fig. 2 is a schematic diagram of a method for establishing the basic mood dictionary, provided by Embodiment 1 of the invention;
Fig. 3 is a schematic diagram of an example process of the method for constructing a mood dictionary based on big data of Embodiment 1 of the invention;
Fig. 4 is a schematic structural diagram of a device for constructing a mood dictionary based on big data of Embodiment 2 of the invention;
Fig. 5 is a schematic structural diagram of a server of Embodiment 3 of the invention.
Detailed description
Embodiments of the invention are described in detail below, and examples of the embodiments are shown in the drawings, in which the same or similar reference numerals denote, throughout, the same or similar elements or elements with the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the invention and are not to be construed as limiting the claims.
Those skilled in the art will understand that, unless expressly stated otherwise, the singular forms "a", "an", "said" and "the" used herein may also include the plural. It should further be understood that the word "comprising" used in this specification indicates the presence of the stated features, integers, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof. When an element is said to be "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or intermediate elements may be present. In addition, "connected" or "coupled" as used herein may include wireless connection or wireless coupling. The term "and/or" used herein includes all or any unit and all combinations of one or more of the associated listed items.
Those skilled in the art will understand that, unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by a person of ordinary skill in the art to which the invention belongs. It should also be understood that terms such as those defined in general dictionaries are to be interpreted as having a meaning consistent with their meaning in the context of the prior art and, unless specifically defined as here, are not to be interpreted in an idealized or overly formal sense.
The technical solutions of the embodiments of the invention are described in detail below with reference to the drawings.
Embodiment 1
An embodiment of the invention provides a method for constructing a mood dictionary based on big data. A schematic flowchart of the method is shown in Fig. 1, and the method includes the following steps:
S101: Obtain original text information, and perform word segmentation on the original text information to obtain pending words.
S102: Determine the similarity between any pending word and each mood word in a pre-established basic mood dictionary.
S103: Update the basic mood dictionary, according to the determined similarity and the mood category and corresponding emotional intensity of each mood word in the basic mood dictionary, to build the mood dictionary.
When a mood dictionary built according to the embodiment of the invention is used to analyze text information, it identifies not only the mood category to which the text belongs but also the emotional intensity of the text within that category. The analysis result is therefore not limited to simple dimensions such as active or passive, or positive or negative; within any mood category it can also be divided into multiple levels according to the emotional intensity of the mood words the category contains. In other words, a mood dictionary constructed according to the embodiment allows text information to be analyzed at a finer granularity, so that the user's mood tendency toward the text information is determined more accurately.
The implementation of each of the above steps is described further below:
S101: Obtain original text information, and perform word segmentation on the original text information to obtain pending words.
In this step, the original text information is obtained first. It may be, for example, users' comments on an event or product. There are many ways to obtain it; for example, web crawler technology can be used to capture the target original text information from news websites, forums and related application platforms.
In a preferred embodiment, the original text information is deduplicated after it is obtained. For example, the same user may post several identical comments on a topic; only one of them is kept. Preferably, after deduplication, noise is removed from the original text information. Noise includes topic tags, URLs (Uniform Resource Locators), repeated characters and the like.
After the original text information is obtained, word segmentation is performed on it to obtain the pending words.
In a preferred approach, after word segmentation the stop words are also deleted from the segmentation result to obtain the pending words. Stop words here include modal particles, numerals, punctuation marks and the like.
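The preprocessing in S101 can be sketched as below. This is only an illustration under stated assumptions: the stop-word list, the regular expressions and the function names are hypothetical, and the regex tokenizer is a stand-in for a real word segmenter, which Chinese text would require.

```python
import re

# Hypothetical stop-word list; a real system would load a full list.
STOP_WORDS = {"the", "a", "of", "oh", "ah"}

URL_RE = re.compile(r"https?://\S+")     # URLs
HASHTAG_RE = re.compile(r"#[^#\s]+#?")   # topic tags such as #topic# or #topic
REPEAT_RE = re.compile(r"(.)\1{2,}")     # any character repeated 3+ times


def clean_text(text):
    """Strip URLs and topic tags, and collapse long character repeats."""
    text = URL_RE.sub(" ", text)
    text = HASHTAG_RE.sub(" ", text)
    text = REPEAT_RE.sub(r"\1", text)
    return text.strip()


def segment(text):
    """Stand-in tokenizer: keeps runs of letters, drops digits and punctuation."""
    return re.findall(r"[^\W\d_]+", text.lower())


def preprocess(comments):
    """comments: iterable of (user_id, text). Returns pending words per comment."""
    seen = set()
    result = []
    for user_id, text in comments:
        key = (user_id, text)
        if key in seen:  # identical comment from the same user: keep one copy
            continue
        seen.add(key)
        words = [w for w in segment(clean_text(text)) if w not in STOP_WORDS]
        result.append(words)
    return result
```

Deduplication is keyed on the (user, text) pair, following the example in the text where one user posts the same comment several times; other deduplication keys are equally possible.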
S102: Determine the similarity between any pending word and each mood word in a pre-established basic mood dictionary.
In this step, the similarity between any pending word determined in S101 and each mood word in the pre-established basic mood dictionary is determined. Specifically, a word vector is determined for each word according to the context, in the original text information, of the pending words and of each mood word in the basic mood dictionary; the similarity between each pending word and each mood word is then calculated from the determined word vectors.
The word vector of each word is determined as follows: a preset number of words is extracted from the context, in the original text information, of each pending word and of each mood word in the basic mood dictionary; the word vector of each pending word and each mood word is then determined from the extracted words and a word vector model (for example, a Word2vec model).
After the word vectors of each pending word in the original text information and of each mood word in the pre-established basic mood dictionary have been determined, the similarity between each pending word and each mood word is calculated from them. One way is to compute the cosine of the angle between the word vectors of the pending word and the mood word. For example, if the pending word is w_1 with word vector v_1, and the mood word is w_2 with word vector v_2, the similarity of the two words is:
\cos(w_1, w_2) = \frac{v_1 \cdot v_2}{\lVert v_1 \rVert \, \lVert v_2 \rVert}
In practice there are many ways to calculate the similarity of two word vectors; for example, it can also be determined from the Euclidean distance between them. The embodiments of the invention place no specific limit on this.
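As a minimal sketch of the two similarity measures mentioned in S102, the functions below operate on plain lists of floats. The function names are illustrative, and the word vectors themselves are assumed to come from a separately trained word vector model.

```python
import math


def cosine_similarity(v1, v2):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm1 * norm2)


def euclidean_distance(v1, v2):
    """Euclidean distance; a smaller distance means more similar vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))
```

Note that the two measures run in opposite directions: a larger cosine means greater similarity, whereas a larger Euclidean distance means less, so a ranking based on distance has to be sorted in ascending order.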
S103: Update the basic mood dictionary, according to the determined similarity and the mood category and corresponding emotional intensity of each mood word in the basic mood dictionary, to build the mood dictionary.
In one embodiment of this step, the similarities between the pending words and any given mood word are first sorted, and the pending words ranked within a preset number of top positions are selected as candidate extension words of that mood word. The mood category and corresponding emotional intensity of any candidate extension word are then determined according to the similarity between the candidate extension word and the mood word, the mood category of the mood word, and the emotional intensity corresponding to that mood category. Finally, the basic mood dictionary is updated according to the mood category and corresponding emotional intensity of the candidate extension word, to build the mood dictionary.
Specifically, as shown in Table 1, suppose one mood word in the basic mood dictionary is "happy". The similarities between the pending words and this mood word are sorted, and the 5 top-ranked pending words are selected as candidate extension words of the mood word "happy". As shown in Table 1, the candidate extension words of the mood word "happy" include "happy", "happiness", "excitement", "liking" and "appreciation".
Table 1
The basic mood dictionary in the embodiment of the invention contains multiple mood words together with the mood category and emotional intensity of each mood word. For example, as shown in Table 2, the mood category of the mood word "happy" in the basic mood dictionary is "positive and healthy", and its emotional intensity under that category is 90. Table 2 only illustrates the content of the basic mood dictionary; in practice the dictionary may be more complex and comprehensive, and the embodiments of the invention place no specific limit on this.
Table 2
Mood word | Mood category | Emotional intensity
Happy | Positive and healthy | 90
Happiness | Positive and healthy | 85
Sad | Dull and passive | 90
Unhappy | Dull and passive | 85
In one embodiment, after the pending words ranked within the preset number of top positions have been selected as candidate extension words of a mood word, the mood category of that mood word is taken as the mood category of the candidate extension words. The emotional intensity of a candidate extension word is determined from its similarity to the mood word and the emotional intensity of the mood word under that category. In a preferred embodiment, the emotional intensity can be determined by the following formula:
\mathrm{score}(w_1) = \cos(w_1, w_2) \times \mathrm{score}(w_2)
where w_1 is the pending word, w_2 is the mood word, score(w_1) is the emotional intensity of the pending word, and score(w_2) is the emotional intensity of the mood word. The fourth column of Table 1 gives the emotional intensity, calculated with this formula, of each pending word under the mood category "positive".
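The selection of candidate extension words and the intensity formula above can be sketched as follows. The function names, the dictionary layout (a `(category, intensity)` tuple per mood word) and the sample words are assumptions made for illustration, not part of the source.

```python
def top_candidates(similarities, n):
    """Rank pending words by similarity to one mood word and keep the top n.

    similarities: {pending_word: similarity to the mood word}
    """
    ranked = sorted(similarities.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n]


def expand_mood_word(mood_entry, similarities, n):
    """Give each top-n candidate the mood word's category and a scaled intensity.

    mood_entry: (category, intensity) of the mood word
    Intensity follows score(w1) = cos(w1, w2) * score(w2).
    """
    category, intensity = mood_entry
    return {
        word: (category, sim * intensity)
        for word, sim in top_candidates(similarities, n)
    }
```

For example, with a mood word of intensity 90 and a candidate whose similarity is 0.9, the candidate inherits the same category with an intensity of about 81.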
In practice, one candidate word may correspond to multiple mood words. For this case the embodiment of the invention provides a preferred approach: after the pending words ranked within the preset number of top positions have been selected as candidate extension words of each mood word, the maximum similarity between any candidate extension word and each of its corresponding mood words is determined; the mood category and corresponding emotional intensity of the candidate extension word are then determined according to the mood category and emotional intensity of the mood word that has the maximum similarity. In a preferred embodiment, the emotional intensity can be determined by the following formula:
\mathrm{score}(w_1) = \max_{w_2} \{ \cos(w_1, w_2) \} \times \mathrm{score}(w_2)
where score(w_2) on the right-hand side is the emotional intensity of the mood word w_2 that attains the maximum similarity.
As shown in Table 3, suppose the candidate extension word "happy" has been selected as a candidate by several mood words at once, for example "happiness" and "excitement". Computing the similarity between "happy" and each of these mood words shows that the similarity with "happiness" is the largest. The mood category of "happy" is then the mood category of "happiness", namely "positive and healthy", and the emotional intensity of "happy" is determined from the maximum similarity and the emotional intensity of "happiness". For example, with a maximum similarity of 90% and an emotional intensity of 90 for the matched mood word, the emotional intensity of "happy" is 90% × 90 = 81.
Table 3
Once the mood category of each candidate extension word has been determined from its corresponding mood word, together with its emotional intensity under that category, the pre-established basic mood dictionary is updated to build the mood dictionary.
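The maximum-similarity rule and the subsequent dictionary update might look like the sketch below. The data layout and function names are hypothetical, and the choice to leave existing dictionary entries untouched is an assumption; the source does not say how a candidate that is already in the dictionary should be handled.

```python
def resolve_candidate(parent_sims, base_dictionary):
    """Pick the mood word with the largest similarity and inherit from it.

    parent_sims: {mood_word: similarity between the candidate and that mood word}
    base_dictionary: {mood_word: (category, intensity)}
    """
    best_parent = max(parent_sims, key=parent_sims.get)
    category, intensity = base_dictionary[best_parent]
    return category, parent_sims[best_parent] * intensity


def update_dictionary(base_dictionary, candidates):
    """Return a new dictionary extended with the resolved candidate words.

    candidates: {candidate_word: {mood_word: similarity}}
    """
    updated = dict(base_dictionary)
    for word, parent_sims in candidates.items():
        if word in updated:  # assumption: existing entries keep their values
            continue
        updated[word] = resolve_candidate(parent_sims, base_dictionary)
    return updated
```

Returning a new dictionary rather than mutating the input keeps the basic mood dictionary available for comparison after each update cycle.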
In the embodiment of the invention, after the pre-established basic mood dictionary is updated, a first quantity and a second quantity are obtained: the first quantity is the number of pieces of text information within a preset time period that contain a given mood word of the mood dictionary, and the second quantity is the number of pieces of text information within the preset time period that contain the mood words of the mood dictionary as a whole. The effective utilization rate of the mood word is determined from the first quantity and the second quantity. When the effective utilization rate of the mood word is judged to be lower than a utilization threshold, the mood word is deleted from the basic mood dictionary.
In a preferred embodiment, the effective utilization rate of each mood word in the updated mood dictionary can be calculated as:
I_t(w) = \frac{n_t}{N_t}
where I_t(w) is the effective utilization rate of mood word w in the preset time period t; n_t is the number of pieces of text information matching w in the preset time period t (the first quantity); and N_t is the number of pieces of text information matching the whole set of mood words in the mood dictionary (the second quantity).
For example, suppose that in the corpus collected over the most recent week (7 days), 100 pieces of text information match the mood word "happy" in the mood dictionary, and 500 pieces of text information in the corpus match the whole set of mood words in the mood dictionary. The effective utilization rate of "happy" is then 100 / 500 = 20%.
With the above approach, the effective utilization rate of each mood word in the mood dictionary is calculated, and every mood word whose effective utilization rate is lower than the utilization threshold is deleted from the basic mood dictionary. This raises the utilization of the basic dictionary as a whole and saves resources when the basic dictionary is used to analyze target text information.
In practice, because information changes constantly, the mood dictionary established by the embodiment of the invention is also updated continuously. Users can set the update cycle as needed; that is, S101 to S103 are repeated to keep updating the mood dictionary and meet users' needs.
The foregoing focuses on updating a pre-established basic mood dictionary. The embodiment of the invention also provides a method for establishing the basic mood dictionary. A schematic flowchart of this method is shown in Fig. 2, and the method includes the following steps:
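The pruning step can be illustrated with the sketch below. It assumes that the second quantity counts the texts containing at least one mood word of the dictionary, which is one reading of the description; the function names and the token-list input format are likewise illustrative.

```python
def effective_utilization(word, texts, dictionary_words):
    """Ratio n_t / N_t for one mood word.

    texts: list of token lists collected in the preset time period
    dictionary_words: set of all mood words in the dictionary
    """
    n_t = sum(1 for tokens in texts if word in tokens)
    # Assumption: N_t counts texts containing at least one dictionary word.
    N_t = sum(1 for tokens in texts if dictionary_words & set(tokens))
    return n_t / N_t if N_t else 0.0


def prune(dictionary, texts, threshold):
    """Keep only mood words whose utilization reaches the threshold."""
    words = set(dictionary)
    return {
        w: entry
        for w, entry in dictionary.items()
        if effective_utilization(w, texts, words) >= threshold
    }
```

With the figures from the example above (100 matching texts out of 500), the ratio is 0.2, so the word survives any threshold up to 20%.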
S201: Determine the mood categories in the mood dictionary;
S202: Determine the mood words under each mood category.
For S201, the text information (for example, comments) on news websites, forums and related application platforms can be analyzed, and the mood category of each piece of text information summarized, to arrive at the mood categories included in the mood dictionary. Table 4 below is an example of the mood categories in a mood dictionary provided by an embodiment of the invention.
Table 4
No. | First-level mood category | Second-level mood categories
1 | Positive and healthy | happy, at ease, respect, praise, belief, liking, wishing
2 | Dull and passive | sad, remorse, longing, flustered, frightened, shy, unhappy
3 | Opposition and indignation | disappointed, indignant
4 | Attack and abuse | abhor, censure, suspect, envy
5 | Unexpected | surprised
As shown in Table 4, the mood categories in the mood dictionary comprise five first-level mood categories, each of which contains second-level mood categories. The following describes a method, provided by the embodiment of the present invention, for determining the first-level mood category to which each subcategory among the second-level mood categories belongs. The specific steps are as follows.
S2011: Determine the scoring range;
S2012: Score the degree of semantic association between each first-level mood category and each subcategory among the second-level mood categories;
S2013: Tally the semantic association scores between each first-level mood category and each subcategory among the second-level mood categories to determine the category to which each subcategory belongs.
Specifically, for S2011, the scoring range may for example be set to {1, 2, 3, 4, 5}, where "1" indicates the lowest degree of association between a first-level mood category and a subcategory of the second-level mood categories, and "5" indicates the highest.
For S2012, several people score the degree of association between each first-level mood category and each subcategory among the second-level mood categories. For example, one scoring result for the subcategory "happy" is {positive and healthy 5, dull and passive 1, opposition and rage 1, attack and abuse 1, unexpected 1}. This result indicates that "happy" has the highest semantic association with "positive and healthy" and the lowest with the other first-level mood categories.
For S2013, the semantic association scores between each subcategory of the second-level mood categories and each first-level mood category are tallied, and the first-level mood category with the highest score is taken as the mood category of that subcategory. For example, if the overall scoring result for "happy" is {positive and healthy 90, dull and passive 0, opposition and rage 0, attack and abuse 0, unexpected 0}, the subcategory "happy" is determined to belong to the first-level mood category "positive and healthy".
For S202, the embodiment of the present invention provides a preferred method for determining the mood words under each mood category. In this method, a preset number of mood dictionaries can be chosen in advance as reference mood dictionaries. The method is as follows:
First, construction indicators are determined. In one embodiment, the embodiment of the present invention defines four construction indicators:
1. Consistency (SIM) of the mood category to which the mood word to be selected belongs across the reference mood dictionaries. For example, the mood word to be selected "happiness" belongs to the "happy" mood in reference mood dictionary 1, to "commendatory term" in reference mood dictionary 2, and to "positive" evaluation words in reference mood dictionary 3. That is, the mood word to be selected is semantically consistent across the reference mood dictionaries and has a close semantic association with one first-level mood category.
2. Whether the mood word to be selected is contained in each reference mood dictionary (FREQ). Specifically, it is judged whether the mood word to be selected is contained in every reference mood dictionary at the same time, with consistent mood semantics.
3. Mood strength of the mood word to be selected (HIGH), determined mainly by comparison with the emotional intensity in the reference mood dictionaries. For example, if "happiness" has an emotional intensity of 7 in a reference mood dictionary, the emotional intensity of the mood word to be selected "incomparably happy" may be set to 9. Preferably, mood words to be selected with greater emotional intensity are chosen.
4. When no decision can be made from SIM, FREQ and HIGH, the mood word to be selected is added to the basic mood dictionary under every category concerned (OTHER). For example, under SIM the same mood word to be selected may be distributed fairly evenly across mood categories in the reference mood dictionaries. Suppose statistics show that the mood word to be selected "disappointment" belongs to "dull and passive", "opposition and rage" and "attack and abuse" respectively in three reference mood dictionaries, and that its emotional intensity is 8 in all three. In that case the mood category of "disappointment" cannot be judged intuitively; this is a situation in which no decision can be made from SIM, FREQ and HIGH. For this situation, the method of the embodiment of the present invention adds "disappointment" to all three mood categories.
After the four construction indicators of a mood word to be selected have been determined, it is decided whether the mood word can be placed in the basic mood dictionary. For example, words whose emotional intensity is greater than 7 may first be selected from the reference mood dictionaries as candidate mood words; several people then label the mood category to which each candidate mood word belongs, and if the agreement on the same first-level mood category reaches 85% or more, the candidate mood word is determined to be a basic mood word.
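The final selection rule (intensity greater than 7, then at least 85% labelling agreement) can be sketched as follows. The function name, the data layout and the sample labels are hypothetical; agreement is assumed to mean the share of annotators who chose the most frequent first-level category.

```python
from collections import Counter


def select_basic_words(reference_intensity, annotations,
                       min_intensity=7, min_agreement=0.85):
    """Return {word: first-level category} for words that pass both filters."""
    selected = {}
    for word, intensity in reference_intensity.items():
        if intensity <= min_intensity:
            continue  # only words with intensity greater than 7 are candidates
        labels = annotations.get(word, [])
        if not labels:
            continue  # no annotator labelled this candidate
        category, votes = Counter(labels).most_common(1)[0]
        if votes / len(labels) >= min_agreement:
            selected[word] = category
    return selected


# Hypothetical reference intensities and annotator labels (20 annotators each).
intensity = {"overjoyed": 9, "disappointed": 8, "glad": 6}
labels = {
    "overjoyed": ["positive and healthy"] * 18 + ["unexpected"] * 2,
    "disappointed": ["dull and passive"] * 7
    + ["opposition and rage"] * 7
    + ["attack and abuse"] * 6,
    "glad": ["positive and healthy"] * 20,
}
selected = select_basic_words(intensity, labels)
print(selected)
```

In this sketch a word such as "disappointed", whose labels are split across categories, is simply not selected; the OTHER indicator described above would instead add it under every category concerned.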
The above is only one construction method for the basic mood dictionary given by the embodiment of the present invention. In practice there are various other methods for building a basic dictionary, which the present invention does not describe further.
To illustrate the whole implementation process of the present invention clearly, the embodiment of the present invention is described below through a complete example. The flow of the example is shown in Fig. 3 and includes the following steps:
S301: Determine the mood categories in the mood dictionary;
S302: Determine the mood words under each mood category to complete the construction of the basic mood dictionary (the pre-established basic mood dictionary);
S303: Obtain original text information from the corpus and perform word segmentation on it to obtain a segmentation result;
S304: Delete the stop words from the segmentation result to obtain the words to be processed;
S305: Determine the similarity between any word to be processed and each mood word in the pre-established basic mood dictionary;
S306: Update the basic mood dictionary to build the mood dictionary according to the determined similarity, the mood category corresponding to each mood word in the basic mood dictionary, and the emotional intensity corresponding to the mood category;
S307: Calculate the effective utilization rate of each mood word in the updated basic mood dictionary, and delete mood words with a low effective utilization rate from the basic mood dictionary.
S303 to S307 are repeated according to the preset update cycle to keep the mood dictionary continuously updated.
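The preprocessing in S303 and S304 can be sketched in a few lines. The sketch splits on whitespace purely for illustration; Chinese text would need a dedicated word segmenter, which is not shown here. The function name and the stop-word set are hypothetical.

```python
def to_pending_words(text, stop_words):
    """Segment a text (S303) and drop stop words (S304)."""
    # Assumption: whitespace splitting stands in for a real word segmenter.
    tokens = text.lower().split()
    return [token for token in tokens if token not in stop_words]


# Hypothetical stop-word list and input text.
stop_words = {"i", "am", "so"}
pending = to_pending_words("I am so happy today", stop_words)
print(pending)
```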
Embodiment 2
Based on the same inventive concept, the embodiment of the present invention provides a device for constructing a mood dictionary based on big data. The structure of the device is shown in Fig. 4, and the device includes the following units:
an acquiring unit 401, a determining unit 402 and an updating unit 403, wherein:
The acquiring unit 401 is configured to obtain original text information and perform word segmentation on the original text information to obtain words to be processed;
The determining unit 402 is configured to determine the similarity between any word to be processed and each mood word in the pre-established basic mood dictionary;
The updating unit 403 is configured to update the basic mood dictionary to build the mood dictionary according to the determined similarity, the mood category corresponding to each mood word in the basic mood dictionary, and the emotional intensity corresponding to the mood category.
The workflow of this device embodiment is as follows. First, the acquiring unit 401 obtains original text information and performs word segmentation on it to obtain words to be processed. Next, the determining unit 402 determines the similarity between any word to be processed and each mood word in the pre-established basic mood dictionary. Then, the updating unit 403 updates the basic mood dictionary to build the mood dictionary according to the determined similarity, the mood category corresponding to each mood word in the basic mood dictionary, and the emotional intensity corresponding to the mood category.
When a mood dictionary built by this device embodiment is used to identify text information, not only the mood category to which the text information belongs but also its emotional intensity under that mood category can be determined. The analysis result for the text information is therefore not limited to simple dimensions such as positive or negative, but can also be divided into multiple grades according to the emotional intensity of the mood words in any mood category. That is, with the mood dictionary constructed by the embodiment of the present invention, text information can be analyzed at a finer granularity, and the user's mood tendency toward the text information can be analyzed more accurately.
This device embodiment can build the mood dictionary in many ways. For example, in a first embodiment, the determining unit 402 is specifically configured to:
determine the word vector of each word according to the contextual information, in the original text information, of the words to be processed and of each mood word in the basic mood dictionary;
calculate the similarity between each word to be processed and each mood word according to the determined word vectors.
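As an illustration of the similarity step, the sketch below compares word vectors with cosine similarity. Cosine similarity is an assumption (this passage does not name the measure), the vectors are made-up two-dimensional examples, and training the word vectors from context is not shown.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_table(pending_vectors, mood_vectors):
    """Similarity between every word to be processed and every mood word."""
    return {
        (pending, mood): cosine_similarity(p_vec, m_vec)
        for pending, p_vec in pending_vectors.items()
        for mood, m_vec in mood_vectors.items()
    }


# Hypothetical word vectors; real ones would be learned from the corpus.
pending_vectors = {"joyful": [1.0, 0.0], "table": [0.0, 1.0]}
mood_vectors = {"happy": [1.0, 0.0]}
table = similarity_table(pending_vectors, mood_vectors)
print(table)
```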
In a second embodiment, the acquiring unit 401 is further configured to:
delete the stop words from the segmentation result.
In a third embodiment, the updating unit 403 is specifically configured to:
sort the similarities between each word to be processed and any mood word, and select the top preset number of words to be processed in the ranking as candidate extension words of that mood word;
determine the mood category and corresponding emotional intensity of any candidate extension word according to the similarity between the candidate extension word and the mood word, the mood category of the mood word, and the emotional intensity corresponding to that mood category;
update the basic mood dictionary to build the mood dictionary according to the mood category and corresponding emotional intensity of any candidate extension word.
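The third embodiment can be sketched as a top-N selection. How the candidate's emotional intensity is derived from the similarity is not spelled out in this passage; the sketch assumes the candidate inherits the mood word's category and takes the mood word's intensity scaled by the similarity. All names and numbers are hypothetical.

```python
def expand_mood_word(category, intensity, similarities, top_n):
    """Pick the top_n most similar words as candidate extension words.

    similarities maps each word to be processed to its similarity
    with one particular mood word.
    """
    ranked = sorted(similarities.items(), key=lambda item: item[1], reverse=True)
    # Assumption: intensity of a candidate = mood word intensity * similarity.
    return {word: (category, intensity * sim) for word, sim in ranked[:top_n]}


# Hypothetical similarities to the mood word "happy" (category and intensity given).
similarities = {"joyful": 0.9, "glad": 0.8, "table": 0.1}
extensions = expand_mood_word("positive and healthy", 7, similarities, top_n=2)
print(extensions)
```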
In a fourth embodiment, when any candidate extension word corresponds to multiple mood words, the updating unit 403 is further configured to:
after sorting the similarities between each word to be processed and any mood word and selecting the top preset number of words to be processed in the ranking as candidate extension words of that mood word, determine the maximum similarity between the candidate extension word and its corresponding mood words;
determine the mood category and corresponding emotional intensity of the candidate extension word according to the mood category and emotional intensity of the mood word corresponding to the maximum similarity.
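Resolving a candidate that matches several mood words can be sketched as picking the mood word with the maximum similarity. The function name, the dictionary layout and the sample values are hypothetical, and the candidate is assumed to take the category and intensity of the winning mood word unchanged.

```python
def resolve_candidate(links, base_dictionary):
    """Choose the mood word most similar to a candidate extension word.

    links maps mood word -> similarity with the candidate;
    base_dictionary maps mood word -> (category, intensity).
    """
    best_word = max(links, key=links.get)
    category, intensity = base_dictionary[best_word]
    return best_word, category, intensity


# Hypothetical candidate that is similar to two mood words.
base_dictionary = {
    "happy": ("positive and healthy", 7),
    "surprised": ("unexpected", 6),
}
links = {"happy": 0.82, "surprised": 0.91}
resolved = resolve_candidate(links, base_dictionary)
print(resolved)
```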
In a fifth embodiment, the device for constructing a mood dictionary provided by the embodiment of the present invention further includes a deleting unit, which is configured to:
after the basic mood dictionary is updated, obtain a first quantity of text messages that contain any given mood word of the mood dictionary within a preset time period, and a second quantity of text messages that contain mood words from the whole mood dictionary within the preset time period;
determine the effective utilization rate of the mood word according to the first quantity and the second quantity;
delete the mood word from the basic mood dictionary when its effective utilization rate is judged to be below the utilization threshold.
Embodiment 3
Based on the same inventive concept, the embodiment of the present invention provides a computer-readable storage medium in which a computer program is stored. At least one program implements the following steps when executed by a processor:
obtaining original text information, and performing word segmentation on the original text information to obtain words to be processed;
determining the similarity between any word to be processed and each mood word in the pre-established basic mood dictionary;
updating the basic mood dictionary to build the mood dictionary according to the determined similarity, the mood category corresponding to each mood word in the basic mood dictionary, and the emotional intensity corresponding to the mood category.
Preferably, the at least one program is used to implement:
determining the word vector of each word according to the contextual information, in the original text information, of the words to be processed and of each mood word in the basic mood dictionary;
calculating the similarity between each word to be processed and each mood word according to the determined word vectors.
Preferably, the at least one program is used to implement:
deleting the stop words from the segmentation result.
Preferably, the at least one program is used to implement:
sorting the similarities between each word to be processed and any mood word, and selecting the top preset number of words to be processed in the ranking as candidate extension words of that mood word;
determining the mood category and corresponding emotional intensity of any candidate extension word according to the similarity between the candidate extension word and the mood word, the mood category of the mood word, and the emotional intensity corresponding to that mood category;
updating the basic mood dictionary to build the mood dictionary according to the mood category and corresponding emotional intensity of any candidate extension word.
Preferably, the at least one program is used to implement:
when any candidate extension word corresponds to multiple mood words, after sorting the similarities between each word to be processed and any mood word and selecting the top preset number of words to be processed in the ranking as candidate extension words of that mood word, determining the maximum similarity between the candidate extension word and its corresponding mood words;
determining the mood category and corresponding emotional intensity of the candidate extension word according to the mood category and emotional intensity of the mood word corresponding to the maximum similarity.
Preferably, the at least one program is used to implement:
after the basic mood dictionary is updated, obtaining a first quantity of text messages that contain any given mood word of the mood dictionary within a preset time period, and a second quantity of text messages that contain mood words from the whole mood dictionary within the preset time period;
determining the effective utilization rate of the mood word according to the first quantity and the second quantity;
deleting the mood word from the basic mood dictionary when its effective utilization rate is judged to be below the utilization threshold.
The embodiment of the present invention also provides a server, whose structure is shown in Fig. 5. The server includes a memory 501 and a processor 502. The memory 501 is used to store information including program instructions, and the processor 502 is used to control the execution of the program instructions. When executed by the processor 502, the program implements the steps of any method for constructing a mood dictionary provided by the embodiment of the present invention.
Specifically, at least one program stored in the memory 501 implements the following steps when executed by the processor 502:
obtaining original text information, and performing word segmentation on the original text information to obtain words to be processed;
determining the similarity between any word to be processed and each mood word in the pre-established basic mood dictionary;
updating the basic mood dictionary to build the mood dictionary according to the determined similarity, the mood category corresponding to each mood word in the basic mood dictionary, and the emotional intensity corresponding to the mood category.
Preferably, the at least one program is used to implement:
determining the word vector of each word according to the contextual information, in the original text information, of the words to be processed and of each mood word in the basic mood dictionary;
calculating the similarity between each word to be processed and each mood word according to the determined word vectors.
Preferably, the at least one program is used to implement:
deleting the stop words from the segmentation result.
Preferably, the at least one program is used to implement:
sorting the similarities between each word to be processed and any mood word, and selecting the top preset number of words to be processed in the ranking as candidate extension words of that mood word;
determining the mood category and corresponding emotional intensity of any candidate extension word according to the similarity between the candidate extension word and the mood word, the mood category of the mood word, and the emotional intensity corresponding to that mood category;
updating the basic mood dictionary to build the mood dictionary according to the mood category and corresponding emotional intensity of any candidate extension word.
Preferably, the at least one program is used to implement:
when any candidate extension word corresponds to multiple mood words, after sorting the similarities between each word to be processed and any mood word and selecting the top preset number of words to be processed in the ranking as candidate extension words of that mood word, determining the maximum similarity between the candidate extension word and its corresponding mood words;
determining the mood category and corresponding emotional intensity of the candidate extension word according to the mood category and emotional intensity of the mood word corresponding to the maximum similarity.
Preferably, the at least one program is used to implement:
after the basic mood dictionary is updated, the following further steps:
obtaining a first quantity of text messages that contain any given mood word of the mood dictionary within a preset time period, and a second quantity of text messages that contain mood words from the whole mood dictionary within the preset time period;
determining the effective utilization rate of the mood word according to the first quantity and the second quantity;
deleting the mood word from the basic mood dictionary when its effective utilization rate is judged to be below the utilization threshold.
The beneficial effects obtained with the computer-readable storage medium and the server provided by the embodiment of the present invention are the same as or similar to those obtained with the foregoing method embodiment or device embodiment, and are not repeated here.
Those skilled in the art will appreciate that the present invention includes devices for performing one or more of the operations described herein. These devices may be specially designed and manufactured for the required purpose, or may comprise known devices in general-purpose computers. These devices have computer programs stored in them that are selectively activated or reconfigured. Such a computer program may be stored in a device-readable (for example, computer-readable) medium, or in any type of medium suitable for storing electronic instructions and coupled to a bus. The computer-readable medium includes, but is not limited to, any type of disk (including floppy disks, hard disks, optical discs, CD-ROMs and magneto-optical disks), ROM (Read-Only Memory), RAM (Random Access Memory), EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory, magnetic cards or optical cards. That is, a readable medium includes any medium that stores or transmits information in a form readable by a device (for example, a computer).
Those skilled in the art will appreciate that each block in these structure diagrams and/or block diagrams and/or flow diagrams, and combinations of blocks in them, can be implemented with computer program instructions. Those skilled in the art will also appreciate that these computer program instructions can be supplied to the processor of a general-purpose computer, a special-purpose computer or another programmable data processing means, so that the scheme specified in the block or blocks of the structure diagrams and/or block diagrams and/or flow diagrams disclosed by the present invention is executed by the processor of the computer or other programmable data processing means.
Those skilled in the art will appreciate that the steps, measures and schemes in the various operations, methods and flows discussed in the present invention can be alternated, changed, combined or deleted. Further, other steps, measures and schemes in the various operations, methods and flows discussed in the present invention can also be alternated, changed, rearranged, decomposed, combined or deleted. Further, steps, measures and schemes in the prior art that correspond to the various operations, methods and flows disclosed in the present invention can also be alternated, changed, rearranged, decomposed, combined or deleted.
The above describes only some embodiments of the present invention. It should be noted that those of ordinary skill in the art can make several improvements and modifications without departing from the principles of the present invention, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention.

Claims (14)

  1. A method for constructing a mood dictionary based on big data, characterized by comprising:
    obtaining original text information, and performing word segmentation on the original text information to obtain words to be processed;
    determining the similarity between any word to be processed and each mood word in a pre-established basic mood dictionary;
    updating the basic mood dictionary to build a mood dictionary according to the determined similarity, the mood category corresponding to each mood word in the basic mood dictionary, and the emotional intensity corresponding to the mood category.
  2. The construction method according to claim 1, characterized in that the step of determining the similarity between any word to be processed and each mood word in the pre-established basic mood dictionary comprises:
    determining the word vector of each word according to the contextual information, in the original text information, of the words to be processed and of each mood word in the basic mood dictionary;
    calculating the similarity between each word to be processed and each mood word according to the determined word vectors.
  3. The construction method according to claim 1 or 2, characterized in that performing word segmentation on the original text information to obtain words to be processed further comprises:
    deleting the stop words from the segmentation result.
  4. The construction method according to claim 1, characterized in that the step of updating the basic mood dictionary to build a mood dictionary according to the determined similarity, the mood category corresponding to each mood word in the basic mood dictionary, and the emotional intensity corresponding to the mood category comprises:
    sorting the similarities between each word to be processed and any mood word, and selecting the top preset number of words to be processed in the ranking as candidate extension words of that mood word;
    determining the mood category and corresponding emotional intensity of any candidate extension word according to the similarity between the candidate extension word and the mood word, the mood category of the mood word, and the emotional intensity corresponding to that mood category;
    updating the basic mood dictionary to build the mood dictionary according to the mood category and corresponding emotional intensity of any candidate extension word.
  5. The construction method according to claim 4, characterized in that, when any candidate extension word corresponds to multiple mood words, after sorting the similarities between each word to be processed and any mood word and selecting the top preset number of words to be processed in the ranking as candidate extension words of that mood word, the method further comprises:
    determining the maximum similarity between the candidate extension word and its corresponding mood words;
    wherein the step of determining the mood category and corresponding emotional intensity of any candidate extension word according to the similarity between the candidate extension word and the mood word, the mood category of the mood word, and the emotional intensity corresponding to that mood category comprises:
    determining the mood category and corresponding emotional intensity of the candidate extension word according to the mood category and emotional intensity of the mood word corresponding to the maximum similarity.
  6. The construction method according to claim 1, characterized in that, after the basic mood dictionary is updated, the method further comprises:
    obtaining a first quantity of text messages that contain any given mood word of the mood dictionary within a preset time period, and a second quantity of text messages that contain mood words from the whole mood dictionary within the preset time period;
    determining the effective utilization rate of the mood word according to the first quantity and the second quantity;
    deleting the mood word from the basic mood dictionary when its effective utilization rate is judged to be below the utilization threshold.
  7. A device for constructing a mood dictionary based on big data, characterized by comprising:
    an acquiring unit, a determining unit and an updating unit, wherein:
    the acquiring unit is configured to obtain original text information and perform word segmentation on the original text information to obtain words to be processed;
    the determining unit is configured to determine the similarity between any word to be processed and each mood word in a pre-established basic mood dictionary;
    the updating unit is configured to update the basic mood dictionary to build a mood dictionary according to the determined similarity, the mood category corresponding to each mood word in the basic mood dictionary, and the emotional intensity corresponding to the mood category.
  8. The construction device according to claim 7, characterized in that the determining unit is specifically configured to:
    determine the word vector of each word according to the contextual information, in the original text information, of the words to be processed and of each mood word in the basic mood dictionary;
    calculate the similarity between each word to be processed and each mood word according to the determined word vectors.
  9. The construction device according to claim 7 or 8, characterized in that the acquiring unit is further configured to:
    delete the stop words from the segmentation result.
  10. The construction device according to claim 7, characterized in that the updating unit is specifically configured to:
    sort the similarities between each word to be processed and any mood word, and select the top preset number of words to be processed in the ranking as candidate extension words of that mood word;
    determine the mood category and corresponding emotional intensity of any candidate extension word according to the similarity between the candidate extension word and the mood word, the mood category of the mood word, and the emotional intensity corresponding to that mood category;
    update the basic mood dictionary to build the mood dictionary according to the mood category and corresponding emotional intensity of any candidate extension word.
  11. The construction device according to claim 10, characterized in that, when any candidate extension word corresponds to multiple mood words, the updating unit is further configured to:
    after sorting the similarities between each word to be processed and any mood word and selecting the top preset number of words to be processed in the ranking as candidate extension words of that mood word, determine the maximum similarity between the candidate extension word and its corresponding mood words;
    determine the mood category and corresponding emotional intensity of the candidate extension word according to the mood category and emotional intensity of the mood word corresponding to the maximum similarity.
  12. The construction device according to claim 7, characterized by further comprising a deleting unit, the deleting unit being configured to:
    after the basic mood dictionary is updated, obtain a first quantity of text messages that contain any given mood word of the mood dictionary within a preset time period, and a second quantity of text messages that contain mood words from the whole mood dictionary within the preset time period;
    determine the effective utilization rate of the mood word according to the first quantity and the second quantity;
    delete the mood word from the basic mood dictionary when its effective utilization rate is judged to be below the utilization threshold.
  13. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the program implements the method according to any one of claims 1 to 6 when executed by a processor.
  14. A server, comprising a memory and a processor, the memory being used to store information including program instructions and the processor being used to control the execution of the program instructions, characterized in that the program implements the steps of the method according to any one of claims 1 to 6 when executed by the processor.
Application number: CN201711148610.6A
Title: Construction method, device and the server of mood dictionary based on big data
Priority date: 2017-11-17
Filing date: 2017-11-17
Publication number: CN107807920A
Publication date: 2018-03-16
Legal status: Pending
Family ID: 61580404
Country status: CN (1), CN107807920A
Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080270116A1 (en) * 2007-04-24 2008-10-30 Namrata Godbole Large-Scale Sentiment Analysis
CN103995803A (en) * 2014-04-25 2014-08-20 西北工业大学 Fine granularity text sentiment analysis method
CN104090864A (en) * 2014-06-09 2014-10-08 合肥工业大学 Emotion dictionary building and emotion calculation method
CN104794211A (en) * 2015-04-24 2015-07-22 清华大学 Method and system for extracting sentiment inducements and analyzing inducement elements based on microblog text

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
李勇敢 等: "中文微博情感分析研究与实现", 《软件学报》 *
蒋盛益 等: "面向微博的社会情绪词典构建及情绪分析方法研究", 《中文信息学报》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110389667A (en) * 2018-04-17 2019-10-29 北京搜狗科技发展有限公司 A kind of input method and device
CN110858099A (en) * 2018-08-20 2020-03-03 北京搜狗科技发展有限公司 Candidate word generation method and device
CN110858099B (en) * 2018-08-20 2024-04-12 北京搜狗科技发展有限公司 Candidate word generation method and device
CN109388714A (en) * 2018-10-23 2019-02-26 东软集团股份有限公司 Text marking method, apparatus, equipment and computer readable storage medium
CN109388714B (en) * 2018-10-23 2020-11-24 东软集团股份有限公司 Text labeling method, device, equipment and computer readable storage medium

Similar Documents

Publication Publication Date Title
US8909648B2 (en) Methods and systems of supervised learning of semantic relatedness
CN104281622B (en) Information recommendation method and device in a kind of social media
CN104615608B (en) A kind of data mining processing system and method
CN104991899B (en) The recognition methods of user property and device
CN108388608B (en) Emotion feedback method and device based on text perception, computer equipment and storage medium
CN109145216A (en) Network public-opinion monitoring method, device and storage medium
CN108363790A (en) For the method, apparatus, equipment and storage medium to being assessed
CN106354818B (en) Social media-based dynamic user attribute extraction method
CN107943789A (en) Mood analysis method, device and the server of topic information
Chatzakou et al. Harvesting opinions and emotions from social media textual resources
CN106776574A (en) User comment text method for digging and device
CN110457672B (en) Keyword determination method and device, electronic equipment and storage medium
CN111309916B (en) Digest extracting method and apparatus, storage medium, and electronic apparatus
CN109325124A (en) A kind of sensibility classification method, device, server and storage medium
CN107807920A (en) Construction method, device and the server of mood dictionary based on big data
JP2007041721A (en) Information classifying method and program, device and recording medium
Raghuvanshi et al. A brief review on sentiment analysis
CN109471932A (en) Rumour detection method, system and storage medium based on learning model
CN111061838B (en) Text feature keyword determination method and device and storage medium
Rygl Automatic adaptation of author’s stylometric features to document types
CN110019556B (en) Topic news acquisition method, device and equipment thereof
CN110069686A (en) User behavior analysis method, apparatus, computer installation and storage medium
Paul et al. Editing Behavior to Recognize Authors of Crowdsourced Content.
CN106997340A (en) The generation of dictionary and the Document Classification Method and device using dictionary
CN105095385B (en) A kind of output method and device of retrieval result

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180316