CN106547740A - Text message processing method and device - Google Patents
Text message processing method and device
- Publication number
- CN106547740A, CN201611043882.5A
- Authority
- CN
- China
- Prior art keywords
- word
- dictionary
- text message
- term vector
- emotion
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Abstract
The invention provides a text message processing method and device, belonging to the technical field of natural language processing and data mining. The method includes: obtaining a text message; performing word segmentation on the text message to obtain multiple candidate words; obtaining the word vector corresponding to each candidate word; calculating the similarity between the word vector of each candidate word and the word vector of each emotion word in a preset sentiment dictionary; and judging the emotion attribute of the text message according to these similarities. Compared with existing methods, the method and device provided by the invention lower the requirement on how quickly the sentiment dictionary must be updated, avoid the poor sentiment analysis results caused by an out-of-date sentiment dictionary, and effectively improve the accuracy of the analysis.
Description
Technical field
The present invention relates to the technical field of natural language processing and data mining, and in particular to a text message processing method and device.
Background
Sentiment analysis of text messages is the process of analyzing, processing, summarizing, and reasoning about subjective text that carries emotional color. It is widely used in areas such as Internet public opinion analysis and early warning, and business decision-making. Traditional sentiment analysis methods rely mainly on the emotion attributes of sample words. For example, the sample words include a large number of emotion words, and the emotion attribute of a word to be analyzed is determined by looking it up among the sample words. However, existing methods depend heavily on these sample words: if the word to be analyzed cannot be found among the sample words, its emotion attribute cannot be obtained. In other words, existing methods place very high demands on how quickly the sample words are updated.
Summary of the invention
In view of this, an object of the present invention is to provide a text message processing method and device that can effectively alleviate the above problem.
To achieve this object, the technical solution of the present invention is as follows:
In a first aspect, an embodiment of the present invention provides a text message processing method. The method includes: obtaining a text message; performing word segmentation on the text message to obtain multiple candidate words; obtaining the word vector corresponding to each candidate word; calculating the similarity between the word vector of each candidate word and the word vector of each emotion word in a preset sentiment dictionary, where the sentiment dictionary includes at least two lexicons (sub-dictionaries), each lexicon corresponds to one emotion attribute, each lexicon includes at least one emotion word, and each emotion word corresponds to one word vector; and judging the emotion attribute of the text message according to the similarity between the word vector of each candidate word and the word vector of each emotion word in the sentiment dictionary.
In a second aspect, an embodiment of the present invention further provides a text message processing apparatus. The apparatus includes a text message acquisition module, a word segmentation module, a word vector acquisition module, a similarity calculation module, and an emotion attribute determination module. The text message acquisition module is used to obtain a text message. The word segmentation module is used to perform word segmentation on the text message to obtain multiple candidate words. The word vector acquisition module is used to obtain the word vector corresponding to each candidate word. The similarity calculation module is used to calculate the similarity between the word vector of each candidate word and the word vector of each emotion word in a preset sentiment dictionary, where the sentiment dictionary includes at least two lexicons (sub-dictionaries), each lexicon corresponds to one emotion attribute, each lexicon includes at least one emotion word, and each emotion word corresponds to one word vector. The emotion attribute determination module is used to judge the emotion attribute of the text message according to the similarity between the word vector of each candidate word and the word vector of each emotion word in the sentiment dictionary.
The text message processing method and device provided by the embodiments of the present invention judge the emotion attribute of a text message according to the similarity between the word vector of each candidate word and the word vector of each emotion word in a preset sentiment dictionary. Compared with existing methods, the preset sentiment dictionary does not need to contain the candidate word to be judged together with its emotion attribute. This lowers the update requirements on the sentiment dictionary, avoids the poor sentiment analysis results caused by an out-of-date sentiment dictionary, and effectively improves the accuracy of the analysis.
To make the above objects, features, and advantages of the present invention easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention more clearly, the drawings needed for the embodiments are briefly introduced below. It should be understood that the following drawings show only certain embodiments of the present invention and therefore should not be regarded as limiting its scope. Those of ordinary skill in the art can obtain other related drawings from these drawings without creative effort.
Fig. 1 shows a structural block diagram of a computer to which the embodiments of the present invention can be applied;
Fig. 2 shows a flowchart of a text message processing method provided by the first embodiment of the present invention;
Fig. 3 shows a flowchart of one implementation of step S150 in the text message processing method provided by the first embodiment of the present invention;
Fig. 4 shows a flowchart of another implementation of step S150 in the text message processing method provided by the first embodiment of the present invention;
Fig. 5 shows a flowchart of the text message processing method provided by the second embodiment of the present invention;
Fig. 6 shows the model architecture of the NNLM model;
Fig. 7 shows the model architecture of the CBOW model;
Fig. 8 shows the model architecture of the Skip-gram model;
Fig. 9 shows a functional block diagram of a text message processing apparatus provided by the third embodiment of the present invention;
Fig. 10 shows a functional block diagram of a text message processing apparatus provided by the fourth embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. The components of the embodiments, as generally described and illustrated in the drawings, can be arranged and designed in a variety of configurations. Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the scope of the claimed invention but merely represents selected embodiments. All other embodiments obtained by those skilled in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
Fig. 1 is a block diagram of a computer 100. The computer 100 includes a text message processing apparatus, a memory 120, a memory controller 130, a processor 140, a peripheral interface 150, an input/output unit 160, and a display unit 170.
The memory 120, memory controller 130, processor 140, peripheral interface 150, input/output unit 160, and display unit 170 are electrically connected to one another, directly or indirectly, to enable the transmission or exchange of data. For example, these components may be electrically connected to each other through one or more communication buses or signal lines. The text message processing apparatus includes at least one software function module that can be stored in the memory 120 in the form of software or firmware, or built into the operating system (OS) of the computer 100. The processor 140 is used to execute the executable modules stored in the memory 120, such as the software function modules or computer programs included in the text message processing apparatus provided by the embodiments of the present invention.
The memory 120 may be, but is not limited to, random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), and the like.
The memory 120 is used to store a program, and the processor 140 executes the program after receiving an execution instruction. The method performed by the computer 100, as defined by the flow disclosed in any embodiment of the present invention, may be applied to the processor 140 or implemented by the processor 140.
The processor 140 may be an integrated circuit chip with signal processing capability. The processor 140 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The peripheral interface 150 couples various input/output devices to the processor 140 and the memory 120. In some embodiments, the peripheral interface 150, processor 140, and memory controller 130 may be implemented in a single chip. In other examples, they may each be implemented by a separate chip.
The input/output unit 160 lets the user enter data so that the user can interact with the computer 100. The input/output unit 160 may be, but is not limited to, a mouse, a keyboard, and the like.
The display unit 170 provides an interactive interface (for example, a user interface) between the computer 100 and the user, or displays image data for the user's reference. In this embodiment, the display unit 170 may be a liquid crystal display or a touch display. A touch display may be a capacitive or resistive touch screen that supports single-point and multi-point touch operations. Supporting single-point and multi-point touch operations means that the touch display can sense touch operations produced simultaneously at one or more positions on it and pass the sensed touch operations to the processor 140 for calculation and processing.
It can be understood that the structure shown in Fig. 1 is only illustrative. The computer 100 may include more or fewer components than shown in Fig. 1, or have a configuration different from that shown in Fig. 1. Each component shown in Fig. 1 may be implemented in hardware, software, or a combination of the two.
A text is made up of words, so the emotion attribute of a whole text can be determined by combining the emotion attributes of its words. The emotion attribute of a word, for example like versus dislike or positive versus negative, can also be understood as the sentiment orientation the word expresses. The embodiments of the present invention are aimed mainly at Chinese short texts and propose a method for analyzing the emotion attribute of a text message. The emotion attribute of the text message is judged according to the similarity between the word vector of each candidate word and the word vector of each emotion word in a preset sentiment dictionary. This effectively lowers the requirement on how quickly the sentiment dictionary must be updated, avoids the poor sentiment analysis results caused by an out-of-date sentiment dictionary, and effectively improves the accuracy of the analysis. Of course, the text message processing method and device provided by the embodiments of the present invention can also be used to analyze the emotion attribute of texts in other languages.
First embodiment
Fig. 2 shows a flowchart of a text message processing method provided by the first embodiment of the present invention. Referring to Fig. 2, the text message processing method includes:
Step S110: obtain a text message.
In this embodiment, the text message is mainly a Chinese short text. It can be entered through the input/output unit 160 or obtained over a network. Of course, the text message can also be a text in another language, for example an English text.
Step S120: perform word segmentation on the text message to obtain multiple candidate words.
In English text, adjacent words are naturally delimited by spaces. In Chinese text there is no obvious delimiter between adjacent words, so when the text message is Chinese it must go through Chinese word segmentation, which cuts a sequence of Chinese characters into individual words. In this embodiment, Chinese word segmentation can use the Python jieba segmentation component or the Chinese lexical analysis system ICTCLAS (Institute of Computing Technology, Chinese Lexical Analysis System). Other Chinese word segmentation algorithms can of course also be used. Each of the two methods has its advantages: the jieba component is simple to install and use, while ICTCLAS offers higher segmentation precision.
In addition, to improve analysis efficiency, the text message needs to be preprocessed before word segmentation. The preprocessing includes data cleaning: removing content that carries no emotion attribute, such as HTML tags, CSS tags, and URL links, from the text.
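The data cleaning described above can be sketched with regular expressions. This is a minimal illustration using only the Python standard library; the function name `clean_text` and the three patterns are assumptions made for the example and are not specified by the patent.

```python
import re

# Illustrative patterns (assumptions, not taken from the patent).
# <script>...</script> and <style>...</style> blocks, including their contents.
SCRIPT_STYLE_RE = re.compile(r"<(script|style)\b.*?</\1\s*>", re.IGNORECASE | re.DOTALL)
# Any remaining tag such as <p>, </p>, or <a href="...">.
TAG_RE = re.compile(r"<[^>]+>")
# URLs, restricted to ASCII URL characters so adjacent Chinese text is kept.
URL_RE = re.compile(r"(?:https?://|www\.)[A-Za-z0-9./?=&%_#:~+\-]+")


def clean_text(raw: str) -> str:
    """Strip markup and links that carry no emotion attribute."""
    text = SCRIPT_STYLE_RE.sub(" ", raw)   # drop embedded CSS/JS first
    text = TAG_RE.sub(" ", text)           # then drop the remaining tags
    text = URL_RE.sub(" ", text)           # then drop URL links
    return re.sub(r"\s+", " ", text).strip()  # collapse leftover whitespace
```

Style blocks are removed before generic tags so that their contents do not survive as plain text.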
Further, the preprocessing also includes converting the encoding of the text message to a preset encoding, for example UTF-8 or GBK. For instance, the encoding of the text message can be identified with chardet, and Python's decode and encode functions can then be used to quickly unify the encoding.
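The encoding unification can be sketched as follows. Note that this example does not use chardet: as a stated simplification, it tries a short list of candidate encodings in order instead of detecting the encoding statistically. The function name `to_utf8` and the candidate list are assumptions for illustration.

```python
def to_utf8(data: bytes, candidates=("utf-8", "gbk")) -> bytes:
    """Re-encode raw bytes as UTF-8 by trial decoding.

    Assumption: the input uses one of `candidates`. The source text suggests
    chardet for detection; trial decoding is a stdlib-only stand-in.
    """
    for encoding in candidates:
        try:
            text = data.decode(encoding)
        except UnicodeDecodeError:
            continue  # this candidate does not fit, try the next one
        return text.encode("utf-8")
    raise ValueError("none of the candidate encodings could decode the input")
```

Trying UTF-8 first matters here, because most UTF-8 byte sequences would also decode under GBK but yield the wrong characters.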
To further save storage space and improve analysis efficiency, the stop words among the candidate words also need to be removed after word segmentation. Stop words are usually function words that, compared with other words, carry no concrete meaning. They can be prepositions, pronouns, function words, and characters unrelated to emotion, for example "even if", "1.", and "#". Stop words can be removed as follows: compare the candidate words with the stop words in a preset stop-word list and, when a stop word identical to a candidate word is found in the list, remove that candidate word. For example, the stop-word list can be one or more of the stop-word list provided by Harbin Institute of Technology, the stop-word list provided by Baidu, and the stop-word list of the Sichuan University Machine Intelligence Laboratory.
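Stop-word removal as described above amounts to a set-membership filter. The sketch below assumes the segmented words arrive as a list of strings; the tiny `STOP_WORDS` set is a hypothetical stand-in for a real stop-word list.

```python
# Hypothetical stop-word set; a real system would load a published list.
STOP_WORDS = {"即使", "的", "了", "1.", "#"}


def remove_stop_words(words, stop_words=STOP_WORDS):
    """Keep only the candidate words that do not appear in the stop-word list."""
    return [word for word in words if word not in stop_words]
```

Storing the stop words in a set keeps each lookup constant-time, which helps when the list holds thousands of entries.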
Step S130: obtain the word vector corresponding to each candidate word.
A word vector can be expressed as a distributed representation, which is a low-dimensional real-valued vector. Word vectors capture semantics well and are a common way to characterize words. The value in each dimension represents a feature with some semantic and grammatical interpretation, so each dimension of a word vector can be called a word feature. The word vector of a word can be obtained by training a language model: training maps each word to a K-dimensional real vector, where K is an integer greater than 1. For example, the word vector of a word can be expressed as [0.785, 0.109, -0.117, -0.127, 0.652, ...].
Step S140: calculate the similarity between the word vector of each candidate word and the word vector of each emotion word in the preset sentiment dictionary.
The sentiment dictionary includes at least two lexicons (sub-dictionaries). Each lexicon corresponds to one emotion attribute and includes at least one emotion word.
The sentiment dictionary can be set according to the user's needs. For example, when the user needs to determine the sentiment orientation of microblog text, that is, whether the text is positive, negative, or neutral, the sentiment dictionary can include three lexicons whose emotion attributes are positive, negative, and neutral respectively. The lexicon whose emotion attribute is positive includes multiple representative positive words, for example words meaning "like" or "happiness". The lexicon whose emotion attribute is negative includes multiple representative negative words, for example "depressed" and "gloomy". The lexicon whose emotion attribute is neutral includes multiple representative neutral words.
In the sentiment dictionary, each emotion word corresponds to one word vector. Note that the word vector of each emotion word can be obtained with the same method as in step S130, and that the word vectors of emotion words and of candidate words have the same dimension.
The semantic similarity between a candidate word and an emotion word in the sentiment dictionary can then be judged from their word vectors, for example by cosine similarity or Euclidean distance. Taking cosine similarity as an example, the cosine of the angle between two word vectors is used as the similarity of the two corresponding words. Assuming the two word vectors are a and b and the angle between them is θ, then cos θ = (a · b) / (|a| · |b|). In this way, the similarity between each candidate word and each emotion word in the sentiment dictionary can be calculated.
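The cosine similarity formula can be written directly in Python. This is a minimal sketch that treats word vectors as plain lists of floats; returning 0.0 for a zero-length vector is a convention chosen for the example, since the patent does not discuss that case.

```python
import math


def cosine_similarity(a, b):
    """cos(theta) = (a . b) / (|a| * |b|) for two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("word vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumed convention: a zero vector is similar to nothing
    return dot / (norm_a * norm_b)
```

The result lies in [-1, 1]: vectors pointing the same way score close to 1, orthogonal vectors score 0, and opposite vectors score -1.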
Step S150: judge the emotion attribute of the text message according to the similarity between the word vector of each candidate word and the word vector of each emotion word in the sentiment dictionary.
The emotion attribute of the whole text message can be judged by combining the semantics of all candidate words. For example, when the candidate words of a certain emotion attribute are the most numerous among all candidate words, that emotion attribute can be judged to be the emotion attribute of the whole text message.
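The counting rule in the example above can be sketched as a simple vote. The sketch assumes each candidate word has already been assigned an emotion attribute label by some earlier step; how that per-word label is derived is not specified here, so the input list is hypothetical.

```python
from collections import Counter


def majority_attribute(word_attributes):
    """Return the emotion attribute held by the largest number of candidate words."""
    if not word_attributes:
        raise ValueError("no candidate words to count")
    counts = Counter(word_attributes)
    attribute, _count = counts.most_common(1)[0]
    return attribute
```

The source text does not say how ties should be resolved, so callers that care about ties would need their own rule.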
Each lexicon in the sentiment dictionary includes at least one emotion word. To make the judgment of the emotion attribute of the candidate words more accurate, each lexicon preferably includes multiple emotion words.
Specifically, as shown in Fig. 3, one implementation of step S150 can include step S151 and step S152.
Step S151: for each lexicon, calculate the sum of the similarities between the word vectors of all candidate words and the word vectors of the emotion words belonging to that lexicon, and take this sum as the similarity between the lexicon and the text message.
The sum of the similarities between the word vector of a candidate word and the word vectors of the emotion words belonging to the same lexicon can be taken as the similarity between that lexicon and the candidate word. The sum of the similarities between all candidate words and the same lexicon is then taken as the similarity between the lexicon and the text message. In this way, the similarity between each lexicon and the text message can be obtained.
For example, suppose step S120 splits the text message into 20 candidate words, and the preset sentiment dictionary includes three lexicons with different emotion attributes, S1, S2, and S3, each containing 10 emotion words. The similarity between each candidate word and each of S1, S2, and S3 is calculated first. The sum of the similarities between the 20 candidate words and S1 is then taken as the similarity between the text message and S1, the sum for S2 as the similarity between the text message and S2, and the sum for S3 as the similarity between the text message and S3.
Step S152: take the emotion attribute of the lexicon with the largest similarity to the text message as the emotion attribute of the text message.
The similarities between each lexicon and the text message calculated in step S151 are compared, and the emotion attribute of the lexicon with the largest similarity to the text message is taken as the emotion attribute of the text message. If the similarity between the 20 candidate words and lexicon S3 is larger than their similarity to S1 and to S2, the emotion attribute of S3 is taken as the emotion attribute of the text message.
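Steps S151 and S152 can be sketched as a double sum followed by an arg-max. The sketch assumes cosine similarity as the measure and represents the sentiment dictionary as a mapping from emotion attribute to a list of emotion-word vectors; the names `lexicon_scores` and `classify` and the toy two-dimensional vectors in the test are illustrative assumptions.

```python
import math


def cosine(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def lexicon_scores(candidate_vectors, sentiment_dictionary):
    """S151: for each lexicon, sum similarities over all candidate/emotion word pairs."""
    return {
        attribute: sum(
            cosine(candidate, emotion)
            for candidate in candidate_vectors
            for emotion in emotion_vectors
        )
        for attribute, emotion_vectors in sentiment_dictionary.items()
    }


def classify(candidate_vectors, sentiment_dictionary):
    """S152: return the attribute of the lexicon most similar to the text."""
    scores = lexicon_scores(candidate_vectors, sentiment_dictionary)
    return max(scores, key=scores.get)
```

Because the score is a plain sum, it grows with the number of emotion words in a lexicon; the worked example in the text sidesteps this by giving every lexicon the same size.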
As another implementation, the mean of the similarities between a candidate word and the emotion words belonging to the same lexicon can be taken as the similarity between that candidate word and the lexicon. The sum of the similarities between all candidate words and the same lexicon is then taken as the similarity between the lexicon and the text message.
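The averaging variant can be sketched on precomputed similarities. The sketch assumes `similarity_rows[i][j]` holds the similarity between candidate word i and emotion word j of a single lexicon; the function name and this matrix layout are assumptions for illustration.

```python
def text_to_lexicon_similarity(similarity_rows):
    """Sum, over candidate words, of the mean similarity to one lexicon's emotion words.

    similarity_rows[i][j]: similarity of candidate word i to emotion word j.
    """
    total = 0.0
    for row in similarity_rows:
        if not row:
            raise ValueError("each lexicon must contain at least one emotion word")
        total += sum(row) / len(row)  # per-word similarity = mean over the lexicon
    return total
```

Averaging within a lexicon makes the per-word similarity independent of how many emotion words the lexicon holds.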
In a specific implementation of this embodiment, the sentiment dictionary includes a first lexicon and a second lexicon, where the emotion attribute of the first lexicon is positive and the emotion attribute of the second lexicon is negative. In this case, as shown in Fig. 4, the two operations described above (calculating, for each lexicon, the sum of the similarities between the word vectors of all candidate words and the word vectors of the emotion words belonging to that lexicon as the similarity between the lexicon and the text message, and taking the emotion attribute of the lexicon with the largest similarity to the text message as the emotion attribute of the text message) specifically include:
Step S1501: calculate the sum of the similarities between the word vectors of all candidate words and the word vectors of the emotion words in the first lexicon, as the similarity between the first lexicon and the text message;
Step S1502: calculate the sum of the similarities between the word vectors of all candidate words and the word vectors of the emotion words in the second lexicon, as the similarity between the second lexicon and the text message;
Step S1503: calculate the difference between the similarity of the first lexicon to the text message and the similarity of the second lexicon to the text message;
Step S1504: judge whether the difference is greater than 0;
When the difference is greater than zero, the similarity between the first lexicon and the text message is greater than the similarity between the second lexicon and the text message, and step S1505 is performed. When the difference is less than zero, the similarity between the second lexicon and the text message is greater than the similarity between the first lexicon and the text message, and step S1506 is performed.
Step S1505: judge the emotion attribute of the text message to be positive;
Step S1506: judge the emotion attribute of the text message to be negative.
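Steps S1503 to S1506 can be sketched as a sign test on the difference of the two lexicon similarities. The function name `judge_polarity` is an assumption, and returning `None` when the difference is exactly zero is a choice made for this example, because the steps above only cover the greater-than and less-than cases.

```python
def judge_polarity(similarity_first, similarity_second):
    """Decide polarity from the two lexicon-to-text similarities.

    similarity_first:  similarity between the first (positive) lexicon and the text.
    similarity_second: similarity between the second (negative) lexicon and the text.
    """
    difference = similarity_first - similarity_second  # step S1503
    if difference > 0:                                 # step S1504
        return "positive"                              # step S1505
    if difference < 0:
        return "negative"                              # step S1506
    return None  # assumed handling: a zero difference is left undecided
```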
The text message processing method provided by this embodiment judges the emotion attribute of a text message according to the similarity between the word vector of each candidate word and the word vector of each emotion word in a preset sentiment dictionary. Compared with existing methods, the preset sentiment dictionary does not need to contain the candidate word to be judged together with its emotion attribute. This lowers the requirement on how quickly the sentiment dictionary must be updated, avoids the poor sentiment analysis results caused by an out-of-date sentiment dictionary, and effectively improves the accuracy of the analysis.
Second embodiment
Fig. 5 shows a flowchart of a text message processing method provided by the second embodiment of the present invention. Referring to Fig. 5, the text message processing method includes:
Step S210: obtain a corpus.
In this embodiment, the training corpus can be the Sohu news data (SogouCS) provided by Sogou Labs, together with microblog text or forum comments collected with web crawler technology.
Step S220: train on the corpus with the word2vec algorithm to obtain a correspondence table of multiple training words and the word vector corresponding to each training word.
Word2vec can express each word as a word vector in distributed representation form, and similarity in the vector space can be used to represent the semantic similarity of words. Word2vec improves on the neural network language model (NNLM), a natural language estimation model based on a three-layer neural network, and proposes two log-linear models: the continuous bag-of-words (CBOW) model and the continuous Skip-gram model. This improvement greatly increases the training speed of the neural network.
The model diagram of NNLM is shown in Fig. 6. Because NNLM is an existing model, its principle is not described further here. Word2vec includes two training models, CBOW and Skip-gram. Fig. 7 shows the model architecture of the CBOW model, and Fig. 8 shows the model architecture of the Skip-gram model. Both models include an input layer, a projection layer, and an output layer. Their basic principles are likewise not described further here.
In this embodiment, either the CBOW model or the Skip-gram model can be used to train on the obtained corpus, yielding multiple training words and the word vector corresponding to each training word. The training words and their word vectors make up the correspondence table.
Step S230: obtain a text message.
Step S240: perform word segmentation on the text message to obtain multiple candidate words.
For the implementation of steps S230 and S240, refer to steps S110 and S120 in the first embodiment above; the details are not repeated here.
Step S250: look up, in the preset correspondence table, the training word whose consistency with each candidate word satisfies a preset condition, and take the word vector of that training word as the word vector of the candidate word.
The correspondence table is obtained in step S220 and contains multiple training words and multiple word vectors in one-to-one correspondence. In step S250, the preset condition can be set according to the user's needs. For example, the preset condition can be a threshold on semantic similarity: when the semantic similarity between two words is greater than or equal to the threshold, their consistency is judged to satisfy the preset condition. Alternatively, a synonym table containing multiple groups of synonyms can be set up. When the correspondence table contains a training word identical to the current candidate word, that training word is taken as the training word whose consistency with the current candidate word satisfies the preset condition. When the correspondence table contains no training word identical to the current candidate word, the synonyms of the current candidate word are obtained from the synonym table, and the training word in the correspondence table that is identical to a synonym of the current candidate word is taken as the training word whose consistency with the current candidate word satisfies the preset condition.
Using each word undetermined as current word undetermined, the concordance in lookup correspondence table with current word undetermined
Meet pre-conditioned training word, and the corresponding term vector of word is trained as the word of current word undetermined using what is found
Vector, so as to obtain the corresponding term vector of each word undetermined.
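The synonym-table variant of step S250 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `lookup_vector`, the data layout (a dict for the correspondence table, a list of synonym groups) and the toy vectors are all assumptions, and returning `None` when nothing matches is a choice of this sketch, since the text does not say what happens in that case.

```python
from typing import Dict, List, Optional, Sequence

Vector = List[float]


def lookup_vector(
    candidate: str,
    table: Dict[str, Vector],
    synonyms: Sequence[Sequence[str]],
) -> Optional[Vector]:
    """Return the word vector for `candidate`, or None if no match is found."""
    # Case 1: the correspondence table contains the candidate word itself.
    if candidate in table:
        return table[candidate]
    # Case 2: fall back to a synonym of the candidate that is in the table.
    for group in synonyms:
        if candidate in group:
            for word in group:
                if word in table:
                    return table[word]
    # Not covered by the source text; returning None is an assumption.
    return None


# Toy data, invented for illustration only.
table = {"good": [0.9, 0.1], "bad": [-0.8, 0.2]}
synonyms = [["good", "great", "fine"], ["bad", "awful"]]

vec_exact = lookup_vector("good", table, synonyms)    # exact match
vec_synonym = lookup_vector("great", table, synonyms)  # matched via synonym
vec_missing = lookup_vector("table", table, synonyms)  # no match
```

Here a candidate word that is absent from the correspondence table simply borrows the vector of the first synonym that is present.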
Step S260: calculate the similarity between the word vector of each candidate word and the word vector of each sentiment word in the preset sentiment dictionary.
Step S270: determine the sentiment attribute of the text message according to the similarity between the word vector of each candidate word and the word vector of each sentiment word in the sentiment dictionary.
For specific implementations of steps S260 and S270, refer to steps S140 and S150 in the first embodiment above; they are not repeated here.
The text message processing method provided by this embodiment determines the sentiment attribute of a text message from the similarity between the word vector of each candidate word and the word vector of each sentiment word in the preset sentiment dictionary. The preset sentiment dictionary therefore does not need to contain the candidate words to be judged or their sentiment attributes. This lowers the requirement on how quickly the sentiment dictionary must be updated, avoids the poor sentiment analysis results caused by a sentiment dictionary that is not updated in time, and effectively improves the accuracy of the analysis results.
Third embodiment
Fig. 9 shows a functional block diagram of a text message processing apparatus provided by the third embodiment of the present invention. The text message processing apparatus provided by this embodiment can run on the computer 100 and is used to implement the text message processing method provided by the first embodiment. Referring to Fig. 9, the text message processing apparatus 10 provided by this embodiment includes: a text message acquisition module 11, a word segmentation module 12, a word vector acquisition module 13, a similarity calculation module 14, and a sentiment attribute determination module 15.
The text message acquisition module 11 is configured to obtain a text message.
The word segmentation module 12 is configured to perform word segmentation on the text message to obtain multiple candidate words.
The word vector acquisition module 13 is configured to obtain the word vector corresponding to each of the candidate words.
The similarity calculation module 14 is configured to calculate the similarity between the word vector of each candidate word and the word vector of each sentiment word in the preset sentiment dictionary. The sentiment dictionary includes at least two lexicons; each lexicon corresponds to one sentiment attribute and includes at least one sentiment word, and each sentiment word corresponds to one word vector.
The sentiment attribute determination module 15 is configured to determine the sentiment attribute of the text message according to the similarity between the word vector of each candidate word and the word vector of each sentiment word in the sentiment dictionary.
Further, as shown in Fig. 9, the sentiment attribute determination module 15 includes:
a calculation unit 151, configured to calculate, for each lexicon separately, the sum of the similarities between the word vectors of all the candidate words and the word vector of each sentiment word belonging to that lexicon in the sentiment dictionary, as the similarity between that lexicon and the text message; and
a determination unit 152, configured to take the sentiment attribute of the lexicon with the greatest similarity to the text message as the sentiment attribute of the text message.
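The roles of the calculation unit 151 and the determination unit 152 can be sketched as follows. This is an illustrative sketch only: cosine similarity is assumed as the similarity measure because this passage does not name one, and the function names `cosine`, `lexicon_scores` and `judge`, as well as the toy vectors, are hypothetical.

```python
import math
from typing import Dict, List

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / norm


def lexicon_scores(
    candidate_vectors: List[Vector],
    sentiment_dictionary: Dict[str, List[Vector]],
) -> Dict[str, float]:
    """For each lexicon, sum the similarities between every candidate vector
    and every sentiment-word vector in that lexicon (role of unit 151)."""
    return {
        attribute: sum(cosine(c, s) for c in candidate_vectors for s in vectors)
        for attribute, vectors in sentiment_dictionary.items()
    }


def judge(scores: Dict[str, float]) -> str:
    """Return the attribute of the highest-scoring lexicon (role of unit 152)."""
    return max(scores, key=lambda attribute: scores[attribute])


# Toy vectors, invented for illustration only.
sentiment_dictionary = {
    "positive": [[1.0, 0.0], [0.8, 0.6]],
    "negative": [[-1.0, 0.0]],
}
candidate_vectors = [[1.0, 0.0], [0.6, 0.8]]

scores = lexicon_scores(candidate_vectors, sentiment_dictionary)
label = judge(scores)
```

Because the scores are plain sums, a lexicon with more sentiment words can accumulate a larger total; whether to normalize by lexicon size is not addressed in the text and is left out of this sketch.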
In one specific implementation of this embodiment, the sentiment dictionary includes a first lexicon and a second lexicon, where the sentiment attribute corresponding to the first lexicon is positive and the sentiment attribute corresponding to the second lexicon is negative.
In this case, the calculation unit 151 is specifically configured to calculate the sum of the similarities between the word vectors of all the candidate words and the word vector of each sentiment word in the first lexicon, as the similarity between the first lexicon and the text message, and to calculate the sum of the similarities between the word vectors of all the candidate words and the word vector of each sentiment word in the second lexicon, as the similarity between the second lexicon and the text message.
The determination unit 152 is specifically configured to calculate the difference between the similarity of the first lexicon to the text message and the similarity of the second lexicon to the text message; when the difference is greater than zero, the sentiment attribute of the text message is determined to be positive, and when the difference is less than zero, it is determined to be negative.
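The two-lexicon decision rule can be written as a small function. This is a sketch under stated assumptions: the function name `judge_by_difference` and the example scores are hypothetical, and because the text only covers differences greater than or less than zero, returning `None` for a difference of exactly zero is an assumption of this sketch.

```python
from typing import Optional


def judge_by_difference(positive_score: float, negative_score: float) -> Optional[str]:
    """Decide the sentiment attribute from the two lexicon similarities."""
    difference = positive_score - negative_score
    if difference > 0:
        return "positive"
    if difference < 0:
        return "negative"
    # A difference of exactly zero is not covered by the source text.
    return None


# Example scores, invented for illustration only.
result_positive = judge_by_difference(3.36, -1.6)
result_negative = judge_by_difference(0.4, 1.1)
result_tie = judge_by_difference(0.5, 0.5)
```

An implementation would need its own policy for the tie case, for example a neutral label.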
In this embodiment, each module may be implemented in software code, in which case the modules may be stored in the memory 120 of the computer 100. The modules may equally be implemented in hardware, such as an integrated circuit chip.
Fourth embodiment
Fig. 10 shows a functional block diagram of a text message processing apparatus provided by the fourth embodiment of the present invention. The text message processing apparatus provided by this embodiment can run on the computer 100 and is used to implement the text message processing method provided by the second embodiment. Referring to Fig. 10, the text message processing apparatus 20 provided by this embodiment includes: a corpus acquisition module 21, a training module 22, a text message acquisition module 23, a word segmentation module 24, a word vector acquisition module 25, a similarity calculation module 26, and a sentiment attribute determination module 27.
The corpus acquisition module 21 is configured to obtain a corpus.
The training module 22 is configured to train on the corpus using the word2vec algorithm to obtain the multiple training words of the correspondence table and the word vector corresponding to each training word.
The text message acquisition module 23 is configured to obtain a text message.
The word segmentation module 24 is configured to perform word segmentation on the text message to obtain multiple candidate words.
The word vector acquisition module 25 is configured to look up, in the preset correspondence table, the training word whose consistency with each candidate word satisfies a preset condition, and to use the word vector of that training word as the word vector of the candidate word. The correspondence table includes multiple training words and multiple word vectors in one-to-one correspondence.
The similarity calculation module 26 is configured to calculate the similarity between the word vector of each candidate word and the word vector of each sentiment word in the preset sentiment dictionary. The sentiment dictionary includes at least two lexicons; each lexicon corresponds to one sentiment attribute and includes at least one sentiment word, and each sentiment word corresponds to one word vector.
The sentiment attribute determination module 27 is configured to determine the sentiment attribute of the text message according to the similarity between the word vector of each candidate word and the word vector of each sentiment word in the sentiment dictionary.
In this embodiment, each module may be implemented in software code, in which case the modules may be stored in the memory 120 of the computer 100. The modules may equally be implemented in hardware, such as an integrated circuit chip.
It should be noted that the embodiments in this specification are described progressively: each embodiment focuses on its differences from the other embodiments, and for parts that are the same or similar, the embodiments may be referred to one another.
The text message processing apparatus provided by the embodiments of the present invention has the same implementation principle and technical effects as the foregoing method embodiments. For brevity, where the apparatus embodiments do not mention something, refer to the corresponding content in the foregoing method embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may also be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the accompanying drawings show the possible architecture, functions, and operations of apparatuses, methods, and computer program products according to multiple embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated together to form one independent part, each module may exist separately, or two or more modules may be integrated to form one independent part.
If the functions are implemented in the form of software functional modules and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer 100, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes that element.
The foregoing is merely the preferred embodiments of the present invention and is not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention. It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings.
Claims (10)
1. A text message processing method, characterized in that the method comprises:
obtaining a text message;
performing word segmentation on the text message to obtain multiple candidate words;
obtaining the word vector corresponding to each of the candidate words;
calculating the similarity between the word vector of each candidate word and the word vector of each sentiment word in a preset sentiment dictionary, wherein the sentiment dictionary includes at least two lexicons, each lexicon corresponds to one sentiment attribute, each lexicon includes at least one sentiment word, and each sentiment word corresponds to one word vector; and
determining the sentiment attribute of the text message according to the similarity between the word vector of each candidate word and the word vector of each sentiment word in the sentiment dictionary.
2. The method according to claim 1, characterized in that obtaining the word vector corresponding to each of the candidate words comprises:
looking up, in a preset correspondence table, the training word whose consistency with each candidate word satisfies a preset condition, and using the word vector of that training word as the word vector of the candidate word, wherein the correspondence table includes multiple training words and multiple word vectors in one-to-one correspondence.
3. The method according to claim 2, characterized in that, before the step of obtaining a text message, the method further comprises:
obtaining a corpus; and
training on the corpus using the word2vec algorithm to obtain the multiple training words of the correspondence table and the word vector corresponding to each training word.
4. The method according to claim 1, characterized in that determining the sentiment attribute of the text message according to the similarity between the word vector of each candidate word and the word vector of each sentiment word in the sentiment dictionary comprises:
calculating, for each lexicon separately, the sum of the similarities between the word vectors of all the candidate words and the word vector of each sentiment word belonging to that lexicon in the sentiment dictionary, as the similarity between that lexicon and the text message; and
taking the sentiment attribute of the lexicon with the greatest similarity to the text message as the sentiment attribute of the text message.
5. The method according to claim 4, characterized in that the sentiment dictionary includes a first lexicon and a second lexicon, wherein the sentiment attribute corresponding to the first lexicon is positive and the sentiment attribute corresponding to the second lexicon is negative, and the steps of calculating, for each lexicon separately, the sum of the similarities between the word vectors of all the candidate words and the word vector of each sentiment word belonging to that lexicon in the sentiment dictionary, as the similarity between that lexicon and the text message, and taking the sentiment attribute of the lexicon with the greatest similarity to the text message as the sentiment attribute of the text message, comprise:
calculating the sum of the similarities between the word vectors of all the candidate words and the word vector of each sentiment word in the first lexicon, as the similarity between the first lexicon and the text message;
calculating the sum of the similarities between the word vectors of all the candidate words and the word vector of each sentiment word in the second lexicon, as the similarity between the second lexicon and the text message;
calculating the difference between the similarity of the first lexicon to the text message and the similarity of the second lexicon to the text message;
when the difference is greater than zero, determining that the sentiment attribute of the text message is positive; and
when the difference is less than zero, determining that the sentiment attribute of the text message is negative.
6. A text message processing apparatus, characterized in that the apparatus comprises:
a text message acquisition module, configured to obtain a text message;
a word segmentation module, configured to perform word segmentation on the text message to obtain multiple candidate words;
a word vector acquisition module, configured to obtain the word vector corresponding to each of the candidate words;
a similarity calculation module, configured to calculate the similarity between the word vector of each candidate word and the word vector of each sentiment word in a preset sentiment dictionary, wherein the sentiment dictionary includes at least two lexicons, each lexicon corresponds to one sentiment attribute, each lexicon includes at least one sentiment word, and each sentiment word corresponds to one word vector; and
a sentiment attribute determination module, configured to determine the sentiment attribute of the text message according to the similarity between the word vector of each candidate word and the word vector of each sentiment word in the sentiment dictionary.
7. The apparatus according to claim 6, characterized in that the word vector acquisition module is specifically configured to look up, in a preset correspondence table, the training word whose consistency with each candidate word satisfies a preset condition, and to use the word vector of that training word as the word vector of the candidate word, wherein the correspondence table includes multiple training words and multiple word vectors in one-to-one correspondence.
8. The apparatus according to claim 7, characterized in that the apparatus further comprises:
a corpus acquisition module, configured to obtain a corpus; and
a training module, configured to train on the corpus using the word2vec algorithm to obtain the multiple training words of the correspondence table and the word vector corresponding to each training word.
9. The apparatus according to claim 6, characterized in that the sentiment attribute determination module comprises:
a calculation unit, configured to calculate, for each lexicon separately, the sum of the similarities between the word vectors of all the candidate words and the word vector of each sentiment word belonging to that lexicon in the sentiment dictionary, as the similarity between that lexicon and the text message; and
a determination unit, configured to take the sentiment attribute of the lexicon with the greatest similarity to the text message as the sentiment attribute of the text message.
10. The apparatus according to claim 9, characterized in that the sentiment dictionary includes a first lexicon and a second lexicon, wherein the sentiment attribute corresponding to the first lexicon is positive and the sentiment attribute corresponding to the second lexicon is negative;
the calculation unit is specifically configured to calculate the sum of the similarities between the word vectors of all the candidate words and the word vector of each sentiment word in the first lexicon, as the similarity between the first lexicon and the text message, and to calculate the sum of the similarities between the word vectors of all the candidate words and the word vector of each sentiment word in the second lexicon, as the similarity between the second lexicon and the text message; and
the determination unit is specifically configured to calculate the difference between the similarity of the first lexicon to the text message and the similarity of the second lexicon to the text message, to determine that the sentiment attribute of the text message is positive when the difference is greater than zero, and to determine that the sentiment attribute of the text message is negative when the difference is less than zero.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611043882.5A CN106547740A (en) | 2016-11-24 | 2016-11-24 | Text message processing method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106547740A true CN106547740A (en) | 2017-03-29 |
Family
ID=58394892
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611043882.5A Pending CN106547740A (en) | 2016-11-24 | 2016-11-24 | Text message processing method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106547740A (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101634983A (en) * | 2008-07-21 | 2010-01-27 | 华为技术有限公司 | Method and device for text classification |
CN102880600A (en) * | 2012-08-30 | 2013-01-16 | 北京航空航天大学 | Word semantic tendency prediction method based on universal knowledge network |
CN103678278A (en) * | 2013-12-16 | 2014-03-26 | 中国科学院计算机网络信息中心 | Chinese text emotion recognition method |
CN104462378A (en) * | 2014-12-09 | 2015-03-25 | 北京国双科技有限公司 | Data processing method and device for text recognition |
US9075796B2 (en) * | 2012-05-24 | 2015-07-07 | International Business Machines Corporation | Text mining for large medical text datasets and corresponding medical text classification using informative feature selection |
CN104965822A (en) * | 2015-07-29 | 2015-10-07 | 中南大学 | Emotion analysis method for Chinese texts based on computer information processing technology |
CN105589941A (en) * | 2015-12-15 | 2016-05-18 | 北京百分点信息科技有限公司 | Emotional information detection method and apparatus for web text |
CN105893444A (en) * | 2015-12-15 | 2016-08-24 | 乐视网信息技术(北京)股份有限公司 | Sentiment classification method and apparatus |
Cited By (31)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107169142A (en) * | 2017-06-15 | 2017-09-15 | 厦门快商通科技股份有限公司 | A kind of document sentiment analysis system and method automatically updated |
CN107451126A (en) * | 2017-08-21 | 2017-12-08 | 广州多益网络股份有限公司 | A kind of near synonym screening technique and system |
CN107885785A (en) * | 2017-10-17 | 2018-04-06 | 北京京东尚科信息技术有限公司 | Text emotion analysis method and device |
CN107967258A (en) * | 2017-11-23 | 2018-04-27 | 广州艾媒数聚信息咨询股份有限公司 | The sentiment analysis method and system of text message |
CN107967258B (en) * | 2017-11-23 | 2021-09-17 | 广州艾媒数聚信息咨询股份有限公司 | Method and system for emotion analysis of text information |
CN108052508A (en) * | 2017-12-29 | 2018-05-18 | 北京嘉和美康信息技术有限公司 | A kind of information extraction method and device |
CN108052508B (en) * | 2017-12-29 | 2021-11-09 | 北京嘉和海森健康科技有限公司 | Information extraction method and device |
CN110134934A (en) * | 2018-02-02 | 2019-08-16 | 普天信息技术有限公司 | Text emotion analysis method and device |
CN110457339A (en) * | 2018-05-02 | 2019-11-15 | 北京京东尚科信息技术有限公司 | Data search method and device, electronic equipment, storage medium |
CN109271510A (en) * | 2018-08-16 | 2019-01-25 | 龙马智芯(珠海横琴)科技有限公司 | Emotion term vector construction method and system |
CN109271510B (en) * | 2018-08-16 | 2019-07-09 | 龙马智芯(珠海横琴)科技有限公司 | Emotion term vector construction method and system |
CN109299400A (en) * | 2018-09-06 | 2019-02-01 | 北京奇艺世纪科技有限公司 | A kind of viewpoint abstracting method, device and equipment |
TWI687825B (en) * | 2018-12-03 | 2020-03-11 | 國立臺灣師範大學 | Method and system for mapping from natural language to color combination |
CN109902300A (en) * | 2018-12-29 | 2019-06-18 | 深兰科技(上海)有限公司 | A kind of method, apparatus, electronic equipment and storage medium creating dictionary |
CN109885687A (en) * | 2018-12-29 | 2019-06-14 | 深兰科技(上海)有限公司 | A kind of sentiment analysis method, apparatus, electronic equipment and the storage medium of text |
CN109858004A (en) * | 2019-02-12 | 2019-06-07 | 四川无声信息技术有限公司 | Text Improvement, device and electronic equipment |
CN109858004B (en) * | 2019-02-12 | 2023-08-01 | 四川无声信息技术有限公司 | Text rewriting method and device and electronic equipment |
CN112446202A (en) * | 2019-08-16 | 2021-03-05 | 阿里巴巴集团控股有限公司 | Text analysis method and device |
CN111199148B (en) * | 2019-12-26 | 2023-01-20 | 东软集团股份有限公司 | Text similarity determination method and device, storage medium and electronic equipment |
CN111199148A (en) * | 2019-12-26 | 2020-05-26 | 东软集团股份有限公司 | Text similarity determination method and device, storage medium and electronic equipment |
WO2021134177A1 (en) * | 2019-12-30 | 2021-07-08 | 深圳市优必选科技股份有限公司 | Sentiment labeling method, apparatus and device for speaking content, and storage medium |
CN111164589A (en) * | 2019-12-30 | 2020-05-15 | 深圳市优必选科技股份有限公司 | Emotion marking method, device and equipment of speaking content and storage medium |
CN111898377A (en) * | 2020-07-07 | 2020-11-06 | 苏宁金融科技(南京)有限公司 | Emotion recognition method and device, computer equipment and storage medium |
CN112115212A (en) * | 2020-09-29 | 2020-12-22 | 中国工商银行股份有限公司 | Parameter identification method and device and electronic equipment |
CN112115212B (en) * | 2020-09-29 | 2023-10-03 | 中国工商银行股份有限公司 | Parameter identification method and device and electronic equipment |
CN112446217A (en) * | 2020-11-27 | 2021-03-05 | 广州三七互娱科技有限公司 | Emotion analysis method and device and electronic equipment |
CN112446217B (en) * | 2020-11-27 | 2024-05-28 | 广州三七互娱科技有限公司 | Emotion analysis method and device and electronic equipment |
CN113807807A (en) * | 2021-08-16 | 2021-12-17 | 深圳市云采网络科技有限公司 | Component parameter identification method and device, electronic equipment and readable medium |
CN116580402A (en) * | 2023-05-26 | 2023-08-11 | 读书郎教育科技有限公司 | Text recognition method and device for dictionary pen |
CN116580402B (en) * | 2023-05-26 | 2024-06-25 | 读书郎教育科技有限公司 | Text recognition method and device for dictionary pen |
CN117112628A (en) * | 2023-09-08 | 2023-11-24 | 廊坊丛林科技有限公司 | Logistics data updating method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | |
Application publication date: 20170329 |