CN108733644A - Text sentiment analysis method, computer-readable storage medium, and terminal device - Google Patents
Text sentiment analysis method, computer-readable storage medium, and terminal device
- Publication number
- CN108733644A (application CN201810309676.7A)
- Authority
- CN
- China
- Prior art keywords
- text
- vector
- emotion
- input
- participle
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/247—Thesauruses; Synonyms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Abstract
The invention belongs to the field of computer technology, and in particular relates to a text sentiment analysis method, a computer-readable storage medium, and a terminal device. The method performs word segmentation on a sentence text to be analyzed to obtain the segmented words that constitute the sentence text; looks up the column vector of each segmented word in a preset word vector database, which is a database recording the correspondence between words and column vectors, and assembles the column vectors of the segmented words into an input matrix in which each column corresponds to one column vector; selects from the sentence text the segmented word corresponding to a preset analysis object as the sentiment subject of the text sentiment analysis; and inputs the input matrix and an input vector into a preset text sentiment analysis neural network model to obtain the sentiment type of the sentiment subject in the sentence text, the input vector being the column vector of the sentiment subject.
Description
Technical field
The invention belongs to the field of computer technology, and in particular relates to a text sentiment analysis method, a computer-readable storage medium, and a terminal device.
Background art
Text sentiment analysis is the technique of classifying the meaning and sentiment information expressed by a text into two or more sentiment types, such as positive or negative. Current text sentiment analysis methods mainly count the adjectives in a text that represent different sentiments and perform a quantitative analysis on the counts. Such methods achieve high accuracy when analyzing sentence texts that contain a single sentiment subject, but when analyzing a sentence text that contains multiple sentiment subjects, they can hardly reflect the subjects' mixed sentiments. For example, the sentence text "Company A's sales performance substantially surpasses Company B" contains two sentiment subjects, "Company A" and "Company B". For the subject "Company A" the sentence text should be of positive sentiment type, whereas for the subject "Company B" it is of negative sentiment type. The analysis result of current text sentiment analysis methods, however, is independent of the sentiment subject: only a single sentiment type is obtained, with no distinction between subjects.
Summary of the invention
In view of this, embodiments of the present invention provide a text sentiment analysis method, a computer-readable storage medium, and a terminal device, to solve the problem that current text sentiment analysis methods can hardly reflect the mixed sentiments of multiple sentiment subjects.
A first aspect of the embodiments of the present invention provides a text sentiment analysis method, which may include:

performing word segmentation on a sentence text to be analyzed to obtain the segmented words that constitute the sentence text;

looking up the column vector of each segmented word in a preset word vector database, and assembling the column vectors of the segmented words into an input matrix, where each column of the input matrix corresponds to one column vector, and the word vector database is a database recording the correspondence between words and column vectors;

selecting from the sentence text the segmented word corresponding to a preset analysis object as the sentiment subject of the text sentiment analysis;

inputting the input matrix and an input vector into a preset text sentiment analysis neural network model to obtain the sentiment type of the sentiment subject in the sentence text, where the input vector is the column vector of the sentiment subject.
A second aspect of the embodiments of the present invention provides a computer-readable storage medium storing computer-readable instructions which, when executed by a processor, implement the following steps:

performing word segmentation on a sentence text to be analyzed to obtain the segmented words that constitute the sentence text;

looking up the column vector of each segmented word in a preset word vector database, and assembling the column vectors of the segmented words into an input matrix, where each column of the input matrix corresponds to one column vector, and the word vector database is a database recording the correspondence between words and column vectors;

selecting from the sentence text the segmented word corresponding to a preset analysis object as the sentiment subject of the text sentiment analysis;

inputting the input matrix and an input vector into a preset text sentiment analysis neural network model to obtain the sentiment type of the sentiment subject in the sentence text, where the input vector is the column vector of the sentiment subject.
A third aspect of the embodiments of the present invention provides a text sentiment analysis terminal device, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, the processor implementing the following steps when executing the computer-readable instructions:

performing word segmentation on a sentence text to be analyzed to obtain the segmented words that constitute the sentence text;

looking up the column vector of each segmented word in a preset word vector database, and assembling the column vectors of the segmented words into an input matrix, where each column of the input matrix corresponds to one column vector, and the word vector database is a database recording the correspondence between words and column vectors;

selecting from the sentence text the segmented word corresponding to a preset analysis object as the sentiment subject of the text sentiment analysis;

inputting the input matrix and an input vector into a preset text sentiment analysis neural network model to obtain the sentiment type of the sentiment subject in the sentence text, where the input vector is the column vector of the sentiment subject.
Compared with the prior art, the embodiments of the present invention have the following advantageous effects. An embodiment of the present invention first performs word segmentation on the sentence text to be analyzed to obtain the segmented words constituting the sentence text, then looks up the column vector of each segmented word in a preset word vector database and assembles these column vectors into an input matrix, then selects from the sentence text the segmented word corresponding to a preset analysis object as the sentiment subject of the text sentiment analysis, and finally inputs the input matrix and the input vector into a preset text sentiment analysis neural network model to obtain the sentiment type of the sentiment subject in the sentence text. Unlike the prior art, in addition to considering the sentence text as a whole, the embodiment feeds the column vector of the sentiment subject into the model as a separate input, so that the output of the neural network model is the sentiment type of that subject in the sentence text; in other words, the choice of sentiment subject becomes a deciding condition of the resulting sentiment type. Thus, when performing sentiment analysis on a sentence text containing multiple sentiment subjects, selecting different sentiment subjects yields the corresponding sentiment types, faithfully reflecting the mixed sentiments of the multiple subjects.
Description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings needed for the embodiments or the description of the prior art are briefly introduced below. Evidently, the drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of an embodiment of a text sentiment analysis method in an embodiment of the present invention;
Fig. 2 is a schematic flowchart of looking up the column vector of the current segmented word in the word vector database in an embodiment of the present invention;
Fig. 3 is a schematic flowchart of the data processing procedure of the text sentiment analysis neural network model in an embodiment of the present invention;
Fig. 4 is a schematic flowchart of the training process of the text sentiment analysis neural network model in an embodiment of the present invention;
Fig. 5 is a structural diagram of an embodiment of a text sentiment analysis apparatus in an embodiment of the present invention;
Fig. 6 is a schematic block diagram of a text sentiment analysis terminal device in an embodiment of the present invention.
Detailed description of embodiments
To make the objects, features, and advantages of the present invention more apparent and understandable, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Evidently, the embodiments described below are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.
Referring to Fig. 1, an embodiment of a text sentiment analysis method in an embodiment of the present invention may include the following steps.

Step S101: perform word segmentation on the sentence text to be analyzed to obtain the segmented words that constitute the sentence text.
Word segmentation cuts a sentence text into individual words, i.e., the segmented words. In this embodiment, the sentence text can be segmented according to a general-purpose dictionary, which ensures that every separated word is a normal vocabulary item; character sequences not in the dictionary are not separated into single characters. When the characters around a boundary could form a word in either direction, the split can be decided by comparing word frequencies: whichever grouping has the higher statistical frequency is the one separated out.

After the segmented words are obtained, binary combination words can optionally be considered by pairing neighboring words two by two, producing combinations such as "celebration meeting", "conference is smooth", and "smoothly closing". Preferably, these binary combination words can be further screened by word frequency. Specifically, a screening frequency threshold is preset and the occurrence frequency of each binary combination word is obtained; a binary combination word is retained if its frequency is greater than or equal to the threshold, and discarded (i.e., treated as two independent unary words) if its frequency is below the threshold. For example, if the threshold is set to 5, all binary combination words occurring fewer than 5 times are rejected.
Step S102: look up the column vector of each segmented word in the preset word vector database.

The word vector database is a database recording the correspondence between words and column vectors. A column vector may be the word vector obtained by training words with a word2vec model, which represents a word in terms of the probability of its occurrence given its context. Following the word2vec approach, each word is first represented as a 0-1 (one-hot) vector, and the word2vec model is then trained to predict the n-th word from the preceding n-1 words; the intermediate result obtained from the neural network model serves as the word vector. For example, suppose the one-hot vector of "celebration" is [1, 0, 0, 0, ..., 0], that of "conference" is [0, 1, 0, 0, ..., 0], and that of "smooth" is [0, 0, 1, 0, ..., 0], and the model predicts the vector [0, 0, 0, 1, ..., 0] of "closing". Training produces the hidden-layer coefficient matrix W, and the product of each word's one-hot vector with W is that word's word vector, which finally takes the form of a multi-dimensional vector such as [-0.28, 0.34, -0.02, ..., 0.92] for "celebration".
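The key fact above, that multiplying a one-hot vector by the hidden-layer coefficient matrix W simply selects one row of W as the word vector, can be verified directly; the vocabulary size and dimensions here are toy values:

```python
import numpy as np

# Toy vocabulary of 5 words; hidden layer (word vector) of size 3.
V, D = 5, 3
rng = np.random.default_rng(0)
W = rng.standard_normal((V, D))  # hidden-layer coefficient matrix from training

def one_hot(index, size):
    vec = np.zeros(size)
    vec[index] = 1.0
    return vec

# If "celebration" is word 0, its one-hot vector times W is its word
# vector, which is exactly row 0 of W.
celebration = one_hot(0, V)
word_vector = celebration @ W
print(word_vector.shape)  # (3,)
```

This is why a trained word2vec model can serve as a lookup table from words to dense vectors.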
In this embodiment, the word vector database may be a K-level tree-shaped shard storage structure, in which case step S102 may include the steps shown in Fig. 2.

Step S1021: perform hash operations on the current segmented word using multiple mutually independent hash functions.

The current segmented word is any one of the segmented words. Specifically, K mutually independent hash functions can be used to hash the current segmented word according to the following formula:

HashKey_k = HASH_k(BasicWord)

where BasicWord is the current segmented word, HASH_k is the hash function with index k, HashKey_k is the computed hash value with index k, 1 ≤ k ≤ K, and K is an integer greater than 1.

Step S1022: calculate the index of the storage shard at each level to which the current segmented word belongs.

Specifically, the index of the k-th level storage shard to which the current segmented word belongs is calculated from the hash value HashKey_k, where MaxHashKey_k is the maximum value of the hash function HASH_k, FragNum_k is the number of storage shards of the k-th level subtree, Ceil is the round-up function, Floor is the round-down function, WordRoute is the array recording the storage path, and WordRoute[k-1], the k-th element of WordRoute, is the index of the k-th level shard to which the current segmented word belongs.

Step S1023: look up the column vector of the current segmented word under the recorded storage path.

Specifically, the column vector of the current segmented word is looked up under the storage path recorded in the array WordRoute. For example, if WordRoute = [1, 2, 1, 3, 5], the storage path is: shard 1 of the level-1 subtree -> shard 2 of the level-2 subtree -> shard 1 of the level-3 subtree -> shard 3 of the level-4 subtree -> shard 5 of the level-5 subtree, and the column vector of the current segmented word is looked up under this path.
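A minimal sketch of computing WordRoute is below. Two points are assumptions, not the patent's exact construction: the family of mutually independent hash functions is built by salting MD5 with the index k, and the mapping of HashKey_k onto the range [1, FragNum_k] uses a simple modulus in place of the Ceil/Floor formula referenced above:

```python
import hashlib

def make_hash(k):
    """A family of mutually independent hash functions HASH_k,
    built here (an assumption) by salting MD5 with the index k."""
    def hash_k(word):
        digest = hashlib.md5(f"{k}:{word}".encode("utf-8")).hexdigest()
        return int(digest, 16)
    return hash_k

def word_route(word, frag_nums):
    """Compute WordRoute: the shard index at each of the K levels.
    frag_nums[k-1] is FragNum_k, the shard count of the level-k subtree.
    The modulus mapping stands in for the patent's Ceil/Floor formula."""
    route = []
    for k, frag_num in enumerate(frag_nums, start=1):
        key = make_hash(k)(word)          # HashKey_k
        route.append(key % frag_num + 1)  # shard indices run from 1
    return route

route = word_route("celebration", frag_nums=[2, 3, 4, 5, 6])
print(len(route))  # 5
```

The resulting list plays the role of WordRoute: walking shard route[0] of level 1, then shard route[1] of level 2, and so on, locates the leaf shard holding the word's column vector.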
Step S103: assemble the column vectors of the segmented words into an input matrix.

Each column of the input matrix corresponds to one column vector: the column vector of the first segmented word forms the first column of the input matrix, the column vector of the second segmented word forms the second column, ..., and the column vector of the N-th segmented word forms the N-th column, where N is the number of segmented words.
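Stacking the column vectors into the input matrix is a one-liner in NumPy; the placeholder vectors here stand in for database lookup results:

```python
import numpy as np

# Column vectors of three segmented words (dimension D = 4), as would
# be returned by the word vector database lookup.
D = 4
word_vecs = [np.arange(D, dtype=float) + n for n in range(3)]

# Each column of the input matrix corresponds to one segmented word:
# column n holds the column vector of the (n+1)-th segmented word.
input_matrix = np.column_stack(word_vecs)
print(input_matrix.shape)  # (4, 3)
```

The matrix therefore has one row per vector dimension and one column per segmented word, matching the description above.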
Step S104: select from the sentence text the segmented word corresponding to the preset analysis object as the sentiment subject of the text sentiment analysis.

For example, the sentence text "Company A's sales performance substantially surpasses Company B" offers two sentiment subjects, "Company A" and "Company B". To analyze the sentiment type of "Company A" in this sentence text, i.e., when the analysis object is "Company A", select "Company A" as the sentiment subject of the text sentiment analysis; to analyze the sentiment type of "Company B" in this sentence text, i.e., when the analysis object is "Company B", select "Company B" as the sentiment subject of the text sentiment analysis.
Step S105: input the input matrix and the input vector into the preset text sentiment analysis neural network model to obtain the sentiment type of the sentiment subject in the sentence text.

The input vector is the column vector of the sentiment subject. The data processing procedure of the text sentiment analysis neural network model may include the steps shown in Fig. 3.

Step S1051: calculate the coupling vector between the input matrix and the input vector.

Specifically, the coupling vector between the input matrix and the input vector can be calculated according to the following formula:

CoupVec = (CoupFactor_1, CoupFactor_2, ..., CoupFactor_n, ..., CoupFactor_N)^T

where 1 ≤ n ≤ N, N is the number of columns of the input matrix, T denotes transposition, WordVec_n is the n-th column of the input matrix, MainVec is the input vector, WeightMatrix and WeightMatrix' are preset weight matrices from which each coupling factor CoupFactor_n is computed from WordVec_n and MainVec, and CoupVec is the coupling vector.

Step S1052: calculate the composite vector of the sentence text.

Specifically, the composite vector of the sentence text can be calculated according to the following formula:

CompVec = WordMatrix * CoupVec

where CompVec is the composite vector, WordMatrix is the input matrix, and WordMatrix = (WordVec_1, WordVec_2, ..., WordVec_n, ..., WordVec_N).

Step S1053: calculate the probability value of each sentiment type.

Specifically, the probability value of each sentiment type is calculated from the composite vector, where 1 ≤ m ≤ M, M is the number of sentiment types, WeightMatrix_m is the preset weight matrix corresponding to the m-th sentiment type, and Prob_m is the probability value of the m-th sentiment type.

The specific sentiment type classification can be set according to the actual situation: for example, two classes (positive sentiment type and negative sentiment type), three classes (positive, negative, and neutral sentiment types), or more.

Step S1054: determine the sentiment type with the largest probability value as the sentiment type of the sentiment subject in the sentence text.
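The forward pass of steps S1051 to S1054 can be sketched end to end. Since the exact formulas for CoupFactor_n and Prob_m are not reproduced in this text, the sketch assumes a common bilinear-score-plus-softmax form for both, with a single WeightMatrix; it is a stand-in for the patent's model, not a definitive implementation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def analyze(word_matrix, main_vec, weight, class_weights):
    """Forward pass of steps S1051-S1054 under the assumptions above."""
    # S1051: one coupling factor per column (per segmented word),
    # scored against the sentiment subject's vector and normalized.
    scores = np.array([col @ weight @ main_vec for col in word_matrix.T])
    coup_vec = softmax(scores)            # CoupVec, length N
    # S1052: composite vector CompVec = WordMatrix * CoupVec.
    comp_vec = word_matrix @ coup_vec
    # S1053: one probability per sentiment type from per-class weights.
    probs = softmax(np.array([w @ comp_vec for w in class_weights]))
    # S1054: the type with the largest probability value wins.
    return int(np.argmax(probs)), probs

rng = np.random.default_rng(1)
D, N, M = 4, 3, 2                      # vector size, words, sentiment types
word_matrix = rng.standard_normal((D, N))
main_vec = rng.standard_normal(D)      # column vector of the sentiment subject
weight = rng.standard_normal((D, D))   # stand-in for WeightMatrix
class_weights = rng.standard_normal((M, D))  # stand-in for WeightMatrix_m
label, probs = analyze(word_matrix, main_vec, weight, class_weights)
```

Because main_vec enters the coupling scores, changing the sentiment subject changes CoupVec, CompVec, and hence the predicted sentiment type, which is exactly the mechanism the method relies on.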
Preferably, the training process of the text sentiment analysis neural network model may include the steps shown in Fig. 4.

Step S401: select a preset number of training samples.

Each sample includes an input matrix, an input vector, and an expected output sentiment type. Preferably, training samples can be selected in pairs in the form of training sample pairs, each pair including two training samples. The two samples of a pair share the same input matrix, formed from the column vectors of the segmented words of the same sentence text; their input vectors differ, being the column vectors of two different sentiment subjects of that sentence text; and their expected output sentiment types differ, one being the positive sentiment type and the other the negative sentiment type.
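A training sample pair as described above can be represented with a small data structure; the 0/1 coding of the expected sentiment types is an assumption for illustration:

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class TrainingSample:
    input_matrix: np.ndarray   # column vectors of all segmented words
    input_vector: np.ndarray   # column vector of one sentiment subject
    expected_type: int         # 0 = positive, 1 = negative (assumed coding)

def make_pair(input_matrix, subject_a_vec, subject_b_vec):
    """Build a training sample pair: same sentence text (same input
    matrix), two different sentiment subjects, and opposite expected
    output sentiment types."""
    return (TrainingSample(input_matrix, subject_a_vec, expected_type=0),
            TrainingSample(input_matrix, subject_b_vec, expected_type=1))

D, N = 4, 6
matrix = np.zeros((D, N))
pos, neg = make_pair(matrix, np.ones(D), -np.ones(D))
```

Pairing samples this way forces the model to attribute the sentiment difference entirely to the input vector, i.e., to the choice of sentiment subject.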
Step S402: input each training sample into the text sentiment analysis neural network model for processing.

The specific processing procedure is similar to step S105; refer to the description of step S105, which is not repeated here.

Step S403: calculate the global error of the current training round.

Specifically, the global error of the current round is calculated from the per-sample training errors, where CalcProb_{l,m} is the probability value of the m-th sentiment type for the l-th training sample, ExpProb_{l,m} is the expected probability value of the m-th sentiment type for the l-th training sample, ExpSeq is the index of the expected output sentiment type of the l-th training sample, 1 ≤ l ≤ L, L is the number of training samples, 1 ≤ m ≤ M, M is the number of sentiment types, ln is the natural logarithm function, LOSS_l is the training error of the l-th training sample, and LOSS is the global error.

Step S404: judge whether the global error is less than a preset error threshold.

If the global error is greater than or equal to the error threshold, execute step S405; if the global error is less than the error threshold, execute step S406.

Step S405: adjust the parameters of the text sentiment analysis neural network model.

The adjusted parameters may include WeightMatrix, WeightMatrix', WeightMatrix_m, and so on. After the parameter adjustment is completed, return to step S402, until the global error is less than the error threshold.
Step S406: end the training.

When the global error is less than the error threshold, the text sentiment analysis neural network model has reached the expected analysis precision; the training process can then be ended, and the model is used for actual text sentiment analysis.
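The training loop of steps S402 to S406 can be sketched as follows. The natural-logarithm cross-entropy used for LOSS_l is an assumed stand-in for the patent's error formula (which likewise uses ln and the expected probabilities), and the model and parameter-adjustment rule are toy placeholders:

```python
import numpy as np

def loss_l(calc_probs, exp_index):
    """Per-sample training error LOSS_l (assumed cross-entropy form)."""
    return -np.log(calc_probs[exp_index])

def train(samples, forward, adjust, error_threshold=0.05, max_rounds=1000):
    """S402-S406: process every sample, sum the per-sample errors into
    the global error LOSS, then either adjust parameters (S405) and
    repeat, or end training (S406) once LOSS is below the threshold."""
    loss = float("inf")
    for _ in range(max_rounds):
        loss = sum(loss_l(forward(s), exp) for s, exp in samples)
        if loss < error_threshold:
            return loss   # S406: expected precision reached
        adjust(loss)      # S405: adjust WeightMatrix and related parameters
    return loss

# Toy run: a "model" whose confidence improves with every adjustment.
state = {"p": 0.6}
samples = [(None, 0)]
forward = lambda s: np.array([state["p"], 1 - state["p"]])
adjust = lambda loss: state.update(p=min(0.999, state["p"] + 0.1))
final_loss = train(samples, forward, adjust)
```

The loop structure (compute global error, compare against the threshold, adjust and repeat) is the part that mirrors Fig. 4; a real implementation would replace the toy adjust step with gradient-based updates of the weight matrices.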
In conclusion the embodiment of the present invention carries out cutting word processing to statement text to be analyzed first, obtain described in composition
Then each participle of statement text searches the column vector of each participle respectively in preset term vector database, and
By the Column vector groups of each participle at input matrix, then one and preset analysis object are chosen from the statement text
It is corresponding to segment the emotion main body analyzed as text emotion, finally the input matrix and input vector are input to preset
Text emotion is analyzed in neural network model, and affective style of the emotion main body in the statement text is obtained.With it is existing
Technology is compared, in the embodiment of the present invention other than considering whole statement text, also by the column vector of emotion main body as one
A individual input, by the processing of neural network model, what is obtained is feelings of the emotion main body in the statement text
Type is felt, also i.e. by a decision condition for being selected as the final affective style of influence for emotion main body, in this way, to packet
When statement text containing multiple emotion main bodys carries out sentiment analysis, by the selection to different emotion main bodys, can obtain with
Corresponding affective style, admirably reflect the mixed feeling of multiple emotion main bodys.
It should be understood that the size of the serial number of each step is not meant that the order of the execution order in above-described embodiment, each process
Execution sequence should be determined by its function and internal logic, the implementation process without coping with the embodiment of the present invention constitutes any limit
It is fixed.
Corresponding to the text sentiment analysis method described in the foregoing embodiments, Fig. 5 shows a structural diagram of an embodiment of a text sentiment analysis apparatus provided by an embodiment of the present invention. In this embodiment, the text sentiment analysis apparatus may include:

a text segmentation module 501, configured to perform word segmentation on the sentence text to be analyzed to obtain the segmented words constituting the sentence text;

a column vector lookup module 502, configured to look up the column vector of each segmented word in the preset word vector database, the word vector database being a database recording the correspondence between words and column vectors;

an input matrix composition module 503, configured to assemble the column vectors of the segmented words into an input matrix, where each column of the input matrix corresponds to one column vector;

a sentiment subject selection module 504, configured to select from the sentence text the segmented word corresponding to the preset analysis object as the sentiment subject of the text sentiment analysis;

a text sentiment analysis module 505, configured to input the input matrix and the input vector into the preset text sentiment analysis neural network model to obtain the sentiment type of the sentiment subject in the sentence text, the input vector being the column vector of the sentiment subject.
Further, the text sentiment analysis module may include:

a coupling vector calculation unit, configured to calculate the coupling vector between the input matrix and the input vector according to the following formula: CoupVec = (CoupFactor_1, CoupFactor_2, ..., CoupFactor_n, ..., CoupFactor_N)^T, where 1 ≤ n ≤ N, N is the number of columns of the input matrix, T denotes transposition, WordVec_n is the n-th column of the input matrix, MainVec is the input vector, WeightMatrix and WeightMatrix' are preset weight matrices, and CoupVec is the coupling vector;

a composite vector calculation unit, configured to calculate the composite vector of the sentence text according to the following formula: CompVec = WordMatrix * CoupVec, where CompVec is the composite vector, WordMatrix is the input matrix, and WordMatrix = (WordVec_1, WordVec_2, ..., WordVec_n, ..., WordVec_N);

a sentiment type probability calculation unit, configured to calculate the probability value of each sentiment type, where 1 ≤ m ≤ M, M is the number of sentiment types, WeightMatrix_m is the preset weight matrix corresponding to the m-th sentiment type, and Prob_m is the probability value of the m-th sentiment type;

a sentiment type determination unit, configured to determine the sentiment type with the largest probability value as the sentiment type of the sentiment subject in the sentence text.
Further, the text sentiment analysis apparatus may also include:

a training sample selection module, configured to select a preset number of training samples, each sample including an input matrix, an input vector, and an expected output sentiment type;

a global error calculation module, configured to input each training sample into the text sentiment analysis neural network model for processing and to calculate the global error of the current training round, where CalcProb_{l,m} is the probability value of the m-th sentiment type for the l-th training sample, ExpProb_{l,m} is the expected probability value of the m-th sentiment type for the l-th training sample, ExpSeq is the index of the expected output sentiment type of the l-th training sample, 1 ≤ l ≤ L, L is the number of training samples, 1 ≤ m ≤ M, M is the number of sentiment types, ln is the natural logarithm function, LOSS_l is the training error of the l-th training sample, and LOSS is the global error;

a parameter adjustment module, configured to adjust the parameters of the text sentiment analysis neural network model if the global error is greater than or equal to the preset error threshold;

a training ending module, configured to end the training if the global error is less than the error threshold.

Further, the training sample selection module may include:

a first selection unit, configured to select training samples in pairs in the form of training sample pairs, each pair including two training samples, where the two samples of a pair share the same input matrix, formed from the column vectors of the segmented words of the same sentence text; their input vectors differ, being the column vectors of two different sentiment subjects of that sentence text; and their expected output sentiment types differ, one being the positive sentiment type and the other the negative sentiment type.
Further, the column vector lookup module may include:

a hash operation unit, configured to hash the current segmented word with K mutually independent hash functions according to the formula HashKey_k = HASH_k(BasicWord), the current segmented word being any one of the segmented words, where BasicWord is the current segmented word, HASH_k is the hash function with index k, HashKey_k is the computed hash value with index k, 1 ≤ k ≤ K, and K is an integer greater than 1;

a storage shard index calculation unit, configured to calculate the index of the k-th level storage shard to which the current segmented word belongs, where MaxHashKey_k is the maximum value of the hash function HASH_k, FragNum_k is the number of storage shards of the k-th level subtree, Ceil is the round-up function, Floor is the round-down function, WordRoute is the array recording the storage path, and WordRoute[k-1], the k-th element of WordRoute, is the index of the k-th level shard to which the current segmented word belongs;

a column vector lookup unit, configured to look up the column vector of the current segmented word under the storage path recorded in the array WordRoute.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the apparatus, modules, and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here. In the above embodiments, each embodiment is described with its own emphasis; for parts not detailed in one embodiment, refer to the related descriptions of the other embodiments.
Fig. 6 shows a schematic block diagram of a text sentiment analysis terminal device provided by an embodiment of the present invention. For convenience of explanation, only the parts related to the embodiment of the present invention are shown.
In this embodiment, the text sentiment analysis terminal device 6 may be a computing device such as a mobile phone, a tablet computer, a desktop computer, a notebook computer, a palmtop computer or a cloud server. The text sentiment analysis terminal device 6 may include a processor 60, a memory 61, and computer-readable instructions 62 that are stored in the memory 61 and executable on the processor 60, for example computer-readable instructions for executing the above text sentiment analysis method. When executing the computer-readable instructions 62, the processor 60 implements the steps in the above embodiments of the text sentiment analysis method, such as steps S101 to S105 shown in Fig. 1. Alternatively, when executing the computer-readable instructions 62, the processor 60 implements the functions of the modules/units in the above device embodiments, such as the functions of modules 501 to 505 shown in Fig. 5.
Illustratively, the computer-readable instructions 62 may be divided into one or more modules/units, and the one or more modules/units are stored in the memory 61 and executed by the processor 60 to complete the present invention. The one or more modules/units may be a series of computer-readable instruction segments capable of completing specific functions, and the instruction segments are used to describe the execution process of the computer-readable instructions 62 in the text sentiment analysis terminal device 6.
The processor 60 may be a central processing unit (Central Processing Unit, CPU), or may be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 61 may be an internal storage unit of the text sentiment analysis terminal device 6, such as a hard disk or memory of the text sentiment analysis terminal device 6. The memory 61 may also be an external storage device of the text sentiment analysis terminal device 6, such as a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card or a flash card (Flash Card) equipped on the text sentiment analysis terminal device 6. Further, the memory 61 may include both the internal storage unit and the external storage device of the text sentiment analysis terminal device 6. The memory 61 is configured to store the computer-readable instructions and other instructions and data required by the text sentiment analysis terminal device 6. The memory 61 may also be used to temporarily store data that has been output or is to be output.
The functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware, or may be implemented in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and is sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention essentially, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several computer-readable instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to execute all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing computer-readable instructions, such as a USB flash disk, a removable hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk or an optical disc.
The above embodiments are only used to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions recorded in the foregoing embodiments, or make equivalent replacements of some of the technical features therein; and these modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (10)
1. A text sentiment analysis method, characterized by comprising:
performing word segmentation on a statement text to be analyzed to obtain the participles constituting the statement text;
looking up the column vector of each participle in a preset word vector database, and composing the column vectors of the participles into an input matrix, wherein each column of the input matrix corresponds to one column vector, and the word vector database is a database recording the correspondence between words and column vectors;
selecting, from the statement text, a participle corresponding to a preset analysis object as the emotion subject of the text sentiment analysis; and
inputting the input matrix and an input vector into a preset text sentiment analysis neural network model to obtain the sentiment type of the emotion subject in the statement text, the input vector being the column vector of the emotion subject.
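The input-preparation steps of claim 1 can be sketched as follows. The toy word-vector database, the 4-dimensional vectors, and the example sentence are illustrative assumptions; a real database would record a learned column vector for every word.

```python
import numpy as np

# Toy word-vector database (assumption: 4-dimensional column vectors).
WORD_VECTORS = {
    "service": np.array([0.1, 0.3, -0.2, 0.5]),
    "was":     np.array([0.0, 0.1,  0.0, 0.1]),
    "great":   np.array([0.7, -0.1, 0.4, 0.2]),
}

def build_inputs(participles, analysis_object):
    """Prepare the model inputs of claim 1: look up each participle's column
    vector, stack the vectors column-wise into the input matrix, and take the
    column vector of the participle matching the preset analysis object as
    the input vector (the emotion subject)."""
    input_matrix = np.stack([WORD_VECTORS[w] for w in participles], axis=1)
    input_vector = WORD_VECTORS[analysis_object]
    return input_matrix, input_vector

matrix, vector = build_inputs(["service", "was", "great"], "service")
```

Each column of `matrix` is one participle's vector, and `vector` is the emotion subject's column; both are then fed to the neural network model.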
2. The text sentiment analysis method according to claim 1, characterized in that the data processing process of the text sentiment analysis neural network model comprises:
calculating a coupling vector between the input matrix and the input vector according to the following formula:
CoupVec = (CoupFactor_1, CoupFactor_2, ..., CoupFactor_n, ..., CoupFactor_N)^T,
where 1 ≤ n ≤ N, N is the number of columns of the input matrix, T is the transposition symbol, WordVec_n is the n-th column of the input matrix, MainVec is the input vector, WeightMatrix and WeightMatrix' are preset weight matrices, and CoupVec is the coupling vector;
calculating a composite vector of the statement text according to the following formula:
CompVec = WordMatrix * CoupVec,
where CompVec is the composite vector, WordMatrix is the input matrix, and WordMatrix = (WordVec_1, WordVec_2, ..., WordVec_n, ..., WordVec_N);
calculating the probability value of each sentiment type according to the following formula:
where 1 ≤ m ≤ M, M is the number of sentiment types, WeightMatrix_m is the preset weight matrix corresponding to the m-th sentiment type, and Prob_m is the probability value of the m-th sentiment type; and
determining the sentiment type with the largest probability value as the sentiment type of the emotion subject in the statement text.
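A forward pass in the spirit of claim 2 can be sketched as follows. The exact CoupFactor_n and Prob_m formulas are not reproduced in this text, so the attention-style score tanh(MainVec · WeightMatrix · WordVec_n), the softmax normalization of both the coupling factors and the class probabilities, and the use of weight vectors rather than full matrices per class are all assumptions.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def sentiment_forward(word_matrix, main_vec, weight_matrix, class_weights):
    """Couple each column WordVec_n of the input matrix with the input
    vector MainVec, combine the columns into the composite vector
    CompVec = WordMatrix * CoupVec, and score each sentiment type."""
    n_cols = word_matrix.shape[1]
    scores = np.array([np.tanh(main_vec @ weight_matrix @ word_matrix[:, n])
                       for n in range(n_cols)])
    coup_vec = softmax(scores)         # CoupVec: one coupling factor per column
    comp_vec = word_matrix @ coup_vec  # CompVec = WordMatrix * CoupVec
    logits = np.array([w @ comp_vec for w in class_weights])
    probs = softmax(logits)            # Prob_m for each sentiment type
    return int(np.argmax(probs)), probs
```

The returned index is the sentiment type with the largest probability value, i.e. the sentiment type of the emotion subject in the statement text.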
3. The text sentiment analysis method according to claim 1, characterized in that the training process of the text sentiment analysis neural network model comprises:
selecting a preset number of training samples, each training sample comprising an input matrix, an input vector and an expected output sentiment type;
inputting each training sample into the text sentiment analysis neural network model for processing, and calculating the global error of the current round of training according to the following formula:
where CalcProb_{l,m} is the probability value of the m-th sentiment type in the l-th training sample, ExpProb_{l,m} is the expected probability value of the m-th sentiment type in the l-th training sample, ExpSeq is the serial number of the expected output sentiment type of the l-th training sample, 1 ≤ l ≤ L, L is the number of training samples, 1 ≤ m ≤ M, M is the number of sentiment types, ln is the natural logarithm function, LOSS_l is the training error of the l-th training sample, and LOSS is the global error;
if the global error is greater than or equal to a preset error threshold, adjusting the parameters of the text sentiment analysis neural network model, and returning to the step of inputting each training sample into the text sentiment analysis neural network model for processing, until the global error is less than the error threshold; and
if the global error is less than the error threshold, ending the training.
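The global-error computation of claim 3 can be sketched as cross-entropy. The exact formula image is not reproduced in this text; the reduction below relies on ExpProb_{l,m} being 1 when m equals ExpSeq and 0 otherwise, and summing the per-sample errors over the L samples is an assumption.

```python
import math

def global_error(calc_probs, expected_seqs):
    """Global error LOSS of one training round. With one-hot expected
    probabilities, each per-sample error reduces to
    LOSS_l = -ln(CalcProb_{l, ExpSeq_l}), summed over the L samples."""
    return sum(-math.log(probs[exp_seq])
               for probs, exp_seq in zip(calc_probs, expected_seqs))
```

Training would then repeat forward passes and parameter adjustments while `global_error(...)` remains at or above the preset error threshold, and stop once it falls below.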
4. The text sentiment analysis method according to claim 3, characterized in that the selecting a preset number of training samples comprises:
selecting training samples in pairs in the form of training sample pairs, each training sample pair comprising two training samples, wherein the input matrices of the two training samples of a same training sample pair are identical, each being a matrix composed of the column vectors of the participles of a same statement text; the input vectors of the two training samples of the same training sample pair are different, being the column vectors of two different emotion subjects of the same statement text; and the expected output sentiment types of the two training samples of the same training sample pair are different, one being a positive sentiment type and the other being a negative sentiment type.
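The pairing scheme of claim 4 can be sketched as a small constructor. The serial numbers 0 and 1 for the positive and negative sentiment types are an illustrative assumption.

```python
def make_sample_pair(input_matrix, subject_vec_a, subject_vec_b):
    """Build one training-sample pair: both samples share the input matrix
    of the same statement text but carry the column vectors of two different
    emotion subjects, with opposite expected output sentiment types."""
    POSITIVE, NEGATIVE = 0, 1  # sentiment-type serial numbers (assumption)
    return [
        {"matrix": input_matrix, "vector": subject_vec_a, "expected": POSITIVE},
        {"matrix": input_matrix, "vector": subject_vec_b, "expected": NEGATIVE},
    ]
```

Pairing samples this way forces the model to distinguish sentiment by emotion subject rather than by sentence alone, since the two samples differ only in the input vector.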
5. The text sentiment analysis method according to any one of claims 1 to 4, characterized in that the word vector database is a K-level tree-shaped fragment storage structure, and the looking up the column vector of each participle in the preset word vector database comprises:
performing a hash operation on a current participle using K mutually independent hash functions according to the following formula, the current participle being any one of the participles:
HashKey_k = HASH_k(BasicWord)
where BasicWord is the current participle, HASH_k is the hash function with serial number k, HashKey_k is the hash value with serial number k obtained by the operation, 1 ≤ k ≤ K, and K is an integer greater than 1;
calculating the serial number of the k-th level storage fragment to which the current participle belongs according to the following formula:
where MaxHashKey_k is the maximum possible value of the hash function HASH_k, FragNum_k is the number of storage fragments of the k-th level subtree, Ceil is the round-up function, Floor is the round-down function, WordRoute is an array recording the storage path, and WordRoute[k-1] is the serial number of the k-th level fragment to which the current participle belongs and is the k-th element of WordRoute; and
looking up the column vector of the current participle under the storage path recorded by the array WordRoute.
6. A computer-readable storage medium storing computer-readable instructions, characterized in that, when the computer-readable instructions are executed by a processor, the steps of the text sentiment analysis method according to any one of claims 1 to 5 are implemented.
7. A text sentiment analysis terminal device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, characterized in that the processor implements the following steps when executing the computer-readable instructions:
performing word segmentation on a statement text to be analyzed to obtain the participles constituting the statement text;
looking up the column vector of each participle in a preset word vector database, and composing the column vectors of the participles into an input matrix, wherein each column of the input matrix corresponds to one column vector, and the word vector database is a database recording the correspondence between words and column vectors;
selecting, from the statement text, a participle corresponding to a preset analysis object as the emotion subject of the text sentiment analysis; and
inputting the input matrix and an input vector into a preset text sentiment analysis neural network model to obtain the sentiment type of the emotion subject in the statement text, the input vector being the column vector of the emotion subject.
8. The text sentiment analysis terminal device according to claim 7, characterized in that the data processing process of the text sentiment analysis neural network model comprises:
calculating a coupling vector between the input matrix and the input vector according to the following formula:
CoupVec = (CoupFactor_1, CoupFactor_2, ..., CoupFactor_n, ..., CoupFactor_N)^T,
where 1 ≤ n ≤ N, N is the number of columns of the input matrix, T is the transposition symbol, WordVec_n is the n-th column of the input matrix, MainVec is the input vector, WeightMatrix and WeightMatrix' are preset weight matrices, and CoupVec is the coupling vector;
calculating a composite vector of the statement text according to the following formula:
CompVec = WordMatrix * CoupVec,
where CompVec is the composite vector, WordMatrix is the input matrix, and WordMatrix = (WordVec_1, WordVec_2, ..., WordVec_n, ..., WordVec_N);
calculating the probability value of each sentiment type according to the following formula:
where 1 ≤ m ≤ M, M is the number of sentiment types, WeightMatrix_m is the preset weight matrix corresponding to the m-th sentiment type, and Prob_m is the probability value of the m-th sentiment type; and
determining the sentiment type with the largest probability value as the sentiment type of the emotion subject in the statement text.
9. The text sentiment analysis terminal device according to claim 7, characterized in that the training process of the text sentiment analysis neural network model comprises:
selecting a preset number of training samples, each training sample comprising an input matrix, an input vector and an expected output sentiment type;
inputting each training sample into the text sentiment analysis neural network model for processing, and calculating the global error of the current round of training according to the following formula:
where CalcProb_{l,m} is the probability value of the m-th sentiment type in the l-th training sample, ExpProb_{l,m} is the expected probability value of the m-th sentiment type in the l-th training sample, ExpSeq is the serial number of the expected output sentiment type of the l-th training sample, 1 ≤ l ≤ L, L is the number of training samples, 1 ≤ m ≤ M, M is the number of sentiment types, ln is the natural logarithm function, LOSS_l is the training error of the l-th training sample, and LOSS is the global error;
if the global error is greater than or equal to a preset error threshold, adjusting the parameters of the text sentiment analysis neural network model, and returning to the step of inputting each training sample into the text sentiment analysis neural network model for processing, until the global error is less than the error threshold; and
if the global error is less than the error threshold, ending the training.
10. The text sentiment analysis terminal device according to any one of claims 7 to 9, characterized in that the word vector database is a K-level tree-shaped fragment storage structure, and the looking up the column vector of each participle in the preset word vector database comprises:
performing a hash operation on a current participle using K mutually independent hash functions according to the following formula, the current participle being any one of the participles:
HashKey_k = HASH_k(BasicWord)
where BasicWord is the current participle, HASH_k is the hash function with serial number k, HashKey_k is the hash value with serial number k obtained by the operation, 1 ≤ k ≤ K, and K is an integer greater than 1;
calculating the serial number of the k-th level storage fragment to which the current participle belongs according to the following formula:
where MaxHashKey_k is the maximum possible value of the hash function HASH_k, FragNum_k is the number of storage fragments of the k-th level subtree, Ceil is the round-up function, Floor is the round-down function, WordRoute is an array recording the storage path, and WordRoute[k-1] is the serial number of the k-th level fragment to which the current participle belongs and is the k-th element of WordRoute; and
looking up the column vector of the current participle under the storage path recorded by the array WordRoute.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810309676.7A CN108733644B (en) | 2018-04-09 | 2018-04-09 | A kind of text emotion analysis method, computer readable storage medium and terminal device |
PCT/CN2018/093344 WO2019196208A1 (en) | 2018-04-09 | 2018-06-28 | Text sentiment analysis method, readable storage medium, terminal device, and apparatus |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108733644A true CN108733644A (en) | 2018-11-02 |
CN108733644B CN108733644B (en) | 2019-07-19 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||