Detailed Description of the Embodiments
Example embodiments are described in detail here, and examples of them are illustrated in the accompanying drawings. When the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with this specification. Rather, they are merely examples of devices and methods consistent with some aspects of this specification, as detailed in the appended claims.
The terms used in this specification are for the purpose of describing particular embodiments only and are not intended to limit this specification. The singular forms "a", "said", and "the" used in this specification and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, and so on may be used in this specification to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of this specification, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
This specification provides a scheme for analyzing user evaluations. Each word in a user evaluation text is first replaced with the center word of the word category to which that word belongs, yielding a replacement text of the evaluation text. The replacement text of each evaluation text is then classified, the replacement texts under a specified text category are aggregated, and the words in the aggregated result that satisfy a predetermined condition are extracted as the user evaluation analysis result for that text category.
Fig. 1 is a flow diagram of a method for analyzing user evaluations according to an exemplary embodiment of this specification.
Referring to Fig. 1, the method for analyzing user evaluations may include the following steps:
Step 102: convert the words in a user evaluation text into corresponding word vectors.
In this embodiment, the user evaluation text may include evaluation text published by a user about purchased goods or services, and may also include evaluation text published by a user about a software service the user has used; this specification places no particular limitation on this.
In this embodiment, the user evaluation text generally includes one or more sentences.
For example, for a certain piece of clothing, user A publishes the evaluation text "fabric hand feel is bad".
As another example, for a certain instant messaging application, user B publishes the evaluation text "the page layout is good, the communication speed is fast, thumbs up", and so on.
In this embodiment, each user evaluation text may first undergo word segmentation, which divides the evaluation text into one or more words.
For example, open-source word segmentation tools available in the related art, such as ICTCLAS and SCWS, may be used for word segmentation. Alternatively, an independently developed tool may be used to segment the evaluation text; this specification places no particular limitation on this.
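As a minimal sketch of what word segmentation produces, the following uses forward maximum matching against a small dictionary. This is a simple stand-in technique, not the method used by ICTCLAS or SCWS. The function name, the three-word dictionary, and the Chinese sentence (an assumed original of "fabric hand feel is bad") are hypothetical.

```python
def forward_max_match(text, vocabulary, max_len=4):
    """Greedy left-to-right segmentation.

    At each position, take the longest vocabulary entry that matches;
    if nothing matches, fall back to a single character.
    """
    words = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in vocabulary:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical dictionary: fabric, hand feel, bad.
vocabulary = {"面料", "手感", "不好"}
segmented = forward_max_match("面料手感不好", vocabulary)
print(segmented)
```

A production segmenter would also handle ambiguity and out-of-vocabulary words, which this sketch does not attempt.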
In this embodiment, the cw2vec algorithm may be used to convert each word obtained by segmentation into a corresponding word vector. cw2vec is a Chinese word vector algorithm based on the stroke information of Chinese characters, and using it can effectively improve the accuracy of Chinese language processing.
Of course, in other examples, algorithms such as word2vec may also be used to convert the words in the evaluation text into corresponding vectors.
Step 104: cluster the words included in a number of evaluation texts according to the word vectors, to obtain the word category to which each word belongs.
In this embodiment, the evaluation texts published within a period of time may be obtained, and the words they include may then be clustered based on the word vectors of those words, so that each word is assigned to one word category.
As an example, assume that 100 evaluation texts are obtained and that these 100 evaluation texts include 5000 words in total. After these words are clustered using their 5000 word vectors, they may be divided into 800 word categories. In this way, the word category to which each word belongs can be determined in this step.
In this embodiment, the center word of each word category may further be determined.
For example, the distance between each word in the word category and the class center may be calculated, and the word nearest to the class center is determined as the center word of the word category. When several words are equally nearest to the class center, one of them may be selected at random as the center word of the word category.
As another example, if the class center itself is a word, the word represented by the class center may be determined as the center word of the word category.
As yet another example, a word may be selected at random from the word category as the center word, and so on.
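The first option above, choosing the word nearest to the class center, can be sketched as follows. The sketch assumes the class center is the arithmetic mean of the member vectors and that distance is Euclidean; the specification fixes neither. The function names and the two-dimensional vectors are hypothetical.

```python
import math


def centroid(vectors):
    """Arithmetic mean of equally sized vectors (assumed class center)."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def center_word(category):
    """category maps each word to its vector.

    Returns the word nearest to the centroid. On a tie, min() keeps the
    first word in dictionary order; the specification also allows a
    random choice among tied words.
    """
    center = centroid(list(category.values()))
    return min(category, key=lambda word: math.dist(category[word], center))


# Hypothetical word category with toy vectors.
category = {
    "fabric": [1.0, 0.0],
    "material": [1.0, 1.0],
    "cloth": [1.0, 2.0],
}
print(center_word(category))  # material
```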
In this embodiment, the clustering algorithm used to cluster the words may include the K-means algorithm, the GMM (Gaussian Mixture Model) algorithm, and so on; this specification places no particular limitation on this.
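To illustrate the clustering of step 104, the following is a minimal K-means sketch (plain Lloyd iterations) in standard-library Python. The initialization from the first k vectors, the fixed iteration count, and the toy vectors are assumptions made for brevity; a real system would more likely rely on an established clustering library.

```python
import math


def kmeans(vectors, k, iterations=20):
    """Cluster vectors into k groups; returns (labels, centers)."""
    # Assumption: use the first k vectors as initial centers.
    centers = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest center.
        labels = [
            min(range(k), key=lambda c: math.dist(v, centers[c]))
            for v in vectors
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical word vectors.
word_vectors = {
    "fabric": [0.9, 0.1],
    "material": [1.0, 0.0],
    "bad": [0.0, 1.0],
    "poor": [0.1, 0.9],
}
words = list(word_vectors)
labels, centers = kmeans([word_vectors[w] for w in words], k=2)

word_categories = {}
for word, label in zip(words, labels):
    word_categories.setdefault(label, []).append(word)
print(word_categories)
```

In this toy run, "fabric" and "material" end up in one word category and "bad" and "poor" in the other.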
Step 106: replace each word in the evaluation text with the center word of the word category to which the word belongs, to obtain the replacement text of the evaluation text.
In this embodiment, each word in the evaluation text may be replaced with the center word of the word category to which the word belongs, thereby obtaining the replacement text of the evaluation text.
Word   | Center word of the word category
Fabric | Material
Feel   | Feel
Bad    | Poor
Table 1
Assume that a certain evaluation text is "fabric hand feel is bad". Segmenting this evaluation text yields 3 words: "fabric", "feel", and "bad". Table 1 shows the center word of the word category to which each of these 3 words belongs. Based on the example in Table 1, the evaluation text can be replaced with "material feel is poor"; that is, the replacement text of the evaluation text is "material feel is poor".
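The replacement in step 106 amounts to a lookup per word. A minimal sketch using the mapping of Table 1 follows; the function name is hypothetical, and leaving unknown words unchanged is an assumption, since the specification does not cover words without a category.

```python
# Mapping from each word to the center word of its word category (Table 1).
center_word_of = {"fabric": "material", "feel": "feel", "bad": "poor"}


def replace_with_center_words(words, center_word_of):
    """Replace every word with its center word.

    Assumption: a word with no known category is kept as it is.
    """
    return [center_word_of.get(word, word) for word in words]


replacement = replace_with_center_words(["fabric", "feel", "bad"], center_word_of)
print(replacement)  # ['material', 'feel', 'poor']
```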
Step 108: classify the replacement text of each evaluation text using a text classification model.
In this embodiment, for each evaluation text, the word vector set corresponding to the replacement text of the evaluation text may first be determined.
Taking the evaluation text "fabric hand feel is bad" again as an example, the word vector of each word in the replacement text "material feel is poor" may be obtained, which yields the word vector set of the replacement text.
In this embodiment, the word vector set of the replacement text of the evaluation text may be input into the text classification model to obtain the text classification result of the replacement text.
The text classification model may be an LSTM (Long Short-Term Memory) + Softmax model. In other examples, other text classification models may also be used; this specification places no particular limitation on this.
The text classification model may be used to predict whether the input replacement text belongs to a specified text category.
The specified text category may include a complaint category, a praise category, and so on.
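To make the LSTM + Softmax structure concrete, the following sketch runs a forward pass of a single-layer LSTM over the word vectors of one replacement text and applies softmax to the final hidden state. It only illustrates the data flow: the weights are random and untrained, bias terms are omitted, and the class name, dimensions, and category order (index 0 for complaint, index 1 for praise) are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class TinyLSTMClassifier:
    """Untrained single-layer LSTM followed by a softmax output layer."""

    def __init__(self, input_dim, hidden_dim, num_classes, seed=0):
        rng = random.Random(seed)
        # One weight matrix per gate, applied to [x_t ; h_(t-1)].
        self.gates = {
            name: random_matrix(hidden_dim, input_dim + hidden_dim, rng)
            for name in ("input", "forget", "output", "candidate")
        }
        self.classifier = random_matrix(num_classes, hidden_dim, rng)
        self.hidden_dim = hidden_dim

    def predict_proba(self, word_vectors):
        h = [0.0] * self.hidden_dim  # hidden state
        c = [0.0] * self.hidden_dim  # cell state
        for x in word_vectors:
            z = list(x) + h
            i = [sigmoid(v) for v in matvec(self.gates["input"], z)]
            f = [sigmoid(v) for v in matvec(self.gates["forget"], z)]
            o = [sigmoid(v) for v in matvec(self.gates["output"], z)]
            g = [math.tanh(v) for v in matvec(self.gates["candidate"], z)]
            c = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
            h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
        return softmax(matvec(self.classifier, h))


model = TinyLSTMClassifier(input_dim=2, hidden_dim=3, num_classes=2)
replacement_vectors = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]  # toy word vectors
probabilities = model.predict_proba(replacement_vectors)
print(probabilities)
```

With trained weights, the category with the highest probability would be taken as the text classification result of the replacement text.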
It should be noted that, in some cases, no word in the replacement text input into the text classification model has actually been replaced, because the words in the evaluation text are themselves center words.
In this embodiment, compared with the evaluation text, the replacement text uses center words in place of the words of the same category, which can greatly reduce the number of distinct words and improve the accuracy of the text classification result.
Step 110: aggregate the replacement texts under the specified text category, and extract the words in the aggregated result that satisfy a predetermined condition as the user evaluation analysis result for that text category.
In this embodiment, the replacement texts under the specified text category may be gathered together to obtain an aggregated result. The aggregated result includes the words included in each replacement text.
A measurement parameter of each word in the aggregated result under a predetermined dimension may then be calculated, and words are extracted based on the measurement parameter. For example, the words may be sorted by the magnitude of the measurement parameter, and the words ranked in the top several positions are extracted as the user evaluation analysis result for that text category.
The measurement parameter under the predetermined dimension may include word frequency, TF-IDF (term frequency-inverse document frequency), and so on.
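As a rough sketch of TF-IDF as a measurement parameter, the following computes the term frequency within the aggregated result of one text category and the inverse document frequency over the replacement texts of all categories. The specification does not state which corpus the document frequency is taken over, so that choice, the function name, and the sample texts are all assumptions.

```python
import math
from collections import Counter


def tf_idf(category_texts, all_texts):
    """Score each word in category_texts.

    Assumption: category_texts is a subset of all_texts, so every scored
    word appears in at least one text of all_texts.
    """
    counts = Counter(word for text in category_texts for word in text)
    total = sum(counts.values())
    scores = {}
    for word, count in counts.items():
        document_frequency = sum(1 for text in all_texts if word in text)
        idf = math.log(len(all_texts) / document_frequency)
        scores[word] = (count / total) * idf
    return scores


# Hypothetical replacement texts, already segmented.
complaints = [["material", "very", "poor"], ["material", "very", "slow"]]
praise = [
    ["sound", "very", "good"],
    ["taste", "very", "good"],
    ["service", "very", "good"],
]
scores = tf_idf(complaints, complaints + praise)
for word, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(word, round(score, 3))
```

Under these assumptions, "very" scores zero because it occurs in every text, while "material" ranks first among the complaint words.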
Replacement text   | Words included
Replacement text 1 | Word A, word B, word C, word D, word E
Replacement text 2 | Word A, word C, word W
Replacement text 3 | Word A, word Y, word Z
Table 2
Taking word frequency as an example, assume that there are 3 replacement texts under the text category and that the words included in each replacement text are as shown in Table 2. After replacement text 1 through replacement text 3 are aggregated, the word frequency list of the aggregated result shown in Table 3 can be obtained.
Word                                           | Word frequency
Word A                                         | 3
Word C                                         | 2
Word B, word D, word E, word W, word Y, word Z | 1
Table 3
In this embodiment, the words whose word frequency ranks in the top N may be extracted as the analysis result of user evaluations under the specified text category. The value of N may be preset, for example to 3 or 5.
Continuing with the example of Table 3, assume that the value of N is 2. Word A and word C can then be extracted as the analysis result of user evaluations under the specified text category.
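The word frequency example of Tables 2 and 3 can be reproduced with a short sketch; the function name is hypothetical.

```python
from collections import Counter

# Words included in each replacement text (Table 2).
replacement_texts = [
    ["word A", "word B", "word C", "word D", "word E"],
    ["word A", "word C", "word W"],
    ["word A", "word Y", "word Z"],
]


def top_n_words(texts, n):
    """Aggregate the texts, count word frequency, and return the top n words."""
    frequency = Counter(word for text in texts for word in text)
    return [word for word, _ in frequency.most_common(n)]


print(top_n_words(replacement_texts, 2))  # ['word A', 'word C']
```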
Because this embodiment extracts the evaluation analysis result from the aggregated result of replacement texts, similar analysis results are deduplicated, which improves the accuracy of the evaluation analysis result.
In other examples, when the specified text category has a positive or negative tendency, the nouns in the aggregated result that satisfy the predetermined condition may be extracted as the user evaluation analysis result for the specified text category. The tendency of each noun in the analysis result can be determined from the tendency of the specified text category, and the conclusion of the user evaluations can then be derived.
For example, the complaint category has a negative tendency. For the replacement texts under the complaint category, the nouns in the aggregated result that satisfy the predetermined condition, such as color difference, material, and logistics, may be extracted as the user evaluation analysis result. Based on the complaint category, it can be learned that the reasons for user complaints are the color difference of the product, the material of the product, and logistics.
As another example, the praise category has a positive tendency. For the replacement texts under the praise category, the nouns in the aggregated result that satisfy the predetermined condition, such as sound quality and taste, may be extracted as the user evaluation analysis result. Based on the praise category, it can be learned that the reasons for user satisfaction are sound quality and taste.
Extracting the evaluation analysis result from the nouns of the aggregated result effectively reduces the amount of computation and improves the efficiency of user evaluation analysis without reducing the accuracy of the analysis result.
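A minimal sketch of the noun-only extraction follows. It assumes that a part-of-speech tag is already available for each word, for example from the word segmentation tool; the tag dictionary, function name, and sample complaint texts are hypothetical.

```python
from collections import Counter

# Hypothetical part-of-speech tags for center words.
pos_tag_of = {
    "material": "noun",
    "color difference": "noun",
    "logistics": "noun",
    "poor": "adjective",
    "large": "adjective",
    "slow": "adjective",
}


def top_nouns(texts, pos_tag_of, n):
    """Count only nouns in the aggregated texts and return the top n."""
    nouns = Counter(
        word
        for text in texts
        for word in text
        if pos_tag_of.get(word) == "noun"
    )
    return [word for word, _ in nouns.most_common(n)]


complaint_texts = [
    ["material", "poor"],
    ["color difference", "large"],
    ["logistics", "slow"],
    ["material", "poor"],
]
reasons = top_nouns(complaint_texts, pos_tag_of, 3)
print(reasons)
```

Because adjectives such as "poor" are never counted, the counter stays smaller than one built over all words, which is the computation saving described above.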
In practical applications, the replacement texts under the specified text category may be aggregated per evaluation object, so that the analysis result of user evaluations under the corresponding text category is summed up for each evaluation object, for reference by that evaluation object.
The evaluation object is usually the object at which the evaluation text is directed. It may include the goods or services purchased by the user, the shop providing the goods or services, the software used by the user, the developer of the software, and so on.
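Aggregation per evaluation object can be sketched as grouping before counting. The record layout, a pair of evaluation object and segmented replacement text, and the sample data are assumptions.

```python
from collections import Counter, defaultdict


def analyze_by_object(records, n):
    """records: (evaluation_object, words) pairs from one text category.

    Returns the top n words by frequency for each evaluation object.
    """
    counts_by_object = defaultdict(Counter)
    for evaluation_object, words in records:
        counts_by_object[evaluation_object].update(words)
    return {
        evaluation_object: [word for word, _ in counts.most_common(n)]
        for evaluation_object, counts in counts_by_object.items()
    }


# Hypothetical complaint-category replacement texts for two commodities.
records = [
    ("skirt", ["material", "poor"]),
    ("skirt", ["material", "thin"]),
    ("phone case", ["logistics", "slow"]),
    ("phone case", ["logistics", "damaged"]),
]
result = analyze_by_object(records, 1)
print(result)  # {'skirt': ['material'], 'phone case': ['logistics']}
```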
It is worth noting that user evaluation texts usually include some words with no substantive meaning, such as " " and " ". When the technical solution described in this specification is used to analyze user evaluations, these meaningless words may be filtered out.
For example, word filtering may be performed after the evaluation text is segmented into words in step 102 above.
As another example, word filtering may be performed when the replacement texts are aggregated in step 110 above; this specification places no particular limitation on this.
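Word filtering of this kind is commonly done with a stop-word list. The following sketch assumes such a list; the English entries and the function name are hypothetical placeholders.

```python
# Hypothetical stop-word list; a real list would be much longer.
STOP_WORDS = {"the", "is", "a"}


def remove_stop_words(words, stop_words=STOP_WORDS):
    """Drop words that carry no substantive meaning."""
    return [word for word in words if word not in stop_words]


filtered = remove_stop_words(["the", "fabric", "is", "bad"])
print(filtered)  # ['fabric', 'bad']
```

The same function can be applied right after segmentation in step 102 or to each replacement text during aggregation in step 110.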
As can be seen from the above description, this specification may first replace each word in a user evaluation text with the center word of the word category to which the word belongs, obtaining the replacement text of the evaluation text. A text classification model is then used to classify the replacement text of each evaluation text, the replacement texts under the specified text category are aggregated, and the words in the aggregated result that satisfy the predetermined condition are extracted as the user evaluation analysis result for that text category. User evaluations are thereby analyzed, which helps the evaluation object discover the strengths and weaknesses of its products and services.
The implementation process of this specification is described below with reference to a specific application scenario.
After a user shops on an e-commerce platform, the user may evaluate the purchased goods or services. The e-commerce platform may periodically obtain the user evaluations published within the period; for example, on the 1st of each month it may obtain all user evaluations from the previous month. After the user evaluations are obtained, the words in each evaluation may be converted into corresponding word vectors, and all words included in the user evaluations may be clustered based on the word vectors to obtain the word category to which each word belongs.
After clustering, each word in the obtained user evaluations may be replaced with the center word of its category to obtain the replacement text of each user evaluation.
Then, a trained text classification model may be used to classify the replacement text of each evaluation text and identify the complaint texts. For each commodity, all complaint texts about that commodity may then be aggregated, and the nouns whose word frequency or TF-IDF ranks in the top several positions in the aggregated result are extracted as the reasons users complain about the commodity. These reasons may be sent to the merchant selling the commodity.
As an example, assume that for a skirt sold on the e-commerce platform, the user complaint texts include "the clothing material is too poor", "the material is not good", "fabric hand feel is bad", and so on. With the technical solution described in this specification, the reason for the user complaints can be identified as "material", and this reason can then be fed back to the merchant.
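The skirt scenario can be traced end to end with a small sketch. It assumes that the three complaint texts have already been segmented and classified as complaints, and that clustering has produced the center-word mapping shown; the segmentation, the mapping, the noun set, and the function name are all hypothetical.

```python
from collections import Counter

# Assumed segmentation of the three complaint texts about the skirt.
complaint_texts = [
    ["clothing material", "too", "poor"],
    ["material", "not good"],
    ["fabric", "feel", "bad"],
]

# Assumed result of clustering: each word mapped to its center word.
center_word_of = {
    "clothing material": "material",
    "material": "material",
    "fabric": "material",
    "feel": "feel",
    "too": "too",
    "poor": "poor",
    "not good": "poor",
    "bad": "poor",
}

# Assumed set of center words tagged as nouns.
nouns = {"material", "feel"}


def complaint_reasons(texts, center_word_of, nouns, n):
    """Replace words with center words, keep nouns, return the top n."""
    replaced = [[center_word_of.get(w, w) for w in text] for text in texts]
    counts = Counter(w for text in replaced for w in text if w in nouns)
    return [word for word, _ in counts.most_common(n)]


print(complaint_reasons(complaint_texts, center_word_of, nouns, 1))  # ['material']
```

Without the replacement step, "clothing material", "material", and "fabric" would each be counted once and no single reason would stand out.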
Corresponding to the foregoing embodiments of the method for analyzing user evaluations, this specification further provides embodiments of a device for analyzing user evaluations.
The embodiments of the user evaluation analysis device of this specification may be applied to a server. The device embodiments may be implemented by software, or by hardware or a combination of software and hardware. Taking software implementation as an example, the device in a logical sense is formed by the processor of the server where it is located reading the corresponding computer program instructions from non-volatile memory into memory and running them. At the hardware level, Fig. 2 shows a hardware structure diagram of the server where the user evaluation analysis device of this specification is located. In addition to the processor, memory, network interface, and non-volatile memory shown in Fig. 2, the server where the device is located may also include other hardware according to the actual functions of the server, which is not described further here.
Fig. 3 is a block diagram of a device for analyzing user evaluations according to an exemplary embodiment of this specification.
Referring to Fig. 3, the user evaluation analysis device 200 may be applied to the server shown in Fig. 2 and includes: a vector conversion unit 201, a word clustering unit 202, a text replacement unit 203, a text classification unit 204, and an evaluation analysis unit 205.
The vector conversion unit 201 converts the words in a user evaluation text into corresponding word vectors;
the word clustering unit 202 clusters the words included in a number of evaluation texts according to the word vectors, to obtain the word category to which each word belongs;
the text replacement unit 203 replaces each word in the evaluation text with the center word of the word category to which the word belongs, to obtain the replacement text of the evaluation text;
the text classification unit 204 classifies the replacement text of each evaluation text using a text classification model;
the evaluation analysis unit 205 aggregates the replacement texts under a specified text category, and extracts the words in the aggregated result that satisfy a predetermined condition as the user evaluation analysis result for that text category.
Optionally, the vector conversion unit 201 is configured to:
perform word segmentation on the user evaluation text to divide the evaluation text into one or more words;
convert the words into corresponding word vectors using the cw2vec algorithm.
Optionally, the process of determining the center word includes:
taking the word nearest to the class center in each word category as the center word of that word category.
Optionally, the evaluation analysis unit 205 is configured to:
extract the nouns in the aggregated result that satisfy the predetermined condition as the user evaluation analysis result for the corresponding text category.
Optionally, the evaluation analysis unit 205 is configured to:
calculate a measurement parameter of each word in the aggregated result under a predetermined dimension;
sort the words in descending order of the measurement parameter;
extract the words ranked in the top N as the user evaluation analysis result for the corresponding text category.
Optionally, the measurement parameter under the predetermined dimension includes one or more of the following:
word frequency, TF-IDF.
Optionally, the specified text category includes a complaint category.
For the implementation process of the functions and roles of each unit in the above device, refer to the implementation process of the corresponding steps in the above method; details are not repeated here.
Since the device embodiments basically correspond to the method embodiments, refer to the description of the method embodiments for the relevant parts. The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this specification. Those of ordinary skill in the art can understand and implement it without creative effort.
The systems, devices, modules, or units described in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular phone, camera phone, smartphone, personal digital assistant, media player, navigation device, e-mail sending and receiving device, game console, tablet computer, wearable device, or a combination of any of these devices.
Corresponding to the foregoing embodiments of the method for analyzing user evaluations, this specification further provides a user evaluation analysis apparatus, which includes a processor and a memory for storing machine-executable instructions. The processor and the memory are usually connected to each other through an internal bus. In other possible implementations, the apparatus may also include an external interface so that it can communicate with other devices or components.
In this embodiment, by reading and executing the machine-executable instructions stored in the memory that correspond to the user evaluation analysis logic, the processor is caused to:
convert the words in a user evaluation text into corresponding word vectors;
cluster the words included in a number of evaluation texts according to the word vectors, to obtain the word category to which each word belongs;
replace each word in the evaluation text with the center word of the word category to which the word belongs, to obtain the replacement text of the evaluation text;
classify the replacement text of each evaluation text using a text classification model;
aggregate the replacement texts under a specified text category, and extract the words in the aggregated result that satisfy a predetermined condition as the user evaluation analysis result for that text category.
Optionally, when converting the words in the user evaluation text into corresponding word vectors, the processor is caused to:
perform word segmentation on the user evaluation text to divide the evaluation text into one or more words;
convert the words into corresponding word vectors using the cw2vec algorithm.
Optionally, when determining the center word, the processor is caused to:
take the word nearest to the class center in each word category as the center word of that word category.
Optionally, when extracting the words in the aggregated result that satisfy the predetermined condition as the user evaluation analysis result for the corresponding text category, the processor is caused to:
extract the nouns in the aggregated result that satisfy the predetermined condition as the user evaluation analysis result for the corresponding text category.
Optionally, when extracting the words in the aggregated result that satisfy the predetermined condition as the user evaluation analysis result for the corresponding text category, the processor is caused to:
calculate a measurement parameter of each word in the aggregated result under a predetermined dimension;
sort the words in descending order of the measurement parameter;
extract the words ranked in the top N as the user evaluation analysis result for the corresponding text category.
Optionally, the measurement parameter under the predetermined dimension includes one or more of the following:
word frequency, TF-IDF.
Optionally, the specified text category includes a complaint category.
Corresponding to the foregoing embodiments of the method for analyzing user evaluations, this specification further provides a computer-readable storage medium on which a computer program is stored. When the program is executed by a processor, the following steps are implemented:
converting the words in a user evaluation text into corresponding word vectors;
clustering the words included in a number of evaluation texts according to the word vectors, to obtain the word category to which each word belongs;
replacing each word in the evaluation text with the center word of the word category to which the word belongs, to obtain the replacement text of the evaluation text;
classifying the replacement text of each evaluation text using a text classification model;
aggregating the replacement texts under a specified text category, and extracting the words in the aggregated result that satisfy a predetermined condition as the user evaluation analysis result for that text category.
Optionally, converting the words in the user evaluation text into corresponding word vectors includes:
performing word segmentation on the user evaluation text to divide the evaluation text into one or more words;
converting the words into corresponding word vectors using the cw2vec algorithm.
Optionally, the process of determining the center word includes:
taking the word nearest to the class center in each word category as the center word of that word category.
Optionally, extracting the words in the aggregated result that satisfy the predetermined condition as the user evaluation analysis result for the corresponding text category includes:
extracting the nouns in the aggregated result that satisfy the predetermined condition as the user evaluation analysis result for the corresponding text category.
Optionally, extracting the words in the aggregated result that satisfy the predetermined condition as the user evaluation analysis result for the corresponding text category includes:
calculating a measurement parameter of each word in the aggregated result under a predetermined dimension;
sorting the words in descending order of the measurement parameter;
extracting the words ranked in the top N as the user evaluation analysis result for the corresponding text category.
Optionally, the measurement parameter under the predetermined dimension includes one or more of the following:
word frequency, TF-IDF.
Optionally, the specified text category includes a complaint category.
Specific embodiments of this specification have been described above. Other embodiments fall within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some implementations, multitasking and parallel processing are also possible or may be advantageous.
The foregoing describes only preferred embodiments of this specification and is not intended to limit this specification. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of this specification shall fall within the scope of protection of this specification.