CN109145115A - Product public sentiment discovery method, apparatus, computer device and storage medium - Google Patents
- Publication number
- CN109145115A CN201811005075.3A
- Authority
- CN
- China
- Prior art keywords
- public sentiment
- classification
- information record
- vector
- data
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
Abstract
This application relates to a product public sentiment discovery method, apparatus, computer device and storage medium. The method extracts the text data of each information record from a preset information source; converts the text data into a data vector; classifies the information record according to the data vector to obtain the public sentiment category to which the information record belongs; and, when the information records in a public sentiment category within a preset time period satisfy a quantity condition, determines the discovery result corresponding to that public sentiment category. Because information records are classified by a data vector of the entire text data rather than by keywords of the text data, the loss of semantic information is avoided and the classification becomes more accurate, which in turn improves the accuracy of product public sentiment discovery.
Description
Technical field
This application relates to the field of data mining technology, and in particular to a product public sentiment discovery method, apparatus, computer device and storage medium.
Background
With the continued development of the Internet, daily life is increasingly shaped by it, and reading news, shopping and communicating online are more and more common. When a product in use breaks down, end users spread and discuss the problem on the network almost immediately. Monitoring public sentiment about a specific product is therefore increasingly important: through product public sentiment monitoring, the product provider can discover unexpected incidents early, take reasonable action, and keep the public sentiment from escalating. For example, when the product is a game, players who encounter a system failure may log in to the corresponding forum or official website and post related comments.
Traditional product public sentiment discovery methods classify public sentiment based on keywords. Classifying public sentiment by keywords alone, however, often has low accuracy. Traditional product public sentiment discovery methods therefore suffer from low accuracy.
Summary of the invention
In view of the above technical problem, it is necessary to provide a product public sentiment discovery method, apparatus, computer device and storage medium that can improve accuracy.
A product public sentiment discovery method, the method comprising:
extracting the text data of each information record from a preset information source;
converting the text data into a data vector;
classifying the information record according to the data vector to obtain the public sentiment category to which the information record belongs;
when the information records in the public sentiment category within a preset time period satisfy a quantity condition, determining the discovery result corresponding to the public sentiment category.
A product public sentiment discovery apparatus, the apparatus comprising:
a text extraction module, for extracting the text data of each information record from a preset information source;
a vector conversion module, for converting the text data into a data vector;
a public sentiment classification module, for classifying the information record according to the data vector to obtain the public sentiment category to which the information record belongs;
a public sentiment discovery module, for determining the discovery result corresponding to the public sentiment category when the information records in the public sentiment category within a preset time period satisfy a quantity condition.
A computer device, comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the following steps when executing the computer program:
extracting the text data of each information record from a preset information source;
converting the text data into a data vector;
classifying the information record according to the data vector to obtain the public sentiment category to which the information record belongs;
when the information records in the public sentiment category within a preset time period satisfy a quantity condition, determining the discovery result corresponding to the public sentiment category.
A computer-readable storage medium storing a computer program, wherein the following steps are implemented when the computer program is executed by a processor:
extracting the text data of each information record from a preset information source;
converting the text data into a data vector;
classifying the information record according to the data vector to obtain the public sentiment category to which the information record belongs;
when the information records in the public sentiment category within a preset time period satisfy a quantity condition, determining the discovery result corresponding to the public sentiment category.
The above product public sentiment discovery method, apparatus, computer device and storage medium extract the text data of each information record from a preset information source; convert the text data into a data vector; classify the information record according to the data vector to obtain the public sentiment category to which the information record belongs; and, when the information records in the public sentiment category within a preset time period satisfy a quantity condition, determine the discovery result corresponding to the public sentiment category. Because information records are classified by a data vector of the entire text data rather than by keywords of the text data, the loss of semantic information is avoided and the classification becomes more accurate, which improves the accuracy of product public sentiment discovery.
Brief description of the drawings
Fig. 1 is a diagram of the application environment of the product public sentiment discovery method in one embodiment;
Fig. 2 is a flow diagram of the product public sentiment discovery method in one embodiment;
Fig. 3 is a flow diagram of the product public sentiment discovery method in a specific embodiment;
Fig. 4 is an example of an alarm notification of the product public sentiment discovery method in a specific embodiment;
Fig. 5 is a public sentiment quantity trend chart of the product public sentiment discovery method in a specific embodiment;
Fig. 6 is an example of a public sentiment description of the product public sentiment discovery method in a specific embodiment;
Fig. 7 is another example of an alarm notification of the product public sentiment discovery method in a specific embodiment;
Fig. 8 is a detail page of a public sentiment description of the product public sentiment discovery method in a specific embodiment;
Fig. 9 is a diagram comparing the training process of the word embedding model, the training process of the neural network classification model, and the product public sentiment classification process of the product public sentiment discovery method in a specific embodiment;
Fig. 10 is a structural block diagram of the product public sentiment discovery apparatus of one embodiment;
Fig. 11 is an internal structure diagram of the computer device in one embodiment.
Detailed description
To make the objectives, technical solutions and advantages of this application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the application and do not limit it.
The product public sentiment discovery method provided by this application can be used for public sentiment monitoring of a product, for example login public sentiment, recharge public sentiment, lag public sentiment and system failure public sentiment for a game product, and can support the decisions of the product provider. The method can be applied in the application environment shown in Fig. 1, in which a terminal 102 communicates with a server 104 over a network. The product public sentiment discovery method of the embodiments of this application may run on the terminal 102. The server 104 corresponding to the preset information source can send information records to the terminal 102 over the network. The terminal 102 receives each information record of the preset information source and extracts its text data; converts the text data into a data vector; classifies the information record according to the data vector to obtain the public sentiment category to which the information record belongs; and, when the information records in the public sentiment category within a preset time period satisfy a quantity condition, determines the discovery result corresponding to the public sentiment category. The terminal 102 can be, but is not limited to, any of various servers, personal computers, laptops, smartphones, tablet computers and portable wearable devices. The server can be implemented as an independent server or as a server cluster composed of multiple servers.
In one embodiment, as shown in Fig. 2, a product public sentiment discovery method is provided. The method can run on the terminal 102 in Fig. 1 and comprises the following steps:
S202: extract the text data of each information record from the preset information source.
The preset information source can be an information source such as the official forum corresponding to the product, WeChat circles, or Baidu Tieba. An information record can take the form of an article, a post, a comment, a reply to a comment, and so on. The text data may include the text content that the information record presents in text form, that is, the text content recorded in the information record; it may also include text content converted from data presented in other forms (such as emoticons or pictures).
S204: convert the text data into a data vector.
The data vector is a vector that carries semantics. The semantics expressed by the data vector are the same as the semantics of the text data.
S206: classify the information record according to the data vector to obtain the public sentiment category to which the information record belongs.
Once the text data has been converted into a data vector that carries semantics, the data vector can be classified according to its semantics. This classifies the information record corresponding to the data vector and yields the public sentiment category to which the information record belongs. Public sentiment categories may include product price public sentiment, product quality public sentiment, and so on. For a game product, for example, the public sentiment categories may include login public sentiment, recharge public sentiment, lag public sentiment and system failure public sentiment.
After the public sentiment category of an information record is obtained, the category of that information record may first be stored in a database.
S208: when the information records in the public sentiment category within a preset time period satisfy a quantity condition, determine the discovery result corresponding to the public sentiment category.
The preset time period can be the 10 minutes, 20 minutes, 30 minutes, 1 hour, 2 hours, 1 day, 1 week, etc. before the current time point. The current time point can be the current time obtained in real time, and its time unit can be accurate to 1 minute, to 1 second, or to 1 hour. The information records in the public sentiment category within the preset time period can be the information records that were newly added during that period and belong to the public sentiment category; they can also be the information records that belong to the public sentiment category during that period.
The quantity condition can be a condition on the quantity of information records in the public sentiment category within the preset time period; it can also be a condition on the newly added quantity of information records in the public sentiment category within the preset time period. The discovery result can be the public sentiment corresponding to the public sentiment category whose information records within the preset time period satisfy the quantity condition.
The product public sentiment discovery method of this embodiment extracts the text data of each information record from the preset information source; converts the text data into a data vector; classifies the information record according to the data vector to obtain the public sentiment category to which the information record belongs; and, when the information records in the public sentiment category within the preset time period satisfy the quantity condition, determines the discovery result corresponding to the public sentiment category. Because information records are classified by a data vector of the entire text data rather than by keywords of the text data, the loss of semantic information is avoided and the classification becomes more accurate, which improves the accuracy of product public sentiment discovery.
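The four steps S202 to S208 can be sketched as a small pipeline. This is only an illustrative sketch, not the patent's implementation: the function name `discover_categories`, the pluggable `vectorize` and `classify` callables, and the assumption that the input texts have already been restricted to the preset time period are all hypothetical.

```python
from collections import Counter
from typing import Callable, Iterable, List, Sequence

Vector = Sequence[float]


def discover_categories(
    texts: Iterable[str],
    vectorize: Callable[[str], Vector],
    classify: Callable[[Vector], str],
    threshold: int,
) -> List[str]:
    """Return the categories whose record count reaches the threshold.

    `texts` is assumed to hold only the text data of records that fall
    inside the preset time period (S202). Each text is converted to a
    data vector (S204), classified (S206), and counted per category;
    a category is reported when its count satisfies the quantity
    condition (S208), here taken to be count >= threshold.
    """
    counts = Counter(classify(vectorize(text)) for text in texts)
    return sorted(cat for cat, n in counts.items() if n >= threshold)
```

Any vectorizer and classifier with these signatures can be plugged in, which keeps the quantity check independent of how the data vector is produced.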
In one embodiment, converting the text data into a data vector comprises: preprocessing the text data to obtain a text to be segmented; performing word segmentation on the text to be segmented to obtain the semantic words of the text data; and converting each semantic word into a word vector to obtain the data vector of the text data.
Preprocessing may include deleting the parts of the text that carry no actual semantics, such as punctuation marks, web addresses and digits. After the text data is preprocessed, the resulting text to be segmented consists of the parts of the text that carry actual semantics. Performing word segmentation on this text yields its semantic words, which are the semantic words of the text data. It can be understood that the text data corresponding to one information record has at least one semantic word. Converting each semantic word into a word vector that carries its semantics yields the data vector of the text data. The text data, and therefore the information record, can then be classified conveniently in the form of a data vector.
The product public sentiment discovery method of this embodiment preprocesses the text data to obtain a text to be segmented; performs word segmentation on the text to be segmented to obtain the semantic words of the text data; and converts each semantic word into a word vector to obtain the data vector of the text data. Information records are thus classified by a data vector of the entire text data rather than by keywords of the text data, so the loss of semantic information is avoided and the classification becomes more accurate, which improves the accuracy of product public sentiment discovery.
In one embodiment, converting each semantic word into a word vector to obtain the data vector of the text data comprises: converting each semantic word into a word vector based on a word embedding model to obtain the data vector of the text data.
The word embedding model converts semantic words in text form into word vectors in vector form. The word embedding model can be a fastText model (a fast text classifier model), doc2vec (a document vector model), a GloVe model (Global Vectors for Word Representation), and so on.
Converting each semantic word into a word vector with a word embedding model yields a more accurate data vector of the text data, which further improves the accuracy of product public sentiment discovery.
In one embodiment, preprocessing includes at least one of deleting punctuation marks, deleting web addresses and deleting digits. This removes unnecessary text content that carries no actual semantics, which improves the accuracy of word segmentation and therefore of product public sentiment discovery, while also saving system resources.
Further, preprocessing can also include deleting stop words. This saves further resources at the cost of a small drop in the accuracy of product public sentiment discovery. It can be understood that in embodiments where preprocessing does not include deleting stop words, the accuracy of product public sentiment discovery is higher.
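A minimal sketch of such preprocessing is given below. The regular expressions, the function name `preprocess` and the whitespace-based stop word filtering are illustrative assumptions; Chinese text would need a real word segmenter before stop words can be matched.

```python
import re
from typing import FrozenSet

# Web addresses are removed first, because deleting punctuation first
# would break a URL into fragments that no longer match.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
DIGIT_RE = re.compile(r"\d+")
# Anything that is neither a word character nor whitespace, plus "_".
PUNCT_RE = re.compile(r"[^\w\s]|_")


def preprocess(text: str, stop_words: FrozenSet[str] = frozenset()) -> str:
    """Delete web addresses, digits, punctuation and optional stop words."""
    text = URL_RE.sub(" ", text)
    text = DIGIT_RE.sub(" ", text)
    text = PUNCT_RE.sub(" ", text)
    # Stop word deletion is optional; an empty set keeps every token.
    tokens = [t for t in text.split() if t.lower() not in stop_words]
    return " ".join(tokens)
```

Passing an empty `stop_words` set corresponds to the embodiment that keeps stop words and trades some resources for accuracy.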
In one specific embodiment, the data vector can be a vector with no fewer than a preset number of dimensions; for example, the preset number of dimensions can be 300. The data vector then contains more semantic information, so the classification results are more accurate, which further improves the accuracy of product public sentiment discovery.
In one embodiment, classifying the information record according to the data vector to obtain the public sentiment category to which the information record belongs comprises:
classifying the information record according to the data vector based on a neural network classification model to obtain the public sentiment category to which the information record belongs.
The neural network classification model classifies the input data vector, that is, it classifies the information record corresponding to the data vector, and yields the public sentiment category to which the information record belongs. Because classifying the input data vector with a neural network classification model gives more accurate classification results, the accuracy of product public sentiment discovery can be further improved.
It can be understood that in other embodiments the neural network classification model may be replaced by another classifier that classifies the information record according to the data vector. The information records are still classified by a data vector of the entire text data rather than by keywords of the text data, so the loss of semantic information is avoided and the classification becomes more accurate, which improves the accuracy of product public sentiment discovery.
Further, classifying the information record according to the data vector based on the neural network classification model to obtain the public sentiment category to which the information record belongs comprises: inputting the data vector into the neural network classification model through its input layer; weighting the data vector through the hidden layer of the neural network classification model to obtain a classification result; and outputting the classification result through the output layer of the neural network classification model, the classification result corresponding to a public sentiment category.
The neural network classification model includes an input layer, an output layer, and a hidden layer between the input layer and the output layer. The data vector is input into the model through the input layer; the hidden layer weights the data vector to obtain the classification result; and the output layer outputs the classification result, which can be the serial number of a public sentiment category. This gives the public sentiment category to which the information record belongs. Because classifying the input data vector with a neural network classification model gives more accurate classification results, the accuracy of product public sentiment discovery can be further improved.
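The forward pass through input layer, hidden layer and output layer can be sketched as follows. The ReLU activation, the softmax output and all function names are assumptions made for illustration; the text only states that the hidden layer weights the data vector and that the output is the serial number of a public sentiment category.

```python
import math
from typing import List

Matrix = List[List[float]]


def dense(x: List[float], weights: Matrix, bias: List[float]) -> List[float]:
    """Weighted sum per output node: one weight row and one bias per node."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(v: List[float]) -> List[float]:
    return [max(0.0, a) for a in v]


def softmax(v: List[float]) -> List[float]:
    peak = max(v)  # subtract the maximum for numerical stability
    exps = [math.exp(a - peak) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


def classify(x: List[float], w_hidden: Matrix, b_hidden: List[float],
             w_out: Matrix, b_out: List[float]) -> int:
    """Return the serial number (index) of the most probable category."""
    hidden = relu(dense(x, w_hidden, b_hidden))   # hidden layer weighting
    probs = softmax(dense(hidden, w_out, b_out))  # output layer
    return max(range(len(probs)), key=probs.__getitem__)
```

With the layer sizes mentioned in the specific embodiment, `w_hidden` would have 100 rows of 300 weights and `w_out` 7 rows of 100 weights.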
In one embodiment, the weights of the hidden layer in the neural network classification model are determined from the target classification results and the training classification results obtained by training on training samples. A training sample includes a target classification result and a training data vector; the training data vector has the same data structure as the data vector, and the training data vector corresponds to the target classification result.
The target classification result is the classification result expected when the training data vector of the training sample is input into the neural network classification model. The training classification result is the classification result actually obtained when the training data vector is input into the neural network model being trained. Within one training sample, the training data vector and the target classification result correspond to each other: one training data vector corresponds to one target classification result. Because the training data vector is also input into the neural network model, its data structure is the same as that of the data vector.
Further, determining the weights of the hidden layer in the neural network classification model from the target classification results and the training classification results obtained by training on training samples may include: determining the weights of the hidden layer when the loss function value obtained from the target classification results and the training classification results reaches a preset optimization condition. The preset optimization condition can be that the loss function value reaches a preset value, or that the change in the loss function value within a preset time period is less than a preset value. This yields the optimal neural network classification model and thereby determines the neural network model.
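The two stopping conditions can be sketched independently of any particular training framework. In the sketch below, `step` is a hypothetical callable that performs one weight update and returns the current loss function value, and the "preset time period" is approximated by a number of training steps (`patience`); both are assumptions.

```python
from typing import Callable, List, Tuple


def train_until_converged(
    step: Callable[[], float],
    loss_target: float,
    min_delta: float,
    patience: int,
    max_steps: int,
) -> Tuple[int, str]:
    """Run `step` until one preset optimization condition is met.

    Stops when the loss reaches `loss_target`, or when the loss has
    changed by less than `min_delta` over the last `patience` steps.
    Returns the number of steps taken and the reason for stopping.
    """
    history: List[float] = []
    for i in range(max_steps):
        loss = step()
        history.append(loss)
        if loss <= loss_target:
            return i + 1, "target"
        if len(history) > patience and abs(history[-patience - 1] - loss) < min_delta:
            return i + 1, "plateau"
    return max_steps, "max_steps"
```

The weights held by the model when the function returns are the ones that would be kept as the hidden layer weights.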
In one specific embodiment, the neural network classification model is a multilayer neural network classification model. A multilayer neural network model can be a classification model whose network structure is composed of multiple perceptrons. For example, the multilayer neural network classification model is a three-layer neural network classification model; as a further example, the numbers of nodes in the layers can be 300, 100 and 7 respectively. This further improves the accuracy of classification and therefore the accuracy of product public sentiment classification.
In one embodiment, the public sentiment categories include login public sentiment, recharge public sentiment, lag public sentiment and system failure public sentiment. Login public sentiment refers to information records about account login; recharge public sentiment refers to information records about account recharging; lag public sentiment refers to information records about system smoothness during product use; and system failure public sentiment refers to information records about system failures during product use. This makes the product public sentiment discovery method particularly suitable for public sentiment monitoring of network products, such as game products. It can surface service problems for the product's project team so that the team can handle them early and reduce the impact of failures on users, which improves user stickiness.
In one embodiment, determining the discovery result corresponding to the public sentiment category when the information records in the public sentiment category within the preset time period satisfy the quantity condition comprises: determining the discovery result corresponding to the public sentiment category when the information records in the public sentiment category within the preset time period are greater than or equal to a preset threshold.
In this embodiment, if the information records in the public sentiment category within the preset time period are the information records that belong to the public sentiment category during that period, the quantity condition is that the newly added quantity of information records in the public sentiment category within the preset time period is greater than or equal to the preset threshold. If the information records in the public sentiment category within the preset time period are the information records that were newly added during that period and belong to the public sentiment category, the quantity condition is that the quantity of information records in the public sentiment category within the preset time period is greater than or equal to the preset threshold. The discovery result can be the public sentiment corresponding to the public sentiment category when the newly added quantity of information records belonging to that category within the preset time period is greater than or equal to the preset threshold.
With the product public sentiment discovery method of this embodiment, the discovery result corresponding to the public sentiment category is determined when the newly added quantity of information records in the public sentiment category within the preset time period is greater than or equal to the preset threshold. This avoids interference from historical information records and further improves the accuracy of product public sentiment discovery.
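Counting only the records that were added inside the preset time period can be sketched as below. The record layout, a `(timestamp, category)` tuple, and the function names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterable, Tuple

Record = Tuple[datetime, str]  # (time the record was added, category)


def count_new_records(records: Iterable[Record], category: str,
                      now: datetime, window: timedelta) -> int:
    """Count records of `category` added in the interval (now - window, now]."""
    start = now - window
    return sum(1 for ts, cat in records
               if cat == category and start < ts <= now)


def is_discovered(records: Iterable[Record], category: str, now: datetime,
                  window: timedelta, threshold: int) -> bool:
    """Quantity condition: newly added quantity >= preset threshold."""
    return count_new_records(records, category, now, window) >= threshold
```

Older records fall outside the interval and so cannot trigger a discovery result on their own.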
In one embodiment, the process of determining the preset threshold comprises: for each time point in a first historical time period, determining the quantity of information records of the public sentiment category in the second historical time period of that time point, to obtain a first historical public sentiment quantity; obtaining the second historical public sentiment quantity of the time point from the maximum of the first historical public sentiment quantity within a third historical time period; and determining the preset threshold from the average and the standard deviation of the second historical public sentiment quantities of the time points in the first historical time period.
The first historical time period is longer than both the second historical time period and the third historical time period. The third historical time period is longer than the second historical time period.
In one specific embodiment, as shown in Fig. 3, the first historical time period can be the 7 days before the current time point. The second historical time period may be the k minutes before the time point, where k can be 10, 20 or 30. Optionally, the second historical time period may include at least one such period. A discovery result can then be determined within about 10 minutes at the earliest. The third historical time period may cover the two hours before and the two hours after the time point. Let $X_i$ denote the first historical public sentiment quantity at time point $i$, and let $M_i$ denote the maximum of the first historical public sentiment quantity within the third historical time period, that is, the second historical public sentiment quantity, with time points spaced 1 minute apart. The second historical public sentiment quantity is then
$M_i = \max(X_{i-120}, X_{i-119}, \ldots, X_{i+119}, X_{i+120})$
where $X_{i-120}$ and $X_{i-119}$ denote the first historical public sentiment quantity 120 minutes and 119 minutes before time point $i$, and $X_{i+119}$ and $X_{i+120}$ denote the first historical public sentiment quantity 119 minutes and 120 minutes after time point $i$. The average of the second historical public sentiment quantities of the time points in the first historical time period can be written as $\mathrm{avg}(M_i)$, and their standard deviation as $\mathrm{std}(M_i)$.
Still referring to Fig. 3, in one specific embodiment, determining the preset threshold from the average and the standard deviation of the second historical public sentiment quantities of the time points in the first historical time period comprises: determining the preset threshold from a preset scaling ratio, a preset fixed scale value, and the average and standard deviation of the second historical public sentiment quantities of the time points in the first historical time period. For example, the preset threshold can be denoted $B_i$ and determined by the formula:
$B_i = m \cdot \mathrm{avg}(M_i) + 3 \cdot \mathrm{std}(M_i) + n$
where $m$ denotes the preset scaling ratio and $n$ denotes the preset fixed scale value. The preset scaling ratio and the preset fixed scale value can be determined empirically; for example, they can be adjusted according to the magnitude of the second historical public sentiment quantity.
The product public sentiment discovery method of this embodiment first determines, for each time point in the first historical time period, the quantity of information records of the public sentiment category in the second historical time period of that time point, to obtain the first historical public sentiment quantity; then obtains the second historical public sentiment quantity of the time point from the maximum of the first historical public sentiment quantity within the third historical time period; and finally determines the preset threshold from the average and the standard deviation of the second historical public sentiment quantities of the time points in the first historical time period. Determining the preset threshold in this way gives a more reasonable threshold and can therefore improve the accuracy of product public sentiment discovery.
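The threshold computation can be sketched in a few lines. Two details are not specified in the text and are assumptions here: the window for the maximum is clipped at the ends of the series, and the population standard deviation (`statistics.pstdev`) is used. The function names are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def rolling_max(x: Sequence[float], half_width: int) -> List[float]:
    """M_i: maximum of X over [i - half_width, i + half_width], clipped."""
    return [max(x[max(0, i - half_width): i + half_width + 1])
            for i in range(len(x))]


def preset_threshold(x: Sequence[float], m: float, n: float,
                     half_width: int = 120) -> float:
    """B = m * avg(M) + 3 * std(M) + n.

    `x` holds the first historical public sentiment quantity per minute
    over the first historical time period; `m` is the preset scaling
    ratio and `n` the preset fixed scale value.
    """
    peaks = rolling_max(x, half_width)
    return m * mean(peaks) + 3 * pstdev(peaks) + n
```

Taking the rolling maximum before averaging makes the threshold track recent peaks rather than typical minute-level counts, so ordinary fluctuations are less likely to exceed it.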
In one embodiment, after determining the discovery result corresponding to the public sentiment category when the number of information records in the public sentiment category within the preset time period is greater than or equal to the preset threshold, the method further includes: issuing an alarm notification according to the discovery result.
The alarm notification can be issued as a pop-up window or as information displayed in a different font. It can also be sent as an instant message or SMS, or through a WeChat official account, and it can take the form of a sound or light alarm. For example, an alarm notification in one example can read: "12:00, product XX: 8 abnormal public sentiment records of the disconnection category in the last 20 minutes, exceeding the threshold of 7". In another example, as shown in Fig. 4, the alarm notification is issued as a WeChat message whose content includes the time of occurrence, the affected business (i.e., the product), the impact, and possible causes. This makes it easy for the product provider to discover problems with the product as early as possible and handle them, reducing their impact on users.
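As a rough illustration of the threshold check and the alarm text, the sketch below compares a category count with the preset threshold and formats a message. The `Discovery` class, the function names, and the message wording are hypothetical; the actual delivery channel (pop-up, SMS, WeChat) is outside the scope of this example.

```python
from dataclasses import dataclass


@dataclass
class Discovery:
    """Hypothetical container for a discovery result."""
    product: str
    category: str
    count: int
    threshold: float
    window_minutes: int


def discover(product, category, count, threshold, window_minutes):
    """Return a Discovery when count >= threshold, otherwise None."""
    if count >= threshold:
        return Discovery(product, category, count, threshold, window_minutes)
    return None


def format_alarm(time_label, d):
    """Build a one-line alarm text from a discovery result."""
    return (f"{time_label} {d.product}: {d.count} abnormal '{d.category}' "
            f"public sentiment records in the last {d.window_minutes} minutes "
            f"(threshold {d.threshold:g})")


result = discover("XX", "disconnection", count=8, threshold=7, window_minutes=20)
message = format_alarm("12:00", result) if result else None
```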
In one embodiment, after determining the discovery result corresponding to the public sentiment category when the information records in the public sentiment category within the preset time period satisfy the quantity condition, the method further includes: displaying the statistical result of the public sentiment category.
The statistical result includes a public sentiment quantity trend chart for each public sentiment category, which shows, for different time points, the number or growth rate of information records belonging to that category. The statistical result can also include the details of the information records in each category. Providing the statistical result makes it convenient for the product provider to review it and use it as a basis for decisions.
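The data behind such a trend chart can be sketched as a per-category count over fixed time buckets. The record format (minute of day, category), the bucket size, and the function name are assumptions made for the example; plotting is omitted.

```python
from collections import Counter, defaultdict


def trend_counts(records, bucket_minutes=10):
    """Count records per category and per time bucket.

    records: iterable of (minute_of_day, category) pairs.
    Returns {category: {bucket_start_minute: count}} with buckets sorted.
    """
    trend = defaultdict(Counter)
    for minute, category in records:
        bucket = (minute // bucket_minutes) * bucket_minutes
        trend[category][bucket] += 1
    return {cat: dict(sorted(buckets.items())) for cat, buckets in trend.items()}


# Hypothetical classified records.
records = [(3, "login"), (7, "login"), (12, "login"), (15, "recharge")]
trend = trend_counts(records, bucket_minutes=10)
```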
In one specific embodiment, the public sentiment quantity trend chart can be as shown in Fig. 5, where the abscissa represents time and the ordinate represents the public sentiment quantity. The chart shows, along the time axis, the statistical result of the total abnormal public sentiment over all public sentiment categories.
In one specific embodiment, the details of the information records in the statistical result of a public sentiment category of a game product can be as shown in Fig. 6 and include: product name, problem type, game account, mobile phone model, login mode, mobile phone system, system type, time of the abnormality, problem description, related screenshots, and similar information.
In one specific embodiment, the alarm notification can be issued through a WeChat official account. The alarm notification page can be as shown in Fig. 7 and includes positioning content and field data. The positioning content includes the time of occurrence, product name, affected business, discovery result, and impact. The field data includes statistical data of the public sentiment category, such as a public sentiment analysis trend chart. Further, the alarm notification page can link to the public sentiment description details page shown in Fig. 8, which shows the statistical result of each information record in the discovery result under that public sentiment category; the statistical result can include the record time and the description of the public sentiment.
In one specific embodiment, the product public sentiment discovery method includes: extracting the text data of each information record from a preset information source; preprocessing the text data to obtain text to be segmented, the preprocessing including deleting punctuation marks, deleting URLs, and deleting numbers; performing word segmentation on the text to be segmented to obtain the semantic words of the text data; converting each semantic word into a word vector based on a word embedding model to obtain the data vector of the text data; classifying the information record according to the data vector based on a neural network classification model to obtain the public sentiment category to which the information record belongs; determining the discovery result corresponding to the public sentiment category when the number of information records in the public sentiment category within a preset time period is greater than or equal to a preset threshold; and issuing an alarm notification according to the discovery result.
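The preprocessing and word segmentation steps can be sketched as below. The regular expressions, the punctuation set, and the function names are assumptions; in particular, whitespace splitting is only a stand-in for a real word segmenter, which Chinese text would require.

```python
import re
import string

# URLs are removed first, because their punctuation and digits would
# otherwise be stripped piecemeal and leave fragments behind.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
DIGIT_RE = re.compile(r"\d+")
# ASCII punctuation plus a few common full-width marks (illustrative set).
PUNCT = string.punctuation + "，。！？；：、“”‘’（）《》【】…"
PUNCT_RE = re.compile("[" + re.escape(PUNCT) + "]")


def preprocess(text):
    """Delete URLs, numbers, and punctuation, then collapse whitespace."""
    text = URL_RE.sub(" ", text)
    text = DIGIT_RE.sub(" ", text)
    text = PUNCT_RE.sub(" ", text)
    return " ".join(text.split())


def segment(text):
    """Placeholder word segmentation: split on whitespace."""
    return text.split()


cleaned = preprocess("Login failed!!! See http://example.com/a?b=1, error 500.")
words = segment(cleaned)
```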
The word embedding model can be a fastText model. The training process of the word embedding model, the training process of the neural network classification model, and the product public sentiment classification process are compared in Fig. 9.
In the training process of the word embedding model, the object processed can be original text, i.e., text content in textual form. The original text is preprocessed to obtain text to be segmented, which is then segmented to obtain semantic words. Sample pairs, each consisting of a semantic word and its corresponding word vector, are used as training samples to train the word embedding model.
In the training process of the neural network classification model, the object processed can be training data whose data structure corresponds to that of the text data of the information records. After preprocessing and word segmentation, the training data is converted into vectors by the trained word embedding model to obtain training data vectors. Sample pairs, each consisting of a training data vector and a target classification result, are used as training samples to train the neural network classification model.
In the product public sentiment classification process, the text data of a new information record is the object processed. After preprocessing, word segmentation, and vector conversion, a data vector is obtained and input into the trained neural network classification model, which performs classification prediction and outputs the classification result, i.e., the public sentiment category to which the information record belongs.
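The classification step can be illustrated with a deliberately small, pure-Python sketch. Several things here are assumptions rather than details from the source: the data vector is formed by averaging the word vectors (a common fastText-style choice), the model is reduced to a single weighted layer followed by softmax, and the embeddings, weights, and category names are toy values.

```python
import math


def data_vector(words, embeddings, dim):
    """Average the word vectors of known words (zero vector if none)."""
    vecs = [embeddings[w] for w in words if w in embeddings]
    if not vecs:
        return [0.0] * dim
    return [sum(v[k] for v in vecs) / len(vecs) for k in range(dim)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(vector, weights, biases, categories):
    """Weight the data vector, apply softmax, and pick the best category."""
    scores = [sum(w * x for w, x in zip(row, vector)) + b
              for row, b in zip(weights, biases)]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return categories[best], probs


# Toy, hand-written parameters standing in for trained models.
embeddings = {"login": [1.0, 0.0], "failed": [1.0, 0.0], "lag": [0.0, 1.0]}
weights = [[2.0, 0.0], [0.0, 2.0]]
biases = [0.0, 0.0]
categories = ["login", "lag"]

vec = data_vector(["login", "failed", "unknown"], embeddings, dim=2)
label, probs = classify(vec, weights, biases, categories)
```

In a trained system the weights would come from fitting the model to pairs of training data vectors and target classification results, as described above.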
It should be understood that although the steps in the flowchart of Fig. 2 are shown in the sequence indicated by the arrows, they are not necessarily executed in that sequence. Unless expressly stated otherwise herein, there is no strict restriction on the order of execution, and the steps can be executed in other orders. Moreover, at least some of the steps in Fig. 2 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but can be executed at different times; nor are these sub-steps or stages necessarily executed one after another, but can be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in Fig. 10, a product public sentiment discovery apparatus running on the terminal 102 in Fig. 1 is provided, comprising:
a text extraction module 1002, configured to extract the text data of each information record from a preset information source;
a vector conversion module 1004, configured to convert the text data into a data vector;
a public sentiment classification module 1006, configured to classify the information record according to the data vector to obtain the public sentiment category to which the information record belongs;
a public sentiment discovery module 1008, configured to determine the discovery result corresponding to the public sentiment category when the information records in the public sentiment category within a preset time period satisfy a quantity condition.
The product public sentiment discovery apparatus of this embodiment extracts the text data of each information record from a preset information source; converts the text data into a data vector; classifies the information record according to the data vector to obtain the public sentiment category to which it belongs; and, when the information records in the public sentiment category within a preset time period satisfy a quantity condition, determines the discovery result corresponding to the public sentiment category. Because information records are classified not by keywords of the text data but by a data vector of the entire text data, the loss of semantic information is avoided and classification accuracy is improved, which in turn improves the accuracy of product public sentiment discovery.
In one embodiment, the apparatus further includes a preprocessing module and a word segmentation module.
The preprocessing module is configured to preprocess the text data to obtain text to be segmented;
the word segmentation module is configured to perform word segmentation on the text to be segmented to obtain the semantic words of the text data;
the vector conversion module 1004 is configured to convert each semantic word into a word vector to obtain the data vector of the text data.
In one embodiment, the vector conversion module 1004 is configured to convert each semantic word into a word vector based on a word embedding model to obtain the data vector of the text data.
In one embodiment, the preprocessing includes at least one of deleting punctuation marks, deleting URLs, and deleting numbers.
In one embodiment, the public sentiment classification module 1006 is configured to classify the information record according to the data vector based on a neural network classification model to obtain the public sentiment category to which the information record belongs.
In one embodiment, the public sentiment classification module 1006 is configured to input the data vector into the neural network classification model through the input layer of the neural network classification model; weight the data vector through the hidden layer of the neural network classification model to obtain a classification result; and output the classification result through the output layer of the neural network model, the classification result corresponding to a public sentiment category.
In one embodiment, the weights of the hidden layer in the neural network classification model are determined from the target classification results and the training classification results obtained by training on the training samples. A training sample includes a target classification result and a training data vector; the training data vector has the same data structure as the data vector and corresponds to the target classification result.
In one embodiment, the public sentiment categories include login public sentiment, recharge public sentiment, lag public sentiment, and system failure public sentiment.
In one embodiment, the public sentiment discovery module 1008 is configured to determine the discovery result corresponding to the public sentiment category when the number of information records in the public sentiment category within the preset time period is greater than or equal to a preset threshold.
In one embodiment, the apparatus further includes a threshold determination module. The threshold determination module includes:
a first quantity determination unit, configured to determine, for each time point in the first historical time period, the number of information records of the public sentiment category within the second historical time period of that time point, to obtain the first historical public sentiment quantity;
a second quantity determination unit, configured to obtain the second historical public sentiment quantity of the time point according to the maximum of the first historical public sentiment quantities within the third historical time period;
a preset threshold determination unit, configured to determine the preset threshold according to the mean and standard deviation of the second historical public sentiment quantities of the time points in the first historical time period.
In one embodiment, the apparatus further includes:
an alarm issuing module, configured to issue an alarm notification according to the discovery result.
In one embodiment, the apparatus further includes:
a result display module, configured to display the statistical result of the public sentiment category.
In one embodiment, a computer device is provided. The computer device can be a server, and its internal structure can be as shown in Fig. 11. The computer device includes a processor, a memory, and a network interface connected by a system bus. The processor provides computing and control capabilities. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device is used to store data. The network interface is used to communicate with external terminals over a network connection. When executed by the processor, the computer program implements a product public sentiment discovery method.
Those skilled in the art will understand that the structure shown in Fig. 11 is only a block diagram of the part of the structure relevant to the solution of the present application and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, including a memory and a processor. The memory stores a computer program, and the processor implements the steps of the above product public sentiment discovery method when executing the computer program.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. The computer program implements the steps of the above product public sentiment discovery method when executed by a processor.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, database, or other media used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art can make various modifications and improvements without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of this patent application shall be subject to the appended claims.
Claims (15)
1. A product public sentiment discovery method, the method comprising:
extracting the text data of each information record from a preset information source;
converting the text data into a data vector;
classifying the information record according to the data vector to obtain the public sentiment category to which the information record belongs;
when the information records in the public sentiment category within a preset time period satisfy a quantity condition, determining the discovery result corresponding to the public sentiment category.
2. The method according to claim 1, characterized in that converting the text data into a data vector comprises:
preprocessing the text data to obtain text to be segmented;
performing word segmentation on the text to be segmented to obtain the semantic words of the text data;
converting each semantic word into a word vector to obtain the data vector of the text data.
3. The method according to claim 2, characterized in that converting each semantic word into a word vector to obtain the data vector of the text data comprises:
converting each semantic word into a word vector based on a word embedding model to obtain the data vector of the text data.
4. The method according to claim 2, characterized in that the preprocessing includes at least one of deleting punctuation marks, deleting URLs, and deleting numbers.
5. The method according to claim 1, characterized in that classifying the information record according to the data vector to obtain the public sentiment category to which the information record belongs comprises:
classifying the information record according to the data vector based on a neural network classification model to obtain the public sentiment category to which the information record belongs.
6. The method according to claim 5, characterized in that classifying the information record according to the data vector based on a neural network classification model to obtain the public sentiment category to which the information record belongs comprises:
inputting the data vector into the neural network classification model through the input layer of the neural network classification model;
weighting the data vector through the hidden layer of the neural network classification model to obtain a classification result;
outputting the classification result through the output layer of the neural network model, the classification result corresponding to a public sentiment category.
7. The method according to claim 5, characterized in that the weights of the hidden layer in the neural network classification model are determined from the target classification results and the training classification results obtained by training on the training samples; a training sample includes a target classification result and a training data vector, the training data vector has the same data structure as the data vector, and the training data vector corresponds to the target classification result.
8. The method according to claim 1, characterized in that the public sentiment categories include login public sentiment, recharge public sentiment, lag public sentiment, and system failure public sentiment.
9. The method according to claim 1, characterized in that determining the discovery result corresponding to the public sentiment category when the information records in the public sentiment category within a preset time period satisfy a quantity condition comprises:
when the number of information records in the public sentiment category within the preset time period is greater than or equal to a preset threshold, determining the discovery result corresponding to the public sentiment category.
10. The method according to claim 9, characterized in that the process of determining the preset threshold comprises:
for each time point in a first historical time period, determining the number of information records of the public sentiment category within a second historical time period of that time point, to obtain a first historical public sentiment quantity;
obtaining a second historical public sentiment quantity of the time point according to the maximum of the first historical public sentiment quantities within a third historical time period;
determining the preset threshold according to the mean and standard deviation of the second historical public sentiment quantities of the time points in the first historical time period.
11. The method according to claim 9, characterized in that, after determining the discovery result corresponding to the public sentiment category when the number of information records in the public sentiment category within the preset time period is greater than or equal to the preset threshold, the method further comprises:
issuing an alarm notification according to the discovery result.
12. The method according to any one of claims 1 to 11, characterized in that, after determining the discovery result corresponding to the public sentiment category when the information records in the public sentiment category within the preset time period satisfy the quantity condition, the method further comprises:
displaying the statistical result of the public sentiment category.
13. A product public sentiment discovery apparatus, the apparatus comprising:
a text extraction module, configured to extract the text data of each information record from a preset information source;
a vector conversion module, configured to convert the text data into a data vector;
a public sentiment classification module, configured to classify the information record according to the data vector to obtain the public sentiment category to which the information record belongs;
a public sentiment discovery module, configured to determine the discovery result corresponding to the public sentiment category when the information records in the public sentiment category within a preset time period satisfy a quantity condition.
14. A computer device, including a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method according to any one of claims 1 to 12 when executing the computer program.
15. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program implements the steps of the method according to any one of claims 1 to 12 when executed by a processor.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811005075.3A CN109145115B (en) | 2018-08-30 | 2018-08-30 | Product public opinion discovery method, device, computer equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811005075.3A CN109145115B (en) | 2018-08-30 | 2018-08-30 | Product public opinion discovery method, device, computer equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109145115A true CN109145115A (en) | 2019-01-04 |
CN109145115B CN109145115B (en) | 2020-11-24 |
Family
ID=64829497
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811005075.3A Active CN109145115B (en) | 2018-08-30 | 2018-08-30 | Product public opinion discovery method, device, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109145115B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112488666A (en) * | 2020-12-15 | 2021-03-12 | 北京易兴元石化科技有限公司 | Network-based petroleum comprehensive data processing method and device and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105550269A (en) * | 2015-12-10 | 2016-05-04 | 复旦大学 | Product comment analyzing method and system with learning supervising function |
US20160170966A1 (en) * | 2014-12-10 | 2016-06-16 | Brian Kolo | Methods and systems for automated language identification |
CN107977397A (en) * | 2017-09-08 | 2018-05-01 | 华瑞新智科技(北京)有限公司 | Internet user's notice index calculation method and system based on deep learning |
CN108334605A (en) * | 2018-02-01 | 2018-07-27 | 腾讯科技(深圳)有限公司 | File classification method, device, computer equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN109145115B (en) | 2020-11-24 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |