CN109145115A - Product public sentiment discovery method, apparatus, computer device and storage medium - Google Patents

Product public sentiment discovery method, apparatus, computer device and storage medium

Info

Publication number
CN109145115A
Authority
CN
China
Prior art keywords
public sentiment
classification
information record
vector
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811005075.3A
Other languages
Chinese (zh)
Other versions
CN109145115B (en
Inventor
雷航
洪楷
刘伟
张学亮
王月瑶
陈乃华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Chengdu Co Ltd
Original Assignee
Tencent Technology Chengdu Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Chengdu Co Ltd
Priority to CN201811005075.3A
Publication of CN109145115A
Application granted
Publication of CN109145115B
Active
Anticipated expiration

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data

Abstract

This application relates to a product public sentiment discovery method, apparatus, computer device and storage medium. The method extracts the text data of each information record from a preset information source; converts the text data into a data vector; classifies the information record according to the data vector to obtain the public sentiment category to which the information record belongs; and, when the information records in the public sentiment category within a preset time period meet a quantity condition, determines the discovery result corresponding to the public sentiment category. Because information records are classified by the data vector of the entire text data rather than by keywords of the text data, the loss of semantic information is avoided and classification accuracy is improved, which improves the accuracy of product public sentiment discovery.

Description

Product public sentiment discovery method, apparatus, computer device and storage medium
Technical field
This application relates to the field of data mining technology, and in particular to a product public sentiment discovery method, apparatus, computer device and storage medium.
Background
With the continued development of the internet, daily life is increasingly influenced by it, and reading news, shopping and communicating online are more and more common. When a product in use fails, end users usually spread and discuss the failure online right away. Monitoring public sentiment about a specific product is therefore increasingly important: through product public sentiment monitoring, the product provider can discover unexpected incidents early and take reasonable action, which keeps the public sentiment from escalating further. For example, when the product is a game, players who encounter a system failure may log in to the corresponding forum or official website and post related comments.
Traditional product public sentiment discovery methods classify public sentiment based on keywords. However, classifying public sentiment by keywords alone often has low accuracy. Traditional product public sentiment discovery methods therefore suffer from low accuracy.
Summary
In view of the above technical problem, it is necessary to provide a product public sentiment discovery method, apparatus, computer device and storage medium that can improve accuracy.
A product public sentiment discovery method, the method comprising:
extracting the text data of each information record from a preset information source;
converting the text data into a data vector;
classifying the information record according to the data vector to obtain the public sentiment category to which the information record belongs; and
when the information records in the public sentiment category within a preset time period meet a quantity condition, determining the discovery result corresponding to the public sentiment category.
A product public sentiment discovery apparatus, the apparatus comprising:
a text extraction module, configured to extract the text data of each information record from a preset information source;
a vector conversion module, configured to convert the text data into a data vector;
a public sentiment classification module, configured to classify the information record according to the data vector to obtain the public sentiment category to which the information record belongs; and
a public sentiment discovery module, configured to determine the discovery result corresponding to the public sentiment category when the information records in the public sentiment category within a preset time period meet a quantity condition.
A computer device, comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the following steps when executing the computer program:
extracting the text data of each information record from a preset information source;
converting the text data into a data vector;
classifying the information record according to the data vector to obtain the public sentiment category to which the information record belongs; and
when the information records in the public sentiment category within a preset time period meet a quantity condition, determining the discovery result corresponding to the public sentiment category.
A computer-readable storage medium storing a computer program, wherein the following steps are implemented when the computer program is executed by a processor:
extracting the text data of each information record from a preset information source;
converting the text data into a data vector;
classifying the information record according to the data vector to obtain the public sentiment category to which the information record belongs; and
when the information records in the public sentiment category within a preset time period meet a quantity condition, determining the discovery result corresponding to the public sentiment category.
The above product public sentiment discovery method, apparatus, computer device and storage medium extract the text data of each information record from a preset information source; convert the text data into a data vector; classify the information record according to the data vector to obtain the public sentiment category to which the information record belongs; and, when the information records in the public sentiment category within a preset time period meet a quantity condition, determine the discovery result corresponding to the public sentiment category. Because information records are classified by the data vector of the entire text data rather than by keywords of the text data, the loss of semantic information is avoided and classification accuracy is improved, which improves the accuracy of product public sentiment discovery.
Brief description of the drawings
Fig. 1 is a diagram of the application environment of the product public sentiment discovery method in one embodiment;
Fig. 2 is a flow diagram of the product public sentiment discovery method in one embodiment;
Fig. 3 is a flow diagram of the product public sentiment discovery method in a specific embodiment;
Fig. 4 is an example of an alarm notification of the product public sentiment discovery method in a specific embodiment;
Fig. 5 is a public sentiment volume trend chart of the product public sentiment discovery method in a specific embodiment;
Fig. 6 is an example of a public sentiment description of the product public sentiment discovery method in a specific embodiment;
Fig. 7 is another example of an alarm notification of the product public sentiment discovery method in a specific embodiment;
Fig. 8 is the detail page of a public sentiment description of the product public sentiment discovery method in a specific embodiment;
Fig. 9 compares the training process of the word embedding model, the training process of the neural network classification model, and the product public sentiment classification process of the product public sentiment discovery method in a specific embodiment;
Fig. 10 is a structural block diagram of the product public sentiment discovery apparatus in one embodiment;
Fig. 11 is an internal structure diagram of the computer device in one embodiment.
Detailed description
To make the objectives, technical solutions and advantages of this application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the application, not to limit it.
The product public sentiment discovery method provided by this application can be used to monitor public sentiment about a product, such as login, recharge, lag and system failure public sentiment for a game product, and can support the decisions of the product provider. The method can be applied in the application environment shown in Fig. 1, in which a terminal 102 communicates with a server 104 over a network. The method of the embodiments of this application may run on the terminal 102. The server 104 corresponding to the preset information source may send information records to the terminal 102 over the network. The terminal 102 receives each information record of the preset information source and extracts the text data of each information record; converts the text data into a data vector; classifies the information record according to the data vector to obtain the public sentiment category to which the information record belongs; and, when the information records in the public sentiment category within a preset time period meet a quantity condition, determines the discovery result corresponding to the public sentiment category. The terminal 102 may be, but is not limited to, a server, personal computer, laptop, smartphone, tablet or portable wearable device, and the server 104 may be implemented as an independent server or as a server cluster composed of multiple servers.
In one embodiment, as shown in Fig. 2, a product public sentiment discovery method is provided. The method may run on the terminal 102 in Fig. 1 and comprises the following steps:
S202: extract the text data of each information record from a preset information source.
The preset information source may be an information source such as the official forum corresponding to the product, WeChat circles or Baidu Tieba. An information record may take the form of an article, a post, a comment, a reply to a comment, and so on. The text data may include the text content presented in text form in the information record; it may also include text content converted from data presented in other forms (such as emoticons or pictures).
S204: convert the text data into a data vector.
The data vector is a vector that carries semantics. The semantics expressed by the data vector are the same as the semantics of the text data.
S206: classify the information record according to the data vector to obtain the public sentiment category to which the information record belongs.
After the text data has been converted into a semantic data vector, the data vector can be classified according to its semantics, which classifies the information record corresponding to that data vector and yields the public sentiment category to which the information record belongs. Public sentiment categories may include product price public sentiment, product quality public sentiment, and so on. For a game product, for example, the categories may include login public sentiment, recharge public sentiment, lag public sentiment and system failure public sentiment.
After the public sentiment category of an information record is obtained, the category of that record may first be stored in a database.
S208: when the information records in the public sentiment category within a preset time period meet a quantity condition, determine the discovery result corresponding to the public sentiment category.
The preset time period may be the 10 minutes, 20 minutes, 30 minutes, 1 hour, 2 hours, 1 day, 1 week, and so on before the current time point. The current time point may be the current time obtained in real time, and it may be accurate to 1 minute, 1 second or 1 hour. The information records in the public sentiment category within the preset time period may be the records newly added during that period that belong to the category, or they may be the records that belong to the category during that period.
The quantity condition may be a condition on the number of information records in the public sentiment category within the preset time period, or a condition on the number of newly added information records in the public sentiment category within the preset time period. The discovery result may be the public sentiment corresponding to the category whose information records within the preset time period meet the quantity condition.
The product public sentiment discovery method of this embodiment extracts the text data of each information record from a preset information source; converts the text data into a data vector; classifies the information record according to the data vector to obtain the public sentiment category to which the information record belongs; and, when the information records in the public sentiment category within a preset time period meet a quantity condition, determines the discovery result corresponding to the public sentiment category. Because information records are classified by the data vector of the entire text data rather than by keywords of the text data, the loss of semantic information is avoided and classification accuracy is improved, which improves the accuracy of product public sentiment discovery.
In one embodiment, converting the text data into a data vector comprises: preprocessing the text data to obtain the text to be segmented; performing word segmentation on the text to be segmented to obtain the semantic words of the text data; and converting each semantic word into a word vector to obtain the data vector of the text data.
Preprocessing may include deleting text that carries no actual semantics, such as punctuation marks, URLs and numbers. Preprocessing the text data yields text to be segmented that consists of the parts with actual semantics. Performing word segmentation on this text gives the semantic words of the text to be segmented, that is, the semantic words of the text data. It should be understood that the text data of one information record corresponds to at least one semantic word. Converting each semantic word into a word vector that carries semantics gives the data vector of the text data, which makes it convenient to classify the text data, and therefore the information record, in vector form.
With the product public sentiment discovery method of this embodiment, the text data is preprocessed to obtain the text to be segmented; word segmentation is performed on that text to obtain the semantic words of the text data; and each semantic word is converted into a word vector to obtain the data vector of the text data. Information records are thus classified by the data vector of the entire text data rather than by keywords of the text data, which avoids the loss of semantic information, improves classification accuracy, and improves the accuracy of product public sentiment discovery.
In one embodiment, converting each semantic word into a word vector to obtain the data vector of the text data comprises: converting each semantic word into a word vector based on a word embedding model to obtain the data vector of the text data.
The word embedding model converts semantic words in text form into word vectors in vector form. The word embedding model may be a fastText model (a fast text classifier model), doc2vec (a document vector model), a GloVe model (Global Vectors for Word Representation), and so on.
Converting each semantic word into a word vector based on a word embedding model yields a more accurate data vector for the text data, which further improves the accuracy of product public sentiment discovery.
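The description does not say how the word vectors of one record are combined into a single data vector. The sketch below assumes simple averaging, a common choice with fastText-style embeddings, and uses a hypothetical three-dimensional lookup table `TOY_EMBEDDINGS` in place of a trained word embedding model.

```python
# Assumption: the data vector is the element-wise mean of the word vectors.
# TOY_EMBEDDINGS is a made-up table standing in for a trained embedding model.
TOY_EMBEDDINGS = {
    "login": [1.0, 0.0, 0.0],
    "failed": [0.0, 1.0, 0.0],
    "lag": [0.0, 0.0, 1.0],
}


def text_to_vector(words, embeddings, dim):
    """Average the vectors of the known words; return zeros if none are known."""
    known = [embeddings[w] for w in words if w in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[k] for vec in known) / len(known) for k in range(dim)]


vector = text_to_vector(["login", "failed", "unknown"], TOY_EMBEDDINGS, 3)
```

Words missing from the table are skipped here; a real embedding model might instead build vectors for unseen words from subword units.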
In one embodiment, preprocessing includes at least one of deleting punctuation marks, deleting URLs and deleting numbers. This removes unnecessary text content that has no actual semantics, which improves the accuracy of word segmentation and saves resources, and thus improves the accuracy of product public sentiment discovery while saving system resources.
Further, preprocessing may also include deleting stop words. This saves more resources at the cost of a small reduction in the accuracy of product public sentiment discovery. It should be understood that in embodiments where preprocessing does not include deleting stop words, the accuracy of product public sentiment discovery is higher.
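The preprocessing options above can be sketched with regular expressions. This is a minimal illustration, not the patented implementation: the patterns, the function name `preprocess` and the whitespace split are assumptions, and the split stands in for a real word segmenter, which Chinese text would require.

```python
import re

# Assumed patterns; the description only names what is deleted, not how.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
DIGIT_RE = re.compile(r"\d+")
# Anything that is neither a word character nor whitespace, plus underscore.
PUNCT_RE = re.compile(r"[^\w\s]|_")


def preprocess(text, stop_words=None):
    """Delete URLs, numbers and punctuation, then return the remaining tokens."""
    text = URL_RE.sub(" ", text)    # URLs first, before punctuation breaks them up
    text = DIGIT_RE.sub(" ", text)
    text = PUNCT_RE.sub(" ", text)
    tokens = text.split()           # placeholder for real word segmentation
    if stop_words:                  # optional step: delete stop words
        tokens = [t for t in tokens if t not in stop_words]
    return tokens
```

For example, `preprocess("Login failed!!! see https://example.com/help 404 times")` keeps only `Login`, `failed`, `see` and `times`; passing `stop_words={"see"}` also drops `see`.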
In a specific embodiment, the data vector may be a vector of no fewer than a preset number of dimensions; for example, the preset number of dimensions may be 300. The data vector then contains more semantic information, which makes the classification result more accurate and further improves the accuracy of product public sentiment discovery.
In one embodiment, classifying the information record according to the data vector to obtain the public sentiment category to which the information record belongs comprises:
classifying the information record according to the data vector, based on a neural network classification model, to obtain the public sentiment category to which the information record belongs.
The neural network classification model classifies the input data vector, that is, it classifies the information record corresponding to that data vector, and yields the public sentiment category to which the record belongs. Because classifying the input data vector with a neural network classification model gives more accurate classification results, the accuracy of product public sentiment discovery can be further improved.
It should be understood that in other embodiments a classifier other than a neural network classification model may be used to classify the information record according to the data vector. In that case the information record is still classified by the data vector of the entire text data rather than by keywords of the text data, which avoids the loss of semantic information, improves classification accuracy, and thus improves the accuracy of product public sentiment discovery.
Further, classifying the information record according to the data vector, based on the neural network classification model, to obtain the public sentiment category to which the information record belongs comprises: inputting the data vector into the neural network classification model through the input layer of the model; weighting the data vector through the hidden layer of the model to obtain a classification result; and outputting the classification result through the output layer of the model, the classification result corresponding to a public sentiment category.
The neural network classification model includes an input layer, an output layer, and a hidden layer between the input layer and the output layer. The data vector is input into the model through the input layer and weighted by the hidden layer to obtain a classification result, and the classification result is output through the output layer. The classification result may be the serial number of a public sentiment category, which gives the public sentiment category to which the information record belongs. Because classifying the input data vector with a neural network classification model gives more accurate classification results, the accuracy of product public sentiment discovery can be further improved.
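The input, hidden and output layers can be illustrated with a small forward pass in plain Python. The layer sizes 300, 100 and 7 are the node counts this description gives for a specific embodiment, read here as input, hidden and output sizes. The ReLU activation, the softmax output and the random weights are assumptions made only so the sketch runs; a real model would use trained weights.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random placeholder weights; a trained model would supply real ones."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, weights, biases):
    """Weighted sum of the inputs for every node of one layer."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def softmax(z):
    peak = max(z)
    exps = [math.exp(v - peak) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def classify(x, hidden, output):
    """Return (category serial number, class probabilities) for one data vector."""
    h = [max(0.0, v) for v in dense(x, *hidden)]   # ReLU is an assumption
    probs = softmax(dense(h, *output))             # softmax is an assumption
    return probs.index(max(probs)), probs


rng = random.Random(0)
hidden_layer = init_layer(300, 100, rng)   # input 300 -> hidden 100
output_layer = init_layer(100, 7, rng)     # hidden 100 -> output 7
index, probs = classify([0.01] * 300, hidden_layer, output_layer)
```

The returned `index` plays the role of the category serial number mentioned above; mapping it to a category name is left to the caller.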
In one embodiment, the weights of the hidden layer in the neural network classification model are determined from the target classification results and the training classification results obtained by training on training samples. A training sample includes a target classification result and a training data vector; the training data vector has the same data structure as the data vector and corresponds to the target classification result.
The target classification result is the classification result expected after the training data vector of a training sample is input into the neural network classification model. The training classification result is the classification result actually obtained after the training data vector is input into the neural network model being trained. Within one training sample, the training data vector and the target classification result correspond to each other: one training data vector corresponds to one target classification result. Because the training data vector is also input into the neural network model, its data structure is the same as that of the data vector.
Further, determining the weights of the hidden layer in the neural network classification model from the target classification results and the training classification results obtained by training on training samples may comprise: determining the weights of the hidden layer when the loss function value obtained from the target classification results and the training classification results reaches a preset optimization condition. The preset optimization condition may be that the loss function value reaches a preset value, or that the change in the loss function value within a preset time period is less than a preset value. This yields the optimal neural network classification model and thereby determines the neural network model.
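The two forms of the preset optimization condition can be sketched as a stopping check over the history of loss values. The function name, the parameter names and the choice to measure the "preset time period" in training steps are assumptions for illustration.

```python
def reached_optimization_condition(losses, target_loss, window, min_change):
    """Return True when training may stop.

    Condition 1: the latest loss has reached the preset value `target_loss`.
    Condition 2: over the last `window` steps the loss varied by less than
    `min_change` (the time period is counted in training steps here).
    """
    if not losses:
        return False
    if losses[-1] <= target_loss:
        return True
    if len(losses) > window:
        recent = losses[-(window + 1):]
        return max(recent) - min(recent) < min_change
    return False
```

With `window=3` and `min_change=0.01`, a history such as `[0.9, 0.4, 0.4005, 0.4002, 0.4001]` satisfies the second condition even though the loss is still above a target of `0.1`.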
In a specific embodiment, the neural network classification model is a multilayer neural network classification model, which may be a classification model whose network structure is composed of multiple perceptrons. For example, the multilayer neural network classification model may be a three-layer neural network classification model, and the numbers of nodes in its layers may be 300, 100 and 7 respectively. This further improves classification accuracy and thus the accuracy of product public sentiment classification.
In one embodiment, the public sentiment categories include login public sentiment, recharge public sentiment, lag public sentiment and system failure public sentiment. Login public sentiment refers to information records about account login; recharge public sentiment refers to information records about account recharge; lag public sentiment refers to information records about system smoothness during product use; and system failure public sentiment refers to information records about system failures during product use. This makes the product public sentiment discovery method particularly suitable for monitoring public sentiment about network products, such as game products. It can surface business problems for the product's project team so that the team can handle them early, which reduces the impact of failures on users and improves user stickiness.
In one embodiment, determining the discovery result corresponding to the public sentiment category when the information records in the public sentiment category within the preset time period meet the quantity condition comprises: determining the discovery result corresponding to the public sentiment category when the number of information records in the public sentiment category within the preset time period is greater than or equal to a preset threshold.
In this embodiment, if the information records in the public sentiment category within the preset time period are the records that belong to the category during that period, the quantity condition is that the number of newly added information records in the category within the preset time period is greater than or equal to the preset threshold. If the information records in the public sentiment category within the preset time period are the records newly added during that period that belong to the category, the quantity condition is that the number of information records in the category within the preset time period is greater than or equal to the preset threshold. The discovery result may be the public sentiment corresponding to the category when the number of newly added information records belonging to the category within the preset time period is greater than or equal to the preset threshold.
With the product public sentiment discovery method of this embodiment, the discovery result corresponding to a public sentiment category is determined when the number of newly added information records in that category within the preset time period is greater than or equal to the preset threshold. This avoids interference from historical information records and further improves the accuracy of product public sentiment discovery.
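The threshold check on newly added records can be sketched as follows. Representing a record as a `(timestamp, category)` pair, the function names and the shape of the returned result are assumptions; the description only fixes the comparison "greater than or equal to the preset threshold".

```python
from datetime import datetime, timedelta


def count_new_records(records, category, now, window_minutes):
    """Count records of `category` with a timestamp in (now - window, now]."""
    start = now - timedelta(minutes=window_minutes)
    return sum(1 for ts, cat in records if cat == category and start < ts <= now)


def discover(records, category, now, window_minutes, threshold):
    """Return a discovery result if the count meets the threshold, else None."""
    count = count_new_records(records, category, now, window_minutes)
    if count >= threshold:
        return {"category": category, "count": count, "threshold": threshold}
    return None


now = datetime(2018, 8, 30, 12, 0)
records = [
    (datetime(2018, 8, 30, 11, 0), "login"),    # outside the 20-minute window
    (datetime(2018, 8, 30, 11, 45), "login"),
    (datetime(2018, 8, 30, 11, 50), "login"),
    (datetime(2018, 8, 30, 11, 59), "login"),
    (datetime(2018, 8, 30, 11, 55), "lag"),
]
result = discover(records, "login", now, window_minutes=20, threshold=3)
```

Only records inside the window are counted, which is what keeps older historical records from influencing the result.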
In one embodiment, the preset threshold is determined as follows: for each time point in a first historical time period, determine the number of information records of the public sentiment category in the second historical time period of that time point, which gives the first historical public sentiment count; take the maximum of the first historical public sentiment counts within a third historical time period as the second historical public sentiment count of the time point; and determine the preset threshold from the mean and standard deviation of the second historical public sentiment counts of the time points in the first historical time period.
The first historical time period is longer than both the second historical time period and the third historical time period, and the third historical time period is longer than the second historical time period.
In a specific embodiment, as shown in Fig. 3, the first historical time period may be the 7 days before the current time point. The second historical time period may be the k minutes before the time point, where k may be 10, 20 or 30. Optionally, the second historical time period may include at least one such period, so that a discovery result can be determined in as little as about 10 minutes. The third historical time period may span the two hours before and the two hours after the time point. Let $X_i$ denote the first historical public sentiment count at time point $i$, and let $M_i$ denote the maximum of the first historical public sentiment counts within the third historical time period, that is, the second historical public sentiment count, with time points spaced 1 minute apart. Then $M_i = \max(X_{i-120}, X_{i-119}, \ldots, X_{i+119}, X_{i+120})$, where $X_{i-120}$ is the first historical public sentiment count 120 minutes before time point $i$, $X_{i-119}$ is the count 119 minutes before time point $i$, $X_{i+119}$ is the count 119 minutes after time point $i$, and $X_{i+120}$ is the count 120 minutes after time point $i$. The mean of the second historical public sentiment counts of the time points in the first historical time period can be written $\mathrm{avg}(M_i)$, and their standard deviation $\mathrm{std}(M_i)$.
Still referring to Fig. 3, in a specific embodiment, determining the preset threshold from the mean and standard deviation of the second historical public sentiment counts of the time points in the first historical time period comprises: determining the preset threshold from a preset scaling ratio, a preset fixed scale value, and the mean and standard deviation of the second historical public sentiment counts of the time points in the first historical time period. For example, the preset threshold may be denoted $B_i$ and determined by:
$B_i = m \cdot \mathrm{avg}(M_i) + 3 \cdot \mathrm{std}(M_i) + n$
where $m$ denotes the preset scaling ratio and $n$ denotes the preset fixed scale value. Both may be determined empirically; for example, they may be adjusted according to the magnitude of the second historical public sentiment count.
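The threshold formula can be worked through in code. The sketch below takes a per-minute series of first historical counts, derives the second historical counts as a sliding maximum, and applies the formula. Two details are assumptions because the description does not specify them: the window is clipped at the ends of the series, and the population standard deviation (`statistics.pstdev`) is used.

```python
from statistics import mean, pstdev


def second_history_counts(first_counts, half_window=120):
    """M_i: maximum of X over [i - half_window, i + half_window], clipped at the edges."""
    size = len(first_counts)
    return [
        max(first_counts[max(0, i - half_window):min(size, i + half_window + 1)])
        for i in range(size)
    ]


def preset_threshold(first_counts, m, n, half_window=120):
    """B = m * avg(M) + 3 * std(M) + n, with the population standard deviation."""
    second = second_history_counts(first_counts, half_window)
    return m * mean(second) + 3 * pstdev(second) + n


# Tiny example with a one-step half window instead of 120 minutes.
toy_counts = [1, 5, 2, 0, 3]
toy_second = second_history_counts(toy_counts, half_window=1)
toy_threshold = preset_threshold(toy_counts, m=1, n=2, half_window=1)
```

Taking the sliding maximum before computing the mean and standard deviation makes the threshold track recent peaks rather than typical values, which plausibly reduces alarms on ordinary fluctuations.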
Product public sentiment based on the present embodiment finds method, since elder generation is for each time in the first historical time section Point determines the quantity of the information record of public sentiment classification in the second historical time section of each time point, obtains the first history public sentiment Quantity;Further according to the maximum value of the first history public sentiment quantity in third historical time section, the second history public sentiment at time point is obtained Quantity;Finally according in the first historical time section, the average value and standard deviation of the second history public sentiment quantity of each time point, really Determine preset threshold.The mode of the determination preset threshold can obtain more reasonable preset threshold, it is thus possible to improve product carriage The accuracy of feelings discovery.
In one embodiment, after determining the discovery result corresponding to the public sentiment category when the number of information records in the public sentiment category within the preset time period is greater than or equal to the preset threshold, the method further includes: issuing an alarm notification according to the discovery result.
The alarm notification may be issued as a pop-up window, or as information displayed in a distinct font. It may also be sent as an instant message or an SMS message, or through a WeChat official account. The alarm notification may also take the form of a sound or light alarm. For example, the alarm notification in one example may read: "12:00, product XX: 8 abnormal public sentiment records of the disconnection category in the last 20 minutes, exceeding the threshold of 7". In another example, as shown in Fig. 4, the alarm notification may be issued as a WeChat message whose content includes the time of occurrence, the affected business (i.e., the product), the impact, and possible causes. This makes it easier for the product provider to discover a product problem as early as possible and handle it, thereby reducing the impact of the problem on users.
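As an illustration of the threshold check and the alarm text, a minimal Python sketch follows. The function name `build_alarm`, its parameters, and the message wording are hypothetical; the source only gives one example message and lists the delivery channels.

```python
from typing import Optional


def build_alarm(time_label: str, product: str, category: str,
                window_minutes: int, count: int,
                threshold: float) -> Optional[str]:
    """Return an alarm message when count >= threshold, otherwise None."""
    if count < threshold:
        return None  # quantity condition not met, no discovery result
    return (
        f"{time_label}, product {product}: {count} abnormal public sentiment "
        f"records of the {category} category in the last {window_minutes} "
        f"minutes, threshold {threshold}"
    )
```

For example, `build_alarm("12:00", "XX", "disconnection", 20, 8, 7)` produces a message, while a count of 5 against the same threshold returns `None`. Delivery through a pop-up, SMS, or WeChat would be a separate step.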
In one embodiment, after determining the discovery result corresponding to the public sentiment category when the information records in the public sentiment category within the preset time period meet the quantity condition, the method further includes: displaying the statistical result of the public sentiment category.
The statistical result includes a public sentiment quantity trend chart for each public sentiment category; the trend chart shows, for different time points, the quantity or growth rate of the information records belonging to that public sentiment category. The statistical result may also include the details of the information records in each public sentiment category. In this way, the product provider is given a statistical result that is convenient to review and can serve as a basis for decisions.
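The counts behind such a trend chart can be sketched in Python as follows. The record layout (a timestamp paired with a category name) and the 10-minute bucket size are assumptions for illustration; `bucket_minutes` is expected to divide 60.

```python
from collections import Counter
from datetime import datetime


def trend_counts(records, bucket_minutes=10):
    """Count information records per (category, time bucket).

    records: iterable of (datetime, category) pairs (hypothetical layout).
    """
    counts = Counter()
    for ts, category in records:
        # Round the timestamp down to the start of its bucket.
        bucket = ts.replace(
            minute=ts.minute - ts.minute % bucket_minutes,
            second=0,
            microsecond=0,
        )
        counts[(category, bucket)] += 1
    return counts
```

Plotting the counts of one category against the bucket start times gives a chart with time on the abscissa and public sentiment quantity on the ordinate.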
In one specific embodiment, the public sentiment quantity trend chart may be as shown in Fig. 5, where the abscissa represents time and the ordinate represents public sentiment quantity. The chart shows, along the time axis, the statistical result of the total abnormal public sentiment of each public sentiment category.
In one specific embodiment, the details of an information record in the statistical result of a public sentiment category of a game product may be as shown in Fig. 6, including: product name, problem type, game account, phone model, login method, phone system, system type, time of the abnormality, problem description, related screenshots, and similar information.
In one specific embodiment, the alarm notification may be issued through a WeChat official account. The alarm notification page may be as shown in Fig. 7 and includes locating content and on-site data. The locating content includes the time of occurrence, the product name, the affected business, the discovery result, and the impact. The on-site data includes statistics of the public sentiment category, such as a public sentiment analysis trend chart. Further, from the alarm notification page the user can enter the public sentiment description detail page shown in Fig. 8, on which the statistical result of each information record in the discovery result under that public sentiment category can be viewed; the statistical result may include the public sentiment record time and the public sentiment description.
In one specific embodiment, the product public sentiment discovery method includes: extracting the text data of each information record from a preset information source; preprocessing the text data to obtain text to be segmented, where the preprocessing includes deleting punctuation marks, deleting web addresses, and deleting digits; performing word segmentation on the text to be segmented to obtain the semantic words of the text data; converting each semantic word into a word vector based on a word embedding model to obtain the data vector of the text data; classifying the information record according to the data vector based on a neural network classification model to obtain the public sentiment category to which the information record belongs; determining the discovery result corresponding to the public sentiment category when the number of information records in the public sentiment category within a preset time period is greater than or equal to a preset threshold; and issuing an alarm notification according to the discovery result.
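The preprocessing and word segmentation steps can be sketched in Python as below. The regular expressions are illustrative choices, not taken from the source, and the whitespace-based `segment` function is only a stand-in: Chinese text would need a dedicated word segmenter, which the source does not name.

```python
import re

# ASCII-only URL pattern so that adjacent Chinese characters are not swallowed.
URL_RE = re.compile(r"(?:https?://|www\.)[\w\-.~:/?#\[\]@!$&'()*+,;=%]+", re.ASCII)
DIGIT_RE = re.compile(r"\d+")
PUNCT_RE = re.compile(r"[^\w\s]|_")  # anything that is not a word char or space


def preprocess(text):
    """Delete web addresses, digits and punctuation marks, in that order."""
    text = URL_RE.sub(" ", text)
    text = DIGIT_RE.sub(" ", text)
    text = PUNCT_RE.sub(" ", text)
    return " ".join(text.split())  # collapse repeated whitespace


def segment(text):
    """Stand-in word segmentation: split on whitespace (an assumption)."""
    return text.split()
```

Web addresses are removed first because deleting punctuation or digits beforehand would break them into fragments that no longer match the URL pattern.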
The word embedding model may be a fastText model. The training process of the word embedding model, the training process of the neural network classification model, and the product public sentiment classification process are compared in Fig. 9.
In the training process of the word embedding model, the object processed may be original text, that is, content in text form. The original text is preprocessed to obtain text to be segmented, and word segmentation is performed on the text to be segmented to obtain semantic words. Sample pairs composed of these semantic words and their corresponding word vectors are used as training samples to train the word embedding model.
In the training process of the neural network classification model, the object processed may be training data, whose data structure may correspond to that of the text data of an information record. After the training data is preprocessed and segmented, it is converted into vectors by the trained word embedding model to obtain training data vectors. Sample pairs composed of a training data vector and a target classification result are used as training samples to train the neural network classification model.
In the product public sentiment classification process, the text data of a new information record is the object processed. After preprocessing, word segmentation, and vector conversion, a data vector is obtained; the data vector is input into the trained neural network classification model for classification prediction, giving the classification result, that is, the public sentiment category to which the information record belongs.
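One simple way to turn the word vectors of a record into a single data vector is to average them, as fastText-style text classifiers commonly do. The source does not state how the data vector is formed, so the averaging, the function name, and the toy embedding table below are all assumptions.

```python
def text_to_vector(words, embeddings, dim):
    """Average the word vectors of the known words into one data vector.

    embeddings: dict mapping a word to a list of `dim` floats (toy stand-in
    for a trained word embedding model). Unknown words are skipped.
    """
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return [0.0] * dim  # no known word: fall back to a zero vector
    return [sum(column) / len(vectors) for column in zip(*vectors)]
```

With a toy table `{"login": [1.0, 0.0], "failed": [0.0, 1.0]}`, the words `["login", "failed"]` map to the data vector `[0.5, 0.5]`.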
It should be understood that although the steps in the flowchart of Fig. 2 are shown in sequence as indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, there is no strict restriction on the order of execution, and the steps may be executed in other orders. Moreover, at least some of the steps in Fig. 2 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different times; nor are these sub-steps or stages necessarily executed sequentially, but may be executed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in Fig. 10, a product public sentiment discovery apparatus running on the terminal 102 in Fig. 1 is provided, comprising:
a text extraction module 1002, configured to extract the text data of each information record from a preset information source;
a vector conversion module 1004, configured to convert the text data into a data vector;
a public sentiment classification module 1006, configured to classify the information record according to the data vector to obtain the public sentiment category to which the information record belongs;
a public sentiment discovery module 1008, configured to determine the discovery result corresponding to the public sentiment category when the information records in the public sentiment category within a preset time period meet a quantity condition.
The product public sentiment discovery apparatus of this embodiment extracts the text data of each information record from the preset information source; converts the text data into a data vector; classifies the information record according to the data vector to obtain the public sentiment category to which the information record belongs; and determines the discovery result corresponding to the public sentiment category when the information records in the public sentiment category within the preset time period meet the quantity condition. Because information records are classified by the data vector of the entire text data rather than by keywords of the text data, loss of semantic information is avoided and classification accuracy is improved, which in turn improves the accuracy of product public sentiment discovery.
In one embodiment, the apparatus further includes a preprocessing module and a word segmentation module.
The preprocessing module is configured to preprocess the text data to obtain text to be segmented;
the word segmentation module is configured to perform word segmentation on the text to be segmented to obtain the semantic words of the text data;
the vector conversion module 1004 is configured to convert each semantic word into a word vector to obtain the data vector of the text data.
In one embodiment, the vector conversion module 1004 is configured to convert each semantic word into a word vector based on a word embedding model to obtain the data vector of the text data.
In one embodiment, the preprocessing includes at least one of deleting punctuation marks, deleting web addresses, and deleting digits.
In one embodiment, the public sentiment classification module 1006 is configured to classify the information record according to the data vector based on a neural network classification model to obtain the public sentiment category to which the information record belongs.
In one embodiment, the public sentiment classification module 1006 is configured to input the data vector into the neural network classification model through the input layer of the neural network classification model; weight the data vector through the hidden layer of the neural network classification model to obtain a classification result; and output the classification result through the output layer of the neural network classification model, the classification result corresponding to a public sentiment category.
In one embodiment, the weights of the hidden layer in the neural network classification model are determined from the target classification result and the training classification result obtained by training on a training sample. The training sample includes the target classification result and a training data vector; the training data vector has the same data structure as the data vector and corresponds to the target classification result.
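The input, hidden, and output layers described above can be illustrated by a minimal forward pass in Python. A single weighting layer followed by a softmax is an assumption made to keep the sketch short; the source does not give the layer sizes, the activation, or the training procedure, and the names `classify`, `hidden_weights`, and `hidden_bias` are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(data_vector, hidden_weights, hidden_bias, labels):
    """Weight the data vector, then output the most probable category.

    hidden_weights holds one row of weights per category; in practice the
    weights would come from training on (data vector, target category) pairs.
    """
    scores = [
        sum(w * x for w, x in zip(row, data_vector)) + b
        for row, b in zip(hidden_weights, hidden_bias)
    ]
    probs = softmax(scores)
    best = max(range(len(labels)), key=lambda k: probs[k])
    return labels[best], probs
```

Each label would be one public sentiment category, so the returned label plays the role of the classification result that corresponds to a public sentiment category.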
In one embodiment, the public sentiment categories include login public sentiment, recharge public sentiment, lag public sentiment, and system failure public sentiment.
In one embodiment, the public sentiment discovery module 1008 is configured to determine the discovery result corresponding to the public sentiment category when the number of information records in the public sentiment category within the preset time period is greater than or equal to a preset threshold.
In one embodiment, the apparatus further includes a threshold determination module. The threshold determination module includes:
a first quantity determination unit, configured to determine, for each time point in the first historical time period, the number of information records of the public sentiment category within the second historical time period of that time point, to obtain the first history public sentiment quantity;
a second quantity determination unit, configured to obtain the second history public sentiment quantity of the time point according to the maximum of the first history public sentiment quantity within the third historical time period;
a preset threshold determination unit, configured to determine the preset threshold according to the average value and standard deviation of the second history public sentiment quantity at each time point in the first historical time period.
In one embodiment, the apparatus further includes:
an alarm issuing module, configured to issue an alarm notification according to the discovery result.
In one embodiment, the apparatus further includes:
a result display module, configured to display the statistical result of the public sentiment category.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in Fig. 11. The computer device includes a processor, a memory, and a network interface connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device is used to store data. The network interface of the computer device is used to communicate with an external terminal through a network connection. When executed by the processor, the computer program implements a product public sentiment discovery method.
Those skilled in the art will understand that the structure shown in Fig. 11 is only a block diagram of the part of the structure relevant to the solution of the present application and does not limit the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, including a memory and a processor. The memory stores a computer program, and the processor implements the steps of the above product public sentiment discovery method when executing the computer program.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. The computer program implements the steps of the above product public sentiment discovery method when executed by a processor.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by a computer program instructing the relevant hardware. The computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database, or other media used in the embodiments provided in the present application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art can make various modifications and improvements without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (15)

1. A product public sentiment discovery method, the method comprising:
extracting the text data of each information record from a preset information source;
converting the text data into a data vector;
classifying the information record according to the data vector to obtain the public sentiment category to which the information record belongs;
determining the discovery result corresponding to the public sentiment category when the information records in the public sentiment category within a preset time period meet a quantity condition.
2. The method according to claim 1, characterized in that converting the text data into a data vector comprises:
preprocessing the text data to obtain text to be segmented;
performing word segmentation on the text to be segmented to obtain the semantic words of the text data;
converting each semantic word into a word vector to obtain the data vector of the text data.
3. The method according to claim 2, characterized in that converting each semantic word into a word vector to obtain the data vector of the text data comprises:
converting each semantic word into a word vector based on a word embedding model to obtain the data vector of the text data.
4. The method according to claim 2, characterized in that the preprocessing includes at least one of deleting punctuation marks, deleting web addresses, and deleting digits.
5. The method according to claim 1, characterized in that classifying the information record according to the data vector to obtain the public sentiment category to which the information record belongs comprises:
classifying the information record according to the data vector based on a neural network classification model to obtain the public sentiment category to which the information record belongs.
6. The method according to claim 5, characterized in that classifying the information record according to the data vector based on the neural network classification model to obtain the public sentiment category to which the information record belongs comprises:
inputting the data vector into the neural network classification model through the input layer of the neural network classification model;
weighting the data vector through the hidden layer of the neural network classification model to obtain a classification result;
outputting the classification result through the output layer of the neural network classification model, the classification result corresponding to a public sentiment category.
7. The method according to claim 5, characterized in that the weights of the hidden layer in the neural network classification model are determined from the target classification result and the training classification result obtained by training on a training sample; the training sample includes the target classification result and a training data vector, the training data vector has the same data structure as the data vector, and the training data vector corresponds to the target classification result.
8. The method according to claim 1, characterized in that the public sentiment categories include login public sentiment, recharge public sentiment, lag public sentiment, and system failure public sentiment.
9. The method according to claim 1, characterized in that determining the discovery result corresponding to the public sentiment category when the information records in the public sentiment category within the preset time period meet the quantity condition comprises:
determining the discovery result corresponding to the public sentiment category when the number of information records in the public sentiment category within the preset time period is greater than or equal to a preset threshold.
10. The method according to claim 9, characterized in that the process of determining the preset threshold comprises:
for each time point in a first historical time period, determining the number of information records of the public sentiment category within a second historical time period of that time point, to obtain a first history public sentiment quantity;
obtaining a second history public sentiment quantity of the time point according to the maximum of the first history public sentiment quantity within a third historical time period;
determining the preset threshold according to the average value and standard deviation of the second history public sentiment quantity at each time point in the first historical time period.
11. The method according to claim 9, characterized in that, after determining the discovery result corresponding to the public sentiment category when the number of information records in the public sentiment category within the preset time period is greater than or equal to the preset threshold, the method further comprises:
issuing an alarm notification according to the discovery result.
12. The method according to any one of claims 1 to 11, characterized in that, after determining the discovery result corresponding to the public sentiment category when the information records in the public sentiment category within the preset time period meet the quantity condition, the method further comprises:
displaying the statistical result of the public sentiment category.
13. A product public sentiment discovery apparatus, the apparatus comprising:
a text extraction module, configured to extract the text data of each information record from a preset information source;
a vector conversion module, configured to convert the text data into a data vector;
a public sentiment classification module, configured to classify the information record according to the data vector to obtain the public sentiment category to which the information record belongs;
a public sentiment discovery module, configured to determine the discovery result corresponding to the public sentiment category when the information records in the public sentiment category within a preset time period meet a quantity condition.
14. A computer device, including a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method according to any one of claims 1 to 12 when executing the computer program.
15. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program implements the steps of the method according to any one of claims 1 to 12 when executed by a processor.
CN201811005075.3A 2018-08-30 2018-08-30 Product public opinion discovery method, device, computer equipment and storage medium Active CN109145115B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811005075.3A CN109145115B (en) 2018-08-30 2018-08-30 Product public opinion discovery method, device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN109145115A true CN109145115A (en) 2019-01-04
CN109145115B CN109145115B (en) 2020-11-24

Family

ID=64829497

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811005075.3A Active CN109145115B (en) 2018-08-30 2018-08-30 Product public opinion discovery method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN109145115B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112488666A (en) * 2020-12-15 2021-03-12 北京易兴元石化科技有限公司 Network-based petroleum comprehensive data processing method and device and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105550269A (en) * 2015-12-10 2016-05-04 复旦大学 Product comment analyzing method and system with learning supervising function
US20160170966A1 (en) * 2014-12-10 2016-06-16 Brian Kolo Methods and systems for automated language identification
CN107977397A (en) * 2017-09-08 2018-05-01 华瑞新智科技(北京)有限公司 Internet user's notice index calculation method and system based on deep learning
CN108334605A (en) * 2018-02-01 2018-07-27 腾讯科技(深圳)有限公司 File classification method, device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN109145115B (en) 2020-11-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant