CN108021704A - Agent optimal configuration method based on social public opinion data mining technology - Google Patents

Info

Publication number
CN108021704A
Authority
CN
China
Prior art keywords
data
social public
text
public opinion
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711445217.3A
Other languages
Chinese (zh)
Other versions
CN108021704B (en)
Inventor
孔祥明
杨晓霖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Industry Kaiyuan Science And Technology Co Ltd
Original Assignee
Guangdong Industry Kaiyuan Science And Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Industry Kaiyuan Science And Technology Co Ltd
Priority to CN201711445217.3A
Publication of CN108021704A
Application granted
Publication of CN108021704B
Current legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Primary Health Care (AREA)
  • Educational Administration (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an agent optimal configuration method based on social public opinion data mining technology, comprising the following steps: Step 1, collecting public opinion data using web crawler technology; Step 2, preprocessing the public opinion data, including data cleaning, data integration, data conversion and data reduction; Step 3, establishing a social public opinion confidence model using text mining technology and a support vector machine (SVM) algorithm, and dividing social public opinion information into confidence levels; Step 4, establishing an agent optimization configuration model. Based on real-time public opinion data from the internet, the agent optimization configuration model provided by the invention divides social public opinion into confidence levels using text mining and algorithms such as SVM, then evaluates and predicts the incoming call volume of the 12345 complaint and reporting hotline, so that agents are configured scientifically and reasonably using big data technology.

Description

Agent optimal configuration method based on social public opinion data mining technology
Technical Field
The invention relates to the technical field of databases, in particular to an agent optimal configuration method based on social public sentiment data mining technology.
Background
With the steady development of politics, the economy, culture and society in China, people's awareness of their rights has gradually strengthened and their attention to social affairs has continued to grow, and the 12345 government affairs service hotline has become an effective window through which people report social problems and express their appeals. For the 12345 government affairs service hotline to be effective, reasonable agent scheduling is basic work that cannot be ignored: a reasonable agent configuration is the foundation that allows people to lodge complaints and report problems in time.
Existing agent configuration models set the number of agents only from historical data such as call volume and average processing time. Such models consider too few factors; in particular they ignore social public opinion, which is closely related to the number of complaints from the public, and this easily leads to unreasonable agent configuration. With the high-speed spread of information on the internet, the correlation between real-time internet public opinion data and complaint reporting information keeps strengthening, so mining social public opinion data can provide strong guidance for the optimal configuration of agents.
Disclosure of Invention
In view of the above defects in the prior art, the technical problem to be solved by the present invention is to provide an agent optimal configuration method based on social public opinion data mining technology, which performs confidence level division on social public opinions according to real-time public opinion data of the internet and algorithms such as text mining and SVM, evaluates and predicts telephone incoming call volume and hot-line average processing time of 12345 complaint reporting, and provides effective data support for scientific and reasonable configuration of an agent by using big data technology.
In order to achieve the purpose, the invention provides an agent optimal configuration method based on social public sentiment data mining technology, which comprises the following steps:
step 1, collecting public opinion data by using web crawler technology;
step 2, preprocessing the public opinion data, including data cleaning, data integration, data conversion and data reduction;
step 3, establishing a social public opinion confidence model by using text mining technology and a support vector machine algorithm, and dividing social public opinion information into confidence levels;
step 4, establishing an agent optimization configuration model.
Further, the step2 specifically comprises:
step 21: data cleaning, namely identifying and processing the vacant data, the incomplete data and the unreasonable data;
step 22: data integration, namely organically centralizing and integrating data of different sources, formats and characteristics;
step 23: data conversion, converting the format of the data;
step 24: data reduction, namely simplifying the data while keeping its integrity.
Further, the step3 specifically includes:
step 31: manually labeling, namely randomly extracting texts in a certain proportion, performing labeling classification by a plurality of related professionals, and counting the consistency of corpus labeling according to the labeling result;
step 32: feature selection, which means selecting representative words from the dictionary to reduce dimensionality; the chi-square (CHI) method is used for feature selection; assuming that the feature word $w$ and the category $a_i$ follow a chi-square distribution with one degree of freedom, the chi-square statistic of the feature word $w$ with respect to the category $a_i$ is:
$$\chi^2(w, a_i) = \frac{N\,(ad - bc)^2}{(a + c)(b + d)(a + b)(c + d)}$$
where $N$ is the total number of documents, $a$ is the number of documents that belong to category $a_i$ and contain the term $w$, $b$ is the number of documents that do not belong to category $a_i$ but contain the term $w$, $c$ is the number of documents that belong to category $a_i$ but do not contain the term $w$, and $d$ is the number of documents that neither belong to category $a_i$ nor contain the term $w$;
for the case of multiple categories, the chi-square statistic of the term $w$ is calculated under each category;
if the chi-square statistic of the feature word $w$ and the category $a_i$ is 0, the feature word $w$ and the text category $a_i$ are independent of each other; the larger the chi-square statistic, the stronger the correlation between the feature word $w$ and the category $a_i$; features whose statistic is below a specified threshold are eliminated and features above the threshold are retained, which realizes feature selection;
step 33: feature extraction, namely mapping a high-dimensional space to a low-dimensional space to reduce dimensionality; the LSA (latent semantic analysis) algorithm is used to extract features from the text, with the following main steps:
1) Establishing a word frequency matrix M;
2) Calculating singular value decomposition of a word frequency matrix M, and decomposing the M into three matrixes of U, S and V, wherein U and V are orthogonal matrixes, and S is a diagonal matrix;
3) Mapping other training samples into a U space;
4) Indexing and calculating similarity of the converted documents, and obtaining an LSA classifier through training;
step 34: constructing feature vectors, namely converting each text into an N-dimensional text vector, with the text vectors together forming a text vector space; assuming there are N feature words, each text is represented as an N-dimensional vector;
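step 35: constructing an SVM classifier; let $\{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$ be the training set, where $x_i$ is the input vector and $y_i \in \{-1, 1\}$ is the output label; if the training set can be linearly separated by a hyperplane $w \cdot x + b = 0$, the problem becomes that of finding the optimal hyperplane:
$$\min_{w, b} \ \frac{1}{2}\lVert w \rVert^2 \quad \text{s.t.} \quad y_i (w \cdot x_i + b) \ge 1, \quad i = 1, \ldots, n \qquad (1)$$
if the training set is not linearly separable, the low-dimensional input space $R^n$ can be mapped by a kernel function $K(x_1, x_2)$ to a high-dimensional feature space $H$ in which it becomes linearly separable; a polynomial kernel function is selected, with the formula:
$$K(x_1, x_2) = (\langle x_1, x_2 \rangle + R)^d$$
1) selecting a suitable kernel function $K(x_1, x_2)$ and a penalty coefficient $C > 0$, the objective function is:
$$\min_{\alpha} \ \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j K(x_i, x_j) - \sum_{i=1}^{n} \alpha_i \quad \text{s.t.} \quad \sum_{i=1}^{n} \alpha_i y_i = 0, \quad 0 \le \alpha_i \le C \qquad (2)$$
2) using the SMO algorithm, calculating the vector $\alpha^*$ that minimizes formula (2);
3) calculating $w^*$ by the formula
$$w^* = \sum_{i=1}^{n} \alpha_i^* y_i \phi(x_i)$$
where $\phi$ is the implicit mapping into $H$, so that $K(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle$;
4) finding all samples $(x_m, y_m)$ satisfying $0 < \alpha_m^* < C$, and assuming there are M such support vectors in total;
5) from
$$y_m \left( \sum_{i=1}^{n} \alpha_i^* y_i K(x_i, x_m) + b_m \right) = 1$$
calculating the $b_m$ corresponding to each support vector $(x_m, y_m)$, and finally taking the average
$$b^* = \frac{1}{M} \sum_{m=1}^{M} b_m$$
6) thus the classification hyperplane is
$$\sum_{i=1}^{n} \alpha_i^* y_i K(x, x_i) + b^* = 0$$
and the classification decision function is
$$f(x) = \operatorname{sign}\left( \sum_{i=1}^{n} \alpha_i^* y_i K(x, x_i) + b^* \right)$$
step 36: predicting results with the model.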
Further, the step4 specifically includes:
step 41, according to the classification result of the social public opinion information confidence level in the step3, combining with the recent historical data of 12345 service hotline, drawing an analysis curve related to time, and performing relevance analysis on the number of hotline incoming calls in each day and different time periods and the average processing time by using a relevance analysis algorithm;
step 42, by using a multiple regression analysis algorithm, taking historical data of social public opinion confidence level X1, number X2 of complaints and reports of 12345, work order item type X3 and work order severity X4 as input variables, realizing weight distribution through multiple fitting, and finally constructing a daily hot-line incoming call quantity calculation formula in the following form:
$$F_1(X) = W_1 f_1(X_1) + W_2 f_2(X_2) + W_3 f_3(X_3) + W_4 f_4(X_4)$$
the number of hotline incoming calls in different time periods $F_2(X)$ and the average hotline processing time $F_3(X)$ are calculated in a similar manner;
step 43, constructing an agent optimization configuration model by using a multiple regression analysis algorithm; assuming that the daily hotline incoming call volume is $F_1(X)$, the number of hotline incoming calls in different time periods is $F_2(X)$, the average hotline processing time is $F_3(X)$, the call completion rate is $a$ and the maximum occupancy rate is $b$, the agent optimization configuration function is:
$$G(X, a, b) = U_1 F_1(X) + U_2 F_2(X) + U_3 F_3(X) + U_4\, g(a) + U_5\, h(b)$$
where $U_i$ denotes a weight.
Further, the step3 divides the social public opinion information into five confidence levels: optimistic, cautiously optimistic, neutral, cautiously pessimistic and pessimistic.
The beneficial effects of the invention are:
according to the agent optimal configuration algorithm model based on social public opinion data mining, provided by the invention, the social public opinions are divided into confidence levels according to real-time public opinion data of the Internet and algorithms such as text mining and SVM (support vector machine) and the like, so that the call volume reported by 12345 complaints is estimated and predicted, and the agent is scientifically and reasonably configured by using a big data technology.
The conception, the specific structure and the technical effects of the present invention will be further described with reference to the accompanying drawings to fully understand the objects, the features and the effects of the present invention.
Drawings
FIG. 1 is an overall flow chart of the present invention.
Fig. 2 is a flow chart of establishing a social public opinion confidence model according to the present invention.
Fig. 3 is a flow chart of agent optimization configuration model establishment according to the present invention.
Detailed Description
As shown in fig. 1, the method for optimal configuration of an agent based on social public sentiment data mining technology of the present invention specifically comprises the following operation steps:
Step one: public opinion data collection
Social public opinion data on the Internet is collected using web crawler technology, for example by regularly crawling social media such as news websites, microblogs, forums and blogs. The collected data is mainly unstructured and consists chiefly of text.
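As a minimal sketch of what a crawler does with each page once it has been downloaded, the following Python code pulls the visible text out of an HTML document using only the standard library. Downloading, scheduling and site-specific parsing are deliberately left out; the class and function names are hypothetical and are not taken from the patent.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def extract_text(html):
    """Return the visible text of one crawled page as a single string."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

The extracted string is the unstructured text that the later preprocessing and text mining steps operate on.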
Step two: public opinion data preprocessing
The crawled public opinion data is preprocessed through data cleaning, data integration, data conversion and data reduction.
Step1: data cleaning: vacant, incomplete and unreasonable data are identified and processed to ensure the integrity, reasonableness, authority and consistency of the data.
Step2: data integration: data of different sources, formats and characteristics are brought together and integrated.
Step3: data conversion: the data format is converted to facilitate subsequent analysis and mining.
Step4: data reduction: the data is simplified, while keeping its integrity as far as possible, using common dimensionality reduction methods such as PCA (principal component analysis).
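The four preprocessing steps can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the source names, field names and timestamp formats are hypothetical, and the reduction step here only drops duplicate texts and unused fields rather than applying PCA.

```python
from datetime import datetime

# Hypothetical per-source field names mapped onto one shared schema.
FIELD_MAP = {
    "news": {"title": "title", "body": "text", "published": "time"},
    "forum": {"subject": "title", "content": "text", "posted_at": "time"},
}
# Assumed timestamp formats; real sources would need their own list.
TIME_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y/%m/%d %H:%M", "%Y-%m-%d")


def integrate(source, raw):
    """Data integration: rename source-specific fields to the shared schema."""
    mapping = FIELD_MAP[source]
    record = {target: raw.get(field) for field, target in mapping.items()}
    record["source"] = source
    return record


def convert_time(value):
    """Data conversion: parse a known timestamp format into an ISO date."""
    for fmt in TIME_FORMATS:
        try:
            return datetime.strptime(value, fmt).date().isoformat()
        except (TypeError, ValueError):
            continue
    return None


def preprocess(items):
    """items: iterable of (source, raw_dict); returns unified records."""
    seen = set()
    result = []
    for source, raw in items:
        record = integrate(source, raw)
        text = (record.get("text") or "").strip()
        day = convert_time(record.get("time"))
        # Data cleaning: drop vacant, incomplete or unparseable records.
        if not text or day is None:
            continue
        # Data reduction (simplified): skip duplicate texts, keep needed fields.
        if text in seen:
            continue
        seen.add(text)
        result.append({"source": source, "date": day, "text": text})
    return result
```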
Step three: the social public opinion confidence model, as shown in fig. 2:
A social public opinion confidence model is established using text mining technology and a support vector machine algorithm, and public opinion information is divided into five confidence levels: optimistic, cautiously optimistic, neutral, cautiously pessimistic and pessimistic.
Step1: manual labeling: a certain proportion of texts is randomly sampled and labeled by several relevant professionals; the consistency of the corpus labeling is computed from the labeling results, and labels that pass the consistency check are used for information classification.
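One simple way to check labeling consistency is to keep a text only when a large enough share of annotators agree on its label. The sketch below assumes a majority-share rule with a threshold of two thirds; the source does not say which consistency measure or threshold is used, so both are assumptions, as are the function names.

```python
from collections import Counter


def majority_label(labels):
    """Return (label, share) for the most frequent label among annotators."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


def accept_annotations(annotations, min_share=2 / 3):
    """annotations: {text_id: [label from each annotator]}.

    Keeps a text only when its majority label reaches min_share
    (an assumed threshold; the source does not state one).
    """
    accepted = {}
    for text_id, labels in annotations.items():
        label, share = majority_label(labels)
        if share >= min_share:
            accepted[text_id] = label
    return accepted
```

Texts on which the annotators disagree too much are left out of the training corpus rather than assigned an arbitrary label.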
Step2: feature selection: feature selection means selecting representative words from the dictionary to reduce dimensionality.
The chi-square (CHI) method is used for feature selection. Assuming that the feature word $w$ and the category $a_i$ follow a chi-square distribution with one degree of freedom, the chi-square statistic of the feature word $w$ with respect to the category $a_i$ is:
$$\chi^2(w, a_i) = \frac{N\,(ad - bc)^2}{(a + c)(b + d)(a + b)(c + d)}$$
where $N$ is the total number of documents, $a$ is the number of documents that belong to category $a_i$ and contain the term $w$, $b$ is the number of documents that do not belong to category $a_i$ but contain the term $w$, $c$ is the number of documents that belong to category $a_i$ but do not contain the term $w$, and $d$ is the number of documents that neither belong to category $a_i$ nor contain the term $w$.
For the case of multiple categories, the chi-square statistic of the term $w$ must be computed under each category.
If the chi-square statistic of the feature word $w$ and the category $a_i$ is 0, the feature word $w$ and the text category $a_i$ are independent of each other; the larger the chi-square statistic, the stronger the correlation between the feature word $w$ and the category $a_i$. Features whose statistic is below a specified threshold are removed and features above the threshold are retained, which realizes feature selection.
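The chi-square statistic can be computed directly from the four document counts a, b, c and d defined above. The sketch below uses the standard 2x2 form of the statistic; the function names and the example threshold of 3.84 (the 95% critical value for one degree of freedom) are illustrative choices, not values given in the source.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic of a term and a category from a 2x2 count table.

    a: in category, contains term     b: not in category, contains term
    c: in category, lacks term        d: not in category, lacks term
    """
    n = a + b + c + d
    denominator = (a + c) * (b + d) * (a + b) * (c + d)
    if denominator == 0:
        # A row or column is empty, so no association can be measured.
        return 0.0
    return n * (a * d - b * c) ** 2 / denominator


def select_features(counts, threshold):
    """counts: {word: (a, b, c, d)}; keep words scoring above the threshold."""
    return sorted(w for w, cells in counts.items() if chi_square(*cells) > threshold)
```

For example, a word that appears in all 20 documents of a category and in none of the 20 others scores 40.0, while a word spread evenly across both groups scores 0.0 and is removed.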
Step3: feature extraction: dimensionality is reduced by mapping a high-dimensional space to a low-dimensional space.
The LSA (latent semantic analysis) algorithm is used to extract features from the text, with the following main steps:
1) Establishing a word frequency matrix M;
2) Calculating singular value decomposition of a word frequency matrix M, and decomposing the M into three matrixes of U, S and V, wherein U and V are orthogonal matrixes, and S is a diagonal matrix;
3) Mapping other training samples into a U space;
4) And indexing and calculating the similarity of the converted documents, and obtaining the LSA classifier through training.
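The LSA steps above can be written compactly as follows. This is a sketch under assumptions the source does not state: rows of M are taken to be terms and columns documents, and only the k largest singular values are kept, where k is a chosen rank.

```latex
% Truncated singular value decomposition of the word frequency matrix M (terms x documents)
M \approx U_k S_k V_k^{\mathsf{T}}, \qquad
U_k^{\mathsf{T}} U_k = I, \quad V_k^{\mathsf{T}} V_k = I, \quad
S_k = \operatorname{diag}(\sigma_1, \ldots, \sigma_k), \ \sigma_1 \ge \cdots \ge \sigma_k > 0

% Mapping a further document, given as a term-frequency vector q, into the reduced space
\hat{q} = S_k^{-1} U_k^{\mathsf{T}} q

% Similarity between two mapped documents (cosine similarity)
\operatorname{sim}(\hat{q}_1, \hat{q}_2) =
\frac{\langle \hat{q}_1, \hat{q}_2 \rangle}{\lVert \hat{q}_1 \rVert \, \lVert \hat{q}_2 \rVert}
```

Each document is then described by k coordinates instead of one coordinate per dictionary word.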
Step4: constructing feature vectors: each text is converted into an N-dimensional text vector, and the text vectors together form a text vector space. Assuming there are N feature words, each text is represented as an N-dimensional vector.
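A minimal sketch of this conversion, assuming the text has already been split into tokens and that each vector component is the raw count of one feature word (the source does not specify the weighting scheme):

```python
def text_to_vector(tokens, feature_words):
    """Map a tokenized text to an N-dimensional count vector.

    feature_words fixes the order of the N dimensions; tokens that are
    not feature words are ignored.
    """
    counts = {}
    for token in tokens:
        counts[token] = counts.get(token, 0) + 1
    return [counts.get(word, 0) for word in feature_words]
```

With feature words `["outage", "noise", "water"]`, the tokens `["water", "outage", "water", "today"]` become the vector `[1, 0, 2]`.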
Step5: constructing an SVM classifier: let $\{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$ be the training set, where $x_i$ is the input vector and $y_i \in \{-1, 1\}$ is the output label. If the training set can be linearly separated by a hyperplane $w \cdot x + b = 0$, the problem becomes that of finding the optimal hyperplane:
$$\min_{w, b} \ \frac{1}{2}\lVert w \rVert^2 \quad \text{s.t.} \quad y_i (w \cdot x_i + b) \ge 1, \quad i = 1, \ldots, n \qquad (1)$$
If the training set is not linearly separable, the low-dimensional input space $R^n$ can be mapped by a kernel function $K(x_1, x_2)$ to a high-dimensional feature space $H$ in which it becomes linearly separable.
The kernel function is the inner product of two vectors in the space reached by the implicit mapping. Common kernel functions include the polynomial, linear and Gaussian kernel functions; the polynomial kernel function is selected here, with the formula:
$$K(x_1, x_2) = (\langle x_1, x_2 \rangle + R)^d$$
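The polynomial kernel is simple to compute: take the inner product of the two vectors, add the constant R and raise the result to the power d. The sketch below also checks, for the two-dimensional case with R = 0 and d = 2, that the kernel equals an ordinary inner product after an explicit mapping; the function names and default parameter values are illustrative.

```python
import math


def polynomial_kernel(x1, x2, r=1.0, d=2):
    """K(x1, x2) = (<x1, x2> + r) ** d for two equal-length vectors."""
    inner = sum(p * q for p, q in zip(x1, x2))
    return (inner + r) ** d


def phi_degree2(x):
    """Explicit feature map matching the kernel for 2-D inputs, r = 0, d = 2."""
    return [x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2]


x, z = [1.0, 2.0], [3.0, 4.0]
kernel_value = polynomial_kernel(x, z, r=0.0, d=2)
mapped_value = sum(p * q for p, q in zip(phi_degree2(x), phi_degree2(z)))
```

Both `kernel_value` and `mapped_value` come out at 121 (up to floating-point rounding), which is why the classifier never needs to build the high-dimensional vectors explicitly.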
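1) Selecting a suitable kernel function $K(x_1, x_2)$ and a penalty coefficient $C > 0$, the objective function is:
$$\min_{\alpha} \ \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j K(x_i, x_j) - \sum_{i=1}^{n} \alpha_i \quad \text{s.t.} \quad \sum_{i=1}^{n} \alpha_i y_i = 0, \quad 0 \le \alpha_i \le C \qquad (2)$$
2) Using the SMO algorithm, calculate the vector $\alpha^*$ that minimizes formula (2);
3) Calculate $w^*$ by the formula
$$w^* = \sum_{i=1}^{n} \alpha_i^* y_i \phi(x_i)$$
where $\phi$ is the implicit mapping into $H$, so that $K(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle$;
4) Find all samples $(x_m, y_m)$ satisfying $0 < \alpha_m^* < C$, and assume there are M such support vectors in total;
5) From
$$y_m \left( \sum_{i=1}^{n} \alpha_i^* y_i K(x_i, x_m) + b_m \right) = 1$$
calculate the $b_m$ corresponding to each support vector $(x_m, y_m)$, and finally take the average
$$b^* = \frac{1}{M} \sum_{m=1}^{M} b_m$$
6) Thus the classification hyperplane is
$$\sum_{i=1}^{n} \alpha_i^* y_i K(x, x_i) + b^* = 0$$
and the classification decision function is
$$f(x) = \operatorname{sign}\left( \sum_{i=1}^{n} \alpha_i^* y_i K(x, x_i) + b^* \right)$$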
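Steps 4) to 6) can be sketched in code once the multipliers are known. The sketch below does not implement SMO: it assumes the multipliers have already been obtained from some solver, and it uses the standard kernel SVM expressions for the bias and the decision function. The function names are hypothetical, and the kernel defaults to R = 0 and d = 1 (a linear kernel) only so that the small example can be checked by hand.

```python
def poly_kernel(x1, x2, r=0.0, d=1):
    """Polynomial kernel; with r=0 and d=1 it reduces to the inner product."""
    return (sum(p * q for p, q in zip(x1, x2)) + r) ** d


def bias_from_support_vectors(alphas, xs, ys, c, kernel=poly_kernel):
    """Average b over all support vectors with 0 < alpha < C."""
    biases = []
    for a_s, x_s, y_s in zip(alphas, xs, ys):
        if 0 < a_s < c:
            g = sum(a * y * kernel(x, x_s) for a, x, y in zip(alphas, xs, ys))
            # y_s * (g + b) = 1 and y_s is +1 or -1, hence b = y_s - g.
            biases.append(y_s - g)
    if not biases:
        raise ValueError("no support vector with 0 < alpha < C")
    return sum(biases) / len(biases)


def decide(x, alphas, xs, ys, b, kernel=poly_kernel):
    """Classification decision: the sign of the kernel expansion plus bias."""
    score = sum(a * y * kernel(xi, x) for a, xi, y in zip(alphas, xs, ys)) + b
    return 1 if score >= 0 else -1
```

For the two training points x = 1 (label +1) and x = -1 (label -1), the dual solution is 0.5 for both multipliers, which gives a bias of 0 and a decision boundary at x = 0.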
Step6: model prediction
The SVM parameters and model trained in Step5 are used to classify unlabeled social public opinion data by confidence level.
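The SVM described above separates two classes, while the model assigns one of five confidence levels. The source does not say how the two are reconciled; one common approach, assumed here purely for illustration, is one-vs-rest: train one scorer per level and pick the level with the highest score. The names below are hypothetical.

```python
# English labels for the five confidence levels described in the text.
LEVELS = [
    "optimistic",
    "cautiously optimistic",
    "neutral",
    "cautiously pessimistic",
    "pessimistic",
]


def classify_level(x, scorers):
    """Pick the level whose one-vs-rest scorer gives the highest value.

    scorers: {level: function mapping a feature vector to a signed score},
    for example the SVM decision value before taking the sign.
    """
    return max(LEVELS, key=lambda level: scorers[level](x))
```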
Step four: the agent optimization configuration model, as shown in FIG. 3:
Step1: according to the social public opinion confidence level classification result of step three, combined with recent historical data of the 12345 service hotline, time-related analysis curves are drawn, and a correlation analysis algorithm is used to analyze the correlation with the number of hotline incoming calls per day and in different time periods and with the average processing time.
Step2: using a multiple regression analysis algorithm, historical data such as the social public opinion confidence level X1, the number of 12345 complaints and reports X2, the work order item type X3 and the work order severity X4 are taken as input variables, weights are assigned through repeated fitting, and a formula for the daily hotline incoming call volume is finally constructed in the following form:
$$F_1(X) = W_1 f_1(X_1) + W_2 f_2(X_2) + W_3 f_3(X_3) + W_4 f_4(X_4)$$
The number of hotline incoming calls in different time periods $F_2(X)$ and the average hotline processing time $F_3(X)$ are calculated in a similar manner.
Step3: an agent optimization configuration model is constructed using a multiple regression analysis algorithm. Assuming that the daily hotline incoming call volume is $F_1(X)$, the number of hotline incoming calls in different time periods is $F_2(X)$, the average hotline processing time is $F_3(X)$, the call completion rate is $a$ and the maximum occupancy rate is $b$, the agent optimization configuration function is:
$$G(X, a, b) = U_1 F_1(X) + U_2 F_2(X) + U_3 F_3(X) + U_4\, g(a) + U_5\, h(b)$$
where $U_i$ denotes a weight.
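Both formulas are weighted sums, so evaluating them is straightforward once the weights and transforms are known. The sketch below only evaluates the two expressions; the transforms f, g and h and all weights are placeholders, since the source obtains them by regression fitting and does not give their form or values.

```python
def weighted_sum(weights, terms):
    """Sum of weight * term over two equal-length sequences."""
    if len(weights) != len(terms):
        raise ValueError("weights and terms must have the same length")
    return sum(w * t for w, t in zip(weights, terms))


def daily_call_volume(x, w, f):
    """F1(X) = W1*f1(X1) + W2*f2(X2) + W3*f3(X3) + W4*f4(X4)."""
    return weighted_sum(w, [fk(xk) for fk, xk in zip(f, x)])


def agent_configuration(f_values, a, b, u, g, h):
    """G(X, a, b) = U1*F1 + U2*F2 + U3*F3 + U4*g(a) + U5*h(b)."""
    f1, f2, f3 = f_values
    return weighted_sum(u, [f1, f2, f3, g(a), h(b)])


# Placeholder inputs: identity transforms and made-up weights.
identity = lambda v: v
f1 = daily_call_volume([2, 100, 3, 1], [10, 0.5, 1, 2], [identity] * 4)
score = agent_configuration((f1, 10, 4), a=0.5, b=0.25, u=[1, 1, 1, 2, 4], g=identity, h=identity)
```

With these placeholder numbers `f1` is 75 and `score` is 91; in practice the weights would come from fitting against the hotline's historical data.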
According to the agent optimal configuration algorithm model based on social public opinion data mining, provided by the invention, the social public opinions are divided into confidence levels according to real-time public opinion data of the Internet and algorithms such as text mining and SVM (support vector machine) and the like, so that the call volume reported by 12345 complaints is estimated and predicted, and the agent is scientifically and reasonably configured by using a big data technology.
The foregoing detailed description of the preferred embodiments of the invention has been presented. It should be understood that numerous modifications and variations could be devised by those skilled in the art in light of the present teachings without departing from the inventive concepts. Therefore, the technical solutions available to those skilled in the art through logic analysis, reasoning and limited experiments based on the prior art according to the concept of the present invention should be within the scope of protection defined by the claims.

Claims (5)

1. A method for optimally configuring an agent based on a social public sentiment data mining technology is characterized by comprising the following steps:
step 1, collecting public opinion data by using web crawler technology;
step 2, preprocessing the public opinion data, including data cleaning, data integration, data conversion and data reduction;
step 3, establishing a social public opinion confidence model by using text mining technology and a support vector machine algorithm, and dividing social public opinion information into confidence levels;
step 4, establishing an agent optimization configuration model.
2. The method for optimal configuration of the agent based on the social public opinion data mining technology as claimed in claim 1, wherein the step2 specifically comprises:
step 21: data cleaning, namely identifying and processing vacant data, incomplete data and unreasonable data;
step 22: data integration, namely organically centralizing and integrating data of different sources, formats and characteristics;
step 23: data conversion, converting the format of the data;
step 24: data reduction, namely simplifying the data while keeping its integrity.
3. The method for optimal configuration of the agent based on the social public opinion data mining technology as claimed in claim 1, wherein the step3 specifically comprises:
step 31: manually labeling, namely randomly extracting texts in a certain proportion, performing labeling classification by a plurality of related professionals, and counting the consistency of corpus labeling according to the labeling result;
step 32: feature selection, which means selecting representative words from the dictionary to reduce dimensionality; the chi-square (CHI) method is used for feature selection; assuming that the feature word $w$ and the category $a_i$ follow a chi-square distribution with one degree of freedom, the chi-square statistic of the feature word $w$ with respect to the category $a_i$ is:
$$\chi^2(w, a_i) = \frac{N\,(ad - bc)^2}{(a + c)(b + d)(a + b)(c + d)}$$
where $N$ is the total number of documents, $a$ is the number of documents that belong to category $a_i$ and contain the term $w$, $b$ is the number of documents that do not belong to category $a_i$ but contain the term $w$, $c$ is the number of documents that belong to category $a_i$ but do not contain the term $w$, and $d$ is the number of documents that neither belong to category $a_i$ nor contain the term $w$;
for the case of multiple categories, the chi-square statistic of the term $w$ must be calculated under each category;
if the chi-square statistic of the feature word $w$ and the category $a_i$ is 0, the feature word $w$ and the text category $a_i$ are independent of each other; the larger the chi-square statistic, the stronger the correlation between the feature word $w$ and the category $a_i$; features whose statistic is below a specified threshold are removed and features above the threshold are retained, which realizes feature selection;
step 33: feature extraction, namely mapping a high-dimensional space to a low-dimensional space to reduce dimensionality; the LSA (latent semantic analysis) algorithm is used to extract features from the text, with the following main steps:
1) Establishing a word frequency matrix M;
2) Calculating singular value decomposition of a word frequency matrix M, and decomposing the M into three matrixes of U, S and V, wherein U and V are orthogonal matrixes, and S is a diagonal matrix;
3) Mapping other training samples into a U space;
4) Indexing and calculating similarity of the converted documents, and obtaining an LSA classifier through training;
step 34: constructing feature vectors, namely converting each text into an N-dimensional text vector, with the text vectors together forming a text vector space; assuming there are N feature words, each text is represented as an N-dimensional vector;
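step 35: constructing an SVM classifier; let $\{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$ be the training set, where $x_i$ is the input vector and $y_i \in \{-1, 1\}$ is the output label; if the training set can be linearly separated by a hyperplane $w \cdot x + b = 0$, the problem becomes that of finding the optimal hyperplane:
$$\min_{w, b} \ \frac{1}{2}\lVert w \rVert^2 \quad \text{s.t.} \quad y_i (w \cdot x_i + b) \ge 1, \quad i = 1, \ldots, n \qquad (1)$$
if the training set is not linearly separable, the low-dimensional input space $R^n$ can be mapped by a kernel function $K(x_1, x_2)$ to a high-dimensional feature space $H$ in which it becomes linearly separable; a polynomial kernel function is selected, with the formula:
$$K(x_1, x_2) = (\langle x_1, x_2 \rangle + R)^d$$
1) selecting a suitable kernel function $K(x_1, x_2)$ and a penalty coefficient $C > 0$, the objective function is:
$$\min_{\alpha} \ \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j K(x_i, x_j) - \sum_{i=1}^{n} \alpha_i \quad \text{s.t.} \quad \sum_{i=1}^{n} \alpha_i y_i = 0, \quad 0 \le \alpha_i \le C \qquad (2)$$
2) using the SMO algorithm, calculating the vector $\alpha^*$ that minimizes formula (2);
3) calculating $w^*$ by the formula
$$w^* = \sum_{i=1}^{n} \alpha_i^* y_i \phi(x_i)$$
where $\phi$ is the implicit mapping into $H$, so that $K(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle$;
4) finding all samples $(x_m, y_m)$ satisfying $0 < \alpha_m^* < C$, and assuming there are M such support vectors in total;
5) from
$$y_m \left( \sum_{i=1}^{n} \alpha_i^* y_i K(x_i, x_m) + b_m \right) = 1$$
calculating the $b_m$ corresponding to each support vector $(x_m, y_m)$, and obtaining the average
$$b^* = \frac{1}{M} \sum_{m=1}^{M} b_m$$
6) thus the classification hyperplane is
$$\sum_{i=1}^{n} \alpha_i^* y_i K(x, x_i) + b^* = 0$$
and the classification decision function is
$$f(x) = \operatorname{sign}\left( \sum_{i=1}^{n} \alpha_i^* y_i K(x, x_i) + b^* \right)$$
step 36: predicting results with the model.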
4. The method for optimal configuration of the agent based on the social public opinion data mining technology as claimed in claim 1, wherein the step4 specifically comprises:
step 41, drawing an analysis curve related to time according to the classification result of the confidence level of the social public opinion information in step3 and combining with the recent historical data of 12345 service hotlines, and performing relevance analysis on the number of hotline incoming calls in each day and different time periods and the average processing time by using a relevance analysis algorithm;
step 42, by using a multiple regression analysis algorithm, taking historical data of social public opinion confidence level X1, complaint reporting number X2 of 12345, work order item type X3 and work order severity X4 as input variables, realizing weight distribution through multiple fitting, and finally constructing a daily hot line incoming call amount calculation formula in the following form:
$$F_1(X) = W_1 f_1(X_1) + W_2 f_2(X_2) + W_3 f_3(X_3) + W_4 f_4(X_4)$$
the number of hotline incoming calls in different time periods $F_2(X)$ and the average hotline processing time $F_3(X)$ are calculated in a similar manner;
step 43: constructing an agent optimal configuration model by using a multiple regression analysis algorithm; assuming that the daily hotline incoming call volume is $F_1(X)$, the number of hotline incoming calls in different time periods is $F_2(X)$, the average hotline processing time is $F_3(X)$, the call completion rate is $a$ and the maximum occupancy rate is $b$, the agent optimal configuration function is:
$$G(X, a, b) = U_1 F_1(X) + U_2 F_2(X) + U_3 F_3(X) + U_4 g(a) + U_5 h(b)$$
where the $U_i$ denote weights.
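The weight fitting for the daily call volume formula and the evaluation of G(X, a, b) can be sketched in Python as follows. This is only an illustrative sketch: the claims do not specify the functions f1 to f4, g and h, so they are assumed here to be the identity; ordinary least squares via the normal equations stands in for the unspecified "multiple fitting"; the work order item type X3 is assumed to be numerically coded; and the history rows, weights and function names are all made up.

```python
from typing import Callable, List, Sequence


def solve_linear(a: List[List[float]], rhs: List[float]) -> List[float]:
    """Solve a @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [v] for row, v in zip(a, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: inputs are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_weights(rows: List[List[float]], targets: List[float]) -> List[float]:
    """Least-squares weights W such that targets[i] ~ sum_k W[k] * rows[i][k]."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear(xtx, xty)


def predict(weights: Sequence[float], features: Sequence[float]) -> float:
    """F(X) = W1*X1 + W2*X2 + W3*X3 + W4*X4 (each f_k assumed to be the identity)."""
    return sum(w * f for w, f in zip(weights, features))


def agent_config(u: Sequence[float], f1: float, f2: float, f3: float,
                 a: float, b: float,
                 g: Callable[[float], float] = lambda v: v,
                 h: Callable[[float], float] = lambda v: v) -> float:
    """G(X, a, b) = U1*F1 + U2*F2 + U3*F3 + U4*g(a) + U5*h(b)."""
    return u[0] * f1 + u[1] * f2 + u[2] * f3 + u[3] * g(a) + u[4] * h(b)


# Made-up history: (X1 confidence level, X2 complaints in hundreds,
# X3 work order type code, X4 severity) for six days.
history = [
    [3.0, 1.20, 2.0, 1.0],
    [1.0, 0.80, 1.0, 2.0],
    [5.0, 2.00, 3.0, 3.0],
    [2.0, 0.95, 2.0, 1.0],
    [4.0, 1.50, 1.0, 2.0],
    [3.0, 1.10, 3.0, 3.0],
]
true_w = [20.0, 100.0, 5.0, 30.0]  # hypothetical weights used to generate targets
daily_calls = [predict(true_w, row) for row in history]

w = fit_weights(history, daily_calls)          # recovers true_w up to rounding
f1_today = predict(w, [4.0, 1.30, 2.0, 2.0])   # estimated daily call volume
score = agent_config([0.5, 0.3, 0.2, 1.0, 1.0], f1_today,
                     f2=20.0, f3=5.0, a=0.9, b=0.8)
```

Because the synthetic targets are generated from known weights, the fit can be checked against them; with real hotline data the residuals of the fit would indicate whether the identity choice for the f functions is adequate.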
5. The method as claimed in claim 1, wherein step 3 divides the social public opinion information into five confidence levels: optimistic, cautiously optimistic, neutral, cautiously pessimistic, and pessimistic.
CN201711445217.3A 2017-12-27 2017-12-27 Agent optimal configuration method based on social public opinion data mining technology Active CN108021704B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711445217.3A CN108021704B (en) 2017-12-27 2017-12-27 Agent optimal configuration method based on social public opinion data mining technology

Publications (2)

Publication Number Publication Date
CN108021704A (en) 2018-05-11
CN108021704B (en) 2021-05-04

Family

ID=62071068

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711445217.3A Active CN108021704B (en) 2017-12-27 2017-12-27 Agent optimal configuration method based on social public opinion data mining technology

Country Status (1)

Country Link
CN (1) CN108021704B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109871889A (en) * 2019-01-31 2019-06-11 内蒙古工业大学 Mass psychology appraisal procedure under emergency event
WO2021103492A1 (en) * 2019-11-28 2021-06-03 福建亿榕信息技术有限公司 Risk prediction method and system for business operations
CN115048487A (en) * 2022-05-30 2022-09-13 平安科技(深圳)有限公司 Artificial intelligence-based public opinion analysis method, device, computer equipment and medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070033188A1 (en) * 2005-08-05 2007-02-08 Ori Levy Method and system for extracting web data
CN103970864A (en) * 2014-05-08 2014-08-06 清华大学 Emotion classification and emotion component analyzing method and system based on microblog texts
CN104113643A (en) * 2014-06-27 2014-10-22 国家电网公司 Customer service center on-site monitoring system and method
US9141966B2 (en) * 2009-12-23 2015-09-22 Yahoo! Inc. Opinion aggregation system
US9536191B1 (en) * 2015-11-25 2017-01-03 Osaro, Inc. Reinforcement learning using confidence scores
CN106530127A (en) * 2016-11-09 2017-03-22 国网江苏省电力公司南京供电公司 Complaint early warning and monitoring analysis system based on text mining
CN106791225A (en) * 2017-03-23 2017-05-31 国家电网公司客户服务中心 A kind of alarm method and device
US20170364834A1 (en) * 2011-06-14 2017-12-21 Microsoft Technology Licensing, Llc Real-time monitoring of public sentiment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
赵纪涛 et al.: "Network public opinion analysis model based on data mining", 《现代计算机》 (Modern Computer) *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109871889A (en) * 2019-01-31 2019-06-11 内蒙古工业大学 Mass psychology appraisal procedure under emergency event
CN109871889B (en) * 2019-01-31 2019-12-24 内蒙古工业大学 Public psychological assessment method under emergency
WO2021103492A1 (en) * 2019-11-28 2021-06-03 福建亿榕信息技术有限公司 Risk prediction method and system for business operations
CN115048487A (en) * 2022-05-30 2022-09-13 平安科技(深圳)有限公司 Artificial intelligence-based public opinion analysis method, device, computer equipment and medium
CN115048487B (en) * 2022-05-30 2024-05-03 平安科技(深圳)有限公司 Public opinion analysis method, device, computer equipment and medium based on artificial intelligence

Also Published As

Publication number Publication date
CN108021704B (en) 2021-05-04

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant