CN105447128A - Method for predicting spread range of microblog public opinions - Google Patents
Method for predicting spread range of microblog public opinions
- Publication number
- CN105447128A
- Authority
- CN
- China
- Prior art keywords
- node
- microblog
- public sentiment
- network
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/958—Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to the field of social network modeling and analysis, and specifically to a method for predicting the spread range of microblog public opinion. The method comprises the following steps: step 1), constructing a propagation network model of the microblog system; step 2), selecting sentinel nodes in the microblog propagation network for judging the public opinion coverage; step 3), using the sentinel monitoring nodes to establish a prediction model of the spread range of microblog public opinion; and step 4), performing empirical statistical analysis on event-related public opinion in an actual microblog network to determine the key parameters of the prediction model. An empirical statistical experiment is performed in a social network of 500 to 1,000 microblog nodes to obtain the important parameters of the prediction model; a custom-written microblog crawler is then used to collect and statistically process social network data with 5,000 to 10,000 nodes; a microblog network of this scale is used to verify the accuracy of the prediction method, and the experimental results show a prediction accuracy of about 83.2%.
Description
Technical field
The present invention relates to the field of social network modeling and analysis, and specifically to a method for predicting the spread range of microblog public opinion.
Background technology
Microblogging has become one of the most important new media platforms in modern society. Compared with traditional media, it is timely, fragmented, free and open, and widely used. However, anyone can use a microblog to publish harmful views and comments, which can spread rapidly through the whole social network as others forward and comment on them. Some fraudulent statements can undermine public safety and, in serious cases, trigger mass incidents. Government departments therefore need to analyze, monitor, and predict public opinion information on microblogs in order to prepare for further management and control.
Existing monitoring and analysis of Internet public opinion information mainly addresses two problems. The first is the difficulty of manually processing massive amounts of information: automatic public opinion analysis systems based on computer text analysis and machine learning have been proposed to reduce the manual labor involved in monitoring online public opinion. The second is the accuracy of public opinion discovery: improving and optimizing methods such as text analysis and clustering algorithms raises the accuracy of mining public opinion semantics from text.
A search of the prior art literature found Chinese patent publication No. CN101661513B, titled "Detection method of network hotspot and public sentiment". That scheme provides a method for detecting network hotspots and public opinion in the field of network information processing, and it can be applied to the detection and analysis of microblog public opinion. It collects microblog posts and comments within a certain time range, performs word segmentation and concept mapping on their text content to eliminate the uncertainty of semantic concepts, and finally extracts features that reflect the text content. These content features are then clustered to form several sets of information documents of differing sizes; whether a hot event exists in the network is determined from the number of documents in each set; and the document set of a hot event is analyzed for positive or negative tendency, so as to grasp netizens' opinions on the event and thereby detect microblog public opinion.
Existing methods for monitoring and analyzing microblogs focus on automating the analysis process and on identifying public opinion information. They ignore how public opinion propagates across the whole online social network, so they cannot tell public opinion managers how far an opinion has already spread; that is, they cannot judge the diffusion of public opinion about a given event. The present invention detects and analyzes microblog public opinion propagation from the perspective of the social network as a whole, and proposes a method for predicting the spread of microblog public opinion that judges the spread by monitoring the information held by sentinel nodes.
Summary of the invention
The object of the present invention is to solve the above problems by providing a method for predicting the spread range of microblog public opinion. The method builds a nonlinear model from actual statistical data, determines the coverage of microblog public opinion by monitoring the state of sentinel nodes according to the nature of the public opinion event, and provides network public opinion managers with accurate quantitative data on public opinion propagation.
The technical solution adopted by the present invention to solve the above problems is:
A method for predicting the spread range of microblog public opinion, carried out in the following order:
1) Build the propagation network model of the microblog system: each microblog user is treated as a node, and edges between nodes are established according to the follower, following, and friend relationships on the microblog, forming a complex online social network model. The public opinion spread range is taken to be the public opinion message coverage rate;
2) Select, in the microblog propagation network, the sentinel nodes used to judge the public opinion coverage;
3) Use the sentinel monitoring nodes to establish a prediction model of the microblog public opinion spread range;
4) Carry out empirical statistical analysis of event-related public opinion in an actual microblog network, and determine the key parameters of the prediction model.
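Step 1) can be illustrated with a minimal Python sketch. It assumes the relations are available as hypothetical `(follower, followee)` pairs and that a message posted by a user can reach that user's followers; the function name `build_network` and the sample data are illustrative only and do not come from the patent.

```python
def build_network(relations):
    """Build a directed graph from (follower, followee) pairs.

    Assumption: an edge u -> v means a message posted by u can reach v,
    i.e. v follows u. A mutual (friend) relation is two opposite pairs.
    """
    graph = {}
    for follower, followee in relations:
        # Information flows from the followed user to the follower.
        graph.setdefault(followee, set()).add(follower)
        # Make sure users who are never followed still appear as nodes.
        graph.setdefault(follower, set())
    return graph


# Made-up example: b and c follow a, a follows b, d follows c.
relations = [("b", "a"), ("c", "a"), ("a", "b"), ("d", "c")]
network = build_network(relations)
```

A dictionary of sets keeps the sketch dependency-free; a graph library could be substituted without changing the idea.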
Preferably, the public opinion message coverage rate in 1) is the ratio of the set of nodes that know the message to the whole node set:

$$O = \frac{\text{number of nodes that know the message}}{|V|}$$

where $|V|$ is the total number of nodes; note that "all nodes" refers to the total number of nodes within the range of valid users in the microblog network.
The message propagation process is a time series $T = \{t_1, t_2, \ldots, t_i, t_{i+1}, \ldots\}$, and the information coverage rate at monitoring time $t_k$ is $O_k$, namely

$$O_k = \frac{\text{number of nodes that know the message at time } t_k}{|V|}$$
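As a small worked example of the coverage rate at successive monitoring times, the following Python sketch assumes the first arrival time of the message at each informed node is already known. The name `coverage_series` and the numbers are hypothetical.

```python
def coverage_series(arrivals, total_nodes, monitor_times):
    """Return the coverage rate O_k for each monitoring time t_k.

    arrivals: dict mapping node id -> time the node first knew the message
              (nodes that never received it are simply absent).
    total_nodes: |V|, the number of valid users in the network.
    """
    if total_nodes <= 0:
        raise ValueError("total_nodes must be positive")
    series = []
    for t_k in monitor_times:
        # Count the nodes that already know the message at time t_k.
        informed = sum(1 for t in arrivals.values() if t <= t_k)
        series.append(informed / total_nodes)
    return series


# Made-up data: 4 valid users, 3 of whom eventually receive the message.
arrivals = {"a": 0, "b": 2, "c": 5}
print(coverage_series(arrivals, total_nodes=4, monitor_times=[1, 3, 6]))
# prints [0.25, 0.5, 0.75]
```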
Preferably, the prediction model in 3) converts the problem of predicting the information coverage rate from sentinel nodes in the microblog network into predicting $O_k$ from the events formed by $V_k$: the relationship between the node subset $V_k$ and the coverage rate $O$ is studied to build the prediction model, and the information coverage rate $O_k$ is assessed by detecting the information held by the sentinel nodes belonging to $V_k$. For the sentinel nodes, the node propagation influence is selected as the characterizing quantity.
Preferably, the sentinel nodes include opinion leader nodes, active nodes within communities, and inactive nodes.
Preferably, the node propagation influence is the product of the node's degree and the mean distance to its indirectly connected nodes:

$$I(i) = \mathrm{outdegree}(i) \times \frac{\sum_{j} d_{ij}}{\mathrm{count}(i)}$$

where $I(i)$ is the influence of node $i$, $\mathrm{outdegree}(i)$ is the out-degree of node $i$, $d_{ij}$ is the distance between node $i$ and a node $j$ indirectly connected to it, and $\mathrm{count}(i)$ is the number of all other nodes indirectly connected to node $i$. A relational model between node influence and information coverage rate is first established by statistical methods:

$$O(I) = f(I)$$

With $O(I) = f(I)$ as the basis for prediction, the information coverage rate is assessed by detecting whether certain nodes have received a given message. If the propagation influence of node $j$ is $I_j$, substituting it gives $O(I_j)$, abbreviated $O_j$, which denotes the information coverage rate obtained by probing node $j$.
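The influence measure described above (out-degree multiplied by the mean distance to indirectly connected nodes) can be sketched as follows. Two points are assumptions of this sketch rather than statements in the patent: "indirectly connected" is read as reachable at a hop distance of 2 or more, and a node with no such nodes is given influence 0. The helper names are hypothetical.

```python
from collections import deque


def hop_distances(graph, source):
    """Breadth-first search: hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def influence(graph, i):
    """I(i) = outdegree(i) * mean distance to indirectly connected nodes.

    Assumption: 'indirectly connected' means hop distance >= 2.
    Assumption: returns 0.0 when no such node exists.
    """
    indirect = [d for d in hop_distances(graph, i).values() if d >= 2]
    if not indirect:
        return 0.0
    return len(graph.get(i, ())) * sum(indirect) / len(indirect)


# Made-up graph: a -> b, a -> c, b -> d, c -> d, d -> e.
graph = {"a": {"b", "c"}, "b": {"d"}, "c": {"d"}, "d": {"e"}, "e": set()}
print(influence(graph, "a"))  # out-degree 2, distances 2 and 3 -> prints 5.0
```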
An S-curve is selected as the basic model for the regression analysis.
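The text names an S-curve as the regression model but this section does not give its functional form. As one possible illustration only, the sketch below assumes a logistic curve $O(I) = 1 / (1 + e^{-(a + bI)})$ and fits $a$ and $b$ by ordinary least squares on the logit of the coverage rate. The function names, the logistic form, and the synthetic data are all assumptions.

```python
import math


def fit_logistic(influences, coverages):
    """Fit O(I) = 1 / (1 + exp(-(a + b*I))) via linear regression on logit(O).

    Assumption: the logistic form stands in for the unspecified S-curve.
    Every coverage value must lie strictly between 0 and 1.
    """
    if any(o <= 0.0 or o >= 1.0 for o in coverages):
        raise ValueError("coverage values must be strictly between 0 and 1")
    ys = [math.log(o / (1.0 - o)) for o in coverages]  # logit transform
    n = len(influences)
    mean_x = sum(influences) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in influences)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(influences, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict_coverage(a, b, influence_value):
    """Evaluate the fitted logistic curve at a given node influence."""
    return 1.0 / (1.0 + math.exp(-(a + b * influence_value)))


# Synthetic data generated from a = 2.0, b = -0.5 (not measured values).
xs = [1.0, 2.0, 4.0, 6.0, 8.0]
ys = [predict_coverage(2.0, -0.5, x) for x in xs]
a_hat, b_hat = fit_logistic(xs, ys)
```

Once fitted, `predict_coverage(a_hat, b_hat, I_j)` plays the role of $O(I_j)$ for a probed node $j$. A nonlinear least-squares fit on the untransformed data would weight the errors differently and may be preferable for noisy measurements.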
The beneficial effects of the present invention are:
The present invention carries out an empirical statistical experiment in a social network of 500-1000 microblog nodes to obtain the important parameters of the prediction model. A custom-written microblog crawler is then used to collect and statistically process social network data of 5000-10000 nodes; the accuracy of the prediction method is verified on a microblog network of this scale, and the experimental results show a prediction accuracy of about 83.2%.
Brief description of the drawings
Fig. 1 is a statistical plot with a low-influence node as the source point;
Fig. 2 is a statistical plot with a high-influence node as the source point;
Fig. 3 is a statistical plot with a medium-influence node as the source point.
Detailed description
The present invention is described in further detail below with reference to the drawings and an embodiment:
As shown in Figs. 1, 2, and 3, in the method for predicting the spread range of microblog public opinion according to the present invention, the empirical study covers 587 students enrolled in elective courses at four engineering colleges of a university, comprising current students from 12 majors and 15 classes across 3 grades. After each student registered a Sina Weibo account, online social relationships formed naturally on the basis of shared dormitories, friendships, classmates, and campus club activities; once a stable online relationship network had formed, no new relationships were allowed. In addition, only nodes within the university are considered, and relationships of other kinds, such as high school classmates, relatives, and friends, are ignored.
With Sina Weibo as the information propagation platform, random nodes are chosen as information sources to publish messages of a similar nature, such as publicity for training courses and commercial promotions, or announcements of student activities. Students are allowed to learn about and spread the messages only through the microblog, so as to eliminate interference from offline propagation as far as possible. Each test message is given a unique ID, denoted $M_i$, and each student node is given a unique ID, denoted $V_j$. When a student receives $M_i$, the student comments on and forwards it as usual and at the same time sends an email to a public mailbox with $M_i$ and $V_j$ as the email header. Finally, the message propagation trace is extracted from the email list: each student record is a triple $\langle M_i, V_j, t_i \rangle$, where $M_i$ is the message ID, $V_j$ is the user ID, and $t_i$ is the time the email was received, which approximates the arrival time of the message.
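Extracting the propagation trace from the collected triples can be sketched as below. The sketch assumes the email list has already been parsed into hypothetical `(message_id, node_id, received_time)` tuples and that, if a student reports the same message more than once, only the earliest report counts; neither detail is specified in the text.

```python
def propagation_trace(records):
    """Group (message_id, node_id, received_time) triples into per-message traces.

    Returns a dict: message_id -> list of (time, node_id), sorted by time.
    Assumption: only the earliest report per (message, node) pair is kept.
    """
    first_seen = {}
    for message_id, node_id, t in records:
        key = (message_id, node_id)
        if key not in first_seen or t < first_seen[key]:
            first_seen[key] = t
    trace = {}
    for (message_id, node_id), t in first_seen.items():
        trace.setdefault(message_id, []).append((t, node_id))
    for arrivals in trace.values():
        arrivals.sort()  # chronological order of arrival
    return trace


# Made-up records; times are minutes since the message was posted.
records = [("M1", "V3", 40), ("M1", "V1", 10), ("M1", "V3", 55), ("M2", "V1", 5)]
trace = propagation_trace(records)
```

Each per-message list gives the order in which nodes received the message, from which the coverage rate at any monitoring time can be counted.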
Three kinds of nodes are used as propagation sources in the empirical study: low-influence nodes, high-influence nodes, and medium-influence nodes, shown in Figs. 1-3 respectively. In the figures, the x-axis represents node influence and the y-axis represents information coverage rate. For each kind of source, 5 messages of the same nature are propagated to determine the error range of the information coverage rate. Figs. 1-3 show a nonlinear relationship between node influence and information coverage rate: high-influence nodes correspond to lower information coverage, and low-influence nodes correspond to higher information coverage. This pattern is consistent with intuitive analysis of social behavior, and we attempt to build a nonlinear model from the empirical data to establish the direct relationship between node influence and information coverage rate.
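The error range over repeated propagations could be summarized as in the following sketch. It assumes each run is recorded as a hypothetical mapping from node influence to the coverage rate observed when a node of that influence received the message, and it reports the mean, minimum, and maximum per influence value; the patent does not state which error statistic was used.

```python
def error_ranges(runs):
    """Summarize coverage observations across repeated propagation runs.

    runs: list of dicts {influence: coverage}, one dict per run.
    Returns {influence: (mean, minimum, maximum)}.
    Assumption: min/max is used as the error range.
    """
    grouped = {}
    for run in runs:
        for infl, cov in run.items():
            grouped.setdefault(infl, []).append(cov)
    return {
        infl: (sum(values) / len(values), min(values), max(values))
        for infl, values in sorted(grouped.items())
    }


# Made-up observations from two runs (the study itself used 5 per source type).
runs = [{1: 0.2, 2: 0.5}, {1: 0.4, 2: 0.5}]
summary = error_ranges(runs)
```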
In Fig. 1, with a low-influence node as the propagation source, the data form a fairly smooth curve. Regression analysis can be used to fit formula (4), $O(I_j)$. By comparison, fewer medium-influence nodes were observed in the empirical study, and the medium-influence interval is relatively sparse.
In Fig. 2, with a high-influence node as the propagation source, the medium-influence region is even sparser, but the error range in the high-influence interval is clearly smaller; because a high-influence node is the propagation source, the error fluctuates less across the 5 empirical runs.
In Fig. 3, with a medium-influence node as the propagation source, the medium-influence interval is no longer sparse and the error fluctuation is smaller; the number of high-influence nodes tends to decrease and the fluctuation of their information coverage error grows; the number of low-influence nodes increases, with no significant change in error fluctuation.
Those skilled in the art should understand that the present invention is not limited to the above embodiment. The embodiment and the description merely illustrate the principle of the invention; various changes and improvements may be made without departing from its spirit and scope, and all such changes and improvements fall within the scope of the claimed invention. The scope of protection claimed is defined by the appended claims and their equivalents.
Claims (5)
1. A method for predicting the spread range of microblog public opinion, characterized in that it is carried out in the following order:
1) build the propagation network model of the microblog system: each microblog user is treated as a node, and edges between nodes are established according to the follower, following, and friend relationships on the microblog, forming a complex online social network model; the public opinion spread range is taken to be the public opinion message coverage rate;
2) select, in the microblog propagation network, the sentinel nodes used to judge the public opinion coverage;
3) use the sentinel monitoring nodes to establish a prediction model of the microblog public opinion spread range;
4) carry out empirical statistical analysis of event-related public opinion in an actual microblog network, and determine the key parameters of the prediction model.
2. The method for predicting the spread range of microblog public opinion according to claim 1, characterized in that the public opinion message coverage rate in 1) is the ratio of the set of nodes that know the message to the whole node set:

$$O = \frac{\text{number of nodes that know the message}}{|V|}$$

where $|V|$ is the total number of nodes; note that "all nodes" refers to the total number of nodes within the range of valid users in the microblog network;
the message propagation process is a time series $T = \{t_1, t_2, \ldots, t_i, t_{i+1}, \ldots\}$, and the information coverage rate at monitoring time $t_k$ is $O_k$, namely

$$O_k = \frac{\text{number of nodes that know the message at time } t_k}{|V|}$$
3. The method for predicting the spread range of microblog public opinion according to claim 1, characterized in that the prediction model in 3) converts the problem of predicting the information coverage rate from sentinel nodes in the microblog network into predicting $O_k$ from the events formed by $V_k$: the relationship between the node subset $V_k$ and the coverage rate $O$ is studied to build the prediction model, and the information coverage rate $O_k$ is assessed by detecting the information held by the sentinel nodes belonging to $V_k$; for the sentinel nodes, the node propagation influence is selected as the characterizing quantity.
4. The method for predicting the spread range of microblog public opinion according to claim 1, characterized in that the sentinel nodes include opinion leader nodes, active nodes within communities, and inactive nodes.
5. The method for predicting the spread range of microblog public opinion according to claim 1, characterized in that the node propagation influence is the product of the node's degree and the mean distance to its indirectly connected nodes:

$$I(i) = \mathrm{outdegree}(i) \times \frac{\sum_{j} d_{ij}}{\mathrm{count}(i)}$$

where $I(i)$ is the influence of node $i$, $\mathrm{outdegree}(i)$ is the out-degree of node $i$, $d_{ij}$ is the distance between node $i$ and a node $j$ indirectly connected to it, and $\mathrm{count}(i)$ is the number of all other nodes indirectly connected to node $i$; a relational model between node influence and information coverage rate is first established by statistical methods:

$$O(I) = f(I)$$

with $O(I) = f(I)$ as the basis for prediction, the information coverage rate is assessed by detecting whether certain nodes have received a given message; if the propagation influence of node $j$ is $I_j$, substituting it gives $O(I_j)$, abbreviated $O_j$, which denotes the information coverage rate obtained by probing node $j$;
an S-curve is selected as the basic model for the regression analysis.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510795223.6A CN105447128A (en) | 2015-11-18 | 2015-11-18 | Method for predicting spread range of microblog public opinions |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510795223.6A CN105447128A (en) | 2015-11-18 | 2015-11-18 | Method for predicting spread range of microblog public opinions |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105447128A (en) | 2016-03-30 |
Family
ID=55557305
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510795223.6A Pending CN105447128A (en) | 2015-11-18 | 2015-11-18 | Method for predicting spread range of microblog public opinions |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105447128A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106447508A (en) * | 2016-10-20 | 2017-02-22 | 宁波江东大金佰汇信息技术有限公司 | Improved high-quality node detection system based on computer large data in social network |
CN110335059A (en) * | 2019-05-14 | 2019-10-15 | 浙江工业大学 | A kind of analysis method for micro blog network advertisement information propagation trend |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105183743A (en) * | 2015-06-29 | 2015-12-23 | 临沂大学 | Prediction method of MicroBlog public sentiment propagation range |
2015-11-18: CN application CN201510795223.6A (publication CN105447128A), status: Pending
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106447508A (en) * | 2016-10-20 | 2017-02-22 | 宁波江东大金佰汇信息技术有限公司 | Improved high-quality node detection system based on computer large data in social network |
CN110335059A (en) * | 2019-05-14 | 2019-10-15 | 浙江工业大学 | A kind of analysis method for micro blog network advertisement information propagation trend |
CN110335059B (en) * | 2019-05-14 | 2022-05-03 | 浙江工业大学 | Method for analyzing propagation trend of microblog network advertisement information |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20160330 |