CN112632365B - Public opinion event development stage automatic division and identification method - Google Patents


Info

Publication number
CN112632365B
Authority
CN
China
Prior art keywords
value
period
trend
decline
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110263077.8A
Other languages
Chinese (zh)
Other versions
CN112632365A (en)
Inventor
宇婷
王晓斌
桂迎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan Eefung Software Co ltd
Original Assignee
Hunan Eefung Software Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan Eefung Software Co ltd
Priority to CN202110263077.8A
Publication of CN112632365A
Application granted
Publication of CN112632365B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines

Abstract

The invention discloses a method for automatically dividing and identifying the development stages of a public opinion event, comprising the following steps: acquiring time series data describing how the public opinion event changes; judging, according to a preset first division standard, whether the time series data need to be divided; if so, dividing the time series data into a plurality of trend stages by a partition competition method; judging, for each trend stage and according to a preset second division standard, whether its time series data need to be divided further; if so, dividing the time series data of that trend stage into further trend stages, again by the partition competition method; iterating the division until the time series data of no trend stage meet the second division standard; and labeling the trend stages as public opinion stages according to indexes computed between adjacent stages of the public opinion event life cycle. The method divides and identifies the development stages of a public opinion event quickly and accurately, without manual participation.

Description

Public opinion event development stage automatic division and identification method
Technical Field
The invention relates to the field of the Internet, and in particular to a method for automatically dividing and identifying the development stages of public opinion events.
Background
In recent years, with the rapid development of the Internet and social networks, the number of users of social platforms such as microblogs and Twitter has grown rapidly. People can share and discuss interests or news events at any time. Such discussion is highly timely and random in nature, so it easily leads to the malicious spread of false information and of irrational sentiment online, which in turn has adverse effects and creates heavy public opinion pressure. Where public opinion is highly concentrated, it needs to be guided correctly. Analyzing how public opinion on an event develops makes it possible to summarize the patterns of development of different types of events and provides a basis for assessing similar events, so that well-founded countermeasures can be taken. Dividing an event into development stages quickly locates the important change points in the event trend and helps find the internal factors that may have caused abrupt changes, so it is an important part of analyzing how an event develops.
Event development stage division splits the event life cycle into several stages, such as a latent period, a diffusion period, an outbreak period, and a decline period, according to the event volume at different time points in the life cycle (volume describes and measures the influence of information propagation). At present, stage division relies mainly on manual participation: the division is made from experience, or the number of nodes is chosen manually and the division is made with methods such as a simple moving window. These methods suit events with a small data volume and a short development period. When the data volume is large, the duration is long, or the event is incomplete, however, they are ineffective, incomplete, and inflexible. The inventors therefore propose a method for automatically dividing and identifying the development stages of public opinion events to solve these problems.
Disclosure of Invention
To solve the above problems, an object of the present invention is to provide a method for automatically dividing and identifying the development stages of public opinion events, which divides and identifies those stages quickly and accurately without manual participation.
Accordingly, the invention provides a method for automatically dividing and identifying the development stages of public opinion events, comprising the following steps:
Step 1: acquiring time series data of public opinion event changes;
Step 2: judging, according to a preset first division standard, whether the time series data need to be divided;
Step 3: if so, dividing the time series data into a plurality of trend stages by a partition competition method;
Step 4: judging, for each trend stage and according to a preset second division standard, whether the time series data in that trend stage need to be divided further;
Step 5: if so, dividing the time series data in that trend stage into a plurality of trend stages again by the partition competition method;
Step 6: iterating steps 4 and 5 until the time series data in no trend stage meet the second division standard;
Step 7: labeling the trend stages as public opinion stages according to the indexes of adjacent stages of the public opinion event life cycle, the public opinion stages comprising: latent period, diffusion period, outbreak period, fluctuation decline period, secondary public opinion period, and decline period.
Step 1, acquiring the time series data of public opinion event changes, specifically comprises:
acquiring public opinion event data;
sampling the public opinion event data at equal time intervals;
if no data exist for a time point during sampling, filling that point either with 0 or with the average of the public opinion event data within the same preset time interval to the left and right of that point.
Judging in step 2 whether the time series data need to be divided according to the preset first division standard comprises:
acquiring the maximum, median, and mean of the time series data;
obtaining a sequence stability index value from the median and the mean, according to:
[formula image not reproduced in this text]
where $d_{smooth}$ is the sequence stability index value, $d_{mean}$ is the mean, and $d_{median}$ is the median;
if the maximum is greater than a preset maximum threshold and the sequence stability index value is less than a preset sequence stability index threshold, dividing the time series data.
Dividing the time series data into a plurality of trend stages by the partition competition method comprises:
acquiring a cumulative sum of squared deviations of the time series data and a symmetric distribution value of the time series data;
the cumulative sum of squared deviations is obtained according to:
[formula image not reproduced in this text]
where $y_i$ is the i-th value in the time series data, $\bar{y}$ is the mean of the time series data, and $L$ is the cumulative sum of squared deviations;
the symmetric distribution value is obtained according to:
[formula image not reproduced in this text]
where $t$ is the number of points in the time series data, $i$ denotes the i-th point, and $u$ is the symmetric distribution value;
multiplying the cumulative sum of squared deviations by the symmetric distribution value to obtain a maximum difference value;
taking the data point at which the maximum difference value is largest as the division point, and dividing the time series data into two trend stages at the division point.
Judging whether the time series data in a trend stage need to be divided further according to the preset second division standard comprises:
acquiring the median and mean of the time series data in the trend stage;
obtaining the sequence stability index value from the median and the mean;
if the mean is greater than a preset mean threshold and the sequence stability index value is less than the preset sequence stability index threshold, dividing the time series data.
Labeling the trend stages as public opinion stages according to the index values of adjacent stages of the public opinion event life cycle comprises:
acquiring the mean of the time series data in each trend stage;
obtaining the rising index value of adjacent stages from the ratio of the means of the earlier and later trend stages;
if the rising index value of adjacent stages is greater than 1 and the maximum of the outbreak period is greater than a preset outbreak period threshold, selecting the trend stage with the largest mean as the outbreak period, the trend stages before the outbreak period being the latent period or the diffusion period.
The method further comprises:
judging whether the public opinion event has a fluctuation decline period or a decline period;
taking the ratio of the maximum of the trend stage to the mean of the outbreak period or the diffusion period as a fluctuation decline value; if the fluctuation decline value is less than a preset fluctuation decline value, the public opinion event has a fluctuation decline period or a decline period.
The method further comprises:
judging whether the period after the fluctuation decline period is the decline period;
taking the ratio of the means of the earlier and later trend stages as a decline value;
if the decline value is less than a preset decline threshold, the fluctuation decline period is followed by the decline period.
The method further comprises:
judging whether the public opinion event has a secondary public opinion period;
taking the ratio of the maximum of the trend stage to the mean of the outbreak period as a secondary public opinion value; if the secondary public opinion value is greater than a preset secondary public opinion threshold, the public opinion event has a secondary public opinion period.
The method further comprises:
judging whether the period after the secondary public opinion period is the fluctuation decline period or the decline period;
taking the ratio of the means of the earlier and later trend stages as a second decline value; if the second decline value is less than a preset second decline threshold, the secondary public opinion period is followed by the fluctuation decline period or the decline period.
The invention has the following beneficial effects:
1. The implementation is fully automatic: given public opinion event data as input, it divides the development stages of the event automatically, quickly, and accurately, identifies the development periods, and outputs the important nodes.
2. The invention builds the stage division framework on a binary-tree-like idea, needs no manually set parameters, and divides the data trend into multiple stages in real time. It uses a partition competition method with dynamic monitoring, which both performs the stage division and allows pruning that simplifies the original binary tree structure, speeding up the division while preserving its quality.
3. The invention does not need complete life cycle data of the event (the object of analysis may be a developing event or a historical one); it labels the corresponding development period in real time using only the quantitative characteristics of each stage of the event. Different event trends are adaptively divided into and identified as different stages, rather than all event trends being forced to have the same development periods. The whole division and identification process saves the cost of manual analysis, saves time and labor, reduces subjective error, achieves fast and accurate event trend analysis, and helps quickly locate and search for the factors that may drive changes in the event's development.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of the method for automatically dividing and identifying the development stages of public opinion events according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the public opinion event division process according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of public opinion event life cycle identification according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a schematic diagram of the method for automatically dividing and identifying the development stages of public opinion events according to an embodiment of the present invention. The method comprises the following steps:
Step 1: acquiring time series data of public opinion event changes.
Acquiring the time series data of public opinion event changes specifically comprises:
acquiring public opinion event data, which comprise the forwarding count, comment count, like count, favorite count, and the like;
sampling the public opinion event data at equal time intervals;
if no data exist for a time point during sampling, filling that point either with 0 or with the average of the public opinion event data within the same preset time interval to the left and right of that point.
This processing yields a complete set of time series data d representing the development and change of the public opinion event.
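The sampling and filling step can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function name `fill_missing`, the `window` parameter (how many sampling steps to look to the left and right), and the fallback to 0 when no neighbouring value was observed are all assumptions.

```python
from statistics import mean


def fill_missing(samples, window=1, mode="mean"):
    """Fill None entries in an evenly sampled series.

    mode="zero" fills a gap with 0; mode="mean" fills it with the mean of
    the observed values within `window` steps to the left and right.
    """
    filled = []
    for i, value in enumerate(samples):
        if value is not None:
            filled.append(value)
        elif mode == "zero":
            filled.append(0)
        else:
            lo, hi = max(0, i - window), min(len(samples), i + window + 1)
            neighbours = [v for v in samples[lo:hi] if v is not None]
            # Assumption: fall back to 0 when the window holds no observation.
            filled.append(mean(neighbours) if neighbours else 0)
    return filled


# Hourly counts with three missing samples (None marks a gap).
d = fill_missing([10, None, 30, 40, None, None], window=1)
```

Filling from the original observations only (not from previously filled values) keeps a long gap from being smeared with a single neighbouring value.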
Step 2: judging, according to a preset first division standard, whether the time series data need to be divided.
This judgment specifically comprises:
acquiring the maximum, median, and mean of the time series data;
obtaining a sequence stability index value from the median and the mean, according to:
[formula image not reproduced in this text]
where $d_{smooth}$ is the sequence stability index value, $d_{mean}$ is the mean, and $d_{median}$ is the median;
if the maximum is greater than a preset maximum threshold and the sequence stability index value is less than a preset sequence stability index threshold, dividing the time series data.
The rationale for $d_{smooth}$ is that if the mean and the median of the time series data in a trend segment are close, the segment is relatively stable and contains no other stage; otherwise, the segment needs to be divided to separate the different stages.
The rationale for the maximum $d_{max}$ of the time series data is to filter out public opinion events of low popularity, that is, events whose time series data are all below the preset maximum threshold.
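The first division standard can be sketched as below. The patent's formula for the sequence stability index is an image that is not reproduced in this text, so the form used in `stability_index` (the ratio of the smaller to the larger of mean and median, which approaches 1 when the two are close) is an assumption chosen only to match the stated rationale. The function names and both threshold values are hypothetical as well.

```python
from statistics import mean, median


def stability_index(series):
    """ASSUMED form of d_smooth: min(mean, median) / max(mean, median).

    Close to 1 for a stable series, close to 0 when a few large values
    pull the mean far away from the median.
    """
    d_mean, d_median = mean(series), median(series)
    larger = max(d_mean, d_median)
    return 1.0 if larger == 0 else min(d_mean, d_median) / larger


def needs_first_split(series, max_threshold=100, smooth_threshold=0.5):
    """First division standard: a high enough peak AND an unstable series.

    Both thresholds are placeholders; the patent only says they are preset.
    """
    return (max(series) > max_threshold
            and stability_index(series) < smooth_threshold)


spiky = [1, 2, 3, 2, 1, 500, 3]       # one burst, otherwise quiet
flat_high = [150, 152, 149, 151]      # popular but stable
quiet = [5, 4, 6, 5, 90]              # never exceeds the maximum threshold
```

Under these assumptions only `spiky` is divided: `flat_high` passes the maximum check but its mean and median coincide, and `quiet` is filtered out by the maximum threshold.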
Step 3: if so, dividing the time series data into a plurality of trend stages by a partition competition method.
If the time series data meet the division standard of step 2, they are divided into a plurality of trend stages by the partition competition method, which specifically comprises:
acquiring a cumulative sum of squared deviations of the time series data and a symmetric distribution value of the time series data;
the cumulative sum of squared deviations is obtained according to:
[formula image not reproduced in this text]
where $y_i$ is the i-th value in the time series data, $\bar{y}$ is the mean of the time series data, and $L$ is the cumulative sum of squared deviations;
the symmetric distribution value is obtained according to:
[formula image not reproduced in this text]
where $t$ is the number of points in the time series data, $i$ denotes the i-th point, and $u$ is the symmetric distribution value;
multiplying the cumulative sum of squared deviations by the symmetric distribution value to obtain the maximum difference value:
$c = L \cdot u$
where $c$ is the maximum difference value;
taking the data point at which the maximum difference value is largest as the division point, and dividing the time series data into two trend stages at the division point.
When the number of divisions is set to 1, the maximum difference value c is computed for the time series data, and the data point with the largest c is used as the division point. For example, for the time series data 1, 2, 200, 230, c is largest at the value 200, so 200 is the division point and splits the series into two trend stages. When the number of segments is set to n, division points are then sought in each of the two trend stages and the stages are divided again, iterating in this way until there are n segments.
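A minimal sketch of the division-point search follows. The formulas for L and u are images that are not reproduced in this text, so both forms used here are assumptions: L is taken as the running sum of squared deviations from the overall mean up to point i, and u as the symmetric weight i(t - i)/t². Only the product c = L · u comes from the description. With these assumed forms the sketch does reproduce the worked example, placing the division point at the value 200.

```python
def split_point(series):
    """Return the 0-based index of the point where c = L * u is largest.

    ASSUMED forms (the patent's formula images are unavailable):
      L_i = sum over j <= i of (y_j - mean)^2   (running sum)
      u_i = i * (t - i) / t**2                  (symmetric in i and t - i)
    """
    t = len(series)
    avg = sum(series) / t
    best_index, best_c, running = 0, float("-inf"), 0.0
    for i, y in enumerate(series, start=1):
        running += (y - avg) ** 2      # assumed L_i
        u = i * (t - i) / t ** 2       # assumed u_i
        c = running * u
        if c > best_c:                 # first maximum wins on ties
            best_index, best_c = i - 1, c
    return best_index


series = [1, 2, 200, 230]
k = split_point(series)
```

Here `series[k]` is 200. Whether the division point itself belongs to the earlier or the later trend stage is not stated in the description, so the sketch returns only the index.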
The partition competition method finds division points quickly and effectively and is well suited to dividing public opinion events into stages. However, it requires a suitable number of stages to be set manually for each event. To remove this manual step, the invention wraps the method in a binary tree structure with a division judgment at each iteration, so that stages are divided automatically without manually defining their number.
Step 4: judging, for each trend stage and according to a preset second division standard, whether the time series data in that trend stage need to be divided further.
Step 3 yields two trend stages, and the division judgment is made again for each of them.
This judgment comprises:
acquiring the median and mean of the time series data in the trend stage;
obtaining the sequence stability index value from the median and the mean;
if the mean is greater than a preset mean threshold and the sequence stability index value is less than the preset sequence stability index threshold, dividing the time series data.
This judgment is the same as that of step 2, except that the second division standard uses the mean $d_{mean}$ of the time series data in the trend stage, instead of the maximum $d_{max}$ of the original time series data, as the quantitative index.
The reason is that $d_{mean}$ reflects the significance of a trend for analysis. Combined with $d_{smooth}$, it serves as the condition for subsequent division judgments, which are applied to each of the trend stages obtained by the previous division.
Step 5: if so, dividing the time series data in that trend stage into a plurality of trend stages again by the partition competition method.
If a segment meets the division judgment condition, that is, the second division standard, it is divided by the partition competition method; otherwise its trend is considered relatively stable, with no obvious stage boundary (no other trend is present), and it is not divided.
The partition competition method maximizes the difference between the divided stages as far as possible, and repeated tests show that the intermediate stage obtained after two divisions is relatively stable. After the second division, therefore, the division judgment need only be made for the trends at the two ends of the public opinion event. This prunes the binary tree and, as shown in Fig. 2, speeds up the division while preserving its quality.
Step 6: iterating steps 4 and 5 until the time series data in no trend stage meet the second division standard.
Steps 4 and 5 are iterated until no stage meets the division judgment condition, that is, the second division standard, at which point the division is complete.
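The iteration of steps 4 to 6 amounts to a recursive, binary-tree-shaped division, which can be sketched as below. The function `divide` and its parameters are hypothetical names. The division judgment and the division-point search are passed in as functions, and the two used in the example (`wide_range` and `largest_jump`) are deliberately simple stand-ins, not the second division standard or the partition competition method themselves. The sketch also assumes that the division point starts the later stage, and it omits the pruning described above.

```python
def divide(series, should_split, find_split, start=0):
    """Recursively divide a series into trend stages.

    Returns a list of half-open (start, end) index ranges. A stage is a
    leaf of the binary tree once `should_split` rejects it.
    """
    if len(series) < 2 or not should_split(series):
        return [(start, start + len(series))]
    # Clamp so that both children are non-empty and recursion terminates.
    k = min(max(find_split(series), 1), len(series) - 1)
    left = divide(series[:k], should_split, find_split, start)
    right = divide(series[k:], should_split, find_split, start + k)
    return left + right


def wide_range(series):
    """Stand-in judgment: split while the values span more than 50."""
    return max(series) - min(series) > 50


def largest_jump(series):
    """Stand-in search: split just after the largest step between neighbours."""
    jumps = [abs(b - a) for a, b in zip(series, series[1:])]
    return jumps.index(max(jumps)) + 1


stages = divide([1, 2, 3, 200, 210, 205, 5, 4], wide_range, largest_jump)
```

Passing the two decisions in as functions keeps the tree traversal separate from the criteria, so the real division standard and division-point search can be substituted without changing `divide`.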
And 7: respectively identifying the trend stages into a plurality of public opinion stages according to indexes of adjacent stages of the life cycle of the public opinion events, wherein the public opinion stages comprise: latent phase, diffusion phase, burst phase, fluctuation decline phase, secondary public sentiment phase, decline phase, please refer to fig. 3.
The latent period is as follows: in step 2 dmaxAnd when the value is less than a preset latency threshold, the public sentiment event is considered to be in the latency.
The diffusion period is as follows: after the life cycle of public sentiment time is labeled according to the partition competition method, the diffusion period can be obtained as a trend stage of the average number of the time series data of the diffusion period being larger than the average number of the time series data of the latent period and smaller than the average number of the time series data of the outbreak period.
The trend from the diffusion period to the burst period is ascending.
Wherein, it includes respectively to said trend stage according to the adjacent stage index mark of public sentiment event life cycle divide into a plurality of public sentiment stage:
acquiring an average of time series data in each trend phase;
acquiring rising index values of adjacent stages according to the ratio of the average numbers of the front and rear trend stages;
and if the rising index value of the adjacent stage is greater than 1 and the maximum value of the outbreak period is greater than a preset outbreak period threshold value, selecting the trend stage corresponding to the maximum average as the outbreak period, wherein the trend stage before the outbreak period is a latent period or a diffusion period.
The identification of the burst period may use a maximum burst period value dbao_maxAnd judging whether the maximum threshold value of the preset outbreak period is reached.
If d isbao_maxAnd if the maximum threshold of the outbreak period is not reached, the public sentiment event is considered not to reach the outbreak period. Of course, there is another possibility that the maximum threshold of the outbreak period is reached, and the outbreak period is not necessarily entered, because the highest point of the click reading amount of some public sentiment events is ten thousands, the situation that the outbreak period is advanced occurs at this time. According to the life cycle rule of public sentiment events, before the outbreak period is not reached, the public sentiment basically shows a gradually rising trend, and the advance of the outbreak period can also be an early warning for users. Secondly, the division and identification of the public sentiment phase is real-time, the burst period is just a definition of relativity, and when higher values appear, the burst period moves along with the burst period.
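The outbreak period check can be sketched as follows. The function name `find_outbreak`, the threshold value, and the reading of the rising index as the later mean divided by the earlier mean are assumptions; the description only says the index is a ratio of the means of adjacent stages. The sketch also assumes an outbreak stage needs a preceding stage to rise from.

```python
def find_outbreak(stages, outbreak_threshold):
    """Return the index of the outbreak stage, or None if there is none.

    `stages` is a list of trend stages, each a non-empty list of values.
    """
    means = [sum(s) / len(s) for s in stages]
    peak = max(range(len(stages)), key=lambda k: means[k])
    if peak == 0:
        return None  # assumption: no earlier stage, so no rise to measure
    previous = means[peak - 1]
    # Assumed reading of the rising index: later mean / earlier mean.
    rising_index = float("inf") if previous == 0 else means[peak] / previous
    if rising_index > 1 and max(stages[peak]) > outbreak_threshold:
        return peak
    return None


stages = [[1, 2, 3], [10, 20, 30], [200, 400, 300], [50, 40]]
outbreak = find_outbreak(stages, outbreak_threshold=100)
```

With a threshold of 100 the third stage (index 2) is labeled as the outbreak period and the two stages before it are the latent and diffusion periods. Raising the threshold above 400 yields `None`, which matches the case where the event has not yet reached the outbreak period.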
The falling trend corresponds to the transitions from the outbreak period to the fluctuation decline period, from the fluctuation decline period to the decline period, or from the outbreak period directly to the decline period. Some public opinion events rise sharply again during the fluctuation decline period or the decline period; this is called secondary public opinion. Drawing on experience from earlier manual analysis of historical events, the following indexes are used to identify these more complex trends.
The fluctuation decline period and the decline period generally follow the outbreak period or the diffusion period. Whether a fluctuation decline period or a decline period exists is judged as follows:
taking the ratio of the maximum of the trend stage to the mean of the outbreak period or the diffusion period as a fluctuation decline value; if the fluctuation decline value is less than a preset fluctuation decline value, the public opinion event has a fluctuation decline period or a decline period; otherwise, the event is still in the outbreak period or the diffusion period.
After the fluctuation decline period, the public opinion event can develop in two different directions. One is to keep falling and enter the decline period, which is judged as follows:
taking the ratio of the means of the earlier and later trend stages as a decline value;
if the decline value is less than a preset decline threshold, the fluctuation decline period is followed by the decline period; otherwise, the event is still in the fluctuation decline period.
The other is to rise again after the fluctuation decline period or the decline period, in which case secondary public opinion exists. The volume of secondary public opinion is relatively large and, according to earlier summaries, bears a certain proportional relationship to the outbreak period. It is judged as follows:
taking the ratio of the maximum of the trend stage to the mean of the outbreak period as a secondary public opinion value; if the secondary public opinion value is greater than a preset secondary public opinion threshold, the public opinion event has a secondary public opinion period; otherwise, the stage is considered to remain in the preceding stage (fluctuation decline period or decline period).
Whether the period after the secondary public opinion period is the fluctuation decline period or the decline period is judged as follows:
taking the ratio of the means of the earlier and later trend stages as a second decline value; if the second decline value is less than a preset second decline threshold, the secondary public opinion period is followed by the fluctuation decline period or the decline period; otherwise, the stage is still in the secondary public opinion period.
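The post-outbreak rules can be read as a small state machine, sketched below. This is one possible interpretation, not the patent's implementation. The function `next_label`, the label strings, and all four threshold values are hypothetical. The mean ratio is taken as later mean over earlier mean, and a stage counts as "rising" when that ratio exceeds 1. For brevity the sketch covers only stages that follow an outbreak period (not a diffusion period) and always labels the first falling stage as fluctuation decline.

```python
def mean(values):
    return sum(values) / len(values)


def ratio(numerator, denominator):
    # Guard against an all-zero reference stage.
    return float("inf") if denominator == 0 else numerator / denominator


# Hypothetical thresholds; the description only says they are preset.
THRESHOLDS = {"fluctuation": 0.8, "decline": 0.5,
              "secondary": 0.5, "second_decline": 0.5}


def next_label(prev_label, prev_stage, stage, outbreak_stage, th=THRESHOLDS):
    """Label `stage` given the label and data of the stage before it."""
    peak_ratio = ratio(max(stage), mean(outbreak_stage))
    mean_ratio = ratio(mean(stage), mean(prev_stage))  # assumed: later / earlier
    if prev_label == "outbreak":
        falling = peak_ratio < th["fluctuation"]
        return "fluctuation decline" if falling else "outbreak"
    if prev_label in ("fluctuation decline", "decline"):
        if mean_ratio > 1 and peak_ratio > th["secondary"]:
            return "secondary"          # rose again, comparable to the outbreak
        if prev_label == "fluctuation decline" and mean_ratio < th["decline"]:
            return "decline"
        return prev_label
    if prev_label == "secondary":
        falling = mean_ratio < th["second_decline"]
        return "fluctuation decline" if falling else "secondary"
    raise ValueError(f"unsupported label: {prev_label}")


outbreak = [200, 400, 300]
later = [[150, 120, 100], [20, 10, 15], [180, 220, 200], [30, 20, 10]]
labels, prev_label, prev_stage = [], "outbreak", outbreak
for stage in later:
    prev_label = next_label(prev_label, prev_stage, stage, outbreak)
    labels.append(prev_label)
    prev_stage = stage
```

For this example the four later stages are labeled fluctuation decline, decline, secondary, and fluctuation decline in turn, which traces each of the four judgments described above once.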
In summary, the method yields the development period of each identified stage and the important transition nodes between adjacent development periods. It is suitable for dividing and identifying the development periods of historical events, ongoing events, or trends over a selected time period.
The above description is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and substitutions without departing from the technical principle of the present invention, and these modifications and substitutions should also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A method for automatically dividing and identifying the development stages of a public opinion event, characterized by comprising the following steps:
step 1: acquiring time sequence data of public opinion event changes;
step 2: judging whether the time sequence data need to be divided according to a preset first division standard for the time sequence data;
step 3: if so, dividing the time sequence data into a plurality of trend stages by adopting a partition competition method, wherein the partition competition method comprises the following steps: acquiring a cumulative sum of squared deviations value of the time sequence data and acquiring a symmetric distribution value of the time sequence data; multiplying the cumulative sum of squared deviations value by the symmetric distribution value to obtain a maximum difference value; taking the time sequence data point corresponding to the maximum difference value as a dividing point, and dividing the time sequence data into trend stages according to the dividing point;
step 4: respectively judging whether the time sequence data in each trend stage need to be further divided according to a preset second division standard;
step 5: if so, applying the partition competition method again to divide the time sequence data in the trend stage into a plurality of trend stages;
step 6: iterating steps 4 and 5 until the time sequence data in the trend stages no longer meet the second division standard;
step 7: respectively identifying the trend stages as a plurality of public opinion stages according to indexes of adjacent stages of the life cycle of the public opinion event, wherein the public opinion stages comprise: a latent period, a diffusion period, an outbreak period, a fluctuation decline period, a secondary public sentiment period and a decline period.
2. The public opinion event development stage automatic division and identification method as claimed in claim 1, characterized in that step 1, acquiring the time sequence data of public opinion event changes, specifically comprises:
obtaining public opinion event data;
sampling the public opinion event data at equal time intervals;
and if no data exists for a certain time point when the public opinion event data is sampled, filling that time point with a value of 0 or with the average value of the public opinion event data within the same preset time interval to the left and right of that time point.
3. The method as set forth in claim 1, wherein the step 2 of determining whether the time-series data need to be divided according to a preset first division criterion for the time-series data comprises:
acquiring the maximum value, median and average of the time sequence data;
obtaining a sequence stability index value according to the median and the average, wherein obtaining the sequence stability index value according to the median and the average specifically includes:
[Formula image not reproduced in the text: d_smooth expressed in terms of d_mean and d_median]
wherein d_smooth is the sequence stability index value, d_mean is the average, and d_median is the median;
and if the maximum value is larger than a preset maximum threshold value and the sequence stability index value is smaller than a preset sequence stability index threshold value, dividing the time series data.
4. The public opinion event development stage automatic division and identification method as claimed in claim 1, characterized in that
acquiring the cumulative sum of squared deviations value of the time series data and acquiring the symmetric distribution value of the time series data comprises:
obtaining the cumulative sum of squared deviations value of the time series data according to:
[Formula image not reproduced in the text: L, the cumulative sum of squared deviations]
wherein y_i is the i-th data point in the time series data, the mean term in the formula is the average value of the time series data, and L is the cumulative sum of squared deviations value;
acquiring a symmetric distribution value of the time-series data includes:
[Formula image not reproduced in the text: u, the symmetric distribution value, as a function of i and t]
wherein t is the number of data points in the time series data, i denotes the i-th point, and u is the symmetric distribution value.
5. The method as claimed in claim 1, wherein judging whether the time series data in the trend stage need to be further divided according to the preset second division standard comprises:
acquiring median and average of the time sequence data;
acquiring a sequence stability index value according to the median and the average number;
and if the average is larger than a preset average threshold and the sequence stability index value is smaller than a preset sequence stability index threshold, dividing the time series data.
6. The method as claimed in claim 1, wherein identifying the trend stages as a plurality of public opinion stages according to the indexes of adjacent stages of the life cycle of the public opinion event comprises:
acquiring the average of the time series data in each trend stage;
acquiring rising index values of adjacent stages according to the ratio of the averages of the preceding and following trend stages;
and if the rising index value of the adjacent stages is greater than 1 and the maximum value of the outbreak period is greater than a preset outbreak period threshold value, selecting the trend stage with the largest average as the outbreak period, wherein the trend stage before the outbreak period is a latent period or a diffusion period.
7. The method for automatically classifying and identifying public sentiment event development stages as claimed in claim 1, wherein the method further comprises:
judging whether the public sentiment event has the fluctuation decline period or the decline period;
and acquiring the ratio of the maximum value of the trend phase to the average value of the outbreak period or the diffusion period as a fluctuation decline value, wherein if the fluctuation decline value is less than a fluctuation decline preset value, the public sentiment event has the fluctuation decline period or the decline period.
8. The method for automatically classifying and identifying public sentiment event development stages as claimed in claim 1, wherein the method further comprises:
judging whether the period after the fluctuation decline period is the decline period;
acquiring the ratio of the average numbers of the front and rear trend stages as a decline value;
and if the decline value is smaller than a preset decline threshold value, the fluctuation decline period is followed by the decline period.
9. The method for automatically classifying and identifying public sentiment event development stages as claimed in claim 1, wherein the method further comprises:
judging whether the public sentiment event has the secondary public sentiment period;
and acquiring the ratio of the maximum value of the trend stage to the average value of the outbreak period as a secondary public opinion value, wherein if the secondary public opinion value is greater than a preset secondary public opinion threshold value, the public opinion event has the secondary public sentiment period.
10. The method of automatic division and identification of public sentiment event development stages as claimed in claim 9, wherein the method further comprises:
judging whether the period after the secondary public sentiment period is the fluctuation decline period or the decline period;
and acquiring the ratio of the averages of the preceding and following trend stages as a second decline value, wherein if the second decline value is smaller than a preset second decline threshold value, the secondary public sentiment period is followed by the fluctuation decline period or the decline period.
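The equal-interval sampling and gap filling recited in claim 2 can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function name `fill_missing`, the `window` parameter (how many sampling positions to look at on each side), the use of `None` to mark a missing time point, and the fallback to 0 when no neighbour has data are all assumptions.

```python
from statistics import mean

def fill_missing(counts, window=2, use_zero=False):
    """Fill gaps (None) in an evenly sampled series.

    `counts` holds one value per sampling interval, with None where no
    data exists. Each gap becomes 0, or the mean of the observed values
    within `window` positions to its left and right.
    """
    filled = []
    for i, value in enumerate(counts):
        if value is not None:
            filled.append(value)
            continue
        if use_zero:
            filled.append(0)
            continue
        lo, hi = max(0, i - window), min(len(counts), i + window + 1)
        neighbours = [v for v in counts[lo:hi] if v is not None]
        # Assumption: fall back to 0 when the whole neighbourhood is empty.
        filled.append(mean(neighbours) if neighbours else 0)
    return filled

hourly = [4, None, 8, 10, None, None, 6]
print(fill_missing(hourly, window=1))         # [4, 6, 8, 10, 10, 6, 6]
print(fill_missing(hourly, use_zero=True))    # [4, 0, 8, 10, 0, 0, 6]
```

Neighbour means are computed from the original observations only, so a filled value never feeds into the next gap.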
CN202110263077.8A 2021-03-11 2021-03-11 Public opinion event development stage automatic division and identification method Active CN112632365B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110263077.8A CN112632365B (en) 2021-03-11 2021-03-11 Public opinion event development stage automatic division and identification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110263077.8A CN112632365B (en) 2021-03-11 2021-03-11 Public opinion event development stage automatic division and identification method

Publications (2)

Publication Number Publication Date
CN112632365A CN112632365A (en) 2021-04-09
CN112632365B true CN112632365B (en) 2021-06-01

Family

ID=75297696

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110263077.8A Active CN112632365B (en) 2021-03-11 2021-03-11 Public opinion event development stage automatic division and identification method

Country Status (1)

Country Link
CN (1) CN112632365B (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106339463A (en) * 2016-08-26 2017-01-18 中国传媒大学 Network public opinion early-warning system based on logistic model and early-warning method thereof
KR20180091496A (en) * 2017-02-07 2018-08-16 주식회사 에스엘커뮤니케이션즈 Method for public opinion making using social network based on emotion analysys
CN106960059A (en) * 2017-04-06 2017-07-18 山东大学 A kind of Model of Time Series Streaming dimensionality reduction based on Piecewise Linear Representation is with simplifying method for expressing
CN108549957B (en) * 2018-04-11 2021-10-29 中译语通科技股份有限公司 Internet topic trend auxiliary prediction method and system and information data processing terminal

Also Published As

Publication number Publication date
CN112632365A (en) 2021-04-09

Similar Documents

Publication Publication Date Title
CN107229668B (en) Text extraction method based on keyword matching
CN108733791B (en) Network event detection method
CN117078048B (en) Digital twinning-based intelligent city resource management method and system
CN108536866B (en) Microblog hidden key user analysis method based on topic transfer entropy
CN110222790B (en) User identity identification method and device and server
CN111639230A (en) Similar video screening method, device, equipment and storage medium
CN104572633A (en) Method for determining meanings of polysemous word
CN112632365B (en) Public opinion event development stage automatic division and identification method
CN107885716A (en) Text recognition method and device
CN112235254B (en) Rapid identification method for Tor network bridge in high-speed backbone network
CN111930701A (en) Log structured processing method and device
CN110580280B (en) New word discovery method, device and storage medium
CN116975778A (en) Social network information propagation influence prediction method based on information cascading
CN116070958A (en) Attribution analysis method, attribution analysis device, electronic equipment and storage medium
CN112241820A (en) Risk identification method and device for key nodes in fund flow and computing equipment
CN113706459B (en) Detection and simulation repair device for abnormal brain area of autism patient
CN115248888A (en) Data identification system for searching hot words through big data
CN109213922A (en) A kind of method and apparatus of pair of search results ranking
CN111382345B (en) Topic screening and publishing method, device and server
CN113139102A (en) Data processing method, data processing device, nonvolatile storage medium and processor
CN114817563B (en) Mining method of specific Twitter user group based on maximum group discovery
CN113570392B (en) User grouping method, device, electronic equipment and computer storage medium
CN111026863A (en) Customer behavior prediction method, apparatus, device and medium
CN114363673B (en) Video clipping method, model training method and device
CN117828382B (en) Network interface clustering method and device based on URL

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant