CN112541358A - Public opinion risk early warning method and device and computer storage medium - Google Patents


Info

Publication number
CN112541358A
Authority
CN
China
Prior art keywords
event
public opinion
public
early warning
acquiring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010595267.5A
Other languages
Chinese (zh)
Inventor
廖倡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHENZHEN STOCK EXCHANGE
Original Assignee
SHENZHEN STOCK EXCHANGE
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHENZHEN STOCK EXCHANGE filed Critical SHENZHEN STOCK EXCHANGE
Priority to CN202010595267.5A priority Critical patent/CN112541358A/en
Publication of CN112541358A publication Critical patent/CN112541358A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Abstract

The invention discloses a public opinion risk early warning method, which comprises the following steps: acquiring a public opinion event set; determining associated public opinion events in the public opinion event set; generating a public opinion event subset from the associated public opinion events; acquiring the propagating subjects of the public opinion event subset, and obtaining an influence value of the public opinion event subset according to historical interaction data of the propagating subjects; and generating and outputting early warning information according to the influence value. The invention also discloses a public opinion risk early warning device and a computer storage medium. Associated events in the public opinion event set are treated as the same public opinion event, the influence value of that event is calculated from the historical interaction data of the subjects that propagate it, and a public opinion early warning is issued according to the influence value, thereby automating public opinion risk early warning and improving its efficiency.

Description

Public opinion risk early warning method and device and computer storage medium
Technical Field
The invention relates to the technical field of public opinion early warning, in particular to a public opinion risk early warning method, a public opinion risk early warning device and a storage medium.
Background
Public opinion information in the securities industry comes from many sources, including company bulletins, government bulletins, research reports, business administrative penalty information, mass news, social media information, and the like, all of which have a very important effect on changes in the securities industry.
When facing massive public opinion data, public opinion early warning is generally carried out by purely manual collection, processing and monitoring, which is inefficient.
The above is only for the purpose of assisting understanding of the technical aspects of the present invention, and does not represent an admission that the above is prior art.
Disclosure of Invention
The main object of the invention is to provide a public opinion risk early warning method, device and storage medium, aiming to automate public opinion early warning and improve early warning efficiency.
In order to achieve the above object, the present invention provides a public opinion risk early warning method, which comprises the following steps:
acquiring a public opinion event set;
determining associated public opinion events in the public opinion event set;
generating a public opinion event subset according to the associated public opinion events;
acquiring a propagating subject of the public opinion event subset, and acquiring an influence value of the public opinion event subset according to historical interaction data of the propagating subject;
and generating and outputting early warning information according to the influence value.
Optionally, the historical interaction data includes at least one of a number of likes, a number of forwards, and a number of comments.
Optionally, the step of obtaining the public opinion event set includes:
collecting public opinion data generated in a preset time period through a preset network platform, wherein the public opinion data comprises a plurality of texts;
acquiring the similarity between the texts;
determining a cohesiveness numerical value among a plurality of texts according to the similarity;
determining a plurality of text sets according to the cohesiveness numerical value, wherein the text sets comprise a plurality of texts;
taking the text set as the public opinion event;
and forming the public opinion event set according to a plurality of public opinion events.
Optionally, after the step of determining a plurality of text sets according to the cohesiveness value, the public opinion risk early warning method includes:
acquiring the number of texts contained in the text set;
acquiring the radius of the text set;
acquiring the density of the text set according to the number and the radius;
and adding or deleting texts in the text set according to the density, wherein the corrected text set is taken as the public opinion event.
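The density-based correction above can be sketched as follows. The text does not give the density formula, so this sketch assumes that texts are already represented as numeric vectors, that the radius is the largest distance from a member to the centroid, and that density is the number of texts divided by the radius. The names `prune_to_density` and `min_density` are hypothetical, and only the deletion branch of "adding or deleting" is shown.

```python
import math

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def radius(vectors):
    """Assumed definition: largest distance from any member to the centroid."""
    c = centroid(vectors)
    return max(math.dist(v, c) for v in vectors)

def density(vectors):
    """Assumed definition: number of texts divided by the set radius."""
    r = radius(vectors)
    return float("inf") if r == 0 else len(vectors) / r

def prune_to_density(vectors, min_density):
    """Drop the text farthest from the centroid until the set is dense enough."""
    kept = list(vectors)
    while len(kept) > 1 and density(kept) < min_density:
        c = centroid(kept)
        kept.remove(max(kept, key=lambda v: math.dist(v, c)))
    return kept
```

With four vectors of which one is a clear outlier, the outlier is removed and the remaining three form the corrected text set.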
Optionally, the step of determining the associated public opinion events in the public opinion event set comprises:
determining the centroid of the text set contained in each public opinion event in the public opinion event set;
and determining the associated public opinion events in the public opinion event set according to the distances between the centroids.
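A minimal sketch of the centroid comparison, assuming each public opinion event is a list of text vectors and that two events are associated when the Euclidean distance between their centroids does not exceed a hypothetical `max_distance`. The distance measure and the threshold rule are assumptions, not taken from the text.

```python
import math

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def associated_pairs(events, max_distance):
    """Return pairs of event names whose centroids lie within max_distance."""
    names = sorted(events)
    centers = {name: centroid(events[name]) for name in names}
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if math.dist(centers[a], centers[b]) <= max_distance:
                pairs.append((a, b))
    return pairs
```

Each returned pair can then be merged into one public opinion event subset.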
Optionally, the step of obtaining the influence value of the public opinion event subset according to the historical interaction data of the propagating subject includes:
generating an interaction information matrix corresponding to the propagation subject according to the historical interaction data;
performing iterative processing on the interaction information matrix according to a preset function to obtain an influence value matrix;
and acquiring the influence value according to the influence value matrix.
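The iterative step can be illustrated with a damped power iteration in the style of PageRank. This is a swapped-in technique chosen for illustration: the preset function is not specified at this point in the text, and the matrix layout, the `damping` factor and the iteration count are all assumptions.

```python
def influence_scores(interactions, damping=0.85, iterations=50):
    """Power iteration over a row-normalised interaction matrix.

    interactions[i][j] counts how often subject i liked, forwarded, or
    commented on content from subject j, so influence flows from i to j.
    Returns one score per subject (an n-by-1 influence value matrix).
    """
    n = len(interactions)
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = [(1.0 - damping) / n] * n
        for i, row in enumerate(interactions):
            total = sum(row)
            for j in range(n):
                # A subject with no outgoing interactions spreads its score evenly.
                share = row[j] / total if total else 1.0 / n
                updated[j] += damping * scores[i] * share
        scores = updated
    return scores
```

The scores always sum to 1, so they can be read as each propagating subject's relative share of influence.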
Optionally, the step of obtaining the influence value according to the influence value matrix includes:
acquiring event characteristics of the public opinion event subset, wherein the event characteristics comprise the number of public opinion events contained in the public opinion event subset;
acquiring the time span of the release time corresponding to the public opinion event subset;
and acquiring the influence value according to the event characteristic, the time span and the influence value matrix.
Optionally, the step of obtaining the public opinion event set includes:
acquiring a preset event subject;
and acquiring a public opinion event set corresponding to the preset event subject, wherein the early warning information is output to the terminal device corresponding to the preset event subject.
Optionally, the step of generating and outputting early warning information according to the influence value includes:
when the influence value is greater than a preset threshold, acquiring the event subject corresponding to the public opinion event subset;
acquiring the target amount corresponding to the event subject;
and generating and outputting the early warning information according to the public opinion event subset, the influence value and the target amount.
In addition, in order to achieve the above object, the present invention further provides a public opinion risk early warning device, comprising: a memory, a processor, and a public opinion risk early warning program which is stored on the memory and can run on the processor, wherein when the public opinion risk early warning program is executed by the processor, the steps of the public opinion risk early warning method described above are implemented.
In addition, to achieve the above object, the present invention further provides a computer storage medium, wherein a public opinion risk early warning program is stored on the computer storage medium, and when the public opinion risk early warning program is executed by a processor, the steps of the public opinion risk early warning method described above are implemented.
The public opinion risk early warning method, device and storage medium provided by the embodiments of the invention acquire a public opinion event set, determine associated public opinion events in the public opinion event set, generate a public opinion event subset according to the associated public opinion events, acquire a propagating subject of the public opinion event subset, acquire an influence value of the public opinion event subset according to historical interaction data of the propagating subject, and generate and output early warning information according to the influence value. The invention treats associated events in the public opinion event set as the same public opinion event, calculates the influence value of the event from the historical interaction data of the subjects that propagate it, and issues a public opinion early warning according to the influence value, thereby automating public opinion risk early warning and improving its efficiency.
Drawings
Fig. 1 is a schematic diagram of the terminal structure of a hardware operating environment according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of a public opinion risk early warning method according to an embodiment of the present invention;
Fig. 3 is a schematic flowchart of a public opinion risk early warning method according to another embodiment of the present invention;
Fig. 4 is a schematic flowchart of a public opinion risk early warning method according to another embodiment of the present invention;
Fig. 5 is a schematic flowchart of a public opinion risk early warning method according to another embodiment of the present invention;
Fig. 6 is a diagram illustrating the context of a public opinion event in the present invention;
Fig. 7 is a schematic diagram of a jump diffusion process based on a gamma distribution;
Fig. 8 is a schematic illustration of the propagation and dissipation process of the influence values.
The implementation, functional features and advantages of the objects of the present invention will be further described with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The embodiment of the invention provides a solution in which associated events in a public opinion event set are treated as the same public opinion event, the influence value of the event is calculated from the historical interaction data of the subjects that propagate it, and a public opinion early warning is issued according to the influence value, thereby automating public opinion risk early warning and improving its efficiency.
As shown in fig. 1, fig. 1 is a schematic terminal structure diagram of a hardware operating environment according to an embodiment of the present invention.
The terminal in the embodiment of the invention is terminal equipment such as a PC.
As shown in fig. 1, the terminal may include: a processor 1001 (such as a CPU), a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard), and may optionally also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory). The memory 1005 may alternatively be a storage device separate from the processor 1001 described above.
Those skilled in the art will appreciate that the terminal structure shown in fig. 1 is not intended to be limiting and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
As shown in fig. 1, a memory 1005, which is a kind of computer storage medium, may include an operating system, a network communication module, a user interface module, and a public opinion risk warning program therein.
In the terminal shown in fig. 1, the network interface 1004 is mainly used for connecting to a backend server and performing data communication with the backend server; the user interface 1003 is mainly used for connecting a client (user side) and performing data communication with the client; and the processor 1001 may be configured to call the public opinion risk early warning program stored in the memory 1005, and perform the following operations:
acquiring a public opinion event set;
determining associated public opinion events in the public opinion event set;
generating a public opinion event subset according to the associated public opinion events;
acquiring a propagating subject of the public opinion event subset, and acquiring an influence value of the public opinion event subset according to historical interaction data of the propagating subject;
and generating and outputting early warning information according to the influence value.
Further, in the operations performed when the processor 1001 calls the public opinion risk early warning program stored in the memory 1005:
the historical interaction data includes at least one of the number of likes, the number of forwards, and the number of comments.
Further, the processor 1001 may call the public opinion risk pre-warning program stored in the memory 1005, and further perform the following operations:
collecting public opinion data generated in a preset time period through a preset network platform, wherein the public opinion data comprises a plurality of texts;
acquiring the similarity between the texts;
determining a cohesiveness numerical value among a plurality of texts according to the similarity;
determining a plurality of text sets according to the cohesiveness numerical value, wherein the text sets comprise a plurality of texts;
taking the text set as the public opinion event;
and forming the public opinion event set according to a plurality of public opinion events.
Further, the processor 1001 may call the public opinion risk pre-warning program stored in the memory 1005, and further perform the following operations:
acquiring the number of texts contained in the text set;
acquiring the radius of the text set;
acquiring the density of the text set according to the number and the radius;
and adding or deleting texts in the text set according to the density, wherein the corrected text set is taken as the public opinion event.
Further, the processor 1001 may call the public opinion risk pre-warning program stored in the memory 1005, and further perform the following operations:
determining the centroid of the text set contained in each public opinion event in the public opinion event set;
and determining the associated public opinion events in the public opinion event set according to the distances between the centroids.
Further, the processor 1001 may call the public opinion risk pre-warning program stored in the memory 1005, and further perform the following operations:
generating an interaction information matrix corresponding to the propagation subject according to the historical interaction data;
performing iterative processing on the interaction information matrix according to a preset function to obtain an influence value matrix;
and acquiring the influence value according to the influence value matrix.
Further, the processor 1001 may call the public opinion risk pre-warning program stored in the memory 1005, and further perform the following operations:
acquiring event characteristics of the public opinion event subset, wherein the event characteristics comprise the number of public opinion events contained in the public opinion event subset;
acquiring the time span of the release time corresponding to the public opinion event subset;
and acquiring the influence value according to the event characteristic, the time span and the influence value matrix.
Further, the processor 1001 may call the public opinion risk pre-warning program stored in the memory 1005, and further perform the following operations:
acquiring a preset event subject;
and acquiring a public opinion event set corresponding to the preset event subject, wherein the early warning information is output to the terminal device corresponding to the preset event subject.
Further, the processor 1001 may call the public opinion risk pre-warning program stored in the memory 1005, and further perform the following operations:
when the influence value is greater than a preset threshold, acquiring the event subject corresponding to the public opinion event subset;
acquiring the target amount corresponding to the event subject;
and generating and outputting the early warning information according to the public opinion event subset, the influence value and the target amount.
Referring to fig. 2, in an embodiment, the public opinion risk early warning method includes the following steps:
Step S10, acquiring a public opinion event set;
In this embodiment, a public opinion risk early warning system running on the public opinion risk early warning device collects public opinion data, generates public opinion events from the public opinion data, and takes a set comprising a plurality of public opinion events as the public opinion event set.
Optionally, the public opinion risk early warning system interfaces with a plurality of data platforms to obtain multiple types of public opinion data from them. For example, when public opinion risk early warning is performed on a securities market, the public opinion data may include internal data and external data. The internal data includes internal market data, listed company information, securities trading settlement data, member information and the like of the securities market, and the external data includes off-site company reports, news reports, self-media information, industry and business information and the like, wherein the news reports may come from various media platforms, official accounts, official financial news publishing platforms and the like. A single item of public opinion data is typically presented in the form of a single text.
Optionally, the similarity between a plurality of texts is calculated, and when the similarity is greater than a preset similarity, the texts are determined to represent the same event, so that the collection of these texts can be regarded as a single public opinion event. A plurality of public opinion events are determined in this way and together form the public opinion event set. When calculating the similarity between texts, the similarity can be determined comprehensively from the similarity between the event subjects of the texts and the similarity between the event contents of the texts.
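The grouping described above can be sketched as follows, assuming each text has already been reduced to a set of event subjects and a set of content keywords. The equal weighting of subject and content similarity, the use of Jaccard similarity, and the single-linkage grouping rule are illustrative assumptions; `group_into_events` and `subject_weight` are hypothetical names.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def combined_similarity(x, y, subject_weight=0.5):
    """Blend subject similarity and content similarity (weights are assumed)."""
    return (subject_weight * jaccard(x["subjects"], y["subjects"])
            + (1.0 - subject_weight) * jaccard(x["content"], y["content"]))

def group_into_events(texts, threshold):
    """Texts joined by any above-threshold pair end up in the same event."""
    parent = list(range(len(texts)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(texts)):
        for j in range(i + 1, len(texts)):
            if combined_similarity(texts[i], texts[j]) > threshold:
                parent[find(j)] = find(i)

    events = {}
    for i in range(len(texts)):
        events.setdefault(find(i), []).append(i)
    return list(events.values())
```

The function returns lists of text indices; each list is one public opinion event, and the list of lists is the public opinion event set.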
Optionally, many clients in the securities industry need public opinion risk early warning. Therefore, when acquiring the public opinion event set, a preset event subject configured in the public opinion risk early warning system can be acquired, the texts containing the preset event subject are screened out, individual public opinion events are determined according to the similarity between the screened texts, and the public opinion event set corresponding to the preset event subject is then determined, so as to realize public opinion risk early warning for the preset event subject. Further, according to the early warning information, other market subjects related to the client's interests, the conduction links of risk, the public opinion situation, the public opinion level and the public opinion trend are analyzed, the degree to which the public opinion event affects the client is determined, risk events with higher influence are identified, and monitoring and early warning services are provided to the client. For example, the associated public opinion events in the early warning information can be obtained, and other events having a causal relationship with them can be determined according to a prestored knowledge graph of the securities industry. In library and information science, a knowledge graph, also called knowledge domain visualization or a knowledge domain map, is a family of graphs that display the development and structural relationships of knowledge; it uses visualization techniques to describe knowledge resources and their carriers, and to mine, analyze, construct, draw and display knowledge and the relationships among items of knowledge.
For example, in the knowledge graph, after an event a occurs, an event B often occurs, so after the public opinion risk warning outputs the warning information, the prediction information including the event B can be output to predict the event trend.
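As a rough illustration of this lookup, the knowledge graph can be reduced to a mapping from an event to the events that often follow it. The dictionary representation, the function name `predicted_events` and the `max_hops` parameter are assumptions made for this sketch; a real securities-industry knowledge graph would carry far richer relations.

```python
from collections import deque

def predicted_events(graph, observed, max_hops=1):
    """Breadth-first walk over 'is often followed by' edges, up to max_hops."""
    seen = {observed}
    queue = deque([(observed, 0)])
    predicted = []
    while queue:
        event, hops = queue.popleft()
        if hops == max_hops:
            continue
        for successor in graph.get(event, []):
            if successor not in seen:
                seen.add(successor)
                predicted.append(successor)
                queue.append((successor, hops + 1))
    return predicted
```

With `graph = {"A": ["B"]}`, an early warning about event A yields event B as the predicted follow-up.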
Step S20, determining associated public opinion events in the public opinion event set;
Step S30, generating a public opinion event subset according to the associated public opinion events;
In this embodiment, the associated public opinion events in the public opinion event set are determined according to the degree of association between the public opinion events, and the associated public opinion events are taken as a public opinion event subset.
Optionally, the public opinion event set includes public opinion events from a plurality of different time periods. For example, with each day as a time period, the public opinion event set may include today's public opinion events, yesterday's public opinion events, and the public opinion events of the day before yesterday. The time period corresponding to a single public opinion event can be determined according to the generation time of the texts in the event; for example, if the texts in a public opinion event were published on the network yesterday, yesterday is taken as the time period corresponding to that event. When determining the associated public opinion events in the public opinion event set, the similarity between public opinion events in different time periods is calculated, and public opinion events whose similarity is greater than a similarity threshold are taken as associated public opinion events. For example, suppose the public opinion event set comprises public opinion events A and B in a first time period and public opinion events C and D in a second time period. If the similarity between event A and event C is greater than the similarity threshold, events A and C are taken as associated public opinion events and their combination is taken as one public opinion event subset; if the similarity between event B and event D is greater than the similarity threshold, events B and D are taken as associated public opinion events and their combination is taken as another public opinion event subset.
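The A/C and B/D example can be reproduced with a short sketch. It assumes each event is summarised by a set of keywords and compares events with Jaccard similarity; both choices, and the keyword sets used in the usage note, are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two keyword sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def associate_across_periods(first_period, second_period, threshold):
    """Pair each event in one period with similar events in the other period."""
    subsets = []
    for name_a, keywords_a in first_period.items():
        for name_b, keywords_b in second_period.items():
            if jaccard(keywords_a, keywords_b) > threshold:
                subsets.append((name_a, name_b))
    return subsets
```

For instance, with first-period events `{"A": {"acme", "fine"}, "B": {"beta", "bond"}}` and second-period events `{"C": {"acme", "fine", "appeal"}, "D": {"beta", "bond", "default"}}`, a threshold of 0.5 pairs A with C and B with D, giving two public opinion event subsets.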
Step S40, acquiring a propagating subject of the public opinion event subset, and acquiring an influence value of the public opinion event subset according to historical interaction data of the propagating subject;
In this embodiment, public opinion data are propagated over the network, and a propagating subject publishes public opinion data for other network subjects to view, comment on, forward and like. Therefore, after the public opinion event subset is determined, the publishers of the texts of the public opinion events in the subset can be determined and taken as the propagating subjects of the subset. A public opinion event may contain multiple texts published by different subjects, so a public opinion event subset may have multiple propagating subjects at the same time.
After a propagating subject publishes public opinion data, other subjects on the network can view, comment on, forward and like the data and follow the publisher. Therefore, after the propagating subject is determined, its historical interaction data is determined according to at least one of the number of views, likes, forwards and comments that other network subjects gave to the public opinion data it published in the past, and its number of followers. The historical interaction data comprises at least one of the number of views, the number of likes, the number of forwards, the number of comments and the number of followers.
Optionally, when the influence value of the public opinion event subset is obtained according to the historical interaction data, the propagating subject is scored according to the number of views, likes, forwards, comments, followers and the like, and the score is used as the influence value of the public opinion event subset. For scoring, a plurality of different preset ranges can be obtained, and the preset range into which each of the number of views, likes, forwards, comments and followers falls is determined. The score corresponding to that preset range is used as the score for the item, and the sum of the scores for views, likes, forwards, comments and followers is used as the influence value of the public opinion event subset. When the public opinion event subset has multiple propagating subjects, a score is obtained from the historical interaction data of each propagating subject, and the sum of these scores is used as the influence value of the public opinion event subset.
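The range-based scoring can be sketched as below. The range boundaries and per-range scores are hypothetical placeholders, and a single set of ranges is shared by all five metrics for brevity; the text allows each metric to have its own preset ranges.

```python
import bisect

# Hypothetical range boundaries and the score assigned to each range:
# below 100, 100 to 999, 1000 to 9999, and 10000 or more.
BOUNDARIES = [100, 1_000, 10_000]
RANGE_SCORES = [1, 2, 3, 4]

def metric_score(count):
    """Score one metric by the preset range its count falls into."""
    return RANGE_SCORES[bisect.bisect_right(BOUNDARIES, count)]

def subject_score(history):
    """Sum the per-metric scores of one propagating subject."""
    return sum(metric_score(count) for count in history.values())

def subset_influence(subjects):
    """Influence value of a subset: sum over all of its propagating subjects."""
    return sum(subject_score(history) for history in subjects)
```

A subject with 5000 views, 120 likes, 30 forwards, 10 comments and 20000 followers scores 3 + 2 + 1 + 1 + 4 = 11 under these placeholder ranges.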
And step S50, generating and outputting early warning information according to the influence value.
In this embodiment, after the influence value is obtained, the early warning information is generated according to the text included in the public sentiment event subset and the influence value, and the early warning information is output by the public sentiment risk early warning device to prompt the user about the public sentiment situation of the public sentiment event subset.
Optionally, after the influence value is obtained, whether the influence value is greater than a preset threshold is judged, if the influence value is greater than the preset threshold, it is indicated that the public sentiment risk is high, and a step of generating and outputting early warning information according to the text contained in the public sentiment event subset and the influence value is executed.
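The threshold check can be written as a small function. The `EarlyWarning` record and its fields are hypothetical; the text only states that the warning is generated from the subset's texts and the influence value when the value exceeds the preset threshold.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class EarlyWarning:
    """Hypothetical container for the early warning information."""
    influence: float
    texts: List[str]

def make_warning(texts, influence, threshold) -> Optional[EarlyWarning]:
    """Return a warning only when the influence value exceeds the threshold."""
    if influence > threshold:
        return EarlyWarning(influence=influence, texts=list(texts))
    return None
```

An influence value equal to the threshold produces no warning, since the condition in the text is "greater than".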
Optionally, for the application requirement of market risk monitoring in the current security industry, when the influence value is greater than a preset threshold, an event main body corresponding to a public sentiment event subset can be obtained, the event main body is an object where the public sentiment occurs, the amount of a target corresponding to the event main body in the security industry is obtained, and early warning information is generated and output according to the text, the influence value and the target amount contained in the public sentiment event subset, so that detection and early warning of the public sentiment risk condition of the client in the security market are realized. Through the early warning information, the risk of the event subject in an industrial chain or a debt chain can be determined, a visual risk view is constructed, and risk identification, risk monitoring and early warning and the like are carried out. The management analysis platform in the public opinion risk early warning system can realize functions of risk index measurement and calculation, public opinion event recommendation, event trend analysis and the like according to the influence value, and can perform performance evaluation and regression test on the accuracy of the algorithm of the public opinion risk early warning system according to the influence value actually generated by the past historical public opinion event subset so as to further optimize the algorithm.
In the technical scheme disclosed in this embodiment, associated events in the public opinion event collection are treated as the same public opinion event, the influence value of the event is calculated according to the historical interaction data of the propagation subjects of the event, and public opinion early warning is performed according to the influence value, so that public opinion risk early warning is automated and its efficiency is improved.
In another embodiment, as shown in fig. 3, on the basis of the embodiment shown in fig. 2, the step S10 includes:
step S11, public sentiment data generated in a preset time period is collected through a preset network platform, wherein the public sentiment data comprises a plurality of texts;
In this embodiment, when acquiring a public opinion event set, the public opinion risk early warning system first connects to a plurality of preset network platforms so as to collect a large amount of public opinion data from them. The collection step may be performed periodically, so that each run collects the public opinion data for the corresponding preset time period; for example, public opinion data is collected once a day at midnight (00:00), and the data collected is the public opinion data released during the previous day. Public opinion data is generally transmitted over the network in the form of text, so the collected public opinion data may include a plurality of texts.
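The daily collection window in the example above can be sketched as follows. The function names `previous_day_window` and `in_window` are hypothetical, and the half-open interval convention is an assumption made for illustration; how texts are actually fetched from the network platforms is not shown.

```python
from datetime import datetime, timedelta
from typing import Tuple


def previous_day_window(run_at: datetime) -> Tuple[datetime, datetime]:
    """Return the half-open interval [start, end) covering the day before run_at."""
    end = run_at.replace(hour=0, minute=0, second=0, microsecond=0)
    start = end - timedelta(days=1)
    return start, end


def in_window(published: datetime, window: Tuple[datetime, datetime]) -> bool:
    """True if a text's publication time falls inside the collection window."""
    start, end = window
    return start <= published < end
```

A run started shortly after midnight therefore selects only texts published on the preceding calendar day.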
Step S12, obtaining the similarity between the texts;
In this embodiment, the similarity between texts can be calculated using cosine similarity, the Jaccard distance, and the like.
Optionally, a weighted Jaccard distance between the texts is calculated and used as the measure of similarity: the smaller the weighted Jaccard distance, the higher the similarity between the texts. Specifically, the event main bodies and event contents in each text are determined, and the weighted Jaccard distance is calculated according to their frequency of occurrence. For example, when the text d includes a plurality of event main bodies $\varepsilon(d)$ and a plurality of event contents $S(d)$, the text d may be expressed as $f(d) = \varepsilon(d) \cup S(d)$; likewise, when the text d' includes a plurality of event main bodies $\varepsilon(d')$ and a plurality of event contents $S(d')$, the text d' may be expressed as $f(d') = \varepsilon(d') \cup S(d')$. The weighted Jaccard distance $\delta(d, d')$ between the text d and the text d' is then expressed as follows:
Figure BDA0002555158580000111
where weight (e) represents the frequency of occurrence of the event body or event content e in the text.
When the similarity between the text d and the text d' is low, $f(d) \cap f(d')$ is very small or even empty, and is much smaller than $f(d) \cup f(d')$; the weighted Jaccard distance $\delta(d, d')$ is then large, with a maximum value of 1. When the similarity between the text d and the text d' is high, $f(d) \cap f(d')$ approaches $f(d) \cup f(d')$, and the weighted Jaccard distance $\delta(d, d')$ becomes small, with a minimum value of 0.
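As an illustration of the behaviour just described, the following sketch computes a weighted Jaccard distance over the event main bodies and event contents of two texts. It assumes the common form "1 minus the weight of the shared elements divided by the weight of all elements", and it assumes that weight(e) is the total number of occurrences of e across both texts; the exact expression in the formula figure may differ. The function name is hypothetical.

```python
from collections import Counter
from typing import Iterable


def weighted_jaccard_distance(elems_a: Iterable[str],
                              elems_b: Iterable[str]) -> float:
    """Distance in [0, 1]: 0 for identical element sets, 1 for disjoint ones."""
    freq_a, freq_b = Counter(elems_a), Counter(elems_b)
    union = set(freq_a) | set(freq_b)
    if not union:
        return 0.0  # assumption: two empty texts are treated as identical
    common = set(freq_a) & set(freq_b)

    def weight(e: str) -> int:
        # assumption: weight(e) is the frequency of e summed over both texts
        return freq_a[e] + freq_b[e]

    shared = sum(weight(e) for e in common)
    total = sum(weight(e) for e in union)
    return 1.0 - shared / total
```

With this form, two texts that mention the same company but different event contents fall between the two extremes, which matches the qualitative description of the distance.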
Optionally, when determining the event subject and the event content in the text, the texts corresponding to the large amount of public opinion data obtained from the multiple data platforms may first be preprocessed. The preprocessing includes at least one of Uniform Resource Locator (URL) deduplication, word segmentation, invalid information elimination, and text deduplication. In URL deduplication, the URL address corresponding to each text is determined, texts with the same URL address are regarded as the same text, and the redundant copies are deleted. Word segmentation is the process of recombining a continuous character sequence into a word sequence according to a certain standard, and can be implemented by a method based on character string matching, a method based on understanding, or a method based on statistics. The text may contain meaningless content, such as stop words and meaningless symbols (for example, "ado-ado"), so invalid information needs to be eliminated to obtain a sequence of words including subjects, predicates, and objects. Text deduplication can be implemented with the simhash algorithm, which comprises five steps: word segmentation, hashing, weighting, merging, and dimension reduction.
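The URL deduplication and the five simhash steps can be sketched as below. This is a minimal illustration, not the implementation used by the system: the tokens are assumed to be already segmented, term frequency is used as the weight, MD5 is chosen arbitrarily as the hash, and the names `simhash`, `hamming` and `dedup_by_url` are hypothetical.

```python
import hashlib
from collections import Counter
from typing import Dict, Iterable, List

BITS = 64  # fingerprint length (assumption)


def simhash(tokens: Iterable[str]) -> int:
    """Compute a 64-bit simhash fingerprint from already-segmented tokens."""
    weights = Counter(tokens)                    # weighting: term frequency
    acc = [0] * BITS
    for token, w in weights.items():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")    # hashing: 64-bit token hash
        for i in range(BITS):                    # merging: add or subtract weight per bit
            acc[i] += w if (h >> i) & 1 else -w
    fingerprint = 0
    for i in range(BITS):                        # dimension reduction: keep the sign only
        if acc[i] > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming(a: int, b: int) -> int:
    """Number of differing bits; a small value suggests near-duplicate texts."""
    return bin(a ^ b).count("1")


def dedup_by_url(docs: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first text seen for each URL address."""
    seen: Dict[str, Dict[str, str]] = {}
    for doc in docs:
        seen.setdefault(doc["url"], doc)
    return list(seen.values())
```

Two texts would then be treated as duplicates when the Hamming distance between their fingerprints is below a chosen cutoff; the cutoff is a tuning choice that the text does not specify.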
Optionally, after data preprocessing is performed on the text to obtain a sequence of words including a subject, a predicate, and an object, the event subject and the event content in the word sequence are determined according to a preset matching rule. The preset matching rule may include a plurality of pre-stored event subjects; when the word sequence contains a pre-stored event subject, that pre-stored event subject is taken as the event subject of the word sequence. In the securities industry, event subjects are generally company names, such as limited liability companies, joint-stock limited companies, and the like. In addition, when the preset matching rule is generated, pre-stored event subjects with a high degree of co-occurrence are identified among the plurality of pre-stored event subjects according to an N-Gram language model, and these associated pre-stored event subjects are treated as the same event subject when generating the corresponding preset matching rule.
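A minimal sketch of matching a word sequence against pre-stored event subjects is given below. The company names, the alias table and the function name are hypothetical examples; in particular, the alias table stands in for the result of grouping co-occurring pre-stored subjects and does not show how the N-Gram model derives that grouping.

```python
from typing import Dict, List, Optional

# Hypothetical pre-stored event subjects: each alias maps to one canonical subject.
SUBJECT_ALIASES: Dict[str, str] = {
    "Acme Co., Ltd.": "Acme Co., Ltd.",
    "Acme": "Acme Co., Ltd.",
    "Globex Corp.": "Globex Corp.",
}


def match_event_subjects(words: List[str],
                         aliases: Optional[Dict[str, str]] = None) -> List[str]:
    """Return the canonical event subjects found in a word sequence, in order."""
    table = SUBJECT_ALIASES if aliases is None else aliases
    found: List[str] = []
    for word in words:
        canonical = table.get(word)
        if canonical is not None and canonical not in found:
            found.append(canonical)
    return found
```

Mapping aliases to a canonical name means that a short name and a full company name in the same text count as one event subject.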
Optionally, when determining the event subject in the word sequence, Word Sense Disambiguation (WSD) may first be performed on the words in the sequence, and event subject matching is then performed on the disambiguated word sequence. In computational linguistics, word sense disambiguation is an open problem of natural language processing and ontology. Ambiguity and disambiguation are among the core problems of natural language understanding: the meaning of language can vary with context at the word, sentence, and discourse levels, and disambiguation is the process of determining the intended meaning of an object according to its context. The event content in the word sequence can be determined in a manner similar to that used for the event subject, which is not described in detail here.
Optionally, a text generally includes a title and body content, the title being a condensed summary of the body content. When a text is preprocessed, the title is preprocessed first to obtain the word sequence corresponding to the title. If the event subject and the event content in the word sequence corresponding to the title cannot be determined according to the preset matching rule, the body content is then preprocessed; the event subject in the word sequence corresponding to the body content is taken as the event subject of the text, and the event content in that word sequence is taken as the event content of the text. Processing the title first improves data processing efficiency. It should be noted that, if the event subject and the event content in the word sequence corresponding to the body content cannot be determined either, the text is considered useless data and is discarded.
Optionally, for the text content with a longer length, the text content may be further truncated to divide the text content into a plurality of portions, and since the main content of the text is generally concentrated in the head portion of the text content, the data preprocessing may be preferentially performed according to the head portion of the text content, and the event body and the event content corresponding to the head portion may be used as the event body and the event content corresponding to the text.
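The title-first strategy with a fallback to the head of the body can be sketched as follows. The extractor is passed in as a function, and the names `extract_event` and `head_chars`, as well as the default of 200 characters, are hypothetical; the sketch also assumes that both an event subject and an event content must be found for a text to be kept.

```python
from typing import Callable, List, Optional, Tuple

# An extractor maps a piece of text to (event subjects, event contents).
Extractor = Callable[[str], Tuple[List[str], List[str]]]


def extract_event(title: str, body: str, extract: Extractor,
                  head_chars: int = 200) -> Optional[Tuple[List[str], List[str]]]:
    """Try the title first, then the head of the body; None means discard the text."""
    subjects, contents = extract(title)
    if subjects and contents:
        return subjects, contents
    # Fallback: only the head portion of the body is preprocessed.
    subjects, contents = extract(body[:head_chars])
    if subjects and contents:
        return subjects, contents
    return None  # neither title nor body head yields an event; text is useless data
```

Because most texts are resolved from the title alone, the longer body content is only processed when needed.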
Step S13, determining a cohesion value among a plurality of texts according to the similarity;
step S14, determining a plurality of text sets according to the cohesiveness numerical value, wherein the text sets comprise a plurality of texts;
step S15, taking the text set as the public opinion event;
step S16, forming the public sentiment event set according to a plurality of public sentiment events.
In this embodiment, when determining the cohesion value between the texts according to the similarity, the similarity may represent the relative distance between different texts, and at this time, one point may represent one text, and the texts may be converted into a plurality of points at different positions on the plane, and the distance between the points is fixed. Therefore, the cohesiveness value of a plurality of points can be calculated by the calculation formula of the cohesiveness value, the cohesiveness value is used to indicate the degree of association between the plurality of points, and the higher the cohesiveness value is, the more similar the text corresponding to the plurality of points is. For example, when the distance between texts is the weighted Jaccard distance δ (d, d'), the expression of the cohesion value Φ of the set c composed of a plurality of texts is as follows:
Figure BDA0002555158580000131
where | c | represents the number of texts in the set c, δ (d, d ') is the weighted Jaccard distance between two different texts d and d' in the set c, and the denominator in the expression represents the sum of the distances of all texts in the set c.
All texts in the public opinion data are randomly divided into different sets, and the cohesion values of the sets obtained under different divisions are calculated. The division for which the sum of the cohesion values of all sets is largest is selected, and each set in that division is taken as a text collection, so that a plurality of text collections are obtained, each comprising a plurality of texts. After the text collections are obtained, each single text collection is taken as a single public opinion event, and the collection of all public opinion events generated from the public opinion data is taken as the public opinion event collection.
Optionally, after the plurality of text sets are determined according to the cohesion value, the text sets may be further corrected so that abnormal texts do not distort the division. Specifically, since the texts in a text collection can be converted into points with fixed relative positions on the plane, a circle of minimum area can be generated that contains the points corresponding to all texts in the text collection. The size of this circle is fixed, so its center and radius can be determined: the radius of the circle is the radius of the text collection, and the center of the circle is the centroid of the text collection. Taking a text collection c that includes the texts d and d' as an example, the centroid $\Gamma_c$ of the text collection is expressed as:
Figure BDA0002555158580000132
the expression for the radius ρ of the text collection is:
Figure BDA0002555158580000133
acquiring the number k of texts contained in the text collection, and acquiring the density psi of the text collection according to the number k of texts and the radius rho of the text collection, wherein the expression of the density psi is as follows:
Figure BDA0002555158580000134
the expression of the size S of the corresponding text collection is:
Figure BDA0002555158580000135
where $\rho_k$ represents the radius of a text collection containing k texts.
A threshold corresponding to the size S of the text collection is acquired, and texts are added to or deleted from the text collection so that the size S of the corrected text collection is larger than the threshold and its density ψ is as large as possible. Generally, the size S increases when texts are added to the text collection and decreases when texts are deleted from it. Therefore, it is first judged whether the size S of the text collection is larger than the threshold. If not, texts are added to the text collection until its size S is larger than the threshold. If so, texts are added or deleted according to the density of the text collection, on the premise that the size S always remains larger than the threshold, so as to find the text collection with the maximum density. This is the corrected text collection, and it is taken as a public opinion event, which prevents abnormal texts from causing an unreasonable division of the text collections.
In the technical scheme disclosed in this embodiment, public sentiment data generated in a preset time period is collected through a preset network platform, the public sentiment data is divided into a plurality of public sentiment events, and the plurality of public sentiment events form a public sentiment event aggregate, so that the purpose of extracting the public sentiment events from a large amount of public sentiment data is realized.
In yet another embodiment, as shown in fig. 4, on the basis of the embodiment shown in any one of fig. 2 to 3, step S20 includes:
step S21, determining the centroid corresponding to the text set contained in the public sentiment event set;
and step S22, determining the related public sentiment events in the public sentiment event set according to the distance between the centroids.
In the present embodiment, similar to the expression of the centroid $\Gamma_c$ of the text set in the embodiment shown in fig. 3, the expression of the centroid corresponding to the text set included in the public sentiment event set is as follows:
Figure BDA0002555158580000141
wherein, the texts d and d ' are texts in a text set contained in the public sentiment event, and δ (d, d ') is a weighted Jaccard distance between the texts d and d '.
As shown in fig. 6, fig. 6 is a schematic diagram of a public sentiment event context, where a range of each circle in fig. 6 is a public sentiment event, all the public sentiment events form a public sentiment event collection, a point in each circle is a text included in the public sentiment event, and a circle center is a centroid corresponding to the public sentiment event. Since the text in the text collection can be converted into a point with a fixed relative position on the plane, the distance between the centroids of different public sentiment events is also relatively fixed, and thus the distance between the centroids of different public sentiment events can be obtained, for example, the distance between the centroid of the public sentiment event c and the centroid of the public sentiment event c 'is δ (c, c').
After the distances between the centroids are determined, public sentiment events corresponding to different time periods are obtained, and the time characteristic values of these public sentiment events are calculated. The time characteristic value represents the degree of correlation between public sentiment events, and the associated public sentiment events in the public sentiment event set can be determined according to it. For example, as shown in fig. 6, taking one day as a time period and today as the t-th day, the public sentiment event $c_i$ corresponding to the i-th day is obtained, where $i \le t$, and the time characteristic value $H_c(c)$ of the public sentiment events corresponding to the different time periods is expressed as follows:
Figure RE-GDA0002609752270000151
where $S(c_i)$ is the size of the public sentiment event $c_i$, for which reference may be made to the expression of the size S of the text collection in the embodiment shown in fig. 3; $\delta(c_i, c_{i-1})$ is the distance between the centroid of the public sentiment event $c_i$ corresponding to the i-th day and the centroid of the public sentiment event $c_{i-1}$ corresponding to the (i-1)-th day; α and β are preset hyper-parameters; and e is the base of the natural logarithm, an irrational number. The magnitude of $\delta(c_i, c_{i-1})$ measures whether a public sentiment event persists: when $\delta(c_i, c_{i-1})$ exceeds a certain threshold, the centroid of the public sentiment event on the current day is too far from the centroid of the public sentiment event on the previous day, the correlation between the public sentiment events on the two days is poor, and the public sentiment event does not continue, gradually loses heat, or shifts in topic over time. One public sentiment event is randomly selected from the plurality of public sentiment events corresponding to each time period to obtain a group of public sentiment events, and the time characteristic values corresponding to the groups obtained in this way are calculated. When the sum of the time characteristic values corresponding to all groups is largest, the public sentiment events within a group are taken as associated public sentiment events. The two public sentiment events joined by a bidirectional arrow in fig. 6 are associated public sentiment events; the associated events form a public sentiment event subset, and the public sentiment events in the subset are similar events.
In the technical scheme disclosed in the embodiment, the mass centers corresponding to the text sets contained in the public sentiments in the public sentiment event set are determined, and the public sentiments with correlation are determined according to the distances between the mass centers, so that the purpose of finding out the public sentiment event venation from a large number of public sentiments is achieved.
In another embodiment, as shown in fig. 5, on the basis of the embodiment shown in any one of fig. 2 to 4, step S40 includes:
step S41, generating an interaction information matrix corresponding to the propagation subject according to the historical interaction data;
in this embodiment, when obtaining the influence value of the public sentiment event subset according to the historical interaction data, first, an interaction information matrix corresponding to the propagation subject of the public sentiment event subset is generated according to the historical interaction data. The expression of the interaction information matrix m (t) is as follows:
$M(t) = (\mu_{ij}(t))_{N \times N}$
where the public sentiment event subset comprises N propagation subjects, which form an N × N matrix; i and j are propagation subjects in the public sentiment event subset, with $i \in N$ and $j \in N$; and $\mu_{ij}(t)$ denotes the interaction ratio of subject j toward subject i during the time period t. For example, if on day t subject j performs 25 interactions (follow, comment, forward, like) with subject i, and subject i publishes 40 texts on the network on day t, then the interaction ratio of subject j toward subject i is $\mu_{ij}(t) = 25/40 = 0.625$.
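The construction of the interaction information matrix can be sketched as below, reproducing the 25/40 example. The function name and the input layout are hypothetical, and two conventions are assumptions made for illustration: the diagonal is left at 0, and a subject that published no texts in the period receives a ratio of 0.

```python
from typing import Dict, List, Tuple


def interaction_matrix(subjects: List[str],
                       interactions: Dict[Tuple[str, str], int],
                       posts: Dict[str, int]) -> List[List[float]]:
    """mu[i][j] = interactions of subject j toward subject i / texts published by i.

    interactions maps (actor, target) to the interaction count in the period;
    posts maps a subject to the number of texts it published in the period.
    """
    n = len(subjects)
    mu = [[0.0] * n for _ in range(n)]
    for i, target in enumerate(subjects):
        published = posts.get(target, 0)
        if published == 0:
            continue  # assumption: no published texts means a ratio of 0
        for j, actor in enumerate(subjects):
            if i != j:  # assumption: self-interaction is not counted
                mu[i][j] = interactions.get((actor, target), 0) / published
    return mu
```

For two subjects `["i", "j"]` with 25 interactions from j toward i and 40 texts published by i, the entry `mu[0][1]` is 0.625, as in the example.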
Step S42, carrying out iteration processing on the interaction information matrix according to a preset function to obtain an influence value matrix;
in this embodiment, after the interaction information matrix m (t) is obtained, an influence value matrix g (t) is constructed by an iterative algorithm. The expression of G (t) is as follows:
$G(t) = (g_{ij}(t))_{N \times N}$
where $g_{ij}(t)$ is the influence value of subject i on subject j during the time period t.
When determining $g_{ij}(t)$, three cases are distinguished:
1. When i ≠ j, $g_{ij}(t)$ represents the influence value between different propagation subjects. An intermediate matrix P is first obtained from the interaction information matrix M(t); the intermediate matrix P is expressed as follows:
$P = (p_{ij}(t_l))_{N \times N} = (I + \lambda I - M(t_l)^T)^{-1}$
where $p_{ij}(t_l)$ represents the intermediate influence value of subject i on subject j in the time period l, I is the identity matrix, $M(t_l)^T$ is the transpose of the interaction information matrix $M(t_l)$ over the time period l, and λ is a hyper-parameter used to estimate the decay of influence during network propagation.
$g_{ij}(t)$ is expressed as follows:
Figure BDA0002555158580000161
where $p_{ji}(t)$ represents the intermediate influence value of subject j on subject i during the time period t, which can be obtained from the intermediate matrix P; $p_{ii}(t)$ represents the intermediate confidence level (self-influence) of subject i during the time period t, which can also be obtained from the intermediate matrix P; and $g_{ii}(t)$ represents the confidence level (self-influence) of subject i during the time period t. The specific calculation of $g_{ii}(t)$ is described in the second and third cases;
2. When i = j and t = 0, i.e. at the initial time, no interaction information matrix exists, and $g_{ij}(t)$ is expressed as:
$g_{ij}(t) = g_{ii}(0) = 1$,
where $g_{ii}(0)$ indicates that the confidence level (self-influence) of the propagation subject i at the initial time is 1.
3. When i = j and t > 0, other network subjects k interact with the propagation subject i, so an interaction information matrix exists. The confidence level $g_{ii}(t)$ of the propagation subject i is obtained from the interaction information matrix M(t) and the influence values between the propagation subject i and the other network subjects k; $g_{ii}(t)$ is expressed as follows:
Figure BDA0002555158580000171
where λ is a hyper-parameter used to estimate the decay of influence during network propagation, $\mu_{ki}(t)$ is the interaction ratio of subject k to subject i in the time period t, and $g_{ik}(t-1)$ represents the influence value of subject i on subject k during the time period (t-1). The influence values of the propagation subjects of the public sentiment event subset are calculated by iterating over the time period t and the time period (t-1), and the influence value matrix is constructed from them.
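The intermediate matrix of the first case, P = (I + λI − Mᵀ)⁻¹, can be computed as in the following sketch. Only this matrix is shown; the expressions for the influence values and confidence levels are given in the formula figures and are not reproduced here. The function names are hypothetical, and a plain Gauss-Jordan inversion is used only to keep the sketch free of external libraries.

```python
from typing import List

Matrix = List[List[float]]


def invert(a: Matrix) -> Matrix:
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(row) + [1.0 if r == c else 0.0 for c in range(n)]
           for r, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * pv for v, pv in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def intermediate_matrix(m: Matrix, lam: float) -> Matrix:
    """P = (I + lam * I - M^T)^(-1) for an interaction information matrix M."""
    n = len(m)
    a = [[(1.0 + lam if r == c else 0.0) - m[c][r] for c in range(n)]
         for r in range(n)]
    return invert(a)
```

When M is the zero matrix, P reduces to I / (1 + λ), which gives a quick sanity check on an implementation.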
And step S43, acquiring the influence value according to the influence value matrix.
In this embodiment, after the influence value matrix G(t) is obtained, the influence value of the public sentiment event subset is calculated according to the influence value matrix G(t). Specifically, the event characteristics of the public sentiment event subset are obtained; the event characteristics comprise the number of public sentiment events contained in the public sentiment event subset, and this number can be the number of texts contained in the public sentiment event subset. The instant influence $SA_t$ of the public sentiment event subset in the time period t is calculated from the influence values between the propagation subjects in the influence value matrix and the event characteristics; $SA_t$ is expressed as follows:
Figure BDA0002555158580000172
where $d_i(t)$ is the number of texts issued by the propagation subject i in the time period t, i.e. the event characteristic, and $g_{ij}(t)$ represents the influence value of subject i on subject j over the time period t. It can be seen that the instant influence is determined by the influence of the propagation subjects themselves and by the event characteristics of the public sentiment event subset.
As shown in fig. 7, a function η (t) of a jump diffusion process based on a Gamma (Gamma) distribution is introduced to simulate a change in an influence value with time in network propagation, and the expression of the function η (t) is as follows:
Figure BDA0002555158580000173
where e is the base of the natural logarithm, and κ and τ are preset hyper-parameters of the jump process; κ is generally 1.5, and τ is generally 42.24.
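For orientation only, the sketch below evaluates the standard Gamma probability density with shape κ and scale τ, using the typical values given above. Treating η(t) as exactly this density, and treating τ as the scale parameter, are assumptions; the expression in the formula figure may be parameterized differently.

```python
import math


def eta(t: float, kappa: float = 1.5, tau: float = 42.24) -> float:
    """Standard Gamma density with shape kappa and scale tau (assumed form of eta)."""
    if t <= 0:
        return 0.0
    return (t ** (kappa - 1) * math.exp(-t / tau)
            / (math.gamma(kappa) * tau ** kappa))
```

Under these assumptions the curve rises from 0, peaks at t = (κ − 1)τ, and then decays, which is the general shape expected of an influence that builds up and then fades.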
As shown in fig. 8, a function H(t) describing the propagation and dissipation of the influence value is calculated from the function η(t); the expression of the function H(t) is as follows:
Figure BDA0002555158580000174
The time span w of the publication times corresponding to the public sentiment event subset is calculated, i.e. the difference between the latest time period $t_f$ and the earliest time period $t_l$ of publication in the public sentiment event subset: $w = t_f - t_l$.
According to the instant influence $SA_t$ and the time span w, and based on the function H(t) or the function η(t), the influence value $PCDSA_{t_f,w}$ of the public sentiment event subset is calculated; $PCDSA_{t_f,w}$ is expressed as follows:
Figure BDA0002555158580000181
The influence value represents the accumulated influence generated, as of time $t_f$, by the events corresponding to the public sentiment event subset over the past time span of length w. The influence value includes the instant influence $SA_t$ and the cumulative effect of this instant influence as it propagates and diffuses through the propagation network.
In the technical scheme disclosed in this embodiment, the interaction information matrix corresponding to the propagation subjects is generated according to the historical interaction data, and the influence value matrix is then obtained through iterative processing, so as to determine the influence value of the public opinion event subset. This achieves automatic estimation of public opinion risk and improves the efficiency of public opinion risk early warning.
In addition, an embodiment of the present invention further provides a public opinion risk early warning device, which includes: a memory, a processor, and a public opinion risk early warning program stored on the memory and executable on the processor, wherein the public opinion risk early warning program, when executed by the processor, implements the steps of the public opinion risk early warning method according to the above embodiments.
In addition, an embodiment of the present invention further provides a computer storage medium, where a public opinion risk early warning program is stored on the computer storage medium, and when the public opinion risk early warning program is executed by a processor, the steps of the public opinion risk early warning method according to the above embodiments are implemented.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better embodiment. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by the contents of the present specification and drawings, or used directly or indirectly in other related fields, are included in the scope of the present invention.

Claims (10)

1. A public opinion risk early warning method is characterized by comprising the following steps:
acquiring a public opinion event set;
determining an associated public sentiment event in the public sentiment event set;
generating a public sentiment event subset according to the related public sentiment events;
acquiring a propagation main body of the public opinion event subset, and acquiring an influence value of the public opinion event subset according to historical interaction data of the propagation main body;
and generating and outputting early warning information according to the influence numerical value.
2. The public opinion risk pre-warning method as set forth in claim 1, wherein the historical interaction data includes at least one of a number of praise, a number of forward, and a number of comments.
3. The public opinion risk early warning method according to claim 1, wherein the step of obtaining the public opinion event set comprises:
collecting public opinion data generated in a preset time period through a preset network platform, wherein the public opinion data comprises a plurality of texts;
acquiring the similarity between the texts;
determining a cohesiveness numerical value among a plurality of texts according to the similarity;
determining a plurality of text sets according to the cohesiveness numerical value, wherein the text sets comprise a plurality of texts;
taking the text set as the public opinion event;
and forming the public opinion event set according to a plurality of public opinion events.
4. The public opinion risk warning method according to claim 3, wherein after the step of determining the plurality of text sets according to the cohesion values, the public opinion risk warning method comprises:
acquiring the number of texts contained in the text set;
acquiring the radius of the text set;
acquiring the density of the text set according to the number and the radius;
and adding or deleting texts in the text set according to the density, wherein the corrected text set is taken as the public sentiment event.
5. The public opinion risk warning method of claim 1 wherein the step of determining the associated public opinion events in the set of public opinion events comprises:
determining a mass center corresponding to a text set contained in a public sentiment event in the public sentiment event set;
and determining the related public opinion events in the public opinion event set according to the distance between the centroids.
6. The public opinion risk early warning method according to claim 1, wherein the step of obtaining the influence value of the public opinion event subset according to the historical interaction data of the propagation subject comprises:
generating an interaction information matrix corresponding to the propagation subject according to the historical interaction data;
performing iterative processing on the interaction information matrix according to a preset function to obtain an influence value matrix;
and acquiring the influence value according to the influence value matrix.
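The claim leaves the preset function open. As one possible reading, and explicitly not the claimed function, the sketch below substitutes a PageRank-style power iteration: interactions such as reposts are counted into a matrix, the matrix is column-normalised, and scores are iterated with a damping factor. The record format `(actor, target)`, the function names, and the `damping` and `iterations` parameters are assumptions.

```python
from __future__ import annotations


def interaction_matrix(
    subjects: list[str], interactions: list[tuple[str, str]]
) -> list[list[float]]:
    """m[i][j] counts how often subject j interacted with (e.g. reposted) subject i."""
    index = {s: k for k, s in enumerate(subjects)}
    n = len(subjects)
    m = [[0.0] * n for _ in range(n)]
    for actor, target in interactions:
        m[index[target]][index[actor]] += 1.0
    return m


def influence_scores(
    m: list[list[float]], damping: float = 0.85, iterations: int = 50
) -> list[float]:
    """PageRank-style iteration (assumed stand-in for the preset function)."""
    n = len(m)
    col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Score held by subjects with no outgoing interactions is spread uniformly.
        dangling = sum(scores[j] for j in range(n) if col_sums[j] == 0)
        updated = []
        for i in range(n):
            inflow = sum(
                m[i][j] / col_sums[j] * scores[j]
                for j in range(n)
                if col_sums[j] > 0
            )
            updated.append((1 - damping) / n + damping * (inflow + dangling / n))
        scores = updated
    return scores
```

The result is a single-column influence value matrix with one score per propagation subject, and the scores sum to 1. For subjects `a`, `b`, `c` where `b` and `c` both repost `a` and `c` also reposts `b`, subject `a` receives the highest score.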
7. The public opinion risk early warning method according to claim 6, wherein the step of obtaining the influence value according to the influence value matrix comprises:
acquiring event characteristics of the public opinion event subset, wherein the event characteristics comprise the number of public opinion events contained in the public opinion event subset;
acquiring the time span of the release time corresponding to the public opinion event subset;
and acquiring the influence value according to the event characteristic, the time span and the influence value matrix.
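The claim names the three inputs but not how they are combined. The following sketch therefore uses an assumed formula, chosen only to show the data flow: the number of events divided by the time span in days (floored at one day) gives an intensity, which is multiplied by the summed influence scores of the propagation subjects involved. The function `subset_influence` and its parameters are hypothetical.

```python
from __future__ import annotations

from datetime import datetime


def subset_influence(
    event_count: int,
    release_times: list[datetime],
    subject_scores: list[float],
) -> float:
    """Combine event count, time span and subject scores (assumed formula)."""
    span_seconds = (max(release_times) - min(release_times)).total_seconds()
    span_days = max(span_seconds / 86400, 1.0)  # floor at one day to avoid division by ~0
    intensity = event_count / span_days  # events per day within the subset
    return intensity * sum(subject_scores)


# Four events released over two days by subjects scoring 0.5 and 0.25.
value = subset_influence(
    4,
    [datetime(2020, 6, 1), datetime(2020, 6, 3)],
    [0.5, 0.25],
)
print(value)  # 1.5
```

Under this assumption, a subset with many events concentrated in a short period and spread by high-scoring subjects receives a larger influence value.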
8. The public opinion risk early warning method according to claim 1, wherein the step of generating and outputting early warning information according to the influence value comprises:
when the influence value is larger than a preset threshold value, acquiring an event subject corresponding to the public opinion event subset;
acquiring the target amount corresponding to the event subject;
and generating and outputting the early warning information according to the public opinion event subset, the influence value and the target amount.
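A minimal sketch of the thresholding step is given below under stated assumptions: the event subject is passed in by the caller, the amount associated with each subject is looked up in a plain dictionary, and the early warning information is represented by a small dataclass. `RiskWarning`, `build_warning` and the lookup table are hypothetical names, not part of the claimed method.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class RiskWarning:
    """Illustrative container for the early warning information."""
    subject: str
    events: list[str]
    influence: float
    amount: float


def build_warning(
    events: list[str],
    influence: float,
    threshold: float,
    subject: str,
    amounts: dict[str, float],
) -> Optional[RiskWarning]:
    """Return a warning only when the influence value exceeds the threshold."""
    if influence <= threshold:
        return None
    amount = amounts.get(subject, 0.0)  # assumed lookup of the subject's amount
    return RiskWarning(subject, list(events), influence, amount)
```

For example, `build_warning(["e1", "e2"], 1.5, 1.0, "Company A", {"Company A": 2.0e8})` yields a `RiskWarning` for Company A, while an influence value of 0.5 with the same threshold yields `None`.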
9. A public opinion risk early warning device, characterized in that the public opinion risk early warning device comprises: a memory, a processor, and a public opinion risk early warning program stored on the memory and executable on the processor, wherein the public opinion risk early warning program, when executed by the processor, implements the steps of the public opinion risk early warning method according to any one of claims 1 to 8.
10. A computer storage medium, wherein a public opinion risk early warning program is stored on the computer storage medium, and the public opinion risk early warning program, when executed by a processor, implements the steps of the public opinion risk early warning method according to any one of claims 1 to 8.
CN202010595267.5A 2020-06-24 2020-06-24 Public opinion risk early warning method and device and computer storage medium Pending CN112541358A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010595267.5A CN112541358A (en) 2020-06-24 2020-06-24 Public opinion risk early warning method and device and computer storage medium

Publications (1)

Publication Number Publication Date
CN112541358A (en) 2021-03-23

Family

ID=75013449

Country Status (1)

Country Link
CN (1) CN112541358A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160098738A1 (en) * 2014-10-06 2016-04-07 Chunghwa Telecom Co., Ltd. Issue-manage-style internet public opinion information evaluation management system and method thereof
CN108932291A (en) * 2018-05-23 2018-12-04 福建亿榕信息技术有限公司 Power grid public sentiment evaluation method, storage medium and computer
WO2019227710A1 (en) * 2018-05-31 2019-12-05 平安科技(深圳)有限公司 Network public opinion analysis method and apparatus, and computer-readable storage medium
CN110705288A (en) * 2019-09-29 2020-01-17 武汉海昌信息技术有限公司 Big data-based public opinion analysis system
CN110750636A (en) * 2018-07-04 2020-02-04 百度在线网络技术(北京)有限公司 Network public opinion information processing method and device
CN111324789A (en) * 2020-02-13 2020-06-23 创新奇智(上海)科技有限公司 Method for calculating network information data heat

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination