CN114840725A - Event correlation based public opinion security analysis method and device and electronic equipment - Google Patents


Publication number
CN114840725A
Authority
CN
China
Prior art keywords
public
public opinion
opinion data
sentiment
classification set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210509195.7A
Other languages
Chinese (zh)
Inventor
厉山山
吴业超
刘方舟
任天悦
刘辉耀
李冰
郭佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Qiyue Information Technology Co Ltd
Original Assignee
Shanghai Qiyue Information Technology Co Ltd
Application filed by Shanghai Qiyue Information Technology Co Ltd
Priority to CN202210509195.7A
Publication of CN114840725A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/90344 Query processing by using string matching techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/194 Calculation of difference between files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Computing Systems (AREA)
  • Economics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a public opinion security analysis method and device based on event correlation, and electronic equipment. The method comprises the following steps: classifying public opinion data into corresponding public opinion classification sets; performing dimensionality reduction on the public opinion data in a public opinion classification set, searching, based on a first time window, for the historical event classification set to which the processed public opinion data belongs, and assigning the public opinion data to that historical event classification set; performing correlation matching on the public opinion data within the same historical event classification set, and assigning the matched public opinion data to a negative public opinion set; and performing public opinion security analysis based on the public opinion classification sets, the historical event classification sets and the negative public opinion set. The invention supports classification, source tracing, fermentation analysis and secondary fermentation analysis of public opinion, so that the complete generation and fermentation process of a public opinion event can be restored quickly, hidden public opinion risks can be prevented, public opinion security can be safeguarded, and the image of enterprises and society can be maintained.

Description

Event correlation based public opinion security analysis method and device and electronic equipment
Technical Field
The invention relates to the technical field of data processing, in particular to a public opinion security analysis method and device based on event correlation, electronic equipment and a computer readable medium.
Background
With the rapid development and popularization of the internet, people have become used to publishing their opinions or remarks on social hotspots, public affairs and the like over the network; meanwhile, self-media and social platforms in various forms have emerged, such as public accounts, microblogs and the like. When social events and social problems occur, people often learn the causes and development of the events quickly through a media platform and then publish opinions through network media; these opinions have a non-negligible effect on the development of the events, and public opinion is thereby generated. In the course of the occurrence and propagation of public opinion, some highly repetitive accounts, content-forwarding accounts and the like often orchestrate and distort events, seriously affecting the public image of enterprises, individuals and even society, so public opinion security is very important to enterprises and society.
At present, because network transmission is fast, wide-reaching and highly interactive, online public opinion often grows explosively and takes complicated forms. As a result, public opinion information is large in volume, varied in content and difficult to trace, with many kinds of publishers, flexible forwarding and numerous points of interest, so public opinion cannot be analyzed quickly and effectively.
Disclosure of Invention
In view of the above, the present invention is directed to a method, an apparatus, an electronic device and a computer-readable medium for public opinion security analysis based on event correlation, so as to at least partially solve at least one of the above technical problems.
In order to solve the technical problem, a first aspect of the present invention provides a public opinion security analysis method based on event correlation, including:
classifying public opinion data into corresponding public opinion classification sets;
performing dimensionality reduction on the public opinion data in the public opinion classification set, searching, based on a first time window, for a historical event classification set to which the processed public opinion data belongs, and assigning the public opinion data to the historical event classification set to which it belongs;
performing correlation matching on the public opinion data in the same historical event classification set, and assigning the matched public opinion data to a negative public opinion set;
and performing public opinion security analysis based on the public opinion classification set, the historical event classification set and the negative public opinion set.
According to a preferred embodiment of the present invention, the performing a dimension reduction process on the public opinion data in the public opinion classification set, and searching the historical event classification set to which the processed public opinion data belongs based on the first time window includes:
performing binary conversion on the public opinion data in the public opinion classification set, and segmenting the converted character string to obtain public opinion segments;
searching, within the first time window, for character strings to be compared that are related to each public opinion segment;
comparing, position by position, the character string containing the public opinion segment with each character string to be compared, and determining the character strings similar to the public opinion data;
and determining a historical event classification set to which the public opinion data belongs according to the character strings similar to the public opinion data.
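The four steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the "binary conversion" is a SimHash-style 64-bit fingerprint, that the fingerprint is cut into four 16-bit segments, and that "comparing by position" means counting differing bit positions (Hamming distance). The names `fingerprint`, `EventIndex`, `MAX_DISTANCE` and the numeric settings are hypothetical.

```python
import hashlib
from collections import defaultdict

BITS = 64           # fingerprint width (assumed)
SEGMENTS = 4        # segments per fingerprint (assumed)
SEG_BITS = BITS // SEGMENTS
MAX_DISTANCE = 3    # max differing bit positions still counted as similar (assumed)


def fingerprint(words):
    """Reduce a list of words to one 64-bit SimHash-style binary string."""
    counts = [0] * BITS
    for word in words:
        h = int.from_bytes(hashlib.md5(word.encode("utf-8")).digest()[:8], "big")
        for i in range(BITS):
            counts[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(BITS) if counts[i] > 0)


def segments(fp):
    """Split a fingerprint into (position, value) segments."""
    mask = (1 << SEG_BITS) - 1
    return [(k, (fp >> (k * SEG_BITS)) & mask) for k in range(SEGMENTS)]


def hamming(a, b):
    """Number of bit positions at which two fingerprints differ."""
    return bin(a ^ b).count("1")


class EventIndex:
    """Maps each segment to the historical fingerprints that contain it."""

    def __init__(self):
        self.by_segment = defaultdict(list)  # (pos, value) -> [(time, fp, event)]

    def add(self, fp, timestamp, event_id):
        for seg in segments(fp):
            self.by_segment[seg].append((timestamp, fp, event_id))

    def find_event(self, fp, window_start, window_end):
        """Return the closest event inside the time window, or None."""
        best = None
        for seg in segments(fp):
            for ts, other, event_id in self.by_segment.get(seg, []):
                if window_start <= ts <= window_end:
                    d = hamming(fp, other)
                    if d <= MAX_DISTANCE and (best is None or d < best[0]):
                        best = (d, event_id)
        return None if best is None else best[1]
```

With four segments and a distance limit of three, any two fingerprints that are close enough must share at least one identical segment, so the segment lookup does not miss candidates while avoiding a comparison against every stored string.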
According to a preferred embodiment of the invention, the method further comprises:
carrying out dimensionality reduction on the newly added public opinion data, searching a historical event classification set to which the newly added public opinion data belongs based on a first time window, classifying the newly added public opinion data into the historical event classification set to which the newly added public opinion data belongs, and sending event fermentation alarm information;
if the historical event classification set to which the processed new public opinion data belongs is not found in the first time window, replacing the first time window with a second time window, finding the historical event classification set to which the processed new public opinion data belongs based on the second time window, dividing the new public opinion data into the historical event classification set to which the new public opinion data belongs, and sending event secondary fermentation alarm information;
wherein: the second time window is earlier than the first time window.
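The two-window fallback can be sketched as below. This is an illustrative sketch under assumptions: the window lengths (7 and 90 days), the function name `assign_new_item` and the `Finder` callable type are hypothetical, and the actual search routine is passed in by the caller.

```python
from typing import Callable, Optional, Tuple

DAY = 86400  # seconds

# A finder takes (item, window_start, window_end) and returns an event id or None.
Finder = Callable[[str, float, float], Optional[str]]


def assign_new_item(item: str, now: float,
                    find_first: Finder, find_second: Finder,
                    first_span: float = 7 * DAY,
                    second_span: float = 90 * DAY
                    ) -> Tuple[Optional[str], Optional[str]]:
    """Return (event_id, alarm) for newly added public opinion data."""
    first_start = now - first_span
    # First time window: the most recent period.
    event = find_first(item, first_start, now)
    if event is not None:
        return event, "event fermentation alarm"
    # Second time window: the period immediately before the first window.
    event = find_second(item, first_start - second_span, first_start)
    if event is not None:
        return event, "event secondary fermentation alarm"
    return None, None
```

The two finders are separate parameters so that each window can use its own comparison mechanism; passing the same function for both is also possible.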
According to a preferred embodiment of the present invention, the comparison mechanism used to search, based on the first time window, for the historical event classification set to which the processed newly added public opinion data belongs is different from the comparison mechanism used for the search based on the second time window.
According to a preferred embodiment of the present invention, the method further comprises: and searching and displaying the target public sentiment from the public sentiment classification set, the historical event classification set and the negative public sentiment set according to the user search information.
According to a preferred embodiment of the present invention, the searching and displaying the target public sentiment from the public sentiment classification set, the historical event classification set and the negative public sentiment set according to the user search information comprises:
matching target public sentiments from the public sentiment classification set, the historical event classification set and the negative public sentiment set based on a plurality of dimensions according to user search information;
determining the matching degree of each target public opinion according to the weight value of each dimension;
and displaying each target public opinion according to the matching degree of the target public opinions.
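A weighted matching degree of this kind could be computed as in the sketch below. The dimension names, the weight values and the per-dimension scores in the range 0 to 1 are assumptions chosen for illustration; the source does not specify them.

```python
# Hypothetical dimensions and weights; the source does not list concrete values.
WEIGHTS = {"keyword": 0.5, "source": 0.2, "time": 0.3}


def matching_degree(dimension_scores, weights=WEIGHTS):
    """Weighted sum of per-dimension match scores (each assumed to be in [0, 1])."""
    return sum(w * dimension_scores.get(dim, 0.0) for dim, w in weights.items())


def rank_targets(targets, weights=WEIGHTS):
    """Order target public opinions from highest to lowest matching degree."""
    return sorted(targets,
                  key=lambda t: matching_degree(t["scores"], weights),
                  reverse=True)
```

For instance, a target that matches only on keyword scores 0.5 under these weights, while one with a half keyword match but full source and time matches scores 0.75 and is displayed first.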
According to a preferred embodiment of the present invention, before displaying each target public opinion according to the matching degree of the target public opinions, the method further comprises:
adjusting the matching degree of the target public sentiment according to the historical target public sentiment matched with the user searching information; or:
and adjusting the matching degree of the target public sentiment according to the final public sentiment selected from the historical target public sentiments by the user.
According to a preferred embodiment of the present invention, the matching of the target public sentiment from the public sentiment classification set, the historical event classification set and the negative public sentiment set based on a plurality of dimensions according to the user search information comprises:
matching user search information based on multiple dimensions from the public opinion classification set, the historical event classification set and the negative public opinion set respectively to obtain classified target public opinions, event target public opinions and negative target public opinions;
and deleting the public opinion data related to the negative target public opinion from the classified target public opinion and from the event target public opinion, respectively, and taking the remaining public opinion data as the target public opinion.
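As a rough sketch of this filtering step, assuming each public opinion item is a dictionary with an `id` field and that "related to" means "has the same id" (both assumptions, as is the function name `select_targets`):

```python
def select_targets(classified, event, negative):
    """Drop items that also appear in the negative matches; keep the rest once each."""
    negative_ids = {item["id"] for item in negative}
    seen, targets = set(), []
    for item in list(classified) + list(event):
        if item["id"] in negative_ids or item["id"] in seen:
            continue
        seen.add(item["id"])
        targets.append(item)
    return targets
```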
According to a preferred embodiment of the invention, the method further comprises:
constructing a user portrait according to the user's search history;
pushing information to the user based on the user portrait.
According to a preferred embodiment of the present invention, the classifying the public opinion data into corresponding public opinion classification sets includes:
carrying out correlation analysis on the public opinion data to obtain the relevant public opinion data;
classifying the relevant public opinion data according to the text attribute of the relevant public opinion data;
judging the emotion types of the classified public opinion data based on the positive and negative emotion corpus;
and processing the public opinion data after emotion classification by adopting a classification model to obtain a public opinion classification set in which the public opinion data is positioned.
According to a preferred embodiment of the invention, if the emotion type of the classified public opinion data is negative, a negative public opinion alarm is issued.
According to a preferred embodiment of the present invention, the public opinion data includes media public opinion data and social network public opinion data, and the method further comprises:
and carrying out correlation matching on the social network public opinion data, and distributing the matched social network public opinion data to a negative public opinion set.
In order to solve the above technical problems, a second aspect of the present invention provides a public opinion security analysis device based on event correlation, the device comprising:
the classification module is used for classifying the public sentiment data into corresponding public sentiment classification sets;
the dimensionality reduction searching module is used for carrying out dimensionality reduction on the public sentiment data in the public sentiment classified set, searching a historical event classified set to which the processed public sentiment data belongs based on a first time window, and dividing the public sentiment data into the historical event classified set to which the public sentiment data belongs;
the first matching module is used for carrying out correlation matching on the public opinion data in the same historical event classification set and dividing the matched public opinion data into negative public opinion sets;
and the analysis module is used for carrying out public opinion safety analysis based on the public opinion classification set, the historical event classification set and the negative public opinion set.
According to a preferred embodiment of the present invention, the dimension reduction and search module includes:
the segmentation module is used for performing binary conversion on the public opinion data in the public opinion classification set and segmenting the converted character string to obtain public opinion segments;
the sub-searching module is used for searching character strings to be compared related to all public sentiment sections in a first time window;
the comparison module is used for comparing the character string where the public opinion segment is located with the character string to be compared according to positions to determine the character string similar to the public opinion data;
and the sub-determination module is used for determining the historical event classification set to which the public sentiment data belongs according to the character strings similar to the public sentiment data.
According to a preferred embodiment of the invention, the device further comprises:
the first dimension reduction searching module is used for carrying out dimension reduction processing on the newly added public opinion data, searching a historical event classification set to which the newly added public opinion data belongs based on a first time window, classifying the newly added public opinion data into the historical event classification set to which the newly added public opinion data belongs, and sending out event fermentation alarm information;
the second dimensionality reduction searching module is used for replacing the first time window with a second time window if the historical event classification set to which the processed new public opinion data belongs is not searched in the first time window, searching the historical event classification set to which the processed new public opinion data belongs based on the second time window, classifying the new public opinion data into the historical event classification set to which the new public opinion data belongs, and sending out event secondary fermentation alarm information;
wherein: the second time window is earlier than the first time window.
According to a preferred embodiment of the present invention, the comparison mechanism of the first dimension reduction searching module and the second dimension reduction searching module is different.
According to a preferred embodiment of the invention, the device further comprises: and the searching and displaying module is used for searching and displaying the target public sentiment from the public sentiment classification set, the historical event classification set and the negative public sentiment set according to the searching information of the user.
According to a preferred embodiment of the present invention, the search presentation module includes:
the multi-dimension matching module is used for matching target public sentiments from the public sentiment classification set, the historical event classification set and the negative public sentiment set based on multiple dimensions according to user search information;
the determining module is used for determining the matching degree of each target public opinion according to the weight value of each dimension;
and the display module is used for displaying each target public opinion according to the matching degree of the target public opinions.
According to a preferred embodiment of the invention, the device further comprises:
the adjusting module is used for adjusting the matching degree of the target public sentiment according to the historical target public sentiment matched with the user searching information; or: and adjusting the matching degree of the target public sentiment according to the final public sentiment selected from the historical target public sentiments by the user.
According to a preferred embodiment of the present invention, the multidimensional matching module comprises:
the sub-matching module is used for matching user search information based on multiple dimensions from the public opinion classification set, the historical event classification set and the negative public opinion set respectively to obtain a classification target public opinion, an event target public opinion and a negative target public opinion;
and the deleting module is used for deleting the public opinion data related to the negative target public opinion from the classification target public opinion and from the event target public opinion, respectively, and taking the remaining public opinion data as the target public opinion.
According to a preferred embodiment of the invention, the device further comprises:
the construction module is used for constructing a user portrait according to the user's search history;
and the pushing module is used for pushing information to the user based on the user portrait.
According to a preferred embodiment of the invention, the classification module comprises:
the correlation analysis module is used for carrying out correlation analysis on the public opinion data to obtain the relevant public opinion data;
the sub-classification module is used for classifying the relevant public opinion data according to the text attribute of the relevant public opinion data;
the judging module is used for judging the emotion types of the classified public opinion data based on the positive and negative emotion corpus;
and the model processing module is used for processing the public opinion data after emotion classification by adopting the classification model to obtain a public opinion classification set in which the public opinion data is positioned.
According to a preferred embodiment of the invention, the device further comprises: and the alarm module is used for sending a negative public opinion alarm if the classified emotion types of the public opinion data are negative.
According to a preferred embodiment of the present invention, the public opinion data includes media public opinion data and social network public opinion data, and the device further comprises:
and the second matching module is used for carrying out correlation matching on the social network public opinion data and distributing the matched social network public opinion data to a negative public opinion set.
To solve the above technical problem, a third aspect of the present invention provides an electronic device, comprising:
a processor; and
a memory storing computer executable instructions that, when executed, cause the processor to perform the method described above.
To solve the above technical problems, a fourth aspect of the present invention provides a computer-readable storage medium, wherein the computer-readable storage medium stores one or more programs which, when executed by a processor, implement the above method.
The invention classifies public opinion data into corresponding public opinion classification sets, performs dimensionality reduction on the public opinion data in the public opinion classification sets, searches, based on a first time window, for the historical event classification set to which the processed public opinion data belongs, assigns the public opinion data to the corresponding historical event classification set, performs correlation matching on the public opinion data in the same historical event classification set, and assigns the matched public opinion data to a negative public opinion set. Public opinion categories can be analyzed according to the public opinion classification sets; source tracing, fermentation analysis and secondary fermentation analysis of public opinion can be performed according to the historical event classification sets; and repetitive accounts and forwarding accounts involved in the generation of public opinion can be tracked and analyzed according to the negative public opinion set. Therefore, the complete generation and fermentation process of a public opinion event is restored quickly, hidden public opinion risks are prevented, public opinion security is safeguarded, and the image of enterprises and society is maintained.
Drawings
In order to make the technical problems solved by the present invention, the technical means adopted and the technical effects obtained more clear, the following will describe in detail the embodiments of the present invention with reference to the accompanying drawings. It should be noted, however, that the drawings described below are only illustrations of exemplary embodiments of the invention, from which other embodiments can be derived by those skilled in the art without inventive step.
Fig. 1 is a flowchart of a public opinion security analysis method based on event correlation according to an embodiment of the present invention;
Fig. 2 is a flowchart of performing dimensionality reduction on the public opinion data in a public opinion classification set and searching, based on a first time window, for the historical event classification set to which the processed public opinion data belongs, according to an embodiment of the present invention;
Fig. 3 is a flowchart of searching for and displaying target public opinions from the public opinion classification set, the historical event classification set and the negative public opinion set according to user search information, according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a public opinion security analysis device based on event correlation according to an embodiment of the present invention;
Fig. 5 is a structural block diagram of an exemplary embodiment of an electronic device according to the present invention;
Fig. 6 is a schematic diagram of an embodiment of a computer-readable medium according to the present invention.
Detailed Description
Exemplary embodiments of the present invention will now be described more fully with reference to the accompanying drawings. The invention may, however, be embodied in many specific forms and should not be construed as limited to the embodiments set forth herein. Rather, these exemplary embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the invention to those skilled in the art.
The structures, properties, effects or other characteristics described in a certain embodiment may be combined in any suitable manner in one or more other embodiments, while still complying with the technical idea of the invention.
In describing particular embodiments, specific details of structures, properties, effects, or other features are set forth in order to provide a thorough understanding of the embodiments by one skilled in the art. However, it is not excluded that a person skilled in the art may implement the invention in a specific case without the above-described structures, performances, effects or other features.
The flow chart in the drawings is only an exemplary flow demonstration, and does not represent that all the contents, operations and steps in the flow chart are necessarily included in the scheme of the invention, nor does it represent that the execution is necessarily performed in the order shown in the drawings. For example, some operations/steps in the flowcharts may be divided, some operations/steps may be combined or partially combined, and the like, and the execution order shown in the flowcharts may be changed according to actual situations without departing from the gist of the present invention.
The block diagrams in the figures generally represent functional entities and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in the form of software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The same reference numerals denote the same or similar elements, components, or parts throughout the drawings, and thus a repetitive description thereof may be omitted hereinafter. It will be further understood that, although the terms first, second, third, etc. may be used herein to describe various elements, components, or sections, these elements, components, or sections should not be limited by these terms. That is, these terms are used only to distinguish one from another. For example, a first device may also be referred to as a second device without departing from the spirit of the present invention. Furthermore, the term "and/or" is intended to include all combinations of any one or more of the listed items.
Referring to Fig. 1, which is a flowchart of a public opinion security analysis method based on event correlation according to the present invention, the method includes:
S1, classifying the public opinion data into corresponding public opinion classification sets;
wherein: the public opinion data may include: media public opinion data from official media, social network public opinion data from a network social platform, media public opinion data from self-media, and the like.
In this embodiment, public opinion data is processed in text form. Therefore, before this step, public opinion texts may be obtained from official media, self-media and social networking platforms. This step classifies the public opinion texts according to attributes such as text length, time, emotion attribute, term frequency-inverse document frequency (TF-IDF) and source. For example, this step may include:
S11, performing correlation analysis on the public opinion data to obtain relevant public opinion data;
For example, similar or related texts among the public opinion texts can be identified based on a semantic model, and the similar or related public opinion texts are used as the relevant public opinion data.
S12, classifying the relevant public opinion data according to the text attribute of the relevant public opinion data;
Exemplary text attributes may include the length, time, term frequency-inverse document frequency (TF-IDF) value, source and other attributes of the text. For example, for TF-IDF processing, the public opinion texts are input into an Analyzer word segmenter and split into segmented words, stop words among the segmented words are filtered out to obtain effective words, TF-IDF processing is then performed on the public opinion texts based on the effective words to obtain the keywords of the public opinion texts, and the public opinion texts are classified based on the keywords. The step of filtering stop words from the segmented words to obtain effective words may include: judging whether stop words exist among the segmented words; if so, filtering the stop words out to obtain the effective words; if not, taking all the segmented words as effective words.
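As an illustration only, the keyword extraction described above can be sketched as follows. The sketch assumes whitespace tokenization in place of the Analyzer word segmenter, a small hypothetical stop-word list (`STOP_WORDS`), and a plain TF-IDF formula; the function names `effective_words` and `tfidf_keywords` are illustrative and are not part of the invention.

```python
import math
from collections import Counter

# Hypothetical stop-word list; a real deployment would load a full one.
STOP_WORDS = {"the", "a", "of", "is", "and"}

def effective_words(text):
    """Split on whitespace and drop stop words (stand-in for the word segmenter)."""
    return [tok for tok in text.lower().split() if tok not in STOP_WORDS]

def tfidf_keywords(texts, top_k=2):
    """Return the top_k TF-IDF keywords of each text."""
    docs = [effective_words(t) for t in texts]
    n_docs = len(docs)
    # Document frequency: in how many texts each word occurs.
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    keywords = []
    for doc in docs:
        tf = Counter(doc)
        scores = {
            tok: (count / len(doc)) * math.log(n_docs / doc_freq[tok])
            for tok, count in tf.items()
        }
        # Highest score first; ties are broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda tok: (-scores[tok], tok))
        keywords.append(ranked[:top_k])
    return keywords
```

Texts sharing the same top keywords could then be grouped into the same initial class.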
S13, judging the emotion type of the classified public opinion data based on the positive and negative emotion corpus;
In this embodiment, the positive and negative emotion corpus is a corpus of positive and negative text keywords constructed from historical corpora; with this corpus, word segmentation and emotion analysis can be performed on newly added public opinion data in real time. In this step, the segmented words remaining after stop-word filtering are compared with the positive and negative text keywords in the corpus, the numbers of positive words and negative words are counted, and the emotion type of the public opinion text is determined according to the ratio of positive words to negative words. For example, if the ratio of positive words to negative words is greater than a first threshold, the emotion type of the public opinion text is positive; if the ratio is smaller than a second threshold, the emotion type is negative.
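A minimal sketch of this ratio rule is given below. The word lists, the threshold values, and the "neutral" result for texts that fall between the two thresholds (or contain no corpus words) are assumptions made for illustration; the embodiment only defines the positive and negative cases.

```python
# Hypothetical corpus entries; the real corpus is built from historical data.
POSITIVE_WORDS = {"support", "praise", "good"}
NEGATIVE_WORDS = {"oppose", "criticize", "bad"}

def emotion_type(words, first_threshold=1.5, second_threshold=0.67):
    """Classify a list of effective words by its positive/negative word ratio."""
    pos = sum(w in POSITIVE_WORDS for w in words)
    neg = sum(w in NEGATIVE_WORDS for w in words)
    if pos == 0 and neg == 0:
        return "neutral"  # assumption: no corpus words means no judgement
    # Avoid division by zero when there are no negative words.
    ratio = pos / neg if neg else float("inf")
    if ratio > first_threshold:
        return "positive"
    if ratio < second_threshold:
        return "negative"
    return "neutral"  # assumption: between the two thresholds
```

A "negative" result is the condition under which the negative public opinion alarm would be raised.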
Furthermore, if the emotion type of the public opinion text is judged to be negative in this step, a negative public opinion alarm is sent, so that the public opinion can be monitored, followed and handled in time, and risky public opinion can be brought under control as early as possible.
S14, processing the emotion-classified public opinion data with a classification model to obtain the public opinion classification set to which the public opinion data belongs.
Illustratively, the public opinion texts with the same initial classification and emotion classification, and/or the word segments corresponding to these public opinion texts, are input into a classification model trained in advance, and the public opinion is finally classified according to the model output to obtain the public opinion classification sets of the public opinion texts and/or word segments. Wherein: a public opinion classification set corresponds to one public opinion category of one emotion type. The emotion type may include: support, praise, objection, criticism, etc. Public opinion categories may include: science, military, entertainment, politics, news, etc.
Furthermore, public opinion texts and/or word segments of the same class can be marked with the same label to distinguish them, and a public opinion classification database is constructed from the labeled public opinion texts and/or word segments for later use. To facilitate subsequent searching of the public opinion classification database, an inverted index can be established for the public opinion data while the database is constructed.
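For illustration, an inverted index over the word segments of the stored texts can be sketched as follows. The in-memory dictionary and the function names are assumptions; the embodiment does not specify the storage engine used for the index.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each word to the sorted ids of the texts that contain it.

    docs: dict of text id -> list of word segments.
    """
    index = defaultdict(set)
    for doc_id, words in docs.items():
        for word in words:
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}

def search(index, query_words):
    """Return the ids of the texts that contain every query word."""
    if not query_words:
        return []
    result = set(index.get(query_words[0], []))
    for word in query_words[1:]:
        result &= set(index.get(word, []))
    return sorted(result)
```

Because the index is keyed by word, a keyword lookup touches only the texts that contain that word rather than scanning the whole database.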
In addition, because accounts such as high-repetition accounts and content-forwarding accounts mainly come from self-media and social networking platforms, this step can also perform correlation matching on the self-media public opinion data and the social network public opinion data, and assign the matched social network public opinion data to a negative public opinion set for focused attention. Illustratively, correlation matching may be performed through keywords and/or public opinion publishing accounts.
S2, performing dimension reduction on the public opinion data in the public opinion classification set, searching, based on a first time window, for the historical event classification set to which the processed public opinion data belongs, and assigning the public opinion data to that historical event classification set;
Considering that massive public opinion data affects processing efficiency, this embodiment performs dimension reduction on the public opinion data and searches for related historical events based on a sliding time window, thereby realizing source-tracing analysis of public opinion. Wherein: the dimension reduction may use PCA, ICA, LDA, ISOMAP, LLE, etc. In this embodiment, for convenience of processing, the dimension reduction converts the public opinion data in the public opinion classification set into a binary character string. As shown in fig. 2, this step may include:
S21, carrying out binary conversion on the public opinion data in the public opinion classification set, and segmenting the converted character strings to obtain public opinion segments;
Illustratively, if the public opinion classification set comprises only public opinion texts, the public opinion texts are first input into an Analyzer word segmenter and split into segmented words, stop words among the segmented words are filtered out to obtain effective words, and the effective words are converted into 64-bit binary character strings. If the public opinion classification set comprises both the public opinion texts and their corresponding effective words, the effective words are directly converted into 64-bit binary character strings.
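The embodiment does not spell out how the effective words are converted into a 64-bit binary character string. Since a simhash method is mentioned for the later comparison, one plausible reading is a simhash-style fingerprint, sketched below under that assumption. The use of MD5 as the per-word hash and the names `word_hash64` and `simhash64` are illustrative choices only.

```python
import hashlib

def word_hash64(word):
    """Stable 64-bit hash of a word (first 8 bytes of its MD5 digest)."""
    digest = hashlib.md5(word.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def simhash64(words):
    """Combine the word hashes into one 64-character string of '0' and '1'."""
    votes = [0] * 64
    for word in words:
        h = word_hash64(word)
        for bit in range(64):
            # Each word votes +1 for a set bit and -1 for a cleared bit.
            votes[bit] += 1 if (h >> bit) & 1 else -1
    value = 0
    for bit in range(64):
        if votes[bit] > 0:
            value |= 1 << bit
    return format(value, "064b")
```

With this kind of fingerprint, texts that share most of their effective words tend to produce strings that differ in only a few bit positions, which is what the later bitwise comparison relies on.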
In this embodiment, to further increase the processing speed, the character string obtained by dimension reduction may be segmented. A single public opinion segment is used to search for the related historical event classification set, and all public opinion data of that set is then recalled and compared against the whole character string, which greatly reduces the amount of comparison computation and increases the processing speed.
Wherein: the character string may be divided into segments of equal length. For example, a 64-bit binary character string is divided into four 16-bit character strings, each of which is a public opinion segment. Optionally, each public opinion segment can be cached in a Redis database, and the public opinion segments obtained by dimension reduction and segmentation of the same public opinion text are marked with the same identifier to facilitate subsequent use.
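The equal-length division can be sketched as follows. A plain dictionary stands in for the Redis cache here, and the text identifier used as the cache key is a hypothetical name.

```python
def split_segments(bits, n_segments=4):
    """Split a binary string into n_segments pieces of equal length."""
    if len(bits) % n_segments:
        raise ValueError("string length must be divisible by n_segments")
    size = len(bits) // n_segments
    return [bits[i:i + size] for i in range(0, len(bits), size)]

def cache_segments(cache, text_id, bits):
    """Store the segments of one text under its identifier (dict stands in for Redis)."""
    cache[text_id] = split_segments(bits)
    return cache[text_id]
```

For a 64-bit string, `split_segments` yields four 16-bit segments that concatenate back to the original string.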
S22, searching for character strings to be compared that are related to each public opinion segment within a first time window;
The time window may be set according to actual needs. This embodiment may adopt a sliding time window, and "first" and "second" are used to distinguish different time windows. For example, the first time window may be set to the most recent two days, and the second time window may be set to the period from two to four days ago.
For example, this step may first obtain the public opinion segments produced by dimension reduction and segmentation of the same public opinion text, and then, for each public opinion segment, search for the related character strings to be compared that occurred within the first time window. Wherein: searching for the character strings to be compared within the first time window may include: obtaining the dimension-reduced character strings of the public opinion texts generated within the first time window, recorded as character strings to be matched, and performing correlation matching on each public opinion segment and each character string to be matched to obtain the character strings to be compared. Taking a public opinion text that is reduced and divided into four 16-bit strings as an example: the four 16-bit strings of the same public opinion text are first obtained from the Redis database according to the identifier; the character strings to be matched, i.e. the dimension-reduced strings of the public opinion texts generated within the first time window, are then obtained from the Redis database, for example ten 64-bit character strings to be matched; correlation matching is performed on each 16-bit string and each 64-bit character string to be matched; and each successfully matched character string to be matched is taken as a character string to be compared.
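The matching rule between a 16-bit segment and a 64-bit string is not fixed by the embodiment. The sketch below assumes one common choice: a string to be matched becomes a string to be compared when it has an identical 16-bit segment at the same position. A dictionary keyed by (position, segment) stands in for the Redis lookup; all names are illustrative.

```python
from collections import defaultdict

def split_segments(bits, n_segments=4):
    """Split a binary string into n_segments pieces of equal length."""
    size = len(bits) // n_segments
    return [bits[i:i + size] for i in range(0, len(bits), size)]

def build_segment_index(window_strings):
    """Index the strings of the time window by (segment position, segment value)."""
    index = defaultdict(set)
    for s in window_strings:
        for pos, seg in enumerate(split_segments(s)):
            index[(pos, seg)].add(s)
    return index

def find_candidates(bits, index):
    """Return the window strings that share at least one segment with bits."""
    candidates = set()
    for pos, seg in enumerate(split_segments(bits)):
        candidates |= index.get((pos, seg), set())
    return candidates
```

Under this assumption no similar string is missed: if two 64-bit strings differ in fewer than 4 bits, at least one of their four 16-bit segments must be identical, so each lookup is a direct key access instead of a scan over every stored string.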
S23, comparing the character string to which the public opinion segment belongs with each character string to be compared bit by bit, and determining the character strings similar to the public opinion data.
Wherein: the character string to which the public opinion segment belongs is the character string before segmentation; in this example, it is the 64-bit character string obtained by word segmentation and binary conversion of the public opinion text. In this step, the 64-bit character string is compared bit by bit with each 64-bit character string to be compared; if the number of differing bits is less than a predetermined value (for example, 4), the 64-bit character string to be compared is taken as a character string similar to the public opinion data. Illustratively, a simhash method and emotion analysis results can be used in the comparison.
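The bitwise comparison amounts to a Hamming distance check, which can be sketched as follows. The function names are illustrative, and the default threshold of 4 is the example value given above.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length binary strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))

def similar_strings(target, candidates, max_diff=4):
    """Keep the candidates that differ from target in fewer than max_diff bits."""
    return [c for c in candidates if hamming_distance(target, c) < max_diff]
```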
S24, determining the historical event classification set to which the public opinion data belongs according to the character strings similar to the public opinion data.
For example, the historical event classification set to which each similar character string belongs may be looked up first, and the historical event classification set containing the most character strings similar to the public opinion data is taken as the historical event classification set to which the public opinion data belongs.
The public opinion data (such as the public opinion texts, their corresponding character strings, and their corresponding word segments) is then assigned to the historical event classification set to which it belongs. If no character string similar to the public opinion data exists, a new historical event classification set is created for the public opinion data.
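The selection rule in S24 can be sketched as a simple majority count. The mapping from character string to event set and the identifier passed for a newly created set are hypothetical inputs used for illustration.

```python
from collections import Counter

def assign_event_set(similar, string_to_event, new_event_id):
    """Pick the event set holding the most similar strings, or a new set if none.

    similar: list of character strings similar to the public opinion data.
    string_to_event: dict mapping each stored string to its event set id.
    new_event_id: id to use when a new event set has to be created.
    """
    if not similar:
        return new_event_id
    counts = Counter(string_to_event[s] for s in similar)
    return counts.most_common(1)[0][0]
```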
Furthermore, dimension reduction can be performed on newly added public opinion data in real time and related historical events can be searched based on a sliding time window, so as to give early warning of public opinion fermentation and secondary fermentation. Accordingly, the method may further comprise:
S201, performing dimension reduction on the newly added public opinion data, searching, based on the first time window, for the historical event classification set to which the processed newly added public opinion data belongs, assigning the newly added public opinion data to that historical event classification set, and sending event fermentation alarm information;
S202, if no historical event classification set to which the processed newly added public opinion data belongs is found in the first time window, replacing the first time window with a second time window, searching, based on the second time window, for the historical event classification set to which the processed newly added public opinion data belongs, assigning the newly added public opinion data to that historical event classification set, and sending event secondary fermentation alarm information;
wherein: the second time window is earlier than the first time window. For the method of searching for the historical event classification set to which the newly added public opinion data belongs, reference may be made to steps S21 to S24.
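The fallback from the first to the second time window can be sketched as follows. The window lengths follow the example values given earlier (the most recent two days, then two to four days ago). The history record layout and the simplification of returning the first similar string's event set, rather than the majority set, are assumptions made to keep the sketch short.

```python
from datetime import datetime, timedelta

def hamming(a, b):
    """Number of differing positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def find_event(bits, history, start, end, max_diff=4):
    """Return the event id of the first similar string published in [start, end]."""
    for published, other, event_id in history:
        if start <= published <= end and hamming(bits, other) < max_diff:
            return event_id
    return None

def classify_new(bits, history, now):
    """Try the first window, then the second; return (event id, alarm type)."""
    first_start = now - timedelta(days=2)
    second_start = now - timedelta(days=4)
    event_id = find_event(bits, history, first_start, now)
    if event_id is not None:
        return event_id, "event fermentation"
    event_id = find_event(bits, history, second_start, first_start)
    if event_id is not None:
        return event_id, "event secondary fermentation"
    return None, None  # no match: a new event set would be created
```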
Further, to make event source-tracing analysis clearer and to separate primary fermentation from secondary fermentation, the comparison mechanism used when searching for historical events of the processed newly added public opinion data based on the first time window differs from that used based on the second time window. Wherein: the comparison mechanism may include the comparison method, such as the simhash method or emotion analysis results, and the compared content, such as text, character strings, accounts, keywords, etc.
In addition, the public opinion texts, word segments and character strings in the same historical event classification set can be marked with the same label to distinguish them, and a historical event classification database is constructed from the labeled public opinion texts, word segments and character strings for later use. To facilitate subsequent searching of the historical event classification database, an inverted index can be established for the public opinion data while the database is constructed.
S3, carrying out correlation matching on public opinion data in the same historical event classification set, and assigning the matched public opinion data to a negative public opinion set;
In this embodiment, the correlation matching may be performed on at least one of character strings, keywords, and public opinion publishing accounts. Establishing the negative public opinion set supplements the dimensions of public opinion security analysis based on event correlation.
S4, carrying out public opinion security analysis based on the public opinion classification set, the historical event classification set and the negative public opinion set.
Illustratively, public opinion categories can be analyzed according to the public opinion classification set; source tracing, fermentation and secondary fermentation of public opinion can be analyzed according to the historical event classification set; and repeated accounts, forwarding accounts and the like involved in the generation of public opinion can be tracked and analyzed according to the negative public opinion set. The complete generation and fermentation process of a public opinion event is thereby quickly restored, hidden public opinion risks are prevented, public opinion security is guaranteed, and the image of the enterprise and society is maintained.
Furthermore, the embodiment of the invention can also search for and display results for the information input by the user based on the public opinion classification set, the historical event classification set and the negative public opinion set, thereby ensuring the safety and reliability of the information displayed to the user. Based on this, the method further comprises:
S5, searching for and displaying the target public opinion from the public opinion classification set, the historical event classification set and the negative public opinion set according to the user search information.
Wherein: the user search information may be information such as keywords, text, etc. input by the user. Illustratively, as shown in fig. 3, this step may include:
S51, matching target public opinions from the public opinion classification set, the historical event classification set and the negative public opinion set based on multiple dimensions according to the user search information;
wherein: the dimensions are used to match the target public opinion from different angles. Illustratively, the target public opinion can be matched on dimensions such as article title matching degree, content matching degree, popularity, TF-IDF value, originality, and author portrait. Illustratively, multi-dimensional matching may be performed using matching rules or matching models.
Considering that the public opinion data in the negative public opinion set may include false data which, if shown to users, would mislead them and damage the enterprise image, such data needs to be deleted from the target public opinion. Therefore, this step can match the user search information against the public opinion classification set, the historical event classification set, and the negative public opinion set respectively, based on multiple dimensions, to obtain classification target public opinions, event target public opinions, and negative target public opinions; the public opinion data related to the negative target public opinions is then deleted from the classification target public opinions and the event target public opinions, and the remainder is taken as the target public opinion.
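The deletion step can be sketched as a filter over the matched results. What counts as "related to" a negative target public opinion is not defined in the embodiment; the sketch assumes that an item is related when it has the same identifier or the same publishing account as a negative item, and the record layout with `id` and `account` keys is hypothetical.

```python
def filter_targets(classified, event, negative):
    """Merge classification and event matches, dropping items related to negative ones.

    Each argument is a list of dicts with "id" and "account" keys.
    """
    negative_ids = {item["id"] for item in negative}
    negative_accounts = {item["account"] for item in negative}
    merged = {}
    for item in classified + event:
        related = item["id"] in negative_ids or item["account"] in negative_accounts
        if not related:
            merged[item["id"]] = item  # keyed by id, so duplicates collapse
    return list(merged.values())
```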
S52, determining the matching degree of each target public opinion according to the weight of each dimension;
In this embodiment, a weight may be set for each dimension in advance, and the matching degree of each target public opinion is calculated from the target public opinion's score on each dimension and the weight of the corresponding dimension.
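One straightforward way to combine the dimension scores is a weighted sum, sketched below. The weight values, the dimension names used as dictionary keys, and the treatment of a missing score as zero are assumptions for illustration; the embodiment does not fix the formula.

```python
# Hypothetical weights, one per matching dimension.
WEIGHTS = {"title": 0.4, "content": 0.3, "popularity": 0.2, "originality": 0.1}

def matching_degree(scores, weights=WEIGHTS):
    """Weighted sum of the per-dimension scores; a missing score counts as 0."""
    return sum(weight * scores.get(dim, 0.0) for dim, weight in weights.items())

def rank_targets(targets):
    """Order target public opinions by descending matching degree."""
    return sorted(targets, key=lambda t: matching_degree(t["scores"]), reverse=True)
```

The ranked list gives the order in which the target public opinions would be displayed.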
In a preferred example, the matching degree of the target public opinion can be adjusted by taking the user's historical search results into account, so that the displayed results are closer to the user's needs and the user experience is improved. That is, this step can also adjust the matching degree of the target public opinion according to the historical target public opinions matched with the user search information. Specifically: historical target public opinions matched with the user search information are obtained; the historical target public opinions are compared with the target public opinions; and the matching degree of the target public opinions (i.e., their display order) is adjusted according to the comparison result. Illustratively, the matching degree is adjusted according to the number of identical historical target public opinions, for example by treating this number as a dimension, setting a corresponding weight, and adjusting the matching degree based on the weight. Alternatively, the matching degree is adjusted directly in proportion to the number of identical historical target public opinions.
In addition, the matching degree of the target public opinion can be adjusted according to the final public opinion that the user selected from the displayed historical target public opinions. In that case this step may be: acquiring the final public opinion selected by the user from the historical target public opinions; comparing the final public opinion with the target public opinions; and adjusting the matching degree of the target public opinions according to the comparison result.
S53, displaying each target public opinion according to its matching degree.
For example, the target public opinions are displayed from top to bottom in order of matching degree in a drop-down list. Alternatively, different display areas are set, and the target public opinions are displayed in areas whose significance level corresponds to their matching degree; for example, the target public opinion with the highest matching degree is displayed in the display area with the first significance level, and the target public opinion with the second-highest matching degree in the display area with the second significance level. Wherein: the significance level reflects how quickly a user notices the content of a display area; for example, content in the middle of the display screen is most easily noticed by the user.
Furthermore, the embodiment of the invention can also analyze user attributes based on the user's historical searches in the public opinion classification set, the historical event classification set and the negative public opinion set, construct a user portrait, and push information to the user. Based on this, the method further comprises:
S6, constructing a user portrait according to the user's search history, and pushing information to the user based on the user portrait.
Wherein: the user's search history may include information such as keywords and texts input by the user, and may also include the texts finally selected by the user from the search results.
Compared with the prior art, the invention has at least the following beneficial effects:
1. The system can classify the massive public opinion and security information in its database and analyze its source, fermentation and secondary fermentation, and can restore the complete occurrence and fermentation process of an event, so that staff can intervene in time before an event ferments, ensuring the public opinion security of enterprises and society.
2. The public opinion data is subjected to dimension reduction and the reduced data is segmented; a single public opinion segment is used to search for the related character strings to be compared, and the whole character string to which the segment belongs is then recalled and compared with each character string to be compared, which greatly reduces the amount of comparison computation and increases the processing speed.
3. The algorithm performs many recalls during analysis, which involves a large amount of computation and takes a long time. Since the amount of computation itself cannot be reduced, the core idea is to reduce the recall time: each 16-bit substring can be fetched directly from Redis with a get operation, without a query, which greatly reduces the comparison time.
4. A structured, massive intelligence library is constructed: texts are classified according to attributes such as length, time, emotion attribute, term frequency-inverse document frequency (TF-IDF), source and the like, and an inverted index is built as the texts are stored.
5. Information such as the article's title matching degree, content matching degree, popularity, TF-IDF value, originality and author portrait is weighted. Meanwhile, the user's historical search habits are analyzed to find high-frequency keywords, and a secondary search and score calculation are performed on the primary search results. The scores of the two searches are combined to obtain the final article display order.
6. The information input by the user is searched and displayed based on the public opinion classification set, the historical event classification set and the negative public opinion set, ensuring the safety and reliability of the information displayed to the user.
7. Because different comparison mechanisms are applied to newly added public opinion in different time windows, primary, secondary and repeated fermentation of events can be distinguished, and the propagation process and links of each event can be traced accurately.
8. Behaviors such as one person operating multiple accounts and forwarding during the propagation process can be analyzed and identified, so that key propagating groups are found and public opinion analysis errors are reduced.
9. By extracting the user's search history and constructing a user portrait, personalized recommendation tailored to each individual user is realized, and the user's search experience is optimized.
10. Duplicate removal can be performed on large volumes of public opinion and security texts; whether a text is positive or negative can be judged from the attributes of the public opinion and security texts.
Fig. 4 is a public opinion security analysis apparatus based on event correlation according to the present invention, as shown in fig. 4, the apparatus includes:
the classification module 41 is used for classifying the public sentiment data into corresponding public sentiment classification sets;
the dimensionality reduction searching module 42 is used for performing dimensionality reduction on the public opinion data in the public opinion classified set, searching a historical event classified set to which the processed public opinion data belongs based on a first time window, and dividing the public opinion data into the historical event classified set to which the public opinion data belongs;
a first matching module 43, configured to perform correlation matching on public opinion data in the same historical event classification set, and classify the matched public opinion data into negative public opinion sets;
and the analysis module 44 is used for carrying out public opinion safety analysis based on the public opinion classification set, the historical event classification set and the negative public opinion set.
In one embodiment, the dimension reduction lookup module 42 includes:
the segmentation module is used for carrying out binary conversion on the public opinion data in the public opinion classification set and segmenting the converted character strings to obtain public opinion segments;
the sub-searching module is used for searching character strings to be compared related to all public sentiment sections in a first time window;
the comparison module is used for comparing the character string where the public opinion segment is located with the character string to be compared according to positions to determine the character string similar to the public opinion data;
and the sub-determination module is used for determining the historical event classification set to which the public sentiment data belongs according to the character strings similar to the public sentiment data.
Further, the apparatus further comprises:
the first dimension reduction searching module is used for carrying out dimension reduction processing on the newly added public opinion data, searching a historical event classification set to which the newly added public opinion data belongs based on a first time window, classifying the newly added public opinion data into the historical event classification set to which the newly added public opinion data belongs, and sending out event fermentation alarm information;
the second dimension reduction searching module is used for replacing the first time window with a second time window if the historical event classification set to which the processed newly-added public opinion data belongs is not searched in the first time window, searching the historical event classification set to which the processed newly-added public opinion data belongs based on the second time window, classifying the newly-added public opinion data into the historical event classification set to which the newly-added public opinion data belongs, and sending event secondary fermentation alarm information;
wherein: the second time window is earlier than the first time window.
The comparison mechanism of the first dimension reduction searching module and the second dimension reduction searching module is different.
Further, the apparatus further comprises: and the searching and displaying module is used for searching and displaying the target public sentiment from the public sentiment classification set, the historical event classification set and the negative public sentiment set according to the searching information of the user.
In one embodiment, the search presentation module comprises:
the multi-dimension matching module is used for matching target public sentiments from the public sentiment classification set, the historical event classification set and the negative public sentiment set based on multiple dimensions according to user search information;
the determining module is used for determining the matching degree of each target public opinion according to the weight value of each dimension;
and the display module is used for displaying each target public opinion according to the matching degree of the target public opinions.
Further, the apparatus further comprises:
the adjusting module is used for adjusting the matching degree of the target public sentiment according to the historical target public sentiment matched with the user searching information; or: and adjusting the matching degree of the target public sentiment according to the final public sentiment selected from the historical target public sentiments by the user.
In one embodiment, the multi-dimensional matching module comprises:
the sub-matching module is used for matching user search information based on multiple dimensions from the public opinion classification set, the historical event classification set and the negative public opinion set respectively to obtain classified target public opinions, event target public opinions and negative target public opinions;
and the deleting module is used for respectively deleting the public sentiment data related to the negative target public sentiment from the classified target public sentiment and the event target public sentiment to serve as the target public sentiment.
Further, the apparatus further comprises:
the construction module is used for constructing a user portrait according to the user's search history;
and the pushing module is used for pushing information to the user based on the user portrait.
In one embodiment, the classification module comprises:
the correlation analysis module is used for carrying out correlation analysis on the public opinion data to obtain the relevant public opinion data;
the sub-classification module is used for classifying the relevant public opinion data according to the text attribute of the relevant public opinion data;
the judging module is used for judging the emotion types of the classified public opinion data based on the positive and negative emotion corpus;
and the model processing module is used for processing the public opinion data after emotion classification by adopting the classification model to obtain a public opinion classification set in which the public opinion data is positioned.
Further, the apparatus further comprises: and the alarm module is used for sending a negative public opinion alarm if the classified emotion types of the public opinion data are negative.
Further, the public opinion data comprises media public opinion data and social network public opinion data, and the apparatus further includes:
and the second matching module is used for carrying out correlation matching on the social network public opinion data and distributing the matched social network public opinion data to a negative public opinion set.
Those skilled in the art will appreciate that the modules in the above-described embodiments of the apparatus may be distributed as described in the apparatus, and may be correspondingly modified and distributed in one or more apparatuses other than the above-described embodiments. The modules of the above embodiments may be combined into one module, or further split into multiple sub-modules.
In the following, embodiments of the electronic device of the present invention are described, which may be regarded as an implementation in physical form for the above-described embodiments of the method and apparatus of the present invention. Details described in the embodiments of the electronic device of the invention should be considered supplementary to the embodiments of the method or apparatus described above; for details which are not disclosed in embodiments of the electronic device of the invention, reference may be made to the above-described embodiments of the method or the apparatus.
Fig. 5 is a block diagram of an exemplary embodiment of an electronic device according to the present invention. The electronic device shown in fig. 5 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 5, the electronic device 500 of the exemplary embodiment is represented in the form of a general-purpose data processing device. The components of the electronic device 500 may include, but are not limited to: at least one processing unit 510, at least one storage unit 520, a bus 530 connecting the different components of the electronic device (including the storage unit 520 and the processing unit 510), a display unit 540, and the like.
The storage unit 520 stores a computer readable program, which may be a code of a source program or a read-only program. The program may be executed by the processing unit 510 such that the processing unit 510 performs the steps of various embodiments of the present invention. For example, the processing unit 510 may perform the steps as shown in fig. 1.
The storage unit 520 may include readable media in the form of volatile storage units, such as a random access memory (RAM) unit 5201 and/or a cache memory unit 5202, and may further include a read-only memory (ROM) unit 5203. The storage unit 520 may also include a program/utility 5204 having a set (at least one) of program modules 5205, such program modules 5205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Bus 530 may be one or more of any of several types of bus structures including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 500 may also communicate with one or more external devices 100 (e.g., a keyboard, display, network device, or Bluetooth device), enabling a user to interact with the electronic device 500 via the external devices 100 and/or enabling the electronic device 500 to communicate with one or more other data processing devices (e.g., a router or modem). Such communication can occur via input/output (I/O) interfaces 550, and can also occur via network adapter 560 to one or more networks, such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet. The network adapter 560 may communicate with other modules of the electronic device 500 via the bus 530. It should be appreciated that although not shown in FIG. 5, other hardware and/or software modules may be used in the electronic device 500, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
FIG. 6 is a schematic diagram of one computer-readable medium embodiment of the present invention. As shown in FIG. 6, the computer program may be stored on one or more computer-readable media. The computer-readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. The computer program, when executed by one or more data processing devices, enables the computer-readable medium to implement the above-described method of the invention, namely: classifying public opinion data into corresponding public opinion classification sets; performing dimensionality reduction on the public opinion data in the public opinion classification set, searching, based on a first time window, for the historical event classification set to which the processed public opinion data belongs, and classifying the public opinion data into that historical event classification set; carrying out correlation matching on the public opinion data in the same historical event classification set, and classifying the matched public opinion data into a negative public opinion set; and carrying out public opinion security analysis based on the public opinion classification set, the historical event classification set and the negative public opinion set.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments of the present invention described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present invention can be embodied in the form of a software product, which can be stored in a computer-readable storage medium (such as a CD-ROM, a USB disk, or a removable hard disk) or on a network, and which includes several instructions that cause a data processing device (such as a personal computer, a server, or a network device) to execute the above-mentioned method according to the present invention.
The computer-readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java or C++ and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the Internet using an Internet service provider).
In summary, the present invention can be implemented as a method, an apparatus, an electronic device, or a computer-readable medium executing a computer program. Some or all of the functions of the present invention may be implemented in practice using a general purpose data processing device such as a microprocessor or a Digital Signal Processor (DSP).
While the foregoing embodiments have described the objects, aspects and advantages of the present invention in further detail, it should be understood that the present invention is not inherently related to any particular computer, virtual machine or electronic device, and various general-purpose machines may be used to implement the present invention. The invention is not limited to the specific embodiments described above; all modifications, changes and equivalents that come within the spirit and scope of the invention are intended to be covered.

Claims (26)

1. A public opinion security analysis method based on event correlation is characterized by comprising the following steps:
carrying out classification processing on the public opinion data to be classified into corresponding public opinion classification sets;
performing dimensionality reduction on the public opinion data in the public opinion classification set, searching a historical event classification set to which the processed public opinion data belongs based on a first time window, and classifying the public opinion data into the historical event classification set to which the public opinion data belongs;
carrying out correlation matching on the public opinion data in the same historical event classification set, and classifying the matched public opinion data into negative public opinion sets;
and carrying out public opinion safety analysis based on the public opinion classification set, the historical event classification set and the negative public opinion set.
2. The method of claim 1, wherein performing dimensionality reduction on the public opinion data in the public opinion classification set and searching, based on the first time window, for the historical event classification set to which the processed public opinion data belongs comprises:
carrying out binary conversion on the public opinion data in the public opinion classification set, and segmenting the converted character strings to obtain public opinion segments;
searching, within the first time window, for character strings to be compared that are related to each public opinion segment;
comparing the character string containing the public opinion segment with the character string to be compared position by position, and determining the character strings similar to the public opinion data;
and determining a historical event classification set to which the public opinion data belongs according to the character strings similar to the public opinion data.
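The steps of binary conversion, segmentation, candidate search and position-wise comparison resemble a SimHash-style lookup, which the following minimal sketch uses as a stand-in. The claim does not name a specific algorithm, so the 64-bit fingerprint, the four 16-bit segments, whitespace tokenisation, the MD5-based token hash, the `max_distance` threshold and all function names are assumptions made purely for illustration.

```python
import hashlib

BITS = 64                      # assumed fingerprint width
SEGMENTS = 4                   # assumed number of segments
SEG_BITS = BITS // SEGMENTS


def token_hash(token):
    """Stable 64-bit hash of a token (MD5 used only for determinism)."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def fingerprint(text):
    """SimHash-style binary conversion: each token votes on every bit."""
    weights = [0] * BITS
    for token in text.lower().split():
        h = token_hash(token)
        for i in range(BITS):
            weights[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(BITS) if weights[i] > 0)


def segments(fp):
    """Split a fingerprint into SEGMENTS equal-width pieces."""
    mask = (1 << SEG_BITS) - 1
    return [(fp >> (k * SEG_BITS)) & mask for k in range(SEGMENTS)]


def hamming(a, b):
    """Number of bit positions at which two fingerprints differ."""
    return bin(a ^ b).count("1")


def find_event(fp, history, now, window, max_distance=3):
    """Search history entries (timestamp, fingerprint, event_id) in the window."""
    segs = segments(fp)
    best = None
    for ts, other, event_id in history:
        if ts > now or now - ts > window:
            continue  # outside the time window
        # Candidate only if some segment matches at the same position.
        if not any(s == o for s, o in zip(segs, segments(other))):
            continue
        d = hamming(fp, other)  # position-by-position comparison
        if d <= max_distance and (best is None or d < best[0]):
            best = (d, event_id)
    return None if best is None else best[1]
```

With four segments and a threshold of three differing bits, any fingerprint within the threshold must share at least one whole segment with the query, which is why the segment test can safely prune candidates before the full comparison.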
3. The method of claim 2, further comprising:
carrying out dimensionality reduction on the newly added public opinion data, searching a historical event classification set to which the newly added public opinion data belongs based on a first time window, classifying the newly added public opinion data into the historical event classification set to which the newly added public opinion data belongs, and sending event fermentation alarm information;
if the historical event classification set to which the processed new public opinion data belongs is not found in the first time window, replacing the first time window with a second time window, finding the historical event classification set to which the processed new public opinion data belongs based on the second time window, dividing the new public opinion data into the historical event classification set to which the new public opinion data belongs, and sending event secondary fermentation alarm information;
wherein: the second time window is earlier than the first time window.
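The fallback from the first time window to the earlier second time window can be sketched as follows. Exact key equality stands in for the similarity comparison, the second window is assumed to be the period immediately preceding the first, and the same comparison is used in both windows for brevity; the names `lookup`, `route_new_item`, `first_span` and `second_span` are hypothetical.

```python
def lookup(item_key, history, start, end):
    """Return the event id of the first entry in [start, end] with the same key."""
    for ts, key, event_id in history:
        if start <= ts <= end and key == item_key:
            return event_id
    return None


def route_new_item(item_key, history, now, first_span, second_span):
    """Return (event_id, alarm) for newly added public opinion data.

    history is a list of (timestamp, key, event_id) tuples.
    """
    # First time window: the most recent first_span time units.
    first_start = now - first_span
    event_id = lookup(item_key, history, first_start, now)
    if event_id is not None:
        return event_id, "event fermentation"
    # Second time window: the earlier period just before the first window.
    event_id = lookup(item_key, history, first_start - second_span, first_start)
    if event_id is not None:
        return event_id, "event secondary fermentation"
    return None, None
```

A match found only in the older window signals that a previously quiet event has resurfaced, which is why it raises the separate secondary fermentation alarm.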
4. The method of claim 3, wherein the comparison mechanism used when searching, based on the first time window, for the historical event classification set to which the processed newly added public opinion data belongs is different from the comparison mechanism used when searching based on the second time window.
5. The method of claim 1, further comprising: searching for and displaying target public opinions from the public opinion classification set, the historical event classification set and the negative public opinion set according to user search information.
6. The method of claim 5, wherein the searching for and displaying target public opinions from the public opinion classification set, the historical event classification set and the negative public opinion set according to the user search information comprises:
matching target public opinions from the public opinion classification set, the historical event classification set and the negative public opinion set based on a plurality of dimensions according to the user search information;
determining the matching degree of each target public opinion according to the weight value of each dimension;
and displaying each target public opinion according to the matching degree of the target public opinions.
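The matching degree in this claim can be read as a weighted sum over per-dimension scores, sketched below. The dimension names, the weight values and the assumption that each dimension yields a score between 0 and 1 are hypothetical; the claim only states that each dimension has a weight value.

```python
# Hypothetical dimensions and weights; the source does not enumerate them.
WEIGHTS = {"keyword": 0.5, "source": 0.2, "time": 0.3}


def matching_degree(scores, weights=WEIGHTS):
    """Weighted sum of per-dimension scores; missing dimensions count as 0."""
    return sum(weights[d] * scores.get(d, 0.0) for d in weights)


def rank_targets(candidates, weights=WEIGHTS):
    """Order candidate titles by descending matching degree.

    candidates maps a title to its per-dimension scores.
    """
    return sorted(
        candidates,
        key=lambda title: matching_degree(candidates[title], weights),
        reverse=True,
    )
```

Displaying the targets in the order returned by `rank_targets` corresponds to presenting each target public opinion according to its matching degree.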
7. The method of claim 6, wherein before displaying each target public opinion according to the matching degree of the target public opinions, the method further comprises:
adjusting the matching degree of the target public opinion according to historical target public opinions matched with the user search information; or
adjusting the matching degree of the target public opinion according to the final public opinion selected by the user from the historical target public opinions.
8. The method of claim 6, wherein the matching target public opinions from the public opinion classification set, the historical event classification set and the negative public opinion set based on a plurality of dimensions according to the user search information comprises:
matching the user search information based on the plurality of dimensions against the public opinion classification set, the historical event classification set and the negative public opinion set respectively, to obtain classified target public opinions, event target public opinions and negative target public opinions;
and deleting the public opinion data related to the negative target public opinions from the classified target public opinions and the event target public opinions respectively, the remaining data serving as the target public opinions.
9. The method of claim 5, further comprising:
constructing a user portrait according to the user's historical searches;
pushing information to the user based on the user portrait.
10. The method of claim 1, wherein classifying the public opinion data into corresponding public opinion classification sets comprises:
carrying out correlation analysis on the public opinion data to obtain the relevant public opinion data;
classifying the relevant public opinion data according to the text attribute of the relevant public opinion data;
judging the emotion types of the classified public opinion data based on the positive and negative emotion corpus;
and processing the public opinion data after emotion classification by adopting a classification model to obtain a public opinion classification set in which the public opinion data is positioned.
11. The method of claim 10, wherein if the classified sentiment type of the public opinion data is negative, a negative public opinion alarm is issued.
12. The method of claim 10, wherein the public opinion data comprises: media public opinion data and social network public opinion data, the method further comprising:
and carrying out correlation matching on the social network public opinion data, and distributing the matched social network public opinion data to a negative public opinion set.
13. A public opinion security analysis apparatus based on event correlation, the apparatus comprising:
the classification module is used for classifying the public opinion data into corresponding public opinion classification sets;
the dimension reduction searching module is used for carrying out dimension reduction processing on the public opinion data in the public opinion classification set, searching a historical event classification set to which the processed public opinion data belongs based on a first time window, and classifying the public opinion data into the historical event classification set to which the public opinion data belongs;
the first matching module is used for carrying out correlation matching on the public opinion data in the same historical event classification set and dividing the matched public opinion data into negative public opinion sets;
and the analysis module is used for carrying out public opinion safety analysis based on the public opinion classification set, the historical event classification set and the negative public opinion set.
14. The apparatus of claim 13, wherein the dimension reduction lookup module comprises:
the segmentation module is used for carrying out binary conversion on the public opinion data in the public opinion classification set and segmenting the converted character strings to obtain public opinion segments;
the sub-searching module is used for searching, within a first time window, for character strings to be compared that are related to each public opinion segment;
the comparison module is used for comparing the character string containing the public opinion segment with the character string to be compared position by position to determine the character strings similar to the public opinion data;
and the sub-determination module is used for determining the historical event classification set to which the public opinion data belongs according to the character strings similar to the public opinion data.
15. The apparatus of claim 14, further comprising:
the first dimension reduction searching module is used for carrying out dimension reduction processing on the newly added public opinion data, searching a historical event classification set to which the newly added public opinion data belongs based on a first time window, classifying the newly added public opinion data into the historical event classification set to which the newly added public opinion data belongs, and sending out event fermentation alarm information;
the second dimensionality reduction searching module is used for replacing the first time window with a second time window if the historical event classification set to which the processed new public opinion data belongs is not searched in the first time window, searching the historical event classification set to which the processed new public opinion data belongs based on the second time window, classifying the new public opinion data into the historical event classification set to which the new public opinion data belongs, and sending out event secondary fermentation alarm information;
wherein: the second time window is earlier than the first time window.
16. The apparatus of claim 15, wherein the first dimension reduction searching module and the second dimension reduction searching module use different comparison mechanisms.
17. The apparatus of claim 13, further comprising: a searching and displaying module used for searching for and displaying target public opinions from the public opinion classification set, the historical event classification set and the negative public opinion set according to user search information.
18. The apparatus of claim 17, wherein the searching and displaying module comprises:
the multi-dimension matching module is used for matching target public opinions from the public opinion classification set, the historical event classification set and the negative public opinion set based on multiple dimensions according to the user search information;
the determining module is used for determining the matching degree of each target public opinion according to the weight value of each dimension;
and the display module is used for displaying each target public opinion according to the matching degree of the target public opinions.
19. The apparatus of claim 18, further comprising:
the adjusting module is used for adjusting the matching degree of the target public opinion according to historical target public opinions matched with the user search information, or for adjusting the matching degree of the target public opinion according to the final public opinion selected by the user from the historical target public opinions.
20. The apparatus of claim 18, wherein the multi-dimension matching module comprises:
the sub-matching module is used for matching the user search information based on multiple dimensions against the public opinion classification set, the historical event classification set and the negative public opinion set respectively, to obtain classified target public opinions, event target public opinions and negative target public opinions;
and the deleting module is used for deleting the public opinion data related to the negative target public opinions from the classified target public opinions and the event target public opinions respectively, the remaining data serving as the target public opinions.
21. The apparatus of claim 17, further comprising:
the construction module is used for constructing a user portrait according to the user's historical searches;
and the pushing module is used for pushing information to the user based on the user portrait.
22. The apparatus of claim 13, wherein the classification module comprises:
the correlation analysis module is used for carrying out correlation analysis on the public opinion data to obtain the relevant public opinion data;
the sub-classification module is used for classifying the relevant public opinion data according to the text attribute of the relevant public opinion data;
the judging module is used for judging the emotion types of the classified public opinion data based on the positive and negative emotion corpus;
and the model processing module is used for processing the public opinion data after emotion classification by adopting the classification model to obtain a public opinion classification set in which the public opinion data is positioned.
23. The apparatus of claim 22, further comprising: and the alarm module is used for sending a negative public opinion alarm if the classified emotion types of the public opinion data are negative.
24. The apparatus of claim 22, wherein the public opinion data comprises media public opinion data and social network public opinion data, and the apparatus further comprises:
and the second matching module is used for carrying out correlation matching on the social network public opinion data and distributing the matched social network public opinion data to a negative public opinion set.
25. An electronic device, comprising:
a processor; and
a memory storing computer-executable instructions that, when executed, cause the processor to perform the method of any of claims 1-12.
26. A computer readable storage medium, wherein the computer readable storage medium stores one or more programs which, when executed by a processor, implement the method of any of claims 1-12.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210509195.7A CN114840725A (en) 2022-05-10 2022-05-10 Event correlation based public opinion security analysis method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN114840725A (en) 2022-08-02

Family

ID=82569185

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210509195.7A Pending CN114840725A (en) 2022-05-10 2022-05-10 Event correlation based public opinion security analysis method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN114840725A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Country or region after: China

Address after: Room 1109, No. 4, Lane 800, Tongpu Road, Putuo District, Shanghai, 200062

Applicant after: Shanghai Qiyue Information Technology Co.,Ltd.

Address before: Room a2-8914, 58 Fumin Branch Road, Hengsha Township, Chongming District, Shanghai, 201500

Applicant before: Shanghai Qiyue Information Technology Co.,Ltd.

Country or region before: China