CN111753102A - Public opinion analysis method and device based on affair map and electronic equipment - Google Patents

Publication number
CN111753102A
Authority
CN
China
Prior art keywords
public opinion
causal
events
short
event
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010627061.6A
Other languages
Chinese (zh)
Inventor
陈程
王贺
李纯懿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Zhuoer Digital Media Technology Co ltd
Original Assignee
Wuhan Zhuoer Digital Media Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Zhuoer Digital Media Technology Co ltd
Priority to CN202010627061.6A
Publication of CN111753102A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models
    • G06N5/048 Fuzzy inferencing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Automation & Control Theory (AREA)
  • Health & Medical Sciences (AREA)
  • Fuzzy Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a public opinion analysis method and device based on an affair map, and electronic equipment. The method comprises the following steps: carrying out conversation extraction on original public opinion information in a short message social platform to extract an information stream containing one or more short texts; extracting causal events in the information stream, wherein the causal events comprise causal events inside a short text and/or causal events between two short texts; and establishing a public opinion affair map according to the causal events of a plurality of information streams, and carrying out public opinion analysis according to the public opinion affair map. With the method, the device and the electronic equipment, short texts can be extracted efficiently and the completeness of the information can be retained to the greatest extent; using the affair map for public opinion analysis makes it easier to analyze the evolution path of network public opinion, provides a basis for predicting network public opinion, and offers a new perspective for research on public opinion evolution.

Description

Public opinion analysis method and device based on affair map and electronic equipment
Technical Field
The invention relates to the technical field of public opinion analysis, and in particular to a public opinion analysis method and device based on an affair map, electronic equipment, and a computer-readable storage medium.
Background
Microblog public opinion refers to netizens discussing a hot event on a microblog platform, where their views and mutual discussion drive the development and change of the event. Unlike traditional network information, microblog information is shorter, and both its spread and its topical focus are more pronounced. Network public opinion is linked by the topics and opinions of netizens, which can associate seemingly unrelated public opinions; as a result, the direction of evolution is often unexpected, and the evolution path is uncertain and complex. Most traditional methods screen relevant information by keyword and then process it further. Such methods can predict the trend and popularity of network public opinion, but they tend to ignore the causal relationships among events, cannot explain in depth how network public opinion evolves, and cannot intuitively present its complexity and systemic nature.
Disclosure of Invention
In order to solve the existing technical problems, embodiments of the present invention provide a public opinion analysis method and device based on an affair map, an electronic device, and a computer-readable storage medium.
In a first aspect, an embodiment of the present invention provides a public opinion analysis method based on an affair map, including:
carrying out conversation extraction on original public opinion information in a short message social platform to extract an information stream containing one or more short texts;
extracting causal events in the information stream, wherein the causal events comprise causal events inside a short text and/or causal events between two short texts, and each causal event comprises a corresponding cause event and result event;
and establishing a public opinion affair map according to the causal events of a plurality of information streams, and carrying out public opinion analysis according to the public opinion affair map.
In a second aspect, an embodiment of the present invention further provides a public opinion analysis device based on an affair map, including:
a conversation extraction module, used for carrying out conversation extraction on original public opinion information in a short message social platform to extract an information stream containing one or more short texts;
an event extraction module, used for extracting causal events in the information stream, wherein the causal events comprise causal events inside a short text and/or causal events between two short texts, and each causal event comprises a corresponding cause event and result event;
and an affair map analysis module, used for establishing a public opinion affair map according to the causal events of a plurality of information streams and carrying out public opinion analysis according to the public opinion affair map.
In a third aspect, an embodiment of the present invention provides an electronic device, which includes a bus, a transceiver, a memory, a processor, and a computer program stored in the memory and executable on the processor, where the transceiver, the memory, and the processor are connected via the bus, and when the computer program is executed by the processor, the steps of any one of the above public opinion analysis methods based on an affair map are implemented.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps of any one of the above public opinion analysis methods based on an affair map are implemented.
According to the method, the device, the electronic equipment and the computer-readable storage medium for public opinion analysis based on an affair map provided by the embodiments of the invention, conversation extraction is carried out with the short text as the data unit, in view of the characteristics of short message social platforms such as microblogs, so that information streams are formed; short texts can thus be extracted efficiently and the integrity of the information can be retained to the greatest extent. The public opinion affair map can intuitively depict the logical evolution relationships between events, in which the causal relationships explain the evolution path of network public opinion and clearly show its direction of evolution. By using the affair map for public opinion analysis, the embodiments make it easier to analyze the evolution path of network public opinion, provide a basis for predicting network public opinion, and offer a new perspective for research on public opinion evolution.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments or the background art of the present invention, the drawings required to be used in the embodiments or the background art of the present invention will be described below.
Fig. 1 is a flowchart illustrating a public opinion analysis method based on an affair map according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a public opinion affair map in the public opinion analysis method based on an affair map according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a public opinion abstract affair map in the public opinion analysis method based on an affair map according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a public opinion analysis device based on an affair map according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an electronic device for performing a public opinion analysis method based on an affair map according to an embodiment of the present invention.
Detailed Description
In the description of the embodiments of the present invention, it should be apparent to those skilled in the art that the embodiments of the present invention can be embodied as methods, apparatuses, electronic devices, and computer-readable storage media. Thus, embodiments of the invention may be embodied in the form of: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), a combination of hardware and software. Furthermore, in some embodiments, embodiments of the invention may also be embodied in the form of a computer program product in one or more computer-readable storage media having computer program code embodied in the medium.
The computer-readable storage media described above may be any combination of one or more computer-readable storage media. A computer-readable storage medium includes: an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of the computer-readable storage medium include: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM), a flash memory, an optical fiber, a Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any combination thereof. In embodiments of the invention, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The computer program code embodied on the computer readable storage medium may be transmitted using any appropriate medium, including: wireless, wire, fiber optic cable, Radio Frequency (RF), or any suitable combination thereof.
Computer program code for carrying out operations of embodiments of the present invention may be written in assembly instructions, Instruction Set Architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state-setting data, integrated circuit configuration data, or in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as C or similar programming languages. The computer program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. Where a remote computer is involved, it may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer.
The method, the device and the electronic equipment are described through the flow chart and/or the block diagram.
It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions. These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing apparatus to function in a particular manner. Thus, the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
The embodiments of the present invention will be described below with reference to the drawings.
Fig. 1 shows a flowchart of a public opinion analysis method based on an affair map according to an embodiment of the present invention. As shown in fig. 1, the method includes:
Step 101: carrying out conversation extraction on the original public opinion information in the short message social platform to extract an information stream containing one or more short texts.
In the embodiment of the invention, the short message social platform is a social platform that allows users to publish short messages, such as a microblog, a post bar (Tieba), or Twitter. In this embodiment, a short message published by a user takes the form of a short text, that is, a text whose length is smaller than a preset threshold (e.g., 200 words, 140 words, or 80 words). After a user publishes a short text related to public opinion on the short message social platform, other users (or the same user) can reply to or comment on it, that is, they can publish further short texts, so that original public opinion information in the form of a conversation is formed. In addition, a short text in this embodiment is the text submitted by a user at one time; texts submitted by the same user at different time points are each treated as a separate short text. For example, user A publishes a short text a related to public opinion, and user B then publishes a short text b commenting on short text a; if user A later replies to short text b, the content of that reply is a separate short text c, and the short texts a, b and c may form one information stream. Because a short text contains little content and some short texts are expressed in non-standard ways, public opinion analysis becomes more difficult; in this embodiment, a complete information stream is formed from the related short texts, so that complete information can be extracted and the integrity of the information is retained to the greatest extent.
The original public opinion information can be obtained from the short message social platform by crawling. Specifically, a crawler system can be built on scrapy-redis, a distributed crawling framework for Python; once the public opinion event of interest is determined, the original public opinion information of that event, including comment information, can be crawled. In addition, a database can be established to store the crawled original public opinion information so that public opinion analysis can be carried out at any later time.
Step 102: extracting the causal events in the information stream, wherein the causal events comprise causal events inside a short text and/or causal events between two short texts, and each causal event comprises a corresponding cause event and result event.
In the embodiment of the invention, after an information stream is acquired, it can be further determined whether a causal event exists in the information stream, and the causal event is extracted when it exists. In this embodiment, a causal event includes a cause event and a result event, between which there is a causal relationship. A short text may itself contain a causal event; meanwhile, because one short text may be a reply to or a comment on another short text, the two short texts are related to each other, and a causal event may also exist between them. For example, user A publishes short text a on the short message social platform: "The agency was summoned for talks by the housing and construction commission for driving up housing prices", and user B then replies with short text b: "The agency has suspended business". In this case, a causal event exists inside short text a: the cause event is "driving up housing prices" and the result event is "summoned for talks by the housing and construction commission". Short text b alone contains no causal event, but since short text b is a reply to short text a, a corresponding causal event can still be extracted based on short text b: the cause event is "driving up housing prices" or "summoned for talks by the housing and construction commission", and the corresponding result event is "the agency suspends business".
Step 103: establishing a public opinion affair map according to the causal events of a plurality of information streams, and carrying out public opinion analysis according to the public opinion affair map.
In the embodiment of the invention, each information stream may include one or more causal events, and the original public opinion information generally includes a plurality of information streams; a corresponding affair map, namely the public opinion affair map, can be established from the causal events in the information streams. Specifically, in this embodiment the cause events and result events are used as nodes, and the causal relationship between two events is used as a directed edge, so that a public opinion affair map in the form of a directed graph is formed. As shown in fig. 2, the six nodes in the public opinion affair map represent six events a, b, c, d, e, and f, respectively, and a directed edge between two nodes represents the causal relationship between the two events. In addition, since causal events exist in the form of event pairs (each consisting of a cause event and a result event), an event in the public opinion affair map may be the cause event of one event pair and the result event of another. As shown in fig. 2, in the event pair (a, b), event b is the result event, while in the event pair (b, c), event b is the cause event. After the public opinion affair map is determined, the corresponding public opinion analysis can be carried out based on it; the public opinion affair map expresses the evolution of public opinion more intuitively.
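The directed-graph construction described above can be sketched in Python as follows. This is a minimal illustration, assuming each causal event is already available as a (cause, result) tuple of strings; the function names `build_affair_map` and `roles` are hypothetical, and only the pairs (a, b) and (b, c) are taken from the description of fig. 2.

```python
from collections import defaultdict

def build_affair_map(event_pairs):
    """Build a directed graph: nodes are events, edges point from cause to result."""
    graph = defaultdict(set)
    for cause, result in event_pairs:
        graph[cause].add(result)
        graph.setdefault(result, set())  # events that only appear as results are nodes too
    return dict(graph)

def roles(event, event_pairs):
    """Return the roles ("cause", "result") that an event plays across all pairs."""
    found = set()
    for cause, result in event_pairs:
        if event == cause:
            found.add("cause")
        if event == result:
            found.add("result")
    return found

# Hypothetical input; only (a, b) and (b, c) come from the description of fig. 2.
pairs = [("a", "b"), ("b", "c")]
affair_map = build_affair_map(pairs)
```

Here `roles("b", pairs)` contains both roles, mirroring the observation that event b is the result event of one pair and the cause event of another.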
Optionally, the step 103 of "establishing a public opinion affair map according to the causal events of a plurality of information streams" includes:
Step A1: taking the cause events and the result events as nodes and the causal relationship between a cause event and a result event as a directed edge, establishing the public opinion affair map, and determining the weight of each directed edge, where:
[Equation image BDA0002566909630000071 defining the weight ω_ij; the formula itself is not reproduced in the text]
where (e_i, e_j) denotes the event pair whose cause event is e_i and whose result event is e_j, ω_ij denotes the weight of the directed edge corresponding to the event pair (e_i, e_j), and Num(e_i, e_j) denotes the number of occurrences of the event pair (e_i, e_j).
In an embodiment of the invention, each causal event forms an event pair (e_i, e_j), in which the former, e_i, is the cause event and the latter, e_j, is the result event; the causal relationship between e_i and e_j can be represented by a directed edge. In addition, because the same causal event can be extracted from different information streams, that is, causal events extracted from different places can correspond to the same event pair, this embodiment uses the weight of the directed edge to represent the importance of the corresponding event pair: the greater the weight, the more important the event pair is to the corresponding change in public opinion. Specifically, this embodiment determines the weight from the number of occurrences Num(e_i, e_j) of the event pair (e_i, e_j): the more often the pair occurs, the greater the corresponding weight.
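The weighting step can be illustrated with a short Python sketch. The formula itself is not reproduced in the text, so the normalization used here is an assumption: each count Num(e_i, e_j) is divided by the total count of all pairs that share the cause event e_i. The text only guarantees that a larger count gives a larger weight; the function name `edge_weights` is hypothetical.

```python
from collections import Counter

def edge_weights(event_pairs):
    """Weight each directed edge by how often its event pair was extracted.

    ASSUMPTION: counts are normalized by the total count of pairs sharing the
    same cause event; the source only states that more occurrences mean a
    larger weight.
    """
    num = Counter(event_pairs)            # Num(e_i, e_j)
    out_total = Counter()                 # sum over k of Num(e_i, e_k)
    for (cause, _result), count in num.items():
        out_total[cause] += count
    return {pair: count / out_total[pair[0]] for pair, count in num.items()}

# Hypothetical extraction result: the pair (a, b) was found in two information streams.
pairs = [("a", "b"), ("a", "b"), ("a", "c"), ("b", "c")]
weights = edge_weights(pairs)
```

Under this assumption the weights of the edges leaving one cause event sum to 1, so they can be read as relative frequencies; any other monotonic function of the count would equally satisfy the description.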
According to the public opinion analysis method based on an affair map provided by the embodiment of the invention, conversation extraction is carried out with the short text as the data unit, in view of the characteristics of short message social platforms such as microblogs, so that information streams are formed; short texts can thus be extracted efficiently and the integrity of the information can be retained to the greatest extent. The public opinion affair map can intuitively depict the logical evolution relationships between events, in which the causal relationships explain the evolution path of network public opinion and clearly show its direction of evolution. By using the affair map for public opinion analysis, the embodiment makes it easier to analyze the evolution path of network public opinion, provides a basis for predicting network public opinion, and offers a new perspective for research on public opinion evolution.
On the basis of the above embodiment, the step 101 of "extracting an information stream containing one or more short texts" includes:
Step B1: determining one or more short texts in the original public opinion information.
Step B2: when the original public opinion information contains a plurality of short texts, determining the publication time and the target object of each short text; determining the association relationships between the short texts according to their publication times and target objects; and generating an information stream with short texts as data units, wherein any short text in the information stream has an association relationship with at least one other short text.
In the embodiment of the invention, if the original public opinion information contains only one short text, that is, no other user replied to or commented on the short text after it was published, the short text can be used directly as an information stream. When the original public opinion information contains a plurality of short texts, the publication time and the target object of each short text can be determined. The publication time is the time at which the user published the short text, and the target object is the user corresponding to the other short text that the short text refers to. For example, if short text a is a reply to short text b, and short text b was published by user B, then the target object of short text a is user B. In this embodiment, the target object of a short text is extracted by using the reply and comment features of the short message social platform and is used as the basis for session extraction, so that the information stream can be extracted more accurately.
Meanwhile, when short text a refers to short text b, the two short texts form a context relationship, that is, an association relationship exists between them; the association relationship specifically refers to the context relationship of the two short texts in the information stream. In this embodiment, the other short texts associated with a short text can be determined based on its target object, and the publication time is used to exclude invalid association relationships, that is, a short text can only refer to short texts published earlier, so that the corresponding information stream can be generated. In addition, since the short texts in an information stream are formed by replying to or commenting on other short texts, any short text in the information stream has an association relationship with at least one other short text. For example, if user A publishes short text a1, user B comments with short text b1, user A replies to b1 with short text a2, and user B replies to a2 with short text b2, then the whole information stream can be represented as a1 → b1 → a2 → b2.
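The session extraction of steps B1 and B2 can be sketched as below. This is a simplified illustration, not the embodiment's actual procedure: it assumes every short text carries its author, a publication time, and the user it replies to, and it links each reply to the most recent earlier short text of that target user. The `ShortText` fields and the function name `extract_streams` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ShortText:
    text_id: str
    user: str
    published: int              # publication time, e.g. a Unix timestamp
    target_user: Optional[str]  # user this text replies to; None for a root post

def extract_streams(texts):
    """Group short texts into information streams via reply associations."""
    ordered = sorted(texts, key=lambda t: t.published)
    parent = {}          # text_id -> id of the earlier text it refers to
    latest_by_user = {}  # user -> id of that user's most recent text so far
    for t in ordered:
        # Only earlier texts are in latest_by_user, so a text can never
        # be associated with one published after it.
        if t.target_user is not None and t.target_user in latest_by_user:
            parent[t.text_id] = latest_by_user[t.target_user]
        latest_by_user[t.user] = t.text_id

    def root(text_id):
        while text_id in parent:
            text_id = parent[text_id]
        return text_id

    streams = {}
    for t in ordered:
        streams.setdefault(root(t.text_id), []).append(t.text_id)
    return list(streams.values())

texts = [
    ShortText("a1", "A", 1, None),
    ShortText("b1", "B", 2, "A"),
    ShortText("a2", "A", 3, "B"),
    ShortText("b2", "B", 4, "A"),
    ShortText("c1", "C", 5, None),
]
streams = extract_streams(texts)
```

The first stream reproduces the a1 → b1 → a2 → b2 example, while the unanswered short text c1 forms an information stream on its own. A platform that exposes the identifier of the replied-to post would make the "most recent earlier text" heuristic unnecessary.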
On the basis of the above embodiment, the step 102 of "extracting causal events in the information stream" includes:
Step C1: judging whether a short text in the information stream matches a preset causal rule template, and extracting the complete causal event in the short text when it does.
In the embodiment of the present invention, a short text is essentially still a text, and the causality it contains is generally either complete causality (also called explicit causality) or fuzzy causality. Complete causality contains cue words that mark the causal relationship, so the causal events in it can be extracted by setting causal matching rules; in fuzzy causality the cause and effect components are vague and are not easily extracted by rules. Therefore, this embodiment presets a causal rule template for extracting causal events, with which complete causality in a short text can be extracted. Specifically, the causal rule template includes a plurality of causal matching rules and may further include a constraint condition, a priority, and the like for each causal matching rule. The causal rule template may take the form <Pattern, Constraint, Priority>, where Pattern represents a causal matching rule, which may be a regular expression, Constraint represents the constraint condition, and Priority represents the priority.
For example, a causal matching rule may be <Cue> [Cause], [Effect], where Cue represents a causal conjunction (e.g., "because"), Cause corresponds to the cause event, and Effect corresponds to the result event. For example, "the agency, because of driving up housing prices, was summoned for talks by the housing and construction commission" can be matched with this rule, namely: the agency <because of> [driving up housing prices], [summoned for talks by the housing and construction commission], so that the causal event can be extracted. Since this causal event is a complete causality, it is referred to as a complete causal event in this embodiment.
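A causal rule template of the form <Pattern, Constraint, Priority> might be represented as in the following Python sketch. The two English regular expressions, the word-count constraint, and the names `CausalRule`, `RULES` and `extract_complete_causal` are illustrative assumptions; the embodiment's actual rules target the original language and are not listed in the text.

```python
import re
from dataclasses import dataclass
from typing import Callable

@dataclass
class CausalRule:
    pattern: re.Pattern                    # Pattern: the causal matching rule
    constraint: Callable[[re.Match], bool] # Constraint: extra condition on a match
    priority: int                          # Priority: higher values are tried first

RULES = [
    # <Cue> [Cause], [Effect]   e.g. "because X, Y"
    CausalRule(
        pattern=re.compile(
            r"\b(?:because|since)\s+(?P<cause>[^,]+),\s*(?P<effect>[^.]+)", re.I),
        constraint=lambda m: len(m.group("cause").split()) >= 2,
        priority=2,
    ),
    # [Effect] <Cue> [Cause]    e.g. "Y because X"
    CausalRule(
        pattern=re.compile(
            r"(?P<effect>[^,.]+?)\s+because\s+(?P<cause>[^,.]+)", re.I),
        constraint=lambda m: True,
        priority=1,
    ),
]

def extract_complete_causal(text):
    """Return (cause, effect) from the highest-priority matching rule, else None."""
    for rule in sorted(RULES, key=lambda r: r.priority, reverse=True):
        m = rule.pattern.search(text)
        if m and rule.constraint(m):
            return m.group("cause").strip(), m.group("effect").strip()
    return None

pair = extract_complete_causal(
    "Because the agency drove up housing prices, it was summoned by the housing commission."
)
```

A short text without any cue word, such as "The agency has suspended business", yields `None` and would be passed on to the fuzzy-causality check.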
Step C2: when the short text does not match the preset causal rule template, judging whether a fuzzy causal event exists in the short text, and when a fuzzy causal event exists, taking the fuzzy causal event and the complete causal events in the short texts as causal events of the information stream.
In this embodiment, if a short text matches the causal rule template and the extracted complete causal event covers the complete short text, matching can proceed to the next short text. If the extracted complete causal event does not cover the complete short text, that is, part of the content of the short text still does not match the causal rule template, or if the short text does not match the causal rule template at all, which indicates that no complete causality exists in it, this embodiment further determines whether fuzzy causality exists, so as to avoid omissions. If fuzzy causality exists in the short text, a causal event, namely a fuzzy causal event, can also be extracted; both the fuzzy causal event and the complete causal event determined in step C1 are causal events of the information stream.
Specifically, fuzzy numbers can be introduced to determine the probability that events are causally related, and the cause event and result event are extracted accordingly, so that the fuzzy causal event in the short text is extracted. Alternatively, based on the grounded theory method, the collected short texts can be repeatedly broken down, sorted, and recombined, concepts mined and categories refined through open coding, and causal event pairs extracted. In addition, other existing mature technologies (such as deep learning) may also be used to extract fuzzy causal events in short texts, which is not limited in this embodiment.
Optionally, the step 102 of "extracting causal events in the information stream" further includes:
Step C3: judging whether a fuzzy causal event exists between two short texts that have an association relationship, and when such a fuzzy causal event exists, taking it as a causal event of the information stream; wherein the association relationship is the context relationship of the two short texts in the information stream.
In the embodiment of the invention, because an association relationship can exist between two short texts, that is, one short text can be used for replying or commenting on another short text, the two separate short texts are in a contextual relationship in nature, and can also represent an event evolution process, that is, an event pair exists. Since the user generally does not use the causal connection when replying to a short text, i.e. there is no complete causal event between the two, and the causal relationship cannot be extracted through the causal rule template, the causal event between the two short texts, i.e. the fuzzy causal event, is extracted in a similar manner to step C2 in this embodiment.
For example, the user A publishes a short text a 'star feverly', the user B replies a short text B 'pollen shedding', the two short texts a and B have an association relationship, and whether fuzzy cause and effect exists or not can be judged at the moment; the fact that the "star fierce" causes the "pollen shedding" event has a causal relationship, the "star fierce" is a cause event, the "pollen shedding" is an effect event, and the corresponding event pair is an ambiguous cause-effect event.
In the embodiment of the invention, all causal events in the information flow can be completely extracted by determining the complete causal and fuzzy causal inside the short texts and determining the fuzzy causal between the short texts.
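Step C3 can be illustrated with a minimal sketch. The `ShortText` record, its `reply_to` field, the `causal_score` callback and the 0.5 threshold are hypothetical names and values introduced here; the sketch also assumes, in line with the reply example above, that the earlier short text is the candidate cause and the reply is the candidate effect.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class ShortText:
    text_id: str
    content: str
    reply_to: Optional[str] = None  # id of the short text this one replies to (assumed field)


def cross_text_causal_pairs(
    flow: List[ShortText],
    causal_score: Callable[[str, str], float],
    threshold: float = 0.5,
) -> List[Tuple[str, str]]:
    """Return (cause, effect) pairs between short texts linked by a reply relation."""
    by_id = {t.text_id: t for t in flow}
    pairs = []
    for text in flow:
        parent = by_id.get(text.reply_to) if text.reply_to else None
        if parent is None:
            continue  # no association relationship inside this flow, nothing to test
        # causal_score stands in for any fuzzy causality judgment (fuzzy numbers, a trained model, ...).
        if causal_score(parent.content, text.content) >= threshold:
            pairs.append((parent.content, text.content))
    return pairs
```

Only pairs that already have an association relationship are scored, so the number of causality judgments grows with the number of replies rather than with the square of the number of short texts.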
On the basis of the above embodiment, the step 103 "performing public opinion analysis according to public opinion map" includes:
Step D1: clustering the events in the public opinion affair map, and generalizing one or more events into a unified abstract event.
Step D2: generating a public opinion abstract affair map according to the abstract events, and carrying out public opinion analysis according to the public opinion affair map and/or the public opinion abstract affair map.
In the embodiment of the invention, the events in the public opinion affair map (including the cause event and the effect event of each event pair) are extracted from the information flow, and different users may express the same event in different ways, so different events in the public opinion affair map may represent similar or even identical events. In this embodiment, events with high similarity are generalized together to form a public opinion abstract affair map, which represents the relations between events at a higher level, so that the displayed evolution path is more representative of the domain.
Specifically, cluster analysis can be performed on the events in the public opinion affair map, so that one or more similar events are aggregated and further generalized into one unified event, namely an abstract event. Each event can be converted into a word vector, and whether two events are similar is determined by calculating the Euclidean distance or cosine similarity between the word vectors, thereby realizing clustering. Meanwhile, the abstract event can be determined based on the high-frequency vocabulary of the events in the same class, and a public opinion abstract affair map is finally formed. In addition, the process of determining the weight between two events may be similar to that described in step A1 and is not repeated here. Taking the generalization of the public opinion affair map shown in fig. 2 as an example, if after cluster analysis event a forms one class, events b and d form one class, event c forms one class, and events e and f form one class, and the abstract events corresponding to the four classes are A, B, C and D respectively, the public opinion abstract affair map finally generated can be as shown in fig. 3.
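Steps D1 and D2 can be sketched as below. The sketch makes several assumptions that are not fixed by this embodiment: events are vectorized as bag-of-words counts rather than trained word vectors, clustering is a greedy single pass against the first member of each cluster, the similarity threshold is 0.5, and the input edges carry occurrence counts. All function names are hypothetical.

```python
import math
from collections import Counter


def vectorize(event):
    """Bag-of-words vector for an event (stand-in for a learned word vector)."""
    return Counter(event.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0


def cluster_events(events, threshold=0.5):
    """Greedy clustering: join the first cluster whose first member is similar enough."""
    clusters = []
    for event in events:
        vec = vectorize(event)
        for cluster in clusters:
            if cosine(vec, vectorize(cluster[0])) >= threshold:
                cluster.append(event)
                break
        else:
            clusters.append([event])
    return clusters


def abstract_label(cluster, top_k=2):
    """Name an abstract event by the highest-frequency words of its cluster."""
    counts = Counter(w for e in cluster for w in e.lower().split())
    return " ".join(w for w, _ in counts.most_common(top_k))


def abstract_graph(edges, clusters):
    """Merge event-level edge counts into edges between cluster indices."""
    cluster_of = {e: i for i, c in enumerate(clusters) for e in c}
    merged = Counter()
    for (cause, effect), n in edges.items():
        a, b = cluster_of[cause], cluster_of[effect]
        if a != b:  # drop edges that collapse inside one abstract event
            merged[(a, b)] += n
    return dict(merged)
```

Keying the abstract graph by cluster index rather than by label avoids accidentally merging two clusters that happen to share the same high-frequency words; labels from `abstract_label` can be attached afterwards for display.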
According to the public opinion analysis method based on the affair map provided by the embodiment of the invention, in view of the characteristics of short message social platforms such as microblogs, conversation extraction is carried out with the short text as the data unit, so that information flows are formed; short texts can be extracted efficiently and the integrity of the information is preserved to the greatest extent. The public opinion affair map intuitively depicts the logical evolution relations between events, in which the causal relations can fully explain the evolution path of network public opinion and clearly show its direction of evolution. Analyzing public opinion with the affair map makes it more convenient to analyze the evolution path of network public opinion, provides a basis for predicting network public opinion, and offers a new perspective for studying how public opinion evolves. Using the target object as the basis for conversation extraction allows the information flow to be extracted more accurately. By determining complete causality and fuzzy causality inside the short texts and fuzzy causality between the short texts, the causal events in the information flow can be extracted completely. Events with high similarity are generalized together to form a public opinion abstract affair map, which represents the relations between events at a higher level, so that the displayed evolution path is more representative of the domain.
The above-mentioned public opinion analyzing method based on the concept graph according to the embodiment of the present invention is described in detail with reference to fig. 1 to fig. 3, and the method can also be implemented by corresponding devices.
Fig. 4 is a schematic structural diagram illustrating a public opinion analysis device based on a concept graph according to an embodiment of the present invention. As shown in fig. 4, the public opinion analyzing apparatus based on the concept graph includes:
the conversation extraction module 41 is used for carrying out conversation extraction on the original public opinion information in the short message social platform and extracting an information stream containing one or more short texts;
an event extraction module 42, configured to extract causal events in the information stream, where the causal events include a causal event inside a short text and/or a causal event between two short texts, and each causal event includes a corresponding cause event and effect event;
an affair map analysis module 43, configured to establish a public opinion affair map according to the causal events of a plurality of information flows, and perform public opinion analysis according to the public opinion affair map.
On the basis of the above embodiment, the conversation extraction module 41 extracts an information stream containing one or more short texts, including:
determining one or more short texts in the original public opinion information;
when the original public opinion information contains a plurality of short texts, determining the publication time and the pointed target object of each short text; and determining an association relation between short texts according to the publication time of the short texts and the target object, and generating an information stream taking the short texts as data units, wherein the association relation exists between any one short text in the information stream and at least one other short text.
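The conversation extraction described above can be sketched as follows. The sketch assumes that the "target object" of a short text is the identifier of the short text it replies to or comments on, and that every short text ultimately hanging off the same original post belongs to one information flow; the `ShortText` fields and the function name are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class ShortText:
    text_id: str
    published: datetime
    target: Optional[str]  # id of the short text being replied to; None for an original post
    content: str


def extract_information_flows(texts: List[ShortText]) -> List[List[ShortText]]:
    """Group short texts into information flows by following target links to a root post."""
    by_id = {t.text_id: t for t in texts}

    def root_of(t: ShortText) -> str:
        seen = set()
        while t.target in by_id and t.text_id not in seen:
            parent = by_id[t.target]
            if parent.published > t.published:
                break  # a reply cannot predate its target, so stop following the link
            seen.add(t.text_id)
            t = parent
        return t.text_id

    flows: Dict[str, List[ShortText]] = {}
    # Sorting by publication time keeps each flow in conversational order.
    for t in sorted(texts, key=lambda x: x.published):
        flows.setdefault(root_of(t), []).append(t)
    return list(flows.values())
```

In this sketch a post with no replies still yields a flow of one short text, matching the statement that an information stream contains one or more short texts.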
On the basis of the above embodiment, the extracting, by the event extraction module 42, the causal event in the information flow includes:
judging whether the short text in the information flow is matched with a preset causal rule template or not, and extracting a complete causal event in the short text when the short text in the information flow is matched with the preset causal rule template;
and when the short text does not match the causal rule template, judging whether a fuzzy causal event exists in the short text, and when a fuzzy causal event exists in the short text, taking the fuzzy causal events and the complete causal events in the short text as the causal events of the information flow.
On the basis of the above embodiment, the extracting, by the event extraction module 42, the causal event in the information flow further includes:
judging whether fuzzy causal events exist between two short texts with an association relation, and taking the fuzzy causal events between the short texts as the causal events of the information flow when the fuzzy causal events exist between the short texts;
wherein the association relationship is a context relationship of the two short texts in the information flow.
On the basis of the above embodiment, the affair map analysis module 43 establishes a public opinion affair map according to the causal events of a plurality of information flows, including:
taking the cause events and the effect events as nodes, taking the relation between a cause event and an effect event as a directed edge, establishing the public opinion affair map, and determining the weight of each directed edge, where:
Figure BDA0002566909630000141
wherein $(e_i, e_j)$ denotes the event pair whose cause event is $e_i$ and whose effect event is $e_j$, $\omega_{ij}$ denotes the weight of the directed edge corresponding to the event pair $(e_i, e_j)$, and $\mathrm{Num}(e_i, e_j)$ denotes the number of occurrences of the event pair $(e_i, e_j)$.
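The weight formula itself is an image that is not reproduced in this text, so the following is only a sketch under an explicit assumption: the weight of a directed edge is taken to be the occurrence count of its event pair divided by the total occurrence count of all event pairs sharing the same cause event. The function name `edge_weights` is hypothetical, and the actual formula of the embodiment may differ.

```python
from collections import Counter, defaultdict


def edge_weights(event_pairs):
    """Map each (cause, effect) pair to an assumed weight Num(e_i, e_j) / sum_k Num(e_i, e_k)."""
    num = Counter(event_pairs)  # Num(e_i, e_j): occurrences of each event pair
    out_total = defaultdict(int)
    for (cause, _effect), n in num.items():
        out_total[cause] += n  # total occurrences of pairs leaving the same cause event
    # ASSUMPTION: out-degree normalization; the source formula image is unavailable.
    return {(cause, effect): n / out_total[cause] for (cause, effect), n in num.items()}
```

Under this assumption the weights on the outgoing edges of any cause event sum to 1, so each weight can be read as the relative frequency with which that cause is followed by a given effect.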
On the basis of the above embodiment, the affair map analysis module 43 performs public opinion analysis according to the public opinion affair map, including:
clustering the events in the public opinion affair map, and generalizing one or more events into a unified abstract event;
and generating a public opinion abstract affair map according to the abstract events, and carrying out public opinion analysis according to the public opinion affair map and/or the public opinion abstract affair map.
According to the public opinion analysis device based on the affair map provided by the embodiment of the invention, in view of the characteristics of short message social platforms such as microblogs, conversation extraction is carried out with the short text as the data unit, so that information flows are formed; short texts can be extracted efficiently and the integrity of the information is preserved to the greatest extent. The public opinion affair map intuitively depicts the logical evolution relations between events, in which the causal relations can fully explain the evolution path of network public opinion and clearly show its direction of evolution. Analyzing public opinion with the affair map makes it more convenient to analyze the evolution path of network public opinion, provides a basis for predicting network public opinion, and offers a new perspective for studying how public opinion evolves. Using the target object as the basis for conversation extraction allows the information flow to be extracted more accurately. By determining complete causality and fuzzy causality inside the short texts and fuzzy causality between the short texts, the causal events in the information flow can be extracted completely. Events with high similarity are generalized together to form a public opinion abstract affair map, which represents the relations between events at a higher level, so that the displayed evolution path is more representative of the domain.
In addition, an embodiment of the present invention further provides an electronic device, which includes a bus, a transceiver, a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the transceiver, the memory, and the processor are connected via the bus, and when the computer program is executed by the processor, the processes of the above-mentioned embodiment of the method for analyzing public opinion based on a case map are implemented, and the same technical effects can be achieved, and therefore, in order to avoid repetition, details are not repeated here.
Specifically, referring to fig. 5, an embodiment of the present invention further provides an electronic device, which includes a bus 1110, a processor 1120, a transceiver 1130, a bus interface 1140, a memory 1150, and a user interface 1160.
In an embodiment of the present invention, the electronic device further includes: a computer program stored in the memory 1150 and capable of running on the processor 1120, wherein the computer program when executed by the processor 1120 implements the processes of the above-mentioned embodiment of the concept-atlas-based public opinion analyzing method.
A transceiver 1130 for receiving and transmitting data under the control of the processor 1120.
In embodiments of the invention in which a bus architecture (represented by bus 1110) is used, bus 1110 may include any number of interconnected buses and bridges, with bus 1110 connecting various circuits including one or more processors, represented by processor 1120, and memory, represented by memory 1150.
Bus 1110 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an Accelerated Graphics Port (AGP), and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include: an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an Enhanced ISA (EISA) bus, a Video Electronics Standards Association (VESA) local bus, and a Peripheral Component Interconnect (PCI) bus.
Processor 1120 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method embodiments may be performed by integrated logic circuits in hardware or instructions in software in a processor. The processor described above includes: general purpose processors, Central Processing Units (CPUs), Network Processors (NPs), Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), Complex Programmable Logic Devices (CPLDs), Programmable Logic Arrays (PLAs), Micro Control Units (MCUs) or other Programmable Logic devices, discrete gates, transistor Logic devices, discrete hardware components. The various methods, steps and logic blocks disclosed in embodiments of the present invention may be implemented or performed. For example, the processor may be a single core processor or a multi-core processor, which may be integrated on a single chip or located on multiple different chips.
Processor 1120 may be a microprocessor or any conventional processor. The steps of the method disclosed in connection with the embodiments of the present invention may be directly performed by a hardware decoding processor, or may be performed by a combination of hardware and software modules in the decoding processor. The software modules may be located in a Random Access Memory (RAM), a flash Memory (flash Memory), a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), a register, and other readable storage media known in the art. The readable storage medium is located in a memory, and a processor reads information in the memory and completes the steps of the method in combination with hardware of the processor.
The bus 1110 may also connect various other circuits such as peripherals, voltage regulators, or power management circuits, and the bus interface 1140 provides an interface between the bus 1110 and the transceiver 1130; these are well known in the art and are therefore not described further in the embodiments of the present invention.
The transceiver 1130 may be one element or may be multiple elements, such as multiple receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium. For example: the transceiver 1130 receives external data from other devices, and the transceiver 1130 transmits data processed by the processor 1120 to other devices. Depending on the nature of the computer system, a user interface 1160 may also be provided, such as: touch screen, physical keyboard, display, mouse, speaker, microphone, trackball, joystick, stylus.
It is to be appreciated that in embodiments of the invention, the memory 1150 may further include memory located remotely from the processor 1120, and such remote memory may be connected to a server via a network. One or more portions of the above network may be an ad hoc network, an intranet, an extranet, a Virtual Private Network (VPN), a Local Area Network (LAN), a Wireless Local Area Network (WLAN), a Wide Area Network (WAN), a Wireless Wide Area Network (WWAN), a Metropolitan Area Network (MAN), the Internet, a Public Switched Telephone Network (PSTN), a Plain Old Telephone Service (POTS) network, a cellular telephone network, a Wireless Fidelity (Wi-Fi) network, or a combination of two or more of the above. For example, the cellular telephone network and the wireless network may be a Global System for Mobile Communications (GSM) system, a Code Division Multiple Access (CDMA) system, a Worldwide Interoperability for Microwave Access (WiMAX) system, a General Packet Radio Service (GPRS) system, a Wideband Code Division Multiple Access (WCDMA) system, a Long Term Evolution (LTE) system, an LTE Frequency Division Duplex (FDD) system, an LTE Time Division Duplex (TDD) system, a Long Term Evolution-Advanced (LTE-A) system, a Universal Mobile Telecommunications System (UMTS), an enhanced Mobile Broadband (eMBB) system, a massive Machine Type Communication (mMTC) system, an Ultra-Reliable Low-Latency Communication (URLLC) system, or the like.
It is to be understood that the memory 1150 in embodiments of the present invention can be either volatile memory or nonvolatile memory, or can include both volatile and nonvolatile memory. Wherein the nonvolatile memory includes: Read-Only Memory (ROM), Programmable ROM (PROM), Erasable PROM (EPROM), Electrically Erasable PROM (EEPROM), or Flash Memory.
The volatile memory includes Random Access Memory (RAM), which acts as an external cache. By way of example, and not limitation, many forms of RAM are available, such as: Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), and Direct Rambus RAM (DRRAM). The memory 1150 of the electronic device described in the embodiments of the invention includes, but is not limited to, the above and any other suitable types of memory.
In an embodiment of the present invention, memory 1150 stores the following elements of operating system 1151 and application programs 1152: an executable module, a data structure, or a subset thereof, or an expanded set thereof.
Specifically, the operating system 1151 includes various system programs such as: a framework layer, a core library layer, a driver layer, etc. for implementing various basic services and processing hardware-based tasks. Applications 1152 include various applications such as: media Player (Media Player), Browser (Browser), for implementing various application services. A program implementing a method of an embodiment of the invention may be included in application program 1152. The application programs 1152 include: applets, objects, components, logic, data structures, and other computer system executable instructions that perform particular tasks or implement particular abstract data types.
In addition, an embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the computer program implements each process of the above-mentioned public opinion analysis method based on a case map, and can achieve the same technical effect, and in order to avoid repetition, the details are not repeated here.
The computer-readable storage medium includes permanent and non-permanent, removable and non-removable media, and may be a tangible device that retains and stores instructions for use by an instruction execution apparatus. The computer-readable storage medium includes: electronic memory devices, magnetic memory devices, optical memory devices, electromagnetic memory devices, semiconductor memory devices, and any suitable combination of the foregoing. The computer-readable storage medium includes: phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), non-volatile random access memory (NVRAM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassette storage, magnetic tape or disk storage or other magnetic storage devices, memory sticks, mechanically encoded devices (e.g., punched cards or raised structures in a groove having instructions recorded thereon), or any other non-transmission medium that can be used to store information accessible by a computing device. As defined in embodiments of the present invention, the computer-readable storage medium does not include transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses traveling through a fiber optic cable), or electrical signals transmitted through a wire.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus, electronic device and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions in actual implementation, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may also be an electrical, mechanical or other form of connection.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to solve the problem to be solved by the embodiment of the invention.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present invention, in essence or in the part that contributes over the prior art, or all or part of the technical solutions, may be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (including a personal computer, a server, a data center, or other network devices) to execute all or part of the steps of the methods of the embodiments of the present invention. The storage medium includes the various media listed above that can store program code.
The above description is only a specific implementation of the embodiments of the present invention, but the scope of the embodiments of the present invention is not limited thereto, and any person skilled in the art can easily conceive of changes or substitutions within the technical scope of the embodiments of the present invention, and all such changes or substitutions should be covered by the scope of the embodiments of the present invention. Therefore, the protection scope of the embodiments of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A public opinion analysis method based on an affair map, characterized by comprising the following steps:
carrying out conversation extraction on original public opinion information in a short message social platform, and extracting an information stream containing one or more short texts;
extracting causal events in the information flow, wherein the causal events comprise causal events inside a short text and/or causal events between two short texts, and each causal event comprises a corresponding cause event and effect event;
and establishing a public opinion affair map according to the causal events of a plurality of information flows, and carrying out public opinion analysis according to the public opinion affair map.
2. The method of claim 1, wherein extracting the information stream containing the one or more short texts comprises:
determining one or more short texts in the original public opinion information;
when the original public opinion information contains a plurality of short texts, determining the publication time and the pointed target object of each short text; and determining an association relation between short texts according to the publication time of the short texts and the target object, and generating an information stream taking the short texts as data units, wherein the association relation exists between any one short text in the information stream and at least one other short text.
3. The method of claim 1, wherein the extracting causal events in the information flow comprises:
judging whether the short text in the information flow is matched with a preset causal rule template or not, and extracting a complete causal event in the short text when the short text in the information flow is matched with the preset causal rule template;
and when the short text does not match the causal rule template, judging whether a fuzzy causal event exists in the short text, and when a fuzzy causal event exists in the short text, taking the fuzzy causal events and the complete causal events in the short text as the causal events of the information flow.
4. The method of claim 3, wherein the extracting causal events in the information flow further comprises:
judging whether fuzzy causal events exist between two short texts with an association relation, and taking the fuzzy causal events between the short texts as the causal events of the information flow when the fuzzy causal events exist between the short texts;
wherein the association relationship is a context relationship of the two short texts in the information flow.
5. The method of claim 1, wherein establishing a public opinion graph based on the causal events of the plurality of information flows comprises:
taking the cause events and the effect events as nodes, taking the relation between a cause event and an effect event as a directed edge, establishing the public opinion affair map, and determining the weight of each directed edge, where:
Figure FDA0002566909620000021
wherein $(e_i, e_j)$ denotes the event pair whose cause event is $e_i$ and whose effect event is $e_j$, $\omega_{ij}$ denotes the weight of the directed edge corresponding to the event pair $(e_i, e_j)$, and $\mathrm{Num}(e_i, e_j)$ denotes the number of occurrences of the event pair $(e_i, e_j)$.
6. The method of claim 1, wherein the public opinion analysis according to the public opinion graph comprises:
clustering events in the public opinion affair map, and generalizing one or more events into a uniform abstract event;
and generating a public opinion abstract affair map according to the abstract events, and carrying out public opinion analysis according to the public opinion affair map and/or the public opinion abstract affair map.
7. A public opinion analysis device based on an affair map, characterized by comprising:
the conversation extraction module is used for carrying out conversation extraction on the original public opinion information in the short message social platform and extracting an information stream containing one or more short texts;
the event extraction module is used for extracting the causal events in the information flow, wherein the causal events comprise causal events inside a short text and/or causal events between two short texts, and each causal event comprises a corresponding cause event and effect event;
and the affair map analysis module is used for establishing a public opinion affair map according to the causal events of a plurality of information flows and carrying out public opinion analysis according to the public opinion affair map.
8. The apparatus of claim 7, wherein the session extraction module extracts an information stream containing one or more short texts, comprising:
determining one or more short texts in the original public opinion information;
when the original public opinion information contains a plurality of short texts, determining the publication time and the pointed target object of each short text; and determining an association relation between short texts according to the publication time of the short texts and the target object, and generating an information stream taking the short texts as data units, wherein the association relation exists between any one short text in the information stream and at least one other short text.
9. An electronic device comprising a bus, a transceiver, a memory, a processor and a computer program stored on the memory and operable on the processor, wherein the transceiver, the memory and the processor are connected via the bus, and wherein the computer program, when executed by the processor, implements the steps of the public opinion analysis method based on an affair map according to any one of claims 1 to 6.
10. A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the public opinion analysis method based on an affair map according to any one of claims 1 to 6.
CN202010627061.6A 2020-07-02 2020-07-02 Public opinion analysis method and device based on affair map and electronic equipment Pending CN111753102A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010627061.6A CN111753102A (en) 2020-07-02 2020-07-02 Public opinion analysis method and device based on affair map and electronic equipment

Publications (1)

Publication Number Publication Date
CN111753102A true CN111753102A (en) 2020-10-09

Family

ID=72678720

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010627061.6A Pending CN111753102A (en) 2020-07-02 2020-07-02 Public opinion analysis method and device based on affair map and electronic equipment

Country Status (1)

Country Link
CN (1) CN111753102A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113656602A (en) * 2021-09-01 2021-11-16 中国人民解放军31007部队 Method and device for creating affair map

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109726293A (en) * 2018-11-14 2019-05-07 数据地平线(广州)科技有限公司 A kind of causal event map construction method, system, device and storage medium
CN110377759A (en) * 2019-07-22 2019-10-25 中国工商银行股份有限公司 Event relation map construction method and device
CN110895569A (en) * 2019-10-10 2020-03-20 卓尔智联(武汉)研究院有限公司 Case affairs map construction method, electronic device and storage medium
CN110968699A (en) * 2019-11-01 2020-04-07 数地科技(北京)有限公司 Logic map construction and early warning method and device based on event recommendation

Similar Documents

Publication Publication Date Title
WO2020073673A1 (en) Text analysis method and terminal
CN111881991B (en) Method and device for identifying fraud and electronic equipment
CN112749749B (en) Classification decision tree model-based classification method and device and electronic equipment
CN106874253A (en) Recognize the method and device of sensitive information
US9996611B2 (en) Method, computer program, and computer for classifying users of social media
CN113240071B (en) Method and device for processing graph neural network, computer equipment and storage medium
CN112860852A (en) Information analysis method and device, electronic equipment and computer readable storage medium
CN112348123A (en) User clustering method and device and electronic equipment
CN110913354A (en) Short message classification method and device and electronic equipment
CN111444335B (en) Method and device for extracting central word
WO2024179575A1 (en) Data processing method, and device and computer-readable storage medium
CN111078849A (en) Method and apparatus for outputting information
US20210334707A1 (en) System and method for managing classification outcomes of data inputs classified into bias categories
CN111753102A (en) Public opinion analysis method and device based on affair map and electronic equipment
US11289071B2 (en) Information processing system, information processing device, computer program, and method for updating dictionary database
CN111708946A (en) Personalized movie recommendation method and device and electronic equipment
CN117291722A (en) Object management method, related device and computer readable medium
CN112364285B (en) Method and device for establishing abnormality detection model based on UEBA (user and entity behavior analytics) and related products
CN115758211A (en) Text information classification method and device, electronic equipment and storage medium
CN115618065A (en) Data processing method and related equipment
US11042808B2 (en) Predicting activity consequences based on cognitive modeling
CN115186738A (en) Model training method, device and storage medium
CN113128225B (en) Named entity identification method and device, electronic equipment and computer storage medium
CN114662452A (en) Privacy-removing text label analysis method and device
CN110348190B (en) User equipment attribution judging method and device based on user operation behaviors

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 2020-10-09)