CN116522013B - Public opinion analysis method and system based on social network platform - Google Patents

Public opinion analysis method and system based on social network platform

Info

Publication number
CN116522013B
Authority
CN
China
Prior art keywords
public opinion
information
emotion
determining
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310780017.2A
Other languages
Chinese (zh)
Other versions
CN116522013A (en)
Inventor
李志杰
郭晋
姜波清
于瑞清
刀国羚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lemai Information Technology Hangzhou Co ltd
Original Assignee
Lemai Information Technology Hangzhou Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lemai Information Technology Hangzhou Co ltd
Priority to CN202310780017.2A
Publication of CN116522013A
Application granted
Publication of CN116522013B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/169 Annotation, e.g. comment data or footnotes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a public opinion analysis method and system based on a social network platform. The method comprises: determining, based on acquired network public opinion information, the occurrence frequency and spatial position distribution of each comment text in the network public opinion information; extracting the public opinion features of the network public opinion information based on the occurrence frequency and spatial position distribution of each comment text; determining the public opinion topic corresponding to the network public opinion information through a combined clustering algorithm according to the public opinion features; determining, based on each public opinion feature of the public opinion topic, the emotion attribute corresponding to each public opinion feature through a pre-constructed emotion extraction model; determining the information entropy and emotion polarity of the network public opinion information according to the public opinion features and the emotion features corresponding to the network public opinion information; and determining the degree of diffusion of the network public opinion information according to the information entropy and the emotion polarity. The method can effectively perform public opinion analysis.

Description

Public opinion analysis method and system based on social network platform
Technical Field
The application relates to the technical field of public opinion analysis, in particular to a public opinion analysis method and system based on a social network platform.
Background
The rapid development of information technology has increased the spread and uncertainty of network public opinion, and the complexity of cyberspace makes network public opinion extremely difficult to control.
CN104484359B, a public opinion analysis method and device based on a social graph, discloses: capturing social network information from the Internet with a network data acquisition tool and obtaining basic social graph data through data processing; obtaining social center points and comparing their number with a preset public opinion center point threshold; judging whether public opinion center point behavior exists in social graphs whose number of social center points is smaller than the preset threshold; obtaining core nodes, comparing the number of child nodes of each core node with a preset child node threshold, and taking the comparison result as the evaluation criterion for the public opinion situation; and issuing public opinion early warnings according to the acquired public opinion topics and the evaluation result.
CN106649578A, a public opinion analysis method and system based on a social network platform, discloses: S1: statistically analyzing users' search terms and search frequencies to obtain a data set; S2: filtering out duplicate content; S3: clustering the data and merging each class into a document set; S4: obtaining public opinion results with their associated heat.
The prior art thus provides two public opinion analysis methods, but neither considers emotion characteristics. Both are limited to early warning and current public opinion heat, and neither considers public opinion diffusion, which is a factor with a great influence on public opinion.
Disclosure of Invention
The embodiment of the application provides a public opinion analysis method and system based on a social network platform, which can at least solve part of problems in the prior art, namely, solve the problems that the prior art does not consider emotion characteristics and is limited to early warning and current public opinion heat.
In a first aspect of an embodiment of the present application,
the public opinion analysis method based on the social network platform comprises the following steps:
determining the occurrence frequency and the spatial position distribution of each comment text in the network public opinion information based on the acquired network public opinion information, extracting the public opinion characteristics of the network public opinion information based on the occurrence frequency and the spatial position distribution of each comment text, and determining public opinion topics corresponding to the network public opinion information through a combined clustering algorithm according to the public opinion characteristics;
based on each public opinion feature of the public opinion theme, determining emotion attributes corresponding to each public opinion feature through a pre-constructed emotion extraction model, wherein the emotion extraction model is formed by fusing a plurality of models and is used for extracting emotion attributes corresponding to the public opinion feature;
and determining the information entropy and emotion polarity of the network public opinion information according to the public opinion characteristics and the emotion characteristics corresponding to the network public opinion information, and determining the diffuseness of the network public opinion information according to the information entropy and the emotion polarity.
In an alternative embodiment of the present application,
the public opinion characteristics of the network public opinion information extracted based on the occurrence frequency and the spatial position distribution of each comment text are shown as the following formula:
[Formula not reproduced in the source text.]
wherein NP denotes the public opinion features corresponding to the network public opinion information, and D denotes the total amount of network public opinion information; the remaining terms, whose symbols are not reproduced in the source text, denote in order: the occurrence frequency of a comment text in any piece of acquired network public opinion information; the spatial position distribution of the p-th comment text in network public opinion information i; the distribution dispersion of network public opinion information i in the sample space; and the number of semantic categories present in the acquired network public opinion information.
In an alternative embodiment of the present application,
the determining the public opinion theme corresponding to the network public opinion information through a combined clustering algorithm according to the public opinion characteristics comprises the following steps:
assigning topics to the public opinion features according to a preset topic identification model; determining, based on a preset neighborhood radius and a minimum point threshold, the number of public opinion features within the neighborhood radius of each public opinion feature; if the number of public opinion features within the neighborhood radius is greater than or equal to the minimum point threshold, taking the public opinion feature as a core feature, and clustering the core feature together with all public opinion features within its neighborhood radius to determine a plurality of first clustering results;
and determining the Euclidean distance between the core features of adjacent clustering results among the plurality of first clustering results, merging the adjacent clustering results with the smallest Euclidean distance to obtain a second clustering result, and determining the public opinion topic of each public opinion feature in the second clustering result.
In an alternative embodiment of the present application,
the emotion extraction model comprises an input layer, a feature extraction layer and an output layer, wherein a first sub-model is arranged in the input layer and a second sub-model is arranged in the feature extraction layer,
based on each public opinion feature of the public opinion topic, determining emotion attributes corresponding to each public opinion feature through a pre-constructed emotion extraction model comprises the following steps:
mapping each public opinion characteristic of the public opinion theme to obtain a matrix formed by real value vectors;
convolving and pooling the matrix by using a second sub-model at the feature extraction layer to obtain a first feature vector, wherein the first feature vector comprises global semantic information and local semantic information of the public opinion features;
and determining the weight of the first feature vector by using a self-attention mechanism at the output layer, activating by using a sigmoid function based on the first feature vector and the corresponding weight, and determining the emotion attribute corresponding to each public opinion feature.
In an alternative embodiment of the present application,
the second sub-model includes a BiGRU model and a CNN model,
the step of convolving and pooling the matrix by using a second sub-model at the feature extraction layer to obtain a first feature vector includes:
convolving the matrix through a BiGRU model of the second sub-model, and weighting bidirectional semantic information generated by forward propagation and backward propagation on a convolution result to obtain a second feature vector, wherein the second feature vector comprises global semantic information of the public opinion features;
convolving the second feature vector through a CNN model of the second sub-model to obtain a third feature vector, wherein the third feature vector comprises local semantic information of the public opinion feature;
and splicing the pooled second feature vector and the pooled third feature vector at the feature extraction layer to obtain the first feature vector.
In an alternative embodiment of the present application,
the determining the information entropy and emotion polarity of the network public opinion information according to the public opinion characteristics and the emotion characteristics corresponding to the network public opinion information, and determining the diffuseness of the network public opinion information according to the information entropy and the emotion polarity comprises:
determining the information entropy and the emotion polarity according to a method shown in the following formula:
[Formulas not reproduced in the source text.]
wherein H(NP) denotes the information entropy of the public opinion features, N denotes the number of public opinion features, and the remaining symbol (not reproduced in the source text) denotes the probability of occurrence of the i-th public opinion feature;
wherein POL denotes the emotion polarity, M denotes the number of emotion features, WS() denotes the emotion score function, and WW() denotes the emotion feature weight distribution function.
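The two quantities defined above can be illustrated with a minimal Python sketch. Because the formulas themselves are not reproduced in the text, the forms used here are assumptions: the entropy is taken to be the standard Shannon entropy with a base-2 logarithm, and the polarity is taken to be a weighted sum of per-feature emotion scores. The names `information_entropy`, `emotion_polarity`, and the example lexicon are hypothetical.

```python
import math

def information_entropy(probabilities):
    """Shannon entropy H = -sum(p_i * log2(p_i)); the log base is an assumption."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def emotion_polarity(features, score, weight):
    """Assumed form of POL: sum over the M emotion features of WW(f) * WS(f)."""
    return sum(weight(f) * score(f) for f in features)

# Hypothetical example data: occurrence probabilities of three public opinion
# features, and a toy emotion lexicon with scores (WS) and weights (WW).
probs = [0.5, 0.25, 0.25]
ws = {"great": 1.0, "bad": -1.0, "okay": 0.2}
ww = {"great": 0.5, "bad": 0.3, "okay": 0.2}

h = information_entropy(probs)
pol = emotion_polarity(ws.keys(), ws.get, ww.get)
print(h, pol)
```

A positive `pol` indicates that positively scored features dominate after weighting; whether the patent normalizes this sum is not stated in the text.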
In an alternative embodiment of the present application,
the determining the diffuseness of the online public opinion information according to the information entropy and the emotion polarity comprises the following steps:
the diffusivity is determined as shown in the following formula:
[Formula not reproduced in the source text.]
wherein the symbols of the formula, which are not reproduced in the source text, denote in order: the coefficient-of-variation-weighted information entropy of the emotion polarity values of the public opinion text data under the topic; the maximum value of the coefficient-of-variation-weighted information entropy for the topic; the adjustment remainder; and a preset parameter;
the coefficient of variation:
wherein the symbols, likewise not reproduced in the source text, denote in order: the emotion polarity value of the i-th public opinion text data; the distribution frequency of the emotion polarity value of the i-th public opinion text data; and the weighted average of the emotion polarity values of all public opinion text data under the topic.
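As an illustration of the coefficient of variation described above, the following Python sketch computes a frequency-weighted coefficient of variation from emotion polarity values and their distribution frequencies. It assumes the usual definition (weighted standard deviation divided by the weighted mean); the patent's exact formula, and the way the result enters the diffusion formula, are not reproduced in the text, so the function name `weighted_cv` and its form are assumptions.

```python
import math

def weighted_cv(values, freqs):
    """Frequency-weighted coefficient of variation (assumed definition).

    values: emotion polarity value of each public opinion text
    freqs:  distribution frequency of each polarity value (need not sum to 1)
    """
    total = sum(freqs)
    weights = [f / total for f in freqs]
    # Weighted average of the polarity values under the topic.
    mean = sum(w * x for w, x in zip(weights, values))
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    variance = sum(w * (x - mean) ** 2 for w, x in zip(weights, values))
    return math.sqrt(variance) / abs(mean)

# Hypothetical polarity values and frequencies for one topic.
cv = weighted_cv([0.2, 0.4, 0.6], [1, 2, 1])
print(cv)
```

A larger value means the polarity values under the topic are more spread out relative to their average.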
In a second aspect of an embodiment of the present application,
there is provided a public opinion analysis system based on a social network platform, comprising:
the first unit is used for determining the occurrence frequency and the spatial position distribution of each comment text in the network public opinion information based on the acquired network public opinion information, extracting the public opinion characteristics of the network public opinion information based on the occurrence frequency and the spatial position distribution of each comment text, and determining the public opinion theme corresponding to the network public opinion information through a combined clustering algorithm according to the public opinion characteristics;
the second unit is used for determining emotion attributes corresponding to the public opinion features through a pre-constructed emotion extraction model based on the public opinion features of the public opinion subject, wherein the emotion extraction model is formed by fusing a plurality of models and is used for extracting the emotion attributes corresponding to the public opinion features;
and the third unit is used for determining the information entropy and emotion polarity of the network public opinion information according to the public opinion characteristics and the emotion characteristics corresponding to the network public opinion information, and determining the diffuseness of the network public opinion information according to the information entropy and the emotion polarity.
In a third aspect of an embodiment of the present application,
there is provided an electronic device including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to invoke the instructions stored in the memory to perform the method described previously.
In a fourth aspect of an embodiment of the present application,
there is provided a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the method as described above.
The method provided by the application determines the occurrence frequency and the spatial position distribution of each comment text in the network public opinion information, and can reflect the characteristics of the network public opinion information to the greatest extent. And determining public opinion topics corresponding to the network public opinion information through a combined clustering algorithm according to the public opinion characteristics, wherein the first clustering identifies clusters through density, and can effectively process clusters with different densities; the second clustering can further combine the clusters with similar density on the basis of the first clustering to form a more complete clustering structure.
Through determining the emotion characteristics corresponding to the network public opinion information, the emotion polarity of the related evaluation information can be accurately divided, and the propagation degree and influence of the public opinion information on social media or a network platform can be reflected according to the degree of diffusion.
Drawings
FIG. 1 is a flow chart of a public opinion analysis method based on a social network platform according to an embodiment of the application;
FIG. 2 is a schematic structural diagram of a public opinion analysis system based on a social network platform according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The technical scheme of the application is described in detail below by specific examples. The following embodiments may be combined with each other, and some embodiments may not be repeated for the same or similar concepts or processes.
FIG. 1 is a flow chart of a public opinion analysis method based on a social network platform according to an embodiment of the present application. As shown in FIG. 1, the method includes:
S101, determining the occurrence frequency and the spatial position distribution of each comment text in the network public opinion information based on the acquired network public opinion information, and extracting the public opinion characteristics of the network public opinion information based on the occurrence frequency and the spatial position distribution of each comment text;
Network public opinion is a trend-guiding phenomenon of cognitive resonance or conflict among users, which arises when members of the public express, exchange and spread comments on social networks about trending events. The formation and diffusion of this dynamic and uncertain phenomenon directly reflect the public's pursuit of spiritual needs in the current era, and are easily influenced by subjective and objective factors such as people's culture, character, preferences and psychology. One-sided, biased or even negative opinion clusters are key factors that promote network rumors and negative public opinion and make the evolution of network public opinion uncontrollable, so public opinion analysis is necessary.
In practical applications, network public opinion is often noisy, so its features usually need to be extracted before further analysis. The prior art commonly extracts the public opinion features of network public opinion information with the TF-IDF (term frequency-inverse document frequency) algorithm, a widely used algorithm for weighting feature words whose core principle is to compute weights from word frequency. However, it over-emphasizes the frequency of segmented words and ignores the meaning that the same word carries at different positions, which is a major shortcoming of the traditional TF-IDF algorithm.
The application takes the spatial position distribution of each word into account and, by combining the position distribution of words, mitigates as far as possible the problem that the TF-IDF algorithm ignores word meaning. Specifically, the application determines the occurrence frequency and the spatial position distribution of each comment text in the network public opinion information, where the occurrence frequency of a comment text is the ratio of its number of occurrences in the public opinion information to the total number of comment texts, and its spatial position distribution is the distribution of the same comment text over different spatial positions.
When a comment text appears more frequently in the network public opinion information, and the positions at which it appears are more scattered and cover a wider range, the comment text is more representative of its type in the network public opinion information. Conversely, when the positions of the comment text in the sample space are relatively concentrated, that is, when it tends to appear in adjacent positions or to be repeated, its frequency may be high but its semantic effect tends to be uniform, so it adds volume rather than new information. The occurrence frequency and spatial position distribution of each comment text can therefore reflect the public opinion features of the network public opinion information to the greatest extent.
In an alternative embodiment of the present application,
the public opinion characteristics of the network public opinion information extracted based on the occurrence frequency and the spatial position distribution of each comment text are shown as the following formula:
[Formula not reproduced in the source text.]
wherein NP denotes the public opinion features corresponding to the network public opinion information, and D denotes the total amount of network public opinion information; the remaining terms, whose symbols are not reproduced in the source text, denote in order: the occurrence frequency of a comment text in any piece of acquired network public opinion information; the spatial position distribution of the p-th comment text in network public opinion information i; the distribution dispersion of network public opinion information i in the sample space; and the number of semantic categories present in the acquired network public opinion information.
The distribution dispersion in the sample space refers to how the network public opinion information is distributed within the data set or sample set. If the network public opinion information is scattered in the sample space, different types of public opinion information may lie in different regions or clusters, with a certain distance or difference between them. The semantic categories of the network public opinion information are the different semantic categories or topics into which the information can be divided. They reflect the different topics, views or emotional tendencies involved, and semantically classifying the network public opinion information makes it easier to understand its content and trend and to monitor, analyze and respond to it.
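The idea of combining occurrence frequency with position dispersion can be sketched as follows. This is only an illustration under stated assumptions: the patent's own formula is not reproduced in the text, so the dispersion measure (population standard deviation of relative positions) and the combination rule in `feature_weight` are hypothetical, as are all function names and the sample comments.

```python
from collections import defaultdict
from statistics import pstdev

def frequency_and_dispersion(documents):
    """Return {token: (frequency, dispersion)} for tokenised comment texts.

    frequency  = occurrences of the token / total number of tokens
    dispersion = population std. dev. of the token's relative positions in
                 [0, 1]; this dispersion measure is an assumption.
    """
    positions = defaultdict(list)
    total = 0
    for doc in documents:
        n = len(doc)
        for idx, token in enumerate(doc):
            # Relative position of the token inside its comment.
            rel = idx / (n - 1) if n > 1 else 0.0
            positions[token].append(rel)
            total += 1
    stats = {}
    for token, rels in positions.items():
        freq = len(rels) / total
        disp = pstdev(rels) if len(rels) > 1 else 0.0
        stats[token] = (freq, disp)
    return stats

def feature_weight(freq, disp):
    # Hypothetical combination: frequent AND widely spread tokens score higher.
    return freq * (1.0 + disp)

docs = [
    ["price", "rise", "again", "price"],
    ["price", "too", "high"],
    ["service", "good", "good", "good"],
]
stats = frequency_and_dispersion(docs)
weights = {tok: feature_weight(f, d) for tok, (f, d) in stats.items()}
print(weights["price"], weights["good"])
```

In this toy data "price" and "good" occur equally often, but "good" is repeated in adjacent positions, so its dispersion and therefore its weight are lower, which matches the intuition described above.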
In an optional implementation manner, the determining, according to the public opinion characteristics, the public opinion topic corresponding to the network public opinion information through a combined clustering algorithm includes:
assigning topics to the public opinion features according to a preset topic identification model; determining, based on a preset neighborhood radius and a minimum point threshold, the number of public opinion features within the neighborhood radius of each public opinion feature; if the number of public opinion features within the neighborhood radius is greater than or equal to the minimum point threshold, taking the public opinion feature as a core feature, and clustering the core feature together with all public opinion features within its neighborhood radius to determine a plurality of first clustering results;
and determining the Euclidean distance between the core features of adjacent clustering results among the plurality of first clustering results, merging the adjacent clustering results with the smallest Euclidean distance to obtain a second clustering result, and determining the public opinion topic of each public opinion feature in the second clustering result.
The method comprises the steps of carrying out clustering on public opinion characteristics twice to determine a public opinion theme corresponding to the network public opinion, wherein a clustering method based on density can be adopted for the first clustering, and a region with the density reaching a certain threshold is divided into a cluster; the second clustering may employ a tree-based clustering approach to merge similar clusters into one larger cluster.
The first clustering identifies clusters by density and can therefore handle clusters of different densities effectively; the second clustering further merges clusters of similar density on the basis of the first clustering to form a more complete cluster structure. Illustratively, the first clustering may use DBSCAN (Density-Based Spatial Clustering of Applications with Noise) and the second clustering may use hierarchical clustering. Together the two stages can handle clusters of different densities and scales: DBSCAN adaptively identifies clusters of different densities, while hierarchical clustering handles clusters of different scales by building a hierarchy.
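The two-stage clustering can be sketched in plain Python as follows. This is a simplified illustration, not the patented implementation: stage one is a small DBSCAN-style pass using a neighborhood radius `eps` and a minimum point threshold `min_pts`; stage two repeatedly merges the two clusters whose centroids are closest while that distance stays within `merge_dist`. Using centroids as a stand-in for "core features" and the `merge_dist` stopping rule are assumptions, and the sample points are hypothetical.

```python
import math

def region_query(points, i, eps):
    # Indices of all points within the neighborhood radius of point i.
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Stage 1: return one label per point (cluster id >= 0, or -1 for noise)."""
    labels = [None] * len(points)          # None means "not visited yet"
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = -1                 # provisional noise
            continue
        labels[i] = cluster                # point i is a core feature
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster        # noise reached from a core: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = region_query(points, j, eps)
            if len(nb) >= min_pts:         # j is also core: keep expanding
                queue.extend(nb)
        cluster += 1
    return labels

def centroid(pts):
    return tuple(sum(c) / len(pts) for c in zip(*pts))

def merge_nearest(points, labels, merge_dist):
    """Stage 2: merge the closest pair of clusters until none is near enough."""
    labels = list(labels)
    while True:
        ids = sorted({l for l in labels if l >= 0})
        cents = {c: centroid([p for p, l in zip(points, labels) if l == c])
                 for c in ids}
        pairs = [(math.dist(cents[a], cents[b]), a, b)
                 for k, a in enumerate(ids) for b in ids[k + 1:]]
        if not pairs:
            return labels
        d, a, b = min(pairs)
        if d > merge_dist:
            return labels
        labels = [a if l == b else l for l in labels]

pts = [(0, 0), (0, 1), (1, 0), (1, 1),
       (3, 0), (3, 1), (4, 0), (4, 1),
       (20, 20), (20, 21), (21, 20), (21, 21),
       (50, 50)]
first = dbscan(pts, eps=1.5, min_pts=3)
second = merge_nearest(pts, first, merge_dist=5.0)
print(first)
print(second)
```

In this toy data the first stage finds three dense groups and one noise point, and the second stage merges the two neighbouring groups while leaving the distant group and the noise point untouched.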
S102, determining emotion characteristics corresponding to the network public opinion information through a pre-constructed emotion extraction model based on the acquired network public opinion information, wherein the emotion extraction model is formed by fusing a plurality of models and is used for extracting the emotion characteristics corresponding to the network public opinion information;
Emotion classification is a subtask of text classification. It is a line of research that uses natural language understanding, text mining, computational linguistics and similar techniques to analyze people's subjective opinions, such as views, attitudes, evaluations and emotions, from written language. Accurately dividing the emotion polarity of the relevant evaluation information helps merchants quickly acquire, analyze and even anticipate users' attitudes, and plays an important role in data mining and text mining in the business field.
Illustratively, the emotion extraction model of the embodiment of the application comprises an input layer, a feature extraction layer and an output layer, wherein a first sub-model is arranged in the input layer and a second sub-model is arranged in the feature extraction layer; the first sub-model may include a BERT model, and the second sub-model may include a BiGRU model and a CNN model.
In an alternative embodiment of the present application,
based on the obtained network public opinion information, determining emotion characteristics corresponding to the network public opinion information through a pre-constructed emotion extraction model comprises the following steps:
pre-training the network public opinion information by using a first sub-model at the input layer, and mapping comment texts in the network public opinion information to obtain a matrix formed by real-valued vectors;
convolving and pooling the matrix by using a second sub-model at the feature extraction layer to obtain a first feature vector, wherein the first feature vector comprises global semantic information and local semantic information of the network public opinion information;
and determining the weight of the first feature vector by using a self-attention mechanism at the output layer, activating by using a sigmoid function based on the first feature vector and the corresponding weight, and determining the emotion feature corresponding to the network public opinion information.
Illustratively, the input layer may encode the input text with a pre-trained BERT model, map the words to real-valued vectors, and stack the vectors into a matrix that serves as input to the feature extraction layer. The feature extraction layer connects a BiGRU model and a CNN model in series: the BiGRU model captures semantic information by combining the forward and backward directions and measures the overall emotion polarity of the network public opinion information, while the CNN model captures local feature information of the text. Combining the two yields the feature information that contributes most to text emotion analysis. The finally extracted feature information is taken as input to the output layer, which passes it to a sigmoid function to obtain the emotion features.
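The output-layer step (weighting the extracted features with attention, then applying a sigmoid) can be illustrated with a small pure-Python sketch. A simplified scaled dot-product attention pooling with a single query vector stands in here for the self-attention mechanism; the names `attention_sigmoid`, `query`, `out_weights` and `bias` are hypothetical, and in a trained model these parameters would be learned rather than supplied by hand.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def attention_sigmoid(features, query, out_weights, bias=0.0):
    """features: T vectors of size d; query and out_weights: vectors of size d.

    Returns the attention weights over the T feature vectors and a score in (0, 1).
    """
    scale = math.sqrt(len(query))
    # One relevance score per feature vector (scaled dot product with the query).
    scores = [sum(f_k * q_k for f_k, q_k in zip(f, query)) / scale for f in features]
    weights = softmax(scores)
    # Weighted sum of the feature vectors.
    pooled = [sum(w * f[k] for w, f in zip(weights, features)) for k in range(len(query))]
    # Linear projection followed by the sigmoid activation.
    logit = sum(p * o for p, o in zip(pooled, out_weights)) + bias
    return weights, sigmoid(logit)
```

The sigmoid output can be read as a score between 0 and 1; how that score is mapped to an emotion feature is left open by this sketch.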
In an alternative embodiment of the present application,
the step of convolving and pooling the matrix by using a second sub-model at the feature extraction layer to obtain a first feature vector includes:
convolving the matrix through a BiGRU model of the second sub-model, and weighting bidirectional semantic information generated by forward propagation and backward propagation on a convolution result to obtain a second feature vector, wherein the second feature vector comprises global semantic information of the network public opinion information;
convolving the second feature vector through a CNN model of the second sub-model to obtain a third feature vector, wherein the third feature vector comprises local semantic information of the network public opinion information;
and splicing the pooled second feature vector and the pooled third feature vector at the feature extraction layer to obtain the first feature vector.
The reset gate of the BiGRU model applies a weighted linear transformation to the information from the previous moment and to the information from the current moment, passes the weighted sum to the update gate, and at the same time yields the reset-gated hidden state, which is selectively added to the current hidden state so that the state at the current moment is memorized. The update gate computes the hidden-state output at the current moment and controls how much of the current information is passed on to the next moment; the hidden-state output is jointly determined by the hidden state of the previous moment and that of the current moment.
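For reference, the gate behaviour described above corresponds to the widely used GRU formulation below. The patent text does not give these equations, so the symbols ($W$, $U$, $b$, $r_t$, $z_t$, $\tilde{h}_t$) are assumptions taken from the standard formulation, and sign conventions for the update gate differ between references. The last line shows one common way of combining the two directions (concatenation); the weighting of the forward and backward outputs mentioned in the embodiment may differ.

```latex
\begin{aligned}
r_t &= \sigma\left(W_r x_t + U_r h_{t-1} + b_r\right) && \text{(reset gate)} \\
z_t &= \sigma\left(W_z x_t + U_z h_{t-1} + b_z\right) && \text{(update gate)} \\
\tilde{h}_t &= \tanh\left(W_h x_t + U_h \left(r_t \odot h_{t-1}\right) + b_h\right) && \text{(candidate state)} \\
h_t &= \left(1 - z_t\right) \odot h_{t-1} + z_t \odot \tilde{h}_t && \text{(hidden state)} \\
h_t^{\mathrm{bi}} &= \left[\overrightarrow{h}_t \,;\, \overleftarrow{h}_t\right] && \text{(bidirectional output)}
\end{aligned}
```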
The convolution layer automatically extracts features from the word vector matrix using convolution kernels of different sizes, and the aggregated feature information is passed to the pooling layer. The pooling layer reduces the size of the matrix and thereby the dimensionality of the feature information, which reduces the number of parameters of the fully connected layer; max pooling is used, so that only the maximum value of each feature map computed by convolution is retained. The vectors passed from the BiGRU layer are concatenated, and convolution kernels are applied to the resulting sentence matrix to generate new features, namely the emotion features.
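The convolution and max-pooling step can be illustrated with a short pure-Python sketch. It assumes each kernel spans the full vector width and slides only along the sequence axis (the usual arrangement for text CNNs); the function names and the toy matrix are hypothetical.

```python
def conv1d_valid(matrix, kernel):
    """Slide a kernel (k rows x d cols) down a matrix (T rows x d cols).

    Returns a feature map of T - k + 1 scalars (no padding, stride 1).
    """
    k = len(kernel)
    feature_map = []
    for start in range(len(matrix) - k + 1):
        window = matrix[start:start + k]
        feature_map.append(
            sum(w * x for k_row, m_row in zip(kernel, window) for w, x in zip(k_row, m_row))
        )
    return feature_map

def max_over_time(feature_map):
    """Max pooling: keep only the strongest response of one kernel."""
    return max(feature_map)

def extract_features(matrix, kernels):
    """One pooled value per kernel; kernels may have different heights."""
    return [max_over_time(conv1d_valid(matrix, kernel)) for kernel in kernels]

# Toy example: a 4-token sentence with 2-dimensional vectors,
# one kernel of height 2 and one of height 3.
sentence = [[1, 0], [0, 1], [1, 1], [0, 0]]
kernels = [
    [[1, 0], [0, 1]],
    [[1, 1], [1, 1], [1, 1]],
]
pooled = extract_features(sentence, kernels)  # -> [2, 4]
```

Because each kernel contributes a single pooled value, the output length depends only on the number of kernels, not on the sentence length.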
S103, determining the information entropy and emotion polarity of the network public opinion information according to the public opinion characteristics, the public opinion topics and the emotion characteristics corresponding to the network public opinion information, and determining the diffuseness of the network public opinion information according to the information entropy and the emotion polarity.
Illustratively, the information entropy of the network public opinion information is a measure for measuring uncertainty or diversity of the information, and in public opinion analysis, the information entropy can reflect the diversity degree of the public opinion information; if the information entropy of the network public opinion information is higher, the topics, views or emotion tendencies related to the public opinion information are more various, and larger uncertainty exists, and conversely, if the information entropy of the network public opinion information is lower, the content of the public opinion information is more concentrated or single.
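As a minimal sketch of this measure, the snippet below computes the Shannon entropy of a list of public opinion features, assuming the standard form H = -Σ p_i log p_i with probabilities estimated from relative frequencies. The function name, the base-2 logarithm and the example labels are assumptions for illustration.

```python
import math
from collections import Counter

def information_entropy(feature_labels, base=2):
    """Shannon entropy of the empirical distribution of the given labels."""
    counts = Counter(feature_labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total, base) for c in counts.values())

# Four equally frequent topics give high entropy; a single topic gives zero.
diverse = information_entropy(["price", "service", "quality", "delivery"])
concentrated = information_entropy(["price", "price", "price", "price"])
```

A higher value indicates more varied content and a lower value indicates more concentrated content, matching the interpretation given above.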
The emotion polarity is an index for describing the tendency of emotion expressed in public opinion information, and can be classified into positive, negative and neutral, and the emotion tendency in the public opinion information can be determined by carrying out emotion analysis on the network public opinion information, so that whether the public opinion is positive, negative or neutral is judged.
The diffuseness refers to the propagation range or propagation speed of the network public opinion information on the network. The diffuseness can be measured by different indexes, such as forwarding quantity, discussion heat, influence range and the like, and the diffuseness can reflect the spreading degree and influence of public opinion information on social media or a network platform; if the diffuseness of the network public opinion information is high, the information is widely spread on the network and great attention and discussion are drawn.
In an alternative embodiment of the present application,
the determining the information entropy and emotion polarity of the network public opinion information according to the public opinion characteristics and the emotion characteristics corresponding to the network public opinion information, and determining the diffuseness of the network public opinion information according to the information entropy and the emotion polarity comprises:
determining the information entropy and the emotion polarity according to a method shown in the following formula:
$$H(NP) = -\sum_{i=1}^{N} p_i \log p_i$$
wherein $H(NP)$ represents the information entropy corresponding to the public opinion features, $N$ represents the number of public opinion features, and $p_i$ represents the probability of occurrence of the $i$-th public opinion feature.
Through analyzing the network public opinion information, the public opinion characteristics such as topic, viewpoint or emotion tendency diversity can be extracted, and further, the uncertainty and diversity of the public opinion information can be measured by calculating the information entropy according to the public opinion characteristics, the higher the information entropy is, the more various the content of the public opinion information is, and otherwise, the more concentrated or single content is represented.
wherein POL represents the emotion polarity, M represents the number of emotion features, WS() represents the emotion score function, and WW() represents the emotion feature weight distribution function.
The emotion score function can determine emotion scores corresponding to emotion features based on the emotion dictionary, and for each emotion feature, the emotion score corresponding to each emotion feature can be inquired in the emotion dictionary; the emotion feature weight distribution function is used for adjusting the contribution degree of emotion features to emotion polarity.
According to the emotion characteristics in the network public opinion information, emotion analysis can be carried out, each piece of public opinion information is judged to be positive, negative or neutral emotion polarity, emotion tendency of the public opinion information can be known through emotion polarity, and whether the information is positive, negative or neutral is judged.
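A hedged sketch of how a lexicon-based score function and a weight function might be combined is given below. The exact formula is not reproduced in this text, so the weighted-sum form, the names `emotion_polarity` and `polarity_label`, the neutral band of 0.1 and the toy lexicon are all assumptions.

```python
def emotion_polarity(features, score_lexicon, weight_fn):
    """Weighted sum of lexicon scores over the emotion features.

    score_lexicon plays the role of the emotion score function (dictionary lookup);
    weight_fn plays the role of the weight distribution function.
    Features missing from the lexicon are scored 0.0.
    """
    return sum(score_lexicon.get(f, 0.0) * weight_fn(f) for f in features)

def polarity_label(pol, neutral_band=0.1):
    """Map a polarity value to positive, negative or neutral."""
    if pol > neutral_band:
        return "positive"
    if pol < -neutral_band:
        return "negative"
    return "neutral"

# Hypothetical lexicon and a uniform weight of 0.5 per feature.
lexicon = {"great": 1.0, "bad": -1.0, "okay": 0.0}
uniform = lambda feature: 0.5
pol = emotion_polarity(["great", "great", "bad"], lexicon, uniform)  # 0.5
label = polarity_label(pol)  # "positive"
```

A non-uniform `weight_fn` would let some emotion features contribute more than others, which is the role the text assigns to the weight distribution function.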
In an alternative embodiment of the present application,
the determining the diffuseness of the online public opinion information according to the information entropy and the emotion polarity comprises the following steps:
the diffusivity is determined as shown in the following formula:
wherein the terms of the formula represent, in order: the coefficient-of-variation-weighted information entropy of the emotion polarity values of the public opinion text data of the topic; the maximum value of the coefficient-of-variation-weighted information entropy corresponding to the topic; the adjustment remainder; and a preset parameter;
the coefficient of variation:
wherein the terms of the formula represent, in order: the emotion polarity value of the i-th public opinion text data; the distribution frequency of the emotion polarity value of the i-th public opinion text data; and the weighted average of the emotion polarity values of all public opinion text data under the topic.
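From the three quantities defined above (per-text polarity value, its distribution frequency, and the weighted average), a frequency-weighted coefficient of variation can be sketched as the weighted standard deviation divided by the weighted mean. This is the textbook definition and is only assumed to match the formula omitted from this text; the function name and the zero-mean handling are illustrative choices.

```python
import math

def weighted_cv(values, freqs):
    """Frequency-weighted coefficient of variation: weighted std / |weighted mean|."""
    total = sum(freqs)
    mean = sum(x * f for x, f in zip(values, freqs)) / total
    variance = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / total
    if mean == 0:
        # Polarity values can be signed, so a zero mean is possible and must be handled.
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return math.sqrt(variance) / abs(mean)

cv_spread = weighted_cv([1.0, 3.0], [1, 1])            # mean 2, std 1 -> 0.5
cv_uniform = weighted_cv([2.0, 2.0, 2.0], [1, 2, 3])   # identical values -> 0.0
```

A larger value indicates that the polarity values under a topic are more dispersed relative to their average.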
In a second aspect of an embodiment of the present application,
fig. 2 is a schematic structural diagram of a social network platform-based public opinion analysis system according to an embodiment of the present application, including:
the first unit is used for determining the occurrence frequency and the spatial position distribution of each comment text in the network public opinion information based on the acquired network public opinion information, extracting the public opinion characteristics of the network public opinion information based on the occurrence frequency and the spatial position distribution of each comment text, and determining the public opinion theme corresponding to the network public opinion information through a combined clustering algorithm according to the public opinion characteristics;
the second unit is used for determining emotion attributes corresponding to the public opinion features through a pre-constructed emotion extraction model based on the public opinion features of the public opinion subject, wherein the emotion extraction model is formed by fusing a plurality of models and is used for extracting the emotion attributes corresponding to the public opinion features;
and the third unit is used for determining the information entropy and emotion polarity of the network public opinion information according to the public opinion characteristics and the emotion characteristics corresponding to the network public opinion information, and determining the diffuseness of the network public opinion information according to the information entropy and the emotion polarity.
In a third aspect of an embodiment of the present application,
there is provided an electronic device including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to invoke the instructions stored in the memory to perform the method described previously.
In a fourth aspect of an embodiment of the present application,
there is provided a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the method as described above.
The present application may be a method, apparatus, system, and/or computer program product. The computer program product may include a computer readable storage medium having computer readable program instructions embodied thereon for performing various aspects of the present application.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the application.

Claims (6)

1. The public opinion analysis method based on the social network platform is characterized by comprising the following steps of:
determining the occurrence frequency and the spatial position distribution of each comment text in the network public opinion information based on the acquired network public opinion information, extracting the public opinion characteristics of the network public opinion information based on the occurrence frequency and the spatial position distribution of each comment text, and determining public opinion topics corresponding to the network public opinion information through a combined clustering algorithm according to the public opinion characteristics;
based on each public opinion feature of the public opinion theme, determining emotion attributes corresponding to each public opinion feature through a pre-constructed emotion extraction model, wherein the emotion extraction model is formed by fusing a plurality of models and is used for extracting emotion attributes corresponding to the public opinion feature;
according to the public opinion characteristics and the emotion characteristics corresponding to the network public opinion information, determining the information entropy and the emotion polarity of the network public opinion information, and determining the diffuseness of the network public opinion information according to the information entropy and the emotion polarity;
the public opinion characteristics of the network public opinion information extracted based on the occurrence frequency and the spatial position distribution of each comment text are shown as the following formula:
wherein NP represents the public opinion features corresponding to the network public opinion information, and the remaining terms of the formula represent, in order: the frequency of occurrence of a comment text in any piece of acquired network public opinion information; D, the total amount of network public opinion information; the spatial position distribution of the p-th comment text in network public opinion information i; the distribution dispersion of network public opinion information i in the sample space; and the number of semantic categories present in the acquired network public opinion information;
the determining the public opinion theme corresponding to the network public opinion information through a combined clustering algorithm according to the public opinion characteristics comprises the following steps:
assigning topics to the public opinion features according to a preset topic identification model; determining, based on a preset neighborhood radius and a minimum point threshold, the number of public opinion features within the neighborhood radius of each public opinion feature; if the number of public opinion features within the neighborhood radius is greater than or equal to the minimum point threshold, taking the public opinion feature as a core feature, clustering the core feature together with all public opinion features within the neighborhood radius of the core feature, and determining a plurality of first clustering results;
respectively determining Euclidean distances of core features of adjacent clustering results in the plurality of first clustering results, merging adjacent clustering results with nearest Euclidean distances to obtain a second clustering result, and determining public opinion topics of each public opinion feature in the second clustering result;
the emotion extraction model comprises an input layer, a characteristic extraction layer and an output layer, wherein a first sub-model is arranged in the input layer, a second sub-model is arranged in the characteristic extraction layer,
based on each public opinion feature of the public opinion topic, determining emotion attributes corresponding to each public opinion feature through a pre-constructed emotion extraction model comprises the following steps:
mapping each public opinion characteristic of the public opinion theme to obtain a matrix formed by real value vectors;
convolving and pooling the matrix by using a second sub-model at the feature extraction layer to obtain a first feature vector, wherein the first feature vector comprises global semantic information and local semantic information of the public opinion features;
determining the weight of the first feature vector by using a self-attention mechanism at the output layer, activating by using a sigmoid function based on the first feature vector and the corresponding weight, and determining the emotion attribute corresponding to each public opinion feature;
the second sub-model includes a BiGRU model and a CNN model,
the step of convolving and pooling the matrix by using a second sub-model at the feature extraction layer to obtain a first feature vector includes:
convolving the matrix through a BiGRU model of the second sub-model, and weighting bidirectional semantic information generated by forward propagation and backward propagation on a convolution result to obtain a second feature vector, wherein the second feature vector comprises global semantic information of the public opinion features;
convolving the second feature vector through a CNN model of the second sub-model to obtain a third feature vector, wherein the third feature vector comprises local semantic information of the public opinion feature;
and splicing the pooled second feature vector and the pooled third feature vector at the feature extraction layer to obtain the first feature vector.
2. The method of claim 1, wherein the determining information entropy and emotion polarity of the online public opinion information according to the public opinion characteristics and the emotion characteristics corresponding to the online public opinion information, and determining diffuseness of the online public opinion information according to the information entropy and the emotion polarity comprises:
determining the information entropy and the emotion polarity according to a method shown in the following formula:
$$H(NP) = -\sum_{i=1}^{N} p_i \log p_i$$
wherein $H(NP)$ represents the information entropy corresponding to the public opinion features, $N$ represents the number of public opinion features, and $p_i$ represents the probability of occurrence of the $i$-th public opinion feature;
wherein POL represents the emotion polarity, M represents the number of emotion features, WS() represents the emotion score function, and WW() represents the emotion feature weight distribution function.
3. The method of claim 2, wherein the determining the diffuseness of the network public opinion information according to the information entropy and the emotion polarity comprises:
the diffusivity is determined as shown in the following formula:
wherein the terms of the formula represent, in order: the coefficient-of-variation-weighted information entropy of the emotion polarity values of the public opinion text data of the topic; the maximum value of the coefficient-of-variation-weighted information entropy corresponding to the topic; the adjustment remainder; and a preset parameter;
the coefficient of variation:
wherein the terms of the formula represent, in order: the emotion polarity value of the i-th public opinion text data; the distribution frequency of the emotion polarity value of the i-th public opinion text data; and the weighted average of the emotion polarity values of all public opinion text data under the topic.
4. The public opinion analysis system based on the social network platform is characterized by comprising:
the first unit is used for determining the occurrence frequency and the spatial position distribution of each comment text in the network public opinion information based on the acquired network public opinion information, extracting the public opinion characteristics of the network public opinion information based on the occurrence frequency and the spatial position distribution of each comment text, and determining the public opinion theme corresponding to the network public opinion information through a combined clustering algorithm according to the public opinion characteristics;
the second unit is used for determining emotion attributes corresponding to the public opinion features through a pre-constructed emotion extraction model based on the public opinion features of the public opinion subject, wherein the emotion extraction model is formed by fusing a plurality of models and is used for extracting the emotion attributes corresponding to the public opinion features;
the third unit is used for determining the information entropy and emotion polarity of the network public opinion information according to the public opinion characteristics and emotion characteristics corresponding to the network public opinion information, and determining the diffuseness of the network public opinion information according to the information entropy and emotion polarity;
the public opinion characteristics of the network public opinion information extracted based on the occurrence frequency and the spatial position distribution of each comment text are shown as the following formula:
wherein NP represents the public opinion features corresponding to the network public opinion information, and the remaining terms of the formula represent, in order: the frequency of occurrence of a comment text in any piece of acquired network public opinion information; D, the total amount of network public opinion information; the spatial position distribution of the p-th comment text in network public opinion information i; the distribution dispersion of network public opinion information i in the sample space; and the number of semantic categories present in the acquired network public opinion information;
the determining the public opinion theme corresponding to the network public opinion information through a combined clustering algorithm according to the public opinion characteristics comprises the following steps:
assigning topics to the public opinion features according to a preset topic identification model; determining, based on a preset neighborhood radius and a minimum point threshold, the number of public opinion features within the neighborhood radius of each public opinion feature; if the number of public opinion features within the neighborhood radius is greater than or equal to the minimum point threshold, taking the public opinion feature as a core feature, clustering the core feature together with all public opinion features within the neighborhood radius of the core feature, and determining a plurality of first clustering results;
respectively determining Euclidean distances of core features of adjacent clustering results in the plurality of first clustering results, merging adjacent clustering results with nearest Euclidean distances to obtain a second clustering result, and determining public opinion topics of each public opinion feature in the second clustering result;
the emotion extraction model comprises an input layer, a characteristic extraction layer and an output layer, wherein a first sub-model is arranged in the input layer, a second sub-model is arranged in the characteristic extraction layer,
based on each public opinion feature of the public opinion topic, determining emotion attributes corresponding to each public opinion feature through a pre-constructed emotion extraction model comprises the following steps:
mapping each public opinion characteristic of the public opinion theme to obtain a matrix formed by real value vectors;
convolving and pooling the matrix by using a second sub-model at the feature extraction layer to obtain a first feature vector, wherein the first feature vector comprises global semantic information and local semantic information of the public opinion features;
determining the weight of the first feature vector by using a self-attention mechanism at the output layer, activating by using a sigmoid function based on the first feature vector and the corresponding weight, and determining the emotion attribute corresponding to each public opinion feature;
the second sub-model includes a BiGRU model and a CNN model,
the step of convolving and pooling the matrix by using a second sub-model at the feature extraction layer to obtain a first feature vector includes:
convolving the matrix through a BiGRU model of the second sub-model, and weighting bidirectional semantic information generated by forward propagation and backward propagation on a convolution result to obtain a second feature vector, wherein the second feature vector comprises global semantic information of the public opinion features;
convolving the second feature vector through a CNN model of the second sub-model to obtain a third feature vector, wherein the third feature vector comprises local semantic information of the public opinion feature;
and splicing the pooled second feature vector and the pooled third feature vector at the feature extraction layer to obtain the first feature vector.
5. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to invoke the instructions stored in the memory to perform the method of any of claims 1 to 3.
6. A computer readable storage medium having stored thereon computer program instructions, which when executed by a processor, implement the method of any of claims 1 to 3.
CN202310780017.2A 2023-06-29 2023-06-29 Public opinion analysis method and system based on social network platform Active CN116522013B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310780017.2A CN116522013B (en) 2023-06-29 2023-06-29 Public opinion analysis method and system based on social network platform


Publications (2)

Publication Number Publication Date
CN116522013A CN116522013A (en) 2023-08-01
CN116522013B true CN116522013B (en) 2023-09-05

Family

ID=87394457

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310780017.2A Active CN116522013B (en) 2023-06-29 2023-06-29 Public opinion analysis method and system based on social network platform

Country Status (1)

Country Link
CN (1) CN116522013B (en)



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant