CN116340511B - Public opinion analysis method combining deep learning and language logic reasoning - Google Patents

Info

Publication number
CN116340511B
Authority
CN
China
Prior art keywords
data
text data
word
public opinion
topic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310165134.8A
Other languages
Chinese (zh)
Other versions
CN116340511A (en)
Inventor
肖林
黄国柱
杨洲杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Shenyi Technology Co ltd
Original Assignee
Shenzhen Shenyi Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Shenyi Technology Co ltd
Priority to CN202310165134.8A
Publication of CN116340511A
Application granted
Publication of CN116340511B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/353 Clustering; Classification into predefined classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3346 Query execution using probabilistic model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The application provides a public opinion analysis method combining deep learning and language logic reasoning, which comprises the following steps: topic data are obtained, identified and subjected to format conversion, and text data are extracted from the topic data; performing text classification and word vector modeling on the text data to extract first related information of the text data; carrying out structural analysis on the first related information to obtain first relation data of each subject term; determining a first keyword set consisting of a plurality of first keywords and first attribute data of each first keyword according to the first relation data; carrying out emotion classification on the first keyword set to obtain first emotion classification data and analyzing the first emotion classification data to obtain a first public opinion analysis result; carrying out validity verification on the first public opinion analysis result to obtain a first verification result; and correcting the first public opinion analysis result according to the first verification result. According to the scheme provided by the application, the public opinion analysis can be accurately performed by utilizing the deep learning technology and natural language logic reasoning.

Description

Public opinion analysis method combining deep learning and language logic reasoning
Technical Field
The application relates to the technical field of industrial control, in particular to a public opinion analysis method combining deep learning and language logic reasoning.
Background
With the rapid development of network technology, the internet has become an important platform for the public to obtain information and express views. Network public opinion is the body of opinions and remarks, carrying a certain influence and tendency, that the public expresses about hot-spot issues spreading on the internet. It reflects the state of society: effective public opinion monitoring and analysis helps to identify hot topics, quickly understand how netizens' emotions are developing and clarify the current state of public opinion, and it also helps to guide the trend of public opinion and avoid public opinion crises. Descriptions of public opinion events come mainly from news texts on network media and from social platforms such as Sina Weibo, where people inform others directly, or learn about an event indirectly from others, through reading, forwarding, commenting and the like. There is therefore a need for a public opinion system that can extract features from this event information and accurately analyze the current state and propagation trend of public opinion.
Disclosure of Invention
Based on the problems, the application provides a public opinion analysis method combining deep learning and language logic reasoning.
In view of the above, an aspect of the present application provides a public opinion analysis method combining deep learning and language logic reasoning, including:
acquiring topic data related to a specific topic according to a preset trigger rule;
identifying the topic data, converting the format of the topic data, and extracting text data from the topic data;
performing text classification and word vector modeling on the text data by using a pre-trained first neural network to extract first related information of the text data;
carrying out structural analysis on the first related information by using a preset natural language logic reasoning model to obtain first relation data of each subject term;
processing the first relation data by using a pre-trained keyword determination model, so as to determine a first keyword set consisting of a plurality of first keywords and first attribute data of each first keyword in the plurality of first keywords from the subject words;
carrying out emotion classification on the first keyword set by using the trained emotion analysis model to obtain first emotion classification data;
analyzing the first emotion classification data to obtain a first public opinion analysis result;
performing validity verification on the first public opinion analysis result by using a clustering analysis, statistical analysis and accuracy test method to obtain a first verification result;
and correspondingly correcting the first public opinion analysis result according to the first verification result.
Optionally, the pre-trained first neural network is obtained by training a deep neural network on a corpus using machine learning techniques, and is used to perform text classification on the text data so that first related information associated with different public opinion categories can be extracted.
Optionally, the step of performing structural analysis on the first related information by using a preset natural language logic inference model to obtain first relationship data of each subject term includes:
and the preset natural language logic reasoning model identifies each subject word in the first related information by utilizing a natural language processing technology so as to carry out statistical analysis on the topic data, thereby obtaining an accurate public opinion analysis conclusion.
Optionally, the step of acquiring topic data related to a specific topic according to a preset trigger rule includes:
extracting association data of the specific topic from the preset trigger rule and extracting related words from the association data;
performing semantic similarity analysis based on a word vector technology to obtain derivative related words similar to the word vectors of the related words;
and acquiring related texts, audios, images and videos according to the related words and the derived related words to serve as the topic data.
Optionally, the step of extracting text data from the topic data after identifying and format converting includes:
recognizing first voice data and first tone data in the audio, and obtaining audio description text data through a voice recognition algorithm and a semantic recognition algorithm;
recognizing first text data, first facial expression data and first expression symbol data in the image, and combining an expression recognition algorithm to obtain image description text data;
identifying second voice data, second text data, second facial expression data and second expression symbol data in the video, and combining a voice recognition algorithm, a semantic recognition algorithm and an expression recognition algorithm to obtain video description text data;
converting the text, the audio description text data, the image description text data and the video description text data into a unified standardized format to obtain initial text data;
extracting the text data from the initial text data.
Optionally, the step of converting the text, the audio description text data, the image description text data and the video description text data into a unified standardized format to obtain initial text data includes:
performing word segmentation, expression symbol recognition, meaningless-symbol removal and stop-word removal on the text, the audio description text data, the image description text data and the video description text data by using a word segmentation model, an expression symbol recognition model and a stop word recognition model to obtain text data to be processed;
and carrying out standardization processing on the text data to be processed to obtain the initial text data.
Optionally, after the step of acquiring topic data related to the specific topic according to the preset triggering rule, the method further includes:
and acquiring a network address, a user account and user identity characteristic information corresponding to the topic data to generate a unique source identifier corresponding to the topic data.
Optionally, the step of normalizing the text data to be processed to obtain the initial text data includes:
grouping the text data to be processed according to the source identifier to obtain a plurality of grouped text data subgroups;
classifying the text data subgroups along the dimensions of original generation time, language, region and source person information to obtain a plurality of text data groups;
and carrying out standardization processing on the plurality of text data groups to obtain the initial text data.
Optionally, the step of normalizing the text data to be processed to obtain the initial text data includes:
for any one first text data subgroup in the plurality of text data subgroups, taking a first single word after word segmentation of the first text data subgroup as a reference word;
the description structure of each individual word after word segmentation is established, specifically:
creating a description structure file;
acquiring, for each individual word, its start word, intermediate word and end word, the separation distance between the end word and the reference word, and its number of occurrences, and recording the start word, the intermediate word and the end word in the description structure file;
repeating the steps until all the text data subgroups are iterated.
Optionally, the step of normalizing the text data to be processed to obtain the initial text data includes:
and for each first text data group, carrying out statistical analysis on all the individual words according to the number of occurrences and the separation distance, and constructing feature structure data of the first text data group in the form 'individual word, number of occurrences, separation distance'.
By adopting the technical scheme of the application, the public opinion analysis method combining deep learning and language logic reasoning comprises the following steps: acquiring topic data related to a specific topic according to a preset trigger rule; identifying the topic data, converting the format of the topic data, and extracting text data from the topic data; performing text classification and word vector modeling on the text data by using a pre-trained first neural network to extract first related information of the text data; carrying out structural analysis on the first related information by using a preset natural language logic reasoning model to obtain first relation data of each subject term; processing the first relation data by using a pre-trained keyword determination model, so as to determine a first keyword set consisting of a plurality of first keywords and first attribute data of each first keyword in the plurality of first keywords from the subject words; carrying out emotion classification on the first keyword set by using the trained emotion analysis model to obtain first emotion classification data; analyzing the first emotion classification data to obtain a first public opinion analysis result; performing validity verification on the first public opinion analysis result by using a clustering analysis, statistical analysis and accuracy test method to obtain a first verification result; and correspondingly correcting the first public opinion analysis result according to the first verification result. According to the scheme provided by the application, the public opinion analysis can be accurately performed by utilizing the deep learning technology and natural language logic reasoning.
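The order of the nine steps listed above can be sketched as a simple chain of functions. This is only an illustrative skeleton under stated assumptions: the names `PipelineState`, `make_step`, `STEP_NAMES` and `run_pipeline` are hypothetical, and every step is a placeholder that records its own name instead of calling a real model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PipelineState:
    """Carries intermediate results between the steps of the method."""
    topic: str
    data: dict = field(default_factory=dict)
    trace: List[str] = field(default_factory=list)


Step = Callable[[PipelineState], PipelineState]


def make_step(name: str) -> Step:
    """Build a placeholder step that only records that it ran (assumption)."""
    def step(state: PipelineState) -> PipelineState:
        state.trace.append(name)
        return state
    return step


# Hypothetical step names, in the order given by the method.
STEP_NAMES = [
    "acquire_topic_data",
    "extract_text_data",
    "classify_and_embed",
    "logic_reasoning_structure_analysis",
    "determine_keywords",
    "classify_emotion",
    "analyze_public_opinion",
    "verify_result",
    "correct_result",
]


def run_pipeline(topic: str, steps: List[Step]) -> PipelineState:
    """Apply each step in order, passing the shared state along."""
    state = PipelineState(topic=topic)
    for step in steps:
        state = step(state)
    return state


result = run_pipeline("example topic", [make_step(n) for n in STEP_NAMES])
```

In a real system each placeholder would be replaced by the corresponding trained model or rule; the skeleton only fixes the data flow.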
Drawings
FIG. 1 is a flow chart of a method for public opinion analysis combining deep learning and language logic reasoning according to an embodiment of the present application.
Detailed Description
In order that the above-recited objects, features and advantages of the present application will be more clearly understood, a more particular description of the application will be rendered by reference to the appended drawings and appended detailed description. It should be noted that, without conflict, the embodiments of the present application and features in the embodiments may be combined with each other.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application, however, the present application may be practiced otherwise than as described herein, and therefore the scope of the present application is not limited to the specific embodiments disclosed below.
The terms first, second and the like in the description and in the claims and in the above-described figures are used for distinguishing between different objects and not necessarily for describing a sequential or chronological order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
A public opinion analysis method combining deep learning and language logic reasoning provided according to some embodiments of the present application is described below with reference to fig. 1.
As shown in FIG. 1, one embodiment of the present application provides a public opinion analysis method combining deep learning and language logic reasoning, comprising:
acquiring topic data related to a specific topic according to a preset trigger rule;
identifying the topic data, converting the format of the topic data, and extracting text data from the topic data;
performing text classification and word vector modeling on the text data by using a pre-trained first neural network to extract first related information of the text data;
In this step, the pre-trained first neural network is obtained by training a deep neural network on a corpus using machine learning techniques, and is used to perform text classification on the text data so that first related information associated with different public opinion categories can be extracted.
carrying out structural analysis on the first related information by using a preset natural language logic reasoning model to obtain first relation data of each subject term;
In this step, performing the structural analysis with the preset natural language logic reasoning model to obtain the first relation data of each subject term helps ensure the accuracy of the data.
In some possible embodiments of the present application, the step of performing structural analysis on the first related information by using a preset natural language logical inference model to obtain first relationship data of each subject term includes:
and the preset natural language logic reasoning model identifies each subject word in the first related information by utilizing a natural language processing technology so as to carry out statistical analysis on the topic data, thereby obtaining an accurate public opinion analysis conclusion.
Processing the first relation data by using a pre-trained keyword determination model, so as to determine a first keyword set consisting of a plurality of first keywords and first attribute data of each first keyword in the plurality of first keywords from the subject words;
in this step, the first relationship data is processed by using a keyword determination model trained in advance in conjunction with a neural network, so that a first keyword set composed of a plurality of first keywords and first attribute data of each of the plurality of first keywords is determined from the respective subject words.
Carrying out emotion classification on the first keyword set by using the trained emotion analysis model to obtain first emotion classification data;
In this step, emotion classification is performed on the first keyword set by using the trained emotion analysis model to obtain first emotion classification data, which can improve efficiency and accuracy. The emotion analysis model can be obtained by training as follows: constructing a first emotion dictionary (generally comprising general emotion words, degree adverbs, negation words, domain words and the like); calculating the semantic similarity between words and a reference emotion word set by using a semantic similarity calculation method; performing emotion classification with a weighting algorithm, using the first emotion dictionary and an analysis of the special structures and emotion-tendency words of the text sentences; assigning different weights to the emotion words according to emotion intensity to obtain a second emotion dictionary; dividing the second emotion dictionary into a training set and a test set; extracting text emotion features from the training set by using a neural network, and constructing a basic emotion analysis model; and testing the basic emotion analysis model by using the test set, and correcting the basic emotion analysis model according to the test result to obtain the emotion analysis model.
Analyzing the first emotion classification data to obtain a first public opinion analysis result;
performing validity verification on the first public opinion analysis result by using a clustering analysis, statistical analysis and accuracy test method to obtain a first verification result;
and correspondingly correcting the first public opinion analysis result according to the first verification result.
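The dictionary-and-weight part of the emotion classification described above can be illustrated with a minimal sketch. It covers only the lexicon stage (emotion words, degree adverbs, negation words), not the neural network; the tiny English lexicon, the weights and the names `score_tokens` and `classify` are assumptions made for illustration.

```python
# Hypothetical toy lexicon; a real first/second emotion dictionary would be far larger.
EMOTION_WORDS = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}
DEGREE_ADVERBS = {"very": 1.5, "slightly": 0.5}
NEGATION_WORDS = {"not", "never"}


def score_tokens(tokens):
    """Sum weighted emotion values; adverbs scale and negations flip the next emotion word."""
    score = 0.0
    weight = 1.0
    negate = False
    for tok in tokens:
        if tok in NEGATION_WORDS:
            negate = not negate
        elif tok in DEGREE_ADVERBS:
            weight *= DEGREE_ADVERBS[tok]
        elif tok in EMOTION_WORDS:
            value = EMOTION_WORDS[tok] * weight
            score += -value if negate else value
            # Modifiers apply only to the emotion word that follows them.
            weight, negate = 1.0, False
    return score


def classify(tokens, threshold=0.0):
    """Map the numeric score to a coarse emotion class."""
    s = score_tokens(tokens)
    if s > threshold:
        return "positive"
    if s < -threshold:
        return "negative"
    return "neutral"
```

For example, `classify(["not", "very", "bad"])` flips and scales the value of "bad", giving a positive class under these toy weights.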
It can be understood that, in the embodiment of the present application, topic data related to a specific topic may be acquired according to the preset trigger rule by using crawler technology to collect data related to the specific topic from internet platforms. Specifically, association data of the specific topic can be extracted from the preset trigger rule, and related words can be extracted from the association data; semantic similarity analysis is performed based on word vector technology to obtain derived related words whose word vectors are similar to those of the related words; and related texts, audios, images and videos are acquired according to the related words and the derived related words to serve as the topic data. For example, hot search topics containing the related words and the derived related words are searched on internet platforms such as the Douyin hot search list, the Baidu hot list, the Weibo hot search list and the Toutiao hot search list; under each topic, the items with the highest popularity (such as Douyin videos and audio, Weibo posts, or Toutiao articles and pictures) are selected, and the corresponding comment data are obtained. In one embodiment of the application, the acquired content mainly includes: web links to the articles, videos or pictures of public opinion news, posting accounts, publisher information, source websites, titles, body texts, posting times, forwarding counts, comment counts, like counts, etc. In order to improve accuracy and timeliness, in the embodiment of the present application, the preset trigger rule may be set to trigger once every preset time period.
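The derivation of related words by word-vector similarity can be sketched with plain cosine similarity. The three-dimensional vectors, the threshold of 0.95 and the function names below are assumptions for illustration; a real system would load vectors from a trained embedding model.

```python
import math

# Hypothetical toy word vectors; real vectors would come from a trained model.
WORD_VECTORS = {
    "flood": [0.9, 0.1, 0.0],
    "rainstorm": [0.8, 0.2, 0.1],
    "waterlogging": [0.85, 0.15, 0.05],
    "concert": [0.0, 0.1, 0.9],
}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def derive_related_words(related_words, vectors, threshold=0.95):
    """Return vocabulary words whose vectors are close to any given related word."""
    derived = set()
    for word in related_words:
        if word not in vectors:
            continue
        for candidate, vec in vectors.items():
            if candidate in related_words:
                continue
            if cosine(vectors[word], vec) >= threshold:
                derived.add(candidate)
    return sorted(derived)
```

With these toy vectors, `derive_related_words(["flood"], WORD_VECTORS)` keeps "rainstorm" and "waterlogging" and rejects "concert"; the related words plus the derived ones would then drive the crawler queries.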
By adopting the technical scheme of the embodiment, topic data related to a specific topic is obtained according to a preset trigger rule; identifying the topic data, converting the format of the topic data, and extracting text data from the topic data; performing text classification and word vector modeling on the text data by using a pre-trained first neural network to extract first related information of the text data; carrying out structural analysis on the first related information by using a preset natural language logic reasoning model to obtain first relation data of each subject term; processing the first relation data by using a pre-trained keyword determination model, so as to determine a first keyword set consisting of a plurality of first keywords and first attribute data of each first keyword in the plurality of first keywords from the subject words; carrying out emotion classification on the first keyword set by using the trained emotion analysis model to obtain first emotion classification data; analyzing the first emotion classification data to obtain a first public opinion analysis result; performing validity verification on the first public opinion analysis result by using a clustering analysis, statistical analysis and accuracy test method to obtain a first verification result; and correspondingly correcting the first public opinion analysis result according to the first verification result. According to the scheme provided by the application, the public opinion analysis can be accurately performed by utilizing the deep learning technology and natural language logic reasoning.
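The final verification and correction steps are not detailed in the text. As a rough sketch of the accuracy-test part only (clustering and statistical analysis are left out), one could compare predicted emotion labels against a manually labeled spot-check sample and overwrite the checked items. The spot-check design and the names `verify`, `correct` and `distribution` are assumptions, not the method of the application.

```python
from collections import Counter


def verify(predicted, spot_checks):
    """Accuracy of predicted labels on the manually checked indices."""
    if not spot_checks:
        raise ValueError("at least one spot check is required")
    hits = sum(1 for i, label in spot_checks.items() if predicted[i] == label)
    return hits / len(spot_checks)


def correct(predicted, spot_checks):
    """Return a copy of the predictions with checked items replaced by the reference label."""
    corrected = list(predicted)
    for i, label in spot_checks.items():
        corrected[i] = label
    return corrected


def distribution(labels):
    """Share of each emotion class, used here as a simple analysis result."""
    counts = Counter(labels)
    total = len(labels)
    return {label: counts[label] / total for label in sorted(counts)}
```

A low value from `verify` could also be used to trigger retraining rather than item-by-item correction; the text leaves that choice open.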
In some possible embodiments of the present application, the step of extracting text data from the topic data after identifying and format converting includes:
recognizing first voice data and first tone data in the audio, and obtaining audio description text data through a voice recognition algorithm and a semantic recognition algorithm;
recognizing first text data, first facial expression data and first expression symbol data in the image, and combining an expression recognition algorithm to obtain image description text data;
identifying second voice data, second text data, second facial expression data and second expression symbol data in the video, and combining a voice recognition algorithm, a semantic recognition algorithm and an expression recognition algorithm to obtain video description text data;
converting the text, the audio description text data, the image description text data and the video description text data into a unified standardized format to obtain initial text data;
extracting the text data from the initial text data.
It can be understood that, to make public opinion analysis more accurate, data in different formats and from different platforms need to be acquired (namely, related texts, audios, images and videos are acquired as the topic data according to the related words and the derived related words). Because these data differ in format and source platform, they need to be standardized in advance. In this embodiment, the first voice data and the first tone data in the audio can be recognized (the tone data can express the emotion of the speaker), and audio description text data are obtained through a voice recognition algorithm and a semantic recognition algorithm; first text data, first facial expression data and first expression symbol data (such as WeChat emoticons) in the image are recognized and, in combination with an expression recognition algorithm (which can be trained with a neural network algorithm), image description text data are obtained; second voice data, second text data, second facial expression data and second expression symbol data in the video are recognized and, in combination with a voice recognition algorithm, a semantic recognition algorithm and an expression recognition algorithm, video description text data are obtained; the text, the audio description text data, the image description text data and the video description text data are converted into a unified standardized format to obtain initial text data; and the text data are extracted from the initial text data.
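The conversion into a unified format can be pictured as mapping every recognizer output onto one record type. The sketch below assumes the recognizers (speech, tone, facial expression and so on) have already produced strings; the `DescriptionRecord` fields and the `to_initial_text` helper are hypothetical.

```python
from dataclasses import dataclass

ALLOWED_MEDIA = {"text", "audio", "image", "video"}


@dataclass(frozen=True)
class DescriptionRecord:
    """One item of initial text data in a unified format (hypothetical layout)."""
    source_id: str
    media_type: str
    content: str


def to_initial_text(source_id, media_type, parts):
    """Merge recognizer outputs for one item into a single whitespace-normalized record."""
    if media_type not in ALLOWED_MEDIA:
        raise ValueError(f"unsupported media type: {media_type}")
    cleaned = [" ".join(p.split()) for p in parts if p and p.strip()]
    return DescriptionRecord(source_id, media_type, " ".join(cleaned))
```

For an audio clip, `parts` might hold the transcript and a short tone description, so that both end up in the same text field for later analysis.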
In some possible embodiments of the present application, the step of converting the text, the audio description text data, the image description text data and the video description text data into a unified standardized format to obtain initial text data includes:
performing word segmentation, expression symbol recognition, meaningless-symbol removal and stop-word removal on the text, the audio description text data, the image description text data and the video description text data by using a word segmentation model, an expression symbol recognition model and a stop word recognition model to obtain text data to be processed;
and carrying out standardization processing on the text data to be processed to obtain the initial text data.
It can be understood that, in order to improve the accuracy and efficiency of data analysis, in this embodiment, word segmentation, expression symbol recognition, meaningless-symbol removal and stop-word removal are performed on the text, the audio description text data, the image description text data and the video description text data by using a word segmentation model, an expression symbol recognition model and a stop word recognition model, so as to obtain text data to be processed; the text data to be processed are then standardized to obtain the initial text data.
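A minimal sketch of this preprocessing is shown below. Whitespace splitting stands in for the word segmentation model, and small lookup tables stand in for the expression symbol and stop word recognition models; all three substitutions, and the names used, are assumptions for illustration.

```python
import re

# Hypothetical lookup tables standing in for trained recognition models.
STOP_WORDS = {"the", "a", "is", "of"}
EMOTICONS = {":)": "[smile]", ":(": "[sad]"}


def preprocess(text):
    """Segment, map emoticons to tags, strip meaningless symbols and drop stop words."""
    tokens = []
    for raw in text.lower().split():  # whitespace split replaces a segmentation model
        if raw in EMOTICONS:
            tokens.append(EMOTICONS[raw])
            continue
        word = re.sub(r"[^\w]", "", raw)  # remove punctuation and other symbols
        if word and word not in STOP_WORDS:
            tokens.append(word)
    return tokens
```

Emoticons are checked before symbol stripping so that they survive as tags instead of being deleted as punctuation.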
In some possible embodiments of the present application, after the step of obtaining topic data related to a specific topic according to a preset triggering rule, the method further includes:
and acquiring a network address, a user account and user identity characteristic information corresponding to the topic data to generate a unique source identifier corresponding to the topic data.
It can be understood that in the embodiment of the present application, the network address, the user account and the user identity feature information corresponding to the topic data are obtained to generate the unique source identifier corresponding to the topic data, so as to facilitate classification processing of the data.
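The text does not say how the unique source identifier is generated. One simple possibility, shown here purely as an assumption, is to hash the three fields with SHA-256; the function name and the separator character are hypothetical.

```python
import hashlib


def make_source_id(network_address, user_account, identity_info):
    """Derive a stable identifier from the three source fields (assumed scheme)."""
    # A separator that cannot appear in normal text keeps ("ab", "c") distinct from ("a", "bc").
    payload = "\x1f".join([network_address, user_account, identity_info])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

The same inputs always give the same 64-character hexadecimal identifier, which makes it usable as a grouping key in later steps.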
In some possible embodiments of the present application, the step of normalizing the text data to be processed to obtain the initial text data includes:
grouping the text data to be processed according to the source identifier to obtain a plurality of grouped text data subgroups;
classifying the text data subgroups along the dimensions of original generation time, language, region and source person information to obtain a plurality of text data groups;
and carrying out standardization processing on the plurality of text data groups to obtain the initial text data.
It can be appreciated that, in order to perform public opinion analysis from different angles and thereby ensure the comprehensiveness of the analysis, in this embodiment the text data to be processed are grouped according to the source identifier to obtain a plurality of text data subgroups; the text data subgroups are classified along the dimensions of original generation time, language, region and source person information to obtain a plurality of text data groups; and the plurality of text data groups are standardized to obtain the initial text data.
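The two-level grouping can be sketched with dictionaries. The record field names (`source_id`, `created_date`, `language`, `region`, `source_person`) are hypothetical, and classifying each record of a subgroup by its own dimension values is one possible reading of the step.

```python
from collections import defaultdict

# Hypothetical field names for the four classification dimensions.
DIMENSIONS = ("created_date", "language", "region", "source_person")


def group_by_source(records):
    """First level: one subgroup per unique source identifier."""
    subgroups = defaultdict(list)
    for rec in records:
        subgroups[rec["source_id"]].append(rec)
    return dict(subgroups)


def classify_subgroups(subgroups, dimensions=DIMENSIONS):
    """Second level: regroup subgroup records by their values on each dimension."""
    groups = defaultdict(list)
    for recs in subgroups.values():
        for rec in recs:
            key = tuple(rec.get(d) for d in dimensions)
            groups[key].append(rec)
    return dict(groups)
```

Each resulting key is a tuple such as (date, language, region, source person type), so the same data can later be analyzed from any of these angles.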
In some possible embodiments of the present application, the step of normalizing the text data to be processed to obtain the initial text data includes:
for any one first text data subgroup in the plurality of text data subgroups, taking a first single word after word segmentation of the first text data subgroup as a reference word;
the description structure of each individual word after word segmentation is established, specifically:
creating a description structure file;
acquiring, for each individual word, its start word, intermediate word, and end word, the distance between the end word and the reference word, and its number of occurrences, and recording the start word, the intermediate word, and the end word in the description structure file;
repeating the above steps until all of the text data subgroups have been traversed.
It can be understood that, in order to accurately analyze the relationships between the individual words (subject words), in this embodiment, a description structure is established for each individual word after word segmentation, specifically: a description structure file is created; for each individual word, its start word, intermediate word, and end word, the distance between the end word and the reference word, and its number of occurrences are acquired; and the acquired start word, intermediate word, and end word are recorded in the description structure file.
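A minimal sketch of such a description structure follows. The text leaves several details open, so this sketch makes labeled assumptions: the start, intermediate, and end parts are taken to be the first character, the middle characters, and the last character of each word; the distance is taken to be the token offset of the word's first occurrence from the reference word; and `build_description_structure` is a hypothetical name.

```python
def build_description_structure(tokens: list[str]) -> dict:
    """Build a description structure for one segmented text data subgroup."""
    if not tokens:
        return {"reference_word": None, "words": {}}
    words = {}
    for position, word in enumerate(tokens):
        entry = words.get(word)
        if entry is None:
            words[word] = {
                "start": word[0],        # assumption: first character
                "middle": word[1:-1],    # assumption: characters in between
                "end": word[-1],         # assumption: last character
                # Assumption: offset of the first occurrence from the
                # reference word, which sits at position 0.
                "distance": position,
                "count": 1,
            }
        else:
            entry["count"] += 1
    # The first word after segmentation serves as the reference word.
    return {"reference_word": tokens[0], "words": words}
```

The returned dictionary can be serialized (for example as JSON) to play the role of the description structure file, and the function can be called once per subgroup to cover all subgroups.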
In some possible embodiments of the present application, the step of normalizing the text data to be processed to obtain the initial text data includes:
and for each first text data subgroup, carrying out statistical analysis on all the individual words according to the number of occurrences and the interval distance, and constructing feature structure data of the first text data subgroup in the form "individual word, number of occurrences, interval distance".
It will be appreciated that, in order to facilitate analysis of the structure of the text data, in this embodiment, for each first text data subgroup, statistical analysis is performed on all the individual words according to the number of occurrences and the interval distance, and feature structure data of the first text data subgroup is constructed in the form "individual word, number of occurrences, interval distance".
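The feature structure data can be sketched as a list of (individual word, number of occurrences, interval distance) tuples. As an assumption for illustration, the interval distance is measured as the token offset of a word's first occurrence from the reference word (the first token), and the tuples are ordered by descending count; neither choice is stated in the application, and `feature_structure` is a hypothetical name.

```python
from collections import Counter


def feature_structure(tokens: list[str]) -> list[tuple[str, int, int]]:
    """Return (word, occurrences, distance) tuples for one subgroup."""
    counts = Counter(tokens)
    first_position = {}
    for position, word in enumerate(tokens):
        # Keep only the first occurrence; position 0 is the reference word.
        first_position.setdefault(word, position)
    features = [(word, counts[word], first_position[word]) for word in counts]
    # Assumed ordering: most frequent first, ties broken by smaller distance.
    features.sort(key=lambda item: (-item[1], item[2]))
    return features
```

Frequent words that appear close to the reference word end up at the front of the list, which makes the dominant terms of a subgroup easy to inspect.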
It should be noted that, for simplicity of description, the foregoing method embodiments are all described as a series of acts, but it should be understood by those skilled in the art that the present application is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required for the present application.
In the foregoing embodiments, each embodiment is described with its own emphasis; for parts not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative: the division into units described above is merely a division by logical function, and other divisions are possible in actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection via some interfaces, devices, or units, and may be electrical or take other forms.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units described above, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable memory. Based on such understanding, the technical solution of the present application, in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a memory, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods of the various embodiments of the present application. The aforementioned memory includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or other media capable of storing program code.
Those of ordinary skill in the art will appreciate that all or a portion of the steps in the various methods of the above embodiments may be implemented by a program that instructs associated hardware, and the program may be stored in a computer-readable memory, which may include: a flash disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The embodiments of the present application have been described in detail above, and specific examples have been used to explain the principles and implementations of the application; the above description of the embodiments is provided solely to facilitate understanding of the method and core concepts of the application. Meanwhile, those skilled in the art may, in accordance with the ideas of the present application, make changes to the specific implementations and the scope of application; in view of the above, the content of this description should not be construed as limiting the present application.
Although the present application is disclosed above, the present application is not limited thereto. Variations and modifications, including combinations of the different functions and implementation steps, as well as embodiments of the software and hardware, may be readily apparent to those skilled in the art without departing from the spirit and scope of the application.

Claims (10)

1. A public opinion analysis method combining deep learning and language logic reasoning is characterized by comprising the following steps:
acquiring topic data related to a specific topic according to a preset trigger rule;
identifying the topic data, converting the format of the topic data, and extracting text data from the topic data;
performing text classification and word vector modeling on the text data by using a pre-trained first neural network to extract first related information of the text data;
carrying out structural analysis on the first related information by using a preset natural language logic reasoning model to obtain first relation data of each subject term;
processing the first relation data by using a pre-trained keyword determination model, so as to determine a first keyword set consisting of a plurality of first keywords and first attribute data of each first keyword in the plurality of first keywords from the subject words;
carrying out emotion classification on the first keyword set by using the trained emotion analysis model to obtain first emotion classification data;
analyzing the first emotion classification data to obtain a first public opinion analysis result;
performing validity verification on the first public opinion analysis result by using a clustering analysis, statistical analysis and accuracy test method to obtain a first verification result;
and correspondingly correcting the first public opinion analysis result according to the first verification result.
2. The public opinion analysis method of claim 1, wherein the pre-trained first neural network is obtained by training with a machine learning technique and a deep neural network in combination with a corpus to perform text classification on the text data, thereby analyzing first related information related to different public opinion categories.
3. The public opinion analysis method of claim 2, wherein the step of performing structural analysis on the first related information using a preset natural language logical inference model to obtain first relationship data of each subject term comprises:
and the preset natural language logic reasoning model identifies each subject word in the first related information by utilizing a natural language processing technology so as to carry out statistical analysis on the topic data, thereby obtaining an accurate public opinion analysis conclusion.
4. The public opinion analysis method of claim 3, wherein the step of obtaining topic data related to a specific topic according to a preset trigger rule comprises:
extracting association data of the specific topics from the preset trigger rules and extracting association words from the association data;
performing semantic similarity analysis based on a word vector technology to obtain derivative related words similar to the word vectors of the related words;
and acquiring related texts, audios, images and videos according to the related words and the derived related words to serve as the topic data.
5. The public opinion analysis method of claim 4, wherein the step of extracting text data from the topic data after identifying and format converting comprises:
recognizing first voice data and first tone data in the audio, and obtaining audio description text data through a voice recognition algorithm and a semantic recognition algorithm;
recognizing first text data, first facial expression data and first expression symbol data in the image, and combining an expression recognition algorithm to obtain image description text data;
identifying second voice data, second text data, second facial expression data and second expression symbol data in the video, and combining a voice recognition algorithm, a semantic recognition algorithm and an expression recognition algorithm to obtain video description text data;
converting the text, the audio description text data, the image description text data and the video description text data into a unified standardized format to obtain initial text data;
extracting the text data from the initial text data.
6. The public opinion analysis method of claim 5, wherein the step of converting the text, the audio description text data, the image description text data, and the video description text data into a unified standardized format to obtain initial text data comprises:
performing word segmentation, expression symbol recognition, meaningless symbol removal and stop word removal on the text, the audio description text data, the image description text data and the video description text data by using a word segmentation model, an expression symbol recognition model and a stop word recognition model to obtain text data to be processed;
and carrying out standardization processing on the text data to be processed to obtain the initial text data.
7. The public opinion analysis method according to claim 6, wherein after the step of obtaining topic data related to a specific topic according to a preset trigger rule, the method further comprises:
and acquiring a network address, a user account and user identity characteristic information corresponding to the topic data to generate a unique source identifier corresponding to the topic data.
8. The public opinion analysis method of claim 7, wherein the step of normalizing the text data to be processed to obtain the initial text data comprises:
grouping the text data to be processed according to the source identifier to obtain a plurality of grouped text data subgroups;
classifying the text data subgroups along each of the dimensions of original generation time, language, region, and source person information to obtain a plurality of text data groups;
and carrying out standardization processing on the plurality of text data groups to obtain the initial text data.
9. The public opinion analysis method of claim 8, wherein the step of normalizing the text data to be processed to obtain the initial text data comprises:
for any one first text data subgroup in the plurality of text data subgroups, taking a first single word after word segmentation of the first text data subgroup as a reference word;
the description structure of each individual word after word segmentation is established, specifically:
creating a description structure file;
acquiring, for each individual word, its start word, intermediate word, and end word, the distance between the end word and the reference word, and its number of occurrences, and recording the start word, the intermediate word, and the end word in the description structure file;
repeating the above steps until all of the text data subgroups have been traversed.
10. The public opinion analysis method of claim 9, wherein the step of normalizing the text data to be processed to obtain the initial text data comprises:
and for each first text data group, carrying out statistical analysis on all the individual words according to the occurrence times and the interval distance, and constructing characteristic structure data of the first text data group by using the individual words, the occurrence times and the interval distance.
CN202310165134.8A 2023-02-16 2023-02-16 Public opinion analysis method combining deep learning and language logic reasoning Active CN116340511B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310165134.8A CN116340511B (en) 2023-02-16 2023-02-16 Public opinion analysis method combining deep learning and language logic reasoning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310165134.8A CN116340511B (en) 2023-02-16 2023-02-16 Public opinion analysis method combining deep learning and language logic reasoning

Publications (2)

Publication Number Publication Date
CN116340511A (en) 2023-06-27
CN116340511B (en) 2023-09-15

Family

ID=86881437

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310165134.8A Active CN116340511B (en) 2023-02-16 2023-02-16 Public opinion analysis method combining deep learning and language logic reasoning

Country Status (1)

Country Link
CN (1) CN116340511B (en)

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101763401A (en) * 2009-12-30 2010-06-30 暨南大学 Network public sentiment hotspot prediction and analysis method
CN103440235A (en) * 2013-08-20 2013-12-11 中国科学院自动化研究所 Method and device for identifying text emotion types based on cognitive structure model
CN107066446A (en) * 2017-04-13 2017-08-18 广东工业大学 A kind of Recognition with Recurrent Neural Network text emotion analysis method of embedded logic rules
CN107315778A (en) * 2017-05-31 2017-11-03 温州市鹿城区中津先进科技研究院 A kind of natural language the analysis of public opinion method based on big data sentiment analysis
CN108984523A (en) * 2018-06-29 2018-12-11 重庆邮电大学 A kind of comment on commodity sentiment analysis method based on deep learning model
CN109376251A (en) * 2018-09-25 2019-02-22 南京大学 A kind of microblogging Chinese sentiment dictionary construction method based on term vector learning model
CN109697232A (en) * 2018-12-28 2019-04-30 四川新网银行股份有限公司 A kind of Chinese text sentiment analysis method based on deep learning
US10475182B1 (en) * 2018-11-14 2019-11-12 Qure.Ai Technologies Private Limited Application of deep learning for medical imaging evaluation
CN110532549A (en) * 2019-08-13 2019-12-03 青岛理工大学 A kind of text emotion analysis method based on binary channels deep learning model
CN110633373A (en) * 2018-06-20 2019-12-31 上海财经大学 Automobile public opinion analysis method based on knowledge graph and deep learning
CN111639183A (en) * 2020-05-19 2020-09-08 民生科技有限责任公司 Financial industry consensus public opinion analysis method and system based on deep learning algorithm
CN113361269A (en) * 2021-06-11 2021-09-07 南京信息工程大学 Method for text emotion classification
CN113468327A (en) * 2021-06-16 2021-10-01 浙江华巽科技有限公司 Early public opinion detection method based on deep learning
CN115203403A (en) * 2022-06-08 2022-10-18 云目未来科技(湖南)有限公司 Text sorting model based on network public sentiment
CN115659990A (en) * 2022-11-07 2023-01-31 北京邮电大学 Tobacco emotion analysis method, device and medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11205103B2 (en) * 2016-12-09 2021-12-21 The Research Foundation for the State University Semisupervised autoencoder for sentiment analysis

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Influence Analysis of Emotional Behaviors and User Relationships Based on Twitter Data; Kiichi Tago; Qun Jin; Tsinghua Science and Technology (Issue 01); full text *

Also Published As

Publication number Publication date
CN116340511A (en) 2023-06-27

Similar Documents

Publication Publication Date Title
CN106328147B (en) Speech recognition method and device
CN110232149B (en) Hot event detection method and system
Cummins et al. Multimodal bag-of-words for cross domains sentiment analysis
Ficamos et al. A topic based approach for sentiment analysis on twitter data
KR101713558B1 (en) Method of classification and analysis of sentiment in social network service
Arshad et al. Corpus for emotion detection on roman urdu
Agrawal et al. Affective representations for sarcasm detection
Houjeij et al. A novel approach for emotion classification based on fusion of text and speech
CN112069312A (en) Text classification method based on entity recognition and electronic device
Beleveslis et al. A hybrid method for sentiment analysis of election related tweets
Boishakhi et al. Multi-modal hate speech detection using machine learning
CN113990352A (en) User emotion recognition and prediction method, device, equipment and storage medium
Polignano et al. Identification Of Bot Accounts In Twitter Using 2D CNNs On User-generated Contents.
Sadiq et al. High dimensional latent space variational autoencoders for fake news detection
Tseng et al. Approaching Human Performance in Behavior Estimation in Couples Therapy Using Deep Sentence Embeddings.
Govindaraj et al. Intensified sentiment analysis of customer product reviews using acoustic and textual features
KR20200066119A (en) Method of fake news evaluation based on knowledge-based inference, recording medium and apparatus for performing the method
CN111382366B (en) Social network user identification method and device based on language and non-language features
Heaton et al. Language models as emotional classifiers for textual conversation
CN116340511B (en) Public opinion analysis method combining deep learning and language logic reasoning
Kaur et al. Sentiment detection from Punjabi text using support vector machine
Dadoun et al. Sentiment Classification Techniques Applied to Swedish Tweets Investigating the Effects of translation on Sentiments from Swedish into English
White et al. Using zero-resource spoken term discovery for ranked retrieval
Hu et al. TDRLM: Stylometric learning for authorship verification by Topic-Debiasing
JP6067616B2 (en) Utterance generation method learning device, utterance generation method selection device, utterance generation method learning method, utterance generation method selection method, program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant