WO2020119533A1 - Public opinion early warning method, device, terminal and medium based on a recurrent neural network algorithm - Google Patents

Public opinion early warning method, device, terminal and medium based on a recurrent neural network algorithm

Info

Publication number
WO2020119533A1
Authority
WO
WIPO (PCT)
Prior art keywords
keyword
keywords
public opinion
keyword set
tendency
Prior art date
Application number
PCT/CN2019/122787
Other languages
English (en)
French (fr)
Inventor
谢波
Original Assignee
深圳壹账通智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳壹账通智能科技有限公司
Publication of WO2020119533A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology

Definitions

  • The present application relates to the field of artificial intelligence technology, and in particular to a public opinion early warning method, device, terminal and medium based on a recurrent neural network algorithm.
  • Public opinion early warning can detect public opinion information and negative information related to "me" at the earliest moment and give timely warning of major public opinion events; provide qualitative and quantitative public opinion analysis data to accurately judge the development trend of a specific public opinion event or topic; and automatically generate public opinion reports and various statistical reports, improving the quality and efficiency of public opinion management and assisting leaders in decision-making.
  • The main purpose of the present application is to provide a public opinion early warning method, device, terminal and medium based on a recurrent neural network algorithm, aiming to solve the technical problem that existing technology predicts the development trend of public opinion poorly.
  • To achieve this purpose, the present application provides a public opinion early warning method based on a recurrent neural network algorithm, which includes the following steps:
  • acquiring public opinion news within a preset time, and determining the tendency of keywords in the public opinion news;
  • determining the feature vector corresponding to each keyword according to the tendency of the keyword;
  • determining the feature sequence of the public opinion news according to the feature vectors corresponding to the keywords;
  • inputting the feature sequence of the public opinion news into the trained recurrent neural network model to determine the public opinion early warning index;
  • issuing a public opinion early warning according to the public opinion early warning index.
  • The present application also provides a public opinion early warning device based on a recurrent neural network algorithm, including:
  • a public opinion acquisition module, used to acquire public opinion news within a preset time and determine the tendency of keywords in the public opinion news;
  • a vector building module, used to determine the feature vector corresponding to each keyword according to the tendency of the keyword;
  • a sequence determination module, used to determine the feature sequence of the public opinion news according to the feature vectors corresponding to the keywords;
  • an index determination module, used to input the feature sequence of the public opinion news into the trained recurrent neural network model to determine the public opinion early warning index;
  • an early warning issuing module, used to issue a public opinion early warning according to the public opinion early warning index.
  • The present application also provides a terminal. The terminal includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor; the computer-readable instructions are configured to implement the steps of the public opinion early warning method based on the recurrent neural network algorithm described above.
  • The present application also provides a storage medium that stores computer-readable instructions which, when executed by a processor, implement the steps of the public opinion early warning method based on the recurrent neural network algorithm described above.
  • FIG. 1 is a schematic structural diagram of a terminal in a hardware operating environment involved in an embodiment of the present application
  • FIG. 2 is a schematic flowchart of a first embodiment of a public opinion early warning method based on a recurrent neural network algorithm in this application;
  • FIG. 3 is a schematic flowchart of a second embodiment of a public opinion early warning method based on a recurrent neural network algorithm in this application;
  • FIG. 4 is a schematic flowchart of a third embodiment of a public opinion early warning method based on a recurrent neural network algorithm in this application;
  • FIG. 5 is a schematic flowchart of a fourth embodiment of a public opinion early warning method based on a recurrent neural network algorithm of this application;
  • FIG. 6 is a structural block diagram of a first embodiment of a public opinion early warning device based on a recurrent neural network algorithm in this application.
  • FIG. 1 is a schematic diagram of a terminal structure of a hardware operating environment involved in a solution according to an embodiment of the present application.
  • As shown in FIG. 1, the terminal may include a processor 1001, such as a central processing unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005.
  • The communication bus 1002 is used to implement connection and communication between these components.
  • The user interface 1003 may include a display and an input module such as a keyboard; optionally, the user interface 1003 may also include a standard wired interface and a wireless interface.
  • The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a Wireless Fidelity (Wi-Fi) interface).
  • The memory 1005 may be a high-speed random access memory (RAM) or a stable non-volatile memory (NVM), such as a disk memory.
  • The memory 1005 may optionally be a storage device independent of the foregoing processor 1001.
  • Those skilled in the art will understand that the structure shown in FIG. 1 does not limit the terminal, which may include more or fewer components than illustrated, combine certain components, or arrange the components differently.
  • As shown in FIG. 1, the memory 1005, as a storage medium, may include an operating system, a data storage module, a network communication module, a user interface module, and computer-readable instructions.
  • In the terminal shown in FIG. 1, the network interface 1004 is mainly used for data communication with a network server, and the user interface 1003 is mainly used for data interaction with a user. The processor 1001 and the memory 1005 of the present application may be provided in the terminal.
  • The terminal calls the computer-readable instructions stored in the memory 1005 through the processor 1001 and executes the public opinion early warning method based on the recurrent neural network algorithm provided by the embodiments of the present application.
  • An embodiment of the present application provides a public opinion early warning method based on a recurrent neural network algorithm. FIG. 2 is a schematic flowchart of the first embodiment of this method.
  • In this embodiment, the public opinion early warning method based on the recurrent neural network algorithm includes the following steps:
  • Step S10: Acquire public opinion news within a preset time, and determine the tendency of keywords in the public opinion news;
  • It should be noted that the method of this embodiment is executed by the terminal. Public opinion news is a kind of online public opinion that spreads through network platforms.
  • Public opinion news may be published through web pages, third-party software, plug-ins, and the like.
  • The public opinion news may be acquired through an API or through a web crawler; no specific limitation is imposed here.
  • The tendency of a keyword may be divided into a positive tendency and a negative tendency, or into a positive tendency, a negative tendency, and a neutral tendency.
  • The positive tendency of a keyword is its degree of positive evaluation,
  • the negative tendency of a keyword is its degree of negative evaluation, and
  • the neutral tendency of a keyword is its degree of neutral evaluation.
  • Before the public opinion news within the preset time is acquired, the public opinion news needs to be preprocessed. The preprocessing includes:
  • Step S100a: Cluster the public opinion news. Because public opinion news is sudden in time and follows no general pattern, the number of public opinion news items on the same topic (for example, the resignation of senior company personnel or content related to the company's strategy and policy) needs to be predicted. The clustering process aggregates public opinion news describing the same topic into the same category.
  • The clustering may use any conventional clustering method in the prior art, which is not specifically limited here.
  • Step S100b: Acquire related topics. The number of public opinion news items on the network is usually large, and so is the number of corresponding topics.
  • The topics that public opinion prediction focuses on may be user-defined, or may be set to regular topics that companies pay attention to, such as the resignation of senior company personnel or content related to the company's strategy and policy.
  • The relevant topics in the public opinion news may be obtained through keyword search or other conventional means; no specific limitation is imposed here.
  • Step S100c: Perform data aggregation on the public opinion news. Aggregating the public opinion news yields a time series in which the value at each moment is the number of all public opinion news items on the network up to that moment.
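  • The aggregation in step S100c can be sketched as a running total. This is a minimal illustration, assuming each news item has already been assigned to a time bucket (for example a date string); the function name and bucket representation are hypothetical and not part of the application.

```python
from collections import Counter
from itertools import accumulate


def cumulative_news_counts(timestamps, buckets):
    """Return, for each bucket in order, the number of news items
    observed up to and including that bucket.

    timestamps: bucket label of every news item (hypothetical input format)
    buckets:    ordered list of all bucket labels in the preset time window
    """
    per_bucket = Counter(timestamps)
    # Running sum over the ordered buckets gives the cumulative time series.
    return list(accumulate(per_bucket.get(b, 0) for b in buckets))


series = cumulative_news_counts(
    ["2018-12-01", "2018-12-01", "2018-12-03"],
    ["2018-12-01", "2018-12-02", "2018-12-03"],
)
print(series)
```

  • Buckets with no news simply repeat the previous total, which matches the description of each value as the count "up to the current moment".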
  • In a specific implementation, the public opinion news within the preset time is acquired, a word segmentation tool is used to segment the acquired news, the keywords in each public opinion news item are extracted, and the tendency of each keyword is then determined.
  • The tendency of keywords may be determined by collecting historical public opinion news in advance, labeling it, and counting how many times each keyword appears in positively evaluated news or in negatively evaluated news, thereby building a keyword tendency database. To determine the tendency of a given keyword, it is then sufficient to look the keyword up in the tendency database.
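  • The tendency database described above can be sketched as a counting pass over labeled historical news. The text only specifies counting occurrences; normalising the counts to fractions, and the function and label names, are assumptions made for illustration.

```python
from collections import defaultdict


def build_tendency_db(labelled_news):
    """Build a keyword -> tendency lookup from labeled historical news.

    labelled_news: iterable of (keywords, label) pairs, where label is
    "positive" or "negative" (hypothetical input format).
    """
    counts = defaultdict(lambda: {"positive": 0, "negative": 0})
    for keywords, label in labelled_news:
        for kw in set(keywords):  # count a keyword once per news item
            counts[kw][label] += 1

    db = {}
    for kw, c in counts.items():
        total = c["positive"] + c["negative"]  # always >= 1 here
        # Assumption: express each tendency as a fraction of labeled items.
        db[kw] = {label: n / total for label, n in c.items()}
    return db


history = [
    (["layoff", "ceo"], "negative"),
    (["ceo", "award"], "positive"),
]
tendency_db = build_tendency_db(history)
print(tendency_db["ceo"])
```

  • Looking up a keyword at prediction time is then a plain dictionary access on the returned mapping.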
  • Step S20: Determine the feature vector corresponding to each keyword according to the tendency of the keyword;
  • It should be noted that determining the feature vector according to the tendency of the keyword means using the tendency values of the keyword as the corresponding weights to construct the keyword's feature vector.
  • The dimension of the feature vector may be determined by how the tendency is divided. For example, if the tendency is divided into a positive tendency, a negative tendency, and a neutral tendency, the feature vector may be set to at least three dimensions.
  • Step S30: Determine the feature sequence of the public opinion news according to the feature vectors corresponding to the keywords;
  • It should be understood that a public opinion news item is composed of multiple keywords.
  • The feature sequence of the public opinion news may therefore be formed by combining the feature vectors of its keywords.
  • For example, if a public opinion news item contains m keywords, its feature sequence may be constructed as a 3×m or m×3 feature array, or the dimension of the feature sequence may be determined according to the specific classification used.
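  • Steps S20 and S30 can be sketched as follows: each keyword becomes a three-dimensional vector of its tendency values, and the vectors are stacked in keyword order into an m×3 sequence. The fallback used for keywords missing from the database is an assumption, as are all names.

```python
def keyword_vector(tendency):
    """Turn one keyword's tendency values into a 3-dimensional feature vector."""
    return [tendency["positive"], tendency["negative"], tendency["neutral"]]


def feature_sequence(keywords, tendency_db, default=None):
    """Stack the keyword vectors of one news item into an m x 3 sequence.

    Assumption: keywords absent from the database are treated as fully neutral.
    """
    if default is None:
        default = {"positive": 0.0, "negative": 0.0, "neutral": 1.0}
    return [keyword_vector(tendency_db.get(kw, default)) for kw in keywords]


tendency_db = {
    "layoff": {"positive": 0.1, "negative": 0.8, "neutral": 0.1},
    "award": {"positive": 0.9, "negative": 0.0, "neutral": 0.1},
}
sequence = feature_sequence(["layoff", "award", "meeting"], tendency_db)
print(len(sequence), len(sequence[0]))
```

  • Transposing the result gives the 3×m layout mentioned in the text; which orientation to use depends on what the downstream model expects.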
  • Step S40: Input the feature sequence of the public opinion news into the trained recurrent neural network model to determine the public opinion early warning index;
  • It should be understood that the feature sequence of the public opinion news is the input of the recurrent neural network model, and that the hidden layer of the recurrent neural network includes the hidden vector of historical public opinion news.
  • Through the recurrent neural network model, the overall tendency of public opinion is obtained and used as the public opinion early warning index.
  • In addition, the regional scope affected by public opinion news at a given moment or in a given period varies (for example country, province, or city). Therefore, when the number of public opinion news items is considered, the regional scope of the news may be used as a correction value, denoted t1.
  • Because media coverage also strongly influences the number of public opinion news items, the exposure of the public opinion news is also considered as a correction value, denoted t2.
  • The circulation of public opinion news also reflects how the information spreads on the Internet and how actively it is discussed. Therefore, the circulation of public opinion news may also be used as a correction value, denoted t3.
  • The correction value t3 may be user-defined. For example, news about the resignation of senior company personnel may not circulate widely, but companies tend to pay close attention to this topic, so the result can be corrected by adjusting t3.
  • The recurrent neural network model may be trained by crawling public opinion news data from the network, initializing the parameters of the model, and computing the model parameters from the keyword data in the public opinion news data; the specific training procedure may also use methods known in the art.
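  • The inference side of step S40 can be sketched with a plain Elman-style recurrent cell. The application does not specify the cell type, the weight shapes, or how the index is read out, so the tanh recurrence, the sigmoid read-out, and every parameter name below are assumptions; the weights would come from training, which is not shown.

```python
import math


def rnn_warning_index(sequence, W_xh, W_hh, w_hy, h0=None):
    """Run a simple recurrent cell over a feature sequence.

    sequence: list of per-keyword feature vectors (e.g. m x 3)
    W_xh:     hidden_size x input_size input weights (assumed trained)
    W_hh:     hidden_size x hidden_size recurrent weights (assumed trained)
    w_hy:     hidden_size read-out weights (assumed trained)
    h0:       optional hidden vector carried over from historical news
    Returns (index in (0, 1), final hidden vector).
    """
    hidden_size = len(W_hh)
    h = list(h0) if h0 is not None else [0.0] * hidden_size
    for x in sequence:
        # The comprehension reads the previous h before h is rebound.
        h = [
            math.tanh(
                sum(W_xh[i][j] * x[j] for j in range(len(x)))
                + sum(W_hh[i][k] * h[k] for k in range(hidden_size))
            )
            for i in range(hidden_size)
        ]
    score = sum(w_hy[i] * h[i] for i in range(hidden_size))
    return 1.0 / (1.0 + math.exp(-score)), h


# Toy weights: a single hidden unit that responds to the negative dimension.
index, hidden = rnn_warning_index(
    [[0.1, 0.8, 0.1], [0.0, 0.9, 0.1]],
    W_xh=[[0.0, 1.0, 0.0]],
    W_hh=[[0.5]],
    w_hy=[1.0],
)
print(index > 0.5)
```

  • Passing the returned hidden vector as `h0` for the next news item is one way to let the hidden layer carry information from historical news, as the text describes.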
  • Step S50: Issue a public opinion early warning according to the public opinion early warning index.
  • In a specific implementation, the public opinion early warning may be issued when the public opinion early warning index is greater than a preset threshold.
  • The preset threshold may be user-defined, or may be preset according to the topic of the public opinion news. The warning may be issued in various ways, for example by phone call, SMS, or email, or in a user-defined manner.
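  • The threshold check in step S50 can be sketched as a lookup of a per-topic threshold with a user-defined fallback. The topic names and threshold values are hypothetical.

```python
def should_warn(index, topic, topic_thresholds, default_threshold=0.8):
    """Return True when the early warning index exceeds the preset threshold.

    topic_thresholds: thresholds preset per topic (hypothetical values)
    default_threshold: user-defined fallback for topics without a preset
    """
    return index > topic_thresholds.get(topic, default_threshold)


thresholds = {"senior personnel resignation": 0.6}
print(should_warn(0.7, "senior personnel resignation", thresholds))
print(should_warn(0.7, "strategy and policy", thresholds))
```

  • A lower threshold for a topic makes the warning more sensitive for that topic, which is one way to reflect topics a company watches closely.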
  • This application acquires public opinion news within a preset time and determines the tendency of the keywords in the news; determines the feature vector corresponding to each keyword according to its tendency; determines the feature sequence of the public opinion news from those feature vectors; and finally inputs the feature sequence into the trained recurrent neural network model to determine the public opinion early warning index and issues a public opinion early warning according to that index. In this way the direction of public opinion can be judged accurately, which solves the technical problem that existing technology predicts the development trend of public opinion poorly.
  • FIG. 3 is a schematic flowchart of a second embodiment of the public opinion early warning method based on a recurrent neural network algorithm.
  • Based on the first embodiment, in this embodiment step S10 includes:
  • Step S101: Acquire public opinion news within a preset time and a keyword library established in advance, and determine the tendency of keywords in the public opinion news.
  • The keyword library may be divided into a positive keyword set, a negative keyword set, and a neutral keyword set, or into a positive keyword set and a negative keyword set; the specific classification is set as needed.
  • The keyword library may be established in advance from labeled public opinion news: keywords appearing in news labeled as positively evaluated are put into the positive keyword set, keywords appearing in news labeled as negatively evaluated are put into the negative keyword set, and keywords appearing in news labeled as neutrally evaluated are put into the neutral keyword set.
  • The keywords in each keyword set may also be defined by the user based on experience and the like.
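  • Building the keyword library from labeled news can be sketched as filling one set per label. The input format and names are hypothetical; note that with this simple rule a keyword that appears under several labels ends up in several sets.

```python
def build_keyword_library(tagged_news):
    """Collect keywords into positive, negative and neutral keyword sets.

    tagged_news: iterable of (keywords, tag) pairs with tag in
    {"positive", "negative", "neutral"} (hypothetical input format).
    """
    library = {"positive": set(), "negative": set(), "neutral": set()}
    for keywords, tag in tagged_news:
        library[tag].update(keywords)
    return library


library = build_keyword_library([
    (["award", "growth"], "positive"),
    (["layoff", "lawsuit"], "negative"),
    (["meeting"], "neutral"),
])
print(sorted(library["negative"]))
```

  • User-defined keywords can be merged into the returned sets afterwards with an ordinary set update.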
  • FIG. 4 is a schematic flowchart of a third embodiment of the public opinion early warning method based on a recurrent neural network algorithm of the present application.
  • In this embodiment, the tendency of a keyword includes a positive tendency, a negative tendency, and a neutral tendency.
  • The positive, negative, and neutral tendencies are the probabilities that the keyword appears in positive news, negative news, and neutral news, respectively.
  • Step S101 specifically includes:
  • Step S1011: Establish a keyword library, which includes a positive keyword set, a negative keyword set, and a neutral keyword set;
  • The keyword library may be classified according to specific needs: it may include a positive keyword set, a negative keyword set, and a neutral keyword set, or only a positive keyword set and a negative keyword set.
  • Step S1012: Calculate the correlation between each keyword and the remaining keywords in each keyword set;
  • By calculating the correlation between a keyword and the remaining keywords in each keyword set, the tendency of the keyword can be determined. For example, for keyword A and the positive keyword set {A, B, C, D}, the positive tendency of A is determined by calculating the correlation between A and each of B, C, and D.
  • In the correlation calculation for the positive keyword set, n is the number of keywords in the positive keyword set, rec(w, v) is the correlation between keywords w and v, P is the positive keyword set, p(w) is the probability that keyword w appears in a document, p(v) is the probability that keyword v appears in a document, and p(w, v) is the probability that w and v appear in a document together.
  • In the correlation calculation for the negative keyword set, n is the number of keywords in the negative keyword set, rec(w, v) is the correlation between keywords w and v, Q is the negative keyword set, and p(w), p(v), and p(w, v) are defined as above.
  • In the correlation calculation for the neutral keyword set, k is the number of keywords in the neutral keyword set, rec(w, v) is the correlation between keywords w and v, M is the neutral keyword set, and p(w), p(v), and p(w, v) are defined as above.
  • Step S1013: Based on the correlation between each keyword and the remaining keywords in each keyword set, calculate the positive tendency, negative tendency, and neutral tendency of the keyword.
  • The average of the correlations between a keyword and the remaining keywords in a keyword set may be used as the keyword's corresponding tendency.
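  • The correlation formulas themselves are not reproduced in this text, so the following is only a sketch of one common choice that is consistent with the symbols p(w), p(v), and p(w, v): pointwise mutual information, averaged over the remaining keywords of a set. Using PMI for rec(w, v), and returning 0 when two keywords never co-occur, are both assumptions.

```python
import math


def doc_prob(docs, *words):
    """Fraction of documents that contain all of the given words."""
    hits = sum(1 for d in docs if all(w in d for w in words))
    return hits / len(docs)


def rec(docs, w, v):
    """Correlation of w and v, assumed here to be pointwise mutual information."""
    p_wv = doc_prob(docs, w, v)
    if p_wv == 0.0:
        return 0.0  # assumption: treat keywords that never co-occur as uncorrelated
    return math.log(p_wv / (doc_prob(docs, w) * doc_prob(docs, v)))


def mean_correlation(docs, w, keyword_set):
    """Average correlation between w and the remaining keywords of one set."""
    others = [v for v in keyword_set if v != w]
    if not others:
        return 0.0
    return sum(rec(docs, w, v) for v in others) / len(others)


# Each document is represented as the set of keywords it contains.
docs = [{"A", "B"}, {"A", "B"}, {"C"}, {"C"}]
print(mean_correlation(docs, "A", {"A", "B", "C"}))
```

  • Calling `mean_correlation` once per keyword set yields the three per-set values that step S1013 combines into tendencies.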
  • FIG. 5 is a schematic flowchart of a fourth embodiment of the public opinion early warning method based on a recurrent neural network algorithm.
  • In this embodiment, step S1013 specifically includes:
  • Step S1013a: Take the keyword's relevance to the remaining keywords in the positive keyword set, minus its relevance to the remaining keywords in the negative keyword set, minus its relevance to the remaining keywords in the neutral keyword set, as the positive tendency;
  • Steps S1013a, S1013b, and S1013c have no required order. Step S1013c may be performed first, or steps S1013a and S1013b may be performed at the same time; no specific limitation is imposed here.
  • Writing rel1, rel2, and rel3 for the keyword's relevance to the remaining keywords in the positive, negative, and neutral keyword sets respectively, the positive tendency of a keyword = rel1 - rel2 - rel3.
  • Alternatively, the positive tendency of a keyword = the average relevance of the keyword to the remaining keywords in the positive keyword set - the average relevance to the remaining keywords in the negative keyword set - the average relevance to the remaining keywords in the neutral keyword set.
  • Step S1013b: Take the keyword's relevance to the remaining keywords in the negative keyword set, minus its relevance to the remaining keywords in the positive keyword set, minus its relevance to the remaining keywords in the neutral keyword set, as the negative tendency;
  • That is, the negative tendency of a keyword = rel2 - rel1 - rel3.
  • Alternatively, the negative tendency of a keyword = the average relevance of the keyword to the remaining keywords in the negative keyword set - the average relevance to the remaining keywords in the positive keyword set - the average relevance to the remaining keywords in the neutral keyword set.
  • Step S1013c: Take the keyword's relevance to the remaining keywords in the neutral keyword set, minus its relevance to the remaining keywords in the positive keyword set, minus its relevance to the remaining keywords in the negative keyword set, as the neutral tendency.
  • That is, the neutral tendency of a keyword = rel3 - rel1 - rel2.
  • Alternatively, the neutral tendency of a keyword = the average relevance of the keyword to the remaining keywords in the neutral keyword set - the average relevance to the remaining keywords in the positive keyword set - the average relevance to the remaining keywords in the negative keyword set.
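  • Steps S1013a to S1013c reduce to three subtractions once the per-set relevance values are known. A minimal sketch, with rel1, rel2, and rel3 taken as already computed (whether as raw or averaged relevance); the function name is hypothetical.

```python
def tendencies(rel1, rel2, rel3):
    """Combine per-set relevance values into the three tendencies.

    rel1, rel2, rel3: relevance of the keyword to the remaining keywords
    in the positive, negative and neutral keyword sets respectively.
    """
    return {
        "positive": rel1 - rel2 - rel3,  # step S1013a
        "negative": rel2 - rel1 - rel3,  # step S1013b
        "neutral": rel3 - rel1 - rel2,   # step S1013c
    }


print(tendencies(3.0, 1.0, 0.5))
```

  • The three values are independent of one another, which is why the steps can run in any order or in parallel. Note that the results can be negative.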
  • In addition, an embodiment of the present application further provides a storage medium, which may be a non-volatile readable storage medium.
  • Computer-readable instructions are stored on the storage medium of the present application.
  • When the computer-readable instructions are executed by a processor, the steps of the public opinion early warning method based on the recurrent neural network algorithm described above are implemented.
  • For the method implemented when the computer-readable instructions are executed, refer to the embodiments of the public opinion early warning method based on the recurrent neural network algorithm of the present application; details are not repeated here.
  • FIG. 6 is a structural block diagram of a first embodiment of a public opinion early warning device based on a recurrent neural network algorithm of the present application.
  • The public opinion early warning device based on the recurrent neural network algorithm proposed in the embodiment of the present application includes:
  • The public opinion acquisition module 601, used to acquire public opinion news within a preset time and determine the tendency of keywords in the public opinion news;
  • Public opinion news is a kind of online public opinion that spreads through network platforms.
  • Public opinion news may be published through web pages, third-party software, plug-ins, and the like.
  • The public opinion news may be acquired through an API or through a web crawler; no specific limitation is imposed here.
  • The tendency of a keyword may be divided into a positive tendency and a negative tendency, or into a positive tendency, a negative tendency, and a neutral tendency.
  • The positive tendency of a keyword is its degree of positive evaluation,
  • the negative tendency of a keyword is its degree of negative evaluation, and
  • the neutral tendency of a keyword is its degree of neutral evaluation.
  • The vector building module 602, used to determine the feature vector corresponding to each keyword according to the tendency of the keyword;
  • Determining the feature vector according to the tendency of the keyword means using the tendency values of the keyword as the corresponding weights to construct the keyword's feature vector.
  • The dimension of the feature vector may be determined by how the tendency is divided. For example, if the tendency is divided into a positive tendency, a negative tendency, and a neutral tendency, the feature vector may be set to at least three dimensions.
  • The sequence determination module 603, used to determine the feature sequence of the public opinion news according to the feature vectors corresponding to the keywords;
  • A public opinion news item is composed of multiple keywords.
  • The feature sequence of the public opinion news may therefore be formed by combining the feature vectors of its keywords.
  • For example, if a public opinion news item contains m keywords, its feature sequence may be constructed as a 3×m or m×3 feature array, or the dimension of the feature sequence may be determined according to the specific classification used.
  • The index determination module 604, used to input the feature sequence of the public opinion news into the trained recurrent neural network model to determine the public opinion early warning index;
  • The feature sequence of the public opinion news is the input of the recurrent neural network model, and the hidden layer of the recurrent neural network includes the hidden vector of historical public opinion news.
  • Through the recurrent neural network model, the overall tendency of public opinion is obtained and used as the public opinion early warning index.
  • In addition, the regional scope affected by public opinion news at a given moment or in a given period varies (for example country, province, or city), so the regional scope of the news may be used as a correction value, denoted t1. Because media coverage also strongly influences the number of public opinion news items, the exposure of the public opinion news is also considered as a correction value, denoted t2.
  • The circulation of public opinion news also reflects how the information spreads on the Internet and how actively it is discussed. Therefore, the circulation of public opinion news may also be used as a correction value, denoted t3.
  • The correction value t3 may be user-defined. For example, news about the resignation of senior company personnel may not circulate widely, but companies tend to pay close attention to this topic, so the result can be corrected by adjusting t3.
  • The early warning issuing module 605, used to issue a public opinion early warning according to the public opinion early warning index.
  • The public opinion early warning may be issued when the public opinion early warning index is greater than a preset threshold.
  • The preset threshold may be user-defined, or may be preset according to the topic of the public opinion news.
  • This application acquires public opinion news within a preset time and determines the tendency of the keywords in the news; determines the feature vector corresponding to each keyword according to its tendency; determines the feature sequence of the public opinion news from those feature vectors; and finally inputs the feature sequence into the trained recurrent neural network model to determine the public opinion early warning index and issues a public opinion early warning according to that index. In this way the direction of public opinion can be judged accurately, which solves the technical problem that existing technology predicts the development trend of public opinion poorly.
  • From the description of the above embodiments, those skilled in the art will understand that the methods in the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation.
  • On this basis, the technical solution of the present application, in essence or in the part that contributes to the existing technology, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as read-only memory/random access memory, a magnetic disk, or an optical disk) and includes several instructions that enable a terminal device (which may be a mobile phone, computer, server, air conditioner, or network device, etc.) to execute the methods described in the embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

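A public opinion early warning method, device, terminal and medium based on a recurrent neural network algorithm. Public opinion news within a preset time is acquired and the tendency of the keywords in the news is determined; the feature vector corresponding to each keyword is determined according to its tendency; the feature sequence of the public opinion news is determined from the feature vectors of the keywords; finally, the feature sequence is input into a trained recurrent neural network model to determine a public opinion early warning index, and a public opinion early warning is issued according to that index. The direction of public opinion can thus be judged accurately, which solves the technical problem that existing technology predicts the development trend of public opinion poorly.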

Description

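Public opinion early warning method, device, terminal and medium based on a recurrent neural network algorithm
This application claims priority to the Chinese patent application filed with the China Patent Office on December 14, 2018, with application number 201811530781.X and the invention title "Public opinion early warning method, device, terminal and medium based on a recurrent neural network algorithm", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of artificial intelligence technology, and in particular to a public opinion early warning method, device, terminal and medium based on a recurrent neural network algorithm.
Background
With the rapid development of Internet technology, the openness and flexibility of the network have made it one of the main carriers of social public opinion. Public opinion early warning can detect public opinion information and negative information related to "me" at the earliest moment and give timely warning of major public opinion events; provide qualitative and quantitative public opinion analysis data to accurately judge the development trend of a specific public opinion event or topic; and automatically generate public opinion reports and various statistical reports, improving the quality and efficiency of public opinion management and assisting leaders in decision-making.
At present there are many public opinion early warning methods on the market, but they have many shortcomings and defects, such as poor prediction of the development trend of public opinion.
The above content is only intended to assist in understanding the technical solution of this application and does not constitute an admission that it is prior art.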
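Summary
The main purpose of this application is to provide a public opinion early warning method, device, terminal and medium based on a recurrent neural network algorithm, aiming to solve the technical problem that existing technology predicts the development trend of public opinion poorly.
To achieve the above purpose, this application provides a public opinion early warning method based on a recurrent neural network algorithm, which includes the following steps:
acquiring public opinion news within a preset time, and determining the tendency of keywords in the public opinion news;
determining the feature vector corresponding to each keyword according to the tendency of the keyword;
determining the feature sequence of the public opinion news according to the feature vectors corresponding to the keywords;
inputting the feature sequence of the public opinion news into a trained recurrent neural network model to determine a public opinion early warning index;
issuing a public opinion early warning according to the public opinion early warning index.
Based on the above purpose, this application also provides a public opinion early warning device based on a recurrent neural network algorithm, including:
a public opinion acquisition module, used to acquire public opinion news within a preset time and determine the tendency of keywords in the public opinion news;
a vector building module, used to determine the feature vector corresponding to each keyword according to the tendency of the keyword;
a sequence determination module, used to determine the feature sequence of the public opinion news according to the feature vectors corresponding to the keywords;
an index determination module, used to input the feature sequence of the public opinion news into a trained recurrent neural network model to determine a public opinion early warning index;
an early warning issuing module, used to issue a public opinion early warning according to the public opinion early warning index.
Based on the above purpose, this application also provides a terminal. The terminal includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor; the computer-readable instructions are configured to implement the steps of the public opinion early warning method based on the recurrent neural network algorithm described above.
Based on the above purpose, this application also provides a storage medium on which computer-readable instructions are stored; when executed by a processor, the computer-readable instructions implement the steps of the public opinion early warning method based on the recurrent neural network algorithm described above.
The details of one or more embodiments of this application are set out in the drawings and description below. Other features and advantages of this application will become apparent from the specification, the drawings, and the claims.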
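Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of a terminal in the hardware operating environment involved in the embodiments of this application;
FIG. 2 is a schematic flowchart of the first embodiment of the public opinion early warning method based on a recurrent neural network algorithm of this application;
FIG. 3 is a schematic flowchart of the second embodiment of the public opinion early warning method based on a recurrent neural network algorithm of this application;
FIG. 4 is a schematic flowchart of the third embodiment of the public opinion early warning method based on a recurrent neural network algorithm of this application;
FIG. 5 is a schematic flowchart of the fourth embodiment of the public opinion early warning method based on a recurrent neural network algorithm of this application;
FIG. 6 is a structural block diagram of the first embodiment of the public opinion early warning device based on a recurrent neural network algorithm of this application.
The realization of the purpose, the functional characteristics, and the advantages of this application are further described with reference to the embodiments and the drawings.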
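Detailed Description
It should be understood that the specific embodiments described here are only used to explain this application and are not intended to limit it.
Referring to FIG. 1, FIG. 1 is a schematic structural diagram of a terminal in the hardware operating environment involved in the embodiments of this application.
As shown in FIG. 1, the terminal may include a processor 1001, such as a central processing unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to implement connection and communication between these components. The user interface 1003 may include a display and an input module such as a keyboard; optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a Wireless Fidelity (Wi-Fi) interface). The memory 1005 may be a high-speed random access memory (RAM) or a stable non-volatile memory (NVM), such as a disk memory. The memory 1005 may optionally also be a storage device independent of the foregoing processor 1001.
Those skilled in the art will understand that the structure shown in FIG. 1 does not limit the terminal, which may include more or fewer components than illustrated, combine certain components, or arrange the components differently.
As shown in FIG. 1, the memory 1005, as a storage medium, may include an operating system, a data storage module, a network communication module, a user interface module, and computer-readable instructions.
In the terminal shown in FIG. 1, the network interface 1004 is mainly used for data communication with a network server, and the user interface 1003 is mainly used for data interaction with a user. The processor 1001 and the memory 1005 of this application may be provided in the terminal; the terminal calls the computer-readable instructions stored in the memory 1005 through the processor 1001 and executes the public opinion early warning method based on the recurrent neural network algorithm provided by the embodiments of this application.
An embodiment of this application provides a public opinion early warning method based on a recurrent neural network algorithm. Referring to FIG. 2, FIG. 2 is a schematic flowchart of the first embodiment of this method.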
本实施例中,所述基于循环神经网络算法的舆情预警方法包括如下步骤:
步骤S10:获取预设时间内舆情新闻,并确定所述舆情新闻中关键词的倾向度;
需要说明的是,本实施例方法的执行主体为终端,舆情新闻是一种网络舆情,通过网络平台进行扩散和传播。舆情新闻可以是通过网页、或者第三方软件、插件等发布的。而舆情新闻的获取可以是通过API接口获取,也可以是通过网页爬虫等方式获取,在此不做具体限制。
关键词的倾向度可以分为正面倾向度、负面倾向度,也可以分为正面倾向度、负面倾向度以及中立倾向度。关键词的正面倾向度为正面评价的程度,关键词的负面倾向度为负面评价的程度,关键词的中立倾向度为中立评价的程度。
通过在获取预设时间内舆情新闻前,需要对舆情新闻进行预处理,预处理的方法包括:
步骤S100a:对舆情新闻进行聚类。由于舆情新闻在时间上具有突发性,不具有普遍的规律,因此需要对舆情新闻的同一话题(例如关于公司高层人事离职、公司战略政策相关内容等)的数量进行预测。聚类过程主要是将所描述为同一话题的舆情新闻聚合到同一类别中。聚类方法可以采用现有技术中常规的聚类方法,在此不做具体限制。
步骤S100b:获取相关话题。网络上出现的舆情新闻数量通过会很多,对应的话题也会很多。舆情预测通常关注的话题可以是用户自定义,也可以是设置为企业关注的常规话题,例如公司高层人事离职、公司战略政策 相关内容等。获取舆情新闻中相关话题,可以是通过关键词检索获取,也可以采用其他常规手段,在此不做具体限制。
步骤S100c:对舆情新闻进行数据聚合。通过对舆情新闻进行数据聚合,得到一个时间序列,每个时刻的值是到当前时刻为止网络上所有舆情新闻的数量。
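The aggregation in step S100c can be sketched as a running total over time buckets. This is only an illustration: the function name `cumulative_counts`, the hourly bucketing, and the sample timestamps are assumptions made for the example and do not come from the application.

```python
from collections import Counter
from itertools import accumulate


def cumulative_counts(timestamps, buckets):
    """For each bucket, return the number of news items published up to and including it.

    `timestamps` holds one bucket label per news item (hypothetical hourly buckets here).
    """
    per_bucket = Counter(timestamps)
    return list(accumulate(per_bucket.get(b, 0) for b in buckets))


# Hypothetical data: six news items published in hours 0..3.
hours = [0, 1, 2, 3]
published = [0, 0, 1, 3, 3, 3]
series = cumulative_counts(published, hours)
print(series)
```

Each value in `series` is the total up to that hour, which matches the description that every point of the time series counts all news items seen so far.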
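In a specific implementation, the public opinion news within the preset time is acquired, a word segmentation tool is used to segment the acquired news, the keywords in each news item are obtained, and the tendency of each keyword is then determined.
The tendency of a keyword may be determined as follows: historical public opinion news is collected and labeled in advance, and the number of times each keyword appears in positively evaluated news, or in negatively evaluated news, is counted to build a keyword tendency library. To determine the tendency of a given keyword, the corresponding tendency is simply looked up in the tendency library.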
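A minimal sketch of such a tendency library follows. It assumes the news has already been segmented into keywords, and it assumes that the raw counts are normalized into per-keyword shares; the application only says that occurrences are counted, so the normalization, the function name `build_tendency_library`, and the sample data are all illustrative.

```python
from collections import defaultdict


def build_tendency_library(labeled_news):
    """Build {keyword: {"positive": share, "negative": share}} from labeled history.

    `labeled_news` is an iterable of (keywords, label) pairs, label being
    "positive" or "negative". Normalizing counts to shares is an assumption.
    """
    counts = defaultdict(lambda: {"positive": 0, "negative": 0})
    for keywords, label in labeled_news:
        for kw in set(keywords):  # count each keyword once per news item
            counts[kw][label] += 1
    library = {}
    for kw, c in counts.items():
        total = c["positive"] + c["negative"]
        library[kw] = {label: c[label] / total for label in c}
    return library


# Hypothetical labeled history.
history = [
    (["profit", "growth"], "positive"),
    (["layoff", "loss"], "negative"),
    (["growth", "loss"], "positive"),
]
library = build_tendency_library(history)
print(library["growth"], library["loss"])
```

Looking up a keyword at prediction time is then a dictionary access, for example `library.get("growth")`.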
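Step S20: determine the feature vector corresponding to each keyword according to the tendency of the keyword;
It should be noted that determining the feature vector corresponding to a keyword according to its tendency means using the tendency values of the keyword as the corresponding weights to construct the feature vector. The dimension of the feature vector may depend on how the tendency is divided; for example, if the tendency is divided into a positive tendency, a negative tendency, and a neutral tendency, the feature vector may be set to have at least three dimensions.
Step S30: determine the feature sequence of the public opinion news according to the feature vectors corresponding to the keywords;
It should be understood that a public opinion news item is composed of multiple keywords, and the step of determining the feature sequence of the news according to the feature vectors corresponding to the keywords may consist of combining those feature vectors to form the feature sequence. For example, if a news item contains m keywords, its feature sequence may be constructed as a 3×m or m×3 array, and the dimensions of the feature sequence may also be determined by the specific classification used.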
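Steps S20 and S30 can be sketched as follows, assuming a three-way tendency split and an m×3 layout. The dictionary layout, the function names, and the tendency values are hypothetical; skipping keywords that are missing from the library is also an assumption.

```python
def keyword_vector(tendency):
    """Use the three tendency values of a keyword as the weights of its feature vector."""
    return [tendency["positive"], tendency["negative"], tendency["neutral"]]


def feature_sequence(keywords, tendency_library):
    """Stack the keyword vectors of one news item into an m x 3 feature sequence.

    Keywords absent from the library are skipped (an assumption for this sketch).
    """
    return [keyword_vector(tendency_library[kw]) for kw in keywords if kw in tendency_library]


# Hypothetical tendency values for three keywords.
tendency_library = {
    "resignation": {"positive": 0.1, "negative": 0.7, "neutral": 0.2},
    "strategy": {"positive": 0.4, "negative": 0.1, "neutral": 0.5},
    "record": {"positive": 0.8, "negative": 0.1, "neutral": 0.1},
}
sequence = feature_sequence(["resignation", "strategy", "unknown", "record"], tendency_library)
print(len(sequence), len(sequence[0]))
```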
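Step S40: input the feature sequence of the public opinion news into the trained recurrent neural network model to determine the public opinion early-warning indicator;
It should be understood that the feature sequence of the public opinion news is used as the input of the recurrent neural network model, and the hidden layer of the recurrent neural network contains the hidden vectors of historical public opinion news. The recurrent neural network model yields the overall tendency of public opinion, which serves as the public opinion early-warning indicator.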
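As an illustration of how a recurrent model can turn a feature sequence into a single indicator, the sketch below runs the forward pass of a plain Elman-style RNN in pure Python. The architecture (tanh hidden layer, sigmoid output), the function name `rnn_indicator`, and the weight values are assumptions; the application does not specify them, and real weights would come from training.

```python
import math


def rnn_indicator(sequence, w_in, w_rec, w_out, h0=None):
    """Forward pass of a simple recurrent network over one feature sequence.

    Returns (indicator, final_hidden_state). Passing the returned hidden state as
    `h0` for the next news item carries history forward.
    """
    hidden_size = len(w_rec)
    h = list(h0) if h0 is not None else [0.0] * hidden_size
    for x in sequence:
        # New hidden state from the current keyword vector and the previous state.
        h = [
            math.tanh(
                sum(w_in[j][i] * x[i] for i in range(len(x)))
                + sum(w_rec[j][k] * h[k] for k in range(hidden_size))
            )
            for j in range(hidden_size)
        ]
    score = sum(w_out[j] * h[j] for j in range(hidden_size))
    return 1.0 / (1.0 + math.exp(-score)), h


# Hypothetical weights: 3 input features, 2 hidden units.
w_in = [[0.5, -0.8, 0.1], [-0.3, 0.9, 0.2]]
w_rec = [[0.2, 0.0], [0.0, 0.2]]
w_out = [-1.0, 1.5]
news = [[0.1, 0.7, 0.2], [0.4, 0.1, 0.5]]
indicator, hidden = rnn_indicator(news, w_in, w_rec, w_out)
print(round(indicator, 3))
```

The sigmoid keeps the indicator between 0 and 1, which makes it straightforward to compare against a preset threshold.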
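In addition, at a given moment or within a given period, the regional scope affected by public opinion news also varies, for example national, provincial, or municipal. Therefore, when the number of public opinion news items is considered, the regional scope of the news may be used as a correction value, denoted t1.
Because media coverage also has a considerable effect on the number of public opinion news items, the exposure of the news is also considered as a correction value, denoted t2.
The circulation of public opinion news also reflects how the information spreads on the network and how intensely it is discussed, so the circulation of the news may also be used as a correction value, denoted t3. The correction value t3 may be user-defined. For example, the circulation of news about the departure of senior company executives is not necessarily large, but enterprises tend to pay close attention to this topic, so t3 can be adjusted to compensate.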
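The application does not state how t1, t2, and t3 are combined with the news count. One possible reading, shown purely as an assumption, treats them as multiplicative factors; the function name and all numbers below are hypothetical.

```python
def corrected_count(news_count, t1=1.0, t2=1.0, t3=1.0):
    """Scale a raw news count by three correction values.

    Assumption: the corrections are multiplicative. t1 reflects regional scope,
    t2 media exposure, t3 circulation (t3 may be tuned by the user per topic).
    """
    return news_count * t1 * t2 * t3


# Hypothetical example: an executive-departure topic with modest circulation
# that the enterprise wants to weight more heavily through t3.
adjusted = corrected_count(120, t1=1.5, t2=1.0, t3=2.0)
print(adjusted)
```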
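The recurrent neural network model may be trained by crawling public opinion news data from the network, initializing the parameters of the model, and computing the model parameters from the keyword data in the news data and the recurrent neural network model. Training methods well known in the art may also be used.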
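Step S50: issue a public opinion early warning according to the public opinion early-warning indicator.
In a specific implementation, the early warning may be issued when the public opinion early-warning indicator is greater than a preset threshold. The preset threshold may be user-defined or set in advance according to the topic of the public opinion news. The early warning may be issued in various ways, for example by phone call, text message, or email, or in a user-defined manner.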
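The threshold rule in step S50 can be sketched as below. The channel names and the message text are placeholders, and the function returns the notifications it would send rather than calling any real messaging service.

```python
def issue_alert(indicator, threshold, channels):
    """Return (channel, message) pairs when the indicator is strictly above the threshold."""
    if indicator <= threshold:
        return []
    message = (
        f"Public opinion early warning: indicator {indicator:.2f} "
        f"exceeds threshold {threshold:.2f}"
    )
    return [(channel, message) for channel in channels]


# Hypothetical channels and threshold.
alerts = issue_alert(0.82, 0.60, ["sms", "email"])
no_alerts = issue_alert(0.60, 0.60, ["sms", "email"])
print(alerts)
```

Using a strict comparison follows the wording "greater than a preset threshold"; a topic-specific threshold can be passed in per call.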
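This application acquires public opinion news within a preset time and determines the tendency of the keywords in the news, determines the feature vector corresponding to each keyword according to its tendency, determines the feature sequence of the news according to those feature vectors, inputs the feature sequence into the trained recurrent neural network model to determine the public opinion early-warning indicator, and issues a public opinion early warning according to that indicator. The direction of public opinion can thus be judged accurately, which solves the technical problem that the prior art predicts the development trend of public opinion poorly.
Referring to Figure 3, Figure 3 is a schematic flowchart of the second embodiment of the public opinion early-warning method based on a recurrent neural network algorithm of this application.
Based on the first embodiment above, in this embodiment step S10 includes:
Step S101: acquire public opinion news within a preset time and a pre-established keyword library, and determine the tendency of the keywords in the public opinion news.
It should be noted that the keyword library may be divided into a positive keyword set, a negative keyword set, and a neutral keyword set, or into a positive keyword set and a negative keyword set; the specific classification is set as required.
The pre-established keyword library may be built from labeled public opinion news: keywords appearing in news labeled as positive are placed in the positive keyword set, keywords appearing in news labeled as negative are placed in the negative keyword set, and keywords appearing in news labeled as neutral are placed in the neutral keyword set. The keywords in each set may also be defined by the user based on experience.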
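Building the keyword library from labeled news can be sketched as follows. The label strings, the function name, and the sample items are assumptions for illustration; note that with this simple rule a keyword may end up in more than one set.

```python
def build_keyword_library(labeled_news):
    """Place each keyword into the set matching the label of the news it appears in."""
    library = {"positive": set(), "negative": set(), "neutral": set()}
    for keywords, label in labeled_news:
        library[label].update(keywords)
    return library


# Hypothetical labeled news, already segmented into keywords.
labeled = [
    (["record", "profit"], "positive"),
    (["resignation", "lawsuit"], "negative"),
    (["announcement", "profit"], "neutral"),
]
keyword_library = build_keyword_library(labeled)
print(sorted(keyword_library["positive"]))
```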
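Referring to Figure 4, Figure 4 is a schematic flowchart of the third embodiment of the public opinion early-warning method based on a recurrent neural network algorithm of this application.
Based on the second embodiment above, the tendency of a keyword includes a positive tendency, a negative tendency, and a neutral tendency, which are the probabilities that the keyword appears in positive news, negative news, and neutral news, respectively. In this embodiment, step S101 specifically includes:
Step S1011: establish a keyword library, the keyword library including a positive keyword set, a negative keyword set, and a neutral keyword set;
It should be noted that the keyword library may be classified according to specific needs; it may include a positive keyword set, a negative keyword set, and a neutral keyword set, or only a positive keyword set and a negative keyword set.
Step S1012: calculate the correlation between each keyword and the remaining keywords in each keyword set;
It should be noted that the tendency of a keyword can be determined by calculating its correlation with the remaining keywords in each keyword set. For example, given keyword A and the positive keyword set {A, B, C, D}, the positive tendency of A is determined by calculating the correlation of A with B, C, and D.
According to the formula
Figure PCTCN2019122787-appb-000001
the correlation between the keyword and the remaining keywords in the positive keyword set is calculated;
where n is the number of keywords in the positive keyword set;
rec(w,v) is the correlation between the two keywords w and v;
P is the positive keyword set;
Figure PCTCN2019122787-appb-000002
p(w) is the probability that keyword w appears in a document,
p(v) is the probability that keyword v appears in a document;
p(w,v) is the probability that w and v appear together in a document.
Preferably, according to the formula
Figure PCTCN2019122787-appb-000003
the correlation between the keyword and the remaining keywords in the negative keyword set is calculated;
where m is the number of keywords in the negative keyword set;
rec(w,v) is the correlation between the two keywords w and v;
Q is the negative keyword set;
Figure PCTCN2019122787-appb-000004
p(w) is the probability that keyword w appears in a document,
p(v) is the probability that keyword v appears in a document;
p(w,v) is the probability that w and v appear together in a document.
Preferably, according to the formula
Figure PCTCN2019122787-appb-000005
the correlation between the keyword and the remaining keywords in the neutral keyword set is calculated;
where k is the number of keywords in the neutral keyword set;
rec(w,v) is the correlation between the two keywords w and v;
M is the neutral keyword set;
Figure PCTCN2019122787-appb-000006
p(w) is the probability that keyword w appears in a document,
p(v) is the probability that keyword v appears in a document;
p(w,v) is the probability that w and v appear together in a document.
Step S1013: calculate the positive tendency, negative tendency, and neutral tendency of each keyword according to its correlation with the remaining keywords in each keyword set.
In a specific implementation, the mean of the correlations between a keyword and the remaining keywords in a given keyword set may be used as the corresponding tendency of that keyword.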
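The formulas themselves appear only as figures, so the sketch below should be read as one plausible instantiation rather than the formulas of the application. It assumes that rec(w,v) is pointwise mutual information, log(p(w,v) / (p(w)·p(v))), estimated from document frequencies, that pairs which never co-occur score 0, and that the set-level correlation is the mean over the remaining keywords, as the implementation note above suggests.

```python
import math


def rec(w, v, documents):
    """Correlation of two keywords, ASSUMED here to be pointwise mutual information.

    `documents` is a list of keyword sets; probabilities are document frequencies.
    Pairs that never co-occur return 0.0 (an assumption made to avoid log(0)).
    """
    n_docs = len(documents)
    p_w = sum(1 for d in documents if w in d) / n_docs
    p_v = sum(1 for d in documents if v in d) / n_docs
    p_wv = sum(1 for d in documents if w in d and v in d) / n_docs
    if p_wv == 0.0:
        return 0.0
    return math.log(p_wv / (p_w * p_v))


def rel(w, keyword_set, documents):
    """Mean correlation between w and the remaining keywords of one keyword set."""
    others = [v for v in keyword_set if v != w]
    if not others:
        return 0.0
    return sum(rec(w, v, documents) for v in others) / len(others)


# Hypothetical corpus and positive keyword set {A, B, C, D}.
docs = [{"A", "B"}, {"A", "B", "C"}, {"C", "D"}, {"D"}]
positive_set = {"A", "B", "C", "D"}
rel1_of_A = rel("A", positive_set, docs)
print(rel1_of_A)
```

In this toy corpus A always appears with B, is statistically independent of C, and never appears with D, so only the A and B pair contributes to the mean.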
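Referring to Figure 5, Figure 5 is a schematic flowchart of the fourth embodiment of the public opinion early-warning method based on a recurrent neural network algorithm of this application.
Based on the third embodiment above, in this embodiment step S1013 specifically includes:
Step S1013a: take the difference among the correlation between the keyword and the remaining keywords in the positive keyword set, its correlation with the remaining keywords in the negative keyword set, and its correlation with the remaining keywords in the neutral keyword set as the positive tendency;
It should be noted that steps S1013a, S1013b, and S1013c have no required order: step S1013c may come first with steps S1013a and S1013b after it, or the steps may be performed at the same time, so no specific limitation is imposed here.
In a specific implementation, the positive tendency of a keyword = its correlation with the remaining keywords in the positive keyword set - its correlation with the remaining keywords in the negative keyword set - its correlation with the remaining keywords in the neutral keyword set, that is, rel1-rel2-rel3.
Typically, the positive tendency of a keyword = the mean of its correlations with the remaining keywords in the positive keyword set - the mean of its correlations with the remaining keywords in the negative keyword set - the mean of its correlations with the remaining keywords in the neutral keyword set.
Step S1013b: take the difference among the correlation between the keyword and the remaining keywords in the negative keyword set, its correlation with the remaining keywords in the positive keyword set, and its correlation with the remaining keywords in the neutral keyword set as the negative tendency;
In a specific implementation, the negative tendency of a keyword = its correlation with the remaining keywords in the negative keyword set - its correlation with the remaining keywords in the positive keyword set - its correlation with the remaining keywords in the neutral keyword set, that is, rel2-rel1-rel3.
Typically, the negative tendency of a keyword = the mean of its correlations with the remaining keywords in the negative keyword set - the mean of its correlations with the remaining keywords in the positive keyword set - the mean of its correlations with the remaining keywords in the neutral keyword set.
Step S1013c: take the difference among the correlation between the keyword and the remaining keywords in the neutral keyword set, its correlation with the remaining keywords in the positive keyword set, and its correlation with the remaining keywords in the negative keyword set as the neutral tendency.
In a specific implementation, the neutral tendency of a keyword = its correlation with the remaining keywords in the neutral keyword set - its correlation with the remaining keywords in the positive keyword set - its correlation with the remaining keywords in the negative keyword set, that is, rel3-rel1-rel2.
Typically, the neutral tendency of a keyword = the mean of its correlations with the remaining keywords in the neutral keyword set - the mean of its correlations with the remaining keywords in the positive keyword set - the mean of its correlations with the remaining keywords in the negative keyword set.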
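The three differences rel1-rel2-rel3, rel2-rel1-rel3, and rel3-rel1-rel2 translate directly into code. The function name and the sample correlation values are hypothetical; rel1, rel2, and rel3 stand for the keyword's (mean) correlation with the positive, negative, and neutral keyword sets.

```python
def tendencies(rel1, rel2, rel3):
    """Compute the three tendencies of a keyword from its set-level correlations."""
    return {
        "positive": rel1 - rel2 - rel3,  # step S1013a
        "negative": rel2 - rel1 - rel3,  # step S1013b
        "neutral": rel3 - rel1 - rel2,   # step S1013c
    }


# Hypothetical correlation values for one keyword.
result = tendencies(rel1=0.75, rel2=0.25, rel3=0.125)
print(result)
```

Because the three values are independent of one another, they can be computed in any order or in parallel, as the description notes.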
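In addition, an embodiment of this application further provides a storage medium, which may be a non-volatile readable storage medium.
The storage medium of this application stores computer-readable instructions which, when executed by a processor, implement the steps of the public opinion early-warning method based on a recurrent neural network algorithm described above.
For the method implemented when the computer-readable instructions are executed, reference may be made to the embodiments of the public opinion early-warning method based on a recurrent neural network algorithm of this application; details are not repeated here.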
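Referring to Figure 6, Figure 6 is a structural block diagram of the first embodiment of the public opinion early-warning apparatus based on a recurrent neural network algorithm of this application.
As shown in Figure 6, the public opinion early-warning apparatus based on a recurrent neural network algorithm provided by the embodiment of this application includes:
a public opinion acquisition module 601, configured to acquire public opinion news within a preset time and determine the tendency of the keywords in the public opinion news;
It should be noted that public opinion news is a form of online public opinion that spreads through network platforms. It may be published through web pages, third-party software, plug-ins, and the like. It may be acquired through an API or by means such as a web crawler; no specific limitation is imposed here.
The tendency of a keyword may be divided into a positive tendency and a negative tendency, or into a positive tendency, a negative tendency, and a neutral tendency. The positive tendency of a keyword is the degree to which it expresses a positive evaluation, the negative tendency is the degree to which it expresses a negative evaluation, and the neutral tendency is the degree to which it expresses a neutral evaluation.
a vector establishment module 602, configured to determine the feature vector corresponding to each keyword according to the tendency of the keyword;
It should be noted that determining the feature vector corresponding to a keyword according to its tendency means using the tendency values of the keyword as the corresponding weights to construct the feature vector. The dimension of the feature vector may depend on how the tendency is divided; for example, if the tendency is divided into a positive tendency, a negative tendency, and a neutral tendency, the feature vector may be set to have at least three dimensions.
a sequence determination module 603, configured to determine the feature sequence of the public opinion news according to the feature vectors corresponding to the keywords;
It should be understood that a public opinion news item is composed of multiple keywords, and determining the feature sequence of the news according to the feature vectors corresponding to the keywords may consist of combining those feature vectors to form the feature sequence. For example, if a news item contains m keywords, its feature sequence may be constructed as a 3×m or m×3 array, and the dimensions of the feature sequence may also be determined by the specific classification used.
an indicator determination module 604, configured to input the feature sequence of the public opinion news into the trained recurrent neural network model to determine the public opinion early-warning indicator;
It should be understood that the feature sequence of the public opinion news is used as the input of the recurrent neural network model, and the hidden layer of the recurrent neural network contains the hidden vectors of historical public opinion news. The recurrent neural network model yields the overall tendency of public opinion, which serves as the public opinion early-warning indicator.
In addition, at a given moment or within a given period, the regional scope affected by public opinion news also varies, for example national, provincial, or municipal. Therefore, when the number of public opinion news items is considered, the regional scope of the news may be used as a correction value, denoted t1.
Because media coverage also has a considerable effect on the number of public opinion news items, the exposure of the news is also considered as a correction value, denoted t2.
The circulation of public opinion news also reflects how the information spreads on the network and how intensely it is discussed, so the circulation of the news may also be used as a correction value, denoted t3. The correction value t3 may be user-defined. For example, the circulation of news about the departure of senior company executives is not necessarily large, but enterprises tend to pay close attention to this topic, so t3 can be adjusted to compensate.
a warning issuing module 605, configured to issue a public opinion early warning according to the public opinion early-warning indicator.
In a specific implementation, the early warning may be issued when the public opinion early-warning indicator is greater than a preset threshold. The preset threshold may be user-defined or set in advance according to the topic of the public opinion news. The early warning may be issued in various ways, for example by phone call, text message, or email, or in a user-defined manner.
This application acquires public opinion news within a preset time and determines the tendency of the keywords in the news, determines the feature vector corresponding to each keyword according to its tendency, determines the feature sequence of the news according to those feature vectors, inputs the feature sequence into the trained recurrent neural network model to determine the public opinion early-warning indicator, and issues a public opinion early warning according to that indicator. The direction of public opinion can thus be judged accurately, which solves the technical problem that the prior art predicts the development trend of public opinion poorly.
For other embodiments or specific implementations of the public opinion early-warning apparatus based on a recurrent neural network algorithm of this application, reference may be made to the method embodiments above; details are not repeated here.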
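It should be noted that, in this document, the terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or system that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or system. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of additional identical elements in the process, method, article, or system that includes the element.
The serial numbers of the embodiments of this application are for description only and do not indicate the relative merits of the embodiments.
From the description of the above implementations, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as read-only memory/random access memory, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of this application.
The above are only preferred embodiments of this application and do not thereby limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of this application.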

Claims (20)

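  1. A public opinion early-warning method based on a recurrent neural network algorithm, comprising the following steps:
    acquiring public opinion news within a preset time, and determining the tendency of the keywords in the public opinion news;
    determining the feature vector corresponding to each keyword according to the tendency of the keyword;
    determining the feature sequence of the public opinion news according to the feature vectors corresponding to the keywords;
    inputting the feature sequence of the public opinion news into a trained recurrent neural network model to determine a public opinion early-warning indicator;
    issuing a public opinion early warning according to the public opinion early-warning indicator;
    wherein the step of acquiring public opinion news within a preset time and determining the tendency of the keywords in the public opinion news comprises:
    acquiring public opinion news within a preset time and a pre-established keyword library, and determining the tendency of the keywords in the public opinion news;
    wherein the tendency of a keyword comprises a positive tendency, a negative tendency, and a neutral tendency, which are the probabilities that the keyword appears in positive news, negative news, and neutral news, respectively;
    correspondingly, before the step of acquiring public opinion news within a preset time and a pre-established keyword library and determining the tendency of the keywords in the public opinion news, the public opinion early-warning method based on a recurrent neural network algorithm further comprises the following steps:
    establishing a keyword library, the keyword library comprising a positive keyword set, a negative keyword set, and a neutral keyword set;
    calculating the correlation between each keyword and the remaining keywords in each keyword set;
    calculating the positive tendency, negative tendency, and neutral tendency of the keyword according to the correlation between each keyword and the remaining keywords in each keyword set.
  2. The public opinion early-warning method based on a recurrent neural network algorithm according to claim 1, wherein the step of calculating the positive tendency, negative tendency, and neutral tendency of the keyword according to the correlation between each keyword and the remaining keywords in each keyword set comprises:
    taking the difference among the correlation between the keyword and the remaining keywords in the positive keyword set, its correlation with the remaining keywords in the negative keyword set, and its correlation with the remaining keywords in the neutral keyword set as the positive tendency;
    taking the difference among the correlation between the keyword and the remaining keywords in the negative keyword set, its correlation with the remaining keywords in the positive keyword set, and its correlation with the remaining keywords in the neutral keyword set as the negative tendency;
    taking the difference among the correlation between the keyword and the remaining keywords in the neutral keyword set, its correlation with the remaining keywords in the positive keyword set, and its correlation with the remaining keywords in the negative keyword set as the neutral tendency.
  3. The public opinion early-warning method based on a recurrent neural network algorithm according to claim 1, wherein the step of calculating the correlation between each keyword and the remaining keywords in each keyword set comprises:
    according to the formula
    Figure PCTCN2019122787-appb-100001
    calculating the correlation between the keyword and the remaining keywords in the positive keyword set;
    where n is the number of keywords in the positive keyword set;
    rec(w,v) is the correlation between the two keywords w and v;
    P is the positive keyword set;
    Figure PCTCN2019122787-appb-100002
    p(w) is the probability that keyword w appears in a document,
    p(v) is the probability that keyword v appears in a document;
    p(w,v) is the probability that w and v appear together in a document.
  4. The public opinion early-warning method based on a recurrent neural network algorithm according to claim 1, wherein the step of calculating the correlation between each keyword and the remaining keywords in each keyword set comprises:
    according to the formula
    Figure PCTCN2019122787-appb-100003
    calculating the correlation between the keyword and the remaining keywords in the negative keyword set;
    where m is the number of keywords in the negative keyword set;
    rec(w,v) is the correlation between the two keywords w and v;
    Q is the negative keyword set;
    Figure PCTCN2019122787-appb-100004
    p(w) is the probability that keyword w appears in a document,
    p(v) is the probability that keyword v appears in a document;
    p(w,v) is the probability that w and v appear together in a document.
  5. The public opinion early-warning method based on a recurrent neural network algorithm according to claim 1, wherein the step of calculating the correlation between each keyword and the remaining keywords in each keyword set comprises:
    according to the formula
    Figure PCTCN2019122787-appb-100005
    calculating the correlation between the keyword and the remaining keywords in the neutral keyword set;
    where k is the number of keywords in the neutral keyword set;
    rec(w,v) is the correlation between the two keywords w and v;
    M is the neutral keyword set;
    Figure PCTCN2019122787-appb-100006
    p(w) is the probability that keyword w appears in a document,
    p(v) is the probability that keyword v appears in a document;
    p(w,v) is the probability that w and v appear together in a document.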
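  6. A public opinion early-warning apparatus based on a recurrent neural network algorithm, comprising:
    a public opinion acquisition module, configured to acquire public opinion news within a preset time and determine the tendency of the keywords in the public opinion news;
    a vector establishment module, configured to determine the feature vector corresponding to each keyword according to the tendency of the keyword;
    a sequence determination module, configured to determine the feature sequence of the public opinion news according to the feature vectors corresponding to the keywords;
    an indicator determination module, configured to input the feature sequence of the public opinion news into a trained recurrent neural network model to determine a public opinion early-warning indicator;
    a warning issuing module, configured to issue a public opinion early warning according to the public opinion early-warning indicator;
    wherein the public opinion acquisition module is further configured to acquire public opinion news within a preset time and a pre-established keyword library, and determine the tendency of the keywords in the public opinion news;
    wherein the tendency of a keyword comprises a positive tendency, a negative tendency, and a neutral tendency, which are the probabilities that the keyword appears in positive news, negative news, and neutral news, respectively;
    the public opinion acquisition module is further configured to establish a keyword library, the keyword library comprising a positive keyword set, a negative keyword set, and a neutral keyword set;
    calculate the correlation between each keyword and the remaining keywords in each keyword set;
    calculate the positive tendency, negative tendency, and neutral tendency of the keyword according to the correlation between each keyword and the remaining keywords in each keyword set.
  7. The public opinion early-warning apparatus based on a recurrent neural network algorithm according to claim 6, wherein the public opinion acquisition module is further configured to take the difference among the correlation between the keyword and the remaining keywords in the positive keyword set, its correlation with the remaining keywords in the negative keyword set, and its correlation with the remaining keywords in the neutral keyword set as the positive tendency;
    take the difference among the correlation between the keyword and the remaining keywords in the negative keyword set, its correlation with the remaining keywords in the positive keyword set, and its correlation with the remaining keywords in the neutral keyword set as the negative tendency;
    take the difference among the correlation between the keyword and the remaining keywords in the neutral keyword set, its correlation with the remaining keywords in the positive keyword set, and its correlation with the remaining keywords in the negative keyword set as the neutral tendency.
  8. The public opinion early-warning apparatus based on a recurrent neural network algorithm according to claim 6, wherein the public opinion acquisition module is further configured to, according to the formula
    Figure PCTCN2019122787-appb-100007
    calculate the correlation between the keyword and the remaining keywords in the positive keyword set;
    where n is the number of keywords in the positive keyword set;
    rec(w,v) is the correlation between the two keywords w and v;
    P is the positive keyword set;
    Figure PCTCN2019122787-appb-100008
    p(w) is the probability that keyword w appears in a document,
    p(v) is the probability that keyword v appears in a document;
    p(w,v) is the probability that w and v appear together in a document.
  9. The public opinion early-warning apparatus based on a recurrent neural network algorithm according to claim 6, wherein the public opinion acquisition module is further configured to, according to the formula
    Figure PCTCN2019122787-appb-100009
    calculate the correlation between the keyword and the remaining keywords in the negative keyword set;
    where m is the number of keywords in the negative keyword set;
    rec(w,v) is the correlation between the two keywords w and v;
    Q is the negative keyword set;
    Figure PCTCN2019122787-appb-100010
    p(w) is the probability that keyword w appears in a document,
    p(v) is the probability that keyword v appears in a document;
    p(w,v) is the probability that w and v appear together in a document.
  10. The public opinion early-warning apparatus based on a recurrent neural network algorithm according to claim 6, wherein the public opinion acquisition module is further configured to, according to the formula
    Figure PCTCN2019122787-appb-100011
    calculate the correlation between the keyword and the remaining keywords in the neutral keyword set;
    where k is the number of keywords in the neutral keyword set;
    rec(w,v) is the correlation between the two keywords w and v;
    M is the neutral keyword set;
    Figure PCTCN2019122787-appb-100012
    p(w) is the probability that keyword w appears in a document,
    p(v) is the probability that keyword v appears in a document;
    p(w,v) is the probability that w and v appear together in a document.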
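  11. A terminal, comprising: a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the computer-readable instructions, when executed by the processor, implement the following steps:
    acquiring public opinion news within a preset time, and determining the tendency of the keywords in the public opinion news;
    determining the feature vector corresponding to each keyword according to the tendency of the keyword;
    determining the feature sequence of the public opinion news according to the feature vectors corresponding to the keywords;
    inputting the feature sequence of the public opinion news into a trained recurrent neural network model to determine a public opinion early-warning indicator;
    issuing a public opinion early warning according to the public opinion early-warning indicator;
    wherein the step of acquiring public opinion news within a preset time and determining the tendency of the keywords in the public opinion news comprises:
    acquiring public opinion news within a preset time and a pre-established keyword library, and determining the tendency of the keywords in the public opinion news;
    wherein the tendency of a keyword comprises a positive tendency, a negative tendency, and a neutral tendency, which are the probabilities that the keyword appears in positive news, negative news, and neutral news, respectively;
    correspondingly, before the step of acquiring public opinion news within a preset time and a pre-established keyword library and determining the tendency of the keywords in the public opinion news, the processor is further configured to perform the following steps:
    establishing a keyword library, the keyword library comprising a positive keyword set, a negative keyword set, and a neutral keyword set;
    calculating the correlation between each keyword and the remaining keywords in each keyword set;
    calculating the positive tendency, negative tendency, and neutral tendency of the keyword according to the correlation between each keyword and the remaining keywords in each keyword set.
  12. The terminal according to claim 11, wherein the step of calculating the positive tendency, negative tendency, and neutral tendency of the keyword according to the correlation between each keyword and the remaining keywords in each keyword set comprises:
    taking the difference among the correlation between the keyword and the remaining keywords in the positive keyword set, its correlation with the remaining keywords in the negative keyword set, and its correlation with the remaining keywords in the neutral keyword set as the positive tendency;
    taking the difference among the correlation between the keyword and the remaining keywords in the negative keyword set, its correlation with the remaining keywords in the positive keyword set, and its correlation with the remaining keywords in the neutral keyword set as the negative tendency;
    taking the difference among the correlation between the keyword and the remaining keywords in the neutral keyword set, its correlation with the remaining keywords in the positive keyword set, and its correlation with the remaining keywords in the negative keyword set as the neutral tendency.
  13. The terminal according to claim 11, wherein the step of calculating the correlation between each keyword and the remaining keywords in each keyword set comprises:
    according to the formula
    Figure PCTCN2019122787-appb-100013
    calculating the correlation between the keyword and the remaining keywords in the positive keyword set;
    where n is the number of keywords in the positive keyword set;
    rec(w,v) is the correlation between the two keywords w and v;
    P is the positive keyword set;
    Figure PCTCN2019122787-appb-100014
    p(w) is the probability that keyword w appears in a document,
    p(v) is the probability that keyword v appears in a document;
    p(w,v) is the probability that w and v appear together in a document.
  14. The terminal according to claim 11, wherein the step of calculating the correlation between each keyword and the remaining keywords in each keyword set comprises:
    according to the formula
    Figure PCTCN2019122787-appb-100015
    calculating the correlation between the keyword and the remaining keywords in the negative keyword set;
    where m is the number of keywords in the negative keyword set;
    rec(w,v) is the correlation between the two keywords w and v;
    Q is the negative keyword set;
    Figure PCTCN2019122787-appb-100016
    p(w) is the probability that keyword w appears in a document,
    p(v) is the probability that keyword v appears in a document;
    p(w,v) is the probability that w and v appear together in a document.
  15. The terminal according to claim 11, wherein the step of calculating the correlation between each keyword and the remaining keywords in each keyword set comprises:
    according to the formula
    Figure PCTCN2019122787-appb-100017
    calculating the correlation between the keyword and the remaining keywords in the neutral keyword set;
    where k is the number of keywords in the neutral keyword set;
    rec(w,v) is the correlation between the two keywords w and v;
    M is the neutral keyword set;
    Figure PCTCN2019122787-appb-100018
    p(w) is the probability that keyword w appears in a document,
    p(v) is the probability that keyword v appears in a document;
    p(w,v) is the probability that w and v appear together in a document.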
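  16. A storage medium storing computer-readable instructions, wherein the computer-readable instructions, when executed by a processor, implement the following steps:
    acquiring public opinion news within a preset time, and determining the tendency of the keywords in the public opinion news;
    determining the feature vector corresponding to each keyword according to the tendency of the keyword;
    determining the feature sequence of the public opinion news according to the feature vectors corresponding to the keywords;
    inputting the feature sequence of the public opinion news into a trained recurrent neural network model to determine a public opinion early-warning indicator;
    issuing a public opinion early warning according to the public opinion early-warning indicator;
    wherein the step of acquiring public opinion news within a preset time and determining the tendency of the keywords in the public opinion news comprises:
    acquiring public opinion news within a preset time and a pre-established keyword library, and determining the tendency of the keywords in the public opinion news;
    wherein the tendency of a keyword comprises a positive tendency, a negative tendency, and a neutral tendency, which are the probabilities that the keyword appears in positive news, negative news, and neutral news, respectively;
    correspondingly, before the step of acquiring public opinion news within a preset time and a pre-established keyword library and determining the tendency of the keywords in the public opinion news, the processor is further configured to perform the following steps:
    establishing a keyword library, the keyword library comprising a positive keyword set, a negative keyword set, and a neutral keyword set;
    calculating the correlation between each keyword and the remaining keywords in each keyword set;
    calculating the positive tendency, negative tendency, and neutral tendency of the keyword according to the correlation between each keyword and the remaining keywords in each keyword set.
  17. The storage medium according to claim 16, wherein the step of calculating the positive tendency, negative tendency, and neutral tendency of the keyword according to the correlation between each keyword and the remaining keywords in each keyword set comprises:
    taking the difference among the correlation between the keyword and the remaining keywords in the positive keyword set, its correlation with the remaining keywords in the negative keyword set, and its correlation with the remaining keywords in the neutral keyword set as the positive tendency;
    taking the difference among the correlation between the keyword and the remaining keywords in the negative keyword set, its correlation with the remaining keywords in the positive keyword set, and its correlation with the remaining keywords in the neutral keyword set as the negative tendency;
    taking the difference among the correlation between the keyword and the remaining keywords in the neutral keyword set, its correlation with the remaining keywords in the positive keyword set, and its correlation with the remaining keywords in the negative keyword set as the neutral tendency.
  18. The storage medium according to claim 16, wherein the step of calculating the correlation between each keyword and the remaining keywords in each keyword set comprises:
    according to the formula
    Figure PCTCN2019122787-appb-100019
    calculating the correlation between the keyword and the remaining keywords in the positive keyword set;
    where n is the number of keywords in the positive keyword set;
    rec(w,v) is the correlation between the two keywords w and v;
    P is the positive keyword set;
    Figure PCTCN2019122787-appb-100020
    p(w) is the probability that keyword w appears in a document,
    p(v) is the probability that keyword v appears in a document;
    p(w,v) is the probability that w and v appear together in a document.
  19. The storage medium according to claim 16, wherein the step of calculating the correlation between each keyword and the remaining keywords in each keyword set comprises:
    according to the formula
    Figure PCTCN2019122787-appb-100021
    calculating the correlation between the keyword and the remaining keywords in the negative keyword set;
    where m is the number of keywords in the negative keyword set;
    rec(w,v) is the correlation between the two keywords w and v;
    Q is the negative keyword set;
    Figure PCTCN2019122787-appb-100022
    p(w) is the probability that keyword w appears in a document,
    p(v) is the probability that keyword v appears in a document;
    p(w,v) is the probability that w and v appear together in a document.
  20. The storage medium according to claim 16, wherein the step of calculating the correlation between each keyword and the remaining keywords in each keyword set comprises:
    according to the formula
    Figure PCTCN2019122787-appb-100023
    calculating the correlation between the keyword and the remaining keywords in the neutral keyword set;
    where k is the number of keywords in the neutral keyword set;
    rec(w,v) is the correlation between the two keywords w and v;
    M is the neutral keyword set;
    Figure PCTCN2019122787-appb-100024
    p(w) is the probability that keyword w appears in a document,
    p(v) is the probability that keyword v appears in a document;
    p(w,v) is the probability that w and v appear together in a document.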
PCT/CN2019/122787 2018-12-14 2019-12-03 Public opinion early-warning method, apparatus, terminal, and medium based on a recurrent neural network algorithm WO2020119533A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811530781.XA CN109800302A (zh) 2018-12-14 2018-12-14 Public opinion early-warning method, apparatus, terminal, and medium based on a recurrent neural network algorithm
CN201811530781.X 2018-12-14

Publications (1)

Publication Number Publication Date
WO2020119533A1 true WO2020119533A1 (zh) 2020-06-18

Family

ID=66556615

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/122787 WO2020119533A1 (zh) 2018-12-14 2019-12-03 Public opinion early-warning method, apparatus, terminal, and medium based on a recurrent neural network algorithm

Country Status (2)

Country Link
CN (1) CN109800302A (zh)
WO (1) WO2020119533A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
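CN109800302A (zh) * 2018-12-14 2019-05-24 深圳壹账通智能科技有限公司 Public opinion early-warning method, apparatus, terminal, and medium based on a recurrent neural network algorithm
CN112256974B (zh) * 2020-11-13 2023-11-17 泰康保险集团股份有限公司 Method and apparatus for processing public opinion information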

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
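KR20100098014A (ko) * 2009-02-27 2010-09-06 에스케이 텔레콤주식회사 Public opinion analysis apparatus and method for evaluating public opinion through document analysis
CN105589941A (zh) * 2015-12-15 2016-05-18 北京百分点信息科技有限公司 Method and apparatus for detecting sentiment information in web text
CN107066442A (zh) * 2017-02-15 2017-08-18 阿里巴巴集团控股有限公司 Sentiment value detection method, apparatus, and electronic device
CN108959383A (zh) * 2018-05-31 2018-12-07 平安科技(深圳)有限公司 Method and apparatus for analyzing online public opinion, and computer-readable storage medium
CN109800302A (zh) * 2018-12-14 2019-05-24 深圳壹账通智能科技有限公司 Public opinion early-warning method, apparatus, terminal, and medium based on a recurrent neural network algorithm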

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20090048998A (ko) * 2007-11-12 2009-05-15 주식회사 비즈모델라인 Method and system for notifying negative public opinion through keywords, and recording medium therefor
US20130290232A1 (en) * 2012-04-30 2013-10-31 Mikalai Tsytsarau Identifying news events that cause a shift in sentiment
CN104657393A (zh) * 2013-11-25 2015-05-27 深圳市至高通信技术发展有限公司 Public opinion analysis method and corresponding apparatus
CN108776671A (zh) * 2018-05-12 2018-11-09 苏州华必讯信息科技有限公司 Online public opinion monitoring system and method

Also Published As

Publication number Publication date
CN109800302A (zh) 2019-05-24

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19894501

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 29/09/2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19894501

Country of ref document: EP

Kind code of ref document: A1