CN109241430A - An election prediction method based on multi-source heterogeneous Internet data fusion - Google Patents

An election prediction method based on multi-source heterogeneous Internet data fusion

Info

Publication number
CN109241430A
Authority
CN
China
Prior art keywords
candidate
election
prediction
internet
index
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811038860.9A
Other languages
Chinese (zh)
Inventor
赵忠华
吴俊杰
解峥
袁翠欣
孙小宁
李欣
万欣欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
National Computer Network and Information Security Management Center
Original Assignee
Beihang University
National Computer Network and Information Security Management Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University and National Computer Network and Information Security Management Center
Priority to CN201811038860.9A
Publication of CN109241430A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 - Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 - Social networking
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 - Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 - Data mining

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an election prediction method based on the fusion of multi-source heterogeneous Internet data, belonging to the field of data mining. First, information sources that reflect public opinion trends in the country holding the election are screened from Internet data. Then, specific features are extracted from the selected Internet information sources to construct an Internet-platform-based system of prediction indices for candidate support rates. Finally, the extracted prediction indices are treated as signals reflecting public opinion and fused with a Kalman filter model to track and predict each candidate's support rate dynamically and in real time. The invention draws on a wide range of data sources and operates in real time, and has important application value in fields such as public opinion monitoring and opinion analysis.

Description

An election prediction method based on multi-source heterogeneous Internet data fusion
Technical field
The invention belongs to the field of data mining and relates to an election prediction method based on the fusion of multi-source heterogeneous Internet data.
Background art
Electoral systems have existed for more than a century. Because the prediction of general election results attracts attention from all sectors of society, many kinds of prediction methods and techniques have emerged.
Early election prediction relied on opinion polls. The polling bodies are typically survey organizations, major mainstream media, and university research institutes. They usually collect information on the basis of sampling survey theory, supplement it with expert corrections, and use the poll results to assess the political climate and derive a prediction. The advantage of poll-based prediction is that it is relatively timely: new information that affects public opinion close to the election can be incorporated into the result. However, owing to factors such as the survey method, the sample size, and the partisan leanings of the polling organization, poll results are often biased.
Later, some scholars and companies proposed prediction methods based on macro variables. These methods take national macroeconomic data into account and build a model to predict the vote share in a general election. Such models are easy to obtain and explain election results well. However, they are usually built on long-term historical data, so they are not timely and cannot incorporate the latest information close to the election; and when the candidates are closely matched, it is difficult for them to make an accurate prediction.
With the rapid development of Internet technology, information is growing explosively and election information is presented in increasingly diverse ways. The rich information contained in big data brings new approaches to election prediction. Elections in several countries have demonstrated the role of social networks such as Facebook and Twitter in predicting vote share. Election prediction methods based on Internet big data are more timely than polling and macro-variable methods, but most current methods are post hoc analyses that rely on a single social media data source and do not account for the diversity of the social media platforms that users participate in. As a result, the predicted candidate support rates often deviate considerably and fail to reflect election-related public opinion comprehensively.
Summary of the invention
To solve the above problems, the invention proposes a method for predicting election vote share, specifically an election prediction method based on the fusion of multi-source heterogeneous Internet data. Taking the support rate of each candidate as the prediction target, the method fuses multi-source heterogeneous big data from social media, search engines, and campaign homepages to overcome the bias that a single data source introduces when revealing public opinion, and thereby tracks and predicts candidate support rates in real time.
The election prediction method based on multi-source heterogeneous Internet data fusion comprises the following steps:
Step 1: information sources that reflect public opinion trends in the country holding the election are screened from Internet data.
The information sources are screened as follows:
First, for the country holding the election, the research reports issued by that country's Internet management and service organizations are consulted, and the Internet platforms widely used by Internet users are extracted from the reports.
Then, traffic statistics are gathered for the websites of these Internet platforms to obtain the website usage ranking for the country holding the election, and the most frequently used websites are selected.
Finally, among the most frequently used websites, only information sources with user-generated content, such as social networks and search engines, are retained. In addition, the candidates' campaign homepages are added to the candidate information sources, and traffic statistics websites are used to analyze how much attention the public pays to the different candidates' campaign websites.
Step 2: specific features are extracted from the selected Internet information sources to construct an Internet-platform-based system of prediction indices for candidate support rates.
The prediction indices comprise social network prediction indices, a search engine prediction index, and a candidate campaign homepage prediction index. The construction process is as follows:
(1) Social network prediction indices are constructed from two aspects, quantity and sentiment;
On the quantity side, the share of posts on the social network that mention a candidate serves as a prediction index.
Specifically, let $M^{i,t}$ be the number of posts on the social network platform that mention candidate i on day t; the mention-based support index $S_{count}^{i,t}$ that candidate i obtains on that platform on that day is computed as the candidate's share of such posts.
Alternatively, the average number of likes received per post by each candidate per day is taken as Internet users' support for that candidate.
Specifically, if candidate i publishes n posts on the social network platform on day t and post j receives $L_j^{i,t}$ likes, the like-based support index $S_{like}^{i,t}$ that candidate i obtains on that platform on that day is computed from the average number of likes per post, $\frac{1}{n}\sum_{j=1}^{n} L_j^{i,t}$.
On the sentiment side, sentiment classification is applied to the text on the social network, and the ratio of positive to negative sentiment is computed and used as a prediction index of Internet users' support for the candidate.
Specifically, if the social network contains $N^{i,t}$ posts about candidate i on day t, of which $N_{pos}^{i,t}$ are positive and $N_{neg}^{i,t}$ are negative, the text sentiment support index $S_{senti}^{i,t}$ of candidate i is computed from the ratio of positive to negative posts.
(2) The search engine prediction index is constructed;
First, the most widely used search engine in the country holding the election is chosen;
Then, the search volume $V^{i,t}$ for candidate i on day t is obtained, and the search engine attention index $S_{search}^{i,t}$ of candidate i on day t is computed from this search volume.
(3) The candidate campaign homepage prediction index is constructed;
Let $U^{i,t}$ be the number of IP visits to candidate i's campaign website on day t; the campaign homepage attention index $S_{IP}^{i,t}$ of candidate i on day t is computed from this visit count.
Step 3: the extracted prediction indices are treated as signals reflecting public opinion and fused with a Kalman filter model to track and predict each candidate's support rate dynamically and in real time.
The detailed process is as follows:
Step 301: the extracted prediction indices are smoothed with the moving average method to obtain a smoothed value $\bar{S}_c^{i,t}$ for each prediction index.
To predict candidate i's support rate on day t+1, the daily index values $S_c^{i,k}$, $c \in \{count, like, senti, search, IP\}$, are first computed for each day k from t-l to t, and the smoothed value of each prediction index is then computed by moving average:
$$\bar{S}_c^{i,t} = \frac{1}{l+1} \sum_{k=t-l}^{t} S_c^{i,k}$$
Step 302: the true state value $x_t^i$ of candidate i on day t is computed from the evolution of the public's state toward candidate i on day t-1:
$$x_t^i = x_{t-1}^i + B u_{t-1} + w_{t-1}$$
where B is the coefficient matrix of the control input; $u_{t-1}$ is the control input; and $w_{t-1}$ is the process noise vector, which follows a multivariate normal distribution with mean 0 and covariance matrix $Q_t$, $w_t \sim N(0, Q_t)$.
Step 303: at each time step, the smoothed prediction index values $\bar{S}_c^{i,t}$ are regarded as reflections of the true state value $x_t^i$, and a mapping is constructed between the measurement $z_t^i$ on day t and the true state value $x_t^i$.
The measurement is $z_t^i = (\bar{S}_{count}^{i,t}, \bar{S}_{like}^{i,t}, \bar{S}_{senti}^{i,t}, \bar{S}_{search}^{i,t}, \bar{S}_{IP}^{i,t})^T = H_t x_t^i + v_t$, where $H_t$ is the matrix mapping the true state value to the observed measurement and $v_t$ is Gaussian white measurement noise, which follows a multivariate normal distribution with mean 0 and covariance matrix $R_t$, $v_t \sim N(0, R_t)$. It is assumed that, during state evolution, the initial state $x_0^i$, the process noise $w_t$, and the measurement noise $v_t$ are mutually independent.
Step 304: once the measurement $z_t^i$ observed on day t is input into the Kalman filter model, the filter fuses the prior state estimate of the candidate's support rate for that day with the observation, weighted by the Kalman gain, to obtain the posterior state estimate $\hat{x}_{t|t}^i$ for that day:
$$\hat{x}_{t|t}^i = \hat{x}_{t|t-1}^i + K_t \left( z_t^i - H_t \hat{x}_{t|t-1}^i \right)$$
Here $\hat{x}_{t|t-1}^i$ denotes the estimate of candidate i's support rate on day t based on the observations of the preceding t-1 days, and $K_t$ is the Kalman gain, which weighs the prior state estimate against the measurement during fusion.
Step 305: the Kalman filter is updated with the posterior state estimate of day t and the state transition equation to obtain the state estimate of the next day's support rate.
Advantages of the invention: the election prediction method based on multi-source heterogeneous Internet data fusion takes into account the diversity of the Internet platforms that users rely on, draws on a wide range of data sources, operates in real time, and has important application value in fields such as public opinion monitoring and opinion analysis.
Brief description of the drawings
Fig. 1 is a flow chart of the election prediction method based on multi-source heterogeneous Internet data fusion according to the invention.
Fig. 2 is a flow chart of tracking and predicting candidate support rates dynamically and in real time after the extracted prediction indices are fused.
Detailed description of embodiments
The invention is described in further detail below with reference to the drawings and embodiments.
Big data is characterized by huge volume, varied data types, low value density, and the need for fast processing. Considering the broad participation of users on Internet platforms, the invention proposes, for events such as elections, a method of mining public opinion from Internet platforms. Considering also the diversity of the Internet platforms that users rely on, it proposes a candidate support rate prediction method that fuses multi-source heterogeneous data with a Kalman filter model. The method first considers Internet usage in the country holding the election and, from the many disparate Internet platforms, screens out the information sources that reflect public opinion trends in that country. Next, for each selected information source, the invention proposes a method for extracting public opinion prediction indices and builds an Internet-platform-based system of prediction indices for candidate support rates. Finally, the extracted indices are treated as signals reflecting public opinion, and a signal processing model, the Kalman filter, fuses the multi-source heterogeneous prediction indices in real time to track and predict each candidate's vote share dynamically.
The election prediction method based on multi-source heterogeneous Internet data fusion follows the process shown in Fig. 1 and is implemented in the following steps:
Step 1: information sources that reflect public opinion trends in the country holding the election are screened.
Given the abundance of Internet data, finding reliable information sources that reflect public opinion trends in the country holding the election is the basis for predicting election results accurately. Screening the information sources is mainly divided into two steps:
Step 101: consult the research reports issued by the Internet management and service organizations of the country holding the election.
Internet management and service organizations issue annual reports analyzing Internet usage in their country or region. These research reports give a preliminary understanding of network usage habits in the country holding the election, and the Internet platforms widely used by that country's Internet users can then be extracted from them.
At present, the main international Internet management and service organizations are the International Telecommunication Union (ITU), the Internet Society (ISOC), and the Internet Network Information Center (InterNIC). In the Asia-Pacific region they mainly include the Asia & Pacific Internet Association (APIA), the Asia Pacific Networking Group (APNG), the Asia Pacific Network Information Centre (APNIC), the China Internet Network Information Center (CNNIC), the Japan Network Information Center (JPNIC), the Korea Network Information Center (KRNIC), and the Malaysian domain name registry (MIMOS). In the Americas they mainly include the regional IP address management and allocation body (ARIN), the United States domain name registry (NeuStar), and the Canadian Internet Registration Authority (CIRA). In Europe they mainly include the Council of European National Top-Level Domain Registries (CENTR), the German network information center (DENIC), the United Kingdom network information center (Nominet), and the European regional IP address management and allocation body (RIPE). In Africa they mainly include the African Network Information Centre (AfriNIC). In Australia they mainly include the Australian domain name registry (auDA).
Step 102: consult the network usage survey reports published by research firms in the country holding the election. For example, website traffic statistics services such as Alexa provide website usage rankings for each country or region. The most frequently used websites in the country holding the election are selected according to these rankings.
The first two steps yield a set of heavily used websites in the country holding the election. Because these websites are used frequently and attract broad public participation, they are more likely to reveal public opinion trends.
Step 103: because the prediction indices should reveal the views of the public as far as possible, only information sources with user-generated content, such as social networks and search engines, are retained among the high-traffic websites. In addition, given the particular nature of general election topics, the candidates' campaign homepages should also be added to the candidate information sources, and traffic statistics websites such as Alexa are used to analyze how much attention the public pays to the different candidates' campaign websites. This yields a preliminary set of Internet information sources that reflect public opinion trends.
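Steps 101 to 103 can be sketched roughly as a ranking-and-filter pass. This is only an illustration: the `Site` record, the category labels, the `top_n` cutoff, and the example domains are all hypothetical, and the text does not say how a site is judged to carry user-generated content.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    traffic_rank: int  # 1 = most visited in the country holding the election
    category: str      # hypothetical label, e.g. "social", "search", "shopping"


# Categories treated as carrying user-generated content (an assumption).
UGC_CATEGORIES = {"social", "search"}


def screen_sources(sites, campaign_homepages, top_n=50):
    """Keep high-traffic UGC sites, then append the candidates' campaign homepages."""
    frequent = sorted(sites, key=lambda s: s.traffic_rank)[:top_n]
    kept = [s.name for s in frequent if s.category in UGC_CATEGORIES]
    return kept + list(campaign_homepages)


sites = [
    Site("social-a.example", 2, "social"),
    Site("search-a.example", 1, "search"),
    Site("shop-a.example", 3, "shopping"),
    Site("social-b.example", 900, "social"),
]
sources = screen_sources(sites, ["candidate-1.example"], top_n=3)
```

With `top_n=3` the low-traffic social site is dropped before the category filter runs, so the order of the two filters matters.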
Step 2: specific features are extracted from the selected Internet information sources to construct an Internet-platform-based system of prediction indices for candidate support rates.
The prediction indices comprise social network prediction indices, a search engine prediction index, and a candidate campaign homepage prediction index. Building a comprehensive and well-founded index system is the key to election prediction. Based on the specific characteristics of each information source, the prediction indices for each channel are constructed as follows:
(1) Social network prediction indices are constructed from two aspects, quantity and sentiment;
Because of their interactivity and timeliness, social networks have become a main platform on which the public obtains information and expresses opinions. Social media such as Facebook and Twitter are favored by more and more users. These platforms let Internet users express their views on election candidates through actions such as liking and commenting. When mining the public's leanings toward candidates from this user-generated content, prediction indices can be constructed from two aspects, quantity and sentiment.
On the quantity side, the number of posts discussing a candidate on Facebook or Twitter reflects the public's attention to that candidate. The share of posts on a social network that mention a candidate can therefore serve as a prediction index. Specifically, let $M^{i,t}$ be the number of posts on the social network platform that mention candidate i on day t; the mention-based support index $S_{count}^{i,t}$ that candidate i obtains on that platform on that day is computed as the candidate's share of such posts.
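One plausible reading of the mention-based index is each candidate's share of all candidate-mentioning posts on a given day. The sketch below assumes that reading; the function name, the sample counts, and the even-split fallback for a day with no mentions are illustrative assumptions rather than part of the described method.

```python
def mention_share(mention_counts):
    """Map {candidate: posts mentioning the candidate on day t} to shares that sum to 1."""
    total = sum(mention_counts.values())
    if total == 0:
        # No candidate was mentioned that day; fall back to an even split (an assumption).
        return {c: 1.0 / len(mention_counts) for c in mention_counts}
    return {c: n / total for c, n in mention_counts.items()}


counts_day_t = {"A": 600, "B": 300, "C": 100}  # hypothetical counts
s_count = mention_share(counts_day_t)
```

Normalising across candidates keeps the index on the same 0 to 1 scale as a support rate, which is convenient when it is later fused with other indices.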
In addition to mentions of a candidate on a social network platform, which reflect public support, many social networking sites also provide functions such as liking. A like can be regarded as an Internet user's strong approval of a candidate's statements. The average number of likes received per post by each candidate per day can therefore be taken as Internet users' support for that candidate. Specifically, if candidate i publishes n posts on the social network platform on day t and post j receives $L_j^{i,t}$ likes, the like-based support index $S_{like}^{i,t}$ that candidate i obtains on that platform on that day is computed from the average number of likes per post, $\frac{1}{n}\sum_{j=1}^{n} L_j^{i,t}$.
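The like-based index can be sketched as follows. The per-post average comes from the text; dividing each candidate's average by the sum over all candidates is an assumption added here so that the result reads as a share, and the sample like counts are made up.

```python
def like_index(likes_by_candidate):
    """likes_by_candidate: {candidate: [likes on each post the candidate published on day t]}."""
    # Average likes per post, as described in the text.
    avg = {c: (sum(v) / len(v) if v else 0.0) for c, v in likes_by_candidate.items()}
    total = sum(avg.values())
    # Normalising across candidates is an assumption, not stated in the source.
    return {c: (a / total if total else 0.0) for c, a in avg.items()}


likes = {"A": [100, 300], "B": [50, 50, 50, 50]}  # hypothetical data
s_like = like_index(likes)
```

Averaging per post prevents a candidate who simply posts more often from gaining an advantage: candidate B has the same total number of likes as a candidate with two 100-like posts but a lower per-post average.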
On the sentiment side, the posts on social networks that mention a candidate and the comments on the candidate's personal page express a rich range of Internet users' views. To mine the sentiment contained in this text, sentiment classification can be applied to the text on the social network and the ratio of positive to negative sentiment computed, which then serves as a prediction index of Internet users' support for the candidate. Specifically, if the social network contains $N^{i,t}$ posts about candidate i on day t, of which $N_{pos}^{i,t}$ are positive and $N_{neg}^{i,t}$ are negative, the text sentiment support index $S_{senti}^{i,t}$ of candidate i is computed from the ratio of positive to negative posts.
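A minimal sketch of the sentiment ratio, assuming the posts have already been labelled by some sentiment classifier (the classifier itself is outside this sketch). The label strings and the additive smoothing term are assumptions; the text only states that the ratio of positive to negative posts is used.

```python
from collections import Counter


def sentiment_index(labels, smoothing=1.0):
    """labels: 'pos' / 'neg' / 'neu' labels for the posts about one candidate on day t."""
    counts = Counter(labels)
    # Additive smoothing avoids division by zero on a day with no negative posts
    # (an assumption; the source does not say how this case is handled).
    return (counts["pos"] + smoothing) / (counts["neg"] + smoothing)


labels_a = ["pos"] * 7 + ["neg"] * 3 + ["neu"] * 5  # hypothetical classifier output
ratio_a = sentiment_index(labels_a)
```

A value above 1 means positive posts outnumber negative ones. To compare candidates on a common scale, the ratios could be normalised across candidates in the same way as the other indices.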
(2) The search engine prediction index is constructed;
Each user's search behavior in a search engine is an expression of active intent. To help users understand what Internet users are paying attention to, many search engines provide keyword search index services, such as Google Trends. These indices are based on massive amounts of user behavior data, give the search scale of a keyword in the search engine, and are usually updated daily. For the scenario considered by the invention, the most widely used search engine in the country holding the election is chosen, the search volume $V^{i,t}$ for candidate i on day t is obtained, and the search engine attention index $S_{search}^{i,t}$ of candidate i on day t is computed from this search volume.
(3) The candidate campaign homepage prediction index is constructed;
To publicize their political platforms and win votes, candidates usually set up campaign homepages. Through the campaign website, a candidate presents recent campaign activities and speeches, and usually also sets up a donation page to obtain financial support for campaign activities. The number of IP visits to a candidate's campaign homepage reflects the public's attention to the candidate's words and deeds. To help with website tuning and optimization, traffic statistics services such as Alexa and SEO comprehensive query webmaster tools provide the daily IP visit count of a specified website. Let $U^{i,t}$ be the number of IP visits to candidate i's campaign website on day t; the campaign homepage attention index $S_{IP}^{i,t}$ of candidate i on day t is computed from this visit count.
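Once all five indices are available for a day, they can be gathered into one vector per candidate. The sketch below assumes the search and campaign homepage indices are shares of search volume and IP visits across candidates, which is one plausible reading; the index order, helper names, and all numbers are hypothetical.

```python
INDEX_ORDER = ("count", "like", "senti", "search", "IP")


def share(values):
    """Normalise {candidate: raw value} so that the values sum to 1."""
    total = sum(values.values())
    return {k: (v / total if total else 0.0) for k, v in values.items()}


def daily_vector(candidate, indices):
    """indices: {index name: {candidate: value}}; returns the values in INDEX_ORDER."""
    return [indices[name][candidate] for name in INDEX_ORDER]


search_volume = {"A": 80, "B": 20}    # hypothetical daily search volumes
ip_visits = {"A": 3000, "B": 1000}    # hypothetical daily IP visit counts
indices_day_t = {
    "count": {"A": 0.6, "B": 0.4},
    "like": {"A": 0.8, "B": 0.2},
    "senti": {"A": 0.7, "B": 0.3},
    "search": share(search_volume),
    "IP": share(ip_visits),
}
z_a = daily_vector("A", indices_day_t)
```

Fixing the order of the indices once, in `INDEX_ORDER`, keeps each position of the vector tied to the same channel on every day.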
Step 3: the extracted prediction indices are treated as signals reflecting public opinion and fused with a Kalman filter model to track and predict each candidate's support rate dynamically and in real time.
The five classes of prediction indices extracted in Step 2 reflect the public's attention to candidates from different angles. Because users' use of Internet platforms is diverse and skewed, a prediction that relies on any single one of these indices may be biased. A method is therefore needed that can fuse multi-source heterogeneous indices and reflect candidate support comprehensively. The invention treats the five classes of prediction indices extracted in Step 2 as signals reflecting public opinion and fuses these multi-source heterogeneous signals with a signal processing method, the Kalman filter model. Specifically, the implementation comprises the following steps:
Step 301, smoothing of the prediction indices. To reflect the trend of the support each candidate receives, and to keep fluctuations in individual prediction indices from distorting the prediction, the five classes of extracted prediction indices are first smoothed. The invention uses the moving average method. Specifically, to predict candidate i's support rate on day t+1, the daily prediction index values $S_c^{i,k}$, $c \in \{count, like, senti, search, IP\}$, are first computed for each day k from t-l to t; the five smoothed prediction index values $\bar{S}_c^{i,t}$ are then computed by moving average and serve as the input to the second step, the Kalman filter model. The calculation is as follows:
$$\bar{S}_c^{i,t} = \frac{1}{l+1} \sum_{k=t-l}^{t} S_c^{i,k}, \quad c \in \{count, like, senti, search, IP\}$$
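The smoothing step can be sketched as a simple, unweighted moving average over days t-l to t. The unweighted form, the function name, and the sample series are assumptions; the text names only the moving average method.

```python
def moving_average(series, l):
    """Average of the last l + 1 values (days t-l .. t) of one index series."""
    window = series[-(l + 1):]
    return sum(window) / len(window)


s_search = [0.40, 0.42, 0.47, 0.45, 0.51]  # hypothetical daily values, oldest first
smoothed = moving_average(s_search, l=2)   # mean of the last three days
```

If fewer than l + 1 days are available, the slice returns the whole series, so the function averages whatever history exists rather than failing.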
Step 302, fusing the prediction indices with the Kalman filter model. Kalman filtering is an algorithm that uses the state equation of a linear system to estimate the system state optimally from noisy observation data. In the invention, the smoothed prediction indices $\bar{S}_c^{i,t}$ computed from multi-source online data are taken as observations of the public's support rate, and the public's actual support is estimated by fusing these prediction indices. Let the true state value $x_t^i$ of the public toward candidate i on day t evolve from the state at time (t-1), that is:
$$x_t^i = x_{t-1}^i + B u_{t-1} + w_{t-1}$$
where B is the coefficient matrix of the control input; $u_{t-1}$ is the control input; and $w_{t-1}$ is the process noise vector, which follows a multivariate normal distribution with mean 0 and covariance matrix $Q_t$, $w_t \sim N(0, Q_t)$.
Step 303: at each time step, a mapping is constructed between the measurement $z_t^i = (\bar{S}_{count}^{i,t}, \bar{S}_{like}^{i,t}, \bar{S}_{senti}^{i,t}, \bar{S}_{search}^{i,t}, \bar{S}_{IP}^{i,t})^T$ and the true state value $x_t^i$, and the observation is assumed to contain noise, that is:
$$z_t^i = H_t x_t^i + v_t$$
where $H_t$ is the matrix mapping the state value to the measurement, and $v_t$ is the measurement noise, assumed to be Gaussian white noise, $v_t \sim N(0, R_t)$. It is assumed that, during state evolution, the initial state $x_0^i$, the process noise $w_t$, and the measurement noise $v_t$ are mutually independent.
Step 304: Kalman filtering comprises two stages, prediction and update. In the prediction stage, the Kalman filter first predicts the state value at the current time from the posterior state of the previous time step, $\hat{x}_{t|t-1}^i = \hat{x}_{t-1|t-1}^i + B u_{t-1}$, where $\hat{x}_{t|t-1}^i$ denotes the prior state estimate for candidate i at time t based on the observations of the preceding (t-1) days. Once the measurement $z_t^i$ is observed on day t, the prior state estimate $\hat{x}_{t|t-1}^i$ for that time and the observation $z_t^i$ are fused by weighting to obtain the posterior state estimate at the current time:
$$\hat{x}_{t|t}^i = \hat{x}_{t|t-1}^i + K_t \left( z_t^i - H_t \hat{x}_{t|t-1}^i \right)$$
where $K_t$ is the Kalman gain, which weighs the prior state estimate against the measurement during fusion. Denote the posterior state estimation error by $e_t = x_t^i - \hat{x}_{t|t}^i$.
The covariance matrix of the posterior state estimation error is $P_{t|t}$, expressed as $P_{t|t} = E[e_t e_t^T]$.
To bring the posterior state estimate as close as possible to the true state value, the posterior state estimation error is minimized, which amounts to minimizing $E[\|e_t\|^2]$. This optimization is equivalent to minimizing the trace of the posterior state estimation error covariance matrix $P_{t|t}$. Writing $P_{t|t-1}$ for the covariance matrix of the prior state estimation error, the solution is:
$$K_t = P_{t|t-1} H_t^T \left( H_t P_{t|t-1} H_t^T + R_t \right)^{-1}$$
Thus, from the prior state estimate of each candidate's daily support rate and the observed support indices of each channel, the posterior state estimate for that day is obtained by fusion weighted with the Kalman gain.
Step 305: in the update stage of the Kalman filter, the state estimate of the next day's support rate is obtained from the posterior state estimate of the current day and the state transition equation:
$$\hat{x}_{t+1|t}^i = \hat{x}_{t|t}^i + B u_t$$
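The predict-and-fuse cycle of Steps 302 to 305 can be sketched for a single candidate under several simplifying assumptions that are not stated in the text: the support rate is a scalar random-walk state, the control term $B u$ is zero, the entries of $H_t$ are all 1, and $R_t$ is diagonal. Under a diagonal $R_t$, processing the five indices one at a time is equivalent to applying the batch Kalman gain, which keeps the sketch free of matrix libraries. All variances and index values below are hypothetical.

```python
def kalman_step(x_post, p_post, z, r, q, h=None):
    """One day of filtering for a scalar support rate.

    x_post, p_post: yesterday's posterior estimate and its variance
    z: today's smoothed indices; r: their noise variances (diagonal of R_t)
    q: process-noise variance Q_t; h: entries of H_t (defaults to all ones)
    """
    h = h or [1.0] * len(z)
    # Predict: random-walk state with the control term B*u taken as zero (assumption).
    x, p = x_post, p_post + q
    # Update: with diagonal R_t, one scalar update per index is equivalent to the
    # batch gain K_t = P H^T (H P H^T + R)^-1.
    for z_c, r_c, h_c in zip(z, r, h):
        k = p * h_c / (h_c * p * h_c + r_c)
        x = x + k * (z_c - h_c * x)
        p = (1.0 - k * h_c) * p
    return x, p


x, p = 0.5, 0.04                      # initial guess and its variance (hypothetical)
r = [0.02, 0.02, 0.05, 0.01, 0.03]    # hypothetical noise variances per index
for z in ([0.60, 0.80, 0.70, 0.80, 0.75], [0.58, 0.76, 0.66, 0.79, 0.74]):
    x, p = kalman_step(x, p, z, r, q=0.001)
```

Indices with a smaller noise variance pull the estimate more strongly, which is how the filter weighs more reliable channels above noisier ones. In practice the variances would have to be chosen or estimated from data; the text does not say how.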
The invention takes into account the diversity of the Internet platforms that users rely on, draws on a wide range of data sources, and operates in real time. It overcomes the bias that a single data source introduces when revealing public opinion and has broad application prospects.

Claims (2)

1. An election prediction method based on multi-source heterogeneous Internet data fusion, characterized by comprising the following steps:
Step 1: information sources that reflect public opinion trends in the country holding the election are screened from Internet data;
Step 2: specific features are extracted from the selected Internet information sources to construct an Internet-platform-based system of prediction indices for candidate support rates;
The prediction indices comprise social network prediction indices, a search engine prediction index, and a candidate campaign homepage prediction index; the construction process is as follows:
(1) Social network prediction indices are constructed from two aspects, quantity and sentiment;
On the quantity side, the share of posts on the social network that mention a candidate serves as a prediction index;
Specifically, let $M^{i,t}$ be the number of posts on the social network platform that mention candidate i on day t; the mention-based support index $S_{count}^{i,t}$ that candidate i obtains on that platform on that day is computed as the candidate's share of such posts;
Alternatively, the average number of likes received per post by each candidate per day is taken as Internet users' support for that candidate;
Specifically, if candidate i publishes n posts on the social network platform on day t and post j receives $L_j^{i,t}$ likes, the like-based support index $S_{like}^{i,t}$ that candidate i obtains on that platform on that day is computed from the average number of likes per post, $\frac{1}{n}\sum_{j=1}^{n} L_j^{i,t}$;
On the sentiment side, sentiment classification is applied to the text on the social network, and the ratio of positive to negative sentiment is computed and used as a prediction index of Internet users' support for the candidate;
Specifically, if the social network contains $N^{i,t}$ posts about candidate i on day t, of which $N_{pos}^{i,t}$ are positive and $N_{neg}^{i,t}$ are negative, the text sentiment support index $S_{senti}^{i,t}$ of candidate i is computed from the ratio of positive to negative posts;
(2) The search engine prediction index is constructed;
First, the most widely used search engine in the country holding the election is chosen;
Then, the search volume $V^{i,t}$ for candidate i on day t is obtained, and the search engine attention index $S_{search}^{i,t}$ of candidate i on day t is computed from this search volume;
(3) The candidate campaign homepage prediction index is constructed;
Let $U^{i,t}$ be the number of IP visits to candidate i's campaign website on day t; the campaign homepage attention index $S_{IP}^{i,t}$ of candidate i on day t is computed from this visit count;
Step 3: Each extracted prediction index is treated as a signal reflecting public opinion; the indices are fused with a Kalman filter model to track and predict the candidates' support rates dynamically and in real time;
The detailed process is as follows:
Step 301: smooth each extracted prediction index with the moving average method to obtain the smoothed value of each prediction index;
When predicting the support rate of candidate i on day t+1, first calculate the daily value of each index c, $c \in \{\text{count}, \text{like}, \text{senti}, \text{search}, \text{IP}\}$, for days t-l through t, then calculate the smoothed value of each prediction index after moving averaging. The calculation method is as follows: [formula not reproduced]
Step 302: from the evolution of the public's attitude toward candidate i on day t-1, calculate the true state value of candidate i on day t: [formula not reproduced]
where $B$ is the coefficient matrix of the control input variable; $u_{t-1}$ is the control input variable; $w_{t-1}$ is the process noise vector, which follows a multivariate normal distribution with mean 0 and covariance matrix $Q_t$, $w_t \sim N(0, Q_t)$;
Step 303: at each moment, the smoothed value of each prediction index is regarded as a reflection of the true state value; construct the mapping between the measured value on day t and the true state value;
In the measurement equation [formula not reproduced], $H_t$ is the matrix that maps the true state value to the observed measured value; $v_t$ is the Gaussian white measurement noise, which follows a multivariate normal distribution with mean 0 and covariance matrix $R_t$, $v_t \sim N(0, R_t)$; it is assumed that, during state evolution, the initial state, the process noise $w_t$ and the measurement noise $v_t$ are mutually independent;
Step 304: after the measured value observed on day t is input into the Kalman filter model, the Kalman filter fuses the prior state estimate of the candidate's support rate for that day with the observation, weighted by the Kalman gain coefficient, to obtain the posterior state estimate for that day: [formula not reproduced]
where the prior state estimate is the estimate of candidate i's support rate on day t based on the observations of the preceding t-1 days; $K_t$ is the Kalman gain coefficient, which weighs the prior state estimate against the measured value in the fusion process;
Step 305: the Kalman filter uses the posterior state estimate for day t and the state transition equation to obtain the posterior state estimate of the next day's support rate: [formula not reproduced]
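Steps 301 to 305 can be illustrated with a minimal sketch, under simplifying assumptions that are not taken from the claim: the state is a single scalar support rate, the state transition is a random walk with coefficient `a` and no control input, the moving-average window covers the l+1 days from t-l to t, and the measurement noise of the five indices is uncorrelated (diagonal $R_t$), so the indices can be fused one at a time. The function names and default noise values are hypothetical.

```python
from typing import Dict, Sequence, Tuple

INDEX_NAMES = ("count", "like", "senti", "search", "IP")


def moving_average(series: Sequence[float], t: int, l: int) -> float:
    """Step 301: mean of series[t-l .. t] inclusive (assumed window length l+1)."""
    start = max(0, t - l)
    window = series[start:t + 1]
    return sum(window) / len(window)


def predict(x_post: float, p_post: float, a: float = 1.0, q: float = 1e-4) -> Tuple[float, float]:
    """Steps 302/305: propagate estimate and variance one day ahead.

    Assumes x_t = a * x_{t-1} + w_{t-1} with process noise variance q
    and no control input term.
    """
    return a * x_post, a * p_post * a + q


def update(
    x_prior: float,
    p_prior: float,
    z: Dict[str, float],
    h: Dict[str, float],
    r: Dict[str, float],
) -> Tuple[float, float]:
    """Steps 303/304: fuse the smoothed indices z[c] = h[c] * x + v[c].

    With uncorrelated measurement noise, processing the indices one by one
    is equivalent to a single update with the full measurement vector.
    """
    x, p = x_prior, p_prior
    for name, z_c in z.items():
        gain = p * h[name] / (h[name] * p * h[name] + r[name])  # Kalman gain
        x = x + gain * (z_c - h[name] * x)                      # weighted fusion
        p = (1.0 - gain * h[name]) * p                          # posterior variance
    return x, p
```

For instance, starting from a prior estimate of 0.5 with variance 1.0, a single smoothed index of 0.6 with `h = 1` and `r = 1` yields a gain of 0.5 and a posterior estimate of about 0.55. Calling `predict` on that posterior then gives the day-ahead estimate used for day t+1.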
2. The election prediction method of internet multi-source heterogeneous data fusion according to claim 1, characterized in that the step of screening information sources in step one is specifically:
First, for the country holding the election, look up the research reports published by that country's internet management and service organizations, and extract from the reports the internet platforms widely used by netizens;
Then, perform traffic statistics on the websites of these internet platforms to obtain the website usage ranking for the country holding the election, and select the most frequently used websites;
Finally, from the most frequently used websites, retain the information sources that carry user-generated content; meanwhile, add the candidates' campaign homepages to the candidate information sources, and use traffic statistics websites to analyze the public's degree of attention to the different candidates' campaign websites.
CN201811038860.9A 2018-09-06 2018-09-06 A kind of election prediction technique of internet multi-resources Heterogeneous data fusion Pending CN109241430A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811038860.9A CN109241430A (en) 2018-09-06 2018-09-06 A kind of election prediction technique of internet multi-resources Heterogeneous data fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811038860.9A CN109241430A (en) 2018-09-06 2018-09-06 A kind of election prediction technique of internet multi-resources Heterogeneous data fusion

Publications (1)

Publication Number Publication Date
CN109241430A true CN109241430A (en) 2019-01-18

Family

ID=65067469

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811038860.9A Pending CN109241430A (en) 2018-09-06 2018-09-06 A kind of election prediction technique of internet multi-resources Heterogeneous data fusion

Country Status (1)

Country Link
CN (1) CN109241430A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111563918A (en) * 2020-03-30 2020-08-21 西北工业大学 Target tracking method for data fusion of multiple Kalman filters
CN112348257A (en) * 2020-11-09 2021-02-09 中国石油大学(华东) Election prediction method driven by multi-source data fusion and time sequence analysis

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140289004A1 (en) * 2012-08-10 2014-09-25 Itron, Inc. Near-Term Data Filtering, Smoothing and Load Forecasting
CN104408108A (en) * 2014-11-18 2015-03-11 重庆邮电大学 Hot topic group influence analysis system and method based on grey system theory
CN105050132A (en) * 2015-08-10 2015-11-11 北京邮电大学 Method for estimating extreme value throughput capacity of cell
CN106227766A (en) * 2016-07-15 2016-12-14 国家计算机网络与信息安全管理中心 A kind of election public opinion prediction method of big data-driven
CN107577782A (en) * 2017-09-14 2018-01-12 国家计算机网络与信息安全管理中心 A kind of people-similarity depicting method based on heterogeneous data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHENG XIE et al.: "Wisdom of fusion: Prediction of 2016 Taiwan election with heterogeneous big data", 2016 13TH INTERNATIONAL CONFERENCE ON SERVICE SYSTEMS AND SERVICE MANAGEMENT (ICSSSM) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111563918A (en) * 2020-03-30 2020-08-21 西北工业大学 Target tracking method for data fusion of multiple Kalman filters
CN111563918B (en) * 2020-03-30 2022-03-04 西北工业大学 Target tracking method for data fusion of multiple Kalman filters
CN112348257A (en) * 2020-11-09 2021-02-09 中国石油大学(华东) Election prediction method driven by multi-source data fusion and time sequence analysis

Similar Documents

Publication Publication Date Title
Jaidka et al. Predicting elections from social media: a three-country, three-method comparative study
Elçi The rise of populism in Turkey: A content analysis
Bozarth et al. Toward a better performance evaluation framework for fake news classification
Rao et al. Actionable and political text classification using word embeddings and LSTM
CN108363753A (en) Comment text sentiment classification model is trained and sensibility classification method, device and equipment
CN104750856B (en) A kind of System and method for of multidimensional Collaborative Recommendation
CN103198072B (en) Method and device is recommended in a kind of excavation of popular search word
Castro et al. Back to# 6D: Predicting Venezuelan states political election results through Twitter
CN107291886A (en) A kind of microblog topic detecting method and system based on incremental clustering algorithm
CN103699626A (en) Method and system for analysing individual emotion tendency of microblog user
CN104572888B (en) A kind of associated information retrieval method of time series
CN111241425B (en) POI recommendation method based on hierarchical attention mechanism
Gómez Fortes et al. Basque regional elections 2012: The return of nationalism under the influence of the economic crisis
Jerven Measuring African development: past and present. Introduction to the Special Issue
CN109241430A (en) A kind of election prediction technique of internet multi-resources Heterogeneous data fusion
CN113407729A (en) Judicial-oriented personalized case recommendation method and system
Chueri et al. Closing the gap: how descriptive and substantive representation affect women’s vote for populist radical right parties
Buono et al. Big data econometrics: Now casting and early estimates
Nawaz et al. Mining public opinion: a sentiment based forecasting for democratic elections of Pakistan
Stauffer et al. Contextualizing the gender gap in voter turnout
De Groot Culture, contiguity and conflict: on the measurement of ethnolinguistic effects in spatial spillovers
JP7291100B2 (en) Anomaly/change estimation method, program and device using multiple posted time-series data
Bergman Insights from the Quantification of the Study of Populism
CN106227766A (en) A kind of election public opinion prediction method of big data-driven
Vilas et al. The irruption of cryptocurrencies into Twitter cashtags: a classifying solution

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190118