CN113537593A

CN113537593A - Method and device for predicting voting tendency of legislators

Info

Publication number: CN113537593A
Application number: CN202110803621.3A
Authority: CN
Inventors: 魏忠钰; 牟馨忆
Original assignee: Fudan University; Zhejiang Lab
Current assignee: Fudan University; Zhejiang Lab
Priority date: 2021-07-15
Filing date: 2021-07-15
Publication date: 2021-10-22

Abstract

The invention discloses a method and a device for predicting the voting tendency of legislators, relating to the technical field of political voting. The method comprises the following steps: establishing legislator nodes according to basic information of legislators, and establishing speech nodes according to the statements posted by the legislators on Twitter; establishing relationships between the nodes; obtaining initial representations of the nodes; performing heterogeneous graph convolution on the keyword-based speech network; performing heterogeneous graph convolution on the topic-label-based speech network; initializing the text information of issues using a long short-term memory network; and updating the representations of the legislator nodes and speech nodes with a heterogeneous graph convolutional neural network, performing joint training with a triplet loss function to learn the node representations in the heterogeneous graph and the issue representations, and measuring a legislator's voting preference for an issue by the distance between the legislator and the issue, so as to predict the legislator's voting tendency on the issue. The method and the device improve the performance of roll call vote prediction and are applicable to newly elected legislators who have no voting record.

Description

Method and device for predicting voting tendency of legislators

Technical Field

The invention relates to the technical field of political voting, and in particular to a method and a device for predicting the voting tendency of legislators.

Background

The goal of roll call vote prediction is to estimate legislators' likely attitudes towards emerging issues using their voting history. Since the political preferences and cultural background of legislators have a great impact on their positions and appeals, learning representations of legislators from roll call voting data has become an effective tool for predicting their voting tendency.

Previous research has advanced roll call vote prediction mainly in two ways. On the one hand, the rich text information of issues is used to add classification features; on the other hand, relationships are established among legislators who share voting, sponsorship and donation behaviors, and integrating legislators with similar political backgrounds greatly improves roll call prediction performance.

However, when a legislator takes part in voting on issues for the first time, no voting record is available for reference, which makes it difficult to obtain a representation of the legislator that carries contextual information reflecting their interactions. This is the so-called cold start problem. In the political field in particular, the expiry of a term usually means renewal of the chamber or party. For example, more than 10% of the legislators in a given data set are newly elected. Furthermore, voting behavior only shows the outcome of a legislator's decision, while the content legislators present to the public on social platforms contains clues to their final choice. It is therefore valuable to explore the reasons behind their opinions and final decisions, and to help explain the decision process.

Disclosure of Invention

In order to overcome the above-mentioned drawbacks of the prior art, embodiments of the present invention provide a method and a device for predicting the voting tendency of legislators, which can improve the performance of roll call vote prediction and are applicable to newly elected legislators.

The specific technical scheme of the embodiment of the invention is as follows:

A method for predicting the voting tendency of legislators, the method comprising:

establishing legislator nodes according to basic information of legislators, and establishing speech nodes according to semantic information of the statements posted by the legislators;

establishing relationships between the nodes;

obtaining initial representations of the nodes;

performing heterogeneous graph convolution on the keyword-based speech network;

performing heterogeneous graph convolution on the topic-label-based speech network;

initializing the text information of issues using a long short-term memory network;

and updating the representations of the legislator nodes and speech nodes with a heterogeneous graph convolutional neural network, performing joint training with a triplet loss function to learn the node representations in the heterogeneous graph and the issue representations, and measuring a legislator's voting preference for an issue by the distance between the legislator and the issue, so as to predict the legislator's voting tendency on the issue.

Preferably, the basic information of a legislator includes member ID, home state and political party; the statements posted by the legislator comprise the text of the legislator's posts on Twitter; the speech nodes include at least one of keywords and topic labels (hashtags).

Preferably, a heterogeneous graph model is constructed, wherein constructing the heterogeneous graph comprises defining the nodes, establishing the relationships among the nodes and initializing the nodes; the heterogeneous graph based on the keyword speech network has three relationships, R1, R2 and R3, and the graph based on the topic label speech network has four relationships, R1, R2, R3 and R4; wherein R1 represents co-sponsorship of issues between legislator nodes, and its weight is the number of issues co-sponsored by two legislators within a preset time; R2 represents the co-occurrence relationship of speech nodes, and its weight is the number of co-occurrences of two keywords or topic labels; R3 represents the relationship between a legislator node and a speech node, and its weight is the number of times the legislator mentions the keyword or topic label; R4 represents the sentiment of the legislator's tweets under a given topic label.

Preferably, in the step of obtaining the initial representations of the nodes, specifically: the basic information of a legislator is used to obtain the legislator's initial representation, which is the concatenation of the ID, political party and state information:

$X_{ID}(i) \oplus X_{Party}(i) \oplus X_{State}(i)$

wherein $\oplus$ denotes concatenation, $X_{ID}(i)$ represents the ID of legislator i, $X_{Party}(i)$ represents the political party to which legislator i belongs, and $X_{State}(i)$ represents the state to which legislator i belongs;

for the initial representation of the keyword, use the GloVe word vector;

for the topic label node, the average value of the GloVe word vectors of a preset number of high-frequency words in the tweet with the topic label is used.

Preferably, in the step of heterogeneous graph convolution on the keyword-based speech network, the representations of the legislators and the keywords are updated using the following formula:

[formula not reproduced]

wherein σ represents the Sigmoid activation function; $W_1^{(l)}$ and [symbol not reproduced] are the weight matrices of the l-th hidden layer; $X^{(l)}$ and $Y^{(l)}$ are the node representations at layer l; $\lambda_1$ and $\lambda_2$ are weighting hyperparameters; [symbols not reproduced] are the normalized adjacency matrices of the legislator network and the speech network, respectively; [symbols not reproduced] are the normalized adjacency matrices of the edges from keyword to legislator and from legislator to keyword, respectively; and $X^{(l+1)}$ and $Y^{(l+1)}$ are the node representations at layer l + 1.

Preferably, in the step of heterogeneous graph convolution on the topic-label-based speech network, the graph convolutional neural network is generalized to handle different relationships between any pair of nodes, using a different weight matrix and normalization factor for each relationship type; the specific process is as follows:

[formula not reproduced]

wherein $N_i^r$ is the set of neighbors of node i under relationship type r; $c_{i,r}$ is a normalization factor, usually set to $|N_i^r|$; R represents the set of relationship types; $h_i$ and $h_j$ are the hidden states of nodes i and j; and $h_i^{(l)}$ denotes the hidden state of node i at the l-th layer.

Preferably, the graph convolution operation based on the topic labels is represented as follows:

[formula not reproduced]

wherein [symbols not reproduced] represent weight matrices; [symbols not reproduced] denote the normalization factors; $N_i$ represents the set of neighbors of node i; $N_i^r$ is the set of neighbors of node i under relationship type r; $x_i$, $x_j$, $x_k$ represent the hidden states of legislator nodes i, j, k, respectively; $y_i$, $y_j$, $y_k$ represent the hidden states of speech nodes i, j, k, respectively; and l denotes the l-th layer.

Preferably, in the step of initializing the text information of issues using the long short-term memory network, the title, description and summary of an issue are the directly available text information, and the text information of the issue is encoded with the long short-term memory network to obtain its initial representation:

$X_{lgn}(i) = \mathrm{LSTM}(t_i)$

wherein $t_i$ represents the text information of issue i, and LSTM represents the long short-term memory network.

Preferably, the step of updating the representations of the legislator nodes and speech nodes with the heterogeneous graph convolutional neural network, performing joint training with a triplet loss function to learn the node representations in the heterogeneous graph and the issue representations, and measuring a legislator's voting preference for an issue by the distance between the legislator and the issue so as to predict the legislator's voting tendency on the issue, specifically comprises:

after the initial representations are obtained, the representations of the legislators and of the keywords or topic labels are first updated through the heterogeneous graph convolutional neural network; then the representations of the legislators and the issues are jointly learned through a triplet loss function. Specifically, a batch of triplets is sampled, each triplet consisting of a target issue a and a pair of legislators p and n whose voting results satisfy VOTE(n, a) < VOTE(p, a), where the voting results are ordered NO < NOT VOTE < YES; the loss function is expressed as:

L = max(d(a, p) - d(a, n) + margin, 0);

the legislators are ranked according to their distance from the issue, and their choices are predicted according to the proportion of the different votes cast on the issue.

A device for predicting the voting tendency of legislators, comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, performs the steps of the method for predicting the voting tendency of legislators described above.

The technical scheme of the invention has the following remarkable beneficial effects:

First, the present application is the first scheme to combine legislators' historical statements with their voting records in order to represent legislators and define the relationships between them, thereby greatly improving the accuracy of roll call vote prediction.

Second, a heterogeneous graph is constructed based on the co-sponsorship relationships and the language similarity of legislators, and a heterogeneous graph convolution model is provided to learn the statements of legislators effectively.

Third, further analysis demonstrates that the speech network, that is, the network of speech nodes and legislator nodes in the constructed heterogeneous graph, which carries the legislators' statement information, gives a more specific characterization of each legislator and can alleviate the cold start problem to some extent.

Specific embodiments of the present invention are disclosed in detail with reference to the following description and drawings, indicating the manner in which the principles of the invention may be employed. It should be understood that the embodiments of the invention are not so limited in scope. The embodiments of the invention include many variations, modifications and equivalents within the spirit and scope of the appended claims. Features that are described and/or illustrated with respect to one embodiment may be used in the same way or in a similar way in one or more other embodiments, in combination with or instead of the features of the other embodiments.

Drawings

The drawings described herein are for illustration purposes only and are not intended to limit the scope of the present disclosure in any way. In addition, the shapes, proportional sizes and the like of the respective members in the drawings are merely schematic to facilitate understanding of the present invention, and do not specifically limit the shapes, proportional sizes and the like of the respective members of the present invention. Those skilled in the art, having the benefit of the teachings of this invention, may choose from the various possible shapes and proportional sizes to implement the invention as the case may be.

FIG. 1 is a flowchart illustrating the steps of a method for predicting the voting tendency of legislators in accordance with an embodiment of the present invention;

FIG. 2 is an overall architecture diagram of the method for predicting the voting tendency of legislators in an embodiment of the present invention.

Detailed Description

The details of the present invention can be more clearly understood in conjunction with the accompanying drawings and the description of the embodiments of the present invention. However, the specific embodiments of the present invention described herein are for the purpose of illustration only and are not to be construed as limiting the invention in any way. Any possible variations based on the present invention may be conceived by the skilled person in the light of the teachings of the present invention, and these should be considered to fall within the scope of the present invention. It will be understood that when an element is referred to as being "disposed on" another element, it can be directly on the other element or intervening elements may also be present. When an element is referred to as being "connected" to another element, it can be directly connected to the other element or intervening elements may also be present. The terms "mounted," "connected," and "connected" are to be construed broadly and may include, for example, mechanical or electrical connections, communications between two elements, direct connections, indirect connections through intermediaries, and the like. The terms "vertical," "horizontal," "upper," "lower," "left," "right," and the like as used herein are for illustrative purposes only and do not denote a unique embodiment.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein in the description of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.

In this application, the applicant collects legislators' historical statements as an important supplement for describing their political views, and as a hub that connects legislators with similar emphases and positions. For example, when discussing whether to prohibit certain abortion procedures, supporters emphasize the protection of life, while opponents emphasize freedom of choice. It can be seen that differences in vocabulary usage not only convey the speaker's ideological beliefs, but can also be used to divide speakers into different groups. With the speech network added, the group to which even a newly elected legislator belongs can be clearly identified. The speech network is formed by treating keywords or topic labels (hashtags) in the Twitter text as nodes and establishing relationships between these keywords or topic labels.

First, a data set is constructed that includes, for example, the historical voting records of actual legislators from 1993 to 2018. Each instance includes the background information of the legislator, the content of the issue, and the legislator's final vote on the issue, i.e., yes, no, or abstention. At the same time, the data set may be expanded to include the Twitter statement text of each legislator. Since Twitter became popular among legislators from around 2011, only the 907,844 roll call records after 2011 are kept, involving 978 legislators and 2,189 issues.

To collect the legislators' public statements, the personal data of all legislators may be retrieved from the U.S. Congress website. The Twitter account displayed on each legislator's homepage is then extracted. For a legislator whose Twitter account is not provided there, the legislator's name may be searched manually on Twitter and the account identified manually by examining its verification information and profile. The extended data set thus contains 735 member accounts, covering 79 members who cannot establish a co-sponsorship relationship with others in the data set. Finally, the twitterscraper library can be used to crawl all tweets from these accounts to form the extended data set.

Statistics of the expanded roll call corpus, covering the number of legislators taking part in roll call votes each year and the number of legislators who had posted at least one tweet before that year, show that the number of legislators voting on issues remains relatively constant, while an increasing number of legislators tend to share their opinions on Twitter, since social platforms are an important medium for winning support and followers. Overall, each legislator posted an average of 2,601 tweets, and over 51.2% of the legislators have more than 2,000 tweets on record. A small number of legislators have posted more than 10,000 tweets.

In order to improve the performance of roll call vote prediction and to handle newly elected legislators, a method for predicting the voting tendency of legislators is proposed in the present application. FIG. 1 is a flowchart of the method in an embodiment of the present invention. As shown in FIG. 1, the method for predicting the voting tendency of legislators may include the following steps:

S101: establishing legislator nodes according to basic information of legislators, and establishing speech nodes according to semantic information in the statements posted by the legislators.

FIG. 2 shows the overall architecture of the method for predicting the voting tendency of legislators in the embodiment of the present invention. As shown in FIG. 2, a heterogeneous graph model is constructed, where constructing the heterogeneous graph includes defining the nodes, establishing the relationships between the nodes, and initializing the nodes.

The model presented in this application involves two types of nodes: legislator nodes and speech nodes. A legislator node is established according to the basic information of a legislator, which may include member ID, home state, political party, etc. A speech node is established according to semantic information in the statements posted by legislators. For example, the statements posted by a legislator may be the tweets in the Twitter data.

The speech nodes may be keywords or topic labels (hashtags). When keywords are used as speech nodes, the K most frequent words can be extracted from a set of filtered tweets. Similarly, when topic labels are used as speech nodes, the K most frequent topic labels may be retained.

These two types of speech node are chosen because they differ in representation capability and in their role in building relationships. Using keywords as nodes is the simpler method, while a topic label is a more concise and explicit form of expression that can reveal more about a legislator's views. In this step, the tweets containing legislation-related keywords are selected, and all topic labels of these tweets are obtained. On the one hand, a speech network built from topic labels focuses more on specific topics. On the other hand, the topic label network can reflect more complex relationships with legislators, because not only can the number of times a legislator mentions a topic be counted, but the legislator's sentiment score for the topic content can also be calculated using an automated toolkit.

To build a more representative speech network, legislation-related keywords may be used as a filter, so that only legislation-related topic labels are preserved. Since the text on Twitter is complex and may include politically unrelated content, keywords may first be extracted from the issue text, and only the tweets containing these issue keywords are kept. Furthermore, some issue-related keywords, such as "congress" and "united states," are used widely by all legislators; these keywords are meaningless to the model in the present application because they do not distinguish legislators well. These words can therefore be deleted manually when the model is built. When topic labels are used as speech nodes, the K most common topic labels may be retained in the same way.
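The filtering and top-K selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the function name `top_k_terms`, the regular-expression tokenization, and the toy tweets, issue keywords and stop list are all hypothetical.

```python
import re
from collections import Counter


def top_k_terms(tweets, issue_keywords, stopwords, k, hashtags=False):
    """Return the k most frequent keywords (or hashtags) in issue-related tweets."""
    # Keyword mode skips words that are part of a hashtag; hashtag mode keeps "#tag".
    pattern = r"#\w+" if hashtags else r"(?<![#\w])[a-z]+"
    counts = Counter()
    for tweet in tweets:
        text = tweet.lower()
        words = set(re.findall(r"[a-z]+", text))
        # Keep only tweets that mention at least one issue keyword.
        if not words & issue_keywords:
            continue
        for token in re.findall(pattern, text):
            # Drop manually listed uninformative words such as "congress".
            if token.lstrip("#") not in stopwords:
                counts[token] += 1
    return [term for term, _ in counts.most_common(k)]


tweets = [
    "We must protect healthcare for every family #ACA",
    "Healthcare costs are rising; congress must act #ACA #health",
    "Happy birthday to my dog!",
    "The tax bill hurts every family #taxreform",
]
issue_keywords = {"healthcare", "tax"}
stopwords = {"we", "must", "for", "every", "are", "the", "congress"}

keyword_nodes = top_k_terms(tweets, issue_keywords, stopwords, k=2)
hashtag_nodes = top_k_terms(tweets, issue_keywords, set(), k=1, hashtags=True)
```

The third tweet is discarded because it contains no issue keyword, which mirrors the step of keeping only tweets related to legislation.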

S102: establishing relationships between the nodes.

Four relationship types can be included in the heterogeneous graph model:

1. R1: co-sponsorship of issues between legislator nodes. Its weight is the number of issues that two legislators have sponsored together over a period of time (e.g., over the last four years).

2. R2: co-occurrence of speech nodes. This relationship needs to be considered because a semantic association exists between two keywords or topic labels that appear together. The weight of R2 can be defined as the number of times the two keywords or topic labels appear together in a tweet.

3. R3: the relationship between legislator nodes and speech nodes, which forms a bipartite graph. The weight of this relationship is defined as the number of times the legislator mentions the target keyword or topic label.

4. R4: the legislator's sentiment towards the topic. The number of times a legislator mentions a particular topic label may reflect the legislator's interest in the question, while sentiment can quantitatively convey the legislator's position on a certain subject or event. Thus, the weight of R4 is defined as the average sentiment score of a legislator's tweets with a particular topic label. Here, a Python library named TextBlob can be used for fast sentiment analysis.

In this case, the heterogeneous graph based on the keyword speech network has three relationships, R1, R2 and R3, and the graph based on the topic label speech network has four relationships, R1, R2, R3 and R4. Keywords and topic labels serve as hubs that connect legislators in the heterogeneous graphs.
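As a rough illustration of how the four edge weights could be computed, the sketch below uses only the Python standard library. The function `build_relations`, the input layout and the toy data are hypothetical; the sponsorship records are assumed to be already restricted to the preset time window, and the per-tweet sentiment scores are assumed to have been computed beforehand (for example with TextBlob, as mentioned above).

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_relations(sponsorships, tweets, nodes):
    """Compute edge weights for R1-R4.

    sponsorships: issue id -> set of legislators who co-sponsored it
    tweets: list of (legislator, tokens, sentiment score) tuples
    nodes: the retained keywords or topic labels
    """
    r1, r2, r3 = Counter(), Counter(), Counter()
    sentiment_sum, tweet_count = defaultdict(float), Counter()
    # R1: number of issues two legislators sponsored together.
    for sponsors in sponsorships.values():
        for pair in combinations(sorted(sponsors), 2):
            r1[pair] += 1
    for legislator, tokens, sentiment in tweets:
        present = sorted(set(tokens) & nodes)
        # R2: number of tweets in which two speech nodes co-occur.
        for pair in combinations(present, 2):
            r2[pair] += 1
        for node in present:
            # R3: number of times the legislator mentions the node.
            r3[(legislator, node)] += tokens.count(node)
            sentiment_sum[(legislator, node)] += sentiment
            tweet_count[(legislator, node)] += 1
    # R4: average sentiment of the legislator's tweets containing the node.
    r4 = {key: sentiment_sum[key] / tweet_count[key] for key in sentiment_sum}
    return r1, r2, r3, r4


sponsorships = {"hr1": {"A", "B"}, "hr2": {"A", "B", "C"}}
tweets = [
    ("A", ["#aca", "#health"], 0.5),
    ("A", ["#aca"], -0.1),
    ("B", ["#aca", "#taxreform"], -0.4),
]
r1, r2, r3, r4 = build_relations(sponsorships, tweets, {"#aca", "#health", "#taxreform"})
```

For a keyword-based speech network only `r1`, `r2` and `r3` would be used; `r4` applies to the topic label network.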

S103: initializing the nodes.

The nodes involved in the heterogeneous graph are initialized in the following way, taking into account the different functions.

Legislator: the basic information of a legislator is used to obtain the legislator's initial representation. Specifically, the initial representation is the concatenation of the member ID, political party and home state information:

$X_{ID}(i) \oplus X_{Party}(i) \oplus X_{State}(i)$

wherein $\oplus$ denotes concatenation, $X_{ID}(i)$ represents the ID of legislator i, $X_{Party}(i)$ represents the political party to which legislator i belongs, and $X_{State}(i)$ represents the state to which legislator i belongs.

Keyword: the GloVe word vector of the keyword is used as its initial representation.

Topic label: a topic label carries more information than a single word. Thus, for the initial representation of a topic label node, the average of the GloVe word vectors of a preset number of the most common words in the tweets with that topic label is used; for example, the number may be 20, 30, 50, etc.
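The two initialization rules can be sketched in a few lines. This is only an illustration: the helper names are hypothetical, the ID, party and state vectors are toy stand-ins for whatever embeddings are actually used, and the small dictionary stands in for pretrained GloVe vectors.

```python
def concat_features(id_vec, party_vec, state_vec):
    """Initial legislator representation: concatenation of the three feature vectors."""
    return list(id_vec) + list(party_vec) + list(state_vec)


def mean_vector(words, word_vectors):
    """Initial topic label representation: average of the vectors of its frequent words."""
    vecs = [word_vectors[w] for w in words if w in word_vectors]
    if not vecs:
        raise ValueError("no word has a known vector")
    dim = len(vecs[0])
    return [sum(v[d] for v in vecs) / len(vecs) for d in range(dim)]


# Toy stand-in for pretrained GloVe vectors (assumption: 2-dimensional).
toy_vectors = {"care": [1.0, 3.0], "health": [3.0, 5.0]}

legislator_init = concat_features([0.1, 0.2], [1.0, 0.0], [0.0, 1.0])
label_init = mean_vector(["care", "health", "unknownword"], toy_vectors)
```

Words without a known vector are skipped here; how out-of-vocabulary words are handled is not stated in the source.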

S104: performing heterogeneous graph convolution on the keyword-based speech network.

Graph convolutional networks learn node representations by passing and aggregating attributes from each node's neighbors. Because the keyword-based speech network contains different types of nodes and edges, heterogeneous graph convolution on it requires a new way to propagate and aggregate features. The personalized PageRank layer may be represented as follows:

[formula not reproduced]

wherein $X^{(l)}$ and $Y^{(l)}$ are the node representations at layer l; $\lambda_1$ and $\lambda_2$ are weighting hyperparameters; [symbols not reproduced] are the normalized adjacency matrices of the legislator network and the speech network, respectively; [symbols not reproduced] are the normalized adjacency matrices of the edges from keyword to legislator and from legislator to keyword, respectively; and $X^{(l+1)}$ and $Y^{(l+1)}$ are the node representations at layer l + 1.

The above process focuses only on the user and content information, and assumes that the adjacency matrix and the personalization matrix carry the same weight. Therefore, the representations of the legislators and the keywords are further updated by giving different weights to the two adjacency matrices, as follows:

[formula not reproduced]

where σ denotes an activation function, which may be, for example, a Sigmoid function; and $W_1^{(l)}$ and [symbol not reproduced] are the weight matrices of the l-th hidden layer.
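Because the update formulas are not reproduced in this text, the following is only a sketch of one plausible layer of this kind, not the claimed formula. It assumes row-normalized adjacency matrices and the form X' = σ(Â_X X W_xx + λ1 Â_XY Y W_xy), Y' = σ(Â_Y Y W_yy + λ2 Â_YX X W_yx); the function names, the dictionary keys and the normalization choice are all assumptions made for illustration.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def row_normalize(adj):
    """Divide each row by its sum; all-zero rows stay zero."""
    return [[v / sum(row) if sum(row) else 0.0 for v in row] for row in adj]


def propagate(adj, feats, weight):
    """Aggregate neighbor features, then apply the layer weight matrix."""
    return matmul(matmul(row_normalize(adj), feats), weight)


def combine(own, cross, lam):
    """Sigmoid of the same-type term plus the lam-weighted cross-type term."""
    return [[1.0 / (1.0 + math.exp(-(o + lam * c))) for o, c in zip(ro, rc)]
            for ro, rc in zip(own, cross)]


def hetero_layer(x, y, adj, weights, lam1, lam2):
    """One assumed update of legislator features x and keyword features y."""
    x_new = combine(propagate(adj["xx"], x, weights["xx"]),
                    propagate(adj["xy"], y, weights["xy"]), lam1)
    y_new = combine(propagate(adj["yy"], y, weights["yy"]),
                    propagate(adj["yx"], x, weights["yx"]), lam2)
    return x_new, y_new


identity = [[1.0, 0.0], [0.0, 1.0]]
adj = {"xx": [[0, 1], [1, 0]], "yy": [[0, 2], [2, 0]],
       "xy": [[1, 0], [0, 1]], "yx": [[1, 0], [0, 1]]}
weights = {key: identity for key in adj}
x_new, y_new = hetero_layer(identity, [[0.0, 0.0], [0.0, 0.0]], adj, weights, 0.5, 0.5)
```

Here `adj["xy"]` aggregates keyword features into legislators and `adj["yx"]` does the reverse, so each node type receives information from both its own network and the other one.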

S105: performing heterogeneous graph convolution on the topic-label-based speech network.

In a topic-label-based speech network, two relationships exist between each legislator and topic label pair. A conventional graph convolutional neural network is therefore generalized to handle different relationships between any pair of nodes, using a different weight matrix and normalization factor for each relationship type. The specific process is as follows:

[formula not reproduced]

wherein $N_i^r$ is the set of neighbors of node i under relationship type r; $c_{i,r}$ is a normalization factor, usually set to $|N_i^r|$; R represents the set of relationship types; and $h_i^{(l)}$ denotes the hidden state of node i at the l-th layer.
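For reference, a per-relation weight matrix with a per-relation normalization factor is the construction used by relational graph convolutional networks. A commonly used form of such a layer is given below; it is offered as a plausible reading of the omitted formula, not as the claimed one, and the self-connection term $W_0^{(l)} h_i^{(l)}$ in particular is an assumption.

```latex
h_i^{(l+1)} = \sigma\left( \sum_{r \in R} \sum_{j \in N_i^r} \frac{1}{c_{i,r}} W_r^{(l)} h_j^{(l)} + W_0^{(l)} h_i^{(l)} \right),
\qquad c_{i,r} = \lvert N_i^r \rvert
```

With $c_{i,r} = |N_i^r|$, the inner sum is simply the mean of the transformed hidden states of node i's neighbors under relationship r, so relationships with many edges do not dominate the update.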

In this case, the graph convolution operation based on the topic labels can be expressed as follows:

[formula not reproduced]

wherein [symbols not reproduced] represent weight matrices; [symbols not reproduced] denote the normalization factors; and $N_i$ represents the set of neighbors of node i.

S106: initializing the text information of issues using a long short-term memory network.

For an issue, the title, description, summary, etc. are the directly available text information. A long short-term memory network (LSTM) is therefore used to encode this text information into an initial representation:

$X_{lgn}(i) = \mathrm{LSTM}(t_i)$

wherein $t_i$ represents the text information of issue i, and LSTM represents the long short-term memory network.

S107: updating the representations of the legislator nodes and speech nodes with a heterogeneous graph convolutional neural network, performing joint training with a triplet loss function to learn the node representations in the heterogeneous graph and the issue representations, and measuring a legislator's voting preference for an issue by the distance between the legislator and the issue, so as to predict the legislator's voting tendency on the issue.

After the initial representations are obtained, the initial representation X_kgt(i) of each legislator and the representations of the keywords or topic labels may first be updated by means of a heterogeneous graph convolutional neural network (HGCN). Then, on the basis of the representations of the legislators, the keywords, the topic labels and the issues, the representations of the legislators and the issues are jointly learned through a triplet loss function. Specifically, a set of triplets may be sampled, each triplet consisting of a target issue a and a pair of legislators p and n whose voting results satisfy VOTE(n, a) < VOTE(p, a), where the voting results are ordered NO < NOT VOTE < YES. The purpose of the triplet loss is to shorten the distance between the target issue a and the positive sample p and to push the negative sample n away from a, until n is farther from a than p is by at least the margin. Thus, the loss function can be expressed as:

L = max(d(a, p) - d(a, n) + margin, 0)

A legislator's voting preference for an issue can thus be measured by the distance between them. The legislators may be ranked according to their distance from the issue, and their choices predicted according to the proportion of the different votes cast on the issue. In the method of the present application, this vote proportion is treated as a given input.
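The triplet construction and the loss can be illustrated with a short sketch. The source does not specify the distance function d, so Euclidean distance is assumed here; the function names, the sampling strategy and the toy vote table are likewise hypothetical.

```python
import math
import random

# Ordering of voting results used to pick positive and negative samples.
VOTE_RANK = {"NO": 0, "NOT VOTE": 1, "YES": 2}


def triplet_loss(a, p, n, margin=1.0):
    """L = max(d(a, p) - d(a, n) + margin, 0), with d assumed to be Euclidean."""
    return max(math.dist(a, p) - math.dist(a, n) + margin, 0.0)


def sample_triplets(votes, num, rng):
    """Sample (issue, p, n) with VOTE(n, issue) < VOTE(p, issue).

    votes: issue id -> {legislator: "YES" | "NOT VOTE" | "NO"}
    """
    # Only issues with at least two different voting results can yield a triplet.
    issues = [i for i, v in votes.items() if len(set(v.values())) > 1]
    triplets = []
    while issues and len(triplets) < num:
        issue = rng.choice(issues)
        p, n = rng.sample(sorted(votes[issue]), 2)
        rank_p, rank_n = VOTE_RANK[votes[issue][p]], VOTE_RANK[votes[issue][n]]
        if rank_p == rank_n:
            continue  # same vote, no preference ordering between the two
        if rank_p < rank_n:
            p, n = n, p  # make p the legislator with the higher-ranked vote
        triplets.append((issue, p, n))
    return triplets


votes = {"hr1": {"A": "YES", "B": "NO", "C": "NOT VOTE"}}
triplets = sample_triplets(votes, 5, random.Random(0))
satisfied = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
violated = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```

In the first call the positive sample is already closer to the issue than the negative one by more than the margin, so the loss is zero; in the second call the order is reversed and the loss is positive, which is what drives the representations apart during training.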

After the initial representations of the legislators and the speech nodes are obtained by concatenating the legislators' basic information and by averaging word vectors, the heterogeneous graph convolutional neural network (HGCN) is used to propagate and aggregate neighbor information, giving updated representations of the legislator and speech nodes. Training with the triplet loss is then used to project the legislators and the issues into the same vector space. Specifically, at each training iteration a set of triplets (a, p, n) is sampled, each representing an issue a and two legislators p and n who voted on it, where the voting results satisfy VOTE(n, a) < VOTE(p, a) and the votes follow the order NO < NOT VOTE < YES. After the triplets are obtained, the representations (for an issue, the vector output by the LSTM; for a legislator, the vector output by the HGCN) are input into the triplet loss function L given above to calculate the loss, and the parameters of the neural networks are updated through back propagation, so that the representations of the legislators and the issues are updated. The purpose is to bring the representation of an issue closer to the legislators who support it and farther from the legislators who oppose it. In this way, distance measures a legislator's preference for an issue; at prediction time, the legislators can be ranked by distance and their choices predicted according to the vote proportion.

Unlike the prior art, the method constructs a legislator speech network based on the keywords and topic labels in legislators' statements, and combines it with the existing co-sponsorship network of legislators, so that a bipartite graph is established between legislators and words. That is, the speech network and the original co-sponsorship network can be regarded as two subgraphs, and R3 and R4 are defined to link the nodes of these two subgraphs together, thereby establishing a bipartite graph. The present application then employs a heterogeneous graph convolutional neural network (HGCN) to update the representations of both the legislators and the words. After the long short-term memory network is used to encode the issues, the triplet loss function is applied to jointly train the representations of the legislators and the issues.
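The prediction step, ranking by distance and then assigning votes according to a given vote proportion, might look like the sketch below. It assumes Euclidean distance, assumes that the closest legislators are the ones predicted to vote YES and the farthest NO, and rounds the given shares to whole counts; the function name and the toy vectors are hypothetical.

```python
import math


def predict_votes(issue_vec, legislator_vecs, yes_share, no_share):
    """Rank legislators by distance to the issue and assign votes by the given shares."""
    ranked = sorted(legislator_vecs,
                    key=lambda name: math.dist(issue_vec, legislator_vecs[name]))
    total = len(ranked)
    n_yes = round(yes_share * total)
    # Never assign more NO votes than legislators left after the YES group.
    n_no = min(round(no_share * total), total - n_yes)
    predictions = {}
    for idx, name in enumerate(ranked):
        if idx < n_yes:
            predictions[name] = "YES"        # closest to the issue
        elif idx >= total - n_no:
            predictions[name] = "NO"         # farthest from the issue
        else:
            predictions[name] = "NOT VOTE"   # in between
    return predictions


legislator_vecs = {"A": [1.0, 0.0], "B": [2.0, 0.0], "C": [3.0, 0.0], "D": [4.0, 0.0]}
predictions = predict_votes([0.0, 0.0], legislator_vecs, yes_share=0.5, no_share=0.25)
```

The ordering of the three groups follows the vote ranking NO < NOT VOTE < YES used when sampling triplets, so legislators whose representations were pulled towards the issue during training end up in the YES group.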

The application has the following main beneficial effects:

the first scheme is that the first scheme combines the historical speech of the agenda, skillfully represents the agenda and defines the relationship between the agenda, thereby greatly improving the accuracy of roll call voting prediction.

Secondly, the method constructs a heterogeneous graph based on the mutual initiator relationship and the speech similarity of the agenda, and provides a heterogeneous graph convolution model to effectively learn the statement of the agenda.

A third, further analysis demonstrates the ability of the speaking network, a network of speaking nodes and agent nodes, including a heterogeneous graph built containing the agent's speaking information, to provide a more specific indication of the agent, and to alleviate the cold start problem to some extent.

The application also discloses a device for predicting the voting tendency of a legislator, which comprises a memory and a processor, the memory having stored therein a computer program that, when executed by the processor, performs the steps of the method of predicting the voting tendency of a legislator as described in any one of the preceding claims.

All articles and references disclosed, including patent applications and publications, are hereby incorporated by reference for all purposes. The term "consisting essentially of …" describing a combination shall include the identified element, ingredient, component or step as well as other elements, ingredients, components or steps that do not materially affect the basic novel characteristics of the combination. The use of the terms "comprising" or "including" to describe combinations of elements, components, or steps herein also contemplates embodiments that consist essentially of such elements, components, or steps. By using the term "may" herein, it is intended to indicate that any of the described attributes that "may" include are optional. A plurality of elements, components, parts or steps can be provided by a single integrated element, component, part or step. Alternatively, a single integrated element, component, part or step may be divided into separate plural elements, components, parts or steps. The disclosure of "a" or "an" to describe an element, ingredient, component or step is not intended to foreclose other elements, ingredients, components or steps.

The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. The above embodiments are merely illustrative of the technical ideas and features of the present invention, and the purpose thereof is to enable those skilled in the art to understand the contents of the present invention and implement the present invention, and not to limit the protection scope of the present invention. All equivalent changes and modifications made according to the spirit of the present invention should be covered within the protection scope of the present invention.

Claims

1. A method of predicting a voting tendency of a legislator, the method comprising:

establishing a relationship between nodes;

acquiring an initialization representation of a node;

performing heterogeneous graph convolution on the keyword-based speech network;

2. The method of predicting a voting tendency of a legislator according to claim 1, wherein the basic information of the legislator includes member ID, state of affiliation and political party; the speech published by the legislator comprises the text of the legislator's posts on Twitter; and the speech node comprises at least one of a keyword and a topic label in the speech text.

3. The method of predicting a voting tendency of a legislator according to claim 2, wherein a heterogeneous graph model is constructed, which comprises the nodes, the establishment of the relationships between the nodes, and the initialization of the nodes; the heterogeneous graph of the keyword-based speech network has three relations R1, R2 and R3, and the heterogeneous graph of the topic-label-based speech network has four relations R1, R2, R3 and R4; wherein R1 represents the co-sponsorship relation between legislator nodes, weighted by the number of issues co-sponsored by the two legislators within a preset time; R2 represents the co-occurrence relation between speech nodes, weighted by the number of co-occurrences of the two keywords or topic labels; R3 represents the relation between a legislator node and a speech node, weighted by the number of times the legislator mentions the keyword or topic label; and R4 represents the sentiment of the legislator's posts under a given topic label.

4. The method of predicting a voting tendency of a legislator according to claim 2, wherein the step of acquiring the initialization representation of a node comprises: encoding the basic information of the legislator, and concatenating the ID, state and political party information of the legislator to obtain the initial representation of the legislator by the following formula:

for the initial representation of a keyword, a GloVe word vector is used;

5. The method of predicting a voting tendency of a legislator according to claim 2, wherein in the step of heterogeneous graph convolution on the keyword-based speech network, the representations of the legislators and the keywords are updated using the following formulas:

wherein σ denotes the Sigmoid activation function; W_1^(l) and W_2^(l) are the weight matrices of the l-th hidden layer; X^(l) and Y^(l) are the node representations at the l-th layer; and λ_1 and λ_2 are weighting hyperparameters.

6. The method of predicting a voting tendency of a legislator according to claim 5, wherein in the step of heterogeneous graph convolution on the topic-label-based speech network, a conventional graph convolutional neural network is generalized to handle different relations between any pair of nodes, and a different weight matrix and normalization factor are used for each relation type, as follows:

wherein R denotes the set of relation types, h_i denotes the hidden state of node i, h_j denotes the hidden state of node j, and h_i^(l) denotes the hidden state of node i at the l-th layer.

7. The method of predicting a voting tendency of a legislator according to claim 6, wherein the graph convolution operation based on topic labels is expressed as follows:

wherein the formula uses a pair of weight matrices and a pair of normalization factors, and N_i denotes the set of neighbors of node i,

8. The method of predicting a voting tendency of a legislator according to claim 7, wherein in the step of initializing the text information of an issue based on the long short-term memory network, the title, description and summary of the issue are the directly available text information, and the text information of the issue is encoded using the long short-term memory network to obtain its initial representation:

X_lgn(i) = LSTM(t_i)

9. The method of predicting a voting tendency of a legislator according to claim 8, wherein the step of updating the representations of the legislator nodes and the speech nodes using the heterogeneous graph convolutional neural network, performing joint training through a triplet loss function, learning the node representations in the heterogeneous graph and the issue representations, and measuring a legislator's voting preference on an issue by the distance between the legislator and the issue so as to predict the legislator's voting tendency on the issue specifically comprises:

L = max(d(a, p) - d(a, n) + margin, 0);

10. An apparatus for predicting a voting tendency of a legislator, comprising: a memory and a processor, the memory having stored therein a computer program that, when executed by the processor, performs the steps of the method of predicting a voting tendency of a legislator according to any one of claims 1 to 9.