CN115878908B - Social network influence maximization method and system based on a graph attention mechanism - Google Patents

Social network influence maximization method and system based on a graph attention mechanism

Info

Publication number
CN115878908B
CN115878908B (application CN202310025466.6A)
Authority
CN
China
Prior art keywords
node
graph
nodes
social network
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310025466.6A
Other languages
Chinese (zh)
Other versions
CN115878908A (en)
Inventor
Li Yuanxin
Wang Zhenyu
Han Liu
Li Ping
Liang Chaokai
Zhong Weijie
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Post Consumer Finance Co ltd
South China University of Technology SCUT
Original Assignee
China Post Consumer Finance Co ltd
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Post Consumer Finance Co., Ltd. and South China University of Technology (SCUT)
Priority to CN202310025466.6A priority Critical patent/CN115878908B/en
Publication of CN115878908A publication Critical patent/CN115878908A/en
Application granted granted Critical
Publication of CN115878908B publication Critical patent/CN115878908B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a social network influence maximization method and system based on a graph attention mechanism, the method comprising the following steps: S1: collecting social network data and constructing graph sequence data of the social network, where the graph sequence data comprises graph adjacency matrix data and node representation feature data; S2: extracting features from the graph sequence data based on an algorithm combining a graph attention network and Node2Vec; S3: heuristically selecting candidate seeds from the feature-extracted graph sequence data, and using a greedy algorithm to select the nodes with the greatest propagation gain from the candidate seeds as the final seed nodes, thereby forming the seed node set with the greatest propagation gain. The method and system adopt the graph attention network to learn the graph structure of the social network, effectively learn more complex graph topologies, realize influence maximization in the social network, and have good usability.

Description

Social network influence maximization method and system based on a graph attention mechanism
Technical Field
The invention relates to the field of social network information propagation research, in particular to a social network influence maximization method and system based on a graph attention mechanism.
Background
With the advent of the 5G era and the continuous development of new media technologies, online social networks have become increasingly popular. In recent years, online social networks have played an important role as virtual communities: users are connected through everyday personal activities such as communication and content sharing, and through a word-of-mouth mechanism these networks have become the most effective and far-reaching propagation platforms, allowing information to affect a large number of people in a short time. Owing to its potential commercial value, influence maximization has been studied extensively in recent years as a key algorithmic problem in information dissemination research. Its goal is to select k users in a social network as seed nodes and then spread information through these users so that, during propagation, the number of influenced users is maximized. Influence maximization has many well-known applications such as viral marketing, personalized recommendation, cascade detection and information monitoring.
Currently, there are a number of approaches to the influence maximization problem in social networks. For example, Kempe et al. proved that influence maximization is NP-hard under the independent cascade model and the linear threshold model, and proposed a greedy algorithm that computes the influence of a seed node set to obtain an approximately optimal solution. However, this algorithm has drawbacks: (1) it uses Monte Carlo simulation to estimate the influence gain of each node in the network, and the number of Monte Carlo simulations must be high (generally set to 10000) to guarantee accuracy; (2) after a node is selected, the influence gain still has to be recomputed for every remaining node, so the amount of computation is very large; in huge social network data sets it is therefore difficult to find the most influential seed node set quickly and efficiently. In 2013, Cheng et al. proposed a static greedy algorithm and showed that only a small number of Monte Carlo simulations are needed in each iteration to guarantee a certain approximation; the algorithm stores a set of information propagation graphs generated by Monte Carlo simulation during the first iteration and uses this set of random graphs to estimate node influence gains in subsequent iterations. Although this avoids a large amount of unnecessary computation in the greedy algorithm and greatly improves efficiency, a long time is still required to select even a small number of seed nodes in a large-scale network.
Consequently, many heuristic algorithms have been proposed, in which researchers use structural features of the network and characteristics of the information propagation model to find highly influential nodes. Wang et al. proposed the MIA algorithm based on the independent cascade model; it assumes that a node can only influence neighbours within a surrounding local tree structure and considers only the propagation path with the largest probability, thereby simplifying the calculation of node influence propagation. However, this causes the influence ranges of the selected nodes to overlap heavily, so the overall influence spread is smaller.
Although the above research addresses the influence maximization problem, effective and accurate solutions are still lacking. Existing influence maximization methods can only exploit the shallow topology of the network and also suffer from overlapping influence coverage. Accordingly, improvements are needed to solve the influence maximization problem in existing social networks.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a social network influence maximization method and system based on a graph attention network, which adopt the multi-layer attention mechanism of the graph attention network to learn the graph structure of the social network, effectively learn more complex graph topologies, select the most influential seed nodes, realize influence maximization in the social network, and have good usability.
In order to achieve the purpose of the invention, the invention provides a social network influence maximization method based on a graph attention mechanism, which comprises the following steps:
s1: collecting social network data and constructing graph sequence data of a social network;
wherein the graph sequence data includes: graph adjacency matrix data and node representation feature data;
s2: extracting features of the graph sequence data based on a graph attention network and a Node2Vec combination algorithm;
s3: and heuristically selecting candidate seeds from the graph sequence data after feature extraction, and selecting a node with the maximum propagation degree gain from the candidate seeds by adopting a greedy algorithm as a final seed node, so as to form a seed node set with the maximum propagation degree gain.
Preferably, the specific steps of the step S2 include:
processing the node representation features X_i with the random-walk sequence sampling and negative-sampling operations of Node2Vec, then processing the processed node representation features X_i with the graph attention network layers of the graph attention network to calculate the node feature outputs, and concatenating the K output node feature vectors to obtain the final node feature vector.
Preferably, in step S2, the step of processing the node representation features X_i with the random-walk sequence sampling and negative-sampling operations of Node2Vec further comprises:
processing the node representation features X_i with Node2Vec random-walk sampling and negative sampling, and using the skip-gram algorithm to maximize the probability that the centre node v_i co-occurs with the context nodes within the left and right windows of length w, with the calculation formula:
\max_{\Phi} \Pr\big(\{v_{i-w}, \ldots, v_{i+w}\} \setminus v_i \mid \Phi(v_i)\big)
where \Phi(v_i) is the latent node representation feature of node v_i;
minimizing the final target loss function by taking the logarithm of the formula, and optimizing the target loss function to convergence with stochastic gradient descent to obtain the node representation features, where the calculation formula of the target loss function is:
J(\Phi) = -\log \Pr\big(\{v_{i-w}, \ldots, v_{i+w}\} \setminus v_i \mid \Phi(v_i)\big)
preferably, the graph attention network layer based on the graph attention network in the step S2 represents the feature X to the processed node i The specific steps of the treatment include:
based on the graph attention network, h= { H 1 ,h 2 ……h n As input features of the nodes, and calculates the attention coefficient between two nodes, wherein the calculation formula is as follows:
Figure 743484DEST_PATH_IMAGE005
wherein ,
Figure DEST_PATH_IMAGE006
representing a weight matrix for the node characteristics +.>
Figure 996611DEST_PATH_IMAGE007
Performing linear transformation>
Figure 955339DEST_PATH_IMAGE008
Representing a shared attention mechanism, +.>
Figure DEST_PATH_IMAGE009
The node +.>
Figure 344732DEST_PATH_IMAGE010
Node->
Figure DEST_PATH_IMAGE011
Is of importance;
when the attention coefficient is normalized through the activation function, the calculation formula is as follows:
Figure 691400DEST_PATH_IMAGE012
where a represents the weight of a single-layer neural network that calculates attention,
Figure DEST_PATH_IMAGE013
representing transpose operations in a matrix,/->
Figure 431823DEST_PATH_IMAGE014
Representing the operation of the connection to the two matrices.
Preferably, the step in step S2 of processing the processed node representation features X_i with the graph attention network layers, calculating the node feature outputs and concatenating the K output node feature vectors to obtain the final node feature vector further comprises:
based on the graph attention network, calculating the node feature outputs with the K attention layers of the multi-head attention mechanism, and concatenating the K output node feature vectors to obtain the final node feature vector, where the calculation formula of the multi-head attention mechanism is:
h_i' = \big\Vert_{k=1}^{K} \sigma\Big(\sum_{j \in \mathcal{N}_i} \alpha_{ij}^{k} W^{k} h_j\Big)
where ‖ denotes the concatenation operation, \alpha_{ij}^{k} denotes the attention coefficient computed by the k-th attention layer, and W^{k} denotes the learnable parameters that linearly transform the node features in the k-th layer.
Preferably, the step in step S2 of processing the processed node representation features X_i with the graph attention network layers, calculating the node feature outputs and concatenating the K output node feature vectors to obtain the final node feature vector further comprises:
the calculation method of the last graph attention network layer is: the average of the K features is taken and a nonlinear transformation is applied through a nonlinear activation function; the node calculation formula of the last layer is:
h_i' = \sigma\Big(\frac{1}{K}\sum_{k=1}^{K}\sum_{j \in \mathcal{N}_i} \alpha_{ij}^{k} W^{k} h_j\Big)
preferably, the specific step of heuristically selecting candidate seeds for the feature extracted graph sequence data in step S2 includes:
the method comprises the steps of taking the characteristic vectors of np nodes of a first-order neighbor and a second-order neighbor of a network user as related vectors of each node, calculating the similarity between the two nodes by utilizing Euclidean norms of the vectors, selecting an rnp node with the maximum similarity as a strong related node, wherein r is a strong related node coefficient, r is E (0, 1), obtaining node frequency by calculating the occurrence times of each node in the strong related node set of other nodes, sorting according to the obtained node frequency, and selecting ck node with the maximum occurrence times as a candidate seed node, wherein ck is the candidate seed node related coefficient and the seed node number respectively;
the similarity is calculated by Euclidean norm formula:
Figure 624907DEST_PATH_IMAGE020
wherein ,
Figure 785630DEST_PATH_IMAGE021
,/>
Figure 880625DEST_PATH_IMAGE022
respectively represent nodesiSum nodej
Preferably, the specific step of determining the final seed nodes from the candidate seeds with a greedy algorithm in step S3 comprises:
calculating the influence propagation degree of the candidate seed nodes with the formula:
\sigma(S) = |S| + |N(S)|
where |S| denotes the size of the set S and N(S) denotes the set of not-yet-activated neighbour nodes of S;
sorting the influence propagation degree of each node and selecting the node with the greatest influence propagation degree as a seed node, thereby forming the seed node set with the greatest propagation gain.
Preferably, the specific steps of step S1 comprise:
collecting social network data, where the social network is denoted:
G = (V, E)
where V denotes the node set, each node v_i ∈ V represents a user in the social network, and E denotes the set of edges.
Preferably, the present invention further provides a social network influence maximization system based on a graph attention mechanism, comprising:
and a network data module: graph sequence data for collecting social network data and constructing a social network, wherein the graph sequence data comprises: graph adjacency matrix data and node representation feature data;
and the feature extraction module is used for: extracting features of the graph sequence data based on a graph attention network and a Node2Vec combination algorithm;
candidate seed selection module: heuristically selecting candidate seeds from the graph sequence data after feature extraction;
seed node selection module: and selecting the node with the maximum propagation degree gain from the candidate seeds by adopting a greedy algorithm as a final seed node, thereby forming a seed node set with the maximum propagation degree gain.
The beneficial effects of the invention are as follows: the social network influence maximization method and system of the graph attention mechanism adopt the multi-layer attention mechanism of the graph attention network to learn the graph structure of the social network, effectively learn more complex graph topologies, select the most influential seed nodes, realize influence maximization in the social network, and have good usability.
Drawings
The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular description of preferred embodiments of the invention, as illustrated in the accompanying drawings. Like reference numerals refer to like parts throughout the drawings; the drawings are not necessarily drawn to scale, the emphasis instead being placed on illustrating the principles of the invention.
FIG. 1 is a schematic flow chart of the social network influence maximization method and system of the graph attention mechanism according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an embodiment of a method and a system for maximizing social network impact of a graph attention mechanism according to an embodiment of the present invention;
fig. 3 is a schematic diagram of feature extraction based on a graph attention network and a Node2Vec combining algorithm according to an embodiment of the present invention.
Detailed Description
The technical solution of the present invention will be further described in detail below with reference to the accompanying drawings and specific examples, so that those skilled in the art can better understand and implement the present invention; the embodiments, however, are not limiting.
In a first embodiment, please refer to fig. 1-3; an embodiment of the present invention provides a social network influence maximization method based on a graph attention mechanism, which includes the following steps:
s1: collecting social network data and constructing graph sequence data of a social network;
wherein the graph sequence data includes: graph adjacency matrix data and node representation feature data;
s2: inputting the graph sequence data into an influence maximization model combining a graph attention network and Node2Vec for feature embedding learning, so as to extract information relevant to the influence maximization problem;
s3: heuristically selecting candidate seeds from the feature-extracted graph sequence data, and using a greedy algorithm to select the nodes with the greatest propagation gain from the candidate seeds as the final seed nodes, thereby forming the seed node set with the greatest propagation gain and avoiding influence overlap. An influence maximization model based on a graph attention mechanism is thus constructed (comprising a node feature extraction method combining a graph attention network and Node2Vec, a heuristic candidate seed selection method, and a greedy-algorithm-based seed selection method).
The beneficial effects of the invention are as follows: for complex graph data, the method can effectively exploit the graph topology and alleviates the problem of suboptimal seed sets. In the node feature processing stage, Node2Vec is used to learn the shallow graph structure, and the graph attention mechanism is used to process the deeper graph topology. For seed selection, a heuristic algorithm selects candidate seed nodes and the greedy algorithm CELF selects the most influential seed nodes, which alleviates the influence overlap problem. Because the influence function is monotone and submodular, an approximately optimal solution to the influence maximization problem is guaranteed, and the method has good usability.
Referring to fig. 2-3, in a preferred embodiment, the specific steps of step S2 include:
to obtain feature embedding of a Node, feature X is represented to the Node by Node2Vec i Random walk sequence with sampling and processing of negative sampling operations (based on second order random walk super parameters
Figure 7533DEST_PATH_IMAGE031
and />
Figure DEST_PATH_IMAGE032
To generate a random walk sequence), the graph attention network layer of the graph attention network representing the feature X to the processed nodes based on the graph attention network i And then processing and calculating node characteristic output, and splicing the K output node characteristic vectors to obtain the final node characteristic vector. (in the preferred embodiment, node2Vec generates Node feature dimension is 512, sampling sequence length is 6, and the number of sampling sequences per Node is 200, wherein the learning rate in the graph attention network model is 0.0001, the first layer graph attention network output dimension is 256, and the second layer network output dimension is 16).
Referring to FIGS. 2-3, in a preferred embodiment, the step in step S2 of processing the node representation features X_i with the random-walk sequence sampling and negative-sampling operations of Node2Vec further comprises:
the Node2Vec random-walk sampling and negative-sampling processing of the node representation features X_i takes into account the similarity of both low-order and high-order neighbours, so it can flexibly capture the homophily and structural equivalence of nodes in the graph; the skip-gram algorithm is used to maximize the probability that the centre node v_i co-occurs with the context nodes within the left and right windows of length w, with the calculation formula:
\max_{\Phi} \Pr\big(\{v_{i-w}, \ldots, v_{i+w}\} \setminus v_i \mid \Phi(v_i)\big)
where \Phi(v_i) is the latent node representation feature of node v_i;
the final target loss function is minimized by taking the logarithm of the formula (the reconstruction loss of the graph structure is computed and the graph is trained without supervision to obtain a low-dimensional node feature representation), and the target loss function is optimized to convergence with stochastic gradient descent to obtain the node representation features, where the calculation formula of the target loss function is:
J(\Phi) = -\log \Pr\big(\{v_{i-w}, \ldots, v_{i+w}\} \setminus v_i \mid \Phi(v_i)\big)
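As an illustration of this embedding step, the following is a minimal sketch that generates random walks over a NetworkX graph and fits a skip-gram model with negative sampling via gensim. It is a simplification under stated assumptions: the walks are uniform rather than biased by the second-order parameters p and q, the window size 3 is illustrative, and the karate-club graph stands in for the collected social network; only the dimension 512, walk length 6 and 200 walks per node follow the values quoted above.

```python
import random
import networkx as nx
from gensim.models import Word2Vec  # skip-gram with negative sampling

def random_walks(G, walk_length=6, num_walks=200, seed=0):
    """Generate uniform random walks (a simplification of the biased
    second-order Node2Vec walk controlled by p and q)."""
    rng = random.Random(seed)
    walks, nodes = [], list(G.nodes())
    for _ in range(num_walks):          # num_walks walks starting from every node
        rng.shuffle(nodes)
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                nbrs = list(G.neighbors(walk[-1]))
                if not nbrs:
                    break
                walk.append(rng.choice(nbrs))
            walks.append([str(v) for v in walk])
    return walks

G = nx.karate_club_graph()              # stand-in for the collected social graph
walks = random_walks(G, walk_length=6, num_walks=200)

# Skip-gram (sg=1) with negative sampling maximizes the co-occurrence
# probability of a centre node and its context nodes within window w.
model = Word2Vec(walks, vector_size=512, window=3, sg=1,
                 negative=5, min_count=0, epochs=5, workers=2)

X = {int(v): model.wv[v] for v in model.wv.index_to_key}  # node -> features X_i
```

The dictionary X would then be fed to the graph attention layers described next.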
referring to fig. 2-3, in a further preferred embodiment, the graph attention network layer based on the graph attention network in step S2 represents the feature X to the processed node i The specific steps of the treatment include:
in the aspect of graph structure data, through the action of an attention mechanism, a user can allocate different attention to the neighbor nodes, and a larger attention coefficient is allocated to the neighbor nodes with similar hobbies or similar topological structures, so that the nodes can learn more complex graph topological structure characteristics. Therefore, based on the graph attention network, h= { H 1 ,h 2 ……h n As input features of the nodes, and calculates the attention coefficient between two nodes, wherein the calculation formula is as follows:
Figure 370567DEST_PATH_IMAGE005
wherein ,
Figure 824682DEST_PATH_IMAGE006
representing a weight matrix for the node characteristics +.>
Figure 868861DEST_PATH_IMAGE007
Performing linear transformation>
Figure 533061DEST_PATH_IMAGE008
Representing a shared attention mechanism, +.>
Figure 320888DEST_PATH_IMAGE009
The node +.>
Figure 754144DEST_PATH_IMAGE010
Node->
Figure 969224DEST_PATH_IMAGE011
Is of importance; for the purpose of node->
Figure 996086DEST_PATH_IMAGE011
A larger difference is made in their neighbor attention coefficients, where the attention coefficients can be normalized using the softmax function:
Figure DEST_PATH_IMAGE035
in which a single layer neural network is used for computation
Figure 921360DEST_PATH_IMAGE009
Then use LeakyReLUThe method is characterized in that the attention coefficient is normalized through the activation function as a nonlinear activation function, and the calculation formula is as follows:
Figure 350067DEST_PATH_IMAGE036
where a represents the weight of a single-layer neural network that calculates attention,
Figure 860683DEST_PATH_IMAGE013
representing transpose operations in a matrix,/->
Figure 374841DEST_PATH_IMAGE014
Representing the operation of the connection to the two matrices.
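The following is a compact PyTorch sketch of a single attention head implementing the computation above: e_{ij} = LeakyReLU(a^T [W h_i ‖ W h_j]) followed by a softmax over each node's neighbourhood. The dense adjacency mask, tensor shapes and initialization are illustrative assumptions rather than the patent's exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionLayer(nn.Module):
    """One attention head: scores each edge with the shared mechanism a(.,.)
    and normalizes the coefficients over every node's neighbourhood."""
    def __init__(self, in_dim, out_dim, alpha=0.2):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)      # linear transform W h_i
        self.a = nn.Parameter(torch.empty(2 * out_dim, 1))   # shared attention vector a
        nn.init.xavier_uniform_(self.a)
        self.leaky_relu = nn.LeakyReLU(alpha)

    def forward(self, h, adj):
        # h: [N, in_dim] node features, adj: [N, N] adjacency (with self-loops)
        Wh = self.W(h)                                        # [N, out_dim]
        src = Wh @ self.a[: Wh.size(1)]                       # [N, 1], a_1^T W h_i
        dst = Wh @ self.a[Wh.size(1):]                        # [N, 1], a_2^T W h_j
        e = self.leaky_relu(src + dst.T)                      # e_ij for every pair
        e = e.masked_fill(adj == 0, float("-inf"))            # keep only neighbours
        alpha = F.softmax(e, dim=1)                           # normalized coefficients
        return alpha @ Wh                                      # attention-weighted features
```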
Referring to fig. 2-3, in a further preferred embodiment, the step in step S2 of processing the processed node representation features X_i with the graph attention network layers, calculating the node feature outputs and concatenating the K output node feature vectors to obtain the final node feature vector further comprises:
to make the self-attention mechanism more stable, based on the graph attention network, the K attention layers of the multi-head attention mechanism are used to calculate different node feature outputs, and the K output node feature vectors are concatenated to form the final node feature vector, where the calculation formula of the multi-head attention mechanism is:
h_i' = \big\Vert_{k=1}^{K} \sigma\Big(\sum_{j \in \mathcal{N}_i} \alpha_{ij}^{k} W^{k} h_j\Big)
where ‖ denotes the concatenation operation, \alpha_{ij}^{k} denotes the attention coefficient computed by the k-th attention layer, and W^{k} denotes the learnable parameters that linearly transform the node features in the k-th layer.
Referring to fig. 2-3, in a preferred embodiment, the step in step S2 of processing the processed node representation features X_i with the graph attention network layers, calculating the node feature outputs and concatenating the K output node feature vectors to obtain the final node feature vector further comprises:
the calculation method of the last graph attention network layer is: for the last layer of the model, the K head features are not concatenated; instead the average of the K features is taken and a nonlinear transformation is applied through a nonlinear activation function. The node update formula of the last layer is:
h_i' = \sigma\Big(\frac{1}{K}\sum_{k=1}^{K}\sum_{j \in \mathcal{N}_i} \alpha_{ij}^{k} W^{k} h_j\Big)
The graph attention network layers process the nodes and learn further graph structure. The method ignores the degree of the nodes during the network search and generates a fixed number of paths for each node, learning a complex graph topological structure and improving the effectiveness of the algorithm.
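As a sketch of this two-layer arrangement, the snippet below assumes PyTorch Geometric is available (the library choice is not stated in the patent): the first layer concatenates the K head outputs and the last layer averages them (concat=False). The 256- and 16-dimensional outputs follow the values quoted earlier; heads=4 is illustrative.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GATConv

class TwoLayerGAT(torch.nn.Module):
    """First layer concatenates the K head outputs; the last layer averages
    them (concat=False) and applies a nonlinear activation."""
    def __init__(self, in_dim, heads=4):
        super().__init__()
        self.gat1 = GATConv(in_dim, 256, heads=heads, concat=True)
        self.gat2 = GATConv(256 * heads, 16, heads=heads, concat=False)

    def forward(self, x, edge_index):
        x = F.elu(self.gat1(x, edge_index))      # concatenated multi-head features
        return F.elu(self.gat2(x, edge_index))   # averaged heads on the final layer

# model = TwoLayerGAT(in_dim=512)
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # learning rate from the text
```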
Referring to fig. 2-3, in a further preferred embodiment, the step in step S3 of heuristically selecting candidate seeds from the feature-extracted graph sequence data comprises:
the Euclidean norm of the vectors is used to measure the similarity between node pairs; if two nodes have a high similarity, they are considered more likely to influence each other, otherwise they are considered less likely to influence each other.
The propagation path of information on a social network is generally fairly short, and the propagation range is largely limited to the second-order neighbours of a node, so the feature vectors of the first-order and second-order neighbours of a user are selected as the related vectors of each node in the network. The similarity between node pairs is calculated with these feature vectors and the nodes are then sorted.
When selecting candidate seeds, the social network influence maximization method and system of the graph attention mechanism select candidate seed nodes with a heuristic method. After the node feature representations in the network have been obtained, the Euclidean distance between nodes is used to evaluate the similarity between nodes, and the strongly related node set of each node is computed. Finally, the frequency with which each node appears in the strongly related node sets of the other nodes is counted, the nodes are sorted, and the high-frequency nodes are selected as the candidate seed node set.
Selecting seed nodes only within the candidate seed node set speeds up the whole algorithm. Meanwhile, to avoid the problem of influence overlap between nodes, the optimized greedy algorithm CELF is used at this stage to select the final seed node set from the candidate seed nodes (a sketch of the selection heuristic follows this description).
Specifically, the feature vectors of the np nodes among the first-order and second-order neighbours of each network user are taken as the related vectors of that node, the similarity between two nodes is calculated with the Euclidean norm of the vectors, and the r·np nodes with the greatest similarity are selected as strongly related nodes, where r is the strongly related node coefficient, r ∈ (0, 1), and the node count is rounded down; the node frequency is obtained by counting how many times each node appears in the strongly related node sets of the other nodes, the nodes are sorted by the obtained frequency, and the c·k nodes that appear most often are selected as candidate seed nodes, where c and k are the candidate seed coefficient (generally set to 10) and the number of seed nodes (set to 50) respectively;
the similarity is calculated with the Euclidean norm formula:
d(i, j) = \lVert x_i - x_j \rVert_2
where x_i and x_j denote the feature vectors of node i and node j respectively.
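The sketch below illustrates this candidate-selection heuristic under stated assumptions: X is the dictionary of learned node embeddings, np_related caps the number of related vectors per node, and the values of np_related, r, c and k are illustrative (only c = 10 and k = 50 follow the text).

```python
import numpy as np
import networkx as nx
from collections import Counter

def candidate_seeds(G, X, np_related=20, r=0.5, c=10, k=50):
    """Rank nodes by how often they appear in other nodes' strongly
    related sets and return the c*k most frequent ones as candidates."""
    freq = Counter()
    for u in G.nodes():
        # first- and second-order neighbours of u (at most np_related of them)
        hop1 = set(G.neighbors(u))
        hop2 = {w for v in hop1 for w in G.neighbors(v)} - hop1 - {u}
        related = list(hop1 | hop2)[:np_related]
        if not related:
            continue
        # similarity measured by the Euclidean norm of the feature difference
        by_dist = sorted(related, key=lambda v: np.linalg.norm(X[u] - X[v]))
        strong = by_dist[: max(1, int(r * len(by_dist)))]  # r*np strongly related nodes
        freq.update(strong)
    return [v for v, _ in freq.most_common(c * k)]         # c*k candidate seed nodes
```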
Referring to fig. 2-3, in a preferred embodiment, the step in step S3 of determining the final seed nodes from the candidate seeds with a greedy algorithm comprises:
after the candidate nodes have been selected, the final k seed nodes are chosen from the candidate node set. There may be an influence overlap problem among the chosen candidate nodes: if node u alone or node v alone could ideally influence all of its surrounding neighbours, then once node u has been selected into the seed node set, adding node v to the set no longer improves its influence propagation degree. A heuristic formula is therefore used here to measure the information propagation degree of a seed set S.
The influence propagation degree of the candidate seed nodes is calculated heuristically with the formula:
\sigma(S) = |S| + |N(S)|
where |S| denotes the size of the set S and N(S) denotes the set of not-yet-activated neighbour nodes of S;
the influence propagation degree of each node is sorted, and the node with the greatest influence propagation degree is selected as a seed node, thereby forming the seed node set with the greatest propagation gain.
Under the independent cascade propagation model, \sigma(S) is monotone and submodular. The marginal influence of the nodes satisfies submodularity, and this property is used to optimize the CELF algorithm when computing the seed node set S: after the first node A has been added to the seed set according to its marginal influence, the node B with the next largest marginal influence from the first round is re-evaluated; if the new marginal influence of node B is still greater than or equal to the previously computed marginal influence of node C (the node ranked just below B), node B is taken directly as the next seed node without recomputing the marginal influence of any later node. If the new marginal influence of node B is not greater than or equal to the last-round marginal influence of node C, the marginal influences of the nodes are recomputed one by one, and at each step the node with the largest marginal influence is selected as a seed node and placed into the seed set.
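A lazy-greedy (CELF) sketch over the candidate set is shown below. It assumes the heuristic spread reconstructed above, σ(S) = |S| + |N(S)\S|; a full implementation could plug in the Monte Carlo influence estimate instead, and the heap-based re-evaluation is the standard CELF trick rather than the patent's exact code.

```python
import heapq
import networkx as nx

def spread(G, S):
    """Heuristic influence spread: seed-set size plus the number of
    not-yet-activated neighbours of the set (monotone and submodular)."""
    S = set(S)
    neighbours = {v for u in S for v in G.neighbors(u)} - S
    return len(S) + len(neighbours)

def celf(G, candidates, k=50):
    # initial marginal gains over the empty set, kept in a max-heap (negated for heapq)
    heap = [(-spread(G, [v]), v, 0) for v in candidates]
    heapq.heapify(heap)
    S = []
    while len(S) < k and heap:
        neg_gain, v, last = heapq.heappop(heap)
        if last == len(S):              # gain already valid for the current seed set
            S.append(v)
        else:                           # lazily re-evaluate the marginal gain
            new_gain = spread(G, S + [v]) - spread(G, S)
            heapq.heappush(heap, (-new_gain, v, len(S)))
    return S
```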
Referring to fig. 2-3, in a preferred embodiment, the specific steps of step S1 include:
collecting social network data, where the social network is denoted:
G = (V, E)
where V denotes the node set, each node v_i ∈ V represents a user in the social network, and E denotes the set of edges (an edge represents the influence that may arise between nodes; the social network is modeled as a graph sequence, a user propagates information in the social network and influences other users through information propagation, and the goal is to maximize the number of influenced users during the propagation process).
In this embodiment, the collected network data, provided by the arXiv platform, records collaborations between authors in the field of high-energy physics theory: if two authors have co-written at least one paper, an undirected edge is created between them. The collected data contains 31376 edges and 15229 nodes. Algorithm performance is measured with the influence spread under an independent cascade model with probability p = 0.1, computed by repeating the Monte Carlo simulation 10000 times.
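The evaluation protocol quoted above can be sketched as follows: a Monte Carlo estimate of the influence spread under the independent cascade model with activation probability p = 0.1. The number of runs shown is illustrative; the patent repeats the simulation 10000 times.

```python
import random
import networkx as nx

def ic_spread(G, seeds, p=0.1, runs=1000, seed=0):
    """Monte Carlo estimate of influence spread under the independent cascade
    model: each newly activated node gets one chance to activate each
    inactive neighbour with probability p."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        active = set(seeds)
        frontier = list(seeds)
        while frontier:
            nxt = []
            for u in frontier:
                for v in G.neighbors(u):
                    if v not in active and rng.random() < p:
                        active.add(v)
                        nxt.append(v)
            frontier = nxt
        total += len(active)
    return total / runs
```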
In a second embodiment, the present application further provides an academic citation network influence maximization method based on a graph attention mechanism;
Step S1: collecting academic citation network data and constructing graph sequence data of the academic citation network; in this embodiment the data set consists of 2708 nodes and 5429 edges. Algorithm performance is measured with the influence spread under an independent cascade model with probability p = 0.1, computed by repeating the Monte Carlo simulation 10000 times.
Step S2: inputting the academic citation graph sequence data into the Node2Vec algorithm to learn the shallow graph topology into the node representations X_i; then using the multi-head attention mechanism to learn from the preliminarily processed graph sequence features, processing X_i with two graph attention network layers to calculate different node feature outputs, and concatenating the K node feature vectors as the final node feature vector. Finally, the reconstruction loss of the graph structure is calculated, and the graph is trained without supervision to obtain a low-dimensional node feature representation.
Step S3: inputting the feature-extracted graph sequence data, selecting candidate seeds with the heuristic method, and re-screening the candidate seeds with the optimized greedy algorithm to determine the final seed nodes.
in a third embodiment, the present application further provides a twitter network impact maximizing method based on a graph attention mechanism;
step S1: collecting data of a twitter network, and constructing graph sequence data of the twitter network, wherein a data set consists of 3312 nodes and 4732 edges in the embodiment; algorithm performance was measured using the influence spread, performed under an independent cascade model with probability p=0.1, calculated by repeating 10000 monte carlo simulations.
Step S2: inputting graph sequence data of the twitter network into a Node2Vec algorithm to learn a shallow graph topological structure into a Node representation
Figure 748294DEST_PATH_IMAGE041
In the method, a multi-head attention mechanism is utilized to learn the graph sequence characteristics after preliminary processing, and a two-layer graph attention network layer pair is used for +.>
Figure 142366DEST_PATH_IMAGE041
Processing, calculating different node characteristic outputs, and then adding this +.>
Figure 225729DEST_PATH_IMAGE042
The individual node feature vectors are stitched together as the final node feature vector. And finally, calculating the reconstruction loss of the graph structure, and performing unsupervised training on the graph to obtain a low-dimensional node characteristic representation.
Step S3: inputting graph sequence data with extracted features, selecting candidate seeds by using a heuristic method, and re-selecting the candidate seeds by using an optimized greedy algorithm to determine final seed nodes;
referring to fig. 1-3, in a preferred embodiment, the present invention further provides a social network impact maximizing system of a graph annotation mechanism, including:
and a network data module: graph sequence data for collecting social network data and constructing a social network, wherein the graph sequence data comprises: graph adjacency matrix data and node representation feature data;
and the feature extraction module is used for: extracting features of the graph sequence data based on a graph attention network and a Node2Vec combination algorithm;
candidate seed selection module: heuristically selecting candidate seeds from the graph sequence data after feature extraction;
seed node selection module: and selecting the node with the maximum propagation degree gain from the candidate seeds by adopting a greedy algorithm as a final seed node, thereby forming a seed node set with the maximum propagation degree gain.
The social network influence maximization system of the graph attention mechanism provided in this embodiment is based on the same inventive concept as the social network influence maximization method of the graph attention mechanism described above, and the description of the method applies equally to the system.
The beneficial effects of the invention are as follows: the invention provides a social network influence maximization method and a social network influence maximization system for a graph attention mechanism, which adopt a multi-layer attention mechanism of the graph attention network to learn a graph structure of the social network, effectively learn a more complex graph topological structure, select a seed node with the most influence, realize influence maximization in the social network and have better usability.
The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the invention; any equivalent structural or process transformation made using the content of this specification and the accompanying drawings, or any direct or indirect application in other related technical fields, likewise falls within the scope of protection of the present invention.

Claims (6)

1. A social network influence maximization method based on a graph attention mechanism, characterized by comprising the following steps:
s1: collecting, from the social network data, network data of author collaborations in the field of high-energy physics theory, or of academic citation or Twitter networks, and constructing graph sequence data of the social network;
wherein the graph sequence data includes: graph adjacency matrix data and node representation feature data;
s2: processing the node representation features X_i with the random-walk sequence sampling and negative-sampling operations of Node2Vec, and using the skip-gram algorithm to maximize the probability that the centre node v_i co-occurs with the context nodes within the left and right windows of length w; the calculation formula is:
\max_{\Phi} \Pr\big(\{v_{i-w}, \ldots, v_{i+w}\} \setminus v_i \mid \Phi(v_i)\big)
where \Phi(v_i) is the latent node representation feature of node v_i;
minimizing the final target loss function by taking the logarithm of the formula, and optimizing the target loss function to convergence with stochastic gradient descent to obtain the node representation features;
the calculation formula of the target loss function is:
J(\Phi) = -\log \Pr\big(\{v_{i-w}, \ldots, v_{i+w}\} \setminus v_i \mid \Phi(v_i)\big)
based on the graph attention network, calculating the node feature outputs with the K attention layers of the multi-head attention mechanism, and concatenating the K output node feature vectors to obtain the final node feature vector;
s3: taking the feature vectors of the np nodes among the first-order and second-order neighbours of each network user as the related vectors of that node, calculating the similarity between two nodes with the Euclidean norm of the vectors, and selecting the r·np nodes with the greatest similarity as strongly related nodes, where r is the strongly related node coefficient and r ∈ (0, 1); obtaining the node frequency by counting how many times each node appears in the strongly related node sets of the other nodes, sorting by the obtained node frequency, and selecting the c·k nodes that appear most often as candidate seed nodes, where c and k are the candidate seed coefficient and the number of seed nodes respectively;
the similarity is calculated with the Euclidean norm formula:
d(i, j) = \lVert x_i - x_j \rVert_2
where x_i and x_j denote the feature vectors of node i and node j respectively;
calculating the influence propagation degree of the candidate seed nodes, sorting the influence propagation degree of each node, and selecting the node with the greatest influence propagation degree as a seed node, thereby forming the seed node set with the greatest propagation gain;
the calculation formula of the influence propagation degree of the candidate seed nodes is:
\sigma(S) = |S| + |N(S)|
where |S| denotes the size of the set S and N(S) denotes the set of not-yet-activated neighbour nodes of S.
2. The method for maximizing the influence of a social network as recited in claim 1, wherein the step in step S2 of processing the processed node representation features X_i with the graph attention network layers of the graph attention network specifically comprises:
based on the graph attention network, taking H = {h_1, h_2, …, h_n} as the input features of the nodes, and calculating the attention coefficient between two nodes with the formula:
e_{ij} = a(W h_i, W h_j)
where W is the weight matrix that linearly transforms the node features h_i, a denotes the shared attention mechanism, and e_{ij} denotes the importance of node j to node i;
the attention coefficients normalized through the activation function are calculated as:
\alpha_{ij} = \frac{\exp(\mathrm{LeakyReLU}(a^{T}[W h_i \,\|\, W h_j]))}{\sum_{k \in \mathcal{N}_i} \exp(\mathrm{LeakyReLU}(a^{T}[W h_i \,\|\, W h_k]))}
where ^T denotes the matrix transpose operation and ‖ denotes the concatenation of the two matrices.
3. The method for maximizing the influence of a social network as recited in claim 1, wherein the step in step S2 of calculating the node feature outputs based on the graph attention network with the K attention layers of the multi-head attention mechanism and concatenating the K output node feature vectors to obtain the final node feature vector further comprises:
the calculation formula of the multi-head attention mechanism is:
h_i' = \big\Vert_{k=1}^{K} \sigma\Big(\sum_{j \in \mathcal{N}_i} \alpha_{ij}^{k} W^{k} h_j\Big)
where ‖ denotes the concatenation operation, \alpha_{ij}^{k} denotes the attention coefficient computed by the k-th attention layer, and W^{k} denotes the learnable parameters that linearly transform the node features in the k-th layer.
4. The method for maximizing the influence of a social network as recited in claim 3, wherein the step in step S2 of calculating the node feature outputs based on the graph attention network with the K attention layers of the multi-head attention mechanism and concatenating the K output node feature vectors to obtain the final node feature vector further comprises:
the calculation method of the last graph attention network layer is: the average of the K features is taken and a nonlinear transformation is applied through a nonlinear activation function; the node calculation formula of the last layer is:
h_i' = \sigma\Big(\frac{1}{K}\sum_{k=1}^{K}\sum_{j \in \mathcal{N}_i} \alpha_{ij}^{k} W^{k} h_j\Big)
5. The method for maximizing the influence of a social network as recited in claim 1, wherein the specific step of step S1 comprises:
collecting social network data, where the social network is denoted:
G = (V, E)
where V denotes the node set, each node v_i ∈ V represents a user in the social network, and E denotes the set of edges.
6. A social network influence maximization system based on a graph attention mechanism, characterized by comprising:
a network data module: for collecting, from the social network data, network data of author collaborations in the field of high-energy physics theory, or of academic citation or Twitter networks, where the graph sequence data comprises: graph adjacency matrix data and node representation feature data;
a feature extraction module: for processing the node representation features X_i with the random-walk sequence sampling and negative-sampling operations of Node2Vec, and using the skip-gram algorithm to maximize the probability that the centre node v_i co-occurs with the context nodes within the left and right windows of length w;
the calculation formula is:
\max_{\Phi} \Pr\big(\{v_{i-w}, \ldots, v_{i+w}\} \setminus v_i \mid \Phi(v_i)\big)
where \Phi(v_i) is the latent node representation feature of node v_i;
minimizing the final target loss function by taking the logarithm of the formula, and optimizing the target loss function to convergence with stochastic gradient descent to obtain the node representation features;
the calculation formula of the target loss function is:
J(\Phi) = -\log \Pr\big(\{v_{i-w}, \ldots, v_{i+w}\} \setminus v_i \mid \Phi(v_i)\big)
based on the graph attention network, calculating the node feature outputs with the K attention layers of the multi-head attention mechanism, and concatenating the K output node feature vectors to obtain the final node feature vector;
a candidate seed selection module: for taking the feature vectors of the np nodes among the first-order and second-order neighbours of each network user as the related vectors of that node, calculating the similarity between two nodes with the Euclidean norm of the vectors, and selecting the r·np nodes with the greatest similarity as strongly related nodes, where r is the strongly related node coefficient and r ∈ (0, 1); obtaining the node frequency by counting how many times each node appears in the strongly related node sets of the other nodes, sorting by the obtained node frequency, and selecting the c·k nodes that appear most often as candidate seed nodes, where c and k are the candidate seed coefficient and the number of seed nodes respectively;
the similarity is calculated with the Euclidean norm formula:
d(i, j) = \lVert x_i - x_j \rVert_2
where x_i and x_j denote the feature vectors of node i and node j respectively;
a seed node selection module: for calculating the influence propagation degree of the candidate seed nodes, sorting the influence propagation degree of each node, and selecting the node with the greatest influence propagation degree as a seed node, thereby forming the seed node set with the greatest propagation gain;
the calculation formula of the influence propagation degree of the candidate seed nodes is:
\sigma(S) = |S| + |N(S)|
where |S| denotes the size of the set S and N(S) denotes the set of not-yet-activated neighbour nodes of S.
CN202310025466.6A 2023-01-09 2023-01-09 Social network influence maximization method and system based on a graph attention mechanism Active CN115878908B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310025466.6A CN115878908B (en) Social network influence maximization method and system based on a graph attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310025466.6A CN115878908B (en) Social network influence maximization method and system based on a graph attention mechanism

Publications (2)

Publication Number Publication Date
CN115878908A CN115878908A (en) 2023-03-31
CN115878908B true CN115878908B (en) 2023-06-02

Family

ID=85758315

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310025466.6A Active CN115878908B (en) 2023-01-09 2023-01-09 Social network influence maximization method and system of graph annotation meaning force mechanism

Country Status (1)

Country Link
CN (1) CN115878908B (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111898041A (en) * 2020-07-20 2020-11-06 电子科技大学 Social network combined circle layer user comprehensive influence evaluation and counterfeiting discrimination method
CN111898040A (en) * 2020-07-20 2020-11-06 电子科技大学 Circle layer user influence evaluation method combined with social network
CN112214689A (en) * 2020-10-22 2021-01-12 上海交通大学 Method and system for maximizing influence of group in social network
CN112330136A (en) * 2020-11-02 2021-02-05 国网江苏省电力有限公司电力科学研究院 Relevance mining method and device for abnormal electricity utilization analysis data set of large user
CN112446634A (en) * 2020-12-03 2021-03-05 兰州大学 Method and system for detecting influence maximization node in social network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Research on Social Influence Prediction Based on Preference Propagation; Chen Hongfei; China Master's Theses Full-text Database, Basic Sciences (No. 8); A002-81 *
A Social Network Influence Maximization Algorithm Based on Heuristic and Greedy Strategies; Cao Jiuxin et al.; Journal of Southeast University; Vol. 46, No. 5; pp. 950-956 *

Also Published As

Publication number Publication date
CN115878908A (en) 2023-03-31

Similar Documents

Publication Publication Date Title
Ma et al. Adaptive-step graph meta-learner for few-shot graph classification
CN108009575A (en) A kind of community discovery method for complex network
CN109766710B (en) Differential privacy protection method of associated social network data
CN112446634B (en) Method and system for detecting influence maximization node in social network
CN114064627A (en) Knowledge graph link completion method and system for multiple relations
CN110866134A (en) Image retrieval-oriented distribution consistency keeping metric learning method
Panagopoulos et al. Influence maximization using influence and susceptibility embeddings
CN109948242A (en) Network representation learning method based on feature Hash
CN109783805A (en) A kind of network community user recognition methods and device
Zhou et al. Approximate deep network embedding for mining large-scale graphs
Yu et al. Unsupervised euclidean distance attack on network embedding
Wickman et al. A Generic Graph Sparsification Framework using Deep Reinforcement Learning
Wei et al. Auto-prox: Training-free vision transformer architecture search via automatic proxy discovery
Wang et al. A multi-agent genetic algorithm for local community detection by extending the tightest nodes
CN112231579B (en) Social video recommendation system and method based on implicit community discovery
CN113989544A (en) Group discovery method based on deep map convolution network
CN116955846B (en) Cascade information propagation prediction method integrating theme characteristics and cross attention
CN109472712A (en) A kind of efficient Markov random field Combo discovering method strengthened based on structure feature
CN115878908B (en) Social network influence maximization method and system based on a graph attention mechanism
Gialampoukidis et al. Community detection in complex networks based on DBSCAN* and a Martingale process
CN117272195A (en) Block chain abnormal node detection method and system based on graph convolution attention network
CN115661861A (en) Skeleton behavior identification method based on dynamic time sequence multidimensional adaptive graph convolution network
CN112256756B (en) Influence discovery method based on ternary association diagram and knowledge representation
CN114722920A (en) Deep map convolution model phishing account identification method based on map classification
Ibrahim et al. Under-counted tensor completion with neural incorporation of attributes

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant