CN116049695A - Group perception and standing analysis method, system and electronic equipment crossing social network

Info

Publication number: CN116049695A
Application number: CN202211643877.3A
Authority: CN
Inventors: 李晓宇; 金力; 张泽群; 李树超; 刘庆; 姚方龙; 关世昌; 马豪伟; 董鹏程
Original assignee: Aerospace Information Research Institute of CAS
Current assignee: Aerospace Information Research Institute of CAS
Priority date: 2022-12-20
Filing date: 2022-12-20
Publication date: 2023-05-02
Anticipated expiration: 2042-12-20
Also published as: CN116049695B

Abstract

The invention provides a cross-social-network group perception and stance analysis method, system and electronic device. The method comprises: using a community network representation based on non-Euclidean hyperbolic space to align users across social networks and to select an aggregated associated representation space; fusing the multi-source heterogeneous information of user content attributes and topological structure; and obtaining the stance tendency of each user group. The invention can improve the efficiency and accuracy of group perception and stance prediction.

Description

Cross-social-network group perception and stance analysis method, system and electronic device

Technical Field

The invention relates to the field of artificial intelligence, and in particular to a cross-social-network group perception and stance analysis method, system and electronic device.

Background

Stance detection is one of the leading research branches in natural language processing. It aims to detect, from the information a user publishes, the user's opinion or attitude toward an event and its associated targets, such as "support, neutral or oppose". Stance detection is an important application of text classification. Compared with sentiment analysis, the signal to be detected is more implicit and obscure and depends strongly on the target, which makes the classification task harder.

Traditional stance detection mainly targets offline communication data and online forum posts. With the development of new media technologies, however, social network platforms such as Sina and Douyin in China and Twitter and Facebook abroad have become the main channel through which users publish their opinions. Stance detection tasks designed for social media platforms have therefore emerged, datasets based on social media platforms have been published in China and abroad, and exploratory models and methods have been proposed.

In addition, conventional stance detection works at the level of the individual user. However, on the one hand, a user's stance evolves through interaction: an individual neither fully accepts nor fully ignores the stances of other individuals. On the other hand, the stances of individual users aggregate into an overall opinion. Many studies have therefore investigated group stance analysis and decision prediction. The core idea is to treat users as a group, predict the overall stance of the network user group by studying the interaction among users, and finally form a perception of public opinion. Group stance detection on social media considers the interaction of the user group, continuously updates and fuses the group's opinions and attitudes on the same issue, and finally arrives at different public opinion states such as consensus, polarization or fragmentation.

Based on how features are represented and extracted, existing group stance detection methods for social networks fall into three main categories: methods based on feature engineering, methods based on machine learning, and methods based on representation learning.

In stance detection based on feature engineering, domain experts process the raw data, extract features from it, and feed the features into an algorithmic model. For stance detection, disciplines such as linguistics and psychology mainly supply linguistic and structural features, for example lexical features, extracted sentence patterns and syntactic dependencies. Existing methods can broadly be divided into stance detection based on semantic features and stance detection based on the fusion of semantic and structural features. These algorithms depend heavily on the domain expertise of the researchers, and their performance depends directly on how discriminative the features are, so the degree of automation is low and the accuracy is unstable.

Methods based on machine learning build deep learning models such as neural networks and train them on real data to fit automatically the complex nonlinear mapping between the input and output of the stance detection task. A deep learning model adjusts and optimizes its network parameters from real training data, which reduces the need for manually defined features: the texts and topics involved in stance detection are mapped to vectors in a high-dimensional space, and the final stance category is computed from these vectors. Existing stance detection methods based on convolutional neural networks, recurrent neural networks, graph attention networks and similar models all learn the correlation between stance and text features automatically and classify with that feature information. The accuracy and robustness of machine learning methods exceed those of feature engineering methods, but these methods usually focus on expressing and mapping text features and lack a representation of user attributes and topology. Because they compute on the single modality of text, their information utilization is low.

Methods based on representation learning obtain a low-dimensional feature vector representation of the research object in a specific embedding space and then perform classification through metric computations on the feature vectors. They require no feature engineering and offer good generalization and interpretability. In stance detection they mainly involve the theories and methods of user representation learning and text representation learning. Text representation learning embeds a user's text features into an encoded representation space; it mainly includes discrete representations such as one-hot encoding, the bag-of-words model and TF-IDF, and continuous encoding models such as Word2Vec and the pre-trained BERT model. User representation learning mainly models dimensions such as user attributes, generated content, behavior and relationships. Representation learning further broadens the dimensions along which group stance detection can be modeled. However, existing methods usually model in conventional Euclidean space, whereas a social network is a real complex network and Euclidean space lacks the capacity to express hierarchical data structure, so the structural attributes of the social network are distorted and the representation capability is insufficient. In addition, fusing features of different dimensions, such as text representation features and user representation features, is a bottleneck that affects model performance.

Disclosure of Invention

To address the above technical problems, the invention adopts the following technical solution:

An embodiment of the invention provides a cross-social-network group perception and stance analysis method, comprising the following steps:

S100, acquiring Q social networks $SN_1, SN_2, \dots, SN_r, \dots, SN_Q$, where r ranges from 1 to Q;

S200, aligning $SN_1$ and $SN_2$ based on the Poincaré ball model of non-Euclidean hyperbolic space to obtain an initial fusion network and the representation vector, in the non-Euclidean hyperbolic space, of each node in the initial fusion network; then executing S300; the representation vector comprises a content attribute feature vector and a topological structure feature vector;

S300, setting c = c + 1; if c < Q - 1, executing S400; otherwise, executing S500;

S400, aligning the current fusion network with $SN_{c+2}$ based on non-Euclidean hyperbolic space to obtain fusion network (c+1) and the representation vector, in the non-Euclidean hyperbolic space, of each node in fusion network (c+1); then executing S300;

S500, taking the current fusion network as the target fusion network $G = (V, X, E)$, wherein $V = \{v_s\}$ is the node set of G, s ranges from 1 to n, and n is the number of nodes in G; the content attribute feature set is $X = \{X_1, X_2, \dots, X_m, \dots, X_{h(m)}\}$, where the content attribute feature set $X_m$ of the m-th community consists of the representation vectors of the nodes $v_i$ in the m-th community in G; the topological structure feature set is $E = \{E_1, E_2, \dots, E_m, \dots, E_{h(m)}\}$, where $E_m$ is the set of edges between nodes $v_i$ and $v_j$ of the m-th community in G; m ranges from 1 to L, i and j range from 1 to h(m), i ≠ j, L is the number of communities in G, and h(m) is the number of nodes in the m-th community;

S600, obtaining the content attribute feature graph $C_m$ and the topological structure graph $T_m$ of the m-th community in G, wherein $C_m$ is obtained based on the distances between nodes in the m-th community in G; $C_m$ and $T_m$ each have a corresponding adjacency matrix; the i-th node in $C_m$ has a content attribute feature vector, and the i-th node in $T_m$ has a topological structure feature vector;

S700, inputting $C_m$ and $T_m$ into a first-layer attention mechanism to obtain the corresponding content attribute fusion feature and topological structure fusion feature, which contain, respectively, the content attribute fusion feature and the topological structure fusion feature of each node $v_i$ in $C_m$ and $T_m$;

S800, inputting the content attribute fusion feature and the topological structure fusion feature into a second-layer attention mechanism to obtain the fusion feature of the m-th community, $Z^m = \{z^m_1, z^m_2, \dots, z^m_i, \dots, z^m_{h(m)}\}$, where $z^m_i$ is the fusion feature of node $v_i$ of the m-th community;

S900, inputting $z^m_i$ into a preset viewpoint tendency prediction model to obtain the corresponding prediction result $Pc^m_i = \{Pc^m_{ie}\}_{e=1}^{H}$, where $Pc^m_{ie}$ is the probability that the stance of the user corresponding to node $v_i$ belongs to the e-th viewpoint stance, and H is the number of viewpoint stances;

S1000, obtaining the viewpoint stance value of the m-th community, where $k_e$ is the attribute value of the e-th viewpoint stance.
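Steps S900 and S1000 can be illustrated with a small sketch. The aggregation formula for the community stance value is not reproduced in this text, so the code assumes one plausible form: the average, over the nodes of a community, of the expected stance value sum_e Pc_ie * k_e. The softmax predictor, the function names and the values chosen for k_e are likewise assumptions made for illustration only.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities over the H viewpoint stances."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def community_stance_value(node_probs, stance_values):
    """Assumed aggregation: mean over nodes of sum_e Pc_ie * k_e."""
    per_node = [
        sum(p * k for p, k in zip(probs, stance_values))
        for probs in node_probs
    ]
    return sum(per_node) / len(per_node)


# Hypothetical attribute values k_e: support = 1, neutral = 0, oppose = -1.
k = [1.0, 0.0, -1.0]

# Two nodes with mirrored scores, so their stances cancel out on average.
probs = [softmax([2.0, 0.0, 0.0]), softmax([0.0, 0.0, 2.0])]
value = community_stance_value(probs, k)
```

With mirrored node predictions the community value is close to zero, which under these assumed k_e values would read as a split or neutral community.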

The invention has at least the following beneficial effects:

According to the cross-social-network group perception and stance analysis method provided by the embodiment of the invention, stance tendencies are mined and aggregated by combining the interaction among group nodes with each node's own content attribute features. Through feature-level fusion of the multidimensional features of node content attributes and topological structure, and through training driven by real data, text viewpoints are acquired automatically, which can improve the efficiency and accuracy of group decision prediction.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

Fig. 1 is a flowchart of a cross-social-network group perception and stance analysis method provided by an embodiment of the present invention.

Fig. 2 is a block diagram of a cross-social-network group perception and stance analysis system provided by an embodiment of the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to fall within the scope of the invention.

The technical idea of the invention is that, when a sudden topical event occurs, the statements of users on different social network platforms can be mined automatically and the resulting group stances analyzed. Specifically, users' statements express their stance on the event; virtual communities formed by users with shared interests and hobbies generally form a community stance; and the stances of different communities together make up the overall public opinion of the social network. From a data flow perspective, the user attribute and structure data of different social networks and the statements published by each user are analyzed to obtain each virtual community's stance on the event at a given moment.

An embodiment of the invention provides a cross-social-network group perception and stance analysis method which, as shown in Fig. 1, may comprise the following steps:

s100, acquiring Q target social networks SN ₁ ，SN ₂ ，…，SN _r ，…，SN _Q R has a value of 1 to Q.

In the embodiment of the invention, the target social network can be obtained based on the existing social platform, such as a new wave, a microblog and other social platforms, and different social networks are formed by different social platforms. The size of Q may be set based on actual needs.

Each social network may be represented by a two-tuple $G = (V, \delta)$, where V is the node set $\{v_i\}$, whose cardinality is N, and δ is the edge set $\{(v_i, v_j)\}$; the elements of δ are called the edges of the network. In a social network, a community is a set of users who are all given the same community label. The collection of all communities is called the community set, and no two communities intersect. For two social networks, a source network and a target network, s and t denote source and target respectively, different indices i and j are used to distinguish users, and indices p and q are used to distinguish communities. The target network is the reference network, and users in the source network need to be aligned to users in the target network.
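The two-tuple representation and the requirement that communities do not intersect can be sketched as a small data structure. The class name `SocialNetwork`, its field names and the toy data are hypothetical and only illustrate the definitions above.

```python
from dataclasses import dataclass, field


@dataclass
class SocialNetwork:
    """A social network G = (V, delta) with labelled communities."""
    nodes: set          # V: the user nodes
    edges: set          # delta: pairs (v_i, v_j)
    communities: dict = field(default_factory=dict)  # label -> set of nodes

    def communities_are_disjoint(self) -> bool:
        """Check that no node belongs to two communities."""
        seen = set()
        for members in self.communities.values():
            if seen & members:
                return False
            seen |= members
        return True


# A toy source network with two non-overlapping communities.
source = SocialNetwork(
    nodes={"a", "b", "c", "d"},
    edges={("a", "b"), ("c", "d")},
    communities={"p1": {"a", "b"}, "p2": {"c", "d"}},
)
```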

Nodes in the target network are regarded as the entity objects of network users and are defined as images. Correspondingly, nodes in the source network are regarded as the original objects of network users and are defined as preimages. It is common for a user to join several social networks at the same time, behaving similarly in each or emphasizing different things in each. Such users are defined below:

Anchor user: if a natural-person user has a preimage node in the source network and an image node in the target network, the user is called an anchor user.

Anchor link: let two nodes belong to the source node set and the target node set respectively. If they are the preimage and the image of the same user in the source network and the target network, they form an inter-network link, called an anchor link, which is recorded as a two-tuple of the two nodes.

Anchor user set A: the anchor user set is the collection of all known anchor users, denoted as the set A. Similarly, the same group of natural persons typically participates in several social networks, behaving similarly in each or sometimes emphasizing different things in each.

Anchor community: if two communities are, respectively, a community of the source network and a community of the target network and correspond to the same group of natural persons, the two communities are anchor communities.

Because the same user registers accounts on different social platforms and forwards and spreads messages on them, many users and communities are formed in virtual space. To reach through to the underlying real users and to aggregate and mine the stances of users across different social networks, the alignment of different virtual communities, that is, the alignment of users across social networks, must be solved first. Accurate community alignment depends on accurate data representation, whose core task is to represent each data object as a high-dimensional vector in a representation space. A social network has a complex hierarchical topology, that is, a latent non-Euclidean structure, so a conventional Euclidean representation space cannot reflect the structural characteristics of the data and easily distorts its structural attributes. To address this, the invention proposes a community network representation based on hyperbolic non-Euclidean space. Compared with Euclidean representations, hyperbolic space yields community representations with tighter internal cohesion and clearer separation from one another, which benefits the alignment of overlapping communities. Each social network representation is then mapped into a common hyperbolic subspace for community similarity calculation and alignment. Specifically, the model uses known anchor users to align the hyperbolic representation spaces of the social networks by representation transfer, makes the aligned representation spaces a common subspace, and performs community alignment in that subspace.
The cross-social-network community alignment task is completed through embedding into hyperbolic space and transfer computation in the common subspace, as shown in S200 to S400:

S200, aligning $SN_1$ and $SN_2$ based on the Poincaré ball model of non-Euclidean hyperbolic space to obtain an initial fusion network and the representation vector, in the non-Euclidean hyperbolic space, of each node in the initial fusion network; then executing S300.

In the embodiment of the invention, the Poincaré ball model of non-Euclidean hyperbolic space is used as the representation model, and the vectors it yields are the representation vectors. The representation vector of each node is a high-dimensional vector that may include a content attribute feature vector and a topological structure feature vector.

In the embodiment of the invention, the content attribute features may include the user's attribute information and published text information. The attribute information may include user name, gender, age, mailbox, address, occupation, place of residence and the like, and the text information may include published posts, interactive comments and the like. That is, the elements of the content attribute features may include user name, gender, age, mailbox, address, occupation, place of residence, published posts, interactive comments and the like. The topological structure features include the user's network activity information and social relationship information. The network activity information may include login time, login frequency, login duration and the like. The social relationship information may include label information such as the user's interests and hobbies and habitual browsing categories, as well as the accounts the user follows on the social platform, followed friends, fans and the like. That is, the elements of the topological structure features may include login time, login frequency, login duration, followed accounts, followed friends, fans and the like.

In implementation, once the values of the elements contained in each feature are obtained, they are encoded to form the corresponding feature vectors, which are then embedded into the non-Euclidean hyperbolic space and converted into representation vectors in that space.
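The encode-then-embed step can be sketched as follows. The text does not specify the encoder or the embedding map, so this is only an illustration under stated assumptions: `encode_features` is a hypothetical encoder that compresses three raw values with a logarithm, and `exp_map_origin` is the standard exponential map at the origin of the Poincaré ball with curvature -1, which sends a Euclidean vector to a point strictly inside the unit ball.

```python
import math


def encode_features(login_frequency, login_hours, fan_count):
    """Hypothetical encoder: compress raw values into a numeric vector."""
    return [
        math.log1p(login_frequency),
        math.log1p(login_hours),
        math.log1p(fan_count),
    ]


def exp_map_origin(v):
    """Map a Euclidean vector into the Poincare ball (curvature -1)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(v)
    # tanh(norm) is the hyperbolic radius; clamp it to stay inside the ball.
    radius = min(math.tanh(norm), 1.0 - 1e-9)
    return [radius * x / norm for x in v]


point = exp_map_origin(encode_features(10, 2.5, 300))
```

The direction of the encoded vector is preserved and only its length is squashed, so users with larger raw values land closer to the boundary of the ball.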

S300, setting c=c+1, if c < Q-1, executing S400; otherwise, S500 is performed.

S400, aligning the current fusion network with $SN_{c+2}$ based on non-Euclidean hyperbolic space to obtain fusion network (c+1) and the representation vector, in the non-Euclidean hyperbolic space, of each node in fusion network (c+1); then executing S300.
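The control flow of S200 to S400 amounts to folding the Q networks into one fused network, one alignment at a time. The sketch below assumes that the counter c starts at 0, which the text does not state, and takes the alignment procedure as a caller-supplied function `align`; both the function names and the toy alignment (a set union) are hypothetical.

```python
def fuse_networks(networks, align):
    """Fold Q networks into one fused network following S200 to S400."""
    if len(networks) < 2:
        raise ValueError("at least two social networks are required")
    q = len(networks)
    fused = align(networks[0], networks[1])  # S200: align SN_1 and SN_2
    c = 0  # assumed initial value of the counter
    while True:
        c += 1  # S300
        if c < q - 1:
            # S400: SN_(c+2) is networks[c + 1] with 0-based indexing.
            fused = align(fused, networks[c + 1])
        else:
            return fused  # continue with S500


# Toy alignment that simply merges node sets.
result = fuse_networks([{1, 2}, {2, 3}, {3, 4}], lambda a, b: a | b)
```

With Q = 2 the loop body never calls `align` again, so the initial fusion network is used directly as the target fusion network.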

In the embodiment of the invention, the core idea of embedding a social network into the Poincaré ball model of non-Euclidean hyperbolic space is to measure the affinity between nodes by the distance in the Poincaré ball model, and thereby learn the hyperbolic representation vector of each node.
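As an illustration of a distance that can measure affinity in this model, the following sketch computes the standard Poincaré ball distance for curvature -1, d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))). The text does not reproduce its own distance formula, so the choice of this closed form and the function name `poincare_distance` are assumptions.

```python
import math


def poincare_distance(u, v):
    """Hyperbolic distance between two points inside the unit ball."""
    def sq_norm(x):
        return sum(a * a for a in x)

    diff = sq_norm([a - b for a, b in zip(u, v)])
    denom = (1.0 - sq_norm(u)) * (1.0 - sq_norm(v))
    return math.acosh(1.0 + 2.0 * diff / denom)


# Distances grow quickly near the boundary of the ball.
near_origin = poincare_distance([0.0, 0.0], [0.5, 0.0])
near_edge = poincare_distance([0.45, 0.0], [0.95, 0.0])
```

Both pairs are 0.5 apart in Euclidean terms, yet the pair near the boundary is farther apart in hyperbolic terms, which is what lets the model separate the leaves of a hierarchy.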

First, random walks are performed on the social network to capture the affinity between nodes. In a walk sequence, the several nodes immediately before and after a given node are called its context nodes. A node $v_i$ therefore has two roles: when it acts as the center node, it corresponds to the hyperbolic representation vector $\theta_i$; when it acts as the context of another node, it corresponds to the context vector $\theta_i'$.
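The random walk and the center/context roles can be sketched as follows. The walk length, the window size and the function names are illustrative assumptions; the text only states that several nodes before and after a given node form its context.

```python
import random


def random_walk(adjacency, start, length, rng):
    """Walk up to `length` nodes, moving to a random neighbor each step."""
    walk = [start]
    while len(walk) < length:
        neighbors = adjacency[walk[-1]]
        if not neighbors:
            break
        walk.append(rng.choice(neighbors))
    return walk


def context_pairs(walk, window):
    """List (center, context) pairs within `window` positions of each other."""
    pairs = []
    for i, center in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, walk[j]))
    return pairs


adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
walk = random_walk(adjacency, "a", 6, random.Random(0))
pairs = context_pairs(walk, window=1)
```

Each pair (center, context) is one training example: the center node contributes its representation vector and the context node contributes its context vector.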

Further, in S200 and S400, the representation vector of each node in each fusion network is obtained through the following steps:

S10, embedding each node of the two social networks to be fused into the Poincaré ball model of hyperbolic space, and constructing a node representation vector objective constraint function for each node.

In embodiments of the present invention, embedding is a common term in machine learning: a neural network maps a high-dimensional local representation into a low-dimensional distributed space, and this process is called embedding. Those skilled in the art will recognize that any method of embedding the nodes of a social network into the Poincaré ball model of hyperbolic space falls within the scope of the present invention.

wherein the function involves the hyperbolic distance between two nodes in network x, which describes the affinity between the two nodes; the neighbor node set of each node in network x; and the node set of network x. Network s denotes whichever of $SN_1$ and $SN_2$ serves as the source network, and network t denotes whichever of $SN_1$ and $SN_2$ serves as the target network.

In an embodiment of the present invention, the function is defined in terms of σ(·), the sigmoid function; the hyperbolic representation vector of the center node; the context vector of the context node; and D(·), a hyperbolic distance function.

By optimizing Fu with a stochastic gradient descent method based on Riemannian geometry, the representation vector of each node in the network can be obtained.
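A minimal sketch of one Riemannian stochastic gradient descent step on the Poincaré ball follows. It assumes curvature -1 and the common retraction-based update in which the Euclidean gradient is rescaled by (1 − ‖θ‖²)² / 4 and the result is pulled back inside the unit ball. The objective Fu itself is not reproduced here; `euclid_grad` stands for its Euclidean gradient and is supplied by the caller. All names are illustrative.

```python
import math


def rsgd_step(theta, euclid_grad, lr, eps=1e-5):
    """One retraction-based Riemannian SGD step on the Poincare ball."""
    sq_norm = sum(x * x for x in theta)
    # Inverse of the Poincare metric: rescales the Euclidean gradient.
    scale = ((1.0 - sq_norm) ** 2) / 4.0
    updated = [x - lr * scale * g for x, g in zip(theta, euclid_grad)]
    norm = math.sqrt(sum(x * x for x in updated))
    if norm >= 1.0:
        # Project back so the point stays strictly inside the ball.
        updated = [x / norm * (1.0 - eps) for x in updated]
    return updated


step_from_origin = rsgd_step([0.0, 0.0], [1.0, 0.0], lr=0.4)
clipped = rsgd_step([0.9, 0.0], [-1000.0, 0.0], lr=1.0)
```

The rescaling makes steps shrink automatically as a point approaches the boundary, where hyperbolic distances are large relative to Euclidean ones.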

S20, modeling each community in the two social networks embedded in the Poincaré ball model of hyperbolic space with a non-Euclidean hyperbolic clustering model, and constructing a community representation vector objective constraint function for each community.

In the embodiment of the invention, a non-Euclidean hyperbolic mixture clustering model is designed on the basis of the node representation vectors in non-Euclidean hyperbolic space, in order to discover and represent communities. In the mixture clustering model, a mixture distribution in hyperbolic space is formed by the groups of nodes clustered in the communities.

wherein the membership matrix $Z_{ip}$ gives the probability that a node in network x belongs to community p in network x, and the elements of each row of the membership matrix sum to 1; the probability density function of the model of community p is constructed from the generalized hyperbolic distribution and takes as arguments the hyperbolic representation vector of the node and the hyperbolic parameters of community p in network x; and $C^x$ is the number of communities in network x.

In the hyperbolic mixture clustering model, node representations are generated from a mixture distribution in hyperbolic space, and each component of the mixture distribution corresponds to a community. Given the node representations $\{\theta_{(\cdot)}\}$, the likelihood probability that a node belongs to each community can be calculated.

In the embodiment of the invention, communities are modeled with the generalized hyperbolic distribution. In the probability density function of a modeled community, β and μ are the skewness (distortion) vector and the location vector respectively, where the location vector μ is the hyperbolic representation vector of the community; ω is the concentration factor; Δ is the metric matrix, a d-dimensional positive definite matrix used to describe the Riemannian metric, and $|\Delta|$ denotes its determinant; and $K_r(\cdot)$ is the modified Bessel function of order r, which is differentiable with respect to both the order r and the argument.

In the embodiment of the invention, with the node representation vectors given as observations, the community membership matrix and the hyperbolic community representation vectors can be obtained simultaneously by optimizing Fc with a stochastic gradient descent method based on Riemannian geometry.
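The role of the membership matrix can be illustrated with a deliberately simplified stand-in. The sketch below does not implement the generalized hyperbolic density; instead it assumes a score exp(−d²) based on the Poincaré distance d between a node and each community location vector μ, then normalizes the scores so that each row of memberships sums to 1. The scoring function and all names are assumptions made only to show the normalization.

```python
import math


def poincare_distance(u, v):
    """Hyperbolic distance between two points inside the unit ball."""
    def sq_norm(x):
        return sum(a * a for a in x)

    diff = sq_norm([a - b for a, b in zip(u, v)])
    denom = (1.0 - sq_norm(u)) * (1.0 - sq_norm(v))
    return math.acosh(1.0 + 2.0 * diff / denom)


def membership_row(theta, community_locations):
    """Stand-in soft assignment of one node to every community."""
    scores = [
        math.exp(-poincare_distance(theta, mu) ** 2)
        for mu in community_locations
    ]
    total = sum(scores)
    return [s / total for s in scores]


# Two hypothetical community location vectors.
locations = [[0.5, 0.0], [-0.5, 0.0]]
row = membership_row([0.4, 0.1], locations)
```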

S30, constructing an alignment probability function for the two social networks embedded with the non-Euclidean hyperbolic clustering model, based on the anchor users in the social networks.

The function is defined over the nodes in the target network and the nodes in the source network that are connected by anchor links, and uses the hyperbolic distances between these nodes. The anchor user set of the two social networks is the intersection of the node IDs of the two social networks.
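Taking the anchor user set as an ID intersection can be sketched as follows. The sketch assumes that each account carries an identifier of the natural person behind it, which is a simplification; the function name and the toy account names are hypothetical.

```python
def anchor_links(source_accounts, target_accounts):
    """Pair source and target accounts that share the same person ID."""
    person_to_target = {
        person: account for account, person in target_accounts.items()
    }
    links = set()
    for account, person in source_accounts.items():
        if person in person_to_target:
            links.add((account, person_to_target[person]))
    return links


# account name -> person ID, one mapping per social network
source_accounts = {"s_alice": 1, "s_bob": 2, "s_carol": 3}
target_accounts = {"t_alice": 1, "t_dave": 4, "t_carol": 3}

anchor_users = set(source_accounts.values()) & set(target_accounts.values())
links = anchor_links(source_accounts, target_accounts)
```

Each resulting pair is an anchor link in the sense defined earlier: a preimage node in the source network together with its image node in the target network.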

In the embodiment of the invention, an anchor user representation transfer method in non-Euclidean hyperbolic space is used to construct a vector space formed by the common dimensions of all nodes, namely the hyperbolic common subspace. In the hyperbolic common subspace, the two social networks are aligned on the anchor users, and the representation of an anchor user can be transferred through its anchor link. If $(v_i', v_k')$ is an anchor link, then the preimage node can be used to infer $v_k'$; similarly, node $v_k'$ can be used to infer its preimage node.

S40, constructing a joint objective function, wherein $\alpha_1$ and $\alpha_2$ are weight factors, θ is the hyperbolic representation vector of a node, θ′ is the context vector of a node, and the metric matrix is that of community p in network x.

S50, optimizing the joint objective function to obtain the representation vector of each node.

By optimizing the joint objective function with a stochastic gradient descent method based on Riemannian geometry, the representation vector of each aligned node in the non-Euclidean hyperbolic space, and the aligned communities, can be obtained in the common Poincaré ball model.

In the embodiment of the invention, the technical effect of S100 to S400 is as follows: different social networks are first embedded into the same high-dimensional space, community alignment and user alignment are then carried out in the high-dimensional common subspace, and the attribute features of each user are aligned, so that the multi-source heterogeneous information of network community users can be aligned and the networks associated and fused.

S500, taking the current fusion network as the target fusion network $G = (V, X, E)$, wherein $V = \{v_s\}$ is the node set of G, s ranges from 1 to n, and n is the number of nodes in G; the content attribute feature set is $X = \{X_1, X_2, \dots, X_m, \dots, X_{h(m)}\}$, where the content attribute feature set $X_m$ of the m-th community consists of the content attribute feature vectors of the nodes $v_i$ in the m-th community in G; the topological structure feature set is $E = \{E_1, E_2, \dots, E_m, \dots, E_{h(m)}\}$, where $E_m$ is the set of edges between nodes $v_i$ and $v_j$ of the m-th community in G. An adjacency matrix is used to represent the network topology of graph G, its entries indicating whether an edge exists between the corresponding pair of nodes. m ranges from 1 to L, i and j range from 1 to h(m), L is the number of communities in G, and h(m) is the number of nodes in the m-th community.

S600, obtaining the content attribute feature graph $C_m$ and the topological structure graph $T_m$ of the m-th community in G, wherein $C_m$ is obtained based on the distances between nodes in G; $C_m$ and $T_m$ each have a corresponding adjacency matrix; the i-th node in $C_m$ has a content attribute feature vector, and the i-th node in $T_m$ has a topological structure feature vector.

In the embodiment of the invention, the topological structure graph of the nodes is formed by the connections that users create with other users through interactions such as following, liking, commenting and forwarding. It is therefore consistent with the topology of the target fusion network, i.e., network G, and correspondingly its adjacency matrix is the adjacency matrix corresponding to the m-th community in G.

In the embodiment of the invention, the content attribute feature graph $C_m$ of the nodes is obtained from the distances between the nodes in the m-th community in G; that is, it is the graph reconstructed after computing the distances between the nodes in the m-th community. Specifically, $C_m$ can be obtained through the following steps:

S601, obtaining the set of similarities corresponding to $v_i$ in the m-th community, each element being the similarity between the content attribute feature vectors of node $v_i$ and node $v_j$.

In an embodiment of the present invention, the similarity can be the cosine similarity between the content attribute feature vector of the i-th node and that of the j-th node in $C_m$.

S602, sorting the similarities corresponding to $v_i$ in descending order to obtain the sorted similarities.

S603, taking the nodes corresponding to the first B sorted similarities as the neighbor nodes of $v_i$, and in the same way obtaining the neighbor nodes of all nodes in the m-th community.

The value of B can be set as needed; in one exemplary embodiment, B = 5.

S604, constructing $C_m$ based on the neighbor nodes of all nodes in the m-th community.

Specifically, a content attribute feature graph can be constructed from the neighbor nodes of each node, so that the adjacency matrix corresponding to the m-th community is obtained from the content attribute feature data of the nodes: if two nodes are connected, the value at the corresponding position of the matrix is 1; otherwise it is 0.
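Steps S601 to S604 can be sketched as a top-B nearest-neighbor graph built from cosine similarity. The function names are illustrative, and making the adjacency matrix symmetric (an edge exists if either node selects the other) is an assumption; the text only states that connected nodes receive the value 1.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def knn_adjacency(features, b):
    """Build a 0/1 adjacency matrix from each node's top-B similar nodes."""
    n = len(features)
    adjacency = [[0] * n for _ in range(n)]
    for i in range(n):
        # S601: similarities between node i and every other node.
        sims = [
            (cosine_similarity(features[i], features[j]), j)
            for j in range(n) if j != i
        ]
        # S602: sort in descending order of similarity.
        sims.sort(key=lambda pair: pair[0], reverse=True)
        # S603 and S604: keep the first B nodes as neighbors.
        for _, j in sims[:b]:
            adjacency[i][j] = 1
            adjacency[j][i] = 1  # assumed symmetrization
    return adjacency


features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
adjacency = knn_adjacency(features, b=1)
```

In this toy example nodes 0 and 2 both pick node 1 as their nearest neighbor, so node 1 ends up connected to both.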

The graph structure formed by a social network is intricate: some nodes form isolated graphs on their own, while others interact to form connected graphs. In addition, the attribute information of different nodes is multi-source and heterogeneous. The nodes themselves carry data such as user entities, blog posts and geographic locations, and the edges between nodes are heterogeneous as well, including social relationships, authorship relationships and location relationships. The key to mining effective user clustering features and community opinion features is to jointly extract comprehensive and reasonable features from both the topological structure and the content attributes of the nodes, yet existing methods lack a mechanism for extracting node topology and attribute features interactively and cooperatively. The invention therefore models the group feature mining problem on social networks as a multi-dimensional node clustering problem. To meet the requirements of the fused complex network structure, a multidimensional deep neural network clustering model is designed to mine the attributes of, and the structure among, the nodes; specifically, the model is a neural network with a two-layer attention mechanism. The model first builds an attribute feature map of the nodes according to the similarity of their content attributes, then takes the attribute feature map and the topological graph among the nodes as input, and adaptively extracts hidden features of node content and node topology in different dimensions through the two-layer attention mechanism. This realizes deep mining of the composite features of node content and attributes in each community and provides the basic features for inter-node community aggregation and community attitude mining. A specific implementation is given in S700 to S800 below:

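S700: input C_m and T_m into the first-layer attention mechanism to obtain the content attribute fusion features H^m_a and the topology fusion features H^m_t, whose i-th elements are, respectively, the content attribute fusion feature and the topology fusion feature of node v_i in C_m and T_m.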

In an embodiment of the present invention, the first-layer attention mechanism is used to learn, for each vertex v_i, the weight coefficients of the features of its neighbor nodes v_j, j ∈ N_i.

The content attribute feature map corresponding to the m-th community is taken as an example. First, the content attribute feature map of the m-th community is input into the multidimensional deep neural network clustering model, and the correlation coefficient between vertices v_i and v_j is learned as

$$e_{ij} = \mathrm{LeakyReLU}\big(a([W x^m_{a,i} \,\|\, W x^m_{a,j}])\big),$$

where the operator [·‖·] denotes the concatenation operation, a(·) is a single-layer feedforward neural network with LeakyReLU as the activation function, $x^m_{a,j}$ is the content attribute feature vector of the j-th node in C_m, and W is a weight matrix.

The attention distribution is obtained by softmax normalization of the correlation coefficients:

$$\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k \in N_i} \exp(e_{ik})}.$$

The neighborhood features are then aggregated according to the attention coefficients, i.e., the new feature of the i-th node in C_m, fused with its neighborhood information, is

$$h^m_{a,i} = \sigma\Big(\sum_{j \in N_i} \alpha_{ij}\, W x^m_{a,j}\Big),$$

where σ(·) is the sigmoid function.

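Similarly, the topology fusion feature of node i in the topological structure feature map is

$$h^m_{t,i} = \sigma\Big(\sum_{j \in N_i} \alpha_{ij}\, W x^m_{t,j}\Big), \qquad \alpha_{ij} = \frac{\exp\big(\mathrm{LeakyReLU}(a([W x^m_{t,i} \,\|\, W x^m_{t,j}]))\big)}{\sum_{k \in N_i} \exp\big(\mathrm{LeakyReLU}(a([W x^m_{t,i} \,\|\, W x^m_{t,k}]))\big)},$$

where $x^m_{t,j}$ is the topological structure feature vector of the j-th node in T_m, N_i is the set of neighbor nodes of v_i, W is a weight matrix, the symbol [·‖·] denotes the concatenation operation, a(·) is a single-layer feedforward neural network, LeakyReLU is an activation function, and σ(·) is the sigmoid function.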

In the embodiment of the invention, to make the model more robust, a multi-head attention mechanism can be introduced to capture different interaction information in different projection spaces; that is, the first-layer attention mechanism comprises K attention mechanisms. The computation above is performed independently K times and the K results are concatenated:

$$h_i = \big\Vert_{k=1}^{K}\, \sigma\Big(\sum_{j \in N_i} \alpha^{k}_{ij}\, W^{k} x_j\Big),$$

where $\alpha^{k}_{ij}$ is the attention coefficient calculated by the k-th attention mechanism for the feature map of the m-th community (the content attribute feature map or the topological structure feature map), $x_j$ is the corresponding feature vector of the j-th node, $W^{k}$ is the weight matrix of the corresponding input linear mapping, and ‖ denotes concatenation.

That is, the fusion feature of each node may be the feature obtained by concatenating the outputs of the multiple attention mechanisms.
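The first-layer attention described above can be sketched in NumPy as follows. This is an illustrative sketch rather than the implementation of the invention: the function names, the LeakyReLU slope of 0.2, the self-loop added to every node, and the representation of a(·) as a single weight vector without bias are all assumptions.

```python
import numpy as np

def attention_head(X, A, W, a, slope=0.2):
    """One attention head over a graph.

    X: (n, d) node feature vectors; A: (n, n) 0/1 adjacency matrix;
    W: (d, f) weight matrix; a: (2f,) weights of the single-layer network a(.).
    Returns the (n, f) fused features and the (n, n) attention coefficients.
    """
    H = X @ W
    f = H.shape[1]
    # a([W x_i || W x_j]) splits into a[:f].(W x_i) + a[f:].(W x_j)
    s = (H @ a[:f])[:, None] + (H @ a[f:])[None, :]
    e = np.where(s > 0, s, slope * s)                  # LeakyReLU
    # Assumption: every node also attends to itself, so no row is empty.
    mask = (np.asarray(A) > 0) | np.eye(len(H), dtype=bool)
    e = np.where(mask, e, -np.inf)
    w = np.exp(e - e.max(axis=1, keepdims=True))       # stable softmax over neighbours
    alpha = w / w.sum(axis=1, keepdims=True)
    fused = 1.0 / (1.0 + np.exp(-(alpha @ H)))         # sigmoid of the weighted sum
    return fused, alpha

def multi_head(X, A, Ws, a_vecs):
    """Concatenate the outputs of K independent attention heads."""
    heads = [attention_head(X, A, W, a)[0] for W, a in zip(Ws, a_vecs)]
    return np.concatenate(heads, axis=1)
```

The same two functions can be applied once to the content attribute feature map and once to the topological structure graph of a community, each with its own adjacency matrix and feature vectors.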

S800: input H^m_a and H^m_t into the second-layer attention mechanism to obtain the fusion features of the m-th community, Z^m = {z^m_1, z^m_2, …, z^m_i, …, z^m_h(m)}, where z^m_i is the fusion feature of node v_i of the m-th community.

A second-layer attention mechanism is used to learn the respective importance of the topological graph and the content attribute feature map. For the topology fusion feature of any node v_i in the topological structure graph of the m-th community, a nonlinear transformation with weight matrix W and bias vector b is applied first; a dot-product model then gives the correlation between the transformed embedding and a query vector q, computed with the transpose of q, where (·)^T denotes the transpose operation. The average of the attention values of all nodes is taken as the attention value of the topological graph. The attention value of the content attribute feature map is obtained in the same way from its fusion features; the fusion features of the content attribute feature map and of the topological structure graph share these parameters.

The two attention values are then normalized with the softmax function to obtain the attention weights of the topological graph and of the content attribute feature map, which give the attention coefficient of any node i in the content attribute feature map and in the topological structure feature map, respectively.

Finally, the attention weights are combined with the fusion features corresponding to the topological structure graph and the content attribute feature map to obtain the fusion feature of each node; that is, z^m_i is the sum of the content attribute fusion feature and the topology fusion feature of v_i, each weighted by its attention coefficient.
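A minimal NumPy sketch of this second-layer attention is given below. The choice of tanh as the nonlinear transformation, the function name `fuse_views`, and the use of one graph-level weight per view (rather than per-node weights) are assumptions; the text only states a nonlinear transformation, a dot product with a query vector q, averaging over nodes, and softmax normalization.

```python
import numpy as np

def fuse_views(H_a, H_t, W, b, q):
    """Fuse content attribute and topology fusion features of one community.

    H_a, H_t: (n, f) fusion features from the first attention layer.
    W: (f, p) weight matrix, b: (p,) bias vector, q: (p,) query vector,
    all shared between the two views.
    Returns the (n, f) fused features Z and the two attention weights.
    """
    def view_score(H):
        # Assumption: tanh is used as the nonlinear transformation.
        return float(np.mean(np.tanh(H @ W + b) @ q))

    scores = np.array([view_score(H_a), view_score(H_t)])
    scores = scores - scores.max()                 # stable softmax over the two views
    beta = np.exp(scores) / np.exp(scores).sum()
    Z = beta[0] * H_a + beta[1] * H_t              # attention-weighted combination
    return Z, beta
```

Because W, b and q are shared, two identical inputs receive equal weights of 0.5, which is a convenient sanity check for an implementation.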

S900: input z^m_i into a set viewpoint tendency prediction model to obtain the corresponding prediction result Pc^m_i = {Pc^m_ie}, e = 1, …, H, where Pc^m_ie is the probability that the stance of the user corresponding to node v_i belongs to the e-th viewpoint stance, and H is the number of viewpoint stances.

In embodiments of the present invention, the user stances may include support, neutrality and objection.

In the embodiment of the invention, the set viewpoint stance tendency prediction model can be a neural network with a two-layer attention mechanism or a model formed by fully connected layers. It is a trained model: the target fusion features of a plurality of communities can be used as samples and input into the constructed deep neural network model for training. The model can be trained by gradient descent, with the KL divergence used as the loss function. The specific training steps may follow the prior art and are not described in detail here to avoid redundancy.

For the target fusion feature z^m_i of the user corresponding to the i-th node in the input community m, the output feature obtained after the fully connected layer is

$$y^m_i = W_{out}\, z^m_i + B_{out},$$

where W_out is the output weight and B_out is the bias vector.

Finally, the probability distribution over the H stance labels is obtained through an output layer consisting of a Softmax layer:

$$Pc^m_i = \mathrm{softmax}(y^m_i).$$
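The fully connected layer, the Softmax output layer and a KL-divergence loss can be sketched as follows. The function names and the row-vector convention (features multiplied on the left of W_out) are assumptions for illustration, and the training loop itself is omitted.

```python
import numpy as np

def softmax(y):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    y = y - y.max(axis=-1, keepdims=True)
    e = np.exp(y)
    return e / e.sum(axis=-1, keepdims=True)

def predict_stance(Z, W_out, B_out):
    """Z: (n, f) fused node features; W_out: (f, H); B_out: (H,).

    Returns an (n, H) matrix whose i-th row is the probability
    distribution of node i over the H stance labels.
    """
    return softmax(Z @ W_out + B_out)

def kl_divergence(P_true, P_pred, eps=1e-12):
    """Mean KL(P_true || P_pred) over nodes, usable as a training loss."""
    P_true = np.clip(np.asarray(P_true, dtype=float), eps, 1.0)
    P_pred = np.clip(np.asarray(P_pred, dtype=float), eps, 1.0)
    return float(np.mean(np.sum(P_true * (np.log(P_true) - np.log(P_pred)), axis=1)))
```

With H = 3, the columns of the output can be read as the probabilities of support, neutrality and objection.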

S1000: obtain the viewpoint stance value of the m-th community, where k_e is the attribute value of the e-th viewpoint stance.

In the embodiment of the invention, the stance tendency attitudes of support, neutrality or objection that a user shows toward an event can be quantified as 1, 0 and -1, respectively, according to the user's comments and response behaviors on different topic events; that is, the attribute values of support, neutrality and objection can be 1, 0 and -1, respectively.
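One way to turn the per-node probabilities into a single community value is sketched below. The aggregation rule used here, the mean over nodes of the probability-weighted attribute values, is an assumption made for illustration; the exact formula of S1000 is not reproduced in the text above. The function name `community_stance_value` is likewise hypothetical.

```python
import numpy as np

def community_stance_value(Pc, k=(1.0, 0.0, -1.0)):
    """Aggregate node-level stance probabilities into one community value.

    Pc: (n, H) matrix, row i holding the probabilities Pc_ie of node i.
    k:  attribute values k_e of the H stances (support, neutrality, objection).
    Assumption: the community value is the average expected attribute value.
    """
    Pc = np.asarray(Pc, dtype=float)
    node_scores = Pc @ np.asarray(k, dtype=float)   # expected attribute value per node
    return float(node_scores.mean())
```

Under this assumption the value lies between -1 and 1, with positive values indicating a community that leans toward support.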

In the embodiment of the invention, text viewpoints are acquired automatically through feature-level fusion of the multidimensional features of node content attributes and topological structure and through training driven by real data, so that the efficiency and accuracy of group decision prediction can be improved.

Another embodiment of the present invention provides a group perception and stance analysis system across social networks for implementing the foregoing method. As shown in FIG. 2, the system may include a social network embedding module, a feature fusion mining module and a group standing decision module arranged from bottom to top.

The social network embedding module is used for embedding different social networks into the same high-dimensional common subspace based on the Poincaré ball model of non-Euclidean hyperbolic space, and for aligning the communities and users of the social networks in that subspace. This realizes the alignment of the multi-source heterogeneous information of network community users and the associated fusion of the networks, and yields the aligned target fusion network together with the characterization vector of each user of the target fusion network in the non-Euclidean hyperbolic space. The characterization vectors, which comprise content attribute feature vectors and topological structure feature vectors, are sent to the feature fusion mining module. The module is specifically configured to perform the foregoing steps S100 to S500.

The feature fusion mining module is used for mining and fusing the features in the target fusion network through the two-layer attention mechanism, obtaining the fusion features of each community in the target fusion network and sending them to the group standing decision module. Specifically, the feature fusion mining module further processes the multi-source heterogeneous community data of the aligned social networks transmitted by the social network embedding module and, through deep mining and fusion of the community data, supports the division of communities and the analysis of stance tendencies in the group standing decision module. The feature fusion mining module stacks two layers of attention mechanisms to adaptively extract effective features from the graph data: the first layer, a graph attention network, aggregates the features of neighbor nodes, and the second attention layer fuses the features extracted from the topological graph and the feature map. The module is specifically configured to perform the foregoing steps S600 to S800.

The group standing decision module is used for predicting the stance tendency attitudes of the communities based on the received fusion features, so as to obtain the stance tendency attitude of each community. Building on the depth-shared features that the feature fusion mining module mines through feature-level fusion of the multidimensional features of node content attributes and topological structure, the module acquires text viewpoints automatically through training driven by real data and introduces a group decision prediction algorithm influenced by environmental factors, thereby improving the efficiency and accuracy of group decision prediction. The module is specifically configured to execute the foregoing steps S900 to S1000.

Furthermore, although the steps of the methods in the present disclosure are depicted in a particular order in the drawings, this does not require or imply that the steps must be performed in that particular order or that all illustrated steps be performed in order to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step to perform, and/or one step decomposed into multiple steps to perform, etc.

From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, including several instructions to cause a computing device (may be a personal computer, a server, a mobile terminal, or a network device, etc.) to perform the method according to the embodiments of the present disclosure.

Embodiments of the present invention also provide a non-transitory computer-readable storage medium that may be disposed in an electronic device to store at least one instruction or at least one program for implementing one of the method embodiments, the at least one instruction or the at least one program being loaded and executed by a processor to implement the methods provided by the embodiments described above.

In some possible implementations, the various aspects of the present application may also be implemented in the form of a program product comprising program code for causing a terminal device to carry out the steps according to the various exemplary embodiments of the present application as described in the "exemplary methods" section of this specification, when the program product is run on the terminal device.

The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include the following: an electrical connection having one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

The computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of remote computing devices, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., connected via the Internet using an Internet service provider).

Embodiments of the present invention also provide an electronic device comprising a processor and the aforementioned non-transitory computer-readable storage medium.

Those skilled in the art will appreciate that the various aspects of the present application may be implemented as a system, method, or program product. Accordingly, aspects of the present application may be embodied in the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, micro-code, etc.), or an embodiment combining hardware and software aspects, which may be referred to herein collectively as a "circuit," "module," or "system."

An electronic device according to this embodiment of the present application is described below. The electronic device is only one example and should not impose any limitation on the functionality and scope of use of the embodiments of the present application.

The electronic device is in the form of a general purpose computing device. Components of an electronic device may include, but are not limited to: the at least one processor, the at least one memory, and a bus connecting the various system components, including the memory and the processor.

Wherein the memory stores program code that is executable by the processor to cause the processor to perform steps according to various exemplary embodiments of the present application described in the above section of the "exemplary method" of the present specification.

The storage may include readable media in the form of volatile storage, such as random access memory (RAM) and/or cache storage, and may further include read-only memory (ROM).

The storage may also include a program/utility having a set (at least one) of program modules including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.

The bus may be one or more of several types of bus structures including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor, or a local bus using any of a variety of bus architectures.

The electronic device may also communicate with one or more external devices (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device, and/or with any device (e.g., router, modem, etc.) that enables the electronic device to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface. And, the electronic device may also communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet, through a network adapter. The network adapter communicates with other modules of the electronic device via a bus. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with an electronic device, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.

Embodiments of the present invention also provide a computer program product comprising program code for causing an electronic device to carry out the steps of the method according to the various exemplary embodiments of the invention as described in the specification, when said program product is run on the electronic device.

While certain specific embodiments of the invention have been described in detail by way of example, it will be appreciated by those skilled in the art that the above examples are for illustration only and are not intended to limit the scope of the invention. Those skilled in the art will also appreciate that many modifications may be made to the embodiments without departing from the scope and spirit of the invention. The scope of the present disclosure is defined by the appended claims.

Claims

1. A method of group awareness and position analysis across a social network, the method comprising the steps of:

S200, aligning SN₁ and SN₂ based on the Poincaré ball model of non-Euclidean hyperbolic space to obtain an initial fusion network and a characterization vector of each node in the initial fusion network in the non-Euclidean hyperbolic space; S200 is executed; the characterization vector comprises a content attribute feature vector and a topological structure feature vector;

S300, setting c=c+1, if c < Q-1, executing S400; otherwise, executing S500;

S600, obtaining the content attribute feature map C_m and the topological structure graph T_m of the m-th community in G; wherein C_m is obtained based on the distance between nodes in the m-th community in G, C_m and T_m each have a corresponding adjacency matrix, and the i-th node has a content attribute feature vector in C_m and a topological structure feature vector in T_m;

S700, inputting C_m and T_m into a first-layer attention mechanism to obtain content attribute fusion features H^m_a and topology fusion features H^m_t, whose i-th elements are respectively the content attribute fusion feature and the topology fusion feature of node v_i in C_m and T_m;

S800, inputting H^m_a and H^m_t into a second-layer attention mechanism to obtain the fusion features of the m-th community Z^m = {z^m_1, z^m_2, …, z^m_i, …, z^m_h(m)}; z^m_i being the fusion feature of node v_i of the m-th community;

S1000, obtaining the viewpoint stance value of the m-th community, wherein k_e is the attribute value of the e-th viewpoint stance.

2. The method of claim 1, wherein the characterization vector of the nodes in each fusion network is obtained by:

the hyperbolic distance being that between two nodes in network x, and the neighbor node set being that of a node in network x; V^x is the node set of network x; network s represents the source network of the two social networks, and network t represents the target network of the two social networks;

the probability being that of a node in network x belonging to community p in network x, defined with the hyperbolic representation vector of the node and the hyperbolic parameters of community p in network x; C^x is the number of communities in network x;

S30, constructing an alignment probability function for the two social networks embedded with the non-Euclidean hyperbolic clustering model, based on the anchor users in the social networks; wherein the function involves, for a node in the source network, the node in the target network connected to it by an anchor link, together with the hyperbolic distances between the corresponding pairs of nodes, and the anchor user set of the two social networks;

s40, constructing the following joint objective function:

wherein α₁ and α₂ are weight factors, θ is a hyperbolic standard vector of the node, θ' is a context vector of the node, and each community p in network x has a metric matrix;

3. The method of claim 2, wherein

β and μ are respectively the distortion and position vectors, ω is an aggregation factor, Δ is a metric matrix, namely a d-dimensional positive definite matrix used to describe the Riemannian metric, |Δ| is the determinant of Δ, and K_r(·) is the modified Bessel function of order r, the remaining Bessel term being a modified Bessel function of its respective order.

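4. The method of claim 1, wherein

$$h^m_{a,i} = \sigma\Big(\sum_{j \in N_i} \alpha_{ij}\, W x^m_{a,j}\Big), \qquad \alpha_{ij} = \frac{\exp\big(\mathrm{LeakyReLU}(a([W x^m_{a,i} \,\|\, W x^m_{a,j}]))\big)}{\sum_{k \in N_i} \exp\big(\mathrm{LeakyReLU}(a([W x^m_{a,i} \,\|\, W x^m_{a,k}]))\big)},$$

$x^m_{a,j}$ is the content attribute feature vector of the j-th node in C_m, N_i is the set of neighbor nodes of v_i, W is a weight matrix, the symbol [·‖·] represents the concatenation operation, a(·) is a single-layer feedforward neural network, LeakyReLU is an activation function, and σ(·) is the sigmoid function.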

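5. The method of claim 1, wherein

$$h^m_{t,i} = \sigma\Big(\sum_{j \in N_i} \alpha_{ij}\, W x^m_{t,j}\Big), \qquad \alpha_{ij} = \frac{\exp\big(\mathrm{LeakyReLU}(a([W x^m_{t,i} \,\|\, W x^m_{t,j}]))\big)}{\sum_{k \in N_i} \exp\big(\mathrm{LeakyReLU}(a([W x^m_{t,i} \,\|\, W x^m_{t,k}]))\big)},$$

$x^m_{t,j}$ is the topological structure feature vector of the j-th node in T_m, N_i is the set of neighbor nodes of v_i, W is a weight matrix, the symbol [·‖·] represents the concatenation operation, a(·) is a single-layer feedforward neural network, LeakyReLU is an activation function, and σ(·) is the sigmoid function.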

6. The method of claim 1, wherein the first layer of attention mechanisms comprises K attention mechanisms.

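7. The method of claim 1, wherein C_m is obtained by the following steps:

S601, for any node v_i in the m-th community, obtaining the set of similarities corresponding to v_i, each element being the similarity between the content attribute feature vectors of node v_i and a node v_j;

S602, sorting the similarities corresponding to v_i in descending order to obtain the sorted similarities;

S603, taking the nodes corresponding to the first B sorted similarities as the neighbor nodes of v_i, and obtaining the neighbor nodes of all nodes in the m-th community;

S604, constructing C_m based on the neighbor nodes of all nodes in the m-th community.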

8. The method of claim 1, wherein the content attribute features include attribute information of the user and published text information, and the topology features include network activity information and social relationship information of the user.

9. A group awareness and position analysis system across a social network, comprising: the system comprises a social network embedding module, a feature fusion mining module and a group standing decision module;

the social network embedding module is used for embedding different social networks into the same high-dimensional common subspace based on the Poincaré ball model of non-Euclidean hyperbolic space, realizing the alignment of communities and users of the social networks in the high-dimensional common subspace, realizing the alignment of multi-source heterogeneous information of network community users and the associated fusion of the networks, obtaining the aligned target fusion network and the characterization vector of each user of the target fusion network in the non-Euclidean hyperbolic space, and sending the characterization vector to the feature fusion mining module, wherein the characterization vector comprises a content attribute feature vector and a topological structure feature vector;

the feature fusion mining module is used for mining and fusing the features in the target fusion network through a two-layer attention mechanism, so as to obtain the fusion features of each community in the target fusion network and send the fusion features to the group standing decision module;

The group standing decision module is used for predicting standing tendency attitudes of communities based on the received fusion characteristics to obtain the standing tendency attitudes of each community.

10. An electronic device comprising a processor and a memory;

the processor is adapted to perform the steps of the method according to any of claims 1 to 8 by invoking a program or instruction stored in the memory.