CN116049695A - Group perception and standing analysis method, system and electronic equipment crossing social network - Google Patents


Info

Publication number
CN116049695A
Authority
CN
China
Prior art keywords
network
node
community
fusion
hyperbolic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211643877.3A
Other languages
Chinese (zh)
Other versions
CN116049695B (en
Inventor
李晓宇
金力
张泽群
李树超
刘庆
姚方龙
关世昌
马豪伟
董鹏程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Aerospace Information Research Institute of CAS
Original Assignee
Aerospace Information Research Institute of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Aerospace Information Research Institute of CAS filed Critical Aerospace Information Research Institute of CAS
Priority to CN202211643877.3A priority Critical patent/CN116049695B/en
Publication of CN116049695A publication Critical patent/CN116049695A/en
Application granted granted Critical
Publication of CN116049695B publication Critical patent/CN116049695B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computational Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Economics (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • Human Resources & Organizations (AREA)
  • Algebra (AREA)
  • Tourism & Hospitality (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Strategic Management (AREA)
  • Primary Health Care (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a group perception and stance analysis method, system and electronic device across social networks. The method comprises: using a community network representation based on non-Euclidean hyperbolic space to align users across social networks and to select an aggregated, associated representation space; fusing the multi-source heterogeneous information of user content attributes and topological structure; and obtaining the stance tendency of the user group. The invention can improve the efficiency and accuracy of group perception and stance prediction.

Description

Group perception and standing analysis method, system and electronic equipment crossing social network
Technical Field
The invention relates to the field of artificial intelligence, and in particular to a group perception and stance analysis method, system and electronic device across social networks.
Background
Stance detection is one of the leading research branches in natural language processing. It aims to detect, from information published by a user, the opinion or attitude toward a topic event and its associated objects, such as "support, neutral or oppose". Stance detection is an important applied branch of text classification. Compared with sentiment analysis, the signal to be detected is more implicit and obscure and is closely tied to the target object, which makes the classification task harder.
Traditional stance detection mainly targeted offline communication data and online forum posts. With the development of new media technologies, however, social network platforms such as Sina Weibo and Douyin in China and Twitter and Facebook abroad have become the main channel through which users express their stances. Stance detection tasks designed for social media platforms have therefore emerged, datasets based on social media platforms have been published in China and abroad, and exploratory models and methods have been proposed.
In addition, conventional stance detection works at the level of the individual user. However, on the one hand, a user's stance evolves through interaction: an individual neither fully accepts nor fully ignores the stances of other individuals. On the other hand, the stances of users aggregate into an overall opinion. Many studies have therefore addressed group stance analysis and decision prediction. The core idea is to treat users as a group and to predict the overall stance of the network user group by studying the interaction among users, ultimately forming a perception of public opinion. The group stance detection task for social media considers the interaction of the user group in social media, continuously updates and fuses the group's opinions and attitudes on the same issue, and finally yields different public opinion states such as consensus, polarization or fragmentation.
According to how features are represented and extracted, existing social network group stance detection methods fall into three classes: methods based on feature engineering, methods based on machine learning, and methods based on representation learning.
In feature-engineering-based stance detection, experts process the raw data and extract features, which are then fed into an algorithm model. In the stance detection task, the features are mainly linguistic and structural features supplied by disciplines such as linguistics and psychology, for example lexical features, extracted sentence patterns and syntactic dependencies. Existing methods can be roughly divided into stance detection based on semantic features and stance detection based on the fusion of semantic and structural features. However, these algorithms depend heavily on the domain expertise of researchers, their performance depends directly on how discriminative the features are, their degree of automation is low, and their accuracy is unstable.
Machine-learning-based methods build deep learning models such as neural networks and train them on real data to fit automatically the nonlinear, complex mapping between the input and output of the stance detection task. A deep learning model adjusts and optimizes its network parameters automatically from real training data and reduces manual feature definition: texts and topics in stance detection are mapped to vectors in a high-dimensional space, and the vectors are then used to compute and output the final stance class. Existing stance detection methods based on convolutional neural networks, recurrent neural networks, graph attention networks and the like all center on having the model learn the correlation between stance and text features automatically and classify using this feature information. Machine learning methods are more accurate and robust than feature engineering methods, but they often focus on expressing and mapping text features, lack any representation of user attributes and topology, and, because they compute on the single modality of text, make poor use of the available information.
Representation-learning-based methods obtain a low-dimensional feature vector representation of the research object in a specific embedding space and then classify using metric computations on these vectors. Such methods need no feature engineering, generalize well and are interpretable. In the stance detection task they mainly involve user representation learning and text representation learning. Text representation learning embeds the text features of a user into a spatial encoding; it mainly includes discrete representations such as one-hot encoding, bag-of-words encoding and TF-IDF, and continuous encoding models such as Word2Vec and the BERT pre-trained model. User representation learning mainly models dimensions such as user attributes, generated content, behavior and relationships. Representation learning further expands the dimensions available for modeling group stance detection. However, existing methods usually model in conventional Euclidean space, whereas a social network is a real complex network and Euclidean space lacks the capacity to express hierarchical data structure, so the structural attributes of the social network are distorted and the representation capacity is insufficient. In addition, fusing features from different dimensions, such as text representation features and user representation features, remains a bottleneck that limits model performance.
Disclosure of Invention
To address the above technical problems, the invention adopts the following technical solution:
An embodiment of the invention provides a group perception and stance analysis method across social networks, comprising the following steps:
S100, acquiring Q social networks $SN_1, SN_2, \ldots, SN_r, \ldots, SN_Q$, where r ranges from 1 to Q;
S200, aligning $SN_1$ and $SN_2$ based on the Poincaré ball model of non-Euclidean hyperbolic space to obtain an initial fusion network and the characterization vector, in non-Euclidean hyperbolic space, of each node in the initial fusion network; executing S300; the characterization vector comprises a content attribute feature vector and a topological structure feature vector;
S300, setting c = c + 1; if c < Q - 1, executing S400; otherwise, executing S500;
S400, aligning the current fusion network with $SN_{c+2}$ based on non-Euclidean hyperbolic space to obtain fusion network (c+1) and the characterization vector, in non-Euclidean hyperbolic space, of each node in fusion network (c+1); executing S300;
S500, taking the current fusion network as the target fusion network $G = (V, X, E)$, where $V = \{v_1, v_2, \ldots, v_s, \ldots, v_n\}$ is the node set of G, s ranges from 1 to n, and n is the number of nodes in G; the content attribute feature set is $X = \{X_1, X_2, \ldots, X_m, \ldots, X_L\}$, where $X_m$, the content attribute feature set of the m-th community, collects the characterization vectors of the nodes $v_i$ of the m-th community in G; the topological structure feature set is $E = \{E_1, E_2, \ldots, E_m, \ldots, E_L\}$, where $E_m$ collects, for the m-th community in G, the sets of edges between nodes $v_i$ and $v_j$; m ranges from 1 to L; i and j range from 1 to h(m) with i ≠ j; L is the number of communities in G; and h(m) is the number of nodes in the m-th community;
S600, obtaining the content attribute feature graph $C_m$ and the topological structure graph $T_m$ of the m-th community in G, where $C_m$ is built from the distances between nodes in the m-th community in G [formula image]; $C_m$ and $T_m$ each have a corresponding adjacency matrix; the i-th node of $C_m$ carries a content attribute feature vector, and the i-th node of $T_m$ carries a topological structure feature vector;
S700, inputting $C_m$ and $T_m$ into a first-layer attention mechanism to obtain the corresponding content attribute fusion features and topological structure fusion features, which contain, for each node $v_i$ in $C_m$ and $T_m$ respectively, its content attribute fusion feature and its topological structure fusion feature;
S800, inputting the content attribute fusion features and the topological structure fusion features into a second-layer attention mechanism to obtain the fusion feature of the m-th community $Z_m = \{z^m_1, z^m_2, \ldots, z^m_i, \ldots, z^m_{h(m)}\}$, where $z^m_i$ is the fusion feature of node $v_i$ of the m-th community;
S900, inputting $z^m_i$ into a preset viewpoint tendency prediction model to obtain the corresponding prediction result $Pc^m_i = \{Pc^m_{ie}\}_{e=1}^{H}$, where $Pc^m_{ie}$ is the probability that the stance of the user corresponding to node $v_i$ belongs to the e-th viewpoint stance, and H is the number of viewpoint stances;
S1000, obtaining the viewpoint stance value of the m-th community [formula image], where $k_e$ is the attribute value of the e-th viewpoint stance.
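Steps S900 and S1000 can be illustrated with a small sketch. The aggregation formula itself is not reproduced in the text, so the code below assumes, purely for illustration, that the community value is the mean over nodes of the expected stance value, i.e. the sum over e of k_e times Pc^m_ie. The function name and the example stance values are hypothetical.

```python
def community_stance_value(probabilities, stance_values):
    """Aggregate per-node stance probabilities into one community score.

    probabilities: one row per node v_i; row[e] plays the role of Pc^m_ie.
    stance_values: the attribute values k_e, one per viewpoint stance.
    Assumption (not from the source): the community value is the mean over
    nodes of the expected stance value sum_e k_e * Pc^m_ie.
    """
    if not probabilities:
        raise ValueError("community has no nodes")
    total = 0.0
    for row in probabilities:
        if len(row) != len(stance_values):
            raise ValueError("row length must equal the number of stances")
        total += sum(k * p for k, p in zip(stance_values, row))
    return total / len(probabilities)


# Illustrative values only: support = +1, neutral = 0, oppose = -1.
k = [1.0, 0.0, -1.0]
pc = [
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.1, 0.2, 0.7],
]
score = community_stance_value(pc, k)
```

With signed stance values, a score near zero indicates a divided community and a score near either extreme indicates consensus. A different aggregation, for example a sum or a weighted mean, would change only the last line of the function.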
The invention has at least the following beneficial effects:
The group perception and stance analysis method across social networks provided by the embodiment of the invention mines and aggregates stance tendencies by combining the interaction among group nodes with each node's own content attribute features. Through feature-level fusion of the multi-dimensional node content attribute and topological structure features, and through training driven by real data to extract the viewpoints in text automatically, it can improve the efficiency and accuracy of group decision prediction.
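The feature-level fusion described here corresponds to the two attention layers of steps S700 and S800. The text does not give the attention formulas, so the following is only a minimal sketch under stated assumptions: dot-product scores, no learned weight matrices, and a hypothetical `query` vector for the second layer. All function names are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def node_level_attention(features, adjacency):
    """First layer: each node attends over itself and its graph neighbours."""
    fused = []
    for i, h_i in enumerate(features):
        nbrs = [j for j, a in enumerate(adjacency[i]) if a or j == i]
        alpha = softmax([dot(h_i, features[j]) for j in nbrs])
        fused.append([
            sum(a * features[j][d] for a, j in zip(alpha, nbrs))
            for d in range(len(h_i))
        ])
    return fused


def view_level_attention(content, topology, query):
    """Second layer: per node, weight the two views and sum them."""
    out = []
    for h_c, h_t in zip(content, topology):
        beta = softmax([dot(query, h_c), dot(query, h_t)])
        out.append([beta[0] * c + beta[1] * t for c, t in zip(h_c, h_t)])
    return out


adjacency = [[0, 1], [1, 0]]           # two connected nodes
content_feats = [[1.0, 0.0], [0.0, 1.0]]
topology_feats = [[0.0, 1.0], [1.0, 0.0]]

h_content = node_level_attention(content_feats, adjacency)
h_topology = node_level_attention(topology_feats, adjacency)
z = view_level_attention(h_content, h_topology, query=[1.0, 1.0])
```

In a trained model the scores would come from learned parameters; the sketch only shows how the neighbour-level weights and the view-level weights compose into one fused vector per node.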
Drawings
To illustrate the technical solutions of the embodiments of the invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the invention; a person skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a flowchart of a method for group perception and standpoint analysis across social networks provided by an embodiment of the present invention.
Fig. 2 is a block diagram of a group awareness and standpoint analysis system across social networks according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person skilled in the art on the basis of these embodiments without inventive effort fall within the scope of protection of the invention.
The technical idea of the invention is that, when an emergent topic event occurs, the posts of users on different social network platforms can be mined automatically and the group stances that form can be analyzed. Specifically, stances toward a topic event are expressed in users' posts; virtual communities formed by users' mutual following and shared interests generally develop a community stance; and the stances of different communities together make up the overall public opinion of the social network. In terms of data flow, the user attribute and structure data of different social networks and the posts published by each user are analyzed to obtain the stance of each virtual community toward the topic event at a given moment.
The group perception and stance analysis method across social networks provided by the embodiment of the invention, as shown in Fig. 1, may comprise the following steps:
S100, acquiring Q target social networks $SN_1, SN_2, \ldots, SN_r, \ldots, SN_Q$, where r ranges from 1 to Q.
In the embodiment of the invention, the target social networks can be obtained from existing social platforms such as Sina Weibo; different social platforms form different social networks. The value of Q can be set according to actual needs.
Each social network may be represented by a two-tuple $G = (V, \delta)$, where V is the node set $\{v_i\}$ with cardinality N, and δ is the edge set $\{(v_i, v_j)\}$, whose elements are called the edges of the network. In a social network, a community is a subset of the user set whose member users all carry the same community label. No two communities in the set of all communities intersect. For two social networks, one is taken as the source network and the other as the target network; s and t denote source and target respectively, the indices i and j distinguish users, and the indices p and q distinguish communities. The target network is the reference network, and users in the source network are to be aligned to users in the target network.
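The network definition above maps naturally onto a small data structure. The sketch below is illustrative only: the class name `SocialNetwork`, its fields and the example user IDs are assumptions, not part of the patent. Storing one label per node guarantees that communities are disjoint, as the definition requires.

```python
from dataclasses import dataclass, field


@dataclass
class SocialNetwork:
    """Two-tuple G = (V, delta) plus one community label per node."""
    nodes: set = field(default_factory=set)            # V
    edges: set = field(default_factory=set)            # delta, as (u, v) pairs
    community_of: dict = field(default_factory=dict)   # node -> community label

    def add_edge(self, u, v):
        """Add an edge and register both endpoints as nodes."""
        self.nodes.update((u, v))
        self.edges.add((u, v))

    def communities(self):
        """Group nodes by label; one label per node keeps the groups disjoint."""
        groups = {}
        for node, label in self.community_of.items():
            groups.setdefault(label, set()).add(node)
        return groups


source = SocialNetwork()
source.add_edge("u1", "u2")
source.add_edge("u2", "u3")
source.community_of.update({"u1": "p", "u2": "p", "u3": "q"})
groups = source.communities()
```

A source network and a target network would each be one such object, with the indices p and q of the text corresponding to the community labels.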
Nodes in the target network are regarded as the physical objects of network users and are defined as images. Correspondingly, nodes in the source network are regarded as the original objects of network users and are defined as preimages. It is common for a user to join several social networks at once and to behave similarly in each of them, or with a different emphasis in each. Such users are defined as follows:
Anchor user: if a natural person has a preimage node in the source network and an image node in the target network, that user is called an anchor user.
Anchor link: let two nodes belong to the source node set and the target node set respectively; if they represent the same user in the source network and the target network, they form a cross-network link, called an anchor link and recorded as a two-tuple of the two nodes.
Anchor user set A: the anchor user set is the set of all known anchor users. Similarly, the same natural group of people typically participates in several social networks and behaves similarly in each of them, sometimes with a different emphasis in each.
Anchor community: if two communities, one in the source network and one in the target network, correspond to the same natural group of people, the two communities are called anchor communities.
Because the same user registers accounts on different social platforms and forwards and spreads messages on them, many users and communities are formed in virtual space. To reach through to the level of real user objects and to aggregate and mine the stances of users across different social networks, the alignment of different virtual communities, that is, the alignment of users across social networks, must be solved first. Accurate community alignment requires accurate data characterization, whose core task is to represent each data object as a high-dimensional vector in a characterization space. A social network has a complex hierarchical topology, that is, a latent non-Euclidean structure; a conventional Euclidean characterization space cannot capture this structure and easily distorts the structural attributes of the data. To address this, the invention proposes a community network characterization based on hyperbolic non-Euclidean space. Compared with Euclidean characterization, hyperbolic space yields community characterizations with higher internal cohesion and stronger external separation, which benefits the alignment of overlapping communities. Each social network characterization is then mapped into a common hyperbolic subspace, in which community similarity is computed and communities are aligned. Specifically, the model uses known anchor users to align the hyperbolic characterization spaces of the social networks by representation transfer, takes the aligned characterization space as the common subspace, and aligns social communities in that subspace.
The community alignment task across social networks is completed through embedding into hyperbolic space and transfer computation in the common subspace, as shown in S200 to S400:
S200, aligning $SN_1$ and $SN_2$ based on the Poincaré ball model of non-Euclidean hyperbolic space to obtain an initial fusion network and the characterization vector, in non-Euclidean hyperbolic space, of each node in the initial fusion network; S300 is then performed.
In the embodiment of the invention, the Poincaré ball model of non-Euclidean hyperbolic space is used as the characterization model, and the vector it produces is the characterization vector. The characterization vector of each node is a high-dimensional vector that may comprise a content attribute feature vector and a topological structure feature vector.
In the embodiment of the invention, the content attribute features may comprise the user's attribute information and published text. The attribute information may include user name, gender, age, email address, address, occupation, place of residence and the like; the text may include posts, interactive comments and the like. That is, the constituent elements of the content attribute features may include user name, gender, age, email address, address, occupation, place of residence, posts, interactive comments and the like. The topological structure features comprise the user's network activity information and social relationship information. The network activity information may include login time, login frequency, login duration and the like; the social relationship information may include tag information such as the user's interests and habitual browsing categories, as well as social platform attenuators, followed friends, followers and the like. That is, the constituent elements of the topological structure features may include login time, login frequency, login duration, social platform attenuators, followed friends, followers and the like.
In implementation, once the values of the elements of each feature are obtained, they are encoded into the corresponding feature vectors, which are then embedded into non-Euclidean hyperbolic space and converted into characterization vectors in that space.
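The text does not say how an encoded feature vector is moved into hyperbolic space. One common choice, shown here as an assumption rather than the patent's method, is the exponential map at the origin of the unit Poincaré ball (curvature -1), which sends a Euclidean vector v to tanh(||v||) · v / ||v||. The helper names and feature keys are hypothetical.

```python
import math


def encode_record(record, keys):
    """Turn a dict of already-numeric feature values into a flat vector."""
    return [float(record[k]) for k in keys]


def exp_map_origin(vector, max_norm=1.0 - 1e-9):
    """Map a Euclidean vector into the open unit ball.

    Uses tanh(||v||) * v / ||v||; the radius is clipped to max_norm so that
    floating-point rounding never places a point on the boundary.
    """
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return [0.0 for _ in vector]
    radius = min(math.tanh(norm), max_norm)
    return [radius * x / norm for x in vector]


# Illustrative feature names and values only.
keys = ["login_frequency", "login_duration"]
features = encode_record({"login_frequency": 3.0, "login_duration": 4.0}, keys)
point = exp_map_origin(features)
point_norm = math.sqrt(sum(x * x for x in point))
```

Every output lies strictly inside the unit ball, which is the precondition for the hyperbolic distance used in the later steps. In practice the features would be normalized first, since large inputs are pushed very close to the boundary.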
S300, setting c=c+1, if c < Q-1, executing S400; otherwise, S500 is performed.
S400, aligning the current fusion network with $SN_{c+2}$ based on non-Euclidean hyperbolic space to obtain fusion network (c+1) and the characterization vector, in non-Euclidean hyperbolic space, of each node in fusion network (c+1); S300 is then performed.
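The control flow of S200 to S400 is an iterative fold over the Q networks. The sketch below shows only that loop; the alignment itself is passed in as a caller-supplied function `align`, which is hypothetical, and the counter is assumed to start at 0, which the text does not state explicitly.

```python
def fuse_networks(networks, align):
    """Fold Q networks into one fusion network, following S200 to S400.

    networks: list [SN_1, ..., SN_Q] (0-based in Python).
    align: caller-supplied function (current, incoming) -> fused network.
    """
    q = len(networks)
    if q < 2:
        raise ValueError("need at least two social networks")
    fused = align(networks[0], networks[1])      # S200: SN_1 with SN_2
    c = 0                                        # assumed initial counter value
    while True:
        c += 1                                   # S300
        if c < q - 1:
            # S400: SN_(c+2) in the text is index c + 1 in a 0-based list.
            fused = align(fused, networks[c + 1])
        else:
            return fused                         # hand over to S500


calls = []


def toy_align(current, incoming):
    """Stand-in alignment: record the incoming network and take the union."""
    calls.append(incoming)
    return current | incoming


fused = fuse_networks([{"a"}, {"b"}, {"c"}, {"d"}], toy_align)
```

With Q = 4 the loop performs three alignments in total, one in S200 and two in S400, so each network is merged exactly once.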
In the embodiment of the invention, the core idea of embedding a social network into the Poincaré ball model of non-Euclidean hyperbolic space is to measure the affinity between nodes by their distance in the Poincaré ball model, and thereby to learn the hyperbolic characterization vector of each node.
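For reference, the standard distance on the unit Poincaré ball is d(u, v) = arcosh(1 + 2·||u − v||² / ((1 − ||u||²)(1 − ||v||²))). The sketch below implements this textbook formula; whether the patent uses exactly this form is an assumption, since its distance formula is not reproduced in the text.

```python
import math


def poincare_distance(u, v):
    """Hyperbolic distance between two points inside the unit Poincare ball."""
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v))
    return math.acosh(arg)


origin = [0.0, 0.0]
near_centre = [0.5, 0.0]
near_edge = [0.9, 0.0]

d_centre = poincare_distance(origin, near_centre)
d_edge = poincare_distance(near_centre, near_edge)
```

Although both pairs are at most 0.5 apart in Euclidean terms, the second pair is farther apart hyperbolically because distances grow toward the boundary. This is the property that lets the ball represent hierarchical structure with little distortion.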
First, random walks are performed on the social network to capture the affinity between nodes. In a walk sequence, the several nodes immediately before and after a given node are called its context nodes. A node $v_i$ thus has two roles: as a center node it corresponds to the hyperbolic characterization vector $\theta_i$; as the context of another node it corresponds to the context vector $\theta_i'$.
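The random-walk step can be sketched as follows. This is a plain uniform random walk with a symmetric context window, in the style of DeepWalk; the walk length, window size and function names are illustrative assumptions, since the text does not fix them.

```python
import random


def random_walk(adjacency, start, length, rng):
    """Uniform random walk of at most `length` nodes over an adjacency dict."""
    walk = [start]
    while len(walk) < length:
        neighbours = adjacency.get(walk[-1], [])
        if not neighbours:
            break                      # dead end: stop the walk early
        walk.append(rng.choice(neighbours))
    return walk


def context_pairs(walk, window):
    """List (center, context) pairs for nodes within `window` steps."""
    pairs = []
    for idx, center in enumerate(walk):
        lo = max(0, idx - window)
        hi = min(len(walk), idx + window + 1)
        for j in range(lo, hi):
            if j != idx:
                pairs.append((center, walk[j]))
    return pairs


adjacency = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
rng = random.Random(0)                 # seeded for reproducibility
walk = random_walk(adjacency, "a", 5, rng)
pairs = context_pairs(["a", "b", "c"], window=1)
```

Each pair supplies one (center, context) training example: the first element is looked up as a characterization vector θ and the second as a context vector θ′.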
Further, in S200 and S400, the characterization vector of each node in each fusion network is obtained through the following steps:
S10, embedding each node of the two social networks to be fused into the Poincaré ball model of hyperbolic space, and constructing the node characterization vector objective function $F_u$ [formula image].
In embodiments of the invention, embedding is the well-known machine learning operation in which a neural network maps a high-dimensional local representation into a low-dimensional distributed space. Those skilled in the art will appreciate that any method of embedding the nodes of a social network into the Poincaré ball model of hyperbolic space falls within the scope of the invention.
Here the objective involves, for a pair of nodes $v_i$ and $v_j$ in network x, the hyperbolic distance between the two nodes, which describes their affinity; the set of neighbor nodes of $v_i$ in network x; and the node set of network x. Network s denotes whichever of $SN_1$ and $SN_2$ serves as the source network, and network t denotes whichever serves as the target network.
In an embodiment of the invention, the objective is built from the sigmoid function σ(·), the hyperbolic characterization vector of the center node, the context vector of the context node, and a hyperbolic distance function D(·) [formula images]. By optimizing $F_u$ with Riemannian stochastic gradient descent, the characterization vector of each node in the network can be obtained.
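A single Riemannian stochastic gradient descent update on the Poincaré ball can be sketched as below. It uses the widely used retraction-based rule: rescale the Euclidean gradient by (1 − ||θ||²)² / 4, take an ordinary step, and project back into the ball. The patent names only the optimizer family, so this particular update rule, the function name and the `eps` margin are assumptions.

```python
import math


def rsgd_step(theta, euclid_grad, lr, eps=1e-5):
    """One Riemannian SGD step on the unit Poincare ball.

    theta: current point (list of floats) inside the ball.
    euclid_grad: Euclidean gradient of the loss at theta.
    The inverse metric factor (1 - ||theta||^2)^2 / 4 turns the Euclidean
    gradient into the Riemannian one; the result is projected back inside.
    """
    sq_norm = sum(x * x for x in theta)
    scale = ((1.0 - sq_norm) ** 2) / 4.0
    updated = [t - lr * scale * g for t, g in zip(theta, euclid_grad)]
    norm = math.sqrt(sum(x * x for x in updated))
    if norm >= 1.0:
        # Projection: pull the point back just inside the boundary.
        updated = [x / norm * (1.0 - eps) for x in updated]
    return updated


small = rsgd_step([0.0, 0.0], [1.0, 0.0], lr=0.4)
clipped = rsgd_step([0.0, 0.0], [-8.0, 0.0], lr=1.0)
clipped_norm = math.sqrt(sum(x * x for x in clipped))
```

The rescaling makes steps shrink automatically near the boundary, where hyperbolic distances are large, which keeps the optimization stable.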
S20, modeling each community in the two social networks embedded in the Poincaré ball model of hyperbolic space with a non-Euclidean hyperbolic clustering model, and constructing the community characterization vector objective function $F_c$ [formula image].
In the embodiment of the invention, a non-Euclidean hyperbolic mixture clustering model is designed on the node characterization vectors in non-Euclidean hyperbolic space in order to discover and characterize communities. In the mixture clustering model, a mixture distribution in hyperbolic space is formed by the nodes clustered in each community.
Here the objective involves the membership matrix, whose entry $Z_{ip}$ is the probability that node $v_i$ in network x belongs to community p in network x, and each row of which sums to 1; the probability density function of the model of community p, built on the generalized hyperbolic distribution; the hyperbolic characterization vector of $v_i$; the hyperbolic parameters of community p in network x; and $C_x$, the number of communities in network x.
In the hyperbolic mixture clustering model, node characterizations are generated from a mixture distribution in hyperbolic space, each component of which corresponds to a community. Given the node characterizations $\{\theta_{(\cdot)}\}$, the likelihood that a node belongs to a community is calculated by: [formula image]
In the embodiment of the invention, each community is modeled with a generalized hyperbolic distribution, and the probability density function of the modeled community is as follows:
Figure BDA00040089094000000721
wherein ,
Figure BDA00040089094000000722
β and μ are the skewness vector and the location vector, respectively, where the location vector μ is the hyperbolic representation vector of the community. ω is a concentration factor, and Δ is the metric matrix, a d-dimensional positive definite matrix that describes the Riemannian metric; |Δ| is the determinant of Δ. K_r(·) is the modified Bessel function of order r, which is differentiable with respect to both the order r and the argument,
Figure BDA0004008909400000081
is->
Figure BDA0004008909400000082
-th order modified Bessel function, which is likewise differentiable with respect to both its order and its argument.
In the embodiment of the invention, with the node characterization vectors given as observations, the community membership matrix and the hyperbolic characterization vectors of the communities can be obtained simultaneously by optimizing Fc with the Riemannian stochastic gradient descent method.
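The row-sum constraint on the membership matrix can be illustrated with a small sketch. It assumes that the weighted density of each node under each community component has already been evaluated, and it does not implement the generalized hyperbolic density itself; the names `membership_row` and `membership_matrix` are hypothetical.

```python
def membership_row(weighted_densities):
    """Normalize one node's per-community densities into membership probabilities."""
    total = sum(weighted_densities)
    if total <= 0:
        raise ValueError("at least one community density must be positive")
    return [w / total for w in weighted_densities]

def membership_matrix(density_rows):
    """One row per node, one column per community; every row sums to 1."""
    return [membership_row(row) for row in density_rows]
```

Each entry plays the role of a membership probability Z_ip, so a node whose density is highest under community p receives its largest membership in column p.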
S30, constructing, based on the anchor users in the social networks, an alignment probability function of the two social networks embedded with the non-Euclidean hyperbolic clustering model
Figure BDA0004008909400000083
For nodes in the target network and in the source network +.>
Figure BDA0004008909400000084
Nodes connected by anchor links->
Figure BDA0004008909400000085
Representing node->
Figure BDA0004008909400000086
and />
Figure BDA0004008909400000087
Hyperbolic distance between->
Figure BDA0004008909400000088
Representing node- >
Figure BDA0004008909400000089
and />
Figure BDA00040089094000000810
Hyperbolic distance between; />
Figure BDA00040089094000000811
is the set of anchor users of the two social networks, i.e., the intersection of the node IDs of the two social networks.
In the embodiment of the invention, a hyperbolic common subspace, i.e., a vector space formed by the dimensions common to all nodes, is constructed by using an anchor-user characterization migration method in non-Euclidean hyperbolic space. In the hyperbolic common subspace, the two social networks are aligned on the anchor users, and node characterizations can migrate between the networks through the anchor links. If (v_i′, v′_k) is an anchor link, then the node
Figure BDA00040089094000000812
can be used to infer v′_k; similarly, node v′_k can be used to infer its preimage node.
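Since the anchor user set is described as the intersection of the node IDs of the two networks, it can be sketched in a few lines. The function names and the pairing of an ID with itself as an anchor link are illustrative assumptions.

```python
def anchor_users(source_ids, target_ids):
    """Anchor users: node IDs that occur in both social networks."""
    return set(source_ids) & set(target_ids)

def anchor_links(source_ids, target_ids):
    """Pair each anchor user's source-network node with its target-network node."""
    return sorted((uid, uid) for uid in anchor_users(source_ids, target_ids))
```

In practice the two networks may use different identifiers for the same person, in which case a mapping step would be needed before taking the intersection; that step is outside this sketch.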
S40, constructing the following joint objective function:
Figure BDA00040089094000000813
where α_1 and α_2 are weight factors, θ is the hyperbolic characterization vector of a node, θ′ is the context vector of a node,
Figure BDA00040089094000000814
is a metric matrix for community p in network x.
And S50, optimizing the joint objective function to obtain the characterization vector of each node.
By optimizing the joint objective function with the Riemannian stochastic gradient descent method, the characterization vector of each aligned node in the non-Euclidean hyperbolic space and the aligned communities can be obtained in the common Poincaré ball model.
In the embodiment of the invention, the technical effects of S100 to S400 are as follows: because different social networks are firstly embedded into the same high-dimensional space, then the community alignment and the user alignment in the social networks are realized in the high-dimensional public subspace, and the attribute characteristics of each user are aligned, the alignment and the network association fusion of the multi-source heterogeneous information of the network community users can be realized.
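The iterative fusion of S100 to S400, as recited in claim 1, can be sketched as a loop that folds the networks into one fused network. The counter is assumed to start at c = 0, which the text does not state explicitly, and `align` is a hypothetical placeholder for the hyperbolic alignment step.

```python
def fuse_all(networks, align):
    """Fold Q networks into one fused network, following the S200-S500 control flow."""
    if len(networks) < 2:
        raise ValueError("at least two social networks are required")
    fused = align(networks[0], networks[1])        # S200: align SN_1 and SN_2
    c = 0                                          # assumed initial counter value
    while True:
        c += 1                                     # S300: c = c + 1
        if c < len(networks) - 1:                  # S300: test c < Q - 1
            fused = align(fused, networks[c + 1])  # S400: SN_(c+2), zero-based index c + 1
        else:
            return fused                           # S500: current fusion network becomes G
```

With this reading, each remaining network SN_3, …, SN_Q is aligned exactly once against the running fusion result.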
S500, taking the current fusion network as a target fusion network G= (V, X, E) wherein,
Figure BDA00040089094000000815
V is the node set of G, s ranges from 1 to n, and n is the number of nodes in G; the content attribute feature set is X = {X_1, X_2, …, X_m, …, X_h(m)}, and the content attribute feature set of the m-th community is
Figure BDA00040089094000000816
representing the content attribute feature vector of node v_i in the m-th community in G; the topological structure feature set is E = {E_1, E_2, …, E_m, …, E_h(m)},
Figure BDA0004008909400000091
representing the set of edges between nodes v_i and v_j of the m-th community in G; a plurality of adjacency matrices
Figure BDA0004008909400000092
represent the network topology of graph G; if
Figure BDA0004008909400000093
then this indicates that
Figure BDA0004008909400000094
where m ranges from 1 to L, i and j range from 1 to h(m), L is the number of communities in G, and h(m) is the number of nodes in the m-th community.
S600, obtaining a content attribute feature graph C_m and a topological structure graph T_m of the m-th community in G, where C_m is obtained based on the distances between nodes in the m-th community in G,
Figure BDA0004008909400000095
Figure BDA0004008909400000096
is the adjacency matrix corresponding to C_m;
Figure BDA0004008909400000097
is the adjacency matrix corresponding to T_m;
Figure BDA0004008909400000098
is the content attribute feature vector of the i-th node in C_m, and
Figure BDA0004008909400000099
is the topological structure feature vector of the i-th node in T_m.
In the embodiment of the invention, the topological structure graph of the nodes is formed by the connections that users create with other users through interactions such as following, liking, commenting, and reposting, so it is consistent with the topological structure of the target fusion network, i.e., the network G; correspondingly,
Figure BDA00040089094000000910
is the adjacency matrix corresponding to the m-th community in G.
In the embodiment of the invention, the content attribute feature graph C_m is obtained based on the distances between the nodes in the m-th community in G; that is, it is a graph reconstructed after computing the distances between the nodes in the m-th community. Specifically, C_m can be obtained by the following steps:
S601, obtaining, in the m-th community, the similarities corresponding to v_i
Figure BDA00040089094000000911
each being the similarity between the content attribute feature vectors of node v_i and node v_j.
In an embodiment of the present invention,
Figure BDA00040089094000000912
can be the cosine similarity, where
Figure BDA00040089094000000913
is the content attribute feature vector of the j-th node in C_m.
S602, sorting the similarities in
Figure BDA00040089094000000914
in descending order to obtain the sorted similarities
Figure BDA00040089094000000915
S603, taking the nodes corresponding to the first B similarities in
Figure BDA00040089094000000916
as the neighbor nodes of v_i, and likewise obtaining the neighbor nodes of all nodes in the m-th community.
The value of B can be set as needed; in one exemplary embodiment, B = 5.
S604, constructing C_m based on the neighbor nodes of all nodes in the m-th community.
Specifically, based on the neighbor node of each node, the content attribute feature map corresponding to the node can be constructed, so that the adjacency matrix corresponding to the mth community can be obtained from the content attribute feature data of the node
Figure BDA00040089094000000917
Specifically, the content attribute feature graph can be constructed from each node and its neighbor nodes, and the corresponding adjacency matrix is then built from this graph: if two nodes are connected, the entry at the corresponding position of the matrix is 1; otherwise it is 0.
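Steps S601 to S604 can be sketched as a top-B nearest-neighbor graph construction over cosine similarities. Making the adjacency matrix symmetric is an assumption of this sketch, as are the function names; the specification only states that connected nodes receive the value 1.

```python
import math

def cosine_similarity(x, y):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0
    return dot / (norm_x * norm_y)

def content_attribute_adjacency(features, B=5):
    """Build the 0/1 adjacency matrix of the content attribute feature graph."""
    n = len(features)
    adjacency = [[0] * n for _ in range(n)]
    for i in range(n):
        # S601: similarity between node i and every other node in the community.
        sims = [(cosine_similarity(features[i], features[j]), j)
                for j in range(n) if j != i]
        # S602: sort the similarities in descending order.
        sims.sort(key=lambda pair: pair[0], reverse=True)
        # S603: the nodes behind the first B similarities become neighbors of node i.
        for _, j in sims[:B]:
            adjacency[i][j] = 1
            adjacency[j][i] = 1  # assumption: the graph is treated as undirected
    return adjacency  # S604: adjacency matrix of C_m
```

A small B keeps the graph sparse; the exemplary value B = 5 from the text is used as the default.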
The graph structure formed by a social network is intricate: some nodes form isolated graphs on their own, while others interact to form connected graphs. In addition, the attribute information of different nodes is multi-source and heterogeneous. The nodes themselves carry data such as user entities, blog posts, and locations, and the edges between nodes are heterogeneous as well, including social relationships, authorship relationships, and location relationships. The key to mining effective user clustering features and community opinion features is to extract comprehensive and reasonable features jointly from the topological structure and the content attributes of the nodes, but current methods lack a mechanism for extracting topological and attribute features of nodes in a simultaneous, interactive, and cooperative manner. The invention therefore proposes to model the group feature mining problem on the social network abstractly as a multidimensional node clustering problem. To meet the requirements of the fused complex network structure, a multidimensional deep neural network clustering model for mining the attributes and structures among nodes is designed, specifically a neural network with a two-layer attention mechanism. The model first builds an attribute feature graph of the nodes according to the similarity of their content attributes, then takes the attribute feature graph and the topological graph among the nodes as input, and adaptively extracts the hidden features of node content and node topology in different dimensions through the two-layer attention mechanism. This realizes deep mining of the composite content and attribute features of the nodes in each community and provides basic features for community aggregation among nodes and for community attitude mining. A specific implementation is given in S700 to S800 below:
S700, inputting C_m and T_m into a first-layer attention mechanism to obtain, respectively, the corresponding content attribute fusion features
Figure BDA0004008909400000101
and topological structure fusion features
Figure BDA0004008909400000102
Figure BDA0004008909400000103
and />
Figure BDA0004008909400000104
are, respectively, the content attribute fusion feature and the topological structure fusion feature of node v_i in C_m and T_m.
In an embodiment of the present invention, the first-layer attention mechanism is used to learn, for vertex v_i, the weighting coefficients of the features of each of its neighbor nodes v_j, j ∈ N_i.
The content attribute feature graph corresponding to the m-th community is taken as an example. First, the content attribute feature graph corresponding to the m-th community is input into the multidimensional deep neural network clustering model, and the correlation coefficient between vertices v_i and v_j is learned as follows:
Figure BDA0004008909400000105
where the operator [·‖·] denotes concatenation and a(·) is a single-layer feedforward neural network whose activation function is LeakyReLU.
Figure BDA0004008909400000106
wherein ,
Figure BDA0004008909400000107
is->
Figure BDA0004008909400000108
W weight matrix.
The attention distribution is obtained by softmax normalization of the correlation coefficient
Figure BDA0004008909400000109
The neighborhood features are then aggregated according to the attention coefficients; that is, the new feature of the i-th node in C_m, which fuses its neighborhood information, is
Figure BDA0004008909400000111
Similarly, the topology fusion feature of node i in the topology feature map
Figure BDA0004008909400000112
Figure BDA0004008909400000113
wherein ,/>
Figure BDA0004008909400000114
Is T m Topology feature vector of the j-th node in (a), >
Figure BDA0004008909400000115
Is->
Figure BDA0004008909400000116
W is a weight matrix, the symbol [·‖·] denotes concatenation, a(·) is a single-layer feedforward neural network, LeakyReLU is the activation function, and σ(·) is the sigmoid function.
In the embodiment of the invention, to make the model more robust, a multi-head attention mechanism can be introduced to capture different interaction information in different projection spaces; that is, the first-layer attention mechanism comprises K attention mechanisms. The above computation is performed K times independently and the results are concatenated, as shown in the following formula.
Figure BDA0004008909400000117
represents the attention coefficient calculated by the k-th attention mechanism for the m-th feature graph (covering both the content attribute feature graph and the topological structure feature graph), and W_k is the weight matrix of the corresponding input linear mapping.
Figure BDA0004008909400000118
Here ‖ denotes concatenation.
That is, the fused feature of each node may be a feature obtained after feature concatenation from multiple attention mechanisms.
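The first-layer attention described above follows the usual graph attention pattern: score each neighbor with LeakyReLU(a · [W h_i ‖ W h_j]), normalize the scores with softmax, aggregate, and apply the sigmoid. The sketch below is written in plain Python for readability; the LeakyReLU slope of 0.2, the fallback to a self-loop for isolated nodes, and all function names are assumptions rather than details from the specification.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, h):
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def attention_layer(features, neighbors, W, a):
    """One attention head: returns the fused feature of every node."""
    projected = [matvec(W, h) for h in features]
    fused = []
    for i, wh_i in enumerate(projected):
        nbrs = neighbors[i] or [i]  # assumption: isolated nodes attend to themselves
        # Correlation coefficient e_ij = LeakyReLU(a . [W h_i || W h_j]).
        scores = [leaky_relu(sum(p * q for p, q in zip(a, wh_i + projected[j])))
                  for j in nbrs]
        # Softmax normalization over the neighborhood (shifted for stability).
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        # Weighted aggregation of neighbor features followed by the sigmoid.
        agg = [sum(alpha * projected[j][d] for alpha, j in zip(alphas, nbrs))
               for d in range(len(wh_i))]
        fused.append([sigmoid(x) for x in agg])
    return fused

def multi_head(features, neighbors, heads):
    """Concatenate the outputs of K independent heads, given as (W, a) pairs."""
    outputs = [attention_layer(features, neighbors, W, a) for W, a in heads]
    return [sum((out[i] for out in outputs), []) for i in range(len(features))]
```

The same routine would be applied once to the content attribute feature graph and once to the topological structure graph, each with its own neighbor lists.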
S800, inputting H_m^a and H_m^t into a second-layer attention mechanism to obtain the fusion feature Z_m = {z_m^1, z_m^2, …, z_m^i, …, z_m^h(m)} of the m-th community, where z_m^i is the fusion feature of node v_i of the m-th community.
A second-layer attention mechanism is used to learn the respective importance of the topological structure graph and the content attribute feature graph. For the fusion feature
Figure BDA0004008909400000119
of any node v_i in the topological structure graph of the m-th community, a nonlinear transformation is first applied, and a dot-product model is then used to obtain the correlation between the transformed embedding and the query vector q; the average of the attention values of all nodes is taken as the attention value of the topological structure graph,
Figure BDA00040089094000001110
as shown in the following formula, where W is a weight matrix, b is a bias vector, and (·)^T denotes transposition. Similarly, for the fusion feature of the content attribute feature graph
Figure BDA00040089094000001111
the attention value is
Figure BDA00040089094000001112
The fusion features of the content attribute feature graph and the topological structure graph share these parameters.
Figure BDA00040089094000001113
The softmax function is then used to normalize the attention values
Figure BDA00040089094000001114
to obtain the attention weights of the topological structure graph and the content attribute feature graph:
Figure BDA0004008909400000121
Specifically, the attention coefficient of any node i in the content attribute profile
Figure BDA0004008909400000122
Figure BDA0004008909400000123
Attention coefficient of any node i in the topology feature map
Figure BDA0004008909400000124
Figure BDA0004008909400000125
Finally, the attention weights are combined with the fusion features corresponding to the topological structure graph and the content attribute feature graph to obtain
Figure BDA0004008909400000126
Figure BDA0004008909400000127
that is,
Figure BDA0004008909400000128
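The second-layer attention can be sketched as follows. The sketch assumes tanh as the nonlinear transformation and uses one graph-level weight per view; neither choice is confirmed by the text, which may instead use per-node coefficients. All names are hypothetical.

```python
import math

def graph_attention_value(H, W, b, q):
    """Mean over nodes of q^T tanh(W h + b) for one view's fusion features H."""
    values = []
    for h in H:
        transformed = [math.tanh(sum(w * x for w, x in zip(row, h)) + bias)
                       for row, bias in zip(W, b)]
        values.append(sum(qk * tk for qk, tk in zip(q, transformed)))
    return sum(values) / len(values)

def fuse_views(H_a, H_t, W, b, q):
    """Combine content attribute features H_a and topology features H_t."""
    w_a = graph_attention_value(H_a, W, b, q)  # parameters W, b, q are shared
    w_t = graph_attention_value(H_t, W, b, q)
    # Softmax over the two graph-level attention values (shifted for stability).
    top = max(w_a, w_t)
    e_a, e_t = math.exp(w_a - top), math.exp(w_t - top)
    alpha_a, alpha_t = e_a / (e_a + e_t), e_t / (e_a + e_t)
    # Weighted sum of the two views gives the fused feature of every node.
    Z = [[alpha_a * x + alpha_t * y for x, y in zip(h_a, h_t)]
         for h_a, h_t in zip(H_a, H_t)]
    return Z, (alpha_a, alpha_t)
```

Sharing W, b, and q across the two views mirrors the statement that the fusion features of both graphs share the parameters, so the learned weights are directly comparable.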
S900, inputting z_m^i into a set viewpoint tendency prediction model to obtain a corresponding prediction result Pc_m^i = {Pc_m^ie}, e = 1, …, H, where Pc_m^ie is the probability that the position of the user corresponding to node v_i belongs to the e-th viewpoint position, and H is the number of viewpoint positions.
In embodiments of the present invention, user standpoint may include support, neutrality, and objection.
In the embodiment of the invention, the set viewpoint tendency prediction model may be a neural network with a two-layer attention mechanism or a model formed by fully connected layers. Specifically, it may be a trained model: the target fusion features of multiple communities are input as samples into the constructed deep neural network model for training. Training may use gradient descent, and the loss may be computed with the KL divergence. The specific training steps may follow the prior art and are not described here to avoid redundancy.
For the target fusion feature of the user corresponding to the i-th node in the input community m,
Figure BDA00040089094000001212
the output feature obtained after the fully connected layer is
Figure BDA0004008909400000129
where W_out is the output weight matrix and B_out is the bias vector.
Finally, the probability distribution over the H viewpoint position labels is obtained through an output layer consisting of a Softmax layer:
Figure BDA00040089094000001210
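The output stage can be sketched as a single fully connected layer followed by softmax, together with a KL-divergence loss of the kind commonly used to train such a predictor. The function names, the absence of a hidden activation, and the `eps` safeguard are assumptions of this sketch.

```python
import math

def softmax(logits):
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_stance(z, W_out, B_out):
    """Fully connected layer followed by softmax over the H viewpoint positions."""
    logits = [sum(w * x for w, x in zip(row, z)) + bias
              for row, bias in zip(W_out, B_out)]
    return softmax(logits)

def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted); terms with zero target probability contribute 0."""
    return sum(t * math.log(t / max(p, eps))
               for t, p in zip(target, predicted) if t > 0)
```

With H = 3, the three outputs would correspond to the support, neutrality, and objection positions.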
S1000, obtaining the viewpoint position value of the mth community
Figure BDA00040089094000001211
where k_e is the attribute value of the e-th viewpoint position.
In the embodiment of the invention, according to the comments and response behaviors of users on different topic events, the supportive, neutral, or opposing position tendency that a user shows toward an event can be quantified as 1, 0, and -1; that is, the attribute values of support, neutrality, and objection can be 1, 0, and -1, respectively.
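One plausible reading of the community viewpoint position value is the expected attribute value of each node, averaged over the community. The exact formula is in a figure that is not reproduced here, so the averaging step and the function names below are assumptions.

```python
def node_stance_value(probabilities, attribute_values=(1.0, 0.0, -1.0)):
    """Expected attribute value of one node: sum over e of k_e * Pc_ie."""
    return sum(k * p for k, p in zip(attribute_values, probabilities))

def community_stance_value(node_probabilities, attribute_values=(1.0, 0.0, -1.0)):
    """Assumed aggregation: mean of the node-level expected values."""
    values = [node_stance_value(p, attribute_values) for p in node_probabilities]
    return sum(values) / len(values)
```

Under this reading, a value near 1 indicates a community that largely supports the event, a value near -1 indicates opposition, and a value near 0 indicates either neutrality or an evenly split community.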
In the embodiment of the invention, training is driven by real data, and text viewpoints are acquired automatically through feature-level fusion of the multidimensional node content attribute and topological structure features, which can improve the efficiency and accuracy of group decision prediction.
Another embodiment of the present invention provides a social network-spanning group awareness and position analysis system for implementing the foregoing method, as shown in fig. 2, where the provided system may include a social network embedding module, a feature fusion mining module, and a group position decision module that are disposed from bottom to top.
The social network embedding module is used for embedding different social networks into the same high-dimensional common subspace based on the Poincaré ball model of non-Euclidean hyperbolic space and aligning the communities and users of the social networks in that subspace, thereby aligning the multi-source heterogeneous information of network community users and fusing the associated networks. It obtains the aligned target fusion network and the characterization vector of each user of the target fusion network in the non-Euclidean hyperbolic space, and sends them to the feature fusion mining module; the characterization vectors comprise content attribute feature vectors and topological structure feature vectors. The module is specifically configured to perform the steps shown in the foregoing S100 to S500.
The feature fusion mining module is used for mining and fusing the features in the target fusion network through a two-layer attention mechanism to obtain the fusion features of each community in the target fusion network, and for sending the fusion features to the group position decision module. Specifically, the feature fusion mining module further processes the multi-source heterogeneous community data of the aligned social networks transmitted by the social network embedding module, and, through deep mining and fusion of the community data, supports the division of communities and the analysis of position tendencies in the group position decision module. The feature fusion mining module stacks two layers of attention mechanisms to adaptively extract effective features from the graph data: the first layer is a graph attention network that aggregates neighbor node features, and the second layer uses attention to fuse the features extracted from the topological structure graph and the content attribute feature graph. The module is specifically configured to perform the steps shown in the foregoing S600 to S800.
The group position decision module is used for predicting the position tendency attitudes of the communities based on the received fusion features to obtain the position tendency attitude of each community. Building on the deep shared features that the feature fusion mining module mines through feature-level fusion of the multidimensional node content attribute and topological structure features, this module is trained on real data, acquires text viewpoints automatically, and introduces a group decision prediction algorithm that accounts for environmental factors, which improves the efficiency and accuracy of group decision prediction. The module is specifically configured to execute the steps shown in the foregoing S900 to S1000.
Furthermore, although the steps of the methods in the present disclosure are depicted in a particular order in the drawings, this does not require or imply that the steps must be performed in that particular order or that all illustrated steps be performed in order to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step to perform, and/or one step decomposed into multiple steps to perform, etc.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, including several instructions to cause a computing device (may be a personal computer, a server, a mobile terminal, or a network device, etc.) to perform the method according to the embodiments of the present disclosure.
Embodiments of the present invention also provide a non-transitory computer readable storage medium that may be disposed in an electronic device to store at least one instruction or at least one program for implementing one of the methods embodiments, the at least one instruction or the at least one program being loaded and executed by the processor to implement the methods provided by the embodiments described above.
In some possible implementations, the various aspects of the present application may also be implemented in the form of a program product comprising program code for causing a terminal device to carry out the steps according to the various exemplary embodiments of the present application as described in the "exemplary methods" section of this specification, when the program product is run on the terminal device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium would include the following: an electrical connection having one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of remote computing devices, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., connected via the Internet using an Internet service provider).
Embodiments of the present invention also provide an electronic device comprising a processor and the aforementioned non-transitory computer-readable storage medium.
Those skilled in the art will appreciate that the various aspects of the present application may be implemented as a system, method, or program product. Accordingly, aspects of the present application may be embodied in the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, micro-code, etc.), or an embodiment combining hardware and software aspects, which may be collectively referred to herein as a "circuit," "module," or "system."
An electronic device according to this embodiment of the present application is described below. The electronic device is only one example and should not impose any limitation on the functionality and scope of use of the embodiments of the present application.
The electronic device is in the form of a general purpose computing device. Components of an electronic device may include, but are not limited to: the at least one processor, the at least one memory, and a bus connecting the various system components, including the memory and the processor.
Wherein the memory stores program code that is executable by the processor to cause the processor to perform steps according to various exemplary embodiments of the present application described in the above section of the "exemplary method" of the present specification.
The storage may include readable media in the form of volatile storage, such as random access memory (RAM) and/or cache storage, and may further include read-only memory (ROM).
The storage may also include a program/utility having a set (at least one) of program modules including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.
The bus may be one or more of several types of bus structures including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor, or a local bus using any of a variety of bus architectures.
The electronic device may also communicate with one or more external devices (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device, and/or with any device (e.g., router, modem, etc.) that enables the electronic device to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface. And, the electronic device may also communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet, through a network adapter. The network adapter communicates with other modules of the electronic device via a bus. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with an electronic device, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, including several instructions to cause a computing device (may be a personal computer, a server, a terminal device, or a network device, etc.) to perform the method according to the embodiments of the present disclosure.
Embodiments of the present invention also provide a computer program product comprising program code for causing an electronic device to carry out the steps of the method according to the various exemplary embodiments of the invention as described in the specification, when said program product is run on the electronic device.
While certain specific embodiments of the invention have been described in detail by way of example, it will be appreciated by those skilled in the art that the above examples are for illustration only and are not intended to limit the scope of the invention. Those skilled in the art will also appreciate that many modifications may be made to the embodiments without departing from the scope and spirit of the invention. The scope of the present disclosure is defined by the appended claims.

Claims (10)

1. A method of group awareness and position analysis across a social network, the method comprising the steps of:
S100, acquiring Q social networks SN_1, SN_2, …, SN_r, …, SN_Q, where r ranges from 1 to Q;
S200, aligning SN_1 and SN_2 based on the Poincaré ball model of non-Euclidean hyperbolic space to obtain an initial fusion network and a characterization vector of each node in the initial fusion network in the non-Euclidean hyperbolic space, and then executing S300; the characterization vector comprises a content attribute feature vector and a topological structure feature vector;
S300, setting c = c + 1; if c < Q-1, executing S400; otherwise, executing S500;
S400, aligning the current fusion network with SN_(c+2) based on the non-Euclidean hyperbolic space to obtain fusion network (c+1) and a characterization vector of each node in fusion network (c+1) in the non-Euclidean hyperbolic space, and then executing S300;
S500, taking the current fusion network as a target fusion network G = (V, X, E), wherein
Figure FDA0004008909390000011
V is the node set of G, s ranges from 1 to n, and n is the number of nodes in G; the content attribute feature set is X = {X_1, X_2, …, X_m, …, X_h(m)}, and the content attribute feature set of the m-th community is
Figure FDA0004008909390000012
Figure FDA0004008909390000013
representing the characterization vector of node v_i in the m-th community in G; the topological structure feature set is E = {E_1, E_2, …, E_m, …, E_h(m)},
Figure FDA0004008909390000014
representing the set of edges between nodes v_i and v_j of the m-th community in G; m ranges from 1 to L, i and j range from 1 to h(m), i ≠ j, L is the number of communities in G, and h(m) is the number of nodes in the m-th community;
S600, obtaining a content attribute feature graph C_m and a topological structure graph T_m of the m-th community in G, wherein C_m is obtained based on the distances between nodes in the m-th community in G,
Figure FDA0004008909390000015
Figure FDA0004008909390000016
is the adjacency matrix corresponding to C_m;
Figure FDA0004008909390000017
is the adjacency matrix corresponding to T_m;
Figure FDA0004008909390000018
is the content attribute feature vector of the i-th node in C_m, and
Figure FDA0004008909390000019
is the topological structure feature vector of the i-th node in T_m;
S700, inputting C_m and T_m respectively into a first-layer attention mechanism to obtain the corresponding content attribute fusion feature
Figure FDA00040089093900000110
and topological structure fusion feature
Figure FDA00040089093900000111
Figure FDA00040089093900000112
Figure FDA00040089093900000113
and
Figure FDA00040089093900000114
are respectively the content attribute fusion feature and the topological structure fusion feature of node v_i in C_m and T_m;
S800, inputting
Figure FDA00040089093900000115
and
Figure FDA00040089093900000116
into a second-layer attention mechanism to obtain the fusion feature Z_m = {z^m_1, z^m_2, …, z^m_i, …, z^m_h(m)} of the m-th community, where z^m_i is the fusion feature of node v_i of the m-th community;
S900, inputting z^m_i into a preset viewpoint tendency prediction model to obtain the corresponding prediction result Pc^m_i = {Pc^m_ie}, e = 1 to H, where Pc^m_ie is the probability that the stance of the user corresponding to node v_i belongs to the e-th viewpoint stance, and H is the number of viewpoint stances;
S1000, obtaining the viewpoint stance value of the m-th community
Figure FDA0004008909390000021
where k_e is the attribute value of the e-th viewpoint stance.
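The control flow of steps S200 to S500 amounts to folding the Q networks into one fusion network, one alignment at a time. The following is a minimal sketch of that loop only. The function name `fuse_networks`, the `align` callback, and the starting value c = 0 are assumptions made for illustration; the claim does not state how c is initialized, and the actual hyperbolic alignment is abstracted away.

```python
from typing import Callable, List, TypeVar

Net = TypeVar("Net")


def fuse_networks(networks: List[Net], align: Callable[[Net, Net], Net]) -> Net:
    """Fold Q networks into one fusion network, mirroring S200 to S500.

    `align` is a hypothetical stand-in for the hyperbolic alignment step.
    """
    q = len(networks)
    if q < 2:
        raise ValueError("at least two social networks are required")

    # S200: align SN_1 and SN_2 into the initial fusion network.
    current = align(networks[0], networks[1])
    c = 0  # assumed initial value; not stated in the claim

    while True:
        c += 1  # S300
        if c < q - 1:
            # S400: align the current fusion network with SN_(c+2),
            # which sits at 0-based index c + 1.
            current = align(current, networks[c + 1])
        else:
            # S500: the current fusion network is the target network G.
            return current
```

With Q = 4 the callback is invoked three times, for SN_1 with SN_2, then the result with SN_3, then with SN_4, which matches the counter test c < Q − 1 under the assumed starting value.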
2. The method of claim 1, wherein the characterization vector of the nodes in each fusion network is obtained by:
S10, embedding each node of the two social networks to be fused into the Poincaré ball model of hyperbolic space, and constructing a node characterization vector objective constraint function for each node
Figure FDA0004008909390000022
Figure FDA0004008909390000023
is the hyperbolic distance between nodes
Figure FDA0004008909390000024
and
Figure FDA0004008909390000025
in network x,
Figure FDA0004008909390000026
is the neighbor node set of node
Figure FDA0004008909390000027
in network x, and V_x is the node set of network x; network s denotes the source network of the two social networks, and network t denotes the target network of the two social networks;
S20, modeling each community in the two social networks embedded in the hyperbolic-space Poincaré ball model by using a non-Euclidean hyperbolic clustering model, and constructing a community characterization vector objective constraint function for each community
Figure FDA0004008909390000028
Figure FDA0004008909390000029
is the probability that node
Figure FDA00040089093900000210
in network x belongs to community p in network x,
Figure FDA00040089093900000211
is the probability density function of the model of community p constructed based on the generalized hyperbolic distribution,
Figure FDA00040089093900000212
is the hyperbolic characterization vector of
Figure FDA00040089093900000213
Figure FDA00040089093900000214
denotes the hyperbolic parameters of community p in network x; and C_x is the number of communities in network x;
S30, constructing, based on anchor users in the social networks, an alignment probability function of the two social networks embedded with the non-Euclidean hyperbolic clustering model
Figure FDA00040089093900000215
Figure FDA00040089093900000216
is the node in the target network that is connected by an anchor link to node
Figure FDA00040089093900000217
in the source network,
Figure FDA00040089093900000218
represents the hyperbolic distance between nodes
Figure FDA00040089093900000219
and
Figure FDA00040089093900000220
Figure FDA00040089093900000221
represents the hyperbolic distance between nodes
Figure FDA00040089093900000222
and
Figure FDA00040089093900000223
and
Figure FDA00040089093900000224
is the anchor user set of the two social networks;
S40, constructing the following joint objective function:
Figure FDA00040089093900000225
Figure FDA00040089093900000226
wherein α_1 and α_2 are weight factors, θ is the hyperbolic standard vector of a node, θ' is the context vector of the node, and
Figure FDA00040089093900000227
is the metric matrix of community p in network x;
S50, optimizing the joint objective function to obtain the characterization vector of each node.
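Steps S10 and S30 both rely on a hyperbolic distance between node embeddings in the Poincaré ball. The claim's own formulas are images that are not reproduced here, so the sketch below uses the standard Poincaré ball distance for curvature −1, d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))). The curvature value and the function name `poincare_distance` are assumptions; the patent may use a different curvature or parameterization.

```python
import math
from typing import Sequence


def poincare_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Geodesic distance between two points of the open unit ball.

    Assumes the standard Poincare ball model with curvature -1.
    """
    if len(u) != len(v):
        raise ValueError("points must have the same dimension")

    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")

    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # The argument is always >= 1, so acosh is well defined.
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(arg)
```

Distances grow without bound as a point approaches the boundary of the ball, which is why hierarchical or community-structured graphs tend to embed with low distortion in this model.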
3. The method of claim 2, wherein
Figure FDA00040089093900000228
Figure FDA00040089093900000229
wherein
Figure FDA0004008909390000031
β and μ are respectively the skewness and location vectors, ω is an aggregation factor, Δ is the metric matrix, a d-dimensional positive definite matrix describing the Riemannian metric, |Δ| is the determinant of Δ, K_r(·) is the modified Bessel function of order r, and
Figure FDA0004008909390000032
is the modified Bessel function of order
Figure FDA0004008909390000033
4. The method of claim 1, wherein
Figure FDA0004008909390000034
Figure FDA0004008909390000035
Figure FDA0004008909390000036
is the content attribute feature vector of the j-th node in C_m,
Figure FDA0004008909390000037
is
Figure FDA0004008909390000038
W is a weight matrix, the symbol [·|·] denotes the concatenation operation, a(·) is a single-layer feed-forward neural network, LeakyReLU is an activation function, and σ(·) is the sigmoid function.
5. The method of claim 1, wherein
Figure FDA0004008909390000039
Figure FDA00040089093900000310
Figure FDA00040089093900000311
is the topological structure feature vector of the j-th node in T_m,
Figure FDA00040089093900000312
is
Figure FDA00040089093900000313
W is a weight matrix, the symbol [·|·] denotes the concatenation operation, a(·) is a single-layer feed-forward neural network, LeakyReLU is an activation function, and σ(·) is the sigmoid function.
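The ingredients named in claims 4 and 5 (a weight matrix W, concatenation, a single-layer feed-forward scorer a(·), LeakyReLU, and a sigmoid σ) match the usual graph attention layer. Because the claimed formulas are images that are not reproduced here, the following is only a sketch of that standard form: the softmax normalization over neighbors, the LeakyReLU slope of 0.2, and all function names are assumptions, not the patent's stated method. It applies equally to C_m with content features or T_m with topology features.

```python
import math
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def leaky_relu(x: float, slope: float = 0.2) -> float:
    return x if x > 0 else slope * x


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def matvec(w: Matrix, h: Vector) -> Vector:
    return [sum(wk * hk for wk, hk in zip(row, h)) for row in w]


def attention_layer(features: List[Vector],
                    neighbors: Dict[int, List[int]],
                    w: Matrix,
                    a: Vector) -> List[Vector]:
    """One single-head attention pass over a community graph.

    `neighbors[i]` must be non-empty (include i itself for a self-loop);
    `a` has twice the output dimension because it scores [W h_i | W h_j].
    """
    projected = [matvec(w, h) for h in features]
    fused = []
    for i, wh_i in enumerate(projected):
        nbrs = neighbors[i]
        # Raw score: LeakyReLU(a([W h_i | W h_j])) for each neighbor j.
        scores = [leaky_relu(sum(ak * xk for ak, xk in zip(a, wh_i + projected[j])))
                  for j in nbrs]
        # Softmax over the neighborhood (assumed normalization).
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        # Fusion feature: sigmoid of the attention-weighted neighbor sum.
        fused.append([sigmoid(sum(al * projected[j][k] for al, j in zip(alphas, nbrs)))
                      for k in range(len(wh_i))])
    return fused
```

Running several such heads with separate W and a, then combining their outputs, would give a multi-head variant; how the heads are combined is not specified in the text above.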
6. The method of claim 1, wherein the first layer of attention mechanisms comprises K attention mechanisms.
7. The method of claim 1, wherein C_m is obtained by the following steps:
S601, for any node v_i in the m-th community, obtaining the corresponding similarities
Figure FDA00040089093900000314
Figure FDA00040089093900000315
Figure FDA00040089093900000316
is the similarity between the content attribute feature vectors of node v_i and node v_j;
S602, sorting the similarities in
Figure FDA00040089093900000317
in descending order to obtain the sorted similarities
Figure FDA00040089093900000318
S603, taking the nodes corresponding to the first B similarities in
Figure FDA00040089093900000319
as the neighbor nodes of v_i, thereby obtaining the neighbor nodes of all nodes in the m-th community;
S604, constructing C_m based on the neighbor nodes of all nodes in the m-th community.
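Steps S601 to S604 describe a top-B nearest-neighbor graph built from content attribute feature vectors. The sketch below follows those four steps. The similarity measure in the claim is an image that is not reproduced here, so cosine similarity is used as an assumption; the function names and the choice of a directed 0/1 adjacency matrix without symmetrization are likewise illustrative.

```python
import math
from typing import List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def build_content_graph(features: List[Sequence[float]], b: int) -> List[List[int]]:
    """Return the adjacency matrix of the top-B similarity graph."""
    n = len(features)
    adjacency = [[0] * n for _ in range(n)]
    for i in range(n):
        # S601: similarity of v_i to every other node v_j in the community.
        sims = [(cosine_similarity(features[i], features[j]), j)
                for j in range(n) if j != i]
        # S602: sort the similarities from largest to smallest.
        sims.sort(key=lambda pair: pair[0], reverse=True)
        # S603: the nodes behind the first B similarities are v_i's neighbors.
        for _, j in sims[:b]:
            adjacency[i][j] = 1
    # S604: the neighbor lists of all nodes define the graph.
    return adjacency
```

Because each node picks its own top B, the resulting matrix is generally not symmetric; a variant could add the reverse edges if an undirected graph is wanted.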
8. The method of claim 1, wherein the content attribute features include attribute information of the user and published text information, and the topology features include network activity information and social relationship information of the user.
9. A cross-social-network group perception and stance analysis system, comprising: a social network embedding module, a feature fusion mining module, and a group stance decision module;
the social network embedding module is configured to embed different social networks into the same high-dimensional common subspace based on the Poincaré ball model of non-Euclidean hyperbolic space, so as to align the communities and users of the social networks in that subspace, align the multi-source heterogeneous information of network community users, and fuse the network associations, thereby obtaining the aligned target fusion network and the characterization vector, in the non-Euclidean hyperbolic space, of each user in the target fusion network, and to send the characterization vectors to the feature fusion mining module, wherein the characterization vector comprises a content attribute feature vector and a topological structure feature vector;
the feature fusion mining module is configured to mine and fuse the features in the target fusion network through a two-layer attention mechanism, so as to obtain the fusion features of each community in the target fusion network and send them to the group stance decision module;
the group stance decision module is configured to predict the stance tendency of communities based on the received fusion features, so as to obtain the stance tendency of each community.
10. An electronic device comprising a processor and a memory;
the processor is adapted to perform the steps of the method according to any of claims 1 to 8 by invoking a program or instruction stored in the memory.
CN202211643877.3A 2022-12-20 2022-12-20 Group perception and standing analysis method, system and electronic equipment crossing social network Active CN116049695B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211643877.3A CN116049695B (en) 2022-12-20 2022-12-20 Group perception and standing analysis method, system and electronic equipment crossing social network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211643877.3A CN116049695B (en) 2022-12-20 2022-12-20 Group perception and standing analysis method, system and electronic equipment crossing social network

Publications (2)

Publication Number Publication Date
CN116049695A true CN116049695A (en) 2023-05-02
CN116049695B CN116049695B (en) 2023-07-04

Family

ID=86119166

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211643877.3A Active CN116049695B (en) 2022-12-20 2022-12-20 Group perception and standing analysis method, system and electronic equipment crossing social network

Country Status (1)

Country Link
CN (1) CN116049695B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080222295A1 (en) * 2006-11-02 2008-09-11 Addnclick, Inc. Using internet content as a means to establish live social networks by linking internet users to each other who are simultaneously engaged in the same and/or similar content
US20170337735A1 (en) * 2016-05-17 2017-11-23 Disney Enterprises, Inc. Systems and methods for changing a perceived speed of motion associated with a user
CN109145109A (en) * 2017-06-19 2019-01-04 国家计算机网络与信息安全管理中心 User group's message propagation anomaly analysis method and device based on social networks
CN109471995A (en) * 2018-10-26 2019-03-15 武汉大学 A kind of hyperbolic embedding grammar of complex network
CN111931903A (en) * 2020-07-09 2020-11-13 北京邮电大学 Network alignment method based on double-layer graph attention neural network
CN114254093A (en) * 2021-12-17 2022-03-29 南京航空航天大学 Multi-space knowledge enhanced knowledge graph question-answering method and system
CN114329227A (en) * 2021-08-13 2022-04-12 北京计算机技术及应用研究所 Topic knowledge graph-based social relationship network construction and expansion method
CN115080871A (en) * 2022-07-07 2022-09-20 国家计算机网络与信息安全管理中心 Cross-social network social user alignment method
CN115186197A (en) * 2022-08-19 2022-10-14 中国科学技术大学 User recommendation method based on end-to-end hyperbolic space

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
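Zhang Cuixiao; Hao Jiehui; Liu Xingyu; Sun Yuexiao: "Research on Chinese Microblog Stance Analysis Based on CNN-BiLSTM", Computer Technology and Development, no. 07 *
Yang Yizhuo; Yu Hongtao; Huang Ruiyang; Liu Zhengming: "Cross-Social-Network User Identity Matching Based on Fusion Representation Learning", Computer Engineering, no. 09 *
Bai Jing; Li Fei; Ji Donghong: "Attention-Based BiLSTM-CNN Model for Chinese Microblog Stance Detection", Computer Applications and Software, no. 03 *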

Also Published As

Publication number Publication date
CN116049695B (en) 2023-07-04

Similar Documents

Publication Publication Date Title
Joshi Artificial intelligence with python
Chen et al. Deep reinforcement learning in recommender systems: A survey and new perspectives
CN111737552A (en) Method, device and equipment for extracting training information model and acquiring knowledge graph
CN111506714A (en) Knowledge graph embedding based question answering
CN112069302B (en) Training method of conversation intention recognition model, conversation intention recognition method and device
Xu et al. User memory reasoning for conversational recommendation
US11501547B1 (en) Leveraging text profiles to select and configure models for use with textual datasets
Guo et al. Who is answering whom? Finding “Reply-To” relations in group chats with deep bidirectional LSTM networks
KR20200041199A (en) Method, apparatus and computer-readable medium for operating chatbot
CN113535949B (en) Multi-modal combined event detection method based on pictures and sentences
Yang et al. Anchor link prediction across social networks based on multiple consistency
US20230153335A1 (en) Searchable data structure for electronic documents
WO2023164312A1 (en) An apparatus for classifying candidates to postings and a method for its use
CN116049695B (en) Group perception and standing analysis method, system and electronic equipment crossing social network
Shi et al. Practical POMDP-based test mechanism for quality assurance in volunteer crowdsourcing
CN113626685A (en) Propagation uncertainty-oriented rumor detection method and device
Durak et al. Classification and prediction‐based machine learning algorithms to predict students’ low and high programming performance
Adamska et al. Picking peaches or squeezing lemons: selecting crowdsourcing workers for reducing cost of redundancy
CN111444338A (en) Text processing device, storage medium and equipment
Maharaj Generalizing in the Real World with Representation Learning
US11977515B1 (en) Real time analysis of interactive content
US20240020553A1 (en) Interactive electronic device for performing functions of providing responses to questions from users and real-time conversation with the users using models learned by deep learning technique and operating method thereof
US20240028952A1 (en) Apparatus for attribute path generation
US20240168918A1 (en) Systems for cluster analysis of interactive content
US11727215B2 (en) Searchable data structure for electronic documents

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant