CN113409157A - Cross-social network user alignment method and device - Google Patents

Info

Publication number
CN113409157A
Authority
CN
China
Prior art keywords
attribute
feature vector
user
word
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110545701.3A
Other languages
Chinese (zh)
Other versions
CN113409157B (en)
Inventor
蔡晓东
王鑫岚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guilin University of Electronic Technology
Original Assignee
Guilin University of Electronic Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guilin University of Electronic Technology
Priority to CN202110545701.3A
Publication of CN113409157A
Application granted
Publication of CN113409157B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The invention provides a method and a device for aligning users across social networks, wherein the method comprises the following steps: importing social network user data, constructing a training model for feature extraction, optimizing the training model according to the social network user data to obtain an optimized model, importing the to-be-tested social network user data, and aligning the to-be-tested social network user data through the optimized model to obtain a user alignment result. The method can extract the distinguishing semantic features, reduce the sparsity of network structure information and greatly improve the accuracy of user alignment across social networks.

Description

Cross-social network user alignment method and device
Technical Field
The invention mainly relates to the technical field of social network analysis, in particular to a cross-social network user alignment method and device.
Background
Social platforms with different functions have greatly enriched people's lives, but a user's information is scattered across these platforms and cannot be integrated. Because users are the foundation of every social platform, each platform wants to retain its users exclusively and prevent them from leaving; as a result, there is no information-sharing mechanism between platforms, and user data becomes fragmented. Because of this fragmentation, a user who joins a new social platform loses their previous social data and has to spend time rebuilding a social circle, which makes for a poor user experience. The platform is also at a disadvantage: a new user arrives without prior social data, so the platform cannot learn the user's social relationships and preferences and cannot make effective personalized recommendations. Cross-social-network user alignment matches the different accounts that belong to the same real-world person in multiple networks, which is of great significance to many areas of research and application in the field of social networks.
Research on user alignment across social networks can be roughly divided into three categories: user alignment based on user attributes, user alignment based on network structure, and multi-factor user alignment that combines attributes and network structure. User alignment based on network structure and user alignment based on attributes have both achieved good results in recent years. Since each approach has its own advantages, researchers have naturally tried to combine them. In the prior art, some methods realize user alignment using the social network structure and user profile attributes; some use the LHNE model for the cross-network user alignment task, which exploits both the network structure and user text information; and some use a deep neural network that exploits both the network structure and user location information. However, these methods do not extract distinguishing semantic features, and the sparsity of the network structure information greatly reduces user alignment accuracy.
Disclosure of Invention
The invention provides a cross-social-network user alignment method and device aiming at the defects of the prior art.
The technical scheme for solving the technical problems is as follows: a method of user alignment across social networks, comprising the steps of:
importing social network user data, constructing a training model for feature extraction, and optimizing the training model according to the social network user data to obtain an optimized model;
and importing the social network user data to be tested, and aligning the social network user data to be tested through the optimized model to obtain a user alignment result.
The invention has the beneficial effects that: a training model for feature extraction is constructed, the training model is optimized according to the social network user data to obtain an optimized model, and the social network user data to be tested is aligned through the optimized model to obtain a user alignment result, so that distinguishing semantic features can be extracted, the sparsity of network structure information is reduced, and the accuracy of cross-social-network user alignment is greatly improved.
Drawings
FIG. 1 is a schematic flowchart of a cross-social-network user alignment method according to an embodiment of the present invention;
FIG. 2 is a block diagram of a device for aligning users across social networks according to an embodiment of the present invention.
Detailed Description
The principles and features of this invention are described below in conjunction with the following drawings, which are set forth by way of illustration only and are not intended to limit the scope of the invention.
Fig. 1 is a schematic flowchart of a cross-social-network user alignment method according to an embodiment of the present invention.
As shown in FIG. 1, a method for aligning users across social networks comprises the following steps:
importing social network user data, constructing a training model for feature extraction, and optimizing the training model according to the social network user data to obtain an optimized model;
and importing the social network user data to be tested, and aligning the social network user data to be tested through the optimized model to obtain a user alignment result.
In the above embodiment, a training model for feature extraction is constructed, the training model is optimized according to the social network user data to obtain an optimized model, and the social network user data to be tested is aligned through the optimized model to obtain a user alignment result, so that distinguishing semantic features can be extracted, the sparsity of network structure information is reduced, and the accuracy of cross-social-network user alignment is greatly improved.
Optionally, as an embodiment of the present invention, the social network user data includes a plurality of social network user sub-data carrying preset user numbers, and each of the social network user sub-data includes user attribute information, structural data, and a true value; the process of constructing the training model for feature extraction includes the following steps:
obtaining user attribute information from each piece of social network user subdata, and respectively extracting attribute features of each piece of user attribute information to obtain an attribute feature vector group corresponding to each preset user number;
obtaining structural data from each piece of social network user subdata, and respectively extracting structural features of each piece of structural data to obtain structural feature vectors corresponding to each preset user number;
obtaining real values from each social network user subdata, and performing fusion loss calculation on all attribute feature vector groups, all real values and all structure feature vectors to obtain a fusion loss function;
the process of optimizing the training model according to the social network user data to obtain an optimized model comprises the following steps:
and updating parameters of the training model according to the fusion loss function to obtain an optimized model.
Specifically, step S1: attribute features are extracted from the social network user data, which contains attribute information and structural information, to obtain the attribute feature vectors of the user nodes (namely the attribute feature vector group); step S2: structural features are extracted from the same social network user data to obtain the structural feature vectors of the user nodes; step S3: taking the outputs of step S1 and step S2 as inputs, the attribute feature vectors (i.e., the attribute feature vector group) and the structural feature vectors are fused, the user alignment result is determined and compared with the ground-truth value (i.e., the true value), and the fusion loss is calculated.
In the above embodiment, the attribute features of each user attribute information are extracted to obtain the attribute feature vector group corresponding to each preset user number, the structural feature of each structural data is extracted to obtain the structural feature vector corresponding to each preset user number, and the fusion loss function is obtained by performing fusion loss calculation on all the attribute feature vector groups, all the true values, and all the structural feature vectors together.
Optionally, as an embodiment of the present invention, the user attribute information includes neighbor node information and a plurality of user attribute parameters, and the pieces of user attribute information correspond to one another in pairs; the process of respectively extracting the attribute features of each user attribute information to obtain the attribute feature vector group corresponding to the preset user number comprises the following steps:
respectively extracting word features of the user attribute parameters corresponding to the preset user numbers to obtain a plurality of word feature vectors corresponding to the user attribute parameters;
respectively carrying out information balance processing on each word feature vector to obtain a word balance vector corresponding to the word feature vector;
respectively carrying out local feature extraction on the word balance vectors through a TextCNN convolutional network to obtain a local feature vector group corresponding to the user attribute parameters;
evaluating each local feature vector group respectively to obtain semantic feature vectors corresponding to the user attribute parameters;
respectively fusing the plurality of semantic feature vectors corresponding to the preset user number through a first formula to obtain a fusion attribute feature vector corresponding to the preset user number, wherein the first formula is:
$$v_i = \sum_{k=1}^{M} \gamma_k z_{ik},$$
wherein $z_{ik}$ is the semantic feature vector of the k-th attribute of preset user number $i$, $\gamma_k \in \mathbb{R}$ is the corresponding weight parameter to be learned, $v_i$ is the fusion attribute feature vector, and $M$ is the number of semantic feature vectors corresponding to the preset user number;
obtaining a preset user number adjacent to the preset user number according to the neighbor node information, and taking a fusion attribute feature vector corresponding to the adjacent preset user number as a neighbor attribute feature vector;
and respectively carrying out vector fusion on the fusion attribute feature vector corresponding to each pair of corresponding user attribute information and the plurality of neighbor attribute feature vectors corresponding to the respective fusion attribute feature vectors to obtain an attribute feature vector group corresponding to the preset user number.
It should be understood that the pairwise correspondence between pieces of user attribute information is known in the social network user data. For example, if the social network user data contains three pieces of user attribute information A, B and C, user attribute information A corresponds to user attribute information B, and user attribute information B corresponds to user attribute information C.
It should be understood that the TextCNN convolutional network is a convolutional neural network that extracts semantic features of different granularity using convolution kernels of different sizes.
It should be understood that the semantic feature vectors of different attributes are taken together as input, and the fused attribute features of the user nodes (i.e., the fused attribute feature vectors) are obtained through an attention mechanism.
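The fusion of per-attribute semantic feature vectors can be illustrated with a minimal sketch. It assumes the fusion is a weighted sum of the semantic feature vectors with one learned scalar weight per attribute, which is what the symbol definitions of the first formula suggest; the function name `fuse_attributes` and the sample values are hypothetical, and in practice the weights would be learned rather than fixed.

```python
from typing import List


def fuse_attributes(semantic_vecs: List[List[float]], gammas: List[float]) -> List[float]:
    """Weighted sum over the M semantic feature vectors of one user (assumed form)."""
    if len(semantic_vecs) != len(gammas):
        raise ValueError("one weight per semantic feature vector is required")
    dim = len(semantic_vecs[0])
    fused = [0.0] * dim
    for gamma, z in zip(gammas, semantic_vecs):
        for d in range(dim):
            fused[d] += gamma * z[d]
    return fused


# Hypothetical user with M = 3 attributes, each embedded in 2 dimensions.
z_i = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
gamma = [0.5, 0.25, 0.25]  # learned in a real model; fixed here for illustration
v_i = fuse_attributes(z_i, gamma)
```

Attributes with larger weights dominate the fused vector, which is how an attention-style weighting lets the model emphasize the more informative attributes.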
It should be understood that all word embeddings (i.e. the word balance vectors) of an attribute are passed through a TextCNN convolutional network with convolution kernels of different sizes to capture local features, and the network outputs the semantic information (i.e. the local feature vector group) of the attribute at different levels of abstraction.
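As a rough illustration of how kernels of different sizes yield features at different granularities, the sketch below slides one filter per kernel size over a short sequence of word vectors. The filter weights, the ReLU activation, the valid padding and all names are assumptions made for the example; the source does not specify them.

```python
from typing import Dict, List


def conv1d_over_words(words: List[List[float]], kernel: List[List[float]]) -> List[float]:
    """Slide one filter spanning len(kernel) words over the sequence (valid padding, ReLU)."""
    k = len(kernel)
    out = []
    for start in range(len(words) - k + 1):
        acc = 0.0
        for offset in range(k):
            acc += sum(w * x for w, x in zip(kernel[offset], words[start + offset]))
        out.append(max(0.0, acc))
    return out


def textcnn_local_features(
    words: List[List[float]], kernels: Dict[int, List[List[float]]]
) -> Dict[int, List[float]]:
    """Return one feature map per kernel size (the local feature vector group)."""
    return {size: conv1d_over_words(words, kernel) for size, kernel in kernels.items()}


# Four word vectors of dimension 2 and two hypothetical filters (sizes 2 and 3).
words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
kernels = {
    2: [[1.0, 0.0], [0.0, 1.0]],
    3: [[1.0, 1.0], [1.0, 1.0], [-1.0, -1.0]],
}
local_features = textcnn_local_features(words, kernels)
```

The size-2 filter responds to word pairs and the size-3 filter to word triples, so the two feature maps describe the same attribute at two granularities.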
In the embodiment, the attribute features of each user attribute information are respectively extracted to obtain the attribute feature vector group corresponding to the preset user number, so that a basis is provided for subsequent data processing, the distinguishing semantic features can be extracted, the sparsity of network structure information is reduced, and the accuracy of user alignment across the social network is greatly improved.
Optionally, as an embodiment of the present invention, the process of respectively performing word feature extraction on the multiple user attribute parameters corresponding to the preset user number to obtain multiple word feature vectors corresponding to the user attribute parameters includes:
respectively carrying out word division on a plurality of user attribute parameters corresponding to the preset user numbers to obtain a plurality of word information corresponding to each user attribute parameter, and converting each word information into a word vector;
respectively carrying out character division on each word information to obtain a plurality of character information corresponding to the word information, and converting each character information into a character vector;
respectively extracting the characteristics of each character vector through a preset one-dimensional convolution layer to obtain a character characteristic vector corresponding to the character vector;
screening each character feature vector through a preset maximum pooling layer, and screening to obtain a plurality of character screening vectors corresponding to the word information;
and respectively carrying out vector splicing on each word vector and the plurality of character screening vectors corresponding to the word vectors to obtain the word characteristic vectors corresponding to the word vectors.
It should be understood that an attribute (i.e., the user attribute parameter) is divided into a word list (i.e., the plurality of word information), and each word (i.e., the word information) is represented as a word embedding (i.e., the word vector). Each word is also divided into a character list (i.e., the plurality of character information), and each character (i.e., the character information) is represented as a character embedding (i.e., the character vector). All character embeddings of a word are passed through one-dimensional convolution and max pooling, and the word embedding is concatenated with the pooled character embeddings (i.e., the character screening vectors) to form the final embedding of the word (i.e., the word feature vector).
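The word-plus-character embedding described above can be sketched as follows. The lookup tables, filter weights and window width are made-up illustrative values, and zero-padding of words shorter than the window is an assumption added so the example always has at least one window; non-empty words are assumed.

```python
from typing import Dict, List


def char_cnn_max_pool(
    char_vecs: List[List[float]], filters: List[List[List[float]]], width: int
) -> List[float]:
    """1-D convolution over character vectors, then max pooling over positions."""
    dim = len(char_vecs[0])
    # Assumption: zero-pad short words so that at least one window fits.
    padded = char_vecs + [[0.0] * dim] * max(0, width - len(char_vecs))
    pooled = []
    for filt in filters:  # each filter holds width x dim weights
        responses = []
        for start in range(len(padded) - width + 1):
            acc = 0.0
            for off in range(width):
                acc += sum(w * c for w, c in zip(filt[off], padded[start + off]))
            responses.append(acc)
        pooled.append(max(responses))  # keep the strongest response per filter
    return pooled


def word_feature_vector(
    word: str,
    word_emb: Dict[str, List[float]],
    char_emb: Dict[str, List[float]],
    filters: List[List[List[float]]],
    width: int = 2,
) -> List[float]:
    """Concatenate the word vector with the pooled character features."""
    chars = [char_emb[ch] for ch in word]
    return word_emb[word] + char_cnn_max_pool(chars, filters, width)


# Hypothetical embedding tables and two character filters of width 2.
word_emb = {"cat": [0.1, 0.2]}
char_emb = {"c": [1.0, 0.0], "a": [0.0, 1.0], "t": [1.0, 1.0]}
filters = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]
vec = word_feature_vector("cat", word_emb, char_emb, filters)
```

The result has the word embedding in its first two positions and one pooled value per character filter after it, so spelling-level cues survive even when the word vector itself is uninformative.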
In the above embodiment, the word features of the plurality of user attribute parameters corresponding to the preset user numbers are respectively extracted to obtain the plurality of word feature vectors corresponding to the user attribute parameters, so that a data basis is provided for subsequently extracting the distinguishing semantic features, and the accuracy of user alignment across the social network is greatly improved.
Optionally, as an embodiment of the present invention, the step of performing information balancing processing on each word feature vector to obtain a word balance vector corresponding to the word feature vector includes:
respectively carrying out information balance processing on each word feature vector through a second formula to obtain a word balance vector corresponding to the word feature vector, wherein the second formula is as follows:
$$z = t \odot g(W_H h + b_H) + (1 - t) \odot h,$$
where $t = \sigma(W_T h + b_T)$,
wherein $W_H$ and $W_T$ are both square matrices, $b_H$ and $b_T$ are bias vectors, $g$ is the non-linear function tanh, $h$ is the word feature vector, and $z$ is the word balance vector.
It should be understood that the second formula is the calculation process of a highway network.
It should be understood that the word embedding (i.e., the word feature vector) balances word information and character information through the highway network.
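A minimal sketch of the second formula in plain Python may help make the gating behaviour concrete. It assumes that the gate non-linearity is the sigmoid, as is usual for highway networks; the weights and inputs are arbitrary illustrative values and the helper names are hypothetical.

```python
import math
from typing import List


def matvec(W: List[List[float]], x: List[float]) -> List[float]:
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def sigmoid(a: float) -> float:
    return 1.0 / (1.0 + math.exp(-a))


def highway(h, W_H, b_H, W_T, b_T):
    """z = t * tanh(W_H h + b_H) + (1 - t) * h, with gate t = sigmoid(W_T h + b_T)."""
    transform = [math.tanh(a + b) for a, b in zip(matvec(W_H, h), b_H)]
    gate = [sigmoid(a + b) for a, b in zip(matvec(W_T, h), b_T)]
    return [t * g + (1.0 - t) * x for t, g, x in zip(gate, transform, h)]


identity = [[1.0, 0.0], [0.0, 1.0]]
h = [0.5, -1.0]
# A strongly negative gate bias closes the gate: the input is carried through unchanged.
z_carry = highway(h, identity, [0.0, 0.0], identity, [-50.0, -50.0])
# A strongly positive gate bias opens the gate: the output is the transformed input.
z_transform = highway(h, identity, [0.0, 0.0], identity, [50.0, 50.0])
```

Between these two extremes the gate mixes the transformed and the original vector element by element, which is the sense in which the layer balances the word part and the character part of the word feature vector.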
In the embodiment, the information balance processing of each word feature vector is performed through the second formula, so that the word balance vector corresponding to the word feature vector is obtained, the distinguishing semantic features can be extracted, and the accuracy of user alignment across social networks is greatly improved.
Optionally, as an embodiment of the present invention, the process of respectively performing evaluation processing on each local feature vector group to obtain a semantic feature vector corresponding to the user attribute parameter includes:
evaluating each local feature vector group respectively through a third formula to obtain semantic feature vectors corresponding to the user attribute parameters, where the third formula is:
[Formula image BDA0003073549200000071 not available]
wherein
[Formula image BDA0003073549200000081 not available]
[Formula image BDA0003073549200000082 not available]
wherein $A_{ij} = \alpha(z_i, z_j)$, $z_j \in \mathbb{R}^d$,
wherein
[Formula image BDA0003073549200000083 not available]
wherein $A_{ij}$ is the similarity matrix, [symbol image BDA0003073549200000084] denotes element-wise multiplication, $[\,;\,]$ denotes row-wise vector concatenation, [symbol image BDA0003073549200000085] is the contextually important information, $W_1^T, W_2^T, W_3^T \in \mathbb{R}^{2d \times d}$, $b_1, b_2, b_3 \in \mathbb{R}^d$, $W_1^T$, $W_2^T$, $W_3^T$, $b_1$, $b_2$ and $b_3$ are trainable parameters, $\sigma$ is the non-linear function sigmoid, $z_i$ is the local feature vector group, and [symbol image BDA0003073549200000086] is the semantic feature vector.
It should be understood that semantic information (i.e., the set of local feature vectors) is subject to a self-attention mechanism to evaluate the importance of each information, resulting in a semantic feature (i.e., the semantic feature vector) for that attribute.
Understandably, $\mathbb{R}^{3d}$ denotes dimension $1 \times 3d$, $\mathbb{R}^{2d \times d}$ denotes dimension $2d \times d$, $\mathbb{R}^{d}$ denotes dimension $1 \times d$, and $A_{ij}$ is the similarity matrix calculated by $\alpha(\cdot)$.
Understandably, the $z_j$ are summed with weights for each $z_i$ to express the contextually more important information.
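The similarity-then-weighted-sum step can be sketched as below. Because the formula images are not available, the concrete choices here are assumptions: the similarity is taken as a dot product between a weight vector of length 3d and the concatenation of the two vectors and their element-wise product (suggested only by the 1 x 3d dimension mentioned above), and the weights are normalized with a softmax. The later gating step with the three trainable matrices is not sketched. All names are hypothetical.

```python
import math
from typing import List


def alpha(z_i: List[float], z_j: List[float], w: List[float]) -> float:
    """Assumed similarity: w . [z_i ; z_j ; z_i * z_j], with w of length 3d."""
    feats = z_i + z_j + [a * b for a, b in zip(z_i, z_j)]
    return sum(wi * f for wi, f in zip(w, feats))


def softmax(scores: List[float]) -> List[float]:
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def context_vectors(Z: List[List[float]], w: List[float]) -> List[List[float]]:
    """For every z_i, sum all z_j weighted by their normalized similarity to z_i."""
    out = []
    for z_i in Z:
        weights = softmax([alpha(z_i, z_j, w) for z_j in Z])
        out.append(
            [sum(a * z_j[d] for a, z_j in zip(weights, Z)) for d in range(len(z_i))]
        )
    return out


Z = [[1.0, 0.0], [0.0, 1.0]]  # two local feature vectors, d = 2
w_zero = [0.0] * 6            # zero weights give uniform attention
ctx = context_vectors(Z, w_zero)
```

With zero weights every similarity is equal, so each context vector is simply the mean of the inputs; trained weights would shift the mass toward the more relevant vectors.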
In the embodiment, the semantic feature vectors corresponding to the user attribute parameters are obtained by respectively evaluating and processing each local feature vector group through the third formula, so that the importance degree of each piece of information can be evaluated, and the accuracy of user alignment across social networks is greatly improved.
Optionally, as an embodiment of the present invention, the process of performing vector fusion on the fusion attribute feature vector corresponding to each of the two pairs of corresponding user attribute information and the plurality of neighbor attribute feature vectors corresponding to the respective fusion attribute feature vectors to obtain the attribute feature vector group corresponding to the preset user number includes:
performing vector fusion on the fusion attribute feature vector corresponding to each of the two pairs of corresponding user attribute information and the plurality of neighbor attribute feature vectors corresponding to the respective fusion attribute feature vectors respectively through a fourth formula to obtain a first attribute feature vector corresponding to the preset user number and a second attribute feature vector corresponding to the first attribute feature vector, where the fourth formula is:
[Formula image BDA0003073549200000091 not available]
wherein
[Formula image BDA0003073549200000092 not available]
wherein
[Formula image BDA0003073549200000093 not available]
wherein $e_{ji}$ is the attention coefficient, $v_i$ is the first fusion attribute feature vector, $u_i$ is the second fusion attribute feature vector, $v_j$ is the j-th neighbor attribute feature vector corresponding to the first fusion attribute feature vector, $u_j$ is the j-th neighbor attribute feature vector corresponding to the second fusion attribute feature vector, $a_{ji}$ is the normalization coefficient, $\sigma(\cdot)$ is a non-linear function, [symbol image BDA0003073549200000094] is the first attribute feature vector, [symbol image BDA0003073549200000095] is the second attribute feature vector, and $W^T$ and $b$ are model parameters to be learned;
and obtaining an attribute feature vector group corresponding to the preset user number according to each first attribute feature vector and the second attribute feature vector corresponding to the first attribute feature vector.
It should be understood that the fusion attribute features (i.e. the fusion attribute feature vectors) of the user pair to be predicted (i.e. the user attribute information corresponding to each pair) and the neighbor node pair thereof are used as inputs, the influence of different neighbors on the attribute features is fused through an attention mechanism, and the final attribute feature vector (i.e. the attribute feature vector group) of the user to be predicted is output.
Specifically, the calculation process of the attention mechanism is as follows:
$e_{ji} = g(v_j, u_j, v_i, u_i)$,
[Formula image BDA0003073549200000096 not available]
[Formula image BDA0003073549200000097 not available]
wherein $e_{ji}$ is the attention coefficient, which represents the contribution of the neighbor pair $(v_j, u_j)$ to predicting whether the users to be predicted, $v_i$ (i.e., the first fusion attribute feature vector) and $u_i$ (i.e., the second fusion attribute feature vector), are the same person, with $g: \mathbb{R}^K \times \mathbb{R}^K \times \mathbb{R}^K \times \mathbb{R}^K \to \mathbb{R}$. $a_{ji}$ denotes the coefficient normalized over all neighbor pairs; the normalized coefficients are used to calculate a linear combination of the feature vectors of the user node to be predicted and its neighbor nodes, the non-linear function $\sigma(\cdot)$ is applied to the linear combination, and the final attribute feature vectors of the user pair to be predicted, [symbol image BDA0003073549200000098] (i.e. the first attribute feature vector and the second attribute feature vector), are calculated.
Considering the individual characteristics of each user in the neighbor pair, the similarity between two users in the neighbor pair, and the relationship between the neighbor pair and the focus pair, three specific attention mechanisms of individual attention, differential attention, and relationship attention are proposed:
$e_{ji} = W^T [v_j; u_j] + b$,
$e_{ji} = W^T |v_j - u_j| + b$,
$e_{ji} = W^T \big|\, |v_j - v_i| - |u_j - u_i| \,\big| + b$,
where $W$ and $b$ are the model parameters to be learned. The three attention mechanisms are unified in one equation, giving the unified attention used in the model:
[Formula image BDA0003073549200000101 not available]
In the above embodiment, the fusion attribute feature vectors corresponding to each pair of corresponding user attribute information are fused with the plurality of neighbor attribute feature vectors corresponding to each fusion attribute feature vector to obtain the attribute feature vector group corresponding to the preset user number, so that the influence of neighbor nodes is incorporated, distinguishing semantic features can be extracted, and the accuracy of user alignment across social networks is greatly improved.
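The three attention scores written out above (individual, difference and relation attention) can be sketched directly. The aggregation step is an assumption: the scores are normalized with a softmax over the neighbor pairs, and the sigmoid is used as the non-linear function, since the source only says that a non-linear function is applied. The unified attention is not sketched because its formula is not available. All function names are hypothetical, and the caller decides whether the focus pair itself is included among the neighbors.

```python
import math
from typing import List


def dot(w: List[float], x: List[float]) -> float:
    return sum(a * b for a, b in zip(w, x))


def individual_attention(v_j, u_j, w, b):
    """e_ji = W^T [v_j ; u_j] + b"""
    return dot(w, v_j + u_j) + b


def difference_attention(v_j, u_j, w, b):
    """e_ji = W^T |v_j - u_j| + b"""
    return dot(w, [abs(a - c) for a, c in zip(v_j, u_j)]) + b


def relation_attention(v_j, u_j, v_i, u_i, w, b):
    """e_ji = W^T | |v_j - v_i| - |u_j - u_i| | + b"""
    feats = [abs(abs(a - p) - abs(c - q)) for a, c, p, q in zip(v_j, u_j, v_i, u_i)]
    return dot(w, feats) + b


def softmax(scores: List[float]) -> List[float]:
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(neighbor_vecs: List[List[float]], scores: List[float]) -> List[float]:
    """Assumed aggregation: sigmoid of the attention-weighted sum of neighbor vectors."""
    a = softmax(scores)
    dim = len(neighbor_vecs[0])
    combo = [sum(a_j * vec[d] for a_j, vec in zip(a, neighbor_vecs)) for d in range(dim)]
    return [1.0 / (1.0 + math.exp(-c)) for c in combo]


e_ind = individual_attention([1.0, 0.0], [0.0, 1.0], [1.0, 1.0, 1.0, 1.0], 0.5)
e_diff = difference_attention([1.0, 2.0], [1.0, 2.0], [1.0, 1.0], 0.0)
e_rel = relation_attention([1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [1.0, 1.0], 0.0)
v_tilde = aggregate([[2.0, 0.0], [0.0, 2.0]], [0.0, 0.0])
```

In use, the same normalized coefficients would be applied to the neighbor vectors on both networks, giving the first and second attribute feature vectors of the user pair to be predicted.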
Optionally, as an embodiment of the present invention, the process of respectively performing structural feature extraction on each piece of structural data to obtain a structural feature vector corresponding to each preset user number includes:
respectively converting the adjacent matrixes of the structural data to obtain grid structural data corresponding to the preset user numbers;
respectively carrying out normalization processing on each grid structure data to obtain normalization structure data corresponding to each preset user number;
and respectively extracting the characteristics of the normalized structure data through a preset convolutional neural network to obtain a structure characteristic vector corresponding to each preset user number.
It should be understood that the adjacency matrix is a two-dimensional array storing the relationships between nodes in the graph: if two nodes are connected, the corresponding entry is 1; if they are not connected, it is 0.
Specifically, the irregular graph structure data (namely the structure data) in the matching social network data is converted into regular grid structure data by means of the adjacency matrix; normalization processing such as sorting and zero padding is performed on the grid structure data; and the normalized graph (namely the normalized structure data) is taken as input, structural features are extracted through a convolutional neural network, and the structural feature vector of the user pair to be predicted is output.
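The conversion from an irregular graph to a fixed-size grid can be sketched as follows. The source names sorting and zero padding without giving details, so the specifics here are assumptions: the graph is treated as undirected, nodes are sorted by degree (ties broken by node index), and the matrix is truncated or zero-padded to a fixed side length. Function names are hypothetical.

```python
from typing import List, Tuple


def adjacency_matrix(num_nodes: int, edges: List[Tuple[int, int]]) -> List[List[int]]:
    """Build a symmetric 0/1 adjacency matrix (undirected graph assumed)."""
    A = [[0] * num_nodes for _ in range(num_nodes)]
    for a, b in edges:
        A[a][b] = 1
        A[b][a] = 1
    return A


def normalize_grid(A: List[List[int]], size: int) -> List[List[int]]:
    """Reorder nodes by descending degree, then truncate or zero-pad to size x size."""
    order = sorted(range(len(A)), key=lambda n: (-sum(A[n]), n))[:size]
    grid = [[0] * size for _ in range(size)]
    for r, a in enumerate(order):
        for c, b in enumerate(order):
            grid[r][c] = A[a][b]
    return grid


# A path graph 0 - 1 - 2, padded to a 4 x 4 grid.
A = adjacency_matrix(3, [(0, 1), (1, 2)])
grid = normalize_grid(A, 4)
```

Fixing the node order and the grid size gives every user neighborhood the same shape, which is what allows an ordinary convolutional neural network to consume it.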
In the above embodiment, the adjacent matrix of each piece of structure data is converted to obtain the grid structure data corresponding to each preset user number, the normalization processing of each grid structure data is performed to obtain the normalized structure data corresponding to each preset user number, the feature extraction of each normalized structure data is performed through the preset convolutional neural network to obtain the structure feature vector corresponding to each preset user number, so that data support is provided for subsequent processing, the distinguishing semantic features can be extracted, and the accuracy of user alignment across the social network is greatly improved.
Optionally, as an embodiment of the present invention, the process of performing fusion loss calculation on all attribute feature vector groups, all real values, and all structure feature vectors together to obtain a fusion loss function includes:
performing fusion loss calculation on all attribute feature vector groups, all real values and all structural feature vectors together by using a fifth formula to obtain a fusion loss function, wherein the fifth formula is as follows:
$$Loss = Loss_{CE} + \lambda \, Loss_{cos},$$
wherein
[Formula image BDA0003073549200000111 not available]
wherein
[Formula image BDA0003073549200000112 not available]
wherein $Loss$ is the fusion loss function, $Loss_{CE}$ is the cross-entropy loss, $Loss_{cos}$ is the cosine loss, [symbol image BDA0003073549200000113] is the first attribute feature vector, [symbol image BDA0003073549200000114] is the second attribute feature vector, $y$ is the matching score, $y_i$ is the true value, $n$ is the total number of attribute feature vector groups, [symbol image BDA0003073549200000115] is the difference between the attribute feature vectors, and $s_i$ is the structural feature vector.
It should be understood that [formula image BDA0003073549200000116] is used to predict the matching score.
Specifically, the attribute feature vector group and the structural feature vector are concatenated as input to predict the matching score; a cosine loss function (namely the cosine loss) is constructed from the cosine similarity and fused with the cross-entropy loss by weighting to obtain the fusion loss function; and the predicted matching score is compared with the ground-truth value (namely the true value) to calculate the fusion loss.
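The weighted fusion of the two losses can be sketched as below. Only the overall form (cross-entropy loss plus a weighted cosine loss) comes from the text; the concrete terms are assumptions, since their formulas are not available: a mean binary cross-entropy over the predicted matching scores, and a cosine loss that pulls matched pairs toward similarity 1 and penalizes positive similarity for unmatched pairs. All names are hypothetical.

```python
import math
from typing import List, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den


def binary_cross_entropy(preds: List[float], labels: List[int], eps: float = 1e-12) -> float:
    """Mean binary cross-entropy; predictions are clipped to avoid log(0)."""
    total = 0.0
    for p, y in zip(preds, labels):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(preds)


def cosine_loss(pairs: List[Tuple[Vector, Vector]], labels: List[int]) -> float:
    """Assumed form: 1 - cos for matched pairs, max(0, cos) for unmatched pairs."""
    total = 0.0
    for (v, u), y in zip(pairs, labels):
        c = cosine(v, u)
        total += (1.0 - c) if y == 1 else max(0.0, c)
    return total / len(pairs)


def fusion_loss(preds, labels, pairs, lam: float) -> float:
    """Loss = Loss_CE + lambda * Loss_cos"""
    return binary_cross_entropy(preds, labels) + lam * cosine_loss(pairs, labels)


pairs = [([1.0, 0.0], [1.0, 0.0]), ([1.0, 0.0], [0.0, 1.0])]
labels = [1, 0]
loss = fusion_loss([0.5, 0.5], labels, pairs, lam=0.3)
```

In this example the attribute vectors already agree with the labels, so the cosine term is zero and the whole loss comes from the uncertain matching scores; the weight lambda controls how strongly the cosine term shapes the attribute feature vectors during training.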
In the embodiment, the fusion loss function is obtained by performing fusion loss calculation on all attribute feature vector groups, all true values and all structural feature vectors together through the fifth formula, so that the loss function is optimized, the distinguishing semantic features can be extracted, and the accuracy of user alignment across the social network is greatly improved.
FIG. 2 is a block diagram of a device for aligning users across social networks according to an embodiment of the present invention.
Optionally, as another embodiment of the present invention, as shown in fig. 2, an apparatus for aligning users across social networks includes:
the model optimization module is used for importing social network user data, constructing a training model for feature extraction, and optimizing the training model according to the social network user data to obtain an optimization model;
and the alignment result obtaining module is used for importing the social network user data to be tested, and performing alignment processing on the social network user data to be tested through the optimization model to obtain a user alignment result.
Optionally, another embodiment of the present invention provides a cross-social-network user alignment apparatus, including a memory, a processor, and a computer program stored in the memory and executable on the processor, which when executed by the processor, implements the cross-social-network user alignment method as described above. The device may be a computer or the like.
Optionally, another embodiment of the invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements a cross-social network user alignment method as described above.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is merely a logical division, and other divisions are possible in an actual implementation; for instance, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (10)

1. A method for aligning users across social networks is characterized by comprising the following steps:
importing social network user data, constructing a training model for feature extraction, and optimizing the training model according to the social network user data to obtain an optimized model;
and importing the social network user data to be tested, and aligning the social network user data to be tested through the optimization model to obtain a user alignment result.
2. The cross-social-network user alignment method of claim 1, wherein the social-network user data comprises a plurality of pieces of social-network user sub-data each carrying a preset user number, each piece of social-network user sub-data comprising user attribute information, structural data, and a true value; the process of constructing the training model for feature extraction includes the following steps:
obtaining user attribute information from each piece of social network user subdata, and respectively extracting attribute features of each piece of user attribute information to obtain an attribute feature vector group corresponding to each preset user number;
obtaining structural data from each piece of social network user subdata, and respectively extracting structural features of each piece of structural data to obtain structural feature vectors corresponding to each preset user number;
obtaining true values from each piece of social network user subdata, and performing fusion loss calculation on all attribute feature vector groups, all true values and all structure feature vectors to obtain a fusion loss function;
the process of optimizing the training model according to the social network user data to obtain an optimized model comprises the following steps:
and updating parameters of the training model according to the fusion loss function to obtain an optimized model.
3. The cross-social-network user alignment method according to claim 2, wherein the user attribute information includes neighbor node information and a plurality of user attribute parameters, and the pieces of user attribute information correspond to one another in pairs; the process of respectively extracting the attribute features of each user attribute information to obtain the attribute feature vector group corresponding to the preset user number comprises the following steps:
respectively extracting word features of the user attribute parameters corresponding to the preset user numbers to obtain a plurality of word feature vectors corresponding to the user attribute parameters;
respectively carrying out information balance processing on each word feature vector to obtain a word balance vector corresponding to the word feature vector;
respectively carrying out local feature extraction on the word balance vectors through a TextCNN convolutional network to obtain a local feature vector group corresponding to the user attribute parameters;
evaluating each local feature vector group respectively to obtain semantic feature vectors corresponding to the user attribute parameters;
respectively fusing the plurality of semantic feature vectors corresponding to the preset user number through a first formula to obtain a fusion attribute feature vector corresponding to the preset user number, wherein the first formula is:
Figure FDA0003073549190000021
wherein $z_{ik}$ is the semantic feature vector of the k-th attribute of preset user number $i$, $\gamma_k \in \mathbb{R}$ is the corresponding weight parameter to be learned, $v_i$ is the fusion attribute feature vector, and $M$ is the number of semantic feature vectors corresponding to the preset user number;
obtaining a preset user number adjacent to the preset user number according to the neighbor node information, and taking a fusion attribute feature vector corresponding to the adjacent preset user number as a neighbor attribute feature vector;
and respectively carrying out vector fusion on the fusion attribute feature vector corresponding to each pair of corresponding user attribute information and the plurality of neighbor attribute feature vectors corresponding to the respective fusion attribute feature vectors to obtain an attribute feature vector group corresponding to the preset user number.
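As an illustrative, non-limiting sketch of the fusion step in claim 3, the following assumes that the first formula is a learned weighted sum, $v_i = \sum_{k=1}^{M} \gamma_k z_{ik}$. The formula itself is available here only as a figure, so this form, the function name `fuse_attributes`, and the sample values are assumptions rather than the patented formula.

```python
import numpy as np

def fuse_attributes(z, gamma):
    """Weighted sum of M semantic feature vectors (assumed form of the first formula).

    z:     array of shape (M, d), one semantic feature vector per attribute
    gamma: array of shape (M,), one scalar weight per attribute
    """
    z = np.asarray(z, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    if z.shape[0] != gamma.shape[0]:
        raise ValueError("need one weight per semantic feature vector")
    return gamma @ z  # fused vector of shape (d,)

# Hypothetical example: M = 3 attributes, each with a d = 2 semantic vector
z_i = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
gamma = [0.5, 0.25, 0.25]
v_i = fuse_attributes(z_i, gamma)
```

In training, the weights `gamma` would be learned together with the other model parameters rather than fixed by hand.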
4. The method of claim 3, wherein the process of extracting word features from the plurality of user attribute parameters corresponding to the preset user numbers to obtain a plurality of word feature vectors corresponding to the user attribute parameters comprises:
respectively carrying out word division on a plurality of user attribute parameters corresponding to the preset user numbers to obtain a plurality of word information corresponding to each user attribute parameter, and converting each word information into a word vector;
respectively carrying out character division on each word information to obtain a plurality of character information corresponding to the word information, and converting each character information into a character vector;
respectively extracting the characteristics of each character vector through a preset one-dimensional convolution layer to obtain a character characteristic vector corresponding to the character vector;
screening each character feature vector through a preset maximum pooling layer, and screening to obtain a plurality of character screening vectors corresponding to the word information;
and respectively carrying out vector splicing on each word vector and the plurality of character screening vectors corresponding to the word vectors to obtain the word characteristic vectors corresponding to the word vectors.
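The word-feature steps of claim 4 can be sketched roughly as follows. The sketch assumes that the one-dimensional convolution slides over the character sequence of a word and that max pooling is taken over positions; the function name `char_cnn_word_feature`, the kernel shapes, and the zero-padding of short words are illustrative assumptions.

```python
import numpy as np

def char_cnn_word_feature(word_vec, char_vecs, filters):
    """Concatenate a word vector with max-pooled 1-D convolution features of its characters.

    word_vec:  (d_w,) word embedding
    char_vecs: (n_chars, d_c) character embeddings of the same word
    filters:   (n_filters, width, d_c) convolution kernels (bias omitted for brevity)
    """
    word_vec = np.asarray(word_vec, dtype=float)
    char_vecs = np.asarray(char_vecs, dtype=float)
    filters = np.asarray(filters, dtype=float)
    width = filters.shape[1]
    n_chars = char_vecs.shape[0]
    if n_chars < width:
        # Zero-pad short words so that at least one convolution window exists
        pad = np.zeros((width - n_chars, char_vecs.shape[1]))
        char_vecs = np.vstack([char_vecs, pad])
        n_chars = width
    # Collect every window of `width` consecutive characters (valid convolution)
    windows = np.stack([char_vecs[i:i + width] for i in range(n_chars - width + 1)])
    conv = np.einsum("nwc,fwc->nf", windows, filters)  # (n_windows, n_filters)
    pooled = conv.max(axis=0)                          # max pooling over positions
    return np.concatenate([word_vec, pooled])

# Hypothetical example: 4-dim word vector, 5 characters with 3-dim embeddings, 2 kernels
rng = np.random.default_rng(0)
feature = char_cnn_word_feature(rng.normal(size=4),
                                rng.normal(size=(5, 3)),
                                rng.normal(size=(2, 3, 3)))
```

The resulting vector has length `d_w + n_filters`, so the character-level information extends the word vector instead of replacing it.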
5. The method of claim 3, wherein the step of performing information balancing processing on each word feature vector to obtain a word balance vector corresponding to the word feature vector comprises:
respectively carrying out information balance processing on each word feature vector through a second formula to obtain a word balance vector corresponding to the word feature vector, wherein the second formula is as follows:
$$z = t \odot g(W_H h + b_H) + (1 - t) \odot h,$$
where $t = \sigma(W_T h + b_T)$,
wherein $W_H$ and $W_T$ are both square matrices, $b_H$ and $b_T$ are bias vectors, $g$ is the nonlinear function tanh, $h$ is a word feature vector, and $z$ is a word balance vector.
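The second formula has the shape of a gated (highway-style) layer: the gate t decides, per dimension, how much of the transformed vector and how much of the original word feature vector is passed on. A minimal sketch follows, assuming that σ is the sigmoid function; the function names and the parameter values in the example are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def information_balance(h, W_H, b_H, W_T, b_T):
    """Compute z = t * tanh(W_H h + b_H) + (1 - t) * h with gate t = sigmoid(W_T h + b_T)."""
    h = np.asarray(h, dtype=float)
    t = sigmoid(W_T @ h + b_T)            # gate values in (0, 1), one per dimension
    transformed = np.tanh(W_H @ h + b_H)  # nonlinear transform g(.)
    return t * transformed + (1.0 - t) * h

# Hypothetical parameters for a 3-dimensional word feature vector
d = 3
h = np.array([1.0, -2.0, 0.5])
zeros = np.zeros((d, d))
z = information_balance(h, W_H=zeros, b_H=np.zeros(d), W_T=zeros, b_T=np.zeros(d))
```

With all parameters zero, the gate is 0.5 everywhere and the transform is 0, so z is simply half of h; a strongly negative gate bias would pass h through almost unchanged.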
6. The method of claim 3, wherein the step of evaluating each local feature vector group to obtain the semantic feature vector corresponding to the user attribute parameter comprises:
evaluating each local feature vector group respectively through a third formula to obtain semantic feature vectors corresponding to the user attribute parameters, where the third formula is:
Figure FDA0003073549190000041
wherein,
Figure FDA0003073549190000042
Figure FDA0003073549190000043
wherein $A_{ij} = \alpha(z_i, z_j)$ and $z_j \in \mathbb{R}^{d}$,
wherein,
Figure FDA0003073549190000044
wherein $A_{ij}$ is the similarity matrix,
Figure FDA0003073549190000045
Figure FDA0003073549190000046
denotes element-wise multiplication, $[\,;\,]$ denotes row-wise vector concatenation,
Figure FDA0003073549190000047
is the context-critical information, $W_1^T, W_2^T, W_3^T \in \mathbb{R}^{2d \times d}$, $b_1, b_2, b_3 \in \mathbb{R}^{d}$, $W_1^T$, $W_2^T$, $W_3^T$, $b_1$, $b_2$ and $b_3$ are trainable parameters, $\sigma$ is the nonlinear function sigmoid, $z_i$ is a local feature vector group, and
Figure FDA0003073549190000048
is the semantic feature vector.
7. The method according to claim 3, wherein the process of vector fusing the fusion attribute feature vector corresponding to each of the pairwise corresponding user attribute information and the plurality of neighbor attribute feature vectors corresponding to each of the fusion attribute feature vectors to obtain the attribute feature vector group corresponding to the preset user number comprises:
performing vector fusion on the fusion attribute feature vector corresponding to each of the two pairs of corresponding user attribute information and the plurality of neighbor attribute feature vectors corresponding to the respective fusion attribute feature vectors respectively through a fourth formula to obtain a first attribute feature vector corresponding to the preset user number and a second attribute feature vector corresponding to the first attribute feature vector, where the fourth formula is:
Figure FDA0003073549190000049
wherein,
Figure FDA00030735491900000410
wherein,
Figure FDA00030735491900000411
wherein $e_{ji}$ is the attention coefficient, $v_i$ is the first fusion attribute feature vector, $u_i$ is the second fusion attribute feature vector, $v_j$ is the j-th neighbor attribute feature vector corresponding to the first fusion attribute feature vector, $u_j$ is the j-th neighbor attribute feature vector corresponding to the second fusion attribute feature vector, $a_{ji}$ is the normalized coefficient, $\sigma(\cdot)$ is a nonlinear function,
Figure FDA0003073549190000051
is the first attribute feature vector,
Figure FDA0003073549190000052
is the second attribute feature vector, and $W^T$ and $b$ are model parameters to be learned;
and obtaining an attribute feature vector group corresponding to the preset user number according to each first attribute feature vector and the second attribute feature vector corresponding to the first attribute feature vector.
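The neighbor fusion of claim 7 can be illustrated with a generic attention sketch. Because the fourth formula is available here only as figures, the sketch assumes an additive attention of the form $e_{ji} = \sigma(W^T [v_i; v_j] + b)$ with tanh as the nonlinearity, a softmax to obtain $a_{ji}$, and a weighted sum of the neighbor vectors as output; these choices and the name `attend_neighbors` are assumptions, not the patented formula.

```python
import numpy as np

def attend_neighbors(v_i, neighbors, w, b):
    """Aggregate neighbor attribute vectors with attention weights (assumed form).

    v_i:       (d,) fusion attribute feature vector of the user
    neighbors: (n, d) neighbor attribute feature vectors v_j
    w:         (2d,) weight vector applied to the concatenation [v_i; v_j]
    b:         scalar bias
    """
    v_i = np.asarray(v_i, dtype=float)
    neighbors = np.asarray(neighbors, dtype=float)
    # Pair the user's vector with each neighbor: rows are [v_i; v_j]
    pairs = np.hstack([np.tile(v_i, (len(neighbors), 1)), neighbors])
    e = np.tanh(pairs @ np.asarray(w, dtype=float) + b)  # attention coefficients e_ji
    a = np.exp(e - e.max())
    a /= a.sum()                                         # normalized coefficients a_ji
    return a @ neighbors, a

# Hypothetical example: two neighbors, zero weights give uniform attention
v_hat, a = attend_neighbors([1.0, 1.0], [[1.0, 0.0], [0.0, 1.0]], w=np.zeros(4), b=0.0)
```

The same routine would be applied to the second network with $u_i$ and its neighbors $u_j$, giving the pair of attribute feature vectors that forms the attribute feature vector group.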
8. The method of claim 2, wherein the step of extracting the structural features of the structural data to obtain the structural feature vector corresponding to the preset user number comprises:
respectively converting the adjacency matrices of the structural data to obtain grid structure data corresponding to the preset user numbers;
respectively carrying out normalization processing on each grid structure data to obtain normalization structure data corresponding to each preset user number;
and respectively extracting the characteristics of the normalized structure data through a preset convolutional neural network to obtain a structure characteristic vector corresponding to each preset user number.
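The claim does not state which normalization is applied to the grid structure data. As one possible, non-limiting illustration, the sketch below uses the symmetric normalization $D^{-1/2}(A + I)D^{-1/2}$ that is common in graph neural networks; this choice and the function name `normalize_adjacency` are assumptions.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetrically normalize an adjacency matrix after adding self-loops (assumed scheme)."""
    adj = np.asarray(adj, dtype=float)
    a_hat = adj + np.eye(adj.shape[0])   # self-loops guarantee a non-zero degree
    deg = a_hat.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Scale entry (i, j) by 1 / sqrt(deg_i * deg_j)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

# Hypothetical two-user network in which the users follow each other
norm = normalize_adjacency([[0, 1], [1, 0]])
```

The normalized matrix keeps all values in a bounded range, which makes it a more stable input for the convolutional network that extracts the structure feature vectors.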
9. The method of claim 2, wherein the step of performing fusion loss calculation on all attribute feature vector groups, all true values, and all structural feature vectors to obtain a fusion loss function comprises:
performing fusion loss calculation on all attribute feature vector groups, all true values and all structural feature vectors together by using a fifth formula to obtain a fusion loss function, wherein the fifth formula is as follows:
$$\mathrm{Loss} = \mathrm{Loss}_{CE} + \lambda\,\mathrm{Loss}_{\cos}$$
wherein,
Figure FDA0003073549190000053
wherein,
Figure FDA0003073549190000054
wherein $\mathrm{Loss}$ is the fusion loss function, $\mathrm{Loss}_{CE}$ is the cross-entropy loss, $\mathrm{Loss}_{\cos}$ is the cosine loss,
Figure FDA0003073549190000055
is the first attribute feature vector,
Figure FDA0003073549190000056
is the second attribute feature vector, $y$ is the matching score, $y_i$ is the true value, $n$ is the total number of attribute feature vector groups,
Figure FDA0003073549190000061
is the difference between attribute feature vectors, and $s_i$ is the structure feature vector.
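The weighted fusion of the two losses in the fifth formula can be sketched as follows. Only the outer form, cross-entropy loss plus λ times cosine loss, is taken from the text; the binary cross-entropy, the particular cosine loss (1 − cos for matched pairs, max(0, cos) for unmatched pairs), the default λ = 0.5, and the name `fusion_loss` are assumptions, since the component formulas are available here only as figures.

```python
import numpy as np

def fusion_loss(scores, labels, v_hat, u_hat, lam=0.5, eps=1e-12):
    """Loss = Loss_CE + lam * Loss_cos over n candidate user pairs (assumed component forms).

    scores: (n,) predicted matching scores in (0, 1)
    labels: (n,) true values, 1 for the same user and 0 otherwise
    v_hat:  (n, d) first attribute feature vectors
    u_hat:  (n, d) second attribute feature vectors
    """
    scores = np.clip(np.asarray(scores, dtype=float), eps, 1.0 - eps)
    labels = np.asarray(labels, dtype=float)
    v_hat = np.asarray(v_hat, dtype=float)
    u_hat = np.asarray(u_hat, dtype=float)
    # Binary cross-entropy between matching scores and true values
    loss_ce = -np.mean(labels * np.log(scores) + (1 - labels) * np.log(1 - scores))
    # Cosine similarity of each pair of attribute feature vectors
    cos = np.sum(v_hat * u_hat, axis=1) / (
        np.linalg.norm(v_hat, axis=1) * np.linalg.norm(u_hat, axis=1) + eps)
    # Pull matched pairs together, push unmatched pairs toward non-positive similarity
    loss_cos = np.mean(labels * (1 - cos) + (1 - labels) * np.maximum(0.0, cos))
    return loss_ce + lam * loss_cos

# Hypothetical single matched pair with identical vectors and an undecided score
loss = fusion_loss([0.5], [1], [[1.0, 0.0]], [[1.0, 0.0]])
```

For this pair the cosine term is approximately zero, so the loss reduces to the cross-entropy of a 0.5 score; a matched pair with orthogonal vectors would add λ to the total.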
10. An apparatus for aligning users across social networks, comprising:
the model optimization module is used for importing social network user data, constructing a training model for feature extraction, and optimizing the training model according to the social network user data to obtain an optimization model;
and the alignment result obtaining module is used for importing the social network user data to be tested, and performing alignment processing on the social network user data to be tested through the optimization model to obtain a user alignment result.
CN202110545701.3A 2021-05-19 2021-05-19 Cross-social network user alignment method and device Active CN113409157B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110545701.3A CN113409157B (en) 2021-05-19 2021-05-19 Cross-social network user alignment method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110545701.3A CN113409157B (en) 2021-05-19 2021-05-19 Cross-social network user alignment method and device

Publications (2)

Publication Number Publication Date
CN113409157A (en) 2021-09-17
CN113409157B (en) 2022-06-28

Family

ID=77678934

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110545701.3A Active CN113409157B (en) 2021-05-19 2021-05-19 Cross-social network user alignment method and device

Country Status (1)

Country Link
CN (1) CN113409157B (en)

Also Published As

Publication number Publication date
CN113409157B (en) 2022-06-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant