CN115828109A - Cross-social network virtual identity association method and device based on multi-mode fusion and representation alignment - Google Patents
Cross-social network virtual identity association method and device based on multi-mode fusion and representation alignment
- Publication number
- CN115828109A (application CN202211474688.8A)
- Authority
- CN
- China
- Prior art keywords
- user
- representation
- social
- users
- alignment
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a cross-social-network virtual identity association method and device based on multi-modal fusion and representation alignment. The method comprises the following steps: extracting features from the user names, the texts published by users, and the social relations of users on the social networks of different platforms, to obtain the users' feature information in the different modalities; performing multi-modal fusion with an attention mechanism on this feature information to obtain a first user representation that fuses the multi-dimensional features; strengthening the user representation by applying representation alignment to the first user representation, finally obtaining second user representations that follow the same distribution across the different platforms; and calculating the cosine similarity between the second user representations to obtain similarity scores between users, taking the user pair with the highest score as the identity association result. Through multi-modal fusion and representation alignment, the method addresses the problems that a single-modality model cannot fully describe a user and that data distributions differ across social platforms.
Description
Technical Field
The invention belongs to the technical field of social network virtual identity association, and particularly relates to a cross-social-network virtual identity association method and device based on multi-modal fusion and representation alignment.
Background
Nowadays, social networks, with their great convenience, have become an indispensable part of people's lives. People typically join multiple social platforms to enjoy different services, for example using WeChat to communicate and using the microblog (Weibo) to read news or check in. Many researchers therefore devote themselves to social-network-related research, and cross-social-network virtual identity association, which identifies the social accounts held by the same natural person on different platforms, is an important part of this research; it has attracted strong attention in fields such as recommendation systems, user behavior analysis and information dissemination.
Many methods have been proposed for user identity linkage, and they can currently be divided into three categories: methods based on user attributes, methods based on user-generated content, and methods based on user social relationships. Each has drawbacks. Regarding user attributes, for privacy reasons users selectively disclose profile attributes and keep some sensitive information (such as age or contact details) secret, and may even forge or imitate information, which adds uncertainty and ambiguity. Regarding user-generated content, because social networks are rich, the posts a user makes can take various forms (text, pictures and so on), and using only a single kind of content loses information. Methods based on the social relationships among users place too much emphasis on the structured information itself, yet the features of a user's friends also help identify the user; after all, a friend's features may be more distinctive than the user's own, and taking them into account can greatly improve accuracy. Therefore, user information from multiple modalities should be exploited rather than information from a single modality. On the other hand, the confidence with which each modality characterizes a user differs: sometimes a user's text conveys more information than the other modalities, and sometimes pictures play the key role. Adaptively weighting the different modalities is therefore the key to this problem.
Second, although the same user may post similar information on different social platforms, that information may be characterized differently because the data distributions of the platforms are inconsistent. Existing methods, however, link user identities directly from the user representations without considering this semantic gap between platforms. How to bring the representations of the same user on different platforms closer together is therefore another challenge.
Disclosure of Invention
The invention mainly aims to overcome the defects of the prior art and provide a cross-social-network virtual identity association method and device based on multi-modal fusion and representation alignment, which solve, through multi-modal fusion and representation alignment, the problems that a single-modality model cannot fully describe a user and that data distributions differ across social platforms.
In order to achieve the purpose, the invention adopts the following technical scheme:
In a first aspect, the invention provides a cross-social network virtual identity association method based on multi-modal fusion and representation alignment, comprising the following steps:
extracting features from the user names of users of different social networks, the texts published by the users, and the users' social relations, to obtain user-name feature information, published-text feature information and social-relation feature information respectively;
performing multi-modal fusion with an attention mechanism on the obtained user-name feature information, published-text feature information and social-relation feature information to obtain a first user representation that fuses the multi-dimensional features;
strengthening the user representation by applying a representation alignment method to the first user representation, finally obtaining second user representations that share the same distribution space across the different platforms;
and calculating the cosine similarity between the second user representations to obtain similarity scores between users, and taking the user pair with the highest score as the identity association result.
As a preferred technical scheme, the feature extraction of the user name specifically comprises the following steps:
For the user name of a given user, feature extraction is performed with a character-level Bag-of-Words model: the number of occurrences of each character in the user name is counted to obtain a count vector, and the vectors of all user names are concatenated in order into a user-name count matrix C_0. Because C_0 is sparse, it is transformed by an autoencoder, where W_e, b_e are the weights and bias of the encoder, W_d, b_d are the weights and bias of the decoder, and C_1 is the user-name vector matrix reconstructed by the decoder. The loss function L_c is trained continuously by gradient descent to obtain the optimal W_e and b_e, and finally a d-dimensional user-name embedding matrix is obtained.
As a preferred technical solution, the feature extraction of the text published by the user specifically includes:
The texts published by users are fed into a Word2Vec model to obtain an embedding vector for each text; the embedding vectors of the texts published by each user are averaged as that user's text representation, and the text embedding vectors of all users are concatenated in order to obtain a d-dimensional text embedding matrix.
As a preferred technical scheme, the feature extraction of the user social relationship specifically includes:
The social relations between the n users of platform N_1 and the m users of platform N_2 form an n×m adjacency matrix, which is fed into a DeepWalk model to obtain an embedding vector of each user's social relations; the social-relation embedding vectors of all users are concatenated in order to obtain a d-dimensional user social-relation embedding matrix.
As a preferred technical solution, the multi-mode fusion is to perform multi-mode fusion by using an attention mechanism on an embedded matrix of the obtained three kinds of user feature information, to assign different weights to each mode to reflect importance among different modes, and to obtain a first user representation matrix Z after the multi-mode fusion f (ii) a The calculation formula is as follows:
wherein alpha is c ,α T ,α V Embedding the weights of the matrixes into user names, texts and social relations respectively; f (.) is the attention network.
As a preferred technical solution, the specific steps of strengthening the user representation through representation alignment are as follows:
First, the first user representations are passed through a fully connected layer that maps the user representations of the two platforms into the same space, giving the second user representations, where W_l and b_l are the weight and bias of the fully connected layer, the first user representation of each platform is the result of its multi-modal fusion, and Z is the second user representation;
Second, to train all the weights and biases in the method, the minimized EMD (earth mover's distance) is used as the first optimization objective, where L_E is the first optimization objective, d_ij is the distance between the second representation of user i on platform N_1 and the second representation of user j on platform N_2, F_ij is the association probability between the two users, and ||·||_F^2 denotes the squared Frobenius norm;
In addition, by reducing the representation distance between associated user pairs and the difference between P_ij and F_ij, a second optimization objective is set to better guide learning of the second user representation, where L_R is the second optimization objective, n_p is the number of associated user sample pairs, λ_1 and λ_2 are hyper-parameters, and the true association probability for an associated user sample pair is P_ij = 1;
The final optimization objective L is the sum of the first and second optimization objectives, that is, L = L_E + L_R.
Finally, L is continuously optimized by gradient descent to obtain the optimal weights and biases, and the second user representation Z is obtained from the optimal W_l and b_l.
As a preferred technical solution, the identity association result is obtained by calculating the cosine similarity between the second user representations, where the second representation of user i on platform N_1 is compared with the second representation of user j on platform N_2, and S_ij is their cosine similarity.
In a second aspect, the invention further provides a cross-social-network virtual identity association system based on multi-modal fusion and representation alignment, which applies the above cross-social-network virtual identity association method and comprises a feature extraction module, a multi-modal fusion module, a representation alignment module and an identity association module;
the feature extraction module is used for extracting features from the user names, published texts and social relations of users on the social networks of different platforms, to obtain user-name feature information, published-text feature information and social-relation feature information respectively;
the multi-modal fusion module is used for performing multi-modal fusion with an attention mechanism on the three kinds of user feature information to obtain a first user representation that fuses the multi-dimensional features;
the representation alignment module is used for strengthening the user representation by applying representation alignment to the first user representation, finally obtaining second user representations with the same distribution across the different platforms;
and the identity association module is used for calculating the cosine similarity between the second user representations to obtain similarity scores between users, and taking the user pair with the highest score as the identity association result.
In a third aspect, the present invention also provides an electronic device, including:
at least one processor; and,
a memory communicatively coupled to the at least one processor; wherein,
the memory stores computer program instructions executable by the at least one processor to cause the at least one processor to perform the cross-social-network virtual identity association method based on multi-modal fusion and representation alignment.
In a fourth aspect, the present invention further provides a computer readable storage medium storing a program, which when executed by a processor, implements the cross-social network virtual identity association method based on multi-modal fusion and representation alignment.
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. The invention fully mines three kinds of modality information: the user name, the content published by the user, and the social relations. Features of the specific composition of each user name are extracted with a character-level Bag-of-Words model; text and social-relation features are extracted with Word2Vec and DeepWalk models. On this basis, multi-modal fusion with an attention mechanism automatically learns the feature weights, which alleviates, to a certain extent, the problems that conventional methods either cannot fully describe a user with single-modality information or cannot fuse multi-modality information well;
2. After the user representations are obtained, the method further strengthens them through representation alignment, so that the representations of the same natural person on different platforms are brought as close together as possible, solving the problem of data distribution differences between social platforms.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a block diagram of a cross-social network virtual identity association method based on multi-modal fusion and representation alignment in accordance with an embodiment of the present invention;
FIG. 2 is a block diagram of a cross-social network virtual identity association system based on multi-modal fusion and representation alignment according to an embodiment of the present invention;
fig. 3 is a block diagram of an electronic device according to an embodiment of the invention.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
Multimodal fusion: multimodal fusion refers to the process of combining information from two or more modalities to make predictions. A single modality usually cannot contain all the effective information needed to produce an accurate prediction result; multimodal fusion combines information from two or more modalities, so that the information is supplemented, the coverage of the information contained in the input data is widened, the accuracy of the prediction result is improved, and the robustness of the prediction model is increased.
Referring to fig. 1, in one embodiment of the present application, a method for multimodal fusion and representation-aligned cross-social-network virtual identity association is provided, comprising the following steps:
s1, extracting characteristics of user names of different social networks, texts published by users and social relations of the users to respectively obtain user name characteristic information, text characteristic information published by the users and social relation characteristic information of the users.
S11, extracting the characteristics of the user name, which specifically comprises the following steps:
For the user name of a given user, say "abza12", feature extraction is performed with a character-level Bag-of-Words model: the number of occurrences of each character in the user name is counted, giving for "abza12" the counts a: 2, b: 1, z: 1, "1": 1, "2": 1, and hence a count vector. All the user-name vectors are concatenated in order into a user-name count matrix C_0. Because C_0 is sparse, it is transformed by an autoencoder, where W_e, b_e are the weights and bias of the encoder and W_d, b_d are the weights and bias of the decoder. The loss L_c is trained continuously by gradient descent to obtain the optimal W_e and b_e, and finally a d-dimensional user-name embedding matrix is obtained.
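As an illustrative, non-limiting sketch of this step, the following Python fragment counts characters per user name and trains a single-layer autoencoder by plain gradient descent. The alphabet, dimensions, activation and the exact form of the autoencoder and loss are assumptions for illustration; the patent's concrete formulas are not reproduced here.

```python
# Character-level Bag-of-Words counting plus a minimal autoencoder (numpy-only sketch).
import numpy as np

def char_count_vectors(usernames, alphabet):
    """Count how often each character of the alphabet occurs in each user name (matrix C0)."""
    index = {ch: k for k, ch in enumerate(alphabet)}
    C0 = np.zeros((len(usernames), len(alphabet)))
    for i, name in enumerate(usernames):
        for ch in name:
            if ch in index:
                C0[i, index[ch]] += 1
    return C0

def train_autoencoder(C0, d=16, lr=0.01, epochs=500, seed=0):
    """Single-hidden-layer autoencoder trained by gradient descent; returns the d-dim embedding."""
    rng = np.random.default_rng(seed)
    n, v = C0.shape
    We, be = rng.normal(0, 0.1, (v, d)), np.zeros(d)   # encoder weights / bias (W_e, b_e)
    Wd, bd = rng.normal(0, 0.1, (d, v)), np.zeros(v)   # decoder weights / bias (W_d, b_d)
    for _ in range(epochs):
        H = np.tanh(C0 @ We + be)                      # encoder output (user-name embedding)
        C1 = H @ Wd + bd                               # decoder reconstruction (C_1)
        err = C1 - C0                                  # reconstruction error driving the loss L_c
        dWd = H.T @ err / n
        dbd = err.mean(0)
        dH = (err @ Wd.T) * (1.0 - H ** 2) / n
        dWe = C0.T @ dH
        dbe = dH.sum(0)
        We -= lr * dWe; be -= lr * dbe
        Wd -= lr * dWd; bd -= lr * dbd
    return np.tanh(C0 @ We + be)

alphabet = "abcdefghijklmnopqrstuvwxyz0123456789"
C0 = char_count_vectors(["abza12", "sunny_day", "abza_12"], alphabet)
Zc = train_autoencoder(C0, d=16)   # d-dimensional user-name embedding matrix
```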
S12, extracting the characteristics of the text published by the user, which specifically comprises the following steps:
For the texts published by a user, such as "Today is a sunny day", the texts are fed, after stop words are removed, into a Word2Vec model to obtain a vector representation of each word; the word vectors within each text are summed to obtain the text's embedding vector, the embedding vectors of all texts published by each user are averaged as that user's text representation, and the text embedding vectors of all users are concatenated in order to obtain a d-dimensional text embedding matrix.
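A minimal sketch of this step is given below, assuming gensim (version 4 or later) is available for Word2Vec; the toy posts, the stop-word list and the tokenisation are hypothetical placeholders rather than the patent's exact pipeline.

```python
# Per-user text embeddings: sum word vectors within a post, average over a user's posts.
import numpy as np
from gensim.models import Word2Vec

user_posts = {                                   # hypothetical posts per user
    "u1": ["Today is a sunny day", "coffee time"],
    "u2": ["rainy day again"],
}
stop_words = {"is", "a", "the"}
corpus = [[w for w in p.lower().split() if w not in stop_words]
          for posts in user_posts.values() for p in posts]

d = 16
w2v = Word2Vec(sentences=corpus, vector_size=d, window=3, min_count=1, epochs=50)

def text_embedding(posts):
    """Sum the word vectors inside each post, then average over all of the user's posts."""
    vecs = []
    for p in posts:
        words = [w for w in p.lower().split() if w not in stop_words and w in w2v.wv]
        if words:
            vecs.append(np.sum([w2v.wv[w] for w in words], axis=0))
    return np.mean(vecs, axis=0) if vecs else np.zeros(d)

Zt = np.vstack([text_embedding(p) for p in user_posts.values()])  # d-dim text embedding matrix
```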
S13, extracting the characteristics of the social relationship of the user, specifically:
For the user social relations, if the i-th user of platform N_1 and the j-th user of platform N_2 are friends, position (i, j) of the adjacency matrix is set to 1. The n×m adjacency matrix formed by the social relations between the n users of platform N_1 and the m users of platform N_2 is fed into a DeepWalk model to obtain an embedding vector of each user's social relations, and the social-relation embedding vectors of all users are concatenated in order to obtain a d-dimensional user social-relation embedding matrix.
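The sketch below illustrates a DeepWalk-style pipeline over the bipartite adjacency matrix: truncated random walks followed by skip-gram Word2Vec on the walk sequences. The friendship links, walk lengths and other parameters are assumptions for illustration, and gensim (version 4 or later) is assumed for the skip-gram step.

```python
# DeepWalk-style social-relation embedding over an n x m cross-platform adjacency matrix.
import random
import numpy as np
from gensim.models import Word2Vec

n, m, d = 4, 3, 16
A = np.zeros((n, m))                      # cross-platform friendship adjacency matrix
A[0, 1] = A[1, 0] = A[2, 2] = A[3, 1] = 1 # hypothetical friend links

# One joint node set: indices 0..n-1 are N1 users, n..n+m-1 are N2 users.
neighbors = {i: [n + j for j in range(m) if A[i, j]] for i in range(n)}
neighbors.update({n + j: [i for i in range(n) if A[i, j]] for j in range(m)})

def random_walks(num_walks=10, walk_len=8, seed=0):
    """Truncated random walks started from every node of the bipartite graph."""
    random.seed(seed)
    walks = []
    for _ in range(num_walks):
        for start in neighbors:
            walk, node = [start], start
            for _ in range(walk_len - 1):
                nbrs = neighbors[node]
                if not nbrs:
                    break
                node = random.choice(nbrs)
                walk.append(node)
            walks.append([str(v) for v in walk])
    return walks

dw = Word2Vec(sentences=random_walks(), vector_size=d, window=4, min_count=1, sg=1, epochs=20)
Zv = np.vstack([dw.wv[str(v)] if str(v) in dw.wv else np.zeros(d)
                for v in range(n + m)])   # d-dim social-relation embedding matrix
```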
S2, performing multi-modal fusion with an attention mechanism on the feature information to obtain a first user representation that fuses the multi-dimensional features.
S21, the multi-modal fusion applies an attention mechanism to the embedding matrices of the three kinds of user feature information, assigning each modality a different weight to reflect the relative importance of the modalities, and yields the fused first user representation matrix Z_f, where α_C, α_T and α_V are the weights of the user-name, text and social-relation embedding matrices respectively, and f(·) is the attention network.
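Since the concrete attention network f(·) is not reproduced here, the following sketch assumes a simple learnable scoring vector followed by a softmax over the three modalities, producing α_C, α_T, α_V and the weighted-sum first representation Z_f; it is one possible instantiation, not the patent's exact formula.

```python
# Softmax attention over the three modality embedding matrices (illustrative sketch).
import numpy as np

def attention_fuse(Zc, Zt, Zv, w_att):
    """Score each modality, softmax the scores into weights, and fuse the embeddings."""
    modalities = np.stack([Zc, Zt, Zv])                  # shape (3, n_users, d)
    scores = np.tanh(modalities @ w_att).mean(axis=1)    # one scalar score per modality
    alpha = np.exp(scores) / np.exp(scores).sum()        # alpha_C, alpha_T, alpha_V
    Zf = np.tensordot(alpha, modalities, axes=1)         # weighted sum: first user representation
    return Zf, alpha

rng = np.random.default_rng(0)
n_users, d = 5, 16
Zc, Zt, Zv = rng.normal(size=(3, n_users, d))            # placeholder modality embeddings
Zf, alpha = attention_fuse(Zc, Zt, Zv, w_att=rng.normal(size=d))
```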
S3, the first user representation is strengthened through representation alignment, finally obtaining second user representations with the same distribution across the different platforms.
S31, first, the first user representations are passed through a fully connected layer that maps the user representations of the two platforms into the same space, giving the second user representations, where W_l and b_l are the weight and bias of the fully connected layer, the first user representation of each platform is the result of its multi-modal fusion, and Z is the second user representation.
S32, second, the minimized EMD (earth mover's distance) is used as the first optimization objective, where L_E is the first optimization objective, d_ij is the distance between the second representation of user i on platform N_1 and the second representation of user j on platform N_2, F_ij is the association probability between the two users, and ||·||_F^2 denotes the squared Frobenius norm.
S33, the method also reduces the representation distance between associated user pairs and the difference between P_ij and F_ij, setting a second optimization objective to better guide learning of the second user representation, where L_R is the second optimization objective, n_p is the number of associated user sample pairs, λ_1 and λ_2 are hyper-parameters, and the true association probability for an associated user sample pair is P_ij = 1;
The final optimization objective L is the sum of the first and second optimization objectives, that is, L = L_E + L_R.
Finally, L is continuously optimized by gradient descent to obtain the optimal weights and biases, and the second user representation Z is obtained from the optimal W_l and b_l.
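The PyTorch sketch below illustrates S31 to S33 under explicit assumptions: the association plan F is approximated by a softmax over negative distances, the EMD objective is taken as the transport cost sum of F_ij times d_ij, and the pair loss uses squared distance and squared probability error with λ_1 = λ_2 = 1. These are illustrative choices only; the patent's exact definitions of F, L_E and L_R are not reproduced here.

```python
# Representation alignment: shared fully connected layer trained with L = L_E + L_R (sketch).
import torch
import torch.nn as nn

n1, n2, d = 6, 5, 16
torch.manual_seed(0)
Zf1, Zf2 = torch.randn(n1, d), torch.randn(n2, d)   # first user representations of the two platforms
pairs = [(0, 0), (1, 2)]                            # known associated user pairs (P_ij = 1)
lambda1, lambda2 = 1.0, 1.0                         # assumed hyper-parameters

fc = nn.Linear(d, d)                                # shared fully connected layer (W_l, b_l)
opt = torch.optim.SGD(fc.parameters(), lr=0.05)

for step in range(200):
    Z1, Z2 = fc(Zf1), fc(Zf2)                       # second user representations
    dist = torch.cdist(Z1, Z2)                      # d_ij: cross-platform pairwise distances
    F = torch.softmax(-dist, dim=1)                 # crude soft association plan F_ij (assumption)
    L_E = (F * dist).sum()                          # EMD-style transport cost (first objective)
    L_R = torch.tensor(0.0)
    for i, j in pairs:                              # second objective over labelled pairs
        L_R = L_R + lambda1 * dist[i, j] ** 2 + lambda2 * (1.0 - F[i, j]) ** 2
    L_R = L_R / len(pairs)
    loss = L_E + L_R                                # L = L_E + L_R, minimized by gradient descent
    opt.zero_grad(); loss.backward(); opt.step()

Z1, Z2 = fc(Zf1).detach(), fc(Zf2).detach()         # aligned second user representations
```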
S4, calculating the cosine similarity between the second user representations to obtain similarity scores between users, and taking the user pair with the highest score as the identity association result.
S41, the cosine similarity between the second user representations of platform N_1 users and platform N_2 users is computed, where S_ij is the cosine similarity between the second representation of user i on platform N_1 and the second representation of user j on platform N_2; finally, according to the similarity scores between users, the user pair with the highest score is taken as the identity association result.
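A short sketch of this final matching step follows; the matrix shapes and the greedy highest-score matching per platform-N_1 user are illustrative assumptions.

```python
# Cosine similarity between second user representations and highest-score matching.
import numpy as np

def cosine_similarity_matrix(Z1, Z2):
    """S_ij = cos(z_i, z_j) for every cross-platform user pair."""
    Z1n = Z1 / np.linalg.norm(Z1, axis=1, keepdims=True)
    Z2n = Z2 / np.linalg.norm(Z2, axis=1, keepdims=True)
    return Z1n @ Z2n.T

rng = np.random.default_rng(0)
Z1, Z2 = rng.normal(size=(6, 16)), rng.normal(size=(5, 16))   # placeholder second representations
S = cosine_similarity_matrix(Z1, Z2)
matches = S.argmax(axis=1)              # for each N1 user, the N2 user with the highest score
for i, j in enumerate(matches):
    print(f"platform-N1 user {i}  <->  platform-N2 user {j}  (score {S[i, j]:.3f})")
```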
It should be noted that, for the sake of simplicity, the foregoing method embodiments are described as a series of acts or combinations, but those skilled in the art should understand that the present invention is not limited by the described order of acts, as some steps may be performed in other orders or simultaneously according to the present invention.
Based on the same idea as the cross-social network virtual identity association method based on multi-modal fusion and representation alignment in the above embodiment, the invention further provides a cross-social network virtual identity association system based on multi-modal fusion and representation alignment, which can be used for executing the above cross-social network virtual identity association method based on multi-modal fusion and representation alignment. For convenience of illustration, only the parts related to the embodiments of the present invention are shown in the structural schematic diagram of the cross-social network virtual identity association system embodiment based on multi-modal fusion and representation alignment, and those skilled in the art will understand that the illustrated structure does not constitute a limitation of the apparatus, and may include more or less components than those illustrated, or combine some components, or arrange different components.
Referring to FIG. 2, in another embodiment of the present application, a cross-social network virtual identity association system 100 based on multi-modal fusion and representation alignment is provided and includes a feature extraction module 101, a multi-modal fusion module 102, a representation alignment module 103, and an identity association module 104.
The feature extraction module 101 is configured to perform feature extraction on user names of social networks of different platforms, texts published by users, and user social relationships, and obtain user name feature information, text feature information published by users, and user social relationship feature information, respectively;
the multi-modal fusion module 102 is configured to perform multi-modal fusion by using an attention mechanism according to the three user feature information to obtain a first user representation fused with multi-dimensional features;
the representation alignment module 103 is configured to enhance the user representation through representation alignment on the first user representation, and finally obtain a second user representation with the same distribution on different platforms;
the identity association module 104 is configured to calculate cosine similarity between the second user representations to obtain similarity scores between users, and use the user pair with the highest score as an identity association result.
It should be noted that the cross-social network virtual identity association system based on multi-modal fusion and representation alignment of the present invention corresponds one to one with the cross-social network virtual identity association method based on multi-modal fusion and representation alignment of the present invention; the technical features and beneficial effects described in the embodiments of the method are all applicable to the system, their specific contents can be found in the description of the method embodiments, and they are not repeated herein.
In addition, in the implementation of the multi-modal fusion and representation aligned cross-social network virtual identity association system according to the above embodiment, the logical division of each program module is only an example, and in practical applications, the above function distribution may be performed by different program modules according to needs, for example, due to the configuration requirements of corresponding hardware or the convenience of implementation of software, that is, the internal structure of the cross-social network virtual identity association system based on multi-modal fusion and representation aligned is divided into different program modules to perform all or part of the above described functions.
Referring to fig. 3, in an embodiment, an electronic device for implementing a cross-social-network virtual identity association method based on multi-modal fusion and representation alignment is provided, and the electronic device 200 may include a first processor 201, a first memory 202 and a bus, and may further include a computer program stored in the first memory 202 and executable on the first processor 201, such as a cross-social-network virtual identity association program 203 based on multi-modal fusion and representation alignment.
The first memory 202 includes at least one type of readable storage medium, which includes flash memory, removable hard disk, multimedia card, card-type memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disk, optical disk, etc. The first memory 202 may in some embodiments be an internal storage unit of the electronic device 200, e.g. a removable hard disk of the electronic device 200. The first memory 202 may also be an external storage device of the electronic device 200 in other embodiments, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the electronic device 200. Further, the first memory 202 may also include both an internal storage unit and an external storage device of the electronic device 200. The first memory 202 can be used not only for storing application software installed on the electronic device 200 and various types of data, such as code of the cross-social-network virtual-identity correlation program 203 aligned with the representation through multi-modal fusion, but also for temporarily storing data that has been output or will be output.
The first processor 201 may be composed of an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same function or different functions, and includes one or more Central Processing Units (CPUs), microprocessors, digital Processing chips, graphics processors, and combinations of various control chips. The first processor 201 is a Control Unit (Control Unit) of the electronic device, connects various components of the whole electronic device by using various interfaces and lines, and executes various functions and processes data of the electronic device 200 by running or executing programs or modules stored in the first memory 202 and calling data stored in the first memory 202.
Fig. 3 shows only an electronic device having components, and those skilled in the art will appreciate that the structure shown in fig. 3 does not constitute a limitation of the electronic device 200, and may include fewer or more components than those shown, or some components may be combined, or a different arrangement of components.
The multimodal fusion and cross-social-network virtual-identity association program 203 representing alignment stored in the first memory 202 of the electronic device 200 is a combination of instructions that, when executed in the first processor 201, may implement:
extracting characteristics of user names of different social networks, texts published by users and social relations of the users to respectively obtain characteristic information of the user names, characteristic information of the texts published by the users and characteristic information of the social relations of the users;
performing multi-mode fusion by using an attention mechanism according to the characteristic information to obtain a first user representation fused with multi-dimensional characteristics;
the first user representation is strengthened through representation alignment, finally obtaining second user representations with the same distribution across different platforms;
and calculating cosine similarity between the second user representations to obtain similarity scores between the users, and taking the user pair with the highest score as an identity correlation result.
Further, the integrated modules/units of the electronic device 200 may be stored in a non-volatile computer-readable storage medium if they are implemented in the form of software functional units and sold or used as independent products. The computer-readable medium may include: any entity or device capable of carrying said computer program code, recording medium, U-disk, removable hard disk, magnetic disk, optical disk, computer Memory, read-Only Memory (ROM).
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above may be implemented by a computer program, which may be stored in a non-volatile computer readable storage medium, and when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory, among others. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically Programmable ROM (EPROM), electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double Data Rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous Link DRAM (SLDRAM), rambus (Rambus) direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.
Claims (10)
1. A cross-social-network virtual identity association method based on multi-modal fusion and representation alignment, characterized by comprising the following steps:
extracting features from the user names of users of different social networks, the texts published by the users, and the users' social relations, to obtain user-name feature information, published-text feature information and social-relation feature information respectively;
performing multi-modal fusion with an attention mechanism on the obtained user-name feature information, published-text feature information and social-relation feature information to obtain a first user representation that fuses the multi-dimensional features;
strengthening the user representation by applying a representation alignment method to the first user representation, finally obtaining second user representations that share the same distribution space across the different platforms;
and calculating the cosine similarity between the second user representations to obtain similarity scores between users, and taking the user pair with the highest score as the identity association result.
2. The cross-social-network virtual identity association method based on multi-modal fusion and representation alignment according to claim 1, wherein the feature extraction of the user name is specifically as follows:
For the user name of a given user, feature extraction is performed with a character-level Bag-of-Words model: the number of occurrences of each character in the user name is counted to obtain a count vector, and the vectors of all user names are concatenated in order into a user-name count matrix C_0; because C_0 is sparse, it is transformed by an autoencoder, where W_e, b_e are the weights and bias of the encoder, W_d, b_d are the weights and bias of the decoder, and C_1 is the user-name vector matrix reconstructed by the decoder; the loss function L_c is trained continuously by gradient descent to obtain the optimal W_e and b_e, and finally a d-dimensional user-name embedding matrix is obtained.
3. The cross-social-network virtual identity association method based on multi-modal fusion and representation alignment according to claim 1, wherein the feature extraction of the text published by the user is specifically as follows:
The texts published by users are fed into a Word2Vec model to obtain an embedding vector for each text; the embedding vectors of the texts published by each user are averaged as that user's text representation, and the text embedding vectors of all users are concatenated in order to obtain a d-dimensional text embedding matrix.
4. The cross-social-network virtual identity association method based on multi-modal fusion and representation alignment according to claim 1, wherein the feature extraction of the user social relationship is specifically as follows:
The social relations between the n users of platform N_1 and the m users of platform N_2 form an n×m adjacency matrix, which is fed into a DeepWalk model to obtain an embedding vector of each user's social relations; the social-relation embedding vectors of all users are concatenated in order to obtain a d-dimensional user social-relation embedding matrix.
5. The cross-social-network virtual identity association method based on multi-modal fusion and representation alignment according to claim 1, wherein the multi-modal fusion applies an attention mechanism to the embedding matrices of the three kinds of user feature information, assigning each modality a different weight to reflect the relative importance of the modalities, and yields the fused first user representation matrix Z_f, where α_C, α_T and α_V are the weights of the user-name, text and social-relation embedding matrices respectively, and f(·) is the attention network.
6. The cross-social-network virtual identity association method based on multi-modal fusion and representation alignment according to claim 1, wherein the specific steps of strengthening the user representation through representation alignment are as follows:
First, the first user representations are passed through a fully connected layer that maps the user representations of the two platforms into the same space, giving the second user representations, where W_l and b_l are the weight and bias of the fully connected layer, the first user representation of each platform is the result of its multi-modal fusion, and Z is the second user representation;
Second, to train all the weights and biases in the method, the minimized EMD (earth mover's distance) is used as the first optimization objective, where L_E is the first optimization objective, d_ij is the distance between the second representation of user i on platform N_1 and the second representation of user j on platform N_2, F_ij is the association probability between the two users, and ||·||_F^2 denotes the squared Frobenius norm;
In addition, by reducing the representation distance between associated user pairs and the difference between P_ij and F_ij, a second optimization objective is set to better guide learning of the second user representation, where L_R is the second optimization objective, n_p is the number of associated user sample pairs, λ_1 and λ_2 are hyper-parameters, and the true association probability for an associated user sample pair is P_ij = 1;
The final optimization objective L is the sum of the first and second optimization objectives, that is, L = L_E + L_R.
Finally, L is continuously optimized by gradient descent to obtain the optimal weights and biases, and the second user representation Z is obtained from the optimal W_l and b_l.
7. The cross-social network virtual identity association method based on multi-modal fusion and representation alignment as claimed in claim 1, wherein the identity association result is obtained by calculating the cosine similarity between the second user representations.
8. A cross-social network virtual identity association system based on multi-modal fusion and representation alignment, which applies the cross-social network virtual identity association method based on multi-modal fusion and representation alignment of any one of claims 1 to 7, characterized by comprising a feature extraction module, a multi-modal fusion module, a representation alignment module and an identity association module;
the feature extraction module is used for extracting features from the user names, published texts and social relations of users on the social networks of different platforms, to obtain user-name feature information, published-text feature information and social-relation feature information respectively;
the multi-modal fusion module is used for performing multi-modal fusion with an attention mechanism on the three kinds of user feature information to obtain a first user representation that fuses the multi-dimensional features;
the representation alignment module is used for strengthening the user representation by applying representation alignment to the first user representation, finally obtaining second user representations with the same distribution across the different platforms;
and the identity association module is used for calculating the cosine similarity between the second user representations to obtain similarity scores between users, and taking the user pair with the highest score as the identity association result.
9. An electronic device, characterized in that the electronic device comprises:
at least one processor; and,
a memory communicatively coupled to the at least one processor; wherein,
the memory stores computer program instructions executable by the at least one processor to cause the at least one processor to perform the cross-social-network virtual identity association method based on multi-modal fusion and representation alignment of any of claims 1-7.
10. A computer readable storage medium storing a program, which when executed by a processor, implements the cross-social network virtual identity association method based on multi-modal fusion and representation alignment of any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211474688.8A CN115828109A (en) | 2022-11-23 | 2022-11-23 | Cross-social network virtual identity association method and device based on multi-mode fusion and representation alignment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211474688.8A CN115828109A (en) | 2022-11-23 | 2022-11-23 | Cross-social network virtual identity association method and device based on multi-mode fusion and representation alignment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115828109A true CN115828109A (en) | 2023-03-21 |
Family
ID=85530641
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211474688.8A Pending CN115828109A (en) | 2022-11-23 | 2022-11-23 | Cross-social network virtual identity association method and device based on multi-mode fusion and representation alignment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115828109A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116776193A (en) * | 2023-05-17 | 2023-09-19 | 广州大学 | Method and device for associating virtual identities across social networks based on attention mechanism |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021114840A1 (en) | Scoring method and apparatus based on semantic analysis, terminal device, and storage medium | |
CN110032641A (en) | Method and device that computer executes, that event extraction is carried out using neural network | |
CN112016273A (en) | Document directory generation method and device, electronic equipment and readable storage medium | |
CN114596566B (en) | Text recognition method and related device | |
CN111737499A (en) | Data searching method based on natural language processing and related equipment | |
CN112085091B (en) | Short text matching method, device, equipment and storage medium based on artificial intelligence | |
CN112131881B (en) | Information extraction method and device, electronic equipment and storage medium | |
CN110795938A (en) | Text sequence word segmentation method, device and storage medium | |
CN112733645B (en) | Handwritten signature verification method, handwritten signature verification device, computer equipment and storage medium | |
CN113158656B (en) | Ironic content recognition method, ironic content recognition device, electronic device, and storage medium | |
CN111611811A (en) | Translation method, translation device, electronic equipment and computer readable storage medium | |
CN115840808B (en) | Technological project consultation method, device, server and computer readable storage medium | |
CN115828109A (en) | Cross-social network virtual identity association method and device based on multi-mode fusion and representation alignment | |
CN114676705B (en) | Dialogue relation processing method, computer and readable storage medium | |
CN118172785A (en) | Document information extraction method, apparatus, device, storage medium, and program product | |
CN116776193B (en) | Method and device for associating virtual identities across social networks based on attention mechanism | |
CN115640810B (en) | Method, system and storage medium for identifying communication sensitive information of power system | |
CN116524574A (en) | Facial area recognition method and device and electronic equipment | |
Ledesma et al. | Enabling automated herbarium sheet image post‐processing using neural network models for color reference chart detection | |
CN116484864A (en) | Data identification method and related equipment | |
CN116976341A (en) | Entity identification method, entity identification device, electronic equipment, storage medium and program product | |
CN115346095A (en) | Visual question answering method, device, equipment and storage medium | |
CN112989820A (en) | Legal document positioning method, device, equipment and storage medium | |
CN114386431B (en) | Sentence-based resource library hot updating method, sentence-based recommending method and related devices | |
CN112732913B (en) | Method, device, equipment and storage medium for classifying unbalanced samples |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |