EP2866421B1 - Method and apparatus for identifying a same user in multiple social networks - Google Patents

Info

Publication number
EP2866421B1
Authority
EP
European Patent Office
Prior art keywords
account
combination
test set
account combination
accounts
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
EP14190351.8A
Other languages
German (de)
English (en)
Other versions
EP2866421A1 (fr
Inventor
Caifeng He
Jianfeng Qian
Wei Fan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Publication of EP2866421A1
Application granted
Publication of EP2866421B1
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/2866 Architectures; Arrangements
    • H04L67/30 Profiles
    • H04L67/306 User profiles
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/535 Tracking the activity of the user

Definitions

  • the present invention relates to the field of network user identification technologies, and in particular, to a method and an apparatus for identifying a user in multiple social networks.
  • Social networks are increasingly popular and of increasing types (for example, Facebook, Twitter, WeChat, and Foursquare). Most of the social networks are independent of one another. Each social network has a large quantity of heterogeneous data sets based on accounts, including diversified data types such as time, location, person and event. One user is often active in multiple social networks by using different accounts, which generates a large quantity of independent data sets scattered over various social networks. Associating data sets of a user in different social networks will greatly extend data information based on the user, which is of great significance for many data mining analyses.
  • a method for associating data sets of a user in different social networks is mainly as follows: firstly, modeling an account in a social network by generating, from registration information of the account and text content published in the social network by a user of the account, a vector describing account features, where the vector includes attributes such as the name, the birth date and the academic degree of the user, and favorites of the user (such as a song, a color, and food); secondly, assigning different weights to different attributes of the vector, where the weight of an attribute indicates the importance of the attribute for distinguishing different users; and finally, computing a similarity between vectors of different accounts, so as to identify whether the accounts belong to a same user.
  • a main technical issue resolved by the present invention is to provide a method and an apparatus for identifying a same user in multiple social networks, which can describe user information comprehensively and accurately, so as to make a final prediction result more accurate.
  • the present invention provides a method for identifying a same user in multiple social networks, where it is defined that a same user has only one account in one social network, that the number of accounts in an account combination equals the number of the social networks, and that each account in the account combination comes from a different social network; and the method includes: inputting accounts that are in a test set and are obtained from registered accounts in at least two different social networks, and generating a test set account combination from the accounts in the test set; extracting at least two different features that are of each account in the test set account combination and are correlated with behavioral data of a user of the account; inputting accounts in a training set which are obtained from at least two different social networks, and generating a training set account combination from accounts, which belong to a same user, among the accounts in the training set; extracting at least two different features that are of each account in the training set account combination and are correlated with behavioral data of a user of the account; and training the training set with a supervised classification algorithm and by using the at least two
  • the association algorithm refers to a process in which correlation between predicted values of a test set account combination is computed to obtain a prediction result for the test set account combination in order to indicate whether the test set account combination belongs or does not belong to the user.
  • before the step of inputting the at least two different features that are of each account in the test set account combination and are correlated with behavioral data of a user of the account into a classification and prediction model that is already established, to obtain a predicted value or a set of predicted values, which may belong to a user, of the test set account combination, the method includes: inputting accounts in a training set which are obtained from the at least two different social networks, and generating a training set account combination from accounts, which belong to a same user, among the accounts in the training set; extracting at least two different features that are of each account in the training set account combination and are correlated with behavioral data of a user of the account; and training the training set with a supervised classification algorithm by using the at least two different features that are of each account in the training set account combination and are correlated with behavioral data of a user of the account, to obtain the classification and prediction model.
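The training step above can be sketched as follows. The text only requires "a supervised classification algorithm"; logistic regression fitted by gradient descent is used here purely as an assumed stand-in. The feature layout, the numeric values, the negative examples (label 0) and the function names are all hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    m = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for k, xi in enumerate(x):
                grad_w[k] += err * xi
            grad_b += err
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias


def predict(weights, bias, x):
    """Predicted value in (0, 1) for one account combination."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Hypothetical feature vectors per account combination:
# [social Jaccard, location cosine, time-range cosine, text cosine]
train_x = [
    [0.80, 0.90, 0.95, 0.70],  # same user
    [0.60, 0.85, 0.90, 0.80],  # same user
    [0.10, 0.20, 0.30, 0.10],  # different users (assumed negative example)
    [0.00, 0.10, 0.20, 0.30],  # different users (assumed negative example)
]
train_y = [1, 1, 0, 0]

weights, bias = train_logistic(train_x, train_y)
score_same = predict(weights, bias, [0.70, 0.90, 0.90, 0.75])
score_diff = predict(weights, bias, [0.05, 0.15, 0.25, 0.20])
```

The predicted values produced this way are what the association step would later compare across account combinations.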
  • the at least two different features that are correlated with behavioral data of a user of the account include: a social feature of an account in the account combination, a spatial feature of account publishing information in the account combination, a temporal feature of account publishing information in the account combination and a text feature of account publishing information in the account combination.
  • the social feature of an account in the account combination includes: the number of common adjacent elements, a Jaccard similarity coefficient and an Adamic/Adar measure, where the number of common adjacent elements refers to the number of common friends of accounts in the account combination, and accounts having the common friends are in the training set; the Jaccard similarity coefficient refers to a ratio of the number of common friends of accounts in the account combination to the number of all friends of accounts in the account combination; and the Adamic/Adar measure refers to influence of common friends of accounts in the account combination on respective social networks.
  • the spatial feature of account publishing information in the account combination includes: the number of common locations of all account publishing information in the account combination, a cosine similarity of a location set of all account publishing information in the account combination, and an average distance of the location set of all account publishing information in the account combination.
  • the temporal feature of account publishing information in the account combination includes: the number of common time ranges of all account publishing information in the account combination, and a cosine similarity of a time range set of all account publishing information in the account combination.
  • the text feature of account publishing information in the account combination includes: an inner product of bag-of-words vectors of all account publishing information in the account combination, and a cosine similarity of the bag-of-words vectors of all account publishing information in the account combination.
  • the method further includes: processing, by using a natural language processing technology, information published by an account in the test set or training set account combination; and generating a bag-of-words vector of the account from the processed information by using a term frequency-inverse document frequency (TF-IDF) weighting model.
  • the step of performing computation on the predicted value or the set of predicted values of the test set account combination by using an association algorithm, and outputting a computed prediction result for the test set account combination includes: performing computation on the predicted value or the set of predicted values of the test set account combination by using a multi-network approach (MNA) algorithm, and outputting the computed prediction result for the test set account combination.
  • the step of performing computation on the predicted value or the set of predicted values of the test set account combination by using a multi-network approach (MNA) algorithm, and outputting the computed prediction result for the test set account combination includes: sorting, in the test set account combination, predicted values or sets of predicted values of all account combinations corresponding to an account in the test set in order of the predicted values, to obtain a list of predicted values of the account; and if a closed account pair exists in the test set account combination, determining that accounts corresponding to the closed account pair belong to a same user, and outputting the closed account pair that belongs to a same user, where the closed account pair meets the following conditions: a test set account combination corresponding to a maximum predicted value in a list of predicted values of an account a_i is (a_i, b_j), and a test set account combination corresponding to a maximum predicted value in a list of predicted values of an account b_j is (b_j,
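  • in a tenth possible implementation manner of the first aspect, after the step of sorting, when the test set account combination corresponding to the maximum predicted value in the list of predicted values of the account a_i is (a_i, b_j), and the test set account combination corresponding to the maximum predicted value in the list of predicted values of the account b_j is (b_j, a_k), the method includes: comparing the predicted value of the test set account combination (a_i, b_j) with the predicted value of the test set account combination (b_j, a_k); if the predicted value of the test set account combination (a_i, b_j) is smaller than the predicted value of the test set account combination (b_j, a_k), determining that the account a_k and the account b_j belong to a same user and that the account a_i and the account b_j do not belong to a same user, and outputting the test set account combination (b_j, a_k) that belongs to a same user; and if the predicted value of the test set account combination (a_i, b_j) is greater than the predicted value of the test set account combination (b_j, a_k), determining that the account a_i and the account b_j belong to a same user and that the account a_k and the account b_j do not belong to a same user, and outputting the test set account combination (a_i, b_j) that belongs to a same user.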
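The closed account pair condition can be illustrated with a small sketch, assuming two networks and a dictionary of predicted values keyed by account combination. The function name `closed_pairs` and all scores are hypothetical; this is one possible reading of the matching rule, not the patented implementation.

```python
def closed_pairs(scores):
    """Return combinations (a, b) where b is a's best match and a is b's best match.

    scores maps an account combination (a, b) to its predicted value.
    """
    best_for_a = {}  # account in network A -> best-scoring partner in network B
    best_for_b = {}  # account in network B -> best-scoring partner in network A
    for (a, b), value in scores.items():
        if a not in best_for_a or value > scores[(a, best_for_a[a])]:
            best_for_a[a] = b
        if b not in best_for_b or value > scores[(best_for_b[b], b)]:
            best_for_b[b] = a
    return sorted((a, b) for a, b in best_for_a.items() if best_for_b[b] == a)


# Hypothetical predicted values for every test set account combination.
scores = {
    ("a1", "b1"): 0.9, ("a1", "b2"): 0.2,
    ("a2", "b1"): 0.6, ("a2", "b2"): 0.4,
    ("a3", "b1"): 0.7, ("a3", "b2"): 0.3,
}

matched = closed_pairs(scores)
```

In this data both a2 and a3 rank b1 highest, but b1 ranks a1 highest and the value for (a1, b1) exceeds the others, so only (a1, b1) is reported; this corresponds to the comparison between competing combinations described above.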
  • the present invention provides an apparatus for identifying a same user in multiple social networks, where it is defined that a same user has only one account in one social network, that the number of accounts in an account combination equals the number of social networks, and that each account in the account combination comes from a different social network; and the apparatus includes: a first generating module, a first extraction module, a first obtaining module and an output module, where the first generating module is configured to, after accounts that are in a test set and are obtained from registered accounts in at least two different social networks are inputted, generate a test set account combination from the accounts in the test set; the first extraction module is configured to, after the first generating module generates the test set account combination, extract at least two different features that are of each account in the test set account combination and are correlated with the behavioral data of a user of the account; the first obtaining module is configured to, after the first extraction module extracts the at least two different features that are of each account in the test set account combination and are correlated with the behavioral data of a user of the account, input the extracted features into a classification and prediction model that is already established, to obtain a predicted value or a set of predicted values, which may belong to a user, of the test set account combination; and the output module is configured to, after the first obtaining module obtains the predicted value or the set of predicted values, which may belong to a user, of the test set account combination, perform computation on the predicted value or the set of predicted values of the test set account combination by using an association algorithm, and output a computed prediction result for the test set account combination, wherein the association algorithm refers to a process in which correlation between predicted values of a test set account combination is computed to obtain a prediction result for the test set account combination in order to indicate whether the test set account combination belongs or does not belong to the user.
  • the apparatus further includes: a second generating module, a second extraction module and a second obtaining module, where the second generating module is configured to, after accounts in a training set which are obtained from registered accounts in at least two different social networks are inputted, generate a training set account combination from accounts, which belong to a user, among the accounts in the training set; the second extraction module is configured to, after the second generating module generates the training set account combination, extract at least two different features that are of each account in the training set account combination and are correlated with behavioral data of a user of the account; and the second obtaining module is configured to, after the second extraction module extracts the at least two different features that are of each account in the training set account combination and are correlated with behavioral data of a user of the account, train the training set with a supervised classification algorithm by using the extracted features, to obtain the classification and prediction model.
  • the at least two different features that are correlated with behavioral data of a user of the account include: a social feature of an account in the account combination, a spatial feature of account publishing information in the account combination, a temporal feature of account publishing information in the account combination and a text feature of account publishing information in the account combination.
  • the social feature of an account in the account combination includes: the number of common adjacent elements, a Jaccard similarity coefficient and an Adamic/Adar measure, where the number of common adjacent elements refers to the number of common friends of accounts in the account combination, and accounts having the common friends are in the training set; the Jaccard similarity coefficient refers to a ratio of the number of common friends of accounts in the account combination to the number of all friends of accounts in the account combination; and the Adamic/Adar measure refers to influence of the common friends of accounts in the account combination on respective social networks.
  • the spatial feature of account publishing information in the account combination includes: the number of common locations of all account publishing information in the account combination, a cosine similarity of a location set of all account publishing information in the account combination, and an average distance of the location set of all account publishing information in the account combination.
  • the temporal feature of account publishing information in the account combination includes: the number of common time ranges of all account publishing information in the account combination, and a cosine similarity of a time range set of all account publishing information in the account combination.
  • the text feature of account publishing information in the account combination includes: an inner product of bag-of-words vectors of all account publishing information in the account combination, and a cosine similarity of the bag-of-words vectors of all account publishing information in the account combination.
  • the apparatus further includes: a processing module and a third generating module, where the processing module is configured to process, by using a natural language processing technology, information published by an account in the test set or training set account combination; and the third generating module is configured to, after the processing module processes information published by an account in the test set or training set account combination, generate a bag-of-words vector of the account from the processed information by using a term frequency-inverse document frequency (TF-IDF) weighting model.
  • the output module is specifically configured to perform computation on the predicted value or the set of predicted values of the test set account combination by using a multi-network approach (MNA) algorithm, and output the computed prediction result for the test set account combination.
  • the output module includes: an obtaining unit and a first output unit, where the obtaining unit is configured to sort, in the test set account combination, predicted values or sets of predicted values of all account combinations corresponding to an account in the test set in order of the predicted values, to obtain a list of predicted values of the account; and the first output unit is configured to, after the obtaining unit obtains the list of predicted values of the account, if a closed account pair exists in the test set account combination, determine that accounts corresponding to the closed account pair belong to a user, and output the closed account pair that belongs to a user, where the closed account pair meets the following conditions: a test set account combination corresponding to a maximum predicted value in a list of predicted values of an account a_i is (a_i, b_j), and a test set account combination corresponding to a maximum predicted value in a list of predicted values of an account b_j is (b_j, a
  • the output module further includes: a comparison unit and a second output unit, where the comparison unit is configured to, when the test set account combination corresponding to the maximum predicted value in the list of predicted values of the account a_i is (a_i, b_j), and the test set account combination corresponding to the maximum predicted value in the list of predicted values of the account b_j is (b_j, a_k), compare the predicted value of the test set account combination (a_i, b_j) with the predicted value of the test set account combination (b_j, a_k); and the second output unit is configured to, when a comparison result of the comparison unit is that the predicted value of the test set account combination (a_i, b_j) is smaller than the predicted value of the test set account combination (b_j, a_k), determine that the account a_k and the account b_j belong
  • Beneficial effects of the present invention are as follows: unlike the case of the prior art, in the present invention, at least two different features of each account in a test set account combination that are correlated with behavioral data of a user of the account are extracted; the at least two different features that are of each account in the test set account combination and are correlated with behavioral data of a user of the account are inputted into a classification and prediction model that is already established, to obtain a predicted value or a set of predicted values, which may belong to a user, of the test set account combination; computation is performed on the predicted value or the set of predicted values of the test set account combination by using an association algorithm, and a computed prediction result for the test set account combination is outputted. Because the extracted features are at least two different features that are correlated with behavioral data of a user of the account, user information can be greatly enriched, which makes a final prediction result more accurate.
  • FIG. 1 is a flowchart of an implementation manner of a method for identifying a user in multiple social networks according to the present invention, where the method includes:
  • it is defined that a user has only one account in one social network, that the number of accounts in an account combination equals the number of social networks, and that each account in the account combination comes from a different social network.
  • Social networks are increasingly popular and of increasing types, such as Facebook, Twitter, WeChat, and Foursquare. Most of the social networks are independent of one another. Many users have registered accounts in different social networks. In the prior art, a technical solution already exists for identifying that different accounts in one social network belong to a same user, and therefore, it is defined that one user has only one account in one social network, that the number of accounts in an account combination equals the number of social networks, and that each account in the account combination comes from a different social network.
  • for example, a social network A and a social network B are different social networks
  • the social network A has three user accounts, namely, a_1, a_2 and a_3
  • the social network B has four user accounts, namely, b_1, b_2, b_3 and b_4
  • the number of accounts in an account combination is 2, because there are two social networks.
  • account combinations (a_1, b_1), (a_2, b_2), (a_3, b_3) and the like satisfy the requirement
  • account combinations (a_1, a_2), (b_2, b_3) and the like do not satisfy the requirement, because a_1 and a_2 belong to the same social network, and b_2 and b_3 also belong to the same social network.
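The rule that each account combination takes exactly one account from each social network amounts to a Cartesian product over the networks. A minimal sketch, using the account names from the example above and a hypothetical helper `account_combinations`:

```python
from itertools import product


def account_combinations(networks):
    """Build every combination that takes exactly one account per network."""
    names = sorted(networks)  # fixed network order so tuples are comparable
    return list(product(*(networks[name] for name in names)))


networks = {
    "A": ["a1", "a2", "a3"],
    "B": ["b1", "b2", "b3", "b4"],
}

combos = account_combinations(networks)
```

With three accounts in network A and four in network B this yields 12 candidate combinations, each of length 2; pairs such as (a1, a2) never appear because both accounts come from the same network.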
  • Step S101: Input accounts that are in a test set and are obtained from registered accounts in at least two different social networks, and generate a test set account combination from the accounts in the test set.
  • the accounts in the test set come from registered accounts in at least two different social networks, and it is unknown whether the accounts are registered by a same user in the at least two different social networks. Firstly, the accounts in the test set are obtained from the registered accounts in the at least two different social networks, and after the accounts in the test set are inputted, the test set account combination is generated from the accounts in the test set, so as to predict whether accounts in the test set account combination belong to a same user.
  • Step S102: Extract at least two different features that are of each account in the test set account combination and are correlated with behavioral data of a user of the account.
  • a feature correlated with behavioral data of a user of an account refers to a feature that is correlated with data with respect to behavioral habits or behavior characteristics of the user of the account in a social network, where behavioral data of a user of an account includes: data with respect to time ranges when the user logs in to a social network, data with respect to locations where the user logs in to a social network, data with respect to language habits in which the user publishes comments in a social network, data with respect to friends followed by the user, data with respect to points of interest of the user, and the like.
  • behavioral habits or behavior characteristics of one user are often highly fixed and highly personalized. If features represented by data with respect to behavioral habits or behavior characteristics of users corresponding to accounts in different social networks are very similar, it is most likely that the accounts in the different social networks belong to a same user.
  • the at least two different features that are correlated with behavioral data of a user of the account include but are not limited to: a social feature of an account in the account combination, a spatial feature of account publishing information in the account combination, a temporal feature of account publishing information in the account combination, and a text feature of account publishing information in the account combination.
  • the social feature of an account in the account combination mainly describes a friendship status or friendship characteristic of a user corresponding to the account.
  • the social feature of an account in the account combination includes: the number of common adjacent elements, a Jaccard similarity coefficient and an Adamic/Adar measure.
  • the number of common adjacent elements refers to the number of common friends of accounts in the account combination, where accounts having the common friends are in a training set;
  • the Jaccard similarity coefficient refers to a ratio of the number of common friends of accounts in the account combination to the number of all friends of accounts in the account combination;
  • the Adamic/Adar measure refers to influence of the common friends of accounts in the account combination on respective social networks.
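  • the Adamic/Adar measure equals a sum of the form $\sum \log^{-1}(\ldots)$ taken over the common friends of the accounts in the account combination.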
  • a social network A and a social network B are different social networks
  • the social network A has three user accounts a_i, namely, a_1, a_2 and a_3
  • the social network B has four user accounts b_i, namely, b_1, b_2, b_3 and b_4
  • the account a_1 has friends a_2 and a_3
  • the account b_1 has friends b_2, b_3, and b_4
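The three social features can be sketched for the accounts a1 and b1 of this example. Several things here are assumptions, not taken from the text: the known same-user links `anchors` (standing in for training set pairs), the friend lists of accounts other than a1 and b1, the reading of "all friends" as the union of both friend lists, and the Adamic/Adar argument, which is taken to be the mean friend count of a common friend's two accounts.

```python
import math


def social_features(a, b, friends_a, friends_b, anchors):
    """Return (common friend count, Jaccard coefficient, Adamic/Adar measure).

    anchors maps an account in network A to the account in network B that is
    already known to belong to the same user (assumed to come from training data).
    """
    common = [(x, anchors[x]) for x in friends_a[a]
              if x in anchors and anchors[x] in friends_b[b]]
    n_common = len(common)
    # "All friends" is read as the union, counting each common friend once.
    n_all = len(friends_a[a]) + len(friends_b[b]) - n_common
    jaccard = n_common / n_all if n_all else 0.0
    adamic_adar = 0.0
    for x, y in common:
        mean_degree = (len(friends_a[x]) + len(friends_b[y])) / 2
        if mean_degree > 1:  # log(1) == 0 would divide by zero
            adamic_adar += 1.0 / math.log(mean_degree)
    return n_common, jaccard, adamic_adar


friends_a = {"a1": {"a2", "a3"}, "a2": {"a1", "a3"}, "a3": {"a1", "a2"}}
friends_b = {"b1": {"b2", "b3", "b4"}, "b2": {"b1", "b3"},
             "b3": {"b1", "b2"}, "b4": {"b1"}}
anchors = {"a2": "b2", "a3": "b3"}  # hypothetical known same-user pairs

n_common, jaccard, adamic_adar = social_features(
    "a1", "b1", friends_a, friends_b, anchors)
```

Under these assumptions a1 and b1 share two common friends out of three distinct friends overall, so the Jaccard coefficient is 2/3.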
  • the spatial feature of account publishing information in the account combination mainly refers to a feature with respect to locations where the account publishes information, for example: where the account often publishes information, whether at home or in a company or at a public cybercafé; and a location where the account mostly publishes information, and so on.
  • information in other social networks is updated simultaneously. If there are two accounts in two different social networks, and it is found that the two accounts often publish information at the same location, it is most likely that the two accounts in the two different social networks belong to a same user.
  • the spatial feature of account publishing information in the account combination includes but is not limited to: the number of common locations of all account publishing information in the account combination, a cosine similarity of a location set of all account publishing information in the account combination, and an average distance of the location set of all account publishing information in the account combination.
  • the cosine similarity is used to compute the similarity of two vectors. The closer the cosine value is to 1, the closer the included angle between the two vectors is to 0 degrees, that is, the more similar the two vectors are. This is called "cosine similarity".
  • locations where the account a_1 publishes information include a location 1 (with 4 information publications), a location 2 (with 7 information publications), and a location 3 (with 2 information publications), and locations where the account b_1 publishes information include the location 1 (with 4 information publications), the location 2 (with 7 information publications), a location 4 (with 1 information publication), and a location 5 (with 1 information publication), then the number of common locations of the account a_1 and the account b_1 for information publication is 2; according to a sequence of the location 1, the location 2, the location 3, the location 4, and the location 5, a vector of the account a_1 may be (4, 7, 2, 0, 0), and a vector of the account b_1 may be (4, 7, 0, 1, 1), and a cosine similarity of the two vectors can be obtained by computing cosine values of the two vectors; according to the three locations of the account a_1 (
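The location-count vectors from this example can be checked with a short sketch. The helper names `cosine_similarity` and `common_count` are hypothetical; the vectors are the ones given in the text.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length count vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # no publications on one side, so no measurable similarity
    return dot / (norm_u * norm_v)


def common_count(u, v):
    """Number of positions (locations) where both accounts have published."""
    return sum(1 for x, y in zip(u, v) if x > 0 and y > 0)


a1_locations = [4, 7, 2, 0, 0]  # publications at locations 1..5 for a1
b1_locations = [4, 7, 0, 1, 1]  # publications at locations 1..5 for b1

n_common_locations = common_count(a1_locations, b1_locations)
location_cosine = cosine_similarity(a1_locations, b1_locations)
```

For these vectors the inner product is 65 and the norms are the square roots of 69 and 67, giving a cosine similarity of about 0.956 and two common locations.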
  • the temporal feature of account publishing information in the account combination mainly refers to a feature with respect to time ranges in which the account publishes information.
  • the temporal feature of account publishing information in the account combination includes but is not limited to: the number of common time ranges of all account publishing information in the account combination, and a cosine similarity of a time range set of all account publishing information in the account combination.
  • time ranges in which the account a_1 publishes information include a time range 1 (with 5 information publications), a time range 2 (with 8 information publications), and a time range 3 (with 2 information publications), and time ranges in which the account b_1 publishes information include the time range 1 (with 5 information publications), the time range 2 (with 8 information publications), a time range 4 (with 1 information publication), and a time range 5 (with 1 information publication), then the number of common time ranges of the account a_1 and the account b_1 for information publication is 2; according to a sequence of the time range 1, the time range 2, the time range 3, the time range 4, and the time range 5, a vector of the account a_1 may be (5, 8, 2, 0, 0), and a vector of the account b_1 may be (5, 8, 0, 1, 1), and a cosine similarity of the two vectors may be obtained by computing cos
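How publication timestamps turn into a time-range count vector is not spelled out in the text. One possible sketch, assuming that a "time range" is a fixed block of hours within the day (six hours here) and using made-up timestamps and a hypothetical helper `time_range_counts`:

```python
from collections import Counter
from datetime import datetime


def time_range_counts(timestamps, hours_per_range=6):
    """Count publications per time-of-day range (an assumed bucketing scheme)."""
    counts = Counter(ts.hour // hours_per_range for ts in timestamps)
    n_ranges = 24 // hours_per_range
    return [counts.get(r, 0) for r in range(n_ranges)]


# Hypothetical publication times for two accounts.
a1_posts = [datetime(2014, 1, 1, 8, 30), datetime(2014, 1, 2, 9, 0),
            datetime(2014, 1, 2, 21, 15)]
b1_posts = [datetime(2014, 1, 1, 7, 45), datetime(2014, 1, 3, 22, 5),
            datetime(2014, 1, 4, 2, 0)]

a1_vector = time_range_counts(a1_posts)
b1_vector = time_range_counts(b1_posts)
common_ranges = sum(1 for x, y in zip(a1_vector, b1_vector) if x > 0 and y > 0)
```

The resulting vectors play the same role as (5, 8, 2, 0, 0) and (5, 8, 0, 1, 1) in the example: the number of common time ranges is read off directly, and the cosine similarity is computed on the two vectors.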
  • the text feature of account publishing information in the account combination refers to some language habits in which the account publishes information.
  • the text feature of account publishing information in the account combination includes but is not limited to: an inner product of bag-of-words vectors of all account publishing information in the account combination, and a cosine similarity of the bag-of-words vectors of all account publishing information in the account combination.
  • Bag of words In information retrieval, a bag of words assumes that, with respect to a text, its word sequence, grammar or syntax are ignored and the text is regarded merely as a word set or a combination of words, and that each word in the text occurs independently and is independent of occurrence of other words. In other words, when an author of the text chooses a word in any one position, the choice is made independently under no influence of a previous sentence.
  • Inner product is also referred to as scalar product or dot product. Assuming there are n-dimensional vectors $\alpha = (a_1, a_2, \ldots, a_n)$ and $\beta = (b_1, b_2, \ldots, b_n)$, the inner product of the vectors $\alpha$ and $\beta$ is $\alpha \cdot \beta = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n$.
  • bags of words with which the account a 1 publishes information include a bag of words 1 (with 15 occurrences), a bag of words 2 (with 21 occurrences), a bag of words 3 (with 12 occurrences), and a bag of words 4 (with 5 occurrences), and bags of words with which the account b 1 publishes information include the bag of words 1 (with 15 occurrences), the bag of words 2 (with 21 occurrences), the bag of words 4 (with 12 occurrences), and a bag of words 5 (with 8 occurrences); according to a sequence of the bag of words 1, the bag of words 2, the bag of words 3, the bag of words 4, and the bag of words 5, a bag-of-words vector of the account a 1 may be (15, 21, 12, 5, 0), and a bag-of-words vector of
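  • As an illustration of the text feature, the sketch below builds the two bag-of-words vectors from the occurrence counts listed above and computes their inner product. The vector for the account b1 is derived here from those listed counts; the helper names `to_vector` and `inner_product` are assumptions made for this example.

```python
def to_vector(counts, vocabulary):
    """Align a {term: occurrences} mapping to a fixed vocabulary order."""
    return [counts.get(term, 0) for term in vocabulary]


def inner_product(u, v):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


vocabulary = ["bag 1", "bag 2", "bag 3", "bag 4", "bag 5"]

# Occurrence counts as listed in the description.
counts_a1 = {"bag 1": 15, "bag 2": 21, "bag 3": 12, "bag 4": 5}
counts_b1 = {"bag 1": 15, "bag 2": 21, "bag 4": 12, "bag 5": 8}

vector_a1 = to_vector(counts_a1, vocabulary)
vector_b1 = to_vector(counts_b1, vocabulary)
text_inner_product = inner_product(vector_a1, vector_b1)
```

  • Only the bags of words used by both accounts contribute to the inner product; a bag used by one account alone multiplies against a zero.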
  • an embodiment of the present invention may further provide another implementation manner, which specifically includes:
  • TF-IDF reflects the importance of a word to one document in a document set, and is typically used as a weighting factor in text data mining and information extraction.
  • Term frequency (Term Frequency, TF)
  • Inverse document frequency (Inverse Document Frequency, IDF) is a measure of the general importance of a word. The IDF of a specific word is obtained by dividing the total number of documents by the number of documents that contain the word and then taking the logarithm of the quotient.
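  • The IDF rule above can be sketched as follows. The IDF function follows the description (logarithm of total documents over documents containing the word); the TF normalisation by document length is an assumption, since the description of TF is not given in full here, and the tokenised documents are hypothetical.

```python
import math
from collections import Counter


def term_frequency(term, document):
    """Assumed TF: occurrences of the term divided by document length."""
    if not document:
        return 0.0
    return Counter(document)[term] / len(document)


def inverse_document_frequency(term, documents):
    """log(total documents / documents containing the term)."""
    containing = sum(1 for doc in documents if term in doc)
    if containing == 0:
        # Assumption: a word that never occurs gets weight 0.
        return 0.0
    return math.log(len(documents) / containing)


def tf_idf(term, document, documents):
    return term_frequency(term, document) * inverse_document_frequency(term, documents)


# Hypothetical tokenised publications of three accounts.
documents = [
    ["coffee", "run", "coffee"],
    ["run", "work"],
    ["work", "meeting"],
]
coffee_weight = tf_idf("coffee", documents[0], documents)
run_idf = inverse_document_frequency("run", documents)
```

  • A word that appears in few documents, such as "coffee" here, receives a larger IDF than a word spread across many documents.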
  • Step S103 Input the at least two different features that are of each account in the test set account combination and are correlated with behavioral data of a user of the account into a classification and prediction model that is already established, to obtain a predicted value or a set of predicted values of the test set account combination, where a predicted value indicates how likely the accounts in the combination are to belong to the same user.
  • the classification and prediction model is already established. When the model is established, accounts in the training set are used to generate a training set account combination, where the accounts in each training set account combination belong to the same user, and the features extracted for training are at least two different features that are of each account in the training set account combination and are correlated with behavioral data of a user of the account.
  • a predicted value of the test set account combination, indicating how likely its accounts are to belong to the same user, may be obtained; and when there are multiple test set account combinations, a set of such predicted values may be obtained.
  • Step S104 Perform computation on the predicted value or the set of predicted values of the test set account combination by using an association algorithm, and output a computed prediction result for the test set account combination.
  • the association algorithm refers to a process in which the correlation between predicted values of test set account combinations is computed to obtain a final prediction result for each test set account combination, that is, whether or not the accounts in the test set account combination belong to the same user.
  • At least two different features that are of each account in a test set account combination and are correlated with behavioral data of a user of the account are extracted; the at least two different features that are of each account in the test set account combination and are correlated with behavioral data of a user of the account are inputted into a classification and prediction model that is already established, to obtain a predicted value or a set of predicted values, which may belong to a user, of the test set account combination; and computation is performed on the predicted value or the set of predicted values of the test set account combination by using an association algorithm, and a computed prediction result for the test set account combination is outputted.
  • because the extracted features are at least two different features that are correlated with behavioral data of a user of the account, user information can be greatly enriched, which makes the final prediction result more accurate.
  • the Jaccard similarity coefficient, the Adamic/Adar measure, and similar indicators can extend the conventional definitions used in the prior art.
  • FIG. 2 is a flowchart of another implementation manner of the method for identifying a user in multiple social networks according to the present invention.
  • This implementation manner is essentially the same as the implementation manner in FIG. 1.
  • content including mainly step S201, step S202 and step S203, where specific content includes:
  • Step S201 Input accounts in a training set which are obtained from at least two different social networks, and generate a training set account combination from accounts, which belong to the same user, among the accounts in the training set.
  • the accounts in the training set come from registered accounts in at least two different social networks, that is, the accounts in the training set and those in the test set have the same sources. First, the accounts in the training set are obtained from the registered accounts in the at least two different social networks, and after the accounts in the training set are inputted, the training set account combination is generated from the accounts in the training set, so as to predict whether accounts in the training set account combination belong to the same user.
  • Step S202 Extract at least two different features that are of each account in the training set account combination and are correlated with behavioral data of a user of the account.
  • a feature correlated with behavioral data of a user of an account refers to a feature that is correlated with data with respect to behavioral habits or behavior characteristics of the user of the account in a social network, where the behavioral data of a user of an account includes: data with respect to time ranges when the user logs in to a social network, data with respect to locations where the user logs in to a social network, data with respect to language habits in which the user publishes comments in a social network, data with respect to friends followed by the user, data with respect to points of interest of the user, and the like.
  • behavioral habits or behavior characteristics of one user are often highly stable and highly personalized. If the features represented by data with respect to behavioral habits or behavior characteristics of users corresponding to accounts in different social networks are very similar, the accounts in the different social networks most likely belong to the same user.
  • a classification and prediction model can be established.
  • Step S203 Train the training set with a supervised classification algorithm by using the at least two different features that are of each account in the training set account combination and are correlated with behavioral data of a user of the account, to obtain a classification and prediction model.
  • the supervised classification algorithm is a kind of machine learning classification algorithm, including but not limited to: support vector machines (Support Vector Machines, SVM), and logistic regression (Logistic Regression, LR).
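  • To make the training step concrete, the sketch below trains a logistic regression classifier, one of the algorithms named above, on per-combination feature vectors. It is a plain batch gradient descent written for illustration only; the feature values, labels, learning rate and epoch count are hypothetical and are not taken from the description.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(samples, labels, learning_rate=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    m = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(samples, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = sigmoid(z) - y
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / m for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / m
    return weights, bias


def predict(model, features):
    """Predicted value in (0, 1) for one account combination."""
    weights, bias = model
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


# Hypothetical training rows: [location cosine, time-range cosine].
train_features = [
    [0.95, 0.97], [0.90, 0.85], [0.80, 0.90],  # combinations of one user
    [0.10, 0.20], [0.30, 0.15], [0.20, 0.05],  # combinations of different users
]
train_labels = [1, 1, 1, 0, 0, 0]

model = train_logistic_regression(train_features, train_labels)
similar_score = predict(model, [0.9, 0.9])
dissimilar_score = predict(model, [0.1, 0.1])
```

  • The value returned by `predict` plays the role of the predicted value that is later passed to the association algorithm.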
  • Step S204 Input accounts that are in a test set and are obtained from registered accounts in at least two different social networks, and generate a test set account combination from the accounts in the test set.
  • Step S205 Extract at least two different features that are of each account in the test set account combination and are correlated with behavioral data of a user of the account.
  • Step S206 Input the at least two different features that are of each account in the test set account combination and are correlated with behavioral data of a user of the account into the classification and prediction model that is already established, to obtain a predicted value or a set of predicted values, which may belong to a user, of the test set account combination.
  • Step S207 Perform computation on the predicted value or the set of predicted values of the test set account combination by using an association algorithm, and output a computed prediction result for the test set account combination.
  • step S201, step S202 and step S203 only need to be performed before step S206, for example, step S201, step S202 and step S203 may be performed simultaneously with step S204 and step S205, without being limited to the sequence defined in this implementation manner. No further details are described herein.
  • At least two different features that are of each account in a test set account combination and are correlated with behavioral data of a user of the account are extracted; the at least two different features that are of each account in the test set account combination and are correlated with behavioral data of a user of the account are inputted into a classification and prediction model that is already established, to obtain a predicted value or a set of predicted values, which may belong to a user, of the test set account combination; and computation is performed on the predicted value or the set of predicted values of the test set account combination by using an association algorithm, and a computed prediction result for the test set account combination is outputted. Because the extracted features are at least two different features that are correlated with behavioral data of a user of the account, user information can be greatly enriched, which makes a final association result more accurate.
  • FIG. 3 is a flowchart of still another implementation manner of the method for identifying a user in multiple social networks according to the present invention.
  • This implementation manner is essentially the same as the implementation manners in FIG. 1 and FIG. 2.
  • Mainly step S307a, step S307b, step S307c and step S307d are included, where specific content includes:
  • Step S301 Input accounts in a training set which are obtained from at least two different social networks, and generate a training set account combination from accounts, which belong to a user, among the accounts in the training set.
  • Step S302 Extract at least two different features that are of each account in the training set account combination and are correlated with behavioral data of a user of the account.
  • Step S303 Train the training set with a supervised classification algorithm by using the at least two different features that are of each account in the training set account combination and are correlated with behavioral data of a user of the account, to obtain a classification and prediction model.
  • Step S304 Input accounts that are in a test set and are obtained from registered accounts in at least two different social networks, and generate a test set account combination from the accounts in the test set.
  • Step S305 Extract at least two different features that are of each account in the test set account combination and are correlated with behavioral data of a user of the account.
  • Step S306 Input the at least two different features that are of each account in the test set account combination and are correlated with behavioral data of a user of the account into the classification and prediction model that is already established, to obtain a predicted value or a set of predicted values of the test set account combination, where a predicted value indicates how likely the accounts in the combination are to belong to the same user.
  • Step S307 Perform computation on the predicted value or the set of predicted values of the test set account combination by using an association algorithm, and output a computed prediction result for the test set account combination.
  • Step S307 specifically includes: performing computation on the predicted value or the set of predicted values of the test set account combination by using a multi-network approach (MNA) algorithm, and outputting the computed prediction result for the test set account combination. Specifically, the following substeps are included:
  • Substep S307a Sort, in the test set account combination, predicted values or sets of predicted values of all account combinations corresponding to an account in the test set in order of the predicted values, to obtain a list of predicted values of the account.
  • Substep S307b Determine whether a closed account pair exists in the test set account combination. If a closed account pair exists, go to substep S307c; if no closed account pair exists, go to substep S307d.
  • Substep S307c If a closed account pair exists in the test set account combination, determine that the accounts corresponding to the closed account pair belong to the same user, and output the closed account pair, where the closed account pair meets the following conditions: the test set account combination corresponding to the maximum predicted value in the list of predicted values of the account a_i is (a_i, b_j), and the test set account combination corresponding to the maximum predicted value in the list of predicted values of the account b_j is (b_j, a_i).
  • Substep S307d If the test set account combination corresponding to the maximum predicted value in the list of predicted values of the account a_i is (a_i, b_j), and the test set account combination corresponding to the maximum predicted value in the list of predicted values of the account b_j is (b_j, a_k), compare the predicted value of the test set account combination (a_i, b_j) with the predicted value of the test set account combination (b_j, a_k).
  • Substep S307e If the predicted value of the test set account combination (a_i, b_j) is smaller than the predicted value of the test set account combination (b_j, a_k), determine that the account a_k and the account b_j belong to the same user and that the account a_i and the account b_j do not, and output the test set account combination (b_j, a_k); if the predicted value of the test set account combination (a_i, b_j) is greater than the predicted value of the test set account combination (b_j, a_k), determine that the account a_i and the account b_j belong to the same user and that the account a_k and the account b_j do not, and output the test set account combination (a_i, b_j).
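  • The closed-account-pair condition in substep S307c can be checked directly from the predicted values. The sketch below is one possible reading of that condition, assuming the predicted values are held in a dictionary keyed by (account in network A, account in network B); the values are those that appear in Table 1 of this description, and the function names are illustrative.

```python
def sorted_candidates(scores, account, side):
    """List of (other account, predicted value) in descending order (substep S307a)."""
    if side == "A":
        items = [(b, v) for (a, b), v in scores.items() if a == account]
    else:
        items = [(a, v) for (a, b), v in scores.items() if b == account]
    return sorted(items, key=lambda item: item[1], reverse=True)


def closed_account_pairs(scores):
    """Pairs (a, b) where b is a's top candidate and a is b's top candidate."""
    accounts_a = sorted({a for a, _ in scores})
    pairs = []
    for a in accounts_a:
        top_b = sorted_candidates(scores, a, "A")[0][0]
        top_a = sorted_candidates(scores, top_b, "B")[0][0]
        if top_a == a:
            pairs.append((a, top_b))
    return pairs


# Predicted values as listed in Table 1.
scores = {
    ("a1", "b1"): 0.8, ("a1", "b2"): 0.6,
    ("a2", "b1"): 0.7, ("a2", "b2"): 0.4,
    ("a3", "b1"): 0.5, ("a3", "b2"): 0.2,
}
closed_pairs = closed_account_pairs(scores)
```

  • With these values only (a1, b1) is closed: b1 is the top candidate of every account in network A, but b1's own top candidate is a1.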
  • Table 1

| Account | List of predicted values | Matching state | Inference result set A' |
|---|---|---|---|
| a1 | [b1(0.8), b2(0.6)] | No | [] |
| a2 | [b1(0.7), b2(0.4)] | No | |
| a3 | [b1(0.5), b2(0.2)] | No | |
| b1 | [a1(0.8), a2(0.7), a3(0.5)] | No | |
| b2 | [a1(0.6), a2(0.4), a3(0.2)] | No | |
  • a possibility that (a1, b1) belongs to the same user and a possibility that (a1, b2) belongs to the same user are 0.8 and 0.6 respectively, and the two values are prediction results of the classification model.
  • the list of predicted values of each account is sorted in descending order of prediction results of various account combinations.
  • the "matching state" describes whether an account is already matched to another account in another network, that is, whether the account appears in the "inference result set A'".
  • the "inference result set A'" includes all account combinations that are deduced to belong to the same user and records the predicted value of each such account combination. It is not hard to see that a closed account pair (a1, b1) exists in the above test set. Association computation for the above result is performed below.
  • Step 1 Obtain one account in the social network A, where the account meets the conditions that its "matching state" is "No" and that its "List of predicted values" is not blank.
  • the account a3 is obtained, and with respect to the account a3, the first-place account b1 in the social network B is obtained from the list of predicted values of the account a3; b1 is deleted from the list of predicted values of a3, and the matching state of b1 is determined; the matching state of b1 is "No", so (a3, b1) is deduced to belong to the same user, (a3, b1) is added to the "Inference result set A'", and the matching states of a3 and b1 are changed to "Yes".
  • Table 2

| Account | List of predicted values | Matching state | Inference result set A' |
|---|---|---|---|
| a1 | [b1(0.8), b2(0.6)] | No | [(a3, b1)+(0.5)] |
| a2 | [b1(0.7), b2(0.4)] | No | |
| a3 | [b2(0.2)] | Yes | |
| b1 | [a1(0.8), a2(0.7), a3(0.5)] | Yes | |
| b2 | [a1(0.6), a2(0.4), a3(0.2)] | No | |
  • Step 2 Obtain one account in the social network A, where the account meets the conditions that its "matching state" is "No" and that its "List of predicted values" is not blank.
  • the account a1 is obtained, and with respect to the account a1, the first-place account b1 in the social network B is obtained from the list of predicted values of the account a1; b1 is deleted from the list of predicted values of a1, and the matching state of b1 is determined; the matching state of b1 is "Yes", and it is found from the "Inference result set A'" that b1 is matched to a3 with a predicted value 0.5, so the predicted value of (a1, b1) is compared with the predicted value of (a3, b1); the result is that the predicted value of (a1, b1) is greater than the predicted value of (a3, b1), and then the following operations are performed: (a3, b1) is replaced with (a1, b1) in the "Inference result set A'", the matching state of a3 is changed back to "No", and the matching state of a1 is changed to "Yes".
  • Table 3

| Account | List of predicted values | Matching state | Inference result set A' |
|---|---|---|---|
| a1 | [b2(0.6)] | Yes | [(a1, b1)+(0.8)] |
| a2 | [b1(0.7), b2(0.4)] | No | |
| a3 | [b2(0.2)] | No | |
| b1 | [a1(0.8), a2(0.7), a3(0.5)] | Yes | |
| b2 | [a1(0.6), a2(0.4), a3(0.2)] | No | |
  • Step 3 Obtain one account in the social network A, where the account meets the conditions that its "matching state" is "No" and that its "List of predicted values" is not blank.
  • the account a2 is obtained, and with respect to the account a2, the first-place account b1 in the social network B is obtained from the list of predicted values of the account a2; b1 is deleted from a list of predicted values of a2, and a matching state of b1 is determined; the matching state of b1 is "Yes", and it is found from "Inference result set A'" that b1 is matched to a1 with a predicted value 0.8, and the predicted value of (a1, b1) is compared with a predicted value of (a2, b1); and a result is that the predicted value of (a1, b1) is greater than the predicted value of (a2, b1), and then a next-step operation is performed.
  • Table 4

| Account | List of predicted values | Matching state | Inference result set A' |
|---|---|---|---|
| a1 | [b2(0.6)] | Yes | [(a1, b1)+(0.8)] |
| a2 | [b2(0.4)] | No | |
| a3 | [b2(0.2)] | No | |
| b1 | [a1(0.8), a2(0.7), a3(0.5)] | Yes | |
| b2 | [a1(0.6), a2(0.4), a3(0.2)] | No | |
  • Step 4 Obtain one account in the social network A, where the account meets the conditions that its "matching state" is "No" and that its "List of predicted values" is not blank.
  • the account a2 is obtained, and with respect to the account a2, the first-place account b2 in the social network B is obtained from the list of predicted values of the account a2; b2 is deleted from the list of predicted values of a2, and the matching state of b2 is determined; the matching state of b2 is "No", so (a2, b2) is deduced to belong to the same user, (a2, b2) is added to the "Inference result set A'", and the matching states of a2 and b2 are changed to "Yes".
  • Table 5

| Account | List of predicted values | Matching state | Inference result set A' |
|---|---|---|---|
| a1 | [b2(0.6)] | Yes | [(a1, b1)+(0.8)] |
| a2 | [] | Yes | [(a2, b2)+(0.4)] |
| a3 | [b2(0.2)] | No | |
| b1 | [a1(0.8), a2(0.7), a3(0.5)] | Yes | |
| b2 | [a1(0.6), a2(0.4), a3(0.2)] | Yes | |
  • Step 5 Obtain one account in the social network A, where the account meets the conditions that its "matching state" is "No" and that its "List of predicted values" is not blank. Now, only the account a3 can be obtained, and with respect to the account a3, the first-place account b2 in the social network B is obtained from the list of predicted values of the account a3; b2 is deleted from the list of predicted values of a3, and the matching state of b2 is determined; the matching state of b2 is "Yes", and it is found from the "Inference result set A'" that b2 is matched to a2 with a predicted value 0.4, so the predicted value of (a2, b2) is compared with the predicted value of (a3, b2); the result is that the predicted value of (a2, b2) is greater than the predicted value of (a3, b2), and then a next-step operation is performed.
  • Step 6 Now, no account in the social network A meets both conditions that its "matching state" is "No" and that its "List of predicted values" is not blank, and the operation is ended.
  • the final result is that (a1, b1) belongs to the same user and (a2, b2) belongs to the same user. Now, it can be seen that the closed account pair (a1, b1) appears in the set A', and the set A' is stable.
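  • The walkthrough in Steps 1 to 6 can be sketched as a loop over unmatched accounts of the social network A. This is a minimal illustration under assumptions, not a definitive implementation: the description does not fix which eligible account is picked at each step, so the sketch simply takes the first one in the given order, and the name `associate` is hypothetical. The intermediate states may therefore differ from Tables 2 to 4 while the final inference result set is the same.

```python
def associate(scores, accounts_a):
    """Return {(a, b): predicted value} for pairs inferred to be one user."""
    # Candidate list of each account in A, highest predicted value first.
    candidates = {
        a: sorted(
            ((b, v) for (x, b), v in scores.items() if x == a),
            key=lambda item: item[1],
            reverse=True,
        )
        for a in accounts_a
    }
    matched_b = {}     # b -> (a, predicted value) currently in the result set
    matched_a = set()  # accounts in A whose matching state is "Yes"

    while True:
        pending = [a for a in accounts_a if a not in matched_a and candidates[a]]
        if not pending:
            break  # Step 6: no unmatched account with a non-blank list
        a = pending[0]
        b, value = candidates[a].pop(0)  # take and delete the first-place account
        if b not in matched_b:
            matched_b[b] = (a, value)
            matched_a.add(a)
        else:
            current_a, current_value = matched_b[b]
            if value > current_value:
                # The new pair wins; the previous partner becomes unmatched.
                matched_a.discard(current_a)
                matched_b[b] = (a, value)
                matched_a.add(a)
    return {(a, b): v for b, (a, v) in matched_b.items()}


# Predicted values as listed in Table 1.
scores = {
    ("a1", "b1"): 0.8, ("a1", "b2"): 0.6,
    ("a2", "b1"): 0.7, ("a2", "b2"): 0.4,
    ("a3", "b1"): 0.5, ("a3", "b2"): 0.2,
}
inference_result = associate(scores, ["a1", "a2", "a3"])
```

  • The procedure resembles a deferred-acceptance matching in which accounts of network A propose in order of predicted value and each account of network B keeps its best proposal so far.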
  • At least two different features that are of each account in a test set account combination and are correlated with behavioral data of a user of the account are extracted; these features are inputted into a classification and prediction model that is already established, to obtain a predicted value or a set of predicted values of the test set account combination, indicating how likely its accounts are to belong to the same user; and computation is performed on the predicted value or the set of predicted values by using an association algorithm, and a computed prediction result for the test set account combination is outputted. Because the extracted features are at least two different features that are correlated with behavioral data of a user of the account, user information can be greatly enriched, which makes the final association result more accurate. In addition, an optimal global matching result can be obtained rapidly by using the MNA algorithm.
  • FIG. 4 and FIG. 5 are structural diagrams of two implementation manners of an apparatus for identifying a user in multiple social networks according to the present invention, where the apparatus includes: a first generating module 101, a first extraction module 102, a first obtaining module 103 and an output module 104.
  • the apparatus in this implementation manner can perform the steps in FIG. 1 to FIG. 3 .
  • Social networks are increasingly popular and increasingly varied, for example, Facebook, Twitter, WeChat, and Foursquare. Most social networks are independent of one another, and many users have registered accounts in different social networks. In the prior art, a technical solution already exists for identifying that different accounts in one social network belong to the same user; therefore, it is defined that one user has only one account in one social network, that the number of accounts in an account combination equals the number of social networks, and that each account in the account combination comes from a different social network.
  • the first generating module 101 is configured to, after accounts that are in a test set and are obtained from registered accounts in at least two different social networks are inputted, generate a test set account combination from the accounts in the test set.
  • the accounts in the test set come from registered accounts in at least two different social networks, and it is unknown whether the accounts are registered by the same user in the at least two different social networks. First, the accounts in the test set are obtained from the registered accounts in the at least two different social networks, and after the accounts in the test set are inputted, the test set account combination is generated from the accounts in the test set, so as to predict whether accounts in the test set account combination belong to the same user.
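  • Since each account combination takes exactly one account from each social network, generating the test set account combinations amounts to a Cartesian product over the per-network account lists. The sketch below illustrates this under that reading; the function name and the sample account identifiers are hypothetical.

```python
from itertools import product


def generate_account_combinations(accounts_by_network):
    """One account per network in every combination, networks in sorted order."""
    networks = sorted(accounts_by_network)
    return list(product(*(accounts_by_network[name] for name in networks)))


# Hypothetical test set drawn from two social networks.
test_accounts = {
    "A": ["a1", "a2", "a3"],
    "B": ["b1", "b2"],
}
test_combinations = generate_account_combinations(test_accounts)
```

  • Each resulting tuple is one candidate combination whose features are then extracted and scored by the classification and prediction model.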
  • the first extraction module 102 is configured to, after the first generating module 101 generates the test set account combination, extract at least two different features that are of each account in the test set account combination and are correlated with behavioral data of a user of the account.
  • a feature correlated with behavioral data of a user of an account refers to a feature that is correlated with data with respect to behavioral habits or behavior characteristics of the user of the account in a social network.
  • behavioral habits or behavior characteristics of one user are often highly stable and highly personalized. If the features represented by data with respect to behavioral habits or behavior characteristics of users corresponding to accounts in different social networks are very similar, the accounts in the different social networks most likely belong to the same user.
  • the at least two different features that are correlated with behavioral data of a user of the account include but are not limited to: a social feature of an account in the account combination, a spatial feature of account publishing information in the account combination, a temporal feature of account publishing information in the account combination, and a text feature of account publishing information in the account combination.
  • the social feature of an account in the account combination mainly describes a friendship status or friendship feature of a user corresponding to the account.
  • the social feature of an account in the account combination includes: the number of common adjacent elements, a Jaccard similarity coefficient, and an Adamic/Adar measure.
  • the number of common adjacent elements refers to the number of common friends of accounts in the account combination, where accounts having the common friends are in a training set;
  • the Jaccard similarity coefficient refers to a ratio of the number of common friends of accounts in the account combination to the number of all friends of accounts in the account combination;
  • the Adamic/Adar measure refers to influence of the common friends of accounts in the account combination on respective social networks.
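  • the Adamic/Adar measure equals $\sum_{z \in \Gamma(a) \cap \Gamma(b)} \log^{-1} \lvert \Gamma(z) \rvert$, where, in the standard form of the measure, $\Gamma(\cdot)$ denotes the set of friends of an account and the sum runs over the common friends $z$ of the accounts $a$ and $b$.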
  • a social network A and a social network B are different social networks
  • the social network A has three user accounts a_i, namely, a1, a2 and a3
  • the social network B has four user accounts b_i, namely, b1, b2, b3 and b4
  • the account a1 has friends a2 and a3
  • the account b1 has friends b2, b3, and b4
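  • The three social features can be sketched for the accounts a1 and b1 as follows. The friend lists come from the example above; everything else is an assumption made for illustration: the known pairs (a2, b2) and (a3, b3) stand in for training set accounts already known to belong to one user, the friend counts of b2 and b3 are invented, "all friends" is read as the union counted once per shared friend, and the Adamic/Adar term uses the standard form with a natural logarithm.

```python
import math


def social_features(friends_a, friends_b, known_pairs, friend_count):
    """Return (common friends, Jaccard coefficient, Adamic/Adar measure)."""
    # Translate friends of the account in A into their known counterparts in B.
    mapped = {known_pairs[f] for f in friends_a if f in known_pairs}
    common = mapped & friends_b
    union_size = len(friends_a) + len(friends_b) - len(common)
    jaccard = len(common) / union_size if union_size else 0.0
    # Friends with a single connection are skipped to avoid dividing by log(1) = 0.
    adamic_adar = sum(
        1.0 / math.log(friend_count[z])
        for z in common
        if friend_count.get(z, 0) > 1
    )
    return len(common), jaccard, adamic_adar


friends_of_a1 = {"a2", "a3"}
friends_of_b1 = {"b2", "b3", "b4"}
known_pairs = {"a2": "b2", "a3": "b3"}  # hypothetical training set pairs
friend_count = {"b2": 4, "b3": 10}      # hypothetical friend counts

common, jaccard, adamic_adar = social_features(
    friends_of_a1, friends_of_b1, known_pairs, friend_count
)
```

  • A common friend with few connections contributes more to the Adamic/Adar measure than a highly connected one, which is why b2 weighs more than b3 in this example.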
  • the spatial feature of account publishing information in the account combination mainly refers to a feature with respect to locations where the account publishes information, for example: where the account often publishes information, whether at home or in a company or at a public cybercafé; and a location where the account mostly publishes information.
  • information in other social networks is updated simultaneously. If there are two accounts in two different social networks, and it is found that the two accounts often publish information at the same location, it is most likely that the two accounts in the two different social networks belong to the same user.
  • the spatial feature of account publishing information in the account combination includes but is not limited to: the number of common locations of all account publishing information in the account combination, a cosine similarity of a location set of all account publishing information in the account combination, and an average distance of the location set of all account publishing information in the account combination.
  • the cosine similarity is used to compute the similarity of two vectors. The closer the cosine value is to 1, the closer the included angle between the two vectors is to 0 degrees, that is, the more similar the two vectors are. This is called "cosine similarity".
  • locations where the account a 1 publishes information include a location 1 (with 4 information publications), a location 2 (with 7 information publications), and a location 3 (with 2 information publications), and locations where the account b 1 publishes information include the location 1 (with 4 information publications), the location 2 (with 7 information publications), a location 4 (with 1 information publication), and a location 5 (with 1 information publication), then the number of common locations of the account a 1 and the account b 1 for information publication is 2; according to a sequence of the location 1, the location 2, the location 3, the location 4, and the location 5, a vector of the account a 1 may be (4, 7, 2, 0, 0), and a vector of the account b 1 may be (4, 7, 0, 1, 1), and a cosine similarity of the two vectors can be obtained by computing cosine values of the two vectors; according to the three locations of the account a 1 (
  • the temporal feature of account publishing information in the account combination mainly refers to a feature with respect to time ranges in which the account publishes information.
  • the temporal feature of account publishing information in the account combination includes but is not limited to: the number of common time ranges of all account publishing information in the account combination, and a cosine similarity of a time range set of all account publishing information in the account combination.
  • time ranges in which the account a 1 publishes information include a time range 1 (with 5 information publications), a time range 2 (with 8 information publications), and a time range 3 (with 2 information publications), and time ranges in which the account b 1 publishes information include the time range 1 (with 5 information publications), the time range 2 (with 8 information publications), a time range 4 (with 1 information publication), and a time range 5 (with 1 information publication), then the number of common time ranges of the account a 1 and the account b 1 for information publication is 2; according to a sequence of the time range 1, the time range 2, the time range 3, the time range 4, and the time range 5, a vector of the account a 1 may be (5, 8, 2, 0, 0), and a vector of the account b 1 may be (5, 8, 0, 1, 1), and a cosine similarity of the two vectors may be obtained by computing cos
  • the text feature of account publishing information in the account combination refers to some language habits in which the account publishes information.
  • the text feature of account publishing information in the account combination includes but is not limited to: an inner product of bag-of-words vectors of all account publishing information in the account combination, and a cosine similarity of the bag-of-words vectors of all account publishing information in the account combination.
  • Bag of words: in information retrieval, a bag of words assumes that, with respect to a text, its word sequence, grammar or syntax are ignored and the text is regarded merely as a word set or a combination of words, and that each word in the text occurs independently and is independent of occurrence of other words. In other words, when an author of the text chooses a word in any one position, the choice is made independently under no influence of a previous sentence.
  • Inner product is also referred to as scalar product or dot product. Assuming there are n-dimensional vectors α = (a1, a2, ..., an) and β = (b1, b2, ..., bn), the inner product of the vectors α and β is α · β = a1b1 + a2b2 + ... + anbn.
  • bags of words with which the account a 1 publishes information include a bag of words 1 (with 15 occurrences), a bag of words 2 (with 21 occurrences), a bag of words 3 (with 12 occurrences), and a bag of words 4 (with 5 occurrences), and bags of words with which the account b 1 publishes information include the bag of words 1 (with 15 occurrences), the bag of words 2 (with 21 occurrences), the bag of words 4 (with 12 occurrences), and a bag of words 5 (with 8 occurrences); according to a sequence of the bag of words 1, the bag of words 2, the bag of words 3, the bag of words 4, and the bag of words 5, a bag-of-words vector of the account a 1 may be (15, 21, 12, 5, 0), and a bag-of-words vector of
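The text features can be illustrated with the occurrence counts listed above. This is a sketch only: the vector for the account b 1 is derived here from the listed counts (bag of words 1: 15, bag of words 2: 21, bag of words 4: 12, bag of words 5: 8) and is therefore an assumption, and the function name `inner_product` is hypothetical.

```python
import math

def inner_product(u, v):
    """Scalar (dot) product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))

# Order: bag of words 1, 2, 3, 4, 5.
a1_words = (15, 21, 12, 5, 0)   # stated in the text
b1_words = (15, 21, 0, 12, 8)   # assumption: derived from the listed counts

dot = inner_product(a1_words, b1_words)  # 726

# The cosine similarity is the inner product divided by both vector lengths.
cosine = dot / math.sqrt(inner_product(a1_words, a1_words)
                         * inner_product(b1_words, b1_words))
```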
  • an embodiment of the present invention may further provide another implementation manner. That is, the apparatus further includes: a processing module 105 and a third generating module 106, as shown in FIG. 5 .
  • the processing module 105 is configured to process, by using a natural language processing technology, information published by an account in the test set or training set account combination.
  • Natural language processing is an important direction of the field of computer science and the field of artificial intelligence. It studies various theories and methods for effective communication between a human being and a computer by using a natural language.
  • the third generating module 106 is configured to, after the processing module 105 processes the information published by an account in the test set or training set account combination, generate a bag-of-words vector of the account from the processed information by using a term frequency-inverse document frequency (TF-IDF) weighting model.
  • TF-IDF (term frequency-inverse document frequency) reflects the importance of one word in a document set to one document, and is typically used as a weight factor in text data mining and information extraction.
  • term frequency (TF)
  • Inverse document frequency IDF is a measure of general importance of a word. The IDF of a specific word may be obtained by dividing the total number of documents by the number of documents that contain the word and then acquiring a logarithm of the obtained quotient. A final result is the IDF of this specific word.
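A TF-IDF weighted bag-of-words vector could be built roughly as follows. The IDF follows the description above (logarithm of the total number of documents divided by the number of documents containing the word). The TF used here, the word count divided by the document length, is an assumption because the text does not give its definition. The function name `tf_idf_vectors` and the sample tokens are hypothetical.

```python
import math
from collections import Counter

def tf_idf_vectors(documents):
    """documents: list of token lists, one per account.

    Returns the sorted vocabulary and one TF-IDF vector per document.
    """
    n_docs = len(documents)
    vocabulary = sorted({word for doc in documents for word in doc})
    # Number of documents that contain each word.
    doc_freq = Counter(word for doc in documents for word in set(doc))
    idf = {word: math.log(n_docs / doc_freq[word]) for word in vocabulary}

    vectors = []
    for doc in documents:
        counts = Counter(doc)
        length = len(doc) or 1  # avoid division by zero for empty documents
        # Assumed TF: relative frequency of the word in the document.
        vectors.append([counts[word] / length * idf[word] for word in vocabulary])
    return vocabulary, vectors

# Hypothetical, already tokenized publications of two accounts.
docs = [["coffee", "beach", "coffee"], ["coffee", "train"]]
vocab, vectors = tf_idf_vectors(docs)
```

A word that occurs in every document (here "coffee") gets an IDF of zero, so it contributes nothing to the vector; only distinctive words carry weight.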
  • the first obtaining module 103 is configured to, after the first extraction module 102 extracts the at least two different features that are of each account in the test set account combination and are correlated with behavioral data of a user of the account, input the extracted features into a classification and prediction model that is already established, to obtain a predicted value or a set of predicted values, which may belong to a user, of the test set account combination.
  • the classification and prediction model is already established, and when the classification and prediction model is being established, accounts in the training set are used to generate a training set account combination, where accounts in each training set account combination belong to a user, and when the training set is trained, extracted features are at least two different features that are of each account in the training set account combination and are correlated with behavioral data of a user of the account.
  • a predicted value, which may belong to a user, of the test set account combination may be obtained; and when there are multiple test set account combinations, a set of predicted values, which may belong to a user, of the test set account combinations may be obtained.
  • the output module 104 is configured to, after the first obtaining module 103 obtains the predicted value or the set of predicted values, which may belong to a user, of the test set account combination, perform computation on the predicted value or the set of predicted values of the test set account combination by using an association algorithm, and output a computed prediction result for the test set account combination.
  • the association algorithm refers to a process in which correlation between predicted values of a test set account combination is computed to obtain a final prediction result for the test set account combination, that is, whether the test set account combination belongs or does not belong to a user.
  • At least two different features that are of each account in a test set account combination and are correlated with behavioral data of a user of the account are extracted; the at least two different features that are of each account in the test set account combination and are correlated with behavioral data of a user of the account are inputted into a classification and prediction model that is already established, to obtain a predicted value or a set of predicted values, which may belong to a user, of the test set account combination; and computation is performed on the predicted value or the set of predicted values of the test set account combination by using an association algorithm, and a computed prediction result for the test set account combination is outputted.
  • Because the extracted features are at least two different features that are correlated with behavioral data of a user of the account, user information can be greatly enriched, which makes a final association result more accurate.
  • the Jaccard similarity coefficient, the Adamic/Adar measure, and like indicators can extend a conventional definition manner in the prior art.
  • FIG. 6 is a structural diagram of yet another implementation manner of the apparatus for identifying a user in multiple social networks according to the present invention.
  • the apparatus in this implementation manner is essentially the same as the apparatuses in FIG. 4 and FIG. 5.
  • a difference lies in that, in addition to a first generating module 201, a first extraction module 202, a first obtaining module 203 and an output module 204, the apparatus further includes a second generating module 205, a second extraction module 206 and a second obtaining module 207.
  • the apparatus in this implementation manner can perform the steps in FIG. 2 .
  • the first generating module 201 is configured to, after accounts that are in a test set and are obtained from registered accounts in at least two different social networks are inputted, generate a test set account combination from the accounts in the test set.
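Generating test set account combinations might look like the sketch below. It assumes, in line with the definition given in the claims, that each combination contains exactly one account from each social network; the function name `generate_account_combinations` and the account labels are hypothetical.

```python
from itertools import product

def generate_account_combinations(accounts_by_network):
    """Build every combination that takes one account from each network.

    accounts_by_network: dict mapping a network name to its list of accounts.
    """
    networks = sorted(accounts_by_network)  # fixed order of the networks
    return list(product(*(accounts_by_network[name] for name in networks)))

# Hypothetical test set drawn from two social networks.
test_set = {
    "network_A": ["a1", "a2"],
    "network_B": ["b1", "b2", "b3"],
}
combinations = generate_account_combinations(test_set)
# 2 x 3 = 6 candidate combinations such as ("a1", "b1")
```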
  • the first extraction module 202 is configured to, after the first generating module 201 generates the test set account combination, extract at least two different features that are of each account in the test set account combination and are correlated with behavioral data of a user of the account.
  • the first obtaining module 203 is configured to, after the first extraction module 202 extracts the at least two different features that are of each account in the test set account combination and are correlated with behavioral data of a user of the account, input the extracted features into a classification and prediction model that is already established, to obtain a predicted value or a set of predicted values, which may belong to a user, of the test set account combination.
  • the output module 204 is configured to, after the first obtaining module 203 obtains the predicted value or the set of predicted values, which may belong to a user, of the test set account combination, perform computation on the predicted value or the set of predicted values of the test set account combination by using an association algorithm, and output a computed prediction result for the test set account combination.
  • the second generating module 205 is configured to, after accounts in a training set which are obtained from registered accounts in at least two different social networks are inputted, generate a training set account combination from accounts, which belong to a user, among the accounts in the training set.
  • the accounts in the training set come from registered accounts in at least two different social networks, that is, the accounts in the training set and those in the test set have the same sources. Firstly, the accounts in the training set are obtained from the registered accounts in the at least two different social networks, and after the accounts in the training set are inputted, the training set account combination is generated from the accounts in the training set, so as to predict whether accounts in the training set account combination belong to a user.
  • the second extraction module 206 is configured to, after the second generating module 205 generates the training set account combination, extract at least two different features that are of each account in the training set account combination and are correlated with behavioral data of a user of the account.
  • a feature correlated with behavioral data of a user of an account refers to a feature that is correlated with data with respect to behavioral habits or behavior characteristics of the user of the account in a social network, where the behavioral data of a user of an account includes: data with respect to time ranges when the user logs in to a social network, data with respect to locations where the user logs in to a social network, data with respect to language habits in which the user publishes comments in a social network, data with respect to friends followed by the user, data with respect to points of interest of the user, and the like.
  • behavioral habits or behavior characteristics of one user are often highly fixed and highly personalized. If features represented by data with respect to behavioral habits or behavior characteristics of users corresponding to accounts in different social networks are very similar, it is most likely that the accounts in the different social networks belong to a user.
  • a classification and prediction model can be established.
  • the second obtaining module 207 is configured to, after the second extraction module 206 extracts the at least two different features that are of each account in the training set account combination and are correlated with behavioral data of a user of the account, train the training set with a supervised algorithm by using the extracted features, to obtain the classification and prediction model.
  • the supervised classification algorithm is a kind of machine learning classification algorithm, including but not limited to: support vector machines and logistic regression.
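As an illustration of the training step, the sketch below fits a small logistic regression by stochastic gradient descent in plain Python. It is not the patent's implementation: the feature layout (Jaccard coefficient, location cosine, time range cosine, text cosine), the toy training values, the learning rate and the number of epochs are all hypothetical choices made for this example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(samples, labels, learning_rate=0.5, epochs=2000):
    """Fit weights and bias by stochastic gradient descent on the log loss."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(samples, labels):
            score = bias + sum(w * x for w, x in zip(weights, features))
            error = sigmoid(score) - label
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias

def predict(weights, bias, features):
    """Predicted value: estimated probability that the accounts share a user."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))

# Hypothetical training set account combinations:
# [jaccard, location cosine, time range cosine, text cosine]
train_x = [
    [0.50, 0.97, 0.97, 0.85], [0.40, 0.90, 0.80, 0.70], [0.60, 0.80, 0.90, 0.90],
    [0.00, 0.10, 0.30, 0.20], [0.10, 0.20, 0.10, 0.10], [0.05, 0.30, 0.20, 0.30],
]
train_y = [1, 1, 1, 0, 0, 0]  # 1: same user, 0: different users

weights, bias = train_logistic_regression(train_x, train_y)
likely_same = predict(weights, bias, [0.45, 0.90, 0.90, 0.80])
likely_different = predict(weights, bias, [0.05, 0.15, 0.20, 0.15])
```

The predicted value returned by `predict` plays the role of the predicted value of a test set account combination; a support vector machine could be substituted for the classifier.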
  • At least two different features that are of each account in a test set account combination and are correlated with behavioral data of a user of the account are extracted; the at least two different features that are of each account in the test set account combination and are correlated with behavioral data of a user of the account are inputted into a classification and prediction model that is already established, to obtain a predicted value or a set of predicted values, which may belong to a user, of the test set account combination; and computation is performed on the predicted value or the set of predicted values of the test set account combination by using an association algorithm, and a computed prediction result for the test set account combination is outputted. Because the extracted features are at least two different features that are correlated with behavioral data of a user of the account, user information can be greatly enriched, which makes a final association result more accurate.
  • FIG. 7 is a structural diagram of yet another implementation manner of the apparatus for identifying a user in multiple social networks according to the present invention.
  • the apparatus in this implementation manner is essentially the same as the apparatuses in FIG. 4, FIG. 5 and FIG. 6.
  • for the similarities, please refer to FIG. 4, FIG. 5 and FIG. 6 and the corresponding text description.
  • the apparatus in this implementation manner can perform the steps in FIG. 3 .
  • the first generating module 301 is configured to, after accounts that are in a test set and are obtained from registered accounts in at least two different social networks are inputted, generate a test set account combination from the accounts in the test set.
  • the first extraction module 302 is configured to, after the first generating module 301 generates the test set account combination, extract at least two different features that are of each account in the test set account combination and are correlated with behavioral data of a user of the account.
  • the first obtaining module 303 is configured to, after the first extraction module 302 extracts the at least two different features that are of each account in the test set account combination and are correlated with behavioral data of the user of the account, input the extracted features into a classification and prediction model that is already established, to obtain a predicted value or a set of predicted values, which may belong to a user, of the test set account combination.
  • the output module 304 is configured to, after the first obtaining module 303 obtains the predicted value or the set of predicted values, which may belong to a user, of the test set account combination, perform computation on the predicted value or the set of predicted values of the test set account combination by using an association algorithm, and output a computed prediction result for the test set account combination.
  • the output module 304 is specifically configured to perform computation on the predicted value or the set of predicted values of the test set account combination by using a multi-network approach (MNA) algorithm, and output the computed prediction result for the test set account combination.
  • the output module 304 includes: the obtaining unit 3041, the determining unit 3042, the first output unit 3043, the comparison unit 3044 and the second output unit 3045.
  • the obtaining unit 3041 is configured to sort, in the test set account combination, predicted values or sets of predicted values of all account combinations corresponding to an account in the test set in order of the predicted values, to obtain a list of predicted values of the account.
  • the determining unit 3042 is configured to determine whether a closed account pair exists in the test set account combination.
  • the first output unit 3043 is configured to, after the obtaining unit 3041 obtains the list of predicted values of the account, and when a closed account pair exists in the test set account combinations, determine that accounts corresponding to the closed account pair belong to a user, and output the closed account pair that belongs to a user, where the closed account pair meets the following conditions: a test set account combination corresponding to a maximum predicted value in a list of predicted values of an account a i is (a i , b j ), and a test set account combination corresponding to a maximum predicted value in a list of predicted values of an account b j is (b j , a i ).
  • the comparison unit 3044 is configured to, if the test set account combination corresponding to the maximum predicted value in the list of predicted values of the account a i is (a i , b j ), and the test set account combination corresponding to the maximum predicted value in the list of predicted values of the account b j is (b j , a k ), compare the predicted value of the test set account combination (a i , b j ) with the predicted value of the test set account combination (b j , a k ).
  • the second output unit 3045 is configured to, when a comparison result of the comparison unit 3044 is that the predicted value of the test set account combination (a i , b j ) is smaller than the predicted value of the test set account combination (b j , a k ), determine that the account a k and the account b j belong to a user and that the account a i and the account b j do not belong to a user, and output the test set account combination (b j , a k ) that belongs to a user; when the comparison result of the comparison unit 3044 is that the predicted value of the test set account combination (a i , b j ) is greater than the predicted value of the test set account combination (b j , a k ), determine that the account a i and the account b j belong to a user and that the account a k and the account b j do not belong to a user, and output the test set account combination (a i , b j
  • the second generating module 305 is configured to, after accounts in a training set which are obtained from registered accounts in at least two different social networks are inputted, generate a training set account combination from accounts, which belong to a user, among the accounts in the training set.
  • the second extraction module 306 is configured to, after the second generating module 305 generates the training set account combination, extract at least two different features that are of each account in the training set account combination and are correlated with behavioral data of a user of the account.
  • the second obtaining module 307 is configured to, after the second extraction module 306 extracts the at least two different features that are of each account in the training set account combination and are correlated with behavioral data of a user of the account, train the training set with a supervised algorithm by using the extracted features, to obtain the classification and prediction model.
  • At least two different features that are of each account in a test set account combination and are correlated with behavioral data of a user of the account are extracted; the at least two different features that are of each account in the test set account combination and are correlated with behavioral data of a user of the account are inputted into a classification and prediction model that is already established, to obtain a predicted value or a set of predicted values, which may belong to a user, of the test set account combination; and computation is performed on the predicted value or the set of predicted values of the test set account combination by using an association algorithm, and a computed prediction result for the test set account combination is outputted. Because the extracted features are at least two different features that are correlated with behavioral data of a user of the account, user information can be greatly enriched, which makes a final association result more accurate. In addition, an optimal global matching result can be obtained rapidly by using an MNA algorithm.
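The closed account pair rule and the comparison step described for units 3041 to 3045 can be sketched as follows. This is a simplified illustration for two social networks, assuming the predicted values are held in a dictionary keyed by (account in network A, account in network B); the function names and score values are hypothetical, and the handling of equal predicted values is left undefined because the text does not state it.

```python
def closed_account_pairs(scores):
    """Return pairs (a, b) where b is a's best match and a is b's best match."""
    best_for_a = {}
    best_for_b = {}
    for (a, b), value in scores.items():
        if a not in best_for_a or value > scores[(a, best_for_a[a])]:
            best_for_a[a] = b
        if b not in best_for_b or value > scores[(best_for_b[b], b)]:
            best_for_b[b] = a
    return sorted((a, b) for a, b in best_for_a.items() if best_for_b[b] == a)

def resolve_conflict(scores, a_i, b_j, a_k):
    """Decide between (a_i, b_j) and (a_k, b_j) by the larger predicted value."""
    if scores[(a_i, b_j)] < scores[(a_k, b_j)]:
        return (a_k, b_j)
    if scores[(a_i, b_j)] > scores[(a_k, b_j)]:
        return (a_i, b_j)
    return None  # assumption: ties are not decided here

# Hypothetical predicted values of four test set account combinations.
scores = {
    ("a1", "b1"): 0.9, ("a1", "b2"): 0.4,
    ("a2", "b1"): 0.8, ("a2", "b2"): 0.3,
}
pairs = closed_account_pairs(scores)                # [("a1", "b1")]
winner = resolve_conflict(scores, "a2", "b1", "a1")  # ("a1", "b1")
```

Here the account a2 prefers b1, but b1 prefers a1 with a larger predicted value, so a1 and b1 are determined to belong to a user and a2 and b1 are not.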
  • FIG. 8 is a structural diagram of yet another implementation manner of the apparatus for identifying a user in multiple social networks according to the present invention, where the apparatus includes: a processor 11, a memory 12 coupled to the processor 11, an input unit 13, an output unit 14 and an extraction unit 15.
  • the input unit 13 is configured to input accounts that are in a test set and are obtained from registered accounts in at least two different social networks, and the processor 11 is configured to generate a test set account combination from the accounts in the test set and store the generated test set account combination in the memory 12.
  • the extraction unit 15 is configured to extract at least two different features that are of each account in the test set account combination and are correlated with behavioral data of a user of the account, and store the extracted features in the memory 12.
  • the processor 11 is configured to fetch the at least two different features that are of each account in the test set account combination and are correlated with behavioral data of a user of the account from the memory 12, control the input unit 13 to input the fetched features into a classification and prediction model that is already established, to obtain a predicted value or a set of predicted values, which may belong to a user, of the test set account combination.
  • the processor 11 is configured to perform computation on the predicted value or the set of predicted values of the test set account combination by using an association algorithm, and control the output unit 14 to output a computed prediction result for the test set account combination.
  • the input unit 13 is further configured to input accounts in a training set which are obtained from at least two different social networks, and the processor 11 is configured to generate a training set account combination from those in the accounts in the training set that may belong to a user, and store the generated training set account combination in the memory 12.
  • the extraction unit 15 is further configured to extract at least two different features that are of each account in the training set account combination and are correlated with behavioral data of a user of the account, and store the extracted features in the memory 12.
  • the processor 11 is configured to fetch the at least two different features that are of each account in the training set account combination and are correlated with behavioral data of a user of the account from the memory 12, and train the training set with a supervised algorithm by using the extracted features, to obtain the classification and prediction model.
  • the at least two different features that are correlated with behavioral data of a user of the account include: a social feature of an account in the account combination, a spatial feature of account publishing information in the account combination, a temporal feature of account publishing information in the account combination, and a text feature of account publishing information in the account combination.
  • the social feature of an account in the account combination includes: the number of common adjacent elements, a Jaccard similarity coefficient and an Adamic/Adar measure.
  • the number of common adjacent elements refers to the number of common friends of accounts in the account combination, where accounts having the common friends are in a training set;
  • the Jaccard similarity coefficient refers to a ratio of the number of common friends of accounts in the account combination to the number of all friends of accounts in the account combination;
  • the Adamic/Adar measure refers to influence of the common friends of accounts in the account combination on respective social networks.
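The three social features can be sketched as below. The sketch assumes that friends of both accounts have already been mapped to shared user identifiers (for example through known account pairs in the training set), and it uses the standard Adamic/Adar formula, the sum of 1/log(degree) over common friends, which is an assumption because the text only describes the measure informally. All names and values are hypothetical.

```python
import math

def common_friend_count(friends_a, friends_b):
    """Number of common adjacent elements (common friends)."""
    return len(friends_a & friends_b)

def jaccard_coefficient(friends_a, friends_b):
    """Common friends divided by all distinct friends of both accounts."""
    union = friends_a | friends_b
    return len(friends_a & friends_b) / len(union) if union else 0.0

def adamic_adar(friends_a, friends_b, degree):
    """Standard Adamic/Adar: common friends with fewer contacts weigh more.

    degree maps a friend to its number of contacts; friends with a degree
    of 1 or less are skipped because log(1) is zero.
    """
    return sum(1.0 / math.log(degree[f])
               for f in friends_a & friends_b if degree[f] > 1)

# Hypothetical friend sets expressed with shared user identifiers.
friends_a1 = {"u1", "u2", "u3"}
friends_b1 = {"u2", "u3", "u4"}
degree = {"u1": 5, "u2": 10, "u3": 100, "u4": 3}

n_common = common_friend_count(friends_a1, friends_b1)   # 2
jaccard = jaccard_coefficient(friends_a1, friends_b1)    # 0.5
aa = adamic_adar(friends_a1, friends_b1, degree)
```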
  • the spatial feature of account publishing information in the account combination includes: the number of common locations of all account publishing information in the account combination, a cosine similarity of a location set of all account publishing information in the account combination, and an average distance of the location set of all account publishing information in the account combination.
  • the temporal feature of account publishing information in the account combination includes: the number of common time ranges of all account publishing information in the account combination, and a cosine similarity of a time range set of all account publishing information in the account combination.
  • the text feature of account publishing information in the account combination includes: an inner product of bag-of-words vectors of all account publishing information in the account combination, and a cosine similarity of the bag-of-words vectors of all account publishing information in the account combination.
  • the processor 11 is further configured to process, by using a natural language processing technology, information published by an account in the test set or training set account combination; and generate a bag-of-words vector of the account from the processed information by using a term frequency-inverse document frequency (TF-IDF) weighting model.
  • the processor 11 is further configured to perform computation on the predicted value or the set of predicted values of the test set account combination by using a multi-network approach (MNA) algorithm, and control the output unit 14 to output a computed prediction result for the test set account combination.
  • the processor 11 is further configured to sort, in the test set account combination, predicted values or sets of predicted values of all account combinations corresponding to an account in the test set in order of the predicted values, to obtain a list of predicted values of the account; and if a closed account pair exists in the test set account combination, determine that accounts corresponding to the closed account pair belong to a user, and control the output unit 14 to output the closed account pair that belongs to a user, where the closed account pair meets the following conditions: a test set account combination corresponding to a maximum predicted value in a list of predicted values of an account a i is (a i , b j ), and a test set account combination corresponding to a maximum predicted value in a list of predicted values of an account b j is (b j , a i ).
  • the processor 11 is further configured to, when the test set account combination corresponding to the maximum predicted value in the list of predicted values of the account a i is (a i , b j ), and the test set account combination corresponding to the maximum predicted value in the list of predicted values of the account b j is (b j , a k ), compare the predicted value of the test set account combination (a i , b j ) with the predicted value of the test set account combination (b j , a k ); if the predicted value of the test set account combination (a i , b j ) is smaller than the predicted value of the test set account combination (b j , a k ), determine that the account a k and the account b j belong to a user and that the account a i and the account b j do not belong to the user, and control the output unit 14 to output the test set account combination (b j , a k ) that belongs to a user; and if
  • At least two different features that are of each account in a test set account combination and are correlated with behavioral data of a user of the account are extracted; the at least two different features that are of each account in the test set account combination and are correlated with behavioral data of a user of the account are inputted into a classification and prediction model that is already established, to obtain a predicted value or a set of predicted values, which may belong to a user, of the test set account combination; and computation is performed on the predicted value or the set of predicted values of the test set account combination by using an association algorithm, and a computed prediction result for the test set account combination is outputted.
  • Because the extracted features are at least two different features that are correlated with behavioral data of a user of the account, user information can be greatly enriched, which makes a final association result more accurate.
  • a Jaccard similarity coefficient, an Adamic/Adar measure, and like indicators can extend a conventional definition manner in the prior art; and an optimal global matching result can be obtained rapidly by using an MNA algorithm.
  • the disclosed system, apparatus, and method may be implemented in other manners.
  • the described apparatus embodiments are merely exemplary.
  • the division of modules or units is merely a division of logical functions and there may be other divisions in actual implementation.
  • a plurality of units or components may be combined or integrated into another system, or some features may be ignored or not performed.
  • the displayed or discussed mutual couplings or direct couplings or communication connections may be implemented through some interfaces.
  • the indirect couplings or communication connections between the apparatuses or units may be implemented in electronic, mechanical, or other forms.
  • the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of network units.
  • the purpose of this implementation manner may be implemented by selecting a part or all of the units according to practical needs.
  • functional units in the embodiments of the present invention may be integrated into one processing unit, or each of the units may exist alone physically, or two or more units are integrated into one unit.
  • the integrated unit may be implemented in a form of hardware, or may be implemented in a form of a software functional unit.
  • When the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, the integrated unit may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of the present invention essentially, or the part contributing to the prior art, or all or a part of the technical solutions may be implemented in the form of a software product.
  • the software product is stored in a storage medium and includes several instructions for instructing a computer device (which may be a personal computer, a server, or a network device) or a processor to perform all or a part of the steps of the methods described in the embodiments of the present invention.
  • the foregoing storage medium includes: any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Primary Health Care (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • General Health & Medical Sciences (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Claims (16)

  1. A method for identifying a same user in multiple social networks, wherein it is defined that a same user has only one account in a social network, that the number of accounts in an account combination is equal to the number of social networks, and that each account in the account combination comes from a different social network; and the method comprises:
    inputting (S101) accounts that are in a test set and are obtained from accounts registered in at least two different social networks, and generating at least one test set account combination from the accounts in the test set;
    extracting (S102) at least two different features that come from each account in the at least one test set account combination and are correlated with behavioral data of a user of the account;
    inputting (S201) accounts in a training set that are obtained from at least two different social networks, and generating a training set account combination from accounts, which belong to a same user, among the accounts in the training set;
    extracting (S202) at least two different features that come from each account in the training set account combination and are correlated with behavioral data of a user of the account; and
    training (S203) the training set with a supervised classification algorithm and by using the at least two different features that come from each account in the training set account combination and are correlated with behavioral data of a user of the account, to obtain a classification and prediction model;
    inputting (S103) the at least two different features that come from each account in the at least one test set account combination and are correlated with behavioral data of a user of the account into the obtained classification and prediction model, to obtain a predicted value or a set of predicted values, which belong to a same user, of the at least one test set account combination; and
    performing (S104) a calculation on the predicted value or the set of predicted values of the at least one test set account combination by using an association algorithm, and outputting a calculated prediction result for the at least one test set account combination;
    wherein the association algorithm refers to a process in which a correlation between the predicted values of a test set account combination is calculated to obtain a prediction result for the test set account combination, so as to indicate whether or not the test set account combination belongs to the same user;
    wherein the at least two different features that are correlated with behavioral data of a user of the account comprise: a social feature of an account in the account combination, a spatial feature of account posting information in the account combination, a temporal feature of account posting information in the account combination, and a text feature of account posting information in the account combination;
    wherein the step of performing a calculation on the predicted value or the set of predicted values of the at least one test set account combination by using an association algorithm, and outputting a calculated prediction result for the at least one test set account combination comprises: performing a calculation on the predicted value or the set of predicted values of the at least one test set account combination by using a multi-network approach (MNA) algorithm, and outputting the calculated prediction result for the at least one test set account combination.
  2. The method according to claim 1, wherein the social feature of an account in the account combination comprises: the number of common neighbors, a Jaccard similarity coefficient, and an Adamic/Adar measure, wherein the number of common neighbors refers to the number of common friends of accounts in the account combination, and accounts having the common friends are in the training set; the Jaccard similarity coefficient refers to a ratio of the number of common friends of accounts in the account combination to the number of all friends of accounts in the account combination; and the Adamic/Adar measure refers to an influence of the common friends of accounts in the account combination on the respective social networks.
  3. The method according to claim 1, wherein the spatial feature of account posting information in the account combination comprises: the number of common locations of all account posting information in the account combination, a cosine similarity of a set of locations of all account posting information in the account combination, and an average distance of the set of locations of all account posting information in the account combination.
  4. The method according to claim 1, wherein the temporal feature of account posting information in the account combination comprises: the number of common time intervals of all account posting information in the account combination, and a cosine similarity of a set of time intervals of all account posting information in the account combination.
  5. The method according to claim 1, wherein the text feature of account posting information in the account combination comprises: an inner product of bag-of-words vectors of all account posting information in the account combination, and a cosine similarity of the bag-of-words vectors of all account posting information in the account combination.
  6. The method according to claim 5, wherein the method further comprises:
    processing, by using a natural language processing technology, information posted by an account in the test set account combination or the training set account combination; and
    generating a bag-of-words vector of the account from the processed information by using a term frequency-inverse document frequency (TF-IDF) weighting model.
  7. The method according to claim 1, wherein the step of performing a calculation on the predicted value or the set of predicted values of the at least one test set account combination by using a multi-network approach (MNA) algorithm, and outputting the calculated prediction result for the at least one test set account combination comprises:
    sorting, in the at least one test set account combination, predicted values or sets of predicted values of all account combinations corresponding to an account in the test set in order of the predicted values, to obtain a predicted value list of the account; and
    if a closed account pair exists in the at least one test set account combination, accounts corresponding to the closed account pair belong to a same user, and outputting the closed account pair that belongs to a same user, wherein the closed account pair satisfies the following conditions: a test set account combination corresponding to a maximum predicted value in a predicted value list of an account ai is (ai, bj), and a test set account combination corresponding to a maximum predicted value in a predicted value list of an account bj is (bj, ai).
  8. The method according to claim 7, after the step of sorting, in the at least one test set account combination, predicted values or sets of predicted values of all account combinations corresponding to an account in the test set in order of the predicted values, to obtain a predicted value list of the account, further comprising:
    if the test set account combination corresponding to the maximum predicted value in the predicted value list of the account ai is (ai, bj), and the test set account combination corresponding to the maximum predicted value in the predicted value list of the account bj is (bj, ak), comparing the predicted value of the test set account combination (ai, bj) with the predicted value of the test set account combination (bj, ak);
    if the predicted value of the test set account combination (ai, bj) is less than the predicted value of the test set account combination (bj, ak), the account ak and the account bj belong to a same user and the account ai and the account bj do not belong to a same user, and outputting the test set account combination (bj, ak) that belongs to a same user; and
    if the predicted value of the test set account combination (ai, bj) is greater than the predicted value of the test set account combination (bj, ak), the account ai and the account bj belong to a same user and the account ak and the account bj do not belong to a same user, and outputting the test set account combination (ai, bj) that belongs to a same user.
  9. An apparatus for identifying a same user in multiple social networks, wherein it is defined that a same user has only one account in a social network, that the number of accounts in an account combination is equal to the number of social networks, and that each account in the account combination comes from a different social network; and the apparatus comprises: a first generating module (101), a first extracting module (102), a first obtaining module (103), and an output module (104, 304), wherein:
    the first generating module (101) is configured to: after accounts that are in a test set and are obtained from accounts registered in at least two different social networks are input, generate at least one test set account combination from the accounts in the test set;
    the first extracting module (102) is configured to: after the first generating module (101) generates the at least one test set account combination, extract at least two different features that come from each account in the at least one test set account combination and are correlated with behavioral data of a user of the account;
    the first obtaining module (103) is configured to: after the first extracting module (102) extracts the at least two different features that come from each account in the at least one test set account combination and are correlated with behavioral data of a user of the account, input the extracted features into a classification and prediction model that is already established, to obtain a predicted value or a set of predicted values, which may belong to a same user, of the at least one test set account combination; and
    the output module (104, 304) is configured to: after the first obtaining module (103) obtains the predicted value or the set of predicted values, which may belong to a same user, of the at least one test set account combination, perform a calculation on the predicted value or the set of predicted values of the at least one test set account combination by using an association algorithm, and output a calculated prediction result for the at least one test set account combination;
    wherein the association algorithm refers to a process in which a correlation between predicted values of a test set account combination is calculated to obtain a prediction result for the test set account combination, so as to indicate whether or not the test set account combination belongs to the same user;
    the apparatus further comprising:
    a second generating module (305), a second extracting module (206), and a second obtaining module (307);
    the second generating module (305) is configured to: after accounts in a training set that are obtained from accounts registered in at least two different social networks are input, generate a training set account combination from accounts, which belong to a same user, among the accounts in the training set;
    the second extracting module (206) is configured to: after the second generating module (305) generates the at least one test set account combination, extract at least two different features that come from each account in the training set account combination and are correlated with behavioral data of a user of the account; and
    the second obtaining module (307) is configured to: after the second extracting module (206) extracts the at least two different features that come from each account in the training set account combination and are correlated with behavioral data of a user of the account, train the training set with a supervised algorithm by using the extracted features, to obtain the classification and prediction model;
    wherein the at least two different features that are correlated with behavioral data of a user of the account comprise: a social feature of an account in the account combination, a spatial feature of account posting information in the account combination, a temporal feature of account posting information in the account combination, and a text feature of account posting information in the account combination;
    wherein the output module (104, 304) is specifically configured to perform a calculation on the predicted value or the set of predicted values of the at least one test set account combination by using a multi-network approach (MNA) algorithm, and output the calculated prediction result for the at least one test set account combination.
  10. The apparatus according to claim 9, wherein the social feature of an account in the account combination comprises: the number of common neighbors, a Jaccard similarity coefficient, and an Adamic/Adar measure, wherein the number of common neighbors refers to the number of common friends of accounts in the account combination, and accounts having the common friends are in the training set; the Jaccard similarity coefficient refers to a ratio of the number of common friends of accounts in the account combination to the number of all friends of accounts in the account combination; and the Adamic/Adar measure refers to an influence of the common friends of accounts in the account combination on the respective social networks.
  11. The apparatus according to claim 9, wherein the spatial feature of account posting information in the account combination comprises: the number of common locations of all account posting information in the account combination, a cosine similarity of a set of locations of all account posting information in the account combination, and an average distance of the set of locations of all account posting information in the account combination.
  12. The apparatus according to claim 9, wherein the temporal feature of account posting information in the account combination comprises: the number of common time intervals of all account posting information in the account combination, and a cosine similarity of a set of time intervals of all account posting information in the account combination.
  13. The apparatus according to claim 9, wherein the text feature of account posting information in the account combination comprises: an inner product of bag-of-words vectors of all account posting information in the account combination, and a cosine similarity of the bag-of-words vectors of all account posting information in the account combination.
  14. The apparatus according to claim 13, wherein the apparatus further comprises: a processing module and a third generating module (106);
    the processing module is configured to process, by using a natural language processing technology, information posted by an account in the test set account combination or the training set account combination; and
    the third generating module (106) is configured to: after the processing module processes the information posted by an account in the test set account combination or the training set account combination, generate a bag-of-words vector of the account from the processed information by using a term frequency-inverse document frequency (TF-IDF) weighting model.
  15. The apparatus according to claim 9, wherein the output module (104, 304) comprises:
    an obtaining unit (3041) and a first output unit (3043);
    the obtaining unit (3041) is configured to sort, in the at least one test set account combination, predicted values or sets of predicted values of all account combinations corresponding to an account in the test set in order of the predicted values, to obtain a predicted value list of the account; and
    the first output unit (3043) is configured to: after the obtaining unit (3041) obtains the predicted value list of the account, if a closed account pair exists in the at least one test set account combination, determine that accounts corresponding to the closed account pair belong to a same user, and output the closed account pair that belongs to a same user, wherein the closed account pair satisfies the following conditions: a test set account combination corresponding to a maximum predicted value in a predicted value list of an account ai is (ai, bj), and a test set account combination corresponding to a maximum predicted value in a predicted value list of an account bj is (bj, ai).
  16. The apparatus according to claim 15, wherein the output module (104, 304) comprises: a comparing unit (3044) and a second output unit (3045);
    the comparing unit (3044) is configured to: if the test set account combination corresponding to the maximum predicted value in the predicted value list of the account ai is (ai, bj), and the test set account combination corresponding to the maximum predicted value in the predicted value list of the account bj is (bj, ak), compare the predicted value of the test set account combination (ai, bj) with the predicted value of the test set account combination (bj, ak); and
    the second output unit (3045) is configured to: when a comparison result of the comparing unit (3044) is that the predicted value of the test set account combination (ai, bj) is less than the predicted value of the test set account combination (bj, ak), determine that the account ak and the account bj belong to a same user and that the account ai and the account bj do not belong to a same user, and output the test set account combination (bj, ak) that belongs to a same user; and when the comparison result of the comparing unit is that the predicted value of the test set account combination (ai, bj) is greater than the predicted value of the test set account combination (bj, ak), determine that the account ai and the account bj belong to a same user and that the account ak and the account bj do not belong to a same user, and output the test set account combination (ai, bj) that belongs to a same user.
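
The pairwise quantities named in claims 2 to 5 and the closed account pair condition of claim 7 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the function names, the dict-based sparse vectors, and the `scores` mapping from an account pair to its predicted value are all hypothetical, and the pairwise comparison step of claim 8 is not covered.

```python
import math


def jaccard(friends_a, friends_b):
    """Common friends divided by all friends of the two accounts (claim 2)."""
    union = friends_a | friends_b
    return len(friends_a & friends_b) / len(union) if union else 0.0


def cosine(vec_a, vec_b):
    """Cosine similarity of two sparse vectors stored as dicts (claims 3 to 5)."""
    dot = sum(w * vec_b.get(k, 0.0) for k, w in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def closed_pairs(scores):
    """Return pairs (a, b) in which each account is the other's best match.

    `scores` is a hypothetical mapping (a, b) -> predicted value, where a
    comes from one network and b from the other (claim 7 condition).
    """
    best_for_a = {}  # account a -> partner b with the highest predicted value
    best_for_b = {}  # account b -> partner a with the highest predicted value
    for (a, b), s in scores.items():
        if a not in best_for_a or s > scores[(a, best_for_a[a])]:
            best_for_a[a] = b
        if b not in best_for_b or s > scores[(best_for_b[b], b)]:
            best_for_b[b] = a
    # A pair is closed when the top choice is mutual.
    return sorted((a, b) for a, b in best_for_a.items() if best_for_b[b] == a)


if __name__ == "__main__":
    # Toy predicted values for two accounts in each of two networks.
    example = {
        ("a1", "b1"): 0.9, ("a1", "b2"): 0.2,
        ("a2", "b1"): 0.6, ("a2", "b2"): 0.4,
    }
    print(closed_pairs(example))
```

In the toy input, a2 prefers b1 but b1 prefers a1, so (a2, b1) is not a closed pair; this is the situation that the comparison in claim 8 is meant to resolve.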
EP14190351.8A 2013-10-25 2014-10-24 Method and apparatus for identifying a same user in multiple social networks Active EP2866421B1 (fr)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310513952.9A CN104574192B (zh) 2013-10-25 2013-10-25 Method and apparatus for identifying a same user in multiple social networks

Publications (2)

Publication Number Publication Date
EP2866421A1 EP2866421A1 (fr) 2015-04-29
EP2866421B1 EP2866421B1 (fr) 2019-07-03

Family

ID=51862102

Family Applications (1)

Application Number Title Priority Date Filing Date
EP14190351.8A Active EP2866421B1 (fr) 2013-10-25 2014-10-24 Method and apparatus for identifying a same user in multiple social networks

Country Status (2)

Country Link
EP (1) EP2866421B1 (fr)
CN (1) CN104574192B (fr)

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
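CN104778388A (zh) * 2015-05-04 2015-07-15 苏州大学 Method and system for identifying a same user on two different platforms
CN105224593B (zh) * 2015-08-25 2019-08-16 中国人民解放军信息工程大学 Method for mining frequently co-occurring accounts in transient Internet access transactions
CN106572048A (zh) 2015-10-09 2017-04-19 腾讯科技(深圳)有限公司 Method and system for identifying user information in a social network
CN105871585A (zh) * 2015-12-03 2016-08-17 乐视网信息技术(北京)股份有限公司 Terminal association method and apparatus
CN105930501B (zh) * 2016-05-09 2019-08-16 深圳市永兴元科技股份有限公司 Network account association method and apparatus
CN107741932B (zh) * 2016-06-24 2021-02-26 深圳壹账通智能科技有限公司 User data fusion method and system
CN107577682B (zh) * 2016-07-05 2021-06-29 上海交通大学 Method and system for user interest mining and user recommendation based on social pictures
CN106408411A (zh) * 2016-08-31 2017-02-15 北京城市网邻信息技术有限公司 Credit evaluation method and apparatus
CN107872436B (zh) * 2016-09-27 2020-11-24 阿里巴巴集团控股有限公司 Account identification method, apparatus, and system
CN107070702B (zh) * 2017-03-13 2019-12-10 中国人民解放军信息工程大学 User account association method and apparatus based on a cooperative-game support vector machine
CN109561050B (zh) * 2017-09-26 2021-11-09 武汉斗鱼网络科技有限公司 Method and apparatus for identifying batch accounts
CN107832783A (zh) * 2017-10-25 2018-03-23 平安科技(深圳)有限公司 Cross-social-platform user matching method, data processing apparatus, and readable storage medium
CN110162956B (zh) * 2018-03-12 2024-01-19 华东师范大学 Method and apparatus for determining associated accounts
CN109697454B (zh) * 2018-11-06 2020-10-16 邓皓文 Privacy-preserving cross-device individual identification method and apparatus
CN110097125B (zh) * 2019-05-07 2022-10-14 郑州轻工业学院 Cross-network account association method based on embedding representations
CN110598126B (zh) * 2019-09-05 2023-04-18 河南科技大学 Cross-social-network user identity recognition method based on behavioral habits
CN111192154B (zh) * 2019-12-25 2023-05-02 西安交通大学 Social network user node matching method based on style transfer
CN111784468B (zh) * 2020-07-01 2022-11-18 支付宝(杭州)信息技术有限公司 Account association method, apparatus, and electronic device
CN113537272B (zh) * 2021-03-29 2024-03-19 之江实验室 Semi-supervised method for detecting abnormal social network accounts based on deep learning
EP4361855A1 (fr) * 2022-10-24 2024-05-01 Capital One Services, LLC Systems and methods for external account authentication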

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102355664A (zh) * 2011-08-09 2012-02-15 郑毅 Method for identifying and matching user identity based on the user's social network
US20130110605A1 (en) * 2011-10-30 2013-05-02 Bank Of America Corporation Product recognition promotional offer matching
CN103166828B (zh) * 2011-12-12 2017-03-15 中兴通讯股份有限公司 Social network interoperation method and system
CN103294817A (zh) * 2013-06-13 2013-09-11 华东师范大学 Text feature extraction method based on category distribution probability

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
None *

Also Published As

Publication number Publication date
CN104574192B (zh) 2021-01-15
EP2866421A1 (fr) 2015-04-29
CN104574192A (zh) 2015-04-29

Similar Documents

Publication Publication Date Title
EP2866421B1 (fr) Method and apparatus for identifying a same user in multiple social networks
US11475143B2 (en) Sensitive data classification
Li et al. Mining opinion summarizations using convolutional neural networks in Chinese microblogging systems
WO2022141861A1 (fr) Emotion classification method and apparatus, electronic device, and storage medium
US20180357258A1 (en) Personalized search device and method based on product image features
Rasool et al. GAWA–a feature selection method for hybrid sentiment classification
WO2015165372A1 (fr) Method and apparatus for classifying an object based on a social networking service, and storage medium
CN108269122B (zh) Advertisement similarity processing method and apparatus
US20130268457A1 (en) System and Method for Extracting Aspect-Based Ratings from Product and Service Reviews
US11017002B2 (en) Description matching for application program interface mashup generation
EP4258132A1 (fr) Recommendation method, recommendation network, and related device
CN112183881A (zh) Social-network-based public opinion event prediction method, device, and storage medium
Singh et al. Sentiment analysis of Twitter data using TF-IDF and machine learning techniques
Tabak et al. Comparison of emotion lexicons
Soliman et al. Utilizing support vector machines in mining online customer reviews
Arafat et al. Analyzing public emotion and predicting stock market using social media
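CN117557331A (zh) Product recommendation method and apparatus, computer device, and storage medium
JP5933863B1 (ja) Data analysis system, control method, control program, and recording medium
CN110085292B (zh) Drug recommendation method and apparatus, and computer-readable storage medium
CN116680401A (zh) Document processing method, document processing apparatus, device, and storage medium
CN115329207A (zh) Intelligent sales information recommendation method and system
WO2019019711A1 (fr) Method and apparatus for publishing behavior pattern data, terminal device, and medium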
Xin et al. When factorization meets heterogeneous latent topics: an interpretable cross-site recommendation framework
Saeed et al. The impact of spam reviews on feature-based sentiment analysis
CN113722487A (zh) User sentiment analysis method, apparatus, device, and storage medium

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20141024

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

17Q First examination report despatched

Effective date: 20160324

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: EXAMINATION IS IN PROGRESS

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: GRANT OF PATENT IS INTENDED

INTG Intention to grant announced

Effective date: 20190124

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE PATENT HAS BEEN GRANTED

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

REG Reference to a national code

Ref country code: CH

Ref legal event code: EP

Ref country code: AT

Ref legal event code: REF

Ref document number: 1152362

Country of ref document: AT

Kind code of ref document: T

Effective date: 20190715

REG Reference to a national code

Ref country code: IE

Ref legal event code: FG4D

REG Reference to a national code

Ref country code: DE

Ref legal event code: R096

Ref document number: 602014049387

Country of ref document: DE

REG Reference to a national code

Ref country code: NL

Ref legal event code: MP

Effective date: 20190703

REG Reference to a national code

Ref country code: LT

Ref legal event code: MG4D

REG Reference to a national code

Ref country code: AT

Ref legal event code: MK05

Ref document number: 1152362

Country of ref document: AT

Kind code of ref document: T

Effective date: 20190703

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190703

Ref country code: PT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191104

Ref country code: FI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190703

Ref country code: NL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190703

Ref country code: BG

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191003

Ref country code: SE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190703

Ref country code: HR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190703

Ref country code: CZ

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190703

Ref country code: AT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190703

Ref country code: NO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191003

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: RS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190703

Ref country code: IS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191103

Ref country code: ES

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190703

Ref country code: LV

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190703

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20191004

Ref country code: AL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190703

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: TR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190703

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: PL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190703

Ref country code: EE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190703

Ref country code: DK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190703

Ref country code: RO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190703

Ref country code: IT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190703

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: SM

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190703

Ref country code: IS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200224

Ref country code: SK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190703

Ref country code: MC

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190703

REG Reference to a national code

Ref country code: CH

Ref legal event code: PL

REG Reference to a national code

Ref country code: DE

Ref legal event code: R097

Ref document number: 602014049387

Country of ref document: DE

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an EP patent application or granted EP patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

PG2D Information on lapse in contracting state deleted

Ref country code: IS

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: LI

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20191031

Ref country code: LU

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20191024

Ref country code: CH

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20191031

26N No opposition filed

Effective date: 20200603

REG Reference to a national code

Ref country code: BE

Ref legal event code: MM

Effective date: 20191031

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: SI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190703

Ref country code: BE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20191031

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: IE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20191024

Ref country code: FR

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20191031

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: CY

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190703

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: HU

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; INVALID AB INITIO

Effective date: 20141024

Ref country code: MT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190703

REG Reference to a national code

Ref country code: DE

Ref legal event code: R079

Ref document number: 602014049387

Country of ref document: DE

Free format text: PREVIOUS MAIN CLASS: H04L0029080000

IPC: H04L0065000000

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: MK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190703

PGFP Annual fee paid to national office [announced via postgrant information from national office to EPO]

Ref country code: GB

Payment date: 20230831

Year of fee payment: 10

PGFP Annual fee paid to national office [announced via postgrant information from national office to EPO]

Ref country code: DE

Payment date: 20230830

Year of fee payment: 10