CN104394118A - User identity identification method and system - Google Patents
- Publication number
- CN104394118A (CN201410367353.5A)
- Authority
- CN
- China
- Prior art keywords
- identity
- information
- user
- website
- relation storehouse
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/08—Network architectures or network communication protocols for network security for authentication of entities
- H04L63/0876—Network architectures or network communication protocols for network security for authentication of entities based on the identity of the terminal or configuration, e.g. MAC address, hardware or software configuration or device fingerprint
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L61/00—Network arrangements, protocols or services for addressing or naming
- H04L61/45—Network directories; Name-to-address mapping
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L2101/00—Indexing scheme associated with group H04L61/00
- H04L2101/30—Types of network names
- H04L2101/33—Types of network names containing protocol addresses or telephone numbers
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Power Engineering (AREA)
- Computer Hardware Design (AREA)
- Computer Security & Cryptography (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Information Transfer Between Computers (AREA)
Abstract
The invention relates to a user identity identification method and system. Basic information formed at user registration (user ID, user name, Email, telephone number, computer IP, and the like) is combined with identity information extracted from website user behavior data (user ID, user name, Email, telephone number, Cookie, computer IP, and the like). An association between the two sets of user information is established and each user is given a unique identity ID. This allows the users of a B2B website to be identified in a unified way: identity feature relations are established, new and returning users are distinguished, and user behavior is tracked effectively, so that a range of applications can be built for users and the user experience improved.
Description
Technical field
The present invention relates to the field of B2B e-commerce, and in particular to a user identity identification method and system.
Background
For an e-commerce website, user analysis is an important part of web analytics: it helps the site understand user needs and improve the user experience. User analysis requires knowing the size of the site's user base, tracking user behavior on the site, and discovering users' behavioral characteristics, preferences, and habits. Through user analysis, a website can clearly understand where its users come from, where they go, and who they are; it can analyze user satisfaction and locate problems in the site and its channels, which helps raise the user conversion rate. By analyzing how users visit the site, the access paths can be optimized, and by analyzing dwell and exit behavior on each page, problems with individual pages can be found and the layout of pages and of the site improved. By analyzing user behavior, the site learns users' habits and interests and can offer personalized services, which helps increase user loyalty and stickiness and retain users. By identifying users, the site can provide personalized services that help users find good, satisfactory products faster, saving them time and improving satisfaction. All of this presupposes that each user can first be identified: whether they are a new or a returning user, and who they are (user name, mailbox, telephone number, and so on).
On a B2B website, the main services offered to users (querying products, querying merchants, and sending inquiries) do not force the user to register or log in. Many users therefore use the site's services as visitors, which makes them comparatively hard to identify. To track user behavior accurately, every user who comes to the site must be identified and located.
The patent "Method and system for user identity identification based on specific information" (application number CN 201210019678.5) proposes mapping the specific information of a user's Internet access to a temporary unique user identifier, obtaining that identifier and the user's identity information from the communication network side, and associating the specific information with the identity information on the basis of the identifier. However, that method mainly uses the "computer IP address" or "computer IP address + port number" as the temporary unique identifier. Its data source is narrow, it is strongly affected by changes in the computer IP, and the resulting identification is not sufficiently definite. The present patent instead uses user ID, user name, mailbox, telephone number, Cookie, computer IP, and the like to establish a user identity ID and its associations, improving the accuracy of identification.
Summary of the invention
To address the deficiencies of the prior art, embodiments of the present invention provide a user identity identification method and system that solve the problem of identifying users in a unified way on current B2B e-commerce websites.
The technical solution of the present invention is as follows. A user identity identification method comprises:
Step 1: Collect basic data from the data source systems of the e-commerce website platform, classify the collected data into two classes, and store them on a background server. The two classes are:
(1) user basic information formed at user registration, including user ID, user name, Email, telephone number, computer IP, and the like;
(2) data on website behaviors such as registration, login, inquiry, access, and search.
Step 2: From the user's website behaviors (registration, login, inquiry, access, search, and so on), extract the website behavior records of the most recent year. Each kind of behavior record contains user identity information: user ID, user name, Email, telephone number, Cookie, and computer IP. Combine these with the user basic information from registration (user ID, user name, Email, telephone number, computer IP), gather all of this information together, and remove records that are complete duplicates.
Because the identity information attached to each kind of behavior record is incomplete, some values may be empty. Inquiries are divided into those sent while logged in and those sent without logging in, and the two cases record different user information, as shown in the table below.
User behavior | User ID | User name | Email | Telephone number | Cookie | Computer IP |
Login | √ | √ | √ | √ | √ | √ |
Inquiry sent while logged in | √ | √ | √ | √ | √ | √ |
Inquiry sent without logging in | | | √ | √ | √ | √ |
Access | | | | | √ | √ |
Search | | | | | √ | √ |
Registration information | √ | √ | √ | √ | | √ |
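The gathering and exact-duplicate removal of Step 2 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `IdentityRecord` type, its field names, and the `drop_exact_duplicates` helper are hypothetical, and empty fields are represented as `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IdentityRecord:
    """Identity information extracted from one behavior or registration record.

    Hypothetical structure: fields a behavior does not carry stay None.
    """
    behavior: str
    user_id: Optional[str] = None
    user_name: Optional[str] = None
    email: Optional[str] = None
    phone: Optional[str] = None
    cookie: Optional[str] = None
    ip: Optional[str] = None


def drop_exact_duplicates(records):
    """Remove records identical in every field, keeping first-seen order."""
    # Frozen dataclasses are hashable, so they can be used as dict keys.
    return list(dict.fromkeys(records))


records = [
    IdentityRecord("login", "001", "cancy", "cancy@163.com", "55556666", "asdfghj", "192.168.1.1"),
    IdentityRecord("login", "001", "cancy", "cancy@163.com", "55556666", "asdfghj", "192.168.1.1"),
    IdentityRecord("access", cookie="asdfghj", ip="192.168.1.1"),
]
unique_records = drop_exact_duplicates(records)
```

Only records that agree in every field are dropped here; records that merely share some values are kept for the association steps that follow.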
Step 3: Based on the relations among user ID, user name, Email, telephone number, Cookie, and computer IP, apply the preset matching method to deduplicate and normalize the user identity information above, finally obtaining the user identity association relations and the corresponding identity information, and assigning each user a unique identity ID.
The sub-steps of the matching method are as follows:
1. First, merge by association the identity information in the records of the two website behaviors "login" and "inquiry sent while logged in" together with the user basic information in the "registration information". The same user ID is considered to be the same person, and different user IDs are different people. From the identity information in these three kinds of records, find all user names, Emails, telephone numbers, Cookies, and computer IPs that correspond to the same user ID.
On a B2B website, one user ID may correspond to multiple user names, multiple Emails, multiple telephone numbers, multiple Cookies, and multiple computer IPs. A unique identity ID is assigned to the user ID, forming the association shown in Fig. 4.
After this association step, the identity ID relation store is formed.
Each identity ID corresponds to one user ID, and different user IDs are always given different identity IDs. Where different identity IDs share an identical user name, telephone number, Cookie, or computer IP, the most recent time at which that information appeared in the user behavior data and in the user basic information must be recorded while the identity information is merged, to help decide which identity a new user belongs to.
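Sub-step 1 amounts to grouping records by user ID and collecting every distinct value seen for each feature. The sketch below assumes records are plain dictionaries and that identity IDs are consecutive integers; `build_relation_store` and the field names are hypothetical, and the recording of most-recent times is left out for brevity.

```python
from itertools import count

FIELDS = ("user_name", "email", "phone", "cookie", "ip")


def build_relation_store(records):
    """Group logged-in and registration records by user ID.

    Each distinct user ID receives its own identity ID (assumed here to be
    a running integer). Returns {identity_id: {"user_id": ..., field: set}}.
    """
    next_id = count(1)
    by_user_id = {}   # user_id -> identity_id
    store = {}
    for rec in records:
        uid = rec["user_id"]
        if uid not in by_user_id:
            identity_id = next(next_id)
            by_user_id[uid] = identity_id
            store[identity_id] = {"user_id": uid, **{f: set() for f in FIELDS}}
        entry = store[by_user_id[uid]]
        for f in FIELDS:
            if rec.get(f):           # skip empty values
                entry[f].add(rec[f])  # sets deduplicate automatically
    return store


records = [
    {"user_id": "001", "user_name": "cancy", "email": "cancy@163.com",
     "phone": "55556666", "cookie": "asdfghj", "ip": "192.168.1.1"},
    {"user_id": "001", "user_name": "judy", "email": "judy@qq.com",
     "phone": "55556666", "cookie": "zxcvbnj", "ip": "192.168.1.1"},
    {"user_id": "002", "user_name": "coco", "email": "coco@example.com",
     "phone": None, "cookie": "qwerty", "ip": "192.168.1.1"},
]
store = build_relation_store(records)
```

Using a set per feature reflects the one-to-many relation between a user ID and its user names, Emails, telephone numbers, Cookies, and computer IPs.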
2. Extract the user identity information from the "inquiry sent without logging in" website behavior records, compare it with the identity ID relation store, and merge and update identities.
Email is compared first. The user identity information of an "inquiry sent without logging in" record includes an Email, which is compared with the Emails in the identity ID relation store. If it matches, the record's user identity information is merged into the corresponding identity ID in the store. During the merge, the other information (telephone number, Cookie, computer IP) is first compared with the corresponding information already held under that identity ID: if a value is the same, the identity ID's information in the store is not updated; if it differs, the record's value is added to the corresponding identity ID in the store.
If the Email does not match, the telephone number in the record is compared with the telephone numbers in the identity ID relation store. If it matches, the record's user identity information is merged into the corresponding identity ID, and the other information (Email, Cookie, computer IP) is handled in the same way: identical values are left unchanged and differing values are added to the identity ID. If the telephone number in the record matches the telephone numbers of several identity IDs, the record's user identity information is merged into the identity ID whose behavior occurred most recently.
If neither the Email nor the telephone number matches, the Cookie in the record is compared with the Cookies in the identity ID relation store. If it matches, the record's user identity information is merged into the corresponding identity ID, and the other information (Email, telephone number, computer IP) is handled in the same way: identical values are left unchanged and differing values are added to the identity ID. If the Cookie in the record matches the Cookies of several identity IDs, the record's user identity information is merged into the identity ID whose behavior occurred most recently.
Because computer IP addresses change frequently, they are not used for identity determination here.
If none of the above match, take the remaining website behavior records that have not yet been merged into any identity ID, extract the user identity information they contain (Email, telephone number, Cookie, computer IP), and compare this information across the records. Records that share any identical identity information are judged to belong to the same user; they are given the same new identity ID, and that new identity ID is added to the identity ID relation store.
Each website behavior record still remaining after that is given a new identity ID of its own and added to the identity ID relation store.
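The Email, then telephone number, then Cookie cascade for records without a user ID can be sketched as below. The function names, the dictionary layout of the store, and the `last_active` map of most-recent behavior dates are assumptions made for illustration. As a simplification, the most-recent-behavior tie-break is applied to every field, whereas the text above states it only for telephone number and Cookie.

```python
from datetime import date

MATCH_ORDER = ("email", "phone", "cookie")   # computer IP is never used for matching


def match_identity(record, store, last_active):
    """Return the identity ID matched by Email, then phone, then Cookie, else None.

    If several identity IDs hold the matched value, the one with the most
    recent activity is chosen.
    """
    for field in MATCH_ORDER:
        value = record.get(field)
        if not value:
            continue
        candidates = [i for i, entry in store.items() if value in entry[field]]
        if candidates:
            return max(candidates, key=lambda i: last_active[i])
    return None


def merge_record(record, identity_id, store):
    """Add the record's values to the identity; known values are left unchanged."""
    for field in MATCH_ORDER + ("ip",):
        value = record.get(field)
        if value:
            store[identity_id][field].add(value)


store = {
    1: {"email": {"a@example.com"}, "phone": {"5550001"}, "cookie": {"c1"}, "ip": {"10.0.0.1"}},
    2: {"email": {"b@example.com"}, "phone": {"5550001"}, "cookie": {"c2"}, "ip": {"10.0.0.2"}},
}
last_active = {1: date(2014, 5, 1), 2: date(2014, 7, 1)}

inquiry = {"email": "new@example.com", "phone": "5550001", "cookie": "c9", "ip": "10.0.0.9"}
matched = match_identity(inquiry, store, last_active)
if matched is not None:
    merge_record(inquiry, matched, store)
```

In this example the Email is unknown, the telephone number is shared by both identities, and the record is therefore merged into the identity with the later activity date.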
3. Extract the user identity information from the "access" and "search" website behavior records, compare it with the identity ID relation store, and merge and update identities.
The Cookie in the user identity information of an "access" or "search" record is compared with the Cookies in the identity ID relation store. If it matches, the record's user identity information is merged into the corresponding identity ID in the store. During the merge, the other information (Email, telephone number, computer IP) is first compared with the corresponding information already held under that identity ID: if a value is the same, the identity ID's information in the store is not updated; if it differs, the record's value is added to the corresponding identity ID in the store. If the Cookie in the record matches the Cookies of several identity IDs, the record's user identity information is merged into the identity ID whose behavior occurred most recently.
If the Cookie does not match, take the remaining "access" and "search" records that have not yet been merged into the identity ID relation store, extract the user identity information they contain (Email, telephone number, Cookie, computer IP), and compare this information across the records. Records that share any identical identity information are judged to belong to the same user; they are given the same new identity ID, and that new identity ID is added to the identity ID relation store.
Each website behavior record still remaining after that is given a new identity ID of its own and added to the identity ID relation store.
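Grouping the leftover records that match nothing in the store, so that records sharing any identity value receive the same new identity ID, is a connected-components problem. One way to sketch it is with a union-find structure; this is a swapped-in technique chosen for illustration, not something the patent names. The `group_leftovers` function is hypothetical, and the `fields` parameter defaults to Email, telephone number, and Cookie; whether to include computer IP as well is left to the caller.

```python
def group_leftovers(records, fields=("email", "phone", "cookie")):
    """Group record indices so that records sharing any value end up together.

    Returns a sorted list of index groups; each group would receive one
    new identity ID, and a singleton group is a record that matched nothing.
    """
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}   # (field, value) -> index of the first record carrying it
    for idx, rec in enumerate(records):
        for f in fields:
            value = rec.get(f)
            if not value:
                continue
            key = (f, value)
            if key in first_seen:
                parent[find(idx)] = find(first_seen[key])   # union the two groups
            else:
                first_seen[key] = idx

    groups = {}
    for idx in range(len(records)):
        groups.setdefault(find(idx), []).append(idx)
    return sorted(groups.values())


leftovers = [
    {"email": "x@example.com", "cookie": "A"},
    {"cookie": "A"},
    {"email": "x@example.com", "phone": "5551234"},
    {"cookie": "B"},
]
groups = group_leftovers(leftovers)
```

Sharing is transitive in this sketch: two records that have no value in common are still grouped if a third record links them.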
Step 4: Update the identity ID relation store daily. For newly produced website user behavior, the user identity information involved, together with the basic information of newly registered users, is compared and merged with the information in the identity ID relation store, and the store is supplemented and updated accordingly.
The specific sub-steps are as follows:
1. For the records of "login", "inquiry sent while logged in", and "registration information" produced on the new day, extract the user identity information, compare it with the information in the identity ID relation store, and supplement the matching identity IDs in the store with the identity information from the records.
First compare against the identity ID entries in the store that have a user ID. If the user ID is the same, the other user identity data in the behavior record is merged and deduplicated against the information held under the matching identity ID and added to the identity ID relation store.
For example, the identity ID relation store contains the following identity ID record:
Identity ID | User ID | User name | Email | Phone | Cookie | Computer IP |
10 | 001 | cancy | cancy@163.com | 55556666 | asdfghj | 192.168.1.1 |
From the records of "login", "inquiry sent while logged in", and "registration information" produced on the new day, the identity information extracted from one record is:
User ID | User name | Email | Phone | Cookie | Computer IP |
001 | judy | judy@qq.com | 55556666 | zxcvbnj | 192.168.1.1 |
After matching, merging, and deduplication, the identity feature relation for identity ID 10 holds the combined values of both records: user ID 001; user names cancy and judy; Emails cancy@163.com and judy@qq.com; telephone number 55556666; Cookies asdfghj and zxcvbnj; computer IP 192.168.1.1.
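Applying the merge-and-deduplicate rule to the two example records can be worked through in a few lines. The values come from the example tables; the dictionary layout and the `merge_into` helper are hypothetical, and the merged values are derived from the rule rather than quoted from the patent.

```python
FIELDS = ("user_name", "email", "phone", "cookie", "ip")

# Identity ID 10 as held in the relation store (one set of values per feature).
existing = {
    "identity_id": 10,
    "user_id": "001",
    "user_name": {"cancy"},
    "email": {"cancy@163.com"},
    "phone": {"55556666"},
    "cookie": {"asdfghj"},
    "ip": {"192.168.1.1"},
}

# Identity information extracted from one of the new day's records.
new_record = {
    "user_id": "001",
    "user_name": "judy",
    "email": "judy@qq.com",
    "phone": "55556666",
    "cookie": "zxcvbnj",
    "ip": "192.168.1.1",
}


def merge_into(entry, record):
    """Add each non-empty value of the record; values already present are kept once."""
    for field in FIELDS:
        if record.get(field):
            entry[field].add(record[field])
    return entry


if new_record["user_id"] == existing["user_id"]:   # same user ID: same person
    merged = merge_into(existing, new_record)

for field in FIELDS:
    print(field, sorted(merged[field]))
```

The telephone number and computer IP are identical in both records and so appear once; the user name, Email, and Cookie each gain a second value.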
If the user ID is different, compare against the identity ID entries in the store that have no user ID. The comparison covers Email, telephone number, and Cookie; if any one of them is identical, the two are judged to be the same person, the corresponding identity ID in the store is assigned to the user of the behavior record, and the other identity information in the record is added to that identity ID.
For example, the identity ID relation store contains the following identity ID entry without a user ID:
Identity ID | Email | Phone | Cookie | Computer IP |
50 | 123@163.com | 33333333 | AAAA | 1.1.1.1 |
From the records of "login", "inquiry sent while logged in", and "registration information" produced on the new day, the identity information extracted from one record is:
User ID | User name | Email | Phone | Cookie | Computer IP |
105 | coco | 123@163.com | 33333333 | BBBB | 2.2.1.1 |
The comparison shows that the Emails are identical. After merging and deduplication, identity ID 50 holds: user ID 105; user name coco; Email 123@163.com; telephone number 33333333; Cookies AAAA and BBBB; computer IPs 1.1.1.1 and 2.2.1.1.
Finally, if the comparison finds no identical identity information, a new identity ID is generated and added to the identity ID relation store.
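The three-way decision for a new day's logged-in or registration record (same user ID, else a match against an entry without a user ID, else a new identity ID) can be sketched as follows. `daily_update` and the store layout are hypothetical; attaching the record's user ID to the matched entry and numbering new identity IDs as "highest existing ID plus one" are assumptions of this sketch.

```python
FIELDS = ("user_name", "email", "phone", "cookie", "ip")
MATCH_FIELDS = ("email", "phone", "cookie")   # computer IP is not compared


def daily_update(record, store):
    """Assign `record` to an identity ID, updating `store` in place."""
    target = None
    # 1) An entry with the same user ID is the same person.
    for identity_id, entry in store.items():
        if entry["user_id"] is not None and entry["user_id"] == record["user_id"]:
            target = identity_id
            break
    # 2) Otherwise look for an entry without a user ID sharing Email, phone, or Cookie.
    if target is None:
        for identity_id, entry in store.items():
            if entry["user_id"] is None and any(
                record.get(f) and record[f] in entry[f] for f in MATCH_FIELDS
            ):
                target = identity_id
                entry["user_id"] = record["user_id"]   # assumption: attach the user ID
                break
    # 3) Otherwise create a new identity ID.
    if target is None:
        target = max(store, default=0) + 1
        store[target] = {"user_id": record["user_id"], **{f: set() for f in FIELDS}}
    for f in FIELDS:
        if record.get(f):
            store[target][f].add(record[f])
    return target


store = {
    50: {"user_id": None, "user_name": set(), "email": {"123@163.com"},
         "phone": {"33333333"}, "cookie": {"AAAA"}, "ip": {"1.1.1.1"}},
}
coco = {"user_id": "105", "user_name": "coco", "email": "123@163.com",
        "phone": "33333333", "cookie": "BBBB", "ip": "2.2.1.1"}
stranger = {"user_id": "200", "user_name": "dana", "email": "dana@example.com",
            "phone": None, "cookie": "CCCC", "ip": "1.1.1.1"}

coco_identity = daily_update(coco, store)
stranger_identity = daily_update(stranger, store)
```

The record for user ID 105 lands in identity ID 50 because the Email matches, while the second record shares only a computer IP with the store and therefore receives a new identity ID.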
2. Extract the user identity information from the "inquiry sent without logging in" website behavior records produced on the new day, compare it with the information in the identity ID relation store, and supplement the matching identity IDs in the store with the identity information from the records.
Email is compared first. The user identity information of an "inquiry sent without logging in" record includes an Email, which is compared with the Emails in the identity ID relation store. If it matches, the record's user identity information is merged into the corresponding identity ID in the store. During the merge, the other information (telephone number, Cookie, computer IP) is first compared with the corresponding information already held under that identity ID: if a value is the same, the identity ID's information in the store is not updated; if it differs, the record's value is added to the corresponding identity ID in the store.
If the Email does not match, the telephone number in the record is compared with the telephone numbers in the identity ID relation store. If it matches, the record's user identity information is merged into the corresponding identity ID, and the other information (Email, Cookie, computer IP) is handled in the same way: identical values are left unchanged and differing values are added to the identity ID. If the telephone number in the record matches the telephone numbers of several identity IDs, the record's user identity information is merged into the identity ID whose behavior occurred most recently.
If neither the Email nor the telephone number matches, the Cookie in the record is compared with the Cookies in the identity ID relation store. If it matches, the record's user identity information is merged into the corresponding identity ID, and the other information (Email, telephone number, computer IP) is handled in the same way: identical values are left unchanged and differing values are added to the identity ID. If the Cookie in the record matches the Cookies of several identity IDs, the record's user identity information is merged into the identity ID whose behavior occurred most recently.
If none of the above match, take the remaining website behavior records that have not yet been merged into any identity ID, extract the user identity information they contain (Email, telephone number, Cookie, computer IP), and compare this information across the records. Records that share any identical identity information are judged to belong to the same user; they are given the same new identity ID, and that new identity ID is added to the identity ID relation store.
Each website behavior record still remaining after that is given a new identity ID of its own and added to the identity ID relation store.
3. Extract the user identity information from the "access" and "search" website behavior records produced on the new day, compare it with the information in the identity ID relation store, and supplement the matching identity IDs in the store with the identity information from the records.
The Cookie in the user identity information of an "access" or "search" record is compared with the Cookies in the identity ID relation store. If it matches, the record's user identity information is merged into the corresponding identity ID in the store. During the merge, the other information (Email, telephone number, computer IP) is first compared with the corresponding information already held under that identity ID: if a value is the same, the identity ID's information in the store is not updated; if it differs, the record's value is added to the corresponding identity ID in the store. If the Cookie in the record matches the Cookies of several identity IDs, the record's user identity information is merged into the identity ID whose behavior occurred most recently.
If the Cookie does not match, take the remaining "access" and "search" records that have not yet been merged into the identity ID relation store, extract the user identity information they contain (Email, telephone number, Cookie, computer IP), and compare this information across the records. Records that share any identical identity information are judged to belong to the same user; they are given the same new identity ID, and that new identity ID is added to the identity ID relation store.
Each website behavior record still remaining after that is given a new identity ID of its own and added to the identity ID relation store.
Step 5: Once the identity IDs and the associated identity feature relations have been generated, apply the identity feature relations to subsequent user behavior. An identity ID is obtained from the result of associating the identity information in each of a user's historical behavior records; every user thus has a unique identity ID on the website, which can be used for applications such as user behavior analysis.
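Applying the relation store to behavior records can be as simple as a reverse lookup from a feature value to an identity ID. The sketch below uses only the Cookie for the lookup; `build_cookie_index`, `tag_behaviors`, and the record layout are hypothetical. When a Cookie belongs to several identities, this sketch keeps the first one encountered, which is a simplification of the most-recent-behavior rule described in Step 3.

```python
from collections import Counter


def build_cookie_index(store):
    """Map each known Cookie value to an identity ID (first identity wins on conflict)."""
    index = {}
    for identity_id, entry in store.items():
        for cookie in entry["cookie"]:
            index.setdefault(cookie, identity_id)
    return index


def tag_behaviors(behaviors, index):
    """Return copies of the behavior records with an `identity_id` field (None if unknown)."""
    return [{**b, "identity_id": index.get(b.get("cookie"))} for b in behaviors]


store = {
    10: {"cookie": {"asdfghj", "zxcvbnj"}},
    50: {"cookie": {"AAAA", "BBBB"}},
}
behaviors = [
    {"action": "search", "cookie": "asdfghj"},
    {"action": "access", "cookie": "zxcvbnj"},
    {"action": "access", "cookie": "BBBB"},
    {"action": "access", "cookie": "unknown"},
]
tagged = tag_behaviors(behaviors, build_cookie_index(store))
actions_per_identity = Counter(
    t["identity_id"] for t in tagged if t["identity_id"] is not None
)
```

Once every behavior record carries an identity ID, per-user analyses such as counting actions or reconstructing access paths become ordinary group-by operations.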
The present invention also discloses a user identity identification system, comprising:
a data acquisition and storage module, a data preparation/conversion/integration module, an identity identification processing module, an identity update and maintenance module, and an identity information application module.
The data acquisition and storage module extracts, from the data source systems of the website platform, the log data that records users' behaviors (access, search, inquiry, login, registration, and so on), extracts user basic information (user name, region, telephone number, and so on), and stores both on the background server.
The data preparation/conversion/integration module reads the log data from the data storage module, parses the log records to form intermediate-layer data on users' behaviors, including the basic information filled in at user registration, and stores it on the background server.
The identity identification processing module assigns an identity ID to each user and establishes the association between the identity ID and the user ID, user name, Email, telephone number, Cookie, computer IP, and so on.
The identity update and maintenance module merges, corrects, supplements, and maintains the identity information contained in newly produced user behavior, forms new identity IDs and their corresponding identity information, and updates the identity ID relation store.
The identity information application module applies the identities in the identity ID relation store to user behavior on the website platform in order to identify users and to track and analyze their behavior.
The present invention has the following advantages:
The present invention proposes a user identity identification method and system. It takes the basic information formed at user registration (user ID, user name, Email, telephone number, computer IP, and the like), extracts website user behavior data, and consolidates the user ID, user name, Email, telephone number, Cookie, computer IP, and other information found in the behavior data. It then establishes the association between the two sets of user information and assigns a unique identity ID. Users of current B2B websites can thereby be identified in a unified way, identity feature relations established, new and returning users distinguished, and user behavior tracked effectively, so that a range of applications can be built for users and the user experience improved.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the user identity identification method of an embodiment of the present invention.
Fig. 2 is a schematic diagram of the formation of the identity ID relation store of the present invention.
Fig. 3 is a schematic structural diagram of the user identity identification system of an embodiment of the present invention.
Fig. 4 is a schematic diagram of the user ID association relation of the present invention.
Detailed description
To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, some terms used in the user identity identification system are briefly explained below.
Identity ID: the unique identifier of a user on the website. Anyone who visits the website, whether or not registered as a member, is assigned a unique identifier through identity identification.
User identity feature relation: the relation among user identity features built from multiple feature fields (user ID, user name, Email, telephone number, Cookie, computer IP, and so on) left behind by the user's interactions with the website, which is used for feature tracking.
Cookie family, computer IP family, Email family, telephone number family: the set of concrete values of one feature field that relate to the same user. For example, if a user resets the system after using a certain Cookie and a new Cookie is generated, the system treats both Cookies as that user's Cookie family.
One user ID corresponding to multiple user names: when a supplier registers on a B2B website, publishes products, and communicates with buyers, it can set up one primary user name and multiple sub-user names. The primary user name grants the sub-user names different product management permissions and other information management permissions, so that information is managed separately. In this case the primary user and the sub-users share one user ID.
With reference to Fig. 1, the identification method flow of the embodiment of the present invention specifically comprises the following steps:
Step 11: collect the relevant data from the data source systems of the website platform, where the data source systems include the web log information related to website traffic, the user basic information stored in the background server, and so on; data is extracted from each system and stored.
Step 12: classify the collected data to form intermediate-layer data on user behavior records and user basic information, and store it in the background server. In an embodiment of the present invention, it is determined from historical data analysis that identification is based on behaviors such as user registration, login, inquiry, access, and search. The user identity information contained in these behaviors, namely user ID, user name, Email, phone, Cookie, computer IP, and so on, serves as the identification information, but the information is not limited to these: any other indicator that reflects identity features can also be used for identification.
Step 13: form the user identity relationship from the user identity information contained in the user's behavior records and the user basic information in the registration information, and assign a unique identity ID. Specifically, the identity information contained in behaviors such as login, inquiry, access, and search (the user ID, user name, Email, phone, Cookie, and computer IP in the behavior records) is associated with the user basic information in the registration information (user ID, user name, Email, phone, computer IP, and so on), and all of these identifiers are finally unified under one identity ID.
In an embodiment of the present invention, many users of a B2B website are anonymous, and one user ID may have multiple user names, Emails, phones, Cookies, computer IPs, and so on; a unique user identifier therefore needs to be defined.
Taking the Made-in-China website as an example, the detailed relationship structure is shown in Fig. 2:
(1) First, the records of the two website behaviors "login" and "inquiry sent while logged in" and the user basic information in the "registration information" are associated and merged by identity information: all user names, Emails, telephone numbers, Cookies, and computer IPs corresponding to the same user ID are found, and different user IDs are assigned different identity IDs. One user ID may correspond to multiple user names, multiple Emails, multiple telephone numbers, multiple Cookies, and multiple computer IPs. This forms identity ID relation store 1.
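Sub-step (1) amounts to grouping logged-in records by user ID and collecting every value seen for each field. A minimal sketch follows; the function name `build_store_1`, the dictionary layout, and the sequential numbering of identity IDs are assumptions for illustration, since the text does not say how identity IDs are generated.

```python
from collections import defaultdict

FIELDS = ("user_name", "email", "phone", "cookie", "ip")


def build_store_1(records):
    """Group logged-in records by user ID and assign one identity ID per user ID."""
    grouped = defaultdict(lambda: {f: set() for f in FIELDS})
    for rec in records:
        for f in FIELDS:
            value = rec.get(f)
            if value:                                # skip missing or empty fields
                grouped[rec["user_id"]][f].add(value)
    store = {}
    # Identity IDs are numbered from 1 in order of first appearance (an arbitrary choice).
    for identity_id, (user_id, families) in enumerate(grouped.items(), start=1):
        store[identity_id] = {"user_id": user_id, **families}
    return store


records = [
    {"user_id": "001", "user_name": "cancy", "email": "cancy@163.com", "cookie": "asdfghj"},
    {"user_id": "001", "user_name": "judy", "email": "judy@qq.com", "cookie": "zxcvbnj"},
    {"user_id": "002", "user_name": "coco", "phone": "33333333"},
]
store_1 = build_store_1(records)
```

Here both records for user ID 001 end up under a single identity ID, with two user names, two Emails, and two Cookies attached.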
(2) Extract the user identity information of the "inquiry sent without logging in" website behavior records, compare it with identity ID relation store 1, and carry out identity merging and updating.
Email is compared first. The user identity information of an "inquiry sent without logging in" website behavior record contains an Email, which is compared with the Emails in the identity ID relation store. If they are identical, the user identity information of this website behavior record is merged into the corresponding identity ID in the store. During the merge, the other information (telephone number, Cookie, computer IP) is first compared with the corresponding information of that identity ID: if the values are identical, the identity ID's information in the store is not updated; if they differ, the corresponding user identity information of the website behavior record is added to that identity ID in the store.
If the Email does not match, the telephone number contained in the user identity information of the "inquiry sent without logging in" website behavior record is compared with the telephone numbers in the identity ID relation store. If they are identical, the user identity information of this website behavior record is merged into the corresponding identity ID in the store. During the merge, the other information (Email, Cookie, computer IP) is first compared with the corresponding information of that identity ID: if the values are identical, the identity ID's information in the store is not updated; if they differ, the corresponding user identity information of the website behavior record is added to that identity ID in the store. If the telephone number in a website behavior record is identical to the telephone numbers of multiple identity IDs, the record's user identity information is merged into the identity ID in the store whose behavior occurred most recently.
If neither the Email nor the telephone number matches, the Cookie contained in the user identity information of the "inquiry sent without logging in" website behavior record is compared with the Cookies in the identity ID relation store. If they are identical, the user identity information of this website behavior record is merged into the corresponding identity ID in the store. During the merge, the other information (Email, telephone number, computer IP) is first compared with the corresponding information of that identity ID: if the values are identical, the identity ID's information in the store is not updated; if they differ, the corresponding user identity information of the website behavior record is added to that identity ID in the store. If the Cookie in a website behavior record is identical to the Cookies of multiple identity IDs, the record's user identity information is merged into the identity ID in the store whose behavior occurred most recently.
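The Email, then telephone number, then Cookie comparison described above is a priority cascade with a tie-break on recency. The sketch below is one possible reading, not the patented implementation: the names `match_identity`, `merge_into`, and the `last_seen` timestamp field are assumptions, and the tie-break is applied to every field for simplicity although the text states it only for telephone numbers and Cookies.

```python
def match_identity(record, store, priority=("email", "phone", "cookie")):
    """Return the identity ID matched by the first field in `priority` that hits, else None."""
    for field_name in priority:
        value = record.get(field_name)
        if not value:
            continue
        hits = [iid for iid, entry in store.items() if value in entry[field_name]]
        if hits:
            # Tie-break: prefer the identity whose behavior occurred most recently.
            return max(hits, key=lambda iid: store[iid]["last_seen"])
    return None


def merge_into(record, entry, fields=("email", "phone", "cookie", "ip")):
    """Add the record's values to the identity's families; identical values change nothing."""
    for field_name in fields:
        value = record.get(field_name)
        if value:
            entry[field_name].add(value)


store = {
    1: {"email": {"a@x.com"}, "phone": {"111"}, "cookie": {"c1"}, "ip": {"1.1.1.1"}, "last_seen": 10},
    2: {"email": {"b@x.com"}, "phone": {"111"}, "cookie": {"c2"}, "ip": {"2.2.2.2"}, "last_seen": 20},
}
record = {"email": "new@x.com", "phone": "111", "cookie": "c9", "ip": "3.3.3.3"}

matched = match_identity(record, store)   # Email misses; phone hits identities 1 and 2
if matched is not None:
    merge_into(record, store[matched])
```

In this example the Email is unknown, the telephone number matches two identities, and the record is merged into identity 2 because its `last_seen` value is larger.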
If none of the above match, the user identity information (Email, telephone number, Cookie, computer IP) is extracted from the remaining website behavior records that have not yet been merged into an identity ID, and these identity values are compared across the different website behavior records. Website behavior records that share any identical identity value are determined to belong to the same user and are assigned the same new identity ID, which is added to the identity ID relation store.
The user identity information of each website behavior record that still remains is assigned a new identity ID.
All of the new identity IDs form identity ID relation store 2.
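Grouping the leftover records that share any identity value can be sketched with a union-find (disjoint-set) structure. This is an illustrative technique chosen here, not one named in the text; it also assumes that sharing is transitive (if A shares a value with B, and B with C, all three fall into one group), which the text does not state explicitly. The function name `group_remaining` is hypothetical.

```python
def group_remaining(records, fields=("email", "phone", "cookie", "ip")):
    """Group records that share at least one identity value, using union-find."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving keeps the trees shallow
            i = parent[i]
        return i

    first_owner = {}                        # (field, value) -> index of first record seen
    for i, rec in enumerate(records):
        for f in fields:
            value = rec.get(f)
            if not value:
                continue
            key = (f, value)
            if key in first_owner:
                parent[find(i)] = find(first_owner[key])
            else:
                first_owner[key] = i

    groups = {}
    for i in range(len(records)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


leftover = [
    {"email": "a@x.com", "cookie": "c1"},
    {"cookie": "c1", "phone": "222"},
    {"phone": "222"},
    {"email": "z@x.com"},
]
groups = group_remaining(leftover)
```

Each resulting group would receive one new identity ID; a group of size one corresponds to the "still remaining" records that get their own new identity ID.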
(3) Extract the user identity information of the "access" and "search" website behavior records, compare it with identity ID relation store 1 and identity ID relation store 2, and carry out identity merging and updating.
The Cookie contained in the user identity information of an "access" or "search" website behavior record is compared with the Cookies in identity ID relation store 1 and identity ID relation store 2. If they are identical, the user identity information of this website behavior record is merged into the corresponding identity ID in the store. During the merge, the other information (Email, telephone number, computer IP) is first compared with the corresponding information of that identity ID: if the values are identical, the identity ID's information in the store is not updated; if they differ, the corresponding user identity information of the website behavior record is added to that identity ID in the store. If the Cookie in a website behavior record is identical to the Cookies of multiple identity IDs, the record's user identity information is merged into the identity ID in the store whose behavior occurred most recently.
If the Cookie does not match, the user identity information (Email, telephone number, Cookie, computer IP) is extracted from the remaining "access" and "search" website behavior records that have not yet been merged into the identity ID relation store, and these identity values are compared across the different website behavior records. Website behavior records that share any identical identity value are determined to belong to the same user and are assigned the same new identity ID, which is added to the identity ID relation store.
The user identity information of each website behavior record that still remains is assigned a new identity ID.
All of the new identity IDs form identity ID relation store 3.
Finally, the association among user ID, user name, Email, telephone number, Cookie, and computer IP is established: identity ID relation store 1, identity ID relation store 2, and identity ID relation store 3 are merged to form the identity ID relation store.
Step 14: according to the identity information in each newly generated user behavior, update and maintain the identity IDs and identity relationships already formed in the identity ID relation store.
Taking the Made-in-China website as an example, the detailed steps are as follows:
1. For the records of "login", "inquiry sent while logged in", and "registration information" produced on the new day, extract the user identity information, compare it with the information in the identity ID relation store, and supplement the identity IDs in the store with the identity information from the website behavior records.
First, the record is compared with the identity ID entries in the store that have a "user ID". If the user ID is identical, the other user identity data in the website behavior record is merged and de-duplicated with the information of the matched identity ID and added to the identity ID relation store.
For example, the identity ID relation store contains the following identity ID record:
Identity ID | User ID | User name | Email | Phone | Cookie | Computer IP |
10 | 001 | cancy | cancy@163.com | 55556666 | asdfghj | 192.168.1.1 |
From the records of "login", "inquiry sent while logged in", and "registration information" produced on the new day, the identity information extracted from one record is:
User ID | User name | Email | Phone | Cookie | Computer IP |
001 | judy | judy@qq.com | 55556666 | zxcvbnj | 192.168.1.1 |
After matching, the information is merged and de-duplicated to give the updated identity feature relationship of identity ID 10.
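The merge and de-duplication for this example can be worked through in a few lines. The sketch below uses the values from the two tables above; the function name `merge_dedupe` and the list-based layout are assumptions, and the merged values are derived from the stated rule rather than copied from the original.

```python
def merge_dedupe(entry, record):
    """Return a copy of `entry` with the record's values merged in, without duplicates."""
    merged = {k: (list(v) if isinstance(v, list) else v) for k, v in entry.items()}
    for field_name, value in record.items():
        if field_name == "user_id":
            continue                        # the user ID is the match key, not a family
        if value not in merged[field_name]:
            merged[field_name].append(value)
    return merged


entry = {
    "identity_id": 10, "user_id": "001",
    "user_name": ["cancy"], "email": ["cancy@163.com"], "phone": ["55556666"],
    "cookie": ["asdfghj"], "ip": ["192.168.1.1"],
}
record = {
    "user_id": "001", "user_name": "judy", "email": "judy@qq.com",
    "phone": "55556666", "cookie": "zxcvbnj", "ip": "192.168.1.1",
}

if record["user_id"] == entry["user_id"]:   # user IDs match, so merge
    merged = merge_dedupe(entry, record)
```

Under these assumptions, identity ID 10 ends up with two user names (cancy, judy), two Emails, and two Cookies, while the telephone number and computer IP stay single because the new values are identical to the stored ones.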
If the user ID is different, the record is compared with the identity ID entries in the store that have no "user ID". The comparison covers Email, telephone number, and Cookie: if any one of them is identical, the two are determined to belong to the same person, the corresponding identity ID in the store is assigned to the user of the website behavior record, and the other identity information in the website behavior record is added to that identity ID in the store.
For example, the identity ID relation store contains the following identity ID entry without a "user ID":
Identity ID | Email | Phone | Cookie | Computer IP |
50 | 123@163.com | 33333333 | AAAA | 1.1.1.1 |
From the records of "login", "inquiry sent while logged in", and "registration information" produced on the new day, the identity information extracted from one record is:
User ID | User name | Email | Phone | Cookie | Computer IP |
105 | coco | 123@163.com | 33333333 | BBBB | 2.2.1.1 |
The comparison finds that the Emails of the two are identical, so the information is merged and de-duplicated into the entry of identity ID 50.
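The same example can be traced in code. This is a sketch under stated assumptions: the function name `find_anonymous_match` and the set-based layout are hypothetical, and attaching the record's user ID to the matched identity is one reading of "assigned to the user of the website behavior record".

```python
def find_anonymous_match(record, store, keys=("email", "phone", "cookie")):
    """Return the first identity ID without a user ID that shares any key value, else None."""
    for identity_id, entry in store.items():
        if entry.get("user_id") is not None:
            continue                        # only identities without a user ID are candidates
        if any(record.get(k) and record[k] in entry[k] for k in keys):
            return identity_id
    return None


store = {
    50: {"user_id": None, "user_name": set(), "email": {"123@163.com"},
         "phone": {"33333333"}, "cookie": {"AAAA"}, "ip": {"1.1.1.1"}},
}
record = {"user_id": "105", "user_name": "coco", "email": "123@163.com",
          "phone": "33333333", "cookie": "BBBB", "ip": "2.2.1.1"}

matched = find_anonymous_match(record, store)
if matched is not None:
    entry = store[matched]
    entry["user_id"] = record["user_id"]    # assumption: the identity now carries the user ID
    for k in ("user_name", "email", "phone", "cookie", "ip"):
        entry[k].add(record[k])             # sets drop duplicates automatically
```

After the merge, identity ID 50 holds the original Email and telephone number unchanged, plus the new user name, the second Cookie, and the second computer IP.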
Finally, if the comparison finds no identical identity information at all, a new identity ID is generated and added to the identity ID relation store.
2. Extract the user identity information from the "inquiry sent without logging in" website behavior records produced on the new day, compare it with the information in the identity ID relation store, and supplement the identity IDs in the store with the identity information from the website behavior records.
Email is compared first. The user identity information of an "inquiry sent without logging in" website behavior record contains an Email, which is compared with the Emails in the identity ID relation store. If they are identical, the user identity information of this website behavior record is merged into the corresponding identity ID in the store. During the merge, the other information (telephone number, Cookie, computer IP) is first compared with the corresponding information of that identity ID: if the values are identical, the identity ID's information in the store is not updated; if they differ, the corresponding user identity information of the website behavior record is added to that identity ID in the store.
If the Email does not match, the telephone number contained in the user identity information of the "inquiry sent without logging in" website behavior record is compared with the telephone numbers in the identity ID relation store. If they are identical, the user identity information of this website behavior record is merged into the corresponding identity ID in the store. During the merge, the other information (Email, Cookie, computer IP) is first compared with the corresponding information of that identity ID: if the values are identical, the identity ID's information in the store is not updated; if they differ, the corresponding user identity information of the website behavior record is added to that identity ID in the store. If the telephone number in a website behavior record is identical to the telephone numbers of multiple identity IDs, the record's user identity information is merged into the identity ID in the store whose behavior occurred most recently.
If neither the Email nor the telephone number matches, the Cookie contained in the user identity information of the "inquiry sent without logging in" website behavior record is compared with the Cookies in the identity ID relation store. If they are identical, the user identity information of this website behavior record is merged into the corresponding identity ID in the store. During the merge, the other information (Email, telephone number, computer IP) is first compared with the corresponding information of that identity ID: if the values are identical, the identity ID's information in the store is not updated; if they differ, the corresponding user identity information of the website behavior record is added to that identity ID in the store. If the Cookie in a website behavior record is identical to the Cookies of multiple identity IDs, the record's user identity information is merged into the identity ID in the store whose behavior occurred most recently.
If none of the above match, the user identity information (Email, telephone number, Cookie, computer IP) is extracted from the remaining website behavior records that have not yet been merged into an identity ID, and these identity values are compared across the different website behavior records. Website behavior records that share any identical identity value are determined to belong to the same user and are assigned the same new identity ID, which is added to the identity ID relation store.
The user identity information of each website behavior record that still remains is assigned a new identity ID and added to the identity ID relation store.
3. Extract the user identity information from the "access" and "search" website behavior records produced on the new day, compare it with the information in the identity ID relation store, and supplement the identity IDs in the store with the identity information from the website behavior records.
The Cookie contained in the user identity information of an "access" or "search" website behavior record is compared with the Cookies in the identity ID relation store. If they are identical, the user identity information of this website behavior record is merged into the corresponding identity ID in the store. During the merge, the other information (Email, telephone number, computer IP) is first compared with the corresponding information of that identity ID: if the values are identical, the identity ID's information in the store is not updated; if they differ, the corresponding user identity information of the website behavior record is added to that identity ID in the store. If the Cookie in a website behavior record is identical to the Cookies of multiple identity IDs, the record's user identity information is merged into the identity ID in the store whose behavior occurred most recently.
If the Cookie does not match, the user identity information (Email, telephone number, Cookie, computer IP) is extracted from the remaining "access" and "search" website behavior records that have not yet been merged into the identity ID relation store, and these identity values are compared across the different website behavior records. Website behavior records that share any identical identity value are determined to belong to the same user and are assigned the same new identity ID, which is added to the identity ID relation store.
The user identity information of each website behavior record that still remains is assigned a new identity ID and added to the identity ID relation store.
Step 15: write the currently updated identity IDs and identity information back to the user behaviors, assign an identity ID to each user behavior record, and complete the adaptive process.
Step 16: apply the final identity IDs and the corresponding information relationships to website analysis such as user behavior tracking and analysis.
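Assigning an identity ID to each behavior record can be sketched as a lookup against an index built from the identity ID relation store. The index layout, the function names `build_index` and `tag_behaviors`, and the lookup order (user ID, Email, telephone number, Cookie) are all assumptions for illustration; the text does not describe how the write-back is implemented.

```python
def build_index(store, keys=("user_id", "email", "phone", "cookie")):
    """Map each (field, value) pair to the identity ID that owns it."""
    index = {}
    for identity_id, entry in store.items():
        for k in keys:
            for value in entry.get(k, ()):
                index[(k, value)] = identity_id
    return index


def tag_behaviors(behaviors, index, keys=("user_id", "email", "phone", "cookie")):
    """Attach an identity ID to each behavior record (None when nothing matches)."""
    for b in behaviors:
        b["identity_id"] = next(
            (index[(k, b[k])] for k in keys if b.get(k) and (k, b[k]) in index),
            None,
        )
    return behaviors


store = {
    10: {"user_id": {"001"}, "email": {"cancy@163.com"}, "phone": set(),
         "cookie": {"asdfghj", "zxcvbnj"}},
    50: {"user_id": set(), "email": {"123@163.com"}, "phone": {"33333333"},
         "cookie": {"AAAA"}},
}
behaviors = [
    {"action": "search", "cookie": "zxcvbnj"},
    {"action": "inquiry", "email": "123@163.com", "cookie": "unknown"},
    {"action": "access", "cookie": "never-seen"},
]
tagged = tag_behaviors(behaviors, build_index(store))
```

Once every behavior record carries an identity ID, records can be grouped by that ID to follow one user across logged-in and anonymous activity.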
With reference to Fig. 3, the structure of the identification system of the embodiment of the present invention comprises:
A data information collection and storage module, a data sorting/conversion/integration module, an identification processing module, an identity update and maintenance module, and an identity information application module.
The data information collection and storage module extracts, from the data source systems of the website platform, the web log data that records user behavior and the basic information data of user registration, and stores them in the background server.
The data information collection and storage module comprises a log system, a background database system, and a data storage unit. The log system extracts and stores the information on user interactions with the website and records each kind of user behavior on the website, including login, inquiry, registration, access, and search; the background database system stores the basic information of background operations, including the basic information of user registration; the data storage unit extracts data daily from the log system and the background database system according to the data warehouse extraction rules and stores it for further processing by the data sorting/conversion/integration module.
The data sorting/conversion/integration module reads the various kinds of log data in the data storage module, classifies the collected data to form intermediate-layer data on user behavior and user basic information, and stores it in the data warehouse.
The data sorting/conversion/integration module comprises an ETL submodule and a data warehouse submodule. The ETL submodule reads the various kinds of data in the data storage unit, performs further information identification, cleaning, processing, and sorting, and outputs the result to the data warehouse submodule; the data warehouse submodule classifies and aggregates the information to form intermediate-layer data and stores it in the data warehouse, where the stored information is mainly divided into user behavior information, user basic information, and so on. The user identity ID information finally generated by the identification in the embodiment of the present invention is also stored in the data warehouse submodule.
The identification processing module aggregates and compares the identity information in the user behavior records with the user basic information, finally assigns each user an identity ID, and establishes the association between the identity ID and the user ID, user name, Email, telephone number, Cookie, computer IP, and so on, to obtain the user identity relationship. It comprises an identity information unit, information association processor 1, identity feature information association submodule 1, information decision processor 1, information association processor 2, identity feature information association submodule 2, information decision processor 2, information association processor 3, and an identity feature information association module.
The identity information unit extracts identity feature information, including user ID, user name, Email, telephone number, Cookie, and computer IP records, from the behavior records such as login, inquiry, access, and search in the data warehouse submodule and from the basic information of user registration; it saves and aggregates this information and removes completely duplicate records.
Information association processor 1 associates and merges identities across the records of the two website behaviors "login" and "inquiry sent while logged in" and the user basic information in the "registration information", merging all user names, Emails, telephone numbers, Cookies, and computer IPs corresponding to the same user ID;
Identity feature information association submodule 1 stores the correspondence among the user ID, user name, Email, telephone number, Cookie, and computer IP merged and de-duplicated by information association processor 1, and assigns different identity IDs to different user IDs to form identity ID records;
Information decision processor 1 compares the identity information in the "inquiry sent without logging in" behavior records in the identity information unit with the identity ID records produced in identity feature information association submodule 1. If the identity information is identical, the two are considered the same person and the new identity information is merged into identity feature information association submodule 1; if it is not identical, the record is passed to information association processor 2;
Information association processor 2 processes the Email, telephone number, Cookie, and computer IP information not yet merged into an identity ID in information decision processor 1; records in which any one of Email, telephone number, or Cookie is identical are considered the same person and given the same identity ID;
Identity feature information association submodule 2 stores the associations between the identity ID and the Email, telephone number, Cookie, and computer IP merged by information association processor 2, and at the same time merges in the associations between the identity ID and the user ID, user name, Email, telephone number, Cookie, and computer IP stored in identity feature information association submodule 1;
Information decision processor 2 compares the user identity information in the "access" and "search" behaviors in the identity information unit with the identity ID records produced in identity feature information association submodule 2. If the identity information is identical, the two are considered the same person and the new identity information is merged into identity feature information association submodule 2; if it is not identical, the record is passed to information association processor 3;
Information association processor 3 processes the website behavior records not yet merged into an identity ID in information decision processor 1 by comparing the Cookie and computer IP information between them; if the Cookie is identical, they are considered the same person and given the same identity ID;
The identity feature information association module stores the identity ID records formed from the Cookie-to-identity-ID associations merged by information association processor 3, and at the same time merges in the identity ID records stored in identity feature information association submodule 2.
The identity update and maintenance module updates the user identity relationship information in the identification processing module. Based on a specific update algorithm and in an incremental update mode, each newly generated piece of identity feature data included in the model is compared with the existing identity feature relationships and identity IDs, and update and maintenance are carried out to form a new identity ID relation store.
The identity update and maintenance module comprises a new identity information unit, information decision processor 3, information association processor 4, an identity update processor, and an identity ID feature relationship result unit.
The new identity information unit stores the user identity information in the newly generated behaviors of website users, updated daily, and the basic information of newly registered users, and de-duplicates them;
Information decision processor 3 compares the identity information in the behavior records in the new identity information unit with the identity ID information in the identity feature information association module; if they are identical, the record is passed to the identity update processor;
The identity update processor merges and de-duplicates the user identity information in the new behaviors with the identity ID information in the identity feature information association module, updating the identity feature relationship of the existing identity ID;
Information association processor 4 processes the remaining website behavior records that have not yet been merged into the identity ID relation store, associating the user identity information among them to form new identity ID records;
The identity ID feature relationship result unit stores the updated identity ID records and continues to be updated daily.
The identity information application module applies the identity ID information, which has been formed and is continuously updated and adapted, to user behaviors. It establishes the identity relationship between a user's historical and current behaviors and identifies which behaviors were performed by the same user, so that user behavior can be tracked and analyzed.
The above discloses only one specific embodiment of the present invention and certainly cannot be used to limit the protection scope of the invention; changes or equivalent variations made according to the technical spirit of the claims of the present invention still fall within the scope covered by the claims.
Claims (9)
1. A user identity identification method, comprising:
Step 1: collect basic data from the data source systems of an e-commerce website platform, classify the collected basic data into two classes of data, and store them in the background server;
Step 2: based on the user's registration, login, inquiry, access, and search behaviors on the website, extract the website behavior records of the most recent time period, each kind of website behavior record containing identity information of the associated user, including user ID, user name, Email, telephone number, Cookie, and computer IP; combine these with the user basic information from user registration (user ID, user name, Email, telephone number, computer IP), aggregate the information, and remove completely duplicate records;
Step 3: according to the relationships among user ID, user name, Email, telephone number, Cookie, and computer IP, de-duplicate and normalize the user identity information by a corresponding preset method, finally obtaining the user identity association relationships and the corresponding identity information, and assign a unique identity ID to each user;
Step 4: update the identity ID relation store at regular intervals; for the newly generated behaviors of website users, compare and merge the user identity information they involve and the basic information of newly registered users with the information in the identity ID relation store, and supplement and update the identity ID relation store;
Step 5: after the identity IDs and the associated identity feature relationships are generated, apply the identity feature relationships to subsequent user behaviors; obtain the identity ID from the association and identification result of the identity information in each historical behavior record of the user, so that each user has a unique identity ID on the website, for use in user behavior analysis.
2. The method according to claim 1, characterized in that the two classes of data in step 1 comprise:
(1) the user basic information formed by user registration, including user ID, user name, Email, phone, and computer IP;
(2) the data of the user's registration, login, inquiry, access, and search behaviors on the website.
3. The method according to claim 1, characterized in that the sub-steps of the corresponding method in step 3 are specifically:
Step 3-1: first, associate and merge, by identity information, the records of the two website behaviors "login" and "inquiry sent while logged in" and the user basic information in the "registration information", finding all user names, Emails, telephone numbers, Cookies, and computer IPs corresponding to the same user ID; after the association, the identity ID relation store is formed;
Step 3-2: extract the user identity information of the "inquiry sent without logging in" website behavior records, compare it with the identity ID relation store, and carry out identity merging and updating;
Step 3-3: extract the user identity information of the "access" and "search" website behavior records, compare it with the identity ID relation store, and carry out identity merging and updating.
4. method according to claim 3, is characterized in that:
Step 3-2 is specially:
First, Email is compared: the user identity information of a "do not log in and send out an inquiry" website behavior record includes an Email, which is compared with the Emails in the identity ID relation storehouse. If they are identical, the user identity information of this website behavior record is merged into the corresponding identity ID in the identity ID relation storehouse. While merging, the record's other information (telephone number, Cookie, computer IP) is first compared with the corresponding information of that identity ID: if the two are identical, the identity ID's information in the identity ID relation storehouse is not updated; if they differ, the record's corresponding user identity information is added to the corresponding identity ID in the identity ID relation storehouse;
If the Email is different, the telephone number contained in the user identity information of the "do not log in and send out an inquiry" website behavior record is compared with the telephone numbers in the identity ID relation storehouse. If they are identical, the user identity information of this website behavior record is merged into the corresponding identity ID in the identity ID relation storehouse. While merging, the record's other information (Email, Cookie, computer IP) is first compared with the corresponding information of that identity ID: if the two are identical, the identity ID's information in the identity ID relation storehouse is not updated; if they differ, the record's corresponding user identity information is added to the corresponding identity ID in the identity ID relation storehouse. If the telephone number in the website behavior record is identical to the telephone number of multiple identity IDs, the user identity information in this record is integrated into the identity ID whose behavior occurred most recently in the identity ID relation storehouse;
If both the Email and the telephone number are different, the Cookie contained in the user identity information of the "do not log in and send out an inquiry" website behavior record is compared with the Cookies in the identity ID relation storehouse. If they are identical, the user identity information of this website behavior record is merged into the corresponding identity ID in the identity ID relation storehouse. While merging, the record's other information (Email, telephone number, computer IP) is first compared with the corresponding information of that identity ID: if the two are identical, the identity ID's information in the identity ID relation storehouse is not updated; if they differ, the record's corresponding user identity information is added to the corresponding identity ID in the identity ID relation storehouse. If the Cookie in the website behavior record is identical to the Cookie of multiple identity IDs, the user identity information in this record is integrated into the identity ID whose behavior occurred most recently in the identity ID relation storehouse;
The user identity information of each remaining website behavior record is given a new identity ID and added to the identity ID relation storehouse;
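The Email, telephone number, Cookie cascade of step 3-2 can be sketched in Python as below. This is only an illustration under stated assumptions, not the patent's implementation: the names `Identity`, `find_match`, `integrate`, the record keys (`email`, `phone`, `cookie`, `ip`, `ts`) and the in-memory list standing in for the identity ID relation storehouse are all hypothetical. The claim states the "most recent behavior" tie-break only for telephone number and Cookie; the sketch applies it to every key for simplicity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

FIELDS = ("email", "phone", "cookie", "ip")
# Comparison order from step 3-2; computer IP is merged but not used as a match key.
MATCH_ORDER = ("email", "phone", "cookie")


@dataclass
class Identity:
    """Hypothetical in-memory stand-in for one identity ID in the relation storehouse."""
    identity_id: int
    last_active: int  # timestamp of this identity's most recent behavior
    attrs: Dict[str, Set[str]] = field(
        default_factory=lambda: {f: set() for f in FIELDS}
    )


def find_match(record: Dict[str, object], store: List[Identity]) -> Optional[Identity]:
    """Try Email, then telephone number, then Cookie; stop at the first key that hits."""
    for key in MATCH_ORDER:
        value = record.get(key)
        if not value:
            continue
        hits = [ident for ident in store if value in ident.attrs[key]]
        if hits:
            # Several identity IDs share the value: pick the most recently active one.
            return max(hits, key=lambda ident: ident.last_active)
    return None


def integrate(record: Dict[str, object], store: List[Identity]) -> int:
    """Merge one behavior record into the store and return its identity ID."""
    target = find_match(record, store)
    if target is None:
        # No key matched: the record receives a new identity ID.
        target = Identity(identity_id=len(store) + 1, last_active=record["ts"])
        store.append(target)
    for f in FIELDS:
        if record.get(f):
            # Set semantics: identical values change nothing, differing values are added.
            target.attrs[f].add(record[f])
    target.last_active = max(target.last_active, record["ts"])
    return target.identity_id
```

Storing each attribute as a set is one way to express "do not update if identical, add if different" without a separate comparison step.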
Step 3-3 is specifically:
The Cookie contained in the user identity information of an "access" or "search" website behavior record is compared with the Cookies in the identity ID relation storehouse. If they are identical, the user identity information of this website behavior record is merged into the corresponding identity ID in the identity ID relation storehouse. While merging, the record's other information (Email, telephone number, computer IP) is first compared with the corresponding information of that identity ID: if the two are identical, the identity ID's information in the identity ID relation storehouse is not updated; if they differ, the record's corresponding user identity information is added to the corresponding identity ID in the identity ID relation storehouse. If the Cookie in the website behavior record is identical to the Cookie of multiple identity IDs, the user identity information in this record is integrated into the identity ID whose behavior occurred most recently in the identity ID relation storehouse;
If the Cookie is different, then for the remaining "access" and "search" website behavior records not yet integrated into the identity ID relation storehouse, the Email, telephone number, Cookie and computer IP of the user identity information they contain are extracted, and this identity information is compared across the different website behavior records; whenever website behavior records share identical identity information, they are determined to be the same user and given the same new identity ID, and this new identity ID is added to the identity ID relation storehouse;
The user identity information of each website behavior record that still remains is given a new identity ID and added to the identity ID relation storehouse.
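Grouping the leftover "access" and "search" records that share any identity information is a connected-components problem, and one common way to solve it is a union-find structure. The sketch below is an assumption-laden illustration rather than the patented procedure: `group_records`, the record keys and the returned label list are hypothetical. `KEYS` includes the computer IP because the claim lists it among the compared fields; restricting matching to the Cookie alone would be a one-line change.

```python
from typing import Dict, List, Tuple

# Fields compared between leftover records (hypothetical key names).
KEYS = ("email", "phone", "cookie", "ip")


def group_records(records: List[Dict[str, str]]) -> List[int]:
    """Return one group label per record; records sharing any value get the same label."""
    parent = list(range(len(records)))

    def find(i: int) -> int:
        # Walk to the root, halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen: Dict[Tuple[str, str], int] = {}
    for idx, rec in enumerate(records):
        for key in KEYS:
            value = rec.get(key)
            if not value:
                continue
            # Link this record to the first record that carried the same value.
            other = first_seen.setdefault((key, value), idx)
            parent[find(idx)] = find(other)

    # Number the groups 1, 2, 3, ... in order of first appearance.
    labels: Dict[int, int] = {}
    return [labels.setdefault(find(i), len(labels) + 1) for i in range(len(records))]
```

Records that end up alone in their group correspond to the final case of the claim, where each still-unmatched record simply receives its own new identity ID.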
5. The method according to claim 1, characterized in that the specific sub-steps of step 4 are:
Step 4-1: for the records of the three website behaviors "login", "log in and send out an inquiry" and "log-on message" produced in the new time period, extract the user identity information they contain, compare it with the information in the identity ID relation storehouse, and supplement and update the identity IDs in the identity ID relation storehouse with the identity information in the website behavior records;
Step 4-2: extract the user identity information in the "do not log in and send out an inquiry" website behavior records produced in the new time period, compare it with the information in the identity ID relation storehouse, and supplement and update the identity IDs in the identity ID relation storehouse with the identity information in the website behavior records;
Step 4-3: extract the user identity information in the "access" and "search" website behavior records produced in the new time period, compare it with the information in the identity ID relation storehouse, and supplement and update the identity IDs in the identity ID relation storehouse with the identity information in the website behavior records.
6. The method according to claim 5, characterized in that:
Step 4-1 is specifically:
First, compare with the identity ID information in the identity ID relation storehouse that has a "user ID": if the user ID is identical, the other data of the user identity information in the website behavior record is merged and deduplicated with the corresponding information of the matched identity ID and added to the identity ID relation storehouse;
If the user ID is different, compare with the identity ID information in the identity ID relation storehouse that has no "user ID"; the comparison covers Email, telephone number and Cookie. If any one of these is identical between the two, they are determined to belong to the same person: the corresponding identity ID in the identity ID relation storehouse is assigned to the user of the website behavior record, and the other identity information in the website behavior record is added to that identity ID in the identity ID relation storehouse;
Finally, if the comparison finds no identical identity information, a new identity ID is generated and added to the identity ID relation storehouse;
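The three-way decision of step 4-1 (match on user ID, else match an identity ID without a user ID on Email, telephone number or Cookie, else create a new identity ID) might look like the following sketch. All names here (`new_identity`, `resolve_login_record`, the dictionary layout) are hypothetical. Attaching the record's user ID to a previously anonymous identity ID is an assumption of this sketch; the claim only says the identity ID is assigned to the record's user.

```python
from typing import Dict, List, Optional

SET_FIELDS = ("email", "phone", "cookie", "ip")
# Fields compared against identity IDs that have no user ID.
FALLBACK_KEYS = ("email", "phone", "cookie")


def new_identity(identity_id: int, user_id: Optional[str] = None) -> Dict[str, object]:
    """Create an empty identity entry (hypothetical layout)."""
    entry: Dict[str, object] = {"identity_id": identity_id, "user_id": user_id}
    for f in SET_FIELDS:
        entry[f] = set()
    return entry


def resolve_login_record(record: Dict[str, str], store: List[Dict[str, object]]) -> int:
    """Place one logged-in behavior record into the store and return its identity ID."""
    # 1. Identity IDs that already carry this user ID.
    target = next((e for e in store if e["user_id"] == record["user_id"]), None)
    # 2. Otherwise, identity IDs without a user ID that share Email, phone or Cookie.
    if target is None:
        for e in store:
            if e["user_id"] is None and any(
                record.get(k) in e[k] for k in FALLBACK_KEYS
            ):
                target = e
                e["user_id"] = record["user_id"]  # assumption: remember the user ID
                break
    # 3. Otherwise, a new identity ID.
    if target is None:
        target = new_identity(len(store) + 1, record["user_id"])
        store.append(target)
    for f in SET_FIELDS:
        if record.get(f):
            target[f].add(record[f])  # merge with deduplication
    return target["identity_id"]
```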
Step 4-2 is specifically:
First, Email is compared: the user identity information of a "do not log in and send out an inquiry" website behavior record includes an Email, which is compared with the Emails in the identity ID relation storehouse. If they are identical, the user identity information of this website behavior record is merged into the corresponding identity ID in the identity ID relation storehouse. While merging, the record's other information (telephone number, Cookie, computer IP) is first compared with the corresponding information of that identity ID: if the two are identical, the identity ID's information in the identity ID relation storehouse is not updated; if they differ, the record's corresponding user identity information is added to the corresponding identity ID in the identity ID relation storehouse;
If the Email is different, the telephone number contained in the user identity information of the "do not log in and send out an inquiry" website behavior record is compared with the telephone numbers in the identity ID relation storehouse. If they are identical, the user identity information of this website behavior record is merged into the corresponding identity ID in the identity ID relation storehouse. While merging, the record's other information (Email, Cookie, computer IP) is first compared with the corresponding information of that identity ID: if the two are identical, the identity ID's information in the identity ID relation storehouse is not updated; if they differ, the record's corresponding user identity information is added to the corresponding identity ID in the identity ID relation storehouse. If the telephone number in the website behavior record is identical to the telephone number of multiple identity IDs, the user identity information in this record is integrated into the identity ID whose behavior occurred most recently in the identity ID relation storehouse;
If both the Email and the telephone number are different, the Cookie contained in the user identity information of the "do not log in and send out an inquiry" website behavior record is compared with the Cookies in the identity ID relation storehouse. If they are identical, the user identity information of this website behavior record is merged into the corresponding identity ID in the identity ID relation storehouse. While merging, the record's other information (Email, telephone number, computer IP) is first compared with the corresponding information of that identity ID: if the two are identical, the identity ID's information in the identity ID relation storehouse is not updated; if they differ, the record's corresponding user identity information is added to the corresponding identity ID in the identity ID relation storehouse. If the Cookie in the website behavior record is identical to the Cookie of multiple identity IDs, the user identity information in this record is integrated into the identity ID whose behavior occurred most recently in the identity ID relation storehouse;
The user identity information of each website behavior record that still remains is given a new identity ID and added to the identity ID relation storehouse;
Step 4-3 is specifically:
The Cookie contained in the user identity information of an "access" or "search" website behavior record is compared with the Cookies in the identity ID relation storehouse. If they are identical, the user identity information of this website behavior record is merged into the corresponding identity ID in the identity ID relation storehouse. While merging, the record's other information (Email, telephone number, computer IP) is first compared with the corresponding information of that identity ID: if the two are identical, the identity ID's information in the identity ID relation storehouse is not updated; if they differ, the record's corresponding user identity information is added to the corresponding identity ID in the identity ID relation storehouse. If the Cookie in the website behavior record is identical to the Cookie of multiple identity IDs, the user identity information in this record is integrated into the identity ID whose behavior occurred most recently in the identity ID relation storehouse;
If the Cookie is different, then for the remaining "access" and "search" website behavior records not yet integrated into the identity ID relation storehouse, the Email, telephone number, Cookie and computer IP of the user identity information they contain are extracted, and this identity information is compared across the different website behavior records; whenever website behavior records share identical identity information, they are determined to be the same user and given the same new identity ID, and this new identity ID is added to the identity ID relation storehouse;
The user identity information of each website behavior record that still remains is given a new identity ID and added to the identity ID relation storehouse.
7. A user identity identification system, characterized by comprising: a data information acquisition and storage module, a data preparation/conversion/integration module, an identification processing module, an identity update and maintenance module, and an identity information application module;
The data information acquisition and storage module is used to extract, from the data source systems of the website platform, the log data recording the user's various behaviors, including access, search, inquiry, login and registration behavior; and to extract basic user information, including user name, region, telephone and other basic data, and store it in a background server;
The data preparation/conversion/integration module is used to read the log data in the data storage module, parse the log records, and form intermediate-layer data on the user's various behaviors, including the basic information filled in at user registration, and store it in the background server;
The identity update and maintenance module is used to merge, revise, supplement and maintain the identity information contained in newly produced user behavior, form new identity IDs and corresponding identity information, and supplement and update them into the identity ID relation storehouse;
The identity information application module is used to apply the identities in the identity ID relation storehouse to user behavior on the website platform, to identify users, and to track and analyze user behavior.
8. The system according to claim 7, characterized in that:
The identification processing module comprises an identity information knowledge unit, information association processor 1, identity characteristic information association submodule 1, information decision processor 1, information association processor 2, identity characteristic information association submodule 2, information decision processor 2, information association processor 3, and an identity characteristic information association module;
The identity information knowledge unit is used to extract identity characteristic information from the behavior records (user login, inquiry, access, search and so on) in the data warehouse submodule and from the basic information of user registration, including user ID, user name, Email, telephone number, Cookie and computer IP records; this information is saved and aggregated, and completely duplicated records are removed;
Information association processor 1 is used to perform identity association and merging on the records of the two website behaviors "login" and "log in and send out an inquiry" and on the basic user information in the "log-on message", merging all user names, Emails, telephone numbers, Cookies and computer IPs corresponding to the same user ID;
Identity characteristic information association submodule 1 is used to store the correspondence among the user ID, user name, Email, telephone number, Cookie and computer IP merged and deduplicated by information association processor 1, and to give different identity IDs to different user IDs, forming identity ID information records;
Information decision processor 1 is used to compare the identity information in the "do not log in and send out an inquiry" behavior records in the identity information knowledge unit with the identity ID information records produced in identity characteristic information association submodule 1; if the identity information is identical, they are considered the same user and the new identity information is merged into identity characteristic information association submodule 1; if it is not identical, the record goes to information association processor 2;
Information association processor 2 is used to process the Email, telephone number, Cookie and computer IP information not yet integrated into an identity ID by information decision processor 1; records in which any of Email, telephone number or Cookie is identical are considered the same user and given the same identity ID;
Identity characteristic information association submodule 2 is used to store the associations between identity IDs and the Emails, telephone numbers, Cookies and computer IPs associated and merged by information association processor 2, and at the same time to merge in the associations stored in identity characteristic information association submodule 1 between identity IDs and user IDs, user names, Emails, telephone numbers, Cookies and computer IPs;
Information decision processor 2 is used to compare the user identity information in the "access" and "search" behaviors in the identity information knowledge unit with the identity ID information records produced in identity characteristic information association submodule 2; if the identity information is identical, they are considered the same user and the new identity information is merged into identity characteristic information association submodule 2; if it is not identical, the record goes to information association processor 3;
Information association processor 3 is used to process the website behavior records not yet integrated into an identity ID by information decision processor 1, comparing the Cookie and computer IP information between them; if the Cookie is identical, they are considered the same user and given the same identity ID;
The identity characteristic information association module is used to store the identity ID information records formed from the associations between Cookies and identity IDs associated and merged by information association processor 3, and at the same time to merge in the identity ID information records stored in identity characteristic information association submodule 2.
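The first stage of this chain, in which information association processor 1 merges everything seen under one user ID and association submodule 1 assigns one identity ID per user ID, can be sketched as below. The function name `build_identity_records`, the record layout and the sequential numbering of identity IDs are assumptions made for illustration; the claim does not specify how identity IDs are generated.

```python
from collections import defaultdict
from typing import Dict, List


def build_identity_records(login_records: List[Dict[str, str]]) -> List[Dict[str, object]]:
    """Merge records per user ID, deduplicate values, and assign one identity ID each."""
    merged: Dict[str, Dict[str, set]] = {}
    for rec in login_records:
        entry = merged.setdefault(rec["user_id"], defaultdict(set))
        for key, value in rec.items():
            if key != "user_id" and value:
                entry[key].add(value)  # sets drop repeated values automatically

    # Different user IDs receive different identity IDs (sequential here, by assumption).
    return [
        {"identity_id": n, "user_id": uid, **{k: sorted(v) for k, v in attrs.items()}}
        for n, (uid, attrs) in enumerate(merged.items(), start=1)
    ]
```

The later stages (decision processors 1 and 2, association processors 2 and 3) would then compare records without a user ID against this output, in the order the claim describes.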
9. The system according to claim 7, characterized in that:
The identity update and maintenance module comprises a new identity information knowledge unit, information decision processor 3, information association processor 4, an identity update processor, and an identity ID characteristic relation result unit;
The new identity information knowledge unit is used to store the user identity information in newly produced website user behavior, updated on a schedule, together with the basic information of newly registered users, and to deduplicate it;
Information decision processor 3 is used to compare the identity information in the behavior records in the new identity information knowledge unit with the identity ID information in the identity characteristic information association module; if it is identical, the record goes to the identity update processor;
The identity update processor is used to merge and deduplicate the user identity information in new behavior with the identity ID information in the identity characteristic information association module, updating the identity characteristic relations of existing identity IDs;
Information association processor 4 is used to process the remaining website behavior records not yet integrated into the identity ID relation storehouse, associating the user identity information among them to form new identity ID information records;
The identity ID characteristic relation result unit is used to store the updated and newly generated identity ID information records, and continues to be updated regularly.
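The routing performed by the update module (deduplicate the new batch, send records that match known identity information to the identity update processor, send the rest to information association processor 4) can be illustrated with a small sketch. The names `dedupe` and `route_new_records`, the `known` lookup structure and the choice of `MATCH_KEYS` are hypothetical; the claim does not say which fields information decision processor 3 compares.

```python
from typing import Dict, List, Set, Tuple

# Assumed comparison fields; the claim leaves these unspecified.
MATCH_KEYS = ("user_id", "email", "phone", "cookie")


def dedupe(records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Drop records that repeat an earlier record exactly, keeping first occurrences."""
    seen = set()
    unique = []
    for rec in records:
        fingerprint = tuple(sorted(rec.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(rec)
    return unique


def route_new_records(
    records: List[Dict[str, str]], known: Dict[str, Set[str]]
) -> Tuple[List[Dict[str, str]], List[Dict[str, str]]]:
    """Split a new batch into (records to update existing IDs, records to associate anew).

    `known` maps each match key to the set of values already present in the store.
    """
    to_update: List[Dict[str, str]] = []
    to_associate: List[Dict[str, str]] = []
    for rec in dedupe(records):
        hit = any(rec.get(k) in known.get(k, set()) for k in MATCH_KEYS)
        (to_update if hit else to_associate).append(rec)
    return to_update, to_associate
```

Running this on a schedule, with `known` rebuilt from the latest store contents, would correspond to the regular update described for the result unit.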
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410367353.5A CN104394118B (en) | 2014-07-29 | 2014-07-29 | A kind of method for identifying ID and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104394118A (en) | 2015-03-04 |
CN104394118B (en) | 2016-12-14 |
Family
ID=52611954
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410367353.5A Active CN104394118B (en) | 2014-07-29 | 2014-07-29 | A kind of method for identifying ID and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104394118B (en) |
Cited By (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104809156A (en) * | 2015-03-24 | 2015-07-29 | 北京锐安科技有限公司 | Evidence information recording method and device |
CN105550916A (en) * | 2015-11-30 | 2016-05-04 | 成都反思科技有限公司 | Data acquisition method on the basis of multidimensional identification |
CN105912663A (en) * | 2016-04-12 | 2016-08-31 | 宁波极动精准广告传媒有限公司 | User tag merging method based on big data |
CN106202099A (en) * | 2015-05-05 | 2016-12-07 | 北京国双科技有限公司 | The recognition methods of visitor information and device in web log file |
CN106230829A (en) * | 2016-08-03 | 2016-12-14 | 浪潮通用软件有限公司 | Network-oriented threatens the construction method of the virtual identity knowledge mapping found |
CN106302797A (en) * | 2016-08-31 | 2017-01-04 | 北京锐安科技有限公司 | A kind of cookie accesses De-weight method and device |
CN106549914A (en) * | 2015-09-18 | 2017-03-29 | 北京秒针信息咨询有限公司 | A kind of recognition methodss of independent access person and device |
CN106682025A (en) * | 2015-11-09 | 2017-05-17 | 阿里巴巴集团控股有限公司 | Method and device for identifying mobile phone number user |
CN107025563A (en) * | 2016-01-29 | 2017-08-08 | 福建天晴数码有限公司 | Follow the trail of the method and system for delivering advertisement |
CN107066539A (en) * | 2017-03-09 | 2017-08-18 | 北京网康科技有限公司 | A kind of information processing method and device |
CN107665438A (en) * | 2017-08-10 | 2018-02-06 | 深圳市买买提乐购金融服务有限公司 | A kind of data processing method and device |
CN107895280A (en) * | 2017-10-27 | 2018-04-10 | 深圳索信达数据技术股份有限公司 | A kind of marketing program method for pushing, system, terminal and storage medium |
CN108171547A (en) * | 2017-12-27 | 2018-06-15 | 平安普惠企业管理有限公司 | User behavior method for tracing, device, equipment and storage medium |
CN108241795A (en) * | 2016-12-23 | 2018-07-03 | 北京国双科技有限公司 | A kind of method for identifying ID and device |
CN108444584A (en) * | 2018-04-13 | 2018-08-24 | 山东华宇工学院 | A kind of intelligence height and weight measuring system and method |
CN108664375A (en) * | 2017-03-28 | 2018-10-16 | 瀚思安信(北京)软件技术有限公司 | Method for the abnormal behaviour for detecting computer network system user |
CN109086452A (en) * | 2018-08-24 | 2018-12-25 | 北京奇虎科技有限公司 | ID data network beta pruning preprocess method, device and calculating equipment |
CN109344722A (en) * | 2018-09-04 | 2019-02-15 | 阿里巴巴集团控股有限公司 | A kind of user identity determines method, apparatus and electronic equipment |
CN109598529A (en) * | 2017-09-30 | 2019-04-09 | 北京国双科技有限公司 | A kind of recognition methods of user identifier and device |
CN110109814A (en) * | 2019-05-15 | 2019-08-09 | 恒生电子股份有限公司 | User behavior data modification method and device |
CN110727885A (en) * | 2018-06-28 | 2020-01-24 | 上海传漾广告有限公司 | Internet global uniform identifier generation system and generation method thereof |
CN111147511A (en) * | 2019-12-31 | 2020-05-12 | 杭州涂鸦信息技术有限公司 | User identity serial-parallel method and system |
CN112734485A (en) * | 2021-01-13 | 2021-04-30 | 上海群之脉信息科技有限公司 | User intelligent operation system |
CN112734476A (en) * | 2021-01-13 | 2021-04-30 | 上海群之脉信息科技有限公司 | Intelligent customer data detection system |
CN114116863A (en) * | 2021-10-28 | 2022-03-01 | 上海欣兆阳信息科技有限公司 | Method and system for fusing cross-channel consumer identity in real time |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101222348A (en) * | 2007-01-10 | 2008-07-16 | 阿里巴巴公司 | Method and system for calculating number of website real user |
CN103136694A (en) * | 2013-03-20 | 2013-06-05 | 焦点科技股份有限公司 | Collaborative filtering recommendation method based on search behavior perception |
CN103886487A (en) * | 2014-03-28 | 2014-06-25 | 焦点科技股份有限公司 | Individualized recommendation method and system based on distributed B2B platform |
CN103942708A (en) * | 2013-09-30 | 2014-07-23 | 上海本家空调系统有限公司 | Method and system for evaluating regional customers |
Cited By (34)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104809156B (en) * | 2015-03-24 | 2019-02-01 | 北京锐安科技有限公司 | The method and apparatus of taking of evidence information |
CN104809156A (en) * | 2015-03-24 | 2015-07-29 | 北京锐安科技有限公司 | Evidence information recording method and device |
CN106202099B (en) * | 2015-05-05 | 2019-11-12 | 北京国双科技有限公司 | The recognition methods of visitor information and device in web log file |
CN106202099A (en) * | 2015-05-05 | 2016-12-07 | 北京国双科技有限公司 | The recognition methods of visitor information and device in web log file |
CN106549914A (en) * | 2015-09-18 | 2017-03-29 | 北京秒针信息咨询有限公司 | A kind of recognition methodss of independent access person and device |
CN106682025A (en) * | 2015-11-09 | 2017-05-17 | 阿里巴巴集团控股有限公司 | Method and device for identifying mobile phone number user |
CN105550916A (en) * | 2015-11-30 | 2016-05-04 | 成都反思科技有限公司 | Data acquisition method on the basis of multidimensional identification |
CN107025563A (en) * | 2016-01-29 | 2017-08-08 | 福建天晴数码有限公司 | Follow the trail of the method and system for delivering advertisement |
CN105912663A (en) * | 2016-04-12 | 2016-08-31 | 宁波极动精准广告传媒有限公司 | User tag merging method based on big data |
CN106230829A (en) * | 2016-08-03 | 2016-12-14 | 浪潮通用软件有限公司 | Network-oriented threatens the construction method of the virtual identity knowledge mapping found |
CN106230829B (en) * | 2016-08-03 | 2019-06-11 | 浪潮通用软件有限公司 | Network-oriented threatens the construction method of the virtual identity knowledge mapping of discovery |
CN106302797A (en) * | 2016-08-31 | 2017-01-04 | 北京锐安科技有限公司 | A kind of cookie accesses De-weight method and device |
CN108241795A (en) * | 2016-12-23 | 2018-07-03 | 北京国双科技有限公司 | A kind of method for identifying ID and device |
CN107066539A (en) * | 2017-03-09 | 2017-08-18 | 北京网康科技有限公司 | A kind of information processing method and device |
CN108664375A (en) * | 2017-03-28 | 2018-10-16 | 瀚思安信(北京)软件技术有限公司 | Method for the abnormal behaviour for detecting computer network system user |
CN108664375B (en) * | 2017-03-28 | 2021-05-18 | 瀚思安信(北京)软件技术有限公司 | Method for detecting abnormal behavior of computer network system user |
CN107665438B (en) * | 2017-08-10 | 2019-04-26 | 深圳市买买提信息科技有限公司 | A kind of data processing method and device |
CN107665438A (en) * | 2017-08-10 | 2018-02-06 | 深圳市买买提乐购金融服务有限公司 | A kind of data processing method and device |
CN109598529A (en) * | 2017-09-30 | 2019-04-09 | 北京国双科技有限公司 | A kind of recognition methods of user identifier and device |
CN107895280A (en) * | 2017-10-27 | 2018-04-10 | 深圳索信达数据技术股份有限公司 | A kind of marketing program method for pushing, system, terminal and storage medium |
CN108171547A (en) * | 2017-12-27 | 2018-06-15 | 平安普惠企业管理有限公司 | User behavior method for tracing, device, equipment and storage medium |
CN108444584A (en) * | 2018-04-13 | 2018-08-24 | 山东华宇工学院 | A kind of intelligence height and weight measuring system and method |
CN110727885A (en) * | 2018-06-28 | 2020-01-24 | 上海传漾广告有限公司 | Internet global uniform identifier generation system and generation method thereof |
CN109086452A (en) * | 2018-08-24 | 2018-12-25 | 北京奇虎科技有限公司 | ID data network beta pruning preprocess method, device and calculating equipment |
CN109344722A (en) * | 2018-09-04 | 2019-02-15 | 阿里巴巴集团控股有限公司 | A kind of user identity determines method, apparatus and electronic equipment |
US10997460B2 (en) | 2018-09-04 | 2021-05-04 | Advanced New Technologies Co., Ltd. | User identity determining method, apparatus, and device |
TWI738011B (en) * | 2018-09-04 | 2021-09-01 | 開曼群島商創新先進技術有限公司 | Method, device and electronic equipment for determining user identity |
US11244199B2 (en) | 2018-09-04 | 2022-02-08 | Advanced New Technologies Co., Ltd. | User identity determining method, apparatus, and device |
CN110109814A (en) * | 2019-05-15 | 2019-08-09 | 恒生电子股份有限公司 | User behavior data modification method and device |
CN111147511A (en) * | 2019-12-31 | 2020-05-12 | 杭州涂鸦信息技术有限公司 | User identity serial-parallel method and system |
CN112734485A (en) * | 2021-01-13 | 2021-04-30 | 上海群之脉信息科技有限公司 | User intelligent operation system |
CN112734476A (en) * | 2021-01-13 | 2021-04-30 | 上海群之脉信息科技有限公司 | Intelligent customer data detection system |
CN114116863A (en) * | 2021-10-28 | 2022-03-01 | 上海欣兆阳信息科技有限公司 | Method and system for fusing cross-channel consumer identity in real time |
CN114116863B (en) * | 2021-10-28 | 2023-07-25 | 上海欣兆阳信息科技有限公司 | Method and system for fusing cross-channel consumer identities in real time |
Also Published As
Publication number | Publication date |
---|---|
CN104394118B (en) | 2016-12-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |