CN104394118B - User identity recognition method and system - Google Patents

User identity recognition method and system

Info

Publication number
CN104394118B
Authority
CN
China
Prior art keywords
identity
information
website
user
relation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410367353.5A
Other languages
Chinese (zh)
Other versions
CN104394118A (en
Inventor
王婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Focus Technology Co Ltd
Original Assignee
Focus Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Focus Technology Co Ltd
Priority to CN201410367353.5A
Publication of CN104394118A
Application granted
Publication of CN104394118B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00: Network architectures or network communication protocols for network security
    • H04L63/08: Network architectures or network communication protocols for network security for authentication of entities
    • H04L63/0876: Network architectures or network communication protocols for network security for authentication of entities based on the identity of the terminal or configuration, e.g. MAC address, hardware or software configuration or device fingerprint
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00: Network arrangements, protocols or services for addressing or naming
    • H04L61/45: Network directories; Name-to-address mapping
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2101/00: Indexing scheme associated with group H04L61/00
    • H04L2101/30: Types of network names
    • H04L2101/33: Types of network names containing protocol addresses or telephone numbers

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Power Engineering (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The present invention provides a user identity recognition method and system. It takes the basic information created at user registration (user ID, user name, email, phone, computer IP, etc.), extracts website user behavior data, and combines the user ID, user name, email, telephone number, cookie, computer IP, and other information found in that behavior data. From both sources it builds association relations between pieces of user information and assigns each user a unique identity ID. This allows unified identity recognition of users on current B2B websites, establishes identity feature relations, distinguishes new users from returning users, and tracks user behavior effectively, so that a range of user-facing applications can be built and the user experience improved.

Description

User identity recognition method and system
Technical field
The present invention relates to the field of B2B e-commerce, and in particular to a user identity recognition method and system.
Background art
For an e-commerce website, user analysis is an important part of website analysis: it helps the site understand user needs and improve the user experience. User analysis means understanding the size of the site's user base, tracking user behavior on the site, and discovering users' behavioral characteristics, preferences, and habits. Through user analysis, a website can learn where its users come from, where they go, and who they are; it can gauge user satisfaction and find problems with the site, its channels, and so on, which helps raise the user conversion rate. Analyzing how users visit the site makes it possible to optimize their access paths, to examine dwell and exit behavior on each page, to find the problems on each page, and to improve the pages and the overall site layout. Analyzing user behavior reveals users' habits and interests, so that personalized services can be offered, which helps improve user loyalty and stickiness and retain users. Through user identity recognition, the site can provide personalized services that help users find good, satisfactory products faster, saving them time and raising satisfaction. All of this first requires being able to identify each user: to tell whether they are a new or a returning user, and to determine who they are (user name, mailbox, telephone number, etc.).
On a B2B website, the main services offered to users are searching for products, searching for suppliers, and sending inquiries, and the site does not force users to register or log in. Many users use the site's services as visitors, which makes user identification harder. To track user behavior accurately, every user who comes to the site must be identified and located.
The patent "User identity recognition method and system based on specific information" (application number: CN 201210019678.5) proposes the following method: specific information about a user's Internet access is mapped to a temporary unique identifier for that user; the temporary unique identifier and the user identity information are obtained from the communication network side; and the specific information is associated with the user identity information via the temporary unique identifier. However, that method relies mainly on the "computer IP address" or the "computer IP address + port number" as the temporary unique identifier. Its data source is narrow, it is strongly affected by changes of computer IP, and the unique identifier is not well defined. The present patent instead uses user ID, user name, mailbox, telephone number, cookie, computer IP, etc. to establish a user identity ID and its association relations, which improves recognition accuracy.
Summary of the invention
To address the deficiencies of the prior art, embodiments of the present invention provide a user identity recognition method and system that solve the problem of unified identity recognition for users of current B2B e-commerce websites.
The technical solution is as follows. A user identity recognition method comprises:
Step 1: Collect basic data from the data source systems of the e-commerce website platform, classify the collected data into two classes, and store them on a back-end server. The two classes are:
(1) user basic information created at user registration, including user ID, user name, email, phone, computer IP, etc.;
(2) data on website behaviors such as user registration, login, inquiry, access, and search.
Step 2: For the website behaviors of registration, login, inquiry, access, search, etc., extract the behavior records from the most recent year. Each kind of behavior record contains user identity information, including user ID, user name, email, telephone number, cookie, and computer IP. Combine these with the basic information from user registration (user ID, user name, email, telephone number, computer IP), pool all of this information, and remove records that are complete duplicates.
Because the identity information attached to each kind of behavior record is incomplete, some values may be empty. Inquiries are divided into those sent while logged in and those sent while not logged in, and the two cases record different user information, as shown in the following table.
User behavior                User ID   User name   Email   Telephone number   Cookie   Computer IP
Login
Login and send inquiry
Send inquiry without login
Access
Search
Registration information
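The pooling and exact-duplicate removal of Step 2 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `IdentityRecord` type, its field names, and the choice to include the behavior type when deciding whether two records are complete duplicates are all assumptions made for the example.

```python
from typing import NamedTuple, Optional

class IdentityRecord(NamedTuple):
    """One row of identity information; empty fields are None (hypothetical layout)."""
    behavior: str
    user_id: Optional[str] = None
    user_name: Optional[str] = None
    email: Optional[str] = None
    phone: Optional[str] = None
    cookie: Optional[str] = None
    ip: Optional[str] = None

def pool_and_dedupe(*sources):
    """Pool records from all sources and drop exact duplicates, keeping first-seen order."""
    seen = set()
    pooled = []
    for source in sources:
        for rec in source:
            if rec not in seen:  # tuples compare field by field
                seen.add(rec)
                pooled.append(rec)
    return pooled

logins = [
    IdentityRecord("login", user_id="001", user_name="cancy", cookie="asdfghj", ip="192.168.1.1"),
    IdentityRecord("login", user_id="001", user_name="cancy", cookie="asdfghj", ip="192.168.1.1"),
]
visits = [IdentityRecord("access", cookie="asdfghj", ip="192.168.1.1")]
records = pool_and_dedupe(logins, visits)
```

The access record is kept even though its cookie and IP also appear in the login record, because only fully identical rows are removed at this stage.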
Step 3: Based on the relations among user ID, user name, email, telephone number, cookie, and computer IP, apply a predefined matching method to deduplicate and normalize the user identity information above, finally obtaining the user identity association relations and the corresponding identity information, and assign each user a unique identity ID.
The sub-steps of the matching method are as follows:
1. First, merge the identity information in the records of the two website behaviors "login" and "login and send inquiry" with the user basic information in the "registration information". The same user ID is regarded as the same person, and different user IDs as different people. From the identity information in these three kinds of records, find all user names, emails, telephone numbers, cookies, and computer IPs that correspond to the same user ID.
On a B2B website, one user ID may correspond to multiple user names, multiple emails, multiple telephone numbers, multiple cookies, and multiple computer IPs. A unique identity ID is assigned to this user ID; the resulting association relation is shown in Fig. 4.
After this association processing, the identity ID relation store is formed.
One identity ID corresponds to one user ID; different user IDs are always given different identity IDs. Different identity IDs may share the same user name, telephone number, cookie, or computer IP. Therefore, when merging identity information, the most recent time at which each piece of information appeared in the user behavior data or the user basic information must also be recorded, to help decide which identity a new user belongs to.
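Sub-step 1 can be sketched as a grouping by user ID that also keeps the latest time each value was seen. The dictionary layout, the field names, and the ISO date strings used as timestamps are assumptions chosen for illustration only.

```python
from collections import defaultdict
from itertools import count

FIELDS = ("user_name", "email", "phone", "cookie", "ip")

def build_relation_store(records):
    """Group logged-in and registration records by user ID.

    Each record is a dict with 'user_id', 'time' and any of FIELDS.
    Returns user_id -> {'identity_id', 'user_id', 'values': field -> {value: last_seen}}.
    """
    next_id = count(1)
    store = {}
    for rec in records:
        entry = store.get(rec["user_id"])
        if entry is None:
            # a user ID not seen before always gets a fresh identity ID
            entry = {"identity_id": next(next_id), "user_id": rec["user_id"],
                     "values": defaultdict(dict)}
            store[rec["user_id"]] = entry
        for field in FIELDS:
            value = rec.get(field)
            if value:
                last = entry["values"][field].get(value)
                if last is None or rec["time"] > last:
                    entry["values"][field][value] = rec["time"]  # keep the latest sighting
    return store

records = [
    {"user_id": "001", "time": "2014-07-01", "user_name": "cancy", "cookie": "asdfghj"},
    {"user_id": "001", "time": "2014-07-20", "user_name": "judy", "cookie": "asdfghj"},
    {"user_id": "002", "time": "2014-07-05", "user_name": "coco", "cookie": "asdfghj"},
]
store = build_relation_store(records)
```

Here user IDs 001 and 002 share a cookie but remain two identities, and the stored timestamps are what later allow a tie between them to be broken.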
2. Extract the user identity information from the "send inquiry without login" website behavior records and compare it with the identity ID relation store to merge and update identities.
First compare by email. If the email in the identity information of a "send inquiry without login" record equals an email in the identity ID relation store, merge the record's identity information into the corresponding identity ID in the store. During the merge, the other fields (telephone number, cookie, computer IP) are first compared with the values already held by that identity ID: values that are already present leave the store unchanged, and values that differ are added to the corresponding identity ID.
If no email matches, compare the telephone number in the record with the telephone numbers in the identity ID relation store. If one matches, merge the record's identity information into the corresponding identity ID in the same way: the other fields (email, cookie, computer IP) are first compared with the values already held, values already present leave the store unchanged, and values that differ are added. If the record's telephone number matches the telephone numbers of several identity IDs, merge the record into the identity ID with the most recent behavior.
If neither email nor telephone number matches, compare the cookie in the record with the cookies in the identity ID relation store. If one matches, merge the record's identity information into the corresponding identity ID in the same way: the other fields (email, telephone number, computer IP) are first compared with the values already held, values already present leave the store unchanged, and values that differ are added. If the record's cookie matches the cookies of several identity IDs, merge the record into the identity ID with the most recent behavior.
Because computer IPs change frequently, computer IP is not used for identity matching here.
If none of the above matches, take the remaining website behavior records that have not been merged into any identity ID, extract the identity information they contain (email, telephone number, cookie, computer IP), and compare these values across the records. Records that share any identical identity information are judged to belong to the same user; they are given the same new identity ID, which is added to the identity ID relation store.
Each record still remaining after that is given its own new identity ID and added to the identity ID relation store.
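The email, then telephone number, then cookie cascade described above might look like the sketch below. The store layout and helper names are hypothetical. Two points are interpretations rather than statements from the text: "most recent behavior" is read as "the identity that most recently used the shared value", and the same tie-break is applied to emails as well.

```python
MATCH_ORDER = ("email", "phone", "cookie")  # computer IP is deliberately not a matching key
ALL_FIELDS = ("email", "phone", "cookie", "ip")

def find_identity(record, store):
    """Return the matching identity entry, or None if email, phone and cookie all miss."""
    for field in MATCH_ORDER:
        value = record.get(field)
        if not value:
            continue
        hits = [entry for entry in store if value in entry[field]]
        if hits:
            # several identities share the value: take the most recent user of it
            return max(hits, key=lambda entry: entry[field][value])
    return None

def merge_record(record, entry):
    """Add the record's values to the entry.

    Values already present are not duplicated; refreshing their timestamp
    is an assumption of this sketch.
    """
    for field in ALL_FIELDS:
        value = record.get(field)
        if value:
            entry[field][value] = max(record["time"], entry[field].get(value, ""))

store = [
    {"identity_id": 1, "email": {"a@example.com": "2014-06-01"},
     "phone": {"5555": "2014-06-01"}, "cookie": {}, "ip": {}},
    {"identity_id": 2, "email": {}, "phone": {"5555": "2014-07-10"}, "cookie": {}, "ip": {}},
]
record = {"time": "2014-07-15", "email": "new@example.com", "phone": "5555",
          "cookie": "ck9", "ip": "1.1.1.1"}
entry = find_identity(record, store)
if entry is not None:
    merge_record(record, entry)
```

In this example the email misses, the telephone number matches both identities, and identity 2 wins because it used the number more recently; the new email, cookie, and IP are then attached to identity 2.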
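Grouping the leftover records among themselves can be sketched with a union-find structure, which is a technique chosen here for illustration and is not named in the text. Treating the "same user" relation as transitive, and leaving computer IP out of the default comparison fields (following the remark that IPs change frequently), are both assumptions; `group_leftovers` and `first_id` are hypothetical names.

```python
from itertools import count

def group_leftovers(records, fields=("email", "phone", "cookie"), first_id=1):
    """Return one new identity ID per record; records sharing any field value share an ID."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    owner = {}  # (field, value) -> index of the first record carrying it
    for i, rec in enumerate(records):
        for field in fields:
            value = rec.get(field)
            if not value:
                continue
            j = owner.setdefault((field, value), i)
            parent[find(i)] = find(j)  # union the two groups

    ids = count(first_id)
    root_to_id = {}
    assigned = []
    for i in range(len(records)):
        root = find(i)
        if root not in root_to_id:
            root_to_id[root] = next(ids)
        assigned.append(root_to_id[root])
    return assigned

leftovers = [
    {"email": "a@example.com", "cookie": "c1"},
    {"phone": "3333", "cookie": "c1"},
    {"phone": "3333"},
    {"cookie": "c2"},
]
assigned = group_leftovers(leftovers, first_id=100)
```

The first three records are linked through a shared cookie and a shared telephone number and receive one identity ID; the last record shares nothing and receives its own, which also covers the final "remaining records" case.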
3. Extract the user identity information from the "access" and "search" website behavior records and compare it with the identity ID relation store to merge and update identities.
Compare the cookie in the identity information of an "access" or "search" record with the cookies in the identity ID relation store. If one matches, merge the record's identity information into the corresponding identity ID in the store. During the merge, the other fields (email, telephone number, computer IP) are first compared with the values already held by that identity ID: values that are already present leave the store unchanged, and values that differ are added to the corresponding identity ID. If the record's cookie matches the cookies of several identity IDs, merge the record into the identity ID with the most recent behavior.
If no cookie matches, take the remaining "access" and "search" records that have not been merged into the identity ID relation store, extract the identity information they contain (email, telephone number, cookie, computer IP), and compare these values across the records. Records that share any identical identity information are judged to belong to the same user; they are given the same new identity ID, which is added to the identity ID relation store.
Each record still remaining after that is given its own new identity ID and added to the identity ID relation store.
Step 4: Update the identity ID relation store daily. For newly occurring user behavior on the site, the user identity information involved, together with the basic information of newly registered users, is compared and merged with the information in the identity ID relation store, and the store is supplemented and updated accordingly.
The specific sub-steps are as follows:
1. For the records of the three sources "login", "login and send inquiry", and "registration information" produced on the new day, extract the user identity information, compare it with the information in the identity ID relation store, and add the identity information from the records to the matching identity IDs in the store.
First compare against the identity IDs in the store that have a user ID. If the user ID is the same, the remaining identity information in the record is merged with the information of the matching identity ID, deduplicated, and added to the identity ID relation store.
For example, the identity ID relation store holds the following identity ID record:
Identity ID  User ID  User name  Email          Phone     Cookie   Computer IP
10           001      cancy      cancy@163.com  55556666  asdfghj  192.168.1.1
From the records of the three sources "login", "login and send inquiry", and "registration information" produced on the new day, the identity information extracted from one record is:
User ID  User name  Email        Phone     Cookie   Computer IP
001      judy       judy@qq.com  55556666  zxcvbnj  192.168.1.1
After matching, merging, and deduplication, the identity feature relation is as follows (the result table is not reproduced in this text):
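The merge in this example can be reproduced with a short sketch. The list-per-field layout and the `merge_into` helper are hypothetical; the values come from the two tables above, and the outcome simply applies the stated rule (same user ID, so add every value not already present).

```python
FIELDS = ("user_name", "email", "phone", "cookie", "ip")

def merge_into(identity, record):
    """Append each value of the record to the identity unless it is already there."""
    for field in FIELDS:
        value = record.get(field)
        if value and value not in identity[field]:
            identity[field].append(value)
    return identity

identity_10 = {
    "identity_id": 10, "user_id": "001",
    "user_name": ["cancy"], "email": ["cancy@163.com"], "phone": ["55556666"],
    "cookie": ["asdfghj"], "ip": ["192.168.1.1"],
}
new_record = {"user_id": "001", "user_name": "judy", "email": "judy@qq.com",
              "phone": "55556666", "cookie": "zxcvbnj", "ip": "192.168.1.1"}

# The user IDs are equal, so the record belongs to identity ID 10.
if new_record["user_id"] == identity_10["user_id"]:
    merge_into(identity_10, new_record)
```

Under these assumptions identity ID 10 ends up with two user names, two emails, and two cookies, while the telephone number and computer IP, which were already present, are not duplicated.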
If the user ID is not found, compare against the identity IDs in the store that have no user ID. The fields compared are email, telephone number, and cookie; if any one of them is identical, the two are judged to be the same person. The corresponding identity ID in the store is assigned to the user of the website behavior record, and the other identity information in the record is added to that identity ID in the store.
For example, the identity ID relation store holds the following identity ID record without a user ID:
Identity ID  Email        Phone     Cookie  Computer IP
50           123@163.com  33333333  AAAA    1.1.1.1
From the records of the three sources "login", "login and send inquiry", and "registration information" produced on the new day, the identity information extracted from one record is:
User ID  User name  Email        Phone     Cookie  Computer IP
105      coco       123@163.com  33333333  BBBB    2.2.1.1
After comparison, the two emails are identical; after merging and deduplication, the identity ID information is as follows (the result table is not reproduced in this text):
Finally, if the comparison finds no identical identity information, a new identity ID is generated and added to the identity ID relation store.
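The three-way decision of this daily-update sub-step (same user ID, else an identity without a user ID that shares an email, telephone number, or cookie, else a new identity) can be sketched as follows. The set-per-field layout, the `daily_update` helper, and the starting value for new identity IDs are hypothetical; recording the user ID on the matched identity is an interpretation of "assigning the identity ID to the user of the record".

```python
from itertools import count

MATCH_FIELDS = ("email", "phone", "cookie")
ALL_FIELDS = ("user_name", "email", "phone", "cookie", "ip")

def merge_values(identity, record):
    """Add the record's values to the identity; sets drop duplicates automatically."""
    for field in ALL_FIELDS:
        value = record.get(field)
        if value:
            identity.setdefault(field, set()).add(value)

def daily_update(record, store, new_ids):
    """Resolve one logged-in or registration record, which always carries a user ID."""
    # 1) same user ID: same identity
    for identity in store:
        if identity.get("user_id") == record["user_id"]:
            merge_values(identity, record)
            return identity
    # 2) otherwise try identities that have no user ID yet
    for identity in store:
        if identity.get("user_id") is None and any(
            record.get(f) and record[f] in identity.get(f, set()) for f in MATCH_FIELDS
        ):
            identity["user_id"] = record["user_id"]
            merge_values(identity, record)
            return identity
    # 3) nothing matched: create a new identity
    identity = {"identity_id": next(new_ids), "user_id": record["user_id"]}
    merge_values(identity, record)
    store.append(identity)
    return identity

store = [{"identity_id": 50, "user_id": None, "email": {"123@163.com"},
          "phone": {"33333333"}, "cookie": {"AAAA"}, "ip": {"1.1.1.1"}}]
record = {"user_id": "105", "user_name": "coco", "email": "123@163.com",
          "phone": "33333333", "cookie": "BBBB", "ip": "2.2.1.1"}
matched = daily_update(record, store, count(51))
```

With the values from the example tables, the record for user ID 105 matches identity ID 50 by email, so no new identity is created and identity 50 gains the user ID, the user name, the second cookie, and the second computer IP.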
2. Extract the user identity information from the "send inquiry without login" website behavior records produced on the new day, compare it with the information in the identity ID relation store, and add the identity information from the records to the matching identity IDs in the store.
First compare by email. If the email in the identity information of a "send inquiry without login" record equals an email in the identity ID relation store, merge the record's identity information into the corresponding identity ID in the store. During the merge, the other fields (telephone number, cookie, computer IP) are first compared with the values already held by that identity ID: values that are already present leave the store unchanged, and values that differ are added to the corresponding identity ID.
If no email matches, compare the telephone number in the record with the telephone numbers in the identity ID relation store. If one matches, merge the record's identity information into the corresponding identity ID in the same way: the other fields (email, cookie, computer IP) are first compared with the values already held, values already present leave the store unchanged, and values that differ are added. If the record's telephone number matches the telephone numbers of several identity IDs, merge the record into the identity ID with the most recent behavior.
If neither email nor telephone number matches, compare the cookie in the record with the cookies in the identity ID relation store. If one matches, merge the record's identity information into the corresponding identity ID in the same way: the other fields (email, telephone number, computer IP) are first compared with the values already held, values already present leave the store unchanged, and values that differ are added. If the record's cookie matches the cookies of several identity IDs, merge the record into the identity ID with the most recent behavior.
If none of the above matches, take the remaining website behavior records that have not been merged into any identity ID, extract the identity information they contain (email, telephone number, cookie, computer IP), and compare these values across the records. Records that share any identical identity information are judged to belong to the same user; they are given the same new identity ID, which is added to the identity ID relation store.
Each record still remaining after that is given its own new identity ID and added to the identity ID relation store.
3. Extract the user identity information from the "access" and "search" website behavior records produced on the new day, compare it with the information in the identity ID relation store, and add the identity information from the records to the matching identity IDs in the store.
Compare the cookie in the identity information of an "access" or "search" record with the cookies in the identity ID relation store. If one matches, merge the record's identity information into the corresponding identity ID in the store. During the merge, the other fields (email, telephone number, computer IP) are first compared with the values already held by that identity ID: values that are already present leave the store unchanged, and values that differ are added to the corresponding identity ID. If the record's cookie matches the cookies of several identity IDs, merge the record into the identity ID with the most recent behavior.
If no cookie matches, take the remaining "access" and "search" records that have not been merged into the identity ID relation store, extract the identity information they contain (email, telephone number, cookie, computer IP), and compare these values across the records. Records that share any identical identity information are judged to belong to the same user; they are given the same new identity ID, which is added to the identity ID relation store.
Each record still remaining after that is given its own new identity ID and added to the identity ID relation store.
Step 5: After the identity IDs and the associated identity feature relations have been generated, the identity feature relations are applied to subsequent user behavior. Based on the identity association and recognition results, each historical behavior record of a user is given an identity ID; that is, every user has a unique identity ID on the website, which can be used for applications such as user behavior analysis.
The present invention also discloses a user identity recognition system, comprising:
a data acquisition and storage module, a data cleaning/conversion/integration module, an identity recognition processing module, an identity update and maintenance module, and an identity information application module.
The data acquisition and storage module extracts, from the data source systems of the website platform, the log data that records users' various behaviors, including access, search, inquiry, login, and registration; it also extracts user basic information, including user name, region, phone, etc., and stores the data on a back-end server.
The data cleaning/conversion/integration module reads the log data in the data storage module, parses the log records, forms intermediate-layer data on users' various behaviors, fills in the basic information from user registration, and stores the result on the back-end server.
The identity recognition processing module assigns each user an identity ID and establishes the association relations between the identity ID and the user ID, user name, email, telephone number, cookie, computer IP, etc.
The identity update and maintenance module merges, corrects, supplements, and maintains the identity information contained in newly generated user behavior, forms new identity IDs and the corresponding identity information, and supplements and updates the identity ID relation store.
The identity information application module applies the identities in the identity ID relation store to user behavior on the website platform, in order to identify users and to track and analyze their behavior.
The invention has the following advantages:
The present invention provides a user identity recognition method and system. It takes the basic information created at user registration (user ID, user name, email, phone, computer IP, etc.), extracts website user behavior data, and combines the user ID, user name, email, telephone number, cookie, computer IP, and other information found in that behavior data. From both sources it builds association relations between pieces of user information and assigns each user a unique identity ID. This allows unified identity recognition of users on current B2B websites, establishes identity feature relations, distinguishes new users from returning users, and tracks user behavior effectively, so that a range of user-facing applications can be built and the user experience improved.
Brief description of the drawings
Fig. 1 is a flow chart of the user identity recognition method of an embodiment of the present invention.
Fig. 2 is a schematic diagram of how the identity ID relation store of the present invention is formed.
Fig. 3 is a structural diagram of the user identity recognition system of an embodiment of the present invention.
Fig. 4 is a schematic diagram of the user ID association relation of the present invention.
Detailed description of the invention
For making the purpose of embodiments of the invention, technical scheme and advantage clearer, the user's body to the present invention below Some terms related in part identification system do simplicity of explanation.
Identity ID: unique mark of user on website.As long as access website, no matter whether this user is registered as member, Will be by the unique mark of identification distribution.
User identity characteristic relation: the ID stayed with website interbehavior according to user, user name, Email, electricity Words, Cookie, the relation between the user identity feature that multiple Q-characters such as Computer IP build, and realize feature with this and chase after Track.
Cookie race, Computer IP race, Email race, telephone number race: the tool of same user-dependent same Q-character The relation of body multiple value composition.Such as certain user has reset after using certain Cookie and has generated new Cookie after system, then system The two Cookie can be treated as the Cookie race of this user.
The corresponding multiple user names of one ID: supplier registers on B2B websites, release product and carrying out with buyer When linking up mutual, it can arrange a primary name in an account book and multiple child user name, the management of product power that the distribution of primary user's name is different Limit and other information management authorities, to child user name, carry out information management respectively, and in this case, primary user and many height are used Family shares an ID.
With reference to Fig. 1, the flow of the recognition method of the embodiment of the present invention specifically comprises the following steps:
Step 11: collect relevant data from the data source systems of the website platform, where the data source systems include the web log information related to the website business, the user basic information stored in the background server, and so on. Data are extracted from each system and stored.
Step 12: classify the collected data to form intermediate-layer data on user behavior records and user basic information, and store it in the background server. In an embodiment of the present invention, analysis of historical data determined that the relevant behaviors are user registration, login, inquiry, access, and search. The user identity information they contain, namely user ID, user name, Email, phone, Cookie, and computer IP, serves as the information for identity recognition. The information is not limited to these items: any other indicator that reflects identity features can also serve as indicator information for recognition.
Step 13: based on the user identity information contained in the user's behavior records and the user basic information in the registration information, form the user identity relation and assign a unique identity ID. Specifically, association relations are established between the identity information contained in behaviors such as login, inquiry, access, and search (the user ID, user name, Email, phone, Cookie, and computer IP in the behavior records) and the user basic information in the registration information (user ID, user name, Email, phone, computer IP, and so on), so that these identities are finally unified under one identity ID.
In an embodiment of the present invention, many users of a B2B website are anonymous, and one user ID may have multiple user names, Emails, phones, Cookies, computer IPs, and so on. A unique user identity identifier therefore needs to be defined.
Taking the Made-in-China website as an example, the detailed relation structure is shown in Fig. 2:
(1) First, for the records of the two website behaviors "login" and "send inquiry while logged in", together with the user basic information in the "registration information", the identity information is associated and merged: all user names, Emails, telephone numbers, Cookies, and computer IPs corresponding to the same user ID are found, and different user IDs are given different identity IDs. One user ID may correspond to multiple user names, multiple Emails, multiple telephone numbers, multiple Cookies, and multiple computer IPs. This forms identity ID relation library 1.
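Item (1) amounts to grouping logged-in and registration records by user ID and giving each group its own identity ID. The sketch below illustrates this under assumptions: the dictionary layout, the function name `build_library_1`, and the sequential numbering of identity IDs are hypothetical, and the sample values are for illustration only.

```python
from collections import defaultdict
from itertools import count

FIELDS = ("user_name", "email", "phone", "cookie", "ip")

def build_library_1(records):
    """Group records by user ID and assign one identity ID per user ID."""
    grouped = defaultdict(lambda: {f: set() for f in FIELDS})
    for rec in records:
        for f in FIELDS:
            if rec.get(f):
                grouped[rec["user_id"]][f].add(rec[f])  # sets drop repeated values
    next_id = count(1)  # assumption: identity IDs are simply numbered from 1
    return {next(next_id): {"user_id": uid, **feats} for uid, feats in grouped.items()}

records = [
    {"user_id": "001", "user_name": "cancy", "email": "cancy@163.com",
     "phone": "55556666", "cookie": "asdfghj", "ip": "192.168.1.1"},
    {"user_id": "001", "user_name": "judy", "email": "judy@qq.com",
     "phone": "55556666", "cookie": "zxcvbnj", "ip": "192.168.1.1"},
    {"user_id": "105", "user_name": "coco", "email": "123@163.com",
     "phone": "33333333", "cookie": "BBBB", "ip": "2.2.1.1"},
]
library_1 = build_library_1(records)
for identity_id, identity in library_1.items():
    print(identity_id, identity)
```

Here the two records with user ID 001 end up under one identity ID with two user names, two Emails, and two Cookies, while user ID 105 receives a different identity ID.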
(2) Extract the user identity information of the "send inquiry while not logged in" website behavior records, compare it with identity ID relation library 1, and merge and update identities.
Email is compared first. The Email contained in the user identity information of a "send inquiry while not logged in" website behavior record is compared with the Emails in the identity ID relation library. If they are identical, the user identity information of the record is merged into the corresponding identity ID in the library. During the merge, the other information (telephone number, Cookie, computer IP) is first compared with the corresponding information of that identity ID: if the two are identical, the identity ID information in the library is not updated; if they differ, the corresponding user identity information of the record is added to that identity ID in the library.
If the Email differs, the telephone number contained in the user identity information of the "send inquiry while not logged in" website behavior record is compared with the telephone numbers in the identity ID relation library. If they are identical, the user identity information of the record is merged into the corresponding identity ID in the library. During the merge, the other information (Email, Cookie, computer IP) is first compared with the corresponding information of that identity ID: if the two are identical, the identity ID information in the library is not updated; if they differ, the corresponding user identity information of the record is added to that identity ID in the library. If the telephone number in the website behavior record is identical to the telephone numbers of multiple identity IDs, the user identity information of the record is merged into the identity ID in the library whose behavior occurred most recently.
If both the Email and the telephone number differ, the Cookie contained in the user identity information of the "send inquiry while not logged in" website behavior record is compared with the Cookies in the identity ID relation library. If they are identical, the user identity information of the record is merged into the corresponding identity ID in the library. During the merge, the other information (Email, telephone number, computer IP) is first compared with the corresponding information of that identity ID: if the two are identical, the identity ID information in the library is not updated; if they differ, the corresponding user identity information of the record is added to that identity ID in the library. If the Cookie in the website behavior record is identical to the Cookies of multiple identity IDs, the user identity information of the record is merged into the identity ID in the library whose behavior occurred most recently.
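The comparison order described above (Email first, then telephone number, then Cookie, with the most recently active identity ID winning a tie) can be sketched as follows. This is an illustration under assumptions, not the patented implementation: the `last_seen` timestamp, the record field `time`, and the function names are hypothetical, and the tie-break is applied to Email as well even though the text states it only for telephone number and Cookie.

```python
PRIORITY = ("email", "phone", "cookie")      # comparison order stated in the text
FIELDS = ("email", "phone", "cookie", "ip")

def match_identity(record, library):
    """Return the identity ID hit by the first field in PRIORITY that matches, else None."""
    for f in PRIORITY:
        value = record.get(f)
        if not value:
            continue
        hits = [iid for iid, ident in library.items() if value in ident[f]]
        if hits:
            # Several identity IDs share the value: pick the most recently active one.
            return max(hits, key=lambda iid: library[iid]["last_seen"])
    return None

def merge_record(record, library):
    """Merge one not-logged-in inquiry record into the library; return the identity ID or None."""
    iid = match_identity(record, library)
    if iid is not None:
        for f in FIELDS:
            if record.get(f):
                library[iid][f].add(record[f])   # identical values leave the set unchanged
        # Assumption: the merge also refreshes the identity's last activity time.
        library[iid]["last_seen"] = max(library[iid]["last_seen"], record["time"])
    return iid

library = {
    1: {"last_seen": 5, "email": {"a@x.com"}, "phone": {"111"}, "cookie": {"c1"}, "ip": {"1.1.1.1"}},
    2: {"last_seen": 9, "email": {"b@x.com"}, "phone": {"111"}, "cookie": {"c2"}, "ip": {"2.2.2.2"}},
}
rec = {"email": "new@x.com", "phone": "111", "cookie": "c9", "ip": "3.3.3.3", "time": 10}
matched = merge_record(rec, library)
print(matched, library[matched])
```

In this example the Email matches nothing, the telephone number matches both identity IDs, and the record is merged into identity ID 2 because its behavior is the more recent. A record for which `merge_record` returns `None` is left for the next stage.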
If none of the above is identical, then for the remaining website behavior records that have not yet been merged into any identity ID, extract the user identity information they contain (Email, telephone number, Cookie, computer IP) and compare this information across the different records. Whenever website behavior records share identical identity information, they are determined to belong to the same user, are given the same new identity ID, and that new identity ID is added to the identity ID relation library.
The user identity information of each website behavior record that still remains is given a new identity ID.
All the new identity IDs form identity ID relation library 2.
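Grouping the leftover records so that records sharing any identity value receive the same new identity ID can be done with a union-find structure. The sketch below is one possible approach, not necessarily the one used in the patent: it treats matches transitively, and by default it compares only Email, telephone number, and Cookie (the fields named for information association processor 2 in the system description), which is an assumption. The function name and the starting identity ID are hypothetical.

```python
from itertools import count

def group_leftovers(records, first_new_id, keys=("email", "phone", "cookie")):
    """Return one new identity ID per record; records sharing a value share an ID."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving keeps the trees shallow
            i = parent[i]
        return i

    first_seen = {}   # (field, value) -> index of the first record carrying it
    for i, rec in enumerate(records):
        for k in keys:
            value = rec.get(k)
            if value:
                j = first_seen.setdefault((k, value), i)
                parent[find(i)] = find(j)   # union the two groups

    ids, next_id, assigned = {}, count(first_new_id), []
    for i in range(len(records)):
        root = find(i)
        if root not in ids:
            ids[root] = next(next_id)       # singletons also receive a new identity ID
        assigned.append(ids[root])
    return assigned

leftovers = [
    {"email": "a@x.com", "cookie": "c1"},
    {"phone": "111", "cookie": "c1"},
    {"phone": "111"},
    {"email": "z@x.com"},
]
new_ids = group_leftovers(leftovers, first_new_id=100)
print(new_ids)
```

The first three records are linked through a shared Cookie and a shared telephone number and so share one identity ID; the last record matches nothing and receives its own.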
(3) Extract the user identity information of the "access" and "search" website behavior records, compare it with identity ID relation library 1 and identity ID relation library 2, and merge and update identities.
The Cookie contained in the user identity information of an "access" or "search" website behavior record is compared with the Cookies in identity ID relation library 1 and identity ID relation library 2. If they are identical, the user identity information of the record is merged into the corresponding identity ID in the library. During the merge, the other information (Email, telephone number, computer IP) is first compared with the corresponding information of that identity ID: if the two are identical, the identity ID information in the library is not updated; if they differ, the corresponding user identity information of the record is added to that identity ID in the library. If the Cookie in the website behavior record is identical to the Cookies of multiple identity IDs, the user identity information of the record is merged into the identity ID in the library whose behavior occurred most recently.
If the Cookie differs, then for the remaining "access" and "search" website behavior records that have not yet been merged into the identity ID relation library, extract the user identity information they contain (Email, telephone number, Cookie, computer IP) and compare this information across the different records. Whenever website behavior records share identical identity information, they are determined to belong to the same user, are given the same new identity ID, and that new identity ID is added to the identity ID relation library.
The user identity information of each website behavior record that still remains is given a new identity ID.
All the new identity IDs form identity ID relation library 3.
Finally, the association relations among user ID, user name, Email, telephone number, Cookie, and computer IP are established, and identity ID relation library 1, identity ID relation library 2, and identity ID relation library 3 are merged to form the identity ID relation library.
Step 14: according to the identity information in the user behaviors newly generated each day, update and maintain the identity IDs and identity relations in the identity ID relation library already formed from historical data.
Taking the Made-in-China website as an example, the detailed steps are as follows:
1. For the records of the three kinds "login", "send inquiry while logged in", and "registration information" produced on the new day, extract the user identity information they contain, compare it with the information in the identity ID relation library, and supplement the identity information in the website behavior records into the identity IDs in the library.
First compare with the identity ID information in the identity ID relation library that has a user ID. If the user ID is identical, the other data of the user identity information in the website behavior record are merged with the corresponding information of the matched identity ID, deduplicated, and added to the identity ID relation library.
For example, the identity ID relation library contains the following identity ID record:
Identity ID | User ID | User name | Email | Phone | Cookie | Computer IP
10 | 001 | cancy | cancy@163.com | 55556666 | asdfghj | 192.168.1.1
From the records of the three kinds "login", "send inquiry while logged in", and "registration information" produced on the new day, the identity information extracted from one record is:
User ID | User name | Email | Phone | Cookie | Computer IP
001 | judy | judy@qq.com | 55556666 | zxcvbnj | 192.168.1.1
After matching, the information is merged and deduplicated to give the updated identity feature relation for identity ID 10.
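The merged table is not reproduced in this text. Under the merge-and-deduplicate rule just described, the following sketch computes what the merged record would contain for the two example rows above; the function name `merge_dedup` and the dictionary layout are hypothetical.

```python
identity = {
    "identity_id": 10,
    "user_id": "001",
    "user_name": ["cancy"],
    "email": ["cancy@163.com"],
    "phone": ["55556666"],
    "cookie": ["asdfghj"],
    "ip": ["192.168.1.1"],
}
new_record = {
    "user_id": "001", "user_name": "judy", "email": "judy@qq.com",
    "phone": "55556666", "cookie": "zxcvbnj", "ip": "192.168.1.1",
}

def merge_dedup(identity, record):
    """Append each new value to its family unless the value is already present."""
    for field in ("user_name", "email", "phone", "cookie", "ip"):
        value = record.get(field)
        if value and value not in identity[field]:
            identity[field].append(value)
    return identity

# The user IDs are identical, so the new record is merged into identity ID 10.
if new_record["user_id"] == identity["user_id"]:
    merge_dedup(identity, new_record)

for key, value in identity.items():
    print(key, value)
```

The user name, Email, and Cookie families each gain a second value, while the telephone number and computer IP, which are identical in both rows, are kept once.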
If the user ID differs, compare with the identity ID information in the identity ID relation library that has no user ID. The comparison covers the Email, telephone number, and Cookie of the two. If any one of them is identical, the two are determined to belong to the same person: the corresponding identity ID in the library is assigned to the user of the website behavior record, and the other identity information in the record is added to that identity ID in the library.
For example, the identity ID relation library contains the following identity ID information without a user ID:
Identity ID | Email | Phone | Cookie | Computer IP
50 | 123@163.com | 33333333 | AAAA | 1.1.1.1
From the records of the three kinds "login", "send inquiry while logged in", and "registration information" produced on the new day, the identity information extracted from one record is:
User ID | User name | Email | Phone | Cookie | Computer IP
105 | coco | 123@163.com | 33333333 | BBBB | 2.2.1.1
The comparison shows that the Email of the two is identical, so the information is merged and deduplicated into identity ID 50.
Finally, if the comparison finds no identical identity information, a new identity ID is generated and added to the identity ID relation library.
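The three cases for a new logged-in record (same user ID, match against an identity ID without a user ID, or no match at all) can be sketched as one function. This is a hedged illustration: the function name `update_logged_in`, the dictionary layout, and the way new identity IDs are generated are hypothetical; the example data are the identity ID 50 and user ID 105 rows from the text.

```python
from itertools import count

FAMILIES = ("user_name", "email", "phone", "cookie", "ip")
MATCH_KEYS = ("email", "phone", "cookie")   # comparison scope named in the text

def update_logged_in(record, library, id_gen):
    """Apply one new logged-in record to `library` and return its identity ID."""
    # Case 1: an identity ID with the same user ID already exists.
    target = next((i for i, ident in library.items()
                   if ident["user_id"] == record["user_id"]), None)
    # Case 2: otherwise compare with identity IDs that have no user ID.
    if target is None:
        target = next((i for i, ident in library.items()
                       if ident["user_id"] is None
                       and any(record.get(k) in ident[k] for k in MATCH_KEYS)), None)
    # Case 3: nothing matches, so a new identity ID is generated.
    if target is None:
        target = next(id_gen)
        library[target] = {"user_id": None, **{f: set() for f in FAMILIES}}
    ident = library[target]
    ident["user_id"] = record["user_id"]
    for f in FAMILIES:
        if record.get(f):
            ident[f].add(record[f])   # sets deduplicate identical values
    return target

library = {
    50: {"user_id": None, "user_name": set(), "email": {"123@163.com"},
         "phone": {"33333333"}, "cookie": {"AAAA"}, "ip": {"1.1.1.1"}},
}
ids = count(51)   # assumption: new identity IDs continue from 51
record = {"user_id": "105", "user_name": "coco", "email": "123@163.com",
          "phone": "33333333", "cookie": "BBBB", "ip": "2.2.1.1"}
assigned = update_logged_in(record, library, ids)
print(assigned, library[assigned])
```

In this run the record matches identity ID 50 through its Email, so identity ID 50 gains user ID 105, the user name coco, a second Cookie, and a second computer IP.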
2. Extract the user identity information in the "send inquiry while not logged in" website behavior records produced on the new day, compare it with the information in the identity ID relation library, and supplement the identity information in the website behavior records into the identity IDs in the library.
Email is compared first. The Email contained in the user identity information of a "send inquiry while not logged in" website behavior record is compared with the Emails in the identity ID relation library. If they are identical, the user identity information of the record is merged into the corresponding identity ID in the library. During the merge, the other information (telephone number, Cookie, computer IP) is first compared with the corresponding information of that identity ID: if the two are identical, the identity ID information in the library is not updated; if they differ, the corresponding user identity information of the record is added to that identity ID in the library.
If the Email differs, the telephone number contained in the user identity information of the "send inquiry while not logged in" website behavior record is compared with the telephone numbers in the identity ID relation library. If they are identical, the user identity information of the record is merged into the corresponding identity ID in the library. During the merge, the other information (Email, Cookie, computer IP) is first compared with the corresponding information of that identity ID: if the two are identical, the identity ID information in the library is not updated; if they differ, the corresponding user identity information of the record is added to that identity ID in the library. If the telephone number in the website behavior record is identical to the telephone numbers of multiple identity IDs, the user identity information of the record is merged into the identity ID in the library whose behavior occurred most recently.
If both the Email and the telephone number differ, the Cookie contained in the user identity information of the "send inquiry while not logged in" website behavior record is compared with the Cookies in the identity ID relation library. If they are identical, the user identity information of the record is merged into the corresponding identity ID in the library. During the merge, the other information (Email, telephone number, computer IP) is first compared with the corresponding information of that identity ID: if the two are identical, the identity ID information in the library is not updated; if they differ, the corresponding user identity information of the record is added to that identity ID in the library. If the Cookie in the website behavior record is identical to the Cookies of multiple identity IDs, the user identity information of the record is merged into the identity ID in the library whose behavior occurred most recently.
If none of the above is identical, then for the remaining website behavior records that have not yet been merged into any identity ID, extract the user identity information they contain (Email, telephone number, Cookie, computer IP) and compare this information across the different records. Whenever website behavior records share identical identity information, they are determined to belong to the same user, are given the same new identity ID, and that new identity ID is added to the identity ID relation library.
The user identity information of each website behavior record that still remains is given a new identity ID and added to the identity ID relation library.
3. Extract the user identity information in the "access" and "search" website behavior records produced on the new day, compare it with the information in the identity ID relation library, and supplement the identity information in the website behavior records into the identity IDs in the library.
The Cookie contained in the user identity information of an "access" or "search" website behavior record is compared with the Cookies in the identity ID relation library. If they are identical, the user identity information of the record is merged into the corresponding identity ID in the library. During the merge, the other information (Email, telephone number, computer IP) is first compared with the corresponding information of that identity ID: if the two are identical, the identity ID information in the library is not updated; if they differ, the corresponding user identity information of the record is added to that identity ID in the library. If the Cookie in the website behavior record is identical to the Cookies of multiple identity IDs, the user identity information of the record is merged into the identity ID in the library whose behavior occurred most recently.
If the Cookie differs, then for the remaining "access" and "search" website behavior records that have not yet been merged into the identity ID relation library, extract the user identity information they contain (Email, telephone number, Cookie, computer IP) and compare this information across the different records. Whenever website behavior records share identical identity information, they are determined to belong to the same user, are given the same new identity ID, and that new identity ID is added to the identity ID relation library.
The user identity information of each website behavior record that still remains is given a new identity ID and added to the identity ID relation library.
Step 15: write the currently updated identity IDs and identity information back to the user behaviors, assign an identity ID to each user behavior record, and complete the adaptive process.
Step 16: apply the final identity IDs and the corresponding information relations to web analytics such as user behavior tracking and analysis.
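Steps 15 and 16 can be pictured as tagging every behavior record with an identity ID and then grouping behaviors by that ID for tracking. The sketch below is a minimal illustration under assumptions: the function name `tag_behaviors`, the lookup order of the fields, and the record layout are hypothetical.

```python
MATCH_FIELDS = ("user_id", "email", "phone", "cookie")

def tag_behaviors(behaviors, library):
    """Attach an identity ID (or None) to each behavior record."""
    # Build a lookup from (field, value) to identity ID.
    index = {}
    for identity_id, families in library.items():
        for field in MATCH_FIELDS:
            for value in families.get(field, ()):
                index[(field, value)] = identity_id
    tagged = []
    for behavior in behaviors:
        identity_id = None
        for field in MATCH_FIELDS:
            key = (field, behavior.get(field))
            if key in index:
                identity_id = index[key]
                break
        tagged.append({**behavior, "identity_id": identity_id})
    return tagged

library = {
    10: {"user_id": {"001"}, "email": {"cancy@163.com"},
         "phone": {"55556666"}, "cookie": {"asdfghj", "zxcvbnj"}},
}
behaviors = [
    {"action": "search", "cookie": "zxcvbnj"},
    {"action": "login", "user_id": "001"},
    {"action": "access", "cookie": "unknown"},
]
tagged = tag_behaviors(behaviors, library)

# Group actions by identity ID so that one user's behaviors can be tracked together.
by_identity = {}
for row in tagged:
    by_identity.setdefault(row["identity_id"], []).append(row["action"])
print(by_identity)
```

The anonymous search and the logged-in login are attributed to the same identity ID, while the access with an unknown Cookie stays unassigned until the library is updated.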
With reference to Fig. 3, the structure of the identity recognition system of the embodiment of the present invention comprises:
a data information collection and storage module, a data sorting/conversion/integration module, an identity recognition processing module, an identity update and maintenance module, and an identity information application module.
The data information collection and storage module extracts, from the data source systems of the website platform, the web log data recording user behavior and the basic information data of user registration, and stores them in the background server.
The data information collection and storage module includes a log system, a background database system, and a data storage unit. The log system extracts and stores, from the website, information on users' interactions with the website, recording each kind of user behavior on the website, including login, inquiry, registration, access, and search. The background database system stores the basic information of back-office operation, including the basic information of user registration. The data storage unit extracts data from the log system and from the background database system according to the data extraction rules of the data warehouse and stores it for further processing by the data sorting/conversion/integration module.
The data sorting/conversion/integration module reads the various kinds of log data in the data storage module, classifies the collected data into intermediate-layer data on user behavior and user basic information, and stores it in the data warehouse.
The data sorting/conversion/integration module includes an ETL submodule and a data warehouse submodule. The ETL submodule reads the various data in the data storage unit, performs further information identification, cleaning, processing, and sorting, and outputs the result to the data warehouse submodule. The data warehouse submodule classifies and summarizes the information into intermediate-layer data and stores it in the data warehouse; the stored information is mainly divided into user behavior information, user basic information, and so on. In the embodiment of the present invention, the user identity ID information finally generated by recognition is also stored in the data warehouse submodule.
The identity recognition processing module summarizes and compares the identity information in the user behavior records and the user basic information, finally assigns each user an identity ID, and establishes the association relations between the identity ID and the user ID, user name, Email, telephone number, Cookie, computer IP, and so on, finally obtaining the user identity relation. It includes an identity information knowledge unit, information association processor 1, identity feature information association submodule 1, information decision processor 1, information association processor 2, identity feature information association submodule 2, information decision processor 2, information association processor 3, and an identity feature information association module.
The identity information knowledge unit extracts identity feature information from the behavior records of user login, inquiry, access, search, and so on in the data warehouse submodule and from the basic information of user registration, including user ID, user name, Email, telephone number, Cookie, and computer IP information records. It saves and collects this information and removes records that repeat completely.
Information association processor 1 takes the records of the two website behaviors "login" and "send inquiry while logged in", together with the user basic information in the "registration information", performs identity association and merging, and merges all the user names, Emails, telephone numbers, Cookies, and computer IPs corresponding to the same user ID;
Identity feature information association submodule 1 stores the correspondence among user ID, user name, Email, telephone number, Cookie, and computer IP after merging and deduplication by information association processor 1, and gives different identity IDs to different user IDs, forming identity ID information records;
Information decision processor 1 compares the identity information in the "send inquiry while not logged in" behavior records in the identity information knowledge unit with the identity ID information records produced in identity feature information association submodule 1. If the identity information is identical, the two are considered the same person and the new identity information is merged into identity feature information association submodule 1; if it is not identical, the record enters information association processor 2;
Information association processor 2 processes the Email, telephone number, Cookie, and computer IP information not yet merged into an identity ID by information decision processor 1. Records for which any one of Email, telephone number, or Cookie is identical are considered the same person and are given the same identity ID;
Identity feature information association submodule 2 stores the association relations between identity IDs and the Emails, telephone numbers, Cookies, and computer IPs associated and merged by information association processor 2, and also merges in the association relations stored in identity feature information association submodule 1 between identity IDs and user ID, user name, Email, telephone number, Cookie, and computer IP;
Information decision processor 2 compares the user identity information in the "access" and "search" behaviors in the identity information knowledge unit with the identity ID information records produced in identity feature information association submodule 2. If the identity information is identical, the two are considered the same person and the new identity information is merged into identity feature information association submodule 2; if the comparison result is not identical, the record enters information association processor 3;
Information association processor 3 processes the website behavior records not yet merged into an identity ID in information decision processor 1 and compares the Cookie and computer IP information among them. Records whose Cookie is identical are considered the same person and are given the same identity ID;
The identity feature information association module stores the identity ID information records formed after information association processor 3 associates and merges Cookies with identity IDs, and also merges in the identity ID information records stored in identity feature information association submodule 2.
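The identity information knowledge unit described above removes records that repeat completely before any association is attempted. A minimal sketch of that deduplication follows; the field names and the function name `drop_exact_duplicates` are hypothetical, and "repeat completely" is interpreted here as all six identity fields being equal.

```python
FIELDS = ("user_id", "user_name", "email", "phone", "cookie", "ip")

def drop_exact_duplicates(records):
    """Keep the first occurrence of each record whose identity fields all coincide."""
    seen, unique = set(), []
    for rec in records:
        key = tuple(rec.get(f) for f in FIELDS)   # missing fields compare as None
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

raw = [
    {"user_id": "001", "email": "cancy@163.com", "cookie": "asdfghj"},
    {"user_id": "001", "email": "cancy@163.com", "cookie": "asdfghj"},   # exact repeat
    {"user_id": "001", "email": "cancy@163.com", "cookie": "zxcvbnj"},   # new Cookie, kept
]
unique = drop_exact_duplicates(raw)
print(len(unique))
```

Records that differ in even one field are kept, since the later association steps rely on those differences to build the Cookie, Email, telephone number, and computer IP families.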
The identity update and maintenance module updates the user identity relation information in the identity recognition processing module. Based on a specific update algorithm and in incremental update mode, each item of newly generated identity feature information data included in the model is compared with the existing identity feature relations and identity IDs, and is updated and maintained to form a new identity ID relation library.
The identity update and maintenance module includes a new identity information knowledge unit, information decision processor 3, information association processor 4, an identity update processor, and an identity ID feature relation result unit.
The new identity information knowledge unit stores the user identity information in the newly generated behaviors of website users, updated daily, and the basic information of newly registered users, and deduplicates them;
Information decision processor 3 compares the identity information in the behavior records in the new identity information knowledge unit with the identity ID information in the identity feature information association module; if they are identical, the record enters the identity update processor;
The identity update processor merges and deduplicates the user identity information in the new behaviors with the identity ID information in the identity feature information association module, updating the identity feature relation of the existing identity ID;
Information association processor 4 processes the remaining website behavior records that have not yet been merged into the identity ID relation library, performs information association among their user identity information, and forms new identity ID information records;
The identity ID feature relation result unit stores the updated identity ID information records and continues to update them daily.
The identity information application module applies the identity ID information, which is formed and then continuously and adaptively updated, to user behaviors: it establishes identity relations between a user's historical and current behaviors and identifies which behaviors were performed by the same user, so that user behavior can be tracked and analyzed.
What is disclosed above is only one specific embodiment of the present invention and certainly cannot limit the scope of protection of the present invention. Changes or equivalent variations made according to the technical spirit of the claims of the present invention still fall within the scope covered by the claims of the present invention.

Claims (5)

1. An identity recognition method, comprising:
Step 1: collect basic data from the data source systems of an e-commerce website platform, classify the collected basic data to form two classes of data, and store them in the background server; the two classes of data include:
(1) user basic information formed by user registration, including user ID, user name, Email, phone, and computer IP;
(2) data on the website behaviors of user registration, login, inquiry, access, and search;
Step 2: based on the user's registration, login, inquiry, access, and search website behaviors, extract the website behavior records of the most recent time period, each kind of website behavior record containing identity information of the user, including user ID, user name, Email, telephone number, Cookie, and computer IP; combine the user basic information of user registration (user ID, user name, Email, telephone number, and computer IP information), collect this information together, and remove records that repeat completely;
Step 3: according to the relation between ID, user name, Email, telephone number, Cookie, Computer IP, by advance The corresponding method first set, carries out duplicate removal, identity normalization to subscriber identity information, finally give user identification relevancy relation with And the identity information of correspondence, and user is given unique identities ID;
Step 3-1, first to " login ", the record of both websites behavior of " log in send out an inquiry ", and in " log-on message " User basic information, carry out identity information association merge, find out all of user name corresponding to same ID, Email, Telephone number, Cookie, Computer IP;After association process, form identity ID relation storehouse;
Step 3-2, extraction " are not logged in sending out an inquiry " subscriber identity information of website behavior record, do ratio with identity ID relation storehouse Right, carry out identity merging and renewal;Particularly as follows:
First, compare by Email: the Email contained in the user identity information of a "send inquiry while not logged in" website behavior record is compared with the Emails in the identity ID relation storehouse. If they are identical, the user identity information of this website behavior record is merged into the corresponding identity ID in the identity ID relation storehouse. During the merge, the other information (telephone number, Cookie, computer IP) is first compared with the corresponding information of that identity ID: if the corresponding items are identical, the identity ID information in the identity ID relation storehouse is not updated; if they differ, the corresponding user identity information of this website behavior record is added to the corresponding identity ID in the identity ID relation storehouse;
If the Email differs, the telephone number contained in the user identity information of the "send inquiry while not logged in" website behavior record is compared with the telephone numbers in the identity ID relation storehouse. If they are identical, the user identity information of this website behavior record is merged into the corresponding identity ID in the identity ID relation storehouse. During the merge, the other information (Email, Cookie, computer IP) is first compared with the corresponding information of that identity ID: if the corresponding items are identical, the identity ID information in the identity ID relation storehouse is not updated; if they differ, the corresponding user identity information of this website behavior record is added to the corresponding identity ID in the identity ID relation storehouse. If the telephone number in the website behavior record is identical to the telephone numbers of multiple identity IDs, the user identity information in this website behavior record is merged into the identity ID in the identity ID relation storehouse whose behavior occurred most recently;
If both the Email and the telephone number differ, the Cookie contained in the user identity information of the "send inquiry while not logged in" website behavior record is compared with the Cookies in the identity ID relation storehouse. If they are identical, the user identity information of this website behavior record is merged into the corresponding identity ID in the identity ID relation storehouse. During the merge, the other information (Email, telephone number, computer IP) is first compared with the corresponding information of that identity ID: if the corresponding items are identical, the identity ID information in the identity ID relation storehouse is not updated; if they differ, the corresponding user identity information of this website behavior record is added to the corresponding identity ID in the identity ID relation storehouse. If the Cookie in the website behavior record is identical to the Cookies of multiple identity IDs, the user identity information in this website behavior record is merged into the identity ID in the identity ID relation storehouse whose behavior occurred most recently;
The user identity information of the remaining website behavior records is given a new identity ID and added to the identity ID relation storehouse;
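By way of illustration only, the cascaded comparison described above for "send inquiry while not logged in" records (Email first, then telephone number, then Cookie, with the most recently active identity ID chosen when several match) might be sketched in Python as follows. The `Identity` class, the record keys (`email`, `phone`, `cookie`, `ip`, `time`) and both function names are hypothetical and are not taken from the patent. Applying the most-recent rule to Email matches as well is an assumption, since the text states it only for telephone number and Cookie.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical pairs of (record key, Identity attribute); not from the patent.
MATCH_KEYS = (("email", "emails"), ("phone", "phones"), ("cookie", "cookies"))
ALL_KEYS = MATCH_KEYS + (("ip", "ips"),)


@dataclass
class Identity:
    """One entry of the identity ID relation storehouse (illustrative layout)."""
    identity_id: int
    emails: set = field(default_factory=set)
    phones: set = field(default_factory=set)
    cookies: set = field(default_factory=set)
    ips: set = field(default_factory=set)
    last_seen: int = 0  # time of the most recent behavior, assumed numeric


def find_match(record: dict, store: list) -> Optional[Identity]:
    """Compare by Email, then telephone number, then Cookie, in that order."""
    for key, attr in MATCH_KEYS:
        value = record.get(key)
        if value is None:
            continue
        hits = [ident for ident in store if value in getattr(ident, attr)]
        if hits:
            # Several identity IDs match: take the most recently active one.
            return max(hits, key=lambda ident: ident.last_seen)
    return None


def merge_record(record: dict, store: list) -> Identity:
    """Merge a record into the matching identity, or give it a new identity ID."""
    target = find_match(record, store)
    if target is None:
        target = Identity(identity_id=len(store) + 1)
        store.append(target)
    for key, attr in ALL_KEYS:
        value = record.get(key)
        if value is not None:
            # Adding to a set leaves identical values unchanged and adds new ones.
            getattr(target, attr).add(value)
    target.last_seen = max(target.last_seen, record.get("time", 0))
    return target
```

Because each attribute is a set, the "do not update if identical, add if different" rule needs no separate comparison step in this sketch. The computer IP is stored but never used for matching, as in the step described above.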
Step 3-3: extract the user identity information of the "access" and "search" website behavior records, compare it with the identity ID relation storehouse, and carry out identity merging and updating; specifically:
The Cookie contained in the user identity information of an "access" or "search" website behavior record is compared with the Cookies in the identity ID relation storehouse. If they are identical, the user identity information of this website behavior record is merged into the corresponding identity ID in the identity ID relation storehouse. During the merge, the other information (Email, telephone number, computer IP) is first compared with the corresponding information of that identity ID: if the corresponding items are identical, the identity ID information in the identity ID relation storehouse is not updated; if they differ, the corresponding user identity information of this website behavior record is added to the corresponding identity ID in the identity ID relation storehouse. If the Cookie in the website behavior record is identical to the Cookies of multiple identity IDs, the user identity information in this website behavior record is merged into the identity ID in the identity ID relation storehouse whose behavior occurred most recently;
If the Cookie differs, then for the remaining "access" and "search" website behavior records that have still not been merged into the identity ID relation storehouse, the Email, telephone number, Cookie and computer IP of the user identity information they contain are extracted, and these identity items are compared across the different website behavior records. Whenever website behavior records share an identical identity item, they are determined to belong to the same user and are given the same new identity ID, and this new identity ID is added to the identity ID relation storehouse;
The user identity information of the website behavior records that finally remain is given a new identity ID and added to the identity ID relation storehouse;
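As a non-authoritative sketch, the grouping of leftover "access" and "search" records that share any identity item could be implemented with a union-find structure, as below. The function name `group_records` and the record keys are hypothetical. The sketch also assumes that matches are transitive (if record A shares an item with B, and B with C, all three receive the same new identity ID), which the text does not state explicitly.

```python
def group_records(records):
    """Group record indices so that records sharing any identity item end up together."""
    parent = list(range(len(records)))

    def find(x):
        # Follow parent links to the group representative, halving paths on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen = {}  # (item kind, value) -> index of the first record carrying it
    for idx, rec in enumerate(records):
        for key in ("email", "phone", "cookie", "ip"):
            value = rec.get(key)
            if value is None:
                continue
            owner = first_seen.setdefault((key, value), idx)
            union(owner, idx)

    groups = {}
    for idx in range(len(records)):
        groups.setdefault(find(idx), []).append(idx)
    return list(groups.values())
```

Each returned group would receive one new identity ID. A record that shares nothing with any other record forms a group of its own, which corresponds to the final "remaining records" case above.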
Step 4: update the identity ID relation storehouse periodically: for newly generated website user behavior, the user identity information involved and the basic information of newly registered users are compared and merged with the information in the identity ID relation storehouse, and the identity ID relation storehouse is supplemented and updated;
Step 5: after the relations between identity IDs and their associated identity characteristics have been generated, apply the identity characteristic relations to subsequent user behavior: the identity ID is obtained from the association and identification result of the identity information in each of the user's historical behavior records, so that each user ultimately has a unique identity ID on the website, which is used in user behavior analysis applications.
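One possible way to apply the finished identity relations to later behavior records, offered only as a sketch, is an inverted index from each identity item to its identity ID. The storehouse layout, the key names and the lookup order (`user_id`, then `email`, `phone`, `cookie`) are assumptions made for illustration and are not specified in the text.

```python
def build_index(store):
    """Map every (item kind, value) pair to the identity ID that owns it."""
    index = {}
    for identity_id, features in store.items():
        for kind, values in features.items():
            for value in values:
                index[(kind, value)] = identity_id
    return index


def resolve(record, index):
    """Return the identity ID for a behavior record, or None if nothing matches."""
    for kind in ("user_id", "email", "phone", "cookie"):  # assumed priority order
        value = record.get(kind)
        if value is not None and (kind, value) in index:
            return index[(kind, value)]
    return None
```

With such an index, each new behavior record can be tagged with its identity ID in a single dictionary lookup per identity item before being passed on to behavior analysis.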
2. The method according to claim 1, characterized in that the specific sub-steps of step 4 are:
Step 4-1: for the three kinds of website behavior records produced in the new time period, namely "login", "send inquiry while logged in" and "registration information", extract the user identity information they contain, compare it with the information in the identity ID relation storehouse, and supplement and update the identity information in the website behavior records into the identity IDs in the identity ID relation storehouse;
Step 4-2: extract the user identity information in the "send inquiry while not logged in" website behavior records produced in the new time period, compare it with the information in the identity ID relation storehouse, and supplement and update the identity information in the website behavior records into the identity IDs in the identity ID relation storehouse;
Step 4-3: extract the user identity information in the "access" and "search" website behavior records produced in the new time period, compare it with the information in the identity ID relation storehouse, and supplement and update the identity information in the website behavior records into the identity IDs in the identity ID relation storehouse.
3. The method according to claim 2, characterized in that:
Step 4-1 is specifically:
First, compare with the identity ID information in the identity ID relation storehouse that has a "user ID": if the user ID is identical, the other data of the user identity information in the website behavior record are merged and deduplicated with the corresponding information of the matched identity ID and supplemented into the identity ID relation storehouse;
If the user ID differs, compare with the identity ID information in the identity ID relation storehouse that has no "user ID"; the scope of the comparison is the Email, telephone number and Cookie of both sides. If any one of these items is identical, the two are determined to belong to the same person, the corresponding identity ID in the identity ID relation storehouse is given to the user of the website behavior record, and the other identity information in the website behavior record is added to that identity ID in the identity ID relation storehouse;
Finally, if the comparison finds no identical identity information, a new identity ID is generated and added to the identity ID relation storehouse;
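The three-way decision of step 4-1 can be illustrated, under stated assumptions, by the Python sketch below. The dictionary layout of the storehouse, the helper names and the record keys are hypothetical. Attaching the record's user ID to a previously anonymous identity on a match is also an assumption; the text says only that the existing identity ID is given to the user of the record.

```python
# Hypothetical pairs of (record key, storehouse field); not from the patent.
FIELDS = (("email", "emails"), ("phone", "phones"), ("cookie", "cookies"), ("ip", "ips"))


def new_identity(user_id=None):
    """Create an empty storehouse entry (illustrative layout)."""
    return {"user_id": user_id, "emails": set(), "phones": set(),
            "cookies": set(), "ips": set()}


def update_with_login_record(record, store):
    """Merge a logged-in behavior record into the storehouse; return its identity ID."""
    target = None
    # 1) Identities that already carry a user ID: match on the user ID.
    for identity_id, ident in store.items():
        if ident["user_id"] is not None and ident["user_id"] == record["user_id"]:
            target = identity_id
            break
    # 2) Identities without a user ID: match on Email, telephone number or Cookie.
    if target is None:
        for identity_id, ident in store.items():
            if ident["user_id"] is not None:
                continue
            if any(record.get(key) is not None and record[key] in ident[attr]
                   for key, attr in FIELDS[:3]):
                ident["user_id"] = record["user_id"]  # assumption, see lead-in
                target = identity_id
                break
    # 3) No match at all: generate a new identity ID.
    if target is None:
        target = max(store, default=0) + 1
        store[target] = new_identity(record["user_id"])
    # Merge with deduplication: sets ignore values that are already present.
    for key, attr in FIELDS:
        if record.get(key) is not None:
            store[target][attr].add(record[key])
    return target
```

The computer IP is merged into the matched identity but is not used as a matching key, in line with the comparison scope stated above.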
Step 4-2 is specifically:
First, compare by Email: the Email contained in the user identity information of a "send inquiry while not logged in" website behavior record is compared with the Emails in the identity ID relation storehouse. If they are identical, the user identity information of this website behavior record is merged into the corresponding identity ID in the identity ID relation storehouse. During the merge, the other information (telephone number, Cookie, computer IP) is first compared with the corresponding information of that identity ID: if the corresponding items are identical, the identity ID information in the identity ID relation storehouse is not updated; if they differ, the corresponding user identity information of this website behavior record is added to the corresponding identity ID in the identity ID relation storehouse;
If the Email differs, the telephone number contained in the user identity information of the "send inquiry while not logged in" website behavior record is compared with the telephone numbers in the identity ID relation storehouse. If they are identical, the user identity information of this website behavior record is merged into the corresponding identity ID in the identity ID relation storehouse. During the merge, the other information (Email, Cookie, computer IP) is first compared with the corresponding information of that identity ID: if the corresponding items are identical, the identity ID information in the identity ID relation storehouse is not updated; if they differ, the corresponding user identity information of this website behavior record is added to the corresponding identity ID in the identity ID relation storehouse. If the telephone number in the website behavior record is identical to the telephone numbers of multiple identity IDs, the user identity information in this website behavior record is merged into the identity ID in the identity ID relation storehouse whose behavior occurred most recently;
If both the Email and the telephone number differ, the Cookie contained in the user identity information of the "send inquiry while not logged in" website behavior record is compared with the Cookies in the identity ID relation storehouse. If they are identical, the user identity information of this website behavior record is merged into the corresponding identity ID in the identity ID relation storehouse. During the merge, the other information (Email, telephone number, computer IP) is first compared with the corresponding information of that identity ID: if the corresponding items are identical, the identity ID information in the identity ID relation storehouse is not updated; if they differ, the corresponding user identity information of this website behavior record is added to the corresponding identity ID in the identity ID relation storehouse. If the Cookie in the website behavior record is identical to the Cookies of multiple identity IDs, the user identity information in this website behavior record is merged into the identity ID in the identity ID relation storehouse whose behavior occurred most recently;
The user identity information of the website behavior records that finally remain is given a new identity ID and added to the identity ID relation storehouse;
Step 4-3 is specifically:
The Cookie contained in the user identity information of an "access" or "search" website behavior record is compared with the Cookies in the identity ID relation storehouse. If they are identical, the user identity information of this website behavior record is merged into the corresponding identity ID in the identity ID relation storehouse. During the merge, the other information (Email, telephone number, computer IP) is first compared with the corresponding information of that identity ID: if the corresponding items are identical, the identity ID information in the identity ID relation storehouse is not updated; if they differ, the corresponding user identity information of this website behavior record is added to the corresponding identity ID in the identity ID relation storehouse. If the Cookie in the website behavior record is identical to the Cookies of multiple identity IDs, the user identity information in this website behavior record is merged into the identity ID in the identity ID relation storehouse whose behavior occurred most recently;
If the Cookie differs, then for the remaining "access" and "search" website behavior records that have still not been merged into the identity ID relation storehouse, the Email, telephone number, Cookie and computer IP of the user identity information they contain are extracted, and these identity items are compared across the different website behavior records. Whenever website behavior records share an identical identity item, they are determined to belong to the same user and are given the same new identity ID, and this new identity ID is added to the identity ID relation storehouse;
The user identity information of the website behavior records that finally remain is given a new identity ID and added to the identity ID relation storehouse.
4. A user identity identification system, characterized by comprising: a data information acquisition and storage module, a data sorting/conversion/integration module, an identity identification processing module, an identity update and maintenance module, and an identity information application module;
The data information acquisition and storage module is used to extract, from the data source systems of the website platform, the log data recording the users' various behaviors, including access, search, inquiry, login and registration behaviors; and to extract user basic information, including user name, region and telephone data, and store it in the background server;
The data sorting/conversion/integration module is used to read the log data in the data information acquisition and storage module, parse the log records, form intermediate-layer data on the users' various behaviors, and store it, together with the basic information that users filled in at registration, in the background server;
The data sorting/conversion/integration module includes an ETL submodule and a data warehouse submodule; the ETL submodule is used to read the various kinds of data in the data information acquisition and storage module, carry out further information identification, cleaning, processing and sorting, and output the result to the data warehouse submodule; the data warehouse submodule is used to classify and summarize the information into intermediate-layer data and store it in the data warehouse;
The identity identification processing module includes an identity characteristic information association module; the identity identification processing module is used to assign an identity ID to each user and establish the association between the identity ID and the user ID, user name, Email, telephone number, Cookie and computer IP, finally obtaining the user identity relations;
The identity update and maintenance module is used to merge, correct, supplement and maintain the identity information contained in newly generated user behavior, forming new identity IDs and corresponding identity information, which are supplemented and updated into the identity ID relation storehouse; the identity update and maintenance module includes a new identity information knowledge unit, an information decision processor, an information association processor, an identity update processor, and an identity ID characteristic relation result unit;
The new identity information knowledge unit is used to store the user identity information in newly generated website user behavior and the basic information of newly registered users, both obtained by periodic updating, and to deduplicate them;
The information decision processor is used to compare the identity information in the behavior records in the new identity information knowledge unit with the identity ID information in the identity characteristic information association module; if they are identical, processing passes to the identity update processor;
The identity update processor is used to merge and deduplicate the user identity information in the new behavior with the identity ID information in the identity characteristic information association module, updating the identity characteristic relations of the existing identity IDs;
The information association processor is used to process the remaining website behavior records that have still not been merged into the identity ID relation storehouse, carrying out information association among the user identity information they contain and forming new identity ID information records;
The identity ID characteristic relation result unit is used to store the identity ID information records generated by the update, and continues to be updated periodically;
The identity information application module is used to apply the identities in the identity ID relation storehouse to user behavior on the website platform, identifying users and tracking and analyzing user behavior.
5. The system according to claim 4, characterized in that:
The identity identification processing module further includes an identity information knowledge unit, a first information association processor, a first identity characteristic information association submodule, a first information decision processor, a second information association processor, a second identity characteristic information association submodule, a second information decision processor, and a third information association processor;
The identity information knowledge unit is used to extract identity characteristic information from the user login, inquiry, access and search behavior records in the data warehouse submodule and from the basic information of user registration, including user ID, user name, Email, telephone number, Cookie and computer IP information records; this information is saved and aggregated, and completely duplicated records are removed;
The first information association processor is used to carry out identity association and merging on the records of the two kinds of website behavior "login" and "send inquiry while logged in" and on the user basic information in the "registration information", merging all user names, Emails, telephone numbers, Cookies and computer IPs that correspond to the same user ID;
The first identity characteristic information association submodule is used to store the correspondence among user ID, user name, Email, telephone number, Cookie and computer IP after merging and deduplication by the first information association processor, and to give different identity IDs to different user IDs, forming identity ID information records;
The first information decision processor is used to compare the identity information in the "send inquiry while not logged in" behavior records in the identity information knowledge unit with the identity ID information records produced in the first identity characteristic information association submodule; if the identity information is identical, the records are regarded as the same user and the new identity information is merged into the first identity characteristic information association submodule; if the information is not identical, processing passes to the second information association processor;
The second information association processor is used to process the Email, telephone number, Cookie and computer IP information that the first information decision processor has still not merged into an identity ID; records in which any one of Email, telephone number or Cookie is identical are regarded as the same user and given the same identity ID;
The second identity characteristic information association submodule is used to store the associations between identity IDs and the Emails, telephone numbers, Cookies and computer IPs associated and merged by the second information association processor, and at the same time merges in the associations between identity IDs and user ID, user name, Email, telephone number, Cookie and computer IP stored in the first identity characteristic information association submodule;
The second information decision processor is used to compare the user identity information in the "access" and "search" behavior records in the identity information knowledge unit with the identity ID information records produced in the second identity characteristic information association submodule; if the identity information is identical, the records are regarded as the same user and the new identity information is merged into the second identity characteristic information association submodule; if the comparison result is not identical, processing passes to the third information association processor;
The third information association processor is used to process the website behavior records that have still not been merged into an identity ID in the first information decision processor, comparing the Cookie and computer IP information among them; if the Cookie is identical, the records are regarded as the same user and given the same identity ID;
The identity characteristic information association module is used to store the identity ID information records formed from the associations between Cookies and identity IDs merged by the third information association processor, and at the same time merges in the identity ID information records stored in the second identity characteristic information association submodule.
CN201410367353.5A 2014-07-29 2014-07-29 A kind of method for identifying ID and system Active CN104394118B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410367353.5A CN104394118B (en) 2014-07-29 2014-07-29 A kind of method for identifying ID and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410367353.5A CN104394118B (en) 2014-07-29 2014-07-29 A kind of method for identifying ID and system

Publications (2)

Publication Number Publication Date
CN104394118A CN104394118A (en) 2015-03-04
CN104394118B true CN104394118B (en) 2016-12-14

Family

ID=52611954

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410367353.5A Active CN104394118B (en) 2014-07-29 2014-07-29 A kind of method for identifying ID and system

Country Status (1)

Country Link
CN (1) CN104394118B (en)

Families Citing this family (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104809156B (en) * 2015-03-24 2019-02-01 北京锐安科技有限公司 The method and apparatus of taking of evidence information
CN106202099B (en) * 2015-05-05 2019-11-12 北京国双科技有限公司 The recognition methods of visitor information and device in web log file
CN106549914B (en) * 2015-09-18 2019-12-06 北京秒针信息咨询有限公司 identification method and device for independent visitor
CN106682025B (en) * 2015-11-09 2020-04-14 阿里巴巴集团控股有限公司 Method and device for identifying mobile phone number user
CN105550916A (en) * 2015-11-30 2016-05-04 成都反思科技有限公司 Data acquisition method on the basis of multidimensional identification
CN107025563A (en) * 2016-01-29 2017-08-08 福建天晴数码有限公司 Follow the trail of the method and system for delivering advertisement
CN105912663A (en) * 2016-04-12 2016-08-31 宁波极动精准广告传媒有限公司 User tag merging method based on big data
CN106230829B (en) * 2016-08-03 2019-06-11 浪潮通用软件有限公司 Network-oriented threatens the construction method of the virtual identity knowledge mapping of discovery
CN106302797B (en) * 2016-08-31 2019-08-13 北京锐安科技有限公司 A kind of cookie access De-weight method and device
CN108241795A (en) * 2016-12-23 2018-07-03 北京国双科技有限公司 A kind of method for identifying ID and device
CN107066539A (en) * 2017-03-09 2017-08-18 北京网康科技有限公司 A kind of information processing method and device
CN108664375B (en) * 2017-03-28 2021-05-18 瀚思安信(北京)软件技术有限公司 Method for detecting abnormal behavior of computer network system user
CN107665438B (en) * 2017-08-10 2019-04-26 深圳市买买提信息科技有限公司 A kind of data processing method and device
CN109598529A (en) * 2017-09-30 2019-04-09 北京国双科技有限公司 A kind of recognition methods of user identifier and device
CN107895280A (en) * 2017-10-27 2018-04-10 深圳索信达数据技术股份有限公司 A kind of marketing program method for pushing, system, terminal and storage medium
CN108171547B (en) * 2017-12-27 2021-12-07 平安普惠企业管理有限公司 User behavior tracking method, device, equipment and storage medium
CN108444584B (en) * 2018-04-13 2020-03-31 山东华宇工学院 Intelligent height and weight measuring system and method
CN110727885A (en) * 2018-06-28 2020-01-24 上海传漾广告有限公司 Internet global uniform identifier generation system and generation method thereof
CN109086452A (en) * 2018-08-24 2018-12-25 北京奇虎科技有限公司 ID data network beta pruning preprocess method, device and calculating equipment
CN109344722B (en) 2018-09-04 2020-03-24 阿里巴巴集团控股有限公司 User identity determination method and device and electronic equipment
CN110109814B (en) * 2019-05-15 2023-07-21 恒生电子股份有限公司 User behavior data correction method and device
CN111147511A (en) * 2019-12-31 2020-05-12 杭州涂鸦信息技术有限公司 User identity serial-parallel method and system
CN112734476A (en) * 2021-01-13 2021-04-30 上海群之脉信息科技有限公司 Intelligent customer data detection system
CN112734485A (en) * 2021-01-13 2021-04-30 上海群之脉信息科技有限公司 User intelligent operation system
CN114116863B (en) * 2021-10-28 2023-07-25 上海欣兆阳信息科技有限公司 Method and system for fusing cross-channel consumer identities in real time

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103136694A (en) * 2013-03-20 2013-06-05 焦点科技股份有限公司 Collaborative filtering recommendation method based on search behavior perception

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101222348B (en) * 2007-01-10 2011-05-11 阿里巴巴集团控股有限公司 Method and system for calculating number of website real user
CN103942708A (en) * 2013-09-30 2014-07-23 上海本家空调系统有限公司 Method and system for evaluating regional customers
CN103886487B (en) * 2014-03-28 2016-01-27 焦点科技股份有限公司 Based on personalized recommendation method and the system of distributed B2B platform

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant