CN105721578A - User behavior data collection method and system - Google Patents

User behavior data collection method and system

Info

Publication number
CN105721578A
Authority
CN
China
Prior art keywords
data
page
equations
timestamp
identification information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610089688.4A
Other languages
Chinese (zh)
Other versions
CN105721578B (en
Inventor
王伟
谢潇宇
赵金鑫
张舜华
何小锋
廖继逢
胡宗维
王明龙
卢颖辉
汪楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp
Priority to CN201610089688.4A
Publication of CN105721578A
Application granted
Publication of CN105721578B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/535 Tracking the activity of the user

Abstract

The invention provides a user behavior data collection method and system. The method comprises the following steps: recording, through an Apache process, a first type of collected data associated with a user's access request for a page, wherein the first type of collected data comprises identification information of the page, a timestamp generated when the page is loaded, and first collected data; collecting, through a JavaScript script, a second type of collected data associated with the user's access request for the page, wherein the second type of collected data comprises the identification information of the page, the timestamp generated when the page is loaded, and second collected data; and matching the first type of collected data with the second type of collected data according to the identification information and timestamps they contain, thereby obtaining the user's behavior data. The method and system extend the ways in which user behavior data can be collected and make the collected user behavior data more comprehensive.

Description

A user behavior data collection method and system
Technical field
The present invention relates to the field of data processing, and in particular to a user behavior data collection method and system.
Background art
With the rapid development of Internet technology, the era of big data has arrived. The average daily number of user visits to many popular web sites has reached the tens of millions. Data about these visits has become the basic source data for big data analysis, and collecting this dynamic data has become a vital step.
However, most existing approaches to collecting page data from web sites rely solely on either Apache logs or a JavaScript script, and each approach can collect different data. The data collected by existing approaches is therefore narrow in scope and neither sufficient nor comprehensive.
Summary of the invention
To solve the above technical problem, the present invention provides a user behavior data collection method and device that match the data collected from a user by the two approaches, Apache and a JavaScript script, and use the matched data as the user's behavior data. This extends the ways in which user behavior data can be collected and significantly improves the comprehensiveness of the collected user behavior data.
According to a first aspect of embodiments of the present invention, a user behavior data collection method is provided. The method includes: recording, through an Apache process, a first type of collected data associated with a user's access request for a page, the first type of collected data including identification information of the page, a timestamp generated when the page is loaded, and first collected data; collecting, through a JavaScript script, a second type of collected data associated with the user's access request for the page, the second type of collected data including the identification information of the page, the timestamp generated when the page is loaded, and second collected data; and matching the first type of collected data with the second type of collected data according to the page identification information and timestamps in the two types of collected data, to obtain the user's behavior data.
In some embodiments of the present invention, the identification information of the page includes a Uniform Resource Locator (URL).
In some embodiments of the present invention, the timestamp generated when the page is loaded is stored in a cookie of the page.
In some embodiments of the present invention, the first collected data includes one or more of the following: HTTP status codes, on-site search keywords, products browsed, and products added to the shopping cart.
In some embodiments of the present invention, the second collected data includes one or more of the following: session ID, user agent, Flash version, cookie, screen parameters, and page dwell time.
In some embodiments of the present invention, matching the first type of collected data with the second type of collected data according to the page identification information and timestamps in the two types of collected data includes: comparing the page identification information and timestamp in the first type of collected data with the page identification information and timestamp in the second type of collected data and, if they are consistent, merging the first type of collected data and the second type of collected data as the user's behavior data on the page at the moment corresponding to the timestamp.
According to a second aspect of embodiments of the present invention, a user behavior data collection system is provided. The system includes: a first collection module, configured to record, through an Apache process, a first type of collected data associated with a user's access request for a page, the first type of collected data including identification information of the page, a timestamp generated when the page is loaded, and first collected data; a second collection module, configured to collect, through a JavaScript script, a second type of collected data associated with the user's access request for the page, the second type of collected data including the identification information of the page, the timestamp generated when the page is loaded, and second collected data; and an integration module, configured to match the first type of collected data with the second type of collected data according to the page identification information and timestamps in the two types of collected data, to obtain the user's behavior data.
In some embodiments of the present invention, the identification information of the page includes a Uniform Resource Locator (URL).
In some embodiments of the present invention, the timestamp generated when the page is loaded is stored in a cookie of the page.
In some embodiments of the present invention, the first collected data includes one or more of the following: HTTP status codes, on-site search keywords, products browsed, and products added to the shopping cart.
In some embodiments of the present invention, the second collected data includes one or more of the following: session ID, user agent, Flash version, cookie, screen parameters, and page dwell time.
In some embodiments of the present invention, the matching performed by the integration module according to the page identification information and timestamps in the two types of collected data includes: comparing the page identification information and timestamp in the first type of collected data with the page identification information and timestamp in the second type of collected data and, if they are consistent, merging the first type of collected data and the second type of collected data as the user's behavior data on the page at the moment corresponding to the timestamp.
Implementing the user behavior data collection method and system provided by embodiments of the present invention extends the ways in which user behavior data can be collected and, at the same time, improves the comprehensiveness of the collected user behavior data.
Brief description of the drawings
Fig. 1 is a schematic flowchart of a user behavior data collection method according to one embodiment of the present invention;
Fig. 2 is a schematic flowchart of collecting user-associated data through Apache according to one embodiment of the present invention;
Fig. 3 is a schematic flowchart of collecting user-associated data through a JavaScript script according to one embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a user behavior data collection system according to one embodiment of the present invention.
Detailed description of the embodiments
Various aspects of the present invention are described in detail below with reference to the drawings and specific embodiments. Well-known modules, units, and the connections, links, communications, or operations among them are not shown or described in detail. The described features, architectures, or functions may be combined in any manner in one or more embodiments. Those skilled in the art will understand that the following embodiments are for illustration only and are not intended to limit the scope of the invention. It will also be readily understood that the modules, units, or processing approaches in the embodiments described herein and shown in the drawings may be combined and designed in various different configurations.
Some concepts involved in the present invention are explained below.
Apache is short for Apache HTTP Server, an open-source web server from the Apache Software Foundation. It runs on most computer operating systems and is cross-platform web server software. In embodiments of the present invention, an Apache process can be used to receive the Hypertext Transfer Protocol (HTTP) requests that a user initiates to a page through a client browser, and to record the related logs.
A JavaScript script is written in JavaScript, an interpreted scripting language that is dynamically typed, weakly typed, and prototype-based. In embodiments of the present invention, a common data collection JavaScript script can be embedded in each page to collect user-defined metrics.
The user behavior data collection method of the present invention is described below with reference to the drawings.
Fig. 1 is a schematic flowchart of a user behavior data collection method according to one embodiment of the present invention; Fig. 2 is a schematic flowchart of collecting user-associated data through Apache according to one embodiment of the present invention; Fig. 3 is a schematic flowchart of collecting user-associated data through a JavaScript script according to one embodiment of the present invention.
As shown in Fig. 1, the user behavior data collection method of an embodiment of the present invention may include steps S11, S12, and S13. In some other embodiments, the method may also include further steps, for example, pre-configuration and embedding steps before collection, and a data formatting step after matching.
Each step of the method is described in detail below.
In step S11, a first type of collected data associated with a user's access request for a page is recorded through an Apache process. The first type of collected data includes: identification information of the page, a timestamp generated when the page is loaded, and first collected data. This step is performed on the server side of the web site the user accesses, on a device where the Apache software is deployed. Before step S11, the method may also include configuring the Apache log format, which may, for example, be done by a system administrator. On the client side, when the user clicks a page of the web site, the client browser is triggered to initiate an HTTP request to that page. On the server side of the web site, the Apache process receives the HTTP request and records the first type of collected data associated with the user's access request for the page. The recorded data is sent through Syslog to the collector of this module; because the data is synchronized through Syslog, the Apache logs can be analyzed asynchronously. In some embodiments of the present invention, the flow of the Apache log collection operation may be as shown in Fig. 2.
The first type of collected data collected through the Apache process in embodiments of the present invention may include: identification information of the page accessed by the user's access request, the timestamp generated when the page is loaded, and first collected data. The identification information of the page may include a Uniform Resource Locator (URL), or may be other identification information, or a combination of multiple pieces of identification information, that uniquely identifies the page. In some embodiments of the present invention, the identification information of the page is the URL. A timestamp is generated each time a page is loaded and is stored in a cookie of the page; its precision can reach the order of 10^-9 seconds. In other embodiments, depending on factors such as the precision required for matching, timestamps of other precisions may be used, for example 10^-1, 10^-2, 10^-3, 10^-4, 10^-5, 10^-6, 10^-7, 10^-8, 10^-10, or 10^-11 seconds. The first collected data varies with the type of web page accessed and the purpose of the corresponding data analysis. For an e-commerce web site, for example, the first collected data may include one or more (for example, two or more) of the following: HTTP status codes, on-site search keywords and the activity log data for those keywords, and activity log data for member operations such as browsing products or adding products to the shopping cart. It should be noted that the web pages accessed by the user in embodiments of the present invention may belong not only to e-commerce web sites of various types but also to other types of web site, for example news web sites.
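As an illustration of what a first-type record might look like once it leaves the Apache log, the following is a minimal sketch, not the patented implementation. It assumes a hypothetical log format of three whitespace-separated fields (URL path, HTTP status code, and the value of a cookie carrying the page-load timestamp); the cookie name `page_ts`, the format nickname `behavior`, and the names `FirstTypeRecord` and `parse_log_line` are all assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed (hypothetical) Apache directive that would produce the lines parsed below:
#   LogFormat "%U %>s %{page_ts}C" behavior
# %U is the requested URL path, %>s the final HTTP status, and %{page_ts}C the
# value of the request cookie named page_ts (the cookie name is an assumption).


@dataclass(frozen=True)
class FirstTypeRecord:
    url: str        # page identification information
    timestamp: int  # page-load timestamp read from the cookie
    status: int     # one example of "first collected data"


def parse_log_line(line: str) -> Optional[FirstTypeRecord]:
    """Parse one log line; return None if it is malformed or has no timestamp."""
    parts = line.split()
    if len(parts) != 3:
        return None
    url, status, ts = parts
    try:
        # Apache writes "-" for a missing cookie, which fails int() and is skipped.
        return FirstTypeRecord(url=url, timestamp=int(ts), status=int(status))
    except ValueError:
        return None


record = parse_log_line("/shop/item/42 200 1455500000123456789")
print(record)
```

Lines without a usable timestamp are dropped here because they could never be matched against second-type data; a real deployment might prefer to keep them for separate analysis.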
In step S12, a second type of collected data associated with the user's access request for the page is collected through a JavaScript script. The second type of collected data includes: the identification information of the page, the timestamp generated when the page is loaded, and second collected data. On the client side, when the user clicks a page of the web site, the client browser is triggered to initiate an HTTP request to the page. While the page loads, the JavaScript script embedded in the page in advance is triggered and begins collecting the second type of collected data associated with the access request. When collection is complete, the script sends the collected data to the corresponding collection server so that the collected data can be analyzed subsequently. In some embodiments of the present invention, the flow of collecting data through the JavaScript script may be as shown in Fig. 3.
The second type of collected data collected through the JavaScript script in embodiments of the present invention may include: identification information of the page accessed by the user's access request, the timestamp generated when the page is loaded, and second collected data. The identification information of the page may include a URL, or may be other identification information, or a combination of multiple pieces of identification information, that uniquely identifies the page. In some embodiments of the present invention, the identification information of the page is the URL. A timestamp is generated each time a page is loaded and is stored in a cookie of the page; its precision can reach the order of 10^-9 seconds. In other embodiments, depending on factors such as the precision required for matching, timestamps of other precisions may be used, for example 10^-1, 10^-2, 10^-3, 10^-4, 10^-5, 10^-6, 10^-7, 10^-8, 10^-10, or 10^-11 seconds. The second collected data varies with the type of web page accessed and the purpose of the corresponding data analysis. For an e-commerce web site, for example, the second collected data may include one or more (for example, two or more) of the following: session ID (sessionID), user agent (UserAgent), Flash version, cookie, screen parameters, and page dwell time.
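Because both types of collected data carry the same page-load timestamp, an embodiment that matches at a coarser precision would need to reduce both sides in the same way. The sketch below shows one way this might be done, assuming the timestamp is held as an integer count of nanoseconds; the function name `to_precision` is hypothetical, and precisions finer than 10^-9 seconds are outside what this sketch handles.

```python
def to_precision(ts_ns: int, decimals: int) -> int:
    """Truncate a nanosecond timestamp to units of 10**-decimals seconds."""
    if not 0 <= decimals <= 9:
        raise ValueError("this sketch only handles precisions from 1 s to 1 ns")
    return ts_ns // 10 ** (9 - decimals)


ts = 1455500000123456789        # nanoseconds, as stored in the page cookie
print(to_precision(ts, 9))      # unchanged, nanosecond precision
print(to_precision(ts, 3))      # millisecond precision
print(to_precision(ts, 0))      # whole seconds
```

Truncation rather than rounding is used so that two copies of the same original value always reduce to the same key.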
It should be noted that although steps S11 and S12 are described above in a particular order, during data collection they may be performed with step S11 first and step S12 second, with step S12 first and step S11 second, or simultaneously.
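Since the two collection steps may complete in either order, a matcher that processes records as they arrive has to tolerate either side coming first. The following is a minimal sketch of one such approach, holding unmatched records in memory until their partner appears. The class name `StreamingMatcher`, the side labels "first" and "second", and the shape of the merged record are assumptions made for illustration only.

```python
from typing import Dict, Optional, Tuple

Key = Tuple[str, int]  # (page URL, page-load timestamp)


class StreamingMatcher:
    """Pairs first-type and second-type records whichever side arrives first."""

    def __init__(self) -> None:
        self._pending: Dict[str, Dict[Key, dict]] = {"first": {}, "second": {}}

    def add(self, side: str, url: str, timestamp: int, data: dict) -> Optional[dict]:
        """Store a record, or return the merged record if its partner is waiting."""
        if side not in self._pending:
            raise ValueError("side must be 'first' or 'second'")
        other = "second" if side == "first" else "first"
        key = (url, timestamp)
        partner = self._pending[other].pop(key, None)
        if partner is None:
            self._pending[side][key] = data  # wait for the other side
            return None
        first, second = (data, partner) if side == "first" else (partner, data)
        return {"url": url, "timestamp": timestamp, "first": first, "second": second}


matcher = StreamingMatcher()
print(matcher.add("second", "/item/42", 100, {"screen": "1920x1080"}))  # no partner yet
print(matcher.add("first", "/item/42", 100, {"status": 200}))           # merged record
```

A production system would also need to expire pending records that never find a partner; that is omitted here.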
The first type of collected data collected in step S11 may be saved as files distributed across different web servers, and the files may be transmitted asynchronously to an analysis server using syslog-ng. The second type of collected data collected in step S12 may also be saved as files and transmitted to the corresponding analysis server by the open-source tool Flume.
In step S13, the first type of collected data and the second type of collected data are matched according to the page identification information and timestamps they contain, to obtain the user's behavior data. Specifically, the matching may include comparing the page identification information and timestamp in the first type of collected data with the page identification information and timestamp in the second type of collected data. If the comparison is consistent, that is, the page identification information and the timestamp are both identical in the two types of collected data, the first type of collected data and the second type of collected data are merged as the user's behavior data on the page at the moment corresponding to the timestamp. If the comparison is inconsistent, that is, the page identification information and timestamp are not both identical in the two types of collected data, the data is not merged. In other words, the data integration of the present invention relies on the information common to the first type and the second type of collected data, namely the page identification information (for example, the URL) and the timestamp generated when the page is loaded. The two types of collected data, collected in different ways, are matched to obtain multi-faceted collected data about the user on the page at the moment corresponding to the timestamp, which serves as the user's behavior data at that moment. After the user's behavior data at a given moment is obtained, it may also be formatted, for example by extracting and processing the data into a unified format that facilitates further statistical analysis.
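The comparison and merge in step S13 can be sketched as a join on the pair (URL, timestamp). This is a minimal illustration under stated assumptions, not the patented implementation: records are plain dictionaries with `url` and `timestamp` keys, the function name `match_records` and the sample field names are hypothetical, and JSON lines stand in for the unspecified unified format.

```python
import json
from typing import Dict, Iterable, List, Tuple


def match_records(first: Iterable[dict], second: Iterable[dict]) -> List[dict]:
    """Merge records whose page URL and page-load timestamp are both identical."""
    # Index the second-type records by the shared key (URL, timestamp).
    index: Dict[Tuple[str, int], dict] = {
        (rec["url"], rec["timestamp"]): rec for rec in second
    }
    merged = []
    for rec in first:
        partner = index.get((rec["url"], rec["timestamp"]))
        if partner is not None:
            # Comparison consistent: combine both records into one behavior record.
            merged.append({**partner, **rec})
        # Comparison inconsistent: the record is left unmerged.
    return merged


first_type = [
    {"url": "/item/42", "timestamp": 100, "status": 200},
    {"url": "/cart", "timestamp": 200, "status": 200},
]
second_type = [
    {"url": "/item/42", "timestamp": 100, "user_agent": "ExampleBrowser/1.0"},
    {"url": "/cart", "timestamp": 999, "user_agent": "ExampleBrowser/1.0"},
]
behavior = match_records(first_type, second_type)
for rec in behavior:
    # One possible unified format for later statistical analysis.
    print(json.dumps(rec, sort_keys=True))
```

Only the `/item/42` pair is merged, because the `/cart` records share a URL but not a timestamp. If a field name appears on both sides, this sketch lets the first-type value win, and if several second-type records share a key, the last one is used; both are arbitrary choices for the example.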
By integrating, based on page identification information and timestamps, two types of user-associated data collected in different ways, embodiments of the present invention extend the ways in which user behavior data can be collected and also improve the comprehensiveness of the collected user behavior data, compared with existing schemes that collect user-associated data in only one way.
The user behavior data collection method of the present invention has been described above with reference to the drawings and specific examples. The system corresponding to that method is described below with reference to the drawings and specific examples.
Fig. 4 is a schematic structural diagram of a user behavior data collection system according to one embodiment of the present invention.
As shown in Fig. 4, the user behavior data collection system 4 may include a first collection module 41, a second collection module 42, and an integration module 43. These modules may be deployed on the server side of the web site, for example in a server cluster used for data collection. The first collection module 41 may use existing Apache collection tools to collect the corresponding data, and the second collection module 42 may likewise use existing JavaScript collection tools to collect the corresponding data.
Each module of the user behavior data collection system of the present invention is described in detail below.
The first collection module 41 records, through an Apache process, a first type of collected data associated with a user's access request for a page. The first type of collected data includes: identification information of the page, a timestamp generated when the page is loaded, and first collected data. In some embodiments of the present invention, the flow of the Apache log collection operation may be as shown in Fig. 2.
The first type of collected data collected through the Apache process in embodiments of the present invention may include: identification information of the page accessed by the user's access request, the timestamp generated when the page is loaded, and first collected data. The identification information of the page may include a URL, or may be other identification information, or a combination of multiple pieces of identification information, that uniquely identifies the page. In some embodiments of the present invention, the identification information of the page is the URL. A timestamp is generated each time a page is loaded and is stored in a cookie of the page; its precision can reach the order of 10^-9 seconds. In other embodiments, depending on factors such as the precision required for matching, timestamps of other precisions may be used, for example 10^-1, 10^-2, 10^-3, 10^-4, 10^-5, 10^-6, 10^-7, 10^-8, 10^-10, or 10^-11 seconds. The first collected data varies with the type of web page accessed and the purpose of the corresponding data analysis. For an e-commerce web site, for example, the first collected data may include one or more (for example, two or more) of the following: HTTP status codes, on-site search keywords and the activity log data for those keywords, and activity log data for member operations such as browsing products or adding products to the shopping cart.
The second collection module 42 collects, through a JavaScript script, a second type of collected data associated with the user's access request for the page. The second type of collected data includes: the identification information of the page, the timestamp generated when the page is loaded, and second collected data. In some embodiments of the present invention, the flow of collecting data through the JavaScript script may be as shown in Fig. 3.
The second type of collected data collected through the JavaScript script in embodiments of the present invention may include: identification information of the page accessed by the user's access request, the timestamp generated when the page is loaded, and second collected data. The identification information of the page may include a URL, or may be other identification information, or a combination of multiple pieces of identification information, that uniquely identifies the page. In some embodiments of the present invention, the identification information of the page is the URL. A timestamp is generated each time a page is loaded and is stored in a cookie of the page; its precision can reach the order of 10^-9 seconds. In other embodiments, depending on factors such as the precision required for matching, timestamps of other precisions may be used, for example 10^-1, 10^-2, 10^-3, 10^-4, 10^-5, 10^-6, 10^-7, 10^-8, 10^-10, or 10^-11 seconds. The second collected data varies with the type of web page accessed and the purpose of the corresponding data analysis. For an e-commerce web site, for example, the second collected data may include one or more (for example, two or more) of the following: session ID (sessionID), user agent (UserAgent), Flash version, cookie, screen parameters, and page dwell time.
The integration module 43 matches the first type of collected data collected by the first collection module 41 with the second type of collected data collected by the second collection module 42, according to the page identification information and timestamps they contain, to obtain the user's behavior data. Specifically, the matching may include comparing the page identification information and timestamp in the first type of collected data with the page identification information and timestamp in the second type of collected data. If the comparison is consistent, that is, the page identification information and the timestamp are both identical in the two types of collected data, the first type of collected data and the second type of collected data are merged as the user's behavior data on the page at the moment corresponding to the timestamp. If the comparison is inconsistent, that is, the page identification information and timestamp are not both identical in the two types of collected data, the data is not merged.
By integrating, based on page identification information and timestamps, two types of user-associated data collected in different ways, embodiments of the present invention extend the ways in which user behavior data can be collected and also improve the comprehensiveness of the collected user behavior data, compared with existing schemes that collect user-associated data in only one way.
From the above description of the embodiments, those skilled in the art will clearly understand that the present invention can be implemented by software in combination with a hardware platform. On that understanding, all or part of the contribution that the technical solution of the present invention makes over the background art may be embodied in the form of a software product. The computer software product may be stored in a storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a smartphone, a network device, or the like) to perform the methods described in the embodiments of the present invention or in certain parts of the embodiments.
The terms and wording used herein are for illustration only and are not intended to be limiting. Those skilled in the art will understand that various changes may be made to the details of the above embodiments without departing from the basic principles of the disclosed embodiments. The scope of the present invention is therefore determined only by the claims, in which, unless otherwise stated, all terms are to be understood in their broadest reasonable sense.

Claims (12)

1. A user behavior data collection method, characterized in that the method comprises:
recording, by an Apache process, first-type collected data associated with a user's access request to a page, the first-type collected data comprising: identification information of the page, a timestamp generated when the page is loaded, and first collected data;
collecting, by a JavaScript script, second-type collected data associated with the user's access request to the page, the second-type collected data comprising: the identification information of the page, the timestamp generated when the page is loaded, and second collected data; and
matching the first-type collected data with the second-type collected data according to the identification information of the page and the timestamp in the first-type collected data and the second-type collected data, to obtain behavior data of the user.
2. The method according to claim 1, characterized in that the identification information of the page comprises a uniform resource locator (URL).
3. The method according to claim 1, characterized in that the timestamp generated when the page is loaded is stored in a cookie of the page.
4. The method according to any one of claims 1 to 3, characterized in that the first collected data comprises one or more of the following: an HTTP status code, on-site search keywords, browsed commodities, and commodities added to a shopping cart.
5. The method according to any one of claims 1 to 3, characterized in that the second collected data comprises one or more of the following: a session ID, a user agent, a Flash version, a cookie, screen parameters, and a page dwell time.
6. The method according to any one of claims 1 to 3, characterized in that matching the first-type collected data with the second-type collected data according to the identification information of the page and the timestamp in the first-type collected data and the second-type collected data comprises:
comparing the identification information of the page and the timestamp in the first-type collected data with the identification information of the page and the timestamp in the second-type collected data, and, if they are consistent, merging the first-type collected data and the second-type collected data as the behavior data of the user on the page at the moment corresponding to the timestamp.
7. A user behavior data collection system, characterized in that the system comprises:
a first collection module, configured to record, by an Apache process, first-type collected data associated with a user's access request to a page, the first-type collected data comprising: identification information of the page, a timestamp generated when the page is loaded, and first collected data;
a second collection module, configured to collect, by a JavaScript script, second-type collected data associated with the user's access request to the page, the second-type collected data comprising: the identification information of the page, the timestamp generated when the page is loaded, and second collected data; and
an integration module, configured to match the first-type collected data with the second-type collected data according to the identification information of the page and the timestamp in the first-type collected data and the second-type collected data, to obtain behavior data of the user.
8. The system according to claim 7, characterized in that the identification information of the page comprises a uniform resource locator (URL).
9. The system according to claim 7, characterized in that the timestamp generated when the page is loaded is stored in a cookie of the page.
10. The system according to any one of claims 7 to 9, characterized in that the first collected data comprises one or more of the following: an HTTP status code, on-site search keywords, browsed commodities, and commodities added to a shopping cart.
11. The system according to any one of claims 7 to 9, characterized in that the second collected data comprises one or more of the following: a session ID, a user agent, a Flash version, a cookie, screen parameters, and a page dwell time.
12. The system according to any one of claims 7 to 9, characterized in that the integration module matching the first-type collected data with the second-type collected data according to the identification information of the page and the timestamp in the first-type collected data and the second-type collected data comprises: comparing the identification information of the page and the timestamp in the first-type collected data with the identification information of the page and the timestamp in the second-type collected data, and, if they are consistent, merging the first-type collected data and the second-type collected data as the behavior data of the user on the page at the moment corresponding to the timestamp.
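The matching step recited in claims 6 and 12 can be illustrated with a short sketch. This is only a minimal illustration under stated assumptions, not the implementation described by the patent: the function name `match_records`, the record layout (plain dictionaries), and the field names `url`, `ts`, `status`, `keyword`, `session_id` and `dwell_ms` are all hypothetical. It assumes both data sources have already been parsed into records that carry the page URL and the page-load timestamp.

```python
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]
Key = Tuple[str, str]  # (page URL, page-load timestamp)


def match_records(server_records: List[Record],
                  script_records: List[Record]) -> List[Record]:
    """Merge server-side and script-side records that share URL and timestamp.

    All field names are hypothetical; the claims only require that both
    record types carry the page identification and the load timestamp.
    """
    # Index the script-side (second-type) records by (url, ts).
    # Assumption: at most one script-side record per key; a later one overwrites.
    by_key: Dict[Key, Record] = {}
    for rec in script_records:
        by_key[(rec["url"], rec["ts"])] = rec

    merged: List[Record] = []
    for rec in server_records:
        other = by_key.get((rec["url"], rec["ts"]))
        if other is None:
            # No consistent counterpart; dropping the record is an assumption,
            # since the claims do not say what happens to unmatched data.
            continue
        # Comparison is consistent: combine both records into one behavior record.
        merged.append({**other, **rec})
    return merged


# Example with made-up data: only the first server record has a counterpart.
server = [
    {"url": "/item/42", "ts": "1455700000123", "status": 200, "keyword": "umbrella"},
    {"url": "/cart", "ts": "1455700005555", "status": 200},
]
script = [
    {"url": "/item/42", "ts": "1455700000123", "session_id": "s-1", "dwell_ms": 8400},
]
behavior = match_records(server, script)
```

Indexing one side by the `(url, ts)` pair keeps the comparison to a single dictionary lookup per server-side record. An exact-equality comparison is used here because both sides are assumed to read the same stored timestamp value, for example from the page cookie mentioned in claims 3 and 9.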
CN201610089688.4A 2016-02-17 2016-02-17 User behavior data collection method and system Active CN105721578B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610089688.4A CN105721578B (en) 2016-02-17 2016-02-17 User behavior data collection method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610089688.4A CN105721578B (en) 2016-02-17 2016-02-17 User behavior data collection method and system

Publications (2)

Publication Number Publication Date
CN105721578A true CN105721578A (en) 2016-06-29
CN105721578B CN105721578B (en) 2019-05-24

Family

ID=56155846

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610089688.4A Active CN105721578B (en) 2016-02-17 2016-02-17 User behavior data collection method and system

Country Status (1)

Country Link
CN (1) CN105721578B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070271375A1 (en) * 2004-09-27 2007-11-22 Symphoniq Corporation Method and apparatus for monitoring real users experience with a website capable of using service providers and network appliances
CN101576933A (en) * 2009-06-29 2009-11-11 北京黑米世纪信息技术有限公司 Fully-automatic grouping method of WEB pages based on title separator
CN104601408A (en) * 2015-01-30 2015-05-06 迈普通信技术股份有限公司 Website data statistics and analysis method and system used for non-open network environment
CN104636245A (en) * 2015-03-09 2015-05-20 浪潮集团有限公司 User browsing behavior collection modes based on real-time update

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
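向坚持 et al.: "Research on data collection technology for Web usage mining based on user behavior", Computer and Modernization (《计算机与现代化》) *
夏天的森林: "System design and JavaScript notes: data collection in user behavior analysis research", 《HTTPS://WWW.CNBLOGS.COM/SHARPXIAJUN/ARCHIVE/2012/06/HTML》 *
朱志国 et al.: "Analysis and research on Web usage mining technology", Application Research of Computers (《计算机应用研究》) *
胡光民 et al.: "Research on a Hadoop-based web log analysis system", Computer Knowledge and Technology (《电脑知识与技术》) *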

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107886382A (en) * 2016-09-29 2018-04-06 北京京东尚科信息技术有限公司 Method, device and system for analyzing channel drainage effect in website
CN107886382B (en) * 2016-09-29 2021-11-30 北京京东尚科信息技术有限公司 Method, device and system for analyzing channel drainage effect in website
CN109144834A (en) * 2017-06-27 2019-01-04 深圳市Tcl高新技术开发有限公司 User behavior data acquisition method and device, Android system and terminal equipment
CN109145194A (en) * 2017-06-27 2019-01-04 北京国双科技有限公司 Method and device for collecting user behavior data
CN109144834B (en) * 2017-06-27 2021-11-23 深圳市Tcl高新技术开发有限公司 User behavior data acquisition method and device, Android system and terminal equipment
CN109558449A (en) * 2018-10-18 2019-04-02 北京新唐思创教育科技有限公司 Data processing platform and data processing method
CN109558449B (en) * 2018-10-18 2022-02-08 北京新唐思创教育科技有限公司 Data processing platform and data processing method
CN111245880A (en) * 2018-11-29 2020-06-05 中国移动通信集团山东有限公司 Behavior trajectory reconstruction-based user experience monitoring method and device
CN111277615A (en) * 2018-12-04 2020-06-12 阿里巴巴集团控股有限公司 User behavior tracking method based on browser, terminal device and server
CN111277615B (en) * 2018-12-04 2022-01-11 阿里巴巴集团控股有限公司 User behavior tracking method based on browser, terminal device and server
CN112199263A (en) * 2020-09-30 2021-01-08 北京字节跳动网络技术有限公司 Method, device, equipment and medium for recording page

Also Published As

Publication number Publication date
CN105721578B (en) 2019-05-24

Similar Documents

Publication Publication Date Title
CN105721578A (en) User behavior data collection method and system
CN102663062B (en) Method and device for processing invalid links in search result
US8180376B1 (en) Mobile analytics tracking and reporting
CN105069087A (en) Web log data mining based website optimization method
CN102521251A (en) Method for directly realizing personalized search, device for realizing method, and search server
US20130185429A1 (en) Processing Store Visiting Data
WO2014099928A2 (en) Customer segmentation
CN104182506A (en) Log management method
CN110163654B (en) Advertisement delivery data tracking method and system
CN103309884A (en) User behavior data collecting method and system
Langhnoja et al. Pre-processing: procedure on web log file for web usage mining
CN102158365A (en) User clustering method and system in weblog mining
WO2017124692A1 (en) Method and apparatus for searching for conversion relationship between form pages and target pages
Lakshmi et al. An overview of preprocessing on web log data for web usage analysis
CN105721519B (en) Webpage data acquisition method, apparatus and system
JP2011515754A (en) URL providing method and system capable of new advertisement
Chitraa et al. An efficient path completion technique for web log mining
Eltahir et al. Extracting knowledge from web server logs using web usage mining
Santhanakumar et al. Web usage based analysis of web pages using rapidminer
Suguna et al. User interest level based preprocessing algorithms using web usage mining
CN106815248A (en) Web analytics method and device
CN111882368B (en) On-line advertisement DPI encryption buried point and transparent transmission tracking method
Domingues et al. A data warehouse for web intelligence
KR20220108590A (en) medical tourism smart marketing method by using SNS platform
Nicholas et al. Evidence of user behaviour: deep log analysis

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant