CN105162822A - Website log data processing method and device - Google Patents
- Publication number: CN105162822A
- Application number: CN201510377886.6A
- Authority: CN (China)
- Prior art keywords: log file, web log, data collection, type, file data
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.): Pending
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/02—Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
- H04L67/025—Protocols based on web technology, e.g. hypertext transfer protocol [HTTP] for remote control or remote monitoring of applications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/02—Standardisation; Integration
- H04L41/024—Standardisation; Integration using relational databases for representation of network management data, e.g. managing via structured query language [SQL]
Abstract
The invention provides a website log data processing method and device. The method comprises the following steps: a website log data collection module determines a data collection type according to the type of website accessed by an end user, and sends the collected access data corresponding to that data collection type to a website log data processing module; the website log data processing module processes the access data corresponding to the data collection type and outputs the result to a target storage area. The method effectively collects the different access data corresponding to different website types and provides important data support for website construction.
Description
Technical field
The invention belongs to the field of log management and relates in particular to a website log data processing method and device.
Background art
The prior art discloses a website data analysis method and analysis system that can analyze whole-network data from the perspective of data flow. The method comprises: analyzing website log data to obtain access data streams, each of which records the order in which web pages were accessed; rejecting access data streams that do not contain an important page, where an important page is a page that meets predefined attributes; performing frequent-pattern mining on the remaining access data streams that contain important pages, to obtain the m access data streams with the highest frequency of occurrence and the frequency of occurrence of each; for those m access data streams, calculating the number of times important pages occur in each stream and the length of each stream; and using each stream's frequency of occurrence, its number of important-page occurrences, and its length to calculate the "water" of each of the m access data streams.
The above scheme discloses only how to analyze website log data that has already been collected; it does not disclose how to collect website log data effectively.
Summary of the invention
To solve the above technical problem, the invention provides a website log data processing method and device.
To achieve the object of the invention, the invention provides a website log data processing method comprising the following steps: a website log data collection module determines a data collection type according to the type of website accessed by the end user, and sends the collected access data corresponding to that data collection type to a website log data processing module;
the website log data processing module processes the access data corresponding to the data collection type and then outputs the result to a target storage area.
The invention also provides a website log data processing device comprising a website log data collection module and a website log data processing module, wherein the website log data collection module is connected with the website log data processing module;
the website log data collection module is configured to determine a data collection type according to the type of website accessed by the end user, and to send the collected access data corresponding to that data collection type to the website log data processing module;
the website log data processing module is configured to process the access data corresponding to the data collection type and then output the result to the target storage area.
In this scheme, the website log data collection module determines a data collection type according to the type of website accessed by the end user and sends the collected access data corresponding to that data collection type to the website log data processing module, which processes the data and then outputs the result to the target storage area. The different access data corresponding to different types of websites are thereby collected effectively, providing important data support for website construction.
In this scheme, the data collection types include access time, IP address, domain name, URL, page title, referrer, browser client, client language, visitor identifier, and website identifier. This variety of data types ensures that data collection is comprehensive and accurate.
Brief description of the drawings
The drawings provide a further understanding of the technical solution of the invention and form part of the specification. Together with the embodiments of the application, they serve to explain the technical solution of the invention and do not limit it.
Fig. 1 is a flow chart of the website log data processing method according to Embodiment 1 of the invention;
Fig. 2 is a structural diagram of the website log data processing device according to Embodiment 2 of the invention;
Fig. 3 is another structural diagram of the website log data processing device, according to Embodiment 3 of the invention;
Fig. 4 is a further structural diagram of the website log data processing device, according to Embodiment 4 of the invention.
Detailed description of the embodiments
The invention is described in detail below with reference to the drawings and in conjunction with the embodiments. Note that, where no conflict arises, the embodiments of the application and the features within them may be combined with one another.
Fig. 1 is a flow chart of the website log data processing method according to Embodiment 1 of the invention. The method comprises the following steps:
Step 101: The website log data collection module determines a data collection type according to the type of website accessed by the end user, and sends the collected access data corresponding to that data collection type to the website log data processing module.
Further, the website log data collection module collects the access data corresponding to the data collection type as follows:
the website log data collection module executes the tracking code it has been configured with to collect the access data corresponding to the data collection type.
Further, the tracking code is implemented as follows: a snippet of JavaScript code is added to the page; the snippet dynamically creates a script tag whose src points to a separate JavaScript file, and that JavaScript file collects the access data corresponding to the data collection type.
Further, the JavaScript file passes the collected access data corresponding to the data collection type to the website log data processing module as HTTP parameters.
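The tracking code described above can be sketched roughly as follows. This is a minimal illustration rather than the patent's actual code: the function names, the parameter names, and the idea of passing `document`, `navigator`, and `location` in as arguments are all assumptions made here for readability. The IP address is left out because a collection endpoint would normally read it from the incoming connection, and the visitor identifier is left out because the text does not say how it is produced.

```javascript
// Page snippet (hypothetical): dynamically create a script tag and point its
// src at a separate JavaScript file. The file path is chosen by the caller.
function loadTracker(doc, src) {
  const script = doc.createElement('script');
  script.async = true;
  script.src = src;
  doc.head.appendChild(script);
  return script;
}

// Separate file (hypothetical): gather the fields that the browser can see.
// The field names are illustrative, not taken from the source.
function collectFields(doc, nav, loc, siteId, now) {
  return {
    time: now,               // access time
    domain: loc.hostname,    // domain name
    url: loc.href,           // URL
    title: doc.title,        // page title
    referrer: doc.referrer,  // referrer
    userAgent: nav.userAgent, // browser client
    lang: nav.language,      // client language
    siteId: siteId,          // website identifier
  };
}

// Encode the collected fields as HTTP query parameters on the endpoint URL.
function buildBeaconUrl(endpoint, fields) {
  const query = Object.keys(fields)
    .map((key) => encodeURIComponent(key) + '=' + encodeURIComponent(String(fields[key])))
    .join('&');
  return endpoint + '?' + query;
}
```

In a browser, the page snippet would call `loadTracker(document, ...)`, and the loaded file could request the URL returned by `buildBeaconUrl`, for example by assigning it to the `src` of a new `Image`. The endpoint address is whatever the processing module listens on; none is given in the text.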
Further, the data collection types include access time, IP address, domain name, URL, page title, referrer, browser client, client language, visitor identifier, and website identifier.
Further, the website log data collection module stores in advance a lookup table that maps website types to data collection types.
Further, an example of the lookup table that maps website types to data collection types is shown in Table 1:
Table 1: Lookup table of website types and data collection types
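The contents of Table 1 are not reproduced in this text, so the mapping below is purely illustrative. It sketches how a pre-stored lookup table could drive which fields are collected for a given website type. The website types (`news`, `shop`), the field lists, and the fallback are all assumptions, not values from the patent.

```javascript
// Hypothetical lookup table: the website types and field lists are
// illustrative assumptions, not the contents of Table 1.
const COLLECTION_TYPES_BY_SITE_TYPE = {
  news: ['time', 'ip', 'url', 'title', 'referrer'],
  shop: ['time', 'ip', 'url', 'title', 'referrer', 'visitorId', 'siteId'],
};

// Fallback used when a website type has no entry (also an assumption).
const DEFAULT_COLLECTION_TYPES = ['time', 'ip', 'url'];

// Look up which data collection types apply to a website type.
function collectionTypesFor(siteType) {
  const known = Object.prototype.hasOwnProperty.call(COLLECTION_TYPES_BY_SITE_TYPE, siteType);
  return known ? COLLECTION_TYPES_BY_SITE_TYPE[siteType] : DEFAULT_COLLECTION_TYPES;
}

// Keep only the fields that the lookup table selects for this website type.
function selectAccessData(siteType, allFields) {
  const selected = {};
  for (const type of collectionTypesFor(siteType)) {
    if (type in allFields) selected[type] = allFields[type];
  }
  return selected;
}
```

Keeping the table as plain data means a new website type can be supported by adding one entry, without touching the collection logic.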
Step 102: The website log data processing module processes the access data corresponding to the data collection type and then outputs the result to the target storage area.
Further, the website log data processing module processes the access data corresponding to the data collection type and outputs the result to the target storage area as follows:
the website log data processing module parses the HTTP parameter information passed by the JavaScript file, sets the variables of the corresponding website log data format, records the access data corresponding to the data collection type in a log file, and outputs the log file to the target storage area.
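The processing step can be sketched along these lines, under several assumptions: the log format is taken to be one tab-separated line per request, missing values are written as `-`, the parameter names match the field names used here, and the "target storage area" is represented by any object with a `write` method (in Node.js, a writable file stream would fit). None of these details come from the patent text.

```javascript
// Hypothetical field order for one log line; the names are assumptions.
const LOG_FIELDS = ['time', 'ip', 'domain', 'url', 'title', 'referrer',
  'userAgent', 'lang', 'visitorId', 'siteId'];

// Parse the HTTP parameters of a request URL into named log variables.
// The IP address comes from the connection, not from the parameters.
function parseBeacon(requestUrl, remoteIp) {
  const params = new URL(requestUrl, 'http://localhost').searchParams;
  const record = {};
  for (const name of LOG_FIELDS) {
    record[name] = name === 'ip' ? (remoteIp || '-') : (params.get(name) || '-');
  }
  return record;
}

// Format a record as one tab-separated line, replacing characters that
// would break the line structure with spaces.
function formatLogLine(record) {
  return LOG_FIELDS
    .map((name) => String(record[name]).replace(/[\t\r\n]/g, ' '))
    .join('\t');
}

// Append the formatted line to the target sink.
function writeLogLine(sink, record) {
  sink.write(formatLogLine(record) + '\n');
}
```

A fixed field order keeps every log line the same shape, which makes the file easy to load into downstream analysis tools.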
Fig. 2 is a structural diagram of the website log data processing device according to Embodiment 2 of the invention. The device comprises a website log data collection module 201 and a website log data processing module 202, wherein the website log data collection module 201 is connected with the website log data processing module 202;
the website log data collection module 201 is configured to determine a data collection type according to the type of website accessed by the end user, and to send the collected access data corresponding to that data collection type to the website log data processing module 202;
the website log data processing module 202 is configured to process the access data corresponding to the data collection type and then output the result to the target storage area.
Fig. 3 is another structural diagram of the website log data processing device, according to Embodiment 3 of the invention. The device further comprises a settings module 200, wherein the settings module 200 is connected with the website log data collection module 201;
the settings module 200 is configured to set the data collection type and to send the data collection type information to the website log data collection module 201, wherein the data collection types include access time, IP address, domain name, URL, page title, referrer, browser client, client language, visitor identifier, and website identifier.
Fig. 4 is a further structural diagram of the website log data processing device, according to Embodiment 4 of the invention. The device further comprises a storage module 203, wherein the storage module 203 is connected with the website log data processing module 202;
the website log data processing module 202 is configured to process the access data corresponding to the data collection type and then output the result to the storage module 203 for storage.
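As a rough picture of how modules 200 to 203 connect, the sketch below wires four small objects together in the order the embodiments describe: settings, collection, processing, storage. The factory names, the JSON serialization, and the in-memory storage are assumptions chosen to keep the example short; the patent does not specify any of them.

```javascript
// Settings module (200): holds the data collection types to use.
function createSettingsModule(types) {
  return { getTypes: () => types.slice() };
}

// Storage module (203): here just an in-memory list of stored rows.
function createStorageModule() {
  const rows = [];
  return { rows, store: (row) => rows.push(row) };
}

// Processing module (202): serializes the data and hands it to storage.
function createProcessingModule(storage) {
  return { process: (data) => storage.store(JSON.stringify(data)) };
}

// Collection module (201): keeps the configured types and forwards them.
function createCollectionModule(settings, processing) {
  return {
    collect(visit) {
      const data = {};
      for (const type of settings.getTypes()) {
        if (type in visit) data[type] = visit[type];
      }
      processing.process(data);
    },
  };
}
```

For example, with settings of `['time', 'url']`, a visit object that also carries a page title reaches storage with only its time and URL.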
In this scheme, the website log data collection module determines a data collection type according to the type of website accessed by the end user and sends the collected access data corresponding to that data collection type to the website log data processing module, which processes the data and then outputs the result to the target storage area. The different access data corresponding to different types of websites are thereby collected effectively, providing important data support for website construction.
In this scheme, the data collection types include access time, IP address, domain name, URL, page title, referrer, browser client, client language, visitor identifier, and website identifier. This variety of data types ensures that data collection is comprehensive and accurate.
The foregoing describes only preferred embodiments of the invention and does not limit the invention; for those skilled in the art, the invention may be modified and varied in many ways. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.
Claims (10)
1. A website log data processing method, characterized in that it comprises the following steps:
a website log data collection module determines a data collection type according to the type of website accessed by the end user, and sends the collected access data corresponding to that data collection type to a website log data processing module;
the website log data processing module processes the access data corresponding to the data collection type and then outputs the result to a target storage area.
2. The method according to claim 1, characterized in that the website log data collection module collects the access data corresponding to the data collection type as follows:
the website log data collection module executes the tracking code it has been configured with to collect the access data corresponding to the data collection type.
3. The method according to claim 2, characterized in that the tracking code is implemented as follows: a snippet of JavaScript code is added to the page; the snippet dynamically creates a script tag whose src points to a separate JavaScript file, and that JavaScript file collects the access data corresponding to the data collection type.
4. The method according to claim 3, characterized in that the JavaScript file passes the collected access data corresponding to the data collection type to the website log data processing module as HTTP parameters.
5. The method according to claim 1, characterized in that the data collection types include access time, IP address, domain name, URL, page title, referrer, browser client, client language, visitor identifier, and website identifier.
6. The method according to claim 1, characterized in that the website log data collection module stores in advance a lookup table that maps website types to data collection types.
7. The method according to claim 1, characterized in that the website log data processing module processes the access data corresponding to the data collection type and outputs the result to the target storage area as follows:
the website log data processing module parses the HTTP parameter information passed by the JavaScript file, sets the variables of the corresponding website log data format, records the access data corresponding to the data collection type in a log file, and outputs the log file to the target storage area.
8. A website log data processing device, characterized in that it comprises a website log data collection module and a website log data processing module, wherein the website log data collection module is connected with the website log data processing module;
the website log data collection module is configured to determine a data collection type according to the type of website accessed by the end user, and to send the collected access data corresponding to that data collection type to the website log data processing module;
the website log data processing module is configured to process the access data corresponding to the data collection type and then output the result to a target storage area.
9. The website log data processing device according to claim 8, characterized in that it further comprises a settings module, wherein the settings module is connected with the website log data collection module;
the settings module is configured to set the data collection type and to send the data collection type information to the website log data collection module, wherein the data collection types include access time, IP address, domain name, URL, page title, referrer, browser client, client language, visitor identifier, and website identifier.
10. The website log data processing device according to claim 8, characterized in that it further comprises a storage module, wherein the storage module is connected with the website log data processing module;
the website log data processing module is configured to process the access data corresponding to the data collection type and then output the result to the storage module for storage.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510377886.6A CN105162822A (en) | 2015-06-30 | 2015-06-30 | Website log data processing method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105162822A (en) | 2015-12-16 |
Family
ID=54803576
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510377886.6A (Pending; published as CN105162822A) | Website log data processing method and device | 2015-06-30 | 2015-06-30 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105162822A (en) |
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2006107314A1 (en) * | 2005-03-30 | 2006-10-12 | Google, Inc. | Adjusting an advertising cost, such as a per-ad impression cost, using a likelihood that the ad will be sensed or perceived by users |
CN101038596A (en) * | 2007-04-29 | 2007-09-19 | 北京搜狗科技发展有限公司 | Method and system for classifying website |
CN101118553A (en) * | 2007-08-09 | 2008-02-06 | 姜边 | Internet information acquisition method facing field and oriented by policy |
CN101159592A (en) * | 2007-08-10 | 2008-04-09 | 北大方正集团有限公司 | Statistical method and device of internet data information clicking rates |
CN101551806A (en) * | 2008-04-03 | 2009-10-07 | 北京搜狗科技发展有限公司 | Personalized website navigation method and system |
EP2417540A1 (en) * | 2009-04-08 | 2012-02-15 | Google, Inc. | Generating improved document classification data using historical search results |
CN103678422A (en) * | 2012-09-25 | 2014-03-26 | 北京亿赞普网络技术有限公司 | Web page classification method and device and training method and device of web page classifier |
CN103412890A (en) * | 2013-07-19 | 2013-11-27 | 北京亿赞普网络技术有限公司 | Webpage loading method and device |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017167042A1 (en) * | 2016-04-01 | 2017-10-05 | 阿里巴巴集团控股有限公司 | Statistical method and apparatus for behaviors of front-end users |
CN107295050A (en) * | 2016-04-01 | 2017-10-24 | 阿里巴巴集团控股有限公司 | Front end user behavioral statisticses method and device |
CN107295050B (en) * | 2016-04-01 | 2021-05-11 | 阿里巴巴集团控股有限公司 | Front-end user behavior statistical method and device |
TWI753887B (en) * | 2016-04-01 | 2022-02-01 | 香港商阿里巴巴集團服務有限公司 | Front-end user behavior statistics method and device |
CN106204238A (en) * | 2016-07-19 | 2016-12-07 | 荆伟 | A kind of merchandise display system and method |
CN106469185A (en) * | 2016-08-29 | 2017-03-01 | 浪潮电子信息产业股份有限公司 | A kind of method carrying out data collection in website statistics |
CN108920948A (en) * | 2018-05-25 | 2018-11-30 | 众安信息技术服务有限公司 | A kind of anti-fraud streaming computing device and method |
CN108921400A (en) * | 2018-06-14 | 2018-11-30 | 万翼科技有限公司 | Statistical method, server and the storage medium of house property information |
CN110830321A (en) * | 2018-08-13 | 2020-02-21 | 阿里巴巴集团控股有限公司 | Website detection scheduling method and device, storage medium and system |
CN117473200A (en) * | 2023-12-26 | 2024-01-30 | 天津戎行集团有限公司 | Comprehensive acquisition and analysis method for website information data |
CN117473200B (en) * | 2023-12-26 | 2024-03-08 | 天津戎行集团有限公司 | Comprehensive acquisition and analysis method for website information data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20151216 |