CN105162822A - Website log data processing method and device

Info

Publication number
CN105162822A
Authority
CN
China
Prior art keywords
log file
web log
data collection
type
file data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510377886.6A
Other languages
Chinese (zh)
Inventor
郭美思
刘璧怡
吴楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Beijing Electronic Information Industry Co Ltd
Original Assignee
Inspur Beijing Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2015-06-30
Filing date
2015-06-30
Publication date
2015-12-16

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H04L67/025 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP] for remote control or remote monitoring of applications
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/02 Standardisation; Integration
    • H04L41/024 Standardisation; Integration using relational databases for representation of network management data, e.g. managing via structured query language [SQL]

Abstract

The invention provides a website log data processing method and device. The method comprises the following steps: a website log data collection module determines a data collection type according to the type of website accessed by an end user, and sends the collected access data corresponding to that data collection type to a website log data processing module; the website log data processing module processes the access data corresponding to the data collection type and outputs the result to a target storage area. The method achieves effective collection of the different access data corresponding to different website types, and provides important data support for website construction.

Description

A website log data processing method and device
Technical Field
The invention belongs to the field of log management, and in particular relates to a website log data processing method and device.
Background
The prior art discloses a website data analysis method and analysis system that can analyze whole-network data from the perspective of data streams. The method comprises: analyzing website log data to obtain access data streams, each of which records the order in which web pages were accessed; discarding access data streams that do not contain an important page, where an important page is a page that satisfies a predefined attribute; performing frequent-pattern mining on the remaining access data streams, which contain important pages, to obtain the m access data streams with the highest occurrence frequency together with the occurrence frequency of each; for these m access data streams, counting the number of times important pages occur in each stream and computing the length of each stream; and, using the occurrence frequency of each access data stream, the number of occurrences of important pages, and the stream length, calculating the "water" (the term as rendered in the source text) of each of the m access data streams.
That scheme discloses only how to analyze website log data that has already been collected; it does not disclose how to collect website log data effectively.
Summary of the Invention
To solve the above technical problem, the invention provides a website log data processing method and device.
To achieve the object of the invention, the invention provides a website log data processing method comprising the following steps: a website log data collection module determines a data collection type according to the type of website accessed by an end user, and sends the collected access data corresponding to the data collection type to a website log data processing module;
the website log data processing module processes the access data corresponding to the data collection type and outputs the result to a target storage area.
The invention also provides a website log data processing device comprising a website log data collection module and a website log data processing module, wherein the website log data collection module is connected to the website log data processing module;
the website log data collection module is configured to determine a data collection type according to the type of website accessed by an end user, and to send the collected access data corresponding to the data collection type to the website log data processing module;
the website log data processing module is configured to process the access data corresponding to the data collection type and to output the result to a target storage area.
With this scheme, the website log data collection module determines a data collection type according to the type of website accessed by the end user and sends the collected access data corresponding to the data collection type to the website log data processing module, and the website log data processing module processes that access data and outputs the result to the target storage area. The different access data corresponding to different types of websites is thereby collected effectively, which provides important data support for website construction.
In addition, the data collection types include access time, IP address, domain name, URL, page title, referrer, browser client, client language, visitor identifier, and website identifier; this variety of data types ensures comprehensive and accurate data collection.
Brief Description of the Drawings
The accompanying drawings provide a further understanding of the technical solution of the invention and form part of the specification. Together with the embodiments of the application, they serve to explain the technical solution of the invention and do not limit it.
Fig. 1 is a flow chart of the website log data processing method according to Embodiment 1 of the invention;
Fig. 2 is a structure diagram of the website log data processing device according to Embodiment 2 of the invention;
Fig. 3 is another structure diagram of the website log data processing device, according to Embodiment 3 of the invention;
Fig. 4 is a further structure diagram of the website log data processing device, according to Embodiment 4 of the invention.
Detailed Description
The invention is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments. Note that, provided they do not conflict, the embodiments in this application and the features within them may be combined with one another.
Fig. 1 is a flow chart of the website log data processing method according to Embodiment 1 of the invention, which comprises the following steps:
Step 101: the website log data collection module determines a data collection type according to the type of website accessed by the end user, and sends the collected access data corresponding to the data collection type to the website log data processing module;
Further, the website log data collection module collects the access data corresponding to the data collection type as follows:
the website log data collection module executes tracking code (an embedded data-collection snippet) that has been set up in advance, and thereby collects the access data corresponding to the data collection type.
Further, the tracking code is implemented as follows: a piece of JavaScript code is added to the page; this code dynamically creates a script tag whose src points to a separate JavaScript file, and that JavaScript file collects the access data corresponding to the data collection type.
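The dynamic script injection described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code used in the invention: the function name `loadTracker`, the `doc` parameter (passed in so that the function can also be exercised outside a browser), and the file URL `https://stats.example.com/tracker.js` are all hypothetical.

```javascript
// Hypothetical loader: creates a script tag at run time and points its src
// at a separate JavaScript file that performs the actual data collection.
function loadTracker(doc, trackerSrc) {
  const script = doc.createElement("script");
  script.async = true;      // do not block page rendering while the file loads
  script.src = trackerSrc;  // the separate JavaScript file (assumed URL)

  // Insert before the first existing script tag if there is one,
  // otherwise append to <head>.
  const first = doc.getElementsByTagName("script")[0];
  if (first && first.parentNode) {
    first.parentNode.insertBefore(script, first);
  } else {
    doc.head.appendChild(script);
  }
  return script;
}

// In a real page the snippet would be invoked with the global document, e.g.:
// loadTracker(document, "https://stats.example.com/tracker.js");
```

Loading the collection logic from a separate file keeps the snippet embedded in each page small, so the collection logic can presumably be updated in one place without editing every page.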
Further, the JavaScript file passes the collected access data corresponding to the data collection type to the website log data processing module as HTTP parameters.
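As a rough sketch of how the separate JavaScript file might gather fields and pass them as HTTP parameters: the function names `collectVisit` and `buildBeaconUrl`, the short parameter names, and the endpoint `https://stats.example.com/log.gif` are hypothetical. The IP address is left out here because browsers offer no standard property for it; this sketch assumes the receiving server reads it from the incoming request.

```javascript
// Gather the client-side fields from a window-like object (assumed field names).
function collectVisit(win, siteId, visitorId) {
  return {
    time: new Date().toISOString(),         // access time
    domain: win.location.hostname || "",    // domain name
    url: win.location.href || "",           // URL
    title: win.document.title || "",        // page title
    referrer: win.document.referrer || "",  // referrer
    ua: win.navigator.userAgent || "",      // browser client
    lang: win.navigator.language || "",     // client language
    visitor: visitorId,                     // visitor identifier
    site: siteId,                           // website identifier
  };
}

// Encode the fields as a query string appended to the collection endpoint.
function buildBeaconUrl(endpoint, fields) {
  const query = Object.keys(fields)
    .map((key) => encodeURIComponent(key) + "=" + encodeURIComponent(String(fields[key])))
    .join("&");
  return endpoint + "?" + query;
}

// In a browser, one common way to issue the request is an image load:
// new Image().src = buildBeaconUrl("https://stats.example.com/log.gif",
//                                  collectVisit(window, "site-1", "visitor-9"));
```

Each value is passed through `encodeURIComponent` so that characters such as `&`, `=`, and spaces inside a title or URL cannot corrupt the parameter list.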
Further, the data collection types include access time, IP address, domain name, URL, page title, referrer, browser client, client language, visitor identifier, and website identifier.
Further, the website log data collection module stores in advance a lookup table that maps website types to data collection types.
Further, the lookup table of website types and data collection types is, for example, as shown in Table 1:
Lookup table of website types and data collection types
Table 1
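The contents of Table 1 are not reproduced in this text, so the following is only a sketch of what such a lookup could look like. The website type names (`news`, `shop`), the mappings assigned to them, the short field keys, and the fallback to all ten types for an unknown website type are all hypothetical choices made for illustration.

```javascript
// The ten data collection types named in the text, as short hypothetical keys.
const ALL_COLLECTION_TYPES = [
  "time", "ip", "domain", "url", "title",
  "referrer", "ua", "lang", "visitor", "site",
];

// Hypothetical stand-in for Table 1: website type -> data collection types.
const COLLECTION_TABLE = {
  news: ["time", "url", "title", "referrer", "visitor", "site"],
  shop: ["time", "ip", "url", "title", "ua", "lang", "visitor", "site"],
};

// Look up the collection types for a website type; unknown types fall back
// to collecting everything (an assumption, not stated in the text).
function collectionTypesFor(siteType) {
  const known = Object.prototype.hasOwnProperty.call(COLLECTION_TABLE, siteType);
  return known ? COLLECTION_TABLE[siteType] : ALL_COLLECTION_TYPES;
}

// Keep only the fields of a record that the lookup says should be collected.
function pickCollected(record, types) {
  const out = {};
  for (const key of types) {
    if (key in record) out[key] = record[key];
  }
  return out;
}
```

For example, `pickCollected(record, collectionTypesFor("news"))` would drop the browser client and client language from a record before it is sent on.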
Step 102: the website log data processing module processes the access data corresponding to the data collection type and outputs the result to the target storage area.
Further, the website log data processing module processes the access data corresponding to the data collection type and outputs the result to the target storage area as follows:
the website log data processing module parses the HTTP parameters sent by the JavaScript file, sets the corresponding variables of the website log data format, records the access data corresponding to the data collection type in a log file, and outputs the log file to the target storage area.
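The parsing step above might look roughly like the following on the receiving side. This is a sketch under assumptions: the names `parseBeaconRequest` and `formatLogLine`, the parameter keys, the tab-separated line layout, and the `-` placeholder for missing values are hypothetical, and the text does not specify the actual log format.

```javascript
// Assumed order of the fields in one log line.
const LOG_FIELDS = [
  "time", "ip", "domain", "url", "title",
  "referrer", "ua", "lang", "visitor", "site",
];

// Parse the HTTP parameters of a request URL into one variable per log field.
// clientIp is assumed to come from the server's view of the connection.
function parseBeaconRequest(requestUrl, clientIp) {
  // The base URL is only needed so that relative request paths can be parsed.
  const params = new URL(requestUrl, "http://localhost").searchParams;
  const record = {};
  for (const field of LOG_FIELDS) {
    record[field] = params.get(field) || "-";
  }
  record.ip = clientIp || record.ip;
  return record;
}

// Render a record as one tab-separated line; tabs and line breaks inside
// values are replaced with spaces so they cannot break the line structure.
function formatLogLine(record) {
  return LOG_FIELDS
    .map((field) => String(record[field]).replace(/[\t\r\n]/g, " "))
    .join("\t");
}

// Writing to the target storage area could then be a simple append, e.g. in
// Node.js: require("fs").appendFileSync(logPath, formatLogLine(record) + "\n");
```

A fixed field order with a placeholder for absent values keeps every line the same width, which tends to make later batch analysis of the log file simpler.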
Fig. 2 is a structure diagram of the website log data processing device according to Embodiment 2 of the invention. The device comprises a website log data collection module 201 and a website log data processing module 202, wherein the website log data collection module 201 is connected to the website log data processing module 202;
the website log data collection module 201 is configured to determine a data collection type according to the type of website accessed by the end user, and to send the collected access data corresponding to the data collection type to the website log data processing module 202;
the website log data processing module 202 is configured to process the access data corresponding to the data collection type and to output the result to the target storage area.
Fig. 3 is another structure diagram of the website log data processing device, according to Embodiment 3 of the invention. The device further comprises a setting module 200, wherein the setting module 200 is connected to the website log data collection module 201;
the setting module 200 is configured to set the data collection types and to send the data collection type information to the website log data collection module 201, wherein the data collection types include access time, IP address, domain name, URL, page title, referrer, browser client, client language, visitor identifier, and website identifier.
Fig. 4 is a further structure diagram of the website log data processing device, according to Embodiment 4 of the invention. The device further comprises a storage module 203, wherein the storage module 203 is connected to the website log data processing module 202;
the website log data processing module 202 is configured to process the access data corresponding to the data collection type and to output the result to the storage module 203 for storage.
With this scheme, the website log data collection module determines a data collection type according to the type of website accessed by the end user and sends the collected access data corresponding to the data collection type to the website log data processing module, and the website log data processing module processes that access data and outputs the result to the target storage area. The different access data corresponding to different types of websites is thereby collected effectively, which provides important data support for website construction.
In addition, the data collection types include access time, IP address, domain name, URL, page title, referrer, browser client, client language, visitor identifier, and website identifier; this variety of data types ensures comprehensive and accurate data collection.
The foregoing describes only preferred embodiments of the invention and is not intended to limit the invention; for those skilled in the art, the invention may have various modifications and variations. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within the scope of protection of the invention.

Claims (10)

1. A website log data processing method, characterized by comprising the following steps:
a website log data collection module determines a data collection type according to the type of website accessed by an end user, and sends the collected access data corresponding to the data collection type to a website log data processing module;
the website log data processing module processes the access data corresponding to the data collection type and outputs the result to a target storage area.
2. The method according to claim 1, characterized in that the website log data collection module collects the access data corresponding to the data collection type as follows:
the website log data collection module executes preset tracking code to collect the access data corresponding to the data collection type.
3. The method according to claim 2, characterized in that the tracking code is implemented as follows: a piece of JavaScript code is added to the page; this code dynamically creates a script tag whose src points to a separate JavaScript file, and that JavaScript file collects the access data corresponding to the data collection type.
4. The method according to claim 3, characterized in that the JavaScript file passes the collected access data corresponding to the data collection type to the website log data processing module as HTTP parameters.
5. The method according to claim 1, characterized in that the data collection types include access time, IP address, domain name, URL, page title, referrer, browser client, client language, visitor identifier, and website identifier.
6. The method according to claim 1, characterized in that the website log data collection module stores in advance a lookup table that maps website types to data collection types.
7. The method according to claim 1, characterized in that the website log data processing module processes the access data corresponding to the data collection type and outputs the result to the target storage area as follows:
the website log data processing module parses the HTTP parameters sent by the JavaScript file, sets the corresponding variables of the website log data format, records the access data corresponding to the data collection type in a log file, and outputs the log file to the target storage area.
8. A website log data processing device, characterized by comprising a website log data collection module and a website log data processing module, wherein the website log data collection module is connected to the website log data processing module;
the website log data collection module is configured to determine a data collection type according to the type of website accessed by an end user, and to send the collected access data corresponding to the data collection type to the website log data processing module;
the website log data processing module is configured to process the access data corresponding to the data collection type and to output the result to a target storage area.
9. The website log data processing device according to claim 8, characterized by further comprising a setting module, wherein the setting module is connected to the website log data collection module;
the setting module is configured to set the data collection types and to send the data collection type information to the website log data collection module, wherein the data collection types include access time, IP address, domain name, URL, page title, referrer, browser client, client language, visitor identifier, and website identifier.
10. The website log data processing device according to claim 8, characterized by further comprising a storage module, wherein the storage module is connected to the website log data processing module;
the website log data processing module is configured to process the access data corresponding to the data collection type and to output the result to the storage module for storage.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510377886.6A CN105162822A (en) 2015-06-30 2015-06-30 Website log data processing method and device

Publications (1)

Publication Number Publication Date
CN105162822A true CN105162822A (en) 2015-12-16

Family

ID=54803576

Country Status (1)

Country Link
CN (1) CN105162822A (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006107314A1 (en) * 2005-03-30 2006-10-12 Google, Inc. Adjusting an advertising cost, such as a per-ad impression cost, using a likelihood that the ad will be sensed or perceived by users
CN101038596A (en) * 2007-04-29 2007-09-19 北京搜狗科技发展有限公司 Method and system for classifying website
CN101118553A (en) * 2007-08-09 2008-02-06 姜边 Internet information acquisition method facing field and oriented by policy
CN101159592A (en) * 2007-08-10 2008-04-09 北大方正集团有限公司 Statistical method and device of internet data information clicking rates
CN101551806A (en) * 2008-04-03 2009-10-07 北京搜狗科技发展有限公司 Personalized website navigation method and system
EP2417540A1 (en) * 2009-04-08 2012-02-15 Google, Inc. Generating improved document classification data using historical search results
CN103678422A (en) * 2012-09-25 2014-03-26 北京亿赞普网络技术有限公司 Web page classification method and device and training method and device of web page classifier
CN103412890A (en) * 2013-07-19 2013-11-27 北京亿赞普网络技术有限公司 Webpage loading method and device

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017167042A1 (en) * 2016-04-01 2017-10-05 阿里巴巴集团控股有限公司 Statistical method and apparatus for behaviors of front-end users
CN107295050A (en) * 2016-04-01 2017-10-24 阿里巴巴集团控股有限公司 Front end user behavioral statisticses method and device
CN107295050B (en) * 2016-04-01 2021-05-11 阿里巴巴集团控股有限公司 Front-end user behavior statistical method and device
TWI753887B (en) * 2016-04-01 2022-02-01 香港商阿里巴巴集團服務有限公司 Front-end user behavior statistics method and device
CN106204238A (en) * 2016-07-19 2016-12-07 荆伟 A kind of merchandise display system and method
CN106469185A (en) * 2016-08-29 2017-03-01 浪潮电子信息产业股份有限公司 A kind of method carrying out data collection in website statistics
CN108920948A (en) * 2018-05-25 2018-11-30 众安信息技术服务有限公司 A kind of anti-fraud streaming computing device and method
CN108921400A (en) * 2018-06-14 2018-11-30 万翼科技有限公司 Statistical method, server and the storage medium of house property information
CN110830321A (en) * 2018-08-13 2020-02-21 阿里巴巴集团控股有限公司 Website detection scheduling method and device, storage medium and system
CN117473200A (en) * 2023-12-26 2024-01-30 天津戎行集团有限公司 Comprehensive acquisition and analysis method for website information data
CN117473200B (en) * 2023-12-26 2024-03-08 天津戎行集团有限公司 Comprehensive acquisition and analysis method for website information data

Similar Documents

Publication Publication Date Title
CN105162822A (en) Website log data processing method and device
CN101370024B (en) Distributed information collection method and system
CN106095979B (en) URL merging processing method and device
CN103237094B (en) A kind of method and device identifying user
CN104050281A (en) Webpage information extraction method and device based on http protocol
CN103744985A (en) Webpage adaption method and webpage adaption system
CN102486799B (en) World wide web (WWW) page processing method and device
US20130185429A1 (en) Processing Store Visiting Data
CN103823792B (en) Method and equipment for detecting hotspot events from text document
CN103744856A (en) Method, device and system for linkage extended search
CN105069087A (en) Web log data mining based website optimization method
CN103617266A (en) Personalized extension search method, device and system
CN104572934B (en) A kind of webpage key content abstracting method based on DOM
CN103823811A (en) Method and system for processing journals
CN104111836A (en) Method for collecting and processing asynchronous loading data by network
CN103631957A (en) Statistical method and device for visitor behavior data
CN103729479A (en) Web page content statistical method and system based on distributed file storage
CN108536700A (en) A kind of method that nothing buries a collector journal
CN106302849A (en) A kind of method carrying out moving solid fusion by carrier data
CN103488675A (en) Automatic precise extraction device for multi-webpage news comment contents
CN106446055B (en) Webpage generation method and system
CN109862074B (en) Data acquisition method and device, readable medium and electronic equipment
CN105893584A (en) Method, client and system for displaying website label of favorites
CN104539452B (en) A kind of method that statistics Web applications access regional characteristic
CN108108381B (en) Page monitoring method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20151216