CN107480277A - Method and device for web log file collection - Google Patents

Method and device for web log file collection

Info

Publication number
CN107480277A
Authority
CN
China
Prior art keywords
client
message
log request
processed
log
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710722629.0A
Other languages
Chinese (zh)
Other versions
CN107480277B (en)
Inventor
焦文健
安海雄
李双义
王海旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201710722629.0A
Publication of CN107480277A
Application granted
Publication of CN107480277B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 - Details of database functions independent of the retrieved data types
    • G06F16/95 - Retrieval from the web
    • G06F16/958 - Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application discloses a method and device for web log collection. The method includes: obtaining a log request through the browser of a client; determining, on the client and according to a predetermined rule, whether the log request needs to be pre-processed; pre-processing the log request that needs pre-processing to generate a report message; and sending the report message to a predetermined server through the client. The disclosed method and device for web log collection can improve working efficiency, save bandwidth, and reduce resource consumption on the server side.

Description

Method and device for web log file collection
Technical field
The present invention relates to the field of computer information processing, and in particular to a method and device for web log collection.
Background art
In the era of big data, analyzing website user behavior data has become standard practice for enterprises. The mainstream data collection approach is based on a JavaScript (JS) browser/server (B/S) architecture. Early website statistics collected only users' browsing behavior. With the widespread use of Ajax and the growing demand for fine-grained user operations, there is more and more demand for collecting other kinds of behavior logs, such as click events and custom events. As the reporting frequency increases, however, multiple logs carry more and more duplicate information, the load on back-end servers keeps growing, and downstream log parsing and metric computation become more complex.
In the prior art, the common log collection approaches are page-tag tracking and codeless tracking. Page-tag tracking: the website includes a JS snippet that reports browsing data. To report click data or custom log data for a specific position, a code tag must be added at the corresponding page position so that the JS can listen for clicks there. Codeless tracking: the website includes a JS snippet that reports browsing data. Click data for specific positions needs no additional tags: the JS listens on the web page's Document Object Model (DOM) structure, captures click behavior on all clickable elements, and reports the full set of page click data. Positions of interest are subsequently subscribed to by selecting regions of the page, and the click data for those positions is displayed.
The drawback of page-tag tracking is that defining and deploying tags involves a heavy workload and a long cycle. Codeless tracking also has a drawback: reporting every element poses a major performance challenge for the log receiving server, and log processing and subsequent parsing also consume substantial resources.
Therefore, a new method and device for web log collection are needed.
The above information disclosed in this Background section is only for enhancing understanding of the background of the invention, and it may therefore contain information that does not form prior art already known to a person of ordinary skill in the art.
Summary of the invention
In view of this, the present invention provides a method and device for web log collection that can improve working efficiency, save bandwidth, and reduce resource consumption on the server side.
Other features and advantages of the invention will become apparent from the following detailed description, or may be learned in part by practice of the invention.
According to one aspect of the invention, a method for web log collection is proposed. The method includes: obtaining a log request through the browser of a client; determining, on the client and according to a predetermined rule, whether the log request needs to be pre-processed; pre-processing the log request that needs pre-processing to generate a report message; and sending the report message to a predetermined server through the client.
In an exemplary embodiment of the disclosure, the method further includes: obtaining a policy message through the browser and caching the policy message locally on the client.
In an exemplary embodiment of the disclosure, obtaining the policy message through the browser of the client and caching it locally includes: obtaining the policy message when a log request is sent through the browser of the client, and caching the policy message locally.
In an exemplary embodiment of the disclosure, obtaining a log request through the browser of the client includes: loading a collection script through the browser; and, through the collection script, obtaining the corresponding log request when the user operates on a web page in the browser.
In an exemplary embodiment of the disclosure, determining on the client according to a predetermined rule whether the log request needs to be pre-processed includes: generating the predetermined rule from the policy message on the client; and determining according to the predetermined rule whether the log request needs to be pre-processed.
In an exemplary embodiment of the disclosure, pre-processing the log request that needs pre-processing to generate the report message includes: pre-processing the log request using the browser's localStorage to generate the report message.
In an exemplary embodiment of the disclosure, processing the log request that needs pre-processing to generate the report message includes at least one of the following: performing storage processing on the log request to generate the report message; performing filtering processing on the log request to generate the report message; and performing computation processing on the log request to generate the report message.
In an exemplary embodiment of the disclosure, performing storage processing on the log request that needs pre-processing to generate the report message includes: dividing the fields of the log request into public fields and variable fields; generating the report message from the public fields; and generating the report message from the variable fields.
In an exemplary embodiment of the disclosure, sending the report message to the predetermined server through the client includes: sending the report message to the predetermined server through the client when a trigger condition is met.
According to one aspect of the invention, a method for web log collection is proposed. The method includes: determining the content of a policy message; and transmitting the policy message to the client when a log request from the client is received.
According to one aspect of the invention, a device for web log collection is proposed. The device includes: a receiving module, configured to obtain a log request through the browser of a client; a determination module, configured to determine on the client, according to a predetermined rule, whether the log request needs to be pre-processed; a pre-processing module, configured to pre-process the log request that needs pre-processing and generate a report message; and a sending module, configured to send the report message to a predetermined server through the client.
In an exemplary embodiment of the disclosure, the pre-processing module includes: a storage processing submodule, configured to perform storage processing on the log request that needs pre-processing and generate the report message; a filtering processing submodule, configured to perform filtering processing on the log request that needs pre-processing and generate the report message; and a computation processing submodule, configured to perform computation processing on the log request that needs pre-processing and generate the report message.
According to one aspect of the invention, a device for web log collection is proposed. The device includes: a message module, configured to determine the content of a policy message; and a transmission module, configured to transmit the policy message to the client when a log request from the client is received.
According to one aspect of the invention, an electronic device is proposed. The electronic device includes: one or more processors; and a storage device for storing one or more programs. When the one or more programs are executed by the one or more processors, the one or more processors implement the method described above.
According to one aspect of the invention, a computer-readable medium is proposed, on which a computer program is stored, wherein the program implements the method described above when executed by a processor.
The method and device for web log collection according to the invention can improve working efficiency, save bandwidth, and reduce resource consumption on the server side.
It should be understood that the above general description and the following detailed description are exemplary only and do not limit the invention.
Brief description of the drawings
The above and other objects, features, and advantages of the invention will become more apparent from the detailed description of example embodiments with reference to the accompanying drawings. The drawings described below are only some embodiments of the invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a system architecture for web log collection according to an exemplary embodiment.
Fig. 2 is a schematic diagram of a prior-art method for web log collection, according to an exemplary embodiment.
Fig. 3 is a flowchart of a method for web log collection according to an exemplary embodiment.
Fig. 4 is a flowchart of a method for web log collection according to another exemplary embodiment.
Fig. 5 is a schematic diagram of a method for web log collection according to another exemplary embodiment.
Fig. 6 is a block diagram of a device for web log collection according to an exemplary embodiment.
Fig. 7 is a block diagram of a device for web log collection according to another exemplary embodiment.
Fig. 8 is a block diagram of an electronic device according to an exemplary embodiment.
Fig. 9 is a schematic diagram of a computer-readable medium according to an exemplary embodiment.
Detailed description of the embodiments
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments can, however, be implemented in many forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that the invention will be thorough and complete and will fully convey the concept of the example embodiments to those skilled in the art. The same reference numerals in the drawings denote the same or similar parts, so their repeated description is omitted.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, many specific details are provided to give a full understanding of the embodiments of the invention. Those skilled in the art will appreciate, however, that the technical solutions of the invention may be practiced without one or more of the specific details, or with other methods, components, devices, steps, and so on. In other cases, well-known methods, devices, implementations, or operations are not shown or described in detail, to avoid obscuring aspects of the invention.
The block diagrams shown in the drawings are merely functional entities and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The flowcharts shown in the drawings are merely illustrative. They need not include all contents and operations/steps, nor must the steps be performed in the order described. For example, some operations/steps may be decomposed and some may be combined or partially combined, so the actual order of execution may change according to the actual situation.
It should be understood that although the terms first, second, third, and so on may be used herein to describe various components, these components should not be limited by these terms. These terms are used to distinguish one component from another. Thus, a first component discussed below could be termed a second component without departing from the teachings of the disclosed concept. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
Those skilled in the art will understand that the drawings are schematic diagrams of example embodiments, and that the modules or flows in the drawings are not necessarily required to implement the invention and therefore cannot be used to limit its scope of protection.
Fig. 1 is a system architecture for web log collection according to an exemplary embodiment.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. The network 104 is the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links or fiber-optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as shopping applications, web browsers, search applications, instant messaging tools, email clients, and social platform software.
The terminal devices 101, 102, 103 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, laptop computers, and desktop computers.
The server 105 may be a server that provides various services, for example a back-end management server that supports the shopping App a user browses with the terminal devices 101, 102, 103. The back-end management server may analyze and otherwise process received data such as product information query requests, and feed the processing result (for example, of log processing) back to the terminal device.
It should be noted that the subsequent processing of log information provided by the embodiments of this application is generally performed by the server 105, and correspondingly the pre-processing of log information is generally performed in the terminal devices 101, 102, 103.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers, as the implementation requires.
Fig. 2 is a schematic diagram of a prior-art method for web log collection, according to an exemplary embodiment.
As shown in Fig. 2, in S202, the user accesses a page through a browser.
In S204, the page receives the user's request.
In S206, the page loads JavaScript. The JS obtains data, including terminal data, page data, visitor data, and traffic-source data; the visitor data and traffic-source data may be stored in cookies.
In S208, the JavaScript assembles the collected information into a log request and sends it to the log receiving server.
In S210, log verification is performed. The log receiving server verifies the log and writes it to a log file.
In S212, logs are fetched and parsed. The parsing program of the data warehouse pulls the log files and parses them into structured data, which makes it easier to compute application metrics.
With page-tag tracking, a piece of JS code must be deployed on the page. Once this tracking code has been added to the website, web browsing data is sent for each page the user visits, including information about the page viewed, the visitor, the traffic source, and the browser and operating system. To collect statistics on events, that is, on the user's interactions with content, event tags must be deployed in addition.
Codeless tracking likewise requires a piece of JS code to be deployed in the page. In addition to reporting browsing logs, this JS code analyzes the page's DOM structure and can report a log for every click, which removes the need to define and deploy a tag for each event. Taking GrowingIO as an example, once the code is deployed in the page, user behavior logs such as page views, element views, and clicks are collected automatically.
Example embodiments of the disclosure are described in detail below with reference to the accompanying drawings.
Fig. 3 is a flowchart of a method for web log collection according to an exemplary embodiment.
As shown in Fig. 3, in S302, a log request is obtained through the browser of the client. As described in the background, the user accesses a page through a browser and performs operations on it. In this embodiment the user may, for example, access the page through a browser on a mobile device or through a browser on a computer; the browser may be, for example, any existing browser developed by any company, and the invention is not limited in this respect. The log request is obtained through the browser the user is using. For example, a collection script may be loaded by the browser, and the collection script obtains the corresponding log request when the user operates on a web page in the browser.
In S304, the client determines according to a predetermined rule whether the log request needs to be pre-processed. For example, the client may generate the predetermined rule from a policy message and then determine according to that rule whether the log request needs to be pre-processed. The policy message may, for example, define filtering rules, storage rules, or other deletion or ignore rules; the invention is not limited in this respect.
In S306, the log request that needs pre-processing is pre-processed to generate a report message. The log request is pre-processed according to the content of the predetermined rule; for example, it may be pre-processed using the browser's localStorage to generate the report message. As further examples: storage processing may be performed on the log request, with the report message generated once a predetermined number of log requests have been stored; filtering processing may be performed to filter out click events that do not need to be counted, with the report message generated from the filtered result; and computation processing may be performed on the log request to generate the report message.
In S308, the report message is sent to the predetermined server through the client. The report message generated by pre-processing is sent to the predetermined server. In addition, according to the predetermined rule, logs that do not need pre-processing may be uploaded directly to the predetermined server; direct upload of logs to the server is already described for prior-art codeless tracking and is not repeated here.
According to the method for web log collection of the invention, log information is pre-processed locally in the browser and the pre-processed data is reported to the server. When web page logs are collected by codeless tracking, this improves working efficiency, saves bandwidth, and reduces resource consumption on the server side.
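The flow of S302 to S308 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `createCollector`, the shape of the `policy` object, and the `send` callback are hypothetical names, and an in-memory `Map` stands in for the browser's localStorage so that the example also runs outside a browser.

```javascript
// Hypothetical collector: buffers log requests whose type has a policy entry
// and reports them in a batch once the per-type threshold is reached.
function createCollector(policy, send) {
  const buffers = new Map(); // log type -> buffered log requests (stand-in for localStorage)
  return function collect(logRequest) {
    const typePolicy = policy[logRequest.type];
    if (!typePolicy) {
      // No pre-processing required for this type: report it directly.
      send([logRequest]);
      return;
    }
    const buffer = buffers.get(logRequest.type) || [];
    buffer.push(logRequest);
    if (buffer.length >= typePolicy.reportThreshold) {
      send(buffer); // threshold reached: report the whole batch
      buffers.set(logRequest.type, []);
    } else {
      buffers.set(logRequest.type, buffer);
    }
  };
}

// Usage: a threshold of 3 is chosen only to keep the example short.
const sent = [];
const collect = createCollector({ click: { reportThreshold: 3 } }, (batch) => sent.push(batch));
collect({ type: "pageview", url: "/" });
collect({ type: "click", elementId: "a" });
collect({ type: "click", elementId: "b" });
collect({ type: "click", elementId: "c" });
console.log(sent.length);    // 2
console.log(sent[1].length); // 3
```

The page view is reported immediately as its own message, while the three clicks travel together in a single report message.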
It should be clearly understood that the disclosure describes how to make and use particular examples, but the principles of the invention are not limited to any detail of these examples. Rather, based on the teachings of this disclosure, these principles can be applied to many other embodiments.
In an exemplary embodiment of the disclosure, determining on the client according to a predetermined rule whether the log request needs to be pre-processed includes: generating the predetermined rule from the policy message on the client; and determining according to the predetermined rule whether the log request needs to be pre-processed.
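One way to picture this step is to derive a rule object from the policy message and ask it whether a given log request needs pre-processing. The sketch below is only illustrative: the patent does not fix a format for the policy message, so `policyMessage`, its field names, and `buildRule` are all assumptions.

```javascript
// Hypothetical policy message, keyed by log type (format assumed for illustration).
const policyMessage = {
  click: { reportThreshold: 100, filterEnabled: true, computeEnabled: false },
  clickPath: { reportThreshold: 50, filterEnabled: true, computeEnabled: true },
};

// Derive the predetermined rule: a log request needs pre-processing
// when its type has an entry in the policy message.
function buildRule(policy) {
  const types = new Set(Object.keys(policy));
  return { needsPreprocessing: (logRequest) => types.has(logRequest.type) };
}

const rule = buildRule(policyMessage);
const clickLog = { type: "click", url: "/item/1" };
const pageViewLog = { type: "pageview", url: "/" };
console.log(rule.needsPreprocessing(clickLog));    // true
console.log(rule.needsPreprocessing(pageViewLog)); // false
```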
In an exemplary embodiment of the disclosure, processing the log request that needs pre-processing to generate the report message includes at least one of the following: performing storage processing on the log request to generate the report message; performing filtering processing on the log request to generate the report message; and performing computation processing on the log request to generate the report message.
The policy message may include, for example, the following policies:
Log type: the particular class of logs to be pre-processed. For example, if a log is reported for every click, the corresponding log type is the click log. A report count threshold may be set for the type: when the number of logs cached in the browser's localStorage reaches a specific quantity, the logs are reported.
A filtering rule switch may also be set: whether filtering of certain content in a particular log type is in effect. For example, a corresponding code module may be maintained in the JS so that clicks on certain URLs, or clicks on certain elements, are not reported.
A computation rule switch may also be set: whether certain computations on a particular log type are in effect. For example, a corresponding code module may be maintained in the JS so that visits matching a specific behavior path (starting from a recommendation slot on the home page, clicking through to a product detail page, following the product, and then adding it to the shopping cart) are identified and reported.
A policy execution priority may also be set, for example: localStorage is full or unavailable > the last session has ended > log count.
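The priority order above (storage full or unavailable, then session end, then log count) could be checked in a single function. The sketch below is an assumption-laden illustration: `reportTrigger` and the fields of `state` and `policy` are hypothetical names, and how a real script detects a full localStorage or an ended session is left out.

```javascript
// Returns the reason for reporting now, or null if buffering should continue.
// Conditions are tested in priority order: storage problems first,
// then session end, then the per-type log count threshold.
function reportTrigger(state, policy) {
  if (!state.storageAvailable || state.storageFull) return "storage";
  if (state.sessionEnded) return "session-end";
  if (state.bufferedCount >= policy.reportThreshold) return "count";
  return null;
}

const policy = { reportThreshold: 100 };
console.log(reportTrigger(
  { storageAvailable: true, storageFull: true, sessionEnded: true, bufferedCount: 500 }, policy
)); // "storage"
console.log(reportTrigger(
  { storageAvailable: true, storageFull: false, sessionEnded: false, bufferedCount: 99 }, policy
)); // null
```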
In an exemplary embodiment of the disclosure, performing storage processing on the log request that needs pre-processing to generate the report message includes: dividing the fields of the log request into public fields and variable fields; generating the report message from the public fields; and generating the report message from the variable fields. The method further includes: sending the report message to the predetermined server through the client when a trigger condition is met.
After the browser-side JS obtains the content of the policy center, it may cache that content locally in the browser. Each time a JS request is triggered, a check is made as to whether the current log type needs to be pre-processed.
Pre-processing may include, for example, storage pre-processing: if the policy center configuration sets a report count threshold for a given log type, the JS temporarily stores logs of that type in localStorage and reports them when the number of stored logs reaches the threshold. When logs are stored and reported, their fields are divided into public fields and variable fields, which reduces the bandwidth consumed in transmission.
Public fields: fields whose information does not change across multiple logs, such as browser terminal information, operating system, and visitor identifier.
Variable fields: fields whose information may change across multiple logs, such as request time, access sequence number, page URL, click event identifier, and traffic source.
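The split into public and variable fields might look like the sketch below, where the public fields are sent once per report message and only the variable fields are repeated per log. The field names, the `PUBLIC_FIELDS` list, and `buildReportMessage` are hypothetical; the sketch also assumes the public fields are identical across all buffered logs and simply takes them from the first one.

```javascript
// Assumed names for fields that stay constant across a batch of logs.
const PUBLIC_FIELDS = ["browser", "os", "visitorId"];

// Build one report message: public fields once, variable fields per log.
function buildReportMessage(logs) {
  if (logs.length === 0) return null;
  const common = {};
  for (const field of PUBLIC_FIELDS) common[field] = logs[0][field];
  const entries = logs.map((log) => {
    const variable = {};
    for (const [key, value] of Object.entries(log)) {
      if (!PUBLIC_FIELDS.includes(key)) variable[key] = value;
    }
    return variable;
  });
  return { common, entries };
}

const logs = [
  { browser: "Chrome", os: "Android", visitorId: "v-1", requestTime: 1000, pageUrl: "/item/1", eventId: "buy" },
  { browser: "Chrome", os: "Android", visitorId: "v-1", requestTime: 1500, pageUrl: "/item/2", eventId: "follow" },
];
const message = buildReportMessage(logs);
console.log(message.common.os);      // Android
console.log(message.entries.length); // 2
```

The saving grows with the batch size, since each additional log adds only its variable fields to the message.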
Filtering pre-processing: if the policy center configuration enables the filtering rule for a given log type, the JS filters logs of that type on the front end according to the rule, so that useless logs are not reported.
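A filtering rule guarded by a policy switch could be sketched as follows. The rule contents (`ignoredUrls`, `ignoredElements`), the `filterEnabled` switch name, and `applyFilter` are assumptions made for illustration; the source only says that clicks on certain URLs or elements are not reported.

```javascript
// Hypothetical filtering rules maintained in the collection script.
const filterRules = {
  ignoredUrls: ["/help"],
  ignoredElements: ["banner-close"],
};

// Drop logs matched by the rules, but only when the policy switch is on.
function applyFilter(logs, policy) {
  if (!policy.filterEnabled) return logs;
  return logs.filter(
    (log) =>
      !filterRules.ignoredUrls.includes(log.url) &&
      !filterRules.ignoredElements.includes(log.elementId)
  );
}

const clicks = [
  { url: "/item/1", elementId: "buy" },
  { url: "/help", elementId: "faq-link" },
  { url: "/item/2", elementId: "banner-close" },
];
console.log(applyFilter(clicks, { filterEnabled: true }).length);  // 1
console.log(applyFilter(clicks, { filterEnabled: false }).length); // 3
```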
Computation pre-processing: if the policy center configuration enables the computation rule for a given log type, the JS performs computation on logs of that type on the front end according to the rule and reports the result of the log processing, which reduces the complexity of subsequent computation.
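As one possible computation rule, the script could recognize a behavior path on the front end and report only the match. The sketch below assumes in-order matching that tolerates unrelated events between steps; the source does not say how paths are matched, and the event names and `matchesPath` are hypothetical.

```javascript
// Assumed event names for the path: home recommendation slot,
// product detail page, follow, add to cart.
const TARGET_PATH = ["home-recommend", "product-detail", "follow", "add-to-cart"];

// True when the target steps occur in order within the event sequence
// (other events may appear in between).
function matchesPath(events, target) {
  let next = 0;
  for (const event of events) {
    if (event === target[next]) next += 1;
    if (next === target.length) return true;
  }
  return false;
}

console.log(matchesPath(
  ["home-recommend", "search", "product-detail", "follow", "add-to-cart"], TARGET_PATH
)); // true
console.log(matchesPath(
  ["product-detail", "home-recommend", "follow", "add-to-cart"], TARGET_PATH
)); // false
```

Reporting a single "path matched" record instead of every intermediate click is what moves the computation from the server to the browser.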
In an exemplary embodiment of the disclosure, the method further includes: obtaining a policy message through the browser and caching the policy message locally on the client. For example, the policy message may be obtained when a log request is sent through the browser of the client and then cached locally: when the browser-side JS sends a request to the server, it obtains the policy information at the same time and caches it locally.
According to the method for web log collection of the invention, obtaining the policy message when the client sends a log request allows the content of the policy message to be updated promptly and effectively, so that the predetermined rule can be adjusted in time.
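Caching the policy that arrives with a log response might be sketched as below. The response shape (`response.policy`), the storage key, and the helper names are assumptions. The in-memory object mimics only the `getItem`/`setItem` part of the Web Storage API so the example runs outside a browser; in a page, `window.localStorage` could be passed in instead.

```javascript
// In-memory stand-in exposing the getItem/setItem subset of the Web Storage API.
function createMemoryStorage() {
  const data = new Map();
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => data.set(key, String(value)),
  };
}

const POLICY_KEY = "log_policy"; // hypothetical storage key

// Cache the policy that came back with the log response, if any.
function cachePolicy(storage, response) {
  if (response && response.policy) {
    storage.setItem(POLICY_KEY, JSON.stringify(response.policy));
  }
}

// Read the cached policy; an empty object means "nothing to pre-process".
function loadPolicy(storage) {
  const raw = storage.getItem(POLICY_KEY);
  return raw === null ? {} : JSON.parse(raw);
}

const storage = createMemoryStorage();
console.log(Object.keys(loadPolicy(storage)).length); // 0
cachePolicy(storage, { policy: { click: { reportThreshold: 100 } } });
console.log(loadPolicy(storage).click.reportThreshold); // 100
```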
Fig. 4 is a flowchart of a method for web log collection according to another exemplary embodiment.
As shown in Fig. 4, in S402, the policy message content is determined. So that changes take effect in the JS promptly, the policy message content may take the form of switches: the settings in the policy message on the server turn individual policies on the client on or off.
In S404, when a log request from the client is received, the policy message is transmitted to the client.
The server side may include, for example, a policy center. The policy center is configured with the specific pre-processing rules and guides the browser in processing logs. It may, for example, take the form shown in Table 1:
Log type | Report count threshold | Filtering rule switch | Computation rule switch | Modification time
Click | 100 | On | Off | XXXX
ClickPath | 50 | On | On | XXXX
The policy center maintains this basic information. When the JS sends a log request to the server, it obtains the policy information at the same time and caches it locally, and then pre-processes the particular log types according to the policy information. So that changes take effect in the JS immediately, the deployment may, for example, be as follows: some rules, such as filtering rules or computation rules, are written into the JS in advance and are turned on or off in the form of switches controlled by the policy center.
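On the server side, the policy center could be modeled as a table of rows that is turned into a policy message and returned alongside the log acknowledgement. The sketch below mirrors the two example rows of Table 1; the property names, the response shape, and `handleLogRequest` are hypothetical, and the modification time column is omitted because its values are not given.

```javascript
// Policy center rows mirroring Table 1 (modification time omitted: not specified).
const policyTable = [
  { logType: "Click", reportThreshold: 100, filterEnabled: true, computeEnabled: false },
  { logType: "ClickPath", reportThreshold: 50, filterEnabled: true, computeEnabled: true },
];

// Hypothetical handler: acknowledge the log request and attach the current policy,
// so the client can refresh its cached switches on every report.
function handleLogRequest(logRequest, table) {
  const policy = {};
  for (const row of table) {
    const { logType, ...switches } = row;
    policy[logType] = switches;
  }
  return { received: true, policy };
}

const response = handleLogRequest({ type: "Click", url: "/item/1" }, policyTable);
console.log(response.policy.ClickPath.computeEnabled); // true
console.log(response.policy.Click.reportThreshold);    // 100
```

Because the rule code already lives in the collection script, flipping a switch in the table changes client behavior on the next log request without redeploying the JS.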
Fig. 5 is a kind of schematic diagram of method for web log file collection according to another exemplary embodiment.
As shown in figure 5, in S502, user passes through the browser access page.As described in background introduction, user passes through clear Look at device accession page, and any operation is carried out to the page.
In S504, the page obtains the request of user.Browser page.
In S506, page loading JS.
In S508, policy check is carried out according to predetermined rule.According to check results, pass through browser end LocalStorage pre-processes to log request.Also include, the log request pre-processed to needs is carried out at storage Reason, reporting message is generated, including:Log request field is divided into public field and variable field;Generated by public field Report message;And reporting message is generated by variable field.In a kind of exemplary embodiment of the disclosure, to needing to carry out in advance The log request of processing is handled, generate reporting message, including following processing procedure at least one:Needs are pre-processed Log request carry out storage processing, generate reporting message;The log request pre-processed to needs carries out filtration treatment, raw Into reporting message;And the log request pre-processed to needs carries out calculating processing, reporting message is generated.
In S510, Thin Client Thick Server center is responsible for providing policy message content.
In S512, daily record verification is carried out.Daily record the reception server verifies to daily record, and lands into journal file.
In S514, log acquisition and parsing.The analysis program of data warehouse, which is pulled and parsed to journal file, to be added Work, switch to the data of structuring, be convenient for the statistics using index.
According to the method for being used for web log file and gathering of the invention, solve the long artificial cycle buried a little, flow complexity, work The problem of work amount is big.Avoiding full dose, to bury the server end request pressure a little brought big, daily record distribution and parsing complexity high Problem.By filtering and computation rule, reporting for invalid daily record is prevented from front end, has saved bandwidth resources, reduced service The resource consumption at device end.
Those skilled in the art will understand that all or part of the steps of the above embodiments may be implemented as a computer program executed by a CPU. When the computer program is executed by the CPU, it performs the functions defined by the above method provided by the present invention. The program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc, or the like.
In addition, it should be noted that the above drawings are only schematic illustrations of the processing included in the method according to the exemplary embodiments of the present invention and are not intended to be limiting. It is easy to understand that the processing shown in the drawings does not indicate or limit the chronological order of these processes. It is also easy to understand that these processes may be performed, for example, synchronously or asynchronously in multiple modules.
The following are device embodiments of the present invention, which can be used to carry out the method embodiments of the present invention. For details not disclosed in the device embodiments, refer to the method embodiments.
Fig. 6 is a block diagram of a device for website log collection according to an exemplary embodiment.
The receiving module 602 is used to obtain a log request through the browser of the client.
The judging module 604 is used to judge, at the client and according to a predetermined rule, whether the log request needs to be pre-processed.
The pre-processing module 606 is used to pre-process the log request that needs pre-processing and generate a report message.
The sending module 608 is used to send the report message to a predetermined server through the client.
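The cooperation of the four modules can be sketched as a single class. This is only an illustration under stated assumptions: the class name, the policy format (a mapping from event type to the actions "drop" or "buffer"), the batch-size trigger, and the injected send callable are all hypothetical and do not come from the source. Requests whose event type is not covered by the policy are treated as not needing pre-processing and are sent directly.

```python
from typing import Any, Callable, Dict, List


class LogCollector:
    """Hypothetical client-side pipeline: receive, judge, pre-process, send."""

    def __init__(
        self,
        policy: Dict[str, str],
        send: Callable[[List[Dict[str, Any]]], None],
        batch_size: int = 3,
    ) -> None:
        self.policy = policy          # event type -> "drop" or "buffer"
        self.send = send              # stands in for the upload to the server
        self.batch_size = batch_size  # assumed trigger condition for sending
        self.buffer: List[Dict[str, Any]] = []  # stands in for localStorage

    def needs_preprocessing(self, log_request: Dict[str, Any]) -> bool:
        """Judging step: only events named in the policy are pre-processed."""
        return log_request.get("event") in self.policy

    def handle(self, log_request: Dict[str, Any]) -> None:
        """Receiving step plus dispatch to filtering or storage."""
        if not self.needs_preprocessing(log_request):
            self.send([log_request])
            return
        if self.policy[log_request["event"]] == "drop":
            return  # filtering: the request is never reported
        self.buffer.append(log_request)  # storage: hold until the trigger fires
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        """Sending step: report everything buffered as one message."""
        if self.buffer:
            self.send(list(self.buffer))
            self.buffer.clear()
```

Passing the sender in as a callable keeps the sketch independent of any particular transport and makes the judging and pre-processing logic easy to exercise in isolation.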
In an exemplary embodiment of the present disclosure, the pre-processing module includes: a storage processing submodule (not shown in the figure), used to perform storage processing on the log request that needs pre-processing and generate the report message; a filtering submodule (not shown in the figure), used to perform filtering on the log request that needs pre-processing and generate the report message; and a calculation submodule (not shown in the figure), used to perform calculation on the log request that needs pre-processing and generate the report message.
According to the device for website log collection of the present invention, log messages are pre-processed locally in the browser and the pre-processed data is then reported to the server. When web page log messages are read using a technique that requires no manually added tracking points, this improves working efficiency, saves bandwidth, and reduces resource consumption on the server side.
Fig. 7 is a block diagram of a device for website log collection according to another exemplary embodiment.
The message module 702 is used to determine the policy message content.
The transmission module 704 is used to transmit the policy message to the client when a log request from the client is received.
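The server-side behavior of the message module and the transmission module can be sketched as follows. The class and method names are hypothetical, and the version number used to decide whether the client's cached policy is stale is an assumption added for the example; the source only states that the policy message is transmitted when a log request is received.

```python
from typing import Dict, Optional


class PolicyCenter:
    """Hypothetical server-side holder of the current policy message."""

    def __init__(self) -> None:
        self._version = 0
        self._rules: Dict[str, str] = {}

    def set_policy(self, rules: Dict[str, str]) -> None:
        """Message module: determine (replace) the policy message content."""
        self._rules = dict(rules)
        self._version += 1

    def on_log_request(self, client_version: int) -> Optional[dict]:
        """Transmission module: answer a log request with the policy message.

        Returns None when the client already caches the current version
        (an assumed optimization), otherwise the full policy message.
        """
        if client_version == self._version:
            return None
        return {"version": self._version, "rules": dict(self._rules)}
```

Attaching the policy to the response to a log request means a client that is actively reporting picks up policy changes without a separate polling channel.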
Fig. 8 is a block diagram of an electronic device according to an exemplary embodiment.
The electronic device 200 according to this embodiment of the present invention is described below with reference to Fig. 8. The electronic device 200 shown in Fig. 8 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
As shown in Fig. 8, the electronic device 200 takes the form of a general-purpose computing device. The components of the electronic device 200 may include, but are not limited to: at least one processing unit 210, at least one storage unit 220, a bus 230 connecting the different system components (including the storage unit 220 and the processing unit 210), a display unit 240, and so on.
The storage unit stores program code, which can be executed by the processing unit 210 so that the processing unit 210 performs the steps according to the various exemplary embodiments of the present invention described in the method section of this specification above. For example, the processing unit 210 may perform the steps shown in Fig. 3, Fig. 4, and Fig. 5.
The storage unit 220 may include readable media in the form of volatile storage units, such as a random access memory (RAM) unit 2201 and/or a cache storage unit 2202, and may further include a read-only memory (ROM) unit 2203.
The storage unit 220 may also include a program/utility 2204 having a set of (at least one) program modules 2205. Such program modules 2205 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data. Each of these examples, or some combination of them, may include an implementation of a network environment.
The bus 230 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
The electronic device 200 may also communicate with one or more external devices 300 (such as a keyboard, a pointing device, or a Bluetooth device), with one or more devices that enable a user to interact with the electronic device 200, and/or with any device (such as a router or a modem) that enables the electronic device 200 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 250. The electronic device 200 may also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 260. The network adapter 260 may communicate with the other modules of the electronic device 200 through the bus 230. It should be understood that, although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 200, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
From the above description of the embodiments, those skilled in the art will readily understand that the example embodiments described here may be implemented in software, or in software combined with the necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product. The software product may be stored on a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) or on a network, and includes a number of instructions that cause a computing device (such as a personal computer, a server, or a network device) to perform the above method according to the embodiments of the present disclosure.
Fig. 9 is a schematic diagram of a computer-readable medium according to an exemplary embodiment.
With reference to Fig. 9, a program product 400 for implementing the above method according to an embodiment of the present invention is described. It may take the form of a portable compact disc read-only memory (CD-ROM) containing program code, and may run on a terminal device such as a personal computer. However, the program product of the present invention is not limited to this. In this document, a readable storage medium may be any tangible medium that contains or stores a program, where the program may be used by or in connection with an instruction execution system, apparatus, or device.
The program product may use any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of readable storage media (a non-exhaustive list) include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
A readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on a readable medium may be transmitted over any suitable medium, including but not limited to wireless, wired, optical cable, RF, or any suitable combination of the above.
Program code for carrying out the operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, it may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computing device (for example, through the Internet using an Internet service provider).
The above computer-readable medium carries one or more programs which, when executed by a device, cause the device to implement the following functions: obtaining a log request through the browser of the client; judging, at the client and according to a predetermined rule, whether the log request needs to be pre-processed; pre-processing the log request that needs pre-processing to generate a report message; and sending the report message to a predetermined server through the client. Alternatively, the device implements the following functions: determining the policy message content; and, when a log request from a client is received, transmitting the policy message to the client.
Those of ordinary skill in the art will understand that aspects of the present invention may be implemented as a system, a method, or a program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, and so on), or an embodiment combining hardware and software aspects, which may be referred to collectively here as a "circuit", "module", or "system".
Those skilled in the art will understand that the above modules may be distributed in the device as described in the embodiments, or may be modified accordingly and located in one or more devices different from that of the present embodiment. The modules of the above embodiments may be merged into one module or further split into multiple submodules.
From the above description of the embodiments, those skilled in the art will readily understand that the example embodiments described here may be implemented in software, or in software combined with the necessary hardware. Therefore, the technical solution according to the embodiments of the present invention may be embodied in the form of a software product. The software product may be stored on a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) or on a network, and includes a number of instructions that cause a computing device (such as a personal computer, a server, a mobile terminal, or a network device) to perform the method according to the embodiments of the present invention.
From the above detailed description, those skilled in the art will readily understand that the method and device for website log collection according to the embodiments of the present invention have one or more of the following advantages.
According to some embodiments, in the method for website log collection of the present invention, log messages are pre-processed locally in the browser and the pre-processed data is then reported to the server. When web page log messages are read using a technique that requires no manually added tracking points, this improves working efficiency, saves bandwidth, and reduces resource consumption on the server side.
According to other embodiments, in the method for website log collection of the present invention, the policy message is obtained when the client sends a log request, so the content of the policy message can be updated promptly and effectively.
The exemplary embodiments of the present invention have been particularly shown and described above. It should be understood that the present invention is not limited to the detailed structures, arrangements, or implementation methods described here; on the contrary, the present invention is intended to cover the various modifications and equivalent arrangements included within the spirit and scope of the appended claims.
In addition, the structures, proportions, sizes, and the like shown in the drawings of this specification are intended only to accompany the content disclosed in the specification, for the understanding and reading of those skilled in the art, and are not intended to limit the conditions under which the present disclosure can be implemented; they therefore have no substantive technical significance. Any modification of structure, change of proportional relationship, or adjustment of size that does not affect the technical effects the present disclosure can produce or the purposes it can achieve shall still fall within the scope covered by the technical content disclosed here. Likewise, terms such as "upper", "first", "second", and "a" used in this specification are only for clarity of description and are not intended to limit the implementable scope of the present disclosure; changes or adjustments to their relative relationships, without substantive changes to the technical content, shall also be regarded as within the implementable scope of the present invention.

Claims (17)

  1. A method for website log collection, characterized by comprising:
    obtaining a log request through the browser of a client;
    judging, at the client and according to a predetermined rule, whether the log request needs to be pre-processed;
    pre-processing the log request that needs pre-processing to generate a report message; and
    sending the report message to a predetermined server through the client.
  2. The method of claim 1, characterized by further comprising:
    obtaining a policy message through the browser, and caching the policy message locally at the client.
  3. The method of claim 2, characterized in that obtaining the policy message through the browser of the client and caching the policy message locally comprises:
    obtaining the policy message when a log request is sent through the browser of the client, and caching the policy message locally.
  4. The method of claim 1, characterized in that obtaining the log request through the browser of the client comprises:
    loading a collection script through the browser;
    obtaining, through the collection script, the corresponding log request when the user operates on a web page in the browser.
  5. The method of claim 2, characterized in that judging, at the client and according to the predetermined rule, whether the log request needs to be pre-processed comprises:
    generating the predetermined rule from the policy message at the client; and
    judging, according to the predetermined rule, whether the log request needs to be pre-processed.
  6. The method of claim 1, characterized in that pre-processing the log request that needs pre-processing to generate the report message comprises:
    pre-processing the log request through the localStorage of the browser to generate the report message.
  7. The method of claim 1, characterized in that processing the log request that needs pre-processing to generate the report message comprises at least one of the following:
    performing storage processing on the log request that needs pre-processing to generate the report message;
    performing filtering on the log request that needs pre-processing to generate the report message; and
    performing calculation on the log request that needs pre-processing to generate the report message.
  8. The method of claim 7, characterized in that performing storage processing on the log request that needs pre-processing to generate the report message comprises:
    dividing the fields of the log request into public fields and variable fields;
    generating the report message from the public fields; and
    generating the report message from the variable fields.
  9. The method of claim 1, characterized in that sending the report message to the predetermined server through the client comprises:
    sending the report message to the predetermined server through the client when a trigger condition is met.
  10. A method for website log collection, characterized by comprising:
    determining policy message content; and
    transmitting the policy message to a client when a log request from the client is received.
  11. A device for website log collection, characterized by comprising:
    a receiving module, used to obtain a log request through the browser of a client;
    a judging module, used to judge, at the client and according to a predetermined rule, whether the log request needs to be pre-processed;
    a pre-processing module, used to pre-process the log request that needs pre-processing and generate a report message; and
    a sending module, used to send the report message to a predetermined server through the client.
  12. The device of claim 11, characterized in that the pre-processing module comprises:
    a storage processing submodule, used to perform storage processing on the log request that needs pre-processing and generate the report message;
    a filtering submodule, used to perform filtering on the log request that needs pre-processing and generate the report message; and
    a calculation submodule, used to perform calculation on the log request that needs pre-processing and generate the report message.
  13. A device for website log collection, characterized by comprising:
    a message module, used to determine policy message content; and
    a transmission module, used to transmit the policy message to a client when a log request from the client is received.
  14. An electronic device, characterized by comprising:
    one or more processors;
    a storage device, used to store one or more programs;
    wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method of any one of claims 1-9.
  15. A server, characterized by comprising:
    one or more processors;
    a storage device, used to store one or more programs;
    wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method of claim 10.
  16. A computer-readable medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method of any one of claims 1-9.
  17. A computer-readable medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method of claim 10.
CN201710722629.0A 2017-08-22 2017-08-22 Method and device for collecting website logs Active CN107480277B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710722629.0A CN107480277B (en) 2017-08-22 2017-08-22 Method and device for collecting website logs

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710722629.0A CN107480277B (en) 2017-08-22 2017-08-22 Method and device for collecting website logs

Publications (2)

Publication Number Publication Date
CN107480277A true CN107480277A (en) 2017-12-15
CN107480277B CN107480277B (en) 2021-01-26

Family

ID=60601938

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710722629.0A Active CN107480277B (en) 2017-08-22 2017-08-22 Method and device for collecting website logs

Country Status (1)

Country Link
CN (1) CN107480277B (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109684370A (en) * 2018-09-07 2019-04-26 平安普惠企业管理有限公司 Daily record data processing method, system, equipment and storage medium
CN109727137A (en) * 2018-12-18 2019-05-07 杭州茂财网络技术有限公司 A kind of log reporting method and system based on consumer's risk evaluation and test
CN110008086A (en) * 2019-04-04 2019-07-12 星潮闪耀移动网络科技(中国)有限公司 A kind of log generation method, device and a kind of client
CN110351105A (en) * 2018-04-02 2019-10-18 阿里巴巴集团控股有限公司 A kind of sampling configuration method and device
CN110457195A (en) * 2019-08-05 2019-11-15 深圳乐信软件技术有限公司 Acquisition methods, device, server and the storage medium of client local log
CN110795166A (en) * 2019-10-28 2020-02-14 深圳前海微众银行股份有限公司 Data processing method and device
CN110969457A (en) * 2018-09-29 2020-04-07 中国移动通信集团浙江有限公司 Mobile application log collection method and system
CN113015203A (en) * 2021-03-22 2021-06-22 Oppo广东移动通信有限公司 Information acquisition method, device, terminal, system and storage medium
CN113497723A (en) * 2020-03-20 2021-10-12 阿里巴巴集团控股有限公司 Log processing method, log gateway and log processing system
CN113760564A (en) * 2020-10-20 2021-12-07 北京沃东天骏信息技术有限公司 Data processing method, device and system

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101505245A (en) * 2009-03-06 2009-08-12 成都市华为赛门铁克科技有限公司 Method and apparatus for sending log information
CN102882705A (en) * 2012-09-03 2013-01-16 青岛海信传媒网络技术有限公司 Method for reporting log through terminal equipment and log reporting system
CN102891873A (en) * 2011-07-21 2013-01-23 腾讯科技(深圳)有限公司 Method for storing log data and log data storage system
CN103595571A (en) * 2013-11-20 2014-02-19 北京国双科技有限公司 Preprocessing method, device and system for website access logs
CN103914485A (en) * 2013-01-07 2014-07-09 上海宝信软件股份有限公司 System and method for remotely collecting, retrieving and displaying application system logs
CN104572976A (en) * 2014-12-30 2015-04-29 广州唯品会信息科技有限公司 Website data updating method and system
CN105095281A (en) * 2014-05-13 2015-11-25 南京理工大学 Website classification catalogue optimization analysis method based on log mining
CN105262812A (en) * 2015-10-16 2016-01-20 浪潮(北京)电子信息产业有限公司 Log data processing method based on cloud computing platform, log data processing device and log data processing system
CN105824744A (en) * 2016-03-21 2016-08-03 焦点科技股份有限公司 Real-time log collection and analysis method on basis of B2B (Business to Business) platform
CN106599005A (en) * 2015-10-20 2017-04-26 阿里巴巴集团控股有限公司 Data archiving method and device
US20170193039A1 (en) * 2015-12-30 2017-07-06 Dropbox, Inc. Servicing queries of an event log

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101505245A (en) * 2009-03-06 2009-08-12 成都市华为赛门铁克科技有限公司 Method and apparatus for sending log information
CN102891873A (en) * 2011-07-21 2013-01-23 腾讯科技(深圳)有限公司 Method for storing log data and log data storage system
CN102882705A (en) * 2012-09-03 2013-01-16 青岛海信传媒网络技术有限公司 Method for reporting log through terminal equipment and log reporting system
CN103914485A (en) * 2013-01-07 2014-07-09 上海宝信软件股份有限公司 System and method for remotely collecting, retrieving and displaying application system logs
CN103595571A (en) * 2013-11-20 2014-02-19 北京国双科技有限公司 Preprocessing method, device and system for website access logs
CN105095281A (en) * 2014-05-13 2015-11-25 南京理工大学 Website classification catalogue optimization analysis method based on log mining
CN104572976A (en) * 2014-12-30 2015-04-29 广州唯品会信息科技有限公司 Website data updating method and system
CN105262812A (en) * 2015-10-16 2016-01-20 浪潮(北京)电子信息产业有限公司 Log data processing method based on cloud computing platform, log data processing device and log data processing system
CN106599005A (en) * 2015-10-20 2017-04-26 阿里巴巴集团控股有限公司 Data archiving method and device
US20170193039A1 (en) * 2015-12-30 2017-07-06 Dropbox, Inc. Servicing queries of an event log
CN105824744A (en) * 2016-03-21 2016-08-03 焦点科技股份有限公司 Real-time log collection and analysis method on basis of B2B (Business to Business) platform

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
黄宏涛: "Web日志挖掘中的数据预处理研究", 《黑龙江科技信息》 *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110351105A (en) * 2018-04-02 2019-10-18 阿里巴巴集团控股有限公司 A kind of sampling configuration method and device
CN110351105B (en) * 2018-04-02 2022-07-19 阿里巴巴集团控股有限公司 Sampling configuration method and device
CN109684370A (en) * 2018-09-07 2019-04-26 平安普惠企业管理有限公司 Daily record data processing method, system, equipment and storage medium
CN110969457A (en) * 2018-09-29 2020-04-07 中国移动通信集团浙江有限公司 Mobile application log collection method and system
CN109727137A (en) * 2018-12-18 2019-05-07 杭州茂财网络技术有限公司 A kind of log reporting method and system based on consumer's risk evaluation and test
CN110008086A (en) * 2019-04-04 2019-07-12 星潮闪耀移动网络科技(中国)有限公司 A kind of log generation method, device and a kind of client
CN110457195A (en) * 2019-08-05 2019-11-15 深圳乐信软件技术有限公司 Acquisition methods, device, server and the storage medium of client local log
CN110457195B (en) * 2019-08-05 2023-12-26 深圳乐信软件技术有限公司 Method and device for obtaining local log of client, server and storage medium
CN110795166A (en) * 2019-10-28 2020-02-14 深圳前海微众银行股份有限公司 Data processing method and device
CN110795166B (en) * 2019-10-28 2021-08-20 深圳前海微众银行股份有限公司 Data processing method and device
CN113497723A (en) * 2020-03-20 2021-10-12 阿里巴巴集团控股有限公司 Log processing method, log gateway and log processing system
CN113497723B (en) * 2020-03-20 2023-04-28 阿里巴巴集团控股有限公司 Log processing method, log gateway and log processing system
CN113760564A (en) * 2020-10-20 2021-12-07 北京沃东天骏信息技术有限公司 Data processing method, device and system
CN113015203A (en) * 2021-03-22 2021-06-22 Oppo广东移动通信有限公司 Information acquisition method, device, terminal, system and storage medium

Also Published As

Publication number Publication date
CN107480277B (en) 2021-01-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant