GB2367464A - Web traffic analysis - Google Patents
- Publication number
- GB2367464A (GB0115369A)
- Authority
- GB
- United Kingdom
- Prior art keywords
- user
- site
- file
- server
- access
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3438—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment monitoring of user actions
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3466—Performance evaluation by tracing or monitoring
- G06F11/3495—Performance evaluation by tracing or monitoring for systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/835—Timestamp
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/875—Monitoring of systems including the internet
Landscapes
- Engineering & Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Hardware Design (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Transfer Between Computers (AREA)
Abstract
A system and method for isolating user and site behavior information for individual users accessing a web site (100) includes a plurality of servers (101-106) potentially located in widely dispersed locations. Log files (107-112) are kept at the various server locations which preserve information about user identification and user activity or user behavior at each server. These log files are combined or concatenated into a single file (113) and then sorted according to time and date (114), as well as according to user identification (115) in order to enable a concise profile of user behavior across an entire web site to be readily obtained. Information describing site behavior of a user across an entirety of a web site enables web site administrators to optimize the provision of service to users of the web site.
Description
WEB TRAFFIC ANALYSIS
RELATED APPLICATIONS
The present application is related to co-pending, commonly assigned, concurrently filed Patent Application Number 10002091-1, entitled "CUSTOMER TRACKING METHOD AND SYSTEM," which is hereby incorporated herein by reference.
TECHNICAL FIELD
The present invention relates in general to the storage of web site usage information and in particular to the storage of information relating to user access to web sites having a plurality of servers.
BACKGROUND
Internet or World Wide Web sites commonly acquire and store information describing the site behavior of users or clients accessing a particular site. Site behavior generally refers to information describing the overall site experience of a user at a particular site, such as time spent at the site and the sequence of web pages accessed by the user. Such information may be employed to perform marketing analyses or to otherwise improve the web site's performance to better serve the needs of a typical user or client of the web site at issue.
Where a web site includes a particularly large amount of information, more than one server is generally made available to users accessing the site in order to avoid slowing down the system by having one server attempting to simultaneously provide information from a plurality of databases to a plurality of different users. Accordingly, a plurality of different servers are commonly employed to service such sites. While certain servers may initially be dedicated to selected portions of the totality of information available at a particular site, the information initially available at selected servers may be mirrored at other servers in order to provide alternative means for communicating information requested by a web site user where the server initially dedicated to supplying such information is busy performing other services.
Moreover, a particular user accessing the web site may be transferred from one server to another depending upon which web page of the web site a user requests access to. Thus, one user's site behavior may span a plurality of servers associated with the web site.
Where such a plurality of servers is employed to enable efficient access for users to a large base of information, each server generally logs the behavior of particular users at that server separately. Thus, when all servers are operating properly, the behavior of all users at all the available servers associated with the web site is generally contained in a set of server logs kept at the various servers. Although the totality of information describing a user's site behavior may be present among the plurality of servers according to the above-described scheme of operation, the information may well be scattered among a plurality of servers and thus be difficult to access in a convenient manner.
One prior art approach to condensing information describing an individual user's site behavior involves having a web site administrator identify entries associated with a particular web site user in server logs at all of the server locations accessed by the user during the entirety of the user's session on the web site. This information may then be input to a commercial software package which generally operates to weave together a user's behavior at all servers accessed by the user into a single condensed log describing the overall site behavior of the user from a beginning to an end of a particular user session at the web site.
While this approach may provide a useful package of information, it generally requires considerable time and effort on the part of a site administrator to gather the bits and pieces of server behavior associated with a particular user session before the user behavior at the various servers may be woven together employing a commercial software package.
Moreover, as a web site expands to include ever more servers, and is able to serve ever more users simultaneously, the task of providing such user-specific server log information from the various servers to the commercial software package becomes progressively more burdensome.
Moreover, where an organization supporting the web site is large and international, the servers may not only be numerous, but may well be located on different continents, supported by different technical teams, and store their respective server logs in different ways and in different locations within memories accessible to the various servers. Where such a large organization is concerned, the task of providing user-specific server behavior to the commercial software product may become prohibitively time consuming. It is also possible that certain information may be missed, such as, for example, where an administrator compiling the various server logs does not know where to find a log located on a remotely located server.
Accordingly, it is a problem in the art that information describing user behavior at a multiple server web site may be scattered among a plurality of servers.
It is a further problem in the art that gathering site behavior of a particular user in a multiple user web site is time consuming and potentially prone to omission of information at one or more servers.
It is a still further problem in the art that the burden of gathering server log information from a plurality of servers associated with a particular web site becomes progressively more time consuming and difficult as the number of servers grows and as the location of the servers becomes more diverse.
SUMMARY OF THE INVENTION
The present invention is directed to a system and method which combines logs from a plurality of servers associated with a web site into a single file, automatically sorts the resulting single file by the time and date of each entry, and then isolates server log information associated with individual users of the web site into separate logs or files, thereby providing a complete picture of site behavior of an individual user during a particular user session.
Herein, a "user session" generally refers to a single period of access by a user to a particular web site, and "site behavior" generally refers to activity engaged in at the web site by the user during the user session, where the activity may include such information as the sequence of web pages accessed within the web site, the overall time spent on the site, the amount of time spent at each server of the web site, the information requested by the user, information downloaded by the user from the web site, and information provided to the web site by the user. Herein, the term "server log" generally refers to activity occurring at a particular server, by whatever users accessed that server, during a designated time period.
In a preferred embodiment, the inventive mechanism initially combines the various server logs into a single file organized by server. Preferably, a log from server A is acquired first, a log from server B appended to the server A component, and server logs from yet more servers appended thereafter to the growing list of server log components of the combined file or concatenated file. This process is preferably continued until all server logs have been combined into a single file. Generally, the server logs from the various servers may be introduced into the combined file in any order. Although the combined file will generally be formed by progressively appending files at the end of a growing combined file, which is in the process of being completed, server files may optionally be inserted at a selected point in the middle of the combined file as the combined file is being constructed.
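The appending process described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `concatenate_logs`, the semicolon-separated line format, and the choice to tag each line with its server name at this stage are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


def concatenate_logs(server_logs: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Append each server's log, in the order given, to one combined list.

    Each line is tagged with the (hypothetical) server name so that its
    origin is not lost once the logs are merged.
    """
    combined: List[Tuple[str, str]] = []
    for server_name, lines in server_logs.items():
        # Appending at the end mirrors the "growing combined file" approach.
        combined.extend((server_name, line) for line in lines)
    return combined


# Hypothetical log lines: user id; time period; pages accessed.
logs = {
    "A": ["user1;6:00-6:30;P1,P6,P9", "user2;6:30-6:45;P8,P12"],
    "B": ["user1;6:30-7:00;P21,P22,P23"],
}
combined = concatenate_logs(logs)
```

Adding a further server under this sketch only requires one more entry in `logs`, which is the scalability property the description relies on.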
In a preferred embodiment, once all the server logs have been combined into a completed combined file, the combined file is sorted according to the time of each of the entries in the combined file, which entries originally resided in the various original server files.
Where server logs spanning more than one calendar date are employed, the inventive mechanism may sort by both date and time of day. Herein, the term "time-sorted log," "time-sorted site log," or "time-sorted combined file" generally refers to the file resulting from operation of the above-described sorting process on the combined file.
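A small sketch of sorting by both date and time of day is given here. The dictionary keys, the timestamp format, and the dates themselves are hypothetical; the source does not specify how timestamps are stored.

```python
from datetime import datetime
from typing import Dict, List

# Assumed timestamp layout for this example only.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M"


def sort_by_date_and_time(entries: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return the entries ordered by full date and time, earliest first."""
    return sorted(
        entries,
        key=lambda entry: datetime.strptime(entry["when"], TIMESTAMP_FORMAT),
    )


entries = [
    {"user": "user1", "when": "2001-06-21 06:00", "server": "A"},
    {"user": "user2", "when": "2001-06-20 23:50", "server": "B"},
    {"user": "user1", "when": "2001-06-21 06:30", "server": "B"},
]
time_sorted_log = sort_by_date_and_time(entries)
```

Sorting on time of day alone would place the 23:50 entry last even though it occurred on the previous day, which is why the date is part of the key.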
In a preferred embodiment, once the time-sorted site log is produced, this log is further sorted according to the individual users associated with each log entry, thereby producing a user-sorted site log. Sorting the time-sorted site log by user, or user identification, may be performed employing either proprietary software or a commercially available software product. Preferably, the user-sorted site log may be readily separated into user-specific components which identify the site behavior of each of the users who accessed the web site in the time period covered by the server logs originally combined in the combined file.
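One way to separate a time-sorted log into user-specific components is a single pass that groups entries by user identification, as sketched below. The entry layout and the function name `split_by_user` are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def split_by_user(time_sorted_entries: List[Dict[str, str]]) -> Dict[str, List[Dict[str, str]]]:
    """Group entries by user while keeping their existing order.

    Because the input is already in chronological order and entries are
    appended as they are encountered, each user's list stays chronological.
    """
    per_user: Dict[str, List[Dict[str, str]]] = defaultdict(list)
    for entry in time_sorted_entries:
        per_user[entry["user"]].append(entry)
    return dict(per_user)


time_sorted_entries = [
    {"user": "user1", "start": "6:00", "server": "A"},
    {"user": "user1", "start": "6:30", "server": "B"},
    {"user": "user2", "start": "6:30", "server": "A"},
    {"user": "user1", "start": "6:45", "server": "C"},
]
user_logs = split_by_user(time_sorted_entries)
```

Each value in `user_logs` corresponds to what the description calls a user-specific component of the user-sorted site log.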
Preferably, the inventive approach will enable information describing a particular user's site behavior to be produced in a rapid and automated manner. The time consuming provision of information indicating which user was present at which server at particular times is thereby preferably avoided. Further, the inventive approach is preferably able to seamlessly incorporate additional servers into the above-described operational scheme by merely gathering server logs from any new servers, and appending the resulting new server logs to the above-described combined file, thereby providing scalability in the inventive system.
Accordingly, it is an advantage of a preferred embodiment of the present invention that information describing the site behavior for a particular user may be conveniently concentrated in a single location and within a single file.
It is a further advantage of a preferred embodiment of the present invention that the concentration of user site behavior information may be accomplished in a rapid and automated manner.
It is a still further advantage of a preferred embodiment of the present invention that the system and method for combining information from a plurality of servers associated with a single web site may be readily extended to new or additional servers, thereby providing for scalability in the inventive system and method.
The foregoing has outlined rather broadly the features and technical advantages of the present invention in order that the detailed description of the invention that follows may be better understood. Additional features and advantages of the invention will be described hereinafter which form the subject of the claims of the invention. It should be appreciated by those skilled in the art that the conception and specific embodiment disclosed may be readily utilized as a basis for modifying or designing other structures for carrying out the same purposes of the present invention. It should also be realized by those skilled in the art that such equivalent constructions do not depart from the spirit and scope of the invention as set forth in the appended claims. The novel features which are believed to be characteristic of the invention, both as to its organization and method of operation, together with further objects and advantages will be better understood from the following description when considered in connection with the accompanying figures. It is to be expressly understood, however, that each of the figures is provided for the purpose of illustration and description only and is not intended as a definition of the limits of the present invention.
BRIEF DESCRIPTION OF THE DRAWINGS
For a more complete understanding of the present invention, and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:
FIGURE 1 is a block diagram depicting the generation of log files at various servers and the concatenation of these log files into a single file according to a preferred embodiment of the present invention;
FIGURE 2 depicts an exemplary concatenated file including log files from a selection of servers associated with a web site according to a preferred embodiment of the present invention;
FIGURE 3 depicts files describing the site behavior of a selection of users having accessed a web site according to a preferred embodiment of the present invention; and
FIGURE 4 depicts a computer system adaptable for use with a preferred embodiment of the present invention.
DETAILED DESCRIPTION
FIGURE 1 is a block diagram depicting the generation of log files at various servers and the concatenation of these log files into a single file according to a preferred embodiment of the present invention. Servers A 101 through F 106 are shown producing server log files 107 through 112, respectively. Preferably, each server log file of server log files 107-112 includes information describing behavior at each server of various users who have accessed a web site 100 served by servers 101-106.
In a preferred embodiment, the contents of server log files 107-112 are concatenated, or combined, into concatenated log file 113. Although six server log files 107-112 are shown in FIGURE 1, it will be appreciated that fewer or more than six log files may be combined in the manner discussed herein, and all such variations are included in the scope of the present invention. Combining the various server log files 107-112 into a single file preferably operates to condense into a single file, information which was previously scattered among a plurality of servers, which servers may be located in widely dispersed locations. Upon having so concentrated the server log information from the various server log files 107-112, further operations, such as sorting operations, may be performed upon the concatenated log file 113 in order to extract useful information about access to web site 100 by one or more web site users. While a variety of sorting operations are discussed herein with regard to the treatment of concatenated log file 113, it will be appreciated that a plurality of other useful data manipulation operations could be beneficially performed upon concatenated file 113 which would have been difficult and time consuming to perform on the various individual server log files 107-112 prior to concatenation.
In a preferred embodiment, once concatenated log file 113 is produced, the entries within file 113 may be sorted 114 according to the time and date of each entry in order to present the contents of concatenated log file 113 in a more useful form for analysis of web site activity. Once the time-based or chronological sorting 114 of concatenated file 113 is complete, a time-sorted concatenated file (or time-sorted combined file) is preferably generated to preserve site access information within a single and readily accessible data structure.
Preferably, after sorting the various entries in concatenated file 113 by time, the time-sorted concatenated file is then sorted according to user identification in order to better isolate information on web site behavior of individual users who have accessed the web site 100. A device such as a central server or central server station 116 may be employed to store concatenated file 113. It will be appreciated that more than one central server station 116 could be employed. Instead of being stored in central server station 116, concatenated file 113 could alternatively be stored in any one of servers 101 to 106.
In a preferred embodiment, servers 101-106 and the device which stores the concatenated log file 113 are large UNIX based systems having access to a large amount of random access memory (RAM). Alternatively, personal computer (PC) based systems may be employed for one or more of the servers. Moreover, the operations of the various servers and the centralized device storing the concatenated log file 113 could be supplied by any computer. Communication between the servers and a device, such as a central server, storing the concatenated log file 113, may be accomplished by one or more public networks, one or more private networks, or a combination of one or more public networks and one or more private networks.
FIGURE 2 depicts an exemplary concatenated file including log files from a selection of servers associated with a web site according to a preferred embodiment of the present invention. Each entry in log file A 107 is shown having three exemplary data fields: a user identification (user i.d.), a time period during which the identified user accessed the identified server (in this case, server A), and an identification of the web pages and/or services accessed by the user during the user's access of server A. An exemplary representation of the access of server A by user 1 101 indicates access to "pages" P1, P6, P9, where the capital "P" preceding each page number is merely intended to abbreviate the word "page." It will be appreciated that the three data fields depicted in log file A 107 are exemplary, and that many other data fields could be included in addition to, or in place of, the three exemplary entries discussed above. It will be further appreciated that access to a "web page" on web site 100 is but one of many possible activities available at web site 100. Preferably, in addition to merely visiting selected pages, files may be downloaded from web site 100, databases on web site 100 may be modified, and communication may be conducted between a user accessing the web site 100 and an automated product support service and/or a person providing live technical or other support on behalf of web site 100.
In a preferred embodiment, additional data fields which may be included in log file A 107 (and other log files of other servers within web site 100) could include such information as: the country of origin of the user, the user's status as a first time or return visitor, and a commercial status of an originating web address from which a user is accessing web site 100, such as "customer," "potential customer," "competitor," "vendor," or "government entity."
Other information for which data fields may be provided includes the identity of files which were requested and/or downloaded from web site 100 to the user, and a means of communication of such downloaded information, such as file transfer protocol (FTP) or hypertext transfer protocol (HTTP). Information describing the operating system and browser software employed by the user, such as, for instance, Windows 98® and Microsoft Explorer 5.0®, respectively, may also be included. The inclusion of such technical information in the server logs preferably enables the operators of web site 100 to optimize the provision of service on web site 100 for those users considered most valuable to the web site operators.
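The three exemplary data fields and the optional additional fields discussed above could be represented by a record type along the following lines. The class name `LogEntry` and every field name are hypothetical; the source lists the kinds of information but prescribes no particular layout.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LogEntry:
    # The three exemplary fields shown for log file A.
    user_id: str
    time_period: str
    pages: List[str]
    # Optional additional fields; all names here are illustrative assumptions.
    country: Optional[str] = None
    return_visitor: Optional[bool] = None
    commercial_status: Optional[str] = None  # e.g. "customer", "vendor"
    downloaded_files: List[str] = field(default_factory=list)
    transfer_protocol: Optional[str] = None  # e.g. "FTP" or "HTTP"
    operating_system: Optional[str] = None
    browser: Optional[str] = None


entry = LogEntry(user_id="user1", time_period="6:00-6:30", pages=["P1", "P6", "P9"])
detailed = LogEntry(
    user_id="user2",
    time_period="6:30-6:45",
    pages=["P8", "P12"],
    commercial_status="customer",
    transfer_protocol="HTTP",
)
```

Giving the additional fields default values lets a log carry only the three basic fields where nothing more is recorded.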
FIGURE 2 depicts exemplary contents of log files A 107, B 108, and C 109. An exemplary set of entries is depicted within the three depicted log files, and an exemplary set of data fields depicted as being included in each of the entries. Preferably, a sequence of log file segments A 107 through C 109, as shown in FIGURE 2 forms only a portion of an exemplary concatenated log file 113 (FIGURE 1) which condenses information from the various servers associated with web site 100. In each of log file A 107, log file B 108, and log file C 109, user 1 101, user 2 102 and user N 103 are shown, along with data fields pertaining to the three users. The inclusion of user N is meant to indicate the ability to incorporate an unlimited number of possible entries in each of the log files 107-109. Once combined into concatenated file 113, log file A 107, log file B 108, and log file C 109 preferably remain substantially intact and form file segments of concatenated file 113.
In a preferred embodiment, the server activity of each user at each of the servers may be gleaned from the server log files 107-109. However, information pertaining to each user is still scattered among several different log files. Specifically, it may be seen in the exemplary activity record depicted in the log file segments of FIGURE 2 that user 1 101 accessed web pages 1, 6, and 9 between 6:00 and 6:30 (on a clock running from 0:00 to 24:00) at server A, pages 21, 22, and 23 between 6:30 and 7:00 at server B, and pages 2, 6, and 9 between 6:45 and 7:00 at server C. It is generally desirable to further sort the information available in the portion of concatenated file 113 depicted in FIGURE 2 in order to efficiently isolate the web site activity or web site behavior of user 1 101 or any other user accessing web site 100.
Accordingly, the information depicted in FIGURE 2 is preferably further sorted according to user i.d. to produce the data tables depicted in FIGURE 3, and discussed below. It will be appreciated that additional data fields, as discussed above, may be further included to provide more information regarding the identity and site behavior of user 1 101 or any other entity accessing web site 100.
In a preferred embodiment, a sorting operation may be performed on the entries within the portion of exemplary concatenated file 113 depicted in FIGURE 2 so as to list all entries in purely chronological order rather than having the entries separated into server-specific segments as shown in FIGURE 2. Preferably, the start time of a user's activity at any one server would be employed as the reference point for the sorting operation. The resulting list would preferably create a unitary list, no longer separated by server identity, ordered entirely according to start times for various users and the various servers. When two starting times match, another criterion, such as user number, may be used as a secondary criterion to determine priority of ordering within the sorted list.
Thus, for example, upon sorting the list shown in FIGURE 2 in this manner, a first entry would preferably be: user 1; 6:00-6:30; P1, P6, P9; server A; a second entry would preferably be: user 1; 6:30-7:00; P21, P22, P23; server B; and a third entry would preferably be: user 2; 6:30-6:45; P8, P12; server A. It may be seen that the second entry precedes the third entry above, even though they have the same start time, because user number "1" receives higher priority than user number "2" according to the described priority system.
Since the entries are no longer automatically segmented according to server identity, the server identity is now preferably explicitly included as a data field in order to unambiguously and fully identify the user, server, time period, and page access activity associated with the entry. While the foregoing describes one possible sorting operation which may be performed on the concatenated file 113, it will be appreciated that other sorting operations may be performed using other parameters as primary and secondary ordering parameters, and all such variations are included in the scope of the present invention.
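The ordering in the example above, with start time as the primary key and user number as the tie-breaker, can be reproduced with a compound sort key. The tuple layout and helper names below are assumptions; the entries are those given in the description, with the server identity carried as an explicit field.

```python
from typing import List, Tuple

# (user number, start, end, pages, server), listed in server-segment order.
Entry = Tuple[int, str, str, List[str], str]

segmented: List[Entry] = [
    (1, "6:00", "6:30", ["P1", "P6", "P9"], "A"),
    (2, "6:30", "6:45", ["P8", "P12"], "A"),
    (1, "6:30", "7:00", ["P21", "P22", "P23"], "B"),
    (1, "6:45", "7:00", ["P2", "P6", "P9"], "C"),
]


def to_minutes(clock: str) -> int:
    """Convert an 'H:MM' clock string to minutes after 0:00."""
    hours, minutes = clock.split(":")
    return int(hours) * 60 + int(minutes)


def chronological(entries: List[Entry]) -> List[Entry]:
    """Sort by start time, then by user number when start times match."""
    return sorted(entries, key=lambda e: (to_minutes(e[1]), e[0]))


ordered = chronological(segmented)
```

Converting the clock strings to minutes avoids comparing them as text, where "10:00" would sort before "6:00".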
FIGURE 3 depicts files describing the site behavior of a selection of users having accessed web site 100. FIGURE 3 includes user 1 site behavior file 301 and user 2 site behavior file 302. The site behavior files 301 and 302 preferably result from sorting the concatenated log files by date and time, and then by user i.d., thereby ultimately producing user site behavior files, such as files 301 and 302, which isolate the information pertaining to a particular user. Preferably, information pertaining to the identity and site behavior of individual users may be advantageously employed to optimize the provision of web site service for the greatest number of web site users.
In the case of user 1 site behavior file 301, the information describing user 1's activity throughout web site 100 is conveniently condensed into a single file or table, whereas, in the concatenated file portion of FIGURE 2, such information generally had to be gleaned from a plurality of different file segments. Preferably, the entries describing user 1's web site activity are shown in advancing chronological order, and web pages accessed are displayed in the order in which they were accessed during user 1's original session with each of the servers.
Preferably, the same principle is applied to the organization of user 2 site behavior file 302.
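Once a user's entries are gathered in one place, measures such as the time spent at each server and the overall span of the session can be derived directly. The sketch below uses the time periods stated earlier for user 1 at servers A, B, and C; the function names and tuple layout are assumptions, and the calculation itself is an illustration rather than part of the described method.

```python
from typing import Dict, List, Tuple

# (server, start, end) for user 1, taken from the time periods given above.
user1_entries: List[Tuple[str, str, str]] = [
    ("A", "6:00", "6:30"),
    ("B", "6:30", "7:00"),
    ("C", "6:45", "7:00"),
]


def to_minutes(clock: str) -> int:
    """Convert an 'H:MM' clock string to minutes after 0:00."""
    hours, minutes = clock.split(":")
    return int(hours) * 60 + int(minutes)


def minutes_per_server(entries: List[Tuple[str, str, str]]) -> Dict[str, int]:
    """Total the logged minutes at each server."""
    totals: Dict[str, int] = {}
    for server, start, end in entries:
        totals[server] = totals.get(server, 0) + to_minutes(end) - to_minutes(start)
    return totals


def session_span(entries: List[Tuple[str, str, str]]) -> int:
    """Minutes from the earliest start to the latest end across all servers."""
    earliest = min(to_minutes(start) for _, start, _ in entries)
    latest = max(to_minutes(end) for _, _, end in entries)
    return latest - earliest


per_server = minutes_per_server(user1_entries)
span = session_span(user1_entries)
```

Because the periods at servers B and C overlap, the per-server totals add up to more than the overall span, so the two figures answer different questions.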
For the purpose of brevity, only two site behavior files are depicted. It will be appreciated that site behavior files similar to files 301 and 302 could be produced for an essentially unlimited number of users. Many data fields (such as those discussed above), either in addition to, or alternatively to, the data fields included in site behavior files 301 and 302 may be included in these and/or other site behavior files, and all such variations are included within the scope of the present invention.
Although FIGURE 3 depicts an organization of entries within a user-specific segment of the user i.d.-sorted and chronologically sorted concatenated log file which employs access time as the primary ordering parameter, other parameters could be so employed. For example, site behavior file 301 could be organized employing web page numbers as a primary ordering parameter and then list the times at which each page was accessed. Although the site behavior files 301 and 302 depicted in FIGURE 3 list only an overall time span associated with access to a plurality of pages, information describing the precise times at which each web page was accessed by each user may also be provided. Moreover, where a page is accessed more than once, all time periods spent by a particular user at that page may be listed.
For an exemplary case independent of the example presented in FIGURE 3, where user 1 has accessed web page 7 between 6:15 and 6:20 using server A, between 7:00 and 7:10 using server B, and between 8:00 and 8:15 using server C, and used different servers for different access periods, a web page ordered entry could be presented as follows: web page 7: server A, 6:15-6:20; server B, 7:00-7:10; server C, 8:00-8:15.
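A page-ordered presentation of this kind can be produced by grouping a user's accesses by page number, as in the following sketch. The tuple layout, the function name `page_ordered`, and the hyphenated time format in the output are assumptions for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (page number, server, start, end) for one user, in chronological order.
accesses: List[Tuple[int, str, str, str]] = [
    (7, "A", "6:15", "6:20"),
    (7, "B", "7:00", "7:10"),
    (7, "C", "8:00", "8:15"),
]


def page_ordered(entries: List[Tuple[int, str, str, str]]) -> List[str]:
    """List every access period for each page, with pages in numeric order."""
    by_page: Dict[int, List[str]] = defaultdict(list)
    for page, server, start, end in entries:
        by_page[page].append(f"server {server}, {start}-{end}")
    return [
        f"web page {page}: " + "; ".join(by_page[page])
        for page in sorted(by_page)
    ]


lines = page_ordered(accesses)
```

With the input above, `lines` holds a single string listing all three access periods for web page 7.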
It will be appreciated that numerous other sorting criteria could be employed and other variables employed as primary ordering parameters for sorted lists for site behavior files 301 and 302, and all such variations are included within the scope of the present invention.
FIGURE 4 illustrates computer system 400 adaptable for use with a preferred embodiment of the present invention. Central processing unit (CPU) 401 is coupled to system bus 402. The CPU 401 may be any general purpose CPU, such as an HP PA-8200. However, the present invention is not restricted by the architecture of CPU 401 as long as CPU 401 supports the inventive operations as described herein. Bus 402 is coupled to random access memory (RAM) 403, which may be SRAM, DRAM, or SDRAM. ROM 404, which may be PROM, EPROM, or EEPROM, is also coupled to bus 402. RAM 403 and ROM 404 hold user and system data and programs as is well known in the art.
The bus 402 is also coupled to input/output (I/O) adapter 405, communications adapter card 411, user interface adapter 408, and display adapter 409. The I/O adapter 405 connects storage devices 406, such as one or more of a hard drive, CD drive, floppy disk drive, or tape drive, to the computer system. Communications adapter 411 is adapted to couple the computer system 400 to a network 412, which may be one or more of a local-area (LAN), wide-area (WAN), Ethernet, or Internet network. User interface adapter 408 couples user input devices, such as keyboard 413 and pointing device 407, to the computer system 400. The display adapter 409 is driven by CPU 401 to control the display on display device 410.
Although the present invention and its advantages have been described in detail, it should be understood that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims. Moreover, the scope of the present application is not intended to be limited to the particular embodiments of the process, machine, manufacture, composition of matter, means, methods and steps described in the specification. As one of ordinary skill in the art will readily appreciate from the disclosure of the present invention, processes, machines, manufacture, compositions of matter, means, methods, or steps, presently existing or later to be developed that perform substantially the same function or achieve substantially the same result as the corresponding embodiments described herein may be utilized according to the present invention. Accordingly, the appended claims are intended to include within their scope such processes, machines, manufacture, compositions of matter, means, methods, or steps.
Claims (21)
WHAT IS CLAIMED IS:
- 1. A method for identifying a site behavior of at least one user on a web site (100) having a plurality of servers (101-106), the method comprising the steps of: generating a server log (107-112) at each server (101-106) accessed by said at least one user, thereby generating a plurality of server logs (107-112); and combining data from said plurality of server logs into a single site access file (113).
- 2. The method of claim 1 further comprising the step of: sorting (114) said site access file (113) according to a chronology of access times in said site access file (113).
- 3. The method of claim 2 further comprising the step of: sorting (115) said chronologically sorted site access file (113) according to an identity of said at least one user, thereby identifying said site behavior of said at least one web site user.
- 4. The method of claim 2 further comprising the step of: sorting said chronologically sorted access file (113) according to an identity of an organization from which said at least one user is accessing said web site.
- 5. The method of claim 4 further comprising the step of: identifying a type of said organization from which said at least one user is accessing said web site (100).
- 6. The method of claim 3 further comprising the step of: including, in a file containing said identified site behavior, a list of all web pages accessed at said web site by said at least one user.
- 7. The method of claim 3 further comprising the step of: including, in a file containing said identified site behavior, a list of all files downloaded from said web site by said at least one user.
- 8. The method of claim 3 further comprising the step of: including, in a file containing said identified site behavior, an indication that said at least one user is one of a first time visitor and a return visitor.
- 9. The method of claim 3 further comprising the step of: including, in a file containing said identified site behavior, an identification of a browser software type employed by said at least one user.
- 10. The method of claim 3 further comprising the step of: separating information in said identity sorted and chronologically sorted site access file into segments corresponding to an identity of servers accessed by said at least one user, thereby producing file segments having information describing activity by said at least one user at each server of said plurality of servers.
- 11. The method of claim 10 further comprising the step of: organizing said produced file segments according to an ordered list of web pages within said web site accessed by said at least one user.
- 12. A system for organizing access information for a web site, the system comprising: a plurality of servers (101-106) for providing access to a plurality of web pages to a plurality of users accessing said web site (100); a log file (107-112) kept at each server (101-106) of said plurality of servers (101-106) for storing site access data associated with said each server (101-106), thereby providing a plurality of log files (107-112); a central server system (116) for maintaining site access information for an entirety of said web site (100); and a combined log file (113) stored within said central server system (116) which includes a combination of data stored in said plurality of log files (107-112).
- 13. The system of claim 12 further comprising: a communication infrastructure for enabling communication of data in said plurality of log files (107-112) between said plurality of servers (101-106) and said central server system (116).
- 14. The system of claim 12 further comprising: a chronologically sorted combined log file (113, 114), wherein entries present in said combined log file are ordered according to times of access of said plurality of users to said plurality of servers (101-106).
- 15. The system of claim 14 wherein said chronologically sorted combined log file (113, 114) is further sorted according to calendar dates of access of said plurality of users to said plurality of servers (101-106).
- 16. The system of claim 14 further comprising: a combined log file (113) sorted (114) according to web site user identification and by times of access to said plurality of servers.
- 17. A system for ordering access information for a web site for convenient data access, the system comprising: means for generating a server log (107-112) at each server (101-106) of a plurality of servers associated with said web site (100), thereby establishing a plurality of server logs (107-112), wherein each of said server logs (107-112) includes a plurality of server access entries; means for communicating said plurality of server logs (107-112) from said plurality of servers to a central server (116); and means for concatenating said plurality of server logs (107-112) into a single site access file (113).
- 18. The system of claim 17 further comprising: means for sorting (114) said entries in said site access file (113) according to a time and date of each of said entries.
- 19. The system of claim 17 further comprising: means for sorting (115) said entries in said site access file (113) according to an identification of at least one user of said web site, thereby generating user-specific segments of said site access file.
- 20. The system of claim 19 further comprising: means for further sorting said user-identification-sorted entries by time within said user-specific site access file segments according to time and date, thereby generating a sequence of entries forming a chronologically sequential record (301,302) of site activity of said at least one user.
- 21. The system of claim 17, further comprising: means for sorting said entries in said site access file according to a web page identifier, thereby indicating a total level of access to said identified web page and an identification of users having accessed said identified web page.
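As an illustration only, the combining and sorting steps recited in claims 1 to 3 and 21 can be sketched in Python. The `Entry` record, its field names, the server names, and the sample data are assumptions made for this sketch and do not appear in the specification. The stable sort used to keep each user's entries in time order is one possible approach, not necessarily that of the described embodiment.

```python
from collections import defaultdict
from datetime import datetime
from typing import NamedTuple


class Entry(NamedTuple):
    """One server access entry (hypothetical layout, assumed for this sketch)."""
    timestamp: datetime
    user: str
    server: str
    page: str


def combine_logs(server_logs):
    """Concatenate the per-server logs into one site access list (claims 1, 17)."""
    return [entry for log in server_logs.values() for entry in log]


def sort_chronologically(entries):
    """Order all entries by access time (claims 2, 18)."""
    return sorted(entries, key=lambda e: e.timestamp)


def sort_by_user(chronological_entries):
    """Sort by user identity (claim 3).

    Python's sorted() is stable, so entries belonging to the same user
    keep the chronological order they already had.
    """
    return sorted(chronological_entries, key=lambda e: e.user)


def page_access(entries):
    """Group entries by page: total hits and the users who accessed it (claim 21)."""
    summary = defaultdict(lambda: {"hits": 0, "users": set()})
    for e in entries:
        summary[e.page]["hits"] += 1
        summary[e.page]["users"].add(e.user)
    return dict(summary)


# Hypothetical logs from two servers of the same web site.
logs = {
    "server101": [
        Entry(datetime(2000, 7, 19, 9, 0), "alice", "server101", "/index.html"),
        Entry(datetime(2000, 7, 19, 9, 5), "bob", "server101", "/index.html"),
    ],
    "server102": [
        Entry(datetime(2000, 7, 19, 9, 2), "alice", "server102", "/products.html"),
    ],
}

site_access = sort_by_user(sort_chronologically(combine_logs(logs)))
for e in site_access:
    print(e.user, e.timestamp.isoformat(), e.server, e.page)
```

In this sketch the two sorts are applied in sequence, time first and user second, which mirrors the order of claims 2 and 3. A single sort on the pair (user, timestamp) would give the same ordering.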
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US61894700A | 2000-07-19 | 2000-07-19 |
Publications (2)
Publication Number | Publication Date |
---|---|
GB0115369D0 (en) | 2001-08-15 |
GB2367464A (en) | 2002-04-03 |
Family
ID=24479796
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB0115369A Withdrawn GB2367464A (en) | 2000-07-19 | 2001-06-22 | Web traffic analysis |
Country Status (1)
Country | Link |
---|---|
GB (1) | GB2367464A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1996041495A1 (en) * | 1995-06-07 | 1996-12-19 | Media Metrix, Inc. | Computer use meter and analyzer |
US5974572A (en) * | 1996-10-15 | 1999-10-26 | Mercury Interactive Corporation | Software system and methods for generating a load test using a server access log |
WO2000079449A2 (en) * | 1999-06-09 | 2000-12-28 | Teralytics, Inc. | System, method and computer program product for generating an inventory-centric demographic hyper-cube |
WO2001020503A1 (en) * | 1999-09-14 | 2001-03-22 | E-Club Australia Limited | A method of monitoring internet activity |
WO2001025896A1 (en) * | 1999-10-04 | 2001-04-12 | Quantified Systems, Inc. | System and method for monitoring and analyzing internet traffic |
GB2357680A (en) * | 2000-03-14 | 2001-06-27 | Speed Trap Com Ltd | Monitoring of services provided over a network with determination of interactive content of web pages |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7565366B2 (en) | 2005-12-14 | 2009-07-21 | Microsoft Corporation | Variable rate sampling for sequence analysis |
EP2088711A1 (en) * | 2006-11-30 | 2009-08-12 | Alibaba Group Holding Limited | A log analyzing method and system based on distributed compute network |
EP2088711A4 (en) * | 2006-11-30 | 2014-05-07 | Alibaba Group Holding Ltd | A log analyzing method and system based on distributed compute network |
US20140101091A1 (en) * | 2012-10-04 | 2014-04-10 | Adobe Systems Incorporated | Rule-based extraction, transformation, and loading of data between disparate data sources |
US9087105B2 (en) * | 2012-10-04 | 2015-07-21 | Adobe Systems Incorporated | Rule-based extraction, transformation, and loading of data between disparate data sources |
US10402420B2 (en) | 2012-10-04 | 2019-09-03 | Adobe Inc. | Rule-based extraction, transformation, and loading of data between disparate data sources |
Also Published As
Publication number | Publication date |
---|---|
GB0115369D0 (en) | 2001-08-15 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US7131062B2 (en) | Systems, methods and computer program products for associating dynamically generated web page content with web site visitors | |
US7330933B2 (en) | Application cache pre-loading | |
US9785533B2 (en) | Session template packages for automated load testing | |
US6393422B1 (en) | Navigation method for dynamically generated HTML pages | |
US20020059193A1 (en) | System and method for tracking usage of multiple resources provided from multiple locations | |
EP2088711A1 (en) | A log analyzing method and system based on distributed compute network | |
US20050050014A1 (en) | Method, device and software for querying and presenting search results | |
JP2006510123A (en) | Intelligent host-based results related to character streams | |
US20090222454A1 (en) | Method and data processing system for restructuring web content | |
EP1234251B1 (en) | Method, system, and computer readable medium for managing resource links | |
JP2007519106A (en) | Method and system for recording a search trail across one or more search engines in a communication network | |
TW201329890A (en) | Processing method and system of shop visiting data | |
CN101562664A (en) | Ticket processing method and system | |
EP1109115A1 (en) | Merging driver for accessing multiple database sources | |
CN107370830B (en) | Trade information supplying system based on big data and method | |
CN116800596A (en) | Log lossless compression analysis method and system | |
US11915044B2 (en) | Distributed task assignment in a cluster computing system | |
GB2367464A (en) | Web traffic analysis | |
EP2435902A1 (en) | Retrieval system, retrieval space map server apparatus and program | |
JPWO2007132849A1 (en) | How to get long data with GET method | |
US20050086250A1 (en) | Select/refresh method and apparatus | |
JP2004514196A (en) | Dynamic selection of images for web pages | |
US6836801B1 (en) | System and method for tracking the use of a web tool by a web user by using broken image tracking | |
JP4807411B2 (en) | Method for using information of another domain, program for using information of another domain, and information transfer program | |
CN113468400B (en) | List rendering method, device and equipment for visual webpage and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WAP | Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1) |