CN104504077A - Web access data statistical method and the device - Google Patents

Info

Publication number
CN104504077A
Authority
CN
China
Prior art keywords
visitor
access
mark
array
statistics
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410812114.6A
Other languages
Chinese (zh)
Other versions
CN104504077B (en)
Inventor
池雷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gridsum Technology Co Ltd
Original Assignee
Beijing Gridsum Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gridsum Technology Co Ltd filed Critical Beijing Gridsum Technology Co Ltd
Priority to CN201410812114.6A priority Critical patent/CN104504077B/en
Publication of CN104504077A publication Critical patent/CN104504077A/en
Application granted granted Critical
Publication of CN104504077B publication Critical patent/CN104504077B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/958Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a statistical method and device for web page access data. The method includes: obtaining a statistics request for counting accesses to a web page object, the request carrying a statistics time and an identifier of the web page object; extracting from a database one or more access arrays that correspond to the identifier of the web page object and the statistics time, where each element of an access array records the number of times one visitor accessed the web page object; and counting the non-zero elements of the access arrays to obtain the number of visitors who accessed the web page object. The invention addresses the low efficiency of the prior art in processing statistics requests for accessed web page objects, so that such requests can be processed quickly and efficiently.

Description

Statistical method and device for web page access data
Technical Field
The present invention relates to the field of the Internet, and in particular to a statistical method and device for web page access data.
Background
At present, in big-data statistics for advertisement monitoring, numerical indicators such as the number of unique users of an advertisement, the degree of audience overlap between media, and the number of unique-user visits to project items often have to be computed on the basis of visitor identifiers. Computing a single indicator often requires processing the data of more than 1 billion unique visitors in the system. A single advertisement may receive several hundred million visits per day, and the same user may visit it many times, so computing such an indicator takes a large amount of computation.
Currently, the most common way to compute such an indicator is to traverse all accesses to the advertisement within a specific time period and record the identifier of each visitor; if the visitor has not accessed the advertisement before, a counter is incremented by one. This processing is done essentially inside the database system: all the data must be read during the computation and then de-duplicated. To speed up querying and counting, some distributed schemes store the information of the same visitor at the same location, count on each server separately, and finally add up the counters of all servers. Even with a distributed scheme, however, computing a single indicator still takes a long time.
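The conventional approach described above can be sketched as follows. This is a minimal illustration, assuming the access records are held in memory as (web object, cookie) pairs; the function name `count_unique_naive` and the record layout are hypothetical and are not taken from the patent.

```python
def count_unique_naive(access_log, web_object):
    """Conventional counting: read every record and de-duplicate visitors."""
    seen = set()   # identifiers of visitors that were already counted
    counter = 0
    for obj, cookie in access_log:      # every record has to be read
        if obj != web_object:
            continue
        if cookie not in seen:          # de-duplication check for each record
            seen.add(cookie)
            counter += 1
    return counter


# Hypothetical sample log using the cookie values that appear in Table 1.
log = [
    (1, "d44bff32316243a88a7b"),
    (1, "d44bff32316243a88a7b"),
    (2, "a6111e2390874b169bbe"),
]
unique_for_object_1 = count_unique_naive(log, 1)
```

The cost of this approach grows with the number of access records rather than with the number of visitors, and the `seen` set has to hold one entry per unique visitor.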
In the prior art, counting over the raw data means that every access record is read once: the I/O complexity is O(n), the number of counting operations is n, the complexity of determining whether a user has already been counted is m*n, and the storage overhead is the number of unique users. In other words, computing one indicator by counting requires reading every access record once and also determining whether each user is accessing for the first time, and the storage overhead can be very large. On a server with a test configuration, finding the number of unique users of one advertisement from one day's 400 million access records by database operations takes 6 to 9 minutes. Suppose further that the system serves 100 clients in total, each client has 10 projects, and each project has 10 advertisements. Counting one day's unique visits for all clients then takes at least 6 minutes * 10 advertisements * 10 projects * 100 clients = 1000 hours, or about 41 days. Even if the computation is spread over 40 servers, it still takes more than one day to count the unique visits of all clients for a single day.
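The cost estimate in the preceding paragraph can be checked with a few lines of arithmetic. The constant names below are chosen for illustration only; the figures themselves are the ones quoted in the text, using the lower bound of 6 minutes per advertisement.

```python
MINUTES_PER_AD = 6          # lower bound of the 6 to 9 minute range quoted above
ADS_PER_PROJECT = 10
PROJECTS_PER_CLIENT = 10
CLIENTS = 100
SERVERS = 40

total_minutes = MINUTES_PER_AD * ADS_PER_PROJECT * PROJECTS_PER_CLIENT * CLIENTS
total_hours = total_minutes / 60          # 1000 hours on a single server
total_days = total_hours / 24             # a little under 42 days
hours_on_cluster = total_hours / SERVERS  # 25 hours when spread over 40 servers

print(total_hours, round(total_days, 1), hours_on_cluster)
```

Spread over 40 servers the job takes 25 hours, which is why one day's data cannot be processed within one day.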
No effective solution has yet been proposed for the low efficiency of the prior art in processing statistics requests for accessed web page objects.
Summary of the Invention
The main purpose of the present invention is to provide a statistical method and device for web page access data, so as to solve the low efficiency of the prior art in processing statistics requests for accessed web page objects.
To achieve this purpose, according to one aspect of the embodiments of the present invention, a statistical method for web page access data is provided.
The statistical method for web page access data according to the present invention comprises: obtaining a statistics request for counting accesses to a web page object, where the statistics request carries a statistics time and an identifier of the web page object; extracting from a database one or more access arrays that correspond to the identifier of the web page object and the statistics time, where each element of an access array records the number of times one visitor accessed the web page object; and counting the non-zero elements of the access arrays to obtain the number of visitors who accessed the web page object.
Further, before the statistics request is obtained, the statistical method also comprises: obtaining access data of visitors accessing the web page object; counting the number of times each visitor accessed the web page object within a preset recording period; and using each access count as an element to obtain the access array.
Further, using each access count as an element to obtain the access array comprises: recording the visitor ID of the visitor to whom the access count belongs; and using the visitor ID as the subscript of the element that holds the access count.
Further, obtaining the access data of visitors accessing the web page object comprises: extracting an access identifier from the access data of each visitor; assigning a visitor ID to the access identifier; and storing the visitor's access data in the storage area indicated by the visitor ID, where access identifiers and visitor IDs correspond one to one, the access identifier is an identifier generated when the visitor accesses the web page object, visitor IDs are consecutive natural numbers, and each visitor ID corresponds to one storage area. Counting the number of times each visitor accessed the web page object within the preset recording period comprises: reading, in visitor ID order, the access data within the preset recording period from the storage areas, and counting the access data in each storage area to obtain the access count.
Further, after the non-zero elements of the access arrays are counted to obtain the number of visitors who accessed the web page object, the statistical method also comprises: reading the visitor IDs of the non-zero elements; and obtaining the visitor's access information from the storage area of each visitor ID, where the access information comprises the visitor's attribute information and access data.
To achieve this purpose, according to another aspect of the embodiments of the present invention, a statistical device for web page access data is provided. The statistical device for web page access data according to the present invention comprises: a first acquisition module, configured to obtain a statistics request for counting accesses to a web page object, where the statistics request carries a statistics time and an identifier of the web page object; an extraction module, configured to extract from a database one or more access arrays that correspond to the identifier of the web page object and the statistics time, where each element of an access array records the number of times one visitor accessed the web page object; and a first statistical module, configured to count the non-zero elements of the access arrays to obtain the number of visitors who accessed the web page object.
Further, the statistical device also comprises: a second acquisition module, configured to obtain access data of visitors accessing the web page object before the statistics request is obtained; a second statistical module, configured to count the number of times each visitor accessed the web page object within a preset recording period; and an array module, configured to use each access count as an element to obtain the access array.
Further, the array module comprises: a recording module, configured to record the visitor ID of the visitor to whom the access count belongs; and a conversion module, configured to use the visitor ID as the subscript of the element that holds the access count.
Further, the second acquisition module comprises: an extraction submodule, configured to extract an access identifier from the access data of each visitor; a setting module, configured to assign a visitor ID to the access identifier; and a storage module, configured to store the visitor's access data in the storage area indicated by the visitor ID, where access identifiers and visitor IDs correspond one to one, the access identifier is an identifier generated when the visitor accesses the web page object, visitor IDs are consecutive natural numbers, and each visitor ID corresponds to one storage area. The second statistical module is configured to read, in visitor ID order, the access data within the preset recording period from the storage areas, and to count the access data in each storage area to obtain the access count.
Further, the statistical device also comprises: a reading module, configured to read the visitor IDs of the non-zero elements after the non-zero elements of the access arrays are counted to obtain the number of visitors who accessed the web page object; and a third acquisition module, configured to obtain the visitor's access information from the storage area of each visitor ID, where the access information comprises the visitor's attribute information and access data.
With the present invention, the visitor ID is extracted from the access data of a visitor accessing the web page object, the value of the visitor ID is used as the array subscript, and the number of times the visitor accessed the web page object is used as the array element value. The array elements are then stored in arrays in visitor ID order, one array per web page object and preset recording period. When a user initiates a statistics request for the data of an accessed web page object, a simple computation over the array elements yields the requested data. The present invention thus solves the low efficiency of the prior art in processing statistics requests for accessed web page objects, so that such requests are processed quickly and efficiently.
Brief Description of the Drawings
The accompanying drawings, which form a part of this application, are provided for further understanding of the present invention. The illustrative embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of the statistical method for web page access data according to an embodiment of the present invention; and
Fig. 2 is a schematic diagram of the statistical device for web page access data according to an embodiment of the present invention.
Detailed Description
It should be noted that, where no conflict arises, the embodiments in this application and the features in the embodiments may be combined with one another. The present invention is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
To help those skilled in the art better understand the solution of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort shall fall within the scope of protection of the present invention.
It should be noted that the terms "first", "second" and the like in the specification, the claims and the above drawings are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the invention described herein can be implemented. In addition, the terms "comprise" and "have" and any variations of them are intended to cover non-exclusive inclusion; for example, a process, method, system, product or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product or device.
An embodiment of the present invention provides a statistical method for web page access data.
Fig. 1 is a flowchart of the statistical method for web page access data according to an embodiment of the present invention. As shown in Fig. 1, the method comprises the following steps:
Step S102: obtain a statistics request for counting accesses to a web page object, where the statistics request carries a statistics time and an identifier of the web page object.
Step S104: extract from a database one or more access arrays that correspond to the identifier of the web page object and the statistics time, where each element of an access array records the number of times one visitor accessed the web page object.
Step S106: count the non-zero elements of the access arrays to obtain the number of visitors who accessed the web page object.
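Steps S102 to S106 can be sketched as follows. This is a minimal illustration under stated assumptions: the "database" is a plain dictionary keyed by (web object identifier, period), only a single access array is fetched, and the names `AccessDb` and `handle_statistics_request` are hypothetical. Python lists are 0-based, so position i holds the access count of visitor ID i + 1.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory stand-in for the database:
# (web object identifier, period) -> access array.
AccessDb = Dict[Tuple[int, str], List[int]]


def handle_statistics_request(db: AccessDb, web_object: int, period: str) -> int:
    # S102: the request carries the statistics time and the web object identifier.
    key = (web_object, period)
    # S104: extract the access array stored for that object and period.
    access_array = db.get(key, [])
    # S106: the number of non-zero elements is the number of visitors.
    return sum(1 for count in access_array if count != 0)


db = {
    (1, "2014-01-01 10:00"): [2, 0, 0],
    (2, "2014-01-01 10:00"): [0, 1, 0],
}
visitors_of_object_1 = handle_statistics_request(db, 1, "2014-01-01 10:00")
```

No per-request de-duplication is needed here, because each visitor occupies exactly one position of the array.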
In the above embodiment of the present invention, the visitor ID is extracted from the access data of a visitor accessing the web page object, the value of the visitor ID is used as the array subscript, and the number of times the visitor accessed the web page object is used as the array element value. The array elements are then stored in arrays in visitor ID order, one array per web page object and preset recording period. When a user initiates a statistics request for the data of an accessed web page object, a simple computation over the array elements yields the requested data. The present invention thus solves the low efficiency of the prior art in processing statistics requests for accessed web page objects, so that such requests are processed quickly and efficiently.
According to the above embodiment of the present invention, the statistical method also comprises, before the statistics request is obtained: obtaining access data of visitors accessing the web page object; counting the number of times each visitor accessed the web page object within a preset recording period; and using each access count as an element to obtain the access array.
The above embodiment is described in detail below with reference to Table 1, Table 2 and Table 3. Table 1 is an existing data table that stores web access data, Table 2 is a pre-stored conversion table between access identifiers and visitor IDs, and Table 3 is the data table of the present invention that stores web access data. Table 1 shows that the user whose access identifier is "d44bff32316243a88a7b" accessed web page object "1" twice within the preset recording period. The access identifier in Table 2 may be a cookie value. In Table 2, the user whose access identifier is "d44bff32316243a88a7b" is represented by visitor ID "1", and the value of visitor ID "1" serves as a subscript of array a, the array for web page object "1" in Table 3; the value of the corresponding element a_1 is the number of times this user accessed the web page object.
In the above embodiment, the cookie value is a unique identifier of the user's client browser, generated when that browser accesses the web page object. The web page object may be any object that a website makes available for users to access, for example an advertisement.
In the above embodiment of the present invention, because the user's visitor ID is converted into an array subscript, the system no longer needs to query whether the user has previously accessed the web page object when it counts accesses to a given web page object. This greatly improves the statistical efficiency of the system, so that statistics requests for accessed web page objects are processed quickly and efficiently.
In the above embodiment of the present invention, using each access count as an element to obtain the access array may comprise: recording the visitor ID of the visitor to whom the access count belongs; and using the visitor ID as the subscript of the element that holds the access count.
Further, obtaining the access data of visitors accessing the web page object may comprise: extracting an access identifier from the access data of each visitor; assigning a visitor ID to the access identifier; and storing the visitor's access data in the storage area indicated by the visitor ID. Access identifiers and visitor IDs correspond one to one, the access identifier is an identifier generated when the visitor accesses the web page object, the visitor IDs of the visitors to the web page are consecutive natural numbers, and each visitor ID corresponds to one storage area.
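The one-to-one mapping from access identifiers to consecutive natural-number visitor IDs can be sketched as below. The class `VisitorRegistry` is a hypothetical helper, and assigning IDs in order of first appearance is an assumption made for this illustration; the patent only requires that the IDs be consecutive natural numbers that correspond one to one with access identifiers.

```python
class VisitorRegistry:
    """Maps each access identifier (cookie) to a consecutive visitor ID."""

    def __init__(self):
        self._ids = {}  # cookie -> visitor ID (1, 2, 3, ...)

    def visitor_id(self, cookie: str) -> int:
        # A cookie seen for the first time receives the next natural number;
        # a known cookie always maps back to the same visitor ID.
        if cookie not in self._ids:
            self._ids[cookie] = len(self._ids) + 1
        return self._ids[cookie]

    def __len__(self) -> int:
        return len(self._ids)


registry = VisitorRegistry()
first = registry.visitor_id("d44bff32316243a88a7b")
second = registry.visitor_id("a6111e2390874b169bbe")
again = registry.visitor_id("d44bff32316243a88a7b")
```

Because the IDs are dense and start at 1, they can be used directly as array subscripts without leaving gaps in the access array.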
It should be further noted that, in the above embodiment, the visitor ID may be used as the subscript of the access array, and the access count of the visitor corresponding to each subscript is stored in the array as the element value at that subscript. Furthermore, the access data generated in each fixed time unit (for example, one hour), that is, one access array, may be stored as needed in the storage area indicated by the visitor ID, together with the correspondence between the access data and the time. The storage area may be local storage (for example, a hard disk, a server or a database).
The above embodiment is described in detail below with reference to Table 1, Table 2 and Table 3. The identifiers of the users who accessed the web page object (which may be an advertisement) within the preset recording period are obtained from the access data table (which may be Table 1), and the access identifiers are then converted into visitor IDs in the order of the users' access times. As shown in Table 2, a user's access identifier can be represented by the corresponding visitor ID, and visitor IDs may be consecutive natural numbers. In the arrays of Table 3, the value of the visitor ID is the value of the array subscript. For example, in the array [2,0,…0…,x] (that is, array a) corresponding to web page object "1" and time 2014-01-01 10:00 in Table 3, the value of visitor ID "1" is the subscript "1" in a_1.
In the above embodiment, converting the value of the user's visitor ID into the value of the array subscript allows the system to access the access data (that is, the arrays) quickly and to count accesses to a particular web page object within the preset recording period quickly, so that statistics requests for accessed web page objects are processed quickly and efficiently.
Table 1
Access      Web object  Access time              Access identifier (Cookie)
1           1           2014-01-01 10:00:01.321  d44bff32316243a88a7b
2           1           2014-01-01 10:00:07.314  d44bff32316243a88a7b
3           2           2014-01-01 10:00:08.294  a6111e2390874b169bbe
100000000               2014-01-01 10:59:59.274  ae58f93146a545a0b19a
As shown in Table 1, the first user to access within the preset recording period has the access identifier "d44bff32316243a88a7b". This user accessed web page object "1" at time "2014-01-01 10:00:01.321" and accessed that web page object only twice within the preset period. The user whose access identifier is "a6111e2390874b169bbe" accessed web page object "2" only once within the preset recording period. The accessed web page object may be an advertisement.
Table 2
Visitor ID  Access identifier (Cookie)
1           d44bff32316243a88a7b
2           a6111e2390874b169bbe
M           ae58f93146a545a0b19a
N
In Table 2, M and N have no specific meaning; they stand for the omitted visitor IDs. Table 2 lists the correspondence between access identifiers and visitor IDs.
Table 3
Web object  Time              Array
1           2014-01-01 10:00  [2,0,…0…,x]
2           2014-01-01 10:00  [0,1,…0…,y]
                              [...............]
Here, array [2,0,…0…,x] is array a, and array [0,1,…0…,y] is array b.
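The relationship between Table 1, Table 2 and Table 3 can be illustrated with a short sketch that starts from the first three rows of Table 1. The variable names are hypothetical, the arrays are sized to the two visitors that appear in those rows, and hourly periods are assumed as in Table 3. Python lists are 0-based, so visitor ID k is stored at position k - 1.

```python
from collections import defaultdict

# First three rows of Table 1: (web object, access time, cookie).
records = [
    (1, "2014-01-01 10:00:01.321", "d44bff32316243a88a7b"),
    (1, "2014-01-01 10:00:07.314", "d44bff32316243a88a7b"),
    (2, "2014-01-01 10:00:08.294", "a6111e2390874b169bbe"),
]

# Table 2: cookie -> visitor ID, assigned in order of first access.
visitor_ids = {}
for _, _, cookie in records:
    visitor_ids.setdefault(cookie, len(visitor_ids) + 1)

# Table 3: (web object, hour) -> access array indexed by visitor ID.
arrays = defaultdict(lambda: [0] * len(visitor_ids))
for web_object, timestamp, cookie in records:
    hour = timestamp[:13] + ":00"   # truncate the timestamp to its hour
    arrays[(web_object, hour)][visitor_ids[cookie] - 1] += 1

array_a = arrays[(1, "2014-01-01 10:00")]
array_b = arrays[(2, "2014-01-01 10:00")]
```

With these three rows, `array_a` starts with 2, 0 and `array_b` starts with 0, 1, which matches the leading elements of arrays a and b in Table 3.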
According to the above embodiment of the present invention, counting the number of times each visitor accessed the web page object within the preset recording period may comprise: reading, in visitor ID order, the access data within the preset recording period from the storage areas, and counting the access data in each storage area to obtain the access count.
Specifically, the access data of visitors accessing the web page object is obtained; the access data within the preset recording period is read from the storage areas in visitor ID order; the access data in each storage area is counted to obtain the access count; and each access count is used as an element to obtain the access array. When the access data in each storage area is counted, the access data within the preset period is read from the storage area according to the visitor ID, and the count gives the number of accesses by the user corresponding to that visitor ID. This access count is stored in the access array as the value of the element whose subscript is that visitor ID. In this embodiment the access array is built when the access data is stored, so when the number of unique visits to a particular web page object has to be queried, it can be obtained simply by querying the access arrays corresponding to that web page object, which improves query efficiency.
The access data in the above embodiment includes information such as the access identifier, the access time and the accessed object.
It should be further noted that the access counts corresponding to one visitor ID are all stored in the same storage area (for example, the same server), so counting the accesses of one visitor ID does not require reading data outside that storage area. In a distributed system, the access data of different visitor IDs may be stored in different storage areas. In that case, to count the accesses to a web page object within the recording period (for example, the number of unique visits), the visitors' access data in each storage area is counted separately, and the per-storage-area results are then processed further to obtain the total.
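The distributed case can be sketched as follows, assuming each storage area holds the slice of the access array that belongs to its own visitor IDs. The function name `count_unique_sharded` is hypothetical, and combining the partial results by simple addition is an assumption of this sketch: the text only says the per-storage-area results are processed further, and addition is valid here because each visitor ID lives in exactly one storage area.

```python
def count_unique_sharded(shards):
    """Count visitors across storage areas that hold disjoint visitor IDs."""
    # Each storage area counts its own non-zero elements independently.
    partial_counts = [
        sum(1 for count in shard if count != 0) for shard in shards
    ]
    # No visitor appears in two areas, so the partial counts simply add up.
    return sum(partial_counts)


# Hypothetical layout: visitor IDs 1-2 on one server, 3-5 on another.
shards = [[2, 0], [0, 1, 3]]
total_visitors = count_unique_sharded(shards)
```

The per-area counting can run in parallel, since no area needs data held by another.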
Specifically, when a user accesses a web page object, access data is left behind, which contains the user's identifier; for example, the access identifier of the first user to access within the preset recording period is "d44bff32316243a88a7b". Within a given period, one user may access the same web page object many times, which would lower the efficiency with which the system processes statistics requests. The above embodiment therefore converts the user's access identifier (that is, the cookie value) into a visitor ID (for example 1, 2 and 3, as described above) and maps the value of the visitor ID to the value of the array subscript. When processing a statistics request, the system then no longer has to consider whether a given access by the user is a repeat access; it only needs to traverse the element values at the subscripts, which improves the efficiency with which the system counts access data.
In the above embodiment of the present invention, after the non-zero elements of the access arrays are counted to obtain the number of visitors who accessed the web page object, the statistical method may also comprise: reading the visitor IDs of the non-zero elements; and obtaining the visitor's access information from the storage area of each visitor ID, where the access information comprises the visitor's attribute information and access data.
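Reading the visitor IDs of the non-zero elements and looking up the corresponding visitor information can be sketched as follows. The dictionary `visitor_store` is a hypothetical stand-in for the per-visitor storage areas, and it holds only the cookie value because the patent does not specify which attribute fields are stored.

```python
def nonzero_visitor_ids(access_array):
    """Return the 1-based visitor IDs whose access count is non-zero."""
    return [index + 1 for index, count in enumerate(access_array) if count != 0]


# Hypothetical per-visitor storage, keyed by visitor ID.
visitor_store = {
    1: {"cookie": "d44bff32316243a88a7b"},
    2: {"cookie": "a6111e2390874b169bbe"},
}

array_a = [2, 0]  # leading elements of array a in Table 3
visitors = [visitor_store[vid] for vid in nonzero_visitor_ids(array_a)]
```

Because the visitor ID doubles as the storage key, no search is needed to go from a non-zero element to the visitor's record.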
As shown in Table 3, when a statistics request is initiated for the access data of web page object "1" in the preset recording period 2014-01-01 10:00, it is only necessary to traverse the subscripts of array a and count over the element values at those subscripts: if an element value is non-zero, the result is incremented by one. This yields the requested data and greatly improves the efficiency of processing statistics requests.
In the above embodiment of the present invention, a visitor ID may be assigned to each unique visitor. The visitor ID corresponds to the identifier generated when the user accesses the web page object and is also consistent with the storage location of that visitor's access information. An access array is thus built for each combination of time (the preset recording period, for example 10:00 to 11:00), dimension (the web page object, for example an advertisement) and indicator (the access count). The subscript of each element of the access array is a visitor ID, the length of the access array is the number of unique users (that is, the number of visitors stored in the system), and each element represents the number of times one unique visitor accessed the web page object.
When access data is counted, the access arrays that satisfy the time and dimension filters are retrieved, the subscripts of the access arrays are traversed, a counting operation is performed on each element, and the statistics are available once the traversal is complete.
For example, to query the number of unique visitors to advertisement 1 between 10:00 and 11:00 on 2014-01-01: read the access array of advertisement 1 for 2014-01-01 10:00 from Table 3 and traverse its elements; whenever an element is not 0, increment the counter by one; then return the counter value, which is the number of visitors who accessed the web page object.
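When a statistics time spans several recording periods, more than one access array is extracted. The text does not spell out how several arrays are combined, so the sketch below rests on an assumption: a visitor is counted once if the element at that visitor's subscript is non-zero in any of the arrays. The function name `count_unique_across` is hypothetical.

```python
from itertools import zip_longest


def count_unique_across(access_arrays):
    """Count visitors with a non-zero element in at least one array."""
    # zip_longest lines up the elements that share a subscript; shorter
    # arrays are padded with 0, meaning no access by that visitor.
    return sum(
        1 for counts in zip_longest(*access_arrays, fillvalue=0) if any(counts)
    )


# Hypothetical hourly arrays for one advertisement.
hourly = [
    [2, 0, 0],
    [0, 0, 1],
    [1, 0, 0],
]
unique_over_three_hours = count_unique_across(hourly)
```

Simply adding the per-array counts would count the first visitor twice in this example, which is why the arrays are combined subscript by subscript.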
In the above embodiment, the total number of unique visitors remains relatively stable. Traversing an access array of 1 billion visitors takes only about 2 seconds on a PC, and the I/O time to read one array is about 6 seconds, so counting the unique visitors of one advertisement in one period takes no more than 10 seconds.
Suppose the time unit of an access array (the preset recording period) is one day. The required query time is then (2 seconds + 6 seconds) * 3 days * 10 advertisements * 10 projects * 100 clients = 66 hours = 2.8 days; with 40 servers working in parallel, only about 1.68 hours are needed. If the same storage scheme is applied per project, the time is shortened to 10 minutes; if each client is processed in the manner of this patent, only 1 minute is needed.
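The estimate above can be reproduced with a few lines of arithmetic. The constant names are chosen for illustration. The last two lines reflect one reading of the per-project and per-client figures, namely dividing by the 10 projects per client and then by the 10 advertisements per project; that reading is an assumption of this sketch.

```python
SECONDS_PER_ARRAY = 2 + 6    # traversal time plus I/O time per access array
DAYS = 3
ADS_PER_PROJECT = 10
PROJECTS_PER_CLIENT = 10
CLIENTS = 100
SERVERS = 40

total_seconds = (
    SECONDS_PER_ARRAY * DAYS * ADS_PER_PROJECT * PROJECTS_PER_CLIENT * CLIENTS
)
total_hours = total_seconds / 3600            # about 66.7 hours
total_days = total_hours / 24                 # about 2.8 days
parallel_hours = total_hours / SERVERS        # about 1.67 hours on 40 servers
parallel_minutes = parallel_hours * 60        # about 100 minutes
per_project_minutes = parallel_minutes / 10   # about 10 minutes
per_client_minutes = per_project_minutes / 10 # about 1 minute
```

Computed without intermediate rounding, the parallel figure is about 1.67 hours; the 1.68 hours quoted in the text results from rounding to 2.8 days first (2.8 * 24 / 40 = 1.68).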
With the above embodiment, analyses of user overlap between advertisements (the web page objects) and analyses of individual users are processed far faster than with traditional distributed queries. Specifically, the storage format determines query efficiency. Provided that cookie values are relatively stable, this application stores data on the basis of the cookie value in such a way that the visitor ID corresponds directly to the physical storage location, which greatly speeds up queries.
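The user overlap analysis mentioned above can be sketched on top of the same storage format. The function name `overlap_count` is hypothetical, and defining overlap as the number of visitors with a non-zero element in both arrays is an assumption of this illustration; the patent does not define the overlap measure.

```python
def overlap_count(array_a, array_b):
    """Count visitors who accessed both web objects in the same period."""
    # Both arrays are indexed by the same visitor IDs, so a visitor who
    # accessed both objects has a non-zero element at the same position.
    return sum(1 for a, b in zip(array_a, array_b) if a != 0 and b != 0)


# Hypothetical access arrays for two advertisements over the same visitors.
ad_1 = [2, 0, 1, 4]
ad_2 = [0, 1, 3, 1]
shared_visitors = overlap_count(ad_1, ad_2)
```

The comparison is a single pass over aligned positions, with no join or lookup on cookie values.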
Fig. 2 is the schematic diagram of the statistic device of web page access data according to the embodiment of the present invention, and this statistic device can comprise: the first acquisition module 10, extraction module 20 and the first statistical module 30.
Wherein, the first acquisition module 10, for obtaining the statistics request of statistics accessed web page object, wherein, carries the mark of timing statistics and web object in statistics request; Extraction module 20, one or more access arrays of the corresponding timing statistics of the mark for extracting web object from database, wherein, an element in each access array is for recording the access times of a guest access web object; First statistical module 30, for adding up the number of accessing element non-vanishing in array, obtains visitor's quantity of accessed web page object.
Adopt the above embodiment of the present invention, first acquisition module extracts the visitor's mark in the visit data of guest access web object, then extraction module is using the numerical value of the visitor of visitor mark as array subscript, using visitor to the access times value of this web object as array element value, and according to different web objects and preset recording period array element is identified according to visitor be stored sequentially in array, like this when user initiates the statistics request to the data of accessed web page object, first statistical module only needs the element among to array to carry out simple computation can obtain this data required by statistics request.Adopt the present invention, solve inefficient problem when the statistics request of accessed web page object being processed in prior art, reach the effect that the statistics request of accessed web page object is rapidly and efficiently processed.
According to the above embodiment of the present invention, the statistical device may further comprise: a second acquisition module, configured to obtain the visit data of visitors accessing the web object before the statistics request is obtained; a second statistical module, configured to count the number of times each visitor accessed the web object within a preset recording period; and an array module, configured to use each access count as an element to obtain the access array.
In the above embodiment of the present invention, after the second acquisition module obtains the visit data for the web object, the second statistical module counts the number of times the same user accessed the web object, and the array module then stores each access count at the corresponding position. In this process the user's visitor identifier is converted into an array subscript, so that when counting accesses to a given web object the system no longer needs to query whether the user has previously accessed that web object. This greatly improves the statistical efficiency of the system and thereby achieves fast and efficient processing of statistics requests for web object accesses.
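A minimal sketch of how such an access array might be built from raw visit records follows. It assumes each record has already been reduced to a pair of visitor identifier and web object identifier for one recording period; `build_access_array` and the sample log are hypothetical.

```python
from typing import Iterable, List, Tuple


def build_access_array(
    visits: Iterable[Tuple[int, str]], object_id: str, visitor_total: int
) -> List[int]:
    """Build the access array for one web object in one recording period.

    visits        -- (visitor identifier, web object identifier) pairs
    visitor_total -- number of visitors known to the system (array length)
    """
    access_array = [0] * visitor_total
    for visitor_id, visited_object in visits:
        if visited_object == object_id:
            # The visitor identifier is used directly as the subscript, so no
            # lookup is needed to check whether the visitor was seen before.
            access_array[visitor_id] += 1
    return access_array


visit_log = [(0, "ad-1"), (2, "ad-1"), (0, "ad-1"), (1, "ad-2")]
array_ad1 = build_access_array(visit_log, "ad-1", visitor_total=4)
```

Here visitor 0 accessed "ad-1" twice and visitor 2 once, so only subscripts 0 and 2 hold non-zero counts.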
In the above embodiment of the present invention, the array module may comprise: a recording module, configured to record the visitor identifier of each visitor for whom an access count exists; and a conversion module, configured to use the visitor identifier as the subscript of the element holding that access count.
In the above embodiment, the numerical value of the user's visitor identifier is converted into the numerical value of the array subscript, so that the system can access the access array quickly and can quickly count the accesses to a particular web object within a particular time period, thereby achieving fast and efficient processing of statistics requests for web object accesses.
According to the above embodiment of the present invention, the second acquisition module may comprise: an extraction submodule, configured to extract an access identifier from the visit data of each visitor; a setting module, configured to set a visitor identifier for the access identifier; and a storage module, configured to store the visitor's visit data in the storage area indicated by the visitor identifier. The second statistical module may be configured to read, in visitor-identifier order, the visit data within the preset recording period from the storage areas, and to count the visit data in each storage area to obtain the access counts.
Access identifiers and visitor identifiers correspond one to one. The access identifier is the identifier generated when the visitor accesses the web object, the visitor identifiers are consecutive natural numbers, and each visitor identifier corresponds to one storage area.
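The one-to-one mapping from access identifiers to consecutive visitor identifiers can be sketched as below. This is an illustration only: the class `VisitorRegistry`, the zero-based numbering, the cookie strings and the record fields are all assumptions, and a Python list stands in for the storage areas.

```python
from typing import Dict, List


class VisitorRegistry:
    """Assigns consecutive visitor identifiers and keeps one storage area each."""

    def __init__(self) -> None:
        self._ids: Dict[str, int] = {}      # access identifier -> visitor identifier
        self.storage: List[List[dict]] = []  # storage[visitor identifier] = visit records

    def visitor_id(self, access_identifier: str) -> int:
        """Return the visitor identifier, allocating the next number if new."""
        if access_identifier not in self._ids:
            self._ids[access_identifier] = len(self.storage)
            self.storage.append([])
        return self._ids[access_identifier]

    def store_visit(self, access_identifier: str, record: dict) -> int:
        """Store a visit record in the storage area of its visitor."""
        vid = self.visitor_id(access_identifier)
        self.storage[vid].append(record)
        return vid


registry = VisitorRegistry()
first = registry.store_visit("cookie-abc", {"object": "1", "time": "2014-01-01 10:05"})
second = registry.store_visit("cookie-xyz", {"object": "1", "time": "2014-01-01 10:07"})
again = registry.store_visit("cookie-abc", {"object": "1", "time": "2014-01-01 10:30"})

# Reading the storage areas in visitor-identifier order yields the access counts.
access_counts = [len(area) for area in registry.storage]
```

Because identifiers are allocated consecutively, the list of per-area record counts is already in subscript order and can serve directly as an access array.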
As shown in Table 3, when a statistics request is initiated for the visit data of web object "1" at 2014-01-01 10:00, it is only necessary to traverse the subscripts of array a and perform a counting operation on the element value at each subscript: if the element value is non-zero, the running count is incremented by one. This yields the data asked for by the statistics request, and the method greatly improves the efficiency of processing statistics requests.
Further, the statistical device may also comprise: a reading module, configured to read the visitor identifiers of the non-zero elements after the non-zero elements in the access array have been counted and the number of visitors who accessed the web object has been obtained; and a third acquisition module, configured to obtain each visitor's visit information from the storage area identified by the visitor identifier, where the visit information comprises the visitor's attribute information and visit data.
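The reading module and third acquisition module can be sketched as follows, assuming the storage areas are held in a list indexed by visitor identifier. The field names `attributes` and `visits` and the sample values are hypothetical.

```python
from typing import Dict, List


def visitors_with_visits(access_array: List[int]) -> List[int]:
    """Return the visitor identifiers (subscripts) of the non-zero elements."""
    return [vid for vid, visits in enumerate(access_array) if visits != 0]


def fetch_visit_info(storage: List[Dict], visitor_ids: List[int]) -> List[Dict]:
    """Read each visitor's information from the storage area its identifier names."""
    return [storage[vid] for vid in visitor_ids]


# Hypothetical storage areas, one per visitor identifier.
visitor_storage = [
    {"attributes": {"region": "Beijing"}, "visits": ["10:05", "10:30"]},
    {"attributes": {"region": "Shanghai"}, "visits": []},
    {"attributes": {"region": "Tianjin"}, "visits": ["10:12"]},
]
access_array = [2, 0, 1]

active_ids = visitors_with_visits(access_array)
active_info = fetch_visit_info(visitor_storage, active_ids)
```

Since the subscript is the storage location, retrieving the information is a direct index access rather than a search.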
In the above embodiment of the present invention, a visitor identifier may be assigned to each unique visitor. The visitor identifier corresponds to the identifier generated when the user accesses the web object, and it also matches the storage location of that visitor's visit information. An access array is then built according to time (the preset recording period described above, for example the period from 10:00 to 11:00), dimension (the web object described above, for example an advertisement) and metric (the access count described above). The subscript of each element of the access array is a visitor identifier, the length of the access array is the number of unique users (that is, the number of visitors stored in the system), and each element represents the number of times one unique visitor accessed the web object.
When visit data is to be counted, the access arrays that match the time and dimension filters are retrieved, the subscripts of each access array are traversed, and a counting operation is performed on each element; the statistical information is obtained once the traversal is complete.
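When a request spans several recording periods, more than one access array matches the filter. The sketch below shows one possible way to combine them, assuming a visitor should be counted once if any matching array holds a non-zero count, and assuming later arrays may be longer because new visitors were added. The source does not spell out this combination rule, and `count_distinct_visitors` is a hypothetical name.

```python
from itertools import zip_longest
from typing import Dict, List, Sequence, Tuple

AccessStore = Dict[Tuple[str, str], List[int]]


def count_distinct_visitors(
    store: AccessStore, object_id: str, periods: Sequence[str]
) -> int:
    """Count visitors with a non-zero access count in any selected period."""
    # Filter by dimension (web object) and time (recording periods).
    selected = [store[(object_id, p)] for p in periods if (object_id, p) in store]
    # Sum the counts position by position; shorter arrays are padded with 0.
    combined = [sum(column) for column in zip_longest(*selected, fillvalue=0)]
    return sum(1 for total in combined if total != 0)


hourly_store: AccessStore = {
    ("ad-1", "2014-01-01 10:00"): [1, 0, 0, 2],
    ("ad-1", "2014-01-01 11:00"): [0, 0, 3, 1, 1],
    ("ad-2", "2014-01-01 10:00"): [0, 5, 0, 0],
}
distinct = count_distinct_visitors(
    hourly_store, "ad-1", ["2014-01-01 10:00", "2014-01-01 11:00"]
)
```

For the two "ad-1" arrays the combined counts are 1, 0, 3, 3 and 1, so four distinct visitors are reported; visitor 3 is counted once even though it appears in both periods.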
With the above embodiment, visitor overlap between advertisements (that is, the web objects described above) can be analyzed, and single-user analysis is processed far faster than with a traditional distributed query. Specifically, the storage format determines query efficiency: provided that Cookie values are relatively stable, the present application stores data on the basis of the Cookie value so that each visitor identifier corresponds directly to a physical storage location, which greatly speeds up queries.
The modules provided in this embodiment are used in the same way as the corresponding steps of the method embodiment, and the application scenarios may also be the same. It should be noted that the schemes involving the above modules are not limited to the content and scenarios of the above embodiments, and that the modules may run on a computer terminal or a mobile terminal and may be implemented in software or hardware.
As can be seen from the above description, the present invention achieves the following technical effects:
In the above embodiment of the present invention, the visitor identifier is extracted from the visit data of a visitor accessing the web object, the numerical value of the visitor identifier is used as the array subscript, and the visitor's number of accesses to the web object is used as the array element value. The array elements are then stored in the array in visitor-identifier order, per web object and per preset recording period. When a user initiates a statistics request for data on accesses to the web object, only a simple calculation over the elements of the array is needed to obtain the requested data. The present invention thus solves the prior-art problem of inefficient processing of statistics requests for web object accesses, and achieves fast and efficient processing of such requests.
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented with a general-purpose computing device. They can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; alternatively, they can each be made into a separate integrated circuit module, or several of the modules or steps can be made into a single integrated circuit module. The present invention is therefore not limited to any specific combination of hardware and software.
It should be noted that, for simplicity of description, each of the foregoing method embodiments is expressed as a series of combined actions. Those skilled in the art should know, however, that the present invention is not limited by the described order of actions, because according to the present invention some steps can be performed in other orders or simultaneously. Those skilled in the art should also know that the embodiments described in the specification are all preferred embodiments, and that the actions and modules involved are not necessarily required by the present invention.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
In the several embodiments provided by the present application, it should be understood that the disclosed device may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be through certain interfaces, and the indirect coupling or communication connection between devices or units may be electrical or take other forms.
The foregoing is only the preferred embodiments of the present invention and is not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

1. A statistical method for web page access data, characterized by comprising:
obtaining a statistics request for counting accesses to a web object, wherein the statistics request carries a statistics time and an identifier of the web object;
extracting from a database one or more access arrays that correspond to the identifier of the web object and to the statistics time, wherein each element of each access array records the number of times one visitor accessed the web object; and
counting the non-zero elements in the access array to obtain the number of visitors who accessed the web object.
2. The statistical method according to claim 1, characterized in that, before the statistics request for counting accesses to the web object is obtained, the statistical method further comprises:
obtaining visit data of visitors accessing the web object;
counting the number of times each visitor accessed the web object within a preset recording period; and
using each access count as an element to obtain the access array.
3. The statistical method according to claim 2, characterized in that using each access count as an element to obtain the access array comprises:
recording the visitor identifier of each visitor for whom an access count exists; and
using the visitor identifier as the subscript of the element holding that access count.
4. The statistical method according to claim 2 or 3, characterized in that:
obtaining the visit data of visitors accessing the web object comprises: extracting an access identifier from the visit data of each visitor; setting a visitor identifier for the access identifier; and storing the visitor's visit data in the storage area indicated by the visitor identifier, wherein access identifiers and visitor identifiers correspond one to one, the access identifier is the identifier generated when the visitor accesses the web object, the visitor identifiers are consecutive natural numbers, and each visitor identifier corresponds to one storage area; and
counting the number of times each visitor accessed the web object within the preset recording period comprises: reading, in visitor-identifier order, the visit data within the preset recording period from the storage areas, and counting the visit data in each storage area to obtain the access counts.
5. The statistical method according to claim 4, characterized in that, after the non-zero elements in the access array are counted and the number of visitors who accessed the web object is obtained, the statistical method further comprises:
reading the visitor identifiers of the non-zero elements; and
obtaining the visit information of each visitor from the storage area identified by the visitor identifier, wherein the visit information comprises the visitor's attribute information and the visit data.
6. A statistical device for web page access data, characterized by comprising:
a first acquisition module, configured to obtain a statistics request for counting accesses to a web object, wherein the statistics request carries a statistics time and an identifier of the web object;
an extraction module, configured to extract from a database one or more access arrays that correspond to the identifier of the web object and to the statistics time, wherein each element of each access array records the number of times one visitor accessed the web object; and
a first statistical module, configured to count the non-zero elements in the access array to obtain the number of visitors who accessed the web object.
7. The statistical device according to claim 6, characterized in that the statistical device further comprises:
a second acquisition module, configured to obtain visit data of visitors accessing the web object before the statistics request for counting accesses to the web object is obtained;
a second statistical module, configured to count the number of times each visitor accessed the web object within a preset recording period; and
an array module, configured to use each access count as an element to obtain the access array.
8. The statistical device according to claim 7, characterized in that the array module comprises:
a recording module, configured to record the visitor identifier of each visitor for whom an access count exists; and
a conversion module, configured to use the visitor identifier as the subscript of the element holding that access count.
9. The statistical device according to claim 7 or 8, characterized in that:
the second acquisition module comprises: an extraction submodule, configured to extract an access identifier from the visit data of each visitor; a setting module, configured to set a visitor identifier for the access identifier; and a storage module, configured to store the visitor's visit data in the storage area indicated by the visitor identifier, wherein access identifiers and visitor identifiers correspond one to one, the access identifier is the identifier generated when the visitor accesses the web object, the visitor identifiers are consecutive natural numbers, and each visitor identifier corresponds to one storage area; and
the second statistical module is configured to read, in visitor-identifier order, the visit data within the preset recording period from the storage areas, and to count the visit data in each storage area to obtain the access counts.
10. The statistical device according to claim 9, characterized in that the statistical device further comprises:
a reading module, configured to read the visitor identifiers of the non-zero elements after the non-zero elements in the access array are counted and the number of visitors who accessed the web object is obtained; and
a third acquisition module, configured to obtain the visit information of each visitor from the storage area identified by the visitor identifier, wherein the visit information comprises the visitor's attribute information and the visit data.
CN201410812114.6A 2014-12-22 2014-12-22 The statistical method and device of web page access data Active CN104504077B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410812114.6A CN104504077B (en) 2014-12-22 2014-12-22 The statistical method and device of web page access data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410812114.6A CN104504077B (en) 2014-12-22 2014-12-22 The statistical method and device of web page access data

Publications (2)

Publication Number Publication Date
CN104504077A true CN104504077A (en) 2015-04-08
CN104504077B CN104504077B (en) 2018-04-03

Family

ID=52945475

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410812114.6A Active CN104504077B (en) 2014-12-22 2014-12-22 The statistical method and device of web page access data

Country Status (1)

Country Link
CN (1) CN104504077B (en)

Also Published As

Publication number Publication date
CN104504077B (en) 2018-04-03

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Web access data statistical method and the device

Effective date of registration: 20190531

Granted publication date: 20180403

Pledgee: Shenzhen Black Horse World Investment Consulting Co.,Ltd.

Pledgor: BEIJING GRIDSUM TECHNOLOGY Co.,Ltd.

Registration number: 2019990000503

PE01 Entry into force of the registration of the contract for pledge of patent right
PP01 Preservation of patent right

Effective date of registration: 20240604

Granted publication date: 20180403

PP01 Preservation of patent right