CN105577455A - Method and system for performing real-time UV statistic of massive logs - Google Patents

Info

Publication number
CN105577455A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610126930.0A
Other languages
Chinese (zh)
Inventor
桂洪冠
陈运文
高翔
于敬
江永青
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Information Technology (shanghai) Co Ltd
Original Assignee
Information Technology (shanghai) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Information Technology (shanghai) Co Ltd filed Critical Information Technology (shanghai) Co Ltd
Priority to CN201610126930.0A priority Critical patent/CN105577455A/en
Publication of CN105577455A publication Critical patent/CN105577455A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/069 Management of faults, events, alarms or notifications using logs of notifications; Post-processing of notifications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and system for real-time UV (unique visitor) statistics on massive logs. The system comprises a Bloom filter creation and initialization module, a real-time log receiving module, a log processing module and a result output module. In the Bloom-filter-based real-time UV statistics system provided by the invention, a plurality of hash functions are selected so that each real-time PVLog is mapped, in constant time, onto a corresponding number of specific bits of a bit array; the current UV value can then be calculated in real time through a simple check of those bits. The system is simple to implement, occupies few system resources, runs efficiently and operates in real time. The method occupies little memory, that is, it has favorable space complexity, and little processor time, that is, it has favorable time complexity, so that dynamic real-time calculation of UV can be carried out conveniently.

Description

A method and system for real-time UV statistics on massive logs
Technical field
The present invention relates to the field of big data on the Internet, and in particular to a method and system for real-time UV statistics on massive logs.
Background
UV is short for unique visitor and refers to a natural person who accesses and browses a given web page or app over the Internet. UV reflects actual users: compared with counting IP addresses, each unique visitor corresponds more closely to one actual viewer. Using UV as a statistic shows more accurately how many visitors actually came to a page within a unit of time, and it is an important indicator of how a website or app is used.
An important concept related to UV is PV. PV is short for page view: within a given statistical period, each time a user refreshes a page counts as one PV. Like UV, PV is an important indicator of how a website or app is accessed. Each time a user refreshes a page, the system records one access log entry. The access log, also called the PVLog, usually exists in the form of a file. Each entry generally records at least who accessed which page and when; other information may be recorded with it as required.
From the definitions of PV and UV, UV is obtained by deduplicating the PVLog entries of the same user within a period of time. The period here is the UV statistical cycle, which can be a day or an hour, giving day-level UV or hour-level UV respectively.
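The relationship between PV, day-level UV and hour-level UV can be illustrated with a minimal exact-deduplication sketch. It is only a baseline for small inputs, not the method of the invention; the entry layout `(user_id, timestamp)` and the function name `uv_by_period` are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime

def uv_by_period(pv_log, fmt):
    """Count distinct users per period; fmt is a strftime pattern naming the period."""
    seen = defaultdict(set)  # period label -> set of user ids seen in that period
    for user_id, ts in pv_log:
        seen[ts.strftime(fmt)].add(user_id)
    return {period: len(users) for period, users in seen.items()}

# Hypothetical PVLog entries: (who, when). The page field is omitted for brevity.
pv_log = [
    ("alice", datetime(2016, 3, 7, 9, 5)),
    ("alice", datetime(2016, 3, 7, 9, 40)),  # same user, same hour
    ("bob", datetime(2016, 3, 7, 9, 59)),
    ("alice", datetime(2016, 3, 7, 10, 1)),  # same user, next hour
]

pv = len(pv_log)                                # every entry is one PV
day_uv = uv_by_period(pv_log, "%Y-%m-%d")       # day-level UV
hour_uv = uv_by_period(pv_log, "%Y-%m-%d %H")   # hour-level UV
```

Here the four page views come from two distinct users on the day, while the hour-level view counts alice once in each hour in which she appears.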
UV can therefore be calculated by deduplicating the entries of the same user in the PVLog (PV log) within a period of time. For a large website the PVLog is usually massive; for example, the daily PV of the search page of a well-known domestic C2C e-commerce site reaches the billions, and the logs are produced dynamically and continuously. The problem of UV statistics thus becomes the problem of deduplicating a very large data set efficiently, and real-time UV statistics means deduplicating, in real time, a very large data set that keeps growing.
The traditional method of UV calculation is based on a hash table (hashtable), and this scheme has the following problems:
1. For a 32-bit hash value, once the hashtable holds on the order of 100,000 elements, the probability of a collision when inserting elements exceeds 50%. To reduce the collision probability of a hashtable with on the order of 1 billion elements to below 1%, the hash value must be at least 64 bits, at which point the hashtable occupies more than 20 GB of memory. As the data set keeps growing, the memory cost may exceed the limit of a single machine.
2. When on the order of 1 billion elements are inserted, the hashtable is resized dozens of times and the cost of each successive resize grows exponentially. For workloads that require real-time calculation, this scheme becomes unusable.
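The collision figures in problem 1 can be checked with the standard birthday approximation, sketched below. The approximation and the assumption of uniformly distributed hash values are not taken from the source; the function name `collision_probability` is hypothetical.

```python
import math

def collision_probability(n, bits):
    """Approximate chance that n uniformly random `bits`-bit hash values contain a collision."""
    pairs = n * (n - 1) / 2              # number of element pairs that could collide
    return -math.expm1(-pairs / 2.0 ** bits)  # 1 - exp(-pairs / 2^bits), computed stably

p32 = collision_probability(100_000, 32)        # 100,000 elements, 32-bit hash
p64 = collision_probability(1_000_000_000, 64)  # 1 billion elements, 64-bit hash
```

Under this approximation, `p32` is about 69%, consistent with the stated "exceeds 50%", while `p64` is roughly 2.7%, which suggests that 64 bits is better read as a minimum than as a comfortable margin.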
Summary of the invention
The technical problem to be solved by the present invention is to deduplicate a very large data set in real time on the basis of a Bloom filter, and thereby to compute UV statistics quickly.
The real-time UV statistical system is implemented on the basis of a Bloom filter (BloomFilter). By selecting several hash functions, each real-time PVLog is mapped in constant time onto a corresponding number of specific bit positions of a bit array, and the current UV value is then calculated in real time by a simple check. The system is simple to implement, occupies few system resources, runs efficiently and operates in real time.
To solve the above technical problem, the invention provides a method for real-time UV statistics on massive logs, comprising:
Collecting the PVLog (page view log), then distributing it to await processing; setting a UV counter at the same time;
Creating a BloomFilter: creating a BitArray (bit array) in the heap memory of the current process and defining k different Hash functions, where k is the number of Hash functions in the Bloom filter and each Hash function maps an element (a PVLog) to one position in the BitArray;
Initializing all positions in the BitArray to 0;
Receiving the PVLogs awaiting processing, and mapping each PVLog to k bit positions of the BitArray by means of the k different Hash functions;
Judging whether the k bit positions are all 1; if not, adding 1 to the UV counter and setting all k bit positions to 1;
Outputting the value of the UV counter, completing the UV statistics.
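The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of the invention: the class name `BloomUVCounter`, the use of SHA-256 and the double-hashing scheme for deriving the k positions are all choices made here, since the source does not name concrete hash functions.

```python
import hashlib

class BloomUVCounter:
    """Approximate unique-visitor counter backed by a Bloom filter."""

    def __init__(self, m, k):
        self.m = m                           # number of bits in the BitArray
        self.k = k                           # number of hash functions
        self.bits = bytearray((m + 7) // 8)  # all bits start at 0
        self.uv = 0                          # the UV counter

    def _positions(self, key):
        # Assumption: derive k positions from one SHA-256 digest by double hashing.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # odd step, never zero
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, key):
        """Process one PVLog key; return True if it was counted as a new visitor."""
        positions = self._positions(key)
        if all(self.bits[p >> 3] & (1 << (p & 7)) for p in positions):
            return False                     # all k bits are 1: skip the counter
        for p in positions:
            self.bits[p >> 3] |= 1 << (p & 7)  # set the k bits to 1
        self.uv += 1
        return True

counter = BloomUVCounter(m=1 << 20, k=7)     # hypothetical sizes
for visitor in ["u1", "u2", "u1", "u3", "u2", "u1"]:
    counter.add(visitor)
print(counter.uv)
```

Because a Bloom filter has false positives but no false negatives, a repeated visitor is never counted twice, whereas a new visitor may occasionally be mistaken for a known one; the resulting UV value can therefore only undercount.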
Further, the k different Hash functions are defined as follows:
Each Hash function hashes an element to a position according to a uniform random distribution, so that the k different Hash functions hash an element to k different positions.
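One possible way to obtain k different hash functions is to salt a single well-mixed hash with the function index, as sketched below. The choice of BLAKE2b and the helper name `make_hash_functions` are assumptions for illustration; the source only requires that each function distribute elements uniformly.

```python
import hashlib

def make_hash_functions(k, m):
    """Return k functions, each mapping a string to a position in range(m)."""
    def make(i):
        salt = i.to_bytes(8, "big")  # a distinct salt per function (assumption)
        def h(element):
            digest = hashlib.blake2b(
                element.encode("utf-8"), digest_size=8, salt=salt
            ).digest()
            return int.from_bytes(digest, "big") % m
        return h
    return [make(i) for i in range(k)]

hash_functions = make_hash_functions(k=3, m=1 << 20)
positions = [h("user-42") for h in hash_functions]  # k positions for one element
```

With independently salted functions, two of the k positions can occasionally coincide for a given element; this does not affect correctness and only slightly reduces the number of distinct bits set.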
Further, the method of creating the Bloom filter comprises:
In the initial state, setting all positions of the BitArray of length m to 0;
For a set D = {d1, d2, ..., dn} of n elements, mapping each element di (1 <= i <= n) of D to k values {y1, y2, ..., yk} by means of k mapping functions {f1, f2, ..., fk}, and then setting the corresponding positions array[y1], array[y2], ..., array[yk] of the BitArray to 1.
Further, the PVLog is collected by front-end page js reporting, background server reporting or mobile client sdk reporting.
Further, it is judged whether the k bit positions are all 1; if so, the UV counter is skipped and not incremented, and receiving of PVLogs to be processed continues.
Based on the above method, the invention also provides a system for real-time UV statistics on massive logs, comprising:
A Bloom filter creation and initialization module, configured to create a BloomFilter, create a BitArray in memory, define k different Hash functions, and initialize all positions in the BitArray to 0;
A real-time log receiving module, configured to receive the PVLogs awaiting processing;
A log processing module, configured to map each PVLog awaiting processing to k bit positions of the BitArray by means of the k different Hash functions, and to judge whether the k bit positions are all 1; if not, the UV counter is incremented by 1 and all k bit positions are set to 1;
A result output module, configured to output the value of the UV counter, completing the UV statistics.
Further, the system also comprises a website PV log real-time collection unit,
configured to send the collected PV logs to the log processing module in real time by front-end page js reporting, background server reporting or mobile client sdk reporting.
Further, the system also comprises a distribution subsystem,
configured to collect logs through scribe and to distribute the logs collected in real time to the log processing module.
Further, the real-time log receiving module is configured to receive in real time the PVLogs sent by the distribution subsystem and to forward them to the log processing module.
Further, the result output module is configured to output the value of the UV counter in real time to an external file, a database, a shared drive and a KV storage engine.
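A result output module writing to several targets might look like the sketch below. It assumes a JSON file, an in-memory SQLite database and a plain dict standing in for a KV storage engine; the table name `uv_stats`, the key format and the function names are hypothetical and not taken from the source.

```python
import json
import os
import sqlite3
import tempfile

def output_to_file(path, uv):
    """Write the current UV value to an external file as JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"uv": uv}, f)

def output_to_database(conn, period, uv):
    """Upsert the UV value for one statistical period into a database table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS uv_stats (period TEXT PRIMARY KEY, uv INTEGER)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO uv_stats (period, uv) VALUES (?, ?)", (period, uv)
    )
    conn.commit()

def output_to_kv(store, key, uv):
    """Store the UV value under a key; a dict stands in for a KV storage engine."""
    store[key] = uv

uv_value = 12345                                   # hypothetical counter value
out_path = os.path.join(tempfile.mkdtemp(), "uv.json")
output_to_file(out_path, uv_value)
db = sqlite3.connect(":memory:")
output_to_database(db, "2016-03-07", uv_value)
kv = {}
output_to_kv(kv, "uv:2016-03-07", uv_value)
```

Keeping each target behind its own small function lets the processing loop stay unaware of where the counter value ends up.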
Beneficial effects of the present invention:
1) Less memory is occupied, that is, better space complexity
According to the conclusion derived for Bloom filter false positives, for the UV statistics considered here, suppose the PVLog holds 1 billion records, i.e. n = 1 billion. If the acceptable error rate is 0.01, the size of the BitArray is m ≈ 1 billion * 9.585 bits, and the BitArray occupies less than 1.2 GB. Even if the acceptable error rate is reduced to 0.0001, the size of the BitArray is m ≈ 1 billion * 19.170 bits, and the BitArray still occupies less than 2.3 GB. For UV statistics, an error of one in ten thousand is generally acceptable.
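The multipliers 9.585 and 19.170 match the commonly used Bloom filter sizing formulas m = -n ln p / (ln 2)^2 and k = (m / n) ln 2. The source states only the resulting figures, so treating these formulas as the underlying derivation is an assumption; the function names below are hypothetical, and memory is reported in GiB (2^30 bytes), which is one reading of the stated "G".

```python
import math

def bits_per_element(p):
    """Bits of BitArray needed per element for false positive rate p (optimal k)."""
    return -math.log(p) / (math.log(2) ** 2)

def bloom_parameters(n, p):
    """Return (m, k): BitArray length in bits and number of hash functions."""
    m = math.ceil(n * bits_per_element(p))
    k = max(1, round(m / n * math.log(2)))
    return m, k

n = 1_000_000_000  # 1 billion PVLog records
for p in (0.01, 0.0001):
    m, k = bloom_parameters(n, p)
    gib = m / 8 / 2 ** 30
    print(f"p={p}: {m / n:.3f} bits per element, k={k}, {gib:.2f} GiB")
```

The same formulas also suggest the number of hash functions: about 7 for an error rate of 0.01 and about 13 for 0.0001.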
2) Less processor time is occupied, that is, better time complexity
The frequent collisions and repeated resizes that a hashtable faces when inserting elements on a large scale do not arise. Mapping with the k hash functions and setting the k bit positions both take constant time, so the time complexity of the whole process is O(N), i.e. linear.
3) Dynamic real-time calculation of UV can be carried out conveniently
The PV logs of a typical website or app are produced dynamically in real time; each received PVLog only needs to be fed to the Bloom filter processing module.
Brief description of the drawings
Fig. 1 is a flow diagram of a method for real-time UV statistics on massive logs in one embodiment of the invention.
Fig. 2 is a flow diagram of the method of defining the Hash functions in Fig. 1.
Fig. 3 is a detailed flow diagram of the method of creating the Bloom filter in Fig. 1.
Fig. 4 is a flow diagram of the method of collecting the PVLog in Fig. 1.
Fig. 5 is a flow diagram of a further operating step in Fig. 1.
Fig. 6 is a structural diagram of a system for real-time UV statistics on massive logs in one embodiment of the invention.
Fig. 7 is a diagram of a preferred implementation of Fig. 6.
Fig. 8 is a diagram of a preferred implementation of Fig. 6.
Fig. 9 is a diagram of the BitArray in Fig. 3.
Detailed description
To make the objects, technical solutions and advantages of the invention clearer, the invention is described in more detail below with reference to specific embodiments and the accompanying drawings.
Fig. 1 is a flow diagram of a method for real-time UV statistics on massive logs in one embodiment of the invention.
In the present embodiment, the method comprises the following steps:
Step S101: collect the PVLog (page view log) and distribute it to await processing. PV is also an important indicator of how a website or app is accessed. Each time a user refreshes a page, the system records one access log entry. The access log, also called the PVLog, usually exists in the form of a file. Each entry generally records at least who accessed which page and when; other information may be recorded with it as required.
Step S102: set a UV counter at the same time.
Step S103: create a BloomFilter. The Bloom filter was proposed in 1970 by Burton Howard Bloom. It consists of a very long binary vector and a series of random mapping functions, and can be used to test whether an element is in a set. Its advantage is that its space efficiency and query time far exceed those of general algorithms. In everyday life, including in software design, it is often necessary to judge whether an element is in a set: for example, Word checks whether an English word is spelled correctly (that is, whether it is in a known dictionary), and a web crawler checks whether a URL has already been visited. The most direct method is to store all elements of the set in the computer and compare each new element directly with the stored elements. In general, such a set is stored in a hash table (hashtable), which is fast and accurate but costly in storage space. The basic idea of the Bloom filter is to map an element to a point in a bit array (BitArray) by a Hash function; checking whether that point is 1 then tells whether the element may be in the set.
Step S104: create a BitArray in the heap memory of the current process.
Step S105: define k different Hash functions.
Step S106: initialize all positions in the BitArray to 0.
Step S107: receive the PVLogs awaiting processing, and map each PVLog to k bit positions of the BitArray by means of the k different Hash functions.
Step S108: judge whether the k bit positions are all 1.
If not, proceed to step S109: add 1 to the UV counter and set all k bit positions to 1.
Step S110: output the value of the UV counter, completing the UV statistics.
In the present embodiment, the BitArray is created and initialized (all bits initialized to 0) in the heap memory of the process, and k different hash functions are defined (each hashing an element to one of m positions according to a uniform random distribution). The specific method is:
Allocate a bit array (BitArray) of length m; in the initial state, all positions of the bit array are initialized to 0. For a set D = {d1, d2, ..., dn} of n elements, map each element di (1 <= i <= n) of D to k values {y1, y2, ..., yk} by means of k mapping functions {f1, f2, ..., fk}, and then set the corresponding positions array[y1], array[y2], ..., array[yk] of the bit array to 1.
Fig. 2 is a flow diagram of the method of defining the Hash functions in Fig. 1.
In the present embodiment, the method of defining the Hash functions comprises:
Step S201: each Hash function hashes an element to a position according to a uniform random distribution;
Step S202: the k different Hash functions thus hash an element to k different positions.
In the present embodiment, the BitArray is created and initialized (all bits initialized to 0) in the heap memory of the process, and k different hash functions are defined (each hashing an element to one of m positions according to a uniform random distribution).
Fig. 3 is a detailed flow diagram of the method of creating the Bloom filter in Fig. 1.
Preferably, in the present embodiment, the Bloom filter is created as follows:
Step S301: in the initial state, set all positions of the BitArray of length m to 0;
Step S302: for a set D = {d1, d2, ..., dn} of n elements, take k mapping functions {f1, f2, ..., fk};
Step S303: map each element di (1 <= i <= n) of D to k values {y1, y2, ..., yk};
Step S304: set the corresponding positions array[y1], array[y2], ..., array[yk] of the BitArray to 1.
Refer to Fig. 9 for a diagram of the BitArray in Fig. 3.
Following steps S301 to S304, x, y, z and w are elements whose membership in the set is to be determined. Each of x, y and z is hashed by the 3 Hash functions to 3 positions of the BitArray that are all 1, so they are judged to be in the set; w has one position that is 0, so it is not in the set.
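The membership test described for x, y, z and w can be sketched as follows. The BitArray length, the use of slices of a SHA-256 digest as the 3 hash functions, and the helper names are assumptions for illustration; Fig. 9 itself is not reproduced here.

```python
import hashlib

M, K = 1 << 16, 3      # hypothetical BitArray length and number of hash functions
bit_array = [0] * M    # step S301: all positions start at 0

def positions(element):
    """Map an element to K positions, one per 4-byte slice of its SHA-256 digest."""
    digest = hashlib.sha256(element.encode("utf-8")).digest()
    return [int.from_bytes(digest[4 * i:4 * i + 4], "big") % M for i in range(K)]

def insert(element):
    """Steps S302 to S304: set array[y1], ..., array[yk] to 1."""
    for y in positions(element):
        bit_array[y] = 1

def might_contain(element):
    """True if all K positions are 1; a single 0 proves the element is absent."""
    return all(bit_array[y] == 1 for y in positions(element))

for d in ("x", "y", "z"):   # the set D
    insert(d)

in_set = [might_contain(d) for d in ("x", "y", "z")]
w_in_set = might_contain("w")
```

An inserted element always tests as present. An element that was never inserted, such as w, tests as absent unless every one of its positions happens to have been set by other elements, which is the false positive case.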
Fig. 4 is a flow diagram of the method of collecting the PVLog in Fig. 1.
Preferably, in the present embodiment, the PVLog in step S101 is collected by:
Step S401: front-end page js reporting;
or step S402: background server reporting;
or step S403: mobile client sdk reporting.
The above methods include, but are not limited to, sending the log data to a central log by HTTP POST.
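A background server report by HTTP POST could be built as in the sketch below. The endpoint URL, the JSON field names `user_id`, `page` and `ts`, and the function name are all hypothetical; the source specifies only that an entry records who accessed which page and when. The request is constructed but not sent, since the endpoint does not exist.

```python
import json
import time
import urllib.request

LOG_ENDPOINT = "http://logs.example.invalid/pv"  # hypothetical central log URL

def build_pv_report(user_id, page_url, ts=None):
    """Build an HTTP POST request carrying one PVLog entry as JSON."""
    record = {
        "user_id": user_id,                                 # who
        "page": page_url,                                   # which page
        "ts": ts if ts is not None else int(time.time()),   # when
    }
    body = json.dumps(record).encode("utf-8")
    return urllib.request.Request(
        LOG_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_pv_report("u1", "/search?q=shoes", ts=1457308800)
# urllib.request.urlopen(request) would send it; omitted because the endpoint is hypothetical.
```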
Fig. 5 is a flow diagram of a further operating step in Fig. 1.
Step S108: judge whether the k bit positions are all 1.
If so, proceed to step S111: skip the UV counter without counting;
Step S112: continue to receive PVLogs to be processed.
Fig. 6 is a structural diagram of a system for real-time UV statistics on massive logs in one embodiment of the invention.
The present embodiment also provides a system 100 for real-time UV statistics on massive logs, comprising:
A Bloom filter creation and initialization module 1001, configured to create a BloomFilter, create a BitArray in memory, define k different Hash functions, and initialize all positions in the BitArray to 0;
A real-time log receiving module 1002, configured to receive the PVLogs awaiting processing;
A log processing module 1003, configured to map each PVLog awaiting processing to k bit positions of the BitArray by means of the k different Hash functions, and to judge whether the k bit positions are all 1; if not, the UV counter is incremented by 1 and all k bit positions are set to 1;
A result output module 1004, configured to output the value of the UV counter, completing the UV statistics.
The Bloom filter creation and initialization module 1001 creates and initializes the BitArray in memory (all bits initialized to 0) and defines k different hash functions (each hashing an element to one of m positions according to a uniform random distribution). The specific method is:
Allocate a bit array (BitArray) of length m; in the initial state, all positions of the bit array are initialized to 0. For a set D = {d1, d2, ..., dn} of n elements, map each element di (1 <= i <= n) of D to k values {y1, y2, ..., yk} by means of k mapping functions {f1, f2, ..., fk}, and then set the corresponding positions array[y1], array[y2], ..., array[yk] of the bit array to 1.
The real-time log receiving module 1002 receives in real time the PVLogs sent by the PV log collection and distribution subsystem and forwards them to the log processing module.
The log processing module 1003 maps each PVLog to k bit positions of the BitArray by means of the k different hash functions and judges whether the k bit positions are all 1; if not, the UV counter is incremented by 1 and all k bit positions are set to 1; if so, the entry is skipped. This module is the core module of the system.
The result output module 1004 outputs the value of the UV counter through an interface or similar means.
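The division of work among modules 1001 to 1004 can be sketched as four small functions connected by a queue. This is an illustrative arrangement under stated assumptions, not the system's actual structure: the function names, the dict-based state, the `user_id` field and the SHA-256 double hashing are hypothetical, and the receiving and processing steps run one after the other here, whereas in a real deployment they would run concurrently.

```python
import hashlib
import queue

def create_filter(m, k):
    """Module 1001: a zeroed BitArray of m bits, the value of k, and a UV counter."""
    return {"m": m, "k": k, "bits": bytearray((m + 7) // 8), "uv": 0}

def receive(source, inbox):
    """Module 1002: forward each incoming PVLog to the processing inbox."""
    for pv_log in source:
        inbox.put(pv_log)
    inbox.put(None)  # sentinel marking the end of this demo stream

def process(inbox, state):
    """Module 1003: test the k bits; count and set them when they are not all 1."""
    while (pv_log := inbox.get()) is not None:
        digest = hashlib.sha256(pv_log["user_id"].encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # odd step, never zero
        positions = [(h1 + i * h2) % state["m"] for i in range(state["k"])]
        if not all(state["bits"][p >> 3] & (1 << (p & 7)) for p in positions):
            for p in positions:
                state["bits"][p >> 3] |= 1 << (p & 7)
            state["uv"] += 1

def output(state):
    """Module 1004: expose the current UV value."""
    return state["uv"]

stream = [{"user_id": u, "page": "/home"} for u in ("a", "b", "a", "c", "b")]
state = create_filter(m=1 << 20, k=7)
inbox = queue.Queue()
receive(stream, inbox)
process(inbox, state)
uv = output(state)
```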
Fig. 7 is a diagram of a preferred implementation of Fig. 6.
In the present embodiment, the system for real-time UV statistics on massive logs comprises:
A Bloom filter creation and initialization module 1001, configured to create a BloomFilter, create a BitArray in memory, define k different Hash functions, and initialize all positions in the BitArray to 0;
A real-time log receiving module 1002, configured to receive the PVLogs awaiting processing;
A log processing module 1003, configured to map each PVLog awaiting processing to k bit positions of the BitArray by means of the k different Hash functions, and to judge whether the k bit positions are all 1; if not, the UV counter is incremented by 1 and all k bit positions are set to 1;
A result output module 1004, configured to output the value of the UV counter, completing the UV statistics.
Preferably, in the present embodiment, the system also comprises a website PV log real-time collection unit, configured to send the collected PV logs to the log processing module in real time by front-end page js reporting, background server reporting or mobile client sdk reporting.
Fig. 8 is a diagram of a preferred implementation of Fig. 6.
A Bloom filter creation and initialization module 1001, configured to create a BloomFilter, create a BitArray in memory, define k different Hash functions, and initialize all positions in the BitArray to 0;
A real-time log receiving module 1002, configured to receive the PVLogs awaiting processing;
A log processing module 1003, configured to map each PVLog awaiting processing to k bit positions of the BitArray by means of the k different Hash functions, and to judge whether the k bit positions are all 1; if not, the UV counter is incremented by 1 and all k bit positions are set to 1;
A result output module 1004, configured to output the value of the UV counter, completing the UV statistics. Preferably, in the present embodiment, the result output module is configured to output the value of the UV counter in real time to an external file, a database, a shared drive and a KV storage engine.
Preferably, in the present embodiment, the system also comprises a distribution subsystem, configured to collect logs through scribe and to distribute the logs collected in real time to the log processing module.
Those of ordinary skill in the art should understand that the above are only specific embodiments of the invention and do not limit the invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within the scope of protection of the invention.

Claims (10)

1. A method for real-time UV statistics on massive logs, characterized by comprising:
Collecting the PVLog (page view log), then distributing it to await processing; setting a UV counter at the same time;
Creating a BloomFilter: creating a BitArray (bit array) in the heap memory of the current process and defining k different Hash functions;
Initializing all positions in the BitArray to 0;
Receiving the PVLogs awaiting processing, and mapping each PVLog to k bit positions of the BitArray by means of the k different Hash functions;
Judging whether the k bit positions are all 1; if not, adding 1 to the UV counter and setting all k bit positions to 1;
Outputting the value of the UV counter, completing the UV statistics.
2. The method for real-time UV statistics on massive logs according to claim 1, characterized in that the k different Hash functions are defined as follows:
Each Hash function hashes an element to a position according to a uniform random distribution, so that the k different Hash functions hash an element to k different positions.
3. The method for real-time UV statistics on massive logs according to claim 1, characterized in that the method of creating the Bloom filter comprises:
In the initial state, setting all positions of the BitArray of length m to 0;
For a set D = {d1, d2, ..., dn} of n elements, mapping each element di (1 <= i <= n) of D to k values {y1, y2, ..., yk} by means of k mapping functions {f1, f2, ..., fk}, and then setting the corresponding positions array[y1], array[y2], ..., array[yk] of the BitArray to 1.
4. The method for real-time UV statistics on massive logs according to claim 1, characterized in that the PVLog is collected by
front-end page js reporting, background server reporting or mobile client sdk reporting.
5. The method for real-time UV statistics on massive logs according to claim 1, characterized in that it is judged whether the k bit positions are all 1; if so, the UV counter is skipped and not incremented, and receiving of PVLogs to be processed continues.
6. A system for real-time UV statistics on massive logs, characterized by comprising:
A Bloom filter creation and initialization module, configured to create a BloomFilter, create a BitArray in memory, define k different Hash functions, and initialize all positions in the BitArray to 0;
A real-time log receiving module, configured to receive the PVLogs awaiting processing;
A log processing module, configured to map each PVLog awaiting processing to k bit positions of the BitArray by means of the k different Hash functions, and to judge whether the k bit positions are all 1; if not, the UV counter is incremented by 1 and all k bit positions are set to 1;
A result output module, configured to output the value of the UV counter, completing the UV statistics.
7. The system for real-time UV statistics on massive logs according to claim 6, characterized by further comprising a website PV log real-time collection unit,
configured to send the collected PV logs to the log processing module in real time by front-end page js reporting, background server reporting or mobile client sdk reporting.
8. The system for real-time UV statistics on massive logs according to claim 6, characterized by further comprising a distribution subsystem,
configured to collect logs through scribe and to distribute the logs collected in real time to the log processing module.
9. The system for real-time UV statistics on massive logs according to claim 7, characterized in that the real-time log receiving module is configured to receive in real time the PVLogs sent by the distribution subsystem and to forward them to the log processing module.
10. The system for real-time UV statistics on massive logs according to claim 8, characterized in that the result output module is configured to output the value of the UV counter in real time to an external file, a database, a shared drive and a KV storage engine.
CN201610126930.0A 2016-03-07 2016-03-07 Method and system for performing real-time UV statistic of massive logs Pending CN105577455A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610126930.0A CN105577455A (en) 2016-03-07 2016-03-07 Method and system for performing real-time UV statistic of massive logs

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610126930.0A CN105577455A (en) 2016-03-07 2016-03-07 Method and system for performing real-time UV statistic of massive logs

Publications (1)

Publication Number Publication Date
CN105577455A true CN105577455A (en) 2016-05-11

Family

ID=55887152

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610126930.0A Pending CN105577455A (en) 2016-03-07 2016-03-07 Method and system for performing real-time UV statistic of massive logs

Country Status (1)

Country Link
CN (1) CN105577455A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106294090A (en) * 2016-08-03 2017-01-04 五八同城信息技术有限公司 A kind of data statistical approach and device
CN108900619A (en) * 2018-07-06 2018-11-27 阿里巴巴集团控股有限公司 A kind of independent Statistics of accessing population method and device
WO2021082936A1 (en) * 2019-10-30 2021-05-06 深圳前海微众银行股份有限公司 Method and apparatus for counting number of webpage visitors
CN114385922A (en) * 2022-01-17 2022-04-22 上海阿法迪智能数字科技股份有限公司 Library system knowledge recommendation method based on bloom filter

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102253820A (en) * 2011-06-16 2011-11-23 华中科技大学 Stream type repetitive data detection method
CN104252532A (en) * 2014-09-11 2014-12-31 北京优特捷信息技术有限公司 Website information statistic method and device
WO2015168262A2 (en) * 2014-05-01 2015-11-05 Coho Data, Inc. Systems, devices and methods for generating locality-indicative data representations of data streams, and compressions thereof

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106294090A (en) * 2016-08-03 2017-01-04 五八同城信息技术有限公司 A kind of data statistical approach and device
CN108900619A (en) * 2018-07-06 2018-11-27 阿里巴巴集团控股有限公司 A kind of independent Statistics of accessing population method and device
CN108900619B (en) * 2018-07-06 2022-01-11 创新先进技术有限公司 Independent visitor counting method and device
WO2021082936A1 (en) * 2019-10-30 2021-05-06 深圳前海微众银行股份有限公司 Method and apparatus for counting number of webpage visitors
CN114385922A (en) * 2022-01-17 2022-04-22 上海阿法迪智能数字科技股份有限公司 Library system knowledge recommendation method based on bloom filter

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20160511