Background art
With the rapid development of Internet technology, more and more users can watch live online video over the network using terminals such as computers and mobile phones. Live online video is a live video broadcasting service built on Internet resources: video captured on site is published to the network synchronously, so that users can see the on-site situation on the network in real time.
In the business scenarios of a live video website, when the streamer of a live room launches an interactive event, or when the website launches a special campaign, the interaction often needs to be limited to active users who are watching the current live stream or who are active on the website. This requires an active user set in which active users are recorded and updated in real time.
At present, in the live video field, the usual approach to maintaining an active user set is as follows: when a server has not received tracking data for a user's page behavior trajectory within a continuous period (this period is usually user-defined and is the timeout duration, Timeout), the user is removed from the active user set. Specifically, two methods are generally used:
(1) For each user, a corresponding lastReceiveTime (the time at which tracking data for the user's page behavior trajectory was last received) is stored. A timer then traverses all user sessions once per second and removes the user sessions that satisfy the following formula:
now - lastReceiveTime > Timeout, where now is the current time.
The shortcoming of method (1) is that only one global repeated timer is provided. When the number of users is large (for example, tens of thousands of users being maintained at the same time), every pass of the repeated timer over all user sessions involves a large workload and takes a long time, so efficiency is low.
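The periodic sweep of method (1) can be sketched as follows. This is a minimal illustration, assuming sessions are held in a dictionary that maps a user ID to its lastReceiveTime; the function name sweep_inactive and the sample values are hypothetical. It makes the cost visible: every pass visits every session, whether or not that session has expired.

```python
import time

def sweep_inactive(last_receive_time, timeout, now=None):
    """Remove every session with now - lastReceiveTime > timeout.

    last_receive_time maps user ID -> time of the last received tracking data.
    Returns the removed user IDs. Each call scans all sessions, i.e. O(n).
    """
    now = time.time() if now is None else now
    expired = [uid for uid, t in last_receive_time.items() if now - t > timeout]
    for uid in expired:
        del last_receive_time[uid]
    return expired

# Hypothetical sample data: three sessions, Timeout = 30 s, current time = 160 s.
sessions = {"u1": 100.0, "u2": 155.0, "u3": 90.0}
removed = sweep_inactive(sessions, timeout=30.0, now=160.0)
```

In a real deployment this function would be invoked once per second by the repeated timer, so its running time grows linearly with the number of sessions kept.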
(2) Method (2) is largely the same as method (1). The only difference is that method (2) sets a one-shot timer for each user session. Each one-shot timer is automatically updated when tracking data for the corresponding user's page behavior trajectory is received; if a one-shot timer finds that it has timed out, the corresponding user session is removed.
Although method (2) improves the efficiency of checking to some extent, it still has the following shortcomings: it requires a large number of one-shot timers, and those timers are updated very frequently. When the number of users is large, the number of user session connections is also large, which puts pressure on the "timer queue to be updated" and can cause severe system congestion or even a crash.
Summary of the invention
In view of the defects in the prior art, the technical problem solved by the present invention is to provide an active user set maintenance method and system based on a time wheel and page behavior. The present invention uses a time wheel to update the active user set and thereby maintain it in a timely manner. It not only works efficiently but also avoids placing a heavy load on the system, and it can effectively restrict the participation of inactive users in interactive events, ensuring that such events are carried out effectively.
To achieve the above object, the active user set maintenance method based on a time wheel and page behavior provided by the present invention comprises the following steps:
A. Encode the page jump trajectory data generated while a user watches a live stream to obtain the corresponding page behavior information, which includes several page behavior identifiers; go to step B;
B. Decode all page behavior information and, according to the decoded page behavior identifiers, determine the page behavior information that complies with the rules; cache and preprocess the compliant page behavior information to obtain page behavior preprocessed data; go to step C;
C. Compose the page behavior preprocessed data into several data slices, each of which includes at least one item of page behavior preprocessed data; at regular intervals, assign all currently cached data slices to several groups according to a hash strategy; go to step D;
D. Determine all page behavior preprocessed data in each group of data slices that passes verification; update the user IDs corresponding to all verified page behavior preprocessed data into the active user set shard corresponding to the current time; go to step E. The active user set shards are the shards into which the active user set is divided in advance according to a specified activity calculation period;
E. At regular intervals, add the updated active user set shard to a time wheel created in advance.
The present invention also provides an active user set maintenance system based on a time wheel and page behavior that implements the above method. The system includes a page behavior information generation module located on each terminal device and, located on the server, a cache preprocessing module, a data slice composition module, several real-time computing modules, and an active user set function module;
The page behavior information generation module is used to: encode the page jump trajectory data generated while a user watches a live stream to obtain the corresponding page behavior information, which includes several page behavior identifiers, and submit the page behavior information to the cache preprocessing module;
The cache preprocessing module is used to: decode all received page behavior information and, according to the decoded page behavior identifiers, determine the page behavior information that complies with the rules; cache and preprocess the compliant page behavior information to obtain page behavior preprocessed data; and, at regular intervals, send a data slice composition signal to the data slice composition module;
The data slice composition module is used to: after receiving the data slice composition signal, compose the page behavior preprocessed data into several data slices, each of which includes at least one item of page behavior preprocessed data; and, at regular intervals, distribute all currently cached data slices to the real-time computing modules according to a hash strategy;
Each real-time computing module is used to: verify all page behavior preprocessed data in the data slices distributed by the data slice composition module and determine all page behavior preprocessed data that passes verification; and update the user IDs corresponding to all verified page behavior preprocessed data into the active user set shard corresponding to the current time. The active user set shards are the shards into which the active user set is divided in advance according to a specified activity calculation period;
The active user set function module is used to: at regular intervals, add the active user set shard updated by the real-time computing modules to a time wheel created in advance.
Compared with the prior art, the advantages of the present invention are:
(1) Based on the page behavior information generated when users jump between pages in normal use, the present invention treats the users whose page behavior information passes verification as active users and updates them into the corresponding active user set shard. Accordingly, unlike the prior art, the present invention does not use a repeated timer or one-shot timers. Instead, it periodically adds active user set shards to the created time wheel and uses the time wheel to update the active user set, thereby maintaining it in a timely manner. It not only works efficiently but also avoids placing a heavy load on the system, and it can effectively restrict the participation of inactive users in interactive events, ensuring that such events are carried out effectively.
(2) Both the period by which the active user set shards are divided and the timing cycle at which active user set shards are added to the time wheel can be set and adjusted according to actual usage. Consequently, the number of active user set shards, the statistical granularity of active users (statistics can be computed by hourly activity, by minute activity, or at other granularities), and the maintenance cycle of the time wheel (which is the same as the timing cycle and may be 1 hour, 1 minute, or another period) can all be adjusted accordingly. The present invention is therefore flexible and widely applicable.
(3) The system of the present invention includes multiple real-time computing modules for processing data slices. These modules can process multiple data slices simultaneously, which further improves working efficiency and real-time performance.
Detailed description of the invention
The present invention is described in further detail below with reference to the drawings and embodiments.
Referring to Figure 1, the active user set maintenance method based on a time wheel and page behavior in an embodiment of the present invention comprises the following steps:
S1: The terminal device used by each user encodes the page jump trajectory data generated while the user watches a live stream to obtain the corresponding page behavior information; go to S2.
The page jump trajectory data in S1 originates as follows: while watching a live stream, a user jumps between multiple web pages. Each page has a page ID, and the page IDs record the trajectory of page jumps as the user navigates. The IDs of all pages viewed by the user, concatenated in viewing order, form the user's page jump trajectory data in S1. The upper limit of a page jump trajectory is 10 levels. For example, 1.2.3 means that the user jumped from page 1 to page 2 and then from page 2 to page 3.
In S1, the page jump trajectory data is encoded into the corresponding page behavior information as follows: the page jump trajectory data is assembled into a string in JSON format (JavaScript Object Notation, a lightweight data interchange format), and this string is then BASE64-encoded (BASE64 is an encoding scheme for representing 8-bit byte data) to obtain the page behavior information.
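The encoding flow just described (trajectory, then JSON string, then BASE64) can be sketched as below. This is only an illustration under stated assumptions: the JSON key "track" and both function names are hypothetical, since the text does not specify the JSON layout, and a real payload would carry further fields.

```python
import base64
import json

def encode_track(page_ids):
    """Join page IDs in viewing order, wrap them in JSON, then BASE64-encode."""
    track = ".".join(str(p) for p in page_ids)        # e.g. [1, 2, 3] -> "1.2.3"
    text = json.dumps({"track": track})               # "track" is an assumed key
    return base64.b64encode(text.encode("utf-8")).decode("ascii")

def decode_track(encoded):
    """Reverse the encoding: BASE64-decode, then parse the JSON string."""
    text = base64.b64decode(encoded).decode("utf-8")
    return json.loads(text)["track"]

encoded = encode_track([1, 2, 3])
decoded = decode_track(encoded)
```

The decoding half corresponds to what the server side has to do before it can inspect the trajectory.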
The page behavior information includes several page behavior identifiers, for example the URL (web address) of the page, the page jump trajectory data, the user ID (i.e., the user's unique ID), and an identification code. The identification code has a fixed length and may be encrypted after it is generated. It is generated by arranging a timestamp, the ID of the terminal device used by the user, and a random number, where the terminal device ID is obtained through the API (Application Programming Interface) of the terminal device.
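One possible fixed-length layout for such an identification code is sketched below. The field widths, the padding character, the millisecond timestamp, and the function names are all assumptions made for illustration; the text specifies only that the code is built from a timestamp, a terminal device ID, and a random number, and the optional encryption step is left out.

```python
import secrets
import time

# Assumed field widths; the source only requires that the total length is fixed.
TS_LEN, DEV_LEN, RAND_LEN = 13, 16, 8

def make_identification_code(device_id, now_ms=None):
    """Concatenate timestamp, padded device ID, and random hex into a fixed-length code."""
    now_ms = int(time.time() * 1000) if now_ms is None else now_ms
    ts = str(now_ms).zfill(TS_LEN)
    dev = device_id[:DEV_LEN].ljust(DEV_LEN, "_")   # pad with "_" (assumed convention)
    rand = secrets.token_hex(RAND_LEN // 2)          # 8 hexadecimal characters
    return ts + dev + rand

def parse_identification_code(code):
    """Recover the timestamp and device ID from a code built by the function above."""
    ts = int(code[:TS_LEN])
    dev = code[TS_LEN:TS_LEN + DEV_LEN].rstrip("_")
    return ts, dev

code = make_identification_code("device-abc", now_ms=1700000000000)
```

Because each field sits at a fixed offset, the server can slice the timestamp and device ID back out without a delimiter, which is what the later verification step relies on.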
S2: Decode all page behavior information (i.e., BASE64-decode it to recover the JSON-format string). According to the decoded page behavior identifiers, determine the page behavior information that complies with the rules (non-compliant page behavior information is discarded). Cache and preprocess (i.e., unify the data format of) the compliant page behavior information to obtain page behavior preprocessed data; go to S3.
For page behavior information to comply with the rules in S2, its page behavior identifiers must satisfy all of the following conditions: the page URL is legal (an illegal URL is non-compliant), the page jump trajectory data is valid (invalid data is non-compliant), the user ID is not empty (an empty ID is non-compliant), the user ID matches the data field type (a mismatch is non-compliant), the timestamp format is correct (an incorrect format is non-compliant), and the user terminal type identifier is legal (an illegal identifier is non-compliant).
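The six compliance conditions can be expressed as a single predicate, sketched below. The concrete rules are assumptions, because the text names the conditions without defining them: here a legal URL is an http or https URL with a host, a valid trajectory is 1 to 10 dot-separated numeric page IDs, the user ID must be a non-empty string, the timestamp must be a positive integer, and the set of terminal types is invented for the example. The dictionary keys are likewise hypothetical.

```python
import re
from urllib.parse import urlparse

TRACK_RE = re.compile(r"\d+(\.\d+){0,9}")       # 1 to 10 numeric page IDs (assumed rule)
TERMINAL_TYPES = {"web", "android", "ios"}      # hypothetical terminal type identifiers

def is_compliant(info):
    """Return True only if all six conditions hold for one decoded record."""
    url = urlparse(str(info.get("url", "")))
    track = info.get("track")
    user_id = info.get("user_id")
    ts = info.get("timestamp")
    terminal = info.get("terminal_type")
    return (
        url.scheme in ("http", "https") and bool(url.netloc)            # URL legal
        and isinstance(track, str) and TRACK_RE.fullmatch(track) is not None  # track valid
        and isinstance(user_id, str) and user_id != ""                  # ID typed and non-empty
        and isinstance(ts, int) and not isinstance(ts, bool) and ts > 0  # timestamp format
        and isinstance(terminal, str) and terminal in TERMINAL_TYPES    # terminal type legal
    )

good = {
    "url": "https://example.com/room/1",
    "track": "1.2.3",
    "user_id": "user-42",
    "timestamp": 1700000000000,
    "terminal_type": "web",
}
```

Records for which the predicate returns False would simply be dropped before caching.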
The format of the page behavior preprocessed data in S2 is:
S3: Compose the page behavior preprocessed data into several data slices. The capacity of each data slice is less than or equal to 1 MB, and each data slice includes at least one complete item of page behavior preprocessed data; go to S4.
An example of S3: suppose three items of page behavior preprocessed data are currently cached, with sizes of 0.3 MB, 0.4 MB, and 0.5 MB. S3 then composes the 0.3 MB and 0.4 MB items into one data slice and the 0.5 MB item into another data slice, and so on.
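The example above is consistent with a simple sequential packing rule, sketched here: items are appended to the current slice until the next one would push it over the limit, at which point a new slice is started. The function name, the use of kilobyte integers (to avoid floating-point rounding), and the choice to reject an item larger than the limit are assumptions; the text does not say how an oversized item is handled.

```python
def compose_slices(sizes_kb, limit_kb=1024):
    """Pack item sizes, in arrival order, into slices of at most limit_kb each."""
    slices, current, used = [], [], 0
    for size in sizes_kb:
        if size > limit_kb:
            # Not covered by the source; rejecting is an assumption of this sketch.
            raise ValueError("item larger than one data slice")
        if current and used + size > limit_kb:
            slices.append(current)        # close the full slice
            current, used = [], 0
        current.append(size)
        used += size
    if current:
        slices.append(current)            # flush the last, partially filled slice
    return slices

# Sizes of roughly 0.3 MB, 0.4 MB, and 0.5 MB expressed in KB.
slices = compose_slices([300, 400, 500])
```

Every slice produced this way holds at least one complete item and never exceeds the limit, which matches the two constraints stated for S3.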
S4: At regular intervals, assign all currently cached data slices to at least 3 groups according to a hash strategy; go to S5. The specific flow of S4 is: define the total number of groups as N, assign a unique identifier (UUID) to each data slice, and take each UUID modulo N (UUID mod N). All data slices with the same remainder belong to the same group.
The purpose of grouping data slices by a hash strategy in S4 is that N can be increased or decreased according to the number of data slices at grouping time, which improves the horizontal scalability of the subsequent computation on each group of data slices.
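The UUID mod N grouping can be sketched as follows, assuming each UUID is interpreted as its 128-bit integer value; the text does not say how the UUID is turned into a number, and the function name is hypothetical. Small integer-valued UUIDs are used in the example only so that the resulting groups are easy to follow.

```python
import uuid
from collections import defaultdict

def group_slices(slice_ids, n):
    """Assign each slice to group (UUID as integer) mod n."""
    groups = defaultdict(list)
    for sid in slice_ids:
        groups[sid.int % n].append(sid)
    return dict(groups)

# Deterministic IDs for illustration; in practice uuid.uuid4() would be a natural choice.
ids = [uuid.UUID(int=i) for i in range(7)]
groups = group_slices(ids, 3)
```

Changing n is all that is needed to spread the same slices over more or fewer groups, which is the scaling property described above.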
S5: Verify all page behavior preprocessed data in each group of data slices. For data that passes verification, go to S6; data that fails verification is discarded, and processing of that data ends.
The specific flow of S5 is as follows:
S501: Decrypt the identification code of each item of page behavior preprocessed data in each group of data slices to obtain the timestamp and the terminal device ID. Determine whether the timestamp is within a reasonable range (i.e., whether the difference between the timestamp and the current server time is within one minute) and whether the terminal device ID complies with the specification (a terminal device ID that conforms to the generation rule of the identification code is regarded as compliant). If so, go to S502; otherwise, the current page behavior preprocessed data fails verification.
S502: Parse the current page behavior preprocessed data to obtain the page jump trajectory data, and determine whether the page jump trajectory data conforms to the sequence rule. If so, the current page behavior preprocessed data passes verification; otherwise, it fails verification.
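The two-stage check of S501 and S502 can be sketched as below. The sketch assumes the identification code has already been decrypted into a millisecond timestamp and a device ID. Two rules are placeholders, since the text does not define them: the device ID is accepted if it is non-empty, and the sequence rule is taken to mean 1 to 10 dot-separated numeric page IDs. The function name is hypothetical.

```python
def verify_record(ts_ms, device_id, track, server_ms, window_ms=60_000):
    """Return True if a record passes both verification stages."""
    # S501: timestamp within one minute of server time, and device ID acceptable.
    if abs(server_ms - ts_ms) > window_ms:
        return False
    if not device_id:                      # placeholder for the real device ID rule
        return False
    # S502: trajectory follows the (assumed) sequence rule.
    parts = track.split(".")
    return 1 <= len(parts) <= 10 and all(p.isdigit() for p in parts)

ok = verify_record(1_000, "dev-1", "1.2.3", server_ms=30_000)
stale = verify_record(1_000, "dev-1", "1.2.3", server_ms=100_000)
```

Checking the cheap timestamp condition first means stale records are rejected before the trajectory is parsed at all.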
S6: Update the user IDs corresponding to all verified page behavior preprocessed data into the active user set shard corresponding to the current time (i.e., the time at which verification passed); go to S7. The active user set shards are the shards into which the active user set is divided in advance according to a specified activity calculation period.
Examples of the active user set shards in S6: if the specified activity calculation period is 1 hour, the active user set covers the 24 hours of a day and has 24 corresponding shards; if the specified activity calculation period is 1 minute, the active user set covers the 24 × 60 = 1440 minutes of a day and has 1440 corresponding shards.
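The shard arithmetic above can be sketched as a mapping from a time of day to a shard index. The function names are hypothetical, and the sketch assumes shards are numbered from 0 starting at midnight, which the text does not state.

```python
from datetime import datetime

SECONDS_PER_DAY = 86_400

def shard_count(period_s):
    """Number of shards per day for a given activity calculation period in seconds."""
    return SECONDS_PER_DAY // period_s

def shard_index(moment, period_s):
    """Index of the shard that covers the given moment, counted from midnight."""
    seconds = moment.hour * 3600 + moment.minute * 60 + moment.second
    return seconds // period_s

moment = datetime(2024, 1, 1, 13, 45, 10)
hourly = shard_index(moment, 3600)   # hourly shards
minutely = shard_index(moment, 60)   # per-minute shards
```

A verified user ID at 13:45:10 would thus be written to shard 13 under hourly granularity and to shard 825 under per-minute granularity.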
S7: At regular intervals (the timing cycle can be set as needed and is typically 1 minute), add the updated active user set shard to the time wheel created in advance; end.
Referring to Figure 2, the time wheel in S7 includes one circular queue whose head and tail are joined (i.e., a circular buffer). The circular queue is divided into several slots, each filled with one active user set shard, and it is provided with one pointer that points to the slot at the tail of the queue.
S7 specifically includes the following flow: in the time wheel, the clockwise direction is defined as the direction from the tail of the queue to the head, and the counterclockwise direction as the direction from the head to the tail. After each timing cycle, the active user set shard in the head slot of the time wheel is removed, and the shard in each remaining slot moves clockwise to the next slot (leaving the tail slot empty). The active user set shard updated in S6 is then added to the tail slot of the time wheel, and the pointer moves clockwise to the next slot.
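The rotation described in S7 can be sketched with a bounded double-ended queue. This is an illustration, not the embodiment's data structure: the class and method names are hypothetical, and instead of an explicit pointer the sketch relies on collections.deque with maxlen, which discards the item at the head automatically when a new item is appended to a full queue. The net effect per timing cycle is the same: the oldest shard leaves and the newest shard occupies the tail.

```python
from collections import deque

class TimeWheel:
    """Fixed number of slots; each slot holds one active user set shard."""

    def __init__(self, slots):
        if slots < 1:
            raise ValueError("the wheel needs at least one slot")
        self._ring = deque(maxlen=slots)   # left end = head, right end = tail

    def advance(self, shard):
        """One timing cycle: drop the head shard if full, add the new shard at the tail."""
        full = len(self._ring) == self._ring.maxlen
        evicted = self._ring[0] if full else set()
        self._ring.append(set(shard))      # deque drops the head itself when full
        return evicted

    def active_users(self):
        """Union of the user IDs in all shards currently on the wheel."""
        return set().union(*self._ring)

wheel = TimeWheel(3)
wheel.advance({"a"})
wheel.advance({"b"})
wheel.advance({"a", "c"})
evicted = wheel.advance({"d"})             # the wheel is full, so the oldest shard leaves
active = wheel.active_users()
```

No timer per user is involved: a user drops out of the active set simply because every shard that contained the user ID has rotated off the wheel.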
The time wheel (the active user set) in S7 can be exposed to other applications through a RESTful interface or an RPC interface. Based on this active user set, it can be ensured in many important scenarios that the users participating in an interactive event are currently active users.
Referring to Figure 3, the active user set maintenance system based on a time wheel and page behavior that implements the above method in an embodiment of the present invention includes a page behavior information generation module located on each terminal device and, located on the server, a cache preprocessing module, a data slice composition module, several real-time computing modules (3 in this embodiment), and an active user set function module.
The page behavior information generation module is used to: encode the page jump trajectory data generated while a user watches a live stream to obtain the corresponding page behavior information, which includes several page behavior identifiers, and submit the page behavior information to the cache preprocessing module.
The cache preprocessing module is used to: decode all received page behavior information and, according to the decoded page behavior identifiers, determine the page behavior information that complies with the rules. For page behavior information to comply with the rules, its page behavior identifiers must satisfy all of the following conditions: the page URL is legal, the page jump trajectory data is valid, the user ID is not empty, the user ID matches the data field type, the timestamp format is correct, and the user terminal type identifier is legal. After caching and preprocessing the compliant page behavior information, the module obtains page behavior preprocessed data and, at regular intervals, sends a data slice composition signal to the data slice composition module.
The data slice composition module is used to: after receiving the data slice composition signal, compose the page behavior preprocessed data into several data slices, each of which includes at least one item of page behavior preprocessed data; and, at regular intervals, distribute all currently cached data slices to the real-time computing modules according to a hash strategy. The specific flow of distribution is: define the total number of groups as N, assign a unique identifier (UUID) to each data slice, and take each UUID modulo N. All data slices with the same remainder belong to the same group.
Each real-time computing module is used to:
(1) Verify all page behavior preprocessed data in the data slices distributed by the data slice composition module and determine all page behavior preprocessed data that passes verification. The specific flow is:
Decrypt the identification code of each item of page behavior preprocessed data in the distributed data slices to obtain the timestamp and the terminal device ID, and determine whether the timestamp is within a reasonable range and whether the terminal device ID complies with the specification. If not, the current page behavior preprocessed data fails verification and is discarded. Otherwise:
Parse the current page behavior preprocessed data to obtain the page jump trajectory data, and determine whether the page jump trajectory data conforms to the sequence rule. If so, the current page behavior preprocessed data passes verification; otherwise, it fails verification and is discarded.
(2) Update the user IDs corresponding to all verified page behavior preprocessed data into the active user set shard corresponding to the current time. The active user set shards are the shards into which the active user set is divided in advance according to a specified activity calculation period.
The active user set function module is used to: at regular intervals, add the active user set shard updated by the real-time computing modules to the time wheel created in advance. The time wheel includes one circular queue whose head and tail are joined. The circular queue is divided into several slots, each filled with one active user set shard, and it is provided with one pointer that points to the slot at the tail of the queue.
On this basis, the active user set function module is specifically used to: after each timing cycle, remove the active user set shard in the head slot of the time wheel and move the shard in each remaining slot clockwise to the next slot, where clockwise is the direction from the tail of the queue to the head and counterclockwise is the direction from the head to the tail; then add the active user set shard updated by the real-time computing modules to the tail slot of the time wheel and move the pointer clockwise to the next slot.
The present invention is not limited to the above embodiments. Those of ordinary skill in the art may make improvements and modifications without departing from the principle of the present invention, and such improvements and modifications are also regarded as falling within the scope of protection of the present invention. Content not described in detail in this specification belongs to the prior art known to those skilled in the art.