System and method for maintaining an active user set based on a time wheel and user behavior
Technical field
The present invention relates to the field of active user set maintenance in live video streaming, and specifically to a system and method for maintaining an active user set based on a time wheel and user behavior.
Background
With the rapid development of Internet technology, more and more users watch live online video over the network using terminals such as computers and mobile phones. Live online video is a live video broadcast service delivered over Internet resources: video captured on site is published to the network synchronously, so that users can see what is happening on site in real time.
In the business scenarios of a live video website, when a streamer in a live room initiates an interactive event, or when the website launches a special activity, the interaction often needs to target only the active users currently watching the live stream or the users currently active on the website. This requires an active user set that records and updates the active users in real time.
At present, in the live video field, the usual approach to maintaining an active user set is: when the server receives no behavior data from a user for a continuous period (the timeout duration Timeout, which usually has to be user-defined), the user is removed from the active user set. Two approaches are common:
(1) For each user, the time at which behavior data was last received, lastReceiveTime, is saved. A timer then traverses the user sessions of all users once per second and removes the sessions that satisfy:
now - lastReceiveTime > Timeout
where now is the current time.
However, this approach has only one global repeated timer, so every timeout check has to examine the session information of all users. If the number of user sessions is large (for example, more than ten thousand users online at the same time), the amount of checking becomes very large, the whole check is time-consuming, and real-time performance suffers.
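The first approach can be sketched as follows. This is only an illustration of the full scan described above; the function name `sweep`, the dictionary layout and the 30-second timeout are assumptions, not values given in this specification.

```python
TIMEOUT = 30.0  # seconds; assumed value for the user-defined Timeout


def sweep(last_receive_time, now, timeout=TIMEOUT):
    """Remove and return every user whose session has timed out.

    Every session is visited on every call, which is the cost
    criticised above when the number of sessions is large.
    """
    expired = [uid for uid, ts in last_receive_time.items() if now - ts > timeout]
    for uid in expired:
        del last_receive_time[uid]
    return expired


sessions = {"alice": 100.0, "bob": 125.0}
print(sweep(sessions, now=140.0))  # alice: 40 > 30 is removed; bob: 15 <= 30 stays
```

Because the scan is repeated once per second, its cost grows with the total number of sessions rather than with the number of sessions that actually expire.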
(2) This approach is largely the same as the first, except that a one-shot timer is set for each user session. The session is disconnected when its timer expires, and the timer is updated each time user behavior data is received.
Although this method improves checking efficiency to some extent, it requires a large number of one-shot timers and frequent timer updates. If the number of connections is large, this puts pressure on the queue of timers waiting to be updated, and in severe cases can cause system congestion or even a crash.
Summary of the invention
The purpose of the present invention is to overcome the deficiencies of the background art described above and to provide a system and method for maintaining an active user set based on a time wheel and user behavior. The active user set is generated from normal user page behavior and kept up to date by means of a time wheel, which is timely and efficient, places no heavy load on the system, and ensures that interactive activities can be carried out effectively.
To achieve the above purpose, the present invention provides a system for maintaining an active user set based on a time wheel and user behavior, comprising several user terminals and a live platform server. Each user terminal is provided with a page behavior recording module, and the live platform server is provided with a cache preprocessing module, a distributed real-time computing module and an active user set function module.
The page behavior recording module is configured to: record the corresponding page behavior information for the page behaviors a user produces while watching a live stream, and submit it to the live platform server.
The cache preprocessing module is configured to: cache and preprocess the page behavior information submitted by each user terminal, and periodically send all cached and preprocessed page behavior information to the distributed real-time computing module in the form of several data slices.
The distributed real-time computing module comprises several real-time computing submodules and is configured to distribute each received data slice to a designated real-time computing submodule according to a hash strategy. Each real-time computing submodule parses and verifies the page behavior information in the data slice and adds the user corresponding to each verified item of page behavior information to the active user set shard corresponding to the current time. The active user set shards are the shards into which the active user set is divided according to a specified activity calculation period.
The active user set function module is configured to: with the specified activity calculation period as the cycle, update the active user set shard corresponding to the current time into a previously created time wheel. The time wheel is a data structure whose main body is a circular list joined end to end. The circular list contains several units called slots, each slot holds one active user set shard, and the circular list also has a pointer that points to the tail of the queue.
On the basis of the above technical solution, the page behavior information includes a page URL (Uniform Resource Locator), an event ID, a behavior event type, a user ID, and an identification code generated according to a specific rule. The generation rule of the identification code is: timestamp + terminal device ID obtained through the user terminal device API (Application Programming Interface) + random number.
On the basis of the above technical solution, the identification code has a fixed length and is encrypted.
On the basis of the above technical solution, when the active user set function module updates the active user set shard corresponding to the current time into the created time wheel with the specified activity calculation period as the cycle, the specific operation is: each time a specified activity calculation period elapses, the active user set shard in each slot of the time wheel moves forward one slot toward the head of the queue; the active user set shard corresponding to the current time is updated into the slot at the tail of the queue; and the pointer on the time wheel is moved forward one slot toward the head of the queue.
The present invention also provides a method for maintaining an active user set based on a time wheel and user behavior, comprising the following steps:
A. The page behavior recording module of each user terminal records the corresponding page behavior information for the page behaviors the user produces while watching a live stream, and submits the recorded page behavior information to the live platform server.
B. The cache preprocessing module of the live platform server caches and preprocesses the page behavior information submitted by each user terminal.
C. The cache preprocessing module periodically sends all cached and preprocessed page behavior information to the distributed real-time computing module in the form of several data slices; the distributed real-time computing module distributes each data slice to a designated real-time computing submodule according to a hash strategy.
D. Each real-time computing submodule parses and verifies the page behavior information in the data slice, and adds the user corresponding to each verified item of page behavior information to the active user set shard corresponding to the current time.
E. With the activity calculation period as the cycle, the active user set function module periodically updates the active user set shard corresponding to the current time into the created time wheel.
On the basis of the above technical solution, the page behavior information in step A includes a page URL, an event ID, a behavior event type, a user ID and an identification code generated according to a specific rule, the generation rule being: timestamp + terminal device ID obtained through the user terminal device API + random number. In step B, when caching, the cache preprocessing module discards page behavior information that does not meet the requirements. Such information includes page behavior information with an illegal page URL, with an invalid event ID, with an illegal event type enumeration value, with an empty user ID or a user ID that does not match the data field type, with a malformed timestamp, or with an illegal user terminal type identifier.
On the basis of the above technical solution, in step C, periodically sending all cached and preprocessed page behavior information to the distributed real-time computing module in the form of several data slices specifically comprises: every 1 second, the cache preprocessing module sends all currently cached and preprocessed page behavior information to the distributed real-time computing module, and for each transmission the page behavior information is combined into several data slices of at most 1M each.
On the basis of the above technical solution, in step C, distributing each data slice to a designated real-time computing submodule according to a hash strategy specifically comprises: the distributed real-time computing module assigns each data slice a unique ID and takes this ID modulo the number of real-time computing submodules; the resulting remainder is the ID of the designated real-time computing submodule; the data slice is then distributed to the real-time computing submodule with that ID.
On the basis of the above technical solution, the page behavior information in step A includes a page URL, an event ID, a behavior event type, a user ID and an identification code generated according to a specific rule, the generation rule being: timestamp + terminal device ID obtained through the user terminal device API + random number. Step D specifically comprises: each real-time computing submodule parses the page behavior information in the data slice and judges from the identification code of the parsed page behavior information whether the page behavior information is valid. If it is invalid, the page behavior information is discarded directly and processing ends. If it is valid, the behavior event type of the page behavior information is verified: if verification fails, the page behavior information is discarded directly and processing ends; if verification succeeds, the user ID of the page behavior information is added to the active user set shard corresponding to the current time.
On the basis of the above technical solution, step E specifically comprises: each time a specified activity calculation period elapses, the active user set shard in each slot of the time wheel moves forward one slot toward the head of the queue; the active user set shard corresponding to the current time is updated into the slot at the tail of the queue; and the pointer on the time wheel is moved forward one slot toward the head of the queue.
The beneficial effects of the present invention are:
1. The present invention generates the corresponding active user set shards from normal user page behavior. The active user set shards are the shards into which the active user set is divided according to the specified activity calculation period, and only users whose page behavior is normal are added, as active users, to the active user set shard corresponding to the current time. On this basis, a time wheel is used to keep the active user set up to date: with the specified activity calculation period as the cycle, the active user set shard corresponding to the current time is periodically updated into the created time wheel, which completes the timely maintenance of the active user set.
Compared with the prior art, the present invention uses neither a traditional repeated timer nor one-shot timers, but maintains the active user set based on user behavior and a time wheel. This is timely and efficient, places no heavy load on the system, and ensures that interactive activities can be carried out effectively.
2. In the present invention, the active user set shards are divided according to the specified activity calculation period, and the active user set function module also uses the specified activity calculation period as its maintenance cycle. The activity calculation period can be set and adjusted as required, and the number of active user set shards adjusts accordingly. The statistical granularity of active users can therefore be configured and adjusted as needed (activity can be counted per hour, per minute or at another granularity), and the cycle of the corresponding time wheel can be adjusted to match (maintenance every hour, every minute or at another interval), giving high flexibility and broad applicability.
3. In the present invention, the live platform server is provided with a cache preprocessing module, which not only caches and preprocesses the page behavior information submitted by each user terminal but also periodically sends all cached and preprocessed page behavior information to the distributed real-time computing module in the form of several data slices. Moreover, the distributed real-time computing module consists of several real-time computing submodules, and each data slice is assigned to a designated real-time computing submodule for processing, so multiple submodules can process multiple data slices at the same time, giving high processing efficiency and better real-time performance.
4. In the present invention, the real-time computing submodule to which a data slice is submitted is selected according to a hash strategy. The purpose is to improve the horizontal scalability of the distributed real-time computing module, so that the number of real-time computing submodules can be increased or decreased according to the volume of data to be processed.
Brief description of the drawings
Fig. 1 is a structural block diagram of the system for maintaining an active user set based on a time wheel and user behavior in an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of the time wheel;
Fig. 3 is a flowchart of the method for maintaining an active user set based on a time wheel and user behavior in an embodiment of the present invention.
Detailed description of the invention
The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
Referring to Fig. 1, an embodiment of the present invention provides a system for maintaining an active user set based on a time wheel and user behavior, comprising several user terminals and a live platform server. Each user terminal is provided with a page behavior recording module, and the live platform server is provided with a cache preprocessing module, a distributed real-time computing module and an active user set function module.
The page behavior recording module is configured to: record the corresponding page behavior information for the page behaviors a user produces while watching a live stream, encode the recorded page behavior information, and then submit it to the live platform server.
It will be understood that the page behaviors a user produces while watching a live stream mainly include page loading and clicks on page functions. The page behavior information includes several identifiers used to identify a page behavior: the page URL, the event ID (each page behavior has a unique ID), the behavior event type (load, click, etc.), the user ID (the user's unique ID), and an identification code generated according to a specific rule. The generation rule of the identification code is: timestamp + terminal device ID obtained through the user terminal device API (the terminal device ID is the unique ID of the user terminal device) + random number. The identification code has a fixed length and is encrypted after generation.
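As a minimal sketch of the generation rule above, the three components can be concatenated as fixed-width fields. The field widths (a 13-digit millisecond timestamp, a 16-character device ID, a 6-digit random number) and the function name are assumptions made for illustration; the specification does not state them. The encryption step is left out because no cipher is named.

```python
import secrets
import time
from typing import Optional

DEVICE_ID_LEN = 16  # assumed fixed width of the terminal device ID
RANDOM_DIGITS = 6   # assumed width of the random component


def make_identification_code(device_id: str, now_ms: Optional[int] = None) -> str:
    """Build timestamp + device ID + random number as one fixed-length string."""
    if len(device_id) != DEVICE_ID_LEN:
        raise ValueError("device ID must have the assumed fixed width")
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    timestamp = f"{now_ms:013d}"
    nonce = f"{secrets.randbelow(10 ** RANDOM_DIGITS):0{RANDOM_DIGITS}d}"
    # Encryption of the result is intentionally omitted in this sketch.
    return timestamp + device_id + nonce


code = make_identification_code("0123456789abcdef", now_ms=1700000000000)
print(len(code))  # 13 + 16 + 6 = 35
```

Fixed-width fields keep the total length constant and let the server split the code back into its parts by position after decryption.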
The cache preprocessing module is configured to: cache and preprocess the page behavior information submitted by each user terminal, and periodically send all cached and preprocessed page behavior information to the distributed real-time computing module in the form of several data slices.
The distributed real-time computing module comprises several real-time computing submodules and is configured to distribute each received data slice to a designated real-time computing submodule according to a hash strategy. Each real-time computing submodule parses and verifies the page behavior information in the data slice and adds the user ID of each verified item of page behavior information to the active user set shard corresponding to the current time. The active user set shards are the shards into which the active user set is divided in advance according to the specified activity calculation period. For example, if the specified activity calculation period is 1 hour, the active user set covering the 24 hours of a day has 24 corresponding active user set shards; if the specified activity calculation period is 1 minute, the active user set covering the 24 × 60 = 1440 minutes of a day has 1440 corresponding active user set shards.
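The shard arithmetic in this example can be checked with a short sketch. The function names are illustrative, and mapping a time to a shard by its offset within a UTC day is an assumption, since the specification does not say how the current time is mapped to a shard.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def shard_count(period_seconds: int) -> int:
    """Number of active user set shards covering one day."""
    if period_seconds <= 0 or SECONDS_PER_DAY % period_seconds != 0:
        raise ValueError("period must divide one day evenly")
    return SECONDS_PER_DAY // period_seconds


def shard_index(epoch_seconds: int, period_seconds: int) -> int:
    """Index of the shard that a given time falls into (UTC day assumed)."""
    return (epoch_seconds % SECONDS_PER_DAY) // period_seconds


print(shard_count(3600))  # 24 shards for a 1-hour period
print(shard_count(60))    # 1440 shards for a 1-minute period
```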
The active user set function module is configured to: with the specified activity calculation period as the cycle (for example, 1 hour or 1 minute), update the active user set shard corresponding to the current time into the created time wheel. As shown in Fig. 2, the time wheel is a data structure whose main body is a circular list joined end to end (a circular buffer). The circular list contains several units called slots, each slot holds one active user set shard, and the circular list also has a pointer (tail) that points to the tail of the queue.
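One possible way to model this structure is a bounded double-ended queue, where the right end plays the role of the tail of the queue. The class and method names below are hypothetical, and the use of `collections.deque` is an implementation choice for illustration, not something the specification prescribes.

```python
from collections import deque


class TimeWheel:
    """Fixed number of slots; each slot holds one active user set shard."""

    def __init__(self, slot_count: int) -> None:
        # With maxlen set, appending at the tail drops the item at the head.
        self.slots = deque((set() for _ in range(slot_count)), maxlen=slot_count)

    def push(self, shard: set) -> None:
        """Place the shard for the current period in the tail slot."""
        self.slots.append(shard)

    def active_users(self) -> set:
        """Union of all shards currently held in the wheel."""
        return set().union(*self.slots)


wheel = TimeWheel(3)
for shard in ({"u1"}, {"u2"}, {"u1", "u3"}, {"u4"}):
    wheel.push(shard)
print(sorted(wheel.active_users()))  # the first shard {"u1"} has been dropped
```

With this layout the full active user set is simply the union of the shards still in the wheel, so a user drops out once every shard containing them has left the head.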
Referring to Fig. 3, an embodiment of the present invention also provides a method for maintaining an active user set based on a time wheel and user behavior, applied to the above system and comprising the following steps:
Step S1: The page behavior recording module of each user terminal records the corresponding page behavior information for the page behaviors the user produces while watching a live stream; assembles the recorded page behavior information into JSON format and BASE64-encodes the JSON string; and submits the encoded page behavior information to the live platform server. Proceed to step S2.
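The encoding in step S1 can be illustrated as follows. The field names in the sample record are hypothetical; the specification lists the kinds of fields but not their keys.

```python
import base64
import json


def encode_page_behavior(record: dict) -> str:
    """Serialise a record to JSON, then BASE64-encode the JSON string."""
    payload = json.dumps(record, separators=(",", ":"), ensure_ascii=False)
    return base64.b64encode(payload.encode("utf-8")).decode("ascii")


sample = {
    "url": "https://example.com/room/1",  # hypothetical field names and values
    "event_id": "1001",
    "event_type": "click",
    "user_id": "u42",
}
encoded = encode_page_behavior(sample)
print(json.loads(base64.b64decode(encoded)) == sample)  # round trip recovers the record
```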
Step S2: The cache preprocessing module of the live platform server caches and preprocesses the page behavior information submitted by each user terminal (unifying the data format). Proceed to step S3.
Specifically, the caching process of the cache preprocessing module is: BASE64-decode the page behavior information submitted by each user terminal to recover its JSON string, and discard page behavior information that does not meet the requirements. Page behavior information that does not meet the requirements includes page behavior information with an illegal page URL, with an invalid event ID, with an illegal event type enumeration value, with an empty user ID or a user ID that does not match the data field type, with a malformed timestamp, or with an illegal user terminal type identifier, among others.
Further, the data format of the page behavior information after preprocessing is as follows:
Step S3: The cache preprocessing module periodically sends all cached and preprocessed page behavior information to the distributed real-time computing module in the form of several data slices; the distributed real-time computing module distributes each received data slice to a designated real-time computing submodule according to a hash strategy. Proceed to step S4.
In practice, the periodic sending in step S3 of all cached and preprocessed page behavior information to the distributed real-time computing module in the form of several data slices specifically comprises: every 1 second, the cache preprocessing module sends all currently cached and preprocessed page behavior information to the distributed real-time computing module. For each transmission, the page behavior information is combined into several data slices of at most 1M each; that is, each data slice consists of at least one complete item of page behavior information, and the size of each data slice is at most 1M. For example, if three items of page behavior information are currently cached, with sizes of 0.3M, 0.4M and 0.5M, then in this transmission the 0.3M and 0.4M items can be combined into one data slice and the 0.5M item forms another; the two data slices are then sent together to the distributed real-time computing module.
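The grouping in this example is consistent with a simple in-order greedy packing, sketched below. The greedy strategy itself, the function name, and reading "1M" as 1024 × 1024 bytes are assumptions; the specification only requires that each slice hold whole items and stay within the limit.

```python
MAX_SLICE_BYTES = 1024 * 1024  # "1M" read as 1 MiB; an assumption


def pack_slices(records, max_bytes=MAX_SLICE_BYTES):
    """Group whole records, in order, into slices no larger than max_bytes."""
    slices = []
    current = []
    current_size = 0
    for record in records:
        if len(record) > max_bytes:
            raise ValueError("a single record exceeds the slice limit")
        # Start a new slice when adding this record would overflow the current one.
        if current and current_size + len(record) > max_bytes:
            slices.append(current)
            current, current_size = [], 0
        current.append(record)
        current_size += len(record)
    if current:
        slices.append(current)
    return slices


# Scaled-down version of the example: sizes 3, 4 and 5 with a limit of 10.
packed = pack_slices([b"a" * 3, b"b" * 4, b"c" * 5], max_bytes=10)
print([[len(r) for r in s] for s in packed])  # [[3, 4], [5]]
```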
Further, in step S3, distributing each received data slice to a designated real-time computing submodule according to a hash strategy specifically comprises: the distributed real-time computing module assigns each data slice a unique ID and takes this ID modulo the number of real-time computing submodules; the resulting remainder is the ID of the designated real-time computing submodule; the data slice is then distributed to the real-time computing submodule with that ID. In the present invention, the real-time computing submodule to which a data slice is submitted is selected by a hash strategy in order to improve the horizontal scalability of the distributed real-time computing module, so that the number of real-time computing submodules can be increased or decreased according to the volume of data to be processed.
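The modulo rule can be sketched in a few lines. Using a running counter as the source of unique slice IDs is an assumption; the specification only says each slice receives a unique ID.

```python
from itertools import count

_slice_ids = count()  # hypothetical source of unique slice IDs


def assign_submodule(slice_id: int, submodule_count: int) -> int:
    """Return the ID of the submodule that should process this slice."""
    if submodule_count <= 0:
        raise ValueError("at least one submodule is required")
    return slice_id % submodule_count


for _ in range(6):
    sid = next(_slice_ids)
    print(sid, "->", assign_submodule(sid, 4))
```

With sequential IDs the slices are spread evenly across the submodules, and changing `submodule_count` is enough to route slices across a larger or smaller pool.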
Step S4: Each real-time computing submodule parses the page behavior information in the data slice and judges from the identification code of the parsed page behavior information whether the page behavior information is valid. If it is valid, proceed to step S5; otherwise, discard the page behavior information directly and end.
The specific process for judging whether the page behavior information is valid is as follows: decrypt the identification code in the page behavior information and extract the timestamp and the terminal device ID; verify whether the timestamp is within a reasonable range (that is, whether the difference between the timestamp and the current server time is within one minute); then verify whether the terminal device ID conforms to the specification (that is, whether the terminal device ID conforms to the generation rule of the identification code). If both checks pass, the page behavior information is judged to be valid; otherwise, it is judged to be invalid.
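A minimal sketch of these two checks is given below, operating on an identification code that has already been decrypted. The layout (13-digit millisecond timestamp, 16 lowercase hexadecimal characters for the device ID, 6-digit random number) is an assumption made for illustration; decryption is not shown because the specification names no cipher.

```python
import re

TIMESTAMP_DIGITS = 13                            # assumed: milliseconds since epoch
DEVICE_ID_LEN = 16                               # assumed width
DEVICE_ID_PATTERN = re.compile(r"[0-9a-f]{16}")  # assumed device ID format
RANDOM_DIGITS = 6                                # assumed width
MAX_SKEW_MS = 60_000                             # "within one minute"


def is_valid_code(plain_code: str, server_now_ms: int) -> bool:
    """Check the timestamp range and device ID format of a decrypted code."""
    if len(plain_code) != TIMESTAMP_DIGITS + DEVICE_ID_LEN + RANDOM_DIGITS:
        return False
    ts_part = plain_code[:TIMESTAMP_DIGITS]
    device_id = plain_code[TIMESTAMP_DIGITS:TIMESTAMP_DIGITS + DEVICE_ID_LEN]
    nonce = plain_code[TIMESTAMP_DIGITS + DEVICE_ID_LEN:]
    if not (ts_part.isascii() and ts_part.isdigit()):
        return False
    if not (nonce.isascii() and nonce.isdigit()):
        return False
    if abs(server_now_ms - int(ts_part)) > MAX_SKEW_MS:
        return False
    return DEVICE_ID_PATTERN.fullmatch(device_id) is not None


code = "1700000000000" + "0123456789abcdef" + "123456"
print(is_valid_code(code, server_now_ms=1700000030000))  # 30 s apart: accepted
print(is_valid_code(code, server_now_ms=1700000090000))  # 90 s apart: rejected
```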
Step S5: The real-time computing submodule verifies the behavior event type of the page behavior information. If verification passes, the user ID of the verified page behavior information is added to the active user set shard corresponding to the current time, and the method proceeds to step S6; if verification fails, the page behavior information is discarded directly and processing ends.
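Step S5 can be illustrated with the sketch below. The allowed event types follow the "load, click" examples given earlier; the record keys and keying each shard by the integer period number are assumptions.

```python
from collections import defaultdict

VALID_EVENT_TYPES = ("load", "click")  # from the examples in this description


def record_activity(shards, record, now_s, period_s):
    """Add the user to the current shard if the event type is valid."""
    if record.get("event_type") not in VALID_EVENT_TYPES:
        return False  # verification failed: the record is discarded
    shards[now_s // period_s].add(record["user_id"])
    return True


shards = defaultdict(set)
record_activity(shards, {"user_id": "u1", "event_type": "click"}, now_s=7250, period_s=3600)
record_activity(shards, {"user_id": "u2", "event_type": "hover"}, now_s=7260, period_s=3600)
print(dict(shards))  # {2: {'u1'}}
```

Because each shard is a set, repeated behavior from the same user within one period adds that user only once.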
Step S6: With the specified activity calculation period as the cycle, the active user set function module periodically updates the active user set shard corresponding to the current time into the created time wheel. End.
Specifically, periodically updating the active user set shard corresponding to the current time into the created time wheel, with the specified activity calculation period as the cycle, comprises the following operations:
Step S601: Each time a specified activity calculation period elapses, the active user set shard in each slot of the time wheel moves forward one slot toward the head of the queue (in the present invention, moving toward the head of the queue in the clockwise direction is defined as moving forward). At this point the active user set shard in the slot at the head of the queue is moved out and destroyed, and the slot at the tail of the queue is empty. Proceed to step S602.
Step S602: The active user set shard corresponding to the current time is updated into the slot at the tail of the queue. Proceed to step S603.
Step S603: The pointer on the time wheel is moved forward one slot toward the head of the queue.
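Steps S601 to S603 describe the rotation logically. In a ring buffer the same effect can be obtained without copying any shard: the tail index steps onto the slot that held the oldest shard, and that slot is overwritten. The sketch below takes this reading; it is an interpretation of the steps, and the class and method names are hypothetical.

```python
class RingTimeWheel:
    def __init__(self, slot_count: int) -> None:
        self.slots = [set() for _ in range(slot_count)]
        self.tail = slot_count - 1  # index of the slot holding the newest shard

    def rotate(self, new_shard: set) -> set:
        """Run one maintenance cycle and return the shard that was dropped."""
        # S601 and S603: stepping the pointer turns the old head slot into
        # the new tail slot, which is equivalent to shifting every shard by one.
        self.tail = (self.tail + 1) % len(self.slots)
        expired = self.slots[self.tail]
        # S602: the shard for the current period goes into the tail slot.
        self.slots[self.tail] = new_shard
        return expired

    def in_order(self) -> list:
        """Shards listed from the head of the queue to the tail."""
        n = len(self.slots)
        return [self.slots[(self.tail + 1 + i) % n] for i in range(n)]


wheel = RingTimeWheel(3)
for shard in ({"a"}, {"b"}, {"c"}):
    wheel.rotate(shard)
print(wheel.rotate({"d"}))  # {'a'} leaves the wheel
print(wheel.in_order())     # [{'b'}, {'c'}, {'d'}]
```

Each cycle touches exactly one slot, so the maintenance cost does not depend on the number of users or on the number of slots.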
The present invention is not limited to the above embodiments. Those of ordinary skill in the art may make improvements and modifications without departing from the principle of the present invention, and such improvements and modifications are also regarded as falling within the scope of protection of the present invention.
Content not described in detail in this specification belongs to the prior art known to those skilled in the art.