Detailed Description of Embodiments
To help those skilled in the art better understand the technical solutions of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings in those embodiments. Evidently, the described embodiments are only some, rather than all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present application without creative effort shall fall within the scope of protection of the present application.
Fig. 2 is a flowchart of one embodiment of the window statistics method for data described in the present application. Although the present application provides method operation steps or device structures as shown in the following embodiments or the accompanying drawings, the method or device may include more or fewer operation steps or module structures on the basis of routine work that requires no creative effort. For steps or structures that have no logically necessary causal relationship, the execution order of the steps or the module structure of the device is not limited to the execution order or module structure provided in the embodiments of the present application. When the method or module structure is applied in an actual device or terminal product, it may be executed sequentially or in parallel (for example, in a parallel-processor or multi-threaded environment) according to the method or module structure shown in the embodiments or the drawings.
Specifically, as shown in Fig. 2, a window statistics method for data provided by one embodiment of the present application may include:
S1: Obtain the unit-time-level data of the business dimension at the current moment, and the historical window statistics result of the business dimension for the unit time immediately preceding the current moment.
This embodiment is described using keyword statistics in a search system as the application scenario. In this scenario, the business system may compute business dimension statistics per unit time over the recorded data, generating unit-time-level data together with the periodic window negative data of that unit-time-level data. The unit time described here may be the configured unit of time at which the recorded data is periodically aggregated. For example, the search system in this embodiment may use the minute as the unit time and aggregate the data at minute level once every minute, generating the minute-level search keyword data for the current minute. Of course, the unit time may be customized according to the actual data processing, scenario, or design requirements; for example, the hour, day, or week may serve as the unit time, in which case business dimension statistics are computed over that unit time and the corresponding unit-time-level data is generated. Window statistics over data generally also involve a window length, which may be a preset time period over which the window statistics are computed, such as 24 hours, one week, or one month. In general, the system aggregates the recorded data at unit-time intervals and stores the result in a database (for example, it counts and stores the keyword search records within each minute). Real-time window statistics can then be computed from the stored unit-time-level (for example minute-level) data. The window length used for window statistics is therefore normally greater than the unit time of the system's business dimension statistics, as in minute-level keyword statistics combined with a 24-hour window.
This embodiment is described using the following application scenario: business dimension statistics are computed with the minute as the unit time, 24 hours is the window length for window statistics, the keywords searched by users are the business dimension, and every minute the system computes the TOP100 keywords searched by users of the search system over the past 24 hours. Specifically, in this embodiment the window statistics result for each minute may be stored in a "keyword TOP100 minute-level result calculation table", which can then be updated in real time with the window statistics result of the current minute. Accordingly, in the embodiments of the present application the statistical data to be processed is obtained, business dimension statistics on keywords are computed over it to generate minute-level data, and the minute-level data may be stored in the database HBase. Then, when computing the window statistics of the business dimension at the current moment, the unit-time-level data of the business dimension at the current moment can be obtained, together with the historical window statistics result of the business dimension for the unit time immediately preceding the current moment. For example, to compute the "keyword TOP100 minute-level result calculation table" for the current moment 2016-3-12 10:20, the system can obtain the minute-level keyword data for the minute 2016-3-12 10:20, and the historical window statistics result for the preceding minute 2016-3-12 10:19, that is, the "keyword TOP100 minute-level result calculation table" as of that minute.
Of course, the business dimension described in the present application may be specified by operators according to the specific application scenario. In the application scenario of this embodiment, the keywords searched by users in the search system serve as the business dimension. In other application scenarios the business dimension may include, but is not limited to, traffic, geographic location, access source, or a designated key field of a data interface, and may be set according to the actual business data processing scenario of the window statistics. The application scenario of this embodiment is described taking minute-level statistics with the keyword as the business dimension as an example.
The statistical data described in the present application may include log information generated by data operations, executions, and the like. In the application scenario of this embodiment, the statistical data may include the search application's original log information on the keywords searched and the number of searches. In this embodiment, the original log information of keyword searches can be obtained in real time every minute; the minute-level keyword data for the current moment is then generated from the original log information of the current minute and stored in the database HBase.
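As a minimal sketch of the minute-level aggregation described above, the following Python fragment counts keyword searches per minute. The log format (a list of timestamp and keyword pairs), the function name `minute_level_counts`, and the in-memory `Counter` standing in for the HBase table are all assumptions made for illustration; the application does not specify them.

```python
from collections import Counter
from datetime import datetime


def minute_level_counts(log_records):
    """Aggregate raw (timestamp, keyword) search logs into per-minute counts.

    The log record shape is hypothetical; a real system would parse it from
    the search application's original log information.
    """
    counts = Counter()
    for ts, keyword in log_records:
        # Truncate the timestamp to the start of its minute (the unit time).
        minute = ts.replace(second=0, microsecond=0)
        counts[(minute, keyword)] += 1
    return counts


# Hypothetical log lines for illustration only.
logs = [
    (datetime(2016, 3, 12, 10, 20, 5), "mobile phone"),
    (datetime(2016, 3, 12, 10, 20, 41), "mobile phone"),
    (datetime(2016, 3, 12, 10, 20, 59), "household appliances"),
    (datetime(2016, 3, 12, 10, 21, 0), "mobile phone"),
]
minute_data = minute_level_counts(logs)
```

Keying each record by (minute, keyword) keeps one row per keyword per unit time, which is the granularity the later steps query.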
When computing the real-time window statistics for the current moment, the embodiments of the present application can obtain the unit-time-level data of the business dimension at the current moment, and the historical window statistics result of the business dimension for the unit time immediately preceding the current moment.
S2: Query the periodic window negative data of the current moment from the stored historical unit-time-level data, and calculate the business dimension incremental data of the current moment from the periodic window negative data and the unit-time-level data of the business dimension at the current moment.
In the embodiments of the present application, when the unit-time-level statistics are computed, the periodic window negative data for the moment obtained by adding the window length to the current moment can be generated at the same time. Specifically, in one embodiment of the present application, the periodic window negative data may include the negative value -N of the statistical value N at statistics moment T, recorded as the statistical value of the corresponding business dimension at moment (T+L), that is, statistics moment T plus the configured window length L. In this way, by setting the window length, this embodiment can cancel out the expired, invalid statistical data that falls outside the window length when real-time window statistics are computed, so that the business dimension incremental data can be calculated and the window statistics based on business dimension incremental data remain accurate and effective. The specific setting, conversion, and generation of the periodic window negative data may be chosen according to the type of statistical data or the needs of the scenario. In one implementation provided by the embodiments of the present application, the periodic window negative data of the current moment includes:
S201: The negative statistical value of the business dimension at the current moment, generated at the previous window statistics moment of the current moment (one window length earlier) from the negative of the business dimension statistical value in the unit-time-level data of the business dimension at that previous window statistics moment.
For example, in the application scenario of this embodiment, where the search application uses the keyword as the business dimension, suppose the original log information obtained shows that the keyword "mobile phone" was searched 10 times during the minute at the current moment 2016-3-12 10:20. The unit-time-level data generated for the current moment 2016-3-12 10:20 can then include a data record in which the search count of the keyword "mobile phone" is 10. At the same time, the periodic window negative data of the keyword "mobile phone" can be generated for the periodic moment 2016-3-13 10:20, that is, the current moment 2016-3-12 10:20 plus the 24-hour window length; it can include a data record in which the search count of the keyword "mobile phone" at 2016-3-13 10:20 is -10. Fig. 3 is a schematic diagram of the records of the minute-level data and the corresponding periodic window negative data generated by the present application. As shown in Fig. 3, minute-level statistics can likewise be computed from the original log information for the other keywords at the current moment 2016-3-12 10:20, such as "household appliances", "iPhone SE", and "automobile", generating the corresponding unit-time-level data and periodic window negative data. For example, if at 2016-3-12 10:20 the one-minute search count of the keyword "household appliances" is 25 and that of the keyword "iPhone SE" is 0, the periodic window negative data of the keyword "household appliances" can be recorded as -25 at the periodic moment 2016-3-13 10:20, 24 hours after the current moment 2016-3-12 10:20. In the application scenario of this embodiment, "iPhone SE" may not have been searched even once in this minute because the upcoming release of "iPhone SE" had not yet been publicly disclosed on 2016-3-12. When there is no search record, either no record is stored for "iPhone SE" or its search count is set to 0.
In the application scenario of the embodiments of the present application, when the unit-time keyword statistics are computed over the statistical data to generate the unit-time-level data, the negative statistical value of each keyword at the moment obtained by adding the window length to the current moment can be generated accordingly and stored in a database such as HBase.
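The generation of periodic window negative data can be sketched as follows. This is an illustration under stated assumptions rather than the application's implementation: the function name `make_negative_records`, the dictionary layout, and the choice to store no record for a zero count (one of the two options the text allows) are hypothetical.

```python
from datetime import datetime, timedelta

# Window length L from the example scenario: 24 hours.
WINDOW_LENGTH = timedelta(hours=24)


def make_negative_records(stat_time, minute_counts, window_length=WINDOW_LENGTH):
    """For each count N at statistics moment T, emit -N keyed at moment T + L."""
    expiry = stat_time + window_length
    # Zero counts produce no record here; storing an explicit 0 would also work.
    return {(expiry, kw): -n for kw, n in minute_counts.items() if n != 0}


t = datetime(2016, 3, 12, 10, 20)
negatives = make_negative_records(t, {"mobile phone": 10, "iPhone SE": 0})
# negatives holds -10 for "mobile phone" at 2016-3-13 10:20
```

Writing the negative record at the same time as the positive one means the later window calculation needs no separate lookup of the data that is about to expire.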
The embodiments of the present application can query the periodic window negative data of the current moment from the historical unit-time-level data stored in the database, and then calculate the business dimension incremental data of the current moment from the periodic window negative data and the unit-time-level data of the business dimension at the current moment.
For example, in the application scenario of keyword statistics in the search system of this embodiment, the historical minute-level statistical data stored in the database HBase can be queried to find that, at the current moment 2016-3-13 10:20, the one-minute search count is 35 for the keyword "mobile phone", 20 for the keyword "household appliances", and 50 for the keyword "iPhone SE". The database HBase also holds the periodic window negative data for the periodic moment 2016-3-13 10:20, generated from the statistical data at the previous window statistics moment 2016-3-12 10:20: a one-minute search count of -10 for the keyword "mobile phone" and -25 for the keyword "household appliances". Adding the values for each keyword then gives the keyword incremental data for the 24-hour window length at the current minute 2016-3-13 10:20: the 24-hour window increment is 35 + (-10) = 25 for "mobile phone", 20 + (-25) = -5 for "household appliances", and 0 + 50 = 50 for "iPhone SE". The keyword search incremental data thus shows that, with the iPhone SE about to be released, the search popularity of mobile phones and of the SE model has increased accordingly. By comparison, "household appliances" was searched 25 times at the statistics moment 2016-3-12 10:20 and 20 times at the statistics moment 2016-3-13 10:20 on the following day, a relative drop of 5. Because the present application, at the statistics moment 2016-3-12 10:20, recorded the periodic window negative data of the keyword "household appliances" for 2016-3-13 10:20, one 24-hour window length later, the window statistics at 2016-3-13 10:20 can accurately determine that the 24-hour window increment of the keyword "household appliances" at the current moment is -5.
In the application scenario of the embodiments of the present application, the keyword incremental data of the current moment can be calculated from the minute-level data generated by the minute-level keyword statistics and the corresponding periodic window negative data, which is recorded one 24-hour window length after the data it negates.
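The increment calculation of step S2 can be sketched with the figures from the example at 2016-3-13 10:20. The function name `keyword_increments` and the plain dictionaries standing in for HBase query results are assumptions made for illustration.

```python
def keyword_increments(current_counts, negative_counts):
    """Add each keyword's current unit-time count to its periodic window
    negative data; a keyword missing on either side contributes 0."""
    keys = set(current_counts) | set(negative_counts)
    return {k: current_counts.get(k, 0) + negative_counts.get(k, 0) for k in keys}


# Minute-level data at the current moment 2016-3-13 10:20.
current = {"mobile phone": 35, "household appliances": 20, "iPhone SE": 50}
# Periodic window negative data written 24 hours earlier, at 2016-3-12 10:20.
negative = {"mobile phone": -10, "household appliances": -25}

increments = keyword_increments(current, negative)
# mobile phone: 35 + (-10) = 25, household appliances: 20 + (-25) = -5,
# iPhone SE: 50 + 0 = 50
```

Taking the union of the two key sets matters: a keyword searched only 24 hours ago still needs its expired count removed, and a keyword searched only now still needs its new count added.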
S3: Determine the window statistics result of the business dimension at the current moment based on the historical window statistics result and the business dimension incremental data of the current moment.
Evidently, computing the incremental business dimension statistics for one unit time usually involves far less computation than computing the business dimension statistics over every unit time in the window length. Because the embodiments of the present application use periodic window negative data to eliminate the expired data from the window length, the window statistics result of the current moment can be calculated from the incremental business dimension statistics computed for the window length at the current moment. Specifically, how the window statistics result is determined from the incremental business dimension statistics can be decided according to the specific application scenario or the designed form of window statistics.
The window statistics method for data provided by the embodiments of the present application computes window statistics on the basis of incremental data. Because the incremental data within one unit time is orders of magnitude smaller than the data that window statistics need to process in the prior art, completing the window statistics on the basis of incremental data can greatly reduce the system's memory overhead and network overhead and improve its overall processing performance. In the embodiments of the present application, when the business dimension statistics for a unit time are processed, the negative statistical value for the periodic moment, that is, the current statistics time plus the statistical period (the window length described in the embodiments), is generated at the same time and stored in the database. In this way, when the unit-time window statistics are computed upon reaching that periodic moment, the expired business dimension statistical data from one statistical period before the periodic moment can be cancelled out, which effectively ensures the accuracy of the incremental business dimension statistics calculated in the present application. The embodiments of the present application can thus significantly reduce system load and network overhead, and improve the data processing efficiency of window statistics and server performance.
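To illustrate why the negative data keeps the incremental result accurate, the following sketch compares an incremental running total against a full recomputation of every window. It is a simplified model under stated assumptions: a single business dimension value, integer minute indices in place of timestamps, and hypothetical function names and input counts.

```python
from collections import defaultdict


def sliding_window_incremental(counts_per_minute, window):
    """Maintain the window total using only the current count and the
    negative data scheduled `window` unit times earlier."""
    negatives = defaultdict(int)
    running = 0
    results = []
    for t, n in enumerate(counts_per_minute):
        negatives[t + window] -= n           # schedule -N at moment T + L
        running += n + negatives.pop(t, 0)   # increment = current + negative data
        results.append(running)
    return results


def sliding_window_recompute(counts_per_minute, window):
    """Reference result: re-sum every unit time inside each window."""
    return [
        sum(counts_per_minute[max(0, t - window + 1): t + 1])
        for t in range(len(counts_per_minute))
    ]


# Hypothetical per-minute counts, with a window of 3 unit times.
counts = [10, 0, 5, 7, 3]
incremental = sliding_window_incremental(counts, 3)
```

Each step of the incremental version touches two values, whereas the recomputation touches every unit time in the window; with a 24-hour window over minute-level data that is 2 reads instead of 1440 per dimension value.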
Of course, when the window statistics of the business dimension at the current moment are computed, the periodic window negative data derived from the current moment can be generated at the same time and stored in the database, to facilitate subsequent real-time window statistics. Therefore, in another embodiment of the window statistics method for data described in the present application, the method may further include:
S4: Take the negative of the business dimension statistical value in the unit-time-level data of the business dimension at the current moment, and use that negative value as the periodic window negative data of the business dimension at the periodic moment obtained by adding the configured window length to the current moment.
In one implementation of the present application, the business dimension window statistics record data updated every unit time can be used to represent the real-time window statistics result at the current moment. For example, in the application scenario of the above embodiment, the "HBase keyword window calculation result table" can record in real time the keyword search statistics result of the 24-hour window as updated at the current minute; specifically, it can include information such as which keywords users have searched in the past 24 hours and the number of searches for each of them. Therefore, in one embodiment of the window statistics method for data described in the present application, determining the window statistics result of the business dimension at the current moment based on the historical window statistics result and the business dimension incremental data of the current moment includes:
S301: Query, from the stored historical business dimension window result data, the business dimension window result data within the window statistics period of the current moment, merge that business dimension window result data with the business dimension incremental data, and write the merged result back to the historical business dimension window result data.
Fig. 4 is a flowchart of another embodiment of the window statistics method for data provided by the present application. The business dimension window result data within the window statistics period of the current moment is queried from the stored historical business dimension window result data and merged with the business dimension incremental data to obtain the business dimension window result data of the current moment, which can then be written back to the stored historical business dimension window result data. The business dimension window record data described here may take the form of the "HBase keyword window calculation result table" described above, or may be business dimension window result data stored in another form.
In the embodiments of the present application, the stored business dimension window result data can represent the real-time window statistics result of the statistical data at the current moment. Specifically, the business dimension window record data within the window length of the current moment can be queried from the configured unit-time-level keyword window statistics record data, and the historical business dimension window record data for that window length can be merged with the incremental business dimension statistics for that window length, giving the updated business dimension window record data within the window length at the current moment. For example, in the application scenario of the above embodiment, an "HBase keyword window calculation result table" can be set up to record the minute-level keyword search records of the 24-hour window statistics. At the current moment 2016-3-13 10:20, the historical 24-hour keyword window data record for the preceding unit time 2016-3-13 10:19 can be queried from the "HBase keyword window calculation result table". Suppose it contains a 24-hour window search count of 15000 for the keyword "mobile phone", 25000 for the keyword "household appliances", and 50000 for the keyword "iPhone SE". The queried historical keyword window data record is added, keyword by keyword, to the keyword incremental data 25, -5, and 50 of the current moment 2016-3-13 10:20. After merging, the 24-hour keyword window data record for the current moment 2016-3-13 10:20 contains a 24-hour window search count of 15025 for the keyword "mobile phone", 24995 for the keyword "household appliances", and 50050 for the keyword "iPhone SE". The merged keyword window record data for the current moment 2016-3-13 10:20 is then written to the keyword window record data in the "HBase keyword window calculation result table".
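The merge of step S301 can be sketched with the figures from this example. The function name `merge_into_window` is hypothetical, and a dictionary stands in for the "HBase keyword window calculation result table"; a real system would read from and write back to the database.

```python
def merge_into_window(window_result, increments):
    """Add the current moment's increments to the preceding unit time's
    window result, returning the updated window result."""
    updated = dict(window_result)  # leave the historical record untouched
    for keyword, delta in increments.items():
        updated[keyword] = updated.get(keyword, 0) + delta
    return updated


# 24-hour window result as of the preceding minute, 2016-3-13 10:19.
previous = {"mobile phone": 15000, "household appliances": 25000, "iPhone SE": 50000}
# Keyword incremental data for the current moment, 2016-3-13 10:20.
delta = {"mobile phone": 25, "household appliances": -5, "iPhone SE": 50}

current_window = merge_into_window(previous, delta)
# mobile phone: 15025, household appliances: 24995, iPhone SE: 50050
```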
Of course, in other application scenarios the final window statistics result may be obtained by additionally applying a TOP ranking to the window statistics result. For example, the minute-level keyword TOP100 calculation result may be updated every minute from the keyword search count statistics, to pick out the top N keywords that users are most interested in. In one embodiment of the window statistics method for data described in the present application, the TOPN ranking result of the current unit time can be calculated from the incremental business dimension statistics of the current unit time and the TOPN result data of the preceding unit time. Therefore, in another embodiment of the window statistics method for data described in the present application, determining the window statistics result of the business dimension at the current moment based on the historical window statistics result and the business dimension incremental data of the current moment includes:
S302: Obtain the business dimension statistics ranking result of the preceding unit time, merge it with the incremental business dimension statistics of the current moment, and then sort the merged data to obtain the business dimension statistics ranking result of the current moment.
Fig. 5 is a flowchart of another embodiment of the window statistics method for data provided by the present application. For example, in the application scenario of keyword statistics in the search application above, with the minute as the unit time (in other application scenarios the hour, day, week, and so on may of course be used instead), the TOP100 calculation result of the preceding minute can be retrieved from the HBase database, merged with the keyword incremental data calculated for the current moment, and re-sorted to obtain the latest TOP100 keyword search ranking result for the current moment, which can be stored in the configured "keyword TOP100 minute-level calculation result table". Specifically, suppose the TOP2 result at 09:55 is a search statistic of 200 for "household appliances" and 100 for "mobile phone". At 09:56, the 24-hour window keyword incremental data gives "mobile phone" a statistical value of 150 (the 200 searches at the current 09:56 plus the periodic window negative data of -50 from the 50 searches at 09:56 24 hours earlier). That is, at the current moment 09:56, combining the TOP2 of the preceding minute 09:55 with the window-length incremental data of 09:56 gives three records: a search statistic of 100 for "mobile phone", 200 for "household appliances", and 150 for "mobile phone". After the three records are merged, the search statistic at the current moment 09:56 is 150 for "mobile phone" and 200 for "household appliances". Sorting these two results gives the TOPN at the current moment 09:56: a search statistic of 200 for "household appliances" and 150 for "mobile phone".
In this way, the TOPN ranking at the current moment is computed from the incremental business dimension statistics. Compared with the large volume of data reads, requests, and calculations, and the memory consumption that often reaches several gigabytes, in prior-art window statistics such as the addition approach or the subtraction approach, the embodiments of the present application can greatly reduce system memory consumption and network overhead when calculating unit-time-level TOPN results for the business dimension, and significantly improve the efficiency of window statistics.
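A TOPN update along the lines of step S302 can be sketched as follows. This is one possible reading, not the application's definitive procedure: it assumes the merge is additive, as in step S301, and reuses the window totals and increments from the S301 example. The function name `update_top_n` is hypothetical.

```python
import heapq


def update_top_n(previous_ranking, increments, n):
    """Merge the preceding unit time's ranking with the current increments
    (assumed additive, as in S301), then keep the n largest entries."""
    merged = dict(previous_ranking)
    for keyword, delta in increments.items():
        merged[keyword] = merged.get(keyword, 0) + delta
    # nlargest returns the entries sorted from highest to lowest count.
    return heapq.nlargest(n, merged.items(), key=lambda item: item[1])


previous_ranking = [
    ("iPhone SE", 50000),
    ("household appliances", 25000),
    ("mobile phone", 15000),
]
delta = {"mobile phone": 25, "household appliances": -5, "iPhone SE": 50}

top2 = update_top_n(previous_ranking, delta, 2)
```

Only the preceding ranking and one unit time of increments are held in memory, which is the source of the savings described above. A design point worth noting: if only the top N entries are retained between unit times, a keyword outside the previous top N re-enters with its increment alone, so a system that needs exact totals for such keywords would keep the full window result table and rank from that.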
When the implementation provided by the above application scenario performs the window statistics calculation for data in the search system, the window statistics calculation for one minute needs to query the minute-level data only twice, so QPS = 2/60. Assuming 2000 keywords are searched per minute, the queried data volume is at most 2000 × 2 × 0.4 KB = 1.6 MB; the query load on the "HBase keyword window calculation result table" is QPS = 2000 × 2/60 ≈ 66, and the load of finally writing the updated data to HBase is TPS = 2000 × 2/60 ≈ 66. Because the incremental data handled in the window statistics of the present application is orders of magnitude smaller than the data handled in existing approaches, system memory overhead can be greatly reduced. In existing minute-level statistics, for example, 1 GB of data or more may have to be loaded into memory, which the system may be unable to process within one minute or only at enormous cost, whereas the method of the present application needs to load only a few megabytes of data into memory to complete the real-time window statistics for one minute. It can thus be seen that the window statistics method for data provided by the embodiments of the present application computes window statistics on the basis of incremental data. Because the incremental data within one unit time is orders of magnitude smaller than the data that window statistics need to process in the prior art, completing the window statistics on the basis of incremental data can greatly reduce the system's memory overhead and network overhead and improve its overall processing performance. Accordingly, the present application also provides a method applicable to real-time window statistics on keyword data, which specifically may include:
obtaining the unit-time-level data of the keywords at the current moment, and the historical window statistics result of the keywords for the unit time immediately preceding the current moment;
querying the periodic window negative data of the keywords at the current moment from the stored historical unit-time-level data, and calculating the keyword incremental data of the current moment from the periodic window negative data and the unit-time-level data of the keywords at the current moment;
determining the window statistics result of the keywords at the current moment based on the historical window statistics result and the keyword incremental data of the current moment.
Of course, the implementations provided by the above embodiments of the present application are applicable to, but not limited to, application scenarios in which window statistics are computed over search keyword data in a search system. The implementation of window statistics based on incremental data described in the present application can also be used in other application scenarios, such as window statistics along the user dimension of account transaction records, or traffic window statistics over business data stored in a database. In some application scenarios, window statistics over business data and window statistics over the search keywords of a search service may both be included.
The window statistics method for data provided by the present application can be applied to various server systems that compute window statistics over data. It can effectively address the high TPS and QPS requirements placed on data storage during window statistics, such as computing window statistical values or TOP rankings, as well as the excessive server memory overhead. Accordingly, on the basis of the window statistics method for data described above, the present application provides a window statistics device for data. Fig. 6 is a schematic diagram of the module structure of one embodiment of the window statistics device for data provided by the present application. As shown in Fig. 6, the device may include:
a data acquisition module 101, which can be used to obtain the unit-time-level data of the business dimension at the current moment, and the historical window statistics result of the business dimension for the unit time immediately preceding the current moment;
an incremental data calculation module 102, which can be used to query the periodic window negative data of the current moment from the stored historical unit-time-level data, and to calculate the business dimension incremental data of the current moment from the periodic window negative data and the unit-time-level data of the business dimension at the current moment;
a window statistics module 103, which can be used to determine the window statistics result of the business dimension at the current moment based on the historical window statistics result and the business dimension incremental data of the current moment.
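The three modules can be sketched as methods of one class. This is an illustrative arrangement only: the class name `WindowStatisticsDevice`, the method names, and the in-memory dictionaries standing in for the database are assumptions, and the figures reuse the "mobile phone" values from the method embodiments.

```python
from datetime import datetime, timedelta


class WindowStatisticsDevice:
    """In-memory stand-in for the device; dictionaries replace the database."""

    def __init__(self):
        self.unit_data = {}       # (moment, key) -> unit-time-level count
        self.negative_data = {}   # (moment, key) -> periodic window negative data
        self.window_results = {}  # moment -> {key: window statistics result}

    def acquire(self, now, previous):
        """Role of data acquisition module 101."""
        current = {k: v for (t, k), v in self.unit_data.items() if t == now}
        history = self.window_results.get(previous, {})
        return current, history

    def compute_increments(self, now, current):
        """Role of incremental data calculation module 102."""
        negatives = {k: v for (t, k), v in self.negative_data.items() if t == now}
        keys = set(current) | set(negatives)
        return {k: current.get(k, 0) + negatives.get(k, 0) for k in keys}

    def window_statistics(self, now, history, increments):
        """Role of window statistics module 103."""
        result = dict(history)
        for key, delta in increments.items():
            result[key] = result.get(key, 0) + delta
        self.window_results[now] = result
        return result


now = datetime(2016, 3, 13, 10, 20)
previous = now - timedelta(minutes=1)

device = WindowStatisticsDevice()
device.unit_data[(now, "mobile phone")] = 35
device.negative_data[(now, "mobile phone")] = -10
device.window_results[previous] = {"mobile phone": 15000}

current, history = device.acquire(now, previous)
increments = device.compute_increments(now, current)
result = device.window_statistics(now, history, increments)
```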
The window statistics device for data provided by the embodiments of the present application computes window statistics on the basis of incremental data. Because the incremental data within one unit time is orders of magnitude smaller than the data that window statistics need to process in the prior art, completing the window statistics on the basis of incremental data can greatly reduce the system's memory overhead and network overhead and improve its overall processing performance. In the embodiments of the present application, when the business dimension statistics for a unit time are processed, the negative statistical value for the periodic moment, that is, the current statistics time plus the statistical period, is generated at the same time and stored in the database. In this way, when the unit-time window statistics are computed upon reaching that periodic moment, the expired business dimension statistical data from one statistical period before the periodic moment can be cancelled out, which effectively ensures the accuracy of the incremental business dimension statistics calculated in the present application.
The periodic window negative data of the present application may include the periodic window negative data for the corresponding periodic moment that is generated when the system computes the unit-time statistics of the business dimension. The periodic window negative data may be generated by the device of the present application, or generated by another module or device of the system and stored in the database for the device to read and use. Each time unit-time statistics are computed to generate unit-time-level data, the periodic window negative data of that data can be generated accordingly. Therefore, in one embodiment of the device described in the present application, the periodic window negative data of the current moment may include:
A negative statistical value of the business dimension for the current moment, generated at the previous window statistics moment of the current moment from the negative of the business dimension statistical values in the unit-time-level data of the business dimension at that previous window statistics moment.
As described above, when the window statistics of the current moment are processed, the period-window negative data for the period moment equal to the current moment plus the set window length can be generated at the same time, so that the window statistics at that later statistics moment can eliminate the expired data and calculate the incremental data accurately. Fig. 7 is a schematic diagram of the module structure of another embodiment of the window statistics device for data provided by the present application. Accordingly, in another embodiment of the device described in the present application, the device can further include:
The window negative data generation module 104 can be used to take the negative of the business dimension statistical values in the unit-time-level data of the business dimension at the current moment, and to use that negative value as the period-window negative data of the business dimension for the period moment equal to the current moment plus the set window length.
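The behaviour of module 104 can be sketched as follows. This is a minimal illustration that assumes the unit-time-level data is a mapping from keyword to count and that moments are `datetime` values. The function name `make_period_window_negative` and the sample values are hypothetical.

```python
from datetime import datetime, timedelta


def make_period_window_negative(moment, unit_data, window_length):
    """Negate the unit-time counts and schedule them one window length later."""
    period_moment = moment + window_length  # moment at which the data expires
    negative_data = {keyword: -count for keyword, count in unit_data.items()}
    return period_moment, negative_data


# Hypothetical sample: statistics of the minute 10:00 with a 24-hour window.
moment = datetime(2024, 1, 1, 10, 0)
period_moment, negative_data = make_period_window_negative(
    moment, {"phone": 3, "tablet": 1}, timedelta(hours=24)
)
print(period_moment, negative_data)
```

The returned pair would then be stored, for example in the database, so that it can be looked up by `period_moment` when that moment is reached.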
In the embodiments of the present application, when the business dimension statistics of a unit time are processed, a negative statistical value is generated at the same time for the period moment equal to the current statistics time plus the statistical period, and is stored in the database. Thus, when that period moment is reached and the window statistics of the unit time are performed, the expired business dimension statistics from more than one statistical period before that period moment can be cancelled out, which effectively ensures the accuracy of the business dimension incremental data calculated in the present application.
Fig. 8 is a schematic diagram of the module structure of another embodiment of the window statistics device for data provided by the present application. As shown in Fig. 8, in an embodiment of the device, the window statistics module 103 can include:
The business dimension window calculation result unit 1301 can be used to store the calculated historical business dimension window result data, to query the business dimension window result data of the current moment within the window statistics period from the stored historical business dimension window result data, and to update the historical business dimension window result data after merging the business dimension window result data with the business dimension incremental data.
The historical business dimension window result data in this embodiment can take the form of the "HBase keyword window calculation result table" described above, or can be business dimension search statistics over the statistical window length that are calculated and stored in another form. The "HBase keyword window calculation result table" can record and update in real time the keyword search statistics of the 24-hour window as of the current minute, and can specifically include information such as which keywords users have searched in the past 24 hours and the number of searches for each of these keywords.
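The update performed by unit 1301 can be sketched with an in-memory stand-in for the window result table. The dictionary `window_table` and the function `apply_increment` are hypothetical, and dropping a row when its count reaches zero is an assumption of this sketch rather than something the application specifies.

```python
def apply_increment(window_table, increment):
    """Merge an increment into the window result table in place."""
    for keyword, delta in increment.items():
        new_count = window_table.get(keyword, 0) + delta
        if new_count == 0:
            # Assumption: a keyword with no searches left in the window is removed.
            window_table.pop(keyword, None)
        else:
            window_table[keyword] = new_count
    return window_table


# Hypothetical table contents and increment.
window_table = {"phone": 50, "laptop": 1}
apply_increment(window_table, {"phone": 1, "tablet": 1, "laptop": -1})
print(window_table)
```

Only the rows named in the increment are read and written, so the rest of the table is left untouched.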
In other application scenarios, TOP ranking can also be performed on the window statistics to generate the final real-time window statistics of the current moment. For example, the minute-level keyword TOP100 calculation result is updated every minute according to the keyword search counts, so as to select the top N keywords that users are most interested in. In an embodiment of the window statistics device for data described in the present application, the TOPN ranking result of the current unit time can be calculated based on the business dimension statistics incremental data of the current unit time and the TOPN result data of the previous unit time. Fig. 9 is a schematic diagram of the module structure of another embodiment of the window statistics device for data provided by the present application. As shown in Fig. 9, in an embodiment of the device, the window statistics module 103 can include:
The business dimension window ranking result unit 1302 can be used to obtain the business dimension statistics search ranking result of the previous unit time, to merge it with the business dimension statistics incremental data of the current moment, and to sort the merged result to obtain the business dimension statistics search ranking result of the current moment.
In this way, the TOPN ranking of the current moment is performed based on the business dimension statistics incremental data. Compared with prior-art window statistics such as the addition scheme or the subtraction scheme, which involve a large amount of data reading, requests and calculation, and memory consumption that often reaches several gigabytes, the embodiments of the present application can greatly reduce the system memory consumption and network overhead when calculating unit-time-level business dimension TOPN results, and significantly improve the efficiency of window statistics.
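The merge-then-sort step of unit 1302 can be sketched as follows. The sketch assumes that the ranking result of the previous unit time is kept as a list of (keyword, count) pairs and that this list is long enough to contain every keyword that may enter the top N. The function name `update_top_n`, the tie-breaking rule and the sample values are hypothetical.

```python
def update_top_n(previous_ranking, increment, n):
    """Merge the previous ranking with the increment, then sort and keep the top n."""
    merged = dict(previous_ranking)
    for keyword, delta in increment.items():
        merged[keyword] = merged.get(keyword, 0) + delta
    # Keywords whose count has dropped to zero are no longer in the window.
    candidates = [(keyword, count) for keyword, count in merged.items() if count > 0]
    # Sort by count in descending order; ties are broken alphabetically (an assumption).
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return candidates[:n]


# Hypothetical ranking of the previous minute and increment of the current minute.
previous_ranking = [("phone", 50), ("laptop", 20), ("camera", 5)]
increment = {"phone": 1, "tablet": 1, "laptop": -1}
top2 = update_top_n(previous_ranking, increment, 2)
print(top2)
```

The cost of each update depends on the size of the previous ranking and of the increment, not on the number of raw records in the window.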
The keywords described in the embodiments of the present application can cover application scenarios in which unpredictable keyword searches are updated in real time. For example, in a search service, the keywords that users search in each minute can be regarded as determined by the users and cannot be predicted by the system. In other application scenarios, business data can also be obtained in real time, and the key fields specified in each piece of business data can then be searched and counted under a specified business dimension to obtain window statistics, for example statistics that are updated daily on the amount paid in the purchase data records of all users in the last month, or on the quantity of goods traded by all buyers in the last month. When window statistics are performed on business data of this kind, the key fields can be specified in advance, and window statistics of the specified key fields are then performed on each updated piece of business data. The window statistics based on unit-time incremental data described in the embodiments of the present application can likewise be used, which greatly reduces the amount of data to be processed, reduces system memory overhead and network overhead, and improves system server performance.
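The daily purchase-amount example can be worked through with a short sketch. It assumes one unit-time record per day that maps each user to the amount paid on that day, expressed in integer units, and it uses a 3-day window instead of one month only to keep the sample small. The function name `run_daily_window` and all sample values are hypothetical.

```python
from datetime import date, timedelta


def run_daily_window(daily_amounts, window_days):
    """Return, for each day, the per-user amount paid within the sliding window."""
    negatives = {}  # expiry day -> {user: -amount}
    window = {}     # running window statistics
    results = {}
    day = min(daily_amounts)
    last_day = max(daily_amounts)
    while day <= last_day:
        unit_data = daily_amounts.get(day, {})
        # Schedule the negative data for the day on which today's data expires.
        expiry = day + timedelta(days=window_days)
        negatives[expiry] = {user: -amount for user, amount in unit_data.items()}
        # Increment = today's unit data + negative data that falls due today.
        increment = dict(unit_data)
        for user, amount in negatives.pop(day, {}).items():
            increment[user] = increment.get(user, 0) + amount
        for user, delta in increment.items():
            window[user] = window.get(user, 0) + delta
        results[day] = dict(window)
        day += timedelta(days=1)
    return results


daily_amounts = {
    date(2024, 1, 1): {"alice": 10},
    date(2024, 1, 2): {"bob": 5},
    date(2024, 1, 3): {"alice": 7},
    date(2024, 1, 4): {"alice": 1},
}
results = run_daily_window(daily_amounts, window_days=3)
print(results[date(2024, 1, 4)])
```

On 4 January the negative data generated on 1 January falls due, so the amount of 10 paid by "alice" on 1 January is cancelled out and the window covers 2 to 4 January only.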
The window statistics device for data provided by the embodiments of the present application implements window statistics of data based on incremental data. Because the incremental data within a unit time is much smaller, in order of magnitude, than the data that prior-art window statistics must process, completing window statistics based on incremental data as in the present application can greatly reduce the memory overhead and network overhead of the system and improve its overall processing performance. In the embodiments of the present application, when the business dimension statistics of a unit time are processed, a negative statistical value is generated at the same time for the period moment equal to the current statistics time plus the statistical period, and is stored in the database. Thus, when that period moment is reached and the window statistics of the unit time are performed, the expired business dimension statistics from more than one statistical period before that period moment can be cancelled out, which effectively ensures the accuracy of the business dimension incremental data calculated in the present application. The embodiments of the present application can effectively and significantly reduce system load and network overhead, and improve the data processing efficiency and server performance of window statistics.
As described above, the window statistics method or device for data provided by the present application can be used for window statistics performed by a server in a search system. In this way, when the search system performs window statistics on search data based on the business dimension statistics incremental data to obtain the business dimension statistics calculation result table or the unit-time-level business dimension TOPN calculation result within the window length, the memory overhead and network overhead of the system can be greatly reduced, the system load lowered, and the overall performance of the system, including the processing server and the database, improved, which correspondingly saves network resources and hardware implementation cost. Therefore, based on the window statistics method or device for data described above, the present application provides a window statistics system for data. Fig. 10 is a schematic diagram of the framework structure of an embodiment of the window statistics system for data provided by the present application. The system can specifically include, for example, a search service system or a business data window statistics processing system. Specifically, as shown in Fig. 10, the window statistics system for data can include:
The first data processing unit 201 can be used to obtain statistical data, perform unit-time business dimension statistics on the statistical data, generate the unit-time-level data of the business dimension at the statistics moment, and generate the period-window negative data of the business dimension for the moment equal to the statistics moment plus the window length; it can also be used to calculate the business dimension statistics incremental data of the window length at the current moment based on the unit-time-level data and the period-window negative data;
The database 202 can be used to store the generated unit-time-level data and the corresponding period-window negative data, and to store the window statistics calculated by the second data processing unit 203;
The second data processing unit 203 can be used to obtain the unit-time-level data of the business dimension at the current moment and the historical window statistics of the business dimension for the previous unit time of the current moment; it can also be used to query the period-window negative data of the current moment from the stored historical unit-time-level data and to calculate the business dimension incremental data of the current moment from the period-window negative data and the unit-time-level data of the business dimension at the current moment; and it can also be used to determine the window statistics of the business dimension at the current moment based on the historical window statistics and the business dimension incremental data of the current moment.
Of course, as described above, in some specific application scenarios the business dimension statistics window calculation result or the business dimension TOPN search ranking result can be calculated based on the business dimension statistics incremental data. Fig. 11 is a schematic diagram of the framework structure of an application scenario of an embodiment of the window statistics system for data provided by the present application. In a specific embodiment of the system described in the present application, the second data processing unit 203 is configured to perform at least one of the following:
Querying the business dimension window result data of the current moment within the window statistics period from the stored historical business dimension window result data, and updating the historical business dimension window result data after merging the business dimension window result data with the business dimension incremental data;
Obtaining the business dimension statistics search ranking result of the previous unit time, merging it with the business dimension statistics incremental data of the current moment, and sorting the merged result to obtain the business dimension statistics search ranking result of the current moment.
Obtaining the business dimension statistics search ranking result of the previous unit time from the business dimension statistics search ranking results stored in the database 202, merging it with the business dimension statistics incremental data of the statistical window at the current moment, sorting the merged result to obtain the business dimension statistics search ranking result of the current moment, and saving that result to the database 202.
The window statistics method, device and system for data provided by the present application implement window statistics of data based on business dimension statistics incremental data, which can greatly reduce the memory overhead of the system, improve system performance, reduce network overhead, and improve the data processing efficiency of the window statistics system.
Although the content of the present application mentions descriptions of data calculation, storage and information interaction such as business dimension statistics of keywords, storing data in data tables and an HBase database, and the window statistics of the addition or subtraction scheme involved in window statistics, the present application is not limited to cases that must conform to an industry standard, an information interaction or calculation standard, or the description of the embodiments. Implementations that are slightly modified on the basis of certain window statistics methods, data storage, information interaction or the descriptions of the embodiments can also achieve implementation effects that are the same as, equivalent to, or similar to those of the above embodiments, or that are predictable after the modification. Calculation methods, information interaction and information judgment and feedback modes, data storage modes and the like that apply such modifications or variations can still fall within the scope of the optional embodiments of the present application.
Although the present application provides the method operation steps described in the embodiments or flow charts, more or fewer operation steps can be included based on conventional or non-inventive means. The order of steps listed in the embodiments is only one of many possible execution orders and does not represent the only execution order. When an actual device or client product executes, the steps can be executed sequentially or in parallel according to the methods shown in the embodiments or the accompanying drawings (for example, in an environment with parallel processors or multi-threaded processing).
The units, devices or modules described in the above embodiments can be implemented by a computer chip or an entity, or by a product having a certain function. For convenience of description, the above device is described as being divided into various modules by function. Of course, when the present application is implemented, the functions of the modules can be implemented in one or more pieces of software and/or hardware, and a module that implements a given function can also be implemented by a combination of multiple sub-modules or sub-units.
Those skilled in the art also know that, in addition to implementing a controller purely as computer-readable program code, it is entirely possible to logically program the method steps so that the controller implements the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Such a controller can therefore be regarded as a hardware component, and the devices included in it for implementing various functions can also be regarded as structures within the hardware component. Or even, the devices for implementing various functions can be regarded both as software modules that implement the method and as structures within the hardware component.
The present application can be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, classes and the like that perform particular tasks or implement particular abstract data types. The present application can also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules can be located in both local and remote computer storage media, including storage devices.
From the description of the above embodiments, those skilled in the art can clearly understand that the present application can be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium, such as a ROM/RAM, a magnetic disk or an optical disc, and includes several instructions for causing a computer device (which can be a personal computer, a mobile terminal, a server, a network device or the like) to execute the methods described in the embodiments of the present application or in some parts of the embodiments.
The embodiments in this specification are described in a progressive manner; for parts that are the same or similar between embodiments, reference can be made to one another, and each embodiment focuses on its differences from the other embodiments. The present application can be used in numerous general-purpose or special-purpose computing system environments or configurations, for example: personal computers, server computers, handheld or portable devices, laptop devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable electronic devices, network PCs, minicomputers, mainframe computers, and distributed computing environments that include any of the above systems or devices.
Although the present application has been described by way of embodiments, those of ordinary skill in the art will appreciate that the present application has many variations and changes that do not depart from its spirit, and it is intended that the appended claims cover these variations and changes without departing from the spirit of the present application.