CN106503054A - A kind of data query method and server - Google Patents

A data query method and server

Info

Publication number
CN106503054A
Authority
CN
China
Prior art keywords
time cycle
time
cycle
middle table
active
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201610850912.7A
Other languages
Chinese (zh)
Inventor
邓小龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Jinli Communication Equipment Co Ltd
Original Assignee
Shenzhen Jinli Communication Equipment Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Jinli Communication Equipment Co Ltd filed Critical Shenzhen Jinli Communication Equipment Co Ltd
Priority to CN201610850912.7A priority Critical patent/CN106503054A/en
Publication of CN106503054A publication Critical patent/CN106503054A/en
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/278 Data partitioning, e.g. horizontal or vertical partitioning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

An embodiment of the invention discloses a data query method and a server. The data query method includes: receiving a query request, the query request being used to query the total number of active users from the start of a first time period to the end of a second time period; obtaining a target sub-table of a target intermediate table corresponding to the second time period, where the target intermediate table contains the most recent active time period of each user up to the second time period, and the target sub-table contains the number of active users in each time period up to the second time period, as counted from the target intermediate table; and summing, from the target sub-table, the numbers of active users in each time period from the start of the first time period to the end of the second time period, and taking the sum as the total number of active users from the start of the first time period to the end of the second time period. The invention enables fast data queries and saves cost.

Description

A data query method and server
Technical field
The present invention relates to the field of electronic technology, and in particular to a data query method and a server.
Background
Big data is an ongoing technological revolution in the global information field and will have a vital impact on every industry, including mobile phone manufacturers. Big data builds a vertical ecosystem and serves as a bridge that lets the public connect freely with the world. Through comprehensive aggregation, deep mining and efficient application, it lets enterprises gain fuller customer insight and deeper analytical capability, and provide information services to society through an integrated platform, creating social value.
At present, with the rapid development and spread of intelligent terminals, their usage scenarios keep multiplying. A user can log in to an instant messaging account on a terminal to chat, or use a video account to watch videos. Each time a user logs in to an account and performs an operation, an operation record is generated, which means the user is active at that time. Big data analysis of each user's active state is commonly needed: for example, a web analysis interface may need to query the number of active users over an arbitrary time range for a given application and type dimension. The difficulty in computing the number of active users is that the data volume is large and deduplication is required (the same user can be counted as active at most once within a given range). If 30 million records are produced per day, there are 900 million records per month, so querying the active users of 6 months requires deduplication over 5.4 billion records.
Because the query time range is not known in advance, the data cannot be preprocessed and can only be queried directly. This requires a large, powerful query engine; with distributed computing at this data scale, about 50 servers are needed to support such a query (returning data in roughly 60 seconds), which is costly and slow.
Summary of the invention
Embodiments of the present invention provide a data query method and a server that, by building intermediate tables and sub-tables in advance, enable fast data queries and save cost.
An embodiment of the present invention provides a data query method, which may include:
receiving a query request, the query request being used to query the total number of active users from the start of a first time period to the end of a second time period;
obtaining a target sub-table of a target intermediate table corresponding to the second time period, where the target intermediate table contains the most recent active time period of each user up to the second time period, and the target sub-table contains the number of active users in each time period up to the second time period, as counted from the target intermediate table;
summing, from the target sub-table, the numbers of active users in each time period from the start of the first time period to the end of the second time period, and taking the sum as the total number of active users from the start of the first time period to the end of the second time period.
An embodiment of the present invention provides a server, which may include:
a receiving unit, configured to receive a query request, the query request being used to query the total number of active users from the start of a first time period to the end of a second time period;
a first acquisition unit, configured to obtain a target sub-table of a target intermediate table corresponding to the second time period, where the target intermediate table contains the most recent active time period of each user up to the second time period, and the target sub-table contains the number of active users in each time period up to the second time period, as counted from the target intermediate table;
a statistics unit, configured to sum, from the target sub-table, the numbers of active users in each time period from the start of the first time period to the end of the second time period, and to take the sum as the total number of active users from the start of the first time period to the end of the second time period.
In the embodiments of the present invention, a query request is received, the query request being used to query the total number of active users from the start of a first time period to the end of a second time period. The target sub-table of the target intermediate table corresponding to the second time period is obtained; the target intermediate table contains the most recent active time period of each user up to the second time period, and the target sub-table contains the number of active users in each time period up to the second time period, as counted from the target intermediate table. The numbers of active users in each time period from the start of the first time period to the end of the second time period are summed from the target sub-table, and the sum is taken as the total number of active users for that range. By building the intermediate tables and their sub-tables in advance, the invention allows the total number of active users to be queried quickly and saves storage cost.
Brief description of the drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the accompanying drawings needed for describing the embodiments are briefly introduced below. The drawings described below show some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a first embodiment of the data query method provided by an embodiment of the present invention;
Fig. 2 is a schematic flowchart of a second embodiment of the data query method provided by an embodiment of the present invention;
Fig. 3 is a web-side result data table provided by an embodiment of the present invention;
Fig. 4 is a web-side query result diagram provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a server provided by an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a first construction unit provided by an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of an adjustment unit provided by an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of another server provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the invention.
The data query method and device provided by the embodiments of the present invention are described in detail below with reference to Fig. 1 to Fig. 4.
Referring to Fig. 1, which is a schematic flowchart of a first embodiment of the data query method provided by an embodiment of the present invention, the data query method described in this embodiment includes the following steps:
S101: receive a query request, the query request being used to query the total number of active users from the start of a first time period to the end of a second time period;
In this embodiment, a time period may be a day, a week, a month, a year, and so on; a day is used as the time period in the examples here. The first time period may be a specific date (a particular day of a particular month and year), for example July 1; the second time period is a time period after the first, for example September 1, 2016. The active users from the start of the first time period to the end of the second time period are the users who have an operation record between the first and the second time period. For example, taking one intelligent terminal as corresponding to one user, if an operation record exists on the terminal between the first and second time periods, that user is an active user. The application can further be used as a statistical dimension: if an operation record exists in a given application on the terminal, the user is an active user. Alternatively, statistics can be kept per account, with one account corresponding to one user. Taking an instant messaging application as an example, when a user logs in to the instant messaging application on the terminal with their own account, the account produces an operation record, and the user corresponding to the account is an active user in the current time period.
A user can use a web page on the web front end to query the total number of active users between any two time periods. Note that a user who has multiple operation records between the first and second time periods is still counted as one active user. The back-end server receives the query request sent by the web front end and answers it by looking up the preprocessed intermediate tables and their sub-tables.
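The counting rule above (several operation records from the same user in the queried range still count as one active user) can be illustrated with a minimal Python sketch. It shows the direct, deduplicating way of answering the query, which is the baseline the intermediate tables are meant to avoid. The record layout, the user names and the function `count_active_users` are hypothetical and are not taken from the patent.

```python
from datetime import date

# Hypothetical operation records: (user id, day on which the operation happened).
records = [
    ("u1", date(2016, 7, 19)),
    ("u1", date(2016, 7, 20)),
    ("u1", date(2016, 7, 20)),  # a second record on the same day
    ("u2", date(2016, 7, 20)),
    ("u3", date(2016, 6, 30)),
]


def count_active_users(records, first_day, second_day):
    """Count distinct users with at least one record in [first_day, second_day]."""
    active = {user for user, day in records if first_day <= day <= second_day}
    return len(active)


# u1 has three records in the range but is counted once; u3 is outside the range.
total = count_active_users(records, date(2016, 7, 1), date(2016, 7, 20))
```

Because every query has to scan and deduplicate the raw records in the range, the cost of this approach grows with the number of records, which is the problem described in the background.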
S102: obtain the target sub-table of the target intermediate table corresponding to the second time period, where the target intermediate table contains the most recent active time period of each user up to the second time period, and the target sub-table contains the number of active users in each time period up to the second time period, as counted from the target intermediate table;
In this embodiment, the back-end server builds, in advance, an intermediate table for each time period according to each user's activity in that period (for example, whether an operation record was produced). The intermediate table contains the most recent active time period of each user up to and including that time period.
Optionally, taking a day as the time period, the model is designed with an added intermediate-table layer. The table is named fact_action_pro_tot, is partitioned by date, and has the following fields:
last_day_id   app_id   province_id   imei   day_id
The fields in the table above are:
a) day_id: date partition key (indicates which day's intermediate table the row belongs to)
b) last_day_id: the user's most recent active time period
c) app_id: application ID
d) province_id: region ID
e) imei: the handset's serial number (IMEI), which uniquely identifies the user
In an intermediate table, a user has only one record per dimension combination, and that record holds the user's latest active date. For example, the intermediate table for July 20, i.e. partition 20160720 of fact_action_pro_tot, contains:
max_day_id   app_id   province_id   imei
20160720     2        4             868563022799652
20160720     2        5             868612024366828
20160719     2        4             868612027430456
……           ……       ……            ……
For example, if the user with imei=868563022799652 was active on both the 19th and the 20th in the same dimension (an operation record exists for both days), the intermediate table for the 20th records only the latest activity, on the 20th. The intermediate table therefore stores less data, which saves storage space.
The sub-table of the intermediate table for a given time period is obtained by counting, from that intermediate table, the number of active users in each time period. For example, the table above contains the time periods July 20, 2016 and July 19, 2016, so the count for July 20, 2016 is 2 and the count for July 19, 2016 is 1.
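The counting step that turns a middle table (intermediate table) into its sublist (sub-table) can be sketched in Python as follows. Only the three rows listed in the sample partition 20160720 are used; the rows elided in the source are left out. The names `middle_table_20160720` and `build_sub_table` are hypothetical.

```python
from collections import Counter

# The three listed rows of partition 20160720:
# (last active day, app_id, province_id, imei).
middle_table_20160720 = [
    (20160720, 2, 4, "868563022799652"),
    (20160720, 2, 5, "868612024366828"),
    (20160719, 2, 4, "868612027430456"),
]


def build_sub_table(middle_rows):
    """Count how many users have each day as their most recent active day."""
    return Counter(last_day for last_day, _app, _province, _imei in middle_rows)


sub_table = build_sub_table(middle_table_20160720)
```

For these three rows the sketch gives a count of 2 for 20160720 and 1 for 20160719, matching the example in the text.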
When the total number of active users from the first to the second time period needs to be queried, the target sub-table of the target intermediate table corresponding to the second time period is obtained. Each time period corresponds to one intermediate table and one sub-table, and the sub-table for a given time period can be located by the time period identifier. As shown in Fig. 3, taking July 27, 2016 as the time period, the sub-table of that period's intermediate table contains the number of active users per day under the application and region dimensions.
S103: sum, from the target sub-table, the numbers of active users in each time period from the start of the first time period to the end of the second time period, and take the sum as the total number of active users from the start of the first time period to the end of the second time period.
In this embodiment, the target sub-table of the target intermediate table corresponding to the second time period contains the number of active users in each time period up to the second time period. For example, if the second time period is July 27, 2016 and statistics in this embodiment start on June 1, 2016, then the target sub-table contains the daily number of active users from June 1, 2016 to July 27, 2016. To compute the total number of active users from the first to the second time period, the numbers of active users in each time period within that range are summed and the sum is taken as the total. Continuing the example, if the first time period to be counted is July 10, 2016, the daily numbers of active users from July 10, 2016 to July 27, 2016 in the target sub-table for July 27, 2016 are added together.
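The summation in S103 can be sketched as below, assuming the target sublist (sub-table) is held as a mapping from day (YYYYMMDD integer) to a user count. The counts and the function name `total_active_users` are made up for illustration and do not come from the patent.

```python
def total_active_users(sub_table, first_day, second_day):
    """Sum the per-day counts from first_day to second_day, both inclusive.

    sub_table maps a day (YYYYMMDD integer) to the number of users whose
    most recent active day, as of the sub-table's date, is that day.
    """
    return sum(n for day, n in sub_table.items() if first_day <= day <= second_day)


# Hypothetical target sub-table for 20160727 (statistics start on 20160601).
target_sub_table = {
    20160601: 5,
    20160709: 3,
    20160710: 4,
    20160720: 7,
    20160727: 10,
}

# Query from July 10, 2016 to July 27, 2016: 4 + 7 + 10.
total = total_active_users(target_sub_table, 20160710, 20160727)
```

No deduplication is needed at query time, because each user appears under exactly one day in the intermediate table from which the sub-table was counted.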
Referring to Fig. 2, which is a schematic flowchart of a second embodiment of the data query method provided by an embodiment of the present invention, the data query method described in this embodiment includes the following steps:
S201: build an intermediate table for each time period, the intermediate table containing the most recent active time period of each user up to that time period;
Optionally, the intermediate table may also contain the most recent active time period of each user of each application in each geographic region up to that time period. That is, geographic region and application are also used as statistical dimensions, but within the same dimension combination each user has only one record, namely that user's most recent active period.
The sub-table of the intermediate table then contains the number of active users of each application in each geographic region in each time period up to that time period.
Optionally, building the intermediate table for each time period may include the following steps:
Step 1: for each time period, obtain the intermediate table corresponding to the adjacent time period that immediately precedes it;
Step 2: obtain the identifiers of at least one active user in the time period;
Step 3: according to the identifiers of the at least one active user and the identifiers of the users in the intermediate table of the adjacent time period, adjust the most recent active time period of each user in the intermediate table of the adjacent time period to obtain the intermediate table corresponding to the time period.
Optionally, the intermediate table of the adjacent time period is adjusted as follows. If the identifier of a user in the intermediate table of the adjacent time period matches a first identifier among the identifiers of the at least one active user, the most recent active time period of the user identified by the first identifier is updated to the time period.
If a second identifier among the identifiers of the at least one active user does not exist in the intermediate table of the adjacent time period, the second identifier is added to that table and the most recent active time period of the user it identifies is set to the time period.
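The adjustment steps above can be sketched in Python, assuming the middle table (intermediate table) of the adjacent earlier period is held as a dictionary from user identifier to most recent active day. The dictionary representation and the name `next_middle_table` are assumptions made for illustration.

```python
def next_middle_table(previous, active_today, today):
    """Derive the current period's table from the adjacent earlier period's table.

    previous     : dict mapping user id -> most recent active day
    active_today : iterable of user ids active in the current period
    today        : identifier of the current period (YYYYMMDD integer)
    """
    current = dict(previous)  # users who are not active today keep their old day
    for user in active_today:
        # Covers both cases: an existing user is updated, a new user is added.
        current[user] = today
    return current


table_0719 = {"a": 20160719, "b": 20160718}
table_0720 = next_middle_table(table_0719, ["a", "c"], 20160720)
```

In this sketch user "a" matches an existing identifier and is updated, user "c" is new and is added, and user "b" keeps the earlier day because no activity was seen in the current period.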
In this embodiment, the data of the intermediate table for each time period is generated as follows, taking a day as the time period. Yesterday's intermediate table is merged with today's data (the 20160719 data of fact_action_pro_tot and the 20160720 data of fact_action_mac_pro), the merged rows are aggregated, the maximum active date is taken, and the result is written to partition 20160720 of fact_action_pro_tot. This operation is implemented in HQL.
Data characteristics: the fact_action_pro_tot data for the 20th contains the dimension data of every imei (user), with exactly one record per imei per dimension. In other words, each day's table keeps only each user's latest record; for a user who was not active that day, the latest historical record is kept. This is the result of combining a space-for-time trade-off with a ladder (step-by-step accumulation) algorithm.
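The merge described above (union of the previous day's table with the current day's data, then taking the maximum active date per user and dimension) can be sketched with SQLite through Python's `sqlite3` module. This is an illustrative stand-in and not the patent's HQL. The column layout of fact_action_mac_pro is an assumption, and plain columns are used in place of Hive partitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fact_action_pro_tot
    (day_id INT, last_day_id INT, app_id INT, province_id INT, imei TEXT);
-- Assumed layout of the daily detail table: one row per operation record.
CREATE TABLE fact_action_mac_pro
    (day_id INT, app_id INT, province_id INT, imei TEXT);

INSERT INTO fact_action_pro_tot VALUES
    (20160719, 20160719, 2, 4, '868563022799652'),
    (20160719, 20160719, 2, 4, '868612027430456');
INSERT INTO fact_action_mac_pro VALUES
    (20160720, 2, 4, '868563022799652'),
    (20160720, 2, 4, '868563022799652'),
    (20160720, 2, 5, '868612024366828');
""")

# Merge yesterday's table with today's records; keep the latest day per key.
conn.execute("""
INSERT INTO fact_action_pro_tot
SELECT 20160720, MAX(last_day_id), app_id, province_id, imei
FROM (
    SELECT last_day_id, app_id, province_id, imei
    FROM fact_action_pro_tot WHERE day_id = 20160719
    UNION ALL
    SELECT day_id AS last_day_id, app_id, province_id, imei
    FROM fact_action_mac_pro WHERE day_id = 20160720
) AS merged
GROUP BY app_id, province_id, imei
""")

rows = conn.execute("""
SELECT last_day_id, app_id, province_id, imei
FROM fact_action_pro_tot WHERE day_id = 20160720
ORDER BY imei
""").fetchall()
```

With this sample data the 20160720 rows reproduce the three rows shown in the example partition earlier in the description: the user active on both days appears once with 20160720, and the user who was not active on the 20th keeps 20160719.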
S202: build the sub-table of each intermediate table from the intermediate table of each time period, the sub-table containing the number of active users in each time period up to that time period.
In this embodiment, a sub-table is built for the intermediate table of each time period. Each time period corresponds to one intermediate table, each intermediate table corresponds to one sub-table, and the sub-table is obtained by counting the data in its intermediate table.
Optionally, the model is designed with an added sub-table of the intermediate table, fact_action_pro_tot_sub, as shown in Fig. 3. It is partitioned by the date day_id, and its fields are:
a) day_id: date partition key
b) last_day_id: most recent active date
c) app_id: application ID
d) province_id: region ID
e) imei_num: number of active users for that day
The data are aggregated from the fact_action_pro_tot table. The result is no longer detail-level data but dimension-level data. Because within one day's data each user falls on exactly one day (the day of their latest activity), the data volume is small: only on the order of millions of rows per day.
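The aggregation from fact_action_pro_tot into the sublist (sub-table) can be sketched with SQLite through Python's `sqlite3` module. The table and column names follow the field lists in the text; the sample rows and the use of plain columns instead of Hive partitions are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fact_action_pro_tot
    (day_id INT, last_day_id INT, app_id INT, province_id INT, imei TEXT);
CREATE TABLE fact_action_pro_tot_sub
    (day_id INT, last_day_id INT, app_id INT, province_id INT, imei_num INT);

INSERT INTO fact_action_pro_tot VALUES
    (20160720, 20160720, 2, 4, '868563022799652'),
    (20160720, 20160720, 2, 5, '868612024366828'),
    (20160720, 20160719, 2, 4, '868612027430456');
""")

# One output row per (last active day, application, region): dimension level only.
conn.execute("""
INSERT INTO fact_action_pro_tot_sub
SELECT day_id, last_day_id, app_id, province_id, COUNT(*) AS imei_num
FROM fact_action_pro_tot
WHERE day_id = 20160720
GROUP BY day_id, last_day_id, app_id, province_id
""")

sub_rows = conn.execute("""
SELECT last_day_id, app_id, province_id, imei_num
FROM fact_action_pro_tot_sub
ORDER BY last_day_id, province_id
""").fetchall()
```

The sub-table holds one row per dimension combination rather than one row per user, which is why its size depends on the number of days, applications and regions instead of the number of users.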
In the web front-end database, date-suffixed tables are created for the sub-tables of the intermediate tables, for example fact_action_pro_tot_sub_20160727. The resulting data are shown in Fig. 3.
Front-end web query design: the query table is fact_action_pro_tot_sub_YYYYMMDD and the query rule is shown in Fig. 4. For any two time periods, the ranking of the total number of active users in each region can be queried for a given application.
Case 1: query condition: 20160526~20160726, application = Amigo Play (app_id=349); rank the provinces by number of active users over that range.
The SQL query is as follows (MySQL database):
SELECT province_id, SUM(imei_num) AS active_users FROM fact_action_pro_tot_sub_20160726
WHERE day_id >= 20160526 AND day_id <= 20160726 AND app_id = 349
GROUP BY province_id ORDER BY active_users DESC;
The execution time is 2 seconds, i.e. two months of active users are queried in 2 seconds.
Case 2: query condition: 20160320~20160720, application = Amigo Play (app_id=349); rank the provinces by number of active users over that range.
The SQL query is as follows (MySQL database):
SELECT province_id, SUM(imei_num) AS active_users FROM fact_action_pro_tot_sub_20160720
WHERE day_id >= 20160320 AND day_id <= 20160720 AND app_id = 349
GROUP BY province_id ORDER BY active_users DESC;
The execution time is 2 seconds, i.e. four months of active users are queried in 2 seconds.
The table queried (a sub-table of an intermediate table) is the one whose date matches the end date of the query condition: for 20160320~20160720, only the 20160720 table is queried. However long the date range is, the query is served entirely from a single table, so it takes very little time.
For the active-user query, the original deduplication algorithm, count(distinct imei), becomes an additive algorithm, sum(imei_num). The problem that active-user queries could not be preprocessed is thereby solved: the data can now be preprocessed in advance.
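The claim that the deduplicating count can be replaced by a sum over pre-computed counts can be checked on a small example. The Python sketch below compares both approaches on a hypothetical raw log; the data and all function names are made up for illustration.

```python
from collections import Counter

# Hypothetical raw log: (imei, day_id).
raw_log = [
    ("a", 20160718), ("a", 20160720),
    ("b", 20160719), ("b", 20160719),
    ("c", 20160715),
    ("d", 20160720),
]


def count_distinct(log, start, end):
    """Original approach: deduplicate over the raw records in the range."""
    return len({imei for imei, day in log if start <= day <= end})


def sub_table_for(log, end):
    """Pre-computation: latest active day per user up to `end`, counted per day."""
    last_day = {}
    for imei, day in log:
        if day <= end:
            last_day[imei] = max(day, last_day.get(imei, day))
    return Counter(last_day.values())


def sum_from_sub_table(sub_table, start, end):
    """Query-time approach: add up the pre-computed counts in the range."""
    return sum(n for day, n in sub_table.items() if start <= day <= end)


end = 20160720
sub = sub_table_for(raw_log, end)
results = [
    (start, count_distinct(raw_log, start, end), sum_from_sub_table(sub, start, end))
    for start in (20160715, 20160719, 20160720)
]
```

Both approaches agree for every start date because a user was active somewhere in the range exactly when that user's most recent active day up to the end date falls inside the range. The agreement holds only when the sub-table was built for the end date of the query, which is why the queried table must match that end date.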
S203: receive a query request, the query request being used to query the total number of active users from the start of a first time period to the end of a second time period;
S204: obtain the target sub-table of the target intermediate table corresponding to the second time period, where the target intermediate table contains the most recent active time period of each user up to the second time period, and the target sub-table contains the number of active users in each time period up to the second time period, as counted from the target intermediate table;
S205: sum, from the target sub-table, the numbers of active users in each time period from the start of the first time period to the end of the second time period, and take the sum as the total number of active users from the start of the first time period to the end of the second time period.
Steps S203-S205 of this embodiment are the same as steps S101-S103 of the embodiment of Fig. 1 and are not described again here.
Referring to Fig. 5, which is a schematic structural diagram of a server provided by an embodiment of the present invention, the server may be a back-end server. The server described in this embodiment includes a receiving unit 10, a first acquisition unit 11 and a statistics unit 12;
Receiving unit 10 is configured to receive a query request, the query request being used to query the total number of active users from the start of a first time period to the end of a second time period, in the same way as described for step S101;
First acquisition unit 11 is configured to obtain the target sub-table of the target intermediate table corresponding to the second time period, where the target intermediate table contains the most recent active time period of each user up to the second time period, and the target sub-table contains the number of active users in each time period up to the second time period, as counted from the target intermediate table; the intermediate table and sub-table are structured as described for step S102;
In the embodiment of the present invention, background server (compares previously according to the situation of enlivening of each user in each time cycle As produce operation note), build each time cycle corresponding middle table, in the middle table comprising end the time cycle with Before (comprising the time cycle) each user the nearest active time cycle.
Concrete optional, illustrated as day with the time cycle here, designed a model, increase by one layer of middle table, table name is Fact_action_pro_tot, is analyzed by date, and field is:
last_day_id app_id province_id imei Day_id
Each explanation of field in above table:
a)Day_id:Date subregion key (represents specifically which day middle table)
b)last_day_id:User's last active time cycle
c)App_id:Application id
d)province_id:Regional id
e)Imei:Mobile phone string number, unique ID of identifying user
In a middle table, there was only a record with a user under dimension, be that only to record the user newest The active date, such as in the July middle table of No. 20:Number inside 20160720 subregions of fact_action_pro_tot According to having:
last_day_id  app_id  province_id  imei
20160720     2       4            868563022799652
20160720     2       5            868612024366828
20160719     2       4            868612027430456
……           ……      ……           ……
For example, if the user with imei=868563022799652 was active (had operation records) under the same dimension on both the 19th and the 20th, the intermediate table for the 20th records only the latest active date, the 20th. The intermediate table therefore stores less data, which saves storage space.
The sub-table of the intermediate table for a given time period is obtained by counting, within that intermediate table, the users recorded under each most recent active time period. For example, the table above contains the time periods July 20, 2016 and July 19, 2016, so the count gives 2 active users for July 20, 2016 and 1 active user for July 19, 2016.
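The counting step can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patented implementation: the rows are the three sample records shown above, and build_sub_table is a hypothetical helper that groups by the last active day only (grouping by application and region as well would work the same way).

```python
from collections import Counter

# Sample rows of the intermediate table for 20160720:
# (last_day_id, app_id, province_id, imei)
intermediate_rows = [
    (20160720, 2, 4, "868563022799652"),
    (20160720, 2, 5, "868612024366828"),
    (20160719, 2, 4, "868612027430456"),
]

def build_sub_table(rows):
    """Count users per most recent active day (hypothetical helper)."""
    return Counter(last_day_id for last_day_id, _app, _province, _imei in rows)

sub_table = build_sub_table(intermediate_rows)
print(dict(sub_table))
```

Because each user appears only once per dimension in the intermediate table, the per-day counts never count the same user twice.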
When the total number of active users from the first time period to the second time period needs to be queried, the target sub-table of the target intermediate table corresponding to the second time period is obtained. Each time period corresponds to one intermediate table and one sub-table, so the sub-table for a given time period can be retrieved by its time period identifier. As shown in Fig. 3, taking July 27, 2016 as the time period, the sub-table of that period's intermediate table contains the daily numbers of active users under the application and region dimensions.
Statistics unit 12, configured to compute, from the target sub-table, the sum of the numbers of active users in each time period from the start of the first time period to the end of the second time period, and to determine the sum as the total number of active users from the start of the first time period to the end of the second time period.
In this embodiment of the present invention, the target sub-table of the target intermediate table corresponding to the second time period contains the number of active users in each time period up to and including the second time period. For example, if the second time period is July 27, 2016 and statistics in this embodiment start on June 1, 2016, the target sub-table contains the daily numbers of active users from June 1, 2016 to July 27, 2016. To compute the total number of active users from the first time period to the second time period, the numbers of active users in each time period within that range are summed, and the sum is the total. Continuing the example, if the first time period is July 10, 2016, the daily numbers of active users from July 10, 2016 to July 27, 2016 in the target sub-table for July 27, 2016 are added together.
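The summation over the target sub-table can be sketched as below. The function name total_active_users and the numbers in the sample sub-table are invented for illustration; days are written as YYYYMMDD integers, which compare in chronological order.

```python
def total_active_users(sub_table, first_day, second_day):
    """Sum the per-day counts whose day lies in [first_day, second_day]."""
    return sum(count for day, count in sub_table.items()
               if first_day <= day <= second_day)

# Hypothetical sub-table for 20160727: day -> number of users whose most
# recent active day, as of 20160727, is that day.
sub_table_20160727 = {
    20160601: 5,
    20160709: 3,
    20160710: 4,
    20160720: 7,
    20160727: 9,
}

# First time period 20160710, second time period 20160727.
print(total_active_users(sub_table_20160727, 20160710, 20160727))  # 20
```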
Optionally, as shown in Fig. 5, the server of this embodiment of the present invention further includes a first construction unit 13 and a second construction unit 14;
First construction unit 13, configured to build the intermediate table corresponding to each time period, where the intermediate table contains the most recent active time period of each user up to and including that time period;
Specifically and optionally, as shown in Fig. 6, the first construction unit 13 includes a second acquisition unit 130, a third acquisition unit 131, and an adjustment unit 132;
Second acquisition unit 130, configured to obtain, for each time period, the intermediate table corresponding to the adjacent time period that immediately precedes that time period;
Third acquisition unit 131, configured to obtain the identifier of at least one active user in the time period;
Adjustment unit 132, configured to adjust, according to the identifier of the at least one active user and the identifiers of the active users in the intermediate table corresponding to the adjacent time period, the most recent active time period of each user in that intermediate table, to obtain the intermediate table corresponding to the time period.
Further optionally, as shown in Fig. 7, the adjustment unit 132 may include an updating unit 1320 and an adding unit 1321;
Updating unit 1320, configured to, if the identifier of an active user in the intermediate table corresponding to the adjacent time period matches a first identifier among the identifiers of the at least one active user, update the most recent active time period of the active user identified by the first identifier in that intermediate table to the time period;
Adding unit 1321, configured to, if a second identifier among the identifiers of the at least one active user does not exist in the intermediate table corresponding to the adjacent time period, add the second identifier to that intermediate table and set the most recent active time period of the active user identified by the second identifier to the time period.
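The update and add behaviour of the adjustment unit can be sketched as follows. This is a simplified illustration that assumes a single dimension and represents an intermediate table as a Python dict; the function name adjust_intermediate_table is hypothetical.

```python
def adjust_intermediate_table(previous_table, active_ids, period):
    """Return the intermediate table for `period`.

    previous_table: dict mapping user identifier -> most recent active period
                    (the table of the adjacent, preceding period).
    active_ids:     identifiers of the users active in `period`.
    """
    current = dict(previous_table)  # leave the preceding table unchanged
    for user_id in active_ids:
        # Identifier already present: update its most recent active period.
        # Identifier absent: add a record with this period.
        # With a dict, both cases are the same assignment.
        current[user_id] = period
    return current

table_0719 = {"868563022799652": 20160719, "868612027430456": 20160719}
active_0720 = {"868563022799652", "868612024366828"}
table_0720 = adjust_intermediate_table(table_0719, active_0720, 20160720)
print(table_0720)
```

Users who were not active in the new period keep their earlier record, which is why the table for any day still covers every user seen so far.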
Optionally, the intermediate table contains the most recent active time period of each user who used each application in each geographic region, up to and including the time period;
The sub-table of the intermediate table contains the number of active users who used each application in each geographic region in each time period up to and including the time period.
In this embodiment of the present invention, the data of the intermediate table for each time period is generated as follows, taking a day as the time period. The intermediate table of the previous day is merged with the current day's data (the 20160719 data is taken from table fact_action_pro_tot and the 20160720 data from table fact_action_mac_pro), the merged rows are aggregated with the active date taking the maximum value, and the result is written to partition 20160720 of fact_action_pro_tot. The merge is implemented in HQL.
Data characteristics: the 20160720 partition of fact_action_pro_tot contains the dimension data of all IMEIs (users), with only one record per IMEI under each dimension. In other words, the table for a given day keeps only each user's latest record; for a user who was not active on that day, the latest historical record is kept. This results from combining the idea of trading space for time with a ladder algorithm.
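The merge described above can be sketched with Python's sqlite3 module standing in for Hive. The column layout of fact_action_mac_pro is an assumption (the text only names the table), the sample rows are illustrative, and the SQL is a plausible formulation rather than the HQL used in the embodiment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fact_action_pro_tot
  (day_id INT, last_day_id INT, app_id INT, province_id INT, imei TEXT);
-- Assumed schema: one row per operation record of the day.
CREATE TABLE fact_action_mac_pro
  (day_id INT, app_id INT, province_id INT, imei TEXT);
INSERT INTO fact_action_pro_tot VALUES
  (20160719, 20160719, 2, 4, '868563022799652'),
  (20160719, 20160719, 2, 4, '868612027430456');
INSERT INTO fact_action_mac_pro VALUES
  (20160720, 2, 4, '868563022799652'),
  (20160720, 2, 4, '868563022799652'),
  (20160720, 2, 5, '868612024366828');
""")

# Merge yesterday's partition with today's activity and keep, per user and
# dimension, the largest (latest) active date.
conn.execute("""
INSERT INTO fact_action_pro_tot
SELECT 20160720, MAX(last_day_id), app_id, province_id, imei
FROM (
  SELECT last_day_id, app_id, province_id, imei
  FROM fact_action_pro_tot WHERE day_id = 20160719
  UNION ALL
  SELECT day_id, app_id, province_id, imei
  FROM fact_action_mac_pro WHERE day_id = 20160720
)
GROUP BY app_id, province_id, imei
""")

rows = conn.execute("""
SELECT last_day_id, app_id, province_id, imei
FROM fact_action_pro_tot WHERE day_id = 20160720
ORDER BY imei
""").fetchall()
for row in rows:
    print(row)
```

Grouping by the dimensions and the IMEI collapses repeated operation records of the same user into a single row for the new partition.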
Second construction unit 14, configured to build the sub-table of each intermediate table according to the intermediate table corresponding to each time period, where the sub-table contains the number of active users in each time period up to and including the time period.
In this embodiment of the present invention, a sub-table is built for the intermediate table of each time period. Each time period corresponds to one intermediate table, each intermediate table corresponds to one sub-table, and the sub-table is obtained by aggregating the data in its intermediate table.
Specifically and optionally, the sub-table fact_action_pro_tot_sub of the intermediate table is added to the data model, as shown in Fig. 3. It is partitioned by the date day_id, and its fields are as follows:
a) day_id: date partition key
b) last_day_id: most recent active date
c) app_id: application ID
d) province_id: region ID
e) imei_num: number of active users on that day
The data is produced by aggregating the fact_action_pro_tot table. The result is no longer at detail (per-user) granularity but only at dimension granularity. Because, within one day's data, each user falls on exactly one day (the user's most recent active day), the data volume is small: only on the order of millions of rows per day.
A date-suffixed table (the sub-table of the intermediate table) is designed in the web front-end database, for example fact_action_pro_tot_sub_20160727; the generated result data is shown in Fig. 3.
Front-end web query design: the queried table is fact_action_pro_tot_sub_YYYYMMDD, and the query rule is shown in Fig. 4. For any two time periods and a given application, the ranking of the regions by total number of active users can be queried.
Case 1: Query condition: for the period 20160526~20160726 and application = Amigo Play (app_id=349), rank the provinces by number of active users during the period;
The SQL query is as follows (MySQL database):
SELECT province_id, SUM(imei_num) AS active_users
FROM fact_action_pro_tot_sub_20160726
WHERE day_id >= 20160526 AND day_id <= 20160726 AND app_id = 349
GROUP BY province_id ORDER BY active_users DESC;
The execution time is 2 seconds; that is, two months of active users were queried in 2 seconds.
Case 2: Query condition: for the period 20160320~20160720 and application = Amigo Play (app_id=349), rank the provinces by number of active users during the period.
The SQL query is as follows (MySQL database):
SELECT province_id, SUM(imei_num) AS active_users
FROM fact_action_pro_tot_sub_20160720
WHERE day_id >= 20160320 AND day_id <= 20160720 AND app_id = 349
GROUP BY province_id ORDER BY active_users DESC;
The execution time is 2 seconds; that is, four months of active users were queried in 2 seconds.
The queried table (the sub-table of the intermediate table) is the one whose date matches the end date of the query condition: for the period 20160320~20160720, table 20160720 is queried. However long the date range is, the query is served entirely from a single table, so the time required is very short.
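Selecting the table by the end date and running the ranking query can be sketched as below, with sqlite3 standing in for the MySQL front-end database. The helper names sub_table_name, yyyymmdd and province_ranking are hypothetical, the sample numbers are invented, and the filter on day_id follows the queries in the cases above.

```python
import sqlite3
from datetime import date

def yyyymmdd(day: date) -> int:
    """Convert a date to the integer form used by day_id, e.g. 20160726."""
    return int(day.strftime("%Y%m%d"))

def sub_table_name(end_day: date) -> str:
    """Name of the sub-table whose date suffix equals the query's end date."""
    return "fact_action_pro_tot_sub_" + end_day.strftime("%Y%m%d")

def province_ranking(conn, start_day: date, end_day: date, app_id: int):
    table = sub_table_name(end_day)  # derived from a date, not from user text
    sql = (
        f"SELECT province_id, SUM(imei_num) AS active_users FROM {table} "
        "WHERE day_id >= ? AND day_id <= ? AND app_id = ? "
        "GROUP BY province_id ORDER BY active_users DESC"
    )
    params = (yyyymmdd(start_day), yyyymmdd(end_day), app_id)
    return conn.execute(sql, params).fetchall()

# Tiny in-memory stand-in for the front-end database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_action_pro_tot_sub_20160726 "
             "(day_id INT, app_id INT, province_id INT, imei_num INT)")
conn.executemany(
    "INSERT INTO fact_action_pro_tot_sub_20160726 VALUES (?, ?, ?, ?)",
    [
        (20160525, 349, 4, 100),  # before the range, ignored
        (20160601, 349, 4, 10),
        (20160726, 349, 4, 5),
        (20160726, 349, 5, 40),
        (20160726, 2, 5, 99),     # other application, ignored
    ],
)
ranking = province_ranking(conn, date(2016, 5, 26), date(2016, 7, 26), 349)
print(ranking)
```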
For querying the number of active users, the original deduplication computation count(distinct imei) becomes an additive computation, sum(imei_num). The problem that active-user queries could not be pre-processed is thereby solved, because the data can now be pre-processed in advance.
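Why the additive computation gives the same answer as deduplication can be checked on a small example. The log below is invented and the function names are hypothetical; the point is that a user was active somewhere in the range exactly when the user's most recent active day, taken as of the end of the range, falls inside the range.

```python
# Hypothetical raw operation log: (day, imei); a user may appear many times.
log = [
    (20160717, "D"),
    (20160718, "A"), (20160719, "A"), (20160719, "B"),
    (20160720, "A"), (20160720, "A"), (20160720, "C"),
]

def distinct_count(log, first_day, second_day):
    """Original approach: deduplicate users over the raw log in the range."""
    return len({imei for day, imei in log if first_day <= day <= second_day})

def summed_count(log, first_day, second_day):
    """Pre-processed approach: most recent active day per user as of
    second_day, counted per day, then summed over the range."""
    last_day = {}
    for day, imei in log:
        if day <= second_day:
            last_day[imei] = max(day, last_day.get(imei, day))
    per_day = {}
    for day in last_day.values():
        per_day[day] = per_day.get(day, 0) + 1
    return sum(n for day, n in per_day.items()
               if first_day <= day <= second_day)

for first, second in [(20160719, 20160720), (20160718, 20160719),
                      (20160717, 20160720)]:
    print(first, second,
          distinct_count(log, first, second),
          summed_count(log, first, second))
```

The equality relies on using the table built as of the end of the range, which is why the query always reads the sub-table whose date matches the end date.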
In this embodiment of the present invention, a query request is received, the query request being used to query the total number of active users from the start of a first time period to the end of a second time period. The target sub-table of the target intermediate table corresponding to the second time period is obtained, where the target intermediate table contains the most recent active time period of each user up to and including the second time period, and the target sub-table contains the number of active users in each time period up to and including the second time period, obtained by counting over the target intermediate table. The sum of the numbers of active users in each time period from the start of the first time period to the end of the second time period is computed from the target sub-table and determined as the total number of active users for that range. By building the intermediate tables and their sub-tables in advance, the present invention allows the total number of active users to be queried quickly and saves storage cost.
Refer to Fig. 8, which is a schematic structural diagram of another server provided in an embodiment of the present invention; the server can be applied as a background server. The server described in this embodiment includes: at least one input device 1000; at least one output device 2000; at least one processor 3000, such as a CPU; and a memory 4000. The input device 1000, output device 2000, processor 3000, and memory 4000 are connected by a bus 5000.
The input device 1000 may specifically be a receiver of the server and can receive the query request sent by the web front end;
The output device 2000 may specifically be a transmitter of the server, used to output the query result to the web front end.
The memory 4000 may be a high-speed RAM or a non-volatile memory, such as a disk memory. The memory 4000 is used to store a set of program code, and the input device 1000, output device 2000, and processor 3000 are used to call the program code stored in the memory 4000 to perform the following operations:
The input device 1000 is configured to receive a query request, the query request being used to query the total number of active users from the start of a first time period to the end of a second time period;
The processor 3000 is configured to obtain the target sub-table of the target intermediate table corresponding to the second time period, where the target intermediate table contains the most recent active time period of each user up to and including the second time period, and the target sub-table contains the number of active users in each time period up to and including the second time period, obtained by counting over the target intermediate table;
The processor 3000 is further configured to compute, from the target sub-table, the sum of the numbers of active users in each time period from the start of the first time period to the end of the second time period, and to determine the sum as the total number of active users from the start of the first time period to the end of the second time period.
The output device 2000 is configured to output the queried total number of active users from the start of the first time period to the end of the second time period.
Optionally, the processor 3000 is further configured to build the intermediate table corresponding to each time period, where the intermediate table contains the most recent active time period of each user up to and including that time period;
The processor 3000 is further configured to build the sub-table of each intermediate table according to the intermediate table corresponding to each time period, where the sub-table contains the number of active users in each time period up to and including that time period.
Specifically and optionally, the processor 3000 is further configured to obtain, for each time period, the intermediate table corresponding to the adjacent time period that immediately precedes that time period;
obtain the identifier of at least one active user in the time period;
and adjust, according to the identifier of the at least one active user and the identifiers of the active users in the intermediate table corresponding to the adjacent time period, the most recent active time period of each user in that intermediate table, to obtain the intermediate table corresponding to the time period.
Optionally, the adjusting by the processor 3000, according to the identifier of the at least one active user and the identifiers of the active users in the intermediate table corresponding to the adjacent time period, of the most recent active time period of each user in that intermediate table, to obtain the intermediate table corresponding to the time period, specifically includes:
if the identifier of an active user in the intermediate table corresponding to the adjacent time period matches a first identifier among the identifiers of the at least one active user, updating the most recent active time period of the active user identified by the first identifier in that intermediate table to the time period;
if a second identifier among the identifiers of the at least one active user does not exist in the intermediate table corresponding to the adjacent time period, adding the second identifier to that intermediate table and setting the most recent active time period of the active user identified by the second identifier to the time period.
Optionally, the intermediate table contains the most recent active time period of each user who used each application in each geographic region, up to and including the time period;
The sub-table of the intermediate table contains the number of active users who used each application in each geographic region in each time period up to and including the time period.
In this embodiment of the present invention, a query request is received, the query request being used to query the total number of active users from the start of a first time period to the end of a second time period. The target sub-table of the target intermediate table corresponding to the second time period is obtained, where the target intermediate table contains the most recent active time period of each user up to and including the second time period, and the target sub-table contains the number of active users in each time period up to and including the second time period, obtained by counting over the target intermediate table. The sum of the numbers of active users in each time period from the start of the first time period to the end of the second time period is computed from the target sub-table and determined as the total number of active users for that range. By building the intermediate tables and their sub-tables in advance, the present invention allows the total number of active users to be queried quickly and saves storage cost.
The units in all embodiments of the present invention may be implemented by a general-purpose integrated circuit, such as a CPU (Central Processing Unit), or by an ASIC (Application Specific Integrated Circuit).
The steps of the methods in the embodiments of the present invention may be reordered, merged, or deleted according to actual needs.
The units of the devices in the embodiments of the present invention may be merged, divided, or deleted according to actual needs.
A person of ordinary skill in the art will understand that all or part of the processes of the above method embodiments can be implemented by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), or the like.
What is disclosed above is merely a preferred embodiment of the present invention and certainly cannot limit the scope of the claims of the present invention; equivalent variations made according to the claims of the present invention therefore still fall within the scope of the present invention.

Claims (10)

1. A data query method, characterized by comprising:
receiving a query request, the query request being used to query the total number of active users from the start of a first time period to the end of a second time period;
obtaining a target sub-table of a target intermediate table corresponding to the second time period, where the target intermediate table contains the most recent active time period of each user up to and including the second time period, and the target sub-table contains the number of active users in each time period up to and including the second time period, obtained by counting over the target intermediate table;
computing, from the target sub-table, the sum of the numbers of active users in each time period from the start of the first time period to the end of the second time period, and determining the sum as the total number of active users from the start of the first time period to the end of the second time period.
2. the method for claim 1, it is characterised in that before the reception inquiry request, also include:
Each time cycle corresponding middle table is built, the middle table includes each user before the cut-off time cycle The nearest active time cycle;
According to each time cycle described corresponding middle table, the sublist of each middle table is built, the sublist includes cuts Only before the time cycle in each time cycle any active ues quantity.
3. The method of claim 2, characterized in that the building of the intermediate table corresponding to each time period comprises:
for each time period, obtaining the intermediate table corresponding to the adjacent time period that immediately precedes the time period;
obtaining the identifier of at least one active user in the time period;
adjusting, according to the identifier of the at least one active user and the identifiers of the active users in the intermediate table corresponding to the adjacent time period, the most recent active time period of each user in that intermediate table, to obtain the intermediate table corresponding to the time period.
4. The method of claim 3, characterized in that the adjusting, according to the identifier of the at least one active user and the identifiers of the active users in the intermediate table corresponding to the adjacent time period, of the most recent active time period of each user in that intermediate table, to obtain the intermediate table corresponding to the time period, comprises:
if the identifier of an active user in the intermediate table corresponding to the adjacent time period matches a first identifier among the identifiers of the at least one active user, updating the most recent active time period of the active user identified by the first identifier in that intermediate table to the time period;
if a second identifier among the identifiers of the at least one active user does not exist in the intermediate table corresponding to the adjacent time period, adding the second identifier to that intermediate table and setting the most recent active time period of the active user identified by the second identifier to the time period.
5. The method of any one of claims 2 to 4, characterized in that the intermediate table contains the most recent active time period of each user who used each application in each geographic region, up to and including the time period;
the sub-table of the intermediate table contains the number of active users who used each application in each geographic region in each time period up to and including the time period.
6. A server, characterized by comprising:
a receiving unit, configured to receive a query request, the query request being used to query the total number of active users from the start of a first time period to the end of a second time period;
a first acquisition unit, configured to obtain a target sub-table of a target intermediate table corresponding to the second time period, where the target intermediate table contains the most recent active time period of each user up to and including the second time period, and the target sub-table contains the number of active users in each time period up to and including the second time period, obtained by counting over the target intermediate table;
a statistics unit, configured to compute, from the target sub-table, the sum of the numbers of active users in each time period from the start of the first time period to the end of the second time period, and to determine the sum as the total number of active users from the start of the first time period to the end of the second time period.
7. The server of claim 6, characterized in that the server further comprises:
a first construction unit, configured to build an intermediate table corresponding to each time period, the intermediate table containing the most recent active time period of each user up to and including that time period;
a second construction unit, configured to build a sub-table of each intermediate table according to the intermediate table corresponding to each time period, the sub-table containing the number of active users in each time period up to and including that time period.
8. The server of claim 7, characterized in that the first construction unit comprises:
a second acquisition unit, configured to obtain, for each time period, the intermediate table corresponding to the adjacent time period that immediately precedes the time period;
a third acquisition unit, configured to obtain the identifier of at least one active user in the time period;
an adjustment unit, configured to adjust, according to the identifier of the at least one active user and the identifiers of the active users in the intermediate table corresponding to the adjacent time period, the most recent active time period of each user in that intermediate table, to obtain the intermediate table corresponding to the time period.
9. The server of claim 8, characterized in that the adjustment unit comprises:
an updating unit, configured to, if the identifier of an active user in the intermediate table corresponding to the adjacent time period matches a first identifier among the identifiers of the at least one active user, update the most recent active time period of the active user identified by the first identifier in that intermediate table to the time period;
an adding unit, configured to, if a second identifier among the identifiers of the at least one active user does not exist in the intermediate table corresponding to the adjacent time period, add the second identifier to that intermediate table and set the most recent active time period of the active user identified by the second identifier to the time period.
10. The server of any one of claims 7 to 9, characterized in that the intermediate table contains the most recent active time period of each user who used each application in each geographic region, up to and including the time period;
the sub-table of the intermediate table contains the number of active users who used each application in each geographic region in each time period up to and including the time period.
CN201610850912.7A 2016-09-26 2016-09-26 A kind of data query method and server Withdrawn CN106503054A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610850912.7A CN106503054A (en) 2016-09-26 2016-09-26 A kind of data query method and server

Publications (1)

Publication Number Publication Date
CN106503054A true CN106503054A (en) 2017-03-15

Family

ID=58290583

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610850912.7A Withdrawn CN106503054A (en) 2016-09-26 2016-09-26 A kind of data query method and server

Country Status (1)

Country Link
CN (1) CN106503054A (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20170315
