CN106326220B - Data storage method and device - Google Patents

Info

Publication number
CN106326220B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510333071.8A
Other languages
Chinese (zh)
Other versions
CN106326220A (en)
Inventor
窦方钰
冯凯
陈锣斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced New Technologies Co Ltd
Advantageous New Technologies Co Ltd
Original Assignee
Alibaba Group Holding Ltd
Application filed by Alibaba Group Holding Ltd
Priority to CN201510333071.8A
Publication of CN106326220A
Application granted
Publication of CN106326220B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2379 Updates performed during online database operations; commit processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Debugging And Monitoring (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This application discloses a data storage method and device. The method includes: after user behavior data of the same dimension is summarized by day and by hour, storing the daily summary data and the hourly summary data in a first storage structure, wherein the first storage structure associates each item of hourly summary data with the daily summary data of the corresponding time; and after user behavior data of the same dimension as the daily and hourly summary data is summarized by minute and by second, storing the per-minute summary data and the per-second summary data in a second storage structure, wherein the second storage structure associates each item of per-second summary data with the per-minute summary data of the corresponding time. The application optimizes the storage structure for user behavior data, so that querying, reading, or counting the accumulated user behavior data requires fewer database accesses, which improves the storage and read performance of the database and the response speed.

Description

Data storage method and device
Technical field
This application relates to the computer field, and in particular to a data storage method and device.
Background
In many application scenarios, user behavior data is accumulated and stored. This data reflects a user's historical operations and can be used to analyze user behavior (for example, counting how many times a user logged in to a website within one hour, or how much money was paid from one IP address in one day) and thereby to provide better services (for example, judging whether a user's operation carries business risk).
At present, this need is met mainly by storing a detailed transaction log or by storing accumulated totals. However, these approaches place high performance demands on the database server and the network, and they also respond slowly.
Summary of the invention
The purpose of this application is to provide a data storage method and device that optimize the storage and read performance of a database while keeping the precision of the stored data unchanged.
To achieve the above purpose, one embodiment of the application provides a data storage method, the method including:
After user behavior data of the same dimension is summarized by day and by hour, storing the daily summary data and the hourly summary data in a first storage structure, wherein the first storage structure associates each item of hourly summary data with the daily summary data of the corresponding time;
After user behavior data of the same dimension as the daily and hourly summary data is summarized by minute and by second, storing the per-minute summary data and the per-second summary data in a second storage structure, wherein the second storage structure associates each item of per-second summary data with the per-minute summary data of the corresponding time.
As a further improvement of one embodiment of the application, the first storage structure includes multiple data strings, each data string consists of time windows, and the time windows of each data string respectively store one item of daily summary data and one or more items of hourly summary data corresponding to the time of that daily summary data;
The second storage structure includes multiple data strings, each data string consists of multiple time windows, and the time windows of each data string respectively store one item of per-minute summary data and one or more items of per-second summary data corresponding to the time of that per-minute summary data.
As a further improvement of one embodiment of the application,
The first storage structure is a first storage table, whose columns/rows include one column/row of daily summary data and multiple columns/rows of hourly summary data corresponding to the time of the daily summary data;
The second storage structure is a second storage table, whose columns/rows include one column/row of per-minute summary data and multiple columns/rows of per-second summary data corresponding to the time of the per-minute summary data.
As a further improvement of one embodiment of the application, the method also includes:
Configuring a unique timestamp for each data string.
As a further improvement of one embodiment of the application, the method also includes:
If the stored values of all time windows in a column/row are 0, not storing that column/row.
As a further improvement of one embodiment of the application, the method also includes:
When new user behavior data is obtained, synchronously updating the per-second, per-minute, hourly, and daily summary data that match the current time.
To achieve the above purpose, one embodiment of the application provides a data storage device, the device including:
A data storage format module, configured to form the first storage structure by associating each item of hourly summary data with the daily summary data of the corresponding time, and to form the second storage structure by associating each item of per-second summary data with the per-minute summary data of the corresponding time;
A data logic storage module, configured to store the daily summary data and the hourly summary data in the first storage structure after user behavior data of the same dimension is summarized by day and by hour, and to store the per-minute summary data and the per-second summary data in the second storage structure after user behavior data of the same dimension as the daily and hourly summary data is summarized by minute and by second.
As a further improvement of one embodiment of the application, the first storage structure includes multiple data strings, each data string consists of time windows, and the time windows of each data string can respectively store one item of daily summary data and one or more items of hourly summary data corresponding to the time of that daily summary data;
The second storage structure includes multiple data strings, each data string consists of time windows, and the time windows of each data string respectively store one item of per-minute summary data and one or more items of per-second summary data corresponding to the time of that per-minute summary data.
As a further improvement of one embodiment of the application,
The first storage structure is a first storage table, whose columns/rows include one column/row of daily summary data and multiple columns/rows of hourly summary data corresponding to the time of the daily summary data;
The second storage structure is a second storage table, whose columns/rows include one column/row of per-minute summary data and multiple columns/rows of per-second summary data corresponding to the time of the per-minute summary data.
As a further improvement of one embodiment of the application, the device also includes:
A marking module, configured to assign a unique timestamp to each data string.
As a further improvement of one embodiment of the application, the data logic storage module is also configured so that:
If the stored values of all time windows in a column/row are 0, that column/row is not stored.
As a further improvement of one embodiment of the application, the device also includes:
An update module, configured to drive, when new user behavior data is obtained, the first logic storage module and the second logic storage module to synchronously update the per-second, per-minute, hourly, and daily summary data that match the current time.
Compared with the prior art, the technical effect of the data storage method and device of this application is that the storage structure for user behavior data is optimized, so that querying, reading, or counting the accumulated user behavior data requires fewer database accesses, which improves the storage and read performance of the database and the response speed.
Brief description of the drawings
Fig. 1 is a flowchart of the data storage method in one embodiment of the application;
Fig. 2a is a schematic diagram of the first storage structure in one embodiment of the application;
Fig. 2b is a schematic diagram of the second storage structure in one embodiment of the application;
Fig. 3 is a module diagram of the data storage device in one embodiment of the application.
Detailed description
The application is described in detail below with reference to the specific embodiments shown in the drawings. These embodiments do not limit the application; structural, methodological, or functional changes made by those of ordinary skill in the art according to these embodiments all fall within the protection scope of the application.
The statistical analysis of user behavior data usually changes as the business changes: the dimensions counted, the amount of data counted, and so on may be added or modified at any time. A schema-less database (such as an HBase database) is therefore commonly used to store the accumulated user behavior data.
As shown in Fig. 1, in one embodiment of the application, the data storage method includes:
S100: after user behavior data of the same dimension is summarized by day and by hour, storing the daily summary data and the hourly summary data in the first storage structure, wherein the first storage structure associates each item of hourly summary data with the daily summary data of the corresponding time;
S200: after user behavior data of the same dimension as the daily and hourly summary data is summarized by minute and by second, storing the per-minute summary data and the per-second summary data in the second storage structure, wherein the second storage structure associates each item of per-second summary data with the per-minute summary data of the corresponding time.
The "same dimension" in this embodiment means that the user behavior data to be accumulated represents the same thing. For example, the dimension of the accumulated user behavior data may be the number of website logins in a period of time, or the payment amount in a period of time. The payment amount in a period of time is used below as the example dimension to describe the technical solution of the application in detail.
In this embodiment, the accumulated user behavior data can be stored in a user behavior database (such as an HBase database) that includes the first storage structure and the second storage structure.
The daily summary data and the hourly summary data are stored in the first storage structure. The per-minute summary data and the per-second summary data are stored in the second storage structure.
Further, in this embodiment, the first storage structure includes multiple data strings, each data string consists of time windows, and the time windows of each data string can respectively store one item of daily summary data and one or more items of hourly summary data corresponding to the time of that daily summary data;
The second storage structure includes multiple data strings, each data string consists of multiple time windows, and the time windows of each data string can respectively store one item of per-minute summary data and one or more items of per-second summary data corresponding to the time of that per-minute summary data.
Referring to Fig. 2a, the first storage structure includes 3 data strings, each consisting of 25 time windows: D, H0, H1, ..., H23. The time window D stores the daily summary data; H0, H1, ..., H23 represent the 24 hours of that day (from hour 0 to hour 23) and store the corresponding hourly summary data. Within one data string, the sum of the values stored in the time windows H0, H1, ..., H23 equals the value stored in the time window D.
Referring to Fig. 2b, the second storage structure includes 3 data strings, each consisting of 61 time windows: M, S0, S1, ..., S59. The time window M stores the per-minute summary data; S0, S1, ..., S59 represent the 60 seconds of that minute (from second 0 to second 59) and store the corresponding per-second summary data. Within one data string, the sum of the values stored in the time windows S0, S1, ..., S59 equals the value stored in the time window M.
Further, the method also includes: configuring a unique timestamp for each data string.
In this embodiment, the first storage structure can be a first storage table whose columns include one column of daily summary data and multiple columns of hourly summary data corresponding to the time of the daily summary data. Referring to Fig. 2a, the first storage table can be keyed by rowkey (such as id information); the first column can hold the daily summary data and the following 24 columns the hourly summary data. Alternatively, columns 1 to 24 can hold the hourly summary data and column 25 the daily summary data; those skilled in the art can change this order by customary means. In addition, the unique timestamp (abbreviated ts) configured for each data string can be the date of that data string, identifying which day's user behavior data the data string holds.
The second storage structure can be a second storage table whose columns include one column of per-minute summary data and multiple columns of per-second summary data corresponding to the time of the per-minute summary data. Referring to Fig. 2b, the second storage table can be keyed by rowkey (such as id information) plus hour (the time at which the user behavior data occurred, accurate to the hour); the first column holds the per-minute summary data and the following 60 columns the per-second summary data. Alternatively, columns 1 to 60 can hold the per-second summary data and column 61 the per-minute summary data; those skilled in the art can change this order by customary means. In addition, the unique timestamp (abbreviated ts) configured for each data string can be the minute of that data string, identifying which minute's user behavior data the data string holds.
Of course, in this embodiment, the columns described above can also be replaced with rows to realize an essentially identical first storage structure and second storage structure:
The first storage structure is a first storage table whose rows include one row of daily summary data and multiple rows of hourly summary data corresponding to the time of the daily summary data;
The second storage structure is a second storage table whose rows include one row of per-minute summary data and multiple rows of per-second summary data corresponding to the time of the per-minute summary data.
The details can be derived unambiguously from Fig. 2a, Fig. 2b, and the above description of the columns of the first and second storage tables, and are not repeated here.
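The keying scheme described for the two tables can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the string formats simply mirror the examples in this description (a rowkey such as 2088xx1_20150101-02 and timestamps such as 20150101 and 20150101 02:32); the patent does not prescribe a concrete encoding.

```python
from datetime import datetime

def first_table_key(user_id, when):
    """(rowkey, ts) for the daily/hourly table: rowkey is the id, ts is the day."""
    return user_id, when.strftime("%Y%m%d")

def second_table_key(user_id, when):
    """(rowkey, ts) for the minute/second table: rowkey is id + hour, ts is the minute."""
    rowkey = f"{user_id}_{when.strftime('%Y%m%d-%H')}"
    return rowkey, when.strftime("%Y%m%d %H:%M")

def window_columns(when):
    """Names of the hour window and second window that an event falls into."""
    return f"H{when.hour}", f"S{when.second}"

# Hypothetical event time taken from the example payments in this description.
event_time = datetime(2015, 1, 1, 2, 32, 12)
print(first_table_key("2088xx1", event_time))
print(second_table_key("2088xx1", event_time))
print(window_columns(event_time))
```

Putting the hour into the rowkey of the second table keeps all minute rows of one hour under a single key, so a partial hour can be read with one keyed lookup.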
Further, in this embodiment, because the columns of the first storage table and the second storage table are dynamic, the method also includes: if the stored values of all time windows in a column/row are 0, that column/row is not stored. This can save a great deal of storage space. A specific example follows:
Suppose the user whose id information is 2088xx1 made 3 payments at the following times:
3 yuan paid at 20150101 01:00:01;
5 yuan paid at 20150101 01:00:12;
2 yuan paid at 20150101 02:32:12.
Then the data stored in the first storage structure is as follows:
rowkey    timestamp   D         H1       H2
2088xx1   20150101    10 yuan   8 yuan   2 yuan
The data stored in the second storage structure is as follows:
rowkey                timestamp        M        S1        S12
2088xx1_20150101-01   20150101 01:00   8 yuan   3 yuan    5 yuan
2088xx1_20150101-02   20150101 02:32   2 yuan   (empty)   2 yuan
The example shows clearly that, because the values stored in the time windows between 20150101 01:00:01 and 20150101 01:00:12 are 0, and the values stored in the time windows between 20150101 02:32:00 and 20150101 02:32:12 are also 0, the columns S2 to S11 are not stored.
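The rule of not storing zero-valued windows can be illustrated with a small Python sketch. The helper name sparse_row is hypothetical, and a plain dict stands in for a row of dynamic columns; this is not the storage engine's own API.

```python
def sparse_row(total_label, window_prefix, values):
    """Turn dense per-window values into the columns that are actually stored.

    Returns None when every window is 0 (the row is not stored at all);
    otherwise returns the total column plus only the non-zero windows.
    """
    if not any(values):
        return None
    row = {total_label: sum(values)}
    row.update({f"{window_prefix}{i}": v for i, v in enumerate(values) if v != 0})
    return row

# Minute 01:00 of the example: 3 yuan at second 1, 5 yuan at second 12.
seconds = [0] * 60
seconds[1] = 3
seconds[12] = 5
print(sparse_row("M", "S", seconds))    # columns S2 to S11 are absent
print(sparse_row("M", "S", [0] * 60))   # an all-zero minute is not stored
```

Only 3 of the 61 possible columns are kept for the example minute, which is where the storage saving comes from.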
Of course, the above example stores the daily, hourly, per-minute, and per-second summary data in columns; those skilled in the art could equally store them in rows, which is not detailed here.
Further, in this embodiment, when new user behavior data is obtained, the per-second, per-minute, hourly, and daily summary data that match the current time are updated synchronously. Continuing the example above:
If the user whose id information is 2088xx1 makes another payment of 7 yuan at 20150101 02:32:12, then the per-second summary data for 20150101 02:32:12, the per-minute summary data for 20150101 02:32, the hourly summary data for hour 02 of 20150101, and the daily summary data for 20150101 all need to be updated. The updated data is as follows:
The data stored in the first storage structure is as follows:
rowkey    timestamp   D         H1       H2
2088xx1   20150101    17 yuan   8 yuan   9 yuan
The data stored in the second storage structure is as follows:
rowkey                timestamp        M        S1        S12
2088xx1_20150101-01   20150101 01:00   8 yuan   3 yuan    5 yuan
2088xx1_20150101-02   20150101 02:32   9 yuan   (empty)   9 yuan
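The synchronized update of the four summary levels can be sketched as follows. This is a minimal in-memory illustration: the function name record and the nested dicts standing in for the two storage tables are assumptions, and a real store would additionally need the four increments to be applied atomically or idempotently, which this sketch does not address.

```python
from datetime import datetime

def record(first, second, user_id, when, amount):
    """Add one event to the second, minute, hour, and day totals it falls into."""
    day_key = (user_id, when.strftime("%Y%m%d"))
    minute_key = (f"{user_id}_{when.strftime('%Y%m%d-%H')}",
                  when.strftime("%Y%m%d %H:%M"))
    day_row = first.setdefault(day_key, {})
    minute_row = second.setdefault(minute_key, {})
    for row, column in ((minute_row, f"S{when.second}"), (minute_row, "M"),
                        (day_row, f"H{when.hour}"), (day_row, "D")):
        row[column] = row.get(column, 0) + amount

# State after the three example payments (amounts in yuan).
first = {("2088xx1", "20150101"): {"D": 10, "H1": 8, "H2": 2}}
second = {
    ("2088xx1_20150101-01", "20150101 01:00"): {"M": 8, "S1": 3, "S12": 5},
    ("2088xx1_20150101-02", "20150101 02:32"): {"M": 2, "S12": 2},
}

# The additional 7 yuan payment at 20150101 02:32:12.
record(first, second, "2088xx1", datetime(2015, 1, 1, 2, 32, 12), 7)
print(first)
print(second)
```

After the call, the totals match the updated tables above: D is 17 and H2 is 9 in the first table, and M and S12 are both 9 in the 02:32 row of the second table.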
The following explains in detail how user behavior data is queried after it has been accumulated in the manner described above:
When user behavior data is queried, the time range of the query can be divided into segments, and each segment is queried separately to obtain the data.
For example, suppose the payment amount of user 2088xx1 from 20141125 12:35:09 to 20141128 15:35:09 is to be counted. Then:
S1: query the second storage table for the data with rowkey=2088xx1_20141125-12 and ts from 20141125 12:35 to 20141125 12:59. This yields the data from 20141125 12:35:09 to 20141125 12:59:59.
S2: query the first storage table for the data with rowkey=2088xx1 and ts from 20141125 to 20141128. This yields the data from 20141125 13:00:00 to 20141128 15:00:00.
S3: query the second storage table for the data with rowkey=2088xx1_20141128-15 and ts from 20141128 15:00 to 20141128 15:35. This yields the data from 20141128 15:00:00 to 20141128 15:35:09.
Summing the data obtained in the three steps above gives the payment amount of user 2088xx1 from 20141125 12:35:09 to 20141128 15:35:09.
As can be seen, for data between any start time and any end time, the user behavior database needs to be queried at most three times to obtain accumulated data with second-level precision. Moreover, because the per-second and per-minute summary data are stored separately from the hourly and daily summary data, the volume of data queried is greatly reduced and efficiency is much improved compared with the prior art. In addition, for a query that runs from any start time to the current time, the summary data up to the current time is exactly the daily summary data corresponding to the current time, so only steps S1 and S2 above are needed. In that case, just two queries of the user behavior database yield accumulated data with second-level precision.
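The three-step split of a query range can be sketched as a small planning function. The name plan_queries and the tuple layout (table, rowkey, first ts, last ts) are hypothetical; the sketch only produces the lookups from the worked example and assumes the start and end times fall in different clock hours, as they do there.

```python
from datetime import datetime

def plan_queries(user_id, start, end):
    """Split [start, end] into the three keyed lookups of steps S1, S2, and S3."""
    if (start.date(), start.hour) == (end.date(), end.hour):
        raise ValueError("this sketch assumes start and end are in different hours")
    hour_fmt, minute_fmt, day_fmt = "%Y%m%d-%H", "%Y%m%d %H:%M", "%Y%m%d"
    return [
        # S1: rest of the starting hour, read from the minute/second table.
        ("second", f"{user_id}_{start.strftime(hour_fmt)}",
         start.strftime(minute_fmt), start.replace(minute=59).strftime(minute_fmt)),
        # S2: whole hours in between, read from the daily/hourly table.
        ("first", user_id, start.strftime(day_fmt), end.strftime(day_fmt)),
        # S3: beginning of the ending hour, read from the minute/second table.
        ("second", f"{user_id}_{end.strftime(hour_fmt)}",
         end.replace(minute=0).strftime(minute_fmt), end.strftime(minute_fmt)),
    ]

plan = plan_queries("2088xx1",
                    datetime(2014, 11, 25, 12, 35, 9),
                    datetime(2014, 11, 28, 15, 35, 9))
for step in plan:
    print(step)
```

One reading of the example is that the boundary rows are then trimmed at column level: in the first and last minute rows only the second windows inside the range are summed, in the first and last day rows only the hour windows inside the range, and every fully covered row contributes its M or D total directly.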
As shown in Fig. 3, in one embodiment of the application, the data storage device includes a user behavior database 20, and the user behavior database 20 includes:
A data storage format module 201, configured to form the first storage structure by associating each item of hourly summary data with the daily summary data of the corresponding time, and to form the second storage structure by associating each item of per-second summary data with the per-minute summary data of the corresponding time;
A data logic storage module 203, configured to store the daily summary data and the hourly summary data in the first storage structure after user behavior data of the same dimension is summarized by day and by hour, and to store the per-minute summary data and the per-second summary data in the second storage structure after user behavior data of the same dimension as the daily and hourly summary data is summarized by minute and by second.
The "same dimension" in this embodiment means that the user behavior data to be accumulated represents the same thing. For example, the dimension of the accumulated user behavior data may be the number of website logins in a period of time, or the payment amount in a period of time. The payment amount in a period of time is used below as the example dimension to describe the technical solution of the application in detail.
In this embodiment, the accumulated user behavior data can be stored in the user behavior database 20 (such as an HBase database), which includes the first storage structure and the second storage structure.
The daily summary data and the hourly summary data are stored in the first storage structure. The per-minute summary data and the per-second summary data are stored in the second storage structure.
Further, in this embodiment, the first storage structure includes multiple data strings, each data string consists of time windows, and the time windows of each data string can respectively store one item of daily summary data and one or more items of hourly summary data corresponding to the time of that daily summary data;
The second storage structure includes multiple data strings, each data string consists of multiple time windows, and the time windows of each data string can respectively store one item of per-minute summary data and one or more items of per-second summary data corresponding to the time of that per-minute summary data.
Referring to Fig. 2a, the first storage structure includes 3 data strings, each consisting of 25 time windows: D, H0, H1, ..., H23. The time window D stores the daily summary data; H0, H1, ..., H23 represent the 24 hours of that day (from hour 0 to hour 23) and store the corresponding hourly summary data. Within one data string, the sum of the values stored in the time windows H0, H1, ..., H23 equals the value stored in the time window D.
Referring to Fig. 2b, the second storage structure includes 3 data strings, each consisting of 61 time windows: M, S0, S1, ..., S59. The time window M stores the per-minute summary data; S0, S1, ..., S59 represent the 60 seconds of that minute (from second 0 to second 59) and store the corresponding per-second summary data. Within one data string, the sum of the values stored in the time windows S0, S1, ..., S59 equals the value stored in the time window M.
Further, the user behavior database 20 also includes a marking module 205, configured to assign a unique timestamp to each data string.
In this embodiment, the first storage structure can be a first storage table whose columns include one column of daily summary data and multiple columns of hourly summary data corresponding to the time of the daily summary data. Referring to Fig. 2a, the first storage table can be keyed by rowkey (such as id information); the first column can hold the daily summary data and the following 24 columns the hourly summary data. Alternatively, columns 1 to 24 can hold the hourly summary data and column 25 the daily summary data; those skilled in the art can change this order by customary means. In addition, the unique timestamp (abbreviated ts) configured for each data string can be the date of that data string, identifying which day's user behavior data the data string holds.
The second storage structure can be a second storage table whose columns include one column of per-minute summary data and multiple columns of per-second summary data corresponding to the time of the per-minute summary data. Referring to Fig. 2b, the second storage table can be keyed by rowkey (such as id information) plus hour (the time at which the user behavior data occurred, accurate to the hour); the first column holds the per-minute summary data and the following 60 columns the per-second summary data. Alternatively, columns 1 to 60 can hold the per-second summary data and column 61 the per-minute summary data; those skilled in the art can change this order by customary means. In addition, the unique timestamp (abbreviated ts) configured for each data string can be the minute of that data string, identifying which minute's user behavior data the data string holds.
Of course, in this embodiment, the columns described above can also be replaced with rows to realize an essentially identical first storage structure and second storage structure:
The first storage structure is a first storage table whose rows include one row of daily summary data and multiple rows of hourly summary data corresponding to the time of the daily summary data;
The second storage structure is a second storage table whose rows include one row of per-minute summary data and multiple rows of per-second summary data corresponding to the time of the per-minute summary data.
The details can be derived unambiguously from Fig. 2a, Fig. 2b, and the above description of the columns of the first and second storage tables, and are not repeated here.
Further, in the present embodiment, due to the column data of the first storage table and the second storage table be it is dynamic, because This, the mathematical logic memory module 203 is also used to: if the storage numerical value of all time windows is all 0 in certain column/row, not being deposited Store up the column/row.It is possible thereby to save many memory spaces.It is illustrated below by way of a specific example:
Assuming that get the user that id information is 2088xx1 has carried out 3 payments respectively within following times:
3 yuan were paid at 20150101 01:00:01 seconds;
5 yuan were paid at 20150101 01:00:12 seconds;
2 yuan were paid at 20150101 02:32:12 seconds.
So, the data stored in the first storage format are as follows:
rowkey timestamp D H1 H2
2088xx1 20150101 10 yuan 8 yuan 2 yuan
The data stored in the second storage format are as follows:
rowkey timestamp M S1 S12
2088xx1_20150101-01 20150101 01:00 points 8 yuan 3 yuan 5 yuan
2088xx1_20150101-02 20150101 02:32 points 2 yuan It is empty 2 yuan
As the above example clearly shows, because the stored values for the time windows between 20150101 01:00:01 and 20150101 01:00:12 are 0, and the stored values for the time windows between 20150101 02:32:00 and 20150101 02:32:12 are also 0, columns S2 to S11 are not stored.
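The example above can be reproduced with a short sketch that aggregates the three payments and keeps only the columns that actually receive a value. It is an illustration under stated assumptions: the tables are modeled as plain in-memory dictionaries keyed by (rowkey, timestamp), and `build_tables` and the column labels are hypothetical names chosen to match the example rather than anything prescribed by the embodiment.

```python
from collections import defaultdict
from datetime import datetime


def build_tables(user_id, payments):
    """Aggregate (datetime, amount) payments into the two sparse tables.

    Each table maps (rowkey, timestamp) -> {column: summed amount}.
    A column is created only when an amount is added to it, so time
    windows whose value would be 0 are never stored.
    """
    first = defaultdict(lambda: defaultdict(int))
    second = defaultdict(lambda: defaultdict(int))
    for when, amount in payments:
        if amount == 0:
            continue  # nothing to store for an empty window
        day_row = first[(user_id, when.strftime("%Y%m%d"))]
        day_row["D"] += amount                # daily summary
        day_row[f"H{when.hour}"] += amount    # hourly summary
        minute_row = second[(f"{user_id}_{when.strftime('%Y%m%d-%H')}",
                             when.strftime("%Y%m%d %H:%M"))]
        minute_row["M"] += amount                  # per-minute summary
        minute_row[f"S{when.second}"] += amount    # per-second summary
    return ({k: dict(v) for k, v in first.items()},
            {k: dict(v) for k, v in second.items()})


payments = [
    (datetime(2015, 1, 1, 1, 0, 1), 3),
    (datetime(2015, 1, 1, 1, 0, 12), 5),
    (datetime(2015, 1, 1, 2, 32, 12), 2),
]
first_table, second_table = build_tables("2088xx1", payments)
```

In this sketch the 02:32 row holds only `M` and `S12`; the empty `S1` cell in the example corresponds to a key that is simply absent from that row.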
Of course, the above example stores the daily, hourly, per-minute, and per-second summary data in columns; those skilled in the art may equally store them in rows, which is not described further here.
Further, in this embodiment, the user behavior database 20 also includes an update module 207. When new user behavior data is obtained, the update module synchronously updates the per-second, per-minute, hourly, and daily summary data that match the current time. Continuing the above example:
If the user whose ID information is 2088xx1 makes a further payment of 7 yuan at 20150101 02:32:12, then the per-second summary data for 20150101 02:32:12, the per-minute summary data for 20150101 02:32, the hourly summary data for hour 02 of 20150101, and the daily summary data for 20150101 all need to be updated. The updated data are as follows:
The data stored in the first storage structure are as follows:
rowkey    timestamp   D         H1       H2
2088xx1   20150101    17 yuan   8 yuan   9 yuan
The data stored in the second storage structure are as follows:
rowkey                timestamp        M        S1        S12
2088xx1_20150101-01   20150101 01:00   8 yuan   3 yuan    5 yuan
2088xx1_20150101-02   20150101 02:32   9 yuan   (empty)   9 yuan
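The synchronous update can be sketched as a single function that adds a new amount to all four summaries at once. This is an assumption-laden illustration: `apply_event` is a hypothetical name, the tables are plain dictionaries standing in for the storage structures, and the starting values are copied from the example before the 7 yuan payment.

```python
from datetime import datetime


def apply_event(first_table, second_table, user_id, when, amount):
    """Add one new event to the per-second, per-minute, hourly and daily totals."""
    day_key = (user_id, when.strftime("%Y%m%d"))
    minute_key = (f"{user_id}_{when.strftime('%Y%m%d-%H')}",
                  when.strftime("%Y%m%d %H:%M"))
    day_row = first_table.setdefault(day_key, {})
    minute_row = second_table.setdefault(minute_key, {})
    # The same amount is added to every summary level that covers `when`.
    for row, column in ((day_row, "D"),
                        (day_row, f"H{when.hour}"),
                        (minute_row, "M"),
                        (minute_row, f"S{when.second}")):
        row[column] = row.get(column, 0) + amount


# State before the new payment, taken from the example.
first_table = {("2088xx1", "20150101"): {"D": 10, "H1": 8, "H2": 2}}
second_table = {
    ("2088xx1_20150101-01", "20150101 01:00"): {"M": 8, "S1": 3, "S12": 5},
    ("2088xx1_20150101-02", "20150101 02:32"): {"M": 2, "S12": 2},
}

# A further payment of 7 yuan at 20150101 02:32:12.
apply_event(first_table, second_table, "2088xx1", datetime(2015, 1, 1, 2, 32, 12), 7)
```

In a real store the four increments would likely need to be applied atomically or idempotently so that a partial failure does not leave the summary levels inconsistent; the text does not specify how this is done.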
In summary, the data storage method and device of this application optimize the storage structure for user behavior data: the per-second and per-minute summary data are stored separately from the hourly and daily summary data, so that querying, reading, or counting accumulated user behavior data requires fewer accesses to the database. This optimizes the storage and read performance of the database and improves response speed.
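The reduction in database accesses can be illustrated with a small comparison. This is a sketch only: `FIRST_TABLE` and `RAW_EVENTS` are hypothetical in-memory stand-ins, the values come from the updated example in this embodiment, and the count of "reads" is a simplification of real database access cost.

```python
# Pre-aggregated first storage table: (rowkey, timestamp) -> {column: amount}.
FIRST_TABLE = {("2088xx1", "20150101"): {"D": 17, "H1": 8, "H2": 9}}

# The same payments kept as individual raw records: (user, time, amount).
RAW_EVENTS = [
    ("2088xx1", "20150101 01:00:01", 3),
    ("2088xx1", "20150101 01:00:12", 5),
    ("2088xx1", "20150101 02:32:12", 2),
    ("2088xx1", "20150101 02:32:12", 7),
]


def daily_total_from_summary(user_id, day):
    """Return (total, reads): the daily total needs a single row lookup."""
    row = FIRST_TABLE.get((user_id, day), {})
    return row.get("D", 0), 1


def daily_total_from_raw(user_id, day):
    """Return (total, reads): without summaries, every raw record is scanned."""
    total, reads = 0, 0
    for uid, timestamp, amount in RAW_EVENTS:
        reads += 1
        if uid == user_id and timestamp.startswith(day):
            total += amount
    return total, reads
```

Both functions return a total of 17 yuan for 20150101, but the summary lookup touches one row while the raw scan touches all four records, and the gap grows with the number of accumulated events.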
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the device and modules described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into modules is only a division by logical function, and other divisions are possible in actual implementation; for example, multiple modules or components may be combined or integrated into another device, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection between devices or modules through some interfaces, and may be electrical, mechanical, or in other forms.
The modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical modules; that is, they may be located in one place or distributed across multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the scheme of this embodiment.
In addition, the functional modules in the embodiments of this application may be integrated into one processing module, may each exist physically on their own, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware, or in the form of hardware plus software functional modules.
An integrated module implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute some of the steps of the methods described in the embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of this application, not to limit it. Although this application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements of some of the technical features; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application.

Claims (12)

1. A data storage method, characterized in that the method comprises:
after summarizing user behavior data of the same dimension daily and hourly, storing the daily summary data and the hourly summary data in a first storage structure, wherein the first storage structure establishes an association between the hourly summary data and the daily summary data of the corresponding time;
after summarizing, per minute and per second, the user behavior data of the same dimension as the daily summary data and the hourly summary data, storing the per-minute summary data and the per-second summary data in a second storage structure, wherein the second storage structure establishes an association between the per-second summary data and the per-minute summary data of the corresponding time.
2. The data storage method according to claim 1, characterized in that the first storage structure comprises a plurality of data strings, each data string consists of time windows, and the time windows of each data string respectively store one item of daily summary data and one or more items of hourly summary data corresponding to the time of that daily summary data;
the second storage structure comprises a plurality of data strings, each data string consists of multiple time windows, and the multiple time windows of each data string respectively store one item of per-minute summary data and one or more items of per-second summary data corresponding to the time of that per-minute summary data.
3. The data storage method according to claim 2, characterized in that
the first storage structure is a first storage table, and the columns/rows of the first storage table include one column/row of daily summary data and multiple columns/rows of hourly summary data corresponding to the time of the daily summary data;
the second storage structure is a second storage table, and the columns/rows of the second storage table include one column/row of per-minute summary data and multiple columns/rows of per-second summary data corresponding to the time of the per-minute summary data.
4. The data storage method according to claim 2, characterized in that the method further comprises:
configuring a unique timestamp for each data string.
5. The data storage method according to claim 3, characterized in that the method further comprises:
if the stored values of all time windows in a given column/row are 0, not storing that column/row.
6. The data storage method according to claim 1, characterized in that the method further comprises:
when new user behavior data is obtained, synchronously updating the per-second summary data, the per-minute summary data, the hourly summary data, and the daily summary data that match the current time.
7. A data storage device, characterized in that the device comprises:
a data storage format module, configured to establish an association between hourly summary data and the daily summary data of the corresponding time to form a first storage structure, and to establish an association between per-second summary data and the per-minute summary data of the corresponding time to form a second storage structure;
a data logic storage module, configured to, after user behavior data of the same dimension is summarized daily and hourly, store the daily summary data and the hourly summary data in the first storage structure, and, after the user behavior data of the same dimension as the daily summary data and the hourly summary data is summarized per minute and per second, store the per-minute summary data and the per-second summary data in the second storage structure.
8. The data storage device according to claim 7, characterized in that the first storage structure comprises a plurality of data strings, each data string consists of time windows, and the time windows of each data string can respectively store one item of daily summary data and one or more items of hourly summary data corresponding to the time of that daily summary data;
the second storage structure comprises a plurality of data strings, each data string consists of time windows, and the time windows of each data string respectively store one item of per-minute summary data and one or more items of per-second summary data corresponding to the time of that per-minute summary data.
9. The data storage device according to claim 8, characterized in that
the first storage structure is a first storage table, and the columns/rows of the first storage table include one column/row of daily summary data and multiple columns/rows of hourly summary data corresponding to the time of the daily summary data;
the second storage structure is a second storage table, and the columns/rows of the second storage table include one column/row of per-minute summary data and multiple columns/rows of per-second summary data corresponding to the time of the per-minute summary data.
10. The data storage device according to claim 8, characterized in that the device further comprises:
a marking module, configured to configure a unique timestamp for each data string.
11. The data storage device according to claim 9, characterized in that the data logic storage module is further configured to:
if the stored values of all time windows in a given column/row are 0, not store that column/row.
12. The data storage device according to claim 7, characterized in that the device further comprises:
an update module, configured to, when new user behavior data is obtained, drive the first logic storage module and the second logic storage module to synchronously update the per-second summary data, the per-minute summary data, the hourly summary data, and the daily summary data that match the current time.
CN201510333071.8A 2015-06-16 2015-06-16 Date storage method and device Active CN106326220B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510333071.8A CN106326220B (en) 2015-06-16 2015-06-16 Date storage method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510333071.8A CN106326220B (en) 2015-06-16 2015-06-16 Date storage method and device

Publications (2)

Publication Number Publication Date
CN106326220A CN106326220A (en) 2017-01-11
CN106326220B true CN106326220B (en) 2019-08-27

Family

ID=57733480

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510333071.8A Active CN106326220B (en) 2015-06-16 2015-06-16 Date storage method and device

Country Status (1)

Country Link
CN (1) CN106326220B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109492009B (en) * 2018-11-25 2023-06-23 广州市塞安物联网科技有限公司 Method and system for identifying relevance time units in big data storage device
CN110704466B (en) * 2019-09-27 2021-12-17 武汉极意网络科技有限公司 Black product data storage method and device
CN113868267A (en) * 2020-06-30 2021-12-31 华为技术有限公司 Method for injecting time sequence data, method for inquiring time sequence data and database system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101141759A (en) * 2007-02-12 2008-03-12 中兴通讯股份有限公司 Call behavior statistical and analytical method and device
CN101860454A (en) * 2010-06-24 2010-10-13 杭州华三通信技术有限公司 Network performance data processing method and device thereof
CN102456065A (en) * 2011-07-01 2012-05-16 中国人民解放军国防科学技术大学 Methods for storing and querying offline historical statistical data of data stream
CN103399945A (en) * 2013-08-15 2013-11-20 成都博云科技有限公司 Data structure based on cloud computing database system
CN104572726A (en) * 2013-10-22 2015-04-29 北京品众互动网络营销技术有限公司 Advertisement analysis method

Also Published As

Publication number Publication date
CN106326220A (en) 2017-01-11

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20201012

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Patentee after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Patentee before: Advanced innovation technology Co.,Ltd.

Effective date of registration: 20201012

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Patentee after: Advanced innovation technology Co.,Ltd.

Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands

Patentee before: Alibaba Group Holding Ltd.