CN105183391B - Method and apparatus for data storage on a distributed data platform - Google Patents
- Publication number
- CN105183391B CN201510598396.9A
- Authority
- CN
- China
- Prior art keywords
- data
- class
- catalogue
- time
- expired
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Abstract
The present invention provides a method and apparatus for data storage on a distributed data platform that record data changes effectively while improving the efficiency of data storage and data retrieval. The method comprises: comparing the current day's data with the data in a data state change table and classifying the data that has changed; collecting the classified data into different directories and storing it under the corresponding partitions according to each directory's data storage rule; and updating the data state change table.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a method and apparatus for data storage on a distributed data platform.
Background art
Big data is the term people use to describe the current era of information explosion. It shows not only in a leap in data volume but also in a growing variety of stored data types, from traditional relational data and key-value data to more diverse flat files, pictures, audio, and video. Analyzing such large and varied data places higher demands on the computing performance and storage performance of a data platform.
Using a distributed Hadoop system to store and analyze big data is common practice in the industry. Because a distributed Hadoop system stores data as files, it increases storage capacity and throughput but sacrifices the update mechanism of a relational database: it supports only inserting, deleting, and overwriting text files. As a result, data history can currently be accumulated only as data snapshots. A snapshot of the data stored in the database is saved every day to record the complete data state, and the snapshots accumulate over time into a historical data store. When the historical track of data state changes has to be restored or retrieved, the historical data must be scanned in full and the complete data sets at different points in time compared, in order to find the differences and restore the data state at a given point in time.
However, the existing technical solutions have the following disadvantages:
1. Storage schemes designed for relational databases cannot cope with large data volumes, while the snapshot accumulation used on existing distributed file systems consumes a large amount of storage space and makes subsequent computation inefficient;
2. Data retrieval generally requires a full scan, which occupies a large amount of system resources;
3. They lack flexibility for complex and changeable online data scenarios.
Moreover, in many application scenarios a piece of data passes through many state changes between its creation and its retirement. The data platform therefore produces multiple snapshots when recording these state changes, and data storage expands rapidly. During data analysis, the historical track of the data often has to be traced, which requires scanning a large amount of historical data to restore its state and is inefficient. How to design a mechanism that lets a data platform record data state changes while remaining convenient for analysis and restoration is therefore an important problem in urgent need of a solution.
Summary of the invention
In view of this, the present invention provides a method and apparatus for data storage on a distributed data platform that can record data changes effectively while improving the efficiency of data storage and data retrieval.
To achieve the above object, according to one aspect of the present invention, a method for data storage on a distributed data platform is provided.
A method for data storage on a distributed data platform comprises: comparing the current day's data with the data in a data state change table and classifying the data that has changed; collecting the classified data into different directories and storing it under the corresponding partitions according to each directory's data storage rule; and updating the data state change table.
Optionally, the classification is performed according to the stage of the data life cycle and comprises three types: online class, expired class, and archived class.
Optionally, the step of classifying the changed data comprises: looking up the key name of the data and comparing the current day's data with the data in the data state change table; if the data does not exist in the data state change table but exists in the current day's data, the data is of the online class; if the data exists in both the data state change table and the current day's data but its key value differs, the data in the data state change table is of the expired class and the data in the current day's data is of the online class; and if the data exists in the data state change table but not in the current day's data, the data is of the archived class.
Optionally, the data storage rule comprises three directory levels: partition name, data time, and data lifetime end time.
Optionally, the partition names comprise an online partition, an expired partition, and an archive partition.
Optionally, the step of storing under the corresponding partitions according to the directory's data storage rule comprises: for online class data, the first-level directory (partition name) is the online partition, the second-level directory (data time) is the maximum time, and the third-level directory (data lifetime end time) is the maximum time; for expired class data, the first-level directory (partition name) is the expired partition, the second-level directory (data time) is the change time, and the third-level directory (data lifetime end time) is the change time; and for archived class data, the first-level directory (partition name) is the archive partition, the second-level directory (data time) is the change time, and the third-level directory (data lifetime end time) is the maximum time.
Optionally, the step of updating the data state change table comprises: inserting the key name, key value, state change start time, and state change end time of the online class data, where the state change start time is the change time and the state change end time is the maximum time; and setting the state change end time of the expired class data to the change time.
According to another aspect of the present invention, an apparatus for data storage on a distributed data platform is provided.
An apparatus for data storage on a distributed data platform comprises: a data classification module, configured to compare the current day's data with the data in a data state change table and classify the data that has changed; a data storage module, configured to collect the classified data into different directories and store it under the corresponding partitions according to each directory's data storage rule; and a state update module, configured to update the data state change table.
Optionally, the classification is performed according to the stage of the data life cycle and comprises three types: online class, expired class, and archived class.
Optionally, the data classification module is further configured to: look up the key name of the data and compare the current day's data with the data in the data state change table; if the data does not exist in the data state change table but exists in the current day's data, treat the data as online class; if the data exists in both the data state change table and the current day's data but its key value differs, treat the data in the data state change table as expired class and the data in the current day's data as online class; and if the data exists in the data state change table but not in the current day's data, treat the data as archived class.
Optionally, the data storage rule comprises three directory levels: partition name, data time, and data lifetime end time.
Optionally, the partition names comprise an online partition, an expired partition, and an archive partition.
Optionally, the data storage module is further configured so that: for online class data, the first-level directory (partition name) is the online partition, the second-level directory (data time) is the maximum time, and the third-level directory (data lifetime end time) is the maximum time; for expired class data, the first-level directory (partition name) is the expired partition, the second-level directory (data time) is the change time, and the third-level directory (data lifetime end time) is the change time; and for archived class data, the first-level directory (partition name) is the archive partition, the second-level directory (data time) is the change time, and the third-level directory (data lifetime end time) is the maximum time.
Optionally, the state update module is further configured to: insert the key name, key value, state change start time, and state change end time of the online class data, where the state change start time is the change time and the state change end time is the maximum time; and set the state change end time of the expired class data to the change time.
According to the technical solution of the present invention, data needs to be classified, stored, and have its state updated only when its state changes; unchanged data requires no second storage or state update. Data changes are thus recorded effectively while the efficiency of data storage and data retrieval is improved, storage space is saved, and expired data is easy to clean up.
Brief description of the drawings
The drawings are provided for a better understanding of the present invention and do not unduly limit it. In the drawings:
Fig. 1 is a schematic diagram of the main steps of a method for data storage on a distributed data platform according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of partitioned data storage according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of data cleanup according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of a data state change table according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of the main modules of an apparatus for data storage on a distributed data platform according to an embodiment of the present invention;
Fig. 6 is a schematic diagram comparing the storage effect of an embodiment of the present invention with that of the prior art.
Detailed description of the embodiments
Exemplary embodiments of the present invention are described below with reference to the drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present invention. Likewise, descriptions of well-known functions and structures are omitted from the following description for clarity and conciseness.
In the method for data storage on a distributed data platform according to the present invention, a data item needs to be classified, stored, and have its state updated only when its state changes; unchanged data items require no second storage or state update. Data changes are thus recorded effectively while the efficiency of data storage and data retrieval is improved.
Fig. 1 is a schematic diagram of the main steps of a method for data storage on a distributed data platform according to an embodiment of the present invention. As shown in Fig. 1, the method mainly comprises the following steps S11 to S13.
Step S11: compare the current day's data with the data in the data state change table and classify the data that has changed. To suit the characteristics of the Hadoop file system, data needs to be stored in a uniform and orderly way to improve efficiency. According to the stage of the data life cycle, data can be divided into three classes: online (ACTIVE), expired (EXPIRED), and archived (HISTORY). Online class data is data whose meaning is currently valid and which may still change; expired class data is data whose meaning is no longer valid; archived class data is data that has been sealed and will no longer change, and whose meaning remains valid.
When data is classified, the key name of the data is looked up according to predefined data processing rules, and the current day's data is compared with the data in the data state change table to determine which data has changed. If the data does not exist in the data state change table but exists in the current day's data, the data is of the online class. If the data exists in both the data state change table and the current day's data but its key value differs, the data in the data state change table is of the expired class and the data in the current day's data is of the online class. If the data exists in the data state change table but not in the current day's data, the data is of the archived class.
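The three classification rules above can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the function name `classify_changes`, the use of plain dictionaries mapping key name to key value, and the assumption that the table side holds only the currently valid record for each key are all hypothetical.

```python
from typing import Dict, List, Tuple

# Category labels taken from the description (ACTIVE / EXPIRED / HISTORY).
ACTIVE, EXPIRED, HISTORY = "ACTIVE", "EXPIRED", "HISTORY"


def classify_changes(
    table: Dict[str, str], today: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """Return (category, key name, key value) for every changed item.

    `table` is assumed to hold the currently valid key value per key name
    from the data state change table; `today` holds the current day's data.
    Items whose key value is unchanged produce no output at all.
    """
    result: List[Tuple[str, str, str]] = []
    for key, value in today.items():
        if key not in table:
            # Only in today's data: online class.
            result.append((ACTIVE, key, value))
        elif table[key] != value:
            # In both, key value differs: old value expires, new value is online.
            result.append((EXPIRED, key, table[key]))
            result.append((ACTIVE, key, value))
    for key, value in table.items():
        if key not in today:
            # Only in the state change table: archived class.
            result.append((HISTORY, key, value))
    return result
```

Unchanged items never appear in the result, which is what lets the later storage and update steps skip them.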
Step S12: collect the classified data into different directories and store it under the corresponding partitions according to each directory's data storage rule. The data storage rule comprises three directory levels: partition name, data time, and data lifetime end time. Given the data classification described in step S11, the partition names comprise an online partition, an expired partition, and an archive partition.
Fig. 2 is a schematic diagram of partitioned data storage according to an embodiment of the present invention. For a large enterprise in stable operation, the data volume in the expired partition and the archive partition grows steadily over time, while the data volume in the online partition stays relatively stable even as new data is added. As Fig. 2 shows, data is stored with time as the main axis and is distributed as evenly as possible across the partitions corresponding to the three top-level directories.
To make classified storage and lookup easier, data is stored under the corresponding partition according to the directory's data storage rule. For the three classes of data described above, there are three corresponding cases:
For online class data, the first-level directory (partition name) is the online partition, the second-level directory (data time) is the maximum time, and the third-level directory (data lifetime end time) is the maximum time;
For expired class data, the first-level directory (partition name) is the expired partition, the second-level directory (data time) is the change time, and the third-level directory (data lifetime end time) is the change time; and
For archived class data, the first-level directory (partition name) is the archive partition, the second-level directory (data time) is the change time, and the third-level directory (data lifetime end time) is the maximum time.
A specific data storage directory hierarchy is illustrated below by example:
For online class data, the directory hierarchy is dp=ACTIVE/dt=4712-12-31/end_date=4712-12-31;
For expired class data, the directory hierarchy is dp=EXPIRED/dt=2013-10-11/end_date=2013-10-11;
For archived class data, the directory hierarchy is dp=HISTORY/dt=2014-06-22/end_date=4712-12-31.
Here dp denotes the data partition, dt the data time, and end_date the data lifetime end time. Taking archived class data as an example, when data to be archived is stored, it is first determined that it belongs in the partition "dp=HISTORY"; next, according to the change time of the data, "dt=2014-06-22", it is placed under the data directory for that time within the partition; finally, according to the data lifetime end time, "end_date=4712-12-31", it is saved into the corresponding data table. Because archiving means that the data is sealed and no longer changes, and the value and meaning of its attributes hold permanently, its end_date is the maximum time, 4712-12-31. In practice, the directory hierarchy can be set as the specific situation requires.
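The directory rule can be expressed as a small path-building helper. The sketch below is a hedged illustration only: the function name `partition_path` is hypothetical, and the maximum time is taken to be 4712-12-31 as in the examples above.

```python
from datetime import date

# Maximum time used in the examples of the description.
MAX_DATE = date(4712, 12, 31)


def partition_path(category: str, change_date: date) -> str:
    """Build the three-level directory dp=<partition>/dt=<data time>/end_date=<end time>."""
    if category == "ACTIVE":
        # Online data: both data time and lifetime end time are the maximum time.
        dt, end = MAX_DATE, MAX_DATE
    elif category == "EXPIRED":
        # Expired data: both levels carry the change time.
        dt, end = change_date, change_date
    elif category == "HISTORY":
        # Archived data: data time is the change time, end time is the maximum time.
        dt, end = change_date, MAX_DATE
    else:
        raise ValueError(f"unknown category: {category}")
    return f"dp={category}/dt={dt.isoformat()}/end_date={end.isoformat()}"
```

For instance, `partition_path("HISTORY", date(2014, 6, 22))` reproduces the archived-class example path given above.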
Fig. 3 is a schematic diagram of data cleanup according to an embodiment of the present invention. Storing data with the partitioned layout shown in Fig. 2 makes it very easy to clean up historical data. As shown in Fig. 3, the attributes or measures of expired class data have changed and its meaning is no longer valid, so cleaning it up only requires deleting the corresponding expired partition, which is a simple operation.
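As a rough illustration of why cleanup is cheap under this layout, the sketch below deletes expired partitions by removing whole directories. It rests on several assumptions that are not in the source: a local directory tree stands in for the distributed file system, the function name `drop_expired_partitions` is hypothetical, and the retention cutoff `before` is an added convenience (the description simply deletes the expired partition).

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def drop_expired_partitions(root: Path, before: str) -> List[str]:
    """Remove dp=EXPIRED/dt=<date> directories whose date is earlier than `before`.

    Dates are ISO strings (YYYY-MM-DD), so plain string comparison orders them
    correctly. Online and archive partitions are never touched.
    """
    removed: List[str] = []
    expired_root = root / "dp=EXPIRED"
    if not expired_root.is_dir():
        return removed
    for dt_dir in sorted(expired_root.iterdir()):
        if dt_dir.is_dir() and dt_dir.name.startswith("dt="):
            if dt_dir.name[len("dt="):] < before:
                shutil.rmtree(dt_dir)  # one directory delete, no row-level scan
                removed.append(dt_dir.name)
    return removed
```

No data files are read during cleanup; the partition name alone identifies what can be dropped.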
Step S13: update the data state change table. When the state of data changes, the recorded data state must be updated. Following steps S11 and S12, updating the data state change table requires inserting the key name, key value, state change start time, and state change end time of the online class data, where the state change start time is the change time and the state change end time is the maximum time, and setting the state change end time of the expired class data to the change time. Data that has not changed requires no state update.
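The update rule of step S13 can be sketched as follows. This is a minimal sketch under stated assumptions, not the patent's implementation: rows are plain dictionaries with hypothetical field names `key`, `value`, `start_date`, and `end_date`; a row counts as currently valid when its `end_date` equals the maximum time; and rows for archived data are left untouched because the description does not say how the table treats them.

```python
from typing import Dict, List

MAX_DATE = "4712-12-31"  # maximum time, as in the examples of the description


def update_state_table(
    rows: List[Dict[str, str]], today: Dict[str, str], change_date: str
) -> None:
    """Apply one day's data to the state change table in place."""
    # Index the currently valid row for each key name.
    open_rows = {r["key"]: r for r in rows if r["end_date"] == MAX_DATE}
    for key, value in today.items():
        current = open_rows.get(key)
        if current is not None and current["value"] == value:
            continue  # unchanged data: no state update
        if current is not None:
            # Expired class: close the previous state at the change time.
            current["end_date"] = change_date
        # Online class: insert the new state, open until the maximum time.
        rows.append(
            {"key": key, "value": value, "start_date": change_date, "end_date": MAX_DATE}
        )
```

Because `start_date` is part of the primary key, the closed row and the new row for the same key name can coexist in the table.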
Fig. 4 is a schematic diagram of the data state change table according to an embodiment of the present invention. The upper-left table holds the data for 2014-01-01 and the upper-right table holds the data for 2014-01-02. The existing technical solution saves a snapshot of each day's data; looking up a piece of data or performing a computation then requires a full scan of all snapshots, which wastes both storage space and system resources. The solution of the present invention instead compares the 2014-01-02 data in the upper-right table with the 2014-01-01 data in the upper-left table and adds records only for the data items that have changed; unchanged data items are left as they are. In addition, the structure of the data state change table introduces the audit fields start_date and end_date to mark the start and end times of a data state change, and, to distinguish data more clearly, the audit field start_date is added to the primary key of the table.
In Fig. 4, comparing the 2014-01-02 data in the upper-right table with the 2014-01-01 data in the upper-left table yields the data state change table mytable shown below the arrow. In mytable, the primary key consists of the key name key and the start time start_date of the data state change, and each record is distinguished by its primary key. Online data records are usually subject to three kinds of operation: Insert indicates the creation of a new record; Delete indicates the end of a record's online validity; Update is equivalent to a combined Delete/Insert operation and indicates a transition in the record's state, that is, the end of the previous state and the creation of a new state. For example, comparing the data of 2014-01-02 with the data of 2014-01-01 shows that the data with key 1 has changed (Update). In mytable, the end_date of the record whose primary key is key 1 and start_date 2014/1/1 is therefore set to the change time, and a new record is added whose primary key is key 1 with start_date equal to the change time. Similarly, the data with key 4 is simply added to mytable (Insert). By comparing each day's data with the data in the data state change table, the changed data can be found and its state recorded in the manner of mytable, so no daily snapshot is needed. This saves storage space, keeps the history continuous in time, and provides a basis for subsequent retrieval and analysis.
With the data storage method described in steps S11 to S13 and the data storage structure and directory division of the present invention, data can be queried directly by writing SQL statements as retrieval and computation require. For example, to look up the state of key "1" on 2014-01-01 in the table mytable of Fig. 4, the following SQL statement can be written:
SELECT * FROM mytable WHERE start_date <= '2014-01-01' AND end_date > '2014-01-01' AND key = '1';
To look up the states of key "1" during the period from 2014-01-01 to 2014-01-02 in mytable, the following SQL statement can be written:
SELECT * FROM mytable WHERE start_date <= '2014-01-02' AND end_date >= '2014-01-01' AND key = '1';
To look up the current, latest state of key "1" in mytable, the following SQL statement can be written:
SELECT * FROM mytable WHERE (dp = 'ACTIVE' OR dp = 'HISTORY') AND key = '1';
Querying data state directly with SQL statements in this way allows the directories to be pre-filtered, so that not all directories have to be traversed, and retrieval and computation complete with minimal resource usage.
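The point-in-time and period queries can be tried out with an in-memory SQLite database. This is a sketch for illustration only: the patent targets a Hadoop-based platform, so SQLite is a stand-in, and the row values ("a", "a2", "d") are hypothetical because the contents of Fig. 4 are not reproduced in the text. The partition column dp is omitted, so the "latest state" query is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The primary key combines the key name and the state change start time.
conn.execute(
    'CREATE TABLE mytable ("key" TEXT, value TEXT, start_date TEXT, end_date TEXT, '
    'PRIMARY KEY ("key", start_date))'
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        ("1", "a", "2014-01-01", "2014-01-02"),   # state closed by the update
        ("1", "a2", "2014-01-02", "4712-12-31"),  # current state of key 1
        ("4", "d", "2014-01-02", "4712-12-31"),   # newly inserted key 4
    ],
)


def state_on(day: str, key: str) -> list:
    """State of `key` on a single day: start_date <= day < end_date."""
    return conn.execute(
        'SELECT value FROM mytable '
        'WHERE start_date <= ? AND end_date > ? AND "key" = ?',
        (day, day, key),
    ).fetchall()


def states_between(first: str, last: str, key: str) -> list:
    """All states of `key` that overlap the period [first, last]."""
    return conn.execute(
        'SELECT value FROM mytable '
        'WHERE start_date <= ? AND end_date >= ? AND "key" = ? '
        'ORDER BY start_date',
        (last, first, key),
    ).fetchall()
```

ISO-formatted date strings compare correctly as text, which is why the date columns can be stored as TEXT in this sketch.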
Fig. 5 is a schematic diagram of the main modules of an apparatus for data storage on a distributed data platform according to an embodiment of the present invention. As shown in Fig. 5, the apparatus 50 for data storage on a distributed data platform mainly comprises a data classification module 51, a data storage module 52, and a state update module 53.
The data classification module 51 is configured to compare the current day's data with the data in the data state change table and classify the data that has changed; the data storage module 52 is configured to collect the classified data into different directories and store it under the corresponding partitions according to each directory's data storage rule; and the state update module 53 is configured to update the data state change table.
When classifying data, the data classification module 51 works according to the stage of the data life cycle, and the classification comprises three types: online class, expired class, and archived class.
The data classification module 51 may be further configured to look up the key name of the data and compare the current day's data with the data in the data state change table; if the data does not exist in the data state change table but exists in the current day's data, the data is of the online class; if the data exists in both the data state change table and the current day's data but its key value differs, the data in the data state change table is of the expired class and the data in the current day's data is of the online class; and if the data exists in the data state change table but not in the current day's data, the data is of the archived class.
When the data storage module 52 stores data, the data storage rule it follows comprises three directory levels: partition name, data time, and data lifetime end time. The partition names comprise an online partition, an expired partition, and an archive partition.
The data storage module 52 may be further configured so that: for online class data, the first-level directory (partition name) is the online partition, the second-level directory (data time) is the maximum time, and the third-level directory (data lifetime end time) is the maximum time; for expired class data, the first-level directory (partition name) is the expired partition, the second-level directory (data time) is the change time, and the third-level directory (data lifetime end time) is the change time; and for archived class data, the first-level directory (partition name) is the archive partition, the second-level directory (data time) is the change time, and the third-level directory (data lifetime end time) is the maximum time.
The state update module 53 may be further configured to insert the key name, key value, state change start time, and state change end time of the online class data, where the state change start time is the change time and the state change end time is the maximum time, and to set the state change end time of the expired class data to the change time.
Fig. 6 is a schematic diagram comparing the storage effect of an embodiment of the present invention with that of the prior art. Compared with the accumulation approach of the prior art, the data storage scheme of the present invention effectively saves storage space. Take a data table at the hundred-million-row level whose daily additions and changes are each at roughly the million-row level. The space saving rate can be calculated by a formula (not reproduced in this text) with the following parameters: base, the base size (hundred-million level); N, the number of days; C, the daily increment (million level); and M, the daily changes (million level). As N tends to infinity, the space saving rate tends to 1; that is, the longer the time span, the more space is saved. In practice, the space saving rate can exceed 90%. The technical solution of the present invention therefore saves storage space effectively and retains the historical track of all data with minimal storage.
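Since the formula itself is not available here, the following sketch uses one plausible storage model to show how a saving rate with these parameters behaves. The model is an assumption, not the patent's formula: daily snapshots are taken to store base + i*C rows on day i, while the change-based scheme is taken to store the base once plus C + M rows per day. The function name `space_saving_rate` is hypothetical.

```python
def space_saving_rate(base: int, n_days: int, c: int, m: int) -> float:
    """Fraction of rows saved versus daily snapshots, under the assumed model.

    base   -- initial table size (rows)
    n_days -- number of days N
    c      -- rows added per day (C)
    m      -- rows changed per day (M)
    """
    # Assumed snapshot cost: the full table is stored again every day.
    snapshot_total = sum(base + i * c for i in range(1, n_days + 1))
    # Assumed change-based cost: the base once, then only added and changed rows.
    incremental_total = base + n_days * (c + m)
    return 1 - incremental_total / snapshot_total
```

Under this assumed model the snapshot cost grows quadratically in N while the change-based cost grows linearly, so the rate approaches 1 as N increases, which matches the limiting behavior stated above.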
According to the technical solution of the embodiments of the present invention, data needs to be classified, stored, and have its state updated only when its state changes; unchanged data requires no second storage or state update. Data changes are thus recorded effectively while the efficiency of data storage and data retrieval is improved, storage space is saved, and expired data is easy to clean up.
The specific embodiments above do not limit the scope of protection of the present invention. Those skilled in the art should understand that, depending on design requirements and other factors, various modifications, combinations, sub-combinations, and substitutions may occur. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (12)
1. A method for data storage on a distributed data platform, characterized by comprising:
comparing the current day's data with the data in a data state change table and classifying the data that has changed, wherein the classification is performed according to the stage of the data life cycle and comprises three types, namely online class, expired class, and archived class, and the data state change table distinguishes data by a primary key comprising the key name and the start time of the data state change;
collecting the classified data into different directories and storing it under the corresponding partitions according to each directory's data storage rule, wherein the partition names comprise an online partition, an expired partition, and an archive partition; and
updating the data state change table.
2. The method according to claim 1, characterized in that the step of classifying the changed data comprises:
looking up the key name of the data and comparing the current day's data with the data in the data state change table;
if the data does not exist in the data state change table but exists in the current day's data, the data is of the online class;
if the data exists in both the data state change table and the current day's data but its key value differs, the data in the data state change table is of the expired class and the data in the current day's data is of the online class; and
if the data exists in the data state change table but not in the current day's data, the data is of the archived class.
3. The method according to claim 1, characterized in that the data storage rule comprises three directory levels: partition name, data time, and data lifetime end time.
4. The method according to claim 1, characterized in that the step of storing under the corresponding partitions according to the directory's data storage rule comprises:
for online class data, the first-level directory (partition name) is the online partition, the second-level directory (data time) is the maximum time, and the third-level directory (data lifetime end time) is the maximum time;
for expired class data, the first-level directory (partition name) is the expired partition, the second-level directory (data time) is the change time, and the third-level directory (data lifetime end time) is the change time; and
for archived class data, the first-level directory (partition name) is the archive partition, the second-level directory (data time) is the change time, and the third-level directory (data lifetime end time) is the maximum time.
5. The method according to claim 1, characterized in that the step of updating the data state change table comprises:
inserting the key name, key value, state change start time, and state change end time of the online class data, wherein the state change start time is the change time and the state change end time is the maximum time; and
setting the state change end time of the expired class data to the change time.
6. An apparatus for data storage on a distributed data platform, characterized by comprising:
a data classification module, configured to compare the current day's data with the data in a data state change table and classify the data that has changed, wherein the classification is performed according to the stage of the data life cycle and comprises three types, namely online class, expired class, and archived class, and the data state change table distinguishes data by a primary key comprising the key name and the start time of the data state change;
a data storage module, configured to collect the classified data into different directories and store it under the corresponding partitions according to each directory's data storage rule, wherein the partition names comprise an online partition, an expired partition, and an archive partition; and
a state update module, configured to update the data state change table.
7. The apparatus according to claim 6, wherein the data classification module is further configured to:
compare the current day's data with the data in the data state change table by looking up the key name of the data;
if a data item is absent from the data state change table but present in the current day's data, classify the data item as online class;
if a data item is present in both the data state change table and the current day's data but its key values differ, classify the data item in the data state change table as expired class and the data item in the current day's data as online class; and
if a data item is present in the data state change table but absent from the current day's data, classify the data item as archive class.
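The three comparison rules of claim 7 can be sketched in Python as follows. This is an illustrative sketch only: the `classify` function and the representation of both data sets as plain dictionaries from key name to key value are assumptions, not part of the patent.

```python
def classify(state_table, today):
    """Classify changed data by key name.

    state_table: dict of key name -> key value currently recorded in the
                 data state change table (hypothetical in-memory form).
    today:       dict of key name -> key value in the current day's data.
    Returns three dicts: (online, expired, archive).
    """
    online, expired, archive = {}, {}, {}
    for key, value in today.items():
        if key not in state_table:
            # Only in today's data: a new item, online class.
            online[key] = value
        elif state_table[key] != value:
            # In both, but the key value differs: the recorded version is
            # expired class and today's version is online class.
            expired[key] = state_table[key]
            online[key] = value
        # In both with equal values: unchanged, so not classified.
    for key, value in state_table.items():
        if key not in today:
            # Only in the state table: archive class.
            archive[key] = value
    return online, expired, archive
```

Unchanged items fall into none of the three classes, which matches the claim's wording that only changed data is classified.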
8. The apparatus according to claim 6, wherein the data storage rule comprises three directory levels: partition name, data time, and data lifecycle end time.
9. The apparatus according to claim 6, wherein the data storage module is further configured such that:
for the online-class data, the first-level directory (partition name) is the online partition, the second-level directory (data time) is the maximum time, and the third-level directory (data lifecycle end time) is the maximum time;
for the expired-class data, the first-level directory (partition name) is the expired partition, the second-level directory (data time) is the change time, and the third-level directory (data lifecycle end time) is the change time; and
for the archive-class data, the first-level directory (partition name) is the archive partition, the second-level directory (data time) is the change time, and the third-level directory (data lifecycle end time) is the maximum time.
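The three-level directory rule of claims 8 and 9 can be illustrated with a short Python sketch. The `partition_path` function, the `RULES` table, the `/warehouse/table` root, the ISO date formatting, and the use of `date.max` as the "maximum time" are all assumptions chosen for illustration; the claims do not specify a path syntax.

```python
from datetime import date

MAX_TIME = date.max  # assumed sentinel standing in for the "maximum time"

# For each class: (partition name, data time, data lifecycle end time),
# where "max" means the maximum time and "change" means the change time.
RULES = {
    "online": ("online", "max", "max"),
    "expired": ("expired", "change", "change"),
    "archive": ("archive", "change", "max"),
}


def partition_path(category, change_time, root="/warehouse/table"):
    """Build the hypothetical three-level directory path for one class."""
    partition, data_time_kind, end_time_kind = RULES[category]

    def pick(kind):
        return MAX_TIME if kind == "max" else change_time

    data_time = pick(data_time_kind).isoformat()
    end_time = pick(end_time_kind).isoformat()
    return f"{root}/{partition}/{data_time}/{end_time}"
```

Under these assumptions, all online-class data shares a single directory keyed by the maximum time, while expired-class and archive-class data is grouped by the day on which the change was detected.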
10. The apparatus according to claim 6, wherein the state update module is further configured to:
insert the key name, key value, state change start time, and state change end time of the online-class data, wherein the state change start time is the change time and the state change end time is the maximum time; and
set the state change end time of the expired-class data to the change time.
11. An electronic device for data storage on a distributed data platform, comprising:
one or more processors; and
a storage device configured to store one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1 to 5.
12. A computer-readable medium storing a computer program, wherein the program, when executed by a processor, implements the method according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510598396.9A (published as CN105183391B) | 2015-09-18 | 2015-09-18 | The method and apparatus that data store under a kind of distributed data platform |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105183391A | 2015-12-23 |
CN105183391B | 2018-12-28 |
Family
ID=54905500
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510598396.9A (Active, published as CN105183391B) | The method and apparatus that data store under a kind of distributed data platform | 2015-09-18 | 2015-09-18 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105183391B (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106897423A (en) * | 2017-02-24 | 2017-06-27 | 郑州云海信息技术有限公司 | A kind of cloud platform junk data processing method and system |
CN108052281A (en) * | 2017-11-30 | 2018-05-18 | 平安科技(深圳)有限公司 | Business Information storage method, application server and computer storage media |
CN109145052B (en) * | 2018-07-12 | 2021-10-08 | 北京炎黄盈动科技发展有限责任公司 | Data partition storage method, device, system, storage medium and electronic device |
CN110347655A (en) * | 2019-06-12 | 2019-10-18 | 江苏富山软件科技有限公司 | A kind of distributed file system access frame |
CN110865990A (en) * | 2019-10-12 | 2020-03-06 | 中国平安财产保险股份有限公司 | Non-real-time data exchange method, system and computer equipment |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101876983A (en) * | 2009-04-30 | 2010-11-03 | 国际商业机器公司 | Method for partitioning database and system thereof |
CN102141963A (en) * | 2010-01-28 | 2011-08-03 | 阿里巴巴集团控股有限公司 | Method and equipment for analyzing data |
CN102567428A (en) * | 2010-12-30 | 2012-07-11 | 中国移动通信集团浙江有限公司 | Method and device for managing life cycle of online data |
CN103186566A (en) * | 2011-12-28 | 2013-07-03 | 中国移动通信集团河北有限公司 | Data classification storage method, device and system |
CN103838787A (en) * | 2012-11-27 | 2014-06-04 | 阿里巴巴集团控股有限公司 | Method and device for updating distributed data warehouse |
Also Published As
Publication number | Publication date |
---|---|
CN105183391A (en) | 2015-12-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105183391B (en) | The method and apparatus that data store under a kind of distributed data platform | |
CN104933133B (en) | Meta-data snap in distributed file system stores and accesses method | |
US7228299B1 (en) | System and method for performing file lookups based on tags | |
CN101127034B (en) | Data organization, inquiry, presentation, documentation, recovery, deletion, refining method, device and system | |
US8626717B2 (en) | Database backup and restore with integrated index reorganization | |
US7765215B2 (en) | System and method for providing a trustworthy inverted index to enable searching of records | |
CN101278289B (en) | System and method for providing an object to support data structures in WORM storage | |
CN100468402C (en) | Sort data storage and split catalog inquiry method based on catalog tree | |
CN100583832C (en) | Data management method and system | |
CN102375853A (en) | Distributed database system, method for building index therein and query method | |
CN102737133B (en) | A kind of method of real-time search | |
CN109284273B (en) | Massive small file query method and system adopting suffix array index | |
CN104572920A (en) | Data arrangement method and data arrangement device | |
CN110888837B (en) | Object storage small file merging method and device | |
US11681691B2 (en) | Presenting updated data using persisting views | |
CN102024019B (en) | Suffix tree based catalog organizing method in distributed file system | |
US7783589B2 (en) | Inverted index processing | |
US10613988B2 (en) | Purging storage partitions of databases | |
CN102779138A (en) | Hard disk access method of real time data | |
CN103353901A (en) | Orderly table data management method and system based on Hadoop distributed file system (HDFS) | |
CN114780530A (en) | Time sequence data storage method and system based on LSM tree key value separation | |
CN113656397A (en) | Index construction and query method and device for time series data | |
CN101963993B (en) | Method for fast searching database sheet table record | |
CN107169003B (en) | Data association method and device | |
CN109325022B (en) | Data processing method and device | |
Legal Events
Code | Title |
---|---|
C06 | Publication |
PB01 | Publication |
C10 | Entry into substantive examination |
SE01 | Entry into force of request for substantive examination |
GR01 | Patent grant |