CN103049519A - Data uploading method and data uploading device - Google Patents


Info

Publication number
CN103049519A
Authority
CN
China
Prior art keywords
data
subregion
data table
index
ephemeral
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012105537786A
Other languages
Chinese (zh)
Inventor
宋怀明
杨浩
苗艳超
刘新春
邵宗有
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dawning Information Industry Beijing Co Ltd
Original Assignee
Dawning Information Industry Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dawning Information Industry Beijing Co Ltd
Priority to CN2012105537786A
Publication of CN103049519A
Legal status: Pending

Abstract

The invention discloses a data uploading method. The data uploading method comprises the steps of: uploading data to be uploaded into a temporary data sheet in batches; displacing the data in the temporary data sheet to an intermediate data sheet based on a preset time interval; building indexes for the data placed in the intermediate data sheet; and displacing the data having indexes from the intermediate data sheet to a target data sheet. The invention also discloses a data uploading device. By the adoption of the data uploading method and the data uploading device, data uploading efficiency can be improved.

Description

Data loading method and data loading device
Technical field
The present invention relates to the field of computer data processing and, more specifically, to a data loading method and a data loading device.
Background
With the development of information technology, data volume is growing explosively. On the one hand, a mass data processing system must store data as soon as it arrives, which requires the checks and decisions made at load time to be as simple as possible. On the other hand, fast retrieval requires that all data be stored in an ordered manner and indexed according to commonly used query conditions, so that query results can be obtained quickly when needed. Optimizing indexes and storage while data is being loaded requires analyzing and computing the features of the incoming data and ordering the data by those features, which greatly limits write speed. High-speed loading therefore conflicts with ordered organization of the data (index updates and similar operations).
Summary of the invention
To solve the problems of the prior art, the invention provides a data loading method and a data loading device that improve the efficiency of data loading.
According to one aspect of the present invention, a data loading method is provided, comprising:
Step A: loading the data to be loaded into a temporary data table in batches;
Step B: swapping the data in the temporary data table into an intermediate data table at a preset time interval;
Step C: creating an index for the data placed in the intermediate data table; and
Step D: swapping the indexed data from the intermediate data table into a target data table.
In an optional embodiment, the temporary data table and the target data table are partitioned by time range with the same partitioning strategy, wherein the Part_n partition stores data whose time tag falls in (T_{n-1}, T_n], T_{n-1} denotes the end time of the previous period, T_n denotes the end time of the current period, n>1, and Part_n denotes the current partition;
Step A comprises: writing data that arrives in chronological order into the Part_n partition of the temporary data table according to the time tag carried in the data;
Step B comprises: swapping the data of the Part_n partition of the temporary data table into a first intermediate data table;
Step C comprises: building an index on the first intermediate data table for the data in the first intermediate data table; and
Step D comprises: swapping the indexed data in the first intermediate data table into the Part_n partition of the target data table.
In an optional embodiment, the temporary data table and the target data table are partitioned by time range with the same partitioning strategy, wherein the Part_{n-1} partition stores data whose time tag falls in (T_{n-2}, T_{n-1}] and the Part_n partition stores data whose time tag falls in (T_{n-1}, T_n]; the temporary data table comprises the Part_{n-1}, Part_n, Part_{n+1}, ..., Part_{n+k} partitions; the partitions of the target data table are determined by design requirements; T_{n-1} denotes the end time of the previous period, T_n denotes the end time of the current period, Part_n denotes the current partition, n>1, and k>0;
Step A comprises: writing data that arrives in chronological order into the Part_n partition of the temporary data table according to the time tag carried in the data, and writing data that arrives late into the Part_{n-1} partition;
Step B comprises: swapping the data of the Part_n partition of the temporary data table into a first intermediate data table, and swapping the data of the Part_{n-1} partition of the temporary data table into a second intermediate data table;
Step C comprises: building an index on the first intermediate data table for the data in the first intermediate data table;
Step D comprises: swapping the indexed data in the first intermediate data table into the Part_n partition of the target data table; and
after Step B, the method further comprises: inserting the data in the second intermediate data table into the Part_{n-1} partition of the target data table.
In an optional embodiment, the data loading method further comprises, after Step A:
on receiving an instruction not to build an index on the intermediate data table, swapping the data in the temporary data table into the target data table, and creating an index for the data in the target data table.
According to another aspect of the invention, a data loading device is also provided, comprising:
an initial loading unit, configured to load the data to be loaded into a temporary data table in batches;
a first index building unit, configured to swap the data in the temporary data table into an intermediate data table at a preset time interval, and to create an index for the data placed in the intermediate data table; and
a target loading unit, configured to swap the indexed data from the intermediate data table into a target data table.
In an optional embodiment, the temporary data table and the target data table are partitioned by time range with the same partitioning strategy, wherein the Part_n partition stores data whose time tag falls in (T_{n-1}, T_n], T_{n-1} denotes the end time of the previous period, T_n denotes the end time of the current period, n>1, and Part_n denotes the current partition;
the initial loading unit is further configured to write data that arrives in chronological order into the Part_n partition of the temporary data table according to the time tag carried in the data;
the first index building unit is further configured to swap the data of the Part_n partition of the temporary data table into a first intermediate data table, and to build an index on the first intermediate data table for the data in it;
the target loading unit is further configured to swap the indexed data in the first intermediate data table into the Part_n partition of the target data table.
In an optional embodiment, the temporary data table and the target data table are partitioned by time range with the same partitioning strategy, wherein the Part_{n-1} partition stores data whose time tag falls in (T_{n-2}, T_{n-1}] and the Part_n partition stores data whose time tag falls in (T_{n-1}, T_n]; the temporary data table comprises the Part_{n-1}, Part_n, Part_{n+1}, ..., Part_{n+k} partitions; the partitions of the target data table are determined by design requirements; T_{n-1} denotes the end time of the previous period, T_n denotes the end time of the current period, Part_n denotes the current partition, n>1, and k>0;
the initial loading unit is further configured to write data that arrives in chronological order into the Part_n partition of the temporary data table according to the time tag carried in the data, and to write data that arrives late into the Part_{n-1} partition;
the first index building unit is further configured to swap the data of the Part_n partition of the temporary data table into a first intermediate data table, and to build an index on the first intermediate data table for the data in it;
the data loading device further comprises a delayed data processing unit, configured to swap the data of the Part_{n-1} partition of the temporary data table into a second intermediate data table;
the target loading unit is further configured to swap the indexed data in the first intermediate data table into the Part_n partition of the target data table, and to insert the data in the second intermediate data table into the Part_{n-1} partition of the target data table.
In an optional embodiment, the initial loading unit is further configured, on receiving an instruction not to build an index on the intermediate data table, to send an instruction not to perform its operation to the first index building unit and an instruction to perform its operation to the target loading unit; the target loading unit is further configured, on receiving the instruction from the initial loading unit, to swap the data in the temporary data table into the target data table; and the data loading device further comprises a second index building unit, configured to create an index for the data in the target data table.
According to yet another aspect of the invention, a data loading device is also provided, comprising:
an initial loading unit, configured to load the data to be loaded into a temporary data table in batches; a target loading unit, configured to swap the data in the temporary data table into a target data table; and an index building unit, configured to create an index for the data in the target data table.
According to a further aspect of the present invention, a data loading method is also provided, comprising: loading the data to be loaded into a temporary data table in batches; and swapping the data in the temporary data table into a target data table and creating an index for the data in the target data table.
The embodiments of the invention separate data writing and index creation into two stages, so that the index state is not updated directly while data is being written. Because data is written without an index, it can be loaded in batches, which improves write performance. In addition, because the index is created over the written data in one pass at a later stage, the performance of index maintenance is also improved.
Description of the drawings
Fig. 1 is a schematic flowchart of a data loading method according to an embodiment of the invention.
Fig. 2 is a schematic flowchart of a data loading method according to another embodiment of the invention.
Fig. 3 shows the typical distribution of delayed data.
Fig. 4 is a schematic flowchart of a data loading method according to a further embodiment of the invention.
Fig. 5 is an example of the data loading method according to an embodiment of the invention.
Fig. 6 is a schematic structural diagram of a data loading device according to an embodiment of the invention.
Fig. 7 is a schematic structural diagram of a data loading device according to another embodiment of the invention.
Detailed description
The present invention is described in further detail below with reference to the accompanying drawings.
Fig. 1 is a schematic flowchart of a data loading method according to an embodiment of the invention. As shown in Fig. 1, the data loading method comprises:
S101: loading the data to be loaded into a temporary data table in batches;
S102: swapping the data in the temporary data table into an intermediate data table at a preset time interval;
S103: creating an index for the data placed in the intermediate data table;
S104: swapping the indexed data from the intermediate data table into a target data table.
In embodiments of the present invention, loading data does not directly update the index state in the database. Instead, the data to be loaded is first written into a temporary data table that has no index. Because no index has to be built while the data is written, the data can be loaded in batches, which speeds up loading and improves write performance.
Then the data in the temporary data table is periodically swapped into the intermediate data table, and an index is created for that data in one pass on the intermediate data table. After the data has been swapped into the intermediate data table, the data in the temporary data table can be discarded.
Next, the indexed data is swapped from the intermediate data table into the target data table. While the index is being created in one pass, newly received data can still be loaded into the temporary data table. Because the two activities run in parallel, index creation and maintenance perform better than in the prior art, where the index is updated as the data is loaded. The embodiments of the invention are suitable for applications that analyze users' historical behavior and for measurement-type applications, for example load optimization for telephone call records, bank transaction records, sensor network data, and mobile Internet behavior data.
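The four steps S101 to S104 can be sketched as a toy in-memory model. This is only an illustration under stated assumptions, not the patented implementation: the class name `DeferredIndexLoader`, the `key` field, and the use of Python lists and dictionaries in place of database tables and indexes are all hypothetical.

```python
class DeferredIndexLoader:
    """Toy model of S101-S104: temporary -> intermediate -> index -> target."""

    def __init__(self):
        self.temporary = []     # unindexed staging rows (hypothetical stand-in for the temporary table)
        self.target_rows = []   # rows visible to queries
        self.target_index = {}  # key -> rows; only ever updated one batch at a time

    def load_batch(self, rows):
        # S101: append a whole batch; no index is touched during the write.
        self.temporary.extend(rows)

    def flush(self):
        # S102: swap the temporary table's contents into an intermediate table.
        intermediate, self.temporary = self.temporary, []
        # S103: build the index over the intermediate table in one pass.
        index = {}
        for row in intermediate:
            index.setdefault(row["key"], []).append(row)
        # S104: move the indexed data into the target table.
        self.target_rows.extend(intermediate)
        for key, rows in index.items():
            self.target_index.setdefault(key, []).extend(rows)
        return len(intermediate)
```

In this sketch `flush` would be called at the preset time interval, and `load_batch` can keep accepting new rows between calls because it only touches the staging list.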
In an optional embodiment of the present invention, the data loading method may further comprise: after step S101, if an instruction not to build an index on the intermediate data table is received, swapping the data in the temporary data table into the target data table, and creating an index for the data in the target data table.
Fig. 2 is a schematic flowchart of a data loading method according to another embodiment of the invention. As shown in Fig. 2, the data loading method comprises:
S201: writing the data into the Part_n partition of the temporary data table according to the time tag carried in the data.
In embodiments of the present invention, the temporary data table and the target data table are partitioned by time range with the same partitioning strategy, so their partitions have the same size. The Part_n partition stores data whose time tag falls in (T_{n-1}, T_n], where T_{n-1} denotes the end time of the previous period, T_n denotes the end time of the current period, Part_n denotes the current partition, and n>1. For example, when the tables are partitioned by month, the Part_n partition stores the data from the end of last month to the end of this month.
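The half-open interval (T_{n-1}, T_n] can be made concrete with a small lookup function. This is a minimal sketch; the function name `partition_for` and the representation of the period end times as a sorted list `boundaries = [T_0, T_1, ..., T_m]` are assumptions made for illustration.

```python
from bisect import bisect_left


def partition_for(timestamp, boundaries):
    """Return n such that boundaries[n-1] < timestamp <= boundaries[n].

    boundaries is the sorted list [T_0, T_1, ..., T_m] of period end times,
    so Part_n covers the half-open interval (T_{n-1}, T_n].
    """
    n = bisect_left(boundaries, timestamp)
    if n == 0 or n == len(boundaries):
        raise ValueError("timestamp outside the partitioned range")
    return n
```

Using `bisect_left` rather than `bisect_right` is what places a record stamped exactly at T_n in Part_n instead of Part_{n+1}.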
S202: swapping the data of the Part_n partition of the temporary data table into a first intermediate data table.
S203: building an index on the first intermediate data table for the data in the first intermediate data table.
S204: swapping the indexed data in the first intermediate data table into the Part_n partition of the target data table.
In embodiments of the present invention, every data record carries a time tag, and the target data table is partitioned by time range. In addition, local indexes are created on the partitioned table, that is, each data partition has a corresponding index partition. A partition exchange can be performed only if the data in the temporary data table satisfies the partitioning characteristics of the target data table. The partition exchange strategy works by modifying the database's data dictionary, so data can be moved quickly from the temporary data table to the indexed data table. The data dictionary is the metadata in the database that describes the definitions and storage locations of data objects. A partition exchange does not migrate any data blocks; it only modifies the name information of the data objects on both sides of the exchange, so it is very efficient. With the deferred indexing of the embodiments, data becomes queryable one partition period after it enters the database system, so the partition should not be too large. In general, the time range of one partition plus the deferred indexing time must be less than the delay that the database system allows. In experiments, the inventors found that for most application systems, loading into the temporary data table puts relatively little pressure on the system because the table has no index, so the deferred indexing time is short. In general, the deferred indexing time is less than 20% of the original data loading time, so the requirement can be met as long as the time partition size does not exceed 80% of the system requirement.
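The sizing rule at the end of this paragraph is simple arithmetic, sketched below. The function names, the `index_percent` parameter, and the reading of "80% of the system requirement" as 80% of the allowed delay are assumptions for illustration; the 20% figure is the source's rule of thumb, not a guarantee.

```python
def max_partition_span(allowed_delay, index_percent=20):
    """Largest partition time span that leaves room for deferred indexing.

    allowed_delay: longest time the system may take before data is queryable.
    index_percent: share of that budget reserved for index creation
                   (20 follows the rule of thumb in the text; an assumption).
    """
    if not 0 <= index_percent < 100:
        raise ValueError("index_percent must be in [0, 100)")
    return allowed_delay * (100 - index_percent) / 100


def meets_delay(partition_span, index_time, allowed_delay):
    """Check: partition span + deferred indexing time < allowed delay."""
    return partition_span + index_time < allowed_delay
```

For instance, with an allowed delay of 60 time units the sketch yields a partition span of at most 48 units, which satisfies the constraint as long as indexing takes less than the remaining 12.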
In typical application systems where data arrives and is written at high speed, queries are sensitive to time ranges, so partitioning by time is used. However, the temporary data table and the target data table may also be partitioned in other ways according to design requirements.
In addition, the inventors found that: (1) most data arrives in a definite time order, but a small fraction may arrive late because of transmission or other reasons, so a small amount of data falls into other time partitions, which introduces errors or difficulty into the partition exchange; and (2) data arrives in the database system continuously, so the relevant data dictionary of the database may not be modified while data is being loaded.
Fig. 3 shows the typical distribution of delayed data. In systems where data is written at high speed, data delay follows a long-tail distribution. For example, in a short message system most messages are sent and received in real time, and only a few are delayed. Most data therefore arrives with a very short delay and only a small amount arrives with a larger delay, so the incoming data forms a mostly ordered write stream. Delayed and out-of-order data still occurs, however. Late data modifies data sets that already exist in the system, so indexes that have already been created must be updated.
To address the small amount of delayed and out-of-order data, the present invention further proposes another embodiment of the data loading method. Fig. 4 is a schematic flowchart of a data loading method according to a further embodiment of the invention. As shown in Fig. 4, the data loading method comprises the following steps:
S401: writing data that arrives in chronological order into the Part_n partition of the temporary data table according to the time tag carried in the data, and writing data that arrives late into the Part_{n-1} partition.
In the present embodiment, the data loading method uses a temporary data table, intermediate data tables, and a target data table. The data in the temporary data table and the target data table is stored in time-range partitions with the same partitioning strategy, so the partitions also have the same size.
The data to be loaded is written directly into the temporary data table, which has no index. Part_n is the current partition and stores data whose time tag falls in (T_{n-1}, T_n]. Part_{n-1} is the partition immediately before the current one and stores data whose time tag falls in (T_{n-2}, T_{n-1}]. Here T_{n-1} denotes the end time of the previous period and T_n denotes the end time of the current period. In embodiments of the present invention, the temporary data table keeps the Part_{n-1} partition and the partitions after it, namely Part_{n-1}, Part_n, Part_{n+1}, ..., Part_{n+k}, with k>0. The target data table keeps the partitions for the required time period, or all partitions, according to design requirements, for example Part_1, Part_2, ..., Part_{n-1}, Part_n, Part_{n+1}, ..., Part_{n+k}, with k>0.
In the present embodiment, data that arrives in chronological order is written into the Part_n partition of the temporary data table. The small fraction of data that arrives late has a time tag earlier than the time range of the current partition Part_n, so it is written into the Part_{n-1} partition of the temporary data table. Once time T_n is reached, most of the data that arrives afterwards is written into the Part_{n+1} partition of the temporary data table.
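The routing rule of S401 can be sketched as a single function. The name `route`, the string labels it returns, and the decision to reject records older than Part_{n-1} are assumptions; the source does not say what happens to data older than the previous partition.

```python
def route(record_time, t_prev2, t_prev, t_now):
    """Pick the temporary-table partition for one record.

    Part_{n-1} covers (t_prev2, t_prev], Part_n covers (t_prev, t_now],
    and anything stamped after t_now belongs to Part_{n+1}.
    """
    if t_prev < record_time <= t_now:
        return "Part_n"        # in-order arrival for the current period
    if t_prev2 < record_time <= t_prev:
        return "Part_{n-1}"    # late arrival from the previous period
    if record_time > t_now:
        return "Part_{n+1}"    # period has rolled over past T_n
    # Assumption: records older than Part_{n-1} are not handled in this sketch.
    raise ValueError("record is older than the retained partitions")
```

Only the `Part_{n-1}` branch leads to the slower row-by-row insert described later; the other two branches stay on the bulk exchange path.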
S402: swapping the data of the Part_n partition of the temporary data table into a first intermediate data table.
S403: swapping the data of the Part_{n-1} partition of the temporary data table into a second intermediate data table.
Steps S402 and S403 may be performed in either order or at the same time. After the swaps are complete, the Part_{n-1} partition of the temporary data table can be dropped.
S404: building an index on the first intermediate data table for the data in the first intermediate data table.
S405: inserting the data in the second intermediate data table into the Part_{n-1} partition of the target data table.
S406: swapping the indexed data in the first intermediate data table into the Part_n partition of the target data table.
Steps S404, S405, and S406 may be performed in a different order, and some of them may be performed at the same time.
An example according to embodiments of the invention is described in detail below.
Fig. 5 is an example of the data loading method according to an embodiment of the invention. In this example, data is written directly into the temporary data table TMP_TAB1 during loading. Data that arrives in order enters partition Part_n. The small fraction of data that arrives late has a time tag earlier than the current partition, so it is written into partition Part_{n-1}. When time T_n is reached, most of the loaded data enters the Part_{n+1} partition of TMP_TAB1, and the following operations are performed.
1) Exchange the Part_n partition of TMP_TAB1 with the temporary data table TMP_TAB2.
2) At the same time, exchange the Part_{n-1} partition of TMP_TAB1 with the temporary data table TMP_TAB3.
3) After the exchanges are complete, the Part_{n-1} partition of TMP_TAB1 no longer contains data, so drop it.
4) Insert the data of the temporary data table TMP_TAB3 into the corresponding partition Part_{n-1} of the data table TAB1, and empty TMP_TAB3 after the insertion is complete.
5) Create the required indexes on the temporary data table TMP_TAB2. The indexes can be created with the database's SQL index creation statements.
6) Exchange the indexed TMP_TAB2 table with partition Part_n of the data table TAB1. After the exchange is complete, the data is in TAB1 and TMP_TAB2 no longer contains data, so drop the indexes on TMP_TAB2.
In this example, the first three steps are completed by modifying the data dictionary (that is, the metadata), so their total execution time is almost negligible. Steps 4 and 5 can be executed in parallel. The late data in the temporary data table TMP_TAB3 is only part of the data that the Part_{n-1} partition of TAB1 should contain and accounts for a very small share of all loaded data, so inserting it directly has little impact on the system as a whole. Step 5 creates the index on TMP_TAB2, which logically affects neither the data being loaded (table TMP_TAB1) nor the data being queried (table TAB1), and index creation can also take full advantage of parallel optimization.
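The six operations can be traced with a toy catalog in which each table or partition name maps to a list of rows. This is a sketch under stated assumptions: the function `rollover`, the `ts` field, and the use of a sorted position list as the "index" are hypothetical, and a real database would perform the exchanges through its own partition exchange mechanism. Exchanges are modelled as swapping dictionary entries, which mirrors the point that only metadata changes and no rows are copied.

```python
def rollover(catalog, n):
    """Apply operations 1-6 to a catalog that maps object names to row lists."""
    cur, prev = f"TMP_TAB1.Part_{n}", f"TMP_TAB1.Part_{n - 1}"
    tgt_cur, tgt_prev = f"TAB1.Part_{n}", f"TAB1.Part_{n - 1}"

    # 1) and 2): metadata-only exchanges with the two empty helper tables.
    catalog[cur], catalog["TMP_TAB2"] = catalog["TMP_TAB2"], catalog[cur]
    catalog[prev], catalog["TMP_TAB3"] = catalog["TMP_TAB3"], catalog[prev]
    # 3): the previous partition of TMP_TAB1 is now empty, so drop it.
    del catalog[prev]
    # 4): late rows are inserted directly, then TMP_TAB3 is emptied.
    catalog[tgt_prev].extend(catalog["TMP_TAB3"])
    catalog["TMP_TAB3"] = []
    # 5): build the index over TMP_TAB2 in one pass (here: row positions by time tag).
    rows = catalog["TMP_TAB2"]
    index = sorted(range(len(rows)), key=lambda i: rows[i]["ts"])
    # 6): exchange the indexed table with the target partition.
    catalog[tgt_cur], catalog["TMP_TAB2"] = catalog["TMP_TAB2"], catalog[tgt_cur]
    return index
```

After the call, the list object that held the current partition's rows is the same object now registered under the target partition's name, which is the in-memory analogue of renaming objects in the data dictionary.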
Because this example uses multiple exchanges, it effectively solves both the problem of modifying the data dictionary during loading and the problem of a small fraction of data arriving late. In addition, in large-scale data-intensive systems, storage management for large volumes of continuously arriving data must use the input/output (I/O) bandwidth effectively, so continuous, ordered storage is necessary. To meet this requirement, this example creates the data table TAB1 and the temporary data table TMP_TAB1 in the same tablespace (a tablespace is a logical storage unit in the database and usually comprises several physical storage units, namely data files). Data is therefore stored in the data files of the target tablespace as soon as it arrives. This optimizes the system's storage design so that continuously arriving data is also stored contiguously, which improves I/O performance when the data is queried.
Fig. 6 is a schematic structural diagram of a data loading device according to an embodiment of the invention. As shown in Fig. 6, the data loading device comprises: an initial loading unit 10, configured to load the data to be loaded into a temporary data table in batches; a first index building unit 30, configured to swap the data in the temporary data table into an intermediate data table at a preset time interval and to create an index for the data placed in the intermediate data table; and a target loading unit 50, configured to swap the indexed data from the intermediate data table into a target data table.
According to another embodiment of the present invention, the temporary data table and the target data table are partitioned by time range with the same partitioning strategy, wherein the Part_n partition stores data whose time tag falls in (T_{n-1}, T_n], T_{n-1} denotes the end time of the previous period, T_n denotes the end time of the current period, n>1, and Part_n denotes the current partition.
In this embodiment, the initial loading unit 10 is further configured to write data that arrives in chronological order into the Part_n partition of the temporary data table according to the time tag carried in the data. The first index building unit 30 is further configured to swap the data of the Part_n partition of the temporary data table into a first intermediate data table and to build an index on the first intermediate data table for the data in it. The target loading unit 50 is further configured to swap the indexed data in the first intermediate data table into the Part_n partition of the target data table.
Fig. 7 is a schematic structural diagram of a data loading device according to another embodiment of the invention. As shown in Fig. 7, the data loading device comprises an initial loading unit 10, a first index building unit 30, a delayed data processing unit 40, and a target loading unit 50.
The initial loading unit 10 is configured to write data that arrives in chronological order into the Part_n partition of the temporary data table according to the time tag carried in the data, and to write data that arrives late into the Part_{n-1} partition.
In the present embodiment, the temporary data table and the target data table are partitioned by time range with the same partitioning strategy, wherein the Part_{n-1} partition stores data whose time tag falls in (T_{n-2}, T_{n-1}] and the Part_n partition stores data whose time tag falls in (T_{n-1}, T_n]. The temporary data table comprises the Part_{n-1}, Part_n, Part_{n+1}, ..., Part_{n+k} partitions, where T_{n-2} denotes the end time of the period before last, T_{n-1} denotes the end time of the previous period, T_n denotes the end time of the current period, Part_n denotes the current partition, n>1, and k>0. The partitions of the target data table are determined by design requirements; the table may contain all partitions or the partitions for a long period of time.
The first index building unit 30 is configured to swap the data of the Part_n partition of the temporary data table into a first intermediate data table and to build an index on the first intermediate data table for the data in it.
The delayed data processing unit 40 is configured to swap the data of the Part_{n-1} partition of the temporary data table into a second intermediate data table.
The target loading unit 50 is configured to swap the indexed data in the first intermediate data table into the Part_n partition of the target data table, and to insert the data in the second intermediate data table into the Part_{n-1} partition of the target data table.
In addition, write fashionablely in user's mass data, if there is no the situation of the situation of the out of order arrival of data or out of order arrival can be ignored, then can the subregion index be set to invalid, when data are written to target matrix fully, concentrate the index that creates this subregion.
According to an alternative embodiment of the invention, described data loading device can also comprise that the second index sets up the unit.The original upload unit, after will loading data and being loaded into the ephemeral data table, if receive the indication of not setting up index at the intermediate data table, the indication that then sends undo is set up the unit to the first index, sends simultaneously the indication of executable operations to the target loading unit.
Described target loading unit also is used for when the indication that receives from the executable operations of original upload unit, with the data replacement in the described ephemeral data table in described target matrix.The unit set up in the second index, and being used at described target matrix is described data creation index.
According to another embodiment of the invention, a data loading device is also provided, comprising: an initial loading unit, configured to load the data to be loaded in batches into a temporary data table; a target loading unit, configured to swap the data in the temporary data table into a target data table; and an index building unit, configured to create an index for the data in the target data table.
For further details of the data loading device of the embodiments of the invention, refer to the corresponding description of the data loading method in this application; they are not repeated here.
In the prior art, updating the index while loading data makes loading very slow. The present invention splits loading and indexing into two independent stages so that they can be processed in parallel, which improves the efficiency of loading data into the target data table. Table 1 shows stress-test statistics for the data write time with an index, the data write time without an index, and the data write time with delayed indexing; each test used 500,000 records, and times are in seconds.
Table 1 (provided as images in the original publication: Figure BDA00002609103700111 and Figure BDA00002609103700121)
Note: speedup ratio = direct write time with index / (write time without index + index creation time + partition exchange time). Partition exchange only modifies the database's data dictionary and takes milliseconds or less, so it can be ignored.
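The ratio in the note can be written as a small helper. The function name and the sample timings used below are illustrative assumptions only; they are not values from Table 1.

```python
def speedup_ratio(indexed_write_s: float,
                  plain_write_s: float,
                  index_build_s: float,
                  exchange_s: float = 0.0) -> float:
    """Direct indexed write time divided by the total delayed-index time."""
    delayed_total = plain_write_s + index_build_s + exchange_s
    if delayed_total <= 0:
        raise ValueError("delayed-index total time must be positive")
    return indexed_write_s / delayed_total
```

For example, a hypothetical load that takes 60 s with the index in place, but 8 s without it plus 4 s to build the index afterwards, gives `speedup_ratio(60.0, 8.0, 4.0)`, which is 5.0. The `exchange_s` term defaults to zero because the note treats partition exchange time as negligible.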
The test used the Lineitem and Orders tables from the TPC-H standard benchmark. The results show that the delayed-index time is much shorter than the write time with an index, with speedup ratios all above 5; for more complex index fields, such as full-text indexes, the speedup ratio can be even higher.
Because data queries also carry a time condition, partitioning by time range reduces a query over the whole table to a query over a few of its partitions, which improves query performance. Local indexes are also created on the partitioned table, so each data partition has a corresponding index partition. A query first locates the partitions that match the time condition and then quickly retrieves the required records by searching the corresponding index partitions, which greatly improves query efficiency.
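The two-step lookup described above (prune partitions by the time condition, then search each remaining partition's own index) can be sketched in plain Python. The sketch assumes fixed-length periods with T_n = t0 + n * period and integer timestamps, and it models each partition's local index as a list of `(ts, payload)` tuples kept sorted by `ts`; the function names and data layout are hypothetical.

```python
from bisect import bisect_left
from typing import Dict, List, Tuple

Row = Tuple[int, str]


def partitions_for_range(lo: int, hi: int, t0: int, period: int) -> List[int]:
    """Indices n of partitions (T_{n-1}, T_n] that can hold timestamps in [lo, hi]."""
    first = -(-(lo - t0) // period)  # partition containing lo
    last = -(-(hi - t0) // period)   # partition containing hi
    return list(range(first, last + 1))


def query_range(partitions: Dict[int, List[Row]],
                lo: int, hi: int, t0: int, period: int) -> List[Row]:
    """Return rows with lo <= ts <= hi, touching only the relevant partitions."""
    hits: List[Row] = []
    for n in partitions_for_range(lo, hi, t0, period):
        rows = partitions.get(n, [])
        # Binary search on the sorted rows plays the role of the local index.
        # (lo,) sorts before any (lo, payload) row; (hi + 1,) sorts after
        # every row stamped hi, given integer timestamps.
        start = bisect_left(rows, (lo,))
        stop = bisect_left(rows, (hi + 1,))
        hits.extend(rows[start:stop])
    return hits
```

Partitions outside the requested time range are never read, which is the source of the query speedup the paragraph describes.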
The algorithms and implementations provided herein are not tied to any particular computer, virtual system, or other device. The structure required to construct such a system is apparent from the description above. Nor is the invention directed at any particular programming language. It should be understood that the content of the invention described herein may be implemented in a variety of programming languages, and that the description above of any specific language is given in order to disclose the best mode of the invention.
Those skilled in the art will understand that the modules of the device in an embodiment may be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units, or components of an embodiment may be combined into a single module, unit, or component, and may also be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, an equivalent, or a similar purpose.
In addition, although the specification of this application describes numerous details of the embodiments of the invention, it will be understood that the embodiments may be practiced without all of these details. In some instances, well-known methods, structures, and techniques have not been shown in detail so that the inventive concept can be understood clearly.
The above are merely preferred embodiments of the invention and are not intended to limit it. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (10)

1. A data loading method, comprising:
Step A: loading the data to be loaded in batches into a temporary data table;
Step B: swapping the data in the temporary data table into an intermediate data table according to a preset time interval;
Step C: creating an index for the data placed in the intermediate data table; and
Step D: swapping the indexed data from the intermediate data table into a target data table.
2. The data loading method according to claim 1, characterized in that:
the temporary data table and the target data table are partitioned by time range using the same partitioning strategy, wherein partition Part_n stores data whose timestamps fall in (T_{n-1}, T_n], T_{n-1} is the end time of the previous period, T_n is the end time of the current period, n > 1, and Part_n is the current partition;
Step A comprises: writing data that arrives in chronological order into partition Part_n of the temporary data table according to the time tag carried in the data;
Step B comprises: swapping the data of partition Part_n of the temporary data table into a first intermediate data table;
Step C comprises: building an index on the first intermediate data table for the data in the first intermediate data table; and
Step D comprises: swapping the indexed data in the first intermediate data table into partition Part_n of the target data table.
3. The data loading method according to claim 1, characterized in that:
the temporary data table and the target data table are partitioned by time range using the same partitioning strategy, wherein partition Part_{n-1} stores data whose timestamps fall in (T_{n-2}, T_{n-1}], and partition Part_n stores data whose timestamps fall in (T_{n-1}, T_n]; the temporary data table comprises partitions Part_{n-1}, Part_n, Part_{n+1}, ..., Part_{n+k}; the partitions of the target data table are determined by design requirements; wherein T_{n-1} is the end time of the previous period, T_n is the end time of the current period, Part_n is the current partition, n > 1, and k > 0;
Step A comprises: writing data that arrives in chronological order into partition Part_n of the temporary data table according to the time tag carried in the data, and writing late-arriving data into partition Part_{n-1};
Step B comprises: swapping the data of partition Part_n of the temporary data table into a first intermediate data table, and swapping the data of partition Part_{n-1} of the temporary data table into a second intermediate data table;
Step C comprises: building an index on the first intermediate data table for the data in the first intermediate data table;
Step D comprises: swapping the indexed data in the first intermediate data table into partition Part_n of the target data table; and
after Step B, the method further comprises the step of: inserting the data in the second intermediate data table into partition Part_{n-1} of the target data table.
4. The data loading method according to claim 1, characterized in that, after Step A, the method further comprises:
upon receiving an instruction not to build the index on the intermediate data table, swapping the data in the temporary data table into the target data table; and
creating an index for the data in the target data table.
5. A data loading device, comprising:
an initial loading unit, configured to load the data to be loaded in batches into a temporary data table;
a first index building unit, configured to swap the data in the temporary data table into an intermediate data table according to a preset time interval, and to create an index for the data placed in the intermediate data table; and
a target loading unit, configured to swap the indexed data from the intermediate data table into a target data table.
6. The data loading device according to claim 5, characterized in that:
the temporary data table and the target data table are partitioned by time range using the same partitioning strategy, wherein partition Part_n stores data whose timestamps fall in (T_{n-1}, T_n], T_{n-1} is the end time of the previous period, T_n is the end time of the current period, n > 1, and Part_n is the current partition;
the initial loading unit is further configured to write data that arrives in chronological order into partition Part_n of the temporary data table according to the time tag carried in the data;
the first index building unit is further configured to swap the data of partition Part_n of the temporary data table into a first intermediate data table, and to build an index on the first intermediate data table for the data in the first intermediate data table; and
the target loading unit is further configured to swap the indexed data in the first intermediate data table into partition Part_n of the target data table.
7. The data loading device according to claim 5, characterized in that:
the temporary data table and the target data table are partitioned by time range using the same partitioning strategy, wherein partition Part_{n-1} stores data whose timestamps fall in (T_{n-2}, T_{n-1}], and partition Part_n stores data whose timestamps fall in (T_{n-1}, T_n]; the temporary data table comprises partitions Part_{n-1}, Part_n, Part_{n+1}, ..., Part_{n+k}; the partitions of the target data table are determined by design requirements; wherein T_{n-1} is the end time of the previous period, T_n is the end time of the current period, Part_n is the current partition, n > 1, and k > 0;
the initial loading unit is further configured to write data that arrives in chronological order into partition Part_n of the temporary data table according to the time tag carried in the data, and to write late-arriving data into partition Part_{n-1};
the first index building unit is further configured to swap the data of partition Part_n of the temporary data table into a first intermediate data table, and to build an index on the first intermediate data table for the data in the first intermediate data table;
a delayed data processing unit is configured to swap the data of partition Part_{n-1} of the temporary data table into a second intermediate data table; and
the target loading unit is further configured to swap the indexed data in the first intermediate data table into partition Part_n of the target data table, and further to insert the data in the second intermediate data table into partition Part_{n-1} of the target data table.
8. The data loading device according to claim 5, further comprising a second index building unit, wherein:
the initial loading unit is further configured, upon receiving an instruction not to build the index on the intermediate data table, to send a do-not-execute instruction to the first index building unit and an execute instruction to the target loading unit;
the target loading unit is further configured to swap the data in the temporary data table into the target data table upon receiving the execute instruction from the initial loading unit; and
the second index building unit is configured to create an index for the data in the target data table.
9. A data loading device, comprising:
an initial loading unit, configured to load the data to be loaded in batches into a temporary data table;
a target loading unit, configured to swap the data in the temporary data table into a target data table; and
an index building unit, configured to create an index for the data in the target data table.
10. A data loading method, comprising:
loading the data to be loaded in batches into a temporary data table; and
swapping the data in the temporary data table into a target data table, and creating an index for the data in the target data table.
CN2012105537786A 2012-12-18 2012-12-18 Data uploading method and data uploading device Pending CN103049519A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012105537786A CN103049519A (en) 2012-12-18 2012-12-18 Data uploading method and data uploading device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2012105537786A CN103049519A (en) 2012-12-18 2012-12-18 Data uploading method and data uploading device

Publications (1)

Publication Number Publication Date
CN103049519A true CN103049519A (en) 2013-04-17

Family

ID=48062160

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012105537786A Pending CN103049519A (en) 2012-12-18 2012-12-18 Data uploading method and data uploading device

Country Status (1)

Country Link
CN (1) CN103049519A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101251861A (en) * 2008-03-18 2008-08-27 北京锐安科技有限公司 Method for loading and inquiring magnanimity data
US7548898B1 (en) * 2001-02-28 2009-06-16 Teradata Us, Inc. Parallel migration of data between systems

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
房友园等: "基于CORBA的海量数据加载并行任务调度技术研究与实现", 《计算机应用与软件》, 31 October 2006 (2006-10-31), pages 13 - 14 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104750749A (en) * 2013-12-31 2015-07-01 阿里巴巴集团控股有限公司 Data processing method and data processing device
CN104750749B (en) * 2013-12-31 2018-04-03 阿里巴巴集团控股有限公司 Data processing method and device
CN106649341A (en) * 2015-10-30 2017-05-10 方正国际软件(北京)有限公司 Data processing method and device
CN106649341B (en) * 2015-10-30 2021-02-26 方正国际软件(北京)有限公司 Data processing method and device
CN105512313A (en) * 2015-12-15 2016-04-20 北京京东尚科信息技术有限公司 Incremental data processing method and device
CN105512313B (en) * 2015-12-15 2019-01-22 北京京东尚科信息技术有限公司 A kind of method and apparatus of incremented data processing
CN107145529A (en) * 2017-04-17 2017-09-08 东软集团股份有限公司 A kind of data processing method and device
CN107145529B (en) * 2017-04-17 2020-04-07 东软集团股份有限公司 Data processing method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20130417

RJ01 Rejection of invention patent application after publication