CN108829747A - Data load method and device - Google Patents
- Publication number
- CN108829747A CN201810510384.XA
- Authority
- CN
- China
- Prior art keywords
- data record
- data
- cache database
- life
- primary key
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present disclosure relates to a data loading method and device that address the problem of low data loading efficiency. The method includes: when a scan period arrives, reading data records from a data source into a data record set as first data records; traversing the first data records in the data record set and searching a cache database by the primary key of each first data record; when no data record matching the primary key is found in the cache database, adding the first data record to a new data record set; when a data record matching the primary key is found in the cache database and the life cycle of the first data record has ended, adding the first data record to a life-cycle-ended data record set; inserting the data records in the new data record set into the full data table of a target database; and replacing the data records in the full data table that have the same primary keys with the data records in the life-cycle-ended data record set. This improves the processing efficiency of the system.
Description
Technical field
The present disclosure relates to the technical field of data processing, and in particular to a data loading method and device.
Background
ETL (Extract-Transform-Load) describes the process of extracting data from a source, transforming it, and loading it into a destination. In some ETL scenarios, new data records are generated continuously and are written directly to the database after extraction, and historical data records are never updated. In other scenarios, an updatable data record (RecSet) has a life cycle: for some time T after the record is first written, individual attribute fields can still be updated, until the life cycle of the record ends, at which point its life cycle end marker is updated. Fig. 1 shows the general structure of such a data record. As shown in Fig. 1, the structure includes an entity identifier, a start time, an end time (the life cycle end marker, initially empty), an update time, and some other attributes. The entity identifier together with the start time serves as the primary key. The life cycle of a data record is as follows. Initial: the start time is t1, the end time is empty, and the related parameters hold their initial values. State 1: the start time is t1, the end time is empty, and the related parameter x is updated according to business conditions. State 2: an intermediate state, in which the related parameter x is updated according to business conditions. End: the start time is t1, the end time is t2, and the updates to the related parameter x are complete.
Take hotel stay records as an example, where the entity identifier is the guest's ID card number. A record is generated when a guest checks in; its start time and update time are the guest's check-in time, and its end time is empty. N days later, when the guest checks out, the end time field of the record is set to the check-out time and the update time field is refreshed at the same time. Fig. 2 shows the structure of such a hotel stay record. In Fig. 2, in the data record at check-in, the entity identifier is the guest's ID card number 411002xxxx, the start time is the check-in time 2018-01-01 10:00, the end time is empty, and the update time is 2018-01-01 10:00. In the data record at check-out, the entity identifier is the guest's ID card number 411002xxxx, the start time is the check-in time 2018-01-01 10:00, the end time is the check-out time 2018-01-03 12:00, and the update time is the check-out time 2018-01-03 12:00.
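The record structure and the hotel example can be sketched in Python as follows. This is only an illustration: the class name `DataRecord` and its field and property names are assumptions chosen for readability, not names defined by the patent.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class DataRecord:
    """Updatable record; all names here are illustrative assumptions."""
    entity_id: str
    begin_time: datetime
    update_time: datetime
    end_time: Optional[datetime] = None  # empty until the life cycle ends
    attrs: dict = field(default_factory=dict)  # other business attributes

    @property
    def primary_key(self) -> tuple:
        # Entity identifier + start time together form the primary key.
        return (self.entity_id, self.begin_time)

    @property
    def life_cycle_ended(self) -> bool:
        return self.end_time is not None


check_in = datetime(2018, 1, 1, 10, 0)
check_out = datetime(2018, 1, 3, 12, 0)

# The same stay, as recorded at check-in and at check-out (values from Fig. 2).
at_check_in = DataRecord("411002xxxx", check_in, update_time=check_in)
at_check_out = DataRecord("411002xxxx", check_in, update_time=check_out,
                          end_time=check_out)
```

Both versions share one primary key, which is what later allows the check-out version to replace the check-in version in the target table.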
When performing ETL on updatable data, new records must be read periodically, once per scan period, according to the "update time" field, and the full data table stored in the target database must then be searched by primary key. However, the data records read in each scan period have to be stored in a temporary table created in the target database (TmpTable-RecSet), which incurs extra disk I/O (input/output). In addition, a join by primary key is performed between the temporary table and the full data table. When many data records have accumulated in the full data table, this join consumes considerable processing resources of the target database, which degrades the overall responsiveness of the system and reduces its response efficiency.
Summary of the invention
In view of this, the present disclosure proposes a data loading method and device to address the low data loading efficiency of ETL in the related art.
According to a first aspect of the present disclosure, a data loading method is provided, including: when a scan period arrives, reading data records from a data source into a data record set as the first data records in the data record set; traversing the first data records in the data record set and, for each first data record traversed, searching a cache database by the primary key of the first data record; when no data record matching the primary key is found in the cache database, adding the first data record to a new data record set; when a data record matching the primary key is found in the cache database and the life cycle of the first data record has ended, adding the first data record to a life-cycle-ended data record set; inserting the data records in the new data record set into the full data table of a target database; and replacing the data records in the full data table that have the same primary keys with the data records in the life-cycle-ended data record set.
According to a second aspect of the present disclosure, a data loading device is provided, including: a reading module, configured to, when a scan period arrives, read data records from a data source into a data record set as the first data records in the data record set; a retrieval module, configured to traverse the first data records in the data record set and, for each first data record traversed, search a cache database by the primary key of the first data record; a first adding module, configured to add the first data record to a new data record set when no data record matching the primary key is found in the cache database; a second adding module, configured to add the first data record to a life-cycle-ended data record set when a data record matching the primary key is found in the cache database and the life cycle of the first data record has ended; an insertion module, configured to insert the data records in the new data record set into the full data table of a target database; and a replacement module, configured to replace the data records in the full data table that have the same primary keys with the data records in the life-cycle-ended data record set.
With the data loading scheme of the aspects of the present disclosure, data records are cached in a cache database during ETL instead of being staged in a temporary table, which reduces disk I/O. Because no temporary table needs to be created during ETL, no join between a temporary table and the full data table is performed in the target database, which improves the processing efficiency of the system.
Other features and aspects of the present disclosure will become clear from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features, and aspects of the present disclosure together with the specification, and serve to explain the principles of the present disclosure.
Fig. 1 shows the general structure of an updatable data record;
Fig. 2 shows the structure of a data record, taking a hotel stay record as an example;
Fig. 3 is a flowchart of a data loading method according to an exemplary embodiment;
Fig. 4 is a schematic diagram of the data loading method of the present disclosure implemented on an electronic device, according to an exemplary embodiment;
Fig. 5 is a flowchart of a data loading method according to an exemplary embodiment;
Fig. 6 is a block diagram of a data loading device according to an exemplary embodiment.
Detailed description
Various exemplary embodiments, features, and aspects of the present disclosure are described in detail below with reference to the accompanying drawings. Identical reference numerals in the drawings denote elements with identical or similar functions. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless otherwise indicated.
The word "exemplary" is used herein to mean "serving as an example, embodiment, or illustration". Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
To better illustrate the present disclosure, numerous specific details are given in the detailed description below. Those skilled in the art will understand that the present disclosure can be practiced without some of these details. In some instances, methods, means, elements, and circuits well known to those skilled in the art are not described in detail, so as to highlight the gist of the present disclosure.
Fig. 3 is a flowchart of a data loading method according to an exemplary embodiment. The method can be applied to a server. As shown in Fig. 3, the method includes the following steps:
Step 301: When a scan period arrives, read data records from a data source into a data record set as the first data records in the data record set;
The data read from the data source may be, for example, updatable data, whose life cycle is finite. For example, hotel accommodation data has a life cycle of n days, and Internet café usage data has a life cycle of x hours. To load updatable data periodically, data records can be read from the data source into the data record set once per scan period. The scan period may be a fixed time interval at which the data loading process of steps 301 to 306 is executed periodically. It can also be understood as one processing cycle of the data loading method of this embodiment; that is, steps 301 to 306 are executed once each time a scan period arrives.
Step 302: Traverse the first data records in the data record set and, for each first data record traversed, search the cache database by the primary key of the first data record;
In step 302, all first data records in the data record set can be traversed, and for each one the cache database is searched for a data record whose primary key is the same as that of the first data record. The cache database caches the data records from the data record set whose life cycle has not ended. It may contain data records whose life cycle had not ended in the previous scan period or, if the system to which this data loading method is applied has only just gone live, it may contain no data records at all.
Step 303: When no data record matching the primary key is found in the cache database, add the first data record to the new data record set;
For example, when the cache database is searched for the current first data record and no data record with the same primary key is found, the first data record can be regarded as a new data record and added to the new data record set.
Step 304: When a data record matching the primary key is found in the cache database and the life cycle of the first data record has ended, add the first data record to the life-cycle-ended data record set;
For example, when the cache database is searched for the current first data record, if a data record with the same primary key is found and the life cycle of the first data record has ended, the first data record is added to the life-cycle-ended data record set.
Step 305: Insert the data records in the new data record set into the full data table of the target database;
The new data record set contains the new data records read in the current scan period. To load the data into the target database, the new data records in the new data record set are inserted into the full data table of the target database. The target database is the database at the destination to which data is loaded during ETL.
Step 306: Replace the data records in the full data table that have the same primary keys with the data records in the life-cycle-ended data record set.
The life cycle of the data records in the life-cycle-ended data record set has ended; that is, these records will receive no further updates. Therefore, the data record in the full data table whose primary key matches that of a life-cycle-ended data record can be deleted, and the life-cycle-ended data record can then be inserted into the full data table, which updates the data record in the full data table.
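Steps 305 and 306 might look roughly like this against a relational target. The sketch uses Python's built-in `sqlite3` module purely as a stand-in for the target database; the table name `table_data`, its columns, and the helper names `insert_new` and `replace_ended` are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the target database
conn.execute(
    """CREATE TABLE table_data (
           entity_id TEXT, begin_time TEXT, end_time TEXT, update_time TEXT,
           PRIMARY KEY (entity_id, begin_time))"""
)


def insert_new(conn, new_records):
    """Step 305: insert the new data records into the full data table."""
    conn.executemany("INSERT INTO table_data VALUES (?, ?, ?, ?)", new_records)


def replace_ended(conn, ended_records):
    """Step 306: delete the row with the same primary key, then insert the final version."""
    for rec in ended_records:
        conn.execute(
            "DELETE FROM table_data WHERE entity_id = ? AND begin_time = ?",
            (rec[0], rec[1]),
        )
        conn.execute("INSERT INTO table_data VALUES (?, ?, ?, ?)", rec)


# Each `with conn:` block commits on success and rolls back on error.
with conn:
    insert_new(conn, [("411002xxxx", "2018-01-01 10:00", None, "2018-01-01 10:00")])
with conn:
    replace_ended(conn, [("411002xxxx", "2018-01-01 10:00",
                          "2018-01-03 12:00", "2018-01-03 12:00")])

rows = conn.execute("SELECT * FROM table_data").fetchall()
```

Note that neither step needs a temporary table or a join: each statement touches the full data table directly by primary key.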
In the data loading method of this embodiment, data records are cached in a cache database during ETL, so they no longer need to be staged in a temporary table and no join against the full data table needs to be performed on the target database, which improves the processing efficiency of the system.
In one possible implementation, the data loading method of the present disclosure may further include: before the first scan period arrives, importing into the cache database the valid data records in the full data table whose life cycle has not ended. For example, if the system to which the data loading method is applied has only just gone live but the target database system is already in operation, the valid data records whose life cycle has not ended, taken from a period of time chosen based on experience, can be imported from the full data table of the target database into the cache database. As another example, if the target database system is in operation and the system to which the method is applied is shut down during operation and then restarted, the valid data records whose life cycle has not ended, again taken from a period of time chosen based on experience, are imported from the full data table of the target database into the cache database, overwriting the data records previously held in the cache database. The first scan period is the first period in which the system to which this data loading method is applied executes the method.
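The preload described above can be sketched as follows. The function name `preload_cache`, the dict-shaped rows, and the use of a plain dict as the cache are all assumptions for illustration; the empirically chosen time window is left out for brevity.

```python
def preload_cache(full_table_rows, cache):
    """Copy every record whose life cycle has not ended into the cache.

    `full_table_rows` is an iterable of dicts and `cache` is any dict-like
    store keyed by primary key; both shapes are assumptions for illustration.
    """
    cache.clear()  # overwrite whatever the cache held before a restart
    for row in full_table_rows:
        if row["end_time"] is None:  # life cycle still open
            cache[(row["entity_id"], row["begin_time"])] = row
    return cache


full_table = [
    {"entity_id": "A", "begin_time": "2018-04-01 14:00", "end_time": None},
    {"entity_id": "B", "begin_time": "2018-04-01 09:00", "end_time": "2018-04-02 13:00"},
]
# The cache starts with a stale entry, as it might after a restart.
cache = preload_cache(full_table, {("stale", "x"): {}})
```

After the call, only record A remains cached: B has already ended and the stale entry has been overwritten.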
In one possible implementation, reading data records from the data source into the data record set may include: when the scan period arrives, reading from the data source, according to the value of a time variable set in the previous scan period, the one or more data records whose update time is later than that value. The time variable is, for example, the "last maximum update time" last_max_update_time, such as the update time of the most recently updated data record in the cache database in the previous scan period. In one example, before the data records are read from the data source, the cache database contains 3 data records whose update times are 2018-04-01 14:00, 2018-04-02 13:00, and 2018-04-02 15:00, so the initial value of last_max_update_time in the current scan period is 2018-04-02 15:00. When the scan period arrives, for example at 2018-05-04 00:00, the data records whose update time is later than 2018-04-02 15:00 need to be read from the data source, that is, the data records generated between 2018-04-02 15:00 and 2018-05-04 00:00.
In one possible implementation, the data loading method of the present disclosure may further include: after reading from the data source the one or more data records whose update time is later than the value of the time variable, updating the value of the time variable to the latest update time among the data records read from the data source in the current scan period. Continuing the example above, if the latest update time among the data records read from the data source in the current scan period is 2018-05-03 17:00, then after the data records are read, last_max_update_time can be updated to 2018-05-03 17:00.
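The incremental read and the update of last_max_update_time can be sketched as below, using the dates from the example. The function name `read_updated_since` and the list-of-dicts data source are assumptions; keeping the old value when nothing new is read is also an assumption, since the text does not cover that case.

```python
from datetime import datetime


def read_updated_since(source, last_max_update_time):
    """Return (records updated after the watermark, new watermark)."""
    rec_set = [r for r in source if r["update_time"] > last_max_update_time]
    if rec_set:
        # Advance the watermark to the latest update time just read.
        last_max_update_time = max(r["update_time"] for r in rec_set)
    # Assumption: with no new records, the watermark stays unchanged.
    return rec_set, last_max_update_time


source = [
    {"id": "a", "update_time": datetime(2018, 4, 2, 15, 0)},  # already loaded
    {"id": "b", "update_time": datetime(2018, 4, 20, 9, 30)},
    {"id": "c", "update_time": datetime(2018, 5, 3, 17, 0)},
]
watermark = datetime(2018, 4, 2, 15, 0)
rec_set, watermark = read_updated_since(source, watermark)
```

The comparison is strict, so a record whose update time equals the watermark is not read again.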
In one possible implementation, when the valid data records in the full data table whose life cycle has not ended are imported into the cache database, the value of the time variable can be set to the latest update time among all data records in the cache database. Continuing the example above, if the latest update time among the data records imported into the cache database is 2018-05-03 17:00, last_max_update_time can be set to 2018-05-03 17:00.
In one possible implementation, the data loading method of the present disclosure may further include: when no data record matching the primary key is found in the cache database, storing the first data record in the cache database; and when a data record matching the primary key is found in the cache database and the life cycle of the first data record has ended, deleting the found data record from the cache database. In other words, based on the first data records read from the data source, new data records are added to the cache database, and the data records in the cache database that have the same primary key as a first data record whose life cycle has ended are deleted. As a result, the cache database holds only data records whose life cycle has not ended, which ensures that the number of data records in the cache database does not grow without bound.
In one implementation, the data loading method of the present disclosure may further include: deleting the data record set after the traversal of the first data records in the data record set is complete; deleting the new data record set after the data records in it have been inserted into the full data table of the target database; and deleting the life-cycle-ended data record set after the data records in it have replaced the data records with the same primary keys in the full data table. The data record set, the new data record set, and the life-cycle-ended data record set can be created on the server or electronic device when the data loading method needs them. Once they have been used, they can be deleted to save storage space on the server or electronic device.
In one possible implementation, the cache database is a key-value in-memory database such as redis or memcache. In addition, if the life cycle of the data records is long and the data volume is large, hbase can also be used as the cache database. An in-memory database answers primary-key queries within milliseconds, which makes it well suited to serve as the cache database. Moreover, the number of data records in the cache database does not grow without bound, so the size of the cached data stays controllable.
The data loading method provided by the present disclosure can be implemented on an electronic device. Fig. 4 is a schematic diagram of the data loading method of the present disclosure implemented on an electronic device, according to an exemplary embodiment. As shown in Fig. 4, the data record set 41, the new data record set 42, and the life-cycle-ended data record set 43 are created in the electronic device 40 and kept in its local memory. The target database 44, the data source 47, and the cache database 46 can each be deployed on this electronic device or separately on other electronic devices; the present disclosure does not limit this.
Fig. 5 is a flowchart of a data loading method according to an exemplary embodiment. As shown in Fig. 5, the method includes the following steps:
Step 601: Import into the cache database the valid data records in the full data table of the target database whose life cycle has not ended. For ease of description, the data records in the cache database are referred to as historical data records.
This step covers the following cases:
Case 1: the full data table of the target database contains no data records.
For example, the target database is a newly launched system, there is no data in the full data table, and there is no data in the data source either. In this case, this step can be skipped and, accordingly, the cache database is empty.
Case 2: the full data table of the target database contains data records.
For example, if the system to which the method of the present application is applied has only just gone live but the target database system is already in operation, the valid data records whose life cycle has not ended, taken from a period of time chosen based on experience, can be imported from the full data table of the target database into the cache database. As another example, if the target database system is in operation and the system to which the method is applied is shut down during operation and then restarted, the valid data records whose life cycle has not ended, again taken from a period of time chosen based on experience, are imported from the full data table of the target database into the cache database, overwriting the data records previously held in the cache database.
The cache database in this embodiment is described using redis as an example. A corresponding hash structure is created in redis, defined as follows:
Key (primary key):
The primary key of the data record (entity identifier + start time)
Fields (attributes):
begin_time: YYYY-mm-dd hh24:mi:ss, the start time
end_time: YYYY-mm-dd hh24:mi:ss, the end time
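One way to build such a key and its hash fields is sketched below. The helper names `make_key` and `to_hash_fields`, the ":" separator, and the use of an empty string for an empty end time are assumptions; the text only specifies that the key is the entity identifier plus the start time and that the hash carries begin_time and end_time.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # Python equivalent of YYYY-mm-dd hh24:mi:ss


def make_key(entity_id: str, begin_time: datetime) -> str:
    # The ":" separator is an assumption; the text only says the key is
    # entity identifier + start time.
    return f"{entity_id}:{begin_time.strftime(TIME_FORMAT)}"


def to_hash_fields(begin_time: datetime, end_time=None) -> dict:
    # An empty string stands in for an empty end time (an assumed convention),
    # because redis hash values cannot be null.
    return {
        "begin_time": begin_time.strftime(TIME_FORMAT),
        "end_time": end_time.strftime(TIME_FORMAT) if end_time else "",
    }


key = make_key("411002xxxx", datetime(2018, 1, 1, 10, 0))
fields = to_hash_fields(datetime(2018, 1, 1, 10, 0))
# With the redis-py client, this could be stored as:
#     client.hset(key, mapping=fields)
```

The sketch stops short of talking to a server so that it runs without redis installed.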
Step 602: Determine whether the scan period has arrived. If it has, execute step 603; if it has not, return to step 602;
Step 603: According to the value of the variable "last maximum update time" last_max_update_time, read from the data source the one or more data records whose update time is later than last_max_update_time into the data record set (RecSet). After the read completes, update the value of last_max_update_time to the latest update time among the data records read from the data source in the current scan period.
For ease of the following description, the data records in RecSet are referred to as first data records.
When the first scan period arrives, data must be read from the data source according to the initial value of the variable last_max_update_time. The initial value can be determined as follows:
For case 1 in step 601, the cache database is empty, and the initial value of last_max_update_time can be the time at which the system went live. Accordingly, when the first scan period arrives, all data records generated in the data source from the go-live time to the current time (that is, the time at which the first scan period arrives) can be read.
For case 2 in step 601, the cache database is not empty, and the initial value of last_max_update_time can be the latest update time among all data records in the cache database. Accordingly, when the first scan period arrives, the original records whose update time is later than the initial value of last_max_update_time are read from the data source.
For example, the cache database contains 3 data records whose update times are 2018-04-01 14:00, 2018-04-02 13:00, and 2018-04-02 15:00, so the initial value of last_max_update_time is 2018-04-02 15:00. Accordingly, when the first scan period arrives, for example at 2018-05-04 00:00, the data records whose update time is later than 2018-04-02 15:00 need to be read from the data source, that is, the data records generated between 2018-04-02 15:00 and 2018-05-04 00:00.
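The choice of initial value for the two cases can be sketched in a few lines. The function name `initial_watermark`, the dict-shaped cached records, and the go-live time used here are hypothetical.

```python
from datetime import datetime


def initial_watermark(cached_records, go_live_time):
    """Pick the starting value of last_max_update_time for the first scan period."""
    update_times = [r["update_time"] for r in cached_records]
    # Case 1: empty cache, use the go-live time.
    # Case 2: non-empty cache, use the latest cached update time.
    return max(update_times) if update_times else go_live_time


cached = [
    {"update_time": datetime(2018, 4, 1, 14, 0)},
    {"update_time": datetime(2018, 4, 2, 13, 0)},
    {"update_time": datetime(2018, 4, 2, 15, 0)},
]
go_live = datetime(2018, 1, 1, 0, 0)  # hypothetical go-live time

case_2_value = initial_watermark(cached, go_live)
case_1_value = initial_watermark([], go_live)
```

With the three cached records from the example, the case 2 value is 2018-04-02 15:00, matching the text.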
After the data records have been read from the data source in the first scan period, the value of last_max_update_time can be updated to the latest update time among the data records read in that scan period. After last_max_update_time has been updated, the method waits for the next scan period while the subsequent steps are executed.
When the next scan period arrives, data records are read from the data source again according to the current value of last_max_update_time, that is, the latest update time among the data records read from the data source in the previous scan period. After the read completes, last_max_update_time is updated again.
The first data records in the data record set (RecSet) are traversed and, for each first data record traversed, the following steps 604 to 607 are executed.
Step 604: Search redis by the primary key of the first data record.
Step 605: If no data record matching the primary key of the first data record is found in redis, the first data record is a new data record. Add the new data record to the new data record set (RecSet-New) and insert it into redis;
Note that when the first scan period arrives, if redis contains no data records, then by the search in this step all data records in RecSet are new data records and are all added to RecSet-New and to redis.
Step 606: If a data record matching the primary key of the first data record is found in redis, and the end_time field of the first data record is not empty, the life cycle of the first data record is determined to have ended. Add the life-cycle-ended data record to the life-cycle-ended data record set (RecSet-Over), and delete from redis the found data record that matches the primary key of the first data record.
Step 607: If a data record matching the primary key of the first data record is found in redis, and the end_time field of the first data record is empty, the life cycle of the data record in redis that matches the primary key has not ended, and no processing is performed.
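Steps 604 to 607 can be sketched as a single classification pass. To keep the example runnable without a redis server, it uses a small in-memory stand-in, `InMemoryHashStore`, that exposes `exists`, `hset`, and `delete` methods with names borrowed from the redis-py client; the stand-in, the function name `classify`, and the dict-shaped records are all assumptions.

```python
class InMemoryHashStore:
    """Stand-in for a redis client exposing exists / hset / delete."""

    def __init__(self):
        self._data = {}

    def exists(self, key):
        return int(key in self._data)

    def hset(self, key, mapping):
        self._data.setdefault(key, {}).update(mapping)

    def delete(self, key):
        return int(self._data.pop(key, None) is not None)


def classify(rec_set, store):
    """Steps 604-607: split RecSet into RecSet-New and RecSet-Over."""
    rec_set_new, rec_set_over = [], []
    for rec in rec_set:
        key = rec["key"]
        if not store.exists(key):
            # Step 605: unseen key, so this is a new record; cache it too.
            rec_set_new.append(rec)
            store.hset(key, mapping={"begin_time": rec["begin_time"],
                                     "end_time": rec["end_time"] or ""})
        elif rec["end_time"]:
            # Step 606: cached and now ended; drop it from the cache.
            rec_set_over.append(rec)
            store.delete(key)
        # Step 607: cached and still open, nothing to do.
    return rec_set_new, rec_set_over


store = InMemoryHashStore()
first_scan = [{"key": "k1", "begin_time": "2018-01-01 10:00:00", "end_time": None}]
second_scan = [
    {"key": "k1", "begin_time": "2018-01-01 10:00:00", "end_time": "2018-01-03 12:00:00"},
    {"key": "k2", "begin_time": "2018-01-02 09:00:00", "end_time": None},
]
new_1, over_1 = classify(first_scan, store)
new_2, over_2 = classify(second_scan, store)
```

After the second scan, k1 has moved to RecSet-Over and left the cache, while k2 is new and cached, so the cache only ever holds records whose life cycle is still open.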
Step 608: After the traversal of the first data records in RecSet is complete, delete RecSet, return to step 602, and execute steps 609 and 610.
Steps 609 and 610 can be executed simultaneously, or either one of them can be executed first. While steps 609 and 610 are being executed, the scan period check of step 602 can run at the same time.
Step 609: Insert all new data records contained in RecSet-New into the full data table Table-Data in the target database, then delete RecSet-New.
Step 610: Replace the data records in Table-Data that have the same primary keys with the data records in RecSet-Over, then delete RecSet-Over.
In summary, during ETL the present disclosure uses a cache database to cache the updatable data of the target database (that is, the data records whose life cycle has not ended) and keeps the data obtained in the current scan period in local memory. Using local memory and a dedicated in-memory database enables fast and efficient data retrieval and processing. Only after processing is complete is the updated portion of the data synchronized to the target database, which greatly reduces disk I/O operations and also avoids the join against the full data table in the target database that the prior art requires, improving processing efficiency.
Fig. 6 is a block diagram of a data loading device according to an exemplary embodiment. As shown in Fig. 6, the device 70 includes:
A reading module 71, configured to, when a scan period arrives, read data records from a data source into a data record set as the first data records in the data record set;
A retrieval module 72, configured to traverse the first data records in the data record set and, for each first data record traversed, search the cache database by the primary key of the first data record;
A first adding module 73, configured to add the first data record to the new data record set when no data record matching the primary key is found in the cache database;
A second adding module 74, configured to add the first data record to the life-cycle-ended data record set when a data record matching the primary key is found in the cache database and the life cycle of the first data record has ended;
An insertion module 75, configured to insert the data records in the new data record set into the full data table of the target database;
A replacement module 76, configured to replace the data records in the full data table that have the same primary keys with the data records in the life-cycle-ended data record set.
In a kind of achievable mode, the data loading device of the disclosure may also include:Import modul, for being swept first
Before retouching period arrival, the valid data record that life cycle does not terminate in the full dose tables of data is imported described data cached
Library.
In a kind of achievable mode, the read module is used for:When reaching the scan period, according to upper scanning week
The value for the time variable being arranged in phase is later than the data of the value from the data source one or more renewal time of reading
Record.
In a kind of achievable mode, the data loading device of the disclosure further includes:Update module, for from data source
Reading for one or more renewal time is later than after the data record of the value, and the value of the time variable is updated to this
From the nearest renewal time in the data record that data source is read in scan period.
In a kind of achievable mode, the data loading device of the disclosure further includes:Setup module, for will it is described entirely
When the valid data record that life cycle does not terminate in amount tables of data imports the cache database, by taking for the time variable
Value is set as in the cache database the nearest renewal time in all data records.
In a kind of achievable mode, the data loading device of the disclosure further includes:Memory module, for when described slow
When not retrieving data record matched with the major key in deposit data library, first data record is stored in the caching number
According to library;First removing module retrieves and the matched data record of the major key and institute in the cache database for working as
When stating the end of life of the first data record, deleted in the cache database described in the data record that retrieves.
In one possible implementation, the data loading device of the present disclosure further includes: a second deletion module, configured to delete the data record set after traversal of the first data records in the data record set is completed; a third deletion module, configured to delete the new data record set after the data records in the new data record set are inserted into the full data table of the target database; and a fourth deletion module, configured to delete the life-cycle-ended data record set after the data records in that set have replaced the data records in the full data table with consistent primary keys.
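The load step and the subsequent deletion of the temporary sets can be sketched as follows. For illustration the full data table of the target database is modeled as a dict keyed by primary key, and the function and field names are hypothetical; a real target would use database insert and update statements instead.

```python
def apply_to_target(target: dict, new_records: list[dict], ended_records: list[dict]) -> None:
    """Insert new records, replace ended ones by primary key, then free the temporary sets."""
    for rec in new_records:
        target[rec["pk"]] = rec   # insert into the full data table
    new_records.clear()           # third deletion module

    for rec in ended_records:
        target[rec["pk"]] = rec   # replace the row with the same primary key
    ended_records.clear()         # fourth deletion module

# Illustrative data only.
target = {1: {"pk": 1, "lifecycle_end": None}}
new_set = [{"pk": 2, "lifecycle_end": None}]
ended_set = [{"pk": 1, "lifecycle_end": 20}]
apply_to_target(target, new_set, ended_set)
```

Clearing each set as soon as it has been applied keeps memory use bounded by the size of a single scan cycle's batch.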
The embodiments of the present disclosure have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application, or their technical improvement over technologies in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
Claims (14)
1. A data loading method, characterized by comprising:
when a scan cycle arrives, reading data records from a data source into a data record set as first data records in the data record set;
traversing the first data records in the data record set and, for any first data record traversed, searching a cache database according to the primary key of the first data record;
when no data record matching the primary key is found in the cache database, adding the first data record to a new data record set;
when a data record matching the primary key is found in the cache database and the life cycle of the first data record has ended, adding the first data record to a life-cycle-ended data record set;
inserting the data records in the new data record set into a full data table of a target database; and
replacing, with the data records in the life-cycle-ended data record set, the data records in the full data table whose primary keys are consistent with theirs.
2. The method according to claim 1, characterized in that the method further comprises:
before the first scan cycle arrives, importing the valid data records in the full data table whose life cycle has not ended into the cache database.
3. The method according to claim 2, characterized in that reading data records from the data source into the data record set comprises:
when the scan cycle arrives, reading from the data source, according to the value of a time variable set in the previous scan cycle, one or more data records whose update time is later than the value.
4. The method according to claim 3, characterized in that, after the one or more data records whose update time is later than the value are read from the data source, the method further comprises:
updating the value of the time variable to the latest update time among the data records read from the data source in the current scan cycle.
5. The method according to claim 3, characterized in that, when the valid data records in the full data table whose life cycle has not ended are imported into the cache database,
the value of the time variable is set to the latest update time among all data records in the cache database.
6. The method according to claim 1, characterized in that the method further comprises:
when no data record matching the primary key is found in the cache database, storing the first data record in the cache database;
when a data record matching the primary key is found in the cache database and the life cycle of the first data record has ended, deleting the matching data record from the cache database.
7. The method according to claim 1, characterized in that the method further comprises:
after traversal of the first data records in the data record set is completed, deleting the data record set;
after the data records in the new data record set are inserted into the full data table of the target database, deleting the new data record set;
after the data records in the life-cycle-ended data record set have replaced the data records in the full data table with consistent primary keys, deleting the life-cycle-ended data record set.
8. A data loading device, characterized by comprising:
a read module, configured to, when a scan cycle arrives, read data records from a data source into a data record set as first data records in the data record set;
a retrieval module, configured to traverse the first data records in the data record set and, for any first data record traversed, search a cache database according to the primary key of the first data record;
a first adding module, configured to add the first data record to a new data record set when no data record matching the primary key is found in the cache database;
a second adding module, configured to add the first data record to a life-cycle-ended data record set when a data record matching the primary key is found in the cache database and the life cycle of the first data record has ended;
an insertion module, configured to insert the data records in the new data record set into a full data table of a target database; and
a replacement module, configured to replace, with the data records in the life-cycle-ended data record set, the data records in the full data table whose primary keys are consistent with theirs.
9. The device according to claim 8, characterized in that the device further comprises:
an import module, configured to import the valid data records in the full data table whose life cycle has not ended into the cache database before the first scan cycle arrives.
10. The device according to claim 9, characterized in that the read module is configured to:
when the scan cycle arrives, read from the data source, according to the value of a time variable set in the previous scan cycle, one or more data records whose update time is later than the value.
11. The device according to claim 10, characterized in that the device further comprises:
an update module, configured to, after the one or more data records whose update time is later than the value are read from the data source, update the value of the time variable to the latest update time among the data records read from the data source in the current scan cycle.
12. The device according to claim 10, characterized in that the device further comprises:
a setting module, configured to set the value of the time variable to the latest update time among all data records in the cache database when the valid data records in the full data table whose life cycle has not ended are imported into the cache database.
13. The device according to claim 8, characterized in that the device further comprises:
a storage module, configured to store the first data record in the cache database when no data record matching the primary key is found in the cache database;
a first deletion module, configured to delete the matching data record from the cache database when a data record matching the primary key is found in the cache database and the life cycle of the first data record has ended.
14. The device according to claim 8, characterized in that the device further comprises:
a second deletion module, configured to delete the data record set after traversal of the first data records in the data record set is completed;
a third deletion module, configured to delete the new data record set after the data records in the new data record set are inserted into the full data table of the target database;
a fourth deletion module, configured to delete the life-cycle-ended data record set after the data records in that set have replaced the data records in the full data table with consistent primary keys.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810510384.XA CN108829747B (en) | 2018-05-24 | 2018-05-24 | Data load method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108829747A (en) | 2018-11-16 |
CN108829747B (en) | 2019-09-17 |
Family
ID=64145497
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201810510384.XA | Active | CN108829747B (en) | 2018-05-24 | 2018-05-24 | Data load method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108829747B (en) |
Patent Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101504664A (en) * | 2009-03-18 | 2009-08-12 | 中国工商银行股份有限公司 | Apparatus and method for extracting, converting and loading total source data |
CN102663114A (en) * | 2012-04-17 | 2012-09-12 | 中国人民大学 | Database inquiry processing method facing concurrency OLAP (On Line Analytical Processing) |
CN104731791A (en) * | 2013-12-18 | 2015-06-24 | 东阳艾维德广告传媒有限公司 | Marketing analysis data market system |
CN104573128A (en) * | 2014-10-28 | 2015-04-29 | 北京国双科技有限公司 | Business data processing method, a business data processing device and server |
CN104484400A (en) * | 2014-12-12 | 2015-04-01 | 北京国双科技有限公司 | Method and device for data query processing |
CN105426292A (en) * | 2015-10-29 | 2016-03-23 | 网易(杭州)网络有限公司 | Game log real-time processing system and method |
CN105512201A (en) * | 2015-11-26 | 2016-04-20 | 晶赞广告(上海)有限公司 | Data collection and processing method and device |
CN105956123A (en) * | 2016-05-03 | 2016-09-21 | 无锡雅座在线科技发展有限公司 | Local updating software-based data processing method and apparatus |
CN106407321A (en) * | 2016-08-31 | 2017-02-15 | 东软集团股份有限公司 | Data synchronization method and device |
CN107229721A (en) * | 2017-06-02 | 2017-10-03 | 泰华智慧产业集团股份有限公司 | A kind of method and device for changing data pick-up |
CN107330003A (en) * | 2017-06-12 | 2017-11-07 | 上海藤榕网络科技有限公司 | Method of data synchronization, system, memory and data syn-chronization equipment |
Non-Patent Citations (1)
Title |
---|
PANOS VASSILIADIS 等: "ARKTOS: TOWARDS THE MODELING, DESIGN, CONTROL AND EXECUTION OF ETL PROCESSES", 《INFORMATION SYSTEMS》 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109635032A (en) * | 2018-12-24 | 2019-04-16 | 福建凯米网络科技有限公司 | A kind of method and terminal of data conversion |
CN112256775A (en) * | 2020-09-27 | 2021-01-22 | 建信金融科技有限责任公司 | Method and device for timed data loading of Oracle database |
CN113139081A (en) * | 2021-04-27 | 2021-07-20 | 中山亿联智能科技有限公司 | Method for reporting and reading user online playing information with high efficiency and low delay |
CN113139081B (en) * | 2021-04-27 | 2023-10-27 | 中山亿联智能科技有限公司 | Method for reporting online playing information of reading user with high efficiency and low delay |
Also Published As
Publication number | Publication date |
---|---|
CN108829747B (en) | 2019-09-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108829747B (en) | Data load method and device | |
CN107229721B (en) | A kind of method and device changing data pick-up | |
CN104133822B (en) | A kind of method and device that file on memorizer is scanned | |
EP1116139B1 (en) | Method and apparatus for reorganizing an active dbms table | |
CN110268399A (en) | Merge tree modification for maintenance operations | |
US9047330B2 (en) | Index compression in databases | |
CN110268394A (en) | KVS tree | |
CN110383261A (en) | Stream selection for multi-stream storage | |
US8768980B2 (en) | Process for optimizing file storage systems | |
CN107943718B (en) | Method and device for cleaning cache file | |
US7536512B2 (en) | Method and apparatus for space efficient identification of candidate objects for eviction from a large cache | |
CN106155934B (en) | Caching method based on repeated data under a kind of cloud environment | |
CN104092670A (en) | Method for utilizing network cache server to process files and device for processing cache files | |
CN110287201A (en) | Data access method, device, equipment and storage medium | |
CN109815425A (en) | Caching data processing method, device, computer equipment and storage medium | |
US20130046798A1 (en) | Method and apparatus for visualization of infrastructure using a non-relational graph data store | |
CN105915619B (en) | Take the cyberspace information service high-performance memory cache method of access temperature into account | |
US10789234B2 (en) | Method and apparatus for storing data | |
CN108446329A (en) | Adaptive databases partition method and system towards industrial time series database | |
Zhang et al. | Recovering SQLite data from fragmented flash pages | |
CN107169047A (en) | A kind of method and device for realizing data buffer storage | |
CN114297196A (en) | Metadata storage method and device, electronic equipment and storage medium | |
CN104219271B (en) | Based on the asynchronous multiserver synchronous method for downloading the page of multithreading | |
CN109582233A (en) | A kind of caching method and device of data | |
CN110399451B (en) | Full-text search engine caching method, system and device based on nonvolatile memory and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||