CN105243067B - Method and device for realizing real-time incremental data synchronization - Google Patents

Info

Publication number
CN105243067B
Authority
CN
China
Prior art keywords
data
relational database
log
module
acquiring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410321182.2A
Other languages
Chinese (zh)
Other versions
CN105243067A (en)
Inventor
杨威
白军伟
王啸风
冯是聪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zhizhi Heshu Technology Co ltd
Original Assignee
Beijing Mininglamp Software System Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Mininglamp Software System Co ltd
Priority to CN201410321182.2A
Publication of CN105243067A
Application granted
Publication of CN105243067B
Legal status: Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and a device for realizing real-time incremental data synchronization. The method includes: generating, according to the table structure information of a relational database, a mapping relation file corresponding to the relational database in the distributed, column-oriented open-source database HBase; acquiring the operation log of the relational database in real time; and obtaining the change data of the relational database from the acquired operation log and updating the obtained change data into the HBase of Hadoop according to the established mapping relation file. This realizes real-time incremental update synchronization of data from the relational database to Hadoop, which not only effectively reduces the burden on the Hadoop platform but also improves the user experience.

Description

Method and device for realizing real-time incremental synchronous data
Technical Field
The invention relates to the technical field of big data, in particular to a method and a device for realizing real-time incremental synchronous data.
Background
The rapid development of the internet has produced rapidly growing volumes of data. The emergence of massive data and changes in data structure pose huge challenges to data management, analysis and processing in every industry. Traditional processing methods based on relational databases cannot effectively store, analyze and process the ever-growing variety of business data. For this reason, many industries have begun to use a distributed system infrastructure (Hadoop) to analyze data. At present, the mainstream way to synchronize relational database data to the Hadoop platform is a one-time full import through Sqoop. Sqoop is an efficient data transfer tool between relational databases and the distributed file system (HDFS); it can import data from a relational database into the HDFS of Hadoop and can also export HDFS data into a relational database.
When data in the relational database changes and the updated data is to be imported into Hadoop, the data in the relational database has to be fully imported at regular intervals. A full import means that all data currently in the relational database is imported into Hadoop. This not only burdens the Hadoop distributed system but is also time-consuming. At present, however, there is no method that realizes real-time incremental update synchronization of data from the relational database to Hadoop, that is, one that synchronizes only the changed data in the relational database to Hadoop.
Disclosure of Invention
In order to solve the problems, the invention provides a method and a device for realizing real-time incremental synchronization of data, which can realize real-time incremental update synchronization of data from a relational database to a Hadoop, effectively reduce the burden of a Hadoop platform and enhance the user experience.
In order to achieve the above object, the present invention discloses a method for implementing real-time incremental synchronization data, which is applied to data import from a relational database to a distributed system architecture, and comprises:
generating a mapping relation file corresponding to the relational database in a column-oriented database HBase according to the table structure information of the relational database;
acquiring an operation log of the relational database in real time;
and acquiring the change data of the relational database according to the acquired operation log, and updating the acquired change data into the HBase of Hadoop according to the established mapping relation file.
Further, the identity and the starting point of the relational database are configured in advance; the obtaining of the operation log of the relational database in real time includes:
and acquiring an operation log of the relational database corresponding to the identity from the initial site according to the identity and the initial site.
Further, the obtaining the operation log of the relational database includes:
receiving the change data of the operation log of the relational database corresponding to the identity identifier, and storing the received change data in a message queue in sequence; or,
and when the request for acquiring the changed data is not received and the changed data in the message queue exceeds a threshold value, sequentially storing the changed data in the message queue into the corresponding directory file.
Further, after obtaining the operation log of the relational database in real time, the method further includes:
updating the initial site of the relational database;
and acquiring the next operation log of the relational database according to the updated initial site.
Further, the method further comprises: and storing the obtained change data in a local file, and recording the update history.
The invention also discloses a device for realizing real-time incremental synchronous data, which is applied to data import from the relational database to the distributed system architecture and comprises a table building module, a log obtaining module, a plurality of log analyzing client modules and a data updating module, wherein:
the table building module is used for generating a mapping relation file corresponding to the relational database in a distributed and column-oriented open source database HBase according to the table structure information of the relational database;
the log acquisition module is used for acquiring the operation log of the relational database in real time;
each log analysis client module is connected with the log acquisition module and is used for receiving the operation logs and the change data sent by the log acquisition module and sending the obtained change data to the data updating module;
and the data updating module is used for receiving the change data sent by each log analysis client module in real time and updating the obtained change data into the HBase of Hadoop according to the mapping relation file established by the table establishing module.
Further, the log obtaining module is specifically configured to:
configuring the unique identity and the unique start site of the relational database in advance;
and acquiring an operation log of the relational database corresponding to the identity from the initial site according to the identity and the initial site.
Further, the log obtaining module is further configured to:
receiving data of an operation log of the relational database corresponding to the identity identifier, and storing the received data in a message queue in sequence for the corresponding log analysis client module to request to obtain; or,
and if the log analysis client module does not request to acquire the data, when the data in the message queue exceeds a threshold value, sequentially storing the data in the message queue into a corresponding directory file.
Further, the log obtaining module is further configured to:
updating the initial site of the relational database;
and acquiring the next operation log of the relational database according to the updated initial site.
Further, the log parsing client module is further configured to: and when the data updating module is not started, storing the received change data in a local file.
Further, the obtained change data is saved in a local file, and the update history is recorded.
The method for realizing real-time incremental synchronization data provided by the technical scheme of the application is applied to data import from a relational database to a distributed system architecture, and comprises the following steps: generating a mapping relation file corresponding to the relational database in a distributed and column-oriented open source database HBase according to the table structure information of the relational database; acquiring an operation log of the relational database in real time; and acquiring the change data of the relational database according to the acquired operation log, and updating the acquired change data into the HBase of Hadoop according to the established mapping relation file. According to the technical scheme, real-time incremental updating synchronization from the relational database to the Hadoop is achieved, meanwhile, the burden of a Hadoop platform is effectively reduced, and user experience is enhanced.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a flow chart of a method for implementing real-time incremental synchronization of data in accordance with the present invention;
FIG. 2 is a schematic diagram of the structure of the apparatus for implementing real-time incremental synchronization data according to the present invention;
Detailed Description
The invention is described in detail below with reference to the figures and the specific embodiments.
Fig. 1 is a flowchart of a method for implementing real-time incremental synchronization of data according to the present invention, which is applied to import data from a relational database to a distributed system architecture (Hadoop), and as shown in fig. 1, the method includes the following steps:
step 101, according to the table structure information of the relational database, generating a mapping relation file corresponding to the relational database in a distributed and column-oriented database (HBase).
In this step, an association table corresponding to the relational database may be generated in the data warehouse tool (Hive) and in HBase. The association table is a data table created in Hive and HBase whose structure is consistent with the table structure in the relational database. After the historical data has been imported into the Hadoop platform, updates to a data table in the relational database are applied to the corresponding table in Hive and HBase. HBase is used for data storage, and Hive provides the query function. The association table may be created by a Hive script.
The method for generating the mapping relation file in this step is well known to those skilled in the art: the table structure information can be obtained from the relational database by a program, or, after connecting to the relational database, by executing SQL statements through the corresponding interface. Specifically, after the table structure information of the relational database is obtained, the corresponding HBase column names in the Hadoop platform are custom-defined (for example, the letters A-Z can be assigned in order to the column names in the relational database). In Hive, each column name of the table structure is the same as the column name in the relational database; the custom column names here are the column names used in HBase.
In addition, if the primary key in a data table of the relational database is a composite key, the fields of the composite key are concatenated according to a certain rule and used as the row key of HBase. The rule can be customized according to the characteristics and processing requirements of the data, as long as the concatenated field is unique for each record. For example, the fields of the composite primary key may be joined directly with underscores.
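The underscore rule described above can be sketched in a few lines of Python. This is only an illustration: the function name build_row_key, the sample record and its field names are hypothetical and do not come from the patent.

```python
def build_row_key(record, pk_columns, sep="_"):
    """Join the primary-key field values of one record into a single row key."""
    # Values are converted to strings and joined in the declared key order.
    parts = [str(record[col]) for col in pk_columns]
    return sep.join(parts)


# Hypothetical record with a two-column composite primary key.
order_item = {"order_id": 1001, "line_no": 3, "sku": "A-17"}
row_key = build_row_key(order_item, ["order_id", "line_no"])
print(row_key)  # 1001_3
```

Note that a plain join is unique only if the key values themselves cannot contain the separator; where they can, a different separator or an escaping rule would be needed.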
And step 102, acquiring an operation log of the relational database in real time.
Firstly, the unique identification and the starting point of the relational database are configured in advance.
In this step, in addition to configuring the unique identity and the start point for the relational database, other necessary information may be configured, such as an IP address, a service port number, a database user name and a password of the host where the relational database is located.
Secondly, the obtaining of the operation log of the relational database in real time specifically includes:
and acquiring an operation log of the relational database corresponding to the identity from the initial site according to the identity and the initial site.
Finally, after the operation log of the relational database is obtained in real time, updating the initial site of the relational database; and acquiring the next operation log of the relational database according to the updated initial site.
It should be noted that the start site is a variable held in memory. Each time an operation log is successfully acquired, the program modifies the value of this variable directly, that is, it updates the start site, and writes the value into the configuration file. Saving the value to a file prevents the current start site from being lost if the program terminates abnormally. The next time the program starts, it can recover the update progress reached at the last exit from the start site saved in the file.
A site is a position identifier of an operation record in the log of the relational database; the start site is the position from which acquisition of the operation log begins; and the current site is the current log position in the relational database. If no start site is configured, the default is the current site in the relational database. A site can be obtained, for example, in a MySQL database through SHOW MASTER STATUS.
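Keeping the start site in memory and persisting it after each successful read might look like the following Python sketch. The file name, the JSON layout and the sample position values are assumptions made for illustration, and the write-to-temporary-file-then-rename step is an added safeguard that the patent does not describe.

```python
import json
import os
import tempfile


def load_start_position(path, default=None):
    """Return the saved start position, or `default` if nothing was saved yet."""
    if not os.path.exists(path):
        return default
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


def save_start_position(path, position):
    """Write the position to a temporary file, then rename it over `path`,
    so an abnormal exit cannot leave a half-written state file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(position, fh)
    os.replace(tmp_path, path)


# Hypothetical state file and position values, for illustration only.
state_file = os.path.join(tempfile.mkdtemp(), "start_position.json")
position = load_start_position(state_file, default={"file": "mysql-bin.000001", "pos": 4})
# Pretend a log entry of 120 bytes was read successfully, then persist the new position.
position = {"file": position["file"], "pos": position["pos"] + 120}
save_start_position(state_file, position)
```

On the next start, load_start_position returns the saved value, so reading resumes where the previous run stopped.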
Preferably, for each relational database, a corresponding independent thread may be configured, change data of an operation log of the relational database corresponding to the identity is received, and the received change data is sequentially stored in a message queue of a limited size; or,
and when the request for acquiring the changed data is not received and the changed data in the message queue exceeds a threshold value, storing the changed data in the message queue into the corresponding directory file in sequence.
Thus, once a data request is initiated, the change data stored in the corresponding directory file is sent first.
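A minimal Python sketch of this buffering behaviour is shown below. The class name SpillingQueue, the JSON-lines spill file and the threshold value are assumptions; the patent states only that change data above a threshold is written in order to a directory file and that the stored data is sent first when a request arrives.

```python
import json
import os
import tempfile
from collections import deque


class SpillingQueue:
    """In-memory queue that moves its contents to a file once it grows past a threshold."""

    def __init__(self, spill_path, threshold):
        self.spill_path = spill_path
        self.threshold = threshold
        self.memory = deque()

    def put(self, change):
        """Queue one change; spill to disk if no request has drained the queue in time."""
        self.memory.append(change)
        if len(self.memory) > self.threshold:
            self._spill()

    def _spill(self):
        # Append in arrival order, one JSON document per line.
        with open(self.spill_path, "a", encoding="utf-8") as fh:
            while self.memory:
                fh.write(json.dumps(self.memory.popleft()) + "\n")

    def drain(self):
        """Serve a data request: spilled changes first, then those still in memory."""
        changes = []
        if os.path.exists(self.spill_path):
            with open(self.spill_path, "r", encoding="utf-8") as fh:
                changes.extend(json.loads(line) for line in fh if line.strip())
            os.remove(self.spill_path)
        while self.memory:
            changes.append(self.memory.popleft())
        return changes


queue = SpillingQueue(os.path.join(tempfile.mkdtemp(), "changes.jsonl"), threshold=2)
for seq in range(1, 6):
    queue.put({"seq": seq})
drained = queue.drain()
```

Because the spilled entries are always older than those still in memory, sending the file contents first preserves the original order of the changes.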
While configuring the unique identification and the start site of the relational database in advance, the method further comprises the following steps: and importing the full amount of the relational database data into the database in the Hadoop platform through Sqoop.
It should be noted that, because the unique identity and the start site of the relational database are configured in advance, the present solution involves only a single full import of the relational database into the database in the Hadoop platform. That is, after the full data has been imported into the Hadoop platform, subsequent data updates are imported into the Hadoop platform incrementally and in real time, rather than through repeated full imports.
And 103, acquiring the change data of the relational database according to the acquired operation log, and updating the acquired change data into the HBase of Hadoop according to the established mapping relation file.
Further, the obtained change data is also saved in a local file, and the update history is recorded.
It should be noted with respect to this method that, for a relational database, three operations are of main concern: insert, update and delete. HBase, by contrast, essentially only appends data; its update and delete operations (there is no separate insert operation) are very similar in nature and are both carried out during subsequent compaction (Compact).
For this reason, the update operation of HBase corresponds to both the insert and the update operations of the relational database. An insert maps to a simple update operation in HBase. Specifically:
First, updates are distinguished according to whether the primary key is updated. If the primary key is not updated, the operation still maps to a simple update operation in HBase; if the primary key is updated, the operation maps to several operations in HBase. Because HBase can store multiple versions of historical data by timestamp, when the primary key is updated the corresponding values must first be read through the old primary key; the data composed of the new primary key and the old values is then stored in HBase; next, the new primary key and the new values are used to update HBase; and finally the old values are deleted from HBase through the old primary key. In this way the multiple versions of historical data are still kept in HBase.
Finally, a delete operation still maps to a delete operation in HBase, but the data is only marked as deleted in HBase; the actual deletion is performed during the Compact process (note that once the delete operation completes, the data is no longer visible externally, so query results are not affected).
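The mapping of relational operations onto HBase operations can be sketched as follows. This is an illustration under stated assumptions: a plain Python dict stands in for an HBase table (so a delete removes the row immediately rather than marking it, and no versions are kept), the function name apply_change is hypothetical, and the operation type names are borrowed from the example records later in this document.

```python
def apply_change(store, op_type, old_key, new_key, new_values):
    """Apply one relational change to `store`, a dict of row_key -> {column: value}."""
    if op_type in ("INSERT", "UPDATE_FIELDCOMP"):
        # An insert and an update that keeps the primary key are both a plain write.
        store.setdefault(new_key, {}).update(new_values)
    elif op_type == "UPDATE_FIELDCOMP_PK":
        old_row = store.get(old_key, {})       # 1. read the values under the old key
        store[new_key] = dict(old_row)         # 2. store the old values under the new key
        store[new_key].update(new_values)      # 3. apply the new values to the new key
        if old_key != new_key:
            store.pop(old_key, None)           # 4. delete the row under the old key
    elif op_type == "DELETE":
        store.pop(old_key, None)
    else:
        raise ValueError("unknown operation type: " + str(op_type))
    return store


store = {}
apply_change(store, "INSERT", None, "1", {"name": "alex", "age": "25"})
apply_change(store, "UPDATE_FIELDCOMP", "1", "1", {"name": "tom"})
apply_change(store, "UPDATE_FIELDCOMP_PK", "1", "222", {})
print(store)  # {'222': {'name': 'tom', 'age': '25'}}
```

In a real deployment the four steps of the primary-key update would be carried out by reads, writes and deletes issued through an HBase client.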
Fig. 2 is a schematic diagram of a composition structure of the apparatus for implementing real-time incremental synchronization data according to the present invention, which is applied to data import from a relational database to a distributed system architecture (Hadoop), and as shown in fig. 2, the apparatus includes: the device comprises a table building module, a log obtaining module, a plurality of log analyzing client modules and a data updating module. Wherein,
and the table building module is used for generating a mapping relation file corresponding to the relational database in a distributed and column-oriented open source database (HBase) according to the table structure information of the relational database.
Further, the table building module is further configured to:
and if the primary key in the data table of the relational database is a combined key, splicing the combined key according to a certain rule to be used as the row primary key of the HBase.
It should be noted that the above table building module is further configured to generate an association table corresponding to the relational database in the data warehouse tool (Hive) and in HBase. The association table is a data table created in Hive and HBase whose structure is consistent with the table structure in the relational database. After the historical data has been imported into the Hadoop platform, updates to a data table in the relational database are applied to the corresponding table in Hive and HBase. HBase is used for data storage, and Hive provides the query function. The association table may be created by a Hive script.
And the log acquisition module is used for acquiring the operation log of the relational database in real time.
Firstly, the log obtaining module is specifically configured to:
configuring the unique identity and the unique start site of the relational database in advance;
and acquiring an operation log of the relational database corresponding to the identity from the initial site according to the identity and the initial site.
In addition, the log obtaining module is further configured to:
receiving data of an operation log of the relational database corresponding to the identity identifier, and storing the received data in a message queue in sequence for the corresponding log analysis client module to request to obtain; or,
and if the log analysis client module does not request to acquire the data, when the data in the message queue exceeds a threshold value, sequentially storing the data in the message queue into a corresponding directory file.
Finally, the log obtaining module is further configured to:
updating the initial site of the relational database; and acquiring the next operation log of the relational database according to the updated initial site.
Each log analysis client module is respectively connected with the log acquisition module and used for receiving the operation logs and the change data sent by the log acquisition module and sending the obtained change data to the data updating module.
Before receiving the operation log and the change data sent by the log acquisition module, each log analysis client module is further used for sending a data acquisition request to the log acquisition module according to the unique identity of the corresponding relational database and the connection relation between the log analysis client module and the log acquisition module.
Each log parsing client module is further to: and when the data updating module is not started, storing the received changed data in a local file.
And the data updating module is used for receiving the change data sent by each log analysis client module in real time and updating the obtained change data to the HBase of the distributed system infrastructure Hadoop according to the mapping relation file established by the table establishing module.
The data update module is further to: and storing the obtained change data in a local file, and recording the update history.
In addition, the above apparatus further comprises a full import module, which is used for importing the full relational database data into HBase in the Hadoop platform through Sqoop.
In this apparatus, the number of log analysis client modules is the same as the number of relational databases.
Example one
In this embodiment, the relational database MySQL is taken as an example to explain in detail how real-time incremental data updates from a MySQL database to the HBase database in Hadoop are realized.
First, configure the target MySQL database: enable MySQL binary logging and set it to row mode. Then configure the information of the MySQL database to be synchronized in the table building module and run the table building module once configuration is finished; this creates the association tables corresponding to the relational database in Hive and HBase and generates the mapping relation file between the data tables in the relational database and the data tables in HBase, for use by the data updating module. Suppose the target database contains a table info whose structure is as follows:
Field name | Field type | Description
id | bigint | Auto-increment key
name | varchar(10) |
age | int |
The generated mapping relation file content is as follows:
{"COLUMN_FAMILY":"C","SQOOP_INFO":{"COLS":{"AGE":"C","ID":"A","NAME":"B"},"IS_PK_INT":true,"PK":["ID"]}}
Here, COLUMN_FAMILY is the name of the column family in HBase. The column family has three columns, named A, B and C, which correspond to ID, NAME and AGE of the INFO table; the primary key of the INFO table is ID and is of integer type; and the name of the data table in HBase is SQOOP_INFO.
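A mapping relation file of this shape could be produced by a short Python sketch such as the one below. The function name build_mapping and its parameters are hypothetical, and forming the HBase table name as the upper-cased database name and table name joined by an underscore is an assumption inferred from this example (database sqoop, table info, HBase table SQOOP_INFO).

```python
import json
import string


def build_mapping(database, table, columns, pk_columns, pk_is_int, column_family="C"):
    """Assign the letters A-Z, in order, to the relational columns of one table."""
    if len(columns) > len(string.ascii_uppercase):
        raise ValueError("this sketch only assigns the single letters A-Z")
    cols = {name.upper(): letter for name, letter in zip(columns, string.ascii_uppercase)}
    hbase_table = (database + "_" + table).upper()  # assumed naming rule
    return {
        "COLUMN_FAMILY": column_family,
        hbase_table: {
            "COLS": cols,
            "IS_PK_INT": pk_is_int,
            "PK": [name.upper() for name in pk_columns],
        },
    }


mapping = build_mapping("sqoop", "info", ["id", "name", "age"], ["id"], pk_is_int=True)
# Sorted keys and compact separators reproduce the layout of the example file.
mapping_json = json.dumps(mapping, sort_keys=True, separators=(",", ":"))
print(mapping_json)
```

For the info table this yields the same JSON text as the mapping relation file shown above.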
Second, information such as the unique identity of the target MySQL database is configured in the relational database log acquisition module, and the module is then started as a background process. The log acquisition module is started before the full import because, if the target MySQL database holds a large amount of data, importing all of it into Hadoop takes some time, and during the import the data in the target MySQL database may change, for example through inserts, updates or deletes. The log acquisition module therefore needs to be started before the import, so that it records the MySQL operation log during this period and the data changes made in this period can be synchronized to the HBase database in Hadoop after the import completes.
Finally, after the relational database log acquisition module has been started, the full data import is carried out with Sqoop. After the import completes, the relational database log analysis client module and the data updating module are started as background processes. The log analysis client module is configured with the unique identity of the target MySQL database and the connection information of the log acquisition module, and uses them to request data from the log acquisition module; the data updating module sends requests to the log analysis client module to obtain data, and updates the obtained data into the HBase database of Hadoop according to the mapping file generated in the first step.
After all modules are deployed, every operation on the target MySQL database produces a corresponding data change that can be seen through a Hive query or an HBase query (the time delay is basically within one category), and the corresponding data update history can be found in the log of the data updating module. The following are several examples of the data format transmitted between the relational database log acquisition module, the relational database log analysis client module and the data updating module:
{"COLUMN_NEWVALUE":["1","alex","25"],"COLUMN_OLDVALUE":["","",""],"OPERATETABLE":"info","PK_NEWVALUE":["1"],"PK_OLDVALUE":[""],"DATABASE":"sqoop","PK":["id"],"OPERATETYPE":"INSERT","USERNAME":"sqoop","COLUMN":["id","name","age"],"OPERATETIME":"2014-05-30 03:20:40.000","TABLESPACE":""}
This record is an insert operation, so its old values are all empty. The following are two update operations, the first without updating the primary key and the second with the primary key updated (this can be seen from the operation type OPERATETYPE: its value is defined as "UPDATE_FIELDCOMP_PK" when the primary key is updated and as "UPDATE_FIELDCOMP" when it is not):
{"COLUMN_NEWVALUE":["tom"],"COLUMN_OLDVALUE":["alex"],"OPERATETABLE":"info","PK_NEWVALUE":["1"],"PK_OLDVALUE":["1"],"DATABASE":"sqoop","PK":["id"],"OPERATETYPE":"UPDATE_FIELDCOMP","USERNAME":"sqoop","COLUMN":["name"],"OPERATETIME":"2014-05-30 03:21:28.000","TABLESPACE":""}
{"COLUMN_NEWVALUE":[],"COLUMN_OLDVALUE":[],"OPERATETABLE":"info","PK_NEWVALUE":["222"],"PK_OLDVALUE":["1"],"DATABASE":"sqoop","PK":["id"],"OPERATETYPE":"UPDATE_FIELDCOMP_PK","USERNAME":"sqoop","COLUMN":[],"OPERATETIME":"2014-05-30 03:24:23.000","TABLESPACE":""}
Next, the following is a delete operation:
{"COLUMN_NEWVALUE":["",""],"COLUMN_OLDVALUE":["alex","25"],"OPERATETABLE":"info","PK_NEWVALUE":[""],"PK_OLDVALUE":["1"],"DATABASE":"sqoop","PK":["id"],"OPERATETYPE":"DELETE","USERNAME":"sqoop","COLUMN":["name","age"],"OPERATETIME":"2014-05-30 03:29:39.000","TABLESPACE":""}
The keywords used in the above four operations are described as follows:
COLUMN_NEWVALUE: the new values, listed in order, of the fields of the row whose values changed, after the operation is performed;
COLUMN_OLDVALUE: the old values, listed in order, of the fields of the row whose values changed, before the operation is performed;
OPERATETABLE: the table on which the operation is performed;
PK_NEWVALUE: the new value of the primary key of the row after the operation is performed;
PK_OLDVALUE: the old value of the primary key of the row before the operation is performed;
DATABASE: the name of the MySQL database to which the operation applies;
PK: the column name of the primary key in the data table;
OPERATETYPE: the operation type, namely insert (INSERT), update (UPDATE_FIELDCOMP, which indicates that the primary key is not updated, and UPDATE_FIELDCOMP_PK, which indicates that the primary key is updated) and delete (DELETE);
USERNAME: the user name of the database;
COLUMN: the names of the columns in the data table whose values changed;
OPERATETIME: the time at which the operation occurred;
TABLESPACE: the database table space.
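To show how such a record and the mapping relation file fit together, the following Python sketch turns one update record into the HBase table name, row key and column values to be written. The function name to_hbase_cells is hypothetical, the record is trimmed to the fields the sketch needs, and both the table lookup by DATABASE and OPERATETABLE and the family:qualifier notation for the result are assumptions made for illustration.

```python
import json

# Mapping relation file content from the example above.
MAPPING = json.loads(
    '{"COLUMN_FAMILY":"C","SQOOP_INFO":{"COLS":{"AGE":"C","ID":"A","NAME":"B"},'
    '"IS_PK_INT":true,"PK":["ID"]}}'
)


def to_hbase_cells(record, mapping):
    """Translate one change record into (table, row_key, {family:qualifier: value})."""
    # Assumed naming rule: upper-cased database and table names joined by an underscore.
    table = (record["DATABASE"] + "_" + record["OPERATETABLE"]).upper()
    family = mapping["COLUMN_FAMILY"]
    cols = mapping[table]["COLS"]
    # A composite key would be joined with underscores; here there is a single key field.
    row_key = "_".join(record["PK_NEWVALUE"])
    cells = {
        family + ":" + cols[name.upper()]: value
        for name, value in zip(record["COLUMN"], record["COLUMN_NEWVALUE"])
    }
    return table, row_key, cells


record = {
    "OPERATETYPE": "UPDATE_FIELDCOMP",
    "DATABASE": "sqoop",
    "OPERATETABLE": "info",
    "PK_NEWVALUE": ["1"],
    "COLUMN": ["name"],
    "COLUMN_NEWVALUE": ["tom"],
}
table, row_key, cells = to_hbase_cells(record, MAPPING)
print(table, row_key, cells)  # SQOOP_INFO 1 {'C:B': 'tom'}
```

The data updating module would then write these values to the named table through whatever HBase client it uses.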
It will be understood by those skilled in the art that all or part of the steps of the above methods may be implemented by instructing the relevant hardware through a program, and the program may be stored in a computer readable storage medium, such as a read-only memory, a magnetic or optical disk, and the like. Alternatively, all or part of the steps of the above embodiments may be implemented using one or more integrated circuits. Accordingly, each module/unit in the above embodiments may be implemented in the form of hardware, and may also be implemented in the form of a software functional module. The present application is not limited to any specific form of hardware or software combination.
The above description is only a preferred example of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (9)

1. A method for real-time incremental data synchronization, applied to importing data from a relational database into the distributed system architecture Hadoop, comprising:
generating, in the column-oriented database HBase, a mapping relation file corresponding to the relational database according to the table structure information of the relational database;
acquiring an operation log of the relational database in real time;
acquiring change data of the relational database according to the acquired operation log, and updating the acquired change data into the HBase of Hadoop according to the generated mapping relation file;
wherein the acquiring of the operation log of the relational database comprises:
receiving change data of the operation log of the relational database corresponding to an identity, and storing the received change data in a message queue in sequence; or,
when no request to acquire the change data has been received and the change data in the message queue exceeds a threshold, storing the change data in the message queue into a corresponding directory file in sequence.
2. The method of claim 1, wherein an identity and a start site of the relational database are preconfigured, and the acquiring of the operation log of the relational database in real time comprises:
acquiring, from the start site, the operation log of the relational database corresponding to the identity, according to the identity and the start site.
3. The method according to claim 1 or 2, wherein after the acquiring of the operation log of the relational database in real time, the method further comprises:
updating the start site of the relational database; and
acquiring the next operation log of the relational database according to the updated start site.
4. The method of claim 3, further comprising: storing the acquired change data in a local file, and recording the update history.
5. A device for real-time incremental data synchronization, applied to importing data from a relational database into the distributed system architecture Hadoop, comprising a table building module, a log obtaining module, a plurality of log analysis client modules and a data updating module, wherein:
the table building module is configured to generate, in the distributed, column-oriented open source database HBase, a mapping relation file corresponding to the relational database according to the table structure information of the relational database;
the log obtaining module is configured to acquire the operation log of the relational database in real time;
each log analysis client module is connected to the log obtaining module and is configured to receive the operation logs and the change data sent by the log obtaining module, and to send the obtained change data to the data updating module;
the data updating module is configured to receive, in real time, the change data sent by each log analysis client module, and to update the obtained change data into the HBase of Hadoop according to the mapping relation file established by the table building module;
the log obtaining module is further configured to:
receive data of the operation log of the relational database corresponding to an identity, and store the received data in a message queue in sequence for the corresponding log analysis client module to request; or,
if the log analysis client module does not request the data, store the data in the message queue into a corresponding directory file in sequence when the data in the message queue exceeds a threshold.
6. The device of claim 5, wherein the log obtaining module is specifically configured to:
preconfigure the unique identity and the unique start site of the relational database; and
acquire, from the start site, the operation log of the relational database corresponding to the identity, according to the identity and the start site.
7. The device of claim 5 or 6, wherein the log obtaining module is further configured to:
update the start site of the relational database; and
acquire the next operation log of the relational database according to the updated start site.
8. The device of claim 7, wherein the log analysis client module is further configured to: store the received change data in a local file when the data updating module is not started.
9. The device of claim 7, wherein the data updating module is further configured to: store the obtained change data in a local file, and record the update history.
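For readers who want a concrete picture of the log acquisition behaviour recited in the claims, the following is a minimal sketch, not the patented implementation. It assumes the operation log can be read as an ordered sequence, treats the start site as an integer index into that sequence, and uses hypothetical names throughout (`LogAcquirer`, `poll`, `fetch`, the spill file naming). The identity is assumed to be safe to use in a file name.

```python
import os
import tempfile
from collections import deque
from typing import Deque, List, Sequence


class LogAcquirer:
    """Toy stand-in for the log obtaining module (all names are hypothetical)."""

    def __init__(self, identity: str, start_site: int,
                 spill_dir: str, threshold: int) -> None:
        self.identity = identity        # preconfigured identity of the database
        self.start_site = start_site    # preconfigured position in the log
        self.spill_dir = spill_dir      # directory that receives overflow files
        self.threshold = threshold      # maximum entries kept in the queue
        self.queue: Deque[str] = deque()
        self.spill_count = 0

    def poll(self, log: Sequence[str]) -> int:
        """Read entries from start_site onward, queue them in order, and
        advance start_site so that the next poll resumes after them."""
        new_entries = log[self.start_site:]
        for entry in new_entries:
            self.queue.append(entry)
        self.start_site += len(new_entries)
        # No client drained the queue in time: move it to a directory file.
        if len(self.queue) > self.threshold:
            self._spill()
        return len(new_entries)

    def _spill(self) -> None:
        """Write queued entries, oldest first, to a file and empty the queue."""
        name = f"{self.identity}-{self.spill_count:06d}.log"
        path = os.path.join(self.spill_dir, name)
        with open(path, "w", encoding="utf-8") as fh:
            while self.queue:
                fh.write(self.queue.popleft() + "\n")
        self.spill_count += 1

    def fetch(self, max_items: int) -> List[str]:
        """Serve a client request: hand out up to max_items entries in order."""
        out: List[str] = []
        while self.queue and len(out) < max_items:
            out.append(self.queue.popleft())
        return out


# Usage: a temporary folder plays the role of the directory file location.
with tempfile.TemporaryDirectory() as spill_dir:
    acq = LogAcquirer("db1", start_site=0, spill_dir=spill_dir, threshold=2)
    acq.poll(["insert a", "update a", "delete a"])  # 3 entries exceed the threshold of 2
    spilled_files = os.listdir(spill_dir)
```

This sketch does not replay spilled files to clients; a real system would need to serve those files before newer queue entries to preserve ordering.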

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410321182.2A CN105243067B (en) 2014-07-07 2014-07-07 A kind of method and device for realizing real-time incremental synchrodata

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410321182.2A CN105243067B (en) 2014-07-07 2014-07-07 A kind of method and device for realizing real-time incremental synchrodata

Publications (2)

Publication Number Publication Date
CN105243067A CN105243067A (en) 2016-01-13
CN105243067B true CN105243067B (en) 2019-06-28

Family

ID=55040718

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410321182.2A Active CN105243067B (en) 2014-07-07 2014-07-07 A kind of method and device for realizing real-time incremental synchrodata

Country Status (1)

Country Link
CN (1) CN105243067B (en)

Families Citing this family (60)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105824744B (en) * 2016-03-21 2018-06-15 焦点科技股份有限公司 A kind of real-time logs capturing analysis method based on B2B platform
CN105847378B (en) * 2016-04-13 2019-06-28 北京思特奇信息技术股份有限公司 A kind of method and system for realizing that big data is synchronous
CN105956123A (en) * 2016-05-03 2016-09-21 无锡雅座在线科技发展有限公司 Local updating software-based data processing method and apparatus
CN106021422B (en) * 2016-05-13 2019-04-09 北京思特奇信息技术股份有限公司 A kind of method and system forming Hive data warehouse based on relevant database
CN106126551A (en) * 2016-06-13 2016-11-16 浪潮电子信息产业股份有限公司 A kind of generation method of Hbase database access daily record, Apparatus and system
CN107577678B (en) * 2016-06-30 2021-02-09 华为技术有限公司 Method, client and server for processing database transaction
CN106294713A (en) * 2016-08-09 2017-01-04 深圳中兴网信科技有限公司 The method of data synchronization resolved based on Incremental Log and data synchronization unit
CN106339408B (en) * 2016-08-10 2019-08-23 深圳中兴网信科技有限公司 Method of data synchronization, data synchronization unit and server
CN106407329B (en) * 2016-09-05 2019-06-25 国网江苏省电力公司南通供电公司 Magnanimity platform automates the method for importing incremental data toward hadoop platform
GB201615748D0 (en) * 2016-09-15 2016-11-02 Gb Gas Holdings Ltd System for importing data into a data repository
CN106446243A (en) * 2016-10-10 2017-02-22 山东浪潮云服务信息科技有限公司 Data integration structure of relational database
CN107967279A (en) * 2016-10-19 2018-04-27 北京国双科技有限公司 The data-updating method and device of distributed data base
WO2018090249A1 (en) 2016-11-16 2018-05-24 Huawei Technologies Co., Ltd. Log-structured storage method and server
CN106599061B (en) * 2016-11-16 2020-06-30 成都九洲电子信息系统股份有限公司 SQLite-based embedded database synchronization method
CN106682119B (en) * 2016-12-08 2020-01-17 南京卡考网络科技有限公司 Asynchronous data synchronization method and system based on http service section and log system
CN106682140A (en) * 2016-12-20 2017-05-17 华北计算技术研究所(中国电子科技集团公司第十五研究所) Multi-system user incremental synchronization method based on timestamps and mapping strategies
CN108228397A (en) * 2016-12-22 2018-06-29 深圳市优朋普乐传媒发展有限公司 The method and apparatus that a kind of cluster span computer room synchronizes
CN108255838B (en) * 2016-12-28 2022-02-18 航天信息股份有限公司 Method and system for establishing intermediate data warehouse for big data analysis
CN106874389B (en) * 2017-01-11 2023-04-07 腾讯科技(深圳)有限公司 Data migration method and device
CN106844682B (en) * 2017-01-25 2019-08-16 北京百分点信息科技有限公司 Method for interchanging data, apparatus and system
CN106960007A (en) * 2017-02-28 2017-07-18 北京京东尚科信息技术有限公司 A kind of data-updating method, device and electronic equipment
CN107330003A (en) * 2017-06-12 2017-11-07 上海藤榕网络科技有限公司 Method of data synchronization, system, memory and data syn-chronization equipment
CN107320959B (en) * 2017-06-28 2020-10-23 网易(杭州)网络有限公司 Game role identification information generation method, device, medium and electronic equipment
CN107180116A (en) * 2017-06-28 2017-09-19 努比亚技术有限公司 A kind of data synchronizing processing method, mobile terminal and computer-readable recording medium
CN108009207A (en) * 2017-11-06 2018-05-08 东软集团股份有限公司 Incremental data inquiry method and device, storage medium, electronic equipment
CN107741994B (en) * 2017-11-09 2021-09-07 校脸科技(北京)有限公司 Data updating method and device
CN107958082B (en) * 2017-12-15 2021-03-26 杭州有赞科技有限公司 Off-line increment synchronization method and system from database to data warehouse
CN108228755A (en) * 2017-12-21 2018-06-29 江苏瑞中数据股份有限公司 The data of MySQL database based on daily record analytic technique to Hadoop platform synchronize clone method
CN110362582B (en) * 2018-04-03 2024-06-18 北京京东尚科信息技术有限公司 Method and device for realizing zero-shutdown upgrading
CN109213792B (en) * 2018-07-06 2021-11-09 武汉斗鱼网络科技有限公司 Data processing method, server, client, device and readable storage medium
CN108920698B (en) * 2018-07-16 2020-11-03 京东数字科技控股有限公司 Data synchronization method, device, system, medium and electronic equipment
CN109189852B (en) * 2018-08-01 2021-05-28 武汉达梦数据库有限公司 Data synchronization method and device for data synchronization
CN110837535A (en) * 2018-08-16 2020-02-25 中国移动通信集团江西有限公司 Data synchronization method, device, equipment and medium
CN109241184B (en) * 2018-08-20 2024-03-15 中国平安人寿保险股份有限公司 Data synchronization method, device, computer equipment and storage medium
CN109241033A (en) * 2018-08-21 2019-01-18 北京京东尚科信息技术有限公司 The method and apparatus for creating real-time data warehouse
CN109582736A (en) * 2018-11-22 2019-04-05 平安科技(深圳)有限公司 Synchronous method, device and the computer equipment of loan transaction list table
CN110175209A (en) * 2019-04-12 2019-08-27 中国人民财产保险股份有限公司 Incremental data synchronization method, system, equipment and storage medium
CN110413595B (en) * 2019-06-28 2022-07-12 万翼科技有限公司 Data migration method applied to distributed database and related device
AU2020311300A1 (en) * 2019-07-09 2022-03-03 Newsouth Innovations Pty Limited Application and database migration to a block chain data lake system
CN110737720A (en) * 2019-09-06 2020-01-31 苏宁云计算有限公司 DB2 database data synchronization method, device and system
CN110727684B (en) * 2019-10-08 2023-07-25 浪潮软件股份有限公司 Incremental data synchronization method for big data statistical analysis
CN111221918A (en) * 2019-11-04 2020-06-02 深圳力维智联技术有限公司 Data updating method, device, product and medium based on relational database
CN111008241A (en) * 2019-11-14 2020-04-14 微民保险代理有限公司 Data synchronization method and device, storage medium and computer equipment
CN111259178A (en) * 2020-01-19 2020-06-09 罗普特科技集团股份有限公司 Image data synchronization method, device and system
CN111339103B (en) * 2020-03-13 2023-06-20 河南安冉云网络科技有限公司 Data exchange method and system based on full-quantity fragmentation and incremental log analysis
CN113407601A (en) * 2020-03-17 2021-09-17 北京国双科技有限公司 Data acquisition method and device, storage medium and electronic equipment
CN113495894B (en) * 2020-04-01 2024-07-16 北京京东振世信息技术有限公司 Data synchronization method, device, equipment and storage medium
CN113515569B (en) * 2020-04-09 2023-12-26 阿里巴巴集团控股有限公司 Data synchronization method, device and system
CN112000678A (en) * 2020-08-20 2020-11-27 北京达佳互联信息技术有限公司 Data synchronization method, device, server and storage medium
CN112069240A (en) * 2020-08-29 2020-12-11 北京明略昭辉科技有限公司 Data synchronization method and system based on Spark Streaming
CN112115200B (en) * 2020-09-16 2023-08-29 北京奇艺世纪科技有限公司 Data synchronization method, device, electronic equipment and readable storage medium
CN112380180A (en) * 2020-11-17 2021-02-19 平安普惠企业管理有限公司 Data synchronization processing method, device, equipment and storage medium
CN112395360B (en) * 2020-12-01 2023-06-23 中国联合网络通信集团有限公司 Data synchronization method, device, apparatus and medium based on non-relational database
CN112765180B (en) * 2021-01-27 2023-01-17 上海英方软件股份有限公司 Method and device for analyzing column names of table building logs of DB2 database
CN112800073B (en) * 2021-01-27 2023-03-28 浪潮云信息技术股份公司 Method for updating Delta Lake based on NiFi
CN112966046B (en) * 2021-03-03 2024-04-12 北京金山云网络技术有限公司 Data synchronization method and device, electronic equipment and storage medium
CN113377871B (en) * 2021-06-22 2024-03-22 特赞(上海)信息科技有限公司 Data synchronization method, device and storage medium
CN114238523A (en) * 2021-12-17 2022-03-25 蚂蚁区块链科技(上海)有限公司 Data synchronization method and device
CN114547199A (en) * 2022-02-23 2022-05-27 阿维塔科技(重庆)有限公司 Database increment synchronous response method and device and computer readable storage medium
CN115374199A (en) * 2022-08-08 2022-11-22 广州小飞信息科技有限公司 Big data based configuration type extensible statistical warehousing system and method thereof

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102841897A (en) * 2011-06-23 2012-12-26 阿里巴巴集团控股有限公司 Incremental data extracting method, device and system
CN103631907A (en) * 2013-11-26 2014-03-12 中国科学院信息工程研究所 Method and system for migrating relational data to HBase
CN103678665A (en) * 2013-12-24 2014-03-26 焦点科技股份有限公司 Heterogeneous large data integration method and system based on data warehouses

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
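Title
"Incremental synchronous update mechanism for databases in heterogeneous environments"; Wang Yubiao et al.; Computer Engineering and Design; 20111231; Vol. 32 (No. 3); page 949, Section 2.2, and page 950, Section 2.4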

Also Published As

Publication number Publication date
CN105243067A (en) 2016-01-13

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Yang Wei; Bai Junwei; Wang Xiaofeng; Feng Shicong

Inventor before: Yang Wei; Bai Junwei; Wang Xiaofeng; Feng Shicong; Wu Minghui

COR Change of bibliographic data
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220628

Address after: 15, second floor, east side of clean coal workshop, No. 68, Shijingshan Road, Shijingshan District, Beijing 100043 (cluster registration)

Patentee after: Beijing Zhizhi Heshu Technology Co.,Ltd.

Address before: 100193 No.310, building 4, courtyard 8, Dongbeiwang West Road, Haidian District, Beijing

Patentee before: MININGLAMP SOFTWARE SYSTEMS Co.,Ltd.
