CN110990435A - Data synchronization method, device and computer readable storage medium - Google Patents

Info

Publication number
CN110990435A
Authority
CN
China
Prior art keywords
data
synchronized
buffer pool
database
thread
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911218606.1A
Other languages
Chinese (zh)
Inventor
乔智
张斌
孙军锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Miaozhen Information Technology Co Ltd
Original Assignee
Miaozhen Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Miaozhen Information Technology Co Ltd
Priority to CN201911218606.1A
Publication of CN110990435A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24552 Database cache management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Abstract

The embodiment of the invention provides a data synchronization method, a data synchronization device and a computer readable storage medium, which relate to the field of databases and are used for synchronizing data to two different types of databases, wherein the method comprises the following steps: acquiring data to be synchronized in a first database, writing the data to be synchronized into a buffer pool, judging whether the quantity of the data to be synchronized in the buffer pool reaches a preset threshold value every preset time, and if so, loading the data to be synchronized in the buffer pool to a second database in batches; the data are temporarily stored in the buffer pool, and when the number of the data to be synchronized in the buffer pool reaches a preset threshold value, the data to be synchronized are loaded to the second database in batches by the database coprocessor, so that the consistency of the data of different databases and the real-time performance of the synchronized data are ensured, and meanwhile, data loss or data collision is avoided.

Description

Data synchronization method, device and computer readable storage medium
Technical Field
The present invention relates to the field of databases, and in particular, to a data synchronization method, apparatus, and computer-readable storage medium.
Background
Storage frameworks have proliferated in the internet era, for example traditional relational databases such as Oracle and MySQL, emerging NoSQL stores such as HBase, Cassandra, and Redis, and full-text search frameworks such as ES (Elasticsearch) and Solr.
In production, the same data is usually stored twice, written separately to HBase and to ES; this easily causes data loss or data conflicts, so data consistency cannot be guaranteed and the real-time requirements on the data cannot be met.
Based on the above problems, a data synchronization method capable of ensuring data consistency of different databases is needed.
Disclosure of Invention
In view of the above, the present invention provides a data synchronization method, apparatus and computer readable storage medium.
In order to achieve the above purpose, the embodiment of the present invention adopts the following technical solutions:
in a first aspect, an embodiment of the present invention provides a data synchronization method for synchronizing data to two different types of databases, including:
acquiring data to be synchronized in a first database, and writing the data to be synchronized into a buffer pool;
judging whether the quantity of the data to be synchronized in the buffer pool reaches a preset threshold value every preset time; and if so, loading the data to be synchronized in the buffer pool to a second database in batch.
In an optional embodiment, after the step of bulk loading the data to be synchronized in the buffer pool to the second database, the method includes:
and emptying the data to be synchronized in the buffer pool.
In an optional embodiment, the step of obtaining data to be synchronized in the first database and writing the data to be synchronized into the buffer pool includes:
creating a data writing thread; the write thread comprises a write event;
and writing the data to be synchronized into the buffer pool by triggering the write event.
In an optional embodiment, the step of determining whether the number of the data to be synchronized in the buffer pool reaches a predetermined threshold includes:
creating a data synchronization thread;
executing the data synchronization thread every preset time; and the data synchronization thread is used for judging whether the data to be synchronized in the buffer pool reaches a preset threshold value.
In a second aspect, an embodiment of the present invention provides a data synchronization apparatus for synchronizing data to two different types of databases, including:
an acquisition module, configured to acquire data to be synchronized in a first database and write the data to be synchronized into a buffer pool;
the processing module is used for judging whether the quantity of the data to be synchronized in the buffer pool reaches a preset threshold value every preset time; and if so, loading the data to be synchronized in the buffer pool to a second database in batch.
In an optional embodiment, the processing module is further configured to empty the data to be synchronized in the buffer pool after the data to be synchronized in the buffer pool is loaded to the second database in batch.
In an optional embodiment, the processing module is further configured to create a data write thread; the write thread comprises a write event;
and is further configured to write the data to be synchronized into the buffer pool by triggering the write event.
In an optional embodiment, the processing module is further configured to create a data synchronization thread;
and further for executing the data synchronization thread once every predetermined time; and the data synchronization thread is used for judging whether the data to be synchronized in the buffer pool reaches a preset threshold value.
In a third aspect, an embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the data synchronization method according to any one of the foregoing embodiments.
The data synchronization method, the data synchronization device and the computer-readable storage medium provided by the embodiment of the invention are used for synchronizing data to two different types of databases, and the method comprises the following steps: acquiring data to be synchronized in a first database, writing the data to be synchronized into a buffer pool, judging whether the quantity of the data to be synchronized in the buffer pool reaches a preset threshold value every preset time, and if so, loading the data to be synchronized in the buffer pool to a second database in batches; the data are temporarily stored in the buffer pool, and when the number of the data to be synchronized in the buffer pool reaches a preset threshold value, the data to be synchronized are loaded to the second database in batches by the database coprocessor, so that the consistency of the data of different databases and the real-time performance of the synchronized data are ensured, and meanwhile, data loss or data collision is avoided.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
Fig. 1 illustrates a data synchronization method provided by an embodiment of the present invention.
Fig. 2 illustrates another data synchronization method provided by the embodiment of the present invention.
Fig. 3 is a schematic diagram illustrating functional modules of a data synchronization apparatus according to an embodiment of the present invention.
Reference numerals: 100 - data synchronization apparatus; 110 - acquisition module; 120 - processing module.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
It is noted that relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The first database and the second database in this embodiment are HBase and ES, respectively. HBase introduced the Coprocessor in version 0.92. A coprocessor is a framework that runs inside the Master/RegionServer processes and can execute user-written code, which makes distributed data processing tasks flexible to implement. HBase supports two types of coprocessor: Endpoint and Observer. An Endpoint coprocessor is similar to a stored procedure in a traditional database: the client calls the Endpoint coprocessor to execute a piece of server-side code, and the result is returned to the client for further processing; the most common use is aggregation. An Observer coprocessor is similar to a trigger in a traditional database and is called by the server when certain events occur. Observers are hook functions scattered through the HBase server-side code that are invoked when specific events occur. For example, the hook function prePut is called by the RegionServer before a put operation is executed, and the hook function postPut is called after the put operation.
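The Observer hook ordering described above can be illustrated with a small stand-alone sketch. It does not use the HBase API: `PutHooks`, `TinyRegion`, and `RecordingHooks` are hypothetical names invented for this illustration, and only the method names `prePut` and `postPut` echo the hooks named in the text.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical hook interface; it only mirrors the hook names used in the text.
interface PutHooks {
    void prePut(String rowKey, String value);
    void postPut(String rowKey, String value);
}

// Hypothetical stand-in for a region server that keeps rows in memory.
class TinyRegion {
    private final Map<String, String> rows = new HashMap<>();
    private final List<PutHooks> observers = new ArrayList<>();

    void register(PutHooks hooks) {
        observers.add(hooks);
    }

    // The server, not the client, calls the hooks around the actual write.
    void put(String rowKey, String value) {
        for (PutHooks h : observers) h.prePut(rowKey, value);
        rows.put(rowKey, value);
        for (PutHooks h : observers) h.postPut(rowKey, value);
    }

    String get(String rowKey) {
        return rows.get(rowKey);
    }
}

// Records the order in which the hooks fire.
class RecordingHooks implements PutHooks {
    final List<String> events = new ArrayList<>();

    @Override
    public void prePut(String rowKey, String value) {
        events.add("prePut:" + rowKey);
    }

    @Override
    public void postPut(String rowKey, String value) {
        events.add("postPut:" + rowKey);
    }
}
```

Registering a `RecordingHooks` instance on a `TinyRegion` and calling `put("row1", "v1")` records the prePut event before the row is stored and the postPut event after it, which is the trigger-like behavior the paragraph attributes to Observers.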
Referring to fig. 1, a data synchronization method according to an embodiment of the present invention is shown.
Step 101, acquiring data to be synchronized in a first database, and writing the data to be synchronized into a buffer pool.
Step 102, judging whether the quantity of the data to be synchronized in the buffer pool reaches a preset threshold value every preset time.
Step 103, loading the data to be synchronized in the buffer pool to a second database in batch.
The data synchronization method provided by this embodiment is used to synchronize data to two different types of databases, and includes obtaining data to be synchronized in a first database, writing the data to be synchronized into a buffer pool, then determining whether the number of the data to be synchronized in the buffer pool reaches a predetermined threshold value every predetermined time, and if the number of the data to be synchronized in the buffer pool reaches the predetermined threshold value, loading the data to be synchronized in the buffer pool to a second database in batch. In the practical application process, data is written into the HBase firstly, and the data is automatically loaded into the ES in batch through a Coprocessor (Coprocessor) of the HBase, so that the consistency of the data and the real-time performance of synchronous data are ensured, and data loss or data collision is avoided.
Referring to fig. 2, another data synchronization method according to an embodiment of the present invention is shown.
It should be noted that the basic principle and technical effect of the data synchronization method provided by this embodiment are the same as those of the above embodiment; for brevity, for parts not mentioned in this embodiment, refer to the corresponding content in the above embodiment.
Step 101, acquiring data to be synchronized in a first database, and writing the data to be synchronized into a buffer pool.
User-written code is executed, in which the ES client connects to and accesses the first database; the purpose of this step is to write the data to be synchronized into the buffer pool.
It should be noted that step 101 includes two sub-steps, and details not mentioned in this step are set forth in the sub-steps.
Sub-step 101-1, creates a data write thread.
A custom class inheriting from BaseRegionObserver is written, overriding four methods: start(), stop(), postPut(), and postDelete(). These correspond, respectively, to the coprocessor starting to run, the coprocessor finishing, the postPut event triggered when data is stored into HBase, and the postDelete event triggered when data is deleted from HBase. The code that initializes the ES client is written in start(), and the Scheduled object defined by the ES client is closed in stop().
Sub-step 101-2, writing the data to be synchronized into the buffer pool by triggering a write event.
The postPut event triggers the storage of data in the buffer pool.
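A minimal sketch of sub-steps 101-1 and 101-2 follows. It is plain Java, not real HBase code: an actual coprocessor would extend `BaseRegionObserver`, whose hook signatures differ and are not reproduced here. `BufferingObserver`, `Op`, and the in-memory queue are assumptions made for the illustration, the ES client is represented only by a flag, and buffering delete events alongside puts is likewise an assumption.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical record of one change waiting to be synchronized.
class Op {
    final String kind;   // "put" or "delete"
    final String rowKey;

    Op(String kind, String rowKey) {
        this.kind = kind;
        this.rowKey = rowKey;
    }
}

// Hypothetical observer exposing the four lifecycle methods named in the text.
class BufferingObserver {
    // Buffer pool shared between the write hooks and a later bulk loader.
    final Queue<Op> bufferPool = new ConcurrentLinkedQueue<>();
    private boolean clientOpen = false; // stands in for the ES client

    // Coprocessor starts: initialize the (simulated) ES client.
    void start() {
        clientOpen = true;
    }

    // Coprocessor stops: release the (simulated) ES client.
    void stop() {
        clientOpen = false;
    }

    // Fired after a row is stored in the first database.
    void postPut(String rowKey) {
        bufferPool.add(new Op("put", rowKey));
    }

    // Fired after a row is deleted from the first database (assumed to be buffered too).
    void postDelete(String rowKey) {
        bufferPool.add(new Op("delete", rowKey));
    }

    boolean isClientOpen() {
        return clientOpen;
    }
}
```

The point of the design is that the write hooks only append to the buffer pool and return; nothing is sent to the second database on the write path.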
Step 102, judging whether the quantity of the data to be synchronized in the buffer pool reaches a preset threshold value every preset time.
It should be noted that step 102 includes two substeps, and details not mentioned in this step are set forth in the substeps.
Sub-step 102-1, creates a data synchronization thread.
In the key code for bulk loading into ES, a ScheduledExecutorService thread is created, and the ScheduledExecutorService is used to execute a task periodically; the task performs the data synchronization.
Sub-step 102-2, executing the data synchronization thread every predetermined time to determine whether the amount of data to be synchronized in the buffer pool reaches a predetermined threshold.
If yes, go to step 103.
The data synchronization thread is executed at every predetermined time interval to judge whether the data to be synchronized in the buffer pool needs to be bulk loaded; if the quantity of data to be synchronized in the buffer pool reaches the predetermined threshold, the data to be synchronized in the buffer pool is loaded into ES.
In one embodiment, the predetermined time is 30s, and the predetermined threshold is 10000; other values are possible, as the case may be, and are not limiting herein.
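The periodic threshold check of sub-steps 102-1 and 102-2 might be sketched as follows, using the standard `java.util.concurrent.ScheduledExecutorService`. The class `PeriodicSync`, its method names, and the `bulkSink` callback (standing in for an ES bulk request) are hypothetical; only the 30 s period and the threshold of 10000 in the usage note come from the text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical periodic synchronizer; the sink stands in for an ES bulk request.
class PeriodicSync {
    private final List<String> bufferPool = new ArrayList<>();
    private final int threshold;
    private final Consumer<List<String>> bulkSink;

    PeriodicSync(int threshold, Consumer<List<String>> bulkSink) {
        this.threshold = threshold;
        this.bulkSink = bulkSink;
    }

    // Step 101: a write event appends one record to the buffer pool.
    synchronized void write(String record) {
        bufferPool.add(record);
    }

    // Steps 102 and 103: if the pool holds at least `threshold` records,
    // hand a copy to the sink and empty the pool. Returns true if it flushed.
    synchronized boolean checkAndFlush() {
        if (bufferPool.size() < threshold) {
            return false;
        }
        bulkSink.accept(new ArrayList<>(bufferPool));
        bufferPool.clear();
        return true;
    }

    synchronized int pending() {
        return bufferPool.size();
    }

    // Runs checkAndFlush once per period on a single scheduler thread.
    ScheduledExecutorService schedule(long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(this::checkAndFlush, period, period, unit);
        return scheduler;
    }
}
```

With the values of this embodiment, a caller would construct `new PeriodicSync(10000, sink)` and call `schedule(30, TimeUnit.SECONDS)`, then call `shutdown()` on the returned scheduler when the coprocessor stops. Note that below the threshold nothing is flushed in this sketch, so records can wait for more than one period.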
Step 103, loading the data to be synchronized in the buffer pool to a second database in batch.
It should be noted that, to ensure the thread safety of the bulk process, the bulk process must be locked so that different threads do not operate on the same buffer pool data at the same time.
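One way to lock the bulk process is sketched below with `java.util.concurrent.locks.ReentrantLock`. The class `LockedBufferPool` and the `bulkSink` callback are hypothetical, and the source does not say which locking primitive is used, so this is an assumption rather than the described implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Consumer;

// Hypothetical buffer pool whose bulk step is guarded by a lock.
class LockedBufferPool {
    private final ReentrantLock lock = new ReentrantLock();
    private List<String> pool = new ArrayList<>();

    // Writers take the lock so they never append while a bulk load is in progress.
    void add(String record) {
        lock.lock();
        try {
            pool.add(record);
        } finally {
            lock.unlock();
        }
    }

    // The whole bulk step runs under the lock: load the batch, then start a fresh pool.
    // Returns the number of records handed to the sink.
    int bulkLoad(Consumer<List<String>> bulkSink) {
        lock.lock();
        try {
            List<String> batch = pool;
            bulkSink.accept(batch);
            pool = new ArrayList<>();
            return batch.size();
        } finally {
            lock.unlock();
        }
    }
}
```

Holding the lock for the duration of the sink call is the simplest way to keep writers away from the batch being loaded, at the cost of blocking writes while the load runs. A variant could swap in the fresh list under the lock and call the sink after releasing it.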
Specifically, the written code is packaged with Maven, the packaging tool integrated in the development tool, and uploaded to a specified path of the distributed file system for execution, thereby implementing bulk loading of the data.
In summary, the data synchronization method, apparatus and computer-readable storage medium provided in the embodiments of the present invention are used for synchronizing data to two different types of databases, and the method includes: acquiring data to be synchronized in a first database, writing the data to be synchronized into a buffer pool, judging whether the quantity of the data to be synchronized in the buffer pool reaches a preset threshold value every preset time, and if so, loading the data to be synchronized in the buffer pool to a second database in batches; the data are temporarily stored in the buffer pool, and when the number of the data to be synchronized in the buffer pool reaches a preset threshold value, the data to be synchronized are loaded to the second database in batches by the database coprocessor, so that the consistency of the data of different databases and the real-time performance of the synchronized data are ensured, and meanwhile, data loss or data collision is avoided.
Fig. 3 is a schematic functional module diagram of a data synchronization apparatus according to an embodiment of the present invention. It should be noted that the basic principle and the technical effect are the same as those of the foregoing method embodiments, and for the sake of brief description, the corresponding contents in the foregoing method embodiments may be referred to for the parts that are not mentioned in this embodiment. The data synchronization apparatus 100 is used for performing the data synchronization method described in fig. 1 and 2, and includes an acquisition module 110 and a processing module 120.
It is understood that in one embodiment, step 101 is performed by the acquisition module 110.
It is understood that in one embodiment, steps 102 and 103 are performed by processing module 120.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (9)

1. A data synchronization method for synchronizing data to two different types of databases, comprising:
acquiring data to be synchronized in a first database, and writing the data to be synchronized into a buffer pool;
judging whether the quantity of the data to be synchronized in the buffer pool reaches a preset threshold value every preset time; and if so, loading the data to be synchronized in the buffer pool to a second database in batch.
2. The method of claim 1, wherein the step of bulk loading the data to be synchronized in the buffer pool to a second database is followed by:
and emptying the data to be synchronized in the buffer pool.
3. The method according to claim 1, wherein the step of obtaining the data to be synchronized in the first database and writing the data to be synchronized into the buffer pool comprises:
creating a data writing thread; the write thread comprises a write event;
and writing the data to be synchronized into the buffer pool by triggering the write event.
4. The method according to claim 1, wherein the step of determining whether the amount of the data to be synchronized in the buffer pool reaches a predetermined threshold comprises:
creating a data synchronization thread;
executing the data synchronization thread every preset time; and the data synchronization thread is used for judging whether the data to be synchronized in the buffer pool reaches a preset threshold value.
5. A data synchronization apparatus for synchronizing data to two different types of databases, comprising:
an acquisition module, configured to acquire data to be synchronized in a first database and write the data to be synchronized into a buffer pool;
the processing module is used for judging whether the quantity of the data to be synchronized in the buffer pool reaches a preset threshold value every preset time; and if so, loading the data to be synchronized in the buffer pool to a second database in batch.
6. The apparatus of claim 5,
the processing module is further configured to empty the data to be synchronized in the buffer pool after the data to be synchronized in the buffer pool is loaded to a second database in batches.
7. The apparatus of claim 5,
the processing module is also used for creating a data writing thread; the write thread comprises a write event;
and is further configured to write the data to be synchronized into the buffer pool by triggering the write event.
8. The apparatus of claim 5,
the processing module is also used for creating a data synchronization thread;
and further for executing the data synchronization thread once every predetermined time; and the data synchronization thread is used for judging whether the data to be synchronized in the buffer pool reaches a preset threshold value.
9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the data synchronization method according to any one of claims 1 to 4.
CN201911218606.1A 2019-12-03 2019-12-03 Data synchronization method, device and computer readable storage medium Pending CN110990435A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911218606.1A CN110990435A (en) 2019-12-03 2019-12-03 Data synchronization method, device and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911218606.1A CN110990435A (en) 2019-12-03 2019-12-03 Data synchronization method, device and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN110990435A true CN110990435A (en) 2020-04-10

Family

ID=70089478

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911218606.1A Pending CN110990435A (en) 2019-12-03 2019-12-03 Data synchronization method, device and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN110990435A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112597172A (en) * 2021-01-05 2021-04-02 中国铁塔股份有限公司 Data writing method, system and storage medium
CN112948486A (en) * 2021-02-04 2021-06-11 北京淇瑀信息科技有限公司 Batch data synchronization method and system and electronic equipment
CN113742096A (en) * 2021-07-14 2021-12-03 广州市玄武无线科技股份有限公司 Method and system for realizing event queue
CN114363361A (en) * 2022-03-17 2022-04-15 武汉中科通达高新技术股份有限公司 Data synchronization method and device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103488695A (en) * 2013-09-02 2014-01-01 用友软件股份有限公司 Data synchronizing device and data synchronizing method
CN105446893A (en) * 2014-07-14 2016-03-30 阿里巴巴集团控股有限公司 Data storage method and device
CN107153644A (en) * 2016-03-02 2017-09-12 阿里巴巴集团控股有限公司 A kind of method of data synchronization and device
CN107741965A (en) * 2017-09-30 2018-02-27 北京奇虎科技有限公司 Database synchronization processing method, device, computing device and computer-readable storage medium
CN108268497A (en) * 2016-12-30 2018-07-10 北京国双科技有限公司 The method of data synchronization and device of relevant database

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103488695A (en) * 2013-09-02 2014-01-01 用友软件股份有限公司 Data synchronizing device and data synchronizing method
CN105446893A (en) * 2014-07-14 2016-03-30 阿里巴巴集团控股有限公司 Data storage method and device
CN107153644A (en) * 2016-03-02 2017-09-12 阿里巴巴集团控股有限公司 A kind of method of data synchronization and device
CN108268497A (en) * 2016-12-30 2018-07-10 北京国双科技有限公司 The method of data synchronization and device of relevant database
CN107741965A (en) * 2017-09-30 2018-02-27 北京奇虎科技有限公司 Database synchronization processing method, device, computing device and computer-readable storage medium

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112597172A (en) * 2021-01-05 2021-04-02 中国铁塔股份有限公司 Data writing method, system and storage medium
CN112948486A (en) * 2021-02-04 2021-06-11 北京淇瑀信息科技有限公司 Batch data synchronization method and system and electronic equipment
CN113742096A (en) * 2021-07-14 2021-12-03 广州市玄武无线科技股份有限公司 Method and system for realizing event queue
CN113742096B (en) * 2021-07-14 2022-04-22 广州市玄武无线科技股份有限公司 Method and system for realizing event queue
CN114363361A (en) * 2022-03-17 2022-04-15 武汉中科通达高新技术股份有限公司 Data synchronization method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN110990435A (en) Data synchronization method, device and computer readable storage medium
US10235337B2 (en) Distributed work flow using database replication
US7383470B2 (en) Method, system, and apparatus for identifying unresponsive portions of a computer program
US11436353B2 (en) Merge updates for key value stores
US11029943B1 (en) Processing framework for in-system programming in a containerized environment
US20150205633A1 (en) Task management in single-threaded environments
US11366788B2 (en) Parallel pipelined processing for snapshot data deletion
US20210064644A1 (en) Yaml configuration modeling
CN113360270A (en) Data cleaning task processing method and device
CA2950688C (en) System and method for recording the beginning and ending of job level activity in a mainframe computing environment
AU2015265599B2 (en) System and method for the production of job level pre-processed backup of critical data and/or datasets in a mainframe computing environment
US9924002B1 (en) Managing stateless processes
CN107832403B (en) Directory file management method and device, electronic terminal and readable storage medium
CN108334333B (en) Method and device for updating source code base
CN110705715B (en) Hyper-parameter management method and device and electronic equipment
US10387887B2 (en) Bloom filter driven data synchronization
CN107729107B (en) Modal dialog box processing method and device
CN107958414B (en) Method and system for eliminating long transactions of CICS (common integrated circuit chip) system
CN114721876A (en) Data backup method, device and medium
CN113806197A (en) Page loading duration calculation method and device
CN114816852A (en) Method, device and medium for recovering user configuration data
CN113934698A (en) Log compression
CN107967275B (en) Data processing method and device in relational database
CN106815001B (en) Method and device for detecting configuration file information
Munnich et al. Calculating worst-case execution times of transactions in databases for event-driven, hard real-time embedded systems

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200410