CN108694199A - Data synchronization unit, method, storage medium and electronic equipment - Google Patents

Publication number
CN108694199A
Authority
CN
China
Prior art keywords
data
task
subtask
thread
judging
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710229817.XA
Other languages
Chinese (zh)
Inventor
何林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201710229817.XA
Publication of CN108694199A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure provides a data synchronization device, a data synchronization method, a computer-readable storage medium, and an electronic device. The device includes: a task allocation module configured to periodically calculate the total amount of data to be synchronized in a source database, split the data synchronization task corresponding to that data into multiple subtasks according to the total amount, and store the subtasks in a task buffer queue; a data reading module configured to determine whether a thread has obtained a subtask from the task buffer queue and, if so, to read the data corresponding to the subtask from the source database and store it in a data buffer queue; and a data writing module configured to determine whether the data buffer queue contains data corresponding to the subtask and, if so, to obtain that data from the data buffer queue and write it to a target database. The disclosure improves data synchronization efficiency.

Description

Data synchronization device, method, storage medium and electronic device
Technical Field
The present disclosure relates to the technical field of data processing, and in particular to a data synchronization device, a data synchronization method, a computer-readable storage medium, and an electronic device.
Background
With the development of communication technology, the scale of data applications keeps growing, so data table structures often have to be redesigned. Once a new table has been designed while the old table still stores users' live data, the data in the old table must be synchronized and imported into the new table.
Currently, data can be synchronized with a database's built-in synchronization function, which is usually applicable only to tables with the same structure. In this approach, one or more threads generally read the old table's data, process it as required, and then write it to the new table. Examples include master-slave synchronization between data centers and business-logic synchronization performed by developers.
However, this approach has the following problems. First, it usually mixes system structure with business logic, so it suits only one-off synchronization of small, simple data sets and cannot handle scheduled or real-time synchronization of large amounts of data. Second, errors occur easily during synchronization, so accuracy and reliability are low. Third, it requires constant I/O access during data processing, which lowers synchronization efficiency and increases system overhead and resource usage.
It should be noted that the information disclosed in the Background section above is only intended to aid understanding of the background of the present disclosure, and may therefore include information that does not constitute prior art known to a person of ordinary skill in the art.
Summary
An object of the present disclosure is to provide a data synchronization device, a data synchronization method, a computer-readable storage medium, and an electronic device that overcome, at least to some extent, one or more problems caused by the limitations and defects of the related art.
According to one aspect of the present disclosure, a data synchronization device is provided, the device including:
A task allocation module, configured to periodically calculate the total amount of data to be synchronized in a source database, split the data synchronization task corresponding to the data to be synchronized into multiple subtasks according to the total amount, and store the subtasks in a task buffer queue;
A data reading module, configured to determine whether a thread has obtained a subtask from the task buffer queue and, when the thread has obtained the subtask, to read the data corresponding to the subtask from the source database and store it in a data buffer queue;
A data writing module, configured to determine whether the data buffer queue contains data corresponding to the subtask and, when such data exists, to obtain the data corresponding to the subtask from the data buffer queue and write it to a target database.
In an exemplary embodiment of the present disclosure, the task allocation module includes:
A main thread determination unit, configured to trigger multiple threads within the same period to compete for a distributed lock simultaneously, to designate the thread that obtains the distributed lock as the main thread, and to designate the threads that do not obtain the distributed lock as standby threads;
A main thread control unit, configured to control the main thread to execute the data allocation task corresponding to the data to be synchronized.
In an exemplary embodiment of the present disclosure, the main thread determination unit includes:
A task judgment subunit, configured to determine whether the main thread has successfully executed the data allocation task and, when it has, to control the standby threads to stop waiting;
A state change subunit, configured to control the standby threads to compete for the distributed lock again when it is determined that the main thread has not successfully executed the data allocation task.
In an exemplary embodiment of the present disclosure, the data reading module includes:
A task detection unit, configured to detect whether a task exists in the task buffer queue and, when a task is detected, to obtain a subtask and read the data corresponding to the subtask;
A subtask acquisition unit, configured to determine, when no task is detected in the task buffer queue, whether the time for which the thread has failed to obtain a subtask exceeds a preset time;
A sleep control unit, configured to control the thread to enter a sleep state for a preset duration when the time for which the thread has failed to obtain a subtask exceeds the preset time;
A loop detection unit, configured to cyclically detect whether a task exists in the task buffer queue when the time for which the thread has failed to obtain a subtask is less than the preset time.
In an exemplary embodiment of the present disclosure, the sleep control unit includes:
A sleep time subunit, configured to determine whether the sleep time of the thread exceeds a preset number of periods;
A thread termination subunit, configured to terminate the thread when its sleep time exceeds the preset number of periods.
In an exemplary embodiment of the present disclosure, the data reading module further includes:
A data storage unit, configured to determine whether data has been read successfully and, when it has, to store the data read in the data buffer queue;
A first return unit, configured to return the data reading task to the task buffer queue when it is determined that the data has not been read successfully.
In an exemplary embodiment of the present disclosure, the data storage unit includes:
A task return subunit, configured to determine whether the data has been successfully stored in the data buffer queue and, when it has not, to return the data storage task to the task buffer queue.
In an exemplary embodiment of the present disclosure, the data writing module further includes:
A write judgment unit, configured to determine whether the data has been successfully written to the target database and, when it has, to release the distributed lock;
A second return unit, configured to return the data to the data buffer queue when it is determined that the data has not been successfully written to the target database.
According to one aspect of the present disclosure, a data synchronization method is provided, including:
Periodically calculating the total amount of data to be synchronized in a source database, splitting the data synchronization task corresponding to the data to be synchronized into multiple subtasks according to the total amount, and storing the subtasks in a task buffer queue;
Determining whether a thread has obtained a subtask from the task buffer queue and, when the thread has obtained the subtask, reading the data corresponding to the subtask from the source database and storing it in a data buffer queue;
Determining whether the data buffer queue contains data corresponding to the subtask and, when such data exists, obtaining the data corresponding to the subtask from the data buffer queue and writing it to a target database.
According to one aspect of the present disclosure, a computer-readable storage medium is provided, on which a computer program is stored; when executed by a processor, the computer program implements the data synchronization method described in any one of the above.
According to one aspect of the present disclosure, an electronic device is provided, including:
A processor; and
A memory for storing instructions executable by the processor;
Wherein the processor is configured to perform the data synchronization method described in any one of the above by executing the executable instructions.
In the data synchronization device and data synchronization method provided by the present disclosure, the total amount of data to be synchronized in the source database is calculated, and the corresponding data synchronization task is split into multiple subtasks; when a thread is determined to have obtained a subtask from the task buffer queue, the data corresponding to the subtask is read and stored in the data buffer queue; and the data stored in the data buffer queue that corresponds to the subtask is written to the target database. On the one hand, splitting the data synchronization task into separate tasks simplifies the mode and amount of operation involved in data synchronization and improves synchronization efficiency. On the other hand, once subtasks have been obtained, data is read and written in parallel by multiple different threads, which improves the accuracy and reliability of data synchronization.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory, and do not limit the present disclosure.
Brief Description of the Drawings
The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments consistent with the present disclosure and, together with the specification, serve to explain its principles. Obviously, the drawings described below show only some embodiments of the present disclosure; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 schematically shows a block diagram of a data synchronization device.
Fig. 2 schematically shows task allocation in an exemplary embodiment of the present disclosure.
Fig. 3 schematically shows the data reading process in an exemplary embodiment of the present disclosure.
Fig. 4 schematically shows the data writing process in an exemplary embodiment of the present disclosure.
Fig. 5 schematically shows a flowchart of a data synchronization method.
Fig. 6 schematically shows a module diagram of the electronic device in an exemplary embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the example embodiments can be implemented in many forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that the present disclosure will be thorough and complete and will fully convey the concept of the example embodiments to those skilled in the art. The same reference numerals in the figures denote the same or similar parts, so repeated descriptions of them are omitted.
In addition, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, many specific details are provided to give a full understanding of the embodiments of the present disclosure. However, those skilled in the art will appreciate that the technical solutions of the present disclosure may be practiced without one or more of these specific details, or with other methods, components, materials, devices, steps, and so on. In other cases, well-known structures, methods, devices, implementations, materials, or operations are not shown or described in detail, to avoid obscuring aspects of the present disclosure.
The block diagrams shown in the drawings are only functional entities and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in software, or these functional entities or parts of them may be implemented in one or more software-hardened modules, or in different networks and/or processor devices and/or microcontroller devices.
This example embodiment first provides a data synchronization device that can be applied to business scenarios such as message broadcasting, distributed computing, and others, and that can synchronize and share data across different storage devices, between terminals and servers, or between terminals. Referring to Fig. 1, the data synchronization device 100 may include a task allocation module 101, a data reading module 102, and a data writing module 103, where:
The task allocation module 101 may be configured to periodically calculate the total amount of data to be synchronized in a source database, split the data synchronization task corresponding to the data to be synchronized into multiple subtasks according to the total amount, and store the subtasks in a task buffer queue;
The data reading module 102 may be configured to determine whether a thread has obtained a subtask from the task buffer queue and, when the thread has obtained the subtask, to read the data corresponding to the subtask from the source database and store it in a data buffer queue;
The data writing module 103 may be configured to determine whether the data buffer queue contains data corresponding to the subtask and, when such data exists, to obtain the data corresponding to the subtask from the data buffer queue and write it to a target database.
In the data synchronization device provided by the present disclosure, the total amount of data to be synchronized in the source database is calculated, and the corresponding data synchronization task is split into multiple subtasks; when a thread is determined to have obtained a subtask from the task buffer queue, the data corresponding to the subtask is read and stored in the data buffer queue; and the data stored in the data buffer queue that corresponds to the subtask is written to the target database. On the one hand, splitting the data synchronization task into separate tasks simplifies the mode and amount of operation involved in data synchronization and improves synchronization efficiency. On the other hand, once subtasks have been obtained, data is read and written in parallel by multiple different threads, which improves the accuracy and reliability of data synchronization.
The modules of the data synchronization device in this example embodiment are further described below with reference to Figs. 1 to 4.
The task allocation module 101 may be configured to periodically calculate the total amount of data to be synchronized in a source database, split the data synchronization task corresponding to the data to be synchronized into multiple subtasks according to the total amount, and store the subtasks in a task buffer queue.
In this example embodiment, the source database may be a database that directly provides raw information or specific data, and may include a full-text database, a terminology database, a numeric database, a text-numeric database, an image database, and so on, for example an electronic dictionary or a product catalog. Data in the source database may be stored in tables. The data to be synchronized may be all the data in the source database or only the part that needs to be synchronized and shared, and can be configured according to the user's specific needs. The operation performed on the data to be synchronized is referred to as the data synchronization task. Data synchronization may be real-time, meaning that once a record is written or updated in the old table, it can quickly be read from the new table. For example, in areas where information security requirements are high and there is no network, only a local area network, or restricted external network access, data may be synchronized directly between multiple computers; alternatively, data may be synchronized through a network service.
In this example embodiment, the data synchronization task may be split into multiple subtasks by function or in other ways, and the subtasks may be the same or different. The data synchronization task may be executed by multiple processes or multiple threads, with each thread handling one of the subtasks that make up the task. For example, generating a payroll report from the database and transferring it to a target file is one subtask; issuing a query to the source database while the payroll report is being generated is another. After the data synchronization task has been split, all subtasks are stored in the task buffer queue, which completes the task allocation phase for one period. Task allocation for the next period then proceeds in the same way. Storing the split subtasks in the task buffer queue makes task scheduling asynchronous: one thread pushes tasks for the source database into the task buffer queue, and another thread takes tasks out of the queue and processes them. The task buffer queue ensures that there is no concurrency conflict between allocating and obtaining tasks, so the two threads need no further synchronization, which improves task allocation efficiency. Task scheduling can be implemented with a thread pool plus the task buffer queue: for example, the task buffer queue is checked for tasks; when a task is detected, it is added to the work queue; after the task has been executed, the worker returns to the idle queue, which is reclaimed periodically so that task allocation for the next period can run.
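The allocation step described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the task is split by row range into (offset, limit) pairs; the text only says the split may be "by function or in other ways". The names `split_into_subtasks`, `allocate`, and `chunk_size` are hypothetical and do not come from the disclosure.

```python
import queue


def split_into_subtasks(total_rows, chunk_size):
    """Cut one synchronization task covering total_rows rows into (offset, limit) subtasks.

    Splitting by row range is an assumption made for this sketch.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (offset, min(chunk_size, total_rows - offset))
        for offset in range(0, total_rows, chunk_size)
    ]


def allocate(total_rows, chunk_size, task_queue):
    """Store every subtask in the task buffer queue and return how many were stored."""
    subtasks = split_into_subtasks(total_rows, chunk_size)
    for subtask in subtasks:
        task_queue.put(subtask)
    return len(subtasks)


# One allocation period: 10 rows to synchronize, at most 4 rows per subtask.
task_queue = queue.Queue()
count = allocate(total_rows=10, chunk_size=4, task_queue=task_queue)
```

Because `queue.Queue` is thread-safe, reader threads can take subtasks from it without any locking of their own, which matches the asynchronous hand-off the paragraph describes.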
In addition, in this example embodiment, the task allocation module may include:
A main thread determination unit, which may be configured to trigger multiple threads within the same period to compete for a distributed lock simultaneously, to designate the thread that obtains the distributed lock as the main thread, and to designate the threads that do not obtain the distributed lock as standby threads;
A main thread control unit, which may be configured to control the main thread to execute the data allocation task corresponding to the data to be synchronized.
In this example embodiment, the allocation operation may be performed by only one thread in one process; that is, only one thread calculates the total task amount, splits the task, and stores the split tasks in the task buffer queue. To this end, within the same period, the multiple threads triggered to execute the data synchronization task compete for a distributed lock simultaneously. The thread that obtains the lock is designated the main thread, and the remaining threads that do not obtain it are designated standby threads. There is exactly one main thread, while there may be multiple standby threads. The main thread allocates tasks and may also execute tasks after allocating them; standby threads may execute different tasks.
In this example embodiment, a distributed lock is a way of controlling synchronized access to shared resources across a distributed system; it prevents interference and thereby guarantees consistency. A distributed lock can be implemented on a database, on a cache (Redis, Memcached, Tair), or with Zookeeper ephemeral sequential nodes, so that in a distributed application cluster the same method or task is executed by only one thread on one machine at any given time. For example, Redis can acquire the lock directly with a SETNX command on a key such as foo.lock; a return value of 1 indicates that the client has obtained the distributed lock. With Zookeeper, a lock can be implemented as follows: all clients compete to write a given piece of data; only the first client succeeds and the others fail; the client that succeeds holds the lock, and the clients that fail register a watch event and wait for the lock to be released so that they can compete for it again. Using a distributed lock prevents the same task from being obtained by two threads at once, avoids data being synchronized repeatedly because tasks were allocated more than once, and improves the reliability and validity of data synchronization.
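The set-if-absent behaviour that this paragraph attributes to SETNX can be sketched without a real Redis or Zookeeper server. The class `InMemoryLockStore` below is a hypothetical in-process stand-in, not a client for either system, and `elect_main` and the key name `sync.lock` are likewise assumptions made only for illustration.

```python
import threading


class InMemoryLockStore:
    """Hypothetical stand-in for a shared store; it is NOT a Redis or Zookeeper client."""

    def __init__(self):
        self._guard = threading.Lock()
        self._keys = {}

    def set_if_absent(self, key, owner):
        """Mimic set-if-not-exists semantics: only the first caller gets True."""
        with self._guard:
            if key in self._keys:
                return False
            self._keys[key] = owner
            return True

    def release(self, key, owner):
        """Delete the key only if it is still held by owner."""
        with self._guard:
            if self._keys.get(key) == owner:
                del self._keys[key]
                return True
            return False


def elect_main(store, thread_names, key="sync.lock"):
    """Let every thread compete for the lock once and record the resulting roles."""
    roles = {}

    def compete(name):
        roles[name] = "main" if store.set_if_absent(key, name) else "standby"

    threads = [threading.Thread(target=compete, args=(name,)) for name in thread_names]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return roles


store = InMemoryLockStore()
roles = elect_main(store, ["t1", "t2", "t3", "t4"])
```

Whichever thread wins, exactly one ends up as "main" and the rest as "standby", which is the property the allocation phase relies on. A production lock would also need an expiry so that a crashed holder does not block the others; that is outside this sketch.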
In addition, in this example embodiment, the main thread determination unit may include:
A task judgment subunit, which may be configured to determine whether the main thread has successfully executed the data allocation task and, when it has, to control the standby threads to stop waiting;
A state change subunit, which may be configured to control the standby threads to compete for the distributed lock again when it is determined that the main thread has not successfully executed the data allocation task.
In this example embodiment, the main and standby roles are not fixed; threads can serve as main and standby for one another. The first thread to obtain the distributed lock is called the first main thread, and the threads that do not obtain the lock are called first standby threads. After the first main thread has been determined, whether it successfully executes the series of steps in the data allocation phase (calculating the total amount of tasks to be synchronized, splitting the task, and storing the split tasks in the task buffer queue) can be checked in real time, while the standby threads wait in a loop to learn whether the data allocation task has been completed by the first main thread. When the first main thread is determined to have completed all of these steps, the first standby threads are notified to leave the waiting state in which they were checking whether the main thread had succeeded. When the first main thread is determined not to have completed all of these steps, that is, when any one of them has failed, the first main thread is deemed to have failed, and the first standby threads compete for the distributed lock again so that one of them becomes the second main thread. Subsequent main threads are determined in the same way. In this process, the original first main thread may become a standby thread.
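The hand-over from a failed main thread to a standby thread can be sketched as a simple loop. To keep the sketch deterministic, it assumes the candidates win the lock in list order, which real lock competition does not guarantee; `run_allocation_round`, `allocate`, and `max_elections` are hypothetical names.

```python
def run_allocation_round(candidates, allocate, max_elections=5):
    """Elect a main thread; if its allocation fails, the others compete again.

    candidates: thread names in the order they are assumed to win the lock.
    allocate: callable(name) -> bool, True when the allocation task succeeded.
    Returns (successful main thread or None, list of every thread that was main).
    """
    standby = list(candidates)
    history = []
    for _ in range(max_elections):
        if not standby:
            break
        main = standby.pop(0)      # this thread obtained the distributed lock
        history.append(main)
        if allocate(main):
            return main, history   # standby threads can stop waiting
        standby.append(main)       # the failed main thread becomes a standby thread
    return None, history


# t1 wins first but fails to allocate, so t2 becomes the second main thread.
winner, history = run_allocation_round(["t1", "t2", "t3"], lambda name: name != "t1")
```

The cap on the number of elections is an addition of this sketch so that it always terminates; the text does not state such a limit.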
The data reading module 102 may be configured to determine whether a thread has obtained a subtask from the task buffer queue and, when the thread has obtained the subtask, to read the data corresponding to the subtask from the source database and store it in a data buffer queue.
In this example embodiment, the thread may be any of the threads executing the data synchronization task, including the main thread and standby threads determined through competition for the distributed lock. The task buffer queue may be a first-in, first-out (FIFO) queue that passes tasks safely between threads. The queue may be serial or concurrent. In a serial queue, subtasks are executed in FIFO order; in a concurrent queue, tasks are executed asynchronously, multiple child threads can be created to execute multiple subtasks, and the order in which subtasks execute is not fixed. The task buffer queue may also be a global queue or a main queue. After the main thread has successfully executed the data allocation task, the main thread and all standby threads are started and wait to obtain subtasks from the task buffer queue. Each thread repeatedly checks whether it has obtained a subtask from the task buffer queue; when it has, it reads the data corresponding to the subtask from the source database and stores the data in the data buffer queue, and the threads continue in this way to read the data for each subtask they obtain.
In this example embodiment, storing the data read for each subtask in the data buffer queue makes data reading and writing asynchronous: one thread pushes the data it has read into the data buffer queue, and another thread takes the data out of the queue and processes it. The data buffer queue ensures that there is no concurrency conflict between reading and processing data, so the two threads need no further synchronization, which improves reading efficiency. The data buffer queue also ensures that data is delivered: for example, if the target database is unavailable after the data has been read, the data buffer queue retains the data until it has been synchronized successfully. The data buffer queue thus also helps ensure the validity and correctness of data synchronization.
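The two-queue hand-off between reading and writing can be sketched as a small producer-consumer pipeline. The dictionaries `SOURCE` and `target` stand in for the source and target databases, the `STOP` sentinel is a shutdown convention added for this sketch, and the upper-casing step merely represents whatever processing happens before writing; none of these details come from the disclosure.

```python
import queue
import threading

SOURCE = {i: f"row-{i}" for i in range(6)}  # hypothetical source table
target = {}                                  # hypothetical target table
task_queue = queue.Queue()                   # task buffer queue
data_queue = queue.Queue()                   # data buffer queue
STOP = object()                              # shutdown marker (sketch-only convention)


def reader():
    """Take subtasks from the task queue and push the rows read into the data queue."""
    while True:
        subtask = task_queue.get()
        if subtask is STOP:
            data_queue.put(STOP)
            return
        data_queue.put([(key, SOURCE[key]) for key in subtask])


def writer():
    """Take rows from the data queue and write them to the target table."""
    while True:
        rows = data_queue.get()
        if rows is STOP:
            return
        for key, value in rows:
            target[key] = value.upper()  # stands in for processing before the write


for subtask in ([0, 1], [2, 3], [4, 5]):
    task_queue.put(subtask)
task_queue.put(STOP)

threads = [threading.Thread(target=reader), threading.Thread(target=writer)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

The reader and writer never touch each other's state directly; all coordination goes through the two queues, which handle the locking internally.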
In addition, in this example embodiment, the data reading module may include:
A task detection unit, which may be configured to detect whether a task exists in the task buffer queue and, when a task is detected, to obtain a subtask and read the data corresponding to the subtask;
A subtask acquisition unit, which may be configured to determine, when no task is detected in the task buffer queue, whether the time for which the thread has failed to obtain a subtask exceeds a preset time;
A sleep control unit, which may be configured to control the thread to enter a sleep state for a preset duration when the time for which the thread has failed to obtain a subtask exceeds the preset time;
A loop detection unit, which may be configured to cyclically detect whether a task exists in the task buffer queue when the time for which the thread has failed to obtain a subtask is less than the preset time.
In this example embodiment, a function can first be used to check whether the task buffer queue is empty, that is, whether it contains any subtask. When the queue contains subtasks, one of them is obtained at random, and the data corresponding to it is read and stored in the data buffer queue. When the queue contains no subtask, the thread keeps waiting, and it is determined whether the time for which the thread has failed to obtain a subtask exceeds a preset time t. Equivalently, when the thread fails to obtain a subtask, it can be made to try n consecutive times, where n consecutive attempts take time t. When the time without a subtask exceeds t, the thread is made to sleep for a preset duration T. After the sleep, the thread attempts to obtain a subtask in the next period, and the cycle of obtaining a subtask, sleeping, reading, and storing continues in this way. When the time without a subtask is less than t, the thread continues its n consecutive attempts to obtain a subtask from the task buffer queue.
In this example embodiment, determining whether the thread has obtained a subtask splits the flow into two cases: first, a subtask is obtained, the corresponding data is read, and the data is stored in the data buffer queue; second, n consecutive attempts fail to obtain a subtask, and the thread sleeps for the preset duration T. The sleep duration T avoids a problem in prior-art task queues implemented with Redis, where an RPOP command is issued every second to check for new tasks even when the task buffer queue is empty, so that I/O is performed continuously against a queue known to be empty. This reduces the number of accesses and the system overhead. In addition, multiple threads can monitor the task buffer queue at the same time, and a task notification can be issued when the queue contains tasks, so that tasks are processed in a loop as soon as they are found, which improves data processing efficiency.
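The two cases above (obtain a subtask, or fail n consecutive times and sleep for T) can be sketched as follows. The sketch counts attempts rather than measuring the elapsed time t, which the text treats as equivalent; `fetch_subtask`, `attempts`, and `sleep_seconds` are hypothetical names, and the `sleep` parameter exists only so the function can be exercised without real waiting.

```python
import queue
import time


def fetch_subtask(task_queue, attempts, sleep_seconds, sleep=time.sleep):
    """Try `attempts` consecutive non-blocking gets; sleep once if all of them fail.

    Returns the subtask, or None after sleeping for sleep_seconds.
    """
    for _ in range(attempts):
        try:
            return task_queue.get_nowait()
        except queue.Empty:
            continue
    sleep(sleep_seconds)  # back off instead of polling an empty queue
    return None


# Demonstration with a recording stand-in for time.sleep.
slept = []
demo_queue = queue.Queue()
first = fetch_subtask(demo_queue, attempts=3, sleep_seconds=0.5, sleep=slept.append)
demo_queue.put("subtask-1")
second = fetch_subtask(demo_queue, attempts=3, sleep_seconds=0.5, sleep=slept.append)
```

In the demonstration the first call finds the queue empty and backs off once, while the second call returns the subtask immediately without sleeping.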
In addition, in this example embodiment, the sleep control unit may include:
a sleep time subunit, which may be used to judge whether the sleep time of the thread exceeds a preset number of periods;
a thread termination subunit, which may be used to terminate the thread when the sleep time of the thread is judged to exceed the preset number of periods.
In this example embodiment, after the thread has failed to obtain a subtask in n consecutive attempts and has slept for the preset time T, two cases may occur: first, a subtask is obtained, the data corresponding to the subtask is read, and the data is stored in the data buffer queue; second, in the second period no subtask is obtained in n consecutive attempts, and the thread sleeps for the preset time T again. The cycle continues in this way. If the thread fails to obtain a subtask N consecutive times, this can also be understood as the thread having slept for N consecutive periods. In this example embodiment, it may be judged whether the sleep periods of the thread exceed a preset number of periods N, or a preset time N*T, where N = k*n, k is the number of times the n consecutive acquisition attempts have been carried out, and k and n may each be any meaningful natural number. When the sleep time of the thread is judged to exceed the preset number of periods or the preset time, the thread is terminated.
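The termination rule can be illustrated with a counter of consecutive sleep periods. The sketch below is an assumption-laden simplification: `poll` is a hypothetical callable that represents one round of n acquisition attempts and returns a subtask or `None`, and `max_idle_periods` plays the role of the preset number of periods, treated here as a free parameter rather than derived from N = k*n.

```python
import time


def run_until_idle(poll, handle, sleep_duration, max_idle_periods):
    """Handle subtasks until the thread has slept for more than max_idle_periods in a row."""
    idle_periods = 0  # consecutive sleep periods without a subtask
    handled = 0
    while True:
        subtask = poll()  # one round of acquisition attempts
        if subtask is not None:
            handle(subtask)
            handled += 1
            idle_periods = 0  # any successful fetch resets the idle count
            continue
        time.sleep(sleep_duration)  # preset sleep duration T
        idle_periods += 1
        if idle_periods > max_idle_periods:
            return handled  # the thread terminates
```

Resetting the counter on every successful fetch means only an uninterrupted run of empty periods ends the thread, which matches the "consecutive" wording of the paragraph above.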
In addition, in this example embodiment, the data reading module may further include:
a data storage unit, which may be used to judge whether the data corresponding to the subtask has been read successfully and, when it has, to store the read data in the data buffer queue;
a first return unit, which may be used to return the data reading task to the task buffer queue when the data corresponding to the subtask is judged not to have been read successfully.
In this example embodiment, after the subtask has been successfully obtained from the task buffer queue, the data corresponding to the subtask is read, and it may be judged whether that data has been read successfully. Judging whether the data has been read successfully may consist of judging whether the data type and data length are fully consistent, whether the read is complete, or whether the read is correct; success may be determined by comparing the data in the source database with the data that was read. When the read data is judged to be fully consistent with the data in the source database, the read data is stored in the data buffer queue. Correspondingly, when the read data is judged to be inconsistent with the data in the source database, the corresponding data reading task is returned to the task buffer queue, and the data corresponding to the subtask is read again.
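As a rough illustration of this check, the sketch below compares length, element types and content of the rows that were read against the source, then either buffers the rows or returns the subtask for another attempt. The dictionary `source`, the callable `read_rows` and the function name `read_and_buffer` are hypothetical; a real source database would be queried rather than indexed in memory.

```python
import queue


def read_and_buffer(subtask, source, read_rows, task_queue, data_queue):
    """Buffer the rows read for a subtask if they match the source; otherwise requeue it."""
    expected = source[subtask]
    rows = read_rows(subtask)
    same_length = len(rows) == len(expected)
    same_types = same_length and all(
        type(got) is type(want) for got, want in zip(rows, expected)
    )
    if same_types and rows == expected:
        data_queue.put((subtask, rows))  # successful read: store in the data buffer queue
        return True
    task_queue.put(subtask)  # failed read: return the reading task to the task buffer queue
    return False
```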
In addition, in this example embodiment, the data storage unit may include:
a task return subunit, which may be used to judge whether the data has been successfully stored in the data buffer queue and, when it has not, to return the data storage task to the task buffer queue.
In this example embodiment, when the read data is judged to be fully consistent with the data in the source database and is stored in the data buffer queue, it may be judged whether the data has been successfully stored in the data buffer queue. Judging whether the data has been stored successfully may likewise consist of judging whether the data type and data length are fully consistent, whether the storage is complete or correct, or whether the data corresponding to the subtask or to the overall task in the source database has been written to the data buffer queue. Success may be determined by judging whether the stored data is fully consistent with the data to be synchronized in the source database, or by querying the data buffer queue for the corresponding data. When the stored data is judged to be inconsistent with the data in the source database, or the corresponding data cannot be found in the data buffer queue, the corresponding data storage task may be returned to the task buffer queue, and the data corresponding to the subtask is stored again.
The data writing module 103 may be used to judge whether data corresponding to the subtask exists in the data buffer queue and, when such data exists, to obtain the data corresponding to the subtask from the data buffer queue and write it to the target database.
In this example embodiment, the data buffer queue may be a queue of the same nature as the task buffer queue. One thread may be responsible solely for storing data in the data buffer queue without performing any other operation, and multiple threads may perform different operations on the data simultaneously without constraining one another, which shortens data synchronization time and improves synchronization efficiency. When a write error is found, it can be corrected via the keyboard or by other means, which improves the accuracy of data synchronization. The data writing process can essentially be regarded as the inverse of the data reading process. It is first judged whether data exists in the data buffer queue. When data exists in the data buffer queue, one data item may be obtained at random from the queue; the data item may be the task data corresponding to one of the multiple subtasks, and data processing may be performed on it. Data processing may include data integration, data aggregation, derived calculation and the like, for example encrypting, grouping, encoding, updating and removing redundancy from the data. After the data has been processed, it is written to the target database.
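Two of the processing operations named above, redundancy deletion and grouping, might look as follows before the write. The text does not specify a row format, so this sketch assumes each row is a flat dictionary with hashable values; `process_rows` and `key` are hypothetical names.

```python
from collections import defaultdict


def process_rows(rows, key):
    """Drop duplicate rows, then group the remaining rows by the value of `key`."""
    seen = set()
    grouped = defaultdict(list)
    for row in rows:
        fingerprint = tuple(sorted(row.items()))  # order-independent identity of a row
        if fingerprint in seen:
            continue  # redundancy deletion: skip a row that was already kept
        seen.add(fingerprint)
        grouped[row[key]].append(row)  # grouping by the chosen column
    return dict(grouped)
```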
The structure of the target database may be the same as that of the source database, or different. Data in the target database may be stored in tables, and the tables in the target database may be the same as those of the source database to facilitate testing. Data may be written by importing through HTable, by importing files from HDFS (Hadoop Distributed File System) into HBase, by reading fields from one HBase table and writing them to another HBase table, or by other means. When no data exists in the data buffer queue, the thread may be controlled to wait, and it is judged whether the duration for which the thread has failed to obtain data in n consecutive attempts exceeds the preset time t. When the duration for which the thread has failed to obtain data exceeds the preset time t, the thread may be controlled to sleep for the preset duration T. After sleeping for the preset duration, the thread proceeds to data acquisition for the next period. When the sleep time of the thread is judged to exceed the preset number of periods or the preset time, the thread is terminated.
In addition, in this example embodiment, the data writing module may further include:
a write judging unit, which may be used to judge whether the data has been successfully written to the target database and, when it has, to release the distributed lock;
a second return unit, which may be used to return the data to the data buffer queue when the data is judged not to have been successfully written to the target database.
In this example embodiment, when the data stored in the data buffer queue is written to the target database, whether the data corresponding to the subtask has been written successfully may be judged by querying whether the target database contains the data to be synchronized from the source database, or whether the data in the source database can be used by the target database. When the data is judged to have been written successfully, that is, after the entire data synchronization task has been completed, the distributed lock may be released so that multiple threads can compete for it again in the next data synchronization task. In this example embodiment, releasing the distributed lock can be understood as follows: after the main thread that currently holds the distributed lock has finished the whole data synchronization task, including task distribution, data reading and data writing, it deletes its own corresponding node. For example, Redis can release the lock with the command DEL foo.lock. Correspondingly, when the target database is judged not to contain data consistent with the data to be synchronized in the source database, the corresponding data writing task is returned to the task buffer queue, and the data corresponding to the subtask is written again.
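The write check and lock release can be sketched with in-memory stand-ins. Everything here is hypothetical scaffolding: `DictTarget` replaces the target database, the dictionary `locks` replaces the Redis key space (so `locks.pop` plays the part of `DEL foo.lock`), and `max_retries` is an added safeguard that the text does not mention. On a failed write the sketch follows the second return unit and puts the data back into the data buffer queue.

```python
import queue


class DictTarget:
    """Hypothetical in-memory target database keyed by subtask."""

    def __init__(self):
        self.tables = {}

    def write(self, subtask, rows):
        self.tables[subtask] = list(rows)

    def read(self, subtask):
        return self.tables.get(subtask)


def drain_and_release(data_queue, target, locks, lock_key="foo.lock", max_retries=3):
    """Write every buffered item, verify each by querying the target, then release the lock."""
    retries = {}
    while not data_queue.empty():
        subtask, rows = data_queue.get_nowait()
        target.write(subtask, rows)
        if target.read(subtask) == rows:
            continue  # the write is confirmed by the query
        retries[subtask] = retries.get(subtask, 0) + 1
        if retries[subtask] > max_retries:
            return False  # give up and keep the lock
        data_queue.put((subtask, rows))  # return the data for another write attempt
    locks.pop(lock_key, None)  # all writes confirmed: release the distributed lock
    return True
```

Releasing the lock only after the queue has been drained mirrors the statement that the lock is held until the whole synchronization task is complete.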
This example embodiment further provides a data synchronization method, which may include the following steps:
Step S110: periodically calculate the total amount of data to be synchronized in the source database and, according to the total amount of data, split the data synchronization task corresponding to the data to be synchronized into multiple subtasks and store them in the task buffer queue;
Step S120: judge whether a thread has obtained a subtask from the task buffer queue and, when the thread has obtained the subtask, read the data corresponding to the subtask from the source database and store it in the data buffer queue;
Step S130: judge whether data corresponding to the subtask exists in the data buffer queue and, when such data exists, obtain the data corresponding to the subtask from the data buffer queue and write it to the target database.
The details of each step of the data synchronization method above have already been described in detail for the corresponding data synchronization apparatus and are therefore not repeated here.
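Steps S110 to S130 can be put together in a compact, single-process sketch. It is only an outline under stated assumptions: Python lists stand in for the source and target databases, `queue.Queue` for both buffer queues, subtasks are row ranges of a hypothetical `chunk_size`, and a `None` sentinel replaces the sleep and termination logic so the example finishes deterministically. Periodic triggering and the distributed lock are omitted.

```python
import queue
import threading


def split_into_subtasks(total_rows, chunk_size):
    """S110: cut the synchronization task into (start, stop) row ranges."""
    return [(start, min(start + chunk_size, total_rows))
            for start in range(0, total_rows, chunk_size)]


def synchronize(source, dest, chunk_size=2, readers=2):
    """Copy `source` into the pre-sized list `dest` through two buffer queues."""
    task_queue, data_queue = queue.Queue(), queue.Queue()
    for subtask in split_into_subtasks(len(source), chunk_size):
        task_queue.put(subtask)

    def reader():
        # S120: take subtasks and buffer the matching source rows.
        while True:
            try:
                start, stop = task_queue.get_nowait()
            except queue.Empty:
                return
            data_queue.put((start, source[start:stop]))

    def writer():
        # S130: take buffered rows and write them to the target.
        while True:
            item = data_queue.get()
            if item is None:  # sentinel: every reader has finished
                return
            start, rows = item
            dest[start:start + len(rows)] = rows

    reader_threads = [threading.Thread(target=reader) for _ in range(readers)]
    writer_thread = threading.Thread(target=writer)
    for thread in reader_threads + [writer_thread]:
        thread.start()
    for thread in reader_threads:
        thread.join()
    data_queue.put(None)  # tell the writer that no more data will arrive
    writer_thread.join()
    return dest
```

Because reading and writing meet only at the data buffer queue, the reader threads and the writer thread run concurrently without coordinating directly.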
This example embodiment further provides a storage medium on which a computer program is stored; when executed by a processor, the computer program implements the data synchronization method described above.
The storage medium may include a data signal propagated in baseband or as part of a carrier wave, in which readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of these. The storage medium may send, propagate or transmit a program for use by, or in connection with, an instruction execution system, apparatus or device.
The program code contained in the storage medium may be transmitted over any suitable medium, including but not limited to wireless, wired, optical cable, radio frequency, or any suitable combination of these.
This example embodiment further provides an electronic device. Referring to Fig. 6, the electronic device 10 includes: a processing component 11, which may further include one or more processors; and memory resources, represented by a memory 12, for storing instructions executable by the processing component 11, such as application programs. The application programs stored in the memory 12 may include one or more modules, each corresponding to a set of instructions. In addition, the processing component 11 is configured to execute the instructions so as to perform the method described above.
The electronic device 10 may further include: a power supply component configured to perform power management for the electronic device 10; a wired or wireless network interface 13 configured to connect the electronic device 10 to a network; and an input/output (I/O) interface 14. The electronic device 10 may operate based on an operating system stored in the memory 12, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™ or the like.
The block diagrams shown in the drawings are merely functional entities and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in software, or all or part of them may be implemented in one or more software-hardened modules, or they may be implemented in different networks and/or processor devices and/or microcontroller devices.
It should be noted that although several modules or units of the device for performing actions are mentioned in the detailed description above, this division is not mandatory. In fact, according to embodiments of the disclosure, the features and functions of two or more modules or units described above may be embodied in a single module or unit. Conversely, the features and functions of one module or unit described above may be further divided among multiple modules or units.
In addition, although the steps of the method of the disclosure are depicted in the drawings in a particular order, this does not require or imply that the steps must be performed in that order, or that all of the illustrated steps must be performed, to achieve the desired result. Additionally or alternatively, certain steps may be omitted, multiple steps may be merged into one step, and/or one step may be split into multiple steps.
From the description of the embodiments above, those skilled in the art will readily understand that the example embodiments described herein may be implemented in software, or in software combined with the necessary hardware. Accordingly, the technical solution according to embodiments of the disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive or a removable hard disk) or on a network, and which includes instructions that cause a computing device (such as a personal computer, a server, a mobile terminal or a network device) to perform the method according to embodiments of the disclosure.
Other embodiments of the disclosure will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed herein. This application is intended to cover any variations, uses or adaptations of the disclosure that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and examples are to be regarded as illustrative only, with the true scope and spirit of the disclosure being indicated by the appended claims.

Claims (11)

1. A data synchronization apparatus, characterized in that the apparatus comprises:
a task allocation module, configured to periodically calculate the total amount of data to be synchronized in a source database and, according to the total amount of data, to split the data synchronization task corresponding to the data to be synchronized into multiple subtasks and store them in a task buffer queue;
a data reading module, configured to judge whether a thread has obtained a subtask from the task buffer queue and, when the thread is judged to have obtained the subtask, to read the data corresponding to the subtask from the source database and store it in a data buffer queue;
a data writing module, configured to judge whether data corresponding to the subtask exists in the data buffer queue and, when such data is judged to exist, to obtain the data corresponding to the subtask from the data buffer queue and write it to a target database.
2. The data synchronization apparatus according to claim 1, characterized in that the task allocation module comprises:
a main thread determination unit, configured to trigger multiple threads within the same period to compete for a distributed lock simultaneously, to determine the thread that obtains the distributed lock as the main thread, and to determine the threads that do not obtain the distributed lock as standby threads;
a main thread control unit, configured to control the main thread to execute the data distribution task corresponding to the data to be synchronized.
3. The data synchronization apparatus according to claim 2, characterized in that the main thread determination unit comprises:
a task judging subunit, configured to judge whether the main thread has successfully executed the data distribution task and, when the main thread is judged to have successfully executed the data distribution task, to control the standby threads to stop waiting;
a state changing subunit, configured to control the standby threads to compete for the distributed lock again when the main thread is judged not to have successfully executed the data distribution task.
4. The data synchronization apparatus according to claim 1, characterized in that the data reading module comprises:
a task detection unit, configured to detect whether a task exists in the task buffer queue and, when a task is detected in the task buffer queue, to obtain a subtask and read the data corresponding to the subtask;
a subtask acquisition unit, configured to judge, when no task is detected in the task buffer queue, whether the duration for which the thread has failed to obtain the subtask exceeds a preset time;
a sleep control unit, configured to control the thread to enter a sleep state of a preset duration when the duration for which the thread has failed to obtain the subtask is judged to exceed the preset time;
a cyclic detection unit, configured to cyclically detect whether a task exists in the task buffer queue when the duration for which the thread has failed to obtain the subtask is judged to be less than the preset time.
5. The data synchronization apparatus according to claim 4, characterized in that the sleep control unit comprises:
a sleep time subunit, configured to judge whether the sleep time of the thread exceeds a preset number of periods;
a thread termination subunit, configured to terminate the thread when the sleep time of the thread is judged to exceed the preset number of periods.
6. The data synchronization apparatus according to claim 1, characterized in that the data reading module further comprises:
a data storage unit, configured to judge whether the data has been read successfully and, when it has, to store the read data in the data buffer queue;
a first return unit, configured to return the data reading task to the task buffer queue when the data is judged not to have been read successfully.
7. The data synchronization apparatus according to claim 6, characterized in that the data storage unit comprises:
a task return subunit, configured to judge whether the data has been successfully stored in the data buffer queue and, when it has not, to return the data storage task to the task buffer queue.
8. The data synchronization apparatus according to claim 1, characterized in that the data writing module further comprises:
a write judging unit, configured to judge whether the data has been successfully written to the target database and, when it has, to release the distributed lock;
a second return unit, configured to return the data to the data buffer queue when the data is judged not to have been successfully written to the target database.
9. A data synchronization method, characterized by comprising:
periodically calculating the total amount of data to be synchronized in a source database and, according to the total amount of data, splitting the data synchronization task corresponding to the data to be synchronized into multiple subtasks and storing them in a task buffer queue;
judging whether a thread has obtained a subtask from the task buffer queue and, when the thread is judged to have obtained the subtask, reading the data corresponding to the subtask from the source database and storing it in a data buffer queue;
judging whether data corresponding to the subtask exists in the data buffer queue and, when such data is judged to exist, obtaining the data corresponding to the subtask from the data buffer queue and writing it to a target database.
10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the data synchronization method according to claim 9.
11. An electronic device, characterized by comprising:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to perform the data synchronization method according to claim 9 by executing the executable instructions.
CN201710229817.XA 2017-04-10 2017-04-10 Data synchronization unit, method, storage medium and electronic equipment Pending CN108694199A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710229817.XA CN108694199A (en) 2017-04-10 2017-04-10 Data synchronization unit, method, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710229817.XA CN108694199A (en) 2017-04-10 2017-04-10 Data synchronization unit, method, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN108694199A true CN108694199A (en) 2018-10-23

Family

ID=63843235

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710229817.XA Pending CN108694199A (en) 2017-04-10 2017-04-10 Data synchronization unit, method, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN108694199A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103885986A (en) * 2012-12-21 2014-06-25 阿里巴巴集团控股有限公司 Main and auxiliary database synchronization method and device
CN104462269A (en) * 2014-11-24 2015-03-25 中国联合网络通信集团有限公司 Isomerous database data exchange method and system
CN106557364A (en) * 2015-09-24 2017-04-05 阿里巴巴集团控股有限公司 A kind of method of data synchronization and system
CN106528893A (en) * 2016-12-26 2017-03-22 北京奇虎科技有限公司 Data synchronization method and device

Cited By (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108712505A (en) * 2018-05-31 2018-10-26 北京百度网讯科技有限公司 Method of data synchronization, device, equipment, system and storage medium
CN109298948A (en) * 2018-10-31 2019-02-01 北京国信宏数科技有限责任公司 Distributed computing method and system
CN109298948B (en) * 2018-10-31 2021-04-02 北京国信宏数科技有限责任公司 Distributed computing method and system
CN111225007B (en) * 2018-11-26 2024-01-12 北京京东尚科信息技术有限公司 Database connection method, device and system
CN111225007A (en) * 2018-11-26 2020-06-02 北京京东尚科信息技术有限公司 Database connection method, device and system
CN109558218A (en) * 2018-12-04 2019-04-02 山东浪潮通软信息科技有限公司 A kind of distributed service data lock implementation method based on Redis
CN109299116A (en) * 2018-12-05 2019-02-01 浪潮电子信息产业股份有限公司 A kind of method of data synchronization, device, equipment and readable storage medium storing program for executing
CN111797158A (en) * 2019-04-08 2020-10-20 北京沃东天骏信息技术有限公司 Data synchronization system, method and computer-readable storage medium
CN111797158B (en) * 2019-04-08 2024-04-05 北京沃东天骏信息技术有限公司 Data synchronization system, method and computer readable storage medium
CN112015713B (en) * 2019-05-30 2024-03-26 阿里云计算有限公司 Database task processing method and device, electronic equipment and readable medium
WO2020238737A1 (en) * 2019-05-30 2020-12-03 阿里巴巴集团控股有限公司 Database task processing method and apparatus, electronic device, and readable medium
CN112015713A (en) * 2019-05-30 2020-12-01 阿里巴巴集团控股有限公司 Database task processing method and device, electronic equipment and readable medium
CN110196884B (en) * 2019-05-31 2022-04-29 北京大米科技有限公司 Data writing method based on distributed database, storage medium and electronic equipment
CN110196884A (en) * 2019-05-31 2019-09-03 北京大米科技有限公司 Method for writing data, storage medium and electronic equipment based on distributed data base
CN110196868A (en) * 2019-06-06 2019-09-03 四川新网银行股份有限公司 Based on distributed work order flow monitoring method
CN110673960B (en) * 2019-08-22 2022-11-29 中国平安财产保险股份有限公司 Data synchronization method, device, equipment and computer readable storage medium
CN110673960A (en) * 2019-08-22 2020-01-10 中国平安财产保险股份有限公司 Data synchronization method, device, equipment and computer readable storage medium
CN111177456A (en) * 2019-11-27 2020-05-19 视联动力信息技术股份有限公司 Method, device and equipment for parallel synchronization of data and storage medium
CN111177456B (en) * 2019-11-27 2022-12-23 视联动力信息技术股份有限公司 Method, device and equipment for parallel synchronization of data and storage medium
CN111190961A (en) * 2019-12-18 2020-05-22 航天信息股份有限公司 Dynamic optimization multithreading data synchronization method and system
CN111190961B (en) * 2019-12-18 2023-09-29 航天信息股份有限公司 Dynamic optimization multithreading data synchronization method and system
CN111147355A (en) * 2019-12-25 2020-05-12 北京五八信息技术有限公司 Message sending method and device, electronic equipment and storage medium
CN111259205A (en) * 2020-01-15 2020-06-09 北京百度网讯科技有限公司 Graph database traversal method, device, equipment and storage medium
CN111259205B (en) * 2020-01-15 2023-10-20 北京百度网讯科技有限公司 Graph database traversal method, device, equipment and storage medium
CN112148498A (en) * 2020-09-30 2020-12-29 平安普惠企业管理有限公司 Data synchronization method, device, server and storage medium
CN112818054A (en) * 2020-10-15 2021-05-18 广州南天电脑系统有限公司 Data synchronization method and device, computer equipment and storage medium
CN112445596B (en) * 2020-11-27 2024-02-02 上海睿量私募基金管理有限公司 Data importing method, system and storage medium based on multithreading
CN112445596A (en) * 2020-11-27 2021-03-05 平安普惠企业管理有限公司 Multithreading-based data import method and system and storage medium
CN113407544A (en) * 2021-07-13 2021-09-17 南方电网数字电网研究院有限公司 Multi-model data synchronization method and device
CN113806384A (en) * 2021-08-19 2021-12-17 紫光云(南京)数字技术有限公司 Method for allocating incremental integer data based on redis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181023