CN117312453A - Online copying method, device and equipment for distributed data table and storage medium

Publication number
CN117312453A
Authority
CN
China
Prior art keywords
target
data
source
copying
state
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311303383.5A
Other languages
Chinese (zh)
Inventor
何文然
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jinzhuan Xinke Co Ltd
Original Assignee
Jinzhuan Xinke Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jinzhuan Xinke Co Ltd
Priority to CN202311303383.5A
Publication of CN117312453A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/275Synchronous replication
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2282Tablespace storage structures; Management thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23Updating
    • G06F16/2365Ensuring data consistency and integrity
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471Distributed queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of distributed databases, and in particular to an online copying method, device and equipment for a distributed data table and a storage medium. The method comprises: in response to receiving a copy start command, processing the source table and the target table to obtain a shared state source table and an exclusive state target table; copying source data in the shared state source table to the exclusive state target table to obtain a target primary table; judging whether new data has been generated in the shared state source table and, if so, acquiring the new data and copying it to the target primary table to obtain a target secondary table; periodically judging, based on a check period, whether the shared state source table generated new data within that check period, until a check period passes in which no new data is generated: if so, copying the new data generated in the check period to the target secondary table; otherwise, sending copy end information to the client. The method improves the overall utilization efficiency of the source table while keeping the source table and the target table consistent.

Description

Online copying method, device and equipment for distributed data table and storage medium
Technical Field
The present invention relates to the field of distributed database technologies, and in particular, to a method, an apparatus, a device, and a storage medium for online replication of a distributed data table.
Background
The distributed database is provided with a source table for storing data, and the data forming the source table can be distributed in different databases; sometimes, in order to perform data backup on the source table at the current time, the source table needs to be copied into the target table.
Currently, one way to copy a source table to a target table in a distributed database is: when copying is started, in order to ensure that the source table and the target table after the copying operation are consistent, the source table and the target table need to be locked first so as to prevent other transactions from modifying the source table or the target table in the copying process; after the source table and the target table are locked, copying the source table to the target table is started.
In general, the amount of data in the source table is relatively large, which results in a long period of time for the source table to be exclusively used in the copying process, and thus other clients online with the source table cannot operate the source table during the copying process; if the requirement of other clients on operating the source table in the process of copying the source table by the current client is met, the source table is required to be unlocked, but the unlocking of the source table can cause new data to be generated in the source table, so that the target table is inconsistent with the source table data; in summary, during the copying of the source table, the overall utilization efficiency of the source table is lower under the condition of ensuring that the source table is consistent with the target table.
Disclosure of Invention
The embodiment of the invention provides a distributed data table online copying method, a device, equipment and a storage medium, and in a first aspect, the distributed data table online copying method provided by the embodiment of the invention comprises the following steps:
in response to receiving a copy start command, processing the source table and the target table to obtain a shared state source table and an exclusive state target table;
copying source data in the shared state source table to the exclusive state target table to obtain a target primary table;
judging whether first newly-added data is generated in the shared state source table, if so, acquiring the first newly-added data, and copying the first newly-added data to the target primary table to obtain a target secondary table; the first newly-added data is generated in the process of copying the source data;
periodically judging, based on a preset check period, whether the shared state source table generates second newly-added data in the corresponding check period, until a check period passes in which the shared state source table generates no second newly-added data; if so, copying the second newly-added data generated in the corresponding check period to the target secondary table, and if not, sending copy end information to a client.
In a specific embodiment, if it is determined that the first new data is not generated in the shared state source table, then:
sending copy ending information to the client;
responding to the received replication ending command returned by the client based on the replication ending information, and judging whether third newly added data is generated in the shared state source table or not;
if yes, copying the third newly added data to the target primary table to obtain a target final table; otherwise, the target primary table is used as the target final table.
In a specific implementation manner, in response to receiving a copy ending command returned by the client based on the copy ending information, judging whether fourth newly added data is generated in the shared state source table;
if yes, copying the fourth newly added data to the target secondary table to obtain a target final table; otherwise, the target secondary table is used as the target final table.
In a specific embodiment, acquiring the first newly-added data comprises:
acquiring a source table operation log corresponding to the source table;
querying the source table operation log for source table operation commands generated in the process of copying the source data;
and acquiring the data in the source table corresponding to the source table operation commands and recording the data as the first newly-added data.
In a specific embodiment, in response to receiving a copy cancel command, stopping execution of the distributed data table online copying method and clearing the intermediate files;
judging whether a confirmation operation on a target final table exists, and if not, taking the target table as the target final table.
In a specific embodiment, in response to receiving a status query command, parsing the status query command to obtain a status query statement;
and inquiring the distributed database based on the state inquiry statement and the target table to obtain the current total copying state.
In a specific embodiment, said querying the distributed database for the current replication total state based on the state query statement and the target table includes:
determining a sub-database in the distributed database for establishing the target table;
issuing the state query statement to the sub-database;
acquiring a current replication sub-state generated by executing the state query statement by the sub-database;
and summarizing the current replication sub-state to generate the current replication total state, and returning the current replication total state to the client.
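As a rough illustration of the summarizing step, the sketch below merges per-sub-database states into one total state. The description does not specify what a replication sub-state contains, so the `rows_copied` and `rows_total` fields, the sub-database names and the function name are all assumptions made for this example.

```python
def summarize_copy_state(sub_states):
    """Merge per-sub-database copy states into one total state.

    sub_states maps a sub-database name to a dict with the hypothetical
    keys 'rows_copied' and 'rows_total'.
    """
    copied = sum(state["rows_copied"] for state in sub_states.values())
    total = sum(state["rows_total"] for state in sub_states.values())
    return {
        "sub_databases": len(sub_states),
        "rows_copied": copied,
        "rows_total": total,
    }


# Hypothetical sub-states returned by two sub-databases holding parts of t3.
total_state = summarize_copy_state({
    "db1": {"rows_copied": 400, "rows_total": 1000},
    "db2": {"rows_copied": 250, "rows_total": 500},
})
print(total_state)
```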
In a second aspect, an online replication device for a distributed data table provided by an embodiment of the present invention includes:
a table processing module, configured to process the source table and the target table, in response to receiving a copy start command, to obtain a shared state source table and an exclusive state target table;
a copying module, configured to copy the source data in the shared state source table to the exclusive state target table to obtain a target primary table;
a first judging and processing module, configured to judge whether first newly-added data is generated in the shared state source table, and if so, to acquire the first newly-added data and copy it to the target primary table to obtain a target secondary table; the first newly-added data is generated in the process of copying the source data;
and a second judging and processing module, configured to periodically judge, based on a preset check period, whether the shared state source table generates second newly-added data in the corresponding check period, until a check period passes in which the shared state source table generates no second newly-added data; if so, to copy the second newly-added data generated in the corresponding check period to the target secondary table, and if not, to send copy end information to the client.
In a third aspect, a computer device provided by an embodiment of the present invention adopts the following technical scheme: the computer device comprises a memory and a processor, wherein the memory stores a computer program that can be loaded by the processor to execute any of the above distributed data table online copying methods.
In a fourth aspect, a computer readable storage medium provided by an embodiment of the present invention adopts the following technical scheme: the storage medium stores a computer program that can be loaded by a processor to execute any of the above distributed data table online copying methods.
Drawings
FIG. 1 is a flowchart of a distributed data table online copying method according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of an online copying device for a distributed data table according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting thereof. It should be further noted that, for convenience of description, only some, but not all of the structures related to the present invention are shown in the drawings.
Fig. 1 is a flowchart of a distributed data table online replication method according to an embodiment of the present invention, and referring to fig. 1, the method may be performed by an apparatus for performing the method, where the apparatus may be implemented by software and/or hardware, and the method includes:
S100, in response to receiving the copy start command, the source table and the target table are processed to obtain a shared state source table and an exclusive state target table.
When a client needs to perform operations such as inserting, deleting, updating or querying data in the distributed database, it sends the corresponding command to the distributed database; the distributed database parses and executes the command after receiving it, and then returns the execution result to the client.
In this embodiment, a source table is set up in the distributed database and can be used by multiple clients. In one case, after data has been written into the source table, the source table needs to be backed up to ensure the security of the written data. The backup is carried out by the distributed database together with a client communicatively connected to it, and consists of copying the full contents of the source table into a preset empty table in the distributed database. Note that the table structure of the preset empty table is completely consistent with that of the source table; that is, table structure information such as the indexes and fragments of the preset empty table matches that of the source table exactly.
In this embodiment, the preset empty table is denoted as the target table t3, and the source table in the distributed database is denoted as t1. When the source table needs to be backed up, an operator can send a copy start command to the distributed database through the client, for example: copy table from t1 to t3 with start; the middleware of the distributed database receives the copy start command and then processes the source table and the target table to obtain a shared state source table and an exclusive state target table.
Specifically, the step of processing the source table and the target table to obtain the shared state source table and the exclusive state target table includes:
S110, performing a preliminary check on the source table and the target table, and determining that the table structures of the source table and the target table are consistent.
In implementation, after receiving the copy start command, the middleware of the distributed database checks whether the table structures of the source table and the target table are consistent. If they are inconsistent, for example because the indexes or fragments of the target table differ, the middleware notifies the client to adjust the table structure of the target table until the two table structures are consistent, that is, until table structure information such as the indexes and fragments of the target table matches that of the source table exactly. If they are consistent, step S120 is executed.
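The preliminary check in S110 can be pictured as a field-by-field comparison of two structure descriptions. The following is a minimal sketch, not the patented implementation: the `TableStructure` class, its field names and the sample values are hypothetical, chosen only to mirror the "indexes, fragments and other table structure information" named in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableStructure:
    # Hypothetical structure description; the field names are illustrative.
    columns: tuple
    indexes: frozenset
    fragments: tuple


def structure_mismatches(source, target):
    """Return the names of structure fields that differ (empty list = consistent)."""
    return [
        name
        for name in ("columns", "indexes", "fragments")
        if getattr(source, name) != getattr(target, name)
    ]


t1 = TableStructure((("id", "int"), ("v", "text")), frozenset({"pk_id"}), ("db1", "db2"))
t3_missing_index = TableStructure((("id", "int"), ("v", "text")), frozenset(), ("db1", "db2"))
t3_adjusted = TableStructure((("id", "int"), ("v", "text")), frozenset({"pk_id"}), ("db1", "db2"))

print(structure_mismatches(t1, t3_missing_index))  # client would be told to fix these
print(structure_mismatches(t1, t3_adjusted))       # empty: proceed to S120
```

Returning the list of mismatching fields, rather than a bare boolean, gives the middleware something concrete to report when it asks the client to adjust the target table.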
S120, when the table structures of the source table and the target table are consistent, adding an exclusive lock to both the source table and the target table to obtain an exclusive state source table and an exclusive state target table.
If step S110 determines that the table structures of the source table and the target table are consistent, an exclusive lock is added to both the source table and the target table.
It should be noted that the distributed database is generally communicatively connected with a plurality of clients, and each client may perform operations on the distributed database; the process of performing such an operation is a transaction, and a distributed database may run multiple transactions in parallel. While the source table is being backed up, other transactions may change the table structure of the source table and/or the target table, causing the backup transaction to fail.
If an exclusive lock belonging to the backup transaction is added to the source table and the target table before the source table is copied to the target table, the two tables can be used only by the backup transaction, and other transactions can no longer operate on them.
In implementation, after step S110 determines that the table structures of the source table and the target table are consistent, an exclusive lock is added to both tables, producing an exclusive state source table and an exclusive state target table. This ensures that the source table and the target table are not interfered with by other transactions, and that their table structures remain consistent before the source table is backed up to the target table.
S130, generating a source table operation log corresponding to the source table, and downgrading the exclusive lock added to the source table to a shared lock to obtain a shared state source table.
After the exclusive locks have been added to the source table and the target table in step S120, an operation log corresponding to the source table is generated and recorded as the source table operation log; all subsequent operations on the source table are recorded in it. The exclusive lock on the source table is then downgraded to a shared lock, so that other transactions can write new data into the source table during the subsequent copying process. The backup transaction and other transactions can thus run in parallel, which improves the utilization efficiency of the source table. At this point, the shared state source table and the exclusive state target table have been generated.
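The lock sequence of S120 and S130 can be sketched as a small in-memory state model. This is an illustrative assumption rather than the middleware's actual locking code: the `Lock`, `Table` and `prepare_tables` names are hypothetical, and, following the description, a shared lock here still lets other transactions write rows to the source table.

```python
from enum import Enum


class Lock(Enum):
    NONE = "none"
    SHARED = "shared"
    EXCLUSIVE = "exclusive"


class Table:
    def __init__(self, name):
        self.name = name
        self.lock = Lock.NONE
        self.operation_log = None  # created in S130 for the source table only

    def can_write(self, is_backup_transaction):
        """Under an exclusive lock only the backup transaction may write."""
        if self.lock is Lock.EXCLUSIVE:
            return is_backup_transaction
        return True


def prepare_tables(source, target):
    # S120: lock both tables exclusively so their structures cannot change.
    source.lock = Lock.EXCLUSIVE
    target.lock = Lock.EXCLUSIVE
    # S130: start logging source operations, then downgrade the source lock.
    source.operation_log = []
    source.lock = Lock.SHARED
    return source, target


source, target = prepare_tables(Table("t1"), Table("t3"))
print(source.lock, target.lock)
```

Creating the operation log before the downgrade matters in this model: any write admitted under the shared lock then already has a log to land in.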
S200, copying the source data in the shared state source table to the exclusive state target table to obtain a target primary table.
After receiving the copy start command "copy table from t1 to t3 with start", the middleware of the distributed database copies the full data already written in the shared state source table t1 at time t_0, namely the source data, to the exclusive state target table t3. The exclusive state target table into which all data corresponding to time t_0 has been copied is recorded as the target primary table, and the time at which the target primary table is generated is recorded as t_1.
S300, judging whether first new data are generated in the shared state source table, if so, acquiring the first new data, and copying the first new data to the target primary table to obtain a target secondary table; the first new data is generated in the process of copying the source data.
It should be noted that copying the source data in the shared state source table to the exclusive state target table in step S200 may take a long time because the amount of source data is large; during this period, other transactions may write data to the source table, so that the source table generates new data.
After the target primary table is obtained in step S200, the middleware judges whether new data written by other transactions was generated in the shared state source table during the period from t_0 to t_1; if so, the data newly written by other transactions during that period is recorded as the first new data.
S310, if the first new data is generated in the shared state source table:
the first new data is acquired through the following specific steps:
S311, acquiring the source table operation log corresponding to the source table.
The source table operation log corresponding to the source table was generated in step S130; when it is judged that the first new data has been generated in the shared state source table, the middleware acquires this log.
S312, querying the source table operation log for source table operation commands generated in the process of copying the source data.
After the source table operation log is obtained in step S311, the middleware queries it to obtain the source table operation commands generated during the period from t_0 to t_1. These are the operation commands recorded when other transactions performed data writing operations on the shared state source table during that period.
S313, acquiring the data in the source table corresponding to the source table operation commands as the first new data.
After the source table operation commands generated during the period from t_0 to t_1 are obtained in step S312, the middleware determines the data in the shared state source table that corresponds to those commands, acquires it, and records it as the first new data.
After the first new data is obtained, the middleware copies it into the target primary table, thereby obtaining the target secondary table.
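Steps S311 to S313 amount to filtering the operation log by the copy window and then reading the affected rows from the source table. The sketch below shows one way this could look, under stated assumptions: the `LogEntry` record, the `row_key` field, the dict-based tables and the inclusive window bounds are all hypothetical, since the description does not give the log format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    timestamp: float  # when the write command was logged
    row_key: str      # hypothetical key of the row the command wrote


def first_new_data(operation_log, source_rows, t_0, t_1):
    """Return source rows touched by commands logged between t_0 and t_1."""
    # S312: keep only commands generated during the copy window.
    touched = [e.row_key for e in operation_log if t_0 <= e.timestamp <= t_1]
    # S313: read the current data for each touched row once, in log order.
    return {key: source_rows[key] for key in dict.fromkeys(touched) if key in source_rows}


log = [LogEntry(5.0, "a"), LogEntry(12.0, "b"), LogEntry(15.0, "b"), LogEntry(25.0, "c")]
source_rows = {"a": 1, "b": 2, "c": 3}
target_primary = {"a": 1}

new_rows = first_new_data(log, source_rows, t_0=10.0, t_1=20.0)
target_secondary = {**target_primary, **new_rows}  # copy into the primary table
print(new_rows, target_secondary)
```

Reading the row from the source table, instead of replaying each logged command, means a row written twice in the window is copied only once with its latest value.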
S320, if the first new data is not generated in the shared state source table:
S321, sending copy end information to the client.
If it is judged that the shared state source table did not generate the first new data during the period from t_0 to t_1, the middleware sends copy end information to the client, indicating that the backup copy process of the shared state source table can be ended.
S322, in response to receiving a copy end command returned by the client based on the copy end information, determining whether third newly added data is generated in the shared state source table.
After the copy end information is sent to the client in step S321, the operator receives it through the client and, in response, sends the copy end command "copy table from t1 to t3 with stop" to the middleware.
It should be noted that, between the middleware sending the copy end information to the client and the client sending the copy end command to the middleware, new data may still be written to the shared state source table; such newly written data is recorded as the third newly added data.
In implementation, after receiving the copy end command sent by the client, the middleware first judges whether the third newly added data has been generated in the shared state source table.
S323, if yes, copying the third newly added data to a target primary table to obtain a target final table; otherwise, the target primary table is used as a target final table.
If step S322 determines that the third newly added data has been generated in the shared state source table, new data was written to the shared state source table between the middleware sending the copy end information and the client sending the copy end command. To keep the shared state source table consistent with the final backup copy, the third newly added data is copied to the target primary table to obtain the target final table, that is, the target table produced by the final backup copy, and the online copying process of the distributed data table then ends.
If step S322 determines that the third newly added data has not been generated in the shared state source table, no new data was written to the shared state source table during that interval; the target primary table is used directly as the target final table, and the online copying process of the distributed data table ends.
S400, periodically judging, based on a preset check period, whether the shared state source table generates second new data in the corresponding check period, until a check period passes in which the shared state source table generates no second new data; if so, copying the second new data generated in the corresponding check period to the target secondary table, and if not, sending copy end information to the client.
S410, in the case where the target secondary table has been generated:
Assume that the target secondary table is generated at time t_2. The period from t_1 to t_2 is the time taken to generate the target secondary table, during which other transactions may still write new data to the shared state source table; other transactions may also continue to write new data after t_2. Data newly written to the shared state source table after t_1 is recorded as the second new data. To keep the shared state source table consistent with the target final table, after t_2 the middleware periodically judges, based on the preset check period, whether the shared state source table has generated second new data in the corresponding check period, until a check period passes in which no second new data is generated.
The check period is a preset period; in this embodiment, the check period is at most 30 minutes.
As long as no check period without second new data has yet occurred, if the shared state source table generates second new data in a given check period, all second new data generated in that check period is copied to the target secondary table when the check period ends.
If no second new data is generated in a check period, new data is temporarily no longer being written to the shared state source table, and the backup process of the shared state source table can be ended. Therefore, as soon as the middleware judges that no second new data was generated in a check period, it sends the copy end information to the client.
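The periodic check in S400 is essentially a loop that keeps copying until one full check period passes with no new writes. The sketch below simulates that loop without real waiting; `poll_new_rows`, `max_periods` and the dict-based table are assumptions for illustration, with `poll_new_rows` standing in for "wait one check period, then read what the operation log recorded during it".

```python
def run_check_periods(poll_new_rows, target_secondary, max_periods=100):
    """Copy each period's new rows until one period produces none.

    Returns the number of the first quiet period, at which point the
    middleware would send the copy end information to the client.
    """
    for period in range(1, max_periods + 1):
        new_rows = poll_new_rows()  # rows written during the period just ended
        if not new_rows:
            return period
        target_secondary.update(new_rows)  # copy at the end of the period
    raise RuntimeError("source table did not go quiet within max_periods")


# Simulated periods: two with writes, then a quiet one.
batches = iter([{"d": 4}, {"e": 5}, {}])
target_secondary = {"a": 1, "b": 2}
quiet_period = run_check_periods(lambda: next(batches), target_secondary)
print(quiet_period, target_secondary)
```

The `max_periods` guard is an addition of this sketch; the description itself only bounds the length of a single check period (at most 30 minutes), not the number of periods.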
S420, after the copy end information is sent to the client, the client receives it and displays it to the operator, who then sends the copy end command "copy table from t1 to t3 with stop" to the middleware through the client.
S430, in response to receiving the copy end command returned by the client based on the copy end information, judging whether fourth newly added data is generated in the shared state source table.
After receiving the copy end command, and in order to further confirm that no new data has been written to the shared state source table, the middleware continues to judge whether new data has been written to the shared state source table since the check period in which no second new data was generated; if so, the newly written data is recorded as the fourth newly added data.
S440, if yes, copying the fourth newly added data to a target secondary table to obtain a target final table; otherwise, the target secondary table is used as a target final table.
If no second new data was generated in one check period but step S430 determines that fourth newly added data was written to the shared state source table in the next check period, then after that check period ends the fourth newly added data is copied to the target secondary table to obtain the target final table, and the online copying process of the distributed data table ends.
If no second new data was generated in one check period and step S430 determines that no fourth newly added data was written to the shared state source table in the next check period, the target secondary table is used directly as the target final table, and the online copying process of the distributed data table ends.
S500, in the case of receiving the copy cancel command:
It should be noted that, for various reasons, the online copying process of the distributed data table may need to be cancelled; in that case, the operator can issue a copy cancel command directly to the middleware.
S510, in response to receiving the copy cancel command, stopping execution of the distributed data table online copying method and clearing the intermediate files.
After receiving the copy cancel command "copy table from t1 to t3 with kill", the middleware immediately stops the online copying process of the distributed data table and clears the intermediate files generated by the part of the process already executed; the intermediate files include index information, fragment information and the like of the source table and the target table.
S520, judging whether a confirmation operation on the target final table exists, and if not, taking the target table as the target final table.
After the target final table is generated, a confirmation operation is performed on it. In this embodiment the confirmation operation is specifically a commit operation, which indicates that the generated target final table is stored in the distributed database.
In implementation, after step S510 is completed, the middleware determines whether a target final table already exists and, if so, whether a confirmation operation has been performed on it. If the target final table exists and has been confirmed, the stored target final table is kept as the target final table; if the target final table does not exist, or exists but has not been confirmed, the original blank target table is used as the target final table.
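The cancel handling in S510 and S520 can be sketched as follows. The `CopyJob` record and its fields are hypothetical names introduced only for illustration; the patent does not specify how the middleware tracks a running copy.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CopyJob:
    """Hypothetical middleware-side record of one online copy."""
    running: bool = True
    intermediate_files: list = field(default_factory=list)  # e.g. index and shard info
    final_table: Optional[list] = None  # rows of the final table, if already generated
    committed: bool = False             # True once the confirmation (commit) was done


def cancel_copy(job: CopyJob) -> list:
    """Sketch of S510/S520; returns the rows the target table ends up with."""
    job.running = False             # S510: stop the online copy
    job.intermediate_files.clear()  # S510: clear the intermediate files
    if job.final_table is not None and job.committed:
        return job.final_table      # S520: a committed final table is kept
    return []                       # S520: otherwise use the original blank target table
```

The design point this illustrates is that a cancel never leaves a partially copied table visible: only a committed result survives, and every other case falls back to the blank target table.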
S600, in the case of receiving a state query command:
It should be noted that, while the online replication method of the distributed data table is being executed, the current replication progress often needs to be queried. To do so, a user can issue the state query command "copy table from t to t3 with check status" to the middleware through the client.
S610, responding to the received state query command, and analyzing the state query command to obtain a state query statement.
After receiving the state query command, the middleware further analyzes the state query command, so that a state query statement can be obtained.
S620, querying the distributed database based on the state query statement and the target table to obtain the current replication total state.
Specifically, S620 includes the following steps:
S621, determining a sub-database in the distributed database for establishing the target table.
After obtaining the state query statement, the middleware further determines all sub-databases in the distributed database for building the target table.
S622, issuing the state query statement to the sub-database.
After determining all the sub-databases related to the target table through step S621, the middleware further synchronously issues the state query statement to all the determined sub-databases.
S623, obtaining the current copy sub-state generated by the sub-database execution state query statement.
After receiving the state query statement, each sub-database executes it to obtain the corresponding current replication sub-state and returns that sub-state to the middleware.
S624, summarizing the current replication sub-state to generate a current replication total state, and returning the current replication total state to the client.
The middleware receives the current replication sub-state returned by each sub-database and summarizes all of the received sub-states to obtain the current replication total state. It then returns the current replication total state to the client for display, so that the user can follow the progress of the online replication of the distributed data table in time.
It should be noted that the displayed current replication total state includes the cluster ID of the distributed database, the group ID, the sub-database ID, the current progress, the copy start time, the elapsed copy time, and the like.
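Steps S621 to S624 amount to a fan-out of the state query statement followed by an aggregation. The sketch below shows one possible shape of that pattern, assuming each sub-database is represented by a hypothetical callable that executes the statement and returns a `SubState`; the field names and the row-count-based progress figure are illustrative assumptions, not taken from the patent.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class SubState:
    """Hypothetical current replication sub-state reported by one sub-database."""
    sub_db_id: str
    rows_copied: int
    rows_total: int


def query_total_state(sub_dbs: dict, statement: str) -> dict:
    """sub_dbs maps a sub-database ID to a callable(statement) -> SubState."""
    # S622: issue the statement to all sub-databases concurrently.
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(run, statement) for run in sub_dbs.values()]
        # S623: collect the sub-state produced by each sub-database.
        sub_states = [f.result() for f in futures]
    # S624: summarize the sub-states into one total state.
    copied = sum(s.rows_copied for s in sub_states)
    total = sum(s.rows_total for s in sub_states)
    return {
        "sub_states": sub_states,
        "progress": copied / total if total else 1.0,
    }
```

A real total state would also carry the cluster ID, group ID and timing fields listed above; they are omitted here to keep the sketch short.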
FIG. 1 is a flow chart illustrating an online replication method of a distributed data table in one embodiment. It should be understood that, although the steps in the flowchart of FIG. 1 are shown in the sequence indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated herein, the steps are not strictly limited to this order of execution and may be executed in other orders. In addition, at least some of the steps in FIG. 1 may include a plurality of sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
Fig. 2 is a schematic structural diagram of an online copying device for a distributed data table according to an embodiment of the present invention, and referring to fig. 2, the device includes:
a table processing module, used for processing the source table and the target table in response to receiving a copy start command, to obtain a shared state source table and an exclusive state target table;
a copying module, used for copying the source data in the shared state source table to the exclusive state target table to obtain a target primary table;
a first judging and processing module, used for judging whether first newly added data is generated in the shared state source table and, if so, acquiring the first newly added data and copying it to the target primary table to obtain a target secondary table, the first newly added data being data generated during copying of the source data;
a second judging and processing module, used for periodically judging, based on a preset checking period, whether the shared state source table generates second newly added data in the corresponding checking period; if so, copying the second newly added data generated in that checking period to the target secondary table and repeating the judgment until the shared state source table generates no second newly added data in the corresponding checking period; if not, sending copy end information to the client.
It should be noted that, the technical solution for solving the technical problem provided by the online replication device of the distributed data table is similar to the technical solution defined by the online replication method of the distributed data table, and the technical solution provided by the online replication device of the distributed data table is not repeated here.
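As an illustration of what the second judging and processing module does, the following sketch loops once per checking period until a period produces no new data. The callables `fetch_new_rows` and `notify_client` and the injectable `sleep` parameter are hypothetical; the patent does not prescribe how new data is detected or how the client is notified.

```python
import time
from typing import Callable, List


def incremental_copy_loop(
    fetch_new_rows: Callable[[], List],
    target: List,
    period_seconds: float,
    notify_client: Callable[[str], None],
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Runs checking periods until one yields no new data; returns the period count."""
    periods = 0
    while True:
        sleep(period_seconds)        # wait for one checking period to elapse
        periods += 1
        new_rows = fetch_new_rows()  # second newly added data for this period
        if not new_rows:
            notify_client("copy end")  # quiet period: send copy end information
            return periods
        target.extend(new_rows)      # copy the new rows to the target secondary table
```

Passing `sleep` as a parameter is only a convenience for this sketch, so that the loop can be exercised without real waiting.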
Fig. 3 is a schematic structural diagram of a computer device according to an embodiment of the present invention, and referring to fig. 3, a computer device 60 includes a memory 602, a processor 601, and a computer program stored in the memory 602 and capable of running on the processor, where the processor 601 implements the method in the above embodiment when executing the program. FIG. 3 illustrates a block diagram of an exemplary computer device suitable for use in implementing embodiments of the present invention. The computer device 60 shown in fig. 3 is only an example and should not be construed as limiting the functionality and scope of use of embodiments of the invention. As shown in FIG. 3, the computer device 60 is in the form of a general purpose computing device. The components of the computer device 60 may include, but are not limited to: one or more processors 601, a system memory 602, and a bus 603 that connects the different system components (including the system memory 602 and the processor 601).
Bus 603 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
Computer device 60 typically includes a variety of computer system readable media. Such media can be any available media that is accessible by computer device 60 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 602 may include computer system readable media in the form of volatile memory such as Random Access Memory (RAM) 604 and/or cache memory 605. The computer device 60 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 606 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 3, commonly referred to as a "hard disk drive"). Although not shown in fig. 3, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive for reading from or writing to a removable non-volatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In such cases, each drive may be coupled to bus 603 through one or more data medium interfaces. The system memory 602 may include at least one program product having a set (e.g., at least one) of program modules configured to carry out the functions of the embodiments of the invention.
A program/utility 608 having a set (at least one) of program modules 607 may be stored in, for example, system memory 602, such program modules 607 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment. Program modules 607 generally perform the functions and/or methods of the described embodiments of the invention.
The computer device 60 may also communicate with one or more external devices 609 (e.g., keyboard, pointing device, display 610, etc.), one or more devices that enable a user to interact with the device, and/or any devices (e.g., network card, modem, etc.) that enable the computer device 60 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 611. Moreover, the computer device 60 may also communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network, such as the Internet, through a network adapter 612. As shown in fig. 3, the network adapter 612 communicates with other modules of the computer device 60 over the bus 603. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with computer device 60, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
The processor 601 executes various functional applications and data processing by running programs stored in the system memory 602.
The embodiment of the present invention also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method of the above embodiment.
The computer storage media of embodiments of the invention may take the form of any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer readable storage medium may be, for example, but not limited to: an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Smalltalk and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Note that the above is only a preferred embodiment of the present invention and the technical principle applied. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, and that various obvious changes, rearrangements, combinations, and substitutions can be made by those skilled in the art without departing from the scope of the invention. Therefore, while the invention has been described in connection with the above embodiments, the invention is not limited to the embodiments, but may be embodied in many other equivalent forms without departing from the spirit or scope of the invention, which is set forth in the following claims.

Claims (10)

1. An online replication method for a distributed data table, which is characterized by comprising the following steps:
in response to receiving a copy start command, processing the source table and the target table to obtain a shared state source table and an exclusive state target table;
copying source data in the shared state source table to the exclusive state target table to obtain a target primary table;
judging whether first newly-added data are generated in the shared state source table, if so, acquiring the first newly-added data, and copying the first newly-added data to the target primary table to obtain a target secondary table; the first new data is generated in the process of copying the source data;
periodically judging, based on a preset checking period, whether the shared state source table generates second newly added data in the corresponding checking period; if so, copying the second newly added data generated in that checking period to the target secondary table and repeating the judgment until the shared state source table generates no second newly added data in the corresponding checking period; if not, sending copy end information to a client.
2. A method as claimed in claim 1, further comprising:
if the first new data is not generated in the shared state source table, then:
sending copy ending information to the client;
responding to the received replication ending command returned by the client based on the replication ending information, and judging whether third newly added data is generated in the shared state source table or not;
if yes, copying the third newly added data to the target primary table to obtain a target final table; otherwise, the target primary table is used as the target final table.
3. A method as claimed in claim 1, further comprising:
responding to the received replication ending command returned by the client based on the replication ending information, and judging whether fourth newly added data is generated in the shared state source table;
if yes, copying the fourth newly added data to the target secondary table to obtain a target final table; otherwise, the target secondary table is used as the target final table.
4. A method according to claim 1, wherein said obtaining said first new data comprises:
acquiring a source table operation log corresponding to the source table;
querying the source table operation log for source table operation commands generated in the process of copying the source data;
and acquiring data corresponding to the source table operation command in the source table and recording the data as the first newly-added data.
5. A method as claimed in claim 1, further comprising:
responding to the received copy cancel command, stopping executing the distributed data table online copy method, and clearing the intermediate file;
judging whether a confirmation operation on a target final table exists, and if not, taking the target table as the target final table.
6. A method as claimed in claim 1, further comprising:
responding to a received state query command, and analyzing the state query command to obtain a state query statement;
and inquiring the distributed database based on the state inquiry statement and the target table to obtain the current total copying state.
7. The method of claim 6, wherein querying the distributed database for the current replication aggregate state based on the state query statement and the target table comprises:
determining a sub-database in the distributed database for establishing the target table;
issuing the state query statement to the sub-database;
acquiring a current replication sub-state generated by executing the state query statement by the sub-database;
and summarizing the current replication sub-state to generate the current replication total state, and returning the current replication total state to the client.
8. An online replication device for a distributed data table, comprising:
the table processing module is used for responding to the received copy start command, and processing the source table and the target table to obtain a shared state source table and an exclusive state target table;
the copying module is used for copying the source data in the shared state source table to the exclusive state target table to obtain a target primary table;
the first judging and processing module is used for judging whether first newly-added data are generated in the shared state source table, if so, acquiring the first newly-added data, and copying the first newly-added data to the target primary table to obtain a target secondary table; the first new data is generated in the process of copying the source data;
and the second judging and processing module is used for periodically judging, based on a preset checking period, whether the shared state source table generates second newly added data in the corresponding checking period; if so, copying the second newly added data generated in that checking period to the target secondary table and repeating the judgment until the shared state source table generates no second newly added data in the corresponding checking period; if not, sending copy end information to the client.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any of claims 1-7 when the program is executed by the processor.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method according to any of claims 1-7.
CN202311303383.5A 2023-10-10 2023-10-10 Online copying method, device and equipment for distributed data table and storage medium Pending CN117312453A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311303383.5A CN117312453A (en) 2023-10-10 2023-10-10 Online copying method, device and equipment for distributed data table and storage medium

Publications (1)

Publication Number Publication Date
CN117312453A (en) 2023-12-29

Family

ID=89261751

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311303383.5A Pending CN117312453A (en) 2023-10-10 2023-10-10 Online copying method, device and equipment for distributed data table and storage medium

Country Status (1)

Country Link
CN (1) CN117312453A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination