WO2018113580A1 - Data management method and server - Google Patents

Data management method and server

Info

Publication number
WO2018113580A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
service
incremental
basic
data set
Prior art date
Application number
PCT/CN2017/116144
Other languages
English (en)
French (fr)
Inventor
郭庆南
李跃森
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN201611178079.2A external-priority patent/CN108205560B/zh
Priority claimed from CN201611178078.8A external-priority patent/CN108205559B/zh
Application filed by 腾讯科技(深圳)有限公司 filed Critical 腾讯科技(深圳)有限公司
Publication of WO2018113580A1 publication Critical patent/WO2018113580A1/zh
Priority to US16/289,808 priority Critical patent/US11500832B2/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/214 Database migration support
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Definitions

  • the present invention relates to the field of Internet technologies, and in particular, to a data management method and a server.
  • Existing database systems manage time-sensitive service data as hot data and cold data: using a time range as the dividing line, recently generated service data (for example, service data from the last 4 months) is stored separately from older service data (for example, service data from more than 4 months ago).
  • The embodiments of the invention provide a data management method and a server that can migrate service data online and improve the efficiency of data processing such as querying and modifying service data, thereby ensuring the quality of service.
  • a first aspect of the embodiments of the present invention provides a data management method, which may include:
  • the mirror data is supplemented with the incremental data, and the incremental data is migrated to the second service data set;
  • the first service data set is a current service data set stored within a preset time period, and
  • the second service data set is a historical service data set stored outside the preset time period.
  • a second aspect of the embodiments of the present invention provides a server, including a processor and a memory, where the memory stores instructions executable by the processor, and when the instruction is executed, the processor is configured to:
  • the mirror data is supplemented with the incremental data, and the incremental data is migrated to the second service data set;
  • the first service data set is a current service data set stored within a preset time period, and the second service data set is a historical service data set stored outside the preset time period.
  • a third aspect of embodiments of the present invention provides a computer readable storage medium storing computer readable instructions that cause at least one processor to perform the method as described above.
  • FIG. 1 is a block diagram of a data management system according to an embodiment of the present invention.
  • FIG. 2 is a schematic flowchart of a data management method according to an embodiment of the present invention.
  • FIG. 3 is a schematic flowchart of another data management method according to an embodiment of the present invention.
  • FIG. 4 is a schematic flowchart of still another data management method according to an embodiment of the present invention.
  • FIG. 5 is a schematic flowchart diagram of another data management method according to an embodiment of the present invention.
  • FIG. 6 is a schematic flowchart of a data synchronization method according to an embodiment of the present invention.
  • FIG. 7 is a schematic flowchart diagram of still another data synchronization method according to an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of a server according to an embodiment of the present invention.
  • FIG. 9 is a schematic structural diagram of another server according to an embodiment of the present disclosure.
  • FIG. 10 is a schematic structural diagram of an incremental data migration module according to an embodiment of the present invention.
  • FIG. 11 is a schematic structural diagram of still another server according to an embodiment of the present invention.
  • FIG. 12 is a schematic structural diagram of a first synchronization unit according to an embodiment of the present invention.
  • FIG. 13 is a schematic structural diagram of an incremental data recording module according to an embodiment of the present disclosure.
  • FIG. 14 is a schematic structural diagram of a second recording unit according to an embodiment of the present invention.
  • FIG. 15 is a schematic structural diagram of a server according to an embodiment of the present invention.
  • With the hot data and cold data management method, when part of the hot data needs to be converted into cold data, it is often necessary to suspend the service and switch the storage node route for the hot data to be transferred.
  • The duration of the route switch depends on the amount of hot data to be transferred. When the amount of data is large, service quality is easily affected and the efficiency of data processing, such as querying and modifying the service data, is reduced.
  • FIG. 1 is a schematic structural diagram of a data management system according to an embodiment of the present invention.
  • The data management system 100 can include a plurality of coordinator nodes and a plurality of data nodes. A coordinator node provides an interface to the external user terminals 1 to N, receives data processing requests (such as queries and modifications of service data) sent by the user terminals, distributes them to the data nodes, and stores the storage index of the service data.
  • The coordinator nodes are peers of one another, and a user terminal can access any coordinator node to perform data processing on the service data.
  • the data node is configured to store business data, and execute data processing requests and the like distributed by the coordinator node.
  • the plurality of coordinator nodes and the plurality of data nodes may be respectively placed in different background service devices to form a service device group.
  • a device group composed of a plurality of coordinator nodes and a plurality of data nodes is referred to as a server.
  • Time-sensitive service data is stored separately as hot data and cold data. Such service data has a large volume and is accessed less frequently as time passes, for example, transaction flow data and call record data.
  • The hot data is stored on a plurality of data nodes, shown in FIG. 1 as data node 1, data node 2, and so on, which store recently generated service data (for example, service data from the last 4 months).
  • The cold data is likewise stored on a plurality of data nodes, shown in FIG. 1 as data node 3, data node 4, and so on, which store service data older than the recent period (for example, service data from more than 4 months ago).
  • It can be understood that storing the hot data and the cold data separately, with a time range as the dividing line, separates frequently accessed data from infrequently accessed data. As time passes, hot data that falls outside the time range must periodically be transferred to the cold data for storage.
  • In the embodiments of the present invention, the current service data set stored within the preset time period (i.e., the hot data) is defined as the first service data set, and the historical service data set stored outside the preset time period (i.e., the cold data) is defined as the second service data set.
  • First, the basic data is migrated to the second service data set (see 110); then, the incremental data acquired for the basic data during the migration of the basic data is recorded; when that migration process is complete, the incremental data is migrated to the second service data set (see 120).
  • The user terminal may include a tablet computer, a smart phone, a notebook computer, a palmtop computer, a personal computer, a mobile internet device (MID), or another terminal device.
  • FIG. 2 is a schematic flowchart diagram of a data management method according to an embodiment of the present invention. As shown in FIG. 2, the method of the embodiment of the present invention, applied to a server, may include the following steps S101 to S104.
  • The server may obtain the basic data to be migrated in the first service data set. The first service data set is the current service data set stored within the preset time period, that is, the hot data described above.
  • The basic data is service data that, as time passes, falls outside the preset time period and needs to be transferred from the first service data set to the second service data set, or service data that falls outside the preset time period because an administrator modified the preset time period.
  • The second service data set is the historical service data set stored outside the preset time period, that is, the cold data described above.
  • the server is specifically a background service device group including a plurality of coordinator nodes and a plurality of data nodes.
  • The server may generate mirror data identical to the basic data, for example by means of a data snapshot, and migrate the basic data to the second service data set.
  • S102 Record incremental data acquired for the basic data during migration of the basic data.
  • The server may record the incremental data acquired for the basic data while the basic data is being migrated from the first service data set to the second service data set.
  • It should be noted that the incremental data is data produced by operations such as insertions and updates on the basic data during the migration of the basic data.
  • The server may supplement the mirror data with the incremental data, that is, apply the recorded insertions, updates, and similar operations to the mirror data. At the same time, the server also migrates the incremental data into the second service data set, applying the same insertions and updates to the basic data there.
  • The server may update the routing information of the basic data and the incremental data, that is, switch their routing information from the first service data set to the second service data set. Queries, insertions, deletions, updates, and similar operations subsequently initiated by user terminals on the basic data and the incremental data are then all dispatched to the second service data set for execution. The server also clears the mirror data and the incremental data in the first service data set.
  • In the embodiment of the present invention, when basic data in the first service data set is migrated, mirror data of the basic data is recorded and retained, the basic data is first migrated to the second service data set, and the incremental data generated for the basic data during that migration is recorded. When the basic data migration is complete, the incremental data is migrated, and after the incremental data migration is complete, the mirror data and the incremental data in the first service data set are cleared. Service data is thereby migrated online, which improves the efficiency of data processing such as querying and modifying service data and ensures the quality of service.
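  • The flow of steps S101 to S104 can be sketched as follows. This is a minimal illustration, not the patented implementation: the two data sets and the routing table are modeled as plain dictionaries, and the function name `migrate_online` and its parameters are hypothetical.

```python
def migrate_online(hot, cold, routes, keys, writes_during_migration):
    """Move `keys` from the hot set to the cold set while writes keep arriving.

    hot, cold: dicts standing in for the first and second service data sets.
    routes: dict mapping each key to "hot" or "cold" (assumed routing table).
    writes_during_migration: dict of writes received while the copy runs.
    """
    # S101: snapshot the basic data as mirror data, then copy it to the cold set.
    mirror = {k: hot[k] for k in keys}
    cold.update(mirror)

    # S102: record the writes that arrived for the basic data during the copy.
    incremental = dict(writes_during_migration)

    # S103: apply the incremental data to the mirror data and migrate it as well.
    mirror.update(incremental)
    cold.update(incremental)

    # S104: switch routing to the cold set, then clear the hot-side copies.
    for k in mirror:
        routes[k] = "cold"
        hot.pop(k, None)
    return mirror
```

  • In this sketch the hot set keeps serving requests until the routing switch in the last step, which is the property the embodiment relies on to avoid pausing the service.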
  • FIG. 3 is a schematic flowchart diagram of another data management method according to an embodiment of the present invention. As shown in FIG. 3, the method in the embodiment of the present invention may include the following steps S201-S207.
  • The server may store service data that belongs to the preset time period in the first service data set, and store service data that does not belong to the preset time period in the second service data set.
  • the first service data set is a current service data set stored in a preset time period, that is, the hot data
  • the second service data set is a historical service data set stored except the preset time period. That is, the above cold data.
  • the server is specifically a background service device group including a plurality of coordinator nodes and a plurality of data nodes.
  • The server may obtain the basic data to be migrated in the first service data set. It can be understood that the basic data is service data that, as time passes, falls outside the preset time period and needs to be transferred from the first service data set to the second service data set, or service data in the server that falls outside the preset time period because an administrator modified the preset time period (for example, from the last 4 months to the last 3 months).
  • The server may generate mirror data identical to the basic data, for example by means of a data snapshot, and migrate the basic data to the second service data set.
  • S203 Record first incremental data acquired for the basic data during migration of the basic data.
  • The server records the first incremental data acquired for the basic data during the migration of the basic data. The first incremental data indicates the incremental data generated for the basic data while the basic data is being migrated.
  • S204 Use the first incremental data as the incremental data, supplement the mirror data with the incremental data, migrate the incremental data to the second service data set, record second incremental data acquired for the basic data and the incremental data during the migration of the incremental data, use the second incremental data as the incremental data, and repeat this step until the data amount of the second incremental data is less than a preset data amount threshold.
  • The server may use the first incremental data as the incremental data, supplement the mirror data with the incremental data, and migrate the incremental data to the second service data set. The server records the second incremental data acquired for the basic data and the incremental data during that migration, uses the second incremental data as the incremental data, and repeats the step until the data amount of the second incremental data is less than the preset data amount threshold. It can be understood that the second incremental data indicates the incremental data generated while the previous incremental data is being migrated.
  • The preset data amount threshold may be set by maintenance personnel based on experience, so that migrating the second incremental data has a negligible impact on the insertions, updates, and similar operations performed by user terminals during the migration.
  • For example, suppose the data amount of the first incremental data is 1 TB and the second incremental data generated during the 8-hour migration process is 0.5 TB, which takes 4 hours to migrate to the second service data set. The loop continues until the generated second incremental data amounts to 36 MB, which takes about 1 second to migrate. When the migration time required for the second incremental data is this short, the remaining data can be handled by the double-write operation, and the incremental data generated while the second incremental data is migrated to the second service data set need not be recorded again.
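  • The repeated step can be sketched as a loop that stops once a round produces less new incremental data than the threshold. This is a simplified model under a stated assumption: the writes arriving during a round are taken to be a fixed fraction (`write_ratio`, a hypothetical parameter) of the data migrated in that round, which matches the halving in the example but is not stated as a rule in the source.

```python
def migrate_until_small(first_increment_mb, write_ratio, threshold_mb):
    """Return (sizes of the migrated rounds, size of the final small increment).

    Assumption: each migration round generates new incremental data equal to
    write_ratio times the amount migrated in that round.
    """
    rounds = []
    pending = first_increment_mb
    while True:
        # Migrate the current incremental data to the second service data set.
        rounds.append(pending)
        # Incremental data recorded while that migration was running.
        next_increment = pending * write_ratio
        if next_increment < threshold_mb:
            # Small enough to hand over to the double-write operation.
            return rounds, next_increment
        pending = next_increment
```

  • With a ratio below 1 the rounds shrink geometrically, so the loop terminates; with a ratio of 1 or more it would not, and a real system would presumably need a different strategy.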
  • The server may use the second incremental data to modify, at the same time, the mirror data and the incremental data in the first service data set and the basic data and the incremental data in the second service data set. That is, the second incremental data is used to perform insertions, updates, and similar operations on the mirror data and the incremental data in the first service data set and, simultaneously, on the basic data and the incremental data in the second service data set.
  • In this way, the remaining service data is migrated in real time without the incremental data affecting the insertions, updates, and similar operations requested by user terminals.
  • The server can turn off the double-write operation and roll the data back. Because the routing information of the service data has not been changed at this point, the rollback is lossless.
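  • A minimal sketch of the double-write operation and its rollback follows. The class `DoubleWriter` and its attributes are hypothetical; the two data sets are modeled as dictionaries, and the route is a single flag rather than per-record routing information.

```python
class DoubleWriter:
    """Writes go to the hot set and, while enabled, to the cold set as well."""

    def __init__(self, hot, cold):
        self.hot = hot
        self.cold = cold
        self.enabled = False
        self.route = "hot"  # routing stays on the hot set until finish()

    def write(self, key, value):
        if self.route == "cold":
            # After the route switch, only the cold set is written.
            self.cold[key] = value
            return
        self.hot[key] = value
        if self.enabled:
            self.cold[key] = value

    def rollback(self):
        # Turn double-write off; the route was never changed, so the hot
        # set still holds every write and nothing needs to be restored.
        self.enabled = False

    def finish(self):
        # Migration complete: stop double-writing and switch the route.
        self.enabled = False
        self.route = "cold"
```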
  • the server determines that the migration process of the incremental data is completed.
  • The server dispatches the query request to the first service data set and returns the query result to the user terminal.
  • The server may update the routing information of the basic data and the incremental data, that is, switch their routing information from the first service data set to the second service data set. Queries, insertions, deletions, updates, and similar operations subsequently initiated by user terminals on the basic data and the incremental data are then all dispatched to the second service data set for execution. The server also clears the mirror data and the incremental data in the first service data set.
  • The number of data nodes in the server may be adjusted according to the data volume of the service data, and the server may clear service data in the second service data set that meets a deletion condition based on service requirements. The service requirement may be a preset deletion time threshold; for example, the server may clear historical data in the second service data set that is 10 or more years old. The cleared data nodes can then wait to be reused, which realizes dynamic allocation of the data nodes.
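  • Clearing by a deletion time threshold can be sketched as below. The record layout (a `created` date per record), the function name `purge_expired`, and the approximation of a year as 365 days are all assumptions made for illustration.

```python
from datetime import date, timedelta


def purge_expired(cold, today, max_age_years):
    """Split the cold set into records to keep and keys to clear.

    cold: dict mapping a key to a record with a "created" date (assumed layout).
    A year is approximated as 365 days in this sketch.
    """
    cutoff = today - timedelta(days=365 * max_age_years)
    kept = {k: v for k, v in cold.items() if v["created"] >= cutoff}
    removed = [k for k in cold if k not in kept]
    return kept, removed
```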
  • In the embodiment of the present invention, when basic data in the first service data set is migrated, mirror data of the basic data is recorded and retained, the basic data is first migrated to the second service data set, and the incremental data generated for the basic data during that migration is recorded. When the basic data migration is complete, the incremental data is migrated, and after the incremental data migration is complete, the mirror data and the incremental data in the first service data set are cleared. Service data is thereby migrated online, which improves the efficiency of data processing such as querying and modifying service data and ensures the quality of service.
  • Cyclically recording and migrating the incremental data reduces the impact of online migration on insertions and updates of the service data. The double-write operation synchronizes the migration of the remaining service data in real time without affecting the insertions and updates requested by user terminals.
  • Data nodes are allocated dynamically, so that when storage space is insufficient there is no need to replace hardware to enlarge the storage capacity of a data node, which reduces hardware costs.
  • FIG. 4 is a schematic flowchart diagram of still another data management method according to an embodiment of the present invention. As shown in FIG. 4, the method of the embodiment of the present invention may include the following steps S301-S308.
  • S302. Record incremental data acquired for the basic data during migration of the basic data.
  • The server may further detect whether the time range belongs to the preset time period; if yes, the process proceeds to step S306; otherwise, the process proceeds to step S307.
  • The server may return to the user terminal the service data in the first service data set that belongs to both the preset time period and the time range.
  • When the server detects that the time range does not belong to the preset time period, the server may return to the user terminal the service data in the second service data set that does not belong to the preset time period but belongs to the time range.
  • The server may return to the user terminal both the first service data in the first service data set that belongs to the preset time period and the second service data in the second service data set that does not belong to the preset time period.
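  • The three query cases can be sketched with one function. This is an illustrative model only: records are reduced to integer timestamps, the preset time period is represented by its start `hot_start` (hypothetical name), and the last case is assumed to be a time range that only partly overlaps the preset time period.

```python
def query(hot, cold, hot_start, start, end):
    """Return the timestamps in [start, end], reading each set only for its own period.

    hot, cold: lists of integer timestamps in the first and second data sets.
    hot_start: first timestamp of the preset time period (assumed representation).
    """
    result = []
    if end >= hot_start:
        # The range reaches into the preset period: read the first data set,
        # keeping only records that really belong to that period.
        result += [t for t in hot if start <= t <= end and t >= hot_start]
    if start < hot_start:
        # The range reaches before the preset period: read the second data
        # set, keeping only records outside the preset period.
        result += [t for t in cold if start <= t <= end and t < hot_start]
    return sorted(result)
```

  • Filtering each set by its own period means a record stored in the wrong set is never returned, whichever set the range touches.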
  • Steps S305 to S308 of the embodiment of the present invention need not follow the execution order described above; that is, the user terminal may initiate the service data query process at any time.
  • In the embodiment of the present invention, when basic data in the first service data set is migrated, mirror data of the basic data is recorded and retained, the basic data is first migrated to the second service data set, and the incremental data generated for the basic data during that migration is recorded. When the basic data migration is complete, the incremental data is migrated, and after the incremental data migration is complete, the mirror data and the incremental data in the first service data set are cleared. Service data is thereby migrated online, which improves the efficiency of data processing such as querying and modifying service data and ensures the quality of service.
  • Considering that service data may be erroneously inserted during storage, erroneously inserted service data is not returned to the user terminal, which preserves the consistency of data access.
  • Currently, a PostgreSQL (object-relational database management system) database usually records, in a log, the incremental data generated by its various data processing services and then migrates (or synchronizes) the incremental data to other databases. That is, the current incremental data synchronization method depends on the log.
  • The log is strongly tied to the database version, so incremental data cannot be synchronized through the log between two databases of different versions.
  • The current synchronization method is also based on a single process; that is, all the incremental data must be synchronized to other databases by one process. Under heavy concurrent writes, incremental data is generated quickly, so a single-process synchronization method greatly reduces the synchronization efficiency.
  • FIG. 5 is a schematic flowchart diagram of another data management method according to an embodiment of the present invention. As shown in FIG. 5, the method of the embodiment of the present invention may include the following steps S501-S506.
  • For this step, reference may be made to the description of step S101 above; details are not described herein again.
  • the incremental data acquired for the basic data during the migration process of the basic data is recorded.
  • the incremental data is referred to herein as total incremental data.
  • The server may acquire a data operation instruction sent by the user equipment, where the data operation instruction may include a data addition instruction, a data deletion instruction, a data modification instruction, and the like, and the server may then execute a data processing service according to the data operation instruction.
  • The data processing service may include a data addition service, a data deletion service, a data modification service, and the like performed on the basic data in the first service data set (for convenience of description, the collection of the basic data is referred to as the source database). For example, if the data operation instruction is a data deletion instruction, the server deletes the corresponding data in the source database according to the data deletion instruction.
  • The total incremental data recorded by the server is incremental data based on logical statements, for example SQL (Structured Query Language) statements.
  • The server may record each modification, deletion, and insertion operation on each row of each data table in the database as a separate SQL statement, together with the data table information or database information involved in that statement, and then determine the recorded SQL statements and the associated data table or database information as the total incremental data. Because SQL statements are common across different versions of the database, incremental data based on SQL statements can be synchronized between databases of different versions.
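  • Recording incremental data as SQL statements and replaying them on a target can be sketched as follows. The names `IncrementRecord`, `IncrementLog`, and `replay` are hypothetical, and Python's built-in `sqlite3` module is used only as a stand-in for the target database; the source describes PostgreSQL.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class IncrementRecord:
    table: str  # data table information involved in the statement
    sql: str    # one row-level operation recorded as a single SQL statement


class IncrementLog:
    """Collects statement-based incremental data in the order it was produced."""

    def __init__(self):
        self.records = []

    def record(self, table, sql):
        self.records.append(IncrementRecord(table, sql))


def replay(records, conn):
    """Apply the recorded statements, in order, to the target connection."""
    for rec in records:
        conn.execute(rec.sql)
    conn.commit()
```

  • Because the log holds plain SQL text rather than a version-specific binary log, the same records can in principle be replayed against any target that accepts those statements.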
  • For this step, reference may be made to the description of step S103 above; details are not described herein again.
  • Each first data table includes corresponding table incremental data, and the sum of the table incremental data corresponding to each of the first data tables is the total incremental data.
  • The execution status of the data processing service may be further detected. If the execution status is a data rollback state, the recorded total incremental data is deleted. If the execution status is a successful execution state, the data processing service has been committed successfully, and at least one first data table associated with the total incremental data may then be searched for. If the data record range is the at least one second data table, the at least one second data table includes the at least one first data table.
  • The 4 data tables that contain the incremental data may be determined as the first data tables associated with the total incremental data.
  • If the server records the total incremental data generated by the data processing service in the source database, and the operations of the data processing service involve five data tables in the source database, the five data tables are determined as the first data tables associated with the total incremental data. The incremental data generated in each first data table is its table incremental data, and the sum of the table incremental data of the five first data tables is the total incremental data.
  • The server may determine data table A and data table B as the first data tables associated with the total incremental data; the sum of the table incremental data corresponding to data table A and data table B is the total incremental data.
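  • The handling of the execution status and the split into table incremental data can be sketched as below. The function name `finalize`, the status strings, and the representation of the total incremental data as `(table, sql)` pairs are assumptions for illustration.

```python
from collections import defaultdict


def finalize(records, status):
    """Group total incremental data by table, or discard it on rollback.

    records: list of (table, sql) pairs recorded for one data processing service.
    status: "rollback" or "committed" (hypothetical status values).
    """
    if status == "rollback":
        # The service was rolled back, so its recorded increments are deleted.
        return {}
    by_table = defaultdict(list)
    for table, sql in records:
        # Each table that appears here is a "first data table"; its list is
        # that table's table incremental data, kept in recorded order.
        by_table[table].append(sql)
    return dict(by_table)
```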
  • The migration is a parallel synchronization process.
  • The server may distribute the SQL statements by first data table, so that all SQL-based incremental data in the same first data table (i.e., its table incremental data) is synchronized by the same process, while the table incremental data of different first data tables is synchronized by different processes. This both preserves the consistency of the data within each first data table and allows the table incremental data of the first data tables to be synchronized in parallel to the second service data set.
  • For example, suppose the server includes a first data table A, a first data table B, and a first data table C, which contain table incremental data a, b, and c, respectively. The server may synchronize table incremental data a to the second service data set by a first process, table incremental data b by a second process, and table incremental data c by a third process, so that table incremental data a, b, and c are synchronized in parallel.
  • Compared with synchronizing all incremental data in a single process, parallel synchronization by data table can greatly reduce the synchronization time and thus improve the synchronization efficiency of the incremental data.
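  • Table-level parallel synchronization can be sketched as follows. Note the substitution: the source speaks of separate processes, while this sketch uses threads from `concurrent.futures.ThreadPoolExecutor` to stay short and self-contained. The function names and the dictionary standing in for the second service data set are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def sync_table(table, statements, target):
    """Apply one table's increments in recorded order; returns (table, count)."""
    applied = []
    for sql in statements:
        # A single worker handles the whole table, so order is preserved.
        applied.append(sql)
    target[table] = applied
    return table, len(applied)


def sync_in_parallel(table_increments, target):
    """Synchronize each table's incremental data with its own worker."""
    workers = max(1, len(table_increments))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(sync_table, table, statements, target)
            for table, statements in table_increments.items()
        ]
        return dict(f.result() for f in futures)
```

  • Assigning one worker per table is what keeps statements for the same table in order while still letting different tables proceed concurrently.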
  • For this step, reference may be made to the description of step S104 above; details are not described herein again.
  • Because the incremental data is recorded as SQL statements, the embodiment of the present invention can synchronize incremental data between databases of different versions; and because the table incremental data of each table is synchronized in parallel by multiple processes, the synchronization efficiency can be effectively improved.
  • FIG. 6 is a schematic flowchart of a data synchronization method according to an embodiment of the present invention.
  • the method may include:
  • The server may further set a data record range corresponding to the data processing service, according to the current service requirement. For example, if the current requirement is that the data modified in data table A and data table B must be synchronized, the data record range can be set to data table A and data table B; that is, only the total incremental data generated by the data processing service in data table A and data table B is subsequently recorded.
  • If the modified basic data in the entire source database (i.e., including the basic data in the first service data set) currently needs to be synchronized, the data record range may be set to the source database, and the total incremental data generated by the data processing service in the source database is subsequently recorded.
  • If the data record range is the source database, the server may first determine whether the modified data (or deleted or newly added data) corresponding to the data processing service belongs to the source database. If it does, the incremental data corresponding to the modified (or deleted or newly added) data is recorded; otherwise, it is not recorded. That is, the server records only the total incremental data generated by the data processing service in the source database.
  • by setting the data record range, the incremental data in a specified data table or database can be selectively recorded, which solves the problem that modifications to data tables or databases that are not needed are recorded in the log together. Therefore, the embodiment of the present invention can select the incremental data to be recorded more flexibly, which not only improves the synchronization efficiency but also saves the storage resources of the server.
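  • the selective recording described above can be illustrated with a minimal sketch. The class name `RecordRange`, its fields, and the database name `source_db` are hypothetical and are not taken from the embodiment; the sketch only shows how a range limited to selected data tables, or covering a whole database, might decide whether an operation is recorded.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class RecordRange:
    """Hypothetical data record range: a whole database or selected tables."""
    database: str
    tables: Optional[FrozenSet[str]] = None  # None means every table in the database

    def covers(self, database: str, table: str) -> bool:
        """Return True if an operation on (database, table) should be recorded."""
        if database != self.database:
            return False
        return self.tables is None or table in self.tables


# Range limited to data table A and data table B of the source database.
table_range = RecordRange("source_db", frozenset({"A", "B"}))
# Range covering the entire source database.
db_range = RecordRange("source_db")
```

  • with such a range, an operation on a data table C would be skipped under `table_range` but recorded under `db_range`, while operations on any other database are skipped in both cases.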
  • the server may first detect whether a data processing service is currently being executed and, if so, wait for that data processing service to finish before performing step S601. This ensures that the recorded total incremental data is the incremental data of complete data processing services, and thus guarantees the integrity of each synchronized data processing service when the total incremental data is synchronized.
  • If the data record range is at least one second data table, record the total incremental data generated based on the data processing service in the at least one second data table.
  • if the data record range is at least one second data table, the server only records the total incremental data generated based on the data processing service in the at least one second data table. For example, if there are four second data tables and the data processing service is a data deletion service, the server only needs to record each data deletion operation in the four second data tables, together with the operation sequence number and the second data table information corresponding to each data deletion operation, and determine each data deletion operation and its corresponding operation sequence number and second data table information as the total incremental data.
  • the total incremental data recorded by the server is incremental data based on logical statements, and may be incremental data based on SQL statements. The server may record each modification, deletion, and addition operation on each row of data in each data table in the database as a SQL statement, also record the data table information or database information involved in the SQL statement, and then determine the recorded SQL statements and the data table information or database information involved as the total incremental data. Since SQL statements are common to different versions of the database, incremental data based on SQL statements can be synchronized between different versions of the database.
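  • as a rough illustration of statement-based incremental data, the sketch below stores each operation as a SQL statement with an operation sequence number and data table information, and replays the statements on a target database. The record layout, the `orders` table, and the use of Python's built-in `sqlite3` module as the target are assumptions made for the example only.

```python
import sqlite3
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class IncrementRecord:
    seq: int            # operation sequence number
    table: str          # data table information
    sql: str            # the logical (SQL) statement itself
    params: Tuple = ()  # bound values for the statement


def replay(target: sqlite3.Connection, records: List[IncrementRecord]) -> None:
    """Apply recorded statements to the target in sequence-number order."""
    for rec in sorted(records, key=lambda r: r.seq):
        target.execute(rec.sql, rec.params)
    target.commit()


# Hypothetical target database that already holds the migrated base rows.
target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
target.execute("INSERT INTO orders VALUES (1, 'new'), (2, 'new')")

# Recorded out of order on purpose; replay() sorts by sequence number.
total_incremental_data = [
    IncrementRecord(2, "orders", "DELETE FROM orders WHERE id = ?", (2,)),
    IncrementRecord(1, "orders", "UPDATE orders SET status = ? WHERE id = ?", ("paid", 1)),
]
replay(target, total_incremental_data)
rows = target.execute("SELECT id, status FROM orders ORDER BY id").fetchall()
```

  • because only statements are stored, the replay does not depend on the storage format of the source database, which is the property the embodiment relies on for synchronization between versions.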
  • If the data record range is the database including the basic data (referred to as the source database), record the total incremental data generated based on the data processing service in the source database.
  • if the data record range is the source database, it may be determined whether the modified data (or the deleted data or the newly added data) corresponding to the data processing service belongs to the source database. If it does, the incremental data corresponding to the modified data (or the deleted data or the newly added data) is recorded; otherwise the corresponding incremental data is not recorded. That is, the server only records the total incremental data generated based on the data processing service in the source database. For example, if the server includes five databases, one of which is the source database, and the data processing service is a data modification service, the server only needs to record each data modification operation in the source database, together with the operation sequence number, the modified data, and the data table information corresponding to each data modification operation, and determine each data modification operation and its corresponding operation sequence number, modified data, and data table information as the total incremental data.
  • the total incremental data recorded includes the data deletion part and the data modification part.
  • by setting the data record range, the incremental data in a specified data table or database can be selectively recorded, which solves the problem that modifications to data tables or databases that are not needed are recorded in the log together. Therefore, the embodiment of the present invention can select the incremental data to be recorded more flexibly, which not only improves the synchronization efficiency but also saves the storage resources of the server.
  • the total incremental data in S603 is also incremental data based on the SQL statement.
  • If the execution state of the data processing service is a data rollback state, delete the recorded total incremental data.
  • the execution state of the data processing service may be further detected. If the execution state is a data rollback state, the data modified (or deleted or added) by the data processing service has not been written to disk, so the recorded total incremental data can be deleted.
  • If the execution state of the data processing service is a successful execution state, search for at least one first data table associated with the total incremental data.
  • if the execution state of the data processing service is a successful execution state, the data processing service has been committed successfully, and at least one first data table associated with the total incremental data may then be searched for.
  • if the data record range is the source database, the source database may be searched for the data tables in which data was modified (or deleted or added), and the data tables found are the first data tables associated with the total incremental data.
  • the at least one second data table includes the at least one first data table. For example, if there are five second data tables, four of which contain incremental data (i.e., table delta data), the four data tables containing the delta data may be determined as the first data tables associated with the total delta data.
  • Each of the first data tables includes corresponding table increment data; the sum of the table increment data corresponding to each of the first data tables is the total incremental data.
  • for example, if the server records the total incremental data generated by the data processing service in the source database, and the operations corresponding to the data processing service involve five data tables in the source database, the five data tables are determined as the first data tables associated with the total incremental data. The incremental data generated in each first data table is its table increment data, and the sum of the table increment data corresponding to the five first data tables is the total incremental data.
  • the server may determine the data table A and the data table B as the first data tables associated with the total incremental data, and the sum of the table incremental data corresponding to the data table A and the data table B is the total incremental data.
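  • the relationship between the execution state, the first data tables, and the table increment data can be sketched as follows. The tuple layout and the function name `split_by_table` are illustrative assumptions; the sketch discards the record on rollback and otherwise splits the total incremental data into per-table delta data.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Record = Tuple[int, str, str]  # (operation sequence number, table name, SQL statement)


def split_by_table(records: List[Record], rolled_back: bool) -> Dict[str, List[Record]]:
    """Return the table delta data for each first data table.

    If the data processing service was rolled back, nothing is synchronized.
    """
    if rolled_back:
        return {}
    by_table: Dict[str, List[Record]] = defaultdict(list)
    for rec in sorted(records):          # tuples sort by sequence number first
        by_table[rec[1]].append(rec)
    return dict(by_table)


total = [
    (1, "A", "UPDATE A SET v = 1 WHERE id = 7"),
    (2, "B", "DELETE FROM B WHERE id = 3"),
    (3, "A", "INSERT INTO A (id, v) VALUES (8, 2)"),
]
deltas = split_by_table(total, rolled_back=False)
```

  • the keys of `deltas` play the role of the first data tables, and the per-table lists taken together contain exactly the statements of the total incremental data.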
  • the primary key is a Primary Key
  • the value of the Primary Key is unique in the first data table
  • the primary key corresponding to each row of data in the first data table is filled in the first data table.
  • the server may perform SQL distribution according to the first data table, so that all the SQL statement-based incremental data in the same first data table (that is, the table delta data) is synchronized by the same process, and the table increment data in different first data tables is synchronized by different processes. In this way, the data in the first data tables can be synchronized to the second service data set (referred to as the target database) in parallel.
  • for example, the server includes a first data table A, a first data table B, and a first data table C, where the first data table A includes table increment data a, the first data table B includes table increment data b, and the first data table C includes table delta data c. The server can synchronize the table delta data a to the target database through a first process, synchronize the table delta data b to the target database through a second process, and synchronize the table incremental data c to the target database through a third process; that is, parallel synchronization of the table incremental data a, the table incremental data b, and the table incremental data c is realized.
  • compared with synchronizing all incremental data with a single process, parallel synchronization based on data tables can greatly reduce the synchronization time and thus improve the synchronization efficiency of the incremental data.
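  • the table-based distribution above can be sketched as follows. This is only an illustration under stated assumptions: worker threads from Python's `concurrent.futures` stand in for the separate processes of the embodiment, and `apply_to_target` is a hypothetical callback that would execute a statement on the target database.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def sync_tables_in_parallel(
    table_deltas: Dict[str, List[str]],
    apply: Callable[[str, str], None],
) -> None:
    """Give each table's delta to exactly one worker.

    Statements of the same table are applied in order by the same worker;
    different tables are handled by different workers at the same time.
    """
    def sync_one(table: str) -> None:
        for sql in table_deltas[table]:
            apply(table, sql)

    with ThreadPoolExecutor(max_workers=max(1, len(table_deltas))) as pool:
        # list() waits for all workers and re-raises any worker exception.
        list(pool.map(sync_one, table_deltas))


# Stand-in for the target database: remember what was applied per table.
applied: Dict[str, List[str]] = {"A": [], "B": [], "C": []}
lock = threading.Lock()


def apply_to_target(table: str, sql: str) -> None:
    with lock:
        applied[table].append(sql)


deltas = {"A": ["a1", "a2"], "B": ["b1"], "C": ["c1", "c2", "c3"]}
sync_tables_in_parallel(deltas, apply_to_target)
```

  • keeping one worker per table preserves the statement order inside a table, which is why the per-table order in `applied` matches the recorded order even though the tables are processed concurrently.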
  • S609 Synchronize the table delta data in the first data table not including the primary key to the second service data set in parallel, search for at least one target row data associated with the table increment data corresponding to the first data table that includes the primary key, and synchronize the row increment data corresponding to each target row data to the second service data set in parallel.
  • for each first data table not including the primary key, the server may perform parallel synchronization of the table increment data based on the first data table. For the specific process, reference may be made to step S505, and details are not described herein again.
  • for a first data table including the primary key, the server may first search for at least one target row data associated with the table delta data corresponding to that first data table; the sum of the row increment data corresponding to each target row data in the first data table is the table increment data corresponding to the first data table. For example, if the operations performed by the data processing service involve 10 rows of data in the first data table, the 10 rows of data are the target row data associated with the table delta data corresponding to the first data table, and the sum of the row increment data corresponding to the 10 rows of data is the table delta data corresponding to the first data table.
  • the server further performs SQL distribution according to the target row data. The row increment data corresponding to the target row data may include a Primary Key. The row increment data with the same Primary Key in a first data table (there may be two or more such pieces of row increment data) is synchronized by the same process, and the row increment data with different Primary Keys is synchronized by different processes. Since the Primary Keys in a first data table are all different and never repeated, parallel synchronization of the row delta data based on the Primary Key not only ensures the consistency of the data in the first data table, but also achieves higher synchronization efficiency than parallel synchronization of the table delta data based on the first data table.
  • for example, the first data table in the server includes a Primary Key, and the target row data in the first data table includes row data A, row data B, and row data C, where row data A includes row increment data a, row data B includes row delta data b, and row data C includes row delta data c. The server can synchronize the row delta data a to the target database through a first process, synchronize the row delta data b to the target database through a second process, and synchronize the row increment data c to the target database through a third process; that is, parallel synchronization of the row increment data a, the row increment data b, and the row delta data c is realized.
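  • the Primary Key based distribution can be sketched as below. The sketch deviates from the text in one labeled respect: instead of one process per Primary Key, it assumes a fixed number of workers and assigns each key to a worker by hashing, which still guarantees that all row increment data with the same Primary Key reaches the same worker in order. The tuple layout and function name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Tuple

RowDelta = Tuple[int, Hashable, str]  # (sequence number, Primary Key, SQL statement)


def distribute_by_primary_key(
    row_deltas: List[RowDelta], workers: int
) -> Dict[int, List[RowDelta]]:
    """Assign every row delta to a worker queue chosen from its Primary Key.

    Row deltas that share a Primary Key always land in the same queue and
    keep their sequence order, so updates to one row are never reordered.
    """
    queues: Dict[int, List[RowDelta]] = defaultdict(list)
    for delta in sorted(row_deltas, key=lambda d: d[0]):
        queues[hash(delta[1]) % workers].append(delta)
    return dict(queues)


row_deltas = [
    (1, 101, "UPDATE t SET v = 1 WHERE id = 101"),
    (2, 102, "UPDATE t SET v = 5 WHERE id = 102"),
    (3, 101, "UPDATE t SET v = 2 WHERE id = 101"),
    (4, 103, "DELETE FROM t WHERE id = 103"),
]
queues = distribute_by_primary_key(row_deltas, workers=2)
```

  • each queue could then be drained by its own process; two statements for row 101 stay together and in order, while rows with other keys may be applied concurrently.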
  • the embodiment of the present invention can obtain a data operation instruction, execute a data processing service according to the data operation instruction, record the total incremental data generated by the data processing service, search for at least one first data table associated with the total incremental data, and synchronize the table delta data in each first data table to the target database in parallel. Since the total incremental data is incremental data based on logical statements, and incremental data based on logical statements can be used in various versions of the database, embodiments of the present invention can perform incremental data synchronization between different versions of the database. By determining the data record range corresponding to the data processing service, the incremental data in a specified data table or database can be selectively recorded, which not only improves the synchronization efficiency but also saves the storage resources of the server. Multi-process parallel synchronization of the incremental data of each table can effectively improve synchronization efficiency; and when a data table contains the primary key, the table incremental data corresponding to that data table can be further synchronized in parallel based on the row incremental data, to further improve synchronization efficiency.
  • FIG. 7 is a schematic flowchart of still another data synchronization method according to an embodiment of the present disclosure, where the method may include:
  • the background service instruction is an online upgrade instruction or a data relocation instruction.
  • the start timestamp corresponding to the background service instruction may be recorded. If the background service instruction is an online upgrade instruction, the server may create a database of a new version as the target database representing the second service data set, determine the full data in the source database (i.e., the database including the basic data) at the start timestamp as the full data to be synchronized (the full data is all the data in the source database, assuming that the server is not executing a data processing service at the start timestamp), and synchronize the full data to be synchronized to the target database according to the background service instruction.
  • the server may use a currently existing database, a newly created database, or a database in another server as the target database, and the version of the target database may be the same as or different from the version of the source database.
  • the server may be first synchronized to the target database, and after the data processing service is executed, the updated first part of the data obtained after the data processing service is executed may be resynchronized to the target database.
  • the server may acquire data operation instructions in real time starting from the start timestamp, and execute the data processing service according to the data operation instruction.
  • the data record range may be set to the source database, so that the incremental data generated based on the data processing service in the source database is recorded.
  • the server may acquire data operation instructions in real time starting from the start timestamp, and execute the data processing service according to the data operation instruction.
  • the data record range is set to the data tables related to the second part of data, as the at least one second data table, and the incremental data generated based on the data processing service may then be recorded in the at least one second data table.
  • the server may reset the data record range to the source database, thereby starting to record the incremental data generated based on the data processing service in the source database.
  • S705. Determine a time at which the data processing service is completed as an end timestamp, and determine all incremental data recorded between the start timestamp and the end timestamp as total incremental data.
  • the online upgrade is completed or the data relocation is completed. At this time, recording of the incremental data is stopped; if the data processing service is still not completed, the server may wait until the data processing service is completed before stopping the recording of incremental data.
  • the server may further determine the time at which the data processing service is completed as an end timestamp, and determine all incremental data recorded between the start timestamp and the end timestamp as the total incremental data. If the determination in S704 is no, it indicates that the online upgrade or the data relocation has not been completed, so the server continues to perform step S703 to record new incremental data.
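  • the timestamp window can be illustrated with a small sketch. The tuple layout and the inclusive bounds are assumptions; the embodiment only states that all incremental data recorded between the start timestamp and the end timestamp forms the total incremental data.

```python
from typing import List, Tuple

Timed = Tuple[float, str]  # (time at which the statement was recorded, SQL statement)


def total_incremental(recorded: List[Timed], start_ts: float, end_ts: float) -> List[str]:
    """Collect, in time order, the statements recorded within [start_ts, end_ts]."""
    return [sql for ts, sql in sorted(recorded) if start_ts <= ts <= end_ts]


recorded = [
    (12.0, "UPDATE t SET v = 2 WHERE id = 1"),
    (9.5, "INSERT INTO t (id, v) VALUES (1, 1)"),   # before the start timestamp
    (15.0, "DELETE FROM t WHERE id = 1"),
    (21.0, "INSERT INTO t (id, v) VALUES (2, 1)"),  # after the end timestamp
]
window = total_incremental(recorded, start_ts=10.0, end_ts=20.0)
```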
  • If the execution state of the data processing service is a data rollback state, delete the recorded total incremental data.
  • each first data table includes corresponding table increment data;
  • the sum of the table increment data corresponding to each of the first data tables is the total incremental data.
  • the table increment data in each of the first data tables is synchronized to the target database (i.e., the second service data set) in parallel.
  • S711 Synchronize the table delta data in the first data table not including the primary key to the second service data set in parallel, search for at least one target row data associated with the table increment data corresponding to the first data table that includes the primary key, and synchronize the row increment data corresponding to each target row data to the second service data set in parallel.
  • step S711 is performed.
  • until the upgrade or the relocation is completed, the source database continues to provide services, and the table increment data and/or row increment data in the total incremental data recorded in the source database is synchronized to the target database in parallel to ensure the consistency of the data in the target database. The entire service process is not interrupted; that is, the transition from using the source database to using the target database is achieved while the service remains uninterrupted.
  • the server provided by the embodiment of the present invention will be described in detail below with reference to FIG. 8 to FIG. 15. It should be noted that the server shown in FIG. 8 to FIG. 15 is used to execute the method of the embodiment shown in FIG. 2 to FIG. 7 of the present invention. For the convenience of description, only the part related to the embodiment of the present invention is shown. For specific technical details not disclosed, please refer to the embodiment shown in Figures 2-7 of the present invention.
  • FIG. 8 is a schematic structural diagram of a server according to an embodiment of the present invention.
  • the server 1 of the embodiment of the present invention may include: a basic data migration module 11, an incremental data recording module 12, an incremental data migration module 13, and a mirror data clearing module 14.
  • the basic data migration module 11 is configured to acquire basic data to be migrated in the first service data set, generate mirror data that is the same as the basic data, and migrate the basic data to the second service data set.
  • the basic data migration module 11 may generate mirror data that is the same as the basic data, for example by using a data snapshot, and the basic data migration module 11 migrates the basic data to the second service data set.
  • the incremental data recording module 12 is configured to record incremental data acquired for the basic data during migration of the basic data
  • the incremental data recording module 12 may record the incremental data acquired for the basic data during the migration of the basic data from the first service data set to the second service data set. The incremental data is data of operations such as data insertion and update that need to be performed on the basic data during the migration of the basic data.
  • the incremental data migration module 13 is configured to: when the migration process of the basic data is completed, perform addition processing on the mirror data by using the incremental data, and migrate the incremental data into the second service data set;
  • the incremental data migration module 13 may perform addition processing on the mirror data by using the incremental data, that is, add the data of operations such as data insertion and update on the basic data to the mirror data; the incremental data migration module 13 further needs to migrate the incremental data into the second service data set, so that data insertion, update, and the like are performed on the basic data there.
  • the mirroring data clearing module 14 is configured to: when the migration process of the incremental data is completed, clear the mirrored data and the incremental data in the first service data set;
  • the mirrored data clearing module 14 may update the routing information of the basic data and the incremental data, that is, change the routing information of the basic data and the incremental data from the first service data set to the second service data set, so that queries, insertions, deletions, updates, and the like for the basic data and the incremental data subsequently initiated by the user terminal are all dispatched to the second service data set for execution. The mirror data clearing module 14 simultaneously clears the mirror data and the incremental data in the first service data set.
  • FIG. 9 is a schematic structural diagram of another server according to an embodiment of the present invention.
  • the server 1 of the embodiment of the present invention may include: a basic data migration module 11, an incremental data recording module 12, an incremental data migration module 13, a mirrored data clearing module 14, a service data storage module 15, a time detecting module 16, a first data returning module 17, and a second data returning module 18.
  • the service data storage module 15 is configured to store the service data that belongs to the preset time period into the first service data set, and store the service data that does not belong to the preset time period into the second service data set.
  • the service data storage module 15 may store the service data that belongs to the preset time period into the first service data set, and store the service data that does not belong to the preset time period into the second service data set.
  • the first service data set is the current service data set stored within the preset time period, that is, the hot data described above; the second service data set is the historical service data set stored outside the preset time period, that is, the cold data described above.
  • the server 1 is specifically a background service device group including a plurality of coordinator nodes and a plurality of data nodes.
  • the basic data migration module 11 is configured to acquire basic data to be migrated in the first service data set, generate mirror data that is the same as the basic data, and migrate the basic data to the second service data set;
  • the basic data migration module 11 may obtain basic data to be migrated in the first service data set. It may be understood that the basic data is service data that, as time passes, falls outside the preset time period and needs to be transferred from the first service data set to the second service data set, or the basic data results from a modification of the preset time period by the administrator (for example, changing from the most recent 4 months to the most recent 3 months).
  • the basic data migration module 11 may generate mirror data that is the same as the basic data, for example by using a data snapshot.
  • the basic data migration module 11 migrates the basic data into the second service data set.
  • the incremental data recording module 12 is configured to record incremental data acquired for the basic data during migration of the basic data
  • the incremental data recording module 12 may record the incremental data acquired for the basic data during the migration of the basic data from the first service data set to the second service data set. The incremental data is data of operations such as data insertion and update that need to be performed on the basic data during the migration of the basic data.
  • the migration of the basic data also takes a long time, and the amount of incremental data generated during the migration of the basic data is also large, so the continuously generated incremental data needs subsequent migration processing.
  • the incremental data recording module 12 records first incremental data acquired for the basic data during the migration of the basic data; the first incremental data indicates the incremental data generated for the basic data during the migration process of the basic data.
  • the incremental data migration module 13 is configured to: when the migration process of the basic data is completed, perform addition processing on the mirror data by using the incremental data, and migrate the incremental data into the second service data set;
  • the incremental data migration module 13 may perform addition processing on the mirror data by using the incremental data, that is, add the data of operations such as data insertion and update on the basic data to the mirror data; the incremental data migration module 13 further needs to migrate the incremental data into the second service data set, so that data insertion, update, and the like are performed on the basic data there.
  • FIG. 10 is a schematic structural diagram of an incremental data migration module according to an embodiment of the present invention.
  • the incremental data migration module 13 may include:
  • the incremental data processing unit 131 is configured to: when the migration process of the basic data is completed, use the first incremental data as the incremental data, perform addition processing on the mirror data by using the incremental data, migrate the incremental data into the second service data set, record second incremental data acquired for the basic data and the incremental data during the migration of the incremental data, use the second incremental data as the incremental data, and repeat this step until the data amount of the second incremental data is less than a preset data amount threshold.
  • the result obtaining unit 132 is configured to: when the data amount of the second incremental data is less than the preset data amount threshold, use the second incremental data to simultaneously modify the mirror data and the incremental data in the first service data set and the basic data and the incremental data in the second service data set, and obtain a modification processing result.
  • the process determining unit 133 is configured to determine that the migration process of the incremental data is completed when the modification processing result is that the modification process is successful.
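  • the repeated migration performed by the units above can be sketched as a loop. The function names and the callback `migrate` are hypothetical: `migrate` is assumed to apply one batch of incremental data to the mirror data and the second service data set and to return the incremental data that arrived while that batch was being migrated.

```python
from typing import Callable, List


def migrate_increments(
    first_increment: List[str],
    migrate: Callable[[List[str]], List[str]],
    threshold: int,
) -> List[str]:
    """Migrate incremental data round by round.

    Each round migrates the pending batch and collects the second incremental
    data produced meanwhile. The loop ends when that newly collected batch is
    smaller than the threshold; the small batch is returned so that it can be
    applied to both service data sets at the same time.
    """
    pending = first_increment
    while True:
        arrived = migrate(pending)
        if len(arrived) < threshold:
            return arrived
        pending = arrived


# Simulated rounds: three statements arrive during round 1, one during round 2.
arrivals = iter([["u4", "u5", "u6"], ["u7"]])
migrated: List[List[str]] = []


def fake_migrate(batch: List[str]) -> List[str]:
    migrated.append(list(batch))
    return next(arrivals)


final_batch = migrate_increments(["u1", "u2", "u3"], fake_migrate, threshold=2)
```

  • each round is expected to be shorter than the previous one because less data accumulates during a shorter migration, so the final simultaneous modification only has to handle a small batch.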
  • the mirroring data clearing module 14 is configured to: when the migration process of the incremental data is completed, clear the mirrored data and the incremental data in the first service data set.
  • the time detecting module 16 is configured to detect whether the time range belongs to the preset time period when detecting a service data query request with a time range sent by the user terminal.
  • the first data returning module 17 is configured to: if the time detecting module 16 detects that the time range belongs to the preset time period, return to the user terminal the service data in the first service data set that belongs to the preset time period and to the time range.
  • the first data returning module 17 is further configured to: if the time detecting module 16 detects that the time range does not belong to the preset time period, return to the user terminal the service data in the second service data set that does not belong to the preset time period but belongs to the time range.
  • a second data returning module 18, configured to: when a service data query request that is sent by the user terminal and does not carry a time range is detected, return to the user terminal the first service data that belongs to the preset time period in the first service data set and the second service data that does not belong to the preset time period in the second service data set.
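  • the routing decision made by the time detecting module and the two data returning modules can be sketched as follows. The function name and the labels `first` and `second` are illustrative; the sketch also assumes that a time range "belongs" to the preset time period only when it lies entirely inside it, since this passage does not describe how a range that straddles the boundary is handled.

```python
from datetime import date
from typing import List, Optional, Tuple


def sets_to_query(
    time_range: Optional[Tuple[date, date]],
    period_start: date,
    period_end: date,
) -> List[str]:
    """Decide which service data set(s) answer a query request."""
    if time_range is None:
        # No time range carried: return data from both sets.
        return ["first", "second"]
    start, end = time_range
    if period_start <= start and end <= period_end:
        return ["first"]   # hot data within the preset time period
    return ["second"]      # cold data outside the preset time period


period = (date(2024, 1, 1), date(2024, 4, 30))
hot = sets_to_query((date(2024, 2, 1), date(2024, 3, 1)), *period)
cold = sets_to_query((date(2023, 6, 1), date(2023, 7, 1)), *period)
both = sets_to_query(None, *period)
```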
  • in the embodiment of the present invention, when the basic data in the first service data set is migrated, the mirror data of the basic data is generated and retained, the basic data is first migrated to the second service data set, and the incremental data for the basic data generated during the basic data migration is recorded. When the basic data migration is completed, the incremental data is migrated, and after the incremental data migration is completed, the mirrored data and the incremental data in the first service data set are cleared. This realizes online migration of service data and improves the efficiency of data processing such as querying and modifying service data, thereby ensuring the quality of service. By cyclically recording and migrating incremental data, online service data migration is further realized and the impact on operations such as inserting and updating service data is reduced. The migration of the remaining service data and the modification of the service data are completed by real-time synchronization, and the service data in the first service data set and the second service data set is further verified to ensure the consistency of the service data during migration. For service data that is erroneously inserted during storage, the erroneously inserted service data may not be returned to the user terminal, to protect the consistency of data access. By using the deletion time threshold, the data nodes are provisioned dynamically, so that the hardware of a data node does not need to be replaced when storage space is insufficient, which reduces hardware costs.
  • FIG. 11 is a schematic structural diagram of still another server according to an embodiment of the present invention.
  • the incremental data recording module 12 is configured to: during the migration process of the basic data, acquire a data operation instruction, perform a data processing service on the basic data according to the data operation instruction, and record the total incremental data generated based on the data processing service; the total incremental data is incremental data based on logical statements;
  • the incremental data migration module 13 includes:
  • the searching unit 131 is configured to search for at least one first data table associated with the total incremental data; each of the first data tables includes corresponding table increment data; and the table increment corresponding to each of the first data tables The sum of the data is the total incremental data;
  • the first synchronization unit 132 is configured to synchronize the table delta data in each of the first data tables to the second service data set in parallel.
  • the server 1 further includes:
  • the state detecting module 15 is configured to detect an execution state of the data processing service.
  • the notification module 16 is configured to notify the searching unit 131 to find at least one first data table associated with the total incremental data if the execution state of the data processing service is a successful execution state;
  • the deleting module 17 is configured to delete the recorded total incremental data if the execution state of the data processing service is a data rollback state.
  • FIG. 12 is a schematic structural diagram of a first synchronization unit according to an embodiment of the present invention.
  • the first synchronization unit 132 may include:
  • a determining subunit 1321 configured to determine whether the at least one first data table includes a first data table containing a primary key;
  • the synchronization subunit 1322 is configured to synchronize the table incremental data in each of the first data tables to the second service data set in parallel if the determining subunit 1321 determines NO;
  • the synchronization subunit 1322 is further configured to: if the determining subunit 1321 determines YES, synchronize the table incremental data in the first data tables that do not contain the primary key to the second service data set in parallel, find at least one target row associated with the table incremental data of each first data table that contains the primary key, and synchronize the row incremental data of each target row to the second service data set in parallel;
  • the sum of the row incremental data of the target rows in a first data table is the table incremental data of that first data table.
  • FIG. 13 is a schematic structural diagram of an incremental data recording module according to an embodiment of the present invention.
  • the incremental data recording module 12 includes:
  • the obtaining detecting unit 122 is configured to acquire a data operation instruction, execute a data processing service on the basic data according to the data operation instruction, and set a data recording range corresponding to the data processing service;
  • a first recording unit 121 configured to record, in at least one second data table, the total incremental data generated by the data processing service if the data recording range is the at least one second data table;
  • the second recording unit 123 is configured to record, in the basic data, the total incremental data generated by the data processing service if the data recording range is the basic data.
  • FIG. 14 is a schematic structural diagram of a second recording unit according to an embodiment of the present invention.
  • the incremental data migration module 13 further includes:
  • a second synchronization unit 133 configured to record a start timestamp at which a background service instruction is received, determine the full data present in the basic data at the start timestamp as the full data to be synchronized, and synchronize the full data to be synchronized to the second service data set according to the background service instruction;
  • the background service instruction is an online upgrade instruction or a data relocation instruction.
  • the second recording unit 123 includes:
  • an incremental recording subunit 1231 configured to record, in the basic data, incremental data generated by the data processing service;
  • the synchronization determining subunit 1232 is configured to determine whether the full data to be synchronized has been completely synchronized to the second service data set;
  • the notification subunit 1233 is configured to notify the incremental recording subunit 1231 to continue recording, in the basic data, incremental data generated by the data processing service if the synchronization determining subunit 1232 determines NO;
  • the determining subunit 1234 is configured to: if the synchronization determining subunit 1232 determines YES, determine the time at which the data processing service completes as the end timestamp, and determine all incremental data recorded between the start timestamp and the end timestamp as the total incremental data.
  • the server 1500 may include: at least one processor 1501 (such as a CPU), at least one network interface 1504, a user interface 1503, a memory 1505, and at least one communication bus 1502.
  • the communication bus 1502 is used to implement connection communication between these components.
  • the user interface 1503 can include a display and a keyboard.
  • the user interface 1503 can optionally also include a standard wired interface and a wireless interface.
  • the network interface 1504 can optionally include a standard wired interface and a wireless interface (such as a WI-FI interface).
  • the memory 1505 may be a high-speed RAM memory or a non-volatile memory, such as at least one disk memory.
  • the memory 1505 can also optionally be at least one storage device located remotely from the processor 1501. As shown in FIG. 15, the memory 1505, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and a data management application.
  • the user interface 1503 is mainly used to provide an input interface for the user and to acquire data input by the user; and the processor 1501 can be used to call the data management application stored in the memory 1505 and specifically perform the following operations:
  • acquiring basic data to be migrated from a first service data set, generating mirror data identical to the basic data, and migrating the basic data to a second service data set; recording incremental data acquired for the basic data during the migration of the basic data; when the migration of the basic data is complete, performing addition processing on the mirror data using the incremental data, and migrating the incremental data to the second service data set; and when the migration of the incremental data is complete, clearing the mirror data and the incremental data from the first service data set;
  • the first service data set is the current service data set stored within a preset time period;
  • the second service data set is the historical service data set stored outside the preset time period.
  • a person of ordinary skill in the art will understand that all or part of the processes of the above method embodiments may be carried out by a computer program instructing the relevant hardware; the program may be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments described above.
  • the storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.

Abstract

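A data management method and a server, the method including the following steps: acquiring basic data to be migrated from a first service data set, generating mirror data identical to the basic data, and migrating the basic data to a second service data set (S101); recording incremental data acquired for the basic data during the migration of the basic data (S102); when the migration of the basic data is complete, performing addition processing on the mirror data using the incremental data, and migrating the incremental data to the second service data set (S103); and when the migration of the incremental data is complete, clearing the mirror data and the incremental data from the first service data set (S104).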

Description

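Data management method and server
This application claims priority to Chinese Patent Application No. 201611178078.8, entitled "Data management method and device", filed with the Chinese Patent Office on December 19, 2016. This application also claims priority to Chinese Patent Application No. 201611178079.2, entitled "Data synchronization method and apparatus", filed with the Chinese Patent Office on December 19, 2016. Both applications are incorporated herein by reference in their entirety.
Technical Field
The present invention relates to the field of Internet technologies, and in particular to a data management method and a server.
Background
As Internet technologies continue to develop and mature, the volume of service data of all kinds (for example, transaction records and call records) keeps growing, and a back-end database system is needed to store it.
Existing database systems manage such data as hot data and cold data: for time-sensitive service data, a time range serves as the dividing line, and recently generated service data (for example, data from the last 4 months) is stored separately from older service data (for example, data more than 4 months old).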
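Summary
Embodiments of the present invention provide a data management method and a server that can migrate service data online and improve the efficiency of data processing such as querying and modifying service data, thereby safeguarding service quality.
A first aspect of the embodiments of the present invention provides a data management method, which may include:
acquiring basic data to be migrated from a first service data set, generating mirror data identical to the basic data, and migrating the basic data to a second service data set;
recording incremental data acquired for the basic data during the migration of the basic data;
when the migration of the basic data is complete, performing addition processing on the mirror data using the incremental data, and migrating the incremental data to the second service data set; and
when the migration of the incremental data is complete, clearing the mirror data and the incremental data from the first service data set;
where the first service data set is the current service data set stored within a preset time period, and the second service data set is the historical service data set stored outside the preset time period.
A second aspect of the embodiments of the present invention provides a server including a processor and a memory, the memory storing instructions executable by the processor, where, when the instructions are executed, the processor is configured to:
acquire basic data to be migrated from a first service data set, generate mirror data identical to the basic data, and migrate the basic data to a second service data set;
record incremental data acquired for the basic data during the migration of the basic data;
when the migration of the basic data is complete, perform addition processing on the mirror data using the incremental data, and migrate the incremental data to the second service data set; and
when the migration of the incremental data is complete, clear the mirror data and the incremental data from the first service data set;
where the first service data set is the current service data set stored within a preset time period, and the second service data set is the historical service data set stored outside the preset time period.
A third aspect of the embodiments of the present invention provides a computer-readable storage medium storing computer-readable instructions that cause at least one processor to perform the method described above.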
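Brief Description of the Drawings
To describe the technical solutions in the embodiments of this application more clearly, the accompanying drawings needed for the description are introduced briefly below. The drawings described below show only some embodiments of this application, and a person of ordinary skill in the art can derive other drawings from them without creative effort. In the drawings:
FIG. 1 is an architecture diagram of a data management system according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of a data management method according to an embodiment of the present invention;
FIG. 3 is a schematic flowchart of another data management method according to an embodiment of the present invention;
FIG. 4 is a schematic flowchart of yet another data management method according to an embodiment of the present invention;
FIG. 5 is a schematic flowchart of another data management method according to an embodiment of the present invention;
FIG. 6 is a schematic flowchart of a data synchronization method according to an embodiment of the present invention;
FIG. 7 is a schematic flowchart of yet another data synchronization method according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a server according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of another server according to an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of an incremental data migration module according to an embodiment of the present invention;
FIG. 11 is a schematic structural diagram of yet another server according to an embodiment of the present invention;
FIG. 12 is a schematic structural diagram of a first synchronization unit according to an embodiment of the present invention;
FIG. 13 is a schematic structural diagram of an incremental data recording module according to an embodiment of the present invention;
FIG. 14 is a schematic structural diagram of a second recording unit according to an embodiment of the present invention;
FIG. 15 is a schematic structural diagram of a server according to an embodiment of the present invention.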
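Detailed Description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative effort fall within the protection scope of this application.
Under hot/cold data management, when part of the hot data has to become cold data over time, the service usually has to be suspended while the hot data to be transferred is switched to other storage nodes and its routing is switched. Because the length of the suspension is determined by the volume of hot data to be transferred, a large volume readily degrades service quality and lowers the efficiency of data processing such as querying and modifying service data.
To make the data management method and server disclosed in the embodiments of the present invention easier to understand, the data management architecture to which the embodiments apply is described first.
FIG. 1 is an architecture diagram of a data management system according to an embodiment of the present invention. As shown in FIG. 1, the data management system 100 may include multiple coordinator nodes and multiple data nodes. A coordinator node provides an interface to external user terminals 1 to N, receives data processing requests (such as queries and modifications of service data) sent by the user terminals, distributes them to the data nodes, and stores the storage index of the service data. The coordinator nodes are peers, and a user device can connect to any of them to process service data. A data node stores service data and executes the data processing requests distributed by the coordinator nodes.
The coordinator nodes and data nodes may be deployed on different back-end service devices to form a service device group. In the embodiments of the present invention, the device group formed by the coordinator nodes and data nodes is referred to as the server.
In the embodiments of the present invention, time-sensitive service data is stored separately as hot data and cold data. Such service data is large in volume and is accessed less and less often as time passes; examples are transaction records and call records. The hot data spans multiple data nodes (data node 1, data node 2, and so on in FIG. 1), which store recently generated service data (for example, data from the last 4 months). The cold data likewise spans multiple data nodes (data node 3, data node 4, and so on in FIG. 1), which store older service data (for example, data more than 4 months old).
Storing hot and cold data separately makes it possible to use a time range as the dividing line between data that users access frequently and data they access infrequently. As time passes, hot data that falls outside the time range has to be moved periodically to cold storage.
In the method provided by the embodiments of the present invention, a time period is preset. The current service data set stored within the preset time period (the hot data) is defined as the first service data set, and the historical service data set stored outside the preset time period (the cold data) is defined as the second service data set. To move hot data to cold data, the basic data is first migrated to the second service data set (see 110); the incremental data acquired for the basic data during that migration is then recorded; and when the migration of the basic data is complete, the incremental data is migrated to the second service data set (see 120).
The user terminal may be a terminal device such as a tablet computer, a smartphone, a notebook computer, a palmtop computer, a personal computer, or a mobile Internet device (MID).
Based on the system architecture shown in FIG. 1, the data management method provided by the embodiments of the present invention is described in detail below with reference to FIG. 2 to FIG. 7.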
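FIG. 2 is a schematic flowchart of a data management method according to an embodiment of the present invention. As shown in FIG. 2, the method is applied to a server and may include the following steps S101 to S104.
S101: Acquire basic data to be migrated from a first service data set, generate mirror data identical to the basic data, and migrate the basic data to a second service data set.
Specifically, the server may acquire the basic data to be migrated from the first service data set. The first service data set is the current service data set stored within the preset time period, that is, the hot data described above. The basic data is service data that, with the passage of time, now falls outside the preset time period and has to be moved from the first service data set to the second service data set; alternatively, it is the service data lying in the time difference created when an administrator modifies the preset time period (for example, from the last 4 months to the last 3 months). The second service data set is the historical service data set stored outside the preset time period, that is, the cold data described above. The server is specifically a group of back-end service devices comprising multiple coordinator nodes and multiple data nodes.
The server may generate mirror data identical to the basic data, for example by taking a data snapshot, and migrates the basic data to the second service data set.
S102: Record incremental data acquired for the basic data during the migration of the basic data.
Specifically, while the basic data is being migrated from the first service data set to the second service data set, the server may record the incremental data acquired for the basic data. The incremental data is the data of operations, such as inserts and updates, that have to be applied to the basic data during its migration.
S103: When the migration of the basic data is complete, perform addition processing on the mirror data using the incremental data, and migrate the incremental data to the second service data set.
Specifically, when the migration of the basic data is complete, that is, when the basic data has been migrated to the second service data set, the server may perform addition processing on the mirror data using the incremental data, adding to the mirror data the data of the insert, update, and other operations on the basic data. The server also has to migrate the incremental data to the second service data set so that the insert, update, and other operations are applied to the basic data there.
S104: When the migration of the incremental data is complete, clear the mirror data and the incremental data from the first service data set.
Specifically, when the migration of the incremental data is complete, the server may update the routing information of the basic data and the incremental data, switching it from the first service data set to the second service data set, so that subsequent queries, inserts, deletes, and updates initiated by user terminals against the basic data and the incremental data are all executed in the second service data set. At the same time, the server clears the mirror data and the incremental data from the first service data set.
In this embodiment of the present invention, when basic data in the first service data set is migrated, mirror data of the basic data is generated and retained, the basic data is first migrated to the second service data set, and the incremental data for the basic data during that migration is recorded. When the basic data has been migrated, the incremental data is migrated, and once that is complete the mirror data and the incremental data are cleared from the first service data set. Service data is thus migrated online, which improves the efficiency of data processing such as querying and modifying service data and thereby safeguards service quality.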
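Steps S101 to S104 can be sketched as follows. This is a minimal illustration only, assuming the two service data sets are plain in-memory dictionaries; the function name `migrate_online` and its parameters are hypothetical and do not appear in the source.

```python
def migrate_online(hot, cold, base_keys, writes_during_copy):
    """Sketch of S101-S104; hot/cold stand in for the first/second data sets."""
    # S101: take the basic data, keep an identical mirror, copy it to cold.
    base = {k: hot[k] for k in base_keys}
    mirror = dict(base)
    cold.update(base)
    # S102: record inserts/updates that arrive while the base copy runs.
    delta = {}
    for key, value in writes_during_copy:
        delta[key] = value
    # S103: add the delta to the mirror and migrate it to cold as well.
    mirror.update(delta)
    cold.update(delta)
    # S104: repoint routing to cold, then clear mirror and delta from hot.
    routing = {k: "cold" for k in mirror}
    for k in mirror:
        hot.pop(k, None)
    return routing
```

Until the final step, reads of the migrating rows can still be served from the mirror on the hot side, which is why no service pause is needed in this sketch.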
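FIG. 3 is a schematic flowchart of another data management method according to an embodiment of the present invention. As shown in FIG. 3, the method may include the following steps S201 to S207.
S201: Store service data that falls within a preset time period in a first service data set, and store service data that falls outside the preset time period in a second service data set.
Specifically, the server may store service data that falls within the preset time period in the first service data set and service data that falls outside it in the second service data set. The first service data set is the current service data set stored within the preset time period, that is, the hot data described above; the second service data set is the historical service data set stored outside the preset time period, that is, the cold data described above. The server is specifically a group of back-end service devices comprising multiple coordinator nodes and multiple data nodes.
S202: Acquire basic data to be migrated from the first service data set, generate mirror data identical to the basic data, and migrate the basic data to the second service data set.
Specifically, the server may acquire the basic data to be migrated from the first service data set. The basic data is service data that, with the passage of time, now falls outside the preset time period and has to be moved from the first service data set to the second service data set; alternatively, it is the service data lying in the time difference created when an administrator modifies the preset time period (for example, from the last 4 months to the last 3 months). The server may generate mirror data identical to the basic data, for example by taking a data snapshot, and migrates the basic data to the second service data set.
S203: Record first incremental data acquired for the basic data during the migration of the basic data.
Specifically, because the volume of basic data to be migrated is large, its migration takes a long time, and the volume of incremental data generated during that migration is correspondingly large. The continuously generated incremental data therefore has to be migrated in repeated rounds. Preferably, the server records the first incremental data acquired for the basic data during the migration of the basic data; the first incremental data denotes the incremental data for the basic data that is generated while the basic data is being migrated.
S204: Take the first incremental data as the incremental data, perform addition processing on the mirror data using the incremental data, migrate the incremental data to the second service data set, record second incremental data acquired for the basic data and the incremental data during the migration of the incremental data, take the second incremental data as the incremental data, and repeat this step until the volume of the second incremental data is less than a preset data volume threshold.
Specifically, the server performs the loop described in S204. The second incremental data denotes the incremental data, for the basic data and the first incremental data, that is generated while the first incremental data is being migrated. The preset data volume threshold may be set by maintenance personnel from experience, so that migrating the second incremental data has a negligible effect on the insert, update, and other operations requested by user terminals.
For example, suppose the first incremental data amounts to 1 TB and migrating it to the second service data set takes 8 hours; the second incremental data generated during those 8 hours amounts to 0.5 TB, and migrating it takes 4 hours; and so on, until the second incremental data generated amounts to only 36 MB, which takes roughly 1 second to migrate. Once the migration time required by the second incremental data is that short, dual writing can be used instead, and there is no need to record the incremental data generated while the second incremental data is migrated to the second service data set.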
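The arithmetic behind this example can be checked with a short calculation. The sketch below assumes, as the example's figures suggest, that each round produces half as much incremental data as the round before and that the migration rate stays constant at 1 TB per 8 hours; both are simplifying assumptions, and the function names and the 64 MB threshold are hypothetical.

```python
MB_PER_TB = 1024 * 1024

def migration_seconds(size_mb, rate_mb_per_s):
    """Time needed to migrate size_mb at a constant rate."""
    return size_mb / rate_mb_per_s

def rounds_until_threshold(first_delta_mb, threshold_mb, shrink=0.5):
    """Count migration rounds until the newly produced delta is below threshold."""
    size, rounds = float(first_delta_mb), 0
    while size >= threshold_mb:
        size *= shrink      # delta recorded while the previous delta migrates
        rounds += 1
    return rounds, size

rate = MB_PER_TB / (8 * 3600)   # about 36.4 MB/s, from "1 TB in 8 hours"
rounds, last_delta = rounds_until_threshold(MB_PER_TB, threshold_mb=64)
print(rounds, last_delta, migration_seconds(last_delta, rate))
```

Under these assumptions the delta drops to a few tens of megabytes after 15 rounds and migrates in under a second, which is consistent with the order of magnitude quoted in the text.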
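S205: When the volume of the second incremental data is less than the preset data volume threshold, use the second incremental data to modify, at the same time, the mirror data and the incremental data in the first service data set and the basic data and the incremental data in the second service data set, and obtain a modification result.
Specifically, for the dual writing mentioned above, when the volume of the second incremental data is less than the preset data volume threshold, the server may use the second incremental data to apply inserts, updates, and other operations to the mirror data and the incremental data in the first service data set and, at the same time, to the basic data and the incremental data in the second service data set. Dual writing completes the migration of the remaining service data and the modification of the service data in real time, without the incremental data affecting the insert and update operations requested by user terminals. The service data in the first and second service data sets can then be verified to ensure consistency during migration. If an inconsistency is found, the server may turn off dual writing and roll the data back; because the routing information of the service data has not yet changed, the rollback costs nothing.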
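A dual write with verification and rollback might look like the sketch below. It assumes dictionary-backed data sets and a pluggable `write_cold` callable so that a failed write can be simulated; these names are hypothetical, and a real deployment would rely on the database's own transaction and verification facilities.

```python
def dual_write(hot, cold, changes, write_cold=None):
    """Apply each change to both data sets, verify, and roll back on mismatch."""
    if write_cold is None:
        def write_cold(store, key, value):
            store[key] = value
    hot_backup, cold_backup = dict(hot), dict(cold)
    for key, value in changes:
        hot[key] = value                # write to the first data set
        write_cold(cold, key, value)    # same write to the second data set
    touched = {key for key, _ in changes}
    if all(key in cold and hot[key] == cold[key] for key in touched):
        return True                     # consistent: migration can complete
    # Inconsistent: restore both sides. Routing has not changed yet,
    # so clients are unaffected by the rollback.
    hot.clear(); hot.update(hot_backup)
    cold.clear(); cold.update(cold_backup)
    return False
```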
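S206: When the modification result is success, determine that the migration of the incremental data is complete.
Specifically, when the modification result is success, the server determines that the migration of the incremental data is complete.
In particular, while the basic data or the incremental data is being migrated, whenever a user terminal sends a query request for the basic data or the incremental data, the server directs the query request to the first service data set and returns the query result to the user terminal.
S207: When the migration of the incremental data is complete, clear the mirror data and the incremental data from the first service data set.
Specifically, when the migration of the incremental data is complete, the server may update the routing information of the basic data and the incremental data, switching it from the first service data set to the second service data set, so that subsequent queries, inserts, deletes, and updates initiated by user terminals against the basic data and the incremental data are all executed in the second service data set. At the same time, the server clears the mirror data and the incremental data from the first service data set.
In this embodiment of the present invention, the number of data nodes in the server can be adjusted to the volume of service data, and the server can clear service data in the second service data set that meets a deletion condition set by service requirements. Preferably, the service requirement is a preset deletion time threshold (for example, service data that is 10 or more years old): the server clears the service data in the second service data set that meets the deletion time threshold, and the cleared data nodes can then be reused, so that data nodes are allocated dynamically.
In this embodiment of the present invention, when basic data in the first service data set is migrated, mirror data of the basic data is generated and retained, the basic data is first migrated to the second service data set, and the incremental data for the basic data during that migration is recorded. When the basic data has been migrated, the incremental data is migrated, and once that is complete the mirror data and the incremental data are cleared from the first service data set. Service data is thus migrated online, which improves the efficiency of data processing such as querying and modifying service data and thereby safeguards service quality. Cyclically recording and migrating incremental data further supports online migration and reduces the impact on operations such as inserting and updating service data. Dual writing completes the migration of the remaining service data and the modification of the service data in real time without the incremental data affecting the insert and update operations requested by user terminals, and the service data in the two data sets can be verified to ensure consistency during migration. Allocating data nodes dynamically by means of a deletion time threshold means the storage hardware of a data node need not be replaced when storage space runs short, which reduces hardware cost.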
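FIG. 4 is a schematic flowchart of yet another data management method according to an embodiment of the present invention. As shown in FIG. 4, the method may include the following steps S301 to S308.
S301: Acquire basic data to be migrated from a first service data set, generate mirror data identical to the basic data, and migrate the basic data to a second service data set.
S302: Record incremental data acquired for the basic data during the migration of the basic data.
S303: When the migration of the basic data is complete, perform addition processing on the mirror data using the incremental data, and migrate the incremental data to the second service data set.
S304: When the migration of the incremental data is complete, clear the mirror data and the incremental data from the first service data set.
S305: When a service data query request carrying a time range is received from a user terminal, check whether the time range falls within the preset time period.
Specifically, when a service data query request carrying a time range is received from a user terminal, the server may check whether the time range falls within the preset time period; if so, it proceeds to step S306; if not, it proceeds to step S307.
S306: Return to the user terminal the service data in the first service data set that falls within both the preset time period and the time range.
Specifically, when the server detects that the time range falls within the preset time period, it may return to the user terminal the service data in the first service data set that falls within both the preset time period and the time range.
S307: Return to the user terminal the service data in the second service data set that falls outside the preset time period but within the time range.
Specifically, when the server detects that the time range does not fall within the preset time period, it may return to the user terminal the service data in the second service data set that falls outside the preset time period but within the time range.
S308: When a service data query request carrying no time range is received from a user terminal, return to the user terminal the first service data in the first service data set that falls within the preset time period and the second service data in the second service data set that falls outside the preset time period.
Specifically, when a service data query request carrying no time range is received from a user terminal, the server may return to the user terminal the first service data in the first service data set that falls within the preset time period and the second service data in the second service data set that falls outside the preset time period.
Note that service data may be inserted into the wrong data set during storage: service data that belongs in the first service data set may have been stored in the second service data set, and service data that belongs in the second service data set may have been stored in the first. When such service data exists, the server may withhold it from the user terminal during the above queries to protect the consistency of data access.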
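The routing in S305 to S308, including the filtering of erroneously inserted records, can be sketched as below. The sketch assumes each record is a dictionary with an integer timestamp `ts` and that the preset time period is everything at or after `hot_start`; these names are hypothetical. The source does not say how a time range that straddles the boundary is handled, so the sketch simply follows the "no" branch in that case.

```python
def query(hot, cold, hot_start, time_range=None):
    """Route a query to the hot or cold data set as in S305-S308."""
    def in_period(ts):
        return ts >= hot_start

    if time_range is None:
        # S308: hot records inside the period plus cold records outside it.
        return ([r for r in hot if in_period(r["ts"])] +
                [r for r in cold if not in_period(r["ts"])])
    lo, hi = time_range
    if in_period(lo):
        # S306: the whole range lies inside the preset time period.
        return [r for r in hot if in_period(r["ts"]) and lo <= r["ts"] <= hi]
    # S307: range not inside the period, so answer from the cold data set.
    return [r for r in cold if not in_period(r["ts"]) and lo <= r["ts"] <= hi]
```

The period checks on every branch are what keep a record stored in the wrong data set from being returned.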
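The service data query process of steps S305 to S308 need not follow the execution order of this embodiment; a user terminal may initiate a service data query at any time.
For steps S301 to S304 of this embodiment, see the detailed descriptions of the embodiments shown in FIG. 2 and FIG. 3; they are not repeated here.
In this embodiment of the present invention, when basic data in the first service data set is migrated, mirror data of the basic data is generated and retained, the basic data is first migrated to the second service data set, and the incremental data for the basic data during that migration is recorded. When the basic data has been migrated, the incremental data is migrated, and once that is complete the mirror data and the incremental data are cleared from the first service data set. Service data is thus migrated online, which improves the efficiency of data processing such as querying and modifying service data and thereby safeguards service quality. Because service data may be inserted into the wrong data set during storage, erroneously inserted service data may be withheld from the user terminal to protect the consistency of data access.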
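Regarding the recording and migration of incremental data in FIG. 2: the PostgreSQL (an object-relational database management system) databases in current use typically rely on logs to record the incremental data generated when data processing services modify the database, and then migrate (or synchronize) that incremental data to other databases. In other words, current incremental synchronization depends on logs. Logs, however, are tightly coupled to the database version, so incremental data cannot be synchronized through logs between two databases of different versions. Moreover, current incremental synchronization methods are single-process: one process has to synchronize all incremental data to the other database. Under highly concurrent writes, for example, incremental data is generated quickly, so a single-process method greatly reduces synchronization efficiency.
To solve the above technical problems, FIG. 5 is a schematic flowchart of another data management method according to an embodiment of the present invention. As shown in FIG. 5, the method may include the following steps S501 to S506.
S501: Acquire basic data to be migrated from a first service data set, generate mirror data identical to the basic data, and migrate the basic data to a second service data set.
For this step, see the description of step S101 above; it is not repeated here.
S502: During the migration of the basic data, acquire a data operation instruction, execute a data processing service on the basic data according to the data operation instruction, and record the total incremental data generated by the data processing service; the total incremental data is incremental data based on logical statements.
This step records the incremental data acquired for the basic data during the migration of the basic data. To distinguish it from the table incremental data introduced later, the incremental data is called total incremental data here.
Specifically, the server may acquire a data operation instruction sent by a user device. The data operation instruction may be a data insert instruction, a data delete instruction, a data modify instruction, and so on. The server executes a data processing service according to the instruction; the data processing service may insert, delete, or modify basic data in the first service data set (for convenience, the collection of this basic data is called the source database). For example, if the data operation instruction is a data delete instruction, the server deletes the corresponding part of the data in the source database.
The total incremental data recorded by the server is incremental data based on logical statements, and specifically may be incremental data based on SQL (Structured Query Language) statements. The server may record each modify, delete, and insert operation on each row of each data table in the database as one SQL statement, together with the data table information or database information that the SQL statement involves, and determine the recorded SQL statements and that table or database information as the total incremental data. Because SQL statements are common to different database versions, incremental data based on SQL statements can be synchronized between databases of different versions.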
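Statement-based recording of this kind can be sketched as follows. The class and field names are hypothetical, and the sketch only stores statement text with its table information and a sequence number; it does not execute SQL.

```python
from dataclasses import dataclass, field

@dataclass
class DeltaEntry:
    seq: int        # operation sequence number, preserves ordering
    table: str      # data table the statement touches
    sql: str        # one logical statement per row-level operation

@dataclass
class DeltaLog:
    entries: list = field(default_factory=list)

    def record(self, table, sql):
        """Record one insert/update/delete as a statement plus table info."""
        entry = DeltaEntry(len(self.entries) + 1, table, sql)
        self.entries.append(entry)
        return entry

    def tables(self):
        """Tables that currently have recorded incremental data."""
        return sorted({e.table for e in self.entries})

log = DeltaLog()
log.record("orders", "UPDATE orders SET status = 'paid' WHERE id = 7")
log.record("calls", "DELETE FROM calls WHERE id = 3")
```

Because each entry is plain statement text rather than a version-specific log record, the same entries could in principle be replayed against a target database of a different version.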
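S503: When the migration of the basic data is complete, perform addition processing on the mirror data using the total incremental data.
For this step, see the description of step S103 above; it is not repeated here.
S504: Find at least one first data table associated with the total incremental data; each first data table includes corresponding table incremental data; and the sum of the table incremental data of all the first data tables is the total incremental data.
Specifically, after recording the total incremental data, the server may check the execution state of the data processing service. If the execution state is a data rollback state, the recorded total incremental data is deleted. If the execution state is a successful execution state, the data processing service was committed successfully, and the server can find the at least one first data table associated with the total incremental data. Each first data table includes corresponding table incremental data, and the sum of the table incremental data of all the first data tables is the total incremental data. If the data recording range is at least one second data table, the at least one second data table contains the at least one first data table.
For example, if there are 5 second data tables and 4 of them contain incremental data (that is, table incremental data), those 4 data tables can be determined as the first data tables associated with the total incremental data.
As another example, suppose the server records the total incremental data generated in the source database by the data processing service, and the operations of the data processing service involve 5 data tables in the source database. Those 5 data tables can be determined as the first data tables associated with the total incremental data; the incremental data generated in each first data table is its table incremental data, and the sum of the table incremental data of the 5 first data tables is the total incremental data.
As a further example, if the data recording range includes data tables A, B, and C, and the data processing service involves only data tables A and B (that is, only A and B contain table incremental data), the server can determine data tables A and B as the first data tables associated with the total incremental data, and the sum of the table incremental data of A and B is the total incremental data.
S505: Synchronize the table incremental data in each of the first data tables to the second service data set in parallel.
This step migrates the total incremental data to the second service data set; that is, the migration is a parallel synchronization process.
Specifically, the server may dispatch SQL by first data table, so that all SQL-statement-based incremental data (the table incremental data) of one first data table is synchronized by the same process, while the table incremental data of different first data tables is synchronized by different processes. This both preserves the consistency of the data within each first data table and synchronizes the table incremental data of the first data tables to the second service data set in parallel.
For example, suppose the server holds first data tables A, B, and C, containing table incremental data a, b, and c respectively. The server can synchronize table incremental data a to the second service data set with a first process, b with a second process, and c with a third process, so that a, b, and c are synchronized in parallel.
Compared with synchronizing all incremental data with a single process, parallel synchronization by data table greatly reduces the synchronization time and thus improves the efficiency of incremental synchronization.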
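Per-table dispatch can be sketched as below. The source describes one process per table; for a self-contained example this sketch uses worker threads from the standard library instead, which is a substitution rather than the described mechanism. The function name and the `apply_sql` callback are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def sync_by_table(delta, apply_sql):
    """delta: ordered list of (table, sql). One worker handles each table."""
    by_table = defaultdict(list)
    for table, sql in delta:
        by_table[table].append(sql)         # keep per-table statement order

    def worker(table):
        for sql in by_table[table]:         # statements of one table stay serial
            apply_sql(table, sql)
        return table, len(by_table[table])

    with ThreadPoolExecutor(max_workers=len(by_table) or 1) as pool:
        return dict(pool.map(worker, by_table))
```

Keeping each table's statements on a single worker preserves their order within the table, while different tables proceed concurrently.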
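S506: When the synchronization of the table incremental data is complete, clear the mirror data and the total incremental data from the first service data set.
For this step, see the description of step S104 above; it is not repeated here.
In the above embodiment, because the total incremental data is incremental data based on logical statements, and such data can be used in databases of any version, incremental data can be synchronized between databases of different versions; and because multiple processes synchronize the table incremental data in parallel, synchronization efficiency is effectively improved.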
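Regarding the recording of total incremental data in step S502 and the synchronization in steps S504 and S505 of the above embodiment, FIG. 6 is a schematic flowchart of a data synchronization method according to an embodiment of the present invention. The method may include:
S601: Acquire a data operation instruction, execute a data processing service on the basic data according to the data operation instruction, and set a data recording range corresponding to the data processing service.
Specifically, while executing the data processing service, the server may also set a data recording range corresponding to it, based on current service requirements. For example, if the current requirement is to synchronize the modified data in data tables A and B, the data recording range can be set to data tables A and B, so that the total incremental data generated by the data processing service is subsequently recorded only in those two tables. If the modified basic data in the entire source database (that is, the database containing the basic data of the first service data set) has to be synchronized, the data recording range can be set to the source database, so that the total incremental data generated by the data processing service is subsequently recorded in the source database.
If the data recording range is the source database, the server may first determine whether the data modified (or deleted or inserted) by the data processing service belongs to the source database. If it does, the corresponding incremental data is recorded; otherwise it is not. That is, the server records the total incremental data generated by the data processing service only in the source database. Setting a data recording range makes it possible to record selectively the incremental data of specified data tables or databases, which avoids the problem that logs also record modifications to data tables or databases of no interest. The embodiments of the present invention can therefore choose more flexibly which incremental data to record, which both improves synchronization efficiency and saves server storage.
Optionally, before step S601 the server may check whether a data processing service is currently executing. If so, it may wait until that service finishes before performing step S601, so that the recorded total incremental data belongs to complete data processing services and the integrity of the synchronized services is preserved when the total incremental data is synchronized.
S602: If the data recording range is at least one second data table, record the total incremental data generated by the data processing service in the at least one second data table.
Specifically, if the data recording range is at least one second data table, the server may first determine whether the data modified (or deleted or inserted) by the data processing service belongs to the at least one second data table. If it does, the corresponding incremental data is recorded; otherwise it is not. For example, if there are 4 second data tables and the data processing service is a data deletion service, the server only has to record every delete operation in those 4 second data tables, together with each operation's sequence number and second data table information, and determine these as the total incremental data.
The total incremental data recorded by the server is incremental data based on logical statements, and specifically may be incremental data based on SQL statements. The server may record each modify, delete, and insert operation on each row of each data table in the database as one SQL statement, together with the data table information or database information that the SQL statement involves, and determine the recorded SQL statements and that table or database information as the total incremental data. Because SQL statements are common to different database versions, incremental data based on SQL statements can be synchronized between databases of different versions.
S603: If the data recording range is the basic data, record the total incremental data generated by the data processing service in the basic data.
Specifically, if the data recording range is the basic data (referred to as the source database), the server may first determine whether the data modified (or deleted or inserted) by the data processing service belongs to the source database. If it does, the corresponding incremental data is recorded; otherwise it is not. For example, if the server contains 5 databases, one of which is the source database, and the data processing service is a data modification service, the server only has to record every modify operation in the source database, together with the data each operation modified and each operation's sequence number and data table information, and determine these as the total incremental data.
Optionally, if the data processing service includes both a data deletion service and a data modification service, the recorded total incremental data includes both a deletion part and a modification part. Determining the data recording range of the data processing service makes it possible to record selectively the incremental data of specified data tables or databases, which avoids the problem that logs also record modifications to data tables or databases of no interest; this both improves synchronization efficiency and saves server storage.
The total incremental data in S603 is likewise incremental data based on SQL statements.
S604: Check the execution state of the data processing service.
S605: If the execution state of the data processing service is a data rollback state, delete the recorded total incremental data.
Specifically, after recording the total incremental data, the server may check the execution state of the data processing service. A data rollback state means that the data modified (or deleted or inserted) by the data processing service was never persisted, so the recorded total incremental data can be deleted.
S606: If the execution state of the data processing service is a successful execution state, find at least one first data table associated with the total incremental data.
Specifically, a successful execution state means that the data processing service was committed successfully, and the server can find the at least one first data table associated with the total incremental data. If the data recording range is the source database, the server may look up in the source database the data tables in which the data processing service modified (or deleted or inserted) data; those data tables are the first data tables associated with the total incremental data. If the data recording range is at least one second data table, the at least one second data table contains the at least one first data table. For example, if there are 5 second data tables and 4 of them contain incremental data (that is, table incremental data), those 4 data tables can be determined as the first data tables associated with the total incremental data.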
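Steps S601 to S606 can be condensed into the sketch below: statements outside the recording range are not recorded, a rollback discards the recorded data, and a successful commit yields the first data tables. The function name, the state strings, and the use of `None` to mean "the whole source database" are all hypothetical.

```python
def first_data_tables(operations, scope, state):
    """operations: list of (table, sql); scope: set of second tables or None."""
    # S602/S603: record only operations that fall inside the recording range.
    recorded = [(t, s) for t, s in operations if scope is None or t in scope]
    # S605: a rolled-back service never persisted its changes, so drop them.
    if state == "rolled_back":
        return [], []
    # S606: tables that actually hold recorded incremental data.
    tables = sorted({t for t, _ in recorded})
    return recorded, tables
```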
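Each first data table includes corresponding table incremental data, and the sum of the table incremental data of all the first data tables is the total incremental data. For example, suppose the server records the total incremental data generated in the source database by the data processing service, and the operations of the data processing service involve 5 data tables in the source database. Those 5 data tables can be determined as the first data tables associated with the total incremental data; the incremental data generated in each first data table is its table incremental data, and the sum of the table incremental data of the 5 first data tables is the total incremental data. As another example, if the data recording range includes data tables A, B, and C, and the data processing service involves only data tables A and B (that is, only A and B contain table incremental data), the server can determine data tables A and B as the first data tables associated with the total incremental data, and the sum of the table incremental data of A and B is the total incremental data.
S607: Determine whether the at least one first data table includes a first data table containing a primary key.
Specifically, the primary key is the Primary Key. Its values are unique within a first data table, and the Primary Key values of the rows of a first data table are all stored in one column of that table.
S608: Synchronize the table incremental data in each of the first data tables to the second service data set in parallel.
Specifically, if the determination in S607 is NO, that is, none of the at least one first data table contains a primary key, the server may dispatch SQL by first data table, so that all SQL-statement-based incremental data (the table incremental data) of one first data table is synchronized by the same process, while the table incremental data of different first data tables is synchronized by different processes. This both preserves the consistency of the data within each first data table and synchronizes the table incremental data of the first data tables to the second service data set (referred to as the target database) in parallel. For example, suppose the server holds first data tables A, B, and C, containing table incremental data a, b, and c respectively. The server can synchronize a to the target database with a first process, b with a second process, and c with a third process, so that a, b, and c are synchronized in parallel. Compared with synchronizing all incremental data with a single process, parallel synchronization by data table greatly reduces the synchronization time and thus improves the efficiency of incremental synchronization.
S609: Synchronize the table incremental data in the first data tables that do not contain the primary key to the second service data set in parallel, find at least one target row associated with the table incremental data of each first data table that contains the primary key, and synchronize the row incremental data of each target row to the second service data set in parallel.
Specifically, if the determination in S607 is YES, that is, the at least one first data table includes a first data table containing a primary key, the server may synchronize the first data tables that do not contain a primary key in parallel by table incremental data; for the details, see step S505, which is not repeated here.
For a first data table that contains a primary key, the server may first find the at least one target row associated with that table's table incremental data; the sum of the row incremental data of the target rows in a first data table is that table's table incremental data. For example, if the operations of the data processing service involve 10 rows of a first data table, those 10 rows are the target rows associated with the table incremental data of that table, and the sum of the incremental data of the 10 rows (the row incremental data) is the table incremental data of that table. The server then dispatches SQL by target row. Because the row incremental data of a target row can carry the Primary Key, all row incremental data with the same Primary Key (possibly from two or more first data tables) can be synchronized by the same process, while row incremental data with different Primary Keys is synchronized by different processes. Because the Primary Keys within one first data table are all distinct and never repeat, parallel synchronization of row incremental data by Primary Key both preserves the consistency of the data in the first data table and is more efficient than parallel synchronization by table incremental data. For example, suppose a first data table in the server contains a Primary Key and its target rows are rows A, B, and C, with row incremental data a, b, and c respectively. The server can synchronize a to the target database with a first process, b with a second process, and c with a third process, so that a, b, and c are synchronized in parallel.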
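Dispatch by primary key can be sketched as below. The source assigns a separate process to each distinct Primary Key; the sketch instead hashes keys onto a fixed number of worker queues, which is an assumption made to keep the worker count bounded. All names are hypothetical.

```python
import zlib

def worker_for(primary_key, n_workers):
    """Stable mapping from a primary key to a worker index."""
    return zlib.crc32(str(primary_key).encode("utf-8")) % n_workers

def dispatch_rows(row_deltas, n_workers):
    """row_deltas: ordered list of (primary_key, sql). Returns one queue per worker."""
    queues = [[] for _ in range(n_workers)]
    for pk, sql in row_deltas:
        # The same key always lands in the same queue, so statements for
        # one row are applied in their original order by a single worker.
        queues[worker_for(pk, n_workers)].append((pk, sql))
    return queues
```

Each queue could then be drained by its own worker, giving finer-grained parallelism than one worker per table.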
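In this embodiment of the present invention, a data operation instruction is acquired, a data processing service is executed according to it, and the total incremental data generated by the data processing service is recorded; at least one first data table associated with the total incremental data can then be found, and the table incremental data in each first data table is synchronized to the target database in parallel. Because the total incremental data is incremental data based on logical statements, which can be used in databases of any version, incremental data can be synchronized between databases of different versions. Determining the data recording range of the data processing service makes it possible to record selectively the incremental data of specified data tables or databases, which both improves synchronization efficiency and saves server storage. Synchronizing the table incremental data in parallel with multiple processes effectively improves synchronization efficiency, and when a data table contains a primary key, its table incremental data can additionally be synchronized in parallel by row incremental data, which improves efficiency further.
FIG. 7 is a schematic flowchart of yet another data synchronization method according to an embodiment of the present invention. The method may include:
S701: Record a start timestamp at which a background service instruction is received, determine the full data present in the basic data at the start timestamp as the full data to be synchronized, and synchronize the full data to be synchronized to the second service data set according to the background service instruction.
Specifically, the background service instruction is an online upgrade instruction or a data relocation instruction. When the server receives the background service instruction, it may record the corresponding start timestamp. If the instruction is an online upgrade instruction, the server may create a database of the latest version as the target database, representing the second service data set, determine the full data in the source database (the database containing the basic data) at the start timestamp as the full data to be synchronized, and synchronize it to the target database according to the instruction. Here the full data is all the data in the source database, and it is assumed that no data processing service is executing on the server at the start timestamp. If the instruction is a data relocation instruction, the server may use an existing database, a newly created database, or a database on another server as the target database, and the target database may be of the same version as the source database or of a different one.
Optionally, if a data processing service is executing on the server at the start timestamp, operating on a first part of the data but not on a second part, the server may first synchronize the second part of the data to the target database and, once the data processing service has finished, synchronize the resulting updated first part of the data to the target database.
S702: Acquire a data operation instruction, execute a data processing service on the basic data according to the data operation instruction, and set a data recording range corresponding to the data processing service.
S703: Record, in the basic data, incremental data generated by the data processing service.
Specifically, if the server is synchronizing the full data to be synchronized to the target database, it may, from the start timestamp onward, acquire data operation instructions in real time and execute data processing services according to them. The data recording range can then be set to the source database, and the incremental data generated by the data processing services is recorded in the source database.
Optionally, if the server is synchronizing the second part of the data to the target database, it may, from the start timestamp onward, acquire data operation instructions in real time and execute data processing services according to them. The data recording range can then be set to the data tables involved in the second part of the data, as the at least one second data table, and the incremental data generated by the data processing services is recorded in the at least one second data table. When synchronization of the updated first part of the data begins, the server may reset the data recording range to the source database and start recording the incremental data generated by the data processing services in the source database.
S704: Determine whether the full data to be synchronized has been completely synchronized to the second service data set.
S705: Determine the time at which the data processing service completes as the end timestamp, and determine all incremental data recorded between the start timestamp and the end timestamp as the total incremental data.
Specifically, if the determination in S704 is YES, the online upgrade or data relocation is complete. If the data processing service on the server has already completed, recording of incremental data stops; if it has not, recording stops once it completes. The server may then determine the time at which the data processing service completes as the end timestamp and determine all incremental data recorded between the start timestamp and the end timestamp as the total incremental data. If the determination in S704 is NO, the online upgrade or data relocation is not yet complete, so the server continues with step S703 to record further incremental data.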
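The timestamp window of S701 to S705 can be sketched as below. The source leaves open what the end timestamp is when the data processing service finishes before the full synchronization does; the sketch assumes recording stops at the later of the two moments, which is an interpretation rather than something the text states. The function name and integer timestamps are hypothetical.

```python
def collect_total_delta(events, start_ts, full_sync_done_at, service_done_at):
    """events: list of (timestamp, sql) recorded in the source database."""
    # Recording continues while the full sync runs (S704 = NO) and, after it
    # finishes, until the in-flight data processing service completes (S705).
    end_ts = max(full_sync_done_at, service_done_at)
    total = [sql for ts, sql in events if start_ts <= ts <= end_ts]
    return end_ts, total
```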
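S706: Check the execution state of the data processing service.
S707: If the execution state of the data processing service is a data rollback state, delete the recorded total incremental data.
S708: If the execution state of the data processing service is a successful execution state, find at least one first data table associated with the total incremental data; each first data table includes corresponding table incremental data; and the sum of the table incremental data of all the first data tables is the total incremental data.
S709: Determine whether the at least one first data table includes a first data table containing a primary key.
S710: Synchronize the table incremental data in each of the first data tables to the second service data set in parallel.
Specifically, if the determination in S709 is NO, the table incremental data in each of the first data tables is synchronized in parallel to the target database (that is, the second service data set).
S711: Synchronize the table incremental data in the first data tables that do not contain the primary key to the second service data set in parallel, find at least one target row associated with the table incremental data of each first data table that contains the primary key, and synchronize the row incremental data of each target row to the second service data set in parallel.
Specifically, if the determination in S709 is YES, step S711 is performed.
For the specific implementation of steps S706 to S711, see S604 to S609 in the embodiment corresponding to FIG. 6; it is not repeated here.
Throughout the upgrade or data relocation, the source database continues to provide service. Once the upgrade or relocation is finished, the table incremental data and/or row incremental data in the total incremental data recorded in the source database can be synchronized in parallel to the target database to keep the data in the target database consistent. The service is never interrupted, so the transition from using the source database to using the target database is made without any service interruption.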
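The server provided by the embodiments of the present invention is described in detail below with reference to FIG. 8 to FIG. 15. The server shown in FIG. 8 to FIG. 15 is configured to perform the methods of the embodiments shown in FIG. 2 to FIG. 7. For ease of description, only the parts related to the embodiments of the present invention are shown; for technical details not disclosed here, see the embodiments shown in FIG. 2 to FIG. 7.
FIG. 8 is a schematic structural diagram of a server according to an embodiment of the present invention. As shown in FIG. 8, the server 1 may include a basic data migration module 11, an incremental data recording module 12, an incremental data migration module 13, and a mirror data clearing module 14.
The basic data migration module 11 is configured to acquire basic data to be migrated from a first service data set, generate mirror data identical to the basic data, and migrate the basic data to a second service data set.
The basic data migration module 11 may generate mirror data identical to the basic data, for example by taking a data snapshot, and migrates the basic data to the second service data set.
The incremental data recording module 12 is configured to record incremental data acquired for the basic data during the migration of the basic data.
In a specific implementation, while the basic data is being migrated from the first service data set to the second service data set, the incremental data recording module 12 may record the incremental data acquired for the basic data. The incremental data is the data of operations, such as inserts and updates, that have to be applied to the basic data during its migration.
The incremental data migration module 13 is configured to, when the migration of the basic data is complete, perform addition processing on the mirror data using the incremental data and migrate the incremental data to the second service data set.
In a specific implementation, when the migration of the basic data is complete, that is, when the basic data has been migrated to the second service data set, the incremental data migration module 13 may perform addition processing on the mirror data using the incremental data, adding to the mirror data the data of the insert, update, and other operations on the basic data. The incremental data migration module 13 also has to migrate the incremental data to the second service data set so that the insert, update, and other operations are applied to the basic data there.
The mirror data clearing module 14 is configured to, when the migration of the incremental data is complete, clear the mirror data and the incremental data from the first service data set.
In a specific implementation, when the migration of the incremental data is complete, the mirror data clearing module 14 may update the routing information of the basic data and the incremental data, switching it from the first service data set to the second service data set, so that subsequent queries, inserts, deletes, and updates initiated by user terminals against the basic data and the incremental data are all executed in the second service data set. At the same time, the mirror data clearing module 14 clears the mirror data and the incremental data from the first service data set.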
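FIG. 9 is a schematic structural diagram of another server according to an embodiment of the present invention. As shown in FIG. 9, the server 1 may include a basic data migration module 11, an incremental data recording module 12, an incremental data migration module 13, a mirror data clearing module 14, a service data storage module 15, a time detection module 16, a first data return module 17, and a second data return module 18.
The service data storage module 15 is configured to store service data that falls within a preset time period in a first service data set and service data that falls outside the preset time period in a second service data set.
In a specific implementation, the service data storage module 15 may store service data that falls within the preset time period in the first service data set and service data that falls outside it in the second service data set. The first service data set is the current service data set stored within the preset time period, that is, the hot data described above; the second service data set is the historical service data set stored outside the preset time period, that is, the cold data described above. The server 1 is specifically a group of back-end service devices comprising multiple coordinator nodes and multiple data nodes.
The basic data migration module 11 is configured to acquire basic data to be migrated from the first service data set, generate mirror data identical to the basic data, and migrate the basic data to the second service data set.
In a specific implementation, the basic data migration module 11 may acquire the basic data to be migrated from the first service data set. The basic data is service data that, with the passage of time, now falls outside the preset time period and has to be moved from the first service data set to the second service data set; alternatively, it is the service data lying in the time difference created when an administrator modifies the preset time period (for example, from the last 4 months to the last 3 months). The basic data migration module 11 may generate mirror data identical to the basic data, for example by taking a data snapshot, and migrates the basic data to the second service data set.
The incremental data recording module 12 is configured to record incremental data acquired for the basic data during the migration of the basic data.
In a specific implementation, while the basic data is being migrated from the first service data set to the second service data set, the incremental data recording module 12 may record the incremental data acquired for the basic data. The incremental data is the data of operations, such as inserts and updates, that have to be applied to the basic data during its migration.
Because the volume of basic data to be migrated is large, its migration takes a long time, and the volume of incremental data generated during that migration is correspondingly large. The continuously generated incremental data therefore has to be migrated in repeated rounds. Preferably, the incremental data recording module 12 records the first incremental data acquired for the basic data during the migration of the basic data; the first incremental data denotes the incremental data for the basic data that is generated while the basic data is being migrated.
The incremental data migration module 13 is configured to, when the migration of the basic data is complete, perform addition processing on the mirror data using the incremental data and migrate the incremental data to the second service data set.
In a specific implementation, when the migration of the basic data is complete, that is, when the basic data has been migrated to the second service data set, the incremental data migration module 13 may perform addition processing on the mirror data using the incremental data, adding to the mirror data the data of the insert, update, and other operations on the basic data. The incremental data migration module 13 also has to migrate the incremental data to the second service data set so that the insert, update, and other operations are applied to the basic data there.
Specifically, FIG. 10 is a schematic structural diagram of an incremental data migration module according to an embodiment of the present invention. As shown in FIG. 10, the incremental data migration module 13 may include:
an incremental data processing unit 131, configured to, when the migration of the basic data is complete, take the first incremental data as the incremental data, perform addition processing on the mirror data using the incremental data, migrate the incremental data to the second service data set, record second incremental data acquired for the basic data and the incremental data during the migration of the incremental data, take the second incremental data as the incremental data, and repeat this step until the volume of the second incremental data is less than a preset data volume threshold;
a result obtaining unit 132, configured to, when the volume of the second incremental data is less than the preset data volume threshold, use the second incremental data to modify, at the same time, the mirror data and the incremental data in the first service data set and the basic data and the incremental data in the second service data set, and obtain a modification result; and
a process determining unit 133, configured to determine that the migration of the incremental data is complete when the modification result is success.
The mirror data clearing module 14 is configured to, when the migration of the incremental data is complete, clear the mirror data and the incremental data from the first service data set.
The time detection module 16 is configured to, when a service data query request carrying a time range is received from a user terminal, check whether the time range falls within the preset time period.
The first data return module 17 is configured to, if the time detection module 16 determines that the time range falls within the preset time period, return to the user terminal the service data in the first service data set that falls within both the preset time period and the time range.
The first data return module 17 is further configured to, if the time detection module 16 determines that the time range does not fall within the preset time period, return to the user terminal the service data in the second service data set that falls outside the preset time period but within the time range.
The second data return module 18 is configured to, when a service data query request carrying no time range is received from a user terminal, return to the user terminal the first service data in the first service data set that falls within the preset time period and the second service data in the second service data set that falls outside the preset time period.
In this embodiment of the present invention, when basic data in the first service data set is migrated, mirror data of the basic data is generated and retained, the basic data is first migrated to the second service data set, and the incremental data for the basic data during that migration is recorded. When the basic data has been migrated, the incremental data is migrated, and once that is complete the mirror data and the incremental data are cleared from the first service data set. Service data is thus migrated online, which improves the efficiency of data processing such as querying and modifying service data and thereby safeguards service quality. Cyclically recording and migrating incremental data further supports online migration and reduces the impact on operations such as inserting and updating service data. Dual writing completes the migration of the remaining service data and the modification of the service data in real time without the incremental data affecting the insert and update operations requested by user terminals, and the service data in the two data sets can be verified to ensure consistency during migration. Because service data may be inserted into the wrong data set during storage, erroneously inserted service data may be withheld from the user terminal to protect the consistency of data access. Allocating data nodes dynamically by means of a deletion time threshold means the storage hardware of a data node need not be replaced when storage space runs short, which reduces hardware cost.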
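FIG. 11 is a schematic structural diagram of yet another server according to an embodiment of the present invention. As shown in FIG. 11, the incremental data recording module 12 is configured to, during the migration of the basic data, acquire a data operation instruction, execute a data processing service on the basic data according to the data operation instruction, and record the total incremental data generated by the data processing service; the total incremental data is incremental data based on logical statements.
The incremental data migration module 13 includes:
a searching unit 131, configured to find at least one first data table associated with the total incremental data, where each first data table includes corresponding table incremental data and the sum of the table incremental data of all the first data tables is the total incremental data; and
a first synchronization unit 132, configured to synchronize the table incremental data in each of the first data tables to the second service data set in parallel.
In an embodiment, the server 1 further includes:
a state detection module 15, configured to check the execution state of the data processing service;
a notification module 16, configured to, if the execution state of the data processing service is a successful execution state, notify the searching unit 131 to find the at least one first data table associated with the total incremental data; and
a deletion module 17, configured to, if the execution state of the data processing service is a data rollback state, delete the recorded total incremental data.
Specifically, FIG. 12 is a schematic structural diagram of a first synchronization unit according to an embodiment of the present invention. As shown in FIG. 12, the first synchronization unit 132 may include:
a determining subunit 1321, configured to determine whether the at least one first data table includes a first data table containing a primary key; and
a synchronization subunit 1322, configured to, if the determining subunit 1321 determines NO, synchronize the table incremental data in each of the first data tables to the second service data set in parallel;
the synchronization subunit 1322 being further configured to, if the determining subunit 1321 determines YES, synchronize the table incremental data in the first data tables that do not contain the primary key to the second service data set in parallel, find at least one target row associated with the table incremental data of each first data table that contains the primary key, and synchronize the row incremental data of each target row to the second service data set in parallel;
where the sum of the row incremental data of the target rows in a first data table is the table incremental data of that first data table.
Specifically, FIG. 13 is a schematic structural diagram of an incremental data recording module according to an embodiment of the present invention. As shown in FIG. 13, the incremental data recording module 12 includes:
an obtaining detecting unit 122, configured to acquire a data operation instruction, execute a data processing service on the basic data according to the data operation instruction, and set a data recording range corresponding to the data processing service;
a first recording unit 121, configured to, if the data recording range is at least one second data table, record the total incremental data generated by the data processing service in the at least one second data table; and
a second recording unit 123, configured to, if the data recording range is the basic data, record the total incremental data generated by the data processing service in the basic data.
Specifically, FIG. 14 is a schematic structural diagram of a second recording unit according to an embodiment of the present invention.
In an embodiment, as shown in FIG. 11, the incremental data migration module 13 further includes:
a second synchronization unit 133, configured to record a start timestamp at which a background service instruction is received, determine the full data present in the basic data at the start timestamp as the full data to be synchronized, and synchronize the full data to be synchronized to the second service data set according to the background service instruction; the background service instruction is an online upgrade instruction or a data relocation instruction.
Correspondingly, as shown in FIG. 14, the second recording unit 123 includes:
an incremental recording subunit 1231, configured to record, in the basic data, incremental data generated by the data processing service;
a synchronization determining subunit 1232, configured to determine whether the full data to be synchronized has been completely synchronized to the second service data set;
a notification subunit 1233, configured to, if the synchronization determining subunit 1232 determines NO, notify the incremental recording subunit 1231 to continue recording, in the basic data, incremental data generated by the data processing service; and
a determining subunit 1234, configured to, if the synchronization determining subunit 1232 determines YES, determine the time at which the data processing service completes as the end timestamp, and determine all incremental data recorded between the start timestamp and the end timestamp as the total incremental data.
FIG. 15 is a schematic structural diagram of yet another server according to an embodiment of the present invention. As shown in FIG. 15, the server 1500 may include at least one processor 1501 (for example, a CPU), at least one network interface 1504, a user interface 1503, a memory 1505, and at least one communication bus 1502. The communication bus 1502 provides the connections and communication between these components. The user interface 1503 may include a display and a keyboard, and may optionally also include a standard wired interface and a wireless interface. The network interface 1504 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface). The memory 1505 may be a high-speed RAM memory or a non-volatile memory, for example at least one disk memory. The memory 1505 may optionally also be at least one storage device located remotely from the processor 1501. As shown in FIG. 15, the memory 1505, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and a data management application.
In the server 1500 shown in FIG. 15, the user interface 1503 is mainly used to provide an input interface for the user and to acquire data input by the user; the processor 1501 may be used to call the data management application stored in the memory 1505 and specifically perform the following operations:
acquiring basic data to be migrated from a first service data set, generating mirror data identical to the basic data, and migrating the basic data to a second service data set;
recording incremental data acquired for the basic data during the migration of the basic data;
when the migration of the basic data is complete, performing addition processing on the mirror data using the incremental data, and migrating the incremental data to the second service data set; and
when the migration of the incremental data is complete, clearing the mirror data and the incremental data from the first service data set;
where the first service data set is the current service data set stored within a preset time period, and the second service data set is the historical service data set stored outside the preset time period.
A person of ordinary skill in the art will understand that all or part of the processes of the above method embodiments may be carried out by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The foregoing discloses only preferred embodiments of the present invention and is not intended to limit the scope of its claims; equivalent changes made according to the claims of the present invention therefore remain within the scope of the present invention.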

Claims (23)

  1. 一种数据管理方法,其特征在于,应用于服务器,包括:
    获取第一业务数据集合中待迁移的基础数据,生成与所述基础数据相同的镜像数据,并将所述基础数据迁移至第二业务数据集合中;
    记录所述基础数据的迁移过程中针对所述基础数据所获取的增量数据;
    在所述基础数据的迁移过程完成时,采用所述增量数据对所述镜像数据进行添加处理,并将所述增量数据迁移至所述第二业务数据集合中;
    当所述增量数据的迁移过程完成时,清除所述第一业务数据集合中的所述镜像数据和所述增量数据;
    其中,所述第一业务数据集合为预设时间段内存储的当前业务数据集合,所述第二业务数据集合为除所述预设时间段外所存储的历史业务数据集合。
  2. 根据权利要求1所述的方法,其特征在于,所述获取第一业务数据集合中待迁移的基础数据之前,还包括:
    将属于预设时间段内的业务数据存储至第一业务数据集合中,将属于所述预设时间段外的业务数据存储至第二业务数据集合中。
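  3. The method according to claim 1, wherein the recording of incremental data obtained for the basic data during the migration of the basic data comprises:
    recording first incremental data obtained for the basic data during the migration of the basic data;
    and the performing of addition processing on the mirror data by using the incremental data and migrating the incremental data to the second service data set comprises:
    taking the first incremental data as the incremental data, performing addition processing on the mirror data by using the incremental data, migrating the incremental data to the second service data set, recording second incremental data obtained for the basic data and the incremental data during the migration of the incremental data, taking the second incremental data as the incremental data, and repeating this step until the data volume of the second incremental data is less than a preset data volume threshold;
    when the data volume of the second incremental data is less than the preset data volume threshold, using the second incremental data to simultaneously modify the mirror data and the incremental data in the first service data set and the basic data and the incremental data in the second service data set, and obtaining a modification result; and
    when the modification result indicates that the modification succeeded, determining that the migration of the incremental data is completed.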
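  4. The method according to claim 1, further comprising:
    when a service data query request that carries a time range and is sent by a user terminal is detected, detecting whether the time range falls within the preset time period;
    if so, returning to the user terminal the service data in the first service data set that falls within the preset time period and within the time range; and
    if not, returning to the user terminal the service data in the second service data set that falls outside the preset time period but within the time range.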
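  5. The method according to claim 1, further comprising:
    when a service data query request that carries no time range and is sent by a user terminal is detected, returning to the user terminal first service data in the first service data set that falls within the preset time period and second service data in the second service data set that falls outside the preset time period.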
  6. The method according to claim 1, further comprising:
    clearing service data in the second service data set that meets a deletion time threshold.
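  7. The method according to claim 1, wherein the recording of incremental data obtained for the basic data during the migration of the basic data comprises:
    during the migration of the basic data, obtaining a data operation instruction, performing a data processing service on the basic data according to the data operation instruction, and recording total incremental data generated by the data processing service, the total incremental data being incremental data based on logical statements;
    and the migrating of the incremental data to the second service data set comprises:
    searching for at least one first data table associated with the total incremental data, each first data table including corresponding table incremental data, and the sum of the table incremental data respectively corresponding to the first data tables being the total incremental data; and
    synchronizing the table incremental data in the first data tables to the second service data set in parallel.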
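  8. The method according to claim 7, wherein the synchronizing of the table incremental data in the first data tables to the second service data set in parallel comprises:
    determining whether the at least one first data table includes a first data table that contains a primary key;
    if not, synchronizing the table incremental data in the first data tables to the second service data set in parallel; and
    if so, synchronizing the table incremental data in each first data table that does not contain the primary key to the second service data set in parallel, searching for at least one piece of target row data associated with the table incremental data corresponding to the first data table that contains the primary key, and synchronizing the row incremental data respectively corresponding to the pieces of target row data to the second service data set in parallel;
    wherein the sum of the row incremental data respectively corresponding to the pieces of target row data in the first data table is the table incremental data corresponding to that first data table.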
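  9. The method according to claim 7, wherein the obtaining of a data operation instruction, performing a data processing service on the basic data according to the data operation instruction, and recording total incremental data generated by the data processing service comprises:
    obtaining a data operation instruction, performing a data processing service on the basic data according to the data operation instruction, and setting a data recording range corresponding to the data processing service;
    if the data recording range is at least one second data table, recording, in the at least one second data table, the total incremental data generated by the data processing service; and
    if the data recording range is the basic data, recording, in the basic data, the total incremental data generated by the data processing service.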
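  10. The method according to claim 7, wherein before the step of searching for at least one first data table associated with the total incremental data, the method further comprises:
    detecting an execution state of the data processing service;
    if the execution state of the data processing service is a successful execution state, performing the step of searching for at least one first data table associated with the total incremental data; and
    if the execution state of the data processing service is a data rollback state, deleting the recorded total incremental data.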
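  11. The method according to claim 9, wherein before the step of obtaining a data operation instruction, the method further comprises:
    recording a start timestamp corresponding to receipt of a background service instruction, determining the full data in the basic data as of the start timestamp as full data to be synchronized, and synchronizing the full data to be synchronized to the second service data set according to the background service instruction;
    and the recording, in the basic data, of the total incremental data generated by the data processing service comprises:
    recording, in the basic data, incremental data generated by the data processing service;
    determining whether the full data to be synchronized has been completely synchronized to the second service data set;
    if not, continuing to perform the step of recording, in the basic data, incremental data generated by the data processing service; and
    if so, determining the moment at which the data processing service is completed as an end timestamp, and determining all incremental data recorded between the start timestamp and the end timestamp as the total incremental data.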
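  12. A server, comprising a processor and a memory, the memory storing instructions executable by the processor, wherein when the instructions are executed, the processor is configured to:
    obtain basic data to be migrated in a first service data set, generate mirror data identical to the basic data, and migrate the basic data to a second service data set;
    record incremental data obtained for the basic data during the migration of the basic data;
    when the migration of the basic data is completed, perform addition processing on the mirror data by using the incremental data, and migrate the incremental data to the second service data set; and
    when the migration of the incremental data is completed, clear the mirror data and the incremental data from the first service data set;
    wherein the first service data set is a current service data set stored within a preset time period, and the second service data set is a historical service data set stored outside the preset time period.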
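  13. The server according to claim 12, wherein when the instructions are executed, the processor is further configured to:
    store service data that falls within the preset time period in the first service data set, and store service data that falls outside the preset time period in the second service data set.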
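  14. The server according to claim 12, wherein when the instructions are executed, the processor is further configured to: record first incremental data obtained for the basic data during the migration of the basic data;
    when the migration of the basic data is completed, take the first incremental data as the incremental data, perform addition processing on the mirror data by using the incremental data, migrate the incremental data to the second service data set, record second incremental data obtained for the basic data and the incremental data during the migration of the incremental data, take the second incremental data as the incremental data, and repeat this step until the data volume of the second incremental data is less than a preset data volume threshold;
    when the data volume of the second incremental data is less than the preset data volume threshold, use the second incremental data to simultaneously modify the mirror data and the incremental data in the first service data set and the basic data and the incremental data in the second service data set, and obtain a modification result; and
    when the modification result indicates that the modification succeeded, determine that the migration of the incremental data is completed.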
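  15. The server according to claim 12, wherein when the instructions are executed, the processor is further configured to:
    when a service data query request that carries a time range and is sent by a user terminal is detected, detect whether the time range falls within the preset time period;
    if the result of detecting whether the time range falls within the preset time period is yes, return to the user terminal the service data in the first service data set that falls within the preset time period and within the time range; and
    if the result of detecting whether the time range falls within the preset time period is no, return to the user terminal the service data in the second service data set that falls outside the preset time period but within the time range.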
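  16. The server according to claim 12, wherein when the instructions are executed, the processor is further configured to:
    when a service data query request that carries no time range and is sent by a user terminal is detected, return to the user terminal first service data in the first service data set that falls within the preset time period and second service data in the second service data set that falls outside the preset time period.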
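  17. The server according to claim 12, wherein when the instructions are executed, the processor is further configured to:
    clear service data in the second service data set that meets a deletion time threshold.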
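  18. The server according to claim 12, wherein when the instructions are executed, the processor is further configured to: during the migration of the basic data, obtain a data operation instruction, perform a data processing service on the basic data according to the data operation instruction, and record total incremental data generated by the data processing service, the total incremental data being incremental data based on logical statements;
    search for at least one first data table associated with the total incremental data, each first data table including corresponding table incremental data, and the sum of the table incremental data respectively corresponding to the first data tables being the total incremental data; and
    synchronize the table incremental data in the first data tables to the second service data set in parallel.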
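  19. The server according to claim 18, wherein when the instructions are executed, the processor is further configured to:
    determine whether the at least one first data table includes a first data table that contains a primary key;
    if not, synchronize the table incremental data in the first data tables to the second service data set in parallel; and
    if so, synchronize the table incremental data in each first data table that does not contain the primary key to the second service data set in parallel, search for at least one piece of target row data associated with the table incremental data corresponding to the first data table that contains the primary key, and synchronize the row incremental data respectively corresponding to the pieces of target row data to the second service data set in parallel;
    wherein the sum of the row incremental data respectively corresponding to the pieces of target row data in the first data table is the table incremental data corresponding to that first data table.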
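  20. The server according to claim 18, wherein when the instructions are executed, the processor is further configured to: obtain a data operation instruction, perform a data processing service on the basic data according to the data operation instruction, and set a data recording range corresponding to the data processing service;
    if the data recording range is at least one second data table, record, in the at least one second data table, the total incremental data generated by the data processing service; and
    if the data recording range is the basic data, record, in the basic data, the total incremental data generated by the data processing service.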
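  21. The server according to claim 18, wherein when the instructions are executed, the processor is further configured to:
    detect an execution state of the data processing service;
    if the execution state of the data processing service is a successful execution state, search for at least one first data table associated with the total incremental data; and
    if the execution state of the data processing service is a data rollback state, delete the recorded total incremental data.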
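  22. The server according to claim 20, wherein when the instructions are executed, the processor is further configured to: record a start timestamp corresponding to receipt of a background service instruction, determine the full data in the basic data as of the start timestamp as full data to be synchronized, and synchronize the full data to be synchronized to the second service data set according to the background service instruction;
    record, in the basic data, incremental data generated by the data processing service;
    determine whether the full data to be synchronized has been completely synchronized to the second service data set;
    if not, continue to record, in the basic data, incremental data generated by the data processing service; and
    if so, determine the moment at which the data processing service is completed as an end timestamp, and determine all incremental data recorded between the start timestamp and the end timestamp as the total incremental data.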
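  23. A computer-readable storage medium, storing computer-readable instructions that can cause at least one processor to perform the method according to any one of claims 1 to 11.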
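
The parallel synchronization recited in claims 7 and 8 (and mirrored in claims 18 and 19) can be sketched in Python as follows. This is an illustrative sketch only, not the claimed implementation: the `(table, row_key, statement)` change tuples, the function names, the in-memory `target` dictionary standing in for the second service data set, and the use of a thread pool are all assumptions made here. Tables without a primary key are synchronized as one unit per table; tables with a primary key are split further into one unit per target row.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from threading import Lock
from typing import Dict, Hashable, List, Optional, Set, Tuple

# Hypothetical change record: (table name, primary key value or None, logical statement)
Change = Tuple[str, Optional[Hashable], str]
Unit = Tuple[str, Optional[Hashable]]


def split_into_units(
    total_increment: List[Change], tables_with_pk: Set[str]
) -> Dict[Unit, List[str]]:
    """Group the total increment into table-level or row-level units."""
    units: Dict[Unit, List[str]] = defaultdict(list)
    for table, row_key, statement in total_increment:
        # Row-level unit only when the table has a primary key.
        unit = (table, row_key) if table in tables_with_pk else (table, None)
        units[unit].append(statement)  # statement order is kept inside a unit
    return dict(units)


def sync_in_parallel(
    total_increment: List[Change],
    tables_with_pk: Set[str],
    target: Dict[Unit, List[str]],
    max_workers: int = 4,
) -> None:
    """Apply every unit to the target concurrently, one task per unit."""
    units = split_into_units(total_increment, tables_with_pk)
    lock = Lock()  # protects the shared in-memory target

    def apply_unit(unit: Unit, statements: List[str]) -> None:
        with lock:
            target.setdefault(unit, []).extend(statements)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(apply_unit, u, s) for u, s in units.items()]
        for future in futures:
            future.result()  # re-raise any error from a worker
```

Statements that touch the same unit stay in their recorded order, while independent tables and independent rows may be applied in any order relative to each other, which is what makes the parallelism safe in this simplified model.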
PCT/CN2017/116144 2016-12-19 2017-12-14 Data management method and server WO2018113580A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/289,808 US11500832B2 (en) 2016-12-19 2019-03-01 Data management method and server

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN201611178079.2 2016-12-19
CN201611178078.8 2016-12-19
CN201611178079.2A CN108205560B (zh) 2016-12-19 2016-12-19 Data synchronization method and apparatus
CN201611178078.8A CN108205559B (zh) 2016-12-19 2016-12-19 Data management method and device

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/289,808 Continuation US11500832B2 (en) 2016-12-19 2019-03-01 Data management method and server

Publications (1)

Publication Number Publication Date
WO2018113580A1 true WO2018113580A1 (zh) 2018-06-28

Family

ID=62624529

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/116144 WO2018113580A1 (zh) 2016-12-19 2017-12-14 Data management method and server

Country Status (2)

Country Link
US (1) US11500832B2 (zh)
WO (1) WO2018113580A1 (zh)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
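CN109918382A (zh) * 2019-03-18 2019-06-21 Oppo广东移动通信有限公司 Data processing method, apparatus, terminal, and storage medium
CN111651519A (zh) * 2020-05-08 2020-09-11 携程计算机技术(上海)有限公司 Data synchronization method, data synchronization apparatus, electronic device, and storage medium
CN112527775A (zh) * 2020-12-18 2021-03-19 福建天晴数码有限公司 Method and apparatus for database expansion based on double write
CN112988916A (zh) * 2021-03-05 2021-06-18 杭州天阙科技有限公司 Full and incremental synchronization method, device, and storage medium for Clickhouse
CN113836114A (zh) * 2021-09-27 2021-12-24 北京互金新融科技有限公司 Data migration method, system, device, and storage medium
CN116860898A (zh) * 2023-09-05 2023-10-10 建信金融科技有限责任公司 Data processing method and apparatus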

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
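CN111737252B (zh) * 2020-05-22 2023-10-03 广东科学技术职业学院 Data fusion method and system based on a data center
CN111831748B (zh) * 2020-06-30 2024-04-30 北京小米松果电子有限公司 Data synchronization method, apparatus, and storage medium
CN111857597A (zh) * 2020-07-24 2020-10-30 浪潮电子信息产业股份有限公司 Hotspot data caching method, system, and related apparatus
CN112511352B (zh) * 2020-12-01 2023-01-24 深圳市鹰硕技术有限公司 User management method and system
CN112769945B (zh) * 2021-01-19 2023-02-03 中国工商银行股份有限公司 Distributed service invocation method and apparatus
CN113886478A (zh) * 2021-09-30 2022-01-04 杭州数梦工场科技有限公司 Data processing method and apparatus applied to ETL, and electronic device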
US11822570B2 (en) 2021-11-03 2023-11-21 International Business Machines Corporation Database synchronization employing parallel poll threads

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
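CN103823812A (zh) * 2012-11-19 2014-05-28 苏州工业园区新宏博通讯科技有限公司 System data management method
CN104598531A (zh) * 2014-12-25 2015-05-06 广东电子工业研究院有限公司 Trigger-based incremental data migration method between heterogeneous relational databases
CN105718570A (zh) * 2016-01-20 2016-06-29 北京京东尚科信息技术有限公司 Data migration method and apparatus for a database
CN105868343A (zh) * 2016-03-28 2016-08-17 上海携程商务有限公司 Database migration method and system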

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4842593B2 (ja) * 2005-09-05 2011-12-21 株式会社日立製作所 Device control takeover method for a storage virtualization apparatus
JP5092897B2 (ja) * 2008-05-26 2012-12-05 富士通株式会社 Data migration processing program, data migration processing apparatus, and data migration processing method
US8868487B2 (en) * 2010-04-12 2014-10-21 Sandisk Enterprise Ip Llc Event processing in a flash memory-based object store
US8489632B1 (en) * 2011-06-28 2013-07-16 Google Inc. Predictive model training management
CN105320681B (zh) * 2014-07-16 2020-06-30 中兴通讯股份有限公司 Database content merging method and apparatus
US20160063050A1 (en) * 2014-08-28 2016-03-03 Joerg Schoen Database Migration Consistency Checker
US9875031B2 (en) * 2015-09-30 2018-01-23 Western Digital Technologies, Inc. Data retention management for data storage device
CN105472045A (zh) * 2016-01-26 2016-04-06 北京百度网讯科技有限公司 Database migration method and apparatus
CN107122361B (zh) * 2016-02-24 2021-07-06 阿里巴巴集团控股有限公司 Data migration system and method
CN105868421A (zh) 2016-06-12 2016-08-17 浪潮通用软件有限公司 Data management method and apparatus
US10678758B2 (en) * 2016-11-21 2020-06-09 Commvault Systems, Inc. Cross-platform virtual machine data and memory backup and replication

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
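CN109918382A (zh) * 2019-03-18 2019-06-21 Oppo广东移动通信有限公司 Data processing method, apparatus, terminal, and storage medium
CN109918382B (zh) * 2019-03-18 2021-06-01 Oppo广东移动通信有限公司 Data processing method, apparatus, terminal, and storage medium
CN111651519A (zh) * 2020-05-08 2020-09-11 携程计算机技术(上海)有限公司 Data synchronization method, data synchronization apparatus, electronic device, and storage medium
CN111651519B (zh) * 2020-05-08 2023-04-25 携程计算机技术(上海)有限公司 Data synchronization method, data synchronization apparatus, electronic device, and storage medium
CN112527775A (zh) * 2020-12-18 2021-03-19 福建天晴数码有限公司 Method and apparatus for database expansion based on double write
CN112988916A (zh) * 2021-03-05 2021-06-18 杭州天阙科技有限公司 Full and incremental synchronization method, device, and storage medium for Clickhouse
CN112988916B (zh) * 2021-03-05 2023-06-16 杭州天阙科技有限公司 Full and incremental synchronization method, device, and storage medium for Clickhouse
CN113836114A (zh) * 2021-09-27 2021-12-24 北京互金新融科技有限公司 Data migration method, system, device, and storage medium
CN113836114B (zh) * 2021-09-27 2024-04-26 北京互金新融科技有限公司 Data migration method, system, device, and storage medium
CN116860898A (zh) * 2023-09-05 2023-10-10 建信金融科技有限责任公司 Data processing method and apparatus
CN116860898B (zh) * 2023-09-05 2024-04-23 建信金融科技有限责任公司 Data processing method and apparatus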

Also Published As

Publication number Publication date
US20190197027A1 (en) 2019-06-27
US11500832B2 (en) 2022-11-15

Similar Documents

Publication Publication Date Title
WO2018113580A1 (zh) Data management method and server
US11327799B2 (en) Dynamic allocation of worker nodes for distributed replication
JP5577350B2 (ja) Method and system for efficient data synchronization
US20190243547A1 (en) Distributed object replication architecture
JP5792594B2 (ja) Database redistribution using virtual partitions
US10606865B2 (en) Database scale-out
WO2021238701A1 (zh) Data migration method and apparatus
CN108205560B (zh) Data synchronization method and apparatus
WO2016050112A1 (zh) Data storage method, storage apparatus, and storage system
US11263236B2 (en) Real-time cross-system database replication for hybrid-cloud elastic scaling and high-performance data virtualization
CN104965879A (zh) Method and apparatus for modifying the table structure of a data table
EP2380090B1 (en) Data integrity in a database environment through background synchronization
US20130066883A1 (en) Data management apparatus and system
WO2014023000A1 (zh) Distributed data processing method and apparatus
US11507277B2 (en) Key value store using progress verification
WO2016192496A1 (zh) Data migration processing method and apparatus
WO2022111188A1 (zh) Transaction processing method, system, apparatus, device, storage medium, and program product
CN113760847A (zh) Log data processing method, apparatus, device, and storage medium
US11216421B2 (en) Extensible streams for operations on external systems
CN114185991A (zh) Method and related apparatus for data synchronization based on a distributed database
CN108205559B (zh) Data management method and device
US10025680B2 (en) High throughput, high reliability data processing system
US20100023713A1 (en) Archive system and contents management method
CN101833585A (zh) Database server operation control system, method, and device
US11010410B1 (en) Processing data groupings belonging to data grouping containers

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17883267

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17883267

Country of ref document: EP

Kind code of ref document: A1