CN114969206A - Data processing method, device, equipment and storage medium - Google Patents

Data processing method, device, equipment and storage medium

Publication number
CN114969206A
Authority
CN
China
Prior art keywords
data
database
sub
updating
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210572211.7A
Other languages
Chinese (zh)
Inventor
李鸿鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lianlian Yintong Electronic Payment Co ltd
Original Assignee
Lianlian Yintong Electronic Payment Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lianlian Yintong Electronic Payment Co ltd
Priority to CN202210572211.7A
Publication of CN114969206A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Security & Cryptography (AREA)
  • Quality & Reliability (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application relates to the field of database application technologies, and in particular to a data processing method, apparatus, device, and storage medium. The method comprises: acquiring full data in a first database, wherein the full data carries data identification information; writing the full data into at least two sub-databases in a second database according to the data identification information; receiving sub-database update data of a target sub-database, sent by the target sub-database in response to a sub-database update instruction; and updating the data in the first database according to the sub-database update data. With this method, whenever the latest data is written into a sub-database in the second database, the sub-database update data is synchronized into the first database as a backup, so that the data in the first database and the data in the second database are kept consistent. Therefore, when the service is switched back to the first database, data loss can be avoided and normal operation of the service is ensured.

Description

Data processing method, device, equipment and storage medium
Technical Field
The present application relates to the field of database application technologies, and in particular, to a data processing method, apparatus, device, and storage medium.
Background
As a system operates, the amount of data stored in its relational database grows, and the access pressure on the system increases. As the data volume in a single database and the queries per second (QPS) of the database both rise, the time required to read and write the database increases correspondingly. The read-write performance of the database may become a bottleneck for service development, so database performance needs to be optimized.
If the QPS of the database is too high, database splitting needs to be considered, so that the connection pressure of the single database is shared among sub-databases. If the amount of data in a single table is too large, data queries and data updates may suffer performance problems once the data volume exceeds a certain level. In that case, the single table can be split into sub-tables.
Splitting databases and tables can optimize database performance. However, the sub-databases and sub-tables may have stability problems or inconsistent data at the initial stage of operation. Once such problems occur in the new databases, normal operation of the service is affected.
Disclosure of Invention
The application provides a data processing method, apparatus, device, and storage medium. While service data is written into the sub-databases and sub-tables, it is also backed up to the single database, so that the single-database data and the sub-database data are kept consistent. When the sub-databases and sub-tables fail, the service can be switched back to the single database, which ensures normal operation of the service.
In a first aspect, an embodiment of the present application discloses a data processing method, including:
acquiring full data in a first database, wherein the full data carries data identification information of the full data;
writing the full data into at least two sub-databases in a second database according to the data identification information of the full data;
receiving sub-database update data of a target sub-database, sent by the target sub-database in response to a sub-database update instruction, wherein the target sub-database is one of the at least two sub-databases; and
updating the data in the first database according to the sub-database update data.
Further, receiving the sub-database update data of the target sub-database, sent by the target sub-database in response to the sub-database update instruction, includes:
receiving a sub-database update message from a preset topic of a message middleware, wherein the sub-database update message is sent to the preset topic of the message middleware by the target sub-database, based on a first subscription publishing component, in response to the sub-database update instruction; and
acquiring the sub-database update data based on the sub-database update message.
Further, the sub-database update message carries data original source information. Updating the data in the first database according to the sub-database update data includes:
updating the data in the first database according to the sub-database update data when the data original source information is determined to be preset source information. The preset source information indicates that the original source of the sub-database update data is the second database.
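The source check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the message layout (a dict with `origin`, `key`, and `value` fields), the constant `ORIGIN_SECOND_DB`, and the use of a plain dict as the first database are all hypothetical.

```python
# Hypothetical marker meaning "this change was first written in the second database".
ORIGIN_SECOND_DB = "second_db"


def should_apply_to_first_db(message: dict) -> bool:
    """Return True only when the update originated in the second database."""
    return message.get("origin") == ORIGIN_SECOND_DB


def apply_updates(first_db: dict, messages: list) -> dict:
    """Apply sub-database update messages to an in-memory stand-in for the first database."""
    for message in messages:
        if should_apply_to_first_db(message):
            first_db[message["key"]] = message["value"]
        # Messages with any other origin are skipped and never written.
    return first_db
```

Because the two databases are synchronized in both directions, one plausible purpose of such a check is to avoid replaying a change back into the database it came from.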
Further, the method further comprises:
when target data exists in the first database, determining a first update time of the sub-database update data and a second update time of the target data, wherein the target data is data in the first database whose similarity to the sub-database update data is greater than a similarity threshold; and
discarding the sub-database update data when the first update time is earlier than the second update time.
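The update-time comparison can be sketched as below. The sketch assumes the matching target data has already been located (the similarity comparison itself is not shown), and the `updated_at` field name and the `resolve` helper are hypothetical.

```python
from datetime import datetime
from typing import Optional


def resolve(incoming: dict, existing: Optional[dict]) -> dict:
    """Return the row that should remain in the first database.

    The incoming sub-database update is discarded only when it is strictly
    older than the row that already exists in the first database.
    """
    if existing is not None and incoming["updated_at"] < existing["updated_at"]:
        return existing
    return incoming


older = {"id": 1, "amount": 10, "updated_at": datetime(2022, 5, 1, 12, 0, 0)}
newer = {"id": 1, "amount": 20, "updated_at": datetime(2022, 5, 1, 12, 5, 0)}
kept = resolve(incoming=older, existing=newer)  # the stale update is dropped
```

Discarding stale updates in this way keeps a delayed message from overwriting a newer value in the first database.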
Further, the sub-database update data carries data processing type information, and the sub-database update data includes deletion or modification data. Updating the data in the first database according to the sub-database update data includes:
determining, according to the data processing type information, target update data in the first database that corresponds to the deletion or modification data; and
updating the target update data according to the deletion or modification data.
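One way to picture how the data processing type drives the update is the sketch below. It assumes two hypothetical type values, `"update"` and `"delete"`, a hypothetical change layout (`type`, `key`, `fields`), and a dict standing in for the first database; the application does not specify any of these.

```python
def apply_change(first_db: dict, change: dict) -> None:
    """Locate the target row by key and apply the change according to its type."""
    key = change["key"]
    if change["type"] == "delete":
        # Remove the target row if it is present.
        first_db.pop(key, None)
    elif change["type"] == "update":
        # Merge the modified fields into the existing target row.
        if key in first_db:
            first_db[key] = {**first_db[key], **change["fields"]}
    else:
        raise ValueError(f"unsupported data processing type: {change['type']!r}")


first_db = {"o1": {"status": "NEW", "amount": 5}, "o2": {"status": "NEW", "amount": 7}}
apply_change(first_db, {"type": "update", "key": "o1", "fields": {"status": "PAID"}})
apply_change(first_db, {"type": "delete", "key": "o2"})
```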
Further, after updating the data in the first database according to the update data of the sub-databases, the method further includes:
performing a consistency check on the data in the first database and the data in the second database;
determining difference data between the first database and the second database when the check fails; and
processing the data in the first database according to the difference data.
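The check-then-repair flow can be sketched as follows, under several assumptions: both databases are represented as dicts keyed by primary key, rows are compared by a SHA-256 digest of their JSON form, and the second database is treated as the source of truth when repairing. The helper names are hypothetical, and the application does not state how the difference data is computed or applied.

```python
import hashlib
import json


def row_digest(row: dict) -> str:
    """Stable digest of a row, independent of key order."""
    return hashlib.sha256(json.dumps(row, sort_keys=True).encode("utf-8")).hexdigest()


def find_differences(first_db: dict, second_db: dict) -> dict:
    """Return keys missing from, extra in, or changed in the first database."""
    shared = first_db.keys() & second_db.keys()
    return {
        "missing": sorted(k for k in second_db if k not in first_db),
        "extra": sorted(k for k in first_db if k not in second_db),
        "changed": sorted(k for k in shared if row_digest(first_db[k]) != row_digest(second_db[k])),
    }


def repair_first_db(first_db: dict, second_db: dict) -> dict:
    """Bring the first database in line with the second; return the differences found."""
    differences = find_differences(first_db, second_db)
    for key in differences["missing"] + differences["changed"]:
        first_db[key] = dict(second_db[key])
    for key in differences["extra"]:
        del first_db[key]
    return differences
```

An empty result in all three lists corresponds to the check passing.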
Further, the method also comprises the following steps:
when the second database is in a preset state, receiving target write data sent by the first database in response to a data synchronization instruction, wherein the target write data was written into the first database while the second database was in a non-preset state, and the target write data carries data identification information of the target write data; and
updating the data in the at least two sub-databases based on the target write data and the data identification information of the target write data.
In a second aspect, an embodiment of the present application discloses a data processing apparatus, including:
a full data acquisition module, configured to acquire full data in a first database, wherein the full data carries data identification information of the full data;
a full data writing module, configured to write the full data into at least two sub-databases in a second database according to the data identification information of the full data;
a sub-database update data receiving module, configured to receive sub-database update data of a target sub-database, sent by the target sub-database in response to a sub-database update instruction, wherein the target sub-database is one of the at least two sub-databases; and
a first database data updating module, configured to update the data in the first database according to the sub-database update data.
In some optional embodiments, the sub-database update data receiving module comprises:
a sub-database update message receiving unit, configured to receive a sub-database update message from a preset topic of a message middleware, wherein the sub-database update message is sent to the preset topic of the message middleware by the target sub-database, based on a first subscription publishing component, in response to the sub-database update instruction; and
a sub-database update data acquisition unit, configured to acquire the sub-database update data based on the sub-database update message.
In some optional embodiments, the sub-database update message carries data original source information. The first database data updating module includes:
a data original source information determining unit, configured to update the data in the first database according to the sub-database update data when the data original source information is determined to be preset source information. The preset source information indicates that the original source of the sub-database update data is the second database.
In some optional embodiments, the apparatus further comprises:
a target data determining module, configured to determine, when target data exists in the first database, a first update time of the sub-database update data and a second update time of the target data, wherein the target data is data in the first database whose similarity to the sub-database update data is greater than a similarity threshold; and
an update time determining module, configured to discard the sub-database update data when the first update time is earlier than the second update time.
In some optional embodiments, the sub-database update data carries data processing type information, and the sub-database update data includes deletion or modification data. The first database data updating module includes:
a target update data determining unit, configured to determine, according to the data processing type information, target update data in the first database that corresponds to the deletion or modification data; and
a data updating unit, configured to update the target update data according to the deletion or modification data.
In some optional embodiments, the apparatus further comprises:
a consistency check module, configured to perform a consistency check on the data in the first database and the data in the second database;
a difference data determining module, configured to determine difference data between the first database and the second database when the check fails; and
a data processing module, configured to process the data in the first database according to the difference data.
In some optional embodiments, the apparatus further comprises:
a target write data receiving module, configured to receive, when the second database is in a preset state, target write data sent by the first database in response to a data synchronization instruction, wherein the target write data was written into the first database while the second database was in a non-preset state, and the target write data carries data identification information of the target write data; and
a sub-database updating module, configured to update the data in the at least two sub-databases based on the target write data and the data identification information of the target write data.
In a third aspect, an embodiment of the present application discloses an electronic device, where the device includes a processor and a memory, where at least one instruction or at least one program is stored in the memory, and the at least one instruction or the at least one program is loaded by the processor and executes the data processing method described above.
In a fourth aspect, an embodiment of the present application discloses a computer-readable storage medium, in which at least one instruction or at least one program is stored, and the at least one instruction or the at least one program is loaded and executed by a processor to implement the data processing method described above.
The technical scheme provided by the embodiment of the application has the following technical effects:
According to the data processing method, when the latest data is written into the sub-databases in the second database, the sub-database update data is synchronized into the first database as a backup. Because the original full data in the first database has also been written into the sub-databases in the second database, the data in the first database and the data in the second database are kept consistent. Therefore, when the service is switched back to the first database, data loss can be avoided and normal operation of the service is ensured.
Drawings
To illustrate the technical solutions and advantages of the embodiments of the present application or of the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of an application environment of a data processing method provided in an embodiment of the present application;
Fig. 2 is a schematic flowchart of a data processing method according to an embodiment of the present application;
Fig. 3 is a schematic diagram of a system architecture for migrating data to a second database according to an embodiment of the present application;
Fig. 4 is a schematic diagram of a system architecture for backing up update data of a second database in a first database according to an embodiment of the present application;
Fig. 5 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application;
Fig. 6 is a hardware block diagram of a server in a data processing method according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of the embodiments of the present application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or server that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In order to make the objects, technical solutions and advantages disclosed in the embodiments of the present application more clearly apparent, the embodiments of the present application are described in further detail below with reference to the accompanying drawings and the embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the embodiments of the application and are not intended to limit the embodiments of the application.
In the following, the terms "first", "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present embodiment, "a plurality" means two or more unless otherwise specified.
As time passes and the business grows, a single database holds more and more tables, and the data volume in a single table becomes larger and larger. Accordingly, the overhead of data operations such as inserts, deletes, and updates also increases. In addition, because a single database cannot be deployed in a distributed manner and the resources of one server are limited, the data volume and data processing capacity that the database can carry will eventually hit a bottleneck. At that point, the single database and/or single table can be split to obtain a plurality of sub-databases and sub-tables.
After the single database is split into sub-databases and sub-tables, the data in the single database needs to be migrated to them. After the data migration is finished, the service system is switched to the sub-databases and sub-tables, that is, newly added data generated by the service system is read and written in the sub-databases and sub-tables.
However, the stability of the sub-databases and sub-tables at the initial stage of operation cannot be fully guaranteed, and faults may sometimes occur. To ensure normal operation of the business, the service may sometimes need to be switched back to the single database. However, because data is no longer written into the single database after the service system is switched to the sub-databases and sub-tables, switching back to the single database may cause partial data loss, which also affects normal operation of the service.
In view of this, embodiments of the present application provide a data processing method, apparatus, device, and storage medium in which, when the latest data is written into the sub-databases and sub-tables, it is synchronously backed up to the single database, so that the single-database data and the sub-database data are kept consistent. When the service is switched back to the single database, data loss can be avoided and normal operation of the service is ensured.
Referring to fig. 1, fig. 1 is a schematic diagram of an application environment of a data processing method according to an embodiment of the present application, and as shown in fig. 1, the application environment may include a client 101 and a server 103.
In an alternative embodiment, the client 101 may be a client of a business system. The user initiates or completes a service by using the client 101. Optionally, the client 101 may include, but is not limited to, a smartphone, a desktop computer, a tablet computer, a laptop computer, a smart speaker, a digital assistant, an Augmented Reality (AR)/Virtual Reality (VR) device, a smart wearable device, and other types of electronic devices. The software running on the electronic device may be an application program, an applet, or the like. Optionally, the operating system running on the electronic device may include, but is not limited to, Android, iOS, Linux, Windows, Unix, and the like.
In an alternative embodiment, the server 103 may be a backend server of the business system. The server 103 is provided with a database for storing service data generated by the service system. Optionally, the server 103 may include an independent physical server, or a server cluster or a distributed system formed by a plurality of physical servers, or may be a cloud server that provides basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a CDN (Content Delivery Network), a big data and artificial intelligence platform, and the like.
In an alternative embodiment, the databases in server 103 include a first database and a second database. Optionally, the first database and the second database are both relational databases, such as MySQL databases. The first database is a single database. Optionally, the first database may include one or more data tables. The second database is a sharded database comprising at least two sub-databases. Optionally, each sub-database may include one or more data tables. The data in the first database and the second database is bidirectionally synchronized through the data synchronization component, that is, the data in the single database and the data in the plurality of sub-databases is bidirectionally synchronized through the data synchronization component.
Specific embodiments of the data processing method of the present application are described below. Fig. 2 is a schematic flowchart of a data processing method according to an embodiment of the present application. The present specification provides the method operation steps as described in the embodiments or the flowchart, but more or fewer operation steps may be included based on conventional or non-inventive effort. The order of steps recited in the embodiments is merely one of many possible execution orders and does not represent the only order of execution. When an actual business system or server product executes the method, the steps may be executed sequentially or in parallel (for example, in a parallel-processor or multi-threaded environment) according to the methods shown in the embodiments or figures. Specifically, as shown in Fig. 2, the method may include:
S201: acquire the full data in the first database, wherein the full data carries the data identification information of the full data.
In the embodiment of the application, when the data volume in a single database is larger and the performance of the database is influenced, the single database can be subjected to database partitioning, so that a plurality of databases are obtained.
In the embodiment of the application, the first database is subjected to database partitioning according to a preset database partitioning rule, so that a second database comprising a plurality of sub-databases is obtained. Optionally, the preset database partitioning rule may be a vertical database partitioning rule or a horizontal database partitioning rule. The specific method adopted for splitting the first database can be splitting based on hash modulo or splitting based on a routing table.
As an alternative implementation, the following describes a method of splitting a single database, taking a merchant transaction service system as an example. First, the single database that needs to be split in the merchant transaction service system is determined. Then, with the merchant number as the fragmentation field, a consistent hash algorithm is used to determine the sub-database into which each piece of data in the single database falls. Next, all tables in the sub-databases are modified so that each table redundantly stores the merchant number as the fragmentation field, and the query conditions of statements against the sub-databases and sub-tables are required to include the merchant number.
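Routing by merchant number with a consistent hash can be sketched as below. This is an illustrative ring only: the use of MD5 as the hash function, the number of virtual nodes, and the sub-database names are assumptions, since the application does not specify how its consistent hash is built.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Minimal consistent-hash ring mapping a merchant number to a sub-database."""

    def __init__(self, sub_databases, virtual_nodes=100):
        ring = []
        for name in sub_databases:
            # Each sub-database is placed on the ring at several points.
            for i in range(virtual_nodes):
                ring.append((self._hash(f"{name}#{i}"), name))
        ring.sort()
        self._points = [point for point, _ in ring]
        self._owners = [owner for _, owner in ring]

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

    def sub_database_for(self, merchant_no: str) -> str:
        # First ring point clockwise from the key's hash, wrapping around at the end.
        index = bisect.bisect_right(self._points, self._hash(merchant_no)) % len(self._points)
        return self._owners[index]


ring = ConsistentHashRing(["sub_db_0", "sub_db_1", "sub_db_2"])
target = ring.sub_database_for("0042")  # always the same sub-database for this merchant
```

A property of this design is that removing one sub-database only relocates the keys that were stored on it; keys owned by the remaining sub-databases stay where they are.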
In the embodiment of the application, after the database partitioning is completed, the data in the first database needs to be migrated to a plurality of sub-databases in the second database. When data migration is performed, first, the full amount of data in the first database can be obtained, and then the full amount of data is migrated to the plurality of sub-databases in the second database.
S203: write the full data into at least two sub-databases in the second database according to the data identification information of the full data.
In the embodiment of the application, the data migration scheme may be a shutdown migration scheme or a non-stop migration scheme. For the shutdown migration scheme, the service system server is first shut down so that the service system stops running. The service system then generates no new service data, so all data in the first database after shutdown can be determined to be the full data, which is then migrated to the multiple sub-databases in the second database. For the non-stop migration scheme, the business system may generate new business data and write it into the first database during the data migration process. Therefore, all data in the first database before a certain time prior to the data migration can be determined to be the full data, and data written at or after that time to be update data; the full data is then migrated to the plurality of sub-databases in the second database.
In the embodiment of the present application, each piece of data in the full data has data identification information for identifying the data. The data identification information is used to determine the sub-database into which the piece of data falls. Optionally, the data identification information may be a fragmentation field. As an example, when the first database is split, the merchant number is used as the fragmentation field, and the second database comprises three sub-databases, each of which stores the service data for one range of merchant numbers. Thus, when the full data in the first database is migrated to the second database, each piece of the full data is written to a sub-database according to its merchant number. Namely, the service data with merchant numbers 0001-0500 in the full data is written into the first sub-database, the service data with merchant numbers 0501-1000 is written into the second sub-database, and the service data with merchant numbers 1001-1500 is written into the third sub-database.
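The range-based placement in this example can be sketched as follows. The three ranges come from the example above; the sub-database names, the `merchant_no` field name, and the helper functions are hypothetical.

```python
import bisect

# Inclusive upper bound of each merchant-number range from the example above.
UPPER_BOUNDS = [500, 1000, 1500]
SUB_DATABASES = ["sub_db_1", "sub_db_2", "sub_db_3"]


def route(merchant_no: str) -> str:
    """Return the sub-database that stores the given merchant number."""
    number = int(merchant_no)
    if not 1 <= number <= UPPER_BOUNDS[-1]:
        raise ValueError(f"merchant number out of range: {merchant_no}")
    return SUB_DATABASES[bisect.bisect_left(UPPER_BOUNDS, number)]


def migrate_full_data(rows: list) -> dict:
    """Group full-data rows by target sub-database using the fragmentation field."""
    placed = {name: [] for name in SUB_DATABASES}
    for row in rows:
        placed[route(row["merchant_no"])].append(row)
    return placed


placed = migrate_full_data([
    {"merchant_no": "0001", "amount": 10},
    {"merchant_no": "0501", "amount": 20},
    {"merchant_no": "1500", "amount": 30},
])
```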
In the embodiment of the present application, the non-stop migration scheme includes, in addition to the migration of the full data, the migration of update data generated during the data migration process. This update data may be written to the second database synchronously and/or asynchronously. The synchronous writing mode is a dual-write mode: the database write rule of the service system is modified so that update data generated during the data migration process is written to the first database and the second database at the same time. In the asynchronous writing mode, update data generated during the data migration process is written only to the first database, and the data synchronization component then synchronizes the update data written in the first database to the second database.
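The difference between the two writing modes can be sketched with in-memory stand-ins. Here dicts play the role of the two databases and a `deque` plays the role of the change stream read by the data synchronization component; all function names are hypothetical.

```python
from collections import deque


def write_synchronous(first_db: dict, second_db: dict, key: str, row: dict) -> None:
    """Dual write: the service writes the same row to both databases."""
    first_db[key] = row
    second_db[key] = row


def write_asynchronous(first_db: dict, pending: deque, key: str, row: dict) -> None:
    """The service writes only to the first database; the change is queued for later sync."""
    first_db[key] = row
    pending.append((key, row))


def run_sync_component(pending: deque, second_db: dict) -> int:
    """Stand-in for the data synchronization component: drain queued changes."""
    applied = 0
    while pending:
        key, row = pending.popleft()
        second_db[key] = row
        applied += 1
    return applied


first_db, second_db, pending = {}, {}, deque()
write_synchronous(first_db, second_db, "o1", {"amount": 10})
write_asynchronous(first_db, pending, "o2", {"amount": 20})
lagging = "o2" not in second_db  # True until the sync component runs
applied = run_sync_component(pending, second_db)
```

The sketch also shows the trade-off between the modes: with asynchronous writing the second database briefly lags behind the first, while with dual writing the service itself must be changed to write twice.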
As an optional implementation manner, when the service data is written into the first database, the service system sends a data update instruction to the database, where the data update instruction carries update data, and after receiving the data update instruction, the first database obtains the update data and writes the update data into the first database according to the data update instruction. Meanwhile, the data synchronization component acquires the updated data. The data synchronization component receives the update data of the first database sent by the first database in response to the data update instruction. And the data synchronization component updates the updated data to at least two sub-databases in the second database according to the data identification information of the updated data carried in the updated data. Optionally, the data identification information may be a fragmentation field. In particular, the data synchronization component can include a subscription publication component, a message middleware, and a data synchronization application. And the subscription and publishing component subscribes a data updating log of the first database, wherein the data updating log contains updating data. When the service data is written into the first database, the subscription and release component acquires a data update log of the first database, analyzes the data update log to acquire update data, and then sends the update data to the message middleware. The message middleware issues a first database update message through a message queue. The first database updating message carries updating data, the data synchronization application acquires the first database updating message, acquires the updating data by analyzing the first database updating message, and then writes the updating data into a plurality of sub-databases in the second database by the data identification information.
As an example, fig. 3 is a schematic diagram of a system architecture for migrating data to the second database according to an embodiment of the present application. As shown in fig. 3, the first database is a MySQL database, the second database is a set of MySQL sub-databases and sub-tables obtained by splitting the first database, and the second database includes multiple sub-databases. The data synchronization component comprises a second subscription publishing component, message middleware and a data synchronization application. The second subscription publishing component is a second canal and the message middleware is kafka. The second canal is configured to monitor the first database and, when it detects an update to the first database, to publish a first database update message to another topic in kafka; the data synchronization application subscribes to that topic and obtains the first database update message from it. Specifically, the MySQL database may first be configured with a binary log (binlog) file that records database operation instructions. When service data is written into the first database, the second canal obtains the binlog file and parses it to obtain the data write command recorded in it. The second canal then sends a message containing the data write command to the specified topic of kafka, so that the data synchronization application subscribed to the specified topic consumes the message from that topic. The data synchronization application obtains the data write command by consuming the message, obtains the updated service data by parsing the data write command, and writes the updated service data into one or more sub-databases of the second database according to the fragmentation field.
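The final step, writing an update into a sub-database according to the fragmentation field, can be sketched as follows. The application does not state how the fragmentation field is mapped to a sub-database; the hash-modulo rule, the field name `merchant_no` and the function names here are assumptions for illustration only.

```python
import zlib


def shard_index(field_value: str, shard_count: int) -> int:
    # Assumed routing rule: stable hash of the fragmentation field modulo
    # the number of sub-databases. crc32 is used because Python's built-in
    # hash() for strings differs between processes.
    return zlib.crc32(field_value.encode("utf-8")) % shard_count


def apply_update(sub_databases, update, fragmentation_field="merchant_no"):
    # Route the parsed update to one sub-database and store it there.
    idx = shard_index(str(update[fragmentation_field]), len(sub_databases))
    sub_databases[idx][update["id"]] = update
    return idx


# Four sub-databases, each modelled as a dict keyed by primary key.
sub_databases = [{} for _ in range(4)]
first_idx = apply_update(sub_databases, {"id": 1, "merchant_no": "M1001", "amount": 10})
second_idx = apply_update(sub_databases, {"id": 2, "merchant_no": "M1001", "amount": 25})
```

Because the routing depends only on the fragmentation field, all rows for one merchant number land in the same sub-database under this assumed rule.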
S205: receiving the sub-database update data of the target sub-database, which is sent by the target sub-database in response to the sub-database update instruction. The target sub-database is one of the at least two sub-databases.
In the embodiment of the application, before the data migration is completed, the service data generated in the service system is directly written into the first database. After the data migration is completed, the service data can be switched to the second database, that is, the service data is no longer directly written in the first database, and the service data produced by the service system is directly written in a plurality of sub-databases in the second database.
In the embodiment of the application, in order to ensure that the service system can still perform service processing normally when the second database fails, when the service data generated by the service system is written into any sub-database in the second database, the sub-database can perform synchronous backup on the newly written data to the first database, so that all data generated by the service system is always stored in the first database. When the second database fails, the service data can be switched back to the first database for reading and writing, so that normal operation of the service can be ensured.
In this embodiment, the target sub-database is any sub-database in the second database. When service data is written into any sub-database in the second database, the service system sends a sub-database update instruction carrying the sub-database update data to that sub-database. After receiving the sub-database update instruction, the target sub-database obtains the sub-database update data and writes it into the target sub-database according to the sub-database update instruction. Meanwhile, the data synchronization component acquires the sub-database update data: it receives the sub-database update data sent by the target sub-database in response to the sub-database update instruction, and updates the first database with the sub-database update data according to the data identification information carried in the sub-database update data. Optionally, the data identification information may be a fragmentation field. Specifically, the data synchronization component may include a subscription publishing component, message middleware, and a data synchronization application. The subscription publishing component subscribes to the data update logs of all sub-databases in the second database, which contain the sub-database update data. When service data is written into any sub-database, the subscription publishing component acquires the data update log of that sub-database, parses it to obtain the sub-database update data, and sends the sub-database update data to the message middleware. The message middleware publishes a sub-database update message through a message queue. The sub-database update message carries the sub-database update data; the data synchronization application acquires the sub-database update message, parses it to obtain the sub-database update data, and then writes the sub-database update data into the first database according to the data identification information.
As an example, fig. 4 is a schematic diagram of a system architecture for backing up update data of the second database to the first database according to an embodiment of the present application. As shown in fig. 4, when data in the second database is backed up to the first database, a first subscription publishing component, message middleware, and a data synchronization application may be configured in the data synchronization component. The first subscription publishing component is a first canal and the message middleware is kafka. The first canal is configured to monitor all sub-databases in the second database and, when it detects that any sub-database has been updated, to publish a sub-database update message to a preset topic of kafka; the data synchronization application subscribes to the preset topic of kafka and obtains the sub-database update message from it. Specifically, the MySQL database may first be configured with a binary log (binlog) file that records database operation instructions. When service data is written into any sub-database, the first canal obtains the binlog file and parses it to obtain the data write command recorded in it. The first canal then sends a message containing the data write command to the preset topic of kafka, so that the data synchronization application subscribed to the preset topic can consume the message from that topic. The data synchronization application obtains the data write command by consuming the message, parses the data write command to obtain the updated service data in the sub-database, and writes the updated service data into the first database.
In the embodiment of the application, the data synchronization component transmits data in both directions between the first database and the second database, so data duplication can occur: the data synchronization component migrates data from the first database to the second database, and after the migrated data is written into the second database, it is published to the data synchronization component again and backed up to the first database a second time. To avoid this, the original source of the data must be determined when the data is written to the target database.
As an optional implementation manner, when the data synchronization component synchronizes data in the first database to the second database, it checks the original source information of the data carried in the first database update message, and writes the first database update data into the sub-databases in the second database only when the original source of the first database update data is determined to be the first database. That is, at this time the second database accepts only data whose original source is the first database. When the data synchronization component synchronizes data in the second database to the first database, it checks the original source information of the data carried in the sub-database update message, and writes the sub-database update data into the first database only when the original source of the sub-database update data is determined to be the second database. That is, the first database accepts only data whose original source is the second database. Specifically, when the first database migrates data to the sub-databases in the second database, the service data generated by the service system is written only into the first database. The data synchronization application acquires a first database update message from the other topic of the message middleware and obtains the first database update data from it. The first database update data carries its source identifier, and the data synchronization application writes the first database update data into a sub-database of the second database when the source identifier indicates the first database.
When the second database performs synchronous data backup to the first database, the service data generated by the service system is written only into the second database. The data synchronization application acquires a sub-database update message from the preset topic of the message middleware and obtains the sub-database update data from it. The sub-database update data carries its source identifier, and the data synchronization application writes the sub-database update data into the first database when the source identifier indicates the second database.
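The source check described above can be sketched as a simple filter. The message layout (`source`, `id`, `row` keys) and the constant names are assumptions made for this illustration; the application only states that the update data carries a source identifier.

```python
FIRST_DB = "first_database"    # hypothetical source identifier values
SECOND_DB = "second_database"


def should_apply(message: dict, expected_source: str) -> bool:
    # Apply an update only if its original source is the database on the
    # other side of the synchronization direction.
    return message.get("source") == expected_source


def sync_to_second(messages, second_db: dict) -> None:
    # First database -> second database: accept only data that originated
    # in the first database.
    for msg in messages:
        if should_apply(msg, FIRST_DB):
            second_db[msg["id"]] = msg["row"]


def sync_to_first(messages, first_db: dict) -> None:
    # Second database -> first database: accept only data that originated
    # in the second database, so migrated rows are not echoed back.
    for msg in messages:
        if should_apply(msg, SECOND_DB):
            first_db[msg["id"]] = msg["row"]


first_db, second_db = {}, {}
sub_database_messages = [
    {"id": 1, "source": FIRST_DB, "row": {"amount": 10}},   # echo of migrated data
    {"id": 2, "source": SECOND_DB, "row": {"amount": 20}},  # genuine new write
]
sync_to_first(sub_database_messages, first_db)
```

In this example only the row with `id` 2 is backed up to the first database; the row that originally came from the first database is skipped, which breaks the loop.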
In the embodiment of the present application, in some cases, for example when the service system switches databases, the same piece of data may be written into the first database and the second database at the same time. To handle this, data writing also needs to be controlled at the service logic level when data is written into the target database: first determine whether the same data already exists in the target database, and if so, compare the last update times of the two pieces of data.
As an optional implementation manner, when the data synchronization component synchronizes data in the first database to the second database, it determines whether the second database contains data whose similarity with the first database update data is greater than the similarity threshold. If such data exists in the second database, the last update time of the first database update data and the last update time of the similar data are determined. If the first database update data is newer, it is written into the sub-database of the second database. If the similar data in the second database is newer, the first database update data is discarded. When the data synchronization component synchronizes data in the second database to the first database, it determines whether the first database contains target data whose similarity with the sub-database update data is greater than the similarity threshold. If the target data exists in the first database, a first update time of the sub-database update data and a second update time of the target data are determined. If the first update time is later than the second update time, the sub-database update data is written into the first database. If the first update time is earlier than the second update time, the sub-database update data is discarded.
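The update-time comparison can be sketched as below. Two simplifications are assumed and are not taken from the application: a match on the primary key `id` stands in for the similarity-threshold test, and the field name `updated_at` is hypothetical. The text does not say what happens when the two times are equal; this sketch overwrites in that case.

```python
def resolve_write(target_db: dict, update: dict) -> bool:
    """Write `update` into `target_db` unless the existing row is newer.

    Returns True if the update was written, False if it was discarded.
    """
    existing = target_db.get(update["id"])  # stand-in for the similarity test
    if existing is not None and update["updated_at"] < existing["updated_at"]:
        return False  # the target already holds newer data: discard
    # No matching row, or the incoming update is at least as new.
    target_db[update["id"]] = update
    return True


first_db = {7: {"id": 7, "status": "PAID", "updated_at": 200}}

stale = {"id": 7, "status": "PENDING", "updated_at": 150}
fresh = {"id": 7, "status": "REFUNDED", "updated_at": 300}

stale_written = resolve_write(first_db, stale)
fresh_written = resolve_write(first_db, fresh)
```

The stale update is dropped and the fresh one replaces the stored row, so the later write wins regardless of the order in which the messages arrive.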
S207: updating the data in the first database according to the sub-database update data.
In the embodiment of the application, once the data synchronization component determines that the sub-database update data can be written into the first database, it updates the data in the first database according to the sub-database update data. Specifically, the sub-database update data carries data processing type information. The data processing type may be deletion or modification of existing data, or addition of new data. If the sub-database update data is deletion and modification data, the carried data processing type information is deletion or modification operation information for existing data; the data synchronization component determines, according to the data processing type information, the target update data in the first database corresponding to the deletion and modification data, and then updates the target update data according to the deletion and modification data. If the sub-database update data is newly added data, the carried data processing type information is operation information for newly added data, and the data synchronization component writes the newly added data to the first database.
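Dispatching on the data processing type can be sketched as follows. The operation labels `insert`, `update` and `delete` and the message keys `op`, `id` and `row` are assumptions for this illustration; the application only says that the update data carries data processing type information.

```python
def apply_sub_database_update(first_db: dict, update: dict) -> None:
    op = update["op"]   # hypothetical data processing type field
    key = update["id"]
    if op == "insert":
        # Newly added data is written to the first database as-is.
        first_db[key] = dict(update["row"])
    elif op == "update":
        # Locate the target row and overlay the modified columns.
        first_db[key] = {**first_db.get(key, {}), **update["row"]}
    elif op == "delete":
        # Remove the target row; a missing row is ignored.
        first_db.pop(key, None)
    else:
        raise ValueError(f"unknown data processing type: {op!r}")


first_db = {}
apply_sub_database_update(first_db, {"op": "insert", "id": 1, "row": {"amount": 10, "status": "NEW"}})
apply_sub_database_update(first_db, {"op": "update", "id": 1, "row": {"status": "PAID"}})
after_update = dict(first_db[1])
apply_sub_database_update(first_db, {"op": "delete", "id": 1})
```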
In some optional embodiments, after the data in the first database is migrated to the second database, or after the data in the second database is backed up to the first database, a consistency check may be performed on the data in the two databases to ensure that they are consistent. Optionally, the consistency check may be a scheduled check performed on the two databases at preset time intervals, or it may be performed after data is written into the target database. To improve the efficiency of the consistency check, optionally, only key data information of the two databases is checked; for example, the total number of data records, the state of the data tables, and the total transaction amount may be compared. If the key data information of the two databases is consistent, the data in the two databases is regarded as consistent and the consistency check passes. If the key data information of the two databases is inconsistent, the data in the two databases is inconsistent and the consistency check fails; in this case the difference data between the first database and the second database needs to be determined, and the data is then processed according to the difference data so that the two databases remain consistent. Specifically, if the first database contains data that is not present in the second database, the data is synchronized to the second database by the data synchronization component. If the second database contains data that is not present in the first database, the data is synchronized to the first database by the data synchronization component.
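The check-then-repair flow can be sketched as follows, assuming each row has an integer `amount` and that row count plus total amount serve as the key data information. The function names and the `route` callback (which picks a sub-database for a key) are hypothetical.

```python
def key_metrics(rows: dict) -> tuple:
    # Cheap summary used for the check: (row count, total transaction amount).
    return len(rows), sum(row["amount"] for row in rows.values())


def merge_sub_databases(sub_databases) -> dict:
    merged = {}
    for sub_db in sub_databases:
        merged.update(sub_db)
    return merged


def check_and_repair(first_db: dict, sub_databases, route) -> bool:
    """Return True if the check passed; otherwise copy missing rows both ways."""
    second_rows = merge_sub_databases(sub_databases)
    if key_metrics(first_db) == key_metrics(second_rows):
        return True
    # Difference data: rows present on one side only.
    for key in set(first_db) - set(second_rows):
        sub_databases[route(key)][key] = first_db[key]
    for key in set(second_rows) - set(first_db):
        first_db[key] = second_rows[key]
    return False


first_db = {"a": {"amount": 5}, "b": {"amount": 7}}
sub_databases = [{"a": {"amount": 5}}, {"c": {"amount": 3}}]
route = lambda key: 0 if key < "b" else 1  # hypothetical routing rule

first_pass = check_and_repair(first_db, sub_databases, route)
second_pass = check_and_repair(first_db, sub_databases, route)
```

Comparing summaries is fast but approximate: two databases with different rows can still share the same count and total, and this sketch does not reconcile rows that exist on both sides with different contents.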
In the embodiment of the application, in order to reduce the probability of a failure of the second database and ensure that the services of the service system run normally, only part of the service data may be written directly into the second database during its initial operation stage, so as to verify the performance of the second database. For example, during the initial operation stage of the second database, the service data of some merchant numbers may be selected as verification service data; that is, the service data of the verification merchant numbers is read and written directly in the second database, while the service data of the remaining merchant numbers is still read and written in the first database. It should be noted that, during this period, the verification service data is synchronously backed up to the first database through the data synchronization component, and the service data generated by the other merchant numbers is also migrated to the second database through the data synchronization component. After the performance verification of the second database is completed, all service data generated by the service system can be switched to the second database for reading and writing.
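The routing rule for the verification stage can be sketched as follows. The merchant numbers, the `cutover_complete` flag and the function name are hypothetical values chosen for the example.

```python
VERIFICATION_MERCHANTS = {"M1001", "M1002"}  # hypothetical verification merchant numbers


def choose_database(merchant_no: str,
                    verification_merchants=VERIFICATION_MERCHANTS,
                    cutover_complete: bool = False) -> str:
    # During the verification stage only the selected merchant numbers are
    # read and written in the second database; everything else stays on the
    # first database. After verification, all traffic uses the second.
    if cutover_complete or merchant_no in verification_merchants:
        return "second"
    return "first"


during_verification = [choose_database(m) for m in ("M1001", "M2001")]
after_cutover = [choose_database(m, cutover_complete=True) for m in ("M1001", "M2001")]
```

Whichever database handles a write, the other side is still kept up to date by the data synchronization component, so the routing decision can be changed without losing data.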
In the embodiment of the application, when the second database fails, the service data generated by the service system can be switched to the first database for reading and writing so that normal operation of the service is not affected. The second database can then be repaired. After the repair is complete, the update data written into the first database during the failure is synchronized to the second database, and the service data generated by the service system is then switched back to the second database for reading and writing. Specifically, when the second database is in a preset state, that is, a fault-repair-completed state, the service system sends a data synchronization instruction to the first database, and the data synchronization component receives the target write data sent by the first database in response to the data synchronization instruction. The target write data is the service data written into the first database while the second database was in a non-preset state, that is, a fault state. The target write data carries data identification information of the target write data. Optionally, the data identification information may be a fragmentation field. The data synchronization component then updates the data in the at least two sub-databases according to the target write data and its data identification information.
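The failover and switch-back sequence can be sketched as follows. The class `FailoverRouter` and its method names are hypothetical, the second database is modelled as a single dict rather than several sub-databases, and the normal-path backup of second-database writes to the first database is omitted for brevity.

```python
class FailoverRouter:
    """Routes writes to the second database, falling back to the first."""

    def __init__(self, first_db: dict, second_db: dict):
        self.first_db = first_db
        self.second_db = second_db
        self.second_healthy = True
        self.written_during_outage = []  # keys of the target write data

    def write(self, key, row) -> None:
        if self.second_healthy:
            self.second_db[key] = row
        else:
            # Fault state: the service reads and writes the first database.
            self.first_db[key] = row
            self.written_during_outage.append(key)

    def mark_failed(self) -> None:
        self.second_healthy = False

    def mark_repaired(self) -> None:
        # Fault-repair-completed state: replay the data written to the first
        # database during the outage, then switch traffic back.
        for key in self.written_during_outage:
            self.second_db[key] = self.first_db[key]
        self.written_during_outage.clear()
        self.second_healthy = True


first_db, second_db = {}, {}
router = FailoverRouter(first_db, second_db)
router.write("t1", {"amount": 1})
router.mark_failed()
router.write("t2", {"amount": 2})
missing_during_outage = "t2" not in second_db
router.mark_repaired()
router.write("t3", {"amount": 3})
```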
The data synchronization method uses the existing single database for data backup, so no new backup database needs to be built and the database development workload is reduced. Because the single database already has a corresponding connection configuration to the service system, no new connection configuration needs to be designed when the service system is switched back to the single database, which saves manpower and material resources. In addition, the data processing method described in the embodiment of the present application writes the data written in the second database into the first database using an asynchronous writing scheme, which relaxes the real-time requirement of data synchronization and saves resources.
An embodiment of the present application further discloses a data processing apparatus, and fig. 5 is a schematic structural diagram of the data processing apparatus provided in the embodiment of the present application, and as shown in fig. 5, the apparatus includes:
A full data obtaining module 501, configured to obtain full data in the first database, where the full data carries data identification information of the full data.
A full data writing module 503, configured to write the full data into at least two sub-databases in the second database according to the data identification information of the full data.
A sub-database update data receiving module 505, configured to receive sub-database update data of the target sub-database, which is sent by the target sub-database in response to the sub-database update instruction. The target sub-database is one of the at least two sub-databases.
A first database data updating module 507, configured to update data in the first database according to the sub-database update data.
In some optional embodiments, the sub-database update data receiving module comprises:
A sub-database update message receiving unit, configured to receive the sub-database update message from the preset topic of the message middleware. The sub-database update message is sent to the preset topic of the message middleware by the target sub-database, based on the first subscription publishing component, in response to the sub-database update instruction.
A sub-database update data acquisition unit, configured to acquire the sub-database update data based on the sub-database update message.
In some optional embodiments, the sub-database update message carries information about the original source of the data. The first database data update module comprises:
A data original source information determining unit, configured to update the data in the first database according to the sub-database update data when the data original source information is determined to be the preset source information. The preset source information indicates that the original source of the sub-database update data is the second database.
In some optional embodiments, the apparatus further comprises:
A target data determining module, configured to determine a first update time of the sub-database update data and a second update time of the target data when the target data exists in the first database. The target data is data in the first database whose similarity with the sub-database update data is greater than a similarity threshold.
An update time determining module, configured to discard the sub-database update data when the first update time is earlier than the second update time.
In some optional embodiments, the sub-database update data carries data processing type information. The sub-database update data includes deletion and modification data. The first database data update module comprises:
A target update data determining unit, configured to determine, according to the data processing type information, target update data in the first database corresponding to the deletion and modification data.
A data updating unit, configured to update the target update data according to the deletion and modification data.
In some optional embodiments, the apparatus further comprises:
A consistency checking module, configured to perform a consistency check on the data in the first database and the data in the second database.
A difference data determining module, configured to determine the difference data between the first database and the second database when the check fails.
A data processing module, configured to process the data in the first database according to the difference data.
In some optional embodiments, the apparatus further comprises:
A target write data receiving module, configured to receive target write data sent by the first database in response to the data synchronization instruction when the second database is in a preset state. The target write data is data written into the first database while the second database was in a non-preset state. The target write data carries data identification information of the target write data.
A sub-database updating module, configured to update the data in the at least two sub-databases based on the target write data and the data identification information of the target write data.
The apparatus and the data processing method in the embodiments of the application are based on the same inventive concept. For the implementation of the apparatus, please refer to the detailed implementation of the method, which is not repeated here.
The embodiment of the application also discloses an electronic device, which comprises a processor and a memory, wherein at least one instruction or at least one program is stored in the memory, and the at least one instruction or the at least one program is loaded by the processor and executes the data processing method.
The method provided by the embodiment of the application can be executed in a mobile terminal, a computer terminal, a server or a similar computing device. Taking running on a server as an example, fig. 6 is a hardware structure block diagram of the server for the data processing method provided in the embodiment of the present application. As shown in fig. 6, the server 600 may vary considerably depending on configuration or performance, and may include one or more Central Processing Units (CPUs) 610 (the processors 610 may include but are not limited to processing devices such as a microprocessor (MCU) or a programmable logic device such as an FPGA), a memory 630 for storing data, and one or more storage media 620 (e.g., one or more mass storage devices) for storing applications 623 or data 622. The memory 630 and the storage medium 620 may be transient or persistent storage. The program stored on the storage medium 620 may include one or more modules, each of which may include a series of instruction operations for the server. Still further, the central processor 610 may be configured to communicate with the storage medium 620 to execute the series of instruction operations in the storage medium 620 on the server 600. The server 600 may also include one or more power supplies 660, one or more wired or wireless network interfaces 650, one or more input-output interfaces 640, and/or one or more operating systems 621, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so forth.
The input/output interface 640 may be used to receive or transmit data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the server 600. In one example, the input/output interface 640 includes a Network Interface Controller (NIC) that can be connected to other network devices through a base station so as to communicate with the internet. In one example, the input/output interface 640 may be a Radio Frequency (RF) module, which is used for communicating with the internet in a wireless manner.
It will be understood by those skilled in the art that the structure shown in fig. 6 is only an illustration and is not intended to limit the structure of the electronic device. For example, server 600 may also include more or fewer components than shown in FIG. 6, or have a different configuration than shown in FIG. 6.
The embodiment of the application also discloses a computer-readable storage medium, in which at least one instruction or at least one program is stored, and the at least one instruction or the at least one program is loaded by a processor and executed to implement the data processing method as described above.
In an embodiment of the present application, the computer storage medium may be located in at least one network server of a plurality of network servers of a computer network. Optionally, the computer-readable storage medium may include: a Read Only Memory (ROM), a Random Access Memory (RAM), a Solid State Drive (SSD), or an optical disc, etc. The random access memory may include a resistive random access memory (ReRAM) and a Dynamic Random Access Memory (DRAM).
It should be noted that: the sequence of the embodiments of the present application is only for description, and does not represent the advantages and disadvantages of the embodiments. And specific embodiments thereof have been described above. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (10)

1. A method of data processing, the method comprising:
acquiring full data in a first database, wherein the full data carries data identification information of the full data;
writing the full data into at least two sub databases in a second database according to the data identification information of the full data;
receiving sub-database updating data of a target sub-database sent by the target sub-database in response to a sub-database updating instruction; the target sub-database is a database in the at least two sub-databases;
and updating the data in the first database according to the update data of the sub-database.
2. The method of claim 1, wherein receiving sub-database update data sent by the target sub-database in response to a sub-database update instruction comprises:
receiving a sub-database update message from a preset topic of the message middleware; the sub-database update message is sent to the preset topic of the message middleware by the target sub-database, based on the first subscription publishing component, in response to the sub-database update instruction;
and acquiring the update data of the sub-database based on the update message of the sub-database.
3. The method of claim 2, wherein the sub-database update message carries information of original source of data; the updating the data in the first database according to the update data of the sub-databases includes:
under the condition that the original source information of the data is determined to be preset source information, updating the data in the first database according to the updated data of the sub-database; the preset source information is used for indicating that the original source of the update data of the sub-database is the second database.
4. The method of claim 1, further comprising:
under the condition that the target data exists in the first database, determining a first updating time of updating data of the sub-databases and a second updating time of the target data; the target data is data of which the similarity with the update data of the sub-database is greater than a similarity threshold value in the first database;
and in the case that the first updating time is earlier than the second updating time, not executing the operation of updating the data in the first database according to the update data of the sub-database.
5. The method of claim 1, wherein the sub-database update data carries data processing type information; the sub-database update data comprises deletion and modification data; the updating the data in the first database according to the sub-database update data comprises:
determining, according to the data processing type information, target update data in the first database corresponding to the deletion and modification data;
and updating the target update data according to the deletion and modification data.
6. The method of claim 1, wherein after said updating the data in the first database in accordance with the sub-database update data, the method further comprises:
performing consistency check on the data in the first database and the data in the second database;
determining difference data of the first database and the second database under the condition that the verification is not passed;
and processing the data in the first database according to the difference data.
7. The method of claim 1, further comprising:
in the case that the second database is in a preset state, receiving target write data sent by the first database in response to a data synchronization instruction; the target write data is data written into the first database while the second database is in a non-preset state; the target write data carries data identification information of the target write data;
and updating the data in the at least two sub-databases based on the data identification information of the target write data and the target write data.
8. A data processing apparatus, characterized in that the apparatus comprises:
a full data acquisition module, configured to acquire full data in a first database, wherein the full data carries data identification information of the full data;
a full data writing module, configured to write the full data into at least two sub databases in a second database according to the data identification information of the full data;
a sub-database update data receiving module, configured to receive sub-database update data of a target sub-database, sent by the target sub-database in response to a sub-database update instruction; the target sub-database is one of the at least two sub-databases;
and a first database data updating module, configured to update the data in the first database according to the sub-database update data.
9. An electronic device, characterized in that the device comprises a processor and a memory, in which at least one instruction or at least one program is stored, which is loaded and executed by the processor to implement the data processing method according to any one of claims 1 to 7.
10. A computer-readable storage medium, in which at least one instruction or at least one program is stored, which is loaded and executed by a processor to implement the data processing method according to any one of claims 1 to 7.
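The checks recited in claims 3 to 5 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the names `UpdateMessage`, `Row`, `apply_update` and `PRESET_SOURCE` are hypothetical, the first database is modelled as an in-memory dictionary, and matching by key stands in for the similarity test of claim 4. The sketch only shows the order in which an incoming sub-database update might be filtered by source, compared by update time, and then applied according to its data processing type.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical marker meaning "this change originated in the second database".
PRESET_SOURCE = "second_db"


@dataclass
class UpdateMessage:
    key: str              # data identification information (assumed to be a primary key)
    op: str               # data processing type: "modify" or "delete"
    value: Optional[str]  # new value for "modify", None for "delete"
    updated_at: float     # first update time (when the sub-database changed the data)
    source: str           # original data source information


@dataclass
class Row:
    value: str
    updated_at: float     # second update time (last change in the first database)


def apply_update(first_db: Dict[str, Row], msg: UpdateMessage) -> str:
    """Apply one sub-database update to the first database and report the outcome."""
    # Claim 3: only accept changes whose original source is the second database
    # (presumably so that data replicated from the first database is not echoed back).
    if msg.source != PRESET_SOURCE:
        return "ignored: foreign source"

    # Claim 4: skip the update if the first database already holds newer data.
    # Assumption: an exact key match replaces the similarity threshold of the claim.
    target = first_db.get(msg.key)
    if target is not None and msg.updated_at < target.updated_at:
        return "skipped: stale"

    # Claim 5: dispatch on the data processing type.
    if msg.op == "delete":
        first_db.pop(msg.key, None)
        return "deleted"
    if msg.op == "modify":
        first_db[msg.key] = Row(msg.value or "", msg.updated_at)
        return "modified"
    return "ignored: unknown type"


if __name__ == "__main__":
    db = {"order-1": Row("paid", updated_at=100.0)}
    # Older than the stored data, so it is not applied.
    print(apply_update(db, UpdateMessage("order-1", "modify", "refunded", 90.0, PRESET_SOURCE)))
    # Newer than the stored data, so it overwrites the stored value.
    print(apply_update(db, UpdateMessage("order-1", "modify", "refunded", 120.0, PRESET_SOURCE)))
```

Performing the source check first means a message that did not originate in the second database is discarded before any lookup in the first database; this ordering is a design choice of the sketch and is not stated in the claims.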
CN202210572211.7A 2022-05-24 2022-05-24 Data processing method, device, equipment and storage medium Pending CN114969206A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210572211.7A CN114969206A (en) 2022-05-24 2022-05-24 Data processing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210572211.7A CN114969206A (en) 2022-05-24 2022-05-24 Data processing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114969206A (en) 2022-08-30

Family

ID=82955155

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210572211.7A Pending CN114969206A (en) 2022-05-24 2022-05-24 Data processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114969206A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115470302A (en) * 2022-10-25 2022-12-13 以萨技术股份有限公司 Database bidirectional synchronization method, medium and equipment based on canal
CN115470302B (en) * 2022-10-25 2023-05-09 以萨技术股份有限公司 Database bidirectional synchronization method, medium and equipment based on canal

Similar Documents

Publication Publication Date Title
CN108170768B (en) Database synchronization method, device and readable medium
CN109683826B (en) Capacity expansion method and device for distributed storage system
CN109408115B (en) Method and computing system for migrating objects in container-based environment
US8954545B2 (en) Fast determination of compatibility of virtual machines and hosts
CN106815218B (en) Database access method and device and database system
US20060047776A1 (en) Automated failover in a cluster of geographically dispersed server nodes using data replication over a long distance communication link
US10747465B2 (en) Preserving replication to a storage object on a storage node
US11032156B1 (en) Crash-consistent multi-volume backup generation
CN109561151B (en) Data storage method, device, server and storage medium
US9984139B1 (en) Publish session framework for datastore operation records
CN110532123B (en) Fault transfer method and device of HBase system
CN111324606B (en) Data slicing method and device
CN111966631A (en) Mirror image file generation method, system, equipment and medium capable of being rapidly distributed
CN111338834B (en) Data storage method and device
CN112579550B (en) Metadata information synchronization method and system of distributed file system
CN111651302A (en) Distributed database backup method, device and system
CN112230853A (en) Storage capacity adjusting method, device, equipment and storage medium
CN114969206A (en) Data processing method, device, equipment and storage medium
EP3264254B1 (en) System and method for a simulation of a block storage system on an object storage system
CN112000850B (en) Method, device, system and equipment for processing data
CN112506432A (en) Dynamic and static separated real-time data storage and management method and device for electric power automation system
CN111694801A (en) Data deduplication method and device applied to fault recovery
CN115587141A (en) Database synchronization method and device
CN113986878A (en) Data writing method, data migration device and electronic equipment
CN113157811A (en) Data synchronization method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination