CN117131131A - Cross-machine-room data synchronization method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN117131131A
Authority
CN
China
Prior art keywords
information
room
data synchronization
host
computer room
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311030897.8A
Other languages
Chinese (zh)
Inventor
罗海红
张高斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Iflytek Huazhong Wuhan Co ltd
Original Assignee
Iflytek Huazhong Wuhan Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Iflytek Huazhong Wuhan Co ltd filed Critical Iflytek Huazhong Wuhan Co ltd
Priority to CN202311030897.8A priority Critical patent/CN117131131A/en
Publication of CN117131131A publication Critical patent/CN117131131A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/235 Update request formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/54 Indexing scheme relating to G06F9/54
    • G06F2209/548 Queue
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02 Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Abstract

The invention provides a cross-machine-room data synchronization method and device, an electronic device and a storage medium, and relates to the technical field of computers. The method comprises the following steps: when a change in target log information stored by a main-room device deployed in a main machine room is detected, receiving incremental information of the target log information sent by the main-room device, wherein the target log information comprises database operation request information acquired by the main-room device; analyzing whether the incremental information comprises database update request information; and if the incremental information comprises database update request information, sending the database update request information to a slave-room cluster, so that the slave-room cluster achieves data synchronization with the main machine room based on the database update request information, wherein the slave-room cluster is deployed in a slave machine room associated with the main machine room. The invention can improve the timeliness of data synchronization and can avoid data loss when the main machine room fails.

Description

Cross-machine-room data synchronization method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a method and apparatus for synchronizing data across a machine room, an electronic device, and a storage medium.
Background
With the rapid development of computer technology, data required for various application systems is increasing, and thus, a large amount of data needs to be stored to provide stable uninterrupted data services. In order to ensure the stability and data security of the application system, multiple machine rooms are required to be deployed, and then the multiple machine rooms are required to be subjected to data synchronization, namely, the data synchronization across the machine rooms is required to be realized.
Currently, to achieve high availability, most high-availability clusters are deployed across machine rooms. For example, when Doris clusters are deployed across machine rooms, cross-machine-room data synchronization is achieved by periodic off-site backup: the Doris cluster deployed in the main machine room periodically backs up data to a remote warehouse, and the Doris cluster deployed in the slave machine room restores the data from the remote warehouse to its local cluster. However, periodic off-site backup has poor timeliness; if the main machine room fails, the data for a period of time is lost, resulting in poor data security.
Disclosure of Invention
The invention provides a cross-machine-room data synchronization method and device, an electronic device and a storage medium, to overcome the poor timeliness and low data security of cross-machine-room data synchronization in the prior art.
The invention provides a cross-machine-room data synchronization method, applied to a data synchronization device, comprising:
when a change in target log information stored by a main-room device deployed in a main machine room is detected, receiving incremental information of the target log information sent by the main-room device, wherein the target log information comprises database operation request information acquired by the main-room device;
analyzing whether the incremental information comprises database update request information;
and if the incremental information comprises database update request information, sending the database update request information to a slave-room cluster, so that the slave-room cluster achieves data synchronization with the main machine room based on the database update request information, wherein the slave-room cluster is deployed in a slave machine room associated with the main machine room.
According to the cross-machine-room data synchronization method provided by the invention, analyzing whether the incremental information comprises database update request information comprises:
sending the incremental information to a message queue;
monitoring the message queue through a message consuming program, so that the incremental information is delivered from the message queue to the message consuming program;
and analyzing, through the message consuming program, whether the incremental information comprises database update request information.
According to the cross-machine-room data synchronization method provided by the invention, after analyzing, through the message consuming program, whether the incremental information comprises database update request information, the method further comprises:
if the incremental information does not comprise database update request information, discarding the incremental information.
According to the cross-machine-room data synchronization method provided by the invention, a Doris cluster is deployed in the main machine room, the Doris cluster comprises at least one front-end node (FE) device, and the main-room device is an FE device;
before receiving the incremental information of the target log information sent by the main-room device when a change in the target log information stored by the main-room device deployed in the main machine room is detected, the method further comprises:
monitoring the log information stored by each FE device in the main machine room.
The invention also provides a cross-machine-room data synchronization method, applied to a main-room device deployed in a main machine room, comprising:
receiving a database operation request and acquiring database operation request information based on the database operation request;
in response to the database operation request, writing the database operation request information into target log information stored by the main-room device;
and sending the newly written incremental information in the target log information to a data synchronization device, so that the data synchronization device achieves data synchronization between the main machine room and a slave machine room based on the incremental information, wherein the slave machine room is a machine room associated with the main machine room.
According to the cross-machine-room data synchronization method provided by the invention, sending the newly written incremental information in the target log information to the data synchronization device comprises:
receiving an information reading instruction sent by the data synchronization device;
acquiring the incremental information based on the information reading instruction;
and sending the incremental information to the data synchronization device.
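The main-room-device steps above can be illustrated with a minimal, non-limiting sketch. It assumes an in-memory log and assumes that the information reading instruction carries the index of the last record the data synchronization device has already received; the class and method names are hypothetical and do not appear in the disclosure.

```python
class MainRoomDevice:
    """Hypothetical stand-in for a main-room device that logs every request."""

    def __init__(self) -> None:
        # Target log information: one entry per database operation request.
        self._log: list[str] = []

    def handle_request(self, sql: str) -> None:
        # Respond to the request by writing its information into the log.
        self._log.append(sql)

    def handle_read_instruction(self, last_index: int) -> list[str]:
        # Assumption: the instruction names how many records were already sent,
        # so only the newly written records (the increment) are returned.
        return self._log[last_index:]
```

Returning only the slice after `last_index` keeps the exchange limited to newly written records rather than the complete log.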
The invention also provides a cross-machine-room data synchronization apparatus, deployed in a data synchronization device, comprising:
an information receiving module, configured to receive incremental information of target log information sent by a main-room device when a change in the target log information stored by the main-room device deployed in a main machine room is detected, wherein the target log information comprises database operation request information acquired by the main-room device;
an information analysis module, configured to analyze whether the incremental information comprises database update request information;
and a request sending module, configured to send the database update request information to a slave-room cluster if the incremental information comprises database update request information, so that the slave-room cluster achieves data synchronization with the main machine room based on the database update request information, wherein the slave-room cluster is deployed in a slave machine room associated with the main machine room.
The invention also provides a cross-machine-room data synchronization apparatus, deployed in a main-room device, wherein the main-room device is deployed in a main machine room, the apparatus comprising:
a request receiving module, configured to receive a database operation request and acquire database operation request information based on the database operation request;
an information writing module, configured to write, in response to the database operation request, the database operation request information into target log information stored by the main-room device;
and an information sending module, configured to send the newly written incremental information in the target log information to a data synchronization device, so that the data synchronization device achieves data synchronization between the main machine room and a slave machine room based on the incremental information, wherein the slave machine room is a machine room associated with the main machine room.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the data synchronization method across the machine room when executing the program.
The invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method of data synchronization across a machine room as described in any of the above.
According to the cross-machine-room data synchronization method and device, electronic device and storage medium provided by the invention, when the data synchronization device detects a change in the target log information stored by the main-room device deployed in the main machine room, it promptly receives the incremental information of the target log information sent by the main-room device. When the incremental information comprises database update request information, that information is promptly sent to the slave-room cluster, and the slave-room cluster promptly achieves data synchronization with the main machine room based on it. This improves the timeliness of data synchronization; because synchronization is performed promptly, data loss caused by a failure of the main machine room can be avoided, which improves data security. Meanwhile, the data exchanged among the main-room device, the data synchronization device and the slave-room cluster is limited to a small amount of incremental information, and neither the complete log information nor the complete stored data needs to be transmitted. This further improves the timeliness of data synchronization and makes it less sensitive to network delay, transmission bandwidth and packet loss rate, which in turn improves the reliability of data synchronization and the stability, performance and reliability of the system.
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
Fig. 1 is a first schematic flow chart of the cross-machine-room data synchronization method provided by the invention;
Fig. 2 is a second schematic flow chart of the cross-machine-room data synchronization method provided by the invention;
Fig. 3 is a third schematic flow chart of the cross-machine-room data synchronization method provided by the invention;
Fig. 4 is a schematic structural diagram of the cross-machine-room data synchronization system provided by the invention;
Fig. 5 is a first schematic structural diagram of the cross-machine-room data synchronization apparatus provided by the invention;
Fig. 6 is a second schematic structural diagram of the cross-machine-room data synchronization apparatus provided by the invention;
Fig. 7 is a schematic structural diagram of the electronic device provided by the invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
With the rapid development of computer technology, the internet and digitization, the data required by various application systems keeps growing, so a large amount of data must be stored to provide stable, uninterrupted data services. To ensure the stability and data security of an application system, the data needs to be deployed in multiple machine rooms; that is, the system must support deployment, backup, and operation and maintenance across multiple data centers, achieving multi-active operation and disaster recovery within the same city or across regions. Data must therefore be synchronized among the machine rooms, i.e., cross-machine-room data synchronization is required, so as to fully ensure the stability, reliability and data security of the system.
Currently, to achieve high availability, most high-availability clusters are deployed across machine rooms. Because Doris itself supports distributed high-availability deployment, a Doris cluster can be deployed across machine rooms: a high-availability Doris cluster is deployed in both the main machine room and the slave machine room (standby machine room), and cross-machine-room data synchronization is achieved by periodic off-site backup. That is, the Doris cluster deployed in the main machine room periodically backs up data to a remote warehouse, such as an HDFS (Hadoop Distributed File System) or S3 remote warehouse, and the Doris cluster deployed in the slave machine room restores the data from the remote warehouse to its local cluster. However, periodic off-site backup requires a time interval to be set, i.e., backup and restoration are performed on a schedule, for example hourly or daily, so timeliness is poor. If the main machine room fails, the data that has not yet been backed up is lost, so data security is low, and the approach is suitable only for scenarios with low requirements on data.
In addition, data can currently also be backed up and synchronized directly across machine rooms. For example, when a Doris cluster is deployed across machine rooms, the BE (back-end node) devices in the main machine room synchronize data directly with the BE devices in the slave machine room, i.e., a BE device in the main machine room backs up its data directly to a BE device in the slave machine room. Specifically, in this scheme the FE (front-end node) high-availability cluster adopts a master-slave replication architecture, which effectively avoids FE single points of failure, and the BEs manage multiple replicas of each tablet, which effectively avoids data loss at a single BE. However, transmitting data directly across machine rooms is affected by network delay, transmission bandwidth, packet loss rate and similar problems between the machine rooms, which greatly affects the stability and performance of the Doris cluster and in turn the reliability of the system; the impact is greater under highly concurrent writes. Moreover, considering that Doris is positioned as an MPP (Massively Parallel Processing) database for high-performance ad hoc queries, this scheme cannot serve as a generally reliable approach.
In view of the above problems, the invention proposes the following embodiments. Fig. 1 is a first schematic flow chart of the cross-machine-room data synchronization method provided by the invention. As shown in Fig. 1, the cross-machine-room data synchronization method, applied to a data synchronization device, comprises:
Step 110, when a change in target log information stored by a main-room device deployed in a main machine room is detected, receiving incremental information of the target log information sent by the main-room device.
The target log information comprises database operation request information acquired by the main-room device.
Here, the main machine room is the primary machine room in which data is stored. At least one device is deployed in the main machine room for acquiring database operation requests and database operation request information and for storing data. A high-availability cluster may be deployed in the main machine room to improve system reliability and data security; for example, a Doris cluster comprising at least one FE device and at least one BE device is deployed in the main machine room. The FE node and the BE node may be deployed on the same device or on different devices. BE devices store physical data and execute database operations in a distributed manner according to the physical plan generated by the FE node; multiple BE nodes may be deployed on one device, and the BEs may store multiple copies of the entire data.
Here, the main-room device is configured to receive database operation requests and to acquire database operation request information. It is also used to store and maintain cluster metadata, parse database operation requests, plan queries, schedule query execution, return query results, and so on. For example, the main-room device is an FE device in a Doris cluster. An FE may take one of three roles: Follower, Leader or Observer. A Follower node is a task scheduling node that synchronizes the tasks of the Leader node, takes part in data synchronization with the other Follower nodes, and serves as a backup of the Leader node. The Leader node is the master node elected from among the Follower nodes. An Observer node is a read-only node that only synchronizes metadata from the Leader node and serves to scale out query nodes, so that query capacity can be expanded when the cluster is under heavy load; an Observer node takes part only in reads, never in writes.
Here, the target log information records the database operation request information acquired by the main-room device. The target log information may be a log file. The target log information comprises at least one record, and each record may include, but is not limited to, at least one of: user name, database name, table name, operation type, operation time, the SQL statement of the operation, client IP address, etc.
Here, the database operation request information is the detailed (related) information of a database operation request. For example, the database operation request is an SQL (Structured Query Language) request, and the database operation request information is the detailed information of that SQL request. Database operation requests include database update requests, database query requests, and the like; a database update request is used to update data in the database, and a database query request is used to query data in the database.
The database operation request information may include, but is not limited to, at least one of: user name, database name, table name, operation type, operation time, the SQL statement of the operation, client IP address, etc.
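By way of a non-limiting illustration, one record of the target log information could be modelled as follows. The sketch assumes each record is stored as one JSON object per line; that format and the field names `user`, `database`, `table`, `operation_type`, `operation_time`, `sql` and `client_ip` are hypothetical and are not specified by the disclosure.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationRecord:
    """One record of database operation request information (hypothetical layout)."""

    user: str
    database: str
    table: str
    operation_type: str
    operation_time: str
    sql: str
    client_ip: str


def parse_record(line: str) -> OperationRecord:
    # Assumption: the log stores one JSON object per line with exactly these keys.
    return OperationRecord(**json.loads(line))
```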
On the main-room device side, the main-room device receives a database operation request and acquires database operation request information based on it.
In one embodiment, each main-room device stores one piece of log information.
Before step 110, the log information stored by each main-room device in the main machine room is monitored, so that step 110 is performed when a change is detected in the log information stored by any main-room device.
The incremental information is the information newly added relative to the target log information before the change. The complete target log information therefore need not be acquired; only the incremental information is needed. This reduces the amount of data transmitted, which further improves the timeliness of data synchronization, makes it less sensitive to network delay, transmission bandwidth and packet loss rate, and further improves the reliability of data synchronization. The incremental information comprises at least one record of the target log information.
In one embodiment, an information reading instruction is sent to the main-room device, so that the main-room device sends the incremental information based on the information reading instruction, and the incremental information sent by the main-room device is received. When a change in the target log information stored by the main-room device is detected, the data synchronization device generates an information reading instruction and sends it to the main-room device. Because the data synchronization device actively monitors the main-room device and actively issues the information reading instruction, the main-room device need not be heavily involved in data synchronization, which avoids affecting the stability and performance of the main machine room and improves the reliability of the system.
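As a minimal sketch of step 110, and assuming the target log information is a line-oriented text file readable by the data synchronization device, change detection and incremental reading could be approximated by tracking a byte offset. The `LogTailer` class, the `demo` helper and the file name used in it are hypothetical; the disclosure does not prescribe this mechanism.

```python
import os
import tempfile


class LogTailer:
    """Tracks how much of a log file has already been read (hypothetical helper)."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.offset = 0  # byte position just past the last fully consumed record

    def has_changed(self) -> bool:
        # A file larger than the stored offset means new data was appended.
        return os.path.getsize(self.path) > self.offset

    def read_increment(self) -> list[str]:
        # Return only the complete lines appended since the previous call.
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            data = f.read()
        end = data.rfind(b"\n") + 1  # leave a trailing partial line for later
        self.offset += end
        return data[:end].decode("utf-8").splitlines()


def demo() -> tuple[list[str], list[str], bool]:
    # Simulates a log that grows by one record between two reads.
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "example.log")
        with open(path, "w", encoding="utf-8") as f:
            f.write("SELECT 1\n")
        tailer = LogTailer(path)
        first = tailer.read_increment()
        with open(path, "a", encoding="utf-8") as f:
            f.write("INSERT INTO t VALUES (1)\n")
        changed = tailer.has_changed()
        second = tailer.read_increment()
        return first, second, changed
```

In this sketch the second read returns only the appended record, mirroring the point that the complete target log information is never re-transmitted.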
Step 120, analyzing whether the incremental information includes database update request information.
Here, the database update request information is the detailed (related) information of a database update request. For example, the database update request is an SQL request, and the database update request information is the detailed information of that SQL request. A database update request is used to update the data in the database. It can be understood that, by analyzing whether the incremental information comprises database update request information, only database update request information is synchronized to the slave-room cluster, while other information (such as database query request information) is not. This reduces the synchronization of redundant request information, which further improves the timeliness and accuracy of data synchronization and the stability and performance of the system.
The database update request information may include, but is not limited to, at least one of: user name, database name, table name, operation type, operation time, the SQL statement of the operation, client IP address, etc. The incremental information may comprise zero, one or multiple pieces of database update request information.
Illustratively, it is analyzed whether the SQL in the incremental information needs to be executed, i.e., the type of the SQL is analyzed to determine whether the incremental information comprises database update request information. The SQL types of database update requests corresponding to database update request information may include, but are not limited to: CREATE, INSERT, UPDATE, LOAD, etc.
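The SQL type analysis can be sketched as a check on the leading keyword of each statement. This is a simplified illustration only: the keyword set holds just the four types named above (the list in the text is open-ended), and the function name `is_update_request` is hypothetical.

```python
# Only the types named in the text; a real deployment would likely extend this set.
UPDATE_KEYWORDS = {"CREATE", "INSERT", "UPDATE", "LOAD"}


def is_update_request(sql: str) -> bool:
    """Return True when the statement's first keyword marks it as an update."""
    stripped = sql.strip()
    if not stripped:
        return False
    # Compare the first whitespace-delimited token case-insensitively.
    return stripped.split(None, 1)[0].upper() in UPDATE_KEYWORDS
```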
Step 130, if the incremental information comprises database update request information, sending the database update request information to a slave-room cluster, so that the slave-room cluster achieves data synchronization with the main machine room based on the database update request information, wherein the slave-room cluster is deployed in a slave machine room associated with the main machine room.
Here, the slave machine room is the standby machine room in which data is stored. The slave-room cluster comprises at least one device for acquiring database operation requests and database operation request information and for storing data. A high-availability cluster may be deployed in the slave machine room, i.e., the slave-room cluster is a high-availability cluster, to improve system reliability and data security; for example, the slave-room cluster is a Doris cluster comprising at least one FE device and at least one BE device. The FE node and the BE node may be deployed on the same device or on different devices.
It can be understood that a slave-room device in the slave-room cluster responds to the database update request corresponding to the database update request information in the same way as the main-room device, thereby achieving data synchronization between the main machine room and the slave machine room and ensuring that their data is consistent. Specifically, the slave-room device parses the database update request information to obtain the database update request and responds to it.
The slave-room device is also used to receive database operation requests and to acquire database operation request information. It is further used to store and maintain cluster metadata, parse database operation requests, plan queries, schedule query execution, return query results, and so on. For example, the slave-room device is an FE device in a Doris cluster. An FE may take one of three roles: Follower, Leader or Observer.
It should be noted that the data synchronization device is communicatively connected to the main-room device in the main machine room and to the slave-room cluster.
Further, if the incremental information does not comprise database update request information, the incremental information is discarded, which reduces the storage space occupied and improves the processing performance of the data synchronization device.
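Steps 120 and 130 together amount to a forward-or-discard decision per record, which might be sketched as below. The sketch assumes each record is reduced to its SQL text; `route_increment` and the `send_to_slave` callback are hypothetical names standing in for whatever connection the data synchronization device holds to the slave-room cluster.

```python
from typing import Callable, Iterable

# Only the update types named in the text; the list there is open-ended.
UPDATE_KEYWORDS = {"CREATE", "INSERT", "UPDATE", "LOAD"}


def route_increment(
    records: Iterable[str], send_to_slave: Callable[[str], None]
) -> int:
    """Forward update statements to the slave side; discard everything else."""
    forwarded = 0
    for sql in records:
        tokens = sql.strip().split(None, 1)
        if tokens and tokens[0].upper() in UPDATE_KEYWORDS:
            send_to_slave(sql)  # hypothetical transport to the slave-room cluster
            forwarded += 1
        # Non-update records (e.g. queries) are simply dropped here.
    return forwarded
```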
According to this cross-machine-room data synchronization method, when the data synchronization device detects a change in the target log information stored by the main-room device deployed in the main machine room, it promptly receives the incremental information of the target log information sent by the main-room device. When the incremental information comprises database update request information, that information is promptly sent to the slave-room cluster, and the slave-room cluster promptly achieves data synchronization with the main machine room based on it. This improves the timeliness of data synchronization; because synchronization is performed promptly, data loss caused by a failure of the main machine room can be avoided, which improves data security. Meanwhile, the data exchanged among the main-room device, the data synchronization device and the slave-room cluster is limited to a small amount of incremental information, and neither the complete log information nor the complete stored data needs to be transmitted. This further improves the timeliness of data synchronization and makes it less sensitive to network delay, transmission bandwidth and packet loss rate, which in turn improves the reliability of data synchronization and the stability, performance and reliability of the system.
Based on the above embodiment, fig. 2 is a second flowchart of a data synchronization method across machine rooms provided by the present invention, and as shown in fig. 2, the step 120 includes:
step 121, sending the increment information to a message queue.
Specifically, the data synchronization device sends the incremental information to the message queue as it reads it. The message queue must be installed on the data synchronization device in advance. In one embodiment, the message queue may be Kafka.
Further, after receiving the incremental information, the message queue routes it and writes it to a topic.
Step 122, monitoring the message queue through a message consuming program, so that the incremental information is delivered from the message queue to the message consuming program.
Considering that multiple pieces of incremental information may be sent to the message queue at the same time (a high-concurrency scenario), the incremental information is sent to the message queue and the message queue is monitored by the message consumption program, so that high-concurrency processing is achieved through the segmentation function of the message consumption program, further improving the timeliness of data synchronization. Furthermore, multiple message consumption programs can be deployed on the data synchronization device so that the message queue is monitored by different message consumption programs, which likewise achieves high-concurrency processing and further improves the timeliness of data synchronization.
In one embodiment, when the message consuming program monitors a new message on topic of the message queue, the new message (i.e., delta information) is read in.
Step 123, analyzing, by the message consumption program, whether the incremental information includes database update request information.
Further, the message consuming program connects to the slave computer room cluster, so that the database update request information is sent to the slave computer room cluster through the message consuming program.
In one embodiment, after the step 123, the method further includes:
and if the incremental information does not comprise the database update request information, discarding the incremental information.
It can be understood that discarding the incremental information means deleting it from both the message consuming program and the message queue, which reduces the storage space occupied, improves the processing performance of the data synchronization device, and prevents the incremental information from being processed repeatedly.
According to the cross-machine-room data synchronization method provided by the invention, the incremental information is sent to the message queue and the message queue is monitored by the message consumption program, so that high-concurrency processing is achieved through the segmentation function of the message consumption program, the timeliness of data synchronization is further improved, and the incremental information can be consumed and processed in time. In addition, multiple message consumption programs can be deployed on the data synchronization device so that the message queue is monitored by different message consumption programs, which likewise achieves high-concurrency processing and further improves the timeliness of data synchronization.
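The queue-plus-multiple-consumers arrangement can be sketched as follows. This is only an in-process stand-in, assuming Python's standard `queue.Queue` and threads in place of a real message queue such as Kafka; the function name `run_consumers` and the `handle` callback are hypothetical.

```python
import queue
import threading
from typing import Callable, Iterable


def run_consumers(increments: Iterable[dict],
                  handle: Callable[[dict], None],
                  num_consumers: int = 2) -> None:
    """Put increments on a queue and process them with several consumer threads."""
    q: queue.Queue = queue.Queue()

    def consume() -> None:
        while True:
            item = q.get()
            try:
                if item is None:  # sentinel: no more messages for this consumer
                    return
                handle(item)
            finally:
                q.task_done()

    workers = [threading.Thread(target=consume) for _ in range(num_consumers)]
    for worker in workers:
        worker.start()
    for item in increments:
        q.put(item)
    for _ in workers:
        q.put(None)  # one sentinel per consumer
    q.join()
    for worker in workers:
        worker.join()
```

Because several consumers run concurrently, the order in which `handle` sees the records is not guaranteed in this sketch, and `handle` must be safe to call from multiple threads.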
Based on any of the above embodiments, the host room is deployed with a Doris cluster, where the Doris cluster includes at least one front-end node FE device, and the host room device is an FE device.
It can be appreciated that the Doris cluster is a high availability cluster, thereby improving the reliability of the system and the security of the data.
Prior to step 110 above, the method further comprises:
and monitoring log information stored by each FE device in the main machine room.
Specifically, log information stored by all FE devices in the main machine room is monitored to monitor whether the log information stored by each FE device changes. The log information may be an audit log for recording all database operation requests received by the corresponding FE device.
In an embodiment, the log information stored in each FE device in the main machine room is monitored by a log monitoring and collecting tool, which analyzes whether the log information has changed and reads the incremental information of the target log information. The log monitoring and collecting tool may be Filebeat, and it needs to be pre-installed in the data synchronization device.
According to the data synchronization method across the machine room, the log information stored by each FE device in the main machine room is monitored, and whether the log information stored by each FE device changes or not is monitored timely, so that corresponding incremental information is read timely under the condition that the change occurs, and the timeliness of data synchronization is further improved.
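Reading only the newly appended part of a log file can be sketched with a remembered byte offset. This is a simplified stand-in for what a log collection tool does, not a description of Filebeat itself; the function name `read_increment` and the one-record-per-line layout are assumptions.

```python
import os
from typing import List, Tuple


def read_increment(path: str, offset: int) -> Tuple[List[str], int]:
    """Return the complete lines appended after `offset`, plus the new offset."""
    if os.path.getsize(path) <= offset:
        return [], offset  # the file has not grown, so nothing changed
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Consume only up to the last newline so a half-written record is left for later.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8").splitlines()
    return lines, offset + end
```

The caller keeps the returned offset and passes it back on the next call, so each record is read once and the complete log never has to be transferred again.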
Based on any one of the above embodiments, the present invention further provides a data synchronization method across machine rooms applied to a main machine room device, and fig. 3 is a third flow chart of the data synchronization method across machine rooms provided by the present invention, as shown in fig. 3, where the data synchronization method across machine rooms includes:
step 310, receiving a database operation request and acquiring database operation request information based on the database operation request.
Here, the host room device is deployed in the host room, which is the primary machine room for storing data. The host room is provided with at least one device for acquiring database operation requests and database operation request information and for storing data. The host room may deploy a high availability cluster to improve the reliability of the system and the security of data; for example, the host room is deployed with a Doris cluster comprising at least one FE device and at least one BE device. The FE node and the BE node may be deployed in the same device or in different devices. BE devices are used to store physical data and to perform database operations in a distributed manner according to a physical plan generated by FE nodes, where one device can deploy multiple BE nodes and BE can store multiple copies of the entire data.
It should be noted that, the host computer room may deploy a plurality of host computer room devices, and each host computer room device has substantially the same execution process. For example, the host room device is an FE device in a Doris cluster.
Here, the database operation request includes a database update request, a database query request, and the like; the database update request is used for updating data of the database, and the database query request is used for querying the data in the database.
In one embodiment, a database operation request sent by an application is received.
Here, the database operation request information is detailed information (related information) of the database operation request. For example, the database operation request is an SQL request, and the database operation request information is detailed information of the SQL request.
The database operation request information may include, but is not limited to, at least one of: user name, database name, table name, operation type, operation time, sql statement of operation, client ip address, etc.
Illustratively, the Doris cluster of the host room receives the SQL request sent by the application, and the FE node devices in the Doris cluster obtain the detailed information of the SQL request.
Step 320, responding to the database operation request, and writing the database operation request information into target log information stored by the host room equipment.
Specifically, the database operation request is processed, and the database operation request information is recorded to the target log information stored in the host room device while responding.
Here, the target log information is recorded with database operation request information acquired by the host room device. The target log information may be a log file. The target log information includes at least one record, each of which may include, but is not limited to, at least one of: user name, database name, table name, operation type, operation time, sql statement of operation, client ip address, etc.
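As an illustration of one record in the target log information, the sketch below appends a record with the fields listed above to a log file. The JSON-lines layout, the field names and the function `write_audit_record` are assumptions chosen for the example; they are not the actual Doris audit log format.

```python
import json
import time


def write_audit_record(log_path: str, user: str, database: str, table: str,
                       op_type: str, sql: str, client_ip: str) -> dict:
    """Append one database operation record to the log as a single JSON line."""
    record = {
        "user": user,
        "database": database,
        "table": table,
        "op_type": op_type,
        "op_time": time.strftime("%Y-%m-%d %H:%M:%S"),
        "sql": sql,
        "client_ip": client_ip,
    }
    # Append mode means each write only adds a new line, i.e. new incremental information.
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record
```

Keeping one record per line makes the newly written portion of the file easy to identify, which matches the idea that only the increment needs to be sent onward.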
Step 330, sending the newly written incremental information in the target log information to a data synchronization device, so that the data synchronization device can realize data synchronization between the host computer room and a slave computer room based on the incremental information, wherein the slave computer room is a computer room related to the host computer room.
Here, the incremental information is added information compared with the target log information before writing, so that the complete target log information is not required to be sent to the data synchronization equipment, only the incremental information is required to be sent, the data transmission quantity is reduced, the timeliness of data synchronization is further improved, the influence of problems such as delay, transmission bandwidth and packet loss rate of a network is small, and the reliability of data synchronization is further improved. The delta information includes at least one record in the target log information.
Here, the slave room is a standby machine room for storing data. The slave room cluster includes at least one device for acquiring database operation requests and database operation request information and for storing data. The slave room may deploy a high availability cluster, i.e. the slave room cluster is a high availability cluster, to improve the reliability of the system and the security of data; for example, the slave room cluster is a Doris cluster comprising at least one FE device and at least one BE device. The FE node and the BE node may be deployed in the same device or in different devices.
For example, the slave room device is an FE device in the Doris cluster. The FE can be divided into three roles, namely a Follower node, a Leader node and an Observer node.
It can be understood that, for the data synchronization device, the data synchronization device analyzes whether the incremental information includes the database update request information, and in the case that the database operation request information is the database update request information, only synchronizes the database update request information to the slave computer room cluster, but not synchronizes other information (such as database query request information) to the slave computer room cluster, so that redundant request information synchronization can be reduced, thereby improving timeliness of data synchronization, improving accuracy of data synchronization, and improving stability and performance of the system.
According to the cross-machine-room data synchronization method provided by the invention, the host room equipment responds to the database operation request, writes the database operation request information into the target log information stored in the host room equipment, and promptly sends the newly written incremental information in the target log information to the data synchronization equipment. This improves the timeliness of data synchronization, and because synchronization is timely, data loss caused by a failure of the host room can be avoided, which improves data security. Meanwhile, the data exchanged between the host room equipment and the data synchronization equipment is limited to a small amount of incremental information; neither the complete log information nor the complete stored data needs to be transmitted. Synchronization is therefore less affected by network delay, transmission bandwidth and packet loss, which improves the reliability of data synchronization as well as the stability, performance and reliability of the system.
Based on any one of the above embodiments, step 330 includes:
receiving an information reading instruction sent by the data synchronization equipment;
acquiring the increment information based on the information reading instruction;
and sending the increment information to the data synchronization equipment.
On the data synchronization equipment side, when the data synchronization equipment detects that the target log information stored by the host room equipment has changed, it generates an information reading instruction and sends it to the host room equipment. Because the data synchronization equipment actively monitors the host room equipment and actively sends the information reading instruction, the host room equipment does not need to participate heavily in data synchronization, which avoids affecting the stability and performance of the host room and improves the reliability of the system.
The data synchronization method across the machine room provided by the embodiment of the invention receives the information reading instruction sent by the data synchronization equipment; acquiring incremental information based on the information reading instruction; and the incremental information is sent to the data synchronization equipment without excessive participation of the main machine room equipment in data synchronization, so that the influence on the stability and performance of the main machine room is avoided, and the reliability of the system is improved.
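The pull-style exchange, in which the data synchronization equipment issues the information reading instruction and the host room equipment only answers it, can be sketched with two small classes. The class names, the in-memory log, and the use of a record index as the content of the instruction are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HostRoomDevice:
    """Holds the target log; it only writes records and answers read instructions."""
    log: List[str] = field(default_factory=list)

    def write(self, record: str) -> None:
        self.log.append(record)

    def handle_read_instruction(self, from_index: int) -> List[str]:
        # Return only the records written since `from_index` (the increment).
        return self.log[from_index:]


@dataclass
class SyncDevice:
    """Monitors the host log and pulls the increment when it has changed."""
    host: HostRoomDevice
    next_index: int = 0

    def poll(self) -> List[str]:
        if len(self.host.log) == self.next_index:
            return []  # no change detected, so no instruction is sent
        increment = self.host.handle_read_instruction(self.next_index)
        self.next_index += len(increment)
        return increment
```

In this arrangement the host side carries no synchronization state at all; the position up to which records have been read lives entirely in `SyncDevice`.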
The data synchronization system across the machine room provided by the invention is described below, and the data synchronization system across the machine room described below and the data synchronization method across the machine room described above can be correspondingly referred to each other.
This data synchronization system across computer lab includes: the system comprises a host room cluster deployed in a host room, a slave room cluster deployed in a slave room and data synchronization equipment. The data synchronization device is respectively in communication connection with a host room cluster and a slave machine room cluster, and the host room cluster comprises at least one host room device.
To facilitate understanding, as shown in fig. 4, the host room is configured with three host room devices and three data storage devices, where each data storage device may perform data backup. A host room device of the host room receives a database operation request sent by an application program, processes the request, and writes the corresponding database operation request information into the log information stored in the host room device. A log monitoring and collecting tool in the data synchronization device monitors the log information, reads the incremental information in it, and sends the incremental information to a message queue. A message consumption program obtains the incremental information from the message queue and analyzes whether it includes database update request information. If it does, the message consumption program sends the database update request information to a slave room device of the slave room, thereby implementing data synchronization; if it does not, the incremental information is discarded.
Based on the embodiments, the invention realizes quasi-real-time synchronization, reduces the influence of the faults of the main machine room, does not influence the stability of the main machine room cluster or the slave machine room cluster, and further realizes a high-performance cross-machine room high-availability scheme.
The data synchronization device across the machine room provided by the invention is described below, and the data synchronization device across the machine room described below and the data synchronization method across the machine room described above can be correspondingly referred to each other.
Fig. 5 is a schematic structural diagram of a cross-machine-room data synchronization device provided by the present invention, as shown in fig. 5, where the cross-machine-room data synchronization device disposed in a data synchronization device includes:
the information receiving module 510 is configured to receive incremental information of target log information sent by a host room device when it is monitored that the target log information stored by the host room device deployed in the host room changes, where the target log information includes database operation request information acquired by the host room device;
an information analysis module 520 for analyzing whether the incremental information includes database update request information;
and the request sending module 530 is configured to send the database update request information to a slave computer room cluster if the incremental information includes the database update request information, so that the slave computer room cluster realizes data synchronization with the master computer room based on the database update request information, and the slave computer room cluster is disposed in a slave computer room related to the master computer room.
According to the cross-machine-room data synchronization device provided by the invention, when the data synchronization device detects that the target log information stored by the main machine room device deployed in the main machine room has changed, it promptly receives the incremental information of the target log information sent by the main machine room device. When the incremental information includes database update request information, that information is promptly sent to the slave machine room cluster, which then synchronizes its data with the main machine room based on it. This improves the timeliness of data synchronization, and because synchronization is timely, data loss caused by a failure of the main machine room can be avoided, which improves data security. Meanwhile, the data exchanged among the main machine room device, the data synchronization device and the slave machine room cluster is limited to a small amount of incremental information; neither the complete log information nor the complete stored data needs to be transmitted. Synchronization is therefore less affected by network delay, transmission bandwidth and packet loss, which improves the reliability of data synchronization as well as the stability, performance and reliability of the system.
Based on any of the above embodiments, the information analysis module 520 is further configured to:
sending the increment information to a message queue;
monitoring the message queue through a message consuming program to send the increment information from the message queue to the message consuming program;
and analyzing whether the incremental information comprises database update request information or not through the message consumption program.
Based on any of the above embodiments, the information analysis module 520 is further configured to:
and if the incremental information does not comprise the database update request information, discarding the incremental information.
Based on any of the above embodiments, the host room is deployed with a Doris cluster, where the Doris cluster includes at least one front-end node FE device, and the host room device is an FE device; the apparatus further comprises:
and the information monitoring module is used for monitoring log information stored by each FE device in the main machine room.
Fig. 6 is a second schematic structural diagram of a cross-machine room data synchronization device provided by the present invention, as shown in fig. 6, the cross-machine room data synchronization device disposed in a host room device includes:
a request receiving module 610, configured to receive a database operation request, and obtain database operation request information based on the database operation request;
An information writing module 620, configured to respond to the database operation request, and write the database operation request information into the target log information stored in the host room device;
and the information sending module 630 is configured to send the increment information newly written in the target log information to a data synchronization device, so that the data synchronization device realizes data synchronization between the host computer room and a slave computer room based on the increment information, where the slave computer room is a computer room related to the host computer room.
According to the cross-machine-room data synchronization device provided by the invention, the host room equipment responds to the database operation request, writes the database operation request information into the target log information stored in the host room equipment, and promptly sends the newly written incremental information in the target log information to the data synchronization equipment. This improves the timeliness of data synchronization, and because synchronization is timely, data loss caused by a failure of the host room can be avoided, which improves data security. Meanwhile, the data exchanged between the host room equipment and the data synchronization equipment is limited to a small amount of incremental information; neither the complete log information nor the complete stored data needs to be transmitted. Synchronization is therefore less affected by network delay, transmission bandwidth and packet loss, which improves the reliability of data synchronization as well as the stability, performance and reliability of the system.
Based on any of the above embodiments, the information sending module 630 is further configured to:
receiving an information reading instruction sent by the data synchronization equipment;
acquiring the increment information based on the information reading instruction;
and sending the increment information to the data synchronization equipment.
Fig. 7 illustrates a physical schematic diagram of an electronic device, as shown in fig. 7, which may include: processor 710, communication interface (Communications Interface) 720, memory 730, and communication bus 740, wherein processor 710, communication interface 720, memory 730 communicate with each other via communication bus 740. Processor 710 may invoke logic instructions in memory 730 to perform a cross-room data synchronization method for a data synchronization device, the method comprising: under the condition that change of target log information stored by host room equipment deployed in a host machine room is monitored, incremental information of the target log information sent by the host machine room equipment is received, wherein the target log information comprises database operation request information acquired by the host machine room equipment; analyzing whether the incremental information comprises database update request information; and if the incremental information comprises database updating request information, the database updating request information is sent to a slave computer room cluster, so that the slave computer room cluster realizes data synchronization with the master computer room based on the database updating request information, and the slave computer room cluster is deployed in a slave computer room related to the master computer room. 
Or, executing a data synchronization method applied to the main machine room equipment and crossing the machine room, wherein the method comprises the following steps: receiving a database operation request and acquiring database operation request information based on the database operation request; responding to the database operation request, and writing the database operation request information into target log information stored by the host room equipment; and sending the newly written incremental information in the target log information to data synchronization equipment so that the data synchronization equipment can realize data synchronization of the host computer room and a slave computer room based on the incremental information, wherein the slave computer room is a computer room related to the host computer room.
Further, the logic instructions in the memory 730 described above may be implemented in the form of software functional units and may be stored in a computer readable storage medium when sold or used as a stand-alone product. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.
In another aspect, the present invention also provides a computer program product, where the computer program product includes a computer program, where the computer program can be stored on a non-transitory computer readable storage medium, where the computer program when executed by a processor can perform a data synchronization method applied to a data synchronization device provided by the above methods, where the method includes: under the condition that change of target log information stored by host room equipment deployed in a host machine room is monitored, incremental information of the target log information sent by the host machine room equipment is received, wherein the target log information comprises database operation request information acquired by the host machine room equipment; analyzing whether the incremental information comprises database update request information; and if the incremental information comprises database updating request information, the database updating request information is sent to a slave computer room cluster, so that the slave computer room cluster realizes data synchronization with the master computer room based on the database updating request information, and the slave computer room cluster is deployed in a slave computer room related to the master computer room. 
Or, the computer can execute the data synchronization method across the machine room, which is applied to the main machine room equipment and provided by the methods, and the method comprises the following steps: receiving a database operation request and acquiring database operation request information based on the database operation request; responding to the database operation request, and writing the database operation request information into target log information stored by the host room equipment; and sending the newly written incremental information in the target log information to data synchronization equipment so that the data synchronization equipment can realize data synchronization of the host computer room and a slave computer room based on the incremental information, wherein the slave computer room is a computer room related to the host computer room.
In yet another aspect, the present invention further provides a non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor, is implemented to perform the method for data synchronization across a machine room provided by the methods above, the method comprising: under the condition that change of target log information stored by host room equipment deployed in a host machine room is monitored, incremental information of the target log information sent by the host machine room equipment is received, wherein the target log information comprises database operation request information acquired by the host machine room equipment; analyzing whether the incremental information comprises database update request information; and if the incremental information comprises database updating request information, the database updating request information is sent to a slave computer room cluster, so that the slave computer room cluster realizes data synchronization with the master computer room based on the database updating request information, and the slave computer room cluster is deployed in a slave computer room related to the master computer room. 
Or, the computer program is implemented when being executed by a processor to perform the data synchronization method across machine rooms provided by the above methods and applied to the main machine room equipment, where the method includes: receiving a database operation request and acquiring database operation request information based on the database operation request; responding to the database operation request, and writing the database operation request information into target log information stored by the host room equipment; and sending the newly written incremental information in the target log information to data synchronization equipment so that the data synchronization equipment can realize data synchronization of the host computer room and a slave computer room based on the incremental information, wherein the slave computer room is a computer room related to the host computer room.
The apparatus embodiments described above are merely illustrative, wherein the elements illustrated as separate elements may or may not be physically separate, and the elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A cross-machine-room data synchronization method, applied to a data synchronization device, the method comprising:
upon detecting a change in target log information stored by a master machine room device deployed in a master machine room, receiving incremental information of the target log information sent by the master machine room device, wherein the target log information comprises database operation request information acquired by the master machine room device;
parsing the incremental information to determine whether it comprises database update request information;
and if the incremental information comprises database update request information, sending the database update request information to a slave machine room cluster, so that the slave machine room cluster achieves data synchronization with the master machine room based on the database update request information, wherein the slave machine room cluster is deployed in a slave machine room associated with the master machine room.
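The parse-and-forward steps of claim 1 can be illustrated with a minimal Python sketch. The keyword list `UPDATE_KEYWORDS`, the `SlaveRoomCluster` stand-in, and the `synchronize` function are hypothetical names introduced here for illustration; the patent does not specify how an update request is recognized or how it is delivered to the slave machine room cluster.

```python
from dataclasses import dataclass, field

# Assumption: a request counts as an "update" if it starts with one of these
# keywords. The patent does not define the actual criterion.
UPDATE_KEYWORDS = ("INSERT", "UPDATE", "DELETE", "CREATE", "ALTER", "DROP")


def is_update_request(entry: str) -> bool:
    """Return True if a log entry looks like a database update request."""
    words = entry.split(None, 1)
    return bool(words) and words[0].upper() in UPDATE_KEYWORDS


@dataclass
class SlaveRoomCluster:
    """Hypothetical stand-in for the cluster in the slave machine room."""
    applied: list = field(default_factory=list)

    def apply(self, request: str) -> None:
        # A real cluster would execute the request; here it is only recorded.
        self.applied.append(request)


def synchronize(increment: list, slave: SlaveRoomCluster) -> int:
    """Forward the update requests found in the increment.

    Returns the number of requests forwarded to the slave cluster.
    """
    forwarded = 0
    for entry in increment:
        if is_update_request(entry):
            slave.apply(entry)
            forwarded += 1
    return forwarded
```

In this sketch, read-only entries such as `SELECT` statements are simply skipped, so only state-changing requests reach the slave side.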
2. The cross-machine-room data synchronization method according to claim 1, wherein the parsing the incremental information to determine whether it comprises database update request information comprises:
sending the incremental information to a message queue;
listening to the message queue through a message consuming program, so that the incremental information is delivered from the message queue to the message consuming program;
and parsing, through the message consuming program, whether the incremental information comprises database update request information.
3. The cross-machine-room data synchronization method according to claim 2, wherein the parsing, through the message consuming program, whether the incremental information comprises database update request information further comprises:
if the incremental information does not comprise database update request information, discarding the incremental information.
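The message-queue flow of claims 2 and 3 can be sketched as follows, using Python's standard `queue.Queue` as an in-process stand-in for the message queue. The names `consume`, `run_pipeline`, and `STOP`, as well as the keyword-based check, are assumptions made for this example; the patent does not name a specific message queue product or consumer implementation.

```python
import queue
import threading

# Assumption: keyword-based detection of update requests (not from the patent).
UPDATE_KEYWORDS = {"INSERT", "UPDATE", "DELETE", "CREATE", "ALTER", "DROP"}
STOP = object()  # sentinel that tells the consumer to exit


def is_update_request(entry: str) -> bool:
    words = entry.split(None, 1)
    return bool(words) and words[0].upper() in UPDATE_KEYWORDS


def consume(mq: queue.Queue, forwarded: list, discarded: list) -> None:
    """Message consuming program: block on the queue and classify each entry."""
    while True:
        entry = mq.get()
        if entry is STOP:
            break
        if is_update_request(entry):
            forwarded.append(entry)   # would be sent to the slave cluster
        else:
            discarded.append(entry)   # claim 3: non-update increments are dropped


def run_pipeline(increments: list) -> tuple:
    """Push increments through the queue and return (forwarded, discarded)."""
    mq = queue.Queue()
    forwarded, discarded = [], []
    worker = threading.Thread(target=consume, args=(mq, forwarded, discarded))
    worker.start()
    for entry in increments:
        mq.put(entry)
    mq.put(STOP)
    worker.join()
    return forwarded, discarded
```

The `discarded` list exists only so the behaviour can be observed; a real consumer would presumably drop such entries without keeping them.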
4. The cross-machine-room data synchronization method according to claim 1, wherein a Doris cluster is deployed in the master machine room, the Doris cluster comprises at least one front-end (FE) node device, and the master machine room device is an FE device;
the receiving, upon detecting a change in the target log information stored by the master machine room device deployed in the master machine room, the incremental information of the target log information sent by the master machine room device further comprises:
monitoring the log information stored by each FE device in the master machine room.
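One way to picture the per-FE monitoring of claim 4 is a poller that remembers how much of each log it has already read. This is a sketch under stated assumptions: it treats each FE log as a plain UTF-8 text file reachable by path, and `FeLogMonitor` and `poll` are hypothetical names. The patent does not state the log format or how the data synchronization device accesses it.

```python
import os


class FeLogMonitor:
    """Track several FE log files and report text appended since the last poll."""

    def __init__(self, log_paths):
        # Map each log path to the byte offset that has already been read.
        self.offsets = {path: 0 for path in log_paths}

    def poll(self) -> dict:
        """Return {path: new_text} for every log that grew since the last poll."""
        changes = {}
        for path, offset in self.offsets.items():
            if not os.path.exists(path):
                continue  # this FE has not written a log yet
            size = os.path.getsize(path)
            if size > offset:
                with open(path, "rb") as fh:
                    fh.seek(offset)
                    data = fh.read(size - offset)
                # Only the value of an existing key changes, so updating the
                # dict while iterating over it is safe here.
                self.offsets[path] = offset + len(data)
                changes[path] = data.decode("utf-8")
        return changes
```

Polling is used here only because it is easy to follow; log rotation and truncation are not handled in this sketch.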
5. A cross-machine-room data synchronization method, applied to a master machine room device deployed in a master machine room, the method comprising:
receiving a database operation request and acquiring database operation request information based on the database operation request;
in response to the database operation request, writing the database operation request information into target log information stored by the master machine room device;
and sending newly written incremental information in the target log information to a data synchronization device, so that the data synchronization device achieves data synchronization between the master machine room and a slave machine room based on the incremental information, wherein the slave machine room is a machine room associated with the master machine room.
6. The cross-machine-room data synchronization method according to claim 5, wherein the sending newly written incremental information in the target log information to the data synchronization device comprises:
receiving an information reading instruction sent by the data synchronization device;
acquiring the incremental information based on the information reading instruction;
and sending the incremental information to the data synchronization device.
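The master-side behaviour of claims 5 and 6 (write each request to the log, then return only the newly written part when asked) can be sketched as below. The class `MasterRoomDevice`, the in-memory list used as the log, and the idea that the read instruction carries a position `from_index` are all assumptions for illustration; the patent does not describe the content of the information reading instruction.

```python
class MasterRoomDevice:
    """Hypothetical master machine room device with an append-only log."""

    def __init__(self):
        self._log = []  # target log information, one entry per request

    def handle_request(self, request: str) -> None:
        """Claim 5: record the database operation request in the log."""
        self._log.append(request)

    def read_increment(self, from_index: int) -> tuple:
        """Claim 6: answer a read instruction with the entries written since
        from_index, plus the position to use in the next instruction."""
        if from_index < 0:
            raise ValueError("from_index must be non-negative")
        entries = self._log[from_index:]
        return entries, from_index + len(entries)
```

A caller on the data synchronization side would keep the returned position and pass it back on the next read, so each entry is transferred once.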
7. A cross-machine-room data synchronization apparatus, wherein the apparatus is deployed in a data synchronization device, the apparatus comprising:
an information receiving module, configured to receive, upon detecting a change in target log information stored by a master machine room device deployed in a master machine room, incremental information of the target log information sent by the master machine room device, wherein the target log information comprises database operation request information acquired by the master machine room device;
an information parsing module, configured to parse the incremental information to determine whether it comprises database update request information;
and a request sending module, configured to send the database update request information to a slave machine room cluster if the incremental information comprises the database update request information, so that the slave machine room cluster achieves data synchronization with the master machine room based on the database update request information, wherein the slave machine room cluster is deployed in a slave machine room associated with the master machine room.
8. A cross-machine-room data synchronization apparatus, wherein the apparatus is deployed in a master machine room device, the master machine room device is deployed in a master machine room, the apparatus comprising:
a request receiving module, configured to receive a database operation request and acquire database operation request information based on the database operation request;
an information writing module, configured to write, in response to the database operation request, the database operation request information into target log information stored by the master machine room device;
and an information sending module, configured to send newly written incremental information in the target log information to a data synchronization device, so that the data synchronization device achieves data synchronization between the master machine room and a slave machine room based on the incremental information, wherein the slave machine room is a machine room associated with the master machine room.
9. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the program, implements the cross-machine-room data synchronization method according to any one of claims 1 to 4, or implements the cross-machine-room data synchronization method according to claim 5 or 6.
10. A non-transitory computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the cross-machine-room data synchronization method according to any one of claims 1 to 4, or implements the cross-machine-room data synchronization method according to claim 5 or 6.
CN202311030897.8A 2023-08-14 2023-08-14 Cross-machine-room data synchronization method and device, electronic equipment and storage medium Pending CN117131131A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311030897.8A CN117131131A (en) 2023-08-14 2023-08-14 Cross-machine-room data synchronization method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311030897.8A CN117131131A (en) 2023-08-14 2023-08-14 Cross-machine-room data synchronization method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117131131A (en) 2023-11-28

Family

ID=88855625

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311030897.8A Pending CN117131131A (en) 2023-08-14 2023-08-14 Cross-machine-room data synchronization method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117131131A (en)

Similar Documents

Publication Publication Date Title
CN110209726B (en) Distributed database cluster system, data synchronization method and storage medium
US10353918B2 (en) High availability and disaster recovery in large-scale data warehouse
CN109918349B (en) Log processing method, log processing device, storage medium and electronic device
EP2474919B1 (en) System and method for data replication between heterogeneous databases
CN111090699A (en) Service data synchronization method and device, storage medium and electronic device
JP5308403B2 (en) Data processing failure recovery method, system and program
CN110795503A (en) Multi-cluster data synchronization method and related device of distributed storage system
CN101136728A (en) Cluster system and method for backing up a replica in a cluster system
CN111787055B (en) Redis-based transaction mechanism and multi-data center oriented data distribution method and system
US20120278817A1 (en) Event distribution pattern for use with a distributed data grid
US9258363B2 (en) Data cube high availability
CN113987064A (en) Data processing method, system and equipment
CN104994168A (en) distributed storage method and distributed storage system
CN104657497A (en) Mass electricity information concurrent computation system and method based on distributed computation
CN111597197B (en) Data reconciliation method and device between databases, storage medium and electronic equipment
US20120278429A1 (en) Cluster system, synchronization controlling method, server, and synchronization controlling program
JP2015069655A (en) Process control systems and methods
CN109901948B (en) Remote double-active disaster recovery system of shared-nothing database cluster
CN107026880A (en) Method of data synchronization and device
US20090063486A1 (en) Data replication using a shared resource
CN109739435A (en) File storage and update method and device
WO2017014814A1 (en) Replicating memory volumes
CN103384266A (en) Parastor200 management node high availability method based on real-time synchronization at file level
CN109859068B (en) Power grid data real-time synchronization system based on resource pool technology
US9612921B2 (en) Method and system for load balancing a distributed database providing object-level management and recovery

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination