CN109145060B - Data processing method and device - Google Patents

Publication number
CN109145060B
Authority
CN
China
Prior art keywords
binlog
binary log
structure information
compensation
table structure
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810800450.7A
Other languages
Chinese (zh)
Other versions
CN109145060A (en
Inventor
吴夏
潘安群
雷海林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Tencent Cloud Computing Beijing Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201810800450.7A
Publication of CN109145060A
Application granted
Publication of CN109145060B

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The invention relates to the field of Internet technology, and in particular to a data processing method and device. A producer module and a table storage module are provided in advance, and a cold standby server is selected from the slave servers of a MySQL cluster. The producer module of the cold standby server parses the binlog file to obtain binlog analysis data; table structure information is then derived from the binlog analysis data and stored in the table storage module. When a message entity is output, the binlog analysis data and the corresponding table structure information in the table storage module are packaged into the message entity and output. Because the binlog file is acquired and parsed in a bypass manner, the scheme does not increase the load on the MySQL cluster, ensures the integrity of the data, and enriches the data content.

Description

Data processing method and device
Technical Field
The present invention relates to the field of communications technologies, and in particular, to a data processing method and apparatus.
Background
MySQL is a relational database management system that speeds up data access and increases flexibility by storing data in separate tables instead of placing all data in one large repository. To balance load across servers and improve the disaster tolerance of the database system, MySQL, like other popular database systems, adopts a master-slave synchronization architecture: one master server and a plurality of slave servers. This architecture serves two purposes. First, data update operations are performed only on the master server, while data query requests can be routed to the slave servers. Second, when the master server fails, database requests can be switched directly to a slave server, which continues to provide service, so that the user's business keeps running stably.
MySQL uses the binary log (binlog) to keep data consistent between the master and slave servers. MySQL provides a binary log acquisition function based on the binlog-dump protocol; its main application scenario is data replication between the master server and the slave servers within a MySQL cluster. The communication process is shown in FIG. 1 and works as follows: the master server writes each executed data update into the binlog; after a slave server completes the handshake with the master server through the relevant protocol, the master server continuously sends the binlog to the slave server, and the slave server receives and applies it.
In data subscription applications, the main idea is to obtain binlog event data through the binlog-dump protocol. Referring to FIG. 2, the subscription service exploits the fact that MySQL implements binary log acquisition on top of the binlog-dump protocol: it disguises itself as a slave server and sends a binlog-dump request to MySQL to obtain binlog event data, parses that data into a format usable by the service scenario, and stores the parsed data at the subscription end for third parties to subscribe to. This approach has several problems. First, table structure information cannot be obtained from the binlog event data acquired through the binlog-dump protocol. Second, acquiring the binlog through the binlog-dump protocol requires MySQL to send data outward, which increases the load on MySQL. Third, if MySQL executes an operation that deletes binlog files, such as a binlog purge, the subscription end may miss data.
Disclosure of Invention
In view of the foregoing problems in the prior art, an object of the present invention is to provide a data processing method and apparatus.
In a first aspect, the present invention provides a data processing method, including:
according to data stored in a master server and a slave server of the MySQL cluster, taking the slave server closest to the data stored in the master server as a cold standby server;
acquiring a binary log binlog file from the cold standby server, and parsing the binary log binlog file to obtain binlog analysis data;
obtaining the table structure information of the binary log binlog file according to the binlog analysis data, and updating the table structure information of the binary log binlog file stored in a table storage module according to the obtained table structure information;
and generating a message entity according to the binlog analysis data and the corresponding table structure information in the table storage module.
In a second aspect, the present invention provides a data processing method, including:
acquiring compensation information in compensation nodes on a distributed coordination server cluster, wherein the compensation information comprises an identification code of a MySQL cluster;
judging whether a compensation task corresponding to the compensation information exists or not;
if the compensation task exists, acquiring the latest message entity corresponding to the MySQL cluster from the message cluster according to the identification code of the MySQL cluster;
according to the acquired message entity, acquiring a binary log binlog file from a main server;
processing the binary log binlog file to obtain a message entity;
and sending the processed message entity to a message cluster, and sending a modification request to the distributed coordination server cluster, wherein the modification request is used for triggering the distributed coordination server cluster to delete the compensation node.
In a third aspect, the present invention further provides a data processing apparatus, including:
the cold standby server determining unit is used for taking a slave server closest to the data stored in the master server as a cold standby server according to the data stored in the master server and the slave server of the MySQL cluster;
the binlog analyzing unit is used for acquiring a binary log binlog file from the cold standby server and analyzing the binary log binlog file to obtain binlog analyzing data;
the table structure information updating unit is used for acquiring the table structure information of the binary log binlog file according to the binlog analysis data and updating the table structure information of the binary log binlog file stored in the table storage module according to the acquired table structure information;
and the message entity generating unit is used for generating a message entity according to the binlog analysis data and the corresponding table structure information in the table storage module.
In a fourth aspect, the present invention further provides a data processing apparatus, including:
the first acquisition unit is used for acquiring compensation information in compensation nodes on a distributed coordination server cluster, wherein the compensation information comprises an identification code of a MySQL cluster;
the judging unit is used for judging whether a compensation task corresponding to the compensation information exists or not;
the second acquisition unit is used for acquiring the latest message entity corresponding to the MySQL cluster from the message cluster according to the identification code of the MySQL cluster when a compensation task exists;
a third obtaining unit, configured to obtain a binary log binlog file from the main server according to the obtained message entity;
the processing unit is used for processing the binary log binlog file acquired by the third acquisition unit to acquire a message entity;
and the sending unit is used for sending the message entity obtained by the processing unit to the message cluster and sending a modification request to the distributed coordination server cluster, wherein the modification request is used for triggering the distributed coordination server cluster to delete the compensation node.
In a fifth aspect, the present invention also provides a computer-readable storage medium, in which at least one instruction, at least one program, code set, or instruction set is stored, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor to implement the data processing method of the first aspect.
In a sixth aspect, the present invention further provides a computer-readable storage medium, in which at least one instruction, at least one program, code set, or instruction set is stored, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor to implement the data processing method of the second aspect.
The invention has the following beneficial effects:
the method includes configuring a producer module for each master server and each slave server of the MySQL cluster, setting a table storage module, determining a cold standby server from each slave server, analyzing a binlog file through the producer module of the cold standby server to obtain binlog analysis data, further acquiring table structure information through analyzing the binlog analysis data and storing the table structure information into the table storage module, and packaging the binlog analysis data and the corresponding table structure information in the table storage module into a message entity and outputting the message entity when the message entity is output. The scheme of the invention obtains and analyzes the binlog file in a bypass mode, does not increase the burden of the MySQL cluster, realizes zero intrusion to the MySQL cluster, and ensures that the performance of the database is not influenced in the operation process; the table structure information is obtained by analyzing binlog analysis data, and the table storage module is arranged to uniformly manage the table structure information, so that the table structure information of the binlog file is accurately checked, and meanwhile, the table structure information is added to the output message entity, and the data content is enriched.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions and advantages of the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a schematic diagram of data communication between a master server and a slave server within a MySQL cluster;
FIG. 2 is a schematic diagram illustrating the processing of subscription data using the binlog-dump protocol in the prior art;
FIG. 3 is a topology diagram of a data processing method according to an embodiment of the present invention;
FIG. 4 is a flow chart illustrating a data processing method according to an embodiment of the present invention;
FIG. 5 is a flow chart illustrating a data processing method according to an embodiment of the present invention;
FIG. 6 is a schematic diagram illustrating interaction among devices in a data processing method according to an embodiment of the present invention;
FIG. 7 is a flow chart illustrating another data processing method according to an embodiment of the present invention;
FIG. 8 is a schematic diagram illustrating interaction among devices in a data processing method according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention;
FIG. 10 is a block diagram of a data processing apparatus according to an embodiment of the present invention;
FIG. 11 is a schematic structural diagram of a server according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In existing data subscription technology based on a MySQL cluster, the binary log acquisition function that MySQL implements on top of the binlog-dump protocol is generally used: the subscription service uses the binlog-dump protocol to disguise itself as a slave server, sends a request to MySQL to obtain binlog event data, and then parses the binlog event data into a data format usable by the service scenario. However, this solution has the following drawbacks:
(1) If MySQL executes an operation that deletes binlog files, such as a binlog purge, the subscription end may lose data.
A binlog purge deletes some of the binlog files. Because the binlog files are the data source of the subscription end, deleting them causes data loss at the subscription end. Even if the subscription end detects that binlog files are missing, it cannot make up the missing data, because the binlog-dump protocol cannot acquire data from other machines.
(2) Table structure information is not available from the binlog event data obtained from the binlog-dump protocol.
The table structure of the corresponding table at the time an event occurred cannot be obtained from the information in the binlog event. Even though a binlog recorded in row format (ROW) contains a table map event that describes the table, that event only provides table name information, and there is a time difference between when the event is written and when it is read: by the time the event is obtained, the table structure may already have changed, so correct table structure information cannot be obtained.
(3) The use of the binlog-dump protocol is somewhat invasive to MySQL itself.
Because the binlog-dump protocol is used, MySQL itself ultimately has to send the binlog data outward, which increases the load on MySQL to some extent.
Aiming at the defects of the prior art, the invention provides a data processing scheme, which is characterized in that a producer module is respectively configured for each master server and each slave server of a MySQL cluster in advance, a table storage module is arranged, a data processing device is constructed, then a cold standby server is determined from each slave server, the producer module of the cold standby server analyzes a binlog file to obtain binlog analysis data, then table structure information is obtained by analyzing the binlog analysis data and stored in the table storage module, and when a message entity is output, the binlog analysis data and the corresponding table structure information in the table storage module are packaged into the message entity and output. According to the scheme, the binlog file is acquired and analyzed in a bypass mode, zero invasion to the MySQL cluster is achieved, the performance of the database is not affected in the operation process, the output message entity is added with the table structure information, and the data content is enriched. In addition, when finding the missing binlog file, the producer module of the main server can be used for acquiring and analyzing the missing binlog file, so that the integrity of the data is ensured, and the safety of the data is enhanced.
The data processing method provided by the embodiment of the invention involves a MySQL cluster, a data processing device, a structure information base, a message cluster, and a distributed coordination server cluster. The data processing device is deployed alongside the MySQL cluster in a bypass manner and comprises producer modules (binlogproducer) and a table storage module. The producer modules correspond one-to-one to the master and slave servers in the MySQL cluster, that is, each master server and each slave server has its own producer module. The table storage module stores the table structure information of each instance of the MySQL cluster.
The MySQL cluster generates the binary log binlog files. The master server of the MySQL cluster performs data update operations, generates the update data, writes it into a binary log binlog file, and synchronizes the file to the slave servers. The scheme of the invention mainly obtains the binary log binlog file from a slave server. The event IDs of the binary log binlog file increase continuously, so if the event IDs are discontinuous, a binary log binlog file has been lost; in that case, the lost binary log binlog file needs to be obtained from the master server.
Referring to fig. 6, the data processing apparatus is configured to: according to data stored in a master server and a slave server of the MySQL cluster, taking the slave server closest to the data stored in the master server as a cold standby server; acquiring a binary log binlog file from the cold standby server through a producer module of the cold standby server, and analyzing the binary log binlog file to obtain binlog analysis data; judging whether the binary log binlog file contains a DDL statement, if so, sending a replay request to a structure information base to acquire table structure information, updating corresponding table structure information in the table storage module according to the acquired table structure information, generating a message entity according to the binlog analysis data and the corresponding table structure information in the table storage module, and sending the message entity to a message cluster.
The structural information base is used for: replaying the DDL statement according to a replay request sent by the data processing device to obtain table structure information and returning the table structure information to the data processing device;
The message cluster is used for: storing the message entity sent by the data processing device.
Referring to FIG. 8, the data processing apparatus is further configured to: send a compensation request to the distributed coordination server cluster when the producer module of the cold standby server detects that a binary log binlog file is missing; read the compensation node on the distributed coordination server cluster through the producer module of the master server, and process the missing binary log binlog file according to the compensation information in the compensation node; and, when the producer module of the cold standby server detects that processing of the missing binary log binlog file is finished, resume the step of processing the binary log binlog file through the producer module of the cold standby server;
the distributed coordination server cluster is configured to: create a compensation node according to the compensation request sent by the producer module of the cold standby server, and write compensation information into the compensation node, wherein the compensation information comprises the IP address of the master server's producer module that is to process the compensation information and the identification code of the MySQL cluster to which the master server belongs.
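The master-side handling of a compensation node described above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: plain dictionaries stand in for the distributed coordination server cluster, the message cluster, and the master server's binlog, and the field names `producer_ip`, `txn_id`, and `data` as well as the address `10.0.0.1` are hypothetical. The source does not say where compensation stops, so the sketch simply forwards every master event newer than the latest message entity.

```python
def run_compensation(nodes, message_cluster, master_events, local_ip):
    """Process the compensation nodes addressed to the producer at local_ip.

    nodes:           cluster_id -> compensation info (hypothetical dict layout)
    message_cluster: cluster_id -> list of message entities, oldest first
    master_events:   cluster_id -> binlog events readable on the master server
    """
    handled = []
    for cluster_id, info in list(nodes.items()):
        if info["producer_ip"] != local_ip:
            continue  # no compensation task for this producer
        # Latest message entity already stored for this MySQL cluster.
        latest = message_cluster[cluster_id][-1]
        # Read the master's binlog and forward everything newer than that entity.
        for event in master_events[cluster_id]:
            if event["txn_id"] > latest["txn_id"]:
                message_cluster[cluster_id].append(
                    {"txn_id": event["txn_id"], "data": event["data"]}
                )
        # Stand-in for the modification request that deletes the compensation node.
        del nodes[cluster_id]
        handled.append(cluster_id)
    return handled


nodes = {"cluster-1": {"producer_ip": "10.0.0.1", "cluster_id": "cluster-1"}}
message_cluster = {"cluster-1": [{"txn_id": 7, "data": "a"}]}
master_events = {
    "cluster-1": [
        {"txn_id": 7, "data": "a"},
        {"txn_id": 8, "data": "b"},
        {"txn_id": 9, "data": "c"},
    ]
}
handled = run_compensation(nodes, message_cluster, master_events, "10.0.0.1")
```

After the call, the message cluster holds the entities for transactions 7, 8, and 9, and the compensation node is gone, which is the signal the cold standby producer waits for before resuming.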
FIG. 3 is a topological diagram of a data processing method according to an embodiment of the present invention. Referring to FIG. 3, the producer module is deployed alongside the MySQL processes, and each MySQL instance has its own independent producer process. By parsing the binlog, the producer module converts binlog events into message entities and stores them in the message cluster (a distributed message queue Kafka cluster) for third parties to subscribe to; each message is a JSON-format string. The distributed coordination server cluster (a ZooKeeper cluster) communicates with the producer modules.
The data processing method of the present specification will be specifically described below.
An embodiment of the present invention provides a data processing method, and fig. 4 is a flowchart illustrating the data processing method provided in the embodiment of the present invention. The present specification provides method steps as described in the examples or flowcharts, but may include more or fewer steps based on routine or non-inventive labor. The order of steps recited in the embodiments is merely one manner of performing the steps in a multitude of orders and does not represent the only order of execution. In practice, the system or server product may be implemented in a sequential or parallel manner (e.g., parallel processor or multi-threaded environment) according to the embodiments or methods shown in the figures. Specifically, as shown in fig. 4, the method may include:
S301: According to the data stored in the master server and the slave servers of the MySQL cluster, take the slave server whose data is closest to the data stored in the master server as the cold standby server.
In a specific embodiment, the method for determining the cold standby server may include:
acquiring, for each slave server, a first timestamp of the latest data that the slave server has synchronized from the master server;
acquiring a second timestamp of the latest data in the master server;
and comparing each first timestamp with the second timestamp, and taking the slave server whose first timestamp is closest to the second timestamp as the cold standby server.
A timestamp is complete, verifiable data that can show that a piece of data existed before a specific time; it is usually a character sequence that uniquely identifies a moment in time. In the embodiment of the invention, the timestamp is used to judge how far the data in a slave server lags behind the data in the master server: the smaller the difference between the timestamp of the latest data in the slave server and the timestamp of the latest data in the master server, the faster that slave server synchronizes data and the closer its stored data is to the data in the master server. Selecting the slave server whose stored data is closest to the master server's as the cold standby server makes acquisition of the binlog file more robust.
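The timestamp comparison can be sketched in a few lines of Python. This is a minimal illustration: the function name `pick_cold_standby`, the server names, and the use of `datetime` values are assumptions made for the example, and the source does not prescribe how timestamps are represented.

```python
from datetime import datetime


def pick_cold_standby(master_ts, slave_ts):
    """Return the slave whose latest synchronized timestamp is closest to the master's.

    master_ts: timestamp of the latest data in the master server (second timestamp)
    slave_ts:  slave name -> timestamp of the latest data it has synchronized
               (first timestamps)
    """
    if not slave_ts:
        raise ValueError("no slave servers to choose from")
    return min(
        slave_ts,
        key=lambda name: abs((master_ts - slave_ts[name]).total_seconds()),
    )


master = datetime(2018, 7, 20, 12, 0, 5)
slaves = {
    "slave-a": datetime(2018, 7, 20, 12, 0, 1),    # 4 s behind
    "slave-b": datetime(2018, 7, 20, 12, 0, 4),    # 1 s behind
    "slave-c": datetime(2018, 7, 20, 11, 59, 30),  # 35 s behind
}
chosen = pick_cold_standby(master, slaves)
print(chosen)  # slave-b
```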
S302: Acquire a binary log binlog file from the cold standby server, and parse the binary log binlog file to obtain binlog analysis data.
Specifically, after the binary log binlog file is acquired from the cold standby server, it needs to be determined whether the acquired binary log binlog file is continuous with the last acquired binary log binlog file, and if the binary log file is not continuous, it indicates that the binlog file is missed, which may cause incomplete messages stored in the message cluster. Referring to fig. 5, after the binary log binlog file is obtained, the method further includes the following steps:
S502: judging, according to the acquired binary log binlog file, whether a binary log binlog file is missing;
S503: if a binary log binlog file is missing, sending a compensation request to the distributed coordination server cluster, wherein the compensation request is used for triggering the distributed coordination server cluster to create a compensation node and write compensation information into the compensation node;
S504: judging whether the compensation node has been processed;
if the compensation node has been processed, executing step S505 to parse the binary log binlog file to obtain binlog parsing data; if the compensation node has not been processed, returning to step S504;
in addition, if no binary log binlog file is missing, step S505 is executed directly to parse the binary log binlog file to obtain binlog parsing data.
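The control flow of steps S502 to S505 can be sketched as follows. The sketch rests on several assumptions: `InMemoryCoordinator` is a hypothetical stand-in for the distributed coordination server cluster, a deleted compensation node is taken to mean "processed", and the `parse` and `wait` callables are placeholders supplied by the caller.

```python
class InMemoryCoordinator:
    """Hypothetical stand-in for the distributed coordination server cluster."""

    def __init__(self):
        self.nodes = {}

    def create_compensation_node(self, cluster_id, info):
        self.nodes[cluster_id] = info

    def delete_compensation_node(self, cluster_id):
        self.nodes.pop(cluster_id, None)

    def is_processed(self, cluster_id):
        # Assumption: the node is deleted once compensation has finished.
        return cluster_id not in self.nodes


def handle_binlog(raw_binlog, missing, coordinator, cluster_id, parse, wait):
    """Run S502 to S505 for one acquired binlog file."""
    if missing:  # S502: a binlog file is missing
        # S503: request creation of a compensation node
        coordinator.create_compensation_node(cluster_id, {"cluster_id": cluster_id})
        # S504: keep checking until the compensation node has been processed
        while not coordinator.is_processed(cluster_id):
            wait()
    return parse(raw_binlog)  # S505: parse the binlog file


coordinator = InMemoryCoordinator()


def fake_wait():
    # Simulates the master-side producer finishing the compensation task.
    coordinator.delete_compensation_node("cluster-1")


result = handle_binlog("raw-binlog", True, coordinator, "cluster-1", str.upper, fake_wait)
print(result)  # RAW-BINLOG
```

In a real deployment `wait` would sleep or block on a watch rather than delete the node; it does so here only to make the example terminate.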
Two cases can be defined as binary log binlog file missing, one is that the binlog file cannot be found according to the corresponding transaction ID, and the other is that the generated messages are not continuous under the condition that the server IDs are the same.
Since the value of the transaction ID of the binary log binlog file is continuously incremented, if the value of the transaction ID is not continuous, it can be determined that the binary log binlog file is missing. The embodiment of the invention judges whether the binlog file is lost by judging whether the transaction IDs of the binary log binlog file are continuous, and specifically comprises the following steps:
(1) extracting the transaction ID of the binary log binlog file obtained this time;
(2) judging whether the transaction ID of the binary log binlog file acquired this time is continuous with the transaction ID of the binary log binlog file acquired last time;
(3) if the transaction IDs are consecutive, determining that there is no binary log binlog file miss;
(4) if the transaction ID is not contiguous, it is determined that there is a binary log binlog file miss.
The scheme of the invention further judges whether the binlog file is lost after the binlog file of the binary log is obtained, and the step of analyzing the binlog file is suspended under the condition of the loss until the lost binlog file is processed, and then the step of analyzing the binlog file is continuously executed, thereby ensuring the order, the completeness and the correctness of the message entities stored in the message cluster.
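The continuity check in steps (1) to (4) reduces to a single comparison. The sketch below assumes that transaction IDs are integers and that "continuous" means the current ID is exactly one greater than the previous one; the source only states that the IDs increase continuously.

```python
def is_binlog_missing(last_txn_id, current_txn_id):
    """Return True when the two transaction IDs are not consecutive."""
    if last_txn_id is None:
        # Nothing was acquired before, so there is nothing to compare against.
        return False
    return current_txn_id != last_txn_id + 1


print(is_binlog_missing(100, 101))  # False
print(is_binlog_missing(100, 103))  # True
```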
S303: According to the binlog analysis data, obtain the table structure information of the binary log binlog file, and update the table structure information of the binary log binlog file stored in the table storage module according to the obtained table structure information.
In a specific embodiment, the obtaining the table structure information of the binary log binlog file according to the binlog parsing data includes:
S3031: Judge whether the binlog analysis data contains DDL statements.
DDL (Data Definition Language) statements are used to define and manage the objects in a MySQL database; the main commands are CREATE, ALTER, DROP, and the like. DML (Data Manipulation Language) statements operate on the data in a database and do not change table structure information. Therefore, the embodiment of the invention acquires the table structure information through DDL statements.
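A simple way to picture the check in S3031 is a keyword test on each parsed statement. The sketch below is an assumption-laden illustration: it recognizes only the three commands named above (CREATE, ALTER, DROP), and the names `DDL_PATTERN` and `contains_ddl` are hypothetical.

```python
import re

# Matches statements that begin with one of the DDL commands named in the text.
DDL_PATTERN = re.compile(r"^\s*(CREATE|ALTER|DROP)\b", re.IGNORECASE)


def contains_ddl(statements):
    """Return True if any parsed statement starts with a DDL command."""
    return any(DDL_PATTERN.match(s) is not None for s in statements)


print(contains_ddl(["INSERT INTO t VALUES (1)", "alter table t add column c int"]))  # True
print(contains_ddl(["UPDATE t SET a = 1"]))  # False
```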
S3032: If the binlog analysis data contains a DDL statement, acquire the table structure information of the binary log binlog file according to the DDL statement, and update the table structure information of the binary log binlog file stored in the table storage module according to the acquired table structure information.
In a specific embodiment, the table storage module contains no data when it is created. When the cold standby server determines that the table storage module is empty, it initializes the table storage module so that the module stores the table structure information of the binlog file in the cold standby server. Specifically, initializing the table storage module includes:
a) flushing the tables to clear the cache (FLUSH TABLES);
b) adding a global read lock to the library tables (FLUSH TABLES WITH READ LOCK);
c) setting the session isolation level so that all reads in the transaction see a consistent snapshot (SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ);
d) starting a transaction (START TRANSACTION);
e) checking the binary log currently used by the database and the current binary log position (SHOW MASTER STATUS);
f) acquiring table structure information and storing the table structure information in the table storage module;
g) unlocking the tables (UNLOCK TABLES).
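The initialization sequence can be sketched in Python against any cursor that offers DB-API style `execute` and `fetchone` methods. The statements are the standard MySQL forms FLUSH TABLES, FLUSH TABLES WITH READ LOCK, SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ, START TRANSACTION, SHOW MASTER STATUS, and UNLOCK TABLES. Everything else is an assumption for illustration: `RecordingCursor` is a fake cursor so that the example runs without a database, `read_structures` is a hypothetical callback for step f), and the returned binlog position is made up.

```python
LOCKED_STATEMENTS = [
    "FLUSH TABLES",                                             # a) clear the table cache
    "FLUSH TABLES WITH READ LOCK",                              # b) global read lock
    "SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ",  # c) consistent reads
    "START TRANSACTION",                                        # d) open the transaction
    "SHOW MASTER STATUS",                                       # e) current binlog file and position
]


def initialize_table_store(cursor, read_structures, table_store):
    """Fill an empty table store under a global read lock; return the binlog position."""
    try:
        for statement in LOCKED_STATEMENTS:
            cursor.execute(statement)
        position = cursor.fetchone()                 # row returned by SHOW MASTER STATUS
        table_store.update(read_structures(cursor))  # f) load and store table structures
    finally:
        cursor.execute("UNLOCK TABLES")              # g) always release the lock
    return position


class RecordingCursor:
    """Fake cursor that records statements instead of talking to MySQL."""

    def __init__(self):
        self.executed = []

    def execute(self, statement):
        self.executed.append(statement)

    def fetchone(self):
        return ("mysql-bin.000003", 154)  # made-up binlog file name and position


cursor = RecordingCursor()
store = {}
position = initialize_table_store(
    cursor, lambda c: {"shop.orders": ["id", "amount"]}, store
)
```

Releasing the lock in a `finally` block keeps a failure in step f) from leaving the instance locked, which matters because the scheme is meant to stay non-intrusive.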
In a specific embodiment, the obtaining the table structure information of the binary log binlog file according to the DDL statement includes:
analyzing the DDL statement to obtain a library table name;
sending a replay request to a structure information base, the replay request including a base table name and a DDL statement;
and receiving table structure information of the binary log binlog file returned by a structure information base, wherein the table structure information comprises a base table name and table structure data.
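Building the replay request can be sketched as below. The sketch makes several assumptions: it handles only `CREATE/ALTER/DROP TABLE` statements with simple identifiers, it falls back to a caller-supplied default database when the statement does not qualify the table name, and the request is a plain dictionary whose keys `table` and `ddl` are hypothetical. How the structure information base actually replays the statement is not covered.

```python
import re

# Captures an optional database name and the table name of a table-level DDL statement.
TABLE_NAME = re.compile(
    r"^\s*(?:CREATE|ALTER|DROP)\s+TABLE\s+"
    r"(?:IF\s+(?:NOT\s+)?EXISTS\s+)?"
    r"`?(?:(\w+)`?\.`?)?(\w+)`?",
    re.IGNORECASE,
)


def build_replay_request(ddl, default_db):
    """Parse the library table name from a DDL statement and build a replay request."""
    match = TABLE_NAME.match(ddl)
    if match is None:
        raise ValueError("unsupported DDL statement: " + ddl)
    database = match.group(1) or default_db
    return {"table": database + "." + match.group(2), "ddl": ddl}


request = build_replay_request(
    "ALTER TABLE shop.orders ADD COLUMN note VARCHAR(64)", "test"
)
print(request["table"])  # shop.orders
```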
S3033: if the binlog parsing data does not include a DDL statement, determining that the table structure information of the binary log binlog file is the same as the table structure information of the binary log binlog file stored in the table storage module.
Specifically, if the binlog parsing data does not include the DDL statement, the table structure data of the binlog file corresponding to the binlog parsing data is not changed, and the table structure information stored in the table storage module does not need to be changed.
S304: and generating a message entity according to the binlog analysis data and the corresponding table structure information in the table storage module.
In one possible implementation, the binlog analysis data and the table structure information may be matched according to the library table name. Specifically: according to the library table name contained in the binlog analysis data, the table storage module is searched to obtain the table structure information containing that library table name; the queried table structure information and the binlog analysis data are then encapsulated to obtain a message entity.
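The matching by library table name can be illustrated with the short sketch below. The dictionary layout of the parsed data and of the resulting message entity is an assumption; the embodiment does not fix a concrete message format.

```python
def build_message_entity(parsed, table_store):
    """Encapsulate binlog analysis data with its matching table structure.

    parsed: dict containing at least 'library_table_name' (assumed layout).
    table_store: dict standing in for the table storage module.
    Raises KeyError when no structure is stored for the library table name.
    """
    name = parsed["library_table_name"]
    structure = table_store[name]  # lookup keyed by the library table name
    return {
        "library_table_name": name,
        "table_structure": structure,
        "binlog": parsed,
    }
```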
In a possible embodiment, after the step S304, the method further includes step S305: and sending the message entity to a message cluster, wherein the message cluster is used for storing the message entity.
In a possible embodiment, when another slave server becomes a new cold-standby server instead of the slave server currently serving as the cold-standby server, the producer module of the new cold-standby server needs to find the starting position of the binary log binlog, which can be specifically determined by the following method:
Step one: reading the latest message entity corresponding to the MySQL cluster in the message cluster;
Step two: analyzing the latest message entity to obtain a transaction ID and a transaction ID offset;
Step three: determining the starting position of the binary log binlog file according to the transaction ID and the transaction ID offset. Optionally, the starting position of the binary log binlog file to be parsed may be represented by the transaction ID of that binlog file, in which case the sum of the transaction ID of the latest message entity and the transaction ID offset is the transaction ID of the binlog file to be parsed.
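The optional computation in step three amounts to a single addition, sketched below. The key names `transaction_id` and `transaction_id_offset` are hypothetical names for the two fields read from the latest message entity.

```python
def next_transaction_id(latest_entity):
    """Return the transaction ID at which parsing should resume.

    latest_entity: dict with integer 'transaction_id' and
    'transaction_id_offset' fields (assumed layout).
    """
    return latest_entity["transaction_id"] + latest_entity["transaction_id_offset"]
```

For instance, a latest message entity with transaction ID 100 and offset 1 gives a starting transaction ID of 101.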
The embodiment of the invention is characterized in that a producer module is configured for each master server and each slave server of the MySQL cluster, a table storage module is arranged, a cold standby server is determined from each slave server, a binlog file is analyzed by the producer module of the cold standby server to obtain binlog analysis data, then table structure information is obtained by analyzing the binlog analysis data and stored in the table storage module, and when a message entity is output, the binlog analysis data and the corresponding table structure information in the table storage module are packaged into the message entity and output. The embodiment of the invention has the following beneficial effects:
1. In the prior art, accurate table structure information cannot be obtained because of the time lag between when a binlog file is written and when it is parsed. The embodiment of the invention adds the table structure information to the message entity by means of a DDL event replay mechanism, which enriches the content of the data and ensures that the table structure information in the message entity is accurate and reliable.
2. The embodiment of the invention acquires data based on the analysis of the MySQL cluster binlog file, completes data processing in a bypass mode, does not increase the MySQL cluster burden, realizes zero intrusion to the MySQL cluster, and ensures that the performance of the database is not influenced in the operation process.
3. When the cold standby server is switched from one slave server to another slave server, the initial position of the binary log binlog file is determined and analyzed, so that the continuity and the integrity of the obtained message entities can be ensured, and the occurrence of repeated data in the message cluster is avoided.
An embodiment of the present invention provides a data processing method, and fig. 7 is a schematic flow chart of another data processing method provided in the embodiment of the present invention. Referring to fig. 7, the data processing method includes:
S701: acquiring compensation information in a compensation node on the distributed coordination server cluster, wherein the compensation information comprises an identification code and an IP address of the MySQL cluster.
S702: judging whether a compensation task corresponding to the compensation information exists according to the IP address.
In a specific embodiment, step S702 may include:
S7021, judging whether the IP address in the compensation information is the same as the local IP address, that is, the IP address of the server performing the judgment;
S7022, if the IP address in the compensation information is the same as the local IP address, judging that a compensation task corresponding to the compensation information exists;
S7023, if the IP address in the compensation information is different from the local IP address, judging that a compensation task corresponding to the compensation information does not exist.
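The judgment of step S702 can be sketched as a single comparison. The sketch assumes that the comparison is made against the IP address of the server running the check; the parameter name `local_ip` and the key `ip` are hypothetical.

```python
def has_compensation_task(compensation_info, local_ip):
    """Return True when the compensation information targets this server.

    compensation_info: dict with an 'ip' field (assumed layout).
    local_ip: IP address of the server performing the check (assumption).
    """
    return compensation_info["ip"] == local_ip
```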
S703: and if the compensation task exists, acquiring the latest message entity corresponding to the MySQL cluster from the message cluster according to the identification code of the MySQL cluster.
S704: acquiring a binary log binlog file from the master server according to the acquired message entity.
In a specific embodiment, step S704 may include:
S7041, analyzing the message entity to obtain a transaction ID and a transaction ID offset;
S7042, determining the transaction ID of the binary log binlog file to be acquired according to the transaction ID and the transaction ID offset; optionally, the transaction ID of the binary log binlog file to be acquired may be the sum of the transaction ID and the transaction ID offset;
S7043, acquiring the binary log binlog file from the master server according to the transaction ID of the binary log binlog file to be acquired.
S705: and processing the binary log binlog file to obtain a message entity.
In a specific embodiment, step S705 may include:
S7051, analyzing the binary log binlog file to obtain binlog analysis data;
S7052, obtaining the table structure information of the binary log binlog file according to the binlog analysis data, and updating the table structure information of the binary log binlog file stored in a table storage module according to the obtained table structure information;
In a specific embodiment, the step S7052 may include: judging whether the binlog analysis data contains a DDL statement; if the binlog analysis data contains a DDL statement, acquiring the table structure information of the binary log binlog file according to the DDL statement, and updating the table structure information of the binary log binlog file stored in the table storage module according to the acquired table structure information; if the binlog analysis data does not contain a DDL statement, determining that the table structure information of the binary log binlog file is the same as the table structure information of the binary log binlog file stored in the table storage module. The obtaining of the table structure information of the binary log binlog file according to the DDL statement may include: analyzing the DDL statement to obtain a library table name; sending a replay request to a structure information base, the replay request including the library table name and the DDL statement; and receiving table structure information of the binary log binlog file returned by the structure information base, wherein the table structure information comprises the library table name and table structure data.
S7053, generating a message entity according to the binlog analysis data and the corresponding table structure information in the table storage module.
S706: and sending the processed message entity to a message cluster, and sending a modification request to the distributed coordination server cluster, wherein the modification request is used for triggering the distributed coordination server cluster to delete the compensation node.
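Steps S701 to S706 can be tied together in the following sketch. Every collaborator is passed in as a duck-typed object, and all method names (`read_compensation`, `delete_compensation`, `latest`, `send`, `fetch_binlog`), dictionary keys and the `process` callable are hypothetical. The sketch also assumes that the IP comparison of step S702 is made against the IP address of the server running the task.

```python
def run_compensation(coordinator, message_cluster, master, local_ip, process):
    """Perform one pass over steps S701 to S706.

    Returns the message entity that was sent, or None when no compensation
    task exists for this server.
    """
    info = coordinator.read_compensation()                    # S701
    if info is None or info["ip"] != local_ip:                # S702
        return None
    latest = message_cluster.latest(info["cluster_id"])       # S703
    # S7041/S7042: resume from transaction ID plus transaction ID offset
    start = latest["transaction_id"] + latest["transaction_id_offset"]
    binlog = master.fetch_binlog(start)                       # S7043
    entity = process(binlog)                                  # S705
    message_cluster.send(entity)                              # S706: store entity
    coordinator.delete_compensation()                         # S706: remove node
    return entity
```

Deleting the compensation node only after the message entity has been sent mirrors the order given in step S706.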
In an actual production environment, the MySQL database is often deployed as a cluster, that is, with one master and N slaves, which increases the availability of the system. The embodiment of the invention ensures the reliability of the data by relying on the high-availability characteristic of the MySQL cluster: when the producer module of the cold standby server finds that a binlog file is missing, the missing binlog file is processed by the producer module of the master server. Once the missing binlog file has been processed, the producer module of the cold standby server continues to execute the step of analyzing the binlog file, so that the completeness, order and correctness of the data in the message cluster can be ensured.
An embodiment of the present invention further provides a data processing apparatus, as shown in fig. 9, and fig. 9 is a schematic structural diagram of the data processing apparatus provided in the embodiment of the present invention. Specifically, the data processing apparatus 900 may include:
a cold standby server determining unit 901, configured to use, according to data stored in a master server and a slave server of a MySQL cluster, the slave server closest to the data stored in the master server as a cold standby server;
a binlog analyzing unit 902, configured to obtain a binary log binlog file from the cold standby server, and analyze the binary log binlog file to obtain binlog analysis data;
a table structure information updating unit 903, configured to acquire the table structure information of the binary log binlog file according to the binlog analysis data, and update the table structure information of the binary log binlog file stored in the table storage module according to the acquired table structure information;
a message entity generating unit 904, configured to generate a message entity according to the binlog parsing data and the corresponding table structure information in the table storage module.
It should be noted that: in the data processing apparatus provided in the above embodiment, only the division of the functional modules is illustrated, and in practical applications, the functions may be distributed by different functional modules as needed, that is, the internal structure of the apparatus may be divided into different functional modules to complete all or part of the functions described above. In addition, the data processing apparatus provided in the above embodiment and the data processing method provided in the above embodiment belong to the same concept, and specific implementation processes thereof are described in the method embodiment and are not described herein again.
An embodiment of the present invention further provides a data processing apparatus, as shown in fig. 10, fig. 10 is a schematic structural diagram of the data processing apparatus provided in the embodiment of the present invention. Specifically, the data processing apparatus 1000 may include:
a first obtaining unit 1001, configured to obtain compensation information in a compensation node on a distributed coordination server cluster, where the compensation information includes an identification code and an IP address of a MySQL cluster;
a judging unit 1002, configured to judge whether a compensation task corresponding to the compensation information exists according to the IP address;
a second obtaining unit 1003, configured to obtain, when there is a compensation task, a latest message entity corresponding to the MySQL cluster from the message cluster according to the identifier of the MySQL cluster;
a third obtaining unit 1004, configured to obtain a binary log binlog file from the main server according to the obtained message entity;
a processing unit 1005, configured to process the binary log binlog file acquired by the third acquiring unit, and acquire a message entity;
a sending unit 1006, configured to send the message entity obtained by processing by the processing unit to a message cluster, and send a modification request to the distributed coordination server cluster, where the modification request is used to trigger the distributed coordination server cluster to delete the compensation node.
It should be noted that: in the data processing apparatus provided in the above embodiment, only the division of the functional modules is illustrated, and in practical applications, the functions may be distributed by different functional modules as needed, that is, the internal structure of the apparatus may be divided into different functional modules to complete all or part of the functions described above. In addition, the data processing apparatus provided in the above embodiment and the data processing method provided in the above embodiment belong to the same concept, and specific implementation processes thereof are described in the method embodiment and are not described herein again.
An embodiment of the present invention provides a data processing server, where the data processing server includes a processor and a memory, where the memory stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by the processor to implement the data processing method provided in the foregoing method embodiment.
The memory may be used to store software programs and modules, and the processor may execute various functional applications and data processing by operating the software programs and modules stored in the memory. The memory can mainly comprise a program storage area and a data storage area, wherein the program storage area can store an operating system, application programs needed by functions and the like; the storage data area may store data created according to use of the apparatus, and the like. Further, the memory may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, the memory may also include a memory controller to provide the processor access to the memory.
Referring to fig. 11, the server 1100 is configured to implement the data processing method provided in the foregoing embodiment, and specifically, the server structure may include the data processing apparatus. The server 1100 may vary widely in configuration or performance, and may include one or more Central Processing Units (CPUs) 1110 (e.g., one or more processors) and memory 1130, one or more storage media 1120 (e.g., one or more mass storage devices) storing applications 1123 or data 1122. The memory 1130 and the storage medium 1120 may be, among other things, transient storage or persistent storage. The program stored in the storage medium 1120 may include one or more modules, each of which may include a series of instruction operations for a server. Still further, the central processor 1110 may be configured to communicate with the storage medium 1120, and execute a series of instruction operations in the storage medium 1120 on the server 1100. The server 1100 may also include one or more power supplies 1160, one or more wired or wireless network interfaces 1150, one or more input-output interfaces 1140, and/or one or more operating systems 1121, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.
Embodiments of the present invention also provide a storage medium, which may be disposed in a server to store at least one instruction, at least one program, a code set, or a set of instructions related to implementing a data processing method in the method embodiments, where the at least one instruction, the at least one program, the code set, or the set of instructions is loaded and executed by the processor to implement the data processing method provided by the above method embodiments.
Alternatively, in this embodiment, the storage medium may be located in at least one network server of a plurality of network servers of a computer network. Optionally, in this embodiment, the storage medium may include, but is not limited to: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program codes.
It can be seen from the above embodiments of the data processing method, apparatus, server or storage medium provided by the present invention that the present invention configures a producer module for each master server and slave server of the MySQL cluster, sets a table storage module, determines a cold standby server from each slave server, analyzes the binlog file by the producer module of the cold standby server to obtain binlog analysis data, further obtains table structure information by analyzing the binlog analysis data and stores the table structure information into the table storage module, and when outputting a message entity, encapsulates the binlog analysis data and the corresponding table structure information in the table storage module into the message entity and outputs the message entity. The scheme of the invention obtains and analyzes the binlog file in a bypass mode, does not increase the burden of the MySQL cluster, realizes zero intrusion to the MySQL cluster, and ensures that the performance of the database is not influenced in the operation process; the table structure information is obtained by analyzing binlog analysis data, and the table storage module is arranged to uniformly manage the table structure information, so that the table structure information of the binlog file is accurately checked, and meanwhile, the table structure information is added to the output message entity, and the data content is enriched.
The scheme of the invention can be used for data subscription, can improve the reliability and the availability of the data subscription, and can be used in the following scenes:
(1) Multi-center data synchronization and distribution: the data of a plurality of data centers is synchronized in quasi-real time, and a one-to-many topological structure is realized.
(2) Heterogeneous indexing: on the basis of data synchronization across a plurality of data nodes, different index structures are adopted according to service requirements, and query efficiency is optimized.
(3) Quasi-real-time data backup: multi-center data synchronization and backup are realized.
It should be noted that: the precedence order of the above embodiments of the present invention is only for description, and does not represent the merits of the embodiments. And specific embodiments thereof have been described above. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the device and server embodiments, since they are substantially similar to the method embodiments, the description is simple, and the relevant points can be referred to the partial description of the method embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (12)

1. A data processing method, comprising:
according to data stored in a master server and a slave server of the MySQL cluster, taking the slave server closest to the data stored in the master server as a cold standby server;
acquiring a binary log binlog file from the cold standby server, and analyzing the binary log binlog file to obtain binlog analysis data;
obtaining the table structure information of the binary log binlog file according to the binlog analysis data, and updating the table structure information of the binary log binlog file stored in a table storage module according to the obtained table structure information;
and generating a message entity according to the binlog analysis data and the corresponding table structure information in the table storage module.
2. The method of claim 1, wherein obtaining table structure information for the binary log binlog file from the binlog parsing data comprises:
judging whether the binlog analysis data contains a DDL statement or not;
if the binlog analysis data contains a DDL statement, acquiring the table structure information of the binary log binlog file according to the DDL statement, and updating the table structure information of the binary log binlog file stored in a table storage module according to the acquired table structure information;
if the binlog parsing data does not include a DDL statement, determining that the table structure information of the binary log binlog file is the same as the table structure information of the binary log binlog file stored in the table storage module.
3. The method of claim 2, wherein said obtaining table structure information for the binary log binlog file from the DDL statement comprises:
analyzing the DDL statement to obtain a library table name;
sending a replay request to a structure information base, the replay request including the library table name and the DDL statement;
and receiving table structure information of the binary log binlog file returned by the structure information base, wherein the table structure information comprises the library table name and table structure data.
4. The method of claim 1, further comprising, after generating a message entity based on the binlog parsing data and table structure information of the binary log binlog file in the table storage module:
and sending the message entity to a message cluster, wherein the message cluster is used for storing the message entity.
5. The method of claim 1, wherein before parsing the binary log binlog file to obtain binlog parsing data, further comprising:
judging whether the binary log binlog file is missing or not according to the acquired binary log binlog file;
if the binary log binlog file is missing, sending a compensation request to a distributed coordination server cluster, wherein the compensation request is used for triggering the distributed coordination server to create a compensation node and writing compensation information into the compensation node;
judging whether the compensation node is processed or not;
if the compensation node has been processed, performing a step of parsing the binary log binlog file to obtain binlog parsing data.
6. The method of claim 5, wherein determining whether there is a binary log binlog file miss according to the obtained binary log binlog file comprises:
extracting the transaction ID of the binary log binlog file obtained this time;
judging whether the transaction ID of the binary log binlog file acquired this time is continuous with the transaction ID of the binary log binlog file acquired last time;
if the transaction IDs are consecutive, determining that there is no binary log binlog file miss;
if the transaction ID is not contiguous, it is determined that there is a binary log binlog file miss.
7. A data processing method, comprising:
acquiring compensation information in compensation nodes on a distributed coordination server cluster, wherein the compensation information comprises an identification code and an IP address of a MySQL cluster;
judging whether a compensation task corresponding to the compensation information exists according to the IP address;
if the compensation task exists, acquiring the latest message entity corresponding to the MySQL cluster from the message cluster according to the identification code of the MySQL cluster;
according to the acquired message entity, acquiring a binary log binlog file from a main server;
analyzing the binary log binlog file to obtain binlog analysis data;
obtaining the table structure information of the binary log binlog file according to the binlog analysis data, and updating the table structure information of the binary log binlog file stored in a table storage module according to the obtained table structure information;
generating a message entity according to the binlog analysis data and the corresponding table structure information in the table storage module;
and sending the processed message entity to a message cluster, and sending a modification request to the distributed coordination server cluster, wherein the modification request is used for triggering the distributed coordination server cluster to delete the compensation node.
8. The method of claim 7, wherein the determining whether the compensation task corresponding to the compensation information exists according to the IP address comprises:
judging whether the IP address in the compensation information is the same as the local IP address;
if the IP address in the compensation information is the same as the local IP address, judging that a compensation task corresponding to the compensation information exists;
and if the IP address in the compensation information is different from the local IP address, judging that the compensation task corresponding to the compensation information does not exist.
9. The method of claim 7, wherein obtaining a binary log binlog file from a host server according to the obtained message entity comprises:
analyzing the message entity to obtain a transaction ID and a transaction ID offset;
determining the transaction ID of the binary log binlog file to be acquired according to the transaction ID and the transaction ID offset;
and acquiring the binary log binlog file from the main server according to the transaction ID of the binary log binlog file to be acquired.
10. A data processing apparatus, comprising:
the cold standby server determining unit is used for taking a slave server closest to the data stored in the master server as a cold standby server according to the data stored in the master server and the slave server of the MySQL cluster;
the binlog analyzing unit is used for acquiring a binary log binlog file from the cold standby server and analyzing the binary log binlog file to obtain binlog analyzing data;
the table structure information updating unit is used for acquiring the table structure information of the binary log binlog file according to the binlog analysis data and updating the table structure information of the binary log binlog file stored in the table storage module according to the acquired table structure information;
and the message entity generating unit is used for generating a message entity according to the binlog analysis data and the corresponding table structure information in the table storage module.
11. A data processing apparatus, comprising:
the first acquisition unit is used for acquiring compensation information in compensation nodes on a distributed coordination server cluster, wherein the compensation information comprises an identification code and an IP address of a MySQL cluster;
the judging unit is used for judging whether a compensation task corresponding to the compensation information exists or not according to the IP address;
the second acquisition unit is used for acquiring the latest message entity corresponding to the MySQL cluster from the message cluster according to the identification code of the MySQL cluster when a compensation task exists;
a third obtaining unit, configured to obtain a binary log binlog file from the main server according to the obtained message entity;
the processing unit is used for processing the binary log binlog file acquired by the third acquisition unit to acquire a message entity; and is also used for: analyzing the binary log binlog file to obtain binlog analysis data; obtaining the table structure information of the binary log binlog file according to the binlog analysis data, and updating the table structure information of the binary log binlog file stored in a table storage module according to the obtained table structure information; generating a message entity according to the binlog analysis data and the corresponding table structure information in the table storage module;
and the sending unit is used for sending the message entity obtained by the processing unit to the message cluster and sending a modification request to the distributed coordination server cluster, wherein the modification request is used for triggering the distributed coordination server cluster to delete the compensation node.
12. A storage medium having stored therein at least one instruction or at least one program, which is loaded and executed by a processor to implement a data processing method as claimed in any one of claims 1 to 6 or a data processing method as claimed in any one of claims 7 to 9.
CN201810800450.7A 2018-07-20 2018-07-20 Data processing method and device Active CN109145060B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810800450.7A CN109145060B (en) 2018-07-20 2018-07-20 Data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810800450.7A CN109145060B (en) 2018-07-20 2018-07-20 Data processing method and device

Publications (2)

Publication Number Publication Date
CN109145060A CN109145060A (en) 2019-01-04
CN109145060B true CN109145060B (en) 2020-09-04

Family

ID=64801179

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810800450.7A Active CN109145060B (en) 2018-07-20 2018-07-20 Data processing method and device

Country Status (1)

Country Link
CN (1) CN109145060B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111797158B (en) * 2019-04-08 2024-04-05 北京沃东天骏信息技术有限公司 Data synchronization system, method and computer readable storage medium
CN110704000B (en) * 2019-10-10 2023-05-30 北京字节跳动网络技术有限公司 Data processing method, device, electronic equipment and storage medium
CN111026813A (en) * 2019-12-18 2020-04-17 紫光云(南京)数字技术有限公司 High-availability quasi-real-time data synchronization method based on MySQL
CN113760920A (en) * 2020-08-20 2021-12-07 北京沃东天骏信息技术有限公司 Data synchronization method and device, electronic equipment and storage medium
CN112612859A (en) * 2020-12-31 2021-04-06 上海英方软件股份有限公司 DDL analysis method and device based on log analysis

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102426594A (en) * 2011-10-31 2012-04-25 沈文策 Method and system for operating database
CN103838780A (en) * 2012-11-27 2014-06-04 阿里巴巴集团控股有限公司 Data recovery method of database and relevant device
CN104699541A (en) * 2015-03-30 2015-06-10 北京奇虎科技有限公司 Method, device, data transmission assembly and system for synchronizing data
CN104765659A (en) * 2015-04-30 2015-07-08 北京奇虎科技有限公司 Data recovery method and device applied to database
CN105260486A (en) * 2015-11-23 2016-01-20 郑州悉知信息科技股份有限公司 Data processing method, device and system
CN105447014A (en) * 2014-08-15 2016-03-30 阿里巴巴集团控股有限公司 Metadata management method based on binglog, and method and device used for providing metadata
CN107291926A (en) * 2017-06-29 2017-10-24 搜易贷(北京)金融信息服务有限公司 A kind of binlog analysis methods
CN107818431A (en) * 2016-09-14 2018-03-20 北京京东尚科信息技术有限公司 A kind of method and system that order track data is provided
CN108170768A (en) * 2017-12-25 2018-06-15 腾讯科技(深圳)有限公司 database synchronization method, device and readable medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100580843B1 (en) * 2003-12-22 2006-05-16 한국전자통신연구원 Channel transfer function matrix processing device and processing method therefor in V-BLAST

Also Published As

Publication number Publication date
CN109145060A (en) 2019-01-04

Similar Documents

Publication Publication Date Title
CN109145060B (en) Data processing method and device
US11888599B2 (en) Scalable leadership election in a multi-processing computing environment
WO2019154394A1 (en) Distributed database cluster system, data synchronization method and storage medium
CN108121782B (en) Distribution method of query request, database middleware system and electronic equipment
US11157445B2 (en) Indexing implementing method and system in file storage
US10417103B2 (en) Fault-tolerant methods, systems and architectures for data storage, retrieval and distribution
US10990629B2 (en) Storing and identifying metadata through extended properties in a historization system
CN106874281B (en) Method and device for realizing database read-write separation
CN107181686B (en) Method, device and system for synchronizing routing table
EP3084631A1 (en) Data synchonization in a storage network
US20170031948A1 (en) File synchronization method, server, and terminal
CN106899654B (en) Sequence value generation method, device and system
US20150363484A1 (en) Storing and identifying metadata through extended properties in a historization system
CN114647698A (en) Data synchronization method and device and computer storage medium
CN113806301A (en) Data synchronization method, device, server and storage medium
CN112612850A (en) Data synchronization method and device
CN109471901B (en) Data synchronization method and device
CN111147226B (en) Data storage method, device and storage medium
CN115004662A (en) Data synchronization method, data synchronization device, data storage system and computer readable medium
CN115104295A (en) Data processing method, data processing device, electronic device and storage medium
US10860580B2 (en) Information processing device, method, and medium
CN115293365A (en) Management method, device, management platform and storage medium of machine learning model
CN113032408B (en) Data processing method, system and equipment
CN111522688B (en) Data backup method and device for distributed system
CN111177162A (en) Data synchronization method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 2023-09-21

Address after: 518057 35th Floor, Tencent Building, No. 1 High-tech Zone, Nanshan District, Shenzhen City, Guangdong Province

Patentee after: TENCENT TECHNOLOGY (SHENZHEN) Co.,Ltd.

Patentee after: TENCENT CLOUD COMPUTING (BEIJING) Co.,Ltd.

Address before: 518057 35th Floor, Tencent Building, No. 1 High-tech Zone, Nanshan District, Shenzhen City, Guangdong Province

Patentee before: TENCENT TECHNOLOGY (SHENZHEN) Co.,Ltd.