CN109408581B - Data exchange method, device, equipment and storage medium - Google Patents

Data exchange method, device, equipment and storage medium

Info

Publication number
CN109408581B
Authority
CN
China
Prior art keywords
data source
abnormal
control node
node
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811348046.7A
Other languages
Chinese (zh)
Other versions
CN109408581A (en)
Inventor
林鹏程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dt Dream Technology Co Ltd
Original Assignee
Hangzhou Dt Dream Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dt Dream Technology Co Ltd
Priority to CN201811348046.7A
Publication of CN109408581A
Application granted
Publication of CN109408581B
Legal status: Active
Anticipated expiration

Landscapes

  • Hardware Redundancy (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data exchange method applied to a first working node of a data exchange system, in which a control node of the data exchange system is communicatively connected to each working node. The method comprises the following steps: when a first exchange job is to be performed on a first data source, determining whether exception registration information of the first data source exists in a previously obtained exception information base synchronized by the control node; if so, performing anomaly detection on the first data source, and shortening a preset anomaly detection timeout and/or reducing a preset number of anomaly detection retries during the anomaly detection; and determining, based on the corresponding detection result, whether to terminate or run the first exchange job. Applying the technical solution provided by the embodiments of the invention reduces the overall waiting time of exchange jobs related to the first data source and improves data exchange efficiency. The invention also discloses a data exchange device, equipment and a storage medium having corresponding technical effects.

Description

Data exchange method, device, equipment and storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a data exchange scheduling method, apparatus, device, and storage medium.
Background
In the information age, the volume of enterprise business data keeps growing. To make better use of this data, most enterprises use ETL technology to convert data into information and knowledge, or to transfer data from business systems to a data warehouse. Making effective use of data has become a major bottleneck in improving an enterprise's core competitiveness.
ETL is an abbreviation of Extract-Transform-Load and describes the process of extracting (Extract), transforming (Transform), and loading (Load) data from a source to a destination.
In order to improve data exchange capacity and support access to more data sources, a data exchange system using ETL technology is deployed as a cluster. As shown in FIG. 1, a control node of the data exchange system manages a plurality of working nodes in a unified manner, schedules exchange jobs, and allocates each exchange job to one or more working nodes for data exchange. The working nodes can be scaled horizontally. They carry out the actual data exchange work: they connect the source-end database and the target-end database, and periodically extract, transform and load data based on the exchange jobs.
In the prior art, the control node mainly allocates and schedules exchange jobs, and the working nodes establish exchange links and exchange data based on those jobs. When the data source corresponding to an exchange job is abnormal and cannot respond to a working node's request, the working node makes multiple attempts, so the exchange job runs for a long time. Meanwhile, other working nodes continue to run exchange jobs on the same data source according to the job schedule. As a result, every working node spends a long time on exchange jobs for that data source. Because the job-running resources of each working node are limited, other exchange jobs cannot be processed promptly, and data exchange efficiency is low.
Disclosure of Invention
The invention aims to provide a data exchange method, a data exchange device, data exchange equipment and a storage medium, so as to improve the data exchange efficiency.
In order to solve the technical problems, the invention provides the following technical scheme:
a data exchange method is applied to a first working node of a data exchange system, the data exchange system comprises a control node and a plurality of working nodes, the control node is respectively in communication connection with each working node, the first working node is any one working node in the data exchange system, and the method comprises the following steps:
when a first exchange operation is to be carried out on a first data source, determining whether the abnormality registration information of the first data source exists in an abnormality information base which is obtained in advance and is synchronized by the control node;
if so, carrying out anomaly detection on the first data source, and shortening preset anomaly detection timeout time and/or reducing preset anomaly detection retry times in the process of carrying out anomaly detection on the first data source;
determining to terminate or run the first exchange job based on a corresponding detection result.
In one embodiment of the present invention, when it is determined that the anomaly registration information of the first data source exists in the anomaly information base, before the anomaly detection of the first data source, the method further includes:
determining whether the exception level of the exception registration information of the first data source recorded in the exception information base is a fault level;
if so, setting the first exchange job to a fault suspension state, and executing the step of performing anomaly detection on the first data source only once fault recovery information for the first data source is received;
the fault level is the exception level to which the control node updates the exception registration information when the number of exception registration records of the first data source reaches M or the number of exception registration nodes of the first data source reaches N, wherein M and N are positive integers.
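The fault-level rule can be illustrated with a minimal Python sketch. The function name `is_fault_level` and its parameter names are hypothetical and do not come from the patent; M and N are the positive integers named above, and any concrete values are chosen by the operator.

```python
def is_fault_level(record_count: int, node_count: int, m: int, n: int) -> bool:
    """Return True when the number of exception registration records reaches M
    or the number of exception registration nodes reaches N."""
    if m <= 0 or n <= 0:
        raise ValueError("M and N must be positive integers")
    return record_count >= m or node_count >= n
```

For example, with M = 3 and N = 5, a data source registered three times by a single node would already be treated as fault level under this sketch.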
In one embodiment of the present invention, the method further comprises:
and if the detection result is that the first data source is normal, sending state confirmation information aiming at the first data source to the control node, so that the control node sends an execution instruction for carrying out abnormality detection on the first data source to an abnormality registration node of the first data source after receiving the state confirmation information.
In one embodiment of the present invention, the method further comprises:
and if the detection result indicates that the first data source is abnormal, generating an abnormal registration table of the first data source, and reporting the abnormal registration table to the control node, so that the control node updates the abnormal information base based on the abnormal registration table, and synchronizes the updated abnormal information base to each working node.
In a specific embodiment of the present invention, after determining that the first exchange job is terminated when the detection result is that the first data source has an abnormality, the method further includes:
when an execution instruction which is sent by the control node and used for carrying out anomaly detection on the first data source is received, carrying out anomaly detection on the first data source;
if the first data source is detected to be normal, returning abnormal elimination information to the control node, so that the control node updates the abnormal information base based on the abnormal elimination information, and synchronizes the updated abnormal information base to each working node;
if the first data source is detected to be abnormal, returning reconfirmation information to the control node, so that the control node updates the exception information base based on the reconfirmation information and synchronizes the updated exception information base to each working node.
In a specific embodiment of the present invention, after anomaly detection is performed on the first data source upon receiving the execution instruction sent by the control node, and the first data source is detected to be normal, the method further includes:
receiving and executing a second exchange job for the first data source that the control node has rescheduled from a second working node to the first working node;
the second working node being any working node, other than the first working node, that has returned reconfirmation information for the first data source to the control node.
In one embodiment of the present invention, the method further comprises:
executing the step of performing anomaly detection on the first data source when determining that the anomaly registration information of the first data source does not exist in the anomaly information base;
if the first data source is detected to be abnormal, generating an abnormal registration table of the first data source, sending the abnormal registration table of the first data source to the control node, so that the control node updates the abnormal information base based on the abnormal registration table of the first data source, and synchronizes the updated abnormal information base to each working node.
A data exchange device is applied to a first working node of a data exchange system, the data exchange system comprises a control node and a plurality of working nodes, the control node is respectively in communication connection with each working node, the first working node is any one working node in the data exchange system, and the device comprises:
the exception information determining module is used for determining, when the first exchange job is to be performed on the first data source, whether exception registration information of the first data source exists in a previously obtained exception information base synchronized by the control node, and if so, triggering the anomaly detection module;
the anomaly detection module is used for carrying out anomaly detection on the first data source, and shortening preset anomaly detection timeout time and/or reducing preset anomaly detection retry times in the process of carrying out anomaly detection on the first data source;
and the exchange job processing module is used for determining to terminate or run the first exchange job based on the corresponding detection result.
Data exchange equipment is applied to a first working node of a data exchange system, the data exchange system comprises a control node and a plurality of working nodes, the control node is respectively in communication connection with each working node, the first working node is any one working node in the data exchange system, and the data exchange equipment comprises:
a memory for storing a computer program;
a processor for implementing the steps of any of the above data exchange methods when executing the computer program.
A computer-readable storage medium, having stored thereon a computer program which, when executed by a processor, carries out the steps of the data exchange method of any one of the preceding claims.
By applying the technical scheme provided by the embodiment of the invention, when the first exchange operation is to be performed on the first data source, the first working node can firstly determine whether the abnormal registration information of the first data source exists in the abnormal information base synchronized by the control node, if so, the first working node can perform abnormal detection on the first data source, shorten the preset abnormal detection timeout time and/or reduce the preset abnormal detection retry times in the process of performing the abnormal detection on the first data source, and then determine to terminate or operate the first exchange operation based on the corresponding detection result. When the abnormal registration information of the first data source exists in the abnormal information base, the possibility that the first data source still has abnormality is high, and the first working node shortens the preset abnormal detection timeout time and/or reduces the preset abnormal detection retry times in the process of carrying out abnormal detection on the first data source, so that the overall waiting time of the related exchange operation of the first data source can be reduced, and the data exchange efficiency is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic diagram of a data exchange system in the prior art;
FIG. 2 is a flow chart of a method for data exchange according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a data exchange device according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of data exchange equipment according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the disclosure, the invention will be described in further detail with reference to the accompanying drawings and specific embodiments. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention aims to provide a data exchange method, which is applied to a first working node of a data exchange system. As shown in fig. 1, the data exchange system includes a control node and a plurality of working nodes, and the control node is respectively connected with each working node in communication. The first working node is any one working node of the data exchange system. The control node manages all the working nodes in a unified mode, carries out scheduling on the exchange jobs, and distributes the exchange jobs to 1 or more working nodes for data exchange. The working nodes can be transversely expanded to perform specific data exchange work, connect the source end and the destination end, and periodically extract, interactively convert and load data based on exchange work.
Referring to fig. 2, there is shown a flow chart of an implementation of a data exchange method provided in an embodiment of the present invention, where the method may include the following steps:
S210: when a first exchange job is to be performed on a first data source, it is determined whether exception registration information of the first data source exists in a previously obtained exception information base synchronized by the control node.
In the embodiment of the present invention, the control node may maintain an exception information base, and the exception information base records exception registration information of the data source. The abnormal registration information of the data source recorded in the abnormal information base can be fed back to the control node when the working node detects that the data source is abnormal, or can be detected and recorded by operation and maintenance personnel according to actual conditions.
The exception registration information of the data source recorded in the exception information base may include exception data source information, such as a data source address, database information, instance information, table space information, table information, and the like, and may further include an exception level, exception registration node information, and the like.
The following description will be given by taking an example in which the working node a detects an abnormality of a data source and feeds back an abnormality registration table to the control node when running an exchange job.
When an exchange job that working node A runs periodically encounters a job exception, working node A starts the anomaly detection mechanism, determines the exception level, and records a data source exception registration table, as follows:
the operation and maintenance personnel can set different exception levels or types according to different data sources. Taking an Oracle database as an example, the exception level may be a table level, a table space level, a database instance level, a database level, a network exception level, and the like.
When working node A starts an exchange job, it first tries to establish a connection with the data source corresponding to the exchange job. If the connection is established successfully but the table information cannot be read, working node A further determines whether the information of other tables in the same table space can be acquired. If it can, the problem is preliminarily considered a table-level problem, which may be defined as the P1 level. If it cannot, the problem is determined to be a table-space-level problem, which may be defined as the P2 level. Either problem may be caused by the table or the table space having been modified or deleted.
If working node A fails to connect to the data source, it cannot accurately determine whether the problem lies at the database instance level, at the database level, or in the network, and the problem may be defined as the P3 level.
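The classification described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the enum `ExceptionScope`, the function `classify`, and the three boolean inputs are hypothetical names, and the sketch uses descriptive scope names rather than the level labels of the text.

```python
from enum import Enum
from typing import Optional


class ExceptionScope(Enum):
    TABLE = "table"            # only the target table cannot be read
    TABLESPACE = "tablespace"  # no table in the table space can be read
    CONNECTION = "connection"  # instance, database or network; not distinguishable


def classify(connected: bool, table_readable: bool,
             other_tables_readable: bool) -> Optional[ExceptionScope]:
    """Return None when no exception is observed, otherwise the scope
    that the working node can infer from its probes."""
    if not connected:
        return ExceptionScope.CONNECTION
    if table_readable:
        return None
    if other_tables_readable:
        return ExceptionScope.TABLE
    return ExceptionScope.TABLESPACE
```

The three inputs would come from the working node's own connection attempt and read attempts; how those probes are issued depends on the database in use and is not shown here.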
The working node A reports the data source exception registration table to the control node, and the data source exception registration table can contain exception data source information, such as a data source address, database information, instance information, table space information, table information and the like, exception levels, exception registration node information and the like.
And after receiving the data source abnormity registration table reported by the working node A, the control node extracts relevant information from the data source abnormity registration table and updates an abnormity information base maintained by the control node.
Specifically, the control node may retrieve the exception registration information summarized in the exception information base, and determine whether the data source in the received data source exception registration table is a new data source. If yes, the information item can be stored in the abnormal information base, the number of times of the registration item is recorded as 1, and the updated abnormal information base is synchronized to other working nodes. If the registration item information of the same data source address already exists in the exception information base, comparing the registration item information with the database information and the exception level in the received data source exception registration table:
when the exception level is judged to be the P1 level, if the table space information and the table information are consistent, the exception registration node information is further compared: if it is consistent, the exception record count is increased, and if it is inconsistent, the exception registration node information is added; if the table space information or the table information is inconsistent, a data source exception entry is added together with an exception record for the new data table;
when the abnormal level is judged to be the level P2, if the tablespace information is consistent, the abnormal registration node information is further judged, if the abnormal registration node information is consistent, the abnormal recording times are increased, and if the abnormal registration node information is inconsistent, the abnormal registration node information is added; if the tablespaces are inconsistent, adding a data source exception entry and adding an exception record related to the new tablespace;
when the abnormal level is judged to be P3, if the database instances are consistent, further judging abnormal registration node information, if the abnormal registration node information is consistent, increasing the abnormal recording times, and if the abnormal registration node information is inconsistent, adding the abnormal registration node information; if the database instances are not consistent but the databases are consistent, adding abnormal information records and adding abnormal records related to new database instances; and if the database instances are inconsistent and the databases are inconsistent, adding an exception information record and adding an exception record related to the new database.
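The three update rules above share one pattern: find the entry whose identifying fields match at the reported level, then either increase the record count (same registering node), add the node (different node), or create a new entry. A minimal Python sketch of that pattern follows. All class, field and function names are hypothetical. The text does not say whether a newly added node also increases the record count, so this sketch leaves the count unchanged in that case.

```python
from dataclasses import dataclass, field


@dataclass
class Registration:
    node: str            # reporting working node
    scope: str           # "table", "tablespace" or "connection" (assumed labels)
    address: str         # data source address
    database: str = ""
    instance: str = ""
    tablespace: str = ""
    table: str = ""


@dataclass
class Entry:
    count: int = 1
    nodes: set = field(default_factory=set)


def entry_key(r: Registration) -> tuple:
    """Fields that must be consistent for two reports to share an entry."""
    if r.scope == "table":
        return (r.address, r.scope, r.tablespace, r.table)
    if r.scope == "tablespace":
        return (r.address, r.scope, r.tablespace)
    return (r.address, r.scope, r.database, r.instance)


def update_base(base: dict, r: Registration) -> Entry:
    """Apply one reported registration table to the exception information base."""
    key = entry_key(r)
    entry = base.get(key)
    if entry is None:                  # new data source or new table/space/instance
        entry = Entry(count=1, nodes={r.node})
        base[key] = entry
    elif r.node in entry.nodes:        # same registering node: count the repeat
        entry.count += 1
    else:                              # different node: record it
        entry.nodes.add(r.node)
    return entry
```

After each call, the control node would synchronize the updated base to the working nodes, as the following paragraph describes.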
Whenever the exception information base is updated, the control node can synchronize it to each working node under its management. Each working node thus obtains the exception information base synchronized by the control node.
The report of the abnormal registry of the data source by the working node and the update of the abnormal information base by the control node are described above by way of example, in practical application, different registration strategies and update strategies can be formulated according to different application scenarios, and the embodiments of the present invention are not described herein any more.
The first working node may periodically perform the first exchange job with the first data source based on the scheduling of the exchange job by the control node. When the first exchange job is to be performed on the first data source, it may be first queried in an exception information base obtained in advance whether exception registration information of the first data source exists, and if it is determined that the exception registration information exists, it indicates that an exception has occurred in the first data source before that, and the first working node may continue to perform the operation of step S220.
S220: and carrying out anomaly detection on the first data source, and shortening preset anomaly detection timeout time and/or reducing preset anomaly detection retry times in the process of carrying out anomaly detection on the first data source.
The first working node may perform anomaly detection on the first data source before running the first exchange job on the first data source. In the embodiment of the present invention, the abnormality detection timeout time and the abnormality detection retry number may be preset, and the preset specific value may be set and adjusted according to an actual situation, which is not limited in the embodiment of the present invention.
If the first working node determines that the abnormality registration information of the first data source exists in the abnormality information base, the first working node can shorten the preset abnormality detection timeout time and/or reduce the preset abnormality detection retry times in the process of performing abnormality detection on the first data source.
It can be understood that, if the exception registration information of the first data source exists in the exception information base, it indicates that an exception occurs in the first data source before that, and the exception may be reported by the first working node or reported by other working nodes. In this case, the detection result of the first working node for performing the anomaly detection on the first data source may still be that the first data source is anomalous, and if the anomaly detection timeout time is long or the number of anomaly detection retries is large, the time required for performing the first exchange operation on the first data source will be long, and the data exchange efficiency will be low. The embodiment of the invention shortens the preset abnormity detection overtime time and/or reduces the preset abnormity detection retry times, can shorten the time required by the first exchange operation on the first data source, and improves the data exchange efficiency.
S230: based on the corresponding detection result, it is determined to terminate or run the first exchange job.
After the first working node performs anomaly detection on the first data source, two detection results may be generated, one is that the first data source is anomalous, and the other is that the first data source is normal. The first working node may determine to terminate the first swap job if the first data source is anomalous and may determine to run the first swap job if the first data source is normal.
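Steps S210 to S230 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the constants, the `probe` callback (which attempts to reach the data source within a timeout and returns True on success), and the function names are all assumptions, and the timeout and retry values are arbitrary.

```python
from typing import Callable, Container

# Assumed preset values and their reduced counterparts (arbitrary numbers).
DEFAULT_TIMEOUT_S, DEFAULT_RETRIES = 30.0, 3
REDUCED_TIMEOUT_S, REDUCED_RETRIES = 5.0, 1


def detect(probe: Callable[[float], bool], timeout_s: float, retries: int) -> bool:
    """Return True if the data source answers within `retries` attempts."""
    for _ in range(retries):
        try:
            if probe(timeout_s):
                return True
        except TimeoutError:
            pass  # treat a timed-out attempt as a failed attempt
    return False


def handle_exchange_job(source_id: str, exception_base: Container[str],
                        probe: Callable[[float], bool],
                        run_job: Callable[[], None]) -> str:
    # S210: is the data source registered in the synchronized exception base?
    registered = source_id in exception_base
    # S220: detect with shortened timeout / fewer retries when registered.
    timeout_s = REDUCED_TIMEOUT_S if registered else DEFAULT_TIMEOUT_S
    retries = REDUCED_RETRIES if registered else DEFAULT_RETRIES
    healthy = detect(probe, timeout_s, retries)
    # S230: run the job on a normal source, otherwise terminate it.
    if healthy:
        run_job()
        return "run"
    return "terminated"
```

With these example values, a job against a registered, still-unreachable data source gives up after one 5-second attempt instead of three 30-second attempts, which is the waiting-time reduction the method aims for.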
By applying the method provided by the embodiment of the present invention, when the first exchange job is to be performed on the first data source, the first working node may first determine whether exception registration information of the first data source exists in the exception information base synchronized by the control node. If so, it performs anomaly detection on the first data source, shortening the preset anomaly detection timeout and/or reducing the preset number of anomaly detection retries during the detection, and then determines to terminate or run the first exchange job based on the corresponding detection result. When exception registration information of the first data source exists in the exception information base, the first data source is likely to still be abnormal. Because the first working node shortens the preset anomaly detection timeout and/or reduces the preset number of retries, the overall waiting time of exchange jobs related to the first data source is reduced and data exchange efficiency is improved.
In a specific embodiment of the present invention, if the detection result is that the first data source is normal, status confirmation information for the first data source may be sent to the control node, so that the control node sends an execution instruction for performing anomaly detection on the first data source to the anomaly registration node of the first data source after receiving the status confirmation information.
When the first working node needs to perform first exchange operation on the first data source, it is determined that the abnormal registration information of the first data source exists in the abnormal information base, and after the first working node performs abnormal detection on the first data source, it finds that the first data source is normal, and after the first exchange operation runs normally, it may send status confirmation information for the first data source to the control node. After receiving the state confirmation information, the control node may send an execution instruction for performing anomaly detection on the first data source to an anomaly registration node of the first data source, that is, a working node that has reported the anomaly registration table of the first data source before, where each working node that has reported the anomaly registration table of the first data source before may start anomaly detection on the first data source, and if it is detected that the first data source is normal, may return anomaly removal information to the control node. The control node can update the abnormal information base based on the abnormal elimination information and synchronize the updated abnormal information base to each working node. In this way, each working node can perform related information query based on the updated abnormal information base.
In another embodiment of the present invention, if the detection result indicates that the first data source has an abnormality, an abnormality registration table of the first data source may be generated, and the abnormality registration table is reported to the control node, so that the control node updates the abnormality information base based on the abnormality registration table, and synchronizes the updated abnormality information base to each working node.
When the first working node needs to perform the first exchange operation on the first data source, it determines that the exception registration information of the first data source exists in the exception information base, and after it detects the exception of the first data source, it finds that the first data source has an exception, and may generate an exception registration table of the first data source according to the exception state of the first data source, and report the exception registration table to the control node. After receiving the exception registration table reported by the first working node, the control node may update the exception information base based on the exception registration table, and then synchronize the updated exception information base to each working node. And each working node can perform related information query based on the updated abnormal information base.
After determining that the first exchange job is terminated when the detection result is that the first data source has an abnormality, the method may further include the steps of:
step one: when an execution instruction which is sent by the control node and used for carrying out anomaly detection on the first data source is received, carrying out anomaly detection on the first data source;
step two: if the first data source is detected to be normal, returning abnormal elimination information to the control node, so that the control node updates the abnormal information base based on the abnormal elimination information, and synchronizes the updated abnormal information base to each working node;
step three: if the first data source is detected to be abnormal, returning reconfirmation information to the control node, so that the control node updates the exception information base based on the reconfirmation information and synchronizes the updated exception information base to each working node.
For convenience of description, the above three steps are combined for illustration.
In the embodiment of the present invention, when the first working node detects that the first data source is abnormal while performing anomaly detection on it, the first working node may report an abnormality registration table of the first data source to the control node, and the control node records the abnormality registration information of the first data source in the abnormality information base. In this case, if another working node detects that the first data source is normal when performing anomaly detection on it, that working node may send state confirmation information for the first data source to the control node. After receiving the state confirmation information, the control node may send an execution instruction for performing anomaly detection on the first data source to the anomaly registration nodes of the first data source.
The first working node is one of the abnormal registration nodes reporting the abnormal registration table of the first data source. When the first working node receives an execution instruction sent by the control node for performing anomaly detection on the first data source, the first working node may start to perform anomaly detection on the first data source.
If the first data source is detected to be normal, abnormal elimination information can be returned to the control node, so that the control node can update the abnormal information base based on the abnormal elimination information and synchronize the updated abnormal information base to each working node. Specifically, the control node may remove the exception registration information of the first data source from the exception information base after receiving the abnormal elimination information returned by all exception registration nodes of the first data source. At the same time, the control node can switch the working state of the first data source to normal and resume normal exchange job scheduling.
If the first data source is detected to be abnormal, reconfirmation information can be returned to the control node, so that the control node can update the abnormal information base based on the reconfirmation information and synchronize the updated abnormal information base to each working node. Specifically, on receiving the reconfirmation information, the control node may determine that some working nodes are probably unable to connect to the data source for network reasons, update the exception level contained in the exception registration information of the first data source in the exception information base to a partial-node exception state, and register the information of the working nodes that reported the exception.
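The way the control node might combine the replies from the exception registration nodes can be sketched as follows. This is a hedged illustration: the reply strings `"eliminated"` and `"reconfirmed"`, the level names, and the entry layout are assumptions made for this example and do not come from the embodiment.

```python
def apply_recheck_results(entry: dict, results: dict) -> str:
    """Update one exception entry from re-detection replies.

    entry   -- {"nodes": set of registration node names, "level": str}
    results -- node name -> "eliminated" or "reconfirmed" (hypothetical values)
    Returns "partial", "cleared" or "pending" for this sketch.
    """
    reconfirmed = {n for n, r in results.items() if r == "reconfirmed"}
    if reconfirmed:
        # At least one node still cannot reach the data source: mark the
        # entry as a partial-node exception and remember which nodes failed.
        entry["level"] = "partial_node_exception"
        entry["abnormal_nodes"] = reconfirmed
        return "partial"
    eliminated = {n for n, r in results.items() if r == "eliminated"}
    if entry["nodes"] <= eliminated:
        # Every registration node reported elimination: clear the entry.
        entry["nodes"] = set()
        entry["level"] = "normal"
        return "cleared"
    return "pending"  # still waiting for some registration nodes to reply


entry = {"nodes": {"w1", "w2"}, "level": "abnormal"}
first = apply_recheck_results(entry, {"w1": "eliminated"})
second = apply_recheck_results(entry, {"w1": "eliminated", "w2": "eliminated"})
```

In this sketch the entry is cleared only once both registration nodes have reported elimination, which mirrors the condition that the information is removed after all exception registration nodes have replied.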
In a specific embodiment of the present invention, when an execution instruction sent by the control node for performing anomaly detection on the first data source is received, anomaly detection is performed on the first data source, and the first data source is detected to be normal, the method may further include the following steps:
receiving and running a second exchange job for the first data source that the control node has rescheduled from a second working node;
the second working node is any node, other than the first working node, that returned reconfirmation information for the first data source to the control node.
In the embodiment of the present invention, the first working node performs anomaly detection on the first data source when it receives the execution instruction for anomaly detection sent by the control node, and may return abnormal elimination information to the control node when it detects that the first data source is normal. When the second working node receives the same execution instruction and performs anomaly detection on the first data source, it may return reconfirmation information for the first data source to the control node if it detects that the first data source is still abnormal. That is, for the same data source, some working nodes may detect that it is normal while others detect that it is abnormal. When the control node receives the reconfirmation information, it can judge that some working nodes are probably unable to connect to the data source for network reasons, update the abnormal level corresponding to the data source in the abnormal information base to a partial-node abnormal state, and register the information of the nodes that reported the abnormality.
In this case, the control node may reschedule the second exchange job for the first data source, which was originally scheduled to the second working node, to the first working node, and the first working node runs the second exchange job. This keeps the exchange job running and improves the reliability of exchange job scheduling.
That is to say, by comparing the differing abnormality information reported by the working nodes, the control node can judge how connectivity between the data source and the individual working nodes differs, and by adjusting the schedule it can dispatch the failed exchange job to a working node that is able to run it, which improves scheduling reliability.
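The rescheduling described above can be sketched as follows. The job table layout, the reply strings, and the round-robin choice among healthy nodes are assumptions made for this illustration; the embodiment does not prescribe a particular selection strategy.

```python
def reschedule_jobs(jobs: dict, data_source: str, results: dict) -> dict:
    """Move jobs for `data_source` off nodes that reconfirmed the exception.

    jobs    -- job id -> (data source name, assigned node name)
    results -- node name -> "eliminated" or "reconfirmed" (hypothetical values)
    Returns a new assignment table; the input is left unchanged.
    """
    healthy = sorted(n for n, r in results.items() if r == "eliminated")
    failed = {n for n, r in results.items() if r == "reconfirmed"}
    if not healthy:
        return dict(jobs)  # no node can reach the data source; nothing to move
    new_jobs = {}
    for i, (job_id, (src, node)) in enumerate(sorted(jobs.items())):
        if src == data_source and node in failed:
            node = healthy[i % len(healthy)]  # simple round-robin, an assumption
        new_jobs[job_id] = (src, node)
    return new_jobs


jobs = {"job2": ("orders_db", "w2"), "job3": ("users_db", "w2")}
replies = {"w1": "eliminated", "w2": "reconfirmed"}
new_jobs = reschedule_jobs(jobs, "orders_db", replies)
```

Only the job that targets the affected data source is moved; jobs on the same working node that use other data sources keep their assignment.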
In an embodiment of the present invention, when it is determined that the anomaly registration information of the first data source exists in the anomaly information base, before performing anomaly detection on the first data source, the method may further include the following steps:
the first step: determining whether the abnormal level of the abnormal registration information of the first data source recorded in the abnormal information base is a fault level, and if so, executing the second step;
the second step: setting the first exchange job to a fault suspension state, and not executing the step of performing anomaly detection on the first data source until fault recovery information for the first data source is received;
the fault level is the abnormal level to which the control node updates the abnormal registration information when the number of recorded abnormal registrations of the first data source reaches M or the number of abnormal registration nodes of the first data source reaches N, where M and N are positive integers.
For convenience of description, the above two steps are combined for illustration.
The exception registration information of an abnormal data source recorded in the exception information base contains an exception level. When the number of exception registrations recorded by the control node for the first data source reaches M, or the number of exception registration nodes of the first data source reaches N, the control node can update the exception level in the exception registration information of the first data source in the exception information base to the fault level and synchronize the updated exception information base to each working node. M and N are positive integers that can be set and adjusted according to actual conditions; for example, both may be set to 2.
When the first working node determines that the abnormal level of the abnormal registration information of the first data source recorded in the abnormal information base is the fault level, it can directly set the first exchange job to the fault suspension state without performing anomaly detection on the first data source and without connecting to it or scheduling the exchange. This saves detection time and allows other exchange jobs to run promptly.
When the first data source recovers from the fault, the control node or operation and maintenance personnel may send fault recovery information for the first data source to the first working node. On receiving the fault recovery information, the first working node may perform anomaly detection on the first data source.
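The threshold rule and the suspension behavior can be summarized in a short sketch. The level names, the return strings, and the default thresholds of 2 are assumptions for illustration only.

```python
M = 2  # registration-count threshold (illustrative value)
N = 2  # registration-node threshold (illustrative value)


def exception_level(record_count: int, node_count: int, m: int = M, n: int = N) -> str:
    """Level the control node would assign in this sketch."""
    if record_count >= m or node_count >= n:
        return "fault"
    return "abnormal"


def next_action(entry, recovery_received: bool) -> str:
    """Decide what a working node does before running an exchange job.

    entry is None when the data source has no registration, otherwise a
    dict with a "level" key (hypothetical layout).
    """
    if entry is not None and entry["level"] == "fault" and not recovery_received:
        return "suspend"  # fault suspension state: skip detection entirely
    return "detect"       # proceed to anomaly detection


level = exception_level(record_count=2, node_count=1)
action = next_action({"level": level}, recovery_received=False)
```

With two recorded registrations the level becomes the fault level, so the job is suspended without any detection attempt until recovery information arrives.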
In one embodiment of the invention, the method may further comprise the steps of:
step one: when it is determined that the abnormal registration information of the first data source does not exist in the abnormal information base, executing the step of performing anomaly detection on the first data source;
step two: if the first data source is detected to be abnormal, generating an abnormal registration table of the first data source and sending it to the control node, so that the control node updates the abnormal information base based on the abnormal registration table of the first data source and synchronizes the updated abnormal information base to each working node.
In the embodiment of the present invention, when the first working node is to perform the first exchange job on the first data source, if it determines that the exception registration information of the first data source does not exist in the exception information base, it may directly perform anomaly detection on the first data source using the preset anomaly detection timeout time and the preset number of anomaly detection retries, as shown in fig. 2. If the first data source is detected to be abnormal, the first exchange job can be terminated, an exception registration table of the first data source is generated based on the abnormal state of the first data source, and the table is sent to the control node. The control node can update the exception information base based on the received exception registration table and synchronize the updated exception information base to each working node. Each working node can then query the relevant information in the updated exception information base.
Of course, if it is detected that the first data source is normal, the first exchange job may be executed to perform the corresponding data exchange operation.
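The path for a data source without any registration can be sketched as follows. The preset values, the `probe` callable that stands in for a real connection attempt, and the fields of the reported table are assumptions introduced for this example.

```python
from typing import Callable

PRESET_TIMEOUT_S = 30.0  # preset detection timeout (illustrative value)
PRESET_RETRIES = 3       # preset number of detection retries (illustrative value)


def detect(probe: Callable[[float], bool], timeout_s: float, retries: int) -> bool:
    """Return True if the data source answers; one attempt plus `retries` more."""
    for _ in range(1 + retries):
        try:
            if probe(timeout_s):
                return True
        except OSError:
            pass  # treat a connection error like a failed attempt
    return False


def run_unregistered_job(source, node, probe, report, run_job) -> str:
    """Detect with the preset settings, then run the job or report the exception."""
    if detect(probe, PRESET_TIMEOUT_S, PRESET_RETRIES):
        run_job()
        return "ran"
    report({"data_source": source, "reporting_node": node, "state": "unreachable"})
    return "terminated"


def unreachable(timeout_s: float) -> bool:
    raise OSError("connection refused")


reports, ran = [], []
outcome = run_unregistered_job(
    "orders_db", "w1", unreachable, reports.append, lambda: ran.append(True)
)
```

Here the probe always fails, so the job is terminated and a single registration table is handed to the `report` callback, which in a real system would forward it to the control node.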
It should be noted that an exchange job on a working node may involve a source-end database and a destination-end database, which serve as the source and the destination of the data exchange. An abnormality of either the source-end database or the destination-end database detected during the exchange serves as a source for the data source exception registration table. Likewise, for the data source information corresponding to an exchange job, including both the source-end database and the destination-end database, it needs to be determined whether data source exception registration information exists. Data source information with abnormal writing at the destination end can be judged in the same way as abnormal reading at the source end.
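A small sketch of checking both ends of an exchange job against the exception information base is given below. The job layout with `"source"` and `"destination"` keys is an assumption for illustration.

```python
def registered_endpoints(job: dict, exception_base: dict) -> list:
    """Return the endpoints of `job` that have exception registration information."""
    endpoints = (job["source"], job["destination"])
    return [ds for ds in endpoints if ds in exception_base]


base = {"warehouse_db": {"level": "abnormal"}}
job = {"source": "orders_db", "destination": "warehouse_db"}
flagged = registered_endpoints(job, base)
```

In this example only the destination-end database is flagged, so the shortened detection would apply to the write side of the job.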
In the embodiment of the invention, each working node can quickly look up the existing abnormal registration information and abnormal level of a data source, thereby isolating abnormal data sources, reducing the time spent detecting abnormal states, and improving the scheduling efficiency of normal jobs. In addition, for a given data source, the method can identify which working nodes can work normally and which cannot, and schedule exchange jobs based on a certain strategy, thereby improving the reliability of the exchange jobs.
In practical application, considering that the number of connections between working nodes and data sources is a limited resource, detection of the data sources need not rely on the working nodes alone; the control node can also be reused to periodically detect the state of each data source. The control node then collects and summarizes the abnormal data source registration tables in one place, which avoids occupying additional resources of the working nodes.
Corresponding to the above method embodiment, an embodiment of the present invention further provides a data exchange apparatus, which is applied to a first working node of a data exchange system, where the data exchange system includes a control node and a plurality of working nodes, the control node is in communication connection with each working node, and the first working node is any one of the working nodes in the data exchange system. The data exchange apparatus described below and the data exchange method described above may be referred to in correspondence with each other.
Referring to fig. 3, the apparatus includes the following modules:
an abnormal information determination module 310, configured to determine, when a first exchange job is to be performed on a first data source, whether abnormal registration information of the first data source exists in an abnormal information base that is obtained in advance and synchronized by the control node, and if so, to trigger the anomaly detection module 320;
the anomaly detection module 320 is configured to perform anomaly detection on the first data source, and shorten a preset anomaly detection timeout time and/or reduce preset anomaly detection retry times during the anomaly detection on the first data source;
the exchange job processing module 330 is configured to determine to terminate or run the first exchange job based on the corresponding detection result.
By applying the apparatus provided in the embodiment of the present invention, when the first working node is to perform the first exchange job on the first data source, it may be determined whether the exception registration information of the first data source exists in the exception information base synchronized by the control node, if so, the first data source may be subjected to exception detection, and in the process of performing exception detection on the first data source, a preset exception detection timeout time is shortened and/or a preset number of retry times of exception detection is reduced, and then the first exchange job is determined to be terminated or run based on a corresponding detection result. When the abnormal registration information of the first data source exists in the abnormal information base, the possibility that the first data source still has abnormality is high, and the first working node shortens the preset abnormal detection timeout time and/or reduces the preset abnormal detection retry times in the process of carrying out abnormal detection on the first data source, so that the overall waiting time of the related exchange operation of the first data source can be reduced, and the data exchange efficiency is improved.
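The effect of shortening the timeout and reducing the retries on the worst-case waiting time can be illustrated with a short sketch. The preset values and the halving factor are assumptions; the embodiment only states that the preset values are shortened and/or reduced.

```python
def detection_settings(registered: bool,
                       preset_timeout_s: float = 30.0,
                       preset_retries: int = 3,
                       factor: float = 0.5):
    """Return (timeout, retries) for one detection run.

    A registered data source is probably still abnormal, so this sketch
    scales both preset values down by `factor` (an illustrative choice).
    """
    if not registered:
        return preset_timeout_s, preset_retries
    return preset_timeout_s * factor, max(0, int(preset_retries * factor))


def worst_case_wait(timeout_s: float, retries: int) -> float:
    """Longest time spent on a data source that never answers."""
    return timeout_s * (1 + retries)


normal_wait = worst_case_wait(*detection_settings(registered=False))
shortened_wait = worst_case_wait(*detection_settings(registered=True))
```

Under these assumed values the worst-case wait for a data source that never answers drops from 120 seconds to 30 seconds once the data source has a registration entry.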
In one embodiment of the present invention, the apparatus further comprises:
a fault level determining module, configured to determine, when the abnormal registration information of the first data source exists in the abnormal information base and before anomaly detection is performed on the first data source, whether the abnormal level of the abnormal registration information of the first data source recorded in the abnormal information base is the fault level, and if so, to trigger the exchange job suspending module;
the exchange job suspending module, configured to set the first exchange job to a fault suspension state, and to trigger the anomaly detection module 320 to perform the step of anomaly detection on the first data source only when fault recovery information for the first data source is received;
the fault level is the abnormal level to which the control node updates the abnormal registration information when the number of recorded abnormal registrations of the first data source reaches M or the number of abnormal registration nodes of the first data source reaches N, where M and N are positive integers.
In a specific embodiment of the present invention, the apparatus further includes a status confirmation information sending module, configured to:
when the detection result is that the first data source is normal, send state confirmation information for the first data source to the control node, so that the control node, after receiving the state confirmation information, sends an execution instruction for performing anomaly detection on the first data source to the abnormality registration nodes of the first data source.
In a specific embodiment of the present invention, the apparatus further includes an exception registry reporting module, configured to:
when the detection result is that the first data source is abnormal, generate an abnormal registration table of the first data source and report it to the control node, so that the control node updates the abnormal information base based on the abnormal registration table and synchronizes the updated abnormal information base to each working node.
In an embodiment of the invention, the anomaly detection module 320 is further configured to:
when the detection result is that the first data source is abnormal, and after the first exchange job is determined to be terminated, perform anomaly detection on the first data source upon receiving an execution instruction, sent by the control node, for performing anomaly detection on the first data source;
if the first data source is detected to be normal, return abnormal elimination information to the control node, so that the control node updates the abnormal information base based on the abnormal elimination information and synchronizes the updated abnormal information base to each working node;
if the first data source is detected to be abnormal, return reconfirmation information to the control node, so that the control node updates the abnormal information base based on the reconfirmation information and synchronizes the updated abnormal information base to each working node.
In an embodiment of the present invention, the apparatus further includes an exchange job receiving module, configured to:
when an execution instruction sent by the control node for performing anomaly detection on the first data source is received, anomaly detection is performed on the first data source, and the first data source is detected to be normal,
receive and run a second exchange job for the first data source that the control node has rescheduled from a second working node;
the second working node is any node, other than the first working node, that returned reconfirmation information for the first data source to the control node.
In a specific embodiment of the present invention, the apparatus further includes an exception registry sending module, configured to:
when it is determined that the abnormal registration information of the first data source does not exist in the abnormal information base, execute the step of performing anomaly detection on the first data source;
if the first data source is detected to be abnormal, generate an abnormal registration table of the first data source and send it to the control node, so that the control node updates the abnormal information base based on the abnormal registration table of the first data source and synchronizes the updated abnormal information base to each working node.
Corresponding to the above method embodiment, an embodiment of the present invention further provides a data exchange device, which is applied to a first working node of a data exchange system, where the data exchange system includes a control node and a plurality of working nodes, the control node is in communication connection with each working node, and the first working node is any one of the working nodes in the data exchange system. Referring to fig. 4, the device includes:
a memory 410 for storing a computer program;
the processor 420 is configured to implement the steps of the data exchange method when executing the computer program.
Corresponding to the above method embodiments, the present invention further provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of the data exchange method described above.
The embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same or similar parts among the embodiments are referred to each other.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The principle and the implementation of the present invention are explained in the present application by using specific examples, and the above description of the embodiments is only used to help understanding the technical solution and the core idea of the present invention. It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.

Claims (10)

1. A data exchange method, applied to a first working node of a data exchange system, where the data exchange system includes a control node and a plurality of working nodes, the control node is in communication connection with each working node, and the first working node is any one of the working nodes in the data exchange system, and the method includes:
when a first exchange operation is to be carried out on a first data source, determining whether the abnormality registration information of the first data source exists in an abnormality information base which is obtained in advance and is synchronized by the control node;
if so, carrying out anomaly detection on the first data source, and shortening preset anomaly detection timeout time and/or reducing preset anomaly detection retry times in the process of carrying out anomaly detection on the first data source;
determining to terminate or run the first exchange job based on a corresponding detection result;
wherein the first exchange job is an exchange job that the first working node performs periodically on the first data source based on the exchange job scheduling of the control node.
2. The method according to claim 1, wherein, when it is determined that exception registration information of the first data source exists in the exception information base, before the anomaly detection is performed on the first data source, the method further comprises:
determining whether the exception level of the exception registration information of the first data source recorded in the exception information base is a fault level;
if so, setting the first exchange job to a fault suspension state, and not executing the step of performing anomaly detection on the first data source until fault recovery information for the first data source is received;
wherein the fault level is the exception level to which the control node updates the exception registration information when the number of recorded exception registrations of the first data source reaches M or the number of exception registration nodes of the first data source reaches N, where M and N are positive integers.
3. The method of claim 1, further comprising:
and if the detection result is that the first data source is normal, sending state confirmation information aiming at the first data source to the control node, so that the control node sends an execution instruction for carrying out abnormality detection on the first data source to an abnormality registration node of the first data source after receiving the state confirmation information.
4. The method of claim 1, further comprising:
and if the detection result indicates that the first data source is abnormal, generating an abnormal registration table of the first data source, and reporting the abnormal registration table to the control node, so that the control node updates the abnormal information base based on the abnormal registration table, and synchronizes the updated abnormal information base to each working node.
5. The method according to claim 1, wherein after determining that the first switching operation is terminated when the detection result is that the first data source has an abnormality, further comprising:
when an execution instruction which is sent by the control node and used for carrying out anomaly detection on the first data source is received, carrying out anomaly detection on the first data source;
if the first data source is detected to be normal, returning abnormal elimination information to the control node, so that the control node updates the abnormal information base based on the abnormal elimination information, and synchronizes the updated abnormal information base to each working node;
if the first data source is detected to be abnormal, returning reconfirmation information to the control node, so that the control node updates the abnormal information base based on the reconfirmation information and synchronizes the updated abnormal information base to each working node.
6. The method according to claim 5, wherein when receiving an execution instruction sent by the control node to perform anomaly detection on the first data source, performing anomaly detection on the first data source, and detecting that the first data source is normal, the method further includes:
receiving and executing a second exchange job for the first data source that the control node reschedules from a second working node;
wherein the second working node is any node, other than the first working node, that returns reconfirmation information for the first data source to the control node.
7. The method of any one of claims 1 to 6, further comprising:
executing the step of performing anomaly detection on the first data source when determining that the anomaly registration information of the first data source does not exist in the anomaly information base;
if the first data source is detected to be abnormal, generating an abnormal registration table of the first data source, sending the abnormal registration table of the first data source to the control node, so that the control node updates the abnormal information base based on the abnormal registration table of the first data source, and synchronizes the updated abnormal information base to each working node.
8. A data exchange apparatus, applied to a first working node of a data exchange system, where the data exchange system includes a control node and a plurality of working nodes, the control node is in communication connection with each working node, and the first working node is any one of the working nodes in the data exchange system, and the apparatus includes:
an abnormal information determining module, configured to determine, when a first exchange job is to be performed on a first data source, whether abnormal registration information of the first data source exists in an abnormal information base that is obtained in advance and synchronized by the control node, and if so, to trigger the anomaly detection module;
the anomaly detection module is used for carrying out anomaly detection on the first data source, and shortening preset anomaly detection timeout time and/or reducing preset anomaly detection retry times in the process of carrying out anomaly detection on the first data source;
the exchange job processing module is used for determining to terminate or run the first exchange job based on a corresponding detection result;
wherein the first exchange job is an exchange job that the first working node performs periodically on the first data source based on the exchange job scheduling of the control node.
9. A data exchange device, applied to a first working node of a data exchange system, where the data exchange system includes a control node and a plurality of working nodes, the control node is in communication connection with each working node, and the first working node is any one of the working nodes in the data exchange system, the device including:
a memory for storing a computer program;
a processor for implementing the steps of the data exchange method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, carries out the steps of the data exchange method according to any one of claims 1 to 7.
CN201811348046.7A 2018-11-13 2018-11-13 Data exchange method, device, equipment and storage medium Active CN109408581B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811348046.7A CN109408581B (en) 2018-11-13 2018-11-13 Data exchange method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811348046.7A CN109408581B (en) 2018-11-13 2018-11-13 Data exchange method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN109408581A CN109408581A (en) 2019-03-01
CN109408581B true CN109408581B (en) 2020-11-17

Family

ID=65473047

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811348046.7A Active CN109408581B (en) 2018-11-13 2018-11-13 Data exchange method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN109408581B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110297860B (en) * 2019-06-18 2024-01-26 杭州数梦工场科技有限公司 Data exchange method and device and related equipment

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09298544A (en) * 1996-05-08 1997-11-18 Fujitsu Ltd Network operation managing device
CN103383689A (en) * 2012-05-03 2013-11-06 阿里巴巴集团控股有限公司 Service process fault detection method, device and service node
CN106357808B (en) * 2016-10-25 2019-09-24 Oppo广东移动通信有限公司 A kind of method of data synchronization and device
CN106844746A (en) * 2017-02-15 2017-06-13 浪潮软件集团有限公司 Method for realizing universal data exchange facing interface programming
CN107908494B (en) * 2017-11-10 2021-05-07 泰康保险集团股份有限公司 Abnormal event processing method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN109408581A (en) 2019-03-01

Similar Documents

Publication Publication Date Title
EP2474919B1 (en) System and method for data replication between heterogeneous databases
WO2017177941A1 (en) Active/standby database switching method and apparatus
EP2790112B1 (en) Method and system for data synchronization and data access apparatus
CN107870829B (en) Distributed data recovery method, server, related equipment and system
CN109189860A (en) Active/standby incremental synchronization method for MySQL based on a Kubernetes system
CN108345617B (en) Data synchronization method and device and electronic equipment
CN111581020A (en) Method and device for data recovery in distributed block storage system
CN112039970B (en) Distributed business lock service method, server, system and storage medium
CN109739435B (en) File storage and updating method and device
CN115297124B (en) System operation and maintenance management method and device and electronic equipment
CN106815094B (en) Method and equipment for realizing transaction submission in master-slave synchronization mode
CN110019510A (en) Method and device for incremental synchronization
CN111444039B (en) Cache data rollback method and cache data rollback device
CN109408581B (en) Data exchange method, device, equipment and storage medium
CN114764380A (en) Distributed cluster control method and device based on ETCD
CN116055563A (en) Task scheduling method, system, electronic equipment and medium based on Raft protocol
CN108509296B (en) Method and system for processing equipment fault
US20190303233A1 (en) Automatically Detecting Time-Of-Fault Bugs in Cloud Systems
CN110825758B (en) Transaction processing method and device
JP5154843B2 (en) Cluster system, computer, and failure recovery method
CN110912979B (en) Method for solving multi-server resource synchronization conflict
CN112199432A (en) High-performance data ETL device based on distribution and control method
CN111897626A (en) Cloud computing scene-oriented virtual machine high-reliability system and implementation method
CN111158956A (en) Data backup method and related device for cluster system
CN111694894A (en) Method, server, device and storage medium for monitoring data synchronization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant