CN108874918B - Data processing device, database all-in-one machine and data processing method thereof - Google Patents

Info

Publication number
CN108874918B
Authority
CN
China
Prior art keywords
node
storage
backup
storage node
database
Prior art date
Legal status
Active
Application number
CN201810543101.1A
Other languages
Chinese (zh)
Other versions
CN108874918A (en
Inventor
魏本帅
杜彦魁
Current Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention disclose a data processing device, a database all-in-one machine, and a data processing method thereof. The data processing device is arranged in each backup node of the database all-in-one machine, and each storage node of the database all-in-one machine stores its own address and data block information. The data processing device comprises: a storage module for storing the address of each storage node in the database all-in-one machine; a detection module for detecting the working state of each storage node; and a processing module for backing up data blocks in a second storage node to a specified backup node when the detection module detects that a first storage node has failed, wherein the second storage node is a storage node that stores some or all of the data blocks of the first storage node. Embodiments of the invention can improve the reliability of the data blocks in the storage nodes while preserving effective utilization of the storage space.

Description

Data processing device, database all-in-one machine and data processing method thereof
Technical Field
The present application relates to, but is not limited to, the field of computer and database technologies, and in particular to a data processing apparatus, a database all-in-one machine, and a data processing method thereof.
Background
With the development of computer and database technologies, big data has penetrated into various industries and business function fields, and gradually becomes an indispensable data resource.
A Big Data Appliance (BDA) is a combined software and hardware product designed for analyzing and processing large volumes of data. One widely used example is the high-performance database all-in-one machine, which integrates computing, storage, high-speed networking, and a database. In typical configurations of current database all-in-one machines, the data blocks of the storage nodes are generally designed with two-copy or three-copy redundancy. Two-copy redundancy can tolerate the downtime of only one storage node: if one storage node goes down, each data block that was on that node is left with a single copy, which creates a single-point risk for the data. Three-copy redundancy can tolerate the downtime of two storage nodes, but it greatly reduces the usable storage space of each storage node, since the effective storage capacity is only 1/3 of the raw space.
In summary, database all-in-one machines in the prior art struggle to achieve both reliability of the data blocks in the storage nodes and effective utilization of the storage space.
Disclosure of Invention
In order to solve the above technical problems, embodiments of the present invention provide a data processing apparatus, a database all-in-one machine, and a data processing method thereof, which can improve the reliability of data blocks in storage nodes on the basis of ensuring effective utilization of storage space.
An embodiment of the invention provides a data processing device arranged in each backup node of a database all-in-one machine, wherein each storage node of the database all-in-one machine stores its own address and data block information, and the data processing device comprises:
the storage module is used for storing the address of each storage node in the database all-in-one machine;
the detection module is used for detecting the working state of each storage node in the database all-in-one machine according to the address stored by the storage module;
and the processing module is used for acquiring data block information of other storage nodes according to addresses of the other storage nodes except the first storage node when the detection module detects that the first storage node has a fault, and backing up data blocks of a second storage node to a backup node to which the processing module belongs, wherein the second storage node is a storage node in which part or all of the data blocks in the first storage node are stored.
An embodiment of the present invention further provides a database all-in-one machine, including: at least one backup node, and storage nodes that each communicate with the backup nodes, wherein the data processing device described above is configured in each backup node, and each storage node stores its own address and data block information.
An embodiment of the invention also provides a data processing method of a database all-in-one machine, which is implemented using the database all-in-one machine described above and comprises the following steps:
the backup node detects the working state of each storage node in the database all-in-one machine according to the address of each storage node stored in the backup node;
when detecting that a first storage node fails, the backup node acquires data block information of other storage nodes according to addresses of the other storage nodes except the first storage node, and backs up data blocks in a second storage node to the backup node, wherein the second storage node is a storage node in which part or all of the data blocks in the first storage node are stored.
An embodiment of the present invention further provides a computer device, including: a memory and a processor;
the memory is used for storing executable instructions;
the processor is used for realizing the data processing method of the database all-in-one machine when the executable instructions stored in the memory are executed.
The embodiment of the invention also provides a computer-readable storage medium, wherein the computer-readable storage medium stores executable instructions, and the executable instructions are executed by a processor to realize the data processing method of the database all-in-one machine.
In the data processing device provided by the embodiment of the invention, the backup node uses the detection module to detect the working state of each storage node through the storage node addresses held in the storage module. When a first storage node is detected to have failed, the processing module acquires the data block information of the storage nodes other than the failed one according to their addresses, and backs up to the backup node those data blocks in a second storage node that are the same as data blocks of the failed first storage node. This avoids the single-point risk that arises when a storage node goes down in a database all-in-one machine with a two-copy redundancy design, while preserving effective utilization of the storage space in the storage nodes. The data processing device provided by the embodiment of the invention can therefore improve the reliability of the data blocks in the storage nodes without sacrificing effective utilization of the storage space.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and constitute a part of this specification. They illustrate embodiments of the invention and, together with the examples, serve to explain the principles of the invention without limiting it.
Fig. 1 is a schematic diagram of a database all-in-one machine in the prior art;
Fig. 2 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention;
Fig. 3 is a schematic view of an application scenario of the data processing apparatus according to the embodiment of the present invention;
Fig. 4 is a schematic view of another application scenario of the data processing apparatus according to the embodiment of the present invention;
Fig. 5 is a flowchart of a data processing method of a database all-in-one machine according to an embodiment of the present invention;
Fig. 6 is a flowchart of another data processing method of a database all-in-one machine according to an embodiment of the present invention;
Fig. 7 is a flowchart of a data processing method of a database all-in-one machine according to another embodiment of the present invention;
Fig. 8 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail below with reference to the accompanying drawings. It should be noted that the embodiments and features of the embodiments in the present application may be arbitrarily combined with each other without conflict.
The steps illustrated in the flowcharts of the figures may be performed in a computer system, for example as a set of computer-executable instructions. Also, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from the one shown here.
The following specific embodiments of the present invention may be combined with one another, and concepts or processes that are the same or similar may not be described again in some embodiments.
An existing database all-in-one machine includes servers that run the database software and provide computing and similar functions, for example 2 or more computing nodes. It also includes servers that provide data storage, data filtering, data unloading, and similar functions, for example 3 or more storage nodes. There are generally more storage nodes than computing nodes, and both kinds of node must provide redundancy: the failure of any single node must not prevent the database all-in-one machine from providing normal service. Fig. 1 is a schematic structural diagram of a database all-in-one machine in the prior art, using a typical configuration of 2 computing nodes + 3 storage nodes as an example. In the example shown in Fig. 1, each storage node stores two data blocks under a two-copy redundancy design: data block A is stored in storage node 1 and storage node 2, data block B is stored in storage node 1 and storage node 3, and data block C is stored in storage node 2 and storage node 3. As can be seen, under two-copy redundancy the two copies of a data block never reside on the same node. For the database all-in-one machine shown in Fig. 1, if storage node 1 goes down, only one copy remains of each data block that was on storage node 1, namely data blocks A and B, so a single-point risk arises for data blocks A and B. For the two-copy redundancy design of storage nodes in a database all-in-one machine, a method is urgently needed to resolve the single-point risk to data when one storage node goes down.
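The single-point risk in the Fig. 1 example can be checked with a short sketch. Only the block placement comes from the text; the node names and the `remaining_copies` helper are illustrative assumptions.

```python
from collections import Counter

# Block placement from the Fig. 1 example (node names are illustrative).
PLACEMENT = {
    "storage_node_1": {"A", "B"},
    "storage_node_2": {"A", "C"},
    "storage_node_3": {"B", "C"},
}


def remaining_copies(placement, failed_node):
    """Count how many copies of each block survive when failed_node is down."""
    counts = Counter()
    for node, blocks in placement.items():
        if node != failed_node:
            counts.update(blocks)  # each surviving node contributes one copy
    return dict(counts)


copies = remaining_copies(PLACEMENT, "storage_node_1")
print(sorted(copies.items()))  # [('A', 1), ('B', 1), ('C', 2)]
```

Blocks A and B are left with one copy each, which is the single-point risk described above, while block C keeps both copies.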
Fig. 2 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention. The data processing apparatus 100 provided in this embodiment is disposed in each backup node of a database all-in-one machine. The database all-in-one machine in this embodiment includes storage nodes and backup nodes, and each storage node stores its own address and data block information. The data processing apparatus 100 may include: a storage module 110, a detection module 120 and a processing module 130.
The storage module 110 is configured to store an address of each storage node in the database all-in-one machine;
and the detection module 120 is configured to detect a working state of each storage node in the database all-in-one machine according to the address stored in the storage module 110 of the backup node.
In the embodiment of the present invention, the data processing apparatus 100 may be a software program configured in a backup node of the database all-in-one machine, for example Agent software; the functions implemented by the Agent software in the backup node are the functions of the modules of the data processing apparatus 100. Each storage node may be configured with software for communicating with the Agent software in the backup node (i.e., the data processing apparatus 100), and this software may also be Agent software. The Agent software configured in a storage node may store the address and data block information of that storage node. The address may be the Internet Protocol (IP) address of the storage node. The data block information is recorded in a block table of Automatic Storage Management (ASM) or of a User Agent Server (UAS), and may include, for example, the size of the data blocks, the number of data blocks, and the location of each data block in the storage node to which it belongs. For example, if the data processing apparatus 100 in the backup node is implemented as Agent software, the Agent software is installed and configured in every storage node and backup node: the address and data block information of each storage node may be recorded in the configuration file of the Agent software installed on that storage node, and the addresses of all storage nodes in the database all-in-one machine are stored in the configuration file of the Agent software installed on each backup node.
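As a rough illustration of what the two Agent configuration files hold, the sketch below models them as Python dataclasses. The class names, field names, paths, and the documentation-range IP addresses are all assumptions; the patent does not specify a file format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BlockInfo:
    name: str        # block identifier, e.g. "A" (hypothetical field)
    size_bytes: int  # size of the data block
    location: str    # location of the block on its storage node


@dataclass
class StorageAgentConfig:
    """What a storage node's Agent records: its own address and block info."""
    address: str
    blocks: List[BlockInfo] = field(default_factory=list)

    @property
    def block_count(self) -> int:
        return len(self.blocks)


@dataclass
class BackupAgentConfig:
    """What a backup node's Agent records: the address of every storage node."""
    storage_addresses: List[str] = field(default_factory=list)


storage_cfg = StorageAgentConfig(
    address="192.0.2.11",
    blocks=[BlockInfo("A", 4096, "/disk0/A"), BlockInfo("B", 4096, "/disk0/B")],
)
backup_cfg = BackupAgentConfig(
    storage_addresses=["192.0.2.11", "192.0.2.12", "192.0.2.13"]
)
```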
In the embodiment of the present invention, because the storage module 110 in the data processing apparatus 100 of each backup node stores the address of every storage node, the detection module 120 can detect the working state of each storage node in real time according to those addresses. The detection may be implemented as follows: using its Agent software (which stores the IP address of every storage node) together with the Agent software on each storage node (which stores that storage node's own IP address), the backup node pings each storage node by its IP address (ping is a means of checking whether the network between two network nodes is connected). After a storage node fails, the Agent software in the backup node can no longer ping it at its IP address, and the storage node is thereby detected as failed. Any backup node can thus learn the working state of each storage node in the database all-in-one machine from the detection result of the detection module 120.
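The detection step can be sketched as a loop over the stored addresses. To keep the example self-contained, the reachability check is passed in as a `probe` callable; in a deployment this would wrap an actual ping, which is not shown here. All names and addresses are hypothetical.

```python
from typing import Callable, Dict, Iterable, List


def detect_states(addresses: Iterable[str],
                  probe: Callable[[str], bool]) -> Dict[str, bool]:
    """Probe every stored storage-node address; True means reachable."""
    return {address: probe(address) for address in addresses}


def failed_nodes(states: Dict[str, bool]) -> List[str]:
    """Addresses whose probe did not succeed."""
    return sorted(address for address, reachable in states.items()
                  if not reachable)


# Simulated probe for the example: pretend 192.0.2.11 is down.
def fake_probe(address: str) -> bool:
    return address != "192.0.2.11"


states = detect_states(["192.0.2.11", "192.0.2.12", "192.0.2.13"], fake_probe)
print(failed_nodes(states))  # ['192.0.2.11']
```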
Fig. 3 is a schematic view of an application scenario of the data processing apparatus according to the embodiment of the present invention. Fig. 3 again uses the typical configuration of 2 computing nodes + 3 storage nodes: the database all-in-one machine includes two computing nodes 230 and three storage nodes (storage node 210a, storage node 210b, and storage node 210c). Unlike the database all-in-one machine in the prior art, the database all-in-one machine according to the embodiment of the present invention adds a piece of hardware, the backup node 220. When the database all-in-one machine is configured, the hardware configuration of the backup node 220 is generally required to be the same as that of the storage nodes. If an identically configured backup node 220 is not available, the difference in hardware configuration should be kept as small as possible, but the hard disk performance and capacity of the backup node must match those of the storage nodes; in other words, the backup node 220 is in effect also a storage node. Fig. 3 illustrates an example configured with one backup node 220.
As shown in Fig. 3, the data processing apparatus 100 is configured in each storage node and in the backup node 220 and enables the backup node 220 and each storage node to communicate with one another. For example, when the storage node 210a in the database all-in-one machine 20 goes down, the backup node 220 can learn the current state of the down storage node (i.e., the storage node 210a) through the data processing apparatus 100 configured in it.
It should be noted that the data processing apparatus 100 provided in the embodiment of the present invention may also be applied to database all-in-one machines with other configurations, for example 4 computing nodes + 6 storage nodes, 4 computing nodes + 7 storage nodes, or 2 computing nodes + 7 storage nodes; that is, the embodiment of the present invention does not limit the application range of the data processing apparatus 100. Nor does the embodiment of the present invention limit the number of backup nodes, which may be one or more.
The processing module 130 is configured to, when the detection module 120 detects that the first storage node fails, obtain data block information of other storage nodes according to addresses of the other storage nodes except the first storage node, and backup data blocks of a second storage node to a backup node to which the processing module 130 belongs, where the second storage node is a storage node in which some or all data blocks in the first storage node are stored.
As described above, the storage module 110 stores the address of every storage node, so the processing module 130 can ping the storage nodes at these addresses; the storage nodes that have not failed, i.e., the storage nodes other than the first storage node, respond to the ping. Because the data blocks in the storage nodes are designed with two-copy redundancy, one copy of each data block of the failed storage node (i.e., the first storage node) still remains in the other storage nodes. The processing module 130 in the backup node can scan the configuration files of the Agent software in the other storage nodes according to their addresses, obtain their data block information, and from it identify the data blocks that have only one copy. These single-copy data blocks are the data blocks that were in the first storage node, and they are the objects to be backed up. In other words, the scanning result shows which other storage nodes still hold the data blocks of the first storage node; a node that stores the same data block as the first storage node is a second storage node. Based on this information, the processing module 130 of the data processing apparatus 100 in the backup node can back up to the backup node the data blocks in the second storage node that are the same as those in the first storage node.
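A minimal sketch of this scan, assuming two-copy redundancy and that every block had both copies before the failure: the backup node looks only at the block tables of the surviving nodes and treats any block seen exactly once as a block to back up. The `plan_backup` name and the dictionary layout are illustrative, not taken from the patent.

```python
from typing import Dict, Set


def plan_backup(surviving_tables: Dict[str, Set[str]]) -> Dict[str, str]:
    """Map each single-copy block to the surviving node that still holds it.

    surviving_tables: node name -> block names reported by that node's Agent.
    The nodes that appear as values are the "second storage nodes".
    """
    holders: Dict[str, list] = {}
    for node, blocks in surviving_tables.items():
        for block in blocks:
            holders.setdefault(block, []).append(node)
    # A block listed by exactly one survivor lost its other copy.
    return {block: nodes[0]
            for block, nodes in sorted(holders.items())
            if len(nodes) == 1}


# Fig. 3 scenario: storage node 210a is down, 210b and 210c respond.
plan = plan_backup({"210b": {"A", "C"}, "210c": {"B", "C"}})
print(plan)  # {'A': '210b', 'B': '210c'}
```

The result says that block A should be copied from 210b and block B from 210c, while block C needs no action because both of its copies survive.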
The backup in the embodiment of the invention may be implemented as remote synchronous or asynchronous copying of data blocks, and the remote backup operation is triggered only when a storage node fails. In addition, the criterion by which the detection module 120 judges that a storage node has failed, that is, the condition that triggers the backup, may be: the storage node is detected to be unable to respond within x seconds (s), i.e., its IP address cannot be pinged, where the value of x can be set by an administrator.
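The trigger condition can be expressed as a simple timeout check. This is only a sketch: the function name, the use of plain float timestamps, and the choice of a strict "greater than" comparison are assumptions, and `x_seconds` stands for the administrator-set value x.

```python
def should_trigger_backup(last_response_at: float,
                          now: float,
                          x_seconds: float) -> bool:
    """True when the node has been silent for longer than x seconds."""
    return (now - last_response_at) > x_seconds


# With x = 5 s: a node last heard from 3 s ago is still considered healthy,
# while a node silent for 6 s triggers the backup.
print(should_trigger_backup(100.0, 103.0, 5.0))  # False
print(should_trigger_backup(100.0, 106.0, 5.0))  # True
```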
It should be noted that, in the embodiment of the present invention, the scan may find one or more data blocks that have only one copy, and when there are several, they may be distributed over several second storage nodes. Accordingly, there may be one or more second storage nodes. When all data blocks of the failed first storage node are stored in a single other storage node, that storage node is the second storage node, and in this scenario the backup node backs up all the data blocks of that second storage node to itself. When the data blocks of the failed first storage node are spread over several storage nodes, each of which stores only part of the data blocks of the first storage node, every such node is a second storage node. As shown in Fig. 3, data block A of the down storage node 210a (the first storage node) is stored in the storage node 210b and data block B is stored in the storage node 210c, so both the storage node 210b and the storage node 210c are second storage nodes. In this scenario, the backup node backs up from each second storage node the data blocks that it shares with the first storage node.
In an existing database all-in-one machine with a two-copy redundancy design, if one storage node goes down and cannot be recovered, each data block of that storage node has only one copy left on a normal storage node. As shown in Fig. 1, data block C still has two copies, whereas data block A and data block B of the down storage node 1 each have only one copy. A single-point risk therefore arises for these data blocks: if another storage node also goes down, data block A or data block B will be lost. The prior art provides no effective way to address the single-point risk that arises when a storage node goes down under two-copy redundancy. In contrast, the data processing apparatus 100 provided in the embodiment of the present invention is disposed in each additionally configured backup node of the database all-in-one machine. The data processing apparatus 100 in the backup node and the software (e.g., Agent software) on each storage node enable the backup node and the storage nodes to communicate, so that when a storage node fails, the data blocks in the other storage nodes that are the same as those of the down storage node can be backed up to the backup node. This avoids the single-point risk to the data blocks of a down storage node in a two-copy redundancy scenario.
In the data processing apparatus 100 provided in the embodiment of the present invention, the backup node uses the detection module 120 to detect the working state of each storage node through the storage node addresses held in the storage module 110. When a first storage node is detected to have failed, the processing module 130 obtains the data block information of the storage nodes other than the failed one according to their addresses, and backs up to the backup node those data blocks in a second storage node that are the same as data blocks of the failed first storage node. This avoids the single-point risk that arises when a storage node goes down in a database all-in-one machine with a two-copy redundancy design, while preserving effective utilization of the storage space in the storage nodes. The data processing apparatus 100 can therefore improve the reliability of the data blocks in the storage nodes without sacrificing effective utilization of the storage space.
In the above embodiments, it has been described that one backup node or a plurality of backup nodes may be configured in the database all-in-one machine.
In one possible implementation of the embodiment of the present invention, only one backup node is configured in the database all-in-one machine. In this application scenario, the storage module 110 is further configured to store the address of the backup node. Accordingly, the processing module 130 may back up data blocks as follows: when the detection module 120 detects that the first storage node has failed, the processing module 130 scans the data block information in the storage nodes other than the first storage node according to the addresses stored in the storage module 110, and backs up the data blocks in the second storage node to the backup node according to the scanning result and the address of the backup node. Because there is only one backup node, the backup node address stored in the storage module 110 is the target address of the backup, and it can be set in the Agent configuration file as the IP address of the backup node.
In another possible implementation of the embodiment of the present invention, at least two backup nodes are configured in the database all-in-one machine. In this application scenario, the storage module 110 is further configured to store the address and storage priority of each backup node. Accordingly, the processing module 130 may back up data blocks as follows: when the detection module 120 detects that the first storage node has failed, the processing module 130 scans, according to the addresses stored in the storage module 110, the data block information in the nodes other than the first storage node and the backup node to which the processing module 130 belongs, and then, according to the scanning result and the address and storage priority of each backup node, backs up the data blocks in the second storage node to the backup node that has the highest storage priority among those that are currently empty.
The processing performed by the data processing apparatus 100 in this application scenario is described below through an implementation example. Fig. 4 is a schematic view of another application scenario of the data processing apparatus provided by the embodiment of the present invention, using a configuration of 4 computing nodes + 6 storage nodes as an example. The database all-in-one machine includes four computing nodes 230 and six storage nodes (210a, 210b, 210c, 210d, 210e, and 210f), and is further configured with two backup nodes, namely backup node 220a and backup node 220b. The data blocks are again designed with two-copy redundancy, and their distribution over the storage nodes is as shown in Fig. 4. In addition, the data processing apparatus 100 is configured in each storage node; in the data processing apparatus 100 of each storage node, the storage module 110 stores the IP address and data block information of that storage node. In the data processing apparatus 100 of each backup node, the storage module 110 stores the IP address of each storage node, the IP address of each backup node (i.e., the addresses of the backup node 220a and the backup node 220b), and the storage priorities of the two backup nodes. The storage priorities may be ordered by the hardware configuration of the backup nodes. For example, if the hardware configuration of the backup node 220a is identical to that of the storage nodes while the hardware configuration of the backup node 220b is only close to it, the storage priorities are set so that the backup node 220a is the primary and the backup node 220b is the secondary.
Based on the above configuration of the database all-in-one machine, when the storage node 210b goes down, only one copy each of data block A and data block C remains, so a single-point risk exists. The matching data blocks in the other storage nodes that hold the same data blocks as the storage node 210b (here, the storage node 210a and the storage node 210c) can then be backed up to one of the backup nodes. The data processing apparatuses 100 in both backup nodes can detect that the storage node 210b is down and, by scanning the other storage nodes, determine the second storage nodes (i.e., the storage node 210a and the storage node 210c) that hold the same data blocks as the down storage node. One of the backup nodes is then selected as the target backup node according to the storage priorities and current working states of the two backup nodes: if the backup node 220a is currently empty, the backup node 220a is the target backup node; if the backup node 220a is currently in a working state, that is, it already stores other data blocks, the backup node 220b is the target backup node. Data block A in the storage node 210a and data block C in the storage node 210c are then backed up to the selected target backup node. Fig. 4 shows an example in which the backup node 220a is the target backup node for the data block backup.
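The target selection rule (highest storage priority among the backup nodes that are currently empty) might look like the sketch below. The `BackupNode` class, the convention that a smaller number means a higher priority, and the returned `None` when no node is empty are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class BackupNode:
    name: str
    priority: int                       # 1 = highest storage priority (assumed)
    stored_blocks: Set[str] = field(default_factory=set)

    @property
    def is_empty(self) -> bool:
        return not self.stored_blocks


def choose_target(backup_nodes: List[BackupNode]) -> Optional[BackupNode]:
    """Pick the empty backup node with the highest storage priority."""
    empty = [node for node in backup_nodes if node.is_empty]
    if not empty:
        return None  # no empty backup node is available
    return min(empty, key=lambda node: node.priority)


node_a = BackupNode("220a", priority=1)
node_b = BackupNode("220b", priority=2)
print(choose_target([node_a, node_b]).name)  # 220a

# Once 220a already stores blocks, 220b becomes the target.
node_a.stored_blocks.update({"A", "C"})
print(choose_target([node_a, node_b]).name)  # 220b
```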
Optionally, in the data processing apparatus 100 provided in the embodiment of the present invention, the processing module 130 is further configured to scan the data block information in the second storage node in real time and, when it determines that a data block already backed up from the second storage node has been updated, to back up the updated data block to the backup node again. In this way the data processing apparatus 100 can track updates to data blocks in real time: if a data block that has already been backed up to the backup node is updated in the second storage node, the updated data block is backed up again, which keeps the data block highly available.
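The patent does not say how an update is recognized. One possible approach, shown here purely as an assumption, is to record a SHA-256 digest for each block at backup time and copy the block again whenever the digest of the source no longer matches. All function and variable names are hypothetical.

```python
import hashlib
from typing import Dict, List


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def rebackup_updated(source_blocks: Dict[str, bytes],
                     backup_store: Dict[str, bytes],
                     backed_up_digests: Dict[str, str]) -> List[str]:
    """Re-copy every already backed-up block whose source content changed."""
    changed = sorted(
        name for name, recorded in backed_up_digests.items()
        if name in source_blocks and digest(source_blocks[name]) != recorded
    )
    for name in changed:
        backup_store[name] = source_blocks[name]
        backed_up_digests[name] = digest(source_blocks[name])
    return changed


# Block A was backed up as b"v1" and has since been updated to b"v2".
source = {"A": b"v2", "C": b"c1"}
store = {"A": b"v1"}
digests = {"A": digest(b"v1")}
print(rebackup_updated(source, store, digests))  # ['A']
print(rebackup_updated(source, store, digests))  # []
```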
Optionally, in the data processing apparatus 100 according to the embodiment of the present invention, the processing module 130 is further configured to delete the data block in the backup node when the detecting module 120 detects that the first storage node has recovered to the normal operating state. In the embodiment of the present invention, although the backup node has the same or a similar hardware configuration as the storage node and can provide redundant storage of data blocks, the backup node is triggered to store data blocks redundantly only when a storage node in the database all-in-one machine is down. Therefore, after detecting that the down storage node (i.e., the first storage node) has recovered to normal operation, the data processing apparatus 100 may actively delete the data blocks stored in the backup node, so that when another storage node of the database all-in-one machine subsequently goes down, the backup node can again perform the redundant backup operation of the data blocks.
Optionally, in this embodiment of the present invention, the data block information may include: the location of the data block in the node to which the data block belongs, and the size and number of the data blocks. When the data processing apparatus 100 of a backup node backs up data blocks, it may consider whether the size of the data blocks allows them to be backed up to the local node; when the local backup node does not meet the hardware requirement for the backup, the data blocks of the down storage node may be backed up by another backup node. In addition, in the embodiment of the present invention, the number of copies of the same data block is usually two, that is, the same data block is redundantly backed up in two copies, but three or another number of copies is not excluded. The number of different data blocks is configured according to the hardware configuration of the database all-in-one machine, and the size of the data blocks affects the effectiveness of backing up the data blocks of the down storage node, so the processing module 130 may select a suitable backup node to perform the redundant backup operation of the data blocks.
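As a non-limiting illustration of the size check described above, the following minimal sketch tries the local backup node first and falls back to another backup node when the local node lacks room. The function name `pick_backup_node`, the `free_bytes` mapping, and the capacity figures are assumptions introduced for this example only; the embodiment does not specify how free space is measured.

```python
from typing import Dict, List, Optional


def pick_backup_node(block_sizes: List[int],
                     free_bytes: Dict[str, int],
                     local_node: str) -> Optional[str]:
    """Return the backup node that should take the blocks, or None.

    The local node is tried first; other backup nodes are tried only if
    the local node lacks room for the combined size of the blocks.
    """
    needed = sum(block_sizes)
    candidates = [local_node] + [n for n in free_bytes if n != local_node]
    for node in candidates:
        if free_bytes.get(node, 0) >= needed:
            return node
    return None


# Hypothetical free capacities, in bytes, of two backup nodes.
free_bytes = {"220a": 4096, "220b": 1 << 20}
# 220a cannot hold 12288 bytes, so the blocks are handed to 220b.
target = pick_backup_node([4096, 8192], free_bytes, local_node="220a")
```

Returning `None` when no backup node has enough room is a design choice of this sketch; the embodiment leaves that case open.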
In the embodiment of the invention, the data processing device 100 (for example, Agent software) is configured in each backup node of the database all-in-one machine, and software (also can be Agent software) for communicating with the data processing device 100 is configured in each storage node, so that the Agent software can realize detailed recording of data block information and remote copy and incremental copy capabilities of data blocks.
Based on the data processing device 100 provided in each of the above embodiments of the present invention, an embodiment of the present invention further provides a database all-in-one machine, in which the data processing device 100 provided in any of the above embodiments of the present invention is configured.
Fig. 3 and fig. 4 are also schematic structural diagrams of a database all-in-one machine according to an embodiment of the present invention. The database all-in-one machine provided by the embodiment of the invention comprises: at least one backup node, and storage nodes respectively connected to each backup node, where each backup node is configured with the data processing apparatus 100 (for example, Agent software) according to any of the above embodiments of the present invention, each storage node is configured with software (which may also be Agent software) for communicating with the data processing apparatus 100, and each storage node stores the address and data block information of that storage node. In addition, the backup nodes and the storage nodes in the embodiment of the invention can communicate through a network; in practical application, a ping operation can be performed using the Agent software configured on each node and the stored addresses, thereby realizing the communication capability. The basic configurations of the database all-in-one machines shown in fig. 3 and fig. 4 differ, that is, the numbers of computing nodes and storage nodes differ, but both are configured with backup nodes, and the numbers of backup nodes in fig. 3 and fig. 4 also differ.
It should be noted that the structure of the database all-in-one machine according to the embodiment of the present invention is not limited to the structures shown in fig. 3 and fig. 4; that is, the numbers of computing nodes, storage nodes, and backup nodes in the database all-in-one machine are not limited. Any database all-in-one machine may be used as the database all-in-one machine in the embodiment of the present invention as long as each backup node is configured with the data processing apparatus 100 according to any of the above embodiments of the present invention, each storage node is configured with software for communicating with the data processing apparatus 100, and two-copy redundant backup of the data blocks can be maintained when one storage node fails.
The database all-in-one machine provided in the embodiment of the present invention is also configured with backup nodes, and each backup node is configured with the data processing apparatus 100 provided in any one of the above embodiments of the present invention, so that the same processing capability as the data processing apparatus 100 provided in the above embodiments can be achieved, and the same technical effects are achieved, and therefore, details are not described herein again.
Based on the data processing device 100 and the database all-in-one machine provided by each embodiment of the invention, the embodiment of the invention also provides a data processing method of the database all-in-one machine, and the data processing method of the database all-in-one machine is used for processing data by adopting the database all-in-one machine provided by any embodiment of the invention.
Fig. 5 is a flowchart of a data processing method of a database all-in-one machine according to an embodiment of the present invention. The data processing method provided by this embodiment is executed by a database all-in-one machine, the database all-in-one machine is the database all-in-one machine provided by any one of the above embodiments of the present invention, and the structure of the database all-in-one machine can refer to the database all-in-one machine shown in fig. 3 and 4, and the data processing method can include the following steps:
S310, the backup node detects the working state of each storage node in the database all-in-one machine according to the address of each storage node stored in the backup node.
The data processing method of the database all-in-one machine provided in the embodiment of the present invention is a processing method applied to a database all-in-one machine with a two-copy redundancy design, and the structure of the database all-in-one machine in the embodiment of the present invention, the structure and the function of a data processing device (for example, Agent software) configured in an internal backup node thereof, and the structure and the function of software (for example, Agent software) configured in a storage node thereof have been described in detail in the above embodiments, and therefore, no further description is given here. Based on the hardware configuration and software capability of the data processing apparatus and the database all-in-one machine in the foregoing embodiments of the present invention, an address of each storage node is stored in the backup node in the embodiments of the present invention, and the address of each storage node and data block information are stored in each storage node, where the address may be an IP address of the storage node, the data block information is recorded in a block table (block table) of an ASM or a UAS, and the data block information may include, for example: the size, number of data blocks, and the location of the data blocks in the storage node to which they belong. 
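Purely as an illustration of the data block information described above, one record of such a block table might be modeled as follows. The class name `BlockInfo`, its field names, and the sample addresses and locations are hypothetical; the actual layout of the ASM or UAS block table is not given in this description.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class BlockInfo:
    block_id: str      # hypothetical identifier of the data block, e.g. "A"
    node_address: str  # IP address of the storage node holding this copy
    location: str      # position of the block inside that storage node
    size_bytes: int    # size of the block


def copy_counts(block_table: List[BlockInfo]) -> Dict[str, int]:
    """Return the number of copies recorded for each block identifier."""
    counts: Dict[str, int] = {}
    for info in block_table:
        counts[info.block_id] = counts.get(info.block_id, 0) + 1
    return counts


# Two-copy layout with made-up addresses and locations.
block_table = [
    BlockInfo("A", "10.0.0.1", "/disk0/0", 4096),
    BlockInfo("A", "10.0.0.2", "/disk0/8", 4096),
    BlockInfo("B", "10.0.0.1", "/disk0/1", 4096),
    BlockInfo("B", "10.0.0.3", "/disk0/3", 4096),
]
```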
In addition, the backup node in the database all-in-one machine can detect the working state of each storage node in real time according to the addresses stored in the backup node. The detection may be implemented as follows: the backup node, through its configured Agent software (the data processing device) and the Agent software configured on the storage nodes, pings each storage node using the IP address of that storage node; after a certain storage node fails, the Agent software in the backup node can no longer ping the failed storage node through its IP address, and the storage node is thereby detected as failed. Any backup node can thus learn the working state of each storage node in the database all-in-one machine from the detection result.
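The detection step can be sketched roughly as below. To keep the example self-contained and free of network access, the reachability check is passed in as a `probe` callable; in a real Agent this might wrap a ping or a connection attempt, which is an assumption of this sketch rather than something the description fixes. All names and addresses are illustrative.

```python
from typing import Callable, Dict, Iterable, List


def detect_states(addresses: Iterable[str],
                  probe: Callable[[str], bool]) -> Dict[str, bool]:
    """Map each storage-node address to True (reachable) or False (failed)."""
    return {addr: bool(probe(addr)) for addr in addresses}


def failed_nodes(states: Dict[str, bool]) -> List[str]:
    """Return the addresses that did not answer the probe, in sorted order."""
    return sorted(addr for addr, ok in states.items() if not ok)


# Stand-in probe: only these hypothetical addresses answer.
reachable = {"10.0.0.1", "10.0.0.3"}
states = detect_states(["10.0.0.1", "10.0.0.2", "10.0.0.3"],
                       lambda addr: addr in reachable)
print(failed_nodes(states))  # ['10.0.0.2']
```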
S320, when detecting that the first storage node fails, the backup node acquires the data block information of other storage nodes according to the addresses of the other storage nodes except the first storage node, and backs up the data block in the second storage node to the backup node, wherein the second storage node is a storage node in which part or all of the data blocks in the first storage node are stored.
As described above, the address of each storage node is stored in the backup node, and the backup node can successfully ping every storage node that has not failed according to these addresses; that is, all storage nodes other than the first storage node respond to the ping. Because the data blocks in the storage nodes are designed with two-copy redundancy, one copy of each data block of the failed storage node (i.e., the first storage node) remains in another storage node. The Agent software in the backup node may scan the configuration files of the Agent software in the other storage nodes according to the addresses of the other storage nodes, thereby obtaining the data block information of the other storage nodes and finally identifying the data blocks that have only one copy. A data block with only one copy is a data block that was also stored in the first storage node, and is the object to be backed up. In other words, the scanning result shows in which other storage nodes the data blocks of the first storage node are still stored, and a node storing the same data block as the first storage node is a second storage node. The backup node of the database all-in-one machine can therefore back up, according to this information, the data blocks in the second storage node that are the same as those in the first storage node to the backup node.
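The scan described above can be sketched as follows, assuming the block information of each surviving storage node has already been read into a set of block identifiers. The function name and the input shape are assumptions of this example; the sample layout mirrors the fig. 3 scenario in which the storage node 210a is down.

```python
from collections import defaultdict
from typing import Dict, List, Set


def find_single_copy_blocks(surviving: Dict[str, Set[str]]) -> Dict[str, str]:
    """Return {block_id: second_storage_node} for blocks left with one copy.

    `surviving` maps each reachable storage node to the block ids it holds.
    Under a two-copy design, a block seen on exactly one surviving node
    had its other copy on the failed node.
    """
    holders: Dict[str, List[str]] = defaultdict(list)
    for node, blocks in surviving.items():
        for block_id in blocks:
            holders[block_id].append(node)
    return {b: nodes[0] for b, nodes in holders.items() if len(nodes) == 1}


# Storage node 210a (holding A and B) is down; 210b and 210c survive.
surviving = {"210b": {"A", "C"}, "210c": {"B", "C"}}
at_risk = find_single_copy_blocks(surviving)
```

Here the data block C still has two copies and is skipped, while A and B are mapped to the second storage nodes from which they would be copied.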
In the embodiment of the invention, the backup may be implemented by remote synchronous copying or by asynchronous copying of the data blocks, and the remote backup of data blocks is triggered only when a certain storage node fails. In addition, the criterion by which the database all-in-one machine determines that a certain storage node has failed, that is, the condition for triggering the backup, may be: a storage node is detected to be unable to respond within x seconds (s), namely the IP address of the storage node cannot be pinged, where the length of x can be set by an administrator.
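One possible reading of the x-second criterion is sketched below: a node is treated as failed once more than `timeout_s` seconds have passed since its last response. The class `FailureDetector`, its method names, and the choice to pass timestamps in explicitly are assumptions made so the example is deterministic; they are not part of the described embodiment.

```python
from typing import Dict


class FailureDetector:
    """Declare a node failed if it has not responded within timeout_s seconds."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s          # the administrator-set value x
        self.last_seen: Dict[str, float] = {}

    def record_response(self, address: str, now: float) -> None:
        """Remember the time at which the node last answered a ping."""
        self.last_seen[address] = now

    def is_failed(self, address: str, now: float) -> bool:
        """True if the node has been silent for longer than timeout_s."""
        return (now - self.last_seen[address]) > self.timeout_s


detector = FailureDetector(timeout_s=5.0)
detector.record_response("10.0.0.2", now=100.0)
within_limit = detector.is_failed("10.0.0.2", now=104.0)   # 4 s of silence
past_limit = detector.is_failed("10.0.0.2", now=105.5)     # 5.5 s of silence
```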
It should be noted that in the embodiment of the present invention, the scan may find one or more data blocks that have only one copy, and when there are multiple such data blocks, they may be distributed across multiple second storage nodes. Accordingly, the number of second storage nodes may be one or more. When all the data blocks of the failed first storage node are stored in a single other storage node, that storage node is the second storage node, and in this scenario the backup node backs up all the data blocks of that second storage node to the backup node. When the data blocks of the failed first storage node are stored in a plurality of storage nodes, each of which holds only a part of the data blocks of the first storage node, each of these storage nodes is a second storage node. As shown in fig. 3, the data block A of the down storage node 210a (the first storage node) is stored in the storage node 210b, and the data block B is stored in the storage node 210c, so both the storage node 210b and the storage node 210c are second storage nodes. In this scenario, the backup node backs up, from each second storage node, the data blocks that are the same as those of the first storage node to the backup node.
In an existing database all-in-one machine designed with two-copy redundancy, if one storage node is down and cannot be recovered, each data block of that storage node has only one copy left on a normal storage node. As shown in fig. 1, the data block C has two copies, while the data block A and the data block B of the down storage node 1 each have only one copy, so a single-point risk arises for these data blocks: if another storage node goes down, the data block A or the data block B will be lost. The prior art does not provide an effective method for solving the single-point risk that arises for data blocks when a storage node goes down under two-copy redundancy. In contrast, in the data processing method of the database all-in-one machine provided by the embodiment of the present invention, the data processing apparatus provided in any one of the above embodiments is configured in each additionally configured backup node of the database all-in-one machine, and the data processing apparatus in the backup node and the software (e.g., Agent software) on each storage node realize intercommunication between the backup node and each storage node. As a result, when a certain storage node fails, the data blocks in the other storage nodes that are the same as those of the failed storage node can be backed up to the backup node, which avoids the single-point risk for the data blocks of the failed storage node in an application scenario with two-copy redundancy.
According to the data processing method of the database all-in-one machine, the working state of each storage node in the database all-in-one machine is detected through the backup node, the backup node acquires the data block information of other storage nodes except the failed storage node according to the address of each storage node, and the data block in the second storage node, which is the same as the data block in the failed first storage node, is backed up in the backup node, so that the problem of single point risk of the data block caused by crash of a certain storage node in the database all-in-one machine with a data block double-copy redundancy design is solved, and meanwhile, the effective utilization of the storage space in the storage nodes is guaranteed. The data processing method of the database all-in-one machine provided by the embodiment of the invention can improve the reliability of the data blocks in the storage nodes on the basis of ensuring the effective utilization of the storage space.
In the above embodiments, it has been described that one backup node or a plurality of backup nodes may be configured in the database all-in-one machine.
In a possible implementation manner of the embodiment of the present invention, only one backup node is configured in the database all-in-one machine, and in the application scenario, the address of the backup node is stored in the backup node; accordingly, the implementation manner of S320 may include:
when detecting that the first storage node has a fault, the backup node scans the data block information in other storage nodes except the first storage node according to the address stored in the backup node, and backs up the data block in the second storage node to the backup node according to the scanning result and the address of the backup node.
In the embodiment of the invention, as only one backup node is provided, the address of the backup node stored in the database all-in-one machine is the backup target address.
In another possible implementation manner of the embodiment of the present invention, at least two backup nodes are configured in the database all-in-one machine, and in the application scenario, the address and the storage priority of each backup node are stored in the backup node; accordingly, the implementation manner of S320 may include:
when the backup node detects that the first storage node has a fault, the backup node scans the data block information in the nodes other than the first storage node and the backup nodes according to the addresses stored in the backup node, and backs up the data block in the second storage node, according to the scanning result and the address and storage priority of each backup node, to the backup node that is currently empty and has the highest storage priority. For an implementation example of this application scenario, reference may be made to the example shown in fig. 4, which is not described herein again.
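The selection rule for the multi-backup-node case might be sketched as follows. The class `BackupNode`, the convention that a smaller number means a higher storage priority, and the decision to return `None` when no backup node is empty are all assumptions of this example; the description does not fix how priority is encoded.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BackupNode:
    address: str
    priority: int            # assumption: smaller number = higher storage priority
    stored_blocks: int = 0   # 0 means the node is currently empty


def choose_target(nodes: List[BackupNode]) -> Optional[BackupNode]:
    """Pick the empty backup node with the highest storage priority."""
    empty = [n for n in nodes if n.stored_blocks == 0]
    return min(empty, key=lambda n: n.priority) if empty else None


# Both nodes empty: the higher-priority node 220a is chosen.
both_empty = choose_target([BackupNode("220a", 1), BackupNode("220b", 2)])
# 220a already holds blocks: the selection falls through to 220b.
one_busy = choose_target([BackupNode("220a", 1, stored_blocks=3),
                          BackupNode("220b", 2)])
```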
Optionally, fig. 6 is a flowchart of another data processing method of a database all-in-one machine according to an embodiment of the present invention. On the basis of the embodiment shown in fig. 5, the method provided by the embodiment of the present invention may further include:
S330, the backup node scans the data block information in the second storage node in real time, and when it is determined that a data block which has been backed up from the second storage node is updated, the updated data block is backed up to the backup node again.
In the embodiment of the invention, the database all-in-one machine can also determine the update status of the data blocks in real time; if a data block that has been backed up to the backup node is updated in the second storage node, the updated data block can be backed up again, so that the data block remains highly available.
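A minimal sketch of the re-backup step is given below. The description does not say how an update is recognized; comparing SHA-256 digests of the block contents is an assumption of this example, as are the function names and the in-memory dictionaries standing in for the second storage node and the backup node.

```python
import hashlib
from typing import Dict, List


def digest(data: bytes) -> str:
    """Content fingerprint used here to notice that a block has changed."""
    return hashlib.sha256(data).hexdigest()


def resync(source_blocks: Dict[str, bytes],
           backup_blocks: Dict[str, bytes]) -> List[str]:
    """Re-copy every already backed-up block whose source content changed.

    Returns the ids of the blocks that were backed up again.
    """
    updated = []
    for block_id in sorted(backup_blocks):
        current = source_blocks.get(block_id)
        if current is not None and digest(current) != digest(backup_blocks[block_id]):
            backup_blocks[block_id] = current
            updated.append(block_id)
    return updated


source = {"A": b"version 2", "C": b"unchanged"}
backup = {"A": b"version 1", "C": b"unchanged"}
changed = resync(source, backup)   # only block A differs
```

Only blocks already present in the backup are considered, matching the statement that the re-backup applies to data blocks which have been backed up.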
Optionally, fig. 7 is a flowchart of a data processing method of a database all-in-one machine according to another embodiment of the present invention. On the basis of the foregoing embodiments, the method provided in the embodiments of the present invention may further include:
S340, when the backup node detects that the first storage node has been restored to the normal working state, the backup node deletes the data blocks in the backup node.
The embodiment shown in fig. 7 is illustrated on the basis of the flow shown in fig. 5 as an example. In the embodiment of the present invention, although the backup node has the same or close hardware configuration as the storage node and can implement the redundant storage capability of the data block, because the backup node is triggered to implement the redundant storage of the data block only when one storage node in the database all-in-one machine is down, the database all-in-one machine can actively delete the data block stored in the backup node after detecting that the down storage node (i.e., the first storage node) is restored to normal operation, so that when other storage nodes of the database all-in-one machine are down, the backup node can continue to perform the redundant backup operation of the data block.
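The clean-up step can be illustrated with the short sketch below, in which the backup node's stored blocks are held in a dictionary and the detection result is a mapping from node to reachability. These representations and the function name are assumptions of this example only.

```python
from typing import Dict


def release_backup_if_recovered(backup_store: Dict[str, bytes],
                                failed_node: str,
                                states: Dict[str, bool]) -> bool:
    """Delete the backed-up blocks once the failed node is reachable again.

    Returns True if the backup store was cleared.
    """
    if states.get(failed_node, False):
        backup_store.clear()
        return True
    return False


backup_store = {"A": b"block A", "B": b"block B"}
# Node 210a is still down: the backed-up blocks are kept.
still_down = release_backup_if_recovered(backup_store, "210a", {"210a": False})
kept = len(backup_store)
# Node 210a answers again: the backup node is emptied for future failures.
recovered = release_backup_if_recovered(backup_store, "210a", {"210a": True})
```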
Optionally, in this embodiment of the present invention, the data block information stored in the database all-in-one machine includes: the location of the data block in the node to which it belongs, and the size and number of the data blocks. In addition, in the embodiment of the present invention, the number of the same data block is usually two, that is, two copies of the same data block are redundantly backed up, but three or other numbers of the same data block are not excluded, the number of different data blocks is configured according to the hardware configuration of the database all-in-one machine, and the size of the data block affects the effectiveness of backup of the data block in the downtime storage node, so that a suitable backup node may be selected to perform the redundancy backup operation of the data block.
Fig. 8 is a schematic structural diagram of a computer device according to an embodiment of the present invention. The computer device 40 provided in the embodiment of the present invention may include: a memory 41 and a processor 42.
Wherein, the memory 41 is used for storing executable instructions;
and the processor 42 is configured to implement the data processing method of the database all-in-one machine provided in any one of the above embodiments of the present invention when the executable instructions stored in the memory 41 are executed.
The implementation of the computer device 40 provided in the embodiment of the present invention is substantially the same as the data processing method of the database all-in-one machine provided in the above embodiments of the present invention, and details are not repeated herein.
The embodiment of the invention also provides a computer-readable storage medium storing executable instructions which, when executed by a processor, implement the data processing method of the database all-in-one machine provided by any one of the above embodiments of the invention. The implementation of the computer-readable storage medium provided in the embodiment of the present invention is substantially the same as the data processing method of the database all-in-one machine provided in the above embodiments of the present invention, and details are not repeated herein.
Although the embodiments of the present invention have been described above, the above description is only for the convenience of understanding the present invention, and is not intended to limit the present invention. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.
It will be understood by those of ordinary skill in the art that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the components may be implemented as software executed by a processor, such as a digital signal processor or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as is well known to those of ordinary skill in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. In addition, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media as known to those skilled in the art.

Claims (13)

1. The data processing device is characterized by being arranged in each backup node of a database all-in-one machine, wherein the address and data block information of each storage node of the database all-in-one machine are stored in each storage node of the database all-in-one machine, and the data processing device comprises:
the storage module is used for storing the address of each storage node in the database all-in-one machine;
the detection module is used for detecting the working state of each storage node in the database all-in-one machine according to the address stored by the storage module;
and the processing module is used for acquiring data block information of other storage nodes according to addresses of the other storage nodes except the first storage node when the detection module detects that the first storage node has a fault, and backing up data blocks of a second storage node to a backup node to which the processing module belongs, wherein the second storage node is a storage node in which part or all of the data blocks in the first storage node are stored.
2. The data processing apparatus of claim 1, wherein the database all-in-one machine includes one backup node;
the storage module is further used for storing the address of the backup node;
the processing module backs up the data blocks in the second storage node to the backup node, and the backing up includes:
when the detection module detects that the first storage node has a fault, scanning data block information in other storage nodes except the first storage node according to the address stored by the storage module, and backing up the data block in the second storage node to the backup node according to the scanning result and the address of the backup node; or,
the database all-in-one machine comprises at least two backup nodes;
the storage module is further used for storing the address and the storage priority of each backup node;
the processing module backs up the data blocks in the second storage node to the backup node, and the backing up includes:
when the detection module detects that the first storage node has a fault, scanning data block information in other nodes except the first storage node and the backup node to which the processing module belongs according to the address stored by the storage module, and backing up the data block in the second storage node to the backup node with the highest storage priority and currently empty according to the scanning result and the address of each backup node and the storage priority.
3. The data processing apparatus of claim 1 or 2,
the processing module is further configured to scan information of the data blocks in the second storage node in real time, and when it is determined that the data blocks which have been backed up in the second storage node are updated, re-backup the updated data blocks to the backup node.
4. The data processing apparatus of claim 1 or 2,
the processing module is further configured to delete the data block in the backup node when the detection module detects that the first storage node recovers to a normal working state.
5. The data processing apparatus of claim 1 or 2,
the data block information includes: the location of the data block in the node to which it belongs, and the size and number of the data blocks.
6. An all-in-one database machine, comprising: the data processing device comprises at least one backup node and storage nodes which are respectively communicated with the backup nodes, wherein the data processing device as claimed in any one of claims 1 to 5 is configured in each backup node, and the address and data block information of the storage node are stored in each storage node.
7. A data processing method of a database all-in-one machine, which is implemented by the database all-in-one machine according to claim 6, and comprises the following steps:
the backup node detects the working state of each storage node in the database all-in-one machine according to the address of each storage node stored in the backup node;
when detecting that a first storage node fails, the backup node acquires data block information of other storage nodes according to addresses of the other storage nodes except the first storage node, and backs up data blocks in a second storage node to the backup node, wherein the second storage node is a storage node in which part or all of the data blocks in the first storage node are stored.
8. The data processing method of the database all-in-one machine as claimed in claim 7, wherein the database all-in-one machine comprises a backup node, and the address of the backup node is stored in the backup node;
the backup node backs up the data blocks in the second storage node to the backup node, and the backing up includes:
when the backup node detects that the first storage node has a fault, scanning data block information in other storage nodes except the first storage node according to the address stored in the backup node, and backing up the data block in the second storage node to the backup node according to the scanning result and the address of the backup node; or,
the database all-in-one machine comprises at least two backup nodes, wherein the backup nodes store the address and the storage priority of each backup node;
the backup node backs up the data blocks in the second storage node to the backup node, and the backing up includes:
when the backup node detects that the first storage node fails, scanning data block information in other nodes except the first storage node and the backup node according to the address stored in the backup node, and backing up the data block in the second storage node to the backup node with the highest storage priority and being empty currently according to the scanning result and the address of each backup node and the storage priority.
9. The data processing method of the database all-in-one machine according to claim 7 or 8, characterized by further comprising:
and the backup node scans the data block information in the second storage node in real time, and when the data block which is backed up in the second storage node is determined to be updated, the updated data block is backed up to the backup node again.
10. The data processing method of the database all-in-one machine according to claim 7 or 8, characterized by further comprising:
and deleting the data blocks in the backup node when the backup node detects that the first storage node is restored to the normal working state.
11. The data processing method of the database all-in-one machine according to claim 7 or 8, wherein the data block information comprises: the location of the data block in the node to which it belongs, and the size and number of the data blocks.
12. A computer device, comprising: a memory and a processor;
the memory is used for storing executable instructions;
the processor is used for realizing the data processing method of the database all-in-one machine as claimed in any one of claims 7-11 when the executable instructions stored in the memory are executed.
13. A computer-readable storage medium, characterized in that the computer-readable storage medium stores executable instructions, which when executed by a processor, implement the data processing method of the database all-in-one machine according to any one of claims 7 to 11.
CN201810543101.1A 2018-05-30 2018-05-30 Data processing device, database all-in-one machine and data processing method thereof Active CN108874918B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810543101.1A CN108874918B (en) 2018-05-30 2018-05-30 Data processing device, database all-in-one machine and data processing method thereof

Publications (2)

Publication Number Publication Date
CN108874918A (en) 2018-11-23
CN108874918B (en) 2021-11-26

Family

ID=64335893

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810543101.1A Active CN108874918B (en) 2018-05-30 2018-05-30 Data processing device, database all-in-one machine and data processing method thereof

Country Status (1)

Country Link
CN (1) CN108874918B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113709197B (en) * 2020-05-21 2024-02-23 顺丰科技有限公司 Alliance block chain organization system and block chain system

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1786920A (en) * 2004-12-09 2006-06-14 国际商业机器公司 Performing scheduled backups of a backup node associated with a plurality of agent nodes
CN101539873A (en) * 2009-04-15 2009-09-23 成都市华为赛门铁克科技有限公司 Data recovery method, data node and distributed file system
CN103064759A (en) * 2012-12-18 2013-04-24 华为技术有限公司 Data recovery method and device
CN105354108A (en) * 2014-08-22 2016-02-24 中兴通讯股份有限公司 Data backup method and node
CN105406980A (en) * 2015-10-19 2016-03-16 浪潮(北京)电子信息产业有限公司 Multi-node backup method and multi-node backup device
CN105930498A (en) * 2016-05-06 2016-09-07 中国银联股份有限公司 Distributed database management method and system
CN107015884A (en) * 2016-01-28 2017-08-04 杭州海康威视数字技术股份有限公司 Data storage method and device
CN107329708A (en) * 2017-07-04 2017-11-07 郑州云海信息技术有限公司 Method and system for caching data in a distributed storage system
CN107357689A (en) * 2017-08-02 2017-11-17 郑州云海信息技术有限公司 Fault handling method for a storage node, and distributed storage system
CN107484108A (en) * 2017-08-25 2017-12-15 中国联合网络通信集团有限公司 Data backup method, sensing device, and wireless sensor network

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104054076B (en) * 2013-01-14 2017-11-17 华为技术有限公司 Data storage method, and method and device for handling database storage node failures
US10346267B2 (en) * 2014-10-31 2019-07-09 Red Hat, Inc. Registering data modification listener in a data-grid

Also Published As

Publication number Publication date
CN108874918A (en) 2018-11-23

Similar Documents

Publication Publication Date Title
CN106776130B (en) Log recovery method, storage device and storage node
CN108509153B (en) OSD selection method, data writing and reading method, monitor and server cluster
US8250202B2 (en) Distributed notification and action mechanism for mirroring-related events
US9477565B2 (en) Data access with tolerance of disk fault
US9015527B2 (en) Data backup and recovery
US8856592B2 (en) Mechanism to provide assured recovery for distributed application
EP3142011B1 (en) Anomaly recovery method for virtual machine in distributed environment
KR20150043331A (en) De-duplicating attachments on message delivery and automated repair of attachments
CN107508694B (en) Node management method and node equipment in cluster
CN111176888B (en) Disaster recovery method, device and system for cloud storage
US8543864B2 (en) Apparatus and method of performing error recovering process in asymmetric clustering file system
CN108874918B (en) Data processing device, database all-in-one machine and data processing method thereof
CN111342986B (en) Distributed node management method and device, distributed system and storage medium
CN102624537B (en) Data recovery system and method thereof
CN104407947A (en) Main/backup NAS (Network attached storage) switching method and device
CN108133034B (en) Shared storage access method and related device
CN105323271B (en) Cloud computing system and processing method and device thereof
CN105550230A (en) Method and device for detecting failure of node of distributed storage system
CN105490847A (en) Real-time detecting and processing method of node failure in private cloud storage system
CN110661599B (en) HA implementation method, device and storage medium between main node and standby node
CN115686368A (en) Method, system, apparatus and medium for storage capacity expansion of nodes of block chain network
CN113596195B (en) Public IP address management method, device, main node and storage medium
CN106020975B (en) Data operation method, device and system
US20170371752A1 (en) Cloud storage write cache management system and method
CN114116285A (en) Processing method, device and medium for file gateway fault switching

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant