CN109783014B - Data storage method and device - Google Patents

Data storage method and device

Info

Publication number
CN109783014B
Authority
CN
China
Prior art keywords
data
original data
node
backup
target data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811508945.9A
Other languages
Chinese (zh)
Other versions
CN109783014A (en)
Inventor
宋飞
刘强
罗治文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201811508945.9A
Publication of CN109783014A
Application granted
Publication of CN109783014B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0709Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1448Management of the data involved in backup or backup restore
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/17Details of further file system functions
    • G06F16/174Redundancy elimination performed by the file system
    • G06F16/1744Redundancy elimination performed by the file system using compression, e.g. sparse files
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/1097Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Computer Hardware Design (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)

Abstract

The embodiment of the invention discloses a method and a device for storing data, relates to the technical field of communication, and can solve the problem of insufficient disk space caused by large occupied space of a backup node. The method of the embodiment of the invention comprises the following steps: at least two backup nodes receive original data sent by a main node, wherein the original data are data written in by the main node; after the original data is written into the at least two backup nodes, the at least two backup nodes compress the original data according to the data compression mode corresponding to each backup node to obtain target data, and the storage space occupied by the target data is smaller than that occupied by the original data; at least two backup nodes delete the original data. The invention is suitable for the storage process of data.

Description

Data storage method and device
Technical Field
The present invention relates to the field of communications technologies, and in particular, to a method and an apparatus for storing data.
Background
A distributed storage system typically includes a number of compute nodes and storage nodes, which may be connected by a network. To avoid data loss caused by storage node failure, multiple copies of the same data are usually stored on multiple storage nodes. For example, 3 copies of the data are stored on different storage nodes, so that the integrity of the data can be ensured when 1 copy or 2 copies of the data are damaged.
In the data reading process, a request for reading data can be sent to the main node through the network, and the stored data is read from the main node. When the main node has no fault, the data stored on a backup node serves only as a data copy; that is, the data stored on the backup node is the same as the data stored on the main node, and it is stored in the same form. Storing data in this manner therefore often causes excessive redundancy. When the amount of stored data is large, the backed-up data occupies the same storage space as the data stored on the main node, which leads to insufficient disk space because the backup nodes occupy a large amount of space.
Disclosure of Invention
The embodiment of the invention provides a method and a device for storing data, which can solve the problem of insufficient disk space caused by large occupied space of a backup node.
In order to achieve the purpose, the embodiment of the invention adopts the following technical scheme:
in a first aspect, an embodiment of the present invention provides a method for storing data, where the method is used in a storage system, where the storage system includes at least two backup nodes and a master node, and the method includes:
the at least two backup nodes receive original data sent by the main node, wherein the original data are data written in by the main node;
after the original data is written into the at least two backup nodes, the at least two backup nodes compress the original data according to a data compression mode corresponding to each backup node to obtain target data, wherein the storage space occupied by the target data is smaller than that occupied by the original data;
the at least two backup nodes delete the original data.
With reference to the first aspect, in a first possible implementation manner of the first aspect, the at least two backup nodes include a first backup node and a second backup node, the data compression mode includes a first mode or a second mode, the target data includes a first target data or a second target data, a time for restoring the first target data to the original data is less than a time for restoring the second target data to the original data, and the at least two backup nodes compress the original data according to the data compression mode corresponding to each backup node to obtain the target data, including:
the first backup node compresses the original data according to the first mode to obtain the first target data, so that when the main node receives a request message for reading the original data and the original data written by the main node cannot be read, the first backup node preferentially restores the first target data to the original data;
and the second backup node compresses the original data according to the second mode to obtain the second target data.
With reference to the first possible implementation manner of the first aspect, in a second possible implementation manner of the first aspect, the first mode includes a fast compression algorithm, the second mode includes a high compression rate algorithm, time for compressing the original data into the first target data is shorter than time for compressing the original data into the second target data, and a storage space occupied by the first target data is larger than that occupied by the second target data.
With reference to the first aspect or any implementation manner of the first to second possible implementations manners of the first aspect, in a third possible implementation manner of the first aspect, the storage system further includes a terminal, where the receiving, by the at least two backup nodes, the original data sent by the main node includes:
when the main node receives the original data sent by the terminal and writes the original data, and after the main node sends the original data to the at least two backup nodes, the at least two backup nodes receive the original data sent by the main node;
before the at least two backup nodes compress the original data according to the data compression mode corresponding to each backup node to obtain the target data, the method comprises the following steps:
and the at least two backup nodes send feedback information of successful writing of the original data to the main node, so that the main node can send the feedback information to the terminal.
In a second aspect, an embodiment of the present invention provides an apparatus for storing data, where the apparatus is used in a storage system, where the storage system includes at least two backup nodes and a master node, and the apparatus includes:
the receiving module is used for receiving original data sent by the main node, wherein the original data is data written in by the main node;
a generating module, configured to compress the original data according to a data compression mode corresponding to each backup node after the original data is written in by the at least two backup nodes, to obtain target data, where a storage space occupied by the target data is smaller than a storage space occupied by the original data;
and the deleting module is used for deleting the original data.
With reference to the second aspect, in a first possible implementation manner of the second aspect, the at least two backup nodes include a first backup node and a second backup node, the data compression mode includes a first mode or a second mode, the target data includes first target data or second target data, a time for restoring the first target data to the original data is less than a time for restoring the second target data to the original data, the generation module includes a first generation module and a second generation module, the first generation module is disposed at the first backup node and configured to compress the original data according to the first mode to obtain the first target data, so that when the master node receives a request message for reading the original data and the original data written by the master node cannot be read, the first backup node preferentially restores the first target data to the original data;
and the second generation module is arranged at the second backup node and used for compressing the original data according to the second mode to obtain the second target data.
With reference to the first possible implementation manner of the second aspect, in a second possible implementation manner of the second aspect, the first mode includes a fast compression algorithm, the second mode includes a high compression rate algorithm, time for compressing the original data into the first target data is shorter than time for compressing the original data into the second target data, and storage space occupied by the first target data is larger than that occupied by the second target data.
With reference to the second aspect, or any implementation manner of the first to second possible implementations of the second aspect, in a third possible implementation manner of the second aspect, the storage system further includes a terminal, and the receiving module is specifically configured to receive the original data sent by the master node after the master node receives the original data sent by the terminal and writes the original data, and the master node sends the original data to the at least two backup nodes;
the device further comprises:
and the sending module is used for sending a feedback message of successful writing of the original data to the main node so that the main node can send the feedback message to the terminal.
In the method and apparatus for storing data provided by the embodiments of the present invention, after at least two backup nodes complete data writing according to original data sent by a master node, each backup node may perform compression on the original data according to a respective corresponding data compression mode to obtain target data, and delete the original data of the at least two backup nodes after the target data is generated. And the storage space occupied by the target data is smaller than that occupied by the original data. Compared with the prior art that the same storage form is adopted to store the data stored on the main node on the backup nodes, the invention can realize the storage of the original data on the backup nodes by adopting different storage forms, namely, the original data are compressed according to the data compression mode corresponding to each backup node, then the target data obtained by different data compression modes are stored on the corresponding backup nodes, and the original data are deleted at the same time. Because the storage space occupied by the target data obtained by the data compression mode is smaller than that occupied by the original data, the space occupied by each backup node for storing the data is reduced, and the problem of insufficient disk space caused by large space occupied by the backup nodes is solved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a schematic structural diagram of a storage system according to an embodiment of the present invention;
FIG. 2 is a flow chart of a method for storing data according to an embodiment of the present invention;
FIG. 3 is a flow chart of another method for storing data according to an embodiment of the present invention;
FIG. 4 is a flow chart of another method for storing data according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an apparatus for storing data according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a device for storing data according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention can be used in a storage system. The storage system can comprise at least two backup nodes and a master node, and can also comprise a terminal. The storage system shown in FIG. 1 includes a terminal, a master node, a backup node 1, and a backup node 2. When the terminal sends the original data to the master node, the master node may write the original data and send it to the backup node 1 and the backup node 2, respectively. After a backup node receives the original data sent by the master node and completes writing, it may send a feedback message to the master node, so that the master node forwards the feedback message to the terminal and the terminal determines that the backup node has received the original data and completed writing. After completing the write, in order to save storage space on the backup node, the backup node may compress the original data according to the data compression mode corresponding to that backup node to obtain the target data, and delete the original data after obtaining the target data. Then, when the terminal initiates a request to the master node for reading the original data, if the master node can provide the original data, the master node sends the original data to the terminal; when the master node cannot provide the original data due to a failure, the backup node 1 or the backup node 2 may restore the original data and provide it to the terminal through the master node.
An embodiment of the present invention provides a method for storing data, as shown in fig. 2, where the method is performed by at least two backup nodes, and the method includes:
101. at least two backup nodes receive original data sent by the main node.
Wherein, the original data is the data written by the master node.
When the terminal sends the original data to the master node, the master node may write the original data and send the original data to each of the at least two backup nodes. I.e., synchronizing the original data stored on the primary node to each backup node.
102. And after the original data is written into the at least two backup nodes, the at least two backup nodes compress the original data according to the data compression mode corresponding to each backup node to obtain the target data.
And the storage space occupied by the target data is smaller than that occupied by the original data.
After receiving the original data sent by the master node, each backup node needs to complete writing of the original data; the backup node can then compress the original data according to the data compression mode corresponding to that backup node to obtain the target data. It should be noted that the data compression modes corresponding to the backup nodes need not all be the same. For example, when there are 3 or more backup nodes, the data compression modes corresponding to all backup nodes may include at least 2 modes, which means that some backup nodes may correspond to the same data compression mode. Because the data compression modes may differ, the target data obtained on different backup nodes may also differ; that is, the storage space occupied by the target data may be different from node to node, but the content represented by the target data on each backup node is the same. It should also be noted that the compression of the original data is performed on the backup node, that is, it is completed in the background, which ensures that the I/O (input/output) performance of the storage system is not affected.
In the embodiment of the present invention, in order to ensure that the master node can provide the original data to the terminal as soon as possible when the terminal initiates a request message for reading data, the original data stored on the master node is not compressed according to any data compression mode.
103. At least two backup nodes delete the original data.
In order to occupy the space for storing data on the backup node as little as possible, in the embodiment of the present invention, after the backup node obtains the target data occupying a smaller storage space, the original data on the backup node may be deleted to avoid redundancy on data storage. It should be noted that, because different data compression modes are adopted, the compression time of the original data on different backup nodes may be different, and therefore, for a single backup node, after the original data is compressed to obtain the target data, the backup node may delete the original data, so as to save the storage space of the backup node.
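Steps 101 to 103 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `BackupNode` class, its method names, the in-memory dictionaries standing in for disk storage, and the use of zlib at two different levels as the per-node data compression modes are all hypothetical choices made for the example.

```python
import zlib
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class BackupNode:
    """Hypothetical backup node; names are illustrative, not from the patent."""
    name: str
    compress: Callable[[bytes], bytes]  # this node's data compression mode
    originals: Dict[str, bytes] = field(default_factory=dict)
    targets: Dict[str, bytes] = field(default_factory=dict)

    def receive(self, key: str, original: bytes) -> None:
        # Step 101: receive and write the original data sent by the master node.
        self.originals[key] = original

    def compress_and_delete(self, key: str) -> None:
        # Step 102: compress the written original data into target data.
        self.targets[key] = self.compress(self.originals[key])
        # Step 103: delete the original only after the target data exists.
        del self.originals[key]


original = b"block of user data " * 1000
nodes = [
    # Assumption: zlib levels 1 and 9 stand in for two different modes.
    BackupNode("backup-1", lambda d: zlib.compress(d, 1)),
    BackupNode("backup-2", lambda d: zlib.compress(d, 9)),
]
for node in nodes:
    node.receive("obj-1", original)
    node.compress_and_delete("obj-1")
    print(node.name, len(original), "->", len(node.targets["obj-1"]))
```

Each node deletes its own copy as soon as its own compression finishes, which matches the per-node deletion described above; nodes do not wait for one another.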
In the method for storing data provided by the embodiment of the present invention, after at least two backup nodes complete data writing according to original data sent by a master node, each backup node may perform compression on the original data according to a respective corresponding data compression mode to obtain target data, and delete the original data of the at least two backup nodes after the target data is generated. And the storage space occupied by the target data is smaller than that occupied by the original data. Compared with the prior art that the same storage form is adopted to store the data stored on the main node on the backup nodes, the invention can realize the storage of the original data on the backup nodes by adopting different storage forms, namely, the original data are compressed according to the data compression mode corresponding to each backup node, then the target data obtained by different data compression modes are stored on the corresponding backup nodes, and the original data are deleted at the same time. Because the storage space occupied by the target data obtained by the data compression mode is smaller than that occupied by the original data, the space occupied by each backup node for storing the data is reduced, and the problem of insufficient disk space caused by large space occupied by the backup nodes is solved.
When the number of backup nodes is two and the main node cannot provide the required original data to the terminal, it is desirable that the first backup node can preferentially provide the original data to the terminal through the main node. To this end, in one implementation manner of the embodiment of the present invention, the at least two backup nodes may include a first backup node and a second backup node, the data compression mode may include a first mode or a second mode, the target data may include a first target data or a second target data, and a time for restoring the first target data to the original data is less than a time for restoring the second target data to the original data. Therefore, on the basis of the implementation shown in FIG. 2, the implementation shown in FIG. 3 can also be realized. Step 102, in which the at least two backup nodes, after the original data is written into them, compress the original data according to the data compression mode corresponding to each backup node to obtain the target data, may be specifically implemented as step 1021 and step 1022:
1021. after the original data are written into the at least two backup nodes, the first backup node compresses the original data according to a first mode to obtain first target data, so that when the main node receives a request message for reading the original data and the original data written into the main node cannot be read, the first backup node preferentially restores the first target data into the original data.
1022. And after the original data are written into the at least two backup nodes, the second backup node compresses the original data according to a second mode to obtain second target data.
It should be noted that the first mode may include a fast compression algorithm and the second mode may include a high compression ratio algorithm. In the embodiment of the invention, the time for compressing the original data into the first target data is less than the time for compressing the original data into the second target data, and the storage space occupied by the first target data is larger than that occupied by the second target data.
In the embodiment of the present invention, since the fast compression algorithm compresses data faster than the high compression rate algorithm, the time for generating the first target data is shorter than that for generating the second target data; similarly, the time for restoring the first target data to the original data is shorter than that for restoring the second target data to the original data. Therefore, when the master node cannot provide the original data for the terminal, in order to ensure that the terminal can acquire the original data as soon as possible, the original data is restored from the first target data, which requires less restoration time, by applying the reverse process of the first mode, and the original data is sent to the terminal through the master node. In this way, the storage space used by the backup nodes is saved while the overall performance of the storage system is maintained. It should be noted that the fast compression algorithm may be a compression algorithm with a relatively fast compression speed, such as LZ4 or Snappy, whose output still occupies a relatively small storage space; the high compression rate algorithm may be an algorithm with a lower compression speed than the fast compression algorithm, such as gzip, but the storage space occupied by the target data it produces is smaller than that occupied by the target data produced by the fast compression algorithm.
Therefore, the first target data serving as the backup data can save storage space and have a function of quickly recovering the original data, and the second target data serving as the other backup data can better save storage space, so that when the main node and the first backup node cannot provide the original data, the original data can still be ensured not to be lost, and the original data is provided for the terminal through the main node.
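The trade-off between the first mode and the second mode can be observed with a small measurement. The sketch below is only an illustration: LZ4 and Snappy are not part of the Python standard library, so zlib at level 1 is used here as an assumed stand-in for the fast compression algorithm, while gzip at level 9 plays the role of the high compression rate algorithm. The `measure` helper and the sample data are hypothetical, and actual sizes and timings depend on the algorithm, the data, and the machine.

```python
import gzip
import time
import zlib


def measure(name, compress, decompress, data):
    """Compress and restore `data`, returning size and timing figures."""
    t0 = time.perf_counter()
    target = compress(data)
    t1 = time.perf_counter()
    restored = decompress(target)
    t2 = time.perf_counter()
    assert restored == data  # restoration must yield the original data
    return {
        "mode": name,
        "size": len(target),
        "compress_s": t1 - t0,
        "restore_s": t2 - t1,
    }


# Hypothetical original data: repetitive log-like records.
original = b"".join(b"record %d: status=ok\n" % i for i in range(50000))

results = [
    # Assumed stand-in for the first mode (fast compression).
    measure("first mode", lambda d: zlib.compress(d, 1), zlib.decompress, original),
    # Second mode (high compression rate), using gzip as named in the text.
    measure("second mode", lambda d: gzip.compress(d, compresslevel=9),
            gzip.decompress, original),
]
for r in results:
    print(r["mode"], r["size"], "bytes,",
          f"compress {r['compress_s']:.4f}s, restore {r['restore_s']:.4f}s")
```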
In the method for storing data provided by the embodiment of the invention, when two backup nodes are provided, the first backup node can compress the original data according to the first mode to obtain the first target data, and the second backup node may compress the original data according to the second mode to obtain the second target data. Although the storage space occupied by the first target data is larger than that occupied by the second target data, the time for restoring the first target data into the original data is shorter than that for restoring the second target data into the original data. When the terminal sends a request message for reading the original data to the master node and the original data on the master node cannot be read, the first backup node may preferentially restore the first target data to the original data so that the terminal can read the original data through the master node. Compared with the prior art, in which the data stored on the main node is stored on the backup nodes in the same storage form, the restored original data can be provided to the terminal as soon as possible when the main node cannot provide the original data, because restoring the first target data takes less time than restoring the second target data. Therefore, on the basis of reducing the space occupied by each backup node and solving the problem of insufficient disk space caused by the large space occupied by the backup nodes, the impact that the data compression modes have on the overall performance of the storage system is reduced.
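The read path with preferential restoration from the first backup node might look like the following sketch. The function name `read_original`, the use of `None` to model a copy that cannot be read, and the choice of zlib and gzip as the two modes are assumptions made for illustration only.

```python
import gzip
import zlib
from typing import Optional


def read_original(master_copy: Optional[bytes],
                  first_target: Optional[bytes],
                  second_target: Optional[bytes]) -> bytes:
    """Return the original data, trying the cheapest source first.

    `None` models a copy that cannot be read (hypothetical convention).
    """
    if master_copy is not None:
        # Normal case: the master node holds the uncompressed original data.
        return master_copy
    if first_target is not None:
        # Preferred fallback: reverse the first (fast) mode.
        return zlib.decompress(first_target)
    if second_target is not None:
        # Last resort: reverse the second (high compression rate) mode.
        return gzip.decompress(second_target)
    raise LookupError("no readable copy of the original data")


data = b"payload " * 500
first = zlib.compress(data, 1)               # assumed first-mode target data
second = gzip.compress(data, compresslevel=9)  # assumed second-mode target data

print(read_original(data, first, second) == data)   # served by the master node
print(read_original(None, first, second) == data)   # restored from first target
print(read_original(None, None, second) == data)    # restored from second target
```

In every case the restored data would be returned to the terminal through the master node, as the text describes; the sketch models only the choice of source.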
In order to ensure that the terminal knows the storage status of the original data, in one implementation manner of the embodiment of the present invention, after at least two backup nodes receive and write the original data, a feedback message indicating that the original data is successfully written into the backup nodes may be sent to the terminal through the master node. Therefore, on the basis of the implementation shown in FIG. 2, the implementation shown in FIG. 4 can also be realized. Step 101, in which the at least two backup nodes receive the original data sent by the main node, may be specifically implemented as step 1011. In addition, after the original data is written into the at least two backup nodes and before the at least two backup nodes compress the original data according to the data compression mode corresponding to each backup node to obtain the target data as in step 102, that is, before step 105 is executed, step 104 may also be executed:
1011. when the main node receives the original data sent by the terminal and writes the original data in, and the main node sends the original data to the at least two backup nodes, the at least two backup nodes receive the original data sent by the main node.
104. After the original data is written into the at least two backup nodes, the at least two backup nodes send, to the main node, a feedback message indicating that the original data is successfully written, so that the main node sends the feedback message to the terminal.
105. And at least two backup nodes compress the original data according to the data compression mode corresponding to each backup node to obtain the target data.
In the method for storing data provided by the embodiment of the invention, after at least two backup nodes complete data writing according to original data sent by the main node, each backup node can send a feedback message that the original data is successfully written to the main node, and the main node sends the feedback message to the terminal. Compared with the prior art that the same storage form is adopted to store the data stored on the main node on the backup nodes, the invention can realize the storage of the original data on the backup nodes by adopting different storage forms, namely, the original data are compressed according to the data compression mode corresponding to each backup node, then the target data obtained by different data compression modes are stored on the corresponding backup nodes, and the original data are deleted at the same time. Because the storage space occupied by the target data obtained through the data compression mode is smaller than the storage space occupied by the original data, and each backup node can inform the terminal of the feedback message for completing the writing of the original data through the main node in time, the backup node can send the feedback message to the terminal through the main node on the basis of reducing the space occupied by the data stored by each backup node and solving the problem of insufficient disk space caused by large space occupied by the backup node, so that the terminal can determine that the backup node successfully writes the original data, and the effect of backing up the original data is achieved.
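The ordering of steps 1011, 104, and 105 (write, acknowledge, then compress) can be sketched as below. This is a simplified, single-process illustration: the `BackupNode` class, the `master_write` function, the `pending` queue, and the explicit `run_background` call are hypothetical names that model deferred compression without real networking or threads.

```python
import zlib
from collections import deque


class BackupNode:
    """Hypothetical backup node that acknowledges before compressing."""

    def __init__(self, name, level):
        self.name = name
        self.level = level        # assumed zlib level standing in for a mode
        self.originals = {}
        self.targets = {}
        self.pending = deque()    # keys waiting for background compression

    def write(self, key, original):
        # Step 1011: write the original data received from the master node.
        self.originals[key] = original
        self.pending.append(key)
        # Step 104: feedback that the original data was written successfully.
        return True

    def run_background(self):
        # Step 105, then deletion: compress pending data and drop originals.
        while self.pending:
            key = self.pending.popleft()
            self.targets[key] = zlib.compress(self.originals[key], self.level)
            del self.originals[key]


def master_write(master_store, backups, key, original):
    master_store[key] = original  # the master keeps the data uncompressed
    acks = [b.write(key, original) for b in backups]
    return all(acks)              # forwarded to the terminal as feedback


master_store = {}
backups = [BackupNode("backup-1", 1), BackupNode("backup-2", 9)]
payload = b"sensor reading 42\n" * 2000

ack = master_write(master_store, backups, "obj-1", payload)
uncompressed_at_ack = all("obj-1" in b.originals for b in backups)
for b in backups:
    b.run_background()
print(ack, uncompressed_at_ack)
```

Because the feedback is returned before `run_background` is called, the time spent compressing does not delay the acknowledgement seen by the terminal in this model.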
An embodiment of the present invention provides an apparatus 20 for storing data, configured to execute the method flows shown in fig. 1 to 4, as shown in fig. 5, where the apparatus 20 is used in a storage system, the storage system includes at least two backup nodes and a master node, and the apparatus 20 includes:
the receiving module 21 is configured to receive original data sent by a master node, where the original data is data written by the master node.
The generating module 22 is configured to compress the original data according to the data compression mode corresponding to each backup node after the original data is written into the at least two backup nodes, so as to obtain target data, where a storage space occupied by the target data is smaller than a storage space occupied by the original data.
And a deleting module 23, configured to delete the original data.
In one implementation manner of the embodiment of the present invention, the at least two backup nodes include a first backup node and a second backup node, the data compression mode includes a first mode or a second mode, the target data includes a first target data or a second target data, a time for restoring the first target data to the original data is less than a time for restoring the second target data to the original data, and the generation module 22 includes a first generation module 221 and a second generation module 222.
The first generating module 221 is disposed at the first backup node and is configured to compress the original data according to the first mode to obtain the first target data, so that when the master node receives a request message for reading the original data and the original data written by the master node cannot be read, the first backup node preferentially restores the first target data to the original data.
The second generating module 222 is disposed at the second backup node, and configured to compress the original data according to a second mode to obtain second target data.
It should be noted that the first mode includes a fast compression algorithm, the second mode includes a high compression ratio algorithm, time for compressing the original data into the first target data is shorter than time for compressing the original data into the second target data, and a storage space occupied by the first target data is larger than that occupied by the second target data.
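The trade-off between the two modes can be tried out with the following Python sketch. The patent does not name specific algorithms, so the choice of `zlib` at level 1 as the fast compression algorithm and `lzma` at preset 9 as the high compression ratio algorithm is an assumption, as are all function names. The helper `measure` reports the compressed size and the elapsed compression and restoration times so that the two modes can be compared on real data; actual results depend on the data and the machine.

```python
import lzma
import time
import zlib


def compress_first_mode(data: bytes) -> bytes:
    # Assumed stand-in for the fast compression algorithm.
    return zlib.compress(data, 1)


def restore_first_mode(blob: bytes) -> bytes:
    return zlib.decompress(blob)


def compress_second_mode(data: bytes) -> bytes:
    # Assumed stand-in for the high compression ratio algorithm.
    return lzma.compress(data, preset=9)


def restore_second_mode(blob: bytes) -> bytes:
    return lzma.decompress(blob)


def measure(compress, restore, data: bytes) -> dict:
    """Compress and restore once, returning size and elapsed times."""
    start = time.perf_counter()
    blob = compress(data)
    compress_s = time.perf_counter() - start

    start = time.perf_counter()
    restored = restore(blob)
    restore_s = time.perf_counter() - start

    if restored != data:
        raise ValueError("restored data does not match the original data")
    return {"size": len(blob), "compress_s": compress_s, "restore_s": restore_s}
```

A comparison could be run as `measure(compress_first_mode, restore_first_mode, data)` and `measure(compress_second_mode, restore_second_mode, data)` on the same `data`.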
In an implementation manner of the embodiment of the present invention, the storage system further includes a terminal. After the master node receives the original data sent by the terminal, writes the original data, and sends the original data to the at least two backup nodes, the receiving module 21 is specifically configured to receive the original data sent by the master node.
The apparatus 20 further comprises:
the sending module 24 is configured to send the master node a feedback message indicating that the original data was written successfully, so that the master node sends the feedback message to the terminal.
In the apparatus for storing data according to this embodiment of the present invention, after the at least two backup nodes have written the original data sent by the master node, each backup node may compress the original data according to its own data compression mode to obtain target data, and the original data on the at least two backup nodes is deleted after the target data is generated. The target data occupies less storage space than the original data. In the prior art, the data stored on the master node is stored on the backup nodes in the same storage form. The invention instead stores the original data on the backup nodes in different storage forms: each backup node compresses the original data according to its own data compression mode, stores the resulting target data, and deletes the original data. Because the target data occupies less storage space than the original data, each backup node needs less space to store data, which addresses the shortage of disk space caused by backup nodes occupying a large amount of space.
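The division of the apparatus 20 into modules can be pictured with the following Python sketch. The class `StorageApparatus` and its method names are hypothetical; each method is merely labelled with the module it stands in for. The compression function is passed in by the caller, and the guard that refuses to delete the original data before target data exists is an assumption added to reflect the statement that deletion follows generation of the target data.

```python
import zlib
from typing import Callable, Optional


class StorageApparatus:
    """Hypothetical mapping of modules 21 to 24 onto methods."""

    def __init__(self, compress: Callable[[bytes], bytes]):
        self.compress = compress              # the node's data compression mode
        self.original: Optional[bytes] = None
        self.target: Optional[bytes] = None

    def receive(self, data: bytes) -> None:
        # Receiving module 21: accept original data sent by the master node.
        self.original = data

    def send_feedback(self) -> str:
        # Sending module 24: report that the original data was written.
        if self.original is None:
            raise RuntimeError("no original data has been written")
        return "write-ok"

    def generate(self) -> None:
        # Generating module 22: compress the original data into target data.
        if self.original is None:
            raise RuntimeError("no original data to compress")
        self.target = self.compress(self.original)

    def delete_original(self) -> None:
        # Deleting module 23: only allowed once the target data exists.
        if self.target is None:
            raise RuntimeError("target data must exist before deletion")
        self.original = None
```

For example, `StorageApparatus(lambda d: zlib.compress(d, 1))` would model an apparatus whose compression mode is a fast `zlib` setting.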
An embodiment of the present invention provides a backup node 30. The backup node 30, a terminal, and a master node form a storage system, and the backup node 30 is configured to execute the method flows shown in fig. 1 to 4. As shown in fig. 6, the backup node 30 includes a processor 31 and an interface circuit 32; a memory 33 and a bus 34 are also shown in the figure. The processor 31, the interface circuit 32, and the memory 33 are connected by the bus 34 and communicate with one another through it.
It should be noted that the processor 31 may be a single processing element or may be a general term for multiple processing elements. For example, the processing element may be a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention, such as one or more microprocessors or digital signal processors (DSPs), or one or more Field Programmable Gate Arrays (FPGAs).
The memory 33 may be a single storage device or a combination of storage elements, and is used for storing executable program code or the parameters, data, etc. required for the operation of the backup node 30. The memory 33 may include a Random Access Memory (RAM) or a non-volatile memory, such as a magnetic disk memory or a Flash memory.
The bus 34 may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like. The bus 34 may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 6, but this is not intended to represent only one bus or type of bus.
The backup node 30 may also include input and output devices coupled to the bus 34 for coupling to other components, such as the processor 31, via the bus 34.
The processor 31 calls the program code in the memory 33 to perform the operations performed by the backup node 30 in the above method embodiments. For example, these operations include:
receiving, through the interface circuit 32, the original data sent by the master node, where the original data is data written by the master node;
after the original data is written into the at least two backup nodes, compressing, by the processor 31, the original data according to the data compression mode corresponding to each backup node to obtain target data, where the storage space occupied by the target data is smaller than that occupied by the original data; and
deleting, by the processor 31, the original data.
In an implementation manner of the embodiment of the present invention, the backup node may be specifically a first backup node or a second backup node, the data compression mode includes a first mode or a second mode, the target data includes a first target data or a second target data, and a time for restoring the first target data to the original data is shorter than a time for restoring the second target data to the original data.
When the backup node 30 is a first backup node, the processor 31 may compress the original data according to a first mode to obtain first target data, so that when the master node receives a request message for reading the original data and the original data written by the master node cannot be read, the first backup node preferentially restores the first target data to the original data; when the backup node 30 is a second backup node, the original data may be compressed by the processor 31 according to a second mode to obtain a second target data.
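The read path described above can be sketched in Python as follows. The function `read_original` and its return labels are hypothetical, and the decompressors (`zlib` for the first target data, `lzma` for the second) are assumed stand-ins for the unnamed fast and high compression ratio algorithms. Falling back to the second backup node when the first target data is unavailable is also an assumption; the text only states that the first backup node is preferred.

```python
import lzma
import zlib
from typing import Optional, Tuple


def read_original(
    master_copy: Optional[bytes],
    first_target: Optional[bytes],
    second_target: Optional[bytes],
) -> Tuple[bytes, str]:
    """Return the original data and the name of the source that supplied it.

    A value of None models a copy that cannot be read.
    """
    if master_copy is not None:
        return master_copy, "master"
    if first_target is not None:
        # Preferred path: the first target data restores faster.
        return zlib.decompress(first_target), "first backup"
    if second_target is not None:
        # Assumed fallback: slower restoration from the second target data.
        return lzma.decompress(second_target), "second backup"
    raise LookupError("no readable copy of the original data")
```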
It should be noted that the first mode includes a fast compression algorithm, the second mode includes a high compression ratio algorithm, time for compressing the original data into the first target data is shorter than time for compressing the original data into the second target data, and a storage space occupied by the first target data is larger than that occupied by the second target data.
In an implementation manner of the embodiment of the present invention, the interface circuit 32 is specifically configured to receive the original data sent by the master node after the master node receives the original data sent by the terminal, writes the original data, and sends the original data to the at least two backup nodes.
In the embodiment of the present invention, a feedback message indicating that the original data is successfully written may also be sent to the master node through the interface circuit 32, so that the master node sends the feedback message to the terminal.
In the backup node provided in this embodiment of the present invention, after the at least two backup nodes have written the original data sent by the master node, each backup node may compress the original data according to its own data compression mode to obtain target data, and the original data on the at least two backup nodes is deleted after the target data is generated. The target data occupies less storage space than the original data. In the prior art, the data stored on the master node is stored on the backup nodes in the same storage form. The invention instead stores the original data on the backup nodes in different storage forms: each backup node compresses the original data according to its own data compression mode, stores the resulting target data, and deletes the original data. Because the target data occupies less storage space than the original data, each backup node needs less space to store data, which addresses the shortage of disk space caused by backup nodes occupying a large amount of space.
The embodiments in this specification are described in a progressive manner. For parts that are the same or similar among the embodiments, reference may be made from one embodiment to another, and each embodiment focuses on its differences from the others. In particular, because the apparatus embodiment is substantially similar to the method embodiment, it is described relatively briefly, and reference may be made to the description of the method embodiment for the relevant points.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above description is only for the specific embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (3)

1. A storage system comprising a first backup node and a second backup node, wherein,
the first backup node is configured to: receiving data sent by a main node, and compressing the data by using a first mode to obtain first target data;
the second backup node is configured to: receiving the data sent by the main node, and compressing the data by using a second mode to obtain second target data; wherein a time to restore the first target data to the data is less than a time to restore the second target data to the data;
the first mode includes a fast compression algorithm, the second mode includes a high compression rate algorithm, a time to compress the data into the first target data is less than a time to compress the data into the second target data, and the first target data occupies a larger storage space than the second target data.
2. An apparatus for storing data, the apparatus being for a storage system including a first backup node and a second backup node, the apparatus comprising:
the receiving module is used for receiving data sent by the main node;
the compression module is used for compressing the data by using a first mode to obtain first target data and compressing the data by using a second mode to obtain second target data; wherein a time to restore the first target data to the data is less than a time to restore the second target data to the data;
the first mode includes a fast compression algorithm, the second mode includes a high compression rate algorithm, a time to compress the data into the first target data is less than a time to compress the data into the second target data, and the first target data occupies a larger storage space than the second target data.
3. A storage medium storing a computer program, the computer program comprising computer instructions which, when executed by a computer, perform operations comprising:
receiving data sent by a main node, compressing the data by using a first mode to obtain first target data, and compressing the data by using a second mode to obtain second target data; wherein a time to restore the first target data to the data is less than a time to restore the second target data to the data;
the first mode includes a fast compression algorithm, the second mode includes a high compression rate algorithm, a time to compress the data into the first target data is less than a time to compress the data into the second target data, and the first target data occupies a larger storage space than the second target data.
CN201811508945.9A 2016-02-03 2016-02-03 Data storage method and device Active CN109783014B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811508945.9A CN109783014B (en) 2016-02-03 2016-02-03 Data storage method and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610078319.5A CN105760245B (en) 2016-02-03 2016-02-03 A kind of method and device of storing data
CN201811508945.9A CN109783014B (en) 2016-02-03 2016-02-03 Data storage method and device

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201610078319.5A Division CN105760245B (en) 2016-02-03 2016-02-03 A kind of method and device of storing data

Publications (2)

Publication Number Publication Date
CN109783014A CN109783014A (en) 2019-05-21
CN109783014B true CN109783014B (en) 2022-04-05

Family

ID=56329956

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201811508945.9A Active CN109783014B (en) 2016-02-03 2016-02-03 Data storage method and device
CN201610078319.5A Active CN105760245B (en) 2016-02-03 2016-02-03 A kind of method and device of storing data

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN201610078319.5A Active CN105760245B (en) 2016-02-03 2016-02-03 A kind of method and device of storing data

Country Status (1)

Country Link
CN (2) CN109783014B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109308233A (en) * 2017-07-28 2019-02-05 中兴通讯股份有限公司 Data back up method, apparatus and system
WO2019092733A1 (en) * 2017-11-09 2019-05-16 Telefonaktiebolaget Lm Ericsson (Publ) Method, apparatuses, computer programs and computer program products for data storage
CN107948334B (en) * 2018-01-09 2019-06-07 无锡华云数据技术服务有限公司 Data processing method based on distributed memory system
CN108494788B (en) * 2018-03-29 2020-11-24 深圳市国富前海区块链技术股份有限公司 Data transmission method, data transmission device and computer readable storage medium
CN109582245A (en) * 2018-12-06 2019-04-05 联想(北京)有限公司 Data processing method, device and equipment
CN110209640A (en) * 2019-06-06 2019-09-06 四川长虹电器股份有限公司 The method of switching at runtime lz4 compression algorithm type under cell phone system operating status
CN110837343B (en) * 2019-09-27 2021-06-22 华为技术有限公司 Snapshot processing method and device and terminal
CN117519611B (en) * 2024-01-05 2024-03-15 南京扬子信息技术有限责任公司 Data distributed storage method and system for information system

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1271938A (en) * 1999-04-24 2000-11-01 Lg电子株式会社 Numerical data broadcastor and data processing method and data storage medium
WO2006090412A2 (en) * 2005-02-24 2006-08-31 Monish Shah A data storage system and a method for its operation
CN101523850A (en) * 2006-08-03 2009-09-02 思杰系统有限公司 Systems and methods for providing multi-mode transport layer compression
CN102437894A (en) * 2011-11-04 2012-05-02 百度在线网络技术(北京)有限公司 Method, device and equipment for compressing information to be sent
CN102761540A (en) * 2012-05-30 2012-10-31 北京奇虎科技有限公司 Data compression method, device and system and server
CN103503381A (en) * 2011-11-21 2014-01-08 华为技术有限公司 Method, device and system for device redirection data transmission
CN105260268A (en) * 2015-10-10 2016-01-20 浪潮(北京)电子信息产业有限公司 Backup storage method and apparatus

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN201491037U (en) * 2009-03-27 2010-05-26 深圳市迈科龙电子有限公司 Remote redundant backup system
CN103533004A (en) * 2012-07-06 2014-01-22 深圳市腾讯计算机系统有限公司 Data transmission method and system based on stage compression

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1271938A (en) * 1999-04-24 2000-11-01 Lg电子株式会社 Numerical data broadcastor and data processing method and data storage medium
WO2006090412A2 (en) * 2005-02-24 2006-08-31 Monish Shah A data storage system and a method for its operation
CN101523850A (en) * 2006-08-03 2009-09-02 思杰系统有限公司 Systems and methods for providing multi-mode transport layer compression
CN102437894A (en) * 2011-11-04 2012-05-02 百度在线网络技术(北京)有限公司 Method, device and equipment for compressing information to be sent
CN103503381A (en) * 2011-11-21 2014-01-08 华为技术有限公司 Method, device and system for device redirection data transmission
CN102761540A (en) * 2012-05-30 2012-10-31 北京奇虎科技有限公司 Data compression method, device and system and server
CN105260268A (en) * 2015-10-10 2016-01-20 浪潮(北京)电子信息产业有限公司 Backup storage method and apparatus

Also Published As

Publication number Publication date
CN105760245B (en) 2019-03-26
CN105760245A (en) 2016-07-13
CN109783014A (en) 2019-05-21

Similar Documents

Publication Publication Date Title
CN109783014B (en) Data storage method and device
CN106776130B (en) Log recovery method, storage device and storage node
US10817386B2 (en) Virtual machine recovery method and virtual machine management device
US10860447B2 (en) Database cluster architecture based on dual port solid state disk
CN110389858B (en) Method and device for recovering faults of storage device
WO2018121456A1 (en) Data storage method, server and storage system
US20190227710A1 (en) Incremental data restoration method and apparatus
CN110825562B (en) Data backup method, device, system and storage medium
CN110781157B (en) Backup and recovery method and device based on NAS
CN111818124B (en) Data storage method, data storage device, electronic equipment and medium
CN115658390A (en) Container disaster tolerance method, system, device, equipment and computer readable storage medium
CN109684128B (en) Cluster overall fault recovery method of message middleware, server and storage medium
CN108616598B (en) Data synchronization method and device and distributed storage system
CN104407806A (en) Method and device for revising hard disk information of redundant array group of independent disk (RAID)
CN109165117B (en) Data processing method and system
CN111625402A (en) Data recovery method and device, electronic equipment and computer readable storage medium
CN106020975B (en) Data operation method, device and system
CN115878381A (en) Data recovery method and device based on SRM disc, storage medium and electronic device
CN107885615B (en) Distributed storage data recovery method and system
CN106161061B (en) Service configuration rollback method and network equipment
CN109324931B (en) Method for realizing vmware mount recovery in data de-duplication system
CN104572350A (en) Method and device for processing metadata
CN113407508B (en) Method, system, equipment and medium for compressing log file
CN112966046B (en) Data synchronization method and device, electronic equipment and storage medium
CN108599982B (en) Data recovery method and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant