WO2015109483A1 - Method and device for storing data - Google Patents

Method and device for storing data

Info

Publication number
WO2015109483A1
WO2015109483A1 PCT/CN2014/071224 CN2014071224W WO2015109483A1 WO 2015109483 A1 WO2015109483 A1 WO 2015109483A1 CN 2014071224 W CN2014071224 W CN 2014071224W WO 2015109483 A1 WO2015109483 A1 WO 2015109483A1
Authority
WO
WIPO (PCT)
Prior art keywords
server
data
storage space
remaining
servers
Prior art date
Application number
PCT/CN2014/071224
Other languages
English (en)
Chinese (zh)
Inventor
琚列丹
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2014/071224 (WO2015109483A1)
Priority to CN201480000338.5A (CN104205780B)
Publication of WO2015109483A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G06F3/0611 Improving I/O performance in relation to response time
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629 Configuration or reconfiguration of storage systems
    • G06F3/0631 Configuration or reconfiguration of storage systems by allocating resources to storage systems

Definitions

  • The present invention relates to the field of storage technologies and, in particular, to a method and apparatus for storing data.
  • Background
  • A distributed storage system consists of multiple servers. Each server may have one or more virtual machines installed, and each server corresponds to a RAID (Redundant Array of Independent Disks) disk array.
  • Physical storage space is generally allocated to the data generated by an application/virtual machine installed on a server according to the principle of "load balancing", that is, according to the principle of "balanced used storage space of each server". The physical storage space is the storage space of the disk array corresponding to a server and/or the storage space of the server itself.
  • The following takes HDFS (Hadoop Distributed File System) as an example of the distributed storage system:
  • When the data generated by the application/virtual machine occupies more than 64M of storage space, the server first splits the data into blocks. Specifically, at most one block occupies less than 64M of storage space, and each remaining block occupies exactly 64M. Then, according to the principle of "balanced used storage space of each server in HDFS", each block is mapped to the storage space of one or more servers in HDFS.
  • When the data generated by the application/virtual machine occupies 64M of storage space or less, the data is mapped to the HDFS server with the smallest used storage space, according to the same principle.
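The background allocation described above can be illustrated with a minimal sketch. It is not HDFS's actual placement code: the function names, the server names, and the reading of "64M" as 64 MiB are assumptions made for illustration, and "balanced used storage space" is modelled simply as "send each block to the server with the least used space".

```python
BLOCK_SIZE = 64 * 1024 * 1024  # "64M" from the text, assumed here to mean 64 MiB


def split_into_blocks(data_size, block_size=BLOCK_SIZE):
    """Return block sizes: full blocks first, then at most one smaller block."""
    full_blocks, remainder = divmod(data_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        sizes.append(remainder)
    return sizes


def place_blocks(block_sizes, used):
    """Map each block to the server with the least used space (assumed policy)."""
    used = dict(used)  # work on a copy so the caller's figures are unchanged
    placement = []
    for size in block_sizes:
        target = min(used, key=used.get)
        placement.append(target)
        used[target] += size
    return placement


MB = 1024 * 1024
blocks = split_into_blocks(150 * MB)
print([b // MB for b in blocks])  # [64, 64, 22]
print(place_blocks(blocks, {"S1": 300 * MB, "S2": 100 * MB, "S3": 120 * MB}))
# ['S2', 'S3', 'S2']
```

Data of 64M or less yields a single block, so the same placement function covers the second case as well. Note that nothing in this policy looks at which server generated the data, which is the source of the remote reads the invention aims to avoid.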
  • Embodiments of the present invention provide a method and apparatus for storing data, which are used to reduce delay in reading data, thereby improving system performance.
  • the embodiment of the present invention adopts the following technical solutions:
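  • A first aspect provides a method for storing data, which is applied to a distributed storage system, where the distributed storage system includes a first server and other servers. The method includes: determining first data generated by the first server; determining a size relationship between a remaining storage space of the first server and a storage space occupied by the first data; and storing the first data in the remaining storage space of the first server when the remaining storage space of the first server is greater than or equal to the storage space occupied by the first data and the used storage space of the first server is greater than the minimum used storage space of the other servers.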
  • In an optional implementation, the first server includes a virtual machine, the first data is data generated by the virtual machine, and the other servers include a second server. The method further includes: if the remaining storage space of the first server is smaller than the storage space occupied by the first data, sending a first indication message that includes the first data to the second server, where the first indication message instructs the second server to store the first data in a remaining storage space of the second server; and migrating the virtual machine and the other data to the second server when the storage space occupied by the data corresponding to the virtual machine other than the first data (the other data) is less than or equal to the remaining storage space of the second server.
  • In another optional implementation, the other servers include a second server, and the method further includes: if the remaining storage space of the first server is smaller than the storage space occupied by the first data, sending a first indication message that includes the first data to the second server, where the first indication message instructs the second server to store the first data in the remaining storage space of the second server; updating the remaining storage space of the first server; when the updated remaining storage space of the first server is greater than or equal to the storage space occupied by the first data, sending a second indication message to the second server, where the second indication message instructs the second server to migrate the first data to the first server; receiving the first data sent by the second server; and storing the first data in the updated remaining storage space of the first server.
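  • In another optional implementation, the other servers further include a third server, and the method further includes: sending, to the third server, a third indication message that includes a copy of the first data, where the third indication message instructs the third server to store the copy of the first data in a remaining storage space of the third server.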
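  • A second aspect provides a method for storing data, which is applied to a distributed storage system, where the distributed storage system includes a first server and other servers. The method includes: determining first data generated by the first server, where the storage space occupied by the first data is greater than the maximum value of the minimum storage unit of the distributed storage system; determining the size relationship between the remaining storage space of the first server and the storage space occupied by the first data; and storing the first data in the remaining storage space of the first server when the remaining storage space of the first server is greater than or equal to the storage space occupied by the first data and the used storage space of the first server is less than or equal to the minimum used storage space of the other servers.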
  • A third aspect provides a server, which is applied to a distributed storage system, where the distributed storage system further includes other servers. The server includes: a determining module, configured to determine first data generated by the server; and a storage module, configured to store the first data in the remaining storage space of the server when the determining module determines that the remaining storage space of the server is greater than or equal to the storage space occupied by the first data and that the used storage space of the server is greater than the minimum used storage space of the other servers.
  • In an optional implementation, the server includes a virtual machine, the first data is data generated by the virtual machine, and the other servers include a second server. The server further includes: a sending module, configured to send a first indication message that includes the first data to the second server when the determining module determines that the remaining storage space of the server is smaller than the storage space occupied by the first data, where the first indication message instructs the second server to store the first data in a remaining storage space of the second server; and a migration module, configured to migrate the virtual machine and the other data to the second server when the storage space occupied by the data corresponding to the virtual machine other than the first data (the other data) is less than or equal to the remaining storage space of the second server.
  • In another optional implementation, the other servers include a second server. The server further includes: a sending module, configured to send a first indication message that includes the first data to the second server when the determining module determines that the remaining storage space of the server is smaller than the storage space occupied by the first data, where the first indication message instructs the second server to store the first data in a remaining storage space of the second server; and an update module, configured to update the remaining storage space of the server. The sending module is further configured to send a second indication message to the second server when the updated remaining storage space of the server is greater than or equal to the storage space occupied by the first data, where the second indication message instructs the second server to migrate the first data to the server. The server further includes a receiving module, configured to receive the first data sent by the second server. The storage module is further configured to store the first data in the updated remaining storage space of the server.
  • In another optional implementation, the other servers further include a third server. The sending module is further configured to send, to the third server, a third indication message that includes a copy of the first data, where the third indication message instructs the third server to store the copy of the first data in a remaining storage space of the third server.
  • A fourth aspect provides a server, which is applied to a distributed storage system, where the distributed storage system further includes other servers. The server includes: a processor, configured to determine first data generated by the server and to determine the size relationship between the remaining storage space of the server and the storage space occupied by the first data; and a memory, configured to store, under the control of the processor, the first data in the remaining storage space of the server when the processor determines that the remaining storage space of the server is greater than or equal to the storage space occupied by the first data and that the used storage space of the server is greater than the minimum used storage space of the other servers.
  • In an optional implementation, the server includes a virtual machine, the first data is data generated by the virtual machine, and the other servers include a second server. The server further includes a transmitter, configured to send a first indication message that includes the first data to the second server when the processor determines that the remaining storage space of the server is smaller than the storage space occupied by the first data, where the first indication message instructs the second server to store the first data in a remaining storage space of the second server. The processor is further configured to migrate the virtual machine and the other data to the second server when the storage space occupied by the data corresponding to the virtual machine other than the first data (the other data) is less than or equal to the remaining storage space of the second server.
  • In another optional implementation, the other servers include a second server. The processor is further configured to update the remaining storage space of the server. The transmitter is further configured to send a second indication message to the second server when the updated remaining storage space of the server is greater than or equal to the storage space occupied by the first data, where the second indication message instructs the second server to migrate the first data to the server. The server further includes a receiver, configured to receive the first data sent by the second server. The memory is further configured to store, under the control of the processor, the first data in the updated remaining storage space of the server.
  • In another optional implementation, the other servers further include a third server. The transmitter is further configured to send, to the third server, a third indication message that includes a copy of the first data, where the third indication message instructs the third server to store the copy of the first data in the remaining storage space of the third server.
  • A fifth aspect provides a server, which is applied to a distributed storage system, where the distributed storage system further includes other servers. The server includes: a determining module, configured to determine first data generated by the server, where the storage space occupied by the first data is greater than the maximum value of the minimum storage unit of the distributed storage system, and to determine the size relationship between the remaining storage space of the server and the storage space occupied by the first data; and a storage module, configured to store the first data in the remaining storage space of the server when the determining module determines that the remaining storage space of the server is greater than or equal to the storage space occupied by the first data and that the used storage space of the server is less than or equal to the minimum used storage space of the other servers.
  • A sixth aspect provides a server, which is applied to a distributed storage system, where the distributed storage system further includes other servers. The server includes: a processor, configured to determine first data generated by the server, where the storage space occupied by the first data is greater than the maximum value of the minimum storage unit of the distributed storage system, and to determine the size relationship between the remaining storage space of the server and the storage space occupied by the first data; and a memory, configured to store, under the control of the processor, the first data in the remaining storage space of the server when the processor determines that the remaining storage space of the server is greater than or equal to the storage space occupied by the first data and that the used storage space of the server is less than or equal to the minimum used storage space of the other servers.
  • The foregoing technical solutions are applied to a distributed storage system. When a server determines that its remaining storage space is greater than or equal to the storage space occupied by data that the server itself generated, the data is preferentially stored in the remaining storage space of that server.
  • When the data needs to be read, the server can read it directly from local storage without reading it from other servers in the network, which shortens the delay and improves system performance.
  • This solves the prior-art problem of long delay and poor system performance caused by reading data from other servers in the network.
  • FIG. 1 is a flowchart of a method for storing data according to Embodiment 1 of the present invention
  • FIG. 2 is a flowchart of a method for storing data according to Embodiment 2 of the present invention
  • FIG. 4 is a flowchart of a method for storing data according to Embodiment 2 of the present invention
  • FIG. 5 is a schematic structural diagram of a server according to Embodiment 3 of the present invention
  • FIG. 7 is a schematic structural diagram of a server according to Embodiment 4 of the present invention
  • FIG. 8 is a schematic structural diagram of another server according to Embodiment 4 of the present invention
  • FIG. 9 is a schematic structural diagram of a server according to Embodiment 5 of the present invention
  • FIG. 10 is a schematic structural diagram of a server according to Embodiment 6 of the present invention.
  • The method for storing data provided in this embodiment is applied to a distributed storage system, where the distributed storage system includes a first server and other servers.
  • the method includes:
  • The "first server" may be any server in the distributed storage system; the "other servers" are all servers in the distributed storage system except the first server.
  • The "first data" may be any data generated by the first server.
  • The execution subject of this embodiment may be the "first server".
  • A server in a distributed storage system may or may not have one or more virtual machines installed. When a virtual machine is installed on a server, the server shares its corresponding disk array with the virtual machine installed on it.
  • The first data generated by the server may be data generated by an application installed on the server (physical machine) or data generated by a virtual machine installed on the server.
  • When the data is generated by an application installed on the server, the server records the correspondence between the data and the server; when the data is generated by a virtual machine installed on the server, the server records the correspondence between the data and the virtual machine. Specifically, the correspondence can be recorded by generating a data routing table. In general, each data store in the distributed storage system can share the data routing table. In this document, when the server hosting the application/virtual machine that generates a piece of data and the server whose storage space stores that data are the same server, the data is considered to be localized.
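The correspondence and the localization criterion above can be sketched as follows. The patent does not specify the layout of the data routing table, so the `RouteEntry` fields, the class name, and the method names are hypothetical choices for illustration only.

```python
from dataclasses import dataclass


@dataclass
class RouteEntry:
    producer: str       # server name (application data) or virtual machine name
    producer_host: str  # server on which the application/virtual machine runs
    storage_host: str   # server whose storage space holds the data


class DataRoutingTable:
    """Hypothetical in-memory routing table shared by the servers."""

    def __init__(self):
        self._entries = {}

    def record(self, data_id, producer, producer_host, storage_host):
        self._entries[data_id] = RouteEntry(producer, producer_host, storage_host)

    def is_localized(self, data_id):
        """Localized: the producer's server is also the server storing the data."""
        entry = self._entries[data_id]
        return entry.producer_host == entry.storage_host


table = DataRoutingTable()
table.record("D1", producer="vm1", producer_host="S1", storage_host="S1")
table.record("D4", producer="vm1", producer_host="S1", storage_host="S2")
print(table.is_localized("D1"), table.is_localized("D4"))  # True False
```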
  • the “remaining storage space of the server” refers to the available storage space in the storage space corresponding to the server and/or the storage space of the server itself.
  • the remaining storage space may have a size of 0 or may include one or more storage units.
  • The remaining storage space of each server may be updated when stored data is deleted, when the size of the disk array corresponding to the server changes, and so on.
  • The server may obtain the remaining storage space in ways that include, but are not limited to, the following: acquiring it at scheduled times, acquiring it periodically, or acquiring it when a piece of data is generated. This embodiment does not limit the size of the "storage space occupied by the first data".
  • the method may further include: acquiring a storage space occupied by the first data, and remaining storage space of the first server.
  • Step 103 may specifically include: when the first server determines that its remaining storage space is greater than or equal to the storage space occupied by the first data, and that the used storage space of the first server is greater than the minimum used storage space of the other servers, the first server allocates a storage address to the first data and stores the first data in the storage unit corresponding to the storage address. The storage address is the address corresponding to one or more storage units in the remaining storage space of the first server.
  • the "minimum used storage space of other servers" is explained by a specific example: Assume that other servers are composed of server 1, server 2, and server 3, and the used storage spaces of the three servers are respectively: A, B, C, and A > B > C; then, the minimum used storage space of other servers is C.
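The condition checked in step 103, together with the "minimum used storage space of other servers" example, can be written as a small sketch. The function names and the concrete numbers (A=50, B=40, C=30) are assumptions chosen only to satisfy A > B > C.

```python
def min_used_of_others(used, first):
    """Minimum used storage space among all servers except `first`."""
    return min(space for name, space in used.items() if name != first)


def should_store_locally(first, data_size, remaining, used):
    """Step 103 condition: the data fits locally and the first server's
    used space exceeds the minimum used space of the other servers."""
    fits = remaining[first] >= data_size
    return fits and used[first] > min_used_of_others(used, first)


# Other servers: server 1, 2, 3 with used space A=50 > B=40 > C=30 (made-up values).
used = {"first": 70, "server1": 50, "server2": 40, "server3": 30}
remaining = {"first": 30, "server1": 50, "server2": 60, "server3": 70}

print(min_used_of_others(used, "first"))                    # 30, i.e. C
print(should_store_locally("first", 20, remaining, used))   # True
print(should_store_locally("first", 40, remaining, used))   # False: does not fit
```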
  • step 103 will be described below from the perspective of the relationship between the storage space occupied by the first data and the maximum value of the minimum storage unit of the distributed storage system.
  • Step 103 may include the following scenario 1 and scenario 2:
  • Scenario 1: The storage space occupied by the first data is greater than the maximum value of the minimum storage unit of the distributed storage system, and the used storage space of the first server is greater than the minimum used storage space of the other servers.
  • Scenario 2: The storage space occupied by the first data is less than or equal to the maximum value of the minimum storage unit of the distributed storage system, and the used storage space of the first server is greater than the minimum used storage space of the other servers.
  • For scenario 1, in the prior-art solution, the first server needs to split the first data into blocks and store each block in the remaining storage space of multiple servers of the distributed storage system according to the principle of "load balancing".
  • For scenario 2, in the prior-art solution, the first server needs to store the first data in the largest remaining storage space among the other servers.
  • It should be noted that, in actual implementation, the following scenario A and scenario B may also occur:
  • Scenario A: The storage space occupied by the first data is greater than the maximum value of the minimum storage unit of the distributed storage system, and the used storage space of the first server is less than or equal to the minimum used storage space of the other servers.
  • Scenario B: The storage space occupied by the first data is less than or equal to the maximum value of the minimum storage unit of the distributed storage system, and the used storage space of the first server is less than or equal to the minimum used storage space of the other servers.
  • For scenario A, an implementation is provided herein and is described in Embodiment 2 below.
  • For scenario B, the implementation is the same as in the prior-art solution: the first server stores the first data in its remaining storage space.
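The four scenarios differ only in two comparisons, which the following sketch makes explicit. The function name `classify_scenario` and the parameter `max_unit` (standing for the maximum value of the minimum storage unit, for example 64M in the HDFS background) are labels introduced here for illustration.

```python
def classify_scenario(data_size, max_unit, first_used, min_other_used):
    """Return '1', '2', 'A' or 'B' according to the two comparisons in the text."""
    large = data_size > max_unit              # data exceeds the storage unit limit
    above_min = first_used > min_other_used   # first server is not the least used
    if large and above_min:
        return "1"
    if not large and above_min:
        return "2"
    if large:
        return "A"
    return "B"


M = 1024 * 1024
print(classify_scenario(100 * M, 64 * M, first_used=70, min_other_used=30))  # 1
print(classify_scenario(10 * M, 64 * M, first_used=70, min_other_used=30))   # 2
print(classify_scenario(100 * M, 64 * M, first_used=30, min_other_used=30))  # A
print(classify_scenario(64 * M, 64 * M, first_used=20, min_other_used=30))   # B
```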
  • The method for storing data provided by the embodiment of the present invention does not need to limit the size of the storage space occupied by the first data, nor the size relationship between the used storage space of the first server and the used storage space of the other servers. Specific implementations include, but are not limited to, the methods shown in Embodiments 1 and 2 below.
  • Optionally, the first server includes a virtual machine, the first data is data generated by the virtual machine, and the other servers include a second server. The method further includes the following steps A1-A2:
  • Step A1: If the remaining storage space of the first server is smaller than the storage space occupied by the first data, send a first indication message that includes the first data to the second server. The first indication message instructs the second server to store the first data in the remaining storage space of the second server.
  • Step A2: When the storage space occupied by the data corresponding to the virtual machine other than the first data (the other data) is less than or equal to the remaining storage space of the second server, migrate the virtual machine and the other data to the second server.
  • The second server may be any server in the distributed storage system that satisfies the condition "the remaining storage space is greater than or equal to the storage space occupied by the data corresponding to the virtual machine other than the first data".
  • Step A1 may include: A11) if the first server determines that its remaining storage space is smaller than the storage space occupied by the first data, determining the size of the storage space occupied by the other data corresponding to the virtual machine, that is, the data other than the first data; A12) taking a server whose remaining storage space is greater than or equal to that storage space as the second server; A13) sending the first indication message that includes the first data to the second server.
  • Step A2 may be performed immediately after step A1, or when the virtual machine needs to read the first data.
  • The former can be described as localizing the first data when the first data is written to the second server.
  • The latter can be described as localizing the first data when the virtual machine needs to read the first data.
  • After the first server migrates the virtual machine to the second server, the virtual machine becomes a virtual machine installed on the second server and no longer exists on the first server. After the first server migrates the other data to the second server, the other data is stored in the storage space of the second server. This optional embodiment may be referred to as localizing the first data by means of virtual machine migration.
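Steps A1-A2 can be sketched as a single-process simulation in which indication messages are replaced by direct method calls. The `Server` class and its fields are hypothetical. One deliberate deviation is labelled in the code: the sketch picks a second server that can hold both the first data and the other data, which is stricter than the condition stated in A12.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    remaining: int
    vms: set = field(default_factory=set)
    stored: dict = field(default_factory=dict)  # data id -> (owning VM, size)

    def store(self, data_id, owner, size):
        self.stored[data_id] = (owner, size)
        self.remaining -= size

    def release(self, data_id):
        owner, size = self.stored.pop(data_id)
        self.remaining += size
        return owner, size


def write_with_vm_migration(first, others, vm, data_id, size):
    """Store locally if possible; otherwise run steps A1-A2 (A2 immediately)."""
    if first.remaining >= size:
        first.store(data_id, vm, size)
        return first.name
    # A11: size of the VM's other data held by the first server
    other_ids = [d for d, (owner, _) in first.stored.items() if owner == vm]
    other_size = sum(first.stored[d][1] for d in other_ids)
    # A12 (assumption: stricter than the text, room for first data AND other data)
    second = next((s for s in others if s.remaining >= size + other_size), None)
    if second is None:
        raise RuntimeError("no server can hold the first data and the other data")
    second.store(data_id, vm, size)       # A13: "first indication message"
    if other_size <= second.remaining:    # A2: migrate the VM and its other data
        for d in other_ids:
            owner, d_size = first.release(d)
            second.store(d, owner, d_size)
        first.vms.discard(vm)
        second.vms.add(vm)
    return second.name


s1 = Server("S1", remaining=10, vms={"vm1"}, stored={"old": ("vm1", 30)})
s2, s3 = Server("S2", remaining=20), Server("S3", remaining=100)
print(write_with_vm_migration(s1, [s2, s3], "vm1", "D1", 50))  # S3
print(s3.vms, s3.remaining, s1.remaining)                      # {'vm1'} 20 40
```

After the call the virtual machine and all of its data sit on the same server, so later reads of the first data are local.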
  • Optionally, the other servers include a second server, and the method further includes the following steps B1-B4:
  • Step B1: If the remaining storage space of the first server is smaller than the storage space occupied by the first data, send a first indication message that includes the first data to the second server. The first indication message instructs the second server to store the first data in the remaining storage space of the second server.
  • Step B2: Update the remaining storage space of the first server.
  • Step B3: When the updated remaining storage space of the first server is greater than or equal to the storage space occupied by the first data, send a second indication message to the second server. The second indication message instructs the second server to migrate the first data to the first server.
  • Step B4: Receive the first data sent by the second server, and store the first data in the updated remaining storage space of the first server.
  • After sending the first data to the first server, the second server may or may not delete its locally stored first data. In the latter case, the number of copies of the first data stored in the distributed storage system changes from n to n+1; for the method of storing a copy of the first data, see the related embodiments below.
  • This optional embodiment is hereinafter referred to as localizing the first data by means of first data migration.
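Steps B1-B4 can be sketched in the same simulated style, with plain dictionaries standing in for servers and function calls standing in for the first and second indication messages. All names, and the `keep_remote_copy` flag that models the "may or may not delete" choice, are assumptions for illustration.

```python
def store(server, data_id, size):
    server["data"][data_id] = size
    server["remaining"] -= size


def delete(server, data_id):
    server["remaining"] += server["data"].pop(data_id)


def step_b1(first, second, data_id, size):
    """B1: store on the second server when the first server lacks space.
    Returns True if the data went to the second server."""
    if first["remaining"] < size:
        store(second, data_id, size)   # stands in for the first indication message
        return True
    store(first, data_id, size)
    return False


def steps_b3_b4(first, second, data_id, keep_remote_copy=False):
    """B3-B4: migrate the data back once the first server has enough space.
    Returns True if the data is now stored on the first server."""
    size = second["data"][data_id]
    if first["remaining"] < size:
        return False                   # B3 condition not met yet
    store(first, data_id, size)        # B4: receive and store locally
    if not keep_remote_copy:
        delete(second, data_id)        # otherwise the copy count goes from n to n+1
    return True


s1 = {"remaining": 10, "data": {"old": 60}}
s2 = {"remaining": 100, "data": {}}
print(step_b1(s1, s2, "D1", 50))      # True: stored on the second server
print(steps_b3_b4(s1, s2, "D1"))      # False: still not enough local space
delete(s1, "old")                     # B2: remaining space of the first server changes
print(steps_b3_b4(s1, s2, "D1"))      # True: first data is now local
print(s1["remaining"], s2["remaining"])  # 20 100
```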
  • Optionally, the first server includes a virtual machine, the first data is data generated by the virtual machine, and the other servers further include a third server. The method further includes the following step C:
  • Step C: The first server sends a third indication message that includes a copy of the first data to the third server. The third indication message instructs the third server to store the copy of the first data in the remaining storage space of the third server.
  • the "third server” may be used to store a copy of the first data, and the distributed storage system may include one or more third servers.
  • each server in the distributed storage system can know the remaining storage space of all servers in the distributed storage system.
  • the first server may select one or more servers in the distributed storage system as the third server according to the principle of "load balancing".
  • By storing a copy of the first data, this embodiment enhances the performance of the distributed storage system: the system can continue to operate normally by calling a copy of the first data, which enhances the stability of the system.
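Selecting the third server(s) for step C according to "load balancing" might look like the sketch below. The text does not define the selection rule in detail, so choosing the least-used servers that still have room for the copy, and the function name `choose_third_servers`, are assumptions.

```python
def choose_third_servers(used, remaining, exclude, copy_size, copies=1):
    """Pick up to `copies` servers with the least used space that can hold the copy.

    `exclude` holds servers that already store the first data.
    """
    candidates = [
        name for name in used
        if name not in exclude and remaining[name] >= copy_size
    ]
    return sorted(candidates, key=lambda name: used[name])[:copies]


used = {"S1": 70, "S2": 40, "S3": 30, "S4": 55}
remaining = {"S1": 30, "S2": 60, "S3": 5, "S4": 45}
# S1 already holds the first data; S3 is least used but has no room for the copy.
print(choose_third_servers(used, remaining, exclude={"S1"}, copy_size=20, copies=2))
# ['S2', 'S4']
```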
  • The method for storing data provided by the embodiment of the present invention is applied to a distributed storage system including a first server and other servers. When the first server determines that its remaining storage space is greater than or equal to the storage space occupied by data generated by the first server, and that the used storage space of the first server is greater than the minimum used storage space of the other servers, the data is preferentially stored in the remaining storage space of the first server.
  • Embodiment 2
  • The method for storing data provided in this embodiment is applied to a distributed storage system, where the distributed storage system includes a first server and other servers.
  • This embodiment describes a method for storing data in scenario A of Embodiment 1. As shown in FIG. 2, the method includes:
  • The method may further include steps A1-A2 or steps B1-B4 of Embodiment 1, as well as step C of Embodiment 1.
  • The method for storing data provided by the embodiment of the present invention is applied to a distributed storage system including a first server and other servers. When the first server determines that its remaining storage space is greater than or equal to the storage space occupied by data generated by the first server, and that the used storage space of the first server is less than or equal to the minimum used storage space of the other servers, the data is preferentially stored in the remaining storage space of the first server, where the storage space occupied by the data is greater than the maximum value of the minimum storage unit of the distributed storage system.
  • Embodiment 1: The first server does not include a virtual machine.
  • "Local" in this embodiment refers to the server where the application that generates the data is located.
  • The method for storing data according to this embodiment includes:
  • Step 301: An application installed on the first server (physical machine) generates data.
  • The "application" here may be any application in the prior art, such as a word processing application, an audio processing application, a video/picture processing application, a computer control application, a computer-aided design application, or a scientific simulation application. Assume that step 301 is performed four times, so that the application installed on the first server (physical machine) generates four pieces of data in total: D1, D2, D3, and D4.
  • Step 302: The first server acquires the remaining storage space of the first server and the storage space occupied by the data.
  • Step 302 may be implemented as follows: the distributed storage program on the first server acquires the remaining storage space of the first server and the storage space occupied by the data.
  • Assume that step 302 is performed four times, that the remaining storage spaces of the first server obtained are X1, X2, X3, and X4, and that the storage spaces occupied by the data D1, D2, D3, and D4 are M1, M2, M3, and M4, respectively.
  • Step 303: The first server determines whether the remaining storage space is greater than or equal to the storage space occupied by the data. If yes, go to step 304; if no, go to step 305.
  • Step 303 may be implemented as follows: the distributed storage program on the first server determines whether the remaining storage space of the first server is greater than or equal to the storage space occupied by the data.
  • the first server stores the data in the remaining storage space of the first server.
  • the distributed storage program on the first server allocates a storage address to the data, and stores the data in a storage module corresponding to the storage address; the storage address is the storage address corresponding to one or more storage modules in the remaining storage space of the first server.
  • D1, D2, and D3 are respectively stored in the (remaining) storage space of the first server.
  • the storage addresses allocated by the first server for D1, D2, and D3 are: S1/D1/offset1, S1/D2/offset2, and S1/D3/offset3.
  • S1 represents the first server.
  • S1/D1/offset1 indicates that the storage address of the data D1 is located at offset1 of the storage space of the first server S1; the other storage addresses are not explained one by one.
  • the first server sends a first indication message that includes the data to the second server.
  • the first indication message is used to instruct the second server to store the data in the remaining storage space of the second server.
  • step 306 is performed.
  • the second server stores the data in the remaining storage space of the second server according to the first indication message. Specifically, the distributed storage program on the second server allocates a storage address to the data, and stores the data in a storage module corresponding to the storage address; the storage address is the storage address corresponding to one or more storage modules in the remaining storage space of the second server.
  • D4 is stored in the (remaining) storage space of the second server.
  • the storage address assigned by the second server to D4 is: S2/D4/offset4.
  • S2 represents the second server
  • S2/D4/offset4 indicates that the storage address of the data D4 is located at the offset4 of the storage space of the second server S2.
  • D4 S1 X4 S2/D4/offset4
  • As shown in Table 2, the server that obtains data D4 and the server that stores data D4 are not the same server.
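The local-first decision in steps 301 to 306 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation described in the patent: the `Server` class, the `place` function, the capacity values, and the sequential numeric offsets are all hypothetical; only the `server/data/offset` shape of the storage address follows the text above.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical model of one server in the distributed storage system."""
    name: str                                   # e.g. "S1"
    capacity: int                               # total storage space (illustrative units)
    used: int = 0                               # storage space already occupied
    blobs: dict = field(default_factory=dict)   # data id -> (offset, size)

    @property
    def remaining(self) -> int:
        return self.capacity - self.used

    def store(self, data_id: str, size: int) -> str:
        """Allocate the next free offset and return a storage address."""
        if size > self.remaining:
            raise ValueError(f"{self.name} cannot hold {data_id}")
        offset = self.used
        self.blobs[data_id] = (offset, size)
        self.used += size
        return f"{self.name}/{data_id}/{offset}"


def place(data_id: str, size: int, first: Server, second: Server) -> str:
    """Steps 303-306: store locally if the data fits, otherwise on the second server."""
    if first.remaining >= size:          # step 303 answered "yes" -> step 304
        return first.store(data_id, size)
    return second.store(data_id, size)   # steps 305-306 (first indication message)


s1 = Server("S1", capacity=100)
s2 = Server("S2", capacity=100)
# Assumed sizes M1..M4 = 30 each, so D4 no longer fits on S1.
routing_table = {
    data_id: place(data_id, size, s1, s2)
    for data_id, size in [("D1", 30), ("D2", 30), ("D3", 30), ("D4", 30)]
}
```

With these assumed sizes, D1 to D3 land on S1 and D4 falls back to S2, which mirrors the situation recorded in Table 2.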
  • the first server updates the remaining storage space of the first server, and periodically detects the remaining storage space after the first server is updated.
  • the step 307 may be implemented as: the distributed storage program on the first server updates the remaining storage space of the first server, and periodically detects the remaining storage space after the first server is updated.
  • Following the example in step 306, after steps 301 to 306 are performed for the fourth time, the remaining storage space of the first server obtained when step 307 is performed is denoted X4'. Assume that X4' > M4.
  • the first server determines whether the updated remaining storage space is greater than or equal to the storage space occupied by the data.
  • step 308 can be implemented as follows: The distributed storage program on the first server determines whether the remaining storage space after the first server is updated is greater than or equal to the storage space occupied by the data.
  • step 308 is specifically as follows: The first server determines whether X4' is greater than or equal to M4.
  • the first server sends a second indication message to the second server, where the second indication message is used to instruct the second server to send the data to the first server.
  • the second server sends the data to the first server according to the second indication message.
  • the first server stores the data in the remaining storage space after the first server is updated. After step 311 is performed, it ends. Specifically, the step 311 may be implemented as: the distributed storage program on the first server stores the data in the remaining storage space after the first server is updated. Exemplarily, according to the example in step 307, after performing step 311, the data routing table recorded in the distributed storage system is shown in Table 3: Table 3
  • the server that acquires the data D4 and the server that stores the data D4 are the same server.
  • S1/D4/offset4 indicates that the storage address of the data D4 is located at offset4 of the storage space of the first server S1.
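Steps 307 to 311 amount to a periodic check followed by a pull-back of the data. The sketch below illustrates that flow in Python; the dictionary layout, the function name `try_migrate_back`, and the idea that space is freed by deleting D1 are assumptions made for the example and are not stated in the text.

```python
def try_migrate_back(data_id, size, first, second, routing_table):
    """Steps 307-311: pull data back to the first server once it fits locally.

    `first` and `second` are plain dicts with hypothetical keys
    "name", "remaining" (free space) and "data" (data id -> size).
    Returns True if the data was moved.
    """
    # Step 308: compare the updated remaining space X4' with the data size M4.
    if first["remaining"] < size:
        return False
    # Steps 309-310: second indication message; the second server sends the data back.
    second["data"].pop(data_id)
    second["remaining"] += size
    # Step 311: store the data locally and update the data routing table.
    first["data"][data_id] = size
    first["remaining"] -= size
    routing_table[data_id] = first["name"]
    return True


s1 = {"name": "S1", "remaining": 10, "data": {"D1": 30, "D2": 30, "D3": 30}}
s2 = {"name": "S2", "remaining": 70, "data": {"D4": 30}}
routing_table = {"D1": "S1", "D2": "S1", "D3": "S1", "D4": "S2"}

moved_early = try_migrate_back("D4", 30, s1, s2, routing_table)  # X4 < M4, nothing moves

# Assumption: D1 is deleted locally, so X4' >= M4 at the next periodic check.
s1["remaining"] += s1["data"].pop("D1")
del routing_table["D1"]
moved_later = try_migrate_back("D4", 30, s1, s2, routing_table)
```

In a real system the check would run on a timer; here it is called twice by hand to show both outcomes.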
  • the method may further include the following steps A and B: Step A: The first server sends a third indication message including a copy of the data to the third server, where the third indication message is used to instruct the third server to store the copy of the data in the remaining storage space of the third server.
  • the copies of D1, D2, D3, and D4 are denoted as D1', D2', D3', and D4', respectively.
  • Step B The third server stores the copy of the data in the remaining storage space of the third server according to the third indication message.
  • step B may be implemented as: the distributed storage program on the third server stores the copy of the data in the remaining storage space of the third server according to the third indication message.
  • the storage addresses allocated by the third server for D1', D2', D3', and D4' are: S3/D1'/offset1, S3/D2'/offset2, S3/D3'/offset3, and S3/D4'/offset4, where S3 represents the third server, and S3/D1'/offset1 indicates that the storage address of D1', the copy of the data D1, is located at offset1 of the storage space of the third server S3; the other storage addresses are not explained one by one.
  • Step A and step B are used to enhance the stability of the system. Specifically, if the data stored in the storage space of the second server is lost or damaged, the first server may send, to the third server, an indication message instructing it to send the copy of the data to the first server; the third server then sends the copy of the data to the first server according to the indication message, and the first server stores the copy of the data in its updated remaining storage space.
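The replica handling in steps A and B, together with the recovery path just described, can be illustrated with a small Python sketch. The three dictionaries stand in for the servers, and the function names `store_with_replica` and `recover` are hypothetical; capacity checks are omitted to keep the example short.

```python
def store_with_replica(data_id, payload, primary, replica_server):
    """Steps A and B: keep the data on `primary` and a copy (e.g. D4') on the third server."""
    primary[data_id] = payload
    replica_server[data_id + "'"] = bytes(payload)   # the copy held by the third server


def recover(data_id, primary, replica_server, first_server):
    """Return the data, restoring it from the copy if the primary no longer has it."""
    if data_id in primary:
        return primary[data_id]
    copy = replica_server[data_id + "'"]   # the third server sends the copy back
    first_server[data_id] = copy           # the first server stores it locally
    return copy


s1, s2, s3 = {}, {}, {}            # in-memory stand-ins for the three servers
store_with_replica("D4", b"payload", s2, s3)
del s2["D4"]                       # simulate loss or damage on the second server
restored = recover("D4", s2, s3, s1)
```

The copy stays on the third server after recovery, so the data is again held in two places.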
  • In the method for storing data provided by this embodiment of the present invention, when the first server determines that its remaining storage space is greater than or equal to the storage space occupied by a piece of data generated by an application installed on the first server, the data is preferentially stored in the remaining storage space of the first server; when the first server determines that its remaining storage space is less than the storage space occupied by the data, the data is stored in the remaining storage space of the second server, and once the first server determines that its updated remaining storage space is greater than or equal to the storage space occupied by the data, the data is stored in the updated remaining storage space of the first server.
  • Embodiment 2 In this embodiment, a virtual machine is included on the first server. "Local” in this embodiment refers to a server where the virtual machine that generates data is located. As shown in FIG. 4, a method for storing data according to this embodiment includes:
  • the virtual machine installed on the first server generates a data.
  • Exemplarily, four virtual machines (VM1, VM2, VM3, and VM4) are installed on the first server, and step 401 is executed.
  • VM1, VM2, VM3, and VM4 respectively acquire data D1, D2, D3, and D4.
  • the first server acquires a remaining storage space of the first server and a storage space occupied by the data.
  • the first server determines whether the remaining storage space is greater than or equal to the storage space occupied by the data. If yes, go to step 404; if no, go to step 405.
  • the first server stores the data in the remaining storage space of the first server.
  • After step 404 is performed, the procedure ends. It should be noted that, for examples of steps 402 to 404, reference may be made to steps 302 to 304 in Embodiment 1; details are not described herein again.
  • the first server determines the size of the storage space occupied by the data other than the first data corresponding to the virtual machine in step 401, and marks it as W.
  • the first server determines, as a second server, a server in the distributed storage system that has a remaining storage space greater than or equal to W.
  • the first server sends a first indication message that includes the data to the second server.
  • the first indication message is used to instruct the second server to store the data in the remaining storage space of the second server.
  • the second server stores the data in the remaining storage space of the second server according to the first indication message.
  • the server S 1 where the virtual machine VM4 of D4 is located is not the same server as the server S2 where D4 is stored.
  • The first server migrates the virtual machine and the other data generated by the virtual machine to the second server. After step 409 is performed, the procedure ends. Exemplarily, if all the data recorded in the data routing table and corresponding to the virtual machine are D1, D2, D3, and D4, step 409 is specifically: the first server migrates the virtual machine and D1, D2, and D3 to the second server. Exemplarily, according to the example in step 408, after step 409 is performed, the data routing table recorded in the distributed storage system is shown in Table 3'.
  • the server S2 where the virtual machine VM4 that acquires D4 is located is the same server as the server S2 that stores D4.
  • the method may further include step A and step B in the above embodiment 1.
  • Table 4 and Table 5 above can be expressed as Table 4' and Table 5', respectively: Table 4'
  • In the method for storing data provided by this embodiment, when the first server determines that its remaining storage space is greater than or equal to the storage space occupied by a piece of data generated by a virtual machine on the first server, the data is preferentially stored in the remaining storage space of the first server; when the first server determines that its remaining storage space is smaller than the storage space occupied by the data, the data is stored in the storage space of the second server, and the virtual machine is migrated to the second server.
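The virtual machine branch of Embodiment 2 (steps 403 to 409) can be sketched in Python as below. The data layout, the function name `migrate_vm_if_needed`, and the sample sizes are assumptions. One further assumption is labeled in the code: the text selects a second server whose remaining space is at least W, while the sketch also requires room for the new data itself, which the text does not spell out.

```python
def migrate_vm_if_needed(vm, new_data, new_size, first, others, routing_table):
    """Steps 403-409 for one piece of data generated by virtual machine `vm`.

    Servers are dicts with hypothetical keys "name", "remaining" and "vms";
    `routing_table` maps data id -> (owner vm, server name, size).
    Returns the name of the server that ends up holding `new_data`.
    """
    if first["remaining"] >= new_size:                       # step 403 -> step 404
        first["remaining"] -= new_size
        routing_table[new_data] = (vm, first["name"], new_size)
        return first["name"]

    # Step 405: W is the space occupied by the VM's other data.
    other_ids = [d for d, (owner, _, _) in routing_table.items() if owner == vm]
    w = sum(routing_table[d][2] for d in other_ids)

    # Step 406: choose the second server. Assumption: it must hold W plus the new data.
    # next() raises StopIteration if no server qualifies.
    second = next(s for s in others if s["remaining"] >= w + new_size)

    # Steps 407-408: first indication message; the second server stores the new data.
    second["remaining"] -= new_size
    routing_table[new_data] = (vm, second["name"], new_size)

    # Step 409: migrate the VM and its other data to the second server.
    for d in other_ids:
        size = routing_table[d][2]
        first["remaining"] += size
        second["remaining"] -= size
        routing_table[d] = (vm, second["name"], size)
    first["vms"].remove(vm)
    second["vms"].append(vm)
    return second["name"]


s1 = {"name": "S1", "remaining": 10, "vms": ["VM1"]}
s2 = {"name": "S2", "remaining": 200, "vms": []}
table = {"D1": ("VM1", "S1", 30), "D2": ("VM1", "S1", 30), "D3": ("VM1", "S1", 30)}
holder = migrate_vm_if_needed("VM1", "D4", 30, s1, [s2], table)
```

After the call, the virtual machine and all four pieces of data are on S2, so later reads by the virtual machine remain local.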
  • Embodiment 3 As shown in FIG. 5, a server 1 provided in this embodiment is applied to a distributed storage system, where the distributed storage system further includes other servers, and the server 1 is configured to execute the method shown in FIG. A method of storing data, the server 1 comprising:
  • the determining module 51 is configured to determine the first data generated by the server 1; the determining module 52 is configured to determine a size relationship between a remaining storage space of the server 1 and a storage space occupied by the first data;
  • the storage module 53 is configured to store the first data in the remaining storage space of the server 1 when the determining module 52 determines that the remaining storage space of the server 1 is greater than or equal to the storage space occupied by the first data and the used storage space of the server 1 is larger than the minimum used storage space of the other servers.
  • the server 1 includes a virtual machine, where the first data is data generated by the virtual machine; and the other server includes a second server; as shown in FIG.
  • the server 1 further includes:
  • the sending module 54 is configured to: when the determining module 52 determines that the remaining storage space of the server 1 is smaller than the storage space occupied by the first data, send a first indication message including the first data to the second server;
  • the first indication message is used to instruct the second server to store the first data in a remaining storage space of the second server;
  • the migration module 55 is configured to: when the storage space occupied by the data, other than the first data, corresponding to the virtual machine is less than or equal to the remaining storage space of the second server, migrate the virtual machine and the other data to the second server.
  • the other server includes a second server.
  • the server 1 further includes:
  • the sending module 54 is configured to: when the determining module 52 determines that the remaining storage space of the server 1 is smaller than the storage space occupied by the first data, send a first indication message including the first data to the second server;
  • the first indication message is used to instruct the second server to store the first data in a remaining storage space of the second server;
  • An update module 56 configured to update remaining storage space of the server 1;
  • the sending module 54 is further configured to: when the remaining storage space of the server 1 is greater than or equal to the storage space occupied by the first data, send a second indication message to the second server;
  • the second indication message is used to instruct the second server to migrate the first data to the server 1;
  • the receiving module 57 is configured to receive the first data sent by the second server.
  • the storage module 53 is further configured to store the first data in the remaining storage space of the server 1 after the update.
  • the other server further includes a third server
  • the sending module 54 is further configured to: send, to the third server, a third indication message including a copy of the first data, where the third indication message is used to instruct the third server to store the copy of the first data in a remaining storage space of the third server.
  • the server 1 in this embodiment may be the "first server" described in the first embodiment, and the "second server" in this embodiment may be the "second server" described in the first embodiment.
  • The server 1 provided by this embodiment of the present invention is applied to a distributed storage system that further includes other servers. When the server 1 determines that its remaining storage space is greater than or equal to the storage space occupied by a piece of data generated by the server 1, and the used storage space of the server 1 is larger than the minimum used storage space of the other servers, the data is preferentially stored in the remaining storage space of the server 1.
  • In this way, the server 1 can read the data directly from local storage without having to read it from other servers in the network, which shortens the delay and improves system performance.
  • This solves the prior-art problem of long delay and poor system performance caused by reading data from other servers in the network.
  • Embodiment 4 This embodiment corresponds to Embodiment 3. In a hardware implementation, the sending module may be a transmitter and the receiving module may be a receiver, and the transmitter and the receiver may be integrated to form a transceiver; the storage module may be a memory. The determining module, the judging module, the migration module, and the like may be embedded in or independent of the processor of the server 1 in hardware form, or may be stored in the memory of the server 1 in software form, so that the processor can invoke them to perform the operations corresponding to the modules. The processor may be a central processing unit (CPU), a microprocessor, a microcontroller, or the like. As shown in FIG.
  • a server 1 is provided for a distributed storage system, where the distributed storage system further includes other servers, and the server 1 is configured to execute the method for storing data shown in FIG. 1.
  • the server 1 includes: a bus system 71, a memory 72, and a processor 73.
  • the memory 72 and the processor 73 are coupled together by a bus system 71.
  • the bus system 71 may include a power bus, a control bus, a status signal bus, and the like in addition to the data bus. However, for the sake of clarity, various buses are labeled as the bus system 71 in the figure.
  • the server 1 includes a virtual machine, the first data is data generated by the virtual machine, and the other server includes a second server. As shown in FIG.
  • the server 1 further includes: a transmitter 74, configured to: when the processor 73 determines that the remaining storage space of the server 1 is smaller than the storage space occupied by the first data, send a first indication message that includes the first data to the second server, where the first indication message is used to instruct the second server to store the first data in the remaining storage space of the second server; the processor 73 is further configured to: migrate the virtual machine and the other data to the second server when the storage space occupied by the data, other than the first data, corresponding to the virtual machine is less than or equal to the remaining storage space of the second server.
  • the other server includes a second server; the processor 73 is further configured to: update the remaining storage space of the server 1; the transmitter 74 is further configured to: send a second indication message to the second server when the updated remaining storage space of the server 1 is greater than or equal to the storage space occupied by the first data, where the second indication message is used to instruct the second server to migrate the first data to the server 1;
  • the server 1 further includes: a receiver 75, configured to receive the first data sent by the second server; the memory 72 is further configured to: store, under the control of the processor 73, the first data in the updated remaining storage space of the server 1.
  • the other server further includes a third server, where the transmitter 74 is further configured to: send, to the third server, a third indication message that includes a copy of the first data, where the third indication message is used to instruct the third server to store the copy of the first data in a remaining storage space of the third server.
  • the server 1 in this embodiment may be the "first server" described in the first embodiment, and the "second server" in this embodiment may be the "second server" described in the first embodiment.
  • The server 1 provided by this embodiment of the present invention is applied to a distributed storage system that further includes other servers. When the server 1 determines that its remaining storage space is greater than or equal to the storage space occupied by a piece of data generated by the server 1, and the used storage space of the server 1 is larger than the minimum used storage space of the other servers, the data is preferentially stored in the remaining storage space of the server 1.
  • In this way, the server 1 can read the data directly from local storage without having to read it from other servers in the network, which shortens the delay and improves system performance. This solves the prior-art problem of long delay and poor system performance caused by reading data from other servers in the network.
  • Embodiment 5
  • a server 2 provided in this embodiment is applied to a distributed storage system, where the distributed storage system further includes other servers, and the server 2 is configured to execute the stored data shown in FIG.
  • the server 2 includes: a determining module 91, configured to determine first data generated by the server 2; a storage space occupied by the first data is greater than a maximum value of a minimum storage unit of the distributed storage system;
  • the determining module 92 is configured to determine a size relationship between the remaining storage space of the server 2 and the storage space occupied by the first data.
  • the storage module 93 is configured to store the first data in the remaining storage space of the server 2 when the determining module 92 determines that the remaining storage space of the server 2 is greater than or equal to the storage space occupied by the first data and the used storage space of the server 2 is less than or equal to the minimum used storage space of the other servers.
  • the server 2 in this embodiment may specifically be the "first server" described in the first embodiment.
  • The server 2 provided by this embodiment of the present invention is applied to a distributed storage system that further includes other servers. When the server 2 determines that its remaining storage space is greater than or equal to the storage space occupied by a piece of data generated by the server 2, and the used storage space of the server 2 is less than or equal to the minimum used storage space of the other servers, the data is preferentially stored in the remaining storage space of the server 2; the storage space occupied by the data is greater than the maximum value of the minimum storage unit of the distributed storage system.
  • the server 2 can directly read the data from the local without reading the data from other servers in the network, thereby achieving the beneficial effects of shortening the delay and improving the system performance.
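The storage condition of Embodiment 5 combines three checks, which the following Python predicate makes explicit. It is a sketch of the condition as read from the text: the function name and parameter names are hypothetical, and treating the size requirement as a precondition evaluated together with the other two checks is an assumption.

```python
def should_store_locally(data_size, local_remaining, local_used, others_used, max_min_unit):
    """Embodiment 5 condition as read from the text (all names are hypothetical).

    data_size       -- storage space occupied by the first data
    local_remaining -- remaining storage space of server 2
    local_used      -- used storage space of server 2
    others_used     -- used storage space of each other server
    max_min_unit    -- maximum value of the system's minimum storage unit
    """
    is_large = data_size > max_min_unit             # data exceeds the minimum storage unit
    fits = local_remaining >= data_size             # enough remaining local space
    least_loaded = local_used <= min(others_used)   # no other server has used less
    return is_large and fits and least_loaded


ok = should_store_locally(64, 100, 20, [20, 50], 4)
busier_than_peer = should_store_locally(64, 100, 30, [20, 50], 4)
too_small = should_store_locally(2, 100, 20, [20, 50], 4)
```

Only the first call satisfies all three checks; the second fails because another server has used less space, and the third because the data does not exceed the minimum storage unit.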
  • the storage module may be a memory, and the determining module 91 and the determining module 92 may be embedded in the hardware of the server 2 in hardware form, or may be stored in software form.
  • the processor may be a central processing unit (CPU), a microprocessor, a single chip microcomputer or the like.
  • a server 2 is provided for a distributed storage system, where the distributed storage system further includes other servers, and the server 2 is configured to execute the storage data shown in FIG.
  • the server 2 includes: a bus system 10A, a memory 10B, and a processor 10C.
  • the memory 10B and the processor 10C are coupled together by the bus system 10A.
  • the bus system 10A may include a power bus, a control bus, a status signal bus, and the like in addition to the data bus.
  • for clarity, the various buses are all labeled as the bus system 10A in the figure.
  • The memory 10B is configured to store a set of code; the code stored in the memory 10B is used to control the processor 10C to determine the first data generated by the server 2 and to determine the size relationship between the remaining storage space of the server 2 and the storage space occupied by the first data, where the storage space occupied by the first data is greater than the maximum value of the minimum storage unit of the distributed storage system. The memory 10B is further configured to store, under the control of the processor 10C, the first data in the remaining storage space of the server 2 when the processor 10C determines that the remaining storage space of the server 2 is greater than or equal to the storage space occupied by the first data and the used storage space of the server 2 is less than or equal to the minimum used storage space of the other servers.
  • the server 2 in this embodiment may be the "first server" described in the first embodiment. The server 2 provided in this embodiment of the present invention is applied to a distributed storage system that further includes other servers.
  • When the server 2 determines that its remaining storage space is greater than or equal to the storage space occupied by a piece of data generated by the server 2, and the used storage space of the server 2 is less than or equal to the minimum used storage space of the other servers, the data is preferentially stored in the remaining storage space of the server 2; the storage space occupied by the data is greater than the maximum value of the minimum storage unit of the distributed storage system.
  • In this way, the server 2 can read the data directly from local storage without reading it from other servers in the network, which shortens the delay and improves system performance.
  • the disclosed system, apparatus, and method may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of the modules is only a logical function division.
  • In actual implementation, there may be another division manner; for example, multiple modules or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or module, and may be in an electrical, mechanical or other form.
  • the modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical modules.
  • each functional module in each embodiment of the present invention may be integrated into one processing module, or each module may be physically included separately, or two or more modules may be integrated into one module.
  • the above integrated modules can be implemented in the form of hardware or in the form of hardware plus software function modules.
  • the above-described integrated modules implemented in the form of software function modules can be stored in a computer readable storage medium.
  • the software function modules described above are stored in a storage medium and include instructions for causing a computer device (which may be a personal computer, server, or network device, etc.) to perform portions of the steps of the various embodiments of the present invention.
  • The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a ROM (Read-Only Memory), a RAM (Random Access Memory), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention belongs to the technical field of storage. According to an embodiment of the present invention, a method and a device for storing data reduce the delay in reading data, thereby improving system performance. The method is applied to a distributed storage system comprising a first server and other servers. The method comprises: determining first data generated by the first server; determining the size relationship between the remaining storage space of the first server and the storage space occupied by the first data; and, if the remaining storage space of the first server is greater than or equal to the storage space occupied by the first data and the used storage space of the first server is greater than the minimum used storage space of the other servers, storing the first data in the remaining storage space of the first server.
PCT/CN2014/071224 2014-01-23 2014-01-23 Procédé et dispositif de stockage de données WO2015109483A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2014/071224 WO2015109483A1 (fr) 2014-01-23 2014-01-23 Procédé et dispositif de stockage de données
CN201480000338.5A CN104205780B (zh) 2014-01-23 2014-01-23 一种存储数据的方法和装置

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2014/071224 WO2015109483A1 (fr) 2014-01-23 2014-01-23 Procédé et dispositif de stockage de données

Publications (1)

Publication Number Publication Date
WO2015109483A1 true WO2015109483A1 (fr) 2015-07-30

Family

ID=52088184

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/071224 WO2015109483A1 (fr) 2014-01-23 2014-01-23 Procédé et dispositif de stockage de données

Country Status (2)

Country Link
CN (1) CN104205780B (fr)
WO (1) WO2015109483A1 (fr)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106202350A (zh) * 2016-07-05 2016-12-07 浪潮(北京)电子信息产业有限公司 一种分布式文件系统自动精简配置的方法及系统
CN106909321A (zh) * 2017-02-24 2017-06-30 郑州云海信息技术有限公司 一种基于存储系统的控制方法及装置
CN107092443B (zh) * 2017-04-28 2020-04-07 杭州宏杉科技股份有限公司 数据迁移方法及装置
CN107172222A (zh) * 2017-07-27 2017-09-15 郑州云海信息技术有限公司 一种基于分布式存储系统的数据存储方法及装置
CN109783576B (zh) * 2019-01-02 2022-05-31 佛山市顺德区美的洗涤电器制造有限公司 家电设备及其数据存储方法、装置
CN110134332B (zh) * 2019-04-28 2022-03-08 平安科技(深圳)有限公司 一种数据存储方法及相关装置
CN110109622A (zh) * 2019-04-28 2019-08-09 平安科技(深圳)有限公司 一种基于中间件的数据处理方法和相关装置

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060031305A1 (en) * 2004-04-29 2006-02-09 International Business Machines Corporation Managing on-demand email storage
CN102508736A (zh) * 2011-10-11 2012-06-20 宇龙计算机通信科技(深圳)有限公司 通信终端中应用程序的备份方法及通信终端
CN103064639A (zh) * 2012-12-28 2013-04-24 华为技术有限公司 数据存储方法及装置

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6958881B1 (en) * 2003-11-26 2005-10-25 Western Digital Technologies, Inc. Disk drive control system having a servo processing accelerator circuit
CN101771715B (zh) * 2008-12-26 2014-04-16 华为技术有限公司 分布式网络构建存储的方法、装置和系统
CN103092927B (zh) * 2012-12-29 2016-01-20 华中科技大学 一种分布式环境下的文件快速读写方法

Also Published As

Publication number Publication date
CN104205780A (zh) 2014-12-10
CN104205780B (zh) 2017-06-27

Similar Documents

Publication Publication Date Title
WO2015109483A1 (fr) Procédé et dispositif de stockage de données
US20200259899A1 (en) System and method for sharing san storage
US10409508B2 (en) Updating of pinned storage in flash based on changes to flash-to-disk capacity ratio
US8458413B2 (en) Supporting virtual input/output (I/O) server (VIOS) active memory sharing in a cluster environment
JP5088366B2 (ja) 仮想計算機制御プログラム、仮想計算機制御システムおよび仮想計算機移動方法
US9098466B2 (en) Switching between mirrored volumes
KR20200017363A (ko) 호스트 스토리지 서비스들을 제공하기 위한 NVMe 프로토콜에 근거하는 하나 이상의 호스트들과 솔리드 스테이트 드라이브(SSD)들 간의 관리되는 스위칭
US10552089B2 (en) Data processing for managing local and distributed storage systems by scheduling information corresponding to data write requests
US20080263544A1 (en) Computer system and communication control method
US20150113218A1 (en) Distributed Data Processing Method and Apparatus
WO2015176636A1 (fr) Système de gestion de service de base de données distribué
JP5352490B2 (ja) サーバイメージ容量の最適化
US20160077996A1 (en) Fibre Channel Storage Array Having Standby Controller With ALUA Standby Mode for Forwarding SCSI Commands
KR20170076666A (ko) 서브넷 관리(sa) 쿼리 캐싱을 통한 동적 클라우드 제공을 위한 시스템 및 방법
US10331470B2 (en) Virtual machine creation according to a redundancy policy
JP2016539399A (ja) データ書込み要求処理方法及びストレージアレイ
WO2019148841A1 (fr) Système de stockage distribué, procédé de traitement de données et nœud de stockage
KR20180086791A (ko) 빅 데이터 처리 지원을 위한 클라우드 시스템 및 그 운영 방법
WO2015010646A1 (fr) Procédé, module, processeur et dispositif terminal d'accès aux données de mémoire hybride
CN111225003B (zh) 一种nfs节点配置方法和装置
JPWO2017145272A1 (ja) データ移行方法及び計算機システム
WO2019080150A1 (fr) Système de stockage doublement actif et procédé d'attribution d'adresses
CN105739930A (zh) 一种存储架构及其初始化方法和数据存储方法及管理装置
CN109587185B (zh) 云存储系统和云存储系统中的对象处理方法
US9052839B2 (en) Virtual storage apparatus providing a plurality of real storage apparatuses

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14879563

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14879563

Country of ref document: EP

Kind code of ref document: A1