WO2018090606A1 - Data storage method and apparatus - Google Patents

Data storage method and apparatus

Info

Publication number
WO2018090606A1
Authority
WO
WIPO (PCT)
Prior art keywords
virtual machines
mds
affinity
identifiers
virtual
Prior art date
Application number
PCT/CN2017/087212
Other languages
English (en)
French (fr)
Inventor
钟颙
林铭
彭瑞林
赵军
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP17871054.7A priority Critical patent/EP3432132B1/en
Publication of WO2018090606A1 publication Critical patent/WO2018090606A1/zh
Priority to US16/188,951 priority patent/US11036535B2/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16Error detection or correction of the data by redundancy in hardware
    • G06F11/20Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/2053Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where persistent mass storage functionality or persistent mass storage control functionality is redundant
    • G06F11/2094Redundant storage or storage space
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604Improving or facilitating administration, e.g. storage management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614Improving the reliability of storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/065Replication mechanisms
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0662Virtualisation aspects
    • G06F3/0664Virtualisation aspects at device level, e.g. emulation of a storage device or system
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/45562Creating, deleting, cloning virtual machine instances
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/45583Memory management, e.g. access or allocation

Definitions

  • the present invention relates to the field of data storage technologies, and in particular, to a data storage method and apparatus.
  • a distributed storage system generally includes a client, a Meta Data Server (MDS), and a plurality of data nodes (Storage Nodes, SNs).
  • the distributed storage system can be deployed in a virtualized environment.
  • the virtualized environment is built on multiple physical servers by using virtualization technology, and several virtual machines are deployed in the multiple physical servers through a virtual machine manager. The virtual machine manager differs between virtualization technologies; for example, it may be an HM (Hypervisor Manager), a KVM (Kernel-based Virtual Machine), or a VMM (Virtual Machine Manager). The client, MDS, and SNs of the distributed storage system can be deployed in any of the several virtual machines.
  • the MDS can query and store the physical address of each physical server and the identifier of the virtual machine deployed in each physical server through the virtual machine manager.
  • when the client sends a data storage request to the MDS, the MDS returns the identifier of a virtual machine to the client according to the pre-stored physical address of each physical server and the identifiers of the virtual machines deployed in each physical server.
  • the target data is then stored in the SN deployed in the corresponding virtual machine.
  • the distributed storage system deployed in the virtualized environment is often a third-party system. When it stores data by the above method, it relies on the physical address of each physical server and the identifiers of the virtual machines deployed in each physical server.
  • this may divulge the location of each physical server and the distribution of virtual machines in each physical server, which poses a security risk to the physical servers and reduces their stability and security. It also reduces the stability and security of the virtualized environment and of the distributed storage system deployed in it.
  • an embodiment of the present invention provides a data storage method and apparatus.
  • the technical solution is as follows:
  • a data storage method in which at least M virtual machines are deployed in a plurality of physical servers, and the M virtual machines are respectively deployed as M SNs of a distributed storage system, and M is greater than or equal to 2.
  • the method includes:
  • the MDS of the distributed storage system receives a data storage request from the client, where the data storage request is for storing N pieces of data and N is a positive integer greater than or equal to 1;
  • the MDS determines the identifiers of N virtual machines according to the grouping information, where the N virtual machines corresponding to those identifiers belong to at least one anti-affinity group, the grouping information records a mapping relationship between multiple anti-affinity groups and the identifiers of the M virtual machines, the virtual machines within each anti-affinity group have anti-affinity with one another, and the M virtual machines include the N virtual machines;
  • the N virtual machines are used to store the data specified by the data storage request.
  • the MDS in the distributed storage system can receive the data storage request of the client and, according to the stored grouping information, determine from at least one anti-affinity group the identifiers of N virtual machines that meet the data storage requirement, so that the client can store data in the virtual machines deployed as SNs that correspond to those identifiers. Because the grouping information records the mapping relationship between the multiple anti-affinity groups and the identifiers of the M virtual machines, the MDS does not need to acquire the physical addresses of the physical servers or the distribution of virtual machines in each physical server.
  • the determining, by the MDS, the identifiers of the N virtual machines according to the group information includes:
  • the MDS determines the identifiers of the N virtual machines from one of the plurality of anti-affinity groups;
  • the MDS determines the identifiers of the N virtual machines from at least two of the anti-affinity groups.
  • suppose an anti-affinity group includes X virtual machines. If X is less than N, the MDS may determine X virtual machines from any one of the at least two anti-affinity groups and determine the remaining N − X virtual machines from the other anti-affinity groups of the at least two anti-affinity groups, thereby obtaining N virtual machines.
  • when determining the identifiers of the N virtual machines, the MDS can compare the number of virtual machines included in an anti-affinity group with N and, according to the comparison result, adopt different strategies to determine the identifiers. This improves the flexibility of the distributed storage system while ensuring its reliability and stability.
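The comparison of X with N described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `select_vm_ids`, the group names `AAG1` and `AAG2`, and the choice to take the first members of a group are hypothetical and are not specified by the source.

```python
from typing import Dict, List


def select_vm_ids(groups: Dict[str, List[str]], n: int) -> List[str]:
    """Pick n VM identifiers, preferring a single anti-affinity group.

    `groups` maps a (hypothetical) anti-affinity group name to the
    identifiers of its member virtual machines.
    """
    if n < 1:
        raise ValueError("n must be >= 1")
    # Case X >= N: one group is large enough, so take all N from it.
    for members in groups.values():
        if len(members) >= n:
            return members[:n]
    # Case X < N for every group: take the X members of one group,
    # then the remaining N - X from the other groups.
    chosen: List[str] = []
    for members in groups.values():
        for vm in members:
            if vm not in chosen:
                chosen.append(vm)
            if len(chosen) == n:
                return chosen
    raise ValueError("fewer than n virtual machines are available")


example_groups = {
    "AAG1": ["VM1", "VM5", "VM9"],
    "AAG2": ["VM2", "VM6", "VM10"],
}
```

With these example groups, a request for 2 pieces of data is served from `AAG1` alone, while a request for 4 spills over into `AAG2`.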
  • the determining, by the MDS, of the identifiers of the N virtual machines from at least two of the anti-affinity groups includes:
  • the MDS determines the identifiers of the N virtual machines according to the affinity between at least two of the anti-affinity groups.
  • the MDS may determine X virtual machines from any one of the at least two anti-affinity groups and, from the other anti-affinity groups of the at least two anti-affinity groups, determine according to affinity N − X virtual machines that are in the same affinity group as at least one of the X virtual machines, thereby obtaining N virtual machines.
  • the MDS can thus also take affinity into account when determining the identifiers of the N virtual machines, thereby improving the reliability of the distributed storage.
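A minimal Python sketch of the affinity-based variant is given below. It assumes, purely for illustration, that each virtual machine belongs to exactly one affinity group and that the mapping `affinity_of` (VM identifier to affinity group name) is available to the MDS; the names `select_with_affinity`, `AAG1`, `AG1`, and so on are hypothetical.

```python
from typing import Dict, List


def select_with_affinity(
    anti_groups: Dict[str, List[str]],
    affinity_of: Dict[str, str],
    n: int,
) -> List[str]:
    """Take X VMs from one anti-affinity group, then N - X VMs from the
    other groups that share an affinity group with one of the X VMs."""
    groups = list(anti_groups.values())
    if not groups or n < 1:
        raise ValueError("need at least one group and n >= 1")
    chosen = list(groups[0][:n])                  # the X virtual machines
    wanted = {affinity_of[vm] for vm in chosen}   # their affinity groups
    for members in groups[1:]:
        for vm in members:
            if len(chosen) == n:
                return chosen
            if vm not in chosen and affinity_of[vm] in wanted:
                chosen.append(vm)
    if len(chosen) < n:
        raise ValueError("not enough virtual machines with matching affinity")
    return chosen


anti = {"AAG1": ["VM1", "VM2"], "AAG2": ["VM3", "VM4"]}
aff = {"VM1": "AG1", "VM2": "AG2", "VM3": "AG3", "VM4": "AG1"}
```

In this example a request for three virtual machines takes VM1 and VM2 from `AAG1`, skips VM3 because it shares no affinity group with them, and adds VM4.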
  • before the MDS determines the identifiers of the N virtual machines according to the grouping information, the method further includes:
  • the MDS divides the M virtual machines into the plurality of anti-affinity groups, and records, in the group information, a mapping relationship between the plurality of anti-affinity groups and identifiers of the M virtual machines;
  • the MDS sends the grouping information to the virtual machine manager, so that the virtual machine manager deploys M virtual machines according to the grouping information.
  • the MDS can group the identifiers of the virtual machines sent by the virtual machine manager and send the grouping information to the virtual machine manager, so that the virtual machine manager can deploy the virtual machines according to the grouping information. This ensures that the actual distribution of the virtual machines in the multiple physical servers is consistent with the grouping information, thereby further improving the stability and reliability of the distributed storage system.
  • the M virtual machines belong to multiple affinity groups, the virtual machines within each affinity group have affinity with one another, and the grouping information records a mapping relationship between the multiple affinity groups and the identifiers of the M virtual machines.
  • the M virtual machines may also belong to multiple affinity groups; that is, several of the M virtual machines may be divided into anti-affinity groups according to anti-affinity and, at the same time, into affinity groups according to affinity. This meets different storage requirements and improves the flexibility of the distributed storage system.
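One way to picture the grouping information is as two mappings from group names to virtual machine identifiers, with no physical server addresses anywhere in the structure. The Python sketch below is an assumption about a possible representation, not the format used by the source; the class name `GroupingInfo` and the group names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class GroupingInfo:
    """Maps group names to VM identifiers; holds no physical addresses."""

    anti_affinity: Dict[str, List[str]] = field(default_factory=dict)
    affinity: Dict[str, List[str]] = field(default_factory=dict)

    def anti_affinity_groups_of(self, vm_id: str) -> List[str]:
        # Names of the anti-affinity groups that contain this VM.
        return [g for g, vms in self.anti_affinity.items() if vm_id in vms]

    def affinity_groups_of(self, vm_id: str) -> List[str]:
        # Names of the affinity groups that contain this VM.
        return [g for g, vms in self.affinity.items() if vm_id in vms]


info = GroupingInfo(
    anti_affinity={"AAG1": ["VM1", "VM2"], "AAG2": ["VM3", "VM4"]},
    affinity={"AG1": ["VM1", "VM3"], "AG2": ["VM2", "VM4"]},
)
```

Here VM3 is at the same time a member of anti-affinity group `AAG2` and of affinity group `AG1`, which mirrors the statement that a virtual machine can be classified both ways.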
  • a data storage method in which at least M virtual machines are deployed in a plurality of physical servers, and the M virtual machines are respectively deployed as M SNs of a distributed storage system, where M is a positive integer greater than or equal to 2, the method comprising:
  • the MDS of the distributed storage system receives the identifiers of the M virtual machines sent by the virtual machine manager;
  • the MDS divides the M virtual machines into a plurality of anti-affinity groups, where the virtual machines within each anti-affinity group have anti-affinity with one another;
  • the MDS records, in the grouping information, a mapping relationship between the plurality of anti-affinity groups and the identifiers of the M virtual machines, where the grouping information is used by the MDS, when the client requests to store data, to determine the identifiers of N virtual machines from at least one of the anti-affinity groups, the N virtual machines are used to store N pieces of data specified by the client, and N is a positive integer greater than or equal to 1;
  • the MDS sends the grouping information to the virtual machine manager, so that the virtual machine manager deploys M virtual machines according to the grouping information.
  • the MDS can group the identifiers of the virtual machines sent by the virtual machine manager and send the grouping information to the virtual machine manager, so that the virtual machine manager can deploy the virtual machines based on the grouping information. This ensures that the actual distribution of the virtual machines in the multiple physical servers is consistent with the grouping information, thereby improving the stability and reliability of the distributed storage system.
  • the location of each physical server and the distribution of virtual machines in each physical server are not revealed, which improves the stability and security of the physical servers and, in turn, the stability and security of the virtualized environment and of the distributed storage system deployed in it.
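The text above does not say what policy the MDS uses to divide the M virtual machines into anti-affinity groups. The Python sketch below assumes the simplest possible policy, consecutive fixed-size chunks, only to make the step concrete; the function name, the `group_size` parameter, and the `AAG` naming scheme are all hypothetical.

```python
from typing import Dict, List


def divide_into_anti_affinity_groups(
    vm_ids: List[str], group_size: int
) -> Dict[str, List[str]]:
    """Split VM identifiers into consecutive groups of `group_size`.

    The chunking policy is an assumption made for illustration; the MDS
    works only with identifiers and needs no physical server addresses.
    """
    if group_size < 1:
        raise ValueError("group_size must be >= 1")
    groups: Dict[str, List[str]] = {}
    for start in range(0, len(vm_ids), group_size):
        name = f"AAG{start // group_size + 1}"
        groups[name] = list(vm_ids[start:start + group_size])
    return groups


all_vms = [f"VM{i}" for i in range(1, 13)]   # VM1 .. VM12
grouping = divide_into_anti_affinity_groups(all_vms, 3)
```

With twelve identifiers and a group size of 3, this yields four groups, `AAG1` through `AAG4`, which the MDS would then record in the grouping information and send to the virtual machine manager.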
  • a data storage method, wherein at least M virtual machines are deployed in a plurality of physical servers, and the M virtual machines are respectively deployed as M SNs of a distributed storage system, where M is a positive integer greater than or equal to 2, the method comprising:
  • the virtual machine manager sends the identifiers of the M virtual machines to the MDS of the distributed storage system;
  • the virtual machine manager receives the grouping information sent by the MDS, where the grouping information records a mapping relationship between a plurality of anti-affinity groups and the identifiers of the M virtual machines, the virtual machines within each anti-affinity group have anti-affinity with one another, the grouping information is used by the MDS, when the client requests to store data, to determine the identifiers of N virtual machines from at least one of the anti-affinity groups, the N virtual machines are used to store N pieces of data specified by the client, and N is a positive integer greater than or equal to 1;
  • the virtual machine manager deploys M virtual machines according to the grouping information.
  • the virtual machine manager may send the identifiers of the M virtual machines deployed as SNs to the MDS and deploy the M virtual machines according to the received grouping information. This ensures that the actual distribution of the virtual machines in the multiple physical servers is consistent with the grouping information, thereby improving the stability and reliability of the distributed storage system.
  • the M virtual machines belong to a plurality of affinity groups, the virtual machines within each affinity group have affinity with one another, and the grouping information records a mapping relationship between the plurality of affinity groups and the identifiers of the M virtual machines.
  • the virtual machine manager can deploy the M virtual machines according to the grouping information, and the deployed M virtual machines can also belong to multiple affinity groups; that is, several of the M virtual machines can be divided into anti-affinity groups according to anti-affinity and into affinity groups according to affinity. This meets different storage requirements and improves the flexibility of the distributed storage system.
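This excerpt does not define how the virtual machine manager turns grouping information into a deployment. The Python sketch below assumes the usual virtualization meaning of anti-affinity, namely that members of one anti-affinity group are placed on different physical servers, and it assumes each virtual machine belongs to exactly one anti-affinity group. Affinity groups are ignored here, and the server names `S1` to `S3` and the function name `place_vms` are hypothetical.

```python
from typing import Dict, List


def place_vms(
    anti_affinity: Dict[str, List[str]], servers: List[str]
) -> Dict[str, str]:
    """Return a VM-to-server placement in which no two members of the
    same anti-affinity group share a physical server (assumed meaning)."""
    placement: Dict[str, str] = {}
    for group, members in anti_affinity.items():
        if len(members) > len(servers):
            raise ValueError(f"{group} has more members than servers")
        # Pair the i-th member with the i-th server, so members of one
        # group always land on distinct servers.
        for vm, server in zip(members, servers):
            placement[vm] = server
    return placement


plan = place_vms(
    {"AAG1": ["VM1", "VM2", "VM3"], "AAG2": ["VM4", "VM5", "VM6"]},
    ["S1", "S2", "S3"],
)
```

Only the virtual machine manager sees the server names in this sketch; the MDS continues to work with group names and virtual machine identifiers.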
  • a data storage device having a function of implementing the data storage method of the first aspect described above.
  • the data storage device includes at least one module for implementing the data storage method provided by the first aspect above.
  • a data storage device having a function of implementing the data storage method of the second aspect described above.
  • the data storage device includes at least one module for implementing the data storage method provided by the second aspect above.
  • a data storage device having a function of implementing the data storage method of the above third aspect.
  • the data storage device includes at least one module for implementing the data storage method provided by the third aspect above.
  • a data storage device includes a processor and a memory, where the memory is configured to store a program that supports the data storage device in performing the data storage method provided by the first aspect, and/or to store data involved in implementing the data storage method provided by the first aspect.
  • the processor is configured to execute the program stored in the memory.
  • the data storage device may further include a communication bus for establishing a connection between the processor and the memory.
  • a data storage device includes a processor and a memory, where the memory is configured to store a program that supports the data storage device in performing the data storage method provided by the second aspect, and/or to store data involved in implementing the data storage method provided by the second aspect.
  • the processor is configured to execute the program stored in the memory.
  • the data storage device may further include a communication bus for establishing a connection between the processor and the memory.
  • a data storage device includes a processor and a memory, where the memory is configured to store a program that supports the data storage device in performing the data storage method provided by the third aspect, and/or to store data involved in implementing the data storage method provided by the third aspect.
  • the processor is configured to execute the program stored in the memory.
  • the data storage device may further include a communication bus for establishing a connection between the processor and the memory.
  • an embodiment of the present invention provides a computer storage medium for storing computer software instructions used by the data storage device provided in any one of the fourth to ninth aspects, or for storing a program designed to perform the data storage method of any one of the first to third aspects.
  • the MDS in the distributed storage system can receive the data storage request of the client and, according to the stored grouping information, determine from at least one anti-affinity group the identifiers of N virtual machines that meet the data storage requirement, so that the client can store data in the virtual machines deployed as SNs that correspond to those identifiers. Because the grouping information records the mapping relationship between the plurality of anti-affinity groups and the identifiers of the M virtual machines, the MDS does not need to acquire the physical addresses of the physical servers or the distribution of virtual machines in each physical server. The location of each physical server and the distribution of virtual machines in each physical server are therefore not leaked, which improves the stability and security of the physical servers and, in turn, of the virtualized environment and the distributed storage system deployed in it.
  • FIG. 1 is a structural diagram of a distributed storage system in a virtualized environment according to an embodiment of the present invention.
  • FIG. 2 is a schematic structural diagram of a physical server according to an embodiment of the present invention.
  • FIG. 3A is a flowchart of a data storage method according to an embodiment of the present invention.
  • FIG. 3B is a schematic diagram of another physical server according to an embodiment of the present invention.
  • FIG. 3C is a schematic diagram of still another physical server according to an embodiment of the present invention.
  • FIG. 3D is a schematic diagram of still another physical server according to an embodiment of the present invention.
  • FIG. 4A is a schematic structural diagram of a data storage device according to an embodiment of the present invention.
  • FIG. 4B is a schematic structural diagram of another data storage device according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of still another data storage device according to an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of still another data storage device according to an embodiment of the present invention.
  • FIG. 1 is a schematic diagram of a distributed storage system architecture in a virtualized environment according to an exemplary embodiment.
  • the distributed storage system includes software components such as a client, an MDS, and a plurality of SNs (only 8 are shown in FIG. 1).
  • the virtualization environment is built on multiple physical servers through virtualization technology, and several virtual machines are deployed in the multiple physical servers through the virtual machine manager. Any two of the plurality of physical servers can be connected through a network.
  • some or all of the SNs in the distributed storage system may be deployed in any of the plurality of virtual machines.
  • the client, the MDS, and the SN in the distributed storage system may be deployed in any one of the plurality of virtual machines.
  • the client and the MDS in the distributed storage system may be deployed on the physical server.
  • a part of the SNs in the distributed storage system are deployed in any one of the plurality of virtual machines, and at least one of the remaining SNs may be deployed on the physical server.
  • the client may communicate with the MDS to send a data storage request to the MDS and to receive the identifiers, sent by the MDS, of the virtual machines deployed as SNs; the client may also communicate with any of the plurality of SNs to store data in, or read data from, some of the plurality of SNs.
  • the MDS can communicate with the virtual machine manager to query the identifiers of the virtual machines deployed as SNs in the plurality of physical servers, or to receive the identifiers, sent by the virtual machine manager, of the virtual machines deployed as SNs.
  • the identifier of the virtual machine is used to uniquely identify a virtual machine in the multiple physical servers, and the identifier of the virtual machine may be an Internet Protocol (IP) address, a port number, a name, etc. of the virtual machine. In an actual application, the identifier of the virtual machine may also be another identifier that can uniquely identify one virtual machine among the multiple physical servers.
  • the virtual machine manager may be deployed on a physical device other than the multiple physical servers, for example, on another physical server. Of course, depending on actual conditions, the virtual machine manager can also be deployed in one of the plurality of physical servers.
  • FIG. 2 is a schematic structural diagram of a physical server according to an embodiment of the present invention.
  • the physical server can be any of the physical servers in FIG. Referring to FIG. 2, the physical server includes at least one processor 201, a communication bus 202, a memory 203, and at least one communication interface 204.
  • the processor 201 may be a Central Processing Unit (CPU), a microprocessor, an Application-Specific Integrated Circuit (ASIC), or one or more other devices used to implement the present invention.
  • Communication bus 202 can include a path for transferring data between components included in the physical server, such as processor 201, memory 203, and communication interface 204.
  • the memory 203 can be a Read-Only Memory (ROM) or another type of static storage device that can store static data and instructions, or a Random Access Memory (RAM) or another type of dynamic storage device that can store data and instructions. It may also be an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Compact Disc Read-Only Memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by the physical server, but is not limited thereto.
  • the memory 203 may be stand-alone and connected to the processor 201 via the communication bus 202, or may be integrated with the processor 201.
  • the communication interface 204 uses any transceiver-type device to communicate with other devices over a communication network; the communication network may be Ethernet, a Radio Access Network (RAN), a Wireless Local Area Network (WLAN), etc.; the other device may be a device on which a client or an SN is deployed (e.g., a movable computer device or a portable mobile terminal).
  • processor 201 may include one or more CPUs, such as CPU0 and CPU1 shown in FIG.
  • the physical server can include multiple processors, such as processor 201 and processor 205 shown in FIG. Each of these processors can be a single core processor (CPU) or a multi-core processor (multi-CPU).
  • processors herein may refer to one or more devices, circuits, and/or processing cores for processing data.
  • the memory 203 is configured to store program code 210 for implementing various embodiments in the present invention, and the processor 201 is configured to execute the program code 210 stored in the memory 203.
  • the physical server can execute the program code 210 in the memory 203 by the processor 201 to implement the data storage method provided by the embodiment of FIG. 3A below.
  • FIG. 3A is a flowchart of a data storage method according to an embodiment of the present invention. At least M virtual machines are deployed in a plurality of physical servers, and the M virtual machines are respectively deployed as M SNs of a distributed storage system, where M is a positive integer greater than or equal to 2. Referring to FIG. 3A, the method includes:
  • Step 301 The virtual machine manager queries, from the plurality of physical servers, the M virtual machines deployed as SNs, where the plurality of physical servers are the physical servers on which the virtualized environment is built.
  • the M SNs may be all or some of the SNs of the distributed storage system. The running status of the virtual machines directly affects the stability and reliability of the distributed storage system. Therefore, in order to manage the virtual machines deployed in the plurality of physical servers and ensure the stability and reliability of the distributed storage system, the virtual machine manager can query the M virtual machines deployed as SNs from the plurality of physical servers.
  • the virtual machine manager may perform the operations described in step 301 when receiving the virtual machine query request sent by the MDS.
  • the virtual machine manager can acquire and store the identifier of the currently deployed virtual machine when the virtual machine is deployed in each physical server.
  • the virtual machine manager may obtain the identifiers of the M virtual machines deployed as SNs by using other methods.
  • a possible implementation manner is: the virtual machine manager sends a virtual machine identifier acquisition request to each of the plurality of physical servers, and when a physical server receives the request, it sends the identifiers of the virtual machines deployed as SNs in that physical server to the virtual machine manager.
  • for example, the distributed storage system is deployed in three physical servers, and the virtual machine manager determines that the identifiers of the virtual machines deployed as SNs in the three physical servers are VM1, VM2, VM3, VM4, VM5, VM6, VM7, VM8, VM9, VM10, VM11, and VM12.
  • Step 302 The virtual machine manager sends the identifiers of the M virtual machines to the MDS in the distributed storage system.
  • because the identifiers obtained by the virtual machine manager's query are those of the virtual machines deployed as SNs, the virtual machine manager can send the obtained identifiers to the MDS so that the distributed storage system can store data in the SNs.
  • the MDS may send a virtual machine query request to the virtual machine manager, so that the virtual machine manager, through steps 301 and 302 above, queries the plurality of physical servers for the virtual machines deployed as SNs and sends their identifiers to the MDS. This ensures that the MDS can learn which M virtual machines in the plurality of physical servers are deployed as SNs.
  • the MDS can also query the M virtual machines deployed as SNs in the plurality of physical servers by other methods.
  • a possible implementation manner is: when the MDS receives the virtual machine identifiers sent by the virtual machine manager for the first time, it stores them; afterwards, the MDS can look up the virtual machines deployed as SNs in its local storage.
  • Step 303 The MDS receives the identifiers of the M virtual machines sent by the virtual machine manager, and groups the M virtual machines.
  • the M virtual machines are virtual machines deployed as SNs. They may include virtual machines deployed in the same physical server as well as virtual machines deployed in different physical servers, and the distributed storage system may have different requirements when storing different data. For example, some data needs to be stored in the SNs of virtual machines deployed in the same physical server. Therefore, to facilitate data storage in the distributed storage system, the MDS can group the M virtual machines.
  • because the MDS does not need to obtain the physical addresses of the plurality of physical servers or the distribution of virtual machines in each physical server, the location of each physical server and the distribution of virtual machines in each physical server are not revealed. This improves the stability and security of the physical servers, and in turn the stability and security of the virtualized environment built on the physical servers and of the distributed storage system deployed in that environment.
  • the MDS may group the M virtual machines by using at least one of the following two possible implementation manners, including:
  • the virtual machines in each anti-affinity group have anti-affinity with one another, that is, the virtual machines in the same anti-affinity group are deployed on different physical servers. Therefore, when data is stored in the SNs of virtual machines in the same anti-affinity group and the physical server hosting one of those SNs fails, the data can still be obtained from the SNs of the virtual machines deployed in the other physical servers, which improves the reliability of storing the data. Accordingly, the MDS may divide the M virtual machines into multiple anti-affinity groups, and record the mapping relationship between the anti-affinity groups and the identifiers of the M virtual machines in the group information.
  • the number X of virtual machines included in each anti-affinity group is usually greater than or equal to the minimum number of data copies required for data reliability, and X is less than M. X may be determined before the MDS divides the M virtual machines into the multiple anti-affinity groups; for example, one possible strategy is that the MDS obtains X by receiving a value entered by the user.
  • the minimum number of data copies required for data reliability can be determined by the user based on experience with actually storing data; for example, the minimum number of data copies can be 2 or 3.
  • for example, the MDS learns that the identifiers of the 12 virtual machines are VM1, VM2, VM3, VM4, VM5, VM6, VM7, VM8, VM9, VM10, VM11, and VM12, and the value of X is 3. The MDS divides the 12 virtual machines into four anti-affinity groups:
  • anti-affinity group 1 includes VM1, VM5, and VM9;
  • anti-affinity group 2 includes VM2, VM6, and VM10;
  • anti-affinity group 3 includes VM3, VM7, and VM11;
  • anti-affinity group 4 includes VM4, VM8, and VM12.
  • the MDS records in the group information the mapping relationship between the four anti-affinity groups and the identifiers of the 12 virtual machines: anti-affinity group 1: VM1, VM5, VM9; anti-affinity group 2: VM2, VM6, VM10; anti-affinity group 3: VM3, VM7, VM11; anti-affinity group 4: VM4, VM8, VM12.
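  • the anti-affinity grouping in this example can be reproduced with a short sketch. The text does not say how the MDS picks the members of each group, so the striding rule below is only an assumption that happens to yield the listed groups; the function name `build_anti_affinity_groups` is hypothetical.

```python
def build_anti_affinity_groups(vm_ids, x):
    """Split vm_ids into anti-affinity groups of x members each.

    Assumption: members are picked by striding through the identifier
    list, which reproduces the 12-VM example (VM1, VM5, VM9, ...).
    """
    if x < 1 or len(vm_ids) % x != 0:
        raise ValueError("the number of VMs must be a multiple of x")
    group_count = len(vm_ids) // x
    # Group g takes every group_count-th identifier, starting at index g.
    return {g + 1: vm_ids[g::group_count] for g in range(group_count)}


vm_ids = [f"VM{i}" for i in range(1, 13)]
group_info = build_anti_affinity_groups(vm_ids, 3)
for group_id, members in group_info.items():
    print(f"anti-affinity group {group_id}: {', '.join(members)}")
```

  • the resulting dictionary plays the role of the group information: it holds only group numbers and virtual machine identifiers, and no physical server addresses.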
  • the virtual machines in each affinity group have affinity with one another, and the virtual machines in the same affinity group can be deployed on the same server. Therefore, when data is stored in the SNs of virtual machines in the same affinity group, the data can be written to those SNs quickly and read back from them quickly, which improves the efficiency of storing the data. Accordingly, the MDS may divide the M virtual machines into multiple affinity groups, and record the mapping relationship between the affinity groups and the identifiers of the M virtual machines in the group information.
  • the number Y of virtual machines included in each affinity group may be greater than or equal to the minimum number of data copies required for data reliability, and Y is less than M. Y may be determined before the MDS divides the M virtual machines into the multiple affinity groups; for example, one possible strategy is that the MDS obtains Y by receiving a value entered by the user.
  • for example, the MDS learns that the identifiers of the 12 virtual machines are VM1, VM2, VM3, VM4, VM5, VM6, VM7, VM8, VM9, VM10, VM11, and VM12, and the value of Y is 4. The MDS divides the 12 virtual machines into three affinity groups: affinity group 1 includes VM1, VM2, VM3, and VM4; affinity group 2 includes VM5, VM6, VM7, and VM8; and affinity group 3 includes VM9, VM10, VM11, and VM12. The MDS records in the group information the mapping relationship between the three affinity groups and the identifiers of the 12 virtual machines:
  • affinity group 1: VM1, VM2, VM3, VM4;
  • affinity group 2: VM5, VM6, VM7, VM8;
  • affinity group 3: VM9, VM10, VM11, VM12.
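  • the affinity grouping in this example corresponds to cutting the identifier list into consecutive runs of Y entries. The sketch below assumes that rule, which the text does not state explicitly; `build_affinity_groups` is a hypothetical name.

```python
def build_affinity_groups(vm_ids, y):
    """Split vm_ids into affinity groups of y consecutive members.

    Assumption: consecutive identifiers are grouped together, which
    matches the 12-VM example with Y = 4.
    """
    if y < 1 or len(vm_ids) % y != 0:
        raise ValueError("the number of VMs must be a multiple of y")
    return {
        start // y + 1: vm_ids[start:start + y]
        for start in range(0, len(vm_ids), y)
    }


vm_ids = [f"VM{i}" for i in range(1, 13)]
affinity_info = build_affinity_groups(vm_ids, 4)
for group_id, members in affinity_info.items():
    print(f"affinity group {group_id}: {', '.join(members)}")
```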
  • the MDS may also divide the M virtual machines both into multiple affinity groups according to affinity and into multiple anti-affinity groups according to anti-affinity, and record in the group information both the mapping relationship between the affinity groups and the identifiers of the M virtual machines and the mapping relationship between the anti-affinity groups and the identifiers of the M virtual machines.
  • for example, the MDS learns that the identifiers of the 12 virtual machines are VM1, VM2, VM3, VM4, VM5, VM6, VM7, VM8, VM9, VM10, VM11, and VM12, the value of X is 3, and the value of Y is 4. The MDS divides the 12 virtual machines into four anti-affinity groups and three affinity groups.
  • the anti-affinity groups are: anti-affinity group 1 includes VM1, VM5, and VM9; anti-affinity group 2 includes VM2, VM6, and VM10; anti-affinity group 3 includes VM3, VM7, and VM11; and anti-affinity group 4 includes VM4, VM8, and VM12. The affinity groups are: affinity group 1 includes VM1, VM2, VM3, and VM4; affinity group 2 includes VM5, VM6, VM7, and VM8; and affinity group 3 includes VM9, VM10, VM11, and VM12.
  • the MDS records in the group information the mapping relationship between the four anti-affinity groups and the identifiers of the 12 virtual machines: anti-affinity group 1: VM1, VM5, VM9; anti-affinity group 2: VM2, VM6, VM10; anti-affinity group 3: VM3, VM7, VM11; anti-affinity group 4: VM4, VM8, VM12. It also records the mapping relationship between the three affinity groups and the identifiers of the 12 virtual machines: affinity group 1: VM1, VM2, VM3, VM4; affinity group 2: VM5, VM6, VM7, VM8; affinity group 3: VM9, VM10, VM11, VM12.
  • Step 304 The MDS sends the group information to the virtual machine manager.
  • the MDS may send the group information to the virtual machine manager, so that the virtual machine manager deploys the M virtual machines according to the group information. The deployment may be implemented by creating virtual machines according to the group information or by migrating existing virtual machines.
  • the virtual machine manager may include a configuration interface, and receive the group information through the configuration interface.
  • the MDS can also store the group information.
  • Step 305 The virtual machine manager receives the group information sent by the MDS, and deploys the M virtual machines according to the group information.
  • the virtual machine manager may deploy the M virtual machines according to the group information.
  • the virtual machine manager does not need to notify the MDS while deploying the M virtual machines, and after the deployment is completed it does not need to notify the MDS of the physical addresses of the plurality of physical servers or of the distribution of virtual machines in each physical server. Therefore, in the embodiment of the present invention, the virtual machine manager can deploy the M virtual machines in the plurality of physical servers without the MDS being aware of the deployment.
  • when the virtual machine manager deploys the M virtual machines according to the group information, it may migrate at least one already created virtual machine among the M virtual machines according to the group information, or it may create virtual machines in the physical servers according to the group information, to complete the deployment of the M virtual machines.
  • the virtual machine manager may deploy the M virtual machines according to the group information by using at least one of the following two possible implementation manners:
  • the virtual machine manager determines the virtual machines in the same anti-affinity group and migrates them to different physical servers among the plurality of physical servers.
  • for example, before migration the virtual machines in the plurality of physical servers are distributed as shown in FIG. 3B: the virtual machines included in physical server 1 are VM1, VM4, VM5, and VM7;
  • the virtual machines included in physical server 2 are VM2, VM6, VM8, and VM10;
  • the virtual machines included in physical server 3 are VM3, VM9, VM11, and VM12.
  • the group information received by the virtual machine manager is: anti-affinity group 1: VM1, VM5, VM9; anti-affinity group 2: VM2, VM6, VM10; anti-affinity group 3: VM3, VM7, VM11; anti-affinity group 4: VM4, VM8, VM12.
  • the virtual machine manager migrates the virtual machines in the three physical servers. After the migration, the virtual machines in the three physical servers are distributed as shown in FIG. 3C.
  • the virtual machines included in the physical server 1 are: VM1.
  • the virtual machines included in the physical server 2 are: VM5, VM7, VM10, and VM12;
  • the virtual machines included in the physical server 3 are: VM2, VM4, VM9, and VM11.
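  • why the distribution of FIG. 3B requires migration can be checked mechanically. The sketch below is not part of the described method; it is a hypothetical helper showing the kind of check a virtual machine manager could run, using the FIG. 3B distribution and the four anti-affinity groups quoted above. The server names and the function name are illustrative assumptions.

```python
def violated_anti_affinity_groups(placement, anti_groups):
    """Return ids of anti-affinity groups with two members on one server."""
    host_of = {vm: server for server, vms in placement.items() for vm in vms}
    violated = []
    for group_id, members in sorted(anti_groups.items()):
        hosts = [host_of[vm] for vm in members]
        # Anti-affinity holds only if every member sits on a distinct server.
        if len(set(hosts)) < len(hosts):
            violated.append(group_id)
    return violated


# Distribution before migration, as listed for FIG. 3B.
fig_3b = {
    "server1": ["VM1", "VM4", "VM5", "VM7"],
    "server2": ["VM2", "VM6", "VM8", "VM10"],
    "server3": ["VM3", "VM9", "VM11", "VM12"],
}
anti_groups = {
    1: ["VM1", "VM5", "VM9"],
    2: ["VM2", "VM6", "VM10"],
    3: ["VM3", "VM7", "VM11"],
    4: ["VM4", "VM8", "VM12"],
}
print(violated_anti_affinity_groups(fig_3b, anti_groups))
```

  • under FIG. 3B only anti-affinity group 4 already spans three different servers; groups 1, 2, and 3 each have at least two members on one server, which is why virtual machines must be migrated.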
  • the virtual machine manager determines the virtual machines in the same affinity group and migrates them to the same physical server among the plurality of physical servers.
  • for example, before migration the virtual machines in the plurality of physical servers are distributed as shown in FIG. 3B: the virtual machines included in physical server 1 are VM1, VM4, VM5, and VM7;
  • the virtual machines included in physical server 2 are VM2, VM6, VM8, and VM10;
  • the virtual machines included in physical server 3 are VM3, VM9, VM11, and VM12.
  • the group information received by the virtual machine manager is: affinity group 1: VM1, VM2, VM3, VM4; affinity group 2: VM5, VM6, VM7, VM8; affinity group 3: VM9, VM10, VM11, VM12.
  • the virtual machine manager migrates the virtual machines in the three physical servers. After the migration, the virtual machines in the three physical servers are distributed as shown in FIG. 3D.
  • the virtual machines included in the physical server 1 are: VM1.
  • the virtual machines included in the physical server 2 are: VM5, VM6, VM7, and VM8;
  • the virtual machines included in the physical server 3 are: VM9, VM10, VM11 and VM12.
  • when the group information records both the mapping relationship between the plurality of anti-affinity groups and the identifiers of the M virtual machines and the mapping relationship between the plurality of affinity groups and the identifiers of the M virtual machines, the virtual machine manager needs to migrate the virtual machines in the same anti-affinity group to different physical servers among the plurality of physical servers, and migrate the virtual machines in the same affinity group to the same physical server among the plurality of physical servers.
  • the M virtual machines in the plurality of physical servers are grouped by the MDS, which stores the group information, and the M virtual machines are deployed by the virtual machine manager. This ensures that the distribution of the M virtual machines in the plurality of physical servers is consistent with the group information stored by the MDS, so in the following steps data storage can be performed based on the stored group information.
  • Step 306 The MDS receives the data storage request of the client, and determines that the number of copies of the data specified by the data storage request is N, and N is a positive integer greater than or equal to 1.
  • the MDS can receive a data storage request sent by the client and determine the number of copies of the data specified by the data storage request.
  • the data storage request may include a block identifier of the specified data or a file identifier of a file to which the specified data belongs.
  • the specified data is data to be stored, and the specified data may be one data block.
  • the block identifier is used to uniquely identify a data block, and may be an identity (ID) of the data block. Of course, in an actual application, the block identifier may also be another identifier that can uniquely identify the data block.
  • the file identifier is used to uniquely identify the file, and the file identifier may be a file name or the like of the file. Of course, in an actual application, the file identifier may also be another identifier that can uniquely identify the file.
  • the data storage request may carry the number N of copies of the specified data to be stored, that is, N may be determined by the client.
  • N is a positive integer greater than or equal to 1.
  • the N may be determined by the client before sending the data storage request to the MDS.
  • a possible implementation manner is: the client receives the value input by the user, and determines the received value as N.
  • the client can determine the size of N by other means.
  • the data storage request may not carry N, in which case N is determined by the MDS.
  • one possible strategy is that N is set by the relevant technician when deploying the MDS.
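  • the two ways of determining N described for step 306 (carried in the request, or configured on the MDS) can be sketched as follows. The request layout, the field name `copies`, and the default value 3 are illustrative assumptions, not details given in the text.

```python
DEFAULT_COPIES = 3  # hypothetical value configured when the MDS is deployed


def resolve_copy_count(request, default=DEFAULT_COPIES):
    """Return N: the value carried in the request, else the MDS default."""
    n = request.get("copies", default)
    if not isinstance(n, int) or n < 1:
        raise ValueError("N must be a positive integer")
    return n


# The client chose N itself.
print(resolve_copy_count({"block_id": "block-1", "copies": 2}))
# The request carries no N, so the MDS falls back to its configured value.
print(resolve_copy_count({"block_id": "block-2"}))
```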
  • Step 307 The MDS determines an identifier of the N virtual machines according to the group information, and sends response information of the data storage request to the client, where the response information includes identifiers of the N virtual machines.
  • the MDS may determine N virtual machines from the M virtual machines according to the group information, and send response information to the client; the response information instructs the client to store, in the N virtual machines, N copies of the data specified by the data storage request.
  • each of the N virtual machines stores one copy of the data specified by the data storage request, so that storage locations for the data are allocated to the client.
  • the N virtual machines corresponding to the identifiers of the N virtual machines belong to at least one anti-affinity group, and the N virtual machines are among the M virtual machines, that is, the M virtual machines include the N virtual machines.
  • the MDS can determine the identifiers of the N virtual machines according to the grouping information by using the following two possible implementation manners.
  • in the first possible implementation manner, the MDS may preferentially allocate N virtual machines with anti-affinity to the client: when X is greater than or equal to N, the MDS determines the identifiers of the N virtual machines from one of the plurality of anti-affinity groups; when X is less than N, the MDS determines the identifiers of the N virtual machines from at least two of the anti-affinity groups.
  • when X is less than N, the MDS may determine X virtual machines from any one of the at least two anti-affinity groups, and determine N - X virtual machines from the other anti-affinity groups among the at least two anti-affinity groups, thus obtaining N virtual machines.
  • further, the MDS determines the identifiers of the N virtual machines from the at least two anti-affinity groups according to affinity.
  • the MDS may determine X virtual machines from any one of the at least two anti-affinity groups; then, from the other anti-affinity groups among the at least two anti-affinity groups, the MDS determines, according to affinity, N - X virtual machines that are in the same affinity group as at least one of the X virtual machines, thereby obtaining N virtual machines.
  • the group information may record the mapping relationship between the multiple affinity groups and the identifiers of the M virtual machines; therefore, according to affinity, the MDS may select the N - X virtual machines from the affinity group in which at least one of the X virtual machines is located.
  • for example, with the grouping above (X is 3), when N is 2 the MDS determines two virtual machines from anti-affinity group 1 based on the group information, and the identifiers of the two virtual machines are VM1 and VM5.
  • when N is 4, the MDS determines three virtual machines from anti-affinity group 1 based on the group information, and the identifiers of the three virtual machines are VM1, VM5, and VM9. Because VM1 belongs to affinity group 1, the MDS then determines one virtual machine from anti-affinity group 2 whose identifier is VM2, since VM2 also belongs to affinity group 1, thereby obtaining 4 virtual machines.
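  • the selection rule of this first implementation manner can be sketched as below. The text lets the MDS start from any anti-affinity group; this sketch assumes it always starts from the lowest-numbered group and scans the remaining groups in order, which reproduces the two examples. The function name `select_vm_ids` is hypothetical.

```python
def select_vm_ids(n, anti_groups, affinity_groups):
    """Pick n VM identifiers, preferring anti-affinity, then affinity."""
    total = sum(len(vms) for vms in anti_groups.values())
    if n < 1 or n > total:
        raise ValueError("n must be between 1 and the number of VMs")
    group_ids = sorted(anti_groups)
    first = anti_groups[group_ids[0]]
    if len(first) >= n:
        # X >= N: a single anti-affinity group is enough.
        return first[:n]
    # X < N: take the whole first group, then N - X VMs from other groups.
    chosen = list(first)
    affinity_of = {vm: gid for gid, vms in affinity_groups.items() for vm in vms}
    shared = {affinity_of[vm] for vm in chosen if vm in affinity_of}
    candidates = [vm for gid in group_ids[1:] for vm in anti_groups[gid]]
    # Stable sort: VMs sharing an affinity group with a chosen VM come first.
    candidates.sort(key=lambda vm: affinity_of.get(vm) not in shared)
    return chosen + candidates[: n - len(chosen)]


anti = {1: ["VM1", "VM5", "VM9"], 2: ["VM2", "VM6", "VM10"],
        3: ["VM3", "VM7", "VM11"], 4: ["VM4", "VM8", "VM12"]}
aff = {1: ["VM1", "VM2", "VM3", "VM4"], 2: ["VM5", "VM6", "VM7", "VM8"],
       3: ["VM9", "VM10", "VM11", "VM12"]}
print(select_vm_ids(2, anti, aff))
print(select_vm_ids(4, anti, aff))
```

  • the function works only on group numbers and virtual machine identifiers, which mirrors the point that the MDS needs no physical server addresses to allocate storage locations.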
  • in the second possible implementation manner, the MDS may preferentially allocate N virtual machines with affinity to the client: if Y is greater than or equal to N, the MDS determines the identifiers of the N virtual machines from one of the plurality of affinity groups; if Y is less than N, the MDS determines the identifiers of the N virtual machines from at least two of the affinity groups.
  • the step in which the MDS determines the identifiers of the N virtual machines from at least two affinity groups is similar to the step, in the first possible implementation manner, in which the MDS determines the identifiers of the N virtual machines from at least two anti-affinity groups, and is not described again in the embodiments of the present invention.
  • further, after determining the identifiers of the N virtual machines according to the group information, the MDS may store the correspondence between the block identifier of the specified data and the identifiers of the N virtual machines, and the correspondence between the block identifier of the specified data and the file identifier of the file to which the specified data belongs.
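  • the two correspondences the MDS may keep can be pictured as two lookup tables keyed by block identifier. The class below is a minimal in-memory sketch; the class name, method names, and sample identifiers are hypothetical and not taken from the text.

```python
class MetadataIndex:
    """In-memory sketch of the correspondences kept by the MDS."""

    def __init__(self):
        self.block_to_vms = {}   # block identifier -> identifiers of the N VMs
        self.block_to_file = {}  # block identifier -> file identifier

    def record(self, block_id, vm_ids, file_id):
        """Store both correspondences for one data block."""
        self.block_to_vms[block_id] = list(vm_ids)
        self.block_to_file[block_id] = file_id

    def locate(self, block_id):
        """Return the VM identifiers that hold copies of the block."""
        return self.block_to_vms[block_id]


index = MetadataIndex()
index.record("block-1", ["VM1", "VM5"], "file-a")
print(index.locate("block-1"), index.block_to_file["block-1"])
```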
  • Step 308 When the client receives the identifiers of the N virtual machines, the client stores the specified data in the N virtual machines.
  • the client can store the specified data into the N virtual machines.
  • the distributed storage system may manage the specified data, such as reading, transferring, deleting, and the like, which is not specifically limited in the embodiment of the present invention.
  • in the embodiment of the present invention, the MDS in the distributed storage system can receive the data storage request of the client and, according to the stored group information, determine from at least one anti-affinity group the identifiers of N virtual machines that meet the data storage requirement, ensuring that the client can store data in the virtual machines that correspond to those identifiers and are deployed as SNs. Because the group information records the mapping relationship between the plurality of anti-affinity groups and the identifiers of the M virtual machines, the physical addresses of the plurality of physical servers and the distribution of virtual machines in each physical server need not be obtained, so the location of each physical server and the distribution of virtual machines in each physical server are not revealed. This improves the stability and security of the physical servers, and in turn the stability and security of the virtualized environment and of the distributed storage system deployed in the virtualized environment.
  • the MDS can compare the number X of virtual machines included in an anti-affinity group with N and, according to the comparison result, adopt different strategies to determine the identifiers of the N virtual machines. This improves the flexibility of the distributed storage system while ensuring its reliability and stability.
  • the MDS can group the M virtual machines whose identifiers are sent by the virtual machine manager, and send the group information to the virtual machine manager, so that the virtual machine manager can deploy the M virtual machines based on the group information. This ensures that the actual distribution of the M virtual machines in the plurality of physical servers is consistent with the group information, which further improves the stability and reliability of the distributed storage system.
  • FIG. 4A is a schematic diagram of a data storage device according to an embodiment of the present invention. At least M virtual machines are deployed in a plurality of physical servers, the M virtual machines are respectively deployed as M SNs of the distributed storage system, and M is a positive integer greater than or equal to 2. Referring to FIG. 4A, the apparatus includes: a first receiving module 401, a first determining module 402, a second determining module 403, and a first sending module 404.
  • the first receiving module 401 is configured to perform an operation of the MDS receiving the data storage request of the client in the distributed storage system in step 306;
  • a first determining module 402, configured to perform the operation in step 306 in which the MDS determines that the number of copies of the data specified by the data storage request is N;
  • a second determining module 403, configured to perform an operation of determining, by the MDS, the identifiers of the N virtual machines according to the group information in step 307;
  • the first sending module 404 is configured to perform the operation of sending, by the MDS, the response information of the data storage request to the client in step 307.
  • the second determining module 403 includes:
  • a first determining submodule configured to perform, in step 307, if the number of identifiers of the virtual machines included in the anti-affinity group is greater than or equal to N, the MDS from the one of the plurality of anti-affinity groups Determining the operation of the identifiers of the N virtual machines;
  • a second determining submodule configured to perform, in step 307, if the number of identifiers of the virtual machines included in the anti-affinity group is less than N, the MDS determines the identifiers of the N virtual machines from at least two of the anti-affinity groups operating.
  • the second determining submodule is further configured to:
  • the MDS determines the identifiers of the N virtual machines based on the affinity between the at least two of the anti-affinity groups.
  • the apparatus further includes:
  • a second receiving module 405, configured to perform the operation in step 303 in which the MDS receives the identifiers of the M virtual machines sent by the virtual machine manager;
  • the dividing module 406 is configured to perform the operations in step 303 in which the MDS divides the M virtual machines into the plurality of anti-affinity groups and records, in the group information, the mapping relationship between the plurality of anti-affinity groups and the identifiers of the M virtual machines;
  • the second sending module 407 is configured to perform the operations described in step 304.
  • the M virtual machines belong to multiple affinity groups, the virtual machines in each affinity group have affinity with one another, and the group information records the mapping relationship between the multiple affinity groups and the identifiers of the M virtual machines.
  • in the embodiment of the present invention, the MDS in the distributed storage system can receive the data storage request of the client and, according to the stored group information, determine from at least one anti-affinity group the identifiers of N virtual machines that meet the data storage requirement, ensuring that the client can store data in the virtual machines that correspond to those identifiers and are deployed as SNs. Because the group information records the mapping relationship between the plurality of anti-affinity groups and the identifiers of the M virtual machines, the physical addresses of the plurality of physical servers and the distribution of virtual machines in each physical server need not be obtained, so the location of each physical server and the distribution of virtual machines in each physical server are not revealed.
  • FIG. 5 is a schematic diagram of a data storage device according to an embodiment of the present invention. At least M virtual machines are deployed in a plurality of physical servers, the M virtual machines are respectively deployed as M data nodes (SNs) of the distributed storage system, and M is a positive integer greater than or equal to 2. Referring to FIG. 5, the apparatus includes: a receiving module 501, a dividing module 502, a recording module 503, and a sending module 504.
  • the receiving module 501 is configured to perform an operation of receiving, by the MDS of the distributed storage system, the identifiers of the M virtual machines sent by the virtual machine manager in step 303;
  • a dividing module 502 configured to perform the operation of dividing, by the MDS, the M virtual machines into multiple anti-affinity groups in step 303;
  • the recording module 503 is configured to perform an operation of recording, by the MDS, the mapping relationship between the plurality of anti-affinity groups and the identifiers of the M virtual machines in the group information in step 303;
  • the sending module 504 is configured to perform the operations described in step 304.
  • in the embodiment of the present invention, the MDS can group the M virtual machines whose identifiers are sent by the virtual machine manager, and send the group information to the virtual machine manager, so that the virtual machine manager can deploy the M virtual machines based on the group information. This ensures that the actual distribution of the M virtual machines in the plurality of physical servers is consistent with the group information, thereby improving the stability and reliability of the distributed storage system.
  • in addition, the location of each physical server and the distribution of virtual machines in each physical server are not revealed, which improves the stability and security of the physical servers and further enhances the stability and security of the virtualized environment and of the distributed storage system deployed in the virtualized environment.
  • FIG. 6 is a schematic diagram of a data storage device according to an embodiment of the present invention. At least M virtual machines are deployed in a plurality of physical servers, the M virtual machines are respectively deployed as M data nodes (SNs) of the distributed storage system, and M is a positive integer greater than or equal to 2. Referring to FIG. 6, the apparatus includes: a sending module 601, a receiving module 602, and a deployment module 603.
  • the sending module 601 is configured to perform an operation of sending, by the virtual machine manager, the identifiers of the M virtual machines to the metadata node MDS of the distributed storage system in step 302;
  • the receiving module 602 is configured to perform an operation of receiving, by the virtual machine manager, the group information sent by the MDS in step 305;
  • the deployment module 603 is configured to perform the operation in step 305 in which the virtual machine manager deploys the M virtual machines according to the group information.
  • the M virtual machines belong to multiple affinity groups, the virtual machines in each affinity group have affinity with one another, and the group information records the mapping relationship between the multiple affinity groups and the identifiers of the M virtual machines.
  • in the embodiment of the present invention, the virtual machine manager may send the identifiers of the M virtual machines deployed as SNs to the MDS, and deploy the M virtual machines according to the received group information of the M virtual machines. This ensures that the actual distribution of the M virtual machines in the plurality of physical servers is consistent with the group information, thereby improving the stability and reliability of the distributed storage system.
  • when the data storage device provided by the foregoing embodiments stores data, the division into the functional modules described above is used only as an example. In an actual application, the functions may be allocated to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to perform all or part of the functions described above.
  • the data storage device and the data storage method embodiment provided by the foregoing embodiments are in the same concept, and the specific implementation process is described in detail in the method embodiment, and details are not described herein again.
  • a person skilled in the art may understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing related hardware, and the program may be stored in a computer readable storage medium.
  • the storage medium mentioned may be a read only memory, a magnetic disk or an optical disk or the like.

Abstract

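A data storage method and apparatus, belonging to the field of data storage technologies. At least M virtual machines are deployed on multiple physical servers, the M virtual machines are respectively deployed as M data nodes (SNs) of a distributed storage system, and M is a positive integer greater than or equal to 2. In the method, the MDS in the distributed storage system receives a data storage request from a client and determines the identifiers of N virtual machines according to stored grouping information. Because the grouping information records a mapping between multiple anti-affinity groups and the identifiers of the M virtual machines, the MDS does not need to obtain the physical addresses of the physical servers or the distribution of virtual machines on each physical server. Therefore, neither the location of each physical server nor the distribution of virtual machines on it is disclosed, which improves the stability and security of the physical servers and, in turn, of the virtualization environment and the distributed storage system deployed in it.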

Description

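Data Storage Method and Apparatus
Technical Field
The present invention relates to the field of data storage technologies, and in particular, to a data storage method and apparatus.
Background
With the development of electronic information technologies, the amount of data on networks has grown substantially, and a variety of data storage technologies have emerged to store and manage it. Among them, distributed storage systems are widely used because they are secure and convenient.
A distributed storage system generally includes a client, a metadata node (Meta Data Server, MDS), and multiple data nodes (Storage Node, SN). The distributed storage system can usually be deployed in a virtualization environment. The virtualization environment is built on multiple physical servers by using a virtualization technology, and a virtual machine manager deploys several virtual machines on the physical servers. The virtual machine manager may differ between virtualization technologies; for example, it may be an HM (Hypervisor Manager), a KVM (Kernel-based Virtual Machine), or a VMM (Virtual Machine Manager). The client, the MDS, and the SNs of the distributed storage system may each be deployed in any of these virtual machines. The MDS may query, through the virtual machine manager, the physical address of each physical server and the identifiers of the virtual machines deployed on each physical server, and store them.
When target data is to be stored by the distributed storage system deployed in the virtualization environment, the client sends a data storage request to the MDS. Based on the previously stored physical address of each physical server and the identifiers of the virtual machines deployed on each physical server, the MDS returns virtual machine identifiers to the client. On receiving the returned identifiers, the client stores the target data in the SNs deployed on the corresponding virtual machines.
A distributed storage system deployed in the virtualization environment is often a third-party system, and when it stores data by the foregoing method, it relies on the physical address of each physical server and the identifiers of the virtual machines deployed on each physical server. As a result, the location of each physical server and the distribution of virtual machines on it may be disclosed. This may pose a security risk to the physical servers, reducing the stability and security of the physical servers as well as of the virtualization environment and the distributed storage system deployed in it.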
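Summary
To solve the problem in the prior art, embodiments of the present invention provide a data storage method and apparatus. The technical solutions are as follows:
According to a first aspect, a data storage method is provided. At least M virtual machines are deployed on multiple physical servers, the M virtual machines are respectively deployed as M SNs of a distributed storage system, and M is a positive integer greater than or equal to 2. The method includes:
receiving, by an MDS of the distributed storage system, a data storage request from a client;
determining, by the MDS, that the number of copies to be stored of the data specified by the data storage request is N, where N is a positive integer greater than or equal to 1;
determining, by the MDS, identifiers of N virtual machines according to grouping information, where the N virtual machines corresponding to the identifiers belong to at least one anti-affinity group, the grouping information records a mapping between multiple anti-affinity groups and the identifiers of the M virtual machines, the virtual machines in each anti-affinity group have anti-affinity with one another, and the M virtual machines include the N virtual machines; and
sending, by the MDS, response information for the data storage request to the client, where the response information includes the identifiers of the N virtual machines and instructs the client to store N copies of the data specified by the data storage request on the N virtual machines.
In this embodiment of the present invention, the MDS in the distributed storage system can receive a data storage request from the client and, according to the stored grouping information, determine from at least one anti-affinity group the identifiers of N virtual machines that meet the data storage requirement, so that the client can store data on the virtual machines that correspond to those identifiers and are deployed as SNs. Because the grouping information records a mapping between multiple anti-affinity groups and the identifiers of the M virtual machines, the MDS does not need to obtain the physical addresses of the physical servers or the distribution of virtual machines on each physical server. Therefore, neither the location of each physical server nor the distribution of virtual machines on it is disclosed, which improves the stability and security of the physical servers and, in turn, of the virtualization environment and the distributed storage system deployed in it.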
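Further, the determining, by the MDS, identifiers of N virtual machines according to grouping information includes:
if the number of virtual machine identifiers included in an anti-affinity group is greater than or equal to N, determining, by the MDS, the identifiers of the N virtual machines from one of the multiple anti-affinity groups; or
if the number of virtual machine identifiers included in an anti-affinity group is less than N, determining, by the MDS, the identifiers of the N virtual machines from at least two of the anti-affinity groups.
For example, an anti-affinity group includes X virtual machines. If X is less than N, the MDS may determine X virtual machines from any one of the at least two anti-affinity groups and determine N-X virtual machines from the other anti-affinity groups among them, thereby obtaining N virtual machines.
In this embodiment of the present invention, when determining the identifiers of the N virtual machines, the MDS can compare the number of virtual machines in an anti-affinity group with N and, depending on the result, use a different policy to determine the identifiers. This improves the flexibility of the distributed storage system while preserving its reliability and stability.
Further, the determining, by the MDS, the identifiers of the N virtual machines from at least two of the anti-affinity groups includes:
determining, by the MDS, the identifiers of the N virtual machines among the at least two anti-affinity groups according to affinity.
The MDS may determine X virtual machines from any one of the at least two anti-affinity groups and, from the other anti-affinity groups among them, determine according to affinity N-X virtual machines that are in the same affinity group as at least one of the X virtual machines, thereby obtaining N virtual machines.
In this embodiment of the present invention, the MDS can still determine the identifiers of N virtual machines when an anti-affinity group includes fewer than N virtual machines, which improves the reliability of the distributed storage system.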
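Optionally, before the determining, by the MDS, identifiers of N virtual machines according to grouping information, the method further includes:
receiving, by the MDS, the identifiers of the M virtual machines sent by a virtual machine manager;
dividing, by the MDS, the M virtual machines into the multiple anti-affinity groups, and recording in the grouping information the mapping between the multiple anti-affinity groups and the identifiers of the M virtual machines; and
sending, by the MDS, the grouping information to the virtual machine manager, so that the virtual machine manager deploys the M virtual machines according to the grouping information.
In this embodiment of the present invention, the MDS can group the identifiers of the M virtual machines received from the virtual machine manager and send the grouping information to the virtual machine manager, so that the virtual machine manager can deploy the M virtual machines based on it. This ensures that the actual distribution of the M virtual machines across the physical servers is consistent with the grouping information, which further improves the stability and reliability of the distributed storage system.
Optionally, the M virtual machines belong to multiple affinity groups, the virtual machines in each affinity group have affinity with one another, and the grouping information records a mapping between the multiple affinity groups and the identifiers of the M virtual machines.
In this embodiment of the present invention, the M virtual machines may also belong to multiple affinity groups. That is, virtual machines among the M virtual machines can be divided into anti-affinity groups according to anti-affinity and, at the same time, into affinity groups according to affinity, so that different storage requirements can be met and the flexibility of the distributed storage system is improved.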
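According to a second aspect, a data storage method is provided. At least M virtual machines are deployed on multiple physical servers, the M virtual machines are respectively deployed as M SNs of a distributed storage system, and M is a positive integer greater than or equal to 2. The method includes:
receiving, by an MDS of the distributed storage system, the identifiers of the M virtual machines sent by a virtual machine manager;
dividing, by the MDS, the M virtual machines into multiple anti-affinity groups, where the virtual machines in each anti-affinity group have anti-affinity with one another;
recording, by the MDS, in grouping information a mapping between the multiple anti-affinity groups and the identifiers of the M virtual machines, where the grouping information is used by the MDS to determine identifiers of N virtual machines from at least one of the anti-affinity groups when a client requests to store data, the N virtual machines are used to store N copies of the data that the client specifies to be stored, and N is a positive integer greater than or equal to 1; and
sending, by the MDS, the grouping information to the virtual machine manager, so that the virtual machine manager deploys the M virtual machines according to the grouping information.
In this embodiment of the present invention, first, the MDS can group the identifiers of the M virtual machines received from the virtual machine manager and send the grouping information to the virtual machine manager, so that the virtual machine manager can deploy the M virtual machines based on it. This ensures that the actual distribution of the M virtual machines across the physical servers is consistent with the grouping information, which improves the stability and reliability of the distributed storage system. Second, because the MDS does not need to obtain the physical addresses of the physical servers or the distribution of virtual machines on each physical server, neither the location of each physical server nor the distribution of virtual machines on it is disclosed. This improves the stability and security of the physical servers and further improves the stability and security of the virtualization environment and the distributed storage system deployed in it.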
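According to a third aspect, a data storage method is provided. At least M virtual machines are deployed on multiple physical servers, the M virtual machines are respectively deployed as M SNs of a distributed storage system, and M is a positive integer greater than or equal to 2. The method includes:
sending, by a virtual machine manager, the identifiers of the M virtual machines to an MDS of the distributed storage system;
receiving, by the virtual machine manager, grouping information sent by the MDS, where the grouping information records a mapping between multiple anti-affinity groups and the identifiers of the M virtual machines, the virtual machines in each anti-affinity group have anti-affinity with one another, the grouping information is used by the MDS to determine identifiers of N virtual machines from at least one of the anti-affinity groups when a client requests to store data, the N virtual machines are used to store N copies of the data that the client specifies to be stored, and N is a positive integer greater than or equal to 1; and
deploying, by the virtual machine manager, the M virtual machines according to the grouping information.
In this embodiment of the present invention, the virtual machine manager can send the identifiers of the M virtual machines deployed as SNs to the MDS and deploy the M virtual machines according to the grouping information received for them. This ensures that the actual distribution of the M virtual machines across the physical servers is consistent with the grouping information, which improves the stability and reliability of the distributed storage system. In addition, because the physical addresses of the physical servers and the distribution of virtual machines on each physical server do not need to be sent to the MDS, neither the location of each physical server nor the distribution of virtual machines on it is disclosed. This improves the stability and security of the physical servers and further improves the stability and security of the virtualization environment and the distributed storage system deployed in it.
Further, the M virtual machines belong to multiple affinity groups, the virtual machines in each affinity group have affinity with one another, and the grouping information records a mapping between the multiple affinity groups and the identifiers of the M virtual machines.
In this embodiment of the present invention, the virtual machine manager can deploy the M virtual machines according to the grouping information, and the deployed M virtual machines may also belong to multiple affinity groups. That is, virtual machines among the M virtual machines can be divided into anti-affinity groups according to anti-affinity and, at the same time, into affinity groups according to affinity, so that different storage requirements can be met and the flexibility of the distributed storage system is improved.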
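According to a fourth aspect, a data storage apparatus is provided. The data storage apparatus has the function of implementing the data storage method of the first aspect and includes at least one module configured to implement the data storage method provided in the first aspect.
According to a fifth aspect, a data storage apparatus is provided. The data storage apparatus has the function of implementing the data storage method of the second aspect and includes at least one module configured to implement the data storage method provided in the second aspect.
According to a sixth aspect, a data storage apparatus is provided. The data storage apparatus has the function of implementing the data storage method of the third aspect and includes at least one module configured to implement the data storage method provided in the third aspect.
According to a seventh aspect, a data storage apparatus is provided. The structure of the data storage apparatus includes a processor and a memory. The memory is configured to store a program that supports the data storage apparatus in performing the data storage method provided in the first aspect, and/or to store data involved in implementing that method. The processor is configured to execute the program stored in the memory. The data storage apparatus may further include a communication bus, which is used to establish a connection between the processor and the memory.
According to an eighth aspect, a data storage apparatus is provided. The structure of the data storage apparatus includes a processor and a memory. The memory is configured to store a program that supports the data storage apparatus in performing the data storage method provided in the second aspect, and/or to store data involved in implementing that method. The processor is configured to execute the program stored in the memory. The data storage apparatus may further include a communication bus, which is used to establish a connection between the processor and the memory.
According to a ninth aspect, a data storage apparatus is provided. The structure of the data storage apparatus includes a processor and a memory. The memory is configured to store a program that supports the data storage apparatus in performing the data storage method provided in the third aspect, and/or to store data involved in implementing that method. The processor is configured to execute the program stored in the memory. The data storage apparatus may further include a communication bus, which is used to establish a connection between the processor and the memory.
According to a tenth aspect, an embodiment of the present invention provides a computer storage medium, configured to store computer software instructions used by the data storage apparatus provided in any one of the fourth to ninth aspects, or to store a program designed to perform the data storage method of any one of the first to third aspects.
The technical solutions provided in the embodiments of the present invention have the following beneficial effects: the MDS in the distributed storage system can receive a data storage request from the client and, according to the stored grouping information, determine from at least one anti-affinity group the identifiers of N virtual machines that meet the data storage requirement, so that the client can store data on the virtual machines that correspond to those identifiers and are deployed as SNs. Because the grouping information records a mapping between multiple anti-affinity groups and the identifiers of the M virtual machines, the MDS does not need to obtain the physical addresses of the physical servers or the distribution of virtual machines on each physical server. Therefore, neither the location of each physical server nor the distribution of virtual machines on it is disclosed, which improves the stability and security of the physical servers and, in turn, of the virtualization environment and the distributed storage system deployed in it.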
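Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the following briefly introduces the accompanying drawings required for describing the embodiments. Apparently, the accompanying drawings in the following description show merely some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is an architecture diagram of a distributed storage system in a virtualization environment according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a physical server according to an embodiment of the present invention;
FIG. 3A is a flowchart of a data storage method according to an embodiment of the present invention;
FIG. 3B is a schematic diagram of another physical server according to an embodiment of the present invention;
FIG. 3C is a schematic diagram of still another physical server according to an embodiment of the present invention;
FIG. 3D is a schematic diagram of still another physical server according to an embodiment of the present invention;
FIG. 4A is a schematic structural diagram of a data storage apparatus according to an embodiment of the present invention;
FIG. 4B is a schematic structural diagram of another data storage apparatus according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of still another data storage apparatus according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of still another data storage apparatus according to an embodiment of the present invention.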
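Detailed Description
To make the objectives, technical solutions, and advantages of the present invention clearer, the following further describes the implementations of the present invention in detail with reference to the accompanying drawings.
FIG. 1 is an architecture diagram of a distributed storage system in a virtualization environment according to an example embodiment. Referring to FIG. 1, the distributed storage system includes software components such as a client, an MDS, and multiple SNs (only eight are shown in FIG. 1). The virtualization environment is built on multiple physical servers by using a virtualization technology, and a virtual machine manager deploys several virtual machines on the physical servers. Any two of the physical servers may be connected over a network.
In this embodiment of the present invention, some or all of the SNs in the distributed storage system may be deployed in any of the several virtual machines.
Optionally, the client, the MDS, and the SNs of the distributed storage system may each be deployed in any of the several virtual machines.
Optionally, the client and the MDS of the distributed storage system may be deployed on physical servers. Some of the SNs are deployed in any of the several virtual machines, and at least one of the remaining SNs may be deployed on a physical server.
In this embodiment of the present invention, the client may communicate with the MDS to send a data storage request to it and to receive the identifiers, allocated by the MDS, of virtual machines deployed as SNs. The client may also communicate with any of the SNs to store data on, or read data from, some of them.
The MDS may communicate with the virtual machine manager to query, through it, the identifiers of the virtual machines deployed as SNs on the physical servers, or to receive those identifiers from it.
The identifier of a virtual machine uniquely identifies that virtual machine among the multiple physical servers. It may be the virtual machine's Internet Protocol (IP) address, port number, name, or the like; in practice, it may also be any other identifier that uniquely identifies a virtual machine among the physical servers.
It should further be noted that, in practice, the virtual machine manager may be deployed on a physical device other than the multiple physical servers, for example, on another physical server. It may also, as required, be deployed on one of the multiple physical servers.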
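FIG. 2 is a schematic structural diagram of a physical server according to an embodiment of the present invention. The physical server may be any physical server in FIG. 1. Referring to FIG. 2, the physical server includes at least one processor 201, a communication bus 202, a memory 203, and at least one communication interface 204.
The processor 201 may be a central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits that can execute programs for implementing the embodiments of the present invention.
The communication bus 202 may include a path for transferring data between the components of the physical server (for example, the processor 201, the memory 203, and the communication interface 204).
The memory 203 may be a read-only memory (ROM), a random access memory (RAM), another type of static storage device that can store static data and instructions, or another type of dynamic storage device that can store data and instructions. It may also be an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, and the like), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by the physical server, but is not limited thereto. The memory 203 may exist independently, be connected to the processor 201 through the communication bus 202, or be integrated with the processor 201.
The communication interface 204 uses any transceiver-type apparatus to communicate with other devices over a communication network. The communication network may be Ethernet, a radio access network (RAN), a wireless local area network (WLAN), or the like. The other device may be a device on which a client or an SN is deployed (for example, a movable computer device or a portable mobile terminal).
In a specific implementation, as an embodiment, the processor 201 may include one or more CPUs, for example, CPU0 and CPU1 shown in FIG. 2.
In a specific implementation, as an embodiment, the physical server may include multiple processors, for example, the processor 201 and the processor 205 shown in FIG. 2. Each of these processors may be a single-core processor (single-CPU) or a multi-core processor (multi-CPU). A processor here may refer to one or more devices, circuits, and/or processing cores for processing data.
The memory 203 is configured to store program code 210 for implementing the embodiments of the present invention, and the processor 201 is configured to execute the program code 210 stored in the memory 203. The physical server may implement the data storage method provided in the embodiment of FIG. 3A below by having the processor 201 execute the program code 210 in the memory 203.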
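FIG. 3A is a flowchart of a data storage method according to an embodiment of the present invention. At least M virtual machines are deployed on multiple physical servers, the M virtual machines are respectively deployed as M SNs of a distributed storage system, and M is a positive integer greater than or equal to 2. Referring to FIG. 3A, the method includes:
Step 301: The virtual machine manager queries, among multiple physical servers, the M virtual machines deployed as SNs, where the multiple physical servers are the physical servers on which the virtualization environment is built.
Because M SNs (which may be all or some of the SNs of the distributed storage system) are deployed in virtual machines on the physical servers, the running status of these virtual machines directly affects the stability and reliability of the distributed storage system. Therefore, to manage the virtual machines deployed on the physical servers and thereby ensure the stability and reliability of the distributed storage system, the virtual machine manager may query, among the physical servers, the M virtual machines deployed as SNs.
It should be noted that the virtual machine manager may perform the operation of step 301 on receiving a virtual machine query request sent by the MDS.
It should further be noted that the virtual machine manager may obtain and store the identifier of a virtual machine at the time it deploys that virtual machine on a physical server. In practice, the virtual machine manager may also obtain the identifiers of the M virtual machines deployed as SNs in other ways. In one possible implementation, the virtual machine manager sends a virtual machine identifier acquisition request to each of the physical servers, and on receiving the request, each physical server sends the identifiers of its virtual machines deployed as SNs to the virtual machine manager.
For example, the distributed storage system is deployed on three physical servers, and the virtual machine manager determines that the identifiers of the virtual machines deployed as SNs on the three physical servers are VM1, VM2, VM3, VM4, VM5, VM6, VM7, VM8, VM9, VM10, VM11, and VM12.
Step 302: The virtual machine manager sends the identifiers of the M virtual machines to the MDS in the distributed storage system.
Because the identifiers obtained by the virtual machine manager are those of virtual machines deployed as SNs, the virtual machine manager may send them to the MDS so that the distributed storage system can store data in those SNs.
It should be noted that the MDS may send a virtual machine query request to the virtual machine manager, so that the virtual machine manager, through steps 301 and 302, queries the identifiers of the M virtual machines deployed as SNs among the physical servers and sends them to the MDS. This ensures that the MDS can look up the M virtual machines deployed as SNs. In practice, the MDS may also look them up in other ways. In one possible implementation, the MDS stores the virtual machine identifiers the first time it receives them from the virtual machine manager, and afterwards looks up the virtual machines deployed as SNs in its local storage.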
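Step 303: The MDS receives the identifiers of the M virtual machines sent by the virtual machine manager and groups the M virtual machines.
The M virtual machines are virtual machines deployed as SNs, and they may include both virtual machines deployed on the same physical server and virtual machines deployed on different physical servers. The distributed storage system may also have different requirements when storing different data; for example, some data needs to be stored in SNs on virtual machines deployed on the same physical server. Therefore, to make it easier for the distributed storage system to store data, the MDS may group the M virtual machines. In addition, because the MDS does not need to obtain the physical addresses of the physical servers or the distribution of virtual machines on each physical server, neither the location of each physical server nor the distribution of virtual machines on it is disclosed. This improves the stability and security of the physical servers and, in turn, of the virtualization environment built on the physical servers and the distributed storage system deployed in it.
The MDS may group the M virtual machines in at least one of the following two possible implementations:
In a first possible implementation, the virtual machines in each anti-affinity group have anti-affinity with one another; that is, the virtual machines in each anti-affinity group are deployed on different servers. Therefore, when data is stored in the SNs deployed on virtual machines of the same anti-affinity group, if the physical server hosting one SN that stores the data fails, the data can still be obtained from the SNs on virtual machines deployed on other physical servers, which improves the reliability of storing the data. Accordingly, the MDS may divide the M virtual machines into multiple anti-affinity groups and record in the grouping information the mapping between the multiple anti-affinity groups and the identifiers of the M virtual machines.
The number X of virtual machines in each anti-affinity group is usually greater than or equal to the minimum number of data copies required for reliable storage, and X is less than M. X may be determined by the MDS before it divides the M virtual machines into anti-affinity groups; in one possible policy, the MDS obtains X from a value entered by a user.
It should further be noted that the minimum number of data copies for reliable storage may be determined by the user based on practical experience with storing data; for example, it may be 2 or 3.
For example, the MDS finds that the identifiers of 12 virtual machines are VM1, VM2, VM3, VM4, VM5, VM6, VM7, VM8, VM9, VM10, VM11, and VM12, and X is 3. The MDS divides the 12 virtual machines into four anti-affinity groups: anti-affinity group 1 includes VM1, VM5, and VM9; anti-affinity group 2 includes VM2, VM6, and VM10; anti-affinity group 3 includes VM3, VM7, and VM11; and anti-affinity group 4 includes VM4, VM8, and VM12. The mapping recorded in the grouping information between the four anti-affinity groups and the identifiers of the 12 virtual machines is: anti-affinity group 1: VM1, VM5, VM9; anti-affinity group 2: VM2, VM6, VM10; anti-affinity group 3: VM3, VM7, VM11; anti-affinity group 4: VM4, VM8, VM12.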
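The split in the example above can be sketched in Python. This is a minimal illustration under stated assumptions, not the grouping procedure prescribed by the embodiment: the function name `partition_anti_affinity` and the rule of dealing virtual machines out to the groups in rotation are hypothetical, chosen only because they reproduce the four groups listed above.

```python
def partition_anti_affinity(vm_ids, group_size):
    """Split vm_ids into anti-affinity groups of group_size VMs each.

    Hypothetical helper: VMs are dealt out to the groups in rotation,
    which reproduces the grouping of the example (X = 3, 12 VMs).
    """
    if group_size <= 0 or len(vm_ids) % group_size != 0:
        raise ValueError("this sketch requires group_size to divide the VM count")
    group_count = len(vm_ids) // group_size
    groups = {number: [] for number in range(1, group_count + 1)}
    for index, vm_id in enumerate(vm_ids):
        # VM at position 0 goes to group 1, position 1 to group 2, and so on.
        groups[index % group_count + 1].append(vm_id)
    return groups


vm_ids = [f"VM{i}" for i in range(1, 13)]
anti_affinity_groups = partition_anti_affinity(vm_ids, group_size=3)
for number, members in anti_affinity_groups.items():
    print(f"anti-affinity group {number}: {', '.join(members)}")
```

Note that the function works on identifiers only; it needs no physical server address, which is the property the embodiment relies on.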
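In a second possible implementation, the virtual machines in each affinity group have affinity with one another, and all virtual machines in an affinity group may be deployed on the same server. Therefore, when data is stored in the SNs on virtual machines of the same affinity group, the data can be written to or read from those SNs quickly, which improves the efficiency of storing the data. Accordingly, the MDS may divide the M virtual machines into multiple affinity groups and record in the grouping information the mapping between the multiple affinity groups and the identifiers of the M virtual machines.
The number Y of virtual machines in each affinity group may be greater than or equal to the minimum number of data copies required for reliable storage, and Y is less than M. Y may be determined by the MDS before it divides the M virtual machines into affinity groups; in one possible policy, the MDS obtains Y from a value entered by a user.
For example, the MDS finds that the identifiers of 12 virtual machines are VM1, VM2, VM3, VM4, VM5, VM6, VM7, VM8, VM9, VM10, VM11, and VM12, and Y is 4. The MDS divides the 12 virtual machines into three affinity groups: affinity group 1 includes VM1, VM2, VM3, and VM4; affinity group 2 includes VM5, VM6, VM7, and VM8; and affinity group 3 includes VM9, VM10, VM11, and VM12. The mapping recorded in the grouping information between the three affinity groups and the identifiers of the 12 virtual machines is: affinity group 1: VM1, VM2, VM3, VM4; affinity group 2: VM5, VM6, VM7, VM8; affinity group 3: VM9, VM10, VM11, VM12.
It should be noted that, to balance the reliability and the efficiency of data storage, the MDS may also divide the M virtual machines both into multiple affinity groups according to affinity and into multiple anti-affinity groups according to anti-affinity, and record in the grouping information both the mapping between the affinity groups and the identifiers of the M virtual machines and the mapping between the anti-affinity groups and the identifiers of the M virtual machines.
For example, the MDS finds that the identifiers of 12 virtual machines are VM1, VM2, VM3, VM4, VM5, VM6, VM7, VM8, VM9, VM10, VM11, and VM12, X is 3, and Y is 4. The MDS divides the 12 virtual machines into four anti-affinity groups and three affinity groups. The anti-affinity groups are: anti-affinity group 1, which includes VM1, VM5, and VM9; anti-affinity group 2, which includes VM2, VM6, and VM10; anti-affinity group 3, which includes VM3, VM7, and VM11; and anti-affinity group 4, which includes VM4, VM8, and VM12. The affinity groups are: affinity group 1, which includes VM1, VM2, VM3, and VM4; affinity group 2, which includes VM5, VM6, VM7, and VM8; and affinity group 3, which includes VM9, VM10, VM11, and VM12. The mapping recorded in the grouping information between the four anti-affinity groups and the identifiers of the 12 virtual machines is: anti-affinity group 1: VM1, VM5, VM9; anti-affinity group 2: VM2, VM6, VM10; anti-affinity group 3: VM3, VM7, VM11; anti-affinity group 4: VM4, VM8, VM12. The mapping recorded between the three affinity groups and the identifiers of the 12 virtual machines is: affinity group 1: VM1, VM2, VM3, VM4; affinity group 2: VM5, VM6, VM7, VM8; affinity group 3: VM9, VM10, VM11, VM12.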
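One way to picture grouping information that records both kinds of mapping is the following Python sketch. The class name `GroupingInfo`, its field names, and the `groups_of` lookup are assumptions made for illustration; the embodiment does not specify a data layout. The slicing expressions simply reproduce the groups of the example with X = 3 and Y = 4.

```python
from dataclasses import dataclass, field


@dataclass
class GroupingInfo:
    """Hypothetical record kept by the MDS: group number -> VM identifiers."""
    anti_affinity: dict = field(default_factory=dict)
    affinity: dict = field(default_factory=dict)

    def groups_of(self, vm_id):
        """Return (anti-affinity group, affinity group) numbers for one VM."""
        anti = next(n for n, vms in self.anti_affinity.items() if vm_id in vms)
        aff = next(n for n, vms in self.affinity.items() if vm_id in vms)
        return anti, aff


vm_ids = [f"VM{i}" for i in range(1, 13)]
info = GroupingInfo(
    # X = 3: four anti-affinity groups, every fourth VM lands in the same group.
    anti_affinity={n + 1: vm_ids[n::4] for n in range(4)},
    # Y = 4: three affinity groups of four consecutive VMs.
    affinity={n + 1: vm_ids[4 * n:4 * n + 4] for n in range(3)},
)
print(info.groups_of("VM6"))
```

The record contains virtual machine identifiers and group numbers only, so it carries no information about which physical server hosts which virtual machine.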
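Step 304: The MDS sends the grouping information to the virtual machine manager.
After the MDS groups the identifiers of the M virtual machines, the actual distribution of the M virtual machines across the physical servers may not match the grouping information. Therefore, to improve the reliability of the distributed storage system, the MDS may send the grouping information to the virtual machine manager so that the virtual machine manager deploys the M virtual machines according to it. The deployment may be implemented by creating virtual machines or by migrating existing virtual machines according to the grouping information.
It should be noted that the virtual machine manager may include a configuration interface and receive the grouping information through that interface.
In addition, so that the distributed storage system can store data according to the grouped virtual machines, the MDS may also store the grouping information.
Step 305: The virtual machine manager receives the grouping information sent by the MDS and deploys the M virtual machines according to it.
Because the actual distribution of the M virtual machines across the physical servers may not match the grouping information, the virtual machine manager may deploy the M virtual machines according to the grouping information to improve the reliability of the distributed storage system. Moreover, the virtual machine manager does not need to notify the MDS while deploying the M virtual machines, and after the deployment it does not need to inform the MDS of the physical addresses of the physical servers or the distribution of virtual machines on each physical server. Therefore, in this embodiment of the present invention, the virtual machine manager can deploy the M virtual machines on the physical servers transparently, without the MDS being aware of the deployment.
It should be noted that, when deploying the M virtual machines according to the grouping information, the virtual machine manager may migrate at least one already created virtual machine among the M virtual machines, or may create virtual machines on the physical servers, according to the grouping information, thereby completing the deployment of the M virtual machines.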
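The virtual machine manager may deploy the M virtual machines according to the grouping information in at least one of the following two possible implementations:
In a first possible implementation, when the grouping information records a mapping between multiple anti-affinity groups and the identifiers of the M virtual machines, the virtual machine manager determines the virtual machines that are in the same anti-affinity group and migrates them to different physical servers among the multiple physical servers.
For example, before the virtual machine manager migrates the virtual machines, their distribution across the physical servers is as shown in FIG. 3B: physical server 1 hosts VM1, VM4, VM5, and VM7; physical server 2 hosts VM2, VM6, VM8, and VM10; and physical server 3 hosts VM3, VM9, VM11, and VM12. The grouping information received by the virtual machine manager is: anti-affinity group 1: VM1, VM5, VM9; anti-affinity group 2: VM2, VM6, VM10; anti-affinity group 3: VM3, VM7, VM11; anti-affinity group 4: VM4, VM8, VM12. The virtual machine manager therefore migrates virtual machines among the three physical servers. After the migration, the distribution is as shown in FIG. 3C: physical server 1 hosts VM1, VM3, VM6, and VM8; physical server 2 hosts VM5, VM7, VM10, and VM12; and physical server 3 hosts VM2, VM4, VM9, and VM11.
In a second possible implementation, when the grouping information records a mapping between multiple affinity groups and the identifiers of the M virtual machines, the virtual machine manager determines the virtual machines that are in the same affinity group and migrates them to the same physical server among the multiple physical servers.
For example, before the virtual machine manager migrates the virtual machines, their distribution across the physical servers is as shown in FIG. 3B: physical server 1 hosts VM1, VM4, VM5, and VM7; physical server 2 hosts VM2, VM6, VM8, and VM10; and physical server 3 hosts VM3, VM9, VM11, and VM12. The grouping information received by the virtual machine manager is: affinity group 1: VM1, VM2, VM3, VM4; affinity group 2: VM5, VM6, VM7, VM8; affinity group 3: VM9, VM10, VM11, VM12. The virtual machine manager therefore migrates virtual machines among the three physical servers. After the migration, the distribution is as shown in FIG. 3D: physical server 1 hosts VM1, VM2, VM3, and VM4; physical server 2 hosts VM5, VM6, VM7, and VM8; and physical server 3 hosts VM9, VM10, VM11, and VM12.
In addition, when the grouping information records both a mapping between multiple anti-affinity groups and the identifiers of the M virtual machines and a mapping between multiple affinity groups and the identifiers of the M virtual machines, the virtual machine manager needs both to migrate the virtual machines of each anti-affinity group to different physical servers and to migrate the virtual machines of each affinity group to the same physical server.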
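Whether a given placement honors the grouping information can be checked mechanically. The following Python sketch is an illustration only: the function names and the `{server: [vm, ...]}` placement format are assumptions, and the three placements are transcribed from the descriptions of FIG. 3B, FIG. 3C, and FIG. 3D above. Such a check would run on the virtual machine manager side, since only the manager knows the placement.

```python
def server_of(placement):
    """Invert {server: [vm, ...]} into {vm: server}."""
    return {vm: server for server, vms in placement.items() for vm in vms}


def satisfies_anti_affinity(placement, groups):
    """True if no two VMs of the same anti-affinity group share a server."""
    location = server_of(placement)
    return all(
        len({location[vm] for vm in members}) == len(members)
        for members in groups.values()
    )


def satisfies_affinity(placement, groups):
    """True if all VMs of each affinity group sit on one server."""
    location = server_of(placement)
    return all(
        len({location[vm] for vm in members}) == 1
        for members in groups.values()
    )


def vms(*numbers):
    """Shorthand: vms(1, 5) -> ['VM1', 'VM5']."""
    return [f"VM{n}" for n in numbers]


anti_affinity = {1: vms(1, 5, 9), 2: vms(2, 6, 10), 3: vms(3, 7, 11), 4: vms(4, 8, 12)}
affinity = {1: vms(1, 2, 3, 4), 2: vms(5, 6, 7, 8), 3: vms(9, 10, 11, 12)}

fig_3b = {1: vms(1, 4, 5, 7), 2: vms(2, 6, 8, 10), 3: vms(3, 9, 11, 12)}
fig_3c = {1: vms(1, 3, 6, 8), 2: vms(5, 7, 10, 12), 3: vms(2, 4, 9, 11)}
fig_3d = {1: vms(1, 2, 3, 4), 2: vms(5, 6, 7, 8), 3: vms(9, 10, 11, 12)}

for name, placement in (("3B", fig_3b), ("3C", fig_3c), ("3D", fig_3d)):
    print(
        f"FIG. {name}:",
        "anti-affinity ok" if satisfies_anti_affinity(placement, anti_affinity) else "anti-affinity violated",
        "/",
        "affinity ok" if satisfies_affinity(placement, affinity) else "affinity violated",
    )
```

With the example groups, the layout of FIG. 3B violates both kinds of constraint, the layout of FIG. 3C satisfies only the anti-affinity groups, and the layout of FIG. 3D satisfies both at once.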
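In steps 301 to 305, the MDS groups the M virtual machines on the physical servers and stores the grouping information, and the virtual machine manager deploys the M virtual machines, which ensures that the distribution of the M virtual machines across the physical servers is consistent with the grouping information stored by the MDS. Therefore, in the following steps, data can be stored based on the stored grouping information.
Step 306: The MDS receives a data storage request from the client and determines that the number of copies to be stored of the data specified by the data storage request is N, where N is a positive integer greater than or equal to 1.
So that data can be stored in the distributed storage system, the MDS may receive the data storage request sent by the client and determine the number of copies to be stored of the data specified by the request.
The data storage request may include the block identifier of the specified data or the file identifier of the file to which the specified data belongs.
It should be noted that the specified data is the data to be stored, and it may be a data block.
It should further be noted that the block identifier uniquely identifies a data block. It may be the identification (ID) of the data block; in practice, it may also be any other identifier that uniquely identifies the data block.
It should further be noted that the file identifier uniquely identifies the file. It may be the file name or the like; in practice, it may also be any other identifier that uniquely identifies the file.
Further, to keep the storage of the specified data flexible, the data storage request may carry the number N of copies of the specified data to be stored; that is, N may be determined by the client.
N is a positive integer greater than or equal to 1. N may be determined by the client before it sends the data storage request to the MDS. In one possible implementation, the client receives a value entered by a user and uses it as N. In practice, the client may also determine N in other ways.
In another possible implementation, to improve the reliability of data storage, the data storage request may not carry N; that is, N may be determined by the MDS. In one possible policy, N is set by the relevant technical personnel when the MDS is deployed.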
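The two ways of determining N described in step 306 can be combined in a few lines of Python. This is a sketch under stated assumptions: the request type `DataStorageRequest`, its field names, the function `resolve_copy_count`, and the default value 3 are all hypothetical and are not taken from the embodiment.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed value that would be configured when the MDS is deployed.
MDS_DEFAULT_COPIES = 3


@dataclass
class DataStorageRequest:
    """Hypothetical request: a block identifier and an optional copy count."""
    block_id: str
    copies: Optional[int] = None  # None means the client leaves N to the MDS


def resolve_copy_count(request, default=MDS_DEFAULT_COPIES):
    """Return N: the client's value if the request carries one, else the MDS default."""
    n = default if request.copies is None else request.copies
    if n < 1:
        raise ValueError("N must be a positive integer")
    return n


print(resolve_copy_count(DataStorageRequest("block1", copies=2)))
print(resolve_copy_count(DataStorageRequest("block2")))
```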
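Step 307: The MDS determines the identifiers of N virtual machines according to the grouping information and sends response information for the data storage request to the client, where the response information includes the identifiers of the N virtual machines.
To make it easier for the client to store data, the MDS may determine N virtual machines from the M virtual machines according to the grouping information and send response information to the client. The response information instructs the client to store N copies of the data specified by the data storage request on the N virtual machines; specifically, each of the N virtual machines stores one copy of the specified data. In this way, the MDS allocates to the client the virtual machines hosting the SNs that will store the specified data.
It should be noted that the N virtual machines corresponding to the identifiers belong to at least one anti-affinity group and are among the M virtual machines; that is, the M virtual machines include the N virtual machines.
The MDS may determine the identifiers of the N virtual machines according to the grouping information in either of the following two possible implementations.
In a first possible implementation, to improve the reliability and security of data storage, the MDS may preferentially allocate N virtual machines that have anti-affinity with one another to the client. When X is greater than or equal to N, the MDS determines the identifiers of the N virtual machines from one of the multiple anti-affinity groups. When X is less than N, the MDS determines the identifiers of the N virtual machines from at least two of the anti-affinity groups.
The MDS may determine X virtual machines from any one of the at least two anti-affinity groups and determine N-X virtual machines from the other anti-affinity groups among them, thereby obtaining N virtual machines.
Further, the MDS determines the identifiers of the N virtual machines among the at least two anti-affinity groups according to affinity.
The MDS may determine X virtual machines from any one of the at least two anti-affinity groups. Then, from the other anti-affinity groups among them, the MDS determines according to affinity N-X virtual machines that are in the same affinity group as at least one of the X virtual machines, thereby obtaining N virtual machines.
It should be noted that, as described above, the grouping information may record a mapping between multiple affinity groups and the identifiers of the M virtual machines. Therefore, the MDS can select, according to affinity, N-X virtual machines from the affinity group of at least one of the X virtual machines.
For example, the data storage request received by the MDS carries the data block identifier block1 and the number of copies N = 2. Because X is 3, that is, X is greater than or equal to N, the MDS determines two virtual machines from anti-affinity group 1 based on the grouping information; their identifiers are VM1 and VM5.
As another example, the data storage request received by the MDS carries the data block identifier block2 and the number of copies N = 4. Because X is 3, that is, X is less than N, the MDS determines three virtual machines from anti-affinity group 1 based on the grouping information; their identifiers are VM1, VM5, and VM9. Then, according to affinity group 1, to which VM1 belongs, the MDS determines one virtual machine from anti-affinity group 2; its identifier is VM2, and VM2 belongs to affinity group 1. Four virtual machines are thereby obtained.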
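The selection logic of the first implementation can be sketched in Python as follows. This is an illustration under stated assumptions rather than the embodiment's algorithm: the function name `select_vm_ids` is hypothetical, the sketch always starts from the lowest-numbered anti-affinity group (the text allows any group), and it ranks the remaining candidates so that virtual machines sharing an affinity group with an already chosen virtual machine come first.

```python
def select_vm_ids(n, anti_affinity, affinity=None):
    """Pick n VM identifiers, preferring VMs of a single anti-affinity group.

    anti_affinity / affinity: {group number: [vm id, ...]}.
    """
    ordered = [anti_affinity[number] for number in sorted(anti_affinity)]
    total = sum(len(members) for members in ordered)
    if not 1 <= n <= total:
        raise ValueError("n must be between 1 and the number of grouped VMs")

    first = ordered[0]
    if len(first) >= n:
        # X >= N: one anti-affinity group is enough.
        return first[:n]

    # X < N: take the whole first group, then draw N - X VMs from other groups.
    chosen = list(first)
    preferred = set()
    for members in (affinity or {}).values():
        if any(vm in chosen for vm in members):
            preferred.update(members)

    candidates = [vm for members in ordered[1:] for vm in members]
    # Stable sort: VMs sharing an affinity group with a chosen VM come first.
    candidates.sort(key=lambda vm: vm not in preferred)
    return chosen + candidates[: n - len(chosen)]


anti_affinity = {
    1: ["VM1", "VM5", "VM9"],
    2: ["VM2", "VM6", "VM10"],
    3: ["VM3", "VM7", "VM11"],
    4: ["VM4", "VM8", "VM12"],
}
affinity = {
    1: ["VM1", "VM2", "VM3", "VM4"],
    2: ["VM5", "VM6", "VM7", "VM8"],
    3: ["VM9", "VM10", "VM11", "VM12"],
}
print(select_vm_ids(2, anti_affinity, affinity))  # block1, N = 2
print(select_vm_ids(4, anti_affinity, affinity))  # block2, N = 4
```

For the two requests in the examples above, the sketch returns VM1 and VM5 for N = 2, and VM1, VM5, VM9, and VM2 for N = 4. In this particular data set every remaining virtual machine shares an affinity group with one of the three chosen ones, so the affinity ranking does not change the order; it matters when only some candidates do.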
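In a second possible implementation, to improve the efficiency of data storage, the MDS may preferentially allocate N virtual machines that have affinity with one another to the client. If Y is greater than or equal to N, the MDS determines the identifiers of the N virtual machines from one of the multiple affinity groups. If Y is less than N, the MDS determines the identifiers of the N virtual machines from at least two of the affinity groups.
It should be noted that the step in which the MDS determines the identifiers of the N virtual machines from at least two affinity groups is similar to the step in the first possible implementation in which the MDS determines them from at least two anti-affinity groups, and is not described again here.
Further, to make it easier to manage the specified data after it is stored, after determining the identifiers of the N virtual machines according to the grouping information, the MDS may also store the correspondence between the block identifier of the specified data and the identifiers of the N virtual machines, and the correspondence between the block identifier of the specified data and the file identifier of the file to which the specified data belongs.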
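The two correspondences that the MDS may keep after allocation can be modeled with plain dictionaries. The following Python sketch is illustrative only: the class name `BlockIndex`, its methods, and the file identifier `file-a` are hypothetical and do not appear in the embodiment.

```python
from collections import defaultdict


class BlockIndex:
    """Hypothetical MDS-side index: block -> VM identifiers, block -> file."""

    def __init__(self):
        self._vms_by_block = {}
        self._file_by_block = {}
        self._blocks_by_file = defaultdict(list)

    def record(self, block_id, vm_ids, file_id):
        """Remember where the copies of a block go and which file owns it."""
        if block_id not in self._file_by_block:
            self._blocks_by_file[file_id].append(block_id)
        self._vms_by_block[block_id] = list(vm_ids)
        self._file_by_block[block_id] = file_id

    def vms_for_block(self, block_id):
        """VM identifiers that hold a copy of the block."""
        return list(self._vms_by_block[block_id])

    def file_of(self, block_id):
        return self._file_by_block[block_id]

    def blocks_of_file(self, file_id):
        """Blocks of a file, in the order they were recorded."""
        return list(self._blocks_by_file.get(file_id, []))


index = BlockIndex()
index.record("block1", ["VM1", "VM5"], file_id="file-a")
index.record("block2", ["VM1", "VM5", "VM9", "VM2"], file_id="file-a")
print(index.vms_for_block("block2"))
print(index.blocks_of_file("file-a"))
```

Later operations such as reading or deleting a file can then resolve the file to its blocks and each block to the virtual machines that hold its copies, still without any physical server address.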
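Step 308: On receiving the identifiers of the N virtual machines, the client stores the specified data on the N virtual machines.
Because the identifiers of the N virtual machines are the identifiers of virtual machines that are deployed as SNs and that the MDS has allocated for the specified data, the client can store the specified data on the N virtual machines.
In addition, after the specified data is stored, the distributed storage system may manage it, for example, by reading, transferring, or deleting it. This is not specifically limited in this embodiment of the present invention.
In this embodiment of the present invention, first, the MDS in the distributed storage system can receive a data storage request from the client and, according to the stored grouping information, determine from at least one anti-affinity group the identifiers of N virtual machines that meet the data storage requirement, so that the client can store data on the virtual machines that correspond to those identifiers and are deployed as SNs. Because the grouping information records a mapping between multiple anti-affinity groups and the identifiers of the M virtual machines, the MDS does not need to obtain the physical addresses of the physical servers or the distribution of virtual machines on each physical server. Therefore, neither the location of each physical server nor the distribution of virtual machines on it is disclosed, which improves the stability and security of the physical servers and, in turn, of the virtualization environment and the distributed storage system deployed in it. Second, when determining the identifiers of the N virtual machines, the MDS can compare the number X of virtual machines in an anti-affinity group with N and, depending on the result, use a different policy to determine the identifiers, which improves the flexibility of the distributed storage system while preserving its reliability and stability. Finally, the MDS can group the identifiers of the M virtual machines received from the virtual machine manager and send the grouping information to the virtual machine manager, so that the virtual machine manager can deploy the M virtual machines based on it. This ensures that the actual distribution of the M virtual machines across the physical servers is consistent with the grouping information, which further improves the stability and reliability of the distributed storage system.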
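FIG. 4A is a schematic diagram of a data storage apparatus according to an embodiment of the present invention. At least M virtual machines are deployed on multiple physical servers, the M virtual machines are respectively deployed as M SNs of a distributed storage system, and M is a positive integer greater than or equal to 2. Referring to FIG. 4A, the apparatus includes a first receiving module 401, a first determining module 402, a second determining module 403, and a first sending module 404.
The first receiving module 401 is configured to perform the operation in step 306 in which the MDS of the distributed storage system receives a data storage request from the client.
The first determining module 402 is configured to perform the operation in step 306 in which the MDS determines that the number of copies to be stored of the data specified by the data storage request is N.
The second determining module 403 is configured to perform the operation in step 307 in which the MDS determines the identifiers of N virtual machines according to the grouping information.
The first sending module 404 is configured to perform the operation in step 307 in which the MDS sends response information for the data storage request to the client.
Optionally, the second determining module 403 includes:
a first determining submodule, configured to perform the operation in step 307 in which, if the number of virtual machine identifiers included in an anti-affinity group is greater than or equal to N, the MDS determines the identifiers of the N virtual machines from one of the multiple anti-affinity groups; and
a second determining submodule, configured to perform the operation in step 307 in which, if the number of virtual machine identifiers included in an anti-affinity group is less than N, the MDS determines the identifiers of the N virtual machines from at least two of the anti-affinity groups.
Optionally, the second determining submodule is further configured so that:
the MDS determines the identifiers of the N virtual machines among the at least two anti-affinity groups according to affinity.
Optionally, referring to FIG. 4B, the apparatus further includes:
a second receiving module 405, configured to perform the operation in step 303 in which the MDS receives the identifiers of the M virtual machines sent by the virtual machine manager;
a dividing module 406, configured to perform the operation in step 303 in which the MDS divides the M virtual machines into the multiple anti-affinity groups and records in the grouping information the mapping between the multiple anti-affinity groups and the identifiers of the M virtual machines; and
a second sending module 407, configured to perform the operation of step 304.
Optionally, the M virtual machines belong to multiple affinity groups, the virtual machines in each affinity group have affinity with one another, and the grouping information records a mapping between the multiple affinity groups and the identifiers of the M virtual machines.
In this embodiment of the present invention, the MDS in the distributed storage system can receive a data storage request from the client and, according to the stored grouping information, determine from at least one anti-affinity group the identifiers of N virtual machines that meet the data storage requirement, so that the client can store data on the virtual machines that correspond to those identifiers and are deployed as SNs. Because the grouping information records a mapping between multiple anti-affinity groups and the identifiers of the M virtual machines, the MDS does not need to obtain the physical addresses of the physical servers or the distribution of virtual machines on each physical server. Therefore, neither the location of each physical server nor the distribution of virtual machines on it is disclosed, which improves the stability and security of the physical servers and, in turn, of the virtualization environment and the distributed storage system deployed in it.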
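FIG. 5 is a schematic diagram of a data storage apparatus according to an embodiment of the present invention. At least M virtual machines are deployed on multiple physical servers, the M virtual machines are respectively deployed as M data nodes (SNs) of a distributed storage system, and M is a positive integer greater than or equal to 2. Referring to FIG. 5, the apparatus includes a receiving module 501, a dividing module 502, a recording module 503, and a sending module 504.
The receiving module 501 is configured to perform the operation in step 303 in which the MDS of the distributed storage system receives the identifiers of the M virtual machines sent by the virtual machine manager.
The dividing module 502 is configured to perform the operation in step 303 in which the MDS divides the M virtual machines into multiple anti-affinity groups.
The recording module 503 is configured to perform the operation in step 303 in which the MDS records in the grouping information the mapping between the multiple anti-affinity groups and the identifiers of the M virtual machines.
The sending module 504 is configured to perform the operation of step 304.
In this embodiment of the present invention, first, the MDS can group the identifiers of the M virtual machines received from the virtual machine manager and send the grouping information to the virtual machine manager, so that the virtual machine manager can deploy the M virtual machines based on it. This ensures that the actual distribution of the M virtual machines across the physical servers is consistent with the grouping information, which improves the stability and reliability of the distributed storage system. Second, because the MDS does not need to obtain the physical addresses of the physical servers or the distribution of virtual machines on each physical server, neither the location of each physical server nor the distribution of virtual machines on it is disclosed. This improves the stability and security of the physical servers and further improves the stability and security of the virtualization environment and the distributed storage system deployed in it.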
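FIG. 6 is a schematic diagram of a data storage apparatus according to an embodiment of the present invention. At least M virtual machines are deployed on multiple physical servers, the M virtual machines are respectively deployed as M data nodes (SNs) of a distributed storage system, and M is a positive integer greater than or equal to 2. Referring to FIG. 6, the apparatus includes a sending module 601, a receiving module 602, and a deploying module 603.
The sending module 601 is configured to perform the operation in step 302 in which the virtual machine manager sends the identifiers of the M virtual machines to the metadata node (MDS) of the distributed storage system.
The receiving module 602 is configured to perform the operation in step 305 in which the virtual machine manager receives the grouping information sent by the MDS.
The deploying module 603 is configured to perform the operation in step 305 in which the virtual machine manager deploys the M virtual machines according to the grouping information.
Optionally, the M virtual machines belong to multiple affinity groups, the virtual machines in each affinity group have affinity with one another, and the grouping information records a mapping between the multiple affinity groups and the identifiers of the M virtual machines.
In this embodiment of the present invention, the virtual machine manager can send the identifiers of the M virtual machines deployed as SNs to the MDS and deploy the M virtual machines according to the grouping information received for them. This ensures that the actual distribution of the M virtual machines across the physical servers is consistent with the grouping information, which improves the stability and reliability of the distributed storage system. In addition, because the physical addresses of the physical servers and the distribution of virtual machines on each physical server do not need to be sent to the MDS, neither the location of each physical server nor the distribution of virtual machines on it is disclosed. This improves the stability and security of the physical servers and further improves the stability and security of the virtualization environment and the distributed storage system deployed in it.
It should be noted that, when the data storage apparatus provided in the foregoing embodiments stores data, the division into the functional modules described above is used only as an example. In practice, the functions may be assigned to different functional modules as needed; that is, the internal structure of the apparatus may be divided into different functional modules to perform all or some of the functions described above. In addition, the data storage apparatus and the data storage method embodiments provided above are based on the same concept. For the specific implementation process, see the method embodiments; details are not described here again.
A person of ordinary skill in the art may understand that all or some of the steps of the foregoing embodiments may be implemented by hardware, or by a program instructing related hardware. The program may be stored in a computer-readable storage medium, and the storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing descriptions are merely preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.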

Claims (16)

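  1. A data storage method, wherein at least M virtual machines are deployed on multiple physical servers, the M virtual machines are respectively deployed as M data nodes (SNs) of a distributed storage system, and M is a positive integer greater than or equal to 2; and the method comprises:
    receiving, by a metadata node (MDS) of the distributed storage system, a data storage request from a client;
    determining, by the MDS, that the number of copies to be stored of the data specified by the data storage request is N, wherein N is a positive integer greater than or equal to 1;
    determining, by the MDS, identifiers of N virtual machines according to grouping information, wherein the N virtual machines corresponding to the identifiers belong to at least one anti-affinity group, the grouping information records a mapping between multiple anti-affinity groups and the identifiers of the M virtual machines, the virtual machines in each anti-affinity group have anti-affinity with one another, and the M virtual machines comprise the N virtual machines; and
    sending, by the MDS, response information for the data storage request to the client, wherein the response information comprises the identifiers of the N virtual machines and instructs the client to store N copies of the data specified by the data storage request on the N virtual machines.
  2. The method according to claim 1, wherein the determining, by the MDS, identifiers of N virtual machines according to grouping information comprises:
    if the number of virtual machine identifiers comprised in an anti-affinity group is greater than or equal to N, determining, by the MDS, the identifiers of the N virtual machines from one of the multiple anti-affinity groups; or
    if the number of virtual machine identifiers comprised in an anti-affinity group is less than N, determining, by the MDS, the identifiers of the N virtual machines from at least two of the anti-affinity groups.
  3. The method according to claim 2, wherein the determining, by the MDS, the identifiers of the N virtual machines from at least two of the anti-affinity groups comprises:
    determining, by the MDS, the identifiers of the N virtual machines among the at least two anti-affinity groups according to affinity.
  4. The method according to any one of claims 1 to 3, wherein before the determining, by the MDS, identifiers of N virtual machines according to grouping information, the method further comprises:
    receiving, by the MDS, the identifiers of the M virtual machines sent by a virtual machine manager;
    dividing, by the MDS, the M virtual machines into the multiple anti-affinity groups, and recording in the grouping information the mapping between the multiple anti-affinity groups and the identifiers of the M virtual machines; and
    sending, by the MDS, the grouping information to the virtual machine manager, so that the virtual machine manager deploys the M virtual machines according to the grouping information.
  5. The method according to claim 3 or 4, wherein the M virtual machines belong to multiple affinity groups, the virtual machines in each affinity group have affinity with one another, and the grouping information records a mapping between the multiple affinity groups and the identifiers of the M virtual machines.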
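  6. A data storage method, wherein at least M virtual machines are deployed on multiple physical servers, the M virtual machines are respectively deployed as M data nodes (SNs) of a distributed storage system, and M is a positive integer greater than or equal to 2; and the method comprises:
    receiving, by a metadata node (MDS) of the distributed storage system, the identifiers of the M virtual machines sent by a virtual machine manager;
    dividing, by the MDS, the M virtual machines into multiple anti-affinity groups, wherein the virtual machines in each anti-affinity group have anti-affinity with one another;
    recording, by the MDS, in grouping information a mapping between the multiple anti-affinity groups and the identifiers of the M virtual machines, wherein the grouping information is used by the MDS to determine identifiers of N virtual machines from at least one of the anti-affinity groups when a client requests to store data, the N virtual machines are used to store N copies of the data that the client specifies to be stored, and N is a positive integer greater than or equal to 1; and
    sending, by the MDS, the grouping information to the virtual machine manager, so that the virtual machine manager deploys the M virtual machines according to the grouping information.
  7. A data storage method, wherein at least M virtual machines are deployed on multiple physical servers, the M virtual machines are respectively deployed as M data nodes (SNs) of a distributed storage system, and M is a positive integer greater than or equal to 2; and the method comprises:
    sending, by a virtual machine manager, the identifiers of the M virtual machines to a metadata node (MDS) of the distributed storage system;
    receiving, by the virtual machine manager, grouping information sent by the MDS, wherein the grouping information records a mapping between multiple anti-affinity groups and the identifiers of the M virtual machines, the virtual machines in each anti-affinity group have anti-affinity with one another, the grouping information is used by the MDS to determine identifiers of N virtual machines from at least one of the anti-affinity groups when a client requests to store data, the N virtual machines are used to store N copies of the data that the client specifies to be stored, and N is a positive integer greater than or equal to 1; and
    deploying, by the virtual machine manager, the M virtual machines according to the grouping information.
  8. The method according to claim 7, wherein the M virtual machines belong to multiple affinity groups, the virtual machines in each affinity group have affinity with one another, and the grouping information records a mapping between the multiple affinity groups and the identifiers of the M virtual machines.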
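  9. A data storage apparatus, wherein at least M virtual machines are deployed on multiple physical servers, the M virtual machines are respectively deployed as M data nodes (SNs) of a distributed storage system, and M is a positive integer greater than or equal to 2; and the apparatus comprises:
    a first receiving module, configured for a metadata node (MDS) of the distributed storage system to receive a data storage request from a client;
    a first determining module, configured for the MDS to determine that the number of copies to be stored of the data specified by the data storage request is N, wherein N is a positive integer greater than or equal to 1;
    a second determining module, configured for the MDS to determine identifiers of N virtual machines according to grouping information, wherein the N virtual machines corresponding to the identifiers belong to at least one anti-affinity group, the grouping information records a mapping between multiple anti-affinity groups and the identifiers of the M virtual machines, the virtual machines in each anti-affinity group have anti-affinity with one another, and the M virtual machines comprise the N virtual machines; and
    a first sending module, configured for the MDS to send response information for the data storage request to the client, wherein the response information comprises the identifiers of the N virtual machines and instructs the client to store N copies of the data specified by the data storage request on the N virtual machines.
  10. The apparatus according to claim 9, wherein the second determining module comprises:
    a first determining submodule, configured so that, if the number of virtual machine identifiers comprised in an anti-affinity group is greater than or equal to N, the MDS determines the identifiers of the N virtual machines from one of the multiple anti-affinity groups; and
    a second determining submodule, configured so that, if the number of virtual machine identifiers comprised in an anti-affinity group is less than N, the MDS determines the identifiers of the N virtual machines from at least two of the anti-affinity groups.
  11. The apparatus according to claim 10, wherein the second determining submodule is further configured so that:
    the MDS determines the identifiers of the N virtual machines among the at least two anti-affinity groups according to affinity.
  12. The apparatus according to any one of claims 9 to 11, wherein the apparatus further comprises:
    a second receiving module, configured for the MDS to receive the identifiers of the M virtual machines sent by a virtual machine manager;
    a dividing module, configured for the MDS to divide the M virtual machines into the multiple anti-affinity groups and to record in the grouping information the mapping between the multiple anti-affinity groups and the identifiers of the M virtual machines; and
    a second sending module, configured for the MDS to send the grouping information to the virtual machine manager, so that the virtual machine manager deploys the M virtual machines according to the grouping information.
  13. The apparatus according to claim 11 or 12, wherein the M virtual machines belong to multiple affinity groups, the virtual machines in each affinity group have affinity with one another, and the grouping information records a mapping between the multiple affinity groups and the identifiers of the M virtual machines.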
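  14. A data storage apparatus, wherein at least M virtual machines are deployed on multiple physical servers, the M virtual machines are respectively deployed as M data nodes (SNs) of a distributed storage system, and M is a positive integer greater than or equal to 2; and the apparatus comprises:
    a receiving module, configured for a metadata node (MDS) of the distributed storage system to receive the identifiers of the M virtual machines sent by a virtual machine manager;
    a dividing module, configured for the MDS to divide the M virtual machines into multiple anti-affinity groups, wherein the virtual machines in each anti-affinity group have anti-affinity with one another;
    a recording module, configured for the MDS to record in grouping information a mapping between the multiple anti-affinity groups and the identifiers of the M virtual machines, wherein the grouping information is used by the MDS to determine identifiers of N virtual machines from at least one of the anti-affinity groups when a client requests to store data, the N virtual machines are used to store N copies of the data that the client specifies to be stored, and N is a positive integer greater than or equal to 1; and
    a sending module, configured for the MDS to send the grouping information to the virtual machine manager, so that the virtual machine manager deploys the M virtual machines according to the grouping information.
  15. A data storage apparatus, wherein at least M virtual machines are deployed on multiple physical servers, the M virtual machines are respectively deployed as M data nodes (SNs) of a distributed storage system, and M is a positive integer greater than or equal to 2; and the apparatus comprises:
    a sending module, configured for a virtual machine manager to send the identifiers of the M virtual machines to a metadata node (MDS) of the distributed storage system;
    a receiving module, configured for the virtual machine manager to receive grouping information sent by the MDS, wherein the grouping information records a mapping between multiple anti-affinity groups and the identifiers of the M virtual machines, the virtual machines in each anti-affinity group have anti-affinity with one another, the grouping information is used by the MDS to determine identifiers of N virtual machines from at least one of the anti-affinity groups when a client requests to store data, the N virtual machines are used to store N copies of the data that the client specifies to be stored, and N is a positive integer greater than or equal to 1; and
    a deploying module, configured for the virtual machine manager to deploy the M virtual machines according to the grouping information.
  16. The apparatus according to claim 15, wherein the M virtual machines belong to multiple affinity groups, the virtual machines in each affinity group have affinity with one another, and the grouping information records a mapping between the multiple affinity groups and the identifiers of the M virtual machines.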
PCT/CN2017/087212 2016-11-21 2017-06-05 Data storage method and apparatus WO2018090606A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP17871054.7A EP3432132B1 (en) 2016-11-21 2017-06-05 Data storage method and device
US16/188,951 US11036535B2 (en) 2016-11-21 2018-11-13 Data storage method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201611043007.7A CN106648462B (zh) 2016-11-21 2016-11-21 Data storage method and apparatus
CN201611043007.7 2016-11-21

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/188,951 Continuation US11036535B2 (en) 2016-11-21 2018-11-13 Data storage method and apparatus

Publications (1)

Publication Number Publication Date
WO2018090606A1 (zh) 2018-05-24

Family

ID=58811239

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/087212 WO2018090606A1 (zh) 2016-11-21 2017-06-05 Data storage method and apparatus

Country Status (4)

Country Link
US (1) US11036535B2 (zh)
EP (1) EP3432132B1 (zh)
CN (1) CN106648462B (zh)
WO (1) WO2018090606A1 (zh)

Also Published As

Publication number Publication date
EP3432132A1 (en) 2019-01-23
US20190079791A1 (en) 2019-03-14
US11036535B2 (en) 2021-06-15
CN106648462A (zh) 2017-05-10
EP3432132A4 (en) 2019-04-24
CN106648462B (zh) 2019-10-25
EP3432132B1 (en) 2020-07-08

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 2017871054

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2017871054

Country of ref document: EP

Effective date: 20181019

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17871054

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE