CN107943421A

CN107943421A - Partition division method and apparatus based on a distributed storage system

Info

Publication number: CN107943421A
Application number: CN201711241562.5A
Authority: CN
Inventors: 罗四维; 张雷; 刘小威
Original assignee: Chengdu Huawei Technology Co Ltd
Current assignee: Chengdu Huawei Technology Co Ltd
Priority date: 2017-11-30
Filing date: 2017-11-30
Publication date: 2018-04-20
Anticipated expiration: 2037-11-30
Also published as: CN107943421B

Abstract

Embodiments of the present application disclose a partition division method and apparatus based on a distributed storage system. They relate to the storage field and address the problem of ensuring data reliability in a small-scale cluster when a small number of storage nodes or disks fail. The scheme is as follows: a cluster management node obtains fault information, which indicates a failed storage node or a failed storage medium; the cluster management node re-divides the storage media that are in a normal state according to the fault information, the load of the storage nodes that are in a normal state, and the erasure code (EC) redundancy mode, and obtains first updated partition information; the cluster management node sends the first updated partition information to an application node. The embodiments of the present application are used in the data storage process.

Description

Partition division method and apparatus based on a distributed storage system

Technical field

The present application relates to the storage field, and in particular to a partition division method and apparatus based on a distributed storage system.

Background technology

In a big data environment, the more data one holds, the greater the value that data contains. At present, enterprise users, data center infrastructure and the like store massive data mainly through cloud storage technologies such as distributed storage systems. While massive data is stored, its reliability must also be ensured. Existing strategies for ensuring data reliability mainly include multi-replica (Multi-Replica) and erasure coding (Erasure Coding, EC).

In the distributed storage system of a small-scale cluster, the partitions of the system can be preset according to the EC redundancy mode. Each partition includes N+K disks, and each disk in a partition belongs to a different storage node, where N denotes the EC data fragments and K denotes the EC parity fragments. After the application node EC-encodes the data to be written, it obtains at least one EC stripe and writes each EC stripe to one partition. When a small number of storage nodes or disks fail, the original data can be recovered by fetching part of the data and performing a simple XOR calculation, which ensures that data can be read reliably. However, this write mode requires the application node to perform full-stripe writes according to the partition configuration specified by the system, which limits the flexibility of upper-layer services. Moreover, when a partition does not satisfy the conditions of the current write, this write and the subsequent process cannot be performed normally, and a degraded write occurs.

To avoid degraded writes, storage nodes in a normal state in other partitions of the storage system can temporarily store the data that should have been written to the failed storage node. This lets the system serve writes without degradation while still ensuring write reliability. However, it increases the complexity of managing storage nodes across partitions. In addition, after the failed storage node recovers, a data move-back operation must be started, which adds data migration overhead and reduces the overall performance of the system.

Therefore, how to ensure data reliability in a small-scale cluster when a small number of storage nodes or disks fail is a problem that urgently needs to be solved.

Summary of the invention

Embodiments of the present application provide a partition division method and apparatus based on a distributed storage system, which address the problem of ensuring data reliability in a small-scale cluster when a small number of storage nodes or disks fail.

To achieve the above objective, the embodiments of the present application adopt the following technical solutions:

A first aspect of the embodiments of the present application provides a partition division method based on a distributed storage system. The distributed storage system includes a cluster management node, an application node, and S storage nodes, and each storage node includes X storage media. According to the erasure code (EC) redundancy mode, the S*X storage media included in the S storage nodes are divided into P partitions. Each of the P partitions includes Y storage media, and the Y storage media consist of one storage medium from each of Y storage nodes. The EC redundancy mode is the number of data fragments and the number of parity fragments: N denotes the number of data fragments, K denotes the number of parity fragments, and Y=N+K. The basic principle is as follows. First, the cluster management node obtains fault information, which indicates a failed storage node or a failed storage medium. Then, the cluster management node re-divides the storage media in a normal state according to the fault information, the load of the storage nodes in a normal state, and the EC redundancy mode, and obtains first updated partition information. The cluster management node sends the first updated partition information to the application node. With the partition division method of the embodiments of the present application, after a storage node or a storage medium fails, the storage media in a normal state are re-divided so that the number of storage media in each partition always matches the configured EC redundancy mode. This ensures that data can be written successfully and effectively improves data reliability.

With reference to the first aspect, in a possible implementation, if the fault information is the node identifier of a failed storage node and i storage nodes have failed, the re-dividing, by the cluster management node, of the storage media in a normal state according to the fault information, the load of the storage nodes in a normal state, and the EC redundancy mode to obtain the first updated partition information includes: the cluster management node divides the (S-i)*X storage media included in the S-i storage nodes into Q partitions according to the EC redundancy mode and the load of the S-i storage nodes, and obtains the first updated partition information. The first updated partition information includes the partition identifier of each of the Q partitions and the medium identifiers of the storage media included in each of the Q partitions.

With reference to the first aspect, in another possible implementation, if the fault information is the medium identifier of a failed storage medium and j storage media have failed, the re-dividing, by the cluster management node, of the storage media in a normal state according to the fault information, the load of the storage nodes in a normal state, and the EC redundancy mode to obtain the first updated partition information includes: the cluster management node divides the (S*X)-j storage media included in the S storage nodes into W partitions according to the EC redundancy mode and the load of the S storage nodes, and obtains the first updated partition information. The first updated partition information includes the partition identifier of each of the W partitions and the medium identifiers of the storage media included in each of the W partitions.
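The two implementations above can be sketched in one routine. This is only an illustrative sketch: the source does not specify a selection algorithm, so the greedy rule used here (take one medium from each of the Y nodes that have the most unassigned media, breaking ties by lower node load) is an assumption, as are the names `repartition`, `media_by_node`, `node_load` and the `partition-<n>` identifiers.

```python
from typing import Dict, Iterable, List


def repartition(
    media_by_node: Dict[int, List[str]],
    node_load: Dict[int, float],
    y: int,
    failed_nodes: Iterable[int] = (),
    failed_media: Iterable[str] = (),
) -> Dict[str, List[str]]:
    """Re-divide healthy media into partitions of y media on y distinct nodes."""
    bad_nodes, bad_media = set(failed_nodes), set(failed_media)
    # Keep only media that are healthy and sit on a healthy node.
    free = {
        node: [m for m in media if m not in bad_media]
        for node, media in media_by_node.items()
        if node not in bad_nodes
    }
    partitions: List[List[str]] = []
    while True:
        # Assumed rule: most unassigned media first, then lower load.
        ready = sorted(
            (node for node in free if free[node]),
            key=lambda node: (-len(free[node]), node_load.get(node, 0.0), node),
        )
        if len(ready) < y:
            break  # too few distinct nodes left for another full partition
        partitions.append([free[node].pop(0) for node in ready[:y]])
    # Updated partition information: partition ID -> medium IDs.
    return {f"partition-{i}": p for i, p in enumerate(partitions, start=1)}


# S = 8 nodes, X = 6 media per node, EC 4+2, so Y = 6.
cluster = {n: [f"{n}-{m}" for m in range(1, 7)] for n in range(1, 9)}
load = {n: 0.0 for n in cluster}

after_node_failure = repartition(cluster, load, y=6, failed_nodes=[3])
after_media_failure = repartition(cluster, load, y=6, failed_media=["1-1", "2-1"])
```

With one failed node, the 42 remaining media form Q = 7 partitions. With two failed media, 46 media remain and the sketch forms 7 full partitions and leaves 4 media unassigned; it does not attempt to combine such leftover media with media that already belong to other partitions.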

With reference to the above possible implementations, in another possible implementation, after the cluster management node sends the first updated partition information to the application node, the method further includes: the cluster management node obtains recovery information, which indicates that the fault of the failed storage node or of the failed storage medium has been removed; the cluster management node re-divides the storage media in a normal state according to the recovery information, the EC redundancy mode, and the load of the storage nodes in a normal state, and obtains second updated partition information; the cluster management node sends the second updated partition information to the application node. In this way, after the fault of the storage node or storage medium is removed and the normal state is restored, the storage media in the distributed storage system are divided into partitions again, so that the storage media are fully used and storage space is not wasted.

A second aspect of the embodiments of the present application provides a data writing method. The distributed storage system includes a cluster management node, an application node, and S storage nodes, and each storage node includes X storage media. According to the erasure code (EC) redundancy mode, the S*X storage media included in the S storage nodes are divided into P partitions. Each of the P partitions includes Y storage media, and the Y storage media consist of one storage medium from each of Y storage nodes. The EC redundancy mode is the number of data fragments and the number of parity fragments: N denotes the number of data fragments, K denotes the number of parity fragments, and Y=N+K. The method includes: the application node EC-encodes the data to be written and obtains L EC stripes, where each EC stripe includes N data fragments and K parity fragments, L is determined by the amount of data to be written, and L is greater than or equal to 1. The application node stores the L EC stripes to L of Q partitions according to first updated partition information. The first updated partition information includes the partition identifier of each of the Q partitions and the medium identifiers of the storage media included in each of the Q partitions. The Q partitions are obtained by the cluster management node by dividing the (S-i)*X storage media included in S-i storage nodes according to the EC redundancy mode and the load of the S-i storage nodes, where i denotes the number of failed storage nodes. Alternatively, the application node stores the L EC stripes to L of W partitions according to the first updated partition information. In this case the first updated partition information includes the partition identifier of each of the W partitions and the medium identifiers of the storage media included in each of the W partitions. The W partitions are obtained by the cluster management node by dividing the (S*X)-j storage media included in the S storage nodes according to the EC redundancy mode and the load of the S storage nodes, where j denotes the number of failed storage media. In this way, after a storage node or a storage medium fails, the storage media are re-divided into partitions so that the number of storage media in each partition always matches the configured EC redundancy mode. Data is written to the updated partitions, which ensures that the write succeeds and effectively improves data reliability.
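The mapping of L EC stripes onto L partitions can be illustrated with a small placement planner. This is a sketch under stated assumptions, not the method of the application: the fixed `fragment_size`, the rule that the first N media of a partition hold data fragments and the last K hold parity fragments, the error raised when L exceeds the number of available partitions, and the name `plan_stripe_placement` are all hypothetical. EC encoding itself is not performed here.

```python
import math
from typing import Dict, List


def plan_stripe_placement(
    data_len: int,
    n: int,
    k: int,
    fragment_size: int,
    partition_info: Dict[str, List[str]],
) -> List[dict]:
    """Decide which partition and which media receive each EC stripe."""
    # L is determined by the amount of data to be written, and L >= 1.
    stripes = max(1, math.ceil(data_len / (n * fragment_size)))
    if stripes > len(partition_info):
        # Assumption: each stripe needs its own partition.
        raise ValueError("more stripes than available partitions")
    plan = []
    for index, (partition_id, media) in zip(range(stripes), partition_info.items()):
        if len(media) != n + k:
            raise ValueError(f"{partition_id} does not hold N+K media")
        plan.append(
            {
                "stripe": index,
                "partition": partition_id,
                "data_media": media[:n],    # assumed: first N media hold data
                "parity_media": media[n:],  # assumed: last K media hold parity
            }
        )
    return plan


# Updated partition information with four partitions of 4+2 media each.
updated_info = {
    f"partition-{p}": [f"{node}-{p}" for node in range(1, 7)] for p in range(1, 5)
}
plan = plan_stripe_placement(10_000, n=4, k=2, fragment_size=1024, partition_info=updated_info)
```

Here 10,000 bytes with four 1,024-byte data fragments per stripe yield L = 3 stripes, which are placed on the first three partitions listed in the updated partition information.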

With reference to the second aspect, in a possible implementation, before the application node EC-encodes the data to be written and obtains the L EC stripes, the method further includes: the application node receives the first updated partition information sent by the cluster management node.

With reference to the above possible implementation, in another possible implementation, after the application node receives the first updated partition information sent by the cluster management node, the method further includes: the application node receives second updated partition information sent by the cluster management node. The second updated partition information is obtained by the cluster management node by re-dividing the storage media in a normal state according to recovery information, the EC redundancy mode, and the load of the storage nodes in a normal state. The recovery information indicates that the fault of the failed storage node or of the failed storage medium has been removed.

A third aspect of the embodiments of the present application provides a cluster management node. The distributed storage system includes a cluster management node, an application node, and S storage nodes, and each storage node includes X storage media. According to the erasure code (EC) redundancy mode, the S*X storage media included in the S storage nodes are divided into P partitions. Each of the P partitions includes Y storage media, and the Y storage media consist of one storage medium from each of Y storage nodes. The EC redundancy mode is the number of data fragments and the number of parity fragments: N denotes the number of data fragments, K denotes the number of parity fragments, and Y=N+K. The cluster management node includes: a transceiver unit, configured to obtain fault information, which indicates a failed storage node or a failed storage medium; and a processing unit, configured to re-divide the storage media in a normal state according to the fault information, the load of the storage nodes in a normal state, and the EC redundancy mode, and to obtain first updated partition information. The transceiver unit is further configured to send the first updated partition information to the application node.

A fourth aspect of the embodiments of the present application provides an application node. The distributed storage system includes a cluster management node, an application node, and S storage nodes, and each storage node includes X storage media. According to the erasure code (EC) redundancy mode, the S*X storage media included in the S storage nodes are divided into P partitions. Each of the P partitions includes Y storage media, and the Y storage media consist of one storage medium from each of Y storage nodes. The EC redundancy mode is the number of data fragments and the number of parity fragments: N denotes the number of data fragments, K denotes the number of parity fragments, and Y=N+K. The application node includes: a processing unit, configured to EC-encode the data to be written and obtain L EC stripes, where each EC stripe includes N data fragments and K parity fragments, L is determined by the amount of data to be written, and L is greater than or equal to 1; and the processing unit and a transceiver unit, configured to store the L EC stripes to L of Q partitions according to first updated partition information. The first updated partition information includes the partition identifier of each of the Q partitions and the medium identifiers of the storage media included in each of the Q partitions. The Q partitions are obtained by the cluster management node by dividing the (S-i)*X storage media included in S-i storage nodes according to the EC redundancy mode and the load of the S-i storage nodes, where i denotes the number of failed storage nodes. Alternatively, the application node stores the L EC stripes to L of W partitions according to the first updated partition information. In this case the first updated partition information includes the partition identifier of each of the W partitions and the medium identifiers of the storage media included in each of the W partitions. The W partitions are obtained by the cluster management node by dividing the (S*X)-j storage media included in the S storage nodes according to the EC redundancy mode and the load of the S storage nodes, where j denotes the number of failed storage media.

It should be noted that, after the cluster management node obtains the fault information, if Y is greater than S, each of the Q partitions includes at least two storage media that belong to the same storage node. In addition, the distributed storage system described in the embodiments of the present application is a small-scale cluster system, and S is an integer greater than or equal to 3 and less than or equal to 20. The failure rate is 10% of S. i may be 3.

It should be noted that the functional modules of the third aspect and the fourth aspect may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions. For example, a communication interface performs the function of the transceiver unit, a processor performs the function of the processing unit, and a memory stores the program instructions with which the processor performs the partition division method based on a distributed storage system and the data writing method of the embodiments of the present application. The processor, the communication interface, and the memory are connected by a bus and communicate with one another. For details, refer to the behavior of the cluster management node in the partition division method provided in the first aspect, and to the behavior of the application node in the data writing method provided in the second aspect.

A fifth aspect of the embodiments of the present application provides a cluster management node, which may include at least one processor, a memory, a communication interface, and a communication bus. The at least one processor is connected to the memory and the communication interface by the communication bus. The memory is configured to store computer-executable instructions. When the processor runs, it executes the computer-executable instructions stored in the memory, so that the cluster management node performs the partition division method based on a distributed storage system according to the first aspect or any possible implementation of the first aspect.

A sixth aspect of the embodiments of the present application provides an application node, which may include at least one processor, a memory, a communication interface, and a communication bus. The at least one processor is connected to the memory and the communication interface by the communication bus. The memory is configured to store computer-executable instructions. When the processor runs, it executes the computer-executable instructions stored in the memory, so that the application node performs the data writing method according to the second aspect or any possible implementation of the second aspect.

A seventh aspect of the embodiments of the present application provides a computer-readable storage medium, configured to store the computer software instructions used by the above cluster management node. When the computer software instructions are executed by a processor, the cluster management node can perform the method of any of the above aspects.

An eighth aspect of the embodiments of the present application provides a computer-readable storage medium, configured to store the computer software instructions used by the above application node. When the computer software instructions are executed by a processor, the application node can perform the method of any of the above aspects.

A ninth aspect of the embodiments of the present application provides a computer program product including instructions which, when run on a computer, enable the computer to perform the method of any of the above aspects.

In addition, for the technical effects of any design of the third aspect to the ninth aspect, refer to the technical effects of the corresponding designs of the first aspect and the second aspect; details are not repeated here.

In the embodiments of the present application, the names of the cluster management node and the application node do not limit the devices themselves. In actual implementations, these devices may appear under other names. As long as the function of each device is similar to that in the embodiments of the present application, it falls within the scope of the claims of the present application and their equivalents.

These and other aspects of the embodiments of the present application will be more readily apparent from the following description.

Brief description of the drawings

Fig. 1 is a simplified schematic diagram of a distributed storage system provided by an embodiment of the present application;

Fig. 2 is a schematic diagram of a partition division provided by the prior art;

Fig. 3 is a schematic diagram of another partition division provided by the prior art;

Fig. 4 is a schematic diagram of yet another partition division provided by the prior art;

Fig. 5 is a flowchart of a partition division method based on a distributed storage system provided by an embodiment of the present application;

Fig. 6 is a schematic diagram of a partition division provided by an embodiment of the present application;

Fig. 7 is a schematic diagram of another partition division provided by an embodiment of the present application;

Fig. 8 is a schematic diagram of another partition division provided by an embodiment of the present application;

Fig. 9 is a schematic diagram of another partition division provided by an embodiment of the present application;

Fig. 10 is a schematic diagram of another partition division provided by an embodiment of the present application;

Fig. 11 is a schematic diagram of another partition division provided by an embodiment of the present application;

Fig. 12 is a flowchart of a data writing method provided by an embodiment of the present application;

Fig. 13 is a schematic diagram of a data writing process provided by an embodiment of the present application;

Fig. 14 is a schematic structural diagram of a cluster management node provided by an embodiment of the present application;

Fig. 15 is a schematic structural diagram of a computer device provided by an embodiment of the present application;

Fig. 16 is a schematic structural diagram of another cluster management node provided by an embodiment of the present application;

Fig. 17 is a schematic structural diagram of an application node provided by an embodiment of the present application;

Fig. 18 is a schematic structural diagram of another application node provided by an embodiment of the present application.

Detailed description

To keep the description of the following embodiments clear and concise, related technologies are first briefly introduced:

A distributed storage system is a storage system that is easy to scale. All storage nodes have equal status, and the location and number of storage nodes in the system are not limited and can be extended arbitrarily. Data is stored in a dispersed manner on multiple independent storage nodes, which achieves load balancing. Compared with a traditional network storage system, which stores all data on a centralized storage server, this improves the reliability, availability, and access efficiency of the system.

Erasure coding splits data into data fragments, computes a small number of parity fragments from the data fragments, and stores all data fragments and parity fragments on different data nodes. On a read, only part of the fragments needs to be fetched, and the original data can be obtained by combining them with a simple XOR calculation. This approach greatly improves disk space utilization, and when hardware is used to accelerate the calculation, the performance loss can also be kept within a certain range.
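The XOR-based recovery mentioned above can be illustrated for the simplest case of a single parity fragment (K=1). This is a minimal sketch, not the encoding used by the application: a redundancy mode with K=2, as in the later examples, requires a stronger code such as Reed-Solomon, which is not shown. The function names and the zero-byte padding are assumptions.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally long fragments."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_stripe(data: bytes, n: int) -> List[bytes]:
    """Split data into n data fragments and append one XOR parity fragment."""
    size = -(-len(data) // n)                  # ceiling division
    padded = data.ljust(size * n, b"\0")       # assumed: pad with zero bytes
    fragments = [padded[i * size:(i + 1) * size] for i in range(n)]
    return fragments + [reduce(xor_bytes, fragments)]


def recover_stripe(stripe: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one missing fragment (marked None) from the survivors."""
    missing = [i for i, fragment in enumerate(stripe) if fragment is None]
    if len(missing) > 1:
        raise ValueError("a single XOR parity can repair only one fragment")
    repaired = list(stripe)
    if missing:
        survivors = [fragment for fragment in stripe if fragment is not None]
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    return repaired


stripe = encode_stripe(b"hello world!", n=4)   # 4 data fragments + 1 parity
damaged = list(stripe)
damaged[2] = None                              # simulate a failed storage medium
restored = recover_stripe(damaged)
```

Because the parity fragment is the XOR of all data fragments, XOR-ing the surviving fragments reproduces the missing one, whether it is a data fragment or the parity fragment itself.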

For example, Fig. 1 is a simplified schematic diagram of a distributed storage system provided by an embodiment of the present application. As shown in Fig. 1, the system may include a cluster management node, an application node, and S storage nodes, and each storage node includes X storage media. The system may preconfigure the EC redundancy mode, that is, the number of data fragments and the number of parity fragments, according to the number of storage nodes and the number of storage media. According to the EC redundancy mode, the system may divide the S*X storage media included in the S storage nodes into P partitions, each of which includes Y storage media. To form a partition, Y different storage nodes are selected from the S storage nodes, one storage medium is selected from each of the Y storage nodes, and the selected storage media are determined to form one partition. It can be understood that the Y storage media included in each partition belong to different storage nodes. The storage media included in the individual partitions are also different. The EC redundancy mode is the number of data fragments and the number of parity fragments: N denotes the number of data fragments, K denotes the number of parity fragments, and Y=N+K.

It should be noted that the partition division method based on a distributed storage system described in the embodiments of the present application is applicable to the distributed storage system of a small-scale cluster. For example, S is an integer greater than or equal to 3 and less than or equal to 20, and X is an integer greater than or equal to 1. For example, the distributed storage system includes 8 storage nodes and each storage node includes 6 storage media, that is, S is 8 and X is 6. Alternatively, the distributed storage system includes 6 storage nodes and each storage node includes 6 storage media, that is, S is 6 and X is 6.

Assume that the EC redundancy mode is 4 data fragments and 2 parity fragments, that is, N=4 and K=2. A distributed storage system that includes 8 storage nodes, each with 6 storage media, is divided into partitions according to this redundancy mode: 6 different storage nodes are selected from the 8 storage nodes, one storage medium is selected from each of the 6 storage nodes, and the selected storage media are determined to form one partition. Eight partitions can be obtained, each including 6 storage media. Assume that the 8 storage nodes are numbered 1 to 8 and that the 6 storage media in each storage node are numbered 1 to 6. Storage medium 1 in storage node 1 can be denoted 1-1, storage medium 2 in storage node 1 can be denoted 1-2, and so on up to storage medium 6 in storage node 1, which can be denoted 1-6. Similarly, storage medium 1 in storage node 2 can be denoted 2-1, and storage medium 2 in storage node 2 can be denoted 2-2. The storage media in the other storage nodes can be denoted in the same way, and details are not repeated here.

As shown in Fig. 2, partition one includes storage media 1-1, 2-1, 3-1, 4-1, 5-1 and 6-1. Partition two includes storage media 2-2, 3-2, 4-2, 5-2, 6-2 and 7-2. Partition three includes storage media 3-3, 4-3, 5-3, 6-3, 7-3 and 8-3. Partition four includes storage media 4-4, 5-4, 6-4, 7-4, 8-4 and 1-2. Partition five includes storage media 5-5, 6-5, 7-5, 8-5, 1-3 and 2-3. Partition six includes storage media 6-6, 7-6, 8-6, 1-4, 2-4 and 3-4. Partition seven includes storage media 1-5, 2-5, 3-5, 4-5, 7-1 and 8-1. Partition eight includes storage media 1-6, 2-6, 3-6, 4-6, 5-6 and 8-2.
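The Fig. 2 layout can be checked mechanically against the two properties the description relies on: every partition holds Y media on Y different storage nodes, and every one of the S*X media is used exactly once. The checker below is an illustrative sketch; `check_layout` and `FIG2_LAYOUT` are hypothetical names, and the layout is copied from the paragraph above.

```python
from typing import Dict

# Fig. 2 layout, media written as "<node>-<medium>".
FIG2_LAYOUT = {
    1: "1-1 2-1 3-1 4-1 5-1 6-1",
    2: "2-2 3-2 4-2 5-2 6-2 7-2",
    3: "3-3 4-3 5-3 6-3 7-3 8-3",
    4: "4-4 5-4 6-4 7-4 8-4 1-2",
    5: "5-5 6-5 7-5 8-5 1-3 2-3",
    6: "6-6 7-6 8-6 1-4 2-4 3-4",
    7: "1-5 2-5 3-5 4-5 7-1 8-1",
    8: "1-6 2-6 3-6 4-6 5-6 8-2",
}


def check_layout(layout: Dict[int, str], nodes: int, media_per_node: int, y: int) -> bool:
    """Return True if the layout is a valid, complete partition division."""
    seen = []
    for text in layout.values():
        media = text.split()
        node_ids = {medium.split("-")[0] for medium in media}
        # Each partition needs y media, all on different storage nodes.
        if len(media) != y or len(node_ids) != y:
            return False
        seen.extend(media)
    expected = {
        f"{n}-{m}"
        for n in range(1, nodes + 1)
        for m in range(1, media_per_node + 1)
    }
    # Every medium must appear exactly once across all partitions.
    return len(seen) == len(set(seen)) and set(seen) == expected


layout_ok = check_layout(FIG2_LAYOUT, nodes=8, media_per_node=6, y=6)
```

For the layout above the check passes: the 8 partitions cover all 48 storage media, and no partition places two media on the same storage node.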

Assume that the EC redundancy mode is 4 data fragments and 2 parity fragments, that is, N=4 and K=2. A distributed storage system that includes 6 storage nodes, each with 6 storage media, is divided into partitions according to this redundancy mode: one storage medium is selected from each of the 6 storage nodes, and the selected storage media are determined to form one partition. Six partitions can be obtained, each including 6 storage media. Assume that the 6 storage nodes are numbered 1 to 6 and that the 6 storage media in each storage node are numbered 1 to 6. The storage media in the storage nodes can be denoted in the manner described above, and details are not repeated here.

As shown in Fig. 3, partition one includes storage media 1-1, 2-1, 3-1, 4-1, 5-1 and 6-1. Partition two includes storage media 1-2, 2-2, 3-2, 4-2, 5-2 and 6-2. Partition three includes storage media 1-3, 2-3, 3-3, 4-3, 5-3 and 6-3. Partition four includes storage media 1-4, 2-4, 3-4, 4-4, 5-4 and 6-4. Partition five includes storage media 1-5, 2-5, 3-5, 4-5, 5-5 and 6-5. Partition six includes storage media 1-6, 2-6, 3-6, 4-6, 5-6 and 6-6.
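When the number of storage nodes equals Y, as in Fig. 3, the layout follows a simple rule: partition p consists of medium p of every storage node. The short sketch below reproduces that rule; the function name `column_layout` is hypothetical, and the rule only applies to this S = Y case.

```python
from typing import Dict, List


def column_layout(nodes: int, media_per_node: int) -> Dict[int, List[str]]:
    """Partition p takes medium p from every storage node (valid when nodes == Y)."""
    return {
        p: [f"{n}-{p}" for n in range(1, nodes + 1)]
        for p in range(1, media_per_node + 1)
    }


fig3 = column_layout(nodes=6, media_per_node=6)
```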

It should be noted that the dividing mode of above-mentioned subregion is exemplary illustration, the embodiment of the present application does not limit this It is fixed, there can also be the division of other modes in practical application.But need to ensure what same subregion included when dividing subregion Storage medium belongs to different memory nodes.If in distributed memory system there are remaining storage medium not enough composition one During a full partitions, it can select to have been dispensed into the storage medium of other subregions and remaining storage medium composition subregion, i.e., Different subregions can include identical storage medium (the same storage medium in same memory node).It should be noted It is, when selection has been dispensed into the storage medium of other subregions, it is necessary in the storage except including remaining storage medium The minimum storage medium of selection load in other memory nodes outside node.

Assume that the EC redundancy mode is 4 data fragments and 2 parity fragments, that is, N=4 and K=2. According to this redundancy mode, partitions are divided for a distributed storage system that includes 8 storage nodes, each of which includes 5 storage media. Specifically, 6 different storage nodes are selected from the 8 storage nodes, one storage medium is selected from each of the 6 selected storage nodes, and the selected storage media form one partition. Seven partitions can be obtained, each including 6 storage media. Assume that the 8 storage nodes are numbered 1 to 8 and that the 5 storage media in each storage node are numbered 1 to 5. Storage media in storage nodes are denoted in the same way as above, and details are not repeated here.

As shown in Fig. 4, partition one includes storage media 1-1, 2-1, 3-1, 4-1, 5-1, and 6-1. Partition two includes storage media 2-2, 3-2, 4-2, 5-2, 6-2, and 7-2. Partition three includes storage media 3-3, 4-3, 5-3, 6-3, 7-3, and 8-3. Partition four includes storage media 4-4, 5-4, 6-4, 7-4, 8-4, and 1-2. Partition five includes storage media 5-5, 6-5, 7-5, 8-5, 1-3, and 2-3. Partition six includes storage media 1-5, 2-5, 3-5, 4-5, 7-1, and 8-1. At this point, four storage media remain: 1-4, 2-4, 3-4, and 8-2. Two more storage media are needed to form a partition, and they can be selected in ascending order of load from storage node 4, storage node 5, storage node 6, and storage node 7. Assuming that storage media 6-4 and 7-4 have the lowest loads, storage media 1-4, 2-4, 3-4, 8-2, 6-4, and 7-4 can form partition seven.
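The division rule above (media of one partition on different storage nodes, with the least-loaded already-assigned media reused to complete the last partition) can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function name `divide_partitions`, the greedy choice of the nodes with the most free media, and the `load` dictionary are all hypothetical, and the resulting layout differs from the one drawn in Fig. 4 while obeying the same constraints.

```python
def divide_partitions(num_nodes, media_per_node, width, load=None):
    """Return partitions, each a list of `width` (node, medium) pairs that
    sit on distinct storage nodes.

    `load` maps (node, medium) -> load value; it is only consulted when an
    already-assigned medium must be reused to complete a partition.
    Assumes num_nodes >= width (hypothetical helper, not from the source).
    """
    load = load or {}
    free = {n: [(n, m) for m in range(1, media_per_node + 1)]
            for n in range(1, num_nodes + 1)}
    assigned = []      # media already placed in an earlier partition
    partitions = []
    while any(free.values()):
        # Take one free medium from each of up to `width` nodes, preferring
        # the nodes that still have the most free media.
        nodes = sorted((n for n in free if free[n]),
                       key=lambda n: (-len(free[n]), n))[:width]
        taken = [free[n].pop(0) for n in nodes]
        part = list(taken)
        used_nodes = set(nodes)
        # Too few free media left: reuse the least-loaded assigned media
        # from nodes that are not yet represented in this partition.
        for medium in sorted(assigned, key=lambda m: (load.get(m, 0), m)):
            if len(part) == width:
                break
            if medium[0] not in used_nodes:
                part.append(medium)
                used_nodes.add(medium[0])
        assigned.extend(taken)
        partitions.append(part)
    return partitions


# 8 storage nodes with 5 media each, N + K = 6 media per partition.
partitions = divide_partitions(num_nodes=8, media_per_node=5, width=6)
print(len(partitions))
```

With 40 storage media and 6 media per partition, six partitions are filled from free media and the seventh is completed by reuse, which matches the partition count in the example above.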

A storage medium is a carrier for storing data, for example, a floppy disk, a CD, a DVD, a hard disk, flash memory, a Secure Digital (SD) memory card, a MultiMedia Card (MMC), or a Memory Stick. Currently, the most popular storage medium is the disk based on flash memory (NAND flash).

A storage node includes the storage media described above and is used for storing data. A storage node in the embodiments of this application may also be referred to as a storage server. The storage nodes may be different logical nodes integrated in the same device, or may be devices distributed at different locations. This is not limited in the embodiments of this application, provided that the storage function of the distributed storage system can be implemented.

The cluster management node is used to manage metadata, the addresses of the storage nodes, the states of the storage nodes, and the loads of the storage nodes. Metadata, also called intermediary data or relay data, is data that describes data (data about data). It mainly describes data attributes (properties) and supports functions such as indicating storage locations, recording historical data, resource lookup, and file recording.

The application node stores application software and is used for generating data, writing data to the storage nodes, or accessing the storage nodes to read data. The application node and the cluster management node may be different logical nodes integrated in the same device, or may be devices distributed at different locations that are connected through a network or directly connected. This is not limited in the embodiments of this application, provided that the functions of the application node and the cluster management node can be implemented.

An embodiment of this application provides a partition division method based on a distributed storage system. The distributed storage system includes a cluster management node, an application node, and S storage nodes. The basic principle is as follows: first, the cluster management node obtains fault information, which indicates a failed storage node or a failed storage medium; then, the cluster management node re-divides the storage media that are in a normal state according to the fault information, the loads of the storage nodes that are in a normal state, and the EC redundancy mode, and obtains first updated partition information; finally, the cluster management node sends the first updated partition information to the application node. In this way, after a storage node or a storage medium fails, the storage media in a normal state are re-divided so that the number of storage media in each partition always matches the configuration of the EC redundancy mode. This ensures that data can be written successfully and effectively improves data reliability.

The implementations of the embodiments of this application are described in detail below with reference to the accompanying drawings.

Fig. 5 is a flowchart of a partition division method based on a distributed storage system according to an embodiment of this application. As shown in Fig. 5, the method may include:

S501. The cluster management node obtains fault information.

For example, the cluster management node may periodically send a status request message to the storage nodes in the distributed storage system to query whether each storage node is in a normal state. The status request message requests the storage node to return status information. If the storage node is in a normal state, it may return a normal-status response message to the cluster management node; if the storage node is in a fault state, it may return a fault-status response message to the cluster management node. In addition, if the cluster management node receives no message from a storage node, such as a normal-status response message or a fault-status response message, within a predetermined period, the cluster management node may determine that the response of the storage node has timed out and that the storage node is in a fault state. The fault-status response message may indicate a failure of the storage node itself or a failure of a storage medium included in the storage node.

Alternatively, the cluster management node does not need to actively send status request messages to the storage nodes in the distributed storage system. Instead, each storage node may periodically send status information to the cluster management node to report whether its own state is normal. If the cluster management node does not receive the status information of a storage node within a predetermined period, it may determine that the storage node has failed.
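The two detection paths described above (an explicit fault-status report, and silence for longer than the predetermined period) can be sketched as follows. The class name `HeartbeatMonitor`, its methods, and the use of caller-supplied timestamps are illustrative assumptions; the source does not specify message formats or timer handling.

```python
class HeartbeatMonitor:
    """Tracks storage-node status reports on the cluster management node.

    Hypothetical sketch: timestamps are passed in by the caller so that the
    logic stays deterministic and independent of any real clock or network.
    """

    def __init__(self, node_ids, timeout_s, now=0.0):
        self.timeout_s = timeout_s
        self.last_seen = {n: now for n in node_ids}
        self.reported_faulty = set()

    def on_status(self, node_id, healthy, now):
        """Record a normal-status or fault-status message from a node."""
        self.last_seen[node_id] = now
        if healthy:
            self.reported_faulty.discard(node_id)
        else:
            self.reported_faulty.add(node_id)

    def failed_nodes(self, now):
        """Nodes that reported a fault or stayed silent past the timeout."""
        timed_out = {n for n, t in self.last_seen.items()
                     if now - t > self.timeout_s}
        return sorted(timed_out | self.reported_faulty)


monitor = HeartbeatMonitor(node_ids=[1, 2, 3], timeout_s=5.0)
monitor.on_status(1, healthy=True, now=4.0)   # normal-status response
monitor.on_status(2, healthy=False, now=4.0)  # fault-status response
print(monitor.failed_nodes(now=6.0))          # node 3 never answered
```

The same bookkeeping serves both the polling variant and the variant in which storage nodes report on their own; only the party that initiates the message differs.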

The fault information indicates a failed storage node or a failed storage medium.

S502. The cluster management node re-divides the storage media that are in a normal state according to the fault information, the loads of the storage nodes that are in a normal state, and the EC redundancy mode, and obtains first updated partition information.

Optionally, the fault information is the node identifier of a failed storage node. For example, storage node 1 mentioned above may be a node identifier. Assume that i storage nodes fail, where i is an integer greater than or equal to 1 and less than or equal to 3. In this case, data cannot be written to any partition that includes a storage medium of a failed storage node. Therefore, the cluster management node divides the (S-i)*X storage media included in the S-i storage nodes into Q partitions according to the EC redundancy mode and the loads of the S-i storage nodes, and obtains the first updated partition information. The first updated partition information includes the partition identifier of each of the Q partitions and the medium identifiers of the storage media included in each of the Q partitions. For example, storage medium 1-1 and storage medium 2-3 mentioned above may be medium identifiers, and partition one and partition two mentioned above may be partition identifiers.

When Y is less than or equal to S, for each of the Q partitions, the cluster management node first selects Y different storage nodes from the S-i storage nodes, then selects one storage medium from each of the Y storage nodes, and determines that the selected storage media form one partition.

For example, assume that the EC redundancy mode is 4 data fragments and 2 parity fragments, that is, N=4 and K=2. The distributed storage system includes 8 storage nodes, and each storage node includes 6 storage media. Assume that the cluster management node learns that storage node 8 has failed. According to the EC redundancy mode, the cluster management node divides the 6*7=42 storage media included in storage node 1 to storage node 7 into partitions: it first selects 6 different storage nodes from the 7 storage nodes, then selects one storage medium from each of the 6 selected storage nodes, and determines that the selected storage media form one partition. Seven partitions can be obtained, each including 6 storage media.

As shown in Fig. 6, partition one includes storage media 1-1, 2-1, 3-1, 4-1, 5-1, and 6-1. Partition two includes storage media 2-2, 3-2, 4-2, 5-2, 6-2, and 7-2. Partition three includes storage media 3-3, 4-3, 5-3, 6-3, 7-3, and 1-2. Partition four includes storage media 4-4, 5-4, 6-4, 7-4, 1-3, and 2-3. Partition five includes storage media 5-5, 6-5, 7-5, 1-4, 2-4, and 3-4. Partition six includes storage media 6-6, 7-6, 1-5, 2-5, 3-5, and 4-5. Partition seven includes storage media 1-6, 2-6, 3-6, 4-6, 5-6, and 7-1.

It should be noted that in above-mentioned dividing mode, the storage medium that each subregion includes is different.A kind of possible , can be from distribution when a subregion is not enough formed for remaining storage medium in distributed memory system in implementation Loaded in storage system in the memory node of minimum and select divided storage medium, with remaining storage medium group component Area, i.e., different subregions can include identical storage medium.

It is exemplary, it is assumed that the redundant mode of EC verifies bursts, i.e. N=4, K=2 for 4 data fragmentations and 2.It is distributed Storage system includes 8 memory nodes, and each memory node includes 5 storage mediums.Assuming that cluster management node receives storage 8 failure of node.Cluster management node according to the redundant mode of EC for 4 data fragmentations and 2 verification bursts to memory node 1 to The 5*7=35 storage medium that memory node 7 includes carries out subregion division, i.e., different 6 is first selected from 7 memory nodes Memory node, then a storage medium is selected from each memory node of 6 different memory nodes, will be from 6 storage sections The storage medium selected in each memory node of point is determined as forming a subregion, can obtain six subregions, each subregion Including 6 storage mediums.

As shown in fig. 7, subregion one include storage medium 1-1, storage medium 2-1, storage medium 3-1, storage medium 4-1, Storage medium 5-1 and storage medium 6-1.Subregion two includes storage medium 2-2, storage medium 3-2, storage medium 4-2, storage Jie Matter 5-2, storage medium 6-2 and storage medium 7-2.Subregion three include storage medium 3-3, storage medium 4-3, storage medium 5-3, Storage medium 6-3, storage medium 7-3 and storage medium 1-2.Subregion four includes storage medium 4-4, storage medium 5-4, storage Jie Matter 6-4, storage medium 7-4, storage medium 1-3 and storage medium 2-3.Subregion five include storage medium 5-5, storage medium 6-5, Storage medium 7-5, storage medium 1-4, storage medium 2-4 and storage medium 3-4.At this time, remaining storage medium 1-5, storage are situated between Five matter 2-5, storage medium 3-5, storage medium 4-5 and storage medium 7-1 storage mediums, the also poor storage of composition subregion are situated between Matter, can select a storage medium from small to large from memory node 5 and memory node 6 according to load.Assuming that storage medium 5-3 loads are minimum, can be by storage medium 5-3, and storage medium 1-5, storage medium 2-5, storage medium 3-5, storage Jie Matter 4-5 and storage medium 7-1 form subregion six together.

It should be noted that in above-mentioned dividing mode, although there is a memory node failure in distributed memory system, But the number of memory node is also greater than the configuration of the redundant mode of EC.In a kind of possible implementation, distribution is deposited The number of the normal memory node of state is likely less than the configuration of the redundant mode of EC in storage system, i.e. Y is more than S, therefore, Q At least two storage mediums belong to same memory node in the storage medium that each subregion includes in subregion.

It is exemplary, it is assumed that the redundant mode of EC verifies bursts, i.e. N=4, K=2 for 4 data fragmentations and 2.It is distributed Storage system includes 6 memory nodes, and each memory node includes 6 storage mediums.Assuming that cluster management node receives storage 6 failure of node.Cluster management node according to the redundant mode of EC for 4 data fragmentations and 2 verification bursts to memory node 1 to 6*5=30 storage medium that memory node 5 includes carries out subregion division, and cluster management node is first from the every of 5 memory nodes A memory node selects a storage medium, obtains 5 storage mediums, then, according to the load of 5 memory nodes, from it is small to 5 memory node sequences of senior general, since the minimum memory node of load, select 1 storage medium, by 5 storage mediums and 1 A storage medium forms new subregion.There are two different storage mediums of same memory node in subregion.

As shown in figure 8, subregion one include storage medium 1-1, storage medium 2-1, storage medium 3-1, storage medium 4-1, Storage medium 5-1 and storage medium 1-2.Subregion two includes storage medium 2-2, storage medium 3-2, storage medium 4-2, storage Jie Matter 5-2, storage medium 1-3 and storage medium 2-3.Subregion three include storage medium 3-3, storage medium 4-3, storage medium 5-3, Storage medium 1-4, storage medium 2-4 and storage medium 3-4.Subregion four includes storage medium 4-4, storage medium 5-4, storage Jie Matter 1-5, storage medium 2-5, storage medium 3-5 and storage medium 4-5.Subregion five include storage medium 1-6, storage medium 2-6, Storage medium 3-6, storage medium 4-6, storage medium 5-6 and storage medium 5-5.
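The case in which fewer healthy storage nodes remain than N+K can be sketched as follows: each partition first takes one medium from every healthy node, then takes the missing media from the least-loaded nodes. The function name `divide_with_few_nodes` and the static `node_load` mapping are assumptions made for illustration; with equal or ascending loads the first partition coincides with partition one of Fig. 8, while later partitions may differ from the figure.

```python
def divide_with_few_nodes(healthy_nodes, media_per_node, width, node_load):
    """Form partitions of `width` media when len(healthy_nodes) < width.

    Hypothetical sketch: media are (node, medium) pairs, `node_load` maps a
    node id to its load, and leftover media that cannot fill a complete
    partition are simply left unassigned.
    """
    free = {n: [(n, m) for m in range(1, media_per_node + 1)]
            for n in healthy_nodes}
    by_load = sorted(free, key=lambda n: (node_load.get(n, 0), n))
    partitions = []
    while sum(len(v) for v in free.values()) >= width:
        part = []
        # Repeated passes over the nodes from least to most loaded: the
        # first pass takes one medium per node, later passes add the extra
        # media that must share a node with an earlier pick.
        while len(part) < width:
            for n in by_load:
                if len(part) < width and free[n]:
                    part.append(free[n].pop(0))
        partitions.append(part)
    return partitions


# Storage node 6 has failed; nodes 1 to 5 remain with 6 media each.
loads = {1: 0.1, 2: 0.2, 3: 0.3, 4: 0.4, 5: 0.5}
result = divide_with_few_nodes([1, 2, 3, 4, 5], 6, 6, loads)
print(len(result), result[0])
```

Because every partition then holds at least two media of one node, the loss of that node removes two fragments of a stripe at once, which a K=2 configuration can still tolerate.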

Optionally, the fault information is the medium identifier of a failed storage medium, where a medium identifier indicates the position of a storage medium in the distributed storage system. For example, storage medium 1-1, which denotes storage medium 1 in storage node 1, and storage medium 4-5, which denotes storage medium 5 in storage node 4, may be medium identifiers. Assume that j storage media fail, where j is an integer greater than or equal to 1. In this case, data cannot be written to any partition that includes a failed storage medium. Therefore, the cluster management node divides the (S*X)-j storage media included in the S storage nodes into W partitions according to the EC redundancy mode and the loads of the storage nodes in a normal state, and obtains the first updated partition information. The first updated partition information includes the partition identifier of each of the W partitions and the medium identifiers of the storage media included in each of the W partitions. For example, partition one and partition two mentioned above may be partition identifiers.
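The two counts used in S502, (S-i)*X healthy media after i node failures and (S*X)-j healthy media after j medium failures, can be checked with a small helper. The name `healthy_media` and the representation of a medium identifier as a (node, medium) pair are illustrative assumptions.

```python
def healthy_media(num_nodes, media_per_node, failed_nodes=(), failed_media=()):
    """List the (node, medium) identifiers that are still in a normal state.

    Hypothetical helper: `failed_nodes` holds node identifiers and
    `failed_media` holds (node, medium) identifiers taken from the fault
    information.
    """
    failed_nodes = set(failed_nodes)
    failed_media = set(failed_media)
    return [(n, m)
            for n in range(1, num_nodes + 1) if n not in failed_nodes
            for m in range(1, media_per_node + 1) if (n, m) not in failed_media]


# S = 8, X = 6, storage node 8 failed: (S - i) * X = 42 media remain.
print(len(healthy_media(8, 6, failed_nodes=[8])))
# S = 6, X = 6, media 6-1 and 5-2 failed: S * X - j = 34 media remain.
print(len(healthy_media(6, 6, failed_media=[(6, 1), (5, 2)])))
```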

It should be noted that, if a storage medium in a storage node fails, partitions should first be formed by selecting a storage medium from each of the S storage nodes, so that the storage media included in each partition belong to different storage nodes. When the remaining storage media in the distributed storage system are not enough to form a partition, already-assigned storage media may be selected from the least-loaded storage nodes in the distributed storage system and combined with the remaining storage media to form a partition. That is, different partitions among the W partitions may include the same storage medium (the same storage medium in the same storage node).

For example, taking the partitions shown in Fig. 2, assume that storage medium 1 of storage node 8 fails, that is, storage medium 8-1 in partition seven fails. Partition seven then lacks one storage medium, and storage medium 8-2 can be added to partition seven. Partition eight in turn lacks one storage medium. Assuming that storage medium 6-6 has the lowest load, storage medium 6-6 can be reused and added to partition eight. As shown in Fig. 9, partition seven includes storage media 1-5, 2-5, 3-5, 4-5, 7-1, and 8-2. Partition eight includes storage media 1-6, 2-6, 3-6, 4-6, 5-6, and 6-6. The storage media included in partition one to partition six are the same as those included in partition one to partition six shown in Fig. 2.

For example, taking the partitions shown in Fig. 3, assume that storage medium 1 of storage node 6 fails, that is, storage medium 6-1 in partition one fails, and that storage medium 2 of storage node 5 fails, that is, storage medium 5-2 in partition two fails. As shown in Fig. 10, storage medium 6-2 can be added to partition one, so that partition one includes storage media 1-1, 2-1, 3-1, 4-1, 5-1, and 6-2. Storage media 5-3 and 6-3 can be added to partition two, so that partition two includes storage media 1-2, 2-2, 3-2, 4-2, 5-3, and 6-3. Storage media 5-4 and 6-4 can be added to partition three, so that partition three includes storage media 1-3, 2-3, 3-3, 4-3, 5-4, and 6-4. Storage media 5-5 and 6-5 can be added to partition four, so that partition four includes storage media 1-4, 2-4, 3-4, 4-4, 5-5, and 6-5. Storage media 5-6 and 6-6 can be added to partition five, so that partition five includes storage media 1-5, 2-5, 3-5, 4-5, 5-6, and 6-6. At this point, four storage media remain: 1-6, 2-6, 3-6, and 4-6. Two more storage media are needed to form a partition, and they can be selected in ascending order of load from storage node 5 and storage node 6. Assuming that storage media 5-4 and 6-5 have the lowest loads, storage media 5-4 and 6-5 can form partition six together with storage media 1-6, 2-6, 3-6, and 4-6.

In another possible implementation, as shown in Fig. 11, partition one includes storage media 1-1, 2-1, 3-1, 4-1, 5-1, and 1-2. Partition two includes storage media 1-3, 2-2, 3-2, 4-2, 5-3, and 6-3. Partition three includes storage media 1-4, 2-3, 3-3, 4-3, 5-4, and 6-4. Partition four includes storage media 1-5, 2-4, 3-4, 4-4, 5-5, and 6-5. Partition five includes storage media 1-6, 2-5, 3-5, 4-5, 5-6, and 6-6. Assuming that storage medium 1-1 has a relatively low load, storage medium 1-1 can be reused and added to partition six. Partition six includes storage media 1-1, 2-6, 3-6, 4-6, 5-6, and 6-6.

Although the distributed storage system contains a failed storage node or a failed storage medium, the number of storage media in each partition is always kept at 6, the same as the configuration of the EC redundancy mode, which effectively improves data reliability. It should be noted that, in this case, one storage medium may belong to multiple partitions. If all the storage media of one of the two partitions that share a storage medium are written full, the shared storage medium has no capacity left for the other partition, and data cannot be written to that partition; the other partition can then be determined to be an invalid partition. If the storage media of one of the two partitions that share a storage medium are not written full, the shared storage medium still has free capacity for the other partition, and data can be written to it until the medium is full.
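The validity rule for partitions that share a storage medium can be sketched as a simple capacity check: a partition stays writable only while every one of its media, including a shared one, has room for another fragment. The function name `writable_partitions`, the capacity units, and the fixed fragment size are assumptions made for illustration.

```python
def writable_partitions(partitions, free_capacity, fragment_size):
    """Return the ids of partitions that can still accept one more stripe.

    Hypothetical sketch: `partitions` maps a partition id to its list of
    medium ids, and `free_capacity` maps a medium id to its free space in
    the same unit as `fragment_size`. A partition whose shared medium was
    filled through another partition is treated as invalid.
    """
    return [pid for pid, media in partitions.items()
            if all(free_capacity[m] >= fragment_size for m in media)]


partitions = {
    "partition five": ["1-6", "2-5", "5-6"],
    "partition six": ["1-1", "2-6", "5-6"],   # shares medium 5-6
}
free_capacity = {"1-6": 10, "2-5": 10, "5-6": 0, "1-1": 10, "2-6": 10}
print(writable_partitions(partitions, free_capacity, fragment_size=1))
```

In this toy input the shared medium 5-6 is full, so both partitions that contain it are reported as not writable.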

S503. The cluster management node sends the first updated partition information to the application node.

S504. The application node receives the first updated partition information sent by the cluster management node.

After receiving the first updated partition information sent by the cluster management node, the application node stores the first updated partition information.

Further, after the cluster management node re-divides the storage media in the distributed storage system into partitions, the application node can write data according to the newly divided partitions when it needs to write data. As shown in Fig. 12, the following steps may be included:

S505. The application node performs EC encoding on the data to be written to obtain L EC stripes.

Each EC stripe includes N data fragments and K parity fragments.

S506. The application node stores the L EC stripes in L of the Q partitions according to the first updated partition information.

The first updated partition information includes the partition identifier of each of the Q partitions and the medium identifiers of the storage media included in each of the Q partitions. The Q partitions are obtained by the cluster management node by dividing the (S-i)*X storage media included in the S-i storage nodes according to the EC redundancy mode and the loads of the S-i storage nodes, where i represents the number of failed storage nodes.

S507. The application node stores the L EC stripes in L of the W partitions according to the first updated partition information.

The first updated partition information includes the partition identifier of each of the W partitions and the medium identifiers of the storage media included in each of the W partitions. The W partitions are obtained by the cluster management node by dividing the (S*X)-j storage media included in the S storage nodes according to the EC redundancy mode and the loads of the S storage nodes, where j represents the number of failed storage media.

For example, assume that data stream 0 to 11 needs to be written to the distributed storage system, that the partition structure of the distributed storage system is the updated partitioning shown in Fig. 8, and that N=4 and K=2. As shown in Fig. 13, the data stream 0 to 11 is first divided into three blocks of 4 data items each: the first data block includes four data fragments, data fragment 0 to data fragment 3; the second data block includes four data fragments, data fragment 4 to data fragment 7; and the third data block includes four data fragments, data fragment 8 to data fragment 11. Then, for the first data block, an exclusive-OR operation is performed on data fragment 0 to data fragment 3 to obtain two parity fragments, parity fragment P0 and parity fragment Q0; data fragment 0 to data fragment 3, parity fragment P0, and parity fragment Q0 form the first stripe. For the second data block, an exclusive-OR operation is performed on data fragment 4 to data fragment 7 to obtain two parity fragments, parity fragment P1 and parity fragment Q1; data fragment 4 to data fragment 7, parity fragment P1, and parity fragment Q1 form the second stripe. For the third data block, an exclusive-OR operation is performed on data fragment 8 to data fragment 11 to obtain two parity fragments, parity fragment P2 and parity fragment Q2; data fragment 8 to data fragment 11, parity fragment P2, and parity fragment Q2 form the third stripe. The three stripes are written to three partitions in ascending order of partition load, for example, three of the partitions shown in Fig. 8.
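The striping step can be sketched as follows for N=4 and K=2. This is an illustration under loud assumptions rather than the encoding used in the embodiment: each fragment is reduced to a single byte value, P is the plain XOR of the data fragments, and Q is a swapped-in RAID-6 style parity (XOR of the fragments weighted by powers of 2 in GF(2^8) with polynomial 0x11D), since the source only states that exclusive-OR is used and does not say how Q differs from P. The function names are hypothetical.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reducing by the polynomial 0x11D."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def encode_stripes(data, n):
    """Split `data` (a list of byte values) into stripes of n data
    fragments followed by two parity fragments P and Q."""
    if len(data) % n:
        raise ValueError("data length must be a multiple of n")
    stripes = []
    for start in range(0, len(data), n):
        block = data[start:start + n]
        p, q, coef = 0, 0, 1
        for d in block:
            p ^= d                 # P: plain XOR of the data fragments
            q ^= gf_mul(coef, d)   # Q: weighted XOR (assumed, RAID-6 style)
            coef = gf_mul(coef, 2)
        stripes.append(block + [p, q])
    return stripes


def place_stripes(stripes, partition_load):
    """Assign each stripe to a distinct partition, least-loaded first."""
    order = sorted(partition_load, key=lambda pid: (partition_load[pid], pid))
    if len(stripes) > len(order):
        raise ValueError("not enough partitions")
    return dict(zip(order, stripes))


stripes = encode_stripes(list(range(12)), n=4)
placement = place_stripes(stripes, {"partition one": 0.7, "partition two": 0.1,
                                    "partition three": 0.4})
for pid, stripe in placement.items():
    print(pid, stripe)
```

A single lost data fragment can be rebuilt by XOR-ing P with the surviving data fragments of the same stripe; recovering two lost fragments additionally needs Q and is not shown here.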

In addition, after the cluster management node sends the first updated partition information to the application node, if the failed storage node or the failed storage medium recovers from the fault, the embodiments of this application may further include the following steps:

S508. The cluster management node obtains recovery information.

The recovery information indicates that the failed storage node or the failed storage medium has recovered from the fault.

S509. The cluster management node re-divides the storage media that are in a normal state according to the recovery information, the EC redundancy mode, and the loads of the storage nodes that are in a normal state, and obtains second updated partition information.

S510. The cluster management node sends the second updated partition information to the application node.

S511. The application node receives the second updated partition information sent by the cluster management node.

The second updated partition information is obtained by the cluster management node by re-dividing the storage media in a normal state according to the recovery information, the EC redundancy mode, and the loads of the storage nodes in a normal state, where the recovery information indicates that the failed storage node or the failed storage medium has recovered from the fault. If the application node needs to write data again, it can write the data to the storage nodes according to the second updated partition information.

It should be noted that the capacity of the distributed storage system and the capacity of each storage medium in the distributed storage system can be set by the user as required, which is not limited in the embodiments of this application. If the capacity of a partition is sufficient, multiple different stripes can be written to that partition.

In addition, when partitions are divided, priority is given to ensuring that the storage media in a partition belong to different storage nodes. If this cannot be ensured, the required number of storage media can be selected from the storage nodes of the distributed storage system in ascending order of storage medium load to form a partition.

The foregoing mainly describes the solutions provided in the embodiments of this application from the perspective of interaction between network elements. It can be understood that, to implement the foregoing functions, each network element, such as the cluster management node and the application node, includes corresponding hardware structures and/or software modules for performing the functions. A person skilled in the art should readily appreciate that, in combination with the algorithm steps of the examples described in the embodiments disclosed herein, this application can be implemented by hardware or by a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and the design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but such an implementation should not be considered to be beyond the scope of this application.

In the embodiments of this application, the cluster management node and the application node may be divided into functional modules according to the foregoing method examples. For example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. It should be noted that the module division in the embodiments of this application is schematic and is merely a logical function division; other division manners are possible in actual implementation.

In the case where dividing each function module using corresponding each function, Figure 14 shows in above-mentioned and embodiment and relates to And cluster management node a kind of possible composition schematic diagram, as shown in figure 14, which can include：Transmitting-receiving Unit 141 and processing unit 142.

Wherein, Transmit-Receive Unit 141, for support cluster management node perform shown in Fig. 5 based on distributed memory system Subregion division methods in S501 and S503, Figure 12 shown in method for writing data in S501, S503, S508 and S510.

Processing unit 142, for supporting cluster management node to perform the subregion based on distributed memory system shown in Fig. 5 S502 in division methods, the S502 and S509 in method for writing data shown in Figure 12.

It should be noted that all related contents for each step that above method embodiment is related to can quote correspondence The function description of function module, details are not described herein.

Cluster management node provided by the embodiments of the present application, draws for performing the above-mentioned subregion based on distributed memory system Divide method, therefore the effect identical with the above-mentioned subregion division methods based on distributed memory system can be reached.

In a specific implementation, the cluster management node described in Figure 14 may be implemented by the computer device shown in Figure 15.

Figure 15 is a composition schematic diagram of a computer device provided by the embodiments of the present application. As shown in Figure 15, the computer device may include at least one processor 151, a memory 152, a communication interface 153 and a communication bus 154.

The components of the computer device are described in detail below with reference to Figure 15:

The processor 151 is the control center of the computer device and may be a single processor or a collective term for multiple processing elements. For example, the processor 151 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application, such as one or more digital signal processors (DSP) or one or more field programmable gate arrays (FPGA).

The processor 151 can perform the various functions of the computer device by running or executing the software program stored in the memory 152 and by calling the data stored in the memory 152.

In a specific implementation, as an embodiment, the processor 151 may include one or more CPUs, such as CPU0 and CPU1 shown in Figure 15.

The processor described in the embodiments of the present application is mainly configured to obtain fault information and to repartition the storage media in a normal state according to the fault information, the load of the storage nodes in a normal state, and the redundancy mode of the EC, thereby obtaining first updated partition information.

In a specific implementation, as an embodiment, the computer device may include multiple processors, such as the processor 151 and the processor 155 shown in Figure 15. Each of these processors may be a single-core processor (single-CPU) or a multi-core processor (multi-CPU). A processor here may refer to one or more devices, circuits, and/or processing cores for processing data (for example, computer program instructions).

The memory 152 may be a read-only memory (ROM) or another type of static storage device capable of storing static information and instructions, a random access memory (RAM) or another type of dynamic storage device capable of storing information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory 152 may exist independently and be connected to the processor 151 through the communication bus 154. The memory 152 may also be integrated with the processor 151.

The memory 152 is configured to store the software program for executing the solutions of the present application, and execution is controlled by the processor 151.

The communication interface 153 is configured to communicate with other devices or communication networks, such as Ethernet, a radio access network (RAN), or a wireless local area network (WLAN). The communication interface 153 may include a receiving unit that implements a receiving function and a sending unit that implements a sending function.

The communication interface described in the embodiments of the present application is mainly configured to send the first updated partition information to the application node.

The communication bus 154 may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be classified as an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in Figure 15, but this does not mean that there is only one bus or one type of bus.

The device structure shown in Figure 15 does not constitute a limitation on the computer device, which may include more or fewer components than illustrated, combine some components, or use a different arrangement of components.

In the case where an integrated unit is used, Figure 16 shows another possible composition schematic diagram of the cluster management node involved in the foregoing embodiments. As shown in Figure 16, the cluster management node includes a processing module 161 and a communication module 162.

The processing module 161 is configured to control and manage the actions of the cluster management node. For example, the processing module 161 is configured to support the cluster management node in performing S502 in Figure 5, S502 and S509 in Figure 12, and/or other processes of the techniques described herein. The communication module 162 is configured to support communication between the cluster management node and other network entities, for example, communication with the application node and the storage nodes shown in Figure 1. The cluster management node may further include a storage module 163 configured to store the program code and data of the cluster management node.

The processing module 161 may be a processor or a controller. It can implement or execute the various exemplary logical blocks, modules and circuits described in connection with the disclosure of the present application. The processor may also be a combination that implements a computing function, for example, a combination of one or more microprocessors, or a combination of a DSP and a microprocessor. The communication module 162 may be a transceiver, a transceiver circuit, a communication interface, or the like. The storage module 163 may be a memory.

When the processing module 161 is a processor, the communication module 162 is a communication interface, and the storage module 163 is a memory, the cluster management node involved in the embodiments of the present application may be the computer device shown in Figure 15.

In the case where each functional module is divided corresponding to each function, Figure 17 shows a possible composition schematic diagram of the application node involved in the foregoing embodiments. As shown in Figure 17, the application node may include a processing unit 171 and a transceiver unit 172.

The processing unit 171 is configured to support the application node in performing S505, S506 and S507 in the data writing method shown in Figure 12.

The transceiver unit 172 is configured to support the application node in performing S504 in the partitioning method based on a distributed storage system shown in Figure 5, and S504, S505, S506, S507 and S511 in the data writing method shown in Figure 12.

The application node provided by the embodiments of the present application is configured to perform the foregoing partitioning method based on a distributed storage system, and can therefore achieve the same effect as that method.

In the case where an integrated unit is used, Figure 18 shows another possible composition schematic diagram of the application node involved in the foregoing embodiments. As shown in Figure 18, the application node includes a processing module 181 and a communication module 182.

The processing module 181 is configured to control and manage the actions of the application node. For example, the processing module 181 is configured to support the application node in performing S505, S506 and S507 in Figure 12, and/or other processes of the techniques described herein. The communication module 182 is configured to support communication between the application node and other network entities, for example, communication with the cluster management node shown in Figure 1. The application node may further include a storage module 183 configured to store the program code and data of the application node.

The processing module 181 may be a processor or a controller. It can implement or execute the various exemplary logical blocks, modules and circuits described in connection with the disclosure of the present application. The processor may also be a combination that implements a computing function, for example, a combination of one or more microprocessors, or a combination of a DSP and a microprocessor. The communication module 182 may be a transceiver, a transceiver circuit, a communication interface, or the like. The storage module 183 may be a memory.

When the processing module 181 is a processor, the communication module 182 is a communication interface, and the storage module 183 is a memory, the application node involved in the embodiments of the present application may be the computer device shown in Figure 15.

Through the description of the foregoing embodiments, those skilled in the art can clearly understand that, for convenience and brevity of description, only the division of the foregoing functional modules is used as an example. In practical applications, the foregoing functions may be allocated to different functional modules as needed, that is, the internal structure of the apparatus is divided into different functional modules to complete all or part of the functions described above.

In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely schematic. The division of modules or units is merely a logical function division, and other division manners are possible in actual implementation; for example, multiple units or components may be combined or integrated into another apparatus, or some features may be ignored or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, apparatuses or units, and may be electrical, mechanical or in other forms.

The units described as separate components may or may not be physically separate, and the components shown as units may be one physical unit or multiple physical units, that is, they may be located in one place or distributed across multiple different places. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.

If the integrated unit is implemented in the form of a software functional unit and is sold or used as an independent product, it may be stored in a readable storage medium. Based on such an understanding, the technical solutions of the embodiments of the present application essentially, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions that cause a device (which may be a single-chip microcomputer, a chip, etc.) or a processor to perform all or part of the steps of the methods described in the embodiments of the present application. The foregoing storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

The foregoing is merely specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any change or replacement within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A partitioning method based on a distributed storage system, characterized in that the distributed storage system includes a cluster management node, an application node and S storage nodes, each storage node includes X storage media, the S*X storage media included in the S storage nodes are divided into P partitions according to a redundancy mode of erasure coding (EC), each of the P partitions includes Y storage media, and the Y storage media consist of one storage medium from each of Y storage nodes, wherein the redundancy mode of the EC is the number of data fragments and the number of parity fragments, N denotes the number of data fragments, K denotes the number of parity fragments, and Y=N+K,

the method includes:

the cluster management node obtains fault information, where the fault information is used to indicate a failed storage node or a failed storage medium;

the cluster management node repartitions the storage media in a normal state according to the fault information, the load of the storage nodes in a normal state, and the redundancy mode of the EC, to obtain first updated partition information;

the cluster management node sends the first updated partition information to the application node.
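
As an illustration of the repartitioning step in claim 1, the following is a minimal Python sketch under stated assumptions, not the claimed implementation. It assumes a greedy placement in which each partition takes Y = N + K healthy media, preferring nodes that contribute the fewest media to the current partition and then media that already belong to the fewest partitions. The function name `repartition`, the medium identifiers and this load measure are hypothetical; the claim only states that the load of the normal storage nodes is taken into account.

```python
from collections import defaultdict


def repartition(media_by_node, failed_nodes, failed_media, n, k, num_partitions):
    """Return {partition_id: [medium_id, ...]} built from healthy media only.

    media_by_node maps a node id to its list of globally unique medium ids.
    """
    y = n + k  # media per partition (data fragments + parity fragments)

    # Keep only media on healthy nodes that are themselves healthy.
    healthy = {}
    for node, media in media_by_node.items():
        if node in failed_nodes:
            continue
        usable = [m for m in media if m not in failed_media]
        if usable:
            healthy[node] = usable

    if sum(len(m) for m in healthy.values()) < y:
        raise ValueError("not enough healthy media for one partition")

    load = defaultdict(int)  # assumed load measure: partitions per medium
    partitions = {}
    for pid in range(num_partitions):
        chosen = []
        per_node = defaultdict(int)  # media taken from each node for this partition
        while len(chosen) < y:
            # Prefer the node used least in this partition, then the least
            # loaded medium; node and medium ids break remaining ties.
            candidates = [
                (per_node[node], load[m], node, m)
                for node, media in healthy.items()
                for m in media
                if m not in chosen
            ]
            _, _, node, medium = min(candidates)
            chosen.append(medium)
            per_node[node] += 1
            load[medium] += 1
        partitions[pid] = chosen
    return partitions
```

With four nodes of three media each, N = 2, K = 2 and one failed node, every partition in this sketch draws its four media from the three remaining nodes, so one node contributes two media to each partition.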

2. The method according to claim 1, characterized in that, if the fault information is the node identifier of a failed storage node and i storage nodes have failed, the cluster management node repartitioning the storage media in a normal state according to the fault information, the load of the storage nodes in a normal state, and the redundancy mode of the EC, to obtain the first updated partition information, includes:

the cluster management node divides the (S-i)*X storage media included in the S-i storage nodes into Q partitions according to the redundancy mode of the EC and the load of the S-i storage nodes, to obtain the first updated partition information, where the first updated partition information includes the partition identifier of each of the Q partitions and the medium identifiers of the storage media included in each of the Q partitions.

3. The method according to claim 1, characterized in that, if the fault information is the medium identifier of a failed storage medium and j storage media have failed, the cluster management node repartitioning the storage media in a normal state according to the fault information, the load of the storage nodes in a normal state, and the redundancy mode of the EC, to obtain the first updated partition information, includes:

the cluster management node divides the (S*X)-j storage media included in the S storage nodes into W partitions according to the redundancy mode of the EC and the load of the S storage nodes, to obtain the first updated partition information, where the first updated partition information includes the partition identifier of each of the W partitions and the medium identifiers of the storage media included in each of the W partitions.

4. The method according to any one of claims 1-3, characterized in that, after the cluster management node sends the first updated partition information to the application node, the method further includes:

the cluster management node obtains recovery information, where the recovery information is used to indicate that the fault of the failed storage node has been cleared or that the fault of the failed storage medium has been cleared;

the cluster management node repartitions the storage media in a normal state according to the recovery information, the redundancy mode of the EC, and the load of the storage nodes in a normal state, to obtain second updated partition information;

the cluster management node sends the second updated partition information to the application node.
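
The fault and recovery handling of claims 1-4 can be pictured as bookkeeping over the set of healthy storage media: a fault shrinks the set before the first repartitioning, and a recovery restores it before the second. The sketch below is only an illustration under that assumption; the class `ClusterView` and its method names are hypothetical and do not appear in the source.

```python
class ClusterView:
    """Tracks which storage media are currently usable for partitioning."""

    def __init__(self, media_by_node):
        # media_by_node maps a node id to its list of globally unique medium ids.
        self.media_by_node = {node: list(media) for node, media in media_by_node.items()}
        self.failed_nodes = set()
        self.failed_media = set()

    def report_fault(self, node=None, medium=None):
        """Record fault information for a node and/or a single medium."""
        if node is not None:
            self.failed_nodes.add(node)
        if medium is not None:
            self.failed_media.add(medium)

    def report_recovery(self, node=None, medium=None):
        """Record recovery information; the item becomes usable again."""
        if node is not None:
            self.failed_nodes.discard(node)
        if medium is not None:
            self.failed_media.discard(medium)

    def healthy_media(self):
        """Media that a repartitioning step may use right now."""
        return [
            m
            for node, media in sorted(self.media_by_node.items())
            if node not in self.failed_nodes
            for m in media
            if m not in self.failed_media
        ]
```

With S = 3 nodes and X = 2 media per node, one failed node leaves (S-i)*X = 4 media, and clearing that fault returns the view to all 6 media, at which point the second updated partition information would be computed.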

5. A data writing method, characterized in that a distributed storage system includes a cluster management node, an application node and S storage nodes, each storage node includes X storage media, the S*X storage media included in the S storage nodes are divided into P partitions according to a redundancy mode of erasure coding (EC), each of the P partitions includes Y storage media, and the Y storage media consist of one storage medium from each of Y storage nodes, wherein the redundancy mode of the EC is the number of data fragments and the number of parity fragments, N denotes the number of data fragments, K denotes the number of parity fragments, and Y=N+K,

the method includes:

the application node performs EC encoding on data to be written to obtain L EC stripes, where each EC stripe includes N data fragments and K parity fragments, L is determined by the data volume of the data to be written, and L is greater than or equal to 1;

the application node stores the L EC stripes to L partitions among Q partitions according to first updated partition information, where the first updated partition information includes the partition identifier of each of the Q partitions and the medium identifiers of the storage media included in each of the Q partitions, the Q partitions are obtained by the cluster management node by dividing the (S-i)*X storage media included in S-i storage nodes according to the redundancy mode of the EC and the load of the S-i storage nodes, and i denotes the number of failed storage nodes; or, the application node stores the L EC stripes to L partitions among W partitions according to the first updated partition information, where the first updated partition information includes the partition identifier of each of the W partitions and the medium identifiers of the storage media included in each of the W partitions, the W partitions are obtained by the cluster management node by dividing the (S*X)-j storage media included in the S storage nodes according to the redundancy mode of the EC and the load of the S storage nodes, and j denotes the number of failed storage media.
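
To make the stripe count in claim 5 concrete, the following Python sketch splits the data to be written into L stripes of N data fragments each. It is a simplified stand-in rather than the claimed encoding: it computes a single XOR parity fragment (K = 1) instead of a general erasure code such as Reed-Solomon, and the parameter `fragment_size` and the zero padding of the last stripe are assumptions not stated in the source.

```python
import math
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length fragments."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_stripes(data: bytes, n: int, fragment_size: int):
    """Split data into L stripes of n data fragments plus one XOR parity fragment.

    Returns a list of (data_fragments, parity_fragments) tuples.
    """
    stripe_bytes = n * fragment_size
    # L is determined by the data volume and is at least 1.
    num_stripes = max(1, math.ceil(len(data) / stripe_bytes))

    stripes = []
    for s in range(num_stripes):
        chunk = data[s * stripe_bytes:(s + 1) * stripe_bytes]
        chunk = chunk.ljust(stripe_bytes, b"\0")  # assumed zero padding
        fragments = [
            chunk[i * fragment_size:(i + 1) * fragment_size] for i in range(n)
        ]
        parity = reduce(xor_bytes, fragments)  # K = 1 stand-in for EC parity
        stripes.append((fragments, [parity]))
    return stripes
```

Each stripe then holds N + 1 fragments, one per storage medium of a partition; with XOR parity any single lost fragment can be rebuilt by XOR-ing the remaining fragments of that stripe.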

6. The method according to claim 5, characterized in that, before the application node performs EC encoding on the data to be written to obtain the L EC stripes, the method further includes:

the application node receives the first updated partition information sent by the cluster management node.

7. The method according to claim 6, characterized in that, after the application node receives the first updated partition information sent by the cluster management node, the method further includes:

the application node receives second updated partition information sent by the cluster management node, where the second updated partition information is obtained by the cluster management node by repartitioning the storage media in a normal state according to recovery information, the redundancy mode of the EC, and the load of the storage nodes in a normal state, and the recovery information is used to indicate that the fault of the failed storage node has been cleared or that the fault of the failed storage medium has been cleared.

8. A cluster management node, characterized in that a distributed storage system includes the cluster management node, an application node and S storage nodes, each storage node includes X storage media, the S*X storage media included in the S storage nodes are divided into P partitions according to a redundancy mode of erasure coding (EC), each of the P partitions includes Y storage media, and the Y storage media consist of one storage medium from each of Y storage nodes, wherein the redundancy mode of the EC is the number of data fragments and the number of parity fragments, N denotes the number of data fragments, K denotes the number of parity fragments, and Y=N+K,

the cluster management node includes:

a transceiver unit, configured to obtain fault information, where the fault information is used to indicate a failed storage node or a failed storage medium;

a processing unit, configured to repartition the storage media in a normal state according to the fault information, the load of the storage nodes in a normal state, and the redundancy mode of the EC, to obtain first updated partition information;

the transceiver unit is further configured to send the first updated partition information to the application node.

9. The cluster management node according to claim 8, characterized in that, if the fault information is the node identifier of a failed storage node and i storage nodes have failed, the processing unit is specifically configured to:

divide the (S-i)*X storage media included in the S-i storage nodes into Q partitions according to the redundancy mode of the EC and the load of the S-i storage nodes, to obtain the first updated partition information, where the first updated partition information includes the partition identifier of each of the Q partitions and the medium identifiers of the storage media included in each of the Q partitions.

10. The cluster management node according to claim 8, characterized in that, if the fault information is the medium identifier of a failed storage medium and j storage media have failed, the processing unit is specifically configured to:

divide the (S*X)-j storage media included in the S storage nodes into W partitions according to the redundancy mode of the EC and the load of the S storage nodes, to obtain the first updated partition information, where the first updated partition information includes the partition identifier of each of the W partitions and the medium identifiers of the storage media included in each of the W partitions.

11. The cluster management node according to any one of claims 8-10, characterized in that

the transceiver unit is further configured to obtain recovery information, where the recovery information is used to indicate that the fault of the failed storage node has been cleared or that the fault of the failed storage medium has been cleared;

the processing unit is further configured to repartition the storage media in a normal state according to the recovery information, the redundancy mode of the EC, and the load of the storage nodes in a normal state, to obtain second updated partition information;

the transceiver unit is further configured to send the second updated partition information to the application node.

12. An application node, characterized in that a distributed storage system includes a cluster management node, the application node and S storage nodes, each storage node includes X storage media, the S*X storage media included in the S storage nodes are divided into P partitions according to a redundancy mode of erasure coding (EC), each of the P partitions includes Y storage media, and the Y storage media consist of one storage medium from each of Y storage nodes, wherein the redundancy mode of the EC is the number of data fragments and the number of parity fragments, N denotes the number of data fragments, K denotes the number of parity fragments, and Y=N+K,

the application node includes:

a processing unit, configured to perform EC encoding on data to be written to obtain L EC stripes, where each EC stripe includes N data fragments and K parity fragments, L is determined by the data volume of the data to be written, and L is greater than or equal to 1;

the processing unit and a transceiver unit, configured to store the L EC stripes to L partitions among Q partitions according to first updated partition information, where the first updated partition information includes the partition identifier of each of the Q partitions and the medium identifiers of the storage media included in each of the Q partitions, the Q partitions are obtained by the cluster management node by dividing the (S-i)*X storage media included in S-i storage nodes according to the redundancy mode of the EC and the load of the S-i storage nodes, and i denotes the number of failed storage nodes; or, the application node stores the L EC stripes to L partitions among W partitions according to the first updated partition information, where the first updated partition information includes the partition identifier of each of the W partitions and the medium identifiers of the storage media included in each of the W partitions, the W partitions are obtained by the cluster management node by dividing the (S*X)-j storage media included in the S storage nodes according to the redundancy mode of the EC and the load of the S storage nodes, and j denotes the number of failed storage media.

13. The application node according to claim 12, characterized in that

the transceiver unit is further configured to receive the first updated partition information sent by the cluster management node.

14. The application node according to claim 13, characterized in that

the transceiver unit is further configured to receive second updated partition information sent by the cluster management node, where the second updated partition information is obtained by the cluster management node by repartitioning the storage media in a normal state according to recovery information, the redundancy mode of the EC, and the load of the storage nodes in a normal state, and the recovery information is used to indicate that the fault of the failed storage node has been cleared or that the fault of the failed storage medium has been cleared.

15. The method according to claims 1-7, the cluster management node according to claims 8-11, and the application node according to claims 12-14, characterized in that, after the cluster management node obtains the fault information, if Y is greater than S, each of the Q partitions includes at least two storage media belonging to the same storage node.
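
The condition in claim 15 follows from a counting argument: when a partition needs more media than there are usable nodes, some node must supply at least two of them. The short Python sketch below works this out; the function names are hypothetical, and the survivability check additionally assumes that fragments are spread as evenly as possible across nodes and that the EC tolerates the loss of up to K fragments per stripe.

```python
import math


def max_media_per_node(y: int, healthy_nodes: int) -> int:
    """Smallest possible value of the largest per-node share of a partition."""
    return math.ceil(y / healthy_nodes)


def survives_one_node_failure(y: int, k: int, healthy_nodes: int) -> bool:
    """True if losing any single node costs at most k fragments per stripe.

    Assumes the even spread described above.
    """
    return max_media_per_node(y, healthy_nodes) <= k
```

For example, with N = 2, K = 2 (Y = 4) and three usable nodes, at least one node holds two media of every partition, and a further single-node failure would remove at most two fragments per stripe, which a code with K = 2 can still tolerate.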

16. The method according to claims 1-7, the cluster management node according to claims 8-11, and the application node according to claims 12-14, characterized in that S is an integer greater than or equal to 3 and less than or equal to 20.

17. A computer program product including instructions, characterized in that, when the computer program product runs on a cluster management node, the cluster management node is caused to perform the partitioning method based on a distributed storage system according to any one of claims 1-4.

18. A computer program product including instructions, characterized in that, when the computer program product runs on an application node, the application node is caused to perform the data writing method according to any one of claims 5-7.

19. A computer-readable storage medium including instructions, characterized in that, when the instructions run on a cluster management node, the cluster management node is caused to perform the partitioning method based on a distributed storage system according to any one of claims 1-4.

20. A computer-readable storage medium including instructions, characterized in that, when the instructions run on an application node, the application node is caused to perform the data writing method according to any one of claims 5-7.

21. A device, characterized in that the device exists in the product form of a chip, the structure of the device includes a processor and a memory, the memory is configured to be coupled to the processor and to store the program instructions and data of the device, and the processor is configured to execute the program instructions stored in the memory so that the device performs the method according to any one of claims 1-4.

22. A device, characterized in that the device exists in the product form of a chip, the structure of the device includes a processor and a memory, the memory is configured to be coupled to the processor and to store the program instructions and data of the device, and the processor is configured to execute the program instructions stored in the memory so that the device performs the method according to any one of claims 5-7.