CN103608798A - Clustered file service - Google Patents

Publication number
CN103608798A
Authority
CN
China
Prior art keywords
equipment
file
name space
cluster
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201280027196.2A
Other languages
Chinese (zh)
Other versions
CN103608798B (en
Inventor
V·库兹耐特索夫
A·达马托
A·沃维克
V·彼得
H·阿洛伊修斯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Corp
Publication of CN103608798A
Application granted
Publication of CN103608798B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/176 Support for shared access to files; File sharing support
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/178 Techniques for file synchronisation in file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G06F16/1824 Distributed file systems implemented using Network-attached Storage [NAS] architecture
    • G06F16/183 Provision of network file services by network file servers, e.g. by using NFS, CIFS

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Stored Programmes (AREA)

Abstract

A cluster-based file service may operate on a cluster of two or more independent devices that have access to a common data storage. The file service may have a namespace definition on each device in the cluster, and the definition may be modified by any device operating the file service. Each instance of the file service may identify and capture a command that changes the namespace structure and cause the change to be propagated to the other members of the cluster. If one of the devices in the cluster does not successfully perform an update to the namespace structure, that device may be brought offline. The cluster-based file service may permit adding or removing devices from the cluster while the file service is operating, and may provide a high-throughput and high-availability file service.

Description

Clustered file service
Background
File services are used to share files with client devices. A file service may present files to client devices in the form of shares, which are directory structures, or portions of directory structures, in which files may be stored. In some cases, the same file may be made available in different shares.
Many file services may define a different set of permissions for different users for each share. Some users may have read/write permission, other users may have read-only permission, and still other users may have no access to the share. Some file systems may apply different permissions to subsets of a share, such as defining different permissions for individual files, directories, or groups of files or directories within a single share.
Summary
A cluster-based file service may operate on a cluster of two or more independent devices that have access to a common data storage. The file service may have a namespace definition on each device in the cluster, and the definition may be modified by any device operating the file service. Each instance of the file service may identify and capture a command that changes the namespace structure and cause the change to be propagated to the other members of the cluster. If one of the devices in the cluster does not successfully perform an update to the namespace structure, that device may be taken offline. The cluster-based file service may permit adding devices to or removing devices from the cluster while the file service is operating, and may provide a high-throughput and high-availability file service.
This Summary is provided to introduce, in a simplified form, a selection of concepts that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Brief description of the drawings
In the drawings:
Fig. 1 is a diagram illustrating an embodiment of a network environment with a clustered file service.
Fig. 2 is a functional diagram illustrating an embodiment of a conceptual topology of a file service cluster.
Fig. 3 is a timeline flowchart illustrating an embodiment of a method for managing cluster operations.
Fig. 4 is a flowchart illustrating an embodiment of a method for operating a file service.
Fig. 5 is a flowchart illustrating an embodiment of a method for updating slave nodes.
Detailed description
A cluster-based file service may use multiple devices in parallel to provide a file service to multiple clients. Each provider of the file service may have an identical copy of the file namespace and may identify and capture changes to the namespace. Those changes may be propagated to every member of the cluster that provides the same file service.
The architecture of the cluster may allow different groups of devices within the cluster to provide several different namespaces. For example, one namespace may be served by three devices in the cluster, while a second namespace may be served by four devices, two of which may be members of the group that serves the first namespace. In such an embodiment, some devices in the cluster may serve two or more namespaces, while other devices may serve only one.
The cluster may operate groups of devices in a leader and follower arrangement. A leader is defined as the device in the cluster that manages an application. In the case of a file service, the leader may be the device that starts and stops the file service, adds cluster devices to or removes them from the file service, and performs other administrative tasks for the file service.
Within the group of devices providing a file service, some embodiments may have each device act as a master or a slave, depending on the situation. When a device detects a change to the namespace, such as when a user adds or deletes a file, that device may act as the master, update the namespace, and transmit the namespace to the other devices, which act as slaves. During the course of operating the file system, any of the devices may act as a master or a slave. Other embodiments may have different mechanisms for updating the other nodes in the cluster.
A namespace may identify any type of shared resource, which is typically a file system. The file system may include directories or folders, files, or other objects. In some embodiments, the namespace may be a pointer to a starting point in a directory structure. The namespace may include various permission settings or other information related to the namespace.
Throughout this specification, like reference numbers signify the same elements throughout the description of the figures.
When elements are referred to as being "connected" or "coupled," the elements may be directly connected or coupled together, or one or more intervening elements may also be present. In contrast, when elements are referred to as being "directly connected" or "directly coupled," there are no intervening elements present.
The subject matter may be embodied as devices, systems, methods, and/or computer program products. Accordingly, some or all of the subject matter may be embodied in hardware and/or in software (including firmware, resident software, microcode, state machines, gate arrays, etc.). Furthermore, the subject matter may take the form of a computer program product on a computer-usable or computer-readable storage medium having computer-usable or computer-readable program code embodied in the medium for use by or in connection with an instruction execution system. In the context of this document, a computer-usable or computer-readable medium may be any medium that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
The computer-usable or computer-readable medium may be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium. By way of example, and not limitation, computer-readable media may comprise computer storage media and communication media.
Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by an instruction execution system. Note that the computer-usable or computer-readable medium could be paper or another suitable medium upon which the program is printed, as the program can be electronically captured via, for instance, optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
Communication media typically embodies computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and includes any information delivery media. The term "modulated data signal" means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared, and other wireless media. Combinations of any of the above should also be included within the scope of computer-readable media.
When the subject matter is embodied in the general context of computer-executable instructions, the embodiment may comprise program modules executed by one or more systems, computers, or other devices. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Typically, the functionality of the program modules may be combined or distributed as desired in various embodiments.
Fig. 1 is a diagram illustrating an embodiment 100 of a clustered file service. Embodiment 100 is an example architecture that may be used to provide a file service in a highly parallel, fault-tolerant system with high availability.
The diagram of Fig. 1 illustrates functional components of a system. In some cases, a component may be a hardware component, a software component, or a combination of hardware and software. Some of the components may be application-level software, while other components may be operating-system-level components. In some cases, the connection of one component to another may be a close connection where two or more components operate on a single hardware platform. In other cases, the connections may be made over network connections spanning long distances. Each embodiment may use different hardware, software, and interconnection architectures to achieve the described functions.
Embodiment 100 is an example of a computer cluster in which several computers may operate in parallel to provide various services, such as a file service. A cluster may have several computers that execute the same application or service and that may independently process requests for that application or service. Clustering may be one mechanism by which multiple computers are arranged to provide a fault-tolerant and/or high-throughput service.
In a cluster, two or more devices may process operations in parallel. In many cluster environments, the devices may be configured so that one of the devices may fail, be taken offline, or otherwise stop operating while the service continues to operate on another device. Such a configuration may be a fault-tolerant system, where the system can tolerate the failure of one or more devices and still provide the service.
In addition, a cluster may provide very high throughput by processing multiple requests for the service simultaneously. In such a use, a single cluster may provide many times the bandwidth or throughput of an individual device.
For a file service, each node providing the file service may use the same namespace definition. The namespace definition may define the contents of the share being served. The share may include various objects, such as files, directories, folders, or other objects.
Requests to the file service may fall into two categories: requests that change the share and requests that do not change the share. Requests that change the share may include adding or deleting files, changing the directory structure, or performing other operations. Requests that do not change the share may include reading a file. In some embodiments, a write operation performed on a file may be considered a change to the namespace, while other embodiments may treat a write operation as not changing the namespace.
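The two request categories can be illustrated with a short Python sketch. The operation names and the `writes_change_namespace` flag are hypothetical and do not come from the specification; the flag only mirrors the statement that some embodiments treat a write as a namespace change while others do not.

```python
from enum import Enum, auto


class Op(Enum):
    """Hypothetical file service operations."""
    READ = auto()
    WRITE = auto()
    ADD_FILE = auto()
    DELETE_FILE = auto()
    CHANGE_DIRECTORY_STRUCTURE = auto()


# Operations that always alter the share's structure (assumed grouping).
STRUCTURAL_OPS = {Op.ADD_FILE, Op.DELETE_FILE, Op.CHANGE_DIRECTORY_STRUCTURE}


def changes_namespace(op, writes_change_namespace=False):
    """Return True if the request must be propagated to the other nodes."""
    if op in STRUCTURAL_OPS:
        return True
    # Embodiments differ on whether a plain write counts as a change.
    return op is Op.WRITE and writes_change_namespace
```

With the default setting, a read or a write is served locally and only structural operations trigger propagation.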
When a request changes the namespace, the change may be propagated to all nodes that provide the share. While the change is being propagated, each of the other nodes may pause before responding to any other request until the change has completed on that node. If a device detects that the change was not correctly implemented, the device may take itself offline until the problem can be resolved.
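A minimal sketch of this propagation behavior follows, assuming an in-memory model in which each node holds its namespace as a set of paths. The `Node` class, its `fail_updates` switch, and the `propagate` function are illustrative assumptions, not the implementation described in the specification.

```python
class Node:
    """Hypothetical cluster node holding a local copy of the namespace."""

    def __init__(self, name, fail_updates=False):
        self.name = name
        self.namespace = set()
        self.online = True
        self.fail_updates = fail_updates  # simulates a node that cannot apply changes

    def apply(self, change):
        """Apply a (kind, path) change and return True on success."""
        if self.fail_updates:
            return False
        kind, path = change
        if kind == "add":
            self.namespace.add(path)
        elif kind == "delete":
            self.namespace.discard(path)
        else:
            return False
        return True


def propagate(origin, peers, change):
    """Apply a change on the originating node, then on every online peer.

    A node that fails to apply the change goes offline, modeled here by
    clearing its `online` flag.
    """
    if not origin.apply(change):
        origin.online = False
        return
    for peer in peers:
        if peer.online and not peer.apply(change):
            peer.online = False
```

In this model a failed peer simply stops participating; bringing it back online after the problem is resolved is outside the scope of the sketch.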
The namespace may be shared among the nodes in several different ways. In one manner, the operating system of each device may have a registry in which configuration settings or other information are stored. The namespace of the share being served may be stored in the registry. The registry may be a database that the operating system or other applications can access quickly and easily. In some embodiments, a portion of the registry may be shared across several nodes. The shared portion of the registry may operate by detecting a change to the registry on one of the nodes and propagating the change to the other nodes that share that portion of the registry.
In another manner, the namespace may be stored in another database, such as a master namespace stored in a storage system, which may be a cluster storage system. In such a system, each node may maintain a local copy of the namespace. The local copy may reside in the registry or in another database. In such an embodiment, a node acting as the master node may cause changes to be propagated to the other nodes.
The cluster may be managed by a cluster management application, which may execute on one of the cluster nodes. The cluster management application may perform various administrative operations on the cluster, such as adding, removing, and configuring nodes, and launching and managing applications on the cluster. For a file service application, the cluster management application may identify the nodes on which the file service may execute, assign a leader node, and cause the leader node to configure and operate the file service on the assigned nodes.
Device 102 represents one node of a cluster. In many embodiments, a cluster may have many nodes, from just a few to tens, hundreds, or more. A device in the cluster typically comprises a hardware platform 104 and software components 106. Device 102 may be a server computer, but some embodiments may use desktop computers, game consoles, or even portable devices such as laptop computers, mobile phones, or other devices.
The hardware platform 104 may include a processor 108, random access memory 110, and nonvolatile storage 112. The processor 108 may be a single microprocessor, a multi-core processor, or a group of processors. The random access memory 110 may store executable code and data that are directly accessible to the processor 108, while the nonvolatile storage 112 may store executable code and data in a persistent state.
The hardware platform 104 may include various peripherals that form a user interface 114. In some cases, the user interface peripherals may be a monitor, keyboard, pointing device, or other user interface peripherals. Some embodiments may not include such user interface peripherals.
The hardware platform 104 may also include a network interface 116. The network interface 116 may include hardwired and wireless interfaces through which device 102 may communicate with other devices.
The software components 106 may include an operating system 118 on which various applications may execute. In some embodiments, the operating system 118 may be a specialized operating system for cluster computing. Such an operating system may include various services, databases, or other mechanisms that may be used to join devices into a cluster. In other embodiments, the operating system may be a general-purpose operating system on which cluster applications execute so that the device can act as part of a cluster.
A cluster management application 123 may execute on device 102. The cluster management application 123 may operate on only one or a few nodes of the cluster. When the cluster management application 123 operates on a node of the cluster, that node may be considered a head node or management node.
The cluster management application 123 may perform various administrative and control functions for the cluster. Such functions may include configuring the cluster, adding nodes to or removing nodes from the cluster, and starting and stopping applications on the cluster.
A cluster client application 120 may also execute on device 102. The cluster client application 120 may allow device 102 to join the cluster and to respond to administrative operations from the cluster management application. In some embodiments, the cluster management application 123 and the cluster client application 120 may execute on the same device. Other embodiments may not be so configured.
Device 102 may include a file service 122 that may respond to file service requests from client devices 148. The file service 122 may make a share available to the client devices 148, where the share may physically reside on storage system 138.
In some embodiments, a set of namespace definitions 125 may reside on device 102. The namespace definitions 125 may include metadata about the files stored in the share. In some embodiments, the metadata may include the directory structure and metadata for each file in the directory structure. The namespace definitions 125 may be sufficient to respond to some file service requests, such as a request for the file names in a particular directory. In some cases, the namespace definitions 125 may be used to make calls to the cluster storage system 142 to retrieve file contents, write information to a file, or perform other operations on the share.
In some embodiments, the namespace definitions 125 may include a pointer to the starting point of the share within an existing directory structure. In such embodiments, the namespace definitions 125 may include various metadata, such as permission settings, access controls, or other metadata for the share.
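As a rough illustration of such a definition, the sketch below models one share as a record with a pointer to its starting directory and a per-user permission table. The field names, the permission strings, and the example path are assumptions made for this sketch only; the specification does not define a concrete layout.

```python
from dataclasses import dataclass, field


@dataclass
class NamespaceDefinition:
    """Hypothetical namespace definition for a single share."""
    share_name: str
    root_path: str  # pointer to the share's starting point in the directory structure
    permissions: dict = field(default_factory=dict)  # user -> "read" or "read/write"

    def can_read(self, user):
        """A user may read with either read-only or read/write permission."""
        return self.permissions.get(user) in ("read", "read/write")

    def can_write(self, user):
        """Only read/write permission allows changes to the share."""
        return self.permissions.get(user) == "read/write"
```

A user who does not appear in the table has no access to the share, which matches the third permission level described in the background.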
In some embodiments, the namespace definitions 125 may reside in a database, which may be any type of data storage mechanism, such as a relational database, file, table, or other mechanism. In some embodiments, the namespace definitions 125 may be stored in a registry 119, which may be a database used by the operating system 118.
In addition to the file service 122, device 102 may execute various other applications and services 124. In many embodiments, a cluster may execute many applications and services, with a different set of resources applied to each application or service.
A cluster may be made up of several nodes. Device 102 may be one of the nodes, and cluster nodes 128 may be additional nodes. The cluster nodes 128 may operate on a hardware platform 130 similar to the hardware platform 104 of device 102. Each of the cluster nodes 128 may include a cluster client application 132 that allows the node to operate within the cluster, along with a file service 134 and other services 136. Not shown on the cluster nodes 128 is a set of namespace definitions that may be used by the file service 134 to process file service requests.
The cluster nodes may be connected to each other by a cluster network 126. In some embodiments, the cluster network may be a local area network that is separate from the network 146 on which the client devices 148 operate. In such embodiments, the cluster network 126 may be a dedicated high-speed network over which the cluster nodes communicate with each other. In other embodiments, the cluster network 126 may be a wide area network, the Internet, or another network. In such embodiments, the cluster network 126 may or may not be optimized for communication among the cluster nodes.
The cluster nodes may communicate with a storage system 138, which may have a hardware platform 140 on which a cluster storage system 142 may operate. The storage system 138 may be a storage area network or other system that provides storage accessible by each of the cluster nodes.
When the cluster nodes are operating a file service, the share may be stored on the storage system 138. Each node operating the file service may communicate with the storage system 138 to retrieve the files, directories, or other objects associated with the namespace being served. In such a configuration, each node may access the same files, as opposed to having multiple copies of the files.
The cluster may be provided with a load balancer 144. The load balancer 144 may distribute incoming requests to any of the nodes executing a particular file service. The load balancer may operate using any type of load balancing scheme. In one load balancing scheme, the load balancer 144 may assign requests to each node in turn. Such a scheme is known as a round-robin scheme. Other schemes may analyze the bandwidth or response time of each node and use that data as criteria for assigning new requests.
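The scheme that assigns requests to each node in turn can be sketched in a few lines of Python. The `RoundRobinBalancer` class and the node names are hypothetical; the sketch ignores node health and load, which a real balancer would likely take into account.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical balancer that hands each request to the next node in turn."""

    def __init__(self, nodes):
        # cycle() repeats the node list indefinitely in its original order.
        self._nodes = cycle(list(nodes))

    def assign(self, request):
        """Return the node that should handle this request."""
        return next(self._nodes)


balancer = RoundRobinBalancer(["node-a", "node-b", "node-c"])
assignments = [balancer.assign(f"request-{i}") for i in range(4)]
```

After the third request the assignment wraps around to the first node again.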
In normal operation, the file service executing on the cluster may make a share available to the client devices 148. A client device 148 may be any type of computing device that can access the share. The cluster may provide redundancy, in that one node may go offline due to failure, maintenance, or another reason while another node continues to operate. The cluster may also provide increased throughput, because many nodes may service requests in parallel. Such a use may provide higher throughput than a single node could provide alone.
Fig. 2 is a diagram illustrating an embodiment 200 showing a functional view of a clustered file service. Embodiment 200 is an example architecture that may be used to provide multiple file services across a cluster.
The diagram of Fig. 2 illustrates functional components of a system. In some cases, a component may be a hardware component, a software component, or a combination of hardware and software. Some of the components may be application-level software, while other components may be operating-system-level components. In some cases, the connection of one component to another may be a close connection where two or more components operate on a single hardware platform. In other cases, the connections may be made over network connections spanning long distances. Each embodiment may use different hardware, software, and interconnection architectures to achieve the described functions.
Embodiment 200 illustrates merely one example of a cluster on which three different file services may operate. Each of the file services may have a different resource allocation, in that they may operate on different numbers of nodes. In addition, each node may operate one, two, three, or more different file services.
Cluster 202 is illustrated as having five compute nodes 204, 206, 208, 210, and 212. Each of these compute nodes may be a computer that performs much of the processing for the various applications. Other nodes may be present in the cluster, such as management nodes, storage nodes, load-balancing nodes, proxy nodes, or additional compute nodes.
Three different file services are shown. File service 214 may operate on nodes 204, 206, and 208. File service 216 may operate on nodes 206, 208, 210, and 212, and file service 218 may operate on nodes 210 and 212.
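The assignment in embodiment 200 can be written down as a simple mapping and inverted to see which file services each node carries. The reference numerals are taken from the text above; the dictionary layout and the `services_by_node` helper are assumptions made for illustration.

```python
# File service reference numeral -> node reference numerals (from embodiment 200).
SERVICE_NODES = {
    214: [204, 206, 208],
    216: [206, 208, 210, 212],
    218: [210, 212],
}


def services_by_node(service_nodes):
    """Invert the mapping so each node lists the file services it operates."""
    result = {}
    for service, nodes in service_nodes.items():
        for node in nodes:
            result.setdefault(node, []).append(service)
    return result


NODE_SERVICES = services_by_node(SERVICE_NODES)
```

Inverting the mapping shows that node 204 carries a single file service while nodes 206, 208, 210, and 212 each carry two.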
Each file service may operate as a separate instance of the file service on its respective nodes. For example, node 208 may operate two instances of the file service. In such an embodiment, each instance may provide a different share, and each instance may operate on a different group of nodes.
In some embodiments, a single node may operate a single instance that provides two or more shares. In such an embodiment, node 208, for example, may execute a single instance of the file service that responds to requests for the shares associated with file service 214 and file service 216.
In the example of embodiment 200, some nodes may be loaded differently from other nodes. For example, node 204 may have only one file service 214, while node 206 may have two file services 214 and 216. Such a situation may arise when a file service or other application is initially configured. During configuration, the number of nodes needed to meet the expected demand may be determined and the nodes may be selected. The nodes may be selected by various criteria, including selection based on least usage, random assignment, or other selection criteria.
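Selection based on least usage can be sketched as a sort over a per-node usage figure. The `select_nodes` function and the idea of measuring usage as a count of running services are assumptions for this sketch; the specification does not say how usage is measured.

```python
def select_nodes(usage_by_node, count):
    """Pick the `count` least-used nodes.

    `usage_by_node` maps a node name to a usage figure (assumed here to be
    the number of services already running on it). Ties are broken by node
    name so the result is deterministic.
    """
    ranked = sorted(usage_by_node, key=lambda node: (usage_by_node[node], node))
    return ranked[:count]


chosen = select_nodes({"node-a": 2, "node-b": 0, "node-c": 1, "node-d": 0}, 2)
```

A random assignment could be substituted by shuffling the candidate list instead of sorting it.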
In some embodiments, the nodes may not be identical. Some nodes may have more processing power, network bandwidth, or other capabilities than other nodes, and may therefore support more instances of the file service.
Unequal loading of nodes may also arise as a result of adding or removing nodes after a service has begun executing. Some embodiments may identify that the load on a service has increased and may add new nodes to the service to respond to the additional requests. Similarly, some embodiments may identify that the load on a service has decreased and may remove some nodes from the service. After nodes have been added to or removed from several different services, an unbalanced or unequal situation may arise, such as the one shown in embodiment 200.
Each of the nodes may be connected to cluster storage 220. The cluster storage 220 may contain the files, directories, and other items in a share that may be accessed by any node providing the file service for that share. In many embodiments, the cluster storage 220 may be a storage area network or other storage system with the capacity, speed, or other performance parameters needed to respond to each node providing a file service.
In some embodiments, some nodes may be directly connected to the cluster storage 220, while other nodes are connected only indirectly. In such embodiments, an indirectly connected node may access the cluster storage 220 by communicating with a directly connected node over the cluster network.
Some clusters may have a load balancer 222. The load balancer 222 may distribute new file system requests to the nodes. The load balancer 222 may use various algorithms to distribute the processing load among the compute nodes. A simple algorithm is a round-robin algorithm that assigns requests to each node in turn. A more elaborate algorithm may examine each node to determine which node is least loaded and may assign the new request to that node.
The load balancer 222 may include a common cluster name 224 that client devices 228 may use to address the cluster 202 over a network 226. The common cluster name 224 may be a single network name that represents the entire cluster. When a client device 228 generates a file service request, the client device 228 may send the request to the common cluster name 224. From the client device's point of view, the file service may appear to be provided by a single device, even though the file service may actually be provided by any of several devices in the cluster. In such embodiments, the cluster 202 may appear as a single device on the network 226.
The cluster 202 may include a cluster management application 230 that may perform various administrative tasks on the cluster. The cluster management application 230 may operate on one or more of the nodes of the cluster. In some embodiments, a dedicated management node may execute the cluster management application 230.
Each of the compute nodes 204, 206, 208, 210, and 212 may access a cluster database 232 that may contain the namespace for each of the file services. The cluster database 232 may be implemented in several different ways.
In one manner, the cluster database 232 may contain a master copy of the namespace definitions. The master copy may be synchronized or replicated to each of the nodes providing the corresponding file service.
In another manner, the cluster database 232 may likewise contain the namespace definitions, and each node providing a file service may be linked to the cluster database. In such an embodiment, a node may have a redirection or other link that causes local calls within the node to be directed to the cluster database. Such an embodiment may not maintain a local copy of the cluster database at each node.
FIG. 3 is a timeline diagram illustrating an embodiment 300 of a method for managing cluster operations. Embodiment 300 is a simplified example of a method that may be performed by a cluster manager 302, a leader node 304, and a follower node 306. The operations of the cluster manager 302 are shown in the left-hand column, those of the leader node 304 in the middle column, and those of the follower node 306 in the right-hand column.
Other embodiments may use different sequencing, additional or fewer steps, and different nomenclature or terminology to accomplish similar functions. In some embodiments, various operations or sets of operations may be performed in parallel with other operations, either synchronously or asynchronously. The steps selected here were chosen to illustrate some principles of operation in a simplified form.
Embodiment 300 illustrates an embodiment that uses a 'leader' and 'follower' model. The leader is the first node to implement a service and may manage additional nodes, called follower nodes. The cluster manager may communicate with the leader to start or stop the service and to perform other administrative activities for the service. The leader may communicate with each follower to carry out those administrative activities.
At block 308, a leader node may be identified. The leader node may be configured identically to the other nodes in the cluster, but may manage a particular service.
At block 310, a file structure to be shared may be identified from the cluster storage, and at block 312 a corresponding namespace may be defined. At block 314, the namespace may be stored in the cluster database.
At block 316, the number of nodes that may execute the file service may be determined, and at block 318 the file service configuration may be sent to the leader node.
At block 320, the leader node 304 may receive the configuration and begin the configuration process.
At block 322, the leader node 304 may identify the namespace and, at block 324, may retrieve the namespace from the cluster database. At block 326, the file service may be started using the namespace. At this point, the leader node 304 may be the only node providing the file service.
At block 328, each follower node may be processed. In many embodiments, there may be multiple follower nodes, each of which may provide the file service using the namespace. At block 330, the leader node 304 may send a configuration command to the follower node 306, and at block 332 may update the load balancer with the follower node information.
At block 334, the follower node 306 may receive the configuration command. The follower node 306 may identify the namespace at block 336, retrieve the namespace definition from the cluster database at block 338, and start the file service using the namespace at block 340.
The processing of blocks 328 through 340 may be performed each time a new follower node is added to the file service.
At block 342, a node may be identified to be disabled or removed from the file service. At block 344, the load balancer may be updated so that it stops sending new requests to the node that is about to be disabled. At block 346, the leader node 304 may send a disable notice to the follower node 306; the follower node 306 may receive the notice at block 348 and stop the file service at block 350.
Embodiment 300 illustrates one method by which a file service may be started and subsequently expanded to other follower nodes, or contracted by removing follower nodes.
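The start, expand, and contract sequence of embodiment 300 might be sketched as below. The names `Node`, `Leader`, and `balancer_targets` are hypothetical, the cluster database is modeled as a plain dictionary, and messages between nodes are reduced to direct method calls. Registering the leader itself with the load balancer is an assumption; the description only states that follower information is added.

```python
class Node:
    def __init__(self, name, cluster_db):
        self.name = name
        self.cluster_db = cluster_db  # assumed shape: service name -> namespace
        self.namespace = None
        self.serving = False

    def start_service(self, service):
        # Blocks 322-326 on the leader, blocks 336-340 on a follower:
        # retrieve the namespace from the cluster database, then serve.
        self.namespace = self.cluster_db[service]
        self.serving = True

    def stop_service(self):
        # Block 350: stop serving after a disable notice.
        self.serving = False


class Leader(Node):
    def __init__(self, name, cluster_db, balancer_targets):
        super().__init__(name, cluster_db)
        self.balancer_targets = balancer_targets  # node names the balancer may use
        self.followers = {}
        self.service = None

    def configure(self, service):
        # Block 320: receive the configuration and start the service.
        self.service = service
        self.start_service(service)
        self.balancer_targets.append(self.name)  # assumption, see lead-in

    def add_follower(self, follower):
        follower.start_service(self.service)          # blocks 330, 334-340
        self.followers[follower.name] = follower
        self.balancer_targets.append(follower.name)   # block 332

    def remove_follower(self, name):
        self.balancer_targets.remove(name)            # block 344
        self.followers.pop(name).stop_service()       # blocks 346-350
```

Removing the node from the balancer before stopping its service mirrors the order of blocks 344 through 350, so that no new requests are routed to a node that is shutting down.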
FIG. 4 is a flowchart illustrating an embodiment 400 of a method for operating a file service. Embodiment 400 is a simplified example of a method that may be performed by any node executing the file service, and is an example of operations that may be performed when a change is made to the namespace.
Other embodiments may use different sequencing, additional or fewer steps, and different nomenclature or terminology to accomplish similar functions. In some embodiments, various operations or sets of operations may be performed in parallel with other operations, either synchronously or asynchronously. The steps selected here were chosen to illustrate some principles of operation in a simplified form.
Embodiment 400 illustrates an example of a method that may be performed when a change is made to the namespace. Embodiment 400 illustrates a method by which a node detects that a change has been made to the namespace and subsequently propagates the change to the other nodes.
Other embodiments may use master-slave operation to implement such updates within the file service. In such embodiments, any node executing the file service may become the master node when that node detects that a request may cause a change to the namespace, or some other condition under which data in the cluster database will be modified. A consistent cluster database is used by all nodes so that each file service request is handled consistently, regardless of which node serves the request.
A master-slave embodiment may operate by detecting that a request may change information in the cluster database; the detecting node may make itself the master node and cause the other nodes to act as slave nodes until the change has been propagated to each node.
In such embodiments, any node operating the file service may declare itself the master node at any time, and each node may handle any type of request. Such embodiments permit only one node to be the master node at any given time.
At block 402, the file service may begin operation.
At block 404, a file service request may be received. At block 406, if an update is being processed, the node may wait at block 408 until the update finishes before continuing. The loop of block 408 may ensure that requests are not processed with an outdated or inconsistent database. An example of the process performed during an update is shown in embodiment 500, presented later in this specification.
If no update is being processed at block 406 and the request does not cause a change to the namespace at block 410, the file service request may be processed at block 412.
At block 410, if the request does cause a change, an update process may begin at block 416. At block 418, the change may be made to the namespace, and at block 420 the change may be propagated to the other nodes. The change may be stored locally in local storage or in a cache.
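The request path of embodiment 400 can be sketched as follows, under several assumptions: the class name `FileServiceNode`, the request dictionary format, and the `"lookup"` operation are hypothetical, the cluster database is a shared dictionary, and a single lock stands in for the wait of blocks 406 and 408 (which also serializes ordinary requests on a node). The rule that only one node may be master at a time is not modeled.

```python
import threading


class FileServiceNode:
    def __init__(self, cluster_db):
        self.cluster_db = cluster_db          # shared: share name -> path
        self.local_cache = dict(cluster_db)   # local cached namespace
        self.peers = []
        self._lock = threading.Lock()

    def handle(self, request):
        # Blocks 406-408: wait here while an update is in progress on this node.
        with self._lock:
            if request["op"] == "lookup":
                # Blocks 410-412: no namespace change, so serve from the cache.
                return self.local_cache.get(request["share"])
            # Blocks 416-418: apply the change to the cluster database and cache.
            share, path = request["share"], request["path"]
            self.cluster_db[share] = path
            self.local_cache[share] = path
        # Block 420: propagate the change to the other nodes.
        for peer in self.peers:
            peer.apply_update(share, path)
        return path

    def apply_update(self, share, path):
        # A peer holds off its own requests while its cache is refreshed.
        with self._lock:
            self.local_cache[share] = path
```

Propagation happens after the node releases its own lock so that a node never holds two locks at once; a fuller implementation would also need the mastership arbitration that the description mentions.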
FIG. 5 is a flowchart illustrating an embodiment 500 of a method for updating a slave device. Embodiment 500 is a simplified example of a method that may be performed by any node acting as a slave device during a master device update. In many embodiments, any device may act as a slave or master device at any time while the file service is executing.
Other embodiments may use different sequencing, additional or fewer steps, and different nomenclature or terminology to accomplish similar functions. In some embodiments, various operations or sets of operations may be performed in parallel with other operations, either synchronously or asynchronously. The steps selected here were chosen to illustrate some principles of operation in a simplified form.
Embodiment 500 is a simplified example of operations that may be performed by a slave device when the master device propagates a change to the namespace.
At block 502, an update notification may be received. At block 504, processing of new file service requests may be stopped.
At block 506, an attempt may be made to update the namespace definition or other information. At block 508, if the update is successful, the slave node may resume processing requests at block 514.
At block 508, if the update is unsuccessful, the node may be taken offline at block 510, and an alert may be transmitted to the cluster manager or the leader node at block 512.
The operations of embodiment 500 illustrate that when a slave node attempts an update and encounters a failure, the node may take itself offline. While the node is offline, corrective action may be taken while the remaining nodes continue to operate.
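The slave-side update of embodiment 500 might look like the following sketch. The class name `SlaveNode` and the two callables passed to it (`fetch_namespace` to retrieve the updated definition, `send_alert` to notify the cluster manager or leader) are hypothetical stand-ins for whatever transport a real cluster would use.

```python
class SlaveNode:
    def __init__(self, name, fetch_namespace, send_alert):
        self.name = name
        self.fetch_namespace = fetch_namespace  # assumed: returns the new definition
        self.send_alert = send_alert            # assumed: delivers an alert message
        self.namespace = {}
        self.accepting = True   # whether new file service requests are processed
        self.online = True

    def on_update_notification(self):
        # Blocks 502-504: on notification, stop processing new requests.
        self.accepting = False
        try:
            # Block 506: attempt to update the namespace definition.
            self.namespace = self.fetch_namespace()
        except Exception as err:
            # Blocks 508-512: the update failed, so go offline and raise an alert.
            self.online = False
            self.send_alert(f"{self.name} offline: {err}")
            return False
        # Block 514: the update succeeded, so resume processing requests.
        self.accepting = True
        return True
```

A node that fails the update never resumes with a stale namespace; it stays offline until corrective action is taken, while the other nodes keep serving.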
The foregoing description of the subject matter has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the subject matter to the precise form disclosed, and other modifications and variations are possible in light of the above teachings. The embodiment was chosen and described in order to best explain the principles of the invention and its practical application, thereby enabling others skilled in the art to best utilize the invention in various embodiments and various modifications as are suited to the particular use contemplated. It is intended that the appended claims be construed to include other alternative embodiments except insofar as limited by the prior art.

Claims (15)

1. A system, comprising:
a plurality of devices, each of said devices having a file service operable on each of said plurality of devices;
a data store comprising files, said data store being accessible to each of said plurality of devices;
a namespace definition that defines an organization of said files as a share, said namespace definition being stored in a cluster database accessible to each of said plurality of devices, said share being made accessible to client devices;
said file service identifying a change to said namespace definition and updating said namespace definition on said cluster database to an updated namespace definition, said file service further updating a local cached version of said namespace definition.
2. The system of claim 1, further comprising:
a load balancer that receives requests from said client devices and determines, for each of said requests, one of said plurality of devices to process the request.
3. The system of claim 1, further comprising:
a leader application executing on a first device of said plurality of devices, said leader application identifying a new device and adding said new device to said plurality of devices.
4. The system of claim 3, wherein said leader application further identifies a first device and removes said first device from said plurality of devices.
5. The system of claim 1, further comprising:
a second plurality of devices being a subset of said plurality of devices; and
a second namespace definition that defines a second organization of at least a subset of said files as a second share, said second namespace definition being stored in the cluster database accessible to each of said second plurality of devices, said second share being made accessible to client devices.
6. The system of claim 5, wherein said second plurality of devices is the same as said plurality of devices.
7. The system of claim 1, wherein said file service further:
detects a failure when attempting to update said local cached version of said namespace definition for a first device, and disables said file service for said first device.
8. The system of claim 7, wherein said file service further:
retries updating said local cached version of said namespace definition and, when said update succeeds, adds said first device to said plurality of devices.
9. A method, comprising:
for each of a plurality of file server devices, installing and executing a file service and connecting said file service to a file store comprising files;
defining a namespace definition, said namespace definition defining a share of at least some of said files;
storing said namespace definition in a cluster database, said cluster database being accessible to each of said plurality of file server devices;
starting said file service on a first file server device that is one of said plurality of file server devices, said file service using said namespace definition;
said file service identifying a change to said namespace definition and updating said namespace definition in said cluster database;
starting a second file server device using said namespace definition; and
serving file requests in parallel by said first file server and said second file server.
10. The method of claim 9, further comprising:
copying said namespace definition to a first local cache on said first file server device and to a second local cache on said second file server device.
11. The method of claim 10, further comprising:
said first file server device detecting a first change to said namespace, updating said first local cache with said first change, and updating said cluster database with said first change;
said second file server device updating said second local cache with said first change.
12. The method of claim 11, further comprising:
said second file server device detecting a second change to said namespace, updating said second local cache with said second change, and updating said cluster database with said second change;
said first file server device updating said first local cache with said second change.
13. The method of claim 12, further comprising:
detecting a problem when said first file server device updates said first local cache with said second change, and disabling said first file server from serving said file requests.
14. The method of claim 13, further comprising:
starting a third file server device using said namespace definition; and
serving file requests in parallel by said first file server, said second file server, and said third file server.
15. The method of claim 14, further comprising:
disabling said first file server from serving said file requests, and operating said second file server and said third file server in parallel to serve said file requests.
CN201280027196.2A 2011-06-04 2012-05-29 Clustered file service Active CN103608798B (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US13/153,416 2011-06-04
US13/153,416 US9652469B2 (en) 2011-06-04 2011-06-04 Clustered file service
PCT/US2012/039879 WO2012170234A2 (en) 2011-06-04 2012-05-29 Clustered file service

Publications (2)

Publication Number Publication Date
CN103608798A true CN103608798A (en) 2014-02-26
CN103608798B CN103608798B (en) 2016-11-16

Family

ID=47262503

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201280027196.2A Active CN103608798B (en) 2011-06-04 2012-05-29 Clustered file service

Country Status (4)

Country Link
US (1) US9652469B2 (en)
EP (1) EP2718837B1 (en)
CN (1) CN103608798B (en)
WO (1) WO2012170234A2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106484587A (en) * 2015-08-26 2017-03-08 Huawei Technologies Co., Ltd. Namespace management method, device and computer system

Families Citing this family (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9332069B2 (en) 2012-12-28 2016-05-03 Wandisco, Inc. Methods, devices and systems for initiating, forming and joining memberships in distributed computing systems
US9424272B2 (en) * 2005-01-12 2016-08-23 Wandisco, Inc. Distributed file system using consensus nodes
US9361311B2 (en) * 2005-01-12 2016-06-07 Wandisco, Inc. Distributed file system using consensus nodes
US8364633B2 (en) 2005-01-12 2013-01-29 Wandisco, Inc. Distributed computing systems and system components thereof
US20120246609A1 (en) 2011-03-24 2012-09-27 International Business Machines Corporation Automatic generation of user stories for software products via a product content space
JP6102108B2 (en) * 2012-07-24 2017-03-29 富士通株式会社 Information processing apparatus, data providing method, and data providing program
US10649607B2 (en) 2012-12-28 2020-05-12 Facebook, Inc. Re-ranking story content
US9087155B2 (en) 2013-01-15 2015-07-21 International Business Machines Corporation Automated data collection, computation and reporting of content space coverage metrics for software products
US9075544B2 (en) 2013-01-15 2015-07-07 International Business Machines Corporation Integration and user story generation and requirements management
US9063809B2 (en) * 2013-01-15 2015-06-23 International Business Machines Corporation Content space environment representation
US9141379B2 (en) 2013-01-15 2015-09-22 International Business Machines Corporation Automated code coverage measurement and tracking per user story and requirement
US9081645B2 (en) 2013-01-15 2015-07-14 International Business Machines Corporation Software product licensing based on a content space
US9659053B2 (en) 2013-01-15 2017-05-23 International Business Machines Corporation Graphical user interface streamlining implementing a content space
US9111040B2 (en) 2013-01-15 2015-08-18 International Business Machines Corporation Integration of a software content space with test planning and test case generation
US9069647B2 (en) 2013-01-15 2015-06-30 International Business Machines Corporation Logging and profiling content space data and coverage metric self-reporting
US9218161B2 (en) 2013-01-15 2015-12-22 International Business Machines Corporation Embedding a software content space for run-time implementation
US9396342B2 (en) 2013-01-15 2016-07-19 International Business Machines Corporation Role based authorization based on product content space
US9020893B2 (en) * 2013-03-01 2015-04-28 Datadirect Networks, Inc. Asynchronous namespace maintenance
US9183148B2 (en) * 2013-12-12 2015-11-10 International Business Machines Corporation Efficient distributed cache consistency
ES2881606T3 (en) * 2014-03-31 2021-11-30 Wandisco Inc Geographically distributed file system using coordinated namespace replication
CN105991565B (en) * 2015-02-05 2019-01-25 阿里巴巴集团控股有限公司 Method, system and the database proxy server of read and write abruption
US11360942B2 (en) 2017-03-13 2022-06-14 Wandisco Inc. Methods, devices and systems for maintaining consistency of metadata and data across data centers
WO2018235132A1 (en) * 2017-06-19 2018-12-27 Hitachi, Ltd. Distributed storage system
US10826984B2 (en) * 2018-04-24 2020-11-03 Futurewei Technologies, Inc. Event stream processing
US11204940B2 (en) * 2018-11-16 2021-12-21 International Business Machines Corporation Data replication conflict processing after structural changes to a database
US11960763B2 (en) * 2021-04-23 2024-04-16 EMC IP Holding Company LLC Load balancing combining block and file storage

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050283658A1 (en) * 2004-05-21 2005-12-22 Clark Thomas K Method, apparatus and program storage device for providing failover for high availability in an N-way shared-nothing cluster system
US7136903B1 (en) * 1996-11-22 2006-11-14 Mangosoft Intellectual Property, Inc. Internet-based shared file service with native PC client access and semantics and distributed access control
US20070055702A1 (en) * 2005-09-07 2007-03-08 Fridella Stephen A Metadata offload for a file server cluster
US7577688B2 (en) * 2004-03-16 2009-08-18 Onstor, Inc. Systems and methods for transparent movement of file services in a clustered environment
US20090282046A1 (en) * 2008-05-06 2009-11-12 Scott Alan Isaacson Techniques for accessing remote files

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5394555A (en) 1992-12-23 1995-02-28 Bull Hn Information Systems Inc. Multi-node cluster computer system incorporating an external coherency unit at each node to insure integrity of information stored in a shared, distributed memory
US6119143A (en) 1997-05-22 2000-09-12 International Business Machines Corporation Computer system and method for load balancing with selective control
US6748416B2 (en) 1999-01-20 2004-06-08 International Business Machines Corporation Client-side method and apparatus for improving the availability and performance of network mediated services
US6801949B1 (en) 1999-04-12 2004-10-05 Rainfinity, Inc. Distributed server cluster with graphical user interface
US6954881B1 (en) 2000-10-13 2005-10-11 International Business Machines Corporation Method and apparatus for providing multi-path I/O in non-concurrent clustering environment using SCSI-3 persistent reserve
US6976060B2 (en) 2000-12-05 2005-12-13 Agami Sytems, Inc. Symmetric shared file storage system
US7062490B2 (en) 2001-03-26 2006-06-13 Microsoft Corporation Serverless distributed file system
US20040139125A1 (en) 2001-06-05 2004-07-15 Roger Strassburg Snapshot copy of data volume during data access
US6865597B1 (en) 2002-12-20 2005-03-08 Veritas Operating Corporation System and method for providing highly-available volume mount points
US7653699B1 (en) 2003-06-12 2010-01-26 Symantec Operating Corporation System and method for partitioning a file system for enhanced availability and scalability
US7525902B2 (en) 2003-09-22 2009-04-28 Anilkumar Dominic Fault tolerant symmetric multi-computing system
US7496565B2 (en) 2004-11-30 2009-02-24 Microsoft Corporation Method and system for maintaining namespace consistency with a file system
US7506009B2 (en) 2005-01-28 2009-03-17 Dell Products Lp Systems and methods for accessing a shared storage network using multiple system nodes configured as server nodes
US7739677B1 (en) 2005-05-27 2010-06-15 Symantec Operating Corporation System and method to prevent data corruption due to split brain in shared data clusters
JP4795787B2 (en) 2005-12-09 2011-10-19 株式会社日立製作所 Storage system, NAS server, and snapshot method
US8019812B2 (en) 2007-04-13 2011-09-13 Microsoft Corporation Extensible and programmable multi-tenant service architecture
US20090204705A1 (en) 2007-11-12 2009-08-13 Attune Systems, Inc. On Demand File Virtualization for Server Configuration Management with Limited Interruption
US7840730B2 (en) 2008-06-27 2010-11-23 Microsoft Corporation Cluster shared volumes

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106484587A (en) * 2015-08-26 2017-03-08 Huawei Technologies Co., Ltd. Namespace management method, device and computer system
CN106484587B (en) * 2015-08-26 2019-07-19 Huawei Technologies Co., Ltd. Namespace management method, device and computer system

Also Published As

Publication number Publication date
EP2718837B1 (en) 2017-11-22
US20120311003A1 (en) 2012-12-06
US9652469B2 (en) 2017-05-16
EP2718837A4 (en) 2015-08-12
CN103608798B (en) 2016-11-16
WO2012170234A3 (en) 2013-02-07
EP2718837A2 (en) 2014-04-16
WO2012170234A2 (en) 2012-12-13

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
ASS Succession or assignment of patent right

Owner name: MICROSOFT TECHNOLOGY LICENSING LLC

Free format text: FORMER OWNER: MICROSOFT CORP.

Effective date: 20150727

C41 Transfer of patent application or patent right or utility model
TA01 Transfer of patent application right

Effective date of registration: 20150727

Address after: Washington State

Applicant after: Microsoft Technology Licensing LLC

Address before: Washington State

Applicant before: Microsoft Corp.

C14 Grant of patent or utility model
GR01 Patent grant