EP1433073A1 - Exactly once cache framework - Google Patents

Exactly once cache framework

Info

Publication number
EP1433073A1
Authority
EP
European Patent Office
Prior art keywords
host server
servers
server
network
jms
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
EP02798120A
Other languages
German (de)
English (en)
Other versions
EP1433073A4 (fr)
Inventor
Dean Bernard Jacobs
Eric Halpern
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Oracle International Corp
Original Assignee
BEA Systems Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US10/234,597 (US7113980B2)
Priority claimed from US10/234,693 (US6826601B2)
Application filed by BEA Systems Inc
Publication of EP1433073A1
Publication of EP1433073A4
Current legal status: Ceased


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04L 67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/50 Network services
    • H04L 67/56 Provisioning of proxy services
    • H04L 67/566 Grouping or aggregating service requests, e.g. for unified processing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/50 Network services
    • H04L 67/56 Provisioning of proxy services
    • H04L 67/568 Storing data temporarily at an intermediate stage, e.g. caching
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L 69/30 Definitions, standards or architectural aspects of layered protocol stacks
    • H04L 69/32 Architecture of open systems interconnection [OSI] 7-layer type protocol stacks, e.g. the interfaces between the data link level and the physical level
    • H04L 69/322 Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions
    • H04L 69/329 Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions in the application layer [OSI layer 7]

Definitions

  • the present invention is related to technology for distributing objects among servers in a network cluster.
  • a device can exist to which a cluster may want exclusive access.
  • One such device is a transaction log on a file system. Whenever a transaction is in progress, there are certain objects that need to be saved in a persistent way, such that if a failure occurs those persistently-saved objects can be recovered.
  • For these objects that need to be saved in one place, there is typically a transaction monitor that runs on each server in that cluster or domain, which then uses a local file system to access the object. Each server can have its own transaction manager, such that there is little to no problem with persistence. There is then also no need for coordination, as each server has its own transaction manager.
  • the present invention includes a system for managing objects, such as can be stored in servers on a network or in a cluster.
  • the system includes a data source, application, or service, such as a file system or Java Message Service component, which can be located inside or outside of a cluster.
  • the system can include several servers in communication with the file system or application, such as through a high-speed network connection.
  • the system includes a lead server, such as can be agreed upon by the other servers.
  • the lead server can be contained in a hardware cluster or in a software cluster.
  • the system can include an algorithm for selecting a lead server from among the servers, such as an algorithm built into a hardware cluster machine.
  • the lead server in turn will contain a distributed consensus algorithm for selecting a host server, such as a Paxos algorithm.
  • the algorithm used for selecting the lead server can be different from, or the same as, the algorithm used to select the host server.
  • the host server can contain a copy of the item or object, such as can be stored in local cache.
  • the host server can provide local copy access to any server on the network or in a cluster.
  • the host server can also provide the only access point to an object stored in a file system, or the only access point to an application or service. Any change made to an item cached, hosted, or owned by the host server can also be updated in the file system, application, or service.
  • a new host can be chosen using a distributed consensus algorithm. The new host can then pull the necessary data for the object from the file system or service.
  • the other servers in the cluster can be notified that a new server is hosting the object.
  • the servers can be notified by any appropriate means, such as by point-to-point connections or by multicasting.
  • Figure 1 is a diagram of a distributed object system in accordance with one embodiment of the present invention.
  • Figure 2 is a diagram of another distributed object system in accordance with one embodiment of the present invention.
  • Figure 3 is a flowchart of a method for selecting a host server in accordance with the present invention.
  • Figure 4 is a flowchart of a method for selecting a new host server in accordance with the present invention.
  • Figure 5 is a flowchart of a method for utilizing a lead server in accordance with the present invention.
  • Figure 6 is a diagram of JMS message store system in accordance with one embodiment of the present invention.
  • Figure 7 is a block diagram depicting components of a computing system that can be used in accordance with the present invention.
  • Systems in accordance with the present invention can provide solutions to availability issues, such as when a server owning a data object becomes unavailable to a server cluster.
  • One such solution allows for another server in the cluster to take over the ownership of the data object.
  • a problem arises, however, in making the data object accessible to both servers without having to replicate the data object on both.
  • file system, data store, or database
  • the second server can automatically take over the task of data object access if the first server owning that object encounters a problem.
  • algorithm utilized by the cluster or a server in the cluster to instruct a server to take ownership of the item.
  • Another fundamental problem involves getting the cluster to agree on which server now owns the resource or object, or achieving a "consensus" amongst the servers.
  • FIG. 1 shows one example of a cluster system 100 in accordance with the present invention, where an object such as a transaction log 114 is stored in a file system 112.
  • the file system 112 is accessible to all servers 106, 116, 118 in the cluster 110, but only one of these servers can access the log 114 at a time.
  • a host server 106 among the servers in the cluster 110 will "own" or "host" the log 114, such as by storing a copy 108 of the log 114 or by providing all access to the log 114 in the file system 112.
  • Any other server 116, 118 in the cluster 110 can access the copy 108 of the log, and/or can access the log 114 through the hosting server 106.
  • a client or browser 102 can make a request to a network 104 that is directed to server 116 in cluster 110. That server can access the copy 108 of the transaction log on the host server 106 through the network 104. If the transaction log needs to be updated, the copy 108 can be updated along with the original log 114 on the file system 112.
  • a server can "own" or "host” a data object when, for example, it acts as a repository for the object, such as by storing a copy of the data object in local cache and making that copy available to other servers in the cluster, or by being the sole server having direct access to an object in a file system, service, or application, such that all other servers in the cluster must access that object through the hosting server. This ensures that an object exists "exactly once" in the server cluster.
  • FIG. 3 shows one process 300 that can be used to establish the hosting of an object.
  • a host server can be selected using a distributed consensus algorithm 302, such as a Paxos algorithm. Such an algorithm is referred to as a "distributed consensus" algorithm because servers in a cluster must generally agree, or come to a consensus, as to how to distribute objects amongst the cluster servers.
  • if a hosted object is, for example, to be cached on the hosting server, a copy of the data object can be pulled from a file system to the host server and stored as an object in local cache 304.
  • the other servers on the network or in the appropriate cluster are then notified, such as by the hosting server, that a local copy of the object exists on the hosting server, and that the local copy should be used in processing future network requests 306.
  • a server can be selected to act as a host or lead server by a network server, the network server leading a series of "consensus rounds." In each of these consensus rounds, a new host or lead server is proposed. Rounds continue until one of the proposed servers is accepted by a majority or quorum of the servers.
  • Any server can propose a host or lead server by initiating a round, although a system can be configured such that a lead server always initiates a round for a host server selection. Rounds for different selections can be carried out at the same time. Therefore, a round selection can be identified by a round number or pair of values, such as a pair with one value referring to the round and one value referring to the server leading the round.
  • a round can be initiated by a leader sending a "collect" message to other servers in the cluster.
  • a collect message collects information from servers in the cluster regarding previously conducted rounds in which those servers participated. If there have been previous consensus rounds for this particular selection process, the collect message also informs the servers not to commit selections from previous rounds.
  • the leader can decide the value to propose for the next round and send this proposal to the cluster servers as a "begin" message. In order for the leader to choose a value to propose in this approach, it is necessary to receive the initial value information from the servers.
  • a server can respond by sending an "accept" message, stating that the server accepts the proposed host/lead server. If the leader receives accept messages from a majority or quorum of servers, the leader sets its output value to the value proposed in the round. If the leader does not receive majority or quorum acceptance ("consensus") within a specified period of time, the leader can begin a new round. If the leader receives consensus, the leader can notify the cluster or network servers that the servers should commit to the chosen server. This notification can be broadcast to the network servers by any appropriate broadcasting technology, such as through point-to-point connections or multicasting. [0035] The agreement condition of the consensus approach can be guaranteed by proposing selections that utilize information about previous rounds.
  • This information can be required to come from at least a majority of the network servers, so that for any two rounds there is at least one server that participated in both rounds.
  • the leader can choose a value for the new round by asking each server for the number of the latest round in which the server accepted a value, possibly also asking for the accepted value. Once the leader gets this information from a majority or quorum of the servers, it can choose a value for the new round that is equal to the value of the latest round among the responses. The leader can also choose an initial value if none of the servers were involved in a previous round. If the leader receives a response that the last accepted round is x, for example, and the current round is y, the server can imply that no round between x and y would be accepted, in order to maintain consistency.
  • a sample interaction between a round leader and a network server involves the collect, begin, and accept messages described above, followed by a commit notification once consensus is reached (a hedged, illustrative sketch of one such round appears after this list).
  • the file system can be a single resource. In one embodiment, the server may only care that a single server owns the file system at any time.
  • Another example of a system in accordance with the present invention involves caching in a server cluster. It may be desirable in a clustered environment, such as for reasons of network performance, to have a single cache represent a data object to servers in the cluster. Keeping items in a single cache can be advantageous, as servers in the cluster can access the cache without needing to continually return to persistent storage. Being able to pull an item already in memory can greatly increase the efficiency of such a system, as hits to a database or file system can be relatively time intensive.
  • One problem with a single cache, however, is that it may be necessary to ensure that the object stored in memory is the same as that which is stored on a disk of the file system. One reason for requiring such consistency is to ensure that any operations or calculations done on a cached item produce the correct result. Another reason is that it can be necessary to restore the cache from the file system in the event that the cache crashes or becomes otherwise tainted or unavailable.
  • the caches can be hosted on a single server, or spread out among some or all of the servers in the cluster.
  • the cluster itself can be any appropriate cluster, such as a hardware cluster or a group of servers designated by a software application to be in a given "software" cluster.
  • for an object such as a transaction log and/or a cache, it may be desirable to ensure that any such object exists only once in a cluster, and that the object is always available. It may also be desirable to ensure that the object can be recovered on another server if the server hosting the object fails, and that the object will be available to the cluster.
  • One method 400 for recovery is shown in Figure 4.
  • a determination is made whether the host server can continue to host an object 402, such as whether the server is still available to the network. If not, a new host is selected using a distributed consensus algorithm. This selection may be performed according to the method used to select the original host 404. A copy of the data object is pulled from a file system to the new host, and can be stored in a local cache 406. The other servers on the network or in the appropriate cluster are notified that the new host server contains a local copy of the object, and that the local copy should be used in processing any future network requests 408 (a hedged sketch of this recovery flow appears after this list).
  • Systems and methods in accordance with the present invention can define objects that exist in exactly one place in a cluster, and can ensure that those objects always exist. From a server's perspective, it may not matter whether an object such as a transaction log is mirrored or replicated, such as by a file system. From the server's perspective, there is always one persistent storage accessible by any server in the cluster. The system can periodically check for the existence of an object, or may assign ownership of objects for short periods of time such that an object will be reassigned frequently to ensure existence on some machine on the network or in the cluster.
  • a hardware cluster can comprise a bank of machines, each machine being capable of running multiple servers. There can also be a file system behind each machine.
  • Servers in a hardware cluster are typically hardwired, such that they are able to more quickly make decisions and deal with server faults within the hardware cluster.
  • Hardware clusters can be limited in size to the physical hardware of the machine containing the servers.
  • Servers in a hardware cluster can be used as servers in a software cluster, and can also comprise network servers, as the individual servers on the machines are available to the network.
  • the shared file system for one of these machines can be available to all servers in a cluster, such as through a high-speed network.
  • the file system can also be redundant. In one embodiment, this redundancy is implemented through the use of multiple data disks for the file system. In such a redundant implementation, an object can be replicated across multiple disks any time the object is written to the file system.
  • Such a file system, when viewed as a "black box," can withstand failures of any of the disks and still provide access to a data item from any of the servers in the cluster.
  • a framework in accordance with the present invention can be built on the assumption that these objects kept in memory are always backed by a reliable, persistent storage mechanism.
  • An object representing that transaction log can be sitting on one of the servers in the cluster, such as the host server.
  • An exactly-once framework can ensure that, as long as at least one of the servers in the cluster is up and running, a server will be able to take over ownership of the log if another server fails.
  • the object can be resurrected on another server. The resurrected object can pull all the necessary information from persistent storage.
  • An exactly-once framework can act as a memory buffer for use by the cluster.
  • the framework can provide a single cache representing data in the system that is backed by a reliable, persistent storage. Whenever data is read from the cache, the read can be done without needing to access persistent storage. When an update is written to cache, however, it can be necessary to write back through the persistent storage, such that the system can recover if there is a failure (a minimal write-through sketch appears after this list).
  • One important aspect of an exactly-once framework involves the way in which the approach is abstracted, which can vary depending upon the application and/or implementation.
  • An exactly-once object can be, for example, a locally-cached copy of a data item in a file system, or the sole access point to such a data item for servers in a cluster. Underlying techniques for implementing this abstraction can also be important.
  • Systems of the present invention can utilize any of a number of methods useful for distributed consensus, such as a method using the aforementioned Paxos algorithm. Such an algorithm can be selected which provides an efficient way for multiple nodes and/or distributed nodes to agree on one value of an object. The algorithm can be chosen to work even if nodes fail and/or return during an agreement process.
  • a typical approach to network clustering utilizes reliable broadcasting, where every message is guaranteed to be delivered to its intended recipient, or at least delivered to every intended functioning server. This approach can make it very difficult to parallelize a system, as reliable broadcasting requires a recipient to acknowledge a message before moving on to the next message or recipient.
  • a distributed algorithm utilizing multicasting may reduce the number of guarantees, as multicasting does not guarantee that all servers receive a message. Multicasting does simplify the approach such that the system can engage in parallel processing, however, as a single message can be multicast to all the cluster servers concurrently without waiting for a response from each server.
  • a server that does not receive a multicast message can pull the information from the lead server, or another cluster server or network server, at a later time.
  • a network server can refer to any server on the network, whether in a hardware cluster, in a software cluster, or outside of any cluster.
  • An important aspect of an exactly-once architecture is that consensus difficulties are reduced.
  • the performance of a distributed consensus implementation can be improved by using multicast messaging with a distributed consensus algorithm. This approach can allow for minimizing the message exchange and/or network traffic required for all the servers to agree.
  • a lead server can multicast a message to all other servers on the network, such as may be used in a round of a Paxos algorithm, or used to state that a new host has been selected for an object.
  • the lead server only needs to send one message, which can be passed to any server available on the network. If a server is temporarily off the network, the server can request the identification of the new host after coming back onto the network.
  • the lead server can pre-select a host server using an appropriate algorithm.
  • before assigning an object to that host, however, the lead server can contact every other server in the cluster to determine whether the servers agree with the choice of the new host server.
  • the lead server can contact each server by a point-to-point connection, or can send out a multicast request and then wait for each server to respond. If the servers do not agree on the selection of the host, the lead server can pre-select a new host using the algorithm. The lead server would then send out another multicast request with the identity of the newly preselected host in another round.
  • If every server agrees to the pre-selected host, the lead server can assign the object to the host server. The lead server can then multicast a commit message, informing the servers that the new change has taken effect and the servers should update their information accordingly.
  • An exactly-once framework can also utilize a "leasing" mechanism.
  • an algorithm can be used to get the cluster servers to agree on a lead server, such as by using distributed consensus.
  • that lead server can be responsible for assigning exactly-once objects to various servers in the cluster.
  • the system can be set up such that the cluster servers will always agree on a new leader if an existing lead server fails.
  • the lead server can be aware of all the exactly-once objects that need to exist in the system.
  • the lead server can decide which server should host each object, and can then "lease" that object to the selected server.
  • that server can own or host the object for a certain period of time, such as for the duration of a lease period.
  • the lead server can be configured to periodically renew these leases. This approach can provide a way to ensure that a server will not get its lease renewed if it fails or becomes disconnected in any way, or is otherwise not operating properly within the cluster (a hedged leasing sketch appears after this list).
  • a system using an exactly-once architecture can also be tightened down. Operating systems often provide special machinery that is built closer to the hardware and can offer more control. One problem with this approach, however, is that it can be limited by the hardware available. For example, a hardware cluster of servers can have on the order of 16 servers. Because these systems require some tight hardware coupling, there can be limitations on the number of servers that can be included in the cluster.
  • An exactly-once framework may be able to handle clusters much larger than these proprietary hardware clusters can handle.
  • a framework can allow for some leveraging of the qualities of service that are available from one of the proprietary clusters, thereby allowing for a larger cluster. Differing qualities of service may include, for example, whether messages are sent by a reliable protocol, such as by point-to-point connections, or are sent by a less reliable but more resource-friendly protocol, such as multicasting.
  • An advantage to using an exactly-once framework is the ability to balance scalability with fault tolerance, such that a user can adapt the system to the needs of a particular application.
  • Prior art systems such as hardware cluster machines can attempt high availability solutions by having (what appears to the cluster to be) a single machine backed up by a second machine. If the first machine goes down, there is a "buddy" that takes over, and any software that was running on the first machine is brought up on the second machine.
  • An exactly-once framework in accordance with the present invention can assign the lead server to a server in one of these hardware clusters, such that dealing with leader failure can become faster than dealing with it in a software cluster.
  • This lead server can, however, dole out leases to servers whether those servers are in the hardware cluster or the software cluster. This arrangement may provide for faster lead server recovery, while allowing for a software cluster that is larger than, but still includes, the hardware cluster.
  • a hardware cluster 218 can comprise a single machine containing multiple servers 220, 222, 224. The hardware cluster can be used to choose a lead server 220 from among the servers on that machine, such as may improve efficiency.
  • the lead server can select a host 206 for an object 214 in a file system 212, which can be located inside or outside of the software cluster 210.
  • the file system 212 itself can replicate the object 214 to a second object 216 on another disk of the file system, such as may provide persistence.
  • the object 214 can be pulled from the file system 212 by the new host 206 with a copy 208 of the object cached on the host 206.
  • if a request is received from a browser or client 202 through the network 204 by a server such as server 206, 216, or 220, that server will know to contact host server 206 if it needs access to the cached copy of the object 208.
  • the lead server is selected using an algorithm of a hardware cluster 502.
  • This algorithm may be, for example, a proprietary algorithm of the hardware cluster machine, or may be a distributed consensus algorithm requiring consensus over the hardware cluster servers only.
  • a host server can then be pre-selected using a distributed consensus algorithm with the lead server 504, such as a Paxos algorithm.
  • the identity of the pre-selected host can then be multicast to the other servers in a software cluster containing the hardware cluster 506.
  • the lead server can receive approval or disapproval from each server that is presently operational and connected to the cluster 508.
  • FIG. 6 shows another example of a cluster system 600 in accordance with the present invention, where an object 608 acts as a message store for Java Message Service (JMS) 612.
  • FIG. 7 illustrates a block diagram 700 of a computer system which can be used for components of the present invention or to implement methods of the present invention.
  • the computer system of Figure 7 includes a processor unit 704 and main memory 702.
  • Processor unit 704 may contain a single microprocessor, or may contain a plurality of microprocessors for configuring the computer system as a multi-processor system.
  • Main memory 702 stores, in part, instructions and data for execution by processor unit 704. If the present invention is wholly or partially implemented in software, main memory 702 can store the executable code when in operation.
  • Main memory 702 may include banks of dynamic random access memory (DRAM), high speed cache memory, as well as other types of memory known in the art.
  • the system of Figure 7 further includes a mass storage device 706, peripheral devices 708, user input devices 712, portable storage medium drives 714, a graphics subsystem 718, and an output display 716.
  • For purposes of simplicity, the components shown in Figure 7 are depicted as being connected via a single bus 720. However, as will be apparent to those skilled in the art, the components may be connected through one or more data transport means.
  • processor unit 704 and main memory 702 may be connected via a local microprocessor bus
  • the mass storage device 706, peripheral devices 708, portable storage medium drives 714, and graphics subsystem 718 may be connected via one or more input/output (I/O) buses.
  • Mass storage device 706, which may be implemented with a magnetic disk drive, optical disk drive, as well as other drives known in the art, is a non-volatile storage device for storing data and instructions for use by processor unit 704. In one embodiment, mass storage device 706 stores software for implementing the present invention for purposes of loading to main memory 702.
  • Portable storage medium drive 714 operates in conjunction with a portable non-volatile storage medium, such as a floppy disk, to input and output data and code to and from the computer system of Figure 7.
  • the system software for implementing the present invention is stored on such a portable medium, and is input to the computer system via the portable storage medium drive 714.
  • Peripheral devices 708 may include any type of computer support device, such as an input/output (I/O) interface, to add additional functionality to the computer system.
  • peripheral devices 708 may include a network interface for connecting the computer system to a network, as well as other networking hardware such as modems, routers, or other hardware known in the art.
  • User input devices 712 provide a portion of a user interface.
  • User input devices 712 may include an alpha-numeric keypad for inputting alphanumeric and other information, or a pointing device, such as a mouse, a trackball, stylus, or cursor direction keys.
  • the computer system of Figure 7 includes graphics subsystem 718 and output display 716.
  • Output display 716 may include a cathode ray tube (CRT) display, liquid crystal display (LCD) or other suitable display device.
  • Graphics subsystem 718 receives textual and graphical information, and processes the information for output to display 716.
  • the system of Figure 7 includes output devices 710. Examples of suitable output devices include speakers, printers, network interfaces, monitors, and other output devices known in the art.
  • the components contained in the computer system of Figure 7 are those typically found in computer systems suitable for use with certain embodiments of the present invention, and are intended to represent a broad category of such computer components known in the art.
  • the computer system of Figure 7 can be a personal computer, workstation, server, minicomputer, mainframe computer, or any other computing device.
  • Computer system 700 can also incorporate different bus configurations, networked platforms, multi-processor platforms, etc.
  • Various operating systems can be used including Unix, Linux, Windows, Macintosh OS, Palm OS, and other suitable operating systems.
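
The consensus rounds described above can be illustrated with a short sketch. The code below is only an illustration written for this description, not the patent's implementation: the class and method names (ConsensusRoundSketch, Member, onCollect, onBegin) are assumptions, the message exchange is simulated with local method calls rather than network (for example multicast) messages, and every member simply accepts the proposal. It shows the shape of one round: the leader collects prior-round information, proposes a value (here, the identity of a host server), counts accept replies, and commits only on a majority or quorum.

import java.util.*;

public class ConsensusRoundSketch {

    /** A cluster member's memory of the last round it accepted. */
    static class Member {
        final String name;
        int lastAcceptedRound = -1;
        String lastAcceptedValue = null;    // e.g. the host server it last accepted

        Member(String name) { this.name = name; }

        /** "collect": report the latest accepted round/value and (implicitly)
         *  promise not to commit values from older rounds. */
        Map.Entry<Integer, String> onCollect() {
            return new AbstractMap.SimpleEntry<>(lastAcceptedRound, lastAcceptedValue);
        }

        /** "begin": accept the leader's proposal for this round and reply with
         *  an "accept" message (always accepted in this simplified sketch). */
        boolean onBegin(int round, String proposedHost) {
            lastAcceptedRound = round;
            lastAcceptedValue = proposedHost;
            return true;
        }
    }

    public static void main(String[] args) {
        List<Member> cluster = Arrays.asList(
                new Member("serverA"), new Member("serverB"), new Member("serverC"));
        int round = 1;
        String initialValue = "serverB";    // leader's initial choice of host

        // Phase 1: the leader "multicasts" collect and gathers prior-round info.
        // The proposal becomes the value of the latest prior round, or the
        // leader's initial choice if no member accepted a value before.
        String proposal = initialValue;
        int latest = -1;
        for (Member m : cluster) {
            Map.Entry<Integer, String> reply = m.onCollect();
            if (reply.getKey() > latest && reply.getValue() != null) {
                latest = reply.getKey();
                proposal = reply.getValue();
            }
        }

        // Phase 2: the leader "multicasts" begin with the proposal and counts accepts.
        int accepts = 0;
        for (Member m : cluster) {
            if (m.onBegin(round, proposal)) accepts++;
        }

        // Commit only on a majority/quorum, then notify the cluster; otherwise
        // the leader would start a new round.
        if (accepts > cluster.size() / 2) {
            System.out.println("Quorum reached; host for the object is " + proposal);
        } else {
            System.out.println("No quorum; the leader would begin a new round");
        }
    }
}

In an actual cluster, the begin message and the commit notification could be multicast so that the leader sends a single message per step, with a server that misses a multicast pulling the result from the lead server later, as described above.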
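
The single-cache behavior described above, with reads served from the host's local copy and updates written back through to persistent storage, can be sketched as follows. This is a minimal, assumed illustration: the PersistentStore interface, the class name ExactlyOnceCacheSketch, and the in-memory "disk" map are invented for this example, and a real framework would sit in front of the shared file system, data store, or database named elsewhere in this description.

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ExactlyOnceCacheSketch {

    /** Stand-in for the shared file system, data store, or database. */
    interface PersistentStore {
        String read(String key);
        void write(String key, String value);
    }

    private final PersistentStore store;
    private final ConcurrentHashMap<String, String> localCache = new ConcurrentHashMap<>();

    ExactlyOnceCacheSketch(PersistentStore store) { this.store = store; }

    /** Reads are served from the host's local copy; persistent storage is
     *  touched only when the item is not yet cached. */
    String get(String key) {
        return localCache.computeIfAbsent(key, store::read);
    }

    /** Updates are written through to persistent storage before the cached
     *  copy changes, so the object can be recovered after a failure. */
    void put(String key, String value) {
        store.write(key, value);
        localCache.put(key, value);
    }

    public static void main(String[] args) {
        // In-memory stand-in for the file system backing the cache.
        Map<String, String> disk = new HashMap<>();
        disk.put("txLog", "initial log contents");

        ExactlyOnceCacheSketch host = new ExactlyOnceCacheSketch(new PersistentStore() {
            public String read(String key) { return disk.get(key); }
            public void write(String key, String value) { disk.put(key, value); }
        });

        System.out.println(host.get("txLog"));      // pulled from "disk" once, then cached
        host.put("txLog", "updated log contents");  // written through to "disk"
        System.out.println(disk.get("txLog"));      // persistent copy stays consistent
    }
}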
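
The recovery flow of Figure 4 can be sketched in the same spirit. Everything here is an assumption-level illustration: the consensus step is abstracted behind a selectByConsensus function, the "file system" is an in-memory map, and the multicast notification is simulated by updating each live server's routing table directly.

import java.util.*;
import java.util.function.BiFunction;

public class HostFailoverSketch {

    static class Server {
        final String name;
        boolean alive = true;
        final Map<String, String> hostTable = new HashMap<>();   // objectId -> current host

        Server(String name) { this.name = name; }
    }

    /** Re-hosts an object when its current host can no longer serve it
     *  (steps 402-408 of Figure 4, heavily abstracted). */
    static void failOver(String objectId,
                         List<Server> cluster,
                         Map<String, String> fileSystem,
                         BiFunction<String, List<Server>, Server> selectByConsensus) {
        // 402/404: a new host is selected using a distributed consensus
        // algorithm (abstracted behind selectByConsensus in this sketch).
        Server newHost = selectByConsensus.apply(objectId, cluster);

        // 406: the new host pulls the object's state from persistent storage
        // into its local cache.
        String state = fileSystem.get(objectId);
        System.out.println(newHost.name + " restored '" + objectId + "': " + state);

        // 408: the other servers are notified (for example by multicast) that
        // future requests for this object should go to the new host.
        for (Server s : cluster) {
            if (s.alive) s.hostTable.put(objectId, newHost.name);
        }
    }

    public static void main(String[] args) {
        List<Server> cluster = Arrays.asList(
                new Server("serverA"), new Server("serverB"), new Server("serverC"));
        Map<String, String> fileSystem = new HashMap<>();
        fileSystem.put("txLog", "persisted transaction log");

        cluster.get(0).alive = false;   // the original host fails

        // Trivial stand-in for the consensus step: pick the first live server.
        failOver("txLog", cluster, fileSystem,
                (id, servers) -> servers.stream().filter(s -> s.alive).findFirst().get());

        System.out.println("serverC now routes 'txLog' to " +
                cluster.get(2).hostTable.get("txLog"));
    }
}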
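
The leasing mechanism can likewise be sketched. The lease period, the scheduling of renewals, and all names (LeaseSketch, grant, renewAll) are assumptions made for illustration; the description above states only that the lead server leases each exactly-once object to a host for a period of time and periodically renews the lease while the host is operating properly.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

public class LeaseSketch {

    static class Lease {
        final String objectId;
        final String hostServer;
        volatile long expiresAtMillis;

        Lease(String objectId, String hostServer, long expiresAtMillis) {
            this.objectId = objectId;
            this.hostServer = hostServer;
            this.expiresAtMillis = expiresAtMillis;
        }

        boolean expired() { return System.currentTimeMillis() > expiresAtMillis; }
    }

    private final Map<String, Lease> leases = new ConcurrentHashMap<>();
    private final long leasePeriodMillis;

    LeaseSketch(long leasePeriodMillis) { this.leasePeriodMillis = leasePeriodMillis; }

    /** The lead server assigns an exactly-once object to a host for one lease period. */
    void grant(String objectId, String hostServer) {
        leases.put(objectId, new Lease(objectId, hostServer,
                System.currentTimeMillis() + leasePeriodMillis));
    }

    /** Periodic renewal: leases of healthy hosts are extended; a host that has
     *  failed or become disconnected does not get its lease renewed, so its
     *  objects become candidates for re-hosting. */
    void renewAll(Predicate<String> hostIsHealthy) {
        for (Lease lease : leases.values()) {
            if (hostIsHealthy.test(lease.hostServer)) {
                lease.expiresAtMillis = System.currentTimeMillis() + leasePeriodMillis;
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        LeaseSketch lead = new LeaseSketch(200);          // 200 ms lease period (arbitrary)
        lead.grant("txLog", "serverB");

        Thread.sleep(300);                                // the lease period passes
        lead.renewAll(host -> false);                     // pretend serverB is unreachable

        System.out.println("txLog lease expired: " + lead.leases.get("txLog").expired());
    }
}

A host whose lease is not renewed simply stops acting as the object's access point when the lease expires, after which the object can be re-hosted through the consensus and recovery steps sketched above.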

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Hardware Redundancy (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention concerns a system for managing objects in a clustered network, comprising a file system (212) containing at least one copy of a data object (208). The system can comprise several clustered servers in communication with the file system (212). A lead server is chosen, which has a distributed consensus algorithm for selecting a host server (206), and uses multicasting while executing rounds of the algorithm. The chosen host server (206) can contain a copy of the data object (208), for example in a local cache, which gives every other server in the cluster access to the local copy (208). Any change made to an item hosted by the host server (206) can also be updated in the file system (212). If the host server (206) becomes unable to host the object, a new host can be chosen by running the distributed consensus algorithm. The other servers (216, 218) are then informed of the choice of the new host by multicast messaging.
EP02798120A 2001-09-06 2002-09-05 Exactly once cache framework Ceased EP1433073A4 (fr)

Applications Claiming Priority (9)

Application Number Priority Date Filing Date Title
US31771801P 2001-09-06 2001-09-06
US31756601P 2001-09-06 2001-09-06
US317718P 2001-09-06
US317566P 2001-09-06
US234597 2002-09-04
US10/234,597 US7113980B2 (en) 2001-09-06 2002-09-04 Exactly once JMS communication
US10/234,693 US6826601B2 (en) 2001-09-06 2002-09-04 Exactly one cache framework
US234693 2002-09-04
PCT/US2002/028199 WO2003023633A1 (fr) 2002-09-05 2003-03-20 Exactly once cache framework

Publications (2)

Publication Number Publication Date
EP1433073A1 (fr) 2004-06-30
EP1433073A4 EP1433073A4 (fr) 2009-11-18

Family

ID=27499750

Family Applications (1)

Application Number Title Priority Date Filing Date
EP02798120A Ceased EP1433073A4 (fr) 2001-09-06 2002-09-05 Structure d'antememoire exactement une fois exactly once cache framework

Country Status (5)

Country Link
EP (1) EP1433073A4 (fr)
JP (1) JP2005502957A (fr)
CN (1) CN1568467B (fr)
AU (1) AU2002332845B2 (fr)
WO (1) WO2003023633A1 (fr)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7698465B2 (en) * 2004-11-23 2010-04-13 Microsoft Corporation Generalized Paxos
JP4707477B2 (ja) * 2005-06-23 2011-06-22 Fujitsu Ltd File sharing program and file sharing apparatus
EP2021910A4 (fr) * 2006-05-16 2015-05-06 Oracle Int Corp Next-generation cluster management
JP5395517B2 (ja) * 2009-05-29 2014-01-22 Nippon Telegraph And Telephone Corp Distributed data management system, data management device, data management method, and program
JP5123961B2 (ja) * 2010-02-04 2013-01-23 株式会社トライテック Distributed computing system, distributed computing method, and distributed computing program
JP5292351B2 (ja) * 2010-03-30 2013-09-18 Nippon Telegraph And Telephone Corp Message queue management system, lock server, message queue management method, and message queue management program
JP5292350B2 (ja) * 2010-03-30 2013-09-18 Nippon Telegraph And Telephone Corp Message queue management system, lock server, message queue management method, and message queue management program
US8595546B2 (en) * 2011-10-28 2013-11-26 Zettaset, Inc. Split brain resistant failover in high availability clusters
CN107181608B (zh) * 2016-03-11 2020-06-09 Alibaba Group Holding Ltd Method for recovering services and improving performance, and operation and maintenance management system
CN106899648B (zh) * 2016-06-20 2020-02-14 Alibaba Group Holding Ltd Data processing method and device
JP7099305B2 (ja) 2018-12-20 2022-07-12 Fujitsu Ltd Communication device, communication method, and communication program

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000026782A1 (fr) * 1998-11-03 2000-05-11 Sun Microsystems Limited File server system

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5802291A (en) * 1995-03-30 1998-09-01 Sun Microsystems, Inc. System and method to control and administer distributed object servers using first class distributed objects
US6067477A (en) * 1998-01-15 2000-05-23 Eutech Cybernetics Pte Ltd. Method and apparatus for the creation of personalized supervisory and control data acquisition systems for the management and integration of real-time enterprise-wide applications and systems
US6122629A (en) * 1998-04-30 2000-09-19 Compaq Computer Corporation Filesystem data integrity in a single system image environment
US6389462B1 (en) * 1998-12-16 2002-05-14 Lucent Technologies Inc. Method and apparatus for transparently directing requests for web objects to proxy caches
US6826601B2 (en) * 2001-09-06 2004-11-30 Bea Systems, Inc. Exactly one cache framework

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000026782A1 (fr) * 1998-11-03 2000-05-11 Sun Microsystems Limited File server system

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
BIRMAN K P ET AL: "Implementing fault-tolerant distributed objects" PROCEEDINGS OF THE FOURTH SYMPOSIUM ON RELIABILITY IN DISTRIBUTED SOFTWARE AND DATABASE SYSTEMS (CAT. NO. 84CH2082-6) IEEE COMPUT. SOC. PRESS SILVER SPRING, MD, USA, 1984, pages 124-133, XP002548687 ISBN: 0-8186-0564-2 *
DE PRISCO R ET AL: "Revisiting the PAXOS algorithm" DISTRIBUTED ALGORITHMS. 11TH INTERNATIONAL WORKSHOP, WDAG '97. PROCEEDINGS SPRINGER-VERLAG BERLIN, GERMANY, 1997, pages 111-125, XP002548686 ISBN: 3-540-63575-0 *
LEBOUTE, WEBER: "A reliable distributed file system for UNIX based on NFS"[Online] 14 January 1998 (1998-01-14), XP002548684 Retrieved from the Internet: URL:http://www.public.asu.edu/~ychen10/conference/ifip98/papers/weber.ps> [retrieved on 2009-10-02] *
MARCHETTI C ET AL: "An Interoperable Replication Logic for CORBA systems" PROCEEDINGS DOA'00. INTERNATIONAL SYMPOSIUM ON DISTRIBUTED OBJECTS AND APPLICATIONS IEEE COMPUT. SOC LOS ALAMITOS, CA, USA, 2000, pages 7-16, XP002548685 ISBN: 0-7695-0819-7 *
See also references of WO03023633A1 *
SONGNIAN ZHOU ET AL: "Utopia: a load sharing facility for large, heterogeneous distributed computer systems" SOFTWARE - PRACTICE AND EXPERIENCE UK, vol. 23, no. 12, December 1993 (1993-12), pages 1305-1336, XP002548688 ISSN: 0038-0644 *

Also Published As

Publication number Publication date
CN1568467B (zh) 2010-06-16
AU2002332845B2 (en) 2008-06-12
WO2003023633A1 (fr) 2003-03-20
EP1433073A4 (fr) 2009-11-18
JP2005502957A (ja) 2005-01-27
CN1568467A (zh) 2005-01-19

Similar Documents

Publication Publication Date Title
US6826601B2 (en) Exactly one cache framework
US7113980B2 (en) Exactly once JMS communication
US6839752B1 (en) Group data sharing during membership change in clustered computer system
US7146532B2 (en) Persistent session and data in transparently distributed objects
EP1533701B1 (fr) Système et procédé de basculement après faute
US7627694B2 (en) Maintaining process group membership for node clusters in high availability computing systems
US6003075A (en) Enqueuing a configuration change in a network cluster and restore a prior configuration in a back up storage in reverse sequence ordered
Stumm et al. Fault tolerant distributed shared memory algorithms
US7899897B2 (en) System and program for dual agent processes and dual active server processes
US20030088746A1 (en) Control method for a data storage system
JP2004519024A (ja) 多数のノードを含むクラスタを管理するためのシステム及び方法
AU2002332845B2 (en) Exactly once cache framework
AU2002332845A1 (en) Exactly once cache framework
GB2359384A (en) Automatic reconnection of linked software processes in fault-tolerant computer systems
US8230086B2 (en) Hidden group membership in clustered computer system
Amir et al. A highly available application in the Transis environment
Bartoli Implementing a replicated service with group communication
JPH1125062A (ja) 障害回復システム
Kaashoek et al. A Comparison of Two Paradigms for Distributed Computing
Bartoli et al. Service replication with sequential consistency in unreliable networks

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20040326

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR IE IT LI LU MC NL PT SE SK TR

AX Request for extension of the european patent

Extension state: AL LT LV MK RO SI

A4 Supplementary search report drawn up and despatched

Effective date: 20091020

17Q First examination report despatched

Effective date: 20100628

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: ORACLE INTERNATIONAL CORPORATION

APBK Appeal reference recorded

Free format text: ORIGINAL CODE: EPIDOSNREFNE

APBN Date of receipt of notice of appeal recorded

Free format text: ORIGINAL CODE: EPIDOSNNOA2E

APBR Date of receipt of statement of grounds of appeal recorded

Free format text: ORIGINAL CODE: EPIDOSNNOA3E

APAF Appeal reference modified

Free format text: ORIGINAL CODE: EPIDOSCREFNE

REG Reference to a national code

Ref country code: DE

Ref legal event code: R003

APBT Appeal procedure closed

Free format text: ORIGINAL CODE: EPIDOSNNOA9E

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN REFUSED

18R Application refused

Effective date: 20200609