Process mobility protocol
This invention relates to techniques for mobile processing. It can be applied especially to real-time systems. Mobile processing involves the migration of processes from a source node (e.g. a machine) to a target node. The migration is not actually executed but emulated: a new process is instantiated on the target node and, after the context of the process on the source node has been copied, the process on the source node is killed. The net effect is that the processing of some data is moved from the source node to the target node. A process running on a node is called an "incarnation". The process at the source node is called the "old incarnation". The process at the target node is called the "new incarnation".
The requirement for process migration is that the current state of the process on the source node can be transferred to the target node in such a way that the new incarnation can continue processing on the target node. This means that not only must the process be replicated on the target node, but a snapshot of its current context must also be installed. Next, the new incarnation of the process must be put in the same state as the old incarnation. Finally, the old incarnation must be killed, while the new incarnation is jump-started to continue at the point where the old incarnation left off.
An operating system (OS) keeps track of the state and context of each running process and performs scheduling duties. However, operating systems do not provide the service of moving a process from one node to another. Indeed, operating systems are hardware oriented. In particular, representations of context and state differ between operating systems. Critical differences exist even between operating systems of the same brand but different releases, or between operating systems of the same brand and release but for different hardware platforms. In other words, operating systems lack the interoperability and code portability needed to implement mobile processing between different operating systems.
Conventional systems for mobile processing, like those employed in agent-based systems, rely on a so-called virtual machine (VM) to implement process mobility. The VM is an interpreter, which provides an instruction set, a set of registers, a garbage-collected heap, and memory storage usable by the processes. The VM interprets the code of the processes and hence controls the state and context of these processes. It also schedules the execution of the processes. It is relatively easy to freeze and ship the state and context of a process in such cases. In other words, the use of virtual machines makes process mobility between different operating systems possible.
However, there are major drawbacks. In particular, virtual machines provide a limited set of services compared to the services provided by the operating systems themselves. In other words, the processes are bound to the services offered by the virtual machine. For example, virtual machines provide only one specific form of shared memory or no shared memory at all. In addition, since virtual machines provide an abstraction layer, they do their own scheduling, threading and so on. Therefore, threading, scheduling and real-time behavior at the process level are not supported. Lastly, interprocess communication (IPC) is not commonly supported by virtual machines. Another drawback is that the use of a virtual machine requires the processes to be compliant with it (e.g. JAVA virtual machine, JADE platform). Therefore, mobility is limited to processes on the VM and other (interoperable) virtual machines on different nodes. There are other drawbacks:
- a VM is not suited for time- and safety-critical systems
- a VM is not suited for real-time systems
- it is hard to reach the primal resources of a node from within the VM
Through its characteristics, as described and claimed here below, the invention seeks to resolve the above-mentioned problems. It seeks especially to move a process safely between different nodes (such as different machines with different operating systems) without the drawbacks of virtual machines. To meet this goal and obtain other advantages that will appear more clearly here below, it is an object of the invention especially to provide a method for process mobility. The invention is implemented in portal processes, so-called gateways. According to a feature of the invention, the gateways transport the mobile process context (that of the old incarnation) from one node to another. To this end, the process gateways take care of starting up equivalent processes on different nodes, which have easy access to the node's resources. They have a context matching the node's hardware and OS. The process gateways are involved in the exchange of the necessary context information, in order to rebuild an equivalent context for the new incarnation. More precisely, it is an object of the invention to provide a method for moving a process from a source node to a target node, wherein:
- an old incarnation of the process (3), that is running on the source node, saves its context in a frame (13);
- a new incarnation of the process (4) is started on the target node, this new incarnation being in a ready state;
- a direct communication (21) is established between the old incarnation (3) and the new incarnation (4) of the process, in which the saved context is sent from the old incarnation to the new incarnation;
- the new incarnation starts (24) processing using the context sent.
The main advantages of the invention are that the processes have easy access to system resources and other processes, that the process mobility method can fulfill safety and security constraints, and that it is able to operate in a hybrid network. The method according to the invention can be used to keep the loads in a system balanced. Indeed, processes cause load on the node that hosts them. Too much load on a node may hamper the execution of the processes, especially when time-critical processes are involved. Mobile processing is a way to keep the loads balanced, by moving processes from overloaded nodes to nodes with sufficient capabilities and low load.
In one advantageous embodiment, the old incarnation saves its context regularly at predetermined synchronization points, this context being wrapped in a frame upon receipt of a signal from a source gateway. In another advantageous embodiment, the context contains the memory and the state of the process. In another advantageous embodiment, the new incarnation is started upon receipt of a signal from a target gateway.
In another advantageous embodiment, the signal for starting the new incarnation is sent upon receipt of a signal from the source gateway. In another advantageous embodiment, the direct communication is established using a socket, this socket being:
- sent by the new incarnation to the target gateway, and then
- sent by the target gateway to the source gateway, and then
- sent by the source gateway to the old incarnation.
In another advantageous embodiment, the old and new incarnations are multi-threaded processes. In another advantageous embodiment, each thread of the old incarnation saves its own context, one of those threads being a main thread, these saved contexts being wrapped in a single frame by the main thread.
The invention shall now be described in a more detailed manner in the context of a special exemplary practical embodiment. In this description, reference shall be made to the figures of the appended drawing, of which:
- Figure 1 is a timing diagram of an exemplary implementation of a process mobility method according to the invention;
- Figures 2a and 2b are timing diagrams of an exemplary implementation of a multi-threaded variant of the method.
Reference is now made to figure 1. According to the invention, synchronization points are added to the mobile processes. A synchronization point 5 is a step wherein the process (old incarnation) saves information, such as the content of its memory and its state. This information is similar to the information saved by a debugger generating a core dump file.
In order to perform the process mobility method, the gateway process 1 that is running on the source node sends a "move" signal 6 to the process 3 (old incarnation) that is running on its node. For the sake of clarity, the gateway process 1 that is running on the source node is called the source gateway. The source gateway 1 also sends a "start process" signal 8 to the gateway process 2 that is running on the target node. For the sake of clarity, the gateway process 2 that is running on the target node is called the target gateway. The "move" signal 6 and the "start process" signal 8 can be implemented for example using interrupts. The "move" signal 6 can be sent after or before the "start process" signal 8. The "move" signal 6 can be an interrupt. The "start process" signal 8 is more likely to take the form of a message containing the specifics (path, arguments, environment) of the process to be started.
The "move" signal 6 triggers steps 7, 13 and 14 performed by the old incarnation in order to build a frame. More precisely, when the old incarnation 3 receives the "move" signal 6, it enters a suspended state 7. All the processing done between the last synchronization point 5 and the receipt of the "move" signal 6 is lost. Then, the old incarnation performs a step 13, called context wrapping, wherein it builds a frame containing the information saved at the last synchronization point. At the end of the context wrapping 13, the process sends an "end" signal 14 to the source gateway 1, indicating that it is done with the context wrapping 13. This "end" signal is then relayed by the source gateway 1 to the target gateway 2 by means of a "hold" signal 15. According to another implementation (not shown), the "hold" signal 15 is sent directly by the old incarnation to the target gateway.
The "start process" signal 8 is a request to start an identical process 4 on the target node. However, this new process 4 (new incarnation) starts from zero. The target gateway 2 can start 9 the new incarnation using for example the fork/exec system calls in Unix. As the new incarnation starts, it performs a registration 10 with the operating system. This registration indicates to the operating system that the new process is a new incarnation type process. In other words, the new incarnation registers that it is mobile. The registration involves a local interaction with the target (local) gateway 2. For example, the registration 10 of the new incarnation encompasses the following steps:
1. Installation of wrapping functions.
2. Installation of restore functions.
3. Installation of unwrapping functions.
The wrapping functions are the functions used to perform the context wrapping 13. The restore functions are entry points, enabling the process to pick up execution from a given process state. This given process state is the process state saved at a pre-recorded synchronization point (for example synchronization point 5). The unwrapping functions perform the reverse action of the wrapping functions. From a wrapped context they build a usable process run-time context. These unwrapping functions are used in case the process appears to be a new incarnation. To perform the installation of the wrapping/restore/unwrapping functions, the locations of the functions for the process are stored in a jump table for example.
According to a preferred embodiment, the functions installed during the registration step 10 are handlers. A handler is a function that is called by the operating system on specific events, such as an exception X or a signal Y. If the exception X occurs or the signal Y is sent to the process, the operating system immediately suspends execution of the process and calls the appropriate handler. The installation of a handler typically takes the form of a system call, which overwrites the operating system's default handler. This embodiment, using low-level interrupts (signals) for interprocess communication and synchronization, enables the invention to be compliant with real-time scheduling issues. These low-level interrupts can only be generated on a host-wide level (i.e. on the local node), and not on a network-wide level. On some operating systems, handlers are "volatile", which means the operating system installs the default handler again prior to calling the handler. This can be undone by re-installing a user-defined handler for signal Y on exit of the handler function for signal Y.
During the registration 10, the new incarnation can also send a message to the target gateway, informing the gateway that the process is mobile. After receiving this message, the local gateway sends a reply. The reply can be an acknowledgement or a token identifying the process as a new incarnation.
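The interplay between synchronization points, the jump table and the wrapping, unwrapping and restore functions can be illustrated with a minimal sketch. It is not the implementation of the invention: the class `MobileProcess`, its toy workload (summing integers), the `sync_every` parameter and the use of JSON as frame format are all assumptions made for illustration only.

```python
import json


class MobileProcess:
    """Toy mobile process (hypothetical): sums the integers 0..limit-1."""

    def __init__(self, limit, sync_every=3):
        self.limit = limit
        self.sync_every = sync_every
        self.context = {"i": 0, "total": 0}   # live run-time context
        self.saved = dict(self.context)       # context at the last synchronization point
        # Registration: the locations of the functions are stored in a jump table.
        self.jump_table = {
            "wrap": self.wrap,
            "unwrap": self.unwrap,
            "restore": self.restore,
        }

    def sync_point(self):
        # Synchronization point: save a snapshot of memory and state.
        self.saved = dict(self.context)

    def step(self):
        # One unit of work, followed by a synchronization point every few steps.
        self.context["total"] += self.context["i"]
        self.context["i"] += 1
        if self.context["i"] % self.sync_every == 0:
            self.sync_point()

    def wrap(self):
        # Context wrapping: the frame holds only the last saved snapshot,
        # so work done after that synchronization point is lost.
        return json.dumps(self.saved).encode("utf-8")

    def unwrap(self, frame):
        # Reverse of wrap: rebuild a usable run-time context from a frame.
        self.context = json.loads(frame.decode("utf-8"))
        self.saved = dict(self.context)

    def restore(self):
        # Entry point: pick up execution from the restored process state.
        while self.context["i"] < self.limit:
            self.step()
        return self.context["total"]


# Old incarnation: the "move" signal arrives after 5 steps,
# the last synchronization point was reached after 3 steps.
old = MobileProcess(limit=10)
for _ in range(5):
    old.step()
frame = old.jump_table["wrap"]()

# New incarnation: starts from zero, then unwraps the frame and resumes.
new = MobileProcess(limit=10)
new.jump_table["unwrap"](frame)
result = new.jump_table["restore"]()
```

In this sketch the two steps performed after the last synchronization point are redone by the new incarnation, and the final result is the same as if the process had never moved.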
According to another embodiment, the process can derive the fact that it is a new incarnation from a flag in its context, which has been set during the start action 9. In other words, the process does not obtain this information from the gateway: it checks its own context instead.
Regardless of the approach used, at the end of the registration 10, the process should know whether or not it is a new incarnation of a migrating process (identified by an acknowledgement from the gateway, or by a flag in its context). If a process is not a new incarnation, it starts normal execution. If the process appears to be a new incarnation, it enters a preparation step 11. During the preparation step 11, the new incarnation 4 prepares itself to receive the wrapped context (the frame generated by the old incarnation). Part of the preparation step 11 is that the new incarnation requests a socket (a means of communication) from the operating system. For example, the preparation step 11 encompasses the following steps:
1. The new incarnation requests a socket, using system calls provided by the operating system.
2. The port number of the allocated socket is sent by the new incarnation in a message to the target gateway.
3. The new incarnation enters a hold-loop, waiting for a "listen" signal 16 to start listening on the allocated socket.
According to another embodiment, the new incarnation does not enter a hold-loop, but starts listening right away after sending the message (port number) to the target gateway. When the new incarnation is done with the registration 10 and its preparation step 11, it sends a "hold" signal 12 back to the target gateway 2.
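The preparation step, and the transfer it prepares for, can be sketched with ordinary TCP sockets. This is only an illustration under stated assumptions: the function names, the loopback address and the 4-byte length prefix are hypothetical choices, and a local thread stands in for the old incarnation so that the sketch is self-contained.

```python
import socket
import struct
import threading


def prepare_socket():
    """Request a socket from the OS; return it with its port number."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))      # port 0: the OS picks a free port
    return server, server.getsockname()[1]


def start_listening(server):
    """Action taken when the "listen" signal arrives."""
    server.listen(1)


def recv_exact(conn, count):
    """Read exactly `count` bytes from the connection."""
    data = b""
    while len(data) < count:
        chunk = conn.recv(count - len(data))
        if not chunk:
            raise ConnectionError("peer closed before the frame was complete")
        data += chunk
    return data


def receive_frame(server):
    """New incarnation: accept one connection and read one frame."""
    conn, _ = server.accept()
    with conn:
        (length,) = struct.unpack("!I", recv_exact(conn, 4))
        return recv_exact(conn, length)


def send_frame(port, frame, host="127.0.0.1"):
    """Old incarnation: connect to the advertised port and send the frame."""
    with socket.create_connection((host, port)) as conn:
        conn.sendall(struct.pack("!I", len(frame)) + frame)


def transfer_demo(frame):
    server, port = prepare_socket()    # the port number would go to the target gateway
    start_listening(server)            # listening before any sender connects
    sender = threading.Thread(target=send_frame, args=(port, frame))
    sender.start()                     # stands in for the old incarnation
    received = receive_frame(server)
    sender.join()
    server.close()
    return received
```

Calling `transfer_demo(b"wrapped context")` returns the same bytes on the receiving side. The sketch starts listening before the sender connects, which mirrors the ordering that the gateways enforce in the method.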
This "hold" signal 12 indicates that the new incarnation is in a ready state, waiting to establish a communication. This "hold" signal 12 can contain the address of the socket requested by the new incarnation. The target gateway 2 waits for both "hold" signals 12 and 15. Once both signals have been received, the target gateway 2 sends the "listen" signal 16 to the new incarnation. The new incarnation is then in a listening mode 17. The target gateway 2 also sends a "start sending" signal 18 to the source gateway. The "start sending" signal 18 can contain the address of the socket requested by the new incarnation. The "start sending" signal 18 is relayed by the source gateway 1 to the old incarnation by means of a "send" signal 19. This "send" signal 19 can be used to transmit the address of the socket.
According to a preferred embodiment, the old incarnation, after finishing context wrapping 13, enters a hold-loop, where it keeps requesting the port number of the allocated socket from the source gateway. Each request is relayed by the source gateway to the target gateway. If the new incarnation has successfully requested a socket and notified the target gateway, the target gateway responds with the proper port number to the source gateway. In turn, the source gateway responds to the old incarnation's request with this port number. This port number can be included in the "send" signal 19. The "send" signal can also be the port number itself. If the new incarnation has not yet finished claiming a socket and notifying the target gateway, the target gateway responds with a special "not yet finished" reply to the source gateway, which replies the same to the old incarnation. This "not yet finished" reply triggers another iteration of the hold-loop of the old incarnation.
After receiving the port number, the old incarnation 3 is in a sending mode. It uses the socket requested by the new incarnation to establish a direct communication 21 with the new incarnation. The wrapped context is sent during this communication 21 (formatted transmission). When the context has been sent over, the new incarnation sends a "done" signal 22 to the target gateway. A proprietary library can for example provide the means for wrapping the context 13 and for sending and receiving the wrapped context 21. The format is irrelevant.
After receiving the "done" signal 22, the target gateway sends a "go" signal 23 to the new incarnation, and a "done" signal 25 to the old incarnation. The "go" signal 23 causes the new incarnation to start processing using the information sent in the wrapped context. In other words, the new incarnation resumes the processing at the last synchronization point 5 of the old incarnation. The new incarnation has synchronization points, such as point 28, which allow it to move to another node again using the same method.
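The hold-loop of the old incarnation and the relaying by the two gateways can be sketched as follows. The classes, method names, the `NOT_YET` marker, the bound on the number of requests and the port value are hypothetical and serve only to make the request and reply sequence concrete; real gateways would exchange messages between nodes rather than call each other directly.

```python
NOT_YET = "not yet finished"       # hypothetical marker for the special reply


class TargetGateway:
    def __init__(self):
        self.port = None           # unknown until the new incarnation reports it

    def notify_port(self, port):
        # Called by the new incarnation once its socket is allocated.
        self.port = port

    def port_request(self):
        return NOT_YET if self.port is None else self.port


class SourceGateway:
    def __init__(self, target):
        self.target = target

    def port_request(self):
        # Relay each request to the target gateway and pass its reply back.
        return self.target.port_request()


def hold_loop(source_gateway, between_requests=lambda: None, max_requests=100):
    """Old incarnation: keep asking until a port number comes back.

    The bound `max_requests` is an assumption added for safety.
    """
    for attempt in range(1, max_requests + 1):
        reply = source_gateway.port_request()
        if reply != NOT_YET:
            return reply, attempt
        between_requests()         # a real system would wait or yield here
    raise TimeoutError("the new incarnation never reported a port number")


# Simulation: the new incarnation reports its port after the second request.
target = TargetGateway()
source = SourceGateway(target)
pending = iter([None, 50007])      # 50007 is an arbitrary example port


def new_incarnation_progress():
    port = next(pending)
    if port is not None:
        target.notify_port(port)


port, attempts = hold_loop(source, new_incarnation_progress)
```

In this simulation the first two requests receive the "not yet finished" reply and the third one returns the port number.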
The "done" signal 25 is relayed by the source gateway to the old incarnation by means of a "kill" signal 26. This "kill" signal stops the execution 27 of the old incarnation. The "kill" signal triggers an exit function in the old incarnation, which can perform orderly deregistration activities, release system resources and so on.
According to another embodiment, the old incarnation can be stopped by a kill signal from the operating system. However, a clean exit (i.e. with an exit function) is preferable to a forced exit. Indeed, the old incarnation may still be connected to a number of peers in the distributed environment, may hold some handles and so on, since it was interrupted in the middle of its execution (somewhere after the last synchronization point before the "move" signal 6).
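A clean exit triggered by a signal can be sketched with a handler that runs an exit function before leaving. The names `exit_function`, `on_kill`, `install_kill_handler` and the `released` list are hypothetical, and the choice of `SIGTERM` as the "kill" signal is an assumption; the sketch only shows the ordering of clean-up and termination.

```python
import signal

released = []   # hypothetical record of what the exit function released


def exit_function():
    """Hypothetical clean-up: deregister from peers and release handles."""
    released.append("peer connections")
    released.append("system handles")


def on_kill(signum, frame):
    # Run the clean-up first, then leave through the normal exit path.
    exit_function()
    raise SystemExit(0)


def install_kill_handler(signum=signal.SIGTERM):
    # Overwrite the default handler for the chosen signal.
    # In Python this call must be made from the main thread.
    signal.signal(signum, on_kill)
```

After `install_kill_handler()` has been called, delivery of the chosen signal runs `exit_function` and then terminates the process through `SystemExit`, instead of terminating it abruptly.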
Reference is now made to figure 2a, where the method described in relation to figure 1 is adapted to a multi-threaded process. In this exemplary implementation, each thread registers separately, and each thread wraps and unwraps its own context. The source gateway sends the "move" signal 6 to the main thread 3a. When the main thread 3a receives the "move" signal 6, it suspends the sub threads 3b and 3c, and then suspends itself. All the threads are then in a suspended state, that is to say all processing is stopped. Each thread 3a, 3b, 3c of the old incarnation has its own synchronization point 5a, 5b, 5c. All the processing done between each synchronization point and the receipt of the "move" signal 6 is lost.
Then, each thread wraps its own context. More precisely, each sub thread 3b, 3c wraps its own local context 13b, 13c and exits (stops executing) 27b, 27c. The main thread wraps its own local context 13a and wraps the global context of the old incarnation. The global context can be for example a sequence of the wrapped contexts 13b, 13c, 13a as illustrated in figure 2a. At the end of the (global) context wrapping, the main thread sends an "end" signal 14 to the source gateway, indicating that the global context wrapping is over. As previously described, the "end" signal is then relayed to the target gateway by means of a "hold" signal 15.
According to a preferred embodiment, there are two types of unwrapping functions: a main thread unwrapping function, and a sub thread unwrapping function. The main thread unwrapping function is used to unwrap the context for the main thread (which possibly forks sub threads; single-threaded processes have only a main thread). The sub thread unwrapping function unwraps the wrapped context for a sub thread.
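The construction of a global context as a sequence of per-thread wrapped contexts can be sketched as follows. The length-prefixed JSON records and the function names are assumptions made for illustration, and the sketch simplifies the method: short-lived threads are created only to wrap their context, whereas in the method the already running sub threads do so after being suspended.

```python
import json
import struct
import threading


def wrap_context(context):
    """Wrap one thread's local context as a length-prefixed JSON record."""
    body = json.dumps(context).encode("utf-8")
    return struct.pack("!I", len(body)) + body


def wrap_global(sub_contexts, main_context):
    """Each sub thread wraps its own context; the main thread builds the frame."""
    wrapped = [None] * len(sub_contexts)

    def sub_thread(index):
        # The sub thread wraps its local context and then exits.
        wrapped[index] = wrap_context(sub_contexts[index])

    threads = [threading.Thread(target=sub_thread, args=(i,))
               for i in range(len(sub_contexts))]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # Global context: sub thread contexts first, main thread context last.
    return b"".join(wrapped) + wrap_context(main_context)


def unwrap_global(frame):
    """Split a global frame back into the individual contexts, in order."""
    contexts, offset = [], 0
    while offset < len(frame):
        (length,) = struct.unpack_from("!I", frame, offset)
        offset += 4
        contexts.append(json.loads(frame[offset:offset + length].decode("utf-8")))
        offset += length
    return contexts
```

With two sub thread contexts and one main thread context, `unwrap_global(wrap_global(...))` yields the three contexts in the order sub thread, sub thread, main thread, matching the sequence 13b, 13c, 13a given as an example above.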
As for the single-threaded process, the source gateway also sends a "start process" signal 8 to the target gateway 2. This "start process" signal 8 is a request to start a process 4a on the target node. The process 4a has only one thread at first, which will become the main thread when the other threads start. The thread 4a performs a registration 10 and a preparation 11 as described above. The thread 4a then sends a "hold" signal 12 to the target gateway. When the "hold" signals from the source gateway 1 and the thread 4a have been received, the target gateway 2 sends a "listen" signal to the thread 4a (new incarnation). The thread 4a is then in a listening mode. The target gateway 2 also sends a "start sending" signal 18 to the source gateway, which is relayed to the remaining (main) thread 3a of the old incarnation.
Reference is now made to figure 2b, which shows the following steps of the exemplary implementation of the method. The "start sending" signal 18 is relayed by the source gateway to the remaining thread 3a by means of a "send" signal 19. A direct communication 21 is then established between the remaining thread 3a of the old incarnation and the first thread 4a of the new incarnation. Each time the first thread 4a receives the context of a sub thread, the first thread 4a starts a sub thread with the received context. The sub threads 4b, 4c are in a hold state. When the global context has been sent over, the main thread 4a of the new incarnation sends a "done" signal 22 to the target gateway. After receiving the "done" signal 22, the target gateway sends a "go" signal 23 to the main thread of the new incarnation, and a "done" signal 25 to the remaining thread of the old incarnation. The "go" signal 23 makes the threads 4a, 4b, 4c start. The threads are indeed scheduled by the operating system. Then, during the runtime, each thread has its own synchronization point 28a, 28b, 28c. The "done" signal 25 is relayed by the source gateway to the remaining thread 3a of the old incarnation by means of a "kill" signal 26. This "kill" signal stops the execution of the remaining thread 3a, that is to say it stops the execution of the old incarnation.
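The hold state of the sub threads and their release by the "go" signal can be sketched with an event object. The function `resume_threads`, the context fields `name` and `counter`, and the single unit of resumed work are hypothetical; the event merely stands in for the "go" signal 23.

```python
import threading


def resume_threads(sub_contexts, main_context):
    """Start held sub threads from received contexts, then release them on "go"."""
    go = threading.Event()             # stands in for the "go" signal
    results = {}
    lock = threading.Lock()

    def resume(context):
        go.wait()                      # hold state: block until "go"
        with lock:
            # One unit of resumed work, continuing from the received context.
            results[context["name"]] = context["counter"] + 1

    # The first thread starts one sub thread per received sub thread context.
    subs = [threading.Thread(target=resume, args=(ctx,)) for ctx in sub_contexts]
    for thread in subs:
        thread.start()

    held = dict(results)               # snapshot before "go": no thread has run yet
    go.set()                           # "go": every thread may now proceed
    resume(main_context)               # the main thread resumes its own context
    for thread in subs:
        thread.join()
    return held, results
```

The first returned value is empty, since no sub thread can make progress before the event is set; the second contains one entry per thread once all of them have resumed.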
In any case, although they are particularly advantageous, the special implementations described are not exhaustive. There is a variety of alternative implementations. These alternatives remain within the framework of the invention covered by the patent. For example, it is possible to use message passing instead of interrupts to send the signals. However, the "move" signal 6 should be an interrupt, in order to avoid impeding the move activities by recursion.