CN101383690B - Grid synchronization method for fault tolerant computer system based on socket - Google Patents


Publication number
CN101383690B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2008102318954A
Other languages
Chinese (zh)
Other versions
CN101383690A (en)
Inventor
董小社
胡冰
孙江斌
王钊
赵晓昳
伍卫国
田佳
雷济凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University
Priority to CN2008102318954A
Publication of CN101383690A
Application granted
Publication of CN101383690B
Legal status: Expired - Fee Related
Anticipated expiration


Abstract

The invention discloses a socket-based network synchronization method for fault-tolerant computer systems. By modifying the Linux operating system, a monitor is placed at the socket interface layer of the Linux network architecture to monitor the kernel functions involved in socket communication. When dual-redundant processes communicate over the network through sockets, the communication operation is intercepted at the socket interface layer and the synchronization logic is triggered. The synchronization logic is implemented by reconstructing the socket family of kernel functions: the redundant processes enter a synchronization point and complete synchronization when they execute a reconstructed kernel function. The method is simple to implement and needs no custom hardware. The reconstructed operating system runs on common hardware architectures and is therefore widely applicable. Synchronization is carried out by the operating system itself and is fully transparent to upper-layer applications. The method is also extensible and can be applied to multi-module redundant systems.

Description

A socket-based network synchronization method for fault-tolerant computer systems
Technical field
The invention belongs to the field of computing and relates to fault-tolerance and synchronization techniques, specifically to a socket-based network synchronization method for fault-tolerant computer systems.
Background art
With the widespread use of the Internet and information platforms, information technology has become an important part of an enterprise's core competitiveness. In modern enterprises, server downtime is one of the main causes of lost profit. For organizations that must guarantee information security and provide uninterrupted information services, such as those in securities, manufacturing, telecommunications, banking and transportation, the fault tolerance and continuity of business systems are particularly important. Keeping critical applications running continuously has become a key problem that enterprises, institutions and the IT field urgently need to solve. Fault-tolerant computers and related technologies arose from this demand; they use fault-tolerant computing to avoid the substantial economic losses caused by server failures.
A fault-tolerant computer is a highly reliable, highly available computer built on redundancy (hardware redundancy, time redundancy, information redundancy and software redundancy), with a sound architecture and under the effective management of system software. Fault detection is one of its key technologies, and synchronization is central to coordinating the voting of a multi-module fault-tolerant system and the isolation of faulty modules. In a multi-module redundant system, every module runs the same task, and the results are compared and voted on to detect errors. Wherever the compared or voted data resides, for example in the processor core, the cache or main memory, synchronization is a key step.
Synchronization methods can be divided into clock synchronization and task synchronization. Clock synchronization includes strict hardware clock synchronization, loose clock synchronization implemented partly in software, and the algorithms derived from them. It detects errors promptly and has low overhead, but it requires hardware support and is complex to implement. Task synchronization takes as its basis one run of an independently executing task with a defined logic. The system sets up one or more comparison and voting points within the task. At the same comparison and voting point, correctly running tasks have consistent process state data and application data, and are said to be in a task-synchronized state. At that point the intermediate results and final output of the task can be checked for consistency, that is, the process state data and application data are compared and voted on, and errors are thereby detected.
For network data, task synchronization can again be implemented in different ways. One method adds a pseudo device driver (PDD) layer between the device driver and the operating system kernel, places synchronization points and checkpoints in the PDD, and compares the data flowing through it. Another method moves the synchronization points into the operating system kernel, placing them in network core layer functions such as netif_rx() (receive a packet) and dev_queue_xmit() (send a packet). Some research modifies the IP layer, adding connection management and cache management modules to synchronize TCP connections on a high-availability cluster system. Synchronization can also be achieved directly with hardware lockstep, using specific hardware and architectures.
Of these network data synchronization methods, some sit too close to the underlying devices, some modify the TCP/IP protocol itself, and some depend entirely on hardware design; all of them are, to a greater or lesser degree, complex to implement.
Summary of the invention
The object of the invention is to overcome the shortcomings of the prior art described above and to provide a simple and effective socket-based network synchronization method that synchronizes network data in a fault-tolerant computer system.
To achieve this object, the invention adopts the following technical solution. A monitor is inserted at the socket interface layer of the Linux network architecture. A process that communicates over the network through sockets calls the socket interface functions: socket() to create a socket, bind() to bind a port, listen() to listen on a port, connect() to request a connection, accept() to accept a connection, close() to close a connection, send() or write() to send packets, and recv() or read() to receive packets. Each interface function enters the operating system kernel through a system call and executes the corresponding kernel function to perform the core operation. These kernel functions are sys_socket(), sys_bind(), sys_listen(), sys_accept(), sys_connect(), _sock_sendmsg(), _sock_recvmsg() and sys_close(). Monitoring is placed on these kernel functions, and the synchronization logic is triggered whenever one of them is called by a dual-redundant process.
The concrete steps are as follows:
1) Insert synchronization logic into the kernel function that creates the socket structure
Socket communication requires each communicating process to create a communication socket data structure, struct socket, and to associate it with the process as an open file backed by a socket-type inode. The kernel function sys_socket() performs this task. With synchronization logic inserted, sys_socket() makes the dual-redundant processes create only a single socket structure and share it for network communication. When the monitor detects a dual-redundant process calling sys_socket(), the synchronization logic runs. The process P1 that arrives first creates the socket structure, saves its address temporarily, and sleeps in the wait queue. The process P2 that arrives later obtains the socket structure from the address saved by P1 and wakes P1. Once the synchronization logic is complete, P1 and P2 each associate their own file object and directory entry object with the socket structure, which completes the creation of the socket structure shared by the two redundant processes;
2) Insert synchronization logic into the kernel functions that establish and close connections
Before sending and receiving packets, a process must complete the auxiliary operations that establish a connection: bind() binds the communication port, listen() listens on the port, accept() receives a connection request, and connect() sends a connection request. Of these, listen(), accept() and connect() are used only for TCP communication. In the synchronization logic for this class of functions, the two redundant processes are distinguished as master and slave: the master process performs the core operation, and the slave process only completes a formal synchronization;
When network communication ends, the process calls close() to close the connection. In the synchronization logic of the close function, the master process executes sock_close() to release the socket; the slave process only waits for synchronization and performs no core operation;
3) Insert synchronization logic into the kernel functions that send and receive packets
A communicating process calls send() or write() to send data; both correspond to the kernel function _sock_sendmsg(). On the sending side, when the slave process reaches the synchronization point it requests temporary space in the kernel heap and stores its outgoing packet there. When the master process arrives, it compares its own outgoing packet with the packet stored by the slave; if they are identical the data is sent, otherwise error handling is triggered. On the receiving side, when the master process reaches the synchronization point it calls the receive function to receive the packet, then copies the packet into space requested in the kernel heap. When the slave process arrives, it copies the packet from the kernel heap into its own address space.
The invention adds data structures and synchronization fields to the kernel. A wait queue, socket_wq, is defined; a task that has reached a synchronization point but has not yet completed synchronization sleeps in this queue. A data structure, struct ft_socket_msghdr, is defined to store the msghdr packet that the task at a synchronization point needs to compare. The fields ft_socket_bind_syn, ft_socket_listen_syn, ft_socket_connect_syn, ft_socket_read_syn, ft_socket_write_syn and ft_socket_close_syn are added to the socket structure as synchronization flags. Two pointers of type ft_socket_msghdr, *ft_msg_recv and *ft_msg_send, are added as the addresses of the kernel temporary storage areas. The field ft_socket_ret is added to store the return value of a function operation. In the process control block, ft_socket_executor is added to identify the master and slave processes, and *ft_create_socket and *ft_accept_socket are added as pointers to the shared socket structure at the synchronization points.
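The added fields can be pictured with a small user-space model. The following Python sketch is only an illustration and not kernel code: the class names FtSocketMsghdr, FtSocket and FtTask are hypothetical, the field types are assumptions, and a threading.Condition merely stands in for the kernel wait queue socket_wq. The field names themselves are those listed above.

```python
from dataclasses import dataclass
from typing import Optional
import threading


@dataclass
class FtSocketMsghdr:
    """User-space stand-in for struct ft_socket_msghdr (a saved msghdr copy)."""
    msg_name: bytes = b""        # msg_namelen is len(msg_name) in this model
    msg_iov: tuple = ()          # msg_iovlen is len(msg_iov) in this model
    msg_control: bytes = b""     # msg_controllen is len(msg_control) here
    msg_flags: int = 0


@dataclass
class FtSocket:
    """Stand-in for struct socket with the added synchronization fields."""
    ft_socket_bind_syn: int = 0
    ft_socket_listen_syn: int = 0
    ft_socket_connect_syn: int = 0
    ft_socket_read_syn: int = 0
    ft_socket_write_syn: int = 0
    ft_socket_close_syn: int = 0
    ft_msg_recv: Optional[FtSocketMsghdr] = None   # temporary area, receive
    ft_msg_send: Optional[FtSocketMsghdr] = None   # temporary area, send
    ft_socket_ret: int = 0                         # saved return value


@dataclass
class FtTask:
    """Stand-in for the fields added to the process control block."""
    ft_socket_executor: int = 0                    # 1 = master, 0 = slave
    ft_create_socket: Optional[FtSocket] = None
    ft_accept_socket: Optional[FtSocket] = None


# Plays the role of the wait queue socket_wq in this model.
socket_wq = threading.Condition()
```

All flags default to 0, which matches the requirement that the synchronization flags be cleared when the socket structure is created.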
The monitor determines the process type. If the process executing a kernel function is a redundant process, the version of the kernel function with synchronization logic inserted is executed. The functions sys_socket(), sys_bind(), sys_listen(), sys_accept(), sys_connect(), _sock_sendmsg(), _sock_recvmsg() and sys_close() are modified in turn, as follows, to insert the synchronization logic.
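The monitor's decision can be sketched as a simple dispatch. This Python fragment is a minimal illustration under stated assumptions: how the kernel marks a process as redundant is not specified in the text, so the "redundant" key, the monitored() helper and the two stand-in bind functions are all hypothetical.

```python
def monitored(original, reconstructed, is_redundant):
    """Route redundant processes to the reconstructed function."""
    def dispatch(task, *args):
        if is_redundant(task):
            # Redundant process: run the version with synchronization logic.
            return reconstructed(task, *args)
        # Ordinary process: the original kernel path is left untouched.
        return original(task, *args)
    return dispatch


# Hypothetical stand-ins for an original and a reconstructed kernel function.
def plain_bind(task, port):
    return ("plain", port)


def ft_bind(task, port):
    return ("synchronized", port)


# Assumption: a task is a dict carrying a "redundant" marker.
sys_bind = monitored(plain_bind, ft_bind,
                     lambda task: task.get("redundant", False))
```

Processes that are not part of a redundant pair see unchanged behavior, which is consistent with the transparency claimed for the method.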
Modification of sys_socket(): sys_socket() creates a socket structure and links it to the process; creation is done by sock_create() and linking by sock_map_fd(). When a dual-redundant task process calls sys_socket() to create a socket, the two redundant processes are synchronized so that they share one socket structure. The process type is determined from ft_socket_executor: a value of 1 marks the master process and a value of 0 marks the slave process. If the master reaches the synchronization point first, it calls sock_create() to create the socket, stores the address of the new socket structure in ft_create_socket, and waits in the wait queue socket_wq; the slave, arriving later, reads ft_create_socket and wakes the waiting master to complete synchronization. If the slave reaches the synchronization point first, it sleeps in socket_wq; the master, arriving later, completes the creation and then wakes the slave to complete synchronization. sock_create() ultimately calls sock_alloc() to allocate the socket structure and its inode; the inode reference count i_count is set to 2 so that the two redundant processes can share the socket structure and inode. When the socket structure is created, all synchronization flags, such as ft_socket_bind_syn and ft_socket_listen_syn, must be initialized to 0. Afterwards each redundant process calls sock_map_fd() to associate its own file object and directory entry object with the inode. The file pointer of the socket structure points to the file object of the master process.
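The rendezvous in sys_socket() can be simulated in user space with two threads. The sketch below is not the kernel implementation: RedundantPair, slave_attached, ft_sys_socket() and create_shared_socket() are hypothetical names, a dict stands in for the socket structure, and a threading.Condition stands in for socket_wq. As a simplification, the master here always waits until the slave has attached, whichever process arrives first.

```python
import threading


class RedundantPair:
    """State shared by one master/slave pair (hypothetical container)."""

    def __init__(self):
        self.socket_wq = threading.Condition()  # plays the role of socket_wq
        self.ft_create_socket = None            # address saved by the master
        self.slave_attached = False             # set once the slave has read it


def ft_sys_socket(pair, ft_socket_executor):
    """Return the single shared socket object to both callers."""
    with pair.socket_wq:
        if ft_socket_executor == 1:
            # Master: create the socket once, with the flags cleared and the
            # inode reference count set to 2 for the two sharers.
            pair.ft_create_socket = {
                "i_count": 2,
                "ft_socket_bind_syn": 0,
                "ft_socket_listen_syn": 0,
            }
            pair.socket_wq.notify_all()         # wake a slave that came first
            pair.socket_wq.wait_for(lambda: pair.slave_attached)
            return pair.ft_create_socket
        # Slave: sleep until the master has published the socket.
        pair.socket_wq.wait_for(lambda: pair.ft_create_socket is not None)
        pair.slave_attached = True
        pair.socket_wq.notify_all()             # wake a master that came first
        return pair.ft_create_socket


def create_shared_socket():
    """Run master and slave concurrently; return what each one obtained."""
    pair, got = RedundantPair(), {}

    def run(executor):
        got[executor] = ft_sys_socket(pair, executor)

    threads = [threading.Thread(target=run, args=(e,)) for e in (0, 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got[1], got[0]
```

Calling create_shared_socket() returns the master's and the slave's view of the socket; both refer to the same object regardless of which thread reached the synchronization point first.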
Modification of sys_bind(): master and slave are distinguished by ft_socket_executor and perform different operations. The master executes ops->bind() to bind the port, stores the return value in ft_socket_ret, sets the synchronization flag ft_socket_bind_syn to 1, and wakes the slave in the wait queue. When the slave reaches the synchronization point, if ft_socket_bind_syn is 1 it passes the synchronization point, clears ft_socket_bind_syn to 0 and reads the return value from ft_socket_ret; if ft_socket_bind_syn is 0 it sleeps and waits.
Modification of sys_listen(): master and slave are distinguished by ft_socket_executor and perform different operations. The master executes ops->listen() to listen on the port, stores the return value in ft_socket_ret, sets the synchronization flag ft_socket_listen_syn to 1, and wakes the slave in the wait queue socket_wq. When the slave reaches the synchronization point, if ft_socket_listen_syn is 1 it passes the synchronization point, clears ft_socket_listen_syn to 0 and reads the return value from ft_socket_ret; if ft_socket_listen_syn is 0 it sleeps in socket_wq and waits.
Modification of sys_accept(): sys_accept() is called by a listening process to receive a connection request and create a new communication socket. Master and slave are distinguished by ft_socket_executor. The master calls sock_alloc() to allocate a new socket structure, turns the newly created socket into a communication socket through ops->accept(), saves its address in ft_accept_socket, and wakes the wait queue. The slave determines its arrival order from whether ft_accept_socket is empty: if it reached the synchronization point first it waits in the wait queue; if it arrived later it reads the newly created socket through ft_accept_socket and establishes its association with it.
Modification of sys_connect(): master and slave are distinguished by ft_socket_executor and perform different operations. The master executes ops->connect() to request a connection, stores the return value in ft_socket_ret, sets the synchronization flag ft_socket_connect_syn to 1, and wakes the slave in the wait queue socket_wq. When the slave reaches the synchronization point, if ft_socket_connect_syn is 1 it passes the synchronization point, clears ft_socket_connect_syn to 0 and reads the return value from ft_socket_ret; if ft_socket_connect_syn is 0 it sleeps in socket_wq and waits.
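sys_bind(), sys_listen() and sys_connect() share one flag-based pattern: the master performs the core operation and publishes the return value, and the slave only consumes it. The following user-space Python sketch illustrates that pattern under assumptions: SharedSocket, ft_sync_op(), run_op() and the core_op callback are hypothetical names, and a threading.Condition stands in for socket_wq.

```python
import threading


class SharedSocket:
    """Stand-in for the shared socket structure (hypothetical class)."""

    def __init__(self):
        self.socket_wq = threading.Condition()  # plays the role of socket_wq
        self.ft_socket_bind_syn = 0
        self.ft_socket_listen_syn = 0
        self.ft_socket_connect_syn = 0
        self.ft_socket_ret = 0


def ft_sync_op(sock, op, ft_socket_executor, core_op=None):
    """Master runs core_op once; the slave only picks up the saved result."""
    flag = f"ft_socket_{op}_syn"
    with sock.socket_wq:
        if ft_socket_executor == 1:
            sock.ft_socket_ret = core_op()      # e.g. the real bind
            setattr(sock, flag, 1)              # mark the operation as done
            sock.socket_wq.notify_all()         # wake a sleeping slave
            return sock.ft_socket_ret
        # Slave: sleep until the flag is 1, then clear it and read the result.
        sock.socket_wq.wait_for(lambda: getattr(sock, flag) == 1)
        setattr(sock, flag, 0)
        return sock.ft_socket_ret


def run_op(sock, op, core_op):
    """Run slave and master concurrently; return (master_ret, slave_ret)."""
    got = {}

    def run(executor):
        got[executor] = ft_sync_op(sock, op, executor, core_op)

    threads = [threading.Thread(target=run, args=(e,)) for e in (0, 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got[1], got[0]
```

In this model the core operation runs exactly once per pair, and both processes observe the same return value, which is what keeps the replicas' view of the call consistent.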
Modification of send() and write(): both functions can be used to send data, and both ultimately call _sock_sendmsg() in the kernel, so modifying _sock_sendmsg() captures both call paths. Space is allocated in the kernel heap for ft_msg_send to hold the packet to be compared. Master and slave are distinguished by ft_socket_executor. The slave first prepares the packet to send, struct msghdr msg, copies msg into the memory pointed to by ft_msg_send, sets ft_socket_write_syn to 1, wakes the wait queue, and then sleeps in the wait queue itself. When the master reaches the synchronization point, if ft_socket_write_syn is 0 it waits in the wait queue; if ft_socket_write_syn is 1 it compares its own packet with the data in the memory pointed to by ft_msg_send, finally calls ops->sendmsg() to send the data, clears ft_socket_write_syn to 0, and wakes the wait queue. The comparison covers the entire contents of the msghdr structure, including msg_name, msg_namelen, msg_iov, msg_iovlen, msg_control, msg_controllen and msg_flags.
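The compare-before-send handshake can be modeled as follows. This is a user-space sketch, not the kernel code: SendSync, ReplicaMismatch, ft_sendmsg(), send_from_both() and the lower_send callback are hypothetical names, a dict stands in for struct msghdr, and raising an exception stands in for the error handling that the text leaves unspecified. Clearing the flag on a mismatch so that the slave does not sleep forever is also an assumption of this sketch.

```python
import threading


class ReplicaMismatch(Exception):
    """Raised when master and slave packets differ (stand-in error path)."""


class SendSync:
    def __init__(self, lower_send):
        self.socket_wq = threading.Condition()  # plays the role of socket_wq
        self.ft_socket_write_syn = 0
        self.ft_msg_send = None                 # temporary area for the copy
        self.lower_send = lower_send            # stands in for ops->sendmsg()


def ft_sendmsg(s, ft_socket_executor, msg):
    with s.socket_wq:
        if ft_socket_executor == 0:
            # Slave: store a copy, raise the flag, wake the master, sleep.
            s.ft_msg_send = dict(msg)
            s.ft_socket_write_syn = 1
            s.socket_wq.notify_all()
            s.socket_wq.wait_for(lambda: s.ft_socket_write_syn == 0)
            s.ft_msg_send = None                # release the temporary area
            return
        # Master: wait for the slave's copy, compare, then send once.
        s.socket_wq.wait_for(lambda: s.ft_socket_write_syn == 1)
        try:
            if msg != s.ft_msg_send:
                raise ReplicaMismatch("master and slave packets differ")
            s.lower_send(msg)
        finally:
            s.ft_socket_write_syn = 0           # assumption: also on mismatch
            s.socket_wq.notify_all()


def send_from_both(master_msg, slave_msg):
    """Run both replicas; return (packets actually sent, errors raised)."""
    sent, errors = [], []
    s = SendSync(sent.append)

    def master():
        try:
            ft_sendmsg(s, 1, master_msg)
        except ReplicaMismatch as exc:
            errors.append(exc)

    threads = [threading.Thread(target=master),
               threading.Thread(target=ft_sendmsg, args=(s, 0, slave_msg))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sent, errors
```

When the two replicas produce identical packets, exactly one packet reaches the lower layer; when they differ, nothing is sent and the mismatch is reported.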
Modification of recv() and read(): both functions can be used to receive data, and both ultimately call _sock_recvmsg() in the kernel, so modifying _sock_recvmsg() captures both call paths. Space is allocated in the kernel heap for ft_msg_recv to hold the packet to be duplicated. Master and slave are distinguished by ft_socket_executor. The master first calls ops->recvmsg() to receive the data into struct msghdr msg, copies the contents of msg field by field into the memory pointed to by ft_msg_recv, sets the synchronization flag ft_socket_read_syn to 1, and waits in the wait queue. When the slave arrives, it copies the packet from the memory pointed to by ft_msg_recv into its own struct msghdr msg, including msg_name, msg_namelen, msg_iov, msg_iovlen, msg_control, msg_controllen and msg_flags. Finally it clears ft_socket_read_syn to 0 and wakes the master in the wait queue.
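The receive side mirrors the send side: one real receive, then a copy handed to the slave. The user-space Python sketch below illustrates this under assumptions: RecvSync, ft_recvmsg(), receive_on_both() and the lower_recv callback are hypothetical names, a dict stands in for struct msghdr, and a threading.Condition stands in for socket_wq.

```python
import threading


class RecvSync:
    def __init__(self, lower_recv):
        self.socket_wq = threading.Condition()  # plays the role of socket_wq
        self.ft_socket_read_syn = 0
        self.ft_msg_recv = None                 # temporary area for the copy
        self.lower_recv = lower_recv            # stands in for ops->recvmsg()


def ft_recvmsg(s, ft_socket_executor):
    with s.socket_wq:
        if ft_socket_executor == 1:
            # Master: do the one real receive and publish a copy.
            msg = s.lower_recv()
            s.ft_msg_recv = dict(msg)
            s.ft_socket_read_syn = 1
            s.socket_wq.notify_all()            # wake a slave that came first
            # Wait until the slave has taken its copy and cleared the flag.
            s.socket_wq.wait_for(lambda: s.ft_socket_read_syn == 0)
            s.ft_msg_recv = None                # release the temporary area
            return msg
        # Slave: wait for the master's copy, take it, clear the flag.
        s.socket_wq.wait_for(lambda: s.ft_socket_read_syn == 1)
        msg = dict(s.ft_msg_recv)
        s.ft_socket_read_syn = 0
        s.socket_wq.notify_all()                # wake the waiting master
        return msg


def receive_on_both(packets):
    """Run both replicas against one inbound queue; return (master, slave)."""
    s, got = RecvSync(lambda: packets.pop(0)), {}

    def run(executor):
        got[executor] = ft_recvmsg(s, executor)

    threads = [threading.Thread(target=run, args=(e,)) for e in (0, 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got[1], got[0]
```

Only one packet is consumed from the lower layer, yet both replicas return the same contents, so the application in each redundant process sees identical input.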
Modification of sys_close(): master and slave are distinguished by ft_socket_executor and perform different operations. The master executes sock_close() to release the socket, sets the synchronization flag ft_socket_close_syn to 1, and wakes the slave in the wait queue socket_wq. When the slave reaches the synchronization point, if ft_socket_close_syn is 1 it passes the synchronization point and clears ft_socket_close_syn to 0; if ft_socket_close_syn is 0 it sleeps in socket_wq and waits.
By modifying the Linux operating system kernel, the invention places synchronization points at the socket interface layer and synchronizes the network communication of dual-redundant processes. A monitor is set at the socket layer and synchronization logic is inserted to reconstruct the operating system, which then actively monitors the socket-related kernel functions, intercepts dual-redundant processes that call them for network communication, and synchronizes those processes. The whole procedure is fully transparent to upper-layer applications. Compared with other implementations, the method is simple to implement, requires no change to hardware design, is transparent to device drivers, and requires no modification of the lower-layer network protocols.
Description of drawings
Fig. 1 is the workflow diagram of the invention;
Fig. 2 is the synchronization principle diagram of the invention;
Fig. 3 is the workflow diagram of the master process at the data-receive synchronization point;
Fig. 4 is the workflow diagram of the slave process at the data-receive synchronization point;
Fig. 5 is the workflow diagram of the master process at the data-send synchronization point;
Fig. 6 is the workflow diagram of the slave process at the data-send synchronization point.
Embodiment
The invention is described in further detail below with reference to the accompanying drawings.
Fig. 1 takes dual redundancy running on the server side as an example; if the client also needs to run dual redundancy, the flow is similar. As shown in the figure, the redundant processes synchronize on the create(), bind(), listen(), accept(), read(), write() and close() operations. On the server side in the figure, the process on the left is the master, that is, the process whose master/slave flag ft_socket_executor is 1; the process on the right is the slave, whose ft_socket_executor is 0. Operations in solid boxes perform the actual core operation; operations in dashed boxes only complete synchronization. For create(), the master creates the socket and inode structures, and the slave links to the socket structure at the synchronization point, so the two processes share one socket structure. For bind(), the master binds the socket structure to the specified port; the slave waits at the synchronization point and performs no real operation. For listen(), the master turns the socket it created into a listening socket; the slave waits at the synchronization point and performs no real operation. For accept(), the master creates a new communication socket and the slave associates itself with this socket at the synchronization point, so again the two processes share the socket. For read(), the master copies the data into the kernel temporary storage area when it reads the packet passed up from the lower layer, and the slave reads the packet from the temporary storage area at the synchronization point. For write(), the slave stores the data to be written in the kernel temporary storage area at the synchronization point; the master compares its own data with the data stored by the slave and finally passes the data to the lower layer. For close(), the master releases the socket structure and the slave waits at the synchronization point. As the above shows, the master and slave always communicate through the shared socket structure.
The synchronization principle for dual-redundant processes is shown in Fig. 2. Tasks P1 and P2 are a pair of redundant processes that execute independently and synchronize when they make socket system calls to send or receive network data. In the figure, the first synchronization is for sending data: P2 is ready to send and reaches the synchronization point first, while P1 has not yet arrived, so P2 enters the wait state. After time t1, P1 also reaches the synchronization point; the kernel starts the synchronization procedure, compares the outgoing data, and sends it to the lower layer. The second synchronization is for receiving: P1 reaches the synchronization point first and waits, and P2 arrives after time t2; the kernel synchronizes them, making two copies of the data from the lower layer and delivering one to each of P1 and P2. The system builds a dedicated wait queue for synchronization: the task that reaches a synchronization point first waits in the queue, and the task that arrives later wakes it from the queue after entering the synchronization point.
Referring to Fig. 3, when the master process enters the data-receive synchronization point, it first calls sock->ops->recvmsg() to receive the packet, allocates the temporary storage area for ft_msg_recv, and copies the received packet field by field into the area pointed to by ft_msg_recv. It then sets the synchronization flag ft_socket_read_syn to 1, performs wake_up() on the wait queue socket_wq to wake a slave process that may be waiting, and enters socket_wq itself. While in the wait queue the master repeatedly checks ft_socket_read_syn; once the slave has cleared the flag to 0, the master leaves the wait queue, releases the temporary storage area, and ends the data-receive synchronization.
Referring to Fig. 4, when the slave process reaches the data-receive synchronization point, it first waits in socket_wq and repeatedly checks ft_socket_read_syn. Once the master has set the flag to 1, the slave leaves the wait queue and copies the packet field by field from the temporary storage area into its own data space. Finally it clears ft_socket_read_syn to 0, calls wake_up() on socket_wq to wake a master process that may be waiting, and ends the data-receive synchronization.
Referring to Fig. 5, when the master process reaches the data-send synchronization point, it first waits in socket_wq and repeatedly checks the flag; once the slave has set ft_socket_write_syn to 1, the master is woken. It then compares the packet it intends to send, field by field, with the data in the kernel temporary storage area. If the comparison passes, it calls ops->sendmsg() to send the data; if not, it enters error handling. Finally it clears ft_socket_write_syn to 0, performs wake_up() on socket_wq to wake a slave process that may be waiting, and ends the data-send synchronization.
Referring to Fig. 6, when the slave process reaches the data-send synchronization point, it first allocates the kernel temporary storage area, copies the packet it intends to send field by field into that area, and sets ft_socket_write_syn to 1. It then performs wake_up() on socket_wq to wake a master process that may be waiting, enters the wait queue itself, and repeatedly checks ft_socket_write_syn. Once the master has cleared the flag to 0, the slave leaves the wait queue, releases the temporary storage area, and ends the data-send synchronization.

Claims (2)

1. A socket-based network synchronization method for a fault-tolerant computer system, characterized in that: a monitor is inserted at the socket interface layer of the Linux network architecture; a process that communicates over the network through sockets calls the socket interface functions, which comprise the socket creation function socket(), the port binding function bind(), the port listening function listen(), the connection request function connect(), the connection accepting function accept(), the connection closing function close(), the packet sending function send() or write(), and the packet receiving function recv() or read(); the interface functions enter the operating system kernel through system calls and execute the corresponding kernel functions to perform the core operations, the kernel functions comprising sys_socket(), sys_bind(), sys_listen(), sys_accept(), sys_connect(), _sock_sendmsg(), _sock_recvmsg() and sys_close(); monitoring is placed on these kernel functions, and the synchronization logic is triggered whenever a kernel function is called by a dual-redundant process;
data structures and synchronization fields are added to the kernel: a wait queue socket_wq is defined, in which a task that has reached a synchronization point but has not yet completed synchronization sleeps; a data structure struct ft_socket_msghdr is defined to store the msghdr packet that the task at a synchronization point needs to compare; the fields ft_socket_bind_syn, ft_socket_listen_syn, ft_socket_connect_syn, ft_socket_read_syn, ft_socket_write_syn and ft_socket_close_syn are added to the socket structure as synchronization flags; pointers of type ft_socket_msghdr, *ft_msg_recv and *ft_msg_send, are added as the addresses of the kernel temporary storage areas; ft_socket_ret is added to store the return value of a function operation; in the process control block, ft_socket_executor is added to identify the master and slave processes, and *ft_create_socket and *ft_accept_socket are added as pointers to the shared socket structure at the synchronization points;
Its concrete steps are as follows:
1) inserts synchronous logic to creating socket structurally associated kernel function
Socket communicating requirement communicating pair process creation communication socket data structure is a struct socket structure, and with the File Open mode and the process context of socket type inode node, kernel function sys_socket () is used to finish this task, sys_socket () function behind the insertion synchronous logic makes the duplication redundancy process only set up single socket structure, and shared this structure is carried out network service, when monitor monitors arrives duplication redundancy process transfer sys_socket () function, carrying out synchronous logic carries out synchronously, the process P1 that arrives finishes socket structure constructive process earlier, its address is preserved temporarily, and enter waiting list sleep wait; The process P2 that the back arrives obtains the socket structure according to the address that the process P1 that arrives earlier preserves, and wake P1 up, finish synchronous logic after, P1, P2 sets up file object, directory entry object and the socket structure connection of oneself respectively, has also promptly finished the establishment of the shared socket structure of bimodulus process;
2) to setting up and closing join dependency kernel function insertion synchronous logic
Process is before the transceive data bag carries out network service, need finish the auxiliary operation that establishes a communications link, comprise and call bind () BOUND communication port, listen () listening port, accept () receives connection request, connect () sends connection request, listen () wherein, and accept () only is used for TCP with connect () and communicates by letter, in the synchronous logic of this class function, the bimodulus process area is divided into master and slave process, and host process is carried out core operation, and it is synchronous only to finish form from process;
After network communication ends, the process calls close() to close the connection. In the synchronization logic of the close function, the master process executes sock_close() to release the socket; the slave process only waits for synchronization and performs no core operation;
3) Insert synchronization logic into the kernel functions that send and receive packets
A communicating process calls send() or write() to send data; both correspond to the kernel function _sock_sendmsg(). For sending, when the slave process reaches the synchronization point, it allocates temporary space in the kernel heap and stores the packet to be sent there. When the master process arrives, it compares its own packet with the packet stored by the slave process: if they are identical, the data is sent; if they differ, error handling is triggered. For receiving, when the master process reaches the synchronization point, it calls the receive function to receive the packet and copies the packet into space allocated in the kernel heap; when the slave process arrives, it copies the packet from the kernel heap into its own address space.
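The rendezvous described in step 1) can be modelled in user space. The sketch below is not kernel code: Python threads stand in for the two redundant processes and a condition variable stands in for the wait queue. The names SocketRendezvous, run_pair, and creations are hypothetical and were introduced only for this illustration.

```python
import threading


class SocketRendezvous:
    """User-space model of the sys_socket() synchronization point in step 1)."""

    def __init__(self):
        self._cond = threading.Condition()  # stands in for the kernel wait queue
        self._shared = None                 # stands in for the saved socket address
        self._partner_done = False
        self.creations = 0                  # how many times the "socket" was built

    def sys_socket(self, make_socket):
        with self._cond:
            if self._shared is None:
                # First arrival (P1): create, save the address, then sleep.
                self._shared = make_socket()
                self.creations += 1
                while not self._partner_done:
                    self._cond.wait()
            else:
                # Second arrival (P2): reuse P1's structure and wake P1.
                self._partner_done = True
                self._cond.notify_all()
            return self._shared


def run_pair():
    """Run two threads through the sync point and collect what each obtained."""
    rendezvous = SocketRendezvous()
    results = []

    def worker():
        results.append(rendezvous.sys_socket(lambda: object()))

    threads = [threading.Thread(target=worker) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return rendezvous, results
```

Whichever thread arrives first plays the role of P1, so only one structure is ever created and both callers end up holding the same object.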
2. The network synchronization method for a socket-based fault-tolerant computer system according to claim 1, characterized in that: the monitor determines the process type and, if the process executing a kernel function is a redundant process, executes the kernel function into which synchronization logic has been inserted. The functions sys_socket(), sys_bind(), sys_listen(), sys_accept(), sys_connect(), _sock_sendmsg(), _sock_recvmsg(), and sys_close() are modified in turn to insert synchronization logic;
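The monitor's decision can be pictured as a small dispatch step. The following is a minimal sketch under stated assumptions: the Task class, its is_redundant marker, the HOOKS table, and the two stand-in bind functions are all hypothetical; the claim only says that redundant processes are routed to the modified kernel function.

```python
from dataclasses import dataclass


@dataclass
class Task:
    pid: int
    is_redundant: bool = False  # hypothetical per-process marker


def original_bind(task, port):
    # Stand-in for the unmodified kernel path.
    return ("plain", task.pid, port)


def ft_bind(task, port):
    # Stand-in for the path with synchronization logic inserted.
    return ("synchronized", task.pid, port)


# Function name -> (original version, version with synchronization logic).
HOOKS = {"sys_bind": (original_bind, ft_bind)}


def monitor(task, name, *args):
    """Route redundant processes to the synchronized variant, others to the original."""
    original, synchronized = HOOKS[name]
    handler = synchronized if task.is_redundant else original
    return handler(task, *args)
```

Ordinary processes are unaffected in this model, which mirrors the transparency the method aims for.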
Modify sys_socket(). sys_socket() creates the socket structure and associates it with the process; the creation part is performed by sock_create() and the association part by sock_map_fd(). When the dual-modular redundant task processes call sys_socket() to create a socket, they are synchronized so that they share one socket structure. The process type is determined from ft_socket_executor: a process with the value 1 is the master process, and a process with the value 0 is the slave process. If the master process reaches the synchronization point first, it calls sock_create() to complete creation, stores the address of the created socket structure in ft_create_socket, and then waits in the wait queue socket_wq; the slave process, arriving later, reads ft_create_socket and wakes the waiting master process, completing synchronization. If the slave process reaches the synchronization point first, it sleeps in the wait queue socket_wq; the master process, arriving later, completes creation and then wakes the slave process, completing synchronization. Within sock_create(), sock_alloc() is ultimately called to allocate the socket structure and its corresponding inode, and the inode's reference count i_count is set to 2 so that the dual-modular processes share the socket structure and the inode. When the socket structure is created, all synchronization flags, such as ft_socket_bind_syn and ft_socket_listen_syn, must be initialized to 0. Afterwards, each redundant process calls sock_map_fd() to create its own file object and dentry object and associate them with the inode; the file pointer in the socket structure points to the file object of the master process;
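Both arrival orders described for sys_socket() can be exercised in a user-space model. This is a sketch, not the kernel implementation: threads replace processes, a dictionary replaces the socket structure and its inode, and appending to a list replaces sock_map_fd(). The names FtSocketCreate, slave_arrived, and create_shared are hypothetical.

```python
import threading


class FtSocketCreate:
    """Models the ft_create_socket handoff; threads stand in for processes."""

    def __init__(self):
        self.socket_wq = threading.Condition()  # stands in for the wait queue
        self.ft_create_socket = None            # address slot named in the claim
        self.slave_arrived = False              # hypothetical bookkeeping flag

    def sys_socket(self, ft_socket_executor):
        with self.socket_wq:
            if ft_socket_executor == 1:          # master process
                # Stand-in for sock_create()/sock_alloc(): one structure,
                # reference count 2, synchronization flags cleared.
                self.ft_create_socket = {
                    "i_count": 2,
                    "ft_socket_bind_syn": 0,
                    "ft_socket_listen_syn": 0,
                    "files": [],
                }
                self.socket_wq.notify_all()      # wake a slave that came first
                while not self.slave_arrived:    # sleep if the master came first
                    self.socket_wq.wait()
            else:                                # slave process
                self.slave_arrived = True
                self.socket_wq.notify_all()      # wake a master that came first
                while self.ft_create_socket is None:
                    self.socket_wq.wait()
            sock = self.ft_create_socket
            sock["files"].append(ft_socket_executor)  # stand-in for sock_map_fd()
            return sock


def create_shared(start_order):
    """Start master (1) and slave (0) threads in the given order."""
    sync = FtSocketCreate()
    results = {}

    def run(role):
        results[role] = sync.sys_socket(role)

    threads = [threading.Thread(target=run, args=(r,)) for r in start_order]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

In this model only the master ever builds the structure, regardless of which thread reaches the synchronization point first.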
Modify sys_bind(). Master and slave processes are distinguished by ft_socket_executor and perform different operations. The master process executes ops->bind() to bind the port, stores the return value in ft_socket_ret, sets the synchronization flag ft_socket_bind_syn to 1, and wakes the slave process in the socket_wq wait queue. The slave process, on reaching the synchronization point, checks ft_socket_bind_syn: if it is 1, the slave process passes the synchronization point, clears ft_socket_bind_syn to 0, and reads the return value from ft_socket_ret; if it is 0, the slave process sleeps in socket_wq;
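The flag-and-return-value handoff used by sys_bind() (and, with a different flag, by sys_listen(), sys_connect(), and sys_close()) can be sketched as follows. This is a user-space model with threads in place of processes; the class FtSyncPoint, the do_bind callback, and the demo helper are hypothetical names.

```python
import threading


class FtSyncPoint:
    """One flag, one return-value slot, and one wait queue, as in sys_bind()."""

    def __init__(self):
        self.socket_wq = threading.Condition()  # stands in for socket_wq
        self.ft_socket_bind_syn = 0
        self.ft_socket_ret = None

    def sys_bind(self, ft_socket_executor, do_bind):
        with self.socket_wq:
            if ft_socket_executor == 1:
                # Master: perform the real operation exactly once.
                self.ft_socket_ret = do_bind()
                self.ft_socket_bind_syn = 1
                self.socket_wq.notify_all()
            else:
                # Slave: sleep until the flag is set, then clear it.
                while self.ft_socket_bind_syn == 0:
                    self.socket_wq.wait()
                self.ft_socket_bind_syn = 0
            return self.ft_socket_ret


def demo(start_master_first=True):
    point = FtSyncPoint()
    calls, results = [], {}

    def do_bind():
        calls.append("bind")  # record each real bind attempt
        return 0              # 0 models a successful bind

    def run(role):
        results[role] = point.sys_bind(role, do_bind)

    order = (1, 0) if start_master_first else (0, 1)
    threads = [threading.Thread(target=run, args=(r,)) for r in order]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return calls, results, point.ft_socket_bind_syn
```

Both callers observe the same return value even though the port is bound only once, which is what keeps the redundant pair indistinguishable from a single process on the network.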
Modify sys_listen(). Master and slave processes are distinguished by ft_socket_executor and perform different operations. The master process executes ops->listen() to listen on the port, stores the return value in ft_socket_ret, sets the synchronization flag ft_socket_listen_syn to 1, and wakes the slave process in the socket_wq wait queue. The slave process, on reaching the synchronization point, checks ft_socket_listen_syn: if it is 1, the slave process passes the synchronization point, clears ft_socket_listen_syn to 0, and reads the return value from ft_socket_ret; if it is 0, the slave process sleeps in socket_wq;
Modify sys_accept(). sys_accept() is called by the listening process to receive a connection request and create a new communication socket. Master and slave processes are distinguished by ft_socket_executor. The master process calls sock_alloc() to allocate a new socket structure, turns the newly created socket into a communication socket through ops->accept(), saves its address in ft_accept_socket, and wakes the slave process in the wait queue socket_wq. The slave process determines the arrival order by checking whether ft_accept_socket is empty: if it reached the synchronization point first, it waits in the wait queue; if it reached the synchronization point later, it reads the newly created socket through ft_accept_socket and associates itself with it;
Modify sys_connect(). Master and slave processes are distinguished by ft_socket_executor and perform different operations. The master process executes ops->connect() to request a connection, stores the return value in ft_socket_ret, sets the synchronization flag ft_socket_connect_syn to 1, and wakes the slave process in the socket_wq wait queue. The slave process, on reaching the synchronization point, checks ft_socket_connect_syn: if it is 1, the slave process passes the synchronization point, clears ft_socket_connect_syn to 0, and reads the return value from ft_socket_ret; if it is 0, the slave process sleeps in socket_wq;
Modify send() and write(). Both functions can be used to send data, and both ultimately call _sock_sendmsg() in the kernel, so modifying _sock_sendmsg() captures both call paths. Space is allocated in the kernel heap for ft_msg_send to hold the packet for comparison. Master and slave processes are distinguished by ft_socket_executor. The slave process first prepares the packet to be sent, struct msghdr msg, copies msg into the memory pointed to by ft_msg_send, sets ft_socket_write_syn to 1, wakes the wait queue, and then sleeps in the wait queue itself. When the master process reaches the synchronization point, it waits in the wait queue if ft_socket_write_syn is 0; if ft_socket_write_syn is 1, it compares its own packet with the data in the memory pointed to by ft_msg_send, finally calls ops->sendmsg() to send the data, clears ft_socket_write_syn to 0, and wakes the wait queue. The data to be compared comprises the entire content of the msghdr structure: msg_name, msg_namelen, msg_iov, msg_iovlen, msg_control, msg_controllen, and msg_flags;
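The compare-before-send logic can be modelled in user space as below. This is a sketch under several assumptions: a frozen dataclass replaces struct msghdr (the length fields are implied by the Python values), the mismatch branch simply records an outcome because the claims do not specify the error handling, and returning the master's outcome to the slave is an assumption. The names MsgHdr, FtSend, outcome, and send_pair are hypothetical.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class MsgHdr:
    msg_name: bytes
    msg_iov: tuple            # tuple of bytes buffers
    msg_control: bytes = b""
    msg_flags: int = 0


class FtSend:
    def __init__(self, transmit):
        self.wq = threading.Condition()  # stands in for the wait queue
        self.ft_socket_write_syn = 0
        self.ft_msg_send = None          # stands in for the kernel-heap buffer
        self.outcome = None              # hypothetical result slot
        self.transmit = transmit         # stands in for ops->sendmsg()

    def sendmsg(self, ft_socket_executor, msg):
        with self.wq:
            if ft_socket_executor == 0:
                # Slave: store its packet, raise the flag, sleep until cleared.
                self.ft_msg_send = msg
                self.ft_socket_write_syn = 1
                self.wq.notify_all()
                while self.ft_socket_write_syn == 1:
                    self.wq.wait()
            else:
                # Master: wait for the slave's packet, compare, then send once.
                while self.ft_socket_write_syn == 0:
                    self.wq.wait()
                if msg == self.ft_msg_send:      # field-by-field comparison
                    self.transmit(msg)
                    self.outcome = "sent"
                else:
                    self.outcome = "mismatch"    # placeholder for error handling
                self.ft_socket_write_syn = 0
                self.wq.notify_all()
            return self.outcome


def send_pair(master_msg, slave_msg):
    wire, outcomes = [], {}
    sender = FtSend(wire.append)

    def run(role, msg):
        outcomes[role] = sender.sendmsg(role, msg)

    threads = [threading.Thread(target=run, args=(1, master_msg)),
               threading.Thread(target=run, args=(0, slave_msg))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return wire, outcomes
```

Calling send_pair with two equal messages puts exactly one copy on the modelled wire; with differing messages nothing is transmitted.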
Modify recv() and read(). Both functions are used to receive data, and both ultimately call _sock_recvmsg() in the kernel, so modifying _sock_recvmsg() captures both call paths. Space is allocated in the kernel heap for ft_msg_recv to hold the copied packet. Master and slave processes are distinguished by ft_socket_executor. The master process first calls ops->recvmsg() to receive the data into struct msghdr msg, copies the content of msg item by item into the memory pointed to by ft_msg_recv, sets the synchronization flag ft_socket_read_syn to 1, and waits in the wait queue. When the slave process arrives, it copies the packet from the memory pointed to by ft_msg_recv into its own struct msghdr msg, including msg_name, msg_namelen, msg_iov, msg_iovlen, msg_control, msg_controllen, and msg_flags, and finally clears ft_socket_read_syn to 0 and wakes the master process in the wait queue;
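The receive path mirrors the send path: one real read, then a copy for the slave. The sketch below models it with threads and a dictionary in place of struct msghdr. It assumes that a slave arriving before the master simply waits for the flag, which the claim does not spell out; the names FtRecv and recv_pair are hypothetical.

```python
import copy
import threading


class FtRecv:
    def __init__(self, receive):
        self.wq = threading.Condition()  # stands in for the wait queue
        self.ft_socket_read_syn = 0
        self.ft_msg_recv = None          # stands in for the kernel-heap buffer
        self.receive = receive           # stands in for ops->recvmsg()

    def recvmsg(self, ft_socket_executor):
        with self.wq:
            if ft_socket_executor == 1:
                # Master: the only real read; publish a copy and wait.
                msg = self.receive()
                self.ft_msg_recv = copy.deepcopy(msg)
                self.ft_socket_read_syn = 1
                self.wq.notify_all()
                while self.ft_socket_read_syn == 1:
                    self.wq.wait()
            else:
                # Slave: assumed to wait if it arrives before the master.
                while self.ft_socket_read_syn == 0:
                    self.wq.wait()
                msg = copy.deepcopy(self.ft_msg_recv)
                self.ft_socket_read_syn = 0
                self.wq.notify_all()     # wake the waiting master
            return msg


def recv_pair(packets):
    reads, results = [], {}

    def receive():
        reads.append(1)                  # count real reads from the network
        return packets.pop(0)

    receiver = FtRecv(receive)

    def run(role):
        results[role] = receiver.recvmsg(role)

    threads = [threading.Thread(target=run, args=(r,)) for r in (0, 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, len(reads)
```

Each thread ends up with its own copy of the same packet, while the modelled network is read only once.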
Modify sys_close(). Master and slave processes are distinguished by ft_socket_executor and perform different operations. The master process executes sock_close() to release the socket, sets the synchronization flag ft_socket_close_syn to 1, and wakes the slave process in the socket_wq wait queue. The slave process, on reaching the synchronization point, checks ft_socket_close_syn: if it is 1, the slave process passes the synchronization point and clears ft_socket_close_syn to 0; if it is 0, the slave process sleeps in socket_wq.
CN2008102318954A 2008-10-27 2008-10-27 Grid synchronization method for fault tolerant computer system based on socket Expired - Fee Related CN101383690B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2008102318954A CN101383690B (en) 2008-10-27 2008-10-27 Grid synchronization method for fault tolerant computer system based on socket

Publications (2)

Publication Number Publication Date
CN101383690A CN101383690A (en) 2009-03-11
CN101383690B true CN101383690B (en) 2011-06-01

Family

ID=40463335

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2008102318954A Expired - Fee Related CN101383690B (en) 2008-10-27 2008-10-27 Grid synchronization method for fault tolerant computer system based on socket

Country Status (1)

Country Link
CN (1) CN101383690B (en)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101794242B (en) * 2010-01-29 2012-07-18 西安交通大学 Fault-tolerant computer system data comparing method serving operating system core layer
CN102109996A (en) * 2010-12-29 2011-06-29 浙大网新科技股份有限公司 Method for enabling Linux kernel to support Socket reuse
CN102073531B (en) * 2010-12-29 2013-07-31 浙大网新科技股份有限公司 Method for realizing AcceptEx mechanism of WinSock2 in Linux kernel
US9571952B2 (en) * 2011-04-22 2017-02-14 Qualcomm Incorporated Offloading of data to wireless local area network
CN103023602A (en) * 2011-09-20 2013-04-03 镇江雅迅软件有限责任公司 Data transmission fault tolerance system based on Socket protocol
KR101991687B1 (en) * 2012-11-23 2019-06-24 삼성전자 주식회사 Dynamic library profiling method, computer readable recording medium storing thereof and dynamic library profiling system
CN103051634B (en) * 2012-12-26 2014-08-27 中国电力科学研究院 High-accuracy multipoint synchronous acquisition method for extra-high voltage direct-current corona current
CN103473133A (en) * 2013-09-25 2013-12-25 浪潮电子信息产业股份有限公司 High availability system-oriented redundant process synchronization method
CN104065732A (en) * 2014-06-30 2014-09-24 江苏华大天益电力科技有限公司 Telnet proxy forwarding mechanism
CN104536851A (en) * 2014-12-11 2015-04-22 中国航空工业集团公司第六三一研究所 System and method for synchronizing redundant fault-tolerant computer system
CN105940658B (en) * 2015-01-04 2019-04-26 华为技术有限公司 A kind of transmission method of user data, device and terminal
CN106020960B (en) * 2016-05-30 2020-02-18 北京奇艺世纪科技有限公司 Calling method and device
CN106487945B (en) * 2016-09-09 2019-05-17 中国互联网络信息中心 A kind of DNS forwarding inquiries method
CN107704274B (en) * 2017-08-24 2019-11-05 武汉斗鱼网络科技有限公司 Method, storage medium, equipment and the system waken up for Android application host process
CN110392077A (en) * 2018-04-20 2019-10-29 杭州游络科技有限公司 A kind of game server engine and exchange method that client can be interacted with server-side
CN110620778B (en) * 2019-09-25 2021-11-30 北京简约纳电子有限公司 Method for simultaneously supporting synchronous and asynchronous communication modes of socket
CN114564158B (en) * 2022-04-24 2022-07-15 北京圣博润高新技术股份有限公司 Method, device, equipment and medium for controlling document printing under Linux system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1495611A (en) * 2002-07-12 2004-05-12 日本电气株式会社 Fault-tolerant computer system and its resynchronization method and program
CN101093453A (en) * 2007-07-25 2007-12-26 湖南大学 Method for implementing checkpoint of Linux program at user level based on virtual kernel object

Also Published As

Publication number Publication date
CN101383690A (en) 2009-03-11

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20110601

Termination date: 20141027

EXPY Termination of patent right or utility model