CN101794242B - Fault-tolerant computer system data comparing method serving operating system core layer - Google Patents
- Publication number: CN101794242B
- Application number: CN201010103349XA
- Authority: CN (China)
- Prior art keywords: data, syner, list, message, kernel
- Legal status: Expired - Fee Related (the legal status is an assumption and is not a legal conclusion)
- Landscapes: Hardware Redundancy (AREA)
Abstract
The invention relates to a data comparison method for a fault-tolerant computer system that operates in the operating system kernel layer. A kernel daemon started in the Linux operating system executes the data comparator logic and provides a data comparison service to the dual modular redundant processes of the fault-tolerant computer system. An event linked list added to the kernel serves as the message channel, and the redundant processes and the data comparator work in a producer-consumer manner: a redundant process encapsulates the data to be written into a message packet and inserts it into the message list; the comparator removes the packet from the list, parses it according to the defined format, compares the data to be written by the redundant processes, and finally returns the result to the redundant process. The invention is implemented in the operating system kernel layer, requires no custom hardware, is simple to implement and general in application, and suits process-level dual modular redundant fault-tolerant systems built on commodity hardware. All comparison logic completes automatically in the kernel layer without participation of the application program, and is therefore transparent to applications.
Description
Technical field
The invention belongs to the field of computers and relates to fault-tolerance technology and data comparison technology, in particular to a data comparison method for a fault-tolerant computer system that operates in the operating system kernel layer.
Background art
With the rapid development of computer and Internet technology, informatization has reached every aspect of society. Computer technology has greatly changed how people live by improving work efficiency and promoting the exchange of information, but it has also made people increasingly dependent on it, and the failure of a single computer system may cause immeasurable loss. For organizations that must guarantee information security and provide uninterrupted information services, for example in securities, manufacturing, communications, banking, and transportation, the reliability and continuity of business systems are particularly important. How to improve the reliability and availability of computer systems, so that critical applications keep running and operations remain sustainable, has become a major issue in the information field. Fault-tolerant computers and related technologies arose in response to this demand; fault-tolerant computing is used to avoid the heavy economic losses caused by server failures.
A fault-tolerant computer is a highly reliable, highly available computer built on redundant resources (hardware redundancy, time redundancy, information redundancy, software redundancy) through a well-designed architecture under the effective management of system software. Fault detection is one of the key technologies for realizing a fault-tolerant computer system, and comparison of and voting on task data are the main means of detecting errors.
Data comparison and voting are done in two main ways: in hardware and in software. The hardware-based method adds a comparison chip to the system; the chip contains the comparison or voting logic and compares or votes on all data about to be written out. This approach detects errors promptly, but the design is complex and the implementation cost is high. The software-based method places comparison and voting points in library functions or application programs and checks the intermediate results and final output of a task for consistency. This approach keeps the system design simple, but it is poorly transparent to applications and places an extra burden on programmers and users.
Summary of the invention
The object of the invention is to overcome the shortcomings of the prior art described above by providing a data comparison method for a fault-tolerant computer system that operates in the operating system kernel layer. The method performs consistency comparison on the state and data results of redundant tasks in a fault-tolerant system and at the same time records the synchronization and comparison information of the redundant tasks.
To achieve this object, the invention adopts the following technical solution. A kernel-mode daemon, ft_syner, is created in the Linux operating system kernel to execute the comparator logic and provide a data comparison service to redundant processes. When the redundant processes perform a write operation, each prepares its data to be written; the master process then encapsulates the data to be written into a message packet, adds the packet to the communication channel between the redundant processes and the data comparator, and actively wakes the data comparator ft_syner to perform the comparison. After ft_syner completes the comparison, the redundant process obtains the result by checking the comparison result field in the message packet.
The communication channel is implemented as follows: an event linked list, ft_syner_event_list, is created in the Linux operating system kernel, and the redundant processes and the data comparator communicate through this list.
The data comparator and the redundant processes work in a producer-consumer manner. A redundant process arranges the data to be compared into a message packet according to the protocol format and hooks the packet into the event linked list ft_syner_event_list; the comparator removes the message packet from ft_syner_event_list, extracts the data from it, and performs the comparison.
The format of the message packet is:
typedef struct {
    struct list_head list;
    short ft_msg_type;
    struct task_struct *p1;
    struct task_struct *p2;
    void *master_data;
    long master_data_len;
    void *slave_data;
    long slave_data_len;
    short error;
} ft_syner_event_msg;
Here, list is the list head used to hook the message packet into the event linked list. It uses list_head, the generic linked-list structure of the Linux kernel, and message packets are inserted and deleted with the kernel's list_add() and list_del();
ft_msg_type is the message type and indicates which system call the message packet originates from. Its values are defined as follows:
#define FT_WRITE   1  // write() system call
#define FT_WRITEV  2  // writev() system call
#define FT_SEND    3  // send() system call
#define FT_SENDTO  4  // sendto() system call
#define FT_SENDMSG 5  // sendmsg() system call
The comparator determines from the value of ft_msg_type which system call produced the message packet; the definitions above can be extended or reduced as required;
p1 and p2 are pointers to the process control blocks of the redundant process pair that generated the message packet, where p1 is the master process and p2 is the slave process; the comparator obtains the process control blocks of the pair through these two pointers;
master_data is the pointer to the data buffer of the master process of the pair, and master_data_len is the length of that buffer;
slave_data is the pointer to the data buffer of the slave process of the pair, and slave_data_len is the length of that buffer;
error records the comparison result: a value of 1 means the data are consistent, and a value of 0 means the data are inconsistent.
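The comparison that fills the error field can be illustrated with a minimal user-space sketch. It is not the patented kernel code: the type ft_msg_sketch and the helper ft_compare_sketch() are hypothetical names, the kernel-only members (list_head and the task_struct pointers) are left out, and a byte-wise memcmp() is assumed as the consistency test, which the text does not specify.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FT_WRITE 1  /* write() system call, value taken from the text */

/* Hypothetical user-space stand-in for the message packet.
 * Kernel-only members (list_head, task_struct pointers) are omitted. */
typedef struct {
    short ft_msg_type;
    const void *master_data;
    long master_data_len;
    const void *slave_data;
    long slave_data_len;
    short error;  /* 1 = data consistent, 0 = data inconsistent */
} ft_msg_sketch;

/* Hypothetical helper: compare the two buffers and record the result
 * in msg->error using the 1/0 convention described in the text. */
static short ft_compare_sketch(ft_msg_sketch *msg)
{
    assert(msg != NULL);
    assert(msg->master_data_len >= 0 && msg->slave_data_len >= 0);

    /* Buffers of different length can never be consistent. */
    int same = (msg->master_data_len == msg->slave_data_len);

    /* Only call memcmp() when there are bytes to compare. */
    if (same && msg->master_data_len > 0) {
        same = memcmp(msg->master_data, msg->slave_data,
                      (size_t)msg->master_data_len) == 0;
    }

    msg->error = same ? 1 : 0;
    return msg->error;
}
```

Checking the lengths first keeps memcmp() from reading past the shorter buffer when the master and slave outputs differ in size.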
The data comparator is a kernel-mode daemon, ft_syner, created in the Linux operating system kernel. It resides in the kernel, executes the comparator logic, and provides the data comparison service to redundant processes. When idle, ft_syner is in a waiting state; it is woken by a redundant process when there is work, or wakes itself periodically. Each time it is woken, ft_syner traverses the event linked list ft_syner_event_list, removes and parses each message packet in the list, and compares the data. After parsing all message packets currently in the event linked list, ft_syner calls sleep_on_timeout() to enter the waiting state. The wait period passed to this function is 5 seconds; when it expires, ft_syner is woken and begins the next round of traversal.
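The two wake-up paths can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions: sleep_on_timeout() in kernels of that period takes its timeout in timer ticks (jiffies), so a 5 second period corresponds to 5 * HZ ticks; the enum, the function names, and the hz and wake_requested parameters are hypothetical and stand in for kernel state that the text does not show.

```c
#include <assert.h>

#define FT_SYNER_PERIOD_SEC 5  /* wait period stated in the text */

/* Hypothetical labels for why the daemon leaves the waiting state. */
enum wake_reason { STILL_ASLEEP, WOKEN_BY_PROCESS, WOKEN_BY_TIMEOUT };

/* Convert seconds to timer ticks; hz is the tick rate (ticks per second). */
static long period_in_ticks(long seconds, long hz)
{
    assert(seconds >= 0 && hz > 0);
    return seconds * hz;
}

/* Decide whether the daemon should run at tick `now`, given the tick at
 * which it went to sleep and whether a redundant process asked for a wake-up.
 * An explicit wake-up request takes priority over the periodic timeout. */
static enum wake_reason check_wake(long now, long slept_at, long hz,
                                   int wake_requested)
{
    if (wake_requested)
        return WOKEN_BY_PROCESS;
    if (now - slept_at >= period_in_ticks(FT_SYNER_PERIOD_SEC, hz))
        return WOKEN_BY_TIMEOUT;
    return STILL_ASLEEP;
}
```

The periodic timeout acts as a safety net: even if a wake-up request is missed, a queued packet waits at most one period before it is compared.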
The redundant processes require an added master/slave attribute. When the redundant processes perform a write operation, each prepares its data to be written; the master process then encapsulates the data to be written into a message packet, adds the packet to the event linked list ft_syner_event_list through list_add(&(msg->list), &ft_syner_event_list), and actively wakes the kernel-mode daemon ft_syner. After ft_syner completes the comparison, the redundant process obtains the result by checking msg->error.
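The producer side can be sketched in user space with a small re-implementation of the kernel's circular doubly linked list. The list_add() below mirrors the kernel semantics (insert directly after the head); everything else, including ft_msg_sketch, ft_submit_sketch(), and the ft_syner_wake_requested flag that stands in for the actual wake-up call, is a hypothetical illustration rather than the patent's code.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal user-space stand-in for the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }

static int list_empty(const struct list_head *h) { return h->next == h; }

/* Same behaviour as the kernel's list_add(): insert right after the head. */
static void list_add(struct list_head *new_node, struct list_head *head)
{
    new_node->next = head->next;
    new_node->prev = head;
    head->next->prev = new_node;
    head->next = new_node;
}

/* Reduced message packet; only the members needed here are kept. */
typedef struct {
    struct list_head list;
    short ft_msg_type;
    short error;
} ft_msg_sketch;

static struct list_head ft_syner_event_list;
static int ft_syner_wake_requested;  /* hypothetical stand-in for the wake-up */

/* Hypothetical producer-side helper: queue the packet, then ask for a wake-up. */
static void ft_submit_sketch(ft_msg_sketch *msg)
{
    assert(msg != NULL);
    list_add(&msg->list, &ft_syner_event_list);
    ft_syner_wake_requested = 1;
}
```

Because list_add() inserts at the head, the newest packet is the first one the comparator sees. If first-in, first-out handling were wanted, the kernel's list_add_tail() would be the natural alternative; the text itself only names list_add().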
The data comparator of the invention provides the data comparison service as a kernel daemon. When a redundant process calls a write-related operation such as write(), writev(), send(), sendto(), or sendmsg(), the comparator's service is used to compare the data to be written during the operation. That is, a kernel daemon ft_syner is started in the Linux operating system; this process executes the data comparator logic and provides the data comparison service to the dual modular redundant processes of the fault-tolerant computer system. The event linked list ft_syner_event_list added to the kernel serves as the message channel, and the redundant processes and the data comparator work in a producer-consumer manner: a redundant process encapsulates the data to be written into a message packet and inserts it into the message list; the comparator removes and parses the packet, completes the comparison, and finally returns the result to the redundant process. The invention thus completes, in software and within the operating system kernel layer, a concise and reliable comparison of the data of redundant processes in a dual modular fault-tolerant system. Because the method is implemented in the kernel layer, it requires no custom hardware and suits process-level dual modular redundant fault-tolerant systems based on commodity hardware architectures. All logic completes automatically in the kernel layer without participation of the application program, so the method is transparent to applications.
Description of the drawings
Fig. 1 is a workflow diagram of the data comparator of the invention;
Fig. 2 is a conceptual diagram of the interaction between the redundant processes and the data comparator of the invention.
Embodiment
The invention is described in further detail below with reference to the accompanying drawings.
The workflow of the data comparator shown in Fig. 1 is:
(1) the comparator process ft_syner calls spin_lock() to acquire the spin lock of the message list;
(2) it checks whether the message list ft_syner_event_list is empty; if it is empty, go to (6), otherwise go to (3);
(3) it takes one message packet msg from the list, parses the packet, and completes the data comparison;
(4) it uses the kernel's list_del() to delete the processed message packet from the message list;
(5) if unprocessed message packets remain in the message list, go to (3); otherwise go to (6);
(6) it calls spin_unlock() to release the spin lock of the message list;
(7) the comparator process ft_syner calls sleep_on_timeout() and sleeps;
(8) ft_syner is woken passively when the wait period expires, or actively by a redundant process;
(9) it checks the flag finish: if finish is 1, a request to end the comparator service has been received and ft_syner terminates; if finish is 0, go to (1) to begin the next round of service.
Fig. 2 shows the interaction between the redundant processes and the data comparator, using the write() operation as an example. When the redundant processes P1 and P2 execute write(), their data must be compared: the redundant process encapsulates the data to be compared in a message packet msg, inserts the packet into the message list, and wakes the data comparator. The comparator removes the message packet from the message list, parses it according to the defined format, performs a consistency comparison on the two copies of data in the packet, and stores the result in msg->error. Finally, the redundant process obtains the comparison result by checking the value of msg->error.
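Steps (2) to (5) of one service round can be sketched in single-threaded user-space C. This is an illustration under assumptions, not the kernel implementation: the list helpers re-implement the kernel's list_add() and list_del() semantics, ft_msg_sketch and ft_syner_round_sketch() are hypothetical names, a byte-wise memcmp() is assumed as the comparison, and the locking, sleeping, and finish-flag steps (1), (6), (7) to (9) appear only as comments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Minimal user-space stand-in for the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add(struct list_head *n, struct list_head *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Unlink an entry; the kernel poisons the pointers, this sketch uses NULL. */
static void list_del(struct list_head *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
    e->next = NULL;
    e->prev = NULL;
}

typedef struct {
    struct list_head list;  /* must stay the first member for the cast below */
    const void *master_data;
    long master_data_len;
    const void *slave_data;
    long slave_data_len;
    short error;            /* 1 = consistent, 0 = inconsistent */
} ft_msg_sketch;

/* One service round; returns the number of packets handled. */
static int ft_syner_round_sketch(struct list_head *events)
{
    int handled = 0;
    assert(events != NULL);
    /* (1) the kernel code would take the list spin lock here */
    while (events->next != events) {                        /* (2), (5) */
        ft_msg_sketch *msg = (ft_msg_sketch *)events->next; /* (3) take one packet */
        int same = (msg->master_data_len == msg->slave_data_len);
        if (same && msg->master_data_len > 0)
            same = memcmp(msg->master_data, msg->slave_data,
                          (size_t)msg->master_data_len) == 0;
        msg->error = same ? 1 : 0;
        list_del(&msg->list);                               /* (4) remove it */
        handled++;
    }
    /* (6) the spin lock would be released here; (7)-(9) are not modelled */
    return handled;
}
```

The cast from the list node to the packet works only because list is the first member of the struct; kernel code would normally use the list_entry() macro, which does not depend on member order.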
Claims (1)
1. A data comparison method for a fault-tolerant computer system operating in the operating system kernel layer, characterized in that: first, a kernel-mode daemon ft_syner is created in the Linux operating system kernel, whose role is to execute the comparator logic and provide a data comparison service to redundant processes; second, after the redundant processes have each prepared their data to be written during a write operation, the master process encapsulates the data to be written into a message packet, adds the packet to the communication channel between the redundant processes and the data comparator, and at the same time actively wakes the data comparator to perform the data comparison; finally, the data comparator completes the comparison, and the redundant process obtains the comparison result by checking the comparison result field in the message packet; the communication channel is implemented as follows: an event linked list ft_syner_event_list is created in the Linux operating system kernel, and the redundant processes and the data comparator communicate through the event linked list ft_syner_event_list; the data comparator and the redundant processes work in a producer-consumer manner: a redundant process arranges the data to be compared into a message packet according to the protocol format and hooks the packet into the event linked list ft_syner_event_list, and the comparator removes the message packet from ft_syner_event_list, extracts the data from it, and performs the comparison; the data comparator is a kernel-mode daemon ft_syner created in the Linux operating system kernel, which resides in the operating system kernel, executes the comparator logic, and provides the data comparison service to redundant processes; when idle, the kernel-mode daemon ft_syner is in a waiting state, and it is woken by a redundant process when there is work or wakes itself periodically; each time it is woken, ft_syner traverses the event linked list ft_syner_event_list, removes and parses each message packet in the list, and compares the data; after parsing all message packets currently in the event linked list, ft_syner calls the function sleep_on_timeout to enter the waiting state, the wait period set in this function being 5 seconds, and when the wait period expires ft_syner is woken and begins the next round of traversal; the redundant processes require an added master/slave attribute; when the redundant processes perform a write operation, each prepares its data to be written, the master process then encapsulates the data to be written into a message packet, adds the packet to the event linked list ft_syner_event_list through the function list_add, and actively wakes the kernel-mode daemon ft_syner; after ft_syner completes the data comparison, the redundant process obtains the comparison result by checking the value of the comparison result field of msg.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201010103349XA CN101794242B (en) | 2010-01-29 | 2010-01-29 | Fault-tolerant computer system data comparing method serving operating system core layer |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101794242A CN101794242A (en) | 2010-08-04 |
CN101794242B true CN101794242B (en) | 2012-07-18 |
Family
ID=42586952
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201010103349XA Expired - Fee Related CN101794242B (en) | 2010-01-29 | 2010-01-29 | Fault-tolerant computer system data comparing method serving operating system core layer |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101794242B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102323900B (en) * | 2011-08-31 | 2014-03-26 | 国家计算机网络与信息安全管理中心 | System fault tolerance mechanism based on dynamic sensing for many-core environment |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6088817A (en) * | 1993-11-26 | 2000-07-11 | Telefonaktiebolaget Lm Ericsson | Fault tolerant queue system |
CN101000561A (en) * | 2006-12-20 | 2007-07-18 | 中国电子科技集团公司第十四研究所 | Implementing method of multi-machine fault-tolerance system kermel |
CN101369241A (en) * | 2007-09-21 | 2009-02-18 | 中国科学院计算技术研究所 | Cluster fault-tolerance system, apparatus and method |
CN101383690A (en) * | 2008-10-27 | 2009-03-11 | 西安交通大学 | Grid synchronization method for fault tolerant computer system based on socket |
Also Published As
Publication number | Publication date |
---|---|
CN101794242A (en) | 2010-08-04 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20120718; Termination date: 20150129 |
EXPY | Termination of patent right or utility model | ||