CN106649039B - Fault-tolerance method for C-language monitoring software on an embedded Linux system

Info

Publication number
CN106649039B
Authority
CN
China
Prior art keywords
monitoring
module
signal
error
mistake
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611147014.1A
Other languages
Chinese (zh)
Other versions
CN106649039A (en)
Inventor
刘波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN201611147014.1A
Publication of CN106649039A
Application granted
Publication of CN106649039B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3051 Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/865 Monitoring of software

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention provides a fault-tolerance method for C-language monitoring software in an embedded Linux environment. By implementing in C an exception-handling structure similar to try/catch in C++, the monitoring software captures errors as soon as they occur and handles them, which prevents the program from crashing and gives the monitoring software stronger fault tolerance. When the monitoring software encounters a segmentation fault, a floating-point arithmetic error, or an abnormal exit (abort), the main loop of the monitoring program keeps executing, and the error information and the location of the error can be recorded during error handling to support later analysis. A system implementing these operations is also provided. Based on the fault-tolerance strategy and method for C-language monitoring software, the system handles faulty processes during monitoring with little performance loss, effectively improves the operational reliability of the system, and avoids the complexity of hardware customization.

Description

A fault-tolerance method for C-language monitoring software on an embedded Linux system
Technical field
The invention belongs to the technical field of server monitoring and management, and specifically relates to a fault-tolerance method for C-language monitoring software on an embedded Linux system.
Background art
Embedded Linux systems are now widely used in server monitoring. Monitoring software of this kind monitors and manages the running state of a server at all times, so it must run stably over long periods and, if it does crash, be running again within a short time.
At present, most monitoring software is developed in C. C, however, provides no exception-handling mechanism comparable to try/catch in C++: once the program encounters a problem such as a segmentation fault or a floating-point arithmetic error, it crashes and exits. To keep the program running, the current solution is to check the monitoring software periodically and to rerun it as soon as it is found to have terminated abnormally. On the one hand, this approach is cumbersome to operate, because every check has to be composed again; on the other hand, it can only guard against failures by checking and cannot resolve an abnormal condition at the moment it occurs.
It is therefore highly desirable to provide a fault-tolerance method for C-language monitoring software on an embedded Linux system.
Summary of the invention
The object of the invention is to solve the problems in the prior art described above, namely the lack of a try/catch-style exception-handling mechanism and the resulting inability to handle errors inside the monitoring software, by providing a fault-tolerance method for C-language monitoring software on an embedded Linux system.
The invention is achieved by the following technical solution:
A fault-tolerance method for C-language monitoring software on an embedded Linux system, characterized by comprising the following steps:
(1) The monitoring software starts running.
(2) Check whether the code meets the monitoring condition. If it does, go to step (3); if it does not, exit monitoring.
(3) Register signal handling. Once signal handling is registered, an error in the monitoring software is taken over by the signal-handling flow instead of being handled by the operating system, which would normally terminate the program directly.
(4) Detect signals, that is, check whether an error signal is present. If no error signal is detected, execute the normal monitoring flow and go to step (5); if an error signal is detected, perform error handling and go to step (2). The signal-detection flow is placed before the monitoring body: without a signal the normal monitoring body runs, and with a signal error handling runs.
(5) Determine whether the monitoring body generates an error signal. If it does, perform signal handling and jump to step (4); if it does not, go to step (2) and start the monitoring flow again. A jump is added to the signal handling so that the program returns to the main loop of the monitoring program.
Preferably, a logging operation is also performed when signal handling is carried out in step (5).
Preferably, the signal type is also recorded when signal handling is carried out in step (5).
Preferably, the time at which the error occurred is also recorded when error handling is carried out in step (4).
Preferably, global variables are also recorded when error handling is carried out in step (4).
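As an illustration of the optional records listed above, the following C sketch formats one log record containing the time of the error, the signal type, and the value of one global variable. It is a hypothetical helper, not code from the patent: the names `signal_name` and `format_error_record`, the record layout, and the `stage` parameter (standing in for whatever global the monitoring software tracks) are all assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <time.h>

/* Map the three signals named in the description to printable names. */
static const char *signal_name(int signo)
{
    switch (signo) {
    case SIGSEGV: return "SIGSEGV";
    case SIGFPE:  return "SIGFPE";
    case SIGABRT: return "SIGABRT";
    default:      return "UNKNOWN";
    }
}

/* Format one record: UTC time of the error, signal type, and the value of a
 * (hypothetical) global progress variable passed in as `stage`.
 * Returns the number of characters written, or -1 if formatting fails or the
 * buffer is too small. */
int format_error_record(char *buf, size_t size, int signo, time_t when, int stage)
{
    struct tm tm_utc;
    char stamp[32];
    int n;

    assert(buf != NULL && size > 0);
    if (gmtime_r(&when, &tm_utc) == NULL)
        return -1;
    if (strftime(stamp, sizeof stamp, "%Y-%m-%dT%H:%M:%SZ", &tm_utc) == 0)
        return -1;
    n = snprintf(buf, size, "%s signal=%s stage=%d",
                 stamp, signal_name(signo), stage);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

A call such as `format_error_record(buf, sizeof buf, SIGFPE, time(NULL), stage)` would fit in the error-handling branch, after control has returned to the monitoring loop. Formatting functions such as `snprintf` are not async-signal-safe, so the sketch keeps them out of the signal handler itself and assumes the handler only stores the signal number.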
A system for implementing the above fault-tolerance method, the system being concentrated in the core runtime layer of the embedded Linux system, characterized in that it comprises a monitoring management module and, connected to it, a process management module, a signal processing module, and an error handling module, the process management module, signal processing module, and error handling module being connected in sequence, wherein:
(1) The process management module implements lifecycle management of the monitoring process, including its creation, scheduling, and communication, so that the original process executes its original logic in order while also meeting the needs of fault-tolerant monitoring in C.
(2) The signal processing module implements signal registration, signal detection, and signal handling. After a registered signal reports an error, the program first handles it itself, and only what it cannot resolve is passed to the Linux operating system. The signal processing module makes the interrupted program flow jump back into the monitoring flow.
(3) The error handling module diagnoses the type of error and applies the corresponding, preconfigured error-handling method to complete error repair.
(4) The monitoring management module comprises a master control end and an internal control end. The master control end provides the user with a visual operating interface, and the internal control end interacts with the master control end, so that at the master control end the user can view the running state of the monitoring software and the fault-tolerance log and can preconfigure the system parameters.
Compared with the prior art, the beneficial effects of the invention are as follows:
The fault-tolerance method for C-language monitoring software in an embedded Linux environment provided by the invention implements in C an exception-handling structure similar to try/catch in C++, so that the monitoring software captures errors as soon as they occur and handles them. This prevents the program from crashing and gives the monitoring software stronger fault tolerance. When the monitoring software encounters a segmentation fault, a floating-point arithmetic error, or an abnormal exit (abort), the main loop of the monitoring program keeps executing, and the error information and the location of the error can be recorded during error handling to support later analysis. The scheme also provides a system design for carrying out this flow. Based on the fault-tolerance strategy and method for C-language monitoring software, the system handles faulty processes during monitoring with little performance loss, effectively improves the operational reliability of the system, and avoids the complexity of hardware customization.
In addition, the method is reliable in principle and simple in its steps, and has very broad application prospects.
It can be seen that, compared with the prior art, the invention has prominent substantive features and represents notable progress, and the beneficial effects of implementing it are evident.
Brief description of the drawings
Fig. 1 is a flow chart of the fault-tolerance method for C-language monitoring software in an embedded Linux environment provided by the invention.
Fig. 2 is a structural schematic diagram of the fault-tolerance system for C-language monitoring software in an embedded Linux environment provided by the invention.
In the figures: 1, process management module; 2, signal processing module; 3, error handling module; 4, monitoring management module; 41, master control end; 42, internal control end.
Detailed description of the embodiments
The invention is described in further detail below with reference to the drawings.
To explain the fault-tolerance method for C-language monitoring software on an embedded Linux system, the processing flow of a typical monitoring program is described first. The monitoring software (main) is a program that executes in an infinite loop according to certain timing rules. To keep the monitoring software running continuously, the core monitoring code (monitor) must be sufficiently stable and free of errors. As the code grows more complex, however, the probability of errors increases, especially for sporadic errors that are hard to reproduce. Such errors, for example a floating-point arithmetic error caused by division by 0, make the monitoring program fail and exit. The code framework of this embodiment, consisting of main, monitor, and the signal-related functions named below, is the basis for the following description of the fault-tolerance method.
As shown in Fig. 1, a fault-tolerance method for C-language monitoring software on an embedded Linux system comprises the following steps:
(1) The monitoring software starts running.
(2) Check whether the code meets the monitoring condition. If it does, go to step (3); if it does not, exit monitoring.
(3) Register signal handling.
(4) Detect signals, that is, check whether an error signal is present. If no error signal is detected, execute the normal monitoring flow and go to step (5); if an error signal is detected, perform error handling and go to step (2).
(5) Determine whether the monitoring body generates an error signal. If it does, perform signal handling and jump to step (4); if it does not, go to step (2) and start the monitoring flow again.
Step (3), registering signal handling, corresponds to the signal function in the sample code. If no signal handling is registered, the program exits as soon as an error occurs and Linux prints a message. Once signal handling is registered, errors are first handled by the program itself after they occur, and only errors the program does not handle are handled by the Linux operating system. To ensure that the program does not exit, signal handling should be registered for at least SIGSEGV, SIGFPE, and SIGABRT. signal_hdl is the registered signal-handling function.
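The registration step can be sketched in C as follows. This is a minimal illustration under stated assumptions rather than the patent's sample code: only `signal`, `signal_hdl`, and the three signal names come from the description, while `register_handlers`, `probe_abort`, and `last_signal` are hypothetical names, and the handler body here merely records the signal number.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Number of the last signal seen by the handler (0 means none yet). */
static volatile sig_atomic_t last_signal = 0;

/* Placeholder handler: it only records which signal arrived. */
static void signal_hdl(int signo)
{
    last_signal = signo;
}

/* Register signal_hdl for the three signals named in the description.
 * Returns 0 on success, -1 if any registration fails. */
int register_handlers(void)
{
    static const int signals[] = { SIGSEGV, SIGFPE, SIGABRT };
    size_t i;

    for (i = 0; i < sizeof signals / sizeof signals[0]; i++) {
        assert(signals[i] > 0);          /* signal numbers are positive */
        if (signal(signals[i], signal_hdl) == SIG_ERR)
            return -1;
    }
    return 0;
}

/* Send SIGABRT to this process with raise(). With the handler installed the
 * process survives, and the recorded signal number is returned. */
int probe_abort(void)
{
    last_signal = 0;
    if (raise(SIGABRT) != 0)
        return -1;
    return (int)last_signal;
}
```

Returning normally from the handler is acceptable here only because the signal is generated with `raise()`. For a SIGSEGV or SIGFPE caused by a real fault, the behavior after a normal return from the handler is undefined, which is why the handler in the description jumps back into the monitoring flow instead of returning. Whether a handler installed with `signal()` stays installed after its first invocation also varies between platforms; `sigaction()` gives defined behavior on that point.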
Signal detection in step (4) corresponds to the sigsetjmp function in the sample code. It is a selection structure: if no signal is detected, the normal monitoring (monitor) flow is executed; if a signal is detected, error handling is performed. During error handling, the time at which the error occurred can be recorded, and some global variables can also be recorded in order to roughly locate where the error occurred.
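The selection structure built on `sigsetjmp` can be shown in isolation. In the sketch below, which is an assumption-laden illustration and not the patent's code, `run_once`, `checkpoint`, and `error_code` are hypothetical names, and `monitor` simulates a failure by calling `siglongjmp` directly so that the two branches can be seen without involving any signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>

static sigjmp_buf checkpoint;        /* saved context of the selection point */
static volatile int error_code = 0;  /* set just before jumping back */

/* Stand-in for the monitoring body. A nonzero `fail` simulates an error by
 * jumping back to the checkpoint. */
static void monitor(int fail)
{
    if (fail != 0) {
        error_code = fail;
        siglongjmp(checkpoint, 1);
    }
}

/* One pass through the selection structure.
 * Returns 0 if monitor() completed normally, otherwise the recorded code. */
int run_once(int fail)
{
    assert(fail >= 0);               /* codes are non-negative in this sketch */
    if (sigsetjmp(checkpoint, 1) == 0) {
        monitor(fail);               /* normal monitoring branch */
        return 0;
    }
    /* Error-handling branch: reached only through siglongjmp(). */
    return error_code;
}
```

`sigsetjmp` returns 0 when it is called directly and the value given to `siglongjmp` when control arrives by a jump, so the `if` plays the role of try (first branch) and catch (second branch). The second argument of 1 asks `sigsetjmp` to save the signal mask so that `siglongjmp` restores it.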
The signal handling in step (5) corresponds to the signal_hdl function in the sample code. Its main purpose is to make the interrupted program flow jump (siglongjmp) back into the monitoring flow, rather than letting the monitoring program exit once a signal is received. In addition, logging can be added to the signal-handling function to record information such as the signal type as part of the log used for bug analysis.
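Putting steps (3) to (5) together, the loop might look like the following C sketch. It is a reconstruction under assumptions, not the code framework from the patent: `monitor`, `signal_hdl`, `sigsetjmp`, and `siglongjmp` are named in the description, while `run_monitor`, `register_handlers`, `loop_checkpoint`, and `caught_signal` are hypothetical. The sketch registers the handlers once with `sigaction()` instead of the `signal()` call named in the description, runs a bounded number of rounds instead of an infinite loop, and simulates a fault with `raise(SIGFPE)`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

static sigjmp_buf loop_checkpoint;
static volatile sig_atomic_t caught_signal = 0;

/* Step (5): record the signal type, then jump back into the monitoring loop. */
static void signal_hdl(int signo)
{
    caught_signal = signo;
    siglongjmp(loop_checkpoint, 1);
}

/* Step (3): register signal_hdl for SIGSEGV, SIGFPE and SIGABRT. */
static int register_handlers(void)
{
    static const int signals[] = { SIGSEGV, SIGFPE, SIGABRT };
    struct sigaction sa;
    size_t i;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = signal_hdl;
    sigemptyset(&sa.sa_mask);
    for (i = 0; i < sizeof signals / sizeof signals[0]; i++)
        if (sigaction(signals[i], &sa, NULL) != 0)
            return -1;
    return 0;
}

/* Monitoring body. Every third round simulates a fault. */
static void monitor(int round)
{
    if (round % 3 == 2)
        raise(SIGFPE);
    /* Real monitoring work would go here. */
}

/* Run `rounds` iterations of the loop and return how many of them ended in
 * the error-handling branch, or -1 if registration failed. */
int run_monitor(int rounds)
{
    /* volatile: both are modified between sigsetjmp() and siglongjmp(). */
    volatile int round = 0;
    volatile int recovered = 0;

    assert(rounds >= 0);
    if (register_handlers() != 0)
        return -1;

    while (round < rounds) {
        if (sigsetjmp(loop_checkpoint, 1) == 0) {   /* step (4) */
            monitor(round);                         /* normal flow */
        } else {
            recovered++;    /* error handling; caught_signal holds the type */
        }
        round++;
    }
    return recovered;
}
```

With six rounds, rounds 2 and 5 raise SIGFPE and the loop still completes, so `run_monitor(6)` returns 2. Because `sigsetjmp` is called with a second argument of 1, the jump also restores the signal mask, which unblocks the signal that was blocked while its handler ran. A production version would restore the previous handlers before returning, since `loop_checkpoint` is no longer valid once `run_monitor` has returned.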
As shown in Fig. 2, the invention also provides a system for implementing the above fault-tolerance method. The system is concentrated in the core runtime layer of the embedded Linux system and comprises a monitoring management module 4 and, connected to it, a process management module 1, a signal processing module 2, and an error handling module 3. The process management module 1, signal processing module 2, and error handling module 3 are connected in sequence, wherein:
(1) The process management module 1 implements lifecycle management of the monitoring process, including its creation, scheduling, and communication, so that the original process executes its original logic in order while also meeting the needs of fault-tolerant monitoring in C.
(2) The signal processing module 2 implements signal registration, signal detection, and signal handling. After a registered signal reports an error, the program first handles it itself, and only what it cannot resolve is passed to the Linux operating system. The signal processing module makes the interrupted program flow jump back into the monitoring flow.
(3) The error handling module 3 diagnoses the type of error and applies the corresponding, preconfigured error-handling method to complete error repair.
(4) The monitoring management module 4 comprises a master control end 41 and an internal control end 42. The master control end 41 provides the user with a visual operating interface, and the internal control end 42 interacts with the master control end 41, so that at the master control end 41 the user can view the running state of the monitoring software and the fault-tolerance log and can preconfigure the system parameters.
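The description does not say how the error handling module maps an error type to a preconfigured handling method. One possible shape, shown purely as an assumption, is a lookup table keyed by signal number. The action names, the table contents, and the function `select_action` below are all hypothetical.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical handling methods that a configuration could choose from. */
typedef enum {
    ACTION_RESUME_LOOP,   /* log the error and continue with the next round */
    ACTION_RESET_STATE,   /* reinitialise monitoring state, then continue */
    ACTION_EXIT           /* give up and leave the error to the OS */
} error_action;

struct error_policy {
    int          signo;
    error_action action;
};

/* Example preconfiguration; the mapping itself is an assumption. */
static const struct error_policy policy_table[] = {
    { SIGFPE,  ACTION_RESUME_LOOP },
    { SIGSEGV, ACTION_RESET_STATE },
    { SIGABRT, ACTION_EXIT },
};

/* Diagnose the error type (here simply the signal number) and return the
 * configured action. Unknown signals fall back to ACTION_EXIT. */
error_action select_action(int signo)
{
    size_t i;

    assert(signo > 0);    /* signal numbers are positive */
    for (i = 0; i < sizeof policy_table / sizeof policy_table[0]; i++)
        if (policy_table[i].signo == signo)
            return policy_table[i].action;
    return ACTION_EXIT;
}
```

Keeping the policy in a table rather than in a chain of conditionals would let the master control end change the handling of a given error type by editing data instead of code, which seems to match the preconfiguration role the description gives to the monitoring management module.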
The fault-tolerance method for C-language monitoring software in an embedded Linux environment provided by the invention implements in C a structure similar to try/catch in C++, so that the monitoring software captures errors as soon as they occur and handles them. This prevents the program from crashing and gives the monitoring software stronger fault tolerance. When the monitoring software encounters a segmentation fault, a floating-point arithmetic error, or an abnormal exit (abort), the main loop of the monitoring program keeps executing, and the error information and the location of the error can be recorded during error handling to support later analysis. The scheme also provides a system design for implementing the method. Based on the fault-tolerance strategy and method for C-language monitoring software, the system handles faulty processes during monitoring with little performance loss, effectively improves the operational reliability of the system, and avoids the complexity of hardware customization.
The technical solution above is one embodiment of the invention. On the basis of the application method and principle disclosed here, those skilled in the art can readily make various improvements or variations, and the invention is not limited to the method described in the specific embodiment above. The embodiment described is therefore only preferred and has no limiting meaning.

Claims (1)

1. A fault-tolerance system for C-language monitoring software on an embedded Linux system, the system being concentrated in the core runtime layer of the embedded Linux system, characterized in that it comprises a monitoring management module and, connected to it, a process management module, a signal processing module, and an error handling module, the process management module, signal processing module, and error handling module being connected in sequence, wherein:
(1) the process management module implements lifecycle management of the monitoring process, including its creation, scheduling, and communication, so that the original process executes its original logic in order while also meeting the needs of fault-tolerant monitoring in C;
(2) the signal processing module implements signal registration, signal detection, and signal handling; after a registered signal reports an error, the program first handles it itself, and only what it cannot resolve is passed to the Linux operating system; the signal processing module makes the interrupted program flow jump back into the monitoring flow;
(3) the error handling module diagnoses the type of error and applies the corresponding, preconfigured error-handling method to complete error repair;
(4) the monitoring management module comprises a master control end and an internal control end; the master control end provides the user with a visual operating interface, and the internal control end interacts with the master control end, so that at the master control end the user can view the running state of the monitoring software and the fault-tolerance log and can preconfigure the system parameters;
the process management module is specifically configured to:
(a) start the monitoring software running;
(b) check whether the code meets the monitoring condition, entering the signal processing module if it does and exiting monitoring if it does not;
the signal processing module is specifically configured to:
(a) register signal handling;
(b) detect signals, that is, check whether an error signal is present; if no error signal is detected, execute the normal monitoring flow and determine whether the monitoring body generates an error signal, performing signal handling and detecting signals again if it does, and jumping to the process management module to start the monitoring flow again if it does not; and if an error signal is detected, enter the error handling module.
CN201611147014.1A 2016-12-13 2016-12-13 A kind of fault-tolerant method of C language monitoring software under embedded Linux system Active CN106649039B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611147014.1A CN106649039B (en) 2016-12-13 2016-12-13 A kind of fault-tolerant method of C language monitoring software under embedded Linux system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611147014.1A CN106649039B (en) 2016-12-13 2016-12-13 A kind of fault-tolerant method of C language monitoring software under embedded Linux system

Publications (2)

Publication Number Publication Date
CN106649039A CN106649039A (en) 2017-05-10
CN106649039B true CN106649039B (en) 2019-09-27

Family

ID=58825255

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611147014.1A Active CN106649039B (en) 2016-12-13 2016-12-13 A kind of fault-tolerant method of C language monitoring software under embedded Linux system

Country Status (1)

Country Link
CN (1) CN106649039B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101706752A (en) * 2009-11-20 2010-05-12 中兴通讯股份有限公司 Method and device for in-situ software error positioning
CN104794031A (en) * 2015-04-16 2015-07-22 上海交通大学 Cloud system fault detection method combining self-adjustment strategy with virtualization technology

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001005692A (en) * 1999-06-25 2001-01-12 Toshiba Corp Computer system, its maintenance and management system, and method for informing of fault

Also Published As

Publication number Publication date
CN106649039A (en) 2017-05-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant