CN110442470A

CN110442470A - System stability monitoring and recovery method for communication equipment

Info

Publication number: CN110442470A
Application number: CN201910682271.2A
Authority: CN
Inventors: 黄振江; 王清波; 黄仝宇; 汪刚; 宋一兵; 侯玉清; 刘双广
Original assignee: Gosuncn Technology Group Co Ltd
Current assignee: Gosuncn Technology Group Co Ltd
Priority date: 2019-07-26
Filing date: 2019-07-26
Publication date: 2019-11-12
Anticipated expiration: 2039-07-26
Also published as: CN110442470B

Abstract

The invention belongs to the technical field of communication equipment and relates to a system stability monitoring and recovery method for communication equipment. The method comprises three parts: Linux command execution status monitoring and recovery; FLASH storage space read/write status monitoring and recovery; and important-thread running status monitoring and recovery. The scheme simultaneously monitors several function points that affect stable operation of the equipment (the Linux command execution status, the FLASH storage space read/write status, and the running status of multiple threads) and applies a different exception-handling measure depending on the monitoring result to restore the equipment to a normal state, thereby ensuring stable operation of the equipment.

Description

System stability monitoring and recovery method for communication equipment

Technical field

The invention belongs to the technical field of communication equipment and relates in particular to a system stability monitoring and recovery method for communication equipment.

Background technique

In the prior art, system stability is monitored by means of a software watchdog or a hardware watchdog: a dedicated thread feeds the watchdog at regular intervals, and if that thread becomes abnormal and fails to feed the watchdog in time, the system reboots. Many factors affect the stability of communication equipment, including the following three: 1. the Linux command execution status; 2. the read/write status of the FLASH storage space; 3. the running status of important threads.

The prior art monitors only the watchdog-feeding thread; the other function points that affect stable operation of the equipment are not monitored. An exception elsewhere can still make the equipment unstable, so the existing scheme cannot fully solve the problem of stable operation.
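To make the prior-art mechanism concrete, the following is a minimal Python sketch of a watchdog timer. It is a simulation only, not a real driver interface: the class `SimulatedWatchdog`, the function `simulate`, the whole-second time step, and the timeout values are assumptions introduced for illustration, since the source gives no timeout.

```python
class SimulatedWatchdog:
    """Toy stand-in for a watchdog timer (hypothetical, for illustration)."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self.last_fed = 0.0

    def feed(self, now: float) -> None:
        # The feeding thread calls this periodically to postpone the reset.
        self.last_fed = now

    def expired(self, now: float) -> bool:
        # True once more than timeout_s has passed since the last feed.
        return now - self.last_fed > self.timeout_s


def simulate(feed_times, timeout_s, end_time):
    """Return the second at which the watchdog would reset the system,
    or None if it is fed often enough up to end_time."""
    dog = SimulatedWatchdog(timeout_s)
    pending = sorted(feed_times)
    for now in range(int(end_time) + 1):
        while pending and pending[0] <= now:
            dog.feed(pending.pop(0))
        if dog.expired(now):
            return now
    return None
```

In this sketch a feeding thread that keeps running never trips the timer, while one that stops feeding causes a reset shortly after the timeout elapses. Nothing else in the system is observed, which is the limitation the background section describes.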

Summary of the invention

To overcome the technical defects of the prior art, the invention proposes a system stability monitoring and recovery method for communication equipment.

The invention is realized by the following technical scheme:

A system stability monitoring and recovery method for communication equipment, comprising the steps of:

(1) Linux command execution status monitoring and recovery, specifically comprising the steps of:

1.1. Start monitoring Linux command execution and let i = 0;

1.2. Wait for a first preset time, then check the value of i;

1.3. If i = 0, execute the system command ls and check the execution result;

1.4. If i = 1, execute the system command ps and check the execution result;

1.5. If i = 2, execute the system command free and check the execution result;

1.6. Determine whether the execution in steps 1.3 to 1.5 failed; if so, go to step 1.8; if not, go to step 1.7;

1.7. if ((++i) >= 3) { i = 0; }, then return to step 1.2;

1.8. Stop feeding the hardware watchdog so that the system restarts, and end the process.
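Steps 1.3 to 1.7 can be sketched in Python as follows. This is an illustrative sketch rather than the patented implementation: the names `COMMANDS`, `run_command` and `check_step`, the 10-second command timeout, and the use of a zero exit status as the success criterion are all assumptions, since the source only states that the execution result is judged.

```python
import subprocess
from typing import Callable, Sequence, Tuple

# The three commands named in the method, indexed by i.
COMMANDS: Sequence[str] = ("ls", "ps", "free")


def run_command(name: str) -> bool:
    """Run one command; treat exit status 0 as success (an assumption)."""
    try:
        result = subprocess.run(
            [name],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=10,  # hypothetical limit so a hung command counts as failure
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def check_step(i: int, runner: Callable[[str], bool] = run_command) -> Tuple[bool, int]:
    """Steps 1.3-1.7: run command i, then advance i with wrap-around.

    Returns (ok, next_i). When ok is False the caller would perform
    step 1.8, i.e. stop feeding the hardware watchdog.
    """
    if not runner(COMMANDS[i]):
        return False, i
    i += 1
    if i >= 3:
        i = 0
    return True, i
```

A caller would wait for the first preset time before each call to `check_step`. Because one command is checked per wait, each of the three commands is exercised once every three intervals.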

(2) FLASH storage space read/write status monitoring and recovery, specifically comprising the steps of:

2.1. Start monitoring the FLASH storage space;

2.2. Wait for a second preset time;

2.3. Write and read back a text file in the FLASH storage space and determine whether the read/write succeeded; if so, return to step 2.2; if not, go to step 2.4;

2.4. Set the FLASH exception flag, request handling, and end the process.
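The read/write check of step 2.3 could look like the following Python sketch. The function name `flash_read_write_ok`, the probe file name, and the payload are hypothetical; the source only specifies that a text file is written and read in the FLASH storage space and that the result is judged.

```python
import os


def flash_read_write_ok(directory: str, probe_name: str = "flash_probe.txt") -> bool:
    """Write a small file under `directory`, read it back, and compare.

    `directory` is assumed to be a mount point backed by the FLASH
    storage space. Any OS-level error counts as a failed check.
    """
    payload = b"flash-probe"
    path = os.path.join(directory, probe_name)
    try:
        with open(path, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to the device
        with open(path, "rb") as f:
            return f.read() == payload
    except OSError:
        return False
```

When the function returns False, the caller would set the FLASH exception flag and request handling as in step 2.4. Note that the read-back may be served from the operating system's cache, so this sketch mainly detects write errors and read-only or missing file systems.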

(3) Important-thread running status monitoring and recovery.

Further, in step (1), the Linux system commands ls, ps and free are executed periodically, the command execution result is determined, and whether to invoke exception handling is decided according to the returned result.

Preferably, the first preset time is 30 minutes.

Further, in step (2), a file is periodically written and read in the FLASH storage space at intervals of the second preset time, and whether to invoke exception handling is decided according to the returned result.

Preferably, the second preset time is 12 hours.

Further, step (3) comprises the steps of:

3.1. At the start of each thread, set that thread's monitoring request flag.

3.2. In the loop body of each thread, increment that thread's counter by 1 on every pass through the loop body; one pass through a monitored thread's loop body must take less time than one pass through the monitoring thread's loop body.

3.3. The monitoring thread determines from each monitored thread's monitoring request flag and counter whether that thread is running normally, and thereby decides whether to invoke exception handling.

3.4. Monitoring method of the monitoring thread: the monitoring thread compares each monitored thread's counter with its previous value; if the counter is the same as the previous value, the monitoring thread determines that the thread is abnormal, and if the number of such detected errors reaches a set number, exception handling is invoked.
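The counter comparison in step 3.4 can be illustrated with a small Python sketch. The class name `StallDetector` and its fields are hypothetical. The sketch also assumes that the error count is never reset once the thread makes progress again, because the source does not describe a reset.

```python
class StallDetector:
    """Detects a stalled thread from successive samples of its loop counter."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold  # the "set number" of detected errors
        self.last_count = None      # no sample seen yet
        self.stalls = 0

    def observe(self, count: int) -> bool:
        """Record one counter sample.

        Returns True once the number of unchanged samples has reached the
        threshold, i.e. when exception handling should be invoked.
        """
        if self.last_count is not None and count == self.last_count:
            self.stalls += 1
        self.last_count = count
        return self.stalls >= self.threshold
```

The requirement in step 3.2 that the monitored loop runs faster than the monitoring loop is what makes this comparison meaningful: a healthy thread always advances its counter at least once between two samples, so an unchanged counter indicates a stall rather than a slow loop.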

Further, in step (3), the important-thread running status monitoring and recovery further comprises:

A. Start monitoring the running status of each thread;

B. Wait 60 seconds;

C. Let n = 0 so that monitoring starts from the first thread;

D. Determine whether n < m, where m is the total number of threads that can be monitored; if so, go to step E; if not, return to step B;

E. Determine whether the monitoring flag of thread n is set; if so, go to step F; if not, let n = n + 1 and return to step D;

F. Determine whether the counter of thread n has changed; if it has not, go to step G;

G. Increment the system reboot flag by 1;

H. Determine whether the system reboot flag is greater than a set value; if so, reboot the system; if not, let n = n + 1 and return to step D.

Further, in step (3), each thread to be monitored performs the steps of:

A. Start and enter the thread;

B. Name the thread;

C. Set the thread's monitoring flag;

D. Enter the thread's loop body, incrementing the thread counter by 1 on each entry;

E. Handle the other work in the loop body, then return to step D.

The invention also includes a computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the steps of the monitoring and recovery method.

The invention also includes a computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the monitoring and recovery method when executing the program.

Compared with the prior art, the invention has at least the following beneficial effect: it simultaneously monitors several function points that affect stable operation of the equipment (the Linux command execution status, the FLASH storage space read/write status, and the running status of multiple threads) and applies a different exception-handling measure depending on the monitoring result to restore the equipment to a normal state, thereby ensuring stable operation of the equipment.

Brief description of the drawings

The invention is described in further detail below with reference to the drawings.

Fig. 1 is a flow chart of the Linux command execution status monitoring and recovery method of the invention;

Fig. 2 is a flow chart of the FLASH storage space read/write status monitoring and recovery method of the invention;

Fig. 3 is a flow chart of a single monitored thread of the invention;

Fig. 4 is a flow chart of the thread running status monitoring and recovery method of the invention.

Specific embodiment

The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the invention.

The invention provides communication equipment running the Linux operating system, together with a system stability monitoring and recovery method for the communication equipment. The detailed process is as follows:

Linux command execution status monitoring:

1. Monitoring method: the Linux system commands ls, ps and free are executed periodically, the command execution result is determined, and whether to invoke exception handling is decided according to the returned result.

2. The monitoring and recovery flow is shown in Fig. 1 and comprises the steps of:

A. Start monitoring Linux command execution and let i = 0;

B. Wait 30 minutes, then check the value of i;

C. If i = 0, execute the system command ls and check the execution result;

D. If i = 1, execute the system command ps and check the execution result;

E. If i = 2, execute the system command free and check the execution result;

F. Determine whether the execution in steps C to E failed; if so, go to step H; if not, go to step G;

G. if ((++i) >= 3) { i = 0; }, then return to step B;

H. Stop feeding the hardware watchdog so that the system restarts, and end the process.

FLASH storage space read/write status monitoring:

1. Monitoring method: at 12-hour intervals, a file is periodically written and read in the FLASH storage space, and whether to invoke exception handling is decided according to the returned result.

2. The monitoring and recovery flow is shown in Fig. 2 and comprises the steps of:

A. Start monitoring the FLASH storage space;

B. Wait 12 hours;

C. Write and read back a text file in the FLASH storage space and determine whether the read/write succeeded; if so, return to step B; if not, go to step D;

D. Set the FLASH exception flag, request handling, and end the process.

Thread running status monitoring:

1. Monitoring method:

1.1. At the start of each thread, set that thread's monitoring request flag.

1.2. In the loop body of each thread, increment that thread's counter by 1 on every pass through the loop body; one pass through a monitored thread's loop body must take less time than one pass through the monitoring thread's loop body.

1.3. The monitoring thread determines from each monitored thread's monitoring request flag and counter whether that thread is running normally, and thereby decides whether to invoke exception handling.

1.4. Monitoring method of the monitoring thread: the monitoring thread compares each monitored thread's counter with its previous value; if the counter is the same as the previous value, the monitoring thread determines that the thread is abnormal, and if the number of such detected errors reaches a set number, exception handling is invoked.

2. Monitoring and recovery flow:

2.1. The flow of each thread to be monitored is shown in Fig. 3 and comprises the steps of:

A. Start and enter the thread;

B. Name the thread;

C. Set the thread's monitoring flag;

D. Enter the thread's loop body, incrementing the thread counter by 1 on each entry;

E. Handle the other work in the loop body, then return to step D.
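A monitored thread following steps A to E might be written as in the Python sketch below. The class name `MonitoredWorker`, its attributes, and the bounded iteration count are assumptions made so that the example terminates; the flow in the source loops indefinitely.

```python
import threading
from typing import Callable


class MonitoredWorker(threading.Thread):
    """A worker that exposes a monitoring flag and a loop counter."""

    def __init__(self, name: str, work: Callable[[], None], iterations: int) -> None:
        super().__init__(name=name)  # step B: name the thread
        self.monitored = False       # monitoring flag, set once the thread starts
        self.counter = 0             # incremented on every loop pass
        self._work = work
        self._iterations = iterations

    def run(self) -> None:
        self.monitored = True                # step C: set the monitoring flag
        for _ in range(self._iterations):    # bounded here only for the demo
            self.counter += 1                # step D: count this loop pass
            self._work()                     # step E: the thread's other work
```

For example, `MonitoredWorker("worker-0", do_work, 5)` (with `do_work` a placeholder callable) can be started with `start()`; a monitoring thread would then read `monitored` and `counter` at intervals.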

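2.2. The flow of the monitoring thread is shown in Fig. 4 and comprises the steps of:

A. Start monitoring the running status of each thread;

B. Wait 60 seconds;

C. Let n = 0 so that monitoring starts from the first thread;

D. Determine whether n < m, where m is the total number of threads that can be monitored; if so, go to step E; if not, return to step B;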

E. Determine whether the monitoring flag of thread n is set; if so, go to step F; if not, let n = n + 1 (monitor the next thread) and return to step D;

F. Determine whether the counter of thread n has changed; if it has not, go to step G;

G. Increment the system reboot flag by 1;

H. Determine whether the system reboot flag is greater than a set value; if so, reboot the system; if not, let n = n + 1 (monitor the next thread) and return to step D.
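One pass of the monitoring thread over all threads can be sketched in Python as follows. The names `ThreadRecord` and `monitor_pass` and the `last_seen` field are hypothetical. The sketch assumes that a thread whose counter has changed is simply skipped, which the translated flow does not state explicitly, and it leaves the 60-second wait to the caller.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ThreadRecord:
    name: str
    monitored: bool = False  # monitoring flag set by the thread itself
    counter: int = 0         # loop counter incremented by the thread
    last_seen: int = -1      # counter value at the previous pass (hypothetical)


def monitor_pass(records: List[ThreadRecord], reboot_flag: int, limit: int) -> Tuple[int, bool]:
    """Run one scan over all records.

    Returns (reboot_flag, reboot_needed). `limit` is the set value that the
    reboot flag must exceed before the system is rebooted.
    """
    n = 0
    m = len(records)
    while n < m:                               # is n < m?
        rec = records[n]
        if rec.monitored:                      # is the monitoring flag set?
            if rec.counter == rec.last_seen:   # counter unchanged since last pass
                reboot_flag += 1               # increment the system reboot flag
                if reboot_flag > limit:        # flag greater than the set value
                    return reboot_flag, True
            rec.last_seen = rec.counter
        n += 1                                 # move on to the next thread
    return reboot_flag, False
```

A monitoring thread would call `monitor_pass` once per 60-second interval, carrying the returned reboot flag into the next call and triggering a reboot when the second element of the result is True.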

The invention also provides a computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the steps of the monitoring and recovery method.

The invention also provides a computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the monitoring and recovery method when executing the program.

The specific embodiments described above explain the purpose, technical solutions and beneficial effects of the invention in further detail. It should be understood that the above is only a specific embodiment of the invention and does not limit its protection scope. Any modification, equivalent substitution or improvement made without departing from the spirit and scope of the invention also falls within the protection scope of the invention.

Claims

1. A system stability monitoring and recovery method for communication equipment, characterized by comprising the steps of:

1.1. Start monitoring Linux command execution and let i = 0;

1.2. Wait for a first preset time, then check the value of i;

1.3. If i = 0, execute the system command ls and check the execution result;

1.4. If i = 1, execute the system command ps and check the execution result;

1.5. If i = 2, execute the system command free and check the execution result;

1.6. Determine whether the execution in steps 1.3 to 1.5 failed; if so, go to step 1.8; if not, go to step 1.7;

1.7. if ((++i) >= 3) { i = 0; }, then return to step 1.2;

1.8. Stop feeding the hardware watchdog so that the system restarts, and end the process.

2.1. Start monitoring the FLASH storage space;

2.2. Wait for a second preset time;

2.3. Write and read back a text file in the FLASH storage space and determine whether the read/write succeeded; if so, return to step 2.2; if not, go to step 2.4;

2.4. Set the FLASH exception flag, request handling, and end the process.

(3) Important-thread running status monitoring and recovery.

2. The system stability monitoring and recovery method for communication equipment according to claim 1, characterized in that in step (1), the Linux system commands ls, ps and free are executed periodically, the command execution result is determined, and whether to invoke exception handling is decided according to the returned result.

3. The system stability monitoring and recovery method for communication equipment according to claim 1, characterized in that the first preset time is 30 minutes.

4. The system stability monitoring and recovery method for communication equipment according to claim 1, characterized in that in step (2), a file is periodically written and read in the FLASH storage space at intervals of the second preset time, and whether to invoke exception handling is decided according to the returned result.

5. The system stability monitoring and recovery method for communication equipment according to claim 1, characterized in that the second preset time is 12 hours.

6. The system stability monitoring and recovery method for communication equipment according to claim 1, characterized in that step (3) further comprises the steps of:

3.2. In the loop body of each thread, increment that thread's counter by 1 on every pass through the loop body; one pass through a monitored thread's loop body must take less time than one pass through the monitoring thread's loop body.

3.3. The monitoring thread determines from each monitored thread's monitoring request flag and counter whether that thread is running normally, and thereby decides whether to invoke exception handling.

3.4. Monitoring method of the monitoring thread: the monitoring thread compares each monitored thread's counter with its previous value; if the counter is the same as the previous value, the monitoring thread determines that the thread is abnormal, and if the number of such detected errors reaches a set number, exception handling is invoked.

7. The system stability monitoring and recovery method for communication equipment according to claim 1, characterized in that in step (3), the important-thread running status monitoring and recovery further comprises:

A. Start monitoring the running status of each thread;

B. Wait 60 seconds;

C. Let n = 0 so that monitoring starts from the first thread;

F. Determine whether the counter of thread n has changed; if it has not, go to step G;

G. Increment the system reboot flag by 1;

H. Determine whether the system reboot flag is greater than a set value; if so, reboot the system; if not, let n = n + 1 and return to step D.

8. The system stability monitoring and recovery method for communication equipment according to claim 1, characterized in that in step (3), each thread to be monitored performs the steps of:

A. Start and enter the thread;

B. Name the thread;

C. Set the thread's monitoring flag;

D. Enter the thread's loop body, incrementing the thread counter by 1 on each entry;

E. Handle the other work in the loop body, then return to step D.

9. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 8.

10. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 8 when executing the program.