KR102023164B1

KR102023164B1 - Method for monitoring OS task of twin micom in RTOS

Info

Publication number: KR102023164B1
Application number: KR1020130004504A
Authority: KR
Inventors: 진연실
Original assignee: 콘티넨탈 오토모티브 시스템 주식회사
Priority date: 2013-01-15
Filing date: 2013-01-15
Publication date: 2019-09-19
Also published as: KR20140092132A

Abstract

The present invention discloses a method for monitoring an OS task of an RTOS microcomputer (micom) that can detect real-time errors that may occur in the microcomputer.
According to an aspect of the present invention, there is provided a method for mutually monitoring OS tasks using two microcomputers MC1 and MC2, each having an independent real-time operating system (RTOS), the method comprising: a) setting the two microcomputers as a master MC2 and a slave MC1, and mutually determining whether the tasks of the master MC2 and the slave MC1 are completed within a preset tolerance time of the task; and b) if, as a result of the determination in step a), an error in which the task execution time has exceeded the tolerance time is detected in the master MC2 or the slave MC1, performing resynchronization (RESYNC) between the master MC2 and the slave MC1.
Accordingly, the present invention monitors errors in which an OS task does not run at the proper time because of an error in the timer that drives the OS or interference from a higher-priority task or an interrupt, and thereby has the effect of detecting real-time errors that may occur in a microcomputer.

Description

METHOD FOR MONITORING OS TASK OF TWIN MICOM IN RTOS

The present invention relates to a method for monitoring the operating system (OS) of a microcomputer (micom) running an independent real-time operating system (RTOS), and more specifically to a method for monitoring an OS task of an RTOS micom that can detect real-time errors that may occur in the micom by monitoring errors in which an OS task does not run at the proper time because of an error in the timer that drives the OS or interference from a higher-priority task or an interrupt.

Programming methods used in conventional embedded systems fall broadly into two groups: the traditional sequential-execution (foreground/background) method, and the single-tasking or multi-tasking method that uses an operating system (OS). The sequential-execution method is used in many microcomputer applications; with this method, the entire program flows sequentially except for interrupt processing.

The OS-based method, in its single-tasking form, is similar to the sequential-execution method except that a kernel loads, executes, and manages the program. Here too, the entire program flows sequentially except for interrupt processing. A typical example is DOS (Disk Operating System).

The multi-tasking method runs several tasks on one OS, which makes a program easier to implement, since each job can be assigned to an appropriate task, and makes development more convenient. Unix and RTOSs are typical examples. Such systems use a watchdog timer to check the computer hardware for abnormalities. In normal operation the program repeatedly resets the timer at intervals shorter than the monitoring time; when an abnormal condition prevents the reset, a warning is issued. The timer therefore expires when processing is delayed by a fault or a program error.

As described above, the RTOS builds its own malfunction monitoring system, which is described in detail with reference to FIG. 1.

As shown, the malfunction monitoring system in the RTOS comprises a control unit 10 and a watchdog timer 20. The watchdog timer 20 is periodically reset by a reset signal from the control unit 10, and reboots the system when the reset signal is not received from the control unit 10 within a predetermined time.

The control unit 10 comprises a main module 11, which is executed when the system starts, and modules that are called and executed by the main module 11. These modules are: a watchdog timer initialization module 12, which is executed when the hardware is initialized; a watchdog timer task module 13, which is generated by the watchdog timer initialization module 12 and continuously resets the watchdog timer 20; and a critical-region processing interrupt module 14, which generates an interrupt signal to prevent the watchdog timer from rebooting the system.

Accordingly, in the malfunction monitoring process, when the system hardware is initialized the RTOS stops the watchdog timer, resets the variable value of the watchdog timer, generates the watchdog timer task, and starts the watchdog timer. The task then outputs a reset signal to reload the watchdog timer.

In addition, when a critical region is entered, an interrupt signal is generated to prevent the watchdog timer from rebooting the system. When the system hardware is initialized while the RTOS kernel is loaded, the control unit 10 executes the watchdog timer initialization process. That is, the watchdog timer initialization module 12 determines whether the initialization flag is set to '1'. If the flag is already set, the module stops and exits the initialization routine.

If the initialization flag is not set, the watchdog timer is stopped, the operation mode is set in the timer control register, and the value to be reloaded is written to the timer variable register. The watchdog timer initialization module 12 then generates a timer task with the highest priority, which runs continuously. Finally, the watchdog timer initialization module 12 sets the timer initialization flag and restarts the watchdog timer.
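The conventional flow described above can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the patented or any vendor implementation: the class names `WatchdogTimer` and `WatchdogInit`, the helper `simulate`, and the tick-based time model are all hypothetical.

```python
class WatchdogTimer:
    """Simulated watchdog that counts down once per tick (hypothetical model)."""

    def __init__(self):
        self.reload_value = 0   # stands in for the timer variable register
        self.counter = 0
        self.running = False
        self.rebooted = False   # set when the timer expires

    def stop(self):
        self.running = False

    def start(self):
        self.counter = self.reload_value
        self.running = True

    def reset(self):
        """Reset signal from the control unit: reload the counter."""
        self.counter = self.reload_value

    def tick(self):
        """Advance one time unit; expire if no reset arrived in time."""
        if not self.running or self.rebooted:
            return
        self.counter -= 1
        if self.counter <= 0:
            self.rebooted = True


class WatchdogInit:
    """Initialization routine guarded by an initialization flag."""

    def __init__(self, timer, reload_value):
        self.timer = timer
        self.reload_value = reload_value
        self.init_flag = False
        self.task_created = False

    def run(self):
        if self.init_flag:           # already initialized: exit the routine
            return False
        self.timer.stop()
        self.timer.reload_value = self.reload_value
        self.task_created = True     # stands in for creating the timer task
        self.init_flag = True
        self.timer.start()
        return True


def simulate(reset_every, ticks, reload_value=3):
    """Run `ticks` time units, resetting every `reset_every` ticks (0 = never)."""
    timer = WatchdogTimer()
    WatchdogInit(timer, reload_value).run()
    for t in range(1, ticks + 1):
        timer.tick()
        if reset_every and t % reset_every == 0:
            timer.reset()
    return timer.rebooted
```

With a reload value of 3 ticks, resetting every 2 ticks keeps the system alive, while never resetting leads to a reboot. The sketch only detects a missing reset, which is the limitation the following paragraph discusses.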

However, the RTOS described above monitors the running task only on the basis of whether a flag is set. OS task monitoring therefore cannot detect the case in which an OS task does not run at the proper time because of an error in the timer that drives the OS or interference from a higher-priority task or an interrupt. Moreover, because the microcomputer does not monitor its own errors, real-time errors in the program execution sequence cannot be detected even when the OS task runs at the correct time, which degrades the stability of the system.

The present invention was devised to solve these problems. An object of the present invention is to provide a method for monitoring an OS task of an RTOS micom that can detect real-time errors that may occur in the micom by monitoring errors in which an OS task does not run at the proper time because of an error in the timer that drives the OS or interference from a higher-priority task or an interrupt.

Another object of the present invention is to provide a method for monitoring an OS task of an RTOS micom that secures the stability of the RTOS system by detecting real-time errors that may occur in the program execution sequence even when the OS task runs at the correct time.

To achieve the above objects, a method for monitoring an OS task of an RTOS micom according to an aspect of the present invention is a method for mutually monitoring OS tasks using two microcomputers MC1 and MC2, each having an independent real-time operating system (RTOS), the method comprising: a) setting the two microcomputers as a master MC2 and a slave MC1, and mutually determining whether the tasks of the master MC2 and the slave MC1 are completed within a preset tolerance time of the task; and b) if, as a result of the determination in step a), an error in which the task execution time has exceeded the tolerance time is detected in the master MC2 or the slave MC1, performing resynchronization (RESYNC) between the master MC2 and the slave MC1.

According to a preferred embodiment of the present invention, step a) includes: a-1) the master MC2 or the slave MC1 entering the operation (EXEC) state in its monitoring window; a-2) the slave MC1 or the master MC2 performing its task and switching to the operation (EXEC) state; and a-3) determining whether the operation (EXEC) time between the slave MC1 or the master MC2 and the master MC2 or the slave MC1 is outside the tolerance time range and, if so, determining an error.

According to a preferred embodiment of the present invention, each of the plurality of functions included in the task includes an arithmetic algorithm at its start point and at its end point; the arithmetic algorithms apply mutually different expressions to an initial value, and completion of each function is verified on the basis of the result of those expressions.

The method for monitoring an OS task of an RTOS micom proposed in the present invention monitors errors in which an OS task does not run at the proper time because of an error in the timer that drives the OS or interference from a higher-priority task or an interrupt, and can thereby detect real-time errors that may occur in the microcomputer. In addition, even when the OS task runs at the correct time, the method secures the stability of the RTOS system by detecting real-time errors that may occur in the program execution sequence.

FIG. 1 is a configuration diagram for explaining task monitoring of a conventional RTOS.
FIG. 2 is a diagram for explaining the OS states of the microcomputers according to the present invention.
FIG. 3 is a diagram illustrating an embodiment in which MC1 shown in FIG. 2 monitors an OS error of MC2.
FIG. 4 is a diagram illustrating an embodiment in which MC2 shown in FIG. 2 monitors an OS error of MC1.
FIG. 5 is a diagram for describing resynchronization due to an OS error of MC2.
FIG. 6 is a diagram for describing resynchronization due to an OS error of MC1.
FIG. 7 is a diagram for explaining the principle of program flow and OS monitoring according to the present invention.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings.

First, the present invention has a structure in which two microcomputers, each having an independent real-time operating system (RTOS), monitor each other's OS tasks. This structure not only detects an OS task that does not run at the proper time because of an error in the timer that drives the OS or interference from a higher-priority task or an interrupt, but also detects real-time errors that can occur in the program execution sequence.

To this end, in the present invention, two micoms connected by a communication line use their internal resources and the communication line to monitor each other's OS tasks and program flow. The two micoms are defined as a slave and a master, and each holds an OS state corresponding to the OS task being monitored.

The master leads the communication with the slave and monitors the operation of the slave's OS task. The slave monitors its own program flow and changes its OS state to reflect the result. The current OS state of the slave is sent to the master in response to the master's communication request. The slave also monitors the master's OS and detects its errors.

For example, an OS monitoring method of two microcomputers MC1 (slave) and MC2 (master) monitoring a 40 ms task will be described with reference to FIG. 2.

First, the allowable jitter of the OS task is set to ±10 ms and the communication period between the two microcomputers is set to 10 ms. It is desirable to set the error range to a multiple of the communication period. Every 10 ms, MC2 transmits its OS state to MC1 and MC1 transmits its OS state to MC2. Each microcomputer monitors the operation of the partner's OS task using the received OS state.
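As a rough illustration of how these three parameters interact, the sketch below derives which communication slots can legitimately contain the task run. The function name `monitoring_window` and the convention that slot k is the k-th state exchange after synchronization are assumptions made for this example.

```python
def monitoring_window(task_period_ms=40, jitter_ms=10, comm_period_ms=10):
    """Return the communication slots in which the task may legitimately run.

    Slot k is assumed to be the k-th OS-state exchange after synchronization,
    so it corresponds to an elapsed time of k * comm_period_ms.
    """
    if task_period_ms % comm_period_ms or jitter_ms % comm_period_ms:
        # The text recommends an error range that is a multiple of the
        # communication period; otherwise slot boundaries do not line up.
        raise ValueError("periods must be multiples of the communication period")
    first = (task_period_ms - jitter_ms) // comm_period_ms
    last = (task_period_ms + jitter_ms) // comm_period_ms
    return list(range(first, last + 1))


print(monitoring_window())  # [3, 4, 5]
```

For the 40 ms task with ±10 ms jitter and a 10 ms communication period, the window covers the third to fifth exchanges, that is, 30 ms to 50 ms.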

In the synchronization step, the OS states of MC1 and MC2 are set to the synchronization (SYNC) state. The SYNC state is maintained until the 40 ms task of MC1 runs; when it runs, the state of MC1 changes to the operation (EXEC) state. When the OS state of MC1 changes to EXEC, monitoring of each microcomputer starts.

Once monitoring has started through the synchronization step, MC1 and MC2 monitor each other. Because the 40 ms task may run anywhere from 30 ms to 50 ms after the previous run when the ±10 ms error range is taken into account, monitoring one run of the 40 ms task (a 40 ms task monitoring segment) takes from 30 ms up to a maximum of 50 ms, depending on when the task actually runs.

The monitoring segment for the 40 ms task is divided into t1, t2, t3, and t4 according to the communication order, as shown in the 'Monitoring 1' step, and can extend to t5, the 50 ms maximum. The communication order during the synchronization (SYNC) step is defined as t0. As shown, the OS state of MC2 (master) is divided into a standby (PEND) state, in which it waits for the 40 ms task of MC1 (slave) to run, and an operation (EXEC) state, in which it checks that the 40 ms task has run.

For MC2, the time outside the monitoring window is the PEND state and the time inside the monitoring window is the EXEC state. MC1 (slave) is in the PEND state from the time MC2 (master) enters its PEND state, and switches to the EXEC state when its own 40 ms task runs.

The illustrated 'Monitoring 2' and 'Monitoring 3' are examples that explain the OS states of MC1 and MC2 according to the present invention: both microcomputers enter the PEND state after the SYNC step, and MC2 (master) enters the EXEC state in the monitoring window.

In the normal case, the 40 ms task of MC1 (slave) runs within the monitoring window and MC1 changes to the EXEC state. After confirming that the OS state of MC1 has changed to EXEC, MC2 starts a new monitoring segment. Because jitter of up to ±10 ms falls within the monitoring window of MC2, it is regarded as normal operation of MC1.

OS error monitoring based on the monitoring method described above will now be described with reference to FIG. 3.

FIG. 3 illustrates OS error monitoring for MC2 as an example of the present invention; that is, the case in which MC1 monitors an OS error of MC2. If the OS of MC2 does not operate normally, the OS state of MC2 behaves differently from what MC1 expects. As with MC1, an error range of ±10 ms is regarded as normal; in the normal case MC2 should be in the EXEC state at t3, t4, and t5, because these form the monitoring window.

If MC2 remains in the PEND state at t3 of 'Normal 1' because of an OS delay, MC2 is not judged to be in error, because it is still within the error range. In other words, the OS delays at t1, t2, and t3 in the OS state of MC2 are 10 ms, 20 ms, and 20 ms, for a total OS delay of 50 ms; considering the ±10 ms error range of the 40 ms task, the task was still executed within the 50 ms maximum, and MC2 switches to EXEC at t4 after the task completes. Therefore, in 'Normal 1', MC1 does not judge the EXEC of MC2 at t4 to be an error.

In 'Error 1', by contrast, MC2 spends 10 ms at t1, 10 ms at t2, and 20 ms at t3, for a total of 40 ms, but then remains in the PEND state for a further 20 ms at t4. The task thus takes 60 ms, which exceeds the normal range even allowing for the ±10 ms error range. MC1 therefore detects an error of MC2 at t4.

In the illustrated 'Normal 2', the task runs at the minimum of the error range. MC2 stays in the PEND state at t1 and t2 for 10 ms and 20 ms, respectively, and switches to the EXEC state at t3. The time spent in t1 and t2 is 30 ms, which is within the ±10 ms error range of the 40 ms task, so MC1 does not judge MC2 to be in error.

In 'Error 2', by contrast, MC2 switches to EXEC at t2 after spending only 10 ms at t1, which is below the 30 ms lower bound of the allowable range. MC1 therefore judges MC2 to be in error at t2.
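The four FIG. 3 cases can be checked with a short sketch that sums the time the monitored microcomputer spends in PEND before it reaches EXEC and compares the total with the 30 ms to 50 ms range. The function name `judge_elapsed` and the representation of each case as a list of per-slot durations are assumptions for illustration; the durations themselves are the ones quoted in the text.

```python
TASK_MS = 40        # nominal task period from the example
TOLERANCE_MS = 10   # allowable jitter from the example


def judge_elapsed(pend_durations_ms):
    """Judge the partner from the time it spent in PEND before reaching EXEC."""
    elapsed = sum(pend_durations_ms)
    if elapsed < TASK_MS - TOLERANCE_MS:
        return "error: too early"
    if elapsed > TASK_MS + TOLERANCE_MS:
        return "error: too late"
    return "normal"


# Per-slot PEND durations of MC2 as quoted for FIG. 3.
cases = {
    "Normal 1": [10, 20, 20],      # 50 ms, upper edge of the range
    "Error 1": [10, 10, 20, 20],   # 60 ms, beyond the range
    "Normal 2": [10, 20],          # 30 ms, lower edge of the range
    "Error 2": [10],               # 10 ms, EXEC reached too soon
}

for name, durations in cases.items():
    print(name, sum(durations), judge_elapsed(durations))
```

Both edges of the range (30 ms and 50 ms) are treated as normal, matching 'Normal 1' and 'Normal 2'.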

In this way, the two microcomputers, each running an independent real-time operating system (RTOS), are set as the master (MC2) and the slave (MC1), and the slave judges whether the master's task has run within the preset tolerance time of the task. Because this judgment is based on the state transitions of the master and the slave, the two monitor each other for errors.

FIG. 4 illustrates an example in which MC2 monitors an OS error of MC1. As described above, when the 40 ms task of MC1 runs, the OS state of MC1 changes from PEND to EXEC. If the 40 ms task runs 10 ms late, the state changes to EXEC at t5, as in 'Normal 1', and this is judged normal. That is, the task of MC1 runs (EXEC) at t5 while MC2 is monitoring MC1. The 40 ms task runs at 50 ms, which is within the +10 ms allowable error, so MC2 determines that MC1 is operating normally.

However, when MC1 remains in the PEND state as in 'Error 1', MC1 has still not switched to EXEC at t5 (50 ms), so the +10 ms error tolerance is exceeded and MC2 recognizes an error of MC1. When the 40 ms task runs at t3 of MC1, as in 'Normal 2', it is within the -10 ms error tolerance and MC1 is judged to be normal.

When the 40 ms task runs at t2 of MC1, as in 'Error 2', MC1 has waited in PEND for only 20 ms, which falls outside the -10 ms error tolerance. MC2 therefore determines that MC1 is in error. In short, when MC2 monitors the OS error of MC1, it determines whether the EXEC time of MC1 falls within the allowable range of the EXEC time expected by MC2.
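The FIG. 4 cases can also be expressed in terms of the OS states that MC2 receives at each communication slot. The sketch below is one possible reading, with the assumptions stated plainly: the function name `monitor_segment` is hypothetical, states are plain strings, and the window t3 to t5 is hard-coded from the 40 ms / ±10 ms / 10 ms example.

```python
WINDOW = (3, 5)  # monitoring window: slots t3..t5 for the 40 ms example


def monitor_segment(states):
    """Classify one monitoring segment.

    states[k-1] is the partner's OS state received at slot tk.
    Returns (verdict, slot), where verdict is "normal", "error", or
    "pending" (segment not finished yet).
    """
    first, last = WINDOW
    for slot, state in enumerate(states, start=1):
        if state == "EXEC":
            if slot < first:
                return ("error", slot)    # task ran before the window opened
            return ("normal", slot)       # task ran inside the window
        if slot >= last:
            return ("error", slot)        # still not EXEC when the window closes
    return ("pending", len(states))


print(monitor_segment(["PEND"] * 4 + ["EXEC"]))   # Normal 1
print(monitor_segment(["PEND"] * 5))              # Error 1
print(monitor_segment(["PEND", "PEND", "EXEC"]))  # Normal 2
print(monitor_segment(["PEND", "EXEC"]))          # Error 2
```

The slot at which an error is reported (t5 for 'Error 1', t2 for 'Error 2') is the point at which the text says MC2 recognizes the error.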

When one microcomputer detects an OS error of the other (MC1 or MC2), the OS states of the two microcomputers must be resynchronized, because mutual reliability is maintained by continuous mutual monitoring after synchronization.

FIG. 5 illustrates the point at which MC1 resynchronizes after it has detected an OS error of MC2. As shown, when the OS state of MC2 is still PEND at t4 because of an OS error, MC1 detects the error and enters a resynchronization step to synchronize the OS states. As another example, MC1 detects at t2 an error of MC2 that is outside the tolerance range and then performs resynchronization (RESYNC).

The resynchronization (RESYNC) step holds the communication order at t0, in the same way as the synchronization (SYNC) step described above. During resynchronization, when the OS state of MC1 changes to EXEC, the monitoring segment of the 40 ms task restarts. Likewise, when MC2 detects an OS error of MC1 as shown in FIG. 6, that is, when an error outside the tolerance range occurs at t5, MC2 returns to t0 and performs resynchronization. It then confirms the EXEC state of MC2 during the resynchronization step and restarts the 40 ms task monitoring segment.
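One way to picture the synchronization, monitoring, and resynchronization cycle is as a small state machine that is fed one received OS state per communication period. This is a sketch under assumptions, not the patented implementation: the class `PartnerMonitor`, its phase names, and the simplification that the partner reports EXEC for exactly one slot per task run are all hypothetical.

```python
class PartnerMonitor:
    """Tracks one partner through SYNC -> MONITOR -> RESYNC (illustrative)."""

    def __init__(self, first_slot=3, last_slot=5):
        self.first_slot = first_slot   # window opens at t3 in the example
        self.last_slot = last_slot     # window closes at t5 in the example
        self.phase = "SYNC"
        self.slot = 0                  # held at t0 while (re)synchronizing
        self.errors = 0

    def on_state(self, partner_state):
        """Process one OS state received in a communication period."""
        if self.phase in ("SYNC", "RESYNC"):
            # Stay at t0 until the partner reports EXEC, then start a segment.
            if partner_state == "EXEC":
                self.phase = "MONITOR"
                self.slot = 0
            return self.phase
        self.slot += 1
        if partner_state == "EXEC":
            if self.slot < self.first_slot:
                self._fail()           # task ran too early
            else:
                self.slot = 0          # normal: start a new monitoring segment
        elif self.slot >= self.last_slot:
            self._fail()               # task still not run at the window end
        return self.phase

    def _fail(self):
        self.errors += 1
        self.phase = "RESYNC"
        self.slot = 0
```

Feeding the monitor `SYNC, EXEC` starts monitoring; five consecutive `PEND` states then trigger a resynchronization, and the next `EXEC` restarts the monitoring segment.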

In addition, the present invention checks the program flow when determining whether a task has run, so that the execution of the programs inside the task is checked as well as the task itself.

That is, when checking whether the task of MC1 has run, the program flow is checked using actions that include arithmetic operations. As shown in FIG. 7, MC1 and MC2 each hold a plurality of tasks and include registers that store the program execution state of each task. The data stored in each register (OS state data) is provided to the partner microcomputer so that it can determine whether the corresponding program has run. The present invention places a simple arithmetic operation in each program and determines whether the program ran normally from the result of those operations.

For example, one task (TASK 0) includes a plurality of functions (Action 0 to Action m), and each function includes simple arithmetic algorithms (Action_0 to Action_end). Each function contains two arithmetic algorithms, one located at its start and one at its end.

As shown, the arithmetic algorithms (Action_0 to Action_end) are simple operations; expressions based on basic arithmetic are appropriate. The arithmetic algorithms are invoked at the start and end of each function called in the task. For example, suppose Task 0 is a 40 ms task that calls three functions, each containing two arithmetic algorithms.

For example, suppose m + 1 functions exist in Task 0, each function contains two arithmetic algorithms (Action_0 to Action_end), and the expression of the first arithmetic algorithm (Action_0) is set to '× 5'. If the initial value is set to '10', the result is '50' when the first function starts. If the next algorithm (Action_1), placed at the point where the program of that function has run, is set to '÷ 2', the final value of that function is '25'.

By applying different arithmetic algorithms at the start and end of each function in the same way, the final result at the end of the task can be predicted, and the actual final result is computed at the point where every function in the task has finished. After the task completes, the result of the last arithmetic algorithm (Action_end) is therefore used to check whether every program in the task ran normally.

When the task runs normally and follows the programmed flow, the result checked at Action_end is always the same. If the result obtained at Action_end differs from the expected result, the program flow is judged not to have run normally. If the result matches the expected value, the program flow is judged normal, the 40 ms task is regarded as having run normally, and the OS state of MC1 changes from PEND to EXEC.
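The program-flow check can be sketched as a running value that every function updates at its start and at its end. Only the first pair of operations (× 5, then ÷ 2, starting from 10) comes from the text; the other two pairs, the names `ACTIONS`, `expected_value`, and `run_task`, and the use of `None` to simulate a function that never ran are assumptions made for this example.

```python
from fractions import Fraction
import operator

# (start operation, end operation) for each function called by the task.
ACTIONS = [
    ((operator.mul, 5), (operator.truediv, 2)),  # from the text: x5, then /2
    ((operator.add, 3), (operator.mul, 2)),      # hypothetical
    ((operator.sub, 6), (operator.mul, 3)),      # hypothetical
]


def expected_value(initial, actions=ACTIONS):
    """Value that Action_end must produce when every function runs fully."""
    value = Fraction(initial)
    for (start_op, a), (end_op, b) in actions:
        value = end_op(start_op(value, a), b)
    return value


def run_task(initial, functions, actions=ACTIONS):
    """Run the task; a function given as None simulates one that never ran."""
    value = Fraction(initial)
    for body, ((start_op, a), (end_op, b)) in zip(functions, actions):
        if body is None:
            continue                  # neither checkpoint is executed
        value = start_op(value, a)    # checkpoint at function start
        if not body():                # body returns False if it aborts early
            continue                  # end checkpoint is never reached
        value = end_op(value, b)      # checkpoint at function end
    return value


all_ok = [lambda: True] * 3
result = run_task(10, all_ok)
os_state = "EXEC" if result == expected_value(10) else "PEND"
print(result, os_state)  # 150 EXEC
```

Skipping or aborting any of the three functions changes the final value for this particular choice of operations, so the mismatch at Action_end reveals the broken flow. A real design would need to choose the expressions so that no skipped path accidentally yields the expected value.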

MC1, MC2: Micom
EXEC: Operation
PEND: Standby
SYNC: Synchronize
RESYNC: Resync

Claims

A method for monitoring an OS task of an RTOS micom, in which two microcomputers (MC1, MC2), each having an independent real-time operating system (RTOS), mutually monitor OS tasks, the method comprising:
a) setting the two micoms as a master MC2 and a slave MC1, and mutually determining whether the tasks of the master MC2 and the slave MC1 are completed within a preset tolerance time of the task; and
b) if, as a result of the determination in step a), an error in which the task execution time has exceeded the tolerance time is detected in the master MC2 or the slave MC1, performing resynchronization (RESYNC) between the master MC2 and the slave MC1,
wherein step a) includes recognizing state transitions of the master MC2 and the slave MC1.

The method of claim 1, wherein step a) comprises:
a-1) the master MC2 or the slave MC1 entering an operation (EXEC) state in its monitoring window;
a-2) the slave MC1 or the master MC2 performing a task and switching to the operation (EXEC) state; and
a-3) determining whether the operation (EXEC) time between the slave MC1 or the master MC2 and the master MC2 or the slave MC1 is outside the tolerance time range and, if so, determining an error.

The method according to claim 1 or 2,
wherein the task is a 40 ms task and the tolerance time is ±10 ms.

The method according to claim 1 or 2,
wherein each of the plurality of functions included in the task includes an arithmetic algorithm at its start point and at its end point, and
wherein the arithmetic algorithms apply mutually different expressions to an initial value, and completion of each function is verified on the basis of those expressions.

The method of claim 4,
wherein the arithmetic algorithm is a basic arithmetic operation.