CN116991637A

CN116991637A - Operation control method and device of embedded system, electronic equipment and storage medium

Info

Publication number: CN116991637A
Application number: CN202311252711.3A
Authority: CN
Inventors: 李保晗; 马文凯; 赵凤鸣
Original assignee: Suzhou Metabrain Intelligent Technology Co Ltd
Current assignee: Suzhou Metabrain Intelligent Technology Co Ltd
Priority date: 2023-09-26
Filing date: 2023-09-26
Publication date: 2023-11-03
Anticipated expiration: 2043-09-26
Also published as: CN116991637B

Abstract

Embodiments of the application provide an operation control method and device of an embedded system, electronic equipment and a storage medium. The method comprises: detecting the running state of a second operating system through a first operating system; when it is detected that a specified abnormality exists in the second operating system and that a normally running target specified process exists among a group of specified processes in the second operating system, stopping the target specified process in the second operating system through the first operating system; and, after the target specified process in the second operating system has been stopped through the first operating system, starting the group of specified processes backed up in the first operating system through the first operating system, and controlling the second operating system to restart through the first operating system. The application solves the problem, present in operation control methods of embedded systems in the related art, of low operating safety of equipment caused by an overly long restart time.

Description

Operation control method and device of embedded system, electronic equipment and storage medium

Technical Field

The embodiment of the application relates to the field of computers, in particular to an operation control method and device of an embedded system, electronic equipment and a storage medium.

Background

Currently, embedded systems may be applied in different scenarios, for example as embedded software in electronic products. To ensure the operating safety of an embedded system, the state of each process in the embedded system can be monitored by a monitoring process, and if an abnormal condition such as process suspension or failure to be scheduled occurs, the system can be restarted.

However, because the system start time is affected by various factors, a long restart time can easily lead to abnormal conditions that disrupt the normal operation of the electronic equipment and compromise its operating safety. Therefore, operation control methods of embedded systems in the related art have the problem of low operating safety of equipment caused by an overly long restart time.

Disclosure of Invention

The embodiment of the application provides an operation control method and device of an embedded system, electronic equipment and a storage medium, which at least solve the problem of low safety of equipment operation caused by overlong restarting time in the operation control method of the embedded system in the related technology.

According to an embodiment of the present application, there is provided an operation control method of an embedded system, including: detecting the running state of a second operating system through a first operating system, wherein the first operating system and the second operating system are operating systems running on different processor cores, and the embedded system comprises the first operating system and the second operating system; when it is detected that a specified abnormality exists in the second operating system and that a normally running target specified process exists among a group of specified processes in the second operating system, stopping the target specified process in the second operating system through the first operating system; and, after the target specified process in the second operating system has been stopped through the first operating system, starting the group of specified processes backed up in the first operating system through the first operating system, and controlling the second operating system to restart through the first operating system.

According to still another embodiment of the present application, there is provided an operation control apparatus of an embedded system, including: a first detection unit, configured to detect the running state of a second operating system through a first operating system, wherein the first operating system and the second operating system are operating systems running on different processor cores, and the embedded system comprises the first operating system and the second operating system; a first control unit, configured to stop, through the first operating system, a target specified process in the second operating system when it is detected that a specified abnormality exists in the second operating system and that a normally running target specified process exists among a group of specified processes in the second operating system; and a first designating unit, configured to, after the target specified process in the second operating system has been stopped through the first operating system, start the group of specified processes backed up in the first operating system through the first operating system, and control the second operating system to restart through the first operating system.

According to still another embodiment of the present application, there is also provided an embedded system including a first operating system and a second operating system running on different processor cores. The first operating system is configured to: detect the running state of the second operating system; stop the target specified process in the second operating system when it is detected that a specified abnormality exists in the second operating system and that a normally running target specified process exists among a group of specified processes in the second operating system; and, after the target specified process in the second operating system has been stopped, start the group of specified processes backed up in the first operating system and control the second operating system to restart. The second operating system is configured to interact with the first operating system through inter-core communication and to perform matching operations in response to the control of the first operating system.

According to still another embodiment of the present application, there is also provided a server including a BMC chip, where a first operating system and a second operating system run on different processor cores of the BMC chip. The first operating system is configured to: detect the running state of the second operating system; stop the target specified process in the second operating system when it is detected that a specified abnormality exists in the second operating system and that a normally running target specified process exists among a group of specified processes in the second operating system; and, after the target specified process in the second operating system has been stopped, start the group of specified processes backed up in the first operating system and control the second operating system to restart. The second operating system is configured to interact with the first operating system through inter-core communication and to perform matching operations in response to the control of the first operating system.

According to still another embodiment of the present application, there is further provided a BMC chip, where a first operating system and a second operating system run on different processor cores of the BMC chip. The first operating system is configured to: detect the running state of the second operating system; stop the target specified process in the second operating system when it is detected that a specified abnormality exists in the second operating system and that a normally running target specified process exists among a group of specified processes in the second operating system; and, after the target specified process in the second operating system has been stopped, start the group of specified processes backed up in the first operating system and control the second operating system to restart. The second operating system is configured to interact with the first operating system through inter-core communication and to perform matching operations in response to the control of the first operating system.

According to a further embodiment of the application, there is also provided a computer readable storage medium having stored therein a computer program, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.

According to a further embodiment of the application there is also provided an electronic device comprising a memory having stored therein a computer program and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.

According to the embodiments of the application, under a heterogeneous dual system, a relatively independent secondary system detects fault conditions of the main system and specifies a corresponding recovery strategy. The running state of the second operating system is detected through the first operating system, wherein the first operating system and the second operating system are operating systems running on different processor cores, and the embedded system comprises the first operating system and the second operating system; when it is detected that a specified abnormality exists in the second operating system and that a normally running target specified process exists among a group of specified processes in the second operating system, the target specified process in the second operating system is stopped through the first operating system; and, after the target specified process in the second operating system has been stopped through the first operating system, the group of specified processes backed up in the first operating system is started through the first operating system, and the second operating system is controlled to restart through the first operating system.

Drawings

FIG. 1 is a block diagram of the hardware architecture of an alternative computer terminal according to an embodiment of the present application;

FIG. 2 is a flow chart of an alternative method of controlling operation of an embedded system according to an embodiment of the present application;

FIG. 3 is a schematic diagram of an alternative method of operation of an embedded system according to an embodiment of the present application;

FIG. 4 is a schematic diagram of an alternative method of operation of an embedded system according to an embodiment of the present application;

FIG. 5 is a schematic diagram of a method of operation of yet another alternative embedded system in accordance with an embodiment of the present application;

FIG. 6 is a flow chart of another alternative method of controlling operation of an embedded system according to an embodiment of the present application;

FIG. 7 is a flow chart of yet another alternative method of operation control of an embedded system in accordance with an embodiment of the present application;

FIG. 8 is a block diagram of an operation control device of an embedded system according to an embodiment of the present application.

Detailed Description

Embodiments of the present application will be described in detail below with reference to the accompanying drawings in conjunction with the embodiments.

It should be noted that the terms "first," "second," and the like in the description and the claims of the embodiments of the present application and the above-described drawings are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order.

The method embodiments provided in the embodiments of the present application may be performed in a mobile terminal, a computer terminal or a similar computing device. Taking a computer terminal as an example, FIG. 1 is a block diagram of a hardware structure of an alternative computer terminal according to an embodiment of the present application. As shown in FIG. 1, the computer terminal may include one or more (only one is shown in FIG. 1) processors 102 (the processor 102 may include, but is not limited to, a processing device such as a microcontroller (MCU) or a programmable logic device (FPGA)) and a memory 104 for storing data, wherein the computer terminal may further include a transmission device 106 for communication functions and an input-output device 108. It will be appreciated by those skilled in the art that the configuration shown in FIG. 1 is merely illustrative and is not intended to limit the configuration of the computer terminal described above. For example, the computer terminal may also include more or fewer components than shown in FIG. 1, or have a different configuration than shown in FIG. 1.

The memory 104 may be used to store computer programs, for example, software programs and modules of application software, such as the computer program corresponding to the operation control method of the embedded system in an embodiment of the present application. The processor 102 executes the computer program stored in the memory 104 to perform various functional applications and data processing, that is, to implement the above-mentioned method. The memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the computer terminal via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The transmission device 106 is used to receive or transmit data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of a computer terminal. In one example, the transmission device 106 includes a NIC (Network Interface Controller, network adapter) that can communicate with other network devices via a base station to communicate with the internet. In one example, the transmission device 106 may be an RF (Radio Frequency) module for communicating with the internet wirelessly.

According to an aspect of the embodiments of the present application, an operation control method of an embedded system is provided. Taking execution of the method by a computer terminal as an example, FIG. 2 is a schematic flow chart of an alternative operation control method of the embedded system according to an embodiment of the present application. As shown in FIG. 2, the flow includes the following steps:

In step S202, the running state of the second operating system is detected by the first operating system, where the first operating system and the second operating system are operating systems running on different processor cores, and the embedded system includes the first operating system and the second operating system.

The operation control method of the embedded system in this embodiment can be applied to scenarios in which the operation of an embedded system is controlled. The embedded system may be an embedded multi-system, i.e., a system in which multiple operating systems (e.g., a first operating system, a second operating system, etc.) run on a multi-core processor of the embedded system. These operating systems run simultaneously in the same embedded system and may be of the same type or of different types, for example heterogeneous operating systems (operating systems of different types with different architectures). Services may be executed in parallel by processing resources on multiple cores of the processor. The processor may be a multi-core processor, for example an 8-core processor or a processor with another number of cores; this embodiment does not limit the number of cores in the multi-core processor.

Currently, embedded systems may be applied in different scenarios, for example as embedded software in electronic products. To ensure the operating safety of an embedded system, the state of each process in the embedded system can be monitored by a monitoring process, and if an abnormal condition such as process suspension or failure to be scheduled occurs, the system can be restarted. For example, with the rapid development of electronic technology, and in order to ensure the reliability of embedded software in various electronic products, a monitoring process is usually defined in embedded software such as Linux (in full, GNU/Linux, a freely distributable Unix-like operating system). The monitoring process periodically resets a WDT (Watchdog Timer) to prevent the WDT from elapsing, or "timing out", which would restart the system. At the same time, it checks the state of each process: if a process is suspended, the corresponding process is recovered; if a process cannot be scheduled, the system is restarted.

Here, for the watchdog to operate normally, it needs to be "fed" at regular intervals. Feeding is implemented by having the device output a high or low level to the watchdog so that the watchdog does not execute the reset command. When the device is configured with a feeding operation, it runs a feeding script at intervals and outputs a high level to the watchdog; on receiving the level, the watchdog considers the device to be operating normally and does not execute the reset command. Conversely, if the watchdog does not receive the level signal, or the device performs no feeding operation for a certain period of time, the watchdog resets the device, and the device may fall into an endless restart cycle.

It should be noted that a WDT is an electronic or software timer used to detect and recover from computer failures. During normal operation, the computer periodically resets the watchdog timer to prevent it from elapsing, or "timing out". If the computer fails to reset the watchdog because of a hardware failure or a program error, the timer elapses and generates a timeout signal, which is used to initiate one or more corrective actions. Here, corrective actions typically include placing the computer system in a safe state and restoring normal operation of the system.
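As a minimal, non-limiting sketch of the watchdog behaviour described above, the following Python model counts down on every clock tick and is restarted by each feed; once the countdown reaches zero, a timeout is latched. The class name `WatchdogTimer`, the helper `run`, and the tick-based clock are assumptions introduced for illustration and do not come from the application.

```python
from typing import List, Optional


class WatchdogTimer:
    """Software model of a watchdog timer driven by an explicit clock tick."""

    def __init__(self, timeout_ticks: int) -> None:
        if timeout_ticks <= 0:
            raise ValueError("timeout_ticks must be positive")
        self.timeout_ticks = timeout_ticks
        self.remaining = timeout_ticks
        self.timed_out = False

    def feed(self) -> None:
        """Restart the countdown; this is the periodic reset ("feeding")."""
        if not self.timed_out:
            self.remaining = self.timeout_ticks

    def tick(self) -> bool:
        """Advance one time unit; return True once the timer has elapsed."""
        if not self.timed_out:
            self.remaining -= 1
            if self.remaining <= 0:
                self.timed_out = True  # the timeout signal stays latched
        return self.timed_out


def run(feed_schedule: List[bool], timeout_ticks: int) -> Optional[int]:
    """Return the tick index at which the timeout fires, or None if it never does."""
    wdt = WatchdogTimer(timeout_ticks)
    for index, fed in enumerate(feed_schedule):
        if fed:
            wdt.feed()
        if wdt.tick():
            return index
    return None
```

In this model, a monitoring process that feeds on every tick never triggers the timeout, whereas a process that stops feeding triggers it after `timeout_ticks` further ticks; what corrective action follows the timeout is left to the caller.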

It should be further noted that embedded software is inseparable from the embedded system: it is software designed for an embedded system, it is itself a kind of computer software, and it is an important component of the embedded system. An embedded system generally consists of four parts: an embedded microprocessor, peripheral hardware devices, an embedded operating system, and user applications. It is a special-purpose computer system used to control, monitor, or assist in operating machines and equipment.

However, because the system start time is affected by various factors, a long restart time can easily disrupt the normal operation of the electronic device in which the embedded system is located and compromise the operating safety of that device. For example, if the system hangs, processes cannot be scheduled and the system is restarted; however, some large systems take a long time to restart, during which certain critical processes cannot execute, which may have serious consequences. For instance, if a task is processed according to the real-time temperature, the temperature alarm may fail, so that subsequent tasks cannot be executed. As another example, the system may hang during a start-up phase such as uboot (Universal Boot Loader, a boot loader for embedded systems), causing a start-up failure; this may lead to repeated reboots or hangs, so that the system cannot run normally. Here, uboot is open-source software released under the GPL (General Public License) and can be regarded as a comprehensive bare-metal program.

To at least partially solve the above problems, in this embodiment, in a heterogeneous dual system, the running state of the second operating system may be detected by the first operating system, where the first operating system and the second operating system run on different processor cores. When an abnormality requiring a system restart is detected in the second operating system (i.e., the main system), the first operating system (i.e., the secondary system) first starts the specified processes (i.e., the critical processes) backed up in the first operating system and then controls the restart of the second operating system. In this way, the critical processes do not stop during an abnormal start of the embedded software, and the operating safety of the embedded system is improved.

Here, a heterogeneous dual system is a system in which two cores run different, relatively independent operating systems; the cores are isolated from each other and exchange information through inter-core communication. The heterogeneous dual system may be an embedded system and may include a first operating system and a second operating system. Because the two systems are independent of each other, one can serve as the main system and the other as the secondary system, with the secondary system monitoring the main system and performing fault recovery.

Optionally, the first operating system may be an operating system with explicit, fixed time constraints: all processing (task scheduling) must complete within those constraints, otherwise the system fails. It may be an RTOS (Real-Time Operating System), e.g., FreeRTOS or RTLinux, or a real-time operating system used in other embedded systems. The second operating system does not have this feature. It generally adopts a fair task scheduling algorithm; as the number of threads/processes increases, they must share CPU (Central Processing Unit) time, so task scheduling is non-deterministic. It may be called a non-real-time operating system, for example Contiki, HeliOS, or Linux (in full, GNU/Linux, a freely distributable Unix-like operating system), or a non-real-time operating system used in other embedded systems.

The RTOS is an operating system that can receive and process external events or data quickly enough that the processing result can control the production process or prompt a rapid response from the processing system within a specified time; it schedules all available resources to complete real-time tasks and controls all real-time tasks so that they run in a coordinated and consistent manner. The Linux system is a POSIX-based (Portable Operating System Interface) operating system that supports multiple users, multitasking, and multiple CPUs.

Accordingly, the services allocated to the first operating system are typically real-time services, i.e., services that must be scheduled within a specified time: the processor must process them quickly enough that the result can control the production process or prompt a rapid response from the processing system within the specified time. As a typical scenario, control of a robot in industrial control is a real-time service: the system must take timely action on detecting a robot malfunction, otherwise serious consequences may follow. The services allocated to the second operating system are typically non-real-time services, i.e., services that are insensitive to scheduling time and tolerate some scheduling delay, e.g., reading sensor data from a temperature sensor in a server.

Optionally, the operation control of the embedded system may be applied on a chip, for example, but not limited to, an x86 architecture chip, an ARM (Advanced RISC Machine) architecture chip, a RISC-V (the fifth-generation reduced instruction set computer architecture) chip, or a MIPS (Microprocessor without Interlocked Pipeline Stages) architecture chip. The chip may be, without limitation, any chip that allows multiple operating systems to run on the same processor; for example, the chip may be a BMC chip.

In one exemplary embodiment, the first operating system and the second operating system run on different processor cores of a multi-core processor of a BMC (Baseboard Management Controller) chip; alternatively, the first operating system runs on a coprocessor of the BMC chip and the second operating system runs on a processor core of the multi-core processor of the BMC chip. That is, the second operating system runs on one or more processor cores of the multi-core processor of the BMC chip, and the first operating system runs either on a processor core different from those on which the second operating system runs, or on a coprocessor of the multi-core processor.

For example, taking the first operating system to be an RTOS system and the second operating system to be a Linux system, so that the two form a heterogeneous dual system, the embedded system may be as shown in FIG. 3. The embedded system includes a chip and at least two operating systems; the chip includes a processor, and a virtual channel is set up in a memory of the processor. The at least two operating systems run on the processor and communicate using the communication method of this embodiment. The CPU0 core runs the Linux system, which runs the original service code, as the main system; the CPU1 core or a coprocessor runs the RTOS system as the secondary system.

The first operating system may include an application layer and/or a driver layer, and the second operating system may also include an application layer and/or a driver layer, where the first operating system uses the processor core CPU0 to perform tasks, the second operating system uses the processor core CPU1 to perform tasks, and CPU0 and CPU1 communicate by means of interrupts. The application layer provides a human-machine interface for the user and flexibly implements the specific functions the user requires. The driver layer communicates with the hardware and can read and write hardware registers; it also provides a unified interface to the application layer, through which it receives the data passed down by the application layer. Designing and managing the application layer and the driver layer as separate layers makes the program easier to maintain and port.

As for how the first operating system detects the running state of the second operating system, the first operating system may monitor the second operating system through inter-core communication and thereby obtain its running state. As for when the first operating system detects the second operating system, detection may be set to occur at regular intervals, for example every five minutes.

In this embodiment, taking the first operating system to be an RTOS system and the second operating system to be a Linux system as an example, the RTOS system may check the Linux system periodically to detect its running state. Here, detecting the running state of the Linux system may include, but is not limited to, checking whether scheduling is currently normal and whether processes are running normally. For example, it can be detected whether the Linux system has started normally and whether an abnormality has occurred in it.
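One possible way to realise such periodic detection, offered only as a hedged sketch, is a heartbeat counter: the monitored system increments a counter that the monitoring system reads over the inter-core channel at each check, and the monitored system is judged abnormal once the counter has stayed unchanged for several consecutive checks. The heartbeat mechanism, the names `HeartbeatMonitor` and `RunState`, and the `max_missed` threshold are assumptions for illustration; the application itself only states that detection uses inter-core communication at regular intervals.

```python
from dataclasses import dataclass
from enum import Enum


class RunState(Enum):
    NORMAL = "normal"
    ABNORMAL = "abnormal"


@dataclass
class HeartbeatMonitor:
    """Tracks a heartbeat counter published by the monitored operating system."""

    max_missed: int = 3   # consecutive unchanged readings tolerated (hypothetical value)
    last_seen: int = -1   # last counter value observed
    missed: int = 0       # consecutive checks without progress

    def check(self, counter: int) -> RunState:
        """Called once per detection interval with the latest counter value."""
        if counter != self.last_seen:
            # The monitored system made progress since the previous check.
            self.last_seen = counter
            self.missed = 0
        else:
            self.missed += 1
        if self.missed >= self.max_missed:
            return RunState.ABNORMAL
        return RunState.NORMAL
```

A single missed heartbeat is tolerated here so that a briefly delayed update is not mistaken for a hang; how many misses to tolerate would depend on the chosen detection interval.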

For example, as shown in FIG. 4, code may start different cores to bring up the different systems, loading the main system and the secondary system respectively. Once loaded, the main system runs the service code, while the secondary system continuously monitors the main system to check whether it has started normally and whether an abnormality has occurred; the main system and the secondary system may interact through inter-core communication.

Illustratively, as shown in FIG. 5, the embedded system may include a chip and at least two operating systems, wherein the chip comprises a processor 502, a hardware controller 504, a first bus 506, and a second bus 508. The bandwidth of the first bus 506 is higher than the bandwidth of the second bus 508; the first bus 506 is configured in a multi-master multi-slave mode, and the second bus 508 is configured in a single-master multi-slave mode. The at least two operating systems run on the processor 502, communicate with each other via the first bus 506, and control the hardware controller 504 via the second bus 508.

In step S204, when it is detected that the second operating system has a specified exception and a set of specified processes in the second operating system have a target specified process that runs normally, the target specified process in the second operating system is stopped by the first operating system.

After the first operating system detects the second operating system, the detection result may indicate whether the second operating system has an abnormality: either the processes in the second operating system are all running normally, or an abnormality exists in the second operating system. There may be multiple kinds of abnormality in the second operating system: abnormalities from which the second operating system can recover by itself (i.e., abnormalities that do not require the second operating system to restart in order to recover), or abnormalities from which it cannot recover by itself (i.e., abnormalities for which the second operating system must restart in an attempt to recover).

An abnormality from which the second operating system cannot recover by itself (which may be a system abnormality of the second operating system itself or an abnormality of a process running in the second operating system) may be specified through configuration information. That is, if the specified abnormality exists in the second operating system, the second operating system needs to be restarted. To prevent a critical process (for example, a temperature monitoring process) from being out of operation for a long time because of the system restart, which would affect the safety of system operation, the critical process may be backed up in the first operating system, and when the second operating system restarts, the critical process backed up in the first operating system is started. To avoid process running abnormalities, the critical processes in the second operating system may be stopped. Here, the critical processes may be specified through configuration information; there may be one or more of them, and for a given device they form a group of specified processes.
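The role of the configuration information can be illustrated with the following sketch, which holds the abnormalities that require a restart and the group of specified processes in one immutable structure. The class `RecoveryConfig` and the identifiers `scheduler_stalled`, `boot_hang`, `process_suspended`, and `temperature_monitor` are hypothetical names chosen for illustration; the application does not define a concrete configuration format.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class RecoveryConfig:
    """Configuration naming restart-requiring abnormalities and critical processes."""

    specified_anomalies: FrozenSet[str]   # abnormalities that cannot self-recover
    specified_processes: FrozenSet[str]   # critical processes backed up in the first OS

    def needs_restart(self, anomaly: str) -> bool:
        """True if the detected abnormality is one of the specified abnormalities."""
        return anomaly in self.specified_anomalies


# Hypothetical example values; a real device would load these from its own configuration.
CONFIG = RecoveryConfig(
    specified_anomalies=frozenset({"scheduler_stalled", "boot_hang"}),
    specified_processes=frozenset({"temperature_monitor"}),
)
```

With this example configuration, `CONFIG.needs_restart("boot_hang")` is true, while an abnormality that is not listed, such as `"process_suspended"`, is treated as one the second operating system can recover from without restarting.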

If the set of specified processes in the second operating system includes a target specified process that is running normally, the first operating system may stop the target specified process in the second operating system. Here, the target specified process may be a specified process (i.e., a critical process) that is running normally. For a specified process in the set that is running abnormally, the first operating system may also attempt to stop it in the second operating system, or may ignore it.

For example, when the RTOS system detects that the specified abnormality exists in the Linux system, the RTOS system may stop the critical processes in the Linux system.

In step S206, after the target specified process in the second operating system is stopped by the first operating system, a set of specified processes backed up in the first operating system is started by the first operating system, and the second operating system is controlled by the first operating system to restart.

After stopping the target specified process in the second operating system, the first operating system may also start the set of specified processes backed up in the first operating system. For example, the RTOS system may start the critical processes that it has backed up in advance. The backup may be performed manually, or automatically by using the built-in functions of other tools. As for the starting manner, the set of specified processes may be started by triggering a system instruction of the operating system, such as a process start instruction, or in other ways.

Meanwhile, the first operating system may also control the second operating system to restart. Here, controlling the second operating system to restart may be performed after the first operating system stops the target specified process in the second operating system, or after the set of specified processes backed up in the first operating system is started, so that the critical processes run without interruption during an abnormal restart of the embedded software.

For example, under a heterogeneous dual-system architecture, a Linux system is used as the main system and an RTOS system is used as the secondary system. The RTOS system monitors the Linux processes and the running state of the system in real time. If system scheduling is abnormal and the system needs to be restarted, the RTOS system starts the backed-up critical processes. After starting them, the RTOS system may control the Linux system to restart, and may also wait for the main system to restart successfully.

Through the above steps, the running state of the second operating system is detected by the first operating system; when it is detected that the second operating system has the specified abnormality and the set of specified processes in the second operating system includes a target specified process that is running normally, the target specified process in the second operating system is stopped by the first operating system; after the target specified process is stopped, the set of specified processes backed up in the first operating system is started by the first operating system, and the second operating system is controlled by the first operating system to restart. This solves the problem in the related art that an overly long restart time lowers the safety of device operation, and improves the safety of device operation.
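As a non-limiting illustration, the order of operations in steps S204 and S206 can be sketched as follows. The class names `SecondOS` and `FirstOS`, the process names, and the returned log strings are hypothetical and are introduced only for this sketch; the sketch also assumes that abnormally running specified processes are ignored, which is one of the two options described above.

```python
from dataclasses import dataclass, field


@dataclass
class SecondOS:
    """Hypothetical stand-in for the second operating system (e.g. Linux)."""
    running: dict          # process name -> True if the process runs normally
    restarts: int = 0

    def stop(self, name):
        self.running.pop(name, None)

    def restart(self):
        self.restarts += 1


@dataclass
class FirstOS:
    """Hypothetical stand-in for the first operating system (e.g. an RTOS)."""
    backups: set                              # names of backed-up specified processes
    active: set = field(default_factory=set)  # backups currently running here

    def recover(self, second, has_specified_exception):
        if not has_specified_exception:
            return []
        log = []
        # S204: stop only the specified processes that still run normally.
        for name in sorted(self.backups):
            if second.running.get(name):
                second.stop(name)
                log.append(f"stop {name}")
        # S206: start the backed-up copies, then restart the second OS.
        self.active |= self.backups
        log.append("start backups")
        second.restart()
        log.append("restart second OS")
        return log
```

In this sketch the backups are started before the restart is triggered; the description also permits triggering the restart right after the stop step.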

In an exemplary embodiment, before detecting, by the first operating system, the running state of the second operating system, the method further includes:

S11, in the process of compiling the first operating system, compiling the process codes of the backed-up set of specified processes into the first operating system.

In this embodiment, in order to improve the convenience of process control, the process codes of the set of specified processes are compiled into the first operating system while the first operating system is being compiled, so that the compiled processes can be started directly when the specified processes need to run in the first operating system.

According to this embodiment, the process codes of the critical processes are compiled into the secondary system when the secondary system is compiled, so that the system starting efficiency can be improved.
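By way of analogy only, building the backed-up process code into the first operating system can be pictured as a fixed table of entry points that exists as soon as the system image is loaded, so that starting a backup is a direct call rather than a load from storage. The sketch below is in Python for readability; `BACKUP_TABLE`, `backup_process`, and the two entry functions are hypothetical names, and a real secondary system would link the entry points into its firmware image.

```python
BACKUP_TABLE = {}  # process name -> entry function, fixed when the module is loaded


def backup_process(name):
    """Register an entry function in the built-in backup table."""
    def register(entry):
        BACKUP_TABLE[name] = entry
        return entry
    return register


@backup_process("temp_monitor")
def temp_monitor_entry():
    return "temp_monitor started"


@backup_process("fan_control")
def fan_control_entry():
    return "fan_control started"


def start_backups():
    """Start every built-in backup process, in name order."""
    return [BACKUP_TABLE[name]() for name in sorted(BACKUP_TABLE)]
```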

In one exemplary embodiment, detecting, by a first operating system, an operational state of a second operating system includes:

S21, monitoring the running state of the second operating system by the first operating system in an inter-core communication mode to obtain the running state of the second operating system.

In this embodiment, the heterogeneous dual systems may interact through IPC (Inter-Processor Communication, i.e., inter-core communication). It should be noted that many current multi-core chips integrate several cores, for example Cortex-M0+, M4, M7 and other cores; some chips have 2 or 3 cores, and some even have 6 or 8 cores. Different cores support different main frequencies and suit different application scenarios, so IPC is required for data interaction between them. Inter-core communication may be implemented by a Mailbox interrupt or by a message queue based on shared memory. Other implementation mechanisms are also possible, and this is not limited in this embodiment.

Alternatively, the first operating system may detect the running state of the second operating system by inter-core communication, for example by sending an inter-core communication request (such as an interrupt request) to the second operating system. The interrupt numbers used may be compatible with the existing interrupt numbers; for example, some of the reserved interrupt numbers may be used to detect the running state of the second operating system.

For example, when the program starts, the CPU1 core is woken up, or the coprocessor is used, to run the RTOS system as the secondary system. The secondary system monitors the main system by inter-core communication to check the current running condition of the main system, such as whether scheduling is normal and whether the processes are normal.

According to the embodiment, the auxiliary system detects the running state of the main system by adopting an inter-core communication mode, so that the compatibility of detecting the running state of the main system can be improved.
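One of the mechanisms mentioned above, a message queue based on shared memory, can be sketched as a single-producer, single-consumer ring buffer. This is a minimal simulation under stated assumptions: the `bytearray` stands in for a memory region visible to both cores, the header layout and the class name `SharedRing` are hypothetical, and a real implementation would additionally need memory barriers and a Mailbox interrupt (or polling) to notify the receiver.

```python
import struct


class SharedRing:
    """Single-producer, single-consumer ring of 32-bit messages in a byte buffer."""
    HEADER = struct.Struct("<II")  # head (next slot to read), tail (next slot to write)
    SLOT = struct.Struct("<I")

    def __init__(self, slots):
        self.slots = slots
        self.mem = bytearray(self.HEADER.size + slots * self.SLOT.size)

    def _offset(self, index):
        return self.HEADER.size + index * self.SLOT.size

    def send(self, msg):
        head, tail = self.HEADER.unpack_from(self.mem, 0)
        nxt = (tail + 1) % self.slots
        if nxt == head:
            return False  # queue full (one slot is kept free)
        self.SLOT.pack_into(self.mem, self._offset(tail), msg)
        self.HEADER.pack_into(self.mem, 0, head, nxt)
        return True

    def receive(self):
        head, tail = self.HEADER.unpack_from(self.mem, 0)
        if head == tail:
            return None  # queue empty
        (msg,) = self.SLOT.unpack_from(self.mem, self._offset(head))
        self.HEADER.pack_into(self.mem, 0, (head + 1) % self.slots, tail)
        return msg
```

Keeping one slot free lets the two sides distinguish a full queue from an empty one using only the head and tail indices.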

In an exemplary embodiment, monitoring, by a first operating system, an operating state of a second operating system by adopting an inter-core communication manner, to obtain the operating state of the second operating system, including:

S31, sending a second interrupt request to a second operating system by adopting an inter-core communication mode through the first operating system, wherein the second interrupt request is used for requesting to acquire the system state of the second operating system;

S32, determining the system state of the second operating system based on whether a response message of the second interrupt request returned by the second operating system is received, wherein the running state of the second operating system comprises the system state of the second operating system.

In this embodiment, the monitoring of the running state of the second operating system may be implemented by interacting with the second operating system. The first operating system may send a second interrupt request to the second operating system in an inter-core communication manner, where the second interrupt request may be used to request acquisition of a system state of the second operating system. The acquired system state of the second operating system may be used to indicate a state of a system schedule of the second operating system.

After sending the second interrupt request to the second operating system, the first operating system may determine whether the second operating system communicates normally based on whether a response message to the second interrupt request is received from the second operating system. If the response message is received, the first operating system may determine the system state of the second operating system based on the response message. The running state of the second operating system may include the system state of the second operating system.

Through the embodiment, the auxiliary system interacts with the main system through the interrupt request to determine the system state of the main system, so that the convenience of system state acquisition can be improved, and the accuracy of operating state acquisition of the operating system is improved.
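A minimal sketch of steps S31 and S32 follows, assuming the response arrives on a queue and that a missing response within a timeout is treated as abnormal communication. The function name `probe_system_state`, the request string, the `"scheduling"` field, and the timeout are all hypothetical; the description does not define a message format.

```python
import queue


def probe_system_state(send_irq, responses, timeout_s):
    """Send a state request (S31) and classify the second OS by its reply (S32)."""
    send_irq("GET_SYSTEM_STATE")
    try:
        reply = responses.get(timeout=timeout_s)
    except queue.Empty:
        return "no_response"  # no reply: communication is treated as abnormal
    # A reply was received: derive the system state from its content.
    return reply.get("scheduling", "unknown")
```

Here `send_irq` stands for whatever raises the inter-core interrupt, and `responses` for the channel on which the second operating system answers.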

S41, sending a third interrupt request to a target process in a second operating system by adopting an inter-core communication mode through a first operating system, wherein the third interrupt request is used for requesting to acquire the process state of the target process in the second operating system;

S42, determining the process state of the target process in the second operating system based on whether a response message of the third interrupt request returned by the target process is received, wherein the running state of the second operating system comprises the process state of the target process in the second operating system.

In this embodiment, the monitoring of the running state of the second operating system may be implemented by directly interacting with a process in the second operating system. The first operating system may send a third interrupt request to the target process in the second operating system in an inter-core communication manner, where the third interrupt request may be used to request to obtain a process state of the target process in the second operating system. The obtained process state of the target process may be used to indicate whether the target process is running normally. Here, the target process may be any process of a set of designated processes, or may be another process other than the set of designated processes, which is not limited in this embodiment.

After sending the third interrupt request to the target process in the second operating system, the first operating system may determine whether the target process communicates normally based on whether a response message to the third interrupt request is received from the target process. If the response message is received, the first operating system may determine the process state of the target process in the second operating system based on the response message. The running state of the second operating system may include the process state of the target process in the second operating system.

According to this embodiment, the secondary system directly obtains the process state of a specific process of the main system by sending an interrupt request, so that the convenience of obtaining the process state can be improved, and the accuracy of obtaining the running state of the operating system is improved.

S51, reading, by the first operating system, the running state information of the second operating system stored in the shared memory between the first operating system and the second operating system, to obtain the running state of the second operating system.

In this embodiment, the second operating system may transfer its own running state to the first operating system through the shared memory with the first operating system. The first operating system may read the running state information of the second operating system stored in the shared memory with the second operating system, where the running state information is used to indicate the running state of the second operating system or a process in the second operating system, and based on whether the running state information of the second operating system is read and the read running state information of the second operating system, the running state of the second operating system may be determined.

According to the embodiment, the running state of the main system is monitored through the shared memory between the main system and the auxiliary system, so that the real-time requirement on the acquisition of the running state information can be reduced, and the utilization rate of system resources is improved.
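The shared-memory approach can be sketched as follows. The sketch assumes, purely for illustration, that the second operating system publishes a heartbeat counter and a state code at a fixed offset, and that the first operating system treats a heartbeat that stops changing as a sign that no fresh running state information is available. The layout `STATUS`, the names `write_status` and `StatusReader`, and the stale-poll threshold are hypothetical.

```python
import struct

STATUS = struct.Struct("<II")  # (heartbeat counter, state code): hypothetical layout


def write_status(mem, heartbeat, state):
    """Second OS side: publish the latest status into shared memory."""
    STATUS.pack_into(mem, 0, heartbeat, state)


class StatusReader:
    """First OS side: poll the shared status and detect a stalled heartbeat."""

    def __init__(self, max_stale_polls):
        self.max_stale_polls = max_stale_polls
        self.last_heartbeat = None
        self.stale_polls = 0

    def poll(self, mem):
        heartbeat, state = STATUS.unpack_from(mem, 0)
        if heartbeat == self.last_heartbeat:
            self.stale_polls += 1
        else:
            self.last_heartbeat, self.stale_polls = heartbeat, 0
        stalled = self.stale_polls >= self.max_stale_polls
        return state, stalled
```

Because the reader polls at its own pace, neither side has to service an interrupt for each status update.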

In an exemplary embodiment, after detecting the running state of the second operating system by the first operating system, the method further includes:

S61, judging whether the second operating system has the specified abnormality according to the running state of the second operating system, wherein the specified abnormality comprises at least one of the following: abnormal system scheduling, an unrecoverable process fault, a communication timeout, and a system hang.

In this embodiment, according to the running state of the second operating system, the first operating system may determine whether a serious abnormality exists in the second operating system, that is, an abnormality for which recovery needs to be attempted by restarting the system. Such a serious abnormality is the specified abnormality, which may include, but is not limited to, at least one of the following: abnormal system scheduling, an unrecoverable process fault, a communication timeout, and a system hang.

For example, the secondary system may determine whether the main system has a serious abnormality according to the information acquired through inter-core communication, and if so, notify the main system through inter-core communication to stop the critical processes, and then execute the foregoing system restart procedure to perform fault recovery. Serious abnormalities here are abnormalities that require a system restart, and may include, but are not limited to, at least one of the following: a scheduling abnormality; a process that runs abnormally and cannot be recovered; a communication timeout; the main system hanging.

By the embodiment, the accuracy of judging the running state of the second operating system can be improved by judging whether the second operating system has the specified abnormality to determine the running state of the second operating system.
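The judgment in S61 can be sketched as a membership test against a configured set of specified abnormalities. The enumeration below lists the four abnormality types named in this embodiment; `RECOVERABLE_PROCESS_FAULT` is a hypothetical example of an abnormality that the second operating system can recover from by itself, and the names `Anomaly`, `SPECIFIED`, and `has_specified_exception` are assumptions of this sketch.

```python
from enum import Enum, auto


class Anomaly(Enum):
    SCHEDULING_ABNORMAL = auto()
    UNRECOVERABLE_PROCESS_FAULT = auto()
    COMMUNICATION_TIMEOUT = auto()
    SYSTEM_HANG = auto()
    RECOVERABLE_PROCESS_FAULT = auto()  # hypothetical self-recoverable case


# Abnormalities that require a restart; in practice this set would come
# from configuration information.
SPECIFIED = frozenset({
    Anomaly.SCHEDULING_ABNORMAL,
    Anomaly.UNRECOVERABLE_PROCESS_FAULT,
    Anomaly.COMMUNICATION_TIMEOUT,
    Anomaly.SYSTEM_HANG,
})


def has_specified_exception(observed, specified=SPECIFIED):
    """Return True if any observed abnormality is one of the specified ones."""
    return any(anomaly in specified for anomaly in observed)
```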

In one exemplary embodiment, in a case where it is detected that the second operating system has a specified exception and a set of specified processes in the second operating system have a target specified process that runs normally, stopping, by the first operating system, the target specified process in the second operating system includes:

S71, under the condition that the second operating system is detected to have the specified abnormality and a group of specified processes in the second operating system have the target specified processes which run normally, sending a fourth interrupt request to the second operating system by adopting an inter-core communication mode through the first operating system, wherein the fourth interrupt request is used for requesting to stop the target specified processes in the second operating system.

In this embodiment, when it is detected that the second operating system has the specified abnormality and the set of specified processes in the second operating system includes a target specified process that is running normally, the first operating system may send a fourth interrupt request to the second operating system by inter-core communication. For example, the secondary system notifies the main system through inter-core communication to stop the critical processes. Here, the fourth interrupt request is used to request that the target specified process in the second operating system be stopped. The fourth interrupt request is similar to the interrupt requests described in the previous embodiments, and is not described again here.

By adopting the embodiment, the auxiliary system requests to stop the critical process in the main system in the interrupt request mode, so that the convenience and compatibility of process operation control (namely, compatibility with the existing interrupt communication mode) can be improved.

In one exemplary embodiment, initiating, by a first operating system, a set of specified processes backed up in the first operating system includes:

S81, starting a temperature detection process backed up in the first operating system through the first operating system, wherein the temperature detection process is a process for detecting the temperature of equipment where the embedded system is located, and a group of designated processes comprise the temperature detection process.

In this embodiment, the set of designated processes may be critical processes in the embedded system, and may include a temperature detection process. The first operating system may initiate a backup temperature detection process in the first operating system. The temperature detection process can be used for detecting the temperature of the device in which the embedded system is located or a designated component in the device.

In the related art, when a main system process cannot be scheduled or the system cannot run normally, the main system is generally restarted directly. However, this approach may result in a critical process not running for a long time, which may have serious consequences. Taking the temperature monitoring process as an example, the temperature may rise suddenly while the main system is restarting. Because the temperature data cannot be obtained immediately at that time, subsequent handling, such as fire extinguishing or cooling of the electronic device, cannot be performed immediately, which may cause serious consequences such as a fire or damage to the circuit board.

In this embodiment, under the heterogeneous dual system, the first operating system (which may be used as the secondary system) and the second operating system (which may be used as the main system) are independent of each other. While the second operating system is restarting, the first operating system may take over the critical process (for example, the temperature detection process), and after the second operating system has started, the critical process is handed back to the second operating system. In this way the critical process can still run normally when the main system crashes and restarts, the above defects are avoided, and the reliability of the system is greatly improved.

By starting the backed-up temperature detection process in the secondary system, the temperature detection process is not interrupted, and serious consequences such as a fire or damage to the circuit board are avoided, so that the safety of the embedded system is improved.
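The takeover and hand-back of the temperature detection process can be illustrated with a small simulation. It assumes one temperature sample per monitoring period and a flag telling whether the main system is up in that period; the function name `run_temperature_detection`, the owner labels, and the temperature limit are hypothetical.

```python
def run_temperature_detection(readings, main_up, limit_c):
    """Yield (owner, temperature, over_limit) for every monitoring period.

    readings: temperature samples, one per period.
    main_up:  booleans, True while the main system is running in that period.
    """
    for temperature, up in zip(readings, main_up):
        # While the main system restarts, the backed-up process in the
        # secondary system takes the sample, so no period is skipped.
        owner = "main" if up else "secondary"
        yield owner, temperature, temperature > limit_c
```

For example, a temperature spike that occurs while the main system is restarting is still observed, in this sketch by the secondary system's copy of the process.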

In one exemplary embodiment, controlling, by a first operating system, a second operating system to perform a system restart includes:

S91, forcing the second operating system to restart through the first operating system.

The manner in which the first operating system controls the second operating system to restart may be a forced restart. The forced restart may be implemented in hardware, through IPMI (Intelligent Platform Management Interface) commands, or by other means.

According to the embodiment, the main system is forcedly restarted through the auxiliary system, so that the system restarting failure caused by communication abnormality of the main system can be avoided, and the success rate of the system restarting is improved.

The operation control method of the embedded system in the present embodiment is explained below in conjunction with an alternative example. In this embodiment, the first operating system is a secondary system (or called an auxiliary system) in the heterogeneous dual system, which is an RTOS system, the second operating system is a main system in the heterogeneous dual system, which is a Linux system, and the designated process is a critical process.

In the related art, if the embedded system hangs and processes cannot be scheduled, the system needs to be restarted. However, some large systems (such as Linux systems) take a long time to restart, during which some critical processes cannot run, which can have serious consequences.

In this regard, the present alternative example provides a scheme for enhancing the running reliability of the system based on a heterogeneous dual system. The CPU0 core continues to run the Linux system as before and runs the original service code as the main system. The CPU1 core or the coprocessor runs the RTOS system as the secondary system, which monitors the running state of the Linux processes and of the system in real time. If system scheduling is abnormal and the critical processes cannot run normally, the RTOS system starts the backed-up critical processes, restarts the main system, and keeps running them until the main system takes over, which solves the above problems.

As shown in fig. 6, the process of the secondary system monitoring the system of the primary system for anomalies and recovering from anomalies may include:

Step 1, the RTOS system is started as the secondary system;

Step 2, the RTOS system detects the running state of the main system (Linux system) through inter-core communication;

Step 3, it is judged whether the main system is running abnormally (a process cannot be scheduled or the system cannot run normally); if the main system has a serious abnormality, step 4 is executed, and if the main system is running normally (there is no serious abnormality, or the main system can recover by itself), step 2 is executed and monitoring continues;

Step 4, the RTOS system stops the critical processes of the main system through inter-core communication;

Step 5, the RTOS system starts the backed-up critical processes in the RTOS system, so that the critical processes are prevented from not running for a long time;

Step 6, the RTOS system restarts the main system, wherein the RTOS system notifies or forces the main system to restart (during this process, the critical processes run in the RTOS system without interruption);

Step 7, the main system finishes starting and takes over the critical processes from the RTOS system, the secondary system returns to step 2, and monitoring of the main system continues.

Through this alternative example, under the heterogeneous dual system, fault conditions of the main system are detected by the relatively independent secondary system and the corresponding recovery strategy is executed. That is, the secondary system monitors the main system for system abnormalities and ensures that the critical processes do not stop executing during an abnormal restart of the embedded software, so that the serious consequences caused by the critical processes being unable to run under certain conditions can be effectively prevented, and the running safety of the system is improved.
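The loop formed by steps 2 to 7 above can be written as a small state machine. This is a sketch of the control flow only: the enumeration `Step`, the functions `next_step` and `walk`, and the representation of the step 3 verdict as a boolean are assumptions introduced for illustration, and the actions taken in each step are omitted.

```python
from enum import Enum


class Step(Enum):
    MONITOR = 2        # detect the running state of the main system
    JUDGE = 3          # decide whether a serious abnormality exists
    STOP_CRITICAL = 4  # stop the critical processes of the main system
    START_BACKUP = 5   # start the backed-up critical processes
    RESTART_MAIN = 6   # notify or force the main system to restart
    HAND_BACK = 7      # main system takes over the critical processes


_LINEAR = {
    Step.MONITOR: Step.JUDGE,
    Step.STOP_CRITICAL: Step.START_BACKUP,
    Step.START_BACKUP: Step.RESTART_MAIN,
    Step.RESTART_MAIN: Step.HAND_BACK,
    Step.HAND_BACK: Step.MONITOR,
}


def next_step(step, seriously_abnormal=False):
    """Return the step that follows `step`; only step 3 branches."""
    if step is Step.JUDGE:
        return Step.STOP_CRITICAL if seriously_abnormal else Step.MONITOR
    return _LINEAR[step]


def walk(verdicts):
    """Return the step numbers visited, one monitoring round per verdict."""
    visited = []
    for abnormal in verdicts:
        step = Step.MONITOR
        while True:
            visited.append(step.value)
            step = next_step(step, abnormal)
            if step is Step.MONITOR:
                break
    return visited
```

A normal round visits only steps 2 and 3, while a round with a serious abnormality runs through steps 2 to 7 before monitoring resumes.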

In an exemplary embodiment, after the second operating system is controlled by the first operating system to perform the system restart, the method further includes:

S101, under the condition that the second operating system fails to restart from the target image, the second operating system is controlled to restart from the standby image through the first operating system, wherein the target image and the standby image are images of the second operating system.

The second operating system may restart from the target image (which may be the image from which the second operating system starts by default), and the restart may fail because of problems in the system programs in the target image or other abnormalities. In this regard, a dual-image redundancy scheme may be configured in the embedded software: a standby image matching the target image is configured in advance, and the second operating system restarts from the standby image when it fails to restart from the target image.

However, in some cases, the second operating system may fail to start from the standby image by itself; for example, if the system hangs before the redundancy module has started, the main system cannot start the standby image. In order to improve the success rate of the system restart, the first operating system may control the second operating system to restart from the standby image, wherein the target image and the standby image are both images of the second operating system.

It should be noted that the two systems of the heterogeneous dual system run independently. If the main system hangs at an unrecoverable point, the secondary system can still run normally, so having the secondary system control the main system to restart from the standby image avoids the situation in which the main system hangs and a subsequent restart process (for example, restarting from the standby image) cannot be executed. The target image and the standby image may be stored in a ROM (Read-Only Memory) or in other memories. Here, an image may refer to a single file made from a specific series of files according to a certain format, or may refer to other files.

According to this embodiment, by configuring the dual-image redundancy scheme and having the secondary system control the main system to restart from the standby image when the main system fails to restart from the target image, the success rate of restarting the main system can be ensured, and the fault tolerance of device operation can be effectively improved.
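The dual-image fallback can be sketched as trying the images in order. The function name `restart_with_fallback`, the image labels, and the `boot` callback (which stands for whatever the first operating system does to restart the second operating system from a given image and report success) are hypothetical.

```python
def restart_with_fallback(boot, images=("target", "standby")):
    """Try each image in order; return the image that booted, or None."""
    for image in images:
        if boot(image):
            return image
    return None
```

The decision is made by the caller, that is, by the first operating system, so it still works when the second operating system hangs before its own redundancy logic runs.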

In one exemplary embodiment, controlling, by the first operating system, the second operating system to restart from the standby image in the event that the second operating system fails to restart from the target image, includes:

S111, setting an image identifier in a start register corresponding to the second operating system through the first operating system under the condition that the second operating system fails to restart from the target image, so as to switch the image used by the second operating system for restarting from the target image to the standby image;

S112, controlling the second operating system to restart from the standby image identified by the image identifier in the start register through the first operating system.

In order to improve the security of the embedded system, in this embodiment, when the second operating system fails to restart from the target image, the first operating system may set the image identifier in the start register corresponding to the second operating system, so as to switch the image used by the second operating system for the system restart to the standby image.

The image identifier in the start register is a unique identifier used to distinguish the different image files of one operating system. The image identifier may be a flag bit: when the value of the flag bit is 0 (or 1), it indicates that the second operating system restarts from the target image, and when the value of the flag bit is 1 (or 0), it indicates that the second operating system restarts from the standby image; other identification manners may also be adopted. The first operating system may set the image identifier by reading and setting the entry address of the image start code.

For example, the secondary system forces the main system to start from the other image by modifying the start register corresponding to the main system, thereby solving the problem that the main system cannot restart from the standby image because the main system hangs or the like, and further improving the reliability of the system.

According to the embodiment, the auxiliary system can force the main system to switch the mirror image to start by modifying the starting register corresponding to the main system, so that the success rate of the main system to switch the mirror image to start can be improved.
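The flag-bit variant of the image identifier can be sketched with plain bit operations. The sketch assumes that bit 0 of the start register is the image flag, with 0 selecting the target image and 1 selecting the standby image; the bit position, the constant `IMAGE_SELECT_BIT`, and the function names are hypothetical, and the register is modelled as an integer rather than a memory-mapped address.

```python
IMAGE_SELECT_BIT = 0x1  # hypothetical flag bit: 0 = target image, 1 = standby image


def select_image(boot_reg, standby):
    """Return the start register value with the image flag set or cleared."""
    if standby:
        return boot_reg | IMAGE_SELECT_BIT
    return boot_reg & ~IMAGE_SELECT_BIT


def selected_image(boot_reg):
    """Decode which image the start register currently selects."""
    return "standby" if boot_reg & IMAGE_SELECT_BIT else "target"
```

Only the flag bit is changed, so any other fields in the register keep their values.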

In one exemplary embodiment, in controlling the restart of the second operating system from the standby image by the first operating system, the method further includes:

S121, determining whether the second operating system runs or not by adopting an inter-core communication mode through the first operating system, and determining the running stage of the second operating system through a shared memory between the first operating system and the second operating system so as to determine whether the second operating system is restarted from the standby image.

In order to ensure the accuracy of the determination of the restart state of the primary system, in the present embodiment, during the process of controlling the restart of the second operating system from the standby image by the first operating system, it may be determined whether the second operating system is running and the running phase of the second operating system, respectively, so as to determine whether the restart of the second operating system from the standby image is completed. The manner in which the second operating system is restarted from the target image is similar, and will not be described in detail herein.

Alternatively, whether the second operating system is running may be determined by inter-core communication or in other ways, and the running stage of the second operating system may be determined through the shared memory between the first operating system and the second operating system. The shared memory may be, but is not limited to, a memory space in the chip that can be accessed by both the first operating system and the second operating system. The first operating system may determine whether the second operating system is running by reading the running state information and context information stored in the shared memory. The running stage of the second operating system may be restart not completed, restart completed, or a specific stage of the restart, which is not specifically limited in this embodiment.

For example, when the program starts, the CPU1 core is woken up, or the coprocessor is used, to run the RTOS system as the secondary system. During the starting stage of the Linux system, the RTOS system checks whether the Linux system is running through inter-core communication, and checks the running stage of the Linux system through the shared memory.

According to the embodiment, the auxiliary system determines whether the main system operates or not in an inter-core communication mode, and checks the operation stage of the main system through the shared memory, so that the accuracy of determining the restarting state of the system can be ensured.

In an exemplary embodiment, the above method further comprises:

S131, sending a first interrupt request to a second operating system by adopting an inter-core communication mode through the first operating system, wherein the first interrupt request is used for requesting to acquire the starting state of the second operating system;

S132, reading information stored in the shared memory through the first operating system to determine the operation stage of the second operating system;

S133, when a response message returned by the second operating system is not received within the preset starting time of the second operating system, and the second operating system is determined to be started incompletely according to the running state parameter in the shared memory, the second operating system is determined to be restarted abnormally.

In this embodiment, when determining whether the second operating system is restarted successfully, the inter-core communication manner may be used to send a first interrupt request to the second operating system to request to obtain the startup state of the second operating system. Whether the second operating system operates is determined by judging whether a response message returned by the second operating system is received within a preset starting time (for example, five minutes, ten minutes, or other time, which can be configured according to requirements) of the second operating system. The manner of sending the first interrupt request to the second operating system is similar to that in the foregoing embodiment, and will not be described herein.

Meanwhile, the first operating system may read the information stored in the shared memory to determine an operation phase of the second operating system, where the operation phase may represent an operation state of the second operating system. And if the response message returned by the second operating system is not received within the starting time and the fact that the starting of the second operating system is not completed is determined according to the running state parameters in the shared memory, determining that the restarting of the second operating system is abnormal. Otherwise, continuing to detect until the detection ending condition is met, wherein the detection ending condition comprises: receiving a response message returned by the second operating system within the starting time, and determining that the starting of the second operating system is completed according to the running state parameters in the shared memory; the boot time arrives (i.e., the set boot time of the second operating system has arrived).

For example, when the starting time arrives, the inter-core communication cannot get the response of the main system, and the running state in the shared memory is not the starting completion, the starting abnormality of the main system is determined; otherwise, continuously detecting, continuously checking whether the Linux system is operated or not through inter-core communication, and checking the operation stage of the Linux system through the shared memory.
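As a non-limiting illustration, the judgment of S131 to S133 can be sketched in Python as follows. The sketch assumes that the inter-core reply and the shared-memory stage have already been sampled. The names BootStage, RestartStatus and judge_restart, and the individual stage values, are hypothetical and are not defined by this embodiment. How mixed evidence is treated when the startup time arrives is not specified above, so the sketch simply keeps detecting in that case.

```python
from enum import Enum


class BootStage(Enum):
    """Hypothetical running stages written to shared memory by the second OS."""
    NOT_STARTED = 0
    BOOTLOADER = 1
    KERNEL = 2
    BOOT_COMPLETE = 3


class RestartStatus(Enum):
    IN_PROGRESS = "in_progress"   # keep detecting
    COMPLETED = "completed"       # restart finished
    ABNORMAL = "abnormal"         # restart abnormal (S133)


def judge_restart(response_received, stage, elapsed_s, startup_time_s):
    """Combine the inter-core reply (S131) with the shared-memory stage (S132)."""
    boot_done = stage is BootStage.BOOT_COMPLETE
    if response_received and boot_done:
        return RestartStatus.COMPLETED
    # S133: abnormal only when the startup time has arrived, no reply was
    # received, and the shared memory does not report a completed startup.
    if elapsed_s >= startup_time_s and not response_received and not boot_done:
        return RestartStatus.ABNORMAL
    # Assumption of this sketch: every other combination means "keep detecting".
    return RestartStatus.IN_PROGRESS
```

With a startup time of 300 seconds, a missing reply and a stage of KERNEL at 300 seconds yields ABNORMAL, whereas the same observation at 10 seconds yields IN_PROGRESS.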

Here, the interrupt request may be transmitted between the systems by means of, but not limited to, a software protocol, or may be transmitted through a hardware module. Taking a hardware mailbox module as an example, a mailbox channel can be established between the first operating system and the second operating system; service data is read and written through the storage space, and interrupt requests are transmitted through the mailbox channel.
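As a non-limiting illustration, the mailbox arrangement can be simulated in Python as follows. This is a software model only: a bytearray stands in for the storage space through which service data is read and written, and a Python callback stands in for the hardware interrupt raised over the mailbox channel. The names MailboxChannel and linux_side_handler and the message contents are hypothetical.

```python
class MailboxChannel:
    """Software model of a mailbox channel between two operating systems."""

    def __init__(self, size=64):
        self.storage = bytearray(size)  # stands in for the shared storage space
        self.length = 0                 # number of valid bytes in the storage
        self.handler = None             # receiver's interrupt handler

    def register_handler(self, handler):
        """Receiver side: install the routine run when the interrupt fires."""
        self.handler = handler

    def read(self):
        """Receiver side: read the service data from the storage space."""
        return bytes(self.storage[:self.length])

    def send(self, payload):
        """Sender side: write service data, then raise the interrupt."""
        if len(payload) > len(self.storage):
            raise ValueError("payload larger than the storage space")
        if self.handler is None:
            raise RuntimeError("no interrupt handler registered")
        self.storage[:len(payload)] = payload
        self.length = len(payload)
        return self.handler(self)       # models the interrupt request


def linux_side_handler(channel):
    """Hypothetical handler in the second OS: acknowledge the request."""
    return b"ACK:" + channel.read()


channel = MailboxChannel()
channel.register_handler(linux_side_handler)
reply = channel.send(b"GET_BOOT_STATE")
```

In this model the data path (the storage space) and the notification path (the handler call) are kept separate, mirroring the division between the storage space and the mailbox channel described above.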

According to the embodiment, whether the main system is restarted successfully is determined by whether the response of the main system is received through inter-core communication in the restarting time of the main system and whether the running state of the main system in the shared memory is the starting completion, so that the accuracy of determining the restarting state of the system can be improved.

In an exemplary embodiment, after the second operating system is controlled by the first operating system to restart, the above method further comprises:

S141, determining, by the first operating system, a restart state of the second operating system, wherein the restart state of the second operating system is used for indicating whether the restart of the second operating system is completed;

S142, in the case that it is determined according to the restart state that the second operating system has restarted abnormally, controlling, by the first operating system, the second operating system to restart multiple times until the restart of the second operating system is completed or the number of restarts of the second operating system reaches the designated number;

S143, determining that the restart of the second operating system has failed in the case that the number of restarts of the second operating system reaches the designated number and the second operating system still restarts abnormally.

After the main system is controlled by the auxiliary system to restart, the success rate of the main system restart can be ensured by controlling the main system to restart for at most specified times, and meanwhile, the system resource waste caused by excessive system restart is avoided. In this embodiment, after the first operating system controls the second operating system to restart the system, the first operating system may determine a restart state of the second operating system, where the restart state of the second operating system is used to indicate whether the restart of the second operating system is completed, and a manner of determining the restart state of the second operating system is similar to that in the foregoing embodiment, and will not be repeated herein.

When it is determined from the restart state that the second operating system has restarted abnormally, the first operating system can control the second operating system to restart multiple times, until the restart of the second operating system is completed or the number of restarts reaches the designated number. The designated number may be three or another value and can be configured flexibly as needed. Compared with declaring a restart failure as soon as the second operating system restarts abnormally once, this improves the success rate of the system restart, and setting a threshold on the number of restarts avoids wasting system resources.

By setting a designated number for the restarts of the second operating system, the accuracy of determining the restart state of the second operating system can be improved.
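As a non-limiting illustration, the bounded restart of S141 to S143 can be sketched in Python as follows. The callback restart_once is hypothetical: it is assumed to trigger one restart of the second operating system and to return True when that restart completes. The default of three restarts follows the example value given above.

```python
def restart_with_limit(restart_once, designated_times=3):
    """Restart until success or until the designated number is reached.

    Returns (restart_completed, restarts_performed).
    """
    restarts = 0
    while restarts < designated_times:
        restarts += 1
        if restart_once():           # S141/S142: restart, then check the state
            return True, restarts
    return False, restarts           # S143: restart failure


# Example: the first two restarts are abnormal, the third one completes.
outcomes = iter([False, False, True])
result = restart_with_limit(lambda: next(outcomes))
```

The loop never performs more than designated_times restarts, which is what bounds the resources spent on a second operating system that cannot recover.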

In an exemplary embodiment, the above method further comprises:

S151, in the case that the restart of the second operating system is completed, taking over the set of designated processes by the second operating system, and detecting the running state of the second operating system again by the first operating system, wherein after the second operating system takes over the set of designated processes, the set of designated processes backed up in the first operating system is in a stopped state.

When the restart of the second operating system is completed, the second operating system may take over a set of designated processes currently performed by the first operating system, and at this time, a set of designated processes backed up by the first operating system is in a stopped state. Meanwhile, the first operating system can also re-detect the running state of the second operating system. The manner of re-detection is similar to that of the previous embodiment, and will not be described here.

According to the embodiment, the key process is taken over by the primary system after the restart is successful, and the operation state of the primary system is detected by the secondary system, so that the safety of the operation of the system can be improved.

In an exemplary embodiment, the above method further comprises:

S161, detecting the running state of the second operating system again by the first operating system in the case that the second operating system is not detected to have the specified abnormality;

S162, determining whether the second operating system has the specified abnormality according to the re-detected running state of the second operating system.

When the first operating system does not detect that the second operating system has the specified abnormality, the first operating system can re-detect the running state of the second operating system, and then determine whether the second operating system has the specified abnormality according to the re-detected running state of the second operating system. The running state of the second operating system is continuously monitored through the first operating system, so that the running safety of the second operating system can be ensured.

According to the embodiment, the operation state of the main system is continuously monitored through the auxiliary system, so that the operation safety of the system can be improved.
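As a non-limiting illustration, the repeated detection of S161 and S162 can be sketched in Python as follows. The sketch assumes that the running state of the second operating system is available as a sequence of samples and that a predicate decides whether a sample shows a specified abnormality; the function name and the string-valued samples are hypothetical.

```python
def monitor_until_abnormal(samples, has_specified_abnormality):
    """Re-detect the running state until a specified abnormality appears.

    Returns the 1-based detection round in which the abnormality was found,
    or None if every sample was normal.
    """
    for round_no, state in enumerate(samples, start=1):
        if has_specified_abnormality(state):   # S162
            return round_no
        # S161: no specified abnormality, so detect again on the next sample.
    return None


states = ["ok", "ok", "hang", "ok"]
found_in_round = monitor_until_abnormal(states, lambda s: s == "hang")
```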

The operation control method of the embedded system in the present embodiment is explained below in conjunction with an alternative example. In this embodiment, the first operating system is the secondary system in the heterogeneous dual system, which is an RTOS system, and the second operating system is the main system in the heterogeneous dual system, which is a Linux system.

In the related art, the Linux system may hang in some startup phases, for example the uboot phase, so that the Linux system fails to start; this may lead to repeated restarts or a hang, and the system cannot run normally.

In this regard, the present alternative example provides a scheme for enhancing the operational reliability of a system based on heterogeneous dual systems: if the Linux system starts abnormally and cannot start normally, the secondary system starts the main system from its standby image, so as to solve the above problem.

As shown in fig. 7, the process of performing system recovery when the secondary system monitors that the primary system cannot be started may include:

Step 1, starting the RTOS system as the secondary system;

Step 2, the RTOS system detects, through inter-core communication, whether the startup of the main system is completed;

Step 3, judging whether the main system has not responded to the RTOS message for a long time; if so, executing step 4, otherwise executing step 2;

Step 4, the startup is abnormal: adding 1 to the retry count and judging whether the retry count is greater than 3; if so, executing step 5, otherwise forcibly restarting the main system and executing step 2;

Step 5, the RTOS system sets the standby image startup flag (that is, the image switching flag) to switch startup to the standby image, restarts the main system so that it starts from the standby image, and returns to step 2 to continue detecting the startup state of the Linux system while the main system starts.
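As a non-limiting illustration, steps 2 to 5 above can be sketched in Python as follows. Each element of probe_results stands for the outcome of one pass through steps 2 and 3 (True if the main system reported a completed startup, False if it did not respond for a long time). The function name, the action strings, and the choice to reset the retry count after switching images are assumptions of this sketch and are not stated in the example.

```python
def run_recovery(probe_results, max_retries=3):
    """Replay the recovery flow and return the list of actions taken."""
    actions = []
    retries = 0
    image = "target"
    for started in probe_results:            # steps 2 and 3
        if started:
            actions.append("started from " + image + " image")
            break
        retries += 1                          # step 4
        if retries > max_retries:
            image = "standby"                 # step 5: set the switching flag
            retries = 0                       # assumption: count starts over
            actions.append("switch to standby image and restart")
        else:
            actions.append("forced restart")
    return actions


# Four failed probes, then a successful start from the standby image.
trace = run_recovery([False, False, False, False, True])
```

Because step 4 tests whether the retry count is greater than 3, three forced restarts of the target image occur before the fourth failure triggers the switch to the standby image.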

With this alternative example, under the heterogeneous dual system, the startup status of the main system is judged not by the main system itself but by the secondary system during the startup of the main system. Whatever abnormality occurs, startup can be switched to the standby image, which ensures that the main system loads successfully and greatly improves the reliability of the embedded software and of the embedded system.

It should be noted that, for simplicity of description, the foregoing method embodiments are all described as a series of acts, but it should be understood by those skilled in the art that the present application is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required for the present application.

According to another aspect of the embodiment of the present application, an operation control device of an embedded system is further provided, and the device is used for implementing the operation control method of the embedded system provided in the foregoing embodiment, which is not described herein. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. While the means described in the following embodiments are preferably implemented in software, implementation in hardware, or a combination of software and hardware, is also possible and contemplated.

Fig. 8 is a block diagram of an operation control apparatus of an embedded system according to an embodiment of the present application, as shown in fig. 8, the apparatus comprising:

a first detecting unit 802, configured to detect an operation state of a second operating system through a first operating system, where the first operating system and the second operating system are operating systems running on different processor cores, and the embedded system includes the first operating system and the second operating system;

a first control unit 804, configured to stop, by the first operating system, the target specified process in the second operating system when it is detected that the second operating system has a specified abnormality and that a target specified process that is running normally exists in a set of specified processes in the second operating system;

the first execution unit 806 is configured to start, after stopping, by the first operating system, the target specified process in the second operating system, by the first operating system, a set of specified processes backed up in the first operating system, and control, by the first operating system, the second operating system to restart the system.

According to the embodiment of the application, the running state of the second operating system is detected through the first operating system; stopping the target specified process in the second operating system through the first operating system under the condition that the specified abnormality exists in the second operating system and the normal running target specified process exists in a group of specified processes in the second operating system; after stopping the target appointed process in the second operating system through the first operating system, starting a group of appointed processes backed up in the first operating system through the first operating system, and controlling the second operating system to restart the system through the first operating system, so that the problem of low safety of equipment operation caused by overlong restarting time in the operation control method of the embedded system in the related technology is solved, and the safety of equipment operation is improved.
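As a non-limiting illustration, the ordering of the three operations (stop the target specified processes, start the backed-up set, restart the second operating system), together with the later takeover, can be modelled in Python as follows. The class and method names are hypothetical, and the process name fan_control is an invented example; only temperature detection is named in this application.

```python
class DualSystemModel:
    """Tracks where the set of specified processes is running."""

    def __init__(self, specified):
        self.specified = set(specified)
        self.second_os_running = set(specified)  # normal operation
        self.first_os_running = set()            # backed-up copies, idle
        self.restarting = False

    def handle_specified_abnormality(self, normally_running):
        """First OS reaction once a specified abnormality is detected."""
        # 1. Stop the target specified processes that still run normally,
        #    so that two copies of a process never run at the same time.
        targets = self.specified & set(normally_running)
        self.second_os_running -= targets
        # 2. Start the whole backed-up set in the first operating system.
        self.first_os_running = set(self.specified)
        # 3. Control the second operating system to restart.
        self.second_os_running.clear()
        self.restarting = True
        return sorted(targets)

    def restart_completed(self):
        """Second OS takes over; the backed-up copies are stopped."""
        self.second_os_running = set(self.specified)
        self.first_os_running.clear()
        self.restarting = False


model = DualSystemModel(["temperature_detection", "fan_control"])
stopped = model.handle_specified_abnormality(["temperature_detection"])
```

While restarting is True the specified processes run only in the first operating system, so the functions they provide remain available for the whole duration of the restart.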

In an exemplary embodiment, the above apparatus further includes:

and the restarting unit is used for controlling the second operating system to restart from the standby image through the first operating system under the condition that the second operating system fails to restart from the target image, wherein the target image and the standby image are images of the second operating system.

In one exemplary embodiment, the restart unit includes:

the switching module is used for setting the mirror image identifier in the starting register corresponding to the second operating system through the first operating system under the condition that the second operating system fails to restart from the target mirror image, so as to switch the mirror image used by the second operating system for restarting from the target mirror image to the standby mirror image;

and the control module is used for controlling the second operating system to restart from the standby mirror image identified by the mirror image identification in the starting register through the first operating system.
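As a non-limiting illustration, setting the image identifier in the startup register can be sketched in Python as follows. The register layout is an assumption of this sketch: a single bit at a hypothetical position IMAGE_SELECT_BIT selects the image, with 0 for the target image and 1 for the standby image. A real chip defines its own register layout.

```python
TARGET_IMAGE = 0
STANDBY_IMAGE = 1
IMAGE_SELECT_BIT = 0   # hypothetical bit position of the image identifier


def set_image_identifier(boot_register, image_id):
    """Return the register value with the image identifier set to image_id."""
    mask = 1 << IMAGE_SELECT_BIT
    if image_id == STANDBY_IMAGE:
        return boot_register | mask
    return boot_register & ~mask


def image_identified(boot_register):
    """Return which image the second OS would restart from."""
    if boot_register & (1 << IMAGE_SELECT_BIT):
        return STANDBY_IMAGE
    return TARGET_IMAGE


register = 0b10100000                       # other bits are left untouched
register = set_image_identifier(register, STANDBY_IMAGE)
```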

In an exemplary embodiment, the restart unit further includes:

the first determining module is used for determining, by the first operating system, whether the second operating system is running by inter-core communication, and determining the running stage of the second operating system through a shared memory between the first operating system and the second operating system, so as to determine whether the restart of the second operating system from the standby image is completed.

In one exemplary embodiment, the first determination module includes:

the first sending submodule is used for sending a first interrupt request to the second operating system through the first operating system in an inter-core communication mode, wherein the first interrupt request is used for requesting to acquire the starting state of the second operating system;

the first determining submodule is used for reading information stored in the shared memory through the first operating system to determine the operation stage of the second operating system;

the second determining submodule is used for determining that the second operating system is restarted abnormally under the condition that the response message returned by the second operating system is not received within the preset starting time of the second operating system and the fact that the starting of the second operating system is not completed is determined according to the running state parameters in the shared memory.

In an exemplary embodiment, the above apparatus further includes:

the first determining unit is used for determining a restarting state of the second operating system through the first operating system after the second operating system is controlled to restart through the first operating system, wherein the restarting state of the second operating system is used for indicating whether the restarting of the second operating system is completed or not;

the second control unit is used for controlling the second operating system to restart the system for multiple times through the first operating system under the condition that the restarting abnormality of the second operating system is determined according to the restarting state of the second operating system until the restarting of the second operating system is completed or the restarting times of the second operating system reach the designated times;

And the second determining unit is used for determining that the second operating system is restarted and fails under the condition that the number of times of restarting the second operating system reaches the designated number of times and the second operating system is still restarted abnormally.

In an exemplary embodiment, the above apparatus further includes:

and the compiling unit is used for compiling the backed-up process codes of a group of designated processes into the first operating system in the process of compiling the first operating system before the running state of the second operating system is detected by the first operating system.

In one exemplary embodiment, the first detection unit includes:

and the monitoring module is used for monitoring the running state of the second operating system by adopting an inter-core communication mode through the first operating system to obtain the running state of the second operating system.

In one exemplary embodiment, the monitoring module includes:

the second sending submodule is used for sending a second interrupt request to a second operating system through the first operating system in an inter-core communication mode, wherein the second interrupt request is used for requesting to acquire the system state of the second operating system;

and the third determining submodule is used for determining the system state of the second operating system based on whether the response message of the second interrupt request returned by the second operating system is received or not, wherein the running state of the second operating system comprises the system state of the second operating system.

In one exemplary embodiment, the monitoring module includes:

the third sending submodule is used for sending a third interrupt request to a target process in the second operating system by adopting an inter-core communication mode through the first operating system, wherein the third interrupt request is used for requesting to acquire the process state of the target process in the second operating system;

and the fourth determining submodule is used for determining the process state of the target process in the second operating system based on whether the response message of the third interrupt request returned by the target process is received or not, wherein the running state of the second operating system comprises the process state of the target process in the second operating system.
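As a non-limiting illustration, determining a state from whether a response message is received can be sketched in Python as follows. A standard-library queue stands in for the inter-core response path, one queue per monitored target; the function names, the timeout value, the state labels, and the target names are hypothetical.

```python
import queue


def response_received(responses, timeout_s):
    """Wait up to timeout_s for a response message on the given queue."""
    try:
        responses.get(timeout=timeout_s)
        return True
    except queue.Empty:
        return False


def poll_states(response_paths, timeout_s=0.05):
    """Map each monitored target to a state based on its response."""
    return {
        name: "responsive" if response_received(path, timeout_s) else "no response"
        for name, path in response_paths.items()
    }


system_path = queue.Queue()
system_path.put("ack")                  # the second OS answered its request
process_path = queue.Queue()            # the target process never answered
states = poll_states({"system": system_path, "target_process": process_path})
```

The same helper serves both the system-level request and the process-level request, since in both cases the state is inferred only from the presence or absence of a response.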

In one exemplary embodiment, the first detection unit includes:

and the reading module is used for monitoring the running state of the second operating system by adopting an inter-core communication mode through the first operating system to obtain the running state of the second operating system.

In one exemplary embodiment, the first control unit includes:

the sending module is used for sending a fourth interrupt request to the second operating system in an inter-core communication mode through the first operating system under the condition that the second operating system has specified abnormality and a group of specified processes in the second operating system have target specified processes running normally, wherein the fourth interrupt request is used for requesting to stop the target specified processes in the second operating system.

In an exemplary embodiment, the above apparatus further includes:

the judging unit is used for judging whether the second operating system has a specified abnormality according to the running state of the second operating system, wherein the specified abnormality comprises at least one of the following components: the system is abnormal in scheduling, the unrecoverable process is failed, the communication is overtime, and the system is suspended.
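As a non-limiting illustration, the judgment over the four kinds of specified abnormality can be sketched in Python as follows. The running state is assumed to be a dictionary of boolean indicators; the key names and the function names are hypothetical and are not defined by this application.

```python
from enum import Enum, auto


class SpecifiedAbnormality(Enum):
    SCHEDULING_ABNORMAL = auto()
    UNRECOVERABLE_PROCESS_FAILURE = auto()
    COMMUNICATION_TIMEOUT = auto()
    SYSTEM_HANG = auto()


def judge_abnormalities(running_state):
    """Return the set of specified abnormalities shown by the running state."""
    indicators = {
        SpecifiedAbnormality.SCHEDULING_ABNORMAL: "scheduling_abnormal",
        SpecifiedAbnormality.UNRECOVERABLE_PROCESS_FAILURE: "unrecoverable_process_failure",
        SpecifiedAbnormality.COMMUNICATION_TIMEOUT: "communication_timed_out",
        SpecifiedAbnormality.SYSTEM_HANG: "system_hung",
    }
    return {kind for kind, key in indicators.items() if running_state.get(key, False)}


def has_specified_abnormality(running_state):
    """True if at least one specified abnormality is present."""
    return bool(judge_abnormalities(running_state))


found = judge_abnormalities({"communication_timed_out": True, "system_hung": False})
```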

In an exemplary embodiment, the above apparatus further includes:

the second detection unit is used for detecting the running state of the second operating system again through the first operating system under the condition that the second operating system is not detected to have the specified abnormality;

and the third determining unit is used for determining whether the second operating system has a specified abnormality according to the redetected running state of the second operating system.

In one exemplary embodiment, the first execution unit includes:

the starting module is used for starting a temperature detection process backed up in the first operating system through the first operating system, wherein the temperature detection process is a process for detecting the temperature of equipment where the embedded system is located, and the set of designated processes comprise a temperature detection process.

In one exemplary embodiment, the first execution unit includes:

And the restarting module is used for forcing the second operating system to restart the system through the first operating system.

In an exemplary embodiment, the above apparatus further includes:

and the second execution unit is used for taking over a group of appointed processes through the second operating system and detecting the running state of the second operating system through the first operating system again under the condition that the restarting of the second operating system is completed, wherein after the second operating system takes over the group of appointed processes, the group of appointed processes backed up by the first operating system are in a stop state.

In one exemplary embodiment, the first operating system and the second operating system are operating systems running on different processor cores in a multi-core processor of the BMC chip; alternatively, the first operating system is an operating system running on a coprocessor of the BMC chip and the second operating system is an operating system running on a processor core of a multi-core processor of the BMC chip.

It should be noted that each of the above modules may be implemented by software or hardware, and for the latter, it may be implemented by, but not limited to: the modules are all located in the same processor; alternatively, the above modules may be located in different processors in any combination.

According to still another aspect of the embodiments of the present application, an embedded system is provided, and the embedded system may be used to execute the operation control method of the embedded system in the foregoing embodiments, which is already described and will not be described herein.

In this embodiment, the above-mentioned embedded system includes: a first operating system and a second operating system running on different processor cores, wherein,

the first operating system is used for detecting the running state of the second operating system; stopping the target specified process in the second operating system under the condition that the specified abnormality exists in the second operating system and the target specified process which runs normally exists in a group of specified processes in the second operating system is detected; after stopping the target appointed process in the second operating system, starting a group of appointed processes backed up in the first operating system, and controlling the second operating system to restart the system;

and the second operating system is used for interacting with the first operating system in an inter-core communication mode and responding to the control of the first operating system to execute matching operation.

In some exemplary embodiments, the first operating system and the second operating system may cooperate to perform steps in any of the method embodiments described above.

According to still another aspect of the embodiments of the present application, a server is provided, which may be used to execute the operation control method of the embedded system in the foregoing embodiments, and is not described herein.

In this embodiment, the server includes a BMC chip, and a first operating system and a second operating system run on different processor cores of the BMC chip.

According to still another aspect of the embodiments of the present application, a BMC chip is provided, and the BMC chip may be used to execute the operation control method of the embedded system in the foregoing embodiments, which is already described and will not be described herein.

In this embodiment, a first operating system and a second operating system run on different processor cores of the BMC chip.

According to a further aspect of embodiments of the present application, there is also provided a computer readable storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.

In one exemplary embodiment, the computer readable storage medium may include, but is not limited to: a USB flash drive, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a removable hard disk, a magnetic disk, an optical disk, or any other medium capable of storing a computer program.

According to a further aspect of embodiments of the present application there is also provided an electronic device comprising a memory having stored therein a computer program and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.

In an exemplary embodiment, the electronic device may further include a transmission device connected to the processor, and an input/output device connected to the processor.

Specific examples in this embodiment may refer to the examples described in the foregoing embodiments and the exemplary implementation, and this embodiment is not described herein.

It will be appreciated by those skilled in the art that the modules or steps of the embodiments of the application described above may be implemented in a general purpose computing device, they may be concentrated on a single computing device, or distributed across a network of computing devices, they may be implemented in program code executable by computing devices, so that they may be stored in a memory device for execution by computing devices, and in some cases, the steps shown or described may be performed in a different order than what is shown or described, or they may be separately fabricated into individual integrated circuit modules, or a plurality of modules or steps in them may be fabricated into a single integrated circuit module. Thus, embodiments of the application are not limited to any specific combination of hardware and software.

The above is only a preferred embodiment of the present application and is not intended to limit the embodiment of the present application, and various modifications and variations can be made to the embodiment of the present application by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the principle of the embodiments of the present application should be included in the protection scope of the embodiments of the present application.

Claims

1. An operation control method of an embedded system, comprising:

detecting the running state of a second operating system through a first operating system, wherein the first operating system and the second operating system are operating systems running on different processor cores, and the embedded system comprises the first operating system and the second operating system;

stopping, by the first operating system, the target specified process in the second operating system when it is detected that a specified abnormality exists in the second operating system and a target specified process that is running normally exists in a set of specified processes in the second operating system;

after stopping the target appointed process in the second operating system through the first operating system, starting the set of appointed processes backed up in the first operating system through the first operating system, and controlling the second operating system to restart the system through the first operating system.

2. The method of claim 1, wherein after the controlling the second operating system by the first operating system for a system restart, the method further comprises:

in the case that the second operating system fails to restart from a target image, controlling, by the first operating system, the second operating system to restart from a standby image, wherein the target image and the standby image are images of the second operating system.

3. The method of claim 2, wherein controlling, by the first operating system, the second operating system to restart from the standby image in the event that the second operating system fails to restart from the target image, comprises:

setting a mirror image identifier in a starting register corresponding to the second operating system through the first operating system under the condition that the second operating system fails to restart from a target mirror image, so as to switch the mirror image used by the second operating system for system restart from the target mirror image to the standby mirror image;

and controlling the second operating system to restart from the standby mirror identified by the mirror identification in the starting register through the first operating system.

4. The method of claim 2, wherein in controlling the reboot of the second operating system from the standby image by the first operating system, the method further comprises:

determining, by the first operating system, whether the second operating system is running by means of inter-core communication, and determining the running stage of the second operating system through a shared memory between the first operating system and the second operating system, so as to determine whether the second operating system has finished restarting from the standby image.

5. The method of claim 4, wherein the determining, by the first operating system, whether the second operating system is running by means of inter-core communication, and determining the running stage of the second operating system through the shared memory between the first operating system and the second operating system, so as to determine whether the second operating system has finished restarting from the standby image, comprises:

sending, by the first operating system, a first interrupt request to the second operating system by means of inter-core communication, wherein the first interrupt request is used for requesting the startup state of the second operating system;

reading, by the first operating system, information stored in the shared memory to determine the running stage of the second operating system;

and in a case where no response message returned by the second operating system is received within a preset startup time of the second operating system and it is determined, according to a running state parameter in the shared memory, that startup of the second operating system is not completed, determining that the restart of the second operating system is abnormal.
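
The two-condition check recited in claim 5 can be sketched, purely for illustration, as a single predicate. The stage name `"BOOT_COMPLETE"`, the parameter names, and the use of seconds are hypothetical; the claims only require that both the missing response and the incomplete startup indication hold.

```python
def restart_is_abnormal(response_received: bool,
                        shared_stage: str,
                        elapsed_s: float,
                        preset_startup_s: float) -> bool:
    """Return True only if the restart should be treated as abnormal.

    response_received: whether a reply to the first interrupt request arrived.
    shared_stage: running-stage value read from shared memory (hypothetical names).
    """
    if elapsed_s < preset_startup_s:
        # Still inside the preset startup time; no verdict yet.
        return False
    # Both conditions must hold: no response AND startup not completed.
    return (not response_received) and shared_stage != "BOOT_COMPLETE"
```

Requiring both conditions means that a lost interrupt response alone does not trigger another restart while the shared memory still shows a completed startup.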

6. The method of claim 2, wherein after the controlling, by the first operating system, the second operating system to perform a system restart, the method further comprises:

determining, by the first operating system, a restart state of the second operating system, wherein the restart state of the second operating system indicates whether the restart of the second operating system is completed;

in a case where it is determined, according to the restart state of the second operating system, that the restart of the second operating system is abnormal, controlling, by the first operating system, the second operating system to perform the system restart repeatedly until the restart of the second operating system is completed or the number of restarts of the second operating system reaches a specified number;

and in a case where the number of restarts of the second operating system reaches the specified number and the restart of the second operating system is still abnormal, determining that the restart of the second operating system has failed.
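
The bounded retry recited in claim 6 might, under stated assumptions, look like the loop below. The callable `restart_once` is a hypothetical hook that performs one restart attempt and reports whether it completed; `max_attempts` stands for the specified number of restarts.

```python
from typing import Callable, Tuple


def restart_with_retries(restart_once: Callable[[], bool],
                         max_attempts: int) -> Tuple[bool, int]:
    """Retry the restart until it completes or max_attempts is reached.

    Returns (completed, attempts_used). A result of (False, max_attempts)
    corresponds to determining that the restart has failed.
    """
    attempts = 0
    while attempts < max_attempts:
        attempts += 1
        if restart_once():
            return True, attempts
    return False, attempts
```

For example, an attempt sequence of abnormal, abnormal, completed with `max_attempts=5` returns `(True, 3)`.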

7. The method of claim 1, wherein prior to said detecting, by the first operating system, an operational state of the second operating system, the method further comprises:

compiling backup process code of the set of specified processes into the first operating system while the first operating system is being compiled.

8. The method of claim 1, wherein detecting, by the first operating system, an operational state of the second operating system comprises:

monitoring, by the first operating system, the second operating system by means of inter-core communication to obtain the running state of the second operating system.

9. The method of claim 8, wherein the monitoring, by the first operating system, the second operating system by means of inter-core communication to obtain the running state of the second operating system comprises:

sending, by the first operating system, a second interrupt request to the second operating system by means of inter-core communication, wherein the second interrupt request is used for requesting the system state of the second operating system;

and determining the system state of the second operating system based on whether a response message of the second interrupt request returned by the second operating system is received, wherein the running state of the second operating system comprises the system state of the second operating system.

10. The method of claim 8, wherein the monitoring, by the first operating system, the second operating system by means of inter-core communication to obtain the running state of the second operating system comprises:

sending, by the first operating system, a third interrupt request to a target process in the second operating system by means of inter-core communication, wherein the third interrupt request is used for requesting the process state of the target process in the second operating system;

and determining the process state of the target process in the second operating system based on whether a response message of the third interrupt request returned by the target process is received, wherein the running state of the second operating system comprises the process state of the target process in the second operating system.

11. The method of claim 1, wherein detecting, by the first operating system, an operational state of the second operating system comprises:

reading, by the first operating system, running state information of the second operating system stored in a shared memory between the first operating system and the second operating system, to obtain the running state of the second operating system.

12. The method of claim 1, wherein the stopping, by the first operating system, the target specified process in the second operating system when it is detected that the specified abnormality exists in the second operating system and that the normally running target specified process exists in the set of specified processes in the second operating system, comprises:

in a case where the specified abnormality exists in the second operating system and the normally running target specified process exists in the set of specified processes in the second operating system, sending, by the first operating system, a fourth interrupt request to the second operating system by means of inter-core communication, wherein the fourth interrupt request is used for requesting that the target specified process in the second operating system be stopped.

13. The method of claim 1, wherein after the detecting, by the first operating system, the operational state of the second operating system, the method further comprises:

determining, according to the running state of the second operating system, whether the specified abnormality exists in the second operating system, wherein the specified abnormality comprises at least one of the following: a system scheduling abnormality, an unrecoverable process failure, a communication timeout, and a system hang.

14. The method of claim 1, wherein after the detecting, by the first operating system, the operational state of the second operating system, the method further comprises:

re-detecting the running state of the second operating system through the first operating system under the condition that the second operating system is not detected to have the specified abnormality;

and determining whether the second operating system has the specified abnormality according to the redetected running state of the second operating system.

15. The method of claim 1, wherein the starting, by the first operating system, the set of specified processes backed up in the first operating system comprises:

starting, by the first operating system, a temperature detection process backed up in the first operating system, wherein the temperature detection process is a process for detecting the temperature of the device in which the embedded system is located, and the set of specified processes comprises the temperature detection process.

16. The method of claim 1, wherein the controlling, by the first operating system, the second operating system to perform a system restart comprises:

forcing, by the first operating system, the second operating system to restart.

17. The method of claim 1, wherein after the controlling, by the first operating system, the second operating system to perform a system restart, the method further comprises:

in a case where the restart of the second operating system is completed, taking over, by the second operating system, the set of specified processes, and detecting again, by the first operating system, the running state of the second operating system, wherein after the second operating system takes over the set of specified processes, the set of specified processes backed up in the first operating system is in a stopped state.

18. The method of any one of claims 1 to 17, wherein the first operating system and the second operating system are operating systems running on different processor cores in a multi-core processor of a baseboard management controller (BMC) chip; or, the first operating system is an operating system running on a coprocessor of a BMC chip, and the second operating system is an operating system running on a processor core of a multi-core processor of the BMC chip.

19. An operation control device of an embedded system, comprising:

a first detection unit, configured to detect, by a first operating system, a running state of a second operating system, wherein the first operating system and the second operating system are operating systems running on different processor cores, and the embedded system comprises the first operating system and the second operating system;

a first control unit, configured to stop, by the first operating system, a target specified process in the second operating system when it is detected that a specified abnormality exists in the second operating system and that a normally running target specified process exists in a set of specified processes in the second operating system;

and a first execution unit, configured to, after the target specified process in the second operating system is stopped by the first operating system, start, by the first operating system, the set of specified processes backed up in the first operating system, and control, by the first operating system, the second operating system to perform a system restart.

20. An embedded system, comprising: a first operating system and a second operating system running on different processor cores, wherein,

the first operating system is configured to: detect a running state of the second operating system; stop a target specified process in the second operating system when a specified abnormality exists in the second operating system and a normally running target specified process exists in a set of specified processes in the second operating system; and, after stopping the target specified process in the second operating system, start the set of specified processes backed up in the first operating system and control the second operating system to perform a system restart;

the second operating system is configured to interact with the first operating system by means of inter-core communication, and to perform corresponding operations in response to control by the first operating system.

21. A server, comprising: a baseboard management controller (BMC) chip, wherein a first operating system and a second operating system run on different processor cores of the BMC chip,

22. A baseboard management controller (BMC) chip, characterized in that a first operating system and a second operating system run on different processor cores of the BMC chip, wherein,

23. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program, wherein the computer program, when executed by a processor, implements the steps of the method of any of claims 1 to 18.

24. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method of any one of claims 1 to 18 when executing the computer program.