CN108255591B - Unified exception handling method for partition operating system - Google Patents

Unified exception handling method for partition operating system

Info

Publication number
CN108255591B
Authority
CN
China
Prior art keywords
exception
partition
processing
guest
abnormal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711292568.5A
Other languages
Chinese (zh)
Other versions
CN108255591A (en)
Inventor
周霆
李运喜
叶宏
张勇
徐晓光
郭芳超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Aeronautics Computing Technique Research Institute of AVIC
Original Assignee
Xian Aeronautics Computing Technique Research Institute of AVIC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Aeronautics Computing Technique Research Institute of AVIC
Priority to CN201711292568.5A
Publication of CN108255591A
Application granted
Publication of CN108255591B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0793 Remedial or corrective actions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/461 Saving or restoring of program or task context
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/48 Indexing scheme relating to G06F9/48
    • G06F2209/481 Exception handling

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Debugging And Monitoring (AREA)
  • Retry When Errors Occur (AREA)

Abstract

The invention belongs to the technical field of computer system software and relates to a unified exception handling method for the different partition types of a partition operating system. The method is built on an exception handling framework of the partition operating system that comprises an exception system-level processing and distribution program unit, a bare application partition exception handling task unit, a guest OS partition exception handling task unit, and a guest OS exception handler unit. The method gives the partition operating system a unified way of handling user-mode exceptions for both partition types, and solves the problem that arises when a bare application partition and a guest OS partition coexist in the partition operating system and both require user-level exception handling.

Description

Unified exception handling method for partition operating system
Technical Field
The invention belongs to the technical field of computer system software, and particularly relates to a unified exception handling method for a partition operating system.
Background
Loading a guest OS into a partition of a partition operating system gives the partition operating system good legacy-code reuse and flexible integration capability, and this has become an important feature of current mainstream partition operating systems. A partition operating system that supports both bare application partitions and guest OS partitions needs a unified exception handling method that lets both partition types perform user-mode exception handling, so that both the bare application and the guest OS have a chance to take over the exceptions they care about and attach their own user-level exception handlers. Such a method should meet three requirements. First, it must allow each of the two partition types to handle its own exceptions when both are present at the same time. Second, it should not break the guest OS's original exception handling flow, and it should be transparent to applications running on the guest OS. Finally, to keep the kernel of the partition operating system safe and reliable, the method should be as simple and general as possible and should be implemented inside the partition wherever possible, so as to reduce modification and functional extension of the kernel. Existing partition operating systems handle user-mode exceptions by having the kernel call a user hook directly. On the one hand, this approach requires too many changes to the kernel, which reduces kernel reliability and introduces security risk; on the other hand, the direct procedure call increases the kernel's processing burden and reduces the efficiency of exception handling.
Disclosure of Invention
Purpose of the invention: the invention provides a unified exception handling method for a partition operating system. It addresses the case where the partition operating system is configured with both a guest OS partition and a bare application partition and both partition types need user-level exception handling, and it overcomes the drawbacks of the existing approach in which the kernel of the partition operating system calls a user hook directly.
The technical scheme of the invention is as follows:
1. An exception handling framework for a partition operating system:
a) Exception system-level processing and distribution program unit: resides in the kernel of the partition operating system, takes over the exception directly and performs basic processing, and uses IPC to distribute the exception event so as to activate the exception handling task unit of the bare application partition or of the guest OS partition, which then performs the corresponding user-level exception handling.
b) Bare application partition exception handling task unit: resides in the bare application partition and is responsible for looking up the user-level exception handling routine table and calling the corresponding routine registered by the user to complete user-level processing. The task is given the highest priority when it is created; once started, it enters a processing loop, blocks on IPC receive, and waits to be notified and activated by the kernel's exception processing and distribution program unit.
c) Guest OS partition exception handling task unit: resides in the guest OS partition and is responsible for transferring exception handling to the guest OS's exception handler unit, passing the exception context along with it. The task is given the highest priority when it is created; once started, it enters a processing loop, blocks on IPC receive, and waits to be notified and activated by the kernel's exception processing and distribution program unit.
d) Guest OS exception handler unit: comprises the guest OS's original exception handler and a modified guest OS exception exit mechanism. After the guest OS has finished handling the exception, if the guest OS application task that triggered the exception has not been suspended, it can resume execution through this exit mechanism.
2. An exception handling flow for a partition operating system (steps are executed in order unless a step specifies a jump, and the flow ends at any step marked as finished):
Step 1: When an exception occurs, the exception system-level processing and distribution program unit takes over first and saves the basic exception context. If the exception type must be handled entirely by the kernel (for example, reset exceptions and machine check exceptions), the kernel performs system-level processing and the flow is finished. Otherwise, go to step 2.
Step 2: For exception types that occur in a partition and can be handled by the partition, the exception system-level processing and distribution program unit writes information such as the exception context and the exception vector number into the IPC message buffer.
Step 3: The exception system-level processing and distribution program unit then identifies the ID of the partition in which the exception occurred, which may be a bare application partition or a guest OS partition, and looks up the corresponding exception handling task by partition ID.
Step 4: Finally, the exception system-level processing and distribution program unit calls the combined IPC (inter-process communication) send-and-receive interface to send the exception information to the partition's exception handling task. Once the message is confirmed as sent, the current exception site is suspended and the processor is yielded; further processing waits until the partition's exception handling task has been switched in. If the exception occurred in a bare application partition, go to step 5; if it occurred in a guest OS partition, go to step 7.
Step 5: The IPC notification from the exception system-level processing and distribution program unit activates the bare application partition exception handling task unit, which retrieves the exception context and the exception vector number from the message buffer. It then looks up the user-level exception handling routine table by vector number and calls the corresponding routine registered by the user to complete user-level processing.
Step 6: After processing completes, the task blocks on IPC receive again and yields the processor; execution switches back to the task in which the exception occurred and continues from the point of the exception. Processing is finished.
Step 7: The IPC notification from the exception system-level processing and distribution program unit activates the guest OS partition exception handling task, which retrieves the exception context from the message buffer, then sets the program counter (PC) of the exception site to the entry of the guest OS exception handler unit and sets the parameter register (R3) of the exception site to the address of the exception context.
Step 8: The guest OS partition exception handling task blocks on IPC receive again and yields the processor. Task scheduling and switching by the OS then resumes the exception site at the entry of the guest OS exception handler unit, and the guest OS's own exception handling begins.
Step 9: The guest OS exception handler unit begins user-level exception handling. If the guest OS task in which the exception occurred (note that this is a task created by the guest OS, not by the partition operating system) is suspended during handling, go to step 10. If it is not suspended, go to step 11.
Step 10: The guest OS performs its internal rescheduling and switches to the new guest OS task selected by the guest OS scheduler. Execution is finished.
Step 11: A software interrupt (SC instruction) is used to trap into the kernel; the kernel's software interrupt handling flow then returns to the point of the exception using the exception context saved by the kernel and the RFI instruction, and execution continues from there. Execution is finished.
The invention has the following advantages and effects:
1. User-level exception handling is performed by tasks, so most of the exception handling work moves into the partitions and the kernel is modified and extended as little as possible. This reduces kernel complexity, improves kernel safety and reliability, and improves exception handling efficiency.
2. Synchronous IPC is used to deliver the exception handling notification to the partition, which makes it easy to pass the exception information along and guarantees the processing logic by which the user-level exception handling task takes over the exception site.
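As a rough illustration of the kernel-side dispatch (steps 1 to 4) and the guest OS redirection (step 7) in the flow above, the following is a minimal Python sketch, not the patented implementation. Every name in it (`kernel_dispatch`, `guest_task_redirect`, `ExceptionSite`, `Partition`, the `inbox` list standing in for the IPC message buffer) and every numeric value (vector numbers, addresses) is a hypothetical choice made for the example; the text above does not specify any of them.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class PartitionKind(Enum):
    BARE_APP = auto()
    GUEST_OS = auto()


# Hypothetical vector numbers for exceptions the kernel handles entirely
# by itself (the text names reset and machine check as examples).
KERNEL_ONLY_VECTORS = {0x100, 0x200}


@dataclass
class ExceptionSite:
    """Saved state of the task that faulted (the "exception site")."""
    pc: int            # program counter at the point of the exception
    r3: int            # parameter register
    context_addr: int  # where the kernel saved the exception context


@dataclass
class IpcMessage:
    vector: int
    site: ExceptionSite


@dataclass
class Partition:
    partition_id: int
    kind: PartitionKind
    guest_handler_entry: int = 0               # only meaningful for guest OS partitions
    inbox: list = field(default_factory=list)  # stands in for the IPC message buffer


def kernel_dispatch(partitions, partition_id, vector, site):
    """Steps 1-4: decide who handles the exception and notify that partition."""
    if vector in KERNEL_ONLY_VECTORS:
        return "kernel"                     # step 1: handled entirely by the kernel
    message = IpcMessage(vector, site)      # step 2: fill the IPC message buffer
    target = partitions[partition_id]       # step 3: find the task by partition ID
    target.inbox.append(message)            # step 4: send; the site stays suspended
    return target.kind.name


def guest_task_redirect(partition):
    """Step 7: point the suspended exception site at the guest OS handler."""
    message = partition.inbox.pop(0)
    message.site.pc = partition.guest_handler_entry  # PC := handler entry
    message.site.r3 = message.site.context_addr      # R3 := exception context address
    return message.site
```

In this sketch, dispatching a non-kernel vector for a guest OS partition leaves one message in that partition's `inbox`, and `guest_task_redirect` then rewrites the saved `pc` and `r3` so that resuming the site enters the guest handler with the context address as its argument. Blocking, scheduling, and the return path through SC and RFI are deliberately left out.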
Drawings
FIG. 1 is a diagram of the unified exception handling framework of the partition operating system;
FIG. 2 is a flow diagram of the unified exception handling process of the partition operating system.
Detailed Description
The implementation of the present invention in a typical scenario of an embedded partition operating system is as follows:
a) The partition operating system kernel contains the exception system-level processing and distribution program unit EXPTION_SYS_PROCESS, and the system is configured with a guest OS partition A and a bare application partition B. The guest OS partition exception handling task unit GOS_EXP_TASK_A and the guest OS exception handler unit GOS_EXP_PROCESS_A reside in partition A, and the bare application partition exception handling task unit APP_EXP_TASK_B resides in partition B.
b) After the partition operating system starts, partition A's initialization creates and starts the exception handling task GOS_EXP_TASK_A, which enters its exception handling loop, blocks on the IPC receive operation, and waits to be activated.
c) Partition B's initialization first initializes the user-level exception handling routine table, then creates and starts the exception handling task APP_EXP_TASK_B, which enters its exception handling loop, blocks on the IPC receive operation, and waits to be activated. Afterwards, the partition B application registers its own exception handling routines in the user-level exception handling routine table.
d) When an exception occurs, the system-level processing and distribution program in the partition operating system kernel takes over. If the exception occurred in partition A, the kernel prepares to activate partition A's handling task GOS_EXP_TASK_A through IPC, and the flow goes to step e); if it occurred in partition B, the kernel prepares to activate partition B's handling task APP_EXP_TASK_B through IPC, and the flow goes to step h).
e) GOS_EXP_TASK_A is activated by the IPC send from the kernel, obtains the processor, redirects the exception site, and then blocks on IPC receive again, yielding the processor.
f) After the task GOS_EXP_TASK_A yields the processor, execution moves to the redirected exception site and the guest OS's exception handling, GOS_EXP_PROCESS_A, begins.
g) If the guest OS task in which the exception occurred is not suspended during GOS_EXP_PROCESS_A, it can return to the point of the exception and continue executing through the modified exception exit mechanism; otherwise, execution switches to a new guest OS task. This ends exception handling for the guest OS partition.
h) APP_EXP_TASK_B is activated by the IPC send from the kernel and obtains the processor. It looks up the user-level exception handling routine table and calls the corresponding routine registered by the user to complete user-level processing.
i) After processing completes, APP_EXP_TASK_B blocks on the IPC receive operation again and yields the processor. This ends exception handling for the bare application partition.
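The bare application side of this scenario (steps c, h and i) can be modelled roughly as follows. This is only a sketch under stated assumptions: a Python `queue.Queue` stands in for the kernel's IPC channel, a dictionary stands in for the user-level exception handling routine table, and `register_routine`, the `None` shutdown sentinel, and the "unhandled" result for an unregistered vector are all hypothetical additions that the description above does not define.

```python
import queue
import threading

# Hypothetical stand-in for the user-level exception handling routine table.
routine_table = {}


def register_routine(vector, routine):
    """Step c): the partition B application registers its own routine."""
    routine_table[vector] = routine


def app_exp_task_b(ipc_channel, results):
    """Model of APP_EXP_TASK_B: block on IPC receive, handle, block again."""
    while True:
        message = ipc_channel.get()   # blocks until the kernel side sends
        if message is None:           # sentinel, used only to end the demo
            break
        vector, context = message
        routine = routine_table.get(vector)
        if routine is not None:
            # Step h): call the routine the user registered for this vector.
            results.append(routine(context))
        else:
            # Assumption: the source does not say what happens when no
            # routine is registered; the sketch just records the vector.
            results.append(("unhandled", vector))
        # Step i): looping back to get() models blocking on IPC receive again.
```

A caller would register a routine, start `app_exp_task_b` on a thread, and then `put` `(vector, context)` tuples on the queue to play the role of the kernel's IPC send. Task priority and the switch back to the faulting task are outside what this model covers.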

Claims (2)

1. An exception handling framework for a partition operating system, characterized in that it comprises the following structural units:
a) an exception system-level processing and distribution program unit: resides in the kernel of the partition operating system, takes over the exception directly and performs basic processing, and uses IPC to distribute the exception event so as to activate the exception handling task unit of the bare application partition or of the guest OS partition to perform the corresponding user-level exception handling;
b) a bare application partition exception handling task unit: resides in the bare application partition and is responsible for looking up the user-level exception handling routine table and calling the corresponding routine registered by the user to complete user-level processing; the task is given the highest priority when it is created, enters a processing loop after being started, blocks on IPC receive, and waits to be notified and activated by the kernel's exception processing and distribution program unit;
c) a guest OS partition exception handling task unit: resides in the guest OS partition and is responsible for transferring exception handling to the exception handler unit of the guest OS while passing the exception context; the task is given the highest priority when it is created, enters a processing loop after being started, blocks on IPC receive, and waits to be notified and activated by the kernel's exception processing and distribution program unit;
d) a guest OS exception handler unit: comprises the original exception handler of the guest OS and a modified exception exit mechanism of the guest OS; after the guest OS has finished handling the exception, if the guest OS application task that triggered the exception has not been suspended, it can resume execution through this exit mechanism.
2. An exception handling method based on the exception handling framework of the partition operating system of claim 1, characterized in that the method comprises the following steps:
step 1: when an exception occurs, the exception system-level processing and distribution program unit takes over first and saves the basic exception context; if the exception type must be handled entirely by the kernel, the kernel performs system-level processing and execution is finished; otherwise, go to step 2;
step 2: for exception types that occur in a partition and can be handled by the partition, the exception system-level processing and distribution program unit writes the exception context and the exception vector number into the IPC message buffer;
step 3: the exception system-level processing and distribution program unit then identifies the ID of the partition in which the exception occurred and looks up the corresponding exception handling task by partition ID, the partition being a bare application partition or a guest OS partition;
step 4: the exception system-level processing and distribution program unit finally calls the combined IPC (inter-process communication) send-and-receive interface to send the exception information to the partition's exception handling task; once the message is confirmed as sent, the current exception site is suspended and the processor is yielded, and further processing waits until the partition's exception handling task has been switched in; if the exception occurred in a bare application partition, go to step 5, and if it occurred in a guest OS partition, go to step 7;
step 5: the IPC notification from the exception system-level processing and distribution program unit activates the bare application partition exception handling task unit, which then retrieves the exception context and the exception vector number from the message buffer; it then looks up the user-level exception handling routine table by exception vector number and calls the corresponding routine registered by the user to complete user-level processing;
step 6: after processing completes, the task blocks on IPC receive again and yields the processor; execution switches back to the task in which the exception occurred and continues from the point of the exception; processing is finished;
step 7: the IPC notification from the exception system-level processing and distribution program unit activates the guest OS partition exception handling task, which then retrieves the exception context from the message buffer, sets the program counter of the exception site to the entry of the guest OS exception handler unit, and sets the parameter register of the exception site to the address of the exception context;
step 8: the guest OS partition exception handling task blocks on IPC receive again and yields the processor; task scheduling and switching by the OS then resumes the exception site at the entry of the guest OS exception handler unit, and the guest OS's own exception handling begins;
step 9: the guest OS exception handler unit begins user-level exception handling; if the guest OS task in which the exception occurred is suspended during handling, go to step 10; if it is not suspended, go to step 11;
step 10: the guest OS performs its internal rescheduling and switches to the new guest OS task selected by the guest OS scheduling mechanism, and execution is finished;
step 11: a software interrupt is used to trap into the kernel and run the kernel's software interrupt handling flow, which returns to the point of the exception using the exception context saved by the kernel and the RFI instruction; execution continues from there, and execution is finished.
CN201711292568.5A 2017-12-07 2017-12-07 Unified exception handling method for partition operating system Active CN108255591B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711292568.5A CN108255591B (en) 2017-12-07 2017-12-07 Unified exception handling method for partition operating system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711292568.5A CN108255591B (en) 2017-12-07 2017-12-07 Unified exception handling method for partition operating system

Publications (2)

Publication Number Publication Date
CN108255591A (en) 2018-07-06
CN108255591B (en) 2021-10-15

Family

ID=62722374

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711292568.5A Active CN108255591B (en) 2017-12-07 2017-12-07 Unified exception handling method for partition operating system

Country Status (1)

Country Link
CN (1) CN108255591B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109542610B (en) * 2018-12-04 2023-06-30 中国航空工业集团公司西安航空计算技术研究所 Method for realizing virtual interrupt standard component of multi-partition operating system
CN112799776B (en) * 2020-12-31 2022-03-25 科东(广州)软件科技有限公司 Multi-partition operating system monitoring method and device, computing equipment and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20020070795A (en) * 2001-03-01 2002-09-11 인터내셔널 비지네스 머신즈 코포레이션 Method and system for eliminating duplicate reported errors in a logically partitioned multiprocessing system
CN103049318A (en) * 2011-10-11 2013-04-17 北京科银京成技术有限公司 Virtual suspension algorithm of subregion operation system
CN104461719A (en) * 2014-11-29 2015-03-25 中国航空工业集团公司第六三一研究所 Pseudo interrupt expanding method for partition operating system
CN104794393A (en) * 2015-04-24 2015-07-22 杭州字节信息技术有限公司 Embedded type partition image security certification and kernel trusted boot method and equipment thereof
CN105468434A (en) * 2015-12-11 2016-04-06 浪潮(北京)电子信息产业有限公司 Method and device for processing exception of virtual machine
CN105528276A (en) * 2015-12-09 2016-04-27 中国航空工业集团公司西安航空计算技术研究所 Partition operating system-based module support layer fault processing method for health monitoring
US9529661B1 (en) * 2015-06-18 2016-12-27 Rockwell Collins, Inc. Optimal multi-core health monitor architecture
CN106293986A (en) * 2016-08-12 2017-01-04 中国航空工业集团公司西安飞行自动控制研究所 A kind of failure monitoring processing means based on virtual interrupt and method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150261952A1 (en) * 2014-03-13 2015-09-17 Unisys Corporation Service partition virtualization system and method having a secure platform

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
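A method for adapting and verifying the C library of a microkernel partition operating system; Hao Jifeng et al.; "Aeronautical Computing Technique"; 20170531; full text *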

Also Published As

Publication number Publication date
CN108255591A (en) 2018-07-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant