WO2024007934A1 - Interrupt processing method, electronic device and storage medium - Google Patents

Publication number
WO2024007934A1
Authority
WO
WIPO (PCT)
Prior art keywords
interrupt
user
mode
thread
kernel
Application number
PCT/CN2023/103632
Inventor
廖畅
李政谕
郭寒军
Original Assignee
华为技术有限公司
Application filed by 华为技术有限公司
Publication of WO2024007934A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098 Register arrangements
    • G06F9/30105 Register structure
    • G06F9/32 Address formation of the next instruction, e.g. by incrementing the instruction counter
    • G06F9/322 Address formation of the next instruction, e.g. by incrementing the instruction counter for non-sequential address
    • G06F9/327 Address formation of the next instruction, e.g. by incrementing the instruction counter for non-sequential address for interrupts
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4812 Task transfer initiation or dispatching by interrupt, e.g. masked

Definitions

  • The present application relates to the field of interrupt control technology and, in particular, to an interrupt processing method, an electronic device, and a storage medium.
  • A network card device is managed by a software module running in kernel mode. Events generated on the network card device, such as message arrival and message transmission completion, generate hardware interrupts.
  • The typical processing flow of these interrupts is as follows: the interrupt is first processed by the operating system (OS) running in CPU kernel mode and then forwarded to the corresponding driver software for processing.
  • A user-mode application must rely on the kernel-mode driver software to exchange data with the network card device.
  • A typical scenario is an instant chat tool sending and receiving messages.
  • A system call is first executed to switch the CPU to kernel mode. After the OS processes the system call, the chat message data is handed over to the network card device through the driver for processing.
  • After the network card finishes processing, an interrupt is generated.
  • The interrupt is first processed by the OS and the driver; then, after the CPU is switched to user mode, the callback function registered by the application is called.
  • The application therefore needs several privilege level switches to complete one interaction with an input/output (IO) device: it enters kernel mode from user mode through a system call; when the hardware has successfully initiated the IO operation, it returns from kernel mode to user mode; when the hardware has completed data processing, it enters kernel mode to handle the interrupt; and finally it returns to user mode to execute the callback function registered by the application.
  • Because of these privilege level switches, each interaction between the application and the IO device adds considerable response delay. When data center traffic is particularly heavy, these delays greatly reduce the processing capability of the application, leading to network packet loss and poor user experience.
  • Embodiments of the present application provide an interrupt processing method, an electronic device, and a storage medium.
  • The interrupt processing method can reduce the number of unnecessary privilege level switches, thereby improving the interaction performance between applications and IO devices.
  • In a first aspect, embodiments of the present application provide an interrupt processing method, which includes: when it is determined that the interrupt thread is running on the processor, determining, based on the status information of the thread currently running on the processor as recorded in the user-mode interrupt memo information register, whether the current thread is running in user mode; and if so, delivering a user-mode interrupt to the processor, reading an instruction from the address stored in the user-mode interrupt vector register, and executing the interrupt processing function in user mode.
  • In this way, the user-mode interrupt processing function can be called directly in the address space of the processor, further reducing the cost of software responding to user-mode interrupts.
  • Determining that the current thread is running in user mode based on the information recorded in the user-mode interrupt memo information register includes: when the thread-trapped-in-kernel flag of the user-mode interrupt memo information register is in a first state and the interrupt-thread-online flag of the register is in a second state, determining that the current thread is running in user mode.
  • The thread-trapped-in-kernel flag of the user-mode interrupt memo information register being in the first state may mean that the T value of the register is 0, and whether the interrupt thread in the user-mode interrupt memo information register
  • Before it is determined that the current thread is running in user mode, the method also includes: determining, based on the information recorded in the user-mode interrupt memo information register, whether the current thread is running in user mode; if the thread-trapped-in-kernel flag of the register is not in the first state and/or the interrupt-thread-online flag of the register is not in the second state, filling the interrupt processing context of the interrupt service routine table entry into the user-mode interrupt context register and delivering a doorbell interrupt to the processor.
  • Before it is determined that the interrupt thread is running on the processor, the method also includes: determining whether the pending interrupt list in the interrupt service table matches the interrupt information to be processed by the thread recorded in the user-mode interrupt memo information register; if they match, determining that the interrupt thread is running on the processor.
  • Otherwise, the interrupt service context of the interrupt service routine entry is filled into the user-mode interrupt context register, and a kernel doorbell interrupt is delivered to the processor.
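The two checks described above (is the interrupt thread running on the processor, and is it currently in user mode) can be sketched as a small decision function. This is an illustrative model only, not the hardware logic of the application: the class and field names (`MemoRegister`, `ServiceEntry`, `route`, `regs["context"]`) are hypothetical, and encoding the second state of the online flag as 1 is an assumption, since the text only states that the first state of the T flag may be 0.

```python
from dataclasses import dataclass
from enum import Enum


class Delivery(Enum):
    USER_INTERRUPT = "user-mode interrupt"
    DOORBELL = "doorbell interrupt"
    KERNEL_DOORBELL = "kernel doorbell interrupt"


@dataclass
class MemoRegister:
    """Hypothetical model of the user-mode interrupt memo information register."""
    thread_interrupts: frozenset  # interrupt information handled by the running thread
    trapped_in_kernel: int        # T flag; the text gives 0 as the first state
    online: int                   # online flag; 1 as the second state is an assumption


@dataclass
class ServiceEntry:
    """Hypothetical model of one interrupt service table entry."""
    pending_interrupts: frozenset
    context_addr: int             # interrupt context kept in the entry


def route(entry, memo, regs):
    """Pick a delivery path; regs stands in for the user-mode interrupt context register."""
    # Check 1: is the interrupt thread the one running on this processor?
    if entry.pending_interrupts != memo.thread_interrupts:
        regs["context"] = entry.context_addr
        return Delivery.KERNEL_DOORBELL
    # Check 2: is that thread in user mode right now?
    if memo.trapped_in_kernel == 0 and memo.online == 1:
        return Delivery.USER_INTERRUPT
    # The thread is on this processor but not in user mode: hand over the context.
    regs["context"] = entry.context_addr
    return Delivery.DOORBELL
```

Only the first outcome avoids the kernel entirely; in the other two the context register is filled first so that the kernel can locate the interrupt context when the doorbell arrives.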
  • Before it is determined that the interrupt thread is running on the processor, the method also includes: when the processor traps into kernel mode, checking the value of the non-safe context register at a security detection point, where the security detection point is the kernel exception handling entry; if the processor is determined to be in a non-safe reentrant state, saving the kernel context in a set memory area and recording the address of that memory in the suspended context register; restoring the processor to the interrupted scene based on the address of the set memory recorded in the suspended context register, and letting the interrupted kernel continue executing until the safe reentry point, where the safe reentry point is the point at which the kernel returns to user mode; and when the kernel reaches the safe reentry point, restoring the suspended kernel context through the suspended context register.
  • A conventional kernel returns to user mode only after exception and system call handling is complete.
  • The embodiment of the present application allows the kernel to return to user mode quickly to respond to an interrupt without finishing these operations. This leaves the kernel in a non-safe reentrant state: if the user-mode interrupt handling function traps into the kernel again, the data integrity of the kernel faces greater uncertainty.
  • The embodiment of the present application uses the UCR register at the entry point of exceptions and system calls to restore the kernel to the interrupted scene; after the kernel again reaches a safe reentrant state (that is, exception and system call processing is complete and the kernel is ready to return to user mode), the OCR register is used to start processing the exceptions and system calls issued from the user-mode interrupt handling function.
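The deferral idea behind safe reentry can be illustrated with a toy model. It reflects one reading of the description above rather than the actual register semantics: the `Kernel` class, its method names, and the use of plain attributes in place of the non-safe context register and the suspended context register are all assumptions made for illustration.

```python
class Kernel:
    """Toy model: defer a nested kernel entry until the safe reentry point."""

    def __init__(self):
        self.unsafe = False      # stands in for the non-safe context register
        self.interrupted = None  # kernel work that was left unfinished
        self.suspended = None    # stands in for the suspended context register
        self.log = []

    def leave_early_for_user_interrupt(self, interrupted_ctx):
        # The kernel returns to user mode before finishing its current work,
        # so it is now in a non-safe reentrant state.
        self.unsafe = True
        self.interrupted = interrupted_ctx

    def trap(self, new_ctx):
        # Security detection point: the kernel exception handling entry.
        if not self.unsafe:
            self.log.append(f"handle {new_ctx}")
            return
        # Non-safe state: park the new request and finish the old work first.
        self.suspended = new_ctx
        self.log.append(f"resume {self.interrupted}")
        self.reach_safe_reentry_point()

    def reach_safe_reentry_point(self):
        # The kernel is about to return to user mode, so reentry is safe again.
        self.unsafe = False
        self.interrupted = None
        if self.suspended is not None:
            ctx, self.suspended = self.suspended, None
            self.log.append(f"handle {ctx}")
```

For example, if the kernel leaves "syscall A" unfinished and the user-mode handler then issues "syscall B", the model resumes A first and handles B only once the safe reentry point is reached.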
  • In a second aspect, embodiments of the present application further provide an interrupt controller, including a processor and a memory, where the memory is used to store at least one instruction that is loaded and executed by the processor to implement the interrupt processing method provided in the first aspect.
  • In a third aspect, embodiments of the present application also provide an interrupt controller, including a user-mode interrupt routing engine and a user-mode interrupt quick call engine.
  • The user-mode interrupt quick call engine includes a user-mode interrupt memo information register, which is used to record the status information of the thread currently running on the processor. When the interrupt controller determines that the interrupt thread is running on the processor and, based on the status information recorded in the user-mode interrupt memo information register, determines that the current thread is running in user mode, a user-mode interrupt is delivered to the processor, an instruction is read from the address stored in the user-mode interrupt vector register, and the interrupt processing function is executed in user mode.
  • The user-mode interrupt quick call engine also includes a user-mode interrupt context register for recording memory addresses. Based on the information recorded in the user-mode interrupt memo information register, if the thread-trapped-in-kernel flag of the register is not in the first state and/or the interrupt-thread-online flag of the register is not in the second state, the interrupt processing context of the interrupt service routine table entry is filled into the user-mode interrupt context register and a doorbell interrupt is delivered to the processor.
  • The user-mode interrupt memo information register and the user-mode interrupt context register introduced in the interrupt controller provided by the embodiment of the present application allow the kernel to switch quickly to the address space where the interrupt processing thread is located.
  • Embodiments of the present application further provide an electronic device, which may include the interrupt controller provided in the second aspect.
  • Embodiments of the present application further provide an electronic device, which may include the interrupt controller provided in the third aspect.
  • Embodiments of the present application further provide a computer-readable storage medium on which a computer program is stored.
  • When the computer program is executed by a processor, the interrupt processing method provided in the first aspect is implemented.
  • Figure 1 is a schematic diagram comparing the processing flow of kernel-mode interrupts and user-mode interrupts in related technologies;
  • Figure 2 is a schematic diagram of the interrupt delivery architecture in related technologies;
  • Figure 3 is a flow chart of user-mode interrupt delivery of the interrupt controller shown in Figure 2;
  • Figure 4 is a schematic diagram of another interrupt flow in related technologies;
  • Figure 5 is a system architecture applicable to the embodiment of the present application.
  • Figure 6 is a user-mode interrupt hardware architecture diagram provided by an embodiment of the present application.
  • Figure 7 is a schematic diagram of a user-mode interrupt routing engine provided by an embodiment of the present application.
  • Figure 8 is a schematic flow diagram of user mode interrupt routing hardware provided by an embodiment of the present application.
  • Figure 9 is a schematic diagram of a user mode interrupt quick call hardware module provided by an embodiment of the present application.
  • Figure 10 is a schematic diagram of a hardware delivery user mode interrupt process provided by an embodiment of the present application.
  • Figure 11 is a schematic diagram of software module classification provided by an embodiment of the present application.
  • Figure 12 is a schematic diagram of an interrupt event management module provided by an embodiment of the present application.
  • Figure 13 is a schematic diagram of an interrupt address space switching module provided by an embodiment of the present application.
  • Figure 14 is a schematic diagram of a secure reentrant module provided by an embodiment of the present application.
  • Figure 15 is a schematic flow diagram of the DPDK interrupt mode provided by an embodiment of the present application.
  • Figure 16 is a schematic diagram of a software framework provided by an embodiment of the present application.
  • Figure 17 is a delay comparison diagram between DPDK interrupt mode and user mode interrupt provided by the embodiment of the present application.
  • Microkernel architecture: a kernel architecture in which only the most essential kernel components, for example the permission management module and task scheduling, are retained in kernel mode, while traditional kernel components such as the file system, interrupt processing framework, and network protocol stack are placed in user mode.
  • Macrokernel architecture: a kernel architecture in which all kernel components, for example the IPC module, permission management module, file system, interrupt processing framework, and network protocol stack, are placed in kernel mode; user mode is only responsible for running application code.
  • Thread context: the various states of a thread maintained by the operating system (OS), including the general-purpose registers, page tables, thread-private space, and thread metadata used by the thread.
  • CPU privilege level: existing chips support two privilege levels, user mode and kernel mode. Different software systems run at different privilege levels and have different access rights to the underlying hardware. For example, the OS runs at the high privilege level of the CPU, while applications such as editors and web browsers run at the low privilege level. When applications need to interact with hardware, for example to print a file through a printer or to edit a file through the keyboard, they rely on high-privilege software to complete these tasks.
  • Kernel driver: a computer contains various hardware devices, such as a keyboard, a network card, and a hard disk. Each hardware device relies on a different software module to operate. These software modules are part of the OS and run at the high privilege level of the CPU.
  • Interrupt: a hardware-triggered event that interrupts the execution flow of the CPU, which then continues execution from a specific kernel driver.
  • Context saving and restoration: after the execution flow of the CPU is interrupted by hardware, in order to return to the interrupted point and continue execution, software saves the contents of the General Purpose Registers (GPR) and Control Status Registers (CSR) at the time of interruption to memory. When software later restores the backed-up data to the GPRs and CSRs, the original execution flow of the CPU resumes.
  • Kernel reentrancy: the kernel can be interrupted at any time to execute another piece of code, and that code can call the kernel again while the kernel state remains correct.
  • Task migration: in a multi-CPU system, when the workload of a CPU is too heavy, the OS migrates tasks from that CPU's run queue to the run queues of other CPUs.
  • The network card device is managed by a software module running in kernel mode. Events generated on the network card device, such as message arrival and completion of message sending, generate hardware interrupts.
  • The typical processing flow of these interrupts is: the interrupt is first processed by the OS running in CPU kernel mode and then forwarded to the corresponding driver software for processing.
  • User-mode applications rely on kernel-mode driver software to exchange data with the network card device. For example, when an instant chat tool sends and receives messages, a system call is first executed to switch the CPU to kernel mode. After the OS processes the system call, the chat message data is handed over to the network card device through the driver. After the network card finishes processing, an interrupt is generated. The interrupt is first processed by the OS and the driver; the CPU is then switched to user mode and the callback registered by the application is called.
  • The application needs several privilege level switches to complete one interaction with the IO device: it enters kernel mode from user mode through a system call; when the hardware has successfully initiated the IO operation, it returns from kernel mode to user mode; when the hardware has completed data processing, it enters kernel mode to handle the interrupt; finally, it returns to user mode to execute the callback function registered by the application. Because of these privilege level switches, each interaction between the application and the IO device adds considerable response delay. When data center traffic is particularly heavy, these delays greatly reduce the processing capability of the application, leading to network packet loss and poor user experience. The strategy for improving the interaction performance between applications and IO devices is therefore to eliminate as many unnecessary privilege level switches as possible.
  • Figure 1 is a schematic diagram comparing the processing flow of kernel-mode interrupts and user-mode interrupts in related technologies.
  • The user-mode interrupt-based architecture is a widely used technology for reducing IO response latency.
  • Most user-mode interrupt implementations support triggering the interrupt while the CPU is running a specific application. Through management data objects in memory, the hardware automatically sends the interrupt to the CPU running that application, avoiding the path in which the interrupt is first processed by the OS in kernel mode and the user-mode callback function is called afterwards. Therefore, with user-mode interrupt support, the application only needs to initiate the hardware operation.
  • When the hardware triggers a user-mode interrupt, the application's instruction flow can jump directly to the user-mode callback function, eliminating at least two privilege level switches and the task scheduling overhead. When the data-intensive business load is very heavy, the delay spent on processing interrupts can be reduced by a factor of 10 to 30.
  • Figure 2 is a schematic diagram of the interrupt delivery architecture in related technologies.
  • The interrupt controller adopts an Event-Based Branches (EBB) scheme.
  • The interrupt controller contains two functional components, as follows:
  • Interrupt Source Controller (ISC)
  • Interrupt Presentation Controller (IPC)
  • The interrupt controller shown in Figure 2 also defines four types of memory data to support interrupt delivery and interrupt response, as follows:
  • Each interrupt source is allocated a management object, the event state buffer (ESB), in memory by software.
  • Each interrupt response target (for example, a virtual machine hypervisor, a kernel-mode OS, or a user-mode process) is allocated a management object, the Event Notification Descriptor (END), in memory by software. Note that OSs running on different physical threads have different ENDs.
  • Software also allocates a management object, the event assignment structure (EAS), in memory to describe the strategy for delivering interrupts.
  • Figure 3 is a flow chart of user-mode interrupt delivery of the interrupt controller shown in Figure 2.
  • The interrupt controller uses the EBB mechanism to allow a user-mode process to respond directly to Performance Monitor Unit (PMU) interrupts. The PMU collects the counters of each functional component in the CPU.
  • The application first registers the user-mode PMU interrupt through the OS; this operation creates three objects in memory: ESB, EAS, and END.
  • The user-mode interrupt delivery process includes the following steps:
  • Step 301: When the ISC module of the interrupt controller receives a user-mode interrupt, it obtains the ESB object data corresponding to the interrupt request number from memory, obtains from it the END object data and EAS object data of the user-mode thread, encapsulates these two pieces of data into an ENM message, and sends the message to the IPC module of the interrupt controller.
  • Step 302: The IPC module parses the END data of the user-mode thread from the ENM message and compares it with the User TIMA of all physical threads. A match means that the process running on that physical thread belongs to the specific address space; the execution flow of the process running on this physical thread is interrupted, and the program pointer register (Program Counter, PC) jumps to the user-mode interrupt entry address. The program pointer register indicates the address from which the CPU is currently fetching instructions.
  • Step 303: If the END data matches none of the User TIMAs, the target thread is not running. The IPC then uses the EAS data to obtain the address of the END object of the Escalate interrupt; this object corresponds to an OS interrupt processing function running on a specific physical thread. The IPC compares this END object with the OS TIMA of all physical threads and delivers the user-mode interrupt to the OS on the matching physical thread, which responds to it.
  • Step 304: After the OS responds to the user-mode interrupt, it wakes up the corresponding user-mode thread according to the interrupt information, fills the END object corresponding to the user-mode thread into the User TIMA, and sets the interrupt Pending flag.
  • Step 305: When the thread returns to user mode, the chip detects through the User TIMA that a user-mode interrupt is pending, and the PC jumps to the user-mode interrupt processing entry address.
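Steps 302 to 305 amount to a two-level match followed by a deferred jump, which the following sketch models in software. It is a simplified illustration, not the controller's implementation: END objects are represented as plain strings, and `PhysicalThread`, `deliver`, `os_wake`, and `return_to_user` are hypothetical names. Raising `LookupError` when nothing matches is also an assumption, because the text does not describe that case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PhysicalThread:
    user_tima: Optional[str]  # END of the user-mode thread running here, if any
    os_tima: str              # END of the OS running on this physical thread
    pending: bool = False     # user-mode interrupt Pending flag


def deliver(threads, user_end, escalate_end):
    """Return (thread index, responder) for an incoming user-mode interrupt."""
    # Step 302: compare the user END with the User TIMA of every physical thread.
    for i, t in enumerate(threads):
        if t.user_tima == user_end:
            return i, "user"
    # Step 303: no match, so escalate to the OS whose OS TIMA matches.
    for i, t in enumerate(threads):
        if t.os_tima == escalate_end:
            return i, "os"
    raise LookupError("no physical thread matches the END objects")


def os_wake(thread, user_end):
    # Step 304: the OS fills the User TIMA and sets the Pending flag.
    thread.user_tima = user_end
    thread.pending = True


def return_to_user(thread):
    # Step 305: on return to user mode, a pending interrupt triggers the jump.
    jump = thread.pending
    thread.pending = False
    return jump
```

The model makes the cost of the slow path visible: when the target thread is not running, the interrupt is answered only after the OS has run and the thread has been brought back to user mode.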
  • For a user-mode thread that is not online, the interrupt controller's strategy is likewise to wake up the user-mode thread through higher-privileged software (the OS, that is, the kernel driver of the interrupt controller) and to respond to the user-mode interrupt only after the thread returns to user mode.
  • Only PMU interrupts are supported in user mode; other device interrupts are not currently supported. In addition, only the currently running process can handle interrupts. To ensure that interrupts are responded to in a timely manner, the process must run in polling mode, which results in high CPU power consumption and low utilization.
  • Figure 4 is a schematic diagram of another interrupt flow in the related art.
  • User mode can directly trigger a hardware interrupt (called User IPI in the instruction set), and a specific thread is guaranteed to respond to the interrupt (called User Interrupt in the instruction set). Software does not need a privilege level switch to respond to the interrupt. The User Interrupt mechanism is mainly implemented around the following two key memory objects:
  • User Posted-Interrupt Descriptor (UPID)
  • User Interrupt Target Table (UITT)
  • Software triggers a user-mode interrupt through the "SENDUIPI index" instruction, which can be executed in Ring3.
  • The hardware uses UITTSIZE to check the validity of the index, ensuring that the index does not exceed the number of UITT entries, and then uses UITTADDR to obtain the virtual address of the UITT entry.
  • The UITT entry contains two types of information, UIV and UPIDADDR, where UIV is the interrupt request number.
  • User-mode interrupt processing selects different processing logic based on the UIV.
  • UPIDADDR is used to access the UPID object corresponding to the target thread. The hardware records the UIV in the UPID.PIR field and selects the target CPU for interrupt routing based on the UPID.Notification_destination field.
  • The hardware sends the UPID.Notification_vector field to the CPU specified by UPID.Notification_destination. After the APIC of the target CPU receives the Notification_vector, it compares it with the local UINV MSR. If they are the same, the APIC delivers the interrupt as a user-mode interrupt.
  • The hardware accesses the UPID object of the thread currently running on the CPU through the UPIDADDR register and fills the value of UPID.PIR into the UIRR register. It then sets the PC to the value of the UIHANDLER register and the stack pointer register (Stack Pointer, SP) to the value of the UISTACKADJUST register and, after saving the registers, jumps to UIHANDLER.
  • The stack pointer register indicates the top address of the stack used by the current function.
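The sender and receiver halves of this flow can be modelled in a few lines. The sketch below is a deliberately simplified software model using only the fields named in the text; it is not the instruction's architectural definition. Representing UPIDADDR as a direct object reference, using the list length as UITTSIZE, and clearing PIR after it is copied into UIRR are modelling assumptions, and stack adjustment and register saving are omitted.

```python
from dataclasses import dataclass


@dataclass
class UPID:
    notification_vector: int
    notification_destination: int
    pir: int = 0  # posted interrupt requests, one bit per UIV


@dataclass
class UITTEntry:
    uiv: int    # interrupt request number
    upid: UPID  # stands in for UPIDADDR


def senduipi(uitt, index):
    """Model of 'SENDUIPI index': post the UIV and pick the notification target."""
    if not 0 <= index < len(uitt):  # validity check that UITTSIZE provides
        raise IndexError("index exceeds the number of UITT entries")
    entry = uitt[index]
    if not 0 <= entry.uiv < 64:     # at most 64 request numbers per thread
        raise ValueError("UIV out of range")
    entry.upid.pir |= 1 << entry.uiv
    return entry.upid.notification_destination, entry.upid.notification_vector


@dataclass
class CPU:
    uinv: int           # local UINV value
    current_upid: UPID  # reached through the UPIDADDR register
    uihandler: int      # value of the UIHANDLER register
    uirr: int = 0
    pc: int = 0

    def receive(self, vector):
        """Return True if the vector is delivered as a user-mode interrupt."""
        if vector != self.uinv:
            return False            # not a user-interrupt notification
        self.uirr |= self.current_upid.pir
        self.current_upid.pir = 0   # modelling choice: consume the posted bits
        self.pc = self.uihandler
        return True
```

A sender would call `senduipi(uitt, 0)` and pass the returned vector to the `receive` method of the destination CPU; the handler then inspects `uirr` to see which request numbers fired.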
  • The local UITTADDR and UITTSIZE registers need to be updated.
  • The local UPIDADDR, UIHANDLER, UISTACKADJUST and UINV registers of the CPU also need to be updated.
  • Each thread can handle up to 64 different user-mode interrupt request numbers, and the CPU must be running in Ring3 (user mode) to respond to interrupts. If the thread is blocked in Ring0 (kernel mode) or has been scheduled out, the OS first responds to the interrupt and then schedules the target thread.
  • Embodiments of the present application provide a hardware-software co-design solution: a hardware architecture covering interrupt routing and interrupt delivery, together with a software solution (that is, an interrupt processing method) built on that hardware architecture. The method supports fast calling of user-mode functions in the underlying hardware and ensures that the kernel is safely reentrant.
  • A complex address space and privilege level switching process is required before the user-mode interrupt processing function can be called.
  • This solution saves the context data on which address space and privilege level switching depend in a set of hardware registers. The kernel can use this context data to switch quickly, avoiding the kernel task wake-up and scheduling process and reducing the overhead of the switch.
  • This solution also ensures that the kernel is safely reentrant: even if kernel execution is interrupted by a user-mode interrupt, the kernel can still provide services normally.
  • Figure 5 is a system architecture applicable to the embodiment of the present application.
  • the system architecture includes an application layer, a user state library layer, a kernel layer and a hardware layer.
  • the description of each layer is as follows:
  • Application layer runs various applications.
  • the optimization solution provided by this application is transparent to this layer and ensures compatibility.
  • User-mode library layer: contains two modules, the user-mode interrupt registration module and the user-mode interrupt context switching module. These two modules provide an interface for registering user-mode interrupt handling functions and encapsulate the common interrupt context switching flow, making it available to all programs that need to handle interrupts in user mode.
  • Kernel layer: contains four modules, the "user-mode interrupt routing engine driver", the "user-mode interrupt event management module", the "kernel safe reentrant module", and the "interrupt address space switching module".
  • "User mode interrupt routing engine driver” is responsible for allocating and configuring the interrupt event routing list (Interrupt Routing Table, IRT) and interrupt service object list (Interrupt Service Table, IST) to ensure that user mode interrupts can be responded to by the correct user mode thread
  • the "User Mode Interrupt Event Management Module” is responsible for encapsulating hardware interrupts into an event resource of the kernel.
  • The "kernel safe reentrant module" and the "interrupt address space switching module" handle fast invocation of user-mode interrupt processing functions and the case where a user-mode interrupt traps into the kernel. They also provide hook functions in the kernel scheduling module to prevent user-mode interrupt processing tasks from migrating to other processors.
  • the processor involved in the following embodiments of this application is explained by taking a CPU as an example and is not limited to the CPU.
  • the user-mode interrupt engine (Reentrant User Interrupt, RUI) provides a user-mode interrupt routing engine and a user-mode interrupt calling engine.
  • An exemplary technical problem is: "When the interrupt processing task is blocked or running in kernel mode, task scheduling and privilege level switching are required before the user-mode interrupt processing function can be called safely, and this process introduces a large delay."
  • The hardware layer provides hardware mechanisms for interrupt routing based on user-mode tasks and for quick invocation of interrupt processing functions. The software layer drives these hardware functions and provides a common interface for applications, making it easy for applications to use user-mode interrupts to improve I/O processing performance.
  • the thread-targeted interrupt routing mechanism can use the Interrupt Service Table and Interrupt Router Table and related hardware logic to dynamically select the CPU that responds to the interrupt based on the allowed running status and location of the thread.
  • Figure 6 is a user-mode interrupt hardware architecture diagram provided by an embodiment of the present application.
  • the interrupt hardware architecture includes a user-mode interrupt routing engine and a user-mode interrupt fast calling engine.
  • the first part 61 represents the hardware interface of the user-space interrupt routing engine
  • the second part 62 represents the user-space interrupt routing hardware logic
  • The third part 63 represents the in-memory table objects that the user-space interrupt routing engine relies on; software allocates and builds the memory tables and configures the table information through the hardware interface of the first part 61.
  • the fourth part 64 represents the hardware registers related to user mode interrupt quick call.
  • User-mode interrupt routing engine (Router Engine): unlike interrupts processed by the kernel, the processing function of a user-mode interrupt belongs to a specific address space.
  • This address space may migrate dynamically across multiple CPUs.
  • With the architecture provided by this application, the hardware selects a CPU to handle the user-mode interrupt, ensuring that the address space running on that CPU can directly call the interrupt processing function.
  • A memory data query module, the Router Engine, is introduced in the hardware structure. This module queries two kinds of in-memory table data: the Interrupt Router Table and the Interrupt Service Table.
  • the hardware queries the Interrupt Router Table based on the interrupt request number to obtain the thread identifier that handles the interrupt, then queries the Interrupt Service Table based on the thread identifier to obtain the CPU that handles the interrupt, and finally delivers the interrupt to the CPU.
  • FIG. 7 is a schematic diagram of a user-mode interrupt routing engine provided by an embodiment of the present application.
  • the user-mode interrupt routing engine may be a device mounted on the system bus.
  • the software programming interface of the user-mode interrupt routing engine includes:
  • irtbar interrupt routing table base address register
  • This register holds a physical address that serves as a base address and points to a contiguous region of memory, namely the interrupt routing table (Interrupt Router Table).
  • This memory area (interrupt routing table) stores information about all user-mode interrupt sources in the system, specifically including valid bits (Valid), route type (Route Type), event number (Event ID), and service number (Service ID).
  • Route Type currently supports only one type, User-mode, and can be extended with more type values for specific scenarios.
  • istbar interrupt service table base address register
  • This register holds a physical address that serves as a base address and points to a contiguous region of memory, namely the interrupt service table.
  • This memory area stores information about all user-mode interrupt processing threads in the system, specifically the valid bit (Valid), the interrupt service context (Interrupt Service Context), the pending interrupt list (Interrupt Pending Buffer), and the physical processor number (Physical Processor ID).
  • bcbbar (byte code buffer base address register): a memory-mapped register in the device configuration space. This register holds a physical address that serves as a base address and points to a contiguous region of memory (the byte code buffer, BCB). This memory area stores the bytecode sequence executed by the Router Engine.
  • The Router Engine performs all updates to the IRT and IST. Software controls how the hardware updates the IRT and IST by filling bytecode into the BCB.
  • bcbsize (byte code buffer size): It is a Memory-mapped register in the device configuration space. This register saves the length information of the BCB, including the length of each byte code and the length of the entire BCB.
  • bcbcurp (byte code buffer current pointer): It is a Memory-mapped register in the device configuration space. This register saves the offset in the BCB of the byte code being executed by the Router Engine.
  • bcbendp (byte code buffer end pointer): It is a Memory-mapped register in the device configuration space. This register saves the offset in the BCB of the byte code being updated by the Router Engine.
  • The memory-mapped registers above are mainly used to maintain three structures: the Interrupt Router Table, the Interrupt Service Table, and the Byte Code Buffer. These three structures are built and accessed by the kernel and must not be accessed from user mode.
  • the Interrupt Router table has many entries. Each entry corresponds to a user-mode interrupt source.
  • the entry information includes the valid bit (Valid), interrupt routing type (Route Type), event number (Event ID), and service number (Service ID).
  • The default length of each entry is 64 bits.
  • Event ID and Service ID each occupy 16 to 24 bits.
  • Route Type occupies 3 to 7 bits.
  • Valid occupies 1 bit. These widths can be adjusted as needed in different implementations.
  • The base address and length of the table should be page-aligned.
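  • The entry layout above can be illustrated with a small packing sketch. The text gives only field widths, so the bit positions used here (Valid at bit 0, a 7-bit Route Type, 24-bit Event ID and Service ID, the top 8 bits reserved) and the function names are assumptions made for illustration, not the layout defined by this application.

```python
# Hypothetical 64-bit IRT entry layout (bit positions are assumptions):
#   bit 0       Valid
#   bits 1-7    Route Type (7 bits)
#   bits 8-31   Event ID   (24 bits)
#   bits 32-55  Service ID (24 bits)
#   bits 56-63  reserved
ROUTE_TYPE_BITS = 7
ID_BITS = 24
ROUTE_TYPE_USER_MODE = 1  # the only route type the text describes


def pack_irt_entry(valid, route_type, event_id, service_id):
    """Pack the four fields into one 64-bit integer."""
    if not 0 <= route_type < (1 << ROUTE_TYPE_BITS):
        raise ValueError("route_type out of range")
    if not 0 <= event_id < (1 << ID_BITS):
        raise ValueError("event_id out of range")
    if not 0 <= service_id < (1 << ID_BITS):
        raise ValueError("service_id out of range")
    return int(bool(valid)) | (route_type << 1) | (event_id << 8) | (service_id << 32)


def unpack_irt_entry(entry):
    """Split a packed entry back into its fields."""
    return {
        "valid": bool(entry & 1),
        "route_type": (entry >> 1) & ((1 << ROUTE_TYPE_BITS) - 1),
        "event_id": (entry >> 8) & ((1 << ID_BITS) - 1),
        "service_id": (entry >> 32) & ((1 << ID_BITS) - 1),
    }
```

  • With these assumed widths the four fields use 56 bits, so the entry fits the default 64-bit length with room to spare.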
  • the Interrupt Service Table also has many entries. Each entry corresponds to an interrupt processing thread.
  • The entry information includes the thread's valid bit (Valid), the interrupt service context (Interrupt Service Context), the pending interrupt list (Interrupt Pending Buffer), and the physical processor number (Physical Processor ID). The Interrupt Service Context points to a memory block that must be able to hold a set of PC, SP, and thread page table base address values, and the Interrupt Pending Buffer points to a memory block.
  • The length of the Physical Processor ID is determined by the system CPU topology. The base address and length of the table are required to be page-aligned.
  • the Byte Code Buffer is a continuous memory block that records bytecodes. Each bytecode is 32 bits in length. The base address and length of the BCB are required to be page aligned.
  • FIG. 8 is a schematic flowchart of user-mode interrupt routing hardware provided by an embodiment of the present application. Referring to Figure 8, the process may include the following steps:
  • Step 801 The Router Engine receives the interrupt message from the interrupt source over the bus, parses the user-mode interrupt request number from the message through the Message Decoder, and sends the interrupt request number to the Selector (for IRT).
  • Step 802 Selector reads the entry corresponding to the interrupt request number from the Interrupt Router Table in the memory.
  • Step 803 Determine whether the interrupt request number exceeds the range. If it exceeds the range, execute step 804. If it does not exceed the range, execute step 805.
  • Step 804 End the interrupt processing flow.
  • Step 805 Pass the entry data to the Interrupt Router Entry Decoder.
  • Step 806 The Interrupt Router Entry Decoder parses out each field of the entry and determines whether the valid bit (Valid) is 0. If the valid bit is 0, execute step 804. If the valid bit is not 0, execute step 807.
  • Step 807 Pass the Event ID to the Arbiter and the Service ID to the Selector (for IST).
  • Step 808 The Selector reads the entry corresponding to the Service ID from the Interrupt Service Table in memory, parses each field, and determines whether the valid bit (Valid) is 0. If the valid bit is 0, execute step 804. If the valid bit is not 0, execute step 809.
  • Step 809 Send the Physical Processor ID, Interrupt Service Context and Interrupt Pending Buffer fields to Arbiter.
  • Step 810 Arbiter writes the Event ID into the memory block pointed to by Interrupt Pending Buffer.
  • Step 811 Compare whether the Interrupt Pending Buffer and the GPR saved value of the CPU are the same. If they are the same, perform step 812. If they are not the same, perform step 813.
  • The Arbiter uses its internal comparators (Comparator, CMP) to determine whether the CPU corresponding to the Physical Processor ID is running the thread that provides the processing function for this interrupt. Specifically, this is confirmed by comparing the Interrupt Pending Buffer with the value saved in that CPU's GPR.
  • Step 812 Trigger a user-mode interrupt on the target CPU.
  • Step 813 Trigger the doorbell interrupt on the target CPU.
  • Step 814 If the CPU receives the user-mode interrupt, the CPU does not trap into kernel mode.
  • If the CPU receives the Doorbell interrupt, the CPU traps into kernel mode.
  • With the interrupt routing mechanism provided by this hardware, an interrupt processing thread can receive interrupt events directly in user mode whichever CPU it is running on. In other cases, the kernel is notified through the Doorbell interrupt to prepare a safe user-mode interrupt processing context.
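  • Steps 801 to 813 can be summarized in a short software model. This is only a sketch of the control flow, not the hardware itself: the class names, the `cpu_saved_pending` mapping (standing in for the value each CPU has saved for comparison in step 811), the `pending_memory` dictionary (standing in for memory), and the bounds check on the Service ID are all assumptions added for illustration.

```python
from dataclasses import dataclass


@dataclass
class IrtEntry:
    valid: bool
    route_type: int
    event_id: int
    service_id: int


@dataclass
class IstEntry:
    valid: bool
    service_context: int  # address of the Interrupt Service Context block
    pending_buffer: int   # address of the Interrupt Pending Buffer block
    processor_id: int     # Physical Processor ID


def route_interrupt(irq, irt, ist, cpu_saved_pending, pending_memory):
    """Return ('user', cpu) or ('doorbell', cpu), or None if the flow ends."""
    if irq < 0 or irq >= len(irt):            # step 803 -> step 804
        return None
    route = irt[irq]                          # steps 802 and 805
    if not route.valid:                       # step 806 -> step 804
        return None
    if route.service_id >= len(ist):          # guard added for this sketch only
        return None
    service = ist[route.service_id]           # steps 807 and 808
    if not service.valid:                     # step 808 -> step 804
        return None
    # Step 810: record the Event ID in the thread's pending buffer.
    pending_memory.setdefault(service.pending_buffer, set()).add(route.event_id)
    # Step 811: is the target CPU currently running the handling thread?
    if cpu_saved_pending.get(service.processor_id) == service.pending_buffer:
        return ("user", service.processor_id)      # step 812
    return ("doorbell", service.processor_id)      # step 813
```

  • Note that in this model the Event ID is recorded before the comparison, so the pending information is available to the thread whichever kind of interrupt is delivered.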
  • the embodiment of the present application can also quickly call the user mode interrupt processing function when the interrupt processing thread is in a blocked state or running in the kernel mode.
  • The hardware module provides a set of registers for quick invocation of user-mode interrupts: uicontext, uienable, uivector and uinoteinfo.
  • The hardware module also provides two registers: the unsafe context register (UCR) and the ongoing context register (OCR).
  • FIG. 9 is a schematic diagram of a user-mode interrupt quick call hardware module provided by an embodiment of the present application. Referring to Figure 9, the user-mode interrupt quick call registers can include the user-mode interrupt vector register (uivector), the user-mode interrupt enable register (uienable), the user-mode interrupt memo information register (uinoteinfo), and the user-mode interrupt context register (uicontext).
  • These four registers are described as follows:
  • uivector (user interrupt vector): This register records the virtual address of a user-mode interrupt processing function. When the CPU returns from kernel mode to user mode, it fetches and executes instructions from the address saved in uivector. When the CPU is executing in user mode and is interrupted by a user-mode interrupt, it likewise fetches instructions from the address saved in uivector. This register must not be accessed from user mode.
  • uienable (user interrupt enable): This register is used to mask user-mode interrupts. When the register value is 0, the CPU is not interrupted by user-mode interrupts. When the register value is 1, the CPU may be interrupted by user-mode interrupts. This register is accessible from user mode.
  • uinoteinfo (user interrupt noteinfo): This register records information about the interrupt processing thread running on the CPU. As shown in Figure 9, the register consists of three parts. Bit 0 indicates whether the thread running on the CPU can handle user-mode interrupts ("V" in Figure 9). Bit 1 indicates whether the thread running on the CPU is trapped in the kernel ("T" in Figure 9). The remaining bits record the address of a memory block in which all pending interrupts of the thread are registered ("Pending Buffer Address" in Figure 9).
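  • The three-part layout of uinoteinfo can be sketched as a pair of encode and decode helpers. The register width (64 bits) and the requirement that the pending buffer address be 4-byte aligned, so that bits 0 and 1 are free for V and T, are assumptions of this sketch; the helper names are hypothetical.

```python
V_BIT = 1 << 0  # thread on this CPU can handle user-mode interrupts
T_BIT = 1 << 1  # thread on this CPU is trapped in the kernel
ADDR_MASK = (2 ** 64 - 1) & ~0b11  # assumed: 64-bit register, low 2 bits are flags


def encode_uinoteinfo(v, t, pending_buffer_addr):
    """Build a uinoteinfo value from the V flag, T flag and buffer address."""
    if pending_buffer_addr & ~ADDR_MASK:
        raise ValueError("address must be 4-byte aligned and fit in 64 bits")
    return pending_buffer_addr | (T_BIT if t else 0) | (V_BIT if v else 0)


def decode_uinoteinfo(value):
    """Return (V, T, pending buffer address)."""
    return bool(value & V_BIT), bool(value & T_BIT), value & ADDR_MASK
```

  • For example, a thread that is on the CPU and trapped in the kernel, with its pending buffer at 0x2000, would be encoded as 0x2003 under these assumptions.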
  • uicontext (user interrupt context): This register records a memory block address.
  • the kernel uses the data in this memory address to perform fast address space switching.
  • The memory is allocated by the kernel and stores the page table base address (ATT_ADDR) of the address space where the interrupt processing function is located, the entry address of the interrupt processing function (UI_PC), and the stack pointer of the interrupt processing function (UI_SP).
  • ukcp unsafe kernel context pointer
  • okcp ongoing kernel context pointer
  • By introducing the uinoteinfo and uicontext registers, the user-mode interrupt fast calling mechanism lets the kernel quickly switch to the address space of the interrupt processing thread; by introducing the UCR and OCR registers, it keeps the kernel in a safely reentrant state at all times.
  • The uinoteinfo register holds the status of the thread currently running on the CPU, including the pending interrupt information (the pending buffer address field), the flag indicating that the interrupt thread is trapped in the kernel (uinoteinfo.T), and the flag indicating whether the interrupt thread is online (uinoteinfo.V). Together with the hardware interrupt routing module and the Interrupt Service Table entry, this allows the thread status to be determined quickly.
  • the benefit of distinguishing the two states of V and T is to avoid meaningless page table switching operations.
  • the uicontext register can also be used to speed up the process of user mode calls.
  • The hardware interrupt routing module automatically fills uicontext with the page table, PC and SP that interrupt handling depends on.
  • The benefit of this mechanism is that the kernel does not need to perform complex lookups.
  • The kernel uses the information provided by the hardware to complete the address space switch and the interrupt processing function call with minimal code.
  • The CPU can also respond quickly to user-mode interrupts while in kernel mode, which serves both energy-efficiency-sensitive scenarios such as DPDK and latency-sensitive scenarios such as user-mode IPC and RPC.
  • UCR kernel stack pointer backup register
  • OCR kernel stack pointer backup register
  • A kernel can execute user-mode functions only after it returns to user mode once exception or system call handling completes. This process often involves complex wake-up and thread scheduling.
  • With this mechanism, the kernel avoids the complex wake-up and scheduling overhead by obtaining the page table, PC and SP of the user-mode function directly through the uicontext register, completing the function call at minimal cost.
  • user-mode interrupts can also be triggered directly in user-mode applications to achieve fast IPC.
  • FIG 10 is a schematic diagram of a hardware delivery user mode interrupt process provided by an embodiment of the present application. Referring to Figure 10, the process includes the following steps:
  • Step 1001 Before delivering the interrupt to the CPU, the user mode interrupt routing engine (Router Engine) determines whether the pending buffer address in uinoteinfo and the Interrupt Pending Buffer of the IST entry are the same.
  • The interrupt processing thread can call the processing function directly only while it is running in user mode. Therefore, before the Router Engine delivers the interrupt to the CPU, it compares the contents of the user-mode interrupt memo information register (uinoteinfo) with the IST entry. If the contents differ, execute step 1002. If the pending buffer address in uinoteinfo is the same as the Interrupt Pending Buffer of the IST entry, the interrupt thread is running on the CPU, and step 1003 is executed.
  • Step 1002 The user-mode interrupt routing engine (Router Engine) fills the Interrupt Service Context of the IST entry into the uicontext register and then delivers a special kernel Doorbell interrupt to the CPU. The CPU traps into the kernel, and the kernel reads three items from the memory pointed to by the uicontext register: the base address of the page table in which the interrupt processing function resides, the PC of the interrupt processing function, and the stack frame of the interrupt processing function. The kernel uses these three items to switch the address space quickly.
  • Step 1003 Determine whether uinoteinfo.V is equal to 1. If equal to 1, proceed to step 1004. If not equal to 1, return to step 1002.
  • Step 1005 Determine whether uienable is equal to 1. If it is equal to 1, execute step 1006. If it is not equal to 1, execute step 1007.
  • Step 1006 Deliver a user-mode interrupt to the CPU, so that the CPU starts fetching instructions from the address in the uivector and directly executes the processing function in the user-mode.
  • Step 1007 Trap into the kernel to process the Doorbell interrupt.
  • The interrupt processing thread can call the processing function directly only while running in user mode, so the Router Engine compares the contents of uinoteinfo and the IST entry before delivering the interrupt to the CPU. If the pending buffer address in uinoteinfo is the same as the Interrupt Pending Buffer of the IST entry, the interrupt thread is running on the CPU. If uinoteinfo.T is equal to 0 and uinoteinfo.V is equal to 1, the current thread (that is, the thread that provides the processing function for the interrupt) is running in user mode, and the Router Engine delivers the user-mode interrupt to the CPU. The CPU fetches instructions starting from the address in uivector and executes the processing function directly in user mode.
  • Otherwise, the Router Engine fills the Interrupt Service Context of the IST entry into the uicontext register and then delivers a special kernel Doorbell interrupt to the CPU.
  • The CPU traps into the kernel, and the kernel reads from the memory pointed to by the uicontext register the base address of the page table in which the interrupt processing function resides, the PC of the interrupt processing function, and the stack frame of the interrupt processing function.
  • The kernel uses these three items to switch the address space quickly.
  • When the interrupt processing thread traps into the kernel, uinoteinfo.T should be set to 1.
  • When the interrupt processing thread returns to user mode, uinoteinfo.T should be set to 0.
  • When the interrupt processing thread is scheduled out of the CPU, uinoteinfo.V should be set to 0. When the interrupt processing thread is scheduled onto the CPU, the pending buffer address field of uinoteinfo must be updated and uinoteinfo.V must be set to 1.
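  • The delivery decision in steps 1001 to 1007 can be written as a single function. This is a minimal sketch: the function name and return strings are hypothetical, and because step 1004 does not appear in the step list, the check on uinoteinfo.T is taken from the explanatory paragraph (T equal to 0 and V equal to 1 means the thread is running in user mode) and should be read as an assumption.

```python
def choose_delivery(uinoteinfo_addr, v, t, uienable, ist_pending_buffer):
    """Return 'user_interrupt' or 'doorbell' for one pending interrupt.

    uinoteinfo_addr    -- pending buffer address field of the CPU's uinoteinfo
    v, t               -- uinoteinfo.V and uinoteinfo.T
    uienable           -- value of the CPU's uienable register (0 or 1)
    ist_pending_buffer -- Interrupt Pending Buffer field of the IST entry
    """
    # Step 1001: is the handling thread the one currently on this CPU?
    if uinoteinfo_addr != ist_pending_buffer:
        return "doorbell"          # step 1002 (after filling uicontext)
    # Step 1003: the thread must be online on this CPU.
    if not v:
        return "doorbell"          # back to step 1002
    # Assumed from the prose: a thread trapped in the kernel cannot be
    # interrupted directly in user mode.
    if t:
        return "doorbell"
    # Step 1005: user-mode interrupts must not be masked.
    if uienable == 1:
        return "user_interrupt"    # step 1006: fetch from uivector
    return "doorbell"              # step 1007
```

  • Only one combination of inputs leads to direct user-mode delivery; every other path falls back to the Doorbell interrupt, which is why keeping uinoteinfo up to date on every trap and context switch matters.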
  • FIG 11 is a schematic diagram of software module classification provided by an embodiment of the present application.
  • The kernel layer (software layer) includes: a scheduling module, an exception handling module, a user-mode interrupt event management module, a user-mode interrupt routing engine driver, an interrupt address space fast switching module, and a kernel safe reentrant module.
  • the user-mode interrupt routing engine driver directly operates the user-mode interrupt engine hardware provided by this application, including configuring Memory-mapped registers, constructing bytecodes and filling them into the Byte Code Buffer.
  • The user-mode interrupt event management module is responsible for allocating and managing in memory the Interrupt Router Table and Interrupt Service Table provided by this application, registering threads as user-mode interrupt processing threads, and registering interrupts as user-mode interrupts. These operations are ultimately carried out on the hardware through the user-mode interrupt routing engine driver.
  • the interrupt address space fast switching module is responsible for operating the interrupt fast call register provided by this application. This module is used in the thread scheduling and exception handling modules.
  • the kernel safe reentrant module is responsible for operating the safe reentrant registers provided by this application. This module will be used in the thread scheduling and exception handling modules.
  • The following describes in detail the user-mode interrupt event management module (hereinafter referred to as the interrupt event management module), the interrupt address space fast switching module, and the kernel safe reentrant module.
  • an ordinary thread must first register itself as an interrupt processing thread before it can have the ability to handle user-mode interrupts.
  • the routing target of interrupts processed by the kernel is the physical CPU. No matter which CPU the interrupt is processed on, the kernel can directly call the interrupt processing function.
  • A user-mode interrupt processing function belongs to a specific address space, so the routing target of an interrupt handled in user mode is the interrupt processing thread.
  • Therefore, the interrupt event management module provides two features: registering interrupt processing threads and configuring interrupt routing targets.
  • FIG 12 is a schematic diagram of an interrupt event management module provided by an embodiment of the present application.
  • If a thread needs to handle interrupts in user mode, it can register itself as an interrupt processing thread.
  • the kernel will allocate a table entry in the Interrupt Service Table and fill in the Interrupt Service Context and Interrupt Pending Buffer fields of the entry.
  • the memory block pointed to by each field is also allocated by the kernel.
  • The kernel management object of each thread needs to record the corresponding IST entry number and the memory block addresses of the Interrupt Service Context and the Interrupt Pending Buffer.
  • the Interrupt Pending Buffer field in the Interrupt Service Table table entry points to a memory block. This memory block stores all pending interrupt information of the interrupt thread, allowing threads to quickly obtain pending interrupt information on different CPUs.
  • When a thread registers as an interrupt processing thread, it must configure for itself the interrupt sources to be handled: the kernel allocates a table entry in the Interrupt Router Table, sets the Route Type in the entry to User-mode (the value is 1), and sets the Service ID in the entry to the IST entry number corresponding to the interrupt processing thread.
  • the Event ID is provided by the application.
  • the Interrupt Router Table provided by the embodiment of this application decouples the interrupt triggering end and the interrupt processing end, and can support more flexible software scenarios, such as using user-mode interrupts to accelerate conventional synchronization primitives futex, eventfd, pipe, signal, etc.
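  • The two registration features described above can be sketched as a small manager class. Everything here is illustrative: the class and method names, the dictionary-based table entries, the fake block allocator, and the choice to index the IRT by interrupt request number are assumptions rather than the kernel interface of this application.

```python
ROUTE_TYPE_USER_MODE = 1  # value given in the text for the User-mode route type


class InterruptEventManager:
    """Toy model of IST/IRT allocation during registration."""

    def __init__(self, ist_size, irt_size):
        self.ist = [None] * ist_size
        self.irt = [None] * irt_size
        self._next_block = 0x1000  # fake allocator for kernel memory blocks

    def _alloc_block(self):
        addr = self._next_block
        self._next_block += 0x1000
        return addr

    def register_thread(self, thread):
        """Allocate an IST entry and record it in the thread's kernel object."""
        index = self.ist.index(None)  # raises ValueError when the table is full
        entry = {
            "valid": True,
            "service_context": self._alloc_block(),
            "pending_buffer": self._alloc_block(),
        }
        self.ist[index] = entry
        thread["ist_index"] = index
        thread["service_context"] = entry["service_context"]
        thread["pending_buffer"] = entry["pending_buffer"]
        return index

    def register_interrupt(self, irq, thread, event_id):
        """Route interrupt request number `irq` to an already registered thread."""
        self.irt[irq] = {
            "valid": True,
            "route_type": ROUTE_TYPE_USER_MODE,
            "event_id": event_id,            # supplied by the application
            "service_id": thread["ist_index"],
        }
```

  • Because the IRT entry stores only a Service ID, several interrupt sources can be routed to the same thread, and a source can be re-pointed at another thread without touching the triggering side.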
  • FIG. 13 is a schematic diagram of an interrupt address space switching module provided by an embodiment of the present application. Referring to Figure 13, the following is explained through three scenarios, as follows:
  • the first scenario is that the user-mode interrupt processing thread is running on the CPU, and the Router Engine directly sends the interrupt to the CPU. At this time, the program running on the CPU is interrupted, and execution starts from the user-mode interrupt entry saved in the uivector:
  • Step 1301 Save the context of the CPU being interrupted in the interrupt processing stack frame, including GPR and PC.
  • Step 1302 Traverse all the set bits in the Interrupt Pending buffer, where each set bit corresponds to an Event ID to be processed.
  • Step 1303 Obtain the corresponding user-mode interrupt processing function according to the Event ID, and then call the function.
  • Step 1304 After all pending Event IDs have been processed, restore the GPR and PC from the stack frame.
  • Step 1305 Call the interrupt return instruction to resume execution in the interrupted context.
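  • Steps 1302 and 1303 amount to walking a bitmap and dispatching by Event ID. The sketch below assumes the Interrupt Pending Buffer can be read as a single integer bitmap in which bit n stands for Event ID n, and that handlers are kept in a dictionary keyed by Event ID; context save and restore (steps 1301, 1304 and 1305) are left to the entry stub and are not modelled.

```python
def handle_pending(pending_bits, handlers):
    """Call the handler for every set bit and return the handled Event IDs."""
    handled = []
    while pending_bits:
        lowest = pending_bits & -pending_bits  # isolate the lowest set bit
        event_id = lowest.bit_length() - 1     # bit position is the Event ID
        handlers[event_id](event_id)           # step 1303: call the handler
        handled.append(event_id)
        pending_bits ^= lowest                 # mark this event as processed
    return handled
```

  • Events are processed from the lowest Event ID upward in this sketch; the text does not state an ordering, so that choice is arbitrary.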
  • the second scenario is that the interrupt processing thread runs in the kernel state. At this time, the T bit of the uinoteinfo register is 1, and the Router Engine will send a Doorbell interrupt to the CPU. This Doorbell interrupt is processed by the kernel.
  • the Doorbell processing function is mainly:
  • Step 1311 Save the interrupt context in the kernel stack frame, and then save the top address of the kernel stack in the unsafe context register (UCR).
  • Step 1312 Check the value of uinoteinfo.T. If it is 1, it means that the interrupt processing thread is executed in the kernel mode without switching address space.
  • Step 1313 Get the user-mode interrupt processing function (UI_PC) and interrupt stack frame address (UI_SP) from the memory pointed to by uicontext, and construct a user-mode context stack frame, in which the PC and SP registers are set to UI_PC and UI_SP respectively.
  • Step 1314 Execute the interrupt return instruction and return to the user mode interrupt entry function.
  • The third scenario is that the interrupt processing thread is not executing on the CPU. At this time, the V bit of the uinoteinfo register is 0, and the Router Engine also sends a Doorbell interrupt to the CPU.
  • The processing flow for this Doorbell is basically the same as in the second scenario, but additional interrupt address space switching operations are required:
  • Step 1321 Get the page table base address (ATT_ADDR) where the interrupt processing thread is located from the memory pointed to by uicontext.
  • Step 1322 Switch the page table base address of the CPU. To prevent aliasing in the Translation Lookaside Buffer (TLB), all TLB entries must also be flushed.
  • Step 1323 Temporarily fix the interrupt processing thread on the current CPU to prevent it from being migrated to other CPUs by the kernel.
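  • The second and third scenarios share one Doorbell handler that differs only in whether the address space is switched. The following model is a sketch under stated assumptions: the `Cpu` class, its fields, and the dictionary standing in for the memory pointed to by uicontext are hypothetical, and the switch is keyed on uinoteinfo.V being 0 as described for the third scenario.

```python
from dataclasses import dataclass


@dataclass
class Cpu:
    page_table_base: int
    ucr: int = 0          # unsafe context register
    pinned: bool = False  # handling thread fixed to this CPU
    tlb_flushes: int = 0


def handle_doorbell(cpu, thread_on_cpu, uicontext_mem, kernel_stack_top):
    """Model of the Doorbell handler; returns the user-mode frame to return into.

    thread_on_cpu -- uinoteinfo.V (True in the second scenario, False in the third)
    uicontext_mem -- dict with ATT_ADDR, UI_PC and UI_SP
    """
    # Step 1311: the interrupted kernel context is on the kernel stack;
    # record the stack top in the UCR.
    cpu.ucr = kernel_stack_top
    if not thread_on_cpu:
        # Steps 1321-1322: switch the page table base and flush the TLB.
        cpu.page_table_base = uicontext_mem["ATT_ADDR"]
        cpu.tlb_flushes += 1
        # Step 1323: keep the handling thread on this CPU for now.
        cpu.pinned = True
    # Step 1312 (second scenario): same address space, nothing to switch.
    # Step 1313: build the user-mode frame; step 1314 would return through it.
    return {"pc": uicontext_mem["UI_PC"], "sp": uicontext_mem["UI_SP"]}
```

  • The only extra work in the third scenario is the page table switch, the TLB flush and the pinning, which matches the cost ordering of the three scenarios given in the text.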
  • the first scenario has the lowest cost of calling the interrupt processing function
  • the third scenario has the highest cost.
  • The overall task switching overhead is about 100 instruction cycles. If the CPU clock frequency is 2 GHz, the interrupt address space switching overhead is about 100 ns. Compared with existing user-mode interrupt schemes, this application has a 10x to 20x performance advantage.
  • FIG. 14 is a schematic diagram of a secure reentrant module provided by an embodiment of the present application.
  • Embodiments of the present application allow a user-mode interrupt to interrupt the kernel's execution flow directly and return quickly to user mode to execute the interrupt processing function. However, this may leave the kernel in a non-reentrant state: if any kernel function is used during user-mode interrupt processing, the kernel state may be corrupted.
  • Embodiments of the present application therefore provide two registers, UCR (unsafe context register) and OCR (ongoing context register), and define several "safe reentry points" and "reentrancy detection points" in the kernel to ensure that kernel functions can be used safely while a user-mode interrupt is being handled. This includes the following steps:
  • Step 1401 Define the kernel exception handling entry as the "reentrancy detection point”, and define the point at which the kernel returns to user mode as the "safe reentry point”.
  • Step 1402 The kernel saves the context interrupted by the user mode interrupt into a memory block, records the memory block address in the UCR register, and then quickly returns to the user mode to execute the interrupt processing function.
  • the Interrupt Service Context field in the Interrupt Service Table table entry points to a memory block.
  • the data stored in this memory block can provide the software with the information necessary to quickly call the user mode interrupt processing function.
  • Step 1403 The kernel checks the UCR at the "reentrancy detection point". If the UCR value is non-zero, the CPU is in an unsafe reentry state and kernel functions cannot be used directly. In that case, the kernel context is first saved in a memory block, the memory block address is recorded in the OCR register, and the current kernel execution flow is suspended until the kernel returns to the safe reentry state.
  • Step 1404 Then retrieve the memory block address that saves the kernel context through the OCR register, restore the CPU to the interrupted scene, and allow the interrupted kernel to continue executing to the "safe reentry point".
  • Step 1405 When the kernel reaches the "safe reentry point", the kernel context suspended in step 1403 is restored through the OCR register. At this point, kernel functions can be used safely.
  • Fast interrupt address space switching and the safe reentrant module are the core features of the software part of this application. Considering that in most scenarios where user-mode interrupts are used the interrupt processing thread is in a blocked state, the ability of these two modules to call user-mode interrupt processing functions quickly makes this application more competitive in performance and in applicable scenarios than other user-mode interrupt mechanisms.
  • DPDK Data Plane Development Kit
  • To pursue high performance, DPDK applications must occupy a group of CPUs for polling. When the network is relatively idle, the CPUs spin idly, causing high energy consumption in the data center. Using the interrupt mode of DPDK can reduce energy consumption.
  • FIG 15 is a schematic flowchart of the DPDK interrupt mode provided by an embodiment of the present application.
  • the CPU can enter the idle state to save energy consumption.
  • The energy consumption of interrupt mode and polling mode differs by a factor of 40.
  • However, the DPDK interrupt module brings additional overhead, which increases network latency when sending and receiving packets and may cause packet loss. In key services such as cloud core and cloud, high latency and packet loss make service performance unacceptable.
  • DPDK is adapted to the user-mode interrupt mode based on the Linux kernel.
  • the components involved in this embodiment include a hardware interrupt routing engine and a hardware user-mode interrupt call Registers, hardware safe reentrant registers; kernel interrupt event management module, interrupt routing engine driver, interrupt address switching module, safe reentrant module; user mode layer interrupt thread registration module, interrupt event registration module.
  • FIG 16 is a schematic diagram of a software framework provided by an embodiment of the present application. Referring to Figure 16, the process of the DPDK interrupt module responding to the network card interrupt includes the following steps:
  • Step 1601: The DPDK App calls the user-mode interrupt thread registration module and provides the function address and stack frame address used for handling user-mode interrupts, corresponding to (1) in the figure.
  • Step 1602: The DPDK App calls the user-mode interrupt registration module and sets the routing target of an interrupt to this thread, corresponding to (2) in the figure.
  • Steps 1601 and 1602 are both completed through the kernel's interrupt event management module.
  • Specifically, available table entries are allocated from the IST and IRT in memory and initialized, corresponding to (3) and (4) in the figure.
  • Step 1603: The allocated IST and IRT entries are configured into the Router Engine hardware module provided by this application through the kernel's interrupt routing engine driver. This step requires constructing byte codes, filling them into the Byte Code Buffer in memory, and waiting for the Router Engine to finish executing them, corresponding to (5) and (6) in the figure.
  • Step 1605: Whenever the DPDK App traps into kernel mode, is scheduled out by the kernel, or is scheduled onto a CPU, the kernel updates the uinoteinfo register, corresponding to (7) in the figure.
  • Step 1606: The network card triggers an RX interrupt.
  • The Router Engine hardware module receives the interrupt first and then accesses the IST and IRT in memory to select the interrupt delivery target. If the DPDK App is currently running on the CPU, the CPU jumps directly to the user-mode interrupt function entry, corresponding to (8) in the figure.
  • Step 1607: If the DPDK App is blocked in kernel mode or has been scheduled out, the CPU jumps to the kernel-mode interrupt address space fast switching module, which first saves the CPU state, then executes the address space switching flow, and finally returns to the user-mode interrupt handler entry of the DPDK App; see (9) and (10) in the figure.
  • Figure 17 is a delay comparison diagram between DPDK interrupt mode and user mode interrupt provided by the embodiment of the present application.
  • The traditional DPDK interrupt mode flow consists of the network card interrupt, trapping into the kernel, waking the task, task scheduling, switching page tables, and, after returning to user mode, packet processing by the DPDK App.
  • The embodiments of this application provide a simplified flow based on user-mode interrupts: network card interrupt, trapping into the kernel, switching page tables, and returning to the user-mode interrupt entry function.
  • When the traditional DPDK interrupt mode responds through kernel-mode interrupt forwarding, the response latency is 16000 to 18000 cycles.
  • When the user-mode interrupt provided by the embodiments of this application responds through a user-mode interrupt call, the response latency is 500 to 1500 cycles, which reduces the interrupt response latency.
  • An embodiment of the present application also provides an interrupt controller, including a processor and a memory, the memory being used to store at least one instruction which, when loaded and executed by the processor, implements the interrupt handling method provided by any embodiment of the present application.
  • Embodiments of the present application also provide an interrupt controller, including a user-mode interrupt routing engine and a user-mode interrupt fast call engine. The user-mode interrupt fast call engine includes a user-mode interrupt memo information register, used to record the state information of the thread currently running on the processor. When the interrupt controller determines that the interrupt thread is running on the processor, then, based on the state information recorded in the user-mode interrupt memo information register, if it determines that the current thread is running in user mode, it delivers a user-mode interrupt to the processor, instructions are fetched from the address stored in the user-mode interrupt vector register, and the interrupt handler is executed in user mode.
  • The user-mode interrupt fast call engine also includes a user-mode interrupt context register, used to record a memory address. Based on the information recorded in the user-mode interrupt memo information register, if the register's interrupt-thread-trapped-in-kernel flag is not in the first state, and/or its interrupt-thread-online flag is not in the second state, the interrupt handling context of the interrupt service routine table entry is written into the user-mode interrupt context register and a doorbell interrupt is delivered to the processor.
  • The user-mode interrupt memo information register and the user-mode interrupt context register introduced in the interrupt controller provided by the embodiments of the present application allow the kernel to switch quickly to the address space of the interrupt handling thread.
  • An embodiment of the present application also provides an electronic device, which may include the above-mentioned interrupt controller, so that the above-mentioned interrupt control method is implemented through the interrupt controller.
  • Embodiments of the present application also provide a computer-readable storage medium on which a computer program is stored.
  • When the computer program is executed by a processor, the interrupt handling method provided by any embodiment of the present application is implemented.
  • the application may be an application program (nativeApp) installed on the terminal, or may also be a web page program (webApp) of the browser on the terminal, which is not limited in the embodiments of the present application.
  • The disclosed systems, devices, and methods can be implemented in other ways.
  • The device embodiments described above are only illustrative.
  • The division into units is only a division by logical function. In an actual implementation there may be other ways of dividing them.
  • Multiple units or components may be combined or integrated into another system, and some features may be omitted or not implemented.
  • The mutual coupling, direct coupling, or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of the devices or units may be electrical, mechanical, or of another form.
  • The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • The functional units in the embodiments of the present application can be integrated into one processing unit, each unit can exist physically on its own, or two or more units can be integrated into one unit.
  • The integrated unit can be implemented in the form of hardware or in the form of hardware plus software functional units.
  • An integrated unit implemented in the form of a software functional unit can be stored in a computer-readable storage medium.
  • The software functional unit is stored in a storage medium and includes a number of instructions that cause a computer device (which can be a personal computer, a server, or a network device, etc.) or a processor (Processor) to execute some of the steps of the methods described in the embodiments of this application.
  • The aforementioned storage media include USB flash drives, removable hard disks, read-only memory (ROM), random access memory (Random Access Memory, RAM), magnetic disks, optical discs, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Bus Control (AREA)

Abstract

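Provided are an interrupt method, an electronic device, and a storage medium. The method includes: when it is determined that an interrupt thread is running on a processor, based on the state information of the thread currently running on the processor as recorded in a user-mode interrupt memo information register, if it is determined that the current thread is running in user mode, delivering a user-mode interrupt to the processor, fetching instructions from the address stored in a user-mode interrupt vector register, and executing the interrupt handler in user mode. With this method, the user-mode interrupt handler can be called directly from the address space running on the processor, which further reduces the software overhead of responding to user-mode interrupts.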

Description

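Interrupt handling method, electronic device, and storage medium
Technical Field
This application relates to the field of interrupt control technology, and in particular to an interrupt handling method, an electronic device, and a storage medium.
Background
With the spread of personal computers, mobile computing devices, and cloud computing, the scale of services carried by data centers is growing rapidly. Databases, web servers, AI training and inference systems, web search systems, and e-commerce systems process ever larger volumes of data, and all of the data handled by this application software is sent and received through high-speed network cards.
A network card is managed by a software module running in kernel mode. Events on the network card, such as packet arrival or completion of packet transmission, generate hardware interrupts. The typical handling flow for these interrupts is: the operating system (Operating System, OS) running in CPU kernel mode handles the interrupt first and then forwards it to the corresponding driver software. A user-mode application must rely on the kernel-mode driver software to exchange data with the network card. A typical scenario is an instant messaging tool sending and receiving messages: a system call first switches the CPU to kernel mode; after handling the system call, the OS hands the message data to the network card through the driver; when the network card finishes, it raises an interrupt, which is first handled by the OS and the driver; the CPU is then switched to user mode and the callback function registered by the application is called.
An application therefore goes through several privilege-level switches to complete one interaction with an input/output (Input/Output, IO) device: it enters kernel mode from user mode through a system call; once the hardware has successfully started the IO operation, it returns from kernel mode to user mode; when the hardware has finished processing the data, the CPU enters kernel mode again to handle the interrupt; finally it returns to user mode to execute the callback registered by the application. These privilege-level switches add considerable response latency to every interaction between the application and the IO device. When the data center carries a very heavy load, this latency greatly reduces the processing capacity of the application, leading to network packet loss and a poor user experience.
Summary
Embodiments of this application provide an interrupt method, an electronic device, and a storage medium. The interrupt method reduces the number of such unnecessary privilege-level switches and thereby improves the performance of interaction between applications and IO devices.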
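In a first aspect, an embodiment of this application provides an interrupt method, including: when it is determined that an interrupt thread is running on a processor, based on the state information of the thread currently running on the processor as recorded in a user-mode interrupt memo information register, if it is determined that the current thread is running in user mode, delivering a user-mode interrupt to the processor, fetching instructions from the address stored in a user-mode interrupt vector register, and executing the interrupt handler in user mode. With this method, the user-mode interrupt handler can be called directly from the address space running on the processor, which further reduces the software overhead of responding to user-mode interrupts.
Further, determining, based on the information recorded in the user-mode interrupt memo information register, that the current thread is running in user mode includes: if the interrupt-thread-trapped-in-kernel flag of the register is in a first state and the interrupt-thread-online flag of the register is in a second state, determining that the current thread is running in user mode. In one implementation, the trapped-in-kernel flag being in the first state may mean that the T value of the register is 0, and the online flag being in the second state may mean that the V value of the register is 1; that is, (uinoteinfo.T==0, uinoteinfo.V==1) indicates that the current thread is running in user mode.
Further, before determining that the current thread is running in user mode, the method also includes: determining, based on the information recorded in the user-mode interrupt memo information register, whether the current thread is running in user mode; where, if the trapped-in-kernel flag of the register is not in the first state and/or the online flag of the register is not in the second state, the interrupt handling context of the interrupt service routine table entry is written into the user-mode interrupt context register and a doorbell interrupt is delivered to the processor.
Further, before determining that the interrupt thread is running on the processor, the method also includes: determining whether the pending interrupt list in the interrupt service table is the same as the pending-interrupt information for the thread in the user-mode interrupt memo information register; if they are the same, it is determined that the interrupt thread is running on the processor.
Further, if the pending interrupt list in the interrupt service table is not the same as the pending-interrupt information for the thread in the user-mode interrupt memo information register, the interrupt service context of the interrupt service routine entry is written into the user-mode interrupt context register and a kernel doorbell interrupt is delivered to the processor.
Further, before determining that the interrupt thread is running on the processor, the method also includes: when the processor traps into kernel mode, checking the value of the unsafe context register at a safety detection point; if the processor is in an unsafe reentry state, saving the kernel context in a designated memory area and recording the address of that memory in the suspended context register, where the safety detection point is the kernel exception handling entry; based on the address recorded in the suspended context register, restoring the processor to the interrupted state and letting the interrupted kernel continue to run to a safe reentry point, where the safe reentry point is the point at which the kernel returns to user mode; and, when the kernel reaches the safe reentry point, restoring the suspended kernel context through the suspended context register. A conventional kernel returns to user mode only after an exception or system call has finished. To support fast response to user-mode interrupts, the embodiments of this application allow the kernel to return quickly to user mode to respond to an interrupt before these operations have finished, which leaves the kernel in a state that is not safely reentrant. If the user-mode interrupt handler then traps into the kernel, the integrity of kernel data is at considerable risk. The embodiments of this application use the UCR register at the exception and system call entry to restore the kernel state to the interrupted point and, once the kernel is again safely reentrant (that is, it has finished handling the exception or system call and is about to return to user mode), use the OCR register to start handling the exception or system call issued by the user-mode interrupt handler. This solves the unsafe reentrancy problem caused by responding to user-mode interrupts inside the kernel: user-mode interrupts are answered promptly and kernel state is not corrupted.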
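In a second aspect, an embodiment of this application also provides an interrupt controller, including a processor and a memory, the memory being used to store at least one instruction which, when loaded and executed by the processor, implements the interrupt handling method provided in the first aspect.
In a third aspect, an embodiment of this application also provides an interrupt controller, including a user-mode interrupt routing engine and a user-mode interrupt fast call engine. The user-mode interrupt fast call engine includes a user-mode interrupt memo information register, used to record the state information of the thread currently running on the processor. When the interrupt controller determines that the interrupt thread is running on the processor, then, based on the state information recorded in the user-mode interrupt memo information register, if it determines that the current thread is running in user mode, it delivers a user-mode interrupt to the processor, instructions are fetched from the address stored in the user-mode interrupt vector register, and the interrupt handler is executed in user mode.
Further, the user-mode interrupt fast call engine also includes a user-mode interrupt context register, used to record a memory address; where, based on the information recorded in the user-mode interrupt memo information register, if the register's interrupt-thread-trapped-in-kernel flag is not in the first state and/or its interrupt-thread-online flag is not in the second state, the interrupt handling context of the interrupt service routine table entry is written into the user-mode interrupt context register and a doorbell interrupt is delivered to the processor. The user-mode interrupt memo information register and the user-mode interrupt context register introduced in this interrupt controller allow the kernel to switch quickly to the address space of the interrupt handling thread.
In a fourth aspect, an embodiment of this application also provides an electronic device, which may include the interrupt controller provided in the second aspect.
In a fifth aspect, an embodiment of this application also provides an electronic device, which may include the interrupt controller provided in the third aspect.
In a sixth aspect, an embodiment of this application also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the interrupt handling method provided in the first aspect is implemented.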
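Brief Description of the Drawings
To describe the technical solutions in the embodiments of this application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of this application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Figure 1 is a schematic comparison of the kernel-mode and user-mode interrupt handling flows in the related art;
Figure 2 is a schematic diagram of an interrupt delivery architecture in the related art;
Figure 3 is a flowchart of user-mode interrupt delivery by the interrupt controller shown in Figure 2;
Figure 4 is a schematic diagram of another interrupt flow in the related art;
Figure 5 shows a system architecture to which the embodiments of this application apply;
Figure 6 is a diagram of the user-mode interrupt hardware architecture provided by an embodiment of this application;
Figure 7 is a schematic diagram of the user-mode interrupt routing engine provided by an embodiment of this application;
Figure 8 is a schematic flowchart of the user-mode interrupt routing hardware provided by an embodiment of this application;
Figure 9 is a schematic diagram of the user-mode interrupt fast call hardware module provided by an embodiment of this application;
Figure 10 is a schematic flowchart of hardware delivery of a user-mode interrupt provided by an embodiment of this application;
Figure 11 is a schematic diagram of the classification of software modules provided by an embodiment of this application;
Figure 12 is a schematic diagram of the interrupt event management module provided by an embodiment of this application;
Figure 13 is a schematic diagram of the interrupt address space switching module provided by an embodiment of this application;
Figure 14 is a schematic diagram of the safe reentrant module provided by an embodiment of this application;
Figure 15 is a schematic flowchart of the DPDK interrupt mode provided by an embodiment of this application;
Figure 16 is a schematic diagram of a software framework provided by an embodiment of this application;
Figure 17 is a latency comparison between the DPDK interrupt mode and the user-mode interrupt provided by an embodiment of this application.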
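Detailed Description
To make the purposes, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the scope of protection of this application.
For a clear understanding of the embodiments of this application, the concepts involved are explained below:
Microkernel architecture: a kernel architecture in which only the most essential kernel components stay in kernel mode, for example the permission management module and task scheduling; the traditional kernel components, such as the file system, interrupt handling framework, and network protocol stack, are placed in user mode.
Monolithic kernel architecture: a kernel architecture in which all kernel components are placed in kernel mode, for example the IPC module, permission management module, file system, interrupt handling framework, and network protocol stack; user mode only runs application code.
Thread context: the various states an operating system (Operating System, OS) maintains for a thread, including the general-purpose registers used by the thread, its page tables, its private space, and its metadata.
Central processing unit (Central Processor Unit, CPU) privilege levels: existing chips support two privilege levels, user mode and kernel mode. Software running at different privilege levels has different access rights to the underlying hardware. For example, the OS runs at the high CPU privilege level, while applications such as editors and web browsers run at the low privilege level. When an application needs to interact with hardware, for example to print a file on a printer or edit a file with the keyboard, it relies on high-privilege software to do the work.
Kernel driver: a computer contains various hardware devices, such as a keyboard, network card, and hard disk. Each device is operated by a different software module; these modules are part of the OS and run at the high CPU privilege level.
Interrupt: an event triggered by hardware. When it occurs, the CPU's execution flow is interrupted and execution continues from a specific kernel driver.
Context save and restore: after hardware interrupts the CPU's execution flow, software saves the general-purpose registers (General Purpose Register, GPR) and control and status registers (Control Status Register, CSR) as they were at the interruption in memory so that execution can later resume from the interrupted point; restoring the saved data to the GPRs and CSRs resumes the original execution flow.
Response latency: the time from the hardware triggering an interrupt event to the CPU starting to execute the handler for that interrupt.
Kernel reentrancy: the kernel is interrupted at an arbitrary moment and another piece of code runs, and that code calls the kernel again, yet the kernel state remains correct.
Task migration: in a multi-CPU system, when a CPU is overloaded, the OS migrates tasks from that CPU's run queue to the run queues of other CPUs.
The above are the concepts involved in the embodiments of this application.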
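With the spread of personal computers, mobile computing devices, and cloud computing, the scale of services carried by data centers is growing rapidly. Databases, web servers, AI training and inference systems, web search systems, and e-commerce systems process ever larger volumes of data, and all of the data handled by this application software is sent and received through high-speed network cards.
A network card is managed by a software module running in kernel mode. Events on the network card, such as packet arrival or completion of packet transmission, generate hardware interrupts. The typical handling flow is: the OS running in CPU kernel mode handles the interrupt first and then forwards it to the corresponding driver software. A user-mode application must rely on the kernel-mode driver software to exchange data with the network card. For example, when an instant messaging tool sends and receives messages, a system call first switches the CPU to kernel mode; after handling the system call, the OS hands the message data to the network card through the driver; when the network card finishes, it raises an interrupt, which is first handled by the OS and the driver; the CPU is then switched to user mode and the callback function registered by the application is called.
An application therefore goes through several privilege-level switches to complete one interaction with an IO device: it enters kernel mode from user mode through a system call; once the hardware has successfully started the IO operation, it returns from kernel mode to user mode; when the hardware has finished processing the data, the CPU enters kernel mode again to handle the interrupt; finally it returns to user mode to execute the callback registered by the application. These switches add considerable response latency to every interaction between the application and the IO device. Under a very heavy data center load, this latency greatly reduces the processing capacity of the application, leading to network packet loss and a poor user experience. The strategy for improving the performance of interaction between applications and IO devices is to reduce the number of these unnecessary privilege-level switches as far as possible.
Figure 1 is a schematic comparison of the kernel-mode and user-mode interrupt handling flows in the related art. Referring to Figure 1, architectures based on user-mode interrupts are widely used to reduce IO response latency. Most user-mode interrupt implementations trigger the interrupt only while the CPU is running the specific application and, using management data objects in memory, let the hardware send the interrupt automatically to the CPU running that application, which avoids having the kernel-mode OS handle the interrupt before the user-mode callback is called. With user-mode interrupt support, the application only needs to start the hardware operation; when the hardware triggers a user-mode interrupt, the application's instruction flow jumps directly to the user-mode callback. This saves at least two privilege-level switches as well as task scheduling overhead. When a data-intensive workload is very heavy, the latency spent on interrupt handling can be reduced by a factor of 10 to 30.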
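Figure 2 is a schematic diagram of an interrupt delivery architecture in the related art. Referring to Figure 2, this interrupt controller uses an Event-Based Branches (EBB) scheme and contains two functional units:
Interrupt Source Controller (ISC): every device first sends its interrupt information to the ISC, which wraps the interrupt in an Emergency Notification Message (ENM) and sends it to the IPC unit.
Interrupt Presentation Controller (IPC): after receiving an ENM from the ISC, the IPC sends the interrupt to the software running in a specific address space.
The interrupt controller shown in Figure 2 also defines four kinds of in-memory data to support interrupt delivery and interrupt response:
For each interrupt source, software allocates a management object in memory, the Event State Buffer (ESB).
For each interrupt response target (for example, the hypervisor, the kernel-mode OS, or a user-mode process), software allocates a management object in memory, the Event Notification Descriptor (END). Note that OSes running on different physical threads (Physical Thread) have different ENDs.
Software also allocates a management object in memory, the Event Assignment Structure (EAS), which describes the policy for delivering interrupts.
As shown in Figure 2, each physical thread also has three physical registers, User TIMA, OS TIMA, and Hyp TIMA, which record the END identifiers of the user-mode process, the OS, and the hypervisor currently executing on that physical thread.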
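Figure 3 is a flowchart of user-mode interrupt delivery by the interrupt controller shown in Figure 2. Referring to Figure 3, the interrupt controller uses the EBB mechanism to let a user-mode process respond directly to PMU (Performance Monitor Unit, which collects the counters of the functional units inside the CPU) interrupts. The application first registers a user-mode PMU interrupt through the OS; this creates three objects in memory: ESB, EAS, and END. User-mode interrupt delivery includes the following steps:
Step 301: When the ISC module of the interrupt controller receives a user-mode interrupt, it uses the interrupt request number to fetch the corresponding ESB object from memory, obtains from it the END object data and EAS object data of the user-mode thread, wraps both in an ENM message, and sends the message to the IPC module of the interrupt controller.
Step 302: The IPC module parses the END data of the user-mode thread from the ENM message and compares it with the User TIMA of every physical thread. A match means that the process running on that physical thread belongs to the specific address space; the execution flow on this physical thread is interrupted and the program counter (Program Counter, PC) jumps to the user-mode interrupt entry address. The program counter indicates the address from which the CPU is currently fetching instructions.
Step 303: If the END data matches none of the User TIMAs, the target thread is not running. The IPC then obtains from the EAS data the address of the END object for the escalate (Escalate) interrupt, which corresponds to an OS interrupt handler running on a specific physical thread; it compares this END object with the OS TIMA of every physical thread and delivers the user-mode interrupt to the OS on the matching physical thread.
Step 304: After the OS responds to the user-mode interrupt, it wakes the corresponding user-mode thread according to the interrupt information, writes the END object of the user-mode thread into User TIMA, and sets the interrupt Pending flag.
Step 305: When the thread returns to user mode, the chip detects through User TIMA that a user-mode interrupt is pending, and the PC jumps to the user-mode interrupt handling entry address.
Based on the flow in Figure 3, this interrupt controller also handles an offline user-mode thread by having higher-privileged software, the OS (the kernel driver of the interrupt controller), wake the user-mode thread; the user-mode interrupt is answered only after the thread returns to user mode. Currently only PMU interrupts can be handled in user mode; other device interrupts are not supported. In addition, only the currently running process can handle the interrupt, so a timely response requires the process to run in polling mode, which leads to high CPU power consumption and low utilization.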
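Figure 4 is a schematic diagram of another interrupt flow in the related art. Referring to Figure 4, user mode triggers a hardware interrupt directly (the instruction set calls this User IPI) and a specific thread is guaranteed to respond to it (the instruction set calls this User Interrupt); no privilege-level switch is needed when software responds to the interrupt. This User Interrupt mechanism is built around two key memory objects:
User Posted-interrupt Descriptor (UPID): a UPID represents an interrupt handling task object.
User Interrupt Target Table (UITT): the UITT acts as the interrupt routing table.
Referring to Figure 4, the logical model of User IPI and User Interrupt is:
(1) Software triggers a user-mode interrupt with the "SENDUIPI index" instruction, which can be executed in Ring3. Hardware uses UITTSIZE to check that index is valid, that is, that it does not exceed the number of UITT entries, and then uses UITTADDR to obtain the virtual address of the UITT entry.
(2) A UITT entry contains two items: UIV and UPIDADDR. UIV is the interrupt request number; the user-mode interrupt handler selects its handling logic according to UIV. UPIDADDR is used to access the UPID object of the target thread. Hardware records UIV in the UPID.PIR field and selects the target CPU for interrupt routing according to the UPID.Notification_destination field.
(3) Hardware sends the UPID.Notification_vector field to the CPU specified by UPID.Notification_destination. When the APIC of the target CPU receives Notification_vector, it compares it with the local UINV MSR; if they are the same, the APIC delivers the interrupt as a user-mode interrupt.
(4) When CPL==3 (the CPU is running in user mode), hardware accesses the UPID object of the thread currently on the CPU through the UPIDADDR register, fills the UIRR register with the value of UPID.PIR, sets the PC to the value of the UIHANDLER register and the stack pointer register (Stack Pointer, SP) to the value of the UISTACKADJUST register, and finally saves the registers and jumps to UIHANDLER. The stack pointer register indicates the top-of-stack address used by the current function.
(5) When the thread that triggers user-mode interrupts is scheduled in by the OS, the local UITTADDR and UITTSIZE registers must be updated; when the thread that handles user-mode interrupts is scheduled in by the OS, the CPU-local UPIDADDR, UIHANDLER, UISTACKADJUST, and UINV registers must also be updated.
Based on the flow in Figure 4, each thread can handle at most 64 different user-mode interrupt request numbers, and the CPU must be running in Ring3 (user mode) to respond to the interrupt. If the thread is blocked in Ring0 (kernel mode) or has been scheduled out, the OS responds to the interrupt first and then schedules the target thread.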
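Taken together, the prior art has the following technical problems:
(1) None of the related user-mode interrupt schemes supports executing the user-mode interrupt handler directly while the task is blocked.
(2) In the related user-mode interrupt schemes, suddenly switching back to user mode to execute a function while the kernel is running would corrupt the context of non-reentrant code in the kernel.
To overcome these problems, the embodiments of this application provide a hardware-software co-design. It provides a hardware architecture covering interrupt routing and interrupt delivery and, on top of it, a software scheme (an interrupt handling method) that supports fast calls to user-mode functions in the underlying hardware while keeping the kernel safely reentrant. When the interrupt handling thread is blocked in kernel mode or has been preempted, calling the user-mode interrupt handler normally requires a complex address space and privilege-level switch. This scheme keeps the context data needed for that switch in a set of hardware registers; the kernel uses this data to switch quickly, avoiding the kernel's task wake-up and scheduling path and reducing the cost of the switch. The scheme also keeps the kernel safely reentrant: the kernel continues to provide service correctly even when its execution is interrupted by a user-mode interrupt.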
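Figure 5 shows a system architecture to which the embodiments of this application apply. Referring to Figure 5, the architecture consists of an application layer, a user-mode library layer, a kernel layer, and a hardware layer, described as follows:
Application layer: runs the applications. The optimization provided by this application is transparent to this layer, which preserves compatibility.
User-mode library layer: contains two modules, the user-mode interrupt registration module and the user-mode interrupt context switching module. They provide the interface for registering user-mode interrupt handlers and wrap the common flow of interrupt context switching for all programs that need to handle interrupts in user mode.
Kernel layer: kernel mode contains four modules: the "user-mode interrupt routing engine driver", the "user-mode interrupt event management module", the "kernel safe reentrant module", and the "interrupt address space switching module". The routing engine driver allocates and configures the Interrupt Routing Table (IRT) and the Interrupt Service Table (IST) so that a user-mode interrupt is answered by the correct user-mode thread. The event management module wraps hardware interrupts as a kernel event resource. The safe reentrant module and the interrupt address space switching module solve the problems of calling user interrupt handlers quickly and of user interrupt handlers trapping into the kernel, and provide hook functions in the kernel scheduler that prevent a user-mode interrupt handling task from migrating to another processor. The processors in the following embodiments are described using a CPU as an example but are not limited to CPUs.
Hardware layer: the user-mode interrupt engine (Reentrant User Interrupt, RUI) provides the user-mode interrupt routing engine and the user-mode interrupt call engine.
One technical problem in the prior art is, for example: "when the interrupt handling task is blocked or running in kernel mode, the user-mode interrupt handler can be called safely only after task scheduling and a privilege-level switch, and this process adds a large latency". To address it in the system architecture of Figure 5, the hardware layer provides a hardware mechanism for routing interrupts to user-mode tasks and calling interrupt handlers quickly, and the software layer drives these hardware functions and offers applications a common interface so that they can use user-mode interrupts to improve IO performance.
The thread-targeted interrupt routing mechanism provided by this application uses the Interrupt Service Table, the Interrupt Router Table, and the associated hardware logic to select the CPU that responds to an interrupt dynamically, according to the running state and location of the thread.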
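Figure 6 is a diagram of the user-mode interrupt hardware architecture provided by an embodiment of this application. Referring to Figure 6, the architecture includes a user-mode interrupt routing engine and a user-mode interrupt fast call engine. The first part 61 is the hardware interface of the routing engine; the second part 62 is the routing hardware logic; the third part 63 is the in-memory table objects that the routing engine depends on, which software allocates and builds and whose information it configures through the hardware interface of the first part 61; the fourth part 64 is the hardware registers used for the fast call of user-mode interrupts.
Unlike interrupts handled by the kernel, the handler of a user-mode interrupt belongs to a specific address space, and that address space migrates dynamically between CPUs. In the architecture provided by this application, hardware selects a CPU to handle the user-mode interrupt such that the address space running on that CPU can call the handler directly. For this purpose the hardware architecture introduces a memory data lookup module, the Router Engine, which queries two in-memory tables: the Interrupt Router Table and the Interrupt Service Table. When a user-mode interrupt is triggered, hardware looks up the Interrupt Router Table by interrupt request number to obtain the identifier of the handling thread, looks up the Interrupt Service Table by thread identifier to obtain the CPU that will handle the interrupt, and finally delivers the interrupt to that CPU.
Figure 7 is a schematic diagram of the user-mode interrupt routing engine provided by an embodiment of this application. Referring to Figure 7, in one implementation the routing engine can be a device attached to the system bus. Its software programming interface includes:
irtbar (interrupt routing table base address register): a memory-mapped register in the device configuration space. It holds a physical address that serves as the base address of a contiguous region of memory, the interrupt router table. This region stores information about all user-mode interrupt sources in the system: a valid bit (Valid), a route type (Route Type), an event number (Event ID), and a service number (Service ID). Route Type currently supports only the User-mode type; more type values can be added for specific scenarios.
istbar (interrupt service table base address register): a memory-mapped register in the device configuration space. It holds a physical address that serves as the base address of a contiguous region of memory, the interrupt service table. This region stores information about all user-mode interrupt handling threads in the system: a valid bit (Valid), an interrupt service context (Interrupt Service Context), a pending interrupt list (Interrupt Pending Buffer), and a physical processor number (Physical Processor ID).
bcbbar (byte code buffer base address register): a memory-mapped register in the device configuration space. It holds a physical address that serves as the base address of a contiguous region of memory, the byte code buffer (BCB). This region stores the byte code sequence executed by the Router Engine. To prevent the races that would result from software modifying the IRT and IST directly, all updates to the IRT and IST are performed by the Router Engine; software controls these hardware updates by filling byte codes into the BCB.
bcbsize (byte code buffer size): a memory-mapped register in the device configuration space. It holds the length information of the BCB, including the length of each byte code and the length of the whole BCB.
bcbcurp (byte code buffer current pointer): a memory-mapped register in the device configuration space. It holds the offset in the BCB of the byte code that the Router Engine is currently executing.
bcbendp (byte code buffer end pointer): a memory-mapped register in the device configuration space. It holds the offset in the BCB of the byte code that the Router Engine is currently updating.
In one implementation, these memory-mapped registers maintain three structures: the Interrupt Router Table, the Interrupt Service Table, and the Byte Code Buffer. All three are built and accessed by the kernel and cannot be accessed from user mode.
The Interrupt Router Table has many entries, one per user-mode interrupt source. Each entry contains a valid bit (Valid), a route type (Route Type), an event number (Event ID), and a service number (Service ID). The default entry length is 64 bits: Event ID and Service ID each occupy 16 to 24 bits, Route Type occupies 3 to 7 bits, and Valid occupies 1 bit; implementations may adjust these as needed. The base address and length of the table must be page-aligned.
The Interrupt Service Table also has many entries, one per interrupt handling thread. Each entry contains the thread's valid bit (Valid), an interrupt service context (Interrupt Service Context), a pending interrupt list (Interrupt Pending Buffer), and a physical processor number (Physical Processor ID). Interrupt Service Context points to a memory block large enough to hold a PC, an SP, and the base address of the thread's page table. Interrupt Pending Buffer points to a memory block in which each bit represents one pending user-mode interrupt. The length of Physical Processor ID is determined by the system's CPU topology. The base address and length of the table must be page-aligned.
The Byte Code Buffer is a contiguous memory block that records byte codes. Each byte code is 32 bits long, and the base address and length of the BCB must be page-aligned.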
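The Interrupt Router Table entry described above can be illustrated with a small packing sketch. The text fixes only the entry length (64 bits) and a range for each field width, so the concrete widths and bit positions below (Valid in bit 0, a 3-bit Route Type, 24-bit Event ID and Service ID) are assumptions chosen for illustration, as are the names `IrtEntry`, `pack_irt_entry`, and `unpack_irt_entry`.

```python
from dataclasses import dataclass

# Assumed field widths and positions; the description only gives ranges
# (Valid: 1 bit, Route Type: 3-7 bits, Event ID / Service ID: 16-24 bits).
VALID_BITS, ROUTE_BITS, EVENT_BITS, SERVICE_BITS = 1, 3, 24, 24
ROUTE_SHIFT = VALID_BITS
EVENT_SHIFT = ROUTE_SHIFT + ROUTE_BITS
SERVICE_SHIFT = EVENT_SHIFT + EVENT_BITS
ROUTE_TYPE_USER_MODE = 1  # the description uses the value 1 for User-mode


@dataclass(frozen=True)
class IrtEntry:
    valid: bool
    route_type: int
    event_id: int
    service_id: int


def pack_irt_entry(entry: IrtEntry) -> int:
    """Pack an entry into one 64-bit word using the assumed layout."""
    for value, bits in ((entry.route_type, ROUTE_BITS),
                        (entry.event_id, EVENT_BITS),
                        (entry.service_id, SERVICE_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError("field does not fit in its assumed width")
    return (int(entry.valid)
            | (entry.route_type << ROUTE_SHIFT)
            | (entry.event_id << EVENT_SHIFT)
            | (entry.service_id << SERVICE_SHIFT))


def unpack_irt_entry(word: int) -> IrtEntry:
    """Inverse of pack_irt_entry."""
    return IrtEntry(
        valid=bool(word & 1),
        route_type=(word >> ROUTE_SHIFT) & ((1 << ROUTE_BITS) - 1),
        event_id=(word >> EVENT_SHIFT) & ((1 << EVENT_BITS) - 1),
        service_id=(word >> SERVICE_SHIFT) & ((1 << SERVICE_BITS) - 1),
    )
```

With these assumed widths the fields use 52 of the 64 bits, which leaves room for an implementation to widen Route Type up to the 7 bits mentioned in the text.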
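Using the Interrupt Router Table and the Interrupt Service Table, the Router Engine forwards a user-mode interrupt to the CPU on which the handling thread runs. The whole process is done by hardware, which avoids the performance overhead and races caused by the kernel taking part in forwarding user-mode interrupts. Figure 8 is a schematic flowchart of the user-mode interrupt routing hardware provided by an embodiment of this application. Referring to Figure 8, the flow can include the following steps:
Step 801: The Router Engine receives an interrupt message from the interrupt source over the bus, uses the Message Decoder to parse the user-mode interrupt request number from the message, and sends the request number to the Selector (for the IRT).
Step 802: The Selector reads the entry for the interrupt request number from the Interrupt Router Table in memory.
Step 803: Determine whether the interrupt request number is out of range. If it is, go to step 804; otherwise go to step 805.
Step 804: End the interrupt flow.
Step 805: Pass the entry data to the Interrupt Router Entry Decoder.
Step 806: The Interrupt Router Entry Decoder parses the fields of the entry and checks whether the valid bit (Valid) is 0. If it is 0, go to step 804; otherwise go to step 807.
Step 807: Pass the Event ID to the Arbiter and the Service ID to the Selector (for the IST).
Step 808: The Selector reads the entry for the Service ID from the Interrupt Service Table in memory, parses its fields, and checks whether the valid bit (Valid) is 0. If it is 0, go to step 804; otherwise go to step 809.
Step 809: Send the Physical Processor ID, Interrupt Service Context, and Interrupt Pending Buffer fields to the Arbiter.
Step 810: The Arbiter writes the Event ID into the memory block pointed to by Interrupt Pending Buffer.
Step 811: Compare the Interrupt Pending Buffer with the value saved in the CPU's GPR. If they are the same, go to step 812; otherwise go to step 813.
Here the Arbiter uses its internal comparators (Comparator, CMP) to determine whether the CPU identified by Physical Processor ID is running the thread that provides the handler for this interrupt; specifically, it makes this determination by comparing the Interrupt Pending Buffer with the value saved in the CPU's GPR.
Step 812: Trigger a user-mode interrupt on the target CPU.
Step 813: Trigger a doorbell (Doorbell) interrupt on the target CPU.
Step 814: When the CPU receives a user-mode interrupt, it does not trap into kernel mode. When the CPU receives a Doorbell interrupt, it traps into kernel mode. With this hardware routing mechanism, the interrupt event can be received directly in user mode whenever the interrupt handling thread is running, on whichever CPU. The Doorbell interrupt notifies the kernel to prepare a safe user-mode interrupt handling context.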
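The routing flow of steps 801 to 813 can be modelled as a single lookup function. This is a behavioural sketch only, not the hardware design: the tables are plain Python lists, the pending buffers are integers keyed by an assumed address, and `cpu_pending_addr` stands in for the per-CPU value that step 811 compares against. All names are hypothetical, and the bounds check on the Service ID is an added assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class RouterEntry:          # one Interrupt Router Table entry
    valid: bool
    event_id: int
    service_id: int


@dataclass
class ServiceEntry:         # one Interrupt Service Table entry
    valid: bool
    context_addr: int       # Interrupt Service Context
    pending_addr: int       # Interrupt Pending Buffer
    cpu_id: int             # Physical Processor ID


def route_interrupt(irq: int,
                    irt: List[RouterEntry],
                    ist: List[ServiceEntry],
                    pending_buffers: Dict[int, int],
                    cpu_pending_addr: Dict[int, int]) -> Optional[Tuple[str, int]]:
    """Return ("user", cpu) or ("doorbell", cpu), or None if the flow ends."""
    if irq >= len(irt):                       # step 803 -> 804
        return None
    router = irt[irq]                         # steps 802, 805
    if not router.valid:                      # step 806 -> 804
        return None
    if router.service_id >= len(ist):         # assumed bounds check
        return None
    service = ist[router.service_id]          # step 808
    if not service.valid:
        return None
    # Step 810: record the event as one bit in the thread's pending buffer.
    current = pending_buffers.get(service.pending_addr, 0)
    pending_buffers[service.pending_addr] = current | (1 << router.event_id)
    # Step 811: is the target CPU running the handling thread?
    if cpu_pending_addr.get(service.cpu_id) == service.pending_addr:
        return ("user", service.cpu_id)       # step 812
    return ("doorbell", service.cpu_id)       # step 813
```

Note that the event is recorded in the pending buffer before the delivery type is chosen, so the handler finds it regardless of whether it is entered directly or through the kernel doorbell path.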
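The embodiments of this application can call the user-mode interrupt handler quickly even when the interrupt handling thread is blocked or running in kernel mode. As shown in Figure 6, the hardware module provides a set of registers for the fast call of user-mode interrupts: uicontext, uienable, uivector, and uinoteinfo. To solve the reentrancy problem caused by calling a user-mode interrupt handler from kernel mode, the hardware module provides two more registers: the unsafe context register (unsafe context register, UCR) and the suspended context register (ongoing context register, OCR).
Figure 9 is a schematic diagram of the user-mode interrupt fast call hardware module provided by an embodiment of this application. Referring to Figure 9,
the fast call registers can include the user-mode interrupt vector register (uivector), the user-mode interrupt enable register (uienable), the user-mode interrupt memo information register (uinoteinfo), and the user-mode interrupt context register (uicontext). The registers are described as follows:
uivector (user interrupt vector): records the virtual address of a user-mode interrupt handler. When the CPU returns from kernel mode to user mode, it fetches and executes instructions from the address held in uivector. When the CPU is executing in user mode and is interrupted by a user-mode interrupt, it also fetches and executes from the address held in uivector. This register cannot be accessed from user mode.
uienable (user interrupt enable): masks user-mode interrupts. When the register value is 0, the CPU is not interrupted by user-mode interrupts; when the value is 1, the CPU may be interrupted by user-mode interrupts. This register can be accessed from user mode.
uinoteinfo (user interrupt noteinfo): records information associated with the interrupt handling thread running on the CPU. As shown in Figure 9, the register has three parts: bit 0 indicates whether the thread running on the CPU can handle user-mode interrupts ("V" in Figure 9); bit 1 indicates whether the thread running on the CPU has trapped into the kernel ("T" in Figure 9); the remaining bits record the address of a memory block that registers all interrupts pending for the thread ("Pending Buffer Address" in Figure 9).
uicontext (user interrupt context): records the address of a memory block whose data the kernel uses to perform a fast address space switch. The memory is allocated by the kernel and holds the page table base address of the address space containing the interrupt handler (ATT_ADDR), the entry address of the interrupt handler (UI_PC), and the stack pointer of the interrupt handler (UI_SP).
ukcp (unsafe kernel context pointer): when the kernel is interrupted by a user-mode interrupt, the context at that point (PC, GPRs, CSRs, and so on) is saved in a kernel stack frame, and the address of that stack frame is recorded in this register.
okcp (ongoing kernel context pointer): when the user-mode interrupt handler traps into the kernel, the context at the trap (PC, GPRs, CSRs, and so on) is saved in a kernel stack frame, and the address of that stack frame is recorded in this register.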
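The uinoteinfo layout described above (bit 0 = V, bit 1 = T, remaining bits = pending buffer address) can be sketched as an encode/decode pair. The sketch assumes that the pending buffer address is at least 4-byte aligned, so that its two low bits are free to carry V and T; the text does not state the alignment, and the function names are hypothetical.

```python
def encode_uinoteinfo(pending_addr: int, v: bool, t: bool) -> int:
    """Build a uinoteinfo value: bit 0 = V, bit 1 = T, the rest = address."""
    if pending_addr & 0x3:
        # Assumption: the address is aligned so bits 0 and 1 are unused.
        raise ValueError("pending buffer address must be 4-byte aligned")
    return pending_addr | (int(t) << 1) | int(v)


def decode_uinoteinfo(reg: int) -> dict:
    """Split a uinoteinfo value into its three parts."""
    return {
        "V": reg & 1,                # thread on this CPU can handle user interrupts
        "T": (reg >> 1) & 1,         # thread has trapped into the kernel
        "pending_addr": reg & ~0x3,  # address of the pending-interrupt block
    }


def runs_in_user_mode(reg: int) -> bool:
    """T == 0 and V == 1 means the handling thread is running in user mode."""
    fields = decode_uinoteinfo(reg)
    return fields["T"] == 0 and fields["V"] == 1
```

For example, `encode_uinoteinfo(0x1000, v=True, t=False)` yields `0x1001`, which `runs_in_user_mode` reports as a thread running in user mode.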
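The fast call mechanism for user-mode interrupts provided by the embodiments of this application uses the uinoteinfo and uicontext registers to let the kernel switch quickly to the address space of the interrupt handling thread, and uses the UCR and OCR registers to keep the kernel safely reentrant at all times. Specifically, on the one hand, the uinoteinfo register holds the state information of the thread currently running on the CPU, including the pending interrupt information (the pending buffer address field), the interrupt-thread-trapped-in-kernel flag (uinoteinfo.T), and the interrupt-thread-online flag (uinoteinfo.V). Together with the hardware interrupt routing module and the Interrupt Service Table entries, this allows the thread state to be determined quickly. The benefit of distinguishing the V and T states is that pointless page table switches are avoided.
On the other hand, the uicontext register speeds up the call into user mode. The hardware interrupt routing module automatically fills uicontext with the page table, PC, and SP needed to handle the interrupt. The benefit of this mechanism is that the kernel does not need a complex lookup to obtain the page table base address; using the information provided by hardware, it completes the address space switch and the call of the interrupt handler with minimal code.
In addition, the CPU can respond quickly to user-mode interrupts even in kernel mode, which serves both energy-efficiency-sensitive scenarios such as DPDK and response-latency-sensitive scenarios such as user-mode IPC and RPC.
In addition, two kernel stack pointer backup registers (UCR and OCR) are introduced: UCR records the stack pointer at the moment the kernel is interrupted, and OCR records the stack pointer at the moment the interrupt handler traps into the kernel. With these two stack pointers the kernel's reentrant state can be restored quickly.
A conventional kernel can execute a user-mode function only after an exception or system call has finished and it has returned to user mode, which usually involves a complex wake-up and thread scheduling path. With the uinoteinfo and uicontext registers, the kernel avoids this wake-up and scheduling overhead: it obtains the page table, PC, and SP of the user-mode function directly from the uicontext register and completes the call at minimal cost.
In an alternative, the user-mode interrupt can also be triggered directly from a user-mode application to implement fast IPC.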
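Figure 10 is a schematic flowchart of hardware delivery of a user-mode interrupt provided by an embodiment of this application. Referring to Figure 10, the flow includes the following steps:
Step 1001: Before delivering the interrupt to the CPU, the user-mode interrupt routing engine (Router Engine) determines whether the pending buffer address in uinoteinfo is the same as the Interrupt Pending Buffer of the IST entry.
The handler can be called directly only while the interrupt handling thread is running in user mode, so before delivering the interrupt to the CPU the Router Engine compares the contents of the user-mode interrupt memo information register (uinoteinfo) with the IST entry. If they differ, go to step 1002. If the pending buffer address in uinoteinfo is the same as the Interrupt Pending Buffer of the IST entry, the interrupt thread is running on the CPU; go to step 1003.
Step 1002: The Router Engine writes the Interrupt Service Context of the interrupt service routine (Interrupt Service Routines, ISR) table entry into the uicontext register and then delivers a special kernel Doorbell interrupt to the CPU. The CPU traps into the kernel, and the kernel reads from the memory pointed to by uicontext the page table base address of the interrupt handler, the PC of the interrupt handler, and the stack frame of the interrupt handler, and uses these three items to switch address space quickly.
Step 1003: Determine whether uinoteinfo.V equals 1. If it does, go to step 1004; otherwise go back to step 1002.
Step 1004: Determine whether uinoteinfo.T equals 0. If it does, go to step 1005; otherwise go back to step 1002.
Here, uinoteinfo.T==0 and uinoteinfo.V==1 indicate that the current thread is running in user mode.
Step 1005: Determine whether uienable equals 1. If it does, go to step 1006; otherwise go to step 1007.
Step 1006: Deliver a user-mode interrupt to the CPU, so that the CPU starts fetching from the address in uivector and executes the handler directly in user mode.
Step 1007: Trap into the kernel to handle a doorbell (Doorbell) interrupt.
In summary, the handler can be called directly only while the interrupt handling thread is running in user mode, so before delivering the interrupt to the CPU the Router Engine compares uinoteinfo with the IST entry. If the pending buffer address in uinoteinfo is the same as the Interrupt Pending Buffer of the IST entry, the interrupt thread is running on the CPU. If uinoteinfo.T equals 0 and uinoteinfo.V equals 1, the current thread (the thread that provides the handler for this interrupt) is running in user mode; the Router Engine delivers the user-mode interrupt to the CPU, and the CPU starts fetching from the address in uivector and executes the handler directly in user mode. If uinoteinfo.T equals 0 but uinoteinfo.V does not equal 1, the Router Engine writes the Interrupt Service Context of the ISR table entry into the uicontext register and then delivers a special kernel Doorbell interrupt to the CPU; the CPU traps into the kernel, and the kernel reads from the memory pointed to by uicontext the page table base address, PC, and stack frame of the interrupt handler and uses these three items to switch address space quickly. When the interrupt handling thread traps into the kernel, uinoteinfo.T must be set to 1; before it returns to user mode, uinoteinfo.T must be set to 0. When the thread is scheduled off the CPU, uinoteinfo.V must be set to 0; when it is scheduled onto the CPU, the pending buffer address field of uinoteinfo must be updated and uinoteinfo.V set to 1.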
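The decision chain of steps 1001 to 1007 can be summarised as one function. This is a behavioural sketch of the checks described above, not the hardware implementation; the enum and function names are hypothetical, and the sketch does not model what the kernel does after it receives a doorbell.

```python
from enum import Enum


class Delivery(Enum):
    USER_INTERRUPT = "user-mode interrupt (step 1006)"
    KERNEL_DOORBELL = "kernel doorbell, uicontext filled first (step 1002)"
    DOORBELL = "doorbell because uienable is 0 (step 1007)"


def choose_delivery(ist_pending_addr: int,
                    uinote_pending_addr: int,
                    v: int, t: int, uienable: int) -> Delivery:
    """Follow steps 1001-1007 and return the kind of interrupt delivered."""
    # Step 1001: is the handling thread the one running on this CPU?
    if uinote_pending_addr != ist_pending_addr:
        return Delivery.KERNEL_DOORBELL
    # Step 1003: the thread must be online (V == 1).
    if v != 1:
        return Delivery.KERNEL_DOORBELL
    # Step 1004: the thread must not be trapped in the kernel (T == 0).
    if t != 0:
        return Delivery.KERNEL_DOORBELL
    # Step 1005: user-mode interrupts must be enabled.
    if uienable == 1:
        return Delivery.USER_INTERRUPT
    return Delivery.DOORBELL
```

Only the combination of a matching pending buffer address, V == 1, T == 0, and uienable == 1 leads to direct delivery in user mode; every other combination ends in a doorbell that the kernel handles.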
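Figure 11 is a schematic diagram of the classification of software modules provided by an embodiment of this application. Referring to Figure 11, the kernel layer (software layer) includes the scheduling module, the exception handling module, the user-mode interrupt event management module, the user-mode interrupt routing engine driver, the interrupt address space fast switching module, and the kernel safe reentrant module.
The user-mode interrupt routing engine driver operates the user-mode interrupt engine hardware provided by this application directly, including configuring the memory-mapped registers and constructing byte codes and filling them into the Byte Code Buffer. The user-mode interrupt event management module allocates and manages the Interrupt Router Table and Interrupt Service Table in memory, registers threads as user-mode interrupt handling threads, and registers interrupts as user-mode interrupts; all of these operations are ultimately carried out on hardware through the routing engine driver. The interrupt address space fast switching module operates the fast call registers provided by this application and is used in both the thread scheduling and exception handling modules. The kernel safe reentrant module operates the safe reentrant registers provided by this application and is likewise used in both the thread scheduling and exception handling modules.
The user-mode interrupt event management module (interrupt event management module for short), the interrupt address space fast switching module, and the kernel safe reentrant module are described in detail below.
User-mode interrupt event management module:
An ordinary thread must first register itself as an interrupt handling thread before it can handle user-mode interrupts. In addition, the routing target of an interrupt handled by the kernel is a physical CPU: whichever CPU handles the interrupt, the kernel can call the handler directly. A user-mode interrupt handler, however, belongs to a specific address space, so the routing target of an interrupt handled in user mode is the interrupt handling thread. The interrupt event management module therefore has two features: registering interrupt handling threads and configuring interrupt routing targets.
Figure 12 is a schematic diagram of the interrupt event management module provided by an embodiment of this application. Referring to Figure 12, a thread that needs to handle interrupts in user mode registers itself as an interrupt handling thread. The kernel allocates an entry in the Interrupt Service Table and fills in its Interrupt Service Context and Interrupt Pending Buffer fields; the memory blocks these fields point to are also allocated by the kernel. The kernel management object of each thread records the corresponding IST entry number and the addresses of the Interrupt Service Context and Interrupt Pending Buffer memory blocks. The Interrupt Pending Buffer field of an IST entry points to a memory block that holds all pending interrupt information for the interrupt thread, so that the thread can obtain its pending interrupts quickly on any CPU.
After a thread has registered as an interrupt handling thread, it configures the interrupt sources it will handle: the kernel allocates an entry in the Interrupt Router Table, sets the Route Type of the entry to User-mode (value 1), and sets the Service ID of the entry to the IST entry number of the interrupt handling thread; the Event ID is supplied by the application.
The Interrupt Router Table provided by the embodiments of this application decouples the side that triggers an interrupt from the side that handles it and can support more flexible software scenarios, for example using user-mode interrupts to accelerate conventional synchronization primitives such as futex, eventfd, pipe, and signal.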
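Figure 13 is a schematic diagram of the interrupt address space switching module provided by an embodiment of this application. Referring to Figure 13, three scenarios are described below:
In the first scenario the handling thread of the user-mode interrupt is running on the CPU. The Router Engine sends the interrupt directly to the CPU; the program running on the CPU is interrupted and execution starts from the user-mode interrupt entry held in uivector:
Step 1301: Save the interrupted CPU context, including the GPRs and PC, in the interrupt handling stack frame.
Step 1302: Iterate over all set bits in the Interrupt Pending Buffer; each set bit corresponds to one pending Event ID.
Step 1303: Obtain the user-mode interrupt handler for the Event ID and call it.
Step 1304: After all pending Event IDs have been handled, restore the GPRs and PC from the stack frame.
Step 1305: Execute the interrupt return instruction to resume the interrupted context.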
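Steps 1302 and 1303 amount to walking the set bits of the pending buffer and dispatching one handler per Event ID. The sketch below models the pending buffer as a Python integer and the handler table as a dictionary; both are assumptions made for illustration. Saving and restoring the GPRs and PC (steps 1301, 1304, 1305) happens in the real entry code and is not modelled here.

```python
from typing import Callable, Dict, Iterator, List


def pending_event_ids(pending_bits: int) -> Iterator[int]:
    """Yield the index of every set bit; each index is one pending Event ID."""
    event_id = 0
    while pending_bits:
        if pending_bits & 1:
            yield event_id
        pending_bits >>= 1
        event_id += 1


def dispatch_pending(pending_bits: int,
                     handlers: Dict[int, Callable[[int], None]]) -> List[int]:
    """Call the registered handler for each pending event (steps 1302-1303)."""
    handled = []
    for event_id in pending_event_ids(pending_bits):
        handlers[event_id](event_id)   # hypothetical per-event handler table
        handled.append(event_id)
    return handled
```

A pending buffer of `0b10110`, for instance, dispatches Event IDs 1, 2, and 4 in that order.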
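In the second scenario the interrupt handling thread is running in kernel mode, so the T bit of the uinoteinfo register is 1. The Router Engine sends a Doorbell interrupt to the CPU, which is handled by the kernel. The Doorbell handler mainly does the following:
Step 1311: Save the interrupted context in a kernel stack frame and record the kernel stack top address in the unsafe context register UCR.
Step 1312: Check the value of uinoteinfo.T. If it is 1, the interrupt handling thread is executing in kernel mode and no address space switch is needed.
Step 1313: Fetch the user-mode interrupt handler (UI_PC) and the interrupt stack frame address (UI_SP) from the memory pointed to by uicontext and construct a user-mode context stack frame in which the PC and SP registers are set to UI_PC and UI_SP respectively.
Step 1314: Execute the interrupt return instruction to return to the user-mode interrupt entry function.
In the third scenario the interrupt handling thread is not executing on the CPU, so the V bit of the uinoteinfo register is 0. The Router Engine again sends a Doorbell interrupt to the CPU. The Doorbell handling flow is essentially the same as in the second scenario, with an additional interrupt address space switch:
Step 1321: Fetch the page table base address of the interrupt handling thread (ATT_ADDR) from the memory pointed to by uicontext.
Step 1322: Switch the CPU's page table base address; to avoid aliasing in the translation lookaside buffer (Translation Lookaside Buffer, TLB), all TLB entries must also be flushed.
Step 1323: Temporarily pin the interrupt handling thread to the current CPU so that the kernel does not migrate it to another CPU.
Of these three scenarios, the first has the lowest cost of calling the interrupt handler and the third has the highest. As Figure 13 shows, however, even when the address space must be switched, uicontext avoids the most expensive part, task scheduling. The overall task switching cost is below 100 instruction cycles; with a CPU clock of 2 GHz, the interrupt address space switch costs less than 100 ns. Compared with existing user-mode interrupt schemes, this application has a 10x to 20x performance advantage in the third scenario.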
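The cycle and time figures quoted above can be checked with a one-line conversion. The helper name below is hypothetical; the inputs (100 cycles, 2 GHz) are the ones given in the text.

```python
def cycles_to_ns(cycles: int, freq_hz: int) -> float:
    """Convert a cycle count to nanoseconds at the given clock frequency."""
    return cycles * 1_000_000_000 / freq_hz


# Upper bound from the text: fewer than 100 cycles at 2 GHz.
upper_bound_ns = cycles_to_ns(100, 2_000_000_000)
```

At 2 GHz one cycle takes 0.5 ns, so 100 cycles correspond to 50 ns, which is consistent with the stated bound of less than 100 ns.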
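Figure 14 is a schematic diagram of the safe reentrant module provided by an embodiment of this application. Referring to Figure 14, to support user-mode interrupt handlers, the embodiments of this application let a user-mode interrupt interrupt the kernel's execution flow directly and return quickly to user mode to execute the handler. This leaves the kernel in a non-reentrant state: if handling the user-mode interrupt uses any kernel function, kernel state may be corrupted. To solve this problem, the embodiments of this application provide two registers, UCR (unsafe context register) and OCR (ongoing context register), and define "safe reentry points" and "reentry detection points" in the kernel so that kernel functions can be used safely while a user-mode interrupt is being handled. The steps are as follows:
Step 1401: Define the kernel exception handling entry as the "reentry detection point" and the point at which the kernel returns to user mode as the "safe reentry point".
Step 1402: Save the context of the kernel interrupted by the user-mode interrupt in a memory block, record the address of the memory block in the UCR register, and then return quickly to user mode to execute the interrupt handler.
The Interrupt Service Context field of an Interrupt Service Table entry points to a memory block whose data gives software the information it needs to call the user-mode interrupt handler quickly.
Step 1403: The kernel checks the UCR at the "reentry detection point". A non-zero UCR value means that the CPU is in an unsafe reentry state and the kernel's functions cannot yet be used directly. In that case the kernel context is first saved in a memory block, the address of the memory block is recorded in the OCR register, and the current kernel execution flow is suspended until the kernel returns to the safe reentry state.
Step 1404: The address of the memory block that holds the kernel context is then retrieved through the OCR register, the CPU is restored to the interrupted state, and the interrupted kernel continues executing up to the "safe reentry point".
Step 1405: When the kernel reaches the "safe reentry point", the suspended kernel context (the one saved in step 1403) is restored through the OCR register. At this point the kernel's functions can be used safely.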
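The ordering enforced by steps 1401 to 1405 can be illustrated with a small state model. This is one possible reading, not the kernel implementation: it assumes that the context resumed in step 1404 is the one recorded through UCR in step 1402 and that OCR holds only the suspended request, although the step text names the OCR register in both places. Clearing UCR when the interrupted kernel is resumed is a further assumption, and all class and method names are hypothetical.

```python
class ReentryModel:
    """Toy model of the UCR/OCR protocol; contexts are plain strings."""

    def __init__(self):
        self.ucr = 0          # address of the interrupted kernel context, 0 = none
        self.ocr = 0          # address of the suspended kernel request, 0 = none
        self.memory = {}      # address -> saved context
        self._next_addr = 0x1000
        self.log = []

    def _save(self, context: str) -> int:
        addr = self._next_addr
        self._next_addr += 0x100
        self.memory[addr] = context
        return addr

    def user_interrupt_in_kernel(self, kernel_context: str) -> None:
        # Step 1402: save the interrupted kernel context, record it in UCR,
        # then return to user mode to run the handler.
        self.ucr = self._save(kernel_context)
        self.log.append("run user handler")

    def kernel_entry(self, request: str) -> None:
        # Step 1403: the reentry detection point checks UCR.
        if self.ucr == 0:
            self.log.append(f"serve {request}")
            return
        self.ocr = self._save(request)
        self.log.append(f"suspend {request}")
        # Step 1404: resume the interrupted kernel flow (assumed to be the
        # context recorded through UCR) and clear UCR (assumption).
        resumed = self.memory.pop(self.ucr)
        self.ucr = 0
        self.log.append(f"resume {resumed}")

    def safe_reentry_point(self) -> None:
        # Step 1405: the kernel is safely reentrant again; serve the
        # suspended request recorded in OCR.
        if self.ocr:
            request = self.memory.pop(self.ocr)
            self.ocr = 0
            self.log.append(f"serve {request}")
```

Running the sequence "kernel interrupted during read(), handler issues write(), kernel reaches the safe reentry point" produces the log `run user handler`, `suspend write()`, `resume read()`, `serve write()`, which shows that the handler's request is served only after the interrupted kernel work has finished.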
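Based on steps 1401 to 1405, fast interrupt address space switching and the safe reentrant module are the core features of the software part of this application. In most scenarios that use user-mode interrupts the interrupt handling thread is blocked, so the ability of these two modules to call user-mode interrupt handlers quickly makes the scheme more competitive in performance and applicability than other user-mode interrupt mechanisms.
The Data Plane Development Kit (Data Plane Development Kit, DPDK) is a user-mode network card driver framework that provides high-performance IO processing of network packets. To achieve high performance, DPDK applications must occupy a group of CPUs for polling. When the network is relatively idle these CPUs spin, which causes high energy consumption in the data center. Using the interrupt mode of DPDK can reduce energy consumption.
Figure 15 is a schematic flowchart of the DPDK interrupt mode provided by an embodiment of this application. Referring to Figure 15, when the network is idle the CPU can enter the idle state to save energy; the energy consumption of interrupt mode and polling mode differs by a factor of 40. However, the DPDK interrupt mode introduces additional overhead, which increases latency when sending and receiving packets and may cause packet loss. In key services such as cloud core, high latency and packet loss make service performance unacceptable.
By adding the user-mode interrupt mechanism provided by this application to a RISC-V environment, DPDK is adapted to the user-mode interrupt mode on top of the Linux kernel. The components involved in this embodiment include: the hardware interrupt routing engine, the hardware user-mode interrupt call registers, and the hardware safe reentrant registers; the kernel interrupt event management module, interrupt routing engine driver, interrupt address switching module, and safe reentrant module; and the user-mode-layer interrupt thread registration module and interrupt event registration module.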
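Figure 16 is a schematic diagram of a software framework provided by an embodiment of this application. Referring to Figure 16, the flow by which the DPDK interrupt module responds to a network card interrupt includes the following steps:
Step 1601: The DPDK App calls the user-mode interrupt thread registration module and provides the function address and stack frame address used for handling user-mode interrupts, corresponding to (1) in the figure.
Step 1602: The DPDK App calls the user-mode interrupt registration module and sets the routing target of an interrupt to this thread, corresponding to (2) in the figure.
Steps 1601 and 1602 are both completed through the kernel's interrupt event management module; specifically, available table entries are allocated from the IST and IRT in memory and initialized, corresponding to (3) and (4) in the figure.
Step 1603: The allocated IST and IRT entries are configured into the Router Engine hardware module provided by this application through the kernel's interrupt routing engine driver. This step requires constructing byte codes, filling them into the Byte Code Buffer in memory, and waiting for the Router Engine to finish executing them, corresponding to (5) and (6) in the figure.
The steps above are the extra work a DPDK App must do to use the user-mode interrupt mechanism provided by this application; they need to be executed only once in the lifetime of the DPDK App.
Step 1605: Whenever the DPDK App traps into kernel mode, is scheduled out by the kernel, or is scheduled onto a CPU, the kernel updates the uinoteinfo register, corresponding to (7) in the figure.
Step 1606: The network card triggers an RX interrupt. The Router Engine hardware module receives the interrupt first and then accesses the IST and IRT in memory to select the interrupt delivery target. If the DPDK App is currently running on the CPU, the CPU jumps directly to the user-mode interrupt function entry, corresponding to (8) in the figure.
Step 1607: If the DPDK App is blocked in kernel mode or has been scheduled out, the CPU jumps to the kernel-mode interrupt address space fast switching module, which first saves the CPU state, then executes the address space switching flow, and finally returns to the user-mode interrupt handler entry of the DPDK App; see (9) and (10) in the figure.
Figure 17 is a latency comparison between the DPDK interrupt mode and the user-mode interrupt provided by an embodiment of this application. Referring to Figure 17, the traditional DPDK interrupt mode flow consists of the network card interrupt, trapping into the kernel, waking the task, task scheduling, switching page tables, and, after returning to user mode, packet processing by the DPDK App. The flow based on user-mode interrupts provided by the embodiments of this application is simpler: network card interrupt, trapping into the kernel, switching page tables, and returning to the user-mode interrupt entry function. When the traditional DPDK interrupt mode responds through kernel-mode interrupt forwarding, the response latency is 16000 to 18000 cycles, whereas when the user-mode interrupt provided by the embodiments of this application responds through a user-mode interrupt call, the response latency is 500 to 1500 cycles, which reduces the interrupt response latency.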
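The two latency ranges quoted above imply a range of possible improvement factors, which the following short calculation makes explicit. The function name is hypothetical; the cycle counts are the ones stated in the text.

```python
from typing import Tuple


def speedup_range(old_cycles: Tuple[int, int],
                  new_cycles: Tuple[int, int]) -> Tuple[float, float]:
    """Return (smallest, largest) ratio of old latency to new latency."""
    old_low, old_high = old_cycles
    new_low, new_high = new_cycles
    return old_low / new_high, old_high / new_low


# Traditional DPDK interrupt mode vs. the user-mode interrupt path.
smallest, largest = speedup_range((16000, 18000), (500, 1500))
```

With these figures the response latency improves by a factor of roughly 10.7 in the least favourable pairing (16000 vs. 1500 cycles) and by a factor of 36 in the most favourable one (18000 vs. 500 cycles).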
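An embodiment of this application further provides an interrupt controller, including a processor and a memory, where the memory is configured to store at least one instruction, and the instruction, when loaded and executed by the processor, implements the interrupt handling method provided in any embodiment of this application.
An embodiment of this application further provides an interrupt controller, including a user-mode interrupt routing engine and a user-mode interrupt fast-invocation engine. The user-mode interrupt fast-invocation engine includes a user-mode interrupt note information register, configured to record state information of the thread currently running on the processor. When the interrupt controller determines that the interrupt thread is running on the processor, it checks the state information recorded in the user-mode interrupt note information register. If it determines that the current thread is running in user mode, it delivers a user-mode interrupt to the processor, an instruction is fetched based on the address stored in the user-mode interrupt vector register, and the interrupt handler is executed in user mode.
In one implementation, the user-mode interrupt fast-invocation engine further includes a user-mode interrupt context register, configured to record a memory address. Based on the information recorded in the user-mode interrupt note information register, if it is determined that the interrupt-thread-trapped-in-kernel flag of that register is not in the first state, and/or that the interrupt-thread-online flag of that register is not in the second state, the interrupt handling context of the interrupt service routine table entry is written into the user-mode interrupt context register, and a doorbell interrupt is delivered to the processor. The user-mode interrupt note information register and the user-mode interrupt context register introduced in the interrupt controller provided in this embodiment allow the kernel to switch quickly to the address space of the interrupt handling thread.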
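The controller's choice between a user-mode interrupt and a doorbell interrupt can be written as a single function. The sketch below is an assumption-laden illustration: the text does not give numeric encodings for the "first state" and "second state", so `FIRST_STATE = 0` and `SECOND_STATE = 1` are invented here, and `FastCallRegisters` and `deliver` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

FIRST_STATE = 0   # assumed encoding: the thread has not trapped into the kernel
SECOND_STATE = 1  # assumed encoding: the thread is online


@dataclass
class FastCallRegisters:
    trapped_flag: int              # field of the note information register
    online_flag: int               # field of the note information register
    vector: int                    # user-mode interrupt vector register
    context: Optional[int] = None  # user-mode interrupt context register


def deliver(regs: FastCallRegisters, ist_context: int) -> str:
    """Return which interrupt the controller delivers to the processor."""
    in_user_mode = (regs.trapped_flag == FIRST_STATE
                    and regs.online_flag == SECOND_STATE)
    if in_user_mode:
        # Direct path: the processor fetches from the vector register address.
        return "user interrupt -> fetch from 0x%x" % regs.vector
    # Otherwise hand the table entry's context to the kernel and ring the doorbell.
    regs.context = ist_context
    return "doorbell interrupt"
```

For example, `deliver(FastCallRegisters(0, 1, 0x4000), 0x9000)` takes the direct path and leaves the context register untouched, while changing either flag makes the function fill the context register and return the doorbell case.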
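An embodiment of this application further provides an electronic device. The electronic device may include the interrupt controller described above, so that the interrupt handling method described above is implemented through the interrupt controller.
An embodiment of this application further provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements the interrupt handling method provided in any embodiment of this application.
It can be understood that the application may be an application program installed on a terminal (nativeApp) or a web page program in a browser on the terminal (webApp); the embodiments of this application impose no limitation on this.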
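A person skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above can be found in the corresponding processes of the foregoing method embodiments, and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is merely a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in another form.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected as needed to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
An integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions that cause a computer apparatus (which may be a personal computer, a server, a network apparatus, or the like) or a processor to perform some of the steps of the methods described in the embodiments of this application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disc.
The foregoing descriptions are merely preferred embodiments of this application and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of this application shall fall within its scope of protection.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of this application.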

Claims (11)

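  1. An interrupt handling method, wherein the method comprises:
    when it is determined that an interrupt thread is running on a processor, based on state information of the thread currently running on the processor as recorded in a user-mode interrupt note information register, if it is determined that the current thread is running in user mode, delivering a user-mode interrupt to the processor, fetching an instruction based on an address stored in a user-mode interrupt vector register, and executing an interrupt handler in user mode.
  2. The method according to claim 1, wherein determining, based on the information recorded in the user-mode interrupt note information register, that the current thread is running in user mode comprises:
    determining that the interrupt-thread-trapped-in-kernel flag of the user-mode interrupt note information register is in a first state and that the interrupt-thread-online flag of the user-mode interrupt note information register is in a second state, and thereby determining that the current thread is running in user mode.
  3. The method according to claim 1, wherein before determining that the current thread is running in user mode, the method further comprises:
    determining, based on the information recorded in the user-mode interrupt note information register, whether the current thread is running in user mode;
    wherein, based on the information recorded in the user-mode interrupt note information register, if it is determined that the interrupt-thread-trapped-in-kernel flag of the user-mode interrupt note information register is not in the first state, and/or that the interrupt-thread-online flag of the user-mode interrupt note information register is not in the second state, the interrupt handling context of the interrupt service routine table entry is written into a user-mode interrupt context register, and a doorbell interrupt is delivered to the processor.
  4. The method according to claim 1, wherein before it is determined that the interrupt thread is running on the processor, the method further comprises:
    determining whether the pending interrupt list in the interrupt service table is the same as the information on interrupts pending thread handling in the user-mode interrupt note information register, and if they are the same, determining that the interrupt thread is running on the processor.
  5. The method according to claim 4, wherein if it is determined that the pending interrupt list in the interrupt service table is not the same as the information on interrupts pending thread handling in the user-mode interrupt note information register, the interrupt service context of the interrupt service routine entry is written into the user-mode interrupt context register, and a kernel doorbell interrupt is delivered to the processor.
  6. The method according to claim 1, wherein before it is determined that the interrupt thread is running on the processor, the method further comprises:
    when it is determined that the processor has trapped into kernel mode, checking the value of an unsafe context register at a safe check point, and if it is determined that the processor is in an unsafe re-entry state, saving the kernel context in a set memory and recording the address of the set memory in a suspended context register, wherein the safe check point is the kernel exception handling entry;
    restoring the processor to the interrupted state based on the address of the set memory recorded in the suspended context register, and causing the interrupted kernel to continue executing to a safe re-entry point, wherein the safe re-entry point is the point at which the kernel returns to user mode;
    when the kernel reaches the safe re-entry point, restoring the suspended kernel context through the suspended context register.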
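  7. An interrupt controller, wherein the interrupt controller comprises:
    a processor and a memory, wherein the memory is configured to store at least one instruction, and the instruction, when loaded and executed by the processor, implements the interrupt handling method according to any one of claims 1 to 6.
  8. An interrupt controller, wherein the interrupt controller comprises a user-mode interrupt routing engine and a user-mode interrupt fast-invocation engine;
    the user-mode interrupt fast-invocation engine comprises a user-mode interrupt note information register, configured to record state information of the thread currently running on the processor;
    when the interrupt controller determines that an interrupt thread is running on the processor, based on the state information of the thread currently running on the processor as recorded in the user-mode interrupt note information register, if it is determined that the current thread is running in user mode, a user-mode interrupt is delivered to the processor, an instruction is fetched based on an address stored in a user-mode interrupt vector register, and an interrupt handler is executed in user mode.
  9. The interrupt controller according to claim 8, wherein the user-mode interrupt fast-invocation engine further comprises a user-mode interrupt context register, configured to record a memory address;
    wherein, based on the information recorded in the user-mode interrupt note information register, if it is determined that the interrupt-thread-trapped-in-kernel flag of the user-mode interrupt note information register is not in the first state, and/or that the interrupt-thread-online flag of the user-mode interrupt note information register is not in the second state, the interrupt handling context of the interrupt service routine table entry is written into the user-mode interrupt context register, and a doorbell interrupt is delivered to the processor.
  10. An electronic device, wherein the electronic device comprises the interrupt controller according to claim 8 or 9.
  11. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the interrupt handling method according to any one of claims 1 to 6.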
PCT/CN2023/103632 2022-07-07 2023-06-29 Interrupt handling method, electronic device, and storage medium WO2024007934A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210803080.9A CN117407054A (zh) 2022-07-07 2022-07-07 Interrupt handling method, electronic device, and storage medium
CN202210803080.9 2022-07-07

Publications (1)

Publication Number Publication Date
WO2024007934A1 true WO2024007934A1 (zh) 2024-01-11

Family

ID=89454259

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/103632 WO2024007934A1 (zh) 2022-07-07 2023-06-29 Interrupt handling method, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN117407054A (zh)
WO (1) WO2024007934A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117670649A (zh) * 2024-01-30 2024-03-08 南京砺算科技有限公司 Metadata writing and reading method, and graphics processing unit

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150205661A1 (en) * 2014-01-20 2015-07-23 Lenovo Enterprise Solutions (Singapore) Pte. Ltd. Handling system interrupts with long-running recovery actions
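CN107003899A (zh) * 2015-10-28 2017-08-01 华为技术有限公司 Interrupt response method, apparatus, and base station
CN112231007A (zh) * 2020-11-06 2021-01-15 中国人民解放军国防科技大学 Device driver method based on a cooperative processing framework of user-mode and kernel-mode drivers
CN113010275A (zh) * 2019-12-20 2021-06-22 大唐移动通信设备有限公司 Interrupt handling method and apparatus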

Also Published As

Publication number Publication date
CN117407054A (zh) 2024-01-16


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23834701

Country of ref document: EP

Kind code of ref document: A1