WO2023174126A1 - 一种数据处理方法及装置 - Google Patents

一种数据处理方法及装置 Download PDF

Info

Publication number
WO2023174126A1
WO2023174126A1 · PCT/CN2023/080336 · CN2023080336W
Authority
WO
WIPO (PCT)
Prior art keywords
cpu
host
virtual machine
cache line
vcpu
Prior art date
Application number
PCT/CN2023/080336
Other languages
English (en)
French (fr)
Inventor
摩西
郭凯杰
罗犇
Original Assignee
阿里巴巴(中国)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴(中国)有限公司 filed Critical 阿里巴巴(中国)有限公司
Publication of WO2023174126A1 publication Critical patent/WO2023174126A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5077Logical partitioning of resources; Management or configuration of virtualized resources
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/45583Memory management, e.g. access or allocation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/45595Network integration; Enabling network access in virtual machine instances
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/50Indexing scheme relating to G06F9/50
    • G06F2209/5018Thread allocation

Definitions

  • the present application relates to the field of computer technology, and in particular, to a data processing method and device.
  • Virtualization is a key technology of cloud computing.
  • Virtualization technology can virtualize a physical machine (host) into one or more virtual machines.
  • Each virtual machine has its own virtual hardware, such as VCPU (Virtual Central Processing Unit), virtual memory, and virtual I/O devices, thus forming an independent virtual machine execution environment.
  • Virtualization technology is widely used in fields such as cloud computing and high-performance computing due to its high fault tolerance and high resource utilization.
  • VMM (Virtual Machine Monitor)
  • CPU (Central Processing Unit), memory, and I/O devices, etc.
  • This application shows a data processing method and device.
  • This application shows a data processing method, applied to a host machine in which at least a virtual machine and a detection thread are running. The method includes: predicting a first estimated number of times that the VCPU allocated to the virtual machine is expected to execute cross-cache-line operations on the host's central processing unit (CPU) in a first time period after the current moment; when the first estimated number of executions is greater than or equal to a preset threshold, turning off the host CPU's function of throwing an exception when its memory access bus is locked, so that the host's CPU does not throw an exception when the bus is locked; and switching the state of the detection thread from the silent state to the activated state, so that the detection thread polls the CPU running data recorded in the performance monitoring unit (PMU) corresponding to the host's CPU and, based on the polled data, obtains the actual number of cross-cache-line operations executed on the host's CPU by the VCPU allocated to the virtual machine in the first time period.
  • This application shows a data processing device, applied to a host machine in which at least a virtual machine and a detection thread are running. The device includes: a first prediction module, used to predict a first estimated number of times that the VCPU allocated to the virtual machine is expected to execute cross-cache-line operations on the host's CPU in a first time period after the current moment; a shutdown module, used to turn off, when the first estimated number of executions is greater than or equal to the preset threshold, the host CPU's function of throwing an exception when its memory access bus is locked, so that the host's CPU does not throw an exception when the bus is locked; a first switching module, used to switch the state of the detection thread from the silent state to the activated state; a polling module, used to poll the CPU running data recorded in the performance monitoring unit (PMU) corresponding to the host's CPU; and an acquisition module, used to obtain, based on the polled CPU running data, the actual number of cross-cache-line operations executed on the host's CPU by the VCPU allocated to the virtual machine in the first time period.
  • the present application shows an electronic device.
  • the electronic device includes: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to perform the method shown in any of the foregoing aspects.
  • the present application shows a non-transitory computer-readable storage medium; when instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the method shown in any of the foregoing aspects.
  • the present application illustrates a computer program product that, when instructions in the computer program product are executed by a processor of an electronic device, enables the electronic device to perform the method shown in any of the foregoing aspects.
  • this application includes the following advantages:
  • The method predicts a first estimated number of times that the VCPU allocated to the virtual machine is expected to execute cross-cache-line operations on the host's CPU in the first time period after the current moment.
  • When the first estimated number of executions is greater than or equal to the preset threshold, the host CPU's function of throwing an exception when its memory access bus is locked is turned off, so that the host's CPU does not throw an exception when the bus is locked; and the state of the detection thread is switched from the silent state to the activated state.
  • The detection thread then polls the CPU running data recorded in the PMU corresponding to the host's CPU and, based on the polled data, obtains the actual number of cross-cache-line operations executed on the host's CPU by the VCPU allocated to the virtual machine in the first time period.
  • In this way, the overall performance of the host and the performance of the virtual machine can be improved.
  • Figure 1 is a step flow chart of a data processing method in this application.
  • Figure 2 is a step flow chart of a data processing method in this application.
  • Figure 3 is a step flow chart of a data processing method in this application.
  • Figure 4 is a structural block diagram of a data processing device of the present application.
  • Figure 5 is a structural block diagram of a device of the present application.
  • The CPU in the host allows unaligned memory accesses.
  • When the operands of an atomic operation are misaligned such that an access to the CPU's cache spans two cache lines, a split lock event is triggered.
  • For example, suppose a cache line consists of 64 bytes, a preceding buffer buf fills 62 bytes, and a member of the struct counter occupies 8 bytes. Accessing that member then involves splicing the contents of the two cache lines, so performing an atomic operation on it triggers the split lock event.
  • Cache coherence protocols can only guarantee consistency at cache-line granularity; an access that spans two cache lines cannot be kept consistent at that granularity, so the CPU falls back to special logic (a cold path) that locks the memory access bus.
  • When the memory access bus is locked, the average memory access latency of the host's CPU increases significantly, and intercepting the memory-bus accesses of other threads in the host and of other cores of the host's CPU also consumes some of the host CPU's computing resources; the overall performance of the host is therefore reduced.
  • Conversely, if cross-cache-line operations are suppressed, the average memory access latency of the host's CPU can be reduced as much as possible, and the number of intercepted memory-bus accesses by other threads in the host and by other cores of the host's CPU can be reduced as much as possible.
  • The number of interruptions to the data processing of those other threads and cores can then also be reduced.
  • To make such countermeasures possible, the actual number of cross-cache-line operations performed on the host's CPU by the VCPU allocated to the virtual machine needs to be detected first.
  • Reducing those operations then reduces the number of intercepted memory-bus accesses by other threads in the host and other cores of the host's CPU, which reduces the number of interruptions to their data processing, which in turn reduces the average memory access latency of the host's CPU and thereby improves the overall performance of the host.
  • The countermeasures may include reducing the virtual machine's CPU utilization on the host machine, reducing the virtual machine's access frequency to the host's CPU, etc.
  • During research, the inventor found that when the VCPU allocated to the virtual machine actually performs a cross-cache-line operation on the host's CPU, the memory access bus of the host's CPU is locked, and the host's CPU then throws an exception, such as a Bus Lock Exception or a #DB exception.
  • The kernel-mode VMM on the host machine is required to handle the exception.
  • To handle it, the virtual machine exits to the kernel-mode VMM (for example, the virtual machine is suspended and the kernel-mode VMM resumes running).
  • After the exit (that is, after the kernel-mode VMM resumes running), the kernel-mode VMM tries to handle the exception.
  • The kernel-mode VMM may determine that it cannot handle the exception, or that the exception should be handled by the user-mode VMM; in that case the kernel-mode VMM notifies the user-mode VMM, which, upon receiving the notification, obtains the exception and tries to handle it.
  • After obtaining the exception, the user-mode VMM can also obtain information related to it (for example, the cause of the exception can be recorded in the area shared between the user-mode VMM and the kernel-mode VMM, from which it can be parsed whether the cause was a triggered split lock event). Based on that information it can determine whether the VCPU allocated to the virtual machine performed a cross-cache-line operation on the host's CPU (that is, whether that VCPU triggered a split lock event), and count the actual number of cross-cache-line operations the VCPU performed on the host's CPU.
  • Each such exception (a Bus Lock Exception, a #DB exception, etc.) must be handled. When the VCPU allocated to the virtual machine performs cross-cache-line operations on the host's CPU frequently (for example, tens of thousands or hundreds of thousands of times per second), on the one hand, the VMM in the host frequently enters the exception handling process, putting the host into a state resembling a DoS (Denial of Service) attack and reducing the overall performance of the host.
  • On the other hand, the performance of the virtual machine is reduced by the repeated exits from the virtual machine to the kernel-mode VMM.
  • The inventor therefore believes that when the VCPU allocated to the virtual machine performs cross-cache-line operations on the host's CPU frequently (for example, tens of thousands or hundreds of thousands of times per second), the exception-based method above is not an appropriate way to detect the actual number of cross-cache-line operations the VCPU performs on the host's CPU.
  • During research, the inventor also found that the host CPU's function of throwing an exception when its memory access bus is locked has a function switch, which can be turned on or off.
  • The function switch can therefore be turned off, so that when the VCPU allocated to the virtual machine performs cross-cache-line operations on the host's CPU frequently (for example, tens of thousands or hundreds of thousands of times per second), the host's CPU no longer throws an exception each time its memory access bus is locked, thereby avoiding the reduction of the overall performance of the host and of the performance of the virtual machine.
  • With the function switch turned off, no exception is thrown, so the inventor abandoned the idea of counting split lock events from exception information, that is, having the user-mode VMM obtain each exception together with its related information (for example, the cause of the exception recorded in the area shared between the user-mode VMM and the kernel-mode VMM, from which it can be parsed whether the cause was a triggered split lock event), determining from that information whether the VCPU allocated to the virtual machine performed a cross-cache-line operation on the host's CPU (whether it triggered a split lock event), and counting the actual number of such operations.
  • Instead, the inventor thought of creating a detection thread on the host, which is used to detect the actual number of cross-cache-line operations performed on the host's CPU by the VCPU allocated to the virtual machine.
  • the method is applied to a host machine, in which at least a virtual machine and a detection thread are running.
  • the method includes:
  • In step S101, a first estimated number of executions of cross-cache-line operations that the VCPU allocated to the virtual machine is expected to perform on the host's CPU in a first time period after the current moment is predicted.
  • performing a cross-cache line operation on the host's CPU will trigger a split lock event.
  • When the split lock event is triggered, the memory access bus of the host's CPU is locked, which reduces the overall performance of the host.
  • In order to detect the actual number of cross-cache-line operations performed by the VCPU allocated to the virtual machine on the host's CPU, time can be divided into multiple consecutive time periods. Each period can have the same duration, and adjacent periods can be contiguous: the end time of the earlier period and the start time of the later period can coincide.
  • The time period can then be used as the basic unit of detection; for example, the actual number of cross-cache-line operations performed on the host's CPU by the VCPU allocated to the virtual machine can be detected for each time period.
  • Of course, detection can also be performed at a granularity finer than the time period, to improve the real-time performance of detecting that actual number of executions.
  • At least two methods can be used to detect the actual number of times the VCPU allocated to the virtual machine actually performs cross-cache line operations on the host's CPU.
  • In one embodiment, if the actual number of cross-cache-line operations performed on the host's CPU by the VCPU allocated to the virtual machine within a time period needs to be detected, then before that period begins, the number of cross-cache-line operations the VCPU is expected to perform on the host's CPU during the period can first be predicted.
  • a preset threshold can be set based on actual conditions.
  • If the estimated number of executions is greater than or equal to the preset threshold, one of the two methods can be used to detect the actual number of cross-cache-line operations performed on the host's CPU by the VCPU allocated to the virtual machine during this time period.
  • Otherwise, the other method can be used to detect that actual number of executions during this time period.
  • Accordingly, for the first time period after the current moment, before detecting the actual number of cross-cache-line operations executed on the host's CPU by the VCPU allocated to the virtual machine in that period, the first estimated number of executions (a predicted number, not the actual number of executions) can first be predicted, and then step S102 is executed.
  • Historical data can be used to predict the first estimated number of executions of cross-cache-line operations that the VCPU allocated to the virtual machine is expected to perform on the host's CPU in the first time period after the current moment.
  • The historical data may include the number of cross-cache-line operations executed on the host's CPU by the VCPU allocated to the virtual machine in at least one historical period before the current moment.
  • The historical data can be analyzed to find the pattern of the VCPU's historical execution counts of cross-cache-line operations on the host's CPU, and based on that pattern the first estimated number of executions in the first time period after the current moment can be obtained.
  • That is, the historical execution counts actually performed in at least one historical period before the current moment are obtained first, and the first estimated number of executions is then derived from them.
  • In step S102, when the first estimated number of executions is greater than or equal to the preset threshold, the host CPU's function of throwing an exception when its memory access bus is locked is turned off, so that the host's CPU does not throw an exception when the bus is locked; and the state of the detection thread is switched from the silent state to the activated state.
  • In this way, the detection thread polls the CPU running data recorded in the PMU (Performance Monitoring Unit) corresponding to the host's CPU and, based on the polled data, obtains the actual number of cross-cache-line operations performed on the host's CPU by the VCPU allocated to the virtual machine in the first time period.
  • shutting down the function of the host's CPU that throws an exception due to the CPU's memory access bus being locked and switching the state of the detection thread from the silent state to the active state can be executed in parallel.
  • the function of the host's CPU that throws an exception due to the CPU's memory access bus being locked can be turned off first, and then the state of the detection thread is switched from the silent state to the active state.
  • the state of the detection thread can be switched from the silent state to the active state, and then the function of the host CPU throwing an exception due to the CPU's memory access bus being locked is turned off.
  • After the host CPU's function of throwing an exception when its memory access bus is locked has been turned off, the host's CPU no longer throws an exception when the bus is locked. Therefore, as analyzed above, exceptions can no longer be used to detect whether, or how many times, the VCPU allocated to the virtual machine actually performs cross-cache-line operations on the host's CPU.
  • For this reason, a detection thread can be created in the host in advance, and this detection thread can be used to detect the actual number of cross-cache-line operations performed on the host's CPU within the first time period by the VCPU allocated to the virtual machine.
  • the detection thread has multiple states, including, for example, a silent state and an activated state.
  • The detection thread in the silent state does not work and can run with low power and low resource consumption; for example, it does not occupy the host CPU's overhead (computing resources).
  • the detection thread in the active state can work.
  • In the activated state, the detection thread automatically polls the CPU running data recorded in the PMU corresponding to the host's CPU and obtains, based on the polled data, the actual number of cross-cache-line operations performed on the host's CPU by the VCPU allocated to the virtual machine within the first time period.
  • The detection thread can store the actual number of cross-cache-line operations performed on the host's CPU by the VCPU allocated to the virtual machine in the first time period in a log, so that this number can later be retrieved from the log when it is needed for the virtual machine.
  • If the state of the detection thread is the silent state at this time, then, since a detection thread in the silent state does not work, the actual number of cross-cache-line operations performed on the host's CPU within the first time period by the VCPU allocated to the virtual machine cannot be detected by it.
  • Therefore, the state of the detection thread can be switched from the silent state to the activated state. The detection thread then automatically polls the CPU running data recorded in the PMU corresponding to the host's CPU and obtains, based on the polled data, that actual number of executions within the first time period.
  • In this way, the first estimated number of cross-cache-line operations that the VCPU allocated to the virtual machine is expected to execute on the host's CPU in the first time period after the current moment is predicted.
  • When the first estimated number of executions is greater than or equal to the preset threshold, the host CPU's function of throwing an exception when its memory access bus is locked is turned off, so that the host's CPU does not throw an exception when the bus is locked, and the state of the detection thread is switched from the silent state to the activated state.
  • The detection thread polls the CPU running data recorded in the PMU corresponding to the host's CPU and, based on the polled data, obtains the actual number of cross-cache-line operations executed on the host's CPU by the VCPU allocated to the virtual machine in the first time period.
  • The overall performance of the host and the performance of the virtual machine can thereby be improved.
  • the host also runs a kernel-mode VMM and a user-mode VMM.
  • In one embodiment, the user-mode VMM can predict the first estimated number of executions of cross-cache-line operations that the VCPU allocated to the virtual machine is expected to perform on the host's CPU in the first time period after the current moment.
  • The user-mode VMM can then send a shutdown request to the kernel-mode VMM; the shutdown request is used to turn off the host CPU's function of throwing an exception when its memory access bus is locked.
  • When the first estimated number of executions is greater than or equal to the preset threshold then, as analyzed above, the VMM in the host would frequently enter the exception handling process, which would reduce the overall performance of the host, and the repeated exits from the virtual machine to the kernel-mode VMM would reduce the performance of the virtual machine.
  • However, the host CPU's function of throwing an exception when its memory access bus is locked is controlled by the kernel-mode VMM, not by the user-mode VMM. Therefore, when the user-mode VMM determines, because the first estimated number of executions is greater than or equal to the preset threshold, that this function needs to be turned off, it can request the kernel-mode VMM to turn it off.
  • that is, the user-mode VMM sends a shutdown request to the kernel-mode VMM.
  • the shutdown request is used to disable the function by which the host's CPU throws an exception when the CPU's memory access bus is locked.
  • the user-mode VMM can place the shutdown request in the shared area (such as a shared memory page) between the user-mode VMM and the kernel-mode VMM, and then notify the kernel-mode VMM.
  • the kernel-mode VMM receives the shutdown request and, according to the shutdown request, disables the function by which the host's CPU throws an exception when the CPU's memory access bus is locked.
  • the kernel-mode VMM can read the shutdown request from the shared area between the user-mode VMM and the kernel-mode VMM.
  • the function by which the host's CPU throws an exception when the CPU's memory access bus is locked corresponds to a function switch: turning the switch off disables the function, and turning it on enables the function.
  • the kernel-mode VMM can turn off the function switch through the externally exposed API (Application Programming Interface) of the function switch.
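The request/response handshake through the shared area and the function switch described above can be sketched as follows. This is an illustrative Python simulation only: the class names, the `SharedArea` object, and the boolean switch flag are assumptions for illustration, not the actual VMM interfaces or MSR accesses.

```python
# Minimal simulation of the shutdown-request flow between a user-mode
# VMM and a kernel-mode VMM. All names here are hypothetical; a real
# implementation would exchange requests through a shared memory page
# and toggle a hardware feature (e.g. a model-specific register bit).

class SharedArea:
    """Stands in for a shared memory page between the two VMMs."""
    def __init__(self):
        self.request = None
        self.response = None

class KernelModeVMM:
    def __init__(self):
        # Function switch: True means the CPU throws an exception when
        # its memory access bus is locked by a cross-cache-line access.
        self.split_lock_exception_enabled = True

    def handle_notification(self, shared):
        # Read the request from the shared area and act on it.
        if shared.request == "shutdown":
            self.split_lock_exception_enabled = False  # switch off
            shared.response = "shutdown-done"

class UserModeVMM:
    def __init__(self, shared, kernel_vmm):
        self.shared = shared
        self.kernel_vmm = kernel_vmm

    def request_shutdown(self):
        # Place the request in the shared area, then notify the
        # kernel-mode VMM; read the response back afterwards.
        self.shared.request = "shutdown"
        self.kernel_vmm.handle_notification(self.shared)
        return self.shared.response

shared = SharedArea()
kvmm = KernelModeVMM()
uvmm = UserModeVMM(shared, kvmm)
resp = uvmm.request_shutdown()
print(resp)                                # shutdown-done
print(kvmm.split_lock_exception_enabled)   # False
```

The same shared-area pattern carries the startup request, the shutdown response, and the activation request described below; only the request tag differs.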
  • the detection thread can be a kernel-mode thread. Since the detection thread runs in kernel mode, the user-mode VMM can switch its state from the silent state to the active state through the kernel-mode VMM. For example, the detection thread exposes an API; the user-mode VMM can send an activation request carrying this API to the kernel-mode VMM, so that the kernel-mode VMM can use the API to switch the state of the detection thread from the silent state to the active state.
  • the user-mode VMM may send an activation request to the kernel-mode VMM.
  • the activation request is used to request to switch the status of the detection thread from the silent state to the activated state.
  • the VMM in the kernel state receives the activation request and switches the state of the detection thread from the silent state to the active state according to the activation request.
  • the kernel-mode VMM can send a shutdown response to the user-mode VMM.
  • the shutdown response is used to notify the user-mode VMM that the function by which the host's CPU throws an exception when the CPU's memory access bus is locked has been disabled.
  • the kernel-mode VMM can place the shutdown response (the processing result) in the shared area (such as a shared memory page) between the user-mode VMM and the kernel-mode VMM, and then notify the user-mode VMM.
  • the user-mode VMM receives the shutdown response and sends an activation request to the kernel-mode VMM based on the shutdown response.
  • upon being notified, the user-mode VMM can read the shutdown response from the shared area between the user-mode VMM and the kernel-mode VMM, and learn from the shutdown response that the function by which the host's CPU throws an exception when the CPU's memory access bus is locked has been disabled.
  • the detection thread polls the CPU running data recorded in the PMU corresponding to the CPU in the host; that is, it periodically obtains the CPU running data recorded in that PMU.
  • the running data obtained each time includes the number of cross-cache line operations that the VCPU allocated to the virtual machine has performed on the host's CPU as of the moment the running data is obtained.
  • in this way, the number of cross-cache line operations performed on the host's CPU by the VCPU allocated to the virtual machine can be obtained at the beginning of the first time period and at the end of the first time period, and from these two counts the actual number of cross-cache line operations performed by the VCPU on the host's CPU within the first time period can be derived.
  • the polled CPU running data includes: the first execution count of cross-cache line operations that the VCPU allocated to the virtual machine has performed on the host's CPU as of the beginning of the first time period, and the second execution count as of the end of the first time period.
  • the difference between the second execution count and the first execution count can be calculated, and the difference can be determined as the actual number of cross-cache line operations performed on the host's CPU by the VCPU allocated to the virtual machine within the first time period.
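The difference calculation above can be sketched in a few lines (an illustrative example; the counter values are invented and the function name is an assumption):

```python
# Hypothetical sketch: deriving the actual number of cross-cache-line
# operations in a period from two PMU counter samples. The PMU counter
# is monotonically increasing, so the period's count is a difference.

def actual_executions(count_at_start, count_at_end):
    """Cross-cache-line operations performed within the period."""
    return count_at_end - count_at_start

# e.g. the PMU recorded 1200 operations as of the start of the first
# period and 1750 as of its end:
print(actual_executions(1200, 1750))  # 550
```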
  • the specific process may include:
  • step S201: predict the second expected number of cross-cache line operations that the VCPU allocated to the virtual machine is expected to perform on the host's CPU in the second time period after the first time period.
  • before detecting the actual number of cross-cache line operations performed on the host's CPU by the VCPU allocated to the virtual machine in the second time period, this second expected number of executions (a predicted count, not the actual count) can first be predicted, and then step S202 is performed.
  • historical data can be used to predict the second expected number of cross-cache line operations that the VCPU allocated to the virtual machine will perform on the host's CPU in the second time period.
  • the historical data may include the number of cross-cache line operations that the VCPU allocated to the virtual machine actually performed on the host's CPU in at least one historical time period before the current time; based on this historical data, the second expected number of executions in the second time period after the current time can be predicted.
  • for example, the historical data can be analyzed to find a pattern in the historical execution counts of cross-cache line operations performed on the host's CPU by the VCPU allocated to the virtual machine, and the second expected number of executions can be derived from that pattern.
  • alternatively, the historical execution count from at least one historical time period before the current time can be obtained and used directly as the second expected number of executions.
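The two prediction strategies just described can be sketched as follows. The patent leaves the prediction method open; an average over recent periods and "reuse the most recent period's count" are assumed strategies here, and the history values are invented:

```python
# Illustrative prediction of the expected execution count from
# historical per-period counts of cross-cache-line operations.

def predict_average(history):
    """Pattern-based prediction: mean of the historical counts."""
    return sum(history) / len(history)

def predict_last(history):
    """Direct reuse: the most recent period's actual count."""
    return history[-1]

history = [480, 520, 510, 490]  # counts from four historical periods
print(predict_average(history))  # 500.0
print(predict_last(history))     # 490
```

The predicted value is then compared against the preset threshold to decide whether the exception feature stays enabled.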
  • step S202: when the second expected number of executions is less than the preset threshold, enable the function by which the host's CPU throws an exception when the CPU's memory access bus is locked, so that the host's CPU throws an exception when its memory access bus is locked.
  • step S203: when the related information of an exception is obtained, determine, according to that information, whether the exception was thrown by the host's CPU because the CPU's memory access bus is locked.
  • step S204: when the exception was thrown by the host's CPU because the CPU's memory access bus is locked, determine that the VCPU allocated to the virtual machine has actually performed a cross-cache line operation on the host's CPU within the second time period.
  • the actual number of cross-cache line operations performed on the host's CPU by the VCPU allocated to the virtual machine during the second time period can then be obtained by counting such exceptions.
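The exception-counting path used when the expected count is below the threshold can be sketched like this. The event names, the dispatch logic, and the threshold value are illustrative assumptions; on real hardware the trap would arrive as a CPU exception attributed to the faulting VCPU:

```python
# Sketch of steps S202-S204: the exception feature stays enabled and
# each exception whose cause is a split lock event increments the
# actual execution count for the period.

PRESET_THRESHOLD = 1000

class ExceptionCounter:
    def __init__(self):
        self.actual_executions = 0

    def on_exception(self, cause):
        # S203/S204: only count exceptions thrown because the memory
        # access bus was locked (a split lock event), not other traps.
        if cause == "split_lock":
            self.actual_executions += 1

counter = ExceptionCounter()
second_expected = 12          # predicted count for the second period
if second_expected < PRESET_THRESHOLD:
    # Feature enabled: every cross-cache-line locked access traps.
    for cause in ["split_lock", "page_fault", "split_lock"]:
        counter.on_exception(cause)
print(counter.actual_executions)  # 2
```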
  • the host also runs a kernel-mode VMM and a user-mode VMM.
  • the user-mode VMM can predict the second expected number of cross-cache line operations that the VCPU allocated to the virtual machine is expected to perform on the host's CPU in the second time period after the first time period.
  • the user-mode VMM can send a startup request to the kernel-mode VMM.
  • the startup request is used to enable the function by which the host's CPU throws an exception when the CPU's memory access bus is locked.
  • when the second expected number of executions is less than the preset threshold, as analyzed above, although the VMM in the host will enter the exception handling process, it will not do so frequently, so the overall performance of the host is not reduced much; and although the virtual machine will exit to the kernel-mode VMM, it will not do so frequently, so the performance of the virtual machine is not reduced much. In this case, the degree of performance degradation of the host and the virtual machine can usually be tolerated.
  • the detection thread polls the CPU running data recorded in the PMU corresponding to the CPU in the host, that is, it periodically obtains that running data. Constrained by the polling period (the time interval between two adjacent polls), the actual number of cross-cache line operations performed on the host's CPU by the VCPU allocated to the virtual machine within one period is often only obtained one period later, which affects the timeliness of detection to a certain extent (for example, the time periods described above cannot be divided freely according to the actual situation; they can only be divided according to the polling period, and the minimum duration of a time period is the polling period and cannot be any smaller, so timeliness is reduced).
  • since the VMM will not frequently enter the exception handling process and the virtual machine will not frequently exit to the kernel-mode VMM, the resulting degradation of host and virtual machine performance can usually be tolerated. Therefore, to improve the timeliness of obtaining the actual number of cross-cache line operations performed on the host's CPU by the VCPU allocated to the virtual machine, the exception itself can be used: when the user-mode VMM receives the exception, it can also obtain the related information of the exception.
  • for example, the cause of the exception can be recorded in the shared area between the user-mode VMM and the kernel-mode VMM and parsed from there; the cause may be, for example, that a split lock event was triggered.
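Parsing the exception cause out of a shared-area record can be sketched as follows. The record format (`key=value` pairs separated by semicolons) is purely an assumption for illustration; the patent only says the cause can be written to the shared area and parsed by the user-mode VMM:

```python
# Sketch: the user-mode VMM reads a record from the shared area and
# extracts the exception cause to decide whether a split lock event
# (locked memory access bus) triggered the exception.

def parse_exception_cause(shared_area_record):
    # e.g. a hypothetical record like "cause=split_lock;vcpu=3"
    fields = dict(item.split("=") for item in shared_area_record.split(";"))
    return fields.get("cause")

record = "cause=split_lock;vcpu=3"
print(parse_exception_cause(record))  # split_lock
```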
  • the function by which the host's CPU throws an exception when the CPU's memory access bus is locked is controlled by the kernel-mode VMM, and the user-mode VMM cannot control it directly. Therefore, when the user-mode VMM determines, in the case where the second expected number of executions is less than the preset threshold, that this function needs to be enabled, it can request the kernel-mode VMM to enable the function.
  • that is, the user-mode VMM sends a startup request to the kernel-mode VMM.
  • the startup request is used to enable the function by which the host's CPU throws an exception when the CPU's memory access bus is locked.
  • the user-mode VMM can place the startup request in the shared area (such as a shared memory page) between the user-mode VMM and the kernel-mode VMM, and then notify the kernel-mode VMM.
  • the kernel-mode VMM receives the startup request and, according to the startup request, enables the function by which the host's CPU throws an exception when the CPU's memory access bus is locked.
  • the kernel-mode VMM can read the startup request from the shared area between the user-mode VMM and the kernel-mode VMM.
  • the function corresponds to a function switch: turning the switch off disables the function, and turning it on enables the function by which the host's CPU throws an exception when the CPU's memory access bus is locked.
  • when the kernel-mode VMM enables the function according to the startup request, it can turn on the function switch corresponding to the function.
  • the kernel-mode VMM can turn on the function switch through the externally exposed API of the function switch.
  • when determining whether the exception was thrown by the host's CPU because the CPU's memory access bus is locked, the user-mode VMM performs the determination based on the related information of the exception.
  • if so, the user-mode VMM determines that the VCPU allocated to the virtual machine has actually performed a cross-cache line operation on the host's CPU within the second time period.
  • in this case, the detection thread does not need to be used to detect the actual number of cross-cache line operations performed on the host's CPU by the VCPU allocated to the virtual machine in the second time period, so the state of the detection thread can be switched from the active state to the silent state.
  • the host also runs a kernel-mode VMM and a user-mode VMM.
  • the user-mode VMM can send a silent request to the kernel-mode VMM.
  • the silent request is used to request switching the state of the detection thread from the active state to the silent state.
  • the kernel-mode VMM can receive a quiet request and switch the status of the detection thread from the active state to the quiet state according to the quiet request.
  • alternatively, the detection thread can directly request the kernel-mode VMM to switch its state from the active state to the silent state; this application does not limit the specific switching method.
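The detection thread's two states (silent and active) behave as a small state machine: polling only takes effect while the thread is active. The sketch below simulates this; the state names match the description, but the class and the polling callback are illustrative assumptions:

```python
# Minimal sketch of the detection thread's activate/silence switching.
# A silent thread ignores PMU polls; an active one records samples.

class DetectionThread:
    SILENT, ACTIVE = "silent", "active"

    def __init__(self):
        self.state = self.SILENT
        self.samples = []

    def activate(self):
        self.state = self.ACTIVE

    def silence(self):
        self.state = self.SILENT

    def poll(self, pmu_counter_value):
        # Polling has an effect only in the active state.
        if self.state == self.ACTIVE:
            self.samples.append(pmu_counter_value)

t = DetectionThread()
t.poll(100)            # ignored: thread is silent
t.activate()
t.poll(120)            # recorded
t.silence()
t.poll(150)            # ignored again
print(t.state)         # silent
print(t.samples)       # [120]
```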
  • this embodiment includes the following process:
  • step S301 the user-mode VMM predicts that the VCPU allocated to the virtual machine is expected to perform the first estimated number of cross-cache line operations on the host's CPU in the first period after the current time.
  • step S302: when the first expected number of executions is greater than or equal to the preset threshold, the user-mode VMM sends a shutdown request to the kernel-mode VMM.
  • the shutdown request is used to disable the function by which the host's CPU throws an exception when the CPU's memory access bus is locked, so that the host's CPU does not throw an exception when its memory access bus is locked.
  • step S303 the VMM in the kernel state receives the shutdown request, and according to the shutdown request, shuts down the function of the host's CPU that throws an exception due to the CPU's memory access bus being locked.
  • step S304: the kernel-mode VMM sends a shutdown response to the user-mode VMM.
  • the shutdown response is used to notify the user-mode VMM that the function by which the host's CPU throws an exception when the CPU's memory access bus is locked has been disabled.
  • step S305 the user-mode VMM receives the shutdown response, and sends an activation request to the kernel-mode VMM according to the shutdown response.
  • the activation request is used to request to switch the state of the detection thread from a silent state to an activated state.
  • step S306 the VMM in the kernel state receives the activation request, and switches the state of the detection thread from the silent state to the active state according to the activation request.
  • step S307 when the state of the detection thread is switched to the active state, the detection thread polls the running data of the CPU recorded in the PMU corresponding to the CPU in the host machine.
  • step S308 the detection thread obtains the actual number of cross-cache line operations performed on the host's CPU by the VCPU allocated to the virtual machine within the first period of time based on the polled CPU running data.
  • the user-mode VMM predicts that the VCPU allocated to the virtual machine is expected to perform the first estimated number of cross-cache line operations on the host's CPU in the first time period after the current time.
  • the user-mode VMM sends a shutdown request to the kernel-mode VMM.
  • the shutdown request is used to disable the function by which the host's CPU throws an exception when the CPU's memory access bus is locked, so that the host's CPU does not throw an exception when its memory access bus is locked.
  • the VMM in the kernel state receives the shutdown request and shuts down the function of the host CPU that throws an exception due to the CPU's memory access bus being locked according to the shutdown request.
  • the kernel-mode VMM sends a shutdown response to the user-mode VMM.
  • the shutdown response is used to notify the user-mode VMM that the function by which the host's CPU throws an exception when the CPU's memory access bus is locked has been disabled.
  • the user-mode VMM receives the shutdown response, and sends an activation request to the kernel-mode VMM based on the shutdown response.
  • the activation request is used to request to switch the state of the detection thread from a silent state to an activated state.
  • the VMM in the kernel state receives the activation request and switches the state of the detection thread from the silent state to the active state according to the activation request.
  • the detection thread polls the running data of the CPU recorded in the PMU corresponding to the CPU in the host.
  • the detection thread obtains the actual number of cross-cache line operations performed on the host's CPU by the VCPU allocated to the virtual machine within the first period of time based on the polled CPU running data.
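The end-to-end flow of steps S301 through S308 can be condensed into one sketch. Everything here is an illustrative assumption (the prediction strategy, the threshold, and the sample values); the real flow spans two VMMs, a kernel thread, and hardware PMU registers:

```python
# Sketch of S301-S308: predict the first expected count; when it meets
# the threshold, disable the split-lock exception feature and activate
# the detection thread; then derive the actual count from two PMU
# counter samples taken at the period's start and end.

PRESET_THRESHOLD = 1000

def predict_first_expected(history):
    # Assumed strategy: reuse the latest historical period's count.
    return history[-1]

def run_first_period(history, pmu_samples):
    expected = predict_first_expected(history)        # S301
    exception_enabled = True
    thread_state = "silent"
    if expected >= PRESET_THRESHOLD:
        exception_enabled = False                     # S302-S304
        thread_state = "active"                       # S305-S306
    if thread_state == "active":
        start, end = pmu_samples                      # S307: poll PMU
        actual = end - start                          # S308: difference
    else:
        actual = None
    return exception_enabled, actual

enabled, actual = run_first_period(history=[50_000],
                                   pmu_samples=(10_000, 62_500))
print(enabled)  # False
print(actual)   # 52500
```

With the feature disabled, the frequent cross-cache line operations no longer trap, yet their count is still recovered from the PMU samples, which is the performance benefit the text claims.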
  • in this way, even when the VCPU allocated to the virtual machine performs cross-cache line operations on the host's CPU frequently (for example, tens of thousands or hundreds of thousands of times per second), these operations can still be detected.
  • as a result, the overall performance of the host and the performance of the virtual machine can be improved.
  • Referring to FIG. 4, there is shown a structural block diagram of a data processing device of the present application, which is applied to a host machine.
  • the host machine at least runs a virtual machine and a detection thread; the device includes:
  • the first prediction module 11, used to predict the first expected number of cross-cache line operations that the VCPU allocated to the virtual machine is expected to perform on the host's central processing unit (CPU) in the first time period after the current time;
  • the shutdown module 12, used to disable, when the first expected number of executions is greater than or equal to the preset threshold, the function by which the host's CPU throws an exception when the CPU's memory access bus is locked, so that the host's CPU does not throw an exception when its memory access bus is locked;
  • the first switching module 13, used to switch the state of the detection thread from the silent state to the active state;
  • the polling module 14, used to poll the CPU running data recorded in the performance monitor (PMU) corresponding to the CPU in the host;
  • the acquisition module 15, used to obtain, based on the polled CPU running data, the actual number of cross-cache line operations performed on the host's CPU by the VCPU allocated to the virtual machine within the first time period.
  • the host also runs a kernel-mode VMM and a user-mode VMM;
  • the first prediction module includes: a first prediction unit included in the user-mode VMM, used to predict the first expected number of cross-cache line operations that the VCPU allocated to the virtual machine is expected to perform on the host's central processing unit (CPU) in the first time period after the current time.
  • accordingly, the shutdown module includes: a first sending unit included in the user-mode VMM, and a first receiving unit and a shutdown unit included in the kernel-mode VMM; the first sending unit is used to send a shutdown request to the first receiving unit, the shutdown request being used to disable the function; the first receiving unit is used to receive the shutdown request, and the shutdown unit is used to disable the function according to the shutdown request.
  • the first switching module includes: a second sending unit included in the user-mode VMM, and a second receiving unit and a first switching unit included in the kernel-mode VMM; the second sending unit is used to send an activation request to the second receiving unit, the activation request being used to request switching the state of the detection thread from the silent state to the active state; the second receiving unit is used to receive the activation request, and the first switching unit is used to switch the state of the detection thread from the silent state to the active state according to the activation request.
  • the first switching module further includes: a third sending unit included in the kernel-mode VMM, and a third receiving unit included in the user-mode VMM; the third sending unit is used to send a shutdown response to the third receiving unit, the shutdown response being used to notify that the function has been disabled; the third receiving unit is used to receive the shutdown response, and the second sending unit is further used to send the activation request to the second receiving unit according to the shutdown response.
  • the device further includes: a second prediction module, configured to predict the second expected number of cross-cache line operations that the VCPU allocated to the virtual machine is expected to perform on the host's CPU in the second time period after the first time period;
  • a startup module, configured to enable the function when the second expected number of executions is less than the preset threshold, so that the host's CPU throws an exception when the memory access bus of the host's CPU is locked;
  • the first determination module, used to determine, when the related information of the exception is obtained, whether the exception was thrown by the host's CPU because the CPU's memory access bus is locked, according to that information;
  • the second determination module, used to determine, when the exception was thrown by the host's CPU because the CPU's memory access bus is locked, that the VCPU allocated to the virtual machine has actually performed a cross-cache line operation on the host's CPU within the second time period.
  • the host also runs a kernel-mode VMM and a user-mode VMM;
  • the second prediction module includes: a second prediction unit included in the user-mode VMM, used to predict the second expected number of cross-cache line operations that the VCPU allocated to the virtual machine is expected to perform on the host's CPU in the second time period after the first time period.
  • accordingly, the startup module includes: a fourth sending unit included in the user-mode VMM, and a fourth receiving unit and a startup unit included in the kernel-mode VMM; the fourth sending unit is used to send a startup request to the fourth receiving unit, the startup request being used to enable the function; the fourth receiving unit is used to receive the startup request, and the startup unit is used to enable the function according to the startup request.
  • the first determination module includes a first determination unit included in the user-mode VMM; the first determination unit is used to determine, based on the related information of the exception, whether the exception was thrown by the host's CPU because the CPU's memory access bus is locked.
  • accordingly, the second determination module includes a second determination unit included in the user-mode VMM; the second determination unit is used to determine that the VCPU allocated to the virtual machine has actually performed a cross-cache line operation on the host's CPU within the second time period.
  • the device further includes: a second switching module configured to switch the state of the detection thread from the active state to the silent state when the second expected number of executions is less than the preset threshold.
  • the host also runs a kernel-mode VMM and a user-mode VMM;
  • the second switching module includes: a fifth sending unit included in the user-mode VMM, and a fifth receiving unit and a second switching unit included in the kernel-mode VMM.
  • the fifth sending unit is used to send a silent request to the fifth receiving unit, the silent request being used to request switching the state of the detection thread from the active state to the silent state; the fifth receiving unit is used to receive the silent request, and the second switching unit is used to switch the state of the detection thread from the active state to the silent state according to the silent request.
  • the first prediction module includes: a first acquisition unit, used to acquire the historical number of cross-cache line operations actually performed on the host's CPU by the VCPU allocated to the virtual machine in at least one historical time period before the current time; and a second acquisition unit, used to obtain the first expected number of executions based on the historical execution count.
  • the polled CPU running data includes: the first execution count of cross-cache line operations that the VCPU allocated to the virtual machine has performed on the host's CPU as of the beginning of the first time period, and the second execution count as of the end of the first time period; the acquisition module includes: a computing unit, used to calculate the difference between the second execution count and the first execution count; and a third acquisition unit, used to obtain the actual number of executions according to the difference.
  • predict the first expected number of cross-cache line operations that the VCPU allocated to the virtual machine is expected to perform on the host's CPU in the first time period after the current time.
  • when the first expected number of executions is greater than or equal to the preset threshold, disable the function by which the host's CPU throws an exception when the CPU's memory access bus is locked, so that the host's CPU does not throw an exception when its memory access bus is locked; and switch the state of the detection thread from the silent state to the active state.
  • the detection thread polls the CPU running data recorded in the PMU corresponding to the CPU in the host, and based on the polled running data obtains the actual number of cross-cache line operations that the VCPU allocated to the virtual machine performed on the host's CPU within the first time period.
  • the overall performance of the host and the performance of the virtual machine can be improved.
  • Embodiments of the present application also provide a non-volatile readable storage medium.
  • one or more modules are stored in the storage medium. When the one or more modules are applied to a device, they can cause the device to execute instructions for the method steps in the embodiments of this application.
  • Embodiments of the present application provide one or more machine-readable media with instructions stored thereon that, when executed by one or more processors, cause the electronic device to perform one or more of the methods in the above embodiments.
  • electronic devices include servers, gateways, sub-devices, etc., and the sub-devices are Internet of Things devices and other devices.
  • Embodiments of the present disclosure may be implemented as devices using any appropriate hardware, firmware, software, or any combination thereof to perform desired configurations.
  • the devices may include servers (clusters), terminal devices such as IoT devices and other electronic devices.
  • Figure 5 schematically illustrates an exemplary apparatus 1300 that may be used to implement various embodiments in the present application.
  • FIG. 5 illustrates an exemplary apparatus 1300 having one or more processors 1302, a control module (chipset) 1304 coupled to at least one of the processor(s) 1302, memory 1306 coupled to the control module 1304, non-volatile memory (NVM)/storage 1308 coupled to the control module 1304, one or more input/output devices 1310 coupled to the control module 1304, and a network interface 1312 coupled to the control module 1304.
  • processor 1302 may include one or more single-core or multi-core processors, and processor 1302 may include any combination of general-purpose processors and special-purpose processors (e.g., graphics processors, application processors, baseband processors, etc.).
  • the device 1300 can serve as a server device such as a gateway in the embodiment of this application.
  • apparatus 1300 may include one or more computer-readable media (eg, memory 1306 or NVM/storage device 1308) having instructions 1314 and configured in combination with the one or more computer-readable media to One or more processors 1302 execute instructions 1314 to implement modules to perform actions in this disclosure.
  • processors 1302 execute instructions 1314 to implement modules to perform actions in this disclosure.
  • control module 1304 may include any suitable interface controller to provide any suitable interface controller to at least one of the processor(s) 1302 and/or any suitable device or component in communication with the control module 1304 Interface.
  • Control module 1304 may include a memory controller module to provide an interface to memory 1306 .
  • the memory controller module may be a hardware module, a software module, and/or a firmware module.
  • Memory 1306 may be used, for example, to load and store data and/or instructions 1314 for device 1300 .
  • memory 1306 may include any suitable volatile memory, such as suitable DRAM.
  • memory 1306 may include double data rate quad synchronous dynamic random access memory (DDR4SDRAM).
  • control module 1304 may include one or more input/output controllers to provide interfaces to NVM/storage device 1308 and input/output device(s) 1310 .
  • NVM/storage device 1308 may be used to store data and/or instructions 1314 .
  • NVM/storage device 1308 may include any suitable non-volatile memory (e.g., flash memory) and/or may include any suitable non-volatile storage device(s) (e.g., one or more hard drives (e.g., one or more hard drives) HDD), one or more compact disc (CD) drives and/or one or more digital versatile disc (DVD) drives).
  • suitable non-volatile memory e.g., flash memory
  • suitable non-volatile storage device(s) e.g., one or more hard drives (e.g., one or more hard drives) HDD), one or more compact disc (CD) drives and/or one or more digital versatile disc (DVD) drives.
  • NVM/storage device 1308 may include storage that is physically part of the device on which appliance 1300 is installed. A resource that is stored or accessible to the device does not necessarily need to be part of the device. For example, NVM/storage device 1308 may be accessed over the network via input/output device(s) 1310.
  • the input/output device(s) 1310 may provide an interface for the apparatus 1300 to communicate with any other suitable device, and the input/output device 1310 may include a communication component, a pinyin component, a sensor component, or the like.
  • Network interface 1312 may provide an interface for device 1300 to communicate over one or more networks, and device 1300 may communicate with one or more wireless networks in accordance with any of one or more wireless network standards and/or protocols.
  • Components perform wireless communication, such as accessing wireless networks based on communication standards, such as WiFi, 2G, 3G, 4G, 5G, etc., or their combination for wireless communication.
  • At least one of the processor(s) 1302 may be packaged with the logic of one or more controllers (eg, a memory controller module) of the control module 1304 .
  • at least one of the processor(s) 1302 may be packaged together with the logic of one or more controllers of the control module 1304 to form a system-in-package (SiP).
  • SiP system-in-package
  • at least one of the processor(s) 1302 may be integrated on the same die as the logic of one or more controllers of the control module 1304 .
  • at least one of the processor(s) 1302 may be integrated on the same die with the logic of one or more controllers of the control module 1304 to form a system on a chip (SoC).
  • SoC system on a chip
  • the apparatus 1300 may be, but is not limited to, a terminal device such as a server, a desktop computing device, or a mobile computing device (eg, a laptop computing device, a handheld computing device, a tablet computer, a netbook, etc.).
  • device 1300 may have more or fewer components and/or a different architecture.
  • device 1300 includes one or more cameras, a keyboard, a liquid crystal display (LCD) screen (including touch screen displays), a non-volatile memory port, a plurality of antennas, a graphics chip, an application specific integrated circuit ( ASIC) and speakers.
  • LCD liquid crystal display
  • ASIC application specific integrated circuit
  • Embodiments of the present application provide an electronic device, including: one or more processors; and one or more machine-readable media having instructions stored thereon, which when executed by the one or more processors, causes the electronic device to The device performs one or more methods as described herein.
  • the description is relatively simple. For relevant details, please refer to the partial description of the method embodiment.
  • Embodiments of the present application are described with reference to flowcharts and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the present application. It will be understood that each process and/or block in the flowchart illustrations and/or block diagrams, and combinations of processes and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable information processing terminal equipment to produce a machine such that the instructions are executed by the processor of the computer or other programmable information processing terminal equipment. Means are generated for implementing the functions specified in the process or processes of the flowchart diagrams and/or the block or blocks of the block diagrams.
  • These computer program instructions may also be stored in a computer or other programmable information processing terminal device capable of guiding a specific A computer-readable memory that operates in a certain manner such that instructions stored in the computer-readable memory produce an article of manufacture that includes instruction means that implements a process or processes in a flowchart and/or a block in a block diagram or Features specified in multiple boxes.
  • These computer program instructions can also be loaded onto a computer or other programmable information processing terminal equipment, so that a series of operating steps are executed on the computer or other programmable terminal equipment to produce computer-implemented processing, thereby causing the computer or other programmable terminal equipment to perform computer-implemented processing.
  • the instructions executed on provide steps for implementing the functions specified in a process or processes of the flow diagrams and/or a block or blocks of the block diagrams.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

This application provides a data processing method and apparatus. In this application, if the first predicted execution count — the number of times the VCPU allocated to a virtual machine is expected to perform cross-cache-line operations on the host CPU during an upcoming first time period — is greater than a preset threshold, the host CPU's feature of raising an exception when the CPU's memory access bus is locked is first disabled, and the state of a detection thread is then switched from the quiescent state to the active state, so that the detection thread polls the CPU run data recorded in the PMU corresponding to the CPU in the host and obtains the actual execution count from the polled run data. With this application, when the VCPU allocated to the virtual machine performs cross-cache-line operations on the host CPU very frequently, both the overall performance of the host and the performance of the virtual machine can be improved in scenarios where the actual number of cross-cache-line operations actually performed by the VCPU must be detected.

Description

Data processing method and apparatus
This application claims priority to Chinese patent application No. 202210267806.1, titled "Data processing method and apparatus", filed with the China National Intellectual Property Administration on March 17, 2022, the entire contents of which are incorporated herein by reference.
Technical field
This application relates to the field of computer technology, and in particular to a data processing method and apparatus.
Background
With the rapid development of technology, virtualization is applied ever more widely. Virtualization is a key technology of cloud computing: it can virtualize one physical machine (the host) into one or more virtual machines. Each virtual machine has its own virtual hardware, for example VCPUs (Virtual Central Processing Units), virtual memory, and virtual I/O devices, forming an independent virtual machine execution environment. Because of its high fault tolerance and high resource utilization, virtualization is widely used in fields such as cloud computing and high-performance computing.
In a virtualized environment, the VMM (Virtual Machine Monitor) is a software management layer between the host's hardware and the virtual machines. It mainly manages the host's hardware, such as the host's CPU (Central Processing Unit), memory, and I/O devices, and abstracts that hardware into corresponding virtual device interfaces for the virtual machines to use.
Summary
This application presents a data processing method and apparatus.
In a first aspect, this application presents a data processing method applied to a host on which at least a virtual machine and a detection thread run. The method includes: predicting a first predicted execution count, i.e., the number of times the VCPU allocated to the virtual machine is expected to perform cross-cache-line operations on the host's central processing unit (CPU) during a first time period after the current moment; when the first predicted execution count is greater than or equal to a preset threshold, disabling the host CPU's feature of raising an exception because the CPU's memory access bus is locked, so that the host CPU does not raise an exception when its memory access bus is locked; and switching the state of the detection thread from the quiescent state to the active state, so that the detection thread polls the CPU run data recorded in the performance monitoring unit (PMU) corresponding to the CPU in the host and, from the polled run data, obtains the actual number of times the VCPU allocated to the virtual machine actually performed cross-cache-line operations on the host CPU during the first time period.
In a second aspect, this application presents a data processing apparatus applied to a host on which at least a virtual machine and a detection thread run. The apparatus includes: a first prediction module, configured to predict the first predicted execution count of cross-cache-line operations that the VCPU allocated to the virtual machine is expected to perform on the host CPU during a first time period after the current moment; a disabling module, configured to, when the first predicted execution count is greater than or equal to a preset threshold, disable the host CPU's feature of raising an exception because the CPU's memory access bus is locked, so that the host CPU does not raise an exception when its memory access bus is locked; a first switching module, configured to switch the state of the detection thread from the quiescent state to the active state; a polling module, configured to poll the CPU run data recorded in the PMU corresponding to the CPU in the host; and an acquisition module, configured to obtain, from the polled run data, the actual number of cross-cache-line operations actually performed by the VCPU on the host CPU during the first time period.
In a third aspect, this application presents an electronic device including: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to perform the method of any of the preceding aspects.
In a fourth aspect, this application presents a non-transitory computer-readable storage medium; when instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the method of any of the preceding aspects.
In a fifth aspect, this application presents a computer program product; when instructions in the computer program product are executed by a processor of an electronic device, the electronic device is enabled to perform the method of any of the preceding aspects.
Compared with the prior art, this application has the following advantages:
In this application, the first predicted execution count of cross-cache-line operations that the VCPU allocated to the virtual machine is expected to perform on the host CPU during a first time period after the current moment is predicted. When the first predicted execution count is greater than or equal to the preset threshold, the host CPU's feature of raising an exception because its memory access bus is locked is disabled, so that the host CPU does not raise an exception when the bus is locked; and the state of the detection thread is switched from quiescent to active, so that the detection thread polls the CPU run data recorded in the PMU corresponding to the CPU in the host and obtains from it the actual number of cross-cache-line operations performed during the first time period. Thereby, when the VCPU allocated to the virtual machine performs cross-cache-line operations on the host CPU very frequently (e.g., tens or hundreds of thousands of times per second), both the overall performance of the host and the performance of the virtual machine can be improved in scenarios where the actual execution count of those operations must be detected.
Brief description of the drawings
Figure 1 is a flowchart of the steps of a data processing method of this application.
Figure 2 is a flowchart of the steps of a data processing method of this application.
Figure 3 is a flowchart of the steps of a data processing method of this application.
Figure 4 is a structural block diagram of a data processing apparatus of this application.
Figure 5 is a structural block diagram of an apparatus of this application.
Detailed description
To make the above objects, features, and advantages of this application clearer and easier to understand, this application is described in further detail below with reference to the drawings and specific embodiments.
Sometimes the CPU in the host allows unaligned memory accesses. In an unaligned access, the operand of an atomic operation may (because its address is unaligned) straddle two cache lines of the host CPU; that is, a single access to the CPU's cache spans two cache lines, which triggers a split lock event.
For example, in one example, a cache line in the CPU's cache holds 64 bytes, one member of struct counter occupies 8 bytes, and a buf field pads the preceding 62 bytes. Any access to that member therefore involves stitching together the contents of two cache lines, so an atomic operation on it triggers a split lock event.
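The arithmetic behind this example can be made concrete. The following is a minimal sketch (illustrative, not code from this application) that checks whether an access of a given size at a given byte offset straddles a 64-byte cache-line boundary; the 62-byte padding and 8-byte member mirror the struct counter example above.

```python
def crosses_cache_line(offset: int, size: int, line_size: int = 64) -> bool:
    """Return True if the byte range [offset, offset + size) spans two cache lines."""
    first_line = offset // line_size
    last_line = (offset + size - 1) // line_size
    return first_line != last_line

# The struct counter example: 62 bytes of padding, then an 8-byte member.
# Bytes 62..69 touch cache line 0 and cache line 1, so an atomic access
# to this member triggers a split lock event.
print(crosses_cache_line(62, 8))   # True: the operand straddles two lines
print(crosses_cache_line(0, 8))    # False: an aligned access stays in one line
```
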
Ordinarily, however, the cache coherence protocol only guarantees consistency at cache-line granularity. Consistency at cache-line granularity cannot be guaranteed when two cache lines are accessed at once, so to preserve the atomicity of a split-lock access, special logic (e.g., a cold path) can be used when the accessed operand straddles two cache lines: for example, locking the host CPU's memory access bus (bus lock).
While the host CPU's memory access bus is locked, accesses to the bus by other threads on the host, or by other cores of the host CPU, are blocked, interrupting the data processing of those threads and cores.
Because the data processing of those other threads and cores is interrupted, the host CPU's average memory access latency rises significantly; moreover, performing the action of blocking those accesses itself consumes some of the host CPU's compute resources. The host's overall performance therefore drops.
This raises the need to improve the overall performance of the host.
To improve overall host performance, one possible approach is to reduce the host CPU's average memory access latency as much as possible and to minimize how many times accesses to the memory access bus by other host threads or other cores of the host CPU must be blocked.
To lower the average memory access latency, one way is to reduce how many times the data processing of other host threads or other cores of the host CPU is interrupted.
To reduce those interruptions, one way is to reduce how many times accesses to the memory access bus by other host threads or cores are blocked.
Reducing the number of blocked accesses also saves the compute resources that the blocking itself consumes.
To reduce the number of blocked accesses, one way is to reduce how many times the host CPU's memory access bus is locked.
To reduce bus locking, one way is to detect the actual number of times the VCPU allocated to the virtual machine actually performs cross-cache-line operations on the host CPU.
When that actual execution count is high, countermeasures can be taken to lower the count of subsequent cross-cache-line operations, thereby reducing how many times the host CPU's memory access bus is subsequently locked.
That in turn reduces how many subsequent bus accesses by other host threads or cores are blocked, hence how many of their data processing runs are interrupted, and thus lowers the host CPU's average memory access latency — improving the host's subsequent overall performance.
Countermeasures may include: lowering the virtual machine's utilization of the host CPU, or the virtual machine's access frequency to the host CPU, and so on.
From the above analysis, detecting the actual number of cross-cache-line operations actually performed by the VCPU allocated to the virtual machine is therefore important.
This raises the need to detect that actual execution count.
To detect that actual execution count, the inventors observed: when the VCPU allocated to the virtual machine actually performs a cross-cache-line operation on the host CPU, the host CPU's memory access bus is locked, and while it is locked the host CPU raises an exception, such as a Bus Lock Exception or a #DB exception.
The kernel-mode VMM on the host must then handle the exception; execution therefore exits from the virtual machine to the kernel-mode VMM (e.g., the virtual machine is suspended and the kernel-mode VMM resumes running). After the exit to the kernel-mode VMM (i.e., after the kernel-mode VMM resumes), the kernel-mode VMM tries to handle the exception.
In one possible case, the kernel-mode VMM may determine that it cannot handle the exception, or that the exception should be handed to the user-mode VMM. It then notifies the user-mode VMM, which, upon receiving the notification, obtains the exception and attempts to handle it.
After obtaining the exception, the user-mode VMM can also obtain related information about it (for example, the cause of the exception can be recorded in the region shared between the user-mode VMM and the kernel-mode VMM, from which it can be parsed whether the exception was caused by a split lock event). From that information it can determine whether the VCPU allocated to the virtual machine performed a cross-cache-line operation on the host CPU (i.e., whether the VCPU triggered a split lock event), and it can tally by counting the actual number of such operations.
The inventors found, however, that with this approach every cross-cache-line operation performed by the VCPU causes the host CPU to raise an exception (Bus Lock Exception, #DB, etc.), which in turn causes an exit from the virtual machine (e.g., via VM-exit) to the kernel-mode VMM: the virtual machine is suspended and the kernel-mode VMM resumes running.
When the VCPU performs cross-cache-line operations on the host CPU very frequently (e.g., tens or hundreds of thousands of times per second), the VMM on the host enters the exception handling flow constantly, which degrades the host's overall performance and further, indirectly, puts the host into a state similar to being under a DoS (Denial of Service) attack. On the other hand, the repeated exits from the virtual machine to the kernel-mode VMM degrade the virtual machine's performance.
In view of this, the inventors concluded that when the cross-cache-line operations are very frequent, detecting their actual execution count in the above way is unsuitable.
This raises the need to improve the overall performance of the host and the performance of the virtual machine in scenarios where the actual count of very frequent cross-cache-line operations must be detected.
To that end, the inventors performed a statistical analysis of the above approach and found the following:
The feature whereby the host CPU raises an exception when its memory access bus is locked has a feature switch. The feature switch can be turned on or off.
In practice, if the host CPU should raise an exception when its memory access bus is locked, the feature switch can be turned on, thereby enabling the feature.
If the host CPU should not raise an exception when its memory access bus is locked, the feature switch can be turned off, thereby disabling the feature.
The inventors therefore realized that the feature switch can be turned off. Then, even when the VCPU performs cross-cache-line operations very frequently (e.g., tens or hundreds of thousands of times per second), each locking of the host CPU's memory access bus no longer makes the host CPU raise an exception, avoiding the degradation of both host and virtual machine performance.
However, although this achieves the goal of avoiding the performance degradation, the host CPU no longer raises exceptions when its memory access bus is locked, so the actual number of cross-cache-line operations performed by the VCPU can no longer be detected, and the goal above — detecting the actual count while preserving host and virtual machine performance — cannot be met by this route alone.
In view of this, the inventors further abandoned the exception-information-based counting approach described above for detecting the actual number of cross-cache-line operations performed by the VCPU, and instead conceived of creating a detection thread on the host for detecting the actual number of cross-cache-line operations actually performed by the VCPU allocated to the virtual machine.
Specifically, referring to Figure 1, a data processing method of this application is shown. The method is applied to a host on which at least a virtual machine and a detection thread run. The method includes:
In step S101, predict the first predicted execution count: the number of cross-cache-line operations that the VCPU allocated to the virtual machine is expected to perform on the host CPU during a first time period after the current moment.
In this application, performing a cross-cache-line operation on the host CPU triggers a split lock event. When a split lock event is triggered, the host CPU's memory access bus is locked, which degrades the host's overall performance.
In this application, to improve the host's overall performance, the actual number of cross-cache-line operations actually performed on the host CPU by the VCPU allocated to the virtual machine (the VCPU may be allocated by the kernel-mode VMM) can be detected. When that actual count is high, countermeasures (for example, lowering the virtual machine's subsequent utilization of the host CPU, or its subsequent access frequency to the host CPU) reduce the subsequent count of such operations, hence the number of subsequent split lock events and subsequent bus lockings, thereby improving the host's subsequent overall performance.
In one embodiment of this application, to detect the actual execution count, time can be divided into multiple consecutive time periods. The durations of the periods can be equal, and adjacent periods can be joined end to end; for example, the end moment of the earlier period can coincide with the start moment of the later period.
The time period can serve as the base unit of detection: for example, the actual number of cross-cache-line operations actually performed by the VCPU is detected for each time period.
Of course, in one possible case, a base unit smaller than the time period can also be used, to improve the timeliness of detecting the actual execution count.
Accordingly, in this application, the actual number of cross-cache-line operations actually performed by the VCPU can be detected in at least two ways.
To determine the appropriate detection method, in one embodiment, if the actual execution count for a given time period needs to be detected, the expected count of cross-cache-line operations for that period can first be predicted before the period begins.
The predicted count is then consulted to select one of the detection methods for obtaining the actual count during that period.
For example, a preset threshold can be set according to the actual situation.
When the predicted count for the period is greater than or equal to the preset threshold, one of the methods is used to detect the actual execution count for that period; when the predicted count is below the preset threshold, the other method is used. Details are described in the steps that follow.
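The threshold comparison above can be sketched in a few lines. This is an illustrative simulation, not code from this application; the labels `pmu_poll` and `exception_count` are hypothetical names for the PMU-polling path and the exception-based path, and the threshold value is an assumption (this application only says it is set per the actual situation).

```python
PRESET_THRESHOLD = 10_000  # assumed value; set according to actual conditions

def choose_detection_method(predicted_count: int,
                            threshold: int = PRESET_THRESHOLD) -> str:
    """Pick the detection path for the coming time period.

    At or above the threshold: disable the bus-lock exception and let the
    detection thread poll the PMU.  Below it: keep exceptions enabled and
    count split lock events as they are reported.
    """
    if predicted_count >= threshold:
        return "pmu_poll"        # frequent case: avoid exception storms
    return "exception_count"     # infrequent case: better timeliness

print(choose_detection_method(500_000))  # pmu_poll
print(choose_detection_method(500))      # exception_count
```
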
For example, in one embodiment, there is a first time period after the current moment. Before detecting the actual number of cross-cache-line operations performed during the first period, the first predicted execution count for the first period (a predicted, not actual, count) can first be obtained, and then step S102 is performed.
Historical data can be used to predict the first predicted execution count for the first time period after the current moment.
The historical data can include the historical execution counts of cross-cache-line operations actually performed by the VCPU allocated to the virtual machine during at least one historical time period before the current moment. The historical data can be analyzed to find patterns in those historical counts, and the first predicted execution count for the upcoming first period can be derived from the patterns.
In one embodiment of this application, the historical execution counts of cross-cache-line operations actually performed by the VCPU during at least one historical period before the current moment can be obtained, and the first predicted execution count can then be obtained from those historical counts.
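One simple way to derive a predicted count from historical periods, as a hedged sketch: this application only requires that some pattern be extracted from the historical execution counts, so the moving average over the most recent periods used below is purely an illustrative choice, not the method prescribed here.

```python
def predict_next_count(history: list[int], window: int = 3) -> int:
    """Predict the next period's count as the mean of the last `window` periods."""
    if not history:
        return 0
    recent = history[-window:]
    return round(sum(recent) / len(recent))

# e.g. split-lock counts observed over five past periods
history = [80_000, 95_000, 90_000, 100_000, 110_000]
print(predict_next_count(history))  # 100000: mean of the last three periods
```
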
In step S102, when the first predicted execution count is greater than or equal to the preset threshold, disable the host CPU's feature of raising an exception because the CPU's memory access bus is locked, so that the host CPU does not raise an exception when its memory access bus is locked; and switch the state of the detection thread from the quiescent state to the active state, so that the detection thread polls the CPU run data recorded in the PMU (Performance Monitoring Unit) corresponding to the CPU in the host and, from the polled run data, obtains the actual number of cross-cache-line operations actually performed by the VCPU on the host CPU during the first time period.
In one embodiment of this application, disabling the feature and switching the detection thread from quiescent to active can be performed in parallel.
Alternatively, in another embodiment, the feature can be disabled first, and the detection thread switched from quiescent to active afterwards.
Or, in another embodiment, the detection thread can be switched from quiescent to active first, and the feature disabled afterwards.
Because the host CPU's feature of raising an exception when its memory access bus is locked has been disabled, the host CPU does not raise exceptions when the bus is locked. But without the exceptions, as analyzed above, it is impossible to tell whether the VCPU actually performed a cross-cache-line operation on the host CPU, and hence impossible to detect by that route the actual number of such operations performed during the first time period.
Therefore, so that with the feature disabled the degradation of host and virtual machine performance is avoided and the actual execution count for the first period can still be detected, in this application a detection thread can be created in advance on the host, and the actual count for the first period can be detected with its help.
The detection thread has multiple states, for example a quiescent state and an active state.
In the quiescent state the detection thread does not work; it can be low-power and low-overhead, for example consuming no host CPU overhead (compute resources).
In the active state the detection thread can work.
In one embodiment of this application, if the detection thread is already in the active state, no state switch is needed: the detection thread automatically polls the CPU run data recorded in the PMU corresponding to the CPU in the host, and obtains from the polled run data the actual number of cross-cache-line operations actually performed by the VCPU during the first period.
Afterwards, the detection thread can store that actual execution count in a log, so that whenever the count is needed later it can be retrieved from the log.
Alternatively, in another embodiment, if the detection thread is in the quiescent state, it does not work and cannot be used for detection, so its state can be switched from quiescent to active. The detection thread then automatically polls the CPU run data recorded in the PMU corresponding to the CPU in the host, and obtains the actual count for the first period from the polled run data.
As above, the detection thread can then store the actual execution count in a log for later retrieval.
In this application, the first predicted execution count for the first time period after the current moment is predicted. When it is greater than or equal to the preset threshold, the host CPU's feature of raising an exception because its memory access bus is locked is disabled, so that the host CPU does not raise an exception when the bus is locked; and the detection thread is switched from quiescent to active, so that it polls the CPU run data recorded in the PMU corresponding to the CPU in the host and obtains from the polled data the actual count for the first period. Thereby, when the VCPU performs cross-cache-line operations on the host CPU very frequently (e.g., tens or hundreds of thousands of times per second), both the overall performance of the host and the performance of the virtual machine can be improved in scenarios where the actual execution count must be detected.
In one embodiment of this application, a kernel-mode VMM and a user-mode VMM also run on the host.
In that case, the prediction of the first predicted execution count for the first time period after the current moment can be performed by the user-mode VMM.
Correspondingly, to disable the host CPU's feature of raising an exception because its memory access bus is locked, the user-mode VMM can send the kernel-mode VMM a close request, the close request being used to disable the feature.
When the first predicted execution count is greater than or equal to the preset threshold, as analyzed above, the VMM on the host would frequently enter the exception handling flow, degrading the host's overall performance, and the virtual machine would repeatedly exit to the kernel-mode VMM, degrading the virtual machine's performance.
Therefore, to avoid degrading the overall performance of the host and of the virtual machine, as analyzed above, the feature can be disabled.
However, the feature is governed by the kernel-mode VMM; the user-mode VMM may not govern it. So if, with the first predicted count at or above the threshold, the user-mode VMM determines that the feature must be disabled, it can ask the kernel-mode VMM to disable it.
For example, the user-mode VMM sends the kernel-mode VMM a close request for disabling the feature.
In one embodiment of this application, the user-mode VMM can place the close request into a region shared between the user-mode VMM and the kernel-mode VMM (e.g., shared memory pages) and then notify the kernel-mode VMM.
The kernel-mode VMM then receives the close request and disables the feature according to it.
In one embodiment of this application, upon notification, the kernel-mode VMM can read the close request from the shared region.
The feature has a corresponding feature switch: closing the switch disables the feature, and opening the switch enables the feature.
Thus, when disabling the feature according to the close request, the kernel-mode VMM can close the feature switch corresponding to the feature.
The kernel-mode VMM can close the feature switch through the API (Application Programming Interface) the switch exposes.
The detection thread can be a kernel-mode detection thread. Since it is kernel-mode, the user-mode VMM can switch its state from quiescent to active via the kernel-mode VMM. For example, the detection thread exposes an API, and the user-mode VMM can send the kernel-mode VMM an activation request carrying that API, so that the kernel-mode VMM uses the API to switch the detection thread's state from quiescent to active.
For example, to switch the detection thread from quiescent to active, the user-mode VMM can send the kernel-mode VMM an activation request, the activation request being used to request switching the detection thread's state from quiescent to active. The kernel-mode VMM then receives the activation request and switches the detection thread from quiescent to active according to it.
In another embodiment of this application, after the kernel-mode VMM disables the feature, it can send the user-mode VMM a close response, the close response being used to notify that the feature has been disabled. In one embodiment, the kernel-mode VMM can place the close response (the processing result) into the region shared between the user-mode VMM and the kernel-mode VMM (e.g., shared memory pages) and then notify the user-mode VMM.
The user-mode VMM then receives the close response and, based on it, sends the kernel-mode VMM the activation request. In one embodiment, upon notification, the user-mode VMM can read the close response from the shared region and learn from it that the feature has been disabled.
In one embodiment of this application, the detection thread polls the CPU run data recorded in the PMU corresponding to the CPU in the host, i.e., it fetches that run data periodically.
Each fetched snapshot of the run data in the PMU includes the cumulative number of cross-cache-line operations the VCPU allocated to the virtual machine has performed on the host CPU up to the moment the run data was fetched.
Thus the actual number of cross-cache-line operations performed during the first time period can be obtained from the cumulative count fetched at the start moment of the first period and the cumulative count fetched at the end moment of the first period.
For example, in one example, the polled CPU run data includes: a first cumulative count of cross-cache-line operations the VCPU had performed on the host CPU by the start moment of the first period, and a second cumulative count by the end moment of the first period.
Then, when obtaining the actual execution count from the polled run data, the difference between the second cumulative count and the first cumulative count can be calculated, and the actual count obtained from the difference; for example, the difference itself can be taken as the actual number of cross-cache-line operations the VCPU performed on the host CPU during the first period.
Through this application, the actual number of cross-cache-line operations actually performed by the VCPU during the first time period can thus be obtained via the detection thread.
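The snapshot subtraction described above can be sketched as follows; this is an illustrative model, with the PMU reduced to a monotonically increasing cumulative counter that the detection thread snapshots at the period boundaries (the class and method names are hypothetical).

```python
class PmuCounter:
    """Toy stand-in for the PMU's cumulative split-lock event counter."""

    def __init__(self) -> None:
        self.cumulative = 0

    def record_split_lock(self, n: int = 1) -> None:
        self.cumulative += n  # the hardware counter only ever grows

    def read(self) -> int:
        return self.cumulative

def count_in_period(start_snapshot: int, end_snapshot: int) -> int:
    """Actual executions in a period = end snapshot - start snapshot."""
    return end_snapshot - start_snapshot

pmu = PmuCounter()
pmu.record_split_lock(120)     # events before the first period
first = pmu.read()             # snapshot at the period's start moment
pmu.record_split_lock(45)      # events during the first period
second = pmu.read()            # snapshot at the period's end moment
print(count_in_period(first, second))  # 45
```
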
Next comes the second time period after the first, and the actual number of cross-cache-line operations performed by the VCPU on the host CPU during the second period must be detected.
Referring to Figure 2, the specific flow can include:
In step S201, predict the second predicted execution count: the number of cross-cache-line operations that the VCPU allocated to the virtual machine is expected to perform on the host CPU during the second time period after the first.
Before detecting the actual count for the second period, the second predicted execution count (a predicted, not actual, count) can first be obtained, and then step S202 is performed.
Historical data can be used to predict the second predicted execution count for the second period.
The historical data can include the historical execution counts of cross-cache-line operations actually performed by the VCPU during at least one historical period before the current moment. The historical data can be analyzed to find patterns in those historical counts, and the second predicted execution count derived from the patterns.
In this application, the historical execution counts for at least one historical period before the current moment can be obtained, and the second predicted execution count can then be obtained from them.
In step S202, when the second predicted execution count is below the preset threshold, enable the host CPU's feature of raising an exception because its memory access bus is locked, so that the host CPU raises an exception when the bus is locked.
In this application, the specific way of enabling the feature is described later and is not detailed here.
In step S203, when related information about an exception is obtained, determine from that information whether the exception was raised by the host CPU because its memory access bus was locked.
In step S204, when the exception was raised by the host CPU because its memory access bus was locked, determine that the VCPU allocated to the virtual machine actually performed a cross-cache-line operation on the host CPU during the second period.
Further, the actual number of cross-cache-line operations performed by the VCPU during the second period can be tallied by counting.
In one embodiment of this application, a kernel-mode VMM and a user-mode VMM also run on the host.
In that case, the prediction of the second predicted execution count for the second period after the first can be performed by the user-mode VMM.
Correspondingly, to enable the host CPU's feature of raising an exception because its memory access bus is locked, the user-mode VMM can send the kernel-mode VMM a start request, the start request being used to enable the feature.
When the second predicted execution count is below the preset threshold, as analyzed above, the VMM on the host does enter the exception handling flow, but not frequently, so the degradation of the host's overall performance is small; likewise, the virtual machine does exit to the kernel-mode VMM, but not frequently, so the degradation of the virtual machine's performance is small. The degradation caused by the occasional exception handling and occasional exits is usually tolerable.
On the other hand, the detection thread polls the CPU run data recorded in the PMU corresponding to the CPU in the host, i.e., it fetches that data periodically. Because of the polling period (the time interval between two consecutive polls), the actual count of cross-cache-line operations for a period only becomes available one polling period at a time, which to some extent limits the timeliness of obtaining the actual execution count: for example, the division into time periods described earlier cannot be chosen freely according to the actual situation but must follow the polling period, and the duration of a time period can be no shorter than one polling period. Timeliness is therefore reduced.
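The granularity constraint just described can be stated numerically: with a polling interval T, the PMU path produces one per-period count every T, so the smallest measurable time period equals T. A small sketch of that constraint (illustrative arithmetic, not code from this application):

```python
def min_period_length(poll_interval_ms: int) -> int:
    """The detection time period cannot be shorter than one polling interval."""
    return poll_interval_ms

def periods_available(elapsed_ms: int, poll_interval_ms: int) -> int:
    """How many complete per-period counts the polling thread has produced so far."""
    return elapsed_ms // poll_interval_ms

# With an assumed 100 ms polling interval, after one second the PMU path has
# produced ten period counts, and no period shorter than 100 ms can be measured,
# whereas the exception path reports each split lock event as it happens.
print(min_period_length(100), periods_available(1_000, 100))  # 100 10
```
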
Therefore, since with the second predicted execution count below the preset threshold the performance degradation caused by the occasional exception handling and occasional exits to the kernel-mode VMM is usually tolerable, the exception-information-based counting approach described earlier — in which the user-mode VMM obtains the exception and its related information (e.g., from the region shared with the kernel-mode VMM, parsing whether the exception was caused by a split lock event), determines from that information whether the VCPU performed a cross-cache-line operation on the host CPU (i.e., triggered a split lock event), and tallies the actual count by counting — can be accepted for detecting the actual execution count, in order to improve the timeliness of the obtained counts.
This requires enabling the host CPU's feature of raising an exception because its memory access bus is locked.
However, the feature is governed by the kernel-mode VMM; the user-mode VMM may not govern it. So if, with the second predicted count below the threshold, the user-mode VMM determines that the feature must be enabled, it can ask the kernel-mode VMM to enable it.
For example, the user-mode VMM sends the kernel-mode VMM a start request, the start request being used to enable the feature.
In one embodiment of this application, the user-mode VMM can place the start request into the region shared between the user-mode VMM and the kernel-mode VMM (e.g., shared memory pages) and then notify the kernel-mode VMM.
The kernel-mode VMM then receives the start request and enables the feature according to it.
In one embodiment of this application, upon notification, the kernel-mode VMM can read the start request from the shared region.
The feature has a corresponding feature switch: closing the switch disables the feature, and opening the switch enables the feature.
Thus, when enabling the feature according to the start request, the kernel-mode VMM can open the feature switch corresponding to the feature.
The kernel-mode VMM can open the feature switch through the API the switch exposes.
Correspondingly, the determination of whether the exception was raised by the host CPU because its memory access bus was locked can be performed by the user-mode VMM from the exception's related information.
Correspondingly, it can be the user-mode VMM that determines that the VCPU allocated to the virtual machine actually performed a cross-cache-line operation on the host CPU during the second period.
Further, since with the second predicted execution count below the preset threshold the detection thread need not be used to detect the actual count for the second period, the detection thread's state can be switched from active to quiescent.
For example, a kernel-mode VMM and a user-mode VMM also run on the host.
Then, to switch the detection thread from active to quiescent, in one embodiment the user-mode VMM can send the kernel-mode VMM a silence request, the silence request being used to request switching the detection thread's state from active to quiescent. The kernel-mode VMM can receive the silence request and switch the detection thread from active to quiescent according to it.
Or, in another embodiment, the detection thread can directly ask the kernel-mode VMM to switch its state from active to quiescent; this application does not limit the specific way of switching.
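The request-driven switching between the quiescent and active states can be modeled as a tiny state machine. This is a sketch of the protocol described above; the method and request names are hypothetical, and real switches would go through an API exposed by the kernel-mode detection thread.

```python
QUIESCENT, ACTIVE = "quiescent", "active"

class DetectionThread:
    """Models the detection thread's two states and the request-driven switches."""

    def __init__(self) -> None:
        self.state = QUIESCENT  # quiescent: does no work, consumes no host CPU

    def handle_request(self, request: str) -> None:
        if request == "activate":    # activation request from the user-mode VMM
            self.state = ACTIVE
        elif request == "silence":   # silence request when the PMU path is unused
            self.state = QUIESCENT

    def can_poll_pmu(self) -> bool:
        return self.state == ACTIVE  # only the active thread polls the PMU

thread = DetectionThread()
thread.handle_request("activate")
print(thread.can_poll_pmu())   # True
thread.handle_request("silence")
print(thread.can_poll_pmu())   # False
```
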
Referring to Figure 3, the scheme is illustrated with one embodiment, which does not limit its scope of protection.
Specifically, the embodiment includes the following flow:
In step S301, the user-mode VMM predicts the first predicted execution count of cross-cache-line operations that the VCPU allocated to the virtual machine is expected to perform on the host CPU during the first time period after the current moment.
In step S302, when the first predicted execution count is greater than or equal to the preset threshold, the user-mode VMM sends the kernel-mode VMM a close request, the close request being used to disable the host CPU's feature of raising an exception because the CPU's memory access bus is locked, so that the host CPU does not raise an exception when its memory access bus is locked.
In step S303, the kernel-mode VMM receives the close request and disables the feature according to it.
In step S304, the kernel-mode VMM sends the user-mode VMM a close response, the close response being used to notify that the feature has been disabled.
In step S305, the user-mode VMM receives the close response and, based on it, sends the kernel-mode VMM an activation request, the activation request being used to request switching the detection thread's state from quiescent to active.
In step S306, the kernel-mode VMM receives the activation request and switches the detection thread from quiescent to active according to it.
In step S307, with the detection thread in the active state, the detection thread polls the CPU run data recorded in the PMU corresponding to the CPU in the host.
In step S308, the detection thread obtains, from the polled run data, the actual number of cross-cache-line operations actually performed by the VCPU on the host CPU during the first period.
In this application, through steps S301 to S308, when the VCPU allocated to the virtual machine performs cross-cache-line operations on the host CPU very frequently (e.g., tens or hundreds of thousands of times per second), both the overall performance of the host and the performance of the virtual machine can be improved in scenarios where the actual execution count of those operations must be detected.
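The S301-S308 flow can be tied together in a single simulation. Everything below is an illustrative model under assumed names: the shared-memory handshake is reduced to direct method calls, and the PMU to a cumulative counter whose snapshots bracket the period.

```python
class KernelVMM:
    """Kernel-mode VMM: owns the feature switch and the detection thread."""

    def __init__(self) -> None:
        self.bus_lock_exception_enabled = True
        self.thread_state = "quiescent"

    def handle_close_request(self) -> str:
        self.bus_lock_exception_enabled = False   # S303: disable the feature
        return "closed"                           # S304: close response

    def handle_activation_request(self) -> None:
        self.thread_state = "active"              # S306: activate detection thread

class UserVMM:
    """User-mode VMM: predicts the count and drives the kernel-mode VMM."""

    def __init__(self, kernel: KernelVMM, threshold: int = 10_000) -> None:
        self.kernel, self.threshold = kernel, threshold

    def run_period_setup(self, predicted_count: int) -> None:
        if predicted_count >= self.threshold:                   # S301-S302
            if self.kernel.handle_close_request() == "closed":  # S305
                self.kernel.handle_activation_request()

kernel = KernelVMM()
UserVMM(kernel).run_period_setup(predicted_count=120_000)
# S307-S308: with the thread active, the period's count is a PMU delta
pmu_start, pmu_end = 7_500, 9_800
print(kernel.bus_lock_exception_enabled, kernel.thread_state, pmu_end - pmu_start)
# False active 2300
```

Note that below the threshold this sketch leaves the exception feature enabled and the thread quiescent, matching the Figure 2 path.
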
It should be noted that, for simplicity of description, the method embodiments are expressed as series of action combinations; but those skilled in the art should appreciate that this application is not limited by the described order of actions, since according to this application some steps can be performed in other orders or simultaneously. Those skilled in the art should also appreciate that the embodiments described in the specification are optional embodiments, and the actions involved are not necessarily required by this application.
Referring to Figure 4, a structural block diagram of a data processing apparatus of this application is shown. The apparatus is applied to a host on which at least a virtual machine and a detection thread run. The apparatus includes:
a first prediction module 11, configured to predict the first predicted execution count of cross-cache-line operations that the VCPU allocated to the virtual machine is expected to perform on the host's central processing unit (CPU) during a first time period after the current moment; a disabling module 12, configured to, when the first predicted execution count is greater than or equal to a preset threshold, disable the host CPU's feature of raising an exception because the CPU's memory access bus is locked, so that the host CPU does not raise an exception when its memory access bus is locked; a first switching module 13, configured to switch the state of the detection thread from the quiescent state to the active state; a polling module 14, configured to poll the CPU run data recorded in the performance monitoring unit (PMU) corresponding to the CPU in the host; and an acquisition module 15, configured to obtain, from the polled run data, the actual number of cross-cache-line operations actually performed by the VCPU on the host CPU during the first period.
In an optional implementation, a kernel-mode VMM and a user-mode VMM also run on the host. The first prediction module includes a first prediction unit of the user-mode VMM, configured to predict the first predicted execution count. Correspondingly, the disabling module includes a first sending unit of the user-mode VMM, and a first receiving unit and a disabling unit of the kernel-mode VMM: the first sending unit is configured to send the first receiving unit a close request for disabling the feature; the first receiving unit is configured to receive the close request; and the disabling unit is configured to disable the feature according to the close request.
In an optional implementation, the first switching module includes a second sending unit of the user-mode VMM, and a second receiving unit and a first switching unit of the kernel-mode VMM: the second sending unit is configured to send the second receiving unit an activation request for switching the detection thread from quiescent to active; the second receiving unit is configured to receive the activation request; and the first switching unit is configured to switch the detection thread from quiescent to active according to it.
In an optional implementation, the first switching module further includes a third sending unit of the kernel-mode VMM and a third receiving unit of the user-mode VMM: the third sending unit is configured to send the third receiving unit a close response notifying that the feature has been disabled; the third receiving unit is configured to receive the close response; and the second sending unit is further configured to send the activation request to the second receiving unit based on the close response.
In an optional implementation, the apparatus further includes: a second prediction module, configured to predict the second predicted execution count of cross-cache-line operations the VCPU is expected to perform on the host CPU during a second time period after the first; an enabling module, configured to enable the feature when the second predicted execution count is below the preset threshold, so that the host CPU raises an exception when its memory access bus is locked; a first determination module, configured to, when related information about an exception is obtained, determine from it whether the exception was raised by the host CPU because its memory access bus was locked; and a second determination module, configured to, when the exception was so raised, determine that the VCPU actually performed a cross-cache-line operation on the host CPU during the second period.
In an optional implementation, a kernel-mode VMM and a user-mode VMM also run on the host. The second prediction module includes a second prediction unit of the user-mode VMM, configured to predict the second predicted execution count. Correspondingly, the enabling module includes a fourth sending unit of the user-mode VMM, and a fourth receiving unit and an enabling unit of the kernel-mode VMM: the fourth sending unit is configured to send the fourth receiving unit a start request for enabling the feature; the fourth receiving unit is configured to receive the start request; and the enabling unit is configured to enable the feature according to it.
In an optional implementation, the first determination module includes a first determination unit of the user-mode VMM, configured to determine from the exception's related information whether the exception was raised by the host CPU because its memory access bus was locked. Correspondingly, the second determination module includes a second determination unit of the user-mode VMM, configured to determine that the VCPU actually performed a cross-cache-line operation on the host CPU during the second period.
In an optional implementation, the apparatus further includes a second switching module, configured to switch the detection thread from the active state to the quiescent state when the second predicted execution count is below the preset threshold.
In an optional implementation, a kernel-mode VMM and a user-mode VMM also run on the host. The second switching module includes a fifth sending unit of the user-mode VMM, and a fifth receiving unit and a second switching unit of the kernel-mode VMM: the fifth sending unit is configured to send the fifth receiving unit a silence request for switching the detection thread from active to quiescent; the fifth receiving unit is configured to receive the silence request; and the second switching unit is configured to switch the detection thread from active to quiescent according to it.
In an optional implementation, the first prediction module includes: a first acquisition unit, configured to obtain the historical execution counts of cross-cache-line operations actually performed by the VCPU on the host CPU during at least one historical time period before the current moment; and a second acquisition unit, configured to obtain the first predicted execution count from the historical execution counts.
In an optional implementation, the polled CPU run data includes: a first cumulative count of cross-cache-line operations the VCPU had performed on the host CPU by the start moment of the first period, and a second cumulative count by the end moment of the first period. The acquisition module includes: a calculation unit, configured to calculate the difference between the second cumulative count and the first cumulative count; and a third acquisition unit, configured to obtain the actual execution count from the difference.
In this application, as described above, the first predicted execution count for the first period is predicted; when it is greater than or equal to the preset threshold, the host CPU's feature of raising an exception because its memory access bus is locked is disabled and the detection thread is switched from quiescent to active, so that the detection thread polls the PMU run data and obtains from it the actual execution count for the first period. Thereby, when the VCPU performs cross-cache-line operations on the host CPU very frequently (e.g., tens or hundreds of thousands of times per second), both the overall performance of the host and the performance of the virtual machine can be improved in scenarios where the actual execution count must be detected.
An embodiment of this application further provides a non-volatile readable storage medium storing one or more modules (programs) which, when applied to a device, can cause the device to execute the instructions of the method steps in the embodiments of this application.
Embodiments of this application provide one or more machine-readable media storing instructions which, when executed by one or more processors, cause an electronic device to perform one or more of the methods of the above embodiments. In the embodiments of this application, electronic devices include servers, gateways, sub-devices, etc.; sub-devices are devices such as Internet of Things devices.
Embodiments of the present disclosure may be implemented as an apparatus using any suitable hardware, firmware, software, or any combination thereof in a desired configuration; the apparatus may include electronic devices such as servers (clusters) and terminal devices such as IoT devices.
Figure 5 schematically illustrates an exemplary apparatus 1300 that may be used to implement various embodiments of this application.
For one embodiment, Figure 5 shows an exemplary apparatus 1300 having one or more processors 1302, a control module (chipset) 1304 coupled to at least one of the processor(s) 1302, memory 1306 coupled to the control module 1304, non-volatile memory (NVM)/storage 1308 coupled to the control module 1304, one or more input/output devices 1310 coupled to the control module 1304, and a network interface 1312 coupled to the control module 1304.
The processor 1302 may include one or more single-core or multi-core processors, and may include any combination of general-purpose and special-purpose processors (e.g., graphics processors, application processors, baseband processors). In some embodiments, the apparatus 1300 can serve as a server device such as a gateway in the embodiments of this application.
In some embodiments, the apparatus 1300 may include one or more computer-readable media having instructions 1314 (e.g., the memory 1306 or the NVM/storage 1308), and one or more processors 1302 coupled with the one or more computer-readable media and configured to execute the instructions 1314 to implement modules that perform the actions of this disclosure.
For one embodiment, the control module 1304 may include any suitable interface controller to provide any suitable interface to at least one of the processor(s) 1302 and/or to any suitable device or component communicating with the control module 1304.
The control module 1304 may include a memory controller module to provide an interface to the memory 1306. The memory controller module may be a hardware module, a software module, and/or a firmware module.
The memory 1306 may be used, for example, to load and store data and/or instructions 1314 for the apparatus 1300. For one embodiment, the memory 1306 may include any suitable volatile memory, e.g., suitable DRAM. In some embodiments, the memory 1306 may include double data rate type four synchronous dynamic random access memory (DDR4 SDRAM).
For one embodiment, the control module 1304 may include one or more input/output controllers to provide interfaces to the NVM/storage 1308 and the input/output device(s) 1310.
For example, the NVM/storage 1308 may be used to store data and/or instructions 1314. The NVM/storage 1308 may include any suitable non-volatile memory (e.g., flash memory) and/or any suitable non-volatile storage device(s) (e.g., one or more hard disk drives (HDDs), one or more compact disc (CD) drives, and/or one or more digital versatile disc (DVD) drives).
The NVM/storage 1308 may include storage resources that are physically part of the device on which the apparatus 1300 is installed, or it may be accessible by the device without necessarily being part of the device. For example, the NVM/storage 1308 may be accessed over a network via the input/output device(s) 1310.
The input/output device(s) 1310 may provide an interface for the apparatus 1300 to communicate with any other suitable devices; the input/output devices 1310 may include communication components, audio components, sensor components, and the like. The network interface 1312 may provide an interface for the apparatus 1300 to communicate over one or more networks; the apparatus 1300 may wirelessly communicate with one or more components of a wireless network according to any of one or more wireless network standards and/or protocols, for example by accessing a communication-standard-based wireless network such as WiFi, 2G, 3G, 4G, 5G, etc., or a combination thereof.
For one embodiment, at least one of the processor(s) 1302 may be packaged together with the logic of one or more controllers (e.g., the memory controller module) of the control module 1304. For one embodiment, at least one of the processor(s) 1302 may be packaged together with the logic of one or more controllers of the control module 1304 to form a System in Package (SiP). For one embodiment, at least one of the processor(s) 1302 may be integrated on the same die with the logic of one or more controllers of the control module 1304. For one embodiment, at least one of the processor(s) 1302 may be integrated on the same die with the logic of one or more controllers of the control module 1304 to form a System on Chip (SoC).
In various embodiments, the apparatus 1300 may be, but is not limited to, a terminal device such as a server, a desktop computing device, or a mobile computing device (e.g., a laptop computing device, a handheld computing device, a tablet computer, a netbook). In various embodiments, the apparatus 1300 may have more or fewer components and/or a different architecture. For example, in some embodiments, the apparatus 1300 includes one or more cameras, a keyboard, a liquid crystal display (LCD) screen (including touch screen displays), a non-volatile memory port, multiple antennas, a graphics chip, an application-specific integrated circuit (ASIC), and speakers.
An embodiment of this application provides an electronic device including: one or more processors; and one or more machine-readable media storing instructions which, when executed by the one or more processors, cause the electronic device to perform one or more of the methods of this application.
As for the apparatus embodiments, since they are substantially similar to the method embodiments, the description is relatively simple; for relevant details, refer to the description of the method embodiments.
The embodiments in this specification are described progressively: each embodiment focuses on its differences from the others, and for the parts that are the same or similar, the embodiments may refer to one another.
The embodiments of this application are described with reference to flowcharts and/or block diagrams of methods, terminal devices (systems), and computer program products according to the embodiments of this application. It should be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable information processing terminal device to produce a machine, such that the instructions executed by the processor of the computer or other programmable information processing terminal device produce means for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable information processing terminal device to work in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be loaded onto a computer or other programmable information processing terminal device, so that a series of operational steps are performed on the computer or other programmable terminal device to produce a computer-implemented process, such that the instructions executed on the computer or other programmable terminal device provide steps for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.
Although preferred embodiments of the embodiments of this application have been described, those skilled in the art, once aware of the basic inventive concept, may make additional changes and modifications to these embodiments. Therefore, the appended claims are intended to be construed as covering the preferred embodiments and all changes and modifications falling within the scope of the embodiments of this application.
Finally, it should also be noted that in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or terminal device including a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or terminal device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or terminal device that includes the element.
The data processing method and apparatus provided by this application have been introduced in detail above. Specific examples are used herein to explain the principles and implementations of this application, and the description of the above embodiments is only intended to help understand the method of this application and its core idea. Meanwhile, those of ordinary skill in the art will, based on the idea of this application, make changes to the specific implementations and scope of application. In summary, the content of this specification should not be construed as limiting this application.

Claims (14)

  1. A data processing method, applied to a host, wherein at least a virtual machine and a detection thread run on the host; the method comprising:
    predicting a first predicted execution count of cross-cache-line (cache line) operations on a central processing unit (CPU) of the host that a virtual central processing unit (VCPU) allocated to the virtual machine is expected to perform within a first time period after a current moment;
    when the first predicted execution count is greater than or equal to a preset threshold, disabling the function whereby the host's CPU throws an exception because the CPU's memory access bus is locked, so that the host's CPU does not throw an exception when its memory access bus is locked; and switching the state of the detection thread from a quiescent state to an active state, so that the detection thread polls operating data of the CPU recorded in a performance monitoring unit (PMU) corresponding to the CPU of the host, and obtains, from the polled operating data of the CPU, an actual execution count of cross-cache-line operations on the host's CPU actually performed by the VCPU allocated to the virtual machine within the first time period.
  2. The method according to claim 1, wherein a kernel-mode virtual machine monitor (VMM) and a user-mode VMM further run on the host;
    predicting the first predicted execution count of cross-cache-line operations on the host's CPU that the VCPU allocated to the virtual machine is expected to perform within the first time period after the current moment comprises:
    the user-mode VMM predicting the first predicted execution count of cross-cache-line operations on the host's CPU that the VCPU allocated to the virtual machine is expected to perform within the first time period after the current moment;
    correspondingly, disabling the function whereby the host's CPU throws an exception because the CPU's memory access bus is locked comprises:
    the user-mode VMM sending a disable request to the kernel-mode VMM, the disable request being used to disable the function;
    the kernel-mode VMM receiving the disable request and disabling the function according to the disable request.
  3. The method according to claim 2, wherein switching the state of the detection thread from the quiescent state to the active state comprises:
    the user-mode VMM sending an activation request to the kernel-mode VMM, the activation request being used to request switching the state of the detection thread from the quiescent state to the active state;
    the kernel-mode VMM receiving the activation request and switching the state of the detection thread from the quiescent state to the active state according to the activation request.
  4. The method according to claim 3, wherein the method further comprises:
    the kernel-mode VMM sending a disable response to the user-mode VMM, the disable response being used to notify that the function has been disabled;
    the user-mode VMM receiving the disable response and, according to the disable response, performing the step of sending the activation request to the kernel-mode VMM.
  5. The method according to claim 1, wherein the method further comprises:
    predicting a second predicted execution count of cross-cache-line operations on the host's CPU that the VCPU allocated to the virtual machine is expected to perform within a second time period after the first time period;
    when the second predicted execution count is less than the preset threshold, enabling the function, so that the host's CPU throws an exception when the CPU's memory access bus is locked;
    when information about an exception is obtained, determining, according to the information about the exception, whether the exception was thrown by the host's CPU because the CPU's memory access bus was locked;
    when the exception was thrown by the host's CPU because the CPU's memory access bus was locked, determining that the VCPU allocated to the virtual machine actually performed a cross-cache-line operation on the host's CPU within the second time period.
  6. The method according to claim 5, wherein a kernel-mode VMM and a user-mode VMM further run on the host;
    predicting the second predicted execution count of cross-cache-line operations on the host's CPU that the VCPU allocated to the virtual machine is expected to perform within the second time period after the first time period comprises:
    the user-mode VMM predicting the second predicted execution count of cross-cache-line operations on the host's CPU that the VCPU allocated to the virtual machine is expected to perform within the second time period after the first time period;
    correspondingly, enabling the function comprises:
    the user-mode VMM sending an enable request to the kernel-mode VMM, the enable request being used to enable the function;
    the kernel-mode VMM receiving the enable request and enabling the function according to the enable request.
  7. The method according to claim 6, wherein determining, according to the information about the exception, whether the exception was thrown by the host's CPU because the CPU's memory access bus was locked comprises:
    the user-mode VMM determining, according to the information about the exception, whether the exception was thrown by the host's CPU because the CPU's memory access bus was locked;
    correspondingly, determining that the VCPU allocated to the virtual machine actually performed a cross-cache-line operation on the host's CPU within the second time period comprises:
    the user-mode VMM determining that the VCPU allocated to the virtual machine actually performed a cross-cache-line operation on the host's CPU within the second time period.
  8. The method according to claim 5, wherein the method further comprises:
    when the second predicted execution count is less than the preset threshold, switching the state of the detection thread from the active state to the quiescent state.
  9. The method according to claim 8, wherein a kernel-mode VMM and a user-mode VMM further run on the host;
    switching the state of the detection thread from the active state to the quiescent state comprises:
    the user-mode VMM sending a quiesce request to the kernel-mode VMM, the quiesce request being used to request switching the state of the detection thread from the active state to the quiescent state;
    the kernel-mode VMM receiving the quiesce request and switching the state of the detection thread from the active state to the quiescent state according to the quiesce request.
  10. The method according to claim 1, wherein predicting the first predicted execution count of cross-cache-line operations on the host's CPU that the VCPU allocated to the virtual machine is expected to perform within the first time period after the current moment comprises:
    obtaining a historical execution count of cross-cache-line operations on the host's CPU actually performed by the VCPU allocated to the virtual machine within at least one historical time period before the current moment;
    obtaining the first predicted execution count from the historical execution count.
  11. The method according to claim 1, wherein the polled operating data of the CPU comprise: a first executed count of cross-cache-line operations on the host's CPU that the VCPU allocated to the virtual machine has performed as of the start of the first time period, and a second executed count of cross-cache-line operations on the host's CPU that the VCPU allocated to the virtual machine has performed as of the end of the first time period;
    obtaining, from the polled operating data of the CPU, the actual execution count of cross-cache-line operations on the host's CPU actually performed by the VCPU allocated to the virtual machine within the first time period comprises:
    computing the difference between the second executed count and the first executed count;
    obtaining the actual execution count from the difference.
  12. A data processing apparatus, applied to a host, wherein at least a virtual machine and a detection thread run on the host; the apparatus comprising:
    a first prediction module, configured to predict a first predicted execution count of cross-cache-line (cache line) operations on a central processing unit (CPU) of the host that a virtual central processing unit (VCPU) allocated to the virtual machine is expected to perform within a first time period after a current moment;
    a disabling module, configured to, when the first predicted execution count is greater than or equal to a preset threshold, disable the function whereby the host's CPU throws an exception because the CPU's memory access bus is locked, so that the host's CPU does not throw an exception when its memory access bus is locked;
    a first switching module, configured to switch the state of the detection thread from a quiescent state to an active state;
    a polling module, configured to poll operating data of the CPU recorded in a performance monitoring unit (PMU) corresponding to the CPU of the host; and
    an obtaining module, configured to obtain, from the polled operating data of the CPU, an actual execution count of cross-cache-line operations on the host's CPU actually performed by the VCPU allocated to the virtual machine within the first time period.
  13. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the method according to any one of claims 1 to 11.
  14. A computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 11.
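To make the claimed control flow concrete, the following is an illustrative sketch of one possible reading of claims 1, 5, 10, and 11, not code from this application: the host predicts the next window's cross-cache-line (split-lock) operation count from recent history; at or above the threshold it suppresses the bus-lock exception and counts events through the PMU instead, and below the threshold it restores the exception path. The `cpu` and `pmu` objects and their members (`bus_lock_exception_enabled`, `read_split_lock_counter`) are hypothetical stand-ins for the kernel-mode VMM interfaces the claims leave abstract, and the mean-of-history predictor is one assumed policy among the many the claims would cover.

```python
from collections import deque

class SplitLockGovernor:
    """Illustrative sketch of the per-window control loop of claims 1/5/10/11."""

    def __init__(self, threshold, history_len=4):
        self.threshold = threshold
        # Claim 10: keep actual counts from recent historical windows.
        self.history = deque(maxlen=history_len)
        self.detection_active = False

    def predict_next_window(self):
        # Claim 10: derive the predicted count from historical counts;
        # a simple mean of recent windows is used here (assumed policy).
        if not self.history:
            return 0
        return sum(self.history) / len(self.history)

    def begin_window(self, cpu, pmu):
        predicted = self.predict_next_window()
        if predicted >= self.threshold:
            # Claim 1: suppress the exception thrown when the memory access
            # bus is locked, and activate the PMU-polling detection path.
            cpu.bus_lock_exception_enabled = False
            self.detection_active = True
            self.count_at_start = pmu.read_split_lock_counter()
        else:
            # Claim 5: below the threshold, re-enable the exception path
            # and quiesce the detection thread (claim 8).
            cpu.bus_lock_exception_enabled = True
            self.detection_active = False

    def end_window(self, pmu, actual_from_exceptions=0):
        # Claim 11: actual count = counter at window end - counter at start.
        if self.detection_active:
            actual = pmu.read_split_lock_counter() - self.count_at_start
        else:
            # Exception path: count supplied by the exception handler.
            actual = actual_from_exceptions
        self.history.append(actual)
        return actual
```

Gating the detection thread on the predicted count trades the per-event cost of trapping an exception against the steady cost of a polling thread: frequent split-lock activity makes polling the PMU cheaper, while rare activity makes the exception path cheaper.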
PCT/CN2023/080336 2022-03-17 2023-03-08 Data processing method and apparatus WO2023174126A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210267806.1A CN114595037A (zh) 2022-03-17 2022-03-17 Data processing method and apparatus
CN202210267806.1 2022-03-17

Publications (1)

Publication Number Publication Date
WO2023174126A1 true WO2023174126A1 (zh) 2023-09-21

Family

ID=81810315

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/080336 WO2023174126A1 (zh) 2022-03-17 2023-03-08 Data processing method and apparatus

Country Status (2)

Country Link
CN (1) CN114595037A (zh)
WO (1) WO2023174126A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114595037A (zh) * 2022-03-17 2022-06-07 阿里巴巴(中国)有限公司 一种数据处理方法及装置

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170286115A1 (en) * 2016-04-01 2017-10-05 James A. Coleman Apparatus and method for non-serializing split locks
CN110955512A (zh) * 2018-09-27 2020-04-03 Alibaba Group Holding Limited Cache processing method and apparatus, storage medium, processor, and computing device
CN111124947A (zh) * 2018-10-31 2020-05-08 Alibaba Group Holding Limited Data processing method and apparatus therefor
CN112559049A (zh) * 2019-09-25 2021-03-26 Alibaba Group Holding Limited Way prediction method for instruction cache, access control unit, and instruction processing apparatus
CN114595037A (zh) * 2022-03-17 2022-06-07 Alibaba (China) Co., Ltd. Data processing method and apparatus


Also Published As

Publication number Publication date
CN114595037A (zh) 2022-06-07

Similar Documents

Publication Publication Date Title
Fried et al. Caladan: Mitigating interference at microsecond timescales
JP6154071B2 (ja) Dynamic voltage and frequency management based on active processors
US9268542B1 (en) Cache contention management on a multicore processor based on the degree of contention exceeding a threshold
US8402232B2 (en) Memory utilization tracking
US8489744B2 (en) Selecting a host from a host cluster for live migration of a virtual machine
US8935698B2 (en) Management of migrating threads within a computing environment to transform multiple threading mode processors to single thread mode processors
US9697029B2 (en) Guest idle based VM request completion processing
US20110202699A1 (en) Preferred interrupt binding
WO2023174126A1 (zh) Data processing method and apparatus
US10628203B1 (en) Facilitating hibernation mode transitions for virtual machines
US9489223B2 (en) Virtual machine wakeup using a memory monitoring instruction
US9311142B2 (en) Controlling memory access conflict of threads on multi-core processor with set of highest priority processor cores based on a threshold value of issued-instruction efficiency
US9110723B2 (en) Multi-core binary translation task processing
US9600314B2 (en) Scheduler limited virtual device polling
WO2023165402A1 (zh) Atomic operation processing method, device, apparatus, and storage medium
WO2012113232A1 (zh) Method and apparatus for adjusting clock interrupt cycle
US10310890B2 (en) Control method for virtual machine system, and virtual machine system
JP2017534970A (ja) Method, system, and computer program for executing a plurality of threads, and method, system, and computer program for implementing a wait state for a plurality of threads
Rybina et al. Investigation into the energy cost of live migration of virtual machines
US9606825B2 (en) Memory monitor emulation for virtual machines
US10387178B2 (en) Idle based latency reduction for coalesced interrupts
US11061840B2 (en) Managing network interface controller-generated interrupts
JP5014179B2 (ja) OS priority changing device and OS priority changing program
US20160170474A1 (en) Power-saving control system, control device, control method, and control program for server equipped with non-volatile memory
US11243603B2 (en) Power management of an event-based processing system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23769632

Country of ref document: EP

Kind code of ref document: A1