CN106293944B - non-consistency-based I/O access system and optimization method under virtualized multi-core environment - Google Patents


Info

Publication number
CN106293944B
CN106293944B
Authority
CN
China
Prior art keywords
load
node
performance
nodes
module
Prior art date
Legal status
Active
Application number
CN201610657524.7A
Other languages
Chinese (zh)
Other versions
CN106293944A (en)
Inventor
管海兵
钱建民
李阳德
马汝辉
李健
Current Assignee
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date
Filing date
Publication date
Application filed by Shanghai Jiaotong University filed Critical Shanghai Jiaotong University
Priority to CN201610657524.7A
Publication of CN106293944A
Application granted
Publication of CN106293944B


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083Techniques for rebalancing the load in a distributed system
    • G06F9/5088Techniques for rebalancing the load in a distributed system involving task migration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/10Program control for peripheral devices
    • G06F13/105Program control for peripheral devices where the programme performs an input/output emulation function

Abstract

The invention discloses a non-uniform I/O (input/output) access system and an optimization method for a virtualized multi-core environment, relating to the field of computer virtualization. The system comprises a performance detection module, a thread binding module and a memory migration module. The performance detection module monitors hardware information of the virtual machine and the physical host in real time through a modified performance monitoring tool. The thread binding module judges whether the current system is under low load or high load according to the hardware information collected by the performance detection module, and if the system is under high load, binds the virtual machine threads on the node with the higher load to another node with a lower load. If the load of the current system is low, the memory migration module migrates the related threads to the node closest to the network adapter. The system establishes an affinity optimization model based on I/O performance in the virtualized multi-core environment and provides a real-time, dynamic, high-throughput, low-latency optimal placement strategy, so that multi-core resources and a high-performance network adapter are used efficiently and the load of the system is effectively reduced.

Description

Non-consistency-based I/O access system and optimization method under virtualized multi-core environment
Technical Field
The invention relates to the field of computer virtualization, and in particular to a system based on non-uniform I/O access in a virtualized multi-core environment and an optimization method thereof.
Background
Virtualization is a key technology in cloud computing. Virtualization allows multiple computer systems to run on one physical computer by abstracting the hardware resources (CPU, memory, I/O devices, etc.) of the physical machine into on-demand resources, similar to electric power, which are made available to customers. Virtualization technology greatly reduces the investment of small enterprises in server purchases and greatly improves the utilization of idle hosts, so it is now widely deployed on high-performance servers; representative virtualized cloud computing services include Amazon EC2 and Alibaba Cloud.
One key component in virtualization technology is the Virtual Machine Monitor (VMM). The virtual machine monitor is responsible for abstracting host hardware resources for use by virtual machines, as well as for managing virtual machines and the communication between them. Traditional hardware resources include CPU resources, memory resources, I/O resources, and the like, and under a Non-Uniform Memory Access (NUMA) architecture, virtualization technology mainly focuses on improving the performance of these hardware resources after virtualization. However, with the development of high-performance network technology and multi-core CPUs, raw hardware performance is no longer the bottleneck; instead, how to efficiently process high-performance network I/O requests in a multi-core environment has become the bottleneck, because even a very small host-side processing delay causes a huge performance degradation for virtualized network applications.
I/O virtualization is one of the important components of virtualization. It mainly targets the virtualization of the PCIe functions of a network card and aims to scale the number of virtual machines as far as possible without losing I/O performance. However, as the number of cores in a high-performance physical machine keeps increasing, the number of nodes on which physical cores are placed also keeps increasing, and how multiple physical cores efficiently access I/O resources becomes more and more important. For this reason, Non-Uniform I/O Access (NUIOA) has been proposed on top of the NUMA architecture, as shown in Fig. 1. Each device is directly attached to the node that groups a set of physical cores, so the devices are asymmetric: a remote node can only reach a given device through the interconnect between nodes, and such access is therefore slower than access from the local node. This asymmetric access increases the delay of the system and ultimately degrades virtual machine performance. Fig. 2 shows a conventional remote memory access: because the existing system does not take the presence of a high-performance network device into account, its optimization strategy ignores the network card, and in a high-performance network environment the performance loss caused by this omission is huge. The CPU on node 2 has to transmit data to the network card through the interconnect between node 2 and node 1, and between node 1 and node 0, which not only increases the bandwidth occupation of the interconnect but also increases the access delay of the data.
Existing general-purpose applications are basically deployed in the cloud and are networked and distributed by nature, so high-performance, reliable network transmission is critical to their effective operation. Therefore, the importance of the I/O device must be considered in affinity modeling under the current multi-core architecture.
At present, virtually all virtual machine monitors, such as Xen, KVM and VMware ESXi, schedule the VCPUs and all the memory of one virtual machine onto one node to keep accesses local. However, this method has a significant defect: the load-balancing mechanisms of the system dynamically rebalance load between CPUs and memory, which disturbs the original placement policy and finally causes the policy to fail. Existing dynamic placement models are based on NUMA, and their modeling focuses on memory locality, or on memory locality together with cache hit rate; they do not consider the importance of network I/O devices. Moreover, such modeling only considers the affinity between threads and hardware and does not consider the affinity among the threads themselves, so the accuracy of the modeling is questionable.
Existing I/O performance tuning methods under a virtualized multi-core architecture mainly comprise thread binding and memory migration. Thread binding refers to binding the VCPU thread of a virtual machine running an application program to a specific node; memory migration refers to migrating the memory of the virtual machine running the application program to a specific node. If the VCPU threads and the memory of the virtual machine are bound to the same node, the affinity between the CPU and the memory is maximized and the performance of the system is improved. However, existing research mainly focuses on the two dimensions of CPU and memory and ignores the network card. New affinity modeling that takes the I/O device into account is therefore needed to achieve optimal system performance.
Therefore, the invention aims to develop a non-uniform I/O access system and an optimization method for a virtualized multi-core environment under the NUIOA architecture, and establishes affinity optimization modeling based on I/O performance in the virtualized multi-core environment, so that multi-core resources and a high-performance network adapter are used efficiently, the load of the system is effectively reduced, and the system is suitable for applications in the current high-performance network environment.
Disclosure of Invention
In view of the above defects in the prior art, the technical problem to be solved by the present invention is how to perform affinity modeling that includes the I/O device in addition to the two dimensions of CPU and memory in a virtualized multi-core architecture, and to provide a real-time, dynamic, high-throughput, low-latency optimal placement strategy for the system, so as to achieve optimal overall system performance.
In order to achieve the above object, the present invention provides a system based on non-uniform I/O access in a virtualized multi-core environment, comprising a performance monitoring module, a thread binding module and a memory migration module. The performance monitoring module is configured to monitor hardware information of the virtual machine and the physical host in real time through a modified performance monitoring tool; the thread binding module is configured to judge whether the current system is under low load or high load according to the hardware information collected by the performance monitoring module, and to bind the virtual machine threads on the node with the higher load to another node with a lower load if the current system is under high load; the memory migration module is configured to migrate the related threads to the node closest to the network adapter if the current system load is low.
Further, the hardware information includes the number of page accesses and the number of I/O requests made by the application program in the virtual machine, and the real-time CPU load and memory load of the physical host.
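This hardware information can be grouped per node. The following C sketch is only an illustration and not part of the claimed invention; the type and field names (node_stat_t, page_acc, io_requests, nic_node) and the array bounds are assumptions introduced for clarity.

/* Hypothetical per-node statistics gathered by the performance monitoring
 * module; names and sizes are illustrative only. */
#include <stdint.h>

#define MAX_NODES 8       /* assumed upper bound on NUMA nodes          */
#define MAX_PAGES 4096    /* assumed number of tracked page slots       */

typedef struct {
    double   cpu_load;                 /* real-time CPU load of the node             */
    double   mem_load;                 /* real-time memory load of the node          */
    uint64_t io_requests;              /* I/O requests observed per unit time        */
    uint64_t page_acc[MAX_PAGES];      /* NodeAcc[n][i]: accesses by node n to page i */
} node_stat_t;

typedef struct {
    node_stat_t node[MAX_NODES];       /* one entry per NUMA node                    */
    int         nic_node;              /* node to which the network adapter is attached */
} host_stat_t;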
The invention also provides an optimization method based on the non-uniform I/O access system in the virtualized multi-core environment, which comprises the following steps:
(1) providing a performance monitoring module, a thread binding module and a memory migration module;
(2) simultaneously monitoring, through the performance monitoring module, the number of accesses to pages inside the virtual machine and the number of I/O requests per unit time;
(3) monitoring the CPU load and the memory load of the physical host in real time through the performance monitoring module;
(4) when the load of a certain node of the physical host is higher than a threshold value, performing thread migration for that node, where the judgment condition is defined on a TMT matrix, which represents the memory distribution of the threads, and a DT matrix, which represents the access delay between nodes: a thread T is migrated to node K instead of node P when the average access delay to node K is smaller than the average access delay to node P;
(5) when the load of a certain node of the physical machine is within the normal range, migrating the hot pages that a thread accesses but that are distributed on a remote node to the local node, where the judgment formula for a hot page is:
if NodeAcc[n][i] > 2*NodeAcc[n][j]
where NodeAcc[n][i] represents the number of times that node n accesses page i;
(6) after the address of the hot page inside the virtual machine is translated into the physical address of the physical machine, calling a page migration function to migrate the hot page of the application program to the target node;
(7) after the hot page migration is finished, returning to the performance monitoring module to continue monitoring the performance of the system.
In view of the above defects of existing multi-core systems, the invention provides a high-throughput, low-latency, real-time placement scheduling strategy for a virtualized environment under the NUIOA architecture, so as to improve the performance of application programs running in virtual machines. Further elaboration is as follows:
The performance detection module monitors hardware information of the virtual machine and the physical host in real time through the modified performance monitoring tool; the hardware information mainly comprises the number of page accesses and I/O requests made by the application program in the virtual machine, the real-time load condition of the physical host, and the like.
The thread binding module judges whether the current system is under low load or high load according to the hardware information collected by the performance detection module; if the current system is under high load, it binds the virtual machine threads on the node with the higher load to another node, thereby reducing the load.
If the current system is under low load, the memory migration module migrates the related threads to the node closest to the network adapter, so that excessive remote accesses are avoided, the bandwidth occupation of the interconnect between nodes is reduced, and the throughput of the system is improved.
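The behavior of these two modules can be summarized as a simple placement decision. The following C sketch is only an illustration of that decision and not part of the claimed invention; it assumes libnuma is available, and the threshold value and the helpers node_load(), pick_least_loaded_node() and nic_node() are hypothetical stand-ins for values supplied by the performance monitoring module.

#include <numa.h>          /* libnuma: numa_available(), numa_run_on_node() */

#define LOAD_THRESHOLD 0.8 /* assumed value; in practice tied to the system configuration */

/* Placeholder helpers standing in for the performance monitoring module;
 * in the real system these values come from the collected hardware information. */
static double node_load(int node)           { (void)node; return 0.5; }
static int    pick_least_loaded_node(void)  { return 1; }
static int    nic_node(void)                { return 0; }

/* Place the calling VCPU thread according to the current load level. */
int place_current_thread(int current_node)
{
    if (numa_available() < 0)
        return -1;                          /* host has no NUMA support */

    if (node_load(current_node) > LOAD_THRESHOLD)
        /* High load: the thread binding module moves the thread to a lighter node. */
        return numa_run_on_node(pick_least_loaded_node());

    /* Low load: the memory migration module keeps the thread next to the NIC node. */
    return numa_run_on_node(nic_node());
}

numa_run_on_node() restricts the calling thread to the CPUs of one node, which matches the binding behavior described above; the actual system applies the same decision to the VCPU threads of the virtual machine.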
Compared with existing modeling methods based on the NUMA architecture, the access system and the optimization method have the following advantages:
(1) The affinity between the network adapter and the processor nodes is considered, adding a dimension to traditional modeling methods, so that the system better reflects the importance of the network device in the current high-performance network environment;
(2) The affinity among the multiple VCPU threads of a virtual machine is also considered, and threads with higher mutual affinity are placed on the same node to reduce inter-node data communication;
(3) A finer-grained modeling matrix is adopted: an access delay matrix and a thread-memory mapping matrix are established to make the final placement scheduling decision, and this finer granularity more comprehensively improves the accuracy of the modeling.
The conception, specific structure and technical effects of the present invention are further described below with reference to the accompanying drawings, so that the objects, features and effects of the invention can be fully understood.
Drawings
FIG. 1 is a schematic diagram of a conventional non-uniform I/O access architecture;
FIG. 2 is a diagram of a conventional remote access that does not take the network card into consideration;
FIG. 3 is a system architecture diagram of a preferred embodiment of the present invention;
FIG. 4 is a flow chart of an optimization method according to a preferred embodiment of the present invention.
Detailed Description
The following describes embodiments of the present invention in detail with reference to the drawings. The embodiments are implemented on the premise of the technical solution of the present invention; detailed implementations and specific operation procedures are given below, but the protection scope of the present invention is not limited to the following embodiments.
As shown in FIG. 4, the method for optimizing a system based on non-uniform I/O access in a virtualized multi-core environment according to the present invention includes the following steps:
Step 1: monitor, through the performance monitoring module, the number of accesses to pages inside the virtual machine and the number of I/O requests per unit time. The number of times each node accesses a page is recorded in an array, and the number of I/O requests per unit time can be obtained directly. At the same time, the load inside the physical host, mainly the CPU load and the memory load, is monitored in real time.
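As one possible way to obtain the per-node load in Step 1, the sketch below polls the free memory of every NUMA node with libnuma. It is only an illustration of the kind of monitoring the module performs, not the patented tool itself; it assumes libnuma 2.x, and per-node CPU load as well as the guest-side page-access and I/O-request counters (collected by the modified performance monitoring tool) are not shown.

#include <numa.h>      /* numa_available(), numa_max_node(), numa_node_size64() */
#include <stdio.h>

/* Print the memory utilization of every NUMA node; a stand-in for the
 * physical-host part of the performance monitoring module. */
int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA is not available on this host\n");
        return 1;
    }

    int max_node = numa_max_node();
    for (int n = 0; n <= max_node; n++) {
        long long free_bytes = 0;
        long long total_bytes = numa_node_size64(n, &free_bytes);
        if (total_bytes <= 0)
            continue;                        /* node without memory */
        double mem_load = 1.0 - (double)free_bytes / (double)total_bytes;
        printf("node %d: memory load %.2f\n", n, mem_load);
    }
    return 0;
}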
Step 2: when the load of a certain node of the physical machine is higher than a certain threshold (the threshold is related to the configuration of the system), the node needs to perform thread migration. The migration decision is made as follows: the TMT matrix represents the memory distribution of the threads and the DT matrix represents the access delay between nodes, and the criterion for migrating a thread T onto node K instead of node P is that the average access delay to node K is smaller than the average access delay to node P.
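The original publication expresses this criterion with a formula that is not reproduced here; the following LaTeX rendering is a plausible reconstruction from the surrounding description, assuming TMT[T][n] denotes the amount of thread T's memory resident on node n and DT[a][b] the access delay from node a to node b:

\sum_{n} \mathrm{TMT}[T][n]\cdot \mathrm{DT}[K][n] \;<\; \sum_{n} \mathrm{TMT}[T][n]\cdot \mathrm{DT}[P][n]

Dividing both sides by \sum_{n}\mathrm{TMT}[T][n] turns each side into the memory-weighted average access delay experienced by thread T when it runs on node K or node P, which is the comparison described above.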
Step 3: when the load of a certain node of the physical machine is within the normal range (the threshold is related to the configuration of the system), the hot pages accessed by a thread but distributed on a remote node are migrated to the local node as far as possible, using the numactl API. The determination of a hot page adopts the following formula:
if NodeAcc[n][i] > 2*NodeAcc[n][j]
where NodeAcc[n][i] represents the number of times that node n accesses page i. The formula means that when the most-accessed page of a node is accessed more than twice as often as the second most-accessed page, the pages ranked before that break point are all considered hot pages. After the addresses of the hot pages inside the virtual machine are translated into physical addresses of the physical machine, the page migration module calls the function move_pages to migrate the hot pages of the application program to the target node. After the hot page migration is finished, the method returns to the performance detection module to continue monitoring the performance of the system.
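As an illustration of Step 3, the sketch below applies the twice-the-runner-up rule to the per-node access counters and hands the selected pages to move_pages(2). It is a minimal sketch under stated assumptions and not the patented implementation: the node_acc[] counters, the page_addr[] list (host addresses obtained after the guest-to-host translation described above) and the target node are hypothetical inputs, and error handling is reduced to the bare minimum.

#define _GNU_SOURCE
#include <numaif.h>     /* move_pages(2); link with -lnuma */
#include <stdio.h>

#define NPAGES 1024

/* Hypothetical inputs supplied by the monitoring and translation steps. */
static unsigned long node_acc[NPAGES];    /* NodeAcc[n][i] for the node under consideration */
static void         *page_addr[NPAGES];   /* host addresses of the tracked pages            */

/* Migrate the hot pages of the considered node to target_node. */
int migrate_hot_pages(int target_node)
{
    void *pages[NPAGES];
    int   nodes[NPAGES];
    int   status[NPAGES];
    int   count = 0;

    /* Find the largest and second-largest access counts. */
    unsigned long max1 = 0, max2 = 0;
    for (int i = 0; i < NPAGES; i++) {
        if (node_acc[i] > max1)      { max2 = max1; max1 = node_acc[i]; }
        else if (node_acc[i] > max2) { max2 = node_acc[i]; }
    }
    if (max1 <= 2 * max2)
        return 0;                    /* no clear hot pages this round */

    /* Pages above the break point (accessed more than twice the runner-up)
     * are treated as hot pages and queued for migration. */
    for (int i = 0; i < NPAGES; i++) {
        if (node_acc[i] > 2 * max2) {
            pages[count] = page_addr[i];
            nodes[count] = target_node;
            count++;
        }
    }

    long rc = move_pages(0 /* current process */, count, pages, nodes,
                         status, MPOL_MF_MOVE);
    if (rc < 0)
        perror("move_pages");
    return (int)rc;
}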
The foregoing detailed description of the preferred embodiments of the invention has been presented. It should be understood that numerous modifications and variations could be devised by those skilled in the art in light of the present teachings without departing from the inventive concepts. Therefore, the technical solutions available to those skilled in the art through logic analysis, reasoning and limited experiments based on the prior art according to the concept of the present invention should be within the scope of protection defined by the claims.

Claims (2)

1. An optimization method based on a non-uniform I/O access system in a virtualized multi-core environment is characterized by comprising the following steps:
(1) providing a performance monitoring module, a thread binding module and a memory migration module;
(2) simultaneously monitoring, through the performance monitoring module, the number of accesses to pages inside the virtual machine and the number of I/O requests per unit time;
(3) monitoring the CPU load and the memory load of the physical host in real time through the performance monitoring module;
(4) when the load of a certain node of the physical host is higher than a threshold value, performing thread migration for that node, where the judgment condition is defined on a TMT matrix, which represents the memory distribution of the threads, and a DT matrix, which represents the access delay between nodes: a thread T is migrated to node k instead of node p when the average access delay to node k is smaller than the average access delay to node p;
(5) when the load of a certain node of the physical machine is within the normal range, migrating the hot pages that a thread accesses but that are distributed on a remote node to the local node, where the judgment formula for a hot page is:
if NodeAcc[n][i] > 2*NodeAcc[n][j]
where NodeAcc[n][i] represents the number of times that node n accesses page i, and the formula means that if the maximum access count of a node's pages is more than twice the second largest, the pages before that break point are considered hot pages;
(6) after the address of the hot page inside the virtual machine is translated into the physical address of the physical machine, calling a page migration function to migrate the hot page of the application program to the target node;
(7) after the hot page migration is finished, returning to the performance monitoring module to continue monitoring the performance of the system.
2. The non-uniform I/O access system in a virtualized multi-core environment using the optimization method as claimed in claim 1, comprising a performance monitoring module, a thread binding module and a memory migration module, wherein
the performance monitoring module is configured to monitor hardware information of the virtual machine and the physical host in real time through a modified performance monitoring tool;
the thread binding module is configured to judge whether the current system is under low load or high load according to the hardware information collected by the performance monitoring module, and to bind the virtual machine threads on the node with the higher load to another node with a lower load if the current system is under high load;
the memory migration module is configured to migrate the related threads to the node closest to the network adapter if the load of the current system is low, so that excessive remote accesses are avoided, the bandwidth occupation of the interconnect between nodes is reduced, and the throughput of the system is improved;
the hardware information comprises the number of page accesses and the number of I/O requests made by the application program in the virtual machine, and the real-time CPU load and memory load of the physical host;
the judgment condition for judging whether the current system is under low load or high load is defined on a TMT matrix, which represents the memory distribution of the threads, and a DT matrix, which represents the access delay between nodes: a thread T is migrated to node k instead of node p when the average access delay to node k is smaller than the average access delay to node p;
migrating the related threads to the node closest to the network adapter refers to migrating the hot pages accessed by a thread but distributed on a remote node to the local node, where the judgment formula for a hot page is:
if NodeAcc[n][i] > 2*NodeAcc[n][j]
and the judgment formula means that if the maximum access count of a node's pages is more than twice the second largest, the pages before that break point are all considered hot pages.
CN201610657524.7A 2016-08-11 2016-08-11 non-consistency-based I/O access system and optimization method under virtualized multi-core environment Active CN106293944B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610657524.7A CN106293944B (en) 2016-08-11 2016-08-11 non-consistency-based I/O access system and optimization method under virtualized multi-core environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610657524.7A CN106293944B (en) 2016-08-11 2016-08-11 non-consistency-based I/O access system and optimization method under virtualized multi-core environment

Publications (2)

Publication Number Publication Date
CN106293944A CN106293944A (en) 2017-01-04
CN106293944B (en) 2019-12-10

Family

ID=57670064

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610657524.7A Active CN106293944B (en) 2016-08-11 2016-08-11 non-consistency-based I/O access system and optimization method under virtualized multi-core environment

Country Status (1)

Country Link
CN (1) CN106293944B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107070709B (en) * 2017-03-31 2020-06-26 上海交通大学 NFV (network function virtualization) implementation method based on bottom NUMA (non uniform memory Access) perception
CN107168771A (en) 2017-04-24 2017-09-15 上海交通大学 A kind of scheduling virtual machine device and method under Non Uniform Memory Access access architectures
CN107832213A (en) * 2017-11-03 2018-03-23 郑州云海信息技术有限公司 A kind of hpl test optimization methods based on internal memory compatibility
CN107967180B (en) * 2017-12-19 2019-09-10 上海交通大学 Based on resource overall situation affinity network optimized approach and system under NUMA virtualized environment
CN108259583B (en) * 2017-12-29 2020-05-26 广州云达信息技术有限公司 Data dynamic migration method and device
CN109039831A (en) * 2018-09-21 2018-12-18 浪潮电子信息产业股份有限公司 A kind of load detection method and device
CN109639531B (en) * 2018-12-28 2022-07-19 天津卓朗科技发展有限公司 Virtual machine network self-adaptive switching method and system
CN109947569B (en) * 2019-03-15 2021-04-06 Oppo广东移动通信有限公司 Method, device, terminal and storage medium for binding core
CN110673928B (en) * 2019-09-29 2021-12-14 天津卓朗科技发展有限公司 Thread binding method, thread binding device, storage medium and server
US20230289303A1 (en) * 2020-09-18 2023-09-14 Intel Corporation Improving remote traffic performance on cluster-aware processors
CN115348157B (en) * 2021-05-14 2023-09-05 中国移动通信集团浙江有限公司 Fault positioning method, device and equipment of distributed storage cluster and storage medium

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103916438A (en) * 2013-01-06 2014-07-09 上海计算机软件技术开发中心 Cloud testing environment scheduling method and system based on load forecast
CN104836819A (en) * 2014-02-10 2015-08-12 阿里巴巴集团控股有限公司 Dynamic load balancing method and system, and monitoring and dispatching device
CN104166594A (en) * 2014-08-19 2014-11-26 杭州华为数字技术有限公司 Load balancing control method and related devices

Also Published As

Publication number Publication date
CN106293944A (en) 2017-01-04

Similar Documents

Publication Publication Date Title
CN106293944B (en) non-consistency-based I/O access system and optimization method under virtualized multi-core environment
Guo et al. Clio: A hardware-software co-designed disaggregated memory system
TWI625674B (en) Systems and methods for nvme controller virtualization to support multiple virtual machines running on a host
US9501245B2 (en) Systems and methods for NVMe controller virtualization to support multiple virtual machines running on a host
US9648081B2 (en) Network-attached memory
US20160132541A1 (en) Efficient implementations for mapreduce systems
US9733980B1 (en) Virtual machine management using I/O device logging
CN107436809B (en) data processor
CN107967180B (en) Based on resource overall situation affinity network optimized approach and system under NUMA virtualized environment
WO2013044829A1 (en) Data readahead method and device for non-uniform memory access
US20220050722A1 (en) Memory pool management
KR20210001886A (en) Data accessing method and apparatus, device and medium
Tafa et al. The evaluation of transfer time, cpu consumption and memory utilization in xen-pv, xen-hvm, openvz, kvm-fv and kvm-pv hypervisors using ftp and http approaches
CN105681402A (en) Distributed high speed database integration system based on PCIe flash memory card
US20190007483A1 (en) Server architecture having dedicated compute resources for processing infrastructure-related workloads
JP2019185764A (en) Data-centric computing architecture based on storage server in ndp server data center
US11914903B2 (en) Systems, methods, and devices for accelerators with virtualization and tiered memory
Sato et al. A model-based algorithm for optimizing i/o intensive applications in clouds using vm-based migration
CN104461941B (en) A kind of memory system framework and management method
US20150074351A1 (en) Write-behind caching in distributed file systems
CN103955397A (en) Virtual machine scheduling multi-strategy selection method based on micro-architecture perception
CN109117247B (en) Virtual resource management system and method based on heterogeneous multi-core topology perception
EP4123649A1 (en) Memory module, system including the same, and operation method of memory module
TWI824392B (en) On-demand shared data caching method, computer program, and computer readable medium applicable for distributed deep learning computing
Blagodurov et al. Towards the contention aware scheduling in hpc cluster environment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant