KR101303079B1

KR101303079B1 - Apparatus and method for controlling cache coherence in virtualized environment based on multi-core

Info

Publication number: KR101303079B1
Application number: KR1020110042577A
Authority: KR
Inventors: 허재혁; 김대훈; 김환주
Original assignee: 한국과학기술원
Priority date: 2011-05-04
Filing date: 2011-05-04
Publication date: 2013-09-03
Also published as: KR20120124743A

Abstract

The present invention relates to an apparatus and a method for controlling cache coherence in a multi-core based virtualized environment, and to a recording medium on which the method is recorded. The cache coherence control method according to the present invention generates mapping information between the virtual processors of a virtual machine and the cores, together with state information indicating whether a page is shared. When a cache miss is detected, the method uses the mapping information and the state information to check whether the page containing the missed data is shared by virtual machines and, according to the result of the check, selectively transmits coherence messages in consideration of the cores to which the virtual processors belonging to the same virtual machine are mapped.

Description

Apparatus and method for controlling cache coherence in virtualized environment based on multi-core

The present invention relates to a multi-core system in which a virtualized environment is implemented. More particularly, it relates to a cache coherence control apparatus and method, and a recording medium on which the method is recorded, that control cache coherence in a system having a plurality of cores in consideration of the characteristics of the virtualized environment, thereby reducing the traffic caused by unnecessary coherence messages between the cores.

Virtualization is a technique that abstracts computer resources and separates them from the actual physical resources. Conventionally, physical computing resources such as a central processing unit (CPU) and memory were tied to an operating system (OS) to form a single computing system. That is, only one computing system could be configured on one set of hardware. Virtualization breaks with this convention and makes it possible to build multiple operating systems, and computing systems based on them, on a single piece of physical hardware. Among the various virtualization technologies, server virtualization in particular lets a user or a company use a single PC or server as if it were several PCs.

Meanwhile, single-core technology based on a single thread long ago reached the limit of performance improvement because of excessive power consumption. To overcome this limit, multi-core systems that place multiple cores in one central processing unit (CPU) have been actively researched. Most large-scale data processing systems now adopt multi-core systems and achieve better performance than with a single core. Multi-core systems show even greater benefits when processing parallelized work. In addition, the number of cores per CPU is expected to keep increasing as semiconductor integration density improves.

In a multi-core system, each core has a local cache for faster data access. If the cores use the same shared memory, they must communicate with one another to keep their caches coherent. For example, suppose that a first core has loaded a variable x into its local cache and that a second core has also loaded x into its local cache. If the second core changes the value of x, a write cache miss occurs, and the value of x seen by the first core differs from the value seen by the second core. This situation is called the cache coherence problem, and various techniques have been introduced to solve it.
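The two-core scenario above can be reproduced in a few lines. The following is a toy sketch and not part of the disclosure: the `Core` class, the `memory` dictionary, and the method names are hypothetical, and the store deliberately sends no coherence message so that the stale read becomes visible.

```python
# Toy model: every core has a private cache in front of one shared memory.
memory = {"x": 1}


class Core:
    def __init__(self, name):
        self.name = name
        self.cache = {}

    def load(self, var):
        # Fill the local cache from shared memory on the first access only.
        if var not in self.cache:
            self.cache[var] = memory[var]
        return self.cache[var]

    def store_without_coherence(self, var, value):
        # Updates only the local copy; no other core is notified.
        self.cache[var] = value


core1, core2 = Core("core1"), Core("core2")
core1.load("x")
core2.load("x")
core2.store_without_coherence("x", 2)

stale = core1.load("x")  # core1 still sees its old cached copy
fresh = core2.load("x")  # core2 sees the value it just wrote
```

A coherence protocol exists to remove exactly this divergence between `stale` and `fresh`, typically by invalidating or updating the copy held by the first core.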

The technical problem to be solved by the present invention is to provide a cache coherence control apparatus and method that overcome the limitation that conventional virtualization technology has remained tied to single-core systems, that eliminate the unnecessary communication caused by coherence messages transmitted to maintain cache coherence in a multi-core based virtualized environment, and that resolve the loss of system performance and the increase in power consumption caused by such unnecessary communication.

To solve the above technical problem, a method according to the present invention for controlling cache coherence in a virtualized environment in which a plurality of virtual machines share a plurality of cores includes: generating mapping information between the virtual processors of the virtual machines and the cores, and state information indicating whether a page is shared; when a cache miss is detected, checking, using the generated mapping information and the generated state information, whether the page containing the data on which the cache miss was detected is shared by virtual machines; and, according to the result of the check, selectively transmitting coherence messages in consideration of the cores to which the virtual processors belonging to the same virtual machine are mapped.

Here, the state information is one of the following: a virtual machine private state, indicating that the page is not shared by a plurality of virtual machines; a read-write shared state, indicating that the page is shared by a plurality of virtual machines and that each of those virtual machines can read and write the shared page; and a read-only shared state, indicating that the page is shared by a plurality of virtual machines and that each of those virtual machines can only read the shared page.

To solve the above technical problem, a method according to the present invention for controlling cache coherence in a virtualized environment in which a plurality of virtual machines share a plurality of cores includes: generating mapping information between the virtual processors of the virtual machines and the cores and storing it in a virtual processor map register; generating state information indicating whether a page is shared and storing it in a shadow page table; and, when a cache miss is detected, checking, using the stored mapping information and the stored state information, whether the page containing the data on which the cache miss was detected is shared by virtual machines, and thereby transmitting coherence messages only to cores selected from among the plurality of cores.

The cache coherence control method described above further includes calculating and storing, for each of the plurality of cores, a residence counter that indicates the number of data items of a virtual machine loaded in the local cache of that core.
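One way to picture the residence counter is as a per-core, per-virtual-machine count that rises when a line belonging to the virtual machine is filled into the core's local cache and falls when such a line is evicted. The sketch below assumes this fill/evict bookkeeping; the function names and the dictionary layout are illustrative and are not taken from the disclosure.

```python
from collections import defaultdict

# (core id, VMID) -> number of that VM's data items in the core's local cache.
residence = defaultdict(int)


def on_fill(core, vmid):
    # A line owned by `vmid` was loaded into the local cache of `core`.
    residence[(core, vmid)] += 1


def on_evict(core, vmid):
    # A line owned by `vmid` left the local cache of `core`.
    if residence[(core, vmid)] > 0:
        residence[(core, vmid)] -= 1


def holds_vm_data(core, vmid):
    # True while at least one data item of the VM remains in the cache.
    return residence[(core, vmid)] > 0


on_fill(core=2, vmid=1)
on_fill(core=2, vmid=1)
on_evict(core=2, vmid=1)
still_resident = holds_vm_data(2, 1)
on_evict(core=2, vmid=1)
resident_after_last_evict = holds_vm_data(2, 1)
```

A counter that has returned to zero indicates that the core no longer caches any data of that virtual machine.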

Furthermore, a computer-readable recording medium is provided on which a program is recorded for causing a computer to execute the above-described method of controlling cache coherence in a virtualized environment in which a plurality of virtual machines share a plurality of cores.

To solve the above technical problem, an apparatus according to the present invention for controlling cache coherence in a virtualized environment in which a plurality of virtual machines share a plurality of cores includes: a monitoring unit that generates mapping information between the virtual processors of the virtual machines and the cores and stores it in a virtual processor map register, and that generates state information indicating whether a page is shared and stores it in a shadow page table; a sharing check unit that, when a cache miss is detected, checks, using the stored mapping information and the stored state information, whether the page containing the data on which the cache miss was detected is shared by virtual machines; and a processing unit that, according to the result of the check, selectively transmits coherence messages in consideration of the cores to which the virtual processors belonging to the same virtual machine are mapped.

Here, the state information is one of the following: a virtual machine private state, indicating that the page is not shared by a plurality of virtual machines; a read-write shared state, indicating that the page is shared by a plurality of virtual machines and that each of those virtual machines can read and write the shared page; and a read-only shared state, indicating that the page is shared by a plurality of virtual machines and that each of those virtual machines can only read the shared page. The monitoring unit maps each page to its state information and stores the mapping in the shadow page table, and the sharing check unit checks whether the page is shared by referring to the shadow page table and a translation lookaside buffer (TLB) when data is accessed.

To solve the above technical problem, an apparatus according to the present invention for controlling cache coherence in a virtualized environment in which a plurality of virtual machines share a plurality of cores includes: a monitoring unit that generates mapping information between the virtual processors of the virtual machines and the cores and stores it in a virtual processor map register, that generates state information indicating whether a page is shared and stores it in a shadow page table, and that calculates and stores, for each of the plurality of cores, a residence counter indicating the number of data items of a virtual machine loaded in the local cache of that core; and a processing unit that, when a cache miss is detected, checks, using the stored mapping information and the stored state information, whether the page containing the data on which the cache miss was detected is shared by virtual machines, and thereby transmits coherence messages only to cores selected from among the plurality of cores.

In the cache coherence control apparatus described above, the monitoring unit also maps the virtual processors allocated to each of the plurality of virtual machines and stores the mapping in the virtual processor map register as a snoop domain.

By proposing a technique that selectively transmits coherence messages to cores using the mapping information between virtual processors and cores and the state information indicating whether a page is shared, the present invention allows virtualization technology to be extended to multi-core based systems. It improves system performance by reducing the traffic of coherence messages transmitted to maintain cache coherence in a multi-core based virtualized environment, and this reduction in traffic in turn minimizes power consumption.

FIG. 1 is a diagram illustrating the basic idea of the present invention for controlling cache coherence in a virtualized environment in which a plurality of virtual machines share a plurality of cores.
FIG. 2 is a diagram illustrating a process in which a virtual machine is relocated in a virtualized environment according to an embodiment of the present invention.
FIG. 3 is a flowchart illustrating a method of controlling cache coherence in a virtualized environment according to an embodiment of the present invention.
FIG. 4 is a flowchart illustrating the cache coherence control method of FIG. 3 in more detail according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating an example of a method of controlling cache coherence when the page on which a cache miss occurred is in the read-only shared state in a virtualized environment according to an embodiment of the present invention.
FIG. 6 is a diagram illustrating a process of updating the snoop domain of the virtual processor map register using a residence counter according to another embodiment of the present invention.
FIG. 7 is a block diagram illustrating an apparatus for controlling cache coherence in a virtualized environment according to an embodiment of the present invention.

Before describing the embodiments of the present invention, the related techniques for controlling cache coherence are reviewed briefly, together with the problems that arise when a plurality of virtual machines share a plurality of cores in the virtualized environment in which the embodiments are implemented.

In general, solving the cache coherence problem requires a protocol that keeps the caches coherent so that every local cache always returns the latest value. In addition, when a read cache miss occurs in a local cache and the data is present in the local cache of another core, it is preferable to fetch the data from that other local cache rather than from a high-latency memory such as RAM. A cache coherence protocol is used for this purpose as well. Two basic approaches to cache coherence are known: the snoop approach and the directory approach.

First, in the snoop approach, the core on which a cache miss occurred sends its request message directly to all cores. This approach provides fast cache-to-cache data transfer without additional delay, but because the message must be sent to every core, traffic grows rapidly as the number of cores increases.
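The growth in snoop traffic can be made concrete with a small count. The sketch below assumes that one miss produces one request to every core other than the requester; the function name is hypothetical, and replies and data transfers are ignored.

```python
def snoop_targets(requesting_core, num_cores):
    """Plain snooping: every other core receives the request."""
    return [core for core in range(num_cores) if core != requesting_core]


# Request messages generated by a single cache miss for several core counts.
messages_per_miss = {
    n: len(snoop_targets(0, n)) for n in (4, 8, 16, 32, 64, 128)
}
```

Under this assumption a miss on a 4-core system costs 3 requests, while the same miss on a 128-core system costs 127.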

Second, the directory approach keeps a separate directory that tracks which cores have loaded a given piece of data into their local caches, and sends messages only to those cores. For this purpose, each piece of data has a core called its home node, which is determined by the physical address, and the home node maintains and provides the information about which cores hold the data in their local caches. When a cache miss occurs, the core on which it occurred first sends a message to the home node, and messages are then sent only to the sharing cores identified by the home node. The directory approach thus reduces traffic, but it requires an additional step to reach the home node, as well as separate storage and processing to maintain the information about sharing cores, which increases complexity and cost.
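The directory lookup described above can be sketched as follows. This is an illustration under stated assumptions: the home node is chosen by interleaving cache lines across cores, the line size is 64 bytes, there are 8 cores, and the `Directory` class and its method names are hypothetical. The text only states that the home node depends on the physical address.

```python
LINE_SIZE = 64  # bytes per cache line (assumed)
NUM_CORES = 8   # number of cores (assumed)


def home_node(paddr):
    # Assumption: cache lines are interleaved across cores by address.
    return (paddr // LINE_SIZE) % NUM_CORES


class Directory:
    """Sharer sets as kept by the home nodes, keyed by line number."""

    def __init__(self):
        self.sharers = {}

    def record_load(self, paddr, core):
        line = paddr // LINE_SIZE
        self.sharers.setdefault(line, set()).add(core)

    def targets_on_miss(self, paddr, requester):
        # Only cores recorded as sharers receive a message.
        line = paddr // LINE_SIZE
        return sorted(self.sharers.get(line, set()) - {requester})


directory = Directory()
directory.record_load(0x1040, core=2)
directory.record_load(0x1040, core=5)
home = home_node(0x1040)
targets = directory.targets_on_miss(0x1040, requester=5)
```

The extra hop to `home` and the storage behind `sharers` correspond to the added latency and cost that the text attributes to the directory approach.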

Because of this complexity and cost, many systems have adopted the snoop approach, but a problem arises as the number of cores grows. The snoop approach can guarantee a reasonable level of performance in a multi-core system with about four cores, but when the number of cores increases to 8, 16, 32, 64, or 128, the traffic generated by transmitting coherence messages increases rapidly. This increase in traffic inevitably degrades system performance and causes excessive power consumption, so an effort to reduce such unnecessary traffic is needed.

Meanwhile, as cloud computing has recently become prominent, virtualization technology has attracted attention. It implements a plurality of virtual machines on one physical machine so that each virtual machine operates as if it had its own separate hardware resources. Virtualization offers high resource utilization, high fault tolerance and portability, and excellent manageability.

When this virtualization technology is applied to a multi-core system, however, there are not only a plurality of physical machines (here meaning cores) but also a plurality of virtual machines. Each virtual machine has virtual processors (vCPUs), and the virtual processors are mapped to a plurality of actual physical cores in order to use them. In this environment the virtual machines each use different memory address regions, so communication between virtual machines is unnecessary except for parts such as data shared between virtual machines and the virtual machine monitor region. Current multi-core systems, however, take no account of virtual machines in core-to-core communication, so a considerable amount of unnecessary communication occurs. A traffic reduction technique for maintaining cache coherence that takes the virtualized environment into account is therefore required for multi-core systems.

Hereinafter, embodiments of the present invention are described in more detail with reference to the accompanying drawings. The same reference numerals in the drawings denote the same elements.

FIG. 1 is a diagram illustrating the basic idea of the present invention for controlling cache coherence in a virtualized environment in which a plurality of virtual machines share a plurality of cores. The cache coherence control apparatus 100 includes a plurality of cores 50 and their caches (L2 caches are shown as an example), as well as a virtual processor map register 13 and a shadow page table 15.

The virtual processor map register 13 maps each virtual machine to the cores allocated to it. That is, the virtual processor map register 13 preferably associates a virtual machine identifier (VMID) with a core list. FIG. 1 shows an example in which four cores (P0, P1, P2, P4) are mapped to one virtual machine (VM1).
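A compact way to hold a core list in a register is a bitmask with one bit per core. The sketch below assumes that encoding, which the text does not specify (it only says that a VMID is associated with a core list); the helper names are hypothetical. The example entry reproduces the VM1 mapping given for FIG. 1.

```python
def cores_to_mask(cores):
    # Set bit i when core Pi belongs to the virtual machine's core list.
    mask = 0
    for core in cores:
        mask |= 1 << core
    return mask


def mask_to_cores(mask):
    # Recover the core list from the bitmask, lowest core id first.
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


# VMID -> core bitmask; VM1 is mapped to P0, P1, P2, and P4 as in FIG. 1.
vcpu_map_register = {1: cores_to_mask([0, 1, 2, 4])}
vm1_cores = mask_to_cores(vcpu_map_register[1])
```

With this encoding, adding a core to a virtual machine's list is a single bitwise OR, and membership is a single bit test.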

The shadow page table 15 maps each page to its state. That is, the shadow page table 15 preferably associates a page address (paddr) with a page status and manages them in the form of a table. FIG. 1 shows an example in which three different states (VM-private, RW-shared, and RO-shared) are recorded for three pages. Each state is described in detail below with reference to FIG. 3.
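The page-to-state association can be sketched as a table keyed by page base address. The three state names follow FIG. 1; the 4 KiB page size, the example addresses, and the `page_state` helper are assumptions made for illustration only.

```python
from enum import Enum


class PageState(Enum):
    VM_PRIVATE = "VM-private"  # not shared between virtual machines
    RW_SHARED = "RW-shared"    # shared, readable and writable
    RO_SHARED = "RO-shared"    # shared, read-only


PAGE_SIZE = 4096  # assumed page size in bytes

# Page base address (paddr) -> sharing state; addresses are hypothetical.
shadow_page_table = {
    0x1000: PageState.VM_PRIVATE,
    0x2000: PageState.RW_SHARED,
    0x3000: PageState.RO_SHARED,
}


def page_state(paddr):
    # Mask off the in-page offset to find the page that contains paddr.
    return shadow_page_table[paddr & ~(PAGE_SIZE - 1)]


state_of_miss = page_state(0x2010)
```

A lookup such as `page_state(0x2010)` is what a sharing check would perform for the address on which a cache miss occurred.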

Virtualization technology provides an independent computing environment for each virtual machine. Basically, each virtual machine uses its own virtual processors and memory region. The virtual processors (vCPUs) of each virtual machine are mapped to different physical cores and executed there. If no memory is shared between virtual machines, there is no need to maintain coherence between the local caches of physical cores to which different virtual machines are mapped. Cache coherence needs to be maintained only among the local caches of the cores to which virtual processors of the same virtual machine are mapped.

Accordingly, to reduce cache-to-cache traffic by exploiting this property of virtualization, the embodiments of the present invention described below track which cores the virtual processors of each virtual machine are mapped to, and use this information to transmit cache coherence messages only to the cores to which virtual processors of the same virtual machine are currently mapped. In other words, the basic idea of the embodiments is a coherence control technique that takes the characteristics of virtualization into account in a multi-core system with a virtualized environment and thereby reduces the traffic generated to keep the cores coherent.

However, regions shared between virtual machines may exist, and the virtual machine monitor region is invoked by all virtual machines, so its data may be loaded into the local cache of any physical core. The data therefore needs to be classified by its nature, and coherence on a cache miss needs to be handled differently for each type. To this end, various embodiments of the present invention store sharing state information for each page in the entries of a page table (the shadow page table 15 in FIG. 1) and thereby check whether the page on which a cache miss occurred is shared with other virtual machines.

More specifically, to maintain coherence in this way, it must be tracked which cores' local caches currently hold the data of each virtual machine, and the information about the tracked cores must be stored in a data structure. This information is loaded into the virtual processor map register 13 whenever a virtual machine is newly scheduled onto a core. The loaded virtual processor map register 13 is always consulted when a cache miss occurs on that core and a coherence message must be sent.

The method of selecting the destinations of a coherence message is now described. A coherence message should be transmitted only to the cores that hold data of the virtual machine at the time the cache miss occurs. To this end, it is first necessary to check whether the page containing the data on which the cache miss was detected is shared by virtual machines. In FIG. 1, the multi-core system 100 in which the virtualized environment is implemented records sharing information for each page in the shadow page table 15, which translates virtual addresses into actual physical addresses, and consults this recorded sharing information before transmitting a coherence message when a cache miss occurs.

If the data on which the cache miss occurred is not contained in a shared page, it is sufficient to transmit the coherence message only to the cores referenced through the virtual processor map register 13. In contrast, if the data is contained in a page shared by another virtual machine or by a virtual machine monitor, cores that hold data of other virtual machines are also likely to hold the same data, so sending the coherence message only to the cores that hold data of the virtual machine in question is insufficient to maintain cache coherence. The method of maintaining coherence in this case is described in detail below with reference to FIG. 4.

FIG. 2 is a diagram illustrating a process in which a virtual machine is relocated in a virtualized environment according to an embodiment of the present invention. It shows a situation in which the virtual processors of a virtual machine are not pinned but migrate between cores to increase core utilization.

When a cache miss occurs in FIG. 2, a newly used core must be added to the snoop domain appropriately so that coherence messages are correctly sent to all cores holding data of the virtual machine. If the virtual machine starts using a core that is not in the virtual processor map register, that core must be added to the virtual processor map register before data is loaded into its local cache. In addition, not only the virtual processor map register of the newly used core but also the virtual processor map registers of all cores that have loaded the mapping information of the virtual machine must be updated, by transmitting an update message to each of them. The updated value of the virtual processor map register is then stored in a data structure that represents the snoop domain of each virtual machine. Through this process, coherence messages can be sent to all cores that hold data of the virtual machine. However, if the entries of the virtual processor map register keep increasing because of virtual processor relocation, the snoop domain may come to contain as many cores as the system has, or nearly as many, even when the number of virtual processors is smaller than the number of actual cores.
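The update-on-relocation rule, and the way it lets the snoop domain grow, can be sketched as follows. The model is an assumption made for illustration: each core keeps its own copy of the virtual machine's core set, the update message is represented by a direct write to every copy, and `relocate` is a hypothetical name.

```python
def relocate(per_core_map, vmid, old_domain, new_core):
    # Add the new core to the snoop domain before any data is loaded there,
    # then push the updated domain to every core in the domain.
    new_domain = set(old_domain) | {new_core}
    for core in new_domain:
        per_core_map.setdefault(core, {})[vmid] = set(new_domain)
    return new_domain


per_core_map = {}  # core id -> {VMID: snoop domain as seen by that core}
domain = set()

# A virtual machine with a single vCPU that migrates across four cores.
for core in (0, 1, 2, 3):
    domain = relocate(per_core_map, 1, domain, core)
```

After the loop the domain contains all four cores although the virtual machine has only one virtual processor, which is the growth that the paragraph above warns about.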

FIG. 3 is a flowchart illustrating a method of controlling cache coherence in a virtualization environment according to an embodiment of the present invention, which includes the following steps. Here, the virtualization environment of FIG. 3 is one in which a plurality of virtual machines share a plurality of cores.

In step 310, mapping information between the virtual processors of a virtual machine and the cores, and state information indicating whether a page is shared, are generated. Here, the mapping information is generated by tracking which cores' local caches hold the page-granularity data used by the virtual processors of the virtual machine, and recording the tracked cores.

In step 320, when a cache miss is detected, the mapping information and state information generated in step 310 are used to check whether the page containing the missed data is shared among virtual machines.

In step 330, a coherence message is transmitted selectively, according to the result of the sharing check in step 320, taking into account the cores to which the virtual processors of the same virtual machine are mapped. The transmission proceeds as follows.

First, the occurrence of a cache miss is detected. Such detection is a standard part of any conventional multi-core system, so a detailed description is omitted here. Next, the state information generated in step 310 is used to check whether the page containing the missed data is shared among virtual machines. This check may be performed by referring to the per-page state information recorded in the shadow page table. Based on the result, the virtual machine monitor refers to the mapping information generated in step 310 and selects cores mapped to the virtual machine. Which cores are selected depends on the type of page sharing; the types and the selection method are described below with reference to FIG. 4. Finally, a coherence message is transmitted to the selected cores.

FIG. 4 is a flowchart illustrating the cache coherence control method of FIG. 3 in more detail according to an embodiment of the present invention. It presents three ways of transmitting coherence messages, one for each type of page sharing.

In step 410, the mapping information and state information are generated and stored. That is, mapping information between the virtual processors of the virtual machine and the cores is generated and stored in the virtual processor map register, and state information indicating whether a page is shared is generated and stored in the shadow page table.

In step 420, it is detected whether a cache miss has occurred. This detection is performed whenever a core accesses data. If no cache miss has occurred, the procedure ends. If a cache miss has occurred, the procedure advances to step 430 and carries out the steps needed to maintain cache coherence.

In step 430, it is checked whether the page is shared. The check refers to the shadow page table and the translation lookaside buffer (TLB). For this purpose, embodiments of the present invention classify the page sharing state into at least the following three types.

First, the virtual machine private state indicates that the page is not shared by a plurality of virtual machines.

Second, the read-write shared state indicates that the page is shared by a plurality of virtual machines and that each of those virtual machines can both read and write the shared page.

Third, the read-only shared state indicates that the page is shared by a plurality of virtual machines and that each of those virtual machines can only read the shared page.

Embodiments of the present invention use an identifier (the page sharing state information) to distinguish three types in total: the unshared virtual machine private type, and the read-write shared and read-only shared types, for which shared data may reside in a local cache that belongs to another virtual machine's snoop domain rather than to the snoop domain of the current virtual machine. In terms of implementation, this state information can be represented with two additional bits in each shadow page table entry and each TLB entry.
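One way to picture the two-bit state field is the following sketch. The bit position (`SHARE_SHIFT`) and the numeric values assigned to the three states are assumptions chosen for illustration; the specification states only that two additional bits are used.

```python
from enum import IntEnum


class PageShare(IntEnum):
    """The three page sharing states; the numeric values are assumed."""
    VM_PRIVATE = 0
    RW_SHARED = 1
    RO_SHARED = 2


SHARE_SHIFT = 9                 # hypothetical position of the 2-bit field
SHARE_MASK = 0b11 << SHARE_SHIFT


def set_share_state(entry, state):
    """Return the page table (or TLB) entry with its 2-bit state replaced."""
    return (entry & ~SHARE_MASK) | (int(state) << SHARE_SHIFT)


def get_share_state(entry):
    """Decode the 2-bit sharing state stored in an entry."""
    return PageShare((entry & SHARE_MASK) >> SHARE_SHIFT)


entry = 0x12345007
tagged = set_share_state(entry, PageShare.RO_SHARED)
```

Two bits leave one of the four encodings unused, so a decoder built this way raises `ValueError` on the spare value rather than silently mapping it to a state.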

If the check in step 430 finds that the page is not shared (that is, it is in the virtual machine private state, step 440), the procedure advances to step 445. In step 445, only the cores in the snoop domain read from the virtual processor map register are selected, and the coherence message is transmitted only to those cores.

If, on the other hand, the page is shared and in the read-write shared state (step 450), the procedure advances to step 455. In step 455, the coherence message is broadcast to all cores. Consider, for example, a cache miss on data of the virtual machine monitor, which is typical read-write shared data. Because the virtual machine monitor region can be invoked by any virtual machine, its data can easily end up in the local cache of any physical core. Broadcasting is therefore more effective than tracking which cores hold the page.

If the page is shared and in the read-only shared state, the procedure advances to step 465. In step 465 the message may be broadcast as in the read-write shared case, but other approaches allow more efficient transmission. Broadcasting handles coherence messages for read-only pages as simply as it does for read-write pages, but as the number of pages shared between virtual machines grows, it too may generate heavy traffic to keep the local caches coherent. Embodiments of the present invention therefore introduce at least the following three optimization techniques: the memory-direct method, the intra-VM method, and the friend-VM method.
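The three branches of FIG. 4 (steps 445, 455, and 465) can be summarized in a short sketch. It treats the read-only shared case with the broadcast baseline mentioned for step 465 and leaves the optimized methods aside. The function name, the string state labels, and the choice to exclude the requesting core from the target set are assumptions made for illustration.

```python
VM_PRIVATE, RW_SHARED, RO_SHARED = "vm_private", "rw_shared", "ro_shared"


def coherence_targets(state, vm_id, snoop_domains, all_cores, requester):
    """Pick the cores that receive a coherence message after a cache miss.

    snoop_domains maps VM id -> set of core ids (the contents of the virtual
    processor map register). The requesting core is excluded, which is an
    assumption of this sketch and is not stated in the text.
    """
    if state == VM_PRIVATE:
        # Step 445: only the cores in the VM's own snoop domain.
        targets = set(snoop_domains[vm_id])
    elif state in (RW_SHARED, RO_SHARED):
        # Step 455, and the broadcast baseline of step 465: every core.
        targets = set(all_cores)
    else:
        raise ValueError(f"unknown page sharing state: {state!r}")
    targets.discard(requester)
    return targets


snoop_domains = {"VM1": {0, 1}, "VM2": {2, 3}}
private = coherence_targets(VM_PRIVATE, "VM1", snoop_domains, range(4), requester=0)
shared = coherence_targets(RW_SHARED, "VM1", snoop_domains, range(4), requester=0)
```

In this four-core example a miss on a private page snoops one core instead of three, which is where the traffic reduction comes from.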

Read-only shared data calls for a different approach for the following reason. Typical read-only shared pages are those shared by content-based sharing, a technique used to save memory in virtualization environments. When a write to such a page occurs, copy-on-write takes place and a new page is allocated at a new physical address, so no separate cache coherence action is needed for the existing shared page. Coherence traffic for read-only shared pages therefore needs to consider only read operations, and a read, unlike a write, does not require a coherence message to be sent to every local cache holding the data, because correct data can always be fetched from memory. Even if the data resides in the local cache of another core and no coherence message reaches that cache, so that the data cannot be obtained from it, the requested data can still be fetched from memory. The number of coherence messages can therefore be reduced further. Three methods for reducing coherence messages for read-only pages are presented below.

First, the memory-direct method reads the missed data directly from memory. More specifically, when a read of a read-only page causes a cache miss, the method generates no coherence message and unconditionally sends the request to memory. This reduces coherence traffic, but if the data resides in the local cache of another core, it is still fetched from memory even though it could have come from that cache, so the fetch latency may suffer. The following methods can be used to gain more opportunities to fetch data from a cache.

Second, the intra-VM method uses the mapping information to request the missed data simultaneously from memory and from the cores mapped to the virtual machine in which the cache miss occurred. That is, the data request is sent at the same time to memory and to the cores recorded in the virtual processor map register entry of the virtual machine that caused the miss. This method generates more coherence traffic than the memory-direct method, but because the requests go to cores that are related to the data, the probability of obtaining the data from a cache rises relative to the added traffic.

Third, the friend-VM method designates in advance, for each virtual machine, the virtual machines with which it shares pages frequently as its friend virtual machines, and uses the mapping information to request the missed data simultaneously from the cores mapped to the virtual machine in which the cache miss occurred, the cores mapped to its friend virtual machines, and memory. That is, when data is requested, the request is sent at the same time to the snoop domain of the virtual machine that caused the miss, the snoop domains of its friend virtual machines, and memory. This method generates somewhat more traffic than the methods above but further raises the probability of obtaining the data from a cache. To this end, the virtual machine monitor must track the degree of sharing between virtual machines and designate friend virtual machines accordingly. Cache coherence control using friend virtual machines is described again with reference to the example of FIG. 5.
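The difference between the three read-only methods comes down to which destinations receive the data request. The sketch below makes that explicit; the function name, the `"memory"` sentinel, and the concrete core numbers are hypothetical and serve only to illustrate the selection rule.

```python
MEMORY = "memory"  # sentinel standing for the main-memory controller


def read_only_targets(method, vm_id, snoop_domains, friends):
    """Return the destinations of a data request for a read-only shared page.

    snoop_domains maps VM id -> set of core ids; friends maps VM id -> set of
    friend VM ids. Both structures are assumptions of this sketch.
    """
    if method == "memory_direct":
        # No coherence message at all: ask memory only.
        return {MEMORY}
    if method not in ("intra_vm", "friend_vm"):
        raise ValueError(f"unknown method: {method!r}")

    # Intra-VM: memory plus the snoop domain of the VM that missed.
    targets = {MEMORY} | set(snoop_domains[vm_id])
    if method == "friend_vm":
        # Friend-VM: additionally the snoop domains of all friend VMs.
        for friend in friends.get(vm_id, ()):
            targets |= set(snoop_domains[friend])
    return targets


snoop_domains = {"VM1": {0, 1, 2, 3}, "VM2": {4, 5, 6, 7}, "VM3": {8, 9, 10, 11}}
friends = {"VM1": {"VM2"}, "VM2": {"VM1"}}
direct = read_only_targets("memory_direct", "VM1", snoop_domains, friends)
intra = read_only_targets("intra_vm", "VM1", snoop_domains, friends)
friend = read_only_targets("friend_vm", "VM1", snoop_domains, friends)
```

Each step from memory-direct to intra-VM to friend-VM enlarges the target set, trading more messages for a better chance of a cache-to-cache transfer.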

FIG. 5 illustrates an example of a method of controlling cache coherence when the page on which a cache miss occurred is in the read-only shared state, in a virtualization environment according to an embodiment of the present invention. In the example, each of the virtual machines (510, 520, 530) has been allocated four cores as its snoop domain.

FIG. 5 also shows that friend virtual machine setting information (550) is maintained for each of the virtual machines (510, 520, 530). As shown, virtual machine VM1 shares 40% of pages with virtual machine VM2, so the virtual machine monitor has designated VM1 and VM2 as friend virtual machines. VM1 shares only 1% of pages with virtual machine VM3, so they are not designated as friend virtual machines.

Now suppose that virtual machine VM1 incurs a cache miss on core P0, one of its own cores, by reading a read-only shared page. Under the friend-VM method, a coherence message is transmitted to the snoop domains of VM1 and VM2, and at the same time a data request is sent to memory. VM3 is not a friend virtual machine of VM1, so no coherence message is sent to it.
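How the virtual machine monitor might derive the friend settings of FIG. 5 from measured sharing ratios can be sketched as below. The 10% threshold is purely an assumption: the specification gives only the two data points (40% is a friend, 1% is not) and does not state a cut-off. The function name `choose_friends` is likewise hypothetical.

```python
def choose_friends(sharing_ratio, threshold=0.10):
    """Derive friend-VM settings from pairwise page sharing ratios.

    sharing_ratio maps a (vm_a, vm_b) pair to the fraction of pages shared.
    threshold is an assumed cut-off, not a value from the specification.
    """
    friends = {}
    for (vm_a, vm_b), ratio in sharing_ratio.items():
        if ratio >= threshold:
            # Friendship is recorded symmetrically for both VMs.
            friends.setdefault(vm_a, set()).add(vm_b)
            friends.setdefault(vm_b, set()).add(vm_a)
    return friends


# Ratios taken from the FIG. 5 example.
sharing_ratio = {("VM1", "VM2"): 0.40, ("VM1", "VM3"): 0.01}
friends = choose_friends(sharing_ratio)
```

With these inputs VM1 and VM2 become friends of each other and VM3 has no entry, matching the outcome described for FIG. 5.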

FIG. 6 illustrates a process of updating the snoop domain in the virtual processor map register (13) using a residence counter (17) according to another embodiment of the present invention. In this embodiment, a residence counter (17) indicating the number of data items of each virtual machine loaded in the local cache (55) of a core (50) is calculated and stored for each of the plurality of cores (50).

More specifically, each core (50) uses its residence counter (17) as follows: when data of a virtual machine is newly loaded into the local cache (55) of the core (50), the counter for that virtual machine is incremented by 1, and when data of the virtual machine is replaced, the counter is decremented by 1. The per-virtual-machine counters thus indicate how many data items of each virtual machine are currently loaded in the local cache (55) of the core (50). Accordingly, removing a core from a virtual machine's snoop domain at the moment its residence counter (17) reaches 0 eliminates cores that no longer need to be monitored. To this end, in this embodiment, when the residence counter (17) corresponding to a core (50) is 0, that core is deleted from the snoop domain in the mapping information stored in the virtual processor map register (13).

FIG. 6 shows the case in which data of virtual machine VM1 is replaced on core P0 (50) and the VM1 value of the corresponding residence counter (17) becomes 0. The virtual processor map register (13) then removes P0 from the snoop domain of virtual machine VM1.
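The residence counter mechanism can be modeled with a small bookkeeping class. This is a software sketch under assumptions: the class name `ResidenceTracker` and its method names are hypothetical, and keeping one shared snoop-domain table (rather than per-core register copies) is a simplification.

```python
from collections import defaultdict


class ResidenceTracker:
    """Counts cached data items per (core, VM) and prunes the snoop domain."""

    def __init__(self):
        self.count = defaultdict(int)          # (core, vm) -> resident items
        self.snoop_domain = defaultdict(set)   # vm -> set of core ids

    def on_load(self, core, vm):
        """Data of vm is newly loaded into the local cache of core."""
        self.count[(core, vm)] += 1
        self.snoop_domain[vm].add(core)

    def on_replace(self, core, vm):
        """Data of vm is replaced (evicted) from the local cache of core."""
        if self.count[(core, vm)] == 0:
            return  # nothing resident; ignored in this sketch
        self.count[(core, vm)] -= 1
        if self.count[(core, vm)] == 0:
            # Counter reached 0: the core no longer needs to be snooped.
            self.snoop_domain[vm].discard(core)


tracker = ResidenceTracker()
tracker.on_load(0, "VM1")
tracker.on_load(0, "VM1")
tracker.on_load(1, "VM1")
tracker.on_replace(0, "VM1")
domain_mid = set(tracker.snoop_domain["VM1"])
tracker.on_replace(0, "VM1")
domain_end = set(tracker.snoop_domain["VM1"])
```

Core 0 stays in the snoop domain while one item of VM1 remains cached there and drops out when the last item is replaced, as in the FIG. 6 example with P0.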

According to the various embodiments of the present invention described above, coherence messages are transmitted selectively to cores using the mapping information between virtual processors and cores and the state information indicating whether a page is shared. This allows virtualization technology to be extended to multi-core systems, improves system performance by reducing the traffic of coherence messages sent to maintain cache coherence in a multi-core virtualization environment, and, through the reduced traffic, can minimize power consumption.

Meanwhile, the present invention can be implemented as computer-readable code on a computer-readable recording medium. The computer-readable recording medium includes every kind of recording device that stores data readable by a computer system.

Examples of the computer-readable recording medium include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices, and also include implementations in the form of a carrier wave (for example, transmission over the Internet). The computer-readable recording medium may also be distributed over network-connected computer systems so that computer-readable code is stored and executed in a distributed manner. Functional programs, code, and code segments for implementing the present invention can be readily inferred by programmers in the technical field to which the present invention belongs.

FIG. 7 is a block diagram of an apparatus (100) for controlling cache coherence in a virtualization environment according to an embodiment of the present invention. The apparatus includes a monitoring unit (10), a sharing checker (20), and a processing unit (30). As assumed in this embodiment, a plurality of virtual machines share a plurality of cores (50), and each core (50) has a cache. Each component shown in FIG. 7 corresponds to a step of the cache coherence control method described in detail with reference to FIGS. 3 and 4, so only an outline is given here.

The monitoring unit (10) generates mapping information between the virtual processors of the virtual machines and the cores and stores it in the virtual processor map register (13), and generates state information indicating whether a page is shared and stores it in the shadow page table (15). Here, the state information is one of the virtual machine private state, the read-write shared state, and the read-only shared state. The monitoring unit (10) also maps the virtual processors allocated to each of the plurality of virtual machines and stores the result in the virtual processor map register (13) as a snoop domain.

The monitoring unit (10) may also calculate and store, for each of the plurality of cores, a residence counter (17) indicating the number of data items of a virtual machine loaded in the local cache of that core. When the residence counter corresponding to a core is 0, the monitoring unit (10) deletes that core from the snoop domain of the mapping information, thereby excluding unnecessary cores from monitoring.

When a cache miss is detected, the sharing checker (20) uses the mapping information stored in the virtual processor map register (13) and the state information stored in the shadow page table (15) to check whether the page containing the missed data is shared among virtual machines. On a data access, the sharing checker (20) checks whether the page is shared by referring to the shadow page table (15) and the translation lookaside buffer (25).

The processing unit (30) selectively transmits a coherence message according to the result of the check by the sharing checker (20), taking into account the cores to which the virtual processors of the same virtual machine are mapped.

The present invention has been described above with reference to its various embodiments. Those of ordinary skill in the art to which the present invention pertains will understand that the invention may be implemented in modified forms without departing from its essential characteristics. The disclosed embodiments should therefore be considered in an illustrative rather than a restrictive sense. The scope of the present invention is defined by the claims rather than by the foregoing description, and all differences within an equivalent scope should be construed as being included in the present invention.

100: cache coherence control apparatus
10: monitoring unit 13: virtual processor map register
15: shadow page table 17: residence counter
20: sharing checker 25: translation lookaside buffer
30: processing unit
50: core 55: cache
510, 520, 530: virtual machine 550: friend virtual machine setting information

Claims

A method of controlling cache coherence in a virtualization environment in which a plurality of virtual machines share a plurality of cores, the method comprising:
generating mapping information between virtual processors of the virtual machines and the cores, and state information indicating whether a page is shared;
when a cache miss is detected, checking, using the generated mapping information and the generated state information, whether the page containing the data on which the cache miss was detected is shared among the virtual machines; and
selectively transmitting a coherence message according to a result of the check, taking into account the cores to which the virtual processors belonging to the same virtual machine are mapped,
wherein the transmitting of the coherence message comprises:
detecting the occurrence of the cache miss;
checking, using the generated state information, whether the page containing the data on which the cache miss was detected is shared among the virtual machines;
selecting, by a virtual machine monitor and based on a result of the check, a core mapped to the virtual machine by referring to the generated mapping information; and
transmitting the coherence message to the selected core.

delete

The method of claim 1,
wherein the state information is one of:
a virtual machine private state indicating that the page is not shared by a plurality of virtual machines;
a read-write shared state indicating that the page is shared by a plurality of virtual machines and each of the virtual machines can read and write the shared page; and
a read-only shared state indicating that the page is shared by a plurality of virtual machines and each of the virtual machines can only read the shared page.

The method of claim 1,
wherein, when the result of the check is a virtual machine private state indicating that the page is not shared by a plurality of virtual machines,
the transmitting of the coherence message comprises transmitting the coherence message only to cores included in a snoop domain read by referring to a virtual processor map register that stores the mapping information.

The method of claim 1,
wherein, when the result of the check is a read-write shared state indicating that the page is shared by a plurality of virtual machines and each of the virtual machines can read and write the shared page,
the transmitting of the coherence message comprises broadcasting the coherence message to all cores.

The method of claim 1,
wherein, when the result of the check is a read-only shared state indicating that the page is shared by a plurality of virtual machines and each of the virtual machines can only read the shared page,
the transmitting of the coherence message is performed by at least one of a broadcasting method, a memory-direct method, an intra-VM method, and a friend-VM method.

The method of claim 6,
wherein the memory-direct method reads the cache-missed data directly from memory.

The method of claim 6,
wherein the intra-VM method uses the mapping information to request the cache-missed data simultaneously from memory and from the cores mapped to the virtual machine in which the cache miss occurred.

The method of claim 6,
wherein the friend-VM method:
designates in advance, for each of the plurality of virtual machines, virtual machines with which sharing occurs frequently as friend virtual machines; and
uses the mapping information to request the cache-missed data simultaneously from the cores mapped to the virtual machine in which the cache miss occurred, the cores mapped to the designated friend virtual machines, and memory.

A method of controlling cache coherence in a virtualization environment in which a plurality of virtual machines share a plurality of cores, the method comprising:
generating mapping information between virtual processors of the virtual machines and the cores and storing the mapping information in a virtual processor map register;
generating state information indicating whether a page is shared and storing the state information in a shadow page table; and
when a cache miss is detected, checking, based on the stored mapping information and the stored state information, whether the page containing the cache-missed data is shared among the virtual machines, and transmitting a coherence message only to cores selected from among the plurality of cores.

The method of claim 10,
further comprising calculating and storing, for each of the plurality of cores, a residence counter indicating the number of data items of a virtual machine loaded in the local cache of that core.

The method of claim 11,
further comprising, when the residence counter corresponding to a given core is 0, deleting that core from the snoop domain of the stored mapping information.

A computer-readable recording medium having recorded thereon a program for executing the method of any one of claims 1 and 3 to a computer.

An apparatus for controlling cache coherence in a virtualization environment in which a plurality of virtual machines share a plurality of cores, the apparatus comprising:
a monitoring unit that generates mapping information between virtual processors of the virtual machines and the cores and stores the mapping information in a virtual processor map register, and generates state information indicating whether a page is shared and stores the state information in a shadow page table;
a sharing checker that, when a cache miss is detected, checks, using the stored mapping information and the stored state information, whether the page containing the data on which the cache miss was detected is shared among the virtual machines; and
a processing unit that selectively transmits a coherence message according to a result of the check, taking into account the cores to which the virtual processors belonging to the same virtual machine are mapped,
wherein the processing unit:
detects the occurrence of the cache miss;
checks, using the generated state information, whether the page containing the data on which the cache miss was detected is shared among the virtual machines;
selects, by a virtual machine monitor and based on a result of the check, a core mapped to the virtual machine by referring to the generated mapping information; and
transmits the coherence message to the selected core.

15. The apparatus of claim 14,
Wherein the state information includes any one of: a virtual machine exclusive state indicating that the page is not shared by a plurality of virtual machines; a read-write shared state indicating that the page is shared by a plurality of virtual machines and each of the virtual machines can read and write the shared page; and a read-only shared state indicating that the page is shared by a plurality of virtual machines and each of the virtual machines can only read the shared page,
The monitoring unit maps each page to the state information of that page and stores the mapping in the shadow page table, and
The sharing checker checks whether the page is shared by referring to the shadow page table and a translation lookaside buffer (TLB) when data is accessed.
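
The three page states and the lookup through the TLB and the shadow page table can be sketched as below. The enum member names, the dictionary-based TLB and shadow page table, and the policy of filling the TLB on a miss are illustrative assumptions, not details taken from the claims.

```python
from enum import Enum

class PageState(Enum):
    VM_EXCLUSIVE = 1       # page is not shared by a plurality of virtual machines
    READ_WRITE_SHARED = 2  # shared; each virtual machine can read and write
    READ_ONLY_SHARED = 3   # shared; each virtual machine can only read

def lookup_state(page, tlb, shadow_page_table):
    """Return the page state, consulting the TLB first and the shadow page table on a TLB miss."""
    if page in tlb:
        return tlb[page]
    state = shadow_page_table[page]  # hypothetical page -> PageState mapping
    tlb[page] = state                # assumption: the state is cached alongside the translation
    return state

def is_shared(page, tlb, shadow_page_table):
    """A page counts as shared unless it is in the virtual machine exclusive state."""
    return lookup_state(page, tlb, shadow_page_table) is not PageState.VM_EXCLUSIVE
```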

An apparatus for controlling cache coherency in a virtualization environment in which a plurality of virtual machines share a plurality of cores,
A monitoring unit that generates mapping information between the virtual processors of the virtual machines and the cores and stores the mapping information in a virtual processor map register, generates state information indicating whether a page is shared and stores the state information in a shadow page table, and calculates and stores a residence counter indicating the number of data items of the virtual machine loaded in the local cache; and
A processing unit that, when a cache miss is detected, checks, based on the stored mapping information and the stored state information, whether the page containing the data on which the cache miss was detected is shared by virtual machines, and transmits a consistency message only to cores selected from among the plurality of cores.

17. The apparatus of claim 16,
Wherein the monitoring unit maps the virtual processors allocated to the plurality of virtual machines to the cores and stores the mapping in the virtual processor map register as a snoop domain.

18. The apparatus of claim 16,
Wherein, when the residence counter corresponding to a predetermined core is 0, the monitoring unit deletes the predetermined core from the snoop domain of the stored mapping information.