TW201115474A

TW201115474A - Memory detection method under the non uniform memory access environment

Info

Publication number: TW201115474A
Application number: TW98136071A
Authority: TW
Inventors: Yan Li; Tom Chen
Original assignee: Inventec Corp
Priority date: 2009-10-23
Filing date: 2009-10-23
Publication date: 2011-05-01

Abstract

A memory detection method in a non-uniform memory access (NUMA) environment, comprising the following steps: obtaining the number of nodes in the system under the NUMA environment; creating multiple copies of the execution thread of the memory test program according to the number of nodes; binding each execution thread to a different CPU and running it; and testing the dedicated memory of each CPU in parallel by using the execution threads. This method can significantly reduce test time while increasing test efficiency and test pressure.
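The first three steps of the abstract (counting nodes, creating one thread per node, binding each thread to a CPU) can be sketched in Python on Linux. This is only an illustrative sketch under stated assumptions, not the patented implementation: the sysfs path `/sys/devices/system/node` is the Linux convention and is not mentioned in the source, the names `parse_cpulist`, `discover_nodes` and `run_per_node` are hypothetical, and the per-node test itself is passed in as a placeholder function. Python threads also do not execute bytecode truly in parallel, so a real test tool would more likely be written in native code.

```python
import glob
import os
import threading


def parse_cpulist(text):
    """Expand a Linux cpulist string such as "0-3,8" into [0, 1, 2, 3, 8]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        cpus.extend(range(int(lo), int(hi or lo) + 1))
    return cpus


def discover_nodes(sysfs_root="/sys/devices/system/node"):
    """Return {node_id: [cpu, ...]} for every NUMA node that owns CPUs."""
    nodes = {}
    for path in glob.glob(os.path.join(sysfs_root, "node[0-9]*", "cpulist")):
        # The directory is named "node<N>"; drop the "node" prefix to get N.
        node_id = int(os.path.basename(os.path.dirname(path))[4:])
        with open(path) as f:
            cpus = parse_cpulist(f.read())
        if cpus:  # memory-only nodes have an empty cpulist
            nodes[node_id] = cpus
    if not nodes:  # assumption: no sysfs data means a single-node system
        nodes[0] = list(range(os.cpu_count() or 1))
    return nodes


def run_per_node(test_fn, nodes):
    """Start one copy of test_fn per node, pinned to that node's first CPU."""
    results = {}

    def worker(node_id, cpu):
        try:
            # On Linux, pid 0 means the calling thread.
            os.sched_setaffinity(0, {cpu})
        except (AttributeError, OSError):
            pass  # binding unavailable here; the sketch still runs unpinned
        results[node_id] = test_fn(node_id)

    threads = [threading.Thread(target=worker, args=(n, cpus[0]))
               for n, cpus in sorted(nodes.items())]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

A call such as `run_per_node(my_test, discover_nodes())` would then run one copy of a hypothetical `my_test(node_id)` per node. Keeping a thread on its node's CPU only helps if the memory it tests is also allocated on that node, which the source delegates to a driver and which this sketch does not show.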

Description

VI. Description of the Invention:

[Technical Field of the Invention]

The present invention relates to a memory detection method, and in particular to a memory detection method in a non-uniform memory access (NUMA) environment.

[Prior Art]

Non-uniform memory access (hereinafter NUMA) technology allows many servers to operate as a single system while retaining the ease of programming and management of a small system. On a computer with many central processing units (hereinafter CPUs), NUMA hardware can pair dedicated memory with each CPU, which greatly improves performance. The current hardware trend is to provide multiple system buses, each serving a small group of processors. Each group of processors has its own memory and, possibly, its own input/output (I/O) channels. However, each CPU can still access the memory associated with the other, identically designed groups. Each group is called a NUMA node. The number of CPUs in a NUMA node depends on the hardware vendor. Accessing local memory is faster than accessing memory associated with other NUMA nodes.

On NUMA hardware, some memory regions physically reside on a different bus from other regions. Because NUMA uses both local and foreign memory, accessing some memory regions takes longer than accessing others. "Local memory" and "foreign memory" are usually defined relative to the currently running execution thread. Local memory is memory on the same node as the CPU running that thread. Memory that does not belong to that thread's node is foreign memory, also called "remote memory."

The main advantage of NUMA is scalability. The NUMA architecture was designed to overcome the scalability limits of the symmetric multiprocessing (SMP) architecture. With SMP, all memory accesses are posted to the same shared memory bus. This works well when there are only a few CPUs.
It does not work when dozens or even hundreds of CPUs compete for access to the shared memory bus. NUMA removes this bottleneck by limiting the number of CPUs on any one memory bus and connecting the nodes through a high-speed interconnect.

The conventional memory detection method in a NUMA environment simply uses a driver to allocate physical memory and test it. It does not exploit the hardware characteristics inherent in a NUMA system to optimize the memory test, so there is room to improve both test pressure and test efficiency by optimizing the existing test method.

Please refer to Fig. 1, which is a block diagram of the system on which the prior-art memory detection method in a NUMA environment runs. As shown in the figure, in a NUMA hardware architecture with multiple CPUs, each with its own dedicated memory, the prior-art method uses a single memory test execution thread to traverse and test all physical memory. This approach has several defects or limitations, in particular:

1. It does not use the multi-CPU hardware environment to run as much work as possible in parallel and thereby raise the test pressure on the system.

2. The memory test execution thread is always bound to one fixed CPU. When it tests remote memory (memory dedicated to other CPUs), access speed drops sharply, which seriously degrades test efficiency and test pressure.

[Summary of the Invention]

To solve the problems and defects of the prior art described above, the object of the present invention is to provide a memory detection method in a NUMA environment that reduces test time and greatly increases test pressure.
The memory detection method in a NUMA environment provided by the present invention comprises the following steps: obtaining the number of nodes of the system in the NUMA environment; creating multiple copies of the execution thread of the memory test program according to the number of nodes; binding each execution thread to a different CPU for execution; and using the execution threads to test the memory dedicated to each CPU in parallel.

In this method, the step of using the execution threads to test the memory dedicated to each CPU in parallel may further comprise the following steps: after each execution thread starts running, allocating physical memory through a driver in its own node; mapping the physical memory into the user process space; and applying algorithms designed for various purposes to perform read/write verification on each memory. When all read/write verification results are consistent, the test passes; otherwise an error is reported and the test exits.

In addition, in the method of the present invention, each execution thread may access and test memory in a single direction between CPUs whose mutual access latency, according to the inter-CPU specification, is relatively short, or two CPUs may test each other's memory in a single direction when both are idle.

In summary, the memory detection method in a NUMA environment provided by the present invention creates multiple copies of the execution thread of the memory test program and binds each execution thread to a different CPU, so that the memory dedicated to each CPU is tested in parallel. It therefore has the following advantages over the prior art:

1. Because it exploits the hardware characteristics of the NUMA architecture, the parallelism of the memory test increases greatly, and the higher parallelism in turn raises the test pressure on the system.

2. Because each copy of the memory test execution thread does not access remote memory, test efficiency and test pressure are improved.

The preferred embodiments are described in detail below with reference to the drawings.

[Embodiment]

Please refer to Fig. 2, which is a flowchart of the overall steps of the memory detection method in a NUMA environment according to an embodiment of the present invention. As shown in the figure, the memory detection method in a NUMA environment of the present invention comprises the following steps:

obtaining the number of nodes of the system in the NUMA environment (step 101);
creating multiple copies of the execution thread of the memory test program according to the number of nodes (step 102);
binding each execution thread to a different CPU for execution (step 103); and
using the execution threads to test the memory dedicated to each CPU in parallel (step 104).

As shown in Fig. 3, step 104 of the memory detection method in a NUMA environment according to an embodiment of the present invention may further comprise the following steps:

after each execution thread starts running, allocating physical memory through a driver in its own node (step 1041);
mapping the physical memory into the user process space (step 1042);
applying algorithms designed for various purposes to perform read/write verification on each memory (step 1043);
determining whether the read/write verification results are all consistent (step 1044); and
when all read/write verification results are consistent, passing the test (step 1045), and otherwise reporting an error and exiting the test (step 1046).

In addition, in the memory detection method in a NUMA environment according to an embodiment of the present invention, each execution thread may access and test memory in a single direction between CPUs whose mutual access latency, according to the inter-CPU specification, is relatively short, or two CPUs may test each other's memory in a single direction when both are idle.

Now refer to Fig. 4, which is a block diagram of the system on which the memory detection method in a NUMA environment according to an embodiment of the present invention runs. As shown in the figure, when the memory detection method of the present invention is applied, the test first obtains the number of nodes in the system (a CPU together with its local memory is called a node). It then creates copies of the memory test execution thread 10 according to the number of nodes and binds each execution thread 10 to a CPU for execution. Once the memory test execution threads 10 are running, each first allocates physical memory through the driver in its own node and maps it into the user process space, then applies algorithms designed for various purposes to perform read/write verification on the memory. When all read/write verification results are consistent, the test passes. If any read/write verification result is inconsistent during the test, the system memory has a corresponding quality defect, and the problem is thereby detected.

Therefore, the memory detection method in a NUMA environment of the present invention described above solves the problems of the prior-art memory detection method in a NUMA environment.

Although the present invention is disclosed above by way of the preferred embodiments, they are not intended to limit the invention. Any modification or refinement made by a person skilled in the art without departing from the spirit and scope of the invention falls within the scope of patent protection of the invention. For the scope of protection defined by the invention, please refer to the appended claims.

[Brief Description of the Drawings]

Fig. 1 is a block diagram of the system on which the prior-art memory detection method in a NUMA environment runs;
Fig. 2 is a flowchart of the overall steps of the memory detection method in a NUMA environment according to an embodiment of the present invention;
Fig. 3 is a flowchart of the sub-steps of step 104 in Fig. 2; and
Fig. 4 is a block diagram of the system on which the memory detection method in a NUMA environment according to an embodiment of the present invention runs.

[Description of Main Component Symbols]

10 memory test execution thread
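The read/write verification that each execution thread performs on its own memory (write a pattern, read it back, pass only if every result is consistent) can be illustrated with a short Python sketch. The description does not name the algorithms it uses, so the three patterns below (constant 0x55, constant 0xAA, and the low byte of each offset) are assumptions chosen for illustration. The names `pattern_fill`, `address_pattern`, `verify` and `check_region` are hypothetical, and an anonymous `mmap` region stands in for the driver-allocated physical memory mapped into the process.

```python
import mmap


def pattern_fill(value):
    """Build a pattern that expects the same byte at every offset."""
    def expected(offset):
        return value
    return expected


def address_pattern(offset):
    """Pattern that expects the low byte of the offset, to expose addressing faults."""
    return offset & 0xFF


def verify(buf, expected):
    """Write expected(offset) to every byte, read it back, return mismatching offsets."""
    size = len(buf)
    for off in range(size):
        buf[off] = expected(off)
    return [off for off in range(size) if buf[off] != expected(off)]


DEFAULT_PATTERNS = (pattern_fill(0x55), pattern_fill(0xAA), address_pattern)


def check_region(buf, patterns=DEFAULT_PATTERNS):
    """Return True only if every pattern reads back without a mismatch."""
    return all(not verify(buf, p) for p in patterns)


if __name__ == "__main__":
    # Assumption: an anonymous mapping replaces the driver-provided memory.
    region = mmap.mmap(-1, 4096)
    print("pass" if check_region(region) else "fail: report error and exit")
    region.close()
```

Alternating patterns such as 0x55 and 0xAA exercise every bit in both states, which is why a bit stuck at 1 passes one of them and fails the other. In a full tool, one such check would run per node inside the thread bound to that node's CPU.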

Claims

1. A memory detection method in a non-uniform memory access (NUMA) environment, comprising the following steps:
obtaining the number of nodes of a system in the non-uniform memory access environment;
creating multiple copies of an execution thread of a memory test program according to the number of nodes;
binding each of the execution threads to a different CPU for execution; and
using the execution threads to test, in parallel, the memory dedicated to each of the CPUs.

2. The memory detection method of claim 1, wherein the step of using the execution threads to test the memory dedicated to each of the CPUs in parallel comprises the following steps:
after each execution thread starts running, allocating a physical memory through a driver in its respective node;
mapping the physical memory into a user process space; and
applying algorithms designed for various purposes to perform read/write verification on each memory, wherein when all results of the read/write verification are consistent, the test passes, and otherwise an error is reported and the test exits.

3. The memory detection method of claim 2, wherein each of the execution threads accesses and tests the memory in a single direction between CPUs whose mutual access latency, according to the inter-CPU specification, is relatively short.

4. The memory detection method of claim 2, wherein the execution threads test each other's memory in a single direction when two of the CPUs are idle.
