CN117271230A

CN117271230A - Memory testing method and device, storage medium and electronic equipment

Info

Publication number: CN117271230A
Application number: CN202210681561.7A
Authority: CN
Inventors: 连军委; 黄涛
Original assignee: Changxin Memory Technologies Inc
Current assignee: Changxin Memory Technologies Inc
Priority date: 2022-06-15
Filing date: 2022-06-15
Publication date: 2023-12-22
Also published as: WO2023240719A1

Abstract

The disclosure relates to a memory testing method, a memory testing device, a computer readable storage medium and electronic equipment, and relates to the technical field of integrated circuits. The memory testing method comprises the following steps: determining the local memory of each processor; evenly allocating the local memory to each execution thread of the processor; and testing the allocated local memory in parallel by utilizing each execution thread. The present disclosure provides a method for reducing the time a processor takes to test memory in a NUMA system.

Description

Memory testing method and device, storage medium and electronic equipment

Technical Field

The present disclosure relates to the technical field of integrated circuits, and in particular, to a memory testing method, a memory testing device, a computer readable storage medium, and an electronic apparatus.

Background

Non-uniform memory access (NUMA) techniques can allow numerous servers to behave as a single system while retaining the advantages of a small system for programming and management.

Since a NUMA system is a memory design for multiple processors and the memory capacity attached to each processor is large, it takes a long time for the processors to test the memory. Therefore, reducing the time the processors take to test the memory in a NUMA system is an urgent problem to be solved.

It should be noted that the information disclosed in the above background section is only for enhancing understanding of the background of the present disclosure and thus may include information that does not constitute prior art known to those of ordinary skill in the art.

Disclosure of Invention

The disclosure aims to provide a memory testing method, a memory testing device, a computer readable storage medium and an electronic device, and thereby a way of reducing the time a processor takes to test memory in a NUMA system.

Other features and advantages of the present disclosure will be apparent from the following detailed description, or may be learned in part by the practice of the invention.

According to a first aspect of the present disclosure, there is provided a memory testing method, the method including: determining the local memory of each processor; evenly allocating the local memory to each execution thread of the processor; and testing the allocated local memory in parallel by utilizing each execution thread.

In an exemplary embodiment of the disclosure, the determining the local memory of each processor includes: determining a physical address demarcation point of each processor; and determining the local memory of each processor according to the physical address demarcation point.

In an exemplary embodiment of the disclosure, the determining the physical address demarcation point of each of the processors includes: decoding the address signal by using an address decoder to obtain the physical address demarcation point corresponding to the physical address of the memory.

In an exemplary embodiment of the disclosure, the evenly allocating the local memory to each execution thread of the processor includes: determining the total storage amount of the local memory corresponding to the processor; dividing the total storage amount by the number of the execution threads of the processor to determine the average storage amount of the local memory allocated to each execution thread; and determining the local memory allocated to each execution thread according to the average storage amount.

In one exemplary embodiment of the present disclosure, the number of execution threads is equal to the number of cores of the processor.

In an exemplary embodiment of the present disclosure, the testing the allocated local memory in parallel with each of the execution threads includes: performing read-write verification on the allocated local memory by utilizing each execution thread; when all the read-write verification results are consistent, the test is passed; otherwise, an error is reported and the test ends with a test failure prompt.

In one exemplary embodiment of the present disclosure, the execution threads are used to test the memory of a non-uniform memory access (NUMA) system.

According to a second aspect of the present disclosure, there is provided a memory test apparatus, the apparatus comprising: a local memory determining module, used for determining the local memory of each processor; a memory allocation module, used for evenly allocating the local memory to each execution thread of the processor; and a test module, used for testing the allocated local memory in parallel by utilizing each execution thread.

In an exemplary embodiment of the disclosure, the local memory determining module is configured to determine a physical address demarcation point of each of the processors, and to determine the local memory of each processor according to the physical address demarcation point.

In an exemplary embodiment of the disclosure, the local memory determining module is configured to decode an address signal by using an address decoder to obtain the physical address demarcation point corresponding to a physical address of the memory.

In an exemplary embodiment of the present disclosure, the memory allocation module is configured to determine a total storage amount of the local memory corresponding to the processor; divide the total storage amount by the number of the execution threads of the processor to determine the average storage amount of the local memory allocated to each execution thread; and determine the local memory allocated to each execution thread according to the average storage amount.

In an exemplary embodiment of the present disclosure, the test module is configured to perform read-write verification on the allocated local memory by using each of the execution threads; when all the read-write verification results are consistent, the test is passed; otherwise, an error is reported and the test ends with a test failure prompt.

According to a third aspect of the present disclosure, there is provided a computer readable storage medium having stored thereon a computer program, characterized in that the computer program, when executed by a processor, implements the memory test method described above.

According to a fourth aspect of the present disclosure, there is provided an electronic device comprising: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to execute the memory test method described above via execution of the executable instructions.

The technical scheme provided by the disclosure can comprise the following beneficial effects:

In the exemplary embodiment of the disclosure, on one hand, by determining the local memory of the processor and testing the allocated local memory by using each execution thread of the processor, the processor is prevented from testing and accessing the remote memory, so that the efficiency of memory testing can be improved; on the other hand, the local memory is evenly distributed for each execution thread of the processor, and the evenly distributed local memory is tested in parallel by utilizing each execution thread, so that the test time of each execution thread is the same, the total test time can be shortened, and the efficiency of memory test is further improved.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure. It will be apparent to those of ordinary skill in the art that the drawings in the following description are merely examples of the disclosure and that other drawings may be derived from them without undue effort. In the drawings:

FIG. 1 schematically illustrates a structural comparison diagram of uniform memory access and non-uniform memory access according to an exemplary embodiment of the present disclosure;

FIG. 2 schematically illustrates a node diagram of non-uniform memory access according to an exemplary embodiment of the present disclosure;

FIG. 3 schematically illustrates a comparison of the speeds of accessing different memories in non-uniform memory access according to an exemplary embodiment of the present disclosure;

FIG. 4 schematically illustrates a flow chart of steps of a memory testing method according to an exemplary embodiment of the present disclosure;

FIG. 5 schematically illustrates a flowchart of steps for evenly allocating local memory in a memory testing method according to an exemplary embodiment of the present disclosure;

FIG. 6 schematically illustrates a block diagram of a memory test apparatus according to an exemplary embodiment of the present disclosure;

FIG. 7 schematically illustrates a block diagram of an electronic device according to an exemplary embodiment of the present disclosure.

Detailed Description

Example embodiments will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments can be embodied in many forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The same reference numerals in the drawings denote the same or similar parts, and thus a repetitive description thereof will be omitted.

Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the disclosed aspects may be practiced without one or more of the specific details, or with other methods, components, devices, steps, etc. In other instances, well-known structures, methods, devices, implementations, materials, or operations are not shown or described in detail to avoid obscuring aspects of the disclosure.

The block diagrams depicted in the figures are merely functional entities and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in software, or in one or more hardware modules, or in different networks and/or processor devices and/or microcontroller devices.

A feature of uniform memory access (UMA) is that multiple processors access all available memory in the system through the same bus, as shown for UMA in FIG. 1. Every processor takes the same time to access memory, hence the name uniform memory access.

UMA has the problem that multiple processors access memory via one bus, which increases the load on the shared bus. Multiple processors may contend for the memory controller, causing memory access conflicts. In addition, limited bus bandwidth may introduce access latency.

Under the UMA architecture, as the number of CPUs in a system grows, the front-side bus between the CPUs and the memory controller becomes the bottleneck of system performance. Thus, when the 64-bit x86 architecture was introduced, the non-uniform memory access (NUMA) architecture was adopted.

In contrast to UMA, NUMA is characterized by each processor having its own local memory, as shown for NUMA in FIG. 1. Each processor can also access the local memory of the other processors, which is remote memory from its point of view.

For NUMA, the processors and corresponding local memory may also be divided into different groups, with the processors in a group sharing access to their own local memory (a memory group). When there are multiple groups of processors and memory groups, each group of processors and its corresponding memory group constitutes a NUMA node, as shown in FIG. 2.

It should be noted that both uniform memory access (UMA) and non-uniform memory access (NUMA) belong to the symmetric multiprocessing (SMP) architecture. SMP is among the most common multiprocessor computer architectures today; the multiple processors of an SMP system are homogeneous, using CPUs of the same architecture.

Generally, a processor in a NUMA system can access remote memory when local memory is not available, but a processor accesses its local memory faster than remote memory. As shown in FIG. 3, processor CPU0 accesses its local memory (memory 0) faster than it accesses the local memory of processor CPU1 (memory 1), of processor CPU2 (memory 2), or of processor CPU3 (memory 3), where memory 1, memory 2 and memory 3 are remote memory for processor CPU0. Because a processor accesses remote memory more slowly, accessing remote memory introduces delay, and the access efficiency is noticeably reduced.

In general, in the memory test process of a NUMA system, all memories are traversed and tested by a single memory test thread. That thread stays bound to one processor, so when it tests remote memory the access speed drops sharply, which seriously affects the test efficiency.

Based on this, the exemplary embodiments of the present disclosure provide a NUMA-based memory testing method for testing the memory of a non-uniform memory access (NUMA) system. Referring to FIG. 4, a flowchart of the steps of a memory testing method according to an embodiment of the present disclosure is shown. In one possible implementation, the memory testing method may include:

Step S410, determining the local memory of each processor;

Step S420, evenly allocating the local memory to each execution thread of the processor;

Step S430, testing the allocated local memory in parallel by utilizing each execution thread.

According to the memory testing method provided by the embodiment of the disclosure, on one hand, the local memory of the processor is determined, and the distributed local memory is tested by utilizing each execution thread of the processor, so that the processor is prevented from testing and accessing the remote memory, and the efficiency of memory testing can be improved; on the other hand, the local memory is evenly distributed for each execution thread of the processor, and the evenly distributed local memory is tested in parallel by utilizing each execution thread, so that the test time of each execution thread is the same, the total test time can be shortened, and the efficiency of memory test is further improved.

The memory test method will be described in detail with reference to the following specific embodiments:

In step S410, the local memory of each processor is determined.

Non-uniform memory access (NUMA) architecture refers to multiprocessor systems in which memory access time depends on the relative location of the processor and the memory. In such architectures, memory relatively close to a processor is commonly referred to as local memory, and memory relatively far from the processor is commonly referred to as remote memory.

On the Intel x86 platform, local memory is memory that the CPU can access through the iMC (Integrated Memory Controller) in the uncore (non-core) component. Memory that is not local, i.e. remote memory, must be accessed over a QPI (QuickPath Interconnect) link to the iMC of the CPU to which that memory is local. Memory access performance tests on an Intel Ivy Bridge NUMA platform have shown that the latency of remote memory access is double that of local memory access. Therefore, determining the local memory of each processor contributes significantly to improving the memory test rate.

In an exemplary embodiment of the disclosure, the local memory of each processor is determined based on the physical address demarcation point of each processor. Here, a physical address demarcation point is a pair of adjacent memory physical addresses that belong to different processors. The demarcation points divide the memory physical address space into memory physical address blocks, and different memory physical address blocks belong to different processors.

It should be noted that the physical address of the memory, i.e. the physical address of a memory unit, is determined by the position of that unit on the address bus. Once the machine is assembled, the physical addresses are fixed and are not allocated by the processor (CPU). A physical address is the address loaded into the memory address register and is the real address of the memory cell. The memory addresses transmitted on the front-side bus are all physical addresses, numbered from 0 up to the highest available physical memory address. These numbers are mapped onto the actual memory modules by the northbridge chip.

In practical applications, there may be various ways of determining the physical address demarcation point. In the exemplary embodiment of the present disclosure, when determining the physical address demarcation point of each processor, an address decoder is used to decode an address signal to obtain the physical address demarcation point corresponding to the physical address of the memory.

In practice, the address signal may be captured by a logic analyzer, an instrument that analyzes the logical relationships in a digital system. A logic analyzer is a bus analyzer in the family of data-domain test instruments, i.e. an instrument built around the concept of a (multi-line) bus and used for observing and testing the data flow on a plurality of data lines. Typically, the address signals captured by the logic analyzer are decoded by an address decoder to obtain the memory physical addresses, as shown in Table 1.

TABLE 1

Here, SK stands for Socket, and one Socket corresponds to one physical CPU.

As can be seen from Table 1, physical memory address 0x0000008FFFFFFFFF and all addresses before it belong to CPU00, and the addresses after it belong to CPU01. Physical memory address 0x0000008FFFFFFFFF can therefore be determined as a physical address demarcation point: the memory at and before this demarcation point is local memory of processor CPU00, and the memory after it is local memory of processor CPU01.

Likewise, as can be seen from Table 1, the addresses before physical memory address 0x0000009000000000 all belong to CPU00, and physical memory address 0x0000009000000000 and the addresses after it belong to CPU01. Physical memory address 0x0000009000000000 can therefore be determined as another physical address demarcation point: the memory before this demarcation point is local memory of processor CPU00, and the memory at and after it is local memory of processor CPU01. As Table 1 also shows, physical memory addresses 0x0000008FFFFFFFFF and 0x0000009000000000 are adjacent physical addresses.

In the exemplary embodiment of the disclosure, by determining the physical address demarcation point of each processor, the physical addresses corresponding to each processor can be determined, and thus the local memory belonging to each processor. A physical address demarcation point is a dividing point in the memory address space: it divides the memory into memory blocks, each of which is attributed to one processor. The demarcation point is therefore independent of memory nodes and may even cut across a memory node. As shown in Table 1, the physical memory addresses belong to the same memory module DIMM00, yet the addresses within it are divided among the local memories of different processors. DIMM stands for Dual In-line Memory Module.
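As an illustration of how demarcation points partition the address space, the following Python sketch scans decoded (address, processor) records for neighbouring records whose owner changes, and groups the records into per-processor blocks. The record list and the function names are assumptions made for this example, and the first and last addresses are placeholders; only the boundary pair 0x0000008FFFFFFFFF / 0x0000009000000000 is taken from the discussion of Table 1.

```python
from typing import List, Tuple

Record = Tuple[int, str]  # (physical address, owning processor)


def find_demarcation_points(records: List[Record]) -> List[Tuple[int, int]]:
    """Return (last address of one processor, first address of the next)."""
    ordered = sorted(records)
    points = []
    for (addr_a, cpu_a), (addr_b, cpu_b) in zip(ordered, ordered[1:]):
        if cpu_a != cpu_b:  # the owner changes between neighbouring records
            points.append((addr_a, addr_b))
    return points


def split_into_blocks(records: List[Record]) -> List[Tuple[int, int, str]]:
    """Group the records into (first address, last address, processor) blocks."""
    ordered = sorted(records)
    if not ordered:
        return []
    blocks = []
    start, owner = ordered[0]
    previous = start
    for addr, cpu in ordered[1:]:
        if cpu != owner:
            blocks.append((start, previous, owner))
            start, owner = addr, cpu
        previous = addr
    blocks.append((start, previous, owner))
    return blocks


# Hypothetical decoded records; the first and last addresses are placeholders.
decoded = [
    (0x0000007000000000, "CPU00"),
    (0x0000008FFFFFFFFF, "CPU00"),
    (0x0000009000000000, "CPU01"),
    (0x000000AFFFFFFFFF, "CPU01"),
]

for first, last, cpu in split_into_blocks(decoded):
    print(f"{cpu}: {first:#018x} .. {last:#018x}")
```

Because the blocks are derived from addresses alone, the sketch never needs to know which memory module or memory node an address physically resides on.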

It should be noted that a memory block is a contiguous block of memory addresses as seen by the processor, whereas a memory node is a group of memory modules in the computer's memory topology. When the computer performs memory mapping, it does not distinguish between memory nodes and addresses memory purely by memory address, so addressing may cross memory nodes. In the memory testing method provided by the exemplary embodiment of the disclosure, the memory addresses are divided into memory blocks by determining the physical address demarcation points, so no additional handling of memory nodes is required and the method is not constrained by memory nodes.

In practical applications, the address signal may be captured by an apparatus such as a DDR (Double Data Rate) memory protocol analyzer, in addition to being captured by a logic analyzer, and the exemplary embodiments of the present disclosure are not limited to specific capturing apparatuses.

In step S420, local memory is allocated equally for each execution thread of the processor.

In the exemplary embodiments of the present disclosure, a task that performs memory testing is referred to as an execution thread, and each execution thread is executed by one core of a processor. A processor commonly includes multiple cores, for example 36 cores, each of which performs a memory testing task. A processor may have multiple cores (i.e., a multi-core processor), while a core belongs to only one processor. The core (die) is the most important component of a CPU: it is the raised chip at the center of the CPU package, manufactured from monocrystalline silicon by a certain production process, and all of the CPU's calculation, command receiving/storing and data processing are executed by the core.

In the embodiments of the present disclosure, the number of execution threads is equal to the number of cores of the processor. One processor core corresponds to one execution thread; each execution thread is started for a processor core and is executed to completion by that core.
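On a Linux host, the cores that belong to a NUMA node are typically listed in /sys/devices/system/node/nodeN/cpulist in a form such as 0-17,36-53. The sketch below is an illustration only and is not part of the disclosed method: it parses such a list and derives one execution thread per core. The function names and the sample strings are hypothetical; the core counts 36 and 18 mirror the CPU00 and CPU01 example used later in this description.

```python
from typing import Dict, List


def parse_cpulist(text: str) -> List[int]:
    """Expand a cpulist string such as '0-3,8-11' into a list of core IDs."""
    cores: List[int] = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cores.extend(range(int(low), int(high) + 1))
        else:
            cores.append(int(part))
    return cores


def threads_per_processor(cpulists: Dict[str, str]) -> Dict[str, int]:
    """One execution thread per core: the thread count equals the core count."""
    return {name: len(parse_cpulist(text)) for name, text in cpulists.items()}


# Hypothetical cpulist contents for two processors.
sample = {"CPU00": "0-35", "CPU01": "36-53"}
print(threads_per_processor(sample))
```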

Generally, one processor corresponds to a plurality of local memories; as shown in Table 1, processor CPU00 and processor CPU01 each correspond to a plurality of local memories. Therefore, after the local memory of each processor is determined, the local memory may be allocated to the cores of the processor as needed, that is, allocated to each execution thread.

In the exemplary embodiment of the disclosure, the local memory can be evenly allocated to each execution thread of the processor, so that every execution thread tests the same amount of local memory, that is, every core of the processor has the same amount of local memory to test, and therefore every execution thread takes the same time to execute the memory test. This amounts to spreading the memory test time evenly across the cores of each processor, which shortens the total time of the memory test and improves its efficiency. During the memory test, each processor tests its local memory rather than remote memory, so the delay incurred by testing remote memory is avoided and the efficiency of the memory test is further improved.
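The effect of even allocation on total test time can be sketched with a simple model. Since the threads run in parallel, the test finishes when the slowest thread finishes, so the total time is governed by the largest share. The model below assumes that every thread tests local memory at the same constant rate; the rate of 10 seconds per GB and the 8 GB total are made-up figures used only to compare the three cases.

```python
from typing import Sequence


def parallel_test_time(sizes_gb: Sequence[float], seconds_per_gb: float) -> float:
    """Total time of a parallel test: the slowest thread determines the end."""
    return max(sizes_gb) * seconds_per_gb


RATE = 10.0  # hypothetical seconds needed to test 1 GB of local memory

single_thread = parallel_test_time([8], RATE)        # one thread tests all 8 GB
uneven_split = parallel_test_time([5, 1, 1, 1], RATE)  # four threads, uneven shares
even_split = parallel_test_time([2, 2, 2, 2], RATE)    # four threads, equal shares

print(single_thread, uneven_split, even_split)
```

Under these assumptions the even split gives the shortest total time, because no thread is left waiting for another thread that received a larger share.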

In practical applications, there may be various ways of equally allocating local memory for each execution thread of the processor, and in an exemplary embodiment of the present disclosure, referring to fig. 5, the step of equally allocating local memory for each execution thread of the processor includes:

step S510, determining the total storage capacity of a local memory corresponding to the processor;

step S520, dividing the total storage amount by the number of execution threads of the processor, and determining the average storage amount of the local memory allocated by each execution thread;

step S530, determining the local memory allocated by each execution thread according to the average memory amount.

In the memory test method provided by the exemplary embodiment of the present disclosure, local memory is evenly allocated to each execution thread of a processor. To do so, the total storage amount of the local memory corresponding to the processor is determined and combined with the number of execution threads of the processor to obtain the average storage amount for each execution thread; local memory is then allocated to each execution thread based on this average storage amount.

The following details the steps for evenly allocating local memory by way of example:

Specifically, in step S510, the total storage amount of the local memory corresponding to the processor is determined.

In the exemplary embodiment of the present disclosure, for each processor, the storage amount of each local memory corresponding to the processor may be determined first (the storage amount of each local memory may be different), and then the storage amounts of all the local memories corresponding to the processor are added, so as to obtain the total storage amount of the local memories corresponding to the processor.

Taking the processors CPU00 and CPU01 listed in Table 1 as an example, the local memory corresponding to CPU00 includes: memory corresponding to physical memory addresses from 0x0000007000000000 to 0x0000008FFFFFFFFF; the local memory corresponding to CPU01 includes: memory corresponding to physical memory addresses from 0x0000009000000000 to 0x00000b0000000000.

For the processor CPU00, the storage amounts of the local memories included in the processor CPU00 may be different from each other, and the storage amounts of all the local memories may be added to obtain the total storage amount of the local memories corresponding to the CPU00, for example, 18GB.

For the processor CPU01, the storage amount of each local memory may be different in the local memories included in the processor CPU01, and each storage amount of all the local memories may be added to obtain the total storage amount of the local memories corresponding to the CPU01, for example, 10GB.

The total storage capacity of the local memory corresponding to the processor can be obtained through the mode.
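Step S510 can be sketched as a sum over the local memory blocks of one processor. In the following Python example each block is an inclusive (first address, last address) pair; the three blocks are hypothetical and were chosen only so that they add up to the 18GB mentioned above for CPU00, treating 1 GB as 2^30 bytes.

```python
from typing import List, Tuple

GIB = 1 << 30  # bytes per GB, assuming binary units


def total_local_memory(blocks: List[Tuple[int, int]]) -> int:
    """Sum the sizes in bytes of inclusive (first address, last address) blocks."""
    return sum(last - first + 1 for first, last in blocks)


# Hypothetical local memory blocks of one processor: 8 GB + 8 GB + 2 GB.
cpu00_blocks = [
    (0 * GIB, 8 * GIB - 1),
    (16 * GIB, 24 * GIB - 1),
    (32 * GIB, 34 * GIB - 1),
]

total_bytes = total_local_memory(cpu00_blocks)
print(total_bytes // GIB, "GB")
```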

Next, in step S520, the total memory is divided by the number of execution threads of the processor, and an average memory size of the local memory allocated by each execution thread is determined.

After obtaining the total storage amount of the local memory corresponding to the processor, the local memory may be allocated for each execution thread based on the total storage amount. As described above, the average allocation of the local memory to each execution thread of the processor can shorten the time of the memory test and improve the efficiency of the memory test.

Thus, in the exemplary embodiments of the present disclosure, the local memory may be equally allocated according to the number of execution threads, so that the local memory allocated to each execution thread is the same in size. That is, the local memory may be equally allocated by dividing the total amount of local memory corresponding to the processor by the number of threads executed by the processor. The method is equivalent to determining the average storage capacity of the local memory which can be allocated to each execution thread, and the specific local memory can be allocated based on the average storage capacity.

As noted above, the number of execution threads of a processor equals the number of cores of the processor. Dividing the total storage amount of the local memory by the number of execution threads is therefore equivalent to dividing it by the number of cores, which yields the average storage amount for each core of the processor so that the local memory can be evenly allocated. In general, different processors may have different numbers of cores.

The above-described examples of the processors CPU00 and CPU01 are continued. For the processor CPU00, assuming that it includes 36 cores, the number of execution threads corresponding to the CPU00 is 36. Therefore, the total storage capacity 18GB of the local memory corresponding to the CPU00 may be divided by the number of execution threads 36, so as to determine that the average storage capacity of the local memory required to be allocated by each execution thread of the CPU00 is: 18 GB/36=0.5 GB.

For the processor CPU01, assuming that it contains 18 cores, the number of execution threads corresponding to the CPU01 is 18. Therefore, the total storage amount 10GB of the local memory corresponding to the CPU01 may be divided by the number of execution threads 18 to determine that the average storage amount of the local memory to be allocated to each execution thread of the CPU01 is: 10 GB/18 ≈ 0.556 GB.

It should be noted that the average storage amount is the size of the local memory that needs to be tested for each execution thread, that is, each core of the processor, and does not refer to a specific local memory.
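The division in step S520 can be checked with a few lines of Python. The function name is an assumption of this sketch; the inputs are the example figures given above for CPU00 (18 GB, 36 threads) and CPU01 (10 GB, 18 threads).

```python
def average_storage_gb(total_gb: float, thread_count: int) -> float:
    """Average amount of local memory, in GB, that each execution thread tests."""
    if thread_count <= 0:
        raise ValueError("thread_count must be positive")
    return total_gb / thread_count


cpu00_average = average_storage_gb(18, 36)
cpu01_average = average_storage_gb(10, 18)

print(cpu00_average)            # exact quotient for CPU00
print(round(cpu01_average, 3))  # CPU01 quotient rounded to three decimals
```

For CPU01 the quotient is not a round number, which is why the text quotes it as 0.556 GB; an implementation working in bytes would have to decide how to distribute the remainder.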

In step S530, the local memory allocated by each execution thread is determined according to the average storage amount.

In the exemplary embodiment of the disclosure, after determining the average storage amount of the local memory allocated to each execution thread in each processor, the local memory may be allocated to each execution thread on average based on the average storage amount.

For example, for the processor CPU00, each execution thread may be allocated a local memory of 0.5GB size; for the processor CPU01, each execution thread may be allocated a local memory of 0.556GB in size.

In practical applications, the individual local memories may differ in size, and local memories of different sizes may be allocated to each execution thread so that their total matches the average storage amount. For example, two local memories of 0.2GB each and one local memory of 0.1GB, all corresponding to the processor CPU00, may be allocated to a certain execution thread of CPU00. As another example, two local memories of 0.25GB each and one local memory of 0.056GB, all corresponding to the processor CPU01, may be allocated to a certain execution thread of CPU01.

In the actual allocation process, if there is no local memory of a suitable size, for example, no local memory of 0.056GB, a certain local memory may be partitioned: a 0.056GB portion is allocated to the execution thread, and the remaining portion is allocated to other execution threads. The exemplary embodiments of the present disclosure do not particularly limit the specific allocation manner.
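One possible allocation manner, given only as a sketch, walks through the local memories in order and cuts a memory whenever it does not fit into the remainder of the current thread's share. The function name allocate_blocks, the (start, size) representation, and the use of integer size units are assumptions made for illustration; the disclosure does not prescribe this manner.

```python
def allocate_blocks(blocks, num_threads):
    """Split (start, size) memory blocks into num_threads equal shares.

    Returns one list of (start, size) pieces per thread. A block is
    partitioned whenever it is larger than the room left in the current
    share. Sizes are integers; any remainder of total % num_threads is
    left unallocated in this sketch.
    """
    total = sum(size for _, size in blocks)
    share = total // num_threads
    if share == 0:
        raise ValueError("not enough memory for the requested thread count")
    allocations = [[] for _ in range(num_threads)]
    thread, room = 0, share
    for start, size in blocks:
        while size > 0 and thread < num_threads:
            piece = min(size, room)
            allocations[thread].append((start, piece))
            start += piece
            size -= piece
            room -= piece
            if room == 0:  # current share is full, move to the next thread
                thread, room = thread + 1, share
    return allocations


# Two blocks of 300 units shared by three threads: each thread gets 200 units,
# so both blocks have to be partitioned.
for pieces in allocate_blocks([(0, 300), (300, 300)], 3):
    print(pieces)
```

In this example the second thread receives the tail of the first block and the head of the second block, which corresponds to the partitioning case described above.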

It should be noted that, the above-mentioned CPU00 and CPU01 are only exemplary, and the exemplary embodiments of the present disclosure do not limit the number of execution threads of the processor and the storage amount of the corresponding local memory, etc.

In step S430, the allocated local memories are tested in parallel by the respective execution threads.

In the exemplary embodiment of the disclosure, after the local memory is allocated to each execution thread of the processor, each execution thread may be used to test its allocated local memory. In the test process, the execution threads run their tests in parallel, so that the time of the whole test can be shortened and the test efficiency is improved.

In practical applications, the memory may be tested in various ways, for example, a test with modified memory parameters, a test with modified system parameters, a leakage test, a high-temperature test, a low-temperature test, a normal-temperature test, a high-voltage test, a low-voltage test, and a combined test using different test methods such as test algorithms. In the exemplary embodiment of the present disclosure, the test process of the execution thread is described by taking a read-write test as an example.

In an exemplary embodiment of the disclosure, each execution thread may be utilized to perform read-write verification on the allocated local memory, that is, the execution thread writes test data in the corresponding local memory, and then reads out the written test data after a preset time. The specific preset time may be determined according to actual circumstances, and the exemplary embodiments of the present disclosure are not particularly limited thereto.

After the test data is read out, the read data may be compared with the test data written into the local memory. Only when the read data is consistent with the written test data is the read-write verification result consistent, and the read-write test is passed. For a plurality of execution threads, the test is passed only when all the read-write verification results are consistent; if even one read-write verification result is inconsistent, the test fails, an error may be reported, and a test failure is prompted when the test is completed.

Other memory test types are not specifically described in the exemplary embodiments of the present disclosure; reference may be made to existing test means, and details are not repeated herein.
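The read-write verification and the pass/fail rule can be sketched as follows. This is only an illustration under stated assumptions: a bytearray stands in for physical memory, the test pattern 0xA5 and the names read_write_check and run_parallel_test are hypothetical, and Python threads are used to show the structure rather than to achieve true per-core parallelism.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def read_write_check(memory, start, end, pattern=0xA5, delay_s=0.0):
    """Write `pattern` to memory[start:end], wait, read back, and compare."""
    expected = bytes([pattern]) * (end - start)
    memory[start:end] = expected   # write the test data
    time.sleep(delay_s)            # preset time before reading back
    return bytes(memory[start:end]) == expected


def run_parallel_test(memory, ranges, delay_s=0.0):
    """Test every (start, end) range in its own thread.

    The overall test passes only if every read-write verification result
    is consistent.
    """
    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        futures = [pool.submit(read_write_check, memory, s, e, 0xA5, delay_s)
                   for s, e in ranges]
        results = [f.result() for f in futures]
    return all(results), results


memory = bytearray(4096)  # stand-in for the local memory under test
ranges = [(0, 1024), (1024, 2048), (2048, 3072), (3072, 4096)]
passed, results = run_parallel_test(memory, ranges)
print("test passed" if passed else "test failed")
```

Because the ranges do not overlap, the threads never write to the same addresses, which mirrors each execution thread testing only its own allocated local memory.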

Further, in the exemplary embodiments of the present disclosure, while the processors execute the test, all execution threads of each processor share one set of test code, that is, one code region is shared by a plurality of execution threads, and each execution thread runs the code with its own separate memory stack region, which records the program location that the execution thread has reached. If each execution thread had its own set of test code, every set would occupy part of the memory space. By sharing one set of test code among all execution threads, the memory test method provided by the exemplary embodiment of the present disclosure reduces the memory space occupied by the test code, thereby increasing the size of the testable memory.

The memory test method provided by the exemplary embodiment of the present disclosure is equivalent to dividing the local memory of the processor into different memory blocks, one for each execution thread. After the local memory is allocated to each execution thread according to the average storage amount, each execution thread only needs to be given the starting address and the ending address of its memory block when executing the test program, and the test program starts to test the corresponding memory after acquiring the starting address and the ending address.
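As a sketch of how the starting and ending addresses might be derived, the following assumes that the local memory of a processor is a single contiguous physical range beginning at a base address, and treats 1 GB as 2^30 bytes. The base address 0x0, the exclusive ending address, and the function name thread_ranges are assumptions made for illustration.

```python
GIB = 1 << 30  # assumption: 1 GB is treated as 2**30 bytes


def thread_ranges(base, total_size, num_threads):
    """Return one (start, end) address pair per execution thread.

    `end` is exclusive. Any remainder of total_size % num_threads is left
    untested in this sketch.
    """
    share = total_size // num_threads
    return [(base + i * share, base + (i + 1) * share)
            for i in range(num_threads)]


# CPU00 example: 18 GB of local memory, 36 execution threads.
ranges = thread_ranges(0x0, 18 * GIB, 36)
print([hex(a) for a in ranges[0]])   # ['0x0', '0x20000000']
print([hex(a) for a in ranges[-1]])  # ['0x460000000', '0x480000000']
```

Each pair can then be passed to the shared test program as the only per-thread input.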

It should be noted that, each execution thread of the above processor is mainly used for testing the memory of the non-uniform memory access NUMA system.

In summary, according to the memory test method provided by the embodiment of the present disclosure, in one aspect, the local memory of each processor is delimited by determining the physical address demarcation points, so that the memory addresses are divided into different memory blocks without distinguishing memory nodes; therefore, no additional operation on the memory nodes is required, and the method is not limited by the memory nodes. In another aspect, by determining the local memory of each processor and using each execution thread of the processor to test the allocated local memory, the processor can be prevented from accessing remote memory during the test, so that the efficiency of memory testing can be improved. In still another aspect, by evenly allocating the local memory to each execution thread of the processor and using the execution threads to test the evenly allocated local memory in parallel, which is equivalent to evenly allocating the local memory to each core of the processor and testing it in parallel on each core, the memory test time can be evened out so that the test time of each execution thread is the same, thereby shortening the total test time and further improving the efficiency of memory testing.

It should be noted that although the steps of the method of the present invention are depicted in the drawings in a particular order, this does not require or imply that the steps must be performed in that particular order or that all of the illustrated steps be performed in order to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step to perform, and/or one step decomposed into multiple steps to perform, etc.

In addition, in the present exemplary embodiment, a memory test apparatus is also provided. Referring to fig. 6, the memory test apparatus 600 may include: a local memory determination module 610, a memory allocation module 620, and a test module 630, wherein:

the local memory determining module 610 may be configured to determine a local memory of each processor;

the memory allocation module 620 may be configured to evenly allocate the local memory to each execution thread of the processor;

the test module 630 may be configured to test the allocated local memory in parallel using each execution thread.

In an exemplary embodiment of the present disclosure, the local memory determination module 610 may be configured to determine a physical address demarcation point of each processor, and to determine the local memory of each processor according to the physical address demarcation point.

In an exemplary embodiment of the present disclosure, the local memory determination module 610 may be configured to decode an address signal with an address decoder to obtain a physical address demarcation point corresponding to a physical address of a memory.

In an exemplary embodiment of the present disclosure, the memory allocation module 620 may be configured to determine a total storage amount of the local memory corresponding to the processor; divide the total storage amount by the number of execution threads of the processor to determine the average storage amount of the local memory to be allocated to each execution thread; and determine the local memory allocated to each execution thread according to the average storage amount.

In one exemplary embodiment of the present disclosure, the number of threads of execution is equal to the number of cores of the processor.

In an exemplary embodiment of the present disclosure, the test module 630 may be configured to perform read-write verification on the allocated local memory by using each execution thread; when all the read-write verification results are consistent, the test is passed; otherwise, an error is reported, and a test failure is prompted when the test is completed.

In one exemplary embodiment of the present disclosure, a thread of execution is used to test memory of a non-uniform memory access NUMA system.
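The cooperation of the three modules can be sketched as a single class. This is a hypothetical composition for illustration only: the class name MemoryTestDevice, the representation of demarcation points as a sorted list of addresses, the assumption that consecutive demarcation points bound one processor's local memory, and the bytearray standing in for physical memory are not taken from the disclosure.

```python
from concurrent.futures import ThreadPoolExecutor


class MemoryTestDevice:
    """Hypothetical composition of modules 610, 620 and 630."""

    def __init__(self, memory, demarcation_points, threads_per_processor):
        self.memory = memory                      # stand-in for physical memory
        self.points = sorted(demarcation_points)  # physical address demarcation points
        self.threads = threads_per_processor      # execution threads of each processor

    def determine_local_memory(self):
        # Module 610: assume consecutive demarcation points bound the
        # local memory of one processor.
        return list(zip(self.points, self.points[1:]))

    def allocate(self, local_range, num_threads):
        # Module 620: divide the total storage amount by the thread count.
        start, end = local_range
        share = (end - start) // num_threads
        return [(start + i * share, start + (i + 1) * share)
                for i in range(num_threads)]

    def _verify(self, start, end):
        # Read-write verification of one allocated range.
        expected = bytes([0x5A]) * (end - start)
        self.memory[start:end] = expected
        return bytes(self.memory[start:end]) == expected

    def test(self):
        # Module 630: test all allocated ranges with one thread per range.
        jobs = []
        for local_range, n in zip(self.determine_local_memory(), self.threads):
            jobs.extend(self.allocate(local_range, n))
        with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
            results = list(pool.map(lambda r: self._verify(*r), jobs))
        return all(results)


# Scaled-down CPU00 / CPU01 example: 180 and 100 bytes, 36 and 18 threads.
device = MemoryTestDevice(bytearray(280), [0, 180, 280], [36, 18])
print(device.determine_local_memory())
print(device.test())
```

Since 100 is not divisible by 18, the last few bytes of the second range remain untested in this sketch; a real allocation manner would need to assign that remainder to some execution thread.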

The specific details of each module of the memory test device have been described in detail in the corresponding memory testing method, and therefore are not repeated herein.

It should be noted that although several modules or units of the memory test device are mentioned in the above detailed description, this division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit in accordance with embodiments of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.

In an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above method is also provided.

Those skilled in the art will appreciate that the various aspects of the invention may be implemented as a system, method, or program product. Accordingly, aspects of the invention may be embodied in the following forms, namely: an entirely hardware embodiment, an entirely software embodiment (including firmware, micro-code, etc.), or an embodiment combining hardware and software aspects, which may be collectively referred to herein as a "circuit," "module," or "system."

An electronic device 700 according to this embodiment of the invention is described below with reference to fig. 7. The electronic device 700 shown in fig. 7 is merely an example, and should not be construed as limiting the functionality and scope of use of embodiments of the present invention.

As shown in fig. 7, the electronic device 700 is embodied in the form of a general purpose computing device. Components of electronic device 700 may include, but are not limited to: the at least one processing unit 710, the at least one storage unit 720, a bus 730 connecting the different system components (including the storage unit 720 and the processing unit 710), and a display unit 740.

The storage unit 720 stores program code that is executable by the processing unit 710, such that the processing unit 710 performs the steps according to various exemplary embodiments of the present invention described in the above "exemplary method" section of the present specification. For example, the processing unit 710 may perform step S410 shown in fig. 2, determining the local memory of each processor; step S420, evenly allocating the local memory to each execution thread of the processor; and step S430, testing the allocated local memory in parallel by using each execution thread.

The storage unit 720 may include readable media in the form of volatile memory units, such as Random Access Memory (RAM) 7201 and/or cache memory 7202, and may further include Read Only Memory (ROM) 7203.

The storage unit 720 may also include a program/utility 7204 having a set (at least one) of program modules 7205, such program modules 7205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.

Bus 730 may be a bus representing one or more of several types of bus structures including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.

The electronic device 700 may also communicate with one or more external devices 770 (e.g., keyboard, pointing device, bluetooth device, etc.), one or more devices that enable a user to interact with the electronic device 700, and/or any device (e.g., router, modem, etc.) that enables the electronic device 700 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 750. Also, electronic device 700 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network, such as the Internet, through network adapter 760. As shown, network adapter 760 communicates with other modules of electronic device 700 over bus 730. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with electronic device 700, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.

From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, including several instructions to cause a computing device (may be a personal computer, a server, a terminal device, or a network device, etc.) to perform the method according to the embodiments of the present disclosure.

In an exemplary embodiment of the present disclosure, a computer-readable storage medium having stored thereon a program product capable of implementing the method described above in the present specification is also provided. In some possible embodiments, the various aspects of the invention may also be implemented in the form of a program product comprising program code for causing a terminal device to carry out the steps according to the various exemplary embodiments of the invention as described in the "exemplary methods" section of this specification, when said program product is run on the terminal device.

A program product for implementing the above-described method according to an embodiment of the present invention may employ a portable compact disc read-only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present invention is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium would include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

The computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of remote computing devices, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., connected via the Internet using an Internet service provider).

Furthermore, the above-described drawings are only schematic illustrations of processes included in the method according to the exemplary embodiment of the present invention, and are not intended to be limiting. It will be readily appreciated that the processes shown in the above figures do not indicate or limit the temporal order of these processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, for example, among a plurality of modules.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims

1. A memory testing method, the method comprising:

determining the local memory of each processor;

evenly allocating the local memory to each execution thread of the processor;

and testing the allocated local memory in parallel by utilizing each execution thread.

2. The method of claim 1, wherein determining the local memory of each processor comprises:

determining a physical address demarcation point of each processor;

and determining the local memory of each processor according to the physical address demarcation point.

3. The method of claim 2, wherein said determining a physical address demarcation point for each of said processors comprises:

and decoding the address signal by using an address decoder to obtain the physical address demarcation point corresponding to the physical address of the memory.

4. The method according to any one of claims 1-3, wherein said evenly allocating the local memory to each execution thread of the processor comprises:

determining the total storage capacity of the local memory corresponding to the processor;

dividing the total storage amount by the number of the execution threads of the processor to determine the average storage amount of the local memory to be allocated to each execution thread;

and determining the local memory allocated to each execution thread according to the average storage amount.

5. The method of claim 4, wherein the number of threads of execution is equal to the number of cores of the processor.

6. The method of claim 1, wherein said testing the allocated local memory with each of the execution threads in parallel comprises:

performing read-write verification on the allocated local memory by utilizing each execution thread;

when all the read-write verification results are consistent, the test is passed;

otherwise, reporting an error, and prompting a test failure when the test is completed.

7. The method of claim 1, wherein the execution thread is used to test memory of a non-uniform memory access (NUMA) system.

8. A memory test device, the device comprising:

the local memory determining module is used for determining the local memory of each processor;

the memory allocation module is used for evenly allocating the local memory to each execution thread of the processor;

and the test module is used for testing the allocated local memory in parallel by utilizing each execution thread.

9. The apparatus of claim 8, wherein the local memory determination module is configured to determine a physical address demarcation point for each of the processors, and to determine the local memory of each processor according to the physical address demarcation point.

10. The apparatus of claim 9, wherein the local memory determination module is configured to decode an address signal using an address decoder to obtain the physical address demarcation point corresponding to a physical address of the memory.

11. The apparatus according to any one of claims 8-10, wherein the memory allocation module is configured to determine a total storage amount of the local memory corresponding to the processor; divide the total storage amount by the number of the execution threads of the processor to determine the average storage amount of the local memory to be allocated to each execution thread; and determine the local memory allocated to each execution thread according to the average storage amount.

12. The apparatus of claim 11, wherein the number of threads of execution is equal to the number of cores of the processor.

13. The apparatus of claim 8, wherein the test module is configured to perform read-write verification on the allocated local memory using each of the execution threads; when all the read-write verification results are consistent, the test is passed; otherwise, an error is reported, and a test failure is prompted when the test is completed.

14. The apparatus of claim 8, wherein the execution thread is used to test memory of a non-uniform memory access (NUMA) system.

15. A computer readable storage medium having stored thereon a computer program, which when executed by a processor implements the memory testing method of any of claims 1-7.

16. An electronic device, comprising:

a processor; and

a memory for storing executable instructions of the processor;

wherein the processor is configured to perform the memory testing method of any of claims 1-7 via execution of the executable instructions.