CN115586999A - Nonvolatile testing method and device for persistent memory, electronic equipment and storage medium - Google Patents

Nonvolatile testing method and device for persistent memory, electronic equipment and storage medium

Info

Publication number
CN115586999A
Authority
CN
China
Prior art keywords
current health
power
state monitoring
health state
memory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211296525.5A
Other languages
Chinese (zh)
Inventor
庞潇
李含聪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd filed Critical Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202211296525.5A priority Critical patent/CN115586999A/en
Publication of CN115586999A publication Critical patent/CN115586999A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/22 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/2205 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing using arrangements specific to the hardware being tested
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/22 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/2273 Test methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/22 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/26 Functional testing

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)

Abstract

Embodiments of the invention provide a nonvolatile testing method and device for a persistent memory, an electronic device, and a storage medium. The persistent memory is arranged in a server that has a Basic Input Output System (BIOS). The nonvolatile testing method of the persistent memory comprises the following steps: acquiring a first current health state monitoring value when the BIOS starts the asynchronous refreshing function of the dynamic random access memory; acquiring a second current health state monitoring value upon power-on restart; recording the second current health state monitoring value; repeating the step of acquiring a second current health state monitoring value upon power-on restart until a plurality of second current health state monitoring values meeting a preset number condition have been recorded; and comparing the first current health state monitoring value with the plurality of second current health state monitoring values to determine the non-volatility of the persistent memory. Because the test is built on the nonvolatile characteristics of the persistent memory itself, the nonvolatile test result is more accurate.

Description

Nonvolatile testing method and device for persistent memory, electronic equipment and storage medium
Technical Field
The present invention relates to the field of persistent memory testing technologies, and in particular, to a persistent memory nonvolatile testing method, a persistent memory nonvolatile testing apparatus, an electronic device, and a storage medium.
Background
As one of the most important components of a server, the memory is the bridge between the Central Processing Unit (CPU) and external storage devices. It temporarily holds the operation data of the CPU and so meets the server's need to read and write data. However, conventional memory cannot retain data without power: when the server is suddenly powered off, the data being read and written is lost at the moment of power failure, which can cause huge losses to the customer. For this reason, persistent memory (PMem) products based on the 3D XPoint storage medium have emerged. Persistent memory does not lose data when power is lost: at the moment of power-off, the read-write data in the memory can be stored in the data module through the flash memory module, and when the server is powered on again, the data can be read back into the memory to complete data interaction with the CPU.
In addition to being nonvolatile on power-down, persistent memory products offer high capacity, low cost, and low latency, and can serve a variety of application scenarios. Accordingly, persistent memory can be configured in several modes, such as App Direct mode, Memory mode, and MIX (hybrid) mode. The power-down nonvolatile function depends on App Direct mode. In this mode the persistent memory can be regarded as a solid-state disk; after it is partitioned and mounted, the data read and written in memory can be preserved when the server is powered off. Power-down protection is therefore the most important function of persistent memory, and the need to test and verify it is self-evident. In the prior art, however, nonvolatile testing of persistent memory relies on outdated principles and methods borrowed from traditional hard disk testing. It makes no use of technology specific to persistent memory, so the test principle is poorly matched to the device and the accuracy of the test result is low.
Disclosure of Invention
In view of the above problems, embodiments of the present invention are proposed to provide a persistent memory nonvolatile test method, a persistent memory nonvolatile test apparatus, an electronic device, and a storage medium that overcome the above problems or at least partially solve the above problems.
In a first aspect of the present invention, an embodiment of the present invention discloses a nonvolatile testing method for a persistent memory, where the persistent memory is disposed in a server, and the server has a BIOS, and the method includes:
when the BIOS starts the asynchronous refreshing function of the dynamic random access memory, acquiring a first current health state monitoring value;
when the power is turned on and restarted, a second current health state monitoring value is obtained;
recording the second current health state monitoring value;
repeatedly executing the step of obtaining second current health state monitoring values when the power is on and restarted until a plurality of second current health state monitoring values meeting the preset number condition are recorded;
and comparing the first current health state monitoring value with the plurality of second current health state monitoring values to determine the nonvolatile property of the persistent memory.
Optionally, a power supply unit is further disposed in the server, and before the step of obtaining the second current health status monitoring value when the server is powered on and restarted, the method further includes:
controlling the power supply unit to be powered off so as to power off the server.
Optionally, the step of obtaining the first current health status monitoring value includes:
and accessing the memory firmware state register by adopting a frame information structure FIS protocol to acquire a first current health state monitoring value.
Optionally, the step of accessing the memory firmware status register by using a frame information structure FIS protocol to obtain the first current health status monitoring value includes:
opening an operating system by adopting the FIS protocol, wherein the operating system comprises the memory firmware state register and a management tool ipmctl;
reading a first power-on and power-off period and a first abnormal power-off count in the memory firmware state register by using the management tool ipmctl;
and determining the first power-on and power-off period and the first abnormal power-off count as the first current health state monitoring value.
Optionally, the step of obtaining the second current health status monitoring value during power-on restart includes:
when the power-on is restarted, the memory firmware state register is self-checked, and the last shutdown state data is read;
and determining the last shutdown state data as the second current health state monitoring value.
Optionally, the step of determining that the last shutdown state data is the second current health state monitoring value includes:
reading a second power-on and power-off period and a second abnormal power-off count in the last power-off state data by using the management tool ipmctl;
and determining the second power-on and power-off period and the second abnormal power-off count as the second current health state monitoring value.
Optionally, the step of comparing the first current health status monitored value with the plurality of second current health status monitored values to determine persistent memory non-volatility comprises:
judging whether the first power-on and power-off period and the second power-on and power-off period are in an increasing relationship;
when the first power-on and power-off period and the second power-on and power-off period are in an increasing relationship, judging whether the first abnormal power-off count is the same as the second abnormal power-off count;
when the first abnormal power-off count is the same as the second abnormal power-off count, determining that the non-volatility of the persistent memory is in a normal state;
and when the first power-on and power-off period and the second power-on and power-off period are not in an increasing relationship, or the first abnormal power-off count is different from the second abnormal power-off count, determining that the non-volatility of the persistent memory is in a fault state.
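As a minimal sketch of the comparison rule above, the following Python function applies it to one first value and several second values. The names `HealthValue` and `is_nonvolatility_normal` are illustrative only, and reading "increasing relationship" as strictly increasing from one reading to the next is an assumption rather than something the text specifies.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class HealthValue:
    """One current health state monitoring value (hypothetical container)."""
    power_cycle: int          # power-on and power-off period value
    abnormal_power_offs: int  # abnormal power-off count


def is_nonvolatility_normal(first: HealthValue,
                            seconds: Sequence[HealthValue]) -> bool:
    """Return True for the normal state, False for the fault state."""
    if not seconds:
        raise ValueError("at least one second monitoring value is required")
    previous = first.power_cycle
    for value in seconds:
        # Assumption: the period must grow on every power-on restart.
        if value.power_cycle <= previous:
            return False
        # The abnormal power-off count must match the first value.
        if value.abnormal_power_offs != first.abnormal_power_offs:
            return False
        previous = value.power_cycle
    return True
```

Under this reading, a single second value that fails either check is enough to report the fault state.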
In a second aspect of the present invention, an embodiment of the present invention discloses a nonvolatile testing apparatus for a persistent memory, where the persistent memory is disposed in a server, and the server has a BIOS, and the apparatus includes:
the first acquisition module is used for acquiring a first current health state monitoring value when the BIOS starts the asynchronous refreshing function of the dynamic random access memory;
the second acquisition module is used for acquiring a second current health state monitoring value when the power is on and restarted;
the recording module is used for recording the second current health state monitoring value;
the third acquisition module is used for repeatedly executing the step of acquiring the second current health state monitoring values when the power-on restart is carried out until a plurality of second current health state monitoring values meeting the preset quantity condition are recorded;
and the comparison module is used for comparing the first current health state monitoring value with the plurality of second current health state monitoring values to determine the nonvolatile property of the persistent memory.
In a third aspect of the present invention, an embodiment of the present invention discloses an electronic device, which includes a processor, a memory, and a computer program stored on the memory and capable of running on the processor, and when the computer program is executed by the processor, the steps of the persistent memory nonvolatile testing method described above are implemented.
In a fourth aspect of the present invention, an embodiment of the present invention discloses a computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the steps of the persistent memory nonvolatile testing method described above.
The embodiment of the invention has the following advantages:
the embodiment of the invention obtains a first current health state monitoring value when the BIOS starts the asynchronous refreshing function of the dynamic random access memory; when the power is on and restarted, a second current health state monitoring value is obtained; recording the second current health state monitoring value; repeatedly executing the step of obtaining second current health state monitoring values when the power-on is restarted until a plurality of second current health state monitoring values meeting the preset quantity condition are recorded; and comparing the first current health state monitoring value with the plurality of second current health state monitoring values to determine the nonvolatile property of the persistent memory. By acquiring data when the server restarts the self-check and comparing the data with the initial data, the testing step is simpler, and the data acquired in the self-check restarting state of the server can represent the data stored before power failure, so that the data acquired by testing is more in line with the nonvolatile characteristic of the persistent memory, and the nonvolatile testing result is more accurate.
Drawings
FIG. 1 is a flowchart illustrating steps of a method for testing non-volatility of a persistent memory according to an embodiment of the present invention;
FIG. 2 is a flow chart of steps in another embodiment of a persistent memory nonvolatile test method of the present invention;
FIG. 3 is a flowchart illustrating exemplary steps of a method for testing non-volatility of persistent memory according to the present invention;
FIG. 4 is a block diagram of an embodiment of a persistent memory nonvolatile testing apparatus according to the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
In the existing testing method for persistent memory, the power-down protection function is tested and verified by imitating the method used to test the data storage function of a traditional hard disk. After the persistent memory is partitioned and mounted, a large file is stored on the persistent memory disk, its MD5 (Message-Digest Algorithm 5) code is recorded, and the server is powered off directly. After a period of time the server is started, the system is entered, and the persistent memory disk is mounted again. The tester checks whether the large file still exists and performs MD5 verification; if the codes match, the persistent memory is considered to have lost no data. These steps are repeated many times. If every run passes, the data protection function of the persistent memory is considered normal, that is, the non-volatility of the persistent memory is in a normal state; otherwise the persistent memory is in a fault state. This existing method requires mounting, partitioning, and MD5 verification of the persistent memory before and after each power failure, so the test steps are repetitive and complicated to operate and waste manpower and time. Its test principle and method are also outdated: they depend on traditional hard disk testing rather than on technology specific to persistent memory, so the accuracy of the test result is low. Embodiments of the invention are proposed to address this.
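For reference, the MD5 check performed by the existing method could be sketched as follows. This uses Python's standard `hashlib` module; the function names `md5_of_file` and `file_survived` are hypothetical, and the path would be whatever large file was stored on the mounted persistent memory disk.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20) -> str:
    """Compute the MD5 code of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def file_survived(path, expected_md5: str) -> bool:
    """After restart: the file must exist and its MD5 code must match."""
    candidate = Path(path)
    return candidate.is_file() and md5_of_file(candidate) == expected_md5
```

The sketch covers only the verification step; partitioning and mounting before and after each power failure remain manual, which is the overhead the paragraph above describes.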
Referring to FIG. 1, a flowchart illustrating steps of an embodiment of a nonvolatile testing method for persistent memory according to the present invention is shown. In the embodiment of the invention, the persistent memory is arranged in a server. The server may specifically be a big data server, a storage database server, or the like, which is not limited in the embodiment of the present invention. The server has a basic input output system (BIOS). As the lowest-level and most direct manager of hardware setup and control on the server motherboard, the BIOS provides the server with simple and easily accessible functions. The BIOS is a set of programs stored on a ROM (Read-Only Memory) chip on the server motherboard. It holds the computer's most important basic input and output programs, the system setting information, the power-on self-test program, and the system self-start program, and its main function is to provide the lowest-level and most direct hardware setting and control for the computer. The nonvolatile testing method of the persistent memory specifically comprises the following steps:
Step 101, when the BIOS starts the asynchronous refreshing function of the dynamic random access memory, a first current health state monitoring value is obtained;
In asynchronous refresh of the DRAM, access operations and refresh operations alternate row by row, and the refresh is arranged in the decoding phase so that the dead time is shorter. The server may enter the BIOS to turn on the Asynchronous DRAM Refresh (ADR) function.
When the BIOS starts the asynchronous refreshing function of the dynamic random access memory, the BIOS interface is entered and the current health state monitoring value of the persistent memory is obtained from the BIOS as the first current health state monitoring value. The current health state monitoring value comprises the values of the various parameters of the persistent memory under test, including but not limited to historical error counts, current CPU utilization, current memory granule capacity, current temperature values, current error counts, abnormal shutdown counts, and power-on and power-off periods. The first current health state monitoring value serves as the baseline comparison data for the persistent memory test.
Step 102, acquiring a second current health state monitoring value when power is on and restarted;
After the first current health state monitoring value is obtained, the server is powered off and then restarted. While the server is restarting the BIOS system, the current health state monitoring value of the persistent memory at that moment is obtained as the second current health state monitoring value. Because the value is read while the BIOS system is restarting, it reflects the data that the persistent memory retained from before the last shutdown, before any other recovery operation has been performed. The second current health state monitoring value is therefore obtained under conditions that actually exercise non-volatility, and the subsequent test result closely reflects the characteristics of the persistent memory.
Step 103, recording the second current health state monitoring value;
After the second current health state monitoring value is obtained, it can be recorded and stored in a designated storage space so that it can be read later to judge whether the non-volatility of the persistent memory is normal. It should be noted that the storage space may be a storage space of the server itself, or a third-party storage space connected to the server, which is not limited in the embodiment of the present invention.
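One possible way to record each value in a designated storage space is an append-only log file, sketched below. The JSON-lines layout, the function names, and the dictionary keys are assumptions made for illustration; the text does not prescribe a storage format.

```python
import json
from pathlib import Path


def record_value(log_path, value: dict) -> None:
    """Append one second current health state monitoring value to the log."""
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(value, sort_keys=True) + "\n")


def load_values(log_path) -> list:
    """Read back every recorded value, oldest first."""
    log_file = Path(log_path)
    if not log_file.exists():
        return []
    with open(log_file, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

Appending rather than rewriting keeps earlier readings intact across the repeated power failures the test deliberately causes, provided the log lives on storage that itself survives them.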
Step 104, repeatedly executing the step of obtaining second current health state monitoring values when the power is on and restarted until a plurality of second current health state monitoring values meeting the preset number condition are recorded;
After the first current health state monitoring value and a second current health state monitoring value have been obtained, recorded, and stored, each time the server is subsequently powered off and then powered on and restarted, the current health state monitoring value of the persistent memory at that moment is obtained as a new second current health state monitoring value and is recorded and stored. This step is executed on every power-on restart after a power failure, so that a plurality of second current health state monitoring values are obtained. The step is repeated until the number of second current health state monitoring values obtained meets the number condition, at which point the step stops and the values already obtained are kept. The number condition may be that the number of second current health state monitoring values reaches a preset number threshold. The specific preset number threshold can be chosen according to the required test precision and accuracy. The larger the threshold, the more second current health state monitoring values must be acquired, the higher the test precision and accuracy, and the longer the test takes. Conversely, the smaller the threshold, the fewer values must be acquired, the lower the test precision and accuracy, and the shorter the test. A person skilled in the art can determine the specific preset number threshold, and thus the number condition, according to the test requirements for the persistent memory.
In practical applications, test accuracy, test precision, and test duration need to be balanced. The preset number threshold may be 10 to 20, which keeps the accuracy, precision, and duration of the persistent memory test within a reasonable range. For example, when the preset number threshold is 20, the number of second current health state monitoring values must reach 20. That is, after the first of these values is obtained in step 102, the acquisition step must be repeated a further 19 times, yielding the second, third, and subsequent second current health state monitoring values on successive power-on restarts. Once the twentieth value has been obtained, the preset number condition is determined to be met and the acquisition step is no longer repeated.
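The repetition until the preset number threshold is reached can be sketched as a simple loop. In this illustration, `power_cycle_and_read` stands for whatever procedure powers the server off, restarts it, and returns one second current health state monitoring value; it and `make_fake_reader` are hypothetical stand-ins, not part of the described method.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def collect_second_values(power_cycle_and_read: Callable[[], T],
                          threshold: int) -> List[T]:
    """Repeat the acquisition step until `threshold` values are recorded."""
    if threshold < 1:
        raise ValueError("the preset number threshold must be at least 1")
    recorded: List[T] = []
    while len(recorded) < threshold:
        recorded.append(power_cycle_and_read())
    return recorded


def make_fake_reader(start: int) -> Callable[[], int]:
    """Stand-in for a real server: each call simulates one more restart."""
    state = {"cycles": start}

    def read() -> int:
        state["cycles"] += 1
        return state["cycles"]

    return read
```

With a threshold of 20, the reader is called 20 times in total: once for the first second value and 19 further times, matching the example above.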
Step 105, comparing the first current health status monitor value and the second current health status monitor values to determine persistent memory non-volatility.
In the embodiment of the invention, after the first current health state monitoring value and the plurality of second current health state monitoring values are obtained, the first current health state monitoring value is compared with each of the second current health state monitoring values, and whether the non-volatility of the persistent memory is normal is determined according to the comparison result.
The embodiment of the invention obtains a first current health state monitoring value when the BIOS starts the asynchronous refreshing function of the dynamic random access memory; when the power is on and restarted, a second current health state monitoring value is obtained; recording the second current health state monitoring value; repeatedly executing the step of obtaining second current health state monitoring values when the power-on is restarted until a plurality of second current health state monitoring values meeting the preset quantity condition are recorded; and comparing the first current health state monitoring value with the plurality of second current health state monitoring values to determine the nonvolatile property of the persistent memory. By acquiring data when the server restarts the self-check and comparing the data with the initial data, the testing step is simpler, and the data acquired in the self-check restarting state of the server can represent the data stored before power failure, so that the data acquired by testing is more in line with the nonvolatile characteristic of the persistent memory, and the nonvolatile testing result is more accurate.
Referring to FIG. 2, a flow chart of steps of another embodiment of a persistent memory nonvolatile testing method of the present invention is shown. In the embodiment of the invention, the persistent memory is arranged in the server. The server is embodied as a big data server. The server has a basic input output system BIOS. The server is further provided with a Power Supply Unit (PSU), which is an electric energy conversion type power supply in the server, and is used for converting standard alternating current of, for example, 220 volts and 50 hertz into low-voltage stable direct current, and providing direct current for other hardware (such as a motherboard) in the server to use, so that the hardware in the server operates.
The nonvolatile testing method of the persistent memory specifically comprises the following steps:
Step 201, when the BIOS starts an asynchronous refresh function of a dynamic random access memory, acquiring a first current health status monitoring value;
While the server is running, the ADR function is turned on in the BIOS. When data is collected from the registers in the BIOS, the data currently stored in the memory firmware state register (PMem FW (firmware) state register), which holds the monitoring data corresponding to the persistent memory, is acquired and determined as the first current health state monitoring value.
In an optional embodiment of the present invention, the step of obtaining the first current health status monitoring value comprises the following sub-steps:
Substep S2011, accessing the memory firmware state register by adopting a frame information structure FIS protocol to acquire a first current health state monitoring value.
The frame information structure FIS protocol is a mechanism for transmitting information between a host and a device. Using the FIS protocol, the memory firmware state register in the BIOS system can be accessed and data can be read from it to acquire the first current health state monitoring value.
Specifically, the step of accessing the memory firmware status register by using a frame information structure FIS protocol to obtain the first current health status monitoring value includes:
Substep S20111, opening an operating system by using the FIS protocol, wherein the operating system includes the memory firmware state register and a management tool ipmctl;
In practical applications, the FIS protocol may first be used to open the operating system (OS) of the BIOS and enter the OS interface. The operating system comprises the memory firmware state register and the management tool ipmctl.
Substep S20112, reading a first power-on and power-off period and a first abnormal power-off count in the memory firmware state register by using the management tool ipmctl;
The management tool ipmctl is started, and the first power-on and power-off period and the first abnormal power-off count currently stored in the memory firmware state register are read through ipmctl. The first power-on and power-off period is the power-on and power-off period stored in the memory firmware state register when the register is read for the first time. The period may be measured at a granularity of seconds, which is not limited in the embodiment of the present invention. The first abnormal power-off count is the abnormal power-off count stored in the memory firmware state register when the register is read for the first time.
Substep S20113, determining the first power-on and power-off period and the first abnormal power-off count as the first current health state monitoring value.
After the first power-on/off period and the first abnormal power-off count are read, the first power-on/off period and the first abnormal power-off count can be determined as a first current health state monitoring value to be used as a basis for non-volatile judgment of the persistent memory.
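As an illustration of turning the two readings into a monitoring value, the sketch below parses text that has already been reduced to `Name=Value` lines. That input format and the field names `PowerCycle` and `AbnormalPowerOff` are placeholders chosen for this example; they are not confirmed ipmctl output and would need to be mapped to whatever the tool actually reports.

```python
def parse_health_value(text: str) -> dict:
    """Extract the two fields that make up a health state monitoring value."""
    fields = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip lines that are not Name=Value pairs
        name, _, raw = line.partition("=")
        fields[name.strip()] = raw.strip()
    # Placeholder field names; a KeyError signals that a reading is missing.
    return {
        "power_cycle": int(fields["PowerCycle"]),
        "abnormal_power_offs": int(fields["AbnormalPowerOff"]),
    }
```

The same function could serve for the first value and for every later second value, so that all readings are compared in one common form.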
Step 202, controlling the power supply unit to power off so as to power off the server;
Once the first current health state monitoring value has been obtained, the data of the persistent memory before the power failure is available. The power supply unit of the server can then be controlled to power off. Powering off the power supply unit powers off the server, so that the persistent memory has to preserve its data through the loss of power.
Further, the power-off of the power supply unit may be controlled purely by hardware or by software. For example, the purely hardware manner may be to physically disconnect the power supply unit from the server. In the software manner, software may drive hardware that operates a switch connecting the power supply unit to the server, and the power supply unit is powered off by controlling the state of that switch.
Step 203, acquiring a second current health state monitoring value when the power is on and restarted;
After the power has been off for a certain time, the server is controlled to power on and restart. A second current health state monitoring value is acquired while the server is restarting; this value indicates whether the persistent memory lost any of the data it held before the power failure. The second current health state monitoring value is the current health state monitoring value obtained when the server restarts.
In this embodiment of the present invention, the step of obtaining the second current health status monitoring value during power-on restart may specifically include the following sub-steps:
Substep S2031, when the power is on and restarted, self-checking the memory firmware state register and reading the last shutdown state data;
When the server is restarted, the BIOS system is also restarted. During its restart the BIOS system self-checks its registers, and the last shutdown state (LSS) data is read from the memory firmware state register while that register is being self-checked. The last shutdown state data records the various state data that were stored in the persistent memory before the server was last powered off.
Substep S2032, determining that the last shutdown state data is the second current health state monitoring value.
After the last shutdown state data is obtained, it is taken as the second current health state monitoring value.
Specifically, the step of determining that the last shutdown state data is the second current health state monitoring value includes:
Sub-step S20321, reading a second power-on/off period and a second abnormal shutdown count from the last shutdown state data by using the management tool ipmctl;
Similarly, the operating system may be entered and the management tool ipmctl used there to read the power-on/off period in the last shutdown state data as the second power-on/off period, and the abnormal shutdown count in the last shutdown state data as the second abnormal shutdown count. That is, the second power-on/off period is the power-on/off period obtained when the server is restarted, and the second abnormal shutdown count is the abnormal shutdown count obtained when the server is restarted.
Sub-step S20322, determining the second power-on/off period and the second abnormal shutdown count as the second current health state monitoring value.
After the second power-on/off period and the second abnormal shutdown count are read, they may together be used as the second current health state monitoring value, and the state of the persistent memory is determined from the second current health state monitoring value and the first current health state monitoring value.
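The read-out of the two counters can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the tool's documented interface: it assumes the sensor listing is a pipe-separated table with a module column, a sensor-name column and a value column, and that the two counters appear under the names `PowerCycles` and `LatchedDirtyShutdownCount`. Both the layout and the names should be checked against the ipmctl version in use. The sample text is illustrative and was not captured from real hardware.

```python
from typing import NamedTuple


class HealthReading(NamedTuple):
    """One 'current health state monitoring value' as used in this method."""
    power_cycles: int           # power-on/off period
    dirty_shutdown_count: int   # abnormal shutdown count


def parse_sensor_table(text: str,
                       cycles_key: str = "PowerCycles",
                       dirty_key: str = "LatchedDirtyShutdownCount") -> HealthReading:
    """Extract the two counters from a pipe-separated sensor table.

    The three-column layout and the key names are assumptions made for this
    sketch, not a guaranteed tool format.
    """
    values = {}
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        if len(cells) != 3:
            continue  # skip separators, blank lines, and anything unexpected
        _module, name, value = cells
        if name in (cycles_key, dirty_key):
            values[name] = int(value)
    missing = {cycles_key, dirty_key} - set(values)
    if missing:
        raise ValueError(f"sensor(s) not found: {sorted(missing)}")
    return HealthReading(values[cycles_key], values[dirty_key])


# Illustrative input only; real output must be captured on the server under test.
SAMPLE = """
 DimmID | Type                      | CurrentValue
===================================================
 0x0001 | PowerCycles               | 57
 0x0001 | LatchedDirtyShutdownCount | 3
"""

reading = parse_sensor_table(SAMPLE)
print(reading)
```

The sketch handles a single memory module; with several modules in the listing, the values would need to be grouped by the module column before comparison.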
Step 204, recording the second current health state monitoring value;
Each time a second current health state monitoring value is acquired, it is recorded and stored.
Step 205, repeatedly executing the step of obtaining a second current health state monitoring value at power-on restart until a plurality of second current health state monitoring values meeting a preset number condition are recorded;
A single group of second current health state monitoring values may be sporadic, which would make the test result inaccurate. Therefore, after the server is powered off, it is powered on and restarted repeatedly; each time, the memory firmware state register is self-checked and a second current health state monitoring value is obtained. Repeating this several times yields a plurality of second current health state monitoring values. When the number of second current health state monitoring values obtained meets the preset number condition, the acquisition stops. For example, the preset number condition may be that twenty groups of second current health state monitoring values have been obtained. In that case, step 203 is executed to obtain the first group; after the first group is recorded, step 203 is executed a further nineteen times, and once twenty groups have been obtained the repetition ends. The second current health state monitoring value acquired each time is recorded and stored for the subsequent judgment.
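The repetition described above can be sketched as a simple loop. The names `collect_second_values`, `power_cycle`, `read_value` and `FakeServer` are hypothetical and introduced only for illustration; on a real server the two callables would wrap the power control and the counter read-out, which this sketch does not attempt to implement.

```python
from typing import Callable, List, Tuple

Reading = Tuple[int, int]  # (power-on/off period, abnormal shutdown count)


def collect_second_values(power_cycle: Callable[[], None],
                          read_value: Callable[[], Reading],
                          required_groups: int = 20) -> List[Reading]:
    """Repeat 'power off, power on, read, record' until enough groups exist."""
    if required_groups < 1:
        raise ValueError("required_groups must be at least 1")
    recorded: List[Reading] = []
    while len(recorded) < required_groups:  # preset number condition
        power_cycle()                       # power off, wait, power on again
        recorded.append(read_value())       # acquire (step 203) and record (step 204)
    return recorded


class FakeServer:
    """Stand-in for a real server: each power cycle adds 1 to the period
    counter and leaves the abnormal shutdown count unchanged."""

    def __init__(self, period: int, abnormal: int) -> None:
        self.period = period
        self.abnormal = abnormal

    def power_cycle(self) -> None:
        self.period += 1

    def read_value(self) -> Reading:
        return (self.period, self.abnormal)


server = FakeServer(period=100, abnormal=2)
values = collect_second_values(server.power_cycle, server.read_value)
print(len(values), values[0], values[-1])
```

Passing the two operations in as callables keeps the loop independent of how the power is actually cut, so the same loop serves both the pure hardware and the software-controlled manner.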
And step 206, comparing the first current health status monitoring value with the plurality of second current health status monitoring values to determine the nonvolatile property of the persistent memory.
After the first current health state monitoring value and the second current health state monitoring values are obtained, the first current health state monitoring value can be used as basic data, the second current health state monitoring values are compared with the first current health state monitoring value, or the second current health state monitoring values are compared with each other, and whether the nonvolatile property of the persistent memory is normal or not is determined.
Specifically, the step of comparing the first current health state monitoring value with the plurality of second current health state monitoring values to determine the non-volatility of the persistent memory includes:
Sub-step S2061, judging whether the first power-on/off period and the second power-on/off periods are in an increasing relationship;
Specifically, the first power-on/off period and the second power-on/off periods may first be arranged in the order in which they were obtained, and it is judged whether they increase in sequence. For example, with 20 second power-on/off periods, the first power-on/off period is A_0, the second power-on/off period obtained the first time is A_1, the one obtained the second time is A_2, and those obtained subsequently are A_3, A_4, ..., A_20; a single power cycle increases the value by 1. That is, it is judged whether A_1, A_2, ..., A_20 form an increasing sequence, i.e. whether each pair of adjacent terms differs by 1. When every pair of adjacent terms differs by 1, the first power-on/off period and the second power-on/off periods are in an increasing relationship. Otherwise, as long as any pair of adjacent terms does not differ by 1, it is determined that the first power-on/off period and the second power-on/off periods are not in an increasing relationship.
Sub-step S2062, when the first power-on/off period and the second power-on/off periods are in an increasing relationship, judging whether the first abnormal shutdown count and the second abnormal shutdown counts are the same;
When the first power-on/off period and the second power-on/off periods are in an increasing relationship, it may be further judged whether the first abnormal shutdown count and the second abnormal shutdown counts are the same. For example, with 20 second abnormal shutdown counts, the first abnormal shutdown count is B_0, the second abnormal shutdown count obtained the first time is B_1, the one obtained the second time is B_2, and those obtained subsequently are B_3, B_4, ..., B_20. It is judged whether B_1, B_2, ..., B_20 are all equal. When B_1 = B_2 = ... = B_20, the first abnormal shutdown count is the same as the second abnormal shutdown counts. Otherwise, when at least one abnormal shutdown count differs from the others, it can be determined that the first abnormal shutdown count and the second abnormal shutdown counts are different.
Sub-step S2063, when the first abnormal shutdown count is the same as the second abnormal shutdown counts, determining that the non-volatility of the persistent memory is in a normal state;
When the first abnormal shutdown count is the same as the second abnormal shutdown counts and the first power-on/off period and the second power-on/off periods are in an increasing relationship, the data stored in the persistent memory was not lost across the power failures of the server, and the non-volatility of the persistent memory can be determined to be in a normal state.
Sub-step S2064, when the first power-on/off period and the second power-on/off periods are not in an increasing relationship, or the first abnormal shutdown count is different from the second abnormal shutdown counts, determining that the non-volatility of the persistent memory is in a fault state.
When the first abnormal shutdown count is different from the second abnormal shutdown counts, or the first power-on/off period and the second power-on/off periods are not in an increasing relationship, data stored in the persistent memory was lost across the power failures of the server, and the non-volatility of the persistent memory can be determined to be in a fault state.
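Sub-steps S2061 to S2064 can be sketched as three small Python functions. The function names are hypothetical. The sketch makes two assumptions that go slightly beyond the wording above: the baseline period is included in the increment chain (so the first restart value must be one more than the baseline), and the baseline abnormal shutdown count is included in the equality check.

```python
from typing import List, Sequence, Tuple

Reading = Tuple[int, int]  # (power-on/off period, abnormal shutdown count)


def is_incrementing(first_period: int, second_periods: Sequence[int]) -> bool:
    """S2061: every period is exactly 1 more than the one recorded before it."""
    series: List[int] = [first_period, *second_periods]
    return all(later - earlier == 1 for earlier, later in zip(series, series[1:]))


def counts_unchanged(first_count: int, second_counts: Sequence[int]) -> bool:
    """S2062: every recorded abnormal shutdown count equals the baseline."""
    return all(count == first_count for count in second_counts)


def judge_non_volatility(first: Reading, seconds: Sequence[Reading]) -> str:
    """S2063 / S2064: return 'normal' or 'fault' for the recorded values."""
    if not seconds:
        raise ValueError("at least one second monitoring value is required")
    periods = [period for period, _ in seconds]
    counts = [count for _, count in seconds]
    if not is_incrementing(first[0], periods):
        return "fault"   # periods are not in an increasing relationship
    if not counts_unchanged(first[1], counts):
        return "fault"   # abnormal shutdown count changed
    return "normal"


print(judge_non_volatility((100, 2), [(101, 2), (102, 2), (103, 2)]))
print(judge_non_volatility((100, 2), [(101, 2), (103, 2)]))
print(judge_non_volatility((100, 2), [(101, 2), (102, 3)]))
```

The count check runs only after the period check has passed, mirroring the order of the sub-steps.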
In the embodiment of the invention, a first current health state monitoring value is obtained when the BIOS enables the asynchronous refresh function of the dynamic random access memory; the power supply unit is controlled to power off so as to power off the server; a second current health state monitoring value is obtained at power-on restart; the second current health state monitoring value is recorded; the step of obtaining a second current health state monitoring value at power-on restart is repeated until a plurality of second current health state monitoring values meeting the preset number condition are recorded; and the first current health state monitoring value is compared with the plurality of second current health state monitoring values to determine the non-volatility of the persistent memory. Because data is acquired during the self-check at server restart and compared with the initial data, the test steps are simpler, which solves the problem that the original method for testing the power-down protection function of the persistent memory is complicated to operate and involves cumbersome steps. The data acquired from the self-check at restart represents the data stored before power failure, so the acquired test data better matches the non-volatile characteristic of the persistent memory and the non-volatility test result is more accurate. This also solves the problem that the original method for testing the power-down protection function cannot accurately reflect the non-volatile characteristic of the persistent memory, making the test result more scientific and convincing.
In order to enable a person skilled in the art to better understand the embodiments of the present invention, the following description is given by way of an example:
Referring to fig. 3, a flowchart of the steps of an example of the persistent memory non-volatility testing method of the present invention is shown. In the leftmost column, the server is started for the first time, the BIOS is entered to enable ADR, and the Linux OS is then entered to record the values of PCC and DSC. In each column to the right, a power-down operation is performed to trigger a dirty shutdown, and the OS is entered again to record the values of PCC and DSC. This cycle is repeated, the values of PCC and DSC are recorded each time, and it is judged whether the power-down protection function is normal. Specifically, the method comprises the following steps:
Step 1: starting the server, entering the BIOS, and enabling the ADR function;
Step 2: entering the Linux OS, obtaining the PCC value through ipmctl and recording it as A_0, and obtaining the DSC value and recording it as B_0;
Step 3: directly unplugging the PSU of the machine to power off the server;
Step 4: powering on the server again after an interval;
Step 5: entering the Linux OS, obtaining the PCC value through ipmctl and recording it as A_1, and obtaining the DSC value and recording it as B_1;
Step 6: repeating steps 4 to 5 for 20 times to obtain A_2 to A_20, B_1 to B_20, and so on;
Step 7: comparing the test results and judging whether all of the following conditions are met: A_i = A_{i-1} + 1 for every i from 1 to 20 (that is, A_1 = A_0 + 1, A_2 = A_1 + 1, ..., A_20 = A_19 + 1); and B_1 = B_2 = ... = B_20;
Step 8: if all of the above conditions are met, the test passes; otherwise, the test fails.
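When the comparison of steps 7 and 8 fails, it helps to know which condition was violated. The following Python sketch lists the failed conditions instead of returning only pass or fail. The function name `step7_violations` and the sample numbers are hypothetical; the checks follow step 7 as written, so B_0 is recorded but not compared.

```python
from typing import List, Sequence


def step7_violations(a: Sequence[int], b: Sequence[int]) -> List[str]:
    """a = [A_0, ..., A_n], b = [B_0, ..., B_n]; returns the failed conditions."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need A_0..A_n and B_0..B_n with n >= 1")
    problems: List[str] = []
    for i in range(1, len(a)):
        if a[i] != a[i - 1] + 1:   # condition A_i = A_{i-1} + 1
            problems.append(
                f"A_{i} = {a[i]}, expected A_{i - 1} + 1 = {a[i - 1] + 1}")
    for i in range(2, len(b)):
        if b[i] != b[1]:           # condition B_1 = B_2 = ... = B_n
            problems.append(f"B_{i} = {b[i]}, expected B_1 = {b[1]}")
    return problems


# Illustrative numbers only: A_0 = 57 and 20 clean power cycles, DSC constant.
a_ok = list(range(57, 78))   # A_0 .. A_20
b_ok = [3] * 21              # B_0 .. B_20
print(step7_violations(a_ok, b_ok))   # empty list means the test passes

b_bad = list(b_ok)
b_bad[5] = 4                 # the fifth restart reports a changed DSC
print(step7_violations(a_ok, b_bad))
```

An empty result corresponds to "the test passes" in step 8; any entry corresponds to "the test fails".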
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combinations of acts, but those skilled in the art will recognize that the embodiments of the present invention are not limited by the described order of acts, as some steps may be performed in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are preferred embodiments, and that the acts involved are not necessarily required by the embodiments of the present invention.
Referring to fig. 4, a block diagram of a structure of an embodiment of a persistent memory nonvolatile testing apparatus according to the present invention is shown, where the persistent memory is disposed in a server, the server has a BIOS, and the persistent memory nonvolatile testing apparatus may specifically include the following modules:
a first obtaining module 401, configured to obtain a first current health status monitoring value when the BIOS starts an asynchronous refresh function of a dynamic random access memory;
a second obtaining module 402, configured to obtain a second current health state monitoring value at power-on restart;
a recording module 403, configured to record the second current health status monitoring value;
a third obtaining module 404, configured to repeatedly perform the step of obtaining a second current health state monitoring value at power-on restart until a plurality of second current health state monitoring values meeting a preset number condition are recorded;
a comparison module 405, configured to compare the first current health status monitoring value with the plurality of second current health status monitoring values to determine persistent memory non-volatility.
In an optional embodiment of the present invention, the server is further provided with a power supply unit, and before the step of obtaining the second current health status monitoring value when the power is turned on and restarted, the apparatus further includes:
and the control module is used for controlling the power-off of the power supply unit so as to power off the server.
In an optional embodiment of the present invention, the first obtaining module 401 includes:
and the first acquisition submodule is used for accessing the memory firmware state register by adopting a frame information structure FIS protocol to acquire a first current health state monitoring value.
In an optional embodiment of the present invention, the first obtaining sub-module includes:
the opening unit is used for opening an operating system by adopting the FIS protocol, and the operating system comprises the memory firmware state register and a management tool ipmctl;
the first reading unit is used for reading a first power-on and power-off period and a first abnormal power-off count in the memory firmware state register by adopting the management tool ipmctl;
and the first current health state monitoring value determining unit is used for determining that the first power-on and power-off period and the first abnormal power-off count are the first current health state monitoring value.
In an optional embodiment of the present invention, the second obtaining module 402 includes:
the self-checking submodule is used for self-checking the memory firmware state register and reading last shutdown state data when the power-on is restarted;
and the second current health state monitoring value determining submodule is used for determining that the last shutdown state data is the second current health state monitoring value.
In an optional embodiment of the present invention, the second current health status monitoring value determining sub-module includes:
the second reading unit is used for reading a second power-on and power-off period and a second abnormal power-off count in the last power-off state data by using the management tool ipmctl;
and the second current health state monitoring value determining unit is used for determining that the second on-off period and the second abnormal off count are the second current health state monitoring value.
In an alternative embodiment of the present invention, the comparison module 405 includes:
the first judgment submodule is used for judging whether the first on-off period and the second on-off period are in an increasing relationship;
a second determining submodule, configured to determine whether the first abnormal shutdown count is the same as the second abnormal shutdown count when the first startup and shutdown period and the second startup and shutdown period are in an increasing relationship;
a normal state confirmation submodule, configured to determine that the non-volatility of the persistent memory is in a normal state when the first abnormal shutdown count is the same as the second abnormal shutdown count;
and the fault state confirmation submodule is used for determining that the nonvolatile property of the persistent memory is in a fault state when the first on-off period and the second on-off period are not in an increasing relationship or the first abnormal off count is different from the second abnormal off count.
For the apparatus embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and reference may be made to the partial description of the method embodiment for relevant points.
An embodiment of the present invention further provides an electronic device, including:
the device comprises a processor and a storage medium, wherein the storage medium stores a computer program executable by the processor, and when an electronic device runs, the processor executes the computer program to execute the nonvolatile testing method of the persistent memory according to any one of the embodiments of the invention. The specific implementation manner and technical effects are partially similar to those of the method embodiment, and are not described herein again.
The embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the storage medium, and when the computer program is executed by a processor, the method for testing the non-volatility of the persistent memory according to any one of the embodiments of the present invention is executed. The specific implementation manner and technical effects are similar to those of the method embodiment, and are not described herein again.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the embodiments of the invention.
Finally, it should also be noted that, in this document, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "include", "including" or any other variations thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or terminal device including a series of elements includes not only those elements but also other elements not explicitly listed or inherent to such process, method, article, or terminal device. Without further limitation, an element defined by the phrase "comprising a … …" does not exclude the presence of another identical element in a process, method, article, or terminal apparatus that comprises the element.
The non-volatility testing method and device for persistent memory, the electronic device and the storage medium provided by the invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the invention, and the description of the embodiments is only intended to help in understanding the method and core idea of the invention. Meanwhile, for a person skilled in the art, the specific embodiments and the scope of application may vary according to the idea of the invention. In summary, the content of this specification should not be construed as limiting the invention.

Claims (10)

1. A nonvolatile test method for a persistent memory, wherein the persistent memory is arranged in a server, and the server is provided with a Basic Input Output System (BIOS), the method comprises the following steps:
when the BIOS starts the asynchronous refreshing function of the dynamic random access memory, acquiring a first current health state monitoring value;
when the power is on and restarted, a second current health state monitoring value is obtained;
recording the second current health state monitoring value;
repeatedly executing the step of obtaining second current health state monitoring values when the power is on and restarted until a plurality of second current health state monitoring values meeting the preset number condition are recorded;
and comparing the first current health state monitoring value with the plurality of second current health state monitoring values to determine the nonvolatile property of the persistent memory.
2. The method according to claim 1, wherein a power supply unit is further disposed in the server, and before the step of obtaining the second current health status monitoring value at the time of power-on restart, the method further comprises:
controlling the power supply unit to be powered off so as to power off the server.
3. The method according to claim 1 or 2, wherein the step of obtaining a first current health state monitoring value comprises:
and accessing the memory firmware state register by adopting a frame information structure FIS protocol to acquire a first current health state monitoring value.
4. The method of claim 3, wherein accessing the memory firmware status register using frame information structure FIS protocol to obtain the first current health status monitor value comprises:
opening an operating system by adopting the FIS protocol, wherein the operating system comprises the memory firmware state register and a management tool ipmctl;
reading a first power-on and power-off period and a first abnormal power-off count in the memory firmware state register by using the management tool ipmctl;
and determining the first on-off period and the first abnormal off count as the first current health state monitoring value.
5. The method of claim 4, wherein the step of obtaining a second current health status monitor value at power-up restart comprises:
when the power-on is restarted, the memory firmware state register is self-checked, and the last shutdown state data is read;
and determining the last shutdown state data as the second current health state monitoring value.
6. The method of claim 5, wherein determining the last shutdown state data to be the second current health status monitor value comprises:
reading a second power-on and power-off period and a second abnormal power-off count in the last power-off state data by using the management tool ipmctl;
and determining the second on-off period and the second abnormal off count as the second current health state monitoring value.
7. The method of claim 6, wherein said comparing said first current health status monitor value to said plurality of second current health status monitor values to determine persistent memory non-volatility comprises:
judging whether the first power-on and power-off period and the second power-on and power-off period are in an increasing relationship or not;
when the first power-on and power-off period and the second power-on and power-off period are in an increasing relationship, judging whether the first abnormal power-off count is the same as the second abnormal power-off count;
when the first abnormal shutdown count is the same as the second abnormal shutdown count, determining that the nonvolatile memory of the persistent memory is in a normal state;
and when the first on-off period and the second on-off period are not in an increasing relationship or the first abnormal off count is different from the second abnormal off count, determining that the nonvolatile memory is in a fault state.
8. A persistent memory nonvolatile test apparatus, wherein the persistent memory is disposed in a server, and the server has a BIOS, the apparatus comprising:
the first acquisition module is used for acquiring a first current health state monitoring value when the BIOS starts the asynchronous refreshing function of the dynamic random access memory;
the second acquisition module is used for acquiring a second current health state monitoring value when the power is on and restarted;
the recording module is used for recording the second current health state monitoring value;
the third acquisition module is used for repeatedly executing the step of acquiring the second current health state monitoring values when the power supply is restarted until a plurality of second current health state monitoring values meeting the preset number condition are recorded;
and the comparison module is used for comparing the first current health state monitoring value with the plurality of second current health state monitoring values to determine the nonvolatile property of the persistent memory.
9. An electronic device comprising a processor, a memory, and a computer program stored on the memory and capable of running on the processor, the computer program, when executed by the processor, implementing the steps of the persistent memory non-volatility testing method of any one of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the persistent memory non-volatility testing method according to any one of claims 1 to 7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211296525.5A CN115586999A (en) 2022-10-21 2022-10-21 Nonvolatile testing method and device for persistent memory, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211296525.5A CN115586999A (en) 2022-10-21 2022-10-21 Nonvolatile testing method and device for persistent memory, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115586999A true CN115586999A (en) 2023-01-10

Family

ID=84779580

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211296525.5A Pending CN115586999A (en) 2022-10-21 2022-10-21 Nonvolatile testing method and device for persistent memory, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115586999A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination