CN112463495B - Open channel SSD and driver reliability testing method and device thereof

Publication number: CN112463495B
Authority: CN (China)
Prior art keywords: node, ocsd, test, test node, ocssd
Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: CN202011417693.6A
Other languages: Chinese (zh)
Other versions: CN112463495A (en)
Inventor: 贾桂森
Current Assignee: Suzhou Inspur Intelligent Technology Co Ltd (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: Suzhou Inspur Intelligent Technology Co Ltd
Application filed by: Suzhou Inspur Intelligent Technology Co Ltd
Priority to: CN202011417693.6A
Publication of CN112463495A
Application granted
Publication of CN112463495B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/22: Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/2205: Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing using arrangements specific to the hardware being tested
    • G06F11/2221: Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing using arrangements specific to the hardware being tested to test input/output devices or peripheral units
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/22: Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/2273: Test methods

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Test And Diagnosis Of Digital Computers (AREA)

Abstract

The invention provides a method and a device for testing the reliability of an open-channel SSD (OCSSD) and its driver. The method comprises the following steps: configuring a test environment, setting up network communication between the BMC (baseboard management controller) of the test node and an auxiliary node, checking the OCSSD state of the test node, partitioning the OCSSD of the test node, and placing a standard file in an OCSSD partition of the test node; the auxiliary node controls the test node to power on, and obtains a time interval T1 from power-on of the test node until the disk driver starts loading the OCSSD, and a time interval T2 from the start of loading until the disk driver finishes loading the OCSSD; the auxiliary node controls the test node to complete a preset number of power on/off tests, powering the test node off once it has run for a time interval T after each power-on, where T1 < T < T1 + T2; and checking the OCSSD state of the test node again, verifying the read/write function of the OCSSD of the test node, and verifying the standard file.

Description

Open channel SSD and driver reliability testing method and device thereof
Technical Field
The invention belongs to the technical field of storage reliability testing, and particularly relates to a method and a device for testing the reliability of an open-channel SSD and its driver.
Background
OCSSD is short for open-channel SSD, and SSD is short for solid state drive.
Servers play a significant role in the wide application of cloud computing and big data, and storage is an important part of a server. With continuous innovation in storage technology, OCSSDs are increasingly used in servers. An OCSSD, or open-channel solid state drive, is a special kind of solid state drive that does not implement a flash translation layer (FTL) in the drive firmware; instead, it hands management of the physical flash storage over to the operating system. Using an OCSSD therefore generally requires a dedicated driver installed in the operating system. The driver loads the OCSSD during server boot; this loading takes a long time and is prone to abnormal conditions. Because the reliability of the OCSSD and its driver determines whether data is safe and reliable, it is very important for servers that use them.
At present, OCSSD reliability is mainly tested by repeatedly restarting a complete OCSSD server, including soft restart, hard restart, and power-off restart. After each restart, once the system has been entered, it is checked for abnormal errors, and the reliability of the OCSSD is judged accordingly. This conventional method checks the running state of the OCSSDs only after the driver has finished loading all of them, so abnormal conditions that occur while the driver is loading an OCSSD cannot be fully verified.
Therefore, in view of the above drawback of the prior art, it is necessary to provide a method and a device for testing the reliability of an open-channel SSD and its driver.
Disclosure of Invention
In the prior art, the running state of the OCSSD is checked only after the driver has loaded all OCSSDs, so abnormal conditions that occur while the driver is loading an OCSSD cannot be fully verified. To address this defect, the invention provides a method and a device for testing the reliability of an open-channel SSD and its driver.
In a first aspect, the present invention provides a method for testing the reliability of an open-channel SSD and its driver, including the following steps:
S1. Configuring a test environment: setting up network communication between the BMC (baseboard management controller) of the test node and the auxiliary node, checking the OCSSD state of the test node, partitioning the OCSSD of the test node, and placing a standard file in an OCSSD partition of the test node;
S2. The auxiliary node controls the test node to power on, and obtains a time interval T1 from power-on of the test node until the disk driver starts loading the OCSSD, and a time interval T2 from the start of loading until the disk driver finishes loading the OCSSD;
S3. The auxiliary node controls the test node to complete a preset number of power on/off tests, powering the test node off once it has run for a time interval T after each power-on, where T1 < T < T1 + T2;
S4. Checking the OCSSD state of the test node again, verifying the read/write function of the OCSSD of the test node, and verifying the standard file.
Further, step S1 specifically includes the following steps:
S11. Setting up an auxiliary node to work with the test node in normal operation, and placing the BMC of the test node and the auxiliary node on the same local area network;
S12. Checking the state of the OCSSD under test in the test node's operating system, and judging whether that state is normal;
if yes, recording a state log of the OCSSD under test, and proceeding to step S13;
if not, ending;
S13. Dividing the space of the OCSSD under test equally into two physical partitions, a first partition and a second partition;
S14. Creating a file system on the first partition, and mounting the file system to a specified directory;
S15. Writing a standard file of a set size into the mounted file system, computing the MD5 checksum of the standard file, and saving the checksum.
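The checksum part of step S15 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names `md5_of_file` and `save_checksum` and the chunk size are assumptions. The file is read in chunks so that a very large standard file never has to fit in memory, and the digest is stored in the `<hex>  <filename>` layout that `md5sum -c` reads.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it chunk by chunk."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def save_checksum(path, checksum_path):
    """Write '<hex digest>  <path>' to checksum_path and return that line.

    Both names are hypothetical helpers for illustration only.
    """
    line = f"{md5_of_file(path)}  {path}\n"
    Path(checksum_path).write_text(line)
    return line
```

The saved line can later be compared against a freshly computed digest to detect whether the standard file changed during the test.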
Further, step S2 specifically includes the following steps:
S21. The auxiliary node sends commands to the BMC (baseboard management controller) of the test node through ipmitool to power the test node off and, after waiting for a first time period, power it on, and records the power-on time point A;
S22. The auxiliary node obtains the time point B at which the disk driver starts loading the OCSSD, and the time point C at which the operating system of the test node reaches the login interface;
S23. The auxiliary node calculates the time interval T1 from power-on of the test node until the disk driver starts loading the OCSSD as T1 = B - A;
S24. The auxiliary node calculates the time interval T2 from the start of loading until the disk driver finishes loading the OCSSD as T2 = C - B. The disk driver takes different amounts of time to load OCSSDs holding different amounts of stored data, so the auxiliary node needs to obtain these time intervals before each test.
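The arithmetic of steps S23 and S24, together with the constraint T1 < T < T1 + T2 used later, can be illustrated with a short sketch. The function names, the `fraction` parameter, and the example timestamps are assumptions made for illustration; the patent only requires that T lie strictly inside the loading window.

```python
from datetime import datetime


def boot_intervals(a, b, c):
    """Return (T1, T2) in seconds.

    a: power-on time point A
    b: time point B, when the driver starts loading the OCSSD
    c: time point C, when the OS reaches the login interface
    """
    if not a <= b <= c:
        raise ValueError("expected A <= B <= C")
    t1 = (b - a).total_seconds()  # T1 = B - A
    t2 = (c - b).total_seconds()  # T2 = C - B
    return t1, t2


def pick_interrupt_delay(t1, t2, fraction=0.5):
    """Pick T with T1 < T < T1 + T2.

    `fraction` (hypothetical) selects how far into the loading window
    the power-off should land; it must be strictly between 0 and 1.
    """
    if t2 <= 0:
        raise ValueError("T2 must be positive")
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return t1 + fraction * t2


# Example with made-up timestamps: T1 = 90 s, T2 = 120 s, T = 150 s.
A = datetime(2020, 12, 7, 10, 0, 0)
B = datetime(2020, 12, 7, 10, 1, 30)
C = datetime(2020, 12, 7, 10, 3, 30)
T1, T2 = boot_intervals(A, B, C)
T = pick_interrupt_delay(T1, T2)
```

Any T chosen this way guarantees that the power-off arrives after the driver has started loading the OCSSD and before loading completes.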
Further, step S3 specifically includes the following steps:
S31. The auxiliary node sends commands to the BMC (baseboard management controller) of the test node through ipmitool to power the test node off and, after waiting for a first time period, power it on;
S32. The auxiliary node judges whether the preset number of power on/off tests has been completed;
if yes, proceeding to step S4;
if not, proceeding to step S33;
S33. The auxiliary node judges whether the time elapsed since the test node was powered on has reached the time interval T, where T1 < T < T1 + T2;
if yes, returning to step S31;
if not, returning to step S33.
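The loop of steps S31 to S33 could be driven from the auxiliary node roughly as sketched below. This is one interpretation under stated assumptions: the function names, the `lanplus` interface choice, the credentials, and the final uninterrupted power-on are not specified by the patent, and the polling in S33 is simplified to a single sleep. The `run` and `sleep` parameters are injectable so the logic can be exercised without a real BMC.

```python
import subprocess
import time


def ipmi_power(host, user, password, action, run=subprocess.run):
    """Send `chassis power on|off` to the test node's BMC via ipmitool."""
    cmd = ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-P", password,
           "chassis", "power", action]
    run(cmd, check=True)
    return cmd


def interrupted_boot_test(host, user, password, t_interrupt, cycles=100,
                          off_wait=10, run=subprocess.run, sleep=time.sleep):
    """Power-cycle the test node `cycles` times, cutting each boot after T.

    t_interrupt is the interval T (seconds) with T1 < T < T1 + T2.
    off_wait is the "first time period" between power-off and power-on.
    """
    for _ in range(cycles):
        ipmi_power(host, user, password, "off", run=run)  # S31: power off
        sleep(off_wait)                                   # wait first period
        ipmi_power(host, user, password, "on", run=run)   # S31: power on
        sleep(t_interrupt)                                # S33: wait until T
    # Assumption: one last cycle is left uninterrupted so the node can boot
    # fully before the checks of step S4.
    ipmi_power(host, user, password, "off", run=run)
    sleep(off_wait)
    ipmi_power(host, user, password, "on", run=run)
```

Because every power-off lands inside the driver's loading window, each iteration forcibly interrupts the loading of the OCSSD, which is the abnormal condition the method sets out to exercise.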
Further, step S4 specifically includes the following steps:
S41. Checking the state of the tested OCSSD in the test node's operating system, and judging whether that state is normal;
if yes, proceeding to step S42;
if not, ending;
S42. Recording a state log of the tested OCSSD, comparing it with the state log recorded before the test, and judging whether the state logs before and after the test show any abnormality;
S43. Using the FIO tool to issue a set number of random mixed read/write jobs, with a set block size, queue depth, and IO engine, to the second partition of the tested OCSSD, and detecting whether the FIO process behaves abnormally;
S44. Verifying the standard file on the first partition against the saved MD5 checksum file.
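The log comparison in step S42 can be approximated with a plain text diff. The sketch below is an assumption about how such a comparison might look; the patent does not specify the format of the state log, and the helper name `diff_state_logs` is hypothetical. Real logs may contain timestamps or counters that differ legitimately and would need filtering first.

```python
import difflib


def diff_state_logs(before, after):
    """Return unified-diff lines between the pre-test and post-test logs.

    An empty list means the two logs are textually identical.
    """
    return list(difflib.unified_diff(
        before.splitlines(), after.splitlines(),
        fromfile="before", tofile="after", lineterm=""))


# Example with made-up log content.
pre_log = "state: ok\nmedia errors: 0"
post_log = "state: ok\nmedia errors: 3"
changes = diff_state_logs(pre_log, post_log)
```

A non-empty result does not by itself prove a fault; it flags the lines a tester should inspect.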
In a second aspect, the present invention provides a device for testing the reliability of an open-channel SSD and its driver, including:
a test environment configuration module, configured to configure a test environment, set up network communication between the BMC of the test node and the auxiliary node, check the OCSSD state of the test node, partition the OCSSD of the test node, and place a standard file in an OCSSD partition of the test node;
a time interval acquisition module, configured to control the test node to power on through the auxiliary node, and to obtain a time interval T1 from power-on of the test node until the disk driver starts loading the OCSSD, and a time interval T2 from the start of loading until the disk driver finishes loading the OCSSD;
a power on/off test module, configured to control the test node, through the auxiliary node, to complete a preset number of power on/off tests, powering the test node off once it has run for a time interval T after each power-on, where T1 < T < T1 + T2;
and a test verification module, configured to check the OCSSD state of the test node again, verify the read/write function of the OCSSD of the test node, and verify the standard file.
Further, the test environment configuration module includes:
an auxiliary node setting unit, configured to set up an auxiliary node to work with the test node in normal operation, and to place the BMC of the test node and the auxiliary node on the same local area network;
a first OCSSD state checking unit, configured to check the state of the OCSSD under test in the test node's operating system, and to judge whether that state is normal;
an OCSSD state recording unit, configured to record a state log of the OCSSD under test when its state is normal;
an OCSSD partition unit, configured to divide the space of the OCSSD under test equally into two physical partitions, a first partition and a second partition;
a file system mounting unit, configured to create a file system on the first partition and mount the file system to a specified directory;
and a file writing unit, configured to write a standard file of a set size into the mounted file system, compute the MD5 checksum of the standard file, and save the checksum.
Further, the time interval acquisition module includes:
a power-on time recording unit, configured to have the auxiliary node send commands to the BMC of the test node through ipmitool to power the test node off and, after waiting for a first time period, power it on, and to record the power-on time point A;
an OCSSD pre- and post-loading time point recording unit, configured to have the auxiliary node obtain the time point B at which the disk driver starts loading the OCSSD, and the time point C at which the operating system of the test node reaches the login interface;
a power-on-to-loading time interval calculation unit, configured to calculate, through the auxiliary node, the time interval T1 from power-on of the test node until the disk driver starts loading the OCSSD as T1 = B - A;
and a loading process time interval calculation unit, configured to calculate, through the auxiliary node, the time interval T2 from the start of loading until the disk driver finishes loading the OCSSD as T2 = C - B.
Further, the power on/off test module includes:
a power-on unit, configured to have the auxiliary node send commands to the BMC (baseboard management controller) of the test node through ipmitool to power the test node off and, after waiting for a first time period, power it on;
a power on/off test completion judging unit, configured to judge, through the auxiliary node, whether the preset number of power on/off tests has been completed;
and a post-power-on duration judging unit, configured to judge, through the auxiliary node, when the power on/off tests have not been completed, whether the time elapsed since the test node was powered on has reached the time interval T, where T1 < T < T1 + T2.
Further, the test verification module includes:
a second OCSSD state checking unit, configured to check the state of the tested OCSSD in the test node's operating system, and to judge whether that state is normal;
an OCSSD state comparison unit, configured to record a state log of the tested OCSSD, compare it with the state log recorded before the test, and judge whether the state logs before and after the test show any abnormality;
an IO verification unit, configured to use the FIO tool to issue a set number of random mixed read/write jobs, with a set block size, queue depth, and IO engine, to the second partition of the tested OCSSD, and to detect whether the FIO process behaves abnormally;
and a standard file verification unit, configured to verify the standard file on the first partition against the saved MD5 checksum file.
The beneficial effects of the invention are as follows:
The method and device for testing the reliability of an open-channel SSD and its driver simulate an abnormal condition during loading by forcibly interrupting the disk driver while it is loading the OCSSD. Abnormal conditions that may occur while the driver loads the OCSSD can thus be effectively verified, the reliability of the OCSSD and its driver in server use can be fully verified, and a gap in current OCSSD testing is filled.
In addition, the invention has reliable design principle, simple structure and very wide application prospect.
Therefore, compared with the prior art, the invention has prominent substantive features and remarkable progress, and the beneficial effects of the implementation are also obvious.
Drawings
In order to illustrate the embodiments of the present invention or the prior-art solutions more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. Those skilled in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a first schematic flow chart of the method of the present invention;
FIG. 2 is a second schematic flow chart of the method of the present invention;
FIG. 3 is a schematic diagram of the system of the present invention;
In the figures: 1 - test environment configuration module; 1.1 - auxiliary node setting unit; 1.2 - first OCSSD state checking unit; 1.3 - OCSSD state recording unit; 1.4 - OCSSD partition unit; 1.5 - file system mounting unit; 1.6 - file writing unit; 2 - time interval acquisition module; 2.1 - power-on time recording unit; 2.2 - OCSSD pre- and post-loading time point recording unit; 2.3 - power-on-to-loading time interval calculation unit; 2.4 - loading process time interval calculation unit; 3 - power on/off test module; 3.1 - power-on unit; 3.2 - power on/off test completion judging unit; 3.3 - post-power-on duration judging unit; 4 - test verification module; 4.1 - second OCSSD state checking unit; 4.2 - OCSSD state comparison unit; 4.3 - IO verification unit; 4.4 - standard file verification unit.
Detailed Description
In order to make those skilled in the art better understand the technical solutions of the present invention, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, shall fall within the protection scope of the present invention.
Example 1:
As shown in FIG. 1, the present invention provides a method for testing the reliability of an open-channel SSD and its driver, comprising the following steps:
S1. Configuring a test environment: setting up network communication between the BMC (baseboard management controller) of the test node and the auxiliary node, checking the OCSSD state of the test node, partitioning the OCSSD of the test node, and placing a standard file in an OCSSD partition of the test node;
S2. The auxiliary node controls the test node to power on, and obtains a time interval T1 from power-on of the test node until the disk driver starts loading the OCSSD, and a time interval T2 from the start of loading until the disk driver finishes loading the OCSSD;
S3. The auxiliary node controls the test node to complete a preset number of power on/off tests, powering the test node off once it has run for a time interval T after each power-on, where T1 < T < T1 + T2;
S4. Checking the OCSSD state of the test node again, verifying the read/write function of the OCSSD of the test node, and verifying the standard file.
Example 2:
As shown in FIG. 2, the present invention provides a method for testing the reliability of an open-channel SSD and its driver, comprising the following steps:
S1. Configuring a test environment: setting up network communication between the BMC (baseboard management controller) of the test node and the auxiliary node, checking the OCSSD state of the test node, partitioning the OCSSD of the test node, and placing a standard file in an OCSSD partition of the test node; the specific steps are as follows:
S11. Setting up an auxiliary node to work with the test node in normal operation, and placing the BMC of the test node and the auxiliary node on the same local area network;
S12. Checking the state of the OCSSD under test in the test node's operating system, and judging whether that state is normal;
if yes, recording a state log of the OCSSD under test, and proceeding to step S13;
if not, ending;
S13. Dividing the space of the OCSSD under test equally into two physical partitions, a first partition and a second partition;
S14. Creating a file system on the first partition, and mounting the file system to a specified directory; the file system may be EXT4;
S15. Writing a standard file with a set size of 100 GB into the mounted file system, computing the MD5 checksum of the standard file, and saving the checksum; the MD5 checksum of the 100 GB standard file is computed with the md5sum command;
S2. The auxiliary node controls the test node to power on, and obtains a time interval T1 from power-on of the test node until the disk driver starts loading the OCSSD, and a time interval T2 from the start of loading until the disk driver finishes loading the OCSSD; the disk driver takes different amounts of time to load OCSSDs holding different amounts of stored data, so the auxiliary node needs to obtain these time intervals before each test; the specific steps are as follows:
S21. The auxiliary node sends commands to the BMC (baseboard management controller) of the test node through ipmitool to power the test node off and, after waiting for a first time period, for example 10 s, power it on, and records the power-on time point A;
S22. The auxiliary node obtains the time point B at which the disk driver starts loading the OCSSD, and the time point C at which the operating system of the test node reaches the login interface;
S23. The auxiliary node calculates the time interval T1 from power-on of the test node until the disk driver starts loading the OCSSD as T1 = B - A;
S24. The auxiliary node calculates the time interval T2 from the start of loading until the disk driver finishes loading the OCSSD as T2 = C - B;
S3. The auxiliary node controls the test node to complete a preset number of power on/off tests, powering the test node off once it has run for a time interval T after each power-on, where T1 < T < T1 + T2; the specific steps are as follows:
S31. The auxiliary node sends commands to the BMC (baseboard management controller) of the test node through ipmitool to power the test node off and, after waiting for a first time period, for example 10 s, power it on;
S32. The auxiliary node judges whether the preset number of power on/off tests has been completed; the preset number may be set to 100;
if yes, proceeding to step S4;
if not, proceeding to step S33;
S33. The auxiliary node judges whether the time elapsed since the test node was powered on has reached the time interval T, where T1 < T < T1 + T2;
if yes, returning to step S31;
if not, returning to step S33;
S4. Checking the OCSSD state of the test node again, verifying the read/write function of the OCSSD of the test node, and verifying the standard file; the specific steps are as follows:
S41. Checking the state of the tested OCSSD in the test node's operating system, and judging whether that state is normal;
if yes, proceeding to step S42;
if not, ending;
S42. Recording a state log of the tested OCSSD, comparing it with the state log recorded before the test, and judging whether the state logs before and after the test show any abnormality;
S43. Using the FIO tool to issue a set number of random mixed read/write jobs, with a set block size, queue depth, and IO engine, to the second partition of the tested OCSSD, and detecting whether the FIO process behaves abnormally; the block size is set to 1 MB, the queue depth to 128, the IO engine to asynchronous libaio, and the number of jobs to 50;
S44. Verifying the standard file on the first partition against the saved MD5 checksum file; the MD5 file saved for the standard file before the test is checked with the md5sum -c command to verify whether the standard file has changed, and the verification result is saved.
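The FIO settings given for step S43 in this example map onto standard fio options. The sketch below builds such a command line; it is an illustration under assumptions, not the patent's exact invocation. The device path `/dev/nvme0n1p2`, the job name, and the `--direct=1` and `--group_reporting` flags are additions not stated in the patent (direct IO is commonly paired with libaio so that requests are actually queued asynchronously).

```python
def build_fio_command(device, block_size="1M", queue_depth=128,
                      engine="libaio", jobs=50, name="ocssd_randrw"):
    """Build an fio argument list for random mixed read/write on `device`.

    Defaults follow Example 2: 1 MB blocks, queue depth 128,
    libaio engine, 50 jobs.
    """
    return [
        "fio",
        f"--name={name}",            # hypothetical job name
        f"--filename={device}",      # second partition of the tested OCSSD
        "--rw=randrw",               # random mixed reads and writes
        f"--bs={block_size}",
        f"--iodepth={queue_depth}",
        f"--ioengine={engine}",
        f"--numjobs={jobs}",
        "--direct=1",                # assumption: bypass the page cache
        "--group_reporting",         # assumption: one summary for all jobs
    ]


# Hypothetical device path for the second partition.
command = build_fio_command("/dev/nvme0n1p2")
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` on the test node; a non-zero exit status is one simple signal that the FIO process ended abnormally.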
Example 3:
As shown in FIG. 3, the present invention provides a device for testing the reliability of an open-channel SSD and its driver, including:
a test environment configuration module 1, configured to configure a test environment, set up network communication between the BMC of the test node and the auxiliary node, check the OCSSD state of the test node, partition the OCSSD of the test node, and place a standard file in an OCSSD partition of the test node; the test environment configuration module 1 includes:
an auxiliary node setting unit 1.1, configured to set up an auxiliary node to work with the test node in normal operation, and to place the BMC of the test node and the auxiliary node on the same local area network;
a first OCSSD state checking unit 1.2, configured to check the state of the OCSSD under test in the test node's operating system, and to judge whether that state is normal;
an OCSSD state recording unit 1.3, configured to record a state log of the OCSSD under test when its state is normal;
an OCSSD partition unit 1.4, configured to divide the space of the OCSSD under test equally into two physical partitions, a first partition and a second partition;
a file system mounting unit 1.5, configured to create a file system on the first partition and mount the file system to a specified directory;
a file writing unit 1.6, configured to write a standard file of a set size into the mounted file system, compute the MD5 checksum of the standard file, and save the checksum;
a time interval acquisition module 2, configured to control the test node to power on through the auxiliary node, and to obtain a time interval T1 from power-on of the test node until the disk driver starts loading the OCSSD, and a time interval T2 from the start of loading until the disk driver finishes loading the OCSSD; the time interval acquisition module 2 includes:
a power-on time recording unit 2.1, configured to have the auxiliary node send commands to the BMC (baseboard management controller) of the test node through ipmitool to power the test node off and, after waiting for a first time period, power it on, and to record the power-on time point A;
an OCSSD pre- and post-loading time point recording unit 2.2, configured to have the auxiliary node obtain the time point B at which the disk driver starts loading the OCSSD, and the time point C at which the operating system of the test node reaches the login interface;
a power-on-to-loading time interval calculation unit 2.3, configured to calculate, through the auxiliary node, the time interval T1 from power-on of the test node until the disk driver starts loading the OCSSD as T1 = B - A;
a loading process time interval calculation unit 2.4, configured to calculate, through the auxiliary node, the time interval T2 from the start of loading until the disk driver finishes loading the OCSSD as T2 = C - B;
a power on/off test module 3, configured to control the test node, through the auxiliary node, to complete a preset number of power on/off tests, powering the test node off once it has run for a time interval T after each power-on, where T1 < T < T1 + T2; the power on/off test module 3 includes:
a power-on unit 3.1, configured to have the auxiliary node send commands to the BMC of the test node through ipmitool to power the test node off and, after waiting for a first time period, power it on;
a power on/off test completion judging unit 3.2, configured to judge, through the auxiliary node, whether the preset number of power on/off tests has been completed;
a post-power-on duration judging unit 3.3, configured to judge, through the auxiliary node, when the power on/off tests have not been completed, whether the time elapsed since the test node was powered on has reached the time interval T, where T1 < T < T1 + T2;
a test verification module 4, configured to check the OCSSD state of the test node again, verify the read/write function of the OCSSD of the test node, and verify the standard file; the test verification module 4 includes:
a second OCSSD state checking unit 4.1, configured to check the state of the tested OCSSD in the test node's operating system, and to judge whether that state is normal;
an OCSSD state comparison unit 4.2, configured to record a state log of the tested OCSSD, compare it with the state log recorded before the test, and judge whether the state logs before and after the test show any abnormality;
an IO verification unit 4.3, configured to use the FIO tool to issue a set number of random mixed read/write jobs, with a set block size, queue depth, and IO engine, to the second partition of the tested OCSSD, and to detect whether the FIO process behaves abnormally;
a standard file verification unit 4.4, configured to verify the standard file on the first partition against the saved MD5 checksum file.
Although the present invention has been described in detail with reference to the drawings and the preferred embodiments, it is not limited thereto. Those skilled in the art can make various equivalent modifications or substitutions to the embodiments without departing from the spirit and scope of the present invention, and any modification or substitution that a person skilled in the art could readily conceive within the technical scope disclosed herein shall fall within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A method for testing the reliability of an open channel SSD (OCSSD) and its driver, characterized by comprising the following steps:
S1, configuring a test environment: setting the BMC (baseboard management controller) of a test node to communicate with an auxiliary node over the network, checking the OCSSD state of the test node, partitioning the OCSSD of the test node, and placing a standard file in an OCSSD partition of the test node;
S2, the auxiliary node controls the test node to power on, and obtains a time interval T1 from the power-on of the test node to the hard disk driver starting to load the OCSSD and a time interval T2 from the hard disk driver starting to load the OCSSD to the loading being completed;
S3, the auxiliary node controls the test node to complete a preset number of power on/off tests, keeps the test node powered on for a time interval T after each power-on, and then powers it off, wherein T1 < T < T1 + T2;
S4, checking the OCSSD state of the test node again, verifying the read-write function of the OCSSD of the test node, and verifying the standard file.
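The constraint T1 < T < T1 + T2 in step S3 places each power-off inside the window in which the hard disk driver is still loading the OCSSD. A minimal Python sketch of one way to pick and check such a T follows. The function names and the `fraction` parameter are illustrative assumptions and do not appear in the claims, which only require that T lie strictly inside the window.

```python
def pick_hold_time(t1, t2, fraction=0.5):
    """Return a hold time T with T1 < T < T1 + T2.

    fraction selects a point inside the driver-loading window and must be
    strictly between 0 and 1 (an assumption of this sketch).
    """
    if t1 < 0 or t2 <= 0:
        raise ValueError("T1 must be non-negative and T2 must be positive")
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be strictly between 0 and 1")
    return t1 + fraction * t2


def in_loading_window(t, t1, t2):
    """Check the claimed constraint T1 < T < T1 + T2."""
    return t1 < t < t1 + t2
```

Both bounds are exclusive, so T = T1 (loading has not yet started) and T = T1 + T2 (loading has already finished) are rejected.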
2. The method for testing the reliability of the open channel SSD and its driver as claimed in claim 1, wherein step S1 comprises the following steps:
S11, setting up an auxiliary node that works normally in cooperation with the test node, and placing the test node BMC and the auxiliary node on the same local area network;
S12, checking the state of the OCSSD to be tested under the test node system, and judging whether that state is normal;
if yes, recording a state log of the OCSSD to be tested, and entering step S13;
if not, ending;
S13, dividing the space of the OCSSD to be tested equally into two physical partitions, a first partition and a second partition;
S14, creating a file system on the first partition, and mounting the file system to a specified directory;
S15, writing a standard file of a set size into the mounted file system, and computing and saving an MD5 checksum of the standard file.
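Step S15 can be sketched in Python as follows. This is a minimal illustration that assumes the first partition is already mounted. The function names, the use of random data as file content, and the sidecar checksum file are assumptions of this sketch and are not specified in the claims.

```python
import hashlib
import os


def write_standard_file(path, size_bytes, chunk=1 << 20):
    """Write size_bytes of random data to path and flush it to the device."""
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(os.urandom(n))
            remaining -= n
        f.flush()
        os.fsync(f.fileno())


def md5_of_file(path, chunk=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            digest.update(block)
    return digest.hexdigest()


def save_checksum(path, checksum_path):
    """Compute the MD5 of path and store it in checksum_path."""
    checksum = md5_of_file(path)
    with open(checksum_path, "w") as f:
        f.write(checksum + "\n")
    return checksum


def verify_checksum(path, checksum_path):
    """Return True if the file still matches the stored MD5."""
    with open(checksum_path) as f:
        expected = f.read().strip()
    return md5_of_file(path) == expected
```

A call such as `save_checksum("/mnt/ocssd_p1/standard.bin", "standard.md5")` would record the reference value before the power cycling starts; both paths are hypothetical.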
3. The method for testing the reliability of the open channel SSD and its driver as claimed in claim 1, wherein step S2 comprises the following steps:
S21, the auxiliary node sends commands to the BMC (baseboard management controller) of the test node through the ipmitool tool, powers the test node off, powers it on after waiting for a first time period, and records the power-on time point A;
S22, the auxiliary node acquires a time point B at which the hard disk driver starts to load the OCSSD and a time point C at which the test node operating system enters the login interface;
S23, the auxiliary node calculates the time interval T1 from the power-on of the test node to the hard disk driver starting to load the OCSSD as T1 = B - A;
S24, the auxiliary node calculates the time interval T2 from the hard disk driver starting to load the OCSSD to the loading being completed as T2 = C - B.
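The interval arithmetic of steps S23 and S24 can be sketched as below. The function name and the sample timestamps are illustrative assumptions; how the auxiliary node obtains time points B and C is not covered by this sketch.

```python
from datetime import datetime


def boot_intervals(point_a, point_b, point_c):
    """Return (T1, T2) in seconds, where T1 = B - A and T2 = C - B."""
    if not (point_a <= point_b <= point_c):
        raise ValueError("expected time points in the order A <= B <= C")
    t1 = (point_b - point_a).total_seconds()
    t2 = (point_c - point_b).total_seconds()
    return t1, t2


# Illustrative time points only.
a = datetime(2020, 12, 7, 10, 0, 0)    # A: power-on
b = datetime(2020, 12, 7, 10, 1, 30)   # B: driver starts loading the OCSSD
c = datetime(2020, 12, 7, 10, 2, 10)   # C: login interface reached
t1, t2 = boot_intervals(a, b, c)
print(t1, t2)  # 90.0 40.0
```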
4. The method for testing the reliability of the open channel SSD and its driver as claimed in claim 1, wherein step S3 comprises the following steps:
S31, the auxiliary node sends commands to the BMC (baseboard management controller) of the test node through the ipmitool tool, powers the test node off, and powers it on after waiting for a first time period;
S32, the auxiliary node judges whether the preset number of power on/off tests has been completed;
if yes, go to step S4;
if not, go to step S33;
S33, the auxiliary node judges whether the time elapsed since the test node was powered on has reached the time interval T, wherein T1 < T < T1 + T2;
if yes, return to step S31;
if not, return to step S33.
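The loop of steps S31 to S33 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the `lanplus` interface, the `chassis power off` and `chassis power on` subcommands, and all function and parameter names are choices of this sketch, since the claims only state that ipmitool is used to control power. The polling in step S33 is simplified to a single sleep of length T.

```python
import subprocess
import time


def ipmi_power(host, user, password, action, run=subprocess.run):
    """Send a chassis power command ("on" or "off") to the test node BMC."""
    cmd = ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-P", password,
           "chassis", "power", action]
    run(cmd, check=True)


def power_cycle_test(host, user, password, cycles, t1, t2, t, off_wait,
                     run=subprocess.run, sleep=time.sleep):
    """Run `cycles` power on/off cycles, cutting power T seconds after power-on."""
    if not (t1 < t < t1 + t2):
        raise ValueError("T must satisfy T1 < T < T1 + T2")
    for i in range(cycles):
        # S31: power off, wait for the first time period, power on.
        ipmi_power(host, user, password, "off", run)
        sleep(off_wait)
        ipmi_power(host, user, password, "on", run)
        # S32: after the last cycle, leave the node on and proceed to S4.
        if i + 1 == cycles:
            break
        # S33: keep the node on for T before the next power-off.
        sleep(t)
```

The `run` and `sleep` parameters exist only so the loop can be exercised without a real BMC.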
5. The method for testing the reliability of the open channel SSD and its driver as claimed in claim 1, wherein step S4 comprises the following steps:
S41, checking the state of the tested OCSSD under the test node system, and judging whether that state is normal;
if yes, go to step S42;
if not, ending;
S42, recording a state log of the tested OCSSD, comparing it with the OCSSD state log recorded before the test, and judging whether the state log shows any abnormality between before and after the test;
S43, issuing, through the FIO tool, random mixed read-write tasks with a set file block size, a set queue depth, a set IO engine and a set number of jobs to the second partition of the tested OCSSD, and detecting whether the FIO process is abnormal;
S44, verifying the standard file on the first partition against the saved MD5 checksum.
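Step S43 can be sketched as a small wrapper around the fio command line. The mapping of the claimed settings to `--bs`, `--iodepth`, `--ioengine` and `--numjobs`, the extra options `--direct=1`, `--time_based` and `--runtime`, the job name, and the use of a non-zero exit code as the sign of an abnormal FIO process are all assumptions of this sketch.

```python
import subprocess


def build_fio_command(device, block_size, queue_depth, io_engine, num_jobs,
                      runtime_s=60):
    """Build a random mixed read-write fio job against a raw partition."""
    return [
        "fio",
        "--name=ocssd_randrw",
        f"--filename={device}",
        "--rw=randrw",                 # random mixed reads and writes
        f"--bs={block_size}",          # set file block size
        f"--iodepth={queue_depth}",    # set queue depth
        f"--ioengine={io_engine}",     # set IO engine
        f"--numjobs={num_jobs}",       # set number of tasks
        "--direct=1",
        "--time_based",
        f"--runtime={runtime_s}",
        "--group_reporting",
    ]


def fio_ok(cmd, run=subprocess.run):
    """Return True if the fio process exits with status 0."""
    return run(cmd).returncode == 0
```

For example, `build_fio_command("/dev/nvme0n1p2", "4k", 32, "libaio", 4)` targets a hypothetical second partition; running it overwrites data on that partition.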
6. A device for testing the reliability of an open channel SSD (OCSSD) and its driver, comprising:
a test environment configuration module (1), used for configuring a test environment, setting the test node BMC to communicate with the auxiliary node over the network, checking the OCSSD state of the test node, partitioning the OCSSD of the test node, and placing a standard file in an OCSSD partition of the test node;
a time interval acquisition module (2), used for controlling the test node to power on through the auxiliary node, and acquiring a time interval T1 from the power-on of the test node to the hard disk driver starting to load the OCSSD and a time interval T2 from the hard disk driver starting to load the OCSSD to the loading being completed;
a power on/off test module (3), used for controlling the test node, through the auxiliary node, to complete a preset number of power on/off tests, keeping the test node powered on for a time interval T after each power-on, and then powering it off, wherein T1 < T < T1 + T2;
and a test verification module (4), used for checking the OCSSD state of the test node again, verifying the read-write function of the OCSSD of the test node, and verifying the standard file.
7. The open channel SSD and driver reliability testing device of claim 6, wherein the test environment configuration module (1) comprises:
an auxiliary node setting unit (1.1), used for setting up an auxiliary node that works normally in cooperation with the test node, and placing the test node BMC and the auxiliary node on the same local area network;
a first OCSSD state checking unit (1.2), used for checking the state of the OCSSD to be tested under the test node system and judging whether that state is normal;
an OCSSD state recording unit (1.3), used for recording a state log of the OCSSD to be tested when its state is normal;
an OCSSD partitioning unit (1.4), used for dividing the space of the OCSSD to be tested equally into two physical partitions, a first partition and a second partition;
a file system mounting unit (1.5), used for creating a file system on the first partition and mounting the file system to a specified directory;
and a file writing unit (1.6), used for writing a standard file of a set size into the mounted file system, and computing and saving an MD5 checksum of the standard file.
8. The open channel SSD and driver reliability testing device of claim 6, wherein the time interval acquisition module (2) comprises:
a power-on time recording unit (2.1), used for having the auxiliary node send commands to the BMC of the test node through the ipmitool tool, power the test node off, power it on after waiting for a first time period, and record the power-on time point A;
an OCSSD loading start and end time point recording unit (2.2), used for having the auxiliary node acquire a time point B at which the hard disk driver starts to load the OCSSD and a time point C at which the test node operating system enters the login interface;
a power-on-to-load time interval calculation unit (2.3), used for calculating, through the auxiliary node, the time interval T1 from the power-on of the test node to the hard disk driver starting to load the OCSSD as T1 = B - A;
and a loading process time interval calculation unit (2.4), used for calculating, through the auxiliary node, the time interval T2 from the hard disk driver starting to load the OCSSD to the loading being completed as T2 = C - B.
9. The open channel SSD and driver reliability testing device of claim 6, wherein the power on/off test module (3) comprises:
a power-on unit (3.1), used for having the auxiliary node send commands to the BMC of the test node through the ipmitool tool, power the test node off, and power it on after waiting for a first time period;
a power on/off test completion judging unit (3.2), used for judging, through the auxiliary node, whether the preset number of power on/off tests has been completed;
and a duration time interval judging unit (3.3), used for judging, through the auxiliary node and when the power on/off test is not finished, whether the time elapsed since the test node was powered on has reached the time interval T, wherein T1 < T < T1 + T2.
10. The open channel SSD and driver reliability testing device of claim 6, wherein the test verification module (4) comprises:
a second OCSSD state checking unit (4.1), used for checking the state of the tested OCSSD under the test node system and judging whether that state is normal;
an OCSSD state comparing unit (4.2), used for recording a state log of the tested OCSSD, comparing it with the OCSSD state log recorded before the test, and judging whether the state log shows any abnormality between before and after the test;
an IO verification unit (4.3), used for issuing, through the FIO tool, random mixed read-write tasks with a set file block size, a set queue depth, a set IO engine and a set number of jobs to the second partition of the tested OCSSD, and detecting whether the FIO process is abnormal;
and a standard file verification unit (4.4), used for verifying the standard file on the first partition against the saved MD5 checksum.
CN202011417693.6A 2020-12-07 2020-12-07 Open channel ssd and driver reliability testing method and device thereof Active CN112463495B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011417693.6A CN112463495B (en) 2020-12-07 2020-12-07 Open channel ssd and driver reliability testing method and device thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011417693.6A CN112463495B (en) 2020-12-07 2020-12-07 Open channel ssd and driver reliability testing method and device thereof

Publications (2)

Publication Number Publication Date
CN112463495A CN112463495A (en) 2021-03-09
CN112463495B true CN112463495B (en) 2022-07-19

Family

ID=74802061

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011417693.6A Active CN112463495B (en) 2020-12-07 2020-12-07 Open channel ssd and driver reliability testing method and device thereof

Country Status (1)

Country Link
CN (1) CN112463495B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10105425A (en) * 1996-08-23 1998-04-24 Samsung Electron Co Ltd Method for inspecting computer system by using hard disk
CN110825569A (en) * 2019-10-13 2020-02-21 苏州浪潮智能科技有限公司 Hard disk stability test method and test system
CN110992992A (en) * 2019-10-31 2020-04-10 苏州浪潮智能科技有限公司 Hard disk test method, device and storage medium

Also Published As

Publication number Publication date
CN112463495A (en) 2021-03-09

Similar Documents

Publication Publication Date Title
CN108646982B (en) Automatic data restoration method and device based on UBIFS
US7356744B2 (en) Method and system for optimizing testing of memory stores
CN110825569B (en) Hard disk stability test method and test system
CN103019920A (en) Complete machine non-power-off startup and shutdown method based on Linux system
CN114880177A (en) Method and device for testing complete machine abnormal power failure of solid state disk and computer equipment
CN112463495B (en) Open channel ssd and driver reliability testing method and device thereof
CN116719668A (en) Automatic testing method and testing system for RAID (redundant array of independent disks) of NVMe (non-volatile media management) solid-state disk group
CN110187922A (en) It is arranged and verifies the method, apparatus, equipment and storage medium of BIOS parameter
CN111966543B (en) System and method for testing stability of storage OSES system
CN109086162B (en) Memory diagnosis method and device
CN116705138A (en) Solid state disk testing method and storage medium
CN116662050A (en) Error injection support function verification method, device, terminal and medium
CN115509815A (en) Method and device for protecting data in server
CN110826114B (en) User data testing method and device based on SSD after safe erasure
CN115470040A (en) Method, device, equipment and medium for testing re-deleted fingerprint threshold based on snapshot
CN114420194A (en) Test method and device for power failure protection function of solid state disk and computer equipment
CN112486717A (en) Method, system, terminal and storage medium for verifying consistency of disk data
CN111985010A (en) Interception verification method and device for notebook hard disk, computer equipment and storage medium
CN116758973B (en) Testing method for unexpected power failure data verification of enterprise-level solid state disk
CN115629925A (en) Method and device for testing data verification in disk, computer equipment and storage medium
CN112667452B (en) Disk array testing method, system and medium
CN112835742B (en) Data parameter backup and recovery method
CN112463472B (en) Automatic testing method and device for disk array, electronic equipment and storage medium
CN114385574A (en) Log processing method and device, electronic equipment and storage medium
CN111241013B (en) Method and system for realizing NVMe equipment configuration based on pooling

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant