CN115035943A - Storage system and self-detection method and system thereof - Google Patents

Storage system and self-detection method and system thereof

Info

Publication number
CN115035943A
Authority
CN
China
Prior art keywords
test
detection
storage system
self
test item
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210760624.8A
Other languages
Chinese (zh)
Inventor
何亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yangtze Memory Technologies Co Ltd
Original Assignee
Yangtze Memory Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yangtze Memory Technologies Co Ltd
Priority to CN202210760624.8A
Publication of CN115035943A

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C29/00 Checking stores for correct operation; Subsequent repair; Testing stores during standby or offline operation
    • G11C29/04 Detection or location of defective memory elements, e.g. cell construction details, timing of test signals
    • G11C29/08 Functional testing, e.g. testing during refresh, power-on self testing [POST] or distributed testing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/22 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/2284 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing by power-on test, e.g. power-on self test [POST]

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)

Abstract

Embodiments of the application provide a storage system, a self-detection method thereof, and a self-detection system. The self-detection method of the storage system comprises the following steps: detecting at least one test item of the storage system; judging whether the test item has an error; and, if so, sending an alarm to the host. Because an alarm mechanism is provided, when a test item has an error the test failure result can be fed back to the host promptly, which improves the emergency response to SSD self-detection, keeps the host informed of the operational health of the SSD in time, and helps ensure data safety.

Description

Storage system and self-detection method and system thereof
Technical Field
The invention relates to the technical field of semiconductors, in particular to a storage system, a self-detection method and a self-detection system thereof.
Background
With the rapid development of the Internet of Things, there is an urgent need for secure storage and transmission of information data, especially in the field of SSD (Solid State Drive) storage. A solid state drive is composed of a control unit and storage units (a FLASH chip and a DRAM chip). Consumer-grade SSDs store users' private data, and enterprise-grade SSDs store large amounts of confidential enterprise data; such data is private and important to individuals and enterprises alike. However, if the user does not have good and timely knowledge of the operational health of the SSD, there is a risk of losing data.
The SSD device self-test operation is a diagnostic test sequence for testing the functionality and integrity of the SSD controller and may test NAND media associated with the namespace. Therefore, the user can well know the running health condition of the SSD device through self-detection of the SSD device, so that the SSD can be maintained in time, user data can be safely transferred, and the purpose of protecting the privacy of the user is achieved.
Disclosure of Invention
In view of the above, embodiments of the present invention provide a storage system, and a self-detection method and a self-detection system thereof.
The embodiment of the application provides a self-detection method of a storage system, which comprises the following steps:
detecting at least one test item of the storage system; judging whether the test item has an error; if yes, an alarm is sent to the host.
In the above scheme, judging whether the test item has an error is performed after the detection of each test item is finished.
In the above scheme, a plurality of test items of the storage system are detected; if a test item is judged to have an error, a first detection result is saved, wherein the first detection result is the detection result of the test items whose detection has been completed.
In the above scheme, a plurality of test items of the storage system are detected; if a test item is judged to have an error, detection of the other test items that have not yet been detected is stopped.
In the above scheme, the method further comprises: judging whether the self-detection time is reached, and if so, executing the detection of the test item.
In the above scheme, a plurality of test items of the storage system are detected; before the step of judging whether the test item has an error, the method further comprises: judging whether all of the plurality of test items have been detected.
In the above scheme, a plurality of test items of the storage system are detected, and the detection of the plurality of test items is performed in sequence.
An embodiment of the present invention further provides a self-test system for a storage system, including: the first control module is used for controlling the detection of at least one test item of the storage system; the first judging module is used for judging whether the test item has errors or not; and the alarm module is used for initiating an alarm to the host.
In the above solution, the first control module is configured to control detection of a plurality of test items of the storage system; the self-detection system further comprises a first storage module, wherein the first storage module is used for storing a first detection result, and the first detection result is the detection result of the test item after detection is finished.
In the above scheme, the system further comprises a second judging module, and the second judging module is configured to judge whether the self-detection time is reached.
In the above solution, the first control module is configured to control detection of a plurality of test items of the storage system; the self-detection system further comprises a third judging module, and the third judging module is used for judging whether all the test items have been detected.
An embodiment of the present invention further provides a storage system, including a control element and a memory, the control element comprising: a first control module for controlling detection of at least one test item of the storage system; a first judging module for judging whether the test item has an error; and an alarm module for initiating an alarm to the host.
In the above solution, the control element further includes a second determining module, and the second determining module is configured to determine whether the time for detecting the test item is reached.
In the above solution, the first control module is configured to control detection of a plurality of test items of the storage system; the control element further comprises a third judging module, and the third judging module is used for judging whether the plurality of test items are all detected.
In the above scheme, the memory includes first firmware configured with parameters that define functions of the storage system.
In the above solution, the parameters include a first parameter, and the first parameter indicates whether periodic execution of the detection of the test item is supported.
In the above solution, the parameters further include a second parameter, and the second parameter is the time at which the detection of the test item is executed.
In the above scheme, the parameters further include a third parameter, and the third parameter indicates whether real-time storage of the detection result is supported.
In the above solution, the first control module is configured to control detection of a plurality of test items of the storage system; the parameters further include a fourth parameter, and the fourth parameter indicates whether detection of the other test items is stopped once a test item has an error.
In the above scheme, the memory includes a NAND unit, and the first firmware is located in the NAND unit.
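The four firmware parameters described above can be pictured as a small configuration record. The following is only a minimal sketch: the class name `SelfTestFirmwareParams`, the field names, and the choice of hours as the unit of the second parameter are assumptions made for illustration and do not come from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SelfTestFirmwareParams:
    """Hypothetical record of the four parameters held in the first firmware."""

    periodic_self_test: bool       # first parameter: periodic detection supported?
    self_test_interval_hours: int  # second parameter: when to run (unit is assumed)
    save_results_realtime: bool    # third parameter: store results in real time?
    stop_on_error: bool            # fourth parameter: stop remaining items on error?


# Example values only; a real device would read these from its firmware.
params = SelfTestFirmwareParams(
    periodic_self_test=True,
    self_test_interval_hours=24 * 7,
    save_results_realtime=True,
    stop_on_error=False,
)
```

Making the record immutable reflects the idea that the parameters define the functions of the storage system rather than run-time state, although the application does not say whether they can be changed in the field.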
Embodiments of the invention provide a storage system, and a self-detection method and a self-detection system thereof. The self-detection method of the storage system comprises the following steps: detecting at least one test item of the storage system; judging whether the test item has an error; and, if so, sending an alarm to the host. Because an alarm mechanism is provided, when a test item has an error the test failure result can be fed back to the host promptly, which improves the emergency response to SSD self-detection, keeps the host informed of the operational health of the SSD in time, and helps ensure data safety.
Drawings
FIG. 1 shows a standard device self-test method;
FIG. 2 shows a device self-test method according to an embodiment of the present application;
FIG. 3 shows another device self-test method according to an embodiment of the present application;
FIG. 4 shows another device self-test method according to an embodiment of the present application;
FIG. 5 is a block diagram of a self-test system according to an embodiment of the present application;
FIG. 6 is a block diagram of another self-test system according to an embodiment of the present application;
FIG. 7 is a block diagram of a storage system according to an embodiment of the present application;
FIG. 8 is a block diagram of another storage system according to an embodiment of the present application.
Detailed Description
Various embodiments of the present invention are described in more detail below with reference to the accompanying drawings. Elements and features of embodiments of the invention may be variously configured or arranged to form additional embodiments that are variations of any of the disclosed embodiments. Accordingly, embodiments of the present invention are not limited to the embodiments set forth herein. Rather, the described embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the embodiments to those skilled in the art. It should be noted that references to "an embodiment," "another embodiment," and the like do not necessarily refer to only one embodiment, and different references to any such phrases are not necessarily referring to the same embodiment. It will be understood that, although the terms first, second, third, etc. may be used herein to identify various elements, these elements are not limited by these terms. These terms are used to distinguish one element from another element having the same or similar designation. Thus, a first element in one embodiment may also be referred to as a second or third element in another embodiment without departing from the spirit and scope of the embodiments of the present invention.
The drawings are not necessarily to scale and, in some instances, may be exaggerated in scale to clearly illustrate features of embodiments. When an element is referred to as being connected or coupled to another element, it will be understood that the former may be directly connected or coupled to the latter, or may be electrically connected or coupled to the latter via one or more intervening elements therebetween. In addition, it will also be understood that when an element is referred to as being "between" two elements, it can be the only element between the two elements, or one or more intervening elements may also be present.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. The articles "a" and/or "an" as used in the embodiments of the present invention and the appended claims should be construed to mean "one or more" unless specified otherwise or clear from context to be directed to a singular form. It will be further understood that the terms "comprises," "comprising," "includes" and "including" when used in connection with embodiments of the invention, specify the presence of stated elements and do not preclude the presence or addition of one or more other elements. As used in connection with embodiments of the present invention, the term "and/or" includes any and all combinations of one or more of the associated listed items. Unless defined otherwise, all terms including technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs in view of the present embodiments. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of embodiments of the present invention and the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, and the present invention may be practiced without some or all of these specific details. In other instances, well known process structures and/or processes have not been described in detail in order to not unnecessarily obscure the present invention. It will also be understood that, in some instances, features or elements described in connection with one embodiment may be used alone or in combination with other features or elements of another embodiment, unless specifically stated otherwise, as would be apparent to one skilled in the relevant art. Hereinafter, various embodiments of the present invention are described in detail with reference to the accompanying drawings. The following description focuses on details to facilitate an understanding of embodiments of the invention. Well-known technical details may be omitted so as not to obscure the features and aspects of the embodiments of the present invention.
For convenience of description, the storage system is described by taking a solid state disk as an example.
Currently, device self-test is defined in version 1.4 of the NVMe (Non-Volatile Memory Express) protocol. The device self-test operation is a diagnostic test sequence that tests the integrity and functionality of the controller and may include testing of the media associated with a namespace; it comes in two forms, the short device self-test operation and the extended device self-test operation. Device self-test is a standard NVMe command, and the SSD performs the operation only when the host actively initiates the command. Each standard device self-test command requires testing of multiple test items, which include: RAM (Random Access Memory) testing, SMART (Self-Monitoring, Analysis, and Reporting Technology) testing, volatile memory backup testing, metadata validation, NAND read/write testing by the controller, data integrity testing (extended device self-test operation only), media testing, and media lifetime testing.
Specifically, the RAM test means writing a test pattern into the RAM and then reading it back and comparing it with the data originally written; the SMART test means checking whether any critical warning bit indicating SMART or health status in the SMART/health information log is set to "1"; the volatile memory backup test means verifying the operating condition of the volatile memory backup solution (e.g., measuring backup power charge and/or discharge times); metadata validation means validating all copies of the metadata; the controller's NAND read/write test means writing, reading, and comparing a reserved area of each NVM while ensuring that every read/write channel of the controller is exercised; the data integrity test means performing background housekeeping tasks and giving priority to operations that enhance the integrity of the stored data; the media test means performing random reads from every available good physical block; and the media lifetime test means evaluating whether the SSD is fit to continue write operations.
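The write, read-back, and compare pattern of the RAM test can be sketched as follows. This is an illustration under stated assumptions, not the device's firmware: `FakeRam`, its `stuck_at` fault model, and the test pattern bytes are all hypothetical stand-ins for real hardware.

```python
from typing import Optional


class FakeRam:
    """Hypothetical RAM model; `stuck_at` marks a cell that always reads 0x00."""

    def __init__(self, size: int, stuck_at: Optional[int] = None):
        self._cells = bytearray(size)
        self._stuck_at = stuck_at

    def write(self, offset: int, data: bytes) -> None:
        self._cells[offset:offset + len(data)] = data

    def read(self, offset: int, length: int) -> bytes:
        out = bytearray(self._cells[offset:offset + length])
        if self._stuck_at is not None and offset <= self._stuck_at < offset + length:
            out[self._stuck_at - offset] = 0x00  # simulate the faulty cell
        return bytes(out)


def ram_test(ram: FakeRam, offset: int = 0, pattern: bytes = b"\xAA\x55\xAA\x55") -> bool:
    """Write a test pattern, read it back, and compare. True means the item passed."""
    ram.write(offset, pattern)
    return ram.read(offset, len(pattern)) == pattern


healthy_ok = ram_test(FakeRam(16))
faulty_ok = ram_test(FakeRam(16, stuck_at=1))
```

The other test items follow the same shape of "perform a check, return pass or fail", which is what lets the later flows treat every test item uniformly.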
Referring to FIG. 1, FIG. 1 illustrates the device self-test method of version 1.4 of the NVMe protocol. The host initiates a command, and it is then judged whether the command is a device self-test command. If so, the detection command, which in this case is the device self-test command, is executed. As previously mentioned, each standard device self-test command requires testing of multiple test items, so executing the detection command involves a multi-step test. Taking the extended device self-test operation as an example: the first step is the RAM test, the second step is the SMART test, the third step is the volatile memory backup test, the fourth step is metadata validation, the fifth step is the controller's NAND read/write test, the sixth step is the data integrity test, the seventh step is the media test, the eighth step is the media lifetime test, and the ninth step is the SMART test again. A second detection result (so named to distinguish it from the first detection result introduced later) is saved only after all the test items in the self-test command have been tested. It is understood that the second detection result is the detection result of all the test items.
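The baseline flow of FIG. 1 can be sketched as a loop that runs every step and only hands back a result at the very end. The step labels below paraphrase the nine steps listed above; the function name `run_standard_self_test` and the `run_item` callback are assumptions introduced for this sketch.

```python
from typing import Callable, List, Tuple

# Step order of the extended operation as listed above; labels are paraphrased.
EXTENDED_STEPS = [
    "RAM test",
    "SMART check",
    "volatile memory backup",
    "metadata validation",
    "NAND read/write",
    "data integrity",
    "media check",
    "media lifetime",
    "SMART check",
]


def run_standard_self_test(run_item: Callable[[str], bool]) -> List[Tuple[str, bool]]:
    """Run every step in order; results become available only after the last step."""
    results = []
    for name in EXTENDED_STEPS:
        results.append((name, run_item(name)))  # no early exit and no alarm
    return results  # corresponds to the "second detection result"


# A failing metadata step does not shorten the run in the standard flow.
second_result = run_standard_self_test(lambda name: name != "metadata validation")
```

In this sketch the failure at step four is only visible once all nine steps have run, which is the behavior the following paragraphs identify as a drawback.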
As can be seen from the device self-test method in FIG. 1, the whole process is initiated by the host. The host does not know whether the test produced an error; it has to issue a separate command to obtain the test result and then analyze that result. This approach has the following disadvantages. First, the host cannot obtain a failing test result in time, so more important data may be stored on an SSD that already has a problem, which creates a risk of data loss. Second, every self-test of the SSD must be actively initiated by the host, so the dependence on the host is very strong. Finally, regardless of whether a test item has an error, all the test items in the detection command must be completed, so the test time is long and the overall execution efficiency of the device self-test is low.
In order to solve at least one problem caused by the standard equipment self-detection method, the application provides a self-detection method of a storage system, which comprises the following steps: detecting at least one test item of the storage system; judging whether the test item has errors or not; if yes, an alarm is sent to the host.
It is understood that a test item has an error when it does not satisfy the conditions of a healthy SSD. For example, if the test item is the RAM test, the test item is considered to have an error when an uncorrectable error occurs during detection, or when the data read back after writing a test pattern into the RAM does not match the data originally written. If the test item is judged to have an error, the alarm mechanism is started and an alarm is sent to the host. It should be explained that the alarm mechanism in the embodiments of the present application may be an asynchronous event, a buzzer, or any other mechanism that can serve as a reminder to the host.
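One way to picture the asynchronous-event form of the alarm is a queue that the SSD posts to and the host reads from. The sketch below is purely illustrative: the queue, the event fields, and the function names are hypothetical and are not taken from the application or from the NVMe specification.

```python
import queue

# Hypothetical channel carrying alarms from the SSD to the host.
host_events = queue.Queue()


def raise_alarm(test_item: str, reason: str) -> None:
    """Post an asynchronous event so the host does not have to poll for results."""
    host_events.put({"type": "self_test_failure", "item": test_item, "reason": reason})


def ram_item_has_error(written: bytes, read_back: bytes) -> bool:
    """Return True if the RAM test item is in error (read-back data mismatch)."""
    return written != read_back


if ram_item_has_error(b"\xAA\x55", b"\xAA\x54"):
    raise_alarm("RAM test", "data mismatch")

event = host_events.get_nowait()  # what the host side would receive
```

The point of the design is the direction of the notification: the device pushes the failure to the host instead of waiting for the host to request the result.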
Referring to FIG. 2, in some embodiments, when the command initiated by the host is judged to be a device self-test command, the device self-test command is executed, and executing it includes detecting at least one test item of the storage system. Therefore, when it is determined in FIG. 2 that the command initiated by the host is the device self-test command, the test item is detected, and whether the test item has an error is then judged.
It is understood that, in some embodiments, the device self-test command includes detecting one or more test items, and judging whether a test item has an error may be performed after the detection of each test item is finished. For example, in some embodiments, the device self-test command includes detecting a single test item, and judging whether that test item has an error has to wait until its detection is finished.
In some embodiments, the device self-test command includes detecting a plurality of test items, and judging whether a test item has an error may be performed after each test item is detected, so that detecting a test item and judging whether it has an error alternate. If the test item is found to have an error, an alarm is sent to the host. If the test item has no error, the next test item is detected immediately, and once that detection is finished, whether the test item just detected has an error is judged in turn. These steps are repeated until the device self-test has been executed completely.
In other optional embodiments, whether any of the test items has an error may be judged after all the test items have been detected. It may be understood that, in this case, before judging whether a test item has an error, it is also necessary to judge whether all the test items have been detected, that is, whether the device self-test command has been completed.
In the embodiment of the application, due to the fact that the alarm mechanism is arranged, when the test item is wrong, the SSD can timely feed the test failure result back to the host, emergency response to SSD self-detection is improved, meanwhile, the healthy running condition of the SSD can be timely known, and data safety is guaranteed.
With continued reference to FIG. 2, in some optional embodiments, judging whether a test item has an error is performed after the detection of each test item is finished, so that detection and judgment alternate. That is, each time a test item is detected, the operation of judging whether it has an error is performed once the detection is finished. In some embodiments, as soon as the test item just detected is found to have an error, the alarm mechanism can be started promptly to send an alarm to the host. With this arrangement, the host learns the running state of the SSD in time and can respond quickly, which helps keep the data safe. In some optional embodiments, the device self-test command may include detecting a plurality of test items, and if a test item is judged to have an error, a first detection result is saved, where the first detection result is the detection result of the test items whose detection has been completed.
For example, if the device self-test command includes detecting 9 test items and the 3rd test item is judged to have an error after its detection is finished, an alarm is sent to the host and the detection results of the 3 test items already detected are saved promptly. The saved detection result is the first detection result. With this arrangement, the host can respond to the alarm more quickly, obtain the first detection result in time, judge the health state of the SSD from it, and decide promptly and accurately whether to back up the data on the SSD or to stop using the SSD, which helps keep the data safe.
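The nine-item example can be traced with a short sketch. It assumes the variant in which detection continues after the error; taking the snapshot only at the first error is also an assumption of this sketch, and the names `run_and_snapshot`, `run_item`, and `alarm` are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Result = Tuple[int, bool]  # (test item number, passed?)


def run_and_snapshot(
    item_count: int,
    run_item: Callable[[int], bool],
    alarm: Callable[[int], None],
) -> Tuple[Optional[List[Result]], List[Result]]:
    """Detect items one by one; on an error, alarm the host and save a snapshot."""
    completed: List[Result] = []
    first_result: Optional[List[Result]] = None
    for number in range(1, item_count + 1):
        passed = run_item(number)           # detect the test item
        completed.append((number, passed))
        if not passed:                      # judge right after detection
            alarm(number)
            if first_result is None:
                first_result = list(completed)  # items finished so far
    return first_result, completed


alarms: List[int] = []
first, second = run_and_snapshot(9, lambda n: n != 3, alarms.append)
```

Here `first` holds the three results available at the moment of the alarm, while `second` holds all nine results once the run is over.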
It should be noted that, as in the above example, if the device self-test command includes detecting a plurality of test items and one of the intermediate test items is found to have an error, whether to continue detecting the remaining test items is optional.
With continued reference to FIG. 2, a test item in the device self-test command is first detected and whether it has an error is judged; if it has an error, the alarm mechanism is started and an alarm is sent to the host. It is then judged whether the device self-test command has been executed completely, and if not, the next test item of the storage system is detected. These steps are repeated until the device self-test command has been executed completely, at which point detection stops and a second detection result is saved.
It is understood that the second detection result is the detection result of all the test items in the device self-test command. It should be explained that, when one of the test items has an error, only the alarm mechanism is started; the test items in the device self-test command that have not yet been detected are still detected, i.e., the execution of the device self-test command is not terminated because an intermediate test item has an error. With this arrangement, the host can learn the health state of the SSD in time and can also obtain the complete self-test result, giving it a more comprehensive view of the running state of the SSD.
In other optional embodiments, referring to FIG. 3, whether to perform the device self-test is first determined according to the command sent by the host; if the command is a device self-test command, its execution is started, where the device self-test command may include detecting a plurality of test items of the storage system. Optionally, the plurality of test items may be detected in sequence. If a test item is judged to have an error, execution of the device self-test command is stopped and the detection result is saved; the detection result saved in this case is the first detection result. If other test items in the device self-test command have not yet been detected at that moment, their detection is not carried out. If the test item is judged to have no error, it is then judged whether the device self-test command has been executed completely; if not, execution continues, and these steps are repeated until the device self-test command has been executed completely, in which case the detection result saved is the second detection result.
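The stop-on-error flow of FIG. 3 can be sketched as a loop with an early return. The function name, the item labels, and the returned "completed" flag are assumptions made for this illustration.

```python
from typing import Callable, List, Tuple


def run_stop_on_error(
    items: List[str], run_item: Callable[[str], bool]
) -> Tuple[List[Tuple[str, bool]], bool]:
    """Detect items in sequence and stop at the first error.

    Returns (saved results, whether the whole self-test command completed).
    """
    saved = []
    for name in items:
        passed = run_item(name)
        saved.append((name, passed))
        if not passed:
            return saved, False  # first detection result; remaining items skipped
    return saved, True  # second detection result; every item was detected


items = ["RAM test", "SMART check", "metadata validation", "media check"]
partial, completed = run_stop_on_error(items, lambda name: name != "SMART check")
full, all_done = run_stop_on_error(items, lambda name: True)
```

With the failing SMART check, only two of the four items are executed, which is where the time saving described for this embodiment comes from.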
It should be explained why, in FIG. 2 and FIG. 3, the test item is detected once it is determined that the device self-test is required: when the command given by the host is judged to be the device self-test command, execution of that command is started, and because the command includes detection of one or more test items, the test items in it are necessarily detected after execution starts. The first step of executing the device self-test command is not limited to detecting a test item; other steps may be performed before a test item is detected.
By the arrangement, if the test item is detected to be wrong, the subsequent self-detection test step is terminated, so that the purpose of SSD detection is achieved, the detection of subsequent unnecessary test items is reduced, the test time of the self-detection of the equipment is shortened, and the overall execution efficiency of the self-detection of the equipment is improved.
In some optional embodiments, referring to FIG. 4, before the test item is detected, the method further includes: judging whether the time for the device self-test is reached, and if so, executing the detection of the test item.
It should be explained that, since the device self-test command includes detecting at least one test item of the storage system, once the time for the device self-test is reached, the time for detecting the test item has also arrived. Therefore, whether the time for the device self-test is reached is judged, and if so, the detection of the test item is executed. It will be appreciated that, for each execution of the device self-test, the operation of judging whether the self-test time is reached needs to be performed only once. For example, while the first device self-test is being executed, the operation of judging whether the time of the second device self-test is reached is not started; it is performed only after the first device self-test has finished.
It should be further explained that the time for the device self-test may be set as required, for example once a week or once a month.
With the above arrangement, the SSD may determine whether the set time is reached to periodically perform the device self-test operation, and start to execute the test command when the set time is reached. The whole test operation can be carried out periodically, the SSD can be well detected periodically without depending on a command initiated by the host, and the dependence of the SSD on the host is reduced.
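The periodic trigger can be reduced to a single time comparison. The sketch below assumes the period is counted from the end of the previous self-test and uses wall-clock `datetime` values; a real controller would more likely use a firmware timer, and the function name is hypothetical.

```python
from datetime import datetime, timedelta


def self_test_due(last_finished: datetime, now: datetime, period: timedelta) -> bool:
    """Return True once at least one full period has elapsed since the last run."""
    return now - last_finished >= period


weekly = timedelta(weeks=1)
last = datetime(2024, 1, 1, 0, 0)
due_early = self_test_due(last, datetime(2024, 1, 5), weekly)    # four days later
due_on_time = self_test_due(last, datetime(2024, 1, 8), weekly)  # seven days later
```

Because the SSD evaluates this condition itself, no host command is needed to start the periodic run.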
Referring to fig. 5, the present application further provides a self-test system of a storage system, including: the first control module is used for controlling the detection of at least one test item of the storage system; the first judgment module is used for judging whether the test item has errors or not; and the alarm module is used for initiating an alarm to the host.
It should be explained that the first control module is used for controlling the detection of the at least one test item of the storage system, which can be understood as: the first control module is used for controlling the starting and stopping of the self-detection of the equipment. Since device self-test involves testing one or more test items of the storage system, starting or stopping device self-test may also be understood as starting or stopping testing of one or more test items. The alarm module can be flexibly set by using the structure or program of the SSD, and can also be set by additionally adding some firmware.
Because the alarm module is provided, when the first judgment module determines that a test item has an error, the test failure result can be fed back to the host in time. This improves the emergency response to the SSD self-detection, allows the health of the SSD to be known in time, and ensures the safety of data.
In some optional embodiments, referring to fig. 6, according to the different apparatus self-test methods, when the apparatus self-test includes testing a plurality of test items of the storage system, the self-test system of the storage system may further include a first storage module, where the first storage module is configured to store a first test result, and the first test result is a test result of a test item that has completed testing.
It is understood that, if a test item is in error, the detection results of the test items that have completed detection are saved in the first storage module, and an alarm is issued to the host. With this arrangement, the host can respond to the alarm more quickly: it can obtain the first test result in time, judge the health state of the SSD from it, and decide promptly and accurately whether the data of the SSD needs to be backed up or whether use of the SSD should be stopped, thereby ensuring the safety of the data.
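The interaction between the first judgment module, the first storage module and the alarm module can be sketched as below. This is an illustration under stated assumptions: the function name run_self_test, the representation of a test item as a (name, check) pair, and the decision to stop at the first error are all hypothetical choices, and the returned dictionary merely stands in for the first storage module.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical test item: a name plus a check that returns True on pass.
TestItem = Tuple[str, Callable[[], bool]]


def run_self_test(items: List[TestItem],
                  alarm: Callable[[str], None]) -> Dict[str, bool]:
    """Detect test items in order; on an error, save results and alarm the host."""
    completed: Dict[str, bool] = {}   # results of items that finished detection
    for name, check in items:
        passed = check()
        completed[name] = passed      # judged after each item completes
        if not passed:
            saved = dict(completed)   # stands in for the first storage module
            alarm(name)               # stands in for the alarm to the host
            return saved              # assumption: remaining items are skipped
    return completed
```

Passing a callback such as a list's append method as alarm makes it easy to observe which test item triggered the alarm.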
In some optional embodiments, please continue to refer to fig. 6, according to the different apparatus self-testing methods described above, the self-testing system of the storage system may further include a second determining module, where the second determining module is configured to determine whether the self-testing time is reached.
It is understood that the second determination module needs to perform the operation of determining whether the self-detection time is reached only once for each execution of the device self-detection. For example, the second determination module performs the operation of determining whether the time for the second device self-detection is reached after the first device self-detection is finished; while the first device self-detection is still being executed, the second determination module does not start this operation.
With the above arrangement, the SSD can determine whether the set time has been reached and start executing the test command when it has, so that the device self-test is performed periodically. The whole self-detection operation can thus be carried out periodically without relying on a command initiated by the host, which reduces the dependence of the SSD on the host.
In some optional embodiments, please continue to refer to fig. 6, according to the different apparatus self-test methods described above, when the apparatus self-test includes detecting a plurality of test items of the storage system, the self-test system of the storage system may further include a third determining module, where the third determining module is configured to determine whether all of the plurality of test items complete the detection.
It should be explained that, since the device self-test includes detecting a plurality of test items of the storage system, the third determining module determines whether all of the plurality of test items in the device self-test have been detected, which may be understood as determining whether the device self-test has been performed in full.
Because the third judgment module is provided, the execution of the device self-detection can be stopped in time. This prevents the device self-detection from running without end, reduces power consumption, and keeps the device self-detection from affecting the storage efficiency of the SSD.
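One way to picture the third judgment module is as a tracker of test items that have not yet been detected. The sketch below is hypothetical: the class name CompletionTracker and its methods are assumptions introduced for illustration only.

```python
from typing import Iterable, Set


class CompletionTracker:
    """Hypothetical stand-in for the third judgment module."""

    def __init__(self, item_names: Iterable[str]) -> None:
        # Every test item starts out as pending.
        self._pending: Set[str] = set(item_names)

    def mark_done(self, name: str) -> None:
        # Called when the detection of one test item has completed.
        self._pending.discard(name)

    def all_done(self) -> bool:
        # True once every test item has been detected, at which point
        # the device self-detection can be stopped.
        return not self._pending
```

The self-detection loop would stop as soon as all_done returns True, rather than continuing indefinitely.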
Referring to fig. 7, the present application further provides a storage system including a control element and a memory, where the control element includes: the first control module is used for controlling the detection of at least one test item of the storage system; the first judgment module is used for judging whether the test item has an error; and the alarm module is used for initiating an alarm to the host.
In some embodiments, different modules may correspond to one controller or a plurality of controllers, that is, a control element may be a set of a plurality of controllers or may include only one controller.
It should be noted that the controller can implement information interaction between the host and the memory, and the chip in the memory performs an erase operation, a program operation or a read operation under the control of the controller. The controller of the control element corresponding to the first control module for controlling the detection of the at least one test item of the storage system may be understood as: the first control module is used for controlling the starting and stopping of the self-detection of the equipment. Since device self-test involves testing one or more test items of the storage system, starting or stopping device self-test may also be understood as starting or stopping testing of one or more test items. The alarm module in the controller can be flexibly set by using the structure or program of the SSD, and can also be additionally provided with some firmware for setting.
In this storage system, the alarm module is provided in the control element. When the controller corresponding to the first judgment module in the control element determines that a test item has an error, the SSD can feed the test failure result back to the host in time. This improves the emergency response to the SSD self-detection, allows the health of the SSD to be known in time, and ensures data safety.
In some optional embodiments, referring to fig. 8, according to the different self-test system, the control element of the storage system may further include a second determining module, where the second determining module is configured to determine whether the time for detecting the test item is reached. The second determination module has already been explained above, and is not described herein again.
In this storage system, the second judging module is provided in the control element, so the SSD can determine whether the set time has been reached and start executing the detection command when it has. The whole self-detection operation in the SSD can thus be performed periodically without relying on a command initiated by the host, which reduces the dependence of the SSD on the host.
In some optional embodiments, please continue to refer to fig. 8, according to the different self-testing systems described above, the control element of the storage system may further include a third determining module, where the third determining module is configured to determine whether all of the plurality of test items have been tested. The third determining module is already explained above, and is not described herein again.
In this storage system, the third judgment module is provided in the control element, so the execution of the device self-detection can be stopped in time. This prevents the device self-detection from running without end, reduces power consumption, and keeps the device self-detection from affecting the storage efficiency of the SSD.
In some alternative embodiments, with continued reference to fig. 8, the memory of the storage system includes a first firmware configured with parameters that define functions of the storage system, in accordance with the various self-test systems described above.
It should be explained that an engineer burns the written first firmware into the NAND. The first firmware includes an Identity program, which describes which functions the SSD supports and whose data structure is 4096 bytes in size. The Identity program includes various parameters defining the SSD functions; in other words, the first firmware is configured with various parameters defining the SSD functions, and the data structures of different parameters have different sizes.
In some optional embodiments, the parameters include a first parameter, which indicates whether periodically executing the detection of the test item is supported. Since the device self-test includes detecting at least one test item of the storage system, whether periodic execution of the detection of the test item is supported may be understood as whether periodic execution of the device self-test is supported.
It should be explained that, when the first firmware is configured with the parameter indicating whether periodic device self-detection is supported, the SSD may be configured either to perform the device self-detection periodically or not to. When the first parameter is configured to support periodic device self-detection, the SSD determines whether the time for the device self-detection has been reached when performing the device self-detection. In some alternative embodiments, the size of the data structure of the first parameter is 2 bytes.
In some optional embodiments, the first firmware configuration may further include a second parameter, the second parameter being a time at which the detection of the test item is performed. Since the device self-test includes testing at least one test item of the storage system, the time at which the testing of the test item is performed may be understood as the time at which the device self-test is performed.
It should be explained that, when the first firmware is provided with the first parameter and the first parameter is configured to support periodic device self-test, the second parameter allows the SSD to perform the device self-test periodically at the time set by the second parameter. In some alternative embodiments, the data structure of the second parameter has a size of 4 bytes.
In some optional embodiments, the parameters configured by the first firmware may further include a third parameter, where the third parameter indicates whether real-time storage of the detection result is supported.
It should be explained that, when the first firmware is configured with the third parameter, the SSD may either support or not support saving the detection result in real time when performing the device self-detection. It can be understood that, when the third parameter is configured to support saving the detection result in real time, if one of the test items has an error while the SSD performs the device self-detection, the first detection result may be saved in real time. This has been explained in the foregoing and is not described herein again. In some alternative embodiments, the data structure size of the third parameter is 1 byte.
In some optional embodiments, if the first control module is configured to control the detection of a plurality of test items of the storage system, the parameter configured by the first firmware may further include a fourth parameter, where the fourth parameter is whether to support the detection of other test items to be stopped once an error occurs in a test item.
It should be explained that, when the first firmware is configured with the fourth parameter, the SSD may either stop or continue the detection of the other test items when a test item has an error during the device self-detection. It is understood that, when the fourth parameter is configured to support stopping the detection of other test items once a test item has an error, the SSD stops the detection of the other test items if one of the test items has an error during the device self-detection. In some alternative embodiments, the data structure of the fourth parameter is 1 byte.
When the parameters of the first firmware include the first parameter, the second parameter, the third parameter and the fourth parameter, the sum of the sizes of the data structures of the four parameters may be a multiple of 8 bytes. For example, the sizes of the data structures of the first, second, third and fourth parameters are 2 bytes, 4 bytes, 1 byte and 1 byte respectively, for a total of 8 bytes. Such an arrangement facilitates alignment of the data.
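The 2 + 4 + 1 + 1 byte arrangement can be checked with a short sketch. Only the four field sizes come from the text; the field order, the little-endian encoding, the function names, and the choice of hours as the unit of the second parameter are assumptions made for illustration.

```python
import struct

# Hypothetical layout: H = 2 bytes (first parameter), I = 4 bytes (second),
# B = 1 byte (third), B = 1 byte (fourth). "<" selects little-endian with
# no implicit padding, so the fields occupy exactly 8 bytes.
PARAM_LAYOUT = struct.Struct("<HIBB")


def pack_params(periodic: bool, period_hours: int,
                save_realtime: bool, stop_on_error: bool) -> bytes:
    """Pack the four self-test parameters into one 8-byte block."""
    return PARAM_LAYOUT.pack(int(periodic), period_hours,
                             int(save_realtime), int(stop_on_error))


def unpack_params(raw: bytes):
    """Recover the four parameters from an 8-byte block."""
    periodic, period_hours, save_rt, stop = PARAM_LAYOUT.unpack(raw)
    return bool(periodic), period_hours, bool(save_rt), bool(stop)
```

Because the block is exactly 8 bytes, consecutive parameter blocks stay aligned on 8-byte boundaries without extra padding.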
In some optional embodiments, the memory further includes a NAND cell, and the first firmware is located in the NAND cell.
In summary, the embodiment of the present application provides a storage system, a self-testing method thereof, and a self-testing system. The self-detection method of the storage system comprises the following steps: detecting at least one test item of the storage system; judging whether the test item has errors or not; if yes, an alarm is sent to the host. In the embodiment of the application, due to the fact that the alarm mechanism is arranged, when the test item is wrong, the SSD can timely feed the test failure result back to the host, emergency reaction to SSD self-detection is improved, meanwhile, the healthy running condition of the SSD can be known in time, and data safety is guaranteed.
The above description covers only specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed in the present application shall be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (20)

1. A method for self-testing a memory system, comprising:
detecting at least one test item of the storage system;
judging whether the test item has errors or not;
if yes, an alarm is sent to the host.
2. The method of claim 1, wherein said determining whether the test item has failed is performed after each of the tests has completed.
3. The method of claim 2, wherein a plurality of the test items of the storage system are tested; and if the test item is judged to be wrong, saving a first detection result, wherein the first detection result is the detection result of the test item which is detected completely.
4. The method of claim 2, wherein a plurality of the test items of the storage system are tested; and if the test item is judged to be wrong, stopping detecting other test items which are not detected.
5. The method of claim 1, further comprising: and judging whether the self-detection time is reached, if so, executing the detection of the test item.
6. The method of claim 1, wherein a plurality of the test items of the storage system are tested; before the step of judging whether the test item has an error, the method further comprises:
and judging whether the plurality of test items are all detected.
7. The method of claim 1, comprising: detecting a plurality of the test items of the storage system; the detection of the plurality of test items is performed in sequence.
8. A self-test system for a storage system, comprising:
the first control module is used for controlling the detection of at least one test item of the storage system;
the first judging module is used for judging whether the test item has errors or not;
and the alarm module is used for initiating an alarm to the host.
9. The system of claim 8, wherein the first control module is configured to control the testing of the plurality of test items of the storage system; the self-detection system further comprises a first storage module, wherein the first storage module is used for storing a first detection result, and the first detection result is the detection result of the test item after detection is finished.
10. The system of claim 8, further comprising a second determining module configured to determine whether the self-detection time is reached.
11. The system of claim 8, wherein the first control module is configured to control the testing of a plurality of the test items of the storage system; the self-detection system further comprises: and the third judging module is used for judging whether all the test items are detected.
12. A storage system comprising a control and a memory, the control comprising:
the first control module is used for controlling the detection of at least one test item of the storage system;
the first judgment module is used for judging whether the test item has an error;
and the alarm module is used for initiating an alarm to the host.
13. The storage system of claim 12, wherein the control further comprises a second determination module to determine whether a time to detect the test item has been reached.
14. The storage system of claim 12, wherein the first control module is configured to control the testing of the plurality of test items of the storage system; the control element further comprises a third judging module, and the third judging module is used for judging whether the plurality of test items are all detected.
15. The storage system according to claim 12, comprising: the memory includes first firmware configured with parameters that define functions of the storage system.
16. The storage system of claim 15, wherein the parameter comprises a first parameter, the first parameter being whether detection of the test item is enabled to be periodically executed.
17. The storage system of claim 16, wherein the parameters further include a second parameter, the second parameter being a time at which the detection of the test item is performed.
18. The storage system according to claim 15, wherein the parameter further comprises a third parameter, and the third parameter is whether real-time storage of the detection result is supported.
19. The storage system of claim 15, wherein the first control module is configured to control the testing of the plurality of test items of the storage system; the parameters further comprise a fourth parameter, and the fourth parameter is whether the detection of other test items is stopped once the test items generate errors.
20. The memory system of claim 15, wherein the memory comprises a NAND cell and the first firmware is located in the NAND cell.
CN202210760624.8A 2022-06-29 2022-06-29 Storage system and self-detection method and system thereof Pending CN115035943A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210760624.8A CN115035943A (en) 2022-06-29 2022-06-29 Storage system and self-detection method and system thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210760624.8A CN115035943A (en) 2022-06-29 2022-06-29 Storage system and self-detection method and system thereof

Publications (1)

Publication Number Publication Date
CN115035943A true CN115035943A (en) 2022-09-09

Family

ID=83129017

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210760624.8A Pending CN115035943A (en) 2022-06-29 2022-06-29 Storage system and self-detection method and system thereof

Country Status (1)

Country Link
CN (1) CN115035943A (en)

Similar Documents

Publication Publication Date Title
US9304856B2 (en) Implementing ECC control for enhanced endurance and data retention of flash memories
US20110173378A1 (en) Computer system with backup function and method therefor
US7356744B2 (en) Method and system for optimizing testing of memory stores
CN102760090B (en) Debugging method and computer system
US11221933B2 (en) Holdup self-tests for power loss operations on memory systems
US11688483B2 (en) Managing block retirement for temporary operational conditions
TW201003662A (en) Memory malfunction prediction system and method
US9728276B2 (en) Integrated circuits with built-in self test mechanism
US7272058B2 (en) Nonvolatile semiconductor memory device having redundant relief technique
CN110727597B (en) Method for checking invalid code completion case based on log
US8006144B2 (en) Memory testing
CN110618892A (en) Bug positioning method and device for solid state disk, electronic equipment and medium
CN112363909A (en) Automatic test method for reliability of file system in relay protection device
CN115035943A (en) Storage system and self-detection method and system thereof
CN110826114B (en) User data testing method and device based on SSD after safe erasure
CN109086162B (en) Memory diagnosis method and device
CN109686397B (en) Memory with self-checking function and its checking method
CN114420194A (en) Test method and device for power failure protection function of solid state disk and computer equipment
WO2022027170A1 (en) Flash memory data management method, storage device controller, and storage device
US6229743B1 (en) Method of a reassign block processing time determination test for storage device
CN108231134B (en) RAM yield remediation method and device
US20180261298A1 (en) Memory system including a delegate page and method of identifying a status of a memory system
CN110993015B (en) Hard disk differential signal quality detection method, hard disk differential signal quality detection device, main control and medium
KR101566487B1 (en) Apparatus for performing a power loss test for a non-volatile memory device and method of performing a power loss test for a non-volatile memory device
US20230026712A1 (en) Generating system memory snapshot on memory sub-system with hardware accelerated input/output path

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination