CN116974961A - Firmware changing method, system and device - Google Patents

Firmware changing method, system and device

Info

Publication number
CN116974961A
Authority
CN
China
Prior art keywords
firmware
storage component
target
data
address
Prior art date
Legal status
Pending
Application number
CN202310774084.3A
Other languages
Chinese (zh)
Inventor
李舒
Current Assignee
Hangzhou Alibaba Feitian Information Technology Co ltd
Original Assignee
Hangzhou Alibaba Feitian Information Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Alibaba Feitian Information Technology Co ltd
Priority to CN202310774084.3A
Publication of CN116974961A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/16 Handling requests for interconnection or transfer for access to memory bus
    • G06F13/1668 Details of memory controller
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/445 Program loading or initiating
    • G06F9/44521 Dynamic linking or loading; Link editing at or after load time, e.g. Java class loading

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)

Abstract

Embodiments of this specification provide a firmware changing method, system, and device. The firmware changing method includes: in response to detection information of faulty firmware in a controller of a storage component, replacing the faulty firmware with bypass firmware loaded into the controller, wherein the bypass firmware is loaded into the controller from a target flash memory of the storage component; in response to the detection information, loading a takeover program on a host side and using the takeover program to rebuild an address mapping relationship corresponding to the storage component; and executing, by the takeover program according to the rebuilt address mapping relationship, a data read/write task corresponding to the target storage component, using the target storage component whose firmware has been replaced.

Description

Firmware changing method, system and device
Technical Field
Embodiments of the present disclosure relate to the field of computer technologies, and in particular, to a method, a system, and an apparatus for firmware modification.
Background
With the development of computer technology, the demands placed on storage components for capacity, throughput, and low latency have become increasingly prominent. Firmware design in storage components keeps growing more complex, the corresponding IO combination scenarios are increasingly varied, and interaction with components such as network cards, CPUs (central processing units), and GPUs (graphics processing units) is frequent. Upper-layer storage applications are also becoming heavier in order to improve performance and strengthen stability, which increases the online pressure on storage components. A resulting storage component failure can affect the large amounts of data carried on hundreds of millions of storage components. In cloud service scenarios in particular, degraded back-end performance or an abnormal storage component may degrade or even interrupt a service. Troubleshooting the firmware of storage components is therefore particularly important. In the prior art, locating a firmware problem in a storage component requires reproducing the event, and the test process for the modification is long. In addition, a firmware problem often takes weeks or months to resolve, and while the storage component is under maintenance it cannot process data read/write tasks, which significantly affects device performance and data stability. An effective solution to these problems is therefore needed.
Disclosure of Invention
In view of this, the present embodiment provides a firmware changing method. One or more embodiments of the present specification also relate to a firmware changing system, a firmware changing apparatus, a computing device, a computer-readable storage medium, and a computer program that solve the technical drawbacks of the prior art.
According to a first aspect of embodiments of the present specification, there is provided a firmware changing method, including:
replacing the fault firmware with bypass firmware loaded into a controller of a storage component in response to detection information of the fault firmware in the controller, wherein the bypass firmware is loaded into the controller by a target flash memory of the storage component;
loading a takeover program of a host side in response to the detection information, and rebuilding an address mapping relation corresponding to the storage component by using the takeover program;
and the takeover program uses the target storage component replaced by the firmware according to the reconstructed address mapping relation to execute the data read-write task corresponding to the target storage component.
According to a second aspect of the embodiments of the present specification, there is provided a firmware changing system including a storage component and a host side, wherein:
the storage component is configured to, in response to detection information of faulty firmware in a controller, replace the faulty firmware with bypass firmware loaded into the controller, wherein the bypass firmware is loaded into the controller from a target flash memory;
the host side is configured to load a takeover program and use the takeover program to rebuild an address mapping relationship corresponding to the storage component; and the takeover program executes, according to the rebuilt address mapping relationship, the data read/write task corresponding to the target storage component, using the target storage component whose firmware has been replaced.
According to a third aspect of embodiments of the present specification, there is provided a firmware changing apparatus including:
a replacement module configured to, in response to detection information of faulty firmware in a controller of a storage component, replace the faulty firmware with bypass firmware loaded into the controller, wherein the bypass firmware is loaded into the controller from a target flash memory of the storage component;
a reconstruction module configured to load a takeover program of the host side in response to the detection information and to rebuild an address mapping relationship corresponding to the storage component using the takeover program;
and an execution module configured such that the takeover program executes, according to the rebuilt address mapping relationship, the data read/write task corresponding to the target storage component, using the target storage component whose firmware has been replaced.
According to a fourth aspect of embodiments of the present specification, there is provided a computing device comprising:
a memory and a processor;
the memory is configured to store computer-executable instructions that, when executed by the processor, implement the steps of any of the firmware changing methods described above.
According to a fifth aspect of embodiments of the present specification, there is provided a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the steps of the firmware change method described above.
According to a sixth aspect of the embodiments of the present specification, there is provided a computer program, wherein the computer program, when executed in a computer, causes the computer to perform the steps of the firmware changing method described above.
According to the firmware changing method provided in this specification, to avoid the situation in which a storage component cannot work after the firmware in its controller fails, once a firmware fault in the storage component is determined, bypass firmware can be loaded into the controller from the target flash memory of the storage component, so that the bypass firmware replaces the faulty firmware and continues working, which keeps the storage component from going down. On this basis, because the bypass firmware serves only as emergency firmware in a fault scenario and is structurally simpler than the faulty firmware, a takeover program must also be loaded on the host side so that the storage component can continue executing data read/write tasks; the takeover program then completes those tasks in cooperation with the target storage component whose firmware has been replaced. In this process, the takeover program rebuilds the address mapping relationship corresponding to the storage component, which allows it to execute data read/write tasks on the firmware-replaced target storage component according to the rebuilt mapping. When the storage component is unavailable, unstable, or slow because of a firmware fault, this provides an escape (fallback) option that keeps the storage component working and avoids affecting upstream and downstream services. After the bypass firmware has replaced the faulty firmware, the faulty firmware can be repaired without stopping the storage component, which supports hot update of the firmware and makes maintenance of the storage component more convenient.
Drawings
FIG. 1 is a schematic diagram of a firmware modification method according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of another firmware modification method according to one embodiment of the present disclosure;
FIG. 3 is a flow chart of a firmware change method provided by one embodiment of the present disclosure;
FIG. 4 is a schematic diagram illustrating a reconstruction of address mapping relationship in a firmware modification method according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram illustrating a deployment relationship of offloading hardware in a firmware modification method according to an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of a host-side failure recovery in a firmware upgrade method according to one embodiment of the present disclosure;
FIG. 7 is a flowchart of a firmware change method according to one embodiment of the present disclosure;
FIG. 8 is a schematic diagram of a firmware modification system according to an embodiment of the present disclosure;
FIG. 9 is a schematic structural diagram of a firmware changing apparatus according to an embodiment of the present disclosure;
FIG. 10 is a block diagram of a computing device provided in one embodiment of the present description.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of this specification. This specification can, however, be implemented in many ways other than those described here, and those skilled in the art can make similar generalizations without departing from its spirit; this specification is therefore not limited by the specific implementations disclosed below.
The terminology used in the one or more embodiments of the specification is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of the specification. As used in this specification, one or more embodiments and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that, although the terms first, second, etc. may be used in one or more embodiments of this specification to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first may also be referred to as a second, and similarly, a second may also be referred to as a first, without departing from the scope of one or more embodiments of this specification. The word "if" as used herein may be interpreted as "when", "upon", or "in response to a determination", depending on the context.
First, terms related to one or more embodiments of the present specification will be explained.
Solid state disk (Solid State Disk or Solid State Drive, abbreviated as SSD): also known as a solid state drive, a hard disk made from an array of solid-state electronic memory chips.
NOR flash memory (NOR Flash): a non-volatile flash memory technology that retains stored data after the hard disk is powered off.
NAND flash memory (NAND Flash): a type of flash memory that internally adopts a nonlinear macro-cell mode, providing an effective solution for implementing solid-state large-capacity memory. NAND flash memory offers large capacity and fast rewriting, and is suitable for storing large amounts of data.
FTL (Flash Translation Layer): the flash translation layer performs the translation (or mapping) from the host logical address space to the flash physical address space.
PCIe (PCI Express, Peripheral Component Interconnect Express): a high-speed serial computer expansion bus standard.
In the present specification, a firmware changing method is provided, and the present specification relates to a firmware changing system, a firmware changing apparatus, a computing device, a computer-readable storage medium, and a computer program, which are described in detail in the following embodiments one by one.
In practical applications, to meet market demands such as higher performance, larger capacity, and the use of low-cost storage media (higher storage density), storage component manufacturers employ more complex and expensive hardware to run firmware of continuously increasing complexity. As shown in FIG. 2, a storage unit with firmware execution at its core is disposed in the controller; the storage unit is connected to the host side through a host interface, the controller is connected to the storage medium through a media interface, and a hardware circuit serves as the task execution circuit in the controller. On this basis, because of throughput requirements, the storage unit runs the firmware on multiple processors to ensure performance. To keep the storage component equally reliable as software and hardware complexity grows, firmware development and testing become more difficult, the test process becomes long, and the workload becomes large. Current practice is to use today's limited test set to cope with an unlimited combination of future applications, repeating long-running stress tests in an attempt to cover the usage scenarios that the storage component may encounter. The unlimited combination of applications refers to scenarios that keep emerging after a device is deployed and brought online: service software, storage engines, and the like evolve at their own iteration pace, and these new scenarios were not adequately tested on the storage component beforehand. This directly lengthens the test cycle and increases the verification workload.
In view of this, to avoid the situation in which a storage component cannot work after the firmware in its controller fails, the firmware changing method provided by this specification proceeds as follows once a firmware fault in the storage component is determined. In response to detection information of the faulty firmware in the controller of the storage component, bypass firmware is loaded into the controller from the target flash memory of the storage component and replaces the faulty firmware, so that the bypass firmware continues working in its place and the storage component does not go down. On this basis, because the bypass firmware serves only as emergency firmware in a fault scenario and is structurally simpler than the faulty firmware, a takeover program must also be loaded on the host side so that the storage component can continue executing data read/write tasks; the takeover program then completes those tasks in cooperation with the target storage component whose firmware has been replaced. In this process, the takeover program rebuilds the address mapping relationship corresponding to the storage component, which allows it to execute data read/write tasks on the firmware-replaced target storage component according to the rebuilt mapping. When the storage component is unavailable, unstable, or slow because of a firmware fault, this provides an escape (fallback) option that keeps the storage component working and avoids affecting upstream and downstream services. After the bypass firmware has replaced the faulty firmware, the faulty firmware can be repaired without stopping the storage component, which supports hot update of the firmware and makes maintenance of the storage component more convenient.
Referring to the schematic diagram in FIG. 1, when a problem occurs in the firmware of the SSD component, the faulty firmware is taken over imperceptibly, which covers the impact on performance, stability, cost, and so on during the waiting period in which the firmware problem is located, tested, and released. Specifically, after a problem occurs in the firmware running on the controller, the host starts the takeover program in host memory and pauses IO access. The SSD component flushes the FTL currently stored in DRAM (Dynamic Random Access Memory) and transfers it to host memory through the host interface (for example, a PCIe bus) for use by the takeover program. The host-side takeover program runs on the host CPU. Meanwhile, bypass firmware is stored in the NOR flash memory of the SSD component; after the firmware fails, the controller reads the bypass firmware from the NOR flash memory and uses it to replace the faulty firmware. The bypass firmware can thus execute the corresponding data read/write operations while the host has taken over. Transferring the work to the host-side takeover program keeps firmware problems from affecting the operation of the SSD component. Power-loss protection is also supported: a power-loss protection circuit on the SSD component supplies enough power when the host side fails for valid data (dirty pages) in the SSD cache to be flushed quickly into the NAND flash memory, which ensures data reliability.
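The takeover sequence described for FIG. 1 can be sketched roughly as follows. This is an illustrative Python model only, not the implementation of this specification: the class names `SsdSketch` and `HostSketch`, the firmware labels, the event names, and the final step of resuming IO are all assumptions introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SsdSketch:
    """Hypothetical stand-in for the SSD component (names are illustrative)."""
    dram_ftl: dict = field(default_factory=dict)  # FTL held in the SSD's DRAM
    nor_image: str = "bypass-fw"                  # bypass firmware kept in NOR flash
    running_firmware: str = "main-fw"             # firmware currently on the controller

    def flush_ftl(self) -> dict:
        # Hand a copy of the DRAM-resident FTL to the host (PCIe in the text).
        return dict(self.dram_ftl)

    def load_bypass_firmware(self) -> None:
        # The controller reads the bypass image from NOR flash and runs it.
        self.running_firmware = self.nor_image


@dataclass
class HostSketch:
    """Hypothetical host side that reacts to the detection information."""
    io_paused: bool = False
    takeover_running: bool = False
    ftl: dict = field(default_factory=dict)
    events: list = field(default_factory=list)

    def on_firmware_fault(self, ssd: SsdSketch) -> None:
        self.takeover_running = True   # start the takeover program
        self.events.append("start_takeover")
        self.io_paused = True          # pause IO access
        self.events.append("pause_io")
        self.ftl = ssd.flush_ftl()     # FTL moves into host memory
        self.events.append("receive_ftl")
        ssd.load_bypass_firmware()     # bypass firmware replaces the faulty firmware
        self.events.append("bypass_loaded")
        self.io_paused = False         # assumption: IO resumes once both sides are ready
        self.events.append("resume_io")


ssd = SsdSketch(dram_ftl={0: 128, 1: 129})
host = HostSketch()
host.on_firmware_fault(ssd)
print(ssd.running_firmware, host.ftl, host.events)
```

The sketch keeps the host and SSD steps in one sequential method for readability; in the description the bypass firmware load happens on the SSD side at the same time as the host-side takeover.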
It should be noted that, the user information (including, but not limited to, user equipment information, user personal information, etc.) and the data (including, but not limited to, data for analysis, stored data, presented data, etc.) related to one or more embodiments of the present disclosure are information and data authorized by the user or sufficiently authorized by each party, and the collection, use and processing of the related data is required to comply with the relevant laws and regulations and standards of the relevant area, and is provided with a corresponding operation entry for the user to select authorization or rejection.
Fig. 3 shows a flowchart of a firmware changing method according to an embodiment of the present disclosure, which specifically includes the following steps.
In step S302, in response to detection information of the fault firmware in the controller of the storage unit, the fault firmware is replaced with bypass firmware loaded into the controller, wherein the bypass firmware is loaded into the controller by the target flash memory of the storage unit.
The firmware changing method provided by this embodiment can be applied to scenarios in which the firmware in the controller of a storage component fails, for example: a bug in the firmware design; a software change that exposes shortcomings in the firmware design; hardware wear over its lifetime that the existing firmware does not adequately compensate for; or system pressure that increases and persists so that the current firmware version is no longer a match. In each of these cases, the firmware changing method provided by this embodiment can be adopted: the bypass firmware and the takeover program cooperate to keep executing the data read/write tasks corresponding to the storage component, while hot update of the faulty firmware is supported, so that the firmware repair process completes imperceptibly.
Specifically, the storage component refers to a storage hard disk, such as a solid state disk, that is deployed at the host side and used to execute data read/write tasks. Correspondingly, the controller refers to the main controller of the storage component; it holds the firmware, and the firmware in the controller carries out the corresponding operations when data is read or written. The faulty firmware refers to firmware in the controller in which a fault has occurred. The detection information refers to information detected for the faulty firmware; it triggers the subsequent takeover mode, so that after the firmware in the controller fails, the takeover program and the bypass firmware can cooperate to let the storage component keep executing data read/write tasks rather than stopping because of the firmware fault.
Furthermore, the bypass firmware is an independently designed, simplified version of the firmware. It can be understood as a minimal program that executes basic instructions, which decouples the firmware from complex scenario handling and moves complex operations to the host-side driver. Updating and hot-upgrading a host-side driver can be completed in a short period, far shorter than the cycle of locating, testing, and releasing firmware. The bypass firmware can therefore perform basic operations in place of the faulty firmware when the latter cannot work, but it must do so in cooperation with the takeover program loaded on the host side. The basic instructions that the bypass firmware can execute include, but are not limited to, operations on and management of the flash media, such as sequential read, sequential write, and delete; when execution fails, the data in the cache of the storage component can be flushed sequentially to the flash memory. Correspondingly, the target flash memory refers to the NOR flash memory in the storage component that stores the bypass firmware; when the firmware in the controller fails, the bypass firmware is loaded from this flash memory into the controller to replace the faulty firmware, so that the storage component keeps working normally.
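To make the idea of a minimal instruction set more concrete, the following Python sketch models bypass firmware that supports only sequential write, sequential read, delete, and a cache flush. It is an assumption-laden illustration: the class `BypassFirmwareSketch`, its method names, and the page-granular delete are hypothetical and are not taken from this specification.

```python
class BypassFirmwareSketch:
    """Hypothetical minimal command set: sequential read/write, delete, cache flush."""

    def __init__(self, num_pages: int):
        self.pages = [None] * num_pages  # simplified flash: one slot per page
        self.write_ptr = 0               # next page used by a sequential write
        self.cache = []                  # data still held in the component's cache

    def seq_write(self, data: bytes) -> int:
        if self.write_ptr >= len(self.pages):
            raise IOError("flash full")
        self.pages[self.write_ptr] = data
        self.write_ptr += 1
        return self.write_ptr - 1        # physical page that was written

    def seq_read(self, start: int, count: int) -> list:
        return self.pages[start:start + count]

    def delete(self, start: int, count: int) -> None:
        # Simplification: real NAND erases whole blocks, not single pages.
        for i in range(start, min(start + count, len(self.pages))):
            self.pages[i] = None

    def flush_cache(self) -> list:
        # On an execution failure, push cached data to flash in order.
        written = []
        while self.cache:
            written.append(self.seq_write(self.cache[0]))
            self.cache.pop(0)            # drop the entry only after it is written
        return written
```

Everything beyond these primitives, such as address translation, would live in the host-side takeover program under this division of labor.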
Based on the above, in order to avoid the problem that the storage component cannot work after the firmware in the controller of the storage component fails, under the condition that the firmware in the storage component fails, the bypass firmware is loaded to the controller by using the target flash memory of the storage component to replace the failure firmware by using the bypass firmware loaded to the controller, so that the bypass firmware can replace the failure firmware to continue working, and the storage component is ensured not to be down.
Step S304, loading a takeover program of the host end in response to the detection information, and rebuilding the address mapping relation corresponding to the storage component by using the takeover program.
Specifically, after the bypass firmware from the target flash memory has replaced the faulty firmware in the controller, note that the bypass firmware only stands in temporarily for the faulty firmware. It is structurally simpler than the faulty firmware and cannot by itself support the storage component in executing data read/write tasks, so those tasks must be processed in cooperation with the takeover program on the host side. Therefore, in addition to replacing the faulty firmware with the bypass firmware in response to the detection information, the takeover program of the host side must also be loaded in response to the same detection information; that is, the takeover program in host memory is loaded onto the CPU and run. In addition, a firmware fault in the controller can corrupt the FTL that records the mapping between physical and logical addresses, so that the FTL cannot be reused. The takeover program therefore needs to rebuild the address mapping relationship corresponding to the storage component, that is, rebuild the FTL, so that it can complete subsequent data read/write tasks in cooperation with the bypass firmware.
The takeover program is a program deployed at the host side that cooperates with the bypass firmware so that the storage component can continue executing data read/write tasks. While not started, the program resides in host memory; after the firmware of the storage component fails, the takeover program is loaded from memory onto the CPU and run to rebuild the FTL, on the basis of which the storage component continues to execute data read/write tasks. Correspondingly, the address mapping relationship refers to the recorded mapping between physical addresses and logical addresses, namely the FTL. Because the takeover program rebuilds the FTL, data is read and written accurately when the taken-over storage component executes data read/write tasks.
In particular, when the firmware in the controller fails, the address mapping relationship may be incomplete or of the wrong version. For example, metadata of received data, such as the LBA (Logical Block Address), may be wrong after the firmware fails, which makes the address mapping relationship wrong. A firmware error may also cause front-end instruction parsing errors in which both the order and the content of execution are wrong, so that FTL content is read or modified incorrectly. Entries may also be missed or deleted by mistake. After a firmware error, therefore, the integrity and correctness of the current FTL cannot be guaranteed, and the FTL needs to be rebuilt by the takeover program so that data read/write tasks can be executed based on the rebuilt address mapping relationship. If, however, the address mapping relationship is intact, the takeover program does not need to rebuild it; the storage component sends it directly to the takeover program for use.
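The choice between reusing the flushed FTL and rebuilding it can be sketched as below. This specification does not say how the completeness of the address mapping relationship is determined, so the CRC32 comparison used here is purely a hypothetical stand-in, as are the names `ftl_checksum` and `obtain_ftl`.

```python
import json
import zlib
from typing import Callable, Dict, Optional, Tuple

Ftl = Dict[int, int]  # logical block address -> physical page address


def ftl_checksum(ftl: Ftl) -> int:
    # Hypothetical integrity check: CRC32 over a canonical serialization.
    return zlib.crc32(json.dumps(sorted(ftl.items())).encode("utf-8"))


def obtain_ftl(flushed: Optional[Ftl],
               recorded_checksum: Optional[int],
               rebuild: Callable[[], Ftl]) -> Tuple[Ftl, str]:
    """Reuse the FTL flushed by the storage component if it looks intact;
    otherwise fall back to rebuilding it on the host."""
    if (flushed is not None
            and recorded_checksum is not None
            and ftl_checksum(flushed) == recorded_checksum):
        return flushed, "reused"
    return rebuild(), "rebuilt"
```

A caller would pass the rebuild routine as the `rebuild` callback, so the scan of the flash memory is only paid for when the flushed table cannot be trusted.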
Furthermore, when the takeover program rebuilds the address mapping relationship, the rebuild relies on the address information in the flash memory. The target flash page to which data has been written therefore needs to be determined first, so that the corresponding information can be read to complete the rebuild. In this embodiment, the specific implementation is as follows:
determining a target flash page of write data in the target flash of the storage component using the takeover program; and determining the address information of the target flash memory page, and reconstructing the address mapping relation corresponding to the storage component by using the address information through the takeover program.
Specifically, the target flash page refers to a flash page in the storage component to which data has been written. Correspondingly, the address information refers to the out-of-band (OOB) data, which records LBA information, so that the takeover program can rebuild the mapping between physical and logical addresses based on this address information.
Based on this, after the host loads the takeover program, and so that the takeover program can subsequently manage the storage component in executing data read/write tasks, the takeover program can first determine the target flash page to which data has been written in the target flash memory of the storage component. A target flash page that holds written data carries address information, so the address information can be determined from the target flash page; the takeover program can then use it to rebuild the address mapping relationship corresponding to the storage component, that is, rebuild the FTL that records physical and logical addresses.
Furthermore, when the address mapping relation is rebuilt, the information needed for the rebuild can be read after the data writing area is scanned, so that the logical address is extracted from the address information and the physical address is read. In this embodiment, the specific implementation is as follows:
scanning a data writing area of the storage component by using the takeover program, and determining the target flash memory page for writing data in the target flash memory according to a scanning result; determining address information of the target flash memory page, extracting a logic address from the address information, and reading a physical address of the target flash memory page; writing the logical address and the physical address into a preset mapping table, and reconstructing an address mapping relation corresponding to the storage component by using the mapping table through the takeover program.
Specifically, the data writing area refers to the flash memory in the storage component into which data has been written. Correspondingly, the logical address refers to the virtual address recorded in a flash memory page into which data has been written, and the physical address refers to a real address in the data writing cache. The mapping table refers to a table that records logical addresses and physical addresses having a mapping relationship, and the table is stored in the system memory of the host side.
Based on the above, when the address mapping relation needs to be rebuilt through the takeover program, the takeover program first scans the data writing area of the storage component and determines, according to the scanning result, the target flash memory pages into which data has been written in the target flash memory. For each target flash memory page, the address information is determined, the logical address is extracted from the address information, and the physical address of the page is read. The logical address and the physical address having the mapping relation are then written into a preset mapping table, so that the takeover program can reconstruct the address mapping relation corresponding to the storage component by using the mapping table.
The above is illustrated by taking a firmware fault of the controller in a solid state disk as an example. Referring to the schematic diagram shown in fig. 4, when the takeover program at the host end needs to reconstruct the FTL, it first scans the data writing area in the solid state disk (SSD) and determines, according to the scanning result, the NAND flash memory into which data has been written. It then locates the OOB area in each flash memory page of that NAND flash memory. The recorded LBA information {LBA-x, LBA-y, LBA-z, LBA-w} can be extracted from the OOB area of each flash memory page, while the PBA information {PBA n, PBA n+1, PBA n+2, PBA n+3} of the current page can be read directly from the flash memory page. After the LBA information and PBA information of each flash memory page are obtained, they are filled into the mapping table in one-to-one correspondence, and so on for the remaining pages. When the scan of the entire solid state disk is completed, a mapping table holding a large amount of LBA and PBA information is obtained, and the takeover program running on the CPU core of the host end can reconstruct the FTL according to the mapping table, thereby avoiding problems such as inconsistent read data and data loss.
In summary, reconstructing the address mapping relation with the host-side takeover program avoids problems such as inconsistent read data and data loss. After a firmware fault, the takeover program can continue to execute the data read-write task on the firmware-replaced storage component according to the address mapping relation, which avoids the problem that the storage component cannot work.
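The scan-and-rebuild procedure in the example above can be sketched in a few lines. This is only an illustrative sketch, not the implementation described in this specification: the `FlashPage` type, the `rebuild_ftl` function, and in particular the `write_seq` field (a hypothetical write sequence number used to choose between two pages that record the same LBA) are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class FlashPage:
    pba: int                  # physical address of the page, read directly from the page
    oob_lba: Optional[int]    # LBA recorded in the OOB area; None if the page was never written
    write_seq: int = 0        # hypothetical sequence number (assumption, not in the source)


def rebuild_ftl(pages: Iterable[FlashPage]) -> Dict[int, int]:
    """Scan flash pages and return a mapping table of LBA -> PBA."""
    table: Dict[int, int] = {}
    newest_seq: Dict[int, int] = {}
    for page in pages:
        if page.oob_lba is None:
            # Page is outside the data writing area: nothing to map.
            continue
        lba = page.oob_lba
        # Assumption: when several pages carry the same LBA, keep the newest one.
        if lba not in newest_seq or page.write_seq >= newest_seq[lba]:
            table[lba] = page.pba
            newest_seq[lba] = page.write_seq
    return table
```

For instance, scanning four pages where one is unwritten and two record the same LBA yields one entry per LBA, pointing at the most recently written page.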
Step S306, the takeover program uses the firmware-replaced target storage component, according to the reconstructed address mapping relation, to execute the data read-write task corresponding to the target storage component.
Specifically, once the takeover program has rebuilt the address mapping relation and the fault firmware has been replaced by the bypass firmware, the preparation needed for the storage component to continue working is complete. To let the storage component continue executing the data read-write task, the firmware-replaced target storage component is managed by the takeover program, so that the takeover program can use it, according to the rebuilt address mapping relation, to continue executing the data read-write task allocated to the storage component. The target storage component refers to the storage component whose fault firmware has been replaced by the bypass firmware, and the data read-write task refers to a task of reading and writing data performed by the storage component.
Further, after the takeover program has been executing the data read-write task on the firmware-replaced target storage component according to the reconstructed address mapping relation, if the fault firmware is repaired successfully in the meantime, the bypass firmware needs to be replaced with the repaired firmware so that the storage component can recover its normal data read-write performance. In this embodiment, the specific implementation is as follows:
receiving replacement firmware and replacing the bypass firmware with the replacement firmware loaded into the controller; calling the takeover program according to the replacement result, and sending the address mapping relation to the dynamic memory of the target storage component through the takeover program; and the target storage component executes the data reading and writing task according to the address mapping relation.
Specifically, the replacement firmware refers to firmware obtained by upgrading the fault firmware. It has the same structure as the fault firmware and resolves the problem in the fault firmware.
Based on this, receiving the replacement firmware indicates that the repair of the fault firmware is complete. To restore the storage component to the normal operation mode, the bypass firmware is first replaced with the replacement firmware loaded into the controller. At the time of replacement, the storage component has been managed by the takeover program throughout and the corresponding address mapping relation is held by the takeover program. Therefore, to let the storage component continue working, the takeover program is called according to the replacement result and sends the address mapping relation to the dynamic memory of the target storage component. The target storage component then continues to execute the data read-write task according to the address mapping relation.
That is, after the storage component is managed by the takeover program, it remains available and provides normal functions through cooperation with the takeover program at the host side. After the problem in the fault firmware has been located, fixed, and tested, a new firmware version is released. The storage component can then perform an online firmware hot upgrade that replaces the bypass firmware with the new version. After the upgrade succeeds, the storage component returns to the normal execution mode, the address mapping relation in the host memory is transferred to the storage component for the firmware to use, and the storage component continues to execute its operations through the firmware.
Continuing the above example, suppose the solid state disk has been managed by the takeover program at the host end and has executed data read-write tasks for a period of time, and the firmware problem of the solid state disk has since been solved. When the new version firmware is received, it is first loaded into the controller, and the bypass firmware is then replaced by the new version firmware loaded into the controller. At this point, the takeover program at the host end sends the FTL currently in use to the solid state disk, where it is stored in the DRAM of the solid state disk, so that the new version firmware can use the FTL to continue the data read-write task.
In summary, once the new version firmware is available, the storage component is switched back to the normal working state. So that the storage component can continue working without interruption, the address mapping relationship held by the takeover program is also sent to the storage component for use by the new version firmware.
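The hand-back sequence summarized above (replace the bypass firmware, then transfer the host-side mapping into the device DRAM) can be sketched as follows. All class, method, and variable names here are hypothetical and chosen only for illustration; the sketch models the ordering of the steps, not a real device interface.

```python
from typing import Dict, Optional


class StorageComponent:
    """Hypothetical model of the storage component (for example, an SSD)."""

    def __init__(self) -> None:
        self.firmware = "bypass"             # currently running the bypass firmware
        self.dram_ftl: Dict[int, int] = {}   # address mapping held in device DRAM

    def hot_upgrade(self, replacement: str) -> bool:
        # Replace the bypass firmware with the replacement firmware.
        self.firmware = replacement
        return True

    def read(self, lba: int) -> Optional[int]:
        # After hand-back, the firmware resolves LBAs from its own DRAM copy.
        return self.dram_ftl.get(lba)


class TakeoverProgram:
    """Hypothetical host-side takeover program holding the rebuilt FTL."""

    def __init__(self, ftl: Dict[int, int]) -> None:
        self.ftl = dict(ftl)

    def send_mapping(self, device: StorageComponent) -> None:
        # Send the mapping currently in use to the dynamic memory of the device.
        device.dram_ftl = dict(self.ftl)


def restore_normal_operation(device: StorageComponent,
                             takeover: TakeoverProgram,
                             replacement: str) -> bool:
    """Replace the firmware first, then hand the mapping back to the device."""
    if not device.hot_upgrade(replacement):
        return False                         # keep running in takeover mode
    takeover.send_mapping(device)
    return True
```

The mapping is sent only after the replacement succeeds, so a failed upgrade leaves the host-side FTL as the single copy in use.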
Furthermore, because different types of firmware in the controller support executing the data read-write task in different modes, the mode can be switched accordingly. In this embodiment, the specific implementation is as follows:
under the condition that the loading of the takeover program is completed, switching a data read-write mode between the storage component and the host end into a takeover mode, and suspending the interface access of the storage component according to a mode switching result; correspondingly, after the step of calling the takeover program according to the replacement result and sending the address mapping relation to the dynamic memory of the target storage component through the takeover program is executed, the method further comprises: and under the condition that the address mapping relation is sent completely, switching the takeover mode into a data read-write mode between the storage component and the host side.
Specifically, the data read-write mode specifically refers to a mode in which the storage component and the host end are in a normal working state, and correspondingly, the takeover mode specifically refers to a mode in which the takeover program is in a working state after the bypass firmware replaces the fault firmware and the takeover program rebuilds the address mapping relation.
Based on the above, once the loading of the takeover program is completed, the data read-write mode between the storage component and the host end is switched to the takeover mode to ensure that the storage component can be managed by the takeover program when executing the data read-write task. The interface access of the storage component is suspended according to the mode switching result, which avoids problems such as data loss during the switch. Correspondingly, after the takeover program has been called according to the replacement result and has sent the address mapping relation to the dynamic memory of the target storage component, the completion of that transfer indicates that the firmware update is complete. The takeover mode is then switched back to the data read-write mode between the storage component and the host side, so that the storage component can work normally.
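The two mode transitions described above can be viewed as a small state machine. The following is a minimal sketch under stated assumptions: the names `Mode`, `ModeController`, `on_takeover_loaded`, and `on_mapping_sent` are hypothetical, and resuming interface access when returning to the data read-write mode is an assumption, since the text only states that access is suspended on entering the takeover mode.

```python
from enum import Enum


class Mode(Enum):
    DATA_READ_WRITE = "data read-write mode"
    TAKEOVER = "takeover mode"


class ModeController:
    """Hypothetical tracker of the mode between storage component and host end."""

    def __init__(self) -> None:
        self.mode = Mode.DATA_READ_WRITE
        self.interface_paused = False

    def on_takeover_loaded(self) -> None:
        # Loading of the takeover program is complete: enter the takeover mode
        # and suspend interface access of the storage component.
        self.mode = Mode.TAKEOVER
        self.interface_paused = True

    def on_mapping_sent(self) -> None:
        # The address mapping relation has been fully sent to the device DRAM.
        if self.mode is not Mode.TAKEOVER:
            raise RuntimeError("mapping hand-back is only valid in takeover mode")
        self.mode = Mode.DATA_READ_WRITE
        self.interface_paused = False        # assumption: access resumes here
```

Rejecting `on_mapping_sent` outside the takeover mode is a design choice of this sketch; it guards against switching modes before the hand-back has actually taken place.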
In addition, so that the storage component can also handle other kinds of processing while executing the data read-write task, offload hardware can be arranged on the storage component, and the offload hardware has a connection relation with the takeover program. Further, when the takeover program sends a data read-write request to the storage component, the offload hardware performs a sequential addressing task, a data reclamation task, a power-down protection task, a protocol parsing task, an error correction code task, and/or a data recovery task in response to the data read-write request.
The offload hardware refers to a hardware circuit deployed on the storage component. Because it has a connection relationship with the takeover program, the offload hardware can take protective measures, such as power-down protection, in time when the host end fails.
Further, the sequential addressing task refers to a sequential data read-write operation task executed mainly by hardware. The data reclamation task refers to a garbage collection (GC) task. The power-down protection task refers to a task that provides enough power to keep the storage component working when the power supply of the system is abnormal. The protocol parsing task refers to the task of parsing and executing the instructions of the configured protocol. The error correction code task refers to the task of protecting the quality of the stored data. The data recovery task refers to a task of performing data recovery processing when data is missing or corrupted.
For example, the storage component is provided with offload hardware, which abstracts conventional firmware operations and implements them in hardware circuits, improving performance through efficient hardware execution. Referring to the schematic diagram shown in fig. 5, the host runs a service application, below which are a file system and the takeover program. The takeover program includes block device management, media management, the FTL, an adapted NVMe (non-volatile memory host controller interface specification) driver, and the like. Block device management abstracts the storage component into a block device and provides a block device interface to the file system. Media management performs operations required during flash memory usage, such as bad block management and wear leveling. The FTL maintains and updates the mapping between the logical and physical addresses of data. The driver builds on the standard NVMe driver and additionally supports functions such as control instruction transmission and operation monitoring feedback in the host takeover mode.
Accordingly, the bypass firmware replaces the problematic firmware inside the storage component, and read-write operations and the like are directly transferred from the host takeover program to the bypass firmware. The system comprises a storage component, a data storage interface, a data transmission overhead and a data medium interface, wherein the storage component is used for storing data, the data storage component is used for storing data, and the data storage component is used for storing data.
In practical application, the host side may itself fail while the bypass firmware is standing in for the fault firmware to execute the data read-write task. To avoid data loss in that case, different content can be recorded in different data storage areas. In this embodiment, the specific implementation is as follows:
responding to the fault detection information of the host end, and reading effective data in the dynamic memory of the target storage component by using the controller; coding the effective data, and performing format conversion on the coded effective data to obtain target data corresponding to the flash memory physical page format; and writing target data corresponding to the physical page format of the flash memory into a data area of the target storage component, and writing a target logic address corresponding to the effective data into an abnormal recording area of the target storage component.
Specifically, the fault detection information refers to detection information generated when the host end fails, and is used to trigger the subsequent protection mechanism. Correspondingly, the dynamic memory refers to the DRAM in the storage component, and the effective data refers to the valid data (dirty pages) in the data cache area of the DRAM. The target data refers to the data obtained by encoding the effective data and converting it into a form that meets the requirements for writing to flash memory; the OOB information is recorded in the target data. The data area refers to the data area of the NAND flash memory, used to record the target data. The abnormal recording area refers to the abnormal information recording area of the NAND flash memory, used to record the target logical address corresponding to the effective data, namely the LBA information.
Based on the above, if the host end fails while the storage component is in the takeover mode, the storage component cannot continue working in that mode. To avoid data loss, the controller reads the effective data in the dynamic memory of the target storage component in response to the fault detection information of the host end. The effective data is then encoded, and the encoded data is format-converted to obtain target data corresponding to the flash memory physical page format. On this basis, the target data is written into the data area of the target storage component, and the target logical address corresponding to the effective data is written into the abnormal recording area of the target storage component. Both the data and the logical address are thus recorded in the flash memory, so that the data remains usable after fault recovery.
On this basis, after the host end recovers, the original address mapping relation can be updated with an incremental address mapping relation, so that the takeover program at the host end can again cooperate with the bypass firmware and the storage component can continue to execute the data read-write task. In this embodiment, the specific implementation is as follows:
Reading the target logical address in the abnormal recording area and the target data in the data area in response to the failure recovery information of the host side; resolving an associated logical address from the target data, and comparing the associated logical address with the target logical address; and determining an increment address mapping relation according to the comparison result, and sending the increment address mapping relation to the takeover program for updating the address mapping relation.
Specifically, the fault recovery information refers to information triggered when the host-side fault has been cleared, and is used to incrementally extend the address mapping relationship in the takeover mode. Correspondingly, the associated logical address refers to the LBA information parsed from the target data; comparing it with the target logical address ensures the accuracy of the incremental address mapping relationship. The incremental address mapping relationship refers to the address mapping relationship that was not recorded while the host end was faulty, and it is used to incrementally update the address mapping relationship held by the takeover program.
Based on this, after the host side is restored, the target logical address can be read in the abnormal recording area and the target data can be read in the data area in response to the failure restoration information of the host side; meanwhile, the associated logical address can be resolved from the target data, and the associated logical address and the target logical address are compared; and determining an increment address mapping relation according to the comparison result, and finally, sending the increment address mapping relation to a takeover program for increment updating of the address mapping relation.
For example, when the host end fails in the takeover mode, it can no longer issue any instruction or feedback. To avoid problems such as data loss, referring to the schematic diagram shown in fig. 6, the controller of the solid state disk (SSD) automatically reads the valid data (dirty pages) in the data buffer area of the DRAM in the current disk, and the dirty pages enter the SSD controller. The valid data is then encoded according to the ECC codeword requirements and assembled into the flash memory physical page format to be written into the flash memory. At the same time, the LBA information written during the failure is recorded in the flash memory. That is, the data is written into the NAND data area, and the LBA information is recorded in the NAND abnormal recording area.
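The flush performed by the controller when the host end fails can be sketched as follows. This is a simplified illustration under stated assumptions: `Nand`, `encode_page`, `flush_on_host_failure`, and `PAGE_SIZE` are hypothetical names, the page size is kept tiny for readability, and a CRC32 checksum from the standard library stands in for real ECC encoding, which detects errors but does not correct them.

```python
import zlib
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

PAGE_SIZE = 16  # illustrative payload size in bytes, far smaller than a real page


@dataclass
class Nand:
    # Each data-area entry is (encoded page bytes, LBA stored as OOB information).
    data_area: List[Tuple[bytes, int]] = field(default_factory=list)
    # LBAs written while the host end was failed.
    abnormal_record_area: List[int] = field(default_factory=list)


def encode_page(payload: bytes) -> bytes:
    """Pad the payload to the page size and append a CRC32 (stand-in for ECC)."""
    if len(payload) > PAGE_SIZE:
        raise ValueError("payload larger than one page")
    padded = payload.ljust(PAGE_SIZE, b"\x00")
    return padded + zlib.crc32(padded).to_bytes(4, "big")


def flush_on_host_failure(dirty_pages: Dict[int, bytes], nand: Nand) -> None:
    """Persist dirty DRAM pages and record their LBAs in the abnormal recording area."""
    for lba, payload in dirty_pages.items():
        nand.data_area.append((encode_page(payload), lba))  # data plus OOB LBA
        nand.abnormal_record_area.append(lba)               # separate LBA record
```

Recording the LBA twice, once alongside the page and once in the abnormal recording area, is what later allows the two copies to be compared after the host end recovers.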
After the host recovers from the failure, it can re-mount the SSD, read from the abnormal recording area the LBA information written when the failure occurred, and read from the data area the OOB area of the flash memory pages written when the failure occurred. The OOB area is compared with the read LBA information to confirm that they are consistent. The mapping relation of the data recorded during the abnormality can then be updated into the FTL at the host end as a fault-time increment to the reconstructed FTL, thereby supporting normal execution of the data read-write task in the takeover mode.
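The recovery step described above, in which the LBAs in the abnormal recording area are compared with the LBAs parsed from the pages written during the failure, can be sketched as follows. The function names and the `(pba, oob_lba)` tuple layout are assumptions made for illustration only.

```python
from typing import Dict, Iterable, Tuple


def incremental_mapping(recorded_lbas: Iterable[int],
                        failure_pages: Iterable[Tuple[int, int]]) -> Dict[int, int]:
    """Build the incremental LBA -> PBA mapping for pages written during the failure.

    recorded_lbas: target logical addresses read from the abnormal recording area.
    failure_pages: (pba, oob_lba) pairs read from the data area.
    """
    recorded = set(recorded_lbas)
    delta: Dict[int, int] = {}
    for pba, oob_lba in failure_pages:
        # Keep only entries whose parsed LBA agrees with the abnormal record.
        if oob_lba in recorded:
            delta[oob_lba] = pba
    return delta


def apply_increment(ftl: Dict[int, int], delta: Dict[int, int]) -> Dict[int, int]:
    """Incrementally update the host-side FTL in place and return it."""
    ftl.update(delta)
    return ftl
```

Entries that appear in only one of the two sources are left out of the increment in this sketch; how such mismatches should be handled is not specified in the text.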
To avoid the problem that the storage component cannot work after the firmware in its controller fails, the firmware changing method provided in this specification loads bypass firmware into the controller from the target flash memory of the storage component when the firmware fails, so that the bypass firmware replaces the fault firmware and continues working, ensuring that the storage component does not go down. On this basis, because the bypass firmware serves only as emergency firmware in the fault scenario and has a simpler structure than the fault firmware, a takeover program needs to be loaded at the host end so that the storage component can continue executing the data read-write task, with the takeover program cooperating with the firmware-replaced target storage component to complete the task. In this process, the takeover program rebuilds the address mapping relation corresponding to the storage component and then uses the firmware-replaced target storage component according to the rebuilt address mapping relation to execute the data read-write task. When the storage component becomes unavailable, unstable, or slow because of a firmware fault, this provides a fallback (escape) option that keeps the storage component working and avoids affecting upstream and downstream services. After the fault firmware is replaced by the bypass firmware, the fault firmware can be repaired without stopping the storage component, which supports hot update of the firmware and makes the storage component more convenient to maintain.
The following description will further explain the firmware changing method by taking the application of the firmware changing method provided in the specification in the solid state disk fault scenario as an example with reference to fig. 7. Fig. 7 is a flowchart of a processing procedure of a firmware changing method according to an embodiment of the present disclosure, which specifically includes the following steps.
In step S702, in response to detection information of the fault firmware in the controller of the storage unit, the fault firmware is replaced with bypass firmware loaded into the controller, wherein the bypass firmware is loaded into the controller by the target flash memory of the storage unit.
In step S704, the host-side takeover program is loaded in response to the detection information, the takeover program is used to scan the data writing area of the storage component, and the target flash memory page into which data has been written is determined in the target flash memory according to the scanning result.
In step S706, the address information of the target flash page is determined, the logical address is extracted from the address information, and the physical address of the target flash page is read.
Step S708, writing the logical address and the physical address into a preset mapping table, and reconstructing, through the takeover program, the address mapping relation corresponding to the storage component by using the mapping table.
Step S710, the takeover program uses the target storage component after the firmware replacement according to the reconstructed address mapping relation to execute the data read-write task corresponding to the target storage component.
In step S712, the controller is used to read the valid data in the dynamic memory of the target storage unit in response to the failure detection information of the host.
Step S714, the effective data is encoded, and format conversion is performed on the encoded effective data to obtain target data corresponding to the flash memory physical page format.
In step S716, the target data corresponding to the flash physical page format is written into the data area of the target storage unit, and the target logical address corresponding to the valid data is written into the abnormal recording area of the target storage unit.
In step S718, in response to the failure recovery information of the host side, the target logical address is read in the abnormal recording area, and the target data is read in the data area.
Step S720, resolving the associated logical address in the target data, and comparing the associated logical address with the target logical address.
Step S722, determining an increment address mapping relation according to the comparison result, and sending the increment address mapping relation to the takeover program for updating the address mapping relation.
In step S724, the replacement firmware is received, and the bypass firmware is replaced with the replacement firmware loaded into the controller.
Step S726, calling a takeover program according to the replacement result, and sending the address mapping relation to the dynamic memory of the target storage component through the takeover program; and the target storage component executes a data reading and writing task according to the address mapping relation.
Corresponding to the method embodiment, the present disclosure further provides a firmware changing system embodiment, and fig. 8 shows a schematic structural diagram of a firmware changing system provided in one embodiment of the present disclosure. As shown in fig. 8, the firmware alteration system 800 includes a storage unit 810 and a host side 820;
the storage unit 810 is configured to replace the fault firmware with bypass firmware loaded into the controller in response to detection information of the fault firmware in the controller, where the bypass firmware is loaded into the controller by a target flash memory;
the host side 820 is configured to load a takeover program, and reconstruct an address mapping relationship corresponding to the storage component by using the takeover program; and the takeover program uses the target storage component replaced by the firmware according to the reconstructed address mapping relation to execute the data read-write task corresponding to the target storage component.
In an alternative embodiment, the host side 820 is further configured to: determining a target flash page of write data in the target flash of the storage component using the takeover program; and determining the address information of the target flash memory page, and reconstructing the address mapping relation corresponding to the storage component by using the address information through the takeover program.
In an alternative embodiment, the storage unit 810 is further configured to: receiving replacement firmware and replacing the bypass firmware with the replacement firmware loaded into the controller; calling the takeover program according to the replacement result, and sending the address mapping relation to the dynamic memory of the target storage component through the takeover program; and the target storage component executes the data reading and writing task according to the address mapping relation.
In an alternative embodiment, the storage component is configured with offload hardware, and the offload hardware has a connection relationship with the takeover program;
in the case that the takeover program transmits a data read-write request to the storage component, the storage unit 810 is further configured to: have the offload hardware respond to the data read-write request and execute a sequential addressing task, a data reclamation task, a power-down protection task, a protocol parsing task, an error correction code task and/or a data recovery task.
In an alternative embodiment, the host side 820 is further configured to: scanning a data writing area of the storage component by using the takeover program, and determining the target flash memory page for writing data in the target flash memory according to a scanning result; the determining the address information of the target flash memory page and reconstructing the address mapping relation corresponding to the storage component by using the address information through the takeover program comprises the following steps: determining address information of the target flash memory page, extracting a logic address from the address information, and reading a physical address of the target flash memory page; writing the logical address and the physical address into a preset mapping table, and reconstructing an address mapping relation corresponding to the storage component by using the mapping table through the takeover program.
In an alternative embodiment, the storage component 810 is further configured to: in response to fault detection information of the host side, read valid data in the dynamic memory of the target storage component by using the controller; encode the valid data, and perform format conversion on the encoded valid data to obtain target data in the flash memory physical page format; and write the target data in the flash memory physical page format into a data area of the target storage component, and write the target logical address corresponding to the valid data into an exception record area of the target storage component.
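A minimal sketch of this flush path is given below, under several assumptions that are not in the specification: the encoding step is stood in for by a CRC32 checksum (a real device would more likely use an error correction code), the physical page layout is a 4-byte logical address header followed by the encoded payload, and `PAGE_SIZE` is an unrealistically small value chosen to keep the example readable.

```python
import zlib
from typing import Dict, List

PAGE_SIZE = 16  # hypothetical page size in bytes, far smaller than real flash


def encode(payload: bytes) -> bytes:
    # Stand-in for the encoding step: append a 4-byte CRC32 checksum.
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def to_physical_page(logical: int, encoded: bytes) -> bytes:
    # Assumed layout: 4-byte logical address header, then the encoded payload,
    # zero-padded up to the page size.
    page = logical.to_bytes(4, "big") + encoded
    if len(page) > PAGE_SIZE:
        raise ValueError("payload does not fit in one page")
    return page.ljust(PAGE_SIZE, b"\x00")


def flush_on_failure(dram: Dict[int, bytes], data_area: List[bytes],
                     exception_record: List[int]) -> None:
    """Persist valid data from dynamic memory when the host reports a fault."""
    for logical, payload in dram.items():
        data_area.append(to_physical_page(logical, encode(payload)))
        # Record which logical addresses were flushed during the fault.
        exception_record.append(logical)
```

Storing the logical address inside each page is what later allows an associated logical address to be parsed back out of the target data.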
In an alternative embodiment, the storage component 810 is further configured to: in response to failure recovery information of the host side, read the target logical address in the exception record area and the target data in the data area; parse an associated logical address from the target data, and compare the associated logical address with the target logical address; and determine an incremental address mapping relation according to the comparison result, and send the incremental address mapping relation to the takeover program for updating the address mapping relation.
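The comparison step can be sketched as follows. The specification does not state how the comparison result determines the incremental mapping; this sketch assumes that a page contributes an entry only when the logical address parsed from it also appears in the exception record area, and that each page begins with a 4-byte logical address header. The function names are hypothetical.

```python
from typing import Dict, List, Tuple


def parse_logical(page: bytes) -> int:
    # Assumed layout: the first 4 bytes of a page hold its logical address.
    return int.from_bytes(page[:4], "big")


def incremental_mapping(exception_record: List[int],
                        data_area: List[Tuple[int, bytes]]) -> Dict[int, int]:
    """Build the logical-to-physical delta for pages flushed during the fault.

    data_area is a list of (physical_address, page_bytes) pairs.
    """
    expected = set(exception_record)
    delta: Dict[int, int] = {}
    for physical, page in data_area:
        logical = parse_logical(page)
        # Only addresses confirmed by the exception record enter the delta.
        if logical in expected:
            delta[logical] = physical
    return delta


def apply_delta(mapping: Dict[int, int], delta: Dict[int, int]) -> None:
    # The takeover program merges the delta into its existing mapping.
    mapping.update(delta)
```

Sending only the delta keeps the update small: entries rebuilt earlier by the takeover program stay in place and only the addresses touched during the fault are overwritten.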
In an alternative embodiment, the host side 820 is further configured to: when loading of the takeover program is completed, switch the data read-write mode between the storage component and the host side to a takeover mode, and suspend interface access of the storage component according to the mode switching result. After the step of calling the takeover program according to the replacement result and sending the address mapping relation to the dynamic memory of the target storage component through the takeover program is executed, the host side 820 is further configured to: when sending of the address mapping relation is completed, switch the takeover mode back to the data read-write mode between the storage component and the host side.
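The two mode switches can be viewed as a small state machine, sketched below. The names `Mode` and `HostLink` are hypothetical, and resuming interface access when leaving the takeover mode is an assumption: the specification only states that access is suspended on entry.

```python
from enum import Enum


class Mode(Enum):
    NORMAL = "normal"      # ordinary data read-write mode
    TAKEOVER = "takeover"  # host-side takeover program handles read-write


class HostLink:
    """Hypothetical tracker for the mode between host side and storage component."""

    def __init__(self) -> None:
        self.mode = Mode.NORMAL
        self.interface_suspended = False

    def on_takeover_loaded(self) -> None:
        # Takeover program finished loading: enter takeover mode and
        # suspend the storage component's normal interface access.
        self.mode = Mode.TAKEOVER
        self.interface_suspended = True

    def on_mapping_sent(self) -> None:
        # Mapping fully sent to device memory: return to normal mode.
        if self.mode is not Mode.TAKEOVER:
            raise RuntimeError("mapping hand-back is only valid in takeover mode")
        self.mode = Mode.NORMAL
        self.interface_suspended = False  # assumed: access resumes here
```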
In summary, to avoid the problem that a storage component cannot work after the firmware in its controller fails, once the firmware is determined to have failed, bypass firmware is loaded into the controller from the target flash memory of the storage component to replace the failed firmware, so that the bypass firmware continues working in its place and the storage component does not go down. Because the bypass firmware serves only as emergency firmware in the fault scenario and has a simpler structure than the failed firmware, a takeover program is also loaded on the host side so that the storage component can continue to execute data read-write tasks: after the firmware replacement, the takeover program cooperates with the target storage component to complete them. In this process, the takeover program reconstructs the address mapping relation corresponding to the storage component, and then uses the firmware-replaced target storage component to execute data read-write tasks according to the reconstructed address mapping relation. Thus, when a firmware fault makes the storage component unavailable, unstable or slow, a fallback option is available that keeps the storage component working and avoids affecting upstream and downstream services. Moreover, after the failed firmware has been replaced by the bypass firmware, the failed firmware can be repaired without stopping the storage component, which supports hot update of the firmware and makes the storage component easier to maintain.
The above is a schematic scheme of a firmware changing system of the present embodiment. It should be noted that, the technical solution of the firmware changing system and the technical solution of the firmware changing method belong to the same concept, and details of the technical solution of the firmware changing system, which are not described in detail, can be referred to the description of the technical solution of the firmware changing method.
Corresponding to the method embodiment, the present disclosure further provides a firmware changing apparatus embodiment. Fig. 9 shows a schematic structural diagram of a firmware changing apparatus provided in one embodiment of the present disclosure. As shown in Fig. 9, the apparatus includes:
a replacement module 902 configured to replace a failed firmware in a controller of a storage component with bypass firmware loaded into the controller in response to detection information of the failed firmware, wherein the bypass firmware is loaded to the controller by a target flash memory of the storage component;
a rebuilding module 904 configured to load a takeover program of the host side in response to the detection information, and rebuild an address mapping relationship corresponding to the storage component by using the takeover program;
and an execution module 906 configured such that the takeover program uses the firmware-replaced target storage component, according to the reconstructed address mapping relation, to execute the data read-write task corresponding to the target storage component.
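How the three modules cooperate can be sketched end to end. Everything here is a simplified illustration: `StorageComponent`, the function names, and the firmware labels are hypothetical, and the page table stands in for flash contents from which a logical address can be read back.

```python
from typing import Dict, Tuple


class StorageComponent:
    """Hypothetical storage component with raw flash pages."""

    def __init__(self, pages: Dict[int, Tuple[int, bytes]]) -> None:
        # physical address -> (logical address, payload)
        self.firmware = "main"
        self.pages = pages

    def raw_read(self, physical: int) -> bytes:
        # Bypass firmware is assumed to support only physical-address reads.
        return self.pages[physical][1]


def replace_firmware(component: StorageComponent) -> None:
    # Replacement module: swap the failed firmware for bypass firmware.
    component.firmware = "bypass"


def rebuild(component: StorageComponent) -> Dict[int, int]:
    # Reconstruction module: the host rebuilds logical -> physical mapping.
    return {logical: physical
            for physical, (logical, _) in component.pages.items()}


def execute_read(component: StorageComponent, mapping: Dict[int, int],
                 logical: int) -> bytes:
    # Execution module: the host translates the address, the device reads.
    return component.raw_read(mapping[logical])


def handle_fault(component: StorageComponent, logical: int) -> bytes:
    """Run the three modules in order for a single read after a fault."""
    replace_firmware(component)
    mapping = rebuild(component)
    return execute_read(component, mapping, logical)
```

The point of the split is that address translation moves to the host while the device, running the simpler bypass firmware, only has to serve physical reads and writes.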
In an alternative embodiment, the reconstruction module 904 is further configured to:
determine, by using the takeover program, a target flash memory page in the target flash memory of the storage component to which data has been written; and determine the address information of the target flash memory page, and reconstruct the address mapping relation corresponding to the storage component by using the address information through the takeover program.
In an alternative embodiment, the apparatus further comprises:
a firmware receiving module configured to: receive replacement firmware and replace the bypass firmware with the replacement firmware loaded into the controller; and call the takeover program according to the replacement result, and send the address mapping relation to the dynamic memory of the target storage component through the takeover program, so that the target storage component executes the data read-write task according to the address mapping relation.
In an alternative embodiment, the storage component is configured with offload hardware, and the offload hardware has a connection relationship with the takeover program;
in the case that the takeover program sends a data read-write request to the storage component, the offload hardware responds to the data read-write request and executes a sequential addressing task, a data recovery task, a power-down protection task, a protocol parsing task and/or an error correction code task.
In an alternative embodiment, the reconstruction module 904 is further configured to:
scan a data writing area of the storage component by using the takeover program, and determine, according to the scanning result, the target flash memory page in the target flash memory to which data has been written. Determining the address information of the target flash memory page and reconstructing the address mapping relation corresponding to the storage component by using the address information through the takeover program comprises: determining the address information of the target flash memory page, extracting a logical address from the address information, and reading a physical address of the target flash memory page; and writing the logical address and the physical address into a preset mapping table, and reconstructing the address mapping relation corresponding to the storage component by using the mapping table through the takeover program.
In an alternative embodiment, the apparatus further comprises:
an encoding module configured to: in response to fault detection information of the host side, read valid data in the dynamic memory of the target storage component by using the controller; encode the valid data, and perform format conversion on the encoded valid data to obtain target data in the flash memory physical page format; and write the target data in the flash memory physical page format into a data area of the target storage component, and write the target logical address corresponding to the valid data into an exception record area of the target storage component.
In an alternative embodiment, the apparatus further comprises:
an updating module configured to: in response to failure recovery information of the host side, read the target logical address in the exception record area and the target data in the data area; parse an associated logical address from the target data, and compare the associated logical address with the target logical address; and determine an incremental address mapping relation according to the comparison result, and send the incremental address mapping relation to the takeover program for updating the address mapping relation.
In an alternative embodiment, the apparatus further comprises:
a switching module configured to: when loading of the takeover program is completed, switch the data read-write mode between the storage component and the host side to a takeover mode, and suspend interface access of the storage component according to the mode switching result; and, after the takeover program is called according to the replacement result and the address mapping relation is sent to the dynamic memory of the target storage component through the takeover program, when sending of the address mapping relation is completed, switch the takeover mode back to the data read-write mode between the storage component and the host side.
The above is a schematic scheme of a firmware changing apparatus of the present embodiment. It should be noted that, the technical solution of the firmware changing apparatus and the technical solution of the firmware changing method belong to the same concept, and details of the technical solution of the firmware changing apparatus, which are not described in detail, can be referred to the description of the technical solution of the firmware changing method.
Fig. 10 illustrates a block diagram of a computing device 1000 provided in accordance with one embodiment of the present description. The components of the computing device 1000 include, but are not limited to, a memory 1010 and a processor 1020. Processor 1020 is coupled to memory 1010 via bus 1030 and database 1050 is used to store data.
Computing device 1000 also includes access device 1040, which enables computing device 1000 to communicate via one or more networks 1060. Examples of such networks include the public switched telephone network (PSTN), a local area network (LAN), a wide area network (WAN), a personal area network (PAN), or a combination of communication networks such as the Internet. The access device 1040 may include one or more of any type of network interface, wired or wireless, such as a network interface card (NIC), an IEEE 802.11 wireless local area network (WLAN) interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a universal serial bus (USB) interface, a cellular network interface, a Bluetooth interface, a near-field communication (NFC) interface, and so forth.
In one embodiment of the application, the above-described components of computing device 1000, as well as other components not shown in FIG. 10, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device illustrated in FIG. 10 is for exemplary purposes only and is not intended to limit the scope of the present application. Those skilled in the art may add or replace other components as desired.
Computing device 1000 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), a mobile phone (e.g., smartphone), a wearable computing device (e.g., smart watch, smart glasses, etc.), or another type of mobile device, or a stationary computing device such as a desktop computer or personal computer (PC). Computing device 1000 may also be a mobile or stationary server.
The processor 1020 is configured to execute computer-executable instructions that, when executed by the processor, implement the steps of the firmware change method described above.
The foregoing is a schematic illustration of a computing device of this embodiment. It should be noted that, the technical solution of the computing device and the technical solution of the firmware changing method belong to the same concept, and details of the technical solution of the computing device, which are not described in detail, can be referred to the description of the technical solution of the firmware changing method.
An embodiment of the present disclosure also provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the firmware changing method described above.
The above is an exemplary version of a computer-readable storage medium of the present embodiment. It should be noted that, the technical solution of the storage medium and the technical solution of the firmware changing method belong to the same concept, and details of the technical solution of the storage medium which are not described in detail can be referred to the description of the technical solution of the firmware changing method.
An embodiment of the present disclosure further provides a computer program, where the computer program, when executed in a computer, causes the computer to perform the steps of the firmware changing method described above.
The above is an exemplary version of a computer program of the present embodiment. It should be noted that, the technical solution of the computer program and the technical solution of the firmware changing method belong to the same concept, and details of the technical solution of the computer program, which are not described in detail, can be referred to the description of the technical solution of the firmware changing method.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The computer instructions include computer program code, which may be in source code form, object code form, an executable file, some intermediate form, and so on. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content of the computer-readable medium may be increased or decreased as appropriate according to the requirements of patent practice; for example, in some jurisdictions, according to patent practice, the computer-readable medium does not include electrical carrier signals and telecommunications signals.
It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of combinations of actions, but it should be understood by those skilled in the art that the embodiments are not limited by the order of actions described, as some steps may be performed in other order or simultaneously according to the embodiments of the present disclosure. Further, those skilled in the art will appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily all required for the embodiments described in the specification.
In the foregoing embodiments, the description of each embodiment has its own emphasis; for parts of one embodiment that are not described in detail, reference may be made to the related descriptions of other embodiments.
The preferred embodiments of the present specification disclosed above are merely used to help clarify the present specification. Alternative embodiments are not intended to be exhaustive or to limit the invention to the precise form disclosed. Obviously, many modifications and variations are possible in light of the teaching of the embodiments. The embodiments were chosen and described in order to best explain the principles of the embodiments and the practical application, to thereby enable others skilled in the art to best understand and utilize the invention. This specification is to be limited only by the claims and the full scope and equivalents thereof.

Claims (12)

1. A firmware altering method, comprising:
replacing the fault firmware with bypass firmware loaded into a controller of a storage component in response to detection information of the fault firmware in the controller, wherein the bypass firmware is loaded into the controller by a target flash memory of the storage component;
loading a takeover program of a host side in response to the detection information, and rebuilding an address mapping relation corresponding to the storage component by using the takeover program;
and executing, by the takeover program, the data read-write task corresponding to the target storage component by using the firmware-replaced target storage component according to the reconstructed address mapping relation.
2. The method of claim 1, wherein reconstructing, with the takeover program, the address mapping relationship corresponding to the storage component, comprises:
determining, by using the takeover program, a target flash memory page in the target flash memory of the storage component to which data has been written;
and determining the address information of the target flash memory page, and reconstructing the address mapping relation corresponding to the storage component by using the address information through the takeover program.
3. The method according to claim 1, further comprising, after the step in which the takeover program uses the firmware-replaced target storage component according to the reconstructed address mapping relation to execute the data read-write task corresponding to the target storage component:
receiving replacement firmware and replacing the bypass firmware with the replacement firmware loaded into the controller;
and calling the takeover program according to the replacement result, and sending the address mapping relation to the dynamic memory of the target storage component through the takeover program so that the target storage component executes the data read-write task according to the address mapping relation.
4. The method of claim 1, wherein the storage component is configured with offload hardware, the offload hardware having a connection relationship with the takeover program;
and in the case that the takeover program sends a data read-write request to the storage component, the offload hardware responds to the data read-write request and executes a sequential addressing task, a data recovery task, a power-down protection task, a protocol parsing task and/or an error correction code task.
5. The method of claim 2, wherein determining, by using the takeover program, the target flash memory page in the target flash memory of the storage component to which data has been written comprises:
scanning a data writing area of the storage component by using the takeover program, and determining, according to the scanning result, the target flash memory page in the target flash memory to which data has been written;
and wherein determining the address information of the target flash memory page and reconstructing the address mapping relation corresponding to the storage component by using the address information through the takeover program comprises:
determining the address information of the target flash memory page, extracting a logical address from the address information, and reading a physical address of the target flash memory page;
writing the logical address and the physical address into a preset mapping table, and reconstructing the address mapping relation corresponding to the storage component by using the mapping table through the takeover program.
6. The method according to claim 1, further comprising, after the step in which the takeover program uses the firmware-replaced target storage component according to the reconstructed address mapping relation to execute the data read-write task corresponding to the target storage component:
in response to fault detection information of the host side, reading valid data in the dynamic memory of the target storage component by using the controller;
encoding the valid data, and performing format conversion on the encoded valid data to obtain target data in the flash memory physical page format;
and writing the target data in the flash memory physical page format into a data area of the target storage component, and writing the target logical address corresponding to the valid data into an exception record area of the target storage component.
7. The method of claim 6, further comprising, after the step of writing the target logical address corresponding to the valid data into the exception record area of the target storage component is performed:
in response to failure recovery information of the host side, reading the target logical address in the exception record area and the target data in the data area;
parsing an associated logical address from the target data, and comparing the associated logical address with the target logical address;
and determining an incremental address mapping relation according to the comparison result, and sending the incremental address mapping relation to the takeover program for updating the address mapping relation.
8. The method according to claim 3, wherein after the step of loading the host-side takeover program in response to the detection information and reconstructing the address mapping relationship corresponding to the storage unit by using the takeover program is performed, the method further comprises:
under the condition that the loading of the takeover program is completed, switching a data read-write mode between the storage component and the host end into a takeover mode, and suspending the interface access of the storage component according to a mode switching result;
after the step of calling the takeover program according to the replacement result and sending the address mapping relation to the dynamic memory of the target storage component through the takeover program is executed, the method further comprises:
when sending of the address mapping relation is completed, switching the takeover mode back to the data read-write mode between the storage component and the host side.
9. A firmware changing system, comprising a storage component and a host side, wherein:
the storage component is used for responding to detection information of fault firmware in a controller and replacing the fault firmware by using bypass firmware loaded into the controller, wherein the bypass firmware is loaded to the controller by a target flash memory;
the host end is used for loading a takeover program and reconstructing an address mapping relation corresponding to the storage component by using the takeover program; and the takeover program uses the target storage component replaced by the firmware according to the reconstructed address mapping relation to execute the data read-write task corresponding to the target storage component.
10. A firmware changing apparatus comprising:
a replacement module configured to replace a failed firmware in a controller of a storage component with bypass firmware loaded into the controller in response to detection information of the failed firmware, wherein the bypass firmware is loaded to the controller by a target flash memory of the storage component;
a reconstruction module configured to load a takeover program of the host side in response to the detection information, and reconstruct an address mapping relation corresponding to the storage component by using the takeover program;
and an execution module configured such that the takeover program uses the firmware-replaced target storage component, according to the reconstructed address mapping relation, to execute the data read-write task corresponding to the target storage component.
11. A computing device, comprising:
a memory and a processor;
the memory is configured to store computer executable instructions, the processor being configured to execute the computer executable instructions, which when executed by the processor, implement the steps of the method of any one of claims 1 to 8.
12. A computer readable storage medium storing computer executable instructions which when executed by a processor implement the steps of the method of any one of claims 1 to 8.
CN202310774084.3A 2023-06-27 2023-06-27 Firmware changing method, system and device Pending CN116974961A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310774084.3A CN116974961A (en) 2023-06-27 2023-06-27 Firmware changing method, system and device

Publications (1)

Publication Number Publication Date
CN116974961A true CN116974961A (en) 2023-10-31

Family

ID=88470349

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310774084.3A Pending CN116974961A (en) 2023-06-27 2023-06-27 Firmware changing method, system and device

Country Status (1)

Country Link
CN (1) CN116974961A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117312055A (en) * 2023-11-16 2023-12-29 荣耀终端有限公司 Data backup method and related device
CN117312055B (en) * 2023-11-16 2024-04-19 荣耀终端有限公司 Data backup method and related device

Similar Documents

Publication Publication Date Title
US8239714B2 (en) Apparatus, system, and method for bad block remapping
CN102929750B (en) Nonvolatile media dirty region tracking
US8286028B2 (en) Backup method and disk array apparatus
JP4821448B2 (en) RAID controller and RAID device
US9817600B2 (en) Configuration information backup in memory systems
CN103970481A (en) Method and device for reconstructing memory array
CN101482837B (en) Error correction method and apparatus for flash memory file system
US20040103246A1 (en) Increased data availability with SMART drives
CN102184129B (en) Fault tolerance method and device for disk arrays
US7975171B2 (en) Automated file recovery based on subsystem error detection results
CN104050056A (en) File system backup of multi-storage-medium device
CN103577121A (en) High-reliability linear file access method based on nand flash
US11232032B2 (en) Incomplete write group journal
CN116974961A (en) Firmware changing method, system and device
US9519545B2 (en) Storage drive remediation in a raid system
CN111831476A (en) Method of controlling operation of RAID system
US9378092B2 (en) Storage control apparatus and storage control method
US11809295B2 (en) Node mode adjustment method for when storage cluster BBU fails and related component
JP2022017212A (en) System and device for data restoration for ephemeral storage
CN116204137B (en) Distributed storage system, control method, device and equipment based on DPU
JP4951493B2 (en) Disk array device
CN102737716B (en) Memorizer memory devices, Memory Controller and method for writing data
US20210208986A1 (en) System and method for facilitating storage system operation with global mapping to provide maintenance without a service interrupt
CN101739308B (en) Method for generating image file and storage system for image file
CN114398087A (en) Method for improving running stability of single chip microcomputer after program updating and single chip microcomputer

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 553, 5th Floor, Building 3, No. 969 Wenyi West Road, Wuchang Street, Yuhang District, Hangzhou City, Zhejiang Province, 310023

Applicant after: Hangzhou Alibaba Cloud Feitian Information Technology Co.,Ltd.

Address before: Room 553, 5th Floor, Building 3, No. 969 Wenyi West Road, Wuchang Street, Yuhang District, Hangzhou City, Zhejiang Province, 310023

Applicant before: Hangzhou Alibaba Feitian Information Technology Co.,Ltd.