WO2019196199A1 - Method and device for processing disk bad tracks, and computer storage medium - Google Patents

Method and device for processing disk bad tracks, and computer storage medium

Info

Publication number
WO2019196199A1
Authority
WO
WIPO (PCT)
Prior art keywords
target disk
disk
bad track
target
area
Prior art date
Application number
PCT/CN2018/091579
Other languages
English (en)
French (fr)
Inventor
谢佳祥
张旭
郑雅娟
潘志淮
Original Assignee
网宿科技股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 网宿科技股份有限公司
Priority to EP18889973.6A (patent EP3745270A4)
Priority to US16/506,349 (patent US11073998B2)
Publication of WO2019196199A1

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/36 Monitoring, i.e. supervising the progress of recording or reproducing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0727 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a storage system, e.g. in a DASD or network based storage system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0644 Management of space entities, e.g. partitions, extents, pools
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1076 Parity data used in redundant arrays of independent storages, e.g. in RAID systems
    • G06F11/1088 Reconstruction on already foreseen single or plurality of spare disks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/22 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/2205 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing using arrangements specific to the hardware being tested
    • G06F11/2221 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing using arrangements specific to the hardware being tested to test input/output devices or peripheral units
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614 Improving the reliability of storage systems
    • G06F3/0619 Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0653 Monitoring storage devices or systems
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/10 Digital recording or reproducing
    • G11B20/18 Error detection or correction; Testing, e.g. of drop-outs
    • G11B20/1883 Methods for assignment of alternate areas for defective areas
    • G11B20/1889 Methods for assignment of alternate areas for defective areas with discs
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10 Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G11B27/11 Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information not detectable on the record carrier
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/10 Digital recording or reproducing
    • G11B20/18 Error detection or correction; Testing, e.g. of drop-outs
    • G11B20/1816 Testing
    • G11B2020/1826 Testing wherein a defect list or error map is generated
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/10 Digital recording or reproducing
    • G11B20/18 Error detection or correction; Testing, e.g. of drop-outs
    • G11B20/1816 Testing
    • G11B2020/183 Testing wherein at least one additional attempt is made to read or write the data when a first attempt is unsuccessful

Definitions

  • The present application relates to the field of hardware device detection technologies, and in particular to a method and device for processing disk bad tracks and a computer storage medium.
  • At present, in some large-scale distributed application environments, the number of disks is large and the disks are relatively scattered, which makes disk maintenance difficult.
  • For example, in a content delivery network, when a disk fails, staff are usually assigned to replace it only after the disk goes offline or the service or quality of service is seriously affected.
  • The disadvantage of this approach is that failed disks are handled inefficiently.
  • In addition, bad tracks are a common disk fault. If a new disk is used to replace a disk with only a few bad tracks, considerable resources are wasted.
  • The purpose of the present application is to provide a method and device for processing disk bad tracks and a computer storage medium, which can improve the efficiency of handling failed disks while saving disk resources.
  • To achieve the above purpose, one aspect of the present application provides a method for processing disk bad tracks, the method comprising: acquiring a target disk to be processed, and detecting bad track data in the target disk; merging the bad track areas characterized by the bad track data to obtain the available area of the target disk other than the bad track areas; and determining, according to the detection result, whether the target disk is available, and if so, rebuilding the storage space of the target disk based on the available area and setting access parameters for the rebuilt storage space.
  • Another aspect of the present application provides a device for processing disk bad tracks, the device comprising: a disk detection unit, configured to acquire a target disk to be processed and detect bad track data in the target disk; a bad track isolation unit, configured to merge the bad track areas characterized by the bad track data to obtain the available area of the target disk other than the bad track areas; and a space reconstruction unit, configured to determine, according to the detection result, whether the target disk is available, and if so, rebuild the storage space of the target disk based on the available area and set access parameters for the rebuilt storage space.
  • Another aspect of the present application provides a device for processing disk bad tracks, the device comprising a memory and a processor, the memory being used to store a computer program which, when executed by the processor, performs the above method.
  • Another aspect of the present application provides a computer storage medium for storing a computer program which, when executed by a processor, performs the above method.
  • As can be seen from the above, the technical solution provided by the present application can automatically identify whether the target disk needs a bad track test. Specifically, the health status information of the target disk can be obtained periodically and analyzed to determine whether the target disk needs a bad track test.
  • When the bad track test needs to be performed, the bad track areas characterized by the bad track data can be merged, and the merged bad track areas can be isolated in the target disk, thereby obtaining the available area with the bad track areas excluded. Because of the bad track areas, the available area may be distributed discretely across the target disk. In this case, in order to use the available area normally, the storage space of the target disk may be rebuilt based on the available area, and access parameters may be set for the rebuilt storage space.
  • Specifically, the target disk may be divided into a plurality of partitions according to the available area, and the plurality of partitions may be merged into one volume.
  • After the merged volume is formatted, the formatted volume can reuse the target disk's original volume label and mount point, thus completing the processing of the failed disk. After processing, because the bad track areas are isolated, they do not affect the use of the normal areas.
  • It can be seen that, with the technical solution provided by the present application, the entire disk does not need to be replaced when a small number of bad tracks occur. Instead, bad track isolation makes full use of the available area, which improves the efficiency of handling failed disks while saving disk resources.
  • FIG. 1 is a schematic diagram of a processing method of a disk bad track in the embodiment of the present application.
  • FIG. 2 is a flowchart of a method for processing a bad track of a disk in the embodiment of the present application.
  • FIG. 3 is a schematic diagram of bad track isolation in the embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of a processing device for a disk bad track in the embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a computer terminal in an embodiment of the present application.
  • The present application provides a method for processing disk bad tracks.
  • The method may include the following steps.
  • S1: Acquire a target disk to be processed, and detect bad track data in the target disk.
  • A disk state monitoring script can be executed periodically; when executed, the script obtains the health status information of the target disk.
  • The health status information of the target disk may include at least one of: self-test information of the target disk, log information of the operating system where the target disk is located, and input/output load information of the target disk.
  • The self-test information may be the S.M.A.R.T (Self-Monitoring, Analysis and Reporting Technology) information of the target disk, which may be kept in a system service area of the hard disk.
  • Binary code can be used as the basic instructions of S.M.A.R.T; these are written into standard registers as specified, forming a specific S.M.A.R.T information table for normal detection and operation.
  • The S.M.A.R.T instructions can be divided into main commands (Commands) and sub-commands (Subcommands).
  • The main command mainly indicates whether the device supports S.M.A.R.T or ignores a certain command feature.
  • The sub-command provides detection information for devices that support S.M.A.R.T.
  • From the S.M.A.R.T information, it can be seen whether the target disk has failed (FAIL) and whether errors occurred while the target disk was running, and the error count can be used to judge whether the target disk has a large number of errors (for example, an error count of 200 or above).
  • The log information may refer to the message information of the operating system, which can indicate whether the target disk has read/write errors and whether there is a file system error.
  • The input/output load information can characterize whether the target disk is running under high load.
  • The running states of the target disk may be preset, and the determination result for each running state may then be determined from the health status information of the target disk, yielding a combination of determination results for the target disk.
  • Table 1 lists several running states and, for each, examples of possible determination results; each row can serve as one combination of determination results.
  • After the combination of determination results is obtained, the preset processing policy corresponding to that combination may be invoked; the preset processing policy identifies whether the target disk needs a bad track test.
  • The preset processing policy may be the information in the "processing action" column, and different combinations of determination results may correspond to different processing actions.
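As an illustration of how a combination of determination results could be mapped to a processing action, here is a minimal sketch. Table 1 is not reproduced in this text, so the four running states, the policy rows, the action names, and the fallback action are all hypothetical; only the error-count value of 200 comes from the description above.

```python
from typing import Dict, NamedTuple


class Determination(NamedTuple):
    """One combination of determination results (one hypothetical table row)."""
    smart_failed: bool  # S.M.A.R.T reports FAIL
    many_errors: bool   # error count at or above the threshold
    log_errors: bool    # read/write or file system errors in the system log
    high_load: bool     # disk is running under high I/O load


ERROR_THRESHOLD = 200  # example value from the description


def determine(smart_failed: bool, error_count: int,
              log_errors: bool, high_load: bool) -> Determination:
    """Turn raw health status information into a combination of results."""
    return Determination(smart_failed, error_count >= ERROR_THRESHOLD,
                         log_errors, high_load)


# Hypothetical policy rows: a healthy disk (busy or idle) needs no action.
POLICY: Dict[Determination, str] = {
    Determination(False, False, False, False): "no_action",
    Determination(False, False, False, True): "no_action",
}
# Assumption: any combination not listed above triggers a bad track test.
DEFAULT_ACTION = "run_bad_track_test"


def processing_action(result: Determination) -> str:
    """Look up the preset processing policy for a combination."""
    return POLICY.get(result, DEFAULT_ACTION)
```

A lookup table keeps the policy as data, so rows can be changed without touching the decision code.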
  • S3: Merge the bad track areas characterized by the bad track data to obtain the available area of the target disk other than the bad track areas.
  • Before the bad track test is performed, the service on the target disk, or the service on the server where the target disk is located, may be suspended. For example, the target disk can be temporarily removed from the server, or the server where the target disk is located can be shut down directly. In this way, the bad track test on the target disk does not interfere with normal service.
  • The logical bad tracks in the target disk may be repaired first.
  • The bad track data corresponding to bad tracks that cannot be repaired may be saved to a specified file.
  • The specified file may be a file preset in the operating system. When the bad track data is analyzed, it may be read through the access path of the specified file.
  • The bad track data may be a one-dimensional array of non-negative integers.
  • The array can contain multiple elements, where each element represents a bad track with a capacity of 4 KB, and the value of the element represents the location of that bad track on the target disk.
  • The element value N is an integer starting from 0.
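Reading the bad track data back from the specified file might look like the sketch below. The on-disk format is not given in the text, so one integer per line is an assumption, and the function names are hypothetical. Sorting and de-duplicating the values is a convenience added here, not something the description requires.

```python
from pathlib import Path
from typing import List


def parse_bad_track_data(text: str) -> List[int]:
    """Parse bad track data, assuming one non-negative integer per line."""
    values = set()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        value = int(line)
        if value < 0:
            raise ValueError(f"bad track position must be non-negative: {value}")
        values.add(value)
    return sorted(values)


def load_bad_track_data(path: str) -> List[int]:
    """Read the specified file through its access path and parse it."""
    return parse_bad_track_data(Path(path).read_text())
```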
  • The area where the bad tracks are located can be covered with as small an area as possible, and the covering area can be isolated from the target disk, leaving the available area of the target disk that can be used normally.
  • The storage space of the target disk may be divided into a specified number of sub-regions in advance.
  • For example, the target disk can be divided equally into 100 sub-regions numbered 0 to 99, where 0 represents 0-1% of the storage space, 1 represents 1-2% of the storage space, and so on. Then, the target sub-region corresponding to each element value in the bad track data may be determined among the specified number of sub-regions.
  • Specifically, the total capacity of the target disk is first determined in units of 4 KB (that is, the number of 4 KB blocks contained in the actual capacity of the target disk).
  • The element value can then be divided by the total capacity of the target disk, and the result rounded, to determine which sub-region the element value corresponds to.
  • The target sub-regions determined in this way are the sub-regions that contain bad tracks.
  • For example, the shaded sub-regions in FIG. 3 are target sub-regions that contain bad tracks.
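The mapping from element values to sub-regions can be sketched as follows. The text only says the quotient is rounded; this sketch assumes the ratio is scaled by the number of sub-regions and rounded down, which agrees with sub-region 0 covering 0-1% of the storage space. Function names are hypothetical.

```python
from typing import Iterable, List

BLOCK_SIZE = 4 * 1024  # each element represents one 4 KB bad track


def total_blocks(disk_bytes: int) -> int:
    """Total capacity of the disk expressed in 4 KB units."""
    return disk_bytes // BLOCK_SIZE


def sub_region_of(element_value: int, total: int, num_sub_regions: int = 100) -> int:
    """Sub-region number for one element value (assumed floor of the scaled ratio)."""
    if not 0 <= element_value < total:
        raise ValueError("element value lies outside the disk")
    return element_value * num_sub_regions // total


def target_sub_regions(bad_tracks: Iterable[int], total: int,
                       num_sub_regions: int = 100) -> List[int]:
    """Sorted, de-duplicated sub-regions that contain at least one bad track."""
    return sorted({sub_region_of(v, total, num_sub_regions) for v in bad_tracks})
```

Integer arithmetic avoids floating-point rounding at sub-region boundaries.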
  • Adjacent target sub-regions whose interval satisfies a specified condition may be merged, yielding multiple merged areas.
  • In other words, a suitably sized area may be used to cover several target sub-regions whose intervals satisfy the specified condition.
  • The interval satisfies the specified condition when the number of sub-regions between two adjacent target sub-regions is less than or equal to a specified number threshold.
  • For example, two adjacent target sub-regions may be merged as long as no more than 2 sub-regions lie between them. Referring to FIG. 3, one sub-region lies between the first and second target sub-regions, so the two can be merged; when they are merged, the sub-region between them is merged as well, so the first merged area contains three sub-regions.
  • Between the second and third target sub-regions, the merge condition is not satisfied, so these two target sub-regions are not merged.
  • The third, fourth, and fifth target sub-regions all satisfy the merge condition, so they can be merged together with the sub-regions between them, and the resulting merged area contains four sub-regions.
  • An isolation area may be determined at the head and/or the tail of each merged area.
  • For example, the isolation area may be one sub-region (1% of the total capacity of the target disk), and the combination of a merged area and its isolation areas is taken as a merged bad track area.
  • Because the first merged area is at the head of the target disk, only one sub-region at its tail is used as an isolation area, so the four sub-regions in the dashed box are finally regarded as a bad track area.
  • Because the second merged area is in the middle of the target disk, one sub-region at both its head and its tail can be designated as an isolation area, so the six sub-regions in the dashed box serve as a bad track area.
  • After the bad track areas are excluded from the target disk, the remaining areas can be used as the available area.
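The merge and isolation steps can be sketched as below. FIG. 3 is not available here, so the sub-region positions in the usage note are an assumption chosen to reproduce the counts in the text (merged areas of three and four sub-regions, bad track areas of four and six). Fusing bad track areas that touch after padding is a design choice of this sketch.

```python
from typing import Iterable, List, Tuple

Range = Tuple[int, int]  # inclusive (first, last) sub-region numbers


def merge_targets(targets: Iterable[int], max_gap: int = 2) -> List[Range]:
    """Merge target sub-regions separated by at most max_gap sub-regions."""
    merged: List[Range] = []
    for t in sorted(set(targets)):
        if merged and t - merged[-1][1] - 1 <= max_gap:
            merged[-1] = (merged[-1][0], t)  # absorb the gap as well
        else:
            merged.append((t, t))
    return merged


def add_isolation(merged: List[Range], num_sub_regions: int, pad: int = 1) -> List[Range]:
    """Pad each merged area with isolation sub-regions, clipped to the disk."""
    bad: List[Range] = []
    for start, end in merged:
        start = max(0, start - pad)
        end = min(num_sub_regions - 1, end + pad)
        if bad and start <= bad[-1][1] + 1:
            # Sketch-specific choice: fuse areas that touch after padding.
            bad[-1] = (bad[-1][0], max(bad[-1][1], end))
        else:
            bad.append((start, end))
    return bad


def available_areas(bad: List[Range], num_sub_regions: int) -> List[Range]:
    """Return the ranges of sub-regions left after excluding bad track areas."""
    free: List[Range] = []
    cursor = 0
    for start, end in bad:
        if start > cursor:
            free.append((cursor, start - 1))
        cursor = end + 1
    if cursor < num_sub_regions:
        free.append((cursor, num_sub_regions - 1))
    return free
```

With 13 sub-regions and assumed target sub-regions 0, 2, 7, 9, and 10, `merge_targets` gives `[(0, 2), (7, 10)]`, `add_isolation` gives `[(0, 3), (6, 11)]`, and `available_areas` gives `[(4, 5), (12, 12)]`.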
  • S5: Determine, according to the detection result, whether the target disk is available. If it is, rebuild the storage space of the target disk based on the available area, and set access parameters for the rebuilt storage space.
  • After the bad tracks are detected and isolated, whether the target disk can continue to be used may be determined from the detection result. Specifically, three factors can be considered together: the number of bad tracks, the capacity of the available area, and the number of partitions of the available area. If the number of detected bad tracks is greater than a specified bad track threshold, or the capacity of the available area is less than a specified capacity threshold, or the number of partitions obtained by partitioning according to the available area is greater than a specified partition threshold, it may be determined that the target disk is unavailable.
  • If the number of detected bad tracks is less than or equal to the specified bad track threshold, the capacity of the available area is greater than or equal to the specified capacity threshold, and the number of partitions obtained by partitioning according to the available area is less than or equal to the specified partition threshold, it can be determined that the target disk is available.
  • For example, if the number of bad tracks does not exceed 200, the capacity of the available area is not less than 90% of the total capacity of the target disk, and the number of partitions of the available area does not exceed 4, it can be determined that the target disk is available.
  • If the target disk is unavailable, an alarm message may be sent to notify the administrator to replace the disk.
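The availability decision reduces to three threshold checks. The sketch below uses the example thresholds from the text (200 bad tracks, 90% capacity, 4 partitions) as defaults; the function and parameter names are hypothetical.

```python
def is_disk_available(bad_track_count: int,
                      available_fraction: float,
                      partition_count: int,
                      max_bad_tracks: int = 200,
                      min_available_fraction: float = 0.9,
                      max_partitions: int = 4) -> bool:
    """Return True only if all three conditions from the description hold."""
    return (bad_track_count <= max_bad_tracks
            and available_fraction >= min_available_fraction
            and partition_count <= max_partitions)
```

A caller that gets `False` would send the alarm message instead of rebuilding the storage space.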
  • When the target disk is available, it may be divided into multiple partitions according to the available area. Specifically, contiguous available sub-regions can be grouped into one partition. For example, in FIG. 3, the blank sub-regions outside the dashed boxes are all available, so the remaining available area can be divided into two partitions: the first contains 2 sub-regions and the second contains only 1 sub-region. After the available area is divided into multiple partitions, the partitions can be merged into one volume by the LVM (Logical Volume Manager) function, and the merged volume is used as the rebuilt storage space.
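One way to express the partitioning and volume merging is to build the command lines for `parted` and the LVM tools, as sketched below. The text names only LVM, so `parted`, the GPT label, the volume group and logical volume names, and the `/dev/sdX1`-style partition naming are assumptions. The sketch assumes 100 sub-regions of 1% each and only builds the commands without running them, because `mklabel` destroys the existing partition table.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # inclusive sub-region numbers, each sub-region = 1%


def partition_commands(device: str, free_ranges: List[Range]) -> List[List[str]]:
    """Build parted commands creating one partition per available range."""
    cmds = [["parted", "-s", device, "mklabel", "gpt"]]  # destructive step
    for start, end in free_ranges:
        cmds.append(["parted", "-s", device, "mkpart", "primary",
                     f"{start}%", f"{end + 1}%"])
    return cmds


def lvm_commands(device: str, partition_count: int,
                 vg: str = "vg_disk", lv: str = "lv_disk") -> List[List[str]]:
    """Build LVM commands merging all partitions into one logical volume."""
    # Assumes sdX-style naming, where partition i of /dev/sdb is /dev/sdb<i>.
    parts = [f"{device}{i}" for i in range(1, partition_count + 1)]
    return [
        ["pvcreate", *parts],
        ["vgcreate", vg, *parts],
        ["lvcreate", "-l", "100%FREE", "-n", lv, vg],
    ]
```

Returning argument lists keeps the plan easy to review before an operator passes each entry to `subprocess.run`.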
  • The volume label and the mount point of the target disk may be recorded, so that after the rebuilt storage space is obtained, it can be formatted, and the recorded volume label and mount point of the target disk can be set as the volume label and mount point of the formatted storage space, thereby completing the setting of the access parameters.
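Recording and restoring the access parameters could be sketched with standard Linux tools. The text does not name a file system or any tools, so ext4, `blkid`, `findmnt`, `mkfs.ext4`, and `mount` are assumptions, and the function names are hypothetical. As before, the commands are only built, not executed.

```python
from typing import List


def record_commands(device: str) -> List[List[str]]:
    """Commands that print the current volume label and mount point."""
    return [
        ["blkid", "-s", "LABEL", "-o", "value", device],
        ["findmnt", "-n", "-o", "TARGET", "--source", device],
    ]


def access_parameter_commands(volume_path: str, label: str,
                              mount_point: str) -> List[List[str]]:
    """Format the rebuilt volume (ext4 assumed) and reuse label and mount point."""
    return [
        ["mkfs.ext4", "-L", label, volume_path],
        ["mount", volume_path, mount_point],
    ]
```

Reusing the original label and mount point means services that address the disk by either one need no reconfiguration.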
  • The target disk can continue to be used after bad track isolation and volume merging are complete, so the service on the target disk, or the service on the server where the target disk is located, can be restored.
  • The present application also provides a device for processing disk bad tracks, the device comprising:
  • a disk detection unit, configured to acquire a target disk to be processed and detect bad track data in the target disk;
  • a bad track isolation unit, configured to merge the bad track areas characterized by the bad track data to obtain the available area of the target disk other than the bad track areas;
  • a space reconstruction unit, configured to determine, according to the detection result, whether the target disk is available, and if so, rebuild the storage space of the target disk based on the available area and set access parameters for the rebuilt storage space.
  • The bad track data includes at least one element value characterizing the position of a bad track in the target disk; correspondingly, the bad track isolation unit includes:
  • a sub-region dividing module, configured to divide the storage space of the target disk into a specified number of sub-regions in advance;
  • a target sub-region determining module, configured to determine, among the specified number of sub-regions, the target sub-region corresponding to each element value in the bad track data;
  • a region merging module, configured to merge adjacent target sub-regions whose interval satisfies the specified condition to obtain multiple merged areas;
  • an isolation area setting module, configured to determine an isolation area at the head and/or the tail of each merged area, and take the combination of the merged area and the isolation area as a merged bad track area.
  • The space reconstruction unit includes:
  • a partitioning module, configured to divide the target disk into multiple partitions according to the available area;
  • a volume merging module, configured to merge the multiple partitions into one volume and use the merged volume as the rebuilt storage space.
  • The device further includes:
  • a parameter recording unit, configured to record the volume label and the mount point of the target disk.
  • The space reconstruction unit further includes:
  • a parameter reset module, configured to format the rebuilt storage space, and set the recorded volume label and mount point of the target disk as the volume label and mount point of the formatted storage space, respectively.
  • The present application further provides a device for processing disk bad tracks, the device comprising a memory and a processor, wherein the memory is used to store a computer program, and when the computer program is executed by the processor, the above method for processing disk bad tracks is performed.
  • The memory may include a physical device for storing information, typically by digitizing the information and then storing it in a medium that uses electrical, magnetic, or optical means.
  • The memory in this embodiment may include: a device that stores information electrically, such as a RAM or a ROM; a device that stores information magnetically, such as a hard disk, a floppy disk, a magnetic tape, a magnetic core memory, a magnetic bubble memory, or a USB flash drive; and a device that stores information optically, such as a CD or a DVD.
  • There are also other types of memory, such as quantum memories and graphene memories.
  • The processor can be implemented in any suitable manner.
  • For example, the processor can take the form of a microprocessor or processor together with a computer readable medium storing computer readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, or an embedded microcontroller.
  • The present application also provides a computer storage medium for storing a computer program which, when executed by a processor, performs the above method for processing disk bad tracks.
  • Computer terminal 10 may include one or more (only one is shown) processors 102 (processor 102 may include, but is not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA), a memory 104 for storing data, and a transmission device 106.
  • FIG. 5 is merely illustrative and does not limit the structure of the above electronic device.
  • For example, computer terminal 10 may include more or fewer components than shown in FIG. 5, or have a different configuration from that shown in FIG. 5.
  • The memory 104 can be used to store software programs and modules of application software, and the processor 102 executes various functional applications and data processing by running the software programs and modules stored in the memory 104.
  • Memory 104 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory.
  • In some examples, memory 104 may further include memory located remotely from processor 102, which may be connected to computer terminal 10 via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • The computer program in the above embodiments may be stored in the memory 104, and the memory 104 may be coupled to the processor 102, so that the processor 102 can read and execute the computer program in the memory 104, thereby implementing the above technical solutions.
  • Transmission device 106 is used to receive or transmit data via a network.
  • Specific examples of the network described above may include a wireless network provided by the communication provider of computer terminal 10.
  • In one example, the transmission device 106 includes a Network Interface Controller (NIC) that can be connected to other network devices through a base station to communicate with the Internet.
  • In another example, the transmission device 106 can be a Radio Frequency (RF) module for communicating with the Internet wirelessly.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Debugging And Monitoring (AREA)

Abstract

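The present application discloses a method and an apparatus for processing disk bad sectors, and a computer storage medium. The method includes: obtaining a target disk to be processed, and detecting bad sector data in the target disk; merging the bad sector regions represented by the bad sector data to obtain the available regions of the target disk other than the bad sector regions; and determining, according to the detection result, whether the target disk is usable and, if so, rebuilding the storage space of the target disk based on the available regions and setting access parameters for the rebuilt storage space. The technical solution provided by the present application can improve the efficiency of handling faulty disks while saving disk resources.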

Description

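Method and apparatus for processing disk bad sectors, and computer storage medium
Technical Field
The present application relates to the technical field of hardware device detection, and in particular to a method and an apparatus for processing disk bad sectors, and a computer storage medium.
Background
At present, in some large-scale distributed application environments, the number of disks is huge and the disks are widely scattered, which makes disk maintenance difficult. For example, in a content delivery network, when a disk fails, staff are usually dispatched to replace it only after the disk has gone offline or has seriously degraded the business or quality of service. This approach has two drawbacks: faulty disks are handled inefficiently, and, because bad sectors are a common disk fault, replacing a disk that has only a few bad sectors with a new disk wastes considerable resources.
Summary
The purpose of the present application is to provide a method and an apparatus for processing disk bad sectors, and a computer storage medium, which can improve the efficiency of handling faulty disks while saving disk resources.
To achieve the above objective, one aspect of the present application provides a method for processing disk bad sectors, the method including: obtaining a target disk to be processed, and detecting bad sector data in the target disk; merging the bad sector regions represented by the bad sector data to obtain the available regions of the target disk other than the bad sector regions; and determining, according to the detection result, whether the target disk is usable and, if so, rebuilding the storage space of the target disk based on the available regions and setting access parameters for the rebuilt storage space.
To achieve the above objective, another aspect of the present application provides an apparatus for processing disk bad sectors, the apparatus including: a disk detection unit, configured to obtain a target disk to be processed and detect bad sector data in the target disk; a bad sector isolation unit, configured to merge the bad sector regions represented by the bad sector data to obtain the available regions of the target disk other than the bad sector regions; and a space reconstruction unit, configured to determine, according to the detection result, whether the target disk is usable and, if so, rebuild the storage space of the target disk based on the available regions and set access parameters for the rebuilt storage space.
To achieve the above objective, another aspect of the present application provides an apparatus for processing disk bad sectors, the apparatus including a memory and a processor, the memory being configured to store a computer program which, when executed by the processor, performs the above method.
To achieve the above objective, another aspect of the present application provides a computer storage medium configured to store a computer program which, when executed by a processor, performs the above method.
As can be seen from the above, the technical solution provided by the present application can automatically identify whether the target disk needs a bad sector test. Specifically, the health status information of the target disk can be obtained periodically and analyzed to determine whether the target disk needs a bad sector test. When a bad sector test is needed, the bad sector regions represented by the bad sector data can be merged, and the merged bad sector regions can be isolated in the target disk, yielding the available regions from which the bad sector regions have been excluded. Because of the bad sector regions, the available regions may be scattered across the target disk. In that case, so that the available regions can be used normally, the storage space of the target disk can be rebuilt based on the available regions, and access parameters can be set for the rebuilt storage space. Specifically, the target disk can be divided into multiple partitions according to the available regions, and the partitions can be combined into one volume. After the combined volume is formatted, the formatted volume can take over the original volume label and mount point of the target disk, which completes the handling of the faulty disk. Because the bad sector regions are isolated, they do not affect the use of the normal regions of the processed disk. Thus, when a small number of bad sectors appear, the technical solution provided by the present application does not require replacing the whole disk; instead, bad sector isolation makes full use of the available regions, so the efficiency of handling faulty disks can be improved while disk resources are saved.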
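Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application more clearly, the drawings needed for the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a diagram of the steps of the method for processing disk bad sectors in an embodiment of the present application;
FIG. 2 is a flowchart of the method for processing disk bad sectors in an embodiment of the present application;
FIG. 3 is a schematic diagram of bad sector isolation in an embodiment of the present application;
FIG. 4 is a schematic structural diagram of the apparatus for processing disk bad sectors in an embodiment of the present application;
FIG. 5 is a schematic structural diagram of the computer terminal in an embodiment of the present application.
Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, the embodiments of the present application are described in further detail below with reference to the drawings.
Embodiment 1
The present application provides a method for processing disk bad sectors. Referring to FIG. 1 and FIG. 2, the method may include the following steps.
S1: Obtain a target disk to be processed, and detect bad sector data in the target disk.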
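In this embodiment, a disk status monitoring script can be executed periodically; when executed, the script can obtain the health status information of the target disk. In practice, the health status information of the target disk may include at least one of: self-test information of the target disk, log information of the operating system in which the target disk resides, and input/output load information of the target disk. The self-test information may be the S.M.A.R.T (Self-Monitoring Analysis and Reporting Technology) information of the target disk, which may be kept in the system reserved area (service area) of the hard disk. The S.M.A.R.T standard may use binary codes as its basic commands and specify that they be written into standard registers, forming a specific S.M.A.R.T information table for normal detection and operation. S.M.A.R.T commands can be divided into main commands (Command) and subcommands (Subcommands). A main command mainly provides information on whether the device supports S.M.A.R.T or ignores the feature of a particular subcommand, while a subcommand provides the detection information of a device that supports S.M.A.R.T. The S.M.A.R.T information can show whether the target disk has failed (FAIL) and whether errors occurred while the target disk was running, and the number of errors can be used to determine whether the target disk has produced a large number of errors (more than 200). The log information may refer to the message information of the operating system, which can indicate whether the target disk has read/write errors and whether there are file system errors. The input/output load information can indicate whether the target disk is running under high load.
In this embodiment, the running states that the target disk may be in can be set in advance. Then, based on the health status information of the target disk, the determination result for each running state can be obtained, yielding a combination of determination results for the target disk. For example, Table 1 lists several running states and gives possible determination results for each; each row can serve as one combination of determination results.
Table 1 Hard disk fault determination and analysis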
Figure PCTCN2018091579-appb-000001
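In this embodiment, once the actual combination of determination results of the target disk is obtained, the preset processing policy corresponding to that combination can be invoked. The preset processing policy can indicate whether the target disk needs a bad sector test. As shown in Table 1, the preset processing policy may be the information in the "processing action" column; different combinations of determination results can correspond to different processing actions.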
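The lookup from a combination of determination results to a preset processing policy can be sketched as a small table-driven function. This is only an illustration: Table 1 is an image that is not reproduced in the text, so the field names, the rows of `POLICIES`, and the action strings below are hypothetical placeholders, not the contents of Table 1.

```python
from typing import Dict, NamedTuple


class Verdicts(NamedTuple):
    """One determination result per predefined running state (names are illustrative)."""
    smart_failed: bool    # S.M.A.R.T reports FAIL
    many_errors: bool     # more than 200 errors recorded
    os_log_errors: bool   # read/write or file system errors in the OS log
    high_io_load: bool    # disk is running under high I/O load


# Hypothetical policy table: the rows of Table 1 are not available here,
# so these combinations and actions are placeholders only.
POLICIES: Dict[Verdicts, str] = {
    Verdicts(True, False, False, False): "replace_disk",
    Verdicts(False, True, True, False): "run_bad_sector_test",
    Verdicts(False, False, False, True): "recheck_later",
}


def choose_action(verdicts: Verdicts, default: str = "no_action") -> str:
    """Look up the preset processing policy for one combination of results."""
    return POLICIES.get(verdicts, default)
```

Keeping the policy in a dictionary keyed by the whole combination mirrors the row-per-combination layout that the text describes for Table 1, and lets the table be changed without touching the lookup code.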
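S3: Merge the bad sector regions represented by the bad sector data to obtain the available regions of the target disk other than the bad sector regions.
In this embodiment, when the determination indicates that a bad sector test is needed on the target disk, the services on the target disk or the services on the server hosting the target disk can be suspended. For example, the target disk can be temporarily removed from the server, or the server hosting the target disk can be shut down directly. In this way, the bad sector test on the target disk does not interfere with normal service execution.
In this embodiment, during the bad sector test, the logical bad sectors in the target disk can first be repaired, and the bad sector data for the bad sectors that cannot be repaired can finally be saved to a designated file. The designated file may be a file set up in advance in the operating system; when the bad sector data is analyzed later, it can be read through the access path of the designated file.
In this embodiment, after the bad sector test is complete, the bad sector data in the designated file can be analyzed. In practice, the bad sector data may be a one-dimensional array of non-negative integers. The array may contain multiple elements, where each element represents one 4 KB bad sector and the value of the element indicates the position of that bad sector on the target disk. For example, an element value N (an integer starting from 0) may represent the (N+1)-th 4 KB region of the target disk.
In this embodiment, the regions containing bad sectors can be covered with regions that are as small as possible, and the covering regions can be isolated from the target disk, leaving the available regions of the target disk that can be used normally. Specifically, the storage space of the target disk can be divided in advance into a specified number of sub-regions. For example, the target disk can be divided equally into 100 sub-regions, denoted by the numbers 0 to 99, where 0 denotes the 0-1% portion of the storage space, 1 denotes the 1-2% portion, and so on. Then, the target sub-region corresponding to each element value in the bad sector data can be determined among the specified number of sub-regions. Specifically, the total capacity of the target disk is first determined in units of 4 KB (the number of 4 KB blocks contained in the actual capacity of the target disk). The element value can then be divided by this total capacity and the result (with the quotient expressed as a percentage) rounded to an integer, which determines the sub-region to which the element value corresponds. Each target sub-region determined in this way is a sub-region that contains bad sectors. Referring to FIG. 3, the shaded sub-regions are the target sub-regions that contain bad sectors.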
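A minimal sketch of the mapping from element values to target sub-regions follows. It assumes that the quotient of the element value and the total number of 4 KB blocks is scaled by the number of sub-regions and rounded down, which is one reading of the rounding step described above; the function and parameter names are hypothetical.

```python
BLOCK_SIZE = 4096  # each element of the bad sector data stands for one 4 KB block


def block_offset(n: int) -> int:
    """Byte offset of the 4 KB block that element value n refers to."""
    return n * BLOCK_SIZE


def target_subregions(bad_blocks, total_blocks, num_subregions=100):
    """Return the sorted indices of the sub-regions that contain a bad block."""
    targets = set()
    for n in bad_blocks:
        if not 0 <= n < total_blocks:
            raise ValueError(f"block index {n} is outside the disk")
        # Integer arithmetic avoids floating-point rounding at boundaries.
        targets.add(n * num_subregions // total_blocks)
    return sorted(targets)
```

For a disk of 1000 blocks, for instance, `target_subregions([0, 5, 10, 999], 1000)` places blocks 0 and 5 in sub-region 0, block 10 in sub-region 1, and block 999 in sub-region 99.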
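In this embodiment, after the target sub-region of each bad sector has been determined, two adjacent target sub-regions whose spacing satisfies a specified condition can be merged to keep the bad sector regions from being too scattered, yielding multiple merged regions. Specifically, when target sub-regions are merged, a fitted region can be used to cover the target sub-regions whose spacing satisfies the specified condition. The spacing satisfies the specified condition when the number of sub-regions between two adjacent target sub-regions is less than or equal to a specified threshold. For example, if the specified threshold is 2 (2% of the total capacity of the target disk), two adjacent target sub-regions can be merged as long as no more than 2 sub-regions lie between them. Referring to FIG. 3, the first and second target sub-regions are separated by 1 sub-region, so they can be merged. When they are merged, the sub-region between them must be merged as well, so the resulting merged region contains three sub-regions. Similarly, the second and third target sub-regions are separated by 4 sub-regions, which does not satisfy the merge condition, so they are not merged. The third, fourth, and fifth target sub-regions all satisfy the merge condition, so these three target sub-regions can be merged together with the sub-regions between them, and the resulting merged region contains four sub-regions.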
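The merge rule can be sketched as a single pass over the sorted target sub-regions. The function name and the `(start, end)` inclusive-range representation are assumptions made for illustration.

```python
def merge_targets(targets, max_gap=2):
    """Merge target sub-regions separated by at most max_gap sub-regions.

    Returns a list of (start, end) index ranges, both ends inclusive.
    """
    merged = []
    for t in sorted(set(targets)):
        # Number of sub-regions strictly between the previous range and t.
        if merged and t - merged[-1][1] - 1 <= max_gap:
            merged[-1][1] = t          # extend the current merged region
        else:
            merged.append([t, t])      # start a new merged region
    return [(start, end) for start, end in merged]
```

With a hypothetical layout that is consistent with the description of FIG. 3 (targets at sub-regions 0, 2, 7, 9, and 10), the first two targets merge into a region of three sub-regions and the last three merge into a region of four, while the gap of 4 between sub-regions 2 and 7 keeps the two regions apart.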
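In this embodiment, after the target sub-regions have been merged into merged regions, an isolation region can be determined at the head and/or tail of each merged region. For example, the isolation region may be 1 sub-region (1% of the total capacity of the target disk). The combination of the merged region and its isolation regions is taken as the merged bad sector region. In FIG. 3, for example, the first merged region lies at the head of the target disk, so only one sub-region at its tail is used as an isolation region, and the 4 sub-regions in the dashed box are finally taken as a bad sector region. Similarly, the second merged region lies in the middle of the target disk, so 1 sub-region at its head and 1 at its tail can both be used as isolation regions, and the 6 sub-regions in the dashed box are taken as a bad sector region.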
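Adding the isolation regions amounts to padding each merged region by one sub-region on each side and clamping at the disk boundaries. The sketch below assumes the merged regions are sorted `(start, end)` inclusive ranges; joining padded regions that touch or overlap is a defensive addition that the text does not describe.

```python
def add_isolation(merged, num_subregions, pad=1):
    """Pad each merged region with isolation sub-regions at head and tail."""
    bad = []
    for start, end in merged:
        lo = max(0, start - pad)                   # no head padding at disk start
        hi = min(num_subregions - 1, end + pad)    # no tail padding at disk end
        if bad and lo <= bad[-1][1] + 1:
            # Defensive addition: join padded regions that touch or overlap.
            bad[-1] = (bad[-1][0], max(bad[-1][1], hi))
        else:
            bad.append((lo, hi))
    return bad
```

Using the same hypothetical 13-sub-region layout as the FIG. 3 description, `add_isolation([(0, 2), (7, 10)], 13)` yields one bad sector region of 4 sub-regions at the head of the disk and one of 6 sub-regions in the middle.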
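In this embodiment, after the bad sectors have been isolated and the bad sector regions obtained, the regions of the target disk outside the bad sector regions can serve as the available regions.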
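Computing the available regions is the complement of the bad sector regions over the sub-region index range. A minimal sketch, with hypothetical names and inclusive `(start, end)` ranges:

```python
def available_regions(bad, num_subregions):
    """Return the runs of sub-regions not covered by any bad sector region."""
    free = []
    cursor = 0  # first sub-region not yet classified
    for start, end in sorted(bad):
        if start > cursor:
            free.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    if cursor < num_subregions:
        free.append((cursor, num_subregions - 1))
    return free
```

For bad sector regions `(0, 3)` and `(6, 11)` on a hypothetical 13-sub-region disk, this leaves two available runs, one of 2 sub-regions and one of 1 sub-region, which matches the two partitions described for FIG. 3.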
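In one embodiment, if the bad sector test cannot be completed within a specified duration, alarm information can be sent to notify the administrator to replace the disk directly. For example, if the bad sector test has not finished after more than 48 hours, the disk has too many bad sectors or serious read/write errors, so the bad sector isolation process can be abandoned and the disk replaced directly.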
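One way to enforce such a deadline is to run the test as a child process with a timeout. This is only a sketch: the text does not say how the test is launched, so the command passed in `cmd` and the returned status strings are assumptions.

```python
import subprocess


def run_with_deadline(cmd, timeout_seconds=48 * 3600):
    """Run a test command; report 'replace_disk' if it misses the deadline."""
    try:
        subprocess.run(cmd, check=False, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # The child process has been killed; give up on isolation and
        # let the caller raise the alarm.
        return "replace_disk"
    return "completed"
```

The caller would send the alarm when the function returns `"replace_disk"` and continue with bad sector isolation otherwise.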
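S5: Determine, according to the detection result, whether the target disk is usable and, if so, rebuild the storage space of the target disk based on the available regions and set access parameters for the rebuilt storage space.
In this embodiment, after the bad sectors have been detected and isolated, whether the target disk can continue to be used can be determined according to the detection result. Specifically, three factors can be considered together: the number of bad sectors, the capacity of the available regions, and the number of partitions formed from the available regions. If the number of detected bad sectors is greater than a specified bad sector count threshold, or the capacity of the available regions is less than a specified capacity threshold, or the number of partitions obtained by partitioning according to the available regions is greater than a specified partition count threshold, the target disk can be determined to be unusable. For example, the target disk is determined to be unusable if there are more than 200 bad sectors, or the capacity of the available regions is below 90% of the total capacity of the target disk, or the available regions form more than 4 partitions. Conversely, if the number of detected bad sectors is less than or equal to the specified bad sector count threshold, and the capacity of the available regions is greater than or equal to the specified capacity threshold, and the number of partitions obtained by partitioning according to the available regions is less than or equal to the specified partition count threshold, the target disk can be determined to be usable. For example, the target disk can be determined to be usable if there are no more than 200 bad sectors, the capacity of the available regions is not below 90% of the total capacity of the target disk, and the available regions form no more than 4 partitions.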
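The three-way usability check can be written as a single predicate. The default thresholds below are the example values from the text (200 bad sectors, 90% capacity, 4 partitions); the function and parameter names are hypothetical.

```python
def disk_usable(bad_count, available_fraction, partition_count,
                max_bad=200, min_fraction=0.90, max_partitions=4):
    """Return True only if all three conditions for continued use hold."""
    return (bad_count <= max_bad
            and available_fraction >= min_fraction
            and partition_count <= max_partitions)
```

Because the unusable case is the logical negation of the usable case, one predicate covers both branches described above.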
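In this embodiment, for an unusable target disk, alarm information can be issued to notify the administrator to replace the disk. For a usable target disk, the target disk can be divided into multiple partitions according to the available regions. Specifically, multiple contiguous available regions can be assigned to one partition. In FIG. 3, for example, the blank sub-regions outside the dashed boxes are all available regions, so the remaining available regions can be divided into two partitions: the first partition contains 2 sub-regions, and the second contains only 1 sub-region. After the available regions have been divided into multiple partitions, the LVM (Logical Volume Manager) function can be used to combine the partitions into one volume, and the combined volume serves as the rebuilt storage space.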
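As a rough sketch of how the partitioning and volume merging might be driven on Linux, the function below only builds the command lines and does not execute them. The choice of `parted` with a GPT label, the `/dev/sdX1`-style partition names, and the volume group and logical volume names are all assumptions; the text specifies only that LVM is used to combine the partitions.

```python
def lvm_plan(disk, free_regions, num_subregions=100,
             vg="vg_repaired", lv="lv_data"):
    """Return the commands (as argument lists) for rebuilding the storage."""
    cmds = [["parted", "-s", disk, "mklabel", "gpt"]]
    parts = []
    for i, (start, end) in enumerate(free_regions, start=1):
        # Sub-region k covers [k, k + 1) * (100 / num_subregions) percent.
        lo = f"{start * 100 / num_subregions:g}%"
        hi = f"{(end + 1) * 100 / num_subregions:g}%"
        cmds.append(["parted", "-s", disk, "mkpart", "primary", lo, hi])
        parts.append(f"{disk}{i}")  # e.g. /dev/sdb1; NVMe-style names differ
    cmds += [["pvcreate", p] for p in parts]
    cmds.append(["vgcreate", vg] + parts)
    cmds.append(["lvcreate", "-l", "100%FREE", "-n", lv, vg])
    return cmds
```

Returning the plan as data rather than running it makes the destructive steps easy to review or log before anything touches the disk.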
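In this embodiment, the volume label and mount point of the target disk can be recorded before the bad sector data in the target disk is detected. Then, once the rebuilt storage space is obtained, it can be formatted, and the recorded volume label and mount point of the target disk can be set as the volume label and mount point of the formatted storage space, which completes the setting of the access parameters.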
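Restoring the access parameters can be sketched as two commands built from the values recorded before the test. The ext4 file system is an assumption (the text does not name one), as are the class and function names; the commands are returned rather than executed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessParams:
    """Volume label and mount point recorded before the bad sector test."""
    label: str
    mount_point: str


def restore_commands(volume, saved, fs_type="ext4"):
    """Commands that format the rebuilt volume and reattach it as before."""
    return [
        [f"mkfs.{fs_type}", "-L", saved.label, volume],  # reuse the old label
        ["mount", volume, saved.mount_point],            # reuse the old mount point
    ]
```

Because the label and mount point are unchanged, services that address the disk through them should not need reconfiguration once they are resumed.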
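In this embodiment, the target disk can continue to be used after bad sector isolation and volume merging are complete, so the services on the target disk or on the server hosting the target disk can be resumed.
Embodiment 2
The present application further provides an apparatus for processing disk bad sectors, the apparatus including:
a disk detection unit, configured to obtain a target disk to be processed and detect bad sector data in the target disk;
a bad sector isolation unit, configured to merge the bad sector regions represented by the bad sector data to obtain the available regions of the target disk other than the bad sector regions;
a space reconstruction unit, configured to determine, according to the detection result, whether the target disk is usable and, if so, rebuild the storage space of the target disk based on the available regions and set access parameters for the rebuilt storage space.
In one embodiment, the bad sector data includes at least one element value indicating the position of a bad sector in the target disk; correspondingly, the bad sector isolation unit includes:
a sub-region division module, configured to divide the storage space of the target disk in advance into a specified number of sub-regions;
a target sub-region determination module, configured to determine, among the specified number of sub-regions, the target sub-region corresponding to each element value in the bad sector data;
a region merging module, configured to merge two adjacent target sub-regions whose spacing satisfies a specified condition, to obtain multiple merged regions;
an isolation region setting module, configured to determine an isolation region at the head and/or tail of each merged region, and take the combination of the merged region and the isolation region as the merged bad sector region.
In one embodiment, the space reconstruction unit includes:
a partition division module, configured to divide the target disk into multiple partitions according to the available regions;
a volume merging module, configured to combine the multiple partitions into one volume and take the combined volume as the rebuilt storage space.
In one embodiment, the apparatus further includes:
a parameter recording unit, configured to record the volume label and mount point of the target disk;
correspondingly, the space reconstruction unit further includes:
a parameter resetting module, configured to format the rebuilt storage space and set the recorded volume label and mount point of the target disk as the volume label and mount point of the formatted storage space.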
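Embodiment 3
Referring to FIG. 4, the present application further provides an apparatus for processing disk bad sectors, the apparatus including a memory and a processor, the memory being configured to store a computer program which, when executed by the processor, can perform the above method for processing disk bad sectors.
In this embodiment, the memory may include a physical device for storing information, typically digitizing the information and then storing it in media that use electrical, magnetic, optical, or similar methods. The memory described in this embodiment may include: devices that store information electrically, such as RAM and ROM; devices that store information magnetically, such as hard disks, floppy disks, magnetic tapes, magnetic core memories, magnetic bubble memories, and USB flash drives; and devices that store information optically, such as CDs or DVDs. There are of course other kinds of computer storage media, such as quantum memories and graphene memories.
In this embodiment, the processor may be implemented in any suitable manner. For example, the processor may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, an embedded microcontroller, and so on.
The present application further provides a computer storage medium configured to store a computer program which, when executed by a processor, can perform the above method for processing disk bad sectors.
The specific functions implemented by the apparatus for processing disk bad sectors and the computer storage medium provided in the embodiments of this specification can be explained by reference to the foregoing method embodiments in this specification and can achieve the technical effects of those method embodiments, so they are not described again here.
Referring to FIG. 5, in the present application the technical solutions of the above embodiments can be applied to the computer terminal 10 shown in FIG. 5. The computer terminal 10 may include one or more processors 102 (only one is shown in the figure; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 104 for storing data, and a transmission module 106 for communication functions. A person of ordinary skill in the art will understand that the structure shown in FIG. 5 is only illustrative and does not limit the structure of the above electronic apparatus. For example, the computer terminal 10 may include more or fewer components than shown in FIG. 5, or have a configuration different from that shown in FIG. 5.
The memory 104 can be used to store software programs and modules of application software. The processor 102 executes various functional applications and data processing by running the software programs and modules stored in the memory 104. The memory 104 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, and such remote memory may be connected to the computer terminal 10 through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The computer program in the above embodiments can be stored in the memory 104, and the memory 104 can be coupled to the processor 102, so that the processor 102 can read and execute the computer program in the memory 104, thereby implementing the above technical solution of the present application.
The transmission device 106 is used to receive or send data via a network. Specific examples of the network may include a wireless network provided by the communication provider of the computer terminal 10. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, NIC), which can be connected to other network devices through a base station so as to communicate with the Internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is used to communicate with the Internet wirelessly.
As can be seen from the above, the technical solution provided by the present application can automatically identify whether the target disk needs a bad sector test. Specifically, the health status information of the target disk can be obtained periodically and analyzed to determine whether the target disk needs a bad sector test. When a bad sector test is needed, the bad sector regions represented by the bad sector data can be merged, and the merged bad sector regions can be isolated in the target disk, yielding the available regions from which the bad sector regions have been excluded. Because of the bad sector regions, the available regions may be scattered across the target disk. In that case, so that the available regions can be used normally, the storage space of the target disk can be rebuilt based on the available regions, and access parameters can be set for the rebuilt storage space. Specifically, the target disk can be divided into multiple partitions according to the available regions, and the partitions can be combined into one volume. After the combined volume is formatted, the formatted volume can take over the original volume label and mount point of the target disk, which completes the handling of the faulty disk. Because the bad sector regions are isolated, they do not affect the use of the normal regions of the processed disk. Thus, when a small number of bad sectors appear, the technical solution provided by the present application does not require replacing the whole disk; instead, bad sector isolation makes full use of the available regions, so the efficiency of handling faulty disks can be improved while disk resources are saved.
From the description of the above embodiments, a person skilled in the art can clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware. Based on this understanding, the above technical solution, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments or in certain parts of the embodiments.
The above are only preferred embodiments of the present application and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within its scope of protection.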

Claims (16)

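  1. A method for processing disk bad sectors, characterized in that the method comprises:
    obtaining a target disk to be processed, and detecting bad sector data in the target disk;
    merging the bad sector regions represented by the bad sector data to obtain the available regions of the target disk other than the bad sector regions;
    determining, according to the detection result, whether the target disk is usable and, if so, rebuilding the storage space of the target disk based on the available regions and setting access parameters for the rebuilt storage space.
  2. The method according to claim 1, characterized in that, after the target disk to be processed is obtained, the method further comprises:
    obtaining health status information of the target disk, and determining, based on the health status information, whether the target disk needs a bad sector test; correspondingly, detecting the bad sector data in the target disk when the test is needed.
  3. The method according to claim 2, characterized in that the health status information of the target disk comprises at least one of: self-test information of the target disk, log information of the operating system in which the target disk resides, and input/output load information of the target disk;
    correspondingly, determining, based on the health status information, whether the target disk needs a bad sector test comprises:
    presetting multiple running states corresponding to the target disk, and determining, based on the health status information of the target disk, the determination result for each running state, to obtain a combination of determination results for the target disk;
    invoking the preset processing policy corresponding to the combination of determination results, the preset processing policy indicating whether the target disk needs a bad sector test.
  4. The method according to claim 1, characterized in that, when the bad sector data in the target disk is detected, the method further comprises:
    repairing the logical bad sectors in the target disk, and saving the bad sector data corresponding to the bad sectors that cannot be repaired to a designated file.
  5. The method according to claim 1 or 4, characterized in that the bad sector data comprises at least one element value indicating the position of a bad sector in the target disk; correspondingly, merging the bad sector regions represented by the bad sector data comprises:
    dividing the storage space of the target disk in advance into a specified number of sub-regions;
    determining, among the specified number of sub-regions, the target sub-region corresponding to each element value in the bad sector data;
    merging two adjacent target sub-regions whose spacing satisfies a specified condition, to obtain multiple merged regions;
    determining an isolation region at the head and/or tail of each merged region, and taking the combination of the merged region and the isolation region as the merged bad sector region.
  6. The method according to claim 5, characterized in that the spacing satisfying the specified condition comprises:
    the number of sub-regions between two adjacent target sub-regions being less than or equal to a specified threshold.
  7. The method according to claim 1, characterized in that rebuilding the storage space of the target disk based on the available regions comprises:
    dividing the target disk into multiple partitions according to the available regions;
    combining the multiple partitions into one volume, and taking the combined volume as the rebuilt storage space.
  8. The method according to claim 1 or 7, characterized in that, before the bad sector data in the target disk is detected, the method further comprises:
    recording the volume label and mount point of the target disk;
    correspondingly, setting access parameters for the rebuilt storage space comprises:
    formatting the rebuilt storage space, and setting the recorded volume label and mount point of the target disk as the volume label and mount point of the formatted storage space.
  9. The method according to claim 1, characterized in that determining, according to the detection result, whether the target disk is usable comprises:
    determining that the target disk is unusable if the number of detected bad sectors is greater than a specified bad sector count threshold, or the capacity of the available regions is less than a specified capacity threshold, or the number of partitions obtained by partitioning according to the available regions is greater than a specified partition count threshold;
    determining that the target disk is usable if the number of detected bad sectors is less than or equal to the specified bad sector count threshold, and the capacity of the available regions is greater than or equal to the specified capacity threshold, and the number of partitions obtained by partitioning according to the available regions is less than or equal to the specified partition count threshold.
  10. The method according to claim 1, characterized in that, before the bad sector data in the target disk is detected, the method further comprises:
    suspending the services on the target disk or the services on the server hosting the target disk;
    correspondingly, after the access parameters are set for the rebuilt storage space, the method further comprises:
    resuming the services on the target disk or the services on the server hosting the target disk.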
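  11. An apparatus for processing disk bad sectors, characterized in that the apparatus comprises:
    a disk detection unit, configured to obtain a target disk to be processed and detect bad sector data in the target disk;
    a bad sector isolation unit, configured to merge the bad sector regions represented by the bad sector data to obtain the available regions of the target disk other than the bad sector regions;
    a space reconstruction unit, configured to determine, according to the detection result, whether the target disk is usable and, if so, rebuild the storage space of the target disk based on the available regions and set access parameters for the rebuilt storage space.
  12. The apparatus according to claim 11, characterized in that the bad sector data comprises at least one element value indicating the position of a bad sector in the target disk; correspondingly, the bad sector isolation unit comprises:
    a sub-region division module, configured to divide the storage space of the target disk in advance into a specified number of sub-regions;
    a target sub-region determination module, configured to determine, among the specified number of sub-regions, the target sub-region corresponding to each element value in the bad sector data;
    a region merging module, configured to merge two adjacent target sub-regions whose spacing satisfies a specified condition, to obtain multiple merged regions;
    an isolation region setting module, configured to determine an isolation region at the head and/or tail of each merged region, and take the combination of the merged region and the isolation region as the merged bad sector region.
  13. The apparatus according to claim 11, characterized in that the space reconstruction unit comprises:
    a partition division module, configured to divide the target disk into multiple partitions according to the available regions;
    a volume merging module, configured to combine the multiple partitions into one volume and take the combined volume as the rebuilt storage space.
  14. The apparatus according to claim 11 or 13, characterized in that the apparatus further comprises:
    a parameter recording unit, configured to record the volume label and mount point of the target disk;
    correspondingly, the space reconstruction unit further comprises:
    a parameter resetting module, configured to format the rebuilt storage space and set the recorded volume label and mount point of the target disk as the volume label and mount point of the formatted storage space.
  15. An apparatus for processing disk bad sectors, characterized in that the apparatus comprises a memory and a processor, the memory being configured to store a computer program which, when executed by the processor, performs the method according to any one of claims 1 to 10.
  16. A computer storage medium, characterized in that the computer storage medium is configured to store a computer program which, when executed by a processor, performs the method according to any one of claims 1 to 10.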
PCT/CN2018/091579 2018-04-10 2018-06-15 Method and apparatus for processing disk bad sectors, and computer storage medium WO2019196199A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP18889973.6A EP3745270A4 (en) 2018-04-10 2018-06-15 METHOD AND DEVICE FOR PROCESSING DEFECTIVE TRACES FROM DATA CARRIERS AND COMPUTER STORAGE MEDIUM
US16/506,349 US11073998B2 (en) 2018-04-10 2019-07-09 Method, apparatus for processing disk bad sector,and computer storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810317786.8A CN108536548B (zh) 2018-04-10 2018-04-10 Method and apparatus for processing disk bad sectors, and computer storage medium
CN201810317786.8 2018-04-10

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/506,349 Continuation US11073998B2 (en) 2018-04-10 2019-07-09 Method, apparatus for processing disk bad sector,and computer storage medium

Publications (1)

Publication Number Publication Date
WO2019196199A1 true WO2019196199A1 (zh) 2019-10-17

Family

ID=63479841

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/091579 WO2019196199A1 (zh) 2018-04-10 2018-06-15 Method and apparatus for processing disk bad sectors, and computer storage medium

Country Status (4)

Country Link
US (1) US11073998B2 (zh)
EP (1) EP3745270A4 (zh)
CN (1) CN108536548B (zh)
WO (1) WO2019196199A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116380149A (zh) * 2023-04-07 2023-07-04 深圳市兴源智能仪表股份有限公司 Method and system for testing rotation of an instrument code disc

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
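CN110209550A (zh) * 2019-05-24 2019-09-06 新华三技术有限公司成都分公司 Fault handling method and apparatus for a storage medium, electronic device, and storage medium
CN112558859A (zh) * 2019-09-26 2021-03-26 杭州海康威视数字技术股份有限公司 Hard disk, storage system, and hard disk capacity marking method
CN110931072B (zh) * 2019-11-28 2022-03-22 深信服科技股份有限公司 Bad sector scanning method, apparatus, device, and storage medium
CN111007992B (zh) * 2020-03-04 2020-08-04 广东电网有限责任公司佛山供电局 Disk data storage representation method, system, and storage medium
CN113778657B (zh) * 2020-09-24 2024-04-16 北京沃东天骏信息技术有限公司 Data processing method and apparatus
CN112732517B (zh) * 2020-12-29 2023-12-22 北京浪潮数据技术有限公司 Disk fault alarm method, apparatus, device, and readable storage medium
CN113032201B (zh) * 2021-05-24 2021-09-21 广东睿江云计算股份有限公司 Hard disk bad sector detection method
CN113672415A (zh) * 2021-07-09 2021-11-19 济南浪潮数据技术有限公司 Disk fault handling method, apparatus, device, and storage medium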

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103593275A (zh) * 2013-10-31 2014-02-19 华为技术有限公司 Disk information display method and apparatus
CN104484251A (zh) * 2014-12-11 2015-04-01 华为技术有限公司 Method and apparatus for handling hard disk faults
CN105279057A (zh) * 2015-11-10 2016-01-27 浪潮(北京)电子信息产业有限公司 Disk bad sector detection method and system

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6058462A (en) * 1998-01-23 2000-05-02 International Business Machines Corporation Method and apparatus for enabling transfer of compressed data record tracks with CRC checking
JP3958776B2 (ja) * 2003-09-05 2007-08-15 富士通株式会社 Magneto-optical disk device and method for writing data to a magneto-optical disk
US7890796B2 (en) * 2006-10-04 2011-02-15 Emc Corporation Automatic media error correction in a file server
US7653840B1 (en) * 2007-04-27 2010-01-26 Net App, Inc. Evaluating and repairing errors during servicing of storage devices
CN101527142B (zh) * 2009-04-17 2011-04-13 杭州华三通信技术有限公司 Method and device for reading and writing data in a redundant array of disks
US8385014B2 (en) * 2010-10-11 2013-02-26 Lsi Corporation Systems and methods for identifying potential media failure
CN107015877A (zh) * 2017-03-14 2017-08-04 唐山钢铁集团有限责任公司 Method for reusing RAID disks with physical bad sectors
CN107807862A (zh) * 2017-09-29 2018-03-16 曙光信息产业(北京)有限公司 Method, apparatus, and server for detecting hard disk fault points

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3745270A4 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
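CN116380149A (zh) * 2023-04-07 2023-07-04 深圳市兴源智能仪表股份有限公司 Method and system for testing rotation of an instrument code disc
CN116380149B (zh) * 2023-04-07 2024-02-02 深圳市兴源智能仪表股份有限公司 Method and system for testing rotation of an instrument code disc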

Also Published As

Publication number Publication date
US20190332305A1 (en) 2019-10-31
CN108536548A (zh) 2018-09-14
CN108536548B (zh) 2020-12-29
US11073998B2 (en) 2021-07-27
EP3745270A4 (en) 2021-05-19
EP3745270A1 (en) 2020-12-02

Similar Documents

Publication Publication Date Title
WO2019196199A1 (zh) Method and apparatus for processing disk bad sectors, and computer storage medium
US10198196B2 (en) Monitoring health condition of a hard disk
US9170888B2 (en) Methods and apparatus for virtual machine recovery
US9105308B2 (en) System and method for disk sector failure prediction
US9529674B2 (en) Storage device management of unrecoverable logical block addresses for RAID data regeneration
CN105892934B (zh) Method and apparatus for storage device management
US9317349B2 (en) SAN vulnerability assessment tool
US20170147425A1 (en) System and method for monitoring and detecting faulty storage devices
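CN109947596A (zh) Method and apparatus for handling system downtime caused by PCIe device failure, and related components
CN111104293A (zh) Method, device, and computer program product for supporting disk failure prediction
CN109726036B (zh) Data reconstruction method and apparatus in a storage system
CN111124264B (zh) Method, device, and computer program product for rebuilding data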
US9104604B2 (en) Preventing unrecoverable errors during a disk regeneration in a disk array
US11126501B2 (en) Method, device and program product for avoiding a fault event of a disk array
US20140089477A1 (en) System and method for monitoring storage machines
CN111104051B (zh) Method, device, and computer program product for managing a storage system
US10606490B2 (en) Storage control device and storage control method for detecting storage device in potential fault state
US20160110246A1 (en) Disk data management
US9940211B2 (en) Resource system management
CN112466382A (zh) Inspection method and apparatus for a RAID array
US8977892B2 (en) Disk control apparatus, method of detecting failure of disk apparatus, and recording medium for disk diagnosis program
US11115056B2 (en) Location selection based on erasure code techniques
US10656987B1 (en) Analysis system and method
CN112084097A (zh) Disk alarm method and apparatus
US20120210061A1 (en) Computer and method for testing redundant array of independent disks of the computer

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE