CN105302677A - Information-processing device and method - Google Patents

Information-processing device and method

Info

Publication number
CN105302677A
Authority
CN
China
Prior art keywords
storage area
block group
group
block
judged
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410487610.9A
Other languages
Chinese (zh)
Inventor
栗林哲生
梅田通彦
菅野浩典
菅原信广
户田诚二
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Toshiba Corp
Original Assignee
Toshiba Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Toshiba Corp
Publication of CN105302677A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0751 Error or fault detection not based on redundancy
    • G06F11/0754 Error or fault detection not based on redundancy by exceeding limits
    • G06F11/0757 Error or fault detection not based on redundancy by exceeding limits by exceeding a time limit, i.e. time-out, e.g. watchdogs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/0706 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
    • G06F11/0727 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a storage system, e.g. in a DASD or network based storage system
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/10 Digital recording or reproducing
    • G11B20/18 Error detection or correction; Testing, e.g. of drop-outs
    • G11B20/1816 Testing
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/10 Digital recording or reproducing
    • G11B20/18 Error detection or correction; Testing, e.g. of drop-outs
    • G11B20/1883 Methods for assignment of alternate areas for defective areas
    • G11B20/1889 Methods for assignment of alternate areas for defective areas with discs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1076 Parity data used in redundant arrays of independent storages, e.g. in RAID systems

Abstract

According to one embodiment, there is provided an information-processing device which includes a storage medium and a controller configured to acquire a delay time in access to storage area included in the storage medium for every storage area with reference to a time at which an access is performed without performing retrying on the storage area based on first information relating to an access history with respect to the storage area, and to determine the storage area of which the delay time exceeds a predetermined allowable delay time as a defective area.

Description

Information processing device and information processing method
This application is based on and claims priority from U.S. Provisional Patent Application No. 62/030,275 (filed July 29, 2014), the entire contents of which are incorporated herein by reference.
Technical field
The present invention relates to an information processing device and an information processing method.
Background art
In a RAID (Redundant Arrays of Inexpensive Disks) system that uses storage devices such as disk devices and/or SSDs (Solid State Drives), a RAID recovery process (the so-called rebuild process) is carried out when one of the storage devices forming the RAID fails. RAID systems generally provide such a rebuild process. The rebuild process uses the data stored in the other storage devices of the RAID, that is, those other than the failed one, to restore the data that was stored in the failed storage device and writes it to a storage device prepared in advance (a replacement device).
The time required for the rebuild process (the RAID recovery time) grows as storage devices grow in capacity. This increases the risk that, during the rebuild, the performance of the RAID system degrades and/or another storage device fails. A rebuild assist function has therefore been proposed, which shortens the RAID recovery time by carrying out the rebuild process using whatever data in the failed storage device is still usable. The rebuild assist function needs to predict (determine) the defective areas of the failed storage device that cannot be accessed.
However, when the rebuild assist function cannot predict the defective areas correctly, either retries delay the rebuild process, or defective areas are over-predicted and the access load on the other storage devices increases. In both cases the RAID recovery time becomes longer.
Summary of the invention
The invention provides an information processing device and an information processing method that can shorten the time required for a rebuild.
According to the present embodiment, an information processing device is provided that includes: a storage medium; and a controller that, based on first information about the access history of the storage areas of the storage medium, obtains for each storage area a delay time of access to that storage area, the delay time being measured relative to the time required to access the storage area without performing a retry, and determines that a storage area whose delay time exceeds a predetermined allowable delay time is a defective area.
Brief description of the drawings
Fig. 1 shows an example of the hardware configuration of a disk device to which the information processing device according to the first embodiment is applied.
Fig. 2 shows an example of the block group address information table stored in the disk device according to the first embodiment.
Fig. 3 shows an example of the block group defect information table stored in the disk device according to the first embodiment.
Fig. 4 is a flowchart showing an example of the flow of the process of accessing the disk of the disk device according to the first embodiment.
Fig. 5 is a flowchart showing an example of the flow of the update process for the block group defect information table carried out by the disk device according to the first embodiment.
Fig. 6 is a flowchart showing an example of the flow of the rebuild assist mode validation process carried out by the disk device according to the first embodiment.
Fig. 7 is a flowchart showing an example of the flow of the rebuild assist processing (read/write processing in the rebuild assist mode) carried out by the disk device according to the first embodiment.
Fig. 8 is a flowchart showing an example of the flow of the process of acquiring defect determination results carried out by the disk device according to the first embodiment.
Fig. 9 (a) to (c) illustrate an example of the process by which the disk device according to the second embodiment determines an upper-level cohort to be a defective area.
Embodiments
The information processing device and information processing method according to the embodiments are described in detail below with reference to the accompanying drawings. The invention is not limited to these embodiments.
(First embodiment)
First, the hardware configuration of a disk device to which the information processing device according to the first embodiment is applied is described using Fig. 1. Fig. 1 shows an example of that hardware configuration. The following description uses an example in which the information processing device of the present embodiment is applied to a disk device, but the invention is not limited to this; the information processing device of the present embodiment can also be applied to other storage devices such as SSDs.
As shown in Fig. 1, disk device 1 of the present embodiment includes: CPU (Central Processing Unit) 10, ROM (Read Only Memory) 11, RAM (Random Access Memory) 12, drive controller 13, host IF (interface) controller 14, data buffer controller 15, data buffer 16, read/write controller 17, disk 18, and head stack assembly 19.
Disk 18 (an example of the storage medium) consists of a magnetic recording medium or the like and has a plurality of block groups (examples of storage areas) from which data is read and to which data is written. In the present embodiment, a block group is half a revolution of a track of disk 18. This is not a limitation: any single area on the recording surface of disk 18 may serve as one block group. For example, each whole track of disk 18 may be treated as one block group.
Head stack assembly 19 is a mechanism that holds the heads and moves them to a specified position on disk 18 (the position at which data is read or written).
CPU 10 is the controller that controls the whole of disk device 1. Specifically, CPU 10 performs: control of access (reading or writing of data) to disk 18; a defect determination process that determines whether a block group is a defective area; rebuild assist processing that, when the data stored in the block groups is rebuilt, notifies host 2 of the block groups determined to be defective areas; and a rebuild assist mode validation process that, before the rebuild assist processing, detects abnormalities for each physical attribute of disk 18.
ROM 11 stores the various programs executed by CPU 10. RAM 12 is used as the working area of CPU 10. In the present embodiment, RAM 12 (an example of the storage unit) holds block group address information table 200 (see Fig. 2), which stores the LBAs (Logical Block Addresses) of the block groups of disk 18, and block group defect information table 400 (see Fig. 3), which stores the results of the defect determination process performed by CPU 10.
Drive controller 13 is controlled by CPU 10 and performs the writing of data received from host 2 to disk 18 and the reading of data from disk 18.
Host IF controller 14 controls the transmission and reception of data, commands, and other information between disk device 1 and host 2. Host 2 includes, for example, a RAID (Redundant Arrays of Inexpensive Disks) controller contained in a PC (Personal Computer), a server, or the like. The RAID controller exchanges data, commands, and other information with disk device 1 in accordance with an interface standard such as the SATA (Serial ATA) standard or the SAS (Serial Attached SCSI) standard.
Data buffer controller 15 is controlled by CPU 10 and writes to data buffer 16 the data received from host 2 that is to be written to disk 18 (write data) and/or the data read from disk 18 (read data). Data buffer controller 15 also reads write data from data buffer 16 and outputs it to read/write controller 17, and reads read data from data buffer 16 and outputs it to host 2 through host IF controller 14. In other words, data buffer 16 temporarily stores write data and/or read data.
Read/write controller 17 is controlled by CPU 10 and outputs to head stack assembly 19 the read/write instructions for accessing disk 18 (reading or writing data). Read/write controller 17 thereby controls access to disk 18.
Next, block group address information table 200, stored in RAM 12 of disk device 1 of the present embodiment, is described using Fig. 2. Fig. 2 shows an example of the block group address information table stored in the disk device according to the first embodiment.
As shown in Fig. 2, block group address information table 200 stores the following items in association with a block group number, which is an example of information that identifies a block group: a zone number, an example of information identifying the zone of disk 18 in which the block group is located; a cylinder number, an example of information identifying the cylinder of disk 18 to which the block group belongs; a head number, an example of information identifying the head that accesses the block group; the LBAs (Logical Block Addresses) of the sectors contained in the block group; and a block cohort number, an example of information identifying a block cohort (an example of the first storage area group).
Here, a block cohort consists of a plurality of block groups that have been classified together according to a classification condition. The classification condition is a condition for classifying block groups that is set based on their physical attributes (for example head, zone, and cylinder). In the present embodiment, the classification condition is that the head number is the same, the zone number is the same, and the cylinder numbers are consecutive.
Next, the method of classifying block groups in disk device 1 of the present embodiment is described using Fig. 2.
In the present embodiment, as shown in Fig. 2, CPU 10 classifies block groups that share the same head number and zone number and whose cylinder numbers are consecutive into one block cohort. For example, as shown in Fig. 2, the four block groups identified by block group numbers 0 to 3 share zone number 0 and head number 0, and their cylinder numbers 0 and 1 are consecutive. CPU 10 therefore classifies these four block groups into one block cohort (block cohort number 0). The combination of zone number, cylinder number, and head number alone would yield only two block groups here, but because each cylinder is divided in two, four block groups are obtained.
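The classification condition can be illustrated with a minimal Python sketch. It is not taken from the patent: the `BlockGroup` record, the `assign_cohorts` function, and in particular the limit of two consecutive cylinders per cohort (chosen only to reproduce the Fig. 2 example) are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BlockGroup:
    number: int    # block group number
    zone: int      # zone number
    cylinder: int  # cylinder number
    head: int      # head number

def assign_cohorts(groups, cylinders_per_cohort=2):
    """Return {block group number: cohort number}.

    Groups with the same zone and head and with consecutive cylinder
    numbers share a cohort. The cap on cylinders per cohort is an
    assumption of this sketch, not something the text specifies.
    """
    cohorts = {}
    next_cohort = 0
    current_key = None   # (zone, head) of the cohort being filled
    cylinders = []       # distinct cylinders already in that cohort
    ordered = sorted(groups, key=lambda g: (g.zone, g.head, g.cylinder, g.number))
    for g in ordered:
        key = (g.zone, g.head)
        same_cyl = bool(cylinders) and g.cylinder == cylinders[-1]
        next_cyl = bool(cylinders) and g.cylinder == cylinders[-1] + 1
        fits = key == current_key and (
            same_cyl or (next_cyl and len(cylinders) < cylinders_per_cohort)
        )
        if not fits:
            if current_key is not None:
                next_cohort += 1          # start a new cohort
            current_key, cylinders = key, []
        if not cylinders or cylinders[-1] != g.cylinder:
            cylinders.append(g.cylinder)
        cohorts[g.number] = next_cohort
    return cohorts
```

With half-track block groups 0 to 3 on cylinders 0 and 1 of zone 0 and head 0, all four receive the same cohort number, which matches the example described for Fig. 2.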
Next, block group defect information table 400, stored in RAM 12 of disk device 1 of the present embodiment, is described using Fig. 3. Fig. 3 shows an example of the block group defect information table stored in the disk device according to the first embodiment.
As shown in Fig. 3, block group defect information table 400 stores, in association with a block group number, a defect determination result, retry information, a block count, replacement information, read time information, and write time information. These items are examples of the first information about the access history of a block group. Here, the defect determination result is the result of the defect determination process performed by CPU 10 for the block group. In the present embodiment, the defect determination result is "bad" when the block group has been determined to be a defective area and "normal" when the block group has been determined to be normal.
The retry information indicates the number of retries performed while reading data from the block group (read) and the number of retries performed while writing data to the block group (write).
The block count indicates the cumulative number of sectors of the block group from which data has been read (read) and the cumulative number of sectors to which data has been written (write). The replacement information indicates the number of sectors for which a replacement process was performed during access to the block group, that is, the number of replacement sectors (alternate sectors) (examples of replacement areas).
The read time information indicates, as a ratio, the time required to read data from the block group relative to the time required to read the block group without performing a retry. The write time information likewise indicates the time required to write data to the block group relative to the time required to write to the block group without performing a retry.
Next, the process of accessing disk 18 of disk device 1 of the present embodiment is described using Fig. 4. Fig. 4 is a flowchart showing an example of the flow of the process of accessing the disk of the disk device according to the first embodiment.
CPU 10 receives from host 2, through host IF controller 14, a command that specifies the block groups from which data is to be read (hereinafter, the read range) or the block groups to which data is to be written (hereinafter, the write range).
CPU 10 reads data from the read range specified by the received command, or writes data to the write range specified by the received command (B501).
After reading or writing data in a block group of disk 18, CPU 10 performs the update process on block group defect information table 400 stored in RAM 12 (B502). When the update process for block group defect information table 400 ends, CPU 10 ends the access to the block group.
Next, the update process for block group defect information table 400 carried out by disk device 1 of the present embodiment is described using Fig. 5. Fig. 5 is a flowchart showing an example of the flow of that update process.
First, CPU 10 determines whether, during the access in B501 of Fig. 4 to a block group of the write range (or read range) (hereinafter, the update target group), a replacement process that reassigns a sector to a replacement sector of disk 18 was performed (B601).
When a replacement process was performed (B601: yes), CPU 10 updates the replacement information stored in block group defect information table 400 for the block group number of the update target group (B602). In the present embodiment, CPU 10 adds the number of sectors of the update target group for which the replacement process was performed during this access (hereinafter, the replacement sources) to the number of sectors indicated by the replacement information stored for the block group number of the update target group.
Next, CPU 10 updates the retry information and block count stored in block group defect information table 400 for the block group number of the update target group, excluding the retries performed on the replacement sources (B603). In the present embodiment, when data was read from the update target group, CPU 10 adds the number of read retries performed during that read, excluding the read retries performed on the replacement sources, to the number of retries indicated by the retry information (read) stored for the block group number of the update target group.
Likewise, when data was written to the update target group, CPU 10 adds the number of write retries performed during that write, excluding the write retries performed on the replacement sources, to the number of retries indicated by the retry information (write) stored for the block group number of the update target group.
When no replacement process was performed (B601: no), CPU 10 does not update the replacement information in block group defect information table 400 but does update the retry information and block count (read or write) stored for the block group number of the update target group (B604).
In the present embodiment, when data was read from the update target group, CPU 10 adds the number of sectors of the update target group that were actually read to the block count (read) stored for the block group number of the update target group. When data was written to the update target group, CPU 10 adds the number of sectors of the update target group that were actually written to the block count (write) stored for the block group number of the update target group.
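The bookkeeping in B601 to B604 can be sketched in Python as follows. This is only an illustration under assumptions: the `Entry` and `Counters` records and the `record_access` function and its parameter names are hypothetical and do not appear in the patent.

```python
from dataclasses import dataclass, field

@dataclass
class Counters:
    retries: int = 0   # retry information
    blocks: int = 0    # block count (sectors accessed)

@dataclass
class Entry:
    """One row of the defect information table (hypothetical layout)."""
    read: Counters = field(default_factory=Counters)
    write: Counters = field(default_factory=Counters)
    replaced_sectors: int = 0   # replacement information
    verdict: str = "normal"     # defect determination result

def record_access(entry, op, sectors, retries, replaced=0, retries_on_replaced=0):
    """Update one entry after a read or write of `sectors` sectors."""
    if op not in ("read", "write"):
        raise ValueError("op must be 'read' or 'write'")
    if replaced > 0:                       # B601: yes
        entry.replaced_sectors += replaced # B602
        retries -= retries_on_replaced     # B603: exclude retries on replacement sources
    counters = entry.read if op == "read" else entry.write
    counters.retries += retries            # B603 / B604
    counters.blocks += sectors
    return entry
```

Read and write counters are kept separately because the later delay calculation uses the read values for a read range and the write values for a write range.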
Next, CPU 10 (an example of the controller) obtains the delay time of access to the update target group based on the information about the access history of that group (in the present embodiment, the retry information, block count, replacement information, and so on stored in block group defect information table 400), and determines whether the delay time exceeds a predetermined allowable delay time (B605). Here, the delay time is the delay of an access to a block group, measured relative to the time required to access the block group without performing a retry. The predetermined allowable delay time is the delay, measured against the same reference, that is tolerated in an access to a block group.
In the present embodiment, CPU 10 obtains as the delay time the sum of two terms: the time required to access the sectors that were reassigned to replacement sectors by replacement processes performed during accesses to the update target group (hereinafter, the first delay time), and the time required for retries (hereinafter, the second delay time). Specifically, CPU 10 obtains the first delay time using formula (1) below.
First delay time = number of replacement sectors × (seek time × 2 + rotation waiting time + replacement sector access time) ... (1)
Here, the number of replacement sectors is the number of replacement sectors indicated by the replacement information stored in block group defect information table 400 for the block group number of the update target group. The seek time is the time required to seek to a replacement sector; in the present embodiment it is the average time required for such a seek. The rotation waiting time is the time required for disk 18 to make one revolution. The replacement sector access time is the time required to access (read or write data in) a replacement sector.
CPU 10 obtains the second delay time using formula (2) below.
Second delay time = rotation waiting time × number of retries ... (2)
Here, the rotation waiting time is the time required for disk 18 to make one revolution. The number of retries is the number of retries that occur per track in accesses to the update target group, and is obtained using formula (3) below.
Number of retries = (retry count / block count) × number of sectors per track ... (3)
Here, the retry count is the number of retries indicated by the retry information of the update target group (retry information (read) when the update target group is a read range, retry information (write) when it is a write range). The block count is the block count of the update target group (block count (read) when the update target group is a read range, block count (write) when it is a write range).
In the present embodiment, when the update target group is a read range, CPU 10 obtains the read time information using formula (4) below. When the update target group is a write range, CPU 10 obtains the write time information using the same formula (4). The read time information (or write time information) thus obtained is stored in block group defect information table 400 in association with the block group number of the update target group.
Read time information (or write time information) = ((normal access time + delay time) / normal access time) × 100 ... (4)
Here, the normal access time is the time required to access the block group without performing a retry.
CPU 10 then determines the update target group to be a defective area when the read time information (or write time information) thus obtained exceeds the ratio of the allowed access time to the normal access time (300% in the present embodiment). Here, the allowed access time is the time permitted for an access to a block group, namely the normal access time plus the allowable delay time. In this way, CPU 10 determines that an update target group whose delay time exceeds the predetermined allowable delay time is a defective area.
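Formulas (1) to (4) and the 300% threshold can be worked through with a short Python sketch. The function and parameter names are illustrative assumptions, all times share whatever unit the caller chooses, and the guard for a block count of zero is an addition of this sketch that the text does not discuss.

```python
def first_delay(replaced_sectors, seek_time, rotation_time, replaced_access_time):
    """Formula (1): delay caused by accesses redirected to replacement sectors."""
    return replaced_sectors * (seek_time * 2 + rotation_time + replaced_access_time)

def retries_per_track(retry_count, block_count, sectors_per_track):
    """Formula (3): retries scaled to one track's worth of sectors."""
    if block_count == 0:   # assumption: no access history means no retries
        return 0.0
    return retry_count * sectors_per_track / block_count

def second_delay(rotation_time, retries):
    """Formula (2): each retry costs one disk revolution."""
    return rotation_time * retries

def time_ratio_percent(normal_access_time, delay):
    """Formula (4): access time including delay, as a percentage of normal."""
    return (normal_access_time + delay) / normal_access_time * 100

def is_defective(ratio_percent, allowed_percent=300.0):
    """True when the ratio exceeds the allowed ratio (300% in the embodiment)."""
    return ratio_percent > allowed_percent
```

As a made-up example, with one replacement sector, a seek time of 4, a rotation time of 8, and a replacement sector access time of 1, formula (1) gives 17. Ten retries over 1000 sectors on a 500-sector track give 5 retries per track, so formula (2) gives 40. With a normal access time of 8, formula (4) gives (8 + 57) / 8 × 100 = 812.5, which exceeds 300 and would mark the group as a defective area.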
That is, when CPU 10 determines that the delay time does not exceed the predetermined allowable delay time (B605: no), it determines whether rebuild assist processing is being performed (B608). When rebuild assist processing is not being performed (B608: no), CPU 10 sets the defect determination result stored in block group defect information table 400 for the block group number of the update target group to "normal" (B606) and then ends the update process for block group defect information table 400. When rebuild assist processing is being performed (B608: yes), CPU 10 ends the update process without updating the defect determination result.
On the other hand, when CPU 10 determines that the delay time exceeds the predetermined allowable delay time (B605: yes), it sets the defect determination result stored in block group defect information table 400 for the block group number of the update target group to "bad" (B607). This reduces the risk, when the data stored in the block groups is rebuilt, that defective areas are predicted incorrectly or over-predicted, and so the time required for the rebuild can be shortened. CPU 10 then ends the update process for block group defect information table 400. In the present embodiment the defect determination result of a block group is updated on every read or write. However, as long as the defect determination result has been updated before the reads or writes that follow the rebuild assist mode validation process, the defect determination process may be carried out at any timing.
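The branch structure of B605 to B608 is small enough to express as a single function. The following Python sketch uses hypothetical names (`update_verdict`, `rebuild_assist_running`) and only mirrors the flow described above.

```python
def update_verdict(current, delay, allowable_delay, rebuild_assist_running):
    """Return the new defect determination result for one block group."""
    if delay > allowable_delay:      # B605: yes
        return "bad"                 # B607
    if rebuild_assist_running:       # B608: yes
        return current               # result is left unchanged
    return "normal"                  # B606
```

One consequence of this flow is that, while rebuild assist processing is running, a group already marked "bad" is never reset to "normal", although a group can still newly become "bad".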
Next, the rebuild assist mode validation process carried out by disk device 1 of the present embodiment is described using Fig. 6. Fig. 6 is a flowchart showing an example of the flow of that process.
When CPU 10 receives from host 2, through host IF controller 14, a command instructing it to perform rebuild assist processing, it starts the rebuild assist mode validation process (B702). This process detects abnormalities of disk 18 for each physical attribute of disk 18 (for example head or area) and treats as defective areas those block groups of disk 18 that have a physical attribute for which an abnormality was detected.
First, CPU10 controls drive control part 13 to perform read-write checking, that is, to access (read data from and write data to) a test zone preset in the data storage area of disk 18. CPU10 then judges whether the read-write checking has detected a failed magnetic head (hereinafter referred to as a fault magnetic head) among the magnetic heads of head stack 19 (B703).
When a fault magnetic head is detected (B703: yes), CPU10 judges, among the block groups of disk 18, each block group accessed by the detected fault magnetic head to be a defective region. CPU10 then sets, in block group flame table 400, the bad judgment result stored in association with the block group number of each block group judged to be a defective region (that is, each block group accessed by the fault magnetic head) to "bad" (B704).
When no fault magnetic head is detected (B703: no), or after the bad judgment result has been set to "bad" (B704), CPU10 controls drive control part 13 to perform search checking (a seek check). That is, it makes the magnetic heads of head stack 19 seek and judges whether a region of disk 18 to which a magnetic head cannot seek (a search-impossible region) has been detected (B705).
When a search-impossible region is detected (B705: yes), CPU10 judges, among the block groups of disk 18, each block group belonging to the detected search-impossible region to be a defective region. CPU10 then sets, in block group flame table 400, the bad judgment result stored in association with the block group number of each block group judged to be a defective region (that is, each block group belonging to the search-impossible region) to "bad" (B706).
When no search-impossible region is detected (B705: no), or after the bad judgment result has been set to "bad" (B706), CPU10 controls drive control part 13 to perform read checking, which judges whether a region of the data storage area of disk 18 from which data cannot be read (an unreadable region) has been detected (B707).
When an unreadable region is detected (B707: yes), CPU10 judges, among the block groups of disk 18, each block group belonging to the detected unreadable region to be a defective region. CPU10 then sets, in block group flame table 400, the bad judgment result stored in association with the block group number of each block group judged to be a defective region (that is, each block group belonging to the unreadable region) to "bad" (B708). When no unreadable region is detected (B707: no), or after the bad judgment result has been set to "bad" (B708), CPU10 ends the rebuild auxiliary mode validation process.
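The three checks in B703 to B708 all end the same way: any block group touched by a detected fault is marked "bad" in the table. The following Python sketch illustrates only that bookkeeping. The class `BlockGroupTable`, the function `run_validation_checks`, and the three input lists are hypothetical names introduced for illustration; how the drive actually detects a fault magnetic head, a search-impossible region, or an unreadable region is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class BlockGroupTable:
    """Hypothetical stand-in for block group flame table 400."""
    num_groups: int
    bad: set = field(default_factory=set)  # block group numbers judged "bad"

    def mark_bad(self, group_numbers):
        # Ignore numbers outside the table so bad input cannot corrupt it.
        self.bad.update(g for g in group_numbers if 0 <= g < self.num_groups)


def run_validation_checks(table, groups_on_fault_head,
                          groups_in_unseekable_region,
                          groups_in_unreadable_region):
    """Apply the three check results in the order described in the text."""
    table.mark_bad(groups_on_fault_head)          # B703 / B704
    table.mark_bad(groups_in_unseekable_region)   # B705 / B706
    table.mark_bad(groups_in_unreadable_region)   # B707 / B708
    return sorted(table.bad)
```

An empty list for a check corresponds to the "no" branch of that check, in which case the table is left unchanged.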
Furthermore, in the present embodiment, after the rebuild auxiliary mode validation process, CPU10 performs the following process, which judges a block cohort in which the number of block groups whose bad judgment result is "bad" exceeds a first stated number to be a defective region. First, among the block cohorts of disk 18, CPU10 sets the block cohort with the smallest block cohort number (that is, block cohort number 0) as the processing-target block cohort, the cohort to be examined for being a defective region (B709).
Next, CPU10 reads from block group flame table 400 the bad judgment result of each block group belonging to the processing-target block cohort (B710). CPU10 then judges whether, among the block groups belonging to the processing-target block cohort, the number of block groups whose bad judgment result is "bad" exceeds the first stated number (B711). Here, the first stated number is the threshold number of bad block groups above which a block cohort is judged to be a defective region.
When the number of block groups whose bad judgment result is "bad" exceeds the first stated number (B711: yes), CPU10 sets the bad judgment results of all block groups belonging to the processing-target block cohort to "bad" in block group flame table 400 (B712). In other words, CPU10 judges a processing-target block cohort in which the number of block groups judged bad exceeds the first stated number to be a defective region. As a result, there is no need to examine each block group individually, and a block cohort in which all block groups are likely to be bad can be judged to be a defective region as a whole, so the time required for rebuilding can be shortened.
After the bad judgment results of the block groups belonging to the processing-target block cohort have been set to "bad" (B712), or when the number of block groups whose bad judgment result is "bad" is equal to or less than the first stated number (B711: no), CPU10 updates the processing-target block cohort to the block cohort with the next smallest block cohort number (B713).
CPU10 then judges whether the bad judgment has been performed for all block cohorts (B714). If it has not (B714: no), CPU10 returns to B710 and reads the bad judgment results of the block groups belonging to the processing-target block cohort. If it has (B714: yes), CPU10 enables the rebuild auxiliary mode, in which the rebuild auxiliary process is executed in response to a read command or write command (hereinafter referred to as a read/write command) received from host 2 (B715). CPU10 then ends the process of judging block cohorts to be defective regions.
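The loop in B709 to B714 can be sketched as follows. This is a minimal Python illustration, assuming that block groups are numbered consecutively and that every block cohort holds the same number of block groups; the function name `promote_bad_cohorts` and its parameters are hypothetical.

```python
def promote_bad_cohorts(bad_flags, groups_per_cohort, first_stated_number):
    """Return a copy of bad_flags in which every block cohort whose count of
    bad block groups exceeds first_stated_number is marked bad as a whole.

    bad_flags: list of bool, one per block group, in block group number order.
    """
    result = list(bad_flags)
    # B709 / B713 / B714: visit cohorts in ascending cohort number order.
    for start in range(0, len(result), groups_per_cohort):
        cohort = result[start:start + groups_per_cohort]       # B710
        if sum(cohort) > first_stated_number:                   # B711
            # B712: mark every block group of the cohort as bad.
            result[start:start + len(cohort)] = [True] * len(cohort)
    return result
```

The comparison is strict, matching the text: a cohort whose bad count equals the first stated number takes the B711 "no" branch and is left as it is.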
Next, the rebuild auxiliary process performed by disk device 1 of the present embodiment is described with reference to Fig. 7. Fig. 7 is a flowchart showing an example of the flow of the rebuild auxiliary process performed by the disk device according to the first embodiment.
After the rebuild auxiliary mode validation process shown in Fig. 6, CPU10 executes the rebuild auxiliary process in response to a read/write command received from host 2. Referring to block group flame table 400, it obtains the bad judgment results of those block groups of disk 18 that are included in the read/write range (B801).
Next, CPU10 judges whether the block groups included in the read/write range of the read/write command received from host 2 contain a block group whose bad judgment result is "bad" (a defective region) (B802). When a defective region is present (B802: yes), CPU10 refers to block group flame table 400 and judges whether the bad judgment result is "bad" for the block group that, among the block groups included in the read/write range, contains the sector of the start LBA, which is the smallest LBA of the range (hereinafter referred to as the read/write range start LBA group) (B803).
When the bad judgment result of the read/write range start LBA group is "bad" (B803: yes), CPU10 notifies host 2 of the bad judgment start LBA and the bad judgment final LBA (B804). The bad judgment start LBA is the smallest LBA that is within the read/write range and belongs to a block group whose bad judgment result is "bad". The bad judgment final LBA is the largest LBA among the LBAs of the block groups whose bad judgment result is "bad".
On the other hand, when the bad judgment result of the read/write range start LBA group is not "bad" (B803: no), CPU10 refers to block group flame table 400 and reads the bad start LBA, that is, the smallest LBA among the LBAs of the block groups that are within the read/write range and whose bad judgment result is "bad". CPU10 then sets the range from the read/write range start LBA group to the block group containing the sector of the LBA immediately preceding the bad start LBA as the range in which reading or writing is actually performed within the read/write range of the command received from host 2 (the actual read/write range) (B805).
CPU10 reads or writes data in the actual read/write range (B806). After the read or write, CPU10 performs the update process of block group flame table 400 shown in Fig. 3 (B807). CPU10 then judges whether an uncorrectable error (hereinafter referred to as an irrecoverable error) occurred during the read or write performed on the actual read/write range (B808). When an irrecoverable error occurred (B808: yes), CPU10 controls host IF control part 14 to notify host 2 of the read/write result, that is, the result of reading or writing data in the actual read/write range, and of the smallest LBA in the block group in which the irrecoverable error occurred (B809).
When no irrecoverable error occurred (B808: no), CPU10 notifies host 2, via host IF control part 14, of the read/write result, the bad judgment start LBA, and the bad judgment final LBA (B804).
When the block groups included in the read/write range of the read/write command received from host 2 contain no defective region (B802: no), CPU10 uses the read/write range of the command, as it is, as the actual read/write range and reads or writes the data (B810). CPU10 then performs the update process of block group flame table 400 shown in Fig. 3 (B811).
CPU10 then judges whether an irrecoverable error occurred during the read or write performed on the actual read/write range (B808). When an irrecoverable error occurred (B808: yes), CPU10 controls host IF control part 14 to notify host 2 of the read/write result and the smallest LBA of the block group in which the irrecoverable error occurred (B809).
On the other hand, when no irrecoverable error occurred (B808: no), CPU10 controls host IF control part 14 to notify host 2 of the read/write result (B804).
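The branching in B802 to B810 amounts to deciding which part of the requested range is actually accessed and what is reported to the host. The Python sketch below shows only that decision, assuming the bad judgment start LBA and bad judgment final LBA have already been identified (B801). The function `plan_access` and its dictionary keys are hypothetical, and the irrecoverable-error branch (B808, B809) is left out.

```python
def plan_access(start_lba, end_lba, bad_start_lba=None, bad_final_lba=None):
    """Decide the actual read/write range for one read/write command.

    bad_start_lba is None when the range contains no defective region.
    Returns a dict with:
      actual_range: (first_lba, last_lba) to access, or None for no access
      report: (bad_start_lba, bad_final_lba) to notify, or None
    """
    if bad_start_lba is None:
        # B802: no -> access the requested range as it is (B810).
        return {"actual_range": (start_lba, end_lba), "report": None}

    report = (bad_start_lba, bad_final_lba)
    if bad_start_lba <= start_lba:
        # B803: yes -> the range starts inside a defective region, so
        # nothing is accessed and the bad extent is reported (B804).
        return {"actual_range": None, "report": report}

    # B803: no -> access only up to the LBA just before the bad start (B805).
    return {"actual_range": (start_lba, bad_start_lba - 1), "report": report}
```

For a command covering LBA 100 to 199 with a bad extent starting at LBA 150, the sketch accesses LBA 100 to 149 and reports the bad extent, which mirrors the truncation described for B805.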
Next, the process of obtaining the bad judgment results of the block groups included in the read/write range (B801 shown in Fig. 7) is described in detail with reference to Fig. 8. Fig. 8 is a flowchart showing an example of the flow of the bad judgment result acquisition process performed by the disk device according to the first embodiment.
First, CPU10 identifies the block group containing the sector of the start LBA of the read/write range and sets the identified block group as the acquisition target group, the group whose bad judgment result is to be obtained (B901). CPU10 then obtains the bad judgment result of the acquisition target group from block group flame table 400 (B902).
Next, CPU10 judges whether the bad judgment result of the acquisition target group is "bad" (B903). When it is "bad" (B903: yes), CPU10 sets the block group containing the sector of the next smallest LBA after the acquisition target group (the next block group) as the new acquisition target group (B904). CPU10 then obtains the bad judgment result of the new acquisition target group from block group flame table 400 (B905).
CPU10 judges whether the bad judgment result of the new acquisition target group is "normal" (B906). When it is "bad" (B906: no), CPU10 judges whether any block group whose bad judgment result has not yet been obtained remains (B907). When such a block group remains (B907: yes), CPU10 returns to B904 and sets a new acquisition target group again.
When no block group whose bad judgment result has not been obtained remains among the block groups included in the read/write range (B907: no), CPU10 identifies that the block group containing the sector of the start LBA of the read/write range is a defective region, and identifies the bad judgment start LBA and the bad judgment final LBA (B908). Here, the bad judgment start LBA is the start LBA of the read/write range. The bad judgment final LBA is the largest LBA of the block group that last became the acquisition target group. The process then ends.
On the other hand, when the bad judgment result of the new acquisition target group is "normal" (B906: yes), CPU10 notifies host 2 that the block group containing the sector of the start LBA of the read/write range is a defective region, together with the bad judgment start LBA and the bad judgment final LBA (B908). Here, the bad judgment start LBA and the bad judgment final LBA are as described above.
When the bad judgment result of the acquisition target group (the block group containing the sector of the start LBA) is "normal" (B903: no), CPU10 judges whether any block group whose bad judgment result has not been obtained (an unconfirmed block group) remains among the block groups included in the read/write range (B909). When no unconfirmed block group remains (B909: no), CPU10 identifies the read/write range as normal (B914). The process then ends.
On the other hand, when an unconfirmed block group remains (B909: yes), CPU10 sets the block group containing the sector of the next smallest LBA after the acquisition target group (the next block group) as the new acquisition target group (B910). CPU10 then obtains the bad judgment result of the new acquisition target group from block group flame table 400 (B911) and judges whether it is "bad" (B912). When the bad judgment result of the new acquisition target group is "normal" (B912: no), CPU10 returns to B909 and judges whether an unconfirmed block group remains.
When the bad judgment result of the new acquisition target group is "bad" (B912: yes), CPU10 identifies that the block group containing the sector of the start LBA of the read/write range is normal, and identifies the bad judgment start LBA (B913). Here, the bad judgment start LBA is the smallest LBA of the block group that last became the acquisition target group. Then, based on the identified bad judgment start LBA, CPU10 performs the same process as B904 to B908 and identifies the final LBA of the bad range (B913).
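The scan of Fig. 8 can be condensed into a short Python sketch. It rests on several assumptions that the text does not state: every block group covers the same number of consecutive LBAs, the bad judgment final LBA is taken as the last LBA of the last consecutive bad block group, and only the first bad extent in the range is reported. The function `find_bad_extent` and its parameters are hypothetical.

```python
def find_bad_extent(start_lba, end_lba, bad_groups, lbas_per_group):
    """Return (bad_start_lba, bad_final_lba) for the first bad extent in the
    read/write range, or None when every block group in the range is normal.

    bad_groups: set of block group numbers whose result is "bad".
    """
    first_group = start_lba // lbas_per_group   # B901: group of the start LBA
    last_group = end_lba // lbas_per_group

    # B903, B909 to B912: step over normal block groups in ascending order.
    group = first_group
    while group <= last_group and group not in bad_groups:
        group += 1
    if group > last_group:
        return None                              # B914: range is normal

    # The bad start LBA is the start LBA of the range when the first group
    # is bad (B908), otherwise the smallest LBA of the bad group (B913).
    bad_start = max(start_lba, group * lbas_per_group)

    # B904 to B907: extend over consecutive bad block groups in the range.
    while group + 1 <= last_group and (group + 1) in bad_groups:
        group += 1
    bad_final = (group + 1) * lbas_per_group - 1  # largest LBA of last bad group
    return bad_start, bad_final
```

With 100 LBAs per block group and block groups 2 and 3 marked bad, a range of LBA 50 to 450 yields the extent 200 to 399, while a range that begins at LBA 250 yields 250 to 399 because the start LBA itself lies in a bad group.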
According to the first embodiment, a block group whose delay time, obtained for each block group from information on the access history of that block group, exceeds a prescribed allowable delay time is judged to be a defective region. As a result, when the data stored in the block groups is rebuilt, the risk of failing to predict defective regions correctly, or of predicting them excessively, can be reduced. The time required for rebuilding can thereby be shortened.
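The delay-time criterion can be illustrated with a small Python sketch. It assumes that the delay time is the difference between the observed access time and the time an access takes when no retry is performed, which is one reading of the reference described in the claims; the function names and the time unit are hypothetical.

```python
def delay_time(observed_access_time, no_retry_access_time):
    """Delay relative to an access performed without any retry.

    Both arguments use the same (assumed) time unit, e.g. milliseconds.
    A negative difference is clamped to zero.
    """
    return max(0.0, observed_access_time - no_retry_access_time)


def exceeds_allowable_delay(observed_access_time, no_retry_access_time,
                            allowable_delay):
    """True when the block group would be judged a defective region."""
    return delay_time(observed_access_time, no_retry_access_time) > allowable_delay
```

For example, an observed access time of 12.0 against a no-retry time of 8.0 gives a delay of 4.0, which exceeds an allowable delay of 3.0.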
In the first embodiment, CPU10 judges a block group whose delay time exceeds the prescribed allowable delay time to be a defective region, but the embodiment is not limited to this. It suffices that CPU10 judges whether each block group is a defective region based on information on the access history of the block group. For example, CPU10 may judge a block group whose retry execution rate, based on that information, exceeds a prescribed allowable retry execution rate to be a defective region. Here, the prescribed allowable retry execution rate is the retry execution rate permitted for a block group. The retry execution rate is obtained, for example, by dividing the number of retries indicated by the retry information stored in block group flame table 400 by the block count stored in block group flame table 400.
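The retry execution rate described above is a simple quotient. A minimal Python sketch, with hypothetical function names and with the retry count and block count passed in directly rather than read from block group flame table 400:

```python
def retry_execution_rate(retry_count, block_count):
    """Number of retries divided by the block count of the block group."""
    if block_count <= 0:
        raise ValueError("block_count must be positive")
    return retry_count / block_count


def is_defective_by_retry_rate(retry_count, block_count, allowable_rate):
    """True when the rate exceeds the allowable retry execution rate."""
    return retry_execution_rate(retry_count, block_count) > allowable_rate
```

A block group with 5 retries over 100 blocks has a rate of about 0.05 and would be judged a defective region under an allowable rate of 0.01, but not under an allowable rate of 0.10.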
In addition, the present embodiment has described an example in which CPU10 of disk device 1 executes the update process of the block group flame table, the rebuild auxiliary mode validation process shown in Fig. 6, and the rebuild auxiliary process shown in Fig. 7 and Fig. 8, but the embodiment is not limited to this. For example, the CPU of an external device such as host 2 may execute the block group flame table update process, the rebuild auxiliary mode validation process, and the rebuild auxiliary process, or the CPU of the external device may execute only part of these processes.
(Second embodiment)
In the second embodiment, the block cohorts are classified into higher-level cohorts (second storage area groups), each having multiple block cohorts that share a physical configuration, and a higher-level cohort in which the number of block cohorts judged to be defective regions exceeds a second stated number is judged to be a defective region. In the following description, explanation of the parts identical to the first embodiment is omitted.
Fig. 9 is a diagram illustrating an example of the process of judging a higher-level cohort to be a defective region in the disk device according to the second embodiment. Fig. 9(a) shows an example of the bad judgment results of the block groups before the block cohort bad judgment process. Fig. 9(b) shows an example of the bad judgment results of the block groups after the block cohort bad judgment process. Fig. 9(c) shows an example of the bad judgment results of the higher-level cohorts after the higher-level cohort bad judgment process. In the present embodiment, as shown in Fig. 9, CPU10 assigns every five block cohorts to one higher-level cohort according to the physical configuration of the block cohorts. Fig. 9 illustrates the higher-level cohort of block cohort numbers 0 to 4 (higher-level cohort number 0) and the higher-level cohort of block cohort numbers 5 to 9 (higher-level cohort number 1; block cohort numbers 7 and later are omitted).
Using the same method as the process shown in B710 to B714 of Fig. 6, CPU10 judges, starting with block cohort number 0, whether the number of defective regions in the block cohort exceeds the first stated number ("2" in the present embodiment). CPU10 then judges a block cohort in which the number of defective regions exceeds the first stated number to be a defective region. Through this process (the block cohort bad judgment process), CPU10 judges all block groups (block group numbers 4 to 7, 8 to 11, and 16 to 19) of the block cohorts with block cohort numbers 1, 2, and 4 to be defective regions. The bad judgment results of the block cohorts after this process are in the state shown in Fig. 9(b). Among block cohort numbers 0 to 4, which are classified into higher-level cohort number 0, the number of block cohorts judged to be defective regions (block cohort numbers 1, 2, and 4) exceeds the second stated number ("2" in the present embodiment). Here, the second stated number is the threshold number of defective block cohorts above which a higher-level cohort is judged to be a defective region.
Therefore, as shown in Fig. 9, CPU10 judges all block groups (block group numbers 0 to 19) of block cohort numbers 0 to 4, which are classified into higher-level cohort number 0, to be defective regions. CPU10 thereby sets higher-level cohort number 0 as a defective region.
According to the second embodiment, a higher-level cohort in which the number of block cohorts judged to be defective regions exceeds the second stated number is judged to be a defective region. As a result, a higher-level cohort in which all block cohorts are likely to be defective regions can be judged to be a defective region as a whole. The time required for rebuilding can thereby be shortened.
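The escalation to higher-level cohorts can be sketched in Python using the numbers from the Fig. 9 example: four block groups per block cohort, five block cohorts per higher-level cohort, and a second stated number of 2. The function `escalate_to_higher_cohorts` is hypothetical, and treating a block cohort as defective only when all of its block groups are marked bad is an assumption based on the state after B712.

```python
def escalate_to_higher_cohorts(bad_flags, groups_per_cohort,
                               cohorts_per_higher, second_stated_number):
    """Mark every block group of a higher-level cohort bad when the number
    of its defective block cohorts exceeds second_stated_number.

    bad_flags: list of bool per block group, after the block cohort step.
    """
    result = list(bad_flags)
    higher_size = groups_per_cohort * cohorts_per_higher
    for h_start in range(0, len(result), higher_size):
        higher = result[h_start:h_start + higher_size]
        # Split the higher-level cohort into its block cohorts.
        cohorts = [higher[i:i + groups_per_cohort]
                   for i in range(0, len(higher), groups_per_cohort)]
        defective_cohorts = sum(1 for cohort in cohorts if all(cohort))
        if defective_cohorts > second_stated_number:
            result[h_start:h_start + len(higher)] = [True] * len(higher)
    return result
```

With block groups 4 to 11 and 16 to 19 marked bad (block cohorts 1, 2, and 4), three defective cohorts exceed the threshold of 2, so block groups 0 to 19 all become bad while the next higher-level cohort is untouched.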
Several embodiments of the present invention have been described, but these embodiments are presented only as examples and are not intended to limit the scope of the invention. These novel embodiments can be implemented in various other forms, and various omissions, replacements, and changes can be made without departing from the gist of the invention. These embodiments and their modifications are included in the scope and gist of the invention, and are included in the invention described in the claims and its equivalents.

Claims (8)

1. An information processing device, characterized by
comprising:
a storage medium; and
a control part that, based on first information on the access history of the storage areas of the storage medium, obtains for each storage area a delay time of access to that storage area, the delay time being measured with reference to the time taken to access the storage area without performing a retry, and that judges a storage area whose delay time exceeds a prescribed allowable delay time to be a defective region.
2. The information processing device according to claim 1, characterized in that
the control part further detects an abnormality of the storage medium for each physical attribute of the storage medium, and judges the storage area corresponding to a physical attribute for which an abnormality is detected to be the defective region.
3. The information processing device according to claim 1, characterized in that
the storage medium is classified, according to a classification condition based on a physical attribute of the storage medium, into first storage area groups each having multiple storage areas, and
the control part judges a first storage area group in which the number of storage areas judged to be the defective region exceeds a first stated number to be the defective region.
4. The information processing device according to claim 3, characterized in that
the storage medium is classified into second storage area groups each having multiple first storage area groups that share a physical configuration, and
the control part judges a second storage area group in which the number of first storage area groups judged to be the defective region exceeds a second stated number to be the defective region.
5. The information processing device according to any one of claims 1 to 4, characterized in that
the first information includes the number of retries performed on the storage area.
6. The information processing device according to any one of claims 1 to 4, characterized in that
the first information includes the number of replacement regions formed by replacement processing performed during access to the storage area.
7. The information processing device according to any one of claims 1 to 4, characterized in that
the control part further judges a storage area whose retry execution rate, based on the first information, exceeds a prescribed execution rate to be the defective region.
8. An information processing method for an information processing device having a storage medium, the information processing method characterized by comprising:
obtaining, based on first information on the access history of the storage areas of the storage medium, a delay time of access to each storage area, the delay time being measured with reference to the time taken to access the storage area without performing a retry; and
judging a storage area whose delay time exceeds a prescribed allowable delay time to be a defective region.
CN201410487610.9A 2014-07-29 2014-09-22 Information-processing device and method Pending CN105302677A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201462030275P 2014-07-29 2014-07-29
US62/030,275 2014-07-29

Publications (1)

Publication Number Publication Date
CN105302677A true CN105302677A (en) 2016-02-03

Family

ID=55180135

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410487610.9A Pending CN105302677A (en) 2014-07-29 2014-09-22 Information-processing device and method

Country Status (2)

Country Link
US (1) US20160034330A1 (en)
CN (1) CN105302677A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109725847A (en) * 2017-10-30 2019-05-07 东芝存储器株式会社 Storage system and control method
CN110310675A (en) * 2018-03-20 2019-10-08 株式会社东芝 Disk set and reading and processing method

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20160123452A (en) * 2015-04-15 2016-10-26 삼성디스플레이 주식회사 Organic light emitting display device and method of driving the same
KR102572357B1 (en) * 2016-02-03 2023-08-29 삼성전자주식회사 Raid-6 data storage device and data processing system having the same

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101097531A (en) * 2006-06-28 2008-01-02 联想(北京)有限公司 Computer RAID array early-warning system and method
CN102012847A (en) * 2010-12-06 2011-04-13 创新科存储技术有限公司 Improved disk array reconstruction method
CN102033797A (en) * 2009-09-28 2011-04-27 佳能株式会社 Information processing apparatus, method for controlling information processing apparatus

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE69033262T2 (en) * 1989-04-13 2000-02-24 Sandisk Corp EEPROM card with replacement of faulty memory cells and buffer
US5535328A (en) * 1989-04-13 1996-07-09 Sandisk Corporation Non-volatile memory system card with flash erasable sectors of EEprom cells including a mechanism for substituting defective cells
US5488702A (en) * 1994-04-26 1996-01-30 Unisys Corporation Data block check sequence generation and validation in a file cache system
US6000006A (en) * 1997-08-25 1999-12-07 Bit Microsystems, Inc. Unified re-map and cache-index table with dual write-counters for wear-leveling of non-volatile flash RAM mass storage
US7050252B1 (en) * 2002-06-01 2006-05-23 Western Digital Technologies, Inc. Disk drive employing off-line sector verification and relocation of marginal sectors discovered during read error recovery procedure
JP4110000B2 (en) * 2003-01-28 2008-07-02 株式会社ルネサステクノロジ Storage device
US7881133B2 (en) * 2003-11-11 2011-02-01 Samsung Electronics Co., Ltd. Method of managing a flash memory and the flash memory
US7774643B2 (en) * 2006-01-06 2010-08-10 Dot Hill Systems Corporation Method and apparatus for preventing permanent data loss due to single failure of a fault tolerant array
US7573775B2 (en) * 2006-02-09 2009-08-11 Fujitsu Limited Setting threshold voltages of cells in a memory block to reduce leakage in the memory block
JP2009211233A (en) * 2008-03-01 2009-09-17 Toshiba Corp Memory system
US9047187B2 (en) * 2012-06-28 2015-06-02 Intel Corporation Defect management in memory systems
US9201777B2 (en) * 2012-12-23 2015-12-01 Advanced Micro Devices, Inc. Quality of service support using stacked memory device with logic die

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109725847A (en) * 2017-10-30 2019-05-07 东芝存储器株式会社 Storage system and control method
CN109725847B (en) * 2017-10-30 2021-12-07 东芝存储器株式会社 Memory system and control method
CN110310675A (en) * 2018-03-20 2019-10-08 株式会社东芝 Disk set and reading and processing method

Also Published As

Publication number Publication date
US20160034330A1 (en) 2016-02-04

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20160203