TW200535609A - Disk controller methods and apparatus with improved striping, redundancy operations and interfaces - Google Patents

Disk controller methods and apparatus with improved striping, redundancy operations and interfaces

Info

Publication number
TW200535609A
Authority
TW
Taiwan
Prior art keywords
data
drive
disk
array
read
Prior art date
Application number
TW94107704A
Other languages
Chinese (zh)
Other versions
TWI386795B (en)
Inventor
Michael C Stolowitz
Original Assignee
Netcell Corp
Priority date
Filing date
Publication date
Application filed by Netcell Corp
Publication of TW200535609A
Application granted
Publication of TWI386795B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1076 Parity data used in redundant arrays of independent storages, e.g. in RAID systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2211/00 Indexing scheme relating to details of data-processing equipment not covered by groups G06F3/00 - G06F13/00
    • G06F2211/10 Indexing scheme relating to G06F11/10
    • G06F2211/1002 Indexing scheme relating to G06F11/1076
    • G06F2211/1054 Parity-fast hardware, i.e. dedicated fast hardware for RAID systems with parity

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Signal Processing For Digital Recording And Reproducing (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)

Abstract

A RAID disk drive controller (FIG. 33) implements disk storage operations, including striping and redundancy operations with multiple disk drives connected via respective SATA ports (520). Configurable data path switch logic (460) provides dynamic configuration of two or more attached drives into one or more arrays. Data transfers are synchronized locally by leveraging the SATA port transport layer FIFO (530). Synchronous transfers allow on-the-fly redundancy (XOR) operations (FIG. 36) for improved performance and reduced hardware complexity. XOR accumulator hardware (FIG. 42-FIG. 43) reduces buffer requirements for multiple DMA channels otherwise required for synchronization, and various narrow and wide striping modes are supported.

Description

IX. Description of the Invention:

[Priority Claim]

This application claims priority to U.S. Provisional Patent Application No. 60/553,594, filed March 12, 2004.
That provisional application is incorporated herein by reference in its entirety.

[Technical Field of the Invention]

The present invention is in the field of digital data storage systems and, more specifically, relates to improvements in RAID disk array controllers.

[Prior Art]

Hard disk drives remain the primary mass storage devices for small to mid-sized computers. Drive access time consists mainly of seek time and rotational latency and is limited by the mechanical parts of the drive. Data transfer rates for current products are in the range of 50 to 100 megabytes per second.

A system may use multiple disk drives as an array if the required capacity, performance, or reliability exceeds what is available from a single drive. Increased capacity is the most common motivation. Two drives of a given size can store twice as much data as one; compare FIG. 1A and FIG. 1B.

The reliability improvement is less obvious. Two drives of a given type will have twice the failure rate of a single drive. On the other hand, a system can be configured so that the second drive always holds an exact copy of the data on the first drive (this is often called "mirroring"). If either drive fails, the data is not lost because a copy still exists; see FIG. 1C (where Data A' is a copy of Data A). If the failed drive is replaced immediately and the copy is re-established, the probability of data loss is the probability that the second drive fails while the failed drive is being replaced and the copy rebuilt. This is far lower than the probability that any single drive fails.

The third reason for using a disk drive array in a system is performance. There are two main cases.
First, streaming applications may require higher sustained bandwidth than a single drive can supply. An array of N drives can potentially provide N times the sustained bandwidth of a single drive. Second, the access time of a given drive (determined mainly by seek time and rotational latency) limits the number of I/O operations per second (IOPS) that can be performed. In the best case, an array of N drives can potentially support N times the IOPS of a single drive. An example is illustrated in FIG. 1D.

Data Architectures

Simply attaching a second drive to a system immediately doubles capacity, but may not yield any performance improvement. If most accesses go to one drive or the other, performance will be limited to that of the more heavily used drive. Striping is a well-known technique that distributes data across the available drives so that retrieving the data tends to require the participation of all of them, allowing the system to approach the combined performance of all the drives.

The smallest addressable unit of storage on a typical disk drive is the sector. A sector typically holds a power-of-two number of bytes. In this application, a sector size of 512 bytes is used for purposes of illustration, not limitation. To stripe, an order or sequence is assigned to the drives and a stripe width is selected. Two drives may be identified as 0 and 1. The stripe width may be 4K bytes, which is eight sectors of 512 bytes each. With these choices, the first 4K block of user data ("User 0" in the figure) is stored in the first 4K block of drive 0. The second 4K block of user data ("User 1") is stored in the first 4K block of drive 1.
The third 4K block of user data is stored in the second 4K block of drive 0, and the fourth 4K block of user data is stored in the second 4K block of drive 1. This arrangement is illustrated in FIG. 2. The process repeats, alternating the storage of 4K blocks between the two drives, until the end of the drives is reached. If the system makes a large number of small accesses of one or two sectors, the two drives can be accessed simultaneously to achieve twice the random-access performance of a single drive. If the system accesses relatively large blocks of data, the two drives can likewise operate simultaneously to achieve almost twice the sustained performance of a single drive. The consequences of the choice of stripe width are discussed later.

Redundancy

As described above, a drive can be added to keep a continuously updated backup copy of another drive. In this simple approach, every write operation is simply duplicated on the backup drive, so the backup drive is an exact copy of the primary drive. This technique is commonly called "mirroring" or RAID 1. Data may be read from either drive until one of the drives fails, at which point the surviving drive is selected for reading. The improved reliability comes at a cost in storage: one mirror drive is required for each primary drive.

Techniques exist for protecting data at less than 100% incremental cost. Consider the two-drive array with data striping described earlier. An additional drive of the same size as the original two is added to the array. This drive is called the "redundant" drive; see FIG. 3. In this arrangement, each 4K block of the redundant drive receives the XOR of the corresponding 4K blocks of the other two drives. If any single drive fails, the contents of each 4K block of the failed drive can be reconstructed by computing the XOR of the corresponding 4K blocks of the remaining drives. In general, for an array with data striped across N drives, the XOR of all the data blocks of a stripe is stored on the redundant drive. Likewise, a missing block of a stripe can be reconstructed from the XOR of the remaining blocks of the stripe. The incremental cost of redundancy is reduced to 1/N, where N is the number of data drives.

As described above, the "redundant" drive is one specific drive of the array. For each stripe, the redundant drive contains the XOR of the corresponding blocks of the data drives. More generally, there is no real distinction between "data drives" and "redundant drives". The terms are a convenient way of denoting the functions assigned to the drives within a stripe, and those assignments can be rotated from stripe to stripe. The redundant drive tends to become a bottleneck for applications with a high percentage of writes, and this rotation tends to balance the load.

For read accesses with no drive failure, the performance of a redundant array is the same as that of a striped array without redundancy. Reconstructing a block of data from a failed drive, however, requires additional disk activity to access each of the remaining drives in the array, plus additional processing of the data for the XOR computation. Any update of a block invalidates the redundant block for that stripe, which must then be updated as well.

As noted above, a system with two drives can provide redundancy by using one drive to mirror the other, or it can double capacity and provide up to a 2X performance improvement. Issuing the additional disk write commands needed to maintain a copy, or the additional operations needed to distribute or gather data striped across two drives, is easily handled by driver software using a disk controller that provides no dedicated array functions.
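The striping layout and XOR redundancy described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the helper names `locate` and `xor_blocks` and the fill patterns are hypothetical, the redundant drive is fixed rather than rotated, and nothing here represents the controller hardware itself.

```python
from functools import reduce

SECTOR = 512          # bytes per sector, as used for illustration in the text
STRIPE_WIDTH = 4096   # 4K bytes = eight 512-byte sectors


def locate(user_block: int, n_data_drives: int) -> tuple[int, int]:
    """Map a 4K user block number to (drive index, 4K block index on that drive)."""
    return user_block % n_data_drives, user_block // n_data_drives


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Two data drives, as in the FIG. 2 example: User 0..3 alternate between drives.
layout = [locate(k, 2) for k in range(4)]

# One stripe: a 4K block on each data drive plus the block on the redundant drive.
data0 = bytes([0x11]) * STRIPE_WIDTH   # hypothetical contents of drive 0
data1 = bytes([0xA5]) * STRIPE_WIDTH   # hypothetical contents of drive 1
redundant = xor_blocks([data0, data1])

# Drive 1 fails: rebuild its block from the surviving data block and the redundant block.
rebuilt = xor_blocks([data0, redundant])
```

Here `layout` places User 0 and User 2 on drive 0 and User 1 and User 3 on drive 1, and `rebuilt` equals the lost `data1` block because XORing a value twice cancels it out.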
For a system with a redundant array of three or four drives, the XOR computation and the additional disk activity can benefit significantly from dedicated hardware, with or without local intelligence. In today's market, two-drive arrays are generally handled in software. Larger arrays use dedicated disk controller hardware, which may reside on the motherboard, on a plug-in card, or in an external enclosure.

Redundancy Hardware

Industry standards exist that specify the mechanical, electrical, and logical interfaces of disk drives. A drive can be attached to a system by providing an interface, commonly called a controller or an adapter, that meets the requirements of the interface standard. In any system where performance is a concern, direct memory access (DMA) is used by the controller to transfer disk data between the drive and system memory.

As a case for examining acceleration hardware, consider an array of three data drives plus one redundant drive; see FIG. 4. Before any drive has failed, accessing a block of data requires only that the target drive be read and the data transferred to memory by DMA. There is one disk access, and 4K bytes of data are transferred to memory. If that drive has failed, accessing the same data block requires reading the balance of the stripe, i.e. the corresponding blocks of the same stripe from all the other drives.

Each of the remaining drives is read and its data is transferred to memory by DMA. Even though the three drives may have the same average access characteristics, the read operations will actually complete at different times for various reasons, including the fact that the initial states of head position and rotational position are independent. See FIG. 35A, which shows this asynchronous data transfer: Data 0, Data 1, and PAR (the parity or redundant drive) are transferred through respective DMA channels to corresponding buffer memories. The Data 2 drive has failed.

Once the three blocks are stored in a buffer, the XOR operation can be performed to reconstruct the missing data. Referring to FIG. 35B, to XOR the three streams, one element is read from each stream, the three elements are XORed in logic 620, and the resulting element is stored in a new block in memory 622. Note that an element may be any size convenient for the associated memory and DMA hardware. This process requires three disk accesses: 12K of data is transferred from the disks to memory (using 4K blocks for illustration); 12K of data is read back from the memory for the XOR computation; and 4K bytes of data are written back to memory 622, for a total of 28K of data transferred into and out of the memory.

The following can be observed from the preceding example:

1. Before the failure, accessing a 4K data block requires only 4K of data transfer; after the failure, the access requires a total of 28K bytes of buffer access, i.e. a seven-fold increase in system bus and memory bandwidth load.

2. The XOR computation cannot begin until the data blocks have been received from all three drives. The entire XOR process therefore adds to the total latency of the read operation, which creates an incentive to make the buffer memory as fast as practicable. Note that although the XOR process could begin with the first two streams, additional bandwidth would still be needed to store the intermediate result and fetch it again to XOR it with the final data.

3. After a failure, the overhead for drive management effectively triples.

Synchronous Redundant Data Transfers

A typical disk drive contains an internal buffer that decouples the data transfer rate of the read/write (R/W) head from the transfer rate of the drive interface. This internal buffer (and its ability to accommodate a variety of interface speeds) can be exploited to enhance redundancy operations while significantly reducing hardware requirements. Consider the ATA/ATAPI interface in its original parallel programmed input/output (PIO) mode. In this mode, a single sixteen-bit word of data is read from or written to the drive using a read or write strobe (DIOR or DIOW) supplied by the controller/adapter. From the description of redundancy hardware above, reconstructing a block of data of a failed drive requires reading the remaining three drives, transferring their data to a local buffer, and then reading the three streams from the buffers to compute the XOR. This is because the drives, although operating concurrently, are not synchronized with one another, so each transfers its data at a different time. ("Internal buffer" is used to refer to the buffer inside a disk drive, as distinguished from the buffer memory of a controller, adapter, or host.)

An alternative technique is called Synchronous Redundant Data Transfer (SRDT). With synchronous redundant data transfer:

1. Read commands are issued to all three (or N) drives. When data becomes available in the internal buffer of a drive, it is not immediately transferred to a buffer within the controller/adapter.

2. Once the read data from all three drives is available in their respective internal buffers, however, the XOR process can begin. An XOR engine fetches a first element from each drive, computes the XOR of the three elements, and outputs the first element of the result to a buffer within the controller/adapter. This redundancy operation is "on-the-fly" because it is performed as the data moves from the drives to the buffer, as distinct from first storing the data in the buffer and then having to read it out to perform the redundancy operation, as described above.

For the ATA/ATAPI interface in PIO mode, the element size is one sixteen-bit word, i.e. the interface width. Elements are fetched by applying a common DIOR strobe to the three drives simultaneously. Using a common DIOR strobe makes the data transfers "synchronous". Under the redundancy hardware scheme described above, the XOR cannot begin until the data from the last drive has been transferred to memory. In the synchronous (SRDT) scheme, the XOR computation begins as soon as the data from the last drive is available in that drive's internal buffer. Assuming the read strobe is generated at a rate supported by the drives, the advantages of synchronous data transfer with on-the-fly redundancy computation are as follows:

1. From the moment the last drive to finish has read data ready in its buffer, the XOR is computed and the result is transferred with the same latency as a transfer from a single drive before the failure. The additional latency of fetching data from a buffer, computing the XOR, and storing the result back to the buffer is eliminated.

2. The total data transferred to the buffer in this example is the 4K block of the original transfer. The bandwidth required is that needed to support a single drive.

3. The data from the three drives is reduced to a single stream. Only a single DMA context (address and count) is required for the operation, rather than one DMA context per drive as needed in the original example.

This efficient operation, however, depends on using a storage element size equal to the width of the drive interface ("narrow striping"), and it is limited to synchronous transfers induced by applying a common DIOR strobe to all drives of the array.

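The difference in buffer traffic between the two reconstruction schemes described above can be checked with a small model. The sketch below is an assumption-laden illustration, not the patented hardware: the function names, the byte patterns, and the way traffic is counted (bytes moved into or out of controller buffer memory) are all hypothetical choices made for the example.

```python
BLOCK = 4096  # 4K block, as in the example
WORD = 2      # one sixteen-bit element, the PIO interface width


def xor_words(words):
    """XOR a list of equal-length byte strings together."""
    out = bytearray(len(words[0]))
    for w in words:
        for i, b in enumerate(w):
            out[i] ^= b
    return bytes(out)


def buffered_reconstruct(survivors):
    """Store every surviving block in the buffer, read it back, XOR, store the result."""
    traffic = 0
    buffer = []
    for blk in survivors:          # DMA each surviving block into buffer memory
        buffer.append(blk)
        traffic += len(blk)
    for blk in buffer:             # read each block back out for the XOR
        traffic += len(blk)
    result = xor_words(buffer)
    traffic += len(result)         # write the rebuilt block back to memory
    return result, traffic


def on_the_fly_reconstruct(survivors):
    """A common strobe fetches one word from every drive; only the XOR is stored."""
    traffic = 0
    result = bytearray()
    for off in range(0, BLOCK, WORD):
        elements = [blk[off:off + WORD] for blk in survivors]
        result += xor_words(elements)
        traffic += WORD            # only the result word reaches the buffer
    return bytes(result), traffic


# Hypothetical contents for three data drives; Data 2 is the failed drive.
data0 = bytes(i % 256 for i in range(BLOCK))
data1 = bytes((3 * i + 7) % 256 for i in range(BLOCK))
data2 = bytes((5 * i + 1) % 256 for i in range(BLOCK))
parity = xor_words([data0, data1, data2])

slow_block, slow_traffic = buffered_reconstruct([data0, data1, parity])
fast_block, fast_traffic = on_the_fly_reconstruct([data0, data1, parity])
```

Both functions rebuild the same Data 2 block. Under this counting, the buffered path moves 28K through buffer memory (12K in, 12K out, 4K back in) while the on-the-fly path moves only the 4K result, which matches the seven-fold figure given earlier.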
In view of the above background, several problems remain. With PATA technology, a controller can access multiple drives simultaneously, allowing it to perform redundancy computations on-the-fly for improved RAID performance and significantly reduced hardware complexity. The technologies developed since (first UDMA, then SATA) are source-synchronous and do not allow a controller to transfer read data synchronously.

Second, the prior art uses synchronous data transfers and on-the-fly redundancy with a stripe width of a few bytes or words, merging the drive data into a single stream that can be transferred to and from a buffer over a single DMA channel. Techniques are needed to extend the use of synchronous data transfers and on-the-fly redundancy to arrays having a stripe width of one or more sectors.

In addition, current asynchronous techniques require the controller to transfer data to memory before any XOR computation. This wastes buffer bandwidth relative to techniques that could perform the redundancy operations without first transferring all the data to memory and then reading it back for computation.

A RAID disk drive controller implements disk storage operations, including striping and redundancy operations, with multiple disk drives connected via respective source-synchronous ports (for example, SATA ports). Configurable data path switch logic provides dynamic configuration of two or more attached drives into one or more arrays. Data transfers are synchronized locally by leveraging the SATA port transport layer FIFO. Synchronous transfers allow on-the-fly redundancy operations for improved performance and reduced hardware complexity. XOR accumulator hardware (FIG. 42-FIG. 43) reduces the buffer requirements for multiple DMA channels otherwise required for synchronization, and various narrow and wide striping modes are supported. Improvements for partial-stripe updates are also provided.

Additional aspects and advantages of the invention will be apparent from the following detailed description of preferred embodiments, which proceeds with reference to the drawings.

[Embodiments]

Establishing synchronous data transfers in a disk array

In a disk controller, data transfers to and from the disk drives are asynchronous for some interfaces and protocols. That is, the transfers are controlled by the individual drive electronics, and each drive completes its portion of a transfer, e.g. a read of a stripe of data, at a different time. The availability of synchronous data transfers enables "on-the-fly" generation of redundancy information (in the disk write direction) and "on-the-fly" regeneration of missing data in the read direction (in the event of a disk failure).
This can be achieved by placing an elastic buffer (i.e., a FIFO) in the data path between each drive and the controller. The strategy is illustrated using the example of the UDMA interface, although it can be used in any application in which a data strobe originates at the data storage device rather than at the controller. For each drive and its FIFO, an interface implementing the UDMA protocol accepts data from the drive and pushes it into the FIFO on the drive's read strobe. If any FIFO approaches full, the interface uses the mechanism provided by the UDMA protocol to "pause" the data transfer. For this purpose, the FIFO provides an "almost full" signal, which is asserted while enough space remains in the FIFO to accept the maximum number of words that a drive may send once "pause" has been asserted. Using a method largely as described in U.S. Patent No. 6,018,778, data is moved out of the FIFOs synchronously.

Specifically, after read commands are issued to all of the drives, the controller waits until data is available for transfer in all of the FIFOs. The data is then fetched and transferred, using a single address counter, to a buffer memory within the controller. If any FIFO becomes "empty", the process stalls until all of them again indicate "not empty".

Now consider a disk write operation. Again, a single address counter is used to read data from the buffer within the controller. Segments of the data words read from the buffer are pushed into each FIFO using a common strobe, i.e. the data is striped across the drives of the array. If any FIFO becomes "full", the process stalls. On the drive side of the FIFOs, the interfaces implementing the UDMA protocol pull data from the FIFOs and transfer it to the drives. Although these transfers may start simultaneously, they will be asynchronous, because each interface responds independently to "pause" and "stop" requests from its attached drive.

This use of FIFOs or similar memory to adapt an asynchronous drive interface or protocol (asynchronous meaning that the drive generates its own data strobe) for synchronous redundant data transfers provides significant advantages over standard techniques for handling simultaneous data transfer requests from a drive array. Embodiments are described in more detail below.

FIG. 6 illustrates an array 10 of disk drives. UDMA is used for illustration and not limitation. Drive 12 has a data path 14 that provides read data to an interface 16 implementing the standard UDMA protocol. Similarly, a second drive 20 has a data path coupled to a corresponding UDMA interface 24, and so on. The number of drives may vary; four are shown for illustration. Each physical drive is attached to a UDMA interface.
Each disk drive is lightly connected to the -data input port of the memory via its udma interface. The memory system such as -FIF0 ', although other types of memory systems can be used. For example, the disk drive ^ goes through the UDMA interface 16 to the first-first-in-first-out memory 26 ′ and the disk drive 20 goes through the UDMA interface 24 to the second-in-first-out (FIF). ) Memory 28, and so on. In each case, the udma interface accepts data from the drive and advances the data to FIF for read strobes on the drive. See ... · Write from drive 12 to first-in-first-out (FIF0) memory 26 ... inverted input signal 60, write WR input from drive 20 to first-in-first-out (FIF〇) memory 28 Signal 62, and so on. As mentioned above, this strategy is the opposite of the PI0 mode, where the read strobe is provided to the drive by the controller. If any FIF0 is close to a full condition, the UDMA interface will be "paused" for the method described by the ATCA / AτΑρι specification of NCITS. For this purpose, FIF0 or other 17 200535609 memory systems provide a full (AF, almost full) Chuan 1 ', which is “with sufficient space to remain available for nF0 = denier; expected [According to the interruption, the number of words that can be sent by a disk drive is accepted. The big data is obtained by using something similar to the United States and the United Kingdom, which is described in Patent No. 6, 〇18,77δ: This method is synchronously removed from the compensation. Optionally, after issuing the read order to all the drives, 'there is waiting until resources are available for transmission to all chats, that is: — "Empty (emPty)" condition. This is to say that instead of Fig. 6, it uses the input from each fif0 to-the logical box 4o to generate a lean signal. After all the deletions have the instruction of data, that is, 2 FIFO all have data from its corresponding disk drive, read the data to lose. See the shell material transmission as described later. 
Each fif0 system has an output path, such as K. 48, in this preferred implementation For example, it is 16 bits wide. Fang: Drive data paths are merged side by side, such as Out of the box 50. In other words, an "Official from the chest to a buffer ^ re)" data path is provided by one office, 11 °. 52, which has an equal to N times m bits and eight middle, N system The number of attached drives and m is the width of the data path from each drive (although it is not necessary that they all have the same width: ―. In the configuration shown in the figure, four drives are used, each of which It has a 16 'lean path, and it is a total of 64 bits to the buffer 52. 44 The transmission of FIF0 and other data is moved by a common read gate, and the common read gate 44 is transmitted to all FIF〇 。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。 FIF .。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。。 From which is shown in the figure to the FIF 0. To 18 18 200535609. A mouth P knife. If any of the FIF0, the process will be delayed until the establishment of ^ / ^ (㈣) (empty) ". U refers again to "not empty". Referring to Figure 7, the disk write operation is described. Once again, it is introduced into the controller and each disk drive with a single address counter 7G, and the path # 1 is transported, And 5 贝 comes from the buffer 52 inside the controller. In a preferred embodiment, the value of the buffer slope of 1 § 1 is finally half-duplex (half-duplex) duplex), Jin Sang, Yue Fei) " Hai Temple FIFO and address counter. Each FIFO has a multiplex to cry J to eight ports, depending on the direction of the data transmission. The fishing items come from the buffer The rough space ^ The subgroup of the bee group uses a common return, g 72 to advance to each FIF0, and the common 72 is coupled to the write control input of each fif0, as shown in the figure See: Data Path m 78, 80. 'Write data in this way' is "disassembled, on the array's magnetic machine." 
If any of the FIFOs becomes "," this is implemented by the logic represented by ... to produce "(any are full), a signal. Ma Man / Yu Gang ’s drive side, implement UDMAM interface ^: 糸 The data from these FIFOs will be transmitted to these magnetic tubes. These transmissions may start at the same time, which will be asynchronous, Since each:: will respond independently to "pause," stop, from its drive; 39 200535609
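The synchronous read path of Figure 6 described above can be sketched in software as a rough illustration. This is a minimal model, not the controller's actual logic: the names `all_have_data` and `broadside_read`, the use of a Python dict as the buffer, and the choice to place drive 0 in the low-order bits of the merged word are all assumptions made for the example.

```python
from collections import deque

WORD_BITS = 16                      # m: width of each drive's data path (four-drive example)
WORD_MASK = (1 << WORD_BITS) - 1


def all_have_data(fifos):
    """Model of logic block 40: true only when no FIFO is empty."""
    return all(len(fifo) > 0 for fifo in fifos)


def broadside_read(fifos, buffer, address):
    """Apply the common read strobe for as long as every FIFO holds data.

    Each strobe pops one word from every FIFO, merges the words side by
    side into one N*m-bit word (drive 0 in the low-order bits, an assumed
    ordering) and stores it at the next buffer address.
    Returns the updated value of the single shared address counter.
    """
    while all_have_data(fifos):
        merged = 0
        for lane, fifo in enumerate(fifos):
            merged |= (fifo.popleft() & WORD_MASK) << (lane * WORD_BITS)
        buffer[address] = merged
        address += 1
    return address


# Four drives fill their FIFOs at different times (asynchronously).
fifos = [deque() for _ in range(4)]
buffer = {}

fifos[0].extend([0x0001, 0x0011])
fifos[2].append(0x0003)
fifos[1].append(0x0002)
# Drive 3 has delivered nothing yet, so the transfer stalls.
stalled_address = broadside_read(fifos, buffer, 0)

fifos[3].extend([0x0004, 0x0044])
# Now every FIFO has data: exactly one 64-bit word can move before
# FIFOs 1 and 2 run empty and the process stalls again.
final_address = broadside_read(fifos, buffer, stalled_address)
```

The point of the sketch is that the buffer sees one wide, in-order stream addressed by one counter, while the per-drive timing differences are absorbed entirely by the FIFOs.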

This modification of UDMA, using FIFOs to enable synchronous redundant data transfers, provides significant advantages over the standard techniques for handling simultaneous data transfer requests from an array of drives. The standard approach requires one DMA channel per drive, that is, more than one address counter. These DMA channels contend for access to the buffer, which produces multiple short burst transfers and reduces the bandwidth achievable by the various technologies. It has been determined that the buffer bandwidth consumed by the combination of disk data transfers, host data transfers and accesses for redundant data computation becomes a bottleneck in RAID controller designs. As noted above, the present invention requires only a single DMA channel for the entire array.

Data stored in a disk array can be protected against loss due to the failure of any single drive by providing redundant information. In a redundant array, the stored data comprises user data and redundant data that is sufficient to enable reconstruction of all of the user data in the event that any single drive of the array fails.

U.S. Patent No. 6,237,052 B1 teaches that redundant data computations can be performed "on-the-fly" during a synchronous data transfer. The combination of three concepts, namely synchronous data transfers, "on-the-fly" redundancy, and the UDMA adapter using one FIFO per drive, provides a high-performance redundant disk array data path with a minimum of hardware.

Although various arithmetic and logical operations might be used to produce a redundant data pattern, XOR is used in the present explanation. Referring to Figure 8, data flow in the write direction is shown. The figure illustrates a set of drives 300, each connected to one of a set of UDMA interfaces 320. Each drive has a corresponding FIFO memory 340 in its data path, as described above.

In the disk write direction, data words are read from buffer 350. Segments of these data words (see, for example, data paths 342, 344) are written to each drive. At this point a logical XOR operation can be performed "on-the-fly" between corresponding bits of the segments. XOR logic 360 is arranged to compute the Boolean XOR of the corresponding bits of each segment, producing a sequence of redundant segments that is first stored in a FIFO memory before being transferred through UDMA interface 380 to a redundant or parity drive 390. The XOR data is therefore stored synchronously with the data segments. In other words, the "on-the-fly" generation of the redundant data pattern "snoops" the disk write process without adding any delay to it.

Referring to Figure 9, a similar diagram illustrates data flow in the read direction. The drives 300 of the array, the corresponding interfaces 320 and the FIFO memories 340 are shown as before. In the disk read direction, the XOR is computed across the data segments read from each data drive and from the redundant drive. The data segments are input through paths 392 to XOR logic 394 to produce an XOR output at 396. If one of the data drives (drive 322 in Figure 9) has failed, the result of the XOR computation at 394 is the original sequence of segments that was stored on the failed drive 322. This sequence of segments is substituted for the data that is no longer available from the failed drive and is stored in buffer 350 along with the other data. The substitution can be made by an appropriate adjustment of the data path. This data reconstruction does not delay the data transfer to the buffer, as explained more fully in the applicant's earlier patent.

Figure 10 is a timing diagram illustrating the FIFO-related signals in the disk read direction according to the invention. As the figure indicates, each drive has a different read access time. Once a drive has the target data in its local buffer, it asserts DMARQ (a DMA request). Upon receiving DMACK, it begins transferring data into its FIFO. In the figure, drive 0 happens to finish first and begins transferring data. It is followed by drive 2 and the remaining drives, with drive 3 last in this example. Once the last drive begins writing its FIFO, all four FIFOs are not empty, which allows data to be removed synchronously from all four FIFOs. The read strobes are shown as separate traces to emphasize that they are in fact synchronous.

In the prior art, data protection through stored redundant data has itself been a major part of the problem it attempts to solve. For a disk read, many controllers must wait until the data has been collected in the buffer. At that point the data is read from the buffer, the XOR is computed, and the result is put back. Given that the buffer is already accessed by both the host and the disks, the access for the purpose of computing the XOR is a third access, which adds 50% to the buffer bandwidth requirement. The read/modify/write operations required of a local processor to perform this task are too slow, so specialized hardware has been designed for the purpose. The time needed to compute the XOR is reduced, but a third pass over the data in the buffer is still required.

In many implementations, new data is written to disk immediately.

The write to the parity drive must be postponed until the XOR computation has completed. These postponed writes accumulate, and the parity drive becomes a bottleneck for write operations. Many designs attempt to solve this problem by distributing the parity across all drives of the array, as in RAID5. Another approach used in the prior art attempts to compute the redundant data as the data is transferred from the host or to the drives. Because these transfers occur at different times, the "accumulator" for intermediate results is a complete sector or more. This avoids the need for additional buffer accesses, but at the cost of significantly greater complexity.

As described above, the present invention does not require the 50% additional buffer bandwidth for XOR computation accesses, buffer space to store redundant data, specialized DMA engines to perform read/modify/write operations on the buffer, or dedicated buffers to store intermediate results of the XOR computation.

In one embodiment, the disk controller is used on a computer motherboard. It may also be implemented, for example, as a Host Bus Adapter (HBA) that interfaces to a PCI host bus.

Applications of the Data Path Switch

The applicant's earlier "synchronous redundant data transfer" patents disclose simple striping, in which the data from two or four data drives is simply interleaved. The "array switch" described below (also called a "data path switch") incorporates new features and methods that allow the same striping across three data drives. This is important where only four drives (total) are available. Other uses and advantages are described below.

In the following description of the physical drive ports, mapping registers and data paths through the array switch, it should be noted that other implementations or embodiments may have other numbers of physical ports, for example 8, 12, 16, and so on. Such variations are all within the scope of the invention.

The disclosed array switch includes features that help support RAID5. RAID5 is an optimization for small random accesses, whereas RAIDXL is an optimization for large sequential accesses. RAID5 performance is usually measured in IO operations per second (IOPS) rather than in megabytes per second (MBPS). These features use the XOR hardware that already exists for "on-the-fly XOR", together with a new single-sector buffer, an accumulator and appropriate sequencing, to achieve RAID5 functionality. The main RAID5 functions supported are: FULL STRIPE READ WITH FAILED DRIVE, READ FROM FAILED DRIVE, FULL STRIPE WRITE, and PARTIAL STRIPE UPDATE.

The array switch implements the data path. The configuration of the array switch defines arrays composed of one or more drives. An array of a single drive is a JBOD (Just a Bunch of Drives). RAID0, RAID1, RAIDXL and RAID5 involve multiple drives. In a given instance, the array switch may be performing any one of these functions, and it can support all of them at the same time.

The following definitions are used in this section and in the associated figures:

Array: A subset of the drives attached to a controller. An array may consist of one to four drives.

Logical Drive: The data drives of an array are numbered starting with zero. The logical drives are L0, L1, L2, and L3.

Parity Drive: An array may have a single redundant or parity drive. The parity drive is PAR.

SATA Port: A SATA port provides an interface to one drive and meets the requirements of the Serial ATA specification. The SATA ports are SATA0, SATA1, SATA2, and SATA3.

Physical Drive: A drive takes its identity from the SATA port to which it is attached. The physical drives are Drive 0, Drive 1, Drive 2, and Drive 3.

Sector: A sector is the smallest addressable block of disk data. For present purposes, a sector is 512 bytes.

LBA: The sectors of a drive are identified by a Logical Block Address (LBA). LBAs are assigned from zero up to the largest number needed to address the capacity of the drive.

Striping: In an array with more than one data drive, data is distributed across the data drives of the array. A stripe width is selected. The capacity of each logical drive is treated as a set of blocks of the stripe width. The capacity of the array is mapped to the first of these blocks on each logical drive of the array, in logical drive order, then to the second blocks, and so on. For RAID0 or RAID5, the stripe width is a "power of two" number of sectors. For RAIDXL, the stripe width is one DWORD.

RAID1: In this mode there is a single logical drive and a PAR drive, which holds a mirror image of the contents of the logical drive. In this mode the pair of drives appears to the host system as a single drive.

RAIDXL: This mapping uses DWORD interleave, with or without a parity drive. In this mode the drives of the array, including the parity drive, appear to the host system as a single drive.

Data mapping diagrams: notes. In Figures 15A to 17:

The drive numbers shown are the "logical" drives of an array.
All values are sequence numbers of blocks of the stripe width.
[+] indicates the XOR of the blocks listed.
The notations (n), (n+1), (n+2) include portions of these blocks.
[n] indicates the n-th relative segment of the logical drive data.

In the data mapping diagrams, the first row labels the axes. The mapping is shown for two-, three- and four-drive arrays. Striping simply distributes the data across the available drives. A stripe size is selected, for example 16K bytes of user data. For a two-drive array, the first 16K bytes of user data go to the first drive, the next 16K bytes go to the second drive, the third 16K bytes go back to the first drive, and so on until the drives are full. The process for three and four data drives is similar.

Figure 15B shows the data mapping for RAID1. Whatever the stripe size, the second drive is a copy of the first. (Applicant hesitates to use the term "mirror"; the second drive is simply a copy.) Further figures show the data mapping for RAIDXL without redundancy and for RAIDXL with redundancy.

In RAIDXL, user data is striped DWORD wide, that is, 32 bits. (Note: a mode with 16-bit striping may be provided as a migration path for storage controller data.) For a two-drive array, the first two sectors of user data are stored in the first sector of the pair of drives. Drive 0 holds the even words of user data sectors 0 and 1, as indicated by the label 0,1[0]. Drive 1 holds the odd words of user data sectors 0 and 1, as indicated by the label 0,1[1]. The parity drive holds the XOR of these first sectors, as indicated by the label 0,1[+]. In this notation, 8,9,10,11[3] means every fourth word of user data sectors 8 through 11.

Array Switch:

The array switch has four sets of registers, one for each SATA port. These registers are used to define individual drives or arrays. Each register set has Mapping, Burst Length, Fast Read, and Command registers. See Figure 18 and the various configurations detailed below. In a preferred embodiment, a host software driver loads the mapping registers to configure the array switch. Thereafter, by way of illustration and not limitation, the systems shown in Figures 11 and 12 carry out disk access operations according to the current configuration and provide improved RAID functionality, as described below.

The mapping field has one bit field for each physical port. The field indicates whether the drive attached to the corresponding physical port is used by the array defined by that register set. If the drive is used, the field indicates whether it is the parity drive or a data drive. For a data drive, it indicates the logical drive number and whether the drive has failed.

The burst length is determined essentially by the number of logical data drives and the length, in sectors, of the burst to be transferred contiguously.

The Fast Read flag indicates, for a redundant array with no failed drive, whether the parity data is read and checked on a read. If it is not, the read is faster because of reduced rotational latency, hence "fast read".

The Command register is loaded with one of the array switch basic actions, which are defined below.

JBOD (Just a Bunch of Drives): See Figure 18A.

This is the default content of the array switch following a power-on reset. Each SATA port/physical drive is an independent single-drive array that uses the corresponding DMA channel of the host interface for data transfers. The software driver issues commands to the SATA port. When requested by the SATA port, the array switch uses the DMA channel to transfer single-sector packets between the SATA port and a buffer on the PCI bus.

RAID1: See Figure 18B.

This configuration always involves two drives: one data drive and one parity drive. The mapping shows that logical drive 0 is attached to physical port 0 and that the parity drive is attached to physical port 1. The parity drive maintains an exact copy of the data drive. In RAID1, an array of two drives appears to the host as a single drive. On a disk write, the array switch logic causes the write command to be broadcast to both drives, and the same data is written to both. On a "fast read", the command is issued only to the data drive and only the data drive is accessed. On a "slow read", the command is broadcast to both drives, both drives are accessed, and the content of the parity (PAR) drive is checked against the content of the data drive. The gray stripe in the second row indicates that this register set is unavailable, because its port is already used by the array defined at port 0.

RAID1 Fast Read: See Figure 18C.

This configuration is similar to the previous one. Two drives are involved and their contents are always the same. An array is defined in another register set, and in the fourth register set L0 is on port 3. As noted above, a "fast read" accesses only the data drive. By defining arrays whose L0 is on two different drives, different parts of the data can be accessed at the same time. For writes, the software driver must ensure that both drives take part in the write and that no simultaneous read is attempted.

In a RAIDXL array, data is striped DWORD wide across the data drives of the array. The first DWORD is stored on the first logical drive, the next DWORD on the next logical drive, and so on until each logical drive has received one DWORD. The bitwise XOR of the DWORDs on the data drives is computed, and the resulting DWORD is stored on the parity drive (if present). This is repeated across the entire disk. The user data of a given sector is therefore distributed across every data drive of the array. To access a given sector, all drives of the array must be accessed.

The smallest addressable block of data on the drives that make up the array is also a sector, so any physical access involves at least one sector from each data drive. This gives a minimum transfer length of N sectors, where N is the number of data drives. Because all segments of a given stripe are stored at the same relative location on each drive, the commands needed to access these segments are identical. This allows the command for any access to be broadcast to all drives, so that the array appears to the host as a single drive. On a given access, any of the drives may report an error. Any such error must be resolved by accessing the particular drive that exhibited it.

If a parity drive is present, then on a read the parity drive is read along with the data drives. The XOR of the data read is recomputed and compared with the data from the parity drive. A mismatch indicates an error. This check does not add to the transfer time, but accessing an additional drive increases the average rotational latency of the access. The option to check the parity data can be declined by selecting "fast read". In either case, the parity drive is always accessed on writes.

The benefit of the parity drive is that it allows the array to continue operating even if one of the drives has failed. Failure of the parity drive is the trivial case: the array can simply be reconfigured as one without parity. If one of the data drives fails, the array switch is reconfigured to indicate the position of the failed drive. The indicated drive does not receive any broadcast commands or take part in any data transfers. Whatever the state of "fast read", the parity drive is then accessed on both reads and writes. On a read, the XOR of all remaining data drives and the parity drive is computed. The result of this computation is equivalent to the data that was stored on the failed drive. This data is inserted into the data stream in place of the data that would have been read from the failed drive. On a write, the parity drive receives parity computed from all of the write data, even though the data destined for the failed drive is discarded.

The driver software receives access requests from the operating system. Each request is extended at either end by one or two sectors, as needed, to reach a stripe boundary. A command is then built by dividing the resulting LBA address and count by the number of data drives. Note that the extended command always divides evenly. If there are three data drives, the LBA and count for a given drive are only 1/3 of the user's LBA and count, since each drive stores only 1/3 of the data. The array switch merges the data streams from the array into a single stream for transfer to or from the user's buffer.

The driver software must deal with the fact that the request has been extended and that the total extended count will be transferred. For a read, the driver builds a scatter list that directs any sectors appended to the front or back of the requested data to a discard buffer. The requested data is transferred directly. For a write, the sectors appended at either end require a read/modify/write operation. For a read/modify/write operation, the driver first reads the target stripes, including the extension sectors, into a stripe buffer. It then builds a gather list that picks up the extension sectors (as needed) from the stripe buffer and the user's data directly. The XOR operation can be performed "on-the-fly" using the XOR accumulator FIFO hardware, as explained more fully in the published patents listed above.

RAIDXL-2 Drives-No Parity: See Figure 19A.

-…h 取小的傳輪長度係對於㈣A -之母個貧料磁碟機為—個磁…數為 係傳播至其二者。來自各璋之”二入至貝體璋0之命令 — 曰合年之貝枓傳輸請 —埠為備妥以傳輸資料。資料係運用其對^而直到 存器組之單一個dma α方、陣列開關暫 ΜΑ通道而傳輪。灰色條紋係“ 1 32 200535609 體埠1為由目前陣列所運用。-... h The smaller length of the transmission wheel is the magnetic disk drive for the parent of ㈣A--a magnetic ... The number is transmitted to both. The order from the “two-in-ones” to the shell body 0—the transmission of all-in-one shells—please prepare for the transmission of data. The data is used by its counterparts until a single dma α, The array switch temporarily passes the MA channel. The gray stripes are "1 32 200535609 Body Port 1 is used by the current array.
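The request extension described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the driver's actual code: the helper name `extend_to_row` and its return layout are hypothetical, and it assumes user sector k is stored on data drive k mod N.

```python
def extend_to_row(user_lba, user_count, n_data):
    """Extend a user request to whole rows of n_data sectors (hypothetical helper).

    Returns (drive_lba, drive_count, lead_pad, trail_pad):
      drive_lba   - LBA broadcast to every drive (user LBA divided by n_data)
      drive_count - number of sectors transferred per drive
      lead_pad    - extra sectors before the user data (discarded on a read)
      trail_pad   - extra sectors after the user data (discarded on a read)
    """
    if user_count <= 0 or n_data <= 0:
        raise ValueError("count and number of data drives must be positive")
    lead_pad = user_lba % n_data              # sectors back to the row boundary
    end = user_lba + user_count               # first sector after the request
    trail_pad = (-end) % n_data               # sectors forward to the row boundary
    drive_lba = user_lba // n_data
    drive_count = (lead_pad + user_count + trail_pad) // n_data
    return drive_lba, drive_count, lead_pad, trail_pad
```

On a read, `lead_pad` and `trail_pad` would size the discard-buffer entries of the scatter list; on a write, nonzero padding signals that a read/modify/write is needed.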

RAIDXL-3 Drives-No Parity: Refer to Figure 19B

The figure shows logical drive 0 on physical port 0, logical drive 1 on physical port 1, and logical drive 2 on physical port 2. The minimum transfer length is one sector for each data drive, for a total of three. Data transfer requests from each port are ignored until all three ports are ready to transfer data. Data is transferred using the single DMA channel that corresponds to the array switch register set. The gray stripes indicate that physical ports 0, 1, and 2 are used by the current array.

The referenced Synchronous Redundant Data Transfer patents disclose simple striping in which the number of data drives is a power of two and the data from the drives is simply interleaved to re-create the user data. The present invention introduces logic that extends the concept to arrays in which the number of data drives need not be a power of two. To illustrate, consider an array with three data drives. The data path to each physical drive port is a DWORD and the data path to the host system interface is a QWORD. The minimum access for this array is three sectors, a single sector on each of the three data drives. The hardware reads the physical drive data ports synchronously, twice, producing a total of six DWORDs. Two of the three DWORDs from the first read are sent to the host system while the third is held in a register. After the second read the hardware holds a total of four DWORDs, which are sent to the host in two cycles: the first cycle carries the DWORD from the register and the first DWORD of the second read, and the next cycle carries the remaining DWORDs of the second read. This process repeats for the remainder of the sectors.

RAIDXL-4 Drives-No Parity: Refer to Figure 19C

The figure shows logical drive 0 on physical port 0, logical drive 1 on physical port 1, logical drive 2 on physical port 2, and logical drive 3 on physical port 3. The minimum transfer length is one sector for each data drive, for a total of four. Commands written to physical port 0 are broadcast to all of the ports. Data transfer requests from each port are ignored until all four ports are ready to transfer data. Data is transferred using the single DMA channel that corresponds to the array switch register set. The gray stripes indicate that physical ports 0, 1, 2, and 3 are used by the current array.
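The three-drive repacking described for RAIDXL-3 can be modeled in a few lines. This is only a software sketch of the idea, assuming one DWORD per drive per synchronous read and a host path that takes two DWORDs per cycle; the function name `repack_dwords` and the (first, second) ordering within each emitted pair are assumptions made for illustration.

```python
def repack_dwords(drive_streams):
    """Interleave one DWORD per drive per read and emit QWORD-sized pairs.

    drive_streams: list of equal-length lists of DWORD values, one list per data drive.
    Returns a list of (first, second) DWORD pairs in user-data order.
    """
    if len({len(s) for s in drive_streams}) != 1:
        raise ValueError("all drives must supply the same number of DWORDs")
    held = []                                  # models the holding register
    qwords = []
    for dwords in zip(*drive_streams):         # one synchronous read of every port
        held.extend(dwords)
        while len(held) >= 2:                  # send each complete pair to the host
            qwords.append((held.pop(0), held.pop(0)))
    if held:
        raise ValueError("total DWORD count must be even")
    return qwords
```

With three drives, the first read leaves one DWORD held and the second read drains four DWORDs in two host cycles, matching the sequence in the text.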

RAIDXL-3 Drives-Parity: Refer to Figure 19D

The figure shows logical drive 0 on physical port 0, logical drive 1 on physical port 1, and the parity drive on physical port 2. The minimum transfer length is one sector for each data drive, for a total of two. For writes or "slow reads", commands written to physical port 0 are broadcast to all three ports. For "fast reads", commands are broadcast to the data ports only. Data transfer requests from any port are ignored until all accessed ports are ready to transfer data. Data is transferred using the corresponding DMA channel. The gray stripes indicate that physical ports 0, 1, and 2 are used by the current array.

RAIDXL-4 Drives-Parity: Refer to Figure 19E

The figure shows logical drive 0 on physical port 0, logical drive 1 on physical port 1, logical drive 2 on physical port 2, and the parity drive on physical port 3. The minimum transfer length is one sector for each data drive, for a total of three. For writes or "slow reads", commands written to physical port 0 are broadcast to all of the ports. For "fast reads", commands are broadcast to the data ports only. Data transfer requests from any port are ignored until all accessed ports are ready to transfer data. Data is transferred using the corresponding DMA channel. The gray stripes indicate that physical ports 0, 1, 2, and 3 are used by the current array.

RAIDXL-3 Drives-Parity-Drive Failed: Refer to Figure 20A

In this example, the column of the figure corresponding to physical port 0 has been modified to show that the drive has failed. This drive stored the data for logical drive 0 of the current array. With this configuration, physical port 0 is no longer accessed for reads or writes, while the parity drive is accessed for both. On a read, the data for the failed drive is re-created by computing the XOR of the data from all remaining drives of the array. On a write, the XOR of the data written to all of the logical drives is computed and stored on the parity drive, even though the data intended for the failed drive is discarded.

RAIDXL-4 Drives-Parity-Drive Failed: Refer to Figure 20B

In this example, the column of the figure corresponding to physical port 1 has been modified to show that the drive has failed. This drive stored the data for logical drive 1 of the current array. With this configuration, physical port 1 is no longer accessed for reads or writes, while the parity drive is accessed for both. On a read, the data for the failed drive is re-created by computing the XOR of the data from all remaining drives of the array. On a write, the XOR of the data written to all of the logical drives is computed and stored on the parity drive, even though the data intended for the failed drive is discarded.

RAID0 (General Information):

In RAID0, the stripe width is a power-of-two number of sectors. For example, with a stripe width of 32 sectors, the first 32 sectors of user data are stored on logical drive 0, the next 32 sectors on logical drive 1, and so on until 32 sectors have been stored on each logical data drive. This process repeats over the capacity of the drives. RAID0 functionality is implemented entirely in software using the SATA ports and drives. The present system does not provide any hardware support for RAID0. On receiving an access request from the operating system, the software driver decomposes the request into a sequence of accesses to the drives. This includes locating the user data, error handling, and reporting completion.

RAID5 (General Information):

RAID5 differs from RAID0 in two respects. First, there is a parity drive: the bitwise XOR of the stripe information stored on the data drives is computed and stored in the corresponding stripe of the parity drive. Second, the logical-to-physical alignment of the data and parity drives rotates between stripes, which distributes parity across all the drives of the array. A normal read access of a RAID5 array (one that does not involve the parity drive) is the same as a read access of a RAID0 array, except that the software driver must allow for the rotation of the logical-to-physical drive assignment between stripes. A write operation, or a read with a failed drive, requires access to the parity drive and is more complex. The present system includes hardware features that assist the software driver with these operations.

In RAIDXL, a continuous stream of user data is interleaved among the data drives. For this reason it is convenient for any access to use a single DMA channel, transferring data between a single user buffer and the array. In RAID5, the stripe segment stored on a given data drive may be multiple kilobytes. Striping therefore requires data transfers between the drives and data buffers that lie multiple kilobytes apart, so a DMA channel is used for each data drive. The segments could be transferred sequentially and share a single DMA channel, but this would be restrictive and would imply a large amount of buffering in order to perform the XOR computation. Moreover, for the JBOD function there is already one DMA channel per drive. The array switch registers must be programmed for each basic RAID5 operation, both because of the rotation between stripes and because the DMA channels are borrowed from their normally assigned use.

RAID5-3 Drives-Full Stripe Read With Drive Failed: Refer to Figure 21A

In this example a stripe is accessed in which logical drive 0 is assigned to SATA 0 and the parity drive is assigned to SATA 3. Logical drive 1 is assigned to SATA 1, but this drive has failed. Since the data format is striped across two drives, the minimum transfer length is still two, even though one of the data drives has failed. A DMA channel must be programmed for each logical drive, including channel 1, whose drive has failed. What must happen here is that the segment of the stripe stored on drive 0 is transferred to the buffer indicated by DMA channel 0. The segment stored on drive 0 must also be XORed with the segment stored on the parity drive to re-create the data that was, or would have been, stored on drive 1. The result must be sent to the buffer indicated by DMA channel 1.

Although such accesses are independent in RAID0, here they must be synchronized in order to accomplish the XOR computation. The array switch has a single sector buffer in which the output of the XOR logic can be stored. The command is written to physical port 0 and is broadcast to ports 0 and 3. The array switch waits until all accessed drives are ready to transfer at least one sector. It then transfers one sector from the SATA port of the first non-failed logical data drive to the host, using that drive's DMA channel. The XOR logic monitors the transfer and captures a copy of the sector in its XOR buffer. In sequence, one sector is then transferred from each of the remaining non-failed data drives, and the XOR of each new sector with the current contents of the buffer is accumulated. Finally, one sector is taken from the parity drive, XORed with the contents of the buffer, and the result is sent to the host using the DMA channel of the failed drive.

At this point, one sector has been obtained from each active port and one sector has been sent to each host buffer. This process repeats until all of the data has been transferred. If the driver had to extend the user request to produce a full-stripe request, the extra sectors can be discarded using the scatter mechanism, as in the RAIDXL case.

RAID5-3 Drives-Read Failed Drive: Refer to Figure 21B

In this example, a segment of a stripe that was stored on the failed drive is accessed. As in the previous example, logical drive 0 is assigned to SATA 0 and the parity drive is assigned to SATA 3. Only the DMA channel for the failed drive is programmed, since this is the only data that will be transferred. As in the previous example, all remaining drives of the array, including the parity drive, are accessed. The command is broadcast to all accessed drives. The array switch waits until all accessed drives are requesting a data transfer. One sector of data is then transferred synchronously from each accessed drive. As the data from all accessed drives is transferred synchronously, the XOR of this data is computed "on the fly" and the result is stored in the host buffer indicated by the DMA channel. This process repeats until all of the data has been transferred.

RAID5-4 Drives-Full Stripe Read With Drive Failed: Refer to Figure 21C

In this example a stripe is accessed in which logical drive 0 is assigned to SATA 0, logical drive 1 is assigned to SATA 1, and the parity drive is assigned to SATA 3. Logical drive 2 is assigned to SATA 2, but this drive has failed. Since the data format is striped across three drives, the minimum transfer length is still three, even though one of the data drives has failed. A DMA channel must be programmed for each of the logical drives, including channel 2, whose drive has failed. What must happen here is that the segments of the stripe stored on drives 0 and 1 are transferred to the buffers indicated by DMA channels 0 and 1. The segments stored on logical drives 0 and 1 must also be XORed with the segment stored on the parity drive to re-create the data that was, or would have been, stored on drive 2. The result must be sent to the buffer indicated by DMA channel 2.

Although such accesses are independent in RAID0, here they must be synchronized in order to accomplish the XOR computation. The array switch has a single sector buffer in which the output of the XOR logic can be stored. The command is written to physical port 0 and is broadcast to ports 0, 1, and 3. The array switch waits until all accessed drives are ready to transfer at least one sector. It then transfers one sector from the SATA port of the first non-failed logical data drive to the host, using that drive's DMA channel. The XOR logic monitors the transfer and captures a copy of the sector in its XOR buffer. In sequence, one sector is then transferred from each of the remaining non-failed data drives, and the XOR of each new sector with the current contents of the buffer is accumulated. Finally, one sector is taken from the parity drive, XORed with the contents of the buffer, and the result is sent to the host using the DMA channel of the failed drive.

At this point, one sector has been obtained from each active port and one sector has been sent to each host buffer. This process repeats until all of the data has been transferred. If the driver had to extend the user request to produce a full-stripe request, the extra sectors can be discarded using the scatter mechanism, as in the RAIDXL case.

RAID5-4 Drives-Read Failed Drive: Refer to Figure 21D

In this example, a segment of a stripe that was stored on the failed drive is accessed. As in the previous example, logical drive 0 is assigned to SATA 0, logical drive 1 is assigned to SATA 1, and the parity drive is assigned to SATA 3. Only the DMA channel for the failed drive (channel 2 in this case) is programmed, since this is the only data that will be transferred. As in the previous example, all remaining drives of the array, including the parity drive, are accessed. The command is broadcast to all accessed drives. The array switch waits until all accessed drives are requesting a data transfer. One sector of data is then transferred synchronously from each accessed drive. As the data from all accessed drives is transferred synchronously, the XOR of this data is computed "on the fly" and the result is stored in the host buffer indicated by the DMA channel. This process repeats until all of the data has been transferred.
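The degraded reads described above all rest on the same XOR identity, which the following sketch illustrates in software. It is a simplified model rather than the array switch logic itself: the function names are hypothetical, and a Python variable stands in for the single-sector XOR buffer.

```python
def xor_sectors(a, b):
    """Bytewise XOR of two equal-length sectors."""
    if len(a) != len(b):
        raise ValueError("sector lengths differ")
    return bytes(x ^ y for x, y in zip(a, b))


def degraded_stripe_read(data_sectors, parity_sector, failed):
    """Return every data sector of a stripe, rebuilding the one at index `failed`.

    data_sectors: list of bytes objects; the entry at `failed` is never read
                  and may be None.
    """
    xor_buffer = None                          # models the single-sector XOR buffer
    out = list(data_sectors)
    for i, sector in enumerate(data_sectors):
        if i == failed:
            continue                           # the failed drive is not accessed
        # Capture the first sector, then accumulate each later one.
        xor_buffer = sector if xor_buffer is None else xor_sectors(xor_buffer, sector)
    # Last step: fold in the parity sector to recover the missing data.
    if xor_buffer is None:
        out[failed] = parity_sector
    else:
        out[failed] = xor_sectors(xor_buffer, parity_sector)
    return out
```

The surviving sectors pass through unchanged, as they do on their own DMA channels, while the rebuilt sector takes the place the failed drive's channel would have filled.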

RAID5-3 Drives-Full Stripe Write: Refer to Figure 22A

Referring to Figure 22, in this example logical drive 0 is assigned to SATA 0 and logical drive 1 is assigned to SATA 1. The parity drive is on SATA 2. Since the data format is striped across two drives, the minimum transfer length is two. A DMA channel must be programmed for each logical drive. The command is written to SATA 0 and is broadcast to SATA 0, SATA 1, and SATA 2. The array switch waits until all active SATA ports are ready to receive data. It then transfers one sector to logical drive 0 on SATA 0, using the buffer indicated by DMA channel 0. This transfer is monitored by the array switch, and a copy of the sector is captured in the XOR buffer. One sector is then transferred in sequence for each of the remaining data drives. As each sector passes, its XOR with the current contents of the buffer is accumulated. As the final sector is stored on its data drive, the result of the XOR computation is simultaneously stored on the parity drive. At this point, one sector has been taken from each host buffer and one sector has been written to each data drive and to the parity drive. The process repeats until all of the data has been written. As with RAIDXL, a user-level access may involve less than a full stripe. The user request may be extended to the stripe boundaries. Such extensions require read/modify/write operations, as previously described.

RAID5-4 Drives-Full Stripe Write: Refer to Figure 22B

Referring again to Figure 22, in this example logical drive 0 is assigned to SATA 0, logical drive 1 is assigned to SATA 1, and logical drive 2 is assigned to SATA 2. The parity drive is on SATA 3. Since the data format is striped across three drives, the minimum transfer length is three. A DMA channel must be programmed for each logical drive. The command is written to SATA 0 and is broadcast to SATA 0, SATA 1, SATA 2, and SATA 3. The array switch waits until all active SATA ports are ready to receive data. It then transfers one sector to logical drive 0 on SATA 0, using the buffer indicated by DMA channel 0. This transfer is monitored by the array switch, and a copy of the sector is captured in the XOR buffer. One sector is then transferred in sequence for each of the remaining data drives. As each sector passes, its XOR with the current contents of the buffer is accumulated. As the final sector is stored on its data drive, the result of the XOR computation is simultaneously stored on the parity drive. At this point, one sector has been taken from each host buffer and one sector has been written to each data drive and to the parity drive. The process repeats until all of the data has been written. As with RAIDXL, a user-level access may involve less than a full stripe. The user request may be extended to the stripe boundaries. Such extensions require read/modify/write operations, as previously described.
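The parity accumulation during a full stripe write can be sketched as below. This is an illustrative software model only, with a hypothetical function name; in the described system the accumulation happens in the array switch as the sectors pass on their way to the drives.

```python
def full_stripe_write(host_sectors):
    """Return (data_sectors, parity_sector) for one sector per data drive.

    host_sectors: list of equal-length bytes objects, one per logical data drive,
                  in the order they are transferred.
    """
    if not host_sectors:
        raise ValueError("need at least one data sector")
    xor_buffer = bytearray(host_sectors[0])    # copy captured from the first transfer
    for sector in host_sectors[1:]:
        if len(sector) != len(xor_buffer):
            raise ValueError("sector lengths differ")
        for i, byte in enumerate(sector):      # accumulate as each sector passes
            xor_buffer[i] ^= byte
    # After the last data sector, the buffer holds the parity for the stripe.
    return list(host_sectors), bytes(xor_buffer)
```

The data sectors are returned unchanged because the write path only observes them; only the parity sector is newly computed.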

RAID5-Partial Stripe Update: Refer to Figure 22C

Referring again to Figure 22, in RAID5 a user may write a single sector. The parity drive can be updated without a read/modify/write of the entire stripe. Only the data on the target data drive and on the parity drive will change. This leaves the other drives available for simultaneous read accesses, or another pair of drives available for a simultaneous partial stripe update. After the update, the parity drive must contain the XOR of the entire stripe including the new data, but it already holds the XOR of the entire stripe with the current data. The traditional RAID5 approach to this problem is to read the data segment that will be replaced and to read the current parity data. The two are XORed to give the parity of the remainder of the stripe, with the target data segment removed from the result. The new data segment is then written to the array and XORed with the result of the previous computation to produce the XOR of the updated stripe including the new data segment. This is written to the parity drive. The array switch hardware includes features designed to facilitate this process.

As noted above, a partial stripe update involves only one data drive and the parity drive. In the example of the figure, the data drive is taken to be logical drive 0 attached to SATA 0, and the parity drive is assigned to SATA 3. Since there is only one data drive, the minimum transfer length is one. The command is written to SATA 0 and is broadcast to SATA 0 and SATA 3. During the read phase, the array switch waits until both SATA ports are ready to transfer data. The data is then transferred with the XOR computed "on the fly", and the result is stored in the host buffer indicated by the DMA channel. This result gives the software driver the parity of the current stripe with the data that is about to be updated already removed. This accomplishes the two read operations and the first XOR computation of the traditional RAID5 read/modify/write in a single action.

Referring to Figure 22D, during the write phase the software driver programs DMA channel 0 with the address of the buffer holding the new data and DMA channel 3 with the address of the buffer holding the parity computation just completed. The commands are written to physical port 0 and are broadcast to SATA 0 and SATA 3. The array switch waits until both drives are ready to receive data. It then transfers one sector to the SATA port of the data drive using that drive's DMA channel. The XOR logic monitors the transfer, and a copy of the sector is captured in the buffer. One sector is then transferred using the DMA channel of the parity drive's SATA port. As this data is transferred, the array switch computes the XOR of the data with the contents of the buffer and sends the result to the SATA port of the parity drive. At this point, one sector has been read from the buffer of each DMA channel and one sector has been written to each SATA port. This process repeats until all of the data has been transferred.
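The two-phase partial stripe update reduces to two XOR steps, sketched here under simplifying assumptions: the function names are hypothetical, and plain byte strings stand in for the host buffers addressed by the DMA channels.

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equal-length sectors."""
    if len(a) != len(b):
        raise ValueError("sector lengths differ")
    return bytes(x ^ y for x, y in zip(a, b))


def partial_stripe_update(old_data, old_parity, new_data):
    """Return the new parity sector for a one-drive update of a stripe."""
    # Read phase: old data XOR old parity leaves the parity of the
    # drives that are not being touched.
    rest_parity = xor_bytes(old_data, old_parity)
    # Write phase: folding in the new data yields the parity of the
    # updated stripe, which goes to the parity drive.
    return xor_bytes(rest_parity, new_data)
```

Only the target data drive and the parity drive are read and written; the other data drives never enter the computation, which is why they remain free for other accesses.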

A copy of the sector is captured in the buffer. Next, the DMA channel of the parity drive's SATA port is used to transfer one sector. As the data is transferred, the array switch computes the XOR of this data and the contents of the buffer, and sends the result to the parity drive's SATA port. At this point, one sector has been read from the buffer of each DMA channel and written to each SATA port. This process repeats until all of the data has been transferred.

The foregoing description shows a mapping register organized in logical port order. A preferred solution, presented now, may be to organize the mapping register in physical port order. The register then has one field for each physical port, and the entry in that field identifies the associated logical drive (or indicates that the physical port is not used in the array).

III. Mapping Register and Data Path Switch Logic

Mapping Register

A typical RAID controller for a small computer system includes an interface to a host system and an interface to a disk drive array. FIG. 23 is a simplified block diagram of a disk array controller 10 having a host interface 16 for interaction with a host bus 12, and a drive interface 22 for interaction with a plurality of attached disk drives. The controller preferably includes a control processor 20 and a buffer memory 18 for the temporary storage of data moving between the host bus and the drives.

A physical port is required to attach a mass storage device (such as a disk drive) to a system. Although some interfaces can support concurrent data transfers to multiple devices, the physical ports tend to become a bottleneck. For this reason, a high-performance RAID controller may have one physical port per mass storage device, as shown in FIG. 24A. FIG. 24A also shows the corresponding contents of a mapping register 24, described further below.

One of the performance benefits of RAID comes from striping data across the drives of the array. For example, reading data from four drives at once yields a fourfold improvement over the transfer rate of a single drive. In the example of FIG. 24A, the sixteen-bit data arriving from each of the four drives is merged, in logical drive order, into sixty-four-bit data that is sent to the buffer (18 in FIG. 23). The user data is striped; that is, it is distributed one segment at a time (for example, a 16-bit word) across an array of drives in a predetermined sequence. That sequence starts with logical drive #0 and proceeds through logical drive #n-1, where n is the number of drives in the array. The striping sequence repeats, so that the k-th segment of the user data corresponds to logical drive (k mod n). In this way, the logical drive numbers reflect the striping order. In the figure, the stack of four "logical ports" simply indicates an ordered set of four segments of a stripe. Each "logical port" corresponds to a single segment of the stripe, and the whole stack corresponds to an ordered set of four segments.
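The striping order and the merging of drive data described above can be illustrated with a short sketch. This is only an illustration under stated assumptions: the function names `stripe_segments` and `merge_words` are hypothetical, and placing logical drive #0 in the low-order bits of the merged word is an assumed convention that the text does not specify.

```python
def stripe_segments(segments, n_drives):
    """Distribute segments so that segment k lands on logical drive (k mod n)."""
    drives = [[] for _ in range(n_drives)]
    for k, segment in enumerate(segments):
        drives[k % n_drives].append(segment)
    return drives


def merge_words(words16):
    """Merge one 16-bit word per logical drive into a single wide word.

    Assumption: logical drive #0 occupies the low-order 16 bits.
    """
    wide = 0
    for drive, word in enumerate(words16):
        wide |= (word & 0xFFFF) << (16 * drive)
    return wide


# Eight segments striped over four logical drives.
layout = stripe_segments(list(range(8)), 4)

# Four 16-bit words, one per logical drive, merged into one 64-bit word.
word64 = merge_words([0x1111, 0x2222, 0x3333, 0x4444])
```

With four drives, segments 0 and 4 share logical drive #0, segments 1 and 5 share logical drive #1, and so on, which is the repetition of the striping sequence that the text describes.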

A 100 MBPS transfer rate from each of the drives becomes a 400 MBPS transfer rate to the buffer. The dashed box 26 conceptually represents a data path switch, described later in detail. The data path switch 26 provides dynamically configurable data paths between the logical data ports and the physical data ports.

FIG. 24A, with its direct connections between logical data ports and physical data ports, is only a conceptual diagram. In real applications, the number of available physical ports will be greater than the number of logical data ports. There may be ports reserved as "hot spares," or the physical ports may be grouped into different subsets that are accessed independently as arrays. FIG. 24B shows one of the possible assignments of four logical data ports (logical port #0 through logical port #3) to the five available physical data ports (physical port #0 through physical port #4). For example, the large arrow 30 simply indicates the assignment of logical port #1 to physical port #2. FIG. 24B also shows the corresponding contents of the mapping register 24. Here, the second field from the right in the register corresponds to logical port #1, and it contains the value "2," indicating physical port #2, as shown by the arrow 30. The data path switch 26 implements the logical-to-physical port assignments, as fully described below.

FIG. 24C shows an example of a two-drive array, in which each of the logical data ports is assigned to one of the two available physical data ports, namely physical port #1 and physical port #2. To assemble a 64-bit word for the buffer, each of the two 16-bit drives must be read twice. On the first read, the data for logical ports #0 and #1 is obtained from the two physical ports. On the second read, logical ports #2 and #3 obtain their data from the same two physical ports. These operations are orchestrated by the processor 20. Again, the mapping register shows the assignments to physical ports #1 and #2.

FIG. 24D shows an example of an array with a single drive connected to physical port #3. In this configuration, the data for logical ports #0 through #3 is obtained by reading the same physical port four times.

One feature of the synchronous data transfers described in U.S. Pat. No. 6,018,778 is that they allow redundant data to be processed "on the fly," as described in U.S. Pat. No. 6,237,052. FIG. 25A shows the four-drive array of FIG. 24A with the addition of logic 36 that computes a redundant data pattern, which is stored on the drive attached to physical port #4. Although various arithmetic and logical operations might be used to produce a redundant pattern, the logical XOR between corresponding bits of the data from the logical data ports has the advantage, over an arithmetic operation, that the XOR does not have to propagate a carry. Because the XOR is used, the fifth drive may be called either the "redundant" drive or the "parity" drive.

The 16-bit-wide bus XOR shown in the figure is equivalent to sixteen XOR gates, each with four inputs. The use of the XOR function is also very symmetrical between the disk read and disk write operations, as can be seen in FIG. 25B. FIG. 25B shows the same four-drive array as FIG. 25A, with the data paths 40, 42, etc. shown for the disk read direction. In this case, the drive attached to physical port #2 has failed. Accordingly, the corresponding data path 44, which does not function, is shown in dashed lines. The XOR function is computed across the data from the remaining data drives (physical ports #0, #1, and #3) and from the redundant drive (physical port #4). This computation reconstructs the data that was stored on the failed drive, and the result is directed to logical port #2 via a data path, in place of the now unavailable data from the failed drive.

The preceding paragraphs demonstrate some examples of the various relationships that might exist between a set of logical ports and a set of physical device ports in a RAID controller. In general, a high-performance RAID controller is forced to deal with multiple arrays made up of various subsets of the mass storage devices attached to its physical ports. One aspect of the present invention employs a novel mapping register and associated logic to enable software configuration of storage device arrays, and to improve performance, as further explained below.

In accordance with one embodiment of the invention, a mapping register 24, the structure of which is shown in FIG. 26, controls the configuration of the data paths between the logical and physical data ports. (The mapping register also provides other features and advantages discussed later.) In this embodiment, the mapping register consists of five fields, one for each of five logical data ports, L0 through L4 in this example. The field for each logical data port is loaded with the number of the physical data port to which that logical port is connected. The data in the field for logical data port 0 is represented symbolically as PP_L0, indicating that it is the physical port associated with logical port 0. The values in the next four fields are identified as PP_L1, PP_L2, PP_L3, and PP_L4, respectively. The fifth logical data port is a pseudo port: the PP_L4 value is used to assign a physical data port to the parity drive.

The mapping register fields can be of almost any size. An eight-bit field, for example, would support an array of up to 256 physical ports. In the illustrated embodiment, with only five physical ports, a three-bit field is sufficient. The five fields pack neatly into a sixteen-bit register, with one bit to spare, marked "r" in the figure for "reserved." Any type of non-volatile memory can be used to store the mapping register information.
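The packing of five three-bit fields into one sixteen-bit register can be sketched as follows. This is a minimal illustration, not the register layout of any actual device: the helper names are hypothetical, PP_L0 is assumed to occupy the least significant field (consistent with the description of FIG. 24B, where the second field from the right belongs to logical port #1), and the reserved bit is assumed to be bit 15.

```python
FIELD_BITS = 3                        # three bits per field, as in the text
FIELD_MASK = (1 << FIELD_BITS) - 1    # 0b111
NUM_FIELDS = 5                        # PP_L0 .. PP_L4


def pack_mapping(pp):
    """Pack [PP_L0, ..., PP_L4] into a 16-bit value (bit 15 left reserved)."""
    if len(pp) != NUM_FIELDS:
        raise ValueError("expected five fields")
    reg = 0
    for i, port in enumerate(pp):
        if not 0 <= port <= FIELD_MASK:
            raise ValueError("field value does not fit in three bits")
        reg |= port << (FIELD_BITS * i)
    return reg


def unpack_mapping(reg):
    """Recover [PP_L0, ..., PP_L4] from the packed register value."""
    return [(reg >> (FIELD_BITS * i)) & FIELD_MASK for i in range(NUM_FIELDS)]


# Four data drives on physical ports 0-3, parity drive on physical port 4.
reg = pack_mapping([0, 1, 2, 3, 4])
fields = unpack_mapping(reg)
```

Five fields of three bits use fifteen bits, so the packed value always fits below bit 15, which matches the single spare bit mentioned in the text.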

To demonstrate the use of the mapping register, we briefly revisit each of the configurations described so far. In FIG. 24A, note that a mapping register 24 is shown. The value of PP_L0 is 0, indicating that logical data port #0 is connected to physical port #0. The next three values are 1, 2, and 3, indicating that the next three logical data ports are connected to the next three physical data ports. The value of PP_L4 is 7. This is not a legal physical port number in this example. The value "7" is used to indicate that there is no parity drive in this configuration. The specific value chosen is not critical, as long as it is not an actual physical port number.

Referring again to FIG. 24B, the values stored in the mapping register indicate that physical data ports 1, 2, 4, and 0 support logical ports 0 through 3, respectively. Once again, the "7" indicates that a parity drive is not used.

FIG. 24C shows the mapping register configured for a two-drive array. Note that logical data ports #2 and #3 are associated with the same physical ports as logical ports #0 and #1. The first two logical ports transfer data on the first physical port cycle, and the second two logical ports transfer data on the second physical port cycle.

FIG. 24D shows the mapping register configured for the single-drive case. Logical ports #0 through #3 transfer data on successive cycles to physical port #3. All of the variations of FIG. 24 are different data path configurations, shown independently of the redundant data logic.

FIG. 25A shows the XOR logic in the disk write direction for the same data path configuration as FIG. 24A. The XOR is computed over the data from all four logical data ports. The result is stored on the drive attached to the physical port specified in the logical port #4 field of the mapping register. In this example, PP_L4 has the value "4" rather than "7," indicating that there is a parity drive and that it is attached to port #4.

FIG. 25B shows the XOR logic in the disk read direction for the same data paths as FIGS. 24A and 25A, except that the drive attached to physical port #2 has now failed. The contents of the logical data port #2 field, PP_L2, have been replaced with a "5." The legal physical port numbers are 0 through 4. The "5" is a reserved value that indicates that a drive has failed. Any logical data port that accesses the pseudo physical port number 5 takes its data from the output of the XOR.

Data Path Switch

The preceding discussion has shown that four values loaded into the fields of a mapping register can represent all of the possible configurations between four logical data ports and 1-, 2-, or 4-drive arrays attached to five physical ports, with or without a redundant drive, and, for arrays with a redundant drive, with or without a failed drive. The following describes how the contents of the mapping register are used to configure the hardware blocks and data paths. In other words, the following discussion presents the details of a presently preferred implementation of the data path switch 26 and how it is configured by the mapping register contents.

Referring to FIG. 27A, each of the four logical data ports must be able to receive data from any of the five physical data ports or, in the case of a failed drive, from the disk read XOR. With six possible data sources, each of the logical data ports has a corresponding six-to-one multiplexor 50, sixteen bits wide. The multiplexor 50 for logical port #1 is shown in FIG. 27A, but the others (for logical ports #0, #2, and #3) are identical.

The selector or "S" input of the multiplexor is connected to the logical port #1 field of the mapping register, "PP_L1." PP_L1 values of 0 through 4 select data from physical ports #0 through #4, respectively, while the value "5" selects the output of the disk read XOR.

FIG. 27B shows the disk read XOR logic 52. The disk read XOR logic 52 is a five-input XOR circuit, sixteen bits wide in the preferred embodiment (corresponding to the attached disk drive data paths). (This is equivalent to sixteen XOR gates, each with five inputs.) Each of the five inputs is logically qualified, or "gated," by a corresponding AND gate, for example AND gate 54, which is also sixteen bits wide.

Each sixteen-bit-wide AND gate is equivalent to sixteen AND gates, each with two inputs. The five AND gates are qualified by the corresponding five physical port select signals, PP0_SEL through PP4_SEL. The generation of these signals is described below.

The data path to each physical port may come from any of the four logical data ports or from the disk write XOR. Examples were shown with reference to FIGS. 24A through 24D. Although a field of the mapping register identifies the data source for each logical data port, there is no field that identifies the corresponding data source for each physical port. This information can be derived from the fields that are available. Each of the three-bit, binary-encoded fields of the mapping register is decoded with a "one-of-eight" decoder. FIG. 28 shows such a decoder 66 for the logical port #1 field.
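The read-direction data path of FIGS. 27A and 27B (a six-to-one multiplexor per logical port, plus a gated five-input XOR) can be modeled in a few lines. The sketch below is an illustration only. The function names are hypothetical, and the port select flags are assumed to be true for exactly those physical ports that are named by some field of the mapping register, which is a simplification of the PPn_SEL signals whose generation the text describes separately.

```python
NUM_PHYSICAL = 5     # physical ports #0 .. #4
FAILED_DRIVE = 5     # reserved field value: take data from the disk read XOR


def port_selects(pp):
    """Assumed PPn_SEL flags: port n is selected if any field names it."""
    return [n in pp for n in range(NUM_PHYSICAL)]


def disk_read_xor(physical_data, selects):
    """Five-input, 16-bit-wide XOR; each input is gated by its select flag."""
    out = 0
    for word, selected in zip(physical_data, selects):
        if selected:
            out ^= word & 0xFFFF
    return out


def read_logical_ports(pp, physical_data):
    """Return the data seen by logical ports #0..#3 for mapping fields pp."""
    xor_out = disk_read_xor(physical_data, port_selects(pp))
    result = []
    for field in pp[:4]:
        if field == FAILED_DRIVE:
            result.append(xor_out)                        # reconstructed data
        else:
            result.append(physical_data[field] & 0xFFFF)  # direct path
    return result


# Four data words and their parity; the drive on physical port #2 has failed.
data = [0x1234, 0xABCD, 0x0F0F, 0x5555]
parity = data[0] ^ data[1] ^ data[2] ^ data[3]
physical = [data[0], data[1], 0xFFFF, data[3], parity]   # port #2 returns junk
recovered = read_logical_ports([0, 1, FAILED_DRIVE, 3, 4], physical)
```

Because physical port #2 is no longer named by any field, its input is gated off, and the XOR of the three surviving data words with the parity word yields the missing word for logical port #2.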

The value PP_L1 is decoded into L1_P0, L1_P1, L1_P2, ..., L1_P7, where each name indicates a path from a source to a destination. L1_P2, for example, indicates a path from logical port #1 to physical port #2.

Referring to FIG. 29A, sample circuitry is shown for multiplexing the data paths 70 from the logical data ports to the physical data ports (#0 through #4). The multiplexor 72 for physical port #2 is shown, but the multiplexors for the other four ports are identical. Each multiplexor 72 consists of an AND/OR array having five AND gates 74, each sixteen bits wide, and a corresponding OR gate 76. (Each AND gate is equivalent to sixteen AND gates, each with two inputs. The OR gate is equivalent to sixteen OR gates, each with five inputs.) For the physical port #2 multiplexor, the AND gates from the logical data ports are qualified by the corresponding outputs of the five decoders, i.e., L0_P2, L1_P2, L2_P2, L3_P2, and L4_P2, as shown.

In a two-drive array, a given physical port receives data from two different logical ports, though on different cycles. Referring back to FIG. 28, each decoder 66 has an enable input "EN" that qualifies all of its outputs. For the two-drive configuration, only the decoders for logical data ports #0 and #1 are enabled on the first cycle, and only the decoders for logical data ports #2 and #3 are enabled on the second cycle. For this reason, only one of the AND gates in FIG. 29A is qualified at a time. In other words, only the data from the assigned logical port (according to the mapping register) is input to the corresponding physical port.

In a single-drive array, where a single physical port receives data from all four logical ports (see FIG. 24D), only one decoder 66 is enabled at a time, so that only one AND gate 74 is enabled at a time, selecting a unique data source (logical port). The other open issue is the source of the "PPn_SEL" signals of FIG. 27B. FIG. 28 shows the use of a five-input OR gate 68 that asserts the PPn_SEL signal for a physical port "n" if there is a data path between that physical port and any of the logical ports. This provides an indication that the physical port is active and may participate in the disk read XOR of FIG. 27B.

Global Reads and Writes

In accordance with the ATA/ATAPI specifications, sending commands to the drives requires the use of Programmed I/O (PIO) mode, which may be as slow as 600 ns per access for devices that support only PIO mode 0, and no better than 120 ns per access for devices that support mode 4. A single command requires eight or more accesses. If all of the drives must be commanded sequentially, this time is multiplied by the number of drives and adds considerable latency to the entire process. The commands could be issued concurrently by an independent controller for each port, but this adds considerably to the complexity and cost.

When data is striped across an array of drives, the portions of a given stripe are located at the same relative position on each of the drives. This makes the address of the data, the Logical Block Address (LBA), the same for each of the drives.

As a result, the commands to read a given stripe are identical for all of the drives of the array, and the commands to write a given stripe are identical as well. This makes it possible for the local processor (for example, the processor 20 of FIG. 23) to "broadcast" common commands in no more time than would otherwise be required to send a command to a single drive.

As noted earlier, a drive array may consist of a subset of the attached drives. (One advantage of the present invention is the ability to easily configure, or reconfigure, the organization of the attached drives into defined arrays simply by storing appropriate configuration bytes into the mapping register.) When an array consists of a subset of the attached drives, commands (such as read and write) may be "broadcast" only to the selected subset. Either the drives must be commanded one at a time, or some means must be provided to "mask" the physical data ports that do not participate in the current array. FIG. 30 shows one implementation that addresses this issue.

Referring to FIG. 30, the address, strobe, and chip select signals CS0, CS1, DA0, DA1, DA2, DIOW, and DIOR are shown for the first two of the five physical data ports (P0 and P1). Note that the address and strobe signals are common to all five ports. They are buffered individually so that a failure of a given drive cannot block the propagation of these signals to the other drives; see buffers 80, 82. The output drivers for the two chip select signals CS0# and CS1# of a given port are qualified by the Pn_SEL signal for that port; see gates 84, 86. Any port that is not selected by the current contents of the mapping register has neither of its chip selects asserted, and therefore ignores the read and write strobes.
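The masking effect of the chip selects on a broadcast command can be sketched in software. This is a behavioral illustration only, under several assumptions: the `Drive` class and the register name `"sector_count"` are hypothetical, a port is assumed to be selected whenever some mapping register field names it, and the loop stands in for what the hardware does in a single bus cycle. The timing helper uses the figures quoted in the text (eight accesses per command, 600 ns per access in PIO mode 0).

```python
PIO_MODE0_NS = 600          # per-access time for PIO mode 0, from the text
ACCESSES_PER_COMMAND = 8    # the text says "eight or more"; eight is assumed


class Drive:
    """Hypothetical stand-in for an attached drive; records register writes."""

    def __init__(self):
        self.writes = []

    def write_register(self, reg, value):
        self.writes.append((reg, value))


def selected_ports(pp, num_physical=5):
    """Physical ports whose chip selects are enabled (named by some field)."""
    return sorted({field for field in pp if field < num_physical})


def global_write(drives, pp, reg, value):
    """Model one broadcast write: only selected drives see the strobe."""
    ports = selected_ports(pp, len(drives))
    for n in ports:
        drives[n].write_register(reg, value)
    return ports


def command_time_ns(n_drives, broadcast):
    """Time to command n_drives, one at a time or with a single broadcast."""
    per_command = ACCESSES_PER_COMMAND * PIO_MODE0_NS
    return per_command if broadcast else per_command * n_drives


# Two-drive array on physical ports #1 and #2; no parity drive (value 7).
drives = [Drive() for _ in range(5)]
ports = global_write(drives, [1, 2, 1, 2, 7], "sector_count", 8)
```

Under these assumptions, commanding four drives one at a time would take 19,200 ns, while a single broadcast takes 4,800 ns regardless of the number of drives in the array.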

It may seem that a "global read" makes no sense, since it implies that potentially conflicting data values are returned on a common bus. In the present embodiment, a "global read" causes a read strobe (Pn_DIOR# in FIG. 30) to be "broadcast" to all of the physical data ports.

Those attached storage devices that are qualified by the chip selects (Pn_CS0#, Pn_CS1#) return data to the physical port, where it is latched at the trailing edge of the Pn_DIOR# strobe. No attempt is made to return a data value to the local processor as a result of this read cycle.

The local processor then reads each of the ports, one at a time, using a different address that does not cause a repeat of the Pn_DIOR# strobe cycle and does not change any of the latched data. These cycles allow the local processor to fetch the potentially unique values stored in each of the data latches. The Pn_DIOR# cycle, which may require up to 600 ns, is executed only once. The values latched in each port may be fetched in 15 ns each, a significant time savings over repeating the Pn_DIOR# cycle five times.

The "global read" and "global write" apparatus allows the local processor to send commands to, and receive control status from, the currently selected array in the minimum possible amount of time. When a different sub-array is selected by loading a new value into the mapping register, the control interface updates automatically, with no other code changes.

Status Ordering

The preceding discussion dealt with generating many of the physical port outputs and showed how they are steered by the mapping register. Each of these ports has a number of input signals as well. Once again, associating these signals with logical drives can minimize the software overhead. For example, each drive has an interrupt output that is used to signal a need for service from the controller. FIG. 31 shows the use of a multiplexor 90, controlled by the PP_L0 value from the mapping register, to select the interrupt of the physical port associated with logical data port 0.

Each of the other logical data ports has an identical multiplexor (not shown) that uses the corresponding PP_Ln value to locate its interrupt. In FIG. 31, the buffer 92 takes the selected interrupts from each of the logical data port multiplexors (90, etc.). When the local processor (20 in FIG. 23) reads the interrupt status through this buffer, the interrupts appear in logical data port order, starting with logical data port 0 in the bit 0 position. The same technique can be used to sort both internal and external signals from the physical data ports, including drive cable ID signals and internal FIFO status signals. This feature enables the local firmware to use a common sequence of code for multiple arrays with different numbers of physical ports. Once the interrupt buffer 92 is loaded, the required status bits are always the least significant bits of the "sorted" register for any array selected. The number of bits may be masked down to the actual number of ports.

Interrupts ANY and ALL

As shown in FIG. 31, the selected interrupts from the logical data ports can be logically ANDed (AND gate 94) or ORed (OR gate 96) to provide the signals "Interrupt ALL" and "Interrupt ANY." After the local processor has issued a command, and before any data has been transferred, it may want to know about an interrupt from ANY of the drives, since one or more drives may have rejected the command or had some other error. Once the drives have begun to transfer data, the local processor will want to know when ALL of the drives have asserted their interrupt signals, since this indicates the completion of the command. Note that this type of implementation makes the software independent of the number of drives. (For a two-drive array, the interrupt signal from each device appears twice, while in a single-drive array, the same drive appears four times. The AND and OR signals still function correctly.)
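The sorted interrupt status and the ANY and ALL signals can be illustrated with a small model. This is a sketch under stated assumptions, not the circuit of FIG. 31: the function names are hypothetical, four logical data ports are assumed, and the model covers only arrays in which no field holds the reserved failed-drive value.

```python
def sorted_interrupts(pp, physical_irq, n_logical=4):
    """Build the status word: bit i is the interrupt of the physical port
    that the mapping register assigns to logical data port i."""
    status = 0
    for i in range(n_logical):
        if physical_irq[pp[i]]:
            status |= 1 << i
    return status


def interrupt_any(status):
    """OR of the selected interrupts."""
    return status != 0


def interrupt_all(status, n_logical=4):
    """AND of the selected interrupts."""
    return status == (1 << n_logical) - 1


# Two-drive array on physical ports #1 and #2; each drive appears twice.
pp = [1, 2, 1, 2, 7]
one_pending = sorted_interrupts(pp, [False, True, False, False, False])
both_pending = sorted_interrupts(pp, [False, True, True, False, False])
```

Because each drive of the two-drive array feeds two logical positions, ALL becomes true exactly when both drives have interrupted, with no change to the code that evaluates it.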
Although the bulk of the run-time software takes advantage of the global commands and status described above, there is still a need to access individual devices for initialization and for handling errors within specific devices. For this purpose, each physical data port appears at a unique location within the local processor's address space. When an access to any of these locations is decoded, the decoded output is remapped according to the contents of the mapping register. During initialization, the mapping register is loaded with an "identity" pattern, i.e., logical device 0 points to physical port 0, logical device 1 points to physical port 1, and so on. This makes the physical ports appear in order, starting with the first physical port location in the processor's address space. In normal operation, the mapping register is loaded with a logical-to-physical drive map. If an interrupt is then received from logical port 2, the local processor can access the interrupting drive through the unique address space that accessed physical port 2 when the identity map was loaded. This makes the servicing of logical drives independent of the physical data port to which they are attached.

One hardware implementation of the logical addressing feature is shown in FIG. 32. When the processor accesses the address region for the device port space, the one-of-eight decoder 100 decodes processor address lines five through seven, which define a thirty-two-byte space for each of the devices. The decoding of each space asserts the corresponding port n decode signal, Pn_DEC. The decoding of virtual port number seven is the signal for a global access. The P7_DEC signal is ORed with each of the other decode signals (OR gates 102), so that the resulting port select signals Pn_SEL (n = 0 through 4) are asserted both for a specific access to that port and for a global access.
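The address decode just described can be sketched as follows. This is an illustration only: the function names are hypothetical, and the address is assumed to be an offset within the device port space, so that address lines five through seven directly give the port number.

```python
GLOBAL_PORT = 7     # virtual port number seven signals a global access
NUM_PORTS = 5       # physical ports #0 .. #4


def decode_port(address):
    """Decode address lines 5..7, giving one 32-byte space per port."""
    return (address >> 5) & 0x7


def pn_sel_from_address(address):
    """Pn_SEL flags: asserted for the addressed port, or for every port
    when the access falls in the global (port 7) space."""
    port = decode_port(address)
    return [port == n or port == GLOBAL_PORT for n in range(NUM_PORTS)]


single = pn_sel_from_address(0x40)       # offset 0x40 lies in the port 2 space
broadcast = pn_sel_from_address(0xE0)    # offset 0xE0 lies in the port 7 space
```

Offsets 0x40 through 0x5F all decode to port 2, since the low five address bits only select a register within the thirty-two-byte space.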

Each of the port select signals is then steered by the PP_Ln value from the mapping register. The one-of-eight decoder 104 takes the P2_SEL signal and routes it according to the PP_L2 value from the mapping register, producing a set of signals of the form L2_P0_CS, which indicates a chip select (CS) for physical port zero originating from logical port two. The one-of-eight decoders for the other four logical ports are identical (not shown).

Each physical port has a five-input OR gate; the OR gate 106 for physical port #2 is shown as an example. It ORs together the five different sources of a chip select for physical port #2. Note that for a single-drive sub-array, the chip select is asserted by all four logical devices, and for a dual-drive sub-array, the chip select is asserted by two of the logical devices.
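A minimal sketch of this steering stage is given below. The function name and the example maps are hypothetical; the model takes the port select signals as input, routes each through the mapping register, and ORs the requests per physical port.

```python
from typing import List, Sequence


def steer_chip_selects(port_select: Sequence[bool], mapping: Sequence[int],
                       num_physical: int = 5) -> List[bool]:
    """mapping[n] is the physical port assigned to logical port n (the PP_Ln value)."""
    chip_select = [False] * num_physical
    for logical, selected in enumerate(port_select):
        if selected:
            # Ln_Pm_CS: logical port n requests physical port m = mapping[n];
            # the per-port OR gate merges requests from every logical port.
            chip_select[mapping[logical]] = True
    return chip_select


access_logical_2 = [False, False, True, False, False]
identity = [0, 1, 2, 3, 4]
remapped = [3, 1, 4, 0, 2]  # hypothetical logical-to-physical assignment
```

With the identity map, an access to logical port 2 selects physical port 2; with the hypothetical `remapped` register, the same access selects physical port 4.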
The foregoing description and the drawings illustrate one type of mapping register, which may be called a logical mapping register. As explained, it provides a field for each logical drive of a defined array, and the value in that field indicates the corresponding physical port number. In an alternative embodiment, called a physical mapping, a field is provided for each physical port or drive, and the value in each field indicates the corresponding logical port number. This alternative mapping register is illustrated in the following example.

Suppose data is striped across the drives of an array. Blocks of the stripe width are stored on each of the available drives in a specific sequence, and the process then repeats. For example, the first block of data (as well as the fifth block, the ninth block, and so on) is stored on the drive connected to one physical port. The second block (as well as the sixth block, the tenth block, and so on) is stored on the drive connected to a second physical port. The third block (as well as the seventh block, the eleventh block, and so on) is stored on the drive connected to a third physical port, and the fourth block (as well as the eighth block, the twelfth block, and so on) on the drive connected to a fourth. (The individual physical port numbers are not fully legible in the source.) The first block of data goes on logical drive 0, the second block on logical drive 1, the third block on logical drive 2, and the fourth block on logical drive 3. The two alternative types of mapping register for this example are as follows:

Logical mapping

Logical port #: 2 1
Value (physical port): 4 2

Physical mapping

Physical port #: 4 3 2
Value (logical port): 1 2 1

(The remaining table entries are illegible in the source, and the column alignment of the surviving digits is uncertain.)
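The relationship between the two register types can be shown with a short sketch: a physical mapping is the inverse of a logical mapping. The values in `LOGICAL_MAP` below are hypothetical, chosen only for illustration; they are not taken from the table above, and the function names are likewise assumptions.

```python
from typing import List, Optional, Sequence


def to_physical_mapping(logical_map: Sequence[int], num_physical: int) -> List[Optional[int]]:
    """Invert a logical mapping (index = logical port, value = physical port)."""
    physical_map: List[Optional[int]] = [None] * num_physical
    for logical, physical in enumerate(logical_map):
        if physical_map[physical] is not None:
            raise ValueError(f"physical port {physical} assigned twice")
        physical_map[physical] = logical
    return physical_map


def to_logical_mapping(physical_map: Sequence[Optional[int]]) -> List[int]:
    """Invert a physical mapping (index = physical port, value = logical port or None)."""
    assigned = {logical: physical
                for physical, logical in enumerate(physical_map)
                if logical is not None}
    return [assigned[logical] for logical in range(len(assigned))]


def physical_port_for_block(block_index: int, logical_map: Sequence[int]) -> int:
    """Blocks are dealt to logical drives 0, 1, 2, ... in rotation (0-based index)."""
    return logical_map[block_index % len(logical_map)]


LOGICAL_MAP = [2, 3, 4, 1]  # hypothetical: logical drive 0 on physical port 2, and so on
PHYSICAL_MAP = to_physical_mapping(LOGICAL_MAP, num_physical=5)
```

The inversion is only defined when each physical port is used by at most one logical port, so this sketch rejects maps such as the single-drive sub-array case, where several logical ports share one drive. Unused physical ports are represented by `None`.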

IV. Further improvements to narrow and wide striping, partial stripe updates, and other synchronous operations of a disk drive controller

A. Introduction - UDMA

As the ATA/ATAPI interface described earlier has continued to evolve, the PIO data transfer modes have been superseded by the UltraDMA, or UDMA, transfer modes. PIO mode performance is limited by the round-trip delay of the interface and cable. The UDMA modes use the same electrical interface in a "source synchronous" mode, in which the end of the link that originates the data (port or drive) also provides the strobe for it. With source-synchronous clocking, performance is limited mainly by the skew introduced by a single transit of the cable rather than by any round-trip delay. Although this enhancement raises the data transfer rate to as much as 100 MBPS or more, the synchronous data transfer scheme used with traditional ATA no longer works in the disk read direction, because the drive originates the timing in that direction.

Methods and apparatus that use FIFO memories to adapt the UDMA version of the ATA/ATAPI interface to synchronous data transfers have been described above. The advantages of synchronous data transfers are also discussed above.

B. SATA FIFO

As the ATA/ATAPI interface has continued to evolve, the parallel interface is being replaced by a point-to-point serial interface link known as Serial ATA, or SATA. On the serial interface, all control and data transfers are packetized into Frame Information Structures (FIS). Each FIS has an appropriate header that indicates the FIS type, a payload, and a CRC field used to check the integrity of the received FIS. The details are defined in the interface specification.

FIG. 33 is a simplified conceptual diagram of one embodiment of a disk array controller according to the invention, using a plurality of SATA ports and drives. In FIG. 33, the buffers 420 are shown conceptually; in practice, these buffers can be anywhere in the available host system address space. The figure shows a separate buffer 420, the host interface 450, a direct memory access (DMA) channel 422, the data path switch logic 460, and a SATA port 424 for each drive 428. However, the actual relationship among the data channels, buffers, and drives depends on the current configuration of the data path switch logic 460. As detailed earlier, almost any data mapping scheme (for example, redundancy, striping, and so on), as well as any assignment of physical drives to logical drives, can be implemented by reconfiguring the data path switch logic 460. The logic can be reconfigured dynamically (for example, upon a drive failure), and this can be done under software control. In a preferred embodiment, the data path switch simultaneously supports data paths for multiple drives and for multiple arrays, with no software involvement once the paths are set up.

FIG. 34A shows the host interface 450 of FIG. 33 in more detail. In this embodiment, the host interface includes a system bridge 500, which provides data transfer between system main memory and a PCI bus 502.

The PCI bus is only one example. The PCI bus is coupled through a PCI bus interface 504 to various logical direct memory access (DMA) channels 510, under the control of a bus arbiter 506. In operation, only one DMA channel at a time actually transfers data to or from the PCI bus. Nevertheless, multiple DMA channels can be "active" concurrently, as discussed further below.

FIG. 34B shows a SATA port 520 in more detail, indicating operation in the disk read direction. In accordance with the published SATA specification, the interface implements a physical layer 522 for the physical connection to an attached drive 524, a link layer 526, and a transport layer 528. The transport layer includes a first-in, first-out (FIFO) memory 530, which provides a data buffer between the physical drive and the controller.

The SATA interface specification provides a handshake mechanism by which either end of the link 532 can throttle (pause) a data transfer from the other end. The SATA link is half duplex, and the protocol uses the reverse channel to handshake the transmission of each FIS. Because of the much higher speed of the link 532, as many as 80 additional bytes may be received after a pause in transmission has been requested. When data is received from the drive, the transport layer FIFO 530 can generate an "almost full" indication (not shown), which throttles the link over the reverse channel to prevent FIFO overflow and loss of data. The data path 540 on the other side of this FIFO can be accessed with locally generated timing. In the read case, the control flag "EMPTY" from the FIFO and the control signal "POP" to the FIFO are used to control access to the data.

FIG. 34C shows the same SATA port during a disk write operation. The control signals "FULL" and "PUSH" are used to write data into the port FIFO for subsequent transmission to the attached drive. Here again, the data transfer is decoupled from the actual link to the drive. Transfers from the controller can therefore be synchronous, leveraging the advantages of the "on-the-fly" redundancy operations, which in a preferred embodiment are implemented in the switch logic 460. For example, in the embodiment shown in FIG. 33, if the switch logic is configured for two data drives and one parity drive, the transport layer FIFOs of the corresponding controllers/adapters can be accessed with a common strobe for the XOR computation. In this way, the advantages of the original synchronous transfers are retained.

C. Data path switch logic

The data path switch logic, indicated generally by dashed line 460 in FIG. 33, is disposed between the SATA drive ports and the DMA channels that access the buffer memory. One embodiment of the data path switch logic is described in detail above with reference to earlier figures; see, for example, the data path switch described there. As noted, it provides dynamically configurable data paths between logical data ports or DMA channels and physical data ports. In one embodiment, the configuration is determined by data stored in a mapping register. As described above, the mapping may be logical or physical.
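The synchronous write described for FIG. 34C (a common PUSH strobe, with parity computed on the fly) can be sketched as below. This is an illustrative software model under stated assumptions, not the controller's logic: the `PortFifo` class, its depth parameter, and the function `synchronous_write` are hypothetical names, and elements are modeled as Python integers.

```python
from collections import deque
from functools import reduce
from operator import xor
from typing import Deque, List, Sequence


class PortFifo:
    """Toy model of a port's transport-layer FIFO with a FULL flag."""

    def __init__(self, depth: int) -> None:
        self.depth = depth
        self.items: Deque[int] = deque()

    @property
    def full(self) -> bool:
        return len(self.items) >= self.depth

    def push(self, element: int) -> None:
        if self.full:
            raise OverflowError("PUSH while FULL")
        self.items.append(element)


def synchronous_write(streams: Sequence[Sequence[int]],
                      data_fifos: Sequence[PortFifo],
                      parity_fifo: PortFifo) -> int:
    """Push one element per data drive plus their XOR; return stripes written."""
    ports: List[PortFifo] = list(data_fifos) + [parity_fifo]
    pushed = 0
    for stripe in zip(*streams):
        if any(fifo.full for fifo in ports):
            break  # the common PUSH is held off while any active port is FULL
        for fifo, element in zip(data_fifos, stripe):
            fifo.push(element)
        parity_fifo.push(reduce(xor, stripe))  # parity generated on the fly
        pushed += 1
    return pushed


d0, d1, par = PortFifo(4), PortFifo(4), PortFifo(4)
written = synchronous_write([[1, 4], [2, 5]], [d0, d1], par)
```

In this model the parity FIFO receives the XOR of each stripe in the same step as the data FIFOs, so no separate pass over the data is needed. A real controller would resume when the FULL flags clear rather than stop, which the sketch leaves out.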
FIG. 40 is a simplified diagram of a data path switch configured for the disk write direction. It shows four DMA channels (DMA0 to DMA3) for data transfers to or from host or buffer memory. The switch preferably includes hardware for implementing synchronous transfers and on-the-fly redundancy (XOR) operations. This block is interposed between the DMA channels and the SATA ports (SATA P0 to SATA P3). The particular numbers of DMA channels and SATA ports are illustrative only and not limiting. Although some other figures show specific connections among DMA channels, SATA ports, and XOR logic (for example, FIGS. 13, 14, 24, and 25), these paths are in practice all configurable through the logic, as discussed earlier. All data paths are preferably 32 bits wide for SATA and 16 bits wide for PATA/UDMA applications.

Referring to FIG. 40, disk write data for any SATA port can come from any of the DMA sources A-D or from the output "X" of the XOR block 4010. As indicated in the figure, each SATA port has a FULL status flag output. To transfer data synchronously, the PUSH signals of all active ports are asserted simultaneously, but only when the FULL flags of all the active ports are false. The XOR block 4010 can compute the XOR of any combination of the DMA inputs by qualifying the appropriate combination of AND gates with enable signals (for example, XB_ENA), as required for a particular drive array configuration or striping scheme.

The DMA channel data paths may be 32, 64, or 128 bits wide, depending on the width of the data path to memory. The data path switch block packs or unpacks 32-bit SATA elements as needed to match that data path width. For the narrow-stripe disk write case discussed below, only one DMA channel is used. The logic accepts data words of the DMA width and outputs from one to four 32-bit words at a time, synchronously, to the drive array and the XOR logic. For the narrow-stripe disk read case, the block synchronously accepts from one to four 32-bit words at a time, as discussed above. For wide-stripe applications, more than one DMA channel may be active. As shown in the "host interface detail" drawing of FIG. 34A, the DMA channels can operate concurrently, but the host bus transfers data for only one of them at a time. In this case, the data path switch logic block (FIG. 40) must provide a small buffer or FIFO memory 4020 for each DMA channel, allowing the buffers to be filled one at a time while data is withdrawn on the drive side simultaneously. For good system performance, these buffers/FIFOs must be large enough to let the host bus transfer the data of one or more complete cache lines at a time.

D. Disk write accumulator

FIG. 42 illustrates one embodiment of an XOR accumulator circuit for the data path switch. The accumulator is an alternative to providing the additional buffering on each DMA channel that is otherwise needed because the DMA channels do not transfer data simultaneously. The shared host bus interface ensures that host transfers from different buffers do not overlap. The FIFO memory 4210 in this figure is one sector in length. For example, to write a full stripe to three data drives plus a redundant drive, the following procedure is used:

1. The process waits until all of the active drives indicate NOT FULL, which in this example means that each can accept another sector of data. (See the FULL flags in FIG. 40.)
2. Next, one sector of data is transferred from buffer 0 to SATA port 0 using DMA channel 0 along path "A" shown in FIG. 40. At the same time, referring to FIG. 42, the data ("A"), enabled by A_ENA, also passes through the XOR 4222 into the FIFO memory 4210.
3. Next, one sector of data is transferred from buffer 1 to SATA port 1 using DMA channel 1 along path "B" shown in FIG. 40. At the same time, referring again to FIG. 42, the data "B", enabled by B_ENA, also passes to the XOR 4222. The current contents of the FIFO memory 4210 are enabled by X_ENA. Thus "B" and "X" are XORed, and the result is returned to the FIFO memory 4210.
4. Next, one sector of data is transferred from buffer 2 to SATA port 2 using DMA channel 2 along path "C". At the same time, the data, enabled by C_ENA, also passes to the XOR. The current contents of the FIFO are enabled by X_ENA. "C" and "X" are then XORed, and the result, which is the XOR of the sectors from buffer 0, buffer 1, and buffer 2, is transferred synchronously to SATA 3.

Instead of adding a FIFO of one or two cache lines to each DMA path, the scheme above uses just one FIFO, a single sector in length. The process of moving one sector from each buffer is repeated as needed until all of the data has been transferred. For example, a host bus rate of 450 MBPS would support a full 150 MBPS to each of four SATA drives.

The procedure above is for a full-stripe write. It can also serve as a pure parity drive update if the transfers from A, B, and C to the SATA drives are suppressed, leaving only the parity transfer.

E. Data path switch read direction

The read direction of the data path is illustrated in FIG. 41. As before, this block is interposed between the DMA channels and the SATA ports. Although other figures show specific connections among DMA channels, SATA ports, and XOR logic, these paths are in practice all configurable, using logic similar to that described above. All data paths are 32 bits wide for SATA or 16 bits wide for PATA/UDMA.

Disk read data from any of the SATA ports (SATA P0 to SATA P3) can be steered to any of the DMA destinations A-D, as indicated conceptually by the multiplexers (for example, mux 4110). In addition, data from any SATA port can be input to the XOR 4120, gated by the corresponding enable signal (for example, X1_ENA). Each SATA port has an EMPTY status flag output. To transfer data synchronously, the POP signals of all active ports are asserted simultaneously, but only when the EMPTY flags of all the active ports are false. The XOR block 4130 can compute the XOR of any combination of the SATA inputs by qualifying the appropriate combination of AND gates.

The DMA channel data paths may be 32, 64, or 128 bits wide, depending on the width of the data path to memory. This block accepts elements from the SATA ports, and it can accept an element from the XOR in place of the data from a failed drive. Elements from one to four sources are packed to build the words for the DMA transfer. For wide-stripe applications, there may be more than one DMA channel. As shown in the "host interface detail" drawing of FIG. 34A, the DMA channels can operate concurrently, but the host bus transfers data for only one of them at a time. In this case, the block must provide a small buffer or FIFO for each DMA channel, so that data can be passed to the host one channel at a time while data is transferred simultaneously on the SATA port side. This is illustrated by the FIFO memories 4140. For good system performance, these buffers/FIFOs must be large enough to let the host bus transfer the data of one or more complete cache lines at a time.
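The read-direction behavior described above (a common POP only while no active port is EMPTY, with the XOR output standing in for a failed drive) can be sketched as follows. The function name, the use of `deque` objects as port FIFOs, and the tuple used as the "packed" DMA word are assumptions made for this example.

```python
from collections import deque
from functools import reduce
from operator import xor
from typing import Deque, List, Optional, Sequence, Tuple


def synchronous_read(port_fifos: Sequence[Deque[int]],
                     parity_port: int,
                     failed_port: Optional[int] = None) -> List[Tuple[int, ...]]:
    """Return the packed data words that could be transferred so far."""
    ports = range(len(port_fifos))
    active = [p for p in ports if p != failed_port]
    data_ports = [p for p in ports if p != parity_port]
    words: List[Tuple[int, ...]] = []
    # POP every active port together, and only while none of them is EMPTY.
    while all(port_fifos[p] for p in active):
        element = {p: port_fifos[p].popleft() for p in active}
        if failed_port is not None:
            # The XOR of the surviving elements stands in for the failed drive.
            element[failed_port] = reduce(xor, element.values())
        words.append(tuple(element[p] for p in data_ports))
    return words


# Three data ports (0-2) plus parity on port 3; port 1 is treated as failed.
fifos = [deque([1, 4]), deque(), deque([3, 6]), deque([1 ^ 2 ^ 3, 4 ^ 5 ^ 6])]
recovered = synchronous_read(fifos, parity_port=3, failed_port=1)
```

Here `recovered` holds the original data words, with the middle element of each word rebuilt from the other two data elements and the parity element. The loop stops as soon as any active FIFO runs empty, which models the hold-off on POP.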

F. Disk read accumulator

FIG. 43 illustrates one embodiment of an XOR accumulator circuit for a data path switch in the read direction. The accumulator is an alternative to providing the additional buffering on each DMA channel that is otherwise needed because the DMA channels do not transfer data simultaneously. The shared host bus interface (see FIG. 34A) ensures that host transfers to different buffers do not overlap. The FIFO memory 4310 in this figure is one sector in length.

This arrangement can be used in many ways. For example, it can be used to read an array of three data drives plus a redundant drive in which the drive attached to SATA 2 has failed, using the following procedure:

1. The process waits until all of the active drives indicate NOT EMPTY, which in this example means that each can supply another sector of data.
2. One sector of data is transferred from SATA 0 to buffer 0 through DMA channel 0. At the same time, referring to FIG. 43, the data from SATA 0, enabled by P0_ENA, passes through the XOR into the FIFO memory 4310.
3. Next, one sector of data is transferred from SATA 1 to buffer 1 through DMA channel 1. At the same time, the data from SATA 1, enabled by P1_ENA, is XORed with the current contents of the FIFO memory 4310, enabled by X_ENA, and the result is returned to the FIFO.
4. Next, one sector of data from SATA 3, enabled by P3_ENA, is again XORed with the current contents of the FIFO memory 4310, enabled by X_ENA. The result is transferred over the corresponding DMA channel to buffer 2, completing the full-stripe read.

For a full-stripe read with a failed drive, all of the surviving data drives are transferred first, followed by the parity drive data. For a single-block read from the failed drive, the corresponding blocks from the three surviving SATA ports are XORed in the accumulator, and only the final result is transferred through a DMA channel to a buffer.

Instead of adding a FIFO of one or two cache lines to each DMA path, the scheme above uses just a single FIFO, one sector in length. The process of moving one sector from each SATA port is repeated as needed until all of the data has been transferred. For example, a host bus rate of 450 MBPS would support a full 150 MBPS from each of three SATA drives.
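The accumulator procedure can be sketched sector by sector as shown below. This is an illustrative model only: the function names are hypothetical, sectors are modeled as `bytes` objects of an assumed 512-byte size, and starting the accumulator at all zeros stands in for the first pass, in which the FIFO contents are not enabled into the XOR.

```python
from functools import reduce
from typing import Dict, Sequence

SECTOR_BYTES = 512  # assumed sector size for the example


def xor_sectors(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length sectors."""
    if len(a) != len(b):
        raise ValueError("sector length mismatch")
    return bytes(x ^ y for x, y in zip(a, b))


def write_parity(data_sectors: Sequence[bytes]) -> bytes:
    """Write direction: accumulate the XOR of one sector from each buffer."""
    return reduce(xor_sectors, data_sectors)


def read_stripe_degraded(surviving_data: Dict[int, bytes], parity: bytes,
                         failed_port: int) -> Dict[int, bytes]:
    """Read direction: rebuild the failed port's sector in a one-sector accumulator."""
    buffers: Dict[int, bytes] = {}
    accumulator = bytes(SECTOR_BYTES)  # zeros, so the first XOR stores the sector as-is
    for port, sector in surviving_data.items():
        buffers[port] = sector                          # normal transfer to that port's buffer
        accumulator = xor_sectors(accumulator, sector)  # and, in passing, into the accumulator
    accumulator = xor_sectors(accumulator, parity)      # parity goes only to the accumulator
    buffers[failed_port] = accumulator                  # final result replaces the failed drive
    return buffers


data = [bytes([v]) * SECTOR_BYTES for v in (0x11, 0x22, 0x47)]
parity = write_parity(data)
buffers = read_stripe_degraded({0: data[0], 1: data[1]}, parity, failed_port=2)
```

The same one-sector accumulator serves both directions: in the write direction it ends up holding the parity sector, and in the read direction it ends up holding the missing data sector.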

G. Synchronous read method - general case

By way of example only, the foregoing discussion assumed data striped 4K wide across three data drives plus a redundant drive. The examples assumed a 4K read access that corresponds exactly to the 4K block boundaries of a particular stripe on a particular drive. In general, however, an array can have any number of data drives plus a single redundant drive. The stripe width can be any multiple of a single sector. Moreover, a read request can start at any sector, and its length can range from a single sector up to the capacity of the array. A generalized method of synchronously reading data from a drive array according to the invention proceeds as follows.

The initial access is made to the drive whose block contains the initial sector. Data is transferred from that sector up to the end of the block or the end of the read request, whichever comes first. If the initial sector happens to be on a failed drive, the corresponding range of sectors is read from the other drives, the XOR is computed, and the result is transferred "on the fly".

Any additional whole data blocks requested are read from successive blocks of the stripe and from successive stripes. For best transfer performance, these accesses should take place on the drives concurrently. When a block is needed from a failed drive, concurrency is not possible, because the data from all of the other drives is needed to reconstruct the failed block.

Finally, a data read access is made to the drive on which the final sector is located. Data is transferred from the start of that block, or from the initial sector if it lies in the same block, through the final sector. If the data of this block resides on a drive that has failed, the corresponding range of sectors in the blocks of the stripe on the other drives is read, the XOR is computed, and the result is returned. The foregoing method is enabled by various apparatus, notably the data path switch disclosed herein.

Any update of a data drive likewise requires an update of the redundant drive, so that the redundancy always properly reflects the contents of the data drives. For a full-stripe update, in which the data block on every data drive of the array receives new data, a standard RAID XOR engine can be used to fetch the write data elements from each data drive's buffer, XOR these elements to produce the new contents of the redundant drive, and write the result to a buffer for a subsequent write transfer to the redundant drive. Using the 4K-wide stripe example again, in addition to the 12K of data transferred to the drives, the same 12K of data is read from the buffers and used to compute 4K of redundancy, for a total of 16K of additional data transfer.

An update may involve less than a full stripe. When the update is complete, however, the contents of the block on the redundant drive must reflect the XOR of all the blocks of the stripe, both the new blocks and the earlier blocks that were not updated. FIGS. 37A to 37C illustrate a prior art method for a partial stripe update, also called a read/modify/write (R/M/W) operation. FIG. 37A shows step 1: the block to be changed is read from a data drive, and the corresponding block is read from the parity (PAR) drive. The data and the redundancy data are stored in buffers 3702 and 3704, respectively. In step 2, shown in FIG. 37B, a new XOR is formed by reading (and XORing) the buffers 3702, 3704, and 3706, which contain the old data, the old parity, and the new data.

and 3706. The XOR result is stored in buffer 3708. Each buffer has a corresponding DMA channel, as shown. Finally, in step 3, Figure 37C, the new data and the new parity are stored to their respective drives.

Another known approach to this problem is to pre-read the blocks that are not being updated. At that point, all of the new blocks and the unchanged blocks are available in the buffers and can be read out for the XOR calculation. The pre-read establishes the same starting state as a full-stripe write. With this approach, every data drive is either read or written, and the redundant drive is written.

Figures 38 and 39 illustrate one embodiment of an improved approach to the partial-stripe update. In Figure 38, the current redundancy and the current contents of the block to be updated are read first, and the XOR of the two is computed. (The XOR is implemented in the array switch logic 460, as discussed.) The result of this calculation is equivalent to the XOR of all of the blocks that are not being updated: XORing the two blocks effectively removes the contribution of the current data of the block to be updated. This intermediate result is stored in a temporary buffer 3810.

Referring to Figure 39, the intermediate result is then XORed with the new data 3812 to produce the new redundancy for the entire stripe. The updated redundancy and the new data block are written to the redundant (parity) drive 3820 and the data drive 3822, respectively. This approach takes two reads, two XORs, and two writes.
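The two-step partial-stripe update of Figures 38 and 39 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: blocks are modeled as in-memory byte strings, and the names `xor_blocks` and `partial_stripe_update` are hypothetical and do not come from the patent, which performs the XOR in hardware as data streams between drives and buffers.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Return the bytewise XOR of one or more equal-length blocks."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def partial_stripe_update(old_data: bytes, old_parity: bytes, new_data: bytes):
    """Illustrative two-step update (hypothetical helper, not the patent's hardware).

    Step 1: old data XOR old parity gives the XOR of the blocks that are
            not being updated (the intermediate result).
    Step 2: intermediate XOR new data gives the new parity for the stripe.
    """
    intermediate = xor_blocks(old_data, old_parity)
    new_parity = xor_blocks(intermediate, new_data)
    return intermediate, new_parity


# A three-block stripe with its parity block.
stripe = [bytes([1, 2, 3, 4]), bytes([5, 6, 7, 8]), bytes([9, 10, 11, 12])]
parity = xor_blocks(*stripe)

# Update only the middle block.
new_block = bytes([0xAA, 0xBB, 0xCC, 0xDD])
intermediate, new_parity = partial_stripe_update(stripe[1], parity, new_block)
stripe[1] = new_block

# The intermediate result equals the XOR of the untouched blocks, and the
# new parity matches a full recomputation over the updated stripe.
assert intermediate == xor_blocks(stripe[0], stripe[2])
assert new_parity == xor_blocks(*stripe)
```

Only the block being updated and the parity block are read, which is why the update costs two reads, two XORs, and two writes regardless of how many data drives the stripe spans.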
Redundant Write - Synchronous Data Transfer

A full-stripe write operation can also use a form of synchronous redundant data transfer, analogous to that used for a disk read operation, as follows. A disk write command is issued to each drive of the array. When the drives are all ready to transfer data, the DMA engine fetches a first element from each buffer, computes their XOR, and then writes the first elements to the respective data drives and writes the result of the XOR calculation to the redundant drive, using a common DIOW strobe for the parallel ATA/ATAPI interfaces.

As noted above, introducing a FIFO into the data path of each drive allows the same method to be used, albeit with a "PUSH" rather than a common DIOW, for ATA/ATAPI drives that use the UDMA protocol or for SATA drives.

The advantages of this approach include the following:

Transfers: the total amount of data transferred from the buffer is just the new data written to the disks. The extra 12K buffer read and 4K buffer write are eliminated.

Generating the redundant data adds no time. Using a dedicated DMA engine would involve writing the computed redundancy to a buffer and reading it out again for transfer to the redundant drive.

As discussed above, a partial-stripe update can be handled like a full-stripe update by first pre-reading the contents of the blocks that are unchanged. Similarly, the approach already described, which pre-reads only the redundancy and the block to be updated, can take advantage of synchronous redundant data transfers, as follows:

1. A read command is first issued to the redundant drive and to the drive to be updated.
2. When both drives are ready to transfer data, a first data element is read from each drive, the XOR of the two elements is computed, and the result is stored in a buffer. This process is repeated element by element until the complete blocks have been read from the drives and the resulting block has been stored in the buffer. This buffer then holds the XOR of all of the blocks that are not being updated.

3. A write command is issued to the redundant drive and to the data drive to be updated.

4. When both drives are ready to accept data, a first element is fetched from the XOR buffer and a first data element is fetched from the update buffer. The XOR of the two elements is computed. The data is then transferred synchronously to the two drives using a common DIOW strobe. The data drive receives the element from the update buffer unchanged, while the redundant drive receives the computed XOR of the two elements. This process is repeated element by element to complete the partial-stripe update.

Note: in the last step, the two drives need not be written synchronously. The contents of the XOR buffer and the update buffer can be XORed and written to the redundant drive independently of the transfer from the update buffer to the data drive, but doing so would require additional transfers.

Synchronous Read Method - Narrow Stripe

The synchronous data transfers described to this point have been based on a stripe width of at least one sector. This is also the data structure supported by dedicated XOR DMA engines, which operate once the read data has been transferred to a buffer. The synchronous data transfer hardware and methods described above work equally well with stripe widths of less than one sector.

Refer to Figure 5, which illustrates an array of three data drives plus one redundant drive. As before, the stripe width is a matter of choice. For parallel ATA/ATAPI drives, the sixteen-bit width of the data path is a useful stripe width. The first word (0) of a sector is stored on the first drive (Data 0 in Figure 5), the second word (1) is stored on the second drive (Data 1), and the third word (2) is stored on the third drive (Data 2). The process then repeats, storing the fourth data word (3) in the second location of the first drive, the fifth data word (4) in the second location of the second drive, and the sixth data word (5) in the second location of the third drive. The process repeats word by word and stripe by stripe to the end of the disk. The parity is stored on the parity drive. For example, the parity drive word shown for stripe 2 is the XOR of words 6, 7, and 8 stored on the data drives.

In the arrangement of Figure 5, a full stripe consists of three words, one from each data drive. An access will require all three drives. Even if not all of the data is useful, the aggregate bandwidth of the data drives is always achieved. The word-wide elements from the synchronous accesses are assembled, maintaining stripe order, and stored in the buffer.

Referring to Figure 38, when the narrow-stripe array described above has a failed drive, word-wide elements are fetched synchronously from the remaining data drives and from the parity drive. The XOR of the three fetched words is computed on the fly, regenerating the word of data from the failed drive. The original stripe is rebuilt by combining the data elements from the two working data drives with the regenerated element. The result is transferred to the buffer.

One consequence of the narrow stripe in this arrangement involves the sectors of the drives in the array. If a read request is not for a multiple of three sectors aligned to a modulo-three sector boundary, there will be cases in which a block of three sectors is read and only one or two of those sectors are returned to the host. The drives require that complete sectors be transferred; the one or two sectors that were not requested are discarded.

It will be apparent to those skilled in the art that changes may be made to the details of the embodiments described above without departing from the underlying principles of the invention. The scope of the present invention should therefore be determined only by the claims.

[Brief Description of the Drawings]

Figures 1A to 1D illustrate various disk drive configurations.
Figure 2 illustrates data striping across two data drives.
Figure 3 illustrates striping across two data drives plus one redundant drive.
Figure 4 illustrates striping across three data drives plus one redundant drive.
Figure 5 illustrates narrow striping, one word wide.
Figure 6 is a simplified schematic of a disk array system, showing the read data path used to synchronize UDMA drive data.
Figure 7 is a simplified schematic of a disk array system, showing the write data path used to write to UDMA drives.
Figure 8 is a simplified schematic of a disk array write data path with "on-the-fly" redundant data storage.
Figure 9 is a simplified schematic of a disk array read data path with "on-the-fly" regeneration of data, in which one of the drives has failed.
Figure 10 is a timing diagram illustrating a disk array READ operation.
Figure 11 is a functional diagram of a disk array controller, showing the logical data path and DMA channel for each physical port.
Figure 12 shows an actual implementation, with a single physical DMA channel and data path used for all physical ports and provided by the array switch. The DMA context of each physical port is stored in a RAM.
Figure 13 illustrates the array switch data path configuration for a disk read.
Figure 14 illustrates the array switch data path configuration for a disk write.
Figure 15A illustrates data striping across two, three, and four drives; the chart illustrates the data mapping for one or more drives as JBOD or RAID0.
Figure 15B illustrates the data mapping for RAID1, or mirroring.
Figure 16A illustrates the RAIDXL data mapping for non-redundant arrays of two, three, and four drives.
Figure 16B illustrates the RAIDXL data mapping for two data drives plus one parity drive or three data drives plus one parity drive.
Figure 17 illustrates one possible RAID5 data mapping for two data drives plus one parity drive or three data drives plus one parity drive.
Figures 18 to 22 illustrate various array switch configurations and operations.
Figure 23 is a simplified schematic of a disk array controller, showing a host interface for interacting with a host bus and a drive interface for interacting with a plurality of attached drives.

Figure 24A is a conceptual diagram illustrating direct connections between the logical data ports and the physical data ports, and shows the corresponding mapping register contents.
Figure 24B is a conceptual diagram illustrating one example of an assignment of logical ports to the five available physical data ports, and shows the corresponding mapping register contents.
Figure 24C is a conceptual diagram of a disk drive array, and shows the corresponding mapping register contents.
Figure 24D is a conceptual diagram of a single-drive system, in which logical ports 0-3 transfer data on successive cycles to physical port 3, and shows the corresponding mapping register contents.
Figure 25A illustrates the XOR logic in the disk write direction for the drive configuration of Figure 24A, and shows the corresponding mapping register contents.
Figure 25B illustrates the XOR logic in the disk read direction for the same data path as Figures 24A and 25A, except that the drive attached to physical port #2 has now failed, and shows the mapping register contents.
Figure 26 illustrates one example of a mapping register structure; in the array controller, the mapping register controls the data paths between the logical and physical data ports.
Figure 27A is a conceptual diagram of the multiplexer circuitry in the logical port #1 read data path.
Figure 27B illustrates the disk read XOR logic in one embodiment of the array controller.
Figure 28 illustrates the decoder logic for the logical port #1 (PP_L1) field of the mapping register in one embodiment of the array controller.
Figure 29A illustrates the logical-port-to-physical-port data path logic in one embodiment of the array controller (shown for one physical port only).
Figure 29B illustrates the disk write XOR logic in one embodiment of the array controller.
Figure 30 illustrates the disk address, strobe, and chip select logic that enables global access commands to a currently selected array.
Figure 31 illustrates the interrupt signal logic associated with the logical drives.
Figure 32 illustrates a hardware implementation of logical addressing.
Figure 33 is a simplified conceptual diagram of one embodiment of a disk array system according to the invention, using a plurality of SATA ports and drives.
Figure 34A shows details of the host interface of Figure 33.
Figure 34B shows a SATA port interface in the disk read direction.
Figure 34C shows a SATA port interface in the disk write direction.
Figures 35A and 35B illustrate a prior-art read operation in which one drive has failed.
Figure 36 illustrates a new read operation according to the invention in which one drive has failed.
Figures 37A to 37C illustrate a prior-art read-modify-write (R/M/W) operation in a serial interface drive array.
Figure 38 illustrates step 1 of a new read-modify-write operation in a serial interface drive array according to the invention.
Figure 39 illustrates step 2 of a new read-modify-write operation in a serial interface drive array according to the invention.

Figure 40 illustrates one embodiment of a disk array switch according to the invention, configured in the disk write direction.
Figure 41 illustrates one embodiment of a disk array switch according to the invention, configured in the disk read direction.
Figure 42 illustrates one embodiment of disk logic according to the invention.
Figure 43 illustrates one embodiment of disk logic according to the invention.

[Description of Main Reference Numerals]

10: disk drive array
12, 20: disk drive
14: disk drive
16, 24: UDMA interface
18: buffer memory
20: control processor
22: disk drive interface
24: mapping register
26: first-in, first-out (FIFO) memory
30, 32: "almost full" signal
36: logic gate
40: logic block
42: "all have data" signal
44: read strobe
46, 48: data output path
50: multiplexer
52: buffer

54, 70: counter
66: decoder
68: OR gate
70: data path
72: multiplexer
74: AND gate
76: OR gate
80, 82: buffer
84, 86: gate
90: multiplexer
92: buffer
94: AND gate
96: OR gate
100, 104: decoder
102, 106: OR gate
300: data drive
320, 380: UDMA interface
322: failed data drive
340, 370: FIFO memory
342, 344: data path
350: buffer
360: XOR logic
390: redundant or parity drive
392: path
394: XOR logic
396: XOR output
420: buffer
422: direct memory access channel
424: SATA port
428: disk drive
450: host interface
460: data path switch logic
500: system bridge
502: PCI bus
504: PCI bus interface
506: bus arbiter
510: logical direct memory access channel
520: SATA port
522: physical layer
524: disk drive
526: link layer
528: transport layer
530: FIFO memory
532: link
540: data path
620: XOR logic
622: buffer
3702, 3704, 3706, 3708, 3810, 3812: buffer
3820: redundant (parity) drive
3822: data drive
4010, 4120, 4130, 4222, 4320: XOR
4020, 4210, 4310, 4140: FIFO memory
4110: multiplexer

Claims (1)

X. Claims:

1. A method of synchronously reading data from a redundant RAID drive array, the redundant RAID drive array having any number of data drives plus a single redundant drive, wherein the stored data is striped across the drives with a stripe width of one or more sectors, and the read request size is an integer number of sectors from one up to the capacity of the array, the method comprising the steps of:
determining whether any drive of the array has failed;
identifying a first drive of the array that contains an initial sector of the requested data;
reading data from the first drive, from the initial sector of the read request to the end of the block or to the end of the read request, whichever comes first; and
if the requested data resides on a failed drive, reading the corresponding range of sectors from the remaining drives, computing a Boolean XOR function of the data read from the remaining drives, and transferring the result "on the fly".

2. The method of claim 1, further comprising: if the read request extends beyond a single data block, reading data from consecutive blocks across the stripe, and reading data from consecutive stripes of the drives, until the read request is completed, all of the read accesses being performed synchronously on each of the drives.

3. The method of claim 2, further comprising: if and when data is requested from a failed drive, reading the corresponding range of sectors from the remaining drives, computing a Boolean XOR function of the data read from the remaining drives, and transferring the result "on the fly".

4. The method of claim 2, further comprising:
identifying a second drive on which is stored a data block that includes a final sector of the requested data;
reading data from the identified second drive;
transferring the second drive data from the start of the block, or from the initial sector if it is in the same block, through the final sector; and
if the data block of the final sector is located on a drive that has failed, reading the corresponding range of sectors from the remaining drives, computing a Boolean XOR function of the data read from the remaining drives, and transferring the result as the data block of the final sector.

5. The method of claim 2, wherein the drives of the RAID array are SATA drives, each attached to a SATA port, and the synchronous data read operation comprises synchronously transferring data from the controller side of each SATA port.

6. A method for updating a data block, stored in a striped data structure in a redundant drive array, with a new data block, the method comprising the steps of:
identifying a current data stripe that includes the block to be updated;
identifying a first drive of the array that stores the current parity data for the current stripe;
identifying a second drive of the array that stores the data block to be updated;
reading the data block and a corresponding parity block from the second and first drives, respectively;
computing a first XOR of the retrieved data block and the retrieved parity block to form an intermediate data block;
storing the computed XOR in a temporary storage location, wherein the computing step is performed "on the fly" during the storing step, without storing the retrieved data block and the retrieved parity block in memory;
reading the intermediate block and reading the new data block;
computing a second XOR of the intermediate block and the new data block; and
synchronously storing the computed second XOR in the parity drive (the first drive) and storing the new data block in the second drive.

7. The method of claim 6, wherein the array comprises drives connected via a serial port.

8. The method of claim 7, wherein the array comprises drives connected via a SATA port.

9. The method of claim 8, wherein the array is coupled to a buffer by data path switch logic.

10. The method of claim 9, wherein the data path switch logic is reconfigurable.

11. The method of claim 9, wherein the data path switch logic includes Boolean XOR hardware for computing an XOR having selected data inputs from any of the drives.

12. A two-step method for updating a data block, stored in a striped data structure in a redundant drive array, with an updated data block, the method comprising the steps of:
in a first step, reading the data block and the parity block from the array, computing an XOR of the data block and the parity block, and storing the computed XOR in a temporary storage location, without storing the data block or the parity block in memory; and
in a second step, reading the computed XOR from the temporary storage location and reading the updated data block from memory, computing a second XOR of the first computed XOR and the updated data block, storing the updated data block in the array, and storing the second computed XOR in the parity drive to update the parity block.

13. The method of claim 12, wherein the array comprises drives connected via a serial port.

14. The method of claim 13, wherein the array comprises drives connected via a SATA port.

15. The method of claim 14, wherein the array is coupled to a buffer by data path switch logic.

16. The method of claim 15, wherein the data path switch logic is interposed between the drives and the host buffer direct memory access, to move data synchronously and to enable the XOR computations on the fly.

17. An improved RAID disk array controller, comprising:
a plurality of serial drive interfaces for attaching physical drives to form a drive array;
each of the serial interfaces including a buffer memory for storing write data so that it can be stored on the attached drive in a disk write operation;
each of the serial interfaces providing a status flag output to indicate when the buffer memory is full;
switch logic circuitry, coupled to all of the serial interfaces, for receiving write data from at least one direct memory access channel and providing the write data to the buffer memories during a write operation;
a logic circuit for detecting when all of the buffer memories are not full;
a control circuit, responsive to the logic circuit, for synchronously writing the write data from the direct memory access channel through the switch logic circuitry to all of the buffer memories, thereby forming synchronous write data; and
the switch logic circuitry including XOR circuitry for forming, on the fly, redundant data from the synchronous write data for storage in the array.

18. The improved RAID disk array controller of claim 17, wherein the serial drive interfaces comprise a SATA-compliant interface.

19. The improved RAID disk array controller of claim …, wherein the controller is implemented on a daughter board.

20. The improved RAID disk array controller of claim …, wherein the controller is implemented on a computer motherboard.

21. The improved RAID disk array controller of claim 18, wherein the switch logic circuitry is configurable for implementing a desired association between the direct memory access channel and at least one SATA drive interface.

22. The improved RAID disk array controller of claim 21, wherein the switch logic circuitry is configured by mapping data.

23. The improved RAID disk array controller of claim …, wherein the mapping data is software-reconfigurable for dynamically configuring the switch logic circuitry.

24. The improved RAID disk array controller of claim 17, wherein the switch logic circuitry pushes the redundant data to one or more of the serial interfaces simultaneously with the write data in a single disk write operation.

25. The improved RAID disk array controller of claim 17, wherein each buffer memory comprises a first-in, first-out (FIFO) memory.

26. The improved RAID disk array controller of claim 17, wherein the switch logic circuitry implements a 32-bit data path according to a current configuration.

27. The improved RAID disk array controller of claim 26, wherein the switch logic circuitry is configured by mapping data.

28. The improved RAID disk array controller of claim 27, wherein the mapping data is software-reconfigurable for dynamically configuring the switch logic circuitry.

29. The improved RAID disk array controller of claim 27, implemented on a host bus adapter.

30. The improved RAID disk array controller of claim 17, wherein the switch logic circuitry includes a disk write accumulator circuit.

31. An improved RAID disk array controller, comprising:
a plurality of serial drive interfaces for attaching physical drives to form a drive array;
each of the serial interfaces including a buffer memory for storing, in a disk read operation, read data retrieved from the attached drive;
each of the serial interfaces providing a status flag output to indicate whether the buffer memory is currently empty;
switch logic circuitry, coupled to all of the serial interfaces to implement data paths, for receiving read data from the serial interfaces and, during a write operation, providing the received read data to at least one direct memory access channel;
a logic circuit for detecting when all of the buffer memories are not empty;
a control circuit, responsive to the logic circuit, for synchronously transferring read data from all of the buffer memories through the switch logic circuitry to the direct memory access channel, thereby forming synchronous read data; and
the switch logic circuitry including Boolean XOR circuitry for reconstructing, on the fly, missing data from the synchronous read data as the synchronous read data is transferred from the buffer memories to the direct memory access channel.

32. The improved RAID disk array controller of claim 31, wherein the serial drive interfaces comprise a SATA-compliant interface.

33. The improved RAID disk array controller of claim 32, wherein the controller is implemented on a daughter board.

34. The improved RAID disk array controller of claim 32, wherein the switch logic circuitry is configurable for implementing a desired association between the direct memory access channel and at least one SATA drive interface.

35. The improved RAID disk array controller of claim 34, wherein the switch logic circuitry is configured by mapping data.

36. The improved RAID disk array controller of claim 35, wherein the mapping data is software-reconfigurable for dynamically configuring the switch logic circuitry.

37. The improved RAID disk array controller of claim 31, wherein the switch logic circuitry pushes the redundant data to one or more of the direct memory access channels simultaneously with the read data in a single disk read operation.

38. The improved RAID disk array controller of claim 31, wherein each serial port buffer memory comprises a first-in, first-out (FIFO) memory.

39. The improved RAID disk array controller of claim 31, wherein the switch logic circuitry implements a 32-bit data path according to a current configuration.

40. The improved RAID disk array controller of claim 39, wherein the switch logic circuitry is configured by mapping data.

41. The improved RAID disk array controller of claim 40, wherein the mapping data is software-reconfigurable for dynamically configuring the switch logic circuitry.

42. The improved RAID disk array controller of claim 31, implemented on a host bus adapter.

43. The improved RAID disk array controller of claim 31, wherein the switch logic circuitry includes a disk read accumulator circuit.
The improved magnetic disk array controller according to item 32 of the application, wherein the switch logic circuit is configurable, and is used for each direct memory access channel to be expected to be associated with at least one SAT ^ disk drive. <35. For example, the improvement of the ruler-gamma disk array manufacturing method of the 34th patent application scope, where 'the related logic circuit is by mapping * i6. For example, the improved RAID of the 35th patent application scope; ^ Column Mb 'where' The mapping data is software reconfigurable for kinetic energy configuration of the switch logic circuit. Heart 37 · If the improved r · controller of the 31st scope of the application for a patent application, wherein the bow industry array π ±, the switching logic circuit is in a single magnetic disk read and fall simultaneously with the read data will promote the redundant Yu information :: one or more of the middle channel. m want. Hidden body access 38. For example, the improved R · work system benefit of item 31 of the patent application scope, wherein each serial port buffer memory system includes (FIFO) memory. First-in, first-out, such as the modified form of the 31st item ... B system benefits, in which the 32-bit data path is implemented. The circuit is based on a current configuration 40. For example, the 39th controller in the scope of patent application, where the switch logic circuit # | ^ -type RAID disk array 4 1 4 is hit by β ... The error is configured by the mapping data 0: · Such wide range No. 4. Item H, wherein the mapping data is “reconfigurable for dynamic 92 200535609 to configure the switch logic circuit. 42. For example, the improved RAID disk array controller of item 31 of the patent application, which It is implemented in a host bus adapter. 43. The improved RAID disk array controller according to item 31 of the patent application, wherein the switch logic circuit includes a disk read accumulator circuit. 
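The XOR arithmetic behind the two-step update of claim 12 and the read-side reconstruction of claim 31 can be illustrated in software. The following is only a minimal sketch under stated assumptions: the claims describe hardware switch logic operating on the fly in the data path, whereas this example models blocks as Python `bytes` objects, and the names `xor_blocks`, `two_step_update`, and `reconstruct_missing` are hypothetical, chosen for illustration and not taken from the patent.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Return the byte-wise XOR of one or more equal-length blocks."""
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def two_step_update(stripe, parity, index, new_block):
    """Update one data block and the parity block in two steps.

    `stripe` is a list of data blocks (one per data drive) and `parity`
    is their XOR. Only the old data block and the old parity are read;
    the other drives are not touched. (Hypothetical software model.)
    """
    # Step 1: XOR the old data block with the old parity and keep the
    # result in a temporary location (here, a local variable).
    temp = xor_blocks(stripe[index], parity)

    # Step 2: XOR the temporary result with the updated data block to
    # obtain the new parity, then store the new data and new parity.
    new_parity = xor_blocks(temp, new_block)
    new_stripe = list(stripe)
    new_stripe[index] = new_block
    return new_stripe, new_parity


def reconstruct_missing(stripe, parity, missing):
    """Rebuild the block of a missing drive from the survivors and parity."""
    survivors = [blk for i, blk in enumerate(stripe) if i != missing]
    return xor_blocks(parity, *survivors)
```

The update works because XOR is its own inverse: XORing the old data block into the old parity cancels that block's contribution, and XORing in the updated block adds the new one, so the result equals the parity recomputed over the whole stripe. The same property lets `reconstruct_missing` recover any single lost block from the remaining blocks and the parity.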
XI. Drawings: as shown on the following page.
TW94107704A 2004-03-12 2005-03-14 Disk controller methods and apparatus with improved striping, redundancy operations and interfaces TWI386795B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US55359404P 2004-03-12 2004-03-12

Publications (2)

Publication Number Publication Date
TW200535609A (en) 2005-11-01
TWI386795B (en) 2013-02-21

Family

ID=34994276

Family Applications (1)

Application Number Title Priority Date Filing Date
TW94107704A TWI386795B (en) 2004-03-12 2005-03-14 Disk controller methods and apparatus with improved striping, redundancy operations and interfaces

Country Status (2)

Country Link
TW (1) TWI386795B (en)
WO (1) WO2005089339A2 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI472920B (en) * 2011-09-01 2015-02-11 A system and method for improving the read and write speed of a hybrid storage unit

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5483641A (en) * 1991-12-17 1996-01-09 Dell Usa, L.P. System for scheduling readahead operations if new request is within a proximity of N last read requests wherein N is dependent on independent activities
US6018778A (en) * 1996-05-03 2000-01-25 Netcell Corporation Disk array controller for reading/writing striped data using a single address counter for synchronously transferring data between data ports and buffer memory
US5864653A (en) * 1996-12-31 1999-01-26 Compaq Computer Corporation PCI hot spare capability for failed components
US6151641A (en) * 1997-09-30 2000-11-21 Lsi Logic Corporation DMA controller of a RAID storage controller with integrated XOR parity computation capability adapted to compute parity in parallel with the transfer of data segments

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI391815B (en) * 2006-03-21 2013-04-01 Ibm Enclosure-based raid parity assist
US9207876B2 (en) 2007-04-19 2015-12-08 Microsoft Technology Licensing, Llc Remove-on-delete technologies for solid state drive optimization
US9696907B2 (en) 2007-04-19 2017-07-04 Microsoft Technology Licensing, Llc Remove-on-delete technologies for solid state drive optimization
US10156988B2 (en) 2007-04-19 2018-12-18 Microsoft Technology Licensing, Llc Composite solid state drive identification and optimization technologies
TWI788860B (en) * 2020-12-16 2023-01-01 日商鎧俠股份有限公司 memory system
TWI787848B (en) * 2021-03-17 2022-12-21 日商鎧俠股份有限公司 memory system
CN117251115A (en) * 2023-11-14 2023-12-19 苏州元脑智能科技有限公司 Channel management method, system, equipment and medium of disk array
CN117251115B (en) * 2023-11-14 2024-02-09 苏州元脑智能科技有限公司 Channel management method, system, equipment and medium of disk array

Also Published As

Publication number Publication date
WO2005089339A2 (en) 2005-09-29
TWI386795B (en) 2013-02-21
WO2005089339A3 (en) 2009-04-30

Similar Documents

Publication Publication Date Title
TWI307859B (en) Sas storage virtualization controller, subsystem and system using the same, and method therefor
US8074149B2 (en) Disk controller methods and apparatus with improved striping, redundancy operations and interfaces
TWI260500B (en) Storage virtualization computer system and external controller therefor
KR101455016B1 (en) Method and apparatus to provide a high availability solid state drive
TW200535609A (en) Disk controller methods and apparatus with improved striping, redundancy operations and interfaces
US11645218B2 (en) Network architecture providing high speed storage access through a PCI express fabric between a compute node and a storage server within an array of compute nodes
TWI344598B (en) Apparatus and method to convert data payloads from a first sector format to a second sector format
CN102209103B (en) Multicasting write requests to multiple storage controllers
TWI331327B (en) Raid controller disk write mask
TWI567565B (en) Nvram path selection
TWI329860B (en)
US8656117B1 (en) Read completion data management
TWI361348B (en) Parity engine for use in storage virtualization controller and methods of generating data by parity engine
US5771359A (en) Bridge having a data buffer for each bus master
JP2013025795A (en) Flash controller hardware architecture for flash devices
TW200915336A (en) ECC control circuits, multi-channel memory systems including the same, and related methods of operation
US7958302B2 (en) System and method for communicating data in a storage network
CN106021147A (en) Storage device for presenting direct access under logical drive model
TW201109938A (en) Transport agnostic SCSI I/O referrals
TW201017404A (en) System and method for loose coupling between RAID volumes and drive groups
TW480395B (en) Record regenerating device
US20090313430A1 (en) Positron Emission Tomography Event Stream Buffering
TWI251147B (en) Method, system, and computer readable recording medium for configuring components on a bus
US20040205269A1 (en) Method and apparatus for synchronizing data from asynchronous disk drive data transfers
TWI544401B (en) Method, storage controller and system for efficiently destaging sequential i/o streams