WO2001031456A1 - Caching techniques for improving system performance in RAID (redundant array of independent disks) applications - Google Patents
Caching techniques for improving system performance in RAID (redundant array of independent disks) applications
- Publication number
- WO2001031456A1 (PCT/US2000/029881)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- bus
- cache memory
- disk drive
- interfacing
- xor
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/08—Error detection or correction by redundancy in data representation, e.g. by using checking codes
- G06F11/10—Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
- G06F11/1076—Parity data used in redundant arrays of independent storages, e.g. in RAID systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0866—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches for peripheral storage systems, e.g. disk cache
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2211/00—Indexing scheme relating to details of data-processing equipment not covered by groups G06F3/00 - G06F13/00
- G06F2211/10—Indexing scheme relating to G06F11/10
- G06F2211/1002—Indexing scheme relating to G06F11/1076
- G06F2211/1009—Cache, i.e. caches used in RAID system with parity
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2211/00—Indexing scheme relating to details of data-processing equipment not covered by groups G06F3/00 - G06F13/00
- G06F2211/10—Indexing scheme relating to G06F11/10
- G06F2211/1002—Indexing scheme relating to G06F11/1076
- G06F2211/1054—Parity-fast hardware, i.e. dedicated fast hardware for RAID systems with parity
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/31—Providing disk cache in a specific location of a storage system
- G06F2212/312—In storage controller
Definitions
- the present invention relates to a caching technique for improving system performance in RAID applications.
- the memory caching architecture technique according to the present invention creates a substantial improvement on the system host bus, especially for a RAID application system.
- the present invention is related to U. S. Patent Nos. 5,734,924 entitled System For Host Accessing Local Memory, issued March 31, 1998; 5,586,268 entitled Multiple Peripheral Adapter Device Driver Architecture, issued December 17, 1996; and 5,561,813 entitled Circuit For Resolving I/O Port Address Conflicts, issued October 1, 1996, all of which are assigned to the same assignee as the present invention, and the details of which are hereby incorporated by reference.
- the traditional RAID (Redundant Array of Inexpensive Disks) architecture fetches data from, or stores data to, the SCSI bus (a popular disk drive bus) over the PCI bus (a popular system host bus).
- a typical data transfer size from the host to the disk drive is on the order of 4 Kbytes to 64 Kbytes.
- when the application software needs to perform the XOR operation required by the RAID algorithm, a huge amount of data must be transferred over the host bus.
- each access to the hard disk drive is very slow relative to silicon memory access time. This in turn creates a bottleneck on the host bus which in effect slows down overall system performance.
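The cost of a software-only approach can be seen by sketching the RAID parity computation that the host CPU must otherwise perform: every data strip has to cross the host bus so the processor can XOR it. A minimal Python sketch (illustrative only; the strip contents and sizes are made up):

```python
from functools import reduce

def xor_strips(strips: list) -> bytes:
    """XOR equal-length strips byte by byte; in software RAID every one
    of these strips must first be transferred over the host bus."""
    assert len({len(s) for s in strips}) == 1, "strips must be equal length"
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))

# parity for a 2-drive stripe, and recovery of a lost strip
a = bytes([0b1010] * 4)
b = bytes([0b0110] * 4)
parity = xor_strips([a, b])          # would be written to the parity drive
recovered = xor_strips([parity, b])  # equals a: XOR is its own inverse
```

With the architecture of the present invention, this XOR runs in the controller hardware instead, so none of these strips needs to cross the host bus.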
- the memory caching architecture technique according to the present invention creates a substantial improvement on the system host bus, especially for the RAID application system, by implementing an Exclusive-OR (XOR) function in hardware.
- the present invention provides a caching system comprising a host bus; a disk drive bus; an internal cache memory; interfacing means for interfacing between the host bus, the disk drive bus and the cache memory.
- the interfacing means includes a host interface for interfacing with the host bus; a disk drive interface for interfacing with the disk drive bus; XOR function means for interfacing with the cache memory and for providing and loading one or more XOR functions into the cache memory; and a micro-controller means for controlling data flow between the host bus, the disk drive bus and the cache memory, including the XOR functions from the cache memory.
- the present invention includes DMA means for buffering the data transfer between the host bus, the disk drive bus and the cache memory; an internal side bus for interfacing with the XOR function means for providing direct data transfers between the host bus and the disk drive bus; and means for providing simultaneous data transfers between the host bus and the disk drive bus and the cache memory.
- Figure 1 shows a caching architecture block diagram for the present invention.
- Figure 2 shows a caching architecture block diagram with an XOR datapath and an embedded CPU for the present invention.
- Figure 3 shows an XOR function with local memory for the present invention.
- Figure 4 shows a more detailed XOR function with local memory of Figure 3.
- the memory caching architecture technique creates a substantial improvement on the system host bus, especially for the RAID application system, by implementing an exclusive-OR (XOR) function in hardware.
- the present invention introduces a caching technique which resides between the disk drive bus and the host bus.
- the data access time from the host can be significantly reduced when there is a cache hit or the data is in the cache memory.
- data is pulled out from the disk to the cache memory and to the host bus.
- a read-ahead technique can be used in this case as well: more data is read out than required, which gives the host more chances of a cache hit on a subsequent access.
- the minimum size of the cache is equal to the maximum amount of each data transfer from the host multiplied by the total number of disk drives.
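That sizing rule can be written out directly. A small sketch (the 64 Kbyte transfer size and the 5-drive count are example figures consistent with the description, not values fixed by the patent):

```python
def min_cache_bytes(max_host_transfer: int, num_drives: int) -> int:
    # minimum cache size = largest single host transfer x number of drives
    return max_host_transfer * num_drives

size = min_cache_bytes(64 * 1024, 5)  # 5-drive array, 64 Kbyte transfers
```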
- since the XOR operation is extracted out of the software application, the host no longer needs to move the data from the slow disk drive bus onto the host bus.
- the XOR operation can be performed behind the scenes; the RAID software just dispatches the XOR operation to the bus controller.
- the heavy data accesses between the disk drive and the cache memory are totally isolated from the host bus. This boosts overall host bus performance and reduces the software time to perform the XOR operation.
- Figure 1 shows the architecture described above.
- Figure 1 is one example of the RAID application architecture for an embedded system.
- the host bus interface block 102 provides the central communication between the host bus 100 and the host adapter circuitry. In most systems today, the host bus 100 would be a PCI bus.
- the host bus 100 communicates with the hard disk drive bus 120 via the disk drive interface 124.
- This architecture provides direct data transfer from the host bus 100 to the disk drive via disk drive bus 120, or simultaneous transfer from the disk drive bus 120 to both the host bus 100 and the cache memory 130.
- the DMA (Direct Memory Access) datapath 140 is a medium that buffers incoming data and pumps it out at the appropriate time, especially since the speeds of the three buses may not be the same.
- the data flow and traffic between the buses is managed by the micro-controller 110.
- the micro-controller 110 accepts the appropriate command blocks from the RAID software, and the micro-controller 110 microcode decodes and executes the commands such that the controls enable the proper datapath for each specific task.
- the DMA datapath 140 provides enough data buffering on each side of the three buses so that data transfers can operate effectively.
- the proper buffering size is determined by the typical transfer size, the incoming data rate and the outgoing data rate. With this architecture, cache data can be transferred to/from the host bus and the disk drive bus concurrently.
- This effect is achieved by the side bus interface 144 providing a dual datapath from side bus 146 to host bus 100 and to hard disk bus 120. By alternating the transfer between these two datapaths in a reasonable transfer size (such as 512 bytes each time), a concurrent transfer effect from the software point of view can be accomplished.
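The alternating-transfer scheme can be sketched as a simple scheduler that switches between the two datapaths every 512 bytes; the function name and the returned schedule format are invented for illustration:

```python
def interleave(to_host: bytes, to_disk: bytes, chunk: int = 512):
    """Alternate 512-byte bursts between the host-bus and disk-bus
    datapaths so both transfers appear concurrent to software."""
    schedule, h, d = [], 0, 0
    while h < len(to_host) or d < len(to_disk):
        if h < len(to_host):                      # burst toward the host bus
            schedule.append(("host", to_host[h:h + chunk]))
            h += chunk
        if d < len(to_disk):                      # burst toward the disk bus
            schedule.append(("disk", to_disk[d:d + chunk]))
            d += chunk
    return schedule

plan = interleave(bytes(1024), bytes(512))
```

From software's point of view, both transfers make steady progress even though the side bus serves only one burst at a time.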
- One bus that DMA 140 interfaces with is the side bus 146 which communicates with the cache controller and the XOR functional block 150.
- This block 150 provides the capability of doing an XOR function and interfacing with the cache memory 130 via memory bus 154. This capability enables the RAID application to substantially improve the system performance. By performing the XOR function on the side while freeing up the host bus 100 for other tasks, the host bus bandwidth is improved significantly and at the same time, the time for performing the XOR task is reduced.
- the side bus 146 accepts multiple XOR blocks. In this way, expandable cache memory and multiple XOR functions can execute simultaneously. For a high-end server class system, a bigger cache size and multiple XOR functions actively performing simultaneously would be desirable.
- a 64Mbyte transfer is activated from disk drive bus 120 to host bus 100 via host interface 102.
- the data transfers from the disk drive bus 120 to the host bus 100, and at the same time, the same data transfers into the cache memory 130 as well.
- when the host requires fetching the same set of data, the data can be fed to the host in as little as several hundred nanoseconds. Since the data resides in the cache memory 130, the data can be quickly accessed without going through the hard disk, which normally would take up to a few milliseconds.
- the RAID software loads the proper source data into the cache memory 130 and executes the command by properly programming the required internal registers.
- the hardware puts the XOR result into the cache memory 130, and the RAID software gets the data from the cache memory 130 and puts the result onto the disk drive.
- the top-level command is received from the host, and the microprocessor runs the RAID software to decode the command and form a list of lower-level commands, which are usually sent over the host bus 100. These lower-level commands are then sent to the micro-controller for the next level of execution through the internal bus 146.
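The command decomposition can be sketched as follows; the command tuple format, opcode names and strip size are hypothetical, chosen only to mirror the described flow (one host-level command expanded into per-strip commands for the micro-controller plus an XOR dispatch):

```python
def decode_host_write(lba: int, data: bytes, strip: int = 64 * 1024):
    """Expand one top-level host write into lower-level commands:
    one cache write per strip, then an XOR command for the parity."""
    n_strips = -(-len(data) // strip)  # ceiling division
    cmds = [("WRITE_CACHE", lba + i * strip, data[i * strip:(i + 1) * strip])
            for i in range(n_strips)]
    cmds.append(("XOR_PARITY", lba, n_strips))
    return cmds

cmds = decode_host_write(0, bytes(128 * 1024))  # two 64 Kbyte strips
```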
- the XOR function plus datapath 157 for direct cache memory 130 access includes XOR 161 and memory controller 159. This method further reduces the traffic on the host bus interface 102.
- while the XOR hardware is executing, the data in the cache memory is locked up exclusively for this operation.
- the XOR operation can be very lengthy depending on the number of input sources and the size of the data. If there are 8 input sources and each source is 64 Kbytes, the XOR operation would occupy the cache data for a few milliseconds. This is a very long period of time: if the host bus or the disk drive bus needs to access the cache data, there is a long wait before the cache memory is free for access.
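The order of magnitude is easy to check. Assuming, say, a roughly 200 MB/s cache memory bus (the patent does not specify a rate), streaming 8 sources of 64 Kbytes each through the XOR datapath ties up the cache for a few milliseconds:

```python
def xor_busy_seconds(sources: int, strip_bytes: int, bus_bytes_per_s: float) -> float:
    # each source strip must be read once through the XOR datapath
    return sources * strip_bytes / bus_bytes_per_s

t = xor_busy_seconds(8, 64 * 1024, 200e6)  # a couple of milliseconds
```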
- the technique to improve this situation is to implement a dual datapath function 157, as shown in Figure 2.
- One datapath is used for the XOR operation and the other for direct cache memory access. While the XOR datapath 161 is performing the operation, the host bus 100 or the disk drive bus 120 can access the cache data through the second datapath. This technique delays the XOR operation from finishing, but the overall performance improves. Since neither the host bus 100 nor the disk drive bus 120 has to wait for the XOR operation 161 to complete, the RAID application software can continue to process while the XOR operation 161 performs in the background.
- Figure 2 shows the overall caching technique of Figure 1 and where the dual datapaths can fit in to further improve the caching technique.
- the XOR logic 161 continues for another 512 bytes.
- the disk drive request is granted. The transfer begins for the next 512 bytes from the cache memory 130 to the disk drive through disk drive bus 120. Once again the XOR 161 is put on hold. When the data has been pumped out from the cache to the disk drive, the XOR 161 is activated again for another 512 bytes. Once that 512-byte operation is completed, the side bus interface 144 swings back to the host bus transfer. The same process continues until the entire 64 Kbyte transfer is completed on both the host bus 100 and the disk drive bus 120. One may notice that the cache bus is multiplexed among the XOR function 161, the host bus 100 and the disk drive bus 120.
- This capability allows the software to transfer data from/to the cache memory 130 while the lengthy XOR 161 is performing in the background.
- the data access from the cache memory 130 can be performed without waiting for the completion of the XOR operation 161. Therefore, the overall system performance can be greatly improved.
- the cache memory 130 is partitioned into 5 logical units, designated the A, B, C, D and X units. Each of the units A, B, C, D and X is assigned 64 Kbytes, and the host is storing data into the disk. Since there is a cache memory 130, the data will be stored into the cache memory 130 instead. The following is the sequence of operations to perform this task without any local memory.
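Without a local buffer, every intermediate parity result must round-trip through the cache. A sketch counting those cache accesses for X = A ^ B ^ C ^ D (the access-counting model is illustrative, not taken from the patent):

```python
def parity_without_local_memory(cache: dict) -> int:
    """Compute X = A ^ B ^ C ^ D entirely in cache memory, counting
    accesses: each XOR pass reads two units and writes one back."""
    accesses = 0
    def read(unit):
        nonlocal accesses
        accesses += 1
        return cache[unit]
    def write(unit, value):
        nonlocal accesses
        accesses += 1
        cache[unit] = value
    write("X", bytes(p ^ q for p, q in zip(read("A"), read("B"))))
    for unit in ("C", "D"):  # the intermediate result is re-read each pass
        write("X", bytes(p ^ q for p, q in zip(read("X"), read(unit))))
    return accesses

cache = {"A": b"\x01", "B": b"\x02", "C": b"\x04", "D": b"\x08"}
n_accesses = parity_without_local_memory(cache)  # 9 cache accesses
```

With the local memory of Figure 3, the intermediate result would stay in the local buffer, leaving only the four source reads and one final write on the cache bus.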
- each unit in the above example is 64 Kbytes, and this translates into many thousands of cycles depending on the implementation of the memory controller and the memory bus width.
- memory transfer cycles are reduced by a significant amount, which also saves power due to fewer transactions on the bus.
- the technique is to add a local memory to the XOR hardware, shown as block 160 in Figure 3.
- the incoming data from the cache memory 130 or from external sources can be XORed with the local memory data, and the result is stored back into the local memory 160.
- the optimum size of the local memory is equal to the maximum host bus transfer size per transaction, commonly called the strip/segment size. This size may vary from one application to another; the example above uses 64 Kbytes per host transfer as a unit.
- the intermediate XOR 166 result can be stored in the local memory 170, and the datapath can be arranged so that, as data comes in from the local side bus 146, it can go to the cache memory 130 via cache memory controller 159 and to the XOR 166 logic path as well.
- this architecture not only reduces the cache bandwidth consumed by a substantial amount but also completes the XOR transaction in less than half the time.
- the XOR 166 function is performed as the data comes in from the local side bus 146. Since one source is already in the local memory 170 and the other is streaming in, the XOR 166 function is executed on the fly as the data flows through the datapath. The result is written back into the local memory 170, ready to be used again with the next input source.
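The on-the-fly accumulation can be sketched as a running XOR buffer; the class name and interface are invented for illustration:

```python
class XorAccumulator:
    """Model of the local memory 170: each strip is XORed into the
    buffer as it streams past, so every source is read only once."""
    def __init__(self, size: int):
        self.buf = bytearray(size)  # starts at zero, and x ^ 0 == x
    def stream_in(self, strip: bytes) -> None:
        for i, byte in enumerate(strip):
            self.buf[i] ^= byte
    def result(self) -> bytes:
        return bytes(self.buf)

acc = XorAccumulator(1)
for strip in (b"\x01", b"\x02", b"\x04"):
    acc.stream_in(strip)  # parity accumulates as the data flows past
```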
- the local memory 170 embedded in this architecture clearly demonstrates the effectiveness and efficiency of the XOR feature in the RAID application system.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Quality & Reliability (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU14446/01A AU1444601A (en) | 1999-10-28 | 2000-10-27 | Caching techniques for improving system performance in raid applications |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US42914299A | 1999-10-28 | 1999-10-28 | |
US09/429,142 | 1999-10-28 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2001031456A1 true WO2001031456A1 (fr) | 2001-05-03 |
WO2001031456A9 WO2001031456A9 (fr) | 2002-05-10 |
Family
ID=23701974
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2000/029881 WO2001031456A1 (fr) | 1999-10-28 | 2000-10-27 | Techniques de mise en antememoire permettant d'ameliorer le rendement du systeme dans des applications raid (reseau redondant de disques independants) |
Country Status (2)
Country | Link |
---|---|
AU (1) | AU1444601A (fr) |
WO (1) | WO2001031456A1 (fr) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2005066761A2 (fr) * | 2003-12-29 | 2005-07-21 | Intel Corporation | Procede, systeme et programme de gestion d'organisation des donnees |
US7188303B2 (en) | 2003-12-29 | 2007-03-06 | Intel Corporation | Method, system, and program for generating parity data |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4571674A (en) * | 1982-09-27 | 1986-02-18 | International Business Machines Corporation | Peripheral storage system having multiple data transfer rates |
US5206943A (en) * | 1989-11-03 | 1993-04-27 | Compaq Computer Corporation | Disk array controller with parity capabilities |
US5522065A (en) * | 1991-08-30 | 1996-05-28 | Compaq Computer Corporation | Method for performing write operations in a parity fault tolerant disk array |
US5937174A (en) * | 1996-06-28 | 1999-08-10 | Lsi Logic Corporation | Scalable hierarchial memory structure for high data bandwidth raid applications |
-
2000
- 2000-10-27 AU AU14446/01A patent/AU1444601A/en not_active Abandoned
- 2000-10-27 WO PCT/US2000/029881 patent/WO2001031456A1/fr active Application Filing
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4571674A (en) * | 1982-09-27 | 1986-02-18 | International Business Machines Corporation | Peripheral storage system having multiple data transfer rates |
US5206943A (en) * | 1989-11-03 | 1993-04-27 | Compaq Computer Corporation | Disk array controller with parity capabilities |
US5522065A (en) * | 1991-08-30 | 1996-05-28 | Compaq Computer Corporation | Method for performing write operations in a parity fault tolerant disk array |
US5937174A (en) * | 1996-06-28 | 1999-08-10 | Lsi Logic Corporation | Scalable hierarchial memory structure for high data bandwidth raid applications |
Non-Patent Citations (3)
Title |
---|
"Understanding HP AutoRAID part II or II: The AutoRAID storage hierarchy performance considerations", ENTERPRISE STORAGE SOLUTIONS DIVISION, HEWLETT-PACKARD COMPANY, March 1999 (1999-03-01), XP002937542, Retrieved from the Internet <URL:www.enterprisestorage.hp.com/products/disk_array/autoraid/sse_II_disk_array_12h_fa.html> [retrieved on 20001215] * |
MENON JAI ET AL.: "The architecture of a fault-tolerant cached RAID controller", PROCEEDINGS OF THE 20TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, May 1993 (1993-05-01), pages 76 - 86, XP002937543 * |
WILKES JOHN ET AL.: "The HP AutoRAID hierarchical storage system", PROCEEDINGS OF THE FIFTEENTH ACM SYMPOSIUM ON OPERATING SYSTEMS PRINCIPLES, December 1995 (1995-12-01), pages 96 - 108, XP002937544 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2005066761A2 (fr) * | 2003-12-29 | 2005-07-21 | Intel Corporation | Procede, systeme et programme de gestion d'organisation des donnees |
WO2005066761A3 (fr) * | 2003-12-29 | 2006-08-24 | Intel Corp | Procede, systeme et programme de gestion d'organisation des donnees |
US7188303B2 (en) | 2003-12-29 | 2007-03-06 | Intel Corporation | Method, system, and program for generating parity data |
US7206899B2 (en) | 2003-12-29 | 2007-04-17 | Intel Corporation | Method, system, and program for managing data transfer and construction |
Also Published As
Publication number | Publication date |
---|---|
WO2001031456A9 (fr) | 2002-05-10 |
AU1444601A (en) | 2001-05-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6636927B1 (en) | Bridge device for transferring data using master-specific prefetch sizes | |
EP1019835B1 (fr) | Acces direct en memoire (dma) segmente comportant un tampon xor pour sous-systemes de mise en memoire | |
US5649230A (en) | System for transferring data using value in hardware FIFO'S unused data start pointer to update virtual FIFO'S start address pointer for fast context switching | |
US5594877A (en) | System for transferring data onto buses having different widths | |
US6134630A (en) | High-performance bus architecture for disk array system | |
US7225326B2 (en) | Hardware assisted ATA command queuing | |
US5694581A (en) | Concurrent disk array management system implemented with CPU executable extension | |
US5978856A (en) | System and method for reducing latency in layered device driver architectures | |
US20040139286A1 (en) | Method and related apparatus for reordering access requests used to access main memory of a data processing system | |
JPH08504524A (ja) | 順次読取りキャッシュに有利な読み込み/書込みバッファ分割の割当て要求 | |
JP3247075B2 (ja) | パリティブロックの生成装置 | |
US8386727B2 (en) | Supporting interleaved read/write operations from/to multiple target devices | |
US6425053B1 (en) | System and method for zeroing data storage blocks in a raid storage implementation | |
JP2001523860A (ja) | 高性能構造ディスク・アレイ・コントローラ | |
JP3266470B2 (ja) | 強制順序で行う要求毎ライト・スルー・キャッシュを有するデータ処理システム | |
US20030236943A1 (en) | Method and systems for flyby raid parity generation | |
US5459838A (en) | I/O access method for using flags to selectively control data operation between control unit and I/O channel to allow them proceed independently and concurrently | |
EP0618537B1 (fr) | Système et procédé d'entrelacement d'information d'état avec transferts de données dans un adaptateur de communication | |
US6008823A (en) | Method and apparatus for enhancing access to a shared memory | |
JPH03189843A (ja) | データ処理システムおよび方法 | |
WO2001031456A1 (fr) | Techniques de mise en antememoire permettant d'ameliorer le rendement du systeme dans des applications raid (reseau redondant de disques independants) | |
US6513142B1 (en) | System and method for detecting of unchanged parity data | |
US7136972B2 (en) | Apparatus, system, and method for distributed management in a storage system | |
US5946707A (en) | Interleaved burst XOR using a single memory pointer | |
JPH076093A (ja) | 記憶制御装置 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
REG | Reference to national code |
Ref country code: DE Ref legal event code: 8642 |
|
COP | Corrected version of pamphlet |
Free format text: PAGES 1/4-4/4, DRAWINGS, REPLACED BY NEW PAGES 1/4-4/4; DUE TO LATE TRANSMITTAL BY THE RECEIVING OFFICE |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: COMMUNICATION PURSUANT TO RULE 69(1) EPC (EPO FORM 1205A DATED 09-12-2003) |
|
NENP | Non-entry into the national phase |
Ref country code: JP |
|
122 | Ep: pct application non-entry in european phase |