JP2001147886A

JP2001147886A - Device and method for sharing disk time

Info

Publication number: JP2001147886A
Application number: JP32909599A
Authority: JP
Inventors: Yuji Hotta; 勇次堀田; Riichiro Take; 理一郎武; Tadashi Kato; 匡史加藤
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1999-11-19
Filing date: 1999-11-19
Publication date: 2001-05-29
Anticipated expiration: 2019-11-19
Also published as: JP4091225B2

Abstract

PROBLEM TO BE SOLVED: To automatically adjust the operating conditions of time sharing, based on actual performance values, so that the requested performance is satisfied. SOLUTION: An input/output scheduling mechanism 20 groups the sources of input/output to a disk device 16 into input/output groups, defines the ratio of time for which each group may use the disk, and from that ratio determines the allocated time (quantum) for which each group may use the disk device continuously. When input/output requests to the disk device are received from a plurality of input/output groups, the mechanism performs time sharing in which the competing groups use the disk device by switching the allocated times in turn. A tuning unit 52 automatically adjusts the operating conditions of the time sharing according to the requested performance and the actual results.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a disk time-sharing apparatus and method for scheduling the use of a disk device based on a plurality of input/output requests, and in particular to a disk time-sharing apparatus and method that schedule use of the disk device so that the allocated time is switched in turn among competing inputs/outputs.

[0002]

[Prior Art] Conventionally, in a storage system that manages data using disk devices such as hard disk drives, the disk device is configured, for example, as a RAID device. The RAID device is either connected under a disk control device to process input/output from an upper-level host, or connected directly to a server to process input/output from the server OS.

[0003] In such a storage system, when the same disk device must serve both random access, for which a guaranteed response time is required, and sequential access, for which the amount processed per unit time matters, the two are run in separate time slots so that they do not compete. For example, random-access-centered OLTP (On-Line Transaction Processing) work is run against the database on the disk device during the day, and the database is backed up at night after business hours.

[0004]

[Problems to be Solved by the Invention]

1. Resource allocation between random access and sequential access

However, in such storage systems, with the move to non-stop operation, random-access OLTP work must continue even at night, so backups, which are sequential access, must now be executed while random-access OLTP work is in progress.

[0005] With random access alone, one can estimate the IOPS (input/output operations per second) that can be sustained within a given average response time, for example 100 IOPS at 30 ms. With sequential access alone, one can estimate a throughput of, for example, 20 MB/s.

[0006] However, when random access and sequential access occur simultaneously, accepted input/output requests are processed through a FIFO queue, so there is no mechanism to guarantee either the time during which random access can use the disk device or the time during which sequential access can use it.

[0007] For example, suppose that random access at 50 IOPS with a 30 ms average response time and sequential access at 5 MB/s are desired. If sequential access occurs frequently, the sequential throughput rises from 5 MB/s to 10 MB/s even though it does not need to. Conversely, the random-access IOPS that meets the 30 ms average response time falls from 50 IOPS to 25 IOPS, even though this reduction is unwanted.

2. Resource allocation between logical volumes

Conventional storage systems also place data with different performance requirements on different disk devices in order to draw out the respective performance characteristics. For example, data accessed randomly in small amounts, for which a guaranteed response time is required, and data accessed sequentially in large amounts, for which the amount processed per unit time matters, are placed on different disk devices.

[0008] However, as disk capacities grow, data with different performance requirements is increasingly placed on the same disk device. The same problem arises when logical volumes with different performance requirements are placed on the same disk. Conventionally, accepted input/output is scheduled in FIFO order, and there is no mechanism to control the distribution of disk resources among logical volumes. As a result, when input/output to one logical volume occurs frequently, input/output performance for the other logical volumes drops.

[0009] For example, suppose volume A, which should be guaranteed 10 IOPS, and volume B, which should be guaranteed 50 IOPS, are placed on the same disk. If volume A is accessed frequently, the IOPS of volume A rises from 10 IOPS to 20 IOPS even though it does not need to. Conversely, the IOPS of volume B falls from 50 IOPS to 40 IOPS, even though this reduction is unwanted.

3. Resource allocation between normal processing and backup/copy processing

Consider a conventional storage system in which multiple logical volumes exist on the same disk device and backup or copy is performed per logical volume. Conventionally, to limit the impact of backup/copy processing on normal input/output, the pace (interval) of the backup/copy processing is set when the processing is executed.

[0010] However, if a copy of volume B, which is on the same disk device as volume A, is started while volume A is being copied, two copy processes run concurrently on the same disk device, so the impact on normal input/output doubles.

4. Resource allocation between normal processing and rebuilding

In a RAID device, data is made redundant across multiple disk drives, so that even if one disk drive fails, the data can be recovered from the remaining drives. A RAID device can therefore continue normal input/output even when a disk drive fails.

[0011] In addition, data is restored from the remaining disk drives to the replacement disk drive. This recovery process is called rebuilding. Because rebuilding involves input/output to the disk drives that make up the RAID device, it competes with normal input/output for the same disk drives.

[0012] Rebuilding therefore degrades normal input/output performance. For example, in mirrored RAID 1, rebuilding copies data from the single disk drive that remains after a drive failure to the newly installed replacement drive, which generates read input/output on the source drive. These reads make normal input/output wait, and normal input/output performance drops.

[0013] There are two conventional approaches to this problem. The first copies sufficiently small amounts of data at sufficiently long intervals so as not to affect normal input/output. This keeps the impact on normal input/output small, but lengthens the time until rebuilding completes. For example, RAID 1 built from 9 GB disk drives needs around 10 hours.
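As a rough check on the figure above, the average copy rate implied by a 10-hour rebuild of a 9 GB drive can be computed. This is a minimal sketch; the function name and the decimal convention (1 GB = 1000 MB) are assumptions made for illustration and do not come from the source.

```python
def effective_copy_rate_mb_s(capacity_gb: float, hours: float) -> float:
    """Average copy rate in MB/s needed to copy capacity_gb within the given hours.

    Assumes 1 GB = 1000 MB (an illustrative convention, not from the source).
    """
    return capacity_gb * 1000.0 / (hours * 3600.0)


rate = effective_copy_rate_mb_s(9, 10)
print(f"{rate:.2f} MB/s")  # 0.25 MB/s
```

A rate this low, far below the roughly 20 MB/s sequential throughput mentioned earlier, illustrates how heavily the first approach throttles rebuilding.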

[0014] The second approach schedules rebuilding input/output whenever the disk drive is idle, that is, when it is not being used for normal input/output. The problem here is that the time to complete rebuilding cannot be guaranteed: if the disk drive is rarely idle, rebuilding takes a long time.

5. Guaranteeing the maximum response time

In mission-critical work, the maximum response time matters as an input/output performance requirement in addition to the average response time. Recent disk devices have a reordering function that rearranges pending input/output so as to minimize processing time.

[0015] With the reordering function, the disk device selects, from the pending input/output, the one that minimizes the positioning time, defined as the sum of the seek time and the rotational latency, as the next input/output to execute. When requesting input/output from the disk device, the requester specifies a simple task, a task attribute indicating that the input/output may be reordered.

[0016] For input/output specified as simple tasks, the disk device schedules the input/output in the order that minimizes positioning time. This shortens the average processing time for random access. For example, using the reordering function reduces the average random-access processing time from 9 ms to 5 ms.
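The selection rule described above (pick the pending input/output with the smallest positioning time) can be sketched as follows. This is a toy model, not the drive's actual firmware logic: the linear seek model, the constant `SEEK_MS_PER_CYLINDER`, and the precomputed `rotation_wait_ms` field are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Request:
    name: str
    cylinder: int            # target cylinder
    rotation_wait_ms: float  # rotational latency once the head arrives (toy input)


SEEK_MS_PER_CYLINDER = 0.01  # hypothetical linear seek model


def positioning_ms(head_cylinder: int, req: Request) -> float:
    """Positioning time = seek time + rotational latency."""
    seek_ms = abs(req.cylinder - head_cylinder) * SEEK_MS_PER_CYLINDER
    return seek_ms + req.rotation_wait_ms


def pick_next(head_cylinder: int, pending: list) -> Request:
    """Choose the pending request with the smallest positioning time."""
    return min(pending, key=lambda r: positioning_ms(head_cylinder, r))


pending = [
    Request("A", cylinder=100, rotation_wait_ms=1.0),  # 4.0 ms seek + 1.0 ms
    Request("B", cylinder=520, rotation_wait_ms=4.0),  # 0.2 ms seek + 4.0 ms
    Request("C", cylinder=900, rotation_wait_ms=0.5),  # 4.0 ms seek + 0.5 ms
]
print(pick_next(500, pending).name)  # B
```

Because the choice is always the cheapest request from the current head position, a request far from the head can be passed over repeatedly as new nearby requests arrive.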

[0017] The reordering function thus improves disk device throughput, but it increases the maximum response time. Because the input/output with the smallest positioning time is always chosen next, some input/output can remain waiting for a long time without being scheduled.

[0018] To address this phenomenon, the disk device supports, in addition to the simple task that marks input/output as eligible for reordering, an ordered task. When input/output is requested as an ordered task, the disk device first completes all previously accepted input/output that has not yet completed, and then schedules the ordered-task input/output.
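The effect of an ordered task can be sketched as a barrier in the arrival sequence. In this minimal illustration, simple tasks between barriers are reordered by a fixed per-task cost; a real drive recomputes positioning time after every input/output, so the static `cost` table is a simplification. The sketch also assumes that tasks arriving after an ordered task run after it, which the text above does not state explicitly.

```python
def execution_order(arrivals, cost):
    """Return task names in execution order.

    arrivals: list of (name, kind) in arrival order; kind is "simple" or "ordered".
    cost: dict mapping name -> positioning cost (static, for illustration only).
    """
    order, batch = [], []
    for name, kind in arrivals:
        if kind == "ordered":
            # Finish every earlier, still-pending task before the ordered task.
            order.extend(sorted(batch, key=cost.__getitem__))
            batch = []
            order.append(name)
        else:
            batch.append(name)
    order.extend(sorted(batch, key=cost.__getitem__))
    return order


arrivals = [("a", "simple"), ("b", "simple"), ("c", "ordered"),
            ("d", "simple"), ("e", "simple")]
cost = {"a": 5, "b": 2, "c": 9, "d": 4, "e": 1}
print(execution_order(arrivals, cost))  # ['b', 'a', 'c', 'e', 'd']
```

Task "a" has a high cost but cannot be delayed past the ordered task "c", which bounds how long it can wait.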

[0019] Mixing ordered tasks in among simple tasks in this way limits the growth of the maximum input/output response time. However, when considering resource allocation between random access and sequential access, between logical volumes, and between normal processing and backup/copy or rebuilding processing, the challenge is to guarantee the maximum response time while still using simple tasks to improve throughput (IOPS).

[0020] To solve these problems, the present inventors have proposed a disk time-sharing apparatus and method that can guarantee a minimum level of performance when multiple different types of input/output compete for a disk device (Japanese Patent Application No. H11-218757).

[0021] This disk time-sharing apparatus comprises a disk device having one or more disk drives, an input/output request unit that issues input/output requests to the disk device, and an input/output scheduling mechanism. The scheduling mechanism forms input/output groups by grouping the sources of input/output to the disk device, defines the ratio of time for which each group uses the disk device, and from the defined time ratio determines the quanta τ1, τ2, τ3 (allocated times) for which each group can use the disk device 16 continuously. When input/output requests to the disk device have been accepted from a plurality of groups, it performs time sharing in which the competing groups use the disk device by switching among the quanta τ1, τ2, τ3 in turn.
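How quanta might follow from the defined time ratios can be sketched as below. The paragraph above does not give a formula, so this assumes each quantum is the group's share of a fixed cycle length; the function name, the group labels, and the 5:3:2 ratio are hypothetical.

```python
def quanta_ms(cycle_ms: float, ratios: dict) -> dict:
    """Split one time-sharing cycle into per-group quanta, proportional to ratios."""
    total = sum(ratios.values())
    return {group: cycle_ms * share / total for group, share in ratios.items()}


# Three groups sharing a 100 ms cycle in a 5:3:2 ratio (illustrative numbers).
print(quanta_ms(100, {"g1": 5, "g2": 3, "g3": 2}))
# {'g1': 50.0, 'g2': 30.0, 'g3': 20.0}
```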

[0022] Specifically, the input/output scheduling mechanism assigns input/output judged to be sequential access to a sequential-access input/output group and all other input/output to a random-access input/output group, and time-shares the disk device between sequential access and random access.
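The text above does not say how an input/output is judged to be sequential. One common heuristic, shown here purely as an assumption, is to treat a request as sequential when it starts at the block immediately following the previous request from the same source. The class and parameter names are hypothetical.

```python
class AccessClassifier:
    """Classify requests as 'sequential' or 'random' per source (assumed heuristic)."""

    def __init__(self):
        # source id -> block address expected next if the stream is sequential
        self._next_lba = {}

    def classify(self, source: str, lba: int, blocks: int) -> str:
        group = "sequential" if self._next_lba.get(source) == lba else "random"
        self._next_lba[source] = lba + blocks
        return group


c = AccessClassifier()
print(c.classify("host1", 0, 8))     # random (no history yet)
print(c.classify("host1", 8, 8))     # sequential (continues the previous request)
print(c.classify("host1", 1000, 8))  # random (jumps elsewhere)
```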

[0023] Consequently, no matter how many random-access requests occur, the time during which sequential-access input/output can use the disk device is guaranteed, so a minimum level of sequential-access performance can be guaranteed. Likewise, because the time during which random-access input/output can use the disk device is guaranteed, a minimum level of random-access performance can be guaranteed.

[0024] Now suppose that a user who is the system administrator requires the average response time of random-access input/output to be kept to, for example, 30 ms or less. Suppose also that disk time sharing is running with a time-sharing cycle TS of 100 ms and a time ratio between the random quantum and the sequential quantum (hereinafter the "RS ratio") of 90%, where the RS ratio is expressed from the random side and calculated as RS = R / TS.
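The example values above can be worked through directly. This sketch assumes that R in RS = R / TS is the random quantum and that the sequential quantum is the remainder of the cycle; the function name is hypothetical, and the ratio is passed as an integer percentage to keep the arithmetic exact.

```python
def split_cycle(ts_ms: float, rs_percent: int):
    """Return (random quantum, sequential quantum) in ms for a cycle TS and RS ratio."""
    random_ms = ts_ms * rs_percent / 100   # R = RS * TS
    sequential_ms = ts_ms - random_ms      # assumed: the rest of the cycle
    return random_ms, sequential_ms


print(split_cycle(100, 90))  # (90.0, 10.0)
```

With TS = 100 ms and an RS ratio of 90%, random access gets 90 ms of each cycle and sequential access 10 ms.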

[0025] Under normal, relatively light load, a time-sharing cycle TS of 100 ms works well, because both the average response time Ave [ms] and the maximum response time Max [ms] are short. When the load becomes heavy, however, a cycle of TS = 100 ms cannot keep up with the requests, and the average response time Ave deteriorates.

[0026] Setting the time-sharing cycle TS to a longer value in advance, for example 300 ms, suppresses the deterioration of the average response time Ave under heavy load. Conversely, the response time under light load then becomes longer than with TS = 100 ms, and a quick response cannot be obtained.

[0027] An object of the present invention is to provide a disk time-sharing apparatus and method that can automatically adjust, based on actual performance values, the operating conditions of time sharing so as to satisfy the performance required by the user.

[0028]

[Means for Solving the Problems] FIG. 1 illustrates the principle of the present invention. As shown in FIG. 1(A), the present invention is directed to a disk time-sharing apparatus comprising a disk device 16, an input/output request unit 18, and an input/output scheduling mechanism 20.

[0029] Here, the disk device 16 has one or more disk drives, and the input/output request unit 18 issues input/output requests to the disk device. The input/output scheduling mechanism 20 forms input/output groups by grouping the sources of input/output to the disk device, defines the ratio of time for which each group uses the disk, and from the defined time ratio determines the allocated time (quantum) for which each group can use the disk device continuously. When input/output requests to the disk device have been accepted from a plurality of groups, it performs time sharing in which the competing groups use the disk device by switching the allocated times in turn.

[0030] For such a disk time-sharing apparatus, the present invention is characterized by a tuning unit 52 that automatically adjusts the operating conditions of the time sharing according to the required performance and the actual results.

[0031] The disk time-sharing apparatus of the present invention therefore stores actual results (statistical information) such as the average response time, maximum response time, and throughput obtained by simulation or measurement. Based on the load state and the stored results, the tuning unit determines the optimum adjustment values that satisfy the user's required performance, and the operating conditions of the time sharing are automatically adjusted according to these values, so that the user's performance requirements are met appropriately.

[0032] Here, the input/output scheduling mechanism 20 forms at least a random-access input/output group and a sequential-access input/output group as the plurality of input/output groups.

[0033] As shown in FIG. 1(B), the tuning unit 52 comprises a required performance setting unit 56, first to third basic data 62, 64, and 66, and an operating condition determining unit 58. The required performance setting unit 56 sets each of the following as required performance values: (1) the load state IOPS (a measured value or a set value); (2) the average response time Ave [ms] and the maximum response time Max [ms] for the random-access input/output group; and (3) the throughput ThP [MB/s] of the sequential-access input/output group.

[0034] The first basic data 62 stores actual values of the average response time Ave, separated by random-access load (IOPS) and indexed by the time-sharing cycle TS and the RS ratio, which is the allocated-time ratio (quantum ratio) between random access and sequential access.

[0035] The second basic data 64 stores actual values of the maximum response time Max, separated by random-access load (IOPS) and indexed by the time-sharing cycle TS and the RS ratio, which is the allocated-time ratio between random access and sequential access.

[0036] The third basic data 66 stores actual values of the throughput ThP, indexed by the time-sharing cycle TS and the RS ratio, which is the allocated-time ratio between random access and sequential access.
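The three sets of basic data described above can be pictured as lookup tables. The sketch below is one possible in-memory layout, not the format of the basic data file; the class name, method names, and the recorded numbers are placeholders invented for illustration and are not measurements from the source.

```python
from typing import Dict, Tuple

LoadKey = Tuple[int, int, int]   # (load in IOPS, TS in ms, RS ratio in %)
ShareKey = Tuple[int, int]       # (TS in ms, RS ratio in %)


class BasicData:
    """Holds the first, second, and third basic data as dictionaries."""

    def __init__(self):
        self.ave_ms: Dict[LoadKey, float] = {}     # first basic data: Ave
        self.max_ms: Dict[LoadKey, float] = {}     # second basic data: Max
        self.thp_mb_s: Dict[ShareKey, float] = {}  # third basic data: ThP

    def record(self, iops, ts_ms, rs_pct, ave, mx, thp):
        """Store one measured or simulated result."""
        self.ave_ms[(iops, ts_ms, rs_pct)] = ave
        self.max_ms[(iops, ts_ms, rs_pct)] = mx
        self.thp_mb_s[(ts_ms, rs_pct)] = thp


data = BasicData()
data.record(iops=50, ts_ms=100, rs_pct=90, ave=20.0, mx=120.0, thp=2.0)  # placeholder
print(data.ave_ms[(50, 100, 90)], data.thp_mb_s[(100, 90)])  # 20.0 2.0
```

Note that the throughput table has no load dimension, matching the description of the third basic data.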

[0037] Further, the operating condition determining unit 58 determines, as adjustment values, the time-sharing cycle TS and the allocated-time ratio between random access and sequential access (RS ratio) that satisfy the one or more required performance values set by the required performance setting unit, by referring to the random-access load and the first to third basic data, and automatically adjusts the operating conditions of the time sharing.
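The determination step can be sketched as a search over the stored results for settings that meet every requirement. This is a minimal illustration under assumptions: the source does not specify the search procedure, the function name is hypothetical, and all table values are placeholders rather than data from the source.

```python
def feasible_settings(load_iops, req_ave_ms, req_max_ms, req_thp_mb_s,
                      ave_ms, max_ms, thp_mb_s):
    """Return sorted (TS ms, RS %) pairs that satisfy all three requirements."""
    found = []
    for (ts, rs), thp in thp_mb_s.items():
        key = (load_iops, ts, rs)
        if key not in ave_ms or key not in max_ms:
            continue  # no recorded result for this load and setting
        if (ave_ms[key] <= req_ave_ms and max_ms[key] <= req_max_ms
                and thp >= req_thp_mb_s):
            found.append((ts, rs))
    return sorted(found)


# Placeholder tables keyed as (IOPS, TS, RS) and (TS, RS).
ave = {(50, 100, 90): 20.0, (50, 100, 70): 28.0, (50, 300, 70): 35.0}
mx = {(50, 100, 90): 120.0, (50, 100, 70): 150.0, (50, 300, 70): 400.0}
thp = {(100, 90): 2.0, (100, 70): 5.5, (300, 70): 6.0}

print(feasible_settings(50, 30.0, 200.0, 5.0, ave, mx, thp))  # [(100, 70)]
```

An empty result corresponds to the case where not every required value can be achieved.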

[0038] When a required performance value cannot be achieved, the tuning unit 52 assigns priorities to the types of required performance and automatically adjusts the operating conditions using one of the following modes.
(1) First mode: when the lower-priority requirement cannot be achieved within the range of settings that achieves the higher-priority requirement, the adjustment values are determined without considering the lower-priority requirement.
(2) Second mode: even when the lower-priority requirement cannot be achieved within the range of settings that achieves the higher-priority requirement, the adjustment values are determined taking the lower-priority requirement into account.
(3) Third mode: when the lower-priority requirement cannot be achieved within the range of settings that achieves the higher-priority requirement, the adjustment values that give the best lower-priority performance are selected from within the higher-priority range of settings.
(4) Fourth mode: when the lower-priority requirement cannot be achieved within the range of settings that achieves the higher-priority requirement, several candidates that improve the lower-priority performance are selected from the higher-priority range of settings, and from the selected candidates the adjustment values that give the best higher-priority performance are selected.
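As an illustration of how such a mode could work, the sketch below implements only the third mode, with the average response time as the higher-priority requirement and the sequential throughput as the lower-priority one. The priority assignment, the function name, and all numbers are assumptions for illustration; the source does not fix them.

```python
# (TS ms, RS %) -> (average response ms, sequential throughput MB/s); placeholder values
results = {
    (100, 90): (20.0, 2.0),
    (100, 70): (28.0, 5.5),
    (300, 50): (45.0, 9.0),
}


def choose_third_mode(results, max_ave_ms, min_thp_mb_s):
    """Pick a setting that meets the higher-priority requirement (average response)
    and, within that range, gives the best lower-priority performance (throughput)."""
    upper_ok = {k: v for k, v in results.items() if v[0] <= max_ave_ms}
    if not upper_ok:
        return None  # even the higher-priority requirement cannot be met
    both_ok = {k: v for k, v in upper_ok.items() if v[1] >= min_thp_mb_s}
    pool = both_ok or upper_ok  # fall back when the lower requirement is unreachable
    return max(pool, key=lambda k: pool[k][1])


# Throughput of 8 MB/s is unreachable while keeping Ave <= 30 ms,
# so the best available throughput within that range is chosen.
print(choose_third_mode(results, max_ave_ms=30.0, min_thp_mb_s=8.0))  # (100, 70)
```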

[0039] By prioritizing the required performance items when determining the adjustment values in this way, the apparatus is tuned automatically so that, even if not all requirements can be met, the high-priority requirements that the user cares most about are satisfied, and the user's requirements are reflected appropriately.

[0040] The present invention also provides a disk time-sharing method for an apparatus comprising a disk device having one or more disk drives, an input/output request unit that issues input/output requests to the disk device, and an input/output scheduling mechanism that schedules use of the disk device based on the input/output. The method forms input/output groups by grouping the sources of input/output to the disk device, defines the ratio of time for which each group uses the disk, and from the defined time ratio determines the allocated time (quantum) for which each group can use the disk device continuously. When input/output requests to the disk device have been accepted from a plurality of groups, the method performs time sharing in which the competing groups use the disk device by switching the allocated times in turn, and further automatically adjusts the operating conditions of the time sharing according to the required performance and the actual results.

[0041] The details of this method are basically the same as those of the apparatus.

[0042]

[Embodiments of the Invention] FIG. 2 is a block diagram of a storage system to which the present invention is applied. In FIG. 2, the storage system comprises a device control device 12, an array disk device 14, and a disk device 16. Hosts 10-1 to 10-n are connected to the device control device 12, and applications on the hosts 10-1 to 10-n issue input/output requests to the device control device 12.

[0043] The array disk device 14 accepts input/output requests from the device control device 12 and issues the accepted requests to the disk device 16. The disk time-sharing apparatus of the present invention consists of an input/output request unit 18 and a disk input/output scheduling mechanism 20 provided in the array disk device 14, and a disk input/output processing unit 22 and disk drives 24-1 to 24-n provided in the disk device 16.

[0044] When the disk drives 24-1 to 24-n in the disk device 16 form a RAID configuration, the array disk device 14 is additionally provided with a RAID control mechanism.

[0045] Further, the disk time-sharing apparatus of the present invention provides a tuning mechanism 50 for the disk input/output scheduling mechanism 20 of the array disk device 14. The tuning mechanism 50 comprises a tuning unit 52 and a basic data file 54.

[0046] The tuning unit 52 automatically adjusts the operating conditions of time sharing in the disk input/output scheduling mechanism 20, based on the load and the basic data, so as to satisfy the performance requested by the user.

[0047] FIG. 3 is a block diagram of a basic embodiment of the time-sharing apparatus of the present invention applied to the storage system of FIG. 2, taking a RAID-configured disk device as an example.

[0048] In FIG. 3, the array disk device 14 comprises the input/output request unit 18, a RAID control unit 26, and the disk input/output scheduling mechanism 20. The disk device 16 is provided with the disk input/output processing unit 22, to which two disk drives 24-1 and 24-2 in, for example, a RAID 1 (mirrored disk) configuration are connected.

[0049] The disk time-sharing apparatus of the present invention groups the input/output requests from the input/output request unit 18 to the disk device 16 into input/output groups, defines the ratio of time for which each group uses the disk device 16, and from the defined time ratio determines the quantum (allocated time) for which each group can use the disk device continuously. When requests have been accepted from a plurality of groups, it schedules use of the disk device 16 by switching the quanta in turn among the competing groups.

【００５０】またひとつの入出力グループからのみ入出
力の依頼のある場合は、ひとつの入出力グループからの
入出力に対しディスク装置１６を連続して使用可能とす
るスケジューリングを行う。When there is an input / output request from only one input / output group, scheduling is performed so that the disk device 16 can be continuously used for input / output from one input / output group.

【００５１】このような本発明のディスク・タイムシェ
アリング処理を実現する図３の各部の構成及び機能を更
に詳細に説明すると次のようになる。入出力要求部１８
は、例えば図２に示した上位のデバイス制御装置１２か
らのコマンドに基づきディスク装置１６に対する入出力
要求をＲＡＩＤ制御部２６を介してディスク入出力スケ
ジュール機構２０に発行する。ＲＡＩＤ制御部は、依頼
された論理入出力要求を物理入出力要求に変換する処理
を主に行う。The configuration and function of each unit in FIG. 3 that implements this disk time sharing process are described in more detail below. The input/output request unit 18 issues input/output requests for the disk device 16 to the disk input/output schedule mechanism 20 via the RAID control unit 26, based on, for example, commands from the higher-level device control device 12 shown in FIG. 2. The RAID control unit mainly converts the requested logical input/output requests into physical input/output requests.

【００５２】ディスク入出力スケジュール機構２０に
は、ディスク・タイムシェアリング制御情報３０−１，
３０−２、入出力スケジュール部３２、入出力要求受付
部３４、及び入出力完了処理部３６が設けられる。ディ
スク・タイムシェアリング制御情報３０−１，３０−２
は、ディスク装置１６に設けているディスクドライブ２
４−１，２４−２単位に設けられる。The disk input/output schedule mechanism 20 is provided with disk time sharing control information 30-1 and 30-2, an input/output schedule unit 32, an input/output request receiving unit 34, and an input/output completion processing unit 36. The disk time sharing control information 30-1 and 30-2 is provided for each of the disk drives 24-1 and 24-2 in the disk device 16.

【００５３】入出力スケジュール部３２は、ディスクド
ライブ２４−１，２４−２単位に設けられたディスク・
タイムシェアリング制御情報３０−１，３０−２を参照
及び更新してディスク・タイムシェアリングを行う。The input/output schedule unit 32 performs disk time sharing by referring to and updating the disk time sharing control information 30-1 and 30-2 provided for each of the disk drives 24-1 and 24-2.

【００５４】ここでディスク・タイムシェアリング制御
情報３０−１について説明すると、この実施形態にあっ
ては入出力グループをＧ１，Ｇ２，Ｇ３の３つに分けて
定義した場合を例にとっており、入出力グループＧ１〜
Ｇ３に対応してスケジュール待ちグループキュー３８−
１，３８−２が設けられる。このスケジュール待ちグル
ープキュー３８−１〜３８−３には、入出力要求受付部
３４で受けつけた入出力要求がキューを構成するＦＩＦ
Ｏに格納することで並ぶ。The disk time sharing control information 30-1 is described here. This embodiment takes as an example the case where the input/outputs are divided into three input/output groups G1, G2 and G3, and schedule waiting group queues 38-1 to 38-3 are provided corresponding to the groups G1 to G3. Input/output requests accepted by the input/output request receiving unit 34 are lined up in the schedule waiting group queues 38-1 to 38-3 by being stored in the FIFO that constitutes each queue.

【００５５】また入出力グループＧ１〜Ｇ３に対応して
完了待ちグループキュー４０−１，４０−２，４０−３
が設けられる。完了待ちグループキュー４０−１，４０
−３には、ディスク装置１６への入出力依頼が完了し、
ディスク装置１６から入出力完了応答を受けていない入
出力要求がキューを構成するＦＩＦＯに格納することで
並んでいる。Completion waiting group queues 40-1, 40-2 and 40-3 are also provided corresponding to the input/output groups G1 to G3. The completion waiting group queues 40-1 to 40-3 hold, in the FIFO that constitutes each queue, the input/output requests that have been issued to the disk device 16 but for which no input/output completion response has yet been received from the disk device 16.

【００５６】更に入出力グループＧ１〜Ｇ３に対応して
グループ用クォンタム４２−１，４２−２，４２−３が
設けられる。このグループ用クォンタム４２−１〜４２
−３には、入出力グループＧ１〜Ｇ３がディスク装置１
６を使用する時間の比率α１，α２，α３を予め定義
し、この定義された比率α１，α２，α３に基づき、そ
れぞれの入出力グループＧ１〜Ｇ３が連続してディスク
装置を使用できる割当時間となるクォンタムτ１，τ
２，τ３を決定して格納している。Group quanta 42-1, 42-2 and 42-3 are further provided corresponding to the input/output groups G1 to G3. The ratios α1, α2 and α3 of the time for which the groups G1 to G3 use the disk device 16 are defined in advance, and the quanta τ1, τ2 and τ3, that is, the allocation times during which each group may use the disk device continuously, are determined from these ratios and stored in the group quanta 42-1 to 42-3.

【００５７】例えば１回のタイムシェアリングを行なう
タイムシェアリング周期をＴＳとすると、入出力グルー
プＧ１〜Ｇ３のクォンタムτ１〜τ３は次式で定義され
る。For example, if the time sharing period for one round of time sharing is TS, the quanta τ1 to τ3 of the input/output groups G1 to G3 are defined by the following equations.

【００５８】τ１＝α１・ＴＳ τ２＝α２・ＴＳ τ３＝α３・ＴＳこのような入出力グループＧ１〜Ｇ３のディスク装置１
６の使用を決めるクォンタムτ１〜τ３の適正値は次の
ようにして決める。まずクォンタムは値を小さくしすぎ
るとディスク装置１６の入出力処理時間に近くなり、ポ
ジショニング時間を最小とするように入出力を選択する
リ・オーダリングの効果が小さくなり、全体の入出力性
能が低下する。
τ1 = α1 · TS
τ2 = α2 · TS
τ3 = α3 · TS
Appropriate values of the quanta τ1 to τ3, which govern the use of the disk device 16 by the input/output groups G1 to G3, are determined as follows. If a quantum is made too small, it approaches the input/output processing time of the disk device 16, the reordering that selects input/outputs so as to minimize positioning time becomes less effective, and overall input/output performance falls.
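The quantum definition above can be sketched as a small calculation. This is only an illustration: the function name, the group labels, the ratios, and the 400 ms period are hypothetical values chosen for the example, not figures from the specification.

```python
def quanta_from_ratios(ratios, ts_ms):
    """Return the quantum of each I/O group in ms: tau_i = alpha_i * TS."""
    return {group: alpha * ts_ms for group, alpha in ratios.items()}


# Hypothetical time ratios alpha1..alpha3 and a hypothetical period TS = 400 ms.
quanta = quanta_from_ratios({"G1": 0.5, "G2": 0.3, "G3": 0.2}, ts_ms=400)
```

With these assumed inputs each quantum falls in the range of tens to hundreds of milliseconds, so a scheduler built this way would only need to change the ratios or TS to retune the shares.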

【００５９】逆にクォンタムの値が大きすぎると、他の
入出力グループに切り替えるクォンタムの待ち時間が延
びることにより、平均入出力処理時間及び最大入出力処
理時間が延びることになる。例えばクォンタムτ１とク
ォンタムτ２をそれぞれ１時間に設定すると、クォンタ
ムτ１の処理中はクォンタムτ２の入出力を実行できな
いため、クォンタムτ２の入出力はクォンタムτ１の終
了を１時間待つことになる。On the other hand, if the value of the quantum is too large, the waiting time of the quantum for switching to another input / output group increases, so that the average input / output processing time and the maximum input / output processing time increase. For example, if the quantum τ1 and the quantum τ2 are set to one hour, respectively, the input and output of the quantum τ2 cannot be performed during the processing of the quantum τ1, so that the input and output of the quantum τ2 waits for one hour until the quantum τ1 ends.

【００６０】本願発明者の実験によれば、入出力の平均
処理時間が数ｍｓ〜２０ｍｓのディスク装置１６の場
合、クォンタムの値としては数十ｍｓ〜数百ｍｓが望ま
しい。According to the experiment performed by the inventor of the present invention, in the case of the disk device 16 in which the average input / output processing time is several ms to 20 ms, the quantum value is preferably several tens ms to several hundred ms.

【００６１】また本発明のディスク・タイムシェアリン
グにあっては、ランダムアクセスの割当時間となるクォ
ンタムヲＲ、シーケンシャルアクセスの割当時間となる
クォンタムヲＳとした場合、両者のクォンタム比率（割
当時間非理知）をＲＳ比と呼びとして次式で定義する。In the disk time sharing of the present invention, when the quantum serving as the allocation time for random access is R and the quantum serving as the allocation time for sequential access is S, the ratio between the two quanta (the allocation time ratio) is called the RS ratio and is defined by the following equation.

【００６２】ＲＳ比＝Ｒ／（Ｒ＋Ｓ）これはランダムア
クセス側から見たクォンタム比率である。そして、この
ＲＳ比を調整値として可変設定することで、タイムシェ
アリング周期ＴＳ内でのランダムアクセスとシーケンシ
ャルアクセスの割当時間を変更できるようにしている。
RS ratio = R / (R + S)
This is the quantum ratio as seen from the random access side. By setting the RS ratio variably as a tuning value, the time allocated to random access and to sequential access within the time sharing period TS can be changed.
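As a rough illustration of how the RS ratio could be used as a tuning knob, the sketch below splits a shared time budget into R and S. It assumes that R + S equals the budget being split (for example the period TS when only these two access types share it); the function name and the numbers are hypothetical.

```python
def split_by_rs_ratio(budget_ms, rs_ratio):
    """Split a time budget into (R, S) so that R / (R + S) == rs_ratio.

    Assumes R + S == budget_ms, which the source implies but does not
    state as a formula.
    """
    if not 0.0 <= rs_ratio <= 1.0:
        raise ValueError("RS ratio must lie between 0 and 1")
    r = rs_ratio * budget_ms      # quantum for random access
    s = budget_ms - r             # quantum for sequential access
    return r, s


r_ms, s_ms = split_by_rs_ratio(400, 0.25)
```

Raising the ratio shifts time toward random access and lowering it favors sequential access, without changing the overall period.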

【００６３】ここでディスク入出力スケジュール機構２
０でグループ化する入出力としては、例えば次のグルー
プ化がある。Input/outputs can be grouped by the disk input/output schedule mechanism 20 in, for example, the following ways.

【００６４】（１）ランダムアクセスの入出力グループ（２）シーケンシャルアクセスの入出力グループ（３）論理ボリュームによる入出力グループ（４）コピー／バックアップ処理による入出力グループ（５）ＲＡＩＤのリ・ビルディング処理による入出力グ
ループこれにの入出力グループの形成は、入出力要求依頼部１
４にシーケンシャルアクセス検出機構４５、バックアッ
プ検出機構７８及びリビルディング機構８４が設けられ
ていることを前提としている。
(1) Input/output group for random access
(2) Input/output group for sequential access
(3) Input/output group per logical volume
(4) Input/output group for copy/backup processing
(5) Input/output group for RAID rebuilding processing
The formation of these input/output groups presupposes that the input/output request unit 14 is provided with a sequential access detection mechanism 45, a backup detection mechanism 78, and a rebuilding mechanism 84.

【００６５】このため本発明にあっては、例えばランダ
ムアクセス、シーケンシャルアクセス、及びコピー／バ
ックアップ処理、及びリ・ビルディング処理の４つの入
出力グループＧ１〜Ｇ４を形成してタイムシェアリング
周期ＴＳにつき各々にクォンタムを設定してタイムシェ
アリングする。Accordingly, in the present invention, four input/output groups G1 to G4 are formed for, for example, random access, sequential access, copy/backup processing, and rebuilding processing; a quantum is set for each group within the time sharing period TS, and time sharing is performed.

【００６６】また複数の入出力グループを１つにまとめ
てもよい。例えばランダムアクセスとシーケンシャルア
クセスを１つの入出力グループにまとめ、コピー／バッ
クアップ処理を各々独立のグループとして３グループを
形成してもよい。この場合の同じグループに属するラン
ダムアクセスとシーケンシャルアクセスについては、Ｒ
Ｓ比を応じた割当時間をもつ。Several input/output groups may also be merged into one. For example, random access and sequential access may be merged into a single input/output group, with copy processing and backup processing each treated as an independent group, giving three groups in all. In this case, random access and sequential access, which belong to the same group, are given allocation times according to the RS ratio.

【００６７】ディスク・タイムシェアリング制御情報３
０−１には、現クォンタム種別４４、現クォンタム開始
時刻４６、更に次入出力タスク種別４８が設けられる。
この現クォンタム種別４４は、ディスク装置１６のディ
スクドライブ２４−１，２４−２毎に設けられ、現在、
ディスクドライブ２４−１，２４−２を使用している入
出力グループの識別子が設定される。The disk time sharing control information 30-1 is further provided with a current quantum type 44, a current quantum start time 46, and a next input/output task type 48. The current quantum type 44 is provided for each of the disk drives 24-1 and 24-2 of the disk device 16 and holds the identifier of the input/output group currently using that disk drive.

【００６８】現クォンタム開始時刻４６は、ディスク装
置１６のディスクドライブ２４−１，２４−２毎に設け
られ、現在クォンタム種別４４に設定されている現在の
クォンタムが開始した時刻Ｔ0 が設定される。更に次入
出力タスク種別４８は、ディスク装置１６のディスクド
ライブ２４−１，２４−２毎に設けられ、次のディスク
ドライブに対する入出力依頼をシンプル・タスクとする
かオーダード・タスクとするかが設定される。この次入
出力タスク種別４８に設定されるシンプル・タスク又は
オーダード・タスクは、ディスク装置１６におけるリ・
オーダリング機能の効果を十分に生かすために行う。The current quantum start time 46 is provided for each of the disk drives 24-1 and 24-2 of the disk device 16 and holds the time T0 at which the quantum currently set in the current quantum type 44 started. The next input/output task type 48 is likewise provided for each of the disk drives 24-1 and 24-2 and specifies whether the next input/output request to the disk drive is issued as a simple task or as an ordered task. Choosing between a simple task and an ordered task in the next input/output task type 48 serves to make full use of the reordering function of the disk device 16.
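The per-drive control information described in the preceding paragraphs could be modeled roughly as follows. This is a sketch under assumptions: the class and field names are invented for illustration, and the specification does not prescribe any particular data layout.

```python
from collections import deque
from dataclasses import dataclass, field
from enum import Enum


class TaskType(Enum):
    SIMPLE = "simple"      # the drive may reorder the request
    ORDERED = "ordered"    # earlier requests must complete first


@dataclass
class TimeSharingControlInfo:
    """Hypothetical model of control information 30-x for one disk drive."""
    quanta_ms: dict                                       # group id -> quantum (42-x)
    schedule_wait: dict = field(default_factory=dict)     # group id -> FIFO (38-x)
    completion_wait: dict = field(default_factory=dict)   # group id -> FIFO (40-x)
    current_quantum_type: str = ""                        # item 44
    current_quantum_start: float = 0.0                    # item 46 (T0)
    next_task_type: TaskType = TaskType.ORDERED           # item 48

    def __post_init__(self):
        # One FIFO pair per I/O group.
        for group in self.quanta_ms:
            self.schedule_wait.setdefault(group, deque())
            self.completion_wait.setdefault(group, deque())


info = TimeSharingControlInfo({"G1": 200.0, "G2": 120.0, "G3": 80.0})
info.schedule_wait["G1"].append("read block 42")   # accepted, not yet issued
```

Keeping one such object per drive mirrors the statement that the control information exists for each of the disk drives 24-1 and 24-2.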

【００６９】ここでディスク装置１６のリ・オーダリン
グ機能は、ディスクドライブ２４−１又は２４−２のそ
れぞれについて、実行待ちの入出力の中からシーク時間
と回転時間の和で与えられるポジショニング時間を最小
とする入出力を次に実行する入出力として選ぶ機能であ
る。The reordering function of the disk device 16 selects, for each of the disk drives 24-1 and 24-2, from among the input/outputs waiting for execution, the one with the smallest positioning time, given by the sum of the seek time and the rotation time, as the next input/output to execute.
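The selection rule of the reordering function can be illustrated with a toy model. The request names and the seek and rotation estimates below are hypothetical; a real drive derives them internally from head position and rotational phase.

```python
from dataclasses import dataclass


@dataclass
class PendingIO:
    name: str
    seek_ms: float       # estimated seek time (assumed known for the example)
    rotation_ms: float   # estimated rotational delay (assumed known)


def pick_next(pending):
    """Choose the pending I/O with the smallest positioning time."""
    return min(pending, key=lambda io: io.seek_ms + io.rotation_ms)


waiting = [
    PendingIO("a", seek_ms=5.0, rotation_ms=3.0),
    PendingIO("b", seek_ms=1.0, rotation_ms=2.5),
    PendingIO("c", seek_ms=0.5, rotation_ms=4.0),
]
chosen = pick_next(waiting)
```

Because the rule is greedy, a request with a persistently large positioning time can keep losing to newer arrivals.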

【００７０】このようなリ・オーダリング機能を備えた
ディスク装置に入出力を依頼する場合、サンプル・タス
クを指定するとリ・オーダリングの対象としてよいこと
をディスクドライブに通知することになる。このサンプ
ル・タスクを指定した入出力を受けつけたディスクドラ
イブは、ポジショニング時間を最小とするような順番で
入出力をスケジュールする。When an input/output is requested from a disk device that has such a reordering function, specifying a simple task tells the disk drive that the request may be reordered. A disk drive that accepts input/outputs specified as simple tasks schedules them in the order that minimizes positioning time.

【００７１】しかしながら、リ・オーダリング機能は常
にポジショニング時間が最小となる入出力を選択するた
め、ある入出力が長い間待ちのままスケジュールされな
い現象が発生する。この現象を解消するためディスクド
ライブはシンプル・タスクの他にオーダー・タスクの機
能を備えている。オーダー・タスクを指定して入出力を
依頼すると、ディスクドライブはそれまで受け継いでい
た未だ完了していない入出力を全て完了させた後に、オ
ーダード・タスクの入出力をスケジュールする。このた
めシンプル・タスクの間にオーダード・タスクを混ぜる
ことで、入出力の最大応答時間の延長を押さえることが
可能となる。However, because the reordering function always selects the input/output with the smallest positioning time, some input/outputs may be left waiting for a long time without being scheduled. To eliminate this, the disk drive provides an ordered task function in addition to the simple task. When an input/output is requested as an ordered task, the disk drive completes all the input/outputs it has already accepted but not yet completed, and only then schedules the ordered task. Interleaving ordered tasks among simple tasks therefore keeps the maximum input/output response time from growing.

【００７２】本発明のディスク・タイムシェアリング処
理にあっては、クォンタムを切り替えた後の最初の入出
力は、オーダード・タスクを指定してディスク装置１６
に依頼し、クォンタム切り替え前の未だ完了していない
入出力を完了させた後に次のクォンタムの入出力を実行
する。このためクォンタムに切り替えた後の２つ目以降
の入出力についてはシンプル・タスクを指定する。In the disk time sharing process of the present invention, the first input/output after a quantum switch is issued to the disk device 16 as an ordered task, so that the input/outputs left uncompleted from before the switch are completed before the input/outputs of the new quantum are executed. The second and subsequent input/outputs after the switch are issued as simple tasks.

【００７３】またひとつの入出力グループからの入出力
しかない場合には、その入出力グループのスケジュール
を連続するためにクォンタムをリセットしながら繰り返
すことになる。この場合にあってはクォンタムをリセッ
トした直後の最初の入出力はオーダード・タスクで依頼
し、前のクォンタムで完了してない入出力を総て完了し
た後にリセット後のクォンタムの入出力をスケジュール
する。When input/outputs come from only one input/output group, the quantum is repeatedly reset so that scheduling of that group continues. In this case, the first input/output immediately after a quantum reset is issued as an ordered task, so that the input/outputs of the reset quantum are scheduled only after all input/outputs left uncompleted from the previous quantum have completed.

【００７４】これによって複数の入出力グループの入出
力が競合する場合、及びひとつの入出力グループのみの
入出力のみを連続させる場合の最大応答時間の延長を防
止することができる。As a result, it is possible to prevent an increase in the maximum response time when the input / output of a plurality of input / output groups conflicts and when only the input / output of one input / output group is continuous.

【００７５】図４は、図３のディスク入出力スケジュー
ル機構２０に設けている入出力スケジュール部３２によ
るディスク・タイムシェアリングのスケジュールの一例
である。FIG. 4 shows an example of a disk time sharing schedule by the input / output schedule section 32 provided in the disk input / output schedule mechanism 20 of FIG.

【００７６】図４において、３つの入出力グループＧ１
〜Ｇ３について、ディスク・タイムシェアリング制御情
報３０−１のスケジュール待ちグループキュー３８−１
〜３８−３に入出力要求が格納されている競合状態にあ
っては、入出力グループＧ１〜Ｇ３毎に決定されたクォ
ンタム持ち時間τ１，τ２，τ３に従って、グループＧ
１〜Ｇ３の順に各入出力をスケジューリングしてディス
ク装置１６に入出力を依頼する。In FIG. 4, when input/output requests are stored in the schedule waiting group queues 38-1 to 38-3 of the disk time sharing control information 30-1 for all three input/output groups G1 to G3, that is, in the competing state, the input/outputs are scheduled in the order G1, G2, G3 according to the quantum holding times τ1, τ2 and τ3 determined for the respective groups, and are issued to the disk device 16.

【００７７】例えば時刻ｔ０からのクォンタム持ち時間
τ１の間は、入出力グループＧ１の２つの入出力がスケ
ジュールされる。クォンタム切替えは、入出力完了時点
の時刻が現クォンタム切替え時刻を越えた時点で、次の
入出力グループのクォンタムに切替える。この切替えは
次式で判断する。（現在クォオンタムの入出力開始時刻）＜（現クォンタム開始時刻＋クォンタム）（１）即ち、（１）式を満たせば、現クォンタム種別に対応す
る入出力グループＧ１の入出力をディスク装置に依頼
し、満たさない場合は、次の入出力グループＧ２のクォ
ンタムに切替える。For example, during the quantum holding time τ1 starting at time t0, two input/outputs of the input/output group G1 are scheduled. The quantum is switched to that of the next input/output group once the time at which an input/output completes has passed the switching time of the current quantum. This is judged by the following expression.
(input/output start time in the current quantum) < (current quantum start time + quantum) (1)
If expression (1) is satisfied, an input/output of the group G1 corresponding to the current quantum type is issued to the disk device; if not, the quantum is switched to that of the next input/output group G2.
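Expression (1) amounts to a single comparison, sketched here with hypothetical names and millisecond timestamps.

```python
def may_issue_in_current_quantum(io_start_time, quantum_start, quantum):
    """Expression (1): True while the I/O would start inside the current quantum."""
    return io_start_time < quantum_start + quantum


# Hypothetical quantum that starts at t = 100 ms and lasts 100 ms.
inside = may_issue_in_current_quantum(150, quantum_start=100, quantum=100)
at_boundary = may_issue_in_current_quantum(200, quantum_start=100, quantum=100)
```

The strict inequality means a request that would start exactly at the boundary belongs to the next group's quantum.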

【００７８】次の入出力グループＧ２のクォンタム持ち
時間τ２の間には、例えば６つの入出力がスケジュール
されている。更に時刻ｔ２でクォンタム持ち時間τ２が
経過すると、入出力グループＧ３のクォンタム持ち時間
τ３への切り替えが行われ、例えば入出力グループＧ３
の３つの入出力がスケジュールされる。以下同様にクォ
ンタム持ち時間τ１，τ２，τ３を切り替えて、それぞ
れの入出力グループの入出力をスケジュールする。During the quantum holding time τ2 of the next input/output group G2, six input/outputs, for example, are scheduled. When the quantum holding time τ2 has elapsed at time t2, switching to the quantum holding time τ3 of the input/output group G3 takes place, and three input/outputs of the group G3, for example, are scheduled. Thereafter the quantum holding times τ1, τ2 and τ3 are switched in the same way to schedule the input/outputs of each group.

【００７９】図５は、特定の入出力グループの入出力の
みが連続した場合のタイムシェアリング処理の一例であ
る。図５において、時刻ｔ０で入出力グループＧ１のみ
の入出力が図３のスケジュール待ちグループキュー３８
−１に並んでおり、残りの入出力グループＧ２，Ｇ３の
スケジュール待ちキュー３８−２，３８−３は空であっ
たとする。FIG. 5 shows an example of the time sharing process when only the input/outputs of a particular input/output group continue. In FIG. 5, it is assumed that at time t0 only input/outputs of the group G1 are queued in the schedule waiting group queue 38-1 of FIG. 3, and that the schedule waiting queues 38-2 and 38-3 of the remaining groups G2 and G3 are empty.

【００８０】この場合には時刻ｔ０からの入出力グルー
プＧ１のクォンタム持ち時間τ１で入出力グループＧ１
の２つの入出力をスケジュールした後、時刻ｔ１でクォ
ンタム持ち時間τ１をリセットすることで次の同じ入出
力グループＧ１のクォンタムτ持ち時間１をリ・スター
トさせ、例えば３つの入出力をスケジュールする。In this case, after two input/outputs of the group G1 have been scheduled in its quantum holding time τ1 starting at time t0, the quantum holding time τ1 is reset at time t1, which restarts the quantum holding time τ1 of the same group G1, and three input/outputs, for example, are scheduled.

【００８１】このようにひとつの入出力グループの入出
力のみ待ち状態にある時は、そのクォンタムをリセット
することで連続してひとつの入出力グループの入出力を
スケジュールする。As described above, when only the input / output of one input / output group is in the waiting state, the input / output of one input / output group is continuously scheduled by resetting the quantum.

【００８２】更に図５にあっては、時刻ｔ２で３つの入
出力グループＧ１〜Ｇ３の入出力が競合状態となること
で、次のクォンタム持ち時間τ２への切り替えが行われ
る。しかしながら、クォンタム持ち時間τ２において入
出力グループＧ２の入出力が３つしかなく、クォンタム
持ち時間τ２の途中の時刻ｔ３で３つの入出力要求が途
絶えている。Further, in FIG. 5, the input/outputs of the three groups G1 to G3 come into competition at time t2, so switching to the next quantum holding time τ2 takes place. However, the group G2 has only three input/outputs in the quantum holding time τ2, and its input/output requests run out at time t3, partway through τ2.

【００８３】この場合には、例えば入出力グループＧ３
に待ち状態の入出力要求があることから、時刻ｔ３でク
ォンタム持ち時間τ３に切り替え、入出力グループＧ３
の例えば３つの入出力をスケジュールする。In this case, because the group G3, for example, has input/output requests waiting, the quantum is switched to the quantum holding time τ3 at time t3, and three input/outputs of the group G3, for example, are scheduled.
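The group selection behavior shown in FIGS. 4 and 5 can be approximated by a round-robin search over groups that have waiting requests. This is a simplified sketch: the function name and the dictionary of pending counts are assumptions, and the decision of when to call it (quantum expiry or an emptied queue) is left to the caller.

```python
def next_group(current, order, pending):
    """Return the next group to receive a quantum, or None if nothing waits.

    Groups after `current` are tried in round-robin order; `current` itself
    is tried last, which corresponds to resetting its quantum when it is
    the only group with waiting requests.
    """
    start = order.index(current)
    for step in range(1, len(order) + 1):
        candidate = order[(start + step) % len(order)]
        if pending.get(candidate, 0) > 0:
            return candidate
    return None


order = ["G1", "G2", "G3"]
only_g1 = next_group("G1", order, {"G1": 3})             # quantum reset for G1
skip_empty = next_group("G2", order, {"G1": 1, "G3": 2})  # G2 ran dry, go to G3
```

Skipping empty queues keeps the disk busy instead of idling through the quantum of a group that has nothing to issue.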

【００８４】この図４及び図５に示したディスク・タイ
ムシェアリングのスケジュールにおいて、ディスクドラ
イブに対する入出力の依頼は、クォンタムを切り替えた
直後の入出力はオーダード・タスクで依頼し、２回目以
降の次のクォンタム切替えまでの入出力はシンプル・タ
スクで依頼する。In the disk time sharing schedules shown in FIGS. 4 and 5, the input/output issued to the disk drive immediately after a quantum switch is issued as an ordered task, and the second and subsequent input/outputs up to the next quantum switch are issued as simple tasks.

【００８５】このようにディスクドライブ２４−１，２
４−２のリ・オーダリング機能を生かすためには、クォ
ンタムを切り替えた際に現在ディスクトドライブ２４−
１，２４−２に依頼している入出力要求が全て完了する
までの時間を予測し、この予測時間が切り替え後のクォ
ンタム以内であれば、切り替え後にクォンタムの入出力
を依頼し、予測時間が切り替え後のクォンタムを越えて
いた場合には、切り替え後の入出力を依頼せずに次のク
ォンタムへの切り替えを待つようにする。To make use of the reordering function of the disk drives 24-1 and 24-2 in this way, when the quantum is switched, the time until all input/output requests currently issued to the disk drives 24-1 and 24-2 complete is predicted. If the predicted time is within the quantum after switching, input/outputs of that quantum are issued after the switch; if the predicted time exceeds the quantum after switching, no input/outputs are issued after the switch and the scheduler waits for the switch to the next quantum.

【００８６】これはディスク装置１６のリ・オーダリン
グの恩恵を受けるためにはディスク入出力スケジュール
機構２０において、できるだけ多くの入出力をディスク
装置１６に依頼する環境を作るためである。This is to create an environment in which the disk I / O schedule mechanism 20 requests the disk unit 16 for as much I / O as possible in order to benefit from the reordering of the disk unit 16.

【００８７】シンプルタスクを使う場合、ディスク装置
に対して複数の入出力要求を依頼することになる。本発
明のディスクタイムシェリングは、ディスク装置での入
出力処理時間の時分割制御を目的としているので、ディ
スク装置へ入出力要求を依頼する際には、依頼された複
数の要求をディスク装置で処理するのに必要な時間を予
測し、次のクォンタムに切替えた後に切替え後のクォン
タム種別の入出力をディスク装置に投入するか否か判断
する必要がある。When simple tasks are used, several input/output requests are outstanding at the disk device at once. Because the disk time sharing of the present invention aims at time-division control of input/output processing time at the disk device, when issuing input/output requests it is necessary to predict the time the disk device needs to process the requests already issued, and to judge, after switching to the next quantum, whether input/outputs of the new quantum type should be issued to the disk device.

【００８８】このため、クォンタム切替え時に現在ディ
スクドライブに依頼している要求が次のクォンタム内で
完了して新たな入出力要求が投入できるか否かを判断す
るため残り時間τr を次式で算出する。 τr ＝Ｔ0 ＋τ−Ｔw −Ｔnow （２）但し、Ｔ0 はクォンタム開始時刻（予測値） τはクォンタム割当時間Ｔw は未処理Ｉ／Ｏ処理時間（予測値）Ｔnow は現在時刻Ｔ0 ＝Ｔs ＋Ｔw （３）但し、Ｔs は切替え前のクォンタム開始時刻Ｔw ＝Ｎ×Ｔａ（４）但し、Ｎは未処理のＩ／Ｏ数Ｔa はアクセス種別毎によるＩ／Ｏの平均処理時間ここで未処理Ｉ／０とは、ディスク装置に入出力要求を
投入して完了応答が返っていなものをいう。この未処理
Ｉ／Ｏには、本発明の実施形態の場合、前未処理Ｉ／
Ｏ、前々未処理Ｉ／Ｏおよび全未処理Ｉ／Ｏがあり、そ
れぞれ直前のクォンタムの未処理Ｉ／Ｏ、２つ前のクォ
ンタムの未処理Ｉ／Ｏ、および全てのクォンタムを通じ
た未処理Ｉ／Ｏを意味する。Therefore, to judge at quantum switching whether the requests currently issued to the disk drive will complete within the next quantum so that new input/output requests can be issued, the remaining time Tr is calculated by the following equations.
Tr = T0 + τ − Tw − Tnow (2)
where T0 is the quantum start time (predicted value), τ is the quantum allocation time, Tw is the processing time of the unprocessed I/Os (predicted value), and Tnow is the current time.
T0 = Ts + Tw (3)
where Ts is the quantum start time before switching.
Tw = N × Ta (4)
where N is the number of unprocessed I/Os and Ta is the average I/O processing time for each access type.
Here, an unprocessed I/O is an input/output request that has been issued to the disk device but for which no completion response has been returned. In this embodiment, the unprocessed I/Os are classified as previous unprocessed I/Os, pre-previous unprocessed I/Os, and all unprocessed I/Os, meaning respectively the unprocessed I/Os of the immediately preceding quantum, those of the quantum before that, and those across all quanta.

【００８９】またクォンタム開始時刻Ｔ0 も予測値であ
り、前クォンタムの残り時間予測により、現クォンタム
への切替えを判断した際に予測する。この時、ディスク
装置上で前クォンタムの未処理Ｉ／Ｏの処理が全て完了
するのに必要な時間Ｔｗを（４）式で予測し、ディスク
装置上での前クォンタムの終了時刻、即ち、現クォンタ
ムの開始時刻Ｔ0 を（３）式で予測する。The quantum start time T0 is also a predicted value; it is predicted when the switch to the current quantum is decided on the basis of the remaining time prediction for the previous quantum. At that point, the time Tw needed for the disk device to finish all unprocessed I/Os of the previous quantum is predicted by equation (4), and the end time of the previous quantum on the disk device, that is, the start time T0 of the current quantum, is predicted by equation (3).

【００９０】（２）式の残り時間Ｔｒは、ディスク入出
力スケジュール機構が新たな入出力を受け付けた場合、
またはディスク装置から入出力の完了応答を受けた場合
に算出され、残り時間Ｔr がＴｒ＞０あれば、残り時間ありと判断し、現クォンタムの入出力
をディスク装置に投入する。またＴr ≦０であれば、残り時間なしと判断し、クォンタムを切替え
る。The remaining time Tr of equation (2) is calculated when the disk input/output schedule mechanism accepts a new input/output or receives an input/output completion response from the disk device. If Tr > 0, it is judged that time remains, and an input/output of the current quantum is issued to the disk device. If Tr ≤ 0, it is judged that no time remains, and the quantum is switched.
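Equations (2) to (4) and the Tr > 0 test can be put together in a short sketch. The function names and all numeric values are hypothetical, times are in milliseconds, and the example only mirrors the arithmetic of the equations rather than any particular implementation.

```python
def pending_io_time(n_pending, avg_io_time):
    """Equation (4): Tw = N x Ta."""
    return n_pending * avg_io_time


def predicted_quantum_start(ts, tw):
    """Equation (3): T0 = Ts + Tw."""
    return ts + tw


def remaining_time(t0, quantum, tw, now):
    """Equation (2): Tr = T0 + tau - Tw - Tnow."""
    return t0 + quantum - tw - now


def has_time_left(tr):
    """Decision rule: issue another I/O only while Tr > 0."""
    return tr > 0


# Hypothetical scenario: 4 I/Os outstanding at the switch, 10 ms each on average.
t0 = predicted_quantum_start(ts=1000, tw=pending_io_time(4, 10))
tr_early = remaining_time(t0, quantum=100, tw=pending_io_time(1, 10), now=1060)
tr_late = remaining_time(t0, quantum=100, tw=pending_io_time(1, 10), now=1135)
```

Early in the quantum the remaining time is positive and a new request may be issued; near the end it turns negative and the scheduler would switch quanta instead.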

【００９１】ここで前記（２）式のの残り時間Ｔr の算
出に使用するディスクドライブの平均入出力処理時間Ｔ
a の算出方法は、例えば直前のｎ個の入出力処理時間の
平均値とする。この場合ｎは例えばｎ＝１０の有限値で
あってもよし、例えばｎ＝∞つまりシステム始動時から
の総ての入出力についてでもよい。The average input/output processing time Ta of the disk drive used in calculating the remaining time Tr of equation (2) is, for example, the average of the processing times of the most recent n input/outputs. Here n may be a finite value such as n = 10, or n = ∞, that is, all input/outputs since system start-up.
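One simple way to maintain such an average is a bounded window of recent samples. The class below is an illustrative assumption, not the patented mechanism; passing n=None stands in for n = ∞ and keeps every sample, which a long-running system would more likely replace with a running sum and count.

```python
from collections import deque


class AverageIOTime:
    """Average processing time Ta over the most recent n I/Os."""

    def __init__(self, n=None):
        # deque(maxlen=None) is unbounded; otherwise old samples drop out.
        self.samples = deque(maxlen=n)

    def record(self, elapsed_ms):
        self.samples.append(elapsed_ms)

    def value(self):
        if not self.samples:
            return None   # no completed I/O observed yet
        return sum(self.samples) / len(self.samples)


recent = AverageIOTime(n=2)
overall = AverageIOTime()
for elapsed in (10, 20, 30):
    recent.record(elapsed)
    overall.record(elapsed)
```

One such tracker per input/output group, or a single shared one, would correspond to the two averaging options the text goes on to mention.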

【００９２】更に入出力処理時間の平均値の算出につい
ては、入出力グループ毎に平均値を算出する方法と、全
ての入出力グループの平均値を算出する方法のいずれか
とすることができる。Further, the calculation of the average value of the input / output processing time can be either a method of calculating the average value for each input / output group or a method of calculating the average value of all the input / output groups.

【００９３】一方、大量データをアクセスする場合、ポ
ジショニング時間がデータ転送時間に比較して短いた
め、アクセスするデータ量とディスクドライブの転送能
力から平均入出力処理時間Ｔa を予測する。この場合、
ポジショニング時間はリ・オーダリング機能の恩恵をど
の程度受けられるか、即ちその時のディスクドライブで
のリ・オーダリング対象の入出力の数、個々の入出力要
求のアドレスの分散具合などによって違ってくるが、大
量データアクセスの場合、処理時間に占めるポジショニ
ング時間の割合が小さいため、この場合には処理時間を（平均ポジショニング時間）＋（データ転送時間）と予測する。When a large amount of data is accessed, on the other hand, the positioning time is short compared with the data transfer time, so the average input/output processing time Ta is predicted from the amount of data accessed and the transfer capability of the disk drive. The positioning time varies with how much the request benefits from the reordering function, that is, with the number of input/outputs subject to reordering in the disk drive at that moment, how widely the addresses of the individual requests are scattered, and so on. For large data accesses, however, positioning accounts for only a small share of the processing time, so in this case the processing time is predicted as (average positioning time) + (data transfer time).

【００９４】例えば転送速度が２０ＭＢ／ｓ、平均回転
待ち時間が３ｍｓ、平均シーク時間が５ｍｓのディスク
ドライブで１ＭＢのデータをアクセスする場合、平均ポ
ジショニング時間が８ｍｓに対し、転送時間は５２ｍｓ
なので、処理時間は両者を加えた６０ｍｓとする。For example, when 1 MB of data is accessed on a disk drive with a transfer rate of 20 MB/s, an average rotational latency of 3 ms, and an average seek time of 5 ms, the average positioning time is 8 ms and the transfer time is 52 ms, so the processing time is taken as their sum, 60 ms.
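The worked example can be reproduced with a few lines of arithmetic. The sketch assumes that 1 MB means 2**20 bytes and that 20 MB/s means 20 × 10^6 bytes per second; this pairing is a guess that happens to yield the 52 ms transfer time quoted in the text, and the function name is hypothetical.

```python
def large_io_time_ms(size_bytes, rate_bytes_per_s, avg_seek_ms, avg_rotation_ms):
    """Predicted time = average positioning time + data transfer time."""
    positioning_ms = avg_seek_ms + avg_rotation_ms
    transfer_ms = size_bytes / rate_bytes_per_s * 1000.0
    return positioning_ms + transfer_ms


# Assumed units: 1 MB = 2**20 bytes, 20 MB/s = 20e6 bytes per second.
estimate_ms = large_io_time_ms(2**20, 20e6, avg_seek_ms=5, avg_rotation_ms=3)
```

Under these assumptions the estimate comes to about 60.4 ms, of which 8 ms is positioning, in line with the rounded 60 ms figure.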

【００９５】図６は、クォンタム切替え時の残り時間予
測の例であり、シーケンシャル・クォンタムとランダム
・クォンタムを交互に繰り返す場合について、図６
（Ａ）〜（Ｊ）と時間が経過する場合の例である。FIG. 6 shows an example of remaining time prediction at quantum switching, for the case where sequential quanta and random quanta alternate, with time advancing from FIG. 6(A) to FIG. 6(J).

【００９６】図６（Ａ）（Ｂ）は、ランダム・クォンタ
ムからシーケンシャル・クォンタムへの切替え時に、次
のクォンタム開始時刻Ｔ0 を予測する例である。ランダ
ム・クォンタムに切替っている現在時刻Ｔnow で、ラン
ダム・クォンタムの残り時間不足になったとする。この
とき、前クォンタムのシーケンシャルＩ／Ｏが１要求、
現クォンタムのランダムＩ／Ｏが３要求の完了応答が返
ってきておらず、ディスク装置で処理中である。FIGS. 6(A) and 6(B) show an example of predicting the next quantum start time T0 when switching from a random quantum to a sequential quantum. Suppose that at the current time Tnow, with the random quantum in effect, the remaining time of the random quantum runs out. At this point, completion responses have not yet been returned for one sequential I/O request of the previous quantum and three random I/O requests of the current quantum, and these are being processed by the disk device.

【００９７】この場合、図６（Ｂ）のように、ディスク
装置に投入している処理中Ｉ／Ｏが全て完了するまでの
時間Ｔw1を（４）式で予測し、（２）式より次のシーケ
ンシャル・クォンタムの開始時刻Ｔ0 を決定する。また
クォンタムをシーケンシャル・クォンタムに切替える。In this case, as shown in FIG. 6(B), the time Tw1 until all I/Os in progress at the disk device complete is predicted by equation (4), and the start time T0 of the next sequential quantum is determined from equation (2). The quantum is then switched to the sequential quantum.

【００９８】図６（Ｃ）〜（Ｆ）は、残り時間予測で、
残り時間ありと判断する例である。図６（Ｂ）でシーケ
ンシャル・クォンタムに切替わった後、ディスク入出力
スケジュール機構がシーケンシャルＩ／Ｏを現在時刻Ｔ
now で１要求受け付けたとする。この時、ディスク装置
に依頼しているＩ／Ｏ要求で完了応答が返ってきていな
い未処理Ｉ／Ｏとして、ランダムＩ／Ｏの１要求があ
る。即、ディスク装置は、前クォンタムのランダムＩ／
Ｏの１要求を処理中である。FIGS. 6C to 6F show remaining time predictions.
This is an example in which it is determined that there is remaining time. After switching to the sequential quantum in FIG. 6B, the disk I / O scheduling mechanism changes the sequential I / O to the current time T.
Assume that one request has been received by now. At this time, there is one random I / O request as an unprocessed I / O for which a completion response has not been returned in the I / O request requested to the disk device. Immediately, the disk device uses the random I / O of the previous quantum.
One request of O is being processed.

【００９９】この場合、図６（Ｅ）のように、ランダム
Ｉ／Ｏの１要求をディスク装置で完了するまでの時間Ｔ
w2を（４）式で予測し、図６（Ｂ）で求めたクォンタム
開始時刻Ｔ0 を使用して（２）式より残り時間Ｔr2を図
６（Ｆ）のように求める。この場合、Ｔr2＞０であるこ
とから、シーケシンャルＩ／Ｏをディスク装置に投入す
ることができる。In this case, as shown in FIG. 6(E), the time Tw2 until the disk device completes the one random I/O request is predicted by equation (4), and the remaining time Tr2 is obtained from equation (2) as shown in FIG. 6(F), using the quantum start time T0 obtained in FIG. 6(B). Since Tr2 > 0, the sequential I/O can be issued to the disk device.

【０１００】図６（Ｇ）〜（Ｊ）は、残り時間予測で、
残り時間なしと判断する例である。さらに時間が進み、
図６（Ｇ）の現在時刻Ｔnow でディスク入出力スケジュ
ール機構がシーケンシャルＩ／Ｏを１要求受け付けたと
する。この時、ディスク装置に依頼しているＩ／Ｏ要求
で完了応答が返ってきていない未処理Ｉ／Ｏとして、の
シーケンシャルＩ／Ｏの１要求がある。即、ディスク装
置は、現クォンタムのシーケンシャルＩ／Ｏの１要求を
処理中である。FIGS. 6(G) to 6(J) show an example in which the remaining time prediction judges that no time remains. Suppose that time advances further and the disk input/output schedule mechanism accepts one sequential I/O request at the current time Tnow of FIG. 6(G). At this point there is one sequential I/O request outstanding, that is, issued to the disk device with no completion response yet returned. In other words, the disk device is processing one sequential I/O request of the current quantum.

【０１０１】この場合、図６（Ｉ）のように、シーケン
シャルＩ／Ｏの１要求をディスク装置で完了するまでの
時間Ｔw3を（４）式で予測し、図６（Ｂ）で求めたクォ
ンタム開始時刻Ｔ0 を使用して（２）式より残り時間Ｔ
r3を図６（Ｊ）のように求める。この場合、Ｔr3＜０で
あることから、残り時間なしと判断し、次のランダム・
クォンタムに切替える。In this case, as shown in FIG. 6(I), the time Tw3 until the disk device completes the one sequential I/O request is predicted by equation (4), and the remaining time Tr3 is obtained from equation (2) as shown in FIG. 6(J), using the quantum start time T0 obtained in FIG. 6(B). Since Tr3 < 0, it is judged that no time remains, and the quantum is switched to the next random quantum.

【０１０２】図７は、図３のディスク入出力スケジュー
ル機構２０に設けた入出力スケジュール部３２による本
発明のディスクタイム・シェアリング制御処理のフロー
チャートである。FIG. 7 is a flowchart of the disk time sharing control processing of the present invention by the input / output schedule unit 32 provided in the disk input / output schedule mechanism 20 of FIG.

【０１０３】この入出力スケジュール部３２によるディ
スクタイム・シェアリング制御処理は、入出力要求受付
部３４で入出力要求部１８より、ある入出力要求を受付
けた際の呼出し、或いは入出力完了処理部３６でディス
ク装置１６に依頼した入出力に対する完了報告があった
ときからの呼出しを受けて動作する。The disk time sharing control process of the input/output schedule unit 32 runs when it is called either by the input/output request receiving unit 34 on accepting an input/output request from the input/output request unit 18, or by the input/output completion processing unit 36 on receiving a completion report for an input/output issued to the disk device 16.

【０１０４】まず図４のスケジュールに示したように、
図３のディスク入出力スケジュール機構２０において、
競合した３つの入出力グループ間でクォンタムを順番に
切替えてディスクドライブ２４−１のタイムシェアリン
グを行なう場合を説明する。First, the case is described in which, as in the schedule of FIG. 4, the disk input/output schedule mechanism 20 of FIG. 3 time-shares the disk drive 24-1 by switching the quantum in turn among three competing input/output groups.

【０１０５】ステップＳ１で現クォンタム種別に設定さ
れているクォンタム識別子ｉ＝１に対応するスケジュー
ル待ちグループキュー３８−１を調べ、待ちの入出力の
有無を判定する。In step S1, the schedule waiting group queue 38-1 corresponding to the quantum identifier i = 1 set for the current quantum type is checked to determine whether there is a waiting input / output.

【０１０６】スケジュール待ちグループキュー３８−１
に待ちの入出力があれば、ステップＳ２に進み、前々ク
ォンタムに未完了の入出力があるか否かチェックする。
いま、クォンタム識別子ｉ＝１が最初のスケジュールで
あるとすると、前々クォンタムに未完了入出力はないこ
とから、ステップＳ３に進み、残り時間Ｔr を（２）式
から予測する。If the schedule waiting group queue 38-1 has a waiting input/output, the process proceeds to step S2 and checks whether the quantum before the previous one has any uncompleted input/outputs. Assuming the quantum identifier i = 1 is the first to be scheduled, that quantum has no uncompleted input/outputs, so the process proceeds to step S3 and predicts the remaining time Tr from equation (2).

【０１０７】Next, step S4 checks whether the remaining time Tr satisfies Tr > 0. If it does, it is determined that time remains and the process proceeds to step S8. In step S8, the input/output at the head of the schedule waiting group queue 38-1 of the current quantum is issued to the disk drive 24-1 via the disk input/output processing unit 22 of the disk device 16, and the task in the next input/output task type information 48 is set to simple.

【０１０８】The process then returns to step S1 and checks whether an input/output is waiting in the schedule waiting group queue 38-1 of the current quantum. If one is waiting, the processing of steps S2, S3, and S8 is repeated. When, as input/outputs are scheduled in this way during the quantum holding time τ1 of the input/output group G1, Tr ≦ 0 is reached in step S3 and it is determined that no time remains, the process proceeds to step S5 and checks whether any input/output is waiting in the schedule waiting group queues 38-2 and 38-3 of the other input/output groups G2 and G3.

【０１０９】If an input/output is waiting in the schedule waiting group queue 38-2 of the next input/output group G2, the process proceeds to step S10, switches to the quantum holding time τ2 of the group G2, and sets the next task to ordered in the next input/output task type information 48. At the same time, the quantum current time T0 is predicted from equation (3), and the predicted T0 is set as the current quantum start time.

【０１１０】This switches from the quantum holding time τ1 of the first input/output group G1 to the quantum holding time τ2 of the next input/output group G2. The process returns to step S1 and handles the group G2 selected by the quantum switch through steps S2, S3, S4, and S8.

【０１１１】Because step S10 has set the next task to ordered, the first input/output after the quantum switch is issued to the disk drive 24-1 with the ordered designation. Once it has been issued, the task in the next input/output task type information 48 is set to simple.

【０１１２】Next, the processing is described for the case shown in FIG. 5, in which input/output requests of one input/output group, for example the group G1, arrive consecutively. When requests of the same group G1 are accepted consecutively, the processing of steps S1 to S4 is repeated in quantum t1 to issue the requests of that group to the disk drive 24-1. When step S4 determines during this time that no time remains, the process proceeds to step S5 and checks whether any input/output is waiting in the schedule waiting group queues 38-2 and 38-3 of the other input/output groups.

【０１１３】If the schedule waiting group queues 38-2 and 38-3 of the other input/output groups G2 and G3 are empty, the process proceeds to step S9, resets the current quantum holding time τ1, and sets the next task to ordered. At the same time, it predicts the quantum current time T0, sets it as the current quantum start time, and returns to step S1. In this case the current quantum type is left unchanged.

【０１１４】As a result, the quantum that follows the reset of the current quantum holding time τ1 again has the holding time τ1, so the same quantum τ1 continues for as long as input/output requests of the group G1 keep arriving.

【０１１５】On the other hand, as shown from time t2 onward in FIG. 5, suppose that requests of the same input/output group G1 arrive consecutively between times t0 and t2, so that quanta t1 and t2 continue by reset, and that input/output requests of the remaining two groups G2 and G3 have been accepted and stored in the schedule waiting group queues 38-2 and 38-3 by time t2. In that case the process switches to the quantum holding time τ2 of the next input/output group G2.

【０１１６】However, if the schedule waiting group queue 38-2 of the group G2 becomes empty partway through the quantum holding time τ2, as at time t3 in FIG. 5, and step S1 determines that nothing is waiting, the process proceeds to step S6 and checks whether any input/output is waiting in the schedule waiting group queues 38-1 and 38-3 of the other quanta.

【０１１７】If another quantum has a waiting input/output, the process proceeds to step S10 and checks all quanta for incomplete input/outputs. If there are none, the process proceeds to step S11, switches to the quantum holding time τ3 of the next input/output group G3, sets the next task to ordered, and predicts and sets the quantum start time T0. After returning to step S1, the first input/output request of the group G3 after the switch is issued to the disk drive 24-1 as ordered through the processing of steps S1 to S4 and S8.
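The quantum switching described in paragraphs 0104 to 0117 can be sketched as a small simulation. This is a simplified illustration, not the flowchart of FIG. 7 itself: the function name `run_time_sharing`, the fixed `service_time`, and the remaining-time rule (quantum start plus holding time minus current time) are assumptions, since equations (2) and (3) are not reproduced in this passage, and the checks for incomplete input/outputs are omitted.

```python
from collections import deque


def run_time_sharing(queues, quanta, service_time, start_group=0):
    """Simulate quantum switching among input/output groups.

    queues:       one deque of request names per group (hypothetical input)
    quanta:       quantum holding time in ms per group (must be positive)
    service_time: assumed fixed time in ms that every input/output takes
    Returns a list of (time_ms, group_index, request, task_type).
    """
    now = 0
    current = start_group
    quantum_start = 0
    next_task = "ordered"      # first input/output of a quantum is ordered
    log = []
    while any(queues):
        # Assumed stand-in for equation (2): time left in the current quantum.
        remaining = quantum_start + quanta[current] - now
        if queues[current] and remaining > 0:
            request = queues[current].popleft()
            log.append((now, current, request, next_task))
            next_task = "simple"   # later input/outputs in the quantum are simple
            now += service_time
            continue
        # Quantum used up or queue empty: look for another group with waiting I/O.
        n = len(queues)
        waiting = [g for g in ((current + k) % n for k in range(1, n)) if queues[g]]
        if waiting:
            current = waiting[0]   # switch to the next group's quantum
        # Otherwise the same group keeps the disk and its quantum is reset.
        quantum_start = now
        next_task = "ordered"
    return log


if __name__ == "__main__":
    demo = run_time_sharing(
        [deque(["a1", "a2", "a3"]), deque(["b1"]), deque()],
        quanta=[20, 10, 10],
        service_time=10,
    )
    for entry in demo:
        print(entry)
```

In this sketch the first input/output after every switch or reset is tagged ordered and the rest are tagged simple, which mirrors the handling of the next input/output task type information 48 described above.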

【０１１８】When input/output groups are formed for sequential access and random access in FIG. 3, the input/output request unit 18 is provided with a sequential access detection mechanism 45. For example, an interface that reports that an input/output is a sequential access detected by the sequential access detection mechanism 45 is added to the input/output request interface to the RAID control unit 26.

【０１１９】The sequential access detection mechanism 45 derives the expected address of the next input/output command from the address and data length contained in an input/output command issued by the host device control device 12 shown in FIG. 2. When the address of the next input/output command matches the predicted address, the mechanism detects a sequential access and passes information such as a flag indicating sequential access through the interface, via the RAID control unit 26, to the input/output request receiving unit 34 of the disk input/output schedule mechanism 20.
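The address prediction described for the sequential access detection mechanism 45 can be illustrated with a short sketch. The class and method names are hypothetical, and addresses and lengths are treated as plain integers in the same unit; the passage does not specify either.

```python
class SequentialAccessDetector:
    """Flags a command as sequential when it starts where the previous one ended."""

    def __init__(self):
        self.expected_address = None  # no prediction before the first command

    def observe(self, address, length):
        """Return True if this command matches the predicted next address."""
        is_sequential = address == self.expected_address
        # Predict the address of the next command from this address and length.
        self.expected_address = address + length
        return is_sequential


if __name__ == "__main__":
    detector = SequentialAccessDetector()
    for addr, size in [(0, 8), (8, 8), (100, 4), (104, 4)]:
        print(addr, size, detector.observe(addr, size))
```

A real detector would likely track several streams at once; this sketch keeps a single prediction to show only the matching rule.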

【０１２０】The input/output request receiving unit 34 can therefore recognize whether an input/output request accepted from the input/output request unit 18 is a sequential access or a random access.

【０１２１】When an input/output group is formed for copy/backup processing, the input/output request unit 18 is provided with a backup mechanism 78. An input/output from the backup mechanism 78 is reported to the RAID control unit 26 through an additional interface for reporting backup input/outputs. When the RAID control unit 26 requests the input/output from the disk input/output schedule mechanism 20, it reports that the input/output is for backup, and disk time sharing is performed for the input/output group of copy/backup processing.

【０１２２】Further, when an input/output group is formed for rebuilding processing, the input/output request unit 18 is provided with a rebuilding mechanism 84. For rebuilding, an interface indicating that an input/output request to the RAID control unit 26 belongs to rebuilding processing is added. When the RAID control unit 26 requests the input/output from the disk input/output schedule mechanism 20, it reports that the input/output is for rebuilding, and disk time sharing is performed for the input/output group of rebuilding processing.

【０１２３】FIG. 8 is a functional block diagram of the tuning mechanism 50 of FIG. 2. In FIG. 8, the tuning mechanism 50 consists of a tuning unit 52 and a basic data file 54. The tuning unit 52 includes a required performance setting unit 56 and an operating condition determining unit 58.

【０１２４】The required performance setting unit 56 accepts from the user the average response Ave and the maximum response Max of random access and the throughput ThP of sequential access. It also obtains the random access load state IOPS observed by the disk input/output schedule mechanism 20 on the array disk device 14 side of FIG. 2, and outputs these values to the operating condition determining unit 58. The random access load state IOPS may instead be supplied directly to the operating condition determining unit 58.

【０１２５】The operating condition determining unit 58 determines, as adjustment values, the time sharing period TS and the RS ratio (the quantum ratio between random access and sequential access) that satisfy the required performance values set by the required performance setting unit 56. It then automatically adjusts the time sharing period TS and the quantum of each group in the disk input/output schedule mechanism 20 of FIG. 2.

【０１２６】The basic data file 54 stores first basic data 62 for the average response, second basic data 64 for the maximum response, and third basic data 66 for the throughput.

【０１２７】FIG. 9 shows the data structure of each set of basic data stored in the basic data file 54 of FIG. 8. The basic data stored in this structure is obtained by simulation or by actual measurement.

【０１２８】FIG. 9(A) shows the first basic data 62 for the average response of random access. The data is stored separately for each random access load IOPS. For example, basic data on the average response is stored for loads IOPS = 100 and 150.

【０１２９】Taking the load IOPS = 100 as an example, the average response time is stored as basic data for each combination of time sharing period TS = 100 ms, 200 ms, 300 ms and RS ratio = 90%, 80%, 70%. Likewise, for the load IOPS = 150, the average response is stored for each combination of the three time sharing periods TS and the three RS ratios.

【０１３０】FIG. 9(B) shows the second basic data 64 for the maximum response of random access. Like the basic data 62 for the average response in FIG. 9(A), it is stored separately for random access loads IOPS = 100 and 150. Each set stores the maximum response for each combination of time sharing period TS = 100 ms, 200 ms, 300 ms and RS ratio = 90%, 80%, 70%.

【０１３１】FIG. 9(C) shows the third basic data 66 for the throughput ThP of sequential access. The throughput is stored for each combination of time sharing period TS and RS ratio, independently of the random access load IOPS.

【０１３２】Next, the tuning processing by the operating condition determining unit 58 of FIG. 8 is described using the basic data of FIG. 9 as an example. Assume that the required performance setting unit 56 specifies the priority order as average response, maximum response, then throughput, and that the user's required values are as follows:
- Average response Ave: 40 ms or less
- Maximum response Max: 80 ms or less
- Throughput ThP: 3.0 MB/s or more
Assume also that the observed random access load at this time is 100 IOPS.

【０１３３】FIG. 10 shows the tuning procedure when this required performance and priority order are set. First, for the average response, which has the highest priority, the data for load IOPS = 100 is extracted from the first basic data 62 of FIG. 9(A) as the first basic data 62A of FIG. 10. From the first basic data 62A, the hatched region in which the average response Ave is at or below the required value of 40 ms is extracted.

【０１３４】Next, the data for random access load IOPS = 100 is extracted from the second basic data 64 of FIG. 9(B) as the second basic data 64A of FIG. 10, and the hatched region that achieves the user's required maximum response of 80 ms or less is obtained. The common area inspection unit 71 then compares the hatched region of the first basic data 62A with that of the second basic data 64A and obtains the hatched region shown in the first common data 68, which achieves the user requirements for both the average response and the maximum response.

【０１３５】Next, for the third basic data 66 of FIG. 9(C) on the throughput, which has the lowest priority, the region that achieves the user's required throughput of 3.0 MB/s or more is obtained, as shown by the hatched portion of the third basic data 66 in FIG. 10.

【０１３６】Finally, the hatched region of the first common data 68, which achieves the user requirements for the average response and the maximum response, is compared with the hatched region of the third basic data 66, which achieves the user requirement for the throughput. Their common part, the region that achieves all of the user requirements for average response, maximum response, and throughput, is obtained as the hatched portion of the second common data 70.

【０１３７】From these results, the combination of time sharing period TS = 300 ms and RS ratio = 90%, which corresponds to the hatched region of the second common data 70, is determined as the adjustment value that achieves all of the user requirements for average response, maximum response, and throughput. It is set in the disk input/output schedule mechanism 20 of FIG. 2, automatically adjusting the operating conditions of the time sharing.

【０１３８】For example, when the disk input/output schedule mechanism 20 is time-sharing two groups, random access and sequential access, the time sharing period is set to TS = 300 ms and, based on the RS ratio = 90%, the random access quantum is set to 270 ms and the sequential access quantum to 30 ms.
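The quantum values above follow directly from the period and the RS ratio. A minimal sketch of that arithmetic, with a hypothetical helper name and integer milliseconds assumed:

```python
def split_quanta(ts_ms, rs_percent):
    """Split a time sharing period into (random, sequential) quanta in ms.

    rs_percent is the share of the period given to random access.
    """
    random_quantum = ts_ms * rs_percent // 100
    sequential_quantum = ts_ms - random_quantum
    return random_quantum, sequential_quantum


if __name__ == "__main__":
    print(split_quanta(300, 90))  # TS = 300 ms, RS ratio = 90%
```

With TS = 300 ms and an RS ratio of 90%, the function returns 270 ms for random access and 30 ms for sequential access, matching the values stated in the paragraph.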

【０１３９】FIG. 11 is a flowchart of the tuning processing of FIG. 8. First, in step S1, the basic data for the required performance items is acquired in descending order of the user's priority. For example, when the priority order is average response, maximum response, then throughput, the basic data for the average response is acquired first.

【０１４０】Next, in step S2, the region of settings that can achieve the user's required value is obtained from the basic data. If step S3 finds that such a region exists, the process proceeds to step S4 and determines whether the basic data currently being processed belongs to the required performance item with the highest priority.

【０１４１】If it is the highest-priority item, it is the first data region and no common region can be determined yet, so the process returns to step S1 and obtains the achievable region for the next required performance item through steps S1 to S3. If it is an item of second or lower priority, the process proceeds to step S5 and takes its common part with the regions already obtained for the earlier items as the new common region.

【０１４２】Next, step S6 checks whether a common region was obtained; if so, the process proceeds to step S7. In step S7, if another user-required performance item remains, the process returns to step S1; otherwise it proceeds to step S8. In step S8, the combination for which the highest-priority required performance item takes its best value is selected from the final common region of the required performance items.
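The procedure of FIG. 11 amounts to intersecting, in priority order, the sets of (TS, RS ratio) cells that meet each requirement and then choosing the cell with the best value for the top-priority item. The sketch below assumes each table is a dictionary keyed by (TS in ms, RS ratio in %). The function name `tune` and all table values are illustrative placeholders, not the contents of FIG. 9.

```python
def tune(requirements):
    """Pick a (ts_ms, rs_percent) cell that meets every requirement.

    requirements: list of (table, limit, higher_is_better) tuples in
    descending priority, where table maps (ts_ms, rs_percent) -> value.
    Returns None when no cell satisfies all requirements.
    """
    common = None
    for table, limit, higher_is_better in requirements:
        achievable = {
            cell for cell, value in table.items()
            if (value >= limit if higher_is_better else value <= limit)
        }
        # First item: nothing to intersect with yet; later items: narrow down.
        common = achievable if common is None else common & achievable
        if not common:
            return None
    top_table, _, top_higher = requirements[0]
    choose = max if top_higher else min
    return choose(sorted(common), key=lambda cell: top_table[cell])


# Hypothetical basic data for load IOPS = 100 (values invented for illustration).
AVG_MS = {(200, 90): 25, (300, 90): 30, (200, 80): 45, (300, 80): 50}
MAX_MS = {(200, 90): 60, (300, 90): 70, (200, 80): 85, (300, 80): 95}
THP_MBS = {(200, 90): 2.4, (300, 90): 3.2, (200, 80): 3.1, (300, 80): 3.8}

if __name__ == "__main__":
    print(tune([(AVG_MS, 40, False), (MAX_MS, 80, False), (THP_MBS, 3.0, True)]))
```

With these placeholder tables only one cell meets all three requirements, so the sketch returns it; when no cell qualifies it returns None, which is the situation handled by the fallback modes.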

【０１４３】In selecting the best combination in step S8, there is no difficulty when one or more combinations of time sharing period TS and RS ratio satisfy all three requirements, for example average response, maximum response, and throughput. When a lower-priority requirement cannot be achieved, however, the adjustment value is determined in one of several modes, for example the first to fourth modes. Taking the case in which the higher-priority average response and maximum response meet their requirements but the lower-priority throughput cannot, the processing of modes 1 to 4 is as follows.

【０１４４】(1) In the first mode, when the lower-priority throughput requirement cannot be achieved within the setting range that achieves the higher-priority average response and maximum response, the adjustment value is determined without considering the throughput.

【０１４５】(2) In the second mode, even when the lower-priority throughput requirement cannot be achieved within the setting range that achieves the higher-priority average response and maximum response, the adjustment value is determined with the throughput requirement taken into account.

【０１４６】(3) In the third mode, when the lower-priority throughput requirement cannot be achieved within the setting range that achieves the higher-priority average response and maximum response, the adjustment value that gives the best throughput is selected from within the common region of the average response and the maximum response.

【０１４７】(4) In the fourth mode, when the lower-priority throughput requirement cannot be achieved within the setting range that achieves the higher-priority average response and maximum response, several candidates with good throughput are selected from the common region of the average response and the maximum response, and from those candidates the adjustment value with the best average response and maximum response is selected.
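The third and fourth modes can be sketched as two small selection functions over the common region of the higher-priority items. The function names, the number of candidates kept in mode 4, and the tie-breaking by (average, maximum) response are assumptions. In the sample tables, only the values for TS = 200 ms / RS = 90% (25 ms, 60 ms, 3.4 MB/s) and the 3.6 MB/s throughput for TS = 300 ms / RS = 90% come from the description of FIG. 13; the remaining values are placeholders.

```python
def pick_mode3(common, throughput):
    """Mode 3: the cell in the common region with the best throughput."""
    return max(sorted(common), key=lambda cell: throughput[cell])


def pick_mode4(common, throughput, avg_ms, max_ms, n_candidates=2):
    """Mode 4: keep the n best-throughput cells, then prefer the best responses."""
    candidates = sorted(common, key=lambda cell: throughput[cell], reverse=True)
    candidates = candidates[:n_candidates]
    # Assumed ranking: lower average response first, then lower maximum response.
    return min(candidates, key=lambda cell: (avg_ms[cell], max_ms[cell]))


# Cells are (TS in ms, RS ratio in %). Entries marked "placeholder" are invented.
COMMON = {(100, 90), (200, 90), (300, 90)}
THP_MBS = {(100, 90): 3.0, (200, 90): 3.4, (300, 90): 3.6}  # (100, 90): placeholder
AVG_MS = {(100, 90): 20, (200, 90): 25, (300, 90): 30}      # 20 and 30: placeholders
MAX_MS = {(100, 90): 50, (200, 90): 60, (300, 90): 70}      # 50 and 70: placeholders

if __name__ == "__main__":
    print(pick_mode3(COMMON, THP_MBS))
    print(pick_mode4(COMMON, THP_MBS, AVG_MS, MAX_MS))
```

Under these assumptions mode 3 picks the highest-throughput cell, while mode 4 first narrows to the two highest-throughput cells and then picks the one with the better responses.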

【０１４８】FIG. 12 shows a specific example of determining the adjustment value when a lower-priority requirement cannot be achieved within the setting range that achieves the higher-priority requirements. The user's required performance is the same as in the case of FIG. 9:
- Average response Ave: 40 ms or less
- Maximum response Max: 80 ms or less
- Throughput ThP: 3.0 MB/s or more

【０１４９】The first basic data 62 for the average response and the second basic data 64 for the maximum response are the same as in FIGS. 9(A) and 9(B). The third basic data 66 for the throughput, however, differs slightly from FIG. 9(C) and is shown as the third basic data 660 of FIG. 12. The difference is that the throughput for the combination of time sharing period TS = 300 ms and RS ratio = 90% is 2.6 MB/s.

【０１５０】In the tuning processing of FIG. 12, as in FIG. 10, the common area inspection unit 71 obtains the first common data from the hatched region of the first basic data 62A, where the average response is 40 ms or less, and the hatched region of the second basic data 64A, where the maximum response is 80 ms or less.

【０１５１】In the third basic data 660 for the throughput, by contrast, the region that satisfies the user's required throughput is the hatched portion. In the second common data 720, obtained when the common area inspection unit 72 compares this with the first common data 68, no region achieves all of the user requirements for average response, maximum response, and throughput.

【０１５２】In this case, mode 1 does not consider the lower-priority throughput requirement, so any one of the hatched common regions of the first common data 68A is selected. In the second mode, the third basic data 660 for the throughput is taken into account when choosing among the hatched common regions of the first common data 68A, and the combination of time sharing period TS = 300 ms and RS ratio = 90% is selected, which corresponds to the common region with the highest throughput, 2.6 MB/s.

【０１５３】In the third mode, among the cells of the third basic data 660 that correspond to the three common regions of the first common data 68A, the combination of time sharing period TS = 300 ms and RS ratio = 90% is selected, which corresponds to 2.6 MB/s, the best throughput available. In this example mode 3 gives the same selection as mode 2.

【０１５４】FIG. 13 illustrates the selection of the adjustment value in mode 4 when a lower-priority requirement cannot be achieved within the setting range that achieves the higher-priority requirements.

【０１５５】In FIG. 13, the values stored in the third basic data 660 for the lowest-priority throughput, for the combinations of time sharing period TS = 200 ms, 300 ms with RS ratio = 90%, are 3.4 MB/s and 3.6 MB/s, which differ from the case of FIG. 12.

【０１５６】The user's required values in the case of FIG. 13 are as follows, with a higher throughput requirement than in FIG. 12:
- Average response Ave: 40 ms or less
- Maximum response Max: 80 ms or less
- Throughput ThP: 4.0 MB/s or more

【０１５７】In this case too, as in FIG. 12, the second common data 720 contains no region that achieves all of the user requirements for average response, maximum response, and throughput. In the fourth mode, several candidate regions with good throughput are then selected from the cells of the third basic data 660 that correspond to the three hatched regions of the first common data 68, the common region of the average response and the maximum response.

【０１５８】Here the two points with throughputs of 3.4 MB/s and 3.6 MB/s are selected. From these two candidates, the one with the best higher-priority average response and maximum response is chosen. That is, the combination of time sharing period TS = 200 ms and RS ratio = 90% is selected, for which the average response is 25 ms in the first basic data 62A and the maximum response is 60 ms in the second basic data 64A.

【０１５９】FIG. 14 is a characteristic diagram of simulation results when the automatic tuning of FIG. 8 is not performed.

【０１６０】In this simulation, time sharing is performed with four quanta, one for each of four kinds of input/output: random access, sequential access, OPC access (copy access), and EC access (error access). Random access and sequential access belong to the same group. OPC access and EC access each belong to an independent group.

【０１６１】In the simulation no sequential input/output requests are issued, so the quantum of sequential access, which belongs to the same group as random access, is used entirely for random access.

[0162] The time ratio of the quanta is (random):(sequential):(OPC):(EC) = 65:5:15:15. Because there are no input/output requests for the sequential quantum, the ratio is effectively (random):(OPC):(EC) = 70:15:15. The time-sharing period TS is 100 ms. The random-access load is changed every 7.5 ms through the sequence 20 IOPS, 100 IOPS, 220 IOPS, 100 IOPS, and this variation is repeated. FIG. 15 is an enlarged view of the portion of FIG. 14 from 0 to 100 ms after the start of the simulation.
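
The arithmetic behind these ratios can be checked with a short sketch. The function names `quanta_ms` and `fold_idle` are hypothetical helpers introduced only for illustration; the ratio 65:5:15:15 and TS = 100 ms come from the text.

```python
def quanta_ms(ts_ms, ratio):
    """Split one time-sharing period into per-type quanta by ratio."""
    total = sum(ratio.values())
    return {name: ts_ms * share / total for name, share in ratio.items()}


def fold_idle(ratio, idle, receiver):
    """Hand the share of an idle access type to another type in its group."""
    folded = dict(ratio)
    folded[receiver] += folded.pop(idle)
    return folded


nominal = {"random": 65, "sequential": 5, "opc": 15, "ec": 15}

# No sequential requests are issued, so the sequential share goes to
# random access, which belongs to the same group.
effective = fold_idle(nominal, idle="sequential", receiver="random")

print(quanta_ms(100, nominal))    # quanta in ms for TS = 100 ms
print(quanta_ms(100, effective))  # effective quanta in ms
```

With TS = 100 ms the ratio values coincide with the quantum lengths in milliseconds, so the effective random-access quantum is 70 ms per period.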

[0163] Without automatic tuning, as in FIGS. 14 and 15, the peak random-access load of 220 IOPS cannot be fully processed. At most, only 200 IOPS of random access can be processed, as shown at portion A.

[0164] The average response (RAve) of random access is around 120 ms when the load is high, as shown at portion B, and in the worst case deteriorates to slightly over 150 ms, as shown at portion C.

[0165] FIG. 16 shows simulation results for the case where automatic tuning according to the present invention is performed, and FIG. 17 is an enlarged view of the vicinity of 0 to 100 ms in FIG. 16.

[0166] The random-access load IOPS in this simulation is applied in the same way as in FIG. 14. The required performance is fixed, but the time-sharing period TS and the time ratio of the quanta vary automatically according to the observed load IOPS. In the automatic tuning settings, the priority of the user requirements is set in the order of average response, maximum response, and throughput, which gives tuning that prioritizes random access.

[0167] As a result, with automatic tuning as in FIGS. 16 and 17, the 220 IOPS random-access load can largely be processed, as shown at portion A. The average response when the load IOPS is high is around 50 ms, as shown at portion B, and even in the worst case is held to slightly over 60 ms, as shown at portion C.

[0168] In the above embodiment, average response, maximum response, and throughput are taken as examples of required performance, but other required performance measures can be set as needed. Also, the priority is set in the order of average response, maximum response, and throughput so as to prioritize random access. Conversely, the priority may be set in the order of throughput, average response, and maximum response so as to prioritize sequential access.
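
How the priority order changes the selected setting can be illustrated with a small sketch. This is not the procedure of the embodiment, which also checks each metric against its required value; it only shows a plain lexicographic comparison under a configurable priority order. All names (`rank_key`, `choose`, the dictionary keys) and all numbers are hypothetical.

```python
# Response times are better when smaller; throughput is better when larger.
LOWER_IS_BETTER = {
    "avg_resp_ms": True,
    "max_resp_ms": True,
    "throughput_mbs": False,
}

RANDOM_PRIORITY = ("avg_resp_ms", "max_resp_ms", "throughput_mbs")
SEQUENTIAL_PRIORITY = ("throughput_mbs", "avg_resp_ms", "max_resp_ms")


def rank_key(setting, priority):
    """Build a tuple that sorts best-first under the given priority order."""
    return tuple(
        setting[m] if LOWER_IS_BETTER[m] else -setting[m] for m in priority
    )


def choose(settings, priority):
    """Return the setting that ranks first under the priority order."""
    return min(settings, key=lambda s: rank_key(s, priority))


# Hypothetical settings; not taken from the basic data of the embodiment.
settings = [
    {"ts_ms": 100, "rs_percent": 95,
     "avg_resp_ms": 22, "max_resp_ms": 55, "throughput_mbs": 2.9},
    {"ts_ms": 300, "rs_percent": 80,
     "avg_resp_ms": 28, "max_resp_ms": 70, "throughput_mbs": 3.6},
]

print(choose(settings, RANDOM_PRIORITY)["ts_ms"])
print(choose(settings, SEQUENTIAL_PRIORITY)["ts_ms"])
```

Under the random-access priority the setting with the shorter response times is chosen, while under the sequential-access priority the setting with the higher throughput is chosen.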

[0169] The present invention includes appropriate modifications that do not impair its objects and advantages, and is not limited by the numerical values shown in the above embodiments.

[0170]

[Effects of the Invention] As described above, according to the present invention, actual results (statistical information) such as load, average response, maximum response, and throughput, obtained by simulation or actual measurement, are stored as basic data. Based on the load state and the actual results in the stored basic data, the tuning unit determines the optimum adjustment values that satisfy the user's required performance, for example the time-sharing period and the quantum ratio between random access and sequential access (RS ratio). The operating conditions of the time sharing are then adjusted automatically based on these adjustment values, so that input/output processing that appropriately meets the user's required performance can be performed.

[Brief description of the drawings]

FIG. 1 is a diagram illustrating the principle of the present invention.

FIG. 2 is a block diagram of a storage system to which the present invention is applied.

FIG. 3 is a functional block diagram of a basic embodiment of the present invention that forms three input/output groups.

FIG. 4 is an explanatory diagram of the schedule of the disk time-sharing processing for the input/output of the three input/output groups of FIG. 3.

FIG. 5 is an explanatory diagram of the schedule of the disk time-sharing processing when input/output from only one input/output group continues.

FIG. 6 is an explanatory diagram of the process of predicting the remaining time at quantum switching.

FIG. 7 is a flowchart of the disk time-sharing processing of FIG. 3.

FIG. 8 is a functional block diagram of the tuning mechanism of FIG. 2.

FIG. 9 is an explanatory diagram of the actual values of average response, maximum response, and throughput stored as basic data in the basic data file of FIG. 8.

FIG. 10 is an explanatory diagram of the tuning processing of FIG. 8, which selects adjustment values according to the priority of the required performance.
FIG. 11 is a flowchart of the tuning processing of FIG. 8.

FIG. 12 is an explanatory diagram of the tuning processing when a lower-priority required performance cannot be achieved.

FIG. 13 is an explanatory diagram of another tuning processing when a lower-priority required performance cannot be achieved.

FIG. 14 is a characteristic diagram of simulation results showing the load IOPS, the average and maximum response of random access, and the execution of copy processing and error processing when tuning is not performed.

FIG. 15 is a partially enlarged view of FIG. 14.

FIG. 16 is a characteristic diagram of simulation results showing the load IOPS, the average and maximum response of random access, and the execution of copy processing and error processing when tuning is performed.

FIG. 17 is a partially enlarged view of FIG. 16.

[Explanation of symbols]

10-1 to 10-m: Host
12: Device control device
14: Array disk device
16: Disk device
18: Input/output request unit
20: Disk input/output schedule mechanism
22: Disk input/output processing unit
24-1 to 24-n: Disk drive
26: RAID control unit
30-1 to 30-4: Disk time-sharing control information
32: Input/output schedule unit
34: Input/output reception unit
36: Input/output completion processing unit
38-1 to 38-3: Schedule-waiting group queue
40-1 to 40-3: Completion-waiting group queue
42-1 to 42-3: Group quantum
44: Current quantum type information
45: Sequential access detection mechanism
46: Current quantum start time
48: Next input/output task type information
50: Tuning mechanism
52: Tuning unit
54: Basic data file
56: Required performance setting unit
58: Operating condition determination unit
62: First basic data
64: Second basic data
66: Third basic data
78: Backup/copy mechanism
84: Rebuilding mechanism

Continued from the front page

(72) Inventor: Tadashi Kato, c/o Fujitsu Limited, 4-1-1 Kamikodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa

F-terms (reference): 5B014 EB04 GD05 GD17 GD23 GD36 HA01 HA04 HA06 HA08 5B065 CA04 CA11 CA16 CA30 CC08 5B082 CA01 EA07 FA16 GA15 JA04

Claims

[Claims]

1. A disk time-sharing device comprising: a disk device having one or more disk drives; an input/output request unit that issues input/output requests to the disk device; and an input/output scheduling mechanism that forms input/output groups by grouping the sources of input/output to the disk device, defines the ratio of time during which each input/output group uses the disk, determines, based on the defined time ratio, the allocation time (quantum) during which each input/output group can use the disk device continuously, and, when input/output requests to the disk device are received from a plurality of input/output groups, performs time sharing in which the competing input/output groups use the disk device by sequentially switching the allocation time, wherein the disk time-sharing device is provided with a tuning unit that automatically adjusts the operating conditions of the time sharing according to required performance and actual results.

2. The disk time-sharing device according to claim 1, wherein the input/output scheduling mechanism forms at least a random-access input/output group and a sequential-access input/output group as the plurality of input/output groups, and the tuning unit comprises: a required performance setting unit that sets, as required performance, the average response time and the maximum response time of the random-access input/output group and the throughput of the sequential-access input/output group; first basic data in which actual values of the average response, obtained separately for each random-access load, are stored in correspondence with the time-sharing period and the allocation time ratio between random access and sequential access; second basic data in which actual values of the maximum response, obtained separately for each random-access load, are stored in correspondence with the time-sharing period and the allocation time ratio between random access and sequential access; third basic data in which actual values of the throughput are stored in correspondence with the time-sharing period and the allocation time ratio between random access and sequential access; and an operating condition determining unit that determines, as adjustment values, the time-sharing period and the allocation time ratio between random access and sequential access that satisfy one or more required performance values set by the required performance setting unit, by referring to the first to third basic data, and automatically adjusts the operating conditions of the time sharing.

3. The disk time-sharing device according to claim 1, wherein, when there is a required performance value that cannot be achieved, the tuning unit assigns priorities to the types of required performance and automatically adjusts the operating conditions by one of: a first mode in which, when a lower-priority required performance cannot be achieved within the setting range in which a higher-priority required performance can be achieved, the adjustment value is determined without considering the lower-priority required performance; a second mode in which, when a lower-priority required performance cannot be achieved within the setting range in which a higher-priority required performance can be achieved, the adjustment value is determined in consideration of the lower-priority required performance; a third mode in which, when a lower-priority required performance cannot be achieved within the setting range in which a higher-priority required performance can be achieved, the adjustment value at which the lower-priority performance is best is selected from the higher-priority setting range; and a fourth mode in which, when a lower-priority required performance cannot be achieved within the setting range in which a higher-priority required performance can be achieved, a plurality of candidates with good lower-priority performance are selected from the higher-priority setting range, and the adjustment value at which the higher-priority performance is best is selected from the selected candidates.

4. A disk time-sharing method for a system having a disk device with one or more disk drives, an input/output request unit that issues input/output requests to the disk device, and an input/output scheduling mechanism that schedules use of the disk device based on the input/output, the method comprising: forming input/output groups by grouping the sources of input/output to the disk device; defining the ratio of time during which each input/output group uses the disk; determining, based on the defined time ratio, the allocation time (quantum) during which each input/output group can use the disk device continuously; when input/output requests to the disk device are received from a plurality of input/output groups, performing time sharing in which the competing input/output groups use the disk device by sequentially switching the allocation time; and further, automatically adjusting the operating conditions of the time sharing according to required performance and actual results.