JP3918077B2

JP3918077B2 - Storage controller control method

Info

Publication number: JP3918077B2
Application number: JP50410999A
Authority: JP
Inventors: 憲司山神; 山本　　彰
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1997-06-20
Filing date: 1997-06-20
Publication date: 2007-05-23
Anticipated expiration: 2017-06-20
Also published as: WO1998059291A1

Description

技術分野
本発明は、中央処理装置と記憶装置との間のデータ転送に介在する制御装置の制御方法に関する。
背景技術
従来の制御装置は、記憶装置のデータを一時的に格納するキャッシュメモリと、キャッシュメモリと中央処理装置との間のデータ転送を実行するホスト側プロセッサと、キャッシュメモリと記憶装置との間のデータ転送を実行する記憶装置側プロセッサと、キャッシュメモリの管理情報を格納する共有メモリから構成される。そして、ホスト側プロセッサと記憶装置側プロセッサの双方が、共有メモリにアクセス可能となっている。
上記の構成では、制御装置内の各プロセッサは、キャッシュメモリ上のデータおよび管理情報を共有している。このため、各プロセッサは、キャッシュメモリを介していずれの記憶装置のデータにもアクセス可能である。また、ホスト側及び記憶装置側の各プロセッサは、独立して動作可能である。つまり、ホスト側プロセッサによるキャッシュメモリへのヒット処理と、記憶装置側プロセッサによる記憶装置へのライト処理は、並列に実行可能である。
従来のように、制御装置に一つの共有メモリ及びキャッシュメモリを搭載する構成では、各プロセッサからのアクセスが共有メモリに集中する。このため、共有メモリへのアクセス制御がボトルネックとなるという問題がある。
本発明の目的は、共有メモリおよびキャッシュメモリへのアクセスネックを解消する記憶制御装置およびその制御方法を提供することにある。
発明の開示
キャッシュメモリは記憶装置上のデータを一時的に保存しておくためのものであり、基本的には記憶装置対応に存在すれば良い。また、共有メモリに格納されたキャッシュ管理情報も同様に、記憶装置対応に存在すれば良い。以上の考えに基づいて、上記目的を達成するために本発明では、各記憶装置毎に、この記憶装置にアクセス可能な１つ以上の記憶装置側プロセッサを一つのグループとして割り当て、キャッシュメモリおよび共有メモリをそれぞれのグループに割り当て、グループ内の各記憶装置側プロセッサが、このグループに割り当てられたキャッシュ管理情報を管理する。これにより、複数のキャッシュメモリおよび複数の共有メモリを同時に動作させることが可能となり、ボトルネックを解消することができる。ただし、このグループ分けは、複数の記憶装置に対するグループ分けであっても良い。
本発明における制御装置では、ホスト側プロセッサは、中央処理装置からアクセス要求のあったデータが格納された記憶装置に基づいて、その記憶装置にアクセス可能なグループ内の記憶装置側プロセッサを特定する。また、特定された記憶装置側プロセッサは、ホスト側プロセッサからのアクセス要求に対して、このグループに割り当てられた共有メモリにアクセスして、目的とするデータがキャッシュ上に存在するかどうかを判定し、キャッシュにデータが存在すれば、データを格納しているキャッシュアドレスをホスト側プロセッサに通知する。この結果、ホスト側プロセッサは、通知されたキャッシュアドレスに対してアクセスする。
以上の処理により、各グループに割り当てられた１つの共有メモリへアクセスするのは、この共有メモリが割り当てられた１つのグループに属する記憶装置側プロセッサのみである。このため、１つの共有メモリに対するアクセス回数を削減できる。この結果、共有メモリに対するアクセスの競合を抑えることが出来る。また、システム内では複数の共有メモリおよび複数のキャッシュメモリを保持しているので、システム全体での共有メモリおよびキャッシュメモリへのアクセス性能を向上させることができる。
【図面の簡単な説明】
第１図は、本発明を適用した計算機システムの構成例を示す図である。
第２図は、記憶装置の構成を示す図である。
第３図は、キャッシュメモリの管理情報および管理方法を示す図である。
第４図は、リードヒット処理の処理概要を示す図である。
第５図は、中央処理装置からリード処理要求を受領したホスト側プロセッサの処理のフローチャートである。
第６図は、記憶装置側プロセッサから再接続要求を受領したホスト側プロセッサの処理のフローチャートである。
第７図は、リード処理における記憶装置側プロセッサの処理のフローチャートである。
第８図は、ライト処理におけるホスト側プロセッサの処理のフローチャートである。
第９図は、ライト処理における記憶装置側プロセッサの処理のフローチャートである。
発明を実施するための最良の形態
以下、図面を用いて本発明を実施するための最良の形態を説明する。
第１図に示すように、計算機システムは、中央処理装置１００、制御装置１８０、記憶装置１７０から構成される。制御装置１８０は、キャッシュメモリ１４０と中央処理装置１００間のデータ転送を制御する一つ以上のホスト側プロセッサ１１１と、記憶装置１７０とキャッシュメモリ１４０間のデータ転送を制御する一つ以上の記憶装置側プロセッサ１５２から構成されている。
また、同一の記憶装置１７０にアクセス可能な全ての記憶装置側プロセッサ１５２を同一のプロセッサグループ１９０としており、各グループ毎に、一つの共有メモリ１６０とキャッシュメモリ１４０を設ける。各グループ１９０内の記憶装置側プロセッサ１５２は、共有メモリアクセスパス１６５を介して共有メモリ１６０へアクセス可能である。各記憶装置側プロセッサ１５２は、記憶装置アクセスパス１７５を介して記憶装置１７０へアクセスする。記憶装置アクセスパス１７５は、一本のパス上に複数の記憶装置１７０が芋蔓状に接続されていているものとする。また、説明を簡単にするために、各記憶装置側プロセッサ１５２からは、一本の記憶装置アクセスパス１７５しか表記していないが、一つの記憶装置側プロセッサ１５２には、複数の記憶装置アクセスパス１７５が接続されていてもかまわない。
次に、第２図を用いて記憶装置１７０について説明する。本発明では記憶装置１７０は、回転する記憶媒体を備えた磁気記憶装置であるとする。記憶装置１７０は、複数の記憶媒体４００、記憶媒体４００毎に存在し、記憶媒体４００と記憶装置インタフェース４４０間のデータ転送を行う複数のヘッド４２０、ヘッド４２０と記憶装置外部とのデータ転送を実行する記憶装置インタフェース４４０から構成される。
ヘッド４２０は上から昇順に番号がつけられている。複数の記憶媒体４００から同時にデータをリード、ライトすることはできない。つまり、アクティブになるヘッド４２０は高々一つである。目的のデータをアクセスするために記憶媒体上をヘッド４２０を移動させることをシークと呼び、シークが完了した後に、目的データがヘッド４２０の下を通過するまで待つことをサーチと呼ぶ。ヘッド４２０が一回転する間にアクセス可能な記憶媒体４００上の領域をトラック４１０と呼ぶ。また、記憶媒体４００が一回転する間に、全ヘッド４２０下を通過する領域（トラック４１０の集まり）をシリンダ４３０と呼び、記憶媒体４００の外側から内側に向かって、昇順に番号が付けられている。シリンダ番号とヘッド番号の組みによって、トラック４１０を特定する。さらに、記憶媒体４００のデータ格納領域は５１２バイトの小領域に分かれていて、これをセクタと呼ぶ。セクタには、トラック４１０内で一意に番号が付けられていて、トラック４１０の先頭から昇順に、トラック４１０内で一意に番号が付けられている。
次に、第３図を用いてキャッシュメモリ１４０の管理方法を説明する。キャッシュメモリ１４０は、各グループ１９０毎に分割されているが、論理的には連続したアドレス空間として見えるものとする。つまり、あるキャッシュアドレスが定まれば、それはどのグループ１９０に属するどの領域かを特定できるものとする。この実現のためには、例えばキャッシュアドレス上にグループ１９０の番号を埋め込んでおき、データ転送を実行するハードウエアでそれを認識すれば良い。
キャッシュメモリ１４０は、セグメント５３０と呼ばれる例えば１６ＫＢ（システムにより固定の大きさ）の領域に分割されていて、１セグメントにつき、１個のセグメント管理テーブル５２０が共用メモリ１６０上に存在する。セグメント管理テーブル５２０は、当該セグメント５３０のキャッシュアドレス５２２、ダーティデータを示すビットマップ５２３、クリーンデータを示すビットマップ５２４、他セグメント管理テーブルへのポインタ５２１が格納されている。ここで、ダーティデータとは、キャッシュ１４０上は更新されているが、まだ記憶装置１７０に未反映のライトデータ、クリーンデータとは、キャッシュ１４０上に存在する記憶装置１７０の内容と一致したデータを表す。さらに、セグンメント５３０を例えば５１２バイト毎に区切り、この領域に１ビットを対応させたビットマップを作成しておく。そして、もし対応する領域にダーティデータあるいはクリーンデータが存在するなら、ダーティビットマップ５２３あるいはクリーンビットマップ５２４の対応するビットを１にする。以上のようにより細かくデータ領域を管理するために、ダーティビットマップ５２３およびクリーンビットマップ５２４を使用する。
記憶装置１７０上の連続した領域、あるいは近傍の領域に対応してキャッシュ領域を割り当てる場合、複数のセグメント５３０をまとめて管理したほうが都合がよい。本発明では、１トラック４１０分に対応するキャッシュ領域を、スロット管理テーブル５１０によって管理する。ここで、スロット管理テーブル５１０は、記憶装置番号５１２、トラック番号５１３、最初のセグメント管理ブロックへのポインタ５１５、ロック情報５１４を格納している。記憶装置番号５１２およびトラック番号５１３によって、どの記憶装置のどのトラック４１０のデータを格納しているのかを特定でき、さらにポインタ５１５をたどり、セグメント管理ブロック５２０を参照することによって、そのトラック４１０のどのセクタを格納しているのか、またそのセクタがダーティかどうかを特定できる。また、ロック情報５１４は、当該スロットを排他的に処理する場合に用いる。
ここで、ある記憶装置１７０のあるトラック４１０上のあるデータにアクセス要求があった場合に、どのようにキャッシュ管理が行われるか一例を示す。まず、アクセス要求として、リード・ライト種別、論理記憶装置番号、トラック番号、目的データの先頭のセクタ、長さ（セクタ数）が与えられる。まず、記憶装置番号とトラック番号を元に、目的のスロット管理ブロック５１０が存在するかどうかを調べる。この手法の一つとして、例えばハッシュがあげられる。すなわち、記憶装置番号とトラック番号をハッシュ関数に与えると、対応するハッシュテーブル５００のエントリを出力し、このエントリにスロット管理テーブル５１０のアドレスが格納されている。もしこれがヌル（アドレスなし）であれば、ミスである。ヌルでなければこのポインタをたどって、記憶装置番号５１２とトラック番号５１３が探しているものと一致しているかどうか調べる。一致してなければ、スロット管理ブロック５１０へのポインタをたどり、記憶装置番号５１２とトラック番号５１３の比較を繰り返す。このようにして、目的とするスロット管理ブロック５１０が存在するかどうかを探す。
もしスロット管理ブロック５１０が共有メモリ１６０上に存在すれば、続いてアクセス対象となっているセクタ範囲が存在するかどうかを調べる。そのためには、まず目的のデータの先頭セクタ番号とセクタ数から、先頭セグメント番号と先頭のビット位置、および最終セグメントと最終のビット位置を計算する。例えば、６０番のセクタを先頭に、１２個のセクタを読み出す場合では、第２セグメントのビット２９から、第３セグメントのビット９が算出される。これを元にして、このビット範囲のデータが存在するかどうかを、クリーンビットマップ５２４あるいはダーティビットマップ５２３を調べて判定する。もし目的の範囲全てに渡ってデータが存在すればヒット、もし一部分でも存在しなければミスということになる。ヒットの場合には、そのままアクセスを続行すれば良い。ミスの場合は、スロット管理テーブル５１０が存在しない場合と、スロット管理テーブル５１０は存在するが、セグメント管理テーブル５２０の一部あるいは全部が存在しない場合、セグメント管理テーブル５２０も全て存在するが、データがキャッシュ１４０上に存在しない場合がある。データがキャッシュ１４０上に存在しない場合には、データを記憶装置１７０からキャッシュ１４０へ読み込んでくれば良い。セグメント管理テーブル５２０が存在しない場合には、新しいセグメント管理テーブル５２０を必要数割り当て、スロット管理テーブル５１０に接続する。この際、ダーティビットマップ５２３とクリーンビットマップ５２４は初期化しておく。スロット管理テーブル５１０が存在しない場合には、スロット管理テーブル５１０一つと、セグメント管理テーブル５２０を必要数分割り当て、対応するハッシュテーブル５００のエントリにスロット管理テーブル５１０のアドレスを格納する。この際、論理記憶装置番号５１２とトラック番号５１３を更新しておく。
目的のスロットがヒットであると判明した時点、あるいはミスであることが判明し、スロット管理テーブル５１０を新規に割り当てた時点で、スロットロックを確保する。もし、すでにスロットロック済みであった場合、当該スロットのロックが解放されるまで待つことになる。キャッシュ操作が完了し、もはや当該スロットが不要となると、スロットロックを解放する。この場合にも、ヒットミス判定時と同様に、論理記憶装置番号、トラック番号から目的のスロットをサーチする。あるいはヒットミス判定時に当該スロットのアドレスを記憶しておき、解放時はそのアドレスからスロット管理テーブル５１０を求める方法でもよい。目的とするスロット管理テーブル５１０を求めると、スロットロックを解除して、スロット解放処理が完了する。
以下、中央処理装置１００からのアクセス要求に対する処理方法について説明する。中央処理装置１００と制御装置１８０間のデータアクセスプロトコルは、汎用機で使用されるＣＫＤプロトコル、ワークステーション等で使用されるＳＣＳＩプロトコルなどがあるが、本実施例ではＣＫＤプロトコルを前提として説明する。
ＣＫＤプロトコルで使用されるコマンドのうち、本発明に関連するものは、アクセス有効範囲やキャッシュアクセスモードなどを示すＤＸ（Define Extent）コマンド、アクセス位置やアクセスするレコード数を示すＬＯＣ（Locate）コマンド、リードを行うＲＤＤ（Read Data）コマンド、ライトを行うＷＲＤ（Write Data）コマンドなどがある。あるレコードをアクセスする場合には、ＤＸ，ＬＯＣ，ＲＤＤなど、複数のコマンドが連続して実行される。これをコマンドチェインと呼ぶ。また、ＤＸ，ＬＯＣ，ＲＤＤ...とコマンド発行することにより、複数の連続したレコードをアクセスすることもできる。ＤＸコマンドでは、レコードを逐次的に読み出すかどうかを指定するモードがある。ＬＯＣコマンドによって転送される位置づけ情報は、シリンダ番号、ヘッド番号、セクタ番号、レコード番号が指定される。また、記憶装置番号は中央処理装置がコマンドを発行するために、制御装置との接続を確立する際に指定される。
まず、第４図を用いて処理全体の概要を説明する。
第４図は、リードヒット処理の概略図を示す。ホスト側プロセッサ１１１が、中央処理装置１００からのリード要求を受領すると、リード対象となった記憶装置１７０にアクセス可能な記憶装置側プロセッサ１５２を以下の方法で選択し、リードメッセージを送信する。
各プロセッサは、そのローカルメモリ上に記憶装置アクセス表６００を保持している。各エントリは、各該記憶装置１７０にアクセス可能な記憶装置側プロセッサ番号と状態の組から構成され、この例では各記憶装置１７０に高々２つの記憶装置側プロセッサ１５２がアクセス可能な構成となっている。状態は正常あるいは閉塞のいずれかであり、閉塞状態の時は、当該プロセッサ１５２から記憶装置１７０へのアクセスはできない。この表は、システム構築時に共有メモリ１６０に作成されて、各プロセッサはそのコピーをローカルメモリ上に保持する。障害により記憶装置側プロセッサ１５２と記憶装置間アクセスパス１７５が閉塞した場合や、保守により一時的にアクセス不能にする場合は、本記憶装置アクセス表６００の対応するエントリの状態を閉塞とする。
ホスト側プロセッサ１１１は、記憶装置アクセス表６００の、アクセス対象である記憶装置番号に対応するエントリを見て、当該記憶装置１７０にアクセス可能な記憶装置側プロセッサ１５２を求め、任意の記憶装置側プロセッサに対して、処理要求のメッセージを発行する。メッセージは、アクセス種別、記憶装置番号、シリンダ番号、ヘッド番号、セクタ番号、セクタ数から成る。ここで、アクセス種別からセクタ番号までは中央処理装置から指定される。セクタ数は、LOCコマンドで指定されたレコード数とレコード長から算出する。例えば、１セクタ５１２バイトで、４キロバイトのレコードを３個読み出す場合には、４０９６×３÷５１２＝２４セクタとなる。
ホスト側プロセッサ１１１からのリード要求に対して、記憶装置側プロセッサ１５２は共有メモリをアクセスしてヒットミス判定を実行し、当該データを格納しているキャッシュ領域のアドレス、およびキャッシュ１４０上の有効データ長をホスト側プロセッサ１１１へ返す。これを受けて、ホスト側プロセッサ１１１は当該アドレスからデータを読み出して、中央処理装置１００へデータ転送を行う。以上の処理が完了すると、中央処理装置１００へ正常終了を報告するとともに、記憶装置側プロセッサ１５２へアクセス完了を報告し、処理を終了する。これを受けて記憶装置側プロセッサ１５２は、確保していたキャッシュ領域を解放する。
以上、中央処理装置１００からのリード要求に対する処理方式の概要を説明した。以下、リードおよびライト処理の詳細を処理フロー図を用いて説明する。
まず、リード処理方式について、第５図から第７図を用いて説明する。
第５図はホスト側プロセッサ１１１の処理を示す。ステップ７００で中央処理装置１００からリード処理要求を受領すると、ステップ７２０において、上述した方法で、記憶装置側プロセッサ１５２を選択し、リード処理要求を発行する。ステップ７３０で、記憶装置側プロセッサ１５２からの応答を待つ。
第７図のステップ９００において、記憶装置側プロセッサ１５２がホスト側プロセッサ１１１からの処理要求を受領すると、ステップ９１０でヒットミス判定を行う。その結果、ヒットであることがわかると、ステップ９２０でメッセージを発行したホスト側プロセッサ１１１に対して、以下の報告する。
（１）ヒットミス判定結果：ヒット。
（２）セグメントアドレスリスト：データを格納しているセグメントアドレスのリスト。
（３）セグメント内オフセット：データを格納している先頭セクタの先頭セグメント内オフセット。
ここで、セグメントアドレスはセグメント管理テーブル５２０に格納されている。また、セグメント内オフセットは、セグメント管理テーブル５２０のダーティビットマップ５２３あるいはクリーンビットマップ５２４における、目的データを格納した先頭セクタに対応するビット位置を返す。
ホスト側プロセッサ１１１がヒットであることを認識すると、前記セグメントアドレスリストおよびセグメント内オフセットから、目的のデータを格納したキャッシュアドレスがわかるので、ステップ７４０で中央処理装置１００に対してデータを転送する。その後、ステップ７５０で記憶装置側プロセッサ１５２に対して、アクセス完了を報告し、処理を完了する。この時送信されるメッセージは、記憶装置番号、シリンダ番号、ヘッド番号、ダーティ有無情報、先頭ダーティセグメント、セグメント内オフセット、ダーティセクタ数から構成される。ここで、ダーティ有無情報はダーティなしが格納され、先頭ダーティセグメント、セグメント内オフセット、ダーティセクタ数は無効値が格納されている。第７図のステップ９３０で、記憶装置側プロセッサ１５２が、ホスト側プロセッサ１１１から、アクセス完了報告を受領すると、現在確保中のスロットのロックを解放して、処理を終了する。
もし、キャッシュミスしていた場合には、記憶装置側プロセッサ１５２は、第７図のステップ９１０のヒットミス判定処理において、新規にスロット管理テーブル５１０ならびにセグメント管理テーブル５２０を割り当る。続いてステップ９５０でホスト側プロセッサ１１１に対してヒットミス判定結果（ミス）を報告した後、ステップ９６０で、データを格納した記憶装置１７０に対してリード要求を発行し、リード完了まで待つ。
ホスト側プロセッサ１１１は、第５図のステップ７３５において、リードミスであることを認識すると、いったん処理を中断し、中央処理装置１００との接続を切り離す（ステップ７７０）。
記憶装置１７０からキャッシュメモリ１４０へのデータ転送が完了すると、記憶装置側プロセッサ１５２は、第７図のステップ９７０において、当該記憶装置１７０からの読み出しを待っているホスト側プロセッサ１５２に対して、再接続要求を発行する。
第６図のステップ８００において、ホスト側プロセッサ１１１が記憶装置側プロセッサ１５２からの再接続要求を検出すると、ステップ８１０において、ホスト側プロセッサ１１１と中央処理装置１００間で再接続処理を実行する。これを完了すると、ステップ８３０で、記憶装置側プロセッサ１５２に対して再接続完了を報告する。
これを受けて、記憶装置側プロセッサ１５２は、第７図のステップ９９０で、要求データを格納したセグメントアドレスリスト、セグメント内オフセットを送信する。以降の処理はリードヒットの処理と同様なので省略する。
もしヒットミス判定の結果がライトミスだった場合は、ホスト側プロセッサ１１１からの要求が、ミス時リード要か不要かによって、対応が異なる。もし、ミス時リード要であった場合には、前記リードミス時と同様に、記憶装置１７０からキャッシュ１４０へデータを格納する。ホスト側プロセッサ１１１ではこの間、中央処理装置１００との接続を切り離しており、再開処理にてライト処理を実行する。ここでのライト処理はヒット時のライト処理と同様である。ミス時リード不要であれば、記憶装置側プロセッサ１５２およびホスト側プロセッサ１１１共、ライトヒット処理と同様の処理を行う。
次に、ライト処理方法について、第８図、第９図を用いて説明する。
第８図のステップ１０００において、ホスト側プロセッサ１１１が中央処理装置１００からライト処理要求を受領すると、ステップ１０２０で、第４図で説明した方法で記憶装置側プロセッサ１５２を選択して、ライト処理要求を発行し、ステップ１０３０で記憶装置側プロセッサ１５２からの応答を待つ。
記憶装置側プロセッサ１５２では、第９図のステップ１１１０でヒットミス判定を実行する。ここで、もしミスであれば、ヒットミス判定処理で、スロット管理テーブル５１０およびセグメント管理テーブル５２０を必要分割り当てる。続いてステップ１１２０で、セグメントアドレスリスト、セグメント内オフセットをホスト側プロセッサへ転送した後、ステップ１１３０でホスト側プロセッサ１１１からのスロット解放要求を待つ。
ホスト側プロセッサ１１１では、第８図のステップ１０４０で、記憶装置側プロセッサ１５２から、ヒットミス判定結果、およびキャッシュアドレスを受け取ると、当該キャッシュ領域に、中央処理装置１００から転送されたデータを格納する。データ転送が完了すると、ステップ１０５０において、ホスト側プロセッサ１１１は記憶装置側プロセッサ１５２に対して、アクセス完了を送信する。この時送信されるメッセージは、記憶装置番号、シリンダ番号、ヘッド番号、ダーティ有無情報、先頭ダーティセグメント、セグメント内オフセット、ダーティセクタ長から構成される。ここで、ダーティ有無情報はダーティありが格納され、先頭ダーティセグメントおよびセグメント内オフセットには、先頭のダーティセクタを保持したセグメントアドレスおよびそのセグメント内のセクタオフセット、ダーティセグメント数には中央処理装置１００からライトされたセグメントの個数が格納される。
アクセス完了報告を受領した記憶装置側プロセッサ１５２は、第９図のステップ１１４０で、前述の方法で目的とするスロット管理テーブル５１０を取得すると、先頭ダーティセグメントに対応するセグメント管理テーブル５２０のダーティビットマップ５２３の、セグメント内オフセットに対応するビットから、ダーティセグメント数分のビットを１にする（ステップ１１４０）。この際、ダーティセクタ数によっては複数セグメントにわたってダーティビットマップ５２３を１にする場合もありうる。この場合には次セグメントポインタをたどって、ダーティビットマップを順次１にしていく。この処理を完了すると、当該スロットロックを解除し、ホスト側プロセッサ１１１に完了報告を行った後、記憶装置１７０に対してライト要求を発行し、データを書き込む。
産業上の利用可能性
以上のように、本発明にかかる記憶制御装置およびその制御方法は、複数の記憶装置側プロセッサと共有メモリとキャッシュメモリによりグループを構成し、グループ内だけで共有メモリへのアクセスを許可することによって、制御を簡単化して共有メモリアクセスネックを低減し、かつ共有メモリおよびキャッシュメモリを各グループ毎に分散して持つことによって、記憶制御装置全体のスループットを向上することができる記憶制御装置およびその制御方法を構築するのに適している。
Technical field
The present invention relates to a control method of a control device that intervenes in data transfer between a central processing unit and a storage device.
Background art
A conventional control device includes a cache memory that temporarily stores data of a storage device, a host-side processor that executes data transfer between the cache memory and the central processing unit, a storage-device-side processor that executes data transfer between the cache memory and the storage device, and a shared memory that stores management information for the cache memory. Both the host-side processor and the storage-device-side processor can access the shared memory.
In the above configuration, each processor in the control device shares data and management information on the cache memory. Therefore, each processor can access data in any storage device via the cache memory. The processors on the host side and the storage device side can operate independently. That is, the hit processing to the cache memory by the host side processor and the write processing to the storage device by the storage device side processor can be executed in parallel.
In the conventional configuration in which one shared memory and cache memory are mounted on the control device, accesses from each processor are concentrated on the shared memory. For this reason, there is a problem that access control to the shared memory becomes a bottleneck.
An object of the present invention is to provide a storage control device and a control method therefor that can eliminate an access bottleneck to a shared memory and a cache memory.
Disclosure of the invention
The cache memory temporarily holds data from the storage devices, so in principle it only needs to exist on a per-storage-device basis. Likewise, the cache management information stored in the shared memory only needs to exist on a per-storage-device basis. Based on this idea, to achieve the above object, the present invention assigns, for each storage device, the one or more storage-device-side processors that can access that storage device to one group, allocates a cache memory and a shared memory to each group, and has each storage-device-side processor in a group manage the cache management information allocated to that group. As a result, a plurality of cache memories and a plurality of shared memories can operate simultaneously, and the bottleneck can be eliminated. A group may also be formed for a plurality of storage devices rather than for a single one.
In the control device according to the present invention, the host-side processor identifies, from the storage device that holds the data requested by the central processing unit, a storage-device-side processor in the group that can access that storage device. In response to the access request from the host-side processor, the identified storage-device-side processor accesses the shared memory allocated to its group and determines whether the target data exists in the cache. If it does, the storage-device-side processor notifies the host-side processor of the cache address at which the data is stored, and the host-side processor then accesses the notified cache address.
With the above processing, the shared memory allocated to a group is accessed only by the storage-device-side processors belonging to that group. This reduces the number of accesses to any one shared memory and thus suppresses contention for access to it. Further, since the system holds a plurality of shared memories and a plurality of cache memories, access performance to the shared memory and the cache memory is improved for the system as a whole.
[Brief description of the drawings]
FIG. 1 is a diagram showing a configuration example of a computer system to which the present invention is applied.
FIG. 2 is a diagram showing the configuration of the storage device.
FIG. 3 is a diagram showing management information and a management method for the cache memory.
FIG. 4 is a diagram showing a processing outline of the read hit processing.
FIG. 5 is a flowchart of the processing of the host side processor that has received the read processing request from the central processing unit.
FIG. 6 is a flowchart of the processing of the host processor that has received the reconnection request from the storage device processor.
FIG. 7 is a flowchart of the processing of the storage device side processor in the read processing.
FIG. 8 is a flowchart of the processing of the host side processor in the write processing.
FIG. 9 is a flowchart of the processing of the storage device side processor in the write processing.
BEST MODE FOR CARRYING OUT THE INVENTION
Hereinafter, the best mode for carrying out the present invention will be described with reference to the drawings.
As shown in FIG. 1, the computer system includes a central processing unit 100, a control device 180, and storage devices 170. The control device 180 includes one or more host-side processors 111 that control data transfer between the cache memory 140 and the central processing unit 100, and one or more storage-device-side processors 152 that control data transfer between the storage devices 170 and the cache memory 140.
All the storage-device-side processors 152 that can access the same storage device 170 form one processor group 190, and one shared memory 160 and one cache memory 140 are provided for each group. The storage-device-side processors 152 in each group 190 can access the shared memory 160 via the shared memory access path 165. Each storage-device-side processor 152 accesses the storage devices 170 via a storage device access path 175, on which a plurality of storage devices 170 are assumed to be connected in a daisy chain. For simplicity, only one storage device access path 175 is shown per storage-device-side processor 152, but a single storage-device-side processor 152 may have a plurality of storage device access paths 175 connected to it.
Next, the storage device 170 will be described with reference to FIG. 2. In the present invention, the storage device 170 is assumed to be a magnetic storage device with rotating storage media. The storage device 170 comprises a plurality of storage media 400; a plurality of heads 420, one per storage medium 400, each of which transfers data between its storage medium 400 and the storage device interface 440; and the storage device interface 440, which transfers data between the heads 420 and the outside of the storage device.
The heads 420 are numbered in ascending order from the top. Data cannot be read from or written to a plurality of storage media 400 at the same time; that is, at most one head 420 is active at any time. Moving a head 420 over the storage medium to reach the target data is called a seek, and waiting, after the seek completes, until the target data passes under the head 420 is called a search. The area on a storage medium 400 that a head 420 can access during one rotation is called a track 410. The area (a set of tracks 410) that passes under all the heads 420 during one rotation of the storage media 400 is called a cylinder 430; cylinders are numbered in ascending order from the outside to the inside of the storage medium 400. A track 410 is identified by the combination of a cylinder number and a head number. Further, the data storage area of the storage medium 400 is divided into small 512-byte areas called sectors, which are numbered uniquely within a track 410 in ascending order from the start of the track.
Next, a method for managing the cache memory 140 will be described with reference to FIG. 3. The cache memory 140 is divided among the groups 190 but is assumed to appear logically as one continuous address space. That is, given a cache address, it is possible to identify which group 190 it belongs to and which area within that group it designates. This can be realized, for example, by embedding the number of the group 190 in the cache address and having the hardware that executes data transfer recognize it.
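One way to picture the embedding of the group number in the cache address is the following Python sketch. The field width `GROUP_SHIFT` and both function names are assumptions made for illustration; the text only says that the group number is embedded in the address and recognized by the transfer hardware.

```python
# Hypothetical layout: the upper bits of a cache address carry the group
# number, the lower GROUP_SHIFT bits carry the offset inside that group's
# cache memory. The width is an assumption, not taken from the patent.
GROUP_SHIFT = 32

def make_cache_address(group: int, offset: int) -> int:
    """Pack a group number and an offset within that group's cache."""
    if not 0 <= offset < (1 << GROUP_SHIFT):
        raise ValueError("offset does not fit in the offset field")
    return (group << GROUP_SHIFT) | offset

def split_cache_address(address: int) -> tuple[int, int]:
    """Recover (group, offset) from a packed cache address."""
    return address >> GROUP_SHIFT, address & ((1 << GROUP_SHIFT) - 1)
```

With such a layout, any processor holding a cache address can tell which group's cache memory to address without consulting a shared memory.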
The cache memory 140 is divided into areas called segments 530 of, for example, 16 KB (a size fixed per system), and one segment management table 520 exists in the shared memory 160 for each segment. The segment management table 520 stores the cache address 522 of the segment 530, a bitmap 523 indicating dirty data, a bitmap 524 indicating clean data, and a pointer 521 to another segment management table. Here, dirty data is write data that has been updated in the cache 140 but not yet reflected in the storage device 170, and clean data is data in the cache 140 that matches the contents of the storage device 170. The segment 530 is further divided into areas of, for example, 512 bytes, and a bitmap is created with one bit corresponding to each such area. If dirty data or clean data exists in an area, the corresponding bit of the dirty bitmap 523 or the clean bitmap 524 is set to 1. The dirty bitmap 523 and the clean bitmap 524 are thus used to manage the data area at a finer granularity.
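The segment management table can be modeled, as a rough sketch, by a small Python class whose fields mirror items 521 to 524. The class and method names are illustrative, and clearing the clean bit when a sector becomes dirty is an assumption that the text does not state.

```python
from dataclasses import dataclass
from typing import Optional

SEGMENT_SIZE = 16 * 1024                        # example segment size from the text
SECTOR_SIZE = 512                               # one bitmap bit per 512-byte area
BITS_PER_SEGMENT = SEGMENT_SIZE // SECTOR_SIZE  # 32 bits per bitmap

@dataclass
class SegmentTable:
    cache_address: int                     # cache address 522
    dirty_bitmap: int = 0                  # dirty bitmap 523
    clean_bitmap: int = 0                  # clean bitmap 524
    next: Optional["SegmentTable"] = None  # pointer 521 to another table

    def mark_clean(self, bit: int) -> None:
        self.clean_bitmap |= 1 << bit

    def mark_dirty(self, bit: int) -> None:
        # Assumption: a sector is dirty or clean, never both.
        self.dirty_bitmap |= 1 << bit
        self.clean_bitmap &= ~(1 << bit)

    def has_data(self, bit: int) -> bool:
        return bool((self.dirty_bitmap | self.clean_bitmap) & (1 << bit))
```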
When cache areas are allocated for contiguous or neighboring areas on the storage device 170, it is convenient to manage a plurality of segments 530 together. In the present invention, the cache area corresponding to one track 410 is managed by a slot management table 510. The slot management table 510 stores a storage device number 512, a track number 513, a pointer 515 to the first segment management table, and lock information 514. The storage device number 512 and the track number 513 identify which track 410 of which storage device the slot holds data for. Following the pointer 515 and referring to the segment management tables 520 further identifies which sectors of that track 410 are stored and whether each of those sectors is dirty. The lock information 514 is used when the slot is to be processed exclusively.
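The relationship between a slot and its chain of segments can be sketched as follows. The Python names are hypothetical, and 32 bits per segment assumes the 16 KB segment and 512-byte sector sizes given as examples in the text.

```python
from dataclasses import dataclass
from typing import Optional

BITS_PER_SEGMENT = 32  # assumes 16 KB segments and 512-byte sectors

@dataclass
class SegmentTable:
    dirty_bitmap: int = 0
    clean_bitmap: int = 0
    next: Optional["SegmentTable"] = None

@dataclass
class SlotTable:
    device: int                                   # storage device number 512
    track: int                                    # track number 513
    locked: bool = False                          # lock information 514
    first_segment: Optional[SegmentTable] = None  # pointer 515

def dirty_sectors(slot: SlotTable) -> list[int]:
    """Walk the segment chain and list the track-relative dirty sectors."""
    result = []
    seg, base = slot.first_segment, 0
    while seg is not None:
        for bit in range(BITS_PER_SEGMENT):
            if (seg.dirty_bitmap >> bit) & 1:
                result.append(base + bit)
        seg, base = seg.next, base + BITS_PER_SEGMENT
    return result
```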
An example of how the cache is managed when an access request arrives for data on a track 410 of a storage device 170 is as follows. The access request supplies a read/write type, a logical storage device number, a track number, the first sector of the target data, and a length (number of sectors). First, the storage device number and the track number are used to check whether the target slot management table 510 exists. One technique for this is hashing: when the storage device number and the track number are given to a hash function, it yields the corresponding entry of the hash table 500, and this entry holds the address of a slot management table 510. If the entry is null (no address), the access is a miss. If it is not null, the pointer is followed and the storage device number 512 and the track number 513 are compared with those being sought. If they do not match, the pointer to the next slot management table 510 is followed and the comparison of the storage device number 512 and the track number 513 is repeated. In this way, the target slot management table 510 is searched for.
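The hash lookup described above amounts to a chained hash table keyed by storage device number and track number. A minimal Python sketch follows; the table size, the hash function, and the `next` chain pointer are assumptions, since the text names none of them.

```python
from dataclasses import dataclass
from typing import Optional

HASH_SIZE = 1024  # assumed number of hash table entries

@dataclass
class SlotTable:
    device: int
    track: int
    next: Optional["SlotTable"] = None  # hypothetical chain pointer

hash_table: list[Optional[SlotTable]] = [None] * HASH_SIZE

def bucket(device: int, track: int) -> int:
    # Placeholder hash function; the patent does not specify one.
    return hash((device, track)) % HASH_SIZE

def find_slot(device: int, track: int) -> Optional[SlotTable]:
    """Follow the chain until device and track match; None means a miss."""
    slot = hash_table[bucket(device, track)]
    while slot is not None:
        if slot.device == device and slot.track == track:
            return slot
        slot = slot.next
    return None

def insert_slot(device: int, track: int) -> SlotTable:
    """Allocate a slot table and link it at the head of its chain."""
    i = bucket(device, track)
    slot = SlotTable(device, track, next=hash_table[i])
    hash_table[i] = slot
    return slot
```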
If the slot management table 510 exists in the shared memory 160, the next check is whether the sector range to be accessed is present. To this end, the first segment number and first bit position, and the last segment number and last bit position, are calculated from the first sector number and the number of sectors of the target data. For example, when 12 sectors are read starting from sector 60, the range from bit 29 of the second segment to bit 9 of the third segment is obtained. The clean bitmap 524 and the dirty bitmap 523 are then examined to determine whether data exists for this bit range. If data exists over the entire target range, the access is a hit; if any part is missing, it is a miss. On a hit, the access simply continues. A miss falls into one of three cases: the slot management table 510 does not exist; the slot management table 510 exists but some or all of the segment management tables 520 do not; or all the segment management tables 520 exist but the data is not in the cache 140. If the data is not in the cache 140, it is read from the storage device 170 into the cache 140. If segment management tables 520 are missing, the required number of new segment management tables 520 are allocated and linked to the slot management table 510, with the dirty bitmap 523 and the clean bitmap 524 initialized. If the slot management table 510 does not exist, one slot management table 510 and the required number of segment management tables 520 are allocated, and the address of the slot management table 510 is stored in the corresponding entry of the hash table 500. At this time, the logical storage device number 512 and the track number 513 are set.
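The range calculation and hit/miss check can be sketched in Python as below. The sketch uses zero-based sector, segment, and bit numbers and an inclusive end position, all of which are assumptions. The figures quoted in the text (bit 29 of the second segment, bit 9 of the third) follow the patent's own numbering, whose origin and end-point convention are not spelled out, so the values here differ by small offsets.

```python
SECTORS_PER_SEGMENT = 32  # 16 KB segment / 512-byte sector

def locate(first_sector: int, count: int):
    """Return ((first_seg, first_bit), (last_seg, last_bit)), zero-based, inclusive."""
    last_sector = first_sector + count - 1
    return (divmod(first_sector, SECTORS_PER_SEGMENT),
            divmod(last_sector, SECTORS_PER_SEGMENT))

def is_hit(bitmaps: list[int], first_sector: int, count: int) -> bool:
    """bitmaps[i] is the OR of the dirty and clean bitmaps of segment i."""
    for sector in range(first_sector, first_sector + count):
        seg, bit = divmod(sector, SECTORS_PER_SEGMENT)
        if seg >= len(bitmaps) or not (bitmaps[seg] >> bit) & 1:
            return False  # any missing sector makes the whole access a miss
    return True
```

For 12 sectors starting at sector 60, `locate(60, 12)` gives segment index 1, bit 28 through segment index 2, bit 7 under these zero-based conventions.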
The slot lock is acquired when the target slot is found to be a hit, or when it is found to be a miss and a slot management table 510 has been newly allocated. If the slot is already locked, the process waits until the lock on that slot is released. When the cache operation is complete and the slot is no longer needed, the slot lock is released. For the release, the target slot is searched for by logical storage device number and track number, as in the hit/miss determination. Alternatively, the address of the slot may be remembered at the time of the hit/miss determination and used to obtain the slot management table 510 at release time. Once the target slot management table 510 has been obtained, the slot lock is released and the slot release processing is complete.
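The acquire, wait, and release behavior of the slot lock resembles an ordinary mutex. The following Python sketch uses `threading.Lock` as a stand-in for the lock information 514; in the patent the lock lives in shared memory and is operated by the storage-device-side processors, so this is only an analogy, and the class name is hypothetical.

```python
import threading

class Slot:
    """Illustrative slot whose lock stands in for lock information 514."""

    def __init__(self, device: int, track: int) -> None:
        self.device = device
        self.track = track
        self._lock = threading.Lock()

    def acquire(self) -> None:
        # Blocks until any current holder releases the slot.
        self._lock.acquire()

    def release(self) -> None:
        self._lock.release()

    def locked(self) -> bool:
        return self._lock.locked()
```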
Hereinafter, the processing of access requests from the central processing unit 100 will be described. Data access protocols between the central processing unit 100 and the control device 180 include the CKD protocol used by mainframes and the SCSI protocol used by workstations and the like; this embodiment is described on the assumption of the CKD protocol.
Among the commands used in the CKD protocol, those relevant to the present invention include the DX (Define Extent) command, which indicates the valid access range, the cache access mode, and so on; the LOC (Locate) command, which indicates the access position and the number of records to access; the RDD (Read Data) command, which performs a read; and the WRD (Write Data) command, which performs a write. To access a record, a plurality of commands such as DX, LOC, and RDD are executed in succession; this is called a command chain. A plurality of consecutive records can also be accessed by issuing the commands DX, LOC, RDD, ... in sequence. The DX command has a mode that specifies whether records are read sequentially. The positioning information transferred by the LOC command specifies a cylinder number, a head number, a sector number, and a record number. The storage device number is specified when the central processing unit establishes a connection with the control device in order to issue commands.
First, the outline of the entire process will be described with reference to FIG. 4.
FIG. 4 shows an outline of the read hit processing. When the host-side processor 111 receives a read request from the central processing unit 100, it selects, by the following method, a storage-device-side processor 152 that can access the storage device 170 to be read, and sends it a read message.
Each processor holds a storage device access table 600 in its local memory. Each entry consists of pairs of a storage-device-side processor number and a state for the processors that can access the corresponding storage device 170; in this example, at most two storage-device-side processors 152 can access each storage device 170. The state is either normal or blocked, and in the blocked state the processor 152 cannot access the storage device 170. The table is created in the shared memory 160 when the system is configured, and each processor holds a copy in its local memory. When the access path 175 between a storage-device-side processor 152 and a storage device is blocked by a failure, or when access is temporarily disabled for maintenance, the state of the corresponding entry in the storage device access table 600 is set to blocked.
The host-side processor 111 looks up the entry of the storage device access table 600 that corresponds to the storage device number to be accessed, obtains the storage-device-side processors 152 that can access that storage device 170, and issues a processing request message to any one of them. The message consists of an access type, a storage device number, a cylinder number, a head number, a sector number, and a number of sectors. The fields from the access type through the sector number are specified by the central processing unit. The number of sectors is calculated from the number of records specified by the LOC command and the record length. For example, when three 4-kilobyte records are read with 512-byte sectors, the result is 4096 × 3 ÷ 512 = 24 sectors.
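The processor selection and the sector count calculation can be sketched in Python as follows. The table contents and processor numbers are made up for illustration. Picking the first processor in the normal state is one possible reading of "any one of them", and rounding up when the record bytes are not a multiple of the sector size is an assumption.

```python
NORMAL, BLOCKED = "normal", "blocked"

# Hypothetical storage device access table 600:
# storage device number -> [(storage-device-side processor number, state), ...]
access_table = {
    0: [(10, NORMAL), (11, NORMAL)],
    1: [(10, BLOCKED), (11, NORMAL)],
}

def select_processor(device: int) -> int:
    """Return a storage-device-side processor that can reach the device."""
    for processor, state in access_table[device]:
        if state == NORMAL:
            return processor
    raise RuntimeError(f"no usable access path to storage device {device}")

def sector_count(records: int, record_length: int, sector_size: int = 512) -> int:
    """Number of sectors covering the records (rounded up, by assumption)."""
    total = records * record_length
    return (total + sector_size - 1) // sector_size
```

Here `sector_count(3, 4096)` reproduces the 24 sectors of the example in the text.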
In response to the read request from the host-side processor 111, the storage-device-side processor 152 accesses the shared memory, performs the hit/miss determination, and returns to the host-side processor 111 the address of the cache area storing the data and the length of the valid data in the cache 140. The host-side processor 111 then reads the data from that address and transfers it to the central processing unit 100. When this is complete, the host-side processor 111 reports normal completion to the central processing unit 100, reports access completion to the storage-device-side processor 152, and ends the processing. On receiving this report, the storage-device-side processor 152 releases the cache area it had secured.
The outline of the processing method for the read request from the central processing unit 100 has been described above. Details of the read and write processes will be described below with reference to a process flowchart.
First, the read processing will be described with reference to FIGS. 5 to 7.
FIG. 5 shows the processing of the host processor 111. When a read processing request is received from the central processing unit 100 in step 700, in step 720, the storage device side processor 152 is selected by the method described above and a read processing request is issued. In step 730, a response from the storage device side processor 152 is awaited.
In step 900 of FIG. 7, the storage-device-side processor 152 receives the processing request from the host-side processor 111, and in step 910 it performs the hit/miss determination. If the result is a hit, in step 920 it reports the following to the host-side processor 111 that issued the message.
(1) Hit miss determination result: hit.
(2) Segment address list: A list of segment addresses storing data.
(3) Offset within segment: Offset within the first segment of the first sector storing the data.
Here, the segment address is the one stored in the segment management table 520. The intra-segment offset is the bit position, in the dirty bitmap 523 or the clean bitmap 524 of the segment management table 520, that corresponds to the first sector holding the target data.
When the host-side processor 111 recognizes the hit, it can derive the cache address of the target data from the segment address list and the intra-segment offset, so in step 740 it transfers the data to the central processing unit 100. Then, in step 750, it reports access completion to the storage-device-side processor 152 and completes its processing. The message sent at this time consists of a storage device number, a cylinder number, a head number, dirty presence information, the first dirty segment, an intra-segment offset, and the number of dirty sectors. Here, the dirty presence information is set to "no dirty data", and the first dirty segment, the intra-segment offset, and the number of dirty sectors hold invalid values. In step 930 of FIG. 7, when the storage-device-side processor 152 receives the access completion report from the host-side processor 111, it releases the lock on the slot it currently holds and ends the processing.
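How the host-side processor might turn the segment address list and the intra-segment offset into per-sector cache addresses is sketched below in Python. It assumes a zero-based offset counted in sectors, 32 sectors per segment, and that the host already knows the sector count from its own request; the function name is hypothetical.

```python
SECTOR_SIZE = 512
SECTORS_PER_SEGMENT = 32  # assumes 16 KB segments

def sector_addresses(segment_addresses: list[int], offset: int, count: int) -> list[int]:
    """Cache address of each requested sector, in transfer order.

    segment_addresses: segment address list reported on a hit.
    offset: zero-based sector offset inside the first listed segment.
    count: number of sectors requested.
    """
    addresses = []
    for i in range(offset, offset + count):
        seg, bit = divmod(i, SECTORS_PER_SEGMENT)
        addresses.append(segment_addresses[seg] + bit * SECTOR_SIZE)
    return addresses
```

Because segments need not be contiguous in the cache, the address jumps to the next listed segment once the offset passes the end of the current one.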
On a cache miss, the storage-device-side processor 152 newly allocates a slot management table 510 and segment management tables 520 during the hit/miss determination in step 910 of FIG. 7. It then reports the hit/miss determination result (miss) to the host-side processor 111 in step 950, and in step 960 issues a read request to the storage device 170 that holds the data and waits for the read to complete.
When the host-side processor 111 recognizes that there is a read error in step 735 of FIG. 5, the host-side processor 111 interrupts the processing once and disconnects from the central processing unit 100 (step 770).
When the data transfer from the storage device 170 to the cache memory 140 is completed, the storage device processor 152 retransmits the host processor 152 waiting for reading from the storage device 170 in step 970 in FIG. Issue a connection request.
When the host processor 111 detects a reconnection request from the storage device processor 152 at step 800 in FIG. 6, a reconnection process is executed between the host processor 111 and the central processing unit 100 at step 810. When this is completed, in step 830, the completion of reconnection is reported to the storage device side processor 152.
In response to this, in step 990 of FIG. 7, the storage device side processor 152 transmits the segment address list storing the request data and the intra-segment offset. The subsequent processing is the same as the read hit processing, and is therefore omitted.
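The order of the exchanges in the read hit and read miss cases can be summarized as a step trace. The sketch below is illustrative only: the function name `read_flow` and the wording of each action are assumptions, while the step numbers are those cited in the text. Steps 960 and 735/770 run on different processors, so their relative order here simply follows the order of the description.

```python
from typing import List, Tuple

HOST = "host-side processor 111"
STORAGE = "storage device side processor 152"


def read_flow(hit: bool) -> List[Tuple[int, str, str]]:
    """Return the (step, actor, action) trace for one read request."""
    trace = [
        (900, STORAGE, "receive processing request"),
        (910, STORAGE, "hit/miss determination"),
    ]
    if hit:
        trace.append((920, STORAGE, "report hit, segment address list, offset"))
    else:
        trace += [
            (950, STORAGE, "report miss"),
            (960, STORAGE, "issue read to storage device 170 and wait"),
            (735, HOST, "recognize read miss"),
            (770, HOST, "disconnect from central processing unit 100"),
            (970, STORAGE, "issue reconnection request"),
            (800, HOST, "detect reconnection request"),
            (810, HOST, "reconnect with central processing unit 100"),
            (830, HOST, "report reconnection completion"),
            (990, STORAGE, "send segment address list and offset"),
        ]
    # From here on, a miss is handled exactly like a hit.
    trace += [
        (740, HOST, "transfer data to central processing unit 100"),
        (750, HOST, "notify access completion"),
        (930, STORAGE, "release slot lock"),
    ]
    return trace
```

Printing `read_flow(False)` line by line gives a compact checklist of the read miss path across FIGS. 5 to 7.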
If the result of the hit/miss determination is a write miss, the response depends on whether or not the request from the host-side processor 111 requires a read on a miss. If a read is required on a miss, data is stored from the storage device 170 into the cache memory 140 as in the case of a read miss. During this time, the host-side processor 111 disconnects from the central processing unit 100, and it executes the write process when processing is restarted. The write process here is the same as the write process on a hit. If no read is required on a miss, the storage device side processor 152 and the host-side processor 111 perform the same processing as the write hit processing.
Next, the write processing method will be described with reference to FIGS. 8 and 9.
When the host-side processor 111 receives a write processing request from the central processing unit 100 in step 1000 of FIG. 8, the storage device side processor 152 is selected in step 1020 by the method described in FIG. and the host-side processor 111 then waits for a response from the storage device side processor 152 in step 1030.
The storage device side processor 152 executes the hit/miss determination in step 1110 of FIG. 9. If the result is a miss, the slot management table 510 and the segment management table 520 are allocated as necessary during the hit/miss determination. Then, in step 1120, it transfers the segment address list and the intra-segment offset to the host-side processor 111, and in step 1130 it waits for a slot release request from the host-side processor 111.
When the host-side processor 111 receives the hit/miss determination result and the cache address from the storage device side processor 152 in step 1040 of FIG. 8, it stores the data transferred from the central processing unit 100 in that cache area. When the data transfer is completed, the host-side processor 111 sends an access completion report to the storage device side processor 152 in step 1050. The message sent at this time contains the storage device number, cylinder number, head number, dirty presence/absence information, head dirty segment, intra-segment offset, and number of dirty sectors. Here, the dirty presence/absence information is set to "dirty". The head dirty segment and the intra-segment offset hold the address of the segment containing the first dirty sector and that sector's offset within the segment, and the number of dirty sectors holds the number of sectors written from the central processing unit 100.
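The access completion message has the same fields for reads and writes and differs only in how the dirty-related fields are filled. A minimal sketch of such a message is shown below; the class name, the field names, and the use of `None` to stand in for the "invalid value" are assumptions made for illustration and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AccessCompletion:
    storage_device_number: int
    cylinder_number: int
    head_number: int
    dirty: bool                                  # dirty presence/absence information
    head_dirty_segment: Optional[int] = None     # None models the "invalid value"
    intra_segment_offset: Optional[int] = None
    dirty_sector_count: Optional[int] = None


def read_completion(device: int, cylinder: int, head: int) -> AccessCompletion:
    """Completion after a read: nothing is dirty, dirty fields stay invalid."""
    return AccessCompletion(device, cylinder, head, dirty=False)


def write_completion(device: int, cylinder: int, head: int,
                     head_dirty_segment: int, offset: int,
                     sector_count: int) -> AccessCompletion:
    """Completion after a write: describes the run of sectors just written."""
    if sector_count <= 0:
        raise ValueError("a write must dirty at least one sector")
    return AccessCompletion(device, cylinder, head, dirty=True,
                            head_dirty_segment=head_dirty_segment,
                            intra_segment_offset=offset,
                            dirty_sector_count=sector_count)
```

Keeping a single message shape for both cases lets the storage device side processor branch on the dirty flag alone when it handles the report.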
On receiving the access completion report, the storage device side processor 152 obtains the target slot management table 510 by the above-described method in step 1140 of FIG. 9. Then, in the dirty bitmap 523 of the segment management table 520 corresponding to the head dirty segment, it sets to 1 as many bits as the number of dirty sectors, starting from the bit corresponding to the intra-segment offset. Depending on the number of dirty sectors, bits of the dirty bitmap 523 may have to be set across a plurality of segments. In this case, the next segment pointer is followed and the dirty bitmaps are set in sequence. When this processing is completed, the slot lock is released, a completion report is sent to the host-side processor 111, a write request is issued to the storage device 170, and the data is written.
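The dirty bitmap update, including the case where the written sectors spill over into following segments, can be sketched as below. This is a hedged illustration: the `Segment` class, its field names, the choice of an integer bitmap with bit 0 as the first sector, and the value of `SECTORS_PER_SEGMENT` are assumptions rather than details from the specification.

```python
from dataclasses import dataclass
from typing import Optional

SECTORS_PER_SEGMENT = 16  # assumed sectors per segment (not given in the source)


@dataclass
class Segment:
    dirty_bitmap: int = 0                 # bit i set means sector i is dirty
    next: Optional["Segment"] = None      # next segment pointer


def mark_dirty(head: Segment, offset: int, sector_count: int) -> None:
    """Set sector_count dirty bits starting at 'offset' in the head segment,
    following next segment pointers when the run crosses a segment boundary."""
    if not 0 <= offset < SECTORS_PER_SEGMENT:
        raise ValueError("intra-segment offset is out of range")
    seg: Optional[Segment] = head
    bit = offset
    remaining = sector_count
    while remaining > 0:
        if seg is None:
            raise ValueError("ran out of segments before all sectors were marked")
        take = min(SECTORS_PER_SEGMENT - bit, remaining)
        seg.dirty_bitmap |= ((1 << take) - 1) << bit
        remaining -= take
        seg = seg.next
        bit = 0  # later segments are marked from their first sector
```

For example, with 16 sectors per segment, marking 5 sectors from offset 14 sets the last two bits of the head segment and the first three bits of the next one.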
Industrial applicability
As described above, the storage control device and the control method thereof according to the present invention form a group from a plurality of storage device side processors, a shared memory, and a cache memory, and permit access to the shared memory only from within the group. This simplifies control and reduces the shared memory access bottleneck, which improves the throughput of the entire storage control device. The invention is therefore suitable for building a storage control device in which shared memory and cache memory are distributed to each group, and a control method thereof.

Claims

A control method for a storage system having a plurality of host-side processors connected to a central processing unit and a plurality of storage devices for storing data from the central processing unit,
Here, the plurality of host-side processors have a plurality of local memories, and one host-side processor has one local memory,
The plurality of storage devices are divided into a plurality of groups, and each of the plurality of groups further has a cache memory, a shared memory that stores management information regarding the presence or absence of data stored in the cache memory, and at least one storage device side processor that controls data transfer between the storage devices belonging to the group and the cache memory allocated to the group,
Each of the plurality of host-side processors stores, in the local memory that the host-side processor has, the correspondence between each of the plurality of storage devices and the plurality of storage device side processors,
When a host-side processor receives an access request from the central processing unit, the host-side processor that has received the access request determines the storage device side processor corresponding to the access-target storage device specified by the access request by referring to the correspondence in the local memory,
The host-side processor that has received the access request performs data transfer between the cache memory included in the group to which the determined storage device side processor belongs and the central processing unit that has issued the access request.

A method for controlling a storage system according to claim 1, comprising:
If the access request is a read request,
The host-side processor that has received the read request issues a read request together with the storage device address, data position, and read range to the determined storage device-side processor,
The storage device-side processor that has received the read request accesses the management information of the shared memory to determine whether the read target data exists on the cache,
When the read target data is present on the cache, the storage device side processor notifies the cache address and valid data length of the read target data to the host side processor that issued the read request,
The host-side processor that has received the notification reads the read target data from the notified cache address and transfers the data to the central processing unit.

A method for controlling a storage system according to claim 1, comprising:
If the access request is a read request,
The host-side processor that has received the read request issues a read request together with the storage device address, data position, and read range to the determined storage device-side processor,
The storage device-side processor that has received the read request accesses the management information of the shared memory to determine whether the read target data exists on the cache,
When the read target data does not exist in the cache, the storage device side processor notifies the host-side processor that issued the read request of a read miss and issues a read request to the storage device of the group to which the storage device side processor belongs,
The host-side processor that has received the notification of the read miss disconnects the connection with the central processing unit and, in response to a reconnection request issued by the storage device side processor that notified the read miss once the data has been transferred from the storage device to the cache, executes reconnection processing with the central processing unit and then reports completion of reconnection to the storage device side processor that issued the reconnection request,
The storage device side processor that has received the reconnection completion report notifies the host-side processor that reported the reconnection completion of the cache address at which the data is stored and the valid data length,
The host-side processor that has received the cache address and the valid data length reads the data from the cache address and transfers the data to the central processing unit.

A method for controlling a storage system according to claim 1, comprising:
If the access request is a write request,
The host side processor that has received the write request issues a write request together with a storage device address, a data position, and a write range to the determined storage device side processor,
The storage device-side processor that has received the write request secures a cache area for storing write data, and notifies the host-side processor of a cache address for storing data,
The host-side processor that has received the cache address stores the write data transferred from the central processing unit at the notified cache address, and sends to the storage device side processor the cache address at which the write data was stored and the data length,
The storage device side processor that has received the cache address and the data length writes the write data stored in the cache to the storage device included in the group to which that storage device side processor belongs.