JP3487440B2

JP3487440B2 - Shared memory access method

Info

Publication number: JP3487440B2
Application number: JP29934693A
Authority: JP
Inventors: 健児加藤
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1993-11-30
Filing date: 1993-11-30
Publication date: 2004-01-19
Anticipated expiration: 2019-01-19
Also published as: JPH07152695A

Description

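[Detailed Description of the Invention]

[0001] [Field of Industrial Application] The present invention relates to a shared memory access method in which a plurality of processors access a shared memory.

[0002] [Prior Art] Conventionally, as shown in Fig. 11, there are multiprocessor systems in which a plurality of processors (hereinafter referred to as PMs) access a shared memory via a bus or a line and carry out a series of processing. Each PM has a plurality of spaces, each of which performs processing, and the kernel of the PM acquires exclusion on the shared memory via the bus or line and accesses it. When several PMs access the shared memory, exclusion among the PMs is required.

[0003] If a PM crashes and stops operating while it holds exclusion on the shared memory, the exclusion environment that the crashed PM had acquired and was using must be reclaimed. Otherwise the other PMs remain waiting for exclusion on that shared memory and the series of processing cannot proceed.

[0004] In addition, when another PM starts operating with the shared memory that the crashed PM was using, half-finished contents written by the crashed PM may remain. If the shared memory is used as-is in that state, its contents are incomplete and the other PM may malfunction.

[0005] [Problems to Be Solved by the Invention] As described above, when a plurality of PMs conventionally access a shared memory to carry out a series of processing, the reliability of the shared memory contents is ensured by having each PM acquire exclusion on the shared memory, access it, and then release the exclusion. If a PM that has acquired exclusion crashes, the other PMs waiting to use that shared memory can no longer use it. Moreover, the waiting PMs cannot recognize that the shared memory held by the crashed PM has become unusable, so their processing remains suspended.

[0006] To solve these problems, the present invention provides an exclusion field, an exclusion-wait management field, and a new/old management field in the shared memory to manage exclusion, the PMs waiting for exclusion, and the new-version and old-version areas that store the data. It also provides a PM management table holding a PM identifier, an IPL count, and a time, so that a crashed PM is detected automatically and recovery is performed. The object is to manage and execute exclusion of the shared memory, the order of exclusion waiters, automatic detection of a crashed PM, and restoration of the shared memory.

[0007] [Means for Solving the Problems] Fig. 1 shows the principle configuration of the present invention. In Fig. 1, the shared memory 1 is a memory that a plurality of processors 11 access after acquiring exclusion via a bus or a line. It consists of an exclusion management table 2, a PM management table 6, a new-version area 7, and an old-version area 8.

[0008] The exclusion management table 2 is a table that manages exclusion. It consists of an exclusion field 3, an exclusion-wait management field 4, and a new/old management field 5.

[0009] The exclusion field 3 holds the PM identifier and IPL count of the processor 11 that has acquired the area of the shared memory 1. It is set to 0 (zero) when exclusion is not held.

[0010] The exclusion-wait management field 4 holds, for the PM identifier of each processor 11 waiting for exclusion, a counter value that represents its position in the wait order. The new/old management field 5 holds a pointer to whichever of the new-version area 7 or the old-version area 8 is governed by the exclusion field 3.

[0011] The PM management table 6 is used to detect a crash of a processor 11. At predetermined intervals each processor 11 sets its IPL count and the time against its own PM identifier. When the current time differs from the time recorded for another PM identifier by a threshold or more, the processor 11 with that PM identifier is judged to have crashed.

[0012] The processors (PM-1, PM-2, ..., PM-X) 11 access the shared memory 1 via a bus or a line and perform various processing. Each is provided with a kernel 12 and a plurality of spaces that perform various processing. Each space notifies other spaces or the kernel 12 of data and requests by queuing messages in a message queue.

[0013] The kernel 12 performs overall control. Here it consists of an exclusion control means 14 provided in a kernel thread 13 and a recovery means 15 called from the kernel 12.

[0014] The exclusion control means 14 acquires exclusion on the shared memory 1. The recovery means 15, at predetermined intervals, sets the time of its own PM identifier in the PM management table 6 to the current time and compares the current time with the times of the other PM identifiers. When the difference is equal to or greater than a predetermined threshold, it judges that the processor with that PM identifier has crashed and performs recovery processing.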
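
The layout described in [0007] to [0011] can be pictured with a small data model. This is only an illustrative sketch: the class and attribute names (`ExclusionTable`, `PMEntry`, `SharedMemory`, `owner`, `wait_order`, `current`) are hypothetical and do not appear in the patent, and ordinary Python objects stand in for real shared memory.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

UNLOCKED: Tuple[int, int] = (0, 0)  # exclusion field value meaning "exclusion not held"


@dataclass
class ExclusionTable:
    """Exclusion management table 2."""
    owner: Tuple[int, int] = UNLOCKED  # exclusion field 3: (PM identifier, IPL count)
    wait_order: Dict[int, int] = field(default_factory=dict)  # field 4: PM id -> counter value
    current: str = "new"  # new/old management field 5: which area the pointer selects


@dataclass
class PMEntry:
    """One row of PM management table 6."""
    ipl_count: int
    last_seen: float  # time last written by the PM itself


@dataclass
class SharedMemory:
    """Shared memory 1: both tables plus the new-version and old-version areas."""
    table: ExclusionTable = field(default_factory=ExclusionTable)
    pm_table: Dict[int, PMEntry] = field(default_factory=dict)
    areas: Dict[str, List[int]] = field(default_factory=lambda: {"new": [], "old": []})


shm = SharedMemory()
shm.pm_table[1] = PMEntry(ipl_count=1, last_seen=0.0)
```

Storing the owner as a (PM identifier, IPL count) pair rather than the identifier alone is what later lets a recovering PM tell a lock left behind by a crashed incarnation apart from a lock taken after the same PM restarted.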
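
[0015] [Operation] As shown in Fig. 1, when the exclusion control means 14 in a processor 11 accesses the shared memory 1 and the exclusion field 3 in the shared memory 1 is in the unlocked state, it sets its own PM identifier in the exclusion field 3 to enter the exclusive state and accesses the shared memory 1. It then sets the exclusion field 3 back to the unlocked state and sends an exclusion-available notification to the processor 11 whose PM identifier is next in the order recorded in the exclusion-wait management field 4. If the exclusion field 3 is already in the exclusive state, it instead sets a counter value against its own PM identifier in the exclusion-wait management field 4 and waits.

[0016] Also, when the exclusion control means 14 of a processor 11 has set its own PM identifier in the exclusion field 3 to acquire exclusion and accesses the shared memory 1, it reads the contents of the old-version area 8 pointed to by the pointer in the new/old management field 5, copies them to the new-version area 7, sets a pointer to the new-version area 7 in the new/old management field 5, and accesses that new-version area 7.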
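
The copy-then-switch step in [0016] behaves like a two-buffer shadow copy. The sketch below is an assumption-laden illustration, not the patented implementation: the `ShadowArea` class, its two-element `buffers` list, and the 0/1 `pointer` are hypothetical stand-ins for the new-version area 7, the old-version area 8, and the new/old management field 5.

```python
class ShadowArea:
    """Two buffers whose roles (old version / new version) swap on each acquisition."""

    def __init__(self, initial):
        self.buffers = [list(initial), list(initial)]
        self.pointer = 0  # stands in for new/old management field 5

    def begin_access(self):
        """Copy the currently valid buffer into the other one and point at the copy."""
        src = self.pointer
        dst = 1 - src
        self.buffers[dst] = list(self.buffers[src])  # old version -> new version
        self.pointer = dst
        return self.buffers[dst]  # the caller works on the new version only

    def roll_back(self):
        """Point back at the copy source, discarding a half-finished update."""
        self.pointer = 1 - self.pointer
        return self.buffers[self.pointer]


area = ShadowArea([1, 2, 3])
work = area.begin_access()
work[0] = 99                 # partial update by a PM that then crashes
restored = area.roll_back()  # the untouched copy source
```

Because the writer never touches the copy source, restoring consistent contents after a crash costs a single pointer switch and no data has to be copied back.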
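
[0017] Also, the recovery means 15 in a processor 11 writes the current time to the time of its own PM identifier in the PM management table 6 and compares the current time with the times of the other PM identifiers. When the difference is equal to or greater than the threshold, it judges that the processor 11 with that PM identifier has crashed, notifies the other processors 11 of the crashed PM identifier and IPL count, and instructs the crashed processor 11 to re-IPL. In response to this instruction, the crashed processor 11 re-IPLs and then increments by 1 the IPL count of its own PM identifier in the PM management table 6. Meanwhile, one of the processors 11 that did not crash compares the notified PM identifier and IPL count with the PM identifier and IPL count set in the exclusion field 3. If they are equal, it switches the pointer in the new/old management field 5 from the new-version area 7 back to the old-version area 8 to restore the contents.

[0018] After the restoration, the PM identifier and IPL count in the exclusion field 3 are cleared to 0 and an exclusion-available notification is sent to the processor 11 with the next PM identifier in the exclusion-wait management field 4, so that access to the shared memory 1 resumes.

[0019] Accordingly, by providing the exclusion field 3, the exclusion-wait management field 4, and the new/old management field 5 in the shared memory 1 to manage exclusion, the PMs waiting for exclusion, and the new and old areas that store the data, and by providing the PM management table 6 holding a PM identifier, an IPL count, and a time to detect a crashed PM automatically and perform recovery, it becomes possible to manage and execute exclusion of the shared memory 1, the order of exclusion waiters, automatic detection of a crashed PM, and restoration of the shared memory 1.

[0020] [Embodiment] Next, the configuration and operation of an embodiment of the present invention are described in detail, in order, with reference to Figs. 2 through 10. In the following description the processor 11 is referred to as a PM.

[0021] Fig. 2 is an explanatory diagram (part 1) of the shared memory of the present invention. It shows the configuration in which an exclusion management table 2 is provided for each shared area. In Fig. 2, the exclusion management table 2 consists of the following, as illustrated, and manages exclusion and exclusion waiting for the new-version area 7 and old-version area 8 of the shared area.
・Exclusion field 3
・Exclusion-wait management field 4
・New/old management field 5

[0022] The exclusion field 3 indicates the exclusive state when a PM identifier and IPL count are set, and the unlocked state when 0 (zero) is set. The exclusion-wait management field 4 holds, for each PM identifier of the processors that access the shared memory 1 (here, PM identifiers 1 through X), a counter value representing the wait order, and thereby manages the order of exclusion waiters. The exclusion-wait counter 41 starts at 0. Each time an exclusion wait occurs it is incremented by 1, and the resulting value is set as the counter value of the waiting PM identifier. As a result, the processor whose identifier has the smallest non-zero counter value is the next processor waiting for exclusion.

[0023] The new/old management field 5 holds a pointer to the new-version area 7 or the old-version area 8 of the shared area whose exclusion is managed. The area it points to alternates between the new-version area 7 and the old-version area 8 each time exclusion is acquired.

[0024] As described above, for the shared areas in the shared memory 1, an exclusion management table 2 is provided for each pair, namely new-version area A and old-version area A, new-version area B and old-version area B, ..., new-version area Z and old-version area Z, to manage exclusion.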
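
The wait-order bookkeeping of [0022] amounts to a ticket scheme: a shared counter hands out increasing numbers, and the waiter holding the smallest non-zero number goes next. The sketch below assumes that a counter value of 0 means "not waiting", as the flowcharts in the source suggest; the `WaitQueue` class and its method names are hypothetical.

```python
class WaitQueue:
    """Exclusion-wait management field 4 plus exclusion-wait counter 41."""

    def __init__(self, pm_ids):
        self.wait_counter = 0                            # exclusion-wait counter 41
        self.counter_value = {pm: 0 for pm in pm_ids}    # per-PM counter values (0 = not waiting)

    def enqueue(self, pm_id):
        """Increment the shared counter and record it as this PM's place in line."""
        self.wait_counter += 1
        self.counter_value[pm_id] = self.wait_counter

    def clear(self, pm_id):
        """Mark the PM as no longer waiting."""
        self.counter_value[pm_id] = 0

    def next_waiter(self):
        """Return the PM with the smallest non-zero counter value, or None."""
        waiting = {pm: c for pm, c in self.counter_value.items() if c != 0}
        if not waiting:
            return None
        return min(waiting, key=waiting.get)


q = WaitQueue([1, 2, 3])
q.enqueue(3)               # PM-3 starts waiting first
q.enqueue(1)               # PM-1 starts waiting second
first = q.next_waiter()
q.clear(first)
second = q.next_waiter()
```

In this sketch the counter only ever grows, so arrival order is preserved without moving entries around; a real implementation would also need the increment to be atomic across PMs.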
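
[0025] Fig. 3 is an explanatory diagram (part 2) of the shared memory of the present invention. It takes the exclusion management table 2 for one pair of new-version area 7 and old-version area 8 from the shared area of Fig. 2 and additionally shows the PM management table 6, a single table provided for the whole shared memory 1 that is needed to detect a PM crash and recover from it.

[0026] The PM management table 6 holds an IPL count and a time for each PM identifier. At predetermined intervals each PM sets the IPL count of its own PM identifier in the PM management table 6 and sets the time to the current time. It also compares the current time with the times of the other PM identifiers, and when the difference is equal to or greater than the threshold it judges that the PM with that PM identifier has crashed and performs recovery processing (described later with reference to Fig. 7).

[0027] Next, the operation of the configuration of Fig. 1 at IPL time is described in detail, following the order shown in the flowchart of Fig. 4. In Fig. 4, S1 loads the PM. That is, at power-on, loading is performed for PM-1, PM-2, ..., PM-X of Fig. 1.

[0028] S2 determines whether this is the first IPL, that is, whether the IPL results from power-on or from a re-IPL instruction delivered by an interrupt from a PM that detected a crash. If YES, this is the first IPL, and processing proceeds through S3 to S11 in order. If NO, this is a re-IPL, and processing proceeds through S12 and then S8 to S11.

[0029] (1) First IPL: S3 clears the exclusion field 3 of the exclusion management table 2 to 0 (zero). S4 clears the exclusion-wait counter 41 of the exclusion management table 2 to 0 (zero).

[0030] S5 sets the PM's own PM identifier in the PM management table 6 and the exclusion management table 2. S6 sets the IPL count of its own PM identifier in the PM management table 6 to 1.

[0031] S7 makes the new/old management field 5 of the exclusion management table 2 point to the new-version area 7. S8 clears the counter value of its own PM identifier in the exclusion management table 2 to 0 (zero).

[0032] S9 sets the time of its own PM identifier in the PM management table 6 to the current time. S10 performs other initialization. S11 starts operation.

[0033] Steps S3 to S11 above perform the following series of power-on initial settings in the shared memory 1 of Fig. 1.
・Clear the exclusion field 3 of the exclusion management table 2 to 0.
・Clear the exclusion-wait counter 41 to 0.
・Set the own PM identifier in the exclusion-wait management field 4.
・Set a pointer to the new-version area 7 in the new/old management field 5.
・Set the own PM identifier, an IPL count of 1, and the current time in the PM management table 6.

[0034] (2) Re-IPL: S12 increments by 1 the IPL count of the own PM identifier in the PM management table 6.

[0035] S8 clears the counter value of the own PM identifier in the exclusion management table 2 to 0 (zero). S9 sets the time of the own PM identifier in the PM management table 6 to the current time.

[0036] S10 performs other initialization. S11 starts operation. Steps S12 and S8 to S11 above perform the following series of re-IPL initial settings in the shared memory 1 of Fig. 1.
・Set the counter value of the own PM identifier in the exclusion-wait management field 4 of the exclusion management table 2 to 0 (zero), indicating that the PM is not waiting for exclusion.
・Increment by 1 the IPL count of the own PM identifier in the PM management table 6 and set the current time.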
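
The two IPL paths of Fig. 4 can be summarised in a few lines. This is a rough sketch under stated assumptions: the shared memory is modelled as a plain dictionary, the key names (`exclusion_field`, `wait_counter`, `wait_order`, `pointer`, `pm_table`) and the `ipl` function are hypothetical, and S1 (loading) and S10 (other initialization) are omitted.

```python
import time


def ipl(shm, pm_id, first_ipl, now=None):
    """Initialise shared-memory state for one PM, following S2 to S12 of Fig. 4."""
    now = time.time() if now is None else now
    if first_ipl:                                                 # S2: YES
        shm["exclusion_field"] = (0, 0)                           # S3
        shm["wait_counter"] = 0                                   # S4
        shm["pm_table"][pm_id] = {"ipl_count": 1, "time": now}    # S5, S6
        shm["pointer"] = "new"                                    # S7
    else:                                                         # S2: NO (re-IPL)
        shm["pm_table"][pm_id]["ipl_count"] += 1                  # S12
    shm["wait_order"][pm_id] = 0                                  # S8: not waiting
    shm["pm_table"][pm_id]["time"] = now                          # S9
    return shm


shm = {"pm_table": {}, "wait_order": {}}
ipl(shm, 1, first_ipl=True, now=100.0)    # power-on
ipl(shm, 1, first_ipl=False, now=250.0)   # re-IPL after a crash
```

The re-IPL path deliberately leaves the exclusion field and the pointer alone; only the IPL count, the wait entry, and the timestamp of the restarting PM change, which is what makes a stale lock recognisable afterwards.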
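
[0037] Next, the normal operation of the configuration of Fig. 1 is described in detail, following the order shown in the flowchart of Fig. 5. In Fig. 5, S21 clears the counter value of the own PM identifier in the exclusion management table 2 to 0.

[0038] S22 issues an exclusion acquisition request. For example, a CS instruction, noted on the right-hand side of the flowchart, is issued with arguments such as those below, and in response the exclusion control means 14 performs the following processing.

[0039]
・PM identifier = 1
・IPL count = 1
・Address of the exclusion field = #1
・Exclusion field is null (0, zero)
S23 determines whether exclusion can be acquired, that is, whether the exclusion field 3 of the exclusion management table 2 that manages the area at exclusion-field address #1, as specified by the CS instruction of S22, is null (0, zero) and therefore unlocked. If YES, exclusion can be acquired, and S24 to S32 acquire exclusion, access the memory, release exclusion, and send an exclusion-available notification to the next PM in order. If NO, exclusion is already held and cannot be acquired, and S33 to S35 perform exclusion-wait processing.

[0040] (1) Acquiring exclusion, accessing, releasing exclusion, and notifying the next PM in order: S24 sets the PM identifier and IPL count in the exclusion field 3, for example PM identifier = 1 and IPL count = 1.

[0041] S25 returns a response indicating whether exclusion was acquired. S26 receives the response. S27 accesses the shared memory. That is, after exclusion has been acquired by setting the own PM identifier and IPL count in the exclusion field 3, the new-version area 7 (shared memory) pointed to by the pointer in the new/old management field 5 is accessed.

[0042] S28 releases exclusion. As noted on the right-hand side of the flowchart, this uses the settings
・PM identifier = 0
・IPL count = 0
・Address of the exclusion field = #1
that is, PM identifier = 0 and IPL count = 0 are set in the exclusion field 3 of the exclusion management table 2 that manages exclusion for exclusion-field address #1 of Fig. 1, indicating that exclusion is no longer held.

[0043] S29 clears the exclusion field to 0 and sets the unlocked state. S30 responds that the unlocked state has been set. S31 extracts, from the exclusion management table 2, the PM identifier whose counter value is non-zero and smallest (the PM identifier next in line to acquire exclusion).

[0044] S32 sends an exclusion-available notification to the PM corresponding to the PM identifier extracted in S31. The next PM in order, on receiving this notification, executes S21 and the subsequent steps.

[0045] In this way, in normal operation, in response to the exclusion acquisition request instruction (CS instruction), when the exclusion field 3 is 0 (zero) and therefore unlocked, the PM sets its own PM identifier and IPL count in the exclusion field 3 to acquire exclusion, accesses the shared memory, clears the exclusion field 3 to 0 to release exclusion, and then sends an exclusion-available notification to the next PM in order, namely the PM whose counter value in the exclusion-wait management field 4 is the smallest non-zero value. Exclusion on the shared memory is thus acquired and handed on to the next PM, which completes normal exclusion acquisition processing.

[0046] (2) Waiting because exclusion cannot be acquired: S33 increments the exclusion-wait counter 41 by 1. Because S23 resulted in NO and exclusion could not be acquired, a CDS instruction is issued to increment the value of the exclusion-wait counter 41 by 1.

[0047] S34 sets the value of the exclusion-wait counter 41 incremented in S33 as the counter value of the own PM identifier. S35 enters the exclusion-wait state. In S33 to S35, because a PM identifier and IPL count were set in the exclusion field 3 at S23 and exclusion could not be acquired, the PM sets the incremented value of the exclusion-wait counter 41 as the counter value of its own PM identifier, which fixes its position in the wait order, and starts waiting. When its turn comes, it receives an exclusion-available notification from the PM with the preceding counter value through S31 and S32 of Fig. 5 described above. It then performs S21 to S32 of Fig. 5: acquiring exclusion on the shared memory, accessing the shared memory, releasing exclusion, and sending an exclusion-available notification to the next PM in order.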
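
The acquire and release steps S22 to S29 can be illustrated with an emulated compare-and-swap on the exclusion field. Two assumptions are worth flagging: the source does not spell out the semantics of the CS instruction, so treating it as a compare-and-swap is a guess, and the `ExclusionField` class with its `threading.Lock` merely stands in for hardware atomicity. Releasing through a compare-and-swap against the current owner is likewise a choice made for this sketch; the flowchart simply writes zeros.

```python
import threading


class ExclusionField:
    """Exclusion field 3, holding (PM identifier, IPL count) or (0, 0) when unlocked."""

    def __init__(self):
        self._value = (0, 0)
        self._guard = threading.Lock()  # stands in for the atomicity of the CS instruction

    def compare_and_swap(self, expected, new):
        """Store `new` only if the field currently equals `expected`."""
        with self._guard:
            if self._value == expected:
                self._value = new
                return True
            return False

    def value(self):
        return self._value


def try_acquire(excl, pm_id, ipl_count):
    """S22 to S24: succeed only if the field is null."""
    return excl.compare_and_swap((0, 0), (pm_id, ipl_count))


def release(excl, pm_id, ipl_count):
    """S28, S29: clear the field, but only if this PM incarnation still owns it."""
    return excl.compare_and_swap((pm_id, ipl_count), (0, 0))


excl = ExclusionField()
got_pm2 = try_acquire(excl, 2, 1)        # PM-2 finds the field null
got_pm1 = try_acquire(excl, 1, 1)        # PM-1 finds it held and would start waiting
released = release(excl, 2, 1)
got_pm1_retry = try_acquire(excl, 1, 1)  # after the notification PM-1 retries
```

In the flow described above, a failed `try_acquire` corresponds to the NO branch of S23, after which the PM records its place in the wait order and blocks until it is notified.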
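
[0048] Next, the operation of monitoring for a PM crash in the configuration of Fig. 1 is described in detail, following the order shown in the flowchart of Fig. 6. In Fig. 6, S41 writes the current time to the time of the own PM identifier in the PM management table 6.

[0049] S42 compares the current time with the times of the other PM identifiers. S43 determines whether the difference found in S42 is equal to or greater than the threshold. That is, the current time is compared with the time of each other PM identifier in the PM management table 6 to determine whether the difference is at or above the threshold, which means the PM with that PM identifier has stopped updating its time and has crashed. If YES, S44 judges that the PM has crashed, S45 notifies all PMs of the PM identifier and IPL count of the crashed PM, and S46 makes the crashed PM re-IPL. If S43 results in NO, no crashed PM was found and processing ends.

[0050] In this way, at predetermined intervals each PM rewrites the time of its own PM identifier in the PM management table 6 in the shared memory 1 with the current time, compares the current time with the other times, and detects a PM crash when the difference is equal to or greater than the threshold. A crashed PM is thereby detected, its PM identifier and IPL count are reported to all PMs, and the crashed PM is made to re-IPL. A PM that received the PM identifier and IPL count of the crashed PM (either a specific PM or the PM that accesses the shared memory first) then restores the shared memory that the crashed PM was using, in accordance with Fig. 7. The crashed PM that was instructed to re-IPL performs S1, NO at S2, S12, and S8 to S11 of Fig. 4 as already described, and thus re-IPLs (see the description of Fig. 4).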
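
The monitoring loop of Fig. 6 is essentially a heartbeat check against the PM management table. The following sketch is illustrative only: the `find_crashed` function, the dictionary layout, and the 30-second threshold are hypothetical choices, and the broadcast (S45) and re-IPL request (S46) are left to the caller.

```python
def find_crashed(pm_table, self_id, now, threshold):
    """Refresh our own timestamp, then list the PMs whose timestamp is stale."""
    pm_table[self_id]["time"] = now                       # S41
    crashed = []
    for pm_id, entry in pm_table.items():
        if pm_id == self_id:
            continue
        if now - entry["time"] >= threshold:              # S42, S43
            crashed.append((pm_id, entry["ipl_count"]))   # S44: the pair to broadcast in S45
    return crashed


pm_table = {
    1: {"ipl_count": 1, "time": 100.0},   # stopped updating: crashed
    2: {"ipl_count": 1, "time": 158.0},   # the PM running this check
    3: {"ipl_count": 1, "time": 159.0},   # healthy
}
crashed = find_crashed(pm_table, self_id=2, now=160.0, threshold=30.0)
```

Reporting the IPL count together with the identifier matters: it names the specific incarnation that died, so a lock taken by the same PM after it restarts is not mistaken for a stale one.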
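
[0051] Next, the operation in which a PM that received, at S45 of Fig. 6, the PM identifier and IPL count of the PM detected as crashed recovers the shared memory in the configuration of Fig. 1 is described in detail, following the order shown in the flowchart of Fig. 7.

[0052] In Fig. 7, S51 issues an exclusion acquisition request. For example, a CS instruction, noted on the right-hand side of the flowchart, is issued with arguments such as the following.
・PM identifier = 1
・IPL count = 1
・Address of the exclusion field = #1
・Crashed PM identifier = 2
・Crashed IPL count = 1
S52 determines whether the contents of the exclusion field 3 show the PM identifier and IPL count of the crashed PM (the crashed PM is still holding exclusion) or are null (unlocked state). If YES, S53 to S61 perform recovery processing. If NO, recovery processing has already been carried out by another PM, or the crashed PM crashed after setting the exclusion field 3 to null (zero) and releasing exclusion, so no recovery processing is needed and processing ends.

[0053] The recovery processing is described below. In Fig. 7, S53 overwrites the exclusion field 3 with the PM identifier and IPL count. The PM performing recovery has thereby acquired exclusion.

[0054] S54 returns a response indicating whether exclusion was acquired. S55: the recovery means 15 of the PM receives the response. S56 restores the shared memory 1 (if the exclusion field 3 was found to be null, that is, unlocked, at S52, the shared memory does not need to be recovered and S56 is skipped). Here, the shared memory is restored by switching the pointer in the new/old management field 5 of Fig. 1 from the new-version area 7, which the crashed PM had copied into and was accessing, back to the old-version area 8 that was the copy source, which restores the state that existed before the crashed PM began its access.

[0055] S57 releases exclusion. As noted on the right-hand side of the flowchart, this uses the settings
・PM identifier = 0
・IPL count = 0
・Address of the exclusion field = #1
that is, PM identifier = 0 and IPL count = 0 are set in the exclusion field 3 of the exclusion management table 2 that manages exclusion for exclusion-field address #1 of Fig. 1, indicating that exclusion is no longer held.

[0056] S58 clears the exclusion field to 0 and sets the unlocked state. S59 responds that the unlocked state has been set. S60 extracts, from the exclusion management table 2, the PM identifier whose counter value is non-zero and smallest (the PM identifier next in line to acquire exclusion).

[0057] S61 sends an exclusion-available notification to the PM corresponding to the PM identifier extracted in S60. The next PM in order, on receiving this notification, executes S21 and the subsequent steps of the normal flow of Fig. 5 and accesses the shared memory 1.

[0058] In this way, among the PMs that received the PM identifier and IPL count of the crashed PM at S45, either a specific PM or the PM that first accesses the shared memory for recovery performs the recovery processing above. It restores the shared memory, clears the exclusion field 3 to 0, clears to 0 the exclusion-wait entry of the crashed PM in the exclusion-wait management field 4, and then sends an exclusion-available notification to the next PM in order.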
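
The recovery decision of Fig. 7 hinges on comparing the (PM identifier, IPL count) pair in the exclusion field with the pair reported for the crashed PM. The sketch below is a simplified, single-threaded illustration: the `recover` function, the dictionary keys, and the returned status strings are hypothetical, and the overwrite in S53 is shown as a plain assignment where a real system would need an atomic instruction.

```python
UNLOCKED = (0, 0)


def recover(shm, self_pm, crashed_pm):
    """Return (status, next_pm) after attempting recovery on behalf of `crashed_pm`.

    `self_pm` and `crashed_pm` are (PM identifier, IPL count) pairs.
    """
    owner = shm["exclusion_field"]
    if owner != crashed_pm and owner != UNLOCKED:       # S52: NO, nothing to recover
        return ("skipped", None)
    shm["exclusion_field"] = self_pm                    # S53: take over exclusion
    if owner == crashed_pm:
        shm["pointer"] = "old"                          # S56: fall back to the copy source
    shm["wait_order"][crashed_pm[0]] = 0                # crashed PM no longer waits ([0058])
    shm["exclusion_field"] = UNLOCKED                   # S57, S58
    waiting = {pm: c for pm, c in shm["wait_order"].items() if c != 0}
    next_pm = min(waiting, key=waiting.get) if waiting else None   # S60
    return ("recovered", next_pm)


# Crashed PM-1 (IPL count 1) still holds exclusion; PM-2 is waiting.
held = {"exclusion_field": (1, 1), "pointer": "new", "wait_order": {1: 0, 2: 5, 3: 0}}
result_held = recover(held, self_pm=(3, 1), crashed_pm=(1, 1))

# PM-1 has already re-IPLed (IPL count 2) and re-acquired exclusion.
restarted = {"exclusion_field": (1, 2), "pointer": "new", "wait_order": {1: 0, 2: 0, 3: 0}}
result_restarted = recover(restarted, self_pm=(3, 1), crashed_pm=(1, 1))
```

The second scenario shows why the IPL count is part of the comparison: the identifier alone would match and the recovering PM would wrongly roll back a live owner's update.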
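
[0059] Next, a concrete example of exclusion acquisition in normal operation is described in detail with reference to Fig. 8. The example assumes the following.
・PM-1 is waiting to acquire exclusion.
・PM-2 acquires exclusion.
・PM-3 is waiting to acquire exclusion.

[0060] (1) Operation of PM-2:
Clearing the counter value to 0: at IPL, the counter value of the own PM identifier in the exclusion-wait management field 4 is cleared to 0 (see Fig. 4).

[0061] Issuing the cs instruction (exclusion acquisition request): here the exclusion field 3 was 0, so
・own PM identifier = 2
・IPL count = 1
are set in the exclusion field 3 and exclusion is acquired.

[0062] The contents of the old-version area 8 are copied to the new-version area 7. The new-version area 7 is accessed. The new/old management field 5 is made to point to the new-version area 7.

[0063] Issuing the cs instruction (exclusion release request): here
・PM identifier = 0
・IPL count = 0
are set in the exclusion field 3, which sets the unlocked state and releases exclusion.

[0064] Issuing the snd instruction: the PM with the next counter value is notified that exclusion can be acquired. The PM that receives the notification, here PM-1, which is waiting for exclusion, carries out the same sequence as PM-2 from acquiring exclusion onward, and if another PM is waiting for exclusion it notifies that PM that exclusion can be acquired.

[0065] (2) Operation of PM-1:
Clearing the counter value to 0: at IPL, the counter value of the own PM identifier in the exclusion-wait management field 4 is cleared to 0 (see Fig. 4).

[0066] Issuing the cs instruction (exclusion acquisition request): here PM identifier = 2 and IPL count = 1 were set in the exclusion field 3, so exclusion acquisition fails.

[0067] Issuing the cds instruction (exclusion-wait request): the contents of the exclusion-wait counter 41 of the exclusion management table 2 are incremented by 1. Setting the counter value: the incremented contents of the exclusion-wait counter 41 are set as the counter value of own PM identifier = 1, which fixes the position in the wait order.

[0068] Issuing the rcv instruction: the PM enters the exclusion-wait state. When PM-2 then sends notification that exclusion can be acquired, PM-1 performs the same processing as PM-2: it sets own PM identifier = 1 and IPL count = 1 in the exclusion field 3 to acquire exclusion, copies the contents of the old-version area 8 to the new-version area 7 and switches the pointer, accesses the new-version area 7, and then sets PM identifier = 0 and IPL count = 0 in the exclusion field 3 to release exclusion. If the exclusion-wait management field 4 contains a PM identifier with the smallest non-zero counter value (for example, PM identifier = 3), it notifies that PM that exclusion can be acquired.

[0069] (3) Operation of PM-3: PM-3 performs the same processing as PM-1.
In this way, in normal operation PM-2 acquires exclusion, accesses the shared memory (new-version area 7), releases exclusion, and notifies PM-1, the next PM in the wait order, that exclusion can be acquired. PM-1, on receiving the notification, likewise acquires exclusion, accesses the shared memory, releases exclusion, and notifies the next PM, PM-3, that exclusion can be acquired. PM-3 then likewise acquires exclusion, accesses the shared memory, and releases exclusion. The PMs can thus acquire exclusion on the shared memory in turn and carry out their processing in normal operation.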
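
[0070] Next, a concrete example of processing when a crash occurs and recovery is performed is described in detail with reference to Fig. 9. The example assumes the following.
・A crash occurs in PM-1.
・PM-2 detects the crash.
・PM-3 performs recovery.

[0071] (1) Operation of PM-1:
S1: PM identifier = 1 and IPL count = 1 are set in the exclusion field 3. In this state,
・State of the exclusion field 3: PM identifier = 1, IPL count = 1
S2: the shared memory is accessed.

[0072] S3: a crash occurs in PM-1.
(2) Operation of PM-2 (detecting the crashed PM and making it re-IPL):
The recovery means 15 of PM-2, started at predetermined intervals, recognizes that the difference between the time of PM identifier = 1 in the PM management table 6 of the shared memory 1 and the current time exceeds the threshold. It therefore detects that PM-1, with PM identifier = 1, has crashed and notifies all PMs of the crashed PM's PM identifier = 1 and IPL count = 1.

[0073] It makes the crashed PM with PM identifier = 1 re-IPL.
(3) Operation of PM-3 (recovery operation):
PM-3 receives the notification from (2) and recognizes that the PM with PM identifier = 1 and IPL count = 1 has crashed.

[0074] It requests exclusion for crash recovery by issuing the cs instruction with
・PM identifier = 3
・IPL count = 1
・Address of the exclusion field = #1
・Crashed PM identifier = 1
・Crashed IPL count = 1
The PM identifier = 1 and IPL count = 1 set in the exclusion field 3 match the PM identifier and IPL count given in the crash notification, so recovery processing is performed.

[0075] The shared memory is restored. The pointer in the new/old management field 5 is switched from the new-version area 7 to the old-version area 8, returning the shared memory to its state before the crash.

[0076] The exclusion field 3 is cleared to 0 and set to the unlocked state.
・State of the exclusion field 3: PM identifier = 0, IPL count = 0
An exclusion-available notification is sent to the PM next in line to acquire exclusion (the PM whose counter value in the exclusion-wait management field 4 is the smallest non-zero value).

[0077] (4) Re-IPL of PM-1:
The crashed PM-1, on receiving the re-IPL notification from PM-2, re-IPLs and starts operation through S1, NO at S2, S12, and S8 to S11 of Fig. 4. As a result of this re-IPL, PM-1 has PM identifier = 1 and IPL count = 2, and at re-IPL it sets own PM identifier = 1 and IPL count = 2 in the PM management table 6 of Fig. 9, as shown by the right-pointing arrow in the figure.

[0078] In this way, PM-1 crashes, PM-2 detects the crash of PM-1 and makes PM-1 re-IPL, and PM-3 performs recovery. Next, another concrete example of processing when a crash occurs and recovery is performed is described in detail with reference to Fig. 10. The example assumes the following.
・A crash occurs in PM-1.
・PM-2 detects the crash.
・PM-3 performs recovery.

[0079] (1) Operation of PM-1:
S1: PM identifier = 0 and IPL count = 0 are set in the exclusion field 3. In this state,
・State of the exclusion field 3: PM identifier = 0, IPL count = 0
S2: a crash occurs in the PM. This is the case in which PM-1 crashes while it does not hold exclusion.

[0080] (2) Operation of PM-2 (detecting the crashed PM and making it re-IPL):
The recovery means 15 of PM-2, started at predetermined intervals, recognizes that the difference between the time of PM identifier = 1 in the PM management table 6 of the shared memory 1 and the current time exceeds the threshold. It therefore detects that PM-1, with PM identifier = 1, has crashed and notifies all PMs of the crashed PM's PM identifier = 1 and IPL count = 1.

[0081] It makes the crashed PM with PM identifier = 1 re-IPL.
(3) Re-IPL of PM-1:
The crashed PM-1, on receiving the re-IPL notification from PM-2, re-IPLs and starts operation through S1, NO at S2, S12, and S8 to S11 of Fig. 4. As a result of this re-IPL, PM-1 has PM identifier = 1 and IPL count = 2, and at re-IPL it sets own PM identifier = 1 and IPL count = 2 in the PM management table 6 of Fig. 9, as shown by the right-pointing arrow in the figure.

[0082] PM-1 acquires exclusion. The re-IPLed PM-1 issues an exclusion acquisition request and acquires exclusion by setting the exclusion field 3 as shown in the figure and below.
・State of the exclusion field: PM identifier: 1, IPL count: 2
It accesses the shared memory.

[0083] In this way, when PM-1 crashes after setting PM identifier = 0 and IPL count = 0 in the exclusion field 3, PM-2 detects the crash of PM-1 and makes PM-1 re-IPL, and the re-IPLed PM-1 acquires exclusion with PM identifier = 1 and IPL count = 2 and can access the shared memory. PM-3 attempts recovery processing but finds that it is unnecessary, and ends without performing any recovery.

[0084] (4) Operation of PM-3 (recovery operation):
PM-3 receives the notification from (2) and recognizes that the PM with PM identifier = 1 and IPL count = 1 has crashed.

[0085] It requests exclusion for crash recovery by issuing the cs instruction with
・PM identifier = 3
・IPL count = 1
・Address of the exclusion field = #1
・Crashed PM identifier = 1
・Crashed IPL count = 1
Because the IPL counts differ, exclusion cannot be acquired and recovery is not needed. The crashed PM-1, after its re-IPL, issued an exclusion acquisition request and set the exclusion field 3 to PM identifier = 1 and IPL count = 2, the exclusion-held state, before PM-3 acted on the crash notification. The notified IPL count = 1 therefore does not match, and PM-3 judges that recovery processing is unnecessary.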
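
[0086] [Effects of the Invention] As described above, the present invention adopts a configuration in which the exclusion field 3, the exclusion-wait management field 4, and the new/old management field 5 are provided in the shared memory 1 to manage exclusion, the PMs waiting for exclusion, and the new and old areas that store the data, and in which the PM management table 6 holding a PM identifier, an IPL count, and a time is provided to detect a crashed PM automatically and perform recovery. Exclusion of the shared memory 1, the order of exclusion waiters, automatic detection of a crashed PM, and restoration of the shared memory 1 can therefore be managed and executed. As a result:
(1) At predetermined intervals each PM sets the current time as the time of its own PM identifier in the PM management table 6, and when the difference between another PM's time and the current time is equal to or greater than the threshold, it detects that the PM with that PM identifier has crashed. It notifies all PMs of the crash by sending the PM identifier and IPL count, recovery processing is performed, and the crashed PM is made to re-IPL, so that restart happens automatically.

[0087] (2) As part of recovery processing, an exclusion-available notification is sent to the next PM waiting for exclusion, so that the suspended access to the shared memory is resumed automatically.

[0088] (3) The new-version area 7 and the old-version area 8 are used alternately by switching a pointer each time exclusion is acquired, and when a PM crash occurs the pointer is switched back to the old-version area 8. The contents of the shared memory can therefore be restored easily.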
Regarding shared memory access method for accessing owned memory
Things. 2. Description of the Related Art Conventionally, as shown in FIG.
Sessa (hereinafter referred to as PM) via bus or line
To access the shared memory and perform a series of processes.
There is a multiprocessor system. Each PM has multiple spaces
Each has its own processing and its core is a bus or
Acquires exclusive access to the shared memory via the line and accesses
You. At this time, if multiple PMs access the shared memory,
In this case, exclusion is required between PMs. A state in which a certain PM has acquired exclusive memory exclusion
If the PM crashes and stops working, the
Collect the exclusive environment that was acquired and used by the PM
If not, other PMs will be in a state of waiting for exclusion of the shared memory.
As a result, a series of processing does not proceed. For this purpose
The exclusive environment acquired by the rushed PM must be collected.
It is necessary. [0004] Also, the crashed PM was using
When other PMs start using shared memory, crash
The contents of the working PM may remain.
And use the contents of the shared memory in this state as is.
The contents are incomplete and other PMs malfunction
That happens. [0005] As described above, as described above,
Multiple PMs access the shared memory and perform a series of processing
To do so, after each PM gains exclusion of the shared memory,
By accessing and releasing the exclusion, the contents of the shared memory
Has ensured its reliability. At this time, exclusion of shared memory
If the acquired PM crashes, this shared memo
Other PMs waiting to use the memory use the shared memory.
There was a problem that it would not be possible. Also, PM
If a crash occurs, the crashed PM will take exclusion
Waiting for the use of shared memory that was acquired and used
Other PMs can no longer use the shared memory
Is not recognized and processing is interrupted
There was also a problem. [0006] The present invention solves these problems,
Exclusive field and exclusive wait management field in shared memory
P, waiting for exclusion, exclusive with new and old management fields
M, manages new / old areas for storing data
PM management table having PM identifier, IPL count, and time
To detect crash PM automatically and perform recovery.
Exclusion of shared memory, exclusion wait order, crash P
Manage and execute automatic detection of M and restoration of shared memory
It is intended to be. FIG. 1 is a block diagram showing the principle of the present invention.
A diagram is shown. In FIG. 1, a shared memory 1 includes a plurality of programs.
The processor 11 acquires the exclusion via the bus or the line.
A memory to be accessed by means of exclusive control table 2,
PM management table 6, new area 7, and old area 8
It is composed of The exclusion management table 2 is a table for managing exclusion.
Exclusion field 3, exclusion wait management file
Field 4 and new and old management field 5
Things. The exclusive field 3 is an area of the shared memory 1
Identifier and IPL times of the processor 11 that has acquired the
Set the number. 0 (zero) when no exclusion is in progress
Set. The exclusion wait management field 4 contains an exclusion wait
The waiting order is displayed in association with the PM identifier of the processor 11.
The counter value is set. New and old management field
The field 5 is a new version area 7 in which the exclusion field 3 controls exclusion.
Or a pointer that points to any of the old edition area 8
Is set. The PM management table 6 stores the
This is for detecting a crash, and every predetermined time
, Each processor 11 associates its own PM identifier with I
Set the number of times and time of PL, and set the current time and other PM
When the time of the discriminator differs from the time of the
For determining that the processor 11 has crashed
It is. Processors (PM-1, PM-2... P
MX) 11 is a shared memory via a bus or a line.
1 to perform various processing by accessing
A space for performing two or more various processes is provided.
You. Each space queues messages in a message queue
To notify other spaces and the core 12 of data and various requests.
You. The core 12 controls the entire system.
Here, the exclusive control means 1 provided in the nuclear thread 13 is used here.
4 and from recovery means 15 called from nucleus 12
It is composed. The exclusive control means 14 controls the exclusion of the shared memory 1.
Is what you get. Recovery means 15
For each time, the time of the own PM identifier of the PM management table 6 is set to the current time.
And compare it with the time of another PM identifier.
When there is a difference equal to or greater than the fixed threshold,
Determines that the server has crashed and performs recovery processing
Things. According to the present invention, as shown in FIG.
When the exclusive control means 14 accesses the shared memory 1
The exclusive field 3 in the shared memory 1 is not exclusive
Sometimes, set the own PM identifier in the exclusive field 3
Accessed shared memory 1 in exclusive state
After that, the exclusive field 3 is set to the non-exclusive state and the exclusive
The PM order of the next order set in the management field 4
An exclusive acquisition possible notification is transmitted to the processor 11 of another child,
In the exclusive state, the own P
Set a counter value corresponding to the M identifier and wait
I'm trying. The exclusive control means 14 of the processor 11
Sets its own PM identifier in the exclusion field 3 and acquires exclusion
When accessing shared memory 1, new and old management files are
Old version area 8 pointed by the pointer of field 5
The contents are taken out and copied to the new edition area 7 and the new edition
The pointer to the area 7 is set in the new and old management field 5,
The new edition area 7 is accessed. The recovery means 1 in the processor 11
5 is the current time at the time of the own PM identifier in the PM management table 6
And write the current time and the time of another PM identifier.
When there is a difference equal to or greater than the threshold value,
It is determined that the processor 11 has crashed,
The PM identifier and IPL count of the crash
Notify the processor 11 and the processor that crashed
11 re-IPL, and in response to this
PM management software after the processor 11
The number of IPLs of the own IPL identifier of the cable 6 is incremented by one, and
Not notified by one of the processors 11 that does not crash
Received PM identifier, IPL count and exclusive field
3 with the PM identifier and IPL count set in
When it is determined that they are equal,
Switch the interface from the new version area 7 to the old version area 6 and restore
I am trying to. Further, after the restoration, the P
Clear M identifier and IPL count to 0, and wait for exclusion
The processor of the next PM identifier in the management field 4
Sends an exclusive acquisition notice to the shared memory 1
Access is resumed. Therefore, the exclusive field is stored in the shared memory 1.
3, exclusive waiting management field 4, new and old management field 5
To store exclusion, PM waiting for exclusion, and data
Manage old area, PM identifier, IPL frequency, time
Provide PM management table 6 with
Detection and recovery, the shared memory 1
Exclusion, exclusion waiting order, crash PM automatic detection,
Can manage and execute restoration of shared memory 1
It becomes. Next, an embodiment of the present invention will be described with reference to FIGS.
The configuration and operation of the example will be sequentially described in detail. The following process
A description will be given below with the case where the monitor 11 is PM. FIG. 2 is an explanatory diagram of the shared memory of the present invention (the
1) is shown. This means that the exclusive management table 2 is shared
The configuration when each is provided is shown. In FIG. 2, exclusive management
The table 2 is composed of an exclusive field 3, an exclusive waiting management field 4, and an old and new management field 5, as shown in the figure.
Manages exclusion and waiting for exclusion of the old version area 8
is there. The exclusive field 3 contains the PM identifier and I
When the number of PLs is set, it indicates an exclusive state, and 0
When (zero) is set, it indicates an exclusive non-state. Exclusive
The wait management field 4 accesses the shared memory 1
Processor PM identifier (here, PM identifier 1
X) and the counter value indicating the order of exclusion waiting
It sets and manages the order of waiting for exclusion. Coun
The exclusion wait counter 41 is initially set to 0,
Each time a wait occurs, the value is incremented by 1 and the value is set as the wait order P
It is set as a counter value associated with the M identifier. this
Of the identifier with the smallest counter value
Is the next processor waiting for exclusion. The new and old management fields 5 are used to store data in the shared area.
Point to new version area 7 or old version area 8 to manage other
Set a pointer to
Alternately pointing to plate area 7 and old version area 8
It is something that can be obtained. As described above, the shared area in the shared memory 1
About new edition area A and old edition area A, new edition area B and old edition
Area B, exclusive management table for each of the new edition area Z and the old edition area Z
A bull 2 is provided to manage exclusion. FIG. 3 is an explanatory diagram of the shared memory of the present invention (the
2) is shown. This is one new case with the shared area of FIG.
Exclusive management table for version area 7 and old version area 8
Out of the shared memory 1
P required to detect and recover from PM crash
This shows a state in which an M management table 6 is provided. The PM management table 6 corresponds to a PM identifier.
In addition, the number of times of IPL and the time are set. each
The PM identifies its own PM in the PM management table 6 every predetermined time.
If you set the number of child IPL and set the current time to time
In both cases, the current time is compared with the time of another PM identifier,
If the difference is greater than or equal to the threshold, the PM with that PM identifier
It is determined that the recovery has been performed, and the recovery process is performed (using FIG. 7).
See below). Next, in the order shown in the flowchart of FIG.
Therefore, the operation at the time of IPL of the configuration of FIG. 1 will be described in detail.
In FIG. 4, S1 loads the PM. this
PM-1, PM-2,... PM in FIG.
Load at -X. In step S2, it is determined whether the IPL is the first IPL. this
Is a power-on IPL or other
Re-IPL instruction is issued by an interrupt from the PM
Determine if there is. If yes, then the first IPL
Since it is found, the processing is performed in the order of S3 to S11. N
In the case of O, it has been determined that the IPL is a re-IPL, so S12 and S8
The processing is performed in the order from to S11. (1) In the case of the first IPL: S3 is
Clear exclusive field 3 of other management table 2 to 0 (zero)
I do. S4 is an exclusion wait counter of the exclusion management table 2.
41 is cleared to 0 (zero). In S5, the PM management table 6 and the exclusion management
The own PM identifier is set in the cable 2. S6 is the PM management
1 is set to the number of IPLs of the own PM identifier of the cable 6. At S7, the new and old management files in the exclusive management table 2 are stored.
Field 5 points to new edition area 7. S8 is an exhaust
Set the counter value of the own PM identifier in the other management table 2 to 0
(Zero) Clear. S9 is the identification of the own PM in the PM management table 6.
Set the current time to the child's time. S10 is other initial
Is performed. S11 starts operation. By the above-described steps S3 to S11, FIG.
・ Exclusion field 3 of the exclusive management table 2 in the resident memory 1 ・ Clear the exclusion field 3 ・ Clear the exclusion wait counter 41 to 0 ・ Set the own PM identifier in the exclusion management field 4 ・ Set a pointer to the new version area 7 in the new / old management field 5
In the fixed / PM management table 6, the own PM identifier and the number of IPL
1. Perform a series of initial settings at power-on, such as setting the current time. (2) In the case of re-IPL: S12 is P
Set the number of IPL of the own PM identifier in the M management table 6 to +1
Set. In step S8, the own PM is identified in the exclusive management table 2.
The child counter value is cleared to 0 (zero). S9 is PM tube
Set the current time to the time of the own PM identifier in the management table 6
You. In step S10, other initialization is performed. S1
1 starts operation. From S12 and S8 above to S11
Therefore, the counter value of the own PM identifier of the exclusive management field 4 of the exclusive management table 2 in the shared memory 1 in FIG.
Set to 0 (set to not wait for exclusion) + Add to IPL count of own PM identifier in PM management table 6
1. Initial setting at the time of a series of re-IPL to set the current time
Perform settings. Next, in the order shown in the flowchart of FIG.
Therefore, the normal operation of the configuration of FIG. 1 will be described in detail. Figure
In S5, S21 determines the own PM identification in the exclusion management table 2.
Clear the counter value of another child to 0. In step S22, an exclusive acquisition request is issued. this
Is, for example, the following argument in the CS command described on the right side
Is set and issued, the exclusive control means 14
Performs the following processing. PM identifier = 1 IPL count = 1 Exclusive field address = # 1 Exclusive field is null (0, zero) S23 determines whether exclusive acquisition is possible. This is S22
Address of exclusive field specified by CS instruction = #
The exclusive file in the exclusive management table 2 that exclusively manages the area 1
Field 3 is set to null (0, zero) and the
It is determined whether another acquisition is possible. In case of YES, exclusive acquisition is possible
No, the exclusive acquisition and access from S24 to S32
Access, exclusive release, and notification of exclusive acquisition possible to the next order PM
Do. On the other hand, in the case of NO, exclusive
The exclusive wait process is performed in steps S33 to S35.
U. (1) Exclusive acquisition, access, exclusive release,
In the case of notifying the next PM of the exclusive acquisition possible: S24
Sets PM identifier and IPL count in exclusive field 3
I do. For example, PM identifier = 1, IPL count = 1
You. In step S25, an exclusive acquisition response (presence or absence) is made.
U. In step S26, a response is received. In step S27, the shared memory
Perform access. This is because the exclusive PM
After acquiring the exclusion by setting the identifier and the number of IPLs,
New version area pointed by the pointer in management field 5
7 (shared memory). In step S28, exclusive release is performed. This is on the right
As described, PM identifier = 0 IPL count = 0 Exclusive field address = # 1 setting, that is, exclusive field address = # 1 in FIG.
Exclusion field 3 of exclusion management table 2 that manages exclusion
Wait for exclusion as PM identifier = 0, IPL count = 0
To the effect. In step S29, the exclusive field is cleared to 0,
Set to the exclusive state. In step S30, the exclusive state is set.
To respond. S31 is the count in the exclusion management table 2.
The PM identifier whose data value is other than 0 and is the smallest (the next
Other PM identifiers in the order of acquisition are taken out. In S32, an exclusion-acquirable notification is transmitted to the PM corresponding to the PM identifier extracted in S31. The next-order PM that receives this notification executes S21 and the subsequent steps.

As described above, in the normal state a PM issues an exclusion acquisition request (CS instruction) and, when exclusive field 3 is 0 (zero), that is, in the non-exclusive state, sets its own PM identifier and IPL count in exclusive field 3 to acquire the exclusion. It then accesses the shared memory, clears exclusive field 3 to 0 to release the exclusion, and sends an exclusion-acquirable notification to the next-order PM, namely the PM whose counter value in exclusion wait management field 4 is the smallest value other than 0. In this way a PM acquires the exclusion of the shared memory, accesses it, and notifies the next-order PM, so that exclusion acquisition in the normal state can be carried out.

(2) When the exclusion cannot be acquired and the PM waits:

In S33, the exclusion wait counter 41 is incremented by 1. Because S23 resulted in NO (the exclusion could not be acquired), the PM issues a CDS instruction to add 1 to the value of the exclusion wait counter 41. In S34, the value of the exclusion wait counter 41 incremented in S33 is set as the counter value for the own PM identifier. In S35, the PM waits for exclusion acquisition.

Through S33 to S35, a PM that could not acquire the exclusion in S23, because the identifier and IPL count of another PM were already set in exclusive field 3, sets the incremented value of the exclusion wait counter 41 as the counter value for its own PM identifier, thereby fixing its position in the waiting order, and waits. When its turn comes, it is notified through S31 and S32 of FIG. 5 by the PM with the preceding counter value that the exclusion can be acquired, and through S21 to S32 it acquires the exclusion of the shared memory, accesses the shared memory, releases the exclusion, and notifies the next PM that the exclusion can be acquired.
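
The acquire, wait, and release steps of the normal state (S21 to S35) can be sketched as follows. This is a minimal single-threaded Python model and not the patented implementation: the class `ExclusionTable`, the method `cs` (standing in for the CS instruction), and the functions `acquire` and `release` are hypothetical names. Clearing a PM's own counter value when it acquires the exclusion is an assumption, made so that the smallest-non-zero rule selects the next waiter.

```python
from dataclasses import dataclass, field

FREE = (0, 0)  # exclusive field value meaning "non-exclusive state"


@dataclass
class ExclusionTable:
    """Hypothetical model of exclusion management table 2."""
    exclusive_field: tuple = FREE                 # (PM identifier, IPL count)
    wait_counter: int = 0                         # exclusion wait counter 41
    waiting: dict = field(default_factory=dict)   # PM identifier -> counter value

    def cs(self, expected, new):
        """Compare-and-swap on the exclusive field (stand-in for the CS instruction)."""
        if self.exclusive_field == expected:
            self.exclusive_field = new
            return True
        return False


def acquire(table, pm_id, ipl_count):
    """Take the exclusion if the field is free; otherwise join the wait order."""
    if table.cs(FREE, (pm_id, ipl_count)):
        table.waiting[pm_id] = 0       # assumption: the holder no longer counts as waiting
        return True
    table.wait_counter += 1            # S33: increment the exclusion wait counter
    table.waiting[pm_id] = table.wait_counter  # S34: record this PM's waiting order
    return False                       # S35: caller now waits for a notification


def release(table, pm_id, ipl_count):
    """Clear the field and return the PM identifier to notify next, or None."""
    table.cs((pm_id, ipl_count), FREE)
    queued = {pm: c for pm, c in table.waiting.items() if c != 0}
    if not queued:
        return None
    return min(queued, key=queued.get)  # smallest counter value other than 0
```

With PM-2 holding the exclusion and PM-1 and PM-3 queued in that order, `release` returns 1 for PM-2 and, once PM-1 has acquired and released in turn, 3.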

Next, the operation of monitoring PM crashes in the configuration of FIG. 1 is described in detail, following the order shown in the flowchart of FIG. 6.

In FIG. 6, in S41, the current time is written to the time entry for the own PM identifier in PM management table 6. In S42, the current time is compared with the times of the other PM identifiers. In S43, it is determined whether the difference obtained in S42 is equal to or greater than a threshold. That is, the current time is compared with the time of each other PM identifier in PM management table 6, and when the difference is equal to or greater than the threshold, the PM with that PM identifier has not been setting the current time and is judged to have crashed. In the case of YES, it is determined in S44 that the PM has crashed, the PM identifier and IPL count of the crashed PM are notified to all PMs in S45, and the crashed PM is made to re-IPL in S46. In the case of NO in S43, no crashed PM was found, so the process ends.

As described above, each PM rewrites, at predetermined time intervals, the time for its own PM identifier in PM management table 6 in shared memory 1 to the current time, compares the current time with the times of the other PMs, and detects a PM as crashed when the difference is equal to or greater than the threshold. The crashed PM is thereby detected, its PM identifier and IPL count are notified to all PMs, and the crashed PM can be made to re-IPL. Among the PMs that receive the notification of the crashed PM's identifier and IPL count, one PM (a specific PM, or the PM that accesses the shared memory first) restores the shared memory that the crashed PM was using, in accordance with FIG. 7. The crashed PM that is instructed to re-IPL performs S1, NO in S2, S12, and S8 to S11 of FIG. 4 and re-IPLs (see the description of FIG. 4).
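
The time-stamp monitoring of S41 to S44 can be sketched as follows. This is an illustrative Python model only: `PmEntry`, `heartbeat`, and `find_crashed` are hypothetical names, the table is an ordinary dictionary rather than shared memory, and times are plain numbers in arbitrary units.

```python
from dataclasses import dataclass


@dataclass
class PmEntry:
    """One row of PM management table 6 (field names are illustrative)."""
    ipl_count: int
    last_seen: float  # time last written by the owning PM


def heartbeat(pm_table, own_id, now):
    """S41: write the current time into the row of the own PM identifier."""
    pm_table[own_id].last_seen = now


def find_crashed(pm_table, own_id, now, threshold):
    """S42 to S44: list (PM identifier, IPL count) of other PMs whose time is stale."""
    return [
        (pm_id, entry.ipl_count)
        for pm_id, entry in sorted(pm_table.items())
        if pm_id != own_id and now - entry.last_seen >= threshold
    ]
```

Each pair returned by `find_crashed` corresponds to what S45 notifies to all PMs; the notification and the re-IPL instruction of S46 are outside the scope of this sketch.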

Next, following the order shown in the flowchart of FIG. 7, the operation in which a PM that has received, in S45 of FIG. 6, the notification of the PM identifier and IPL count of the PM detected as crashed recovers the shared memory in the configuration of FIG. 1 is described in detail.

In FIG. 7, in S51, an exclusion acquisition request is issued. For example, the CS instruction shown at the right of the figure is issued with arguments set as follows:
- PM identifier = 1
- IPL count = 1
- Exclusive field address = #1
- Crashed PM identifier = 2
- Crashed IPL count = 1

In S52, it is determined whether the contents of exclusive field 3 are the PM identifier and IPL count of the crashed PM (the crashed PM still holds the exclusion) or null (the non-exclusive state). In the case of YES, recovery processing is performed in S53 to S61. In the case of NO, either recovery processing has already been executed by another PM, or the crashed PM crashed after setting null (zero) in exclusive field 3 to release the exclusion; no recovery processing is needed, so the process ends.

The recovery processing is described below. In FIG. 7, in S53, exclusive field 3 is overwritten with the PM identifier and IPL count. This means that the PM executing the recovery processing has acquired the exclusion. In S54, an exclusion acquisition response (acquired or not) is returned. In S55, recovery means 15 of the PM receives the response. In S56, shared memory 1 is restored (when the contents of exclusive field 3 were found in S52 to be null, that is, the non-exclusive state, restoration of the shared memory is unnecessary and S56 is skipped). Here, the shared memory is restored by switching the pointer in new/old management field 5 of FIG. 1 from new version area 7, into which the crashed PM had copied the data and which it was accessing, to old version area 8, the copy source, thereby returning to the state immediately before the switch.

In S57, the exclusion is released. As shown at the right of the figure, this is done by setting PM identifier = 0, IPL count = 0, and exclusive field address = #1; that is, exclusive field 3 of exclusion management table 2 at exclusive field address = #1 in FIG. 1 is instructed to be put in the non-exclusive state with PM identifier = 0 and IPL count = 0. In S58, the exclusive field is cleared to 0 and set to the non-exclusive state. In S59, a response that the non-exclusive state has been set is returned. In S60, the PM identifier whose counter value in exclusion management table 2 is the smallest value other than 0 (the PM identifier next in the exclusion acquisition order) is extracted. In S61, an exclusion-acquirable notification is transmitted to the PM corresponding to the PM identifier extracted in S60. The next-order PM that receives this notification performs the processing from S21 of FIG. 5 for the normal state and accesses the shared memory.

As described above, among the PMs that received the PM identifier and IPL count of the crashed PM in S45, a specific PM, or the PM that first accesses the shared memory for recovery, performs the above recovery processing: it restores the shared memory, clears exclusive field 3, clears to 0 the exclusion wait entry of the crashed PM in exclusion wait management field 4, and then sends an exclusion-acquirable notification to the next PM.

Next, a specific example of the exclusion acquisition processing in the normal state is described in detail with reference to FIG. 8.
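
Before turning to the examples, the recovery flow of FIG. 7 (S51 to S61) can be summarised in code. The following Python sketch is an illustration under stated assumptions, not the patented implementation: the function `recover` and the dictionary keys `exclusive`, `pointer`, and `waiting` are hypothetical, and the check in S52 and the overwrite in S53 are written as two ordinary statements, whereas on real shared memory they would have to be a single atomic compare-and-swap.

```python
FREE = (0, 0)  # exclusive field value for the non-exclusive state


def recover(shared, own, crashed):
    """Sketch of S51 to S61; returns the PM identifier to notify next, or None.

    `shared` is a hypothetical dict standing in for shared memory 1:
      'exclusive' -> (PM identifier, IPL count) held in exclusive field 3
      'pointer'   -> 'new' or 'old' (new/old management field 5)
      'waiting'   -> {PM identifier: counter value} (wait management field 4)
    `own` and `crashed` are (PM identifier, IPL count) pairs.
    """
    held_by_crashed = shared["exclusive"] == crashed
    if not held_by_crashed and shared["exclusive"] != FREE:
        return None                      # S52 NO: no recovery is needed
    shared["exclusive"] = own            # S53: the recovering PM takes the exclusion
    if held_by_crashed:
        shared["pointer"] = "old"        # S56: fall back to the pre-crash copy
    shared["waiting"][crashed[0]] = 0    # remove the crashed PM from the wait order
    shared["exclusive"] = FREE           # S57 to S58: release the exclusion
    queued = {pm: c for pm, c in shared["waiting"].items() if c != 0}
    return min(queued, key=queued.get) if queued else None  # S60
```

If the exclusive field holds a pair different from the notified one, for instance the same PM identifier with a higher IPL count because the crashed PM has already re-IPLed and acquired the exclusion again, the function returns without touching the shared state.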
Here, an example is described in which PM-1 is waiting for exclusion acquisition, PM-2 acquires the exclusion, and PM-3 is waiting for exclusion acquisition.

(1) Operation of PM-2:
- Counter value clear: the counter value is cleared to 0 (see FIG. 4).
- cs instruction issue (exclusion acquisition request): since exclusive field 3 was 0, the own PM identifier = 2 and IPL count = 1 are set in exclusive field 3 to acquire the exclusion.
- The contents of old version area 8 are copied to new version area 7.
- New version area 7 is accessed.
- New/old management field 5 is made to point to new version area 7.
- cs instruction issue (exclusion release request): PM identifier = 0 and IPL count = 0 are set in exclusive field 3 to put it in the non-exclusive state and release the exclusion.
- snd instruction issue: the exclusion-acquirable state is notified to the PM with the next counter value. The PM that receives the notification, here PM-1, which was waiting for exclusion acquisition, performs the same processing as PM-2 and, if a PM is waiting for the exclusion, notifies the next PM of the exclusion-acquirable state.

(2) Operation of PM-1:
- Counter value clear: the counter value is cleared to 0 (see FIG. 4).
- cs instruction issue (exclusion acquisition request): since PM identifier = 2 and IPL count = 1 have already been set in exclusive field 3, exclusion acquisition fails.
- cds instruction issue (exclusion acquisition wait request): the content of exclusion wait counter 41 of exclusion management table 2 is incremented by 1.
- Counter value setting: the incremented content of exclusion wait counter 41 is set as the counter value of PM-1, fixing its waiting order.
- rcv instruction issue: PM-1 enters the exclusion wait state. When, in this state, it is notified by PM-2 of the exclusion-acquirable state, it performs the same processing as PM-2: it sets PM identifier = 1 and IPL count = 1 in exclusive field 3 to acquire the exclusion, copies the contents of old version area 8 to new version area 7, switches the pointer, and accesses new version area 7. It then sets PM identifier = 0 and IPL count = 0 in exclusive field 3 to release the exclusion and, if exclusion wait management field 4 contains a PM identifier with the smallest counter value other than 0 (for example, PM identifier = 3), notifies that PM of the exclusion-acquirable state.

(3) Operation of PM-3: the same processing as PM-1 is performed.

As described above, PM-2 acquires the exclusion, accesses the shared memory (new version area 7), releases the exclusion, and notifies PM-1, the next PM in the exclusion wait order, of the exclusion-acquirable state. PM-1, on receiving the notification, likewise acquires the exclusion, accesses the shared memory, releases the exclusion, and notifies the next PM, PM-3, of the exclusion-acquirable state. PM-3 then likewise acquires the exclusion, accesses the shared memory, and releases the exclusion. In this way, in the normal state each PM can acquire the exclusion of the shared memory in turn and carry out its processing.

Next, with reference to FIG. 9,
a specific example of the processing performed when a crash occurs and recovery is carried out is described in detail. Here, an example is described in which PM-1 crashes, PM-2 detects the crash, and PM-3 performs the recovery.

(1) Operation of PM-1:
- S1: PM identifier = 1 and IPL count = 1 are set in exclusive field 3. State of exclusive field 3 at this point: PM identifier = 1, IPL count = 1.
- S2: the shared memory is accessed.
- S3: a crash occurs in PM-1.

(2) Operation of PM-2 (detection of the crashed PM and re-IPL of the crashed PM):
- Recovery means 15 of PM-2, started at predetermined time intervals, confirms that the difference between the time for PM identifier = 1 in PM management table 6 of shared memory 1 and the current time exceeds the threshold, and detects that PM-1 with PM identifier = 1 has crashed. It therefore notifies all PMs of the crashed PM's PM identifier = 1 and IPL count = 1.
- The crashed PM with PM identifier = 1 is made to re-IPL.

(3) Operation of PM-3 (recovery operation):
- On receiving the notification of (2), PM-3 recognizes that the PM with PM identifier = 1 and IPL count = 1 has crashed.
- PM-3 requests acquisition of the exclusion for crash recovery. cs instruction issue: PM identifier = 3, IPL count = 1, exclusive field address = #1, crashed PM identifier = 1, crashed IPL count = 1.
- Since the PM identifier = 1 and IPL count = 1 set in exclusive field 3 match the PM identifier and IPL count of the crashed PM, recovery processing is performed.
- The shared memory is restored: the pointer of new/old management field 5 is switched from new version area 7 to point to old version area 8, returning the shared memory to its state before the crash.
- Exclusive field 3 is cleared to 0 and put in the non-exclusive state. State of exclusive field 3: PM identifier = 0, IPL count = 0.
- An exclusion acquisition notification is sent to the PM next in the exclusion acquisition order (the PM whose counter value in exclusion wait management field 4 is the smallest value other than 0).

(4) Re-IPL of PM-1:
- On receiving the re-IPL notification from PM-2, the crashed PM-1 re-IPLs and starts operating through S1, NO in S2, S12, and S8 to S11 of FIG. 4.
- By this re-IPL, PM-1 becomes PM identifier = 1, IPL count = 2, and sets its own PM identifier = 1 and IPL count = 2 in PM management table 6, as shown in the PM management table at the right of the figure.

As described above, when PM-1 crashes, PM-2 detects the crash of PM-1 and re-IPLs PM-1, and PM-3 can perform the recovery.

Next,
with reference to FIG. 10, another specific example of the processing performed when a crash occurs and recovery is carried out is described in detail. Here, a case is described in which PM-1 crashes, PM-2 detects the crash, and PM-3 attempts the recovery.

(1) Operation of PM-1:
- S1: PM identifier = 0 and IPL count = 0 are set in exclusive field 3. State of exclusive field 3 at this point: PM identifier = 0, IPL count = 0.
- S2: a crash occurs in the PM. That is, PM-1 crashes while it holds no exclusion.

(2) Operation of PM-2 (detection of the crashed PM and re-IPL of the crashed PM):
- Recovery means 15 of PM-2, started at predetermined time intervals, confirms that the difference between the time for PM identifier = 1 in PM management table 6 of shared memory 1 and the current time exceeds the threshold, and detects that PM-1 with PM identifier = 1 has crashed. It therefore notifies all PMs of the crashed PM's PM identifier = 1 and IPL count = 1.
- The crashed PM with PM identifier = 1 is made to re-IPL.

(3) Re-IPL of PM-1:
- On receiving the re-IPL notification from PM-2, the crashed PM-1 re-IPLs and starts operating through S1, NO in S2, S12, and S8 to S11 of FIG. 4.
- By this re-IPL, PM-1 becomes PM identifier = 1, IPL count = 2, and sets its own PM identifier = 1 and IPL count = 2 in PM management table 6, as shown in the PM management table at the right of the figure.
- Exclusion acquisition: the re-IPLed PM-1 issues an exclusion acquisition request and acquires the exclusion by setting exclusive field 3 as follows. State of the exclusive field: PM identifier = 1, IPL count = 2.
- The shared memory is accessed.

(4) Operation of PM-3 (recovery operation):
- On receiving the notification of (2), PM-3 recognizes that the PM with PM identifier = 1 and IPL count = 1 has crashed.
- PM-3 requests acquisition of the exclusion for crash recovery. cs instruction issue:
  • PM identifier = 3
  • IPL count = 1
  • Exclusive field address = #1
  • Crashed PM identifier = 1
  • Crashed IPL count = 1
- Because the IPL counts differ, the exclusion cannot be acquired and recovery is unnecessary. The crashed PM-1 issued an exclusion acquisition request after its re-IPL and set PM identifier = 1 and IPL count = 2 in exclusive field 3, so it already holds the exclusion; the IPL count = 1 received in the crash notification therefore does not match, and PM-3 determines that recovery processing is not needed.

As described above, when PM-1 crashes after setting PM identifier = 0 and IPL count = 0 in exclusive field 3, PM-2 detects the crash of PM-1 and re-IPLs PM-1, and the re-IPLed PM-1 can acquire the exclusion with PM identifier = 1 and IPL count = 2 and access the shared memory. PM-3 attempts recovery processing but finds it unnecessary, and so ends without performing any recovery processing.

As described above, according to the present invention,
exclusive field 3, exclusion wait management field 4, and new/old management field 5 are provided in shared memory 1 to manage the exclusion, the PMs waiting for the exclusion, and the new and old version areas that store the data, and PM management table 6, which holds the PM identifier, IPL count, and time, is provided to detect a crashed PM automatically and perform recovery. With this configuration, the exclusion of shared memory 1, the order of waiting for the exclusion, the automatic detection of a crashed PM, and the restoration of shared memory 1 can be managed and executed. As a result:

(1) Each PM sets the current time, at predetermined time intervals, in the time entry for its own PM identifier in PM management table 6. When the difference between the time of another PM identifier and the current time is equal to or greater than the threshold, the PM with that identifier is detected as crashed, its PM identifier and IPL count are notified to all PMs so that recovery processing is performed, and the crashed PM is re-IPLed. Recovery and restart can thus be performed automatically.

(2) During recovery processing, an exclusion acquisition notification is sent to the next PM waiting for the exclusion, so that access to the shared memory resumes automatically.

(3) When restoring the contents of the shared memory: new version area 7 and old version area 8 are used by switching the pointer each time the exclusion is acquired, and when a PM crashes, the pointer is switched to point to old version area 8, so the data can be restored easily.
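
The pointer switching behind the restoration of the shared memory contents can be sketched as follows. This is a minimal Python illustration under assumptions: the class `VersionedArea` and its methods are hypothetical, the two areas are ordinary dictionaries rather than shared memory, and the copy is shallow, which is sufficient only for flat data.

```python
class VersionedArea:
    """Two data areas plus a pointer, loosely modelling areas 7 and 8 and field 5."""

    def __init__(self, initial):
        self.areas = [dict(initial), {}]
        self.pointer = 0  # index of the area the new/old management field points at

    def begin_access(self):
        """After acquiring the exclusion: copy the pointed-to area and point at the copy."""
        old, new = self.pointer, 1 - self.pointer
        self.areas[new] = dict(self.areas[old])  # old version -> new version
        self.pointer = new
        return self.areas[new]                   # the holder works on the new version

    def roll_back(self):
        """Recovery after a crash during an access: point back at the copy source."""
        self.pointer = 1 - self.pointer

    def read(self):
        """Return the area currently pointed at."""
        return self.areas[self.pointer]
```

Because every access works on a fresh copy, a PM that crashes halfway through an update leaves the copy source untouched, and `roll_back` only has to flip the pointer. `roll_back` is meaningful only while an access started by `begin_access` is still outstanding.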

BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a principle configuration diagram of the present invention.
FIG. 2 is an explanatory diagram (part 1) of the shared memory of the present invention.
FIG. 3 is an explanatory diagram (part 2) of the shared memory of the present invention.
FIG. 4 is an operation flowchart at the time of IPL of the present invention.
FIG. 5 is an operation flowchart in the normal state of the present invention.
FIG. 6 is a PM monitoring flowchart of the present invention.
FIG. 7 is a flowchart of the recovery processing of the present invention.
FIG. 8 is an operation explanatory diagram (part 1) of the present invention.
FIG. 9 is an operation explanatory diagram (part 2) of the present invention.
FIG. 10 is an operation explanatory diagram (part 3) of the present invention.
FIG. 11 is an explanatory diagram of the prior art.

Description of Reference Numerals
1: Shared memory
2: Exclusion management table
3: Exclusive field
4: Exclusion wait management field
41: Exclusion wait counter
5: New/old management field
6: PM management table
7: New version area
8: Old version area
11: Processor (PM)
12: Kernel
13: Kernel thread
14: Exclusion control means
15: Recovery means

Continuation of the front page
(56) References: JP-A-4-343159 (JP, A); JP-A-4-23160 (JP, A); JP-A-63-225851 (JP, A)
(58) Fields searched (Int. Cl.7, DB name): G06F 15/16 - 15/177

Claims

(57) [Claim 1] A shared memory access method in which a plurality of processors access a shared memory, wherein

the shared memory comprises:
an exclusive field in which the PM identifier and IPL count of a processor are set to acquire exclusion;
an exclusion wait management field that manages the exclusion acquisition wait state in association with the PM identifier of a processor;
a new/old management field in which is set a pointer to either a new version area or an old version area, the areas being exchanged each time a processor acquires exclusion; and
a PM management table that manages the PM identifier, IPL count, and time of each processor in association with one another; and

each of the plurality of processors comprises:
means for, when accessing the shared memory, setting its own PM identifier and IPL count in the exclusive field to acquire exclusion when the exclusive field is in the non-exclusive state;
means for setting an exclusion acquisition wait state in the exclusion wait management field in association with its own PM identifier when the exclusive field is in the exclusive state;
means for, when the means for acquiring exclusion has acquired exclusion, taking the contents out of the old version area pointed to by the pointer of the new/old management field, copying them to the new version area, and setting a pointer to the new version area in the new/old management field;
means for setting the current time, at predetermined time intervals, in the time for its own PM identifier in the PM management table;
means for comparing the current time with the time of the PM identifier of another processor, set by that other processor, and determining, when the difference is equal to or greater than a threshold, that the processor with that PM identifier has crashed;
means for notifying all processors of the PM identifier and IPL count of the crashed processor;
means for instructing the crashed processor to re-IPL;
means for, when the own processor has crashed and receives the re-IPL instruction from another processor, incrementing by 1, after the re-IPL, the IPL count for its own PM identifier in the PM management table;
recovery means for, when the own processor is notified by another processor of information on a processor determined to have crashed, comparing the notified PM identifier and IPL count with the PM identifier and IPL count set in the exclusive field and, when they are determined to be equal, setting the PM identifier and IPL count of the own processor in the exclusive field to acquire exclusion of the shared memory area used by the crashed processor, and performing restoration processing of the shared memory by switching the pointer of the new/old management field from the new version area to the old version area;
means for setting the PM identifier and IPL count of the exclusive field to 0 after the recovery means has performed the restoration processing; and
means for transmitting an exclusion-acquirable notification to the processor having the next-order PM identifier in the exclusion wait management field.