JP2015146063A

JP2015146063A - Method of handling and processing memory leak and abnormal end of management process

Info

Publication number: JP2015146063A
Application number: JP2014017650A
Authority: JP
Inventors: 茉莉住谷; Mari Sumikya
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2014-01-31
Filing date: 2014-01-31
Publication date: 2015-08-13

Abstract

PROBLEM TO BE SOLVED: To keep a management process, which monitors and instructs lower-level processes, always in operation even though the management process itself may suffer a memory leak or abnormal termination. SOLUTION: The management process is duplicated: an active management process is started, and it then starts a standby management process that waits in a state of not accepting processing. When the active management process reaches the end of its lifetime or fails, it hands over its data and processing to the standby management process and terminates. The standby management process, now the active management process as a result of the handover, starts a new standby management process that waits in a state of not accepting processing.

Description

The present invention relates to a method for handling memory leaks and abnormal termination of processes, and more particularly to a method for handling memory leaks and abnormal termination of a management process that manages the processes that execute processing.

A common way to deal with a memory leak is to stop or restart the process. Patent Document 1 discloses one technique for stopping and restarting a process for this purpose. In that technique, a processing process holds a counter that increases with the processing performed or the data handled, and the process stops itself when the counter exceeds a reference value. The processing process is also managed by a management process and can be restarted on an instruction from the management process, which likewise addresses memory leaks. The management process issues the restart instruction to the processing process either when notified by an operator or periodically.

Alongside memory leaks, process failures and abnormal termination are a concern. Process redundancy is a widely known countermeasure against abnormal termination: an active process and a standby process exist, and when the active process fails, the standby process takes over its processing so that the system does not stop. To carry out the handover from the active process to the standby process, a higher-level process that manages the active process is again provided.

Patent Document 2 describes an active processing process, a standby processing process, and a management process that monitors the active processing process. The management process detects abnormal termination of the active processing process and requests the standby processing process to take over from it. The management process is also responsible for starting a new standby processing process, and it manages the system so that a standby processing process always exists for the active processing process.

Patent Document 1: Japanese Patent Laid-Open No. 2011-54114
Patent Document 2: Japanese Patent Laid-Open No. H11-85556

In both of the above techniques, a management process is created as a higher-level process that manages the processing processes in order to handle their memory leaks and abnormal termination, and restarts and handovers are performed on instructions from this management process. Because the management process is responsible for monitoring and instructing the lower-level processes, the lower-level processing processes are affected if the management process becomes inoperable. The management process is therefore required to be in operation at all times, and the conventional techniques simply assume that it is.

In practice, however, the management process itself may leak memory or terminate abnormally, and it is undesirable to leave failures of the management process without any countermeasure. Handling that anticipates memory leaks and abnormal termination of the management process is therefore needed.

The present invention is configured as follows.
It is a method for preventing service stoppage caused by a memory leak or abnormal termination of a process, in which:
an active process having a lifetime starts, immediately after its own startup, a standby process identical to itself in a state of not accepting processing;
when the active process reaches the end of its lifetime or fails, the active process hands over its processing and data to the standby process immediately before the active process itself terminates; and
the standby process, having become the active process through the handover, starts a new standby process in a state of not accepting processing,
so that memory leaks and abnormal termination of processes are handled and service stoppage is prevented.

According to the present invention, the management process is given a lifetime and is duplicated into an active management process and a standby management process. When the active management process reaches the end of its lifetime, its processing is handed over to the standby management process, which avoids memory leaks in the management process. In addition, when the active management process fails, its processing is handed over to the standby management process, so that a management process is always present in the system and the system does not stop for lack of one.

Keeping a standby management process waiting alongside the active management process minimizes the time needed to hand over the management process. Because the data needed for processing is also handed over, the running lower-level processes managed by the management process are little affected. Furthermore, keeping the standby management process waiting in a state of not accepting processing keeps its memory usage as low as possible.

FIG. 1 is a processing flow showing the processing of the active management process.
FIG. 2 is a use case diagram showing the relationship between the print system and its users (print requesters).
FIG. 3 is a hardware configuration diagram of the server PC on which the print data conversion service runs.
FIG. 4 is a block diagram illustrating the software that runs on the server PC.
FIG. 5 is a diagram showing the print data conversion service and the components on which it depends.
FIG. 6 is a schematic diagram showing the structure of the Service Catalog.
FIG. 7 is a schematic diagram showing the structure of the Service Inventory.
FIG. 8 is a processing flow showing an outline of the processing of the active management process.
FIG. 9 is a processing flow showing the handover processing of the management process.
FIG. 10 is a processing flow showing the processing of the standby management process.

This embodiment describes a print data conversion service on a server to which the present invention is applied. The configuration of the print data conversion service is described first.

FIG. 2 shows the configuration of the print data conversion service and its requesters. A requester (50) who wants print data converted uses a device such as a PC (20), a tablet PC (30), or a mobile terminal (40), and operates an application running on that device to request conversion from the print data conversion service (2100) on the network (10). The print data conversion service (2100) is a program that runs on a server PC (80) as a Web service exposing a URL. A Web server (4100) that forwards requests arriving at that URL to the print data conversion service (2100) also runs on the server PC (80). Because the number of requests arriving at the URL is usually large, they are processed by multiple server PCs (80). A Load Balancer (70) is therefore placed in front of the server PCs (80) so that requests are distributed evenly among them.

The print data conversion service (2100) converts data specified by the requester (50) into a format that the printer (60) can print. The data specified by the requester (50) comes in formats created by various application programs, such as documents and image files. The printer (60), in contrast, accepts print data written in a PDL (Page Description Language) and data written in PJL (Print Job Language), which carries commands and parameters for controlling printing. A printer driver bridges this difference in data format between the requester (50) and the printer (60), and the print data conversion service (2100) performs the printer driver's processing on the Internet. Running on the server PC (80), the print data conversion service (2100) receives format conversion requests from the requester's (50) terminals described above, and sends the resulting converted data to the printer (60) either directly or via one of those terminals.

FIG. 3 is a hardware configuration diagram of the server PC (80) that implements the print data conversion service (2100). The print data conversion service (2100) is an application program that runs on an OS (operating system) and is stored in the HDD (1002) or the ROM (1003). The CPU (1005) reads the OS and the application program from the HDD (1002) or the ROM (1003), loads them into the RAM (1004), and executes them to carry out the various kinds of processing. Processing results are stored as files on the HDD (1002) or held as data in the RAM (1004). The application program obtains user input and readings from various sensors through an input device (1007) connected to the computer. It outputs information to an output device (1006) to display processing results, and it communicates with other computers and devices on the network through a communication device (1008). These hardware components are connected to one another by a bus (1001) and can be operated from the application program.

FIG. 4 is a block diagram showing how the software, including the print data conversion service (2100), is organized on the server PC (80). The Web server (4100) receives requests sent over HTTP from the application on the PC (20) operated by the requester (50). The Load Balancer (70) sits in front of the Web servers (4100) and distributes requests among the Web servers (4100) running on the multiple server PCs (80). Various distribution schemes are known, but their details are not relevant to the present invention and are omitted.

The application server (4200) is software that analyzes the URL specified as the destination of a request and determines the service associated with it, here the print data conversion service (2100). This embodiment assumes that the print data conversion service (2100) is one of several kinds of services.

The print data conversion service (2100) is not a single process; it provides its service through multiple processes working together. The Locator (2500) and the Logger (2600) are management processes that exist independently of the print data conversion service (2100). To execute multiple requests in parallel, the application server (4200) starts multiple instances of the print data conversion service (2100) and distributes requests among them. Even then, the Locator (2500) and the Logger (2600) each exist as a single process. The Logger (2600) is not relevant to the present invention and is not described further.

(Print data conversion service)
FIG. 5 illustrates the relationships among the software components involved in the operation of the print data conversion service (2100). As described above, the print data conversion service (2100) operates as a group of cooperating processes. Its constituent elements are the Print Service Gateway (2110), the Proxy (2200), the Filter Host (2400), and the various Filters (2410, 2411). There are many Filters, and the combination in which they are used is not fixed.

(Print Service Gateway)
Service requests to the print data conversion service (2100) are accepted by the Print Service Gateway (2110), which exists as a process and is started by the application server (4200). The Print Service Gateway (2110) loads the Proxy (2200), and the Proxy (2200) asks the Job Controller (2300), described later, to execute the accepted conversion and receives the result.

(Locator)
The Locator (2500), which is used by the components that make up the print data conversion service (2100), is a management process with a special role. It is a special process that is already running by the time the application server (4200) starts. How the Locator (2500) is started depends on the operating system (OS) running on the server PC (80). On Windows (registered trademark), for example, it can be implemented as a "Windows Service", a special kind of process that is started automatically at system startup; on Linux (registered trademark) or Unix (registered trademark) it can run as a daemon process.

The Locator (2500) is responsible for creating components. It also returns the access point of a component when another component asks for it. An access point here means a TCP/IP listening port. A component obtains the access point published by another component and accesses it to use the functions that component provides. The act of a component asking the Locator (2500) for another component's access point is called a "query". Queries are described in detail later.

(Job Controller)
When asked through the Proxy (2200) to execute a data conversion job, the Job Controller (2300) analyzes the job and the print data it contains, selects the Filters required for the conversion, and obtains the Filter Hosts (2400) that have loaded those Filters. Which Filters to combine and the order in which they convert are built into the Job Controller (2300) as fixed knowledge.

The Job Controller (2300) creates a data transfer channel called a Pipeline (3000) connecting itself, each Filter (2410, 2411), and the Proxy (2200). With the Pipeline (3000) created as in FIG. 5, print data leaves the Proxy (2200), circulates through the components, and finally returns to the Proxy (2200) in converted form.
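As a rough illustration of this data flow, the following Python sketch chains conversion stages so that data entering at one end comes back converted. It is only a sketch under stated assumptions: plain function calls stand in for the separate processes and the Pipeline (3000) channel, and `build_pipeline`, `normalize`, and `wrap_job` are hypothetical names that do not appear in the source.

```python
from typing import Callable, Iterable

# A Filter is modelled as a function from bytes to bytes (an assumption;
# the source describes Filters as library modules loaded by a Filter Host).
Filter = Callable[[bytes], bytes]


def build_pipeline(filters: Iterable[Filter]) -> Filter:
    """Return a function that passes data through each filter in order."""
    stages = list(filters)

    def run(data: bytes) -> bytes:
        for stage in stages:
            data = stage(data)
        return data

    return run


# Hypothetical placeholder filters, not real conversion logic.
def normalize(data: bytes) -> bytes:
    return data.strip()


def wrap_job(data: bytes) -> bytes:
    return b"<job>" + data + b"</job>"


# The caller plays the role of the Proxy: it sends data in and gets it back.
pipeline = build_pipeline([normalize, wrap_job])
converted = pipeline(b"  page data \n")
```

The order of the list passed to `build_pipeline` corresponds to the fixed conversion order that the Job Controller is said to hold.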

(Filter, Filter Host)
A Filter is provided as a library module that implements only data conversion. The Filter Host (2400) exists as a process and loads the Filter specified at run time. The Filter Host (2400) handles tasks such as connecting to the Control Bus (3100), sending and receiving messages, and communicating with the Job Controller (2300).

The following explains why and how the components that make up the print data conversion service (2100) query for other components.

(Significance of queries)
The service a component provides is exposed to other components as an endpoint reachable over the network; specifically, a TCP/IP address and port number pair is published. A component obtains and uses the service I/F provided by another component. Multiple kinds of components exist as processes, and multiple instances of the same component also exist on the same PC. The port number of the service I/F published by each component therefore cannot be fixed and has to be generated dynamically. A dynamically generated endpoint cannot be referenced by others without some management mechanism, and the Locator (2500) is in charge of this management. Components also need a means of obtaining the dynamically generated endpoints of other components. That means is the query, and answering queries is the role of the Locator (2500).

(Component generation)
The Locator (2500) holds the Service Catalog (5001) shown in FIG. 6. The Service Catalog (5001) is a list of the components installed on the server PC (80). For each component it stores the name, version, executable file path, startup configuration file path, component lifetime, and start mode.

After it starts, the Locator (2500) refers to the Service Catalog (5001) and starts the components whose start mode is set to hot standby. The other defined start mode is on-demand. A component with the on-demand start mode is started only when a query specifying that component is made.
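The catalog fields and the two start modes can be pictured with a short Python sketch. This is an illustration only, assuming an in-memory list rather than whatever storage the Locator actually uses; the class names, the example component names and paths, and the use of seconds for the lifetime are all assumptions introduced here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List


class StartMode(Enum):
    HOT_STANDBY = "hot-standby"   # started right after the Locator starts
    ON_DEMAND = "on-demand"       # started on the first query for it


@dataclass(frozen=True)
class CatalogEntry:
    name: str
    version: str
    executable_path: str
    config_path: str
    lifetime_seconds: int         # unit is an assumption
    start_mode: StartMode


def components_to_start_at_boot(catalog: Iterable[CatalogEntry]) -> List[CatalogEntry]:
    """Select the entries the Locator would start immediately."""
    return [e for e in catalog if e.start_mode is StartMode.HOT_STANDBY]


# Hypothetical catalog contents for illustration.
catalog = [
    CatalogEntry("JobController", "1.0", "/opt/svc/job_controller",
                 "/opt/svc/job_controller.conf", 3600, StartMode.HOT_STANDBY),
    CatalogEntry("FilterHost", "1.0", "/opt/svc/filter_host",
                 "/opt/svc/filter_host.conf", 1800, StartMode.ON_DEMAND),
]
boot_names = [e.name for e in components_to_start_at_boot(catalog)]
```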

When a component has finished starting and is ready to provide its service to other components, it creates its own Advertise Message and sends it over the Control Bus (3100) to the Locator (2500). The Advertise Message contains the component's name, ID, version, service I/F URL, port number, and lifetime. Receiving this message lets the Locator (2500) confirm that the component has started and obtain its service I/F. On receiving an Advertise Message from a component, the Locator (2500) creates an entry for the sending component in its management table, the Service Inventory (5002), and registers the component by copying the message contents into that entry.

(Component management)
Each time the Locator (2500) receives a message from a component, it updates the corresponding entry in the Service Inventory (5002) and thereby manages the component. FIG. 7 schematically shows the contents of the Service Inventory (5002). The "time alive" column records the time (in seconds) since the component was created, and the "state" column records a value indicating the component's current state. These are updated according to the messages sent by each component and their contents. The "last contact time" column records the time at which the Locator (2500) received the most recent message from the component.

A component that has reached the end of its lifetime sends a Goodbye Message to the Locator (2500). On receiving the Goodbye Message, the Locator (2500) deletes that component's entry from the Service Inventory (5002).
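The registration, update, and deletion of Service Inventory entries described above can be sketched in Python as follows. This is a minimal model under stated assumptions: messages are plain dictionaries, the key names (`"id"`, `"url"`, and so on), the state strings, and the method names are hypothetical, and the message transport over the Control Bus is left out.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class InventoryEntry:
    name: str
    component_id: str
    version: str
    url: str
    port: int
    lifetime: int
    state: str = "idle"        # state values are assumptions
    last_contact: float = 0.0  # time the last message was received


class ServiceInventory:
    def __init__(self) -> None:
        self._entries: Dict[str, InventoryEntry] = {}

    def on_advertise(self, msg: dict, now: float) -> None:
        """Create an entry from an Advertise Message."""
        entry = InventoryEntry(
            name=msg["name"], component_id=msg["id"], version=msg["version"],
            url=msg["url"], port=msg["port"], lifetime=msg["lifetime"],
            last_contact=now,
        )
        self._entries[entry.component_id] = entry

    def on_status(self, component_id: str, state: str, now: float) -> None:
        """Update state and last contact time on any later message."""
        entry = self._entries[component_id]
        entry.state = state
        entry.last_contact = now

    def on_goodbye(self, component_id: str) -> None:
        """Delete the entry when a Goodbye Message arrives."""
        self._entries.pop(component_id, None)

    def get(self, component_id: str) -> Optional[InventoryEntry]:
        return self._entries.get(component_id)


inventory = ServiceInventory()
inventory.on_advertise(
    {"name": "FilterHost", "id": "fh-1", "version": "1.0",
     "url": "http://127.0.0.1", "port": 50123, "lifetime": 3600},
    now=100.0,
)
inventory.on_status("fh-1", "busy", now=105.0)
state_after_update = inventory.get("fh-1").state
inventory.on_goodbye("fh-1")
removed = inventory.get("fh-1") is None
```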

A component must go through the Locator (2500) to use another component. For example, when component A wants to use component B, component A first queries the Locator (2500). If component B has not yet been created at that point, the Locator (2500) creates it. If the "state" of component B recorded in the Service Inventory (5002) is not busy, the Locator (2500) selects that instance of component B and returns a response message to component A's query. The response message contains component B's name and version and the URL and port number of its service I/F. Once component A receives the response message, it can use component B's service I/F.
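The query flow for components A and B can be sketched in Python as below. It is a simplified illustration, not the Locator's actual logic: `spawn` is a hypothetical callback that returns an instance record directly (in the described system the Locator would learn the endpoint from the new component's Advertise Message), and returning `None` when every instance is busy is an assumption, since the source does not say what happens in that case.

```python
from typing import Callable, Dict, List, Optional


class Locator:
    def __init__(self, spawn: Callable[[str], dict]) -> None:
        self._spawn = spawn                          # hypothetical component starter
        self._inventory: Dict[str, List[dict]] = {}  # name -> instance records

    def query(self, name: str) -> Optional[dict]:
        """Return the endpoint of a non-busy instance of the named component."""
        instances = self._inventory.setdefault(name, [])
        if not instances:
            # The component has not been created yet: create it now.
            instances.append(self._spawn(name))
        for inst in instances:
            if inst["state"] != "busy":
                # Response message: name, version, service I/F URL and port.
                return {k: inst[k] for k in ("name", "version", "url", "port")}
        return None  # assumption: behaviour when all instances are busy


def fake_spawn(name: str) -> dict:
    # Stand-in for starting a process; values are illustrative only.
    return {"name": name, "version": "1.0",
            "url": "http://127.0.0.1", "port": 50200, "state": "idle"}


locator = Locator(fake_spawn)
response = locator.query("ComponentB")
```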

This concludes the description of the configuration of the print data conversion service.

Next, the Locator (2500), which is the distinguishing feature of the present invention, is described.

The Locator (2500) is duplicated: one instance exists as the active process and the other as the standby process. They are referred to below as the active Locator (2510) and the standby Locator (2520). Each Locator (2510, 2520) has a lifetime set according to its running time or the processing it has performed, and the active Locator (2510) carries out processing such as component generation and management until it reaches the end of that lifetime. When the active Locator (2510) reaches the end of its lifetime or a failure occurs, it executes the processing that hands the process over to the standby Locator (2520).

(Generation of the standby Locator)
FIG. 1 shows the flow of the processing (S100) of the active Locator (2510). Immediately after starting, the active Locator (2510) sets its own lifetime (S101). The lifetime of the active Locator (2510) is counted based on the time since it was started and the number of jobs it has processed. After setting the lifetime, the active Locator (2510) starts the same program as itself (S102), and this becomes the standby Locator (2520). Unlike other components, the standby Locator (2520) does not send an Advertise Message to the active Locator (2510) after it starts. By not sending an Advertise Message, the standby Locator (2520) stays in a state in which it does not accept queries from other components. This minimizes the memory the standby Locator (2520) uses while waiting. Once it has created the standby Locator (2520), the active Locator (2510) enters its processing event loop (S200).
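Steps S101 and S102 can be sketched in Python as follows. The source says only that the lifetime depends on elapsed time and on the number of jobs processed; treating the lifetime as expired when either threshold is reached is an assumption, as are the names `Lifetime` and `spawn_standby` and the `--standby` command-line flag.

```python
import os
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Lifetime:
    max_age_seconds: float
    max_jobs: int
    started_at: float
    jobs_done: int = 0

    def record_job(self) -> None:
        self.jobs_done += 1

    def expired(self, now: float) -> bool:
        # Assumption: either limit being reached ends the lifetime.
        too_old = now - self.started_at >= self.max_age_seconds
        too_many = self.jobs_done >= self.max_jobs
        return too_old or too_many


def spawn_standby() -> subprocess.Popen:
    """Start another copy of the running script as the standby (S102).

    The "--standby" flag is hypothetical; the copy would use it to wait
    without advertising itself or accepting queries.
    """
    script = os.path.abspath(sys.argv[0])
    return subprocess.Popen([sys.executable, script, "--standby"])


# S101: set the lifetime right after startup (values are illustrative).
life = Lifetime(max_age_seconds=3600, max_jobs=2, started_at=0.0)
expired_at_start = life.expired(now=10.0)
life.record_job()
life.record_job()
expired_after_jobs = life.expired(now=20.0)
expired_by_age = Lifetime(3600, 100, started_at=0.0).expired(now=3600.0)
```

`spawn_standby` is defined but not called here, since calling it would launch a second process.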

(Locator handover)
FIG. 8 shows the event loop (S200) of the active Locator (2510). When the active Locator (2510) reaches the end of its lifetime (S202) or a failure occurs in it (S201), it hands over its processing to the standby Locator (2520) (S300) and terminates its process. The handover (S300) is executed by the active Locator (2510) itself, and the same handover program (S300) is called and executed in both cases, end of lifetime (S202) and failure (S201).

The active Locator (2510) checks its own lifetime periodically and executes the handover (S300) when the lifetime has been reached (S202). If the active Locator (2510) is in the middle of processing, such as passing a message, when the lifetime is reached, it performs the handover (S300) after that processing has finished. If a failure occurs in the active Locator (2510) (S201), the handover (S300) is executed before the active Locator (2510) is terminated. One way to run the handover on failure is for the active Locator (2510) to register its own handover routine in advance with the handler that runs when a process fails. The handover (S300) of the Locator (2500) is executed by the active Locator (2510) itself; the failure handler serves only to give the signal to run the program that the active Locator (2510) designated.
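The control flow of the event loop, with one handover routine shared by the end-of-lifetime and failure paths, can be sketched in Python as below. This is an illustration under assumptions: a `try`/`except` block stands in for the failure handler, so only in-language exceptions are covered (a crash that kills the interpreter would need an OS-level mechanism), and all function and parameter names are hypothetical.

```python
from typing import Callable, Iterable


def run_event_loop(
    events: Iterable[object],
    handle: Callable[[object], None],
    lifetime_expired: Callable[[], bool],
    take_over: Callable[[str], None],
) -> str:
    """Process events until the lifetime ends or a failure occurs."""
    reason = None
    try:
        for event in events:
            handle(event)           # finish the work in progress first
            if lifetime_expired():  # S202: checked between events
                reason = "lifetime"
                break
    except Exception:               # S201: failure inside the loop
        reason = "failure"
    if reason is not None:
        take_over(reason)           # S300: the same routine in both cases
    return reason or "idle"


calls = []


def failing_handle(event: object) -> None:
    if event == "bad":
        raise RuntimeError("simulated failure")


# Failure path: the third event is never processed.
outcome_failure = run_event_loop(
    ["ok", "bad", "never"], failing_handle, lambda: False, calls.append)

# Lifetime path: the lifetime ends after two processed jobs.
jobs = []
outcome_lifetime = run_event_loop(
    ["a", "b", "c"], jobs.append, lambda: len(jobs) >= 2, calls.append)
```

Calling `take_over` from a single place after the loop mirrors the statement that the same handover program runs for both triggers.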

FIG. 9 and FIG. 10 show, respectively, the flow of the handover (S300) from the active Locator (2510) to the standby Locator (2520) and the flow of the processing (S400) of the standby Locator (2520). In the handover (S300), the paths at which the information held by the active Locator (2510) is stored are first passed to the standby Locator (2520) (S301, S402). The information held by the active Locator (2510) consists of the Service Catalog (5001) and the Service Inventory (5002) described above. The active Locator (2510) sends the standby Locator (2520) a Wakeup Message containing the paths of the Service Catalog (5001) and the Service Inventory (5002).

Until it receives a Wakeup Message, the standby Locator (2520) does not process any other message it receives and simply waits for the Wakeup Message (S401). Because the standby Locator (2520) receives the information held by the active Locator (2510), the print data conversion service (2100) can continue processing with the components that are already running, without interrupting conversions in progress. Next, the Token held by the active Locator (2510) is passed to the standby Locator (2520) (S302, S403). The Token prevents multiple active Locators (2510) from running at the same time; the Locator that holds it is authorized to run as the active Locator (2510) of the print data conversion service (2100). After passing the Token to the standby Locator (2520), the active Locator (2510) terminates its own process. The standby Locator (2520) that received the Token in S403 becomes the new active Locator: it sets its lifetime as an active Locator (S101) and creates a standby Locator to succeed it (S102).
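The message exchange in S301 to S302 and S401 to S403 can be sketched in Python as follows. This is an in-process model under stated assumptions: a `queue.Queue` stands in for the channel between the two Locator processes, messages are dictionaries with hypothetical keys, the file paths and token value are invented for illustration, and ignoring non-Token messages between Wakeup and Token is an assumption.

```python
import queue


def hand_over(inbox: queue.Queue, catalog_path: str,
              inventory_path: str, token: str) -> None:
    """Active side: send the Wakeup Message, then the Token."""
    inbox.put({"type": "wakeup", "catalog_path": catalog_path,
               "inventory_path": inventory_path})   # S301
    inbox.put({"type": "token", "token": token})    # S302
    # The active Locator would terminate its own process after this.


def standby_wait(inbox: queue.Queue) -> dict:
    """Standby side: ignore everything until Wakeup, then take the Token."""
    paths = None
    while True:
        msg = inbox.get()
        if paths is None:
            if msg["type"] == "wakeup":             # S402
                paths = (msg["catalog_path"], msg["inventory_path"])
            # Any other message before Wakeup is dropped (S401).
        elif msg["type"] == "token":                # S403
            return {"catalog_path": paths[0],
                    "inventory_path": paths[1],
                    "token": msg["token"]}


inbox = queue.Queue()
# A stray query arriving before the Wakeup Message is ignored.
inbox.put({"type": "query", "name": "ComponentB"})
hand_over(inbox, "/var/locator/catalog.json",
          "/var/locator/inventory.json", token="active-token")
state = standby_wait(inbox)
```

After `standby_wait` returns, the former standby would proceed as a new active Locator, setting its lifetime (S101) and starting its own standby (S102).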

2100: Print data conversion service
2510: Active Locator
2520: Standby Locator
5001: Service Catalog
5002: Service Inventory

Claims

A method for preventing service stoppage caused by a memory leak or abnormal termination of a process, wherein:
an active process having a lifetime (S101) starts, immediately after its own startup, a standby process identical to itself in a state of not accepting processing (S102);
when the active process reaches the end of its lifetime or fails, the active process hands over its processing and data to the standby process immediately before the active process itself terminates (S300); and
the standby process, having become the active process through the handover, starts a new standby process in a state of not accepting processing (S102),
whereby memory leaks and abnormal termination of processes are handled and service stoppage is prevented.