JP6743358B2

JP6743358B2 - Information processing system and information processing method

Info

Publication number: JP6743358B2
Application number: JP2015186956A
Authority: JP
Inventors: 直也堀口
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2015-09-24
Filing date: 2015-09-24
Publication date: 2020-08-19
Anticipated expiration: 2035-09-24
Also published as: JP2017062597A

Description

The present invention relates to an information processing system and an information processing method.

Caching is widely used in general information processing systems. Caching shortens the average access time by holding data that is slow to access in a storage device that can be accessed quickly. Techniques such as read-ahead (prefetching) and direct access are used to improve cache utilization.

Read-ahead is considered effective when a file is accessed sequentially from its beginning. However, if, for example, the data at the end of the file is already in the cache when sequential access starts, reading the data at the beginning of the file into the cache may cause the data at the end to be evicted from the cache. That is, the data at the end is evicted even though it is known that it will be accessed again later. This lowers the utilization efficiency of the page cache and adversely affects processing performance.
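The eviction problem described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the cache is modeled as a least-recently-used (LRU) cache, which the document does not specify, and the function name `sequential_scan` and its parameters are hypothetical.

```python
from collections import OrderedDict

def sequential_scan(cache_capacity, num_pages, initially_cached):
    """Scan pages 0..num_pages-1 in order through an LRU cache (assumed policy).

    Returns the pages that were in the cache at some point, were evicted
    during the scan, and therefore had to be read from disk again.
    """
    cache = OrderedDict((p, True) for p in initially_cached)
    evicted = set()
    reread = []
    for page in range(num_pages):
        if page in cache:
            cache.move_to_end(page)        # cache hit: mark as recently used
            continue
        if page in evicted:
            reread.append(page)            # was cached earlier, must be re-read
        cache[page] = True                 # cache miss: load the page
        if len(cache) > cache_capacity:
            victim, _ = cache.popitem(last=False)  # evict least recently used
            evicted.add(victim)
    return reread

# The tail pages 6 and 7 are cached at the start, but loading the head of
# the file pushes them out before the scan reaches them.
print(sequential_scan(cache_capacity=4, num_pages=8, initially_cached=[6, 7]))
```

With a cache large enough to hold the whole file, the same call returns an empty list, since nothing is evicted.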

Direct access is a method of bypassing the cache. When direct access is performed on data that is present in the cache, processing is required to keep the data in the cache consistent with the data on the secondary storage device. Therefore, when direct access is performed on data shared with other applications, dirty data already present in the cache must be written back or discarded. As a result, the page cache utilization of the other applications is reduced, and processing performance is adversely affected.

The following patent documents relate to cache technology. Patent Document 1 discloses a technique in which, if cached data exists, the data is transferred from the cache area to a storage area used by an application; if it does not exist, unstored data in the storage device is sequentially transferred to storage blocks in the cache area, and the transfer data included in the unstored data is transferred to a predetermined storage area used by the application. Patent Document 2 discloses a technique of reading from a cache memory if cached data exists and reading from a main memory if it does not. Patent Document 3 discloses a technique in which a flash memory drive on a server is used as a cache and frequently accessed areas are moved to a higher tier in the storage system to improve access performance.

Patent Document 1: JP 2003-122634 A
Patent Document 2: JP 2013-174997 A
Patent Document 3: JP 2013-222457 A

However, the techniques disclosed in Patent Documents 1 to 3 have the following problem. They only concern operation after a cache miss has occurred. Consequently, they cannot improve cache utilization efficiency, and the processing performance of applications cannot be improved.

An object of the present invention is to provide an information processing system and an information processing method that solve the above problem.

An information processing system of the present invention reads processing target data from a plurality of storage media having different read speeds and transfers the data to a host in response to a request from the host. The system includes: an acquisition unit that acquires position information indicating the storage medium in which the processing target data exists; a management unit that manages a table defining a priority order among the plurality of storage media, namely the priority order for reading the processing target data stored in the plurality of storage media; and a processing unit that reads the processing target data based on the position information and the table. The processing target data is data to be searched for a character string. The plurality of storage media include a first storage medium and a second storage medium. When only a part of the processing target data is stored in the first storage medium, which has a faster read speed than the second storage medium, the processing unit preferentially executes the character string search on that part of the processing target data stored in the first storage medium.

An information processing method of the present invention includes: a step of reading processing target data from a plurality of storage media having different read speeds and transferring the data to a host in response to a request from the host; an acquisition step of acquiring position information indicating the storage medium in which the processing target data exists; a management step of managing a table defining a priority order among the plurality of storage media, namely the priority order for reading the processing target data stored in the plurality of storage media; and a processing step of reading the processing target data based on the position information and the table. The processing target data is data to be searched for a character string. In the processing step, when only a part of the processing target data is stored in the first storage medium, which has a faster read speed than the second storage medium, the character string search is preferentially executed on that part of the processing target data stored in the first storage medium.

According to the present invention, cache utilization efficiency can be improved, and the processing performance of applications can be improved.

FIG. 1 is a diagram showing the configuration of the information processing system 1000 according to the first embodiment.
FIG. 2 is a diagram explaining the operation of the information processing system 1000 according to the first embodiment.
FIG. 3 is a diagram showing the configuration of the information processing system 2000 according to the second embodiment.
FIG. 4 is a diagram explaining the operation of the information processing system 2000 according to the second embodiment.
FIG. 5 is a diagram showing the configuration of the information processing system 3000 according to the third embodiment.
FIG. 6 is a diagram explaining the operation of the information processing system 3000 according to the third embodiment.
FIG. 7 is a diagram showing the configuration of the information processing system 4000 according to the fourth embodiment.
FIG. 8 is a diagram explaining the operation of the information processing system 4000 according to the fourth embodiment.

[First Embodiment]
FIG. 1 is a diagram showing the configuration of the information processing system 1000 according to the first embodiment.

The information processing system 1000 reads processing target data from a plurality of storage media having different read speeds and transfers the data to a host (not shown) in response to a request from the host. The information processing system 1000 includes an acquisition unit 10, a management unit 11, and a processing unit 12.

In this embodiment, the plurality of storage media having different read speeds are described as storage media 20 and 21. The storage media 20 and 21 are, for example, a hard disk and a cache memory that has a faster read speed than the hard disk.

The acquisition unit 10 acquires position information indicating the storage medium in which the processing target data exists.

The management unit 11 manages a table that defines a priority order among the storage media 20 and 21, namely the priority order for reading the processing target data stored in the storage media 20 and 21. The priority order is defined, for example, so that processing target data stored in the cache memory, which has the faster read speed, is read first.

The processing unit 12 reads the processing target data based on the position information acquired by the acquisition unit 10 and the table managed by the management unit 11.

FIG. 2 is a flowchart explaining the operation of the information processing system 1000.

The management unit 11 manages a table that defines a priority order among the storage media 20 and 21, namely the priority order for reading the processing target data stored in the storage media 20 and 21 (step S1). The acquisition unit 10 acquires position information indicating the storage medium in which the processing target data exists (step S2). The processing unit 12 reads the processing target data based on the position information acquired by the acquisition unit 10 and the table managed by the management unit 11 (step S3).
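Steps S1 to S3 can be sketched as follows. This is a minimal illustration, not the implementation described in the patent: the table is assumed to map a medium name to a rank (lower rank is read first), the position information is assumed to map a block number to a medium name, and the names `PRIORITY_TABLE` and `plan_reads` are hypothetical.

```python
# S1: priority table managed by the management unit (assumed representation:
# medium name -> rank, where a lower rank means "read first").
PRIORITY_TABLE = {"cache": 0, "hdd": 1}

def plan_reads(position_info, priority_table):
    """S3: order the blocks by the priority of the medium that holds them.

    position_info maps a block number to the name of the medium where the
    block currently resides (the result of S2). Blocks on the same medium
    keep ascending block order.
    """
    return sorted(position_info,
                  key=lambda block: (priority_table[position_info[block]], block))

# S2: position information obtained by the acquisition unit (sample values).
position_info = {0: "hdd", 1: "cache", 2: "hdd", 3: "cache"}

order = plan_reads(position_info, PRIORITY_TABLE)
print(order)  # [1, 3, 0, 2]: cached blocks are read before blocks on the hard disk
```

Adding a further tier only requires another entry in the table, for example a rank between the cache and the hard disk.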

In the information processing system 1000 according to the first embodiment, the management unit 11 manages a table that defines a priority order among the storage media 20 and 21, namely the priority order for reading the processing target data stored in the storage media 20 and 21. The acquisition unit 10 acquires position information indicating the storage medium in which the processing target data exists before the processing target data is read. Because the processing unit 12 reads the processing target data based on the position information acquired by the acquisition unit 10 and the table managed by the management unit 11, the number of cache misses can be reduced. As a result, cache utilization efficiency can be improved, and the processing performance of applications can be improved.

The above description used two storage media, a cache memory and a hard disk, as an example. However, the present invention can also be applied to a configuration of hard disk drive, solid state drive, and dynamic random access memory, which is generally called multi-tier storage.

[Second Embodiment]
Consider a case where data backup processing is executed at the back end of a database system or file server system that includes a business application. Here, a business application is an application that performs part of the work of a company or similar organization by computer. When backup processing is executed at the back end of such a system, the I/O (Input/Output) generated by the backup processing reads a large amount of data into the cache. As a result, data in use by the business application may be pushed out of the cache and discarded, which degrades the performance of the business application. On the other hand, if the backup processing uses direct I/O, which does not go through the cache, the backup processing does not read data into the cache. However, if direct I/O is performed on all data, cached data is written to and read from the disk more often than necessary, so the performance of the business application is again degraded.

The information processing system according to the second embodiment solves the above problem of reduced cache efficiency caused by backup processing.

FIG. 3 is a diagram showing the configuration of the information processing system 2000 according to the second embodiment.

The information processing system 2000 includes a business application 101, a backup application 103, an acquisition unit 104, a management unit 105, a cache memory 106, a storage disk 107, and a backup storage disk 108.

The business application 101 is a business application of the kind described above.

The backup application 103 includes a storage area 109, a temporary storage area 110, and a processing unit 112. The backup application 103 specifies a file 102 and calls the acquisition unit 104 (described later).

The file 102 is the file to be backed up. The file 102 has page offset positions (corresponding to the numbers in the dotted boxes inside the file 102 in FIG. 3). The data of the file 102 is stored in the cache memory 106 (described later) and/or the storage disk 107 (described later). The OS (Operating System) updates the placement of the data of the file 102 in response to accesses from the business application 101. That is, the OS updates whether the data of the file 102 is stored in the cache memory 106 (a page cache exists) or only on the storage disk 107 (no page cache exists).

The storage area 109 is an area for storing page cache existence information. The page cache existence information is, for example, a binary value indicating whether a page cache exists, and is stored as a bitmap. In this case, each bit of the bitmap corresponds to a page offset position.
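A bitmap of this kind could look like the following sketch. The bit layout (bit `p % 8` of byte `p // 8` for page offset `p`) and the helper names `make_bitmap` and `is_cached` are assumptions for illustration; the document only states that each bit corresponds to a page offset position.

```python
def make_bitmap(num_pages, cached_pages):
    """Build a bitmap with one bit per page offset; a set bit means the
    page is present in the page cache."""
    bitmap = bytearray((num_pages + 7) // 8)   # round up to whole bytes
    for page in cached_pages:
        bitmap[page // 8] |= 1 << (page % 8)
    return bitmap

def is_cached(bitmap, page):
    """Return True if the bit for the given page offset is set."""
    return bool(bitmap[page // 8] & (1 << (page % 8)))

bitmap = make_bitmap(num_pages=10, cached_pages=[0, 3, 9])
print([page for page in range(10) if is_cached(bitmap, page)])  # [0, 3, 9]
```

Ten pages fit in two bytes here, which is why a bitmap is a compact way to hand the existence information to the application.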

The temporary storage area 110 is an area for temporarily holding data during backup processing.

The processing unit 112 corresponds to the processing unit 12 in the first embodiment. The processing unit 112 checks the page offset positions (the individual bits) of the page cache existence information (bitmap) held in the storage area 109 to identify the data stored in the cache memory 106. For data stored in the cache memory 106, the processing unit 112 writes the data to the backup storage disk 108 (described later) using buffered I/O. For data not stored in the cache memory 106, it reads the data from the storage disk 107 into the temporary storage area 110 using direct I/O, and then writes the data read into the temporary storage area 110 to the backup storage disk 108 using direct I/O.

The acquisition unit 104 corresponds to the acquisition unit 10 in the first embodiment. The acquisition unit 104 acquires the page cache existence information. The acquisition unit 104 is implemented, for example, as a system call that invokes kernel processing from the business application 101. Inside the kernel, the acquisition unit 104 queries the management unit 105 (described later) about the existence of page caches. The acquisition unit 104 then stores the page offset positions of the data stored in the cache memory 106 in the storage area 109 of the backup application 103. In FIG. 3, gray boxes indicate that data exists.

The management unit 105 corresponds to the management unit 11 in the first embodiment. When the data of the file 102 is stored in the cache memory 106 (a page cache exists), the management unit 105 manages the correspondence between the page offset position of the data and the in-memory address of the page cache.

The cache memory 106 stores the data of the file to be backed up as a page cache. The cache memory 106 is assumed to have a faster read speed than the storage disk 107 (described later).

The storage disk 107 is storage that holds the data of the file to be backed up.

The backup storage disk 108 is storage that holds the backup data.

The operation of the information processing system 2000 according to the second embodiment will be described with reference to FIGS. 3 and 4.

The backup application 103 takes a quiescent point of the business application 101 and starts backup processing (step S201). The backup application 103 requests the acquisition unit 104 to acquire the page cache existence information of the target file 102 (step S202). The acquisition unit 104 acquires the page cache existence information from the management unit 105 and stores it in the storage area 109 of the backup application 103 (step S203). The processing unit 112 of the backup application 103 then checks the page offset positions of the page cache existence information stored in the storage area 109 to identify the data stored in the cache memory 106 (step S204). For data stored in the cache memory 106, the processing unit 112 writes the data to the backup storage disk 108 using buffered I/O (step S205). Next, for data not stored in the cache memory 106, it reads the data from the storage disk 107 into the temporary storage area 110 using direct I/O (step S206). It then writes the data read into the temporary storage area 110 to the backup storage disk 108 using direct I/O (step S207), and the processing ends (step S208).
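The branching in steps S204 to S207 can be sketched as a simulation. No real buffered or direct I/O is performed here: the file and the backup disk are plain Python containers, the I/O kind is only recorded in a log, and the function name `backup` and its parameters are hypothetical.

```python
def backup(file_pages, cached, backup_disk):
    """Simulate steps S204 to S207.

    file_pages:  list of page contents (bytes), indexed by page offset
    cached:      set of page offsets that are present in the page cache
    backup_disk: dict that receives the backed-up pages
    Returns a log of (page offset, I/O kind) in the order processed.
    """
    log = []
    # S204-S205: pages found in the page cache are written out first,
    # using buffered I/O (they are already in memory).
    for page in sorted(cached):
        backup_disk[page] = file_pages[page]
        log.append((page, "buffered"))
    # S206-S207: the remaining pages bypass the page cache. They are read
    # into a temporary area with direct I/O and written out with direct I/O.
    for page in range(len(file_pages)):
        if page not in cached:
            temporary_area = file_pages[page]   # stands in for area 110
            backup_disk[page] = temporary_area
            log.append((page, "direct"))
    return log

pages = [b"a", b"b", b"c", b"d"]
disk = {}
print(backup(pages, cached={1, 3}, backup_disk=disk))
# [(1, 'buffered'), (3, 'buffered'), (0, 'direct'), (2, 'direct')]
```

Every page ends up on the backup disk exactly once, and pages that were not cached never pass through the page cache, so they cannot push out data the business application is using.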

In the information processing system 2000 according to the second embodiment, backup processing performed at the back end of a running system branches between backup methods according to the page cache existence information. This allows backup processing to be performed efficiently without adversely affecting the performance of the business application.

[Modification 1]
In the above description, data for which a page cache exists is backed up first. However, it is also possible to check for the existence of a page cache in order from the beginning of the data of the file 102 and to choose between buffered I/O and direct I/O each time.
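As a rough sketch of this modification, the I/O method can be chosen page by page in file order. The function name `backup_in_file_order` is hypothetical, and the I/O itself is not performed; only the decision is shown.

```python
def backup_in_file_order(num_pages, cached):
    """Walk the file from the first page and pick the I/O method per page:
    buffered I/O if the page is in the page cache, direct I/O otherwise."""
    return [(page, "buffered" if page in cached else "direct")
            for page in range(num_pages)]

print(backup_in_file_order(4, cached={1, 3}))
# [(0, 'direct'), (1, 'buffered'), (2, 'direct'), (3, 'buffered')]
```

Compared with backing up cached pages first, this keeps the writes in file order at the cost of interleaving the two I/O methods.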

[Third Embodiment]
In a simple character string search that does not use an index or the like, the search proceeds in order from the beginning of the file's data. Therefore, whenever the data contains a page that has no page cache, or a page that takes a long time to access, a read must be issued and the search must wait for the I/O to complete, so the search takes an enormous amount of time. This embodiment describes a means of solving this problem.

FIG. 5 is a diagram showing the configuration of the information processing system 3000 according to the third embodiment.

The information processing system 3000 includes a character string search application 301, an acquisition unit 304, a cache memory 306, a storage disk 307, and a management unit 308.

The character string search application 301 includes a data management unit 302 and a search processing execution unit 303. The character string search application 301 uses the acquisition unit 304 to invoke in-kernel processing, which queries the page cache information and performs asynchronous reads of the data of the search target file 305.

The data management unit 302 has storage areas 309, 310, 311, and 312.

The storage area 309 is an area for storing the search target data.

The storage areas 310, 311, and 312 each store an array of data indicating offset positions in units of the file's page size (position information of the data). The position information stored in the storage areas 310, 311, and 312 is referenced when the search processing execution unit 303 executes the search.

The storage area 310 is an area for storing the position information of the data acquired by the acquisition unit 304. Each time the acquisition unit 304 acquires position information of data, the position information is stored in the storage area 310.

The storage area 311 is an area for storing the position information of all data acquired so far. When the acquisition unit 304 acquires position information repeatedly and the storage area 310 is overwritten each time, it is necessary to know which data has been acquired in total. Therefore, the cumulative position information of acquired data is managed in the storage area 311, separately from the storage area 310.

The storage area 312 is an area for storing the position information of already acquired data. The information stored in the storage area 312 is the input used when the data management unit 302 calls the acquisition unit 304. For this purpose, the information in the storage area 311 is copied to the storage area 312.

The search processing execution unit 303 executes the search in response to a search instruction from the data management unit 302.

The acquisition unit 304 acquires the data stored in the cache memory 306 and the position information of that data.

The data of the search target file 305 is stored in the cache memory 306 and/or the storage disk 307.

When the data of the search target file 305 is stored in the cache memory 306, the management unit 308 manages the correspondence between the file offset of the data and its in-memory address in the cache memory 306.

The operation of the information processing system 3000 according to this embodiment will be described with reference to FIGS. 5 and 6.

When the search starts (step S401), the data management unit 302 in the character string search application 301 calls the acquisition unit 304 for the search target file 305 (step S402). Next, as in-kernel processing, the acquisition unit 304 queries the management unit 308 about the existence of data stored in the cache memory 306. The acquisition unit 304 copies the data stored in the cache memory 306 to the storage area 309 and stores the position information of the data copied this time in the array in the storage area 310 (step S403). A NULL value is stored in the array elements corresponding to data that was not copied this time. The position information in the storage area 311 is updated each time the position information of the data copied by the acquisition unit 304 has been stored in the storage area 310. That is, only the position information of the non-NULL array elements in the storage area 310 is copied to the same array elements in the storage area 311 (step S404). The search processing execution unit 303 then searches for the search string using the cumulative position information in the storage area 311 and the acquired data in the storage area 309 (step S405). Next, it checks whether the search produced a hit (step S406), and if so (yes in step S406), it stores the search result (step S407). It then determines whether the search has been completed for all data of the search target file 305, that is, whether position information is stored in every element of the array in the storage area 311 (step S408). If the search has been completed for all data, that is, if position information is stored in every element of the array in the storage area 311 (yes in step S408), the processing ends (step S411). If the search has not been completed for all data (no in step S408), I/O for asynchronously reading the data not yet acquired is issued to the storage disk 307 (step S409). The data not yet acquired is identified from the position information stored in the storage area 311. The position information in the storage area 311 is then copied to the array in the storage area 312 (step S410), and the data and position information are acquired again and the search is repeated. In this loop, time has passed since the previous cycle, so the state of the page cache is expected to have changed as a result of the asynchronous reads. Because the data acquired in earlier cycles is retained in the storage area 309, each search pass runs on the updated data.
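The loop of steps S402 to S410 can be sketched as a simulation. Several points are assumptions made for illustration only: asynchronous reads are modeled as completing before the next cycle, the search is done page by page (so a match that straddles a page boundary would be missed), and the `exists_only` flag, which stops at the first hit, is a hypothetical addition for searches that only need to know whether the string occurs. The function name and parameters are likewise hypothetical.

```python
def search_cached_first(pages, cached, needle, exists_only=False):
    """Search pages that are already cached before waiting for disk reads.

    pages:  list of page contents (str) as stored on disk
    cached: set of page offsets currently in the page cache; it is updated
            in place to simulate asynchronous reads completing
    Returns (sorted page offsets that contain needle, number of cycles).
    """
    n = len(pages)
    data = [None] * n      # storage area 309: copied page data
    total = [None] * n     # storage area 311: cumulative position information
    passed = [None] * n    # storage area 312: input to the next acquisition
    hits = []
    cycles = 0
    while True:
        cycles += 1
        # S403: copy cached pages not acquired before; the rest stay NULL (None).
        latest = [None] * n                     # storage area 310
        for page in cached:
            if passed[page] is None:
                data[page] = pages[page]
                latest[page] = page
        # S404: merge only the non-NULL entries into the cumulative array.
        for page in range(n):
            if latest[page] is not None:
                total[page] = latest[page]
        # S405-S407: search the pages acquired in this cycle, record hits.
        for page in range(n):
            if latest[page] is not None and needle in data[page]:
                hits.append(page)
        if hits and exists_only:
            return sorted(hits), cycles         # no disk I/O needed (assumed mode)
        # S408: finished once every page has position information.
        if all(entry is not None for entry in total):
            return sorted(hits), cycles
        # S409: issue asynchronous reads for the missing pages (simulated as
        # having completed by the time the next cycle starts).
        cached.update(page for page in range(n) if total[page] is None)
        # S410: copy the cumulative positions to the input area.
        passed = list(total)

pages = ["alpha", "beta", "needle here", "gamma"]
print(search_cached_first(pages, {2}, "needle", exists_only=True))  # ([2], 1)
print(search_cached_first(pages, {1}, "a"))                         # ([0, 1, 3], 2)
```

In the first call the cached page already contains the string, so the search returns after one cycle without touching the disk. In the second call the cached page is searched first while the other pages are being read, and they are searched in the second cycle.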

第３の実施形態にかかる情報処理システム３０００では、検索対象ファイルのデータがキャッシュメモリに格納されているか否かを事前に調べ、キャッシュメモリに格納されているデータから優先的に検索する。これにより、検索処理に要する時間を短縮することができる。特に、ある種の文字列検索ではファイルに指定した文字列が少なくとも一つ存在するかどうかが分かれば良いという状況、すなわち、ヒットした文字列の位置や出現回数には関心がない状況が存在する。このような状況では、キャッシュメモリに格納されているデータに検索文字列が含まれていた場合にＩ／Ｏを行う必要がない。そのため、ファイル全行の逐次検索よりも検索処理に要する時間を短縮することができる。 The information processing system 3000 according to the third embodiment checks in advance whether the data of the search target file is stored in the cache memory, and searches the data stored in the cache memory first. As a result, the time required for the search process can be shortened. In particular, for certain kinds of character string search it is enough to know whether the specified character string occurs at least once in the file; that is, the position of the matching character string and the number of occurrences are of no interest. In such a situation, no I/O needs to be performed if the data stored in the cache memory already contains the search character string. Therefore, the time required for the search process can be shortened compared with a sequential search of all lines of the file.
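The existence-only case described above can be illustrated with a short sketch. The function name `exists_in_file`, its parameters, and the returned I/O counter are assumptions introduced for illustration; the embodiment does not prescribe this interface.

```python
from typing import Callable, Dict, Iterable, Tuple


def exists_in_file(needle: str,
                   cached_blocks: Dict[int, str],
                   uncached_indices: Iterable[int],
                   read_from_disk: Callable[[int], str]) -> Tuple[bool, int]:
    """Return (found, number_of_disk_reads) for an existence-only search."""
    # Search the data already held in the cache first; this costs no I/O.
    for block in cached_blocks.values():
        if needle in block:
            return True, 0
    # Fall back to the storage disk only when the cache had no match.
    io_count = 0
    for idx in uncached_indices:
        io_count += 1
        if needle in read_from_disk(idx):
            return True, io_count
    return False, io_count
```

When the cached data already contains the search character string, the function returns without issuing a single disk read, which is the saving the paragraph above describes.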

［第４の実施形態］ [Fourth Embodiment]
本実施形態では、第３の実施形態にかかる情報処理システム３０００を、仮想環境やクラウド環境など、ストレージが多層に渡るシステムに拡張する態様について説明する。 This embodiment describes how the information processing system 3000 according to the third embodiment is extended to a system whose storage spans multiple tiers, such as a virtual environment or a cloud environment.

図７は、本実施形態にかかる情報処理システム４０００の構成を示す図である。情報処理システム４０００は、文字列検索アプリケーション５０１、取得手段５０３、第１層ストレージ５０４、第２層ストレージ５０５、第Ｎ層ストレージ５０６、第１層ストレージのデバイスドライバ５０７、第２層ストレージのデバイスドライバ５０８及び第Ｎ層ストレージのデバイスドライバ５０９を含んで構成される。 FIG. 7 is a diagram showing the configuration of the information processing system 4000 according to the present embodiment. The information processing system 4000 includes a character string search application 501, an acquisition unit 503, a first tier storage 504, a second tier storage 505, an Nth tier storage 506, a device driver 507 for the first tier storage, a device driver 508 for the second tier storage, and a device driver 509 for the Nth tier storage.

ファイル５０２は、検索対象ファイルである。ファイル５０２のデータは、Ｎ層からなるストレージ５０４、５０５及び５０６のうち少なくとも一以上のストレージに格納されている。 The file 502 is a search target file. The data of the file 502 is stored in at least one of the storages 504, 505, and 506 consisting of N layers.

文字列検索アプリケーション５０１は、格納領域５１０、第１層ストレージ上のデータ位置情報格納領域（格納領域）５１１、第２層ストレージ上のデータ位置情報格納領域（格納領域）５１２及び第Ｎ層ストレージ上のデータ位置情報格納領域（格納領域）５１３を備える。 The character string search application 501 includes a storage area 510, a storage area 511 for position information of data on the first tier storage, a storage area 512 for position information of data on the second tier storage, and a storage area 513 for position information of data on the Nth tier storage.

格納領域５１０は、検索対象のデータを格納する領域である。 The storage area 510 is an area for storing search target data.

格納領域５１１、５１２及び５１３は、各層のストレージが格納するデータの位置情報を格納する領域である。 The storage areas 511, 512 and 513 are areas for storing position information of data stored in the storage of each layer.

取得手段５０３は、各層のストレージのデバイスドライバ５０７、５０８及び５０９（後述）を介して検索対象のデータを取得し、文字列検索アプリケーション５０１の格納領域５１１、５１２及び５１３に格納する。 The acquisition unit 503 acquires search target data via the device drivers 507, 508, and 509 (described later) of the storage of each layer, and stores the data in the storage areas 511, 512, and 513 of the character string search application 501.

ストレージ５０４、５０５及び５０６は、検索対象のデータを格納する領域であり、多層に構成されている。典型的な例としては、第１層がページキャッシュ、第２層がストレージキャッシュ、第３層がストレージディスク、という構成がある。システムによっては、ネットワークストレージやクラウドストレージなどを含むことがある。 The storages 504, 505, and 506 are areas for storing data to be searched, and are configured in multiple layers. A typical example is a configuration in which the first layer is a page cache, the second layer is a storage cache, and the third layer is a storage disk. Depending on the system, it may include network storage and cloud storage.

デバイスドライバ５０７、５０８及び５０９は、各層のストレージに対応して設けられ、ＯＳあるいはアプリケーションからストレージにアクセスするためのデバイスドライバである。 The device drivers 507, 508, and 509 are device drivers that are provided corresponding to the storage of each layer and that are used by the OS or applications to access the storage.

本実施形態にかかる情報処理システム４０００の動作について、図７及び図８を用いて説明する。 The operation of the information processing system 4000 according to this embodiment will be described with reference to FIGS. 7 and 8.

文字列検索アプリケーション５０１の処理を開始すると（ステップＳ５０１）、文字列検索アプリケーション５０１は、検索対象ファイル５０２に対して取得手段５０３を呼び出す（ステップＳ５０２）。取得手段５０３は、デバイスドライバ５０７、５０８及び５０９を介して検索対象データの位置情報を取得し、文字列検索アプリケーション５０１の各層のストレージに対応する格納領域５１１、５１２及び５１３に格納する（ステップＳ５０３）。また、第１層ストレージ(例えばページキャッシュ)等データ転送先に近い位置にデータが格納されている場合は、取得したデータを格納領域５１０にコピーする（ステップＳ５０４）。そして、文字列検索アプリケーション５０１の処理を終了する（ステップＳ５０５）。 When the processing of the character string search application 501 starts (step S501), the character string search application 501 calls the acquisition unit 503 for the search target file 502 (step S502). The acquisition unit 503 acquires the position information of the search target data via the device drivers 507, 508, and 509, and stores it in the storage areas 511, 512, and 513 of the character string search application 501 that correspond to the storage of each tier (step S503). If the data is stored close to the data transfer destination, for example in the first tier storage (such as the page cache), the acquired data is copied to the storage area 510 (step S504). Then, the processing of the character string search application 501 ends (step S505).

本実施形態にかかる情報処理システム４０００によれば、各層のストレージに対応する格納領域５１１、５１２及び５１３を参照することにより、あるデータがどのストレージに保存されているかを知ることができる。また、あるデータへのアクセス時間に関する指標を得ることができる。また、ページキャッシュに未読み込みのデータに対する非同期リードを発行する際に効果を奏する。すなわち、どのデータに対して非同期リードを発行するかを判断する際に、最も浅い層にあるデータ、すなわちアクセス時間が短いと期待されるデータに対して優先的にリードを発行することができる。 According to the information processing system 4000 of the present embodiment, referring to the storage areas 511, 512, and 513 corresponding to the storage of each tier makes it possible to know in which storage a given piece of data is held. It also provides an indicator of the access time to that data. This is useful when issuing asynchronous reads for data that has not yet been read into the page cache. That is, when deciding which data to issue an asynchronous read for, reads can be issued preferentially for the data in the shallowest tier, that is, the data expected to have the shortest access time.
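The prioritization described above can be sketched as follows. This is an illustration only: the per-tier position information (storage areas 511, 512, and 513) is modeled as a list of sets of block indices ordered from the shallowest tier to the deepest, and the function names `shallowest_tier` and `order_async_reads` are hypothetical.

```python
from typing import Iterable, List, Set


def shallowest_tier(block: int, tier_positions: List[Set[int]]) -> int:
    """Return the index of the shallowest tier that holds the block."""
    for depth, blocks in enumerate(tier_positions):
        if block in blocks:
            return depth
    raise KeyError(f"block {block} is not present in any tier")


def order_async_reads(unread: Iterable[int],
                      tier_positions: List[Set[int]]) -> List[int]:
    """Order unread blocks so that blocks in shallower tiers are read first."""
    # Ties within a tier are broken by block index to keep the order stable.
    return sorted(unread, key=lambda b: (shallowest_tier(b, tier_positions), b))
```

For example, with `tier_positions = [{0}, {1, 3}, {0, 1, 2, 3, 4}]`, the unread blocks `[4, 3, 2, 1]` are ordered as `[1, 3, 2, 4]`: blocks 1 and 3 are available in the second tier, so their reads are issued before those for blocks 2 and 4, which exist only in the deepest tier.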

以上、本発明の実施形態について説明したが、本発明は上記実施形態に限定されるものではなく、本発明の趣旨を逸脱しない限りにおいて他の変形例、応用例を含むことは言うまでもない。上記の実施形態の一部又は全部は、以下のようにも記載されうるが、以下には限られない。
（付記１）
ホストからの要求に応じて、読み出し速度の異なる複数の記憶媒体から処理対象データを読み出して前記ホストに転送する情報処理システムであって、
前記処理対象データが存在する前記記憶媒体を示す位置情報を取得する取得手段と、
前記複数の記憶媒体間に定められる優先順位であって、前記複数の記憶媒体に格納された前記処理対象データの読み出し処理の優先順位を定めるテーブルを管理する管理手段と、
前記位置情報及び前記テーブルに基づいて前記処理対象データを読み出す処理手段と、を備える情報処理システム。
（付記２）
前記処理手段は、前記位置情報及び前記テーブルに基づいて、前記複数の記憶媒体に格納されている前記処理対象データに対して前記複数の記憶媒体ごとに異なる読み出し処理を実行する、付記１に記載の情報処理システム。
（付記３）
前記複数の記憶媒体はストレージディスク及び当該ストレージディスクよりも読み出し速度の速いキャッシュメモリであり、
前記処理手段は、前記処理対象データが前記キャッシュメモリに格納されている場合前記キャッシュメモリからの読み出し処理を実行し、前記処理対象データが前記キャッシュメモリに格納されていない場合前記ストレージディスクからの読み出し処理を実行する、付記１または２に記載の情報処理システム。
（付記４）
前記処理対象データは、検索対象となるデータであり、
前記処理手段は、前記複数の記憶媒体のうち読み出し速度の速い記憶媒体に格納されている前記処理対象データに対する検索処理を優先的に実行する、付記１または２に記載の情報処理システム。
（付記５）
前記処理手段は、前記位置情報及び前記テーブルに基づいて、前記転送にかかる時間が短いデータに対する読み出し処理を優先的に実行する、付記４に記載の情報処理システム。
（付記６）
ホストからの要求に応じて、読み出し速度の異なる複数の記憶媒体から処理対象データを読み出して前記ホストに転送するステップと、
前記処理対象データが存在する前記記憶媒体を示す位置情報を取得する取得ステップと、
前記複数の記憶媒体間に定められる優先順位であって、前記複数の記憶媒体に格納された前記処理対象データの読み出し処理の優先順位を定めるテーブルを管理する管理ステップと、
前記位置情報及び前記テーブルに基づいて前記処理対象データを読み出す処理ステップと、を含む情報処理方法。
（付記７）
前記処理ステップでは、前記位置情報及び前記テーブルに基づいて、前記複数の記憶媒体に格納されている前記処理対象データに対して前記複数の記憶媒体ごとに異なる読み出し処理を実行する、付記６に記載の情報処理方法。
（付記８）
前記複数の記憶媒体はストレージディスク及び当該ストレージディスクよりも読み出し速度の速いキャッシュメモリであり、
前記処理ステップでは、前記位置情報及び前記テーブルに基づいて、前記処理対象データが前記キャッシュメモリに格納されている場合前記キャッシュメモリからの読み出し処理を実行し、前記処理対象データが前記キャッシュメモリに格納されていない場合前記ストレージディスクからの読み出し処理を実行する、付記６または７に記載の情報処理方法。
（付記９）
前記処理対象データは、検索対象となるデータであり、
前記処理ステップでは、前記複数の記憶媒体のうち読み出し速度の速い記憶媒体に格納されている前記処理対象データに対する検索処理を優先的に実行する、付記６または７に記載の情報処理方法。
（付記１０）
前記処理ステップでは、前記位置情報及び前記テーブルに基づいて、前記転送にかかる時間が短いデータに対する読み出し処理を優先的に実行する、付記９に記載の情報処理方法。
Although the embodiments of the present invention have been described above, the present invention is not limited to these embodiments, and it goes without saying that other modifications and applications are included without departing from the spirit of the present invention. The whole or part of the embodiments disclosed above can be described as, but is not limited to, the following.
(Appendix 1)
An information processing system that, in response to a request from a host, reads processing target data from a plurality of storage media having different read speeds and transfers the data to the host, the system comprising:
an acquisition unit that acquires position information indicating the storage medium in which the processing target data exists;
a management unit that manages a table defining a priority order, set among the plurality of storage media, for read processing of the processing target data stored in the plurality of storage media; and
a processing unit that reads the processing target data based on the position information and the table.
(Appendix 2)
The information processing system according to appendix 1, wherein the processing unit executes, based on the position information and the table, a different read process for each of the plurality of storage media on the processing target data stored in the plurality of storage media.
(Appendix 3)
The information processing system according to appendix 1 or 2, wherein the plurality of storage media are a storage disk and a cache memory having a faster read speed than the storage disk, and
the processing unit executes a read process from the cache memory when the processing target data is stored in the cache memory, and executes a read process from the storage disk when the processing target data is not stored in the cache memory.
(Appendix 4)
The information processing system according to appendix 1 or 2, wherein the processing target data is data to be searched, and
the processing unit preferentially executes a search process on the processing target data stored in a storage medium having a fast read speed among the plurality of storage media.
(Appendix 5)
The information processing system according to appendix 4, wherein the processing unit preferentially executes, based on the position information and the table, a read process for data that takes a short time to transfer.
(Appendix 6)
An information processing method including:
a step of reading, in response to a request from a host, processing target data from a plurality of storage media having different read speeds and transferring the data to the host;
an acquisition step of acquiring position information indicating the storage medium in which the processing target data exists;
a management step of managing a table defining a priority order, set among the plurality of storage media, for read processing of the processing target data stored in the plurality of storage media; and
a processing step of reading the processing target data based on the position information and the table.
(Appendix 7)
The information processing method according to appendix 6, wherein, in the processing step, a different read process is executed for each of the plurality of storage media on the processing target data stored in the plurality of storage media, based on the position information and the table.
(Appendix 8)
The information processing method according to appendix 6 or 7, wherein the plurality of storage media are a storage disk and a cache memory having a faster read speed than the storage disk, and
in the processing step, based on the position information and the table, a read process from the cache memory is executed when the processing target data is stored in the cache memory, and a read process from the storage disk is executed when the processing target data is not stored in the cache memory.
(Appendix 9)
The information processing method according to appendix 6 or 7, wherein the processing target data is data to be searched, and
in the processing step, a search process on the processing target data stored in a storage medium having a fast read speed among the plurality of storage media is preferentially executed.
(Appendix 10)
The information processing method according to appendix 9, wherein, in the processing step, a read process for data that takes a short time to transfer is preferentially executed based on the position information and the table.
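As a rough illustration of how position information and a priority table might be combined as in the appendices above, the following sketch orders reads by the priority of the medium that holds each block and uses a different read routine per medium. The table contents, the medium names, and the function `read_in_priority_order` are assumptions made for this example and are not taken from the embodiments.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical priority table: a smaller number means the medium is read first.
PRIORITY_TABLE: Dict[str, int] = {"cache_memory": 0, "storage_disk": 1}


def read_in_priority_order(
    location: Dict[int, str],
    readers: Dict[str, Callable[[int], bytes]],
    table: Dict[str, int],
) -> List[Tuple[int, str, bytes]]:
    """Read every block, visiting media in the order given by the table.

    location maps a block index to the medium holding it (position information);
    readers maps a medium name to the read routine used for that medium.
    """
    order = sorted(location, key=lambda block: (table[location[block]], block))
    return [(block, location[block], readers[location[block]](block))
            for block in order]
```

With blocks 1 and 3 located in the cache memory and blocks 0 and 2 on the storage disk, the blocks are read in the order 1, 3, 0, 2, each through the routine registered for its medium.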

１０、１０４、３０４、５０３取得手段
１１、１０５、３０８管理手段
１２、１１２処理手段
２０、２１記憶媒体
１０６、３０６キャッシュメモリ
１０１ビジネスアプリケーション
１０２、３０５、５０２ファイル
１０３バックアップアプリケーション
１０７、３０７ストレージディスク
１０８バックアップストレージディスク
１０９、３０９、３１０、３１１、３１２、５１０、５１１、５１２、５１３格納領域
１１０一時保存領域
３０１、５０１文字列検索アプリケーション
３０２データ管理手段
３０３検索処理実行手段
５０４第１層ストレージ
５０５第２層ストレージ
５０６第Ｎ層ストレージ
５０７第１層ストレージのデバイスドライバ
５０８第２層ストレージのデバイスドライバ
５０９第Ｎ層ストレージのデバイスドライバ
１０００、２０００、３０００、４０００情報処理システム
10, 104, 304, 503 Acquisition means
11, 105, 308 Management means
12, 112 Processing means
20, 21 Storage medium
106, 306 Cache memory
101 Business application
102, 305, 502 File
103 Backup application
107, 307 Storage disk
108 Backup storage disk
109, 309, 310, 311, 312, 510, 511, 512, 513 Storage area
110 Temporary storage area
301, 501 Character string search application
302 Data management means
303 Search processing execution means
504 First tier storage
505 Second tier storage
506 Nth tier storage
507 Device driver for the first tier storage
508 Device driver for the second tier storage
509 Device driver for the Nth tier storage
1000, 2000, 3000, 4000 Information processing system

Claims

An information processing system that, in response to a request from a host, reads processing target data from a plurality of storage media having different read speeds and transfers the data to the host, the system comprising:
an acquisition unit that acquires position information indicating the storage medium in which the processing target data exists;
a management unit that manages a table defining a priority order, set among the plurality of storage media, for read processing of the processing target data stored in the plurality of storage media; and
a processing unit that reads the processing target data based on the position information and the table, wherein
the processing target data is data to be searched for a character string,
the plurality of storage media include a first storage medium and a second storage medium, and
when only a part of the processing target data is stored in the first storage medium, whose read speed is faster than that of the second storage medium, the processing unit preferentially executes a character string search process on the part of the processing target data stored in the first storage medium.

The information processing system according to claim 1, wherein the processing unit executes, based on the position information and the table, a different read process for each of the plurality of storage media on the processing target data stored in the plurality of storage media.

The information processing system according to claim 1, wherein the second storage medium is a storage disk and the first storage medium is a cache memory having a read speed higher than that of the storage disk, and
when only a part of the processing target data is stored in the cache memory, the processing unit executes a read process from the cache memory for that part of the processing target data and executes a read process from the storage disk for the remainder of the processing target data.

The information processing system according to claim 1, wherein, when only a part of the processing target data is stored in the first storage medium, the processing unit preferentially executes the read process for the part of the processing target data stored in the first storage medium, which is data that takes a short time to transfer.

The information processing system according to claim 1, wherein, when only a part of the processing target data is stored in the first storage medium, the processing unit executes a read process from the first storage medium for that part of the processing target data and, after the character string search process on that part is completed, executes a read process from the second storage medium for the remainder of the processing target data.

An information processing method for reading, in response to a request from a host, processing target data from a plurality of storage media having different read speeds and transferring the data to the host, the method comprising:
an acquisition step of acquiring position information indicating the storage medium in which the processing target data exists;
a management step of managing a table defining a priority order, set among the plurality of storage media, for read processing of the processing target data stored in the plurality of storage media; and
a processing step of reading the processing target data based on the position information and the table, wherein
the processing target data is data to be searched for a character string,
the plurality of storage media include a first storage medium and a second storage medium, and
in the processing step, when only a part of the processing target data is stored in the first storage medium, whose read speed is faster than that of the second storage medium, a character string search process is preferentially executed on the part of the processing target data stored in the first storage medium.

The information processing method according to claim 6, wherein, in the processing step, a different read process is executed for each of the plurality of storage media on the processing target data stored in the plurality of storage media, based on the position information and the table.

The information processing method according to claim 6, wherein the second storage medium is a storage disk and the first storage medium is a cache memory having a read speed higher than that of the storage disk, and
in the processing step, when only a part of the processing target data is stored in the cache memory, a read process from the cache memory is executed for that part of the processing target data, and a read process from the storage disk is executed for the remainder of the processing target data.

The information processing method according to claim 6, wherein, in the processing step, when only a part of the processing target data is stored in the first storage medium, the read process for the part of the processing target data stored in the first storage medium, which is data that takes a short time to transfer, is preferentially executed.

The information processing method according to claim 6, wherein, in the processing step, when only a part of the processing target data is stored in the first storage medium, a read process from the first storage medium is executed for that part of the processing target data and, after the character string search process on that part is completed, a read process from the second storage medium is executed for the remainder of the processing target data.