JP2018147143A

JP2018147143A - Information processing apparatus, information processing program and information processing method

Info

Publication number: JP2018147143A
Application number: JP2017040160A
Authority: JP
Inventors: 鷹詔中尾; Takanori Nakao
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2017-03-03
Filing date: 2017-03-03
Publication date: 2018-09-20
Anticipated expiration: 2037-03-03
Also published as: WO2018159167A1; JP6756280B2

Abstract

PROBLEM TO BE SOLVED: To enable efficient use of a semiconductor memory device. SOLUTION: An information processing apparatus includes: a division processing unit 21 that divides compression target character string data into a plurality of partial regions; a compression processing unit 22 that compresses the region data included in each of the plurality of partial regions to an access unit size or less to generate compressed partial data; and a storage processing unit 25 that stores the generated compressed partial data in an access unit area of a semiconductor memory device 14. SELECTED DRAWING: Figure 1

Description

The present invention relates to an information processing apparatus, an information processing program, and an information processing method.

Grammar compression is known as one data compression method for reducing the amount of character string data.

FIG. 11 is a diagram for explaining a conventional grammar compression method.

In grammar compression, two consecutive characters are repeatedly replaced (converted, compressed) with one character, and the single character that finally remains is recorded together with a dictionary indicating the replacement history.

FIG. 11 shows an example of compressing the character string “abcabc” as a tree structure. In this tree structure, each node (vertex) represents a replaced character.

Hereinafter, the higher level (upper layer) of the tree structure is simply referred to as “above”, and the lower level (lower layer) is simply referred to as “below”.

First, the character string “abcabc” is converted to the character string “XcXc” by replacing the character string “ab” with the character “X”.

Next, the character string “XcXc” is converted to the character string “YY” by replacing the character string “Xc” with the character “Y”.

Further, the character string “YY” is converted to the character “Z” by replacing “YY” with “Z”.

Then, the information on the replacements performed until the character string “abcabc” was converted to “Z” is recorded as a dictionary, together with the finally remaining single character “Z”.

The compressed character “Z” is expanded (decompressed) into the original character string “abcabc” by performing the above process in reverse.
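The replacement steps above can be sketched in a few lines of Python. This is only an illustration of the FIG. 11 example, not the patent's implementation: the function names `compress` and `expand` are hypothetical, and the sketch assumes the replacement symbols “X”, “Y”, and “Z” never occur in the input text.

```python
# Dictionary recorded during compression: (replacement symbol, pair it replaced).
# The rules are those of the "abcabc" example; the list name is an assumption.
RULES = [("X", "ab"), ("Y", "Xc"), ("Z", "YY")]


def compress(text: str) -> str:
    """Apply each replacement in the order in which it was created."""
    for symbol, pair in RULES:
        text = text.replace(pair, symbol)
    return text


def expand(text: str) -> str:
    """Undo the replacements in reverse order to restore the original string."""
    for symbol, pair in reversed(RULES):
        text = text.replace(symbol, pair)
    return text


print(compress("abcabc"))  # Z
print(expand("Z"))         # abcabc
```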

In grammar compression, attaching to each node information indicating how many characters exist below it (hereinafter referred to as a label) makes it possible to decompress a character string in a specific range in a small number of steps. A tree structure whose nodes carry such labels may be referred to as a labeled tree structure.

FIG. 12 is a diagram illustrating a labeled tree structure, in which labels are attached to the tree structure shown in FIG. 11.

In the labeled tree structure illustrated in FIG. 12, there are two characters, “a” and “b”, below the character “X”, so the label “2” is attached to “X”. Similarly, there are three characters, “a”, “b”, and “c”, below the character “Y”, so the label “3” is attached to “Y”. Note that the characters “X” and “Y” used for replacement are not counted in the label.

The dictionary of the labeled tree structure illustrated in FIG. 12 can be expressed as follows. The dictionary represents the character replacements performed in the course of grammar compression.

“Z”: “Z(6) -> YY”, “Y(3) -> Xc”, “X(2) -> ab”

In the dictionary, the symbol “->” represents a correspondence by replacement, and labels are shown as numbers in parentheses (for example, “(6)”). In the labeled tree structure, a node located between the uppermost character (the character “Z” in the example shown in FIG. 12) and the lowermost characters (the character string “abcabc” in the example shown in FIG. 12) may be referred to as an intermediate node.

Here, to refer to (decompress), for example, the fifth and sixth characters of the character string “abcabc” represented by the labeled tree structure illustrated in FIG. 12, the following procedure is performed.

Hereinafter, the character string from the a-th character to the b-th character of a character “X”, for example, is written “X[a,b]”. Both a and b are natural numbers.

A method for decompressing Z[5,6] is shown below.

In the labeled tree structure shown in FIG. 12, Z can be expanded to Y(3) and Y(3), so Z[5,6] can be expressed as Y[2,3] of the second Y. That is, the second and third characters of Y correspond to the fifth and sixth characters of Z.

Similarly, Y can be expanded to X(2) and c(1), so Y[2,3] becomes X[2,2] and c. Further, since X can be expanded to a(1) and b(1), X[2,2] can be expressed as b.

Therefore, the grammar-compressed Z[5,6] is obtained as bc.

In the labeled tree structure shown in FIG. 12, the nodes referred to in order to decompress Z[5,6] as described above are drawn with broken lines.
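The range lookup described above can be sketched as a recursive descent that uses the labels to skip subtrees outside the requested range. The following is a minimal illustration under stated assumptions: the names `DICTIONARY`, `length`, and `extract` are hypothetical, and each dictionary entry is assumed to hold its label and its two children.

```python
# Labeled dictionary of FIG. 12: symbol -> (label, left child, right child).
# The label is the number of original characters below the node.
DICTIONARY = {
    "Z": (6, "Y", "Y"),
    "Y": (3, "X", "c"),
    "X": (2, "a", "b"),
}


def length(symbol: str) -> int:
    """Label of a node; a terminal character counts as one character."""
    return DICTIONARY[symbol][0] if symbol in DICTIONARY else 1


def extract(symbol: str, a: int, b: int) -> str:
    """Return symbol[a,b] (1-indexed, inclusive), visiting only needed nodes."""
    if symbol not in DICTIONARY:
        return symbol  # terminal character, so a == b == 1
    _, left, right = DICTIONARY[symbol]
    n_left = length(left)
    out = ""
    if a <= n_left:  # part of the range lies in the left child
        out += extract(left, a, min(b, n_left))
    if b > n_left:   # part of the range lies in the right child
        out += extract(right, max(a, n_left + 1) - n_left, b - n_left)
    return out


print(extract("Z", 5, 6))  # bc
```

For Z[5,6] the left Y is never visited, which corresponds to the small set of nodes drawn with broken lines in FIG. 12.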

JP-A-6-202844; JP 2010-26884 A

In general, the intermediate nodes accessed when decompressing grammar-compressed data are scattered across the storage area. Therefore, decompressing data by accessing only some of the nodes in the tree structure, as described above, means that, in terms of addresses in the data storage area, random accesses occur to several areas.

Meanwhile, in recent years, AFAs (All Flash Arrays) using SSDs (Solid State Drives) have been developed as storage products that realize high-speed data access. An SSD has the characteristic that data access is performed in units of areas called pages.

Therefore, when grammar-compressed data is stored in an SSD, a large number of the pages constituting the storage area must be read when the data is decompressed, so data in many areas of the SSD must be read. This shortens the life of the SSD.

In one aspect, an object of the present invention is to enable efficient use of a semiconductor memory device.

For this reason, this information processing apparatus includes: a division processing unit that partitions compression target character string data into a plurality of partial areas; a compression processing unit that compresses the area data included in each of the plurality of partial areas to an access unit size or less to generate compressed partial data; and a storage processing unit that stores the generated compressed partial data in an access unit area of a semiconductor memory device.

According to one embodiment, a semiconductor memory device can be used efficiently.

FIG. 1 is a diagram schematically showing the configuration of an information processing apparatus as an example of an embodiment.
FIG. 2 is a diagram for explaining processing by the division processing unit in the information processing apparatus as an example of the embodiment.
FIG. 3 is a diagram for explaining processing by the compression processing unit in the information processing apparatus as an example of the embodiment.
FIG. 4 is a diagram for explaining processing by the decompression processing unit in the information processing apparatus as an example of the embodiment.
FIG. 5 is a flowchart explaining message compression processing in the information processing apparatus as an example of the embodiment.
FIG. 6 is a flowchart explaining decompression processing in the information processing apparatus as an example of the embodiment.
FIG. 7 is a diagram showing the data compression method of the information processing apparatus as an example of the embodiment in comparison with a conventional grammar compression method.
FIG. 8 is a diagram showing a modification of the functional configuration of the information processing apparatus as an example of the embodiment.
FIG. 9 is a diagram for explaining the compression processing applied to a compression target message, and its compression effect, in the information processing apparatus as a modification.
FIG. 10 is a flowchart explaining message compression processing in the information processing apparatus as a modification.
FIG. 11 is a diagram for explaining a conventional grammar compression method.
FIG. 12 is a diagram illustrating a labeled tree structure.

Hereinafter, embodiments of the information processing apparatus, information processing program, and information processing method will be described with reference to the drawings. However, the embodiments described below are merely examples, and there is no intention to exclude the application of various modifications and techniques not explicitly described in them. That is, the embodiments can be implemented with various modifications (for example, by combining embodiments and modifications) without departing from their spirit. In addition, each figure is not meant to include only the components shown in it, and may include other functions and the like.

(A) Configuration
FIG. 1 is a diagram schematically illustrating the configuration of an information processing apparatus 1 as an example of an embodiment.

The information processing apparatus 1 is, for example, a CM (Controller Module) in a storage system or a personal computer.

As shown in FIG. 1, for example, the information processing apparatus 1 includes, as components, a CPU (Central Processing Unit) 11, a RAM (Random Access Memory) 12, a ROM (Read Only Memory) 13, an SSD 14, a graphic processing device 20, an input interface 15, an optical drive device 16, a device connection interface 17, and a network interface 18. These components 11 to 18 and 20 are configured to communicate with each other via a bus 19.

The information processing apparatus 1 has a data compression function that compresses data composed of character strings (character string data, compression target character string data) and stores it in the SSD 14 with its data size reduced.

The SSD 14 is a storage device that uses semiconductor memory as its storage medium, and stores various data and programs.

The following describes an SSD 14 that includes flash memory as the semiconductor memory.

The storage area of the flash memory includes a plurality of (for example, 4200) blocks, and a plurality of (for example, 256) pages are formed in each block. Each page has a storage capacity of, for example, 4 KB (kilobytes). Hereinafter, the size of one page is referred to as the page size or the alignment size. In this example, the page size is 4 KB.

In flash memory, data is read and written in units of pages, but erased in units of blocks.

For example, a file system executed in the information processing apparatus 1 controls operations on, access to, and searches of the data stored in the SSD 14, and controls the reading and writing of data to and from the SSD 14.

Since read and write input/output accesses to the storage area of the SSD 14 are performed in units of pages as described above, a page can be called the access unit, and the page size can be called the access unit size.
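Because access is page-granular, the number of pages read depends on where data sits relative to page boundaries as well as on its size. The following sketch illustrates this with the 4 KB example page size; the function name `pages_touched` and the byte-offset model are assumptions made for illustration only.

```python
PAGE_SIZE = 4 * 1024  # 4 KB page, the example size given in the text


def pages_touched(offset: int, length: int, page_size: int = PAGE_SIZE) -> range:
    """Indices of the pages that must be read to cover bytes [offset, offset + length)."""
    if length <= 0:
        return range(0)
    first = offset // page_size
    last = (offset + length - 1) // page_size
    return range(first, last + 1)


# 100 bytes lying entirely inside page 0: one page is read.
print(list(pages_touched(0, 100)))     # [0]
# 100 bytes straddling the boundary between pages 0 and 1: two pages are read.
print(list(pages_touched(4090, 100)))  # [0, 1]
```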

The ROM 13 is a storage device that stores various data and programs. The RAM 12 is a storage device that temporarily stores data, programs, and the like when the CPU 11 performs arithmetic processing and the like.

The RAM 12 may also temporarily store data to be stored in the SSD 14 and data read from the SSD 14.

The CPU 11 performs various calculations and controls by executing an OS (Operating System) and various programs stored in the ROM 13 and the SSD 14. In the information processing apparatus 1, the CPU 11 executes a program (compression processing program, information processing program) stored in the SSD 14 or the like, and thereby functions as the division processing unit 21, the compression processing unit 22, the decompression processing unit 23, and the storage processing unit 25 shown in FIG. 1.

The program for realizing the functions of the division processing unit 21, the compression processing unit 22, the decompression processing unit 23, and the storage processing unit 25 is provided in a form recorded on a computer-readable recording medium such as a flexible disk, a CD (Compact Disc; CD-ROM, CD-R (Recordable), CD-RW (ReWritable), etc.), a DVD (Digital Versatile Disc; DVD-ROM, DVD-RAM, DVD-R, DVD+R, DVD-RW, DVD+RW, HD DVD, etc.), a Blu-ray disc, a magnetic disk, an optical disc, or a magneto-optical disk. The computer reads the program from the recording medium, transfers it to an internal or external storage device, stores it, and uses it. Alternatively, the program may be recorded in a storage device (recording medium) such as a magnetic disk, an optical disc, or a magneto-optical disk and provided from that storage device to the computer via a communication path.

When the functions of the division processing unit 21, the compression processing unit 22, the decompression processing unit 23, and the storage processing unit 25 are realized, the program stored in the internal storage device (the RAM 12 or the ROM 13 in this embodiment) is executed by the microprocessor of the computer (the CPU 11 in this embodiment). At this time, the computer may read and execute the program recorded on the recording medium.

In this embodiment, a computer is a concept that includes hardware and an operating system, and means hardware that operates under the control of the operating system. When an operating system is unnecessary and an application program alone operates the hardware, the hardware itself corresponds to the computer. The hardware includes at least a microprocessor such as a CPU and means for reading a computer program recorded on a recording medium; in this embodiment, the information processing apparatus 1 functions as the computer.

The CPU 11 may be a multiprocessor. Instead of the CPU 11, any one of, for example, an MPU (Micro Processing Unit), a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), and an FPGA (Field Programmable Gate Array) may be provided. A combination of two or more types of elements among a CPU, MPU, DSP, ASIC, PLD, and FPGA may also be used in place of the CPU 11.

A monitor 20a is connected to the graphic processing device 20. The graphic processing device 20 displays an image on the screen of the monitor 20a in accordance with commands from the CPU 11. Examples of the monitor 20a include a display device using a CRT (Cathode Ray Tube) and a liquid crystal display device.

A keyboard 15a and a mouse 15b are connected to the input interface 15. The input interface 15 transmits signals sent from the keyboard 15a and the mouse 15b to the CPU 11. The mouse 15b is an example of a pointing device, and other pointing devices can also be used. Examples of other pointing devices include a touch panel, a tablet, a touch pad, and a trackball.

The optical drive device 16 reads data recorded on an optical disc 16a using laser light or the like. The optical disc 16a is a portable non-transitory recording medium on which data is recorded so as to be readable by reflection of light. Examples of the optical disc 16a include a DVD, DVD-RAM, CD-ROM, and CD-R/RW.

The device connection interface 17 is a communication interface for connecting peripheral devices to the information processing apparatus 1. For example, a memory device 17a or a memory reader/writer 17b can be connected to the device connection interface 17. The memory device 17a is a non-transitory recording medium equipped with a function for communicating with the device connection interface 17, for example, a USB (Universal Serial Bus) memory. The memory reader/writer 17b writes data to a memory card 17c or reads data from the memory card 17c. The memory card 17c is a card-type non-transitory recording medium.

The network interface 18 is connected to a network (not shown). The network interface 18 transmits and receives data to and from other computers or communication devices via the network.

The division processing unit 21 divides the data to be compressed, which is written as a character string (compression target character string data), into a plurality of regions. Hereinafter, the data to be compressed may be referred to as a message or a compression target message. Dividing the compression target message into a plurality of regions may also be expressed as splitting or partitioning it.

Each region includes one or more pages. In other words, the division processing unit 21 divides the message into regions that each include one or more pages. A region preferably includes two or more pages, and the region size is preferably a multiple of the page size. Each region preferably contains the same number of pages.

Hereinafter, the data included in a region may be referred to as region data. A region to be processed by the compression processing unit 22 and the decompression processing unit 23, described later, may be referred to as a target region or simply a target. A plurality of regions may be referred to as a target group.

A message divided into a plurality of regions by the division processing unit 21 may be referred to as a partitioned message.

FIG. 2 is a diagram for explaining processing by the division processing unit 21 in the information processing apparatus 1 as an example of the embodiment.

In FIG. 2, reference (a) indicates the message before division by the division processing unit 21; the message has a length of eight pages.

Hereinafter, in the messages shown in the figures, the portions corresponding to individual pages in the uncompressed state are denoted by P1 to P8.

In FIG. 2, reference (b) indicates the message after it has been divided into a plurality of regions by the division processing unit 21. In the example shown in FIG. 2, the division processing unit 21 generates a plurality of (four in this example) regions R1 to R4 by dividing the message every two consecutive pages.

In the example shown in FIG. 2, the region R1 includes the data of pages P1 and P2 as region data. Similarly, the region R2 includes the data of pages P3 and P4, the region R3 includes the data of pages P5 and P6, and the region R4 includes the data of pages P7 and P8 as region data.

The division processing unit 21 assigns the message to regions in order from its beginning. If the data size of the message is not a multiple of the page size, the fractional remainder of the message may be stored in the last region.

Since the region data of each region has a data size that is an integer multiple of the page size, the boundaries between adjacent regions in the message coincide with boundaries between pages. Accordingly, each piece of region data can be said to be aligned to pages in the storage area of the SSD 14.
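The division step can be sketched as fixed-size slicing from the start of the message. This is an illustration under assumptions: the function name `split_into_regions` is hypothetical, the message is modelled as a Python string, and a toy page size of 3 characters stands in for the 4 KB page so that the output is easy to read.

```python
def split_into_regions(message: str, page_size: int, pages_per_region: int) -> list[str]:
    """Cut the message into regions of page_size * pages_per_region characters.

    Regions are filled in order from the start of the message; any remainder
    that does not fill a whole region ends up in the last region.
    """
    region_size = page_size * pages_per_region
    return [message[i:i + region_size] for i in range(0, len(message), region_size)]


# Toy input: 8 "pages" of 3 characters each, 2 pages per region -> 4 regions.
message = "abcabcbcbcaaaabbccabccba"
for number, region in enumerate(split_into_regions(message, 3, 2), start=1):
    print(f"R{number}: {region}")
```

Because every region boundary is a multiple of the region size, and the region size is a multiple of the page size, each boundary also falls on a page boundary.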

The compression processing unit 22 compresses the data of each region divided (generated) by the division processing unit 21.

The compression processing unit 22 compresses the region data of each region, using the grammar compression technique, until it is no larger than the page size.

That is, the compression processing unit 22 extracts the two consecutive characters that appear most frequently in the compression target message as a replacement target character string, and assigns a post-replacement character to this replacement target character string. The compression processing unit 22 then performs compression processing (grammar compression processing) by replacing the replacement target character string with the post-replacement character throughout the compression target message.

Hereinafter, the data generated when the compression processing unit 22 compresses region data may be referred to as compressed partial data. Compressed partial data may also be referred to as compressed data.

FIG. 3 is a diagram for explaining processing by the compression processing unit 22 in the information processing apparatus 1 as an example of the embodiment.

In FIG. 3, reference (a) indicates the message before compression by the compression processing unit 22 (the compression target message), and reference (b) indicates the message after compression by the compression processing unit 22. As shown at (b) in FIG. 3, the compressed partial data are denoted by S1 to S4 and the dictionary information by D1.

The compression processing unit 22 performs grammar compression, which reduces the size of the data, by repeatedly replacing (compression replacement, conversion) two consecutive characters with one character in the data of each region.

The compression processing unit 22 compresses the data of each region until it is no larger than the page size, and stops the compression replacement of that region data once it is no larger than the page size.
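The stop condition can be sketched as a loop that replaces the most frequent pair and exits as soon as the region fits. This is a simplified illustration for a single region, not the patent's implementation: the names `compress_region` and `expand_region` are hypothetical, size is measured in characters rather than bytes, the dictionary size is not counted, overlapping pairs are counted naively, and the fresh symbols are assumed not to occur in the input.

```python
from collections import Counter


def compress_region(data: str, limit: int, fresh_symbols: str = "PQRSTUVW"):
    """Replace the most frequent pair repeatedly until len(data) <= limit."""
    rules = []  # (replacement symbol, pair it replaced), in creation order
    symbols = iter(fresh_symbols)
    while len(data) > limit:
        pairs = Counter(data[i:i + 2] for i in range(len(data) - 1))
        symbol = next(symbols, None)
        if not pairs or symbol is None:
            break  # nothing left to replace, or no unused symbol remains
        pair, _count = pairs.most_common(1)[0]
        data = data.replace(pair, symbol)
        rules.append((symbol, pair))
    return data, rules


def expand_region(data: str, rules) -> str:
    """Undo the replacements in reverse order."""
    for symbol, pair in reversed(rules):
        data = data.replace(symbol, pair)
    return data


compressed, rules = compress_region("abcabc", limit=3)
print(len(compressed) <= 3)              # True
print(expand_region(compressed, rules))  # abcabc
```

With `limit=4` the same input needs only one replacement before the loop exits, which mirrors stopping as soon as the page size is reached instead of compressing down to a single character.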

In the example shown in FIG. 3, as indicated by reference (a), the message before compression (the compression target message) contains the character string “abcabc” in the region R1, the character string “bcbcaa” in the region R2, the character string “aabbcc” in the region R3, and the character string “abccba” in the region R4.

The compression processing unit 22 replaces the character string “bc” with the character “P”, the character string “aP” with the character “Q”, the character string “aa” with the character “R”, the character string “Pc” with the character “S”, and the character string “cb” with the character “T”.

As a result, for example, the character string “abcabc” in the region R1 is converted into the character string “QQ”, as shown at (b) (see S1). The data size of the converted character string “QQ”, which is compressed partial data, is no larger than the page size and fits within one page.

Similarly, the character string “bcbcaa” in the region R2 is converted into the character string “PPR” (compressed partial data), as shown at (b) (see S2). The data size of the converted character string “PPR” is also no larger than the page size and fits within one page. Likewise, for the character strings of the other regions R3 and R4, the data size of the compressed partial data after conversion fits within one page.
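The two conversions stated above can be reproduced by applying the listed replacements to each region independently. This is only a sketch: the function name `compress_region` is hypothetical, and applying the replacements in the order in which they are listed is an assumption about the example.

```python
# Replacement rules from the FIG. 3 example: (replacement character, replaced pair).
RULES = [("P", "bc"), ("Q", "aP"), ("R", "aa"), ("S", "Pc"), ("T", "cb")]


def compress_region(region: str) -> str:
    """Apply the shared rules to one region, never looking past its boundary."""
    for symbol, pair in RULES:
        region = region.replace(pair, symbol)
    return region


print(compress_region("abcabc"))  # QQ   (region R1 -> S1)
print(compress_region("bcbcaa"))  # PPR  (region R2 -> S2)
```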

圧縮処理部２２は、リージョン毎にデータ圧縮を行なうものであり、個々のリージョン内でそれぞれデータ圧縮を行なう。すなわち、圧縮処理部２２は、リージョン間の境界（アラインメント）を超えてデータ圧縮を行なうことはない。 The compression processing unit 22 performs data compression for each region, and performs data compression within each region. That is, the compression processing unit 22 does not perform data compression beyond the boundary (alignment) between regions.

すなわち、圧縮処理部２２は、各リージョンのデータに文字圧縮を行なう。 That is, the compression processing unit 22 performs character compression on the data of each region.

後述する格納処理部２５は、圧縮処理部２２が生成した各圧縮済部分データを、それぞれＳＳＤ１４の記憶領域における１つのページ内に収まるように格納する。これにより、１つ分のリージョンのデータは、圧縮されて圧縮済部分データとして、ＳＳＤ１４の１つのページ内に格納される。 The storage processing unit 25 described later stores each compressed partial data generated by the compression processing unit 22 so as to fit in one page in the storage area of the SSD 14. As a result, the data of one region is compressed and stored in one page of the SSD 14 as compressed partial data.

Accordingly, the compressed partial data generated from the data of each region is aligned when stored in the SSD 14: no compressed partial data straddles the boundary between adjacent pages.

In addition, for the replacements that the compression processing unit 22 performs on the data of each region, information indicating the correspondence between the character string before replacement and the character after replacement (a generation rule) is stored as a dictionary element in the dictionary information D1 by the storage processing unit 25 described later. As indicated by reference (b) in FIG. 3, the dictionary information D1 is stored collectively, for example, ahead of the plurality of compressed partial data. In the example indicated by reference (b) in FIG. 3, the dictionary information D1 is stored in a two-page area preceding the plurality of compressed partial data.

圧縮対象メッセージが圧縮処理部２２によって圧縮されたものを圧縮済メッセージといい、符号Ｅ１を付して示す。圧縮済メッセージＥ１は、辞書情報Ｄ１および複数の圧縮済部分データＳ１〜Ｓ４を備える。 A message to be compressed that has been compressed by the compression processing unit 22 is referred to as a “compressed message”, and is denoted by reference numeral E1. The compressed message E1 includes dictionary information D1 and a plurality of compressed partial data S1 to S4.

格納処理部２５は、圧縮処理部２２によって生成された圧縮済部分データを、ＳＳＤ１４の記憶領域を構成するページにそれぞれ格納する。 The storage processing unit 25 stores the compressed partial data generated by the compression processing unit 22 in each page constituting the storage area of the SSD 14.

As described above, each compressed partial data is no larger than the page size, so each one is stored without spilling over the edge of its page.

格納処理部２５は、圧縮処理部２２が生成した各圧縮済部分データを、それぞれ１つのページ内に収まるように格納する。例えば、格納処理部２５は圧縮済部分データの先頭をページの先頭位置に揃えて格納する。これにより、１つ分のリージョンのデータは、圧縮されて圧縮済部分データとして１つのページ内に格納される。 The storage processing unit 25 stores each compressed partial data generated by the compression processing unit 22 so as to fit in one page. For example, the storage processing unit 25 stores the compressed partial data with the head thereof aligned with the head position of the page. Thereby, the data of one region is compressed and stored in one page as compressed partial data.

Accordingly, the compressed partial data generated from the data of each region is aligned when stored in the SSD 14: no compressed partial data straddles the boundary between adjacent pages. In other words, in the SSD 14, each compressed partial data is stored aligned to one of the pages constituting the storage area.
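The page-aligned placement can be illustrated with a small Python sketch. The 16-byte page size, the zero padding, and the `symbol=pair` text serialization of the dictionary are assumptions made only for this illustration; the specification does not define a storage format.

```python
PAGE_SIZE = 16  # hypothetical page size in bytes, tiny for illustration

def layout(dictionary: bytes, parts: list[bytes], page_size: int = PAGE_SIZE):
    """Place the dictionary first, then one page per compressed partial data.

    Returns (image, dict_pages, part_pages), where part_pages[i] is the page
    number holding parts[i].
    """
    dict_pages = -(-len(dictionary) // page_size)  # ceiling division
    image = bytearray(dictionary.ljust(dict_pages * page_size, b"\0"))
    part_pages = []
    for part in parts:
        if len(part) > page_size:
            raise ValueError("compressed partial data must fit in one page")
        part_pages.append(len(image) // page_size)
        # Pad to a full page so the next part starts on a page boundary.
        image += part.ljust(page_size, b"\0")
    return bytes(image), dict_pages, part_pages

image, dict_pages, part_pages = layout(
    b"P=bc;Q=aP;R=aa;S=Pc;T=cb", [b"QQ", b"PPR", b"RbS", b"QTa"]
)
print(dict_pages, part_pages)  # 2 [2, 3, 4, 5]
```

Because every part starts at a page boundary and is at most one page long, reading one part never touches a neighbouring part's page.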

In addition, for the replacements that the compression processing unit 22 performs on the data of each region, the storage processing unit 25 stores information indicating the correspondence between the character string before replacement and the character after replacement (a generation rule) as a dictionary element in the dictionary information D1. As indicated by reference (b) in FIG. 3, the dictionary information D1 is stored collectively, for example, ahead of the plurality of compressed partial data.

格納処理部２５は、圧縮済部分データを、ＳＳＤ１４の記憶領域において、同一の圧縮対象メッセージから生成される他の圧縮済部分データや辞書情報と近接する位置に格納することが望ましい。 The storage processing unit 25 desirably stores the compressed partial data in a storage area of the SSD 14 at a position close to other compressed partial data or dictionary information generated from the same message to be compressed.

The decompression processing unit 23 decompresses (expands, restores) the data compressed by the compression processing unit 22.

伸長処理部２３は、辞書情報Ｄ１を読み出し、この読み出した辞書情報を用いて、各ページに格納された圧縮済部分データをそれぞれ伸長する。 The decompression processing unit 23 reads the dictionary information D1 and decompresses the compressed partial data stored in each page using the read dictionary information.

For example, the decompression processing unit 23 uses the dictionary information to perform the reverse of the process by which the compression processing unit 22 compressed the region data into compressed partial data, thereby decompressing the compressed partial data and restoring the data region by region.

図４は実施形態の一例としての情報処理装置１における伸長処理部２３による処理を説明するための図である。 FIG. 4 is a diagram for explaining processing by the decompression processing unit 23 in the information processing apparatus 1 as an example of the embodiment.

図４中において、符号（ａ）は圧縮処理部２２により圧縮された圧縮済メッセージを、符号（ｂ）は伸長処理部２３により伸長されたメッセージの一部を、それぞれ示している。 In FIG. 4, symbol (a) indicates a compressed message compressed by the compression processing unit 22, and symbol (b) indicates a part of the message expanded by the expansion processing unit 23.

伸長処理部２３は、辞書情報Ｄ１を参照して辞書要素（生成規則）を読み出し、この辞書要素を用いて各圧縮済部分データＳを伸長する。 The decompression processing unit 23 reads dictionary elements (generation rules) with reference to the dictionary information D1, and decompresses each compressed partial data S using the dictionary elements.

The example in FIG. 4 shows the decompression processing unit 23 decompressing the compressed partial data S3. In this example, the compressed partial data S3, which was compressed to one page size or less and stored in one page, is decompressed, and the two-page data of region R3 (region data) is restored.

例えば、圧縮済部分データＳ３を伸長してリージョンＲ３のデータを復元する際に、伸長処理部２３は、ＳＳＤ１４に格納された辞書情報Ｄ１（辞書要素）と圧縮済部分データＳ３とにアクセスを行なう。 For example, when decompressing the compressed partial data S3 and restoring the data of the region R3, the decompression processing unit 23 accesses the dictionary information D1 (dictionary element) and the compressed partial data S3 stored in the SSD 14. .

That is, the decompression processing unit 23 decompresses the compressed partial data S3 by accessing only the storage areas (pages) that hold the decompression target S3 and the dictionary information D1. It does not access the storage areas of compressed partial data other than the decompression target S3.

(B) Operation
Message compression processing in the information processing apparatus 1 as an example of the embodiment configured as described above will be described with reference to the flowchart (steps A1 to A7) shown in FIG. 5. In the example below, the division processing unit 21 divides the message every two consecutive pages.

以下の処理においては、圧縮対象のメッセージ（圧縮対象メッセージ）がオペレータ等によって入力される。オペレータは、リージョンサイズおよびページサイズを指定してもよい。なお、これらのリージョンサイズおよびページサイズには、予め設定された規定値を用いてもよい。圧縮対象メッセージは符号語としてＲＡＭ１２等の記憶領域に記録される。 In the following processing, a message to be compressed (compression target message) is input by an operator or the like. The operator may specify the region size and page size. Note that predetermined values set in advance may be used for these region sizes and page sizes. The compression target message is recorded in a storage area such as the RAM 12 as a code word.

ステップＡ１において、分割処理部２１が、処理対象メッセージを、ページサイズの２倍毎に区切り、複数のリージョン（ターゲット群）を生成する。これにより、各リージョンには隣り合う２つのページが含まれる。 In step A1, the division processing unit 21 divides the processing target message every two times the page size, and generates a plurality of regions (target groups). As a result, each region includes two adjacent pages.

In step A2, the compression processing unit 22 selects one region (target) from the plurality of regions. If the data stored in this target region (region data) is no larger than the page size, the compression processing unit 22 stores that data (compressed data, compressed partial data) in one page of the storage area of the SSD 14.

In step A3, the compression processing unit 22 checks whether the target group contains a region (target) whose region data has not yet been compressed to the page size or less.

ページサイズ以下までの圧縮が完了していないリージョン（ターゲット）がある場合には（ステップＡ３のＹＥＳルート参照）、ステップＡ４に移行する。以下のステップＡ４〜Ａ６の処理において、圧縮処理部２２は処理対象メッセージに対する圧縮（文法圧縮）処理を行なう。 If there is a region (target) that has not been compressed to the page size or less (see YES route in step A3), the process proceeds to step A4. In the processing of steps A4 to A6 below, the compression processing unit 22 performs compression (grammar compression) processing on the processing target message.

ステップＡ４において、圧縮処理部２２は、圧縮が完了していないリージョン（ターゲット）のリージョンデータを圧縮対象データとして、この圧縮対象データにおいて最も多く表れる連続する２文字を置換対象文字列として抽出する。圧縮処理部２２は、この置換対象文字列に対して、置換後の文字（置換後文字）を設定する。本例においては、置換後文字を“X”とする。 In step A4, the compression processing unit 22 extracts region data of a region (target) that has not been compressed as compression target data, and extracts two consecutive characters that appear most frequently in the compression target data as replacement target character strings. The compression processing unit 22 sets a character after replacement (character after replacement) for the replacement target character string. In this example, the replaced character is “X”.

ステップＡ５において、圧縮処理部２２は、置換対象文字列と置換後文字（例えばX）を対応付けた辞書要素（生成規則）を辞書情報Ｄ１に登録する。 In step A5, the compression processing unit 22 registers a dictionary element (generation rule) in which the replacement target character string and the replaced character (for example, X) are associated with each other in the dictionary information D1.

In step A6, the compression processing unit 22 performs compression by replacing the replacement target character string with the post-replacement character “X” throughout the entire compression target message. The process then returns to step A2. A different post-replacement character is used for each replacement as appropriate.

また、ステップＡ３における確認の結果、ページサイズ以下までの圧縮が完了していないリージョンがない場合、すなわち、全てのリージョンのリージョンデータの圧縮済部分データが、いずれもページサイズ以下まで圧縮された場合には（ステップＡ３のＮＯルート参照）、ステップＡ７に移行する。 In addition, as a result of the confirmation in step A3, when there is no region that has not been compressed to the page size or less, that is, when all the compressed partial data of the region data of all regions is compressed to the page size or less. (Refer to the NO route in step A3), the process proceeds to step A7.

ステップＡ７においては、圧縮処理部２２は、辞書情報Ｄ１および各圧縮済部分データを圧縮済メッセージ（符号語）として出力する。すなわち、圧縮対象メッセージを圧縮した圧縮済メッセージが出力される。その後、処理を終了する。 In step A7, the compression processing unit 22 outputs the dictionary information D1 and each compressed partial data as a compressed message (codeword). That is, a compressed message obtained by compressing the compression target message is output. Thereafter, the process ends.
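Steps A1 to A7 can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation: sizes are counted in characters rather than bytes, the message contains no uppercase letters so that “P” to “Z” are free to serve as post-replacement characters, and ties between equally frequent pairs are broken by whichever pair is counted first. The function name `compress_message` is hypothetical.

```python
from collections import Counter

def compress_message(message: str, page_size: int, pages_per_region: int = 2):
    """Sketch of steps A1 to A7; returns (rules, compressed regions)."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    region_size = page_size * pages_per_region
    # A1: split the message into regions of two pages each.
    regions = [message[i:i + region_size]
               for i in range(0, len(message), region_size)]
    rules = []                   # dictionary information D1: (pair, symbol)
    fresh = iter("PQRSTUVWXYZ")  # assumed supply of unused symbols
    while True:
        # A2/A3: regions not yet compressed to the page size or less.
        pending = [r for r in regions if len(r) > page_size]
        if not pending:
            break
        # A4: most frequent pair of adjacent characters in pending regions.
        counts = Counter(a + b for r in pending for a, b in zip(r, r[1:]))
        pair = counts.most_common(1)[0][0]
        symbol = next(fresh)     # raises StopIteration if symbols run out
        # A5: register the generation rule in the dictionary.
        rules.append((pair, symbol))
        # A6: replace in every region, never across a region boundary.
        regions = [r.replace(pair, symbol) for r in regions]
    # A7: output the dictionary and the compressed partial data.
    return rules, regions

message = "abcabc" + "bcbcaa" + "aabbcc" + "abccba"
rules, parts = compress_message(message, page_size=3)
print(rules[:3])  # [('bc', 'P'), ('aP', 'Q'), ('aa', 'R')]
print(parts[:2])  # ['QQ', 'PPR']
```

With a page size of three characters, the first three rules and the results for R1 and R2 coincide with FIG. 3. The later rules depend on how ties are broken, so they need not match the figure. Keeping each region as a separate string is what prevents a replacement from crossing a region boundary.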

次に、実施形態の一例としての情報処理装置１における伸長処理を、図６に示すフローチャート（ステップＢ１〜Ｂ５）に従って説明する。 Next, decompression processing in the information processing apparatus 1 as an example of the embodiment will be described with reference to a flowchart (steps B1 to B5) illustrated in FIG.

以下の処理においては、圧縮されたメッセージ（符号語ともいう）と伸長対象の圧縮済部分データを示すブロック番号が入力され、伸長されたメッセージの一部（２ページ分）が出力される。 In the following processing, a compressed message (also referred to as a code word) and a block number indicating the compressed partial data to be decompressed are input, and a part of the decompressed message (for two pages) is output.

ステップＢ１において、伸長処理部２３は、圧縮済メッセージから伸長対象の圧縮済部分データを取り出し、伸長ターゲットとする。 In step B1, the decompression processing unit 23 extracts the compressed partial data to be decompressed from the compressed message and sets it as the decompression target.

ステップＢ２において、伸長処理部２３は、辞書情報Ｄ１において、伸長ターゲットに対して未処理の辞書要素があるかを確認する。 In step B2, the decompression processing unit 23 checks whether there is an unprocessed dictionary element for the decompression target in the dictionary information D1.

未処理の辞書要素がある場合には（ステップＢ２のＹＥＳルート参照）、ステップＢ３に移行する。 If there is an unprocessed dictionary element (see YES route in step B2), the process proceeds to step B3.

ステップＢ３において、伸長処理部２３は、辞書情報Ｄ１から圧縮状態のデータに対応する、データ伸長にまだ用いていない辞書要素を１つ取り出し、辞書要素Xとする。 In step B3, the decompression processing unit 23 extracts one dictionary element that has not yet been used for data decompression, corresponding to the compressed data, from the dictionary information D1 and sets it as the dictionary element X.

ステップＢ４において、伸長処理部２３は、辞書要素Xを用いて伸長ターゲットの圧縮済部分データの伸長を行なう。その後、ステップＢ２に戻る。 In step B4, the decompression processing unit 23 decompresses the compressed partial data of the decompression target using the dictionary element X. Thereafter, the process returns to step B2.

また、ステップＢ２における確認の結果、未処理の辞書要素がない場合には（ステップＢ２のＮＯルート参照）、ステップＢ５に移行する。 If there is no unprocessed dictionary element as a result of the confirmation in step B2 (see NO route in step B2), the process proceeds to step B5.

ステップＢ５において、伸長処理部２３は伸長済みのデータ（２ページ分）を出力し、その後、処理を終了する。 In step B5, the decompression processing unit 23 outputs decompressed data (for two pages), and thereafter ends the process.
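The loop of steps B1 to B5 can be sketched in Python as follows. The page reader `read_page`, the `symbol=pair;` text format of the dictionary pages, and the choice to apply the dictionary elements from the most recently registered rule back to the first are assumptions of this sketch; the flowchart itself only states that each unused dictionary element is taken out and applied in turn.

```python
def decompress_block(read_page, dict_page_numbers, block_page_number):
    """Steps B1 to B5 for one block; read_page(n) returns page n as text."""
    # Read the dictionary pages and parse the "symbol=pair" entries.
    text = "".join(read_page(n) for n in dict_page_numbers)
    rules = [tuple(entry.split("=")) for entry in text.split(";") if entry]
    # B1: take only the compressed partial data to be decompressed.
    data = read_page(block_page_number)
    # B2 to B4: apply every dictionary element once, newest rule first,
    # so that symbols built from other symbols are expanded in turn.
    for symbol, pair in reversed(rules):
        data = data.replace(symbol, pair)
    # B5: output the restored region data.
    return data

# Pages 0-1 hold the dictionary, pages 2-5 hold S1 to S4 (hypothetical layout).
pages = ["P=bc;Q=aP;R=aa;", "S=Pc;T=cb", "QQ", "PPR", "RbS", "QTa"]
touched = []

def read_page(n):
    touched.append(n)  # record which pages were accessed
    return pages[n]

print(decompress_block(read_page, [0, 1], 4))  # aabbcc
print(touched)                                  # [0, 1, 4]
```

Only the two dictionary pages and the page holding the target block are read; the pages of the other compressed partial data are never touched.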

(C) Effect
As described above, according to the information processing apparatus 1 as an example of the embodiment, the division processing unit 21 divides the compression target message into a plurality of regions each containing one or more pages, and the compression processing unit 22 compresses the data of each region until it is no larger than the page size.

これにより、各リージョンデータを圧縮して生成される圧縮済部分データは、それぞれページ内に収まり、ページ毎にアラインメントされた状態となる。 As a result, the compressed partial data generated by compressing each region data fits within the page and is aligned for each page.

従って、特定の圧縮済部分データを伸長する際には、伸長処理部２３は、伸長対象の圧縮済部分データが格納されたページと、辞書情報Ｄ１に対してのみアクセスすることで、伸長対象の圧縮済部分データを伸長させることができる。 Therefore, when decompressing specific compressed partial data, the decompression processing unit 23 accesses only the page storing the compressed partial data to be decompressed and the dictionary information D1, thereby obtaining the decompression target data. The compressed partial data can be expanded.

これにより、圧縮済メッセージの一部の圧縮済メッセージを伸長する際に、ＳＳＤ１４の記憶領域においてアクセスするページ範囲を少なくすることができ、ＳＳＤ１４の寿命を延ばすことができる。従って、ＳＳＤ１４を効率的に用いることができる。 As a result, when decompressing a part of the compressed message, the page range accessed in the storage area of the SSD 14 can be reduced, and the life of the SSD 14 can be extended. Therefore, the SSD 14 can be used efficiently.

圧縮対象メッセージを文法圧縮手法を用いて圧縮することによっても、ＳＳＤ１４におけるデータの記憶領域を効率的に用いることができる。 The data storage area in the SSD 14 can also be efficiently used by compressing the message to be compressed using the grammar compression technique.

Because the division processing unit 21 partitions the compression target message so that every region contains the same number of pages, the compressed partial data generated by the compression processing unit 22 are of roughly equal size. This also makes it easy to align the compressed partial data to pages.

また、圧縮済メッセージの一部の圧縮済メッセージを伸長する際に、ＳＳＤ１４においてアクセスするページ範囲が少ないので、処理速度を向上させることができる。 Further, when decompressing a part of the compressed message, the page range accessed in the SSD 14 is small, so that the processing speed can be improved.

図７は実施形態の一例としての情報処理装置１におけるデータ圧縮方法を従来の文法圧縮手法と比較して示す図である。 FIG. 7 is a diagram showing a data compression method in the information processing apparatus 1 as an example of the embodiment in comparison with a conventional grammar compression method.

In FIG. 7, reference (a) shows, as a tree structure, an example of data compression by the conventional grammar compression method, and reference (b) shows, as a tree structure, an example of data compression by the compression processing unit 22 of the information processing apparatus 1.

In data compression by the conventional grammar compression method, as indicated by reference (a), compression replacement is repeated until a single character remains (“G” in the example shown in FIG. 7). As a result, there are many dictionary elements and the dictionary information becomes large.

これに対して、本情報処理装置１においては、圧縮処理部２２は、リージョン毎に、各リージョンデータに対して、ページサイズ以下となるまで圧縮置換を繰り返し行ない、ページサイズ以下となった時点で、圧縮置換を終了する。 On the other hand, in the information processing apparatus 1, the compression processing unit 22 repeatedly performs compression replacement for each region until the page size is equal to or smaller than the page size. Then, the compression replacement is terminated.

例えば、図７中において、符号（ｂ）に示す圧縮済部分データＳ１（Ｒ１）は、文字列“QQ”の状態で圧縮が終了している。 For example, in FIG. 7, the compressed partial data S1 (R1) indicated by reference numeral (b) has been compressed in the state of the character string “QQ”.

これにより、従来の文法圧縮手法に比べて、圧縮置換に伴う辞書要素の数が少なく、辞書情報Ｄ１のデータサイズが小さくなる。これにより、圧縮済メッセージを伸長する際に、アクセスするページ範囲をさらに少なくすることができる。 As a result, compared to the conventional grammatical compression method, the number of dictionary elements accompanying compression replacement is small, and the data size of the dictionary information D1 is small. Thereby, when decompressing the compressed message, the page range to be accessed can be further reduced.

(D) Others
The disclosed technology is not limited to the embodiment described above and can be implemented with various modifications without departing from the spirit of the embodiment. The configurations and processes of the embodiment can be selected as needed or combined as appropriate.

例えば、上述した実施形態においては、分割処理部２１が、予め規定されたサイズ（ページ数）のリージョンで圧縮対象メッセージを区切り、圧縮処理部２２が、各リージョンのリージョンデータをページサイズ以下になるまで圧縮を行なっているが、これに限定されるものではない。 For example, in the above-described embodiment, the division processing unit 21 divides the compression target message into regions of a predetermined size (number of pages), and the compression processing unit 22 reduces the region data of each region to a page size or less. However, the present invention is not limited to this.

図８は実施形態の一例としての情報処理装置１の機能構成の変形例を示す図である。 FIG. 8 is a diagram illustrating a modification of the functional configuration of the information processing apparatus 1 as an example of the embodiment.

この図８に示すように、本変形例においては、ＣＰＵ１１は、上述した分割処理部２１，圧縮処理部２２，伸長処理部２３および格納処理部２５としての機能に加えて、確認部２４としての機能を実現する。 As shown in FIG. 8, in this modification, the CPU 11 functions as the confirmation unit 24 in addition to the functions as the division processing unit 21, compression processing unit 22, decompression processing unit 23, and storage processing unit 25 described above. Realize the function.

なお、図中、既述の符号と同一の符号は同様の部分を示しているので、その詳細な説明は省略する。 In the figure, the same reference numerals as those already described indicate the same parts, and detailed description thereof is omitted.

確認部２４は、圧縮処理部２２が圧縮対象メッセージおよび圧縮済メッセージに対して行なう各データ圧縮処理について、それぞれ圧縮率を算出し、算出した圧縮率をＲＡＭ１２等の所定の記憶領域に格納する。 The confirmation unit 24 calculates a compression rate for each data compression process performed by the compression processing unit 22 on the compression target message and the compressed message, and stores the calculated compression rate in a predetermined storage area such as the RAM 12.

そして、確認部２４は、算出した圧縮率を予め規定された閾値と比較することで、圧縮処理部２２によって行われた圧縮処理について、圧縮効果があったか否かの判定を行なう。 Then, the confirmation unit 24 compares the calculated compression rate with a predetermined threshold value to determine whether or not the compression processing performed by the compression processing unit 22 has a compression effect.

例えば、確認部２４は、算出した圧縮率が閾値以上である場合に、評価要件が満たされており圧縮効果があると判断する。一方、確認部２４は、算出した圧縮率が閾値未満である場合に、評価要件が満たされておらず圧縮効果がないと判断する。すなわち、確認部２４は、圧縮処理部２２により行なわれる圧縮処理の評価を行なう。 For example, when the calculated compression rate is equal to or greater than the threshold value, the confirmation unit 24 determines that the evaluation requirement is satisfied and there is a compression effect. On the other hand, when the calculated compression rate is less than the threshold value, the confirmation unit 24 determines that the evaluation requirement is not satisfied and the compression effect is not obtained. That is, the confirmation unit 24 evaluates the compression processing performed by the compression processing unit 22.
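The threshold check of the confirmation unit 24 can be sketched in Python as follows. The specification does not define how the compression rate is calculated. This sketch assumes it is the fraction of the original size that was saved, with the dictionary information counted as part of the compressed size, so that a larger value means better compression. The function names and the page counts in the example are hypothetical.

```python
def compression_rate(original_size: int, dictionary_size: int,
                     data_size: int) -> float:
    """Assumed definition: fraction of the original size that was saved."""
    return 1.0 - (dictionary_size + data_size) / original_size

def has_compression_effect(rate: float, threshold: float) -> bool:
    """Evaluation requirement: the rate is equal to or above the threshold."""
    return rate >= threshold

# Hypothetical sizes in pages: 8-page message, 2-page dictionary, 4 data pages.
rate = compression_rate(original_size=8, dictionary_size=2, data_size=4)
print(rate)                               # 0.25
print(has_compression_effect(rate, 0.2))  # True
```

Under this definition, a compression pass that enlarges the dictionary by more than it shrinks the data lowers the rate and can fall below the threshold.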

図９は本変形例における圧縮対象メッセージに対する圧縮処理および圧縮効果を説明するための図である。 FIG. 9 is a diagram for explaining a compression process and a compression effect for a compression target message in the present modification.

In FIG. 9, reference (a) indicates a message divided into a plurality of regions by the division processing unit 21; in the example shown in FIG. 9, the division processing unit 21 divides the message every two consecutive pages. Hereinafter, the regions are denoted by reference signs R1 to R4. In FIG. 9, as indicated by references (b) to (d), the compressed partial data are denoted by reference signs S01 to S07 and the dictionary information by reference sign D1.

図９に示す例において、リージョンＲ１にはリージョンデータとしてページＰ１，Ｐ２のデータが含まれている。同様に、リージョンＲ２にはページＰ３，Ｐ４のデータが、リージョンＲ３にはページＰ５，Ｐ６のデータが、リージョンＲ４にはページＰ７，Ｐ８のデータが、それぞれリージョンデータとして含まれている。 In the example shown in FIG. 9, the data of pages P1 and P2 is included as region data in region R1. Similarly, region R2 includes data of pages P3 and P4, region R3 includes data of pages P5 and P6, and region R4 includes data of pages P7 and P8 as region data.

図９中において、符号（ｂ）は、圧縮処理部２２が、符号（ａ）に示す区画済みメッセージを圧縮して生成した圧縮済メッセージＥ１−１を示す。 In FIG. 9, a code (b) indicates a compressed message E1-1 generated by the compression processing unit 22 by compressing the partitioned message indicated by the code (a).

圧縮処理部２２は、圧縮対象メッセージのリージョンＲ１〜Ｒ４の各リージョンデータをページサイズ以下となるまで圧縮することで、圧縮済メッセージＥ１−１の圧縮済部分データＳ０１〜Ｓ０４をそれぞれ生成する。圧縮処理の対象である各リージョンのデータ（リージョンデータ）が圧縮処理部２２による圧縮ターゲットとなる。 The compression processing unit 22 generates compressed partial data S01 to S04 of the compressed message E1-1 by compressing each region data of the regions R1 to R4 of the compression target message until the page size is equal to or smaller than the page size. The data (region data) of each region that is the target of compression processing becomes a compression target by the compression processing unit 22.

本変形例において、圧縮処理部２２は、圧縮済メッセージに対して更なる圧縮を行なう、追加圧縮機能を有する。すなわち、圧縮処理部２２は、圧縮済メッセージにおいて隣接する複数の圧縮済部分データをまとめて、ページサイズ以下となるまで圧縮（追加圧縮）を行なう。 In this modification, the compression processing unit 22 has an additional compression function for performing further compression on the compressed message. That is, the compression processing unit 22 collects a plurality of adjacent compressed partial data in the compressed message and performs compression (additional compression) until the page size is equal to or smaller than the page size.

図９中において、符号（ｃ）は、圧縮処理部２２が、符号（ｂ）に示す圧縮済メッセージＥ１−１を更に圧縮して生成した圧縮済メッセージＥ１−２を示す。 In FIG. 9, the code (c) indicates a compressed message E1-2 generated by the compression processing unit 22 by further compressing the compressed message E1-1 shown in the code (b).

For the compressed message E1-2 indicated by reference (c), the compression processing unit 22 combines the two consecutive compressed partial data S01 and S02 of the compressed message E1-1 of reference (b) into one compression target. The compression processing unit 22 then compresses this compression target until it is no larger than the page size, thereby generating the compressed partial data S05.

同様に、圧縮処理部２２は、符号（ｂ）の圧縮済メッセージＥ１−１において連続する２つの圧縮済部分データＳ０３，Ｓ０４をまとめて圧縮ターゲットとする。そして、圧縮処理部２２は、この生成した圧縮ターゲットをページサイズ以下となるまで圧縮することで、圧縮済部分データＳ０６を生成している。 Similarly, the compression processing unit 22 collectively sets two compressed partial data S03 and S04 in the compressed message E1-1 with the code (b) as a compression target. Then, the compression processing unit 22 generates the compressed partial data S06 by compressing the generated compression target until it becomes equal to or smaller than the page size.

圧縮処理部２２は、確認部２４が圧縮効果があると判断した場合に、その圧縮後の圧縮済メッセージに対して追加圧縮を行なう。また、圧縮処理部２２は、確認部２４が圧縮効果がないと判断した場合には、その圧縮後の圧縮済メッセージに対して追加圧縮を行なわない。すなわち、追加圧縮の実施を阻止する。 When the confirmation unit 24 determines that there is a compression effect, the compression processing unit 22 performs additional compression on the compressed message after the compression. On the other hand, when the confirmation unit 24 determines that there is no compression effect, the compression processing unit 22 does not perform additional compression on the compressed message after the compression. That is, additional compression is prevented from being performed.

図９中において、符号（ｄ）は、圧縮処理部２２が、符号（ｃ）に示す圧縮済メッセージＥ１−２を更に圧縮して生成した圧縮済メッセージＥ１−３を示す。 In FIG. 9, a code (d) indicates a compressed message E1-3 generated by the compression processing unit 22 by further compressing the compressed message E1-2 shown in the code (c).

The compression processing unit 22 generates a new compression target by combining the two consecutive compressed partial data S05 and S06 of the compressed message E1-2 of reference (c), and compresses this compression target until it is no larger than the page size. The compression processing unit 22 thereby generates the compressed partial data S07 of the compressed message E1-3 indicated by reference (d).

In the example shown in FIG. 9, the size of the dictionary information D1 increases when the compressed message E1-2 is compressed to generate the compressed message E1-3. As a result, the compression rate of the compressed message E1-3 is below the predetermined threshold, and the confirmation unit 24 determines that there is no compression effect.

圧縮処理部２２は、確認部２４が圧縮効果がないと判断した場合には追加圧縮を行なわない。 The compression processing unit 22 does not perform additional compression when the confirmation unit 24 determines that there is no compression effect.

圧縮処理部２２は、確認部２４が圧縮効果があると判断した場合には、先に圧縮を行なった圧縮済メッセージに対する更なる圧縮を行なう。すなわち、先に圧縮を行なった圧縮済メッセージにおいて、連続する２つの圧縮済部分データをまとめて圧縮対象データとする。 When the confirmation unit 24 determines that there is a compression effect, the compression processing unit 22 performs further compression on the compressed message that has been previously compressed. That is, in the compressed message that has been previously compressed, two consecutive compressed partial data are collected as compression target data.

そして、圧縮処理部２２は、この圧縮対象データにおいて最も多く表れる連続する２文字を置換対象文字列として抽出し、この置換対象文字列に対して置換後文字を設定する。そして、圧縮処理部２２は、圧縮対象メッセージ全体に対して、置換対象文字列を置換後文字を用いて置換することで、圧縮処理を行なう。 Then, the compression processing unit 22 extracts two consecutive characters appearing most in the compression target data as a replacement target character string, and sets a post-replacement character for the replacement target character string. Then, the compression processing unit 22 performs compression processing on the entire compression target message by replacing the replacement target character string with the post-replacement character.

一方、確認部２４が圧縮効果がないと判断した場合には、圧縮処理部２２は、先に圧縮を行なった圧縮済メッセージに対する更なる圧縮を行なわない。 On the other hand, when the confirmation unit 24 determines that there is no compression effect, the compression processing unit 22 does not perform further compression on the compressed message that has been previously compressed.

上述の如く構成された実施形態の変形例としての情報処理装置１におけるメッセージの圧縮処理を、図１０に示すフローチャート（ステップＣ１〜Ｃ１０）に従って説明する。 A message compression process in the information processing apparatus 1 as a modified example of the embodiment configured as described above will be described with reference to a flowchart (steps C1 to C10) illustrated in FIG.

ステップＣ１において、分割処理部２１が、処理対象メッセージを、ページサイズの２倍毎に区切り、複数のリージョン（ターゲット群）を生成する。 In step C1, the division processing unit 21 divides the processing target message every two times the page size, and generates a plurality of regions (target groups).

また、圧縮処理部２２は中身が空の辞書情報Ｄ１を用意する。圧縮対象メッセージを符号語として記録する。 The compression processing unit 22 prepares dictionary information D1 that is empty. The message to be compressed is recorded as a code word.

In step C2, the compression processing unit 22 selects one region (target) from the plurality of regions. If the data stored in this target region (region data) is no larger than the page size, the compression processing unit 22 stores that data (compressed data, compressed partial data) in one page of the storage area of the SSD 14.

In step C3, the compression processing unit 22 checks whether the target group contains a region (target) whose region data has not yet been compressed to the page size or less.

ページサイズ以下までの圧縮が完了していないリージョン（ターゲット）がある場合には（ステップＣ３のＹＥＳルート参照）、ステップＣ４に移行する。以下のステップＣ４〜Ｃ６の処理において、圧縮処理部２２は処理対象メッセージに対する圧縮（文法圧縮）処理を行なう。 If there is a region (target) that has not been compressed to the page size or less (see YES route in step C3), the process proceeds to step C4. In the processing of steps C4 to C6 below, the compression processing unit 22 performs compression (grammar compression) processing on the processing target message.

ステップＣ４において、圧縮処理部２２は、圧縮が完了していないリージョン（ターゲット）のリージョンデータを圧縮対象データとして、この圧縮対象データにおいて最も多く表れる連続する２文字を置換対象文字列として抽出する。圧縮処理部２２は、この置換対象文字列に対して、置換後の文字（置換後文字）を設定する。本例においては、置換後文字を“X”とする。 In step C4, the compression processing unit 22 extracts region data of a region (target) that has not been compressed as compression target data, and extracts two consecutive characters that appear most frequently in the compression target data as replacement target character strings. The compression processing unit 22 sets a character after replacement (character after replacement) for the replacement target character string. In this example, the replaced character is “X”.

ステップＣ５において、圧縮処理部２２は、置換対象文字列と置換後文字（例えばX）を対応付けた辞書要素（生成規則）を辞書情報Ｄ１に登録する。 In step C5, the compression processing unit 22 registers a dictionary element (generation rule) in which the replacement target character string and the replaced character (for example, X) are associated with each other in the dictionary information D1.

ステップＣ６において、圧縮処理部２２は、圧縮対象メッセージ全体に対して、置換対象文字列を置換後文字“X”を用いて置換することで、圧縮処理を行なう。その後、処理はステップＣ２に戻る。なお、置換後文字は適宜変更して実施される。 In step C6, the compression processing unit 22 performs compression by replacing the replacement target character string with the replacement character “X” throughout the entire compression target message. The process then returns to step C2. Note that the replacement character is changed as appropriate in practice.
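One round of steps C4 to C6 can be sketched as follows. This is only an illustrative sketch, not the implementation disclosed in the publication: the function names, the representation of each region as a separate Python list of symbols, and the choice to count overlapping pairs are all assumptions made for the example.

```python
from collections import Counter


def most_frequent_pair(symbols):
    """Step C4 (sketch): return the adjacent pair seen most often, or None.

    Overlapping occurrences are counted (a simplification), and a pair
    that occurs only once is not considered worth replacing.
    """
    counts = Counter(zip(symbols, symbols[1:]))
    if not counts:
        return None
    pair, count = counts.most_common(1)[0]
    return pair if count >= 2 else None


def replace_pair(symbols, pair, new_symbol):
    """Step C6 (sketch): replace non-overlapping occurrences, left to right."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(new_symbol)
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def compress_round(regions, target_index, dictionary, new_symbol):
    """Run steps C4 to C6 once; return False if no pair repeats in the target."""
    pair = most_frequent_pair(regions[target_index])
    if pair is None:
        return False
    dictionary[new_symbol] = pair  # step C5: register the dictionary element
    # Step C6: apply the rule to the whole message, region by region, so that
    # region boundaries stay fixed (an assumption of this sketch).
    for k, region in enumerate(regions):
        regions[k] = replace_pair(region, pair, new_symbol)
    return True
```

In this sketch the pair is chosen from the target region only, but the replacement is applied to every region, which mirrors the statement that step C6 operates on the entire compression target message.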

また、ステップＣ３における確認の結果、ページサイズ以下までの圧縮が完了していないリージョンがない場合、すなわち、全てのリージョンのリージョンデータの圧縮済部分データが、いずれもページサイズ以下まで圧縮された場合には（ステップＣ３のＮＯルート参照）、ステップＣ７に移行する。 If the check in step C3 finds no region whose compression to the page size or less is incomplete, that is, if the compressed partial data of the region data of every region has been compressed to the page size or less (see NO route in step C3), the process proceeds to step C7.

ステップＣ７において、確認部２４が、圧縮処理部２２が行なったデータ圧縮処理により生成された圧縮済メッセージについて圧縮率を算出し、算出した圧縮率を予め規定された閾値と比較して、圧縮効果があったか否かの判定を行なう。 In step C7, the confirmation unit 24 calculates the compression rate of the compressed message generated by the data compression processing performed by the compression processing unit 22, compares the calculated compression rate with a predefined threshold, and determines whether or not the compression was effective.

例えば、確認部２４は、算出した圧縮率が閾値以上である場合に、圧縮効果があると判断する。また、確認部２４は、算出した圧縮率が閾値未満である場合に、圧縮効果がないと判断する。すなわち、確認部２４は、圧縮処理部２２により行なわれる圧縮処理の評価を行なう。 For example, the confirmation unit 24 determines that there is a compression effect when the calculated compression rate is equal to or greater than a threshold value. The confirmation unit 24 determines that there is no compression effect when the calculated compression rate is less than the threshold value. That is, the confirmation unit 24 evaluates the compression processing performed by the compression processing unit 22.
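The threshold comparison of step C7 might look like the following sketch. The text does not define how the compression rate is computed, so the definition used here (the fraction of the original size that was removed) is an assumption, as are the function name and parameters.

```python
def has_compression_effect(original_size, compressed_size, threshold):
    """Step C7 (sketch): judge whether the last compression pass was effective.

    The compression rate is assumed to be the fraction of the original size
    that was removed; the publication itself does not fix a formula.
    """
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    rate = 1.0 - compressed_size / original_size
    # A rate at or above the threshold counts as effective (YES route);
    # a rate below the threshold counts as not effective (NO route).
    return rate >= threshold
```

With this definition a larger value means stronger compression, which is consistent with the rule that a rate at or above the threshold indicates a compression effect.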

確認部２４による評価の結果、圧縮効果があると判断された場合には（ステップＣ７のＹＥＳルート参照）、ステップＣ８に移行する。 As a result of the evaluation by the confirmation unit 24, when it is determined that there is a compression effect (see YES route of Step C7), the process proceeds to Step C8.

ステップＣ８において、圧縮処理部２２は、圧縮に使用した辞書要素を登録した辞書情報Ｄ１と、生成した圧縮済メッセージとを符号語として、ＲＡＭ１２等の所定の記憶領域に記録する。 In step C8, the compression processing unit 22 records the dictionary information D1 in which the dictionary elements used for compression are registered and the generated compressed message as codewords in a predetermined storage area such as the RAM 12.

ステップＣ９において、圧縮処理部２２は、圧縮済メッセージにおいて連続する２つの圧縮済部分データどうしをまとめることで、１つ以上の圧縮ターゲットを生成する。その後、ステップＣ２に戻る。 In step C9, the compression processing unit 22 generates one or more compression targets by combining two consecutive compressed partial data in the compressed message. Thereafter, the process returns to step C2.
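The grouping in step C9 can be illustrated with a short sketch. The function name is hypothetical, and leaving an unpaired final piece as its own target is an assumption, since the text does not say how an odd number of pieces is handled.

```python
def merge_adjacent(parts):
    """Step C9 (sketch): join each two consecutive compressed pieces.

    Each piece is a list of symbols. If the number of pieces is odd, the
    last piece is carried over unchanged (an assumption of this sketch).
    """
    merged = []
    for i in range(0, len(parts), 2):
        merged.append([symbol for part in parts[i:i + 2] for symbol in part])
    return merged
```

Each merged list then becomes a new compression target for the next pass through steps C2 to C6.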

また、ステップＣ７における、確認部２４による評価の結果、圧縮効果がないと判断された場合には（ステップＣ７のＮＯルート参照）、ステップＣ１０に移行する。 Moreover, when it is determined that there is no compression effect as a result of the evaluation by the confirmation unit 24 in Step C7 (see NO route in Step C7), the process proceeds to Step C10.

ステップＣ１０においては、圧縮処理部２２は、辞書情報Ｄ１および各圧縮済部分データを圧縮済メッセージ（符号語）として出力する。すなわち、圧縮対象メッセージを圧縮した圧縮済メッセージが出力される。その後、処理を終了する。 In step C10, the compression processing unit 22 outputs the dictionary information D1 and each compressed partial data as a compressed message (code word). That is, a compressed message obtained by compressing the compression target message is output. Thereafter, the process ends.

このように、実施形態の変形例としての情報処理装置１によれば、上述した実施形態と同様の作用効果を得られる他、リージョンのサイズ、すなわち、リージョンに含めるページのサイズを予め規定する必要がなく利便性が高い。 As described above, the information processing apparatus 1 as a modification of the embodiment provides the same operational effects as the embodiment described above and, in addition, is highly convenient because there is no need to define in advance the size of a region, that is, the size of the pages to be included in a region.

すなわち、予めリージョンのサイズを規定することなく、ページ毎にアラインメントされた状態の圧縮済部分データを生成することができる。 That is, compressed partial data aligned to each page can be generated without defining the region size in advance.
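Page-aligned placement of the compressed partial data can be sketched as below. The 4096-byte page size, the zero padding, and the function name are assumptions for illustration; the publication only requires that each piece fit within one access unit (page).

```python
PAGE_SIZE = 4096  # assumed page size; the actual access unit depends on the device


def layout_pages(parts, page_size=PAGE_SIZE):
    """Place each compressed piece at the start of its own page (sketch).

    Returns the padded storage image and the byte offset of each piece.
    """
    image = bytearray()
    offsets = []
    for part in parts:
        if len(part) > page_size:
            raise ValueError("piece does not fit in one page")
        offsets.append(len(image))
        # Zero padding up to the page boundary is an assumption of this sketch.
        image += part + bytes(page_size - len(part))
    return bytes(image), offsets
```

Because every offset is a multiple of the page size, any single piece can be fetched with one page read.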

（Ｅ）付記 (E) Supplementary Notes
以上の実施形態に関し、更に以下の付記を開示する。 The following supplementary notes are further disclosed regarding the above embodiment.

（付記１）
圧縮対象文字列データを、複数の部分領域に区画する分割処理部と、
前記複数の部分領域のそれぞれに含まれる領域データを、それぞれアクセス単位サイズ以下まで圧縮して圧縮済部分データを生成する圧縮処理部と、
生成した前記圧縮済部分データを、半導体記憶装置におけるアクセス単位領域に格納させる格納処理部と
を備えることを特徴とする、情報処理装置。
(Appendix 1)
An information processing apparatus comprising:
a division processing unit that divides compression target character string data into a plurality of partial areas;
a compression processing unit that generates compressed partial data by compressing the area data included in each of the plurality of partial areas to an access unit size or less; and
a storage processing unit that stores the generated compressed partial data in an access unit area in a semiconductor storage device.

（付記２）
前記部分領域が、隣り合う複数の前記アクセス単位領域を含む
ことを特徴とする、付記１記載の情報処理装置。
(Appendix 2)
The information processing apparatus according to appendix 1, wherein the partial area includes a plurality of adjacent access unit areas.

（付記３）
前記圧縮処理部が、文法圧縮処理を行なうことで前記圧縮済部分データを生成する
ことを特徴とする、付記１または２記載の情報処理装置。
(Appendix 3)
The information processing apparatus according to appendix 1 or 2, wherein the compression processing unit generates the compressed partial data by performing a grammar compression process.

（付記４）
辞書情報を用いて伸長対象の圧縮済部分データを伸長して、前記圧縮対象文字列データの一部を復元する伸長処理部
を備えることを特徴とする、付記１〜３のいずれか１項に記載の情報処理装置。
(Appendix 4)
The information processing apparatus according to any one of appendices 1 to 3, further comprising a decompression processing unit that decompresses compressed partial data to be decompressed using dictionary information and restores a part of the compression target character string data.

（付記５）
前記圧縮処理部が、生成した前記圧縮済部分データを備える圧縮済文字列データの圧縮率が評価要件を満たしている場合に、連続する複数の圧縮済部分データをアクセス単位サイズ（ページ単位）以下まで圧縮する
ことを特徴とする、付記１〜４のいずれか１項に記載の情報処理装置。
(Appendix 5)
The information processing apparatus according to any one of appendices 1 to 4, wherein, when the compression rate of the compressed character string data comprising the generated compressed partial data satisfies an evaluation requirement, the compression processing unit compresses a plurality of consecutive pieces of compressed partial data to the access unit size (page unit) or less.

（付記６）
圧縮対象文字列データを、複数の部分領域に区画し、
前記複数の部分領域のそれぞれに含まれる領域データを、それぞれアクセス単位サイズ以下まで圧縮して圧縮済部分データを生成し、
生成した前記圧縮済部分データを、半導体記憶装置におけるアクセス単位領域に格納させる
処理をコンピュータに実行させる情報処理プログラム。
(Appendix 6)
An information processing program that causes a computer to execute processing comprising:
dividing compression target character string data into a plurality of partial areas;
compressing the area data included in each of the plurality of partial areas to an access unit size or less to generate compressed partial data; and
storing the generated compressed partial data in an access unit area in a semiconductor memory device.

（付記７）
前記部分領域が、隣り合う複数の前記アクセス単位領域を含む
ことを特徴とする、付記６記載の情報処理プログラム。
(Appendix 7)
The information processing program according to appendix 6, wherein the partial area includes a plurality of adjacent access unit areas.

（付記８）
文法圧縮処理を行なうことで前記圧縮済部分データを生成する
ことを特徴とする、付記６または７記載の情報処理プログラム。
(Appendix 8)
The information processing program according to appendix 6 or 7, wherein the compressed partial data is generated by performing grammar compression processing.

（付記９）
辞書情報を用いて伸長対象の圧縮済部分データを伸長して、前記圧縮対象文字列データの一部を復元する
処理を前記コンピュータに実行させる、付記６〜８のいずれか１項に記載の情報処理プログラム。
(Appendix 9)
The information processing program according to any one of appendices 6 to 8, further causing the computer to execute processing of decompressing compressed partial data to be decompressed using dictionary information and restoring a part of the compression target character string data.

（付記１０）
生成した前記圧縮済部分データを備える圧縮済文字列データの圧縮率が評価要件を満たしている場合に、連続する複数の圧縮済部分データをアクセス単位サイズ以下まで圧縮する
処理を前記コンピュータに実行させる、付記６〜９のいずれか１項に記載の情報処理プログラム。
(Appendix 10)
The information processing program according to any one of appendices 6 to 9, further causing the computer to execute processing of compressing a plurality of consecutive pieces of compressed partial data to the access unit size or less when the compression rate of the compressed character string data comprising the generated compressed partial data satisfies an evaluation requirement.

（付記１１）
圧縮対象文字列データを、複数の部分領域に区画する処理と、
前記複数の部分領域のそれぞれに含まれる領域データを、それぞれアクセス単位サイズ以下まで圧縮して圧縮済部分データを生成する処理と、
生成した前記圧縮済部分データを、半導体記憶装置におけるアクセス単位領域に格納させる処理と
を備えることを特徴とする、情報処理方法。
(Appendix 11)
An information processing method comprising:
a process of dividing compression target character string data into a plurality of partial areas;
a process of generating compressed partial data by compressing the area data included in each of the plurality of partial areas to an access unit size or less; and
a process of storing the generated compressed partial data in an access unit area in a semiconductor memory device.

（付記１２）
前記部分領域が、隣り合う複数の前記アクセス単位領域を含む
ことを特徴とする、付記１１記載の情報処理方法。
(Appendix 12)
The information processing method according to appendix 11, wherein the partial area includes a plurality of adjacent access unit areas.

（付記１３）
文法圧縮処理を行なうことで前記圧縮済部分データを生成する
ことを特徴とする、付記１１または１２記載の情報処理方法。
(Appendix 13)
The information processing method according to appendix 11 or 12, wherein the compressed partial data is generated by performing grammar compression processing.

（付記１４）
辞書情報を用いて伸長対象の圧縮済部分データを伸長して、前記圧縮対象文字列データの一部を復元する
処理を備えることを特徴とする、付記１１〜１３のいずれか１項に記載の情報処理方法。
(Appendix 14)
The information processing method according to any one of appendices 11 to 13, further comprising a process of decompressing compressed partial data to be decompressed using dictionary information and restoring a part of the compression target character string data.

（付記１５）
生成した前記圧縮済部分データを備える圧縮済文字列データの圧縮率が評価要件を満たしている場合に、連続する複数の圧縮済部分データをアクセス単位サイズ以下まで圧縮する
処理を備えることを特徴とする、付記１１〜１４のいずれか１項に記載の情報処理方法。
(Appendix 15)
The information processing method according to any one of appendices 11 to 14, further comprising a process of compressing a plurality of consecutive pieces of compressed partial data to the access unit size or less when the compression rate of the compressed character string data comprising the generated compressed partial data satisfies an evaluation requirement.

１情報処理装置
１１ＣＰＵ
１２ＲＡＭ
１３ＲＯＭ
１４ＳＳＤ
１５入力インタフェース
１５ａキーボード
１５ｂマウス
１６光学ドライブ装置
１６ａ光ディスク
１７機器接続インタフェース
１７ａメモリ装置
１７ｂメモリリーダライタ
１７ｃメモリカード
１８ネットワークインタフェース
１９バス
２０グラフィック処理装置
２０ａモニタ
２１分割処理部
２２圧縮処理部
２３伸長処理部
２４確認部
２５格納処理部
DESCRIPTION OF SYMBOLS
1 Information processing apparatus
11 CPU
12 RAM
13 ROM
14 SSD
15 Input interface
15a Keyboard
15b Mouse
16 Optical drive device
16a Optical disk
17 Device connection interface
17a Memory device
17b Memory reader/writer
17c Memory card
18 Network interface
19 Bus
20 Graphic processing device
20a Monitor
21 Division processing unit
22 Compression processing unit
23 Decompression processing unit
24 Confirmation unit
25 Storage processing unit

Claims

An information processing apparatus comprising:
a division processing unit that divides compression target character string data into a plurality of partial areas;
a compression processing unit that generates compressed partial data by compressing the area data included in each of the plurality of partial areas to an access unit size or less; and
a storage processing unit that stores the generated compressed partial data in an access unit area in a semiconductor storage device.

The information processing apparatus according to claim 1, wherein the partial area includes a plurality of adjacent access unit areas.

The information processing apparatus according to claim 1, wherein the compression processing unit generates the compressed partial data by performing a grammar compression process.

The information processing apparatus according to claim 1, further comprising a decompression processing unit that decompresses compressed partial data to be decompressed using dictionary information and restores a part of the compression target character string data.

The information processing apparatus according to claim 1, wherein the compression processing unit compresses a plurality of consecutive pieces of compressed partial data to the access unit size or less when the compression rate of the compressed character string data comprising the generated compressed partial data satisfies an evaluation requirement.

An information processing program that causes a computer to execute processing comprising:
dividing compression target character string data into a plurality of partial areas;
compressing the area data included in each of the plurality of partial areas to an access unit size or less to generate compressed partial data; and
storing the generated compressed partial data in an access unit area in a semiconductor memory device.

An information processing method comprising:
a process of dividing compression target character string data into a plurality of partial areas;
a process of generating compressed partial data by compressing the area data included in each of the plurality of partial areas to an access unit size or less; and
a process of storing the generated compressed partial data in an access unit area in a semiconductor memory device.