JP2007088962A

JP2007088962A - Ordinal data compressing method, ordinal data decompressing method, ordinal data processing program, ordinal data compressing apparatus, ordinal data decompressing apparatus, and ordinal data processing system

Info

Publication number: JP2007088962A
Application number: JP2005277083A
Authority: JP
Inventors: Yasuhiro Fujiwara; 靖宏藤原; Yasushi Sakurai; 保志櫻井; Masashi Yamamuro; 雅司山室
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2005-09-26
Filing date: 2005-09-26
Publication date: 2007-04-05

Abstract

PROBLEM TO BE SOLVED: To compress ordinal data, such as time-series data, at a high compression ratio.

SOLUTION: In an ordinal data compressing method for compressing ordinal data whose order is specified, a computer carries out: a compression pre-processing step of segmenting a plurality of consecutive pieces of ordinal data from the ordinal data whose order is specified, dividing each of the plurality of pieces of ordinal data into a plurality of data elements, and creating a data element string by rearranging the data elements so that data elements at the same bit position in the ordinal data are consecutive; a data compression step of compressing the data element string created by the compression pre-processing step; and a data storage step of storing, in a storage means, the data element string compressed by the data compression step.

COPYRIGHT: (C)2007, JPO&INPIT

Description

The present invention relates to an order data compression method, an order data decompression method, an order data processing program, an order data compression device, an order data decompression device, and an order data processing system.

With the development of Internet technology and sensor technology, time-series sensor data collection systems, which collect time-series data generated by sensors having a communication function, are becoming a reality. For example, a moving-object position management system senses a large number of traveling vehicles and manages logs of their movements. A seismic monitoring system collects information from a large number of distributed seismometers and records, for example, how the ground shakes when an earthquake occurs.

Because sensed time-series data flows in continuously, it must be stored on disk without loss. In a time-series sensor data collection system, sensors are attached to all kinds of objects and operated over long periods, so the amount of time-series data they generate becomes enormous. It is therefore desirable to compress the time-series data before storing it on disk.

Conventional lossless compression methods include Huffman coding (Non-Patent Document 1) and block sorting (Non-Patent Document 2). In Huffman coding, at compression time a Huffman tree is generated according to the frequency of occurrence of the data to be compressed, and codes are assigned to the data according to the Huffman tree. At decompression time, a Huffman tree is generated in the same way and used to decode the compressed data. In block sorting, at compression time the data to be compressed is rotated to create a plurality of data strings, which are then sorted. The sorted data strings become the data to be compressed.

Non-Patent Document 1: Huffman, D. A., "A method for the construction of minimum redundancy codes", In Proc. I.R.E, 1952
Non-Patent Document 2: M. Burrows and D. J. Wheeler, "A block-sorting lossless data compression algorithm", ISRC Research Report, 1994

However, simply applying conventional lossless compression methods as they are cannot raise the compression ratio of time-series data sufficiently. For example, Huffman coding counts how often each data value occurs, but time-series data consists of discrete values that can range widely, so the resulting statistics show many different values each occurring at low frequency. As a result, the compression ratio is low.
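The frequency problem described above can be illustrated with a small sketch. The readings below are hypothetical values chosen only for illustration; they are not taken from the patent.

```python
from collections import Counter

# Hypothetical, slowly rising sensor readings (illustrative values only).
readings = [100000 + 3 * t for t in range(16)]

# Count how often each whole value occurs, as a frequency-based coder would.
value_counts = Counter(readings)

distinct_values = len(value_counts)
highest_frequency = max(value_counts.values())
print(distinct_values, highest_frequency)
```

Every reading is distinct, so no symbol occurs more than once and a frequency-based code has nothing to exploit when whole values are treated as symbols.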

Accordingly, the main object of the present invention is to solve the above problem and to compress data whose order is defined, such as time-series data, at a high compression ratio.

In order to solve the above problem, the present invention is an order data compression method for compressing order data in which an order is defined, wherein a computer executes: a pre-compression processing procedure of cutting out a plurality of consecutive pieces of order data from the order data in which the order is defined, dividing each of the plurality of pieces of order data into a plurality of data elements, and rearranging the data elements so that data elements at the same bit position in the order data are consecutive, thereby creating a data element sequence; a data compression procedure of compressing the data element sequence created by the pre-compression processing procedure; and a data storage procedure of storing the data element sequence compressed by the data compression procedure in storage means.

As a result, a data element sequence in which adjacent values are often the same is created, so the order data can be compressed at a high compression ratio.

In the present invention, the data compression procedure compresses the data element sequence by run-length encoding.

As a result, compressing a data element sequence in which adjacent values are often the same yields long runs, so a high compression ratio can be achieved.
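As a minimal sketch of run-length encoding, assuming a simple (value, run length) pair representation that the patent does not itself specify, the function name `rle_encode` is hypothetical:

```python
def rle_encode(data: bytes) -> list[tuple[int, int]]:
    """Encode a byte string as a list of (byte value, run length) pairs."""
    runs: list[tuple[int, int]] = []
    for b in data:
        if runs and runs[-1][0] == b:
            # Same value as the previous byte: extend the current run.
            runs[-1] = (b, runs[-1][1] + 1)
        else:
            # Different value: start a new run of length 1.
            runs.append((b, 1))
    return runs
```

The longer the runs in the input, the fewer pairs are emitted, which is why a sequence with many equal neighbors compresses well under this scheme.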

In the present invention, the data storage procedure stores the data element sequence in a sequential file according to the order of the data element sequence.

This enables high-speed sequential access.

The present invention is also an order data decompression method for decompressing order data in which an order is defined, wherein a computer executes: a data reading procedure of reading, from storage means, a compressed data element sequence in which data elements at the same bit position in the order data are consecutive; a data decompression procedure of decompressing the data element sequence read by the data reading procedure; and a post-decompression processing procedure of integrating the data element sequence decompressed by the data decompression procedure, which constitutes the order data, into the order data.

As a result, by reading a data element sequence in which adjacent values are often the same, order data compressed at a high compression ratio can be decompressed.

In the present invention, the data decompression procedure decompresses the data element sequence using run-length coding.

As a result, because a data element sequence in which adjacent values are often the same has been compressed into long runs, data compressed at a high compression ratio can be decompressed.
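The decoding side can be sketched as follows, assuming the compressed form is a list of (value, run length) pairs. That representation and the name `rle_decode` are illustrative assumptions, not the patent's format.

```python
def rle_decode(runs: list[tuple[int, int]]) -> bytes:
    """Expand (byte value, run length) pairs back into the original byte string."""
    out = bytearray()
    for value, length in runs:
        # Repeat each byte value as many times as its run length.
        out.extend(bytes([value]) * length)
    return bytes(out)
```

Decoding is a single pass over the pairs, so a long run costs no more to read than a short one.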

In the present invention, the data reading procedure reads the data element sequence, according to the order of the data element sequence, from a sequential file in which the data element sequence is stored.

The present invention is also an order data processing program for causing a computer to execute the order data compression method or the order data decompression method.

The present invention is also an order data compression device for compressing order data in which an order is defined, comprising: a pre-compression processing unit that cuts out a plurality of consecutive pieces of order data from the order data in which the order is defined, divides each of the plurality of pieces of order data into a plurality of data elements, and rearranges the data elements so that data elements at the same bit position in the order data are consecutive, thereby creating a data element sequence; a data compression unit that compresses the data element sequence created by the pre-compression processing unit; and a data storage unit that stores the data element sequence compressed by the data compression unit in storage means.

The present invention is also an order data decompression device for decompressing order data in which an order is defined, comprising: a data reading unit that reads, from storage means, a compressed data element sequence in which data elements at the same bit position in the order data are consecutive; a data decompression unit that decompresses the data element sequence read by the data reading unit; and a post-decompression processing unit that integrates the data element sequence decompressed by the data decompression unit, which constitutes the order data, into the order data.

The present invention is also an order data processing system comprising the order data compression device and the order data decompression device.

According to the present invention, order data can be compressed at a high compression ratio. Because the amount of data is reduced, the I/O (Input/Output) cost of accessing the disk is also reduced, as is the disk capacity required for storage.

The best mode for carrying out the present invention is described below.

FIG. 1A is a configuration diagram showing the order data processing apparatus 1. The order data processing apparatus 1 is configured as a computer including at least a memory, serving as storage means used when performing arithmetic processing, and an arithmetic processing unit that performs the arithmetic processing. The memory is constituted by a RAM (Random Access Memory) or the like. The arithmetic processing is realized by the arithmetic processing unit, constituted by a CPU (Central Processing Unit), executing a program in the memory. The following description uses time-series data as an example of order data, but it applies equally to any order data.

The order data processing apparatus 1 includes a pre-compression processing unit 10, a data compression unit 12, a data storage unit 14, a data decompression unit 16, and a post-decompression processing unit 18. The order data processing apparatus 1 may have a stand-alone configuration as shown in FIG. 1A or a network system configuration as shown in FIG. 1B. FIG. 1B shows a configuration in which the order data processing apparatus 1 is distributed between an order data compression apparatus 2, which handles compression, and an order data decompression apparatus 3, which handles decompression; the two are connected to each other by a network, and compressed data is communicated via communication units 20.

The pre-compression processing unit 10 divides and rearranges the data values of the time-series data that flows in continuously. The data compression unit 12 compresses the time-series data rearranged by the pre-compression processing unit 10. The data storage unit 14 is constituted by storage means such as a memory or a hard disk, and stores the time-series data compressed by the data compression unit 12. The data decompression unit 16 decompresses the time-series data stored in the data storage unit 14. The post-decompression processing unit 18 rearranges and integrates the time-series data decompressed by the data decompression unit 16.

The arrow connecting the pre-compression processing unit 10 to the post-decompression processing unit 18 indicates that parameters related to data processing are notified in this direction. The details of these parameters are described later. The parameters may be notified directly, or indirectly by attaching them to the compressed data as attributes. Alternatively, if the order data compression apparatus 2 and the order data decompression apparatus 3 both use the same preset parameter values, the notification of parameters from the order data compression apparatus 2 to the order data decompression apparatus 3 can be omitted.
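One way the indirect notification could look is a small header prepended to the compressed data. The sketch below assumes the two parameters are the block length N and the division count M described later in the text. The header layout (two 32-bit little-endian unsigned integers) and the names `attach_params` and `read_params` are hypothetical and not specified by the patent.

```python
import struct

# Hypothetical header layout: N then M, each a 32-bit little-endian unsigned int.
HEADER = struct.Struct("<II")

def attach_params(payload: bytes, n: int, m: int) -> bytes:
    """Prepend the parameters N and M to the compressed payload as attributes."""
    return HEADER.pack(n, m) + payload

def read_params(blob: bytes) -> tuple[int, int, bytes]:
    """Split a blob into (N, M, compressed payload)."""
    n, m = HEADER.unpack_from(blob, 0)
    return n, m, blob[HEADER.size:]
```

With such a header the decompressing side needs no separate channel for the parameters. If both sides use preset values, the header can simply be dropped.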

FIG. 2A is a configuration diagram showing the pre-compression processing unit 10. The data temporary storage unit 30 is constituted by storage means such as a memory, and stores the time-series data that flows in continuously. The data division unit 32 divides the data values of the time-series data. The data rearrangement unit 34 rearranges the divided data values, using memory as a work area.

FIG. 2B is a configuration diagram showing the post-decompression processing unit 18. The data rearrangement unit 34 rearranges the divided data values based on the amount of time-series data stored in the data temporary storage unit 30 of the pre-compression processing unit 10. The data integration unit 36 integrates the rearranged data values.

FIG. 3 is a flowchart showing the compression processing in the lossless compression method for time-series data according to this embodiment. This compression processing is characterized in that a high compression ratio is achieved by preprocessing that divides the time-series data along the time axis before the compression algorithm is applied. This reduces the amount of data to be stored and the amount of disk access, which speeds up processing and also keeps down the cost of disk media.

First, the data temporary storage unit 30 of the pre-compression processing unit 10 divides the time-series data along its continuous order, cutting out the time-series data every N time units (S11). Each piece of time-series data is represented inside the computer as a set of multiple bytes. The following description assumes, as an example of a data value, a 4-byte data value in little-endian format, in which data is stored in order from the least significant byte; data values in any other format may also be handled. Of the 4 bytes, the fourth byte stores a sign part indicating whether the value is positive or negative and a mantissa part indicating the number of digits of the value.

Next, for each piece of time-series data cut out by the data temporary storage unit 30, the data division unit 32 of the pre-compression processing unit 10 divides one piece of time-series data into M data strings (S12). For example, dividing 4-byte time-series data into four yields a data string of four 1-byte data items. FIG. 4A is an explanatory diagram showing the internal representation of time-series data in a computer. Each of the N values of the time-series data is divided into four, forming a matrix with N rows and M (= 4) columns.
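The division step S12 can be sketched as follows for the 4-byte little-endian case. This sketch assumes signed 32-bit integer values for simplicity; the patent's example values carry a sign part and a mantissa part, and the function name `split_values` is hypothetical.

```python
def split_values(values: list[int]) -> list[list[int]]:
    """Split each 4-byte value into M = 4 one-byte elements.

    Returns an N x 4 matrix: one row per value, with column j holding
    byte j of the value in little-endian order (column 0 is least significant).
    """
    return [list(v.to_bytes(4, "little", signed=True)) for v in values]

matrix = split_values([1, 258, -1])
print(matrix)  # [[1, 0, 0, 0], [2, 1, 0, 0], [255, 255, 255, 255]]
```

Each row corresponds to one point in time and each column to one byte position, matching the N-row, M-column layout described for FIG. 4A.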

Further, the data rearrangement unit 34 of the pre-compression processing unit 10 rearranges the data divided into bytes by the data division unit 32 (S13). FIG. 5 is a flowchart that describes the data rearrangement processing (S13) in detail.

Using a loop variable i that takes values from 1 to N (S131) and a loop variable j that takes values from 1 to M (S132), the data rearrangement unit 34 performs a matrix transposition that swaps the rows and columns of the matrix shown in FIG. 4A (S133), generating the matrix shown in FIG. 4B. The data rearrangement unit 34 then concatenates the rows of the matrix generated in S133 in the column direction to generate the single-row sequence shown in FIG. 4C (S134). As a result, data items of the same significance (the same byte position) in the time-series data are arranged next to one another.
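The rearrangement steps S131 to S134 can be sketched as a transpose followed by row concatenation. The function name `rearrange` and the list-of-rows matrix representation are illustrative assumptions; the loops use 0-based indices rather than the 1-based i and j of the flowchart.

```python
def rearrange(matrix: list[list[int]]) -> bytes:
    """Transpose an N x M byte matrix and concatenate the rows of the result."""
    n = len(matrix)       # number of values (rows of the input)
    m = len(matrix[0])    # bytes per value (columns of the input)
    transposed = [[0] * n for _ in range(m)]
    for i in range(n):                        # S131: loop over values
        for j in range(m):                    # S132: loop over byte positions
            transposed[j][i] = matrix[i][j]   # S133: swap rows and columns
    # S134: join the M rows end to end into a single sequence.
    return bytes(b for row in transposed for b in row)

stream = rearrange([[1, 0, 0, 0], [2, 1, 0, 0], [3, 1, 0, 0]])
print(list(stream))  # [1, 2, 3, 0, 1, 1, 0, 0, 0, 0, 0, 0]
```

In the output, all first bytes come first, then all second bytes, and so on, so the rarely changing high-order bytes form one long run at the end.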

The data compression unit 12 then compresses the data (byte string) rearranged by the data rearrangement unit 34 (S14). Run-length encoding, for example, is used as the encoding method, because it can be executed in a single pass and is therefore suited to high-speed processing. The data compression unit 12 may also use the various encoding methods shown in the following table, either singly or in combination.

If the data cut out in S11 were compressed in S14 as is, adjacent values would not be similar, so a high compression ratio could not be expected. By applying the preprocessing (S12, S13) to the data cut out in S11, however, adjacent values are more often the same, so a high compression ratio can be expected. This effect is especially pronounced for time-series data measured by sensors and the like, because the data values do not change abruptly but change gradually. The sign part and the mantissa part change infrequently, so adjacent values in those positions are often the same. In addition, within each piece of time-series data, the higher-order digits change less than the lower-order digits.
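The effect of the preprocessing on run lengths can be checked with a small experiment. The eight readings below are hypothetical, and `count_runs` is an illustrative helper that counts maximal runs of equal bytes, which is the number of pairs a simple run-length encoder would emit.

```python
from itertools import groupby

def count_runs(data: bytes) -> int:
    """Count maximal runs of equal adjacent bytes."""
    return sum(1 for _ in groupby(data))

n, m = 8, 4
values = [1000 + t for t in range(n)]  # hypothetical gradually changing readings

# Without preprocessing: the 4 bytes of each value are stored back to back.
raw = b"".join(v.to_bytes(m, "little") for v in values)

# With preprocessing: byte j of every value is grouped together (S12, S13).
rearranged = bytes(raw[i * m + j] for j in range(m) for i in range(n))

print(count_runs(raw), count_runs(rearranged))  # 24 10
```

For this input the unprocessed stream breaks into 24 short runs, while the rearranged stream has 10, because the three high-order byte positions never change across the block.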

Further, the data storage unit 14 stores the data compressed by the data compression unit 12 (S15). The data is preferably stored, in the order of the compressed data, in the form of a sequential file, in which data is accessed in order from the beginning. This enables high-speed sequential access.
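A sequential store of compressed blocks might be sketched as below. The length-prefixed record layout and the names `append_block` and `read_blocks` are assumptions made for illustration; the patent only states that a sequential file is preferable. An in-memory `io.BytesIO` stands in for a file on disk.

```python
import io
import struct

def append_block(f, block: bytes) -> None:
    """Append one compressed block, prefixed with its length (hypothetical layout)."""
    f.write(struct.pack("<I", len(block)))
    f.write(block)

def read_blocks(f):
    """Yield blocks in stored order by reading the file front to back."""
    while True:
        header = f.read(4)
        if len(header) < 4:
            return  # end of file
        (size,) = struct.unpack("<I", header)
        yield f.read(size)

buf = io.BytesIO()
for block in (b"ab", b"", b"xyz"):
    append_block(buf, block)
buf.seek(0)
stored = list(read_blocks(buf))
```

Writing only appends and reading only moves forward, so no seeks are needed within the file once reading has started.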

This concludes the description of the compression processing. The parameters associated with the arrow connecting the pre-compression processing unit 10 to the post-decompression processing unit 18 in FIG. 1 are the N time units of time-series data cut out in S11 and the number M of divisions of one piece of time-series data in S12. Larger values of these parameters can be expected to give a higher compression ratio than smaller values. On the other hand, when the processing time from the input of time-series data to the output of the compressed time-series data must be shortened, as in real-time processing, the parameter values should be made smaller. Thus, when the processing time is limited, a higher compression ratio is obtained by making the parameter values as large as the limited processing time allows.

FIG. 6 is a flowchart showing the decompression processing in the lossless compression method for time-series data according to this embodiment. The decompression processing reverses the processing order of the compression processing of FIG. 3.

The data decompression unit 16 reads the compressed time-series data from the data storage unit 14 (S21) and decompresses it by applying the decompression algorithm corresponding to the compression algorithm of S14 (S22). The data rearrangement unit 34 rearranges the data decompressed in S22 by the same method as in S13 (S23). The data integration unit 36 combines the M data strings into one piece of time-series data (S24). The post-decompression processing unit 18 outputs the time-series data collected for every N time units (S25).
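Steps S23 and S24 can be sketched as the same transposition applied with the row and column counts swapped, followed by reassembly of each value. This sketch assumes 4-byte signed little-endian integers and that decompression (S22) has already produced the rearranged byte stream. The names `transpose_flat` and `restore_values` are hypothetical.

```python
def transpose_flat(data: bytes, rows: int, cols: int) -> bytes:
    """Treat data as a row-major rows x cols matrix and return its transpose, flattened."""
    return bytes(data[r * cols + c] for c in range(cols) for r in range(rows))

def restore_values(stream: bytes, n: int, m: int = 4) -> list[int]:
    """Rebuild N values from a stream grouped by byte position."""
    # S23: the stream is an M x N matrix; transposing it gives N x M again.
    flat = transpose_flat(stream, m, n)
    # S24: join the M bytes of each row back into one value.
    return [int.from_bytes(flat[i * m:(i + 1) * m], "little", signed=True)
            for i in range(n)]

values = [1000, 1001, -5]
raw = b"".join(v.to_bytes(4, "little", signed=True) for v in values)
stream = transpose_flat(raw, len(values), 4)    # what compression-side S13 produces
restored = restore_values(stream, len(values))  # equals the original values
```

Because transposing twice returns the original matrix, the decompression side can reuse the compression-side routine and only needs to know N and M.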

FIG. 1 is a block diagram showing an order data processing apparatus according to an embodiment of the present invention.
FIG. 2 is a block diagram showing a pre-compression processing unit and a post-decompression processing unit according to an embodiment of the present invention.
FIG. 3 is a flowchart showing compression processing according to an embodiment of the present invention.
FIG. 4 is an explanatory diagram showing the internal representation, in a computer, of time-series data according to an embodiment of the present invention.
FIG. 5 is a flowchart showing rearrangement processing according to an embodiment of the present invention.
FIG. 6 is a flowchart showing decompression processing according to an embodiment of the present invention.

Explanation of symbols

1 Order data processing apparatus
2 Order data compression apparatus
3 Order data decompression apparatus
10 Pre-compression processing unit
12 Data compression unit
14 Data storage unit
16 Data decompression unit
18 Post-decompression processing unit

Claims

1. An order data compression method for compressing order data in which an order is defined, wherein a computer executes:
a pre-compression processing procedure of cutting out a plurality of consecutive pieces of order data from the order data in which the order is defined, dividing each of the plurality of pieces of order data into a plurality of data elements, and rearranging the data elements so that data elements at the same bit position in the order data are consecutive, thereby creating a data element sequence;
a data compression procedure of compressing the data element sequence created by the pre-compression processing procedure; and
a data storage procedure of storing the data element sequence compressed by the data compression procedure in storage means.

2. The order data compression method according to claim 1, wherein the data compression procedure compresses the data element sequence by run-length encoding.

3. The order data compression method according to claim 1, wherein the data storage procedure stores the data element sequence in a sequential file according to the order of the data element sequence.

4. An order data decompression method for decompressing order data in which an order is defined, wherein a computer executes:
a data reading procedure of reading, from storage means, a compressed data element sequence in which data elements at the same bit position in the order data are consecutive;
a data decompression procedure of decompressing the data element sequence read by the data reading procedure; and
a post-decompression processing procedure of integrating the data element sequence decompressed by the data decompression procedure, which constitutes the order data, into the order data.

5. The order data decompression method according to claim 4, wherein the data decompression procedure decompresses the data element sequence by run-length encoding.

6. The order data decompression method according to claim 4, wherein the data reading procedure reads the data element sequence, according to the order of the data element sequence, from a sequential file in which the data element sequence is stored.

7. An order data processing program for causing a computer to execute the order data compression method according to any one of claims 1 to 3 or the order data decompression method according to any one of claims 4 to 6.

8. An order data compression device for compressing order data in which an order is defined, comprising:
a pre-compression processing unit that cuts out a plurality of consecutive pieces of order data from the order data in which the order is defined, divides each of the plurality of pieces of order data into a plurality of data elements, and rearranges the data elements so that data elements at the same bit position in the order data are consecutive, thereby creating a data element sequence;
a data compression unit that compresses the data element sequence created by the pre-compression processing unit; and
a data storage unit that stores the data element sequence compressed by the data compression unit in storage means.

9. An order data decompression device for decompressing order data in which an order is defined, comprising:
a data reading unit that reads, from storage means, a compressed data element sequence in which data elements at the same bit position in the order data are consecutive;
a data decompression unit that decompresses the data element sequence read by the data reading unit; and
a post-decompression processing unit that integrates the data element sequence decompressed by the data decompression unit, which constitutes the order data, into the order data.

10. An order data processing system comprising: the order data compression device according to claim 8; and the order data decompression device according to claim 9.