JP6965539B2

JP6965539B2 - Coding device, decoding device and program

Info

Publication number: JP6965539B2
Application number: JP2017057591A
Authority: JP
Inventors: 碧唯加茂
Original assignee: Fuji Xerox Co Ltd; Fujifilm Business Innovation Corp
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2017-03-23
Filing date: 2017-03-23
Publication date: 2021-11-10
Anticipated expiration: 2037-03-23
Also published as: JP2018160827A

Description

The present invention relates to a coding device, a decoding device, and a program.

Patent Document 1 describes a program that compresses result information from a numerical analysis. The program comprises: a step in which an arithmetic unit reads, from a storage unit, the values of a plurality of predetermined structural parameters used in the numerical analysis as variable-dependent information; a step in which the arithmetic unit generates conversion information from the variable-dependent information; a step in which the arithmetic unit converts a plurality of pieces of result information based on the conversion information; a step in which the arithmetic unit compresses the converted result information; and a step in which the arithmetic unit stores the compressed result information in the storage unit.

Patent Document 2 describes a method of compressing the spatial coordinates of particles and their time-series data. The method consists of a process of converting the coordinates into a representation of the required precision, based on information about the space to be analyzed and the particles present in it; a process of dividing the data into groups that share part of the converted coordinate information; and a process of outputting the minimum necessary information for each group.

Patent Document 3 describes a device that losslessly compresses data with a variety of distributions, including data with large variance, and data that is non-stationary in time. The device comprises a residual calculation unit, a prior probability storage unit, a coding parameter selection unit, and a residual coding unit. The residual calculation unit acquires an input value for each time from input data composed of time-series input values, calculates a predicted value for each time using a predetermined prediction method based on input values earlier than the acquired input value, and calculates, by a processing device, the residual between the acquired input value and the calculated predicted value for each time. The prior probability storage unit stores in advance, in a storage device, prior probability information that defines the prior probability of the coding method indicated by each of a plurality of coding parameters, each of which indicates a different coding method. The coding parameter selection unit selects, by the processing device and for each time, a coding parameter for the residual calculated by the residual calculation unit. The selection is based on the code length obtained when the residuals earlier than that residual are encoded with the coding method indicated by each coding parameter, and on the prior probability of that coding method as defined in the stored prior probability information; the selected coding parameter is the one indicating the coding method that gives the past residuals a shorter code length than the other coding methods. The residual coding unit acquires the residual calculated by the residual calculation unit for each time, encodes it by the processing device using the coding method indicated by the coding parameter selected for that residual, thereby calculating a codeword of the residual for each time, and outputs the calculated codewords as compressed data of the input data.

Japanese Patent No. 4493614
Japanese Unexamined Patent Publication No. H9-160898
Japanese Patent No. 5570409

A variable-length code (entropy code), which assigns shorter codes to symbols with a higher probability of appearance, is known as a code suited to data compression. Encoding and decoding with a variable-length code are performed according to a code table that associates symbols with codes. Because the code table grows as the number of symbols contained in the data increases, it becomes an obstacle to more efficient compression.
In view of the above, an object of the present invention is to perform encoding and decoding with a variable-length code more efficiently than when a code table constructed from an empirical distribution is used.

The invention according to claim 1 provides a coding device comprising: estimator calculation means for calculating an estimator of a parameter of the probability distribution of input data from the input data and a distribution model of the input data; appearance probability calculation means for calculating the appearance probability of each data value included in the input data from a probability distribution that uses the calculated estimator; coding means for variable-length coding the data values using the calculated appearance probabilities; and output means for outputting, as an incidental code accompanying the data values variable-length coded by the coding means, profile information that specifies the distribution model, together with the estimator.

The invention according to claim 2 is the coding device according to claim 1, further comprising integer conversion means for converting the input data represented as fixed-point or floating-point numbers into input data represented as integers, and inputting the integer input data to the estimator calculation means and the coding means.

The invention according to claim 3 is the coding device according to claim 1 or 2, further comprising difference calculation means for calculating differences of the input data in the time direction and inputting the differences, in place of the input data, to the estimator calculation means and the coding means.

The invention according to claim 4 is the coding device according to any one of claims 1 to 3, further comprising frame division means for dividing the input data into frames, one per time step, and inputting the input data frame by frame to the estimator calculation means and the coding means.

The invention according to claim 5 is the coding device according to any one of claims 1 to 4, further comprising precision adjustment means for adjusting the precision of the input data according to a specified precision and inputting the precision-adjusted input data to the estimator calculation means and the coding means.

The invention according to claim 6 is the coding device according to any one of claims 1 to 5, further comprising group division means for dividing the data values into groups based on variable-dependent information related to the data values and inputting the input data group by group to the estimator calculation means and the coding means.

The invention according to claim 7 is the coding device according to any one of claims 1 to 6, further comprising sampling means for extracting samples from the input data and inputting the extracted samples to the estimator calculation means.

The invention according to claim 8 is the coding device according to any one of claims 1 to 7, further comprising precision division means for dividing the input data into upper digits and lower digits based on the number of data items and the significant digits of the input data, and inputting the upper digits to the coding means.

The invention according to claim 9 provides a decoding device comprising: input means for inputting an incidental code that indicates profile information specifying a distribution model of input data and an estimator of a parameter of the probability distribution of the input data; appearance probability calculation means for calculating the appearance probability of each data value included in the input data, using the distribution model specified from the incidental code input by the input means and the estimator restored from that incidental code; and decoding means for decoding the variable-length coded data values using the calculated appearance probabilities.

The invention according to claim 10 provides a program for causing a computer to function as: estimator calculation means for calculating an estimator of a parameter of the probability distribution of input data from the input data and a distribution model of the input data; appearance probability calculation means for calculating the appearance probability of each data value included in the input data from a probability distribution that uses the calculated estimator; coding means for variable-length coding the data values using the calculated appearance probabilities; and output means for outputting, as an incidental code accompanying the data values variable-length coded by the coding means, profile information that specifies the distribution model, together with the estimator.

The invention according to claim 11 provides a program for causing a computer to function as: input means for inputting an incidental code that indicates profile information specifying a distribution model of input data and an estimator of a parameter of the probability distribution of the input data; appearance probability calculation means for calculating the appearance probability of each data value included in the input data, using the distribution model specified from the incidental code input by the input means and the estimator restored from that incidental code; and decoding means for decoding the variable-length coded data values using the calculated appearance probabilities.

According to the inventions of claims 1 and 10, encoding with a variable-length code is performed more efficiently than when a code table constructed from an empirical distribution is used.
According to the invention of claim 2, even for data represented as fixed-point or floating-point numbers, encoding with a variable-length code is performed more efficiently than when a code table constructed from an empirical distribution is used.
According to the invention of claim 3, even when the input data does not follow a specific distribution, if the differences of the input data follow a specific distribution, the differences are encoded with a variable-length code more efficiently than when a code table constructed from an empirical distribution is used.
According to the invention of claim 4, even when the distribution of the input data changes over time, encoding with a variable-length code is performed more efficiently than when a code table constructed from an empirical distribution is used.
According to the invention of claim 5, the compression ratio is improved compared with the case where the precision of the input data is not adjusted.
According to the invention of claim 6, the compression ratio is improved compared with the case where the data values are not divided into groups based on the variable-dependent information.
According to the invention of claim 7, the amount of computation for the estimator is reduced and memory is saved compared with a configuration that calculates the estimator from all of the data.
According to the invention of claim 8, when the empirical distribution of the input data is sparse, the amount of computation required to design the variable-length code is reduced compared with the case where all digits are input to the coding means.
According to the inventions of claims 9 and 11, decoding with a variable-length code is performed more efficiently than when a code table constructed from an empirical distribution is used.

FIG. 1 is a diagram showing the hardware configuration of the information processing device 1.
FIG. 2 is a diagram showing the functional configuration of the information processing device 1.
FIG. 3 is a flow chart showing the procedure of the processing executed by the information processing device 1.
FIG. 4 is a graph showing the empirical distribution of velocity at a certain time step.
FIG. 5 is a diagram showing the correspondence table between profile numbers and distribution models.
FIG. 6 is a diagram showing the probability density function using the calculated estimators.
FIG. 7 is a graph showing the probability density function.
FIG. 8 is a diagram showing the structure of the coded data of the main body code.
FIG. 9 is a diagram showing the structure of the coded data of the incidental code.
FIG. 10 is a correspondence table between profile numbers and distribution models.
FIG. 11 is a diagram showing the probability density function using the restored estimators.
FIG. 12 is a graph showing the probability density function.
FIG. 13 is a diagram showing the functional configuration of a modification.
FIG. 14 is a diagram showing the conversion of a fixed-point number.
FIG. 15 is a diagram showing the conversion of a floating-point number.
FIG. 16 is an example of converting fixed-point data values to integers.
FIG. 17 is a diagram showing the functional configuration of a modification.
FIG. 18 is a diagram showing an example of differences.
FIG. 19 is a diagram showing the functional configuration of a modification.
FIG. 20 is an example of input data to be encoded.
FIG. 21 is a diagram showing the functional configuration of a modification.
FIG. 22 is a diagram showing data values before and after precision adjustment.
FIG. 23 is a diagram showing the functional configuration of a modification.
FIG. 24 is a diagram showing data values before and after group division.
FIG. 25 is a diagram showing the state of the particles before and after group division.
FIG. 26 is a diagram showing the functional configuration of a modification.
FIG. 27 is a diagram showing the functional configuration of a modification.
FIG. 28 is a diagram showing input data.

An example of a mode for carrying out the present invention will be described.
FIG. 1 is a diagram showing the hardware configuration of the information processing device 1. The information processing device 1 includes an arithmetic unit 11, a storage unit 12, a communication unit 13, an operation unit 14, and a display unit 15. The storage unit 12 is a storage device such as a hard disk drive or a memory, and stores programs and data. The arithmetic unit 11 includes a processor and a memory used as a work area for computation, and executes operations according to a program stored in the storage unit 12. The communication unit 13 is a communication interface between the information processing device 1 and external devices. The operation unit 14 includes input devices such as a keyboard and a mouse, and accepts operations by the user. The display unit 15 includes a display device such as a liquid crystal display panel, and displays a GUI (Graphical User Interface) screen. The operation unit 14 and the display unit 15 may be configured as external devices.

FIG. 2 is a diagram showing the functional configuration of the information processing device 1. Each illustrated component represents software installed on the information processing device 1, shown as one module per function. The functions of these modules are realized when the arithmetic unit 11 executes the software. The distribution model selection unit 21, the estimator calculation unit 22, the appearance probability calculation unit 23, and the coding unit 24 provide the functions related to encoding and are an example of the coding device according to the present invention. The appearance probability calculation unit 41 and the decoding unit 42 provide the functions related to decoding and are an example of the decoding device according to the present invention.

FIG. 3 is a flow chart showing the procedure of the processing executed by the information processing device 1. (a) shows the encoding procedure, and (b) shows the decoding procedure.

The distribution model selection unit 21 selects the distribution model corresponding to the input profile number (step S11). The estimator calculation unit 22 (an example of the estimator calculation means) calculates an estimator of a parameter of the probability distribution of the input data from the input data and the distribution model of the input data (step S12). The appearance probability calculation unit 23 (an example of the appearance probability calculation means of the coding device) calculates the appearance probability of each data value included in the input data from a probability distribution that uses the calculated estimator (step S13). The coding unit 24 (an example of the coding means) variable-length codes the data values using the calculated appearance probabilities (step S14).

The appearance probability calculation unit 41 (an example of the appearance probability calculation means of the decoding device) restores the distribution using the distribution model of the input data and the estimator of the parameter of the probability distribution of the input data (step S21), and calculates the appearance probability of each data value included in the input data (step S22). The decoding unit 42 (an example of the decoding means) decodes the variable-length coded data values using the calculated appearance probabilities (step S23). The details of each unit are described below.

[Configuration related to encoding]
The configuration related to encoding will be described. The input data to be encoded is, for example, the data values of the x-axis velocity Vx(i, t) obtained from a molecular dynamics simulation of the motion of I atoms over T time steps under thermal equilibrium. Here i (i = 1, ..., I) is the identification number of an atom, and t (t = 1, ..., T) is the time step. The data values are represented as integers with N significant digits.

FIG. 4 is a graph showing the empirical distribution of velocity at a certain time step. The horizontal axis is the velocity and the vertical axis is the appearance probability. In a molecular dynamics simulation, the velocity distribution of atoms in thermal equilibrium is controlled so that it is a normal distribution.

The molecular dynamics simulation also calculates the y-axis velocity Vy(i, t) and the z-axis velocity Vz(i, t), but they are encoded and decoded in the same way as Vx(i, t), so only the example of Vx(i, t) is described below.

Next, the distribution model selection unit 21 will be described.
FIG. 5 is a diagram showing the correspondence table between profile numbers and distribution models. This example shows five distribution models in total: the normal distribution, the exponential distribution, the Poisson distribution, the gamma distribution, and the Weibull distribution. Other distribution models may also be included in the correspondence table. The correspondence table is set in the distribution model selection unit 21.

A profile number is input to the distribution model selection unit 21 (see FIG. 2). The profile number may be entered through the operation unit 14. Alternatively, bits indicating the profile number may be provided in the header of the input data so that the profile number is input when the header is read.

The distribution model selection unit 21 selects the distribution model corresponding to the input profile number q from the correspondence table, and outputs distribution model information indicating the selected distribution model. For example, in a molecular dynamics simulation the velocity distribution of atoms in thermal equilibrium is controlled so that it is a normal distribution, so 0 is input as the profile number, the normal distribution is selected as the distribution model, and distribution model information indicating the normal distribution is output.
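The table lookup performed by the distribution model selection unit 21 can be sketched as follows. This is only an illustration: the text confirms that profile number 0 corresponds to the normal distribution, but the numbers assigned to the other four models, the string labels, and the function name `select_distribution_model` are assumptions made for this example.

```python
# Hypothetical correspondence table between profile numbers and distribution models.
# Only the pair 0 -> normal distribution is stated in the text; the numbers 1 to 4
# are assumed here for illustration.
PROFILE_TABLE = {
    0: "normal",
    1: "exponential",
    2: "poisson",
    3: "gamma",
    4: "weibull",
}


def select_distribution_model(profile_number: int) -> str:
    """Return distribution model information for the profile number q."""
    try:
        return PROFILE_TABLE[profile_number]
    except KeyError:
        raise ValueError(f"unknown profile number: {profile_number}") from None
```

Because the same table would be held by both the encoder and the decoder, only the small profile number has to be transmitted to identify the model.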

Next, the estimator calculation unit 22 will be described. An estimator is an estimated value of a parameter of a probability distribution. A parameter is a constant that characterizes a probability distribution, such as the mean, variance, standard deviation, median, or expected value. In this embodiment, estimators of the expected value and the variance are calculated.
The input data and the distribution model information output from the distribution model selection unit 21 are input to the estimator calculation unit 22. The estimator calculation unit 22 calculates the estimators of the expected value and the variance from the input data and the distribution model information by the following procedure.
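The calculation procedure itself is not reproduced in this text. As a minimal sketch, assuming the normal distribution model and maximum-likelihood estimation (an assumption, not necessarily the procedure of the embodiment), the estimators of the expected value and the variance are the sample mean and the sample variance with divisor n. The function name is hypothetical.

```python
def estimate_mean_and_variance(values):
    """Return (mean, variance) estimators for a sequence of data values.

    Assumes a normal distribution model and maximum-likelihood estimation,
    so the variance uses the divisor n rather than n - 1.
    """
    n = len(values)
    if n == 0:
        raise ValueError("at least one data value is required")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, variance
```

Whatever estimation rule is used, only its two results need to be passed on to the later stages.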

Next, the appearance probability calculation unit 23 will be described. In the following, V denotes the random variable that the velocity Vx(i, 1) follows, and Vx(i, 1) may be written as v_i. The symbol X and its counterpart for individual data values in the earlier description of the estimator calculation unit 22 correspond to V and v_i, respectively.

The estimators calculated by the estimator calculation unit 22 are input to the appearance probability calculation unit 23. The appearance probability calculation unit 23 calculates the appearance probability of each value that the random variable V can take from the probability density function that uses the estimators.
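One way to turn a probability density function into appearance probabilities for integer data values is to evaluate the density at each possible value and normalize the results. The sketch below does this for the normal distribution. The normalization step, the explicit value range `lo` to `hi`, and the function name are assumptions for illustration; the exact formula used in the embodiment is not reproduced in this text.

```python
import math


def appearance_probabilities(mean, variance, lo, hi):
    """Return {s: p(s)} for every integer s in [lo, hi].

    The normal density is evaluated at each integer and the results are
    normalized to sum to 1 (the constant factor 1/sqrt(2*pi*variance)
    cancels in the normalization, so it is omitted).
    """
    if variance <= 0 or lo > hi:
        raise ValueError("variance must be positive and lo must not exceed hi")
    density = {
        s: math.exp(-((s - mean) ** 2) / (2.0 * variance))
        for s in range(lo, hi + 1)
    }
    total = sum(density.values())
    if total == 0.0:
        raise ValueError("density underflowed to zero over the whole range")
    return {s: d / total for s, d in density.items()}
```

Since the function depends only on the two estimators and the value range, a decoder that receives the same estimators can reproduce exactly the same probabilities.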

Next, the coding unit 24 will be described. The input data and the appearance probabilities calculated by the appearance probability calculation unit 23 are input to the coding unit 24.

FIG. 8 is a diagram showing the structure of the coded data of the main body code. FIG. 9 is a diagram showing the structure of the coded data of the incidental code. Based on the calculated p(s), the coding unit 24 losslessly encodes the data values v_i using a conventional technique such as Huffman coding or arithmetic coding, and outputs the result as the main body code. The coding unit 24 also encodes the profile number q and the estimators a and b by fixed-length coding and outputs them as the incidental code. The output main body code and incidental code are transmitted to and stored in the storage unit 12 of the information processing device 1, an external storage device, or the like.
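As an illustration of the Huffman option mentioned above, the sketch below builds a prefix code deterministically from a table of appearance probabilities and encodes a sequence of data values as a bit string. The function names and the bit-string representation are assumptions for this example. The point it illustrates is that the code is derived from p(s) alone, so a decoder that recomputes the same p(s) can rebuild the same code without receiving a code table.

```python
import heapq
import itertools


def build_huffman_code(probabilities):
    """Return {symbol: bit string} built from {symbol: probability}."""
    if not probabilities:
        raise ValueError("at least one symbol is required")
    if len(probabilities) == 1:
        return {next(iter(probabilities)): "0"}
    tie_breaker = itertools.count()  # keeps heap entries comparable and ordering deterministic
    heap = [(p, next(tie_breaker), {s: ""}) for s, p in sorted(probabilities.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least probable subtrees, prefixing their codes with 0 and 1.
        p0, _, codes0 = heapq.heappop(heap)
        p1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, next(tie_breaker), merged))
    return heap[0][2]


def encode(values, code):
    """Concatenate the codewords of the data values into one bit string."""
    return "".join(code[v] for v in values)
```

For example, with probabilities 0.5, 0.25, 0.125, and 0.125 for the symbols 0 to 3, the resulting codewords have lengths 1, 2, 3, and 3 bits.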

[Configuration related to decoding]
Next, the configuration related to decoding will be described. The incidental code is input to the appearance probability calculation unit 41 (see FIG. 2). The appearance probability calculation unit 41 restores the profile number Q and the estimators A and B from the incidental code.
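The fixed-length incidental code can be pictured as a small binary record that is packed on the encoding side and unpacked on the decoding side. The layout below (a 1-byte profile number followed by two big-endian 64-bit floating-point estimators) is a hypothetical choice for illustration; the actual field widths are defined by the structure shown in FIG. 9, which is not reproduced here.

```python
import struct

# Hypothetical layout: unsigned 1-byte profile number, then two big-endian float64 estimators.
ANCILLARY_FORMAT = ">Bdd"


def pack_incidental_code(q, a, b):
    """Encode profile number q and estimators a, b as a fixed-length record."""
    return struct.pack(ANCILLARY_FORMAT, q, a, b)


def unpack_incidental_code(payload):
    """Restore (Q, A, B) from a record produced by pack_incidental_code."""
    return struct.unpack(ANCILLARY_FORMAT, payload)
```

With this assumed layout the record is always 17 bytes, regardless of how many distinct data values the input contains.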

The appearance probability calculation unit 41 calculates the appearance probability of each value that a data value can take from the probability density function that uses the restored estimators.

Next, the decoding unit 42 will be described. The main body code is input to the decoding unit 42 (see FIG. 2). Based on the calculated P(g), the decoding unit 42 decodes the main body code using a conventional technique such as Huffman decoding or arithmetic decoding, and outputs the result as the data values d_i. The decoded data values are transmitted to and stored in the storage unit 12 of the information processing device 1, an external storage device, or the like.
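For the Huffman case, decoding amounts to reading the main body code bit by bit and emitting a data value whenever the bits read so far match a codeword. The sketch below assumes the prefix code has already been rebuilt from the restored probabilities and is given as a mapping from symbol to bit string; the function name and the bit-string representation are assumptions for this example.

```python
def decode(bits, code):
    """Decode a bit string using a prefix code given as {symbol: bit string}."""
    inverse = {codeword: symbol for symbol, codeword in code.items()}
    values = []
    current = ""
    for bit in bits:
        current += bit
        if current in inverse:  # a prefix code makes the first match unambiguous
            values.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete codeword")
    return values
```

A short usage example with a hand-written prefix code: `decode("0101101110", {0: "0", 1: "10", 2: "110", 3: "111"})` yields the data values 0, 1, 2, 3, 0.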

The above is the processing for the data values of one time step. If the distribution of the random variable is assumed to be the same in all time steps, the estimators calculated by the above processing from the data values of one time step (for example, t = 1) may be used to encode and decode the data values of the other time steps.

The difference between the present invention and the prior art will be described. For example, when Huffman coding is used in the conventional method, the size C (in bits) of the code table that associates symbols with codes is given by equation (9), where N is the number of significant digits of the data values. That is, the code table grows as the number of significant digits increases. In contrast, when Huffman coding is used in the variable-length coding of this embodiment, the profile number and the estimators are sent instead of the code table. The amount of data for the profile number and the estimators is smaller than that of the code table and is constant regardless of the number of significant digits of the data values. Therefore, according to this embodiment, encoding and decoding with a variable-length code are performed more efficiently than when a code table constructed from an empirical distribution is used.

[Modifications]
The above embodiment may be modified as follows. A plurality of modifications may be combined.

[Modification 1]
FIG. 13 is a diagram showing the functional configuration of a modification. This modification is an example in which an integer conversion unit 31 is added to the above embodiment. The integer conversion unit 31 (an example of an integer conversion means) converts data values represented by fixed-point or floating-point numbers into data values represented by integers. The data values represented by integers are input to the estimator calculation unit 22 and the coding unit 24.

FIG. 14 is a diagram showing the conversion of a fixed-point number. Let D be the fixed-point number, N the number of significant digits, and I the value obtained by converting D into an integer. Then I is calculated by equation (10).
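Equation (10) is not reproduced in this text. The following Python sketch therefore assumes the simplest form consistent with the description, namely that D is scaled by a power of ten equal to its number of fractional digits (5 for a value such as -9.99999 with six significant digits). The function names and the `frac_digits` parameter are hypothetical.

```python
from decimal import Decimal


def fixed_to_int(d: str, frac_digits: int) -> int:
    """Shift the decimal point right by frac_digits and return an integer."""
    return int(Decimal(d).scaleb(frac_digits))


def int_to_fixed(i: int, frac_digits: int) -> Decimal:
    """Inverse conversion, as would be applied after decoding."""
    return Decimal(i).scaleb(-frac_digits)
```

Parsing the value from its decimal string rather than from a binary float avoids rounding errors, so the inverse conversion restores the original value exactly.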

FIG. 15 is a diagram showing the conversion of a floating-point number. Let F be the floating-point number. F is expressed by equation (11) using a mantissa M and an exponent E. M and E are used as the converted integer data values.
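Equation (11) is not reproduced in this text. The sketch below assumes a decimal representation F = M × 10^E with an integer mantissa, which matches the example 1.2345e-05 = 12345e-09 given in Modification 4. The function names are hypothetical, and special values such as NaN or infinity are not handled.

```python
from decimal import Decimal


def float_to_mantissa_exponent(text: str) -> tuple[int, int]:
    """Split a decimal floating-point literal into integers (M, E), F = M * 10**E."""
    sign, digits, exponent = Decimal(text).as_tuple()
    mantissa = int("".join(map(str, digits)))
    return (-mantissa if sign else mantissa), exponent


def mantissa_exponent_to_decimal(m: int, e: int) -> Decimal:
    """Inverse conversion: rebuild F from the integer pair (M, E)."""
    return Decimal(m).scaleb(e)
```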

FIG. 16 shows an example in which fixed-point data values are converted into integers. (a) shows the fixed-point data values, and (b) shows the integer data values.

According to this configuration, even data represented by fixed-point or floating-point numbers is encoded with a variable-length code more efficiently than when a code table constructed from the empirical distribution is used. For the decoded data, the decoding unit 42 converts the data represented by integers back into data represented by fixed-point or floating-point numbers by the inverse of the above conversion.

[Modification 2]
FIG. 17 is a diagram showing the functional configuration of a modification. FIG. 18 is a diagram showing an example of differences. This modification is an example in which a difference calculation unit 32 is added to the above embodiment. The difference calculation unit 32 (an example of a difference calculation means) calculates the difference of the input data in the time direction. The moving distance Δx from the x-coordinate value x(t) of a particle at time t to the x-coordinate value x(t + Δt) after a minute time Δt is proportional to the velocity v_x in the x-axis direction at that time, as shown in equation (12). Therefore, when the distribution model of v_x is a normal distribution, the distribution model of the difference (moving distance) of the coordinate data in the time direction is also a normal distribution. The differences calculated by the difference calculation unit 32 are input to the estimator calculation unit 22 and the coding unit 24.
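The difference calculation and its inverse can be sketched in Python as follows. The sketch assumes that the coordinates have already been converted to integers (as in Modification 1) so that the round trip is exact, and that the first coordinate is kept alongside the differences. Both the function names and that storage choice are assumptions for illustration.

```python
from itertools import accumulate


def time_differences(x):
    """x[t] is the integer x-coordinate at time step t.

    Returns the first coordinate and the list of differences x[t+1] - x[t].
    """
    return x[0], [b - a for a, b in zip(x, x[1:])]


def restore_coordinates(x0, diffs):
    """Inverse operation applied after decoding: a running sum from x0."""
    return list(accumulate(diffs, initial=x0))
```

Only the differences would be passed to the estimator calculation and the variable-length coding; the coordinates themselves need not follow the distribution model.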

According to this configuration, even when the input data does not follow a specific distribution, if the differences of the input data follow a specific distribution, the differences are encoded with a variable-length code more efficiently than when a code table constructed from the empirical distribution is used.

[Modification 3]
FIG. 19 is a diagram showing the functional configuration of a modification. This modification is an example in which a frame dividing unit 33 is added to the above embodiment.

FIG. 20 shows an example of input data to be encoded. The input data is, for example, the data values of the velocity Vx(i, t) in the x-axis direction obtained by a molecular dynamics simulation of the motion of I atoms (I = 50,000) over T time steps (T = 100,000) in a thermal equilibrium state. Here, i (i = 1, ..., I) is the identification number of an atom, and t (t = 1, ..., T) is the time step. The velocity Vx(i, t) is represented by a fixed-point number with six significant digits, from -9.99999 to 9.99999.

The frame dividing unit 33 (an example of a frame dividing means) divides the input data into frames, one for each time step. In the example of FIG. 20, the input data is divided into T frames, each containing the data values of the velocity Vx of the atoms i = 1, ..., I. The input data is input to the estimator calculation unit 22 and the coding unit 24 frame by frame, and the processing described in the above embodiment is executed for each frame.
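A minimal Python sketch of the frame division follows. It assumes the input is held as a list `vx` with `vx[i][t]` being the velocity of atom i at time step t, and it uses the mean and standard deviation as the per-frame estimators, as would apply to a normal distribution model. The data layout and function names are assumptions for illustration.

```python
from statistics import fmean, pstdev


def split_into_frames(vx):
    """vx[i][t] -> list of T frames; frame t holds the values of all I atoms."""
    return [list(frame) for frame in zip(*vx)]


def per_frame_estimators(frames):
    """One (mean, standard deviation) pair per frame."""
    return [(fmean(f), pstdev(f)) for f in frames]
```

Because each frame carries its own estimators, a distribution that drifts over time is tracked frame by frame at the cost of one incidental code per frame.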

According to this configuration, even when the distribution of the input data changes over time, encoding with a variable-length code is performed more efficiently than when a code table constructed from the empirical distribution is used.

[Modification 4]
FIG. 21 is a diagram showing the functional configuration of a modification. This modification is an example in which a precision adjusting unit 34 is added to the above embodiment. Here, the precision is the digit position indicated by the least significant digit of the mantissa of a floating-point number. For example, when the data value is 1.2345e-05, 1.2345e-05 = 12345e-09, so the precision is e-09. The user determines in advance the precision required of the decoded data (hereinafter referred to as the required precision). The required precision may be input through the operation unit 14. Alternatively, bits indicating the required precision may be provided in the header of the input data so that the required precision is input by reading the header.

The precision adjusting unit 34 (an example of a precision adjusting means) adjusts the precision of the data according to the input required precision. Specifically, the precision adjusting unit 34 divides each data value into upper digits and lower digits according to the input required precision. The upper digits are the part from the most significant digit to the digit corresponding to the required precision. The lower digits are the part from the digit immediately below the digit corresponding to the required precision to the least significant digit.

FIG. 22 is a diagram showing data values before and after the precision adjustment. (a) shows the data values v_i' before the precision adjustment, and (b) shows the data values v_i" after the precision adjustment. The precision adjusting unit 34 discards the lower digits of v_i' according to the required precision ε by equation (13) to calculate v_i". The precision-adjusted data produced by the precision adjusting unit 34 is input to the estimator calculation unit 22 and the coding unit 24. According to this configuration, the compression ratio is improved compared with the case where the precision of the input data is not adjusted.
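Equation (13) is not reproduced in this text. The Python sketch below assumes that discarding the lower digits means truncating toward zero at the digit given by the required precision, expressed here as a power of ten such as 1e-7. The function name and the way the required precision is passed are hypothetical.

```python
from decimal import Decimal, ROUND_DOWN


def adjust_precision(value: str, required_precision: str) -> Decimal:
    """Drop every digit below the required precision (truncation, no rounding)."""
    return Decimal(value).quantize(Decimal(required_precision), rounding=ROUND_DOWN)
```

For instance, a value of 1.2345e-05 (precision e-09) adjusted to a required precision of e-07 keeps the digits 123 and drops 45. Fewer distinct values remain after the adjustment, which is what makes the subsequent variable-length coding more compact.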

[Modification 5]
FIG. 23 is a diagram showing the functional configuration of a modification. This modification is an example in which a group dividing unit 35 is added to the above embodiment. The group dividing unit 35 (an example of a group dividing means) groups the data values based on variable dependency information related to the data values. Variable dependency information is another variable on which a given variable depends. For example, in the molecular dynamics method, the velocity distribution of particles depends on their mass, so mass is variable dependency information for velocity. A group is a set of data classified by the variable dependency information. For example, the velocity distribution of ideal gas particles in a thermal equilibrium state changes depending on the mass of the particles. That is, mass is variable dependency information for velocity.

FIG. 24 is a diagram showing data values before and after group division. FIG. 25 is a diagram showing the state of the particles before and after group division. For example, when particles having mass m_A and particles having mass m_B are mixed, as shown in FIG. 24(a) and FIG. 25(a), the group dividing unit 35 divides the input data into a group of particles having mass m_A (group A shown in FIG. 24(b) and FIG. 25(b)) and a group of particles having mass m_B (group B shown in FIG. 24(c) and FIG. 25(c)).

The grouped data values are input to the estimator calculation unit 22 and the coding unit 24, the processing described in the above embodiment is executed for each group, and the estimator and the appearance probabilities are calculated for each group. FIG. 25(e) shows the probability density function using the estimator of group A, and FIG. 25(f) shows the probability density function using the estimator of group B. FIG. 25(d) shows the probability density function using the estimator calculated without group division. According to this configuration, the compression ratio is improved compared with the case where the data values are not divided into groups based on the variable dependency information.
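The group division can be sketched in Python as follows, with mass as the variable dependency information and the mean and standard deviation as the per-group estimators of a normal distribution model. The function names and the parallel-list data layout are assumptions for illustration.

```python
from collections import defaultdict
from statistics import fmean, pstdev


def split_by_mass(velocities, masses):
    """Group velocity values by the mass of the particle they belong to."""
    groups = defaultdict(list)
    for v, m in zip(velocities, masses):
        groups[m].append(v)
    return dict(groups)


def group_estimators(groups):
    """One (mean, standard deviation) pair per mass group."""
    return {m: (fmean(vs), pstdev(vs)) for m, vs in groups.items()}
```

Each group then gets its own, narrower model instead of a single mixture-shaped fit, which is the effect illustrated by FIG. 25(e) and FIG. 25(f) compared with FIG. 25(d).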

[Modification 6]
FIG. 26 is a diagram showing the functional configuration of a modification. This modification is an example in which a sampling unit 36 is added to the above embodiment. The sampling unit 36 (an example of a sampling means) extracts the data used to calculate the estimator. In the above embodiment, the estimator is calculated from 50,000 particles; in this modification, for example, 1,000 samples are extracted from the 50,000 particles. The extracted samples are input to the estimator calculation unit 22, and the estimator is calculated from these samples. For data other than the extracted samples, the estimator is not calculated, and encoding is performed using the code table. According to this configuration, the amount of calculation for the estimator is reduced and memory is saved compared with a configuration in which the estimator is calculated from all of the data.
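A minimal Python sketch of the sampling step follows. It assumes uniform random sampling without replacement and a normal distribution model whose estimators are the mean and standard deviation. The sampling method, the seed parameter, and the function name are assumptions; the text does not specify how the samples are chosen.

```python
import random
from statistics import fmean, pstdev


def estimate_from_sample(values, sample_size, seed=0):
    """Estimate (mean, standard deviation) from a random subset of the data."""
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)
    return fmean(sample), pstdev(sample)
```

The estimators obtained this way are approximations of those computed from all data, so the resulting code is slightly less well matched to the data in exchange for the lower computation and memory cost.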

[Modification 7]
FIG. 27 is a diagram showing the functional configuration of a modification. In this modification, a precision dividing unit 37 and a precision integrating unit 51 are added to the above embodiment, a first coding unit 241 and a second coding unit 242 are provided in place of the coding unit 24, and a first decoding unit 421 and a second decoding unit 422 are provided in place of the decoding unit 42. When the precision of the input data is high, when the number of data is insufficient relative to the number of significant digits of the input data, or when the variance of the data is large, the empirical distribution of the input data becomes sparse, and the coding efficiency relative to the amount of calculation of the variable-length coding decreases. Therefore, in this modification, each data value is divided into upper digits and lower digits, and only the upper digits are subjected to variable-length coding. Specifically, the precision dividing unit 37 (an example of a precision dividing means) divides the input data into upper digits and lower digits based on the number of data and the number of significant digits of the input data.

FIG. 28 is a diagram showing input data. (a) shows the input data v_i" before division. (b) shows the upper digits v_high,i" defined by equation (14). (c) shows the lower digits v_low,i" defined by equation (15). The upper digits are input to the first coding unit 241, and the lower digits are input to the second coding unit 242. The first coding unit 241 encodes the upper digits v_high,i" into a first code with a variable-length code. The second coding unit 242 encodes the lower digits v_low,i" into a second code with a fixed-length code. The first code is input to the first decoding unit 421, and the second code is input to the second decoding unit 422. The first decoding unit 421 restores the upper digits by decoding the first code based on the appearance probabilities. The second decoding unit 422 restores the lower digits by decoding the second code. The precision integrating unit 51 integrates the restored upper digits and lower digits. According to this configuration, when the empirical distribution of the input data is sparse, the amount of calculation required to design the variable-length code is reduced compared with the case where all digits are input to the coding means.
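Equations (14) and (15) are not reproduced in this text. The Python sketch below assumes integer data values and a split by floor division at a power of ten, so that the lower part is always non-negative and fits a fixed number of bits. The handling of negative values, the function names, and the `low_digits` parameter are assumptions and may differ from the equations in the original.

```python
def split_digits(value: int, low_digits: int) -> tuple[int, int]:
    """Return (upper, lower) with value == upper * 10**low_digits + lower."""
    return divmod(value, 10 ** low_digits)


def fixed_length_bits(lower: int, low_digits: int) -> str:
    """Fixed-length binary code wide enough for any lower part."""
    width = (10 ** low_digits - 1).bit_length()
    return format(lower, f"0{width}b")


def merge_digits(upper: int, lower: int, low_digits: int) -> int:
    """Precision integration: rebuild the original value after decoding."""
    return upper * 10 ** low_digits + lower
```

Only the upper part would go through the estimator calculation and the variable-length coding, so the number of distinct symbols for which a code must be designed shrinks by a factor of 10 to the power `low_digits`.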

[Modification 8]

[Modification 9]

[Modification 10]

[Modification 11]

[Modification 12]

[Modification 13]
The program for causing a computer to execute the above processing may be provided in a state of being persistently stored in a computer-readable recording medium such as an optical recording medium or a semiconductor memory, or may be provided via a communication network such as the Internet. When the program according to the present invention is provided in a state of being persistently stored in a recording medium, the computer reads the program from the recording medium and uses it. When the program according to the present invention is provided via a communication network, the computer receives the program from the distribution source device and uses it.

1 ... information processing device, 11 ... arithmetic unit, 12 ... storage unit, 13 ... communication unit, 14 ... operation unit, 15 ... display unit, 21 ... distribution model selection unit, 22 ... estimator calculation unit, 23 ... appearance probability calculation unit, 24 ... coding unit, 241 ... first coding unit, 242 ... second coding unit, 31 ... integer conversion unit, 32 ... difference calculation unit, 33 ... frame dividing unit, 34 ... precision adjusting unit, 35 ... group dividing unit, 36 ... sampling unit, 37 ... precision dividing unit, 41 ... appearance probability calculation unit, 42 ... decoding unit, 421 ... first decoding unit, 422 ... second decoding unit, 51 ... precision integrating unit

Claims

A coding device comprising:
an estimator calculating means for calculating, from input data and a distribution model of the input data, an estimator of a population parameter of the probability distribution of the input data;
an appearance probability calculating means for calculating an appearance probability of a data value included in the input data from the probability distribution using the calculated estimator;
a coding means for variable-length coding the data value using the calculated appearance probability; and
an output means for outputting profile information specifying the distribution model and the estimator as an incidental code attached to the data value variable-length coded by the coding means.

The coding device according to claim 1, further comprising an integer conversion means for converting the input data represented by a fixed-point number or a floating-point number into input data represented by an integer, and inputting the input data represented by an integer to the estimator calculating means and the coding means.

The coding device according to claim 1 or 2, further comprising a difference calculating means for calculating a difference of the input data in the time direction and inputting the difference, in place of the input data, to the estimator calculating means and the coding means.

The coding device according to any one of claims 1 to 3, further comprising a frame dividing means for dividing the input data into frames for each time step and inputting the input data to the estimator calculating means and the coding means frame by frame.

The coding device according to any one of claims 1 to 4, further comprising a precision adjusting means for adjusting the precision of the input data according to a designated precision and inputting the adjusted input data to the estimator calculating means and the coding means.

The coding device according to any one of claims 1 to 5, further comprising a group dividing means for dividing the data values into groups based on variable dependency information related to the data values and inputting the input data to the estimator calculating means and the coding means group by group.

The coding device according to any one of claims 1 to 6, further comprising a sampling means for extracting samples from the input data and inputting the extracted samples to the estimator calculating means.

3. The encoding device according to any one item.

A decoding device comprising:
an input means for inputting an incidental code indicating profile information specifying a distribution model of input data and an estimator of a population parameter of the probability distribution of the input data;
an appearance probability calculating means for calculating an appearance probability of a data value included in the input data, using the distribution model specified by, and the estimator restored from, the incidental code input by the input means; and
a decoding means for decoding the variable-length coded data value using the calculated appearance probability.

A program for causing a computer to function as:
an estimator calculating means for calculating, from input data and a distribution model of the input data, an estimator of a population parameter of the probability distribution of the input data;
an appearance probability calculating means for calculating an appearance probability of a data value included in the input data from the probability distribution using the calculated estimator;
a coding means for variable-length coding the data value using the calculated appearance probability; and
an output means for outputting profile information specifying the distribution model and the estimator as an incidental code attached to the data value variable-length coded by the coding means.

A program for causing a computer to function as:
an input means for inputting an incidental code indicating profile information specifying a distribution model of input data and an estimator of a population parameter of the probability distribution of the input data;
an appearance probability calculating means for calculating an appearance probability of a data value included in the input data, using the distribution model specified by, and the estimator restored from, the incidental code input by the input means; and
a decoding means for decoding the variable-length coded data value using the calculated appearance probability.