JP3350385B2

JP3350385B2 - Code generation method and coding method

Info

Publication number: JP3350385B2
Application number: JP01116497A
Authority: JP
Inventors: 隆昭林
Original assignee: Kyocera Corp
Current assignee: Kyocera Corp
Priority date: 1997-01-24
Filing date: 1997-01-24
Publication date: 2002-11-25
Anticipated expiration: 2017-01-24
Also published as: JPH10209877A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a method of encoding data, and more particularly to a method of encoding numerical data represented by natural numbers from 1 to N.

[0002]

[Prior Art] Several methods have been proposed for encoding data represented by natural numbers. One of them, a code for representing the natural numbers from 1 to N, was developed by Fiala and Greene and is called the SSS code. The SSS code describes large natural numbers with long codes and small natural numbers with short codes. This property is useful when the exact occurrence probabilities of the symbols to be encoded are unknown but small values are known to occur more often than large values.

The SSS code has three bit-length parameters (start, step, stop). The natural numbers from 1 to 2^start (the operator ^ denotes exponentiation) are represented by code bits of bit length start. For larger values the bit length is increased by step, and code bits of (step + start) bits are allocated. The values that can be represented with (step + start) bits range from 2^start + 1 to (2^(2*step + start) - 2^start)/(2^step - 1) (the operator * denotes multiplication). To assign codes to still larger values, the bit length is increased by step again, and code bits of (2*step + start) bits are allocated. In this way values up to (2^(3*step + start) - 2^start)/(2^step - 1) can be represented. For even larger values the bit length is increased by step at a time, and codes are assigned until the bit length finally reaches stop. The SSS code can therefore assign codes to values up to (2^(stop + step) - 2^start)/(2^step - 1).

The code bits of the SSS code alone do not identify each code uniquely. To make the individual codes identifiable, identification bits representing k are placed before code bits of (k*step + start) bits. The identification bits consist of k consecutive 0 bits followed by a 1. When the bit length is stop, however, the final 1 of the identification bits can be omitted. As examples of the SSS code, FIG. 7 shows the cases where (start, step, stop) is (0, 1, 5) and (1, 2, 5).

The CBT code has likewise been proposed as a code for representing natural numbers. When the values to be encoded are the natural numbers from 1 to N, a fixed-length code has code length k = <log2(N)> (where <x> denotes the smallest integer not less than x). It is known that a fixed-length code is redundant except when N is a power of two. To reduce this redundancy, the CBT code represents a value n with a variable-length code as follows:

when 1 ≤ n ≤ 2^k - N, the value n - 1 is represented in k - 1 bits;
when 2^k - N < n ≤ N, the value n + 2^k - N - 1 is represented in k bits.

FIG. 8 shows an example of the CBT code for N = 10.
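
The two prior-art codes described above can be sketched as follows. This is only an illustrative reading of the text, not code from the patent: the function names `bits`, `sss_encode` and `cbt_encode` are made up for the example, and `sss_encode` assumes that stop equals start plus a whole multiple of step.

```python
def bits(value: int, width: int) -> str:
    """Return value as a zero-padded binary string of exactly `width` bits."""
    return format(value, "b").zfill(width) if width > 0 else ""


def sss_encode(n: int, start: int, step: int, stop: int) -> str:
    """SSS code of n: k zeros, a terminating 1 (omitted at width stop), then code bits."""
    if n < 1:
        raise ValueError("n must be a natural number")
    base = 0        # how many values are covered by the shorter levels
    k = 0           # level index, i.e. number of leading 0 bits
    width = start   # code bit length of the current level
    while True:
        size = 2 ** width
        if n <= base + size:
            prefix = "0" * k + ("" if width == stop else "1")
            return prefix + bits(n - base - 1, width)
        if width >= stop:
            raise ValueError("n exceeds the largest value the code can represent")
        base += size
        k += 1
        width += step


def cbt_encode(n: int, N: int) -> str:
    """CBT code of n in 1..N: k-1 bits for the first 2^k - N values, k bits for the rest."""
    if not 1 <= n <= N:
        raise ValueError("n must be in 1..N")
    k = (N - 1).bit_length()      # k = <log2(N)>, computed without floating point
    short = 2 ** k - N            # number of values that get k - 1 bits
    if n <= short:
        return bits(n - 1, k - 1)
    return bits(n + short - 1, k)


print(sss_encode(16, 0, 1, 5))   # 000010000
print(cbt_encode(7, 10))         # 1100
```

With SSS(0, 1, 5) the largest encodable value is 63, which agrees with (2^(stop + step) - 2^start)/(2^step - 1).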

[0003]

[Problems to Be Solved by the Invention] In general, when the natural numbers from 1 to N are encoded, the SSS code has the problem that the resulting code is redundant. For example, when the natural numbers up to N = 37 are encoded with the SSS(0, 1, 5) code of FIG. 7, the values 32 to 37, whose code bit length is stop, are encoded as follows:

Value   SSS(0, 1, 5) code
32      00000 00000
33      00000 00001
34      00000 00010
35      00000 00011
36      00000 00100
37      00000 00101

Clearly, at most 3 code bits would be enough here to form a uniquely decodable code. As this example shows, the SSS code is redundant except when N is exactly the largest value that the SSS code can encode.

One conceivable way of reducing this redundancy is to apply the CBT code to the last m values, whose code bit length is stop. In the above example 2^k - N = 2, so the codes for the values 32 and 33 have 2 code bits and the codes for the values 34 to 37 have 3 code bits. The codes become:

Value   Code
32      00000 00
33      00000 01
34      00000 100
35      00000 101
36      00000 110
37      00000 111

Compared with the fixed-length code, the code bit length for the values 32 and 33 is reduced by 1 bit.

Applying the CBT code reduces the redundancy of the code, but it creates another problem. The codes for the values 16 to 31 are:

Value      Code
16 to 31   00001xxxx (x is 0 or 1)

so their code length is 9 bits. When the CBT code is used, the code lengths of the values 32 to 37 are therefore shorter than the code lengths of the values 16 to 31. The SSS code is used on the premise that smaller values occur with higher probability, so this second method is not optimal from the standpoint of entropy.

An object of the present invention is to provide a code generation method that, for natural numbers up to an arbitrary N, generates a code with little redundancy in which the bit length for a smaller value never exceeds the bit length for a larger value, and to provide an encoding method that performs encoding using that code.
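
The length inversion described above can be reproduced with a short sketch. It is hard-wired to the example in the text, SSS(0, 1, 5) with the CBT code applied to the last level; the names `bits`, `cbt_bits` and `sss_cbt_tail` are illustrative only, and the shortcut `n.bit_length() - 1` for the level index is valid only for start = 0 and step = 1.

```python
def bits(value: int, width: int) -> str:
    """Return value as a zero-padded binary string of exactly `width` bits."""
    return format(value, "b").zfill(width) if width > 0 else ""


def cbt_bits(i: int, count: int) -> str:
    """CBT code bits for the zero-based index i among `count` values."""
    k = (count - 1).bit_length()
    short = 2 ** k - count          # values that get k - 1 bits
    return bits(i, k - 1) if i < short else bits(i + short, k)


def sss_cbt_tail(n: int, N: int = 37) -> str:
    """SSS(0, 1, 5) for 1..31, CBT code bits for the last level 32..N (32 <= N <= 63)."""
    level = n.bit_length() - 1      # level k of SSS(0, 1, 5) covers 2^k .. 2^(k+1) - 1
    if level < 5:
        return "0" * level + "1" + bits(n - 2 ** level, level)
    return "00000" + cbt_bits(n - 32, N - 31)


for n in (16, 31, 32, 33, 34, 37):
    code = sss_cbt_tail(n)
    print(n, code, len(code))
```

The values 16 to 31 come out at 9 bits, while 32 and 33 take 7 bits and 34 to 37 take 8 bits, which is the ordering problem the invention sets out to remove.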

[0004]

[Means for Solving the Problems] FIG. 1 shows the basic flow of the code generation method for achieving the above object.

101 is a step of determining predetermined numbers N1, N2, ..., Nm, ..., NM (0 < N1 < N2 < ... < Nm < ... < NM < N) and dividing the natural numbers from 1 to N by these numbers into the intervals (0, N1], (N1, N2], ..., (NM, N]. Here "(" and ")" denote an open end of an interval, and "[" and "]" a closed end. The sizes of the intervals (0, N1], ..., (N(M-1), NM] are set to powers of two. In particular, an initial code bit length a is set, the predetermined number Nm is set to the value (2^a)*(2^m - 1), and the largest predetermined number NM is set according to the value of the largest integer M that satisfies (2^a)*(3*2^(M-1) - 1) ≤ N.

102 is a step of generating, for each interval, identification bits for identifying the interval that contains an arbitrary natural number. In particular, the identification bits represent the bit length of the code bits that represent the natural numbers contained in each of the intervals (0, N1], (N1, N2], ..., (N(M-1), NM].

103 is a step of assigning fixed-length code bits to the natural numbers contained in each of the intervals (0, N1], (N1, N2], ..., (N(M-1), NM]. In particular, code bits of fixed bit lengths a, a+1, ..., a+M-1 are assigned to the respective intervals.

104 is a step of assigning variable-length code bits to the natural numbers contained in the interval (NM, N]. In particular, for the interval (NM, N], code bits of bit length a+M are assigned to 2^k - N natural numbers (where k = <log2(N - NM)>), and code bits of bit length a+M+1 are assigned to 2*N - 2^k natural numbers.

FIG. 2 shows the basic flow of the encoding method for achieving the above object.

201 is a step of inputting data n (1 ≤ n ≤ N) to be encoded.

202 is a step of assigning identification bits that identify to which of the intervals (0, N1], (N1, N2], ..., (NM, N], divided by the predetermined numbers N1, N2, ..., Nm, ..., NM (0 < N1 < N2 < ... < Nm < ... < NM < N), the data n belongs.

203 is a step of determining whether the data n belongs to the interval (NM, N].

204 is a step of assigning fixed-length code bits to the data n when the determination shows that the data n belongs to one of the intervals (0, N1], (N1, N2], ..., (N(M-1), NM].

205 is a step of assigning variable-length code bits to the data n when the data n belongs to the interval (NM, N].

206 is a step of outputting the identification bits and the code bits as the code representing the data n.

Thus, according to the code generation method of the present invention, making the code for the interval (NM, N] variable-length reduces the redundancy that arises when natural number data is encoded. In addition, by setting NM to the value (2^a)*(2^M - 1) according to the largest integer M that satisfies (2^a)*(3*2^(M-1) - 1) ≤ N, a code can be generated in which the code length for a smaller natural number never exceeds the code length for a larger natural number. Further, according to the encoding method of the present invention, natural number data can be encoded efficiently using the code generated by the code generation method.
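
The choice of the predetermined numbers in step 101 can be illustrated with a small sketch. The function name `boundaries` is hypothetical; the two formulas, Nm = (2^a)*(2^m - 1) and the condition (2^a)*(3*2^(M-1) - 1) ≤ N, are taken from the text.

```python
def boundaries(N: int, a: int) -> list[int]:
    """Return [N1, ..., NM] for the largest M with 2^a * (3 * 2^(M-1) - 1) <= N."""
    result = []
    m = 1
    # The left-hand side grows with m, so the loop stops at the largest valid M.
    while (2 ** a) * (3 * 2 ** (m - 1) - 1) <= N:
        result.append((2 ** a) * (2 ** m - 1))   # Nm = 2^a * (2^m - 1)
        m += 1
    return result


for a in (0, 2):
    cuts = boundaries(37, a)
    edges = [0] + cuts + [37]
    sizes = [hi - lo for lo, hi in zip(edges, edges[1:])]
    print(f"a={a}: N1..NM={cuts}, interval sizes={sizes}")
```

For N = 37 this gives the cuts 1, 3, 7, 15 when a = 0 and 4, 12 when a = 2. In both cases every interval except the last has a power-of-two size, and the last interval is at least as large as the one before it.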

[0005]

[Embodiments of the Invention] An embodiment of the present invention is described with reference to the drawings. The code generation method of the present invention is described first. FIG. 3 shows the flow of an embodiment of the code generation method.

301 is a step of initializing the parameters. There are four parameters: the initial code bit length a, the bit length increment L, the interval number m, and the number Nm that defines the range of the interval identified by the interval number m (written as interval Rm). The initial code bit length a is the code bit length of the codes for the natural numbers contained in the interval (0, N1]. It is set to a suitable value based on a prediction of the occurrence probabilities of the data to be encoded. The interval Rm denotes the range (N(m-1), Nm], where N0 = 0 and N(M+1) = N. The bit length increment L is the increase in the code bit length of each interval relative to the initial code bit length a; that is, the code bit length of each interval is a + L. As initial values, the parameters are set to L = 0, m = 1 and N0 = 0.

In this embodiment, codes are generated so that the sizes of the intervals R1, R2, R3, ..., RM are 2^a, 2^(a+1), 2^(a+2), ..., 2^(a+M-1), respectively. Since Nm is the largest number in the interval Rm, its value is the sum of the sizes of the intervals R1 to Rm. This is the sum of a geometric series with first term 2^a and common ratio 2, so Nm = (2^a)*(2^m - 1).

302 is a step of comparing the largest natural number N with the value (2^a)*(3*2^(m-1) - 1), which is determined by the initial code bit length a and the interval number m. If N is not smaller than this value, steps 303 to 306 are executed to assign fixed-length code bits to the natural numbers of the interval Rm, and the process returns to 302 to generate the code for the next interval. Otherwise, steps 307 to 309 are executed to assign variable-length code bits to the natural numbers of the interval Rm, and the process ends.

The reason for comparing N with (2^a)*(3*2^(m-1) - 1) in the conditional branch of 302 is as follows. The size of the interval R(M-1) is 2^(a+M-1), so the code bit length of the interval R(M-1) is a+M-1 bits. If the size of the last interval RM were smaller than 2^(a+M-1), the code bits of the interval RM could be represented in fewer than a+M-1 bits. The size of the interval RM must therefore be at least the size of the interval R(M-1).

FIG. 4(a) illustrates, on a number line, the state when codes have been generated up to the interval R(m-1). Since Nm = (2^a)*(2^m - 1), the range of the interval Rm is normally (2^a*(2^(m-1) - 1), 2^a*(2^m - 1)]. However, as shown in FIG. 4(a), when the size N - 2^a*(2^m - 1) of the interval (2^a*(2^m - 1), N] is smaller than the size 2^(a+m-1) of the interval (2^a*(2^(m-1) - 1), 2^a*(2^m - 1)], that is, when N - 2^a*(2^m - 1) < 2^(a+m-1), the range of Rm becomes (N(m-1), Nm] as shown in FIG. 4(b), with N(m-1) = 2^a*(2^(m-1) - 1) and Nm = N. Conversely, as shown in FIG. 4(c), when the size of the interval (2^a*(2^m - 1), N] is larger than the size of the interval (2^a*(2^(m-1) - 1), 2^a*(2^m - 1)], the range of Rm becomes the (N(m-1), Nm] shown in FIG. 4(d).

303 is a step of calculating Nm in order to determine the range of the interval Rm to which fixed-length code bits are assigned. As described above, Nm is obtained from Nm = (2^a)*(2^m - 1).

304 is a step of setting the identification bits for the interval Rm. The identification bits represent the code length of the code bits of each interval. Any uniquely decodable bit string may be used as the identification bits. In this embodiment, the bit length increment L is encoded and used as the identification bits: L bits of 0 are set, followed by a bit 1 that marks the end of the identification bits. The identification bits for L = 0, 1, 2, ... are therefore 1, 01, 001, .... The code bit length of an interval represented by such identification bits is the number of 0 bits in the identification bits plus the initial code bit length a.

305 is a step of setting the code bits for the natural numbers of the interval Rm. A fixed-length code of a + L bits is assigned to the 2^(a+L) natural numbers contained in the interval Rm. For example, for a natural number n to be encoded, the bit string that represents the value n - N(m-1) - 1 in a + L bits is used as the code bits of n.

306 is a step of updating the parameters. To generate the code for the next interval, the interval number m is increased by 1, and the bit length increment L is increased by 1 accordingly.

Executing steps 302 to 306 completes the generation of the code for one interval. After step 306, the process returns to the conditional branch of 302. When the comparison there shows that N is smaller than the value (2^a)*(3*2^(m-1) - 1), the code for the last interval is generated and the process ends.

To generate the code for the last interval (N(m-1), N], the identification bits are first set in step 307. Here L bits of 0 are set but, unlike in 304, the bit 1 that marks the end of the identification bits is not needed. Because this is the last interval, no further 0 bits can follow in the identification bits.

Code bits are assigned next. In this embodiment the CBT code is used as the variable-length code. 308 is a step of determining the code length of the CBT code. The size of the interval RM is at least 2^(a+L) and less than 2^(a+L+1), so the code lengths of the CBT code are a + L and a + L + 1. The interval RM is therefore encoded with the code length k = a + L.

309 is a step of setting the code bits for the natural numbers of the interval RM based on the CBT code. Since the size of the interval RM is N - N(m-1), n is encoded in a + L bits when N(m-1) ≤ n ≤ N(m-1) + 2^k - N, and in a + L + 1 bits when N(m-1) + 2^k - N < n ≤ N.

FIG. 9 shows codes created according to this embodiment, for the cases a = 0 and a = 2 with N = 37.

Next, an encoding method that encodes natural number data using the code created by the code generation method described above is explained. FIG. 5 shows the flow of an embodiment of the encoding method of the present invention.

501 is a step of inputting the data n to be encoded. n is restricted to the range 1 to N.

502 is a step of finding, from the value of the data n, the interval Rm to which n belongs, and obtaining the identification bits I(m) of that interval and the parameters used for encoding. The encoding parameters are the code bit length L(m) and the first data N(m) of the interval Rm. They are, respectively, the sum of the initial code bit length and the code bit increment, and the smallest data belonging to the interval. In this embodiment the identification bits and the encoding parameters are looked up in the table shown in FIG. 6. Because the identification bits and the encoding parameters take the same values for all data contained in the interval Rm, the table is arranged so that the entry for the interval Rm to which n belongs can be referenced from the input data n.

503 is a step of determining whether the data n belongs to the last interval (NM, N]. For this purpose, this embodiment computes the logical OR of all the bits that make up the identification bits I(m). As is clear from the code generation process, every interval other than the last interval (NM, N] contains a bit 1 that marks the end of the identification bits, so its logical OR is 1. All the bits that make up the identification bits of the interval (NM, N] are 0, so its logical OR is 0. Hence, when the logical OR is 0 the data n belongs to the interval (NM, N], and when it is 1 the data n belongs to one of the other intervals.

504 is a step of assigning code bits C(n) to the data n when the result of 503 is false (that is, the logical OR is 1). The bit length of the code bits C(n) is then L(m). In this embodiment, the value n - N(m) lies between 0 and 2^L(m) - 1, so the value n - N(m) is encoded directly as a binary string of L(m) bits.

505 is a step of assigning code bits C(n) to the data n when the result of 503 is true (that is, the logical OR is 0). As in step 504 the value n - N(m) is computed. It lies between 0 and N - N(m), so the CBT code is applied to the value n - N(m). That is, with the constant k = 2^(L(m)+1) - (N - N(m) + 1), the value n - N(m) is encoded as a binary string of L(m) bits when n - N(m) < k, and the value n - N(m) + k is encoded as a binary string of L(m) + 1 bits when n - N(m) ≥ k.

506 is a step of concatenating the identification bits I(m) and the code bits C(n) and outputting them as the code representing the data n.
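
The generation flow of FIG. 3 and the encoding flow of FIG. 5 can be sketched together as below. This is an illustrative reading of the description rather than the patent's own implementation. The names `Interval`, `build_table`, `encode` and `bits` are made up, and the table layout only stands in for FIG. 6, which is not reproduced in the text. One deliberate assumption: the shorter CBT length of the last interval is derived from the interval's actual size instead of being taken as a + L, so that the sketch also behaves sensibly when the last interval is smaller than 2^(a+L).

```python
from typing import NamedTuple


class Interval(NamedTuple):
    first: int     # smallest value in the interval, N(m) in step 502
    last: int      # largest value in the interval
    id_bits: str   # identification bits I(m)
    length: int    # code bit length L(m); for the last interval, the shorter CBT length


def bits(value: int, width: int) -> str:
    """Return value as a zero-padded binary string of exactly `width` bits."""
    return format(value, "b").zfill(width) if width > 0 else ""


def build_table(N: int, a: int) -> list[Interval]:
    table = []
    m, L, prev = 1, 0, 0                                  # step 301
    while (2 ** a) * (3 * 2 ** (m - 1) - 1) <= N:         # step 302
        Nm = (2 ** a) * (2 ** m - 1)                      # step 303
        table.append(Interval(prev + 1, Nm, "0" * L + "1", a + L))   # steps 304, 305
        prev, m, L = Nm, m + 1, L + 1                     # step 306
    size = N - prev                                       # size of the last interval
    short_len = max((size - 1).bit_length() - 1, 0)       # step 308 (assumption, see above)
    table.append(Interval(prev + 1, N, "0" * L, short_len))          # step 307
    return table


def encode(n: int, N: int, table: list[Interval]) -> str:
    row = next(r for r in table if r.first <= n <= r.last)           # step 502
    offset = n - row.first                                           # n - N(m)
    if "1" in row.id_bits:                                # step 503: OR of the bits is 1
        code = bits(offset, row.length)                   # step 504: fixed length
    else:                                                 # step 505: CBT code
        k = 2 ** (row.length + 1) - (N - row.first + 1)
        if offset < k:
            code = bits(offset, row.length)
        else:
            code = bits(offset + k, row.length + 1)
    return row.id_bits + code                             # step 506


table = build_table(37, 0)
for n in (1, 2, 15, 16, 26, 37):
    print(n, encode(n, 37, table))
```

For a = 0 and N = 37 the table has the intervals 1, 2 to 3, 4 to 7, 8 to 15 and 16 to 37. The last interval holds 22 values, of which the first 10 receive 4 code bits and the remaining 12 receive 5 code bits.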

[0006]

[Effects of the Invention] As described above, according to the code generation method of claims 1 to 7, natural number data of any size represented by 1 to N can be encoded so that the code length for a smaller value is always equal to or shorter than the code length for a larger value, and so that the redundancy of the code is minimized. This makes it possible to generate an efficient code for natural number data whose exact probability distribution is unknown but for which smaller values are known to occur with higher probability. Further, according to the encoding method of claim 8, natural number data can be encoded efficiently using the code generated by the above code generation method.
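
The two properties stated above can be checked numerically. The sketch below is an illustration under the same reading of the method as in the description; the function names `code_length` and `check` are hypothetical. It computes only code lengths, then tests that they never decrease as n grows and that the Kraft sum (the sum of 2^(-length) over all codes) equals exactly 1, which means no code space is left unused.

```python
from fractions import Fraction


def code_length(n: int, N: int, a: int) -> int:
    """Total length (identification bits plus code bits) of the code for n in 1..N."""
    m, L, prev = 1, 0, 0
    while (2 ** a) * (3 * 2 ** (m - 1) - 1) <= N:
        Nm = (2 ** a) * (2 ** m - 1)
        if n <= Nm:
            return (L + 1) + (a + L)       # L zeros and a 1, then a + L code bits
        prev, m, L = Nm, m + 1, L + 1
    size = N - prev                        # last interval: L zeros, then CBT code bits
    k = (size - 1).bit_length()
    short = 2 ** k - size                  # values that get k - 1 code bits
    return L + (k - 1 if n - prev <= short else k)


def check(N: int, a: int) -> tuple[bool, Fraction]:
    lengths = [code_length(n, N, a) for n in range(1, N + 1)]
    monotone = all(x <= y for x, y in zip(lengths, lengths[1:]))
    kraft = sum(Fraction(1, 2 ** length) for length in lengths)
    return monotone, kraft


print(check(37, 0))
print(check(37, 2))
```

Exact fractions are used for the Kraft sum so that the comparison with 1 is not affected by floating-point rounding.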

[Brief description of the drawings]

FIG. 1 is a diagram showing the basic flow of the code generation method of the present invention.

FIG. 2 is a diagram showing the basic flow of the encoding method of the present invention.

FIG. 3 is a diagram showing the flow of an embodiment of the code generation method of the present invention.

FIG. 4 is a diagram for explaining the division of natural number data.

FIG. 5 is a diagram showing the flow of an embodiment of the encoding method of the present invention.

FIG. 6 is a diagram showing an example of the table used in step 502 of FIG. 5.

FIG. 7 is a diagram showing code examples of the SSS code.

FIG. 8 is a diagram showing a code example of the CBT code.

FIG. 9 is a diagram showing code examples produced by the code generation method of the present invention.

[Explanation of symbols]

101 natural number

Claims

(57) [Claims]

1. A code generation method for generating codes corresponding to the natural numbers from 1 to N, comprising: a first step of determining predetermined numbers N1, N2, ..., Nm, ..., NM (0 < N1 < N2 < ... < Nm < ... < NM < N) and dividing the natural numbers from 1 to N by the predetermined numbers into the intervals (0, N1], (N1, N2], ..., (NM, N]; a second step of generating, for the intervals, identification bits for identifying the interval that contains an arbitrary natural number; a third step of assigning fixed-length code bits to the natural numbers contained in each of the intervals (0, N1], (N1, N2], ..., (N(M-1), NM]; and a fourth step of assigning variable-length code bits to the natural numbers contained in the interval (NM, N].

2. The code generation method according to claim 1, wherein the size of each of the intervals (0, N1], (N1, N2], ..., (N(M-1), NM] is a power of two.

3. The code generation method according to claim 2, wherein the first step sets an initial code bit length a, sets the predetermined number Nm to the value (2^a)*(2^m - 1), and sets the largest predetermined number NM according to the value of the largest integer M that satisfies (2^a)*(3*2^(M-1) - 1) ≤ N.

4. The code generation method according to claim 1, wherein, in the second step, the identification bits represent the bit length of the code bits that represent the natural numbers contained in each of the intervals (0, N1], (N1, N2], ..., (N(M-1), NM].

5. The code generation method according to claim 4, wherein the identification bits for each of the intervals (0, N1], (N1, N2], ..., (N(M-1), NM] consist of a number of first bits equal to the difference between the bit length of the code bits representing the natural numbers contained in that interval and the initial code bit length a, followed by one second bit, and the identification bits for the interval (NM, N] are obtained by replacing the second bit of the identification bits of the interval (N(M-1), NM] with the first bit.

6. The code generation method according to claim 2, wherein the third step assigns the fixed-length code bits of bit lengths a, a+1, ..., a+M-1 to the intervals (0, N1], (N1, N2], ..., (N(M-1), NM], respectively.

7. The code generation method according to claim 2, wherein the fourth step assigns, for the interval (NM, N], code bits of bit length a+M to 2^k - N natural numbers (where k = <log2(N - NM)>) and code bits of bit length a+M+1 to 2*N - 2^k natural numbers.

8. An encoding method for encoding data associated with the natural numbers from 1 to N, comprising: a fifth step of inputting data n (1 ≤ n ≤ N) to be encoded; a sixth step of assigning identification bits for identifying to which of the intervals (0, N1], (N1, N2], ..., (NM, N], divided by predetermined numbers N1, N2, ..., Nm, ..., NM (0 < N1 < N2 < ... < Nm < ... < NM < N), the data n belongs; a seventh step of determining whether the data n belongs to the interval (NM, N]; an eighth step of assigning to the data n, when the data n belongs to one of the intervals (0, N1], (N1, N2], ..., (N(M-1), NM], code bits whose length is fixed within each interval; a ninth step of assigning to the data n, when the data n belongs to the interval (NM, N], code bits whose length is variable within that interval; and a tenth step of outputting the identification bits and the code bits as the code representing the data n.