JP2602641B2

JP2602641B2 - Audio coding method

Info

Publication number: JP2602641B2
Application number: JP60036628A
Authority: JP
Inventors: 政彦笹岡; 寛平山
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1985-02-27
Filing date: 1985-02-27
Publication date: 1997-04-23
Anticipated expiration: 2012-04-23
Also published as: JPS61198200A

Description

DETAILED DESCRIPTION OF THE INVENTION

[Overview]

A spectrum envelope information extraction circuit extracts the frequency characteristic unique to a digital speech signal, and a sound source information extraction circuit extracts sound source information such as the amplitude, the pitch period, and the sound/silence distinction. From this extracted information, a silence detection circuit detects the number of consecutive silent analysis sections of the speech signal. Whether silence compression is possible is judged from this count; consecutive silence data are then converted into silence information plus consecutive-count information, so that the silent sections are compressed and the speech is coded.

[Industrial applications]

The present invention relates to a speech coding method in which a speech signal converted into a digital signal is coded and its silent sections are further compressed, reducing the amount of information to be stored.

Among the voice response methods adopted in various systems, known approaches include the editing method, which reproduces and edits recorded analog speech signals or digitized speech signals such as PCM signals, and the analysis-synthesis method, which analyzes a speech signal, converts it into parameters, and stores those parameters so as to reduce the required storage capacity. The LSP (line spectrum pair) method, one such analysis-synthesis method, analyzes a speech signal into frequency-domain parameters; the PARCOR method, which analyzes it into time-domain parameters, is also known.

[Conventional technology]

FIG. 5 is a block diagram of a conventional example for LSP analysis. In the figure, 1 is an input terminal for an analog speech signal, 2 is an output terminal for LSP data, 3' is an LSP analysis coding device, 4 is an A-D converter that converts the analog speech signal into a digital speech signal, 5 is a common bus, 6 is a processor (CPU) that controls each unit, 7 is a memory (MEM) that stores programs, data, and the like, 8 is a spectrum envelope information extraction unit, 9 is a storage unit, 10 is a sound source information extraction unit, 11 is an amplitude extraction unit, 12 is a pitch period extraction unit, 13 is a sound/silence determination unit, 14 is a storage unit, and 15 is an output editing unit that edits and outputs the analysis information.

The analog speech signal applied to the input terminal 1 is converted into a digital signal by the A-D converter 4. Under the control of the processor 6, the digital speech signal is transferred, either directly or via the memory 7, to the spectrum envelope information extraction unit 8 and the sound source information extraction unit 10. With the sampling period (frame period) and the analysis threshold (silence determination value) set, the spectrum envelope information extraction unit 8 extracts spectrum envelope information (LSP parameters) as the frequency characteristic unique to the input speech signal. In the sound source information extraction unit 10, the amplitude extraction unit 11 extracts the amplitude, the pitch period extraction unit 12 extracts the pitch period, and the sound/silence determination unit 13 extracts the result of the sound/silence determination. The storage units 9 and 14 store the setting values mentioned above and the data produced during extraction.

The output editing unit 15 edits and outputs the spectrum envelope information extracted by the spectrum envelope information extraction unit 8 and the sound source information (amplitude, pitch period, sound/silence distinction, and so on) extracted by the sound source information extraction unit 10. For any analysis section that the sound/silence determination unit 13 has judged silent because it is at or below the analysis threshold, the output editing unit forcibly inserts silence parameters.

FIG. 6 is an explanatory diagram of the LSP data produced by the analysis, showing the case of a 6-byte configuration. In the figure, SB is a stop bit, T is the pitch period, T1 and T0 are a code indicating which type of frame period (sampling period) is used, A is the amplitude, and ω1 to ω8 are the LSP parameters. The frame period is selected according to, for example, the length of the sound piece; the codes "00" to "10" can designate any one of three periods. The output editing unit 15 edits analysis results of this kind and outputs them from the output terminal 2 to a speech code storage device (not shown) or the like, where they are stored.

[Problems to be solved by the invention]

In conventional speech coding, for example LSP analysis, silent segments within the speech information are analyzed just like any other segment and stored as LSP data. Ordinary speech information often contains runs of consecutive silent sections, and even then an analysis result is output and stored for every silent section in the run. Digitizing and storing speech information therefore requires considerable storage capacity, much of which is wasted.

The object of the present invention is to compress such silent sections and thereby improve analysis and storage efficiency.

[Means for solving the problem]

Described with reference to FIG. 1, the speech coding method of the present invention converts an analog speech signal into a digital speech signal with an A-D converter 4 or the like, and comprises the following. A spectrum envelope information extraction unit 8 extracts, from the digitized speech signal, spectrum envelope information indicating the frequency characteristic unique to that signal. A sound source information extraction unit 10, which includes an amplitude extraction unit 11, a pitch period extraction unit 12, a sound/silence determination unit 13, and the like, extracts sound source information such as the amplitude, the pitch period, and the sound/silence distinction from the digitized speech signal. Compression processing means (inside the dash-dot line), which includes a silence detection unit 16, a silence compression availability determination unit 17, a silence compression unit 18, and the like, detects the number of consecutive silent frames from the spectrum envelope information and the sound source information, judges that silence compression is possible when the detected number exceeds the number of reference frames needed for a sound section at decoding time, and performs the silence compression.

When silence compression is judged possible, the compression processing means retains as many of the consecutive silent frames as the number of reference frames needed for the sound section at decoding time, and converts the data of the remaining consecutive silent frames into consecutive-count information for silent frames, thereby compressing the silent sections contained in the speech information. Reference numeral 15 denotes an output editing unit.

[Operation]

The spectrum envelope information extraction unit 8 extracts spectrum envelope information from the speech signal converted into a digital signal, and the sound source information extraction unit 10 extracts sound source information from the same signal. The compression processing means, shown inside the dash-dot line, includes the silence detection unit 16, the silence compression availability determination unit 17, the silence compression unit 18, and the like. It judges whether each frame is a sound frame or a silent frame and, for silent frames, detects the number of consecutive frames. If that number exceeds the number of reference frames needed for the sound section, silence compression is judged possible: as many silent frames as the number of reference frames are retained, and the data of the remaining consecutive silent frames are converted into consecutive-count information for silent frames. The larger the number of consecutive silent frames, that is, the longer the silent section, the more the speech information can be compressed.
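The relationship between run length and saving can be illustrated with a little arithmetic. The following is a minimal sketch, assuming (this is not stated in the text) that a single count record can represent the whole compressed part of a run regardless of its length; the function names are hypothetical.

```python
def records_after_compression(n_silent: int, n_reference: int) -> int:
    """Records needed for a run of n_silent silent frames.

    Assumption: one count record covers every compressed frame of the run,
    i.e. the count field is wide enough for the run length.
    """
    if n_silent <= n_reference:
        # Run does not exceed the reference frame count: not compressible.
        return n_silent
    # Retained reference frames plus one consecutive-count record.
    return n_reference + 1


def records_saved(n_silent: int, n_reference: int) -> int:
    """Records removed from the output by compressing the run."""
    return n_silent - records_after_compression(n_silent, n_reference)
```

Under these assumptions a run of 50 silent frames with 2 reference frames shrinks to 3 records, while a run only one frame longer than the reference count saves nothing.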

The output editing unit 15 edits together three kinds of data: the data of sound frames, which contain spectrum envelope information and sound source information; the silence data of the retained reference frames, which likewise contain spectrum envelope information and sound source information; and the consecutive-count information for the silent frames other than the reference frames. It then transfers the result to a speech code storage device or the like. Because silent-frame data for the number of reference frames needed by the sound section remain at decoding time, click noise can be suppressed.

[Embodiment]

An embodiment of the present invention is described in detail below with reference to the drawings.

FIG. 2 is a block diagram of an embodiment of the present invention. The same reference numerals as in FIG. 5 denote the same parts; 3 is an LSP analysis coding device, 16 is a silence detection unit, 17 is a silence compression availability determination unit, and 18 is a silence compression unit. These units are connected to the common bus 5 and controlled by the processor 6, which also performs various kinds of data processing.

The silence detection unit 16 uses the LSP data, which consist of the spectrum envelope information (LSP parameters) extracted by the spectrum envelope information extraction unit 8 and the sound source information extracted by the sound source information extraction unit 10, to detect whether a section is silent and how many silent sections are consecutive. For example, based on the LSP data shown in FIG. 6, a section whose amplitude A is zero or close to zero is treated as a silent section, and the number of consecutive such sections is counted.
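The detection rule described above can be sketched as follows. This is only an illustration: frames are reduced to their amplitude value A, and the `threshold` parameter standing in for "close to zero" is an assumption, since the text gives no numeric value.

```python
from typing import List, Sequence, Tuple


def silent_runs(amplitudes: Sequence[int], threshold: int = 0) -> List[Tuple[int, int]]:
    """Return (start_index, length) for each run of consecutive silent frames.

    A frame counts as silent when its amplitude A is at or below `threshold`
    (hypothetical parameter; 0 means "exactly zero").
    """
    runs: List[Tuple[int, int]] = []
    start = None  # index where the current silent run began, if any
    for i, amplitude in enumerate(amplitudes):
        if amplitude <= threshold:
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        # Input ended while still inside a silent run.
        runs.append((start, len(amplitudes) - start))
    return runs
```

For the amplitude sequence `[5, 0, 0, 0, 7, 0]` this reports one run of three silent frames starting at index 1 and one run of a single frame at index 5.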

When speech is reproduced from the coded speech information, a silent frame does not refer to the previous frame (analysis section), but a sound frame refers to the LSP parameters and sound source information of the previous frame. Because an abrupt change in the parameters usually produces click noise, it is desirable, in order to avoid this, to keep the one to several silent frames immediately preceding a sound section as reference frames even when silent frames are consecutive.

The silence compression availability determination unit 17 therefore judges a silent section compressible when the number of consecutive silent frames exceeds the number of reference frames, and incompressible when it does not.

When the silence compression availability determination unit 17 judges that silence compression is possible, the silence compression unit 18 retains the reference frames immediately preceding the sound section among the consecutive silent frames, and compresses the silent section by converting the LSP data of the other frames into silence information and consecutive-count information. The silent frames retained as reference frames are thus output as LSP data without compression and serve as reference frames for the sound section at decoding time, preventing click noise. The consecutive silent frames beyond the number of reference frames are compressed and output. When the determination unit 17 judges that compression is not possible because the run of silent frames is too short, the silence compression unit 18 does not operate, and the LSP data are output unchanged from the output editing unit 15 to the output terminal 2.
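A minimal sketch of this decision and compression for a single run of silent frames follows. The `("SILENCE_RUN", count)` tuple is a hypothetical stand-in for the silence-compressed record, and the function name is an assumption; only the rule itself (compress when the run exceeds the reference frame count, keep the last reference frames untouched) comes from the text.

```python
from typing import Any, List, Sequence


def compress_silent_run(silent_frames: Sequence[Any], n_reference: int = 1) -> List[Any]:
    """Compress one run of consecutive silent frames.

    The last `n_reference` frames (those immediately before the next sound
    section) are kept unchanged; the rest are replaced by a single
    ("SILENCE_RUN", count) marker, a placeholder for the compressed record.
    """
    n = len(silent_frames)
    if n <= n_reference:
        # Run does not exceed the reference frame count: pass through.
        return list(silent_frames)
    marker = ("SILENCE_RUN", n - n_reference)
    kept = list(silent_frames[n - n_reference:])
    return [marker] + kept
```

For example, a run of five silent frames with two reference frames becomes one marker with count 3 followed by the last two frames, while a run of two frames is returned unchanged.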

FIG. 3 is an explanatory diagram of the silence-compressed data. SB is a stop bit and T is the pitch period. T1 and T0 normally indicate the frame period; as described above, the two-bit values "00" to "10" indicate three frame periods. For silence compression they are set to a value that can be distinguished from a frame period, for example "11". A is the amplitude. N occupies the part corresponding to the LSP parameters ω1 and ω2 in FIG. 6 and indicates the number of consecutive silent frames, and the part corresponding to the LSP parameters ω3 to ω8 in FIG. 6 is set to all "0". In this case the amplitude A is zero.
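The field usage of FIG. 3 can be sketched as a plain record. The text does not give bit widths or byte offsets for the fields, so the sketch below deliberately avoids bit packing; the class name, field names, and the zero values chosen for the stop bit and pitch period of a silence record are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List

SILENCE_CODE = 0b11  # T1,T0 value "11", the example given for silence compression


@dataclass
class Record:
    """One coded block with the fields named in FIG. 6 / FIG. 3 (layout unspecified)."""
    stop_bit: int
    pitch: int
    period_code: int          # T1,T0: 0b00..0b10 = frame period, 0b11 = silence record
    amplitude: int
    w12: int                  # slot of ω1 and ω2; holds N in a silence record
    w3_8: List[int] = field(default_factory=lambda: [0] * 6)  # slots of ω3..ω8


def make_silence_record(n_consecutive: int) -> Record:
    """Build a silence-compressed record carrying the consecutive count N."""
    return Record(
        stop_bit=0,               # assumption: not specified for silence records
        pitch=0,                  # assumption: not specified for silence records
        period_code=SILENCE_CODE,
        amplitude=0,              # amplitude A is zero in this case
        w12=n_consecutive,        # N replaces ω1, ω2
        w3_8=[0] * 6,             # ω3..ω8 are all "0"
    )


def is_silence_record(record: Record) -> bool:
    """A record is silence-compressed when T1,T0 carry the reserved code."""
    return record.period_code == SILENCE_CODE
```

Reserving the unused fourth value of the two-bit frame period code lets a decoder tell the two record types apart without any extra flag bit.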

FIG. 4 shows a control flow chart of the embodiment. Based on the coded information of FIG. 6, one block of information (for example, 6 bytes of coded information) is extracted at a time, and the block is judged to be sound or silent. If it is silent, it is counted. If it is sound, it is judged whether the previously extracted block was silent; if the previous block was also sound, the counter is cleared and the next block is extracted.

When the input changes from silence to sound, it is judged whether compression is possible. That is, the count of consecutive silent blocks is compared with a predetermined value, for example three blocks; if three or more silent blocks are consecutive, compression is judged possible and compression processing is performed.

An example of coded input speech blocks 100 and compressed speech blocks 101 is shown. Of the coded speech information a, b, c, ..., k, ..., in which each block is 6 bytes, suppose the hatched blocks are silent blocks. The silent blocks d, e, ..., i, j are counted in the counting step, and when the input changes from a silent block to a sound block it is judged whether three or more blocks were counted. Here the count is three or more, so the blocks are compressed in the compression step: as shown by the compressed speech blocks 101, the silent blocks d, e, ..., i, j are compressed into blocks i, j.
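The block-by-block flow can be sketched as a small streaming loop. This is an illustration under assumptions, not the flow chart itself: `min_run=3` follows the example threshold in the text, `keep=1` is a hypothetical reference frame count, the `("SILENCE", n)` tuple stands in for the compressed record, and applying the same rule to silence at the very end of the input is an assumption because the text only describes the silence-to-sound transition.

```python
from typing import Any, Callable, Iterable, List


def encode_blocks(
    blocks: Iterable[Any],
    is_silent: Callable[[Any], bool],
    min_run: int = 3,   # example threshold from the text: three or more blocks
    keep: int = 1,      # hypothetical number of reference blocks left untouched
) -> List[Any]:
    """Stream blocks, replacing long silent runs with a count marker."""
    if not 0 <= keep < min_run:
        raise ValueError("keep must be non-negative and smaller than min_run")
    out: List[Any] = []
    pending: List[Any] = []  # buffered silent blocks; its length is the counter

    def flush() -> None:
        count = len(pending)
        if count >= min_run:
            out.append(("SILENCE", count - keep))  # placeholder compressed record
            out.extend(pending[count - keep:])     # reference blocks kept as-is
        else:
            out.extend(pending)                    # too short: not compressed
        pending.clear()                            # clear the counter

    for block in blocks:
        if is_silent(block):
            pending.append(block)                  # count the silent block
        else:
            flush()                                # silence -> sound transition
            out.append(block)
    flush()  # assumption: trailing silence is handled by the same rule
    return out
```

With blocks a to k where d to j are silent, the seven silent blocks collapse to one marker plus the last silent block, so the output is `a, b, c, ("SILENCE", 6), j, k`.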

When speech is reproduced from data obtained by this speech coding method, the LSP data shown in FIG. 6 are used for sound; for silence, it suffices to insert silence as many times as the consecutive silence count N held in the third byte of the silence-compressed data shown in FIG. 3.
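The decoding side can be sketched in the same spirit. Records are modelled here as dictionaries with hypothetical keys (`period_code`, `n`, `amplitude`, `silent`); the only behaviour taken from the text is that a silence-compressed record expands into N silent frames while ordinary LSP data pass through unchanged.

```python
from typing import Any, Dict, Iterable, List

SILENCE_CODE = 0b11  # T1,T0 value "11", the example given for silence records


def expand_silence(records: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Rebuild the frame sequence from ordinary and silence-compressed records."""
    frames: List[Dict[str, Any]] = []
    for record in records:
        if record["period_code"] == SILENCE_CODE:
            # Insert silence N times, where N is the consecutive count.
            for _ in range(record["n"]):
                frames.append({"amplitude": 0, "silent": True})
        else:
            # Ordinary LSP data (including retained reference frames).
            frames.append(record)
    return frames
```

A silent reference frame that was retained before a sound section arrives as ordinary LSP data, so it passes through untouched and remains available as the previous frame for the sound frame that follows.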

The embodiment above has been described for the LSP method, but the present invention is not limited to that method and can also be applied to other methods.

[Effect of the invention]

As described above, the present invention comprises the spectrum envelope information extraction unit 8, the sound source information extraction unit 10, and the compression processing means. The compression processing means, which includes the silence detection unit 16, the silence compression availability determination unit 17, the silence compression unit 18, and the like, identifies whether each frame is a silent frame or a sound frame and, for silent frames, detects the number of consecutive frames. When this number exceeds the number of reference frames needed for the sound section at decoding time, silence compression is judged possible; as many silent frames as the number of reference frames are retained, and the data of the other consecutive silent frames are converted into consecutive-count information for silent frames. As shown in FIG. 3, for example, a single frame of information can then represent the data of a plurality of consecutive silent frames as a consecutive count. Redundant silent sections are thus compressed substantially, which has the advantage of reducing the storage capacity needed for speech information.

[Brief description of the drawings]

FIG. 1 is a block diagram of the principle of the present invention, FIG. 2 is a block diagram of an embodiment of the present invention, FIG. 3 is an explanatory diagram of the silence-compressed data of the embodiment, FIG. 4 is a control flow chart of the embodiment, FIG. 5 is a block diagram of a conventional example, and FIG. 6 is an explanatory diagram of the LSP data.

1 is an input terminal, 2 is an output terminal, 3 is an LSP analysis coding device, 4 is an A-D converter, 5 is a common bus, 6 is a processor (CPU), 7 is a memory (MEM), 8 is a spectrum envelope information extraction unit, 9 and 14 are storage units, 10 is a sound source information extraction unit, 11 is an amplitude extraction unit, 12 is a pitch period extraction unit, 13 is a sound/silence determination unit, 15 is an output editing unit, 16 is a silence detection unit, 17 is a silence compression availability determination unit, and 18 is a silence compression unit.

Claims

(57) [Claims]

A speech coding method comprising: a spectrum envelope information extraction unit that extracts, from a speech signal converted into a digital signal, spectrum envelope information indicating the frequency characteristic unique to the speech signal; a sound source information extraction unit (10) that extracts sound source information such as the amplitude, the pitch period, and the sound/silence distinction of the speech signal; and compression processing means that detects the number of consecutive silent frames from the spectrum envelope information and the sound source information, judges that silence compression is possible when the detected number of consecutive silent frames exceeds the number of reference frames used for a smooth transition from the silence immediately before a sound section to sound at decoding time, and performs silence compression processing; characterized in that, when silence compression is judged possible, the compression processing means leaves as silent frames the reference frames immediately before the sound section among the consecutive silent frames, and converts the data of the other consecutive silent frames, excluding those reference frames, into consecutive-count information for silent frames, thereby compressing the silent sections contained in the speech information.