
JPH10124097A - Voice recording and reproducing device

Info

Publication number: JPH10124097A
Application number: JP8278337A
Authority: JP
Inventors: 秀享 ▲高▼橋; Hideyuki Takahashi
Original assignee: Olympus Optical Co Ltd
Current assignee: Olympus Corp
Priority date: 1996-10-21
Filing date: 1996-10-21
Publication date: 1998-05-15

Abstract

PROBLEM TO BE SOLVED: To obtain high-quality sound without increasing the amount of computation in the coding process by providing a discriminating means which discriminates an input signal, a coding means which codes the signal, and a coded data smoothing means which smooths the data obtained by coding non-voice signals. SOLUTION: A coding/decoding section has a voice/non-voice discrimination section 17, which classifies the input signal as a voice signal or a non-voice signal in units of frames made by dividing the digital input signal into a constant length, a multipulse coding section 19, and a non-voice coding section 20. A frame energy computing section 16 is connected through the section 17 to the input terminal of a coding selection switch 18. A first output terminal 'a' of the switch 18 is connected to the section 19, and a second output terminal 'b' is connected to the section 20. The section 20 also acts as the coded data smoothing means.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001] [Technical field of the invention] The present invention relates to an audio recording/reproducing apparatus, and more particularly to an audio recording/reproducing apparatus that applies digital information compression to an audio signal for recording and reproduction.

[0002] [Prior art] In recent years, digital information recording/reproducing apparatuses known as digital recorders have been developed. Such an apparatus converts an audio signal obtained by a microphone or the like into a digital signal and records it in, for example, a semiconductor memory. At playback, it reads the audio signal from the semiconductor memory, converts it into an analog signal, and outputs it as sound through a speaker or the like. Japanese Patent Application Laid-Open No. 63-259700 discloses a digital information recording/reproducing apparatus of this kind.

[0003] In recording/reproducing apparatuses such as the digital information recording/reproducing apparatus described above, techniques have been proposed for applying high-efficiency coding to the digitized audio signal so that as little data as possible is generated, in order to save the amount of data recorded in the semiconductor memory. In recent years in particular, advances in digital signal processing have produced a variety of speech coding techniques, and recordable time has increased dramatically. Many variable-rate coding schemes have also been proposed that code non-speech sections and unvoiced sections even more efficiently.

[0004] Known high-efficiency coding methods include analysis-by-synthesis speech coding, typified by code-excited linear prediction (CELP: Code Excited Linear Predictive Coding), and waveform-coding compression such as ADPCM.

[0005] [Problems to be solved by the invention] However, with variable-rate coding as described above, the coding bit rate drops further in non-speech and unvoiced sections, so good sound quality cannot be obtained there. In particular, sound quality deteriorates sharply when background noise or the like is mixed in. In addition, the speech coding techniques described above require a large amount of computation.

[0006] Furthermore, to provide the audio recording/reproducing apparatus at lower cost, the encoding and decoding processes are implemented on a fixed-point DSP (Digital Signal Processor). At present, however, encoding often uses up the computing capacity of the DSP, so adding further processing to improve coding performance makes real-time processing unachievable.

[0007] The present invention has been made in view of these problems, and its object is to provide an audio recording/reproducing apparatus that achieves good sound quality without increasing the amount of computation in the encoding process.

[0008] [Means for solving the problems] To achieve the above object, a first audio recording/reproducing apparatus according to the present invention comprises: discrimination means for classifying an input signal as a speech signal or a non-speech signal; encoding means for encoding the input signal; and coded data smoothing means for smoothing the data obtained by encoding the non-speech signal.

[0009] To achieve the above object, a second audio recording/reproducing apparatus according to the present invention comprises: discrimination means for classifying an input signal as a speech signal or a non-speech signal in units of frames obtained by dividing the digitized input signal into fixed lengths; linear predictive encoding means that has a non-speech sound source estimation unit and encodes the input signal; and coded data smoothing means for smoothing at least one of the gain information and the linear prediction parameter information of the signal from the non-speech sound source estimation unit.

[0010] To achieve the above object, a third audio recording/reproducing apparatus according to the present invention is the first or second audio recording/reproducing apparatus in which the coded data smoothing means smooths the data obtained by encoding the non-speech signal only when at least a fixed number of non-speech frames occur consecutively.

[0011] In the first audio recording/reproducing apparatus, the discrimination means classifies the input signal as a speech signal or a non-speech signal, and the encoding means encodes the input signal. The coded data smoothing means smooths the data obtained by encoding the non-speech signal.

[0012] In the second audio recording/reproducing apparatus, the discrimination means classifies the input signal as a speech signal or a non-speech signal in units of frames obtained by dividing the digitized input signal into fixed lengths, and the linear predictive encoding means encodes the input signal. The coded data smoothing means smooths at least one of the gain information and the linear prediction parameter information of the signal from the non-speech sound source estimation unit in the linear predictive encoding means.

[0013] In the third audio recording/reproducing apparatus, which is based on the first or second audio recording/reproducing apparatus, the coded data smoothing means smooths the data obtained by encoding the non-speech signal only when at least a fixed number of non-speech frames occur consecutively.

[0014] [Embodiments of the invention] Embodiments of the present invention are described below with reference to the drawings.

[0015] FIG. 1 is a block diagram showing the overall configuration of an audio recording/reproducing apparatus according to one embodiment of the present invention.

[0016] In FIG. 1, a microphone 1 is connected to one end of an encoding/decoding unit 5 via a microphone amplifier 2, a low-pass filter 3, and an A/D converter 4. The other end of the encoding/decoding unit 5 is connected to an audio memory 7 via a memory control unit 6.

[0017] A speaker 8 is connected to a D/A converter 11 via a power amplifier 9 and a low-pass filter 10, and the D/A converter 11 is connected to one end of the encoding/decoding unit 5.

[0018] A system control unit 12, which controls the operation of each unit, is connected to the encoding/decoding unit 5, the memory control unit 6, and the audio memory 7, as well as to an operation input unit 13 made up of operation switches for recording, playback, stop, and so on, an address counter 14, and a display unit 15.

[0019] FIG. 2 is a block diagram showing the configuration of the encoding unit in the encoding/decoding unit 5.

[0020] The encoding/decoding unit 5 serves as linear predictive encoding means for encoding the input signal. It has discrimination means (speech/non-speech discrimination unit 17) that classifies the input signal as a speech signal or a non-speech signal in units of frames obtained by dividing the digitized input signal into fixed lengths, a speech sound source estimation unit (multi-pulse encoding unit 19), and a non-speech sound source estimation unit (non-speech encoding unit 20).

[0021] In FIG. 2, a frame energy calculation unit 16 is connected, via the speech/non-speech discrimination unit 17 serving as speech/non-speech discrimination means, to the input terminal c of an encoding selection changeover switch 18 serving as encoding selection means. The first output terminal a of the encoding changeover switch 18 is connected to the multi-pulse encoding unit 19, and the second output terminal b is connected to the non-speech encoding unit 20.

[0022] The configuration of the multi-pulse encoding unit 19 is described in detail in, for example, Japanese Examined Patent Publication No. 4-25560, so its description is omitted here.

[0023] FIG. 3 is a block diagram showing the configuration of the non-speech encoding unit 20.

[0024] The non-speech encoding unit 20 serves as the coded data smoothing means. In FIG. 3, a linear prediction analysis unit 21 is connected to a gain calculation unit 24 via an energy calculation unit 22, and is also connected to a multiplexer 25. A random signal generation unit 23 is connected to the gain calculation unit 24, and the gain calculation unit 24 is connected to the multiplexer 25.

[0025] FIG. 4 is a block diagram showing the configuration of the decoding unit in the encoding/decoding unit 5.

[0026] In FIG. 4, a speech/non-speech discrimination unit 26 is connected to the input terminal c of a decoding changeover switch 27. The first output terminal a of the decoding changeover switch 27 is connected to a multi-pulse decoding unit 28, and the second output terminal b is connected to a non-speech decoding unit 29.

[0027] FIG. 5 is a block diagram showing the configuration of the non-speech decoding unit 29. Components that also appear in the block diagrams of FIGS. 3 and 4 are given the same reference numerals.

[0028] In FIG. 5, a demultiplexer 30 is connected to a gain multiplication unit 31 and a linear prediction synthesis unit 32. The random signal generation unit 23 is connected to the linear prediction synthesis unit 32 via the gain multiplication unit 31.

[0029] Next, the recording and playback operations of the audio recording/reproducing apparatus configured as described above are explained.

[0030] In FIG. 1, when the operator performs a recording operation via the operation input unit 13, the analog audio signal input from the microphone 1 is amplified by the microphone amplifier 2, and the low-pass filter 3 removes unwanted high-frequency components from the audio signal. The output signal of the low-pass filter 3 is converted into a digital signal by the A/D converter 4. At this time, the system control unit 12 selects and activates the encoding unit in the encoding/decoding unit 5, which encodes the digital signal from the A/D converter 4. The encoded data obtained in this way is stored in the audio memory 7 via the memory control unit 6.

[0031] When the operator performs a playback operation via the operation input unit 13, encoded data is read from the audio memory 7 and supplied to the encoding/decoding unit 5 via the memory control unit 6. At this time, the system control unit 12 selects and activates the decoding unit in the encoding/decoding unit 5, which decodes the encoded data to produce decoded data. Because the decoded data is a digital signal, the D/A converter 11 converts it into an analog signal. The low-pass filter 10 then removes unwanted high-frequency components from the analog signal output by the D/A converter 11. Finally, the power amplifier 9 amplifies the analog signal output from the low-pass filter 10, and the playback signal is output from the speaker 8.

[0032] During this series of operations, the memory control unit 6 controls the input and output of signals between the audio memory 7 and the encoding/decoding unit 5. The address counter 14 counts according to address data supplied by the system control unit 12 and specifies addresses in the audio memory 7.

[0033] Next, the encoding operation of the encoding unit in the encoding/decoding unit 5 is described with reference to the flowchart shown in FIG. 6.

[0034] When the encoding process starts in the encoding/decoding unit 5 under the control of the system control unit 12, the noise counter NoiseCnt, which is used to determine whether the input signal is a speech signal or a non-speech signal, is first initialized to NoiseCnt = 0 (step S1). Then, when a frame of audio is read by the encoding unit of the encoding/decoding unit 5 (step S2), the frame energy calculation unit 16 (see FIG. 2) calculates the frame energy Eng.

[0035] Next, the speech/non-speech discrimination unit 17 (see FIG. 2) compares the frame energy Eng with a predetermined threshold NoiseLev (step S4). If Eng is larger than NoiseLev, the noise counter is reset to NoiseCnt = 0 (step S5), and the process proceeds to step S7. If Eng is smaller than NoiseLev, the noise counter NoiseCnt is incremented (step S6), and the process proceeds to step S7.

[0036] In step S7, the speech/non-speech discrimination unit 17 determines whether the noise counter NoiseCnt is greater than 2. If NoiseCnt ≤ 2, the input signal is judged to be a speech signal and the speech signal determination flag is set to SP = 1 (step S8). The encoding changeover switch 18 then selects the multi-pulse encoding unit 19 on the basis of SP = 1, and the multi-pulse encoding unit 19 performs multi-pulse encoding (step S9).

[0037] If, on the other hand, NoiseCnt > 2 is found in step S7, the input signal is judged to be a non-speech signal and the speech signal determination flag is set to SP = 0 (step S11). The encoding changeover switch 18 then selects the non-speech encoding unit 20 on the basis of SP = 0, and the non-speech encoding unit 20 performs non-speech encoding (step S12).
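The decision logic of steps S1 to S12 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `classify_frames` and the parameter `max_quiet` are hypothetical, the frame energies are assumed to be precomputed, and a frame with Eng exactly equal to NoiseLev is treated as quiet, a case the text leaves open.

```python
def classify_frames(frame_energies, noise_lev, max_quiet=2):
    """Return the SP flag (1 = speech, 0 = non-speech) for each frame energy."""
    noise_cnt = 0                        # step S1: reset the noise counter
    flags = []
    for eng in frame_energies:           # step S2: one frame at a time, energy Eng
        if eng > noise_lev:              # step S4: compare with the threshold
            noise_cnt = 0                # step S5: a loud frame resets the counter
        else:
            noise_cnt += 1               # step S6 (assumption: Eng == NoiseLev is quiet)
        # step S7: a frame is non-speech only once more than `max_quiet`
        # quiet frames have occurred in a row
        flags.append(0 if noise_cnt > max_quiet else 1)
    return flags


if __name__ == "__main__":
    print(classify_frames([10, 1, 1, 1, 1, 10, 1], noise_lev=5))
```

Because the counter must exceed 2 before SP becomes 0, the first two quiet frames after speech are still coded as speech, which acts as a short hangover that protects word endings.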

[0038] Non-speech encoding is now described with reference to FIG. 3.

[0039] The linear prediction analysis unit 21 performs linear prediction analysis on the signal input frame by frame, sends the resulting linear prediction parameters to the multiplexer 25, and sends the linear prediction residual signal to the energy calculation unit 22. The energy calculation unit 22 calculates the energy Er of the input linear prediction residual signal according to Equation 1 below and sends it to the gain calculation unit 24.

[0040]

[Equation 1] (equation image not reproduced in this text)

Here, r(n) is the linear prediction residual signal at sample n, and N is the frame length.
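Equation 1 itself is not reproduced in this text. Assuming the conventional definition of frame energy as the sum of squared samples, which is consistent with the symbols r(n) and N defined above, it would plausibly read as follows. This is a reconstruction under that assumption; the original may differ, for example by a normalization factor such as 1/N.

```latex
E_r = \sum_{n=0}^{N-1} r(n)^2
```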

[0041] The random signal generation unit 23 generates a random signal to serve as the non-speech sound source signal and sends it to the gain calculation unit 24. The gain calculation unit 24 calculates the gain of the random signal rand(n) generated by the random signal generation unit 23. The gain is determined as follows.

[0042] For each of a plurality of gain candidate values ge, the product of ge and the energy of the random signal rand(n) is compared with the energy Er of the linear prediction residual signal, and the candidate ge that minimizes the error eer between the two is found. That is, the gain candidate value ge that minimizes the error eer given by Equation 2 below is found.

[0043]

[Equation 2] (equation image not reproduced in this text)

In other words, Equation 2 searches for the gain of the random signal at which the energy of the generated sound source signal equals the energy of the linear prediction residual signal. The gain candidate value ge that minimizes the error eer is then sent to the multiplexer 25 as g.
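The gain search described above can be sketched as a table lookup over candidate gains. This is a minimal sketch under stated assumptions: the names `frame_energy` and `search_gain` are hypothetical, energy is taken as the sum of squared samples, and the error eer is taken as the absolute difference between ge times the random signal energy and Er, following the wording of the text (Equation 2 itself is not reproduced here).

```python
def frame_energy(samples):
    """Energy of one frame, assumed here to be the sum of squared samples."""
    return sum(x * x for x in samples)


def search_gain(residual, rand_signal, gain_candidates):
    """Pick the candidate ge that minimizes eer = |ge * E_rand - Er|."""
    er = frame_energy(residual)          # energy Er of the LPC residual
    e_rand = frame_energy(rand_signal)   # energy of the random excitation
    return min(gain_candidates, key=lambda ge: abs(ge * e_rand - er))


if __name__ == "__main__":
    residual = [2.0, 2.0, 2.0, 2.0]        # Er = 16
    rand_signal = [1.0, -1.0, 1.0, -1.0]   # energy = 4
    print(search_gain(residual, rand_signal, [1.0, 2.0, 4.0, 8.0]))
```

Note that ge multiplies the energy directly here, as the text states. If the decoder instead applies the gain to the signal amplitude, the matched energy would scale with the square of the gain, so the candidate table would need to be defined accordingly.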

[0044] The multiplexer 25 combines the received linear prediction parameters and g and outputs them as encoded data.

[0045] Returning to FIG. 6, after step S9 or step S12, the speech signal determination flag SP and the corresponding encoded data are recorded in the audio memory 7 (step S10), and the process returns to step S2.

[0046] Next, the operation of coded data smoothing is described with reference to the flowchart shown in FIG. 7.

[0047] In this embodiment, multi-pulse encoding uses 191 bits per frame, non-speech encoding uses 81 bits per frame, and the speech signal determination flag SP uses 1 bit per frame.

[0048] First, the counter Cnt, which indicates the current frame count, and the speech signal determination flag Prev_SP are initialized to Cnt = 0 and Prev_SP = 1 (step S13). Next, the 1-bit value corresponding to the speech signal determination flag SP is read from the audio memory 7 (step S14), and it is determined whether this value is SP = 1 (step S15).

[0049] If SP is not 1 in step S15, 31 bits are read from the audio memory 7 (step S18), and the gain g of this frame is quantized and stored as G[Cnt] = g (step S19). The counter Cnt is then incremented (step S20), the flag is set to Prev_SP = 0 (step S21), and the process returns to step S14.

[0050] If SP = 1 in step S15, 191 bits of the audio memory 7 are skipped (step S16), and it is determined whether Prev_SP is 0 (step S30). If Prev_SP is not 0, the process returns to step S14 and waits until Prev_SP = 0.

[0051] If Prev_SP = 0 in step S30, the quantized gains G[i] {i = 0, 1, ..., Cnt} are smoothed and written back to the audio memory 7 (step S17). Cnt and Prev_SP are then initialized again (step S31), and the process returns to step S14.
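The pass over the recorded data in steps S13 to S31 can be sketched as below. This is an illustration only: frames are modeled as Python dictionaries rather than as a packed bit stream, the names `smooth_stored_gains` and `_write_back` are hypothetical, and the smoothing rule is passed in as a function. Flushing a run of non-speech frames that reaches the end of the data is an added assumption, since the text only describes smoothing when a speech frame follows the run.

```python
def _write_back(frames, run, smooth):
    """Smooth the gains of one run of non-speech frames and store them (step S17)."""
    new_gains = smooth([frames[i]["gain"] for i in run])
    for i, g in zip(run, new_gains):
        frames[i]["gain"] = g


def smooth_stored_gains(frames, smooth):
    """Walk stored frames and smooth each consecutive run of non-speech gains."""
    run = []        # indices of the current non-speech run (the role of G[0..Cnt])
    prev_sp = 1     # step S13
    for i, frame in enumerate(frames):       # step S14: read the SP flag
        if frame["sp"] == 0:                 # step S15: non-speech frame
            run.append(i)                    # steps S18 to S20: remember its gain
            prev_sp = 0                      # step S21
        elif prev_sp == 0:                   # steps S16, S30: speech after a run
            _write_back(frames, run, smooth) # step S17
            run, prev_sp = [], 1             # step S31
    if run:                                  # assumption: flush a trailing run
        _write_back(frames, run, smooth)
    return frames


if __name__ == "__main__":
    def run_mean(gs):
        return [sum(gs) / len(gs)] * len(gs)

    data = [{"sp": 1}, {"sp": 0, "gain": 2.0}, {"sp": 0, "gain": 4.0},
            {"sp": 1}, {"sp": 0, "gain": 6.0}]
    print(smooth_stored_gains(data, run_mean))
```

Because the pass only reads flags and gains and rewrites gains in place, it can run entirely outside the real-time encoding loop.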

[0052] In this embodiment, the average of the gains G over the 10 frames before and the 10 frames after a non-speech frame is taken as the smoothed gain G′ of that frame. This moving average is computed for each frame in turn, and each previous gain G is replaced with its smoothed gain G′.
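A moving average of this kind can be sketched as follows. The function name `moving_average_gains` and the parameter `half_window` are hypothetical. The sketch assumes that the window includes the frame itself, that it is truncated at the edges of the run, and that every average is taken over the original (unsmoothed) gains. The text does not settle any of these three points.

```python
def moving_average_gains(gains, half_window=10):
    """Replace each gain by the mean of its neighbors within +/- half_window frames."""
    smoothed = []
    for i in range(len(gains)):
        lo = max(0, i - half_window)                 # clip the window at the start
        hi = min(len(gains), i + half_window + 1)    # clip the window at the end
        window = gains[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


if __name__ == "__main__":
    print(moving_average_gains([1.0, 2.0, 3.0, 4.0, 5.0], half_window=1))
```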

[0053] Alternatively, the maximum value of the gain G may be limited to a predetermined value a or less, that is, G[i] = G[i] when G < a and G[i] = a when G > a. As another alternative, the average of the gains G over a continuous run of non-speech frames may be computed and substituted for each gain G.
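The two alternatives can be sketched in a few lines each. The function names `clip_gains` and `run_mean_gains` are hypothetical, and the inputs are assumed to be the gains of one continuous run of non-speech frames.

```python
def clip_gains(gains, a):
    """Limit every gain to at most the predetermined value a."""
    return [min(g, a) for g in gains]


def run_mean_gains(gains):
    """Replace every gain in a non-speech run with the mean over that run."""
    if not gains:
        return []
    mean = sum(gains) / len(gains)
    return [mean] * len(gains)


if __name__ == "__main__":
    print(clip_gains([1, 5, 3], a=3))
    print(run_mean_gains([2.0, 4.0]))
```

Clipping only removes loud outliers and leaves quieter frames untouched, whereas the run mean makes the noise level constant across the whole run.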

[0054] The smoothing described above can be performed at any time after recording ends and before playback. In this embodiment, the predetermined timing is set so that smoothing runs automatically after recording ends. Other possible predetermined timings include the following:
(1) If smoothing has not yet been performed, it starts automatically before playback begins.
(2) It starts at a preset time.
(3) It runs automatically at auto power-off.
(4) The user starts it deliberately.

[0055] To prevent smoothing from being applied twice, a flag indicating whether smoothing has been performed can be provided, and smoothing carried out only when it has not yet been performed. This improves efficiency.

[0056] Furthermore, to avoid creating inconsistent portions in the recorded audio data, a mechanism can be provided that locks the keys, switches, and the like of the operation unit of the audio recording/reproducing apparatus while smoothing is in progress. This prevents both inconsistent data and malfunctions.

[0057] In this way, the coded data smoothing means of the non-speech encoding unit 20 smooths the gain parameter in non-speech sections, which improves continuity with the preceding and following frames and yields playback sound that is perceptually natural. In addition, because the coded data smoothing process is applied to encoded data already recorded on the recording medium, sound quality is improved without increasing the amount of computation at encoding time.

[0058] In this embodiment, the coded data smoothing means smooths only the gain parameter, but the spectral parameters in non-speech sections can be smoothed in the same way.

[0059] Next, the decoding operation of the decoding unit in the encoding/decoding unit 5 is described with reference to the flowchart shown in FIG. 8.

[0060] When the decoding process starts in the encoding/decoding unit 5 under the control of the system control unit 12 (see FIG. 1), the speech/non-speech discrimination unit 26 (see FIG. 4) first reads a 1-bit value from the audio memory 7 (step S22) and determines whether this value is the speech signal determination flag SP = 1 (step S23).

[0061] If SP is not 1 in step S23, 31 bits are read from the audio memory 7 (step S27), the decoding changeover switch 27 selects the non-speech decoding unit 29, and the non-speech decoding unit 29 performs non-speech decoding (step S28).

[0062] In FIG. 5, the demultiplexer 30 separates the encoded data into the linear prediction parameters and the gain, and sends them to the linear prediction synthesis unit 32 and the gain multiplication unit 31, respectively. The random signal generation unit 23 generates a random signal as long as one frame, and the gain multiplication unit 31 amplifies this random signal by the received gain value. The linear prediction synthesis unit 32 then performs linear prediction synthesis on the amplified random signal and outputs the result.
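The non-speech decoding path of FIG. 5 can be sketched as a scaled random excitation passed through an all-pole synthesis filter. This is a simplified illustration under several assumptions: the names `lpc_synthesize` and `decode_non_speech_frame` are hypothetical, the random signal is drawn uniformly from [-1, 1], the filter uses the sign convention y[n] = x[n] + Σ a[k]·y[n-k], and no filter state is carried over between frames.

```python
import random


def lpc_synthesize(excitation, lpc_coeffs):
    """All-pole synthesis: y[n] = x[n] + sum_k a[k] * y[n-k] (sign convention assumed)."""
    out = []
    for n, x in enumerate(excitation):
        y = x
        for k, a in enumerate(lpc_coeffs, start=1):
            if n - k >= 0:               # samples before the frame are taken as zero
                y += a * out[n - k]
        out.append(y)
    return out


def decode_non_speech_frame(lpc_coeffs, gain, frame_len, rng=None):
    """Generate one frame: random signal -> gain multiplication -> LPC synthesis."""
    if rng is None:
        rng = random.Random(0)           # fixed seed only to make the sketch repeatable
    excitation = [gain * rng.uniform(-1.0, 1.0) for _ in range(frame_len)]
    return lpc_synthesize(excitation, lpc_coeffs)


if __name__ == "__main__":
    print(lpc_synthesize([1.0, 0.0, 0.0], [0.5]))
    print(len(decode_non_speech_frame([0.5], gain=2.0, frame_len=8)))
```

Since the excitation is regenerated at the decoder, only the linear prediction parameters and the gain need to be stored for a non-speech frame, which is why smoothing the stored gain directly changes the loudness contour of the reproduced noise.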

[0063] If, on the other hand, SP = 1 in step S23, 191 bits are read from the audio memory 7 (step S24), the decoding changeover switch 27 selects the multi-pulse decoding unit 28, and the multi-pulse decoding unit 28 performs multi-pulse decoding (step S25).

[0064] The output of the decoding unit in the encoding/decoding unit 5 is then sent to the D/A converter 11 (step S26), and the process returns to step S22.

[0065] In this embodiment, the multi-pulse method is used for encoding and decoding in speech sections, but other methods such as CELP can of course be used instead.

[0066] [Appendix] According to the embodiments of the present invention described in detail above, the following configurations can be obtained. (1) An audio recording/reproducing apparatus comprising: discrimination means for classifying an input signal as a speech signal or a non-speech signal; encoding means for encoding the input signal; and coded data smoothing means for smoothing the data obtained by encoding the non-speech signal.

【００６７】（２）デジタル化した入力信号を一定の
長さに分割したフレームを単位として該入力信号を音声
信号と非音声信号とに判別する判別手段と、非音声音源
推定部を有し、上記入力信号を符号化する線形予測符号
化手段と、上記非音声音源推定部からの信号のゲイン情
報と線形予測パラメータ情報との少なくとも一方を平滑
化する符号化データ平滑化手段と、を具備したことを特
徴とする音声記録再生装置。(2) An audio recording/reproducing apparatus comprising: discriminating means for discriminating a digitized input signal into a speech signal and a non-speech signal in units of frames obtained by dividing the input signal into a fixed length; linear prediction encoding means that has a non-speech sound source estimating unit and encodes the input signal; and encoded data smoothing means for smoothing at least one of gain information of a signal from the non-speech sound source estimating unit and linear prediction parameter information.

【００６８】（３）上記符号化データ平滑化手段は非
音声フレームが一定の数以上連続した場合のみ非音声信
号を符号化したデータを平滑化することを特徴とする
（１）または（２）に記載の音声記録再生装置。(3) The audio recording/reproducing apparatus according to (1) or (2), wherein the encoded data smoothing means smooths the data obtained by encoding the non-speech signal only when a predetermined number or more of non-speech frames occur consecutively.

【００６９】（４）上記符号化手段は、音声音源推定
部と非音声音源推定部とを有し、音声信号が入力された
際は、線形予測パラメータと上記音声音源推定部より得
られた音声音源推定パラメータとを符号化し、非音声信
号が入力された際は、線形予測パラメータと上記非音声
音源推定部より得られた音源信号をあらわすランダム信
号のゲインを符号化することを特徴とする（１）、
（２）または（３）に記載の音声記録再生装置。(4) The audio recording/reproducing apparatus according to (1), (2) or (3), wherein the encoding means has a speech sound source estimating unit and a non-speech sound source estimating unit, encodes, when a speech signal is input, a linear prediction parameter and a speech sound source estimation parameter obtained by the speech sound source estimating unit, and encodes, when a non-speech signal is input, a linear prediction parameter and the gain of a random signal representing the sound source signal obtained by the non-speech sound source estimating unit.

【００７０】（５）デジタル化した入力信号を一定の
長さに分割したフレームを単位に、入力信号を音声信号
と非音声信号とに判別する判別手段と、音声音源推定部
と非音声音源推定部を有し、音声信号が入力された場合
は線形予測パラメータと該音声音源推定部より得られた
音声音源推定パラメータを符号化し、非音声信号が入力
された場合は線形予測パラメータと該非音声音源推定部
より得られた音源信号をあらわすランダム信号のゲイン
を符号化する線形予測符号化手段と、判別結果と符号化
されたデータを記録媒体に記録する記録手段と、記録媒
体から判別結果と符号化されたデータを読取る読取り手
段と、読み出した判別結果にもとづき、音声信号と非音
声信号をそれぞれ復号化する復号化手段と、非音声信号
のフレームが予め定められた数以上連続した場合は、そ
の連続フレームにおける線形予測パラメータとゲインの
両方もしくはもどらか一方を平滑化する符号化データ平
滑化手段と、を有することを特徴とする音声記録再生装
置。(5) An audio recording/reproducing apparatus comprising: discriminating means for discriminating an input signal into a speech signal and a non-speech signal in units of frames obtained by dividing the digitized input signal into a fixed length; linear prediction encoding means that has a speech sound source estimating unit and a non-speech sound source estimating unit and that, when a speech signal is input, encodes a linear prediction parameter and a speech sound source estimation parameter obtained by the speech sound source estimating unit and, when a non-speech signal is input, encodes a linear prediction parameter and the gain of a random signal representing the sound source signal obtained by the non-speech sound source estimating unit; recording means for recording the discrimination result and the encoded data on a recording medium; reading means for reading the discrimination result and the encoded data from the recording medium; decoding means for decoding the speech signal and the non-speech signal respectively on the basis of the read discrimination result; and encoded data smoothing means for smoothing, when a predetermined number or more of non-speech frames occur consecutively, both or either one of the linear prediction parameters and the gains in those consecutive frames.

【００７１】（６）上記符号化データ平滑化手段は、
所定のタイミングで非音声フレームを平滑化することを
特徴とする、（１），（２），（３），（４）または
（５）に記載の音声記録再生装置。(6) The audio recording/reproducing apparatus according to (1), (2), (3), (4) or (5), wherein the encoded data smoothing means smooths non-speech frames at a predetermined timing.

【００７２】（７）上記符号化データ平滑化手段は、
平滑化が未実施の場合のみ平滑化を実施することを特徴
とする、（１），（２），（３），（４），（５）また
は（６）に記載の音声記録再生装置。(7) The audio recording/reproducing apparatus according to (1), (2), (3), (4), (5) or (6), wherein the encoded data smoothing means performs smoothing only when smoothing has not yet been performed.

【００７３】（８）上記非音声信号を符号化したデー
タを平滑化中は、操作入力ができないことを特徴とす
る、（１），（２），（３），（４），（５），（６）
または（７）に記載の音声記録再生装置。(8) The audio recording/reproducing apparatus according to (1), (2), (3), (4), (5), (6) or (7), wherein operation input is not accepted while the data obtained by encoding the non-speech signal is being smoothed.
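As an illustration of the smoothing condition in items (3) and (5), the following sketch smooths per-frame gains only inside runs of consecutive non-speech frames that reach a minimum length. This passage does not specify the smoothing operation itself, so a centered moving average is substituted here as an assumption; the function name and the `min_run` and `window` parameters are likewise hypothetical.

```python
def smooth_non_speech_gains(flags, gains, min_run=3, window=3):
    """Hypothetical sketch: smooth gains within long runs of non-speech frames.

    flags[i] is 1 for a speech frame and 0 for a non-speech frame.
    Runs of non-speech frames shorter than `min_run` are left unchanged.
    The moving average is an assumed stand-in for the smoothing method.
    """
    if len(flags) != len(gains):
        raise ValueError("flags and gains must have the same length")
    smoothed = list(gains)
    half = window // 2
    n = len(flags)
    i = 0
    while i < n:
        if flags[i] == 1:
            i += 1
            continue
        # Find the end (exclusive) of this run of non-speech frames.
        j = i
        while j < n and flags[j] == 0:
            j += 1
        if j - i >= min_run:
            for k in range(i, j):
                # Average only over frames inside the same run.
                lo = max(i, k - half)
                hi = min(j, k + half + 1)
                segment = gains[lo:hi]
                smoothed[k] = sum(segment) / len(segment)
        i = j
    return smoothed
```

The same run detection could be applied to linear prediction parameters instead of, or in addition to, the gains, in line with "both or either one" in item (5).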

【００７４】[0074]

【発明の効果】以上説明したように本発明によれば、符
号化処理における演算量を増加させることなく良好な音
質が得られる音声記録再生装置を提供できる。As described above, according to the present invention, it is possible to provide an audio recording / reproducing apparatus capable of obtaining good sound quality without increasing the amount of calculation in the encoding process.

[Brief description of the drawings]

【図１】本発明の一実施形態である音声記録再生装置の
全体構成を示すブロック図である。FIG. 1 is a block diagram showing an overall configuration of an audio recording / reproducing apparatus according to an embodiment of the present invention.

【図２】上記実施形態の音声記録再生装置において、符
号／復号化部における符号化部の構成を示すブロック図
である。FIG. 2 is a block diagram showing a configuration of an encoding unit in an encoding / decoding unit in the audio recording / reproducing apparatus of the embodiment.

【図３】上記実施形態の音声記録再生装置において、図
２に示す符号化部における非音声符号化部の構成を示す
ブロック図である。FIG. 3 is a block diagram showing a configuration of a non-speech encoding unit in the encoding unit shown in FIG. 2 in the audio recording / reproducing apparatus of the embodiment.

【図４】上記実施形態の音声記録再生装置において、符
号／復号化部における復号化部の構成を示すブロック図
である。FIG. 4 is a block diagram showing a configuration of a decoding unit in an encoding / decoding unit in the audio recording / reproducing apparatus of the embodiment.

【図５】上記実施形態の音声記録再生装置において、図
４に示す復号化部における非音声復号化部の構成を示す
ブロック図である。FIG. 5 is a block diagram showing a configuration of a non-speech decoding unit in the decoding unit shown in FIG. 4 in the audio recording / reproducing apparatus of the embodiment.

【図６】上記実施形態の音声記録再生装置において、符
号／復号化部における符号化部の符号化処理動作を示し
たフローチャートである。FIG. 6 is a flowchart showing an encoding processing operation of an encoding unit in an encoding / decoding unit in the audio recording / reproducing apparatus of the embodiment.

【図７】上記実施形態の音声記録再生装置において、符
号／復号化部における符号化部の非音声部平滑化処理動
作を示したフローチャートである。FIG. 7 is a flowchart showing the non-speech smoothing operation of the encoding unit in the encoding/decoding unit in the audio recording/reproducing apparatus of the embodiment.

【図８】上記実施形態の音声記録再生装置において、符
号／復号化部における復号化部の復号化処理動作を示し
たフローチャートである。FIG. 8 is a flowchart showing a decoding processing operation of a decoding unit in the encoding / decoding unit in the audio recording / reproducing apparatus of the embodiment.

[Explanation of symbols]

１…マイクロホン２…マイクアンプ３…ローパスフィルタ４…Ａ／Ｄ変換器５…符号／復号化部６…メモリ制御部７…音声メモリ８…スピーカ９…パワーアンプ１０…ローパスフィルタ１１…Ｄ／Ａ変換器１２…システム制御部１３…操作入力部１６…フレームエネルギー計算部１７…音声／非音声判別部１８…符号化切換スイッチ１９…マルチパルス符号化部２０…非音声符号化部２６…音声／非音声判別部２７…復号化切換スイッチ２８…マルチパルス復号化部２９…非音声復号化部
1 ... Microphone
2 ... Microphone amplifier
3 ... Low-pass filter
4 ... A/D converter
5 ... Encoding/decoding unit
6 ... Memory control unit
7 ... Audio memory
8 ... Speaker
9 ... Power amplifier
10 ... Low-pass filter
11 ... D/A converter
12 ... System control unit
13 ... Operation input unit
16 ... Frame energy calculation unit
17 ... Speech/non-speech discriminating unit
18 ... Encoding changeover switch
19 ... Multi-pulse encoding unit
20 ... Non-speech encoding unit
26 ... Speech/non-speech discriminating unit
27 ... Decoding changeover switch
28 ... Multi-pulse decoding unit
29 ... Non-speech decoding unit

Claims

[Claims]

1. An audio recording/reproducing apparatus comprising: discriminating means for discriminating an input signal into a speech signal and a non-speech signal; encoding means for encoding the input signal; and encoded data smoothing means for smoothing data obtained by encoding the non-speech signal.

2. An audio recording/reproducing apparatus comprising: discriminating means for discriminating a digitized input signal into a speech signal and a non-speech signal in units of frames obtained by dividing the input signal into a fixed length; linear prediction encoding means that has a non-speech sound source estimating unit and encodes the input signal; and encoded data smoothing means for smoothing at least one of gain information of a signal from the non-speech sound source estimating unit and linear prediction parameter information.

3. The audio recording/reproducing apparatus according to claim 1 or 2, wherein the encoded data smoothing means smooths the data obtained by encoding the non-speech signal only when a predetermined number or more of non-speech frames occur consecutively.