JPS5817499A

JPS5817499A - Variable speed type voice reproducer

Info

Publication number: JPS5817499A
Application number: JP56116058A
Authority: JP
Inventors: 信之寺浦; 岡本　敦稔
Original assignee: NipponDenso Co Ltd
Current assignee: Denso Corp
Priority date: 1981-07-24
Filing date: 1981-07-24
Publication date: 1983-02-01

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION The present invention relates to a variable-speed audio playback device that uses speech synthesis to produce audio output close to normal spoken language, regardless of playback speed, during playback by an audio playback device such as a VTR (video tape recorder).

Conventionally, when a VTR is played at a speed different from its rated speed, for example at double speed or at a fraction of the rated speed, the frequency of the audio signal changes accordingly, so the audio cannot be understood if it is output as is. For this reason, during such playback either the audio output circuit is cut off and no sound is reproduced, or, to bring the frequency back to that of the original sound, a method is used that cuts out part of the audio waveform or stretches the waveform.

When a VTR is played at a speed different from the rated speed, reproducing the sound as well as the picture adds value. For example, during fast-forward, if no sound is reproduced the user has the drawback of having to keep watching the screen.

Furthermore, when audio is output, methods that cut out part of the audio waveform or stretch the waveform introduce unnatural disturbances into the waveform, so the reproduced sound has low intelligibility. These methods also have the drawback that the playback speed is limited to roughly 0.5 to 2 times the rated speed.

The present invention has been made in view of the above circumstances. Its object is to provide a variable-speed audio playback device that has a PARCOR (partial autocorrelation) analyzer-synthesizer in the audio output section of the audio playback device and that, during non-rated-speed playback, changes the sampling frequency of the audio waveform in proportion to the playback speed and, during speech synthesis, changes the time for which the same data is used in inverse proportion to the playback speed, so that it can output synthesized sound close to normal spoken language regardless of the playback speed.
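The two scaling rules stated in the object of the invention can be sketched in a few lines of Python. This is only an illustration: the function name `plan_rates`, the nominal 8 kHz sampling frequency, and the nominal 20 ms frame time are assumptions chosen for the example and do not appear in the source.

```python
from dataclasses import dataclass


@dataclass
class RatePlan:
    sampling_hz: float   # A/D sampling frequency to use during analysis
    frame_hold_s: float  # time each analyzed frame is used during synthesis


def plan_rates(speed: float,
               base_sampling_hz: float = 8000.0,  # assumed nominal rate
               base_frame_s: float = 0.02) -> RatePlan:  # assumed nominal frame time
    """Scale sampling frequency with speed and frame usage time with 1/speed."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return RatePlan(sampling_hz=base_sampling_hz * speed,
                    frame_hold_s=base_frame_s / speed)


if __name__ == "__main__":
    for speed in (0.5, 1.0, 2.0, 4.0):
        plan = plan_rates(speed)
        print(f"speed x{speed}: sample at {plan.sampling_hz:.0f} Hz, "
              f"use each frame for {plan.frame_hold_s * 1000:.1f} ms")
```

Under these assumptions the number of samples captured per frame stays constant at every speed, because the sampling frequency and the frame usage time scale in opposite directions.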

For example, when a VTR is played at the rated speed, the reproduced sound is output from the speaker as is. Otherwise, the reproduced sound is not output directly from the speaker; instead, the audio waveform, which is analog data, is AD (analog-to-digital) converted by a sampling clock whose frequency is proportional to the playback speed, and the resulting digital data is stored in memory. PARCOR analysis is performed on this digital data, converting it from waveform-domain data into PARCOR data.

The PARCOR data consists of pitch, amplitude, a voiced/unvoiced distinction, and k-parameters.

Next, speech is synthesized using these PARCOR data.

During speech synthesis, the time for which each set of digital data obtained by the analysis is used is set to a time inversely proportional to the playback speed of the VTR.

In sound synthesized in this way, the pitch and the formant positions are the same as in normal spoken language, so the result resembles speech spoken quickly. This makes the audio intelligible even during non-rated-speed playback.
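The reasoning behind this can be written out as a short derivation. The symbols are assumptions introduced for illustration and are not used in the source: s is the ratio of playback speed to rated speed, f is a frequency component of the original speech, f_s is the sampling frequency used at rated speed, and N is the number of samples per frame.

```latex
\begin{align*}
  f'   &= s\,f        && \text{frequency of the component as it comes off the tape at speed ratio } s \\
  f_s' &= s\,f_s      && \text{sampling frequency chosen proportional to the playback speed} \\
  \frac{f'}{f_s'} &= \frac{s\,f}{s\,f_s} = \frac{f}{f_s}
                      && \text{normalized frequency of the stored samples is unchanged} \\
  T_{\mathrm{frame}}' &= \frac{N}{f_s'} = \frac{1}{s}\cdot\frac{N}{f_s}
                      && \text{real time covered by one frame of } N \text{ samples}
\end{align*}
```

Because the normalized frequency is unchanged, the pitch and formants extracted from the stored samples correspond to those of the original speech. Because one frame covers 1/s of the rated-speed frame time, using each frame for a time inversely proportional to the playback speed keeps the synthesized output in step with the tape.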

With a device that applies the PARCOR analysis and synthesis described above, the user no longer needs to watch the screen during fast-forward and can cue the tape to the desired position by listening to the sound alone.

It also becomes possible to listen to recorded news and similar material in a short time.

Furthermore, using the PARCOR analysis and synthesis method makes audio reproduction possible over a wide range of playback speeds.

Embodiments of the present invention will be described in detail below with reference to the drawings.

FIG. 1 shows one embodiment of the present invention. In FIG. 1, 1 is the conventional VTR portion excluding the speaker. SW1 is a switch that selects the playback speed.

1-1 is the reproduced-sound output section of the VTR. SW2 is a switch that selects whether to output the reproduced sound as is, to output synthesized sound after PARCOR analysis and pitch conversion, or to output no reproduced sound.

2 is a speaker, 3 is an AD converter for sampling the reproduced sound, 4 is a timer that supplies the sampling clock, and 5 is a CPU (central processing unit) that performs the PARCOR analysis and related processing. 6 is a ROM (read-only memory) that stores the CPU program, 7 is a RAM (read/write memory) used for computation, and 8 is a RAM that stores the PARCOR data obtained as a result of the PARCOR analysis and pitch conversion performed by the CPU; the speech synthesizer synthesizes speech using this data. 9 is a multiplexer that switches the address bus and data bus between the CPU and the speech synthesizer.

10 is a speech synthesizer. 11 is a low-pass filter and amplifier that removes the high-frequency components contained in the synthesized sound. SW3 is a switch that passes the synthesized sound to the speaker and is linked to SW2.

That is, when switches SW1 and SW2 select non-rated-speed playback on VTR 1 together with pitch conversion by PARCOR analysis and synthesis, CPU 5, following the program stored in ROM 6, samples the audio through AD converter 3 by interrupt processing at the sampling frequency given by timer 4, and stores the samples in RAM 7. CPU 5 sets the sampling frequency in timer 4 so that it is proportional to the playback speed set by switch SW1. Once a fixed number of samples has been taken (for example, "?50" samples; the leading digit is illegible in the source text), the sampled data are multiplied by a Hamming window in software and cut out as frame data. Each frame of data then undergoes PARCOR analysis: calculation of the autocorrelation coefficients, pitch extraction, calculation of the k-parameters by Durbin's method, and so on.
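The analysis steps named above (Hamming window, autocorrelation coefficients, pitch extraction, k-parameters by Durbin's method) can be sketched as follows. This is a minimal illustration under stated assumptions and is not the program stored in the ROM: the frame length of 250, the analysis order of 10, the pitch search range, the voicing threshold, and all function names are hypothetical, and the sign convention for the k-parameters is one of two in common use.

```python
import math

FRAME_LEN = 250  # hypothetical; the sample count in the source is partly illegible
ORDER = 10       # hypothetical analysis order


def hamming_window(frame):
    """Multiply a frame by a Hamming window."""
    n = len(frame)
    if n < 2:
        return list(frame)
    return [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
            for i, x in enumerate(frame)]


def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of one frame."""
    return [sum(x[n] * x[n + lag] for n in range(len(x) - lag))
            for lag in range(max_lag + 1)]


def durbin_k_parameters(r, order):
    """Levinson-Durbin recursion; returns reflection (k) parameters.

    Convention (an assumption): k_i is the last coefficient of the order-i
    predictor polynomial A(z) = 1 + sum_j a_j z^-j. Some texts flip the sign.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    ks = []
    for i in range(1, order + 1):
        if err <= 0.0:           # silent or degenerate frame
            ks.append(0.0)
            continue
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        ks.append(k)
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return ks


def estimate_pitch_lag(r, min_lag, max_lag, voicing_threshold=0.3):
    """Pick the autocorrelation peak as the pitch lag; None means unvoiced."""
    if r[0] <= 0.0:
        return None
    best = max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])
    return best if r[best] / r[0] >= voicing_threshold else None


def analyze_frame(frame, order=ORDER, min_lag=20, max_lag=120):
    """Turn one frame of waveform samples into PARCOR-style parameters."""
    w = hamming_window(frame)
    r = autocorrelation(w, max(order, max_lag))
    pitch_lag = estimate_pitch_lag(r, min_lag, max_lag)
    return {
        "pitch_lag": pitch_lag,                   # in samples
        "amplitude": math.sqrt(r[0] / len(w)),    # RMS of the windowed frame
        "voiced": pitch_lag is not None,
        "k": durbin_k_parameters(r, order),
    }


if __name__ == "__main__":
    pulse_train = [1.0 if n % 40 == 0 else 0.0 for n in range(FRAME_LEN)]
    print(analyze_frame(pulse_train))
```

Because the samples were taken at a rate proportional to the playback speed, the pitch lag here is expressed in samples and needs no further correction for speed.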

The frame-by-frame sampled data, which were waveform-domain data, are converted by the PARCOR analysis into PARCOR data consisting of pitch, amplitude, voiced/unvoiced distinction, and k-parameters, and are stored in RAM 8. CPU 5 sets up speech synthesizer 10 so that the time for which the data stored in RAM 8 are used is inversely proportional to the VTR playback speed set by switch SW1. With the address bus and data bus switched by multiplexer 9, speech synthesizer 10 accesses RAM 8 in parallel with CPU 5 and synthesizes speech from the data in RAM 8. The synthesized speech waveform passes through low-pass filter and amplifier 11, which removes harmonic components, and is sent through switch SW3, which is linked to switch SW2, to speaker 2, where it is output as sound.
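The source does not describe the internals of speech synthesizer 10. As a hedged illustration only, the sketch below uses an all-pole lattice filter, a common structure for PARCOR synthesis, driven by a pulse train for voiced frames or noise for unvoiced frames. It assumes the synthesizer runs at a fixed output rate, so that using a frame for a time inversely proportional to the playback speed means generating fewer or more output samples from it. All names and parameter values are hypothetical, and the k-parameter sign convention is the one in which an order-1 filter obeys y[n] = x[n] - k1*y[n-1].

```python
import random


def output_samples_per_frame(analysis_frame_len, speed):
    """Samples generated from one frame: inversely proportional to speed."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return max(1, round(analysis_frame_len / speed))


def make_excitation(n_samples, pitch_lag, amplitude, rng):
    """Pulse train for voiced frames, white noise for unvoiced (pitch_lag None)."""
    if pitch_lag is None:
        return [amplitude * rng.uniform(-1.0, 1.0) for _ in range(n_samples)]
    return [amplitude if n % pitch_lag == 0 else 0.0 for n in range(n_samples)]


def lattice_synthesize(excitation, ks):
    """All-pole lattice filter driven by the excitation.

    b[i] holds the order-i backward signal from the previous sample.
    For simplicity the filter state starts at zero for every call.
    """
    order = len(ks)
    b = [0.0] * (order + 1)
    out = []
    for x in excitation:
        f = x
        for i in range(order, 0, -1):
            f = f - ks[i - 1] * b[i - 1]       # forward signal, one order lower
            b[i] = b[i - 1] + ks[i - 1] * f    # backward signal for the next sample
        b[0] = f
        out.append(f)
    return out


def synthesize_frame(pitch_lag, amplitude, ks, analysis_frame_len, speed, rng):
    """Synthesize one frame, using it for a duration proportional to 1/speed."""
    n_out = output_samples_per_frame(analysis_frame_len, speed)
    return lattice_synthesize(make_excitation(n_out, pitch_lag, amplitude, rng), ks)


if __name__ == "__main__":
    rng = random.Random(0)
    for speed in (0.5, 1.0, 2.0):
        frame = synthesize_frame(40, 1.0, [-0.5, 0.2], 250, speed, rng)
        print(f"speed x{speed}: {len(frame)} output samples from one frame")
```

In this sketch the pitch lag and k-parameters are applied unchanged at every speed; only the number of output samples per frame varies, which is how the pitch and formant positions stay those of normal speech while the speaking rate follows the tape.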

Note that AD converter 3, timer 4, CPU 5, ROM 6, RAM 7, RAM 8, and multiplexer 9, which perform the PARCOR analysis, can be replaced, as shown in FIG. 2, by a dedicated PARCOR-analysis LSI and a one-chip MPU (microprocessor unit).

Although the above embodiment has been described for a VTR, the invention is not limited to this and can be implemented in the same way for other audio playback devices such as cassette tape recorders.

Next, the effects of the present invention are described below.

(1) By performing PARCOR analysis, the pitch and formant positions can be preserved over a wide range of playback speeds.

(2) By preserving the pitch and formant positions, audio can be output at the voice pitch of normal spoken language, which makes it intelligible.

4. Brief Description of the Drawings

FIG. 1 is an overall block diagram showing one embodiment of the present invention, and FIG. 2 is an overall block diagram showing another embodiment of the present invention.

1: VTR, 2: speaker, 3: AD converter, 4: timer, 5: CPU, 6: ROM, 7, 8: RAM, 9: multiplexer, 10: speech synthesizer, 11: low-pass filter and amplifier.

Applicant's agent: Patent Attorney Takehiko Suzue (鈴江武彦)

Claims


A variable-speed audio playback device characterized by comprising: means for obtaining PARCOR data by performing PARCOR analysis on the audio output of an audio playback device during non-rated-speed playback, using a sampling frequency proportional to the playback speed; and means for synthesizing speech with the time for which the PARCOR data obtained by said means is used set to a time inversely proportional to the playback speed.