JP2005092021A - Method and device for speaking speed conversion - Google Patents

Info

Publication number
JP2005092021A
JP2005092021A JP2003327784A JP2003327784A JP2005092021A JP 2005092021 A JP2005092021 A JP 2005092021A JP 2003327784 A JP2003327784 A JP 2003327784A JP 2003327784 A JP2003327784 A JP 2003327784A JP 2005092021 A JP2005092021 A JP 2005092021A
Authority
JP
Japan
Prior art keywords
signal
channel
speech
speed conversion
audio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP2003327784A
Other languages
Japanese (ja)
Inventor
Mikio Oda
幹夫 小田
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Panasonic Holdings Corp
Original Assignee
Matsushita Electric Industrial Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Matsushita Electric Industrial Co Ltd
Priority to JP2003327784A
Publication of JP2005092021A

Abstract

PROBLEM TO BE SOLVED: To perform speaking speed conversion on a stereo audio input without introducing a time shift between the channels.
SOLUTION: The device comprises an adder that adds the input L-channel and R-channel stereo audio signals; a speaking speed conversion processing unit that receives the audio signal made monaural by the adder and performs a specified speaking speed conversion; and a distributor that distributes the converted signal to the L and R audio channels, so that the distributed L and R signals are reproduced by a speaker. Speaking speed conversion free of time shift for stereo audio input is thus achieved without adding an algorithm for mutual time management of the input stereo signals and without extra memory.
COPYRIGHT: (C)2005,JPO&NCIPI

Description

The present invention relates to speech speed conversion, which controls the speed of speech.

In recent years, the growing number of broadcasting stations and the launch of digital satellite broadcasting have made it possible to broadcast programs of many genres, to the enjoyment of viewers. Some of these programs, however, feature announcers and entertainers who speak so quickly that elderly viewers may be unable to follow them. As a technique for solving this problem, advances in digital technology have led to proposals for digital speech speed conversion devices that slow down only the speed of speech without changing its pitch (see, for example, Patent Document 1).

An example of the first prior art in a digital speech speed conversion device is described below with reference to FIG. 4. In FIG. 4, reference numeral 10 denotes an A/D converter that converts the input audio signal from analog to digital; 11 denotes a speech speed conversion unit that performs speech speed conversion on the digital audio signal; 12 denotes a D/A converter that converts the speed-converted digital audio signal back to an analog audio signal; and 13 denotes a memory, configured as a ring memory, that temporarily stores the digitized input audio signal.

The operation of the speech speed conversion device configured as described above is as follows. The input audio signal is converted from analog to digital by the A/D converter 10 and input to the speech speed conversion unit 11; at the same time it is stored in the memory 13, which is configured as a ring memory. The speech speed conversion unit 11 extracts silent sections from the input digital audio signal, extracts the vowels of the utterances to be converted, and so on. Taking the conversation example "Sore wa watashi no desu" ("That is mine") shown in FIG. 5(a), simply stretching the speech to twice its duration yields, as shown in FIG. 5(b), "So-o re-e wa-a wa-a ta-a shi-i no-o de-e su-u": vowels are added to lengthen the utterance. The time-stretched digital audio signal is converted back to an analog audio signal by the D/A converter 12, which completes the series of speech speed conversion steps. As conversion proceeds and the output is stretched, however, the memory 13, whose capacity is finite, may overflow, that is, the read address may overtake the write address. The conventional method therefore determines the speech speed conversion ratio while checking the remaining capacity of the memory 13. Conversely, silent sections need not be stored, so the conversion is realigned at the onset of the next audio signal to reduce overflow.

With the conventional configuration, however, speech speed conversion makes speech easier to follow only when the audio is monaural. Television audio today is normally stereo, and if the conventional configuration is extended to stereo as two parallel channels, as shown in FIG. 6, a synchronization problem arises. Consider the stereo audio signal of FIG. 5(c) input to the device of FIG. 6, where the L-channel speech ends at time t1 and the R-channel speech then begins at time t2. After conversion, as shown in FIG. 5(d), the L-channel speech is stretched so that it ends at time t3, yet the R-channel speech, which was waiting in a silent section, still starts its conversion at time t2. The two channels therefore overlap in the interval between t2 and t3, which is a loss of synchronization.
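The overlap described above can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that each channel is stretched independently by a fixed ratio starting from its own onset; the function name and the example times are hypothetical and do not come from the patent.

```python
def independent_conversion_overlap(l_start, l_end, r_start, ratio):
    """Return the overlap (in seconds) between the stretched L-channel speech
    and the R-channel speech when the two channels are converted independently.

    Assumption: the L-channel speech keeps its start time and is stretched by
    `ratio`, while the R-channel speech still starts at its original onset.
    """
    l_stretched_end = l_start + (l_end - l_start) * ratio  # corresponds to t3
    return max(0.0, l_stretched_end - r_start)


# Hypothetical timing: L speaks from 0 s to t1 = 2.0 s, R starts at t2 = 2.5 s.
t1, t2 = 2.0, 2.5
overlap = independent_conversion_overlap(0.0, t1, t2, ratio=2.0)
print(overlap)  # 1.5 (t3 = 4.0 s, so L and R overlap from 2.5 s to 4.0 s)
```

With these example values the stretched L-channel speech runs until t3 = 4.0 s while the R-channel speech begins at 2.5 s, which is the t2 to t3 overlap of FIG. 5(d).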

As a technique for solving this problem, a speech speed conversion device has been proposed that adds the stereo audio signals, detects silent sections in the summed signal, and deletes those silent sections from both the L-channel and R-channel audio, thereby keeping the L and R channels synchronized (see, for example, Patent Document 2).

An example of the second prior art in a speech speed conversion device is described below with reference to FIG. 7. In FIG. 7, reference numeral 20 denotes a hard disk drive serving as a recording medium for video and audio signals; 21, a frame memory that stores the video and audio data read from the hard disk drive 20 frame by frame; 22, a signal separation unit that separates the video and audio signals; 23, a signal addition unit that adds the separated stereo audio signals; 24, a pitch period calculation unit that detects the pitch period for speech speed conversion; 25, a section determination unit that determines whether one frame of the summed audio signal is a voiced section or a silent section; 26, a silent section deletion unit that deletes both the L-channel and R-channel audio of any section judged silent by the section determination unit 25; 27, a time-axis compression/expansion unit that performs time-axis compression and expansion in units of the pitch period; 28, an audio memory that stores the time-axis compressed or expanded audio data; 29, a storage rate calculation and speech speed control unit that calculates the data storage rate of the audio memory 28 and controls the speech speed conversion; 30, an L/R separation unit that separates the L/R audio data read sequentially from the audio memory 28; 31, a D/A converter that converts the L-channel signal of the separated audio from digital to analog; and 32, a D/A converter that converts the R-channel signal of the separated audio from digital to analog.

The operation of the second conventional speech speed conversion device configured as described above is as follows. Video and audio signals delivered by digital television broadcasting or the like are demodulated and recorded on the hard disk drive 20. The recorded signals are read into the frame memory 21 frame by frame and separated into video and audio signals by the signal separation unit 22. The separated L/R audio signals are added sequentially by the signal addition unit 23, which outputs the summed signal once one frame of audio from both channels has been added. The section determination unit 25 detects the signal level of the one-frame summed signal, determines whether the frame is a voiced or a silent section, and controls the silent section deletion unit 26; if the frame is silent, that frame is deleted from both the L-channel and R-channel audio. Synchronization is thus maintained on a frame basis, although the L-channel and R-channel audio remain independent of each other. Next, the pitch period calculation unit 24 calculates the pitch period, and time-axis compression/expansion is performed using the same pitch period for both channels, so that good-quality time compression/expansion is possible while the L and R channels stay synchronized afterwards. The time-axis compressed or expanded audio data is recorded in the audio memory 28, read out sequentially, and reproduced as two analog channels through the L/R separation unit 30, the D/A converter 31 for the L channel, and the D/A converter 32 for the R channel. The storage rate calculation and speech speed control unit 29 calculates the data storage rate of the audio memory 28 and controls the conversion accordingly: for example, it refrains from deleting silent sections when the storage rate is low and raises the compression ratio when the storage rate is high. In other words, this is a speech speed conversion method for fast listening.
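The frame-based silence deletion of this second prior art can be sketched as follows. This is a minimal illustration, not the method of Patent Document 2: the function name, the use of the peak absolute value of the summed frame as the "signal level", and the threshold are all assumptions made for the example.

```python
def delete_silent_frames(left, right, frame_len, threshold):
    """Drop every frame whose summed (L + R) signal is judged silent.

    The decision is made once per frame on the summed signal, and the frame
    is then removed from BOTH channels, so L and R stay aligned frame by frame.
    Assumption: "silent" means the peak |L + R| in the frame is below `threshold`.
    """
    if len(left) != len(right):
        raise ValueError("L and R must have the same number of samples")
    kept_left, kept_right = [], []
    for i in range(0, len(left), frame_len):
        frame_l = left[i:i + frame_len]
        frame_r = right[i:i + frame_len]
        level = max(abs(l + r) for l, r in zip(frame_l, frame_r))
        if level >= threshold:  # voiced frame: keep it in both channels
            kept_left.extend(frame_l)
            kept_right.extend(frame_r)
    return kept_left, kept_right


# Hypothetical samples: only the middle frame carries sound (on L only).
l_out, r_out = delete_silent_frames([0, 0, 5, 6, 0, 0], [0, 0, 0, 0, 0, 0],
                                    frame_len=2, threshold=1)
print(l_out, r_out)  # [5, 6] [0, 0]
```

Note that the two channels are still carried separately after deletion, which is why this approach needs L/R-aware processing and memory downstream.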
Patent Document 1: JP-A-7-192392
Patent Document 2: JP 2002-297200 A

However, when stereo audio is speech-speed converted by simply processing the two channels in parallel, as in the first prior art, the L-channel and R-channel audio lose time synchronization and overlapping sections occur. Preventing this overlap, as in the second prior art, requires an additional algorithm for mutual time management of the L and R channels and additional audio memory for the independent L and R channels.

To solve the above problem, the speech speed conversion method of the present invention is characterized in that the input audio signals of a plurality of channels, for example the L-channel and R-channel signals of stereo audio, are added together and speech speed conversion is performed on the resulting monaural signal.

The speech speed conversion device comprises an adder that adds the input L-channel and R-channel stereo audio signals; a speech speed conversion processing unit that receives the audio signal made monaural by the adder and performs a predetermined speech speed conversion; and a distributor that distributes the output signal of the speech speed conversion processing unit to the L channel and the R channel and outputs it as an L/R audio signal. The speed-converted and distributed L/R audio signal is reproduced through a speaker or an earphone.

With the above configuration, the present invention prevents the overlap of L-channel and R-channel audio that arises in speech speed conversion of stereo audio when time synchronization is lost, and performs speech speed conversion with a simple circuit configuration, without adding an algorithm for mutual time management of the L and R channels or the memory that such an algorithm entails.

(Embodiment 1)
The first embodiment of the present invention is described below with reference to FIGS. 1 and 2.

FIG. 1 is a block diagram showing the configuration of the speech speed conversion device according to the first embodiment of the present invention. In FIG. 1, reference numeral 1 denotes an adder that adds the input L-channel and R-channel stereo audio signals; 2 denotes a speech speed conversion processing unit that receives the audio signal made monaural by the adder 1 and performs a predetermined speech speed conversion; and 3 denotes a distributor that distributes the signal output by the speech speed conversion processing unit 2 to the L channel and the R channel and outputs it as an L/R audio signal. The L/R audio signal distributed by the distributor 3 is reproduced by the L-channel and R-channel speakers. The L/R audio signal is the speed-converted monaural signal distributed to the L and R channels as signals carrying identical information.
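The signal flow of FIG. 1 can be sketched as three small functions. This is only an outline under stated assumptions: all function names are hypothetical, and the conversion step is a deliberately crude placeholder (sample repetition, which lowers the pitch) standing in for the pitch-preserving processing of unit 2, which the patent does not specify at the sample level.

```python
def add_channels(left, right):
    """Adder 1: sum L and R sample by sample to obtain a monaural signal."""
    if len(left) != len(right):
        raise ValueError("L and R must have the same number of samples")
    return [l + r for l, r in zip(left, right)]


def placeholder_speed_conversion(mono, ratio=2):
    """Stand-in for speech speed conversion processing unit 2.

    Assumption: repeats each sample `ratio` times (an integer). A real unit
    would lengthen the speech without changing its pitch.
    """
    return [sample for sample in mono for _ in range(ratio)]


def distribute(mono):
    """Distributor 3: output the same monaural signal on both L and R."""
    return list(mono), list(mono)


def convert_stereo(left, right, ratio=2):
    """Adder -> speed conversion -> distributor, as in FIG. 1."""
    mono = add_channels(left, right)
    slowed = placeholder_speed_conversion(mono, ratio)
    return distribute(slowed)


out_l, out_r = convert_stereo([1, 2], [10, 20])
print(out_l, out_r)  # [11, 11, 22, 22] [11, 11, 22, 22]
```

Because only one signal passes through the conversion step, the two outputs cannot drift apart in time, whatever algorithm is substituted for the placeholder.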

The operation of the speech speed conversion device configured as described above, and the details of each part, are as follows. In audio reproduction in a television receiver, for example, the audio signal demodulated into L-channel and R-channel stereo signals is added by the adder 1 and converted into a monaural signal. If the adder 1 handles digital signals, it can easily be built from an adder circuit and a latch that aligns the timing of the two channels; if it handles analog signals, it can easily be built from a resistor network or the like. Its configuration is not limited, provided that it adds the audio information of a plurality of channels and converts it into a monaural signal.

Next, the speech speed conversion processing unit 2 receives the monaural signal output from the adder 1 and performs a predetermined speech speed conversion. The details of this processing are described below with reference to FIG. 2.

Suppose, as shown in FIG. 2(a), that the L channel carries the utterance "Sore wa watashi no desu" ("That is mine") and the R channel carries "Iie, kare no desu" ("No, it is his"). The output of the adder 1 is then a monaural signal, as shown in FIG. 2(b), carrying "Sore wa watashi no desu. Iie, kare no desu", which preserves the temporal relationship of the conversation between the L and R channels.

One example of the processing in the speech speed conversion processing unit 2 is as follows. First, silent sections are extracted from the input monaural signal, the vowels of the utterances to be converted are extracted, and so on. Assuming a conversion ratio that stretches speech to twice its duration, vowels are added to the speech sections, excluding the silent sections, to extend them to twice their length. The result is shown in FIG. 2(c).

As the result in FIG. 2(c) shows, the end time of the L-channel speech is extended from t1 to t3, and the start time of the R-channel speech is likewise shifted from t2 to t3. The R-channel speech therefore does not overlap the L-channel speech, and the speech speed conversion is performed while the temporal relationship of the conversation is preserved.
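The timing of FIG. 2(c) can be reproduced with a short calculation on the merged monaural timeline. The sketch assumes, as an illustration only, that speech sections are stretched by the conversion ratio and that the silence between sections is consumed to absorb the added duration; the function name, the segment representation, and the example times are hypothetical.

```python
def stretch_timeline(segments, ratio):
    """Stretch speech sections on a single (monaural) timeline.

    `segments` is a list of (label, start, end) speech intervals sorted by
    start time. Each section is lengthened by `ratio`; a section never starts
    before the previous stretched section has ended, so nothing overlaps.
    """
    stretched = []
    cursor = 0.0  # end time of the previous stretched section
    for label, start, end in segments:
        new_start = max(start, cursor)
        new_end = new_start + (end - start) * ratio
        stretched.append((label, new_start, new_end))
        cursor = new_end
    return stretched


# Hypothetical timing: L speech ends at t1 = 2.0 s, R speech starts at t2 = 2.5 s.
result = stretch_timeline([("L", 0.0, 2.0), ("R", 2.5, 4.0)], ratio=2.0)
print(result)  # [('L', 0.0, 4.0), ('R', 4.0, 7.0)]
```

Here the L-channel speech ends at t3 = 4.0 s and the R-channel speech starts at that same instant, matching the behaviour described for FIG. 2(c).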

Next, the signal output from the speech speed conversion processing unit 2 is distributed by the distributor 3 to the L channel and the R channel as an L/R audio signal. If the distributor 3 handles digital signals, it need only pass the same audio information to both channels; if it handles analog signals, it can be configured to pass the same signal to both channels by resistive division or the like. Its configuration is not limited, provided that it fulfills this function.

The speed-converted and distributed L/R audio signal is then supplied to a circuit for speaker reproduction, such as an amplifier circuit, and reproduced by the speakers built into the television receiver, so that the television can be viewed.

As described above, the speech speed conversion device of the present invention achieves, with a simple circuit configuration that merely adds the L-channel and R-channel stereo audio signals, speech speed conversion free of time shift for stereo audio input, making television easier to listen to.

If speech speed conversion were performed on the stereo signal as is, as in the prior-art configurations, mutual time management of the L and R channels and extra memory would be expected to be necessary, making the speech speed conversion system larger. In the present invention, by contrast, the L/R audio signal that is reproduced is a distributed monaural signal, so sound localization information is lost, but the drawbacks expected of the prior-art configurations are avoided. A simple configuration achieves easy-to-follow speech speed conversion that preserves the temporal relationship of the conversation between the two channels, which is a substantial benefit for viewers, such as the elderly, who have difficulty following fast speech.

In particular, with independent L/R speech speed conversion, conversion of the R channel begins only after completion of the L-channel conversion has been detected. This adds processing time for the extra synchronization algorithm and keeps the R-channel conversion waiting until then, so a time lag arises between the end of the L-channel speech and the start of the R-channel speech, and the deviation from the picture caused by speech speed conversion grows.

With the speech speed conversion device of the present invention, however, silence between the two channels is handled simultaneously during conversion. As shown in FIG. 2(c), for example, the L-channel speech ends at t3 and the R-channel speech also starts at t3, with no time lag, so the deviation from the picture caused by speech speed conversion is kept to a minimum. The speech speed conversion device of the present invention is therefore especially effective for speech speed conversion in AV equipment that involves pictures, such as television receivers, VTRs, and DVD equipment.

Although this embodiment describes a conversion ratio of 2, the conversion ratio is not limited to this value and may be set as required. In that case only the timing corresponding to time t3 in FIG. 2 changes, and the same effects of the present invention are obtained.

Although the two-channel case of stereo audio has been described as an example, the same applies to 5.1-channel audio such as MPEG-2 AAC and Dolby Digital and to other multichannel audio signals: adding all audio channels and performing speech speed conversion on the resulting monaural signal achieves speech speed conversion free of time shift with a simple circuit configuration.
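The multichannel case changes only the adder. A minimal sketch, assuming equal-weight summation of every channel (the patent says only that all audio channels are added; the function name and the sample values are hypothetical):

```python
def downmix_to_mono(channels):
    """Add all channels sample by sample to obtain one monaural signal.

    `channels` is a list of equally long sample lists, for example six lists
    for 5.1-channel audio. Assumption: every channel is summed with weight 1.
    """
    if not channels:
        raise ValueError("at least one channel is required")
    length = len(channels[0])
    if any(len(channel) != length for channel in channels):
        raise ValueError("all channels must have the same number of samples")
    return [sum(samples) for samples in zip(*channels)]


# Six hypothetical channels standing in for a 5.1-channel input.
six_channels = [[1, 2] for _ in range(6)]
print(downmix_to_mono(six_channels))  # [6, 12]
```

The monaural result can then be passed through the same speech speed conversion and distribution steps as in the two-channel case.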

(Embodiment 2)
Next, a second embodiment of the present invention is described with reference to FIG. 3. FIG. 3 is a block diagram showing the configuration of the speech speed conversion device according to the second embodiment of the present invention. Components identical to those of the first embodiment are given the same reference numerals, and overlapping description is omitted as appropriate.

In FIG. 3, reference numeral 1 denotes an adder that adds the input L-channel and R-channel stereo audio signals; 2 denotes a speech speed conversion processing unit that receives the audio signal made monaural by the adder 1 and performs a predetermined speech speed conversion; and 3 denotes a distributor that distributes the signal output by the speech speed conversion processing unit 2 to the L channel and the R channel and outputs it as an L/R audio signal. The L/R audio signal distributed by the distributor 3 is sent to the L-channel and R-channel earphone outputs, while the speakers reproduce the ordinary input L-channel and R-channel stereo audio signals.

The operation of the speech speed conversion device configured as described above is as follows. In audio reproduction in the television receiver, the audio signal demodulated into L-channel and R-channel stereo signals is reproduced as is by the speakers built into the television receiver. Meanwhile, the same stereo audio signal is added by the adder 1 and converted into a monaural signal, subjected to a predetermined speech speed conversion by the speech speed conversion processing unit 2, distributed to the L and R audio channels by the distributor 3, and reproduced as the speed-converted L/R audio signal at an earphone terminal added to the television receiver. The speech speed conversion operation and the details of each component are the same as in Embodiment 1 and are omitted here.
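The routing of this second embodiment can be sketched as a single function that returns both outputs. The names are hypothetical, and the conversion step is passed in as a parameter because the patent leaves its internals open; the lambda in the example is a crude stand-in that does not preserve pitch.

```python
def route_outputs(left, right, convert):
    """Return the speaker and earphone signals of the FIG. 3 arrangement.

    speaker:  the original stereo signal, untouched.
    earphone: L + R, speed-converted by `convert`, then copied to both channels.
    `convert` is any function mapping a monaural sample list to a sample list.
    """
    if len(left) != len(right):
        raise ValueError("L and R must have the same number of samples")
    mono = [l + r for l, r in zip(left, right)]  # adder 1
    slowed = convert(mono)                       # unit 2 (supplied by the caller)
    return {
        "speaker": (list(left), list(right)),
        "earphone": (list(slowed), list(slowed)),  # distributor 3
    }


# Stand-in conversion for the example only: repeat each sample twice.
outputs = route_outputs([1, 2], [3, 4],
                        convert=lambda mono: [s for s in mono for _ in range(2)])
print(outputs["speaker"])   # ([1, 2], [3, 4])
print(outputs["earphone"])  # ([4, 4, 6, 6], [4, 4, 6, 6])
```

Keeping the speaker path outside the conversion chain is what lets listeners at normal speed and listeners using the earphone share the same receiver.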

With this configuration, the ordinary L-channel and R-channel stereo audio is reproduced by the speakers built into the television receiver, so viewers who can follow speech at normal speed listen through the speakers, while elderly viewers and others who cannot keep up with normal speech listen to the speed-converted audio through earphones connected to the earphone terminal. Both can watch the same television receiver, and the family can enjoy viewing together. For the elderly listener the signal becomes monaural and sound localization information is lost, but the benefit of easier listening through speech speed conversion outweighs this.

Furthermore, because the speed-converted signal is output at the earphone terminal, it can be used as the input to a wireless audio transmission function, such as an infrared transmitter, giving elderly viewers an even more comfortable and easy-to-hear television viewing environment.

It is also desirable to provide the earphone terminal output with a volume adjustment unit so that its volume can be adjusted independently of the speaker side. A pitch adjustment unit that varies the pitch while maintaining the predetermined speech speed can also be provided. Elderly listeners generally not only have difficulty following fast speech but also have reduced hearing sensitivity and a narrower audible range, so adding these units to the earphone terminal output lets them watch television at the settings they find easiest to hear, without affecting other viewers.

The speech speed conversion method and speech speed conversion device according to the present invention achieve speech speed conversion free of time shift with a simple circuit configuration. In particular, where a picture is displayed, the deviation between picture and sound can be kept as small as possible, so the invention is effective for speech speed conversion accompanied by pictures and is especially useful for speech speed conversion in AV equipment such as VTRs and DVD equipment as well as televisions.

FIG. 1: Block diagram of the speech speed conversion device according to the first embodiment of the present invention
FIG. 2: Timing diagram of the speech speed conversion of the present invention
FIG. 3: Block diagram of the speech speed conversion device according to the second embodiment of the present invention
FIG. 4: Block diagram of a conventional speech speed conversion device
FIG. 5: Timing diagram of conventional speech speed conversion
FIG. 6: Block diagram of the first conventional speech speed conversion device, which does not use the present invention
FIG. 7: Block diagram of the second conventional speech speed conversion device, which does not use the present invention

Explanation of symbols

1 Adder
2 Speech speed conversion processing unit
3 Distributor

Claims (3)

1. A speech speed conversion method comprising: adding input audio signals of a plurality of channels to convert them into a monaural signal; and applying speech speed conversion to the monaural signal.

2. A speech speed conversion device comprising: an adder that adds input L-channel and R-channel stereo audio signals; a speech speed conversion processing unit that receives the audio signal made monaural by the adder and performs a predetermined speech speed conversion; and a distributor that distributes the output signal of the speech speed conversion processing unit to the L channel and the R channel and outputs it as an audio LR signal, wherein the audio LR signal is reproduced by a speaker.

3. A speech speed conversion device comprising: an adder that adds input L-channel and R-channel stereo audio signals; a speech speed conversion processing unit that receives the audio signal made monaural by the adder and performs a predetermined speech speed conversion; and a distributor that distributes the output signal of the speech speed conversion processing unit to the L channel and the R channel and outputs it as an audio LR signal, wherein a speaker reproduces the stereo audio signal and an earphone terminal reproduces the audio LR signal.
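The adder, speech speed conversion processing unit, and distributor recited in the claims can be sketched as a three-stage pipeline. This is only an illustration under stated assumptions: signals are lists of floating-point samples, the adder halves the sum to keep it in range (the claims only say "adding"), and `stretch_by_frame_repeat` is a hypothetical stand-in that slows playback by repeating frames. The claims do not specify the conversion algorithm, so any function from a mono signal to a mono signal can be passed in its place.

```python
from typing import Callable, List, Sequence, Tuple

Signal = List[float]


def adder(left: Sequence[float], right: Sequence[float]) -> Signal:
    """Add the L and R channels sample by sample into one monaural signal."""
    if len(left) != len(right):
        raise ValueError("L and R must have the same number of samples")
    # Halving the sum is an assumption made here to avoid clipping.
    return [(l + r) / 2.0 for l, r in zip(left, right)]


def stretch_by_frame_repeat(mono: Sequence[float], frame_len: int = 2) -> Signal:
    """Hypothetical stand-in for speech speed conversion.

    Emits every frame twice, so the output is twice as long as the input.
    """
    out: Signal = []
    for start in range(0, len(mono), frame_len):
        frame = list(mono[start:start + frame_len])
        out.extend(frame)
        out.extend(frame)
    return out


def distributor(mono: Sequence[float]) -> Tuple[Signal, Signal]:
    """Distribute one converted signal to both the L and R output channels."""
    return list(mono), list(mono)


def convert_stereo(
    left: Sequence[float],
    right: Sequence[float],
    convert: Callable[[Sequence[float]], Signal] = stretch_by_frame_repeat,
) -> Tuple[Signal, Signal]:
    """Run adder -> speech speed conversion -> distributor."""
    return distributor(convert(adder(left, right)))


if __name__ == "__main__":
    out_l, out_r = convert_stereo([0.0, 0.5, 1.0, 0.5], [0.5, 0.5, 0.0, -0.5])
    print(len(out_l), len(out_r), out_l == out_r)
```

Because both output channels are copies of a single converted signal, they always have the same length and timing, which is why no mutual time management between the channels is needed in this arrangement.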
JP2003327784A 2003-09-19 2003-09-19 Method and device for speaking speed conversion Pending JP2005092021A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2003327784A JP2005092021A (en) 2003-09-19 2003-09-19 Method and device for speaking speed conversion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2003327784A JP2005092021A (en) 2003-09-19 2003-09-19 Method and device for speaking speed conversion

Publications (1)

Publication Number Publication Date
JP2005092021A true JP2005092021A (en) 2005-04-07

Family

ID=34457554

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2003327784A Pending JP2005092021A (en) 2003-09-19 2003-09-19 Method and device for speaking speed conversion

Country Status (1)

Country Link
JP (1) JP2005092021A (en)

Similar Documents

Publication Publication Date Title
JP4170808B2 (en) Information display device, information display method, and program
CN100505064C (en) Audio reproducing apparatus
JP3629253B2 (en) Audio reproduction device and audio reproduction control method used in the same
KR100739355B1 (en) Speech processing method and apparatus
RU2002126217A (en) SYSTEM FOR APPLICATION OF THE SIGNAL OF PRIMARY AND SECONDARY AUDIO INFORMATION
JP2005519537A (en) Delete and mute audio data playing in trick mode
JP2007318604A (en) Digital audio signal processor
WO2018173413A1 (en) Audio signal processing device and audio signal processing system
JP2013179570A (en) Reproduction device
CN100581234C (en) Recording and reproducing device
US20070192089A1 (en) Apparatus and method for reproducing audio data
JP2006317768A (en) Speaking speed conversion apparatus and speaking speed conversion program for controlling the speaking speed conversion apparatus
JP2005092021A (en) Method and device for speaking speed conversion
JP3550110B2 (en) Signal processing circuit and signal processing method
JP4212253B2 (en) Speaking speed converter
JP2001290500A (en) Speech speed converter, speaker device and television receiver
JP2005352330A (en) Speech division recording device
JP2006243128A (en) Reproducing device and reproducing method
JP2010093614A (en) Video signal playback apparatus
JP4551734B2 (en) Variable speed reproduction apparatus and variable speed reproduction method
US20060008093A1 (en) Media recorder system and method
JP2005128132A (en) Speech speed conversion method and speech speed conversion device
JP2006041660A (en) Method and apparatus of converting speaking speed
KR100808201B1 (en) Method for synchronizing audio/video data
JP2003195893A (en) Device and method for speech reproduction

Legal Events

Date Code Title Description
A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20060810

RD01 Notification of change of attorney

Free format text: JAPANESE INTERMEDIATE CODE: A7421

Effective date: 20060913

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20090518

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20090526

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20090626

RD01 Notification of change of attorney

Free format text: JAPANESE INTERMEDIATE CODE: A7421

Effective date: 20091119

A02 Decision of refusal

Free format text: JAPANESE INTERMEDIATE CODE: A02

Effective date: 20091208