JP2009008843A

JP2009008843A - Acoustic signal playback device and acoustic signal playback method

Info

Publication number: JP2009008843A
Application number: JP2007169345A
Authority: JP
Inventors: Hiroaki Takeda; 博昭竹田
Original assignee: Panasonic Corp
Current assignee: Panasonic Corp
Priority date: 2007-06-27
Filing date: 2007-06-27
Publication date: 2009-01-15

Abstract

PROBLEM TO BE SOLVED: To provide an acoustic signal playback device that reduces degradation of the viewing environment even if an error occurs in the acoustic signal to be played back, when acoustic signals of a plurality of channels are included.

SOLUTION: Acoustic data of the plurality of channels are held in frame units, and it is determined whether or not an error has occurred in the acoustic data of each channel in the frame. When an error is detected, the acoustic data of the channel having the error in that frame is replaced with predetermined silence data. Acoustic playback data is then generated based on the acoustic data of each channel in adjacent frames and the silence data, and the acoustic playback data is played back.

COPYRIGHT: (C)2009, JPO&INPIT

Description

The present invention relates to an acoustic signal reproduction device and an acoustic signal reproduction method for receiving and reproducing an encoded acoustic signal that arrives continuously as a bit stream.

In digital devices such as cellular phone terminals, digital television receivers, and digital audio devices, sound is often output by receiving a bit stream of a digital acoustic signal that has been compression-encoded according to a predetermined standard and decoding that bit stream with a decoder. AAC (Advanced Audio Coding), for example, is used as a compression encoding standard for audio signals.

In AAC compression encoding, the acoustic signal is converted from a time domain signal to a frequency domain signal using a modified DCT (discrete cosine transform), and is then encoded at a high compression ratio by exploiting the characteristics of human hearing to generate the bit stream of the acoustic signal.

When, for example, a mobile phone terminal reproduces a bit stream of an acoustic signal received from another station via a wireless transmission path, errors are likely to occur in the content of the bit stream due to noise and the like generated on the wireless transmission path. If a bit stream containing an uncorrectable error is decoded as it is and output as sound, the continuity of the acoustic signal, which is originally a continuous waveform, can no longer be maintained. The waveform becomes unnatural, and as a result noise (abnormal sound) that is unpleasant to the viewer appears in the output sound.

As one method for preventing such abnormal sound, it has been proposed to prepare special encoded data for reproducing a silence signal on the device in advance and, when an error is detected in the input bit stream, to decode the special encoded data instead of the original bit stream containing the error, so that silence is output as sound (see, for example, Patent Document 1).

It has also been proposed to interpolate data using the preceding and following frames when only one frame of the input bit stream is erroneous, to interpolate using the last normal frame when errors occur in consecutive frames, and to use the data as far as possible when only a part of the data is erroneous (see, for example, Patent Document 2).

Patent Document 1: JP 2003-091294 A
Patent Document 2: JP 2004-361731 A

However, the technique of Patent Document 2 is unlikely to suppress the generation of abnormal sound sufficiently.

The technique of Patent Document 1 requires special encoded data, used in place of the original bit stream containing the error, to be prepared in advance. When processing a bit stream of an acoustic signal compression-encoded according to a standard such as AAC, a separate pattern of encoded data must be prepared as the silence signal for each number of channels and each sampling rate, for example. An increase in the amount of data to be held in advance and in the complexity of the processing is therefore unavoidable.

Furthermore, the bit stream to be reproduced may contain acoustic signals of a plurality of independent channels. In the conventional technique, the output sound of all channels is silenced even when an error has occurred in only one channel. From the viewer's point of view, silence therefore occurs frequently, and the degradation of the viewing environment is perceived more strongly.

The present invention has been made in view of the above circumstances, and its object is to provide an acoustic signal reproduction device and an acoustic signal reproduction method capable of reducing the deterioration of the viewing environment even when an error occurs in the acoustic signal to be reproduced, in the case where acoustic signals of a plurality of channels are included.

To achieve the above object, a first acoustic signal reproduction device of the present invention comprises: a data holding unit that holds, in units of frames, content data having channel information regarding a plurality of channels and acoustic data corresponding to each channel; an error determination unit that determines whether or not there is an error in the acoustic data of each channel in the frame; a silence data replacement unit that, when it is determined that there is an error, replaces the acoustic data of the channel determined to have the error in the frame with predetermined silence data; a reproduction data generation unit that generates acoustic reproduction data for reproduction based on the acoustic data and the silence data of each channel in mutually adjacent frames; and a reproduction unit that reproduces the acoustic reproduction data.

With this configuration, when acoustic signals of a plurality of channels are included and an error occurs in the acoustic signal to be reproduced, the data of the affected channel in the frame in which the error occurred is replaced with silence data, so the deterioration of the viewing environment can be reduced. For example, for acoustic data included in digital television broadcast data or in recorded digital television broadcast data, only the affected channel can be silenced. In addition, because the silence data can be determined independently of the number of channels and the sampling rate of the acoustic data to be reproduced, holding and generating the silence data is extremely easy. Furthermore, there is no need to store a large amount of silence data, nor to provide a complicated electric circuit for generating it.

A second acoustic signal reproduction device of the present invention has a data input unit that receives the content data from an external content distribution device as streaming data.

With this configuration, even if the communication environment is poor while streaming data is being distributed from the external content distribution device and an error occurs in part of the acoustic data, the viewer can still listen to the part of the acoustic data that is unaffected.

A third acoustic signal reproduction device of the present invention has a domain conversion unit that converts frequency domain data to time domain data. The data holding unit holds frequency domain acoustic data, the error determination unit determines whether or not there is an error in the frequency domain acoustic data, and the reproduction data generation unit generates the acoustic reproduction data, after conversion by the domain conversion unit, based on the time domain acoustic data and silence data of each channel in mutually adjacent frames.

With this configuration, for example, frequency domain acoustic data received as streaming data is processed inside the acoustic signal reproduction device, which makes it possible to generate time domain reproduction data (acoustic reproduction data) in which silent periods during content reproduction are kept to a minimum.

A fourth acoustic signal reproduction device of the present invention has a content type determination unit that determines the type of the acoustic data. The domain conversion unit converts the acoustic data held in the data holding unit to time domain data, and when the content type determination unit determines that the acoustic data is not data including audio data of a plurality of languages, the silence data replacement unit replaces the time domain acoustic data of the channel determined to have the error in the frame with predetermined time domain silence data.

With this configuration, when the data is something other than audio data for bilingual broadcasting, for example, the acoustic data of the erroneous channel in the frame is replaced with silence data in the time domain. This makes it possible to reproduce the acoustic signal without abnormal sound and with little perceived degradation of the viewing environment.

A fifth acoustic signal reproduction device of the present invention has a content type determination unit that determines the type of the acoustic data. When the content type determination unit determines that the acoustic data is data including audio data of a plurality of languages, the silence data replacement unit replaces the frequency domain acoustic data of the channel determined to have the error in the frame with predetermined frequency domain silence data, and the domain conversion unit converts the frequency domain acoustic data and silence data to time domain data.

With this configuration, when a plurality of independent acoustic signals are input simultaneously as signals of a plurality of channels contained in one bit stream, as in bilingual broadcast content, abnormal sound can be suppressed independently for each channel. For example, when reproducing bilingual broadcast content containing main audio and sub audio, if the main audio data has no error and only the sub audio data is erroneous, the main audio can be reproduced as it is and only the sub audio silenced. A viewer who is listening only to the main audio can therefore continue to listen to main audio free of abnormal sound, unaffected by errors in the sub audio data, and the viewer experiences silence less often.

In a sixth acoustic signal reproduction device of the present invention, the content data is streaming data or digital television broadcast data.

With this configuration, if the communication environment deteriorates during reception of a digital television broadcast and the broadcast data cannot be received correctly, for example, the part containing the error can be muted and the normal part of the data reproduced. For example, if an error occurs only in the channel carrying English audio data while a Japanese and English bilingual broadcast is being received, only the channel carrying the English audio data is silenced, and the channel carrying the Japanese audio data can be reproduced as usual.

A first acoustic signal reproduction method of the present invention comprises, in an acoustic signal reproduction device: a step of holding, in units of frames, content data having channel information regarding a plurality of channels and acoustic data corresponding to each channel; a step of determining whether or not there is an error in the acoustic data of each channel in the frame; a step of replacing, when it is determined that there is an error, the acoustic data of the channel determined to have the error in the frame with predetermined silence data; a step of generating acoustic reproduction data for reproduction based on the acoustic data and the silence data of each channel in mutually adjacent frames; and a step of reproducing the acoustic reproduction data.

With this method, when acoustic signals of a plurality of channels are included and an error occurs in the acoustic signal to be reproduced, the data of the affected channel in the frame in which the error occurred is replaced with silence data, so the deterioration of the viewing environment can be reduced. For example, for acoustic data included in digital television broadcast data or in recorded digital television broadcast data, only the affected channel can be silenced. In addition, because the silence data can be determined independently of the number of channels and the sampling rate of the acoustic data to be reproduced, holding and generating the silence data is extremely easy. Furthermore, there is no need to store a large amount of silence data, nor to provide a complicated electric circuit for generating it.

According to the present invention, when acoustic signals of a plurality of channels are included, the deterioration of the viewing environment can be reduced even when an error occurs in the acoustic signal to be reproduced.

In particular, when a bit stream error occurs, either the time domain acoustic signal input to the time domain windowing unit is replaced with predetermined time domain silence data, or the frequency domain acoustic signal input to the frequency domain-time domain conversion unit is replaced with predetermined frequency domain silence data. There is therefore no need to prepare various patterns of silence data for different numbers of channels or sampling rates of the bit stream to be reproduced, and holding and generating the time domain or frequency domain silence data is extremely easy. That is, there is no need to store a large amount of silence data, nor to provide a complicated electric circuit for generating it.

An acoustic signal reproduction device and an acoustic signal reproduction method according to an embodiment of the present invention are described below with reference to the drawings.

FIG. 1 is a block diagram showing an example of the configuration of the acoustic signal reproduction device according to the embodiment of the present invention.

Here, it is assumed that a bit stream (data string) of an acoustic signal compression-encoded according to the AAC (Advanced Audio Coding) standard is input, that this data string is processed, and that the processed acoustic signal is reproduced. That is, the input bit stream SG1 input to the acoustic signal reproduction device 100 is an acoustic signal compression-encoded according to the AAC standard.

The acoustic signal reproduction device 100 comprises a bitstream interpretation unit 10 with a correction function, a frequency domain-time domain conversion unit 20, a time domain windowing unit 30, a previous frame data storage unit 40, a silence data generation unit 51, data path switchers 52 and 53, and a reproduction unit 60.

FIG. 2 shows an example of the configuration of the bitstream interpretation unit 10 with a correction function. The bitstream interpretation unit 10 with a correction function receives a bit stream from an external content distribution device frame by frame and temporarily holds it. It comprises a bitstream interpretation unit 11, a content type determination unit 12, and a frequency domain data silencer 13. The bitstream interpretation unit 10 with a correction function functions as the "data holding unit" and the "data input unit".

Each component of the acoustic signal reproduction device 100 may be implemented as dedicated hardware including logic circuits, or as a combination of microprocessor or DSP (Digital Signal Processor) hardware and a program that controls it.

The content type determination unit 12 has a function of determining the type of the content carried by the input bit stream SG1. Specifically, by examining the header of each frame constituting the input bit stream SG1, it determines whether the content type is bilingual sound (acoustic content containing audio in two languages), something other than bilingual sound, monaural sound (acoustic content in monaural format), stereo sound (acoustic content in stereo format), and so on, and supplies the result to the bitstream interpretation unit 11 as a content type signal SG11. These types of acoustic content are examples, and other types of acoustic content may also be used. The content may be, for example, television broadcast data of a digital television broadcast.

The acoustic content is an example of "acoustic data". Here, acoustic content broadly includes data relating to sound, such as sound data, voice data, and music data.

FIG. 5 shows an example of a frame constituting the input bit stream SG1. The frame 500 has a header area 510, a data area 520, and a trailer area 530. The header area 510 contains monaural information indicating monaural sound, stereo information indicating stereo sound, bilingual information indicating bilingual sound, channel information such as the number of channels, data length information indicating the length of the data area 520, and the like. For each channel, the data area 520 contains a channel ID 521 (521a, 521b, ...) identifying the channel and acoustic content data 522 (522a, 522b, ...) corresponding to that channel ID 521. FIG. 5 shows two channels, but the number of channels is not limited to two. The trailer area 530 contains information indicating the end of the frame.

For example, when the input bit stream SG1 is bilingual sound in Japanese and English, the channel ID 521a holds an ID indicating Japanese, the acoustic content data 522a holds the Japanese audio data, the channel ID 521b holds an ID indicating English, and the acoustic content data 522b holds the English audio data.
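The frame layout described for FIG. 5 can be pictured as a small data structure. The following is only an illustrative sketch: the class names, field names, and the string channel IDs "ja" and "en" are assumptions made for this example and do not appear in the specification, and the real AAC bit stream syntax is considerably more involved.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class ContentType(Enum):
    """Hypothetical content types carried in header area 510."""
    MONAURAL = "monaural"
    STEREO = "stereo"
    BILINGUAL = "bilingual"


@dataclass
class ChannelBlock:
    channel_id: str          # plays the role of channel ID 521
    content: List[float]     # plays the role of acoustic content data 522


@dataclass
class Frame:
    content_type: ContentType     # part of header area 510
    channels: List[ChannelBlock]  # data area 520
    end_of_frame: bool = True     # stands in for trailer area 530

    @property
    def channel_count(self) -> int:
        # Channel information (number of channels) from the header
        return len(self.channels)


# A bilingual frame with a Japanese channel and an English channel
frame = Frame(
    ContentType.BILINGUAL,
    [ChannelBlock("ja", [0.5, -0.25]), ChannelBlock("en", [0.1, 0.2])],
)
```

Keeping the channel ID next to its data mirrors the pairing of 521a with 522a and 521b with 522b, which is what allows later stages to treat each channel independently.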

The bitstream interpretation unit 11 interprets the contents of the input bit stream SG1 sequentially: it separates the acoustic signal into channels, decodes data that has been Huffman-encoded, for example, and checks for errors. The result of this processing is input to the frequency domain data silencer 13 as interpreted frequency domain data SG13. The bitstream interpretation unit 11 functions as the "error determination unit".

When the bitstream interpretation unit 11 detects an error in the input bit stream SG1, it outputs bitstream error information SG12 to the frequency domain data silencer 13.

When the bitstream interpretation unit 11 detects an error in the input bit stream SG1 and the content type of the input bit stream SG1 is other than bilingual sound, it also outputs a time domain correction instruction signal SG7 at the same time.

The output of SG12 may be omitted when the content type of the input bit stream SG1 is other than bilingual sound.
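The signalling rules above amount to a small decision table. The sketch below is an assumption-laden illustration: the function name `route_error` and the boolean representation of the signals are hypothetical, and it models the variant in which SG12 is omitted for content other than bilingual sound.

```python
from typing import Tuple


def route_error(is_bilingual: bool, error_detected: bool) -> Tuple[bool, bool]:
    """Return (sg12_output, sg7_output) for one frame.

    sg12_output: bitstream error information sent to the frequency
                 domain data silencer 13 (omitted here for content
                 other than bilingual sound, as the text permits).
    sg7_output:  time domain correction instruction signal sent to
                 the data path switchers 52 and 53.
    """
    sg12_output = error_detected and is_bilingual
    sg7_output = error_detected and not is_bilingual
    return sg12_output, sg7_output


# Bilingual content with an error is handled in the frequency domain
bilingual_case = route_error(is_bilingual=True, error_detected=True)
# Other content with an error is handled in the time domain
stereo_case = route_error(is_bilingual=False, error_detected=True)
```

In this reading, exactly one of the two silencing paths is active for any erroneous frame, and neither is active when no error is detected.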

The input bit stream SG1 is a frequency domain acoustic signal: when it was compression-encoded, the information was converted from the time domain to the frequency domain by a Fourier-related transform.

The frequency domain data silencer 13 normally outputs the interpreted frequency domain data SG13 received from the bitstream interpretation unit 11 unchanged as corrected frequency domain data SG2. When it recognizes from the bitstream error information SG12 that an error has occurred, however, it replaces the corrected frequency domain data SG2 with silence data frame by frame before output. That is, it outputs data whose value is 0 at every frequency as the silence data. This prevents abnormal sound from being generated. The frequency domain data silencer 13 functions as the "silence data replacement unit".

When the input bit stream SG1 consists of acoustic signals of a plurality of channels, the frequency domain data silencer 13 processes each channel separately and replaces the data with silence data only for the channel in which the error has occurred. When the content type of the input bit stream SG1 is other than bilingual sound, however, silencing is performed by a different method using the time domain correction instruction signal SG7, so the data replacement in the frequency domain data silencer 13 is skipped.
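The per-channel replacement performed by the frequency domain data silencer 13 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name, the dictionary keyed by channel ID, and the set of erroneous channels are all assumptions introduced for the example.

```python
from typing import Dict, List, Set


def silence_error_channels(
    spectra: Dict[str, List[float]],
    error_channels: Set[str],
) -> Dict[str, List[float]]:
    """Replace the coefficients of erroneous channels with zeros.

    spectra maps a (hypothetical) channel ID to that channel's
    frequency domain coefficients for one frame (SG13). The result
    corresponds to the corrected frequency domain data SG2.
    """
    corrected = {}
    for channel_id, coefficients in spectra.items():
        if channel_id in error_channels:
            # Silence data: value 0 at every frequency
            corrected[channel_id] = [0.0] * len(coefficients)
        else:
            # Error-free channels pass through unchanged
            corrected[channel_id] = list(coefficients)
    return corrected


frame_spectra = {"ja": [0.5, -0.25, 0.125], "en": [0.3, 0.1, -0.2]}
corrected = silence_error_channels(frame_spectra, error_channels={"en"})
```

Because the silence data is simply a run of zeros of the same length as the data it replaces, nothing needs to be stored per channel count or sampling rate, which is the simplification the description emphasizes.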

The frequency domain-time domain conversion unit 20 converts the input corrected frequency domain data SG2 from the frequency domain to the time domain, for example by IMDCT (inverse modified discrete cosine transform) processing. That is, the input frequency domain acoustic signal is information corresponding to a frequency spectrum distribution, and it is converted back to the original time-series acoustic signal using IMDCT processing, which is related to the inverse Fourier transform. The frequency domain-time domain conversion unit 20 functions as the "domain conversion unit".

FIG. 4 is a time chart showing an example of the structure of the acoustic signal handled by the acoustic signal reproduction device 100.

As shown in FIG. 4, the actual time-series acoustic signal consists of a large number of frames that appear one after another in time. Mutually adjacent frames overlap each other in time by up to half a frame. Each frame consists of 2048 samples of data, and the 1024 samples in the latter half of one frame overlap the 1024 samples in the first half of the frame that follows it.
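The frame timing of FIG. 4 can be checked with a little index arithmetic. The sketch below assumes zero-based frame numbering and that frame k starts at sample 1024 * k; the constant and function names are hypothetical.

```python
FRAME_LENGTH = 2048  # samples per frame, as stated in the description
HOP = 1024           # assumed start-to-start spacing of adjacent frames


def frame_span(k: int) -> range:
    """Sample indices covered by frame k (zero-based)."""
    return range(k * HOP, k * HOP + FRAME_LENGTH)


def overlap_with_next(k: int) -> range:
    """Sample indices shared by frame k and frame k + 1."""
    current, following = frame_span(k), frame_span(k + 1)
    return range(max(current.start, following.start),
                 min(current.stop, following.stop))


shared = overlap_with_next(3)
```

For frame 3 the span is samples 3072 to 5119, and the part shared with frame 4 is samples 4096 to 5119, which is the 1024-sample overlap between the latter half of one frame and the first half of the next.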

To restore the acoustic signal as it was before compression encoding, the data of mutually adjacent frames must be combined in the time domain windowing unit 30. The frequency domain-time domain conversion unit 20 therefore outputs the result of the IMDCT processing simultaneously as current frame time domain data SG3 and next frame time domain data SG4.

The current frame time domain data SG3 normally passes through the data path switcher 52 and is input to the time domain windowing unit 30 as corrected current frame time domain data SG31.

The next frame time domain data SG4 normally passes through the data path switcher 53 and is input to and stored in the previous frame data storage unit 40 as corrected next frame time domain data SG41. It is then read out from the previous frame data storage unit 40 at the required timing (after one frame has been processed) and input to the time domain windowing unit 30 as previous frame time domain data SG5.

The time domain windowing unit 30 adds the input corrected current frame time domain data SG31 and the previous frame time domain data SG5, and outputs the result as output acoustic data SG6. The time domain windowing unit 30 functions as the "reproduction data generation unit".
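The interplay between the previous frame data storage unit 40 and the time domain windowing unit 30 is essentially an overlap-add step. The sketch below is a simplified illustration under stated assumptions: the class name `OverlapAdder` is hypothetical, the first half of each decoded frame is taken to play the role of SG31 and the second half the role of SG41, and any window coefficients are left out so that only the addition described above is shown.

```python
from typing import List


class OverlapAdder:
    """Adds the current frame's first half to the stored previous half."""

    def __init__(self, half: int = 1024) -> None:
        self.half = half
        # Stands in for previous frame data storage unit 40; assumed
        # to start out silent before the first frame arrives.
        self.stored_tail: List[float] = [0.0] * half

    def process(self, frame: List[float]) -> List[float]:
        if len(frame) != 2 * self.half:
            raise ValueError("frame must contain 2 * half samples")
        current_half = frame[:self.half]   # role of SG31
        next_half = frame[self.half:]      # role of SG41
        # Output acoustic data SG6 = SG31 + SG5 (sample by sample)
        output = [a + b for a, b in zip(current_half, self.stored_tail)]
        self.stored_tail = next_half       # read back later as SG5
        return output


# Tiny frames (half = 2) keep the arithmetic easy to follow
adder = OverlapAdder(half=2)
first = adder.process([1.0, 2.0, 3.0, 4.0])
second = adder.process([10.0, 20.0, 30.0, 40.0])
```

In the second call the stored half [3.0, 4.0] from the first frame is added to [10.0, 20.0], which is how data from two adjacent frames contributes to each block of output samples.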

The silence data generation unit 51 generates time domain silence data SG8, which is supplied to an input of each of the data path switchers 52 and 53. The time domain silence data SG8 is data whose value is 0, and it is output continuously for a time corresponding to one frame.

The data path switchers 52 and 53 change their switch state based on the time domain correction instruction signal SG7 output from the bitstream interpretation unit 10 with a correction function. Specifically, for example, the bitstream interpretation unit 10 with a correction function does not send the time domain correction instruction signal SG7 when no error has occurred, so the switchers can recognize that an error has occurred when they receive SG7 and that no error has occurred when they do not. The data path switchers 52 and 53 function as the "silence data replacement unit".

When no error has occurred in the input bitstream SG1 or the corrected frequency domain data SG2, then, based on the time domain correction instruction signal SG7, the data path switcher 52 selects the current frame time domain data SG3 as the corrected current frame time domain data SG31, and the data path switcher 53 selects the next frame time domain data SG4 as the corrected next frame time domain data SG41.

When the content type of the input bitstream SG1 is other than bilingual audio and an error relating to the input bitstream SG1 or to the post-interpretation frequency domain data SG13 is detected, then, based on the time domain correction instruction signal SG7, the data path switcher 52 selects the time domain silence data SG8 as the corrected current frame time domain data SG31, and the data path switcher 53 selects the time domain silence data SG8 as the corrected next frame time domain data SG41.
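The selection made by the data path switchers 52 and 53 can be sketched as follows. The function name `select_paths` and the boolean `sg7_received` are hypothetical; the sketch assumes that the presence of the time domain correction instruction signal SG7 is all the switchers need to know.

```python
from typing import List, Tuple


def select_paths(
    sg3: List[float], sg4: List[float], sg7_received: bool
) -> Tuple[List[float], List[float]]:
    """Return (SG31, SG41) as chosen by data path switchers 52 and 53."""
    if sg7_received:
        # Error case: both switchers pass the time domain silence data SG8 (all zeros).
        return [0.0] * len(sg3), [0.0] * len(sg4)
    # Normal case: SG3 and SG4 pass through unchanged.
    return list(sg3), list(sg4)
```

Because the silence data is a fixed run of zeros, the same selection works for any frame length passed in.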

Therefore, in the acoustic signal reproduction device 100, when the content type of the input bitstream SG1 is other than bilingual audio and an error relating to the input bitstream SG1 is detected, the time domain windowing unit 30 adds the silence data input as the corrected current frame time domain data SG31 and the silence data input as the previous frame time domain data SG5, and outputs the result as the output acoustic data SG6. The output sound therefore becomes temporarily silent. In other words, the occurrence of an error results in a temporary silent state.

When no error has occurred in the input bitstream SG1, or when the content type of the input bitstream SG1 is bilingual audio, the time domain windowing unit 30 performs the processing required to reproduce the acoustic signal on the result of converting the corrected frequency domain data SG2 into the time domain by the frequency domain-time domain conversion unit 20, and can output the result as the output acoustic data SG6.

However, even when the content type of the input bitstream SG1 is bilingual audio, if an error has occurred in the input bitstream SG1, the internal processing of the frequency domain data silencer 13 replaces the corrected frequency domain data SG2 of the channel in which the error occurred with frequency domain silence data. Consequently, only the acoustic signals of channels without errors are reproduced and appear as the output acoustic data SG6. In this case as well, the sound of a channel in which the error would produce abnormal noise is temporarily controlled to a silent state.

The reproduction unit 60 reproduces the output acoustic data SG6 as an acoustic signal.

Next, an example of the operation of the acoustic signal reproduction device 100 will be described. FIG. 3 is a flowchart showing an example of the main operation of the acoustic signal reproduction device 100.

In step S11, the content type determination unit 12 examines the input bitstream SG1 and determines the content type. If it recognizes that the content is bilingual (broadcast) audio content, the process proceeds from step S11 to step S21. If it recognizes content other than bilingual broadcast audio content (for example, monaural broadcast audio content or stereo broadcast audio content), the process proceeds from step S11 to step S12.

In step S12, the bitstream interpretation unit 11 interprets the input bitstream SG1. While interpreting it, the unit performs, for example, a CRC (Cyclic Redundancy Check) and determines whether the input bitstream SG1 itself contains an error (or whether the amount of errors is equal to or greater than a predetermined threshold).
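As a rough illustration of the kind of check performed in step S12, the sketch below appends a checksum to a frame and verifies it on receipt. The source does not specify the CRC polynomial or the frame layout, so both are assumptions here: the example uses the CRC-32 from Python's standard `zlib` module and a hypothetical layout in which the last four bytes of a frame carry the checksum. The function names are also hypothetical.

```python
import zlib


def append_crc(payload: bytes) -> bytes:
    """Build a frame as payload plus a 4-byte big-endian CRC-32 (assumed layout)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def frame_has_error(frame: bytes) -> bool:
    """Return True when the stored CRC does not match the payload."""
    if len(frame) < 4:
        # Too short to contain a checksum at all; treat it as an error.
        return True
    payload, stored = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") != stored
```

A decoder built along these lines would run `frame_has_error` on each frame and use the result as the branch condition of step S13.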

In step S13, the bitstream interpretation unit 11 uses the result of step S12 to check whether there is an error. If there is an error, the process proceeds from step S13 to step S18. If there is no error, the process proceeds to step S14.

In step S14, the bitstream interpretation unit 11 outputs the post-interpretation frequency domain data SG13. The post-interpretation frequency domain data SG13 is input to the frequency domain data silencer 13 and output unchanged as the corrected frequency domain data SG2.

In step S15, the frequency domain-time domain conversion unit 20 converts the corrected frequency domain data SG2 into the time domain and generates, for each frame, the current frame time domain data SG3 and the next frame time domain data SG4.

In step S16, the time domain windowing unit 30 adds the corrected current frame time domain data SG31 and the previous frame time domain data SG5, and outputs the result as the output acoustic data SG6.

In step S17, the output acoustic data SG6 is reproduced as an acoustic signal.

In step S18, the time domain correction instruction signal SG7 is output only when, for example, an error has been detected as a result of the interpretation by the bitstream interpretation unit 11. Based on the time domain correction instruction signal SG7, the data path switchers 52 and 53 output silence data only for the period of the affected frame. That is, when the time domain correction instruction signal SG7 is output, the data path switcher 52 selects the time domain silence data SG8 instead of the current frame time domain data SG3 and outputs it as the corrected current frame time domain data SG31, and the data path switcher 53 selects the time domain silence data SG8 instead of the next frame time domain data SG4 and outputs it as the corrected next frame time domain data SG41.

When the time domain correction instruction signal SG7 does not appear, the data path switcher 52 selects the current frame time domain data SG3 and outputs it as the corrected current frame time domain data SG31, and the data path switcher 53 selects the next frame time domain data SG4 and outputs it as the corrected next frame time domain data SG41.

However, when the content type is bilingual audio content, the time domain correction instruction signal SG7 is not output. Therefore, even when an error has occurred in the bitstream, the data path switcher 52 selects the current frame time domain data SG3 and the data path switcher 53 selects the next frame time domain data SG4.

When step S18 has been executed, step S15 is skipped and the process proceeds to step S16. In this case the time domain silence data SG8 generated by the silence data generation unit 51 is used, so the current frame time domain data SG3 and the next frame time domain data SG4 are not needed, and the frequency domain-time domain conversion unit 20 may temporarily suspend its conversion processing.

On the other hand, when it is determined in step S11 that the content type is bilingual audio content, processing is performed for each channel of the input bitstream SG1.

In step S21, the bitstream interpretation unit 11 interprets the bitstream for the L channel contained in the input bitstream SG1 (for example, the main audio channel or the sub audio channel) and decodes that acoustic signal.

In step S22, the bitstream interpretation unit 11 determines whether an error relating to the data component of the L channel acoustic signal (for example, the acoustic content data 522a in FIG. 5) was detected in step S21. If there is an error, the process proceeds to step S24. If there is no error, the process proceeds to step S23.

In step S23, for the error-free data component of the L channel acoustic signal, the frequency domain data silencer 13 receives the post-interpretation frequency domain data SG13 output by the bitstream interpretation unit 11 and outputs it unchanged as the corrected frequency domain data SG2.

In step S24, for the data component of the L channel acoustic signal that contains an error, the frequency domain data silencer 13 receives the post-interpretation frequency domain data SG13 output by the bitstream interpretation unit 11, replaces it with frequency domain silence data prepared in advance (data whose value is 0 at every frequency), and outputs the result as the corrected frequency domain data SG2.

In step S25, the bitstream interpretation unit 11 interprets the bitstream for the R channel contained in the input bitstream SG1 (for example, the sub audio channel or the main audio channel) and decodes that acoustic signal.

In step S26, the bitstream interpretation unit 11 determines whether an error relating to the data component of the R channel acoustic signal (for example, the acoustic content data 522b in FIG. 5) was detected in step S25. If there is an error, the process proceeds to step S28. If there is no error, the process proceeds to step S27.

In step S27, for the error-free data component of the R channel acoustic signal, the frequency domain data silencer 13 receives the post-interpretation frequency domain data SG13 output by the bitstream interpretation unit 11 and outputs it unchanged as the corrected frequency domain data SG2.

In step S28, for the data component of the R channel acoustic signal that contains an error, the frequency domain data silencer 13 receives the post-interpretation frequency domain data SG13 output by the bitstream interpretation unit 11, replaces it with the frequency domain silence data prepared in advance, and outputs the result as the corrected frequency domain data SG2.

Even when the input bitstream SG1 being processed is bilingual audio content, the process proceeds to step S15 after step S27 or S28 has been executed. The corrected frequency domain data SG2 output by the bitstream interpretation unit with correction function 10 is therefore converted into time domain data for each channel by the frequency domain-time domain conversion unit 20, and the current frame time domain data SG3 and the next frame time domain data SG4 are generated.
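The branching among steps S11 to S28 described above can be summarized in a short Python sketch that returns the sequence of steps taken for one frame. The function `plan_steps` and its parameters are hypothetical, and the sketch assumes that steps S23 and S24 both continue to step S25, which the description implies but does not state outright.

```python
from typing import List


def plan_steps(
    bilingual: bool,
    stream_error: bool = False,
    l_error: bool = False,
    r_error: bool = False,
) -> List[str]:
    """List the flowchart steps executed for one frame."""
    steps = ["S11"]  # content type determination
    if bilingual:
        # Per-channel handling: silence only the channel that has an error.
        steps += ["S21", "S22", "S24" if l_error else "S23"]
        steps += ["S25", "S26", "S28" if r_error else "S27"]
        steps += ["S15"]  # frequency-to-time conversion always runs
    else:
        steps += ["S12", "S13"]
        # On error, S18 supplies time domain silence and S15 is skipped.
        steps += ["S18"] if stream_error else ["S14", "S15"]
    steps += ["S16", "S17"]  # windowing/addition, then reproduction
    return steps
```

For example, `plan_steps(False, stream_error=True)` yields the path that skips step S15, while any bilingual call always includes it.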

When the input content is bilingual audio content, the data path switchers 52 and 53 do not select the time domain silence data SG8, so the time domain windowing unit 30 generates the output acoustic data SG6 based on the current frame time domain data SG3 and the next frame time domain data SG4. However, when an error has occurred in the input bitstream SG1, the sound of the channel containing the error is replaced with silence data inside the frequency domain data silencer 13, so for that channel a silent signal is output as the output acoustic data SG6.

When the input content is bilingual audio content, each channel is controlled independently. For example, even when the main audio channel has no error and only the sub audio channel has an error, the acoustic signal of the main audio channel is reproduced as is as the output acoustic data SG6, and only the sub audio channel is replaced with silence.
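The independent per-channel control can be sketched as below. The function name `silence_errored_channels`, the channel keys, and the dictionary representation of the post-interpretation frequency domain data SG13 are all hypothetical choices made for this illustration.

```python
from typing import Dict, List


def silence_errored_channels(
    sg13: Dict[str, List[float]], has_error: Dict[str, bool]
) -> Dict[str, List[float]]:
    """Return SG2: per-channel coefficients, zeroed where an error was detected."""
    sg2: Dict[str, List[float]] = {}
    for channel, coeffs in sg13.items():
        if has_error.get(channel, False):
            # Frequency domain silence data: value 0 at every frequency.
            sg2[channel] = [0.0] * len(coeffs)
        else:
            # Error-free channels pass through unchanged.
            sg2[channel] = list(coeffs)
    return sg2
```

With `{"main": [0.2, -0.1], "sub": [0.4, 0.3]}` as input and an error flagged only on `"sub"`, the main channel keeps its coefficients and the sub channel becomes all zeros.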

Therefore, a viewer who mostly listens only to the channel in which errors occur relatively rarely can continue listening to the desired channel even while the other channel is silent, so the viewer encounters silence less often.

Although bilingual audio has been described here as an example of audio content that uses a plurality of channels, the audio content may be in two or more languages. The present invention is also applicable to audio content that uses a plurality of channels even if that content does not contain two or more foreign languages.

According to the acoustic signal reproduction device 100 described above, silence data can be substituted frame by frame either for the frequency domain acoustic signal that has been decoded by the bitstream interpretation unit 11 or for the time domain acoustic signal produced by the frequency domain-time domain conversion unit 20. The output can therefore be controlled to silence simply by preparing fixed silence data, regardless of differences in the channel configuration or sampling rate of the acoustic signal input as the input bitstream SG1. In addition, neither a large-capacity memory for holding a large amount of encoded data nor a complex circuit for generating encoded data is required. The silencing control may be omitted for errors that can be corrected or that are too small for the viewer to notice.

As described above, the present invention is useful as an acoustic signal reproduction device or the like that receives and reproduces a bitstream of an acoustic signal compression-encoded in accordance with a standard such as AAC. It is useful, for example, in television receivers that receive digital television broadcasts, mobile phone terminals capable of receiving digital broadcasts, and digital audio equipment.

FIG. 1 is a block diagram showing an example of the configuration of the acoustic signal reproduction device in an embodiment of the present invention.
FIG. 2 is a block diagram showing an example of the configuration of the bitstream interpretation unit with correction function in the embodiment of the present invention.
FIG. 3 is a flowchart showing an example of the main operation of the acoustic signal reproduction device in the embodiment of the present invention.
FIG. 4 is a time chart showing a configuration example of the acoustic signal handled by the acoustic signal reproduction device in the embodiment of the present invention.
FIG. 5 is a diagram showing an example of a frame processed by the acoustic signal reproduction device in the embodiment of the present invention.

Explanation of symbols

100 Acoustic signal reproduction device
10 Bitstream interpretation unit with correction function
11 Bitstream interpretation unit
12 Content type determination unit
13 Frequency domain data silencer
20 Frequency domain-time domain conversion unit
30 Time domain windowing unit
40 Previous frame data storage unit
51 Silence data generation unit
52, 53 Data path switcher
60 Reproduction unit
500 Frame
510 Header area
520 Data area
521 Channel ID
522 Acoustic content data
530 Trailer area
SG1 Input bitstream
SG2 Corrected frequency domain data
SG3 Current frame time domain data
SG4 Next frame time domain data
SG5 Previous frame time domain data
SG6 Output acoustic data
SG7 Time domain correction instruction signal
SG8 Time domain silence data
SG11 Content type signal
SG12 Bitstream error information
SG13 Post-interpretation frequency domain data
SG31 Corrected current frame time domain data
SG41 Corrected next frame time domain data

Claims

1. An acoustic signal reproduction device comprising:
a data holding unit that holds, in units of frames, content data having channel information on a plurality of channels and acoustic data corresponding to each channel;
an error determination unit that determines whether there is an error in the acoustic data of each channel in the frame;
a silence data replacement unit that, when it is determined that there is an error, replaces the acoustic data of the channel determined to have the error in the frame with predetermined silence data;
a reproduction data generation unit that generates acoustic reproduction data for reproduction based on the acoustic data and the silence data of each channel in mutually adjacent frames; and
a reproduction unit that reproduces the acoustic reproduction data.

2. The acoustic signal reproduction device according to claim 1, further comprising
a data input unit that inputs the content data from an external content distribution device.

3. The acoustic signal reproduction device according to claim 1, further comprising
a domain conversion unit that converts frequency domain data into time domain data, wherein
the data holding unit holds acoustic data in the frequency domain,
the error determination unit determines whether there is an error in the frequency domain acoustic data, and
the reproduction data generation unit generates the acoustic reproduction data based on the time domain acoustic data and silence data of each channel in mutually adjacent frames after conversion by the domain conversion unit.

4. The acoustic signal reproduction device according to claim 3, further comprising
a content type determination unit that determines the type of the acoustic data, wherein
the domain conversion unit converts the acoustic data held in the data holding unit into time domain data, and
the silence data replacement unit, when the content type determination unit determines that the acoustic data is not data including audio data of a plurality of languages, replaces the time domain acoustic data of the channel determined to have the error in the frame with predetermined time domain silence data.

5. The acoustic signal reproduction device according to claim 3, further comprising
a content type determination unit that determines the type of the acoustic data, wherein
the silence data replacement unit, when the content type determination unit determines that the acoustic data is data including audio data of a plurality of languages, replaces the frequency domain acoustic data of the channel determined to have the error in the frame with predetermined frequency domain silence data, and
the domain conversion unit converts the frequency domain acoustic data and silence data into time domain data.

6. The acoustic signal reproduction device according to claim 1, wherein
the content data is streaming data or digital television broadcast data.

7. An acoustic signal reproduction method in an acoustic signal reproduction device, comprising:
holding, in units of frames, content data having channel information on a plurality of channels and acoustic data corresponding to each channel;
determining whether there is an error in the acoustic data of each channel in the frame;
when it is determined that there is an error, replacing the acoustic data of the channel determined to have the error in the frame with predetermined silence data;
generating acoustic reproduction data for reproduction based on the acoustic data and the silence data of each channel in mutually adjacent frames; and
reproducing the acoustic reproduction data.