JP2017009743A

JP2017009743A - Voice data processing device, and electronic apparatus

Info

Publication number: JP2017009743A
Application number: JP2015123870A
Authority: JP
Inventors: 晃小池; Akira Koike
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2015-06-19
Filing date: 2015-06-19
Publication date: 2017-01-12

Abstract

PROBLEM TO BE SOLVED: To reproduce a sound that is closer to the actual sound.
SOLUTION: An audio data processing device 1 has a high-frequency interpolation unit 30 that converts the time domain signal of an original sound, decoded from audio data, into a frequency domain signal. When interpolating a frequency fa component of the high-frequency region that was cut from the original sound, the unit extracts the amplitude information of the original sound at frequency fa/3 in the frequency domain signal and generates a first interpolation frequency domain signal in which that amplitude information multiplied by 1/3 is used as the amplitude information Aa of frequency fa. It likewise extracts the amplitude information of the original sound at frequency fa/2 and generates a second interpolation frequency domain signal in which that amplitude information multiplied by 1/2 is used as the amplitude information Ae of frequency fa. The two interpolation signals are each converted to a time domain signal and combined, and the combined time domain signal is then combined with the time domain signal of the original sound.
SELECTED DRAWING: Figure 1

Description

The present invention relates to an audio data processing device that interpolates the high-frequency components of an original sound decoded from audio data, and to an electronic apparatus that includes the audio data processing device.

In recent years, higher sound quality has been demanded of televisions along with higher image quality and higher resolution, and televisions compatible with so-called high-resolution sound sources (hereinafter, hi-res sound sources) have appeared. A hi-res sound source also contains frequency components above 20 kHz, which had been regarded as the limit of human hearing and were therefore cut during compression. This is because it has recently been found that humans are also sensitive to frequency components beyond that audible limit.

Although AV (Audio Visual) devices such as televisions that support hi-res sound sources and reproduce high-frequency components have appeared, the broadcast audio data and CD audio data input to those devices are usually not hi-res data; in most cases the high-frequency region has been cut by data compression. Even with an AV device that supports hi-res sound sources, the reproduced sound contains no high-frequency components if the input data is not hi-res data. To include high-frequency components in that case, they must be interpolated.

The high-frequency component interpolation method disclosed in Patent Document 1 aims to reproduce audio data lossily compressed under the MP3 (MPEG-1 Audio Layer 3) standard in a range close to that of the actual sound. It first extracts a fundamental tone from the frequency components of the MP3 audio data and generates frequency components that are harmonics of that fundamental tone. Of the generated harmonic components, only those in the region cut during compression are kept, the rest are discarded, and the kept components are amplified. The method then extracts the temporal variation characteristic of the timbre from the MP3 music data and, based on that characteristic, combines the amplified frequency components with the frequency components of the original MP3 audio data for reproduction.

Patent Document 1: JP 2001-324996 A

However, because an actual sound also contains components other than a fundamental tone and its harmonics, the method disclosed in Patent Document 1 cannot be said to be sufficient for reproducing the actual sound.

The present invention has been made in view of the above circumstances, and its object is to provide an audio data processing device capable of reproducing a sound closer to the actual sound, and an electronic apparatus including the audio data processing device.

To solve the above problem, a first technical means of the present invention is an audio data processing device including a high-frequency interpolation unit that interpolates, into the time domain signal of an original sound decoded from audio data, frequency components of the high-frequency region that were cut from the original sound. The high-frequency interpolation unit has: a conversion unit that converts the time domain signal of the decoded original sound into a frequency domain signal; a first interpolation signal generation unit that, when interpolating a frequency fa component of the cut high-frequency region, extracts the amplitude information for frequency fa/(2n+1) (n is a positive integer) in the frequency domain signal of the original sound and generates a first interpolation frequency domain signal in which the extracted amplitude information multiplied by 1/(2n+1) is used as the amplitude information Aa of frequency fa; a second interpolation signal generation unit that extracts the amplitude information for frequency fa/(2n) in the frequency domain signal of the original sound and generates a second interpolation frequency domain signal in which the extracted amplitude information multiplied by 1/(2n) is used as the amplitude information Ae of frequency fa; and a combining unit that combines the signal obtained by converting the first interpolation frequency domain signal into a time domain signal with the signal obtained by converting the second interpolation frequency domain signal into a time domain signal.
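
The two amplitude rules of the first technical means can be written compactly as follows. This is only a restatement of the text above; the symbol A(f) for the amplitude of the original sound at frequency f is an assumed notation that the source does not use.

```latex
A_a(f_a) = \frac{1}{2n+1}\, A\!\left(\frac{f_a}{2n+1}\right),
\qquad
A_e(f_a) = \frac{1}{2n}\, A\!\left(\frac{f_a}{2n}\right),
\qquad n = 1, 2, \dots
```

With n = 1 these reduce to the fa/3 source frequency with gain 1/3 and the fa/2 source frequency with gain 1/2 used in the first embodiment.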

According to a second technical means of the present invention, in the first technical means, the high-frequency interpolation unit has a determination unit that applies, to the time domain signal of the original sound, a filter that passes the frequency fa/(2n), and determines whether the time interval Tin from the maximum value to the minimum value of the filtered time domain signal is 4n/fa. When the time interval Tin of the original sound filtered at frequency fa/(2n) is 4n/fa, the second interpolation signal generation unit sets the amplitude information Ae of frequency fa to 0.

According to a third technical means of the present invention, in the first or second technical means, the combining unit combines the signal obtained by converting the first interpolation frequency domain signal into a time domain signal and the signal obtained by converting the second interpolation frequency domain signal into a time domain signal with an adjustable mixing ratio.

A fourth technical means of the present invention is an electronic apparatus including the audio data processing device of any one of the first to third technical means.

According to a fifth technical means of the present invention, the audio data processing device of the fourth technical means is a television receiver.

According to the present invention, a sound closer to the actual sound can be reproduced.

FIG. 1 is a block diagram for explaining the audio data processing device according to the first embodiment of the present invention.
FIG. 2 is a diagram showing the time domain signal of the original sound.
FIG. 3 is a diagram showing the time domain signal in which the high-frequency components of the original sound have been interpolated.
FIG. 4 is a diagram for explaining the method of detecting the cutoff frequency of the original sound.
FIG. 5 is a diagram showing an example of the frequency domain signal of the original sound for each unit time.
FIG. 6 is a diagram for explaining the generation of the first interpolation frequency domain signal used to interpolate the high-frequency components.
FIG. 7 is a diagram for explaining the generation of the second interpolation frequency domain signal used to interpolate the high-frequency components.
FIG. 8 is a diagram for explaining the determination process in the determination unit of the audio data processing device of FIG. 1.

Preferred embodiments of the audio data processing device and the electronic apparatus according to the present invention are described below with reference to the drawings. In the following description, components given the same reference numerals in different drawings are equivalent, and their description may be omitted.

(First embodiment)
FIG. 1 is a block diagram for explaining the audio data processing device according to the first embodiment of the present invention. FIG. 2 shows the time domain signal of the original sound decoded by the decoding unit of the audio data processing device of FIG. 1. FIG. 3 shows the time domain signal in which the high-frequency components of the original sound have been interpolated by the audio data processing device of FIG. 1. FIG. 4 is a diagram for explaining the cutoff frequency detection process in the band detection unit of the audio data processing device of FIG. 1. FIG. 5 shows an example of the frequency domain signal of the original sound for each unit time. FIG. 6 is a diagram for explaining the generation of the first interpolation frequency domain signal used to interpolate the high-frequency components. FIG. 7 is a diagram for explaining the generation of the second interpolation frequency domain signal used to interpolate the high-frequency components. FIG. 8 is a diagram for explaining the determination process in the determination unit of the audio data processing device of FIG. 1.

The audio data processing device 1 shown in the figure includes a decoding unit 10, an upsampling unit 20, a high-frequency interpolation unit 30, and a combining unit 50.

The decoding unit 10 decodes the lossily compressed audio data input to the audio data processing device 1 and generates the time domain signal of the original sound as shown in FIG. 2. Possible lossy compression schemes include MP3 and AAC (Advanced Audio Coding).
The upsampling unit 20 upsamples the time domain signal of the original sound decoded by the decoding unit 10 to twice the sampling rate used at the time of lossy compression.
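
The source does not say how the upsampling unit 20 doubles the sampling rate. As one possible illustration, the sketch below inserts the midpoint between neighbouring samples (linear interpolation). Both the function name `upsample_2x_linear` and the choice of linear interpolation are assumptions made here; a practical implementation would more likely use a proper interpolation filter.

```python
def upsample_2x_linear(samples):
    """Double the sampling rate by inserting the midpoint between neighbours.

    Linear interpolation is an assumption; the source only states the 2x rate.
    """
    if not samples:
        return []
    out = []
    for cur, nxt in zip(samples, samples[1:]):
        out.append(cur)
        out.append((cur + nxt) / 2.0)
    out.append(samples[-1])
    out.append(samples[-1])  # hold the last value so the length is exactly doubled
    return out
```

Doubling the sampling rate doubles the highest representable frequency, which is what makes room for the interpolation region above the cutoff frequency.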

The high-frequency interpolation unit 30 generates data for interpolating the high-frequency components that are not contained in the time domain signal of the original sound. It generates this data from the time domain signal of the original sound upsampled by the upsampling unit 20. The data output by the high-frequency interpolation unit 30 is time domain data. A specific processing example of the high-frequency interpolation unit 30 is described later.
The combining unit 50 combines the time domain signal of the original sound, decoded by the decoding unit 10 and upsampled by the upsampling unit 20, with the time domain data for high-frequency interpolation output from the high-frequency interpolation unit 30, and outputs the result.

With this configuration, the audio data processing device 1 can output audio with a time domain signal such as that of FIG. 3, that is, a time domain signal that also contains high-frequency components not contained in the time domain signal of the original sound of FIG. 2.

The high-frequency interpolation unit 30, which is the characteristic part of the audio data processing device 1, has a band detection unit 31, a frequency domain conversion unit 32, an odd harmonic generation unit 33, a first gain multiplication unit 34, a first reconversion unit 35, a determination unit 36, an even harmonic generation unit 37, a second gain multiplication unit 38, a second reconversion unit 39, and a combining unit 40.

The band detection unit 31 detects the cutoff frequency described below in order to determine the frequency band contained in the time domain signal of the original sound. For example, the band detection unit 31 applies a Fourier transform or the like to the first few seconds of the time domain signal of the original sound input from the decoding unit 10, creates a frequency histogram as shown in FIG. 4, and detects as the cutoff frequency fc the frequency at which the count is found to have fallen to or below a threshold. If there are several frequencies at which the count falls to or below the threshold, the highest one is taken as the cutoff frequency fc. The method of determining the cutoff frequency fc is not limited to this example. For instance, information on the cutoff frequency may be attached to the lossily compressed audio data and input to the band detection unit 31, which then determines the cutoff frequency from that information. The cutoff frequency fc detected by the band detection unit 31 is output to the odd harmonic generation unit 33 and the even harmonic generation unit 37. The following description assumes that the cutoff frequency fc is 24 kHz.
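
The threshold rule for the cutoff frequency can be sketched as below. This is one reading of the description, not the patented implementation: the function name `detect_cutoff`, the list-based histogram, and the interpretation of "falls to or below the threshold" as a downward crossing between adjacent bins are assumptions.

```python
def detect_cutoff(freqs_hz, levels, threshold):
    """Return the highest frequency at which the level drops from above the
    threshold to at or below it, or None if no such drop exists.

    freqs_hz and levels are parallel lists ordered by increasing frequency.
    """
    cutoff = None
    for i in range(1, len(levels)):
        if levels[i - 1] > threshold and levels[i] <= threshold:
            cutoff = freqs_hz[i]  # later (higher) crossings overwrite earlier ones
    return cutoff
```

Scanning upward and overwriting the result implements the rule that, when several candidate frequencies exist, the highest one is used.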

The frequency domain conversion unit 32 converts the time domain signal of the original sound upsampled by the upsampling unit 20, for each unit time, into a frequency domain signal for each unit time as shown in FIG. 5, using a general technique such as that used in FFT (Fast Fourier Transform) analyzers and equalizers. The converted frequency domain signal of the original sound for each unit time is output to the odd harmonic generation unit 33 and the even harmonic generation unit 37.

The odd harmonic generation unit 33 interpolates the interpolation region of the frequency domain signal of the original sound for each unit time, that is, the region above the cutoff frequency fc and up to twice the cutoff frequency fc, based on the amplitude (level) information of frequencies at or below the cutoff frequency fc.
For example, the odd harmonic generation unit 33 interpolates the amplitude information of frequencies above 24 kHz and up to 48 kHz based on the amplitude information of frequencies above 8 kHz and up to 16 kHz contained in the frequency domain signal for each unit time as shown in FIG. 6(A). More specifically, for each frequency fa in the interpolation region 24 kHz < fa ≤ 48 kHz, the odd harmonic generation unit 33 adds the amplitude information of frequency fa/3 as the amplitude information of fa, producing frequency domain signal data as shown in FIG. 6(B). For example, the amplitude information of 16 kHz is added as the amplitude information of 48 kHz. The spacing between the frequencies fa is, for example, 0.3 kHz.

The first gain multiplication unit 34 multiplies the amplitude information of each frequency fa in the frequency domain signal data of FIG. 6(B), output from the odd harmonic generation unit 33, by a first gain of 1/3, and generates a first interpolation frequency domain signal as shown in FIG. 6(C) in which the product is the amplitude information Aa of each frequency fa. The "first interpolation signal generation unit" of the present invention can be formed by the odd harmonic generation unit 33 and the first gain multiplication unit 34.
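
The combined effect of the odd harmonic generation unit 33 and the first gain multiplication unit 34 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `interpolation_signal`, the dictionary representation of the spectrum (integer frequency in Hz mapped to amplitude), and the choice to treat a non-integer source frequency as missing are all hypothetical.

```python
def interpolation_signal(spectrum, fc_hz, step_hz, divisor):
    """Build {fa: amplitude} for every fa with fc < fa <= 2 * fc.

    Each target frequency fa receives the amplitude found at fa / divisor,
    scaled by 1 / divisor. A source frequency that is absent from the
    spectrum contributes 0.
    """
    result = {}
    fa = fc_hz + step_hz
    while fa <= 2 * fc_hz:
        if fa % divisor == 0:
            source_amp = spectrum.get(fa // divisor, 0.0)
        else:
            source_amp = 0.0  # assumption: no bin exists at a fractional frequency
        result[fa] = source_amp / divisor
        fa += step_hz
    return result


# Example with fc = 24 kHz, 0.3 kHz spacing, and the 1/3 rule (divisor = 3).
spectrum = {8100: 3.0, 12000: 6.0, 16000: 9.0}
first = interpolation_signal(spectrum, 24000, 300, 3)
```

Calling the same function with `divisor=2` would give the corresponding fa/2 mapping with gain 1/2, without the determination step.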

The first reconversion unit 35 converts the first interpolation frequency domain signal output from the first gain multiplication unit 34 into a time domain signal, for example by an inverse Fourier transform, and outputs it to the combining unit 40. If phase information is needed for a frequency fa of the interpolation region during this conversion, the phase information for frequency fa/3 is used.

The determination unit 36 applies, to the time domain signal decoded from the lossily compressed audio data, a filter that passes one of the multiple frequencies fa/2, and determines whether the filtered time domain signal of the original sound satisfies a predetermined condition described later.

Like the odd harmonic generation unit 33, the even harmonic generation unit 37 interpolates the amplitude information of frequencies in the interpolation region based on the amplitude information of frequencies at or below the cutoff frequency fc contained in the frequency domain signal of the original sound. Unlike the odd harmonic generation unit 33, however, the even harmonic generation unit 37 adds either the amplitude value for frequency fa/2 in the frequency domain signal or 0 as the amplitude value of frequency fa in the interpolation region.

For example, the even harmonic generation unit 37 interpolates the amplitude information of frequencies above 24 kHz and up to 48 kHz based on the amplitude information of frequencies above 12 kHz and up to 24 kHz contained in the frequency domain signal of the original sound for each unit time as shown in FIG. 7(A). For each frequency fa in the interpolation region 24 kHz < fa ≤ 48 kHz, the even harmonic generation unit 37 adds either the amplitude information of frequency fa/2 or 0 as the amplitude information of fa, as shown in FIG. 7(B). It adds 0 when the determination unit 36 finds that the time domain signal of the original sound, filtered to pass frequency fa/2, satisfies the predetermined condition. The spacing between the frequencies fa in the even harmonic generation unit 37 may be the same as in the odd harmonic generation unit 33, for example 0.3 kHz, or may differ.

The second gain multiplication unit 38 multiplies the amplitude information of each frequency fa in the frequency domain signal data of FIG. 7(B), output from the even harmonic generation unit 37, by a second gain of 1/2, and generates a second interpolation frequency domain signal as shown in FIG. 7(C) in which the product is the amplitude information Ae of each frequency fa. The "second interpolation signal generation unit" of the present invention can be formed by the even harmonic generation unit 37 and the second gain multiplication unit 38.

The second reconversion unit 39 converts the second interpolation frequency domain signal output from the second gain multiplication unit 38 into a time domain signal, for example by an inverse Fourier transform, and outputs it to the combining unit 40. If phase information is needed for a frequency fa of the interpolation region during this conversion, the phase information for frequency fa/2 is used.
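
The reuse of the source frequency's phase during reconversion can be illustrated with the sketch below. It substitutes a direct sum of cosines for the inverse Fourier transform mentioned in the text, which is an assumption made to keep the example short and dependency-free. The function name `reconvert`, the dictionary inputs, and the default phase of 0 for a missing source frequency are likewise hypothetical.

```python
import math


def reconvert(interp_signal, source_phase, divisor, sample_rate_hz, num_samples):
    """Turn {fa: amplitude} into time-domain samples.

    The phase used for each fa is the phase stored for fa / divisor in
    source_phase ({frequency in Hz: phase in radians}); 0 is assumed when
    no phase is available.
    """
    out = [0.0] * num_samples
    for fa, amp in interp_signal.items():
        phase = source_phase.get(fa // divisor, 0.0)
        for k in range(num_samples):
            out[k] += amp * math.cos(2.0 * math.pi * fa * k / sample_rate_hz + phase)
    return out


# A 24 kHz component with amplitude 0.5 that borrows the phase stored for 12 kHz.
samples = reconvert({24000: 0.5}, {12000: 0.0}, 2, 96000, 4)
```

The same function applies to the first interpolation signal with `divisor=3`, where the phase is taken from fa/3.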

The combining unit 40 combines the time domain signal obtained from the first interpolation frequency domain signal by the first reconversion unit 35 with the time domain signal obtained from the second interpolation frequency domain signal by the second reconversion unit 39, and outputs the result to the combining unit 50.

As described above, the audio data processing device 1 can also include harmonics of sounds other than the fundamental tone, so a sound closer to the actual sound can be reproduced.

If the frequency domain signal of the original sound contains no amplitude information for frequency fa/3 or frequency fa/2, the amplitude information for the corresponding frequency fa of the interpolation region is 0.
As an alternative to having the even harmonic generation unit 37 add either the amplitude information of frequency fa/2 or 0 as the amplitude information of frequency fa in the high-frequency region depending on whether the predetermined condition is satisfied, the even harmonic generation unit 37 may add the amplitude information of frequency fa/2 as the amplitude information of frequency fa regardless of the condition, and, when the predetermined condition is satisfied, the second gain multiplication unit 38 may multiply the amplitude information of the corresponding frequency fa output from the even harmonic generation unit 37 by 0.

FIG. 8 is a diagram for explaining the determination process in the determination unit 36.
The determination unit 36 has a band-pass filter 36a. The band-pass filter 36a is configured so that it can selectively pass only one of the multiple frequencies fa/2 contained in the time domain signal of the original sound.

Using the band-pass filter 36a, the determination unit 36 applies, for each frequency fa/2, a filter that passes that single frequency fa/2, and determines whether the filtered time domain signal of the original sound satisfies the predetermined condition, that is, whether the time interval Tin from the maximum value to the minimum value of the filtered time domain signal is 4/fa. When filtering yields a sine wave as shown in FIG. 8(A), the time interval Tin is 4/fa (4 × 10⁻³ seconds if fa is 1 kHz), so the determination unit 36 determines that the predetermined condition is satisfied. When filtering yields a time domain signal that is not a sine wave as shown in FIG. 8(B), the time interval Tin is not 4/fa, so the determination unit 36 determines that the predetermined condition is not satisfied.
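
The interval test can be sketched as below, assuming the band-pass filtered signal is already available as a list of samples covering roughly one cycle. The function names, the tolerance parameter, and the use of the global maximum and minimum of the window are assumptions. The expected interval is passed in as a parameter (the text specifies 4/fa) rather than being hard-coded.

```python
import math


def max_to_min_interval(samples, sample_rate_hz):
    """Time in seconds between the largest and the smallest sample."""
    i_max = max(range(len(samples)), key=lambda i: samples[i])
    i_min = min(range(len(samples)), key=lambda i: samples[i])
    return abs(i_min - i_max) / sample_rate_hz


def meets_condition(samples, sample_rate_hz, expected_s, tolerance_s):
    """True when the measured interval Tin is within tolerance of expected_s."""
    measured = max_to_min_interval(samples, sample_rate_hz)
    return abs(measured - expected_s) <= tolerance_s


# Measure Tin on one cycle of a 1 kHz sine sampled at 48 kHz.
one_cycle = [math.sin(2.0 * math.pi * k / 48) for k in range(48)]
tin = max_to_min_interval(one_cycle, 48000)
```

A tolerance is needed in practice because the measured interval is quantized to the sample period.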

As described above, when interpolating the amplitude information of a frequency fa in the high-frequency region, the even harmonic generation unit 37 adds 0 as the amplitude information of frequency fa if the determination unit 36 finds that the time domain signal of the original sound, filtered to pass frequency fa/2, satisfies the predetermined condition, and adds the amplitude information of frequency fa/2 as the amplitude information of frequency fa if the condition is not satisfied.
In this way, audio closer to the actual sound can be reproduced.

(Second embodiment)
In the first embodiment, the determination unit 36 is provided, and the even harmonic generation unit 37 adds either the amplitude value for frequency fa/2 in the frequency domain signal or 0 as the amplitude value of frequency fa in the interpolation region. In this embodiment, by contrast, the determination unit 36 is not provided, and the even harmonic generation unit 37 always adds the amplitude value for frequency fa/2 in the frequency domain signal of the original sound as the amplitude value of frequency fa in the interpolation region.
This configuration also interpolates the high-frequency components and makes it possible to reproduce audio closer to the actual sound.

(Third embodiment)
In the first embodiment, the combining unit 40 combines the output of the first reconversion unit 35 and the output of the second reconversion unit 39 at a mixing ratio of 1:1. In the third embodiment, by contrast, the combining unit 40 can combine the output of the first reconversion unit 35 and the output of the second reconversion unit 39 at a different mixing ratio. If the mixing ratio is made variable according to user input, audio that suits the user's preference can be reproduced.
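
An adjustable mixing ratio can be sketched as a weighted sum of the two reconverted signals. The function name `mix` and the parameter `odd_ratio` (a hypothetical user setting between 0 and 1) are assumptions; the source does not specify how the ratio is represented.

```python
def mix(odd_signal, even_signal, odd_ratio=0.5):
    """Weighted sum of two equal-length time-domain signals.

    odd_ratio is the weight of the first (odd harmonic) signal; the second
    signal gets 1 - odd_ratio. 0.5 weights both signals equally.
    """
    if len(odd_signal) != len(even_signal):
        raise ValueError("signals must have the same length")
    if not 0.0 <= odd_ratio <= 1.0:
        raise ValueError("odd_ratio must be between 0 and 1")
    even_ratio = 1.0 - odd_ratio
    return [odd_ratio * a + even_ratio * b for a, b in zip(odd_signal, even_signal)]
```

Normalizing the two weights so that they sum to 1 keeps the overall level of the interpolated band roughly constant as the ratio changes, which is a design choice of this sketch rather than something stated in the source.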

(Fourth embodiment)
In the first embodiment, when interpolating the amplitude information of a frequency fa in the interpolation region, the odd harmonic generation unit 33 extracts the amplitude information for frequency fa/3 in the frequency domain signal of the original sound, and the even harmonic generation unit 37 extracts the amplitude information for frequency fa/2 in the frequency domain signal of the original sound.

それに対し、第４の実施形態では、補間領域の周波数ｆａの振幅情報の補間に際し、奇数倍高調波生成部３３が、原音の周波数領域信号内の周波数ｆａ／３に対する振幅情報の他に、周波数ｆａ／５に対する振幅情報等、周波数ｆａ／（２ｍ＋１）（ｍは２以上の正の整数）に対する振幅情報を抽出し、また、偶数倍高調波生成部３７が、原音の周波数領域信号内の周波数ｆａ／２に対する振幅情報の他に、周波数ｆａ／４に対する振幅情報等、周波数ｆａ／２ｍに対する振幅情報を抽出する。
この場合、第１ゲイン乗算部３４が周波数ｆａ／（２ｍ＋１）に対する振幅情報に乗ずる第１のゲインは１／（２ｍ＋１）であり、第２ゲイン乗算部３８が周波数ｆａ／２ｍに対する振幅情報に乗ずる第２のゲインは１／２ｍである。 In contrast, in the fourth embodiment, when interpolating the amplitude information of the frequency fa in the interpolation region, the odd harmonic generation unit 33 extracts, in addition to the amplitude information for the frequency fa / 3 in the frequency domain signal of the original sound, amplitude information for the frequency fa / (2m + 1) (m is a positive integer greater than or equal to 2), such as the amplitude information for the frequency fa / 5. Likewise, the even harmonic generation unit 37 extracts, in addition to the amplitude information for the frequency fa / 2 in the frequency domain signal of the original sound, amplitude information for the frequency fa / 2m, such as the amplitude information for the frequency fa / 4.
In this case, the first gain by which the first gain multiplication unit 34 multiplies the amplitude information for the frequency fa / (2m + 1) is 1 / (2m + 1), and the second gain by which the second gain multiplication unit 38 multiplies the amplitude information for the frequency fa / 2m is 1 / 2m.

このようにすることによって、さらに実際の音に違い音声を再生することができるようになる。 By doing so, a sound even closer to the actual sound can be reproduced.
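The extraction with several divisors can be sketched in Python as below. This is an illustration under stated assumptions, not the patented implementation: the function name `harmonic_fill` and the bin-indexed amplitude list are hypothetical, bins where fa / d does not land exactly on a bin are skipped, and the contributions from different divisors are summed, which the text does not specify.

```python
def harmonic_fill(amp, cutoff_bin, divisors):
    """Fill bins above cutoff_bin from lower bins of the original sound.

    amp        : amplitude per frequency bin (bin k stands for frequency k * df).
    cutoff_bin : highest bin still present in the original sound.
    divisors   : e.g. (3, 5) for fa/3 and fa/5, or (2, 4) for fa/2 and fa/4.
    For each divisor d, the amplitude at fa / d is multiplied by the gain 1 / d.
    Summing over divisors is an assumption of this sketch.
    """
    out = [0.0] * len(amp)
    for k in range(cutoff_bin + 1, len(amp)):
        for d in divisors:
            if k % d == 0:  # fa / d lands exactly on a bin
                out[k] += amp[k // d] / d
    return out


# Original sound with content only in bins 1..4 (values are made up).
original = [0.0, 12.0, 8.0, 6.0, 4.0] + [0.0] * 16

odd_part = harmonic_fill(original, cutoff_bin=4, divisors=(3, 5))
even_part = harmonic_fill(original, cutoff_bin=4, divisors=(2, 4))
```

With these made-up values, bin 8 of `even_part` receives a contribution from both fa / 2 (bin 4) and fa / 4 (bin 2), which shows how the additional divisors extend the first embodiment.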

また、本発明は、上述の第１〜第４のいずれか１の実施形態の音声データ処理装置を備えた電子機器とすることもできる。電子機器とは、例えば、テレビジョン受信機や各種オーディオ機器である。 In addition, the present invention may be an electronic device including the audio data processing device according to any one of the first to fourth embodiments described above. The electronic device is, for example, a television receiver or various audio devices.

１…音声データ処理装置、１０…復号部、２０…アップサンプリング部、３０…高周波補間部、３１…帯域検出部、３２…周波数領域変換部、３３…奇数倍高調波生成部、３４…第１ゲイン乗算部、３５…第１再変換部、３６…判定部、３６ａ…バンドパスフィルタ、３７…偶数倍高調波生成部、３８…第２ゲイン乗算部、３９…第２再変換部、４０…合成部、５０…合成部。
DESCRIPTION OF SYMBOLS 1 ... Audio data processing device, 10 ... Decoding unit, 20 ... Upsampling unit, 30 ... High-frequency interpolation unit, 31 ... Band detection unit, 32 ... Frequency domain conversion unit, 33 ... Odd harmonic generation unit, 34 ... First gain multiplication unit, 35 ... First reconversion unit, 36 ... Determination unit, 36a ... Bandpass filter, 37 ... Even harmonic generation unit, 38 ... Second gain multiplication unit, 39 ... Second reconversion unit, 40 ... Synthesis unit, 50 ... Synthesis unit.

Claims

An audio data processing device comprising a high-frequency interpolation unit that interpolates, into a time domain signal of an original sound decoded based on audio data, frequency components of a high-frequency region that were cut from the original sound, wherein
the high-frequency interpolation unit includes:
a conversion unit that converts the decoded time domain signal of the original sound into a frequency domain signal;
a first interpolation signal generation unit that, when interpolating a frequency fa component of the cut high-frequency region, extracts amplitude information for the frequency fa / (2n + 1) (n is a positive integer) in the frequency domain signal of the original sound and generates a first interpolation frequency domain signal in which the extracted amplitude information multiplied by 1 / (2n + 1) is used as amplitude information Aa of the frequency fa;
a second interpolation signal generation unit that extracts amplitude information for the frequency fa / 2n in the frequency domain signal of the original sound and generates a second interpolation frequency domain signal in which the extracted amplitude information multiplied by 1 / 2n is used as amplitude information Ae of the frequency fa; and
a synthesis unit that synthesizes a signal obtained by converting the first interpolation frequency domain signal into a time domain signal and a signal obtained by converting the second interpolation frequency domain signal into a time domain signal.

The audio data processing device according to claim 1, wherein the high-frequency interpolation unit includes a determination unit that applies a filter passing the frequency fa / 2n to the time domain signal of the original sound and determines whether or not the time interval Tin from the maximum value to the minimum value of the filtered time domain signal is 4n / fa, and
the second interpolation signal generation unit sets the amplitude information Ae of the frequency fa to 0 when the time interval Tin of the original sound filtered to pass the frequency fa / 2n is 4n / fa.

The audio data processing device according to claim 1, wherein the synthesis unit is capable of adjusting the mixing ratio between the signal obtained by converting the first interpolation frequency domain signal into a time domain signal and the signal obtained by converting the second interpolation frequency domain signal into a time domain signal when synthesizing them.

An electronic apparatus comprising the audio data processing device according to claim 1.

The electronic apparatus according to claim 4, wherein the electronic apparatus is a television receiver.