JP2004053891A5 - - Google Patents
- Publication number
- JP2004053891A5 (application JP2002210899A)
- Authority
- JP
- Japan
- Prior art keywords
- data
- tap
- learning
- prediction
- class
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Claims (25)
1. A data processing device for converting first data, obtained by converting time-domain audio data into frequency-domain audio data, into second data, comprising:
acquisition means for acquiring tap coefficients determined by learning; and
prediction calculation means for converting the first data into the second data by performing a predetermined prediction calculation using the tap coefficients and the first data.
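In classification-adaptive processing of the kind claimed here, the "predetermined prediction calculation" is typically a linear first-order combination of the prediction taps weighted by the learned tap coefficients. A minimal sketch under that assumption (the function name and the coefficient/tap values below are illustrative, not from the patent):

```python
def predict(tap_coefficients, prediction_taps):
    """Predetermined prediction calculation: a weighted sum of the
    prediction taps, with weights (tap coefficients) found by learning."""
    assert len(tap_coefficients) == len(prediction_taps)
    return sum(w * x for w, x in zip(tap_coefficients, prediction_taps))

# illustrative values only
coeffs = [0.25, 0.5, 0.25]    # tap coefficients acquired beforehand
taps = [1.0, 2.0, 3.0]        # prediction taps extracted from the first data
print(predict(coeffs, taps))  # 0.25*1.0 + 0.5*2.0 + 0.25*3.0 = 2.0
```

Each output sample of the second data is produced by one such weighted sum over taps drawn from the first data.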
2. The data processing device according to claim 1, further comprising:
class tap generation means for generating, from the first data, a class tap used for the class classification of attention data, the attention data being the data of interest within the second data;
class classification means for classifying the attention data based on the class tap; and
prediction tap generation means for generating, from the first data, a prediction tap used in the prediction calculation that obtains the attention data,
wherein the acquisition means acquires the tap coefficients of the class of the attention data determined by the class classification, and
the prediction calculation means obtains the attention data by performing the prediction calculation using the tap coefficients of that class and the prediction tap.
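A common way to realize the class classification of claim 2 is 1-bit ADRC (Adaptive Dynamic Range Coding) over the class tap: each tap value is thresholded at the midpoint of the tap's dynamic range, and the resulting bit pattern is the class code. The claim does not fix a particular classification scheme, so this is only a plausible sketch:

```python
def adrc_class(class_tap):
    """1-bit ADRC: threshold each value in the class tap at the midpoint
    of the tap's dynamic range; the bit pattern is the class code."""
    lo, hi = min(class_tap), max(class_tap)
    if hi == lo:             # flat tap: a single degenerate class
        return 0
    mid = (lo + hi) / 2.0
    code = 0
    for x in class_tap:
        code = (code << 1) | (x >= mid)   # True/False -> bit 1/0
    return code

print(adrc_class([1.0, 5.0, 3.0]))  # bits 0,1,1 -> class code 3
```

The class code then selects which set of tap coefficients the acquisition means hands to the prediction calculation.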
3. The data processing device according to claim 1, wherein the second data is time-domain audio data.
4. The data processing device according to claim 1, wherein the second data is frequency-domain audio data.
5. The data processing device according to claim 4, wherein the first data is audio data for each of a plurality of frequency bands, and
the prediction calculation means converts the first data into the second data, which is audio data of a frequency band missing from the first data.
6. The data processing device according to claim 5, wherein, in the first data, the bit allocation of some frequency bands may be set to zero by a perceptual or psychoacoustic coding technique, and
the prediction calculation means treats a frequency band whose bit allocation is zero in the first data as a frequency band missing from the first data, and obtains the second data, which is audio data of that frequency band.
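Claims 5 and 6 target frequency bands absent from the first data, specifically bands whose bit allocation the encoder set to zero. Identifying those target bands is straightforward; the per-band bit-allocation list here is a hypothetical illustration:

```python
def missing_bands(bit_allocation):
    """Bands with zero bit allocation are treated as missing from the
    first data and become the targets of the prediction calculation."""
    return [band for band, bits in enumerate(bit_allocation) if bits == 0]

print(missing_bands([4, 0, 3, 0, 2]))  # bands 1 and 3 are missing -> [1, 3]
```

The prediction calculation is then applied only to the bands this returns, reconstructing their audio data from the bands that were transmitted.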
7. The data processing device according to claim 4, wherein the prediction calculation means converts the first data into the second data, which is audio data for each of a plurality of frequency bands.
8. The data processing device according to claim 7, further comprising synthesis means for synthesizing the audio data of each of the plurality of frequency bands constituting the second data to obtain time-domain audio data.
9. The data processing device according to claim 7, further comprising:
class tap generation means for generating, from the first data, a class tap used for the class classification of attention data, the attention data being the data of interest within the audio data of a frequency band of interest in the second data;
class classification means for classifying the attention data based on the class tap; and
prediction tap generation means for generating, from the first data, a prediction tap used in the prediction calculation that obtains the attention data,
wherein the acquisition means acquires the tap coefficients of the class of the attention data determined by the class classification, and
the prediction calculation means obtains the attention data by performing the prediction calculation using the tap coefficients of that class and the prediction tap.
10. The data processing device according to claim 1, wherein the second data is time-domain or frequency-domain audio data, the device further comprising conversion means for converting the second data into time-domain audio data when the second data is frequency-domain audio data.
11. The data processing device according to claim 10, further comprising correction means for correcting the time-domain audio data that is the second data, or the time-domain audio data obtained by converting the frequency-domain audio data that is the second data.
12. A data processing method for converting first data, obtained by converting time-domain audio data into frequency-domain audio data, into second data, comprising:
an acquisition step of acquiring tap coefficients determined by learning; and
a prediction calculation step of converting the first data into the second data by performing a predetermined prediction calculation using the tap coefficients and the first data.
13. A program for causing a computer to perform data processing that converts first data, obtained by converting time-domain audio data into frequency-domain audio data, into second data, the program comprising:
an acquisition step of acquiring tap coefficients determined by learning; and
a prediction calculation step of converting the first data into the second data by performing a predetermined prediction calculation using the tap coefficients and the first data.
14. A recording medium on which is recorded a program for causing a computer to perform data processing that converts first data, obtained by converting time-domain audio data into frequency-domain audio data, into second data, the program comprising:
an acquisition step of acquiring tap coefficients determined by learning; and
a prediction calculation step of converting the first data into the second data by performing a predetermined prediction calculation using the tap coefficients and the first data.
15. A data processing device for learning tap coefficients used to convert first data, obtained by converting time-domain audio data into frequency-domain audio data, into second data, comprising:
prediction tap generation means for generating, from student data corresponding to the first data and serving as the student of the learning, a prediction tap used to obtain attention data, the attention data being the data of interest within teacher data corresponding to the second data and serving as the teacher of the learning of the tap coefficients; and
learning means for obtaining the tap coefficients by learning the relationship between the teacher data and the student data using the attention data and the prediction tap.
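The learning of claim 15 pairs teacher data (corresponding to the second data) with student data (corresponding to the first data) and fits tap coefficients relating them. Under the usual linear-prediction model this reduces to least squares, i.e. solving the normal equations (XᵀX)w = Xᵀy accumulated over all teacher/student sample pairs. A self-contained sketch under that assumption (the per-class accumulation of claim 16 is omitted for brevity; the example data are synthetic):

```python
def learn_tap_coefficients(student_taps, teacher_values):
    """Fit tap coefficients w minimizing sum of (y - w . taps)^2
    by accumulating and solving the normal equations (X^T X) w = X^T y."""
    n = len(student_taps[0])
    A = [[0.0] * n for _ in range(n)]   # X^T X
    b = [0.0] * n                        # X^T y
    for taps, y in zip(student_taps, teacher_values):
        for i in range(n):
            b[i] += taps[i] * y
            for j in range(n):
                A[i][j] += taps[i] * taps[j]
    # solve A w = b by Gaussian elimination with partial pivoting
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = b[i] - sum(A[i][j] * w[j] for j in range(i + 1, n))
        w[i] = s / A[i][i]
    return w

# student prediction taps and the teacher samples they should predict,
# generated here from known weights [0.3, 0.7] so the fit is exact
students = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
teachers = [0.3, 0.7, 1.0, 1.3]
print(learn_tap_coefficients(students, teachers))  # recovers ~[0.3, 0.7]
```

At conversion time the device of claim 1 applies the learned w to prediction taps drawn from real first data.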
16. The data processing device according to claim 15, further comprising:
class tap generation means for generating, from the student data, a class tap used for the class classification of the attention data; and
class classification means for classifying the attention data based on the class tap,
wherein the learning means obtains the tap coefficients for each class produced by the class classification.
17. The data processing device according to claim 15, wherein the second data is time-domain audio data.
18. The data processing device according to claim 15, wherein the second data is frequency-domain audio data.
19. The data processing device according to claim 18, wherein the first data is audio data for each of a plurality of frequency bands, and
the learning means obtains the tap coefficients used to convert the first data into the second data, which is audio data of a frequency band missing from the first data.
20. The data processing device according to claim 19, wherein, in the first data, the bit allocation of some frequency bands may be set to zero by a perceptual or psychoacoustic coding technique, and
the learning means obtains the tap coefficients used to treat a frequency band whose bit allocation is zero in the first data as a frequency band missing from the first data and to obtain the second data, which is audio data of that frequency band.
21. The data processing device according to claim 18, wherein the learning means obtains the tap coefficients used to convert the first data into the second data, which is audio data for each of a plurality of frequency bands.
22. The data processing device according to claim 21, wherein the prediction tap generation means generates the prediction tap for attention data that is the data of interest within the audio data of a frequency band of interest in the teacher data, the device further comprising:
class tap generation means for generating, from the student data, a class tap used for the class classification of the attention data; and
class classification means for classifying the attention data based on the class tap,
wherein the learning means obtains the tap coefficients for each class produced by the class classification.
23. A data processing method for learning tap coefficients used to convert first data, obtained by converting time-domain audio data into frequency-domain audio data, into second data, comprising:
a prediction tap generation step of generating, from student data corresponding to the first data and serving as the student of the learning, a prediction tap used to obtain attention data, the attention data being the data of interest within teacher data corresponding to the second data and serving as the teacher of the learning of the tap coefficients; and
a learning step of obtaining the tap coefficients by learning the relationship between the teacher data and the student data using the attention data and the prediction tap.
24. A program for causing a computer to perform data processing that learns tap coefficients used to convert first data, obtained by converting time-domain audio data into frequency-domain audio data, into second data, the program comprising:
a prediction tap generation step of generating, from student data corresponding to the first data and serving as the student of the learning, a prediction tap used to obtain attention data, the attention data being the data of interest within teacher data corresponding to the second data and serving as the teacher of the learning of the tap coefficients; and
a learning step of obtaining the tap coefficients by learning the relationship between the teacher data and the student data using the attention data and the prediction tap.
25. A recording medium on which is recorded a program for causing a computer to perform data processing that learns tap coefficients used to convert first data, obtained by converting time-domain audio data into frequency-domain audio data, into second data, the program comprising:
a prediction tap generation step of generating, from student data corresponding to the first data and serving as the student of the learning, a prediction tap used to obtain attention data, the attention data being the data of interest within teacher data corresponding to the second data and serving as the teacher of the learning of the tap coefficients; and
a learning step of obtaining the tap coefficients by learning the relationship between the teacher data and the student data using the attention data and the prediction tap.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2002210899A JP4218271B2 (en) | 2002-07-19 | 2002-07-19 | Data processing apparatus, data processing method, program, and recording medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2002210899A JP4218271B2 (en) | 2002-07-19 | 2002-07-19 | Data processing apparatus, data processing method, program, and recording medium |
Publications (3)
Publication Number | Publication Date |
---|---|
JP2004053891A JP2004053891A (en) | 2004-02-19 |
JP2004053891A5 true JP2004053891A5 (en) | 2005-10-20 |
JP4218271B2 JP4218271B2 (en) | 2009-02-04 |
Family
ID=31934276
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP2002210899A Expired - Fee Related JP4218271B2 (en) | 2002-07-19 | 2002-07-19 | Data processing apparatus, data processing method, program, and recording medium |
Country Status (1)
Country | Link |
---|---|
JP (1) | JP4218271B2 (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4627737B2 (en) * | 2006-03-08 | 2011-02-09 | シャープ株式会社 | Digital data decoding device |
JP4649351B2 (en) * | 2006-03-09 | 2011-03-09 | シャープ株式会社 | Digital data decoding device |
EP4407610A1 (en) * | 2008-07-11 | 2024-07-31 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio encoder, audio decoder, methods for encoding and decoding an audio signal, audio stream and computer program |
EP4372602A3 (en) | 2013-01-08 | 2024-07-10 | Dolby International AB | Model based prediction in a critically sampled filterbank |
- 2002-07-19: JP JP2002210899A patent granted as JP4218271B2 (status: Expired - Fee Related)
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6903611B2 | Signal generation device, signal generation system, signal generation method, and program | |
CN1272911C (en) | Audio signal decoding device and audio signal encoding device | |
CN101183527B (en) | Method and apparatus for encoding and decoding high frequency signal | |
CN1279512C (en) | Methods for improving high frequency reconstruction | |
CN109147805B (en) | Audio tone enhancement based on deep learning | |
CN1265217A (en) | Method and appts. for speech enhancement in speech communication system | |
JP2001100773A5 (en) | ||
JP2005157354A (en) | Method and apparatus for multi-sensory speech enhancement | |
JP5651980B2 (en) | Decoding device, decoding method, and program | |
JPWO2007088853A1 (en) | Speech coding apparatus, speech decoding apparatus, speech coding system, speech coding method, and speech decoding method | |
CN104170009A (en) | Phase coherence control for harmonic signals in perceptual audio codecs | |
Borsos et al. | Speechpainter: Text-conditioned speech inpainting | |
CN115171709A (en) | Voice coding method, voice decoding method, voice coding device, voice decoding device, computer equipment and storage medium | |
JP2003108197A (en) | Audio signal decoding device and audio signal encoding device | |
CN1647186A (en) | Time domain watermarking of multimedia signals | |
JP2012523579A (en) | Method and apparatus for forming mixed signals, method and apparatus for separating signals, and corresponding signals | |
Nematollahi et al. | Digital speech watermarking based on linear predictive analysis and singular value decomposition | |
JP2004053891A5 (en) | ||
CN115588434A (en) | Method for directly synthesizing voice from tongue ultrasonic image | |
Fisher et al. | WaveMedic: convolutional neural networks for speech audio enhancement | |
JP2007334261A (en) | Signal processing method, signal processing device, and program | |
JPH03233500A (en) | Voice synthesis system and device used for same | |
JP6129321B2 (en) | Method and apparatus for separating signals by minimum variance spatial filtering under linear constraint conditions | |
WO2021172053A1 (en) | Signal processing device and method, and program | |
Park et al. | Artificial stereo extension based on Gaussian mixture model |