JP2004053891A5

Info

Publication number: JP2004053891A5
Application number: JP2002210899A
Authority: JP
Filing date: 2002-07-19
Publication date: 2005-10-20
Anticipated expiration: 2022-07-19

Claims

A data processing apparatus for converting first data, obtained by converting time-domain audio data into frequency-domain audio data, into second data, the apparatus comprising:
acquisition means for acquiring a tap coefficient obtained by learning; and
prediction calculation means for converting the first data into the second data by performing a predetermined prediction calculation using the tap coefficient and the first data.
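The claim leaves the "predetermined prediction calculation" unspecified. As a minimal sketch, assuming a linear first-order prediction (a weighted sum of first-data samples), the calculation might look like the following. The function name `predict` and all numeric values are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def predict(tap_coefficients: Sequence[float], first_data_tap: Sequence[float]) -> float:
    """Assumed linear first-order prediction: y = sum over n of w_n * x_n."""
    if len(tap_coefficients) != len(first_data_tap):
        raise ValueError("coefficient and tap lengths must match")
    return sum(w * x for w, x in zip(tap_coefficients, first_data_tap))


# Hypothetical inputs: three frequency-domain samples taken from the first data
# and three tap coefficients that are assumed to have been learned beforehand.
first_data_tap = [0.5, -1.0, 2.0]
learned_coefficients = [0.25, 0.5, 0.125]

# One sample of the second data, computed from the first data and the coefficients.
second_data_sample = predict(learned_coefficients, first_data_tap)
```

Other prediction forms (for example higher-order ones) would fit the claim wording equally well; the linear form is used here only because it is the simplest to show.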

The data processing apparatus according to claim 1, further comprising:
class tap generation means for generating, from the first data, a class tap used for class classification of attention data, the attention data being the data of interest among the second data;
class classification means for classifying the attention data based on the class tap; and
prediction tap generation means for generating, from the first data, a prediction tap used in the prediction calculation for obtaining the attention data,
wherein the acquisition means acquires the tap coefficient of the class of the attention data obtained by the class classification, and
the prediction calculation means obtains the attention data by performing the prediction calculation using the tap coefficient of the class of the attention data and the prediction tap.
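The class-based flow described above (build a class tap, classify, look up the coefficients for that class, then predict) can be sketched as follows. The claim does not say how the classification is done, so this sketch assumes a 1-bit quantization of each class-tap element against the midpoint of the tap's minimum and maximum. The tap shape (a three-sample neighborhood), the function names, and the coefficient table are all hypothetical.

```python
from typing import Dict, List, Sequence


def classify(class_tap: Sequence[float]) -> int:
    """Assumed classifier: 1 bit per element (1 if at or above the min/max midpoint)."""
    midpoint = (min(class_tap) + max(class_tap)) / 2.0
    code = 0
    for x in class_tap:
        code = (code << 1) | (1 if x >= midpoint else 0)
    return code


def neighborhood(first_data: Sequence[float], index: int, radius: int = 1) -> List[float]:
    """Hypothetical tap shape: samples around `index`, clamped at the edges."""
    last = len(first_data) - 1
    return [first_data[min(max(i, 0), last)] for i in range(index - radius, index + radius + 1)]


def predict_attention(first_data: Sequence[float], index: int,
                      coefficient_table: Dict[int, List[float]]) -> float:
    """Classify the attention sample, fetch its class's coefficients, and predict."""
    class_tap = neighborhood(first_data, index)       # class tap from the first data
    prediction_tap = neighborhood(first_data, index)  # same shape here, by assumption
    coefficients = coefficient_table[classify(class_tap)]
    return sum(w * x for w, x in zip(coefficients, prediction_tap))


# Hypothetical coefficient table: pass-through for every class except class 3.
coefficient_table = {c: [0.0, 1.0, 0.0] for c in range(8)}
coefficient_table[3] = [0.5, 0.25, 0.25]

first_data = [1.0, 3.0, 2.0, 0.0]
attention_value = predict_attention(first_data, 1, coefficient_table)
```

In this sketch the class tap and prediction tap happen to share the same samples; the claim allows them to be generated differently.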

The data processing apparatus according to claim 1, wherein the second data is time-domain audio data.

The data processing apparatus according to claim 1, wherein the second data is audio data in a frequency domain.

The data processing apparatus according to claim 4, wherein the first data is audio data for each of a plurality of frequency bands, and
the prediction calculation means converts the first data into the second data, the second data being audio data in a frequency band that is missing from the first data.

The data processing apparatus according to claim 5, wherein, in the first data, the bit allocation for some frequency bands may be set to 0 by a perceptual coding technique or a psychoacoustic coding technique, and
the prediction calculation means treats a frequency band whose bit allocation is set to 0 in the first data as a frequency band missing from the first data, and obtains the second data, which is audio data in that frequency band.
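To illustrate how a band with zero bit allocation can be treated as missing and then filled in, here is a minimal sketch. The data layout (one list of samples per band plus a parallel list of bit allocations) and the stand-in predictor `scaled_lower_band` are assumptions made for this example; in the claimed apparatus the fill-in values would come from the learned prediction calculation instead.

```python
from typing import Callable, List, Sequence

Bands = List[List[float]]


def missing_bands(bit_allocation: Sequence[int]) -> List[int]:
    """Indices of bands whose bit allocation is 0, treated as missing."""
    return [band for band, bits in enumerate(bit_allocation) if bits == 0]


def restore_missing(bands: Bands, bit_allocation: Sequence[int],
                    predictor: Callable[[Bands, int], List[float]]) -> Bands:
    """Replace each missing band with the predictor's output; keep the others."""
    restored = [list(samples) for samples in bands]
    for band in missing_bands(bit_allocation):
        restored[band] = predictor(bands, band)
    return restored


def scaled_lower_band(bands: Bands, band: int) -> List[float]:
    """Hypothetical stand-in predictor: half the adjacent band (needs >= 2 bands)."""
    source = bands[band - 1] if band > 0 else bands[band + 1]
    return [0.5 * x for x in source]


# Hypothetical first data: three bands, the middle one coded with 0 bits.
bands = [[1.0, 2.0], [0.0, 0.0], [4.0, -2.0]]
bit_allocation = [4, 0, 3]
restored = restore_missing(bands, bit_allocation, scaled_lower_band)
```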

The data processing apparatus according to claim 4, wherein the prediction calculation means converts the first data into the second data that is audio data for each of a plurality of frequency bands.

The data processing apparatus according to claim 7, further comprising synthesizing means for synthesizing the audio data for each of the plurality of frequency bands, obtained as the second data, into time-domain audio data.

The data processing apparatus according to claim 7, further comprising:
class tap generation means for generating, from the first data, a class tap used for class classification of attention data, the attention data being the data of interest among the audio data of a frequency band of interest in the second data;
class classification means for classifying the attention data based on the class tap; and
prediction tap generation means for generating, from the first data, a prediction tap used in the prediction calculation for obtaining the attention data,
wherein the acquisition means acquires the tap coefficient of the class of the attention data obtained by the class classification, and
the prediction calculation means obtains the attention data by performing the prediction calculation using the tap coefficient of the class of the attention data and the prediction tap.

The data processing apparatus according to claim 1, wherein the second data is time-domain or frequency-domain audio data, and
the apparatus further comprises conversion means for converting the frequency-domain audio data into time-domain audio data when the second data is frequency-domain audio data.
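The conditional conversion in this claim can be sketched as below. The claim does not name the transform, so a naive inverse discrete Fourier transform is used purely as a stand-in; an actual codec would more likely use whatever inverse transform matches its encoder. The function names and the `is_frequency_domain` flag are hypothetical.

```python
import cmath
from typing import List, Sequence


def inverse_dft(spectrum: Sequence[complex]) -> List[float]:
    """Naive O(N^2) inverse DFT, returning the real part of each time sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def to_time_domain(second_data: Sequence[complex], is_frequency_domain: bool) -> List[float]:
    """Convert only when the second data is in the frequency domain."""
    if is_frequency_domain:
        return inverse_dft(second_data)
    # Already time-domain: pass the samples through unchanged.
    return [complex(x).real for x in second_data]


# Hypothetical second data: a spectrum containing only a DC component.
time_samples = to_time_domain([4, 0, 0, 0], is_frequency_domain=True)
```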

The data processing apparatus according to claim 10, further comprising correction means for correcting the time-domain audio data that is the second data, or the time-domain audio data obtained by converting the frequency-domain audio data that is the second data.

A data processing method for converting first data, obtained by converting time-domain audio data into frequency-domain audio data, into second data, the method comprising:
an acquisition step of acquiring a tap coefficient obtained by learning; and
a prediction calculation step of converting the first data into the second data by performing a predetermined prediction calculation using the tap coefficient and the first data.

A program for causing a computer to perform data processing for converting first data, obtained by converting time-domain audio data into frequency-domain audio data, into second data, the program comprising:
an acquisition step of acquiring a tap coefficient obtained by learning; and
a prediction calculation step of converting the first data into the second data by performing a predetermined prediction calculation using the tap coefficient and the first data.

A recording medium on which is recorded a program for causing a computer to perform data processing for converting first data, obtained by converting time-domain audio data into frequency-domain audio data, into second data, the program comprising:
an acquisition step of acquiring a tap coefficient obtained by learning; and
a prediction calculation step of converting the first data into the second data by performing a predetermined prediction calculation using the tap coefficient and the first data.

A data processing apparatus for learning a tap coefficient used to convert first data, obtained by converting time-domain audio data into frequency-domain audio data, into second data, the apparatus comprising:
prediction tap generation means for generating, from student data that corresponds to the first data and serves as a student for learning the tap coefficient, a prediction tap used to obtain attention data, the attention data being the data of interest among teacher data that corresponds to the second data and serves as a teacher for learning the tap coefficient; and
learning means for obtaining the tap coefficient by learning the relationship between the teacher data and the student data using the attention data and the prediction tap.
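The claim does not state how the learning is carried out. One common way to learn the relationship between teacher data and student data, shown here only as an assumed illustration, is a least-squares fit: accumulate the normal equations over all pairs of prediction tap and attention sample, then solve for the tap coefficients. All function names and the training values are hypothetical.

```python
from typing import List, Sequence


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a * w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular normal equations; more training data needed")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        partial = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - partial) / m[i][i]
    return w


def learn_tap_coefficients(prediction_taps: Sequence[Sequence[float]],
                           attention_data: Sequence[float]) -> List[float]:
    """Least-squares fit: accumulate sum(x x^T) and sum(x y), then solve."""
    n = len(prediction_taps[0])
    a = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for x, y in zip(prediction_taps, attention_data):
        for i in range(n):
            b[i] += x[i] * y
            for j in range(n):
                a[i][j] += x[i] * x[j]
    return solve(a, b)


# Hypothetical training set: teacher samples generated as y = 2*x0 - 1*x1.
student_taps = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
teacher_samples = [2.0, -1.0, 1.0, 3.0]
coefficients = learn_tap_coefficients(student_taps, teacher_samples)
```

With noise-free synthetic data like this, the fit should recover coefficients close to 2 and -1; real teacher and student audio data would give only an approximation.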

The data processing apparatus according to claim 15, further comprising:
class tap generation means for generating, from the student data, a class tap used for class classification of the attention data; and
class classification means for classifying the attention data based on the class tap,
wherein the learning means obtains the tap coefficient for each class obtained by the class classification performed by the class classification means.
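Per-class learning can be sketched by keeping separate accumulators for each class. To keep the example short, it assumes a one-element prediction tap, so each class's least-squares coefficient reduces to sum(x*y) / sum(x*x), and a hypothetical two-class classifier (rising versus falling student samples). None of these choices come from the claims.

```python
from collections import defaultdict
from typing import Dict, Sequence, Tuple


def classify(class_tap: Tuple[float, float]) -> int:
    """Hypothetical classifier: 1 if the student signal is rising, else 0."""
    previous, current = class_tap
    return 1 if current >= previous else 0


def learn_per_class(student: Sequence[float], teacher: Sequence[float]) -> Dict[int, float]:
    """Accumulate per-class sums and return one coefficient per class."""
    sum_xy: Dict[int, float] = defaultdict(float)
    sum_xx: Dict[int, float] = defaultdict(float)
    for i in range(1, len(teacher)):
        cls = classify((student[i - 1], student[i]))  # class tap from student data
        x, y = student[i], teacher[i]                 # prediction tap and attention data
        sum_xy[cls] += x * y
        sum_xx[cls] += x * x
    # Skip classes whose accumulator is zero to avoid dividing by zero.
    return {cls: sum_xy[cls] / sum_xx[cls] for cls in sum_xx if sum_xx[cls] > 0.0}


# Hypothetical data: teacher doubles rising samples and halves falling ones.
student = [1.0, 2.0, 1.0, 3.0, 2.0]
teacher = [2.0, 4.0, 0.5, 6.0, 1.0]
class_coefficients = learn_per_class(student, teacher)
```

A single global coefficient could not fit both behaviors in this data, which is the motivation for learning a separate coefficient per class.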

The data processing apparatus according to claim 15, wherein the second data is time-domain audio data.

The data processing apparatus according to claim 15, wherein the second data is frequency-domain audio data.

The data processing apparatus according to claim 18, wherein the first data is audio data for each of a plurality of frequency bands, and
the learning means obtains the tap coefficient used to convert the first data into the second data, the second data being audio data in a frequency band that is missing from the first data.

The data processing apparatus according to claim 19, wherein, in the first data, the bit allocation for some frequency bands may be set to 0 by a perceptual coding technique or a psychoacoustic coding technique, and
the learning means treats a frequency band whose bit allocation is set to 0 in the first data as a frequency band missing from the first data, and obtains the tap coefficient used to obtain the second data, which is audio data in that frequency band.

The data processing apparatus according to claim 18, wherein the learning means obtains the tap coefficient used to convert the first data into the second data, the second data being audio data for each of a plurality of frequency bands.

The data processing apparatus according to claim 21, wherein the prediction tap generation means generates the prediction tap for attention data that is the data of interest among the audio data of a frequency band of interest in the teacher data,
the apparatus further comprises:
class tap generation means for generating, from the student data, a class tap used for class classification of the attention data; and
class classification means for classifying the attention data based on the class tap, and
the learning means obtains the tap coefficient for each class obtained by the class classification performed by the class classification means.

A data processing method for learning a tap coefficient used to convert first data, obtained by converting time-domain audio data into frequency-domain audio data, into second data, the method comprising:
a prediction tap generation step of generating, from student data that corresponds to the first data and serves as a student for learning the tap coefficient, a prediction tap used to obtain attention data, the attention data being the data of interest among teacher data that corresponds to the second data and serves as a teacher for learning the tap coefficient; and
a learning step of obtaining the tap coefficient by learning the relationship between the teacher data and the student data using the attention data and the prediction tap.

A program for causing a computer to perform data processing for learning a tap coefficient used to convert first data, obtained by converting time-domain audio data into frequency-domain audio data, into second data, the program comprising:
a prediction tap generation step of generating, from student data that corresponds to the first data and serves as a student for learning the tap coefficient, a prediction tap used to obtain attention data, the attention data being the data of interest among teacher data that corresponds to the second data and serves as a teacher for learning the tap coefficient; and
a learning step of obtaining the tap coefficient by learning the relationship between the teacher data and the student data using the attention data and the prediction tap.

A recording medium on which is recorded a program for causing a computer to perform data processing for learning a tap coefficient used to convert first data, obtained by converting time-domain audio data into frequency-domain audio data, into second data, the program comprising:
a prediction tap generation step of generating, from student data that corresponds to the first data and serves as a student for learning the tap coefficient, a prediction tap used to obtain attention data, the attention data being the data of interest among teacher data that corresponds to the second data and serves as a teacher for learning the tap coefficient; and
a learning step of obtaining the tap coefficient by learning the relationship between the teacher data and the student data using the attention data and the prediction tap.