CN102270210A - MP3 audio attribute discretization method based on heterogeneity rule - Google Patents

Info

Publication number
CN102270210A
Authority
CN
China
Prior art keywords
audio
attribute
heterogeneous
breakpoint
discretize
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2010106122593A
Other languages
Chinese (zh)
Inventor
余小清
刘军伟
万旺根
张静
杨薇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Shanghai for Science and Technology
Original Assignee
University of Shanghai for Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Shanghai for Science and Technology
Priority to CN2010106122593A
Publication of CN102270210A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to an MP3 audio attribute discretization method based on a heterogeneity rule. The method discretizes MP3 audio directly in the compressed domain. First, the MP3 audio is preprocessed and the MDCT (Modified Discrete Cosine Transform) spectral coefficients of each audio frame are obtained. The main audio features are then extracted in the MDCT domain: sideband energy ratio (BER), root mean square (RMS), spectral centroid (SC), and 12-dimensional Mel-frequency cepstral coefficients (MFCC). These serve as the attribute set of the training samples, giving a 15-dimensional feature attribute input set. Finally, the discretization result is obtained with the discretization method based on the heterogeneity rule. Experimental results show that the method facilitates subsequent optimization of compressed-domain audio attribute features and lays a foundation for a practical, fast audio multi-classification and retrieval system.

Description

An MP3 audio attribute discretization method based on a heterogeneity criterion
Technical field
The present invention relates to an MP3 audio attribute discretization method based on a heterogeneity criterion. It applies heterogeneity-criterion discretization to MP3 audio attribute features and aims to reduce the final set of discrete points while preserving accuracy.
Background
Attribute discretization first divides the values of a continuous attribute in a data set into several equivalence classes. Then, provided the data within each equivalence class remain consistent, each class is represented by a distinct symbol or integer value and treated as a single discrete datum, which achieves the discretization. In short, discretizing a continuous attribute means partitioning the attribute space with a set of specific symbols or integer values.
With the rapid growth of massive data, extracting useful knowledge from large, disordered, and noisy databases has become a challenge for intelligent information processing. Many data mining methods, such as decision trees, rough sets, and association rules, are designed for discrete data sets. Discretization has therefore become one of the main problems of rough set theory and one of the bottlenecks limiting its application. In practice, however, attributes are more often continuous or mixed than purely discrete. To obtain concise, effective rules and mine more useful information from data samples drawn from databases that contain continuous attributes, the continuous attributes must be discretized as a preprocessing step.
The discretization method proposed by the invention addresses the discretization of continuous attributes in the MP3 compressed domain. The discrete points selected for each attribute dimension can differ and are determined by the sample attribute itself and the sample classes. This is more reasonable than the one-size-fits-all breakpoint selection of traditional discretization methods and preserves more of the characteristics of each attribute. The method can further be applied to speech recognition and to classification and retrieval systems for MP3 audio.
Summary of the invention
The object of the invention is to address the shortcomings of the prior art by providing an MP3 audio attribute discretization method based on a heterogeneity criterion, which discretizes MP3 audio attributes by extracting the main audio features in the MDCT domain and choosing candidate breakpoints based on inflection points.
To achieve this object, the concept of the invention is as follows: first extract the MDCT coefficients from the MP3 audio data; then extract the main audio features in the MDCT domain as the attribute set of the training samples, giving a 15-dimensional feature attribute input set; obtain the breakpoint set of each continuous attribute from the properties of inflection points; and finally obtain the discretization result with the discretization method based on the heterogeneity criterion.
Based on this concept, the technical solution adopted by the invention is as follows. The MDCT coefficients are first extracted from the MP3 audio data and their characteristics are analyzed. The main audio features, namely root mean square (RMS), spectral centroid (SC), sideband energy ratio (BER), and 12-dimensional Mel-frequency cepstral coefficients (MFCC), are extracted from the MDCT coefficients and used as the attribute set of the training samples, giving a 15-dimensional feature attribute input set. The breakpoint set of each continuous attribute is then obtained from the properties of inflection points, and the discretization result is finally obtained with the discretization method based on the heterogeneity criterion. The method comprises the following steps:
1) Preprocessing of MP3 audio features, in four parts: decoding the MP3 frame header, obtaining the side information, reading the main data and scale factors, and Huffman decoding and dequantization;
2) Audio feature extraction based on MDCT coefficients: from each dequantized MP3 frame, find the MDCT coefficients of its two granules and average them frequency point by frequency point to build the MDCT spectral coefficients of the frame; then extract root mean square (RMS), spectral centroid (SC), sideband energy ratio (BER), and 12-dimensional Mel-frequency cepstral coefficients (MFCC);
3) Selection of candidate breakpoints: starting from the envelope of each continuous attribute, use the inflection points of the envelope as the initial candidate breakpoints for discretization; this retains the important information about how the attribute changes between breakpoints and improves the adaptability of the discretization method;
4) Design of the heterogeneity measure: compute the class-conditional probability vector of each discrete interval; the distance between a probability vector $P$ and the centroid probability vector $P_0$ is called the heterogeneity of $P$, and the distance between each interval's class-conditional probability vector and $P_0$ is used to measure the quality of a discretization;
5) Discretization under the heterogeneity criterion: process each attribute dimension of the attribute set with the candidate-breakpoint algorithm of step 3), then discretize the processed attribute set according to the heterogeneity computed in step 4);
Compared with the prior art, the invention has the following evident substantive features and notable advantages. It discretizes MP3 audio directly in the MP3 compressed domain; compared with the traditional approach of decoding the MP3 compressed audio to uncompressed audio before discretizing, the proposed method is simpler and saves computing time. With the new algorithm, the discrete points selected for each attribute dimension can differ and are determined by the sample attribute itself and the sample classes. This is more reasonable than the one-size-fits-all discretization results of traditional methods and preserves more of the characteristics of each attribute. It facilitates subsequent optimization of compressed-domain audio attribute features and lays a foundation for a practical, fast audio multi-classification and retrieval system.
Description of drawings
Fig. 1 is a flow chart of the MP3 audio attribute discretization method based on a heterogeneity criterion according to the invention;
Fig. 2 is a schematic diagram of the original feature attribute of the compressed-domain audio feature SC;
Fig. 3 is a schematic diagram of the compressed-domain audio feature SC after linear discretization;
Fig. 4 is a schematic diagram of the compressed-domain audio feature SC after entropy-based discretization;
Fig. 5 is a schematic diagram of the compressed-domain audio feature SC after discretization based on the heterogeneity criterion.
Embodiment
A preferred embodiment of the invention is described below with reference to the drawings. As shown in Fig. 1, the MP3 audio attribute discretization method based on a heterogeneity criterion is divided into five steps:
First step: preprocessing of MP3 compressed-domain audio features
Preprocessing of the MP3 compressed audio comprises four parts: decoding the MP3 frame header, obtaining the side information, reading the main data and scale factors, and Huffman decoding and dequantization.
1. Obtaining the synchronized data stream and frame header information:
A) According to the MP3 coding format, search the MP3 data stream for the synchronization information;
B) From the synchronization information, find the start position of each frame in the MP3 data stream;
C) Once the start position of a data frame is determined, obtain the frame header information Head.
2. Obtaining the side information:
A) According to the coding format of the MP3 frame header, determine the start position of the side information in the MP3 frame header;
B) Obtain the side information Side from the frame header information Head.
3. Reading the MP3 main data and scale factors:
A) Compute the length L of the main data from the side information Side;
B) Determine the start position of the MP3 main data from the main data offset in the frame header information Head;
C) Read the main data D, of total length L, starting from the current frame;
D) Extract the scale factors Scale from the main data D.
4. Huffman decoding and dequantization:
A) Determine the start and end positions of the Huffman-coded data from the side information Side;
B) Huffman-decode the MP3 main data D to obtain the 32*18 decoding result F[32,18];
C) Dequantize the data in the Huffman decoding result F[32,18].
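As an illustration of sub-steps 1 A) to C), the following minimal Python sketch searches a byte stream for the frame synchronization pattern and decodes a few header fields. It assumes MPEG-1 Layer III frames only. The function names `parse_header` and `find_first_frame` and the returned field names are hypothetical and do not come from the patent; the bit layout and the frame-length expression follow the commonly documented MP3 frame header format rather than anything stated in this text.

```python
# Bitrates (kbit/s) and sample rates (Hz) for MPEG-1 Layer III only.
# None marks index values this sketch does not handle (free format / reserved).
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96,
                 112, 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]


def parse_header(data, pos):
    """Return header fields if an MPEG-1 Layer III header starts at pos, else None."""
    if pos + 4 > len(data):
        return None
    b0, b1, b2, b3 = data[pos:pos + 4]
    # The synchronization word is 11 consecutive set bits.
    if b0 != 0xFF or (b1 & 0xE0) != 0xE0:
        return None
    version = (b1 >> 3) & 0x03   # 3 means MPEG-1
    layer = (b1 >> 1) & 0x03     # 1 means Layer III
    if version != 3 or layer != 1:
        return None
    bitrate = BITRATES_KBPS[b2 >> 4]
    sample_rate = SAMPLE_RATES_HZ[(b2 >> 2) & 0x03]
    if bitrate is None or sample_rate is None:
        return None
    padding = (b2 >> 1) & 0x01
    # Frame length in bytes for MPEG-1 Layer III.
    frame_length = 144 * bitrate * 1000 // sample_rate + padding
    return {
        "bitrate_kbps": bitrate,
        "sample_rate_hz": sample_rate,
        "padding": padding,
        "channel_mode": b3 >> 6,
        "frame_length": frame_length,
    }


def find_first_frame(data):
    """Scan for the first position that parses as a valid header."""
    for pos in range(len(data) - 3):
        header = parse_header(data, pos)
        if header is not None:
            return pos, header
    return None
```

A real decoder would also confirm that another valid header follows `frame_length` bytes later, since the sync pattern can occur by chance inside the audio data.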
Second step: MDCT coefficient extraction and MP3 audio feature extraction
1. Building the modified discrete cosine transform (MDCT) coefficients of each audio frame:
A) Allocate two storage arrays of size n*576, MDCT_0[n,576] and MDCT_1[n,576], to hold the MDCT coefficients of the two granules of each MP3 audio frame, where n is the number of frames of the MP3 audio;
B) From the array F, find the MDCT coefficients of the two granules of the same audio frame, arrange them in order of increasing frequency, and store them in MDCT_0[i,j] and MDCT_1[i,j];
C) Compute the mean of the MDCT coefficients of the two granules at the same frequency point within the same audio frame, and use it as the MDCT coefficient value M[i,j] of that frame:
$$M[i,j] = \frac{MDCT_0[i,j] + MDCT_1[i,j]}{2}$$
where $MDCT_0[i,j]$ and $MDCT_1[i,j]$ are the j-th MDCT spectral values of granule 0 and granule 1 of the i-th audio frame, and $M[i,j]$ is the j-th averaged MDCT spectral value of the i-th audio frame.
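The averaging in sub-step C) can be sketched in a few lines of Python. The function name `average_granules` is hypothetical, and the sketch assumes each granule is already available as a list of dequantized coefficients ordered from low to high frequency.

```python
def average_granules(granule0, granule1):
    """Average two granules of one frame, frequency point by frequency point."""
    if len(granule0) != len(granule1):
        raise ValueError("both granules must have the same number of coefficients")
    return [(a + b) / 2.0 for a, b in zip(granule0, granule1)]


def frame_spectra(mdct0, mdct1):
    """Apply the averaging to every frame; mdct0[i] and mdct1[i] are the granules of frame i."""
    return [average_granules(g0, g1) for g0, g1 in zip(mdct0, mdct1)]
```

For an MP3 frame each granule would hold 576 coefficients, so `frame_spectra` returns n lists of 576 values.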
2. Extracting root mean square (RMS), spectral centroid (SC), sideband energy ratio (BER), and 12-dimensional Mel-frequency cepstral coefficients (MFCC):
A) Root mean square (RMS)
This parameter is the envelope of the audio signal and reflects the energy variation of the signal. For one granule, RMS is computed as:
$$\mathrm{RMS} = \sqrt{\frac{1}{N}\sum_{j=1}^{N} x_j^2}$$
where $N$ is the number of coefficients in a granule and $x_j$ is the j-th MDCT coefficient value.
B) Spectral centroid (SC)
This parameter is the balance point of the MDCT coefficient energy distribution and indicates the spectral region in which most of the signal energy is concentrated. Its formula (an image in the original, not reproduced here) is expressed in terms of the MDCT coefficient values.
C) Sideband energy ratio (BER)
The sideband energy ratio is obtained by choosing a reference frequency within the signal's frequency range and taking the ratio of the energy of the MDCT coefficients at frequencies below the reference frequency to the energy of those above it. The reference frequency is taken to fall in a particular sideband, and the MDCT coefficient whose frequency is closest to it marks the split. The BER formula (an image in the original, not reproduced here) is written in terms of the MDCT coefficient values and a parameter M that represents the window type, taking one value for long windows and another for short windows.
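Because the formula images for these three features are missing, the following Python sketch uses common textbook forms as stand-ins and should not be read as the patent's exact definitions. The assumptions are: RMS is the square root of the mean squared coefficient; SC is the energy-weighted mean coefficient index, counted from 1; and BER is the energy below a split index divided by the energy at or above it. The function names and the `split` parameter are hypothetical, and the window-type parameter M is not modelled.

```python
import math


def rms(coeffs):
    """Root mean square of the MDCT coefficients of one granule or frame."""
    return math.sqrt(sum(x * x for x in coeffs) / len(coeffs))


def spectral_centroid(coeffs):
    """Energy-weighted mean index (1-based) of the MDCT coefficients."""
    energy = [x * x for x in coeffs]
    total = sum(energy)
    if total == 0:
        return 0.0
    return sum(j * e for j, e in enumerate(energy, start=1)) / total


def band_energy_ratio(coeffs, split):
    """Energy of coefficients below index `split` over energy from `split` upward."""
    low = sum(x * x for x in coeffs[:split])
    high = sum(x * x for x in coeffs[split:])
    return low / high if high > 0 else float("inf")
```

For the two-coefficient input `[3.0, 4.0]`, the energies are 9 and 16, so the centroid is 41/25 and the ratio with `split=1` is 9/16.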
D) Mel-frequency cepstral coefficients (MFCC)
MFCC is based on the characteristics of human hearing; the Mel frequency has a nonlinear correspondence with frequency in Hz. The MFCC calculation steps are:
(1) Define a bank of 12 triangular filters whose actual center frequencies correspond to Mel frequencies, indexed m = 1, 2, ..., 12. The center frequencies are determined by a formula (an image in the original, not reproduced here) in which N = 576 and which uses the mapping from actual frequency to Mel frequency, its inverse function, and the Mel-frequency representations of the lowest and highest frequencies. The formula converts center frequencies that are equally spaced in the Mel domain into actual center frequencies. Triangular filtering in the Mel domain amounts to computing the frequency components that fall within the range of each triangular filter, with the amplitude of each MDCT spectral line multiplied by the corresponding factor. In the frequency response of the triangular filters (formula image not reproduced here), m is the index of the filter and takes integer values, and k is the index of the frequency line.
(2) Compute the output energy of each filter (formula image not reproduced here), where m is the filter index and k is the frequency line index.
(3) Apply a cosine transform to obtain the MFCC coefficients (formula image not reproduced here).
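The three MFCC steps can be sketched as follows. Since the formula images are missing, every concrete formula here is an assumption rather than the patent's own: the Hz-to-Mel mapping uses the widely used 2595·log10(1 + f/700) form, the 576 MDCT lines are assumed to span 0 Hz to half of a 44.1 kHz sampling rate linearly, filter energies are taken from squared coefficients, and the cosine transform is a DCT-II over the log energies with indices 0 to 11. All function and parameter names are hypothetical.

```python
import math


def hz_to_mel(f):
    """Assumed Hz-to-Mel mapping (common textbook form)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filter_edges(n_filters=12, n_bins=576, sample_rate=44100.0):
    """Fractional bin positions of the n_filters + 2 edges, equally spaced in Mel."""
    f_high = sample_rate / 2.0
    mel_high = hz_to_mel(f_high)
    edges = []
    for m in range(n_filters + 2):
        hz = mel_to_hz(mel_high * m / (n_filters + 1))
        edges.append(hz / f_high * (n_bins - 1))
    return edges


def triangle(k, left, center, right):
    """Triangular weight of frequency line k for a filter with the given edges."""
    if left < k <= center:
        return (k - left) / (center - left)
    if center < k < right:
        return (right - k) / (right - center)
    return 0.0


def mfcc(coeffs, n_filters=12, n_ceps=12, sample_rate=44100.0):
    """Step (1): filter bank; step (2): filter energies; step (3): cosine transform."""
    edges = mel_filter_edges(n_filters, len(coeffs), sample_rate)
    log_energy = []
    for m in range(1, n_filters + 1):
        e = sum(triangle(k, edges[m - 1], edges[m], edges[m + 1]) * x * x
                for k, x in enumerate(coeffs))
        log_energy.append(math.log(e + 1e-12))  # small offset avoids log(0)
    return [sum(le * math.cos(math.pi * n * (m + 0.5) / n_filters)
                for m, le in enumerate(log_energy))
            for n in range(n_ceps)]
```

The cepstral index starts at 0 in this sketch because, with 12 filters, a DCT-II coefficient of index 12 is identically zero; the patent's own indexing is not recoverable from the text.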
Third step: selection of candidate breakpoints
1. Analyze the audio feature attribute set and take, in turn, four consecutive attribute points (A, B, C, D) on the first attribute dimension;
2. For the three vectors (AB, BC, CD) formed by the four consecutive points, compute the curvature of the two pairs of adjacent vectors (formula images not reproduced here);
3. From the two curvatures, compute the quantity used for the inflection test (formula image not reproduced here);
4. If the condition on this quantity holds, the necessary and sufficient condition for the existence of an inflection point is satisfied, and the inflection point lies somewhere on the vector BC. The midpoint of BC is usually taken as the inflection point, that is, as the candidate breakpoint;
5. Loop: repeat steps 1 to 4 for the other condition attributes to obtain the candidate breakpoint set of each attribute dimension.
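The inflection test of steps 1 to 4 can be sketched as below. The curvature formulas are missing from the source, so this sketch substitutes a standard stand-in: the sign of the 2-D cross product of adjacent vectors, with an inflection declared when the two cross products have opposite signs. It also assumes the attribute values are taken in sample order with unit spacing, and that the candidate breakpoint is the mean of the values at B and C. The function name is hypothetical.

```python
def candidate_breakpoints(values):
    """Return sorted candidate breakpoints from sign changes in turning direction."""
    cuts = set()
    for k in range(len(values) - 3):
        a, b, c, d = values[k:k + 4]
        # With unit spacing, AB = (1, b - a), BC = (1, c - b), CD = (1, d - c).
        cross_ab_bc = (c - b) - (b - a)   # z-component of AB x BC
        cross_bc_cd = (d - c) - (c - b)   # z-component of BC x CD
        # Opposite signs mean the envelope changes its bending direction on BC.
        if cross_ab_bc * cross_bc_cd < 0:
            cuts.add((b + c) / 2.0)
    return sorted(cuts)
```

For example, the sequence 0, 1, 3, 4, 4.5 bends upward and then downward, so the sketch reports a single candidate at 2.0, the midpoint of the values 1 and 3; a straight line yields no candidates.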
Fourth step: design of the heterogeneity measure
1. Computing the heterogeneity:
Suppose the information system contains M samples and a continuous attribute is to be quantized. The samples belong to a fixed number of classes, and the range of the attribute is divided into a number of discrete intervals. For each class and each interval, count the samples of that class whose attribute value lies in the i-th discrete interval. Each row sum of the resulting table is the total number of samples of one class, and the i-th column sum is the total number of samples whose attribute value falls in the i-th discrete interval. For the i-th discrete interval, these counts give a class-conditional probability distribution whose components sum to 1.
Consider the set of all probability vectors whose dimension equals the number of classes; it contains the class probability vector of every discrete interval. The distance between a vector $P$ in this set and the centroid probability vector $P_0$ is called the heterogeneity of $P$ (the defining formulas are images in the original and are not reproduced here). For the i-th discrete interval, when the class-conditional probability vector equals $P_0$, the heterogeneity of the interval is minimal, which indicates poor classification performance. Therefore, for any class probability vector, its distance to the centroid probability vector $P_0$ is used as the heterogeneity measure.
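A minimal sketch of the heterogeneity computation follows. Two points are assumptions because the formula images are missing: the centroid probability vector is taken to be the uniform distribution over the classes, and the distance is taken to be Euclidean. The function names are hypothetical.

```python
import math


def class_conditional(counts):
    """Class-conditional probability vector of one interval from per-class sample counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("the interval contains no samples")
    return [c / total for c in counts]


def heterogeneity(p):
    """Euclidean distance between p and the uniform (centroid) probability vector."""
    s = len(p)
    return math.sqrt(sum((x - 1.0 / s) ** 2 for x in p))
```

Under these assumptions an interval whose samples are spread evenly over the classes has heterogeneity 0, while an interval that contains a single class has the largest value, which matches the statement that minimal heterogeneity indicates poor classification performance.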
The quantity used to compare discretization schemes is computed by a formula that is an image in the original and is not reproduced here. Consider the boundary sets of two discretization schemes D and D'. When the boundary condition relating the two schemes holds, D' can be generated from D, that is, adding some boundary points to the boundaries of D yields D'. A rule relating any two such discretization schemes D and D' follows (formula image not reproduced here). As for the number of discrete intervals, provided good classification performance is maintained, fewer intervals reduce the complexity of the data further and benefit subsequent sample classification more. This gives the following equivalent criterion for evaluating discretization schemes:
Suppose two discretization schemes D and D' have their respective numbers of discrete intervals, and D' is a scheme generated from D. The schemes that satisfy the discretization criterion form the candidate set, denoted CD. If the criterion inequality (formula image not reproduced here) holds, discretization scheme D is better than D', where CF is called the criterion function.
Fifth step: discretization under the heterogeneity criterion
(1) Initialize the breakpoint set. Process the first attribute in the attribute set with the candidate-breakpoint algorithm of the third step to obtain the initial boundary point set.
(2) Initialize the discretization scheme. Initialize the discretization scheme from the breakpoint set, together with its associated variables (the initial values are images in the original and are not reproduced here).
(3) Add a candidate breakpoint. Add a breakpoint to the current discretization scheme to produce a new scheme.
(4) Update the discretization scheme. Within the current GD, take a discretization scheme D and check whether CF(D) is greater than the current Globalopt. If so, update CD = D and G = CF(D); if not, move on to the next discretization scheme and continue comparing CF(D) with Globalopt. After all possible, non-repeated schemes have been checked, the best scheme CD, with the maximum Globalopt value, is obtained.
(5) Repeat steps (3) and (4) until the whole initial breakpoint set has been examined, then end the loop.
(6) Obtain the discrete points of the current attribute. Then repeat steps (1) to (5) for the other condition attributes to obtain the discrete points of each attribute dimension.
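The search in steps (1) to (6) can be illustrated for a single attribute with the greedy sketch below. It is not the patent's algorithm: the criterion function CF is missing from the source, so `criterion` is a hypothetical stand-in (sample-weighted mean heterogeneity minus a penalty per interval), the heterogeneity uses the uniform-centroid Euclidean assumption, and the search adds one breakpoint at a time rather than checking every possible scheme. All names and the `penalty` parameter are illustrative.

```python
import bisect
import math


def heterogeneity(counts):
    """Distance between an interval's class distribution and the uniform vector (assumed form)."""
    total = sum(counts)
    s = len(counts)
    return math.sqrt(sum((c / total - 1.0 / s) ** 2 for c in counts))


def criterion(values, labels, cuts, classes, penalty=0.01):
    """Hypothetical stand-in for CF(D): weighted heterogeneity minus an interval-count penalty."""
    table = [[0] * len(classes) for _ in range(len(cuts) + 1)]
    for v, y in zip(values, labels):
        table[bisect.bisect_right(cuts, v)][classes.index(y)] += 1
    n = len(values)
    score = sum(sum(row) / n * heterogeneity(row) for row in table if sum(row) > 0)
    return score - penalty * len(table)


def discretize(values, labels, candidates, penalty=0.01):
    """Greedily add the candidate breakpoint that most improves the criterion."""
    classes = sorted(set(labels))
    chosen = []                                   # current best scheme (CD)
    best = criterion(values, labels, chosen, classes, penalty)  # current optimum
    remaining = sorted(set(candidates))
    improved = True
    while improved and remaining:
        improved = False
        scored = [(criterion(values, labels, sorted(chosen + [c]), classes, penalty), c)
                  for c in remaining]
        top_score, top_cut = max(scored)
        if top_score > best:                      # keep the new scheme only if it is better
            best = top_score
            chosen = sorted(chosen + [top_cut])
            remaining.remove(top_cut)
            improved = True
    return chosen, best
```

With six samples whose two classes separate cleanly around 0.5, the sketch keeps only the breakpoint 0.5: the other candidates add intervals without making the intervals any purer, so the penalty term rejects them.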
Experimental results: the feature parameters in this experiment, namely root mean square (RMS), spectral centroid (SC), sideband energy ratio (BER), and 12-dimensional Mel-frequency cepstral coefficients (MFCC), were extracted on the VC++ platform and used as the attribute set of the training samples, giving a 15-dimensional feature input set. Using the algorithms described in the preceding section, this 15-dimensional feature attribute set was discretized on the Matlab platform with the linear discretization method, entropy-based discretization, and discretization based on the heterogeneity criterion, and the three methods were compared. The audio chosen is a 60-second MP3 clip with a 44.1 kHz sampling rate, mono, 16-bit encoding, combining instrumental music, speech, and rap music, 20 seconds each.
The attribute SC of the attribute vector set is chosen for analysis. Figs. 2 to 5 show, in order, the compressed-domain audio feature SC in its original undiscretized state and after the linear discretization method, the entropy-based discretization algorithm, and discretization based on the heterogeneity criterion.
Comparing Fig. 5 with Fig. 2 shows that the discretization method based on the heterogeneity criterion is supervised: the choice of discrete points varies with the sample classes. The linear discretization algorithm, by contrast, relies on a manually specified parameter K and obtains the whole breakpoint set at once without considering the class information contained in the attribute, so the quality of its discretization result is not guaranteed.
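For contrast, a linear discretization of the kind criticized above can be sketched in a few lines. The text does not define the method, so this sketch assumes it means equal-width binning into K intervals; the function name is hypothetical.

```python
def equal_width_cuts(values, k):
    """Return the k - 1 breakpoints that split [min, max] into k equal-width intervals."""
    if k < 1:
        raise ValueError("k must be at least 1")
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k)]
```

The breakpoints depend only on the value range and on K, never on the class labels, which is why such a method is unsupervised.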
Comparing Fig. 4 with Fig. 5 shows that the entropy-based discretization algorithm has severely altered the original data distribution. This happens because the chosen threshold is too large, so the discretization result fails to reflect the real differences in the raw data; choosing a suitable threshold, however, is difficult in practice. Discretization based on the heterogeneity criterion does not have this problem.
Table 1 lists the discrete points selected for the first two dimensions, the RMS and SC attributes, of the normalized 15-dimensional attribute vector of one music sample.
Table 1. Attribute discrete points obtained with the heterogeneity criterion
Discrete point    RMS       SC
1                 0.0002    0.0206
2                 0.0003    0.0308
3                 0.0004    0.0524
4                 0.0052    0.0669
5                 0.0074    0.0691
6                 0.0075    0.0915
7                 0.0102    0.1815
8                 0.0104    0.1823
9                 0.0119    0.2077
10                0.0122    0.3042
11                0.0123    0.5240
12                0.0216    0.5570
13                0.0255    0.9572
As Table 1 shows, under the discretization method based on the heterogeneity criterion the discrete points differ from attribute to attribute. In the linear discretization method described above, the discrete points are the same for every attribute of the 15-dimensional sample attribute set, whereas with the proposed algorithm the discrete points selected for each attribute dimension can differ and are determined by the sample attribute itself and the sample classes. This is more reasonable than the one-size-fits-all approach of linear discretization and preserves more of the characteristics of each attribute.
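Once the discrete points of an attribute are known, each continuous value is replaced by the index of the interval it falls in. The sketch below does this for the SC discrete points of Table 1 using Python's standard `bisect` module. The function name is hypothetical, and treating a value equal to a discrete point as belonging to the interval above it is an assumption, since the text does not state the boundary convention.

```python
import bisect

# SC discrete points as listed in Table 1.
SC_CUTS = [0.0206, 0.0308, 0.0524, 0.0669, 0.0691, 0.0915, 0.1815,
           0.1823, 0.2077, 0.3042, 0.5240, 0.5570, 0.9572]


def interval_index(value, cuts):
    """Return the index of the interval containing value (0 means below the first cut)."""
    return bisect.bisect_right(cuts, value)


def discretize_attribute(values, cuts):
    """Map every continuous value of one attribute to its interval index."""
    return [interval_index(v, cuts) for v in values]
```

With 13 discrete points the SC attribute takes one of 14 integer codes, from 0 to 13.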
To verify the discretization results, a classification test was carried out on the discretized data. Table 2 lists the classification accuracy for four kinds of audio samples under linear discretization, entropy-based discretization (EBD), and discretization based on the heterogeneity criterion.
Table 2. Comparison of classification accuracy under the three discretization methods
Audio sample group                  No discretization    Linear discretization    EBD       Heterogeneity-based
Classical / speech / rock           93.53%               83.83%                   85.14%    89.71%
Male voice / female voice / rock    88.78%               71.46%                   81.52%    86.87%
Table 2 shows that classification accuracy is highest without discretization and that every discretization method lowers it to some degree, because discretization inevitably discards part of the information in the original sample attributes. The proposed method shares this drawback, but its loss is comparatively small and it maintains a high classification accuracy. When the audio sample data are large and the feature attribute dimension grows, discretization greatly reduces the complexity of subsequent algorithms, so giving up some accuracy within an acceptable range is worthwhile. In addition, discretizing continuous attributes is a required step for the subsequent feature optimization based on rough set theory.

Claims (6)

1. MP3 audio attribute discretization method based on heterogeneous criterion, it is characterized in that: the concrete operations step is as follows:
1), the pre-service of MP3 audio frequency characteristics: comprise to the MP3 frame head decode, side information obtains, obtain master data and zoom factor, Hafman decoding and inverse quantization;
2), based on the audio feature extraction of MDCT coefficient: the MDCT coefficient of finding out two granularities of each frame the MP3 frame behind inverse quantization, MDCT coefficient to two particles asks average by Frequency point, make up the MDCT spectral coefficient of every frame audio frequency, extract root mean square RMS, spectrum centre distance SC, sideband energy ratio B ER, Mel cepstrum coefficient MFCC then;
3), the selection of candidate's breakpoint: from the envelope character of connection attribute, will be retained in the important information of the interval attribute change of different breakpoints, and improve the adaptability of discretization method based on the flex point of this envelope initial candidate breakpoint as the attribute discretize;
4) Design of the heterogeneity measure: compute the class-based conditional probability vector [formula]; the distance between a vector [formula] and the center-of-gravity probability vector [formula] is called the heterogeneity measure [formula] of that vector, and this distance is used to assess the quality of the discretization;
5) Discretization algorithm under the heterogeneity rule: process each attribute dimension in the attribute set using the candidate breakpoints from step 3), and discretize the processed attribute set according to the heterogeneity measure computed in step 4).
2. The MP3 audio attribute discretization method based on a heterogeneity rule according to claim 1, characterized in that the preprocessing of the MP3 audio features in step 1) comprises the following concrete steps:
(1) synchronize the data stream and obtain the frame header information;
(2) obtain the side information from the decoded frame header information;
(3) extract the MP3 main data and scale factors;
(4) perform Huffman decoding and inverse quantization on the MP3 main data stream.
3. The MP3 audio attribute discretization method based on a heterogeneity rule according to claim 1, characterized in that the audio feature extraction based on MDCT coefficients in step 2) comprises the following concrete steps:
(1) build the MDCT coefficients of each audio frame;
(2) extract, from the MDCT coefficients, the root mean square (RMS), spectral center distance (SC), sideband energy ratio (BER), and Mel-frequency cepstral coefficients (MFCC).
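The feature-extraction steps of this claim can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names are hypothetical, SC is assumed to be the energy-weighted mean bin index, BER is assumed to be the share of energy in a caller-chosen bin range, and MFCC extraction is omitted. The exact formulas used by the method are not given in this section.

```python
import math

def frame_spectrum(granule0, granule1):
    """Average the MDCT coefficients of the two granules bin by bin."""
    if len(granule0) != len(granule1):
        raise ValueError("granules must have the same number of coefficients")
    return [(a + b) / 2.0 for a, b in zip(granule0, granule1)]

def rms(spectrum):
    """Root mean square of the frame's MDCT coefficients."""
    return math.sqrt(sum(c * c for c in spectrum) / len(spectrum))

def spectral_centroid(spectrum):
    """Assumed SC: energy-weighted mean bin index (0.0 for a silent frame)."""
    energy = [c * c for c in spectrum]
    total = sum(energy)
    if total == 0.0:
        return 0.0
    return sum(k * e for k, e in enumerate(energy)) / total

def band_energy_ratio(spectrum, lo, hi):
    """Assumed BER: energy in bins [lo, hi) divided by the total energy."""
    energy = [c * c for c in spectrum]
    total = sum(energy)
    if total == 0.0:
        return 0.0
    return sum(energy[lo:hi]) / total

# Toy frame with 4 bins per granule (a real MP3 granule has 576 per channel).
spec = frame_spectrum([2.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 2.0])
features = [rms(spec), spectral_centroid(spec), band_energy_ratio(spec, 0, 2)]
```

In a full pipeline these three values would be concatenated with the 12 MFCC values to form the 15-dimensional attribute vector of a frame.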
4. The MP3 audio attribute discretization method based on a heterogeneity rule according to claim 1, characterized in that the selection of candidate breakpoints in step 3) comprises the following concrete steps:
(1) initialize the audio feature attribute set;
(2) successively choose four sequential points in the audio feature attribute set to form three vectors [formula], and compute the curvature for each of the two pairs of adjacent vectors;
(3) judge from the change in the direction of curvature whether an inflection point exists;
(4) loop: for the other condition attributes, repeat steps (1) to (3) to obtain the candidate breakpoint set of each attribute dimension.
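The inflection-point test of this claim can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: the envelope is taken as an ordered list of attribute values plotted against their index, the "direction of curvature" is approximated by the sign of the 2-D cross product of adjacent vectors, and a detected inflection is turned into a breakpoint at the midpoint of the two middle values. All names are hypothetical.

```python
def cross_z(u, v):
    """z-component of the 2-D cross product; its sign gives the turn direction."""
    return u[0] * v[1] - u[1] * v[0]

def candidate_breakpoints(values):
    """Return assumed candidate breakpoints for one attribute's envelope."""
    points = list(enumerate(values))  # (index, value) pairs
    cuts = []
    for i in range(len(points) - 3):
        p0, p1, p2, p3 = points[i:i + 4]
        # Three vectors formed by four sequential points.
        v1 = (p1[0] - p0[0], p1[1] - p0[1])
        v2 = (p2[0] - p1[0], p2[1] - p1[1])
        v3 = (p3[0] - p2[0], p3[1] - p2[1])
        # Turn direction of each pair of adjacent vectors.
        turn_a = cross_z(v1, v2)
        turn_b = cross_z(v2, v3)
        # Opposite signs mean the curve changes its bending direction.
        if turn_a * turn_b < 0:
            cuts.append((p1[1] + p2[1]) / 2.0)
    return cuts

s_shaped = candidate_breakpoints([0.0, 1.0, 4.0, 5.0])  # bends up, then down
convex = candidate_breakpoints([0.0, 1.0, 4.0, 9.0])    # bends up throughout
```

Running the same function on each of the 15 attribute dimensions would yield one candidate breakpoint set per dimension.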
5. The MP3 audio attribute discretization method based on a heterogeneity rule according to claim 1, characterized in that the design of the heterogeneity measure in step 4) comprises the following steps:
(1) compute the heterogeneity measure between different types of audio using the Euclidean distance;
(2) compute the heterogeneity between different types of audio according to the selected heterogeneity measure.
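A heterogeneity measure of this kind can be sketched in Python. The sketch rests on assumptions that this section does not confirm: the class-based conditional probability vector of an interval is taken to be the relative frequency of each class among the samples in that interval, and the center-of-gravity probability vector is taken to be the uniform vector (1/k, ..., 1/k) over k classes. Function names and class labels are hypothetical.

```python
import math
from collections import Counter

def class_probability_vector(labels, classes):
    """Relative frequency of each class among the samples of one interval."""
    if not labels:
        raise ValueError("interval contains no samples")
    counts = Counter(labels)
    return [counts[c] / len(labels) for c in classes]

def heterogeneity(prob_vector):
    """Euclidean distance to the assumed uniform center-of-gravity vector."""
    k = len(prob_vector)
    return math.sqrt(sum((p - 1.0 / k) ** 2 for p in prob_vector))

classes = ["music", "speech"]
pure = class_probability_vector(["speech", "speech"], classes)
mixed = class_probability_vector(["speech", "music"], classes)
pure_score = heterogeneity(pure)    # far from the uniform vector
mixed_score = heterogeneity(mixed)  # coincides with the uniform vector
```

Under these assumptions, an interval containing a single class lies farthest from the uniform vector, while an evenly mixed interval has a distance of zero, so a larger value indicates an interval that separates the audio classes better.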
6. The MP3 audio attribute discretization method based on a heterogeneity rule according to claim 1, characterized in that the discretization algorithm under the heterogeneity rule in step 5) comprises the following concrete steps:
(1) initialize the breakpoint set for each attribute dimension;
(2) initialize the discretization scheme from the initialized breakpoint set;
(3) add a candidate breakpoint to the discretization scheme;
(4) update the discretization scheme according to whether all candidate breakpoints have been verified;
(5) repeat steps (3) and (4) until every breakpoint in the initial breakpoint set has been verified, then end the loop;
(6) obtain the discrete points of the current attribute, then repeat steps (1) to (5) for the other condition attributes to obtain the discrete points of each attribute dimension.
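The loop in steps (1) to (6) can be illustrated with a greedy Python sketch for a single attribute. The acceptance test is an assumption, since this section does not state it: a candidate breakpoint is kept only if it raises the size-weighted mean heterogeneity of the resulting intervals, with heterogeneity taken as the Euclidean distance to an assumed uniform center-of-gravity vector. All names are hypothetical.

```python
import math
from bisect import bisect_right
from collections import Counter

def heterogeneity(labels, classes):
    """Distance of an interval's class frequencies from the uniform vector."""
    counts = Counter(labels)
    k = len(classes)
    return math.sqrt(sum((counts[c] / len(labels) - 1.0 / k) ** 2
                         for c in classes))

def scheme_score(values, labels, cuts, classes):
    """Size-weighted mean heterogeneity over the intervals induced by cuts."""
    buckets = {}
    for v, y in zip(values, labels):
        buckets.setdefault(bisect_right(cuts, v), []).append(y)
    n = len(values)
    return sum(len(b) / n * heterogeneity(b, classes)
               for b in buckets.values())

def discretize(values, labels, candidates):
    """Verify each candidate breakpoint once; keep those that improve the score."""
    classes = sorted(set(labels))
    cuts = []                                   # initial discretization scheme
    best = scheme_score(values, labels, cuts, classes)
    for c in sorted(candidates):                # loop until all are verified
        trial = sorted(cuts + [c])              # tentatively add the breakpoint
        score = scheme_score(values, labels, trial, classes)
        if score > best + 1e-12:                # assumed acceptance criterion
            cuts, best = trial, score
    return cuts

values = [0.1, 0.2, 0.8, 0.9]
labels = ["a", "a", "b", "b"]
cuts = discretize(values, labels, [0.5, 0.85])
```

In this toy input the breakpoint at 0.5 already separates the two classes, so the candidate at 0.85 brings no improvement and is discarded. Repeating the call for every attribute dimension corresponds to step (6).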

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010106122593A CN102270210A (en) 2010-12-30 2010-12-30 MP3 audio attribute discretization method based on heterogeneity rule

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2010106122593A CN102270210A (en) 2010-12-30 2010-12-30 MP3 audio attribute discretization method based on heterogeneity rule

Publications (1)

Publication Number Publication Date
CN102270210A true CN102270210A (en) 2011-12-07

Family

ID=45052517

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010106122593A Pending CN102270210A (en) 2010-12-30 2010-12-30 MP3 audio attribute discretization method based on heterogeneity rule

Country Status (1)

Country Link
CN (1) CN102270210A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107610710A (en) * 2017-09-29 2018-01-19 武汉大学 A kind of audio coding and coding/decoding method towards Multi-audio-frequency object
CN112754502A (en) * 2021-01-12 2021-05-07 曲阜师范大学 Automatic music switching method based on electroencephalogram signals

Similar Documents

Publication Publication Date Title
CN103971689B (en) A kind of audio identification methods and device
Zhou et al. Predicting the geographical origin of music
CN106960006A (en) Measuring similarity system and its measure between a kind of different tracks
CN101488150A (en) Real-time multi-view network focus event analysis apparatus and analysis method
CN104462199A (en) Near-duplicate image search method in network environment
Kumar et al. Canopy clustering: a review on pre-clustering approach to K-Means clustering
CN104239553A (en) Entity recognition method based on Map-Reduce framework
CN108052863A (en) Electrical energy power quality disturbance recognition methods based on the maximum variance method of development
CN103336832A (en) Video classifier construction method based on quality metadata
CN108615532A (en) A kind of sorting technique and device applied to sound field scape
CN103077228B (en) A kind of Fast Speed Clustering based on set feature vector and device
CN105678244A (en) Approximate video retrieval method based on improvement of editing distance
CN104036296A (en) Method and device for representing and processing image
Wang et al. Multi-task Joint Sparse Representation Classification Based on Fisher Discrimination Dictionary Learning.
CN103870840A (en) Improved latent Dirichlet allocation-based natural image classification method
Cheng et al. Multiplicity of nontrivial solutions for Kirchhoff type problems
Genussov et al. Musical genre classification of audio signals using geometric methods
Zhang et al. Discretizing numerical attributes in decision tree for big data analysis
CN107908807B (en) Small subsample reliability evaluation method based on Bayesian theory
CN109409644A (en) A kind of student performance analysis method based on improved C4.5 algorithm
CN104731811A (en) Cluster information evolution analysis method for large-scale dynamic short texts
CN104835174A (en) Robustness model fitting method based on supermap mode search
CN104753075A (en) Identifying method and device of leading oscillating mode of interconnected electric power system
CN102270210A (en) MP3 audio attribute discretization method based on heterogeneity rule
CN104167211A (en) Multi-source scene sound abstracting method based on hierarchical event detection and context model

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20111207