CN102623007A - Audio characteristic classification method based on variable duration - Google Patents


Info

Publication number
CN102623007A
Authority
CN
China
Prior art keywords
vector
short
time characteristic
training sequence
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2011100334102A
Other languages
Chinese (zh)
Other versions
CN102623007B (en)
Inventor
卢敏
窦维蓓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN201110033410.2A priority Critical patent/CN102623007B/en
Publication of CN102623007A publication Critical patent/CN102623007A/en
Application granted granted Critical
Publication of CN102623007B publication Critical patent/CN102623007B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention discloses a variable-duration audio feature classification method in the technical field of multimedia signal processing and pattern recognition. The method comprises the following steps: taking a labeled audio sequence of known type as a training sequence; extracting short-time features of the audio signal in the training sequence to form a short-time feature vector; calculating statistical parameters of each short-time feature over a set duration to obtain the statistical feature vector corresponding to the short-time feature vector; calculating a group of statistical feature vectors corresponding to the short-time feature vector, and forming the long-time feature vector of the training sequence from this group; training a classifier with the long-time feature vector of the training sequence; extracting the short-time features of the i-th frame of the audio signal in a test sequence and calculating the input long-time feature vector of the i-th frame of the test sequence; and feeding the input long-time feature vector of the i-th frame into the trained classifier to obtain its class. The method avoids the delay caused by long-time feature extraction and achieves real-time classification of audio features.

Description

Audio feature classification method based on variable duration
Technical field
The invention belongs to the technical field of multimedia signal processing and pattern recognition, and in particular relates to an audio feature classification method based on variable duration.
Background
With the continuous development of communication technology, digital audio processing has come into wide use in fields such as mobile communication, the Internet, broadcasting and personal electronics. Audio coding, for example, has gradually expanded from traditional speech coding, which mainly targets narrowband speech, to multimedia audio coding with wider bandwidth and higher quality. The rise of 3G and LTE places further demands on the new generation of audio codecs in terms of channel adaptability, transmission reliability and coding quality. Whether in audio coding or in sound-effect editing and production, the diversity of audio signals means that different types of audio signal may call for different processing techniques. For example, ITU-T G.718 and G.729.1 divide the audio signal into two coding modes, speech and music, and the later G.718-SWB adds a coding mode for audio signals with sinusoidal characteristics. In some application scenarios it is therefore necessary first to classify the audio signal simply and efficiently to determine its type.
For classification, short-time features and long-time features of the audio signal are extracted. Because audio signals are stationary only over short intervals, long-time features are usually more stable and more discriminative than short-time features, but they have the drawback of a large detection delay, which limits their use in real-time classification systems. In addition, different features may exhibit different stationary periods, so long-time features that are all computed over the same fixed duration may not be optimal.
Summary of the invention
The object of the invention is to address the problem that commonly used audio feature classification methods, which mainly rely on extracting long-time features, impair real-time performance. A variable-duration audio feature classification method is proposed, in which a classifier is trained on variable-duration long-time features formed from the same statistical parameters of the same short-time features computed over different durations, and the trained classifier is then used to classify audio features.
The technical scheme of the invention is an audio feature classification method based on variable duration, characterized in that the method comprises the following steps:
Step 1: take a labeled audio sequence of known type as the training sequence;
Step 2: extract the short-time features F_1, F_2, ..., F_K of the audio signal in the training sequence to form the short-time feature vector, where K is the number of components of the short-time feature vector;
Step 3: for each short-time feature F_k, 1 ≤ k ≤ K, calculate the statistical parameters of that feature over the current frame and the preceding (n-1) frames within a set duration, where n is the total number of frames in the set duration; each short-time feature F_k thus corresponds to a statistical feature vector made up of its statistical parameters, and the short-time feature vector in turn corresponds to a statistical feature vector made up of the statistical feature vectors of all K short-time features;
Step 4: choose P values N_1, N_2, ..., N_P satisfying N_1 < N_2 < ... < N_P, set n equal to N_1, N_2, ..., N_P in turn, and calculate according to step 3 the group of P statistical feature vectors corresponding to the short-time feature vector; this group of statistical feature vectors forms the long-time feature vector of the training sequence;
Step 5: train a classifier with the long-time feature vector of the training sequence;
Step 6: extract the short-time features of the audio signal in the test sequence and, following the methods of steps 2 and 3, calculate the statistical feature vector of the i-th frame of the test sequence and the quantities of the test sequence given in Figures BDA0000046240470000033 and BDA0000046240470000034;
Step 7: from the statistical feature vector of the i-th frame of the test sequence and the quantities of the test sequence given in Figures BDA0000046240470000036 and BDA0000046240470000037, calculate the input long-time feature vector of the i-th frame of the test sequence;
Step 8: feed the input long-time feature vector of the i-th frame into the classifier trained in step 5; its output is the class of the i-th frame.
The short-time features comprise log energy, zero-crossing rate and uniform sub-band energy distribution.
The statistical parameters of the short-time feature over the current frame and the preceding (n-1) frames comprise one or more of the maximum MaxF_k(n), the minimum MinF_k(n), the arithmetic mean AvgF_k(n) and the variance VarF_k(n) of that feature over those frames.
Training the classifier with the long-time feature vector of the training sequence specifically means training a single classifier with that long-time feature vector.
Alternatively, training the classifier with the long-time feature vector of the training sequence specifically means using a forward feature selection method to select effective features from the long-time feature vector of the training sequence to form an effective long-time feature vector, and training a single classifier with the effective long-time feature vector.
Alternatively, training the classifier with the long-time feature vector of the training sequence specifically means training a separate single classifier of the same type on each sub-vector of the long-time feature vector of the training sequence and connecting these classifiers in parallel to form a classifier group.
The input long-time feature vector of the i-th frame of the test sequence is calculated specifically using a formula in which q = 1, 2, ..., P-1; Figure BDA0000046240470000043 contains q instances of Figure BDA0000046240470000044 in total, and Figure BDA0000046240470000045 contains P-q instances of Figure BDA0000046240470000046 in total.
The single classifier is an independent-feature classifier based on the normal distribution.
The invention trains a classifier on variable-duration long-time features formed from the same statistical parameters of the same short-time features computed over different durations, and uses the trained classifier to classify audio features. This avoids the delay caused by extracting long-time features and achieves real-time classification of audio features.
Description of drawings
Fig. 1 is a flowchart of the audio feature classification method based on variable duration;
Fig. 2 is a schematic diagram of training a single classifier with the long-time feature vector of the training sequence;
Fig. 3 is a schematic diagram of training a single classifier with the effective long-time feature vector formed from the effective features of the long-time feature vector of the training sequence;
Fig. 4 is a schematic diagram of a classifier group formed by connecting in parallel single classifiers of the same type, each trained on one sub-vector of the long-time feature vector of the training sequence;
Fig. 5 is the training sample database information table;
Fig. 6 is the test sample database information table;
Fig. 7 is the classifier performance comparison table.
Embodiment
The preferred embodiment is described in detail below with reference to the accompanying drawings. It should be emphasized that the following description is merely exemplary and is not intended to limit the scope of the invention or its applications.
The invention is described using the classification of speech/music signals at a 32 kHz sampling rate as an example. The invention is equally applicable to classifying other types of audio signal.
Fig. 1 is a flowchart of the audio feature classification method based on variable duration. As shown in Fig. 1, the method comprises the following steps:
Step 1: take a labeled audio sequence of known type as the training sequence.
Step 2: extract the short-time features F_1, F_2, ..., F_K of the audio signal in the training sequence to form the short-time feature vector, where K is the number of components of the short-time feature vector.
In this embodiment the audio signal is divided into frames of 40 ms each, and the short-time features calculated comprise log energy, zero-crossing rate and uniform sub-band energy distribution. In the invention, the short-time features include but are not limited to log energy, zero-crossing rate and uniform sub-band energy distribution.
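As a rough illustration of the framing described above, the following Python sketch splits a signal sampled at 32 kHz into 40 ms frames, which gives L = 1280 samples per frame. The helper name `split_into_frames`, the use of NumPy, the non-overlapping layout and the dropping of a trailing partial frame are all assumptions made for illustration; the source does not specify them.

```python
import numpy as np

SAMPLE_RATE_HZ = 32_000   # sampling rate used in the embodiment
FRAME_MS = 40             # frame duration used in the embodiment


def split_into_frames(x, sample_rate_hz=SAMPLE_RATE_HZ, frame_ms=FRAME_MS):
    """Split a 1-D signal into non-overlapping frames (hypothetical helper).

    A trailing partial frame is dropped; this is an assumption, since the
    source does not say how the end of a sequence is handled.
    """
    x = np.asarray(x, dtype=float)
    frame_len = sample_rate_hz * frame_ms // 1000   # L = 1280 at 32 kHz, 40 ms
    n_frames = len(x) // frame_len
    return x[: n_frames * frame_len].reshape(n_frames, frame_len)


# One second of audio yields 25 frames of 1280 samples each.
frames = split_into_frames(np.zeros(32_000))
```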
Let the audio samples of the i-th frame be x(n), n = (i-1)L, (i-1)L+1, ..., iL-1, where L is the frame length. The short-time features are computed as follows:
A. Log energy
$$E_1(i) = \sum_{n=(i-1)L}^{iL-1} x^2(n)$$
$$E_2(i) = \max(\log[E_1(i)], -10)$$
B. Zero-crossing rate
$$ZCR(i) = \sum_{n=(i-1)L}^{iL-1} \frac{\operatorname{sign}(x(n) - x(n-1)) + 1}{2}$$
where sign(x) is the sign function,
$$\operatorname{sign}(x) = \begin{cases} 1, & x > 0 \\ 0, & x = 0 \\ -1, & x < 0 \end{cases}$$
C. Uniform sub-band energy distribution
$$SubE(i,k) = \sum_{m=(k-1)L/(2K)}^{kL/(2K)-1} X(i,m), \quad k = 1, 2, \ldots, K$$
where X(i, m) is the magnitude spectrum of the FFT of the i-th frame of the audio signal:
$$X(i,m) = \left| \sum_{k=1}^{L} x((i-1)L + k - 1) \cdot \exp\left[-j \frac{2\pi}{L}(m-1)(k-1)\right] \right|, \quad m = 1, 2, \ldots, L$$
By the properties of the FFT of a real sequence, X(i, m) is even-symmetric about m = L/2+1, so only the first (L/2+1) values need to be kept. K is the number of uniform sub-bands; K = 16 in this embodiment.
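The three short-time features above can be sketched in Python with NumPy as follows. This is a minimal sketch under stated assumptions, not the patent's implementation: the function names are hypothetical, the natural logarithm is assumed because the source does not give the base, `prev_sample` stands in for x(n-1) at the first sample of a frame, and spectrum bins are indexed from 0 as NumPy does. The zero-crossing function transcribes the formula as printed above, which is not the more common sign-change count.

```python
import numpy as np


def log_energy(frame, floor=-10.0):
    """E2 = max(log(E1), floor), with E1 the sum of squared samples.

    The natural logarithm is an assumption; the source gives no base.
    """
    e1 = float(np.sum(np.square(frame)))
    return max(float(np.log(e1)), floor) if e1 > 0.0 else floor


def zero_crossing_rate(frame, prev_sample=0.0):
    """Sum of [sign(x(n) - x(n-1)) + 1] / 2 over the frame, as printed.

    prev_sample stands in for x(n-1) at the first sample of the frame.
    """
    x = np.concatenate(([prev_sample], np.asarray(frame, dtype=float)))
    return float(np.sum((np.sign(np.diff(x)) + 1.0) / 2.0))


def subband_energy(frame, n_bands=16):
    """Sum the FFT magnitude spectrum over n_bands equal-width bands."""
    frame = np.asarray(frame, dtype=float)
    half = len(frame) // 2                          # bins 0 .. L/2 - 1
    magnitude = np.abs(np.fft.rfft(frame))[:half]
    width = half // n_bands                         # L / (2K) bins per band
    return magnitude[: width * n_bands].reshape(n_bands, width).sum(axis=1)
```

For a 1280-sample frame and 16 bands, each band covers 40 spectrum bins, so one frame yields 1 + 1 + 16 = 18 values.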
When extracting audio features in this embodiment, the short-time feature vector of the i-th frame is
$$\vec{V}_s(i) = [E_2(i), ZCR(i), SubE(i,1), \ldots, SubE(i,16)]^T$$
Its dimension is 18. E_2(i), ZCR(i), SubE(i,1), ..., SubE(i,16) are the short-time features F_1, F_2, ..., F_18 of the i-th frame, respectively.
Step 3: for each short-time feature F_k, 1 ≤ k ≤ K, calculate the statistical parameters of that feature over the current frame and the preceding (n-1) frames within a set duration, where n is the total number of frames in the set duration. Each short-time feature F_k thus corresponds to a statistical feature vector made up of its statistical parameters, and the short-time feature vector in turn corresponds to a statistical feature vector made up of the statistical feature vectors of all K short-time features.
The statistical parameters of the short-time feature over the current frame and the preceding (n-1) frames comprise one or more of the maximum MaxF_k(n), the minimum MinF_k(n), the arithmetic mean AvgF_k(n) and the variance VarF_k(n) of that feature over those frames. In this embodiment the maximum and the variance are selected as the statistical parameters, so the statistical feature vector corresponding to each short-time feature F_k consists of MaxF_k(n) and VarF_k(n).
Since step 2 of this embodiment yields 18 short-time features and the statistical feature vector of each short-time feature has 2 components, the statistical feature vector corresponding to the short-time feature vector has 36 dimensions.
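The windowed statistics described above can be sketched as follows, assuming the per-frame short-time features are already stacked in a NumPy array with one row per frame. The function name `statistical_features`, the population (rather than sample) variance and the interleaved output ordering are assumptions made for illustration.

```python
import numpy as np


def statistical_features(short_feats, n):
    """Max and variance of each short-time feature over the last n frames.

    short_feats: array of shape (num_frames, K), one row per frame.
    Returns an array of shape (num_frames - n + 1, 2 * K); row j describes
    frame j + n - 1 together with the n - 1 frames before it.
    """
    short_feats = np.asarray(short_feats, dtype=float)
    num_frames, k = short_feats.shape
    if num_frames < n:
        return np.empty((0, 2 * k))
    # windows has shape (num_frames - n + 1, K, n)
    windows = np.lib.stride_tricks.sliding_window_view(short_feats, n, axis=0)
    max_f = windows.max(axis=-1)
    var_f = windows.var(axis=-1)          # population variance (assumption)
    # Interleave so each feature's (max, var) pair stays adjacent.
    return np.stack([max_f, var_f], axis=-1).reshape(len(windows), 2 * k)
```

With K = 18 short-time features, each output row has 36 entries, matching the dimension stated in the text.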
Step 4: choose P values N_1, N_2, ..., N_P satisfying N_1 < N_2 < ... < N_P, set n equal to N_1, N_2, ..., N_P in turn, and calculate according to step 3 the group of P statistical feature vectors corresponding to the short-time feature vector; this group of statistical feature vectors forms the long-time feature vector of the training sequence.
In this embodiment, P = 3, N_1 = 5, N_2 = 15 and N_3 = 25, which gives a group of 3 statistical feature vectors corresponding to the short-time feature vector of the i-th frame, each of dimension 36. The long-time feature vector of the training sequence formed from this group therefore has dimension 108.
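A minimal sketch of how the long-time feature vector for one training frame might be assembled from the three durations follows. At 40 ms per frame, windows of 5, 15 and 25 frames span 0.2 s, 0.6 s and 1.0 s. The helper names, the ordering of the entries within each 36-dimensional block and the requirement that every window be complete are assumptions; the source does not specify them.

```python
import numpy as np

DURATIONS = (5, 15, 25)   # N_1 < N_2 < N_3 from the embodiment, in frames


def window_stats(short_feats, end, n):
    """Max and variance of each feature over frames end-n+1 .. end (inclusive)."""
    window = short_feats[end - n + 1 : end + 1]
    return np.concatenate([window.max(axis=0), window.var(axis=0)])


def long_time_vector(short_feats, end, durations=DURATIONS):
    """Concatenate the statistics for every duration, all ending at frame `end`.

    Requires end >= max(durations) - 1 so that every window is complete
    (an assumption made for this training-side sketch).
    """
    if end < max(durations) - 1:
        raise ValueError("not enough history for the longest duration")
    return np.concatenate([window_stats(short_feats, end, n) for n in durations])
```

For 18 short-time features this returns 3 × 36 = 108 values per frame.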
Step 5: train a classifier with the long-time feature vector of the training sequence.
Once the long-time feature vector of the training sequence has been obtained, the classifier can be trained on it using known techniques.
Fig. 2 is a schematic diagram of training a single classifier with the long-time feature vector of the training sequence. As shown in Fig. 2, the classifier can be trained by using the long-time feature vector of the training sequence to train a single classifier directly.
Fig. 3 is a schematic diagram of training a single classifier with the effective long-time feature vector formed from the effective features of the long-time feature vector of the training sequence. As shown in Fig. 3, the classifier can also be trained using a forward feature selection method: effective features are selected from the long-time feature vector of the training sequence to form an effective long-time feature vector, which is then used to train a single classifier.
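Forward feature selection, as mentioned for Fig. 3, is commonly implemented as a greedy loop that repeatedly adds the feature giving the best score. The sketch below illustrates that general idea only. The source does not state the selection criterion, so `class_separation` is a toy two-class criterion assumed for illustration, and both function names are hypothetical.

```python
import numpy as np


def forward_select(features, labels, score_fn, n_select):
    """Greedy forward selection: add the feature that most improves score_fn."""
    selected = []
    remaining = list(range(features.shape[1]))
    while remaining and len(selected) < n_select:
        best = max(remaining,
                   key=lambda j: score_fn(features[:, selected + [j]], labels))
        selected.append(best)
        remaining.remove(best)
    return selected


def class_separation(subset, labels):
    """Toy criterion (assumption): distance between the two class means,
    scaled by the pooled standard deviation and summed over features."""
    a, b = subset[labels == 0], subset[labels == 1]
    spread = np.sqrt(0.5 * (a.var(axis=0) + b.var(axis=0))) + 1e-12
    return float(np.sum(np.abs(a.mean(axis=0) - b.mean(axis=0)) / spread))
```

In the embodiment, 36 of the 108 long-time features are kept, which would correspond to calling `forward_select` with `n_select=36`; in practice the score would more likely be a classification accuracy on held-out data.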
Fig. 4 is a schematic diagram of a classifier group formed by connecting in parallel single classifiers of the same type, each trained on one sub-vector of the long-time feature vector of the training sequence. As shown in Fig. 4, the classifier can also be trained by training a separate single classifier of the same type on each sub-vector of the long-time feature vector of the training sequence and connecting these classifiers in parallel to form a classifier group.
In this embodiment the single classifier is an independent-feature classifier based on the normal distribution; the invention is equally applicable to other classifiers. The classifiers are trained with the methods shown in Fig. 3 and Fig. 4. That is, the forward feature selection method is used to select 36 effective features from the 108 features of the long-time feature vector of the training sequence to form the effective long-time feature vector, which is used to train a single classifier. In addition, each sub-vector of the long-time feature vector is used separately as the classification feature vector to train an independent classifier of the same type.
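One plausible reading of an "independent-feature classifier based on the normal distribution" is a Gaussian naive Bayes classifier, in which each feature is modelled by a separate normal density per class. The sketch below follows that reading; the class name, the use of class priors estimated from the training labels and the small variance floor are assumptions, not details from the source.

```python
import numpy as np


class GaussianIndependentClassifier:
    """Per-class, per-feature normal densities with independence assumed."""

    def fit(self, features, labels):
        features = np.asarray(features, dtype=float)
        labels = np.asarray(labels)
        self.classes_ = np.unique(labels)
        self.mean_ = np.array(
            [features[labels == c].mean(axis=0) for c in self.classes_])
        # A small floor keeps every variance strictly positive (assumption).
        self.var_ = np.array(
            [features[labels == c].var(axis=0) for c in self.classes_]) + 1e-9
        self.log_prior_ = np.log(
            np.array([np.mean(labels == c) for c in self.classes_]))
        return self

    def predict(self, features):
        features = np.asarray(features, dtype=float)
        diff = features[:, None, :] - self.mean_[None, :, :]
        # Sum of per-feature Gaussian log-densities for every (sample, class).
        log_lik = -0.5 * np.sum(
            np.log(2.0 * np.pi * self.var_) + diff ** 2 / self.var_, axis=2)
        return self.classes_[np.argmax(log_lik + self.log_prior_, axis=1)]
```

For the classifier group of Fig. 4, one such classifier could be trained on each 36-dimensional sub-vector; how the parallel outputs are combined is not stated in the source.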
Step 6: extract the short-time features of the audio signal in the test sequence and, following the methods of steps 2 and 3, calculate the statistical feature vector of the i-th frame of the test sequence and the quantities of the test sequence given in Figures BDA00000462404700000814 and BDA00000462404700000815.
Step 7: from the statistical feature vector of the i-th frame of the test sequence and the quantities of the test sequence given in Figures BDA00000462404700000817 and BDA00000462404700000818, calculate the input long-time feature vector of the i-th frame of the test sequence.
The input long-time feature vector of the i-th frame of the test sequence is calculated specifically by the formula in Figure BDA0000046240470000091, where q = 1, 2, ..., P-1; Figure BDA0000046240470000092 contains q instances of Figure BDA0000046240470000093 in total, and Figure BDA0000046240470000094 contains P-q instances of Figure BDA0000046240470000095 in total.
Step 8: feed the input long-time feature vector of the i-th frame into the classifier trained in step 5; its output is the class of the i-th frame.
The training sample database and the test sample database in this embodiment both consist of speech sequences and music sequences, and the two databases are independent of each other. Fig. 5 is the training sample database information table, and Fig. 6 is the test sample database information table. The classifiers were tested on the test sample database described above, and the comparison of classifier performance is shown in Fig. 7. The test results in Fig. 7 show that the longer the duration of the long-time features, the higher the classification accuracy, but also the greater the delay in detecting a change of type. By contrast, the classifier trained according to the invention performs better in both classification accuracy and the promptness with which changes of audio type are detected, and is better suited to real-time music/speech classification systems.
The above is merely a preferred embodiment of the invention, but the scope of protection of the invention is not limited to it. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall fall within the scope of protection of the invention. The scope of protection of the invention shall therefore be determined by the scope of protection of the claims.

Claims (8)

1. An audio feature classification method based on variable duration, characterized in that the method comprises the following steps:
Step 1: take a labeled audio sequence of known type as the training sequence;
Step 2: extract the short-time features F_1, F_2, ..., F_K of the audio signal in the training sequence to form the short-time feature vector, where K is the number of components of the short-time feature vector;
Step 3: for each short-time feature F_k, 1 ≤ k ≤ K, calculate the statistical parameters of that feature over the current frame and the preceding (n-1) frames within a set duration, where n is the total number of frames in the set duration; each short-time feature F_k thus corresponds to a statistical feature vector made up of its statistical parameters, and the short-time feature vector in turn corresponds to a statistical feature vector made up of the statistical feature vectors of all K short-time features;
Step 4: choose P values N_1, N_2, ..., N_P satisfying N_1 < N_2 < ... < N_P, set n equal to N_1, N_2, ..., N_P in turn, and calculate according to step 3 the group of P statistical feature vectors corresponding to the short-time feature vector; this group of statistical feature vectors forms the long-time feature vector of the training sequence;
Step 5: train a classifier with the long-time feature vector of the training sequence;
Step 6: extract the short-time features of the audio signal in the test sequence and, following the methods of steps 2 and 3, calculate the statistical feature vector of the i-th frame of the test sequence and the quantities of the test sequence given in Figures FDA00000462404600000111 and FDA0000046240460000021;
Step 7: from the statistical feature vector of the i-th frame of the test sequence and the quantities of the test sequence given in Figures FDA0000046240460000023 and FDA0000046240460000024, calculate the input long-time feature vector of the i-th frame of the test sequence;
Step 8: feed the input long-time feature vector of the i-th frame into the classifier trained in step 5; its output is the class of the i-th frame.
2. The audio feature classification method based on variable duration according to claim 1, characterized in that the short-time features comprise log energy, zero-crossing rate and uniform sub-band energy distribution.
3. The audio feature classification method based on variable duration according to claim 1, characterized in that the statistical parameters of the short-time feature over the current frame and the preceding (n-1) frames comprise one or more of the maximum MaxF_k(n), the minimum MinF_k(n), the arithmetic mean AvgF_k(n) and the variance VarF_k(n) of that feature over those frames.
4. The audio feature classification method based on variable duration according to claim 1, characterized in that training the classifier with the long-time feature vector of the training sequence specifically means training a single classifier with the long-time feature vector of the training sequence.
5. The audio feature classification method based on variable duration according to claim 1, characterized in that training the classifier with the long-time feature vector of the training sequence specifically means using a forward feature selection method to select effective features from the long-time feature vector of the training sequence to form an effective long-time feature vector, and training a single classifier with the effective long-time feature vector.
6. The audio feature classification method based on variable duration according to claim 1, characterized in that training the classifier with the long-time feature vector of the training sequence specifically means training a separate single classifier of the same type on each sub-vector of the long-time feature vector of the training sequence and connecting the resulting classifiers in parallel to form a classifier group.
7. The audio feature classification method based on variable duration according to any one of claims 4-6, characterized in that the single classifier is an independent-feature classifier based on the normal distribution.
8. The audio feature classification method based on variable duration according to claim 1, characterized in that the input long-time feature vector of the i-th frame of the test sequence is calculated specifically by the formula in Figure FDA0000046240460000034, where q = 1, 2, ..., P-1; Figure FDA0000046240460000035 contains q instances of Figure FDA0000046240460000036 in total, and Figure FDA0000046240460000037 contains P-q instances in total.
CN201110033410.2A 2011-01-30 2011-01-30 Audio characteristic classification method based on variable duration Expired - Fee Related CN102623007B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110033410.2A CN102623007B (en) 2011-01-30 2011-01-30 Audio characteristic classification method based on variable duration

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201110033410.2A CN102623007B (en) 2011-01-30 2011-01-30 Audio characteristic classification method based on variable duration

Publications (2)

Publication Number Publication Date
CN102623007A true CN102623007A (en) 2012-08-01
CN102623007B CN102623007B (en) 2014-01-01

Family

ID=46562887

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110033410.2A Expired - Fee Related CN102623007B (en) 2011-01-30 2011-01-30 Audio characteristic classification method based on variable duration

Country Status (1)

Country Link
CN (1) CN102623007B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101067930A (en) * 2007-06-07 2007-11-07 深圳先进技术研究院 Intelligent audio frequency identifying system and identifying method
CN101398825A (en) * 2007-09-29 2009-04-01 三星电子株式会社 Rapid music assorting and searching method and device
CN101236742A (en) * 2008-03-03 2008-08-06 中兴通讯股份有限公司 Music/ non-music real-time detection method and device
CN101364408A (en) * 2008-10-07 2009-02-11 西安成峰科技有限公司 Sound image combined monitoring method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CYRIL JODER等: "Temporal Integration for Audio Classification With Application to Musical Instrument Classification", 《IEEE TRANSACTIONS ON AUDIO,SPEECH,AND LANGUAGE PROCESSING》, vol. 17, no. 1, 31 January 2009 (2009-01-31), pages 174 - 186, XP011241211, DOI: doi:10.1109/TASL.2008.2007613 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102968986A (en) * 2012-11-07 2013-03-13 华南理工大学 Overlapped voice and single voice distinguishing method based on long time characteristics and short time characteristics
CN102968986B (en) * 2012-11-07 2015-01-28 华南理工大学 Overlapped voice and single voice distinguishing method based on long time characteristics and short time characteristics
CN106328152A (en) * 2015-06-30 2017-01-11 芋头科技(杭州)有限公司 Automatic identification and monitoring system for indoor noise pollution
CN105654944A (en) * 2015-12-30 2016-06-08 中国科学院自动化研究所 Short-time and long-time feature modeling fusion-based environmental sound recognition method and device
CN105654944B (en) * 2015-12-30 2019-11-01 中国科学院自动化研究所 It is a kind of merged in short-term with it is long when feature modeling ambient sound recognition methods and device
CN110249320A (en) * 2017-04-28 2019-09-17 惠普发展公司有限责任合伙企业 Utilize the audio classification for using the machine learning model of audio duration to carry out
CN108305616A (en) * 2018-01-16 2018-07-20 国家计算机网络与信息安全管理中心 A kind of audio scene recognition method and device based on long feature extraction in short-term
CN113780180A (en) * 2021-09-13 2021-12-10 江苏环雅丽书智能科技有限公司 Audio long-time fingerprint extraction and matching method

Also Published As

Publication number Publication date
CN102623007B (en) 2014-01-01

Similar Documents

Publication Publication Date Title
CN102623007B (en) Audio characteristic classification method based on variable duration
CN1909060B (en) Method and apparatus for extracting voiced/unvoiced classification information
CN110827837A (en) Whale activity audio classification method based on deep learning
CN101599271A (en) A kind of recognition methods of digital music emotion
CN101527141B (en) Method of converting whispered voice into normal voice based on radial group neutral network
CN102272832B (en) Selective scaling mask computation based on peak detection
CN101577117B (en) Extracting method of accompaniment music and device
CN1215491A (en) Speech processing
CN106297770A (en) The natural environment sound identification method extracted based on time-frequency domain statistical nature
CN103854646A (en) Method for classifying digital audio automatically
CN101159834A (en) Method and system for detecting repeatable video and audio program fragment
CN102446504A (en) Voice/Music identifying method and equipment
CN101221766B (en) Method for switching audio encoder
CN103308919A (en) Fish identification method and system based on wavelet packet multi-scale information entropy
Lu et al. Self-supervised audio spatialization with correspondence classifier
CN107293306A (en) A kind of appraisal procedure of the Objective speech quality based on output
CN104732970A (en) Ship radiation noise recognition method based on comprehensive features
CN108615536A (en) Time-frequency combination feature musical instrument assessment of acoustics system and method based on microphone array
Taenzer et al. Investigating CNN-based Instrument Family Recognition for Western Classical Music Recordings.
CN102592589A (en) Speech scoring method and device implemented through dynamically normalizing digital characteristics
CN117095694A (en) Bird song recognition method based on tag hierarchical structure attribute relationship
CN104392716A (en) Method and device for synthesizing high-performance voices
CN102214219B (en) Audio/video content retrieval system and method
Shifas et al. A non-causal FFTNet architecture for speech enhancement
Zeinali et al. Acoustic scene classification using fusion of attentive convolutional neural networks for DCASE2019 challenge

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20140101

Termination date: 20180130
