CN101546556B - Classification system for identifying audio content - Google Patents

Classification system for identifying audio content

Info

Publication number
CN101546556B
Authority
CN
China
Prior art keywords
module
audio
transient state
frame
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN2008100353510A
Other languages
Chinese (zh)
Other versions
CN101546556A (en)
Inventor
黄鹤云
林福辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Spreadtrum Communications Shanghai Co Ltd
Original Assignee
Spreadtrum Communications Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Spreadtrum Communications Shanghai Co Ltd filed Critical Spreadtrum Communications Shanghai Co Ltd
Priority to CN2008100353510A priority Critical patent/CN101546556B/en
Publication of CN101546556A publication Critical patent/CN101546556A/en
Application granted granted Critical
Publication of CN101546556B publication Critical patent/CN101546556B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides an audio content classification system comprising a training end and a test end. The training end extracts features from audio samples through an audio feature extraction module and trains the classifier parameters through a classifier training module. The test end comprises the audio feature extraction module shared with the training end, a classifier decision module, a transient feature extraction module, a transient feature smoothing module and an incremental learning module. The audio feature extraction module extracts the audio features of the input signal. The classifier decision module takes the audio features output by the audio feature extraction module as input and classifies the first frame using the classifier parameters obtained by the training end. At the same time, the transient feature extraction module extracts the transient features of the input signal and outputs them to the transient feature smoothing module, which corrects and outputs the result of the classifier decision module. The incremental learning module then uses the class information and feature information of the classified audio frames as a group of incremental learning samples to update the classifier parameters.

Description

Classification system for audio content identification
Technical field
The present invention relates to pattern recognition and signal processing technology, and in particular to a classification system for audio content identification.
Background art
Audio is an important medium in multimedia, and audio information retrieval is an important part of multimedia information retrieval. For the corresponding prior art, see Chinese patents Nos. 1391211, 1223739 and 1270361 and United States patents Nos. 5,613,037, 6,292,776 and 5,440,662. In audio retrieval applications the audio data must be classified, that is, the system must determine which class the input audio signal belongs to. Common audio classes include speech, background noise, pop music and classical music. Audio content classification is widely used: it plays a decisive role in audio retrieval, and in the extraction of some multimedia summaries it also plays an important role as an auxiliary means of video content retrieval. More broadly, many speech and audio standards, such as 3GPP AMR-WB and AMR-WB+, use speech/noise classifiers and speech/music classifiers to tell the encoder what kind of audio signal the input is, so that a different coding scheme can be applied to each kind of signal. Designing a good audio content classification method is therefore important.
A typical classification method uses two indispensable modules. The first is an audio feature extraction module, whose function is to extract information reflecting the kind of audio content from the input audio samples. The second is a classifier, which uses this information to decide the class. Among the many features of audio content, time-domain features (for example zero-crossing rate, curvature and linear prediction coefficients), frequency-domain features (Mel cepstral coefficients, Fourier transform coefficients, wavelet transform coefficients and so on) and some nonlinear features (fractal and chaos parameters and so on) have proved very effective for classification. Many classifiers are widely used in the field of audio content classification. The decision tree and the k-nearest neighbor (KNN) method are two classifiers that are relatively easy to implement and understand, and both have achieved good results in classifying speech, environmental noise and music. In the AMR-WB+ standard, the speech/music classifier also uses a decision tree. The support vector machine (SVM) classifier, adopted in recent years in many areas of machine learning and pattern recognition, has also proved very effective. Several other classical classifiers, such as the back-propagation neural network, the artificial neural network and clustering methods, have also proved effective for audio content classification.
In existing classification systems, however, the classifier parameters are fixed and cannot be updated in time, and the acoustic characteristics of sudden events cannot be handled effectively. Such systems therefore cannot meet the requirements of specific environments such as security monitoring.
Summary of the invention
The technical problem to be solved by the present invention is to provide an audio content classification system that overcomes two defects of existing classifiers: their parameters cannot be updated, and the acoustic characteristics of sudden events cannot be handled effectively.
To solve the above problem, an audio content classification system according to the present invention comprises a training end and a test end. The training end comprises an audio feature extraction module and a classifier training module: the audio feature extraction module extracts the features of the audio signal, and the classifier training module trains the classifier parameters from the audio features collected by the audio feature extraction module and the class information of the audio signal. The test end comprises the audio feature extraction module shared with the training end, a classifier decision module, a transient feature extraction module, a transient feature smoothing module and an incremental learning module. The audio feature extraction module extracts the audio features of the input signal. The classifier decision module takes the audio features output by the audio feature extraction module as input and classifies the first frame using the classifier parameters obtained by the training end. At the same time, the transient feature extraction module extracts the transient features of the input signal and outputs them to the transient feature smoothing module, which corrects and outputs the result of the classifier decision module. The incremental learning module uses the class information and feature information of the classified audio frames, as corrected and output by the transient feature smoothing module, as a group of incremental learning samples to update the classifier parameters.
According to the above main feature, the transient feature extraction module extracts the transient feature of the current frame and makes a judgment, and the transient feature smoothing module applies a different smoothing method depending on the transient feature: when the current frame is judged to be a transient frame, the second smoothing method is used; otherwise the first smoothing method is used. The first smoothing method is independent of the transient feature: it first analyzes the preceding three frames and, if their classification results are "non-sudden-event frame, sudden-event frame, non-sudden-event frame", smooths all three to non-sudden-event frames. The second smoothing method depends on the transient feature: when the transient feature is greater than a threshold, then starting from this frame, the three preceding frames and the three following frames are all set to sudden-event frames.
According to the above main feature, the classifier parameters are updated by combining the previously stored training data with the incremental learning samples into a larger training set and retraining the classifier.
According to the above main feature, the classifier further comprises a feature fusion module or a feature dimensionality reduction module.
According to the above main feature, the classifier uses a decision tree method.
According to the above main feature, the classifier uses a neural network method.
According to the above main feature, the classifier uses a support vector machine method.
According to the above main feature, the classifier uses a clustering method.
According to the above main feature, the classifier uses a Bayesian method.
Compared with the prior art, the present invention adopts incremental learning and transient feature smoothing techniques, which improves classification accuracy.
Description of drawings
Fig. 1 is a block diagram of the training end of an embodiment of the invention.
Fig. 2 is a block diagram of the test end of an embodiment of the invention.
Embodiment
Specific embodiments of the invention are described below with reference to the accompanying drawings.
Audio is an important medium in multimedia, and audio information retrieval is an important part of multimedia information retrieval. In audio retrieval applications the audio data must be classified, that is, the system must determine which class the input audio signal belongs to. Common audio classes include speech, background noise, pop music and classical music. Audio content classification is widely used: it plays a decisive role in audio retrieval, and in the extraction of some multimedia summaries it also plays an important role as an auxiliary means of video content retrieval. More broadly, many speech and audio standards, such as 3GPP AMR-WB and AMR-WB+, use speech/noise classifiers and speech/music classifiers to tell the encoder what kind of audio signal the input is, so that a different coding scheme can be applied to each kind of signal. Designing a good audio content classification method is therefore important.
A typical classification method uses two indispensable modules. The first is an audio feature extraction module, whose function is to extract information reflecting the kind of audio content from the input audio samples. The second is a classifier, which uses this information to decide the class. Among the many features of audio content, time-domain features (for example zero-crossing rate, curvature and linear prediction coefficients), frequency-domain features (Mel cepstral coefficients, Fourier transform coefficients, wavelet transform coefficients and so on) and some nonlinear features (fractal and chaos parameters and so on) have proved very effective for classification. Many classifiers are widely used in the field of audio content classification. The decision tree and the k-nearest neighbor (KNN) method are two classifiers that are relatively easy to implement and understand, and both have achieved good results in classifying speech, environmental noise and music. In the AMR-WB+ standard, the speech/music classifier also uses a decision tree. The support vector machine (SVM) classifier, adopted in recent years in many areas of machine learning and pattern recognition, has also proved very effective. Several other classical classifiers, such as the back-propagation neural network, the artificial neural network and clustering methods, have also proved effective for audio content classification.
In existing classification systems, however, the classifier parameters are fixed and cannot be updated in time, and the acoustic characteristics of sudden events cannot be handled effectively, so such systems cannot meet the requirements of specific environments such as security monitoring. The invention therefore provides an audio content classification system that overcomes these defects of existing classifiers.
Fig. 1 shows the block diagram of the training end of an embodiment of the invention. The training end comprises two modules: an audio feature extraction module and a classifier training module. In the present invention all audio signal processing is performed frame by frame. Suppose each frame of the audio signal read in is denoted $x_1, x_2, \ldots, x_N$; after processing by the feature extraction module, an M-dimensional feature vector $(F_1, F_2, \ldots, F_M)$ is obtained, that is:
$$x_1, x_2, \ldots, x_N \xrightarrow{\text{FeatureExtraction}} F_1, F_2, \ldots, F_M$$
In this embodiment the zero-crossing rate (ZCR) of the signal is used as a feature; it is calculated as follows:
$$F_1 = \mathrm{ZCR} = \sum_{i=1}^{N-1} \operatorname{sgn}(x_i x_{i+1})$$
where $\operatorname{sgn}(x)$ is the sign function: it equals 1 if $x$ is greater than zero, -1 if $x$ is less than zero, and 0 if $x$ equals zero.
The total energy of the signal may of course also be used as a feature; it is calculated by the following formula:
$$F_2 = \mathrm{TE} = \sum_{i=1}^{N} x_i^2$$
Once the features have been obtained, audio feature extraction is complete and the features are passed to the classifier training module. The role of the classifier training module is to train the classifier parameters from the features $(F_1, F_2, \ldots, F_M)$ and the class information of the frame, for use by the test end. Common classifier implementations include decision tree, neural network, support vector machine, clustering and Bayesian methods.
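The two per-frame features above can be sketched in a few lines of Python. This is a minimal illustration only: the function names are hypothetical, and the zero-crossing feature follows the formula exactly as written in the text, so each sign change between adjacent samples contributes -1 and a lower value means more zero crossings.

```python
from typing import Sequence, Tuple


def sgn(x: float) -> int:
    """Sign function: 1 if x > 0, -1 if x < 0, 0 if x == 0."""
    return (x > 0) - (x < 0)


def zcr_feature(frame: Sequence[float]) -> int:
    """F1 as written in the text: sum of sgn(x_i * x_{i+1}) over adjacent samples."""
    return sum(sgn(a * b) for a, b in zip(frame, frame[1:]))


def total_energy(frame: Sequence[float]) -> float:
    """F2: sum of the squared samples of the frame."""
    return sum(x * x for x in frame)


def extract_features(frame: Sequence[float]) -> Tuple[int, float]:
    """Return the two-dimensional feature vector (F1, F2) for one frame."""
    return (zcr_feature(frame), total_energy(frame))
```

For example, `extract_features([1.0, -1.0, 1.0, -1.0])` has three sign changes and unit-amplitude samples, so it yields `(-3, 4.0)`.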
Fig. 2 shows the block diagram of the test end of an embodiment of the invention. The test end comprises the audio feature extraction module shared with the training end, a classifier decision module, a transient feature extraction module, a transient feature smoothing module and an incremental learning module. The classifier decision module takes the audio features output by the audio feature extraction module as input. It classifies the first frame using the classifier obtained by the training end, and classifies every frame from the second frame onward using the classifier updated by incremental learning (described in detail later). Implementations may include decision tree, neural network, support vector machine, clustering and Bayesian methods. While the audio feature extraction module extracts audio features from the input audio frame, the transient feature extraction module extracts the transient feature of the same frame and outputs it to the transient feature smoothing module, which corrects the output of the classifier decision module. The transient feature indicates whether the energy of the time-domain samples rises significantly. Different smoothing methods are applied depending on the transient feature: when the current frame is judged to be a transient frame, the second smoothing method is used; otherwise the first smoothing method is used. The first smoothing method is independent of the transient feature, whereas the second smoothing method depends on it.
In one implementation of transient feature extraction, the input audio frame is divided into M segments $B_l$, $l = 1, 2, \ldots, 32$, where:
$$B_l = \{x_{N_l+1}, x_{N_l+2}, \ldots, x_{N_l+32}\}, \quad N_l = \frac{lN}{64}, \quad l = 1, 2, \ldots, 64;$$
Adjacent segments therefore overlap by half. The amplitude sum of each segment, i.e. the sum of the absolute values of its samples, is then computed:
$$M_i = \frac{1}{32} \sum_{n \in B_i} |x_n|, \quad i = 1, 2, \ldots, 64;$$
Next, the energy ratio and the amplitude-to-energy ratio of each segment relative to the preceding segments are computed:
$$r_l^1 = \frac{E_l}{\min(E_{l-1}, E_{l-2})}, \quad r_l^2 = \frac{\max_{x_i \in B_l} x_i^2}{E_{l-1}}, \quad l \in S, \quad \text{where } E_l = \sum_{n \in B_l} x_n^2$$
The maximum energy ratio and the maximum amplitude-to-energy ratio are then computed:
$$F_i = \max_l \left( \log r_l^i \right), \quad i = 1, 2,$$
The transient feature can therefore be calculated as:
$$F = 0.45 F_1 + 0.55 F_2$$
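The transient feature computation can be sketched as follows. This is an illustrative reading rather than the patented implementation, and it rests on several assumptions the text leaves open: segments are taken with a hop of N/64 samples and a length of two hops (which gives the stated half overlap), the index set S is taken to be every segment that has two predecessors, the logarithm is natural, and a small constant `EPS` is added to avoid division by zero and log(0). All function and parameter names are hypothetical.

```python
import math
from typing import List, Sequence

EPS = 1e-12  # assumption: guards against division by zero and log(0)


def split_segments(frame: Sequence[float], num_hops: int = 64) -> List[Sequence[float]]:
    """Split a frame into half-overlapping segments (hop = N / num_hops, length = 2 hops)."""
    if len(frame) < num_hops:
        raise ValueError("frame is too short for the requested number of hops")
    hop = len(frame) // num_hops
    seg_len = 2 * hop
    return [frame[s:s + seg_len] for s in range(0, len(frame) - seg_len + 1, hop)]


def transient_feature(frame: Sequence[float], w1: float = 0.45, w2: float = 0.55) -> float:
    """F = w1 * max_l log(r_l^1) + w2 * max_l log(r_l^2), per the formulas in the text."""
    segs = split_segments(frame)
    energy = [sum(x * x for x in seg) for seg in segs]   # E_l
    peak = [max(x * x for x in seg) for seg in segs]     # max of x_i^2 within B_l
    f1 = f2 = -math.inf
    for l in range(2, len(segs)):  # assumption: S = segments with two predecessors
        r1 = energy[l] / (min(energy[l - 1], energy[l - 2]) + EPS)
        r2 = peak[l] / (energy[l - 1] + EPS)
        f1 = max(f1, math.log(r1 + EPS))
        f2 = max(f2, math.log(r2 + EPS))
    return w1 * f1 + w2 * f2
```

A steady low-level frame gives a small (here negative) value of F, while a frame in which the amplitude jumps abruptly partway through gives a much larger value, which is what the threshold test on F relies on.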
After the transient feature is obtained, it is used to decide which smoothing method to start. The transient feature may be one-dimensional or higher-dimensional, and the judgment output takes at least two values, indicating whether the frame is a transient frame or a non-transient frame. In one implementation, F is compared with a first threshold: if F is greater, the frame is a transient frame and the second smoothing method is applied to the classification result; otherwise the first smoothing method is applied. One implementation of the first smoothing method (used when the current frame is a non-transient frame) first analyzes the preceding three frames; if their classification results are "non-sudden-event frame, sudden-event frame, non-sudden-event frame", all three are smoothed to non-sudden-event frames. In one implementation of the second smoothing method, when the feature F is greater than a second threshold (usually larger than the first threshold), then starting from this frame, the three preceding frames and the three following frames are all set to sudden-event frames.
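The two smoothing methods can be sketched as a single pass over a sequence of per-frame decisions. This is a simplified offline illustration, assuming that labels are booleans (`True` for a sudden-event frame), that the whole label sequence is available at once, and that the second method also marks the current frame itself; the function name and threshold parameters are hypothetical.

```python
from typing import List, Sequence


def smooth_labels(labels: Sequence[bool], transient: Sequence[float],
                  thr1: float, thr2: float) -> List[bool]:
    """Correct per-frame event labels using the per-frame transient feature F."""
    out = list(labels)
    n = len(out)
    for t in range(n):
        if transient[t] > thr1:
            # Transient frame: second smoothing method.
            if transient[t] > thr2:
                # Mark three frames before, the frame itself (assumption)
                # and three frames after as sudden-event frames.
                for k in range(max(0, t - 3), min(n, t + 4)):
                    out[k] = True
        elif t >= 3 and out[t - 3:t] == [False, True, False]:
            # Non-transient frame: first smoothing method removes an
            # isolated event decision among the preceding three frames.
            out[t - 3:t] = [False, False, False]
    return out
```

With this sketch, an isolated event label that is not backed by a transient is removed, while a strong transient spreads the event label over the surrounding frames.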
The incremental learning module uses the class information and feature information of the classified audio frames as a group of incremental learning samples to update the classifier parameters. In one implementation, the previously stored training data and the incremental learning samples are combined into a larger training set and the classifier is retrained, thereby updating the classifier parameters.
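The retraining form of incremental learning described above can be sketched as follows. The nearest-centroid classifier here is only a stand-in chosen to keep the example short; the text itself names decision tree, neural network, support vector machine, clustering and Bayesian classifiers, and all class and function names below are hypothetical.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], Hashable]  # (feature vector, class label)


def train_centroids(samples: Sequence[Sample]) -> Dict[Hashable, List[float]]:
    """Train the stand-in classifier: one mean feature vector per class."""
    sums: Dict[Hashable, List[float]] = {}
    counts: Dict[Hashable, int] = {}
    for feats, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(feats)
            counts[label] = 0
        for j, v in enumerate(feats):
            sums[label][j] += v
        counts[label] += 1
    return {lab: [s / counts[lab] for s in vec] for lab, vec in sums.items()}


class IncrementalClassifier:
    """Keeps the stored training data and retrains whenever a new sample arrives."""

    def __init__(self, training_samples: Sequence[Sample]) -> None:
        self.samples: List[Sample] = list(training_samples)
        self.centroids = train_centroids(self.samples)

    def predict(self, feats: Sequence[float]) -> Hashable:
        """Return the class whose centroid is nearest in squared Euclidean distance."""
        def dist2(label: Hashable) -> float:
            return sum((a - b) ** 2 for a, b in zip(self.centroids[label], feats))
        return min(self.centroids, key=dist2)

    def update(self, feats: Sequence[float], label: Hashable) -> None:
        """Add one classified (and smoothed) frame, then retrain on the larger set."""
        self.samples.append((feats, label))
        self.centroids = train_centroids(self.samples)
```

In use, each frame's features and its corrected class label would be passed to `update`, so that the next frame is classified with the refreshed parameters. Retraining on the full set is the simplest option; it trades computation for simplicity compared with a true online update rule.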
Note that the above description covers only some preferred embodiments. In all of the classifiers described above, any one or several feature extraction algorithms may be used, and a feature fusion module or a feature dimensionality reduction module may be added. A preferred approach is to reduce the feature dimension with principal component analysis (PCA) after feature extraction and before the classification decision. Any classification method may be used in the classifier; variants include a support vector machine classifier or a neural network classifier. In addition, the transient feature extraction method may be any method, one variant being perceptual entropy, and it may extract one-dimensional or high-dimensional features. The output of the transient frame judgment may take two values or more, and the judgment may use any method, one variant being a support vector machine. The classification result smoothing algorithm may likewise be any method.
In addition, in all of the classifiers described above, the incremental learning module may use any incremental learning method.
It will be understood that a person of ordinary skill in the art may make equivalent substitutions or changes according to the technical solution and inventive concept of the present invention, and all such changes or substitutions fall within the scope of protection of the appended claims.

Claims (9)

1. An audio content classification system comprising a training end and a test end, characterized in that the training end comprises:
an audio feature extraction module for extracting the features of the audio signal;
a classifier training module, which trains the classifier parameters from the audio features collected by the audio feature extraction module and the class information of the audio signal;
and the test end comprises:
the audio feature extraction module shared with the training end;
a classifier decision module, which takes the audio features output by the audio feature extraction module as input and classifies the first frame using the classifier parameters obtained by the training end;
a transient feature extraction module, which extracts the transient feature of the input signal and outputs it to the transient feature smoothing module;
a transient feature smoothing module, which corrects and outputs the result of the classifier decision module;
an incremental learning module, which uses the class information and feature information of the classified audio frames, as corrected and output by the transient feature smoothing module, as a group of incremental learning samples to update the classifier parameters.
2. The audio content classification system of claim 1, characterized in that: the transient feature extraction module extracts the transient feature of the current frame and makes a judgment, and the transient feature smoothing module applies a different smoothing method depending on the transient feature; when the current frame is judged to be a transient frame, the second smoothing method is used, and otherwise the first smoothing method is used; the first smoothing method is independent of the transient feature: it first analyzes the preceding three frames and, if their classification results are "non-sudden-event frame, sudden-event frame, non-sudden-event frame", smooths all three to non-sudden-event frames; the second smoothing method depends on the transient feature: when the transient feature is greater than a threshold, then starting from this frame, the three preceding frames and the three following frames are all set to sudden-event frames.
3. The audio content classification system of claim 1, characterized in that: the classifier parameters are updated by combining the previously stored training data with the incremental learning samples into a larger training set and retraining the classifier.
4. The audio content classification system of claim 1, characterized in that: the classifier further comprises a feature fusion module or a feature dimensionality reduction module.
5. The audio content classification system of any one of claims 1 to 4, characterized in that: the classifier uses a decision tree method.
6. The audio content classification system of any one of claims 1 to 4, characterized in that: the classifier uses a neural network method.
7. The audio content classification system of any one of claims 1 to 4, characterized in that: the classifier uses a support vector machine method.
8. The audio content classification system of any one of claims 1 to 4, characterized in that: the classifier uses a clustering method.
9. The audio content classification system of any one of claims 1 to 4, characterized in that: the classifier uses a Bayesian method.
CN2008100353510A 2008-03-28 2008-03-28 Classification system for identifying audio content Active CN101546556B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2008100353510A CN101546556B (en) 2008-03-28 2008-03-28 Classification system for identifying audio content

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2008100353510A CN101546556B (en) 2008-03-28 2008-03-28 Classification system for identifying audio content

Publications (2)

Publication Number Publication Date
CN101546556A CN101546556A (en) 2009-09-30
CN101546556B true CN101546556B (en) 2011-03-23

Family

ID=41193649

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2008100353510A Active CN101546556B (en) 2008-03-28 2008-03-28 Classification system for identifying audio content

Country Status (1)

Country Link
CN (1) CN101546556B (en)

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103000172A (en) * 2011-09-09 2013-03-27 中兴通讯股份有限公司 Signal classification method and device
CN103251405B (en) * 2013-04-18 2015-04-08 深圳市科曼医疗设备有限公司 System for analyzing arrhythmia
CN103337248B (en) * 2013-05-17 2015-07-29 南京航空航天大学 A kind of airport noise event recognition based on time series kernel clustering
CN104347067B (en) * 2013-08-06 2017-04-12 华为技术有限公司 Audio signal classification method and device
CN103824557B (en) * 2014-02-19 2016-06-15 清华大学 A kind of audio detection sorting technique with custom feature
CN104731979A (en) * 2015-04-16 2015-06-24 广东欧珀移动通信有限公司 Method and device for storing all exclusive information resources of specific user
CN105788592A (en) * 2016-04-28 2016-07-20 乐视控股(北京)有限公司 Audio classification method and apparatus thereof
CN107154866B (en) * 2017-04-19 2018-10-30 腾讯科技(深圳)有限公司 Realize the method and system of service dynamic configuration
CN109147771B (en) * 2017-06-28 2021-07-06 广州视源电子科技股份有限公司 Audio segmentation method and system
US10311872B2 (en) * 2017-07-25 2019-06-04 Google Llc Utterance classifier
CN109389989B (en) * 2017-08-07 2021-11-30 苏州谦问万答吧教育科技有限公司 Sound mixing method, device, equipment and storage medium
US20190114543A1 (en) * 2017-10-12 2019-04-18 British Cayman Islands Intelligo Technology Inc. Local learning system in artificial intelligence device
US11335328B2 (en) * 2017-10-27 2022-05-17 Google Llc Unsupervised learning of semantic audio representations
CN107943865A (en) * 2017-11-10 2018-04-20 阿基米德(上海)传媒有限公司 It is a kind of to be suitable for more scenes, the audio classification labels method and system of polymorphic type
CN108388942A (en) * 2018-02-27 2018-08-10 四川云淞源科技有限公司 Information intelligent processing method based on big data
CN108875655A (en) * 2018-06-25 2018-11-23 鲁东大学 A kind of real-time target video tracing method and system based on multiple features
CN109166593B (en) * 2018-08-17 2021-03-16 腾讯音乐娱乐科技(深圳)有限公司 Audio data processing method, device and storage medium
CN111385688A (en) * 2018-12-29 2020-07-07 安克创新科技股份有限公司 Active noise reduction method, device and system based on deep learning
CN110132598B (en) * 2019-05-13 2020-10-09 中国矿业大学 Fault noise diagnosis algorithm for rolling bearing of rotating equipment
CN110910906A (en) * 2019-11-12 2020-03-24 国网山东省电力公司临沂供电公司 Audio endpoint detection and noise reduction method based on power intranet
CN111681674B (en) * 2020-06-01 2024-03-08 中国人民大学 Musical instrument type identification method and system based on naive Bayesian model
CN113920473B (en) * 2021-10-15 2022-07-29 宿迁硅基智能科技有限公司 Complete event determination method, storage medium and electronic device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1110070A (en) * 1993-05-26 1995-10-11 艾利森电话股份有限公司 Discriminating between stationary and non-stationary signals
US6292776B1 (en) * 1999-03-12 2001-09-18 Lucent Technologies Inc. Hierarchial subband linear predictive cepstral features for HMM-based speech recognition
CN1391211A (en) * 2001-04-20 2003-01-15 皇家菲利浦电子有限公司 Exercising method and system to distinguish parameters
CN1708787A (en) * 2002-10-30 2005-12-14 三星电子株式会社 Method for encoding digital audio using advanced psychoacoustic model and apparatus thereof
CN101021849A (en) * 2006-09-14 2007-08-22 浙江大学 Transmedia searching method based on content correlation

Also Published As

Publication number Publication date
CN101546556A (en) 2009-09-30

Bhavya et al. Deep Learning Approach for Sound Signal Processing
CN113539298A (en) Sound big data analysis calculates imaging system based on cloud limit end
Xie et al. Image processing and classification procedure for the analysis of australian frog vocalisations

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20180410

Address after: 300456 1-1-1802-7, North Area of Financial and Trade Center, No. 6865, Asia Road, Tianjin Pilot Free Trade Zone (Dongjiang Bonded Port Area)

Patentee after: Xinji Lease (Tianjin) Co.,Ltd.

Address before: 201203 Center Building 1, Lane 2288, Zuchongzhi Road, Zhangjiang, Pudong, Shanghai

Patentee before: SPREADTRUM COMMUNICATIONS (SHANGHAI) Co.,Ltd.

EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20090930

Assignee: SPREADTRUM COMMUNICATIONS (SHANGHAI) Co.,Ltd.

Assignor: Xinji Lease (Tianjin) Co.,Ltd.

Contract record no.: 2018990000196

Denomination of invention: Classification system for identifying audio content

Granted publication date: 20110323

License type: Exclusive License

Record date: 20180801

TR01 Transfer of patent right

Effective date of registration: 20221018

Address after: 201203 Spreadtrum Center Building 1, Lane 2288, Zuchongzhi Road, Zhangjiang Hi-Tech Park, Pudong New Area, Shanghai

Patentee after: SPREADTRUM COMMUNICATIONS (SHANGHAI) Co.,Ltd.

Address before: 300456 1-1-1802-7, North Area of Financial and Trade Center, No. 6865, Asia Road, Tianjin Pilot Free Trade Zone (Dongjiang Bonded Port Area)

Patentee before: Xinji Lease (Tianjin) Co.,Ltd.