CN108122562A - An audio classification method based on convolutional neural networks and random forest - Google Patents

An audio classification method based on convolutional neural networks and random forest

Info

Publication number
CN108122562A
Authority
CN
China
Prior art keywords
convolutional neural
neural networks
audio
spectrogram
random forest
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810037337.8A
Other languages
Chinese (zh)
Inventor
彭德中
付炜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan University filed Critical Sichuan University
Priority to CN201810037337.8A priority Critical patent/CN108122562A/en
Publication of CN108122562A publication Critical patent/CN108122562A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/54Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for retrieval
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • G10L25/30Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/45Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of analysis window

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an audio classification method based on convolutional neural networks and random forest. The method includes: S1: performing spectrum analysis on the original audio data set, including segmentation, framing, windowing, and Fourier transform, to obtain the spectrogram corresponding to each original audio file; S2: training a convolutional neural network feature extractor with the obtained spectrograms as input; S3: removing the softmax layer of the convolutional neural network and extracting the high-level features of the spectrograms; S4: training a random forest classifier on the extracted high-level features; S5: classifying audio with the trained random forest, based on the high-level features extracted by the convolutional neural network. The invention uses a convolutional neural network for feature extraction, which avoids the complicated process of constructing features by hand. To address the insufficient generalization ability that results from using softmax as the classifier of the convolutional neural network, the softmax layer is replaced by a random forest as the final classifier. Relatively high precision and recall were achieved in testing.

Description

An audio classification method based on convolutional neural networks and random forest
Technical field
The invention belongs to the field of machine learning and relates to an audio classification method based on convolutional neural networks and random forest.
Background
The growth of the Internet and of multimedia technology has filled our lives with large amounts of audio. Music websites in particular hold huge numbers of audio files in many different styles. Faced with this volume of audio, audio retrieval helps us find the required audio files quickly and accurately. Audio classification is a prerequisite for audio retrieval, but manually classifying a large number of audio files is time-consuming and laborious, and as listeners tire, the accuracy of manual classification also declines. For large numbers of audio files, fast and accurate automatic classification is therefore necessary. Audio classification methods have been studied extensively. One example is a two-stage method that combines hidden Markov models with support vector machines: a hidden Markov model first performs a preliminary classification of the audio and determines the two most probable classes, and the corresponding support vector machine classifier then makes the final decision. Another approach classifies audio according to the similarity of audio content, representing each audio file by its set of pitches and classifying with an LDA topic model. Other methods use Gaussian mixture models, decision trees, and similar models as classifiers. However, most of these methods construct features by hand in the traditional way, which is cumbersome and yields features that are not sufficiently expressive. They also rely on a single classifier, so the resulting model generalizes poorly.
In recent years deep learning has grown increasingly popular. By building models with multiple hidden layers and combining low-level features into more abstract high-level representations of attributes or features, deep learning can discover distributed feature representations of the data, and it works better than traditional hand-crafted features. Given this situation and the problems above, an audio classification method based on deep learning is needed.
Summary of the invention
The technical problem to be solved by the invention is to provide an audio classification method based on convolutional neural networks and random forest. The method extracts high-level features automatically with a convolutional neural network and uses a random forest to address the weak generalization ability of a single classifier, achieving relatively high precision and recall.
The technical solution of the invention is as follows:
An audio classification method based on convolutional neural networks and random forest comprises the following steps.
Step 1: Perform spectrum analysis on the original audio file to obtain its spectrogram. Audio files are often long, so applying spectrum analysis directly to the original audio produces an overly large spectrogram, and later model training then consumes a large amount of system resources. The original audio is therefore divided into suitable segments, and spectrum analysis is applied to each segment, including framing, windowing, and the short-time Fourier transform. Let $x(n)$ be a long sequence and $w(n)$ a window function of length $N$. Windowing $x(n)$ with $w(n)$ gives an $N$-point sequence $x_n(m)$, i.e.
$$x_n(m) = w(m)\,x(n+m), \quad 0 \le m \le N-1$$
In the frequency domain,
$$X_n(e^{j\omega}) = \sum_{m=0}^{N-1} x_n(m)\,e^{-j\omega m}$$
The short-time Fourier transform is given by
$$X_n(e^{j\omega}) = \sum_{m=-\infty}^{\infty} x(n+m)\,w(m)\,e^{-j\omega m}$$
where $x(n)$ is the original signal and $w(m)$ is the window function, with $w(m) = 0$ outside $0 \le m \le N-1$. Spectrum analysis thus yields the spectrogram of the audio.
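The framing, windowing, and Fourier transform of Step 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the patent does not name a window type, frame length, or hop size, so the Hann window and the values of `frame_len` and `hop` below are hypothetical, and the function names are invented for this sketch.

```python
import cmath
import math


def hann_window(length):
    """Return a Hann window of the given length (an assumed window choice)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * m / (length - 1))
            for m in range(length)]


def stft_magnitude(signal, frame_len, hop):
    """Frame the signal, window each frame, and take the DFT magnitude.

    Returns a list of frames; each frame holds frame_len // 2 + 1
    magnitudes (the non-negative frequency bins).
    """
    window = hann_window(frame_len)
    spectrogram = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        # Windowed frame: x_n(m) = w(m) * x(n + m), with n = start
        frame = [window[m] * signal[start + m] for m in range(frame_len)]
        bins = []
        for k in range(frame_len // 2 + 1):
            # DFT bin k samples the frequency omega = 2*pi*k / frame_len
            acc = sum(frame[m] * cmath.exp(-2j * math.pi * k * m / frame_len)
                      for m in range(frame_len))
            bins.append(abs(acc))
        spectrogram.append(bins)
    return spectrogram


# Toy input: a pure tone that falls exactly on bin 4 of a 32-point frame.
tone = [math.sin(2.0 * math.pi * 4 * n / 32) for n in range(128)]
spec = stft_magnitude(tone, frame_len=32, hop=16)
peak_bins = [max(range(len(frame)), key=frame.__getitem__) for frame in spec]
```

Each row of `spec` is one column of the spectrogram image. A practical implementation would use an FFT routine instead of the direct sum, which is written out here only to mirror the definition.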
Step 2: Use the spectrograms obtained in Step 1 as the training set to train an improved convolutional neural network. The network has 14 layers, including convolutional layers, down-sampling layers, a Dropout layer, a Flatten layer, fully connected layers, a Batch Normalization layer, and a softmax layer, and uses cross entropy as the loss function. The layers are as follows:
Input: spectrogram of size 248*248;
Layer1: convolutional layer, kernel size (5,5), 64 filters, strides=1, output feature map size (244,244);
Layer2: down-sampling layer, kernel size (2,2), output feature map size (122,122);
Layer3: convolutional layer, kernel size (3,3), 128 filters, strides=2, output feature map size (60,60);
Layer4: down-sampling layer, kernel size (2,2), output feature map size (30,30);
Layer5: convolutional layer, kernel size (3,3), 256 filters, strides=2, output feature map size (14,14);
Layer6: down-sampling layer, kernel size (2,2), output feature map size (7,7);
Layer7: convolutional layer, kernel size (2,2), 512 filters, strides=1, output feature map size (6,6);
Layer8: down-sampling layer, kernel size (2,2), output feature map size (3,3);
Layer9: Dropout layer, dropout=0.5; deactivates neurons with the given probability during training to prevent overfitting;
Layer10: Flatten layer; flattens the multi-dimensional data to one dimension as a transition to the fully connected layers;
Layer11: fully connected layer, 128 output neurons;
Layer12: Batch Normalization; normalizes the input signal while preserving the expressive power of the model;
Layer13: fully connected layer, 9 output neurons, because the data set used has 9 classes;
Layer14: softmax layer; outputs the final probability distribution as the classifier, each value being the probability of one class.
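The feature map sizes listed for Layer1 through Layer8 can be checked with the usual output-size formula for unpadded convolution and pooling, floor((size - kernel) / stride) + 1. The sketch below assumes no padding and a pooling stride equal to the pooling kernel size; the patent states neither, but those assumptions reproduce the listed sizes.

```python
def out_size(size, kernel, stride):
    """Output width of an unpadded convolution or pooling layer."""
    return (size - kernel) // stride + 1


# (name, kernel, stride) for the eight convolution and down-sampling layers.
# Pooling strides equal to the kernel size are an assumption.
LAYERS = [
    ("Layer1 conv", 5, 1),
    ("Layer2 pool", 2, 2),
    ("Layer3 conv", 3, 2),
    ("Layer4 pool", 2, 2),
    ("Layer5 conv", 3, 2),
    ("Layer6 pool", 2, 2),
    ("Layer7 conv", 2, 1),
    ("Layer8 pool", 2, 2),
]

size = 248  # input spectrogram is 248 x 248
sizes = []
for name, kernel, stride in LAYERS:
    size = out_size(size, kernel, stride)
    sizes.append(size)
    print(f"{name}: {size} x {size}")

# Layer10 flattens 3 x 3 maps from 512 filters into one vector.
flattened = size * size * 512
```

Under these assumptions the Flatten layer hands a 4608-value vector to the first fully connected layer.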
Step 3: Remove the softmax layer of the convolutional neural network trained in Step 2, and take the output of the last fully connected layer as the high-level features of the spectrogram.
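To make the role of the removed layer concrete, the sketch below shows what softmax does to the output of the last fully connected layer and what Step 3 keeps instead. The numbers in `fc_output` are hypothetical values invented for illustration, not outputs of the patented network.

```python
import math


def softmax(scores):
    """Convert a vector of scores into a probability distribution."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical output of the last fully connected layer for one spectrogram
# (9 values, one per class).
fc_output = [2.0, 0.5, -1.0, 0.0, 1.5, -0.5, 0.3, -2.0, 0.8]

probabilities = softmax(fc_output)  # what the full network would output
features = fc_output                # what Step 3 keeps for the random forest
```

Softmax preserves the ranking of the scores, so the change in Step 3 is not which class scores highest but that the raw 9-dimensional vector is passed to a different classifier.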
Step 4: Train a random forest classifier on the high-level features extracted in Step 3. Gini impurity is used as the criterion for feature selection in the decision trees. The algorithm is as follows:
Input: sample set D = {(x1, y1), (x2, y2), ..., (xm, ym)}; number of weak classifier iterations T;
Output: final strong classifier f(x);
For t = 1, 2, ..., T:
a) Perform the t-th round of random sampling on the original data set, drawing m samples in total, to obtain the sampling set Dt;
b) Build the t-th decision tree Gt(x) from the sampling set Dt. At each node, randomly select a subset of all sample features, then choose the best feature from that subset to split the node into left and right subtrees.
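The building blocks of Step 4 can be sketched as follows. This is not the patent's implementation: sampling with replacement is the standard bagging convention and is assumed here, the subset size of 3 out of 9 features is a hypothetical choice, and all function names are invented for the sketch.

```python
import random
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2
                     for count in Counter(labels).values())


def split_gini(left_labels, right_labels):
    """Size-weighted Gini impurity of a candidate left/right split."""
    total = len(left_labels) + len(right_labels)
    return (len(left_labels) * gini_impurity(left_labels)
            + len(right_labels) * gini_impurity(right_labels)) / total


def bootstrap_sample(dataset, rng):
    """Draw len(dataset) samples with replacement (one sampling set per tree)."""
    return [rng.choice(dataset) for _ in dataset]


def random_feature_subset(n_features, subset_size, rng):
    """Pick the candidate feature indices considered at one tree node."""
    return sorted(rng.sample(range(n_features), subset_size))


rng = random.Random(0)
# Toy data set: (feature vector, label) pairs with three classes.
data = [((0.1 * i, 1.0 - 0.1 * i), i % 3) for i in range(12)]
sample = bootstrap_sample(data, rng)
candidates = random_feature_subset(n_features=9, subset_size=3, rng=rng)
```

When a tree node is split, the candidate feature and threshold with the lowest `split_gini` value would be chosen; a perfectly separating split scores 0.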
Step 5: Apply the spectrum analysis of Step 1 to the audio to be classified to obtain its spectrogram. Use the convolutional neural network from Step 3, with its softmax layer removed, to extract the high-level features of the spectrogram. Finally, feed the extracted features into the random forest classifier trained in Step 4 to classify the audio; the class that receives the most votes from the T weak learners is the final class.
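The voting rule at the end of Step 5 amounts to a majority count over the per-tree predictions. A minimal sketch, with hypothetical votes and an arbitrary tie-breaking rule that the patent does not specify:

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the class predicted by the most trees.

    Ties go to the class that appears first in the list, an arbitrary
    choice made for this sketch.
    """
    return Counter(tree_predictions).most_common(1)[0][0]


votes = [8, 3, 8, 5, 8, 3, 8]  # hypothetical predictions from T = 7 trees
final_class = majority_vote(votes)
```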
The invention proposes an audio classification method based on deep learning, using a hybrid model that combines a convolutional neural network with a random forest. To address the insufficient feature extraction of conventional models, the invention converts audio into spectrograms and then uses a convolutional neural network to extract their high-level features, which exploits the strong image feature extraction ability of convolutional neural networks and simplifies the feature extraction process. To address the weak generalization ability of a single classifier, the invention adopts a random forest model, which exploits the advantages of ensemble learning by building multiple decision trees for classification and so compensates for the shortcomings of a single classifier. The classification results show that the invention achieves relatively high precision and recall.
Description of the drawings
Fig. 1 is a flow chart of the audio classification method based on convolutional neural networks and random forest according to the invention.
Fig. 2 is a spectrogram obtained by spectrum analysis.
Fig. 3 is a flow chart of high-level feature extraction using the improved convolutional neural network.
Specific embodiment
The specific implementation of the invention is further described below with reference to the drawings and an embodiment. The following embodiment only illustrates the invention and does not limit its scope.
Embodiment 1 is an example of the invention. It uses the "GTZAN Genre Collection" as the data set, taking audio files of nine different genres as the training and test sets. The nine classes are: Blues, Classical, Country, Disco, Jazz, Metal, Pop, Reggae, and Rock.
1. Each audio file is divided into 6 segments of equal length, and every segment keeps the label of its source file. Each segment is framed, windowed, and Fourier transformed to obtain its spectrogram; Fig. 2 shows a resulting spectrogram. Each spectrogram is read in and converted to a grayscale image, then resized to 248*248. The pixel values of the resized image are stored in an array as one sample of the convolutional neural network data set. These operations yield the data set D (5400, 248, 248), that is, 5400 spectrograms, each 248 wide and 248 high. The data set is divided into a training set (80%) and a test set (20%), giving training set T (4320, 248, 248) and test set V (1080, 248, 248).
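The segmentation and the data set sizes in item 1 can be reproduced with a short sketch. The figure of 100 clips per genre comes from the GTZAN collection rather than from the patent text and is treated as an assumption here; the helper names are hypothetical, and dropping leftover samples when a clip does not divide evenly is a choice made for this sketch.

```python
def split_into_segments(samples, n_segments):
    """Cut a clip into n_segments equal-length pieces, dropping any remainder."""
    seg_len = len(samples) // n_segments
    return [samples[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]


def label_segments(samples, label, n_segments=6):
    """Pair every segment of a clip with the clip's genre label."""
    return [(segment, label)
            for segment in split_into_segments(samples, n_segments)]


clip = list(range(20))  # stand-in for the samples of one audio file
pairs = label_segments(clip, "Jazz")

n_files = 9 * 100                     # assumption: 100 GTZAN clips per genre
n_spectrograms = n_files * 6          # one spectrogram per segment
n_train = n_spectrograms * 80 // 100  # 80% for training
n_test = n_spectrograms - n_train     # remaining 20% for testing
```

With these counts the sketch gives 5400 spectrograms, 4320 for training and 1080 for testing, matching D, T, and V above.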
2. Train the convolutional neural network model on training set T (4320, 248, 248). The network has 14 layers in total, including convolutional layers, down-sampling layers, fully connected layers, a Dropout layer, and a Batch Normalization layer.
3. After the convolutional neural network has been trained, remove the final softmax layer. Use the trained convolutional neural network to extract deeper features from the spectrograms: the original training set T (4320, 248, 248) of spectrograms is transformed into a new training set T' (4320, 9), and the original test set V (1080, 248, 248) into a new test set V' (1080, 9).
4. Train the random forest on the new training set T' and test it on the new test set V', using it as the final classifier. Different parameter combinations were tried, as follows:
Parameter            Values
n_estimators         [10, 50, 100]
min_samples_split    [2, 3, 4]
min_samples_leaf     [1, 2, 3]
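The parameter table above defines a small search grid. The sketch below only enumerates the candidate combinations; how each one was scored is not described in the patent. The parameter names match those of scikit-learn's `RandomForestClassifier`, but whether that library was used is an assumption, and `expand_grid` is a hypothetical helper.

```python
from itertools import product

PARAM_GRID = {
    "n_estimators": [10, 50, 100],
    "min_samples_split": [2, 3, 4],
    "min_samples_leaf": [1, 2, 3],
}


def expand_grid(grid):
    """Yield one dict per parameter combination in the grid."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


combinations = list(expand_grid(PARAM_GRID))
print(len(combinations))  # 3 * 3 * 3 candidate settings
```

Each dict in `combinations` could be passed as keyword arguments to a classifier constructor that accepts these names, with the best-scoring setting kept.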
After selection, the best parameter combination is n_estimators: 100, min_samples_split: 3, min_samples_leaf: 1. After the random forest was trained, it was evaluated on the test set, with the following results:
Class      Precision  Recall  F1-score  Support
0          0.80       0.74    0.77      118
1          0.89       0.92    0.90      133
2          0.75       0.80    0.78      117
3          0.75       0.83    0.79      118
4          0.93       0.88    0.90      134
5          0.94       0.90    0.92      108
6          0.88       0.85    0.87      103
7          0.86       0.78    0.82      124
8          0.64       0.68    0.66      125
Avg/total  0.83       0.82    0.82      1080
The table above shows that the method classifies audio automatically with good accuracy: average precision reaches 83% and average recall reaches 82%.
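The averages in the last row of the table are consistent with support-weighted means of the per-class values, which the following sketch checks. The per-class figures are copied from the table and are already rounded to two decimals; treating the averages as support-weighted is an inference from the numbers, not something the patent states.

```python
# (precision, recall, support) for classes 0..8, copied from the table above.
ROWS = [
    (0.80, 0.74, 118),
    (0.89, 0.92, 133),
    (0.75, 0.80, 117),
    (0.75, 0.83, 118),
    (0.93, 0.88, 134),
    (0.94, 0.90, 108),
    (0.88, 0.85, 103),
    (0.86, 0.78, 124),
    (0.64, 0.68, 125),
]


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


def weighted_average(values, weights):
    """Average of values, each weighted by its class support."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


supports = [support for _, _, support in ROWS]
avg_precision = weighted_average([p for p, _, _ in ROWS], supports)
avg_recall = weighted_average([r for _, r, _ in ROWS], supports)
per_class_f1 = [f1_score(p, r) for p, r, _ in ROWS]

print(round(avg_precision, 2), round(avg_recall, 2), sum(supports))
```

F1 values recomputed from the rounded precision and recall can differ from the table in the last digit, since the table was presumably computed from unrounded values.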

Claims (3)

1. An audio classification method based on convolutional neural networks and random forest, characterized by comprising the following steps:
Step 1: Perform spectrum analysis on the original audio data set. Each long audio file is first divided into several segments of equal length, each segment keeping the label of its source file; each segment is then framed, windowed, and Fourier transformed to obtain its spectrogram, which serves as one sample of the new training set;
Step 2: Train an improved convolutional neural network on all spectrograms obtained in Step 1 and their labels; the network has 14 layers;
Step 3: Remove the softmax layer of the convolutional neural network trained in Step 2, then use the network to extract the high-level features of all spectrograms;
Step 4: Train a random forest classifier on the spectrogram high-level features extracted in Step 3, using Gini impurity as the criterion for decision tree feature selection;
Step 5: Apply the spectrum analysis of Step 1 to the audio to be classified to obtain its spectrogram; use the convolutional neural network from Step 3, with its softmax layer removed, to extract the spectrogram's high-level features; finally feed the extracted features into the random forest classifier trained in Step 4 to classify the audio, the final result being obtained by voting.
2. The audio classification method based on convolutional neural networks and random forest according to claim 1, characterized in that the method performs two-stage feature extraction on audio: the first stage obtains the spectrogram of the audio through spectrum analysis, providing a preliminary extraction of low-level time-frequency features; the second stage uses the improved convolutional neural network to extract further high-level features from the spectrogram.
3. The audio classification method based on convolutional neural networks and random forest according to claim 1, characterized in that, to overcome the weak generalization ability caused by using softmax as the classifier of the convolutional neural network, the method replaces the last layer of the convolutional neural network with a random forest as the final audio classifier.
CN201810037337.8A 2018-01-16 2018-01-16 A kind of audio frequency classification method based on convolutional neural networks and random forest Pending CN108122562A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810037337.8A CN108122562A (en) 2018-01-16 2018-01-16 A kind of audio frequency classification method based on convolutional neural networks and random forest

Publications (1)

Publication Number Publication Date
CN108122562A true CN108122562A (en) 2018-06-05

Family

ID=62232892

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810037337.8A Pending CN108122562A (en) 2018-01-16 2018-01-16 A kind of audio frequency classification method based on convolutional neural networks and random forest

Country Status (1)

Country Link
CN (1) CN108122562A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106408015A (en) * 2016-09-13 2017-02-15 电子科技大学成都研究院 Road fork identification and depth estimation method based on convolutional neural network
CN106952274A (en) * 2017-03-14 2017-07-14 西安电子科技大学 Pedestrian detection and distance-finding method based on stereoscopic vision
CN107066553A (en) * 2017-03-24 2017-08-18 北京工业大学 A kind of short text classification method based on convolutional neural networks and random forest
CN107492383A (en) * 2017-08-07 2017-12-19 上海六界信息技术有限公司 Screening technique, device, equipment and the storage medium of live content
CN107491606A (en) * 2017-08-17 2017-12-19 安徽工业大学 Variable working condition epicyclic gearbox sun gear method for diagnosing faults based on more attribute convolutional neural networks

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
曹林林: ""卷积神经网络在高分遥感影像分类中的应用"", 《测绘科学》 *
罗建华: ""基于深度卷积神经网络的高光谱遥感图像分类"", 《西华大学学报》 *

Cited By (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108766461A (en) * 2018-07-17 2018-11-06 厦门美图之家科技有限公司 Audio feature extraction methods and device
CN109002529A (en) * 2018-07-17 2018-12-14 厦门美图之家科技有限公司 Audio search method and device
CN108766461B (en) * 2018-07-17 2021-01-26 厦门美图之家科技有限公司 Audio feature extraction method and device
CN109002529B (en) * 2018-07-17 2021-02-02 厦门美图之家科技有限公司 Audio retrieval method and device
CN109166593A (en) * 2018-08-17 2019-01-08 腾讯音乐娱乐科技(深圳)有限公司 audio data processing method, device and storage medium
CN109684506A (en) * 2018-11-22 2019-04-26 北京奇虎科技有限公司 A kind of labeling processing method of video, device and calculate equipment
CN109493881B (en) * 2018-11-22 2023-12-05 北京奇虎科技有限公司 Method and device for labeling audio and computing equipment
CN109684506B (en) * 2018-11-22 2023-10-20 三六零科技集团有限公司 Video tagging processing method and device and computing equipment
CN109493881A (en) * 2018-11-22 2019-03-19 北京奇虎科技有限公司 A kind of labeling processing method of audio, device and calculate equipment
CN109739112A (en) * 2018-12-29 2019-05-10 张卫校 A kind of wobble objects control method and wobble objects
CN109739112B (en) * 2018-12-29 2022-03-04 张卫校 Swinging object control method and swinging object
CN111583890A (en) * 2019-02-15 2020-08-25 阿里巴巴集团控股有限公司 Audio classification method and device
CN109949825A (en) * 2019-03-06 2019-06-28 河北工业大学 Noise classification method based on the FPGA PCNN algorithm accelerated
CN110010128A (en) * 2019-04-09 2019-07-12 天津松下汽车电子开发有限公司 A kind of sound control method and system of high discrimination
CN110324657A (en) * 2019-05-29 2019-10-11 北京奇艺世纪科技有限公司 Model generation, method for processing video frequency, device, electronic equipment and storage medium
CN110414483A (en) * 2019-08-13 2019-11-05 山东浪潮人工智能研究院有限公司 A kind of face identification method and system based on deep neural network and random forest
CN110600038B (en) * 2019-08-23 2022-04-05 北京工业大学 Audio fingerprint dimension reduction method based on discrete kini coefficient
CN110600038A (en) * 2019-08-23 2019-12-20 北京工业大学 Audio fingerprint dimension reduction method based on discrete kini coefficient
CN110675893A (en) * 2019-09-19 2020-01-10 腾讯音乐娱乐科技(深圳)有限公司 Song identification method and device, storage medium and electronic equipment
CN110808033A (en) * 2019-09-25 2020-02-18 武汉科技大学 Audio classification method based on dual data enhancement strategy
CN110808033B (en) * 2019-09-25 2022-04-15 武汉科技大学 Audio classification method based on dual data enhancement strategy
CN110933236A (en) * 2019-10-25 2020-03-27 杭州哲信信息技术有限公司 Machine learning-based null number identification method
CN110931046A (en) * 2019-11-29 2020-03-27 福州大学 Audio high-level semantic feature extraction method and system for overlapped sound event detection
CN111179971A (en) * 2019-12-03 2020-05-19 杭州网易云音乐科技有限公司 Nondestructive audio detection method and device, electronic equipment and storage medium
CN110931045A (en) * 2019-12-20 2020-03-27 重庆大学 Audio feature generation method based on convolutional neural network
CN111159464A (en) * 2019-12-26 2020-05-15 腾讯科技(深圳)有限公司 Audio clip detection method and related equipment
CN111159464B (en) * 2019-12-26 2023-12-15 腾讯科技(深圳)有限公司 Audio clip detection method and related equipment
US11905926B2 (en) * 2019-12-31 2024-02-20 Envision Digital International Pte. Ltd. Method and apparatus for inspecting wind turbine blade, and device and storage medium thereof
CN111508526B (en) * 2020-04-10 2022-07-01 腾讯音乐娱乐科技(深圳)有限公司 Method and device for detecting audio beat information and storage medium
CN111508526A (en) * 2020-04-10 2020-08-07 腾讯音乐娱乐科技(深圳)有限公司 Method and device for detecting audio beat information and storage medium
CN113901977A (en) * 2020-06-22 2022-01-07 中国电力科学研究院有限公司 Deep learning-based power consumer electricity stealing identification method and system
CN112735386B (en) * 2021-01-18 2023-03-24 苏州大学 Voice recognition method based on glottal wave information
CN112735386A (en) * 2021-01-18 2021-04-30 苏州大学 Voice recognition method based on glottal wave information
CN113313197A (en) * 2021-06-17 2021-08-27 哈尔滨工业大学 Full-connection neural network training method
CN113729715A (en) * 2021-10-11 2021-12-03 山东大学 Parkinson's disease intelligent diagnosis system based on finger pressure
CN115064184A (en) * 2022-06-28 2022-09-16 镁佳(北京)科技有限公司 Audio file musical instrument content identification vector representation method and device
CN118098270A (en) * 2024-04-24 2024-05-28 安徽大学 Noise tracing method based on feature extraction and feature fusion

Similar Documents

Publication Publication Date Title
CN108122562A (en) A kind of audio frequency classification method based on convolutional neural networks and random forest
Chang et al. Learning representations of emotional speech with deep convolutional generative adversarial networks
CN106503805B (en) A kind of bimodal based on machine learning everybody talk with sentiment analysis method
CN106328121B (en) Chinese Traditional Instruments sorting technique based on depth confidence network
CN109147804A (en) A kind of acoustic feature processing method and system based on deep learning
CN111723874B (en) Sound field scene classification method based on width and depth neural network
CN111000553B (en) Intelligent classification method for electrocardiogram data based on voting ensemble learning
CN107644057B (en) Absolute imbalance text classification method based on transfer learning
CN106815369A (en) A kind of file classification method based on Xgboost sorting algorithms
CN109271550B (en) Music personalized recommendation method based on deep learning
CN107993663A (en) A kind of method for recognizing sound-groove based on Android
CN107392241A (en) A kind of image object sorting technique that sampling XGBoost is arranged based on weighting
CN106295717A (en) A kind of western musical instrument sorting technique based on rarefaction representation and machine learning
CN103000172A (en) Signal classification method and device
CN109767789A (en) A kind of new feature extracting method for speech emotion recognition
Shen et al. Learning how to listen: A temporal-frequential attention model for sound event detection
Shakil et al. Feature based classification of voice based biometric data through Machine learning algorithm
CN112861984A (en) Speech emotion classification method based on feature fusion and ensemble learning
CN110910175A (en) Tourist ticket product portrait generation method
CN104077598A (en) Emotion recognition method based on speech fuzzy clustering
CN111583957B (en) Drama classification method based on five-tone music rhythm spectrogram and cascade neural network
CN102521402B (en) Text filtering system and method
CN110084126A (en) A kind of satellite communication jamming signal type recognition methods based on Xgboost
CN109460872A (en) One kind being lost unbalanced data prediction technique towards mobile communication subscriber
CN111785236A (en) Automatic composition method based on motivational extraction model and neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20180605