CN107545902A - Article material identification method and device based on sound characteristics - Google Patents

Article material identification method and device based on sound characteristics

Info

Publication number
CN107545902A
Authority
CN
China
Legal status: Granted
Application number
CN201710575310.XA
Other languages
Chinese (zh)
Other versions
CN107545902B
Inventor
刘华平
付海滨
孙富春
Current Assignee
Tsinghua University
Original Assignee
Tsinghua University
Priority date
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN201710575310.XA
Publication of CN107545902A
Application granted
Publication of CN107545902B
Legal status: Active
Anticipated expiration


Abstract

The present invention proposes an article material identification method and device based on sound characteristics, belonging to the fields of signal processing and pattern recognition. In the method, training articles of different materials are first selected; the surface of each training article is tapped to obtain an audio file, features are extracted from the audio file to obtain the material feature coefficient matrix of the training article, and a material identification expert database is built as the training sample set used to train an extreme learning machine (ELM) classifier. The audio file of an article to be tested is then acquired, the corresponding material feature coefficient matrix is extracted and input into the ELM classifier, and the classifier outputs a prediction value matrix for the article, in which each output value corresponds to one article material class; the material class corresponding to the maximum value is the material identification result of the article under test. The device comprises a microphone sound collection pen and a computer connected by Bluetooth. The invention can effectively help customers identify the material of online-purchased products; the identification results are accurate and practical.

Description

Article material identification method and device based on sound characteristics
Technical field
The present invention relates to an article material identification method and device based on sound characteristics, and belongs to the fields of signal processing and pattern recognition.
Background technology
At present, as a pioneering application of the "Internet Plus" model, online shopping is a core sub-industry of e-commerce. In a world where the Internet provides the infrastructure, the original face-to-face transaction is replaced by computers and network cables: buyer and seller only need a few mouse clicks to conclude a deal, and browsing, ordering, payment and delivery become faster and more convenient. At the same time, however, a new problem arises between the two parties: trust. Whether the seller is reputable, what the quality of the goods actually is, and whether the goods match the seller's textual and pictorial descriptions become major problems that online shopping must overcome compared with traditional trade.
For online shoppers, the material of a product is undoubtedly one of the consumer's greatest concerns. Although merchants usually describe the material of a product in words and may even provide photographs, an ordinary consumer cannot directly see or touch the described material, so even a detailed and unbiased verbal description is not necessarily helpful for judging how the product feels. In such cases the probability that the consumer buys the product decreases.
Article material identification based on sound characteristics exploits the fact that different materials emit sounds with different timbre, pitch, frequency and loudness, and uses these differences to identify the material of an article, such as metal, plastic, cloth, wood, paper, glass or ceramic. Metal can be further divided into iron, copper, aluminium, steel and so on, and wood into poplar, willow, elm and many other types. Through sound one can know not only the material of an article but also its internal features, for example whether it is solid or hollow.
Most existing sound recognition technology targets speech. Speech recognition mainly involves three aspects: acoustic feature extraction, pattern matching, and model training. Its applications include voice navigation, voice search, voice dialing and speech translation, and the related fields include pattern recognition, artificial intelligence and signal processing. At present, sound characteristics have not yet been applied to article material identification.
The extreme learning machine (ELM) is a typical single-hidden-layer feed-forward neural network. Because of advantages such as fast learning speed and strong generalization ability, it has attracted the attention of many researchers at home and abroad. ELM is suitable not only for regression and fitting problems but also for classification and pattern recognition, and it has been widely applied in many fields. Many improved methods and strategies have been proposed for ELM, greatly improving its performance, so its range of application and its importance keep growing.
Summary of the invention
The purpose of the invention is to overcome the shortcomings of identifying an article's material from text descriptions alone during online shopping, by proposing an article material identification method and device based on sound characteristics. The invention uses mixed MFCC sound features to realize article material identification. It can effectively help customers identify the material of online-purchased products, with accurate and practical results.
The article material identification method based on sound characteristics proposed by the present invention is characterized by comprising the following steps:
1) Collect audio files.
Select a training article and tap its surface with the microphone sound collection pen to produce a knocking sound signal; the knocking sound signal is converted into a digital audio signal and uploaded to the computer, where it is saved as an audio file. The sampling frequency of each audio file is 44100 Hz, the recording duration is 1 s, the sampling precision is 16 bit, and the file is saved in wav format, giving 44100 data points.
2) Feature extraction, specifically comprising the following steps:
2-1) Pre-process the audio data of the training article's audio file obtained in step 1). The specific steps are as follows:
2-1-1) Pre-emphasis: pass the audio data of the audio file obtained in step 1) through a high-pass filter:
H(z) = 1 - μz⁻¹
where H(z) is the high-pass filter transfer function, the pre-emphasis coefficient μ is 0.97, and z denotes the z-transform variable.
2-1-2) Framing: divide the pre-emphasized audio data into frames. The frame length is 1104 data points, corresponding to a frame duration of 1104/44100*1000 = 25 ms; the frame shift is 441 data points, corresponding to 441/44100*1000 = 10 ms. The audio data of the audio file is thus divided into 98 frames.
2-1-3) Windowing: let the audio data of any frame after framing be x_i(n), n = 0, 1, ..., N-1, where N is the number of data points per frame, N = 1104, and i denotes the i-th frame, i = 1, 2, ..., 98. Multiplying by a Hamming window gives x'_i(n) = x_i(n) × W(n).
The Hamming window is:
W(n) = (1 - α) - α·cos[2πn/(N-1)], 0 ≤ n ≤ N-1
where W(n) is the Hamming window function and the Hamming window coefficient α = 0.46.
2-2) Extract features from the pre-processed audio data of the audio file. The extracted features include: the static sound feature, the MFCC coefficients; the dynamic sound feature, the MFCC first-order difference coefficients DMFCC; and the dynamic sound feature, the MFCC second-order difference coefficients D2MFCC. The specific steps are as follows:
2-2-1) Extract the static MFCC feature coefficients:
2-2-1-1) Apply the fast Fourier transform (FFT) to each pre-processed frame to obtain its spectrum:
X_i(k) = Σ_{n=0}^{N-1} x'_i(n) e^{-j2πnk/N}, 0 ≤ k ≤ N-1
where X_i(k) is the spectrum of the i-th frame, k is the FFT bin index within the i-th frame, and x'_i(n) is the windowed frame obtained in step 2-1-3);
2-2-1-2) Compute the power spectrum P_im of each frame:
P_im = |X_i(k)|²
2-2-1-3) Filter the power spectrum P_im of each frame with a Mel triangular filter bank containing M filters. The Mel triangular filter bank is defined by:
H_m(k) = 0, for k < f(m-1);
H_m(k) = 2(k - f(m-1)) / [(f(m+1) - f(m-1))(f(m) - f(m-1))], for f(m-1) ≤ k ≤ f(m);
H_m(k) = 2(f(m+1) - k) / [(f(m+1) - f(m-1))(f(m) - f(m-1))], for f(m) ≤ k ≤ f(m+1);
H_m(k) = 0, for k > f(m+1)
where f(m) is the centre frequency of the m-th Mel filter, 0 ≤ m ≤ M-1.
2-2-1-4) Compute the log energy output by the Mel triangular filter bank for each frame:
Q_i(m) = ln( Σ_{k=0}^{N-1} P_im · H_m(k) ), 0 ≤ m ≤ M-1
2-2-1-5) Apply the discrete cosine transform to the log energies obtained in step 2-2-1-4) to obtain the Mel-frequency cepstral coefficients (MFCC) of each frame:
C_i(η) = Σ_{m=0}^{M-1} Q_i(m) · cos( πη(m - 0.5)/M ), η = 1, 2, ..., L
where C_i(η) is the η-th order MFCC coefficient of the i-th frame and L is the total number of MFCC orders. Each audio file of a training article yields a 98×L MFCC feature matrix.
2-2-2) Extract the dynamic MFCC first-order difference coefficients DMFCC.
The first-order difference coefficients DMFCC are computed as:
D_i(η) = C_i(η+1) - C_i(η), for η < θ;
D_i(η) = (1 / sqrt(2·Σ_{ν=1}^{θ} ν²)) · Σ_{ν=1}^{θ} ν·(C_i(η+ν) - C_i(η-ν)), for θ ≤ η < L - θ;
D_i(η) = C_i(η) - C_i(η-1), for η ≥ L - θ
where D_i(η) is the first-order difference of the η-th order MFCC coefficient of the i-th frame, and θ is the time difference of the first derivative. Each audio file of a training article yields a 98×L DMFCC feature matrix.
2-2-3) Extract the dynamic MFCC second-order difference coefficients D2MFCC.
The second-order difference coefficients D2MFCC are computed as:
D_{i2}(η) = D_i(η+1) - D_i(η), for η < ω;
D_{i2}(η) = (1 / sqrt(2·Σ_{ν=1}^{ω} ν²)) · Σ_{ν=1}^{ω} ν·(D_i(η+ν) - D_i(η-ν)), for ω ≤ η < L - ω;
D_{i2}(η) = D_i(η) - D_i(η-1), for η ≥ L - ω
where D_{i2}(η) is the second-order difference of the η-th order MFCC coefficient of the i-th frame, and ω is the time difference of the second derivative. Each audio file of a training article yields a 98×L D2MFCC feature matrix.
2-2-4) Combine the three feature matrices obtained in steps 2-2-1) to 2-2-3) to obtain the material feature coefficient matrix of the training article selected in step 1):
Concatenate the extracted MFCC, DMFCC and D2MFCC feature matrices into a 98×3L feature matrix, and remove the first two rows and last two rows of the combined matrix. Each column of the combined matrix represents one feature coefficient; average each column to finally obtain the 1×3L mixed MFCC sound feature matrix of the audio file of the training article and save it. This mixed MFCC sound feature matrix is the material feature coefficient matrix of the training article selected in step 1).
3) Repeat steps 1) and 2) to collect 20 audio files of the training article selected in step 1) and extract the corresponding material feature coefficient matrices.
4) Choose A different material classes, and in each class choose B different articles as training articles. Repeat steps 1) to 3) to acquire the audio files of every training article and extract the corresponding material feature coefficient matrices, giving A×B×20 material feature coefficient matrices in total. With each matrix as one training sample, all training samples form the material identification expert database.
5) Train the extreme learning machine (ELM) classifier.
5-1) Build the ELM classifier.
The ELM classifier comprises three layers: an input layer, a hidden layer and an output layer. The input layer has a neurons, each corresponding to one input feature of a training sample, so a = 3L. The output layer has c neurons, each corresponding to one article material class in the material identification expert database obtained in step 4), so c = A. The number of hidden-layer neurons is set to l.
5-2) The input matrix X of the ELM classifier is the a×R matrix X = [x_pq], p = 1, ..., a, q = 1, ..., R,
where each column represents one training sample and each row represents one feature; there are R training samples in total and each sample contains a features, so X has a rows and R columns.
The target output matrix Y is the c×R matrix in which each column is the output result of one training sample; each output result contains c output values, and each output value corresponds to a different material class.
5-2) Randomly choose the connection weights w between the input layer and the hidden layer and the biases b of the hidden-layer neurons.
The connection weight matrix between the input layer and the hidden layer is the l×a matrix w = [w_σp], where w_σp is the connection weight between the p-th input neuron and the σ-th hidden neuron.
The bias vector of the hidden-layer neurons is the l×1 vector b = [b_1, b_2, ..., b_l]^T, where b_σ is the bias of the σ-th hidden neuron.
5-3) Compute the hidden-layer output matrix H.
Choose an infinitely differentiable function as the activation function g(x) of the hidden-layer neurons, and denote the connection weights between the hidden layer and the output layer by the l×c matrix β = [β_σj], σ = 1, ..., l, j = 1, ..., c.
The prediction output matrix T of the ELM classifier is the c×R matrix T = [t_1, t_2, ..., t_R], whose q-th column is
t_q = [ Σ_{σ=1}^{l} β_σ1·g(w_σ·x_q + b_σ), ..., Σ_{σ=1}^{l} β_σc·g(w_σ·x_q + b_σ) ]^T
where w_σ = [w_σ1, w_σ2, ..., w_σa] and x_q = [x_1q, x_2q, ..., x_aq]^T.
The prediction output matrix can be written as Hβ = T', where T' is the transpose of T, and the hidden-layer output matrix H is the R×l matrix whose (q, σ) entry is g(w_σ·x_q + b_σ).
5-4) Compute the optimal connection weights between the hidden layer and the output layer.
β is obtained by solving the least-squares problem:
min_β ||Hβ - T'||
Its optimal solution is β̂ = H⁺T', where H⁺ is the Moore-Penrose generalized inverse of the hidden-layer output matrix H. The ELM classifier training is then complete.
6) Material identification. The specific steps are as follows:
6-1) Select any article to be tested and repeat step 1): collect the knocking sound signal produced by the article and save it as a corresponding audio file.
6-2) Repeat step 2) to extract features from the audio file obtained in step 6-1), giving the material feature coefficient matrix of the article to be tested.
6-3) Input the extracted material feature coefficient matrix of the article to be tested into the ELM classifier trained in step 5). The classifier outputs a c×1 prediction value matrix for the article; the matrix contains c output values, each corresponding to one article material class, and the material class corresponding to the maximum output value is the material identification result of the article to be tested.
The article material identification device based on the above method proposed by the present invention is characterized by comprising a microphone sound collection pen and a computer connected by Bluetooth. The microphone sound collection pen comprises a metal tapping rod, a microphone sensor, a sound analog-to-digital conversion module, a Bluetooth transmission module, a display module and a pen body. The microphone sensor, sound analog-to-digital conversion module, Bluetooth transmission module and display module are all installed inside the pen body; one end of the metal tapping rod is placed inside the pen body and the other end is outside it. The metal tapping rod is used to tap the surface of an article to produce a knocking sound signal; the microphone sensor collects the knocking sound signal and passes it to the sound analog-to-digital conversion module, which converts it into a digital sound signal and passes it to the Bluetooth transmission module; the Bluetooth transmission module uploads the digital sound signal to the computer via Bluetooth, where it is saved as an audio file; the computer identifies the audio file, outputs the article material identification result and returns it to the Bluetooth transmission module via Bluetooth; and the Bluetooth transmission module shows the identification result to the user through the display module.
Technical features and beneficial effects of the present invention:
(1) The invention uses the different sound characteristics of articles of different materials to analyse the collected sound and determine the material of an article, making up for the defects of identifying article material only through vision or text descriptions.
(2) The invention applies a method based on mixed MFCC coefficients to article material identification, compensating for the insufficiency of using only the static Mel-frequency cepstral coefficients by introducing the first-order and second-order differences of the dynamic Mel-frequency cepstral coefficients, which makes the extracted features more accurate.
(3) The invention builds an expert database of the tapping sound features of various materials as the training set of the classifier, satisfying the needs of identifying different materials as far as possible.
(4) The invention uses an extreme learning machine as the classifier to analyse article material, making the work more efficient.
Brief description of the drawings
Fig. 1 is the overall flow chart of the method of the invention.
Fig. 2 is the flow chart of the feature extraction stage of the method of the invention.
Fig. 3 is a structural schematic diagram of the microphone sound collection pen of the device of the invention.
In Fig. 3: 1-1, metal tapping rod; 1-2, microphone sensor; 1-3, sound analog-to-digital conversion module; 1-4, Bluetooth transmission module; 1-5, display module; 1-6, pen body.
Specific embodiments
The article material identification method and device based on sound characteristics proposed by the present invention are further described below with reference to the drawings and a specific embodiment.
The article material identification method based on sound characteristics proposed by the present invention has the overall flow shown in Fig. 1 and comprises the following steps:
1) Collect audio files.
Select a training article (whose material is known) and tap its surface with the microphone sound collection pen to produce a knocking sound signal; the knocking sound signal is converted into a digital audio signal and uploaded to the computer, where it is saved as an audio file. In this embodiment the sampling frequency of each audio file is 44100 Hz, the recording duration is 1 s, the sampling precision is 16 bit, and the file is saved in wav format, giving 44100 data points.
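As an illustration, the result of this collection step can be loaded for processing with a short script like the following sketch; the use of NumPy/SciPy and the file name are assumptions for illustration, not part of the patent.

```python
# Minimal sketch (assumption: NumPy/SciPy available; file name is hypothetical):
# load one 1-second, 44100 Hz, 16-bit mono wav recording of a tap, as in step 1).
import numpy as np
from scipy.io import wavfile

fs, samples = wavfile.read("tap_metal_01.wav")
samples = samples.astype(np.float64)
assert fs == 44100 and len(samples) == 44100   # 1 s at 44100 Hz -> 44100 data points
```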
2) Feature extraction; the flow is shown in Fig. 2 and specifically comprises the following steps:
2-1) Pre-process the audio data in the audio file of the training article obtained in step 1). The specific steps are as follows:
2-1-1) Pre-emphasis: pass the audio data of the audio file obtained in step 1) through a high-pass filter:
H(z) = 1 - μz⁻¹
where H(z) is the high-pass filter transfer function, the pre-emphasis coefficient μ is 0.97, and z denotes the z-transform variable.
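A minimal sketch of this pre-emphasis step, written as the difference equation of H(z) = 1 - μz⁻¹ and building on the variables of the previous sketch:

```python
# Pre-emphasis y[n] = x[n] - mu * x[n-1] with mu = 0.97; the first sample is kept as-is.
def pre_emphasis(x, mu=0.97):
    return np.append(x[0], x[1:] - mu * x[:-1])

emphasized = pre_emphasis(samples)
```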
2-1-2) Framing: divide the pre-emphasized audio data into frames. The frame length is 1104 data points, corresponding to a frame duration of 1104/44100*1000 = 25 ms; the frame shift is 441 data points, corresponding to 441/44100*1000 = 10 ms. The audio data of the audio file is thus divided into 98 frames.
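The framing of step 2-1-2) can be sketched as follows (frame length 1104 samples, hop 441 samples, giving 98 frames for a 44100-sample file):

```python
# Split the 44100-sample signal into overlapping frames of 1104 samples with a 441-sample hop.
def frame_signal(x, frame_len=1104, hop=441):
    n_frames = 1 + (len(x) - frame_len) // hop             # 98 frames for a 1-second file
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]                                           # shape (98, 1104)

frames = frame_signal(emphasized)
```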
2-1-3) Windowing: let the audio data of any frame after framing be x_i(n), n = 0, 1, ..., N-1, where N is the number of data points per frame and i denotes the i-th frame, i = 1, 2, ..., 98. Multiplying by a Hamming window gives x'_i(n) = x_i(n) × W(n).
The Hamming window is:
W(n) = (1 - α) - α·cos[2πn/(N-1)], 0 ≤ n ≤ N-1
where W(n) is the Hamming window function and the Hamming window coefficient α = 0.46.
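A sketch of the windowing step with the Hamming coefficient α = 0.46 used above:

```python
# Hamming window W(n) = (1 - alpha) - alpha * cos(2*pi*n / (N - 1)), alpha = 0.46.
N = frames.shape[1]                                         # 1104 points per frame
n = np.arange(N)
hamming = (1 - 0.46) - 0.46 * np.cos(2 * np.pi * n / (N - 1))
windowed = frames * hamming                                 # x'_i(n) = x_i(n) * W(n)
```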
2-2) Extract features from the pre-processed audio data. The extracted features include: the static sound feature, the MFCC coefficients; the dynamic sound feature, the MFCC first-order difference coefficients DMFCC; and the dynamic sound feature, the MFCC second-order difference coefficients D2MFCC. The specific steps are as follows:
2-2-1) Extract the static MFCC feature coefficients:
2-2-1-1) Apply the fast Fourier transform (FFT) to each pre-processed frame to obtain its spectrum:
X_i(k) = Σ_{n=0}^{N-1} x'_i(n) e^{-j2πnk/N}, 0 ≤ k ≤ N-1
where X_i(k) is the spectrum of the i-th frame, k is the FFT bin index within the i-th frame, x'_i(n) is the windowed frame obtained in step 2-1-3), and N is the number of data points per frame; in this embodiment N = 1104.
2-2-1-2) Compute the power spectrum P_im of each frame:
P_im = |X_i(k)|²
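Steps 2-2-1-1) and 2-2-1-2) correspond directly to an FFT followed by a squared magnitude, for example:

```python
# Spectrum X_i(k) of every windowed frame and its power spectrum P_im = |X_i(k)|^2.
spectrum = np.fft.fft(windowed, n=N, axis=1)                # shape (98, 1104), complex
power = np.abs(spectrum) ** 2                               # shape (98, 1104)
```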
2-2-1-3) Filter the power spectrum P_im of each frame with a Mel triangular filter bank containing M filters; M is usually between 22 and 26, and M = 24 in this embodiment. The Mel triangular filter bank is defined by:
H_m(k) = 0, for k < f(m-1);
H_m(k) = 2(k - f(m-1)) / [(f(m+1) - f(m-1))(f(m) - f(m-1))], for f(m-1) ≤ k ≤ f(m);
H_m(k) = 2(f(m+1) - k) / [(f(m+1) - f(m-1))(f(m) - f(m-1))], for f(m) ≤ k ≤ f(m+1);
H_m(k) = 0, for k > f(m+1)
where f(m) is the centre frequency of the m-th Mel filter, 0 ≤ m ≤ M-1.
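A sketch of a Mel triangular filter bank with M = 24 filters follows. The patent defines the triangle shape H_m(k) but does not spell out how the centre frequencies f(m) are chosen; the common choice of equal spacing on the Mel scale between 0 Hz and fs/2 is assumed here, and the unit-peak form of the triangles is used instead of the normalized form written above.

```python
def mel_filterbank(M=24, n_fft=1104, fs=44100):
    # Assumption: centre frequencies equally spaced on the Mel scale (not specified in the patent).
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(fs / 2.0), M + 2)
    bin_pts = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / fs).astype(int)
    fbank = np.zeros((M, n_fft))
    for m in range(1, M + 1):
        left, centre, right = bin_pts[m - 1], bin_pts[m], bin_pts[m + 1]
        for k in range(left, centre):                       # rising edge of the triangle
            fbank[m - 1, k] = (k - left) / (centre - left)
        for k in range(centre, right):                      # falling edge of the triangle
            fbank[m - 1, k] = (right - k) / (right - centre)
    return fbank

fbank = mel_filterbank()                                    # shape (24, 1104)
```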
2-2-1-4) Compute the log energy output by the Mel triangular filter bank for each frame:
Q_i(m) = ln( Σ_{k=0}^{N-1} P_im · H_m(k) ), 0 ≤ m ≤ M-1
2-2-1-5) Apply the discrete cosine transform (DCT) to the log energies obtained in step 2-2-1-4) to obtain the Mel-frequency cepstral coefficients (MFCC) of each frame:
C_i(η) = Σ_{m=0}^{M-1} Q_i(m) · cos( πη(m - 0.5)/M ), η = 1, 2, ..., L
where C_i(η) is the η-th order MFCC coefficient of the i-th frame and L is the total number of MFCC orders; in this embodiment L = 13. The audio data of each audio file collected from a training article is divided into 98 frames, and each frame yields 13 MFCC coefficients, so each audio file of a training article yields a 98×13 MFCC feature matrix.
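Steps 2-2-1-4) and 2-2-1-5) can then be sketched as a log of the filter-bank energies followed by the cosine transform written above (the small epsilon that avoids log(0) is an addition for numerical safety, not part of the patent):

```python
L_ORDER, M = 13, fbank.shape[0]
log_energy = np.log(power @ fbank.T + 1e-12)                # Q_i(m), shape (98, 24)
eta = np.arange(1, L_ORDER + 1)[:, None]                    # eta = 1 .. 13
m_idx = np.arange(M)[None, :]                               # m = 0 .. 23
dct_basis = np.cos(np.pi * eta * (m_idx - 0.5) / M)         # cos(pi*eta*(m - 0.5)/M), as written above
mfcc = log_energy @ dct_basis.T                             # C_i(eta), shape (98, 13)
```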
2-2-2) Extract the dynamic MFCC first-order difference coefficients DMFCC.
The first-order difference coefficients DMFCC are computed as:
D_i(η) = C_i(η+1) - C_i(η), for η < θ;
D_i(η) = (1 / sqrt(2·Σ_{ν=1}^{θ} ν²)) · Σ_{ν=1}^{θ} ν·(C_i(η+ν) - C_i(η-ν)), for θ ≤ η < L - θ;
D_i(η) = C_i(η) - C_i(η-1), for η ≥ L - θ
where C_i(η) is the η-th order MFCC coefficient of the i-th frame, D_i(η) is its first-order difference, and θ is the time difference of the first derivative; θ = 2 here. In this embodiment each audio file of a training article yields a 98×13 DMFCC feature matrix.
2-2-3) Extract the dynamic MFCC second-order difference coefficients D2MFCC.
The second-order difference coefficients D2MFCC are computed as:
D_{i2}(η) = D_i(η+1) - D_i(η), for η < ω;
D_{i2}(η) = (1 / sqrt(2·Σ_{ν=1}^{ω} ν²)) · Σ_{ν=1}^{ω} ν·(D_i(η+ν) - D_i(η-ν)), for ω ≤ η < L - ω;
D_{i2}(η) = D_i(η) - D_i(η-1), for η ≥ L - ω
where D_i(η) is the first-order difference of the η-th order MFCC coefficient of the i-th frame, D_{i2}(η) is its second-order difference, and ω is the time difference of the second derivative; ω = 2 here. In this embodiment each audio file of a training article yields a 98×13 D2MFCC feature matrix.
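A sketch of the difference formula shared by steps 2-2-2) and 2-2-3), applied as written above over the coefficient index η within each frame, with θ = ω = 2; DMFCC is the difference of the MFCC matrix and D2MFCC the difference of the DMFCC matrix:

```python
def delta(coeffs, d=2):
    # coeffs: (frames, L) matrix; returns the difference matrix of the same shape.
    L_ord = coeffs.shape[1]
    out = np.zeros_like(coeffs)
    denom = np.sqrt(2.0 * sum(v * v for v in range(1, d + 1)))
    for eta in range(L_ord):
        if eta < d:                                         # leading edge: forward difference
            out[:, eta] = coeffs[:, eta + 1] - coeffs[:, eta]
        elif eta >= L_ord - d:                              # trailing edge: backward difference
            out[:, eta] = coeffs[:, eta] - coeffs[:, eta - 1]
        else:                                               # centre: weighted symmetric difference
            out[:, eta] = sum(v * (coeffs[:, eta + v] - coeffs[:, eta - v])
                              for v in range(1, d + 1)) / denom
    return out

dmfcc = delta(mfcc)                                         # 98 x 13 DMFCC
d2mfcc = delta(dmfcc)                                       # 98 x 13 D2MFCC
```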
2-2-4) Combine the three feature matrices obtained in steps 2-2-1) to 2-2-3) to obtain the material feature coefficient matrix of the training article selected in step 1):
Concatenate the extracted MFCC, DMFCC and D2MFCC feature matrices into a 98×3L (in this embodiment 98×39) feature matrix. Because the first two rows and last two rows of the DMFCC and D2MFCC feature matrices are 0, the first two rows and last two rows of the combined matrix are removed. Each column of the combined matrix represents one feature coefficient; average each column to finally obtain the 1×3L (in this embodiment 1×39) mixed MFCC sound feature matrix of the audio file of the training article and save it. This mixed MFCC sound feature matrix is the material feature coefficient matrix of the training article selected in step 1).
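Step 2-2-4) reduces to a concatenation, a trim and a column average, for example:

```python
combined = np.hstack([mfcc, dmfcc, d2mfcc])                 # 98 x 39 combined feature matrix
trimmed = combined[2:-2, :]                                 # drop first two and last two rows
feature_vector = trimmed.mean(axis=0)                       # 1 x 39 material feature coefficients
```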
3) Repeat steps 1) and 2) to collect 20 audio files of the training article selected in step 1) and extract the corresponding material feature coefficient matrices.
4) Choose A different material classes (7 classes in this embodiment) and choose B different articles of each material (5 articles per material class in this embodiment) as training articles. Repeat steps 1) to 3) to acquire the audio files of each training article and extract the corresponding material feature coefficient matrices. All material feature coefficient matrices together form the material identification expert database, which serves as the training sample set of the extreme learning machine classifier; each feature coefficient matrix is one training sample, giving R = 7×5×20 = 700 training samples in total. The 7 material classes in this embodiment are metal, plastic, textile, glass, wood, paper and ceramic.
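A sketch of how the expert database of steps 3) and 4) could be assembled from the helpers of the previous sketches; the file naming scheme and the extract_feature() wrapper are assumptions for illustration.

```python
MATERIALS = ["metal", "plastic", "textile", "glass", "wood", "paper", "ceramic"]

def extract_feature(path):
    # Wrap steps 2-1) to 2-2-4) for one audio file, reusing the helpers defined above.
    _, x = wavfile.read(path)
    x = pre_emphasis(x.astype(np.float64))
    w = frame_signal(x) * hamming
    p = np.abs(np.fft.fft(w, axis=1)) ** 2
    c = np.log(p @ fbank.T + 1e-12) @ dct_basis.T
    return np.hstack([c, delta(c), delta(delta(c))])[2:-2].mean(axis=0)

X_train, y_train = [], []
for label, material in enumerate(MATERIALS):                # 7 material classes
    for article in range(5):                                # 5 articles per class
        for take in range(20):                              # 20 recordings per article
            X_train.append(extract_feature(f"{material}_{article}_{take}.wav"))
            y_train.append(label)
X_train, y_train = np.array(X_train), np.array(y_train)     # (700, 39) samples, (700,) labels
```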
5) Train the extreme learning machine (ELM) classifier.
5-1) Build the extreme learning machine (ELM) classifier.
In this embodiment the ELM classifier comprises three layers: an input layer, a hidden layer and an output layer. The input layer has a neurons, each corresponding to one input feature of a training sample, so a = 3L (each training sample is the material feature coefficient matrix obtained in step 2), containing 39 features in total, so a = 39). The output layer has c neurons, each corresponding to one article material class of the training sample set, c = A; in this embodiment c = 7. The number of hidden-layer neurons l is assigned empirically; in this embodiment l = 2000.
5-2) Use the material identification expert database established in step 4) as the training sample set of the ELM classifier, each material feature coefficient matrix being one training sample. The training set contains R training samples in total. The input matrix X of the ELM classifier is the a×R matrix X = [x_pq], p = 1, ..., a, q = 1, ..., R,
where each column represents one training sample and each row represents one feature; there are R training samples and each sample contains a features, so X has a rows and R columns.
The target output matrix Y is the c×R matrix in which each column is the output result of one training sample; each output result contains c output values, and each output value corresponds to a different material class.
5-2) Randomly choose the connection weights w between the input layer and the hidden layer and the biases b of the hidden-layer neurons.
The connection weight matrix between the input layer and the hidden layer is the l×a matrix w = [w_σp], where w_σp is the connection weight between the p-th input neuron and the σ-th hidden neuron.
The bias vector of the hidden-layer neurons is the l×1 vector b = [b_1, b_2, ..., b_l]^T, where b_σ is the bias of the σ-th hidden neuron.
5-3) Compute the hidden-layer output matrix H.
Choose an infinitely differentiable function as the activation function g(x) of the hidden-layer neurons, and denote the connection weights between the hidden layer and the output layer by the l×c matrix β = [β_σj], σ = 1, ..., l, j = 1, ..., c.
The prediction output matrix T of the ELM classifier is then the c×R matrix T = [t_1, t_2, ..., t_R], whose q-th column is
t_q = [ Σ_{σ=1}^{l} β_σ1·g(w_σ·x_q + b_σ), ..., Σ_{σ=1}^{l} β_σc·g(w_σ·x_q + b_σ) ]^T
where w_σ = [w_σ1, w_σ2, ..., w_σa] and x_q = [x_1q, x_2q, ..., x_aq]^T.
The prediction output matrix can be written as Hβ = T', where T' is the transpose of T, and the hidden-layer output matrix H is the R×l matrix whose (q, σ) entry is g(w_σ·x_q + b_σ).
5-4) Compute the optimal connection weights between the hidden layer and the output layer.
The purpose of training the ELM classifier is to find the optimal w, b and β such that ||Hβ - T'|| is minimal. Because the activation function g(x) is infinitely differentiable, w and b can be chosen randomly at the start of training and kept constant during training, so β is obtained by solving the least-squares problem:
min_β ||Hβ - T'||
Its optimal solution is β̂ = H⁺T', where H⁺ is the Moore-Penrose generalized inverse of the hidden-layer output matrix H. The ELM classifier training is then complete.
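A sketch of the ELM training of steps 5-1) to 5-4) with a = 39 inputs, l = 2000 hidden neurons and c = 7 outputs; the sigmoid activation and one-hot targets are assumptions, since the patent only requires an infinitely differentiable g(x) and does not fix the target encoding.

```python
rng = np.random.default_rng(0)
a, l, c = 39, 2000, 7

w = rng.uniform(-1.0, 1.0, size=(l, a))                     # random input-to-hidden weights
b = rng.uniform(-1.0, 1.0, size=l)                          # random hidden biases

def g(x):                                                    # infinitely differentiable activation
    return 1.0 / (1.0 + np.exp(-x))

T_target = np.eye(c)[y_train]                                # one-hot targets, shape (700, 7) = T'
H = g(X_train @ w.T + b)                                     # hidden-layer output, shape (700, 2000)
beta = np.linalg.pinv(H) @ T_target                          # beta = H^+ T' (Moore-Penrose inverse)
```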
6) Material identification. The specific steps are as follows:
6-1) Select any article to be tested and repeat step 1): collect the knocking sound signal produced by the article and save it as a corresponding audio file.
6-2) Repeat step 2) to extract features from the audio file obtained in step 6-1), giving the material feature coefficient matrix of the article to be tested.
6-3) Input the extracted material feature coefficient matrix of the article to be tested into the ELM classifier trained in step 5). The classifier outputs a c×1 prediction value matrix for the article; the matrix contains c output values, each corresponding to one article material class, and the article material class corresponding to the maximum output value is the material identification result of the article to be tested.
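Step 6) then amounts to extracting the same 39 features from the test recording and taking the arg-max of the classifier output; the file name is hypothetical.

```python
x_test = extract_feature("unknown_item.wav")                 # 1 x 39 feature vector of the test article
scores = g(x_test @ w.T + b) @ beta                          # c = 7 output prediction values
print("Predicted material:", MATERIALS[int(np.argmax(scores))])
```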
The article material identification device based on the method described above comprises a microphone sound collection pen and a computer connected by Bluetooth. The structure of the microphone sound collection pen is shown in Fig. 3 and comprises a metal tapping rod (1-1), a microphone sensor (1-2), a sound analog-to-digital conversion module (1-3), a Bluetooth transmission module (1-4), a display module (1-5) and a pen body (1-6).
The microphone sensor (1-2), sound analog-to-digital conversion module (1-3), Bluetooth transmission module (1-4) and display module (1-5) are all installed inside the pen body (1-6); one end of the metal tapping rod (1-1) is placed inside the pen body (1-6) and the other end outside it. The metal tapping rod (1-1) taps the surface of an article to produce a knocking sound signal; the microphone sensor (1-2) collects the knocking sound signal and passes it to the sound analog-to-digital conversion module (1-3), which converts it into a digital sound signal and passes it to the Bluetooth transmission module (1-4); the Bluetooth transmission module (1-4) uploads the digital sound signal to the computer via Bluetooth, where it is saved as an audio file; the computer identifies the audio file, outputs the article material identification result and returns it via Bluetooth to the Bluetooth transmission module (1-4), which shows the identification result to the user through the display module (1-5).
The microphone sensor in the device of the invention can be of an ordinary type; in this embodiment a Raspberry Pi B+2 is used, the computer can likewise be of an ordinary type, and the remaining parts are conventional components.

Claims (2)

  1. An article material identification method based on sound characteristics, characterized by comprising the following steps:
    1) Collect audio files.
    Select a training article and tap its surface with the microphone sound collection pen to produce a knocking sound signal; the knocking sound signal is converted into a digital audio signal and uploaded to the computer, where it is saved as an audio file; the sampling frequency of each audio file is 44100 Hz, the recording duration is 1 s, the sampling precision is 16 bit, and the file is saved in wav format, giving 44100 data points;
    2) Feature extraction, specifically comprising the following steps:
    2-1) Pre-process the audio data of the training article's audio file obtained in step 1); the specific steps are as follows:
    2-1-1) Pre-emphasis: pass the audio data of the audio file obtained in step 1) through a high-pass filter:
    H(z) = 1 - μz⁻¹
    where H(z) is the high-pass filter transfer function, the pre-emphasis coefficient μ is 0.97, and z denotes the z-transform variable;
    2-1-2) Framing: divide the pre-emphasized audio data into frames; the frame length is 1104 data points, corresponding to a frame duration of 1104/44100*1000 = 25 ms, and the frame shift is 441 data points, corresponding to 441/44100*1000 = 10 ms; the audio data of the audio file is thus divided into 98 frames;
    2-1-3) Windowing: let the audio data of any frame after framing be x_i(n), n = 0, 1, ..., N-1, where N is the number of data points per frame, N = 1104, and i denotes the i-th frame, i = 1, 2, ..., 98; multiplying by a Hamming window gives x'_i(n) = x_i(n) × W(n);
    the Hamming window is:
    W(n) = (1 - α) - α·cos[2πn/(N-1)], 0 ≤ n ≤ N-1
    where W(n) is the Hamming window function and the Hamming window coefficient α = 0.46;
    2-2) Extract features from the pre-processed audio data of the audio file; the extracted features include: the static sound feature, the MFCC coefficients; the dynamic sound feature, the MFCC first-order difference coefficients DMFCC; and the dynamic sound feature, the MFCC second-order difference coefficients D2MFCC; the specific steps are as follows:
    2-2-1) Extract the static MFCC feature coefficients:
    2-2-1-1) Apply the fast Fourier transform (FFT) to each pre-processed frame to obtain its spectrum:
    X_i(k) = Σ_{n=0}^{N-1} x'_i(n) e^{-j2πnk/N}, 0 ≤ k ≤ N-1
    where X_i(k) is the spectrum of the i-th frame, k is the FFT bin index within the i-th frame, and x'_i(n) is the windowed frame obtained in step 2-1-3);
    2-2-1-2) Compute the power spectrum P_im of each frame:
    P_im = |X_i(k)|²
    2-2-1-3) Filter the power spectrum P_im of each frame with a Mel triangular filter bank containing M filters; the Mel triangular filter bank is defined by:
    H_m(k) = 0, for k < f(m-1);
    H_m(k) = 2(k - f(m-1)) / [(f(m+1) - f(m-1))(f(m) - f(m-1))], for f(m-1) ≤ k ≤ f(m);
    H_m(k) = 2(f(m+1) - k) / [(f(m+1) - f(m-1))(f(m) - f(m-1))], for f(m) ≤ k ≤ f(m+1);
    H_m(k) = 0, for k > f(m+1)
    where f(m) is the centre frequency of the m-th Mel filter, 0 ≤ m ≤ M-1;
    2-2-1-4) Compute the log energy output by the Mel triangular filter bank for each frame:
    Q_i(m) = ln( Σ_{k=0}^{N-1} P_im · H_m(k) ), 0 ≤ m ≤ M-1
    2-2-1-5) Apply the discrete cosine transform to the log energies obtained in step 2-2-1-4) to obtain the Mel-frequency cepstral coefficients (MFCC) of each frame:
    C_i(η) = Σ_{m=0}^{M-1} Q_i(m) · cos( πη(m - 0.5)/M ), η = 1, 2, ..., L
    where C_i(η) is the η-th order MFCC coefficient of the i-th frame and L is the total number of MFCC orders; each audio file of a training article yields a 98×L MFCC feature matrix;
    2-2-2) Extract the dynamic MFCC first-order difference coefficients DMFCC;
    the first-order difference coefficients DMFCC are computed as:
    D_i(η) = C_i(η+1) - C_i(η), for η < θ;
    D_i(η) = (1 / sqrt(2·Σ_{ν=1}^{θ} ν²)) · Σ_{ν=1}^{θ} ν·(C_i(η+ν) - C_i(η-ν)), for θ ≤ η < L - θ;
    D_i(η) = C_i(η) - C_i(η-1), for η ≥ L - θ
    where D_i(η) is the first-order difference of the η-th order MFCC coefficient of the i-th frame and θ is the time difference of the first derivative; each audio file of a training article yields a 98×L DMFCC feature matrix;
    2-2-3) Extract the dynamic MFCC second-order difference coefficients D2MFCC;
    the second-order difference coefficients D2MFCC are computed as:
    D_{i2}(η) = D_i(η+1) - D_i(η), for η < ω;
    D_{i2}(η) = (1 / sqrt(2·Σ_{ν=1}^{ω} ν²)) · Σ_{ν=1}^{ω} ν·(D_i(η+ν) - D_i(η-ν)), for ω ≤ η < L - ω;
    D_{i2}(η) = D_i(η) - D_i(η-1), for η ≥ L - ω
    where D_{i2}(η) is the second-order difference of the η-th order MFCC coefficient of the i-th frame and ω is the time difference of the second derivative; each audio file of a training article yields a 98×L D2MFCC feature matrix;
    2-2-4) Combine the three feature matrices obtained in steps 2-2-1) to 2-2-3) to obtain the material feature coefficient matrix of the training article selected in step 1):
    concatenate the extracted MFCC, DMFCC and D2MFCC feature matrices into a 98×3L feature matrix, remove the first two rows and last two rows of the combined matrix, and average each column of the combined matrix (each column representing one feature coefficient) to finally obtain the 1×3L mixed MFCC sound feature matrix of the audio file of the training article and save it; this mixed MFCC sound feature matrix is the material feature coefficient matrix of the training article selected in step 1);
    3) Repeat steps 1) and 2) to collect 20 audio files of the training article selected in step 1) and extract the corresponding material feature coefficient matrices;
    4) Choose A different material classes, and in each class choose B different articles as training articles; repeat steps 1) to 3) to acquire the audio files of every training article and extract the corresponding material feature coefficient matrices, giving A×B×20 material feature coefficient matrices in total; with each matrix as one training sample, all training samples form the material identification expert database;
    5) Train the extreme learning machine (ELM) classifier;
    5-1) Build the ELM classifier;
    the ELM classifier comprises three layers: an input layer, a hidden layer and an output layer; the input layer has a neurons, each corresponding to one input feature of a training sample, so a = 3L; the output layer has c neurons, each corresponding to one article material class in the material identification expert database obtained in step 4), so c = A; the number of hidden-layer neurons is set to l;
    5-2) The input matrix X of the ELM classifier is the a×R matrix X = [x_pq], p = 1, ..., a, q = 1, ..., R, in which each column represents one training sample and each row represents one feature; there are R training samples in total and each sample contains a features, so X has a rows and R columns;
    the target output matrix Y is the c×R matrix in which each column is the output result of one training sample; each output result contains c output values, and each output value corresponds to a different material class;
    5-2) Randomly choose the connection weights w between the input layer and the hidden layer and the biases b of the hidden-layer neurons;
    the connection weight matrix between the input layer and the hidden layer is the l×a matrix w = [w_σp], where w_σp is the connection weight between the p-th input neuron and the σ-th hidden neuron;
    the bias vector of the hidden-layer neurons is the l×1 vector b = [b_1, b_2, ..., b_l]^T, where b_σ is the bias of the σ-th hidden neuron;
    5-3) Compute the hidden-layer output matrix H;
    choose an infinitely differentiable function as the activation function g(x) of the hidden-layer neurons, and denote the connection weights between the hidden layer and the output layer by the l×c matrix β = [β_σj], σ = 1, ..., l, j = 1, ..., c;
    the prediction output matrix T of the ELM classifier is the c×R matrix T = [t_1, t_2, ..., t_R], whose q-th column is
    t_q = [ Σ_{σ=1}^{l} β_σ1·g(w_σ·x_q + b_σ), ..., Σ_{σ=1}^{l} β_σc·g(w_σ·x_q + b_σ) ]^T
    where w_σ = [w_σ1, w_σ2, ..., w_σa] and x_q = [x_1q, x_2q, ..., x_aq]^T;
    the prediction output matrix can be written as Hβ = T', where T' is the transpose of T, and the hidden-layer output matrix H is the R×l matrix whose (q, σ) entry is g(w_σ·x_q + b_σ);
    5-4) Compute the optimal connection weights between the hidden layer and the output layer;
    β is obtained by solving the least-squares problem:
    min_β ||Hβ - T'||
    its optimal solution is β̂ = H⁺T', where H⁺ is the Moore-Penrose generalized inverse of the hidden-layer output matrix H; the ELM classifier training is then complete;
    6) Material identification; the specific steps are as follows:
    6-1) Select any article to be tested and repeat step 1): collect the knocking sound signal produced by the article and save it as a corresponding audio file;
    6-2) Repeat step 2) to extract features from the audio file obtained in step 6-1), giving the material feature coefficient matrix of the article to be tested;
    6-3) Input the extracted material feature coefficient matrix of the article to be tested into the ELM classifier trained in step 5); the classifier outputs a c×1 prediction value matrix for the article; the prediction value matrix contains c output values, each corresponding to one article material class, and the article material class corresponding to the maximum output value is the material identification result of the article to be tested.
  2. An article material identification device based on the method of claim 1, characterized by comprising: a microphone sound collection pen and a computer connected by Bluetooth; the microphone sound collection pen comprises a metal tapping rod, a microphone sensor, a sound analog-to-digital conversion module, a Bluetooth transmission module, a display module and a pen body; the microphone sensor, sound analog-to-digital conversion module, Bluetooth transmission module and display module are all installed inside the pen body; one end of the metal tapping rod is placed inside the pen body and the other end outside it; the metal tapping rod is used to tap the surface of an article to produce a knocking sound signal; the microphone sensor collects the knocking sound signal and passes it to the sound analog-to-digital conversion module, which converts it into a digital sound signal and passes it to the Bluetooth transmission module; the Bluetooth transmission module uploads the digital sound signal via Bluetooth to the computer, where it is saved as an audio file; the computer identifies the audio file, outputs the article material identification result and returns it via Bluetooth to the Bluetooth transmission module, which shows the identification result to the user through the display module.
CN201710575310.XA, filed 2017-07-14, priority date 2017-07-14: Article material identification method and device based on sound characteristics (Active; granted as CN107545902B)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710575310.XA CN107545902B (en) 2017-07-14 2017-07-14 Article material identification method and device based on sound characteristics


Publications (2)

Publication Number Publication Date
CN107545902A 2018-01-05
CN107545902B 2020-06-02

Family

ID=60971008

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710575310.XA Active CN107545902B (en) 2017-07-14 2017-07-14 Article material identification method and device based on sound characteristics

Country Status (1)

Country Link
CN (1) CN107545902B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160104486A1 (en) * 2011-04-22 2016-04-14 Angel A. Penilla Methods and Systems for Communicating Content to Connected Vehicle Users Based Detected Tone/Mood in Voice Input
CN103854646A (en) * 2014-03-27 2014-06-11 成都康赛信息技术有限公司 Method for classifying digital audio automatically
CN106356077A (en) * 2016-08-29 2017-01-25 北京理工大学 Laughter detection method and device
CN106531153A (en) * 2016-10-27 2017-03-22 天津大学 Chinese opera classification method based on extraction of singing segments and spoken parts
CN106682574A (en) * 2016-11-18 2017-05-17 哈尔滨工程大学 One-dimensional deep convolution network underwater multi-target recognition method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JI, SHANSHAN ET AL.: "Soft measurement of ball mill fill level based on Mel-frequency cepstral coefficients" (基于梅尔频率倒谱系数的球磨机料位软测量), Computer Simulation (《计算机仿真》) *
ZHAO, TUO: "Research on identification methods for excavation equipment based on spectral dynamic features and ELM" (基于频谱动态特征和ELM的挖掘设备识别方法研究), Master's thesis, Hangzhou Dianzi University (《杭州电子科技大学硕士论文》) *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108520758A (en) * 2018-03-30 2018-09-11 清华大学 A kind of audio visual cross-module state object material search method and system
CN108520758B (en) * 2018-03-30 2021-05-07 清华大学 Visual-auditory cross-modal object material retrieval method and system
CN109870697A (en) * 2018-12-27 2019-06-11 东莞理工学院 A kind of object detection and classification method based on ultrasonic acoustic
CN111160352A (en) * 2019-12-27 2020-05-15 创新奇智(北京)科技有限公司 Workpiece metal surface character recognition method and system based on image segmentation
CN111160352B (en) * 2019-12-27 2023-04-07 创新奇智(北京)科技有限公司 Workpiece metal surface character recognition method and system based on image segmentation
CN113514544A (en) * 2020-12-29 2021-10-19 大连理工大学 Mobile robot pavement material identification method based on sound characteristics
CN113345443A (en) * 2021-04-22 2021-09-03 西北工业大学 Marine mammal vocalization detection and identification method based on mel-frequency cepstrum coefficient
CN115484342A (en) * 2021-06-15 2022-12-16 南宁富联富桂精密工业有限公司 Indoor positioning method, mobile terminal and computer readable storage medium
CN113671031A (en) * 2021-08-20 2021-11-19 北京房江湖科技有限公司 Wall hollowing detection method and device
CN114429771A (en) * 2022-04-02 2022-05-03 武汉地震工程研究院有限公司 Intelligent detection method and system for bonding defects of steel beam and CFRP (carbon fiber reinforced plastics) plate

Also Published As

Publication number Publication date
CN107545902B (en) 2020-06-02

Similar Documents

Publication Publication Date Title
CN107545902A (en) A kind of article Material Identification method and device based on sound characteristic
CN109036465B (en) Speech emotion recognition method
CN108766419A (en) A kind of abnormal speech detection method based on deep learning
CN109599129A (en) Voice depression recognition methods based on attention mechanism and convolutional neural networks
CN106847309A (en) A kind of speech-emotion recognition method
CN109524014A (en) A kind of Application on Voiceprint Recognition analysis method based on depth convolutional neural networks
CN108630209B (en) Marine organism identification method based on feature fusion and deep confidence network
CN108550375A (en) A kind of emotion identification method, device and computer equipment based on voice signal
CN109065072A (en) A kind of speech quality objective assessment method based on deep neural network
CN109493886A (en) Speech-emotion recognition method based on feature selecting and optimization
CN100585617C (en) Based on sorter integrated face identification system and method thereof
CN106782511A (en) Amendment linear depth autoencoder network audio recognition method
CN1197526A (en) Speaker verification system
CN103456302B (en) A kind of emotional speaker recognition method based on the synthesis of emotion GMM Model Weight
Wang et al. Research on speech emotion recognition technology based on deep and shallow neural network
CN110491415A (en) A kind of speech-emotion recognition method based on convolutional neural networks and simple cycle unit
CN109147774A (en) A kind of improved Delayed Neural Networks acoustic model
CN109036468A (en) Speech-emotion recognition method based on deepness belief network and the non-linear PSVM of core
Qiao et al. Sub-spectrogram segmentation for environmental sound classification via convolutional recurrent neural network and score level fusion
CN115565550A (en) Baby crying emotion identification method based on characteristic diagram light convolution transformation
Wu et al. The DKU-LENOVO Systems for the INTERSPEECH 2019 Computational Paralinguistic Challenge.
Zhao et al. Transferring age and gender attributes for dimensional emotion prediction from big speech data using hierarchical deep learning
CN112133326A (en) Gunshot data amplification and detection method based on antagonistic neural network
CN114464159A (en) Vocoder voice synthesis method based on half-flow model
Ramani et al. Autoencoder based architecture for fast & real time audio style transfer

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant