CN111724796B

CN111724796B - Musical instrument sound identification method and system based on deep pulse neural network

Info

Publication number: CN111724796B
Application number: CN202010572964.9A
Authority: CN
Inventors: 唐华锦; 文湘兰; 潘纲
Original assignee: Zhejiang University ZJU; Zhejiang Lab
Current assignee: Zhejiang University ZJU; Zhejiang Lab
Priority date: 2020-06-22
Filing date: 2020-06-22
Publication date: 2023-01-13
Anticipated expiration: 2040-06-22
Also published as: CN111724796A

Abstract

The invention relates to a musical instrument sound recognition method and system based on a deep pulse (spiking) neural network, belonging to the technical field of sound recognition. The method comprises the following steps: constructing a training data set; constructing a variable resistance according to the characteristics of the membrane voltage change of the leaky integrate-and-fire (LIF) model; constructing an LIF model with variable resistance according to the variable resistance; constructing a deep pulse neural network model based on the variable resistance, in which each neuron adopts the LIF model with variable resistance as its neuron model; training and optimizing the deep pulse neural network model based on the variable resistance with the training data set to obtain a trained and optimized model; and inputting the sound pulse sequence of the musical instrument to be recognized into the trained and optimized model to determine the name of the instrument that produced the sound. The method and system achieve higher accuracy in instrument recognition.

Description

Musical instrument sound identification method and system based on deep pulse neural network

Technical Field

The invention relates to the technical field of sound recognition, in particular to a musical instrument sound recognition method and system based on a deep pulse neural network.

Background

Existing deep neural networks are mainly trained with the Back Propagation (BP) algorithm, whereas back propagation in a deep pulse (spiking) neural network must be carried out in time and in space simultaneously. In 2018, Wu et al. proposed the STBP algorithm, which constructs an iterative LIF (Leaky Integrate-and-Fire) model and performs spatio-temporal back propagation using an approximate derivative of the firing function, taking both temporal and spatial relationships into account during training. The method achieves good performance on both the static MNIST and the dynamic N-MNIST datasets without additional training tricks. In the same year, Sumit et al. analyzed the derivative of the firing function, approximated it with a Gaussian probability function, and, based on the SRM (Spike Response Model), used back propagation to assign credit in space and time simultaneously in order to adjust the weights. The method achieves excellent performance on DVS (Dynamic Vision Sensor) data. In 2019, Gu et al. built on STBP, proposing an iterative current-based LIF model together with the concept of pulse clusters; they defined the loss function through pulse clusters and, combined with the approximate derivative of the firing function, carried out back propagation in space and time. This obtained good performance on the MNIST dataset and, combined with a pulse coding method, also on musical instrument sound recognition.

However, comparing the biologically accurate HH (Hodgkin-Huxley) model with the LIF model shows that the resistance, which the LIF model treats as a constant, is actually a variable. A deep neural network based on the LIF model therefore lacks a degree of biological accuracy and has room for improvement.

Disclosure of Invention

The invention aims to provide a musical instrument sound recognition method and system based on a deep pulse neural network, in which a leaky integrate-and-fire (LIF) model with variable resistance is adopted as the neuron model of the deep pulse neural network, so that higher accuracy is achieved in instrument recognition.

In order to achieve the purpose, the invention provides the following scheme:

a musical instrument sound identification method based on a deep pulse neural network comprises the following steps:

constructing a training data set; the training data set comprises a sound pulse sequence and an instrument name pulse sequence;

constructing a variable resistance according to the characteristics of the membrane voltage change of the leaky integrate-and-fire (LIF) model;

constructing an LIF model with variable resistance according to the variable resistance;

constructing a deep pulse neural network model based on the variable resistance; the deep pulse neural network model based on the variable resistance comprises an input layer consisting of a plurality of input neurons, a feature layer consisting of a plurality of feature neurons, and a recognition layer consisting of a plurality of instrument neurons; the input neurons, the feature neurons and the instrument neurons all adopt the LIF model with variable resistance as their neuron model;

training and optimizing the deep pulse neural network model based on the variable resistance by using the training data set to obtain a trained and optimized deep pulse neural network model based on the variable resistance; the output of the trained and optimized deep pulse neural network model based on the variable resistance is the name of the musical instrument;

carrying out pulse coding on the sound of the musical instrument to be identified to obtain a sound pulse sequence of the musical instrument to be identified;

and inputting the sound pulse sequence of the musical instrument to be identified into the trained and optimized deep pulse neural network model based on the variable resistance, and determining the name of the musical instrument that produced the sound to be identified.

Optionally, the constructing a training data set specifically includes:

acquiring sound data and the name of a musical instrument to which the sound data belongs;

and respectively carrying out pulse coding on the sound data and the musical instrument name to obtain a sound pulse sequence and a musical instrument name pulse sequence.

Optionally, constructing the variable resistance according to the characteristics of the membrane voltage change of the LIF model specifically includes:

according to the formula

constructing a variable resistance R(t), where α is a constant, V(t-1) is the membrane voltage of the input neuron, the feature neuron or the instrument neuron at time t-1, θ is the pulse threshold, and R₀ is the initial resistance.

Optionally, constructing the LIF model with variable resistance according to the variable resistance specifically includes:

according to the system of equations

constructing the LIF model with variable resistance, where V is the membrane voltage of neuron i in layer n at time t, U is the membrane voltage change caused by the sound pulse sequence or the instrument name pulse sequence, M is the membrane decay, S is the synaptic decay, E is the firing suppression amount, β_m and β_s are decay constants, O is the sound pulse sequence or the instrument name pulse sequence, g(x) is the firing function, V₀ is the normalization quantity, w is the weight, and N is the maximum number of layers.

Optionally, the deep pulse neural network model based on the variable resistance specifically includes:

an input layer consisting of a plurality of input neurons, the input layer being used to acquire the sound pulse sequence;

the characteristic layer is connected with the input layer and is used for judging whether the sound pulse sequence has the characteristics represented by the characteristic neurons;

and the identification layer is connected with the characteristic layer and is used for determining the name of the instrument to which the sound pulse sequence belongs according to the characteristics of the sound pulse sequence.

Optionally, the training and optimizing the deep pulse neural network model based on the variable resistance by using the training data set specifically includes:

constructing a Tempotron-like loss function based on Tempotron, a single-layer learning algorithm for pulse neural networks; the Tempotron-like loss function is:

where V_max = V_N(t*), t* is the time at which the membrane voltage of the instrument neuron is maximal, N is the index of the instrument neuron, θ is the pulse threshold at which a possible identification becomes a definite one, and L represents how correct the instrument name inferred by the deep pulse neural network model for the sound pulse sequence is;

inputting the training data set into the deep pulse neural network model based on the variable resistance, calculating the error between the actual maximum membrane voltage and the target membrane voltage according to the Tempotron-like loss function, and using a spatio-temporal back propagation algorithm to distribute the error to the weights of the neurons of each layer through the spatio-temporal relationships between the membrane voltages, so as to adjust the weights of the deep pulse neural network.

In order to achieve the purpose, the invention also provides the following scheme:

a musical instrument sound recognition system based on a deep pulse neural network, comprising:

the training data set construction module is used for constructing a training data set; the training data set comprises a sound pulse sequence and an instrument name pulse sequence;

the variable resistance construction module is used for constructing a variable resistance according to the characteristics of the membrane voltage change of the leaky integrate-and-fire (LIF) model;

the LIF model construction module is used for constructing an LIF model with variable resistance according to the variable resistance;

the pulse neural network model construction module is used for constructing a deep pulse neural network model based on the variable resistance; the deep pulse neural network model based on the variable resistance comprises an input layer consisting of a plurality of input neurons, a feature layer consisting of a plurality of feature neurons and a recognition layer consisting of a plurality of instrument neurons; the input neurons, the feature neurons and the instrument neurons all adopt the LIF model with variable resistance as their neuron model;

the model training optimization module is used for training and optimizing the deep pulse neural network model based on the variable resistance by using the training data set to obtain a trained and optimized deep pulse neural network model based on the variable resistance; the output of the trained and optimized deep pulse neural network model based on the variable resistance is the name of the musical instrument;

the instrument sound coding module is used for carrying out pulse coding on the sound of the musical instrument to be identified to obtain a sound pulse sequence of the musical instrument to be identified;

and the musical instrument sound identification module is used for inputting the sound pulse sequence of the musical instrument to be identified into the trained and optimized deep pulse neural network model based on the variable resistance and determining the name of the musical instrument that produced the sound to be identified.

Optionally, the training data set constructing module specifically includes:

the training data acquisition unit is used for acquiring sound data and the names of the musical instruments to which the sound data belong;

and the training data coding unit is used for respectively carrying out pulse coding on the sound data and the musical instrument name to obtain a sound pulse sequence and a musical instrument name pulse sequence.

Optionally, the variable resistance construction module specifically includes:

a variable resistance construction unit for constructing the variable resistance according to a formula

Optionally, the LIF model construction module specifically includes:

an LIF model construction unit for constructing the LIF model with variable resistance according to a system of equations

where V is the membrane voltage of neuron i in layer n at time t, U is the membrane voltage change caused by the sound pulse sequence or the instrument name pulse sequence, M is the membrane decay, and S is the synaptic decay.

According to the specific embodiments provided by the invention, the invention discloses the following technical effects. The invention discloses a musical instrument sound recognition method and system based on a deep pulse neural network, in which the LIF model with variable resistance is adopted as the neuron model of the deep pulse neural network. Building on the current-based LIF model, the LIF model with variable resistance establishes a relation between a neuron's present resistance and its membrane voltage at the previous moment. As a result, when a neuron receives an input pulse, the membrane voltage change produced by that pulse is no longer a fixed value but varies dynamically with the neuron's current state: neurons at different membrane voltages respond differently to the same input pulse, so a neuron's response depends on the information it received earlier. Because the membrane voltage at the previous moment is taken into account when the network weights are adjusted at each moment, the deep pulse neural network can make better use of temporal changes during learning and achieves higher accuracy in instrument recognition.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings required in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without creative efforts.

FIG. 1 is a flow chart of Embodiment 1 of the musical instrument sound recognition method based on a deep pulse neural network according to the present invention;

FIG. 2 is a flow chart of recognition using the learnable deep pulse neural network of the present invention in combination with a pulse neural network coding scheme;

FIG. 3 is a structural diagram of the multi-layer deep network used in the deep pulse neural network for processing musical instrument sound information according to the present invention;

FIG. 4 is a simplified diagram of a deep pulse neural network model for processing musical instrument sound information in accordance with the present invention;

FIG. 5 is a flowchart of one iteration of calculating the gradient of the weight matrix W according to the present invention;

FIG. 6 is a graph showing the harmonic accuracy variation of the instrument voice recognition method based on the deep pulse neural network according to the present invention;

FIG. 7 is a block diagram of an embodiment of the system for musical instrument voice recognition based on deep pulse neural network according to the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.

Example 1:

Fig. 1 is a flow chart of Embodiment 1 of the musical instrument sound recognition method based on a deep pulse neural network according to the present invention. Referring to fig. 1, the musical instrument sound recognition method based on a deep pulse neural network includes:

step 101: constructing a training data set; the training data set includes a sound pulse sequence and an instrument name pulse sequence.

In step 101, constructing the training data set specifically includes:

acquiring sound data and the name of the musical instrument to which the sound data belongs, the sound data including sound features;

and pulse-coding the sound data and the instrument name respectively to obtain a sound pulse sequence and an instrument name pulse sequence, the sound pulse sequence including the sound features.

Step 102: construct the variable resistance according to the characteristics of the membrane voltage change of the LIF model.

In step 102, constructing the variable resistance according to the characteristics of the membrane voltage change of the LIF model specifically includes:

according to the formula

constructing a variable resistance R(t), where α is a constant, V(t-1) is the membrane voltage of the input neuron, the feature neuron, or the instrument neuron at time t-1, θ is the pulse threshold, and R₀ is the initial resistance.
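The formula itself is not reproduced in this text, so the following Python sketch uses an assumed stand-in, R(t) = R₀·(1 − α·V(t−1)/θ). It was chosen only because it returns R₀ at zero membrane voltage and shrinks as the membrane voltage rises toward the threshold; the function name and default values are illustrative and are not taken from the patent.

```python
def variable_resistance(v_prev: float, r0: float = 1.0,
                        alpha: float = 0.5, theta: float = 1.0) -> float:
    """Assumed stand-in for R(t); NOT the patent's formula."""
    # R equals r0 when the previous membrane voltage is 0 and
    # decreases linearly as v_prev approaches the threshold theta.
    return r0 * (1.0 - alpha * v_prev / theta)


if __name__ == "__main__":
    for v in (0.0, 0.5, 1.0):
        print(f"V(t-1)={v:.1f} -> R(t)={variable_resistance(v):.2f}")
```

Under this assumption the same input current produces a smaller membrane voltage change in a neuron that is already close to its threshold.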

Step 103: construct the LIF model with variable resistance according to the variable resistance.

In step 103, constructing the LIF model with variable resistance according to the variable resistance specifically includes:

according to the system of equations

constructing the LIF model with variable resistance, where V is the membrane voltage of neuron i in layer n at time t, U is the membrane voltage change caused by the sound pulse sequence or the instrument name pulse sequence, M is the membrane decay, S is the synaptic decay, E is the firing suppression amount, β_m and β_s are decay constants, O is the sound pulse sequence or the instrument name pulse sequence, g(x) is the firing function, V₀ is a normalization quantity, w is a weight, and N is the maximum number of layers.
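Since the system of equations is not reproduced in this text, the sketch below illustrates only the general idea with an assumed discrete-time, current-based LIF update: a synaptic current decays by β_s, the membrane voltage decays by β_m, the input is scaled by a resistance that depends on the previous membrane voltage, and the voltage is reset after firing. The update form, the resistance formula, the reset rule and all default values are assumptions, not the patent's equations.

```python
from typing import List, Tuple


def variable_resistance(v_prev: float, r0: float = 1.0,
                        alpha: float = 0.5, theta: float = 1.0) -> float:
    # Assumed form (not the patent's formula): R0 at V = 0, smaller as V grows.
    return r0 * (1.0 - alpha * v_prev / theta)


def simulate_neuron(inputs: List[List[int]], weights: List[float],
                    beta_m: float = 0.9, beta_s: float = 0.5,
                    theta: float = 1.0) -> Tuple[List[float], List[int]]:
    """Run one neuron over a pulse sequence; returns (voltages, output pulses)."""
    v = 0.0        # membrane voltage carried over to the next step
    current = 0.0  # synaptic current
    voltages, pulses = [], []
    for pulses_in in inputs:
        drive = sum(w * o for w, o in zip(weights, pulses_in))
        current = beta_s * current + drive       # synaptic decay plus new input
        r = variable_resistance(v, theta=theta)  # depends on the previous voltage
        v = beta_m * v + r * current             # membrane decay plus scaled input
        fired = 1 if v >= theta else 0
        voltages.append(v)
        pulses.append(fired)
        if fired:
            v = 0.0                              # assumed hard reset after firing
    return voltages, pulses


if __name__ == "__main__":
    volts, out = simulate_neuron([[1, 1]] * 4, weights=[0.3, 0.3])
    print([round(x, 3) for x in volts], out)
```

In this sketch the second time step uses a resistance below R₀ because the neuron already holds a non-zero membrane voltage, which is the state dependence the variable resistance is meant to introduce.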

Step 104: construct a deep pulse neural network model based on the variable resistance. The model comprises an input layer consisting of a plurality of input neurons, a feature layer consisting of a plurality of feature neurons, and a recognition layer consisting of a plurality of instrument neurons; the input neurons, the feature neurons, and the instrument neurons all adopt the LIF model with variable resistance as their neuron model.

In step 104, the deep pulse neural network model based on the variable resistance specifically includes:

an input layer consisting of a plurality of input neurons, the input layer being used to acquire the sound pulse sequence.

And the characteristic layer is connected with the input layer and is used for judging whether the sound pulse sequence has the characteristics represented by the characteristic neurons.

The feature layer comprises one or more layers.

When there is a single feature layer, each feature neuron represents one feature and is connected, with its own weights, to all the input neurons in the input layer; each feature neuron performs a forward weighted calculation on the pulses of the input neurons and judges whether the sound pulse sequence has the feature it represents.

When there are several feature layers, each feature layer consists of a plurality of feature neurons, and each feature neuron represents one feature. Each feature neuron in the first feature layer is connected, with its own weights, to all the input neurons in the input layer, performs a forward weighted calculation on their pulses, and judges whether the sound pulse sequence has the feature it represents. Each feature neuron in the remaining feature layers is connected, with its own weights, to all the feature neurons in the previous layer, performs a forward weighted calculation on their pulses, integrates the features passed on by the previous layer, and judges whether the sound pulse sequence has more complex features.
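The forward weighted calculation described above can be sketched for one time step as a weighted sum over the pulses of the previous layer. This is only a minimal illustration of the full connectivity; the function name and the weight values are hypothetical.

```python
from typing import List


def layer_drive(weights: List[List[float]], pulses: List[int]) -> List[float]:
    """weights[i][j] connects neuron j of the previous layer to feature neuron i."""
    return [sum(w_ij * o_j for w_ij, o_j in zip(row, pulses)) for row in weights]


if __name__ == "__main__":
    w = [[0.2, 0.5, 0.1],   # weights of feature neuron 0
         [0.4, 0.0, 0.3]]   # weights of feature neuron 1
    # Previous-layer neurons 0 and 2 emitted a pulse at this time step.
    print(layer_drive(w, [1, 0, 1]))
```

Each returned value is the input that the corresponding feature neuron would then integrate according to its neuron model.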

Step 105: train and optimize the deep pulse neural network model based on the variable resistance using the training data set to obtain a trained and optimized model, whose output is the name of the musical instrument.

In step 105, training and optimizing the deep pulse neural network model based on the variable resistance using the training data set specifically includes:

in the Tempotron-like loss function, V_max = V_N(t*), where t* is the time at which the membrane voltage of the instrument neuron is maximal, N is the index of the instrument neuron, θ is the pulse threshold at which a possible identification becomes a definite one, and L represents how correct the instrument name inferred by the deep pulse neural network model for the sound pulse sequence is.

Step 106: pulse-code the sound of the musical instrument to be identified to obtain a sound pulse sequence of the musical instrument to be identified.

Step 107: input the sound pulse sequence of the musical instrument to be identified into the trained and optimized deep pulse neural network model based on the variable resistance, and determine the name of the musical instrument that produced the sound to be identified.

Example 2:

Comparing the HH model with the LIF model shows that the resistance, which the LIF model treats as a constant, is actually a variable. On this basis the invention proposes a current-based LIF model with variable resistance, combines it with a Tempotron-like loss function, and uses spatio-temporal back propagation, replacing the derivative of the firing function with an approximate derivative during propagation to overcome its non-differentiability, thereby constructing a new deep pulse neural network. The new model changes the structure of the pulse neural network, so the back propagation of the network must be re-derived. In addition, the variable resistance increases the number of adjustable network parameters, and the difficulty lies in selecting suitable parameters for experimental use. The network is then applied to instrument recognition to evaluate its performance.

The invention provides a musical instrument sound recognition method based on a deep pulse neural network, which uses the structure and algorithm of a learnable deep pulse neural network to recognize instrument sounds. Combined with a pulse neural network coding scheme, the learnable deep pulse neural network can be applied to practical problems such as image recognition and sound recognition; its flow chart is shown in fig. 2. Applied to musical instrument sound recognition, the method comprises the following specific steps:

Step 1: construct a deep pulse neural network for processing musical instrument sound information.

The specific construction steps are as follows:

Step 1.1: a multi-layer deep network structure is adopted according to the actual situation; the structure is shown in fig. 3.

Each layer consists of a number of neurons that update their membrane voltage and fire according to the rules of the neuron model. The first layer is an input layer consisting of a plurality of input neurons, each of which emits the pulses of the encoded sound. The intermediate layers are feature layers, of which there may be several; each consists of a plurality of feature neurons, and each feature neuron is connected, with its own weights, to all neurons in the previous layer (the input layer for the first feature layer, the preceding feature layer otherwise). Each feature neuron performs a forward weighted calculation on the pulses of the neurons in the previous layer to judge whether the input sound has the feature it represents. When there are several feature layers, the input of a feature neuron in one layer is the output of the feature neurons in the layer before it, so each layer builds its judgment on the judgment of the previous layer. The last layer is a recognition layer composed of a plurality of instrument neurons, each representing one musical instrument; if an instrument neuron emits a pulse, the sound is taken to have been produced by that instrument. For a given sound, the network passes the sound from the input layer to the feature layers, the feature layers judge whether the sound has certain features, and the recognition layer identifies the instrument from those features.

Step 1.2: the neuron model adopted by the neurons of every layer is the variable-resistance LIF model proposed by this method. The model establishes a relation between the present resistance and the membrane voltage at the previous moment, so that when a neuron receives an input pulse, the resulting membrane voltage change is no longer a fixed value but varies dynamically with the neuron's current state. In other words, neurons at different membrane voltages produce different membrane voltage changes on receiving the same input pulse, which lets a neuron respond to the same input differently depending on the information it received earlier. Moreover, the resistance of the biologically accurate HH (Hodgkin-Huxley) neuron model varies and is voltage-dependent. On this basis, and in line with the characteristic of LIF membrane voltage change that the membrane voltage change decreases as the membrane voltage increases, the variable resistance R(t) is constructed as:

where α is a constant, V(t) is the membrane voltage of the current neuron at time t, θ is the pulse threshold, and R₀ is the initial resistance. Because the variation is built on R₀, R should equal R₀ when the membrane voltage is 0, which is why the term 1 is required in the formula. The model updates its membrane voltage and emits pulses to indicate whether the input belongs to the instrument represented by an instrument neuron, or whether it has the feature represented by a feature neuron. The membrane voltage update of the model is:

where:

The above system of membrane voltage update equations is the expression of the variable-resistance LIF model proposed by this method.

Here V is the membrane voltage of neuron i in layer n at time t; U is the membrane voltage change caused by the input; M is the membrane decay; S is the synaptic decay; E is the firing suppression amount; β_m and β_s are decay constants; O is a pulse sequence, which for input neurons represents the input sound information, for feature neurons indicates that the input sound has the corresponding feature, and for instrument neurons indicates the corresponding instrument type; g(x) is the firing function; V₀ is a normalization quantity; w is a weight; and N is the maximum number of layers.

The LIF model updates its membrane voltage and emits a pulse to indicate whether the sound belongs to the corresponding musical instrument.

When an instrument neuron has not fired, its membrane voltage represents the probability that the sound belongs to that instrument: the greater the membrane voltage, the greater the probability. When the threshold is exceeded, the instrument neuron fires, and firing signifies a definite determination that the input sound comes from that instrument.
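The readout rule described here can be sketched as follows, assuming the membrane voltage trace of every instrument neuron has been recorded. The function name, the tie-breaking by highest peak, and the instrument names are hypothetical and serve only as illustration.

```python
from typing import Dict, List


def identify_instrument(traces: Dict[str, List[float]], theta: float = 1.0) -> str:
    """Pick the instrument whose neuron fired, else the one with the highest peak voltage."""
    peaks = {name: max(trace) for name, trace in traces.items()}
    # A neuron counts as having fired if its voltage reached the threshold.
    fired = [name for name, peak in peaks.items() if peak >= theta]
    candidates = fired if fired else list(peaks)
    return max(candidates, key=lambda name: peaks[name])


if __name__ == "__main__":
    recorded = {"piano": [0.2, 0.7, 0.4], "violin": [0.1, 1.2, 0.3]}
    print(identify_instrument(recorded))
```

When no neuron reaches the threshold, the sketch falls back to the largest peak voltage, in line with reading the sub-threshold voltage as a probability.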

Step 2: obtain a relevant musical instrument sound data set containing sounds of different types of instruments, and divide it into a training set and a test set.

Step 3: encode the instrument sound data of the training set into corresponding pulse sequences using conventional pulse coding methods, such as latency (time-lag) coding and Poisson coding.
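As an illustration of Poisson coding, the sketch below turns feature values normalized to [0, 1] into a pulse sequence by emitting a pulse at each time step with a probability proportional to the feature value. The audio feature extraction is omitted, and the function name, `max_rate` and the seed are assumptions made for this example.

```python
import random
from typing import List


def poisson_encode(features: List[float], n_steps: int,
                   max_rate: float = 0.5, seed: int = 0) -> List[List[int]]:
    """Return n_steps rows of 0/1 pulses, one column per input neuron."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    train = []
    for _ in range(n_steps):
        # Each input neuron fires with probability max_rate * feature value.
        train.append([1 if rng.random() < max_rate * x else 0 for x in features])
    return train


if __name__ == "__main__":
    pulses = poisson_encode([0.0, 0.3, 0.9], n_steps=10)
    for row in pulses:
        print(row)
```

Larger feature values yield denser pulse trains, so the firing rate of each input neuron carries the magnitude of its feature.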

Step 4: use the sound pulse sequences produced by encoding the training set as the pulse sequences of the input neurons to train the network weights with the variable-resistance deep pulse neural network algorithm, so that the weighted network correctly identifies the instrument type. This yields a deep pulse neural network for processing instrument sound information that determines from an input sound which instrument produced it.

The specific weight training steps of the network are as follows:

Step 4.1: construct a loss function, the Tempotron-like function, based on Tempotron, a single-layer learning algorithm for pulse neural networks (the basic idea of the Tempotron-like function is the same as that of the Tempotron loss function, but its form is modified to suit the network of the invention):

where V_max = V_N(t*) and t* is the time at which the membrane voltage of a given instrument neuron is maximal. At that time the membrane voltage represents the probability that the sound belongs to the instrument, and θ is the pulse threshold at which a possible identification becomes a definite one, so L measures how correct the network's prediction is that the current sound belongs to that instrument.
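The loss formula is not reproduced in this text. As a point of reference only, the sketch below implements the cost of the original Tempotron, which penalizes the gap between the peak membrane voltage V_max and the threshold θ whenever the neuron's decision is wrong. It is an assumption that the patent's Tempotron-like loss resembles this; the function name and arguments are hypothetical.

```python
from typing import List, Tuple


def tempotron_style_loss(voltages: List[float], is_target: bool,
                         theta: float = 1.0) -> Tuple[float, int]:
    """Return (loss, t_star) for one instrument neuron's voltage trace."""
    v_max = max(voltages)
    t_star = voltages.index(v_max)   # time step of the peak membrane voltage
    if is_target and v_max < theta:
        loss = theta - v_max         # should have fired but stayed below threshold
    elif not is_target and v_max >= theta:
        loss = v_max - theta         # fired although it is the wrong instrument
    else:
        loss = 0.0                   # decision already correct
    return loss, t_star


if __name__ == "__main__":
    print(tempotron_style_loss([0.2, 0.8, 0.5], is_target=True))
```

Because the loss depends only on the voltage at t*, the error is injected at a single time step and then propagated backward through time and through the layers.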

Step 4.2: during training, the neuron weights of all layers need to be adjusted. The deep pulse neural network for processing instrument sound information, constructed from the neuron model equations and the loss function, has the relationships shown in fig. 4, a simplified diagram of the deep pulse neural network model. Fig. 4 shows the spatio-temporal relationships, at two moments, of three neurons located respectively in two adjacent feature layers n and n+1 and in the recognition layer (n is any layer and t is any moment):

where each circle represents a neuron, the dashed lines represent the temporal relationships of the network, and the solid lines represent its spatial relationships. The input pulse sequence of each layer passes through the weight matrix W to form the membrane voltage change U; U influences the membrane voltage V through the decay quantities M and S; and the firing result at the previous moment influences V through the firing suppression amount E. Each of these variables is also affected by the previous moment.

According to the relationships in fig. 4, during training the derivative with respect to each weight can be obtained with a spatio-temporal back propagation algorithm, and the weight is updated according to that derivative. The training process draws on established techniques such as differentiation formulas and the chain rule, but the actual computation conditions and steps must be adapted to the network of the invention. For example, the gradient computation is divided into two parts, and a differentiation formula, a chain-rule calculation, or an iterative calculation is chosen for each part according to what is being differentiated.

The update rule of the weight w is as follows:

where η is the update rate constant and the derivative term is the gradient of the weight, for which:

The ignition function is not differentiable, yet its derivative is needed to obtain the gradient, and the weights must be updated from that gradient during network training. A rectangular function is therefore used in place of the derivative of the ignition function. The rectangular function h(x) is:

where a is a range constant, θ is an error constant, and sign is a sign function that is positive when its condition is true and negative when it is false.
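For orientation, a rectangular stand-in for the derivative of a threshold function can be sketched in Python as below. This is a sketch under stated assumptions, not the formula of the invention: it assumes the window is centred on θ with total width a and height 1/a, which is a common choice in pulse neural network training. The function name and the `false_value` parameter are hypothetical; `false_value` is exposed because the text describes the sign function as negative when its condition is false, whereas the common convention returns zero outside the window.

```python
import numpy as np

def rect_surrogate(x, theta, a, false_value=0.0):
    """Rectangular stand-in for the derivative of the ignition function.

    Returns 1/a where |x - theta| < a/2 and false_value/a elsewhere.
    Assumption: window centred on theta, width a (illustrative only).
    """
    x = np.asarray(x, dtype=float)
    inside = np.abs(x - theta) < a / 2.0
    return np.where(inside, 1.0, false_value) / a
```

With `false_value=0.0`, only membrane voltages close to θ contribute to the gradient, and a smaller a gives a narrower but taller window.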

The gradient can be calculated in two parts:

(1) Using the derivation formula:

where:

(2) Using the chain rule:

wherein:

wherein:

At this point, all unknown parameters can be obtained through iterative computation, and the gradient of the weight matrix W can be computed. Because the relationships between the variables are recursive, the gradient of W is calculated iteratively, following the calculation direction shown in fig. 5 together with the formulas above. Fig. 5 shows one iteration of this calculation, in which the gradient of W is obtained by back propagation. Here n denotes the current layer and t the current time.
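The idea of accumulating a weight gradient by stepping backward through time can be illustrated with a deliberately simplified Python sketch. It assumes a single plain leaky neuron, v[t] = decay · v[t−1] + w · x[t], with no ignition, no ignition suppression, and a constant resistance, so it is not the network of the invention. All names (`forward`, `grad_vmax`, `decay`) are hypothetical.

```python
import numpy as np

def forward(w, x, decay):
    """Membrane voltage of a simplified leaky neuron; x has shape (T, D)."""
    v = np.zeros(x.shape[0])
    prev = 0.0
    for t in range(x.shape[0]):
        prev = decay * prev + float(w @ x[t])
        v[t] = prev
    return v

def grad_vmax(w, x, decay):
    """Gradient of V_max = v[t^*] with respect to w, accumulated backward from t^*."""
    v = forward(w, x, decay)
    t_star = int(v.argmax())           # time of maximum membrane voltage
    grad = np.zeros_like(w, dtype=float)
    g = 1.0                            # sensitivity of v[t^*] to v[t], starting at t = t^*
    for t in range(t_star, -1, -1):
        grad += g * x[t]               # direct dependence of v[t] on w
        g *= decay                     # propagate the sensitivity back to v[t-1]
    return grad, t_star
```

With a loss of the assumed form L = θ − V_max, a gradient-descent step in this toy setting would be `w = w - eta * (-grad)`, where `eta` plays the role of the update rate constant η.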

Step 5: After the network is trained, the instrument sound data in the test set are encoded into pulse sequences.

Step 6: The sound pulse sequences produced by encoding the test set are input into the previously trained deep pulse neural network for processing sound information. The network performs a forward computation on the input sound pulse sequence; the instrument neurons undergo membrane voltage changes according to the input and ignite accordingly, from which the instrument that produced the input sound can be determined.
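The final decision can be sketched in Python as follows, assuming the forward computation has already produced the membrane voltages of the instrument neurons over time. The tie-breaking and the fallback used when no neuron reaches the threshold are assumptions of this sketch, and the function name and the example instrument names are hypothetical.

```python
import numpy as np

def identify_instrument(v, names, theta):
    """Pick the instrument whose neuron responds most strongly.

    v     : array of shape (T, K), membrane voltages of K instrument neurons
    names : list of K instrument names
    theta : pulse threshold
    Returns (name, fired), where fired tells whether that neuron reached theta.
    """
    v = np.asarray(v, dtype=float)
    v_max = v.max(axis=0)
    fired = v_max >= theta
    # Assumption: prefer neurons that reached the threshold; otherwise fall back
    # to the neuron with the largest membrane voltage.
    candidates = np.where(fired)[0] if fired.any() else np.arange(v.shape[1])
    best = int(candidates[np.argmax(v_max[candidates])])
    return names[best], bool(fired[best])
```

For example, `identify_instrument([[0.2, 0.1, 0.0], [0.4, 1.2, 0.3]], ["piano", "violin", "flute"], 1.0)` selects the second neuron, because it is the only one whose membrane voltage reaches the threshold.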

The method was applied to sound recognition on an instrument recognition data set containing ten instrument types.

The input data are first encoded into pulse sequences with a pulse coding method and then identified. Recall rate, accuracy rate, and harmonic precision are used for evaluation, and the experimental results are as follows:

Comparison of experimental results of the invention and existing methods

Method	Recall rate	Accuracy rate	Harmonic precision
LSTM	93.31%	96.08%	94.62%
CNN	99.21%	95.94%	97.51%
STCA	97.29%	97.23%	97.25%
Method of the invention	98.17%	98.23%	98.20%

As can be seen from the above table, the instrument sound identification method based on the deep pulse neural network achieves the highest accuracy rate and harmonic precision among the compared methods. Its recall rate (98.17%) is higher than that of LSTM and STCA, although CNN attains a higher recall rate (99.21%).
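The tabulated figures can be cross-checked with a few lines of Python, on the assumption that "harmonic precision" denotes the harmonic mean of the recall rate and the accuracy rate (an F1-style score). The text does not state this definition, so the check below is only a plausibility test; the dictionary keys are shorthand labels.

```python
# (recall rate, accuracy rate, reported harmonic precision), all in percent
rows = {
    "LSTM": (93.31, 96.08, 94.62),
    "CNN": (99.21, 95.94, 97.51),
    "STCA": (97.29, 97.23, 97.25),
    "Invention": (98.17, 98.23, 98.20),
}

def harmonic_mean(recall, accuracy_rate):
    """Harmonic mean of two percentages."""
    return 2.0 * recall * accuracy_rate / (recall + accuracy_rate)

# Absolute gap between the recomputed and the reported harmonic precision
gaps = {name: abs(harmonic_mean(r, p) - h) for name, (r, p, h) in rows.items()}
```

Every gap stays below 0.1 percentage points, which is consistent with the assumed definition; the small residual differences could come from rounding or from averaging per instrument class, neither of which the text specifies.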

To verify the effectiveness of the variable resistance, a controlled-variable experiment was designed in which only the parameter α is changed during instrument identification; the resulting change in the harmonic precision of the method is shown in fig. 6.

Fig. 6 supports the conclusion that the deep pulse neural network of the invention has higher recognition accuracy in instrument recognition. Because the neuron model is improved according to the characteristics of biological neurons, the network also has higher biological interpretability while remaining simple to compute. Because it is a pulse neural network, it has lower energy consumption on neuromorphic hardware. In the prior art, the STCA (Spatio-Temporal Credit Assignment) algorithm can achieve effects similar to those of this embodiment. However, the algorithm of the invention has a variable resistance that relates the current resistance to the membrane voltage at the previous moment, so that the weight adjustment at each moment takes the previous membrane voltage into account. The deep pulse neural network of the invention can therefore make better use of temporal changes for learning, and its accuracy is slightly higher than that of STCA. Because STCA is based on a constant resistance, its biological interpretability is also slightly lower than that of the method of the invention.

Compared with the prior art, the instrument sound identification method based on the deep pulse neural network at least has the following advantages:

1. The method uses an improved pulse neuron model, namely a Leaky Integrate-and-Fire (LIF) model with variable resistance. On the basis of the existing LIF model, it newly establishes a relationship between the current neuron resistance and the membrane voltage at the previous moment, which strengthens the relationship between the current and previous membrane voltages, so that neurons with different membrane voltages produce different membrane voltage changes when they receive the same input.

2. The method uses a deep pulse neural network whose pulse neuron model is the LIF model with variable resistance. It calculates the error between the actual maximum membrane voltage and the target membrane voltage according to the Tempotron-like loss function, and uses back propagation to distribute that error to the weights of each layer of neurons through the spatio-temporal relationship between the membrane voltages, thereby adjusting the network weights. Because the LIF model with variable resistance relates the current resistance to the membrane voltage at the previous moment, the weight adjustment at each moment takes the previous membrane voltage into account. The deep pulse neural network built on this model can therefore make better use of temporal changes for learning and has higher identification precision in instrument identification.
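The first advantage, that the same input produces different membrane voltage changes in neurons with different membrane voltages, can be illustrated with the Python sketch below. It is a minimal sketch under stated assumptions: the resistance rule is passed in as a function because the formula of the invention is not reproduced here, and `example_resistance`, with its parameters `r0` and `k`, is a purely hypothetical rule chosen for illustration. Ignition and ignition suppression are omitted.

```python
import numpy as np

def lif_step(v_prev, x, w, resistance, decay=0.9):
    """One time step of a leaky layer whose resistance depends on the last voltage.

    v_prev     : (J,) membrane voltages at the previous moment
    x          : (I,) input pulses at the current moment
    w          : (J, I) weight matrix
    resistance : callable mapping v_prev to a per-neuron resistance (hypothetical hook)
    Returns (v, u): new membrane voltages and the input-driven voltage change.
    """
    r = resistance(v_prev)        # resistance depends on the previous membrane voltage
    u = r * (w @ x)               # membrane voltage change caused by the input
    v = decay * v_prev + u
    return v, u

def example_resistance(v_prev, r0=1.0, k=0.5, theta=1.0):
    """Illustrative rule only, not the patented formula:
    resistance shrinks as the membrane voltage approaches theta."""
    return r0 * (1.0 - k * np.clip(v_prev / theta, 0.0, 1.0))
```

Calling `lif_step` with two neurons that share identical weights and input but start from different membrane voltages yields two different values of `u`, which is the behaviour described in item 1.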

FIG. 7 is a block diagram of an embodiment of the instrument voice recognition system based on the deep pulse neural network according to the present invention. Referring to fig. 7, the musical instrument sound recognition system based on the deep pulse neural network includes:

a training data set constructing module 701, configured to construct a training data set; the training data set includes a sound pulse sequence and an instrument name pulse sequence.

The training data set building module 701 specifically includes:

the training data acquisition unit is used for acquiring sound data and the names of the musical instruments to which the sound data belong.

A varying resistance construction module 702 for constructing a varying resistance according to a characteristic of a membrane voltage variation of the leakage integrated ignition model.

The varying resistance construction module 702 specifically includes:

A leakage integrated ignition model building module 703, configured to build a leakage integrated ignition model with a varying resistance according to the varying resistance.

The leakage integrated ignition model building module 703 specifically includes:

where the quantities of the system of equations include the membrane voltage of neuron i in layer n at time t, the attenuation of the membrane, and the decay of the synapse.

A pulse neural network model building module 704, configured to build a deep pulse neural network model based on the varying resistance; the resistance-variation-based deep pulse neural network model comprises an input layer consisting of a plurality of input neurons, a feature layer consisting of a plurality of feature neurons, and a recognition layer consisting of a plurality of instrument neurons; the input neuron, the feature neuron, and the instrument neuron each employ the leakage integrated firing model with varying resistance as a neuron model.

The deep impulse neural network model based on the variable resistance, which is constructed by the impulse neural network model construction module 704, specifically includes:

The model training optimization module 705 is configured to train and optimize the deep pulse neural network model based on the variable resistance by using the training data set, so as to obtain a trained and optimized deep pulse neural network model based on the variable resistance; and the output of the trained and optimized depth impulse neural network model based on the variable resistance is the name of the musical instrument.

The model training optimization module 705 specifically includes:

the loss function constructing unit is used for constructing a Tempotron-like loss function according to a single-layer algorithm Tempotron of the impulse neural network; the Tempotron-like loss function is:

the training optimization unit is used for inputting the training data set into the deep pulse neural network model based on the variable resistance, calculating the error between the actual maximum membrane voltage and the target membrane voltage according to the Tempotron-like loss function, and distributing the error to the weights of each layer of neurons through the spatio-temporal relationship between the membrane voltages of the layers by using a spatio-temporal back propagation algorithm, so as to adjust the weights of the deep pulse neural network.

An instrument sound coding module 706, configured to perform pulse coding on the instrument sound to be identified to obtain an instrument sound pulse sequence to be identified.

An instrument sound identification module 707, configured to input the instrument sound pulse sequence to be identified into the trained and optimized deep pulse neural network model based on the varying resistance, and determine the name of the instrument emitting the instrument sound to be identified.

The invention discloses a musical instrument sound identification method and system based on a deep pulse neural network, in which a leakage integrated ignition model with a variable resistance is adopted as the neuron model of the deep pulse neural network. On the basis of the existing LIF model, the leakage integrated ignition model with the variable resistance newly establishes a relationship between the current neuron resistance and the membrane voltage at the previous moment. As a result, when a neuron receives an input pulse, the membrane voltage change produced by that pulse is no longer a fixed value but varies dynamically with the current state of the neuron; that is, neurons with different membrane voltages that receive the same input pulse produce different membrane voltage changes according to the information they received previously. Because the membrane voltage at the previous moment is taken into account when the network weights at each moment are adjusted, the deep pulse neural network can make better use of temporal changes for learning and achieves higher recognition accuracy in instrument recognition.

The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.

The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the foregoing, the description is not to be taken in a limiting sense.

Claims

1. A musical instrument sound identification method based on a deep pulse neural network is characterized by comprising the following steps:

training and optimizing the deep pulse neural network model based on the variable resistance by using the training data set to obtain a trained and optimized deep pulse neural network model based on the variable resistance; the output of the trained and optimized deep pulse neural network model based on the variable resistance is the name of the musical instrument;

inputting the musical instrument sound pulse sequence to be recognized into the trained and optimized depth pulse neural network model based on the variable resistance, and determining the name of the musical instrument emitting the musical instrument sound to be recognized;

the constructing of the variable resistance according to the characteristics of the membrane voltage variation of the leakage integrated ignition model specifically includes: constructing the variable resistance according to a formula whose quantities are a constant, the membrane voltage of the input neuron, the feature neuron, or the instrument neuron at the given time, the pulse threshold, and the initial resistance;

the building of the leakage integrated ignition model with the variable resistance according to the variable resistance specifically comprises: building the model according to a system of equations whose quantities are the membrane voltage of neuron i in layer n at time t, the attenuation of the membrane, the decay of the synapse, the ignition suppression quantity, a decay constant, the sound pulse sequence or the instrument name pulse sequence, the ignition function, a normalization quantity, the weight w, and the maximum number of layers N.

2. The method for musical instrument sound recognition based on the deep pulse neural network of claim 1, wherein the constructing of the training data set specifically comprises:

3. The method for recognizing the musical instrument sound based on the deep pulse neural network as claimed in claim 1, wherein the deep pulse neural network model based on the variable resistance specifically comprises:

4. The method for recognizing the sound of the musical instrument based on the deep pulse neural network of claim 1, wherein the training and optimizing the model of the deep pulse neural network based on the variable resistance by using the training data set specifically comprises:

(ii) a In the formula (I), the compound is shown in the specification,

，t ^* is the time when the membrane voltage of the musical instrument neuron is maximum,Nrepresenting the serial number of the instrument neuron,

representing a pulse threshold from possible to positive, and L representing the correctness degree of the instrument name to which the deep pulse neural network model infers the sound pulse sequence;

inputting the training data set into the depth pulse neural network model based on the variable resistance, calculating the error between the actual maximum membrane voltage and the target membrane voltage according to the Tempotron-like loss function, and distributing the error to the weight of each layer of neuron through the space-time relation between each layer of membrane voltage by utilizing a space-time reverse propagation algorithm so as to adjust the weight of the depth pulse neural network.

5. A musical instrument sound recognition system based on a deep pulse neural network, comprising:

the pulse neural network model building module is used for building a depth pulse neural network model based on the variable resistance; the resistance-variation-based deep pulse neural network model comprises an input layer consisting of a plurality of input neurons, a feature layer consisting of a plurality of feature neurons, and a recognition layer consisting of a plurality of instrument neurons; the input neuron, the characteristic neuron and the instrument neuron all adopt the leakage integrated firing model with the variable resistance as neuron models;

the model training optimization module is used for training and optimizing the deep pulse neural network model based on the variable resistance by utilizing the training data set to obtain the trained and optimized deep pulse neural network model based on the variable resistance; the output of the trained and optimized deep pulse neural network model based on the variable resistance is the name of the musical instrument;

the musical instrument sound identification module is used for inputting the musical instrument sound pulse sequence to be identified into the trained and optimized depth pulse neural network model based on the variable resistance and determining the name of the musical instrument emitting the musical instrument sound to be identified;

the variable resistance construction module specifically includes: a variable resistance construction unit for constructing the variable resistance according to a formula whose quantities are a constant, the membrane voltage at the given time, the pulse threshold, and the initial resistance;

the leakage integrated ignition model building module specifically comprises: a leakage integrated ignition model construction unit for building the leakage integrated ignition model according to a system of equations whose quantities are the membrane voltage of neuron i in layer n at time t, the attenuation of the membrane, the decay of the synapse, the ignition suppression quantity, a decay constant, the sound pulse sequence or the instrument name pulse sequence, the ignition function, a normalization quantity, the weight w, and the maximum number of layers N.

6. The deep pulse neural network-based musical instrument sound recognition system according to claim 5, wherein the training data set construction module specifically comprises: