CN114510970A - Flotation process working condition identification method based on audio signal characteristics

Flotation process working condition identification method based on audio signal characteristics

Info

Publication number
CN114510970A
Authority
CN
China
Prior art keywords
data
training
network model
test set
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210092432.4A
Other languages
Chinese (zh)
Inventor
王雅琳
吴翰升
王凯
刘晨亮
袁小锋
谭栩杰
李思龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central South University
Original Assignee
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central South University filed Critical Central South University
Priority to CN202210092432.4A priority Critical patent/CN114510970A/en
Publication of CN114510970A publication Critical patent/CN114510970A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/12Classification; Matching
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/02Preprocessing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)

Abstract

The invention provides a flotation process working condition identification method based on audio signal characteristics, which comprises the following steps: step 1, data preparation and data preprocessing, specifically comprising data acquisition, noise reduction and working condition classification; and step 2, amplifying the proportion of the key frequency bands in the flotation audio signal according to the physical frequency meaning represented by the vertical axis of the Mel spectrogram, reducing the influence of unimportant bands, and constructing a flotation audio Mel spectrogram based on a feature attention mechanism. By observing how different frequency bands of the Mel spectrogram affect the identification result, the method locates the key band and constructs an attention-based flotation audio Mel spectrogram for preliminary feature extraction. Transfer learning is introduced when the model is established: the test-set data most likely to be identified correctly are screened out through the model's output features and assigned pseudo labels, while the marginal and conditional distributions of the data are aligned so that a good classification boundary is produced during transfer, thereby improving the generalization performance of the model.

Description

Flotation process working condition identification method based on audio signal characteristics
Technical Field
The invention relates to the technical field of flotation process data analysis, in particular to a flotation process working condition identification method based on audio signal characteristics.
Background
Flotation is a mineral separation method that exploits differences in the physicochemical properties of mineral surfaces, and it is widely applied in industries such as nonferrous metals, nonmetallic minerals and coal. Useful minerals attach selectively to air bubbles in the pulp and float with the bubbles to the pulp surface, thereby separating the useful minerals from impurities. Flotation is a critical step in many industrial processes, yet controlling its effect is not easy, because the factors influencing flotation and the variables that can be adjusted in operation are numerous. The flotation process has always been adjusted manually; at present, field workers mostly regulate production according to their own experience and the assayed grade of the final product. This rough mode of adjustment has three drawbacks. First, it requires highly specialized field operators. Second, the operation is strongly subjective: different workers may act completely differently under the same conditions, and since working conditions change frequently, product quality becomes unstable. Third, the off-line assay result cannot reflect the current flotation effect in real time and carries a considerable time lag.
Online identification of the working condition is a prerequisite for objective operation optimization, and most existing research on online working condition identification in the flotation process focuses on froth image analysis. In the actual flotation process, however, besides observing the froth morphology, field operators also rely on the important information in the sound produced when froth is scraped off by the scraper and falls into the bottom tank, because the sound signals differ across working conditions; yet research on identifying working conditions from audio signals is still rare. Studies that perform audio classification on public data sets deal with different objects and therefore must attend to the information in all frequency bands of the spectrogram. The audio signals generated in the flotation process have a special property: the differences between working conditions may be concentrated in a particular frequency band. How to better capture and amplify the subtle features that distinguish the categories while suppressing the interference of irrelevant features is crucial to improving the identification performance of the model.
In addition, because actual industrial data sets are small or affected by time sequence, environment, noise, operation and the like, the training set and the test set of a working condition recognition model frequently follow different distributions, so that a model that performs well on the training set performs poorly on the test set, i.e., overfitting occurs. For this problem, transfer learning can be used to improve the generalization performance of the model. Conventional transfer learning methods only attend to the marginal distributions of the two domains, but aligning the marginal distribution alone does not necessarily yield a favorable classification boundary; how to reasonably incorporate the conditional distribution of the data is the key to further improving model performance.
Disclosure of Invention
The invention provides a flotation process working condition identification method based on audio signal characteristics, aiming to solve the problems that traditional flotation process working condition identification methods cannot capture the subtle features that differentiate the categories and that data set drift exists between the training set and the test set.
In order to achieve the above object, an embodiment of the present invention provides a method for identifying the flotation process working condition based on audio signal characteristics, including:
step 1, data preparation and data preprocessing, specifically comprising data acquisition, noise reduction, working condition category division and division of the data into a training set and a test set;
step 2, amplifying the proportion of the key frequency bands in the flotation audio signal based on the physical frequency meaning represented by the vertical axis of the Mel spectrogram, reducing the influence of unimportant bands, and constructing a flotation audio Mel spectrogram based on a feature attention mechanism;
step 3, constructing a deep convolutional network model that takes the feature-attention Mel spectrogram as input, with the classification loss on the training set and the maximum mean discrepancy between the training set and the test set as the loss, training the model to automatically learn and extract deep audio signal features, and finally feeding the extracted deep features to a classifier for classification;
step 4, inputting the unlabeled test set data into the deep convolutional network model, selecting part of the data according to the recognition results and assigning them pseudo labels, then retraining the model with the pseudo-labeled test set data and the training set data;
and step 5, repeating step 4 the specified number of times and taking the deep convolutional network model obtained in the last iteration as the flotation process working condition identification model.
Wherein, the step 1 specifically comprises:
step 11, collecting, through a microphone, the audio signals produced in the flotation process when froth is scraped off by the scraper and falls into the bottom tank, collecting two minutes of audio every hour, and recording the froth concentration and grade at the corresponding moment as the label of the audio data;
step 12, designing a high-pass filter with a preset cut-off frequency, so that signals above the cut-off frequency are preserved while signals below it are rapidly attenuated;
and step 13, segmenting each filtered audio signal, where the multiple short clips cut from the same recording share the same data label; the data are divided into N categories according to the labels of all the audio, corresponding to N different working conditions in the production process, and all the resulting data are split into a training set and a test set.
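For illustration, a minimal NumPy sketch of the segmentation in step 13 follows; the clip length is an assumed value, as the text does not specify one.

```python
import numpy as np

def segment_audio(y: np.ndarray, sr: int, clip_seconds: float = 5.0) -> list:
    """Step 13 sketch: cut one filtered recording into equal-length short clips.
    All clips cut from the same recording share that recording's label."""
    clip_len = int(sr * clip_seconds)  # clip_seconds is an assumption, not from the patent
    n_clips = len(y) // clip_len
    return [y[i * clip_len:(i + 1) * clip_len] for i in range(n_clips)]
```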
Wherein the step 12 specifically includes:
step 121, performing FFT on the original signal;
step 122, filtering is performed by the following formula:
$$|Y(\omega)|^{*}=\begin{cases}|Y(\omega)|, & \omega\geq\omega_{c}\\ 0, & \omega<\omega_{c}\end{cases}$$

where $|Y(\omega)|^{*}$ denotes the amplitude of the filtered signal, $|Y(\omega)|$ the amplitude of the signal before filtering, $\omega$ the frequency, and $\omega_{c}$ the preset cut-off frequency;
and step 123, performing inverse FFT to obtain a filtered signal.
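A minimal NumPy sketch of steps 121 to 123 follows, assuming an ideal (brick-wall) cutoff as suggested by the filter formula; the sample rate and cut-off frequency in the usage line are illustrative only.

```python
import numpy as np

def fft_highpass(signal: np.ndarray, sample_rate: int, cutoff_hz: float) -> np.ndarray:
    """Steps 121-123 sketch: FFT, zero the bins below the preset cutoff, inverse FFT."""
    spectrum = np.fft.rfft(signal)                             # step 121: FFT of the raw signal
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)  # bin frequencies in Hz
    spectrum[freqs < cutoff_hz] = 0.0                          # step 122: |Y(w)|* = 0 below cutoff
    return np.fft.irfft(spectrum, n=len(signal))               # step 123: back to the time domain

# Illustrative call (values assumed): filtered = fft_highpass(raw, 44100, 200.0)
```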
Wherein, the step 2 specifically comprises:
step 21, converting each flotation audio segment into a Mel spectrogram, establishing a deep convolutional network model that takes the Mel spectrograms as input, training it on the training set and identifying the working conditions of the test-set data, and selecting, from the several models trained under different groups of hyperparameters, the one with the highest test-set identification accuracy as the reference model;
and step 22, adding masks over different frequency bands of the Mel spectrogram to shield the features of those bands, training a new deep convolutional network model with the masked Mel spectrograms as input, and comparing its test-set accuracy with the result of the reference model to determine how shielding each frequency band affects accuracy.
Wherein, the step 2 further comprises:
step 23, treating the frequency band with the largest influence on accuracy as the key feature and the bands with smaller influence as secondary features;
and step 24, weighting the key features during the conversion of the audio signal into the Mel spectrogram, so as to enhance the proportion of the key features and weaken the influence of the secondary features.
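As a sketch of how steps 22 to 24 could be realized, the snippet below masks one Mel band for the ablation experiment and then re-weights the identified key band; the band indices and gain values are assumptions for illustration, not values from the patent.

```python
import numpy as np

def mask_band(mel_spec: np.ndarray, lo_bin: int, hi_bin: int) -> np.ndarray:
    """Step 22 sketch: shield one frequency band of the Mel spectrogram
    (rows are Mel bins, columns are time frames) to test its influence."""
    masked = mel_spec.copy()
    masked[lo_bin:hi_bin, :] = 0.0
    return masked

def weight_key_band(mel_spec: np.ndarray, key: slice,
                    key_gain: float = 2.0, other_gain: float = 0.5) -> np.ndarray:
    """Steps 23-24 sketch: amplify the key band found by the ablation and
    damp the secondary bands. Both gain values are illustrative assumptions."""
    weighted = mel_spec * other_gain
    weighted[key, :] = mel_spec[key, :] * key_gain
    return weighted
```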
Wherein, the step 3 specifically comprises:
step 31, constructing a deep convolutional network model whose output is an N-dimensional vector normalized by a Softmax function containing a temperature coefficient, each element of the vector representing the model's predicted probability for one class; the Softmax function with temperature coefficient is

$$\mathrm{Softmax}(z_{i})=\frac{\exp(z_{i}/T)}{\sum_{j=1}^{N}\exp(z_{j}/T)}$$

where $z_{i}$ denotes the ith element of the vector before Softmax normalization, N the number of categories, and T the temperature coefficient;
step 32, inputting the training set data into the constructed deep convolutional network model and computing its cross-entropy loss

$$L_{ce}^{s}=-\frac{1}{n}\sum_{i=1}^{n}y_{i}^{s}\log\hat{y}_{i}^{s}$$

where n is the number of training samples, $y_{i}^{s}$ the true label of the ith training sample, and $\hat{y}_{i}^{s}$ the model's recognition result for the ith training sample.
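A PyTorch sketch of steps 31 and 32 follows; it assumes integer class labels and mean reduction over the batch, details the patent text does not fix.

```python
import torch
import torch.nn.functional as F

def temperature_softmax(logits: torch.Tensor, T: float) -> torch.Tensor:
    """Step 31 sketch: Softmax with temperature coefficient T (T = 1 is ordinary Softmax)."""
    return F.softmax(logits / T, dim=-1)

def train_ce_loss(logits: torch.Tensor, labels: torch.Tensor, T: float) -> torch.Tensor:
    """Step 32 sketch: cross-entropy of the temperature-scaled outputs
    against the true training labels, averaged over the n samples."""
    return F.nll_loss(F.log_softmax(logits / T, dim=-1), labels)
```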
Wherein, the step 3 specifically comprises:
step 33, inputting the training set data and the test set data into the constructed deep convolutional network model and computing the maximum mean discrepancy loss between the training set and the test set

$$L_{mmd}=\left\|\frac{1}{n}\sum_{i=1}^{n}\phi(x_{i}^{s})-\frac{1}{m}\sum_{j=1}^{m}\phi(x_{j}^{t})\right\|_{2}^{2}$$

where n is the number of training samples, m the number of test-set samples, $x_{i}^{s}$ the input of the ith training-set sample, $x_{j}^{t}$ the input of the jth test-set sample, $\phi(x)$ the last hidden-layer feature vector learned by the deep convolutional network, and $\|\cdot\|_{2}$ the 2-norm of a vector;
step 34, computing the total loss

$$L=L_{ce}^{s}+\lambda L_{mmd}$$

where λ is a weight hyperparameter;
and step 35, training the constructed deep convolutional network model with the goal of minimizing the total loss, and after training, inputting the test set data into the model to obtain its recognition results.
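The maximum mean discrepancy and total loss of steps 33 to 35 can be sketched as below, assuming φ(x) is the feature tensor taken from the network's last hidden layer; the default λ value is illustrative only.

```python
import torch

def mmd_loss(feat_train: torch.Tensor, feat_test: torch.Tensor) -> torch.Tensor:
    """Step 33 sketch: squared 2-norm between the mean last-hidden-layer features
    of the n training samples (shape (n, d)) and the m test samples (shape (m, d))."""
    return (feat_train.mean(dim=0) - feat_test.mean(dim=0)).pow(2).sum()

def total_loss(ce_train: torch.Tensor, feat_train: torch.Tensor,
               feat_test: torch.Tensor, lam: float = 0.1) -> torch.Tensor:
    """Step 34 sketch: L = L_ce + lambda * L_mmd (the default lambda is an assumption)."""
    return ce_train + lam * mmd_loss(feat_train, feat_test)
```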
Wherein, the step 4 specifically comprises:
step 41, setting a probability threshold θ; when the maximum Softmax output of a test-set sample in the current deep convolutional network model exceeds θ, the current model's identification of that sample is deemed correct, and the identification result output by the model is taken as the pseudo label of that sample;
step 42, inputting the pseudo-labeled test set data into the current deep convolutional network model and computing its cross-entropy loss

$$L_{ce}^{t}=-\frac{1}{k}\sum_{i=1}^{k}\tilde{y}_{i}^{t}\log\hat{y}_{i}^{t}$$

where k is the number of pseudo-labeled test samples, $\tilde{y}_{i}^{t}$ the pseudo label of the ith test sample, and $\hat{y}_{i}^{t}$ the current model's recognition result for the ith test sample.
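Steps 41 and 42 might be sketched as follows; the model call and the loss reduction are assumptions, and the pseudo-label loss mirrors the training cross-entropy above.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def select_pseudo_labels(model, test_x: torch.Tensor, theta: float, T: float):
    """Step 41 sketch: keep the test samples whose maximum temperature-Softmax
    probability exceeds theta, taking the predicted class as the pseudo label."""
    probs = F.softmax(model(test_x) / T, dim=-1)
    confidence, predicted = probs.max(dim=-1)
    keep = confidence > theta
    return test_x[keep], predicted[keep]          # the k pseudo-labeled samples

def pseudo_ce_loss(logits: torch.Tensor, pseudo_y: torch.Tensor, T: float) -> torch.Tensor:
    """Step 42 sketch: cross-entropy on the k pseudo-labeled test samples."""
    return F.nll_loss(F.log_softmax(logits / T, dim=-1), pseudo_y)
```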
Wherein, the step 4 further comprises:
step 43, inputting the training set data into the constructed deep convolutional network model and computing its cross-entropy loss

$$L_{ce}^{s}=-\frac{1}{n}\sum_{i=1}^{n}y_{i}^{s}\log\hat{y}_{i}^{s}$$

where n is the number of training samples, $y_{i}^{s}$ the true label of the ith training sample, and $\hat{y}_{i}^{s}$ the model's recognition result for the ith training sample;
step 44, inputting the training set data and the test set data into the constructed deep convolutional network model and computing the maximum mean discrepancy loss between the training set and the test set

$$L_{mmd}=\left\|\frac{1}{n}\sum_{i=1}^{n}\phi(x_{i}^{s})-\frac{1}{m}\sum_{j=1}^{m}\phi(x_{j}^{t})\right\|_{2}^{2}$$

where n is the number of training samples, m the number of test-set samples, $x_{i}^{s}$ the input of the ith training-set sample, $x_{j}^{t}$ the input of the jth test-set sample, $\phi(x)$ the last hidden-layer feature vector learned by the deep convolutional network, and $\|\cdot\|_{2}$ the 2-norm of a vector.
Wherein, the step 4 further comprises:
step 45, recomputing the total loss

$$L=L_{ce}^{s}+\gamma L_{ce}^{t}+\lambda L_{mmd}$$

where γ and λ are weight hyperparameters;
step 46, training the model with the goal of minimizing the total loss, and after training, inputting the test set data to obtain the model's recognition results;
step 47, returning to step 41 and repeating steps 41 to 46 the specified number of times, to obtain one working condition identification model under the hyperparameters T, θ, γ and λ;
and step 48, setting multiple further groups of hyperparameters, repeating step 47 for each, obtaining different models trained under the different hyperparameter groups, and selecting from them the model with the highest test-set identification accuracy as the flotation process working condition identification model.
The method for identifying the flotation process working condition based on audio signal characteristics in the embodiment of the invention has the following beneficial effects:
(1) the method combines the field experience of operators in actual production and uses the flotation audio signal, to which past research has paid little attention, to identify the working condition; it does not disturb the industrial process, the information is easy to obtain at low cost, it widens the channels for acquiring industrial information, and the identification result provides important information for subsequent operation optimization of industrial production;
(2) the invention requires no accurate process mechanism model; once enough audio signal data and corresponding labels are collected, the strong fitting capability of deep learning can be exploited for learning, avoiding the complexity of hand-designed features and reducing the modeling difficulty;
(3) by constructing a feature attention mechanism, the method focuses on the key frequency-band features that benefit identification accuracy and reduces the proportion of non-key features, which lowers the feature dimension, greatly reduces the amount of computation, and improves both the generalization capability and the identification accuracy of the model;
(4) the invention uses transfer learning to solve the data set shift problem; the approach is in essence transductive learning. The model can be updated as the test set changes; compared with inductive learning, which trains the model on the training set alone, it generalizes better, accords better with industrial practice, and markedly improves performance on the test set;
(5) using the output features of the test data in the model, samples whose distribution resembles the training data are selected through the Softmax output and assigned pseudo labels, so the conditional distribution of the test data is taken into account during transfer; as the model iterates, more and more test data receive correct pseudo labels, yielding recognition accuracy better than conventional transfer learning.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a schematic of the flotation process of the present invention;
FIG. 3 is a diagram of audio waveforms before and after filtering in accordance with the present invention;
FIG. 4 is a schematic diagram of the deep convolutional neural network structure of the present invention;
FIG. 5 is a Mel spectrogram of the invention before and after masking.
Detailed Description
In order to make the technical problems, technical solutions and advantages of the present invention more apparent, the following detailed description is given with reference to the accompanying drawings and specific embodiments.
Aiming at the problems that existing flotation process working condition identification methods cannot capture the subtle features that differentiate the categories and that data set drift exists between the training set and the test set, the invention provides a flotation process working condition identification method based on audio signal characteristics.
As shown in fig. 1 to 5, an embodiment of the present invention provides a method for identifying the flotation process working condition based on audio signal characteristics, including: step 1, data preparation and data preprocessing, specifically comprising data acquisition, noise reduction, working condition category division and division of the data into a training set and a test set; step 2, amplifying the proportion of the key frequency bands in the flotation audio signal based on the physical frequency meaning represented by the vertical axis of the Mel spectrogram, reducing the influence of unimportant bands, and constructing a flotation audio Mel spectrogram based on a feature attention mechanism; step 3, constructing a deep convolutional network model that takes the feature-attention Mel spectrogram as input, with the classification loss on the training set and the maximum mean discrepancy between the training set and the test set as the loss, training the model to automatically learn and extract deep audio signal features, and finally feeding the extracted deep features to a classifier for classification; step 4, inputting the unlabeled test set data into the deep convolutional network model, selecting part of the data according to the recognition results and assigning them pseudo labels, then retraining the model with the pseudo-labeled test set data and the training set data; and step 5, repeating step 4 the specified number of times and taking the deep convolutional network model obtained in the last iteration as the flotation process working condition identification model.
In the potassium fertilizer production process of the enterprise to which the method is applied, flotation is critical to the quality of the final product. As can be seen from fig. 2, raw ore is floated after being decomposed in a decomposition tank; the flotation process mainly comprises primary roughing, secondary roughing and concentration, the aim being to sort out the useful potassium ions in the ore through roughing and concentration while keeping useless sodium ions and the like out of the product. Based on the actual on-site investigation, a key flotation cell in the roughing stage with an obvious sound change was selected as the recording point, so as to judge the current flotation working condition objectively, accurately and in real time and thereby provide the conditions for optimizing the flotation process.
Wherein, the step 1 specifically comprises:
step 11, collecting, through a microphone, the audio signals produced in the flotation process when froth is scraped off by the scraper and falls into the bottom tank, collecting two minutes of audio every hour, and recording the froth concentration and grade at the corresponding moment as the label of the audio data;
step 12, designing a high-pass filter with a preset cut-off frequency, so that signals above the cut-off frequency are preserved while signals below it are rapidly attenuated;
and step 13, segmenting each filtered audio signal, where the multiple short clips cut from the same recording share the same data label; the data are divided into N categories according to the labels of all the audio, corresponding to N different working conditions in the production process, and all the resulting data are split into a training set and a test set.
Wherein, the step 12 specifically includes:
step 121, performing FFT on the original signal;
step 122, filtering is performed by the following formula:
$$|Y(\omega)|^{*}=\begin{cases}|Y(\omega)|, & \omega\geq\omega_{c}\\ 0, & \omega<\omega_{c}\end{cases}$$

where $|Y(\omega)|^{*}$ denotes the amplitude of the filtered signal, $|Y(\omega)|$ the amplitude of the signal before filtering, $\omega$ the frequency, and $\omega_{c}$ the preset cut-off frequency;
and step 123, performing inverse FFT to obtain a filtered signal.
Wherein, the step 2 specifically comprises:
step 21, converting each flotation audio segment into a Mel spectrogram, establishing a deep convolutional network model that takes the Mel spectrograms as input, training it on the training set and identifying the working conditions of the test-set data, and selecting, from the several models trained under different groups of hyperparameters, the one with the highest test-set identification accuracy as the reference model;
and step 22, adding masks over different frequency bands of the Mel spectrogram to shield the features of those bands, training a new deep convolutional network model with the masked Mel spectrograms as input, and comparing its test-set accuracy with the result of the reference model to determine how shielding each frequency band affects accuracy.
Wherein, the step 2 further comprises:
step 23, treating the frequency band with the largest influence on accuracy as the key feature and the bands with smaller influence as secondary features;
and step 24, weighting the key features during the conversion of the audio signal into the Mel spectrogram, so as to enhance the proportion of the key features and weaken the influence of the secondary features.
In the flotation process working condition identification method based on audio signal characteristics, each flotation audio segment is converted into a Mel spectrogram over the frequency range 0-22.05 kHz. The specific conversion process is: input the noise-reduced audio signal and apply pre-emphasis, framing and windowing; perform an FFT on each frame to obtain its spectrum and then its magnitude spectrum; filter the magnitude spectrum through a Mel filter bank; and take the logarithm of the filter-bank output to obtain the final spectrogram.
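A short librosa-based sketch of this conversion is given below; the FFT size, hop length and Mel-bin count are illustrative assumptions, while fmax = sr/2 = 22.05 kHz matches the stated frequency range at a 44.1 kHz sample rate.

```python
import librosa
import numpy as np

def audio_to_log_mel(y: np.ndarray, sr: int = 44100, n_mels: int = 128) -> np.ndarray:
    """Sketch of the described pipeline: pre-emphasis, framing/windowing + FFT,
    Mel filter bank, then logarithm. n_fft, hop_length and n_mels are assumed values."""
    y = librosa.effects.preemphasis(y)                 # pre-emphasis
    mel = librosa.feature.melspectrogram(              # framing, windowing, FFT, Mel filters
        y=y, sr=sr, n_fft=2048, hop_length=512, n_mels=n_mels, fmin=0.0, fmax=sr / 2)
    return librosa.power_to_db(mel)                    # log of the filter-bank output
```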
Wherein, the step 3 specifically comprises:
step 31, constructing a deep convolutional network model whose output is an N-dimensional vector normalized by a Softmax function containing a temperature coefficient, each element of the vector representing the model's predicted probability for one class; the Softmax function with temperature coefficient is

$$\mathrm{Softmax}(z_{i})=\frac{\exp(z_{i}/T)}{\sum_{j=1}^{N}\exp(z_{j}/T)}$$

where $z_{i}$ denotes the ith element of the vector before Softmax normalization, N the number of categories, and T the temperature coefficient;
step 32, inputting the training set data into the constructed deep convolutional network model and computing its cross-entropy loss

$$L_{ce}^{s}=-\frac{1}{n}\sum_{i=1}^{n}y_{i}^{s}\log\hat{y}_{i}^{s}$$

where n is the number of training samples, $y_{i}^{s}$ the true label of the ith training sample, and $\hat{y}_{i}^{s}$ the model's recognition result for the ith training sample.
Wherein, the step 3 specifically comprises:
step 33, inputting the training set data and the test set data into the constructed deep convolutional network model and computing the maximum mean discrepancy loss between the training set and the test set

$$L_{mmd}=\left\|\frac{1}{n}\sum_{i=1}^{n}\phi(x_{i}^{s})-\frac{1}{m}\sum_{j=1}^{m}\phi(x_{j}^{t})\right\|_{2}^{2}$$

where n is the number of training samples, m the number of test-set samples, $x_{i}^{s}$ the input of the ith training-set sample, $x_{j}^{t}$ the input of the jth test-set sample, $\phi(x)$ the last hidden-layer feature vector learned by the deep convolutional network, and $\|\cdot\|_{2}$ the 2-norm of a vector;
step 34, computing the total loss

$$L=L_{ce}^{s}+\lambda L_{mmd}$$

where λ is a weight hyperparameter;
and step 35, training the constructed deep convolutional network model with the goal of minimizing the total loss, and after training, inputting the test set data into the model to obtain its recognition results.
Wherein, the step 4 specifically comprises:
step 41, setting a probability threshold θ; when the maximum Softmax output of a test-set sample in the current deep convolutional network model exceeds θ, the current model's identification of that sample is deemed correct, and the identification result output by the model is taken as the pseudo label of that sample;
step 42, inputting the pseudo-labeled test set data into the current deep convolutional network model and computing its cross-entropy loss

$$L_{ce}^{t}=-\frac{1}{k}\sum_{i=1}^{k}\tilde{y}_{i}^{t}\log\hat{y}_{i}^{t}$$

where k is the number of pseudo-labeled test samples, $\tilde{y}_{i}^{t}$ the pseudo label of the ith test sample, and $\hat{y}_{i}^{t}$ the current model's recognition result for the ith test sample.
Wherein, the step 4 further comprises:
step 43, inputting the training set data into the constructed deep convolutional network model and computing its cross-entropy loss

$$L_{ce}^{s}=-\frac{1}{n}\sum_{i=1}^{n}y_{i}^{s}\log\hat{y}_{i}^{s}$$

where n is the number of training samples, $y_{i}^{s}$ the true label of the ith training sample, and $\hat{y}_{i}^{s}$ the model's recognition result for the ith training sample;
step 44, inputting the training set data and the test set data into the constructed deep convolutional network model and computing the maximum mean discrepancy loss between the training set and the test set

$$L_{mmd}=\left\|\frac{1}{n}\sum_{i=1}^{n}\phi(x_{i}^{s})-\frac{1}{m}\sum_{j=1}^{m}\phi(x_{j}^{t})\right\|_{2}^{2}$$

where n is the number of training samples, m the number of test-set samples, $x_{i}^{s}$ the input of the ith training-set sample, $x_{j}^{t}$ the input of the jth test-set sample, $\phi(x)$ the last hidden-layer feature vector learned by the deep convolutional network, and $\|\cdot\|_{2}$ the 2-norm of a vector.
Wherein, the step 4 further comprises:
step 45, recomputing the total loss

$$L=L_{ce}^{s}+\gamma L_{ce}^{t}+\lambda L_{mmd}$$

where γ and λ are weight hyperparameters;
step 46, training the model with the goal of minimizing the total loss, and after training, inputting the test set data to obtain the model's recognition results;
step 47, returning to step 41 and repeating steps 41 to 46 the specified number of times, to obtain one working condition identification model under the hyperparameters T, θ, γ and λ;
and step 48, setting multiple further groups of hyperparameters, repeating step 47 for each, obtaining different models trained under the different hyperparameter groups, and selecting from them the model with the highest test-set identification accuracy as the flotation process working condition identification model.
For the flotation process working condition identification method based on audio signal characteristics, Table 1 shows the working condition identification accuracy of the same model under various training strategies. It can be seen that when only an ordinary Mel spectrogram is used as input and no transfer learning is applied, the model generalizes poorly and its accuracy is low. When the Mel spectrogram is reconstructed as input based on the feature attention mechanism, model accuracy improves greatly; likewise, when only the maximum mean discrepancy loss is added to align the marginal distributions of the two data sets, accuracy also improves markedly. When the proposed method is used to pseudo-label the test set data and align both the marginal and conditional distributions, accuracy improves further. Finally, combining the feature attention mechanism with the proposed transfer learning method yields the highest accuracy, verifying the effectiveness of the proposed method.
TABLE 1 Comparison of the working condition recognition results
(The table itself appears only as an image in the original publication.)
In summary, the method for identifying the working condition of the flotation process based on the audio signal characteristics has the following advantages:
(1) the method combines the field experience of operators in actual production and uses the flotation audio signal, to which past research has paid little attention, to identify the working condition; it does not disturb the industrial process, the information is easy to obtain at low cost, it widens the channels for acquiring industrial information, and the identification result provides important information for subsequent operation optimization of industrial production;
(2) the invention requires no accurate process mechanism model; once enough audio signal data and corresponding labels are collected, the strong fitting capability of deep learning can be exploited for learning, avoiding the complexity of hand-designed features and reducing the modeling difficulty;
(3) by constructing a feature attention mechanism, the method focuses on the key frequency-band features that benefit identification accuracy and reduces the proportion of non-key features, which lowers the feature dimension, greatly reduces the amount of computation, and improves both the generalization capability and the identification accuracy of the model;
(4) the invention uses transfer learning to solve the data set shift problem; the approach is in essence transductive learning. The model can be updated as the test set changes; compared with inductive learning, which trains the model on the training set alone, it generalizes better, accords better with industrial practice, and markedly improves performance on the test set;
(5) using the output features of the test data in the model, samples whose distribution resembles the training data are selected through the Softmax output and assigned pseudo labels, so the conditional distribution of the test data is taken into account during transfer; as the model iterates, more and more test data receive correct pseudo labels, yielding recognition accuracy better than conventional transfer learning.
The method for identifying the flotation process working condition based on audio signal characteristics disclosed in the embodiment of the invention is an innovative attempt at flotation process data analysis: it identifies the working condition from the audio signal generated in the flotation process, which not only widens the channels for acquiring industrial information but can also guide subsequent multi-view learning research that fuses images and other information, and the identification result can provide important state information for operation optimization of the flotation process.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (10)

1. A flotation process working condition identification method based on audio signal characteristics is characterized by comprising the following steps:
step 1, data preparation and data preprocessing, specifically comprising data acquisition, noise reduction, working condition category division and division of the data into a training set and a test set;
step 2, amplifying the proportion of the key frequency bands in the flotation audio signal based on the physical frequency meaning represented by the vertical axis of the Mel spectrogram, reducing the influence of unimportant bands, and constructing a flotation audio Mel spectrogram based on a feature attention mechanism;
step 3, constructing a deep convolutional network model that takes the feature-attention Mel spectrogram as input, with the classification loss on the training set and the maximum mean discrepancy between the training set and the test set as the loss, training the model to automatically learn and extract deep audio signal features, and finally feeding the extracted deep features to a classifier for classification;
step 4, inputting the unlabeled test set data into the deep convolutional network model, selecting part of the data according to the recognition results and assigning them pseudo labels, then retraining the model with the pseudo-labeled test set data and the training set data;
and step 5, repeating step 4 the specified number of times and taking the deep convolutional network model obtained in the last iteration as the flotation process working condition identification model.
2. The method for identifying the flotation process working condition based on audio signal characteristics according to claim 1, wherein the step 1 specifically comprises:
step 11, collecting, through a microphone, the audio signals produced in the flotation process when froth is scraped off by the scraper and falls into the bottom tank, collecting two minutes of audio every hour, and recording the froth concentration and grade at the corresponding moment as the label of the audio data;
step 12, designing a high-pass filter with a preset cut-off frequency, so that signals above the cut-off frequency are preserved while signals below it are rapidly attenuated;
and step 13, segmenting each filtered audio signal, the multiple short clips cut from the same recording sharing the same data label, dividing the data into N categories according to the labels of all the audio, the categories corresponding to N different working conditions in the production process, and splitting all the resulting data into a training set and a test set.
3. The method for identifying the flotation process working condition based on audio signal characteristics according to claim 2, wherein the step 12 specifically comprises:
step 121, performing an FFT on the original signal;
step 122, filtering by the following formula:

$$|Y(\omega)|^{*}=\begin{cases}|Y(\omega)|, & \omega\geq\omega_{c}\\ 0, & \omega<\omega_{c}\end{cases}$$

where $|Y(\omega)|^{*}$ denotes the amplitude of the filtered signal, $|Y(\omega)|$ the amplitude of the signal before filtering, $\omega$ the frequency, and $\omega_{c}$ the preset cut-off frequency;
and step 123, performing an inverse FFT to obtain the filtered signal.
4. The method for identifying the flotation process working condition based on audio signal characteristics according to claim 3, wherein the step 2 specifically comprises:
step 21, converting each flotation audio segment into a Mel spectrogram, establishing a deep convolutional network model that takes the Mel spectrograms as input, training it on the training set and identifying the working conditions of the test-set data, and selecting, from the several models trained under different groups of hyperparameters, the one with the highest test-set identification accuracy as the reference model;
and step 22, adding masks over different frequency bands of the Mel spectrogram to shield the features of those bands, training a new deep convolutional network model with the masked Mel spectrograms as input, and comparing its test-set accuracy with the result of the reference model to determine how shielding each frequency band affects accuracy.
5. The method for identifying the flotation process working condition based on audio signal characteristics according to claim 4, wherein the step 2 further comprises:
step 23, treating the frequency band with the largest influence on accuracy as the key feature and the bands with smaller influence as secondary features;
and step 24, weighting the key features during the conversion of the audio signal into the Mel spectrogram, so as to enhance the proportion of the key features and weaken the influence of the secondary features.
6. The method for identifying the flotation process working condition based on audio signal characteristics according to claim 5, wherein the step 3 specifically comprises:
step 31, constructing a deep convolutional network model whose output is an N-dimensional vector normalized by a Softmax function containing a temperature coefficient, each element of the vector representing the model's predicted probability for one class; the Softmax function with temperature coefficient is

$$\mathrm{Softmax}(z_{i})=\frac{\exp(z_{i}/T)}{\sum_{j=1}^{N}\exp(z_{j}/T)}$$

where $z_{i}$ denotes the ith element of the vector before Softmax normalization, N the number of categories, and T the temperature coefficient;
step 32, inputting the training set data into the constructed deep convolutional network model and computing its cross-entropy loss

$$L_{ce}^{s}=-\frac{1}{n}\sum_{i=1}^{n}y_{i}^{s}\log\hat{y}_{i}^{s}$$

where n is the number of training samples, $y_{i}^{s}$ the true label of the ith training sample, and $\hat{y}_{i}^{s}$ the model's recognition result for the ith training sample.
7. The method for identifying the flotation process working condition based on audio signal characteristics according to claim 6, wherein the step 3 further comprises:
step 33, inputting the training set data and the test set data into the constructed deep convolutional network model and computing the maximum mean discrepancy loss between the training set and the test set

$$L_{mmd}=\left\|\frac{1}{n}\sum_{i=1}^{n}\phi(x_{i}^{s})-\frac{1}{m}\sum_{j=1}^{m}\phi(x_{j}^{t})\right\|_{2}^{2}$$

where n is the number of training samples, m the number of test-set samples, $x_{i}^{s}$ the input of the ith training-set sample, $x_{j}^{t}$ the input of the jth test-set sample, $\phi(x)$ the last hidden-layer feature vector learned by the deep convolutional network, and $\|\cdot\|_{2}$ the 2-norm of a vector;
step 34, computing the total loss

$$L=L_{ce}^{s}+\lambda L_{mmd}$$

where λ is a weight hyperparameter;
and step 35, training the constructed deep convolutional network model with the goal of minimizing the total loss, and after training, inputting the test set data into the model to obtain its recognition results.
8. The method for identifying the flotation process working condition based on audio signal characteristics according to claim 7, wherein the step 4 specifically comprises:
step 41, setting a probability threshold θ; when the maximum Softmax output of a test-set sample in the current deep convolutional network model exceeds θ, the current model's identification of that sample is deemed correct, and the identification result output by the model is taken as the pseudo label of that sample;
step 42, inputting the pseudo-labeled test set data into the current deep convolutional network model and computing its cross-entropy loss

$$L_{ce}^{t}=-\frac{1}{k}\sum_{i=1}^{k}\tilde{y}_{i}^{t}\log\hat{y}_{i}^{t}$$

where k is the number of pseudo-labeled test samples, $\tilde{y}_{i}^{t}$ the pseudo label of the ith test sample, and $\hat{y}_{i}^{t}$ the current model's recognition result for the ith test sample.
9. The method for identifying the flotation process working condition based on audio signal characteristics according to claim 8, wherein the step 4 further comprises:
step 43, inputting the training set data into the constructed deep convolutional network model and computing its cross-entropy loss

$$L_{ce}^{s}=-\frac{1}{n}\sum_{i=1}^{n}y_{i}^{s}\log\hat{y}_{i}^{s}$$

where n is the number of training samples, $y_{i}^{s}$ the true label of the ith training sample, and $\hat{y}_{i}^{s}$ the model's recognition result for the ith training sample;
step 44, inputting the training set data and the test set data into the constructed deep convolutional network model and computing the maximum mean discrepancy loss between the training set and the test set

$$L_{mmd}=\left\|\frac{1}{n}\sum_{i=1}^{n}\phi(x_{i}^{s})-\frac{1}{m}\sum_{j=1}^{m}\phi(x_{j}^{t})\right\|_{2}^{2}$$

where n is the number of training samples, m the number of test-set samples, $x_{i}^{s}$ the input of the ith training-set sample, $x_{j}^{t}$ the input of the jth test-set sample, $\phi(x)$ the last hidden-layer feature vector learned by the deep convolutional network, and $\|\cdot\|_{2}$ the 2-norm of a vector.
10. The method for identifying the flotation process working condition based on audio signal characteristics according to claim 9, wherein the step 4 further comprises:
step 45, recomputing the total loss

$$L=L_{ce}^{s}+\gamma L_{ce}^{t}+\lambda L_{mmd}$$

where γ and λ are weight hyperparameters;
step 46, training the model with the goal of minimizing the total loss, and after training, inputting the test set data to obtain the model's recognition results;
step 47, returning to step 41 and repeating steps 41 to 46 the specified number of times, to obtain one working condition identification model under the hyperparameters T, θ, γ and λ;
and step 48, setting multiple further groups of hyperparameters, repeating step 47 for each, obtaining different models trained under the different hyperparameter groups, and selecting from them the model with the highest test-set identification accuracy as the flotation process working condition identification model.
CN202210092432.4A 2022-01-26 2022-01-26 Flotation process working condition identification method based on audio signal characteristics Pending CN114510970A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210092432.4A CN114510970A (en) 2022-01-26 2022-01-26 Flotation process working condition identification method based on audio signal characteristics

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210092432.4A CN114510970A (en) 2022-01-26 2022-01-26 Flotation process working condition identification method based on audio signal characteristics

Publications (1)

Publication Number Publication Date
CN114510970A true CN114510970A (en) 2022-05-17

Family

ID=81549008

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210092432.4A Pending CN114510970A (en) 2022-01-26 2022-01-26 Flotation process working condition identification method based on audio signal characteristics

Country Status (1)

Country Link
CN (1) CN114510970A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116385455A (en) * 2023-05-22 2023-07-04 北京科技大学 Flotation foam image example segmentation method and device based on gradient field label
CN116385455B (en) * 2023-05-22 2024-01-26 北京科技大学 Flotation foam image example segmentation method and device based on gradient field label

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination