CN115392284A

CN115392284A - Site micro-vibration source identification method based on machine learning

Info

Publication number: CN115392284A
Application number: CN202210827980.7A
Authority: CN
Inventors: 孙长库; 张钧奕; 余才志; 王鹏; 付鲁华
Original assignee: Tianjin University
Current assignee: Tianjin University
Priority date: 2022-07-14
Filing date: 2022-07-14
Publication date: 2022-11-25

Abstract

The invention discloses a machine-learning-based method for identifying micro-vibration sources at a site. An acceleration sensor in the edge area of the measured site and an acceleration sensor at the center of the site each acquire vibration signals. The acquired signals undergo, in sequence, denoising, transient impact signal extraction, and feature extraction. The transient impact signals are clustered by the dynamic time warping (DTW) distance between their feature matrices, and each cluster is modeled with a Gaussian mixture model (GMM). The number of independent Gaussian components of the model is verified, and the component number with the minimum Akaike information criterion (AIC) value is selected as the optimal model. The transient impact signals captured by a sensor are then identified against the models by maximum a posteriori probability. GMM parameters are compared for sets that may belong to the same class, and two sets whose models highly overlap are merged into one set, which serves as the vibration data set that is generated around the site and affects the center of the site.

Description

Site micro-vibration source identification method based on machine learning

Technical Field

The invention relates to the technical field of environmental micro-vibration signal measurement in electronics industry plants. It is used to identify micro-vibration sources around a site, and specifically concerns a machine-learning-based method for identifying site micro-vibration sources.

Background Art

As China's semiconductor industry continues to develop, independent manufacturing of high-end chips has become indispensable to the country's technological progress. High-end chip manufacturing places strict requirements on the temperature, humidity, and cleanliness of the plant, but the most important requirement is control of site micro-vibration during production: micro-vibration beyond the limit disturbs the normal operation of precision equipment and can greatly reduce product yield.

According to the technical specification for micro-vibration prevention engineering in the electronics industry (GB 51076-2015), issued in China in 2015, micro-vibration is defined as vibration with a time-domain amplitude below 10 μm and a vibration velocity below 1000 μm/s. For specific precision equipment, taking a contact interferometer type universal tool microscope with a precision of 1 μm as an example, the maximum allowable vibration velocity is 300 μm/s. Site micro-vibration in environments such as production plants has many sources and can be approximated as the superposition of transient impact vibration sources and several periodic vibration sources. Natural vibration sources, rail transit, power equipment, building construction, and human activity are all potential interfering sources. To monitor and analyze such sources, vibration signals must be extracted from long-term observation signals and classified, and signals acquired later must be identified. This reveals the locations and other properties of the potential vibration sources, so that the vibration isolation of the plant can be improved, precision equipment protected, and economic loss reduced.

At present there is relatively little research on vibration source identification. Existing work relies on machine learning methods developed in recent years and on algorithms such as blind signal separation, which stems from the cocktail party problem in information theory. In engineering practice, at home and abroad, these methods are rarely applied to micro-vibration analysis and identification. Most applications are in projects such as optical fiber sensing and geological activity measurement, and vibration source identification and analysis for site micro-vibration remains essentially unexplored.

Machine learning problems are broadly divided into supervised and unsupervised learning. Micro-vibration source identification is a classification problem without labels, so it suits machine learning clustering algorithms: the existing vibration signal data set is grouped by features to obtain specific information about each category of signal. Following the core idea of unsupervised learning, monitoring and analysis of micro-vibration sources can proceed as follows: intercept the various vibration signals from a large volume of long-term site monitoring signals, extract features from the intercepted signals, cluster them by feature vector, and model each clustered data set mathematically to obtain unified features that characterize each vibration source.

Summary of the Invention

The invention aims to provide a machine-learning-based method for identifying site micro-vibration sources, applied to the processing and analysis of long-term micro-vibration monitoring data from laboratories housing precision equipment and from vibration-isolated semiconductor production plants. The method classifies the micro-vibrations that are emitted by different vibration sources around the site and captured during long-term monitoring, analyzes the influence of each vibration source on the site, and judges whether different sensors capture vibration signals from the same vibration source. This supports improvements to the vibration isolation design and makes it possible to check whether the same type of vibration source is still present at the site after the improvement.

The invention specifically comprises the following:

A machine-learning-based method for identifying site micro-vibration sources comprises the following steps:

S1: A micro-vibration acceleration sensor in the edge area of the measured site and a micro-vibration acceleration sensor at the center of the site measure signals x_i and y_i, respectively. The power spectrum of each signal is estimated by the Welch overlapped segment averaging method to obtain the main frequency components of each signal. The magnitude-squared coherence of the two signals is then calculated, peaks are extracted from the result, and the corresponding common frequencies f_kcom are returned. The k frequencies f_kcom are identified as periodic vibration sources around the site that affect the center of the site;

S2: Spectral-subtraction noise reduction is applied to the time-domain discrete signal x(n) acquired by the sensor, yielding a denoised frequency-domain signal s(ω), which is restored to a denoised time-domain signal s(n);

S3: Transient impact signals are extracted from the denoised time-domain signal s(n) using an endpoint detection algorithm based on two-dimensional Gaussian distribution;

S4: MFCC features are extracted from the n extracted transient impact signals s(i), i ∈ 1, ..., n, to obtain the feature matrix EM_i of each signal;

S5: The transient impact signals are clustered using the dynamic time warping (DTW) distance between their feature matrices EM_i. K-medoids is used as the clustering method, and the cluster number K is determined jointly by the silhouette coefficient method and the connected components of a cross-correlation undirected graph. This yields K sets S(k), k ∈ K. The number of members I(k), k ∈ K, of each set is counted and indicates how often each vibration source occurs during long-term real-time micro-vibration monitoring;

S6: GMM modeling is performed on the global feature matrix of each set S(k), taking the row vectors as independent variables, to obtain models m_k(c), c ∈ (2, ..., n). The model parameters (Σ_1, ..., Σ_i, μ_1, ..., μ_i, σ_1, ..., σ_i) are obtained iteratively by the EM algorithm. The number c of independent Gaussian components is verified: the component number with the minimum Akaike information criterion (AIC) value is selected as the optimal model and taken as the final model of the set S(k);

S7: Steps S2 to S6 are repeated for the time-domain discrete signal y(n) acquired by the other sensor;

S8: The transient impact signals t(i) captured by a sensor are identified against the final models by maximum a posteriori probability. GMM parameters are then compared for sets that may belong to the same class, two sets whose models highly overlap are merged into one set, and statistical analysis of that set gives the vibration data set that is generated around the site and affects the center of the site.

In the above technical solution, step S2 specifically comprises:

S2-1: A Fourier transform is applied to the time-domain discrete signal x(n) to obtain the noise spectrum estimate E_mean(ω);

S2-2: Spectral-subtraction noise reduction is applied to the original vibration signal, where |Y(ω)| is the spectrum of the original vibration signal and the result of the subtraction is the spectrum of the noise-reduced vibration signal. An IFFT of the noise-reduced spectrum gives the denoised time-domain signal s(n).

In the above technical solution, step S3 specifically comprises:

S3-1: The training data set is first divided into frames, and a short-time Fourier transform (STFT) converts each framed segment x(i) into a frequency-domain signal X(k), where framesize is the frame length and N is the number of sampling points per frame;

S3-2: Each frame converted to the frequency domain is divided into n frequency sub-bands of different bandwidths, and the energy P_j of each sub-band is calculated between the lower and upper limits of that sub-band, where f^j_size is the bandwidth of the j-th sub-band;

S3-3: For training data Y_i containing transient impact waveforms and silence-state waveforms, the frame energy features are assembled into a feature matrix in which a_mn is the energy of the n-th sub-band of the m-th frame;

S3-4: A clustering algorithm assigns each row vector of the feature matrix to one of two clusters, transient impact waveform or silence-state waveform. Because the feature variables are energies, the Euclidean distance between row vectors a and b, with a, b ∈ (1, 2, ..., m), is used as the distance measure. A Gaussian model of dimension two is constructed for each clustering result, and its parameters are obtained through multiple iterations of the EM (expectation-maximization) algorithm. In the model, φ_i is the weight of each independent Gaussian distribution, and μ_i and Σ_i are its expectation and variance;

S3-5: The input waveform data is likewise framed and its sub-band energies calculated. For each frame, the probabilities P_s(x) and P_n(x) that it belongs to the transient impact state and the silence state are calculated, and a likelihood ratio test decides whether the i-th frame contains a transient impact signal. Here P_s(x, i, j) and P_n(x, i, j) are the probabilities that the j-th feature variable of the i-th frame belongs to the transient impact and silence states. A threshold decision on the likelihood ratio gives the decision F_AD^i on whether each frame contains a transient impact, where T_ξ and T_η are the per-frame energy threshold and the sub-band energy threshold derived from the training data set.

For two adjacent, partially overlapping frames, the union of F_AD^i and F_AD^(i+1) is taken as the decision for the current frame, and frames with F_AD^i = 1 that lie within 20 frames of one another are grouped into one complete transient impact waveform.

In the above technical solution, step S4 specifically comprises:

S4-1: Each transient impact signal s(i), i ∈ 1, ..., n, is cut into equal-length frames of the same length as the frames in step S3-1 and filtered by a Mel filter bank, with the conventional conversion between the Mel scale and hertz replaced by a scale suited to the micro-vibration signal;

S4-2: MFCC features on the converted scale are extracted from s(i) to form a feature matrix M_i, and M_i is energy-normalized using the smoothed energy

CM_i(t,f) = (1-s)CM_i(t-1,f) + sM_i(t,f)

where α, s, r and δ are the main parameters, chosen according to the micro-vibration sensor used in the project and the amplitude distribution of the measured vibration data: s = 0.025, α = 0.98, δ = 2, r = 0.5, and ε = 10^-6;

S4-3: The feature matrix EM_i of s(i) is obtained, in which the rows T correspond to the frames of each independent transient impact and the columns J correspond to the Mel filter bank coefficients.

In the above technical solution, step S5 specifically comprises:

S5-1: K-medoids clustering based on the DTW distance:

(1) From the feature matrices EM_i of all s(i), the K points with the largest mutual DTW distances are selected as the initial centroids;

(2) Clustering follows the nearest-neighbor principle: the DTW distance from each s(i) to the K class centroids is calculated in turn, and every remaining s(i) is assigned to the class of its nearest centroid, forming K classes;

(3) The centroid of each class is re-determined using the cost function SUM_k(i), the sum of the DTW distances from s(i) to the other transient impacts in the same k-th class. The s(i) with the minimum SUM_k(i) in the class is compared with the SUM_k(i) of the original centroid, and whichever has the smaller cost becomes the new centroid;

(4) Steps (2) and (3) are repeated until the centroids no longer change or the maximum number of iterations is reached, giving the data partitioned into K classes.

S5-2: The optimal cluster number K is determined jointly by the silhouette coefficient method and the connected components of the cross-correlation undirected graph:

(1) With K as a variable, all s(i) are clustered repeatedly for K in the range [2, 2K_c], and the silhouette coefficient method gives the candidate optimal values K_s(j):

K_s(j) = peakfinder(c_sum(k))

where peakfinder is the peak extraction function and K_s(j) are the j different peaks. In the silhouette coefficient, a(i) and b(i) are the DTW distance from element i to the centroid of its own cluster and the DTW distance from element i to the next-closest cluster centroid, respectively;

(2) The cross-correlation undirected graph connected component method gives the value K_c, on the assumption that the normalized cross-correlation of transient impact signals of the same kind is greater than 0.8. For all s(i):

1) The cross-correlation of every pair of s(i) is calculated and its maximum taken, forming a cross-correlation matrix in which a_ij is the maximum of the cross-correlation of s(i) and s(j);

2) The cross-correlation matrix is thresholded to obtain the cross-correlation adjacency matrix, where L²(i) is the square of the maximum amplitude of the transient impact signal s(i);

3) All connected components of the cross-correlation adjacency matrix are found, and their number is recorded as K_c;

(3) The optimal K is determined jointly by K_s(i) and K_c:

K = min{K_s(i) | K_s(i) ≥ K_c}.

In the above technical solution, step S8 specifically comprises:

S8-1: Each individual t(i) is identified and assigned to a class k* of the transient impact signals captured by the sensor. At the same time, it must be determined whether t(i) is a transient impact signal emitted by a vibration source outside the set, so a threshold decision is applied to k* to confirm that t(i) belongs to the same type of transient impact signal as the set. Here T is the equal error rate (EER) threshold or a high false rejection rate (FRR) threshold, which must be determined from the training data set during model training.

S8-2: The GMM models of the classes of transient impact signals are determined for each of the two sensors. The sum of the relative errors of the parameters of two models, together with the DTW distance between their center points, is used to judge whether the two signal sets belong to the same class. For each parameter of one model, with subscript i, the closest corresponding parameter of the other model, with subscript j, is found, and the similarity of the two models is calculated. When i is not equal to j, the parameter set with fewer entries is zero-padded.

S8-3: The model similarity is evaluated for all cluster pairs of the two sensors and the pair with the minimum value is found. The DTW distance between the center points of the two clusters is then compared, and when it is smaller than the average intra-cluster distance, the two models are considered to describe signal data sets emitted by the same vibration source.
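The identification in step S8-1 can be illustrated with a minimal sketch. The decision formulas are not reproduced in this text, so the sketch makes several assumptions that are not taken from the patent: the classes have equal prior probability (so the maximum a posteriori class is the maximum-likelihood class), each GMM has diagonal covariance, and the score compared with the threshold T is the average per-frame log-likelihood. The names `gmm_log_density` and `identify`, and the model layout, are hypothetical.

```python
import math

def gmm_log_density(x, weights, means, variances):
    """log p(x) under a diagonal-covariance GMM (a simplifying assumption)."""
    logs = []
    for w, mu, var in zip(weights, means, variances):
        value = math.log(w)
        for xi, m, v in zip(x, mu, var):
            value += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        logs.append(value)
    peak = max(logs)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(value - peak) for value in logs))

def identify(frames, models, threshold):
    """Return (k_star or None, scores) for one transient impact t(i).

    frames: feature vectors of t(i), one per frame.
    models: {class label: (weights, means, variances)}.
    threshold: hypothetical stand-in for the threshold T.
    """
    scores = {
        label: sum(gmm_log_density(f, *params) for f in frames) / len(frames)
        for label, params in models.items()
    }
    k_star = max(scores, key=scores.get)
    # Reject when even the best class scores below the threshold, which is
    # read here as "emitted by a vibration source outside the set".
    return (k_star if scores[k_star] >= threshold else None), scores

models = {
    "a": ([1.0], [[0.0, 0.0]], [[1.0, 1.0]]),
    "b": ([1.0], [[5.0, 5.0]], [[1.0, 1.0]]),
}
label, scores = identify([[0.1, -0.1], [0.0, 0.2]], models, threshold=-20.0)
```

In practice the threshold would be tuned on training data, as the text states for T.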

The advantages and beneficial effects of the invention are as follows:

The machine-learning-based site micro-vibration source identification method is a complete vibration source identification scheme. Using limited data, it classifies the vibration signals captured by a single sensor arranged at the site and models each type of vibration signal, which makes it easier to locate vibration sources later for elimination or vibration isolation. It can also determine whether different sensors are affected by the same vibration source.

Drawings

Fig. 1 is a flow chart of the invention.

Fig. 2 is a measured micro-vibration velocity time-domain signal.

Fig. 3 is the WOSA power spectrum estimation result.

Fig. 4 is the magnitude-squared coherence result.

Fig. 5 is the result of the endpoint detection algorithm.

Fig. 6 shows the EN-MFCC features of transient impact waveforms.

Fig. 7 shows three groups of data sets used to demonstrate the clustering effect.

Fig. 8 is the t-SNE visualization of the clustering result.

For a person skilled in the art, other relevant figures can be obtained from the above figures without inventive effort.

Detailed Description

To help those skilled in the art better understand the technical solution of the invention, it is further described below with reference to a specific example. Time-domain data collected by micro-vibration sensors arranged at the measured site is used for demonstration. The data, shown in Fig. 2, was sampled at 3200 SPS over one hour in a laboratory with relatively stable vibration conditions.

The flow of the machine-learning-based site micro-vibration source identification method is shown in Fig. 1, and the specific steps are as follows:

S1: To measure the periodic vibration sources present at the center and in the edge area of the site, the following steps are carried out:

S1-1: A sensor arranged at the center of the site and a sensor arranged in the edge area of the site measure data x_i and y_i, respectively. The data within the time range to be analyzed is selected, or the data is cut into L sections, with x_i(n) and y_i(n) denoting the i-th section. The power spectrum of each signal is estimated by Welch overlapped segment averaging (WOSA) to obtain the main frequency components of each signal. The result for data randomly extracted from the same time period in this embodiment is shown in Fig. 3.

S1-2: The cross-power spectral density of x_i and y_i is estimated.

S1-3: The magnitude-squared coherence of x_i and y_i is calculated, and its peaks f_kcom are taken as the common frequencies of x_i and y_i. In this embodiment, peaks are found with a threshold of 0.08, giving the common frequencies f_kcom shown in Fig. 4. If the data is to be used to locate the vibration source that actually emits vibration at a given frequency, the precision of the frequency-domain conversion and of the spectrum estimation must be optimized. From the results, the first two pairs of peaks can be interpreted as a periodic vibration source at about 100 Hz, and the last two pairs as a periodic vibration source at about 248 Hz.
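Steps S1-1 to S1-3 can be sketched with NumPy as follows. This is an illustrative sketch, not the patent's implementation: the segment length of 1024 samples, the 50% overlap, the Hann window, and the function names are assumptions, and the test signals are synthetic. Only the 3200 SPS sampling rate and the 0.08 peak threshold come from the embodiment. Spectral scaling factors are omitted because they cancel in the coherence ratio.

```python
import numpy as np

def averaged_spectra(x, y, nperseg=1024, overlap=0.5):
    """Sum auto- and cross-periodograms over overlapping Hann-windowed segments."""
    step = int(nperseg * (1 - overlap))
    window = np.hanning(nperseg)
    pxx = np.zeros(nperseg // 2 + 1)
    pyy = np.zeros(nperseg // 2 + 1)
    pxy = np.zeros(nperseg // 2 + 1, dtype=complex)
    for start in range(0, len(x) - nperseg + 1, step):
        X = np.fft.rfft(window * x[start:start + nperseg])
        Y = np.fft.rfft(window * y[start:start + nperseg])
        pxx += np.abs(X) ** 2
        pyy += np.abs(Y) ** 2
        pxy += np.conj(X) * Y
    return pxx, pyy, pxy

def common_frequencies(x, y, fs, nperseg=1024, threshold=0.08):
    """Return coherence peaks above `threshold`, the frequency axis, and the coherence."""
    pxx, pyy, pxy = averaged_spectra(x, y, nperseg)
    coherence = np.abs(pxy) ** 2 / (pxx * pyy)  # magnitude-squared coherence
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    inner = coherence[1:-1]
    is_peak = (inner > coherence[:-2]) & (inner >= coherence[2:]) & (inner > threshold)
    return freqs[1:-1][is_peak], freqs, coherence

# Synthetic example: two noisy channels sharing a 100 Hz periodic component.
rng = np.random.default_rng(0)
fs = 3200
t = np.arange(10 * fs) / fs
x = np.sin(2 * np.pi * 100 * t) + rng.normal(size=t.size)
y = 0.5 * np.sin(2 * np.pi * 100 * t + 0.3) + rng.normal(size=t.size)
peaks, freqs, coherence = common_frequencies(x, y, fs)
```

With short records the coherence estimate of unrelated noise is biased upward, so a low threshold such as 0.08 may also admit spurious peaks; longer records or more segments reduce this.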

Next, to identify the transient impact sources contained in the collected data, the time-domain discrete signal x(n) acquired by the sensor is processed further.

S2: Spectral-subtraction denoising is applied to the time-domain discrete signal x(n) acquired by the sensor to obtain a denoised frequency-domain signal s(ω), which is restored to a denoised time-domain signal s(n). The noise spectrum can be obtained either from a manually selected, representative long noise segment or as the average of the set of silence-state segments produced by the endpoint detection algorithm of S3. Step S2 specifically comprises:

S2-1: A Fourier transform is applied to the time-domain discrete signal x(n) to obtain the noise spectrum estimate E_mean(ω).

S2-2: Spectral subtraction is applied to the spectrum of the original vibration signal, and an IFFT of the result gives the denoised time-domain signal s(n).
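A minimal sketch of step S2 is given below. The exact subtraction expression is not reproduced in this text, so the sketch assumes the common magnitude form: the mean noise magnitude E_mean(ω) is subtracted from |Y(ω)|, negative results are clipped to zero, and the phase of the noisy signal is reused before the inverse transform. The non-overlapping rectangular framing, the frame length of 256, and the function names are also assumptions.

```python
import numpy as np

def mean_noise_spectrum(noise, frame_len):
    """E_mean: average magnitude spectrum over non-overlapping noise frames."""
    usable = len(noise) // frame_len * frame_len
    frames = noise[:usable].reshape(-1, frame_len)
    return np.abs(np.fft.rfft(frames, axis=1)).mean(axis=0)

def spectral_subtract(x, e_mean, frame_len):
    """Subtract e_mean from each frame's magnitude spectrum and invert."""
    usable = len(x) // frame_len * frame_len
    frames = x[:usable].reshape(-1, frame_len)
    spectrum = np.fft.rfft(frames, axis=1)
    magnitude = np.maximum(np.abs(spectrum) - e_mean, 0.0)  # clip at zero
    cleaned = magnitude * np.exp(1j * np.angle(spectrum))   # keep noisy phase
    return np.fft.irfft(cleaned, n=frame_len, axis=1).reshape(-1)

# Synthetic example: background noise plus one short impact.
rng = np.random.default_rng(1)
noise_only = rng.normal(scale=0.1, size=4096)
e_mean = mean_noise_spectrum(noise_only, 256)
x = rng.normal(scale=0.1, size=2048)
x[1000:1010] += 2.0
s = spectral_subtract(x, e_mean, 256)
```

Overlapping windowed frames with overlap-add reconstruction would reduce frame-boundary artifacts; the simpler framing is used here only to keep the sketch short.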

S3: Transient impacts are extracted from the denoised time-domain signal s(n) using an endpoint detection algorithm based on two-dimensional Gaussian distribution. Step S3 specifically comprises:

S3-1: The training data set is first divided into frames. Because transient impacts are short, the frame length is usually short as well, about 0.01 s, and the frame length framesize for a signal with sampling frequency Fs is set according to engineering experience (in this embodiment, the sampling frequency is 3200 SPS). The overlap between frames may be set to 25%-50% of the frame length (in this embodiment, 40 samples). An STFT of the framed data x(i) gives X(k).

S3-2: The band energy of each frame converted to the frequency domain is calculated. Unlike an ordinary acceleration sensor, a micro-vibration sensor has extremely high sensitivity and a frequency response concentrated in the low-frequency band. The highest effective vibration frequency captured by the micro-vibration sensor is taken to be about 1600 Hz, and the spectrum is divided into 6 frequency sub-bands of unequal bandwidth, [10-100; 100-300; 300-500; 500-800; 800-1100; 1100-1600] Hz. The energy P_j of each sub-band is calculated between the lower and upper limits of that sub-band.

S3-3: For training data Y_i containing transient impact waveforms and non-transient-impact waveforms (hereinafter "silence-state waveforms"), the frame energy features are assembled into a feature matrix in which a_mn is the energy of the n-th sub-band of the m-th frame.
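The feature matrix of steps S3-1 to S3-3 can be sketched as follows. The six band edges, the 3200 SPS sampling rate, and the 40-sample overlap come from the embodiment. The frame length of 128 samples is a hypothetical value chosen so that 40 samples falls within the stated 25%-50% overlap range, and the Hann window and function name are likewise assumptions.

```python
import numpy as np

# Sub-band edges in Hz, as listed in step S3-2.
BANDS_HZ = [(10, 100), (100, 300), (300, 500), (500, 800), (800, 1100), (1100, 1600)]

def subband_energy_matrix(signal, fs=3200, frame_len=128, overlap=40):
    """Return the matrix whose entry (m, n) is the energy of sub-band n in frame m."""
    hop = frame_len - overlap
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    window = np.hanning(frame_len)
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = window * signal[start:start + frame_len]
        power = np.abs(np.fft.rfft(frame)) ** 2
        # Sum spectral power between the lower and upper limit of each band.
        rows.append([power[(freqs >= lo) & (freqs < hi)].sum() for lo, hi in BANDS_HZ])
    return np.array(rows)

# Synthetic example: a 200 Hz tone should concentrate energy in the 100-300 Hz band.
fs = 3200
t = np.arange(fs) / fs
A = subband_energy_matrix(np.sin(2 * np.pi * 200 * t), fs=fs)
```

Each row of `A` is one frame's feature vector, which is the input to the clustering and Gaussian modeling that follow.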

When selecting the training data set, the types of transient impact waveforms collected by the micro-vibration sensor should be as varied as possible, which improves the accuracy of the subsequent endpoint detection algorithm. A Gaussian model of dimension two, as in formula (7), is constructed for each clustering result. The parameters of the model are obtained through multiple iterations of the EM algorithm.

In the multidimensional Gaussian distribution model, φ_i is the weight of each independent Gaussian distribution, and μ_i and Σ_i are its expectation and variance.

S3-5: The input waveform data is likewise framed and its sub-band energies calculated. For each frame, the probabilities P_s(x) and P_n(x) are calculated, and a likelihood ratio test decides whether the i-th frame contains a transient impact signal rather than a silence state. Here P_s(x, i, j) and P_n(x, i, j) are the probabilities that the j-th feature variable of the i-th frame belongs to the transient impact and silence states. A threshold decision on the likelihood ratio gives the decision F_AD^i on whether each frame contains a transient impact.

For two adjacent, overlapping frames, the union of F_AD^i and F_AD^(i+1) is taken as the decision for the current frame. The detection result is shown in Fig. 5, where the blue line indicates the probability that each instant belongs to a transient impact signal. Frames with F_AD^i = 1 that lie within 20 frames of one another are grouped into one complete transient impact waveform.
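The post-processing of the per-frame decisions can be sketched in plain Python. The sketch takes the flags F_AD^i as given and only shows the union of adjacent frames and the grouping into complete waveforms. Reading "within 20 frames" as a gap of at most 20 frames between consecutive flagged frames is an interpretation, and the function names are hypothetical.

```python
def union_with_next(flags):
    """Decision for frame i is F_AD^i OR F_AD^(i+1); the last frame keeps its own flag."""
    following = flags[1:] + flags[-1:]
    return [int(a or b) for a, b in zip(flags, following)]

def group_impulse_frames(flags, max_gap=20):
    """Group flagged frames into (first_frame, last_frame) segments.

    Flagged frames separated by at most `max_gap` frames are treated as
    belonging to the same transient impact waveform.
    """
    segments = []
    start = last = None
    for index, flag in enumerate(flags):
        if not flag:
            continue
        if start is None:
            start = last = index
        elif index - last <= max_gap:
            last = index
        else:
            segments.append((start, last))
            start = last = index
    if start is not None:
        segments.append((start, last))
    return segments

flags = [0] * 60
for index in (5, 6, 10, 50, 51):
    flags[index] = 1
segments = group_impulse_frames(union_with_next(flags))
```

Each returned pair of frame indices can then be mapped back to sample positions to cut one transient impact signal s(i) out of s(n).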

S4: EN-MFCC (energy-normalized MFCC) features are extracted from the n extracted transient impact signals s(i). Because the micro-vibration sensor responds better in the low-frequency region, the conventional MFCC Mel scale expression is abandoned in favor of a frequency scale better matched to the micro-vibration signal, and the extracted MFCC features are energy-normalized to obtain the EN-MFCC feature matrix EM_i of each transient impact signal. The specific steps are as follows:

S4-1: Each transient impact signal s(i) is cut into frames of the same length as framesize in step S3-1 and filtered by a Mel filter bank. To suit the low-frequency sensitivity of the micro-vibration sensor, the conventional conversion between the Mel scale and hertz is replaced.

S4-2: MFCC features on the converted scale are extracted from s(i) to form a feature matrix M_i. To ensure that s(i) is not affected by amplitude, M_i is energy-normalized. Here α, s, r and δ are the main parameters, chosen according to the micro-vibration sensor used in the project and the amplitude distribution of the measured vibration data. One set of practically usable settings is s = 0.025, α = 0.98, δ = 2, r = 0.5; ε is a small constant that keeps the divisor from being 0, typically ε = 10^-6.

S4-3: The feature matrix EM_i of s(i) is obtained, in which the rows T correspond to the frames of each independent transient impact and the columns J correspond to the Mel filter bank coefficients. In this embodiment, Fig. 6 shows the feature matrices of four separate transient impact signals, where the X-axis is the frame index within each independent transient impact and the Y-axis is the EN-MFCC coefficient.
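The energy normalization of step S4-2 can be sketched as follows. The normalization expression is not reproduced in this text; the sketch assumes it has the per-channel energy normalization (PCEN) form, which uses the same parameter names and values (s, α, δ, r, ε) and the same smoothing recursion CM_i(t,f) = (1-s)CM_i(t-1,f) + sM_i(t,f). That correspondence is an assumption, as are the initial condition CM(0,f) = M(0,f) and the function name.

```python
import numpy as np

def energy_normalize(M, s=0.025, alpha=0.98, delta=2.0, r=0.5, eps=1e-6):
    """Assumed PCEN-style normalization of a (frames x coefficients) matrix.

    M must be non-negative (for example filter-bank energies); negative
    entries would make the fractional power undefined.
    """
    M = np.asarray(M, dtype=float)
    CM = np.empty_like(M)
    CM[0] = M[0]  # assumed initial condition for the smoother
    for t in range(1, M.shape[0]):
        # Smoothing recursion from step S4-2.
        CM[t] = (1 - s) * CM[t - 1] + s * M[t]
    # Divide by the smoothed energy, then apply root compression.
    return (M / (eps + CM) ** alpha + delta) ** r - delta ** r

EM = energy_normalize(np.ones((5, 3)))
```

Because each coefficient is divided by its own smoothed history, scaling the whole signal by a constant changes the output only slightly, which matches the stated goal of making s(i) insensitive to amplitude.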

S5: The transient impact signals are clustered by K-medoids using the dynamic time warping (DTW) distance between their feature matrices EM_i, with the cluster number K determined jointly by the silhouette coefficient method and the connected components of the cross-correlation undirected graph (CUG). This yields K sets S(k), k ∈ K. The number of members I(k), k ∈ K, of each set is counted and indicates how often each vibration source occurs during long-term real-time micro-vibration monitoring. The specific steps are as follows:

S5-1: K-medoids clustering based on the DTW distance is performed until the centroids no longer change or the maximum number of iterations is reached.

(1) From the feature matrices EM_i of all s(i), the K points with the largest mutual DTW distances are selected as the initial center points (centroids);

(2) Clustering follows the nearest-neighbor principle. The DTW distance from each s(i) to the K class centroids is calculated in turn, and every remaining s(i) is assigned to the class of its nearest centroid, forming K classes;

(3) The centroid of each class is re-determined using the cost function SUM_k(i), the sum of the DTW distances from s(i) to the other transient impacts in the same k-th class. The s(i) with the minimum SUM_k(i) in the class is compared with the SUM_k(i) of the original centroid, and whichever has the smaller cost becomes the new centroid.

(4) Steps (2) and (3) are repeated until the centroids no longer change or the maximum number of iterations is reached, giving the data partitioned into K classes.
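Steps (1) to (4) can be sketched in plain Python. The sketch uses the textbook DTW recurrence with Euclidean frame-to-frame cost and a greedy farthest-point rule to approximate "the K points with the largest mutual DTW distance"; both choices, and the function names, are assumptions rather than details from the patent.

```python
import math

def dtw_distance(a, b):
    """DTW distance between two feature matrices given as lists of frame vectors."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def k_medoids(items, k, max_iter=50):
    """Return (labels, medoid indices) for K-medoids under the DTW distance."""
    n = len(items)
    dist = [[dtw_distance(items[i], items[j]) for j in range(n)] for i in range(n)]
    # (1) Greedy farthest-point initialisation (an assumed approximation).
    medoids = [max(range(n), key=lambda i: sum(dist[i]))]
    while len(medoids) < k:
        rest = [i for i in range(n) if i not in medoids]
        medoids.append(max(rest, key=lambda i: min(dist[i][m] for m in medoids)))
    labels = []
    for _ in range(max_iter):
        # (2) Assign every item to its nearest medoid.
        labels = [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]
        # (3) In each class, pick the member with the smallest SUM_k(i).
        updated = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a class empties
                updated.append(medoids[c])
                continue
            updated.append(min(members, key=lambda i: sum(dist[i][j] for j in members)))
        # (4) Stop when the medoids no longer change.
        if updated == medoids:
            break
        medoids = updated
    return labels, medoids

items = [
    [[0.0], [0.0], [1.0], [0.0]],
    [[0.0], [1.0], [0.0]],
    [[10.0], [10.0], [11.0]],
    [[10.0], [11.0], [10.0], [10.0]],
]
labels, medoids = k_medoids(items, 2)
```

Because DTW aligns sequences of different lengths, transient impacts with different numbers of frames can be compared directly, which is why the medoid (an actual member) is used instead of a mean.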

S5-2: The optimal cluster number K, that is, the number of vibration signal types, is determined jointly by the silhouette coefficient method and the connected components of the cross-correlation undirected graph (CUG).

(1) The silhouette coefficient method gives the candidate optimal values K_s(j):

K_s(j) = peakfinder(c_sum(k))

where peakfinder is the peak extraction function and K_s(j) are the j different peaks. In the silhouette coefficient, a(i) and b(i) are the DTW distance from element i to the centroid of its own cluster and the DTW distance from element i to the centroid of the next-closest cluster, respectively.

(2) The cross-correlation undirected graph (CUG) connected component method gives the value K_c, on the assumption that the normalized cross-correlation of transient impact signals of the same kind is greater than 0.8. For all s(i):

1) The cross-correlation of every pair of s(i) is calculated and its maximum taken, forming a cross-correlation matrix in which a_ij is the maximum of the cross-correlation of s(i) and s(j).

2) The cross-correlation matrix is thresholded to obtain the cross-correlation adjacency matrix, where L²(i) is the square of the maximum amplitude of the transient impact signal s(i).

3) All connected components of the cross-correlation adjacency matrix are found, and their number is recorded as K_c.

(3) The optimal K is determined jointly by K_s(i) and K_c:

K = min{K_s(i) | K_s(i) ≥ K_c} (16)

which gives the data partitioned into K classes.
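The selection of K in step S5-2 can be sketched in plain Python. The silhouette coefficient is assumed to take its standard form (b - a) / max(a, b), c_sum(k) is treated as an already computed score for each trial K, and the input is assumed to be an already normalized cross-correlation matrix. Returning K_c when no silhouette peak reaches K_c is a fallback added for the sketch, and all function names are hypothetical.

```python
def silhouette(a, b):
    """Standard silhouette value from own-cluster distance a and next-closest distance b."""
    denom = max(a, b)
    return (b - a) / denom if denom > 0 else 0.0

def candidate_cluster_numbers(c_sum):
    """K_s(j): trial K values at interior local maxima of c_sum(k)."""
    ks = sorted(c_sum)
    return [k for prev, k, nxt in zip(ks, ks[1:], ks[2:])
            if c_sum[k] > c_sum[prev] and c_sum[k] > c_sum[nxt]]

def threshold_matrix(corr, threshold=0.8):
    """Adjacency matrix: 1 where the normalized cross-correlation exceeds the threshold."""
    return [[1 if value > threshold else 0 for value in row] for row in corr]

def count_connected_components(adjacency):
    """K_c: number of connected components of the undirected graph."""
    n = len(adjacency)
    seen = [False] * n
    components = 0
    for root in range(n):
        if seen[root]:
            continue
        components += 1
        stack = [root]
        seen[root] = True
        while stack:
            node = stack.pop()
            for other in range(n):
                if adjacency[node][other] and not seen[other]:
                    seen[other] = True
                    stack.append(other)
    return components

def choose_k(k_s, k_c):
    """K = min{K_s | K_s >= K_c}; falls back to K_c if no candidate qualifies (assumption)."""
    eligible = [k for k in k_s if k >= k_c]
    return min(eligible) if eligible else k_c

corr = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.3, 0.1],
    [0.1, 0.3, 1.0, 0.85],
    [0.2, 0.1, 0.85, 1.0],
]
k_c = count_connected_components(threshold_matrix(corr))
k_s = candidate_cluster_numbers({2: 0.4, 3: 0.7, 4: 0.5, 5: 0.6, 6: 0.3})
K = choose_k(k_s, k_c)
```

The connected-component count acts as a lower bound: signals that are strongly correlated must share a class, so K cannot be smaller than K_c, and the silhouette peaks pick the best-separated clustering at or above that bound.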

To illustrate the classification performance of the method, three types of transient impact signals measured in a laboratory were selected, with eight transient impact signals in each group, as shown in Fig. 7. The signals were classified, and the result was visualized with t-SNE dimensionality reduction, as shown in Fig. 8.

S6: for the global feature matrix of the set S (k)

GMM (Gaussian mixture model) modeling is carried out, and a row vector, namely a frame vector is taken as an independent variable to obtain a model m _k (c) C ∈ (2,... N), model parameters (Σ) ₁ ,...Σ _i ,μ ₁ ,...μ _i ,σ ₁ ,...σ _i ) Obtained by the iteration of the EM algorithm. And the number c of independent Gaussian components of the model is verified, and AIC (m) is selected according to the red pool information criterion (AIC) value _k (c) Minimum GMM model as the final model

Step S6 specifically includes:

S6-1: For the K classes

GMM modeling is performed: taking the row vectors of the global feature matrix of each class as independent elements, a mixture model with c independent Gaussian components is built according to formula (7), and the parameters (Σ_1, ..., Σ_i, μ_1, ..., μ_i, σ_1, ..., σ_i) of each Gaussian component are obtained with the EM algorithm. The models are denoted m_k(c), c ∈ (2, ..., n), where n is typically an integer no greater than 40.

S6-2: Determine the optimal component number c for the models m_k(c): the value of c that minimizes the Akaike information criterion value AIC(m_k(c)) is taken as the final component number, giving the final model
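The component-number selection of S6-2 can be illustrated with a short Python sketch. It assumes that the models m_k(c) have already been fitted by EM and that only their total log-likelihoods are available; the function names, the log-likelihood values and the use of full covariance matrices are assumptions made for illustration. It uses the standard definition AIC = 2p − 2 ln L, where p is the number of free parameters.

```python
def gmm_param_count(c, d):
    """Free parameters of a c-component, d-dimensional GMM with full
    covariances (an assumption; the covariance type is not stated here)."""
    weights = c - 1                       # mixing weights sum to 1
    means = c * d
    covariances = c * d * (d + 1) // 2    # symmetric d x d matrix per component
    return weights + means + covariances


def aic(log_likelihood, n_params):
    """Akaike information criterion: AIC = 2p - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def select_components(log_likelihoods, d):
    """Return the c with minimum AIC, given {c: total log-likelihood}."""
    scores = {c: aic(ll, gmm_param_count(c, d))
              for c, ll in log_likelihoods.items()}
    best_c = min(scores, key=scores.get)
    return best_c, scores


# Made-up total log-likelihoods of fitted models m_k(c) on 2-D toy vectors.
fitted = {2: -1250.0, 3: -1180.0, 4: -1176.0, 5: -1174.5}
best_c, scores = select_components(fitted, d=2)
print(best_c, scores[best_c])
```

In this toy case the likelihood keeps improving slightly beyond c = 3, but the parameter penalty outweighs the gain, so c = 3 has the lowest AIC.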

S7: for the sensor

Repeat the above steps S2 to S6 on the acquired time-domain discrete signal y(n).

S8: sensor with a flexible substrate

The captured transient impact signal t (i), i ∈ 1

By a sensor

And

for example, if it is to be determined

Whether a transient impulse signal t (i) captured or captured thereafter with the sensor is in contact with the sensor

If the signal of the same type belongs to the same type, the EN-MFCC feature extraction can be carried out on t (i), the probability density calculation can be carried out on each GMM model, and judgment can be carried out according to a threshold value. At the same time, if the comparison sensor is needed

And

and when the two types of vibration sources are the same vibration source, judging according to the ModelSimiarity and the DTW distance between the two types of central points.
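The identification idea described above (compute the probability density of the extracted feature frames under each GMM, pick the most probable class, then apply a threshold) can be sketched as follows. This is an illustrative sketch only: the diagonal-covariance GMM, the function names `gmm_logpdf` and `identify`, the toy models and the threshold value −5.0 are assumptions, and the frames stand in for EN-MFCC feature vectors that are not computed here. The threshold is applied to the mean per-frame log-likelihood of the winning class, which is one plausible choice rather than the formula of the source.

```python
import math


def gmm_logpdf(x, weights, means, variances):
    """Log density of vector x under a diagonal-covariance GMM."""
    comp_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_norm = -0.5 * sum(math.log(2 * math.pi * v) for v in var)
        quad = -0.5 * sum((xi - m) ** 2 / v for xi, m, v in zip(x, mu, var))
        comp_logs.append(math.log(w) + log_norm + quad)
    peak = max(comp_logs)          # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(c - peak) for c in comp_logs))


def identify(frames, models, log_priors, threshold):
    """Maximum a posteriori class for one transient signal, plus a
    threshold test on the mean per-frame log-likelihood of that class."""
    posteriors, mean_ll = {}, {}
    for k, (w, mu, var) in models.items():
        ll = sum(gmm_logpdf(f, w, mu, var) for f in frames)
        posteriors[k] = ll + log_priors[k]   # log posterior up to a constant
        mean_ll[k] = ll / len(frames)
    k_star = max(posteriors, key=posteriors.get)
    return k_star, mean_ll[k_star], mean_ll[k_star] >= threshold


# Two toy single-component class models in 2-D (hypothetical parameters).
models = {
    "A": ([1.0], [[0.0, 0.0]], [[1.0, 1.0]]),
    "B": ([1.0], [[5.0, 5.0]], [[1.0, 1.0]]),
}
log_priors = {"A": math.log(0.5), "B": math.log(0.5)}

known = identify([[0.1, -0.2], [0.3, 0.1]], models, log_priors, threshold=-5.0)
outlier = identify([[20.0, 20.0]], models, log_priors, threshold=-5.0)
print(known, outlier)
```

The second call shows why the threshold is needed: a signal from an unknown source is still assigned to its nearest class by the maximum a posteriori rule, and only the threshold test rejects it.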

Step S8 specifically includes:

S8-1: Identify each individual t(i) and assign it to a class of the sensor

namely class k* of its captured transient impact signals:

Since t(i) may also be a transient impact signal from a vibration source outside the set, a threshold decision must additionally be applied to k* to confirm that t(i) is of the same type as the transient impact signals in the set, i.e.

where T may be the equal error rate (EER) threshold or a high false rejection rate (FRR) threshold, and has to be determined from the training data set during model training.

S8-2: For the GMM models of the various classes of transient impact signals determined respectively for the sensors

and

let the models be denoted respectively as

and

and compute the sum of the relative errors of their parameters together with the DTW distance between their center points as the criterion for whether the two classes of signals belong to the same type. Let

have parameters

and let the subscript of the corresponding parameters of

be j; in

find, for each

the closest

and compute the similarity of the two models

S8-3: Traverse the model similarity of all cluster pairs of the two sensors, find the pair with the minimum value, and compare the DTW distance between the center points of the two clusters; when this distance is smaller than the average intra-cluster distance, the two models are considered to describe signal data sets produced by the same vibration source.
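The matching step of S8-3 can be sketched as below. This is a simplified illustration: the DTW distance is computed on 1-D toy sequences with an absolute-difference local cost, whereas the method applies DTW to feature matrices; the function names, the similarity values and the intra-cluster averages are hypothetical. Following the text, a smaller model-similarity value (a sum of relative errors) is treated as a closer match, and the average intra-cluster distance is assumed to be that of the first sensor's cluster.

```python
def dtw_distance(a, b):
    """Classic dynamic-time-warping distance between two 1-D sequences,
    using the absolute difference as the local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],       # step in a only
                                 cost[i][j - 1],       # step in b only
                                 cost[i - 1][j - 1])   # step in both
    return cost[n][m]


def match_clusters(similarity, centers_a, centers_b, mean_intra_a):
    """Find the cluster pair with the smallest model-similarity value and
    test whether their center points are closer than the intra-cluster mean."""
    pairs = [(similarity[i][j], i, j)
             for i in range(len(centers_a))
             for j in range(len(centers_b))]
    _, i, j = min(pairs)
    d = dtw_distance(centers_a[i], centers_b[j])
    return i, j, d, d < mean_intra_a[i]


# Toy center-point waveforms for two clusters per sensor (made-up values).
centers_a = [[0, 1, 2, 1, 0], [0, 5, 0, 5, 0]]
centers_b = [[0, 4, 1, 5, 0], [0, 0, 1, 2, 1, 0]]
similarity = [[0.9, 0.1],
              [0.3, 0.8]]          # hypothetical sums of relative errors
mean_intra_a = [0.5, 1.2]          # hypothetical average intra-cluster DTW

result = match_clusters(similarity, centers_a, centers_b, mean_intra_a)
print(result)
```

Here cluster 0 of the first sensor and cluster 1 of the second differ only by a repeated leading sample, which DTW absorbs, so they are judged to come from the same source.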

The invention effectively solves the problem of classifying vibration signals from different vibration sources in long-term micro-vibration monitoring data and makes it easier to analyze the micro-vibration sources around the site. It can identify whether transient impact signals subsequently captured by different sensors, or by the same sensor, belong to a known type of vibration source, which helps in locating micro-vibration sources and applying vibration isolation at a later stage, and it can determine whether a site that has undergone vibration isolation improvement is still affected by the same type of vibration source as before.

Claims

1. A field micro-vibration source identification method based on machine learning is characterized by comprising the following steps:

S1: Using the micro-vibration acceleration sensor in the edge area of the measured site

and the micro-vibration acceleration sensor at the center of the measured site

measure the signals x_i and y_i respectively, and estimate the power spectra of the signals by the Welch overlapped segment averaging method to obtain the respective dominant frequency components

and

of the periodic vibration sources;

s2: for the sensor

S5: Cluster the transient impact signals by dynamic time warping (DTW) distance based on the feature matrices EM_i; K-medoids is selected as the clustering method, and the cluster number K is determined jointly by the silhouette coefficient method and the connected components of the cross-correlation undirected graph, giving K sets S(k), k ∈ K, with I(k) members in each set; the member count I(k) of each set is tallied as a record of how often each vibration source vibrates, for long-term real-time micro-vibration monitoring;

s6: for the global feature matrix of the set S (k)

GMM modeling is carried out with the row vectors taken as independent variables to obtain the models m_k(c), c ∈ (2, ..., n); the model parameters (Σ_1, ..., Σ_i, μ_1, ..., μ_i, σ_1, ..., σ_i) are obtained iteratively by the EM algorithm; the number c of independent Gaussian components of the model is validated, the component number with the minimum Akaike information criterion (AIC) value is selected as the optimal model, and the optimal model is taken as the final model of the set S(k)

S7: for the sensor

Repeating the steps S2-S6 on the acquired time domain discrete signal y (n);

s8: sensor with a flexible substrate

The captured transient impact signal t (i), i ∈ 1

Perform maximum a posteriori probability identification; compare the GMM parameters of the sets that likely belong to the same class, merge any two sets whose models overlap heavily into a single set, and perform statistical analysis to obtain the data set of vibrations that are generated around the site and affect its central position.

2. The field micro-vibration source identification method based on machine learning according to claim 1, wherein the step S2 specifically comprises:

wherein,

and performing an IFFT to obtain the denoised time-domain signal s(n).

3. The field micro-vibration source identification method based on machine learning according to claim 1, wherein the step S3 specifically comprises:

framesize is the frame length, N is the number of sampling points per frame

And

f^j_size is the bandwidth of the j-th sub-band

a_mn denotes the energy of the n-th sub-band of the m-th frame of data

using a, b ∈ (1, 2, ..., M), respectively constructing for the clustering results a Gaussian model of dimension two of the following form, the model parameters being obtained by the EM algorithm over multiple iterations;

where φ_i is the weight of each independent Gaussian distribution, and μ_i and Σ_i are its expectation and variance;

S3-5: The input waveform data are likewise framed and their frequency sub-band energies computed; the probabilities P_s(x) and P_n(x) are calculated for each frame, and whether the i-th frame contains a transient impact signal is decided by the likelihood ratio test criterion:

For two adjacent, partially overlapping frames, the union of F_AD^i and F_AD^(i+1) is taken as the decision for the current frame, and the frames with F_AD^i = 1 that lie within an interval of 20 frames are grouped as one complete transient impact waveform.

4. The field micro-vibration source identification method based on machine learning according to claim 1, wherein the step S4 specifically comprises:

S4-2: Perform scale-transformed MFCC feature extraction on s(i) to form a feature matrix M_i, and apply energy normalization to M_i as follows:

CM_i(t,f) = (1 − s)·CM_i(t−1,f) + s·M_i(t,f)

where α, s, r and δ are the main parameters, chosen according to the micro-vibration sensor used in the project and the measured amplitude distribution of the specific vibration data; s = 0.025, α = 0.98, δ = 2, r = 0.5 and e = 10^-6;

S4-3: Obtain the feature matrix EM_i belonging to s(i):

5. The field micro-vibration source identification method based on machine learning according to claim 4, wherein the step S5 specifically comprises:

s5-1: performing K-medoids clustering based on DTW distance:

(3) Re-determining the centroid of each class, cost function:

where SUM_k(i) is the sum of the DTW distances from s(i) to the other transient impacts of the same k-th class; find the s(i) with the minimum SUM_k(i) in this class, compare it with the SUM_k(i) of the original centroid, and select whichever has the smaller cost function value as the new centroid;

K_s(j) = peakfinder(c_sum(k)), where peakfinder is the peak extraction function and K_s(j) denotes the j distinct peaks,

silhouette coefficient

a_ij is the maximum of the cross-correlation of s(i) and s(j)

2) For

3) For the cross-correlation adjacency matrix

K = min{K_s(i) | K_s(i) ≥ K_c}.

6. The field micro-vibration source identification method based on machine learning according to claim 5, wherein the step S8 specifically comprises:

S8-1: Identify each individual t(i) and assign it to a class of the sensor

namely class k* of its captured transient impact signals:

Since t(i) may also be a transient impact signal from a vibration source outside the set, a threshold decision must additionally be applied to k* to confirm that t(i) is of the same type as the transient impact signals in the set, i.e.

where T is the equal error rate (EER) threshold or a high false rejection rate (FRR) threshold, and has to be determined from the training data set during model training;

S8-2: For the sensors

and

and

compute the sum of the relative errors of their parameters together with the DTW distance between their center points as the criterion for whether the two classes of signals belong to the same type; let

have parameters

and let the subscript of the corresponding parameters of

be j; in

find, for each

the closest

and compute the similarity of the two models