CN109284777B

CN109284777B - Water supply pipeline leakage identification method based on signal time-frequency characteristics and support vector machine

Info

Publication number: CN109284777B
Application number: CN201810988501.3A
Authority: CN
Inventors: 刘洋; 吴琼; 任学利; 赵婷; 龚政; 青春
Original assignee: Inner Mongolia University
Current assignee: Inner Mongolia University
Priority date: 2018-08-28
Filing date: 2018-08-28
Publication date: 2021-09-28
Anticipated expiration: 2038-08-28
Also published as: CN109284777A

Abstract

The invention discloses a water supply pipeline leakage identification method based on signal time-frequency characteristics and a support vector machine, and belongs to the technical field of water leakage detection and positioning. The method comprises the following steps: inputting the detected signal; performing feature extraction on an input signal; inputting the extracted feature set into an optimized support vector machine, and identifying the features by using the support vector machine; the support vector machine outputs a recognition result according to the input signal characteristics, and determines whether the signal is a leakage signal or a non-leakage signal. The method provides three time-frequency characteristics based on the inherent modal function, approximate entropy and principal components of the signal by utilizing the characteristics of randomness, more concentrated frequency spectrum and the like of the water leakage signal. The characteristic matrixes constructed by the characteristics are used as the input of a support vector machine, the support vector machine is used as a classifier to identify signals and output an identification result, so that the problems of high modeling difficulty coefficient, high misjudgment rate and the like in the existing pipeline leakage detection technology are solved.

Description

Water supply pipeline leakage identification method based on signal time-frequency characteristics and support vector machine

Technical Field

The invention relates to the technical field of water leakage detection and positioning, in particular to a water supply pipeline leakage identification method based on signal time-frequency characteristics and a support vector machine.

Background

Water is the material basis on which human beings and all organisms live, and is an indispensable natural resource for the development of human society. The United Nations World Water Development Report 2018 shows that global water demand is increasing at a rate of about 1% per year due to factors such as population growth, economic development and changing consumption patterns, and that this rate will rise substantially over the next 20 years. As the population grows and pollution worsens, water resources are becoming increasingly scarce. Despite the serious global shortage of water resources, a great deal of water is wasted, and the waste caused by leakage from water supply pipelines is especially serious. A World Bank study shows that worldwide water losses due to water supply pipeline leaks amount to 48.6 billion cubic meters per year, with a corresponding economic loss of approximately US$14.6 billion per year. Therefore, research on efficient leak detection and positioning technology for water supply pipelines is of great significance for protecting water resources and promoting economic development.

To detect leakage in underground water supply pipelines, academia and industry have carried out a great deal of research and developed many effective detection methods. The earliest was the acoustic listening method, in which an inspector uses listening equipment to locate the leakage area from the loudness and tonal quality of the leakage sound. Although simple to operate, the method depends on the inspector's experience, and because water supply networks are widely distributed it involves a heavy workload and has low reliability. Ground penetrating radar can locate a leak by detecting the soil voids caused by leaking water; however, because geological structures differ between regions and are relatively complex, this method is expensive and of limited practicality. Based on pressure changes inside the pipeline, scholars have successively proposed the pressure gradient method, the negative pressure wave method and the flow balance method. These methods are sensitive to the water pressure in the pipe, but because flow in a water supply network fluctuates continuously, false alarms are easily generated when the fluctuation amplitude is large. Research has found that the spectrum of the leakage signal is concentrated and that the vibration frequency of the pipeline is related only to the leakage condition. This characteristic is exploited for leak detection by performing spectral analysis on the signal acquired by a piezoelectric acceleration sensor on the pipe. However, the method is prone to false positives when ambient noise has a spectrum similar to that of the leakage signal. Researchers have further improved the ability to distinguish leakage signals from environmental interference by combining linear prediction cepstral coefficients (LPCC) of the leakage acoustic signal with hidden Markov models (HMM). However, limited by the algorithm itself, the error probability of this method increases with system running time. For large-scale water supply networks, scholars have tried to model the pipe network with a real-time model method and compare measured data from the network with the values predicted by a flow model, but in practice this approach is difficult to model and computationally expensive.

Disclosure of Invention

In order to solve the problems, the invention provides a water supply pipeline leakage identification method based on signal time-frequency characteristics and a support vector machine. The characteristic matrixes constructed by the characteristics are used as the input of a support vector machine, the support vector machine is used as a classifier to identify signals and output an identification result, so that the problems of high modeling difficulty coefficient, high misjudgment rate and the like in the existing pipeline leakage detection technology are solved.

According to the water supply pipeline leakage identification method based on the signal time-frequency characteristics and the support vector machine, three time-frequency characteristics are provided based on the inherent modal function, approximate entropy and principal components of the signal according to the difference of the time-frequency characteristics of the leakage signal and the non-leakage signal, the characteristics are used for constructing a characteristic matrix to serve as the input of the support vector machine, the support vector machine serves as a classifier to identify the signal and output an identification result, and the signal is determined to be the leakage signal or the non-leakage signal.

Further, the method comprises:

S1: inputting the detected signal;

S2: performing feature extraction on the input signal;

S3: inputting the extracted feature set into an optimized support vector machine, and identifying the features by using the support vector machine;

S4: the support vector machine outputs a recognition result according to the input signal characteristics, and determines whether the signal is a leakage signal or a non-leakage signal.

Further, the features extracted in S2 include the following three time-frequency features: frequency domain features based on natural modal functions, features based on approximate entropy, and features based on principal component analysis.

Further, the step of extracting the frequency domain features based on the natural mode functions comprises:

processing an input signal by using an empirical mode decomposition method, and decomposing to obtain a plurality of inherent mode functions of the input signal;

processing the obtained inherent mode function to obtain an inherent mode function power spectrum of the signal;

and calculating the average value of the power spectrum of the inherent modal function as the time-frequency characteristic of the leakage signal.

Further, the step of extracting features based on approximate entropy comprises:

constructing, from the N samples u(1), u(2), …, u(N) of the acquired pipeline signal, two sequences of length m, $x(i) = [u(i), u(i+1), \ldots, u(i+m-1)]$ and $x(j) = [u(j), u(j+1), \ldots, u(j+m-1)]$, wherein $i, j \le N-m+1$, and calculating the distance between the sequences x(i) and x(j),

$$d[x(i), x(j)] = \max_{k=1,2,\ldots,m} \left| u(i+k-1) - u(j+k-1) \right|$$

given a threshold r, counting, for each $i \le N-m+1$, the number of distances satisfying $d[x(i), x(j)] \le r$, and calculating the ratio between this number and the total number of vectors,

$$C_i^m(r) = \frac{\operatorname{num}\{ d[x(i), x(j)] \le r \}}{N-m+1}$$

for all values of i, taking the average of $\ln C_i^m(r)$, denoted $\phi^m(r)$,

$$\phi^m(r) = \frac{1}{N-m+1} \sum_{i=1}^{N-m+1} \ln C_i^m(r)$$

increasing m by 1 and repeating the above steps to obtain $\phi^{m+1}(r)$; according to $\phi^{m+1}(r)$ and $\phi^m(r)$, the approximate entropy is obtained as

$$\mathrm{ApEn}(m, r) = \phi^m(r) - \phi^{m+1}(r)$$

which serves as a time-frequency characteristic of the leakage signal.

Further, r is 0.1 to 0.2 times the standard deviation of the detected signal.

Further, the step of extracting features based on principal component analysis includes:

collecting m groups of pipeline signals $x_1, x_2, \ldots, x_m$, each group containing n samples, denoted $x_i = (x_{1i}, x_{2i}, \ldots, x_{ni})^T$, so that the n × m matrix $X = [x_1\ x_2\ \cdots\ x_m]$ thus formed is

$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nm} \end{bmatrix}$$

using the eigenvectors $\alpha_i = (\alpha_{1i}, \alpha_{2i}, \ldots, \alpha_{mi})^T$ (i = 1, 2, …, l) corresponding to the first l eigenvalues of the covariance matrix of X, wherein 0 < l ≤ m and the eigenvalues are arranged from large to small, to obtain l new vectors,

$$y_i = X\alpha_i, \quad i = 1, 2, \ldots, l$$

wherein $y_i$ is a principal component of X;

constructing an n × l principal component signal matrix $Y = [y_1\ y_2\ \cdots\ y_l]$ from the principal components $y_i$ of the signal, and constructing a matrix G from the inner products $g_{ji} = \langle y_j, x_i \rangle$ of the principal component signal matrix and the original signal matrix,

$$G = Y^T (X - E[X])$$

selecting $g_j = [g_{j1}\ g_{j2}\ \cdots\ g_{jm}]$, 0 < j ≤ l, as the time-frequency characteristic of the leakage signal.

Further, the value of l is determined by the contribution rate and the accumulated contribution rate.

Further, the support vector machine is optimized in the following way:

acquiring signals to construct a signal sample library when pipelines in different areas leak or do not leak in different time periods, and randomly selecting signals from the signal sample library as training sample signals and test sample signals;

training the support vector machine by utilizing the feature set of the training sample signal to form a support vector machine identification model;

testing the trained support vector machine identification model by using the feature set of the test sample signal;

further optimizing the support vector machine according to the test result until the accuracy of the test output meets the requirement;

a support vector machine model for pipe leak identification is formed.

The invention has the beneficial effects that:

(1) the invention provides three time-frequency characteristics based on the inherent modal function, approximate entropy and principal component of the signal. The feature matrixes constructed by the features are used as input of a support vector machine, so that the detection effect of the method is more comprehensive, the problem of higher misjudgment probability generated when a certain feature is singly considered is avoided, and the leak detection accuracy is effectively improved;

(2) according to the method, the pipeline signal is subjected to time-frequency analysis by adopting Empirical Mode Decomposition (EMD), and the complex signal is decomposed into a form of sum of finite Intrinsic Mode Functions (IMFs), so that the multi-scale analysis of the power spectral density of the leakage signal by utilizing a plurality of IMFs is realized, and the high-precision water leakage positioning is realized;

(3) the invention constructs a principal component matrix from the principal components of the signals, and then uses the inner products $g_{ji} = \langle y_j, x_i \rangle$ of the principal component matrix and the original signal matrix to construct a principal-component-based feature matrix, thereby providing a plurality of principal-component-based time-domain features that can be used for leak detection.

Drawings

FIG. 1 is a flow chart of a water supply pipeline leakage detection method based on signal time-frequency characteristics and a support vector machine according to the invention;

FIG. 2a shows a signal power spectrum during normal operation of a pipeline;

FIG. 2b shows a signal power spectrum when a leak occurs in a pipe;

FIG. 3 shows a natural mode function of a pipe water leakage signal;

FIG. 4 illustrates the approximate entropy of a pipe leak signal versus a no leak signal;

FIG. 5 illustrates a schematic diagram of support vector machine recognition model training and optimization;

FIG. 6 shows recognition accuracy for support vector machine parameter combinations (C, γ);

FIG. 7 shows the detection results for leakage and non-leakage conditions in the pipe.

Detailed Description

Specific embodiments of the present invention will be described in detail below with reference to the accompanying drawings. It should be noted that technical features or combinations of technical features described in the following embodiments should not be considered as being isolated, and they may be combined with each other to achieve better technical effects. In the drawings of the embodiments described below, the same reference numerals appearing in the respective drawings denote the same features or components, and may be applied to different embodiments.

As shown in FIG. 1, the method for detecting the water supply pipeline leakage based on the signal time-frequency characteristics and the support vector machine comprises the following steps:

step 101: inputting the detected signal;

step 102: performing feature extraction on an input signal;

step 103: inputting the extracted feature set into an optimized support vector machine, and identifying the features by using the support vector machine;

step 104: the support vector machine outputs a recognition result according to the input signal characteristics, and determines whether the signal is a leakage signal or a non-leakage signal.

Frequency domain features based on natural mode functions

Numerous studies have shown that the power spectrum of the pipe signal when no leakage occurs differs significantly from that when leakage occurs, with the power spectrum of the leakage signal concentrated mainly within a particular frequency band. The power spectrum of the signal can therefore serve as a feature for pipeline leak identification. To extract the difference in power spectral density, time-frequency analysis is performed on the pipeline signal using Empirical Mode Decomposition (EMD). EMD can select the scale at which the signal is decomposed and decomposes a complex signal into a sum of a finite number of Intrinsic Mode Functions (IMFs), which allows multi-scale analysis of the power spectral density of the leakage signal using several IMFs.

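Performing the Hilbert transform on a function s(t), written here as x(t), gives

$$y(t) = \frac{1}{\pi}\, \mathrm{P.V.}\!\int_{-\infty}^{\infty} \frac{x(\tau)}{t - \tau}\, d\tau \tag{1}$$

Then an analytic function is constructed from x(t) and y(t),

$$z(t) = x(t) + j\,y(t) \tag{2}$$

The phase function of the analytic signal is

$$\theta(t) = \arctan\frac{y(t)}{x(t)} \tag{3}$$

By differentiating the phase function with respect to time, the instantaneous frequency function of the analytic signal is obtained as

$$\omega(t) = \frac{d\theta(t)}{dt} \tag{4}$$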

From the definition of the instantaneous frequency function, the instantaneous frequency is a function of time, and only one frequency value corresponds to it at a particular time. The instantaneous frequencies in some cases show negative frequencies that are not meaningful, and if all the instantaneous frequencies are positive frequencies, this function s (t) is called the natural mode function (IMF). Therefore, the natural mode function must satisfy the following two conditions:
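The analytic-signal construction and the instantaneous frequency can be illustrated numerically. The following is a minimal sketch, not the patent's implementation: it builds the analytic signal with the FFT (zeroing the negative-frequency half of the spectrum) and differentiates the unwrapped phase. The sampling rate `fs`, the test tone and the function names are assumptions chosen for illustration.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal z = x + j*H{x}, built by keeping only non-negative frequencies."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0                      # DC component is kept once
    if n % 2 == 0:
        weights[n // 2] = 1.0             # Nyquist bin is kept once
        weights[1:n // 2] = 2.0           # positive frequencies are doubled
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

def instantaneous_frequency(x, fs):
    """Instantaneous frequency in Hz: time derivative of the unwrapped phase over 2*pi."""
    phase = np.unwrap(np.angle(analytic_signal(x)))
    return np.diff(phase) * fs / (2.0 * np.pi)

# Assumed example: a pure 1.6 kHz tone sampled at 8 kHz (hypothetical values).
fs = 8000.0
t = np.arange(800) / fs
tone = np.cos(2.0 * np.pi * 1600.0 * t)
freq = instantaneous_frequency(tone, fs)
print(float(np.median(freq)))
```

For a single-tone input the estimated instantaneous frequency stays at the tone frequency; for a multi-component signal it can become negative or physically meaningless, which is the motivation for first decomposing the signal into mode functions.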

(1) The number of extreme points of the function, $N_e$ (including minima and maxima), is equal to the number of zero crossings, $N_s$, or differs from it by at most 1, i.e.

$$(N_s - 1) \le N_e \le (N_s + 1) \tag{5}$$

(2) At any time $t_i$, the mean value of the upper envelope determined by the local maxima of the function and the lower envelope determined by the local minima is zero, i.e.

$$\frac{s_{\max}(t_i) + s_{\min}(t_i)}{2} = 0, \quad t_i \in [t_a, t_b] \tag{6}$$

In the formula, $[t_a, t_b]$ is a time interval.
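Condition (1) can be checked directly on a sampled signal by counting slope sign changes and zero crossings. The sketch below is illustrative only; the function names are hypothetical, and flat segments or samples that are exactly zero are ignored for simplicity.

```python
import numpy as np

def count_extrema_and_zero_crossings(s):
    """Return (N_e, N_s) for a sampled signal s."""
    s = np.asarray(s, dtype=float)
    slope = np.diff(s)
    # A local extremum sits where the slope changes sign between neighbouring samples.
    n_extrema = int(np.sum(slope[:-1] * slope[1:] < 0))
    # A zero crossing sits where neighbouring samples have opposite signs.
    n_zero = int(np.sum(s[:-1] * s[1:] < 0))
    return n_extrema, n_zero

def satisfies_condition_1(s):
    """True when the extrema and zero-crossing counts differ by at most 1."""
    n_e, n_s = count_extrema_and_zero_crossings(s)
    return abs(n_e - n_s) <= 1

# Assumed example: a zero-mean oscillation and the same oscillation with an offset.
t = np.arange(1000) / 1000.0
oscillation = np.sin(2.0 * np.pi * 5.0 * t + 0.3)
print(satisfies_condition_1(oscillation))
print(satisfies_condition_1(oscillation + 2.0))
```

The offset copy never crosses zero while keeping all of its extrema, so it fails the test even though its shape is unchanged.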

Condition (1) indicates that s(t) has neither local maxima below zero nor local minima above zero. Condition (2) removes the waveform asymmetry caused by local fluctuations. In general, a signal may contain several natural mode functions, and the natural mode functions of a complex signal can be extracted using empirical mode decomposition. To obtain the power spectral density characteristic of the leakage signal, the signal is first processed by empirical mode decomposition to obtain its natural mode functions. The power spectra of the different mode functions are then computed from the natural mode functions, and the frequency domain features are extracted from them.

First, all local maxima and all local minima of the original signal x(t) are connected by cubic spline curves, respectively, to obtain the upper and lower envelopes of x(t), so that the signal lies between the two envelopes. Let m(t) be the mean of the two envelopes. Subtracting the envelope mean m(t) from the original signal x(t) yields

$$h_1(t) = x(t) - m(t) \tag{7}$$

Then $h_1(t)$ is checked against the two IMF conditions. If they are not satisfied, the above operation is repeated on $h_1(t)$ until the IMF conditions are satisfied. The resulting $h_1(t)$ is denoted $c_1(t)$, which is the first IMF of the signal x(t),

$$c_1(t) = h_1(t) \tag{8}$$

Next, $c_1(t)$ is subtracted from the original signal x(t),

$$r_1(t) = x(t) - c_1(t) \tag{9}$$

Taking $r_1(t)$ as a new signal, the first IMF of $r_1(t)$ is obtained by the method described above; it is the second IMF of x(t), denoted $c_2(t)$. By analogy, the nth IMF $c_n(t)$ of the signal x(t) and the remainder $r_n(t)$ are obtained step by step.

By the above steps, the original signal x(t) is decomposed into the sum of n IMFs and a remainder,

$$x(t) = \sum_{i=1}^{n} c_i(t) + r_n(t) \tag{10}$$

In practice, condition (2) of the IMF can rarely be satisfied exactly, so a stopping criterion is generally set, and condition (2) is considered to be satisfied once the criterion is met. For this purpose, the standard deviation $S_d$ between two successive processing results is used,

$$S_d = \sum_{t=0}^{T} \frac{\left| h_{k-1}(t) - h_k(t) \right|^2}{h_{k-1}^2(t)} \tag{11}$$

When $S_d$ falls below a set threshold, condition (2) is considered to be satisfied. Here T is the observed length of the signal, and $h_{k-1}(t)$ and $h_k(t)$ are the results of two consecutive processing steps in the IMF solution process. Studies show that the threshold of $S_d$ can usually be set to 0.2 to 0.3.
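The sifting procedure described above can be sketched in a few dozen lines. This is a simplified illustration under stated assumptions, not the patent's implementation: the end samples are added as spline knots to limit edge overshoot, the stopping statistic uses a ratio of sums rather than a per-sample ratio (to avoid dividing by samples near zero), and the iteration limits and function names are hypothetical.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def local_extrema(x):
    """Indices of the local maxima and local minima of a sampled signal."""
    d = np.diff(x)
    maxima = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] >= 0))[0] + 1
    return maxima, minima

def envelope_mean(x):
    """Mean m(t) of the cubic-spline envelopes, or None when there are too few extrema."""
    maxima, minima = local_extrema(x)
    if maxima.size < 2 or minima.size < 2:
        return None
    n = np.arange(x.size)
    # Assumption: the first and last samples are used as extra knots for both envelopes.
    up_idx = np.concatenate(([0], maxima, [x.size - 1]))
    lo_idx = np.concatenate(([0], minima, [x.size - 1]))
    upper = CubicSpline(up_idx, x[up_idx])(n)
    lower = CubicSpline(lo_idx, x[lo_idx])(n)
    return (upper + lower) / 2.0

def sift_imf(x, sd_threshold=0.3, max_iter=50):
    """Extract one IMF candidate by repeatedly subtracting the envelope mean."""
    h_prev = x.copy()
    for _ in range(max_iter):
        m = envelope_mean(h_prev)
        if m is None:
            break
        h = h_prev - m
        # Assumption: ratio-of-sums variant of the stopping statistic S_d.
        sd = np.sum((h_prev - h) ** 2) / (np.sum(h_prev ** 2) + 1e-12)
        h_prev = h
        if sd < sd_threshold:
            break
    return h_prev

def emd(x, max_imfs=5, sd_threshold=0.3):
    """Decompose x into IMFs c_1..c_n and a remainder r_n."""
    x = np.asarray(x, dtype=float)
    imfs, residue = [], x.copy()
    for _ in range(max_imfs):
        if envelope_mean(residue) is None:
            break
        c = sift_imf(residue, sd_threshold)
        imfs.append(c)
        residue = residue - c
    return np.array(imfs), residue

# Assumed example: a fast tone riding on a slow tone.
t = np.arange(1000) / 1000.0
signal = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 5 * t)
imfs, residue = emd(signal)
print(imfs.shape, bool(np.allclose(imfs.sum(axis=0) + residue, signal)))
```

By construction the IMFs and the remainder always add back up to the input signal, whatever stopping rule is used; the stopping rule only affects how cleanly the components separate.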

FIG. 2a shows the signal power spectrum when the pipe is operating normally, and FIG. 2b shows the signal power spectrum when the pipe leaks. The analysis results show that the frequency of the pipe leakage acoustic signal is mainly concentrated around 1.6 kHz. FIG. 3 shows the empirical mode decomposition result of the pipeline leakage signal; in the experiment, the threshold of the standard deviation $S_d$ was set to 0.3. The experimental result shows that empirical mode decomposition of the signal yields 5 intrinsic mode functions. These 5 intrinsic mode functions can be used to analyze the frequency domain characteristics of the leakage signal at multiple scales.

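For the IMF component $c_i(n)$ obtained by EMD, where n = 0, 1, …, N-1 indexes the N samples, the discrete Fourier transform $C_i(k)$ is computed,

$$C_i(k) = \sum_{n=0}^{N-1} c_i(n)\, e^{-j 2\pi k n / N}, \quad k = 0, 1, \ldots, N-1 \tag{12}$$

Taking the squared modulus of $C_i(k)$ gives the power spectrum of the signal,

$$P_i(k) = \left| C_i(k) \right|^2 \tag{13}$$

Then the mean value of equation (13) is computed,

$$\bar{P}_i = \frac{1}{N} \sum_{k=0}^{N-1} P_i(k) \tag{14}$$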

The invention takes the mean value of the power spectrum of the natural modal function of the signal as the frequency domain characteristic of the leakage signal.
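The feature computation can be sketched with NumPy's FFT. This is a minimal illustration, assuming the power spectrum is the unnormalized squared modulus of the DFT and that the IMFs are stacked as rows of an array; the function name is hypothetical.

```python
import numpy as np

def imf_power_spectrum_means(imfs):
    """Mean of the power spectrum |C_i(k)|^2 of each IMF (one value per IMF)."""
    imfs = np.atleast_2d(np.asarray(imfs, dtype=float))
    spectra = np.fft.fft(imfs, axis=1)    # C_i(k), one row per IMF
    power = np.abs(spectra) ** 2          # squared modulus, no normalization (assumption)
    return power.mean(axis=1)             # average over the frequency index k

# Assumed toy input: two short "IMFs" with known energy.
toy_imfs = np.array([[1.0, 0.0, 0.0, 0.0],
                     [1.0, 1.0, 1.0, 1.0]])
features = imf_power_spectrum_means(toy_imfs)
print(features)
```

With this normalization, Parseval's relation makes each feature equal to the energy of the corresponding IMF (the sum of its squared samples), so the feature vector describes how the signal energy is distributed across decomposition scales.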

Features based on approximate entropy

The pipeline leakage is a local small-probability event, so that the leakage signal and the non-leakage signal have a certain difference in randomness, and the characteristics of the leakage signal can be extracted from the viewpoint of analyzing the randomness of the signals. Approximate Entropy (ApEn) is the conditional probability that a similarity vector continues to maintain its similarity as it increases from dimension m to dimension m +1, and is the magnitude of the probability that a new pattern will be produced when the dimension changes. The greater the probability of generating a new pattern, the more complex the signal and the greater the corresponding approximate entropy. Therefore, the present invention selects the approximate entropy as one of the characteristics of the leakage signal identification.

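First, two sequences of length m are constructed from the N samples u(1), u(2), …, u(N) of the collected pipeline signal, $x(i) = [u(i), u(i+1), \ldots, u(i+m-1)]$ and $x(j) = [u(j), u(j+1), \ldots, u(j+m-1)]$, where $i, j \le N-m+1$. Then the distance between the sequences x(i) and x(j) is calculated,

$$d[x(i), x(j)] = \max_{k=1,2,\ldots,m} \left| u(i+k-1) - u(j+k-1) \right| \tag{15}$$

Given a threshold r, for each $i \le N-m+1$ the number of distances satisfying $d[x(i), x(j)] \le r$ is counted, and the ratio between this number and the total number of vectors is calculated,

$$C_i^m(r) = \frac{\operatorname{num}\{ d[x(i), x(j)] \le r \}}{N-m+1} \tag{16}$$

For all values of i, the average of $\ln C_i^m(r)$ is taken and denoted $\phi^m(r)$,

$$\phi^m(r) = \frac{1}{N-m+1} \sum_{i=1}^{N-m+1} \ln C_i^m(r) \tag{17}$$

Increasing m by 1 and repeating (15) to (17) gives $\phi^{m+1}(r)$. From $\phi^{m+1}(r)$ and $\phi^m(r)$, the approximate entropy is obtained as

$$\mathrm{ApEn}(m, r) = \phi^m(r) - \phi^{m+1}(r) \tag{18}$$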

the above analysis shows that the approximate entropy is a dimensionless scalar quantity whose value is related to m and r. In order to make the approximate entropy have reasonable statistical properties, m is 2, and r is generally 0.1-0.2 times of the Standard Deviation (SD) according to the experience. Fig. 4 is the approximate entropy of the signal when a leak occurs and when no leak occurs in a cast iron pipe. In the experiment, 50 sets of data were extracted for each case, with data length of 5000 for each set, m being 2 and r being 0.2 SD. From the results of fig. 4, it can be seen that the ApEn mean of the signal at the time of leakage is significantly higher than that at the time of no leakage, which indicates that the randomness characteristic of the leakage signal is higher than that of the signal without leakage, and can be used as the characteristic of the leakage identification.
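The approximate entropy computation can be sketched as follows. This is an illustrative implementation under stated assumptions: self-matches are counted (so the logarithm is always defined), r is given as a multiple of the signal's standard deviation, and the synthetic test signals are stand-ins rather than pipeline data.

```python
import numpy as np

def _phi(u, m, r):
    """Average of ln C_i^m(r) over all template vectors of length m."""
    n_vec = u.size - m + 1
    # Row i is the template vector x(i) = [u(i), ..., u(i+m-1)].
    x = np.array([u[i:i + m] for i in range(n_vec)])
    counts = np.empty(n_vec)
    for i in range(n_vec):
        # Maximum absolute coordinate difference between x(i) and every x(j).
        dist = np.max(np.abs(x - x[i]), axis=1)
        counts[i] = np.sum(dist <= r)     # includes the self-match j = i
    return float(np.mean(np.log(counts / n_vec)))

def approximate_entropy(u, m=2, r_factor=0.2):
    """ApEn(m, r) with r = r_factor * standard deviation of u."""
    u = np.asarray(u, dtype=float)
    r = r_factor * np.std(u)
    return _phi(u, m, r) - _phi(u, m + 1, r)

# Assumed synthetic signals: a regular sine and white noise.
rng = np.random.default_rng(0)
sine = np.sin(2.0 * np.pi * np.arange(500) / 25.0)
noise = rng.normal(size=500)
apen_sine = approximate_entropy(sine)
apen_noise = approximate_entropy(noise)
print(apen_sine < apen_noise)
```

A regular signal yields a small value and an irregular one a larger value, which matches the way the feature is used here to separate the more random leakage signal from the non-leakage signal.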

Features based on principal components of the signal

Principal Component Analysis (PCA) is a classical feature extraction method. It uses the idea of dimension reduction to convert many variables into a few composite variables (the principal components). Each principal component is a linear combination of the original variables, and the principal components are uncorrelated with each other. The principal components reflect most of the information in the original variables, and the information they carry does not overlap. The present invention uses principal component analysis to analyze the difference between the leakage and non-leakage signals of the pipeline.

m groups of pipeline signals $x_1, x_2, \ldots, x_m$ are collected, each containing n samples, which can be expressed as $x_i = (x_{1i}, x_{2i}, \ldots, x_{ni})^T$. The n × m matrix $X = [x_1\ x_2\ \cdots\ x_m]$ thus formed is

$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nm} \end{bmatrix} \tag{19}$$

According to the principle of PCA, the eigenvectors $\alpha_i = (\alpha_{1i}, \alpha_{2i}, \ldots, \alpha_{mi})^T$ (i = 1, 2, …, l) corresponding to the first l (0 < l ≤ m) eigenvalues (arranged from large to small) of the covariance matrix of X are used to obtain l new vectors,

$$y_i = X\alpha_i, \quad i = 1, 2, \ldots, l \tag{20}$$

$y_i$ is called a principal component of X, and in equation (20) $y_i$ and $y_j$ (i ≠ j; i, j = 1, 2, …, l) are uncorrelated. $y_1$ is the linear combination of $x_1, x_2, \ldots, x_m$ with the largest variance; $y_2$ is the linear combination with the largest variance among all those uncorrelated with $y_1$. By analogy, $y_l$ is the linear combination with the largest variance among all those uncorrelated with $y_1, y_2, \ldots, y_{l-1}$. In practical applications, the value of l can be determined by the contribution rate and the cumulative contribution rate. The contribution rate is defined as

$$\frac{\lambda_i}{\sum_{k=1}^{m} \lambda_k}, \quad i = 1, 2, \ldots, m \tag{21}$$

where $\lambda = [\lambda_1\ \lambda_2\ \cdots\ \lambda_m]$ are the eigenvalues of the covariance matrix of X, arranged from large to small. A threshold is then set, and l is taken as the value at which the cumulative contribution rate

$$\sum_{i=1}^{l} \lambda_i \Big/ \sum_{k=1}^{m} \lambda_k$$

reaches the set threshold.
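Choosing l from the cumulative contribution rate amounts to a cumulative sum over the sorted eigenvalues. The sketch below is illustrative; the threshold value of 0.85 and the function name are assumptions, since the text does not state which threshold was used.

```python
import numpy as np

def choose_num_components(eigenvalues, threshold=0.85):
    """Smallest l whose cumulative contribution rate reaches the threshold."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]   # large to small
    cumulative = np.cumsum(lam) / np.sum(lam)
    l = int(np.searchsorted(cumulative, threshold)) + 1
    return min(l, lam.size)               # guard against rounding at threshold = 1

# Assumed eigenvalues: cumulative contribution rates are 0.6, 0.9 and 1.0.
print(choose_num_components([6.0, 3.0, 1.0], threshold=0.85))
```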

In the method, an n × l principal component signal matrix $Y = [y_1\ y_2\ \cdots\ y_l]$ is constructed from the principal components of the signals. Then a matrix G is constructed from the inner products $g_{ji} = \langle y_j, x_i \rangle$ of the principal component signal matrix and the original signal matrix,

$$G = Y^T (X - E[X]) \tag{22}$$

Further, $g_j = [g_{j1}\ g_{j2}\ \cdots\ g_{jm}]$, 0 < j ≤ l, is selected as the feature for water leakage identification.
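The construction of the principal component feature matrix can be sketched with NumPy. This is a minimal illustration under stated assumptions: E[X] is interpreted as the per-column (per-group) sample mean, the covariance is computed between the m signal groups, and the random input is a stand-in for real pipeline signals.

```python
import numpy as np

def pca_feature_matrix(X, l):
    """Return G = Y^T (X - E[X]) and the l largest covariance eigenvalues."""
    X = np.asarray(X, dtype=float)                # n samples x m signal groups
    X_centered = X - X.mean(axis=0, keepdims=True)   # assumption: column-wise mean as E[X]
    cov = np.cov(X, rowvar=False)                 # m x m covariance matrix of X
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:l]         # indices of the l largest eigenvalues
    alpha = eigvecs[:, order]                     # m x l, columns are alpha_i
    Y = X @ alpha                                 # n x l, columns are y_i = X alpha_i
    G = Y.T @ X_centered                          # l x m, rows are the features g_j
    return G, eigvals[order]

# Assumed stand-in data: 4 signal groups of 200 samples each.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
G, lam = pca_feature_matrix(X, l=2)
print(G.shape)
```

Each row of G is one feature vector $g_j$ with one entry per signal group, so l controls how many such rows are passed to the classifier.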

Leakage signal detection based on time-frequency features and support vector machine

The time-frequency features proposed above each have their own strengths in identifying pipeline leakage, but the probability of misjudgment is high when any single feature is considered alone. For example, when the power spectrum distributions of the leakage and non-leakage signals differ significantly, the mean power spectrum of the intrinsic mode functions identifies leaks well, but it is prone to misjudgment when in-band interference is present. When the leakage at a leak point is small, the difference between the approximate entropy means of the leakage and non-leakage signals is not obvious, and that feature is likewise prone to misjudgment.

In order to improve the accuracy of leakage detection, the invention uses the combination of the features proposed above as the identification features, and adopts a Support Vector Machine (SVM) to classify the signal features and judge whether the pipeline leaks. The SVM is a data mining method based on statistical learning theory; it has advantages in handling small samples, nonlinearity and high-dimensional data, and is widely applied in data prediction, data fitting, pattern recognition and other fields. Assume the training set data samples are $(x_i, y_i)$, where $1 \le i \le N$, each sample $x_i \in R^d$, d is the dimension of the input space, and $y_i \in \{-1, 1\}$ is the category label. If the training set can be linearly separated by a hyperplane, the hyperplane can be expressed as $w \cdot x + b = 0$, where w and b determine the position of the hyperplane. Samples that satisfy the following condition are called support vectors,

$$y_i (w \cdot x_i + b) = 1 \tag{23}$$

The optimal division of the samples is in fact the problem of solving for the optimal classification hyperplane,

$$\min_{w, b, \xi}\ \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} \xi_i \quad \text{s.t.}\quad y_i (w \cdot x_i + b) \ge 1 - \xi_i,\ \ \xi_i \ge 0,\ \ i = 1, \ldots, N \tag{24}$$

where w is the coefficient vector of the classification hyperplane in the feature space; b is the threshold of the classification surface; $\xi_i \ge 0$ is a slack variable introduced to account for classification errors; and C is the penalty factor for misclassified samples. The resulting optimal classification hyperplane can be expressed as

$$w_0 \cdot x + b_0 = 0 \tag{25}$$

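Using the Lagrange multiplier method, the hyperplane optimization problem of equation (24) can be converted into its dual problem,

$$\max_{\alpha}\ \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) \quad \text{s.t.}\quad \sum_{i=1}^{N} \alpha_i y_i = 0,\ \ 0 \le \alpha_i \le C \tag{26}$$

where $\alpha = (\alpha_1, \ldots, \alpha_N)$ are the Lagrange multipliers, and the samples with $\alpha_i > 0$ are the support vectors. From the optimal hyperplane obtained by equation (26), the corresponding decision function is

$$f(x) = \operatorname{sgn}\left( \sum_{i=1}^{N} \alpha_i^* y_i (x_i \cdot x) + b_0 \right) \tag{27}$$

where $\alpha_i^*$ is the value of $\alpha_i$ at the optimal solution of equation (26).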

For the case that is not linearly separable, the low-dimensional input space $R^d$ can be mapped to a high-dimensional feature space H by a mapping function Φ (realized in the SVM through a kernel function), so that the training samples are converted from a linearly inseparable problem in low dimension to a linearly separable problem in high dimension. The dual of the optimization problem then becomes

$$\max_{\alpha}\ \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \quad \text{s.t.}\quad \sum_{i=1}^{N} \alpha_i y_i = 0,\ \ 0 \le \alpha_i \le C \tag{28}$$

where $K(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j)$ is the kernel function. Equation (28) shows that, for a problem that is not linearly separable, an appropriate kernel function K(·) must be selected to construct the SVM model. The decision function corresponding to equation (28) is

$$f(x) = \operatorname{sgn}\left( \sum_{i=1}^{N} \alpha_i^* y_i K(x_i, x) + b_0 \right) \tag{29}$$

In order to improve the accuracy of water leakage detection, the support vector machine needs to be optimized using known signals; the optimization process is shown in FIG. 5. Considering the influence of environmental factors on underground water supply pipelines, signals must be acquired from pipelines in different areas, with and without leakage, at different time periods to construct a signal sample library, from which signals are randomly selected as training samples and test samples. First, the SVM is trained with the feature set of the training samples to form a preliminary identification model. Then the trained SVM model is tested with the feature set of the test samples. The SVM is further optimized according to the test results until the accuracy of the test output meets the requirement, forming the SVM model for pipeline leakage recognition. In application, the host computer of the detection system feeds the signals acquired by the wireless sensor network into the SVM model and judges whether the pipeline is leaking according to the label output by the model.

Theoretical analysis of equations (26) - (29) shows that the factors determining the performance of the SVM model are the kernel function and the penalty factor C. The kernel function of the SVM is mainly divided into a linear kernel, a polynomial kernel, a Sigmoid kernel, and a radial basis kernel. According to the characteristics of the water leakage signal and the generalization capability of the SVM model, the method selects the radial basis kernel as the kernel function. The expression of the radial basis kernel function is:

K(x_i,x_j)＝exp(-γ||x_i-x_j||²) (30)
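Equation (30) can be evaluated for whole sets of feature vectors at once. The following sketch is illustrative only; the function name and the example vectors are assumptions.

```python
import numpy as np

def rbf_kernel_matrix(A, B, gamma):
    """K[i, j] = exp(-gamma * ||A[i] - B[j]||^2) for row vectors in A and B."""
    A = np.atleast_2d(np.asarray(A, dtype=float))
    B = np.atleast_2d(np.asarray(B, dtype=float))
    # Squared distances via ||a||^2 + ||b||^2 - 2 a.b, clipped at 0 against rounding.
    sq_dist = (np.sum(A ** 2, axis=1)[:, None]
               + np.sum(B ** 2, axis=1)[None, :]
               - 2.0 * A @ B.T)
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

# Assumed example: two 2-D feature vectors at squared distance 2.
points = [[0.0, 0.0], [1.0, 1.0]]
K = rbf_kernel_matrix(points, points, gamma=0.5)
print(K)
```

A larger γ makes the kernel value fall off faster with distance, so each training sample influences a smaller neighbourhood; this is why γ has to be tuned together with the penalty factor C.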

Therefore, for an SVM based on the radial basis kernel function, performance is determined by the parameters (C, γ). To achieve a high recognition rate with the SVM model, the optimization process in FIG. 5 in effect tunes the parameters C and γ using the training samples and test samples. Numerous studies show that exponentially growing sequences of C and γ work well in practice, typically $C = 2^{-5}, 2^{-4}, \ldots, 2^{15}$ and $\gamma = 2^{-15}, 2^{-14}, \ldots, 2^{5}$.

The method optimizes the SVM parameters using a grid search based on cross-validation.

First, the parameters are assigned values over their ranges, $C = 2^x$, $x \in [-5, 15]$, and $\gamma = 2^y$, $y \in [-15, 5]$. Then the SVM model is tested with the training samples and test samples under each combination of $2^x$ and $2^y$, and the test accuracy is output. Finally, the C and γ with the best cross-validation accuracy are selected as the parameters of the water leakage identification SVM.
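The grid search over (C, γ) can be sketched with scikit-learn. This is an illustration under stated assumptions, not the patent's experiment: the feature vectors are synthetic Gaussian stand-ins, 5-fold cross-validation is an assumed choice, and the variable names are hypothetical. The labels follow the convention of -1 for leakage and 1 for no leakage.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Assumption: synthetic 3-dimensional feature vectors replace real extracted features.
leak_features = rng.normal(loc=1.0, scale=1.0, size=(50, 3))
no_leak_features = rng.normal(loc=-1.0, scale=1.0, size=(50, 3))
features = np.vstack([leak_features, no_leak_features])
labels = np.concatenate([-np.ones(50, dtype=int), np.ones(50, dtype=int)])

# Integer powers of 2: 21 values of C and 21 values of gamma.
param_grid = {
    "C": [2.0 ** x for x in range(-5, 16)],
    "gamma": [2.0 ** y for y in range(-15, 6)],
}

# Assumption: 5-fold cross-validated accuracy is the selection criterion.
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5, scoring="accuracy")
search.fit(features, labels)

print(search.best_params_)
print(search.best_score_)
```

After fitting, `search.best_estimator_` is an SVM retrained on all supplied samples with the selected (C, γ), and it can be evaluated on a held-out test set in the same way as the test samples described in the text.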

Example 1

In the experiment, a PVC water supply pipeline was selected for signal acquisition. To train and optimize the support vector machine, 100 groups of data, each of length 5000, were collected for each of the leakage and non-leakage conditions at different time periods. In addition, 100 data sets were acquired for each of the leakage and non-leakage conditions during a relatively quiet early-morning period, and the validity of the leakage detection and delay estimation algorithms studied in the invention was verified by artificially adding noise.

First, 50 sets of data are drawn from each of the leakage and non-leakage signals collected at different time periods to form a training sample of 100 sets. A test sample of 100 sets is then built from the remaining data. The parameters (C, γ) of the support vector machine are integer powers of 2, with C ∈ [2^-5, 2^15] and γ ∈ [2^-15, 2^5]. Using the grid search method, the model was trained for each of the 21 × 21 = 441 combinations of (C, γ), its performance was evaluated on the test set, and the resulting detection accuracy is shown in FIG. 6. The results in FIG. 6 show that the highest recognition accuracy of the proposed algorithm is 98%. Moreover, when the penalty factor C ≥ 2^2, the kernel function parameter γ ≤ 2^0, and the product of the two parameters satisfies 2^1 ≤ C × γ ≤ 2^7, the SVM model based on the radial basis kernel recognizes pipeline leakage signals well.

FIG. 7 shows the pipeline leakage detection results obtained with the parameter combination (C, γ) = (2^9, 2^-4), where the label of a leakage signal is set to -1 and the label of a non-leakage signal is set to 1. The identification results show that the proposed method misjudges only 2 groups of leakage signals as non-leakage signals and classifies all other cases correctly. Table 1 shows the leakage identification results of the proposed algorithm after Gaussian noise and impulse noise were artificially added to the leakage signals acquired in the quiet period. The results show that the water supply pipeline leakage detection method based on signal multi-features and the support vector machine can effectively detect pipeline leakage.
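To make the label convention concrete, the following sketch counts correct and incorrect decisions from true and predicted labels. The function name `recognition_summary` is hypothetical, and the label lists are synthetic: they only mimic the situation described above (a test sample of 100 sets, 2 leakage sets judged as non-leakage) and are not the measured data.

```python
def recognition_summary(true_labels, predicted_labels, leak_label=-1):
    """Return (accuracy, missed_leaks, false_alarms) for two label sequences."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("label sequences must be non-empty and of equal length")
    pairs = list(zip(true_labels, predicted_labels))
    correct = sum(1 for t, p in pairs if t == p)
    # Leakage signals that the classifier labelled as non-leakage.
    missed_leaks = sum(1 for t, p in pairs if t == leak_label and p != leak_label)
    # Non-leakage signals that the classifier labelled as leakage.
    false_alarms = sum(1 for t, p in pairs if t != leak_label and p == leak_label)
    return correct / len(pairs), missed_leaks, false_alarms


# Synthetic labels: 50 leakage sets (-1) followed by 50 non-leakage sets (1).
true_labels = [-1] * 50 + [1] * 50
# Two leakage sets are judged as non-leakage; everything else is correct.
predicted_labels = [1, 1] + [-1] * 48 + [1] * 50
accuracy, missed_leaks, false_alarms = recognition_summary(true_labels, predicted_labels)
```

Two missed leakage sets out of 100 test sets correspond to an accuracy of 0.98, which matches the highest recognition accuracy reported for FIG. 6.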

TABLE 1 rate of leakage signal identification in Gaussian noise and impulsive noise environments

While embodiments of the present invention have been presented herein, it will be appreciated by those skilled in the art that changes may be made to the embodiments herein without departing from the spirit of the invention. The above examples are merely illustrative and should not be taken as limiting the scope of the invention.

Claims

1. The method is characterized in that three time-frequency characteristics are provided based on a signal inherent modal function, approximate entropy and principal components according to the difference of time-frequency characteristics of leakage signals and non-leakage signals, a characteristic matrix is constructed by using the characteristics to serve as the input of a support vector machine, the support vector machine serves as a classifier to identify the signals and output an identification result, and whether the signals are leakage signals or non-leakage signals is determined;

the method comprises the following steps:

s1: inputting the detected signal;

s2: performing feature extraction on an input signal;

s4: the support vector machine outputs an identification result according to the input signal characteristics, and determines whether the signal is a leakage signal or a non-leakage signal;

frequency domain features based on the natural modal function, features based on approximate entropy and features based on principal component analysis;

the step of extracting the frequency domain features based on the natural modal functions comprises the following steps:

2. The method of claim 1, wherein the extracting approximate entropy-based features step comprises:

d[x(i), x(j)] = max_{k=1,2,…,m} |u(i+k-1) - u(j+k-1)|

for all values of i, solving

has an average value of φ^m(r),

ApEn(m, r) = φ^m(r) - φ^(m+1)(r), which is taken as a time-frequency characteristic of the leakage signal.

3. The method of claim 2, wherein r is 0.1 to 0.2 times the standard deviation of the detected signal.
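For illustration only, approximate entropy can be sketched in Python as below. Several details are assumptions rather than statements of the claimed method: the counting step follows the commonly used definition (the fraction of length-m vectors whose distance to x(i) is at most r, followed by a natural logarithm), the embedding dimension defaults to m = 2, and r is taken as 0.2 times the population standard deviation. The names `approximate_entropy` and `_phi` are hypothetical.

```python
import math
import statistics


def _phi(u, m, r):
    """Average over i of ln(fraction of length-m vectors within distance r of x(i))."""
    n_vectors = len(u) - m + 1
    vectors = [u[i:i + m] for i in range(n_vectors)]
    log_sum = 0.0
    for x_i in vectors:
        # The distance is the largest component-wise absolute difference.
        matches = sum(
            1 for x_j in vectors
            if max(abs(a - b) for a, b in zip(x_i, x_j)) <= r
        )
        # matches >= 1 because every vector matches itself, so the log is defined.
        log_sum += math.log(matches / n_vectors)
    return log_sum / n_vectors


def approximate_entropy(u, m=2, r_factor=0.2):
    """ApEn(m, r) = phi^m(r) - phi^(m+1)(r), with r = r_factor * std(u)."""
    if len(u) < m + 2:
        raise ValueError("signal is too short for the chosen m")
    r = r_factor * statistics.pstdev(u)
    return _phi(u, m, r) - _phi(u, m + 1, r)
```

A regular signal such as a strict alternation gives a value close to 0, while an irregular signal gives a larger value. The cost grows with the square of the signal length, so long records may need a faster implementation.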

4. The method of claim 1, wherein the step of extracting features based on principal component analysis comprises:

collecting m groups of pipeline signals x_1, x_2, …, x_m, each group containing n samples and denoted x_i = (x_{1i}, x_{2i}, …, x_{ni})^T, so that the signals form the n × m matrix X = [x_1 x_2 … x_m], whose entry in row j and column i is x_{ji};

taking the first l eigenvalues of the covariance matrix of X, where 0 < l ≤ m and the eigenvalues are arranged from large to small, together with the corresponding eigenvectors α_i = (α_{1i}, α_{2i}, …, α_{mi})^T (i = 1, 2, …, l), l new vectors can be determined,

y_i = Xα_i, where i = 1, 2, …, l

y_i is a principal component of X;

using the principal components y_i of the signal to construct the n × l principal component signal matrix Y = [y_1 y_2 … y_l], and constructing a matrix G from the inner products g_{ji} = [y_j, x_i] of the principal component signal matrix and the original signal matrix,

G = Y^T (X - E[X])

selecting g_j = [g_{j1} g_{j2} … g_{jm}], 0 < j ≤ l, as the time-frequency characteristic of the leakage signal.

5. The method of claim 4, wherein the value of l is determined by the contribution rate and the cumulative contribution rate.
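One common way to choose l from the contribution rates is sketched below. The text does not state the selection rule, so both the rule (the smallest l whose cumulative contribution rate reaches a threshold) and the default threshold of 0.85 are assumptions for illustration, as is the function name `select_component_count`.

```python
def select_component_count(eigenvalues, threshold=0.85):
    """Return the smallest l whose cumulative contribution rate reaches `threshold`.

    The contribution rate of an eigenvalue is its share of the eigenvalue sum.
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    cumulative = 0.0
    for l, value in enumerate(ordered, start=1):
        cumulative += value / total
        if cumulative >= threshold:
            return l
    # Reached only if rounding keeps the cumulative rate just below the threshold.
    return len(ordered)
```

For eigenvalues 6, 2, 1, 1 the contribution rates are 0.6, 0.2, 0.1, 0.1, so the cumulative rate first reaches 0.85 at the third component and `select_component_count([6.0, 2.0, 1.0, 1.0])` returns 3.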

6. The method of claim 1, wherein the support vector machine is optimized by:

acquiring signals to construct a signal sample library when pipelines in different areas leak or do not leak in different time periods, and randomly selecting signals from the signal sample library as training samples and test samples;

training the support vector machine by utilizing the feature set of the training sample to form a support vector machine identification model;

testing the trained support vector machine identification model by using the feature set of the test sample;

a support vector machine model for pipe leak identification is formed.