CN106934208B

CN106934208B - A kind of dam thundering observed data automatic identifying method

Info

Publication number: CN106934208B
Application number: CN201710006186.5A
Authority: CN
Inventors: 杨鸽; 沈海尧; 王玉洁; 崔何亮
Original assignee: Large Dam Safety Supervision Center National Energy Administration Of People's Republic Of China; PowerChina Huadong Engineering Corp Ltd
Current assignee: Large Dam Safety Supervision Center National Energy Administration Of People's Republic Of China; PowerChina Huadong Engineering Corp Ltd
Priority date: 2017-01-05
Filing date: 2017-01-05
Publication date: 2019-07-23
Anticipated expiration: 2037-01-05
Also published as: CN106934208A

Abstract

The present invention discloses a kind of dam thundering observed data automatic identifying method, comprising: the track matrix of building primary monitoring data sequence carries out singular value decomposition to track matrix and obtains feature group；Feature group is arranged from big to small according to characteristic value, selecting several the preceding feature groups of accumulation contribution rate more than or equal to 85% is main feature group；The corresponding fundamental matrix of main feature group is calculated, diagonal equalization is carried out to fundamental matrix and obtains the main component of data sequence；Main component is added up and obtains reconstruct data sequence；Subtract each other to obtain residual sequence with reproducing sequence and former data sequence, finds out the standard deviation of residual sequence；Judge whether measured value is exceptional value by Pauta criterion according to the residual error of measured value.This method can automatically extract the main feature of monitoring data sequence, avoid artificial founding mathematical models, not only can guarantee the consistency and accuracy of judgement, also reduce human resources investment；It still is able to differentiate monitoring data when the environment parameters such as water level, temperature missing.

Description

A kind of dam thundering observed data automatic identifying method

Technical field

The present invention relates to Dam Monitoring field more particularly to a kind of dam thundering observed data automatic identifying methods.

Background technique

China builds together into 100 meters or more more than 200 seats of dam, wherein 150 meters or more of more than 40 seat of high dam.To these dams into Row monitoring, monitoring are to guarantee the important means of its safe operation." the Dam operational safety prison that Bureau of Energy in 2015 promulgates Superintend and direct administrative provisions " " for 100 meters of height of dam or more of dam, electric power enterprise should establish dam safety and supervise online further requirement Control system ", " dam center should carry out remote online technical supervision to the dam operational safety that registration (putting on record) registers ".Due to Operating condition is complicated, and the monitoring point that high dam is laid is usually far more than common engineering；And when being managed to group dam, the survey that is related to Point number is even more huge.

The vital task of on-line monitoring is identification abnormal operating condition, and basis is identification thundering observed data.Currently, The visible outlier identification method in Dam Monitoring field is divided into method of identification based on envelope domain according to its principle, based on item The method of identification of part similitude and method of identification three classes based on mathematical model.Wherein, the exception based on statistical regression mathematical model Value recognition methods can be normally reached higher susceptibility, while also be not easy for normal value to be mistaken for exception, therefore application is most It is extensive.But the people that the foundation of regression model is typically required for abundant data analysis experience operates completion.When to monitoring point pole When more high dams or group dam is monitored, human cost is huge；Meanwhile different operators be likely to occur standard it is inconsistent, The even situation of inaccuracy.In addition, also typically including the environment parameters such as water level, temperature in statistical model, and pass through in practical projects Often there is the case where reporting and submitting when environment parameter and effect quantity difference, at this moment will be unable to obtain effect quantity predicted value using statistical model, Thus also can not just judge whether measured value is abnormal.

Summary of the invention

Manual operation and environment parameter are relied in the dam thundering observed data identification technology based on statistical model The excessively high problem of degree, the present invention provide a kind of method of automatic identification thundering observed data, it is intended to promote dam safety monitor Precision, efficiency and the robustness that abnormal value identifies in work, and reduce human resources investment.

To achieve the goals above, the present invention discloses a kind of dam thundering observed data automatic identifying method, the method The following steps are included:

1) by primary monitoring data sequence construct track matrix, singular value decomposition then is carried out to track matrix and obtains a system Column feature group；

2) feature group is arranged from big to small according to characteristic value, selection accumulation contribution rate more than or equal to 85% it is preceding several Feature group is main feature group；

3) the corresponding fundamental matrix of main feature group is calculated, diagonal equalization then is carried out to fundamental matrix and obtains data sequence Several preceding main components of column；

4) main component obtained is added up and obtains reconstruct data sequence；

5) subtract each other to obtain residual sequence with reproducing sequence and former data sequence, find out the standard deviation of residual sequence；

6) judge whether measured value is exceptional value by Pauta criterion according to the standard deviation of residual sequence.

Further, in above-mentioned steps disclosed by the invention, the track matrix X is by monitoring data sequence f₀,f₁, f₂,...,f_N-1Lag arrangement in time obtains, and indicates are as follows:

Wherein, N is monitoring data sequent data count, and L is length of window, and 1 < L < N, K includes by the track every row of matrix X Measured value quantity, K=N-L+1；I, j are used to indicate element x_ijIt is to be arranged in the i-th row jth the location of in the matrix X of track.

Further, described that track matrix X progress singular value decomposition is referred to: to seek S=XX^TNon-negative eigenvalue λ₁,λ₂, λ₃,...,λ_lAnd the feature vector U of corresponding orthonormalization₁,U₂,U₃,...,U_lAndThe spy Sign group refers to (λ_i,U_i,V_i), referred to as ith feature group.

Further, the contribution rate CR of the ith feature group_iIt is calculated by following formula:

Further, the main feature group is the preceding m feature group accumulated contribution rate and be more than or equal to 85%, it may be assumed that

It is that i, j are used to indicate for which characteristic value, and m indicates that the sum of main feature group, l indicate the total of non-negative feature group Number.

Further, the fundamental matrix X_iIt is calculated by feature group:

Further, described to fundamental matrix X_iThe process diagonally equalized are as follows: for the matrix X of L × K_i, enable it Element is x_ij, L^*=min (L, K), K^*=max (L, K), N=L+K-1；If L < K, x_ij ^*=x_ij, otherwise x_ij ^*=x_ji.It is described I-th of main componentIt is calculated as follows:

The reconstruct data sequence is obtained by preceding m main component is cumulative, it may be assumed that

The residual sequence Δ_kSubtract each other to obtain by reproducing sequence and former data sequence, it may be assumed that

When the Pauta criterion is that take confidence level be 99%, the standard deviation of residual sequence is σ, when following formula is set up Measured value is normal value, is otherwise exceptional value:

|Δ_k|≤2.58σ (14)

The invention has the benefit that monitoring data can be automatically extracted using the outlier identification method that the present invention announces The main feature of sequence, avoids artificial founding mathematical models, not only can guarantee the consistency and accuracy of judgement, also substantially reduces Human resources investments.Meanwhile it still being able to differentiate monitoring data when the environment parameters such as water level, temperature missing.

Detailed description of the invention

Fig. 1 is the flow chart of one embodiment of method of the invention.

Fig. 2 is measured data to be analyzed in one embodiment, is a top of arch dam displacement monitoring data.

Fig. 3 is the contribution rate distribution map of preceding 20 feature groups in one embodiment.

Fig. 4 is the corresponding main component curve graph of CR1=67.1% in one embodiment.

Fig. 5 is the corresponding main component curve graph of CR2=12% in one embodiment.

Fig. 6 is the corresponding main component curve graph of CR3=11.98% in one embodiment.

Fig. 7 is the reconstruct data sequence obtained in one embodiment by first three main component.

Fig. 8 is the residual sequence acquired in one embodiment by reconstruct data sequence and measured data sequence and reasonable residual error Range.

Specific embodiment

Below in conjunction with attached drawing and one embodiment to the reality of Dam Monitoring Data outlier identification method of the present invention It applies mode and beneficial effect is described further.Fig. 1 is the flow chart for the outlier identification method announced according to the present invention, is used This method carries out analysis identification to top of arch dam displacement monitoring data shown in Fig. 2.The following steps are included:

Step 1: reading monitoring data sequence f₀,f₁,f₂,...,f_N-1, data sequence is lagged to arrangement in time and is obtained Track matrix X:

Wherein, N is monitoring data sequent data count, and L is known as length of window, 1 < L < N；For there is periodic monitoring data Sequence, it is desirable that the value of L is greater than the length of a cycle.L=400 days, N=6800 days in the present embodiment.

Step 2: singular value decomposition being carried out to track matrix X: seeking matrix S=XX^T, seek the non-negative eigenvalue λ of S₁,λ₂, λ₃,...,λ_lAnd the feature vector U of corresponding orthonormalization₁,U₂,U₃,...,U_lAndObtain l Feature group (λ_i,U_i,V_i), i=1,2 ... .l.

Step 3: seeking the contribution rate CR of ith feature group_i, its calculation formula is:

Wherein, the non-negative characteristic value sum that l is matrix S.

Step 4: feature group being arranged from big to small according to characteristic value, the accumulation contribution rate SCR of i feature group before asking_i:

The accumulation contribution rate SCR of i feature group before judging_iWhether it is more than or equal to 85%, if set up, enables main feature Group number m=i, and choosing this m feature group is main feature group；Otherwise, the cumulative tribute for seeking next feature group is continued through Offer rate.

Fig. 3 is the contribution rate distribution map of preceding 20 feature groups of embodiment illustrated in fig. 2, it is seen that the contribution of first three feature group Rate be noticeably greater than after feature group, accumulation contribution rate be 91%；Therefore, select first three feature group for the master of the present embodiment Want feature group.

Step 5: calculating the corresponding fundamental matrix of main feature group: the corresponding fundamental matrix X of i-th of main feature group_iAre as follows:

Wherein, i=1,2 ..., m.

Step 6: diagonal equalization being carried out to the corresponding fundamental matrix of main feature group and seeks main component: for the square of L × K Battle array X_i, enabling its element is x_ij, L^*=min (L, K), K^*=max (L, K), N=L+K-1；If L < K, x_ij ^*=x_ij, otherwise x_ij ^* =x_ji.I-th of main componentIt is calculated as follows:

Fig. 4 show first three corresponding main component of main feature group of embodiment illustrated in fig. 2.

Step 7: preceding m main component being added up and obtains reconstruct data sequence, it may be assumed that

Wherein, k=0,1,2 ..., N-1.Heavy line show the reconstruct data sequence of corresponding embodiment illustrated in fig. 2 in Fig. 5 Column are obtained by first three corresponding main component of main feature group is cumulative.

Step 8: subtracting each other to obtain residual sequence Δ by reproducing sequence and former data sequence_k, it may be assumed that

Wherein, k=0,1,2 ..., N-1.Fig. 6 show the residual sequence of corresponding embodiment illustrated in fig. 2.

Step 9: find out the standard deviation sigma of residual sequence:

The standard deviation of the residual error of embodiment illustrated in fig. 2 is 0.9090.

Step 10: for each measured value, judge whether measured value is exceptional value according to Pauta criterion:

|Δ_k|≤2.58σ (9)

When above formula is set up, measured value corresponding to residual error is normal value；Otherwise the probability that the residual error corresponds to measured value appearance is less than 1%, thus be exceptional value.

The grey area Fig. 6 is residual error range ± 2.3452 of normal measured value in the embodiment determined according to above-mentioned criterion, Measured value corresponding to residual error except the region is abnormal value.

Claims

1. a kind of dam thundering observed data automatic identifying method, which is characterized in that the described method comprises the following steps:

1) by primary monitoring data sequence construct track matrix, singular value decomposition then is carried out to track matrix and obtains a series of spies Sign group；

2) feature group is arranged from big to small according to characteristic value, selection accumulation contribution rate is more than or equal to 85% several preceding features Group is main feature group；

3) the corresponding fundamental matrix of main feature group is calculated, diagonal equalization then is carried out to fundamental matrix and obtains data sequence Several preceding main components；

4) main component is added up and obtains reconstruct data sequence；

5) subtract each other to obtain residual sequence with reconstruct data sequence and former data sequence, find out the standard deviation of residual sequence；

2. dam thundering observed data automatic identifying method according to claim 1, which is characterized in that the track matrix X is by monitoring data sequence f₀,f₁,f₂,...,f_N-1Lag arrangement in time obtains, and indicates are as follows:

Wherein, N is monitoring data sequent data count, and L is length of window, the measured value that 1 < L < N, K includes by the track every row of matrix X Quantity, K=N-L+1；I, j are used to indicate element x_ijIt is to be arranged in the i-th row jth the location of in the matrix X of track.

3. dam thundering observed data automatic identifying method according to claim 1, which is characterized in that described to track square Battle array X carries out singular value decomposition and refers to: seeking S=XX^TNon-negative eigenvalue λ₁,λ₂,λ₃,...,λ_lAnd corresponding orthonormalization Feature vector U₁,U₂,U₃,...,U_lAndThe feature group refers to (λ_i,U_i,V_i), referred to as ith feature Group.

4. dam thundering observed data automatic identifying method according to claim 3, which is characterized in that described i-th special The contribution rate CR of sign group_iIt is calculated by following formula:

The main feature group is the preceding m feature group accumulated contribution rate and be more than or equal to 85%, it may be assumed that

It is that i, j are used to indicate for which characteristic value, and m indicates that the sum of main feature group, l indicate the sum of non-negative feature group.

5. dam thundering observed data automatic identifying method according to claim 1, which is characterized in that the fundamental matrix X_iIt is calculated by following formula:

6. dam thundering observed data automatic identifying method according to claim 1, which is characterized in that described to basic square Battle array X_iCarry out the process that diagonal equalization seeks main component are as follows: for the matrix X of L × K_i, enabling its element is x_ij, L*=min (L, K), K*=max (L, K), N=L+K-1；If L < K, x_ij*=x_ij, otherwise x_ij*=x_ji, i-th of main componentIt is calculated as follows:

7. dam thundering observed data automatic identifying method according to claim 1, which is characterized in that the reconstruct data Sequence is obtained by preceding m main component is cumulative, it may be assumed that

8. dam thundering observed data automatic identifying method according to claim 1, which is characterized in that the residual sequence Δ_kSubtract each other to obtain by reproducing sequence and former data sequence, it may be assumed that

9. dam thundering observed data automatic identifying method according to claim 1, which is characterized in that the La Yida is quasi- When being then that take confidence level be 99%, the standard deviation of residual sequence is σ, otherwise it is different that when following formula is set up, measured value, which is normal value, Constant value:

|Δ_k|≤2.58σ (8)。