CN114048808A - Gross error identification and elimination method, system, server and computer readable storage medium for monitoring data - Google Patents

Gross error identification and elimination method, system, server and computer readable storage medium for monitoring data

Info

Publication number
CN114048808A
Authority
CN
China
Prior art keywords
monitoring data
gross error
credibility
error identification
judgment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111323819.8A
Other languages
Chinese (zh)
Inventor
马靖航
李俊
张群
张鸣伦
咸永财
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guoneng Shaanxi Hydropower Co ltd
Original Assignee
Guoneng Shaanxi Hydropower Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guoneng Shaanxi Hydropower Co ltd filed Critical Guoneng Shaanxi Hydropower Co ltd
Priority to CN202111323819.8A priority Critical patent/CN114048808A/en
Publication of CN114048808A publication Critical patent/CN114048808A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]


Abstract

The invention provides a gross error identification and elimination method, system, server and computer readable storage medium for monitoring data. The gross error identification and elimination method comprises the following steps: acquiring monitoring data, and solving a maximum entropy probability density function of the sequence in which the monitoring data lies based on the maximum entropy principle; dividing the credibility of the data points of the monitoring data according to a preset grade division standard to form a plurality of credibility grades, and establishing a consistency judgment matrix based on a hierarchical analysis algorithm; for each quantile point ci, randomly generating k judgment matrices, calculating the consistency ratios of the k judgment matrices, and solving for the optimal quantile point cbest; and, according to the optimal quantile point cbest, dividing the critical probability of the credibility grades of the monitoring data within the range of n% and determining the critical values of the credibility grades according to the critical probability density function. After the technical scheme is adopted, the monitoring data can be accurately cleaned, the reliability of the monitoring data is improved, and a foundation is laid for the analysis of the monitoring data.

Description

Gross error identification and elimination method, system, server and computer readable storage medium for monitoring data
Technical Field
The invention relates to the field of safety monitoring, in particular to a gross error identification and elimination method, a gross error identification and elimination system, a server and a computer readable storage medium for monitoring data.
Background
In the field of hydraulic engineering safety, the analysis of monitoring data is of great significance. However, the authenticity of the data itself greatly affects the analysis results, and in practice it is inevitable that, for various reasons, a monitoring data sequence will contain data that is obviously inconsistent with the actual situation, i.e. gross errors. The existence of gross errors seriously affects the calculation precision of subsequent monitoring data analysis. In fact, the occurrence of gross errors, especially large ones, can cause the classical adjustment result to be seriously distorted or even completely unusable. To ensure that the monitoring data can be processed reliably, gross errors must be removed before safety monitoring data analysis is carried out.
In the prior art, gross errors are eliminated manually by observing the process line of the monitoring data and relying on engineering experience. The main existing gross error elimination methods include the 3σ (Pauta) criterion, the Dixon criterion, the limit error method and the like. These conventional methods have certain limitations: for example, the values of certain test statistics are estimates obtained by a particular mathematical method, and the calculation of the residuals also involves approximations, so these methods sometimes cannot accurately identify the presence of gross error data.
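By way of illustration of this limitation, the following minimal sketch applies the conventional 3σ criterion to a short monitoring sequence; the data values and function name are illustrative only and are not part of the disclosed method.

```python
import numpy as np

def three_sigma_flags(x: np.ndarray) -> np.ndarray:
    """Flag points whose residual from the sample mean exceeds three standard deviations."""
    residual = x - x.mean()
    return np.abs(residual) > 3.0 * x.std(ddof=1)

# A short sequence with one obvious spike: because the spike itself inflates the
# standard deviation, the 3-sigma test fails to flag it -- the limitation noted above.
data = np.array([10.1, 10.3, 10.2, 10.4, 25.0, 10.2, 10.3])
print(three_sigma_flags(data))   # [False False False False False False False]
```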
Therefore, a novel gross error identification and elimination method for monitoring data is needed, one that can improve the reliability of the monitoring data and the working efficiency of data cleaning.
Disclosure of Invention
In order to overcome the technical defects, the invention aims to provide a gross error identification and rejection method, a system, a server and a computer readable storage medium for monitoring data, which improve the reliability of the monitoring data by accurately cleaning the monitoring data and lay a foundation for the analysis of the monitoring data.
The invention discloses a gross error identification and elimination method for monitoring data, which comprises the following steps:
S100: acquiring monitoring data, and solving a maximum entropy probability density function of the sequence in which the monitoring data lies based on the maximum entropy principle;
s200: according to a preset grade division standard, dividing the credibility of data points of the monitoring data to form a plurality of credibility grades, and establishing the following consistency judgment matrix based on a hierarchical analysis algorithm:
[consistency judgment matrix in terms of L1 and L2, shown as an image in the original]
wherein L1 and L2 respectively represent the weight comparisons between different credibility grades, with values ranging from 1 to 9;
S300: for each quantile point ci of the 1–9 scale of L1 and L2, randomly generating k judgment matrices, calculating the consistency ratios of the k judgment matrices, taking the consistency ratio as the fitness function of a particle swarm algorithm, and solving for the optimal quantile point cbest;
S400: according to the optimal quantile point cbest, dividing the critical probability of the credibility grades of the monitoring data within the range of n%, and determining the critical values of the credibility grades according to the critical probability density function, so as to realize gross error identification and elimination of the monitoring data.
Preferably, step S100 includes:
s110: acquiring monitoring data;
s120: constructing the following maximum entropy model based on the maximum entropy principle:
max H(x) = −∫R f(x) ln f(x) dx
s.t. ∫R f(x) dx = 1
∫R x^i f(x) dx = μi (i = 1, 2, …, n)
wherein R is the set over which the monitoring data sequence x takes values, and μi is the i-th origin moment of the sequence x of monitoring data, estimated from the N measured values as
μi = (1/N)·Σj=1…N xj^i;
S130: the following Lagrange function is constructed:
L = H(x) + (λ0 + 1)(∫R f(x)dx − 1) + Σi=1…n λi(∫R x^i f(x)dx − μi)
wherein λ0 and λi are Lagrange multipliers;
S140: computing the partial derivative of the Lagrange function and setting it to zero,
∂L/∂f(x) = −ln f(x) − 1 + (λ0 + 1) + λ1x + λ2x^2 + … + λnx^n = 0
to obtain the following maximum entropy probability density function:
f(x) = exp(λ0 + λ1x + λ2x^2 + … + λnx^n)
exp(−λ0) = ∫R exp(λ1x + λ2x^2 + … + λnx^n) dx
preferably, step S140 includes:
S141: let:
Gi(λ) = ∫R x^i·exp(λ0 + λ1x + … + λnx^n) dx, i = 0, 1, …, n
S142: take an initial iterate λ = λ(0) and perform a first-order Taylor expansion of the above formula at λ(0):
Gi(λ) ≈ Gi(λ(0)) + Σj (∂Gi/∂λj)|λ=λ(0)·(λj − λj(0))
let Δ = λ − λ(0) and ζ = [μ0 − G0(λ(0)), μ1 − G1(λ(0)), …, μn − Gn(λ(0))]T, then:
Σj (∂Gi/∂λj)|λ=λ(0)·Δj = μi − Gi(λ(0)), i = 0, 1, …, n
S143: the above formula is abbreviated as G·Δ = ζ; Δ is solved for, λ(i+1) = λ(i) + Δ, and the iteration continues until convergence: |Δ| ≤ Δmin,
wherein Δmin is a preset iteration precision threshold.
Preferably, step S200 includes:
s210: according to a preset grade division standard, dividing the credibility of the data points of the monitoring data into normal points, suspicious points and gross error points;
S220: constructing a consistency judgment matrix among the comments using linearly composed qualitative terms, using L1 and L2, each with a value range of 1–9, to respectively represent the weight comparisons between different credibility grades, and dividing the 1–9 scale into two segments;
S230: denoting by ci a quantile point on the 1–9 scale, so that the qualitative terms map to: L1: [1, c), L2: [c, 9).
Preferably, step S220 includes:
S221: according to the mapping of the qualitative terms, the judgment matrix for the probability division of the credibility grades of the monitoring data is obtained as:
[judgment matrix R in terms of L1 and L2, shown as an image in the original]
S222: calculating the maximum eigenvalue of the judgment matrix R and the corresponding eigenvector: R·w* = λmax·w*, wherein λmax is the maximum eigenvalue of the judgment matrix and w* is the eigenvector corresponding to the maximum eigenvalue.
Preferably, step S300 includes:
S310: for each quantile point ci of the 1–9 scale of L1 and L2, randomly generating k judgment matrices;
S320: calculating the consistency ratio of the judgment matrices based on the consistency index of the judgment matrix as follows:
CR = CI/RI, with CI = (λmax − n)/(n − 1) and RI the average random consistency index,
wherein λmax is the maximum eigenvalue of the judgment matrix and n is the dimension of the judgment matrix;
S330: taking the consistency ratio of the judgment matrices as the fitness function of the particle swarm algorithm, and updating the particle velocity and the particle position of the particle swarm algorithm according to the following formula:
vi(t+1) = ξ·vi(t) + c1·r1·[pi(t) − xi(t)] + c2·r2·[pg(t) − xi(t)]
wherein t is the iteration number, c1 and c2 are acceleration factors, r1 and r2 are random numbers in the range 0–1, and ξ is the inertia weight;
S340: solving for the optimal quantile point cbest.
Preferably, step S400 includes:
S410: according to the optimal quantile point cbest, dividing the critical probability of the credibility grades of the monitoring data within the range of 1%;
S420: after the optimal quantile point cbest for grading the credibility of the monitoring data is obtained, obtaining the judgment matrix Rbest for which the consistency ratio CR is minimum:
[judgment matrix Rbest, shown as an image in the original]
S430: calculating the eigenvector w = [w1, w2, w3] corresponding to the maximum eigenvalue of the judgment matrix Rbest;
S440: normalizing the eigenvector w, and dividing the comment set of the credibility grades of the monitoring data within the [0, 1%] interval:
[division of the comment set of the credibility grades over the [0, 1%] interval, shown as an image in the original]
s450: and determining a critical value of the credibility grade of the monitoring data so as to realize gross error identification and elimination of the monitoring data.
The invention also discloses a gross error identification and elimination system for monitoring data, which comprises:
the solving module, used for acquiring the monitoring data and solving a maximum entropy probability density function of the sequence in which the monitoring data lies based on the maximum entropy principle;
the matrix module is used for dividing the credibility of the data points of the monitoring data according to a preset grade division standard to form a plurality of credibility grades, and establishing the following consistency judgment matrix based on a hierarchical analysis algorithm:
[consistency judgment matrix in terms of L1 and L2, shown as an image in the original]
wherein L1 and L2 respectively represent the weight comparisons between different credibility grades, with values ranging from 1 to 9;
the processing module, used for randomly generating, for each quantile point ci of the 1–9 scale of L1 and L2, k judgment matrices, calculating the consistency ratios of the k judgment matrices, taking the consistency ratio as the fitness function of the particle swarm algorithm, and solving for the optimal quantile point cbest;
the culling module, used for dividing, according to the optimal quantile point cbest, the critical probability of the credibility grades of the monitoring data within the range of n%, and determining the critical values of the credibility grades according to the critical probability density function, so as to realize gross error identification and elimination of the monitoring data.
The invention also discloses a server, comprising:
the solving module, used for acquiring the monitoring data and solving a maximum entropy probability density function of the sequence in which the monitoring data lies based on the maximum entropy principle;
the matrix module is used for dividing the credibility of the data points of the monitoring data according to a preset grade division standard to form a plurality of credibility grades, and establishing the following consistency judgment matrix based on a hierarchical analysis algorithm:
[consistency judgment matrix in terms of L1 and L2, shown as an image in the original]
wherein L1 and L2 respectively represent the weight comparisons between different credibility grades, with values ranging from 1 to 9;
the processing module, used for randomly generating, for each quantile point ci of the 1–9 scale of L1 and L2, k judgment matrices, calculating the consistency ratios of the k judgment matrices, taking the consistency ratio as the fitness function of the particle swarm algorithm, and solving for the optimal quantile point cbest;
the culling module, used for dividing, according to the optimal quantile point cbest, the critical probability of the credibility grades of the monitoring data within the range of n%, and determining the critical values of the credibility grades according to the critical probability density function, so as to realize gross error identification and elimination of the monitoring data.
The invention also discloses a computer readable storage medium on which a computer program is stored, the computer program, when executed by a processor, implementing the gross error identification and rejection method as described above.
After the technical scheme is adopted, compared with the prior art, the method has the following beneficial effects:
1. the method is particularly suitable for identifying and processing isolated gross errors and run-out abnormal values;
2. compared with the prior art, the working efficiency is greatly improved.
Drawings
FIG. 1 is a schematic flow chart illustrating a gross error identification and rejection method for monitored data according to a preferred embodiment of the present invention;
FIG. 2 is a schematic flow chart illustrating a solution process for a maximum entropy probability density function according to a preferred embodiment of the present invention;
FIG. 3 is a schematic diagram of a process for checking the consistency of a decision matrix according to a preferred embodiment of the present invention;
FIG. 4 is a graph comparing distribution curves, normal distribution curves and sample histograms for a maximum entropy probability density function in accordance with a preferred embodiment of the present invention;
fig. 5 is a diagram illustrating an iterative process of a fitness function in accordance with a preferred embodiment of the present invention.
Detailed Description
The advantages of the invention are further illustrated in the following description of specific embodiments in conjunction with the accompanying drawings.
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
The terminology used in the present disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used in this disclosure and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present disclosure. The word "if" as used herein may be interpreted as "when", "upon" or "in response to determining", depending on the context.
In the description of the present invention, it is to be understood that the terms "longitudinal", "lateral", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", and the like, indicate orientations or positional relationships based on those shown in the drawings, and are used merely for convenience of description and for simplicity of description, and do not indicate or imply that the referenced devices or elements must have a particular orientation, be constructed in a particular orientation, and be operated, and thus, are not to be construed as limiting the present invention.
In the description of the present invention, unless otherwise specified and limited, it is to be noted that the terms "mounted," "connected," and "connected" are to be interpreted broadly, and may be, for example, a mechanical connection or an electrical connection, a communication between two elements, a direct connection, or an indirect connection via an intermediate medium, and specific meanings of the terms may be understood by those skilled in the art according to specific situations.
In the following description, suffixes such as "module", "component", or "unit" used to denote elements are used only for facilitating the explanation of the present invention, and have no specific meaning in themselves. Thus, "module" and "component" may be used interchangeably.
Referring to fig. 1, a schematic flow chart of a gross error identification and rejection method for monitoring data according to a preferred embodiment of the present invention is shown, in which the gross error identification and rejection method includes the following steps:
S100: acquiring monitoring data, and solving a maximum entropy probability density function of the sequence in which the monitoring data lies based on the maximum entropy principle;
the maximum entropy principle is a criterion for selecting the random variable statistical property to best meet the objective condition, and is also called as a maximum information principle. The probability distribution of random quantities is difficult to determine, and generally, only various mean values (such as mathematical expectation, variance and the like) or values (such as peak values, number of values and the like) under certain known limiting conditions can be measured, and the distribution conforming to the measured values can be various and can be more than infinite, and usually, the entropy of one distribution is the maximum. The distribution with the maximum entropy is selected as the distribution of the random variable, which is an effective processing method and criterion. Although this method is subjective to some extent, it is considered to be the most suitable choice for the objective situation. By the maximum entropy principle, the maximum entropy probability density function of the sequence where the monitoring data is located can be solved.
In the prior art, gross error identification and elimination are carried out on a monitoring data sequence assumed to contain only random errors: the standard deviation of the random errors is calculated, an interval is determined according to a certain probability, errors exceeding this interval are considered not to be random errors but gross errors, and the data containing such errors are eliminated. This discrimination principle and method are limited to sample data with a normal or approximately normal distribution, and rest on the premise that the number of measurements is sufficiently large; when the number of measurements is small, removing gross errors with such a criterion is not reliable enough. Therefore, in this embodiment, the maximum entropy probability density function is adopted in place of the normal distribution function, which extends the applicable range of the method, and the analytic hierarchy process is adopted to determine the gross error probability interval more reasonably.
S200: according to a preset grade division standard, dividing the credibility of data points of the monitoring data to form a plurality of credibility grades, and establishing the following consistency judgment matrix based on a hierarchical analysis algorithm:
[consistency judgment matrix in terms of L1 and L2, shown as an image in the original]
wherein L1 and L2 respectively represent the weight comparisons between different credibility grades, with values ranging from 1 to 9;
the judgment matrix refers to that any system analysis is based on certain information, the information basis of the Analytic Hierarchy Process (AHP) is mainly the judgment given by people to the relative importance of each factor of each layer, and the judgment is expressed by numerical values and written into a matrix form result.
In this embodiment, the decision matrix is compared by the weight of the confidence level to L1、L2And (4) forming.
S300: for L1、L2Each quantile c between 1-9 scales ofiRandomly generating k judgment matrixes, calculating the consistency ratio of the k judgment matrixes, taking the consistency ratio as a fitness function of the particle swarm algorithm, and solving the optimal quantile pointcbest
It will be appreciated that the quantile ciI.e. the separation weight comparison L1、L2And the consistency ratio is used for measuring the deviation degree of the judgment matrix and the characteristic matrix. The fitness function used as the particle swarm algorithm is used for solving the optimal quantile point cbest. The iterative process of the fitness function is shown in fig. 5.
S400: according to the optimal quantile cbestAnd dividing the critical probability of the reliability grade of the monitoring data within the range of n%, and determining the critical value of the reliability grade according to a critical probability density function so as to realize gross error identification and elimination of the monitoring data.
In a preferred embodiment, n may be equal to 1 to improve recognition accuracy and rejection accuracy.
Preferably, step S100 includes:
s110: acquiring monitoring data;
S120: let x be a sequence of a set of monitoring data with probability density function f(x); since x is a continuous random variable, its entropy is defined as H(x) = −∫R f(x) ln f(x) dx, wherein H(x) is the entropy of the monitoring data sequence.
Constructing the following maximum entropy model based on the maximum entropy principle:
max H(x) = −∫R f(x) ln f(x) dx
s.t. ∫R f(x) dx = 1
∫R x^i f(x) dx = μi (i = 1, 2, …, n)
wherein R is the set over which the monitoring data sequence x takes values, and μi is the i-th origin moment of the sequence x of monitoring data, estimated from the N measured values as
μi = (1/N)·Σj=1…N xj^i
the distribution curve, normal distribution curve and sample histogram contrast of the maximum entropy probability density function are shown in fig. 4.
S130: the following lagrange function is constructed:
Figure BDA0003344889520000092
wherein λ is0、λiIs a lagrange multiplier;
s140: computing a partial derivative of the Lagrangian function
Figure BDA0003344889520000093
To obtain the following maximum entropy probability density function:
Figure BDA0003344889520000094
the formula is a sequence probability density function analytic formula of the monitoring data based on the maximum entropy principle.
Preferably, referring to fig. 2, step S140 includes:
S141: let:
Gi(λ) = ∫R x^i·exp(λ0 + λ1x + … + λnx^n) dx, i = 0, 1, …, n
S142: take an initial iterate λ = λ(0) and perform a first-order Taylor expansion of the above formula at λ(0):
Gi(λ) ≈ Gi(λ(0)) + Σj (∂Gi/∂λj)|λ=λ(0)·(λj − λj(0))
let Δ = λ − λ(0) and ζ = [μ0 − G0(λ(0)), μ1 − G1(λ(0)), …, μn − Gn(λ(0))]T, then:
Σj (∂Gi/∂λj)|λ=λ(0)·Δj = μi − Gi(λ(0)), i = 0, 1, …, n
S143: the above formula is abbreviated as G·Δ = ζ; Δ is solved for, λ(i+1) = λ(i) + Δ, and the iteration continues until convergence: |Δ| ≤ Δmin,
wherein Δmin is a preset iteration precision threshold.
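The Newton iteration of steps S141–S143 can be sketched numerically as follows; this is a minimal illustration assuming the moment constraints and the integration domain are already known, and the trapezoidal quadrature, grid size and parameter names are choices of the sketch rather than requirements of the method.

```python
import numpy as np

def solve_maxent_lambdas(mu, domain, n_grid=2001, tol=1e-6, max_iter=200):
    """Newton iteration (steps S141-S143) for the Lagrange multipliers of the maximum
    entropy density f(x) = exp(lambda_0 + sum_i lambda_i * x**i).

    mu     : array [1, mu_1, ..., mu_n] -- mu_0 = 1 is the normalisation constraint
    domain : (a, b), integration interval of the monitoring sequence
    """
    a, b = domain
    xs = np.linspace(a, b, n_grid)
    quad = np.full(n_grid, xs[1] - xs[0])
    quad[0] *= 0.5
    quad[-1] *= 0.5                                            # trapezoidal quadrature weights
    mu = np.asarray(mu, dtype=float)
    n = len(mu) - 1
    lam = np.zeros(n + 1)                                      # initial iterate lambda(0) = 0
    powers = np.vstack([xs ** i for i in range(2 * n + 1)])    # rows: x^0 .. x^(2n)

    for _ in range(max_iter):
        f = np.exp(powers[: n + 1].T @ lam)                    # density at the current iterate
        G = (powers * f) @ quad                                # G_k = integral of x^k f(x) dx, k = 0..2n
        zeta = mu - G[: n + 1]                                 # residuals of the moment constraints
        J = G[np.add.outer(np.arange(n + 1), np.arange(n + 1))]  # dG_i/dlambda_j = G_(i+j)
        delta = np.linalg.solve(J, zeta)                       # solve G . Delta = zeta
        lam += delta
        if np.max(np.abs(delta)) <= tol:                       # stop when |Delta| <= Delta_min
            break
    return lam

# lam = solve_maxent_lambdas([1.0, *mu], domain=(x.min(), x.max()))
# then f(x) = exp(lam[0] + lam[1]*x + lam[2]*x**2 + ...)
```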
Preferably, step S200 includes:
s210: according to a preset grade division standard, dividing the credibility of the data points of the monitoring data into normal points, suspicious points and gross error points;
Based on the idea of the analytic hierarchy process, the monitoring data are divided into three categories, namely normal points, suspicious points and gross error points, expressed as: V = [V1, V2, V3] = [gross error, suspicious, normal].
S220: considering that the differences between the comments are unknown, a consistency judgment matrix is constructed from linearly composed qualitative terms, L1 and L2, each with a value range of 1–9, are used to respectively represent the weight comparisons between different credibility grades, and the 1–9 scale is divided into two segments;
S230: denote by ci a quantile point on the 1–9 scale, so that the qualitative terms map to: L1: [1, c), L2: [c, 9). It can be understood that the gross error grade is judged more strictly than the suspicious grade and the normal grade, and the suspicious grade more strictly than the normal grade; thus, in each of the two grade comparisons L1 and L2, the former grade is stricter than the latter.
Preferably, step S220 includes:
S221: according to the mapping of the qualitative terms, the judgment matrix for the probability division of the credibility grades of the monitoring data is obtained as:
[judgment matrix R in terms of L1 and L2, shown as an image in the original]
S222: calculating the maximum eigenvalue of the judgment matrix R and the corresponding eigenvector: R·w* = λmax·w*, wherein λmax is the maximum eigenvalue of the judgment matrix and w* is the eigenvector corresponding to the maximum eigenvalue.
It is understood that if λmax = n, where n is the order of the judgment matrix, the judgment matrix has complete consistency; otherwise, the consistency index needs to be calculated to check the consistency of the judgment matrix R.
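The eigenvalue computation of step S222 can be illustrated as follows; the 3×3 matrix entries are placeholders rather than the values of the patent.

```python
import numpy as np

def principal_eigen(R: np.ndarray):
    """Maximum eigenvalue lambda_max of a judgment matrix R and the corresponding eigenvector w*."""
    vals, vecs = np.linalg.eig(R)
    k = int(np.argmax(vals.real))
    return vals[k].real, np.abs(vecs[:, k].real)

# Illustrative 3x3 judgment matrix (entries are placeholders, not the patent's values)
R = np.array([[1.0,   3.0,   7.0],
              [1/3.0, 1.0,   5.0],
              [1/7.0, 1/5.0, 1.0]])
lam_max, w = principal_eigen(R)
print(lam_max, w)   # lam_max slightly above 3; exactly 3 would mean complete consistency
```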
Preferably, referring to fig. 3, step S300 includes:
S310: for each quantile point ci of the 1–9 scale of L1 and L2, randomly generating k judgment matrices;
S320: calculating the consistency ratio of the judgment matrices based on the consistency index of the judgment matrix as follows:
CI = (λmax − n)/(n − 1)
wherein λmax is the maximum eigenvalue of the judgment matrix and n is the dimension of the judgment matrix;
In general, CI ≥ 0, and a larger CI value indicates poorer consistency of the judgment matrix R. The ratio of the consistency index CI of the judgment matrix R to the average random consistency index RI is recorded as CR, and whether the consistency of the judgment matrix R meets the requirement is judged from the value of CR:
CR = CI/RI
The value of the random consistency index RI in the formula is related to the order of the judgment matrix, as shown in the table below. If CR ≥ 0.1, the judgment matrix R needs to be adjusted until CR meets the requirement.
[table shown as an image in the original]
TABLE 1 values of the random consistency index RI
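The consistency check can be sketched as follows; the RI values below are the standard AHP values for orders 1–9 and are an assumption of this sketch, since Table 1 is reproduced only as an image in the original.

```python
# Average random consistency index RI (standard AHP values, assumed here)
RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def consistency_ratio(lam_max: float, n: int) -> float:
    """CI = (lam_max - n) / (n - 1); CR = CI / RI(n).  CR < 0.1 is considered acceptable."""
    ci = (lam_max - n) / (n - 1)
    return ci / RI[n]

print(consistency_ratio(3.065, 3))   # roughly 0.056 < 0.1, so consistency is acceptable
```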
S330: and taking the consistency ratio of the judgment matrix as a fitness function of the particle swarm algorithm, and updating the particle speed and the particle position of the particle swarm algorithm according to the following formula:
Vi(t+1)=ξvi(t)+c1r1[pi(t)-xi(t)]+c2r2[pg(t)-xi(t)]
wherein, t is the iteration number,c1、c2to be an acceleration factor, r1、r2The number is a random number within the range of 0-1, and xi is an inertia weight;
in order to make the consistency of the judgment matrix R reach the optimal state, the quantile point c is adjustediAnd carrying out consistency check on the judgment matrix R based on a Particle Swarm Optimization (PSO), and seeking the position of the optimal quantile point by taking the mean value of the consistency ratio CR of the judgment matrix R in a random state as a fitness function.
The particle swarm optimization is an intelligent algorithm for solving global search. In the algorithm, any possible solution of the optimization problem is called as a particle, the particle updates the position of the particle by continuously searching to find the optimal solution of the particle and the optimal solution in the population, and the iterative search is carried out until the global optimal solution is found. The superiority of each particle is measured by a fitness function, where the fitness function is the mean of the consistency ratios CR of the decision matrix R in a random state. Let Xi=(xi) Denotes the position of the particle at the i-th component site, Vi=(vi) Is the velocity of the particle at the ith component site, Pi=(pi) For the best position that the ith component site particle itself has experienced, Pg=(pg) The best position searched for by the particle in the current state.
S340: solving for optimal quantile cbest
Preferably, step S400 includes:
S410: according to the optimal quantile point cbest, dividing the critical probability of the credibility grades of the monitoring data within the range of 1%;
S420: after the optimal quantile point cbest for grading the credibility of the monitoring data is obtained, obtaining the judgment matrix Rbest for which the consistency ratio CR is minimum:
[judgment matrix Rbest, shown as an image in the original]
S430: calculating the eigenvector w = [w1, w2, w3] corresponding to the maximum eigenvalue of the judgment matrix Rbest;
S440: normalizing the eigenvector w, and dividing the comment set of the credibility grades of the monitoring data within the [0, 1%] interval:
[division of the comment set of the credibility grades over the [0, 1%] interval, shown as an image in the original]
S450: determining the critical values of the credibility grades of the monitoring data, so as to realize gross error identification and elimination of the monitoring data.
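Step S400 can be illustrated by the following sketch, which converts critical probabilities into measurement-value thresholds by integrating the maximum entropy density; the two-sided split of each tail probability and the cumulative nesting of the three grades used in the commented call are assumptions of this sketch.

```python
import numpy as np

def credibility_thresholds(pdf, domain, tail_probs, n_grid=20001):
    """Measurement-value bounds at which the two-sided tail probability of the maximum
    entropy density `pdf` equals each critical probability in `tail_probs`."""
    a, b = domain
    xs = np.linspace(a, b, n_grid)
    cdf = np.cumsum(pdf(xs))
    cdf /= cdf[-1]                                   # numerical normalisation of the CDF
    bounds = {}
    for p in tail_probs:
        lo = xs[np.searchsorted(cdf, p / 2)]         # lower-tail critical value
        hi = xs[np.searchsorted(cdf, 1 - p / 2)]     # upper-tail critical value
        bounds[p] = (float(lo), float(hi))
    return bounds

# bounds = credibility_thresholds(f_maxent, domain=(x.min(), x.max()),
#                                 tail_probs=[0.00104, 0.00360, 0.01000])
# f_maxent and x are hypothetical (the fitted density and the monitoring sequence);
# the cumulative probabilities 0.104%, 0.360%, 1% are an assumed nesting of the grades.
```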
The gross error identification and culling method of a preferred embodiment is described in detail below with reference to an embodiment.
Given a set of sequences x of monitored data, a maximum entropy probability density function of the sequence of monitored data is calculated.
First, the origin moments μi of each order, the standard deviation σ and the integration domain are calculated, as shown in the table below.
[table shown as an image in the original]
TABLE 2 origin moments of each order, standard deviation and integration domain
Using the Newton iteration method, with the initial value of the Lagrange multipliers λ(0) = 0 and the convergence threshold Δmin = 10^-6, the analytic expression of the maximum entropy probability density function is obtained as:
f(x) = exp(−513.21 + 1166.33x − 1098.29x^2 − 102.38x^3 + 100.95x^4)
The function curve and the sample probability histogram are plotted in fig. 4; it can be seen that the fitting accuracy of the probability density function of the monitoring data sequence solved by the maximum entropy principle is clearly higher than that of the normal distribution fit.
Within the [0, 1%] interval, based on the particle swarm optimization, for each particle ci (1 < ci < 9), k = 1000 judgment matrices are randomly generated, the mean consistency ratio of the 1000 judgment matrices is taken as the fitness function, and the maximum number of iterations is 500; the iteration process is shown in fig. 5.
Through iteration, the optimal quantile point is reached at ci = 4.27, where the consistency ratio of the judgment matrix reaches its minimum; the judgment matrix at this point is:
[judgment matrix at ci = 4.27, shown as an image in the original]
the eigenvector corresponding to the maximum eigenvalue is:
w=[0.9178,0.3684,0.1479]
after normalization, a weight vector is obtained:
w*=[0.6400,0.2569,0.1031]
the reliability of the monitoring data sequence is divided into three levels:
V=[V1,V2,V3]as [ coarse difference, suspicious, normal ]]=[0.104%,0.256%,0.640%]
According to the maximum entropy probability density function of the sequence of the monitoring data and the critical probability of the credibility grade, obtaining a normal point measurement value interval as follows: [ -31.589,32.466], suspect spot detection interval is: [ -33.125,31.589 ] U (32.466,37.266], gross nadir interval: (∞, -33.125) and (37.266, + ∞).
It should be noted that monitoring data falling in the gross error point measurement interval can be directly regarded as gross errors and removed, whereas data points in the suspicious point measurement interval should be examined and judged again.
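A minimal usage sketch of this final elimination step, using the interval bounds obtained in this example (the measurement values below are illustrative):

```python
import numpy as np

def classify(x, normal=(-31.589, 32.466), suspect=(-33.125, 37.266)):
    """Label each measurement with the intervals of this example: 'normal' inside the
    normal interval, 'gross' outside the suspicious interval, 'suspect' in between."""
    labels = np.full(len(x), "suspect", dtype=object)
    labels[(x >= normal[0]) & (x <= normal[1])] = "normal"
    labels[(x < suspect[0]) | (x > suspect[1])] = "gross"
    return labels

x = np.array([5.0, -32.0, 40.0, 31.0, -35.0])      # illustrative measurements
labels = classify(x)
cleaned = x[labels != "gross"]                      # gross error points are eliminated directly
print(labels)    # ['normal' 'suspect' 'gross' 'normal' 'gross']
print(cleaned)   # [  5. -32.  31.]
```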
The invention also discloses a gross error identification and elimination system for monitoring data, which comprises: a solving module, used for acquiring the monitoring data and solving a maximum entropy probability density function of the sequence in which the monitoring data lies based on the maximum entropy principle; and a matrix module, used for dividing the credibility of the data points of the monitoring data according to a preset grade division standard to form a plurality of credibility grades, and establishing the following consistency judgment matrix based on the hierarchical analysis algorithm:
[consistency judgment matrix in terms of L1 and L2, shown as an image in the original]
wherein L1 and L2 respectively represent the weight comparisons between different credibility grades, with values ranging from 1 to 9; a processing module, used for randomly generating, for each quantile point ci of the 1–9 scale of L1 and L2, k judgment matrices, calculating the consistency ratios of the k judgment matrices, taking the consistency ratio as the fitness function of the particle swarm algorithm, and solving for the optimal quantile point cbest; and a culling module, used for dividing, according to the optimal quantile point cbest, the critical probability of the credibility grades of the monitoring data within the range of n%, and determining the critical values of the credibility grades according to the critical probability density function, so as to realize gross error identification and elimination of the monitoring data.
The invention also discloses a server, comprising: a solving module, used for acquiring the monitoring data and solving a maximum entropy probability density function of the sequence in which the monitoring data lies based on the maximum entropy principle; and a matrix module, used for dividing the credibility of the data points of the monitoring data according to a preset grade division standard to form a plurality of credibility grades, and establishing the following consistency judgment matrix based on the hierarchical analysis algorithm:
[consistency judgment matrix in terms of L1 and L2, shown as an image in the original]
wherein L1 and L2 respectively represent the weight comparisons between different credibility grades, with values ranging from 1 to 9; a processing module, used for randomly generating, for each quantile point ci of the 1–9 scale of L1 and L2, k judgment matrices, calculating the consistency ratios of the k judgment matrices, taking the consistency ratio as the fitness function of the particle swarm algorithm, and solving for the optimal quantile point cbest; and a culling module, used for dividing, according to the optimal quantile point cbest, the critical probability of the credibility grades of the monitoring data within the range of n%, and determining the critical values of the credibility grades according to the critical probability density function, so as to realize gross error identification and elimination of the monitoring data.
The invention also discloses a computer readable storage medium on which a computer program is stored, the computer program, when executed by a processor, implementing the gross error identification and rejection method as described above.
It should be noted that the embodiments of the present invention have been described in terms of preferred embodiments, and not by way of limitation, and that those skilled in the art can make modifications and variations of the embodiments described above without departing from the spirit of the invention.

Claims (10)

1. A gross error identification and elimination method for monitoring data is characterized by comprising the following steps:
S100: acquiring monitoring data, and solving a maximum entropy probability density function of the sequence in which the monitoring data lies based on the maximum entropy principle;
s200: according to a preset grade division standard, dividing the credibility of data points of the monitoring data to form a plurality of credibility grades, and establishing the following consistency judgment matrix based on a hierarchical analysis algorithm:
[consistency judgment matrix in terms of L1 and L2, shown as an image in the original]
wherein L1 and L2 respectively represent the weight comparisons between different credibility grades, with values ranging from 1 to 9;
S300: for each quantile point ci of the 1–9 scale of L1 and L2, randomly generating k judgment matrices, calculating the consistency ratios of the k judgment matrices, taking the consistency ratio as the fitness function of a particle swarm algorithm, and solving for the optimal quantile point cbest;
S400: according to the optimal quantile point cbest, dividing the critical probability of the credibility grades of the monitoring data within the range of n%, and determining the critical values of the credibility grades according to the critical probability density function, so as to realize gross error identification and elimination of the monitoring data.
2. The gross error identification and rejection method according to claim 1, wherein step S100 comprises:
s110: acquiring monitoring data;
s120: constructing the following maximum entropy model based on the maximum entropy principle:
max H(x) = −∫R f(x) ln f(x) dx
s.t. ∫R f(x) dx = 1
∫R x^i f(x) dx = μi (i = 1, 2, …, n)
wherein R is the set over which the monitoring data sequence x takes values, and μi is the i-th origin moment of the sequence x of monitoring data, estimated from the N measured values as
μi = (1/N)·Σj=1…N xj^i
S130: the following Lagrange function is constructed:
L = H(x) + (λ0 + 1)(∫R f(x)dx − 1) + Σi=1…n λi(∫R x^i f(x)dx − μi)
wherein λ0 and λi are Lagrange multipliers;
S140: computing the partial derivative of the Lagrange function and setting it to zero,
∂L/∂f(x) = −ln f(x) − 1 + (λ0 + 1) + λ1x + λ2x^2 + … + λnx^n = 0
to obtain the following maximum entropy probability density function:
f(x) = exp(λ0 + λ1x + λ2x^2 + … + λnx^n)
exp(−λ0) = ∫R exp(λ1x + λ2x^2 + … + λnx^n) dx
3. the gross error identification and rejection method according to claim 1, wherein step S140 comprises:
S141: let:
Gi(λ) = ∫R x^i·exp(λ0 + λ1x + … + λnx^n) dx, i = 0, 1, …, n
S142: take an initial iterate λ = λ(0) and perform a first-order Taylor expansion of the above formula at λ(0):
Gi(λ) ≈ Gi(λ(0)) + Σj (∂Gi/∂λj)|λ=λ(0)·(λj − λj(0))
let Δ = λ − λ(0) and ζ = [μ0 − G0(λ(0)), μ1 − G1(λ(0)), …, μn − Gn(λ(0))]T, then:
Σj (∂Gi/∂λj)|λ=λ(0)·Δj = μi − Gi(λ(0)), i = 0, 1, …, n
S143: the above formula is abbreviated as G·Δ = ζ; Δ is solved for, λ(i+1) = λ(i) + Δ, and the iteration continues until convergence: |Δ| ≤ Δmin, wherein Δmin is a preset iteration precision threshold.
4. The gross error identification and rejection method according to claim 1, wherein step S200 comprises:
s210: according to a preset grade division standard, dividing the credibility of the data points of the monitoring data into normal points, suspicious points and gross error points;
S220: constructing a consistency judgment matrix among the comments using linearly composed qualitative terms, using L1 and L2, each with a value range of 1–9, to respectively represent the weight comparisons between different credibility grades, and dividing the 1–9 scale into two segments;
S230: denoting by ci a quantile point on the 1–9 scale, so that the qualitative terms map to: L1: [1, c), L2: [c, 9).
5. The gross error identification and rejection method according to claim 4, wherein step S220 comprises:
S221: according to the mapping of the qualitative terms, the judgment matrix for the probability division of the credibility grades of the monitoring data is obtained as:
[judgment matrix R in terms of L1 and L2, shown as an image in the original]
S222: calculating the maximum eigenvalue of the judgment matrix R and the corresponding eigenvector: R·w* = λmax·w*, wherein λmax is the maximum eigenvalue of the judgment matrix and w* is the eigenvector corresponding to the maximum eigenvalue.
6. The gross error identification and rejection method according to claim 1, wherein step S300 comprises:
S310: for each quantile point ci of the 1–9 scale of L1 and L2, randomly generating k judgment matrices;
S320: calculating the consistency ratio of the judgment matrices based on the consistency index of the judgment matrix as follows:
CR = CI/RI, with CI = (λmax − n)/(n − 1) and RI the average random consistency index,
wherein λmax is the maximum eigenvalue of the judgment matrix and n is the dimension of the judgment matrix;
S330: taking the consistency ratio of the judgment matrices as the fitness function of the particle swarm algorithm, and updating the particle velocity and the particle position of the particle swarm algorithm according to the following formula:
vi(t+1) = ξ·vi(t) + c1·r1·[pi(t) − xi(t)] + c2·r2·[pg(t) − xi(t)]
wherein t is the iteration number, c1 and c2 are acceleration factors, r1 and r2 are random numbers in the range 0–1, and ξ is the inertia weight; S340: solving for the optimal quantile point cbest.
7. The gross error identification and rejection method according to claim 1, wherein step S400 comprises:
S410: according to the optimal quantile point cbest, dividing the critical probability of the credibility grades of the monitoring data within the range of 1%;
S420: after the optimal quantile point cbest for grading the credibility of the monitoring data is obtained, obtaining the judgment matrix Rbest for which the consistency ratio CR is minimum:
[judgment matrix Rbest, shown as an image in the original]
S430: calculating the eigenvector w = [w1, w2, w3] corresponding to the maximum eigenvalue of the judgment matrix Rbest;
S440: normalizing the eigenvector w, and dividing the comment set of the credibility grades of the monitoring data within the [0, 1%] interval:
[division of the comment set of the credibility grades over the [0, 1%] interval, shown as an image in the original]
s450: and determining a critical value of the credibility grade of the monitoring data so as to realize gross error identification and elimination of the monitoring data.
8. A gross error identification and rejection system for monitoring data, comprising:
the solving module, used for acquiring the monitoring data and solving a maximum entropy probability density function of the sequence in which the monitoring data lies based on the maximum entropy principle;
the matrix module is used for dividing the credibility of the data points of the monitoring data according to a preset grade division standard to form a plurality of credibility grades, and establishing the following consistency judgment matrix based on a hierarchical analysis algorithm:
[consistency judgment matrix in terms of L1 and L2, shown as an image in the original]
wherein L1 and L2 respectively represent the weight comparisons between different credibility grades, with values ranging from 1 to 9;
a processing module, used for randomly generating, for each quantile point ci of the 1–9 scale of L1 and L2, k judgment matrices, calculating the consistency ratios of the k judgment matrices, taking the consistency ratio as the fitness function of the particle swarm algorithm, and solving for the optimal quantile point cbest;
a culling module, used for dividing, according to the optimal quantile point cbest, the critical probability of the credibility grades of the monitoring data within the range of n%, and determining the critical values of the credibility grades according to the critical probability density function, so as to realize gross error identification and elimination of the monitoring data.
9. A server, comprising:
the solving module, used for acquiring the monitoring data and solving a maximum entropy probability density function of the sequence in which the monitoring data lies based on the maximum entropy principle;
the matrix module is used for dividing the credibility of the data points of the monitoring data according to a preset grade division standard to form a plurality of credibility grades, and establishing the following consistency judgment matrix based on a hierarchical analysis algorithm:
[consistency judgment matrix in terms of L1 and L2, shown as an image in the original]
wherein L1 and L2 respectively represent the weight comparisons between different credibility grades, with values ranging from 1 to 9;
a processing module, used for randomly generating, for each quantile point ci of the 1–9 scale of L1 and L2, k judgment matrices, calculating the consistency ratios of the k judgment matrices, taking the consistency ratio as the fitness function of the particle swarm algorithm, and solving for the optimal quantile point cbest; and a culling module, used for dividing, according to the optimal quantile point cbest, the critical probability of the credibility grades of the monitoring data within the range of n%, and determining the critical values of the credibility grades according to the critical probability density function, so as to realize gross error identification and elimination of the monitoring data.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the gross error identification and culling method according to any one of claims 1-7.
CN202111323819.8A 2021-11-09 2021-11-09 Gross error identification and elimination method, system, server and computer readable storage medium for monitoring data Pending CN114048808A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111323819.8A CN114048808A (en) 2021-11-09 2021-11-09 Gross error identification and elimination method, system, server and computer readable storage medium for monitoring data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111323819.8A CN114048808A (en) 2021-11-09 2021-11-09 Gross error identification and elimination method, system, server and computer readable storage medium for monitoring data

Publications (1)

Publication Number Publication Date
CN114048808A true CN114048808A (en) 2022-02-15

Family

ID=80207895

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111323819.8A Pending CN114048808A (en) 2021-11-09 2021-11-09 Gross error identification and elimination method, system, server and computer readable storage medium for monitoring data

Country Status (1)

Country Link
CN (1) CN114048808A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114609372A (en) * 2022-02-18 2022-06-10 江苏徐工工程机械研究院有限公司 Engineering machinery oil monitoring system and method based on maximum entropy
CN114609372B (en) * 2022-02-18 2023-10-03 江苏徐工工程机械研究院有限公司 Engineering machinery oil monitoring system and method based on maximum entropy


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination