WO2022141746A1

WO2022141746A1 - Method for detecting anomaly in water quality and electronic device

Info

Publication number: WO2022141746A1
Application number: PCT/CN2021/075420
Authority: WO
Inventors: 许红龙; 郭沛清
Original assignee: 佛山科学技术学院
Priority date: 2020-12-30
Filing date: 2021-02-05
Publication date: 2022-07-07
Also published as: CN112733904A; CN112733904B

Abstract

A method for detecting an anomaly in water quality and an electronic device. The method comprises: calculating distance values between all water quality arrays in a water quality data set and a reference point, and forming all the distance values into a one-dimensional data set; calculating the k nearest neighbors of each object of the one-dimensional data set; determining a pre-threshold; and dividing the ordered water quality data set into a plurality of data blocks, taking the pre-threshold as an outlier threshold, sequentially performing outlier detection on each data block, updating the outlier threshold to the N-th largest outlier degree of the detected data blocks, and taking the updated outlier threshold as the criterion for performing outlier detection on the next data block. Without requiring any abnormal water quality points to be known in advance and without calculating global outlier degrees, the method accelerates outlier detection while guaranteeing that the water quality anomaly detection result is consistent with that of the traditional distance-based outlier detection algorithm.

Description

Water quality anomaly detection method and electronic device

Technical field

The invention relates to the technical field of water quality detection, and in particular to a water quality anomaly detection method and an electronic device.

Background

Water is essential for aquatic life, human life and industrial production. The cleanliness of a water body and the content of its various chemical components are an important basis for deciding how a water source is used and for environmental protection work. In water body environmental protection in particular, limited resources mean that sewage treatment must be targeted at key areas rather than spread thinly over a wide range. Water quality monitoring and analysis involves indicators such as chemical oxygen demand (COD), ammonia nitrogen, total phosphorus and dissolved oxygen, as well as the content of various heavy metals, and the indicators covered by different water quality monitoring and analysis instruments are not exactly the same. After the values of these indicators are measured by instruments, they are analyzed and ranked, and the key water areas to be treated are determined from the water quality samples at the top of the ranking in combination with regional conditions.

Among existing water quality anomaly detection methods, a widely used approach is to set an abnormality threshold for each water quality indicator covered by the monitoring and analysis instrument; an indicator that exceeds its threshold is considered abnormal. This approach is hereinafter referred to as the indicator threshold method. The invention patent with application number 201910560024.5, "A Method and System for Abnormal Water Quality Detection Based on Prior Knowledge", applies a distance-based outlier detection algorithm to detect water quality anomalies in data from different monitoring and analysis instruments. To speed up outlier detection, it uses water quality anomalies known a priori to raise the outlier threshold; this approach is hereinafter referred to as the a priori threshold method.

The disadvantage of the indicator threshold method is that it requires domain expert knowledge to set an abnormality threshold for each indicator, and because multiple water quality indicators are used at the same time, it is difficult to judge which water quality samples are abnormal and how they rank by degree of abnormality. The a priori threshold method is a distance-based outlier detection algorithm. It can automatically output the N most abnormal water quality samples without domain expert knowledge, but its acceleration effect depends on abnormal water quality samples known in advance (prior samples). If too few abnormal samples are known in advance, or none at all, the acceleration is greatly reduced or lost entirely. Furthermore, even if the number of prior samples is sufficient, obtaining the prior threshold still requires calculating their outlier degrees over the global data set.

SUMMARY OF THE INVENTION

The present invention provides a method and an electronic device for detecting abnormal water quality, so as to solve one or more technical problems existing in the prior art, or at least to provide a beneficial alternative or create favorable conditions.

In a first aspect, an embodiment of the present invention provides a method for detecting abnormal water quality, including:

S101. Acquire multiple water quality arrays to form a water quality data set, where each water quality array has the same dimension and includes at least one item of water quality data;

S102. Randomly select one water quality array in the water quality data set as a reference point;

S103. Calculate the distance values between all water quality arrays in the water quality data set and the reference point, and form all the distance values into a one-dimensional data set;

S104. Sort all distance values of the one-dimensional data set in descending order to obtain an ordered one-dimensional data set, and sort all the water quality arrays in the water quality data set in the same order to obtain an ordered water quality data set;

S105. Determine the k nearest neighbors of each object in the ordered one-dimensional data set, where 1≤k≤D*1% and D is the number of water quality arrays in the water quality data set;

S106. Calculate the distance value between each object of the ordered one-dimensional data set and its k-th nearest neighbor to obtain the outlier degree of each object, where the k-th nearest neighbor is the k-th of the object's k nearest neighbors; the outlier degrees of all objects in the one-dimensional data set constitute the one-dimensional outlier degrees; select the N largest one-dimensional outlier degrees in descending order, and use the N-th largest as the pre-threshold;

S107. Divide the ordered water quality data set into multiple data blocks, use the pre-threshold as the outlier threshold, and perform outlier detection on each data block in turn: determine the N largest outlier degrees of the detected data blocks according to the outlier threshold, update the outlier threshold to the N-th largest outlier degree of the detected data blocks, and use the updated outlier threshold as the criterion for outlier detection in the next data block, until all data blocks have been detected. After detection is completed, the water quality arrays corresponding to the N largest outlier degrees over all data blocks are taken as the N abnormal water quality arrays.
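Steps S102 to S106 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `pre_threshold`, the `seed` parameter, and the use of `math.dist` (Euclidean distance) are choices made here for the example, and the data set is assumed to contain more than k arrays.

```python
import math
import random

def pre_threshold(data, k, n_top, seed=0):
    """Sketch of S102-S106: returns (order, sorted_dists, pre-threshold)."""
    rng = random.Random(seed)                      # hypothetical: seeded for repeatability
    ref = data[rng.randrange(len(data))]           # S102: random reference point
    dists = [math.dist(x, ref) for x in data]      # S103: one-dimensional data set
    # S104: descending order of distance to the reference point
    order = sorted(range(len(data)), key=lambda i: dists[i], reverse=True)
    sorted_d = [dists[i] for i in order]
    degrees = []
    for pos, d in enumerate(sorted_d):             # S105/S106
        lo, hi = max(0, pos - k), min(len(sorted_d), pos + k + 1)
        # only the k objects before and after pos can be among the k nearest
        gaps = sorted(abs(d - sorted_d[q]) for q in range(lo, hi) if q != pos)
        degrees.append(gaps[k - 1])                # one-dimensional outlier degree
    threshold = sorted(degrees, reverse=True)[n_top - 1]  # N-th largest
    return order, sorted_d, threshold
```

The returned `order` maps positions in the ordered data set back to indices in the original data set, so the ordered water quality data set can be rebuilt as `[data[i] for i in order]`.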

Further, the k nearest neighbors of each object in the ordered one-dimensional data set determined in step S105 include:

Let any object in the ordered one-dimensional data set be denoted O, with k1 objects before object O and k2 objects after it, where k1≥0 and k2≥0;

When k1≥k, search k objects forward; when k1<k, search k1 objects forward;

When k2≥k, search k objects backward; when k2<k, search k2 objects backward;

Calculate the distance between object O and each searched object, sort the searched objects by distance in ascending order, and take the first k objects as the k nearest neighbors of object O.
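The windowed search described above can be sketched as follows. The function name `k_nearest_in_sorted` is hypothetical, and the sketch assumes `values` is already sorted (ascending or descending) and holds more than k entries.

```python
def k_nearest_in_sorted(values, pos, k):
    """Indices of the k nearest neighbours of values[pos] in a sorted list."""
    lo = max(0, pos - k)                    # search min(k, k1) objects forward
    hi = min(len(values), pos + k + 1)      # search min(k, k2) objects backward
    candidates = [q for q in range(lo, hi) if q != pos]
    # sort the at most 2k candidates by one-dimensional distance to values[pos]
    candidates.sort(key=lambda q: abs(values[q] - values[pos]))
    return candidates[:k]
```

Because the list is sorted, no object outside this window of at most 2k entries can be closer than the k-th candidate inside it, which is why the full data set never has to be scanned in this step.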

Further, step S107 is specifically:

S201. Divide the ordered water quality data set into B data blocks, each data block includes M water quality arrays, outlier threshold=pre-threshold;

S202, set t=1, and t represents the t-th data block;

S203, determine whether t is 1, if yes, go to step S205, if not, go to step S204;

S204, determine whether d0 + (the outlier degree of the reference point) < the outlier degree threshold, where d0 is the distance between the first water quality array in the t-th data block and the reference point; if so, execute step S215; if not, execute step S205;

S205. Starting from the median object of the t-th data block of the ordered water quality data set, arrange the water quality data set in spiral order; xj denotes the j-th water quality array of the data set in spiral order; set j=1;

S206, set m=1, m represents the position number of the water quality array in the initial t-th data block, and Xm represents the water quality array numbered m;

S207, determine whether Xm has been removed, if yes, then go to step S211, if not, go to step S208;

S208, calculate the distance between Xm and xj;

S209, determine whether j<k; if so, execute step S211; if not, update the temporary k nearest neighbors of Xm, update the temporary outlier degree of Xm to the distance between Xm and its temporary k-th nearest neighbor, and execute step S210;

S210, determine whether the temporary outlier degree of Xm is lower than the outlier degree threshold; if the determination result is yes, remove Xm from the t-th data block, and execute step S211; if the determination result is no, execute step S211;

S211, determine whether m is less than M, if yes, m=m+1, go to step S207, if not, go to step S212;

S212, determine whether j is less than D, if yes, then j=j+1, go to step S206; if not, go to step S213;

S213. When t=1, determine the N largest outlier degrees in the t-th data block, take the N-th largest as the outlier degree threshold, and execute step S214; when t>1, determine the N largest outlier degrees in the t-th data block, select the N largest outlier degrees in the 1st to t-th data blocks from the N largest outlier degrees in the 1st to (t-1)-th data blocks together with the N largest in the t-th data block, set the outlier degree threshold to the N-th largest outlier degree in the 1st to t-th data blocks, and execute step S214;

S214, determine whether t is less than B, if yes, t=t+1, go to step S204, if not, go to step S215;

S215. Take the N water quality arrays corresponding to the N largest outlier degrees of all currently detected data blocks as the N abnormal water quality arrays.
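The block-wise loop of steps S201 to S215 can be sketched compactly as below. This is an illustrative simplification, not the exact flow of the steps: the names `detect_blocks` and `spiral` are hypothetical, an object is not counted as its own neighbor (an assumption; the steps above do not say so explicitly), the threshold is only ever raised, and the data set is assumed to contain more than k arrays. The inputs `ordered` (arrays in descending order of distance to the reference point), `ref_dists` (those distances) and `ref_degree` (the outlier degree of the reference point) are assumed to be precomputed.

```python
import heapq
import math

def spiral(start, size):
    """Yield 0..size-1 alternating around start: start, start-1, start+1, ..."""
    yield start
    for step in range(1, size):
        if start - step >= 0:
            yield start - step
        if start + step < size:
            yield start + step

def detect_blocks(ordered, ref_dists, ref_degree, k, n_top, block_size, threshold):
    top = []                                   # (outlier degree, position) of best N so far
    total = len(ordered)
    for start in range(0, total, block_size):
        # S204: termination rule, skipped for the first block
        if start > 0 and ref_dists[start] + ref_degree < threshold:
            break
        stop = min(start + block_size, total)
        alive = set(range(start, stop))        # arrays not yet removed (S207)
        heaps = {m: [] for m in alive}         # negated temporary k-NN distances
        for j in spiral((start + stop) // 2, total):   # S205: spiral from median
            for m in list(alive):
                if m == j:
                    continue                   # assumption: skip self-distance
                d = math.dist(ordered[m], ordered[j])  # S208
                h = heaps[m]
                if len(h) < k:
                    heapq.heappush(h, -d)
                elif d < -h[0]:
                    heapq.heapreplace(h, -d)   # S209: update temporary k-NN
                if len(h) == k and -h[0] < threshold:
                    alive.discard(m)           # S210: remove non-outlier
            if not alive:
                break
        # S213: merge survivors with the previous top N and raise the threshold
        top = heapq.nlargest(n_top, top + [(-heaps[m][0], m) for m in alive])
        if len(top) == n_top:
            threshold = max(threshold, top[-1][0])
    return [pos for _, pos in top]             # S215: positions in `ordered`
```

The returned positions index into `ordered`. Passing `threshold=0.0` disables pruning in the first block; passing the pre-threshold from step S106 lets pruning start immediately.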

Further, the distance between each water quality array and the reference point is calculated using the method for calculating the distance between two water quality arrays, which includes:

Assuming that the two water quality arrays are x1 and x2, each represented by an n-dimensional variable, $x1=(x_{11}, x_{12}, \ldots, x_{1n})$ and $x2=(x_{21}, x_{22}, \ldots, x_{2n})$, the distance between x1 and x2 is the Euclidean distance:

$$\mathrm{dist}(x1, x2) = \sqrt{\sum_{i=1}^{n} \left(x_{1i} - x_{2i}\right)^{2}}$$

where $x_{11}, x_{12}, \ldots, x_{1n}$ represent the normalized data of the different physical quantities of the water quality array x1, $x_{21}, x_{22}, \ldots, x_{2n}$ represent the normalized data of the different physical quantities of the water quality array x2, and dist(x1, x2) represents the distance between the water quality arrays x1 and x2.

Further, n is greater than or equal to 1, and each water quality array includes at least one of chemical oxygen demand data, ammonia nitrogen data, total phosphorus data, and dissolved oxygen data.
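As an illustration of the distance calculation on normalized data, the following sketch uses the Euclidean distance. The text does not specify the normalization scheme; min-max scaling per physical quantity is assumed here purely for the example, and the function names `normalize` and `dist` are hypothetical.

```python
import math

def normalize(rows):
    """Min-max scale each column to [0, 1] (assumed normalization scheme)."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    # a constant column gets span 1.0 to avoid division by zero
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [tuple((v - lo) / s for v, lo, s in zip(r, lows, spans)) for r in rows]

def dist(x1, x2):
    """Euclidean distance between two n-dimensional water quality arrays."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))
```

Normalizing first keeps an indicator with a large numeric range (for example COD in mg/L) from dominating indicators with small ranges when the distances are compared.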

In a second aspect, an embodiment of the present invention also provides an electronic device, including:

processor;

a memory for storing a computer-readable program;

The computer-readable program, when executed by the processor, causes the processor to implement the method of any one of claims 1-5.

One of the embodiments of the present invention has at least the following beneficial effects: the distance values between all water quality arrays in the water quality data set and the reference point are calculated and formed into a one-dimensional data set; the k nearest neighbors of each object in the one-dimensional data set are found and the pre-threshold is determined; the ordered water quality data set is divided into multiple data blocks, the pre-threshold is taken as the outlier threshold, outlier detection is performed on each data block in turn, the outlier threshold is updated to the N-th largest outlier degree of the detected data blocks, and the updated outlier threshold is used as the criterion for outlier detection in the next data block. No abnormal water quality points need to be known in advance and no global outlier degrees need to be calculated; the outlier detection speed is improved, and the water quality anomaly detection results are guaranteed to be consistent with those of the traditional distance-based outlier detection algorithm.

Description of drawings

The accompanying drawings are used to provide a further understanding of the technical solutions of the present invention, and constitute a part of the description. They are used to explain the technical solutions of the present invention together with the embodiments of the present invention, and do not constitute a limitation on the technical solutions of the present invention.

FIG. 1 is a flowchart of a method for detecting abnormal water quality according to an embodiment of the present invention.

FIG. 2 is a flowchart of a method for detecting abnormal water quality in a water quality data set provided by an embodiment of the present invention.

Detailed description

In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the present invention, but not to limit the present invention.

It should be noted that although functional modules are divided in the schematic diagram of the system and a logical sequence is shown in the flow chart, in some cases the modules may be divided differently from the system, or the steps shown or described may be executed in an order different from that in the flow chart. The terms "first", "second" and the like in the description and claims and the above drawings are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence.

Terms used in this example are introduced:

Data block: a unit of outlier detection, consisting of several objects in the data set, for example, 1000 objects are commonly used as a data block.

k nearest neighbors: the distances between object A and all objects in the data set are calculated, and the k objects with the smallest distance values are the k nearest neighbors of A.

Temporary k nearest neighbors: the distances between object A and some of the objects in the data set are calculated, and the k objects with the smallest distance values are the temporary k nearest neighbors of A.

The kth nearest neighbor: refers to the k distance values between object A and its k nearest neighbors. The distance values are sorted from small to large, and the object corresponding to the kth distance value is the kth nearest neighbor of object A.

The temporary kth nearest neighbor refers to the k distance values between object A and its temporary k nearest neighbors. The distance values are sorted from small to large, and the object corresponding to the kth distance value is the temporary kth nearest neighbor of object A.

Outlier degree of object A: refers to the distance value of object A and its kth nearest neighbor.

Temporary outlier degree of object A: refers to the distance value of object A and its temporary k-th nearest neighbor.

Spiral order: given the index sequence 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 and a starting index of 5, the spiral order is 5, 4, 6, 3, 7, 2, 8, ... or 5, 6, 4, 7, 3, 8, 2, ..., that is, indices are taken alternately from before and after the starting index, moving progressively outward.
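A minimal sketch of the spiral order, using the first of the two variants given above (the element before the start is visited first). The function name `spiral_order` and its inclusive `low`/`high` bounds are assumptions made for this example.

```python
def spiral_order(start, low, high):
    """Indices low..high (inclusive) alternating outward from start."""
    order = [start]
    for step in range(1, high - low + 1):
        if start - step >= low:       # element before the start
            order.append(start - step)
        if start + step <= high:      # element after the start
            order.append(start + step)
    return order

print(spiral_order(5, 1, 10))  # [5, 4, 6, 3, 7, 2, 8, 1, 9, 10]
```

Once one side of the sequence is exhausted, the remaining indices on the other side are simply visited in order.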

FIG. 1 shows a water quality anomaly detection method provided by an embodiment of the present invention, including:

S101. Acquire multiple water quality arrays to form a water quality data set. Each water quality array has the same dimension and includes at least one water quality data;

Each water quality array is multi-dimensional data including at least one item of water quality data, for example at least one of chemical oxygen demand data, ammonia nitrogen data, total phosphorus data, dissolved oxygen data, temperature data, turbidity data, pH value, and the like. Those skilled in the art can select water quality data of different physical quantities according to actual needs.

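Specifically, the Euclidean distance is used to calculate the distance. For two normalized n-dimensional water quality arrays x1 and x2, the distance is calculated as:

$$\mathrm{dist}(x1, x2) = \sqrt{\sum_{i=1}^{n} \left(x_{1i} - x_{2i}\right)^{2}}$$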

n is determined according to the actual situation, for example, n=4, and the water quality array includes chemical oxygen demand data, ammonia nitrogen data, total phosphorus data, and dissolved oxygen data.

S104, performing descending sorting on all distance values of the one-dimensional data set to obtain an ordered one-dimensional data set, and sorting all the water quality arrays of the water quality data set according to the descending order to obtain an ordered water quality data set;

D is the number of water quality arrays, and the value of D is generally relatively large, which can be more than tens of thousands.

Determining the k-nearest neighbors of each object of the dataset includes:

Let any object of the ordered one-dimensional data set be denoted O, with k1 objects before object O and k2 objects after it, where k1≥0 and k2≥0;

When k1≥k, search k objects forward, when k1<k, search forward k1 objects;

When k2≥k, search k objects backward, when k2<k, search backward k2 objects;

Since the ordered one-dimensional data set is sorted in order, when determining the k nearest neighbors of the object O, only the distances of the k objects before and after the object O need to be calculated, and the distance between the object O and all objects does not need to be calculated, reducing the calculation time.

The accuracy of the detection results is ensured by deriving the pre-threshold from the data set itself. The principle is as follows. After the distance between each object in the data set and the reference point is calculated and the objects are thereby mapped to a one-dimensional space, the triangle inequality guarantees that the distance between any two objects in the one-dimensional space (the one-dimensional distance) is less than or equal to their actual distance (the multi-dimensional distance). Further, when the k nearest neighbors of an object s_a are searched in the one-dimensional space, the one-dimensional distances between s_a and these neighbors are all less than or equal to the corresponding multi-dimensional distances, from which it follows that the one-dimensional outlier degree of s_a is less than or equal to its multi-dimensional outlier degree. Since s_a is arbitrary, the one-dimensional outlier degree of every object is less than or equal to its multi-dimensional outlier degree. The N objects with the largest one-dimensional outlier degrees are taken, and the smallest of these (that is, the N-th largest one-dimensional outlier degree) is used as the pre-threshold Tb; it can similarly be shown that Tb is less than or equal to the multi-dimensional outlier degree threshold, which is the outlier degree of the N-th most abnormal water quality point to be detected. Because Tb does not exceed this value, using Tb to exclude non-outlier points cannot cause false exclusions, which ensures the correctness of the detection result.
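The chain of inequalities behind this argument can be written out as follows. The symbols are introduced here for illustration and do not appear in the original: r is the reference point, d is the multi-dimensional distance, o_k(a) and \tilde{o}_k(a) are the multi-dimensional and one-dimensional outlier degrees of object a, and T is the multi-dimensional outlier degree threshold (the N-th largest o_k).

```latex
% Notation assumed for this sketch: r reference point, d multi-dimensional distance,
% \tilde{d} one-dimensional distance, b_1..b_k the multi-dimensional k nearest neighbors of a.
\begin{align*}
\tilde{d}(a,b) &:= \lvert d(a,r) - d(b,r) \rvert \le d(a,b)
  && \text{(triangle inequality)} \\
\tilde{d}(a,b_i) &\le d(a,b_i) \le o_k(a), \quad i = 1,\dots,k
  && \text{(each neighbor is within } o_k(a) \text{ of } a\text{)} \\
\Rightarrow\ \tilde{o}_k(a) &\le o_k(a)
  && \text{(at least } k \text{ objects lie within one-dimensional distance } o_k(a)\text{)} \\
\Rightarrow\ T_b &\le T
  && \text{(at least } N \text{ objects satisfy } o_k \ge \tilde{o}_k \ge T_b\text{)}
\end{align*}
```

The last line is what makes the pre-threshold safe: any object whose outlier degree falls below T_b also falls below T and therefore cannot be among the N most abnormal points.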

As shown in Figure 2, step S107 is specifically:

S201. Divide the ordered water quality data set into B data blocks, each data block includes M water quality arrays, and outlier threshold=pre-threshold;

S202, set t=1, and t represents the t-th data block;

Specifically, the distance between the first water quality array in the t-th data block and the reference point, and the outlier degree of the reference point are calculated and stored in step S103.

The water quality arrays in a data block are arranged in order. When t is greater than or equal to 2, if the first water quality array in the data block satisfies the termination rule, that is, d0 + (the outlier degree of the reference point) < the outlier degree threshold, then the first water quality array is not an outlier, and neither are the other water quality arrays in this data block or in the remaining data blocks, so the rest of the data set does not need to be examined. Only the first water quality array is used for this check; once the termination rule is met, detection stops and the detection result is output, which greatly shortens the detection time.
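The termination rule relies on d0 + (outlier degree of the reference point) being an upper bound on the outlier degree of an array at distance d0 from the reference point. The sketch below states the rule and provides a brute-force helper that can be used to check the bound on sample data; the names `kth_nn_distance` and `can_terminate` are hypothetical, and Euclidean distance via `math.dist` is assumed.

```python
import math

def kth_nn_distance(data, i, k):
    """Outlier degree of data[i]: distance to its k-th nearest neighbour."""
    return sorted(math.dist(data[i], x) for j, x in enumerate(data) if j != i)[k - 1]

def can_terminate(d0, ref_degree, threshold):
    """Termination rule of step S204 for the first array of a block."""
    # The k nearest neighbours of the reference point lie within ref_degree of it,
    # hence within d0 + ref_degree of the array, which bounds its outlier degree.
    return d0 + ref_degree < threshold
```

Because the ordered data set is sorted by descending distance to the reference point, every later array has a d0 no larger than that of the first array in the block, so the same bound rules out all of them at once.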

The ordered water quality data set is sorted, so objects that are close to each other are also placed close to each other. Therefore, the object in the middle of the data block is taken as the median object (for example, for a data block of 1000 objects, the 500th or 501st object can be selected), and the k nearest neighbors are searched in spiral order, alternating between the objects before and after it.

In this step, the position number m of a water quality array in the data block is fixed when it is first assigned and remains unchanged even if water quality arrays are subsequently removed. For example, for the data block [X1, …, XM], each array keeps the position number it had in the original data block after removals.

During execution, non-outlier points are removed. Since m uses the position numbers of the initial data block, it is necessary to determine whether the water quality array at position number m has already been removed before it is processed.

S208, calculate the distance between Xm and xj;

Use Euclidean distance to calculate the distance between two water quality arrays.

S209, determine whether j<k, if so, execute step S211, if not, update the temporary k nearest neighbor of Xm, update the temporary outlier degree of Xm to be the distance between Xm and the temporary kth nearest neighbor, and execute step S210;

When j<k, it means that the number of distance values has not yet reached k, and the calculation of the temporary outlier is not performed.

Because the distances between Xm and the objects in the data set are calculated one by one, the distance between Xm and its temporary k-th nearest neighbor can only decrease or stay the same as the search proceeds (the temporary k nearest neighbors are always the k smallest distances found so far, so the k-th smallest cannot increase). That is, the temporary outlier degree never increases. Therefore, once the temporary outlier degree is found to be less than the outlier degree threshold, Xm can be excluded directly as a non-outlier point without continuing the search for its k nearest neighbors. There is no need to calculate the distances to all objects before making a judgment, which reduces the detection time. Moreover, as water quality arrays are removed from the data block, the number of arrays remaining in the data block keeps shrinking, which further reduces the amount of calculation and speeds up detection.
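The early-exclusion idea for a single array can be sketched as follows. The function name `is_excluded` is hypothetical, and the distances are assumed to arrive one at a time in search order; a max-heap of the k smallest distances seen so far (stored negated, since `heapq` is a min-heap) holds the temporary k nearest neighbors.

```python
import heapq

def is_excluded(distances, k, threshold):
    """Return (excluded, number of distances consumed before deciding)."""
    heap = []      # negated values: heap[0] is minus the temporary k-th NN distance
    count = 0
    for count, d in enumerate(distances, start=1):
        if len(heap) < k:
            heapq.heappush(heap, -d)
        elif d < -heap[0]:
            heapq.heapreplace(heap, -d)   # a closer neighbour replaces the farthest
        # the temporary outlier degree can only shrink, so this test is final
        if len(heap) == k and -heap[0] < threshold:
            return True, count
    return False, count
```

For example, with `k=2` and `threshold=1.0`, the stream `[5.0, 0.3, 0.2, 9.0, 0.1]` is rejected after three distances, because the two smallest seen so far (0.3 and 0.2) already give a temporary outlier degree below the threshold.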

When the water quality array in the data block has not been detected, proceed to the next water quality array for processing.

Specifically, when t=1, the N-th largest outlier degree in the first data block is directly used as the new outlier degree threshold. When t>1, the N largest outlier degrees are selected from the N largest outlier degrees in the 1st to (t-1)-th data blocks together with the N largest outlier degrees in the t-th data block; these are the N largest outlier degrees in the 1st to t-th data blocks, and the N-th largest of them is taken as the new outlier degree threshold.
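The threshold update can be sketched with `heapq.nlargest`. The function name `merge_top_n` is hypothetical, and the sketch assumes the merged list contains at least N outlier degrees.

```python
import heapq

def merge_top_n(previous_top, block_degrees, n_top):
    """Merge the running top N with a new block; return (top N, new threshold)."""
    merged = heapq.nlargest(n_top, list(previous_top) + list(block_degrees))
    return merged, merged[-1]   # the N-th largest becomes the new threshold
```

Only the N largest values from earlier blocks need to be retained between blocks, so the update cost does not grow with the number of blocks already processed.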

When there are still data blocks undetected, continue to detect the next data block.

The present invention also provides an electronic device, comprising:

processor;

a memory for storing a computer-readable program;

When the computer-readable program is executed by the processor, the processor is caused to implement the water quality anomaly detection method of the above-described embodiment.

Those of ordinary skill in the art can understand that all or some of the steps and systems in the methods disclosed above can be implemented as software, firmware, hardware, and appropriate combinations thereof. Some or all physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor or microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is known to those of ordinary skill in the art, the term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disk (DVD) or other optical disk storage, magnetic cartridges, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. In addition, as is well known to those of ordinary skill in the art, communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and can include any information delivery media.

The preferred implementation of the present invention has been specifically described above, but the present invention is not limited to the above-mentioned embodiments. Those skilled in the art can also make various equivalent modifications or substitutions without departing from the spirit of the present invention, and these equivalent modifications or substitutions are included within the scope defined by the claims of the present invention.

Claims

A method for detecting abnormal water quality, comprising:

S101. Acquire multiple water quality arrays to form a water quality data set. Each water quality array has the same dimension and includes at least one water quality data;

S102, randomly select a water quality array as a reference point in the water quality data set;

S103. Calculate the distance values between all water quality arrays in the water quality data set and the reference point, and form all the distance values into a one-dimensional data set;

S104. Sort all distance values of the one-dimensional data set in descending order to obtain an ordered one-dimensional data set, and sort all the water quality arrays of the water quality data set in the same order to obtain an ordered water quality data set;

S105. Determine the k nearest neighbors of each object in the ordered one-dimensional data set, where 1≤k≤D*1% and D is the number of water quality arrays in the water quality data set;

S106. Calculate the distance value between each object of the ordered one-dimensional data set and its k-th nearest neighbor to obtain the outlier degree of each object, where the k-th nearest neighbor is the k-th of the object's k nearest neighbors; the outlier degrees of all objects in the one-dimensional data set constitute the one-dimensional outlier degrees; select the N largest one-dimensional outlier degrees in descending order, and use the N-th largest as the pre-threshold;

S107. Divide the ordered water quality data set into multiple data blocks, use the pre-threshold as the outlier threshold, and perform outlier detection on each data block in turn: determine the N largest outlier degrees of the detected data blocks according to the outlier threshold, update the outlier threshold to the N-th largest outlier degree of the detected data blocks, and use the updated outlier threshold as the criterion for outlier detection in the next data block, until all data blocks have been detected. After detection is completed, the water quality arrays corresponding to the N largest outlier degrees over all data blocks are taken as the N abnormal water quality arrays.
The method for detecting abnormality in water quality according to claim 1, wherein determining the k nearest neighbors of each object in the ordered one-dimensional data set in step S105 comprises:

Assuming that any object in the ordered one-dimensional data set is denoted as O, there are k1 objects in front of object O, and k2 objects in the back of object O, where k1≥0, k2≥0;

When k1≥k, search k objects forward, when k1<k, search forward k1 objects;

When k2≥k, search k objects backward, when k2<k, search backward k2 objects;

Calculate the distance between object O and all searched objects, sort the searched objects from small to large according to the size of the distance, and the objects with the top k distances are the k nearest neighbors of object O.
The method for detecting abnormal water quality according to claim 1, wherein step S107 is specifically:

S201. Divide the ordered water quality data set into B data blocks, each data block includes M water quality arrays, and outlier threshold=pre-threshold;

S202, set t=1, and t represents the t-th data block;

S203, determine whether t is 1, if yes, go to step S205, if not, go to step S204;

S204, determine whether d0 + (the outlier degree of the reference point) < the outlier degree threshold, where d0 is the distance between the first water quality array in the t-th data block and the reference point; if so, execute step S215; if not, execute step S205;

S205. Starting from the median object of the t-th data block of the ordered water quality data set, arrange the water quality data set in spiral order; xj denotes the j-th water quality array of the data set in spiral order; set j=1;

S206, set m=1, m represents the position number of the water quality array in the initial t-th data block, and Xm represents the water quality array numbered m;

S207, determine whether Xm has been removed, if so, execute step S211, if not, execute step S208;

S208, calculate the distance between Xm and xj;

S209, determine whether j<k; if so, execute step S211; if not, update the temporary k nearest neighbors of Xm, update the temporary outlier degree of Xm to the distance between Xm and its temporary k-th nearest neighbor, and execute step S210;

S210, determine whether the temporary outlier degree of Xm is lower than the outlier degree threshold; if the determination result is yes, remove Xm from the t-th data block, and execute step S211; if the determination result is no, execute step S211;

S211, determine whether m is less than M, if yes, m=m+1, go to step S207, if not, go to step S212;

S212, determine whether j is less than D, if yes, then j=j+1, go to step S206; if not, go to step S213;

S213. When t=1, determine the N largest outlier degrees in the t-th data block, take the N-th largest as the outlier degree threshold, and execute step S214; when t>1, determine the N largest outlier degrees in the t-th data block, select the N largest outlier degrees in the 1st to t-th data blocks from the N largest outlier degrees in the 1st to (t-1)-th data blocks together with the N largest in the t-th data block, set the outlier degree threshold to the N-th largest outlier degree in the 1st to t-th data blocks, and execute step S214;

S214, determine whether t is less than B, if yes, t=t+1, go to step S204, if not, go to step S215;

S215. Take the N water quality arrays corresponding to the N largest outlier degrees of all currently detected data blocks as the N abnormal water quality arrays.
The method for detecting abnormality in water quality according to claim 1, wherein the method for calculating the distance between two water quality arrays is used to calculate the distance between all the arrays and the reference point, and the method for calculating the distance between the two water quality arrays comprises:

Assuming that the two water quality arrays are x1 and x2, each represented by an n-dimensional variable, $x1=(x_{11}, x_{12}, \ldots, x_{1n})$ and $x2=(x_{21}, x_{22}, \ldots, x_{2n})$, the distance between x1 and x2 is the Euclidean distance:

$$\mathrm{dist}(x1, x2) = \sqrt{\sum_{i=1}^{n} \left(x_{1i} - x_{2i}\right)^{2}}$$

where $x_{11}, x_{12}, \ldots, x_{1n}$ represent the normalized data of the different physical quantities of the water quality array x1, $x_{21}, x_{22}, \ldots, x_{2n}$ represent the normalized data of the different physical quantities of the water quality array x2, and dist(x1, x2) represents the distance between the water quality arrays x1 and x2.
The abnormal water quality detection method according to claim 1, wherein n is greater than or equal to 1, and each water quality array includes at least one of chemical oxygen demand data, ammonia nitrogen data, total phosphorus data, and dissolved oxygen data.
An electronic device, comprising:

processor;

a memory for storing a computer-readable program;

The computer-readable program, when executed by the processor, causes the processor to implement the method of any one of claims 1-5.