CN114254836A - Water bloom disaster early warning method and terminal based on similarity analysis

Info

Publication number: CN114254836A
Application number: CN202111620749.2A
Authority: CN
Inventors: 单森华; 戴诗琪; 徐能通; 林永清; 吴闽帆; 吴弘毅
Original assignee: Istrong Technology Co ltd
Current assignee: Istrong Technology Co ltd
Priority date: 2021-12-28
Filing date: 2021-12-28
Publication date: 2022-03-29

Abstract

The invention discloses a water bloom disaster early warning method and terminal based on similarity analysis. The method comprises the following steps: step S1, acquiring algae data, meteorological data and water quality data, and preprocessing them to form standard time series with the same time frequency; step S2, calculating the spatial distribution similarity from the algae data, the meteorological data and the water quality data of the standard time series; step S3, calculating the environmental condition similarity from the algae data, the meteorological data and the water quality data of the standard time series; and step S4, screening time periods with similar environmental conditions according to the environmental condition similarity, screening time points with similar spatial distribution according to the spatial distribution similarity, and calculating the final subsequent trend grade from those similar time periods and similar time points. By calculating the spatial similarity and the temporal similarity of the environment, the invention obtains the subsequent trend grade of the algae condition.

Description

Water bloom disaster early warning method and terminal based on similarity analysis

Technical Field

The invention relates to the technical field of disaster early warning, in particular to a water bloom disaster early warning method and a terminal based on similarity analysis.

Background

Water bloom refers to the floating algal mass that forms on the surface of a fresh water body when certain algae multiply explosively as a result of eutrophication. Water bloom not only spoils the water landscape but also stimulates bacterial growth, severely degrades the water body, inhibits the growth and reproduction of beneficial plankton, and harms farmed animals, thereby endangering human health. China is a country of many lakes, and many of them are affected by water bloom disasters: large cyanobacterial blooms have occurred for years in succession in Taihu Lake, Chaohu Lake and Dianchi Lake, and Poyang Lake, Dongting Lake, Erhai Lake and other lakes have also suffered water bloom disasters of varying severity. The country therefore pays increasing attention to the monitoring and early warning of water bloom disasters.

At present, water bloom monitoring relies mainly either on laboratory analysis of algae-containing water samples that are collected manually and periodically at deployed stations, or on retrieving cyanobacterial blooms by constructing a water-color remote sensing model. Early warning mainly uses two methods: one builds a prediction model that combines algal cell concentration with weather forecasts; the other builds an aquatic ecological dynamics model that accounts for factors such as nutrients, water flow and vertical water temperature distribution. However, the former depends heavily on weather forecast accuracy, and the latter must account for the many factors that affect algae, which makes the model very complex to build. Current water bloom early warning technology is therefore limited.

Disclosure of Invention

The technical problem to be solved by the invention is to provide a water bloom disaster early warning method and terminal based on similarity analysis for predicting water bloom disasters.

In order to solve the technical problems, the invention adopts the technical scheme that:

a bloom disaster early warning method based on similarity analysis comprises the following steps:

step S1, algae data, meteorological data and water quality data are obtained, and the algae data, the meteorological data and the water quality data are preprocessed to form a standard time sequence with the same time frequency;

step S2, calculating to obtain spatial distribution similarity according to the algae data of the standard time sequence;

step S3, calculating according to the algae data, the meteorological data and the water quality data of the standard time sequence to obtain the similarity of environmental conditions;

step S4, screening time periods with similar environmental conditions according to the environmental condition similarity, screening time points with similar spatial distribution according to the spatial distribution similarity, and calculating the final subsequent trend grade from those similar time periods and similar time points.

In order to solve the technical problem, the invention adopts another technical scheme as follows:

a water bloom disaster early warning terminal based on similarity analysis comprises a memory, a processor, and a computer program that is stored in the memory and can run on the processor, wherein the processor implements steps S1 to S4 of the above method when executing the computer program.

The invention has the following beneficial effects: in the bloom disaster early warning method and terminal based on similarity analysis, the spatial distribution similarity and the environmental condition similarity are calculated, and the subsequent trend grade of the algae condition is obtained from them, so that bloom disasters are predicted.

Drawings

Fig. 1 is a flow diagram illustrating a bloom disaster warning method based on similarity analysis according to an embodiment of the present invention;

fig. 2 is a schematic structural diagram of a bloom disaster warning terminal based on similarity analysis according to an embodiment of the present invention.

Description of reference numerals:

1. a bloom disaster early warning terminal based on similarity analysis; 2. a processor; 3. a memory.

Detailed Description

In order to explain technical contents, achieved objects, and effects of the present invention in detail, the following description is made with reference to the accompanying drawings in combination with the embodiments.

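Referring to fig. 1, a bloom disaster early warning method based on similarity analysis comprises steps S1 to S4 as set out above.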

From the above description, the beneficial effects of the invention are as follows: bloom disasters are predicted by calculating the spatial distribution similarity and the environmental condition similarity and obtaining the subsequent trend grade of the algae condition from them; compared with existing bloom prediction methods, the method is easy to implement and highly accurate.

Further, the step S3 specifically includes:

step S31, calculating the correlation of all weather, water quality factors and algae changes by using the Pearson correlation coefficient;

step S32, taking the element with the absolute value of the correlation larger than the set value as the key element related to the algae change;

step S33, calculating the similarity of environmental conditions according to the key elements related to the algae change.

As can be seen from the above description, screening the elements by correlation yields the key elements to be calculated, and the environmental condition similarity is computed from those key elements; the screening reduces computational complexity and makes the whole method easier to implement.

Further, the step S31 specifically includes:

for time series data, key elements related to algae change are screened by using the Pearson correlation coefficient, which is calculated as follows:

$$\rho_{A,B} = \frac{\operatorname{cov}(A, B)}{\operatorname{std}(A)\,\operatorname{std}(B)}$$

where ρ represents the correlation, A and B represent the time series of algae and the time series of an individual meteorological or water quality element, respectively, cov represents covariance, and std represents standard deviation;

the step S33 specifically includes:

the environmental condition similarity is calculated according to the following formula:

where ω_m is the element weight, obtained from the element's correlation (the higher the correlation, the higher the weight), d(x_m, y_m) is the distance of a single element between the two time series, Y is the multi-element sequence at time a, and X is the multi-element sequence at time b:

where m denotes the key elements, which are therefore the same in Y and X, k and n are the time lengths, and k is less than n;

the distance d(x_m, y_m) of a single element between the two time series is calculated according to the following formula:

where l_k is the distance of an alignment point between x_m and y_m:

where i and j are the index numbers in the two sequences, respectively, and l_z is the distance of the z-th alignment point;

with the constraint:

$\{x_{m,n'}, \ldots, x_{m,n''}\} \subseteq x_m, \quad n'' - n' \in [k-2,\ k+2]$;

where n' is the start position of the intercepted segment and n'' is its end position; if:

$l_{z-1} = (x_{m,n'},\ y_{m,k'})$;

then the next alignment point:

$l_z = (x_{m,n},\ y_{m,k})$;

must satisfy the conditions: $n - n' \le 1$, $k - k' \le 1$, $n - n' \ge 0$, and $k - k' \ge 0$.

As can be seen from the above description, an example method for calculating the environmental condition similarity is given. It uses the dynamic time warping algorithm, which overcomes the weakness of Euclidean distance, namely that it cannot capture similarity in shape or in dynamic variation amplitude, and attains the minimum sum of weighted distances through local optimization.
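For reference, the textbook dynamic time warping recurrence can be written as below. This is a sketch of the standard formulation only: the symbols c and D, and the segment length p, are introduced here for illustration and do not appear in the source, whose modified formulation may differ. The three predecessors in the minimum correspond to the continuity and monotonicity conditions (each index advances by 0 or 1 per step).

```latex
\begin{aligned}
c(i, j) &= \lvert x_{m,i} - y_{m,j} \rvert, \\
D(1, 1) &= c(1, 1), \\
D(i, j) &= c(i, j) + \min\{\, D(i-1, j),\ D(i, j-1),\ D(i-1, j-1) \,\}, \\
d(x_m, y_m) &= D(p, k).
\end{aligned}
```

Here p is the length of the intercepted segment of x_m and k is the length of y_m, so the accumulated cost is read off at the pair of final points.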

Further, the step S2 specifically includes:

the spatial similarity s_spatial_{a,b} of time a and time b is calculated using the following equation:

$$\mathrm{s\_spatial}_{a,b} = \omega_{num}\,\mathrm{s\_num}_{a,b} + \omega_{trend}\,\mathrm{s\_trend}_{a,b}$$

where s_num_{a,b} is the numerical similarity of the spatial distributions at time a and time b, s_trend_{a,b} is the trend similarity of the spatial distributions at time a and time b, and ω_num and ω_trend are the weights of the numerical similarity and the trend similarity, respectively;

the numerical similarity s_num_{a,b} is calculated according to the following formula:

where x denotes the algal cell concentration: x_ai is the algal cell concentration at sampling point i at time a, x_bi is the algal cell concentration at sampling point i at time b, std_xi is the standard deviation of the algal cell concentration at sampling point i over all historical times, and ω_i is the weight of sampling point i, calculated using the following equation:

where ω_i is the weight of sampling point i, N is the total number of sampling points, R is the number of sub-regions into which the lake is divided, and R_i is the number of sampling points in the sub-region to which sampling point i belongs;

the trend similarity s_trend_{a,b} is calculated according to the following formula:

where r_ai is the rank of the algal cell concentration at sampling point i among all samples at time a, r_bi is the rank of the algal cell concentration at sampling point i among all samples at time b, and std_ri is the standard deviation of the algal cell concentration rank of sampling point i over all historical times.

As can be seen from the above description, a specific method for calculating the spatial distribution similarity is provided. The lake region is divided into R sub-regions and a weight is assigned to each sampling point when the numerical similarity is calculated, which prevents regions with dense sampling points from having an excessive influence on the similarity.
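The effect of the sub-region weighting can be illustrated with a short sketch. The weight formula itself is not recoverable from the text here, so the code assumes one form that is consistent with the listed symbols, ω_i = N / (R · R_i); this form, the function name and the sample identifiers are all assumptions for illustration.

```python
from collections import Counter


def sampling_point_weights(region_of_point):
    """Weight per sampling point, ASSUMING w_i = N / (R * R_i).

    region_of_point maps each sampling point id to its sub-region id.
    Under this assumed form every sub-region carries the same total
    weight N / R, however many sampling points it contains.
    """
    n_points = len(region_of_point)                        # N
    points_per_region = Counter(region_of_point.values())  # R_i per sub-region
    n_regions = len(points_per_region)                     # R
    return {
        point: n_points / (n_regions * points_per_region[region])
        for point, region in region_of_point.items()
    }


# Hypothetical layout: three points crowded in the north, one in the south.
layout = {"p1": "north", "p2": "north", "p3": "north", "p4": "south"}
weights = sampling_point_weights(layout)
```

With this layout the three northern points share the same total weight as the single southern point, so the densely sampled region does not dominate the similarity.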

Further, the step S4 specifically includes:

the subsequent trend of each similar time point or similar time period is graded: taking the similar time point, or the last time point of the similar time period, as the reference, the change amplitude of the algal cell concentration between that point and the next time point is calculated; a corresponding set score is obtained according to the grade of the subsequent trend, and the score of the final trend grade is calculated according to the following formula:

where grade_r is the grade of the subsequent trend at similar time r, i is the corresponding set score, sim_r is the similarity at similar time r, and trend_i is the score of the final trend grade.

As can be seen from the above description, the score of the final trend grade is obtained by a weighted calculation over the trends of the similar time periods and similar time points, so that the bloom disaster forecast is simple to compute.

Referring to fig. 1, a first embodiment of the present invention is:

Step S1, acquiring algae data, meteorological data and water quality data, and preprocessing them to form standard time series with the same time frequency.

Specifically, preprocessing the algae data, the meteorological data and the water quality data into standard time series includes outlier rejection, null-value filling, and resampling to the same time frequency.
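The three preprocessing operations can be sketched as follows. The source does not specify the rules, so this sketch assumes a daily target frequency, a median-absolute-deviation rule for outliers, and linear interpolation for null values; all function names and parameters are hypothetical.

```python
from datetime import date, timedelta
from statistics import median


def reject_outliers(values, k=5.0):
    """Replace points farther than k MADs from the median with None (assumed rule)."""
    present = [v for v in values if v is not None]
    med = median(present)
    mad = median(abs(v - med) for v in present) or 1e-9  # avoid division by zero
    return [None if v is not None and abs(v - med) / mad > k else v for v in values]


def fill_gaps(values):
    """Fill None entries by linear interpolation; edge gaps copy the nearest value."""
    known = [i for i, v in enumerate(values) if v is not None]
    out = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((j for j in known if j < i), default=None)
        right = min((j for j in known if j > i), default=None)
        if left is None:
            out[i] = values[right]
        elif right is None:
            out[i] = values[left]
        else:
            frac = (i - left) / (right - left)
            out[i] = values[left] + frac * (values[right] - values[left])
    return out


def resample_daily(samples, start, end):
    """Average readings that fall on the same day; days without readings become None."""
    buckets = {}
    for day, value in samples:
        buckets.setdefault(day, []).append(value)
    days = [start + timedelta(n) for n in range((end - start).days + 1)]
    return days, [sum(buckets[d]) / len(buckets[d]) if d in buckets else None for d in days]


def standardize(samples, start, end):
    """Resample to daily frequency, then reject outliers, then fill the gaps."""
    days, values = resample_daily(samples, start, end)
    return days, fill_gaps(reject_outliers(values))


# Hypothetical readings: two on 1 January, none on 2 January, one on 3 January.
readings = [(date(2021, 1, 1), 10.0), (date(2021, 1, 1), 12.0), (date(2021, 1, 3), 13.0)]
days, series = standardize(readings, date(2021, 1, 1), date(2021, 1, 3))
```

Applying the same `standardize` call to the algae, meteorological and water quality series over one shared date range gives series of equal length that can be compared point by point.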

Step S2, calculating the spatial distribution similarity from the algae data of the standard time series.

The spatial distribution similarity s_spatial_{a,b} of time a and time b is calculated from the numerical similarity of the algal cell concentration at each sampling point and the trend similarity of the algal cell concentration at each sampling point.

Here, time a refers to the current sampling time point, and time b refers to a historical sampling time point.

Specifically, the spatial similarity s_spatial_{a,b} is calculated by the following equation:

$$\mathrm{s\_spatial}_{a,b} = \omega_{num}\,\mathrm{s\_num}_{a,b} + \omega_{trend}\,\mathrm{s\_trend}_{a,b}$$

where s_num_{a,b} is the numerical similarity of the spatial distributions at time a and time b, s_trend_{a,b} is the trend similarity of the spatial distributions at time a and time b, and ω_num and ω_trend are the weights of the numerical similarity and the trend similarity, respectively. The weights are set by the practitioner according to actual conditions; in this embodiment, ω_num is 0.7 and ω_trend is 0.3.

For the numerical similarity s_num_{a,b}, x denotes the algal cell concentration: x_ai is the algal cell concentration at sampling point i at time a, x_bi is the algal cell concentration at sampling point i at time b, std_xi is the standard deviation of the algal cell concentration at sampling point i over all historical times, and ω_i is the weight of sampling point i.

To prevent regions with dense sampling points from having an excessive influence on the similarity, the lake area is divided into R sub-regions and a weight is assigned to each sampling point when the numerical similarity is calculated. The weight of each sampling point is calculated as follows:

where ω_i is the weight of sampling point i, N is the total number of sampling points, R is the number of sub-regions, and R_i is the number of sampling points in the sub-region to which sampling point i belongs.

The trend similarity s_trend_{a,b} is calculated according to the following formula:

Step S3, calculating the environmental condition similarity from the algae data, the meteorological data and the water quality data of the standard time series.

Specifically, step S3 includes:

Step S31, for the environmental condition similarity, the key elements related to algae change are screened from the time series data by using the Pearson correlation coefficient, which is calculated as follows:

$$\rho_{A,B} = \frac{\operatorname{cov}(A, B)}{\operatorname{std}(A)\,\operatorname{std}(B)}$$

where ρ represents the correlation, A and B represent the time series of algae and the time series of an individual meteorological or water quality element, respectively, cov represents covariance, and std represents standard deviation.

Step S32, an element with the absolute value of correlation larger than 0.5 is taken as a key element related to the algae change.
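The screening in steps S31 and S32 can be sketched in a few lines. The 0.5 threshold comes from this embodiment; the function names, element names and sample values are hypothetical, and population statistics are assumed for cov and std.

```python
from statistics import mean, pstdev


def pearson(a, b):
    """rho = cov(A, B) / (std(A) * std(B)), using population statistics.

    Raises ZeroDivisionError if either series is constant.
    """
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)
    return cov / (pstdev(a) * pstdev(b))


def select_key_elements(algae, elements, threshold=0.5):
    """Return {name: rho} for elements whose |rho| with the algae series exceeds threshold."""
    scores = {name: pearson(algae, series) for name, series in elements.items()}
    return {name: rho for name, rho in scores.items() if abs(rho) > threshold}


# Hypothetical series sampled at the same five time points.
algae = [1.0, 2.0, 3.0, 4.0, 5.0]
candidates = {
    "water_temp": [2.0, 4.0, 6.0, 8.0, 10.0],  # rises with algae
    "wind_speed": [5.0, 4.0, 3.0, 2.0, 1.0],   # falls as algae rises
    "ph": [7.0, 8.0, 7.0, 8.0, 7.0],           # unrelated to algae here
}
key_elements = select_key_elements(algae, candidates)
```

Because the absolute value is used, strongly negative correlations such as `wind_speed` are retained alongside positive ones, while `ph` is screened out.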

Specifically, let Y denote the sequences of the key elements at time a and X denote the sequences of the key elements at time b; then:

where m denotes the key elements, which are therefore the same in Y and X, k and n are the time lengths, and k is less than n.

Then the historical time set R with similar weather and water quality conditions is:

R＝{X_i|Find(Sim(X_i,Y)),X_i∈X}；

Because multiple influencing elements are present, Y and X are both multivariate time series, so the univariate time series similarity measure must be adapted when searching for similar times. Common univariate time series similarity measures include Euclidean distance and Dynamic Time Warping (DTW). The DTW algorithm overcomes the weakness of Euclidean distance, which cannot capture similarity in shape or in dynamic variation amplitude, and attains the minimum sum of weighted distances through local optimization. The similarity of the time series is therefore calculated with an improved DTW algorithm.

Specifically, for two time series with m influencing elements, the environmental condition similarity is calculated as:

where ω_m is the element weight, obtained from the element's correlation (the higher the correlation, the higher the weight), and d(x_m, y_m) is the distance of a single element between the two time series, solved by the DTW (dynamic time warping) method. Let the distance of an alignment point between x_m and y_m be:

where i and j are the index numbers in the two sequences, respectively; then:

Since the path must start at the start point and end at the end point, that is, the heads and tails of the two sequences must match, the boundary conditions are:

$l_1 = (x_{m,n'},\ y_{m,1}), \quad l_z = (x_{m,n''},\ y_{m,k})$;

where:

$\{x_{m,n'}, \ldots, x_{m,n''}\} \subseteq x_m, \quad n'' - n' \in [k-2,\ k+2]$;

where n' is the start position of the intercepted segment, n'' is its end position, and the constraint on n'' - n' requires that the length of the intercepted segment differ from that of the target sequence by no more than two time units.
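The interception constraint can be read as a rule for enumerating candidate segments of the longer historical sequence. The sketch below takes the constraint literally as k - 2 ≤ n'' - n' ≤ k + 2 with 0-based inclusive positions; the function name and the `slack` parameter are hypothetical.

```python
def candidate_windows(n, k, slack=2):
    """Yield (start, end) pairs, 0-based and inclusive, within a history of length n.

    Each pair satisfies k - slack <= end - start <= k + slack, mirroring the
    constraint on the intercepted segment relative to a target of length k.
    """
    for start in range(n):
        for span in range(max(k - slack, 0), k + slack + 1):
            end = start + span
            if end < n:
                yield start, end


# Hypothetical sizes: a history of 6 points and a target sequence of length 3.
windows = list(candidate_windows(6, 3))
```

Each candidate segment would then be compared with the target sequence, and the segments with the smallest distance kept as similar time periods.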

Since alignment cannot skip over a point, every point of the two sequences must be matched, which gives continuity. If:

$l_{z-1} = (x_{m,n'},\ y_{m,k'})$;

then the next alignment point

$l_z = (x_{m,n},\ y_{m,k})$;

must satisfy the conditions: $n - n' \le 1$ and $k - k' \le 1$.

Since the alignment points advance monotonically with time, so that no two alignment lines cross, the path also has monotonicity. If:

$l_{z-1} = (x_{m,n'},\ y_{m,k'})$;

then the next alignment point

$l_z = (x_{m,n},\ y_{m,k})$;

must satisfy the conditions: $n - n' \ge 0$ and $k - k' \ge 0$.
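The boundary, continuity and monotonicity conditions are those of classic DTW, which can be sketched with a small dynamic program. This is the standard algorithm, not the improved variant described in this embodiment; the absolute difference as point distance and the weighted sum across elements are assumptions, and all names are hypothetical.

```python
def dtw_distance(x, y):
    """Classic DTW cost between two numeric sequences, with |a - b| as point distance."""
    inf = float("inf")
    rows, cols = len(x), len(y)
    cost = [[inf] * (cols + 1) for _ in range(rows + 1)]
    cost[0][0] = 0.0  # boundary: the path starts at the first pair of points
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            # continuity and monotonicity: each index advances by 0 or 1 per step
            step = min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
            cost[i][j] = abs(x[i - 1] - y[j - 1]) + step
    return cost[rows][cols]  # boundary: the path ends at the last pair of points


def environment_distance(segment, target, weights):
    """Weighted sum of per-element DTW distances (ASSUMED combination rule)."""
    return sum(weights[name] * dtw_distance(segment[name], target[name]) for name in weights)


# Hypothetical two-element comparison with equal weights.
segment = {"water_temp": [0.0, 0.0], "ph": [1.0, 2.0, 3.0]}
target = {"water_temp": [1.0, 1.0], "ph": [1.0, 2.0, 2.0, 3.0]}
total = environment_distance(segment, target, {"water_temp": 0.5, "ph": 0.5})
```

In the `ph` series the two sequences have the same shape but different lengths, and DTW aligns them at zero cost, which is the property that a point-by-point Euclidean comparison lacks.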

Specifically, each subsequent trend is graded into one of five grades: large rise, slight rise, steady, slight fall, and large fall. Taking the similar time point, or the last time point of the similar time period, as the reference, the change amplitude of the algal cell concentration between that point and the next time point is calculated and graded using 0.4, 0.1, -0.1 and -0.4 as boundary values; the final trend grade is then obtained by weighted averaging:

where grade_r is the subsequent trend grade at similar time r, sim_r is the similarity at similar time r, and trend_i is the score of the final trend grade.
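The grading and weighted averaging can be sketched as follows. The four boundary values and five grade names come from this embodiment, but the exact formula is not recoverable from the text, so the sketch makes three assumptions: the change amplitude is the relative change, boundary values are assigned as coded below, and the score of each grade is the similarity-weighted share of similar times that received it. All names are hypothetical.

```python
GRADES = ["large rise", "slight rise", "steady", "slight fall", "large fall"]


def grade_change(current, following):
    """Grade the change in algal cell concentration using 0.4 / 0.1 / -0.1 / -0.4."""
    change = (following - current) / current  # ASSUMED definition of change amplitude
    if change > 0.4:
        return "large rise"
    if change > 0.1:
        return "slight rise"
    if change >= -0.1:
        return "steady"
    if change >= -0.4:
        return "slight fall"
    return "large fall"


def final_trend(similar_times):
    """similar_times: list of (similarity, current, following), one per similar time r.

    Returns (best_grade, scores), where scores[g] is the similarity-weighted
    share of similar times graded g (ASSUMED form of the weighted average).
    """
    total = sum(sim for sim, _, _ in similar_times)
    scores = {g: 0.0 for g in GRADES}
    for sim, current, following in similar_times:
        scores[grade_change(current, following)] += sim / total
    return max(scores, key=scores.get), scores


# Hypothetical similar times: (similarity, concentration then, concentration next).
history = [(0.9, 100.0, 150.0), (0.6, 100.0, 105.0), (0.5, 200.0, 300.0)]
best, scores = final_trend(history)
```

In this example two of the three similar times were followed by a large rise and together carry most of the similarity weight, so the predicted grade is "large rise".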

Referring to fig. 2, the second embodiment of the present invention is:

A bloom disaster early warning terminal 1 based on similarity analysis comprises a memory 3, a processor 2, and a computer program that is stored in the memory 3 and can run on the processor 2, wherein the steps of the first embodiment are implemented when the processor 2 executes the computer program.

In summary, the bloom disaster early warning method and terminal based on similarity analysis provided by the invention predict bloom disasters by calculating the spatial similarity and the temporal similarity of the environment and obtaining the subsequent trend grade of the algae condition from them; compared with existing bloom prediction methods, the approach is easy to implement and highly accurate. Screening the elements by correlation yields the key elements to be calculated, and the environmental condition similarity is computed from those key elements, which reduces computational complexity and makes the whole method easier to implement. The dynamic time warping algorithm overcomes the weakness of Euclidean distance, which cannot capture similarity in shape or in dynamic variation amplitude, and attains the minimum sum of weighted distances through local optimization. Dividing the lake region into R sub-regions and assigning a weight to each sampling point when the numerical similarity is calculated prevents regions with dense sampling points from having an excessive influence on the similarity.

The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all equivalent changes made by using the contents of the present specification and the drawings, or applied directly or indirectly to the related technical fields are included in the scope of the present invention.

Claims

1. A bloom disaster early warning method based on similarity analysis is characterized by comprising the following steps:

2. The bloom disaster warning method based on similarity analysis as claimed in claim 1, wherein the step S3 specifically comprises:

3. The bloom disaster warning method based on similarity analysis as claimed in claim 2, wherein the step S31 specifically comprises:

where ρ represents correlation, A, B represents the time series of algae and the time series of individual weather or water quality elements, respectively, cov represents covariance, std represents standard deviation;

the step S33 specifically includes:

where ω_m is the element weight, obtained from the element's correlation (the higher the correlation, the higher the weight), d(x_m, y_m) is the distance of a single element between the two time series, Y is the multi-element sequence at time a, and X is the multi-element sequence at time b:

where l_k is the distance of an alignment point between x_m and y_m:

with the constraint:

$\{x_{m,n'}, \ldots, x_{m,n''}\} \subseteq x_m, \quad n'' - n' \in [k-2,\ k+2]$;

$l_{z-1} = (x_{m,n'},\ y_{m,k'})$;

the next alignment point:

$l_z = (x_{m,n},\ y_{m,k})$;

4. The bloom disaster warning method based on similarity analysis as claimed in claim 1, wherein the step S2 specifically comprises:

where s_num_{a,b} is the numerical similarity of the spatial distributions at time a and time b, s_trend_{a,b} is the trend similarity of the spatial distributions at time a and time b, and ω_num and ω_trend are the weights of the numerical similarity and the trend similarity, respectively;

where x denotes the algal cell concentration: x_ai is the algal cell concentration at sampling point i at time a, x_bi is the algal cell concentration at sampling point i at time b, std_xi is the standard deviation of the algal cell concentration at sampling point i over all historical times, and ω_i is the weight of sampling point i, calculated using the following equation:

where ω_i is the weight of sampling point i, N is the total number of sampling points, R is the number of sub-regions into which the lake is divided, and R_i is the number of sampling points in the sub-region to which sampling point i belongs;

5. The bloom disaster warning method based on similarity analysis as claimed in claim 1, wherein the step S4 specifically comprises:

grading each similar time point or the subsequent trend of the similar time period, calculating the change range of the algae cell concentration between the similar time point or the tail time point of the similar time period and the next time point by taking the similar time point or the tail time point of the similar time period as a standard, obtaining a corresponding set score according to the grading of the subsequent trend, and calculating the score of the final trend grade according to the following formula:

6. A bloom disaster early warning terminal based on similarity analysis comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor, and is characterized in that the processor executes the computer program to realize the following steps:

7. The bloom disaster warning terminal based on similarity analysis as claimed in claim 6, wherein the step S3 specifically comprises:

8. The bloom disaster warning terminal based on similarity analysis as claimed in claim 7, wherein the step S31 specifically comprises:

the step S33 specifically includes:

where l_k is the distance of an alignment point between x_m and y_m:

with the constraint:

$\{x_{m,n'}, \ldots, x_{m,n''}\} \subseteq x_m, \quad n'' - n' \in [k-2,\ k+2]$;

$l_{z-1} = (x_{m,n'},\ y_{m,k'})$;

the next alignment point:

$l_z = (x_{m,n},\ y_{m,k})$;

9. The bloom disaster warning terminal based on similarity analysis as claimed in claim 6, wherein the step S2 specifically comprises:

where s_num_{a,b} is the numerical similarity of the spatial distributions at time a and time b, s_trend_{a,b} is the trend similarity of the spatial distributions at time a and time b, and ω_num and ω_trend are the weights of the numerical similarity and the trend similarity, respectively;

10. The bloom disaster warning terminal based on similarity analysis as claimed in claim 6, wherein the step S4 specifically comprises: