CN115293577A - Method for analyzing chemical control factors of underground water in alpine-cold flow region based on machine learning - Google Patents
Method for analyzing chemical control factors of underground water in alpine-cold flow region based on machine learning
- Publication number
- CN115293577A (application number CN202210939068.0A)
- Authority
- CN
- China
- Prior art keywords
- water chemistry
- water
- alpine
- control factors
- analyzing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/27—Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/70—Machine learning, data mining or chemometrics
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A20/00—Water conservation; Efficient water supply; Efficient water use
- Y02A20/152—Water filtration
Abstract
The invention discloses a machine-learning-based method for analyzing groundwater chemistry control factors in an alpine watershed, applied in the technical field of groundwater environment management in alpine regions. The method comprises the following steps: constructing a self-organizing map (SOM) neural network from preprocessed water chemistry element data to obtain a water chemistry visualization topological graph; determining the optimal number of clusters on the topological graph by combining the Davies-Bouldin index with the K-means algorithm, clustering the preprocessed water chemistry element data, normalizing the clustering result, and drawing a radar chart and an ion ratio diagram; and, based on the clustering result, applying positive matrix factorization (PMF) together with correlation analysis and the ion ratio method to analyze the groundwater chemistry control factors in the alpine watershed qualitatively and quantitatively. By combining SOM, PMF, correlation analysis and the ion ratio method, the method achieves both qualitative and quantitative analysis of these control factors.
Description
Technical Field
The invention relates to the technical field of groundwater environment management in alpine regions, and in particular to a machine-learning-based method for analyzing groundwater chemistry control factors in an alpine watershed.
Background
Groundwater is a key component of the water resources of northwest China; most of the residents' drinking water, industrial water, agricultural irrigation water and ecological water demand depends on it. Because precipitation is low and evaporation is high, drought and water scarcity degrade groundwater quality in the northwest, which makes the regional water shortage more severe. Research on groundwater chemistry control factors in alpine, water-scarce areas helps to understand how groundwater quality evolves in these areas and to guide groundwater resource management scientifically.
In recent years, many scholars in China and abroad have studied the factors that control groundwater chemistry. Current research focuses mainly on hydrogeological surveys, qualitative source analysis, and the water chemistry characteristics of local areas over short periods. However, environmental conditions in alpine regions are harsh and the chemical characteristics of the groundwater are very complex, so a single method cannot systematically reveal the control factors of groundwater chemistry across a whole watershed in different seasons and areas. Large, complex water chemistry data sets that vary in space and time are difficult to visualize, which makes simultaneous qualitative and quantitative analysis of groundwater control factors in an alpine watershed harder still.
Therefore, a problem that those skilled in the art urgently need to solve is how to describe complex groundwater data, identify the hydrogeochemical processes and sources of each cluster, quantify the control factors for different areas and seasons of the watershed at the same time, and analyze the groundwater chemistry control factors in an alpine watershed both qualitatively and quantitatively.
Disclosure of Invention
In view of the above, the invention provides a machine-learning-based method for analyzing groundwater chemistry control factors in an alpine watershed. SOM is applied to visualize complex high-dimensional data in a low-dimensional space and thereby identify the chemical components of the groundwater and their potential control factors. Positive matrix factorization (PMF) is applied for quantitative source apportionment, which yields non-negative source contributions for each chemical species and for the main factors identified qualitatively. By combining SOM, PMF, correlation analysis and the ion ratio method, the method can describe complex groundwater data, identify the hydrogeochemical processes and sources of each cluster, quantify the control factors for different areas and seasons of the watershed at the same time, and analyze the groundwater chemistry control factors both qualitatively and quantitatively. This is significant for understanding the evolution of groundwater quality, prioritizing the treatment of pollution sources, and scientifically guiding groundwater resource management.
To achieve the above purpose, the invention adopts the following technical solution:
A machine-learning-based method for analyzing groundwater chemistry control factors in an alpine watershed comprises the following steps:
Step (1): acquire water chemistry element data of groundwater samples in the alpine watershed and preprocess the acquired data.
Step (2): construct a SOM self-organizing neural network from the preprocessed water chemistry element data to obtain a water chemistry visualization topological graph.
Step (3): determine the optimal number of clusters on the water chemistry visualization topological graph by combining the Davies-Bouldin index with the K-means algorithm, cluster the preprocessed water chemistry element data, normalize the clustering result, and draw a radar chart and an ion ratio diagram.
Step (4): based on the clustering result, apply positive matrix factorization (PMF) together with correlation analysis and the ion ratio method to analyze the groundwater chemistry control factors in the alpine watershed qualitatively and quantitatively.
Optionally, in step (1), preprocessing the water chemistry element data includes naming and arranging the sample data according to their seasonal and regional attributes.
Optionally, step (1) further includes: when the preprocessed sample data contain abnormal or missing values, removing the abnormal values and filling the missing values by interpolation.
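The outlier removal and interpolation of step (1) can be illustrated with a minimal sketch. The text does not state how abnormal values are detected or which interpolation is used, so the interquartile-range rule, the linear interpolation, and the function names `remove_outliers_iqr` and `interpolate_missing` are all assumptions made for illustration. Missing values are represented as `None` in an ordered series of one element at one site.

```python
from statistics import quantiles


def remove_outliers_iqr(values, k=1.5):
    """Replace values outside [Q1 - k*IQR, Q3 + k*IQR] with None.

    The IQR rule is an assumed detection criterion, not taken from the source.
    """
    present = [v for v in values if v is not None]
    q1, _, q3 = quantiles(present, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v if v is not None and low <= v <= high else None for v in values]


def interpolate_missing(values):
    """Fill None entries linearly between the nearest known neighbors.

    Gaps at either end copy the nearest known value.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no known values to interpolate from")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((j for j in known if j < i), default=None)
        right = min((j for j in known if j > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            t = (i - left) / (right - left)
            filled[i] = values[left] + t * (values[right] - values[left])
    return filled
```

Calling `interpolate_missing(remove_outliers_iqr(series))` applies both operations in the order the step describes: abnormal values are removed first and the resulting gaps are then filled.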
Optionally, in step (2), constructing the SOM self-organizing neural network to obtain the water chemistry visualization topological graph specifically comprises:
inputting the preprocessed water chemistry element data into the SOM self-organizing neural network toolbox in Matlab;
Step B: initialize the weight space with small random values, and set the map size, the initial winner neuron and the initial learning rate.
Step C: find the best matching unit (BMU), that is, the neuron whose weight vector is most similar to the input vector.
Step D: update the weight vectors of the BMU and its neighboring neurons.
Step E: iterate the search process until it converges on the optimal self-organizing map.
When the preset number of iterations is reached or the learning rate approaches 0, training ends and the water chemistry visualization topological graph is output; otherwise, return to step C.
Optionally, in step (2), the water chemistry visualization topological graph obtained from the SOM calculation result is output in the form of a unified distance matrix (U-matrix) and component planes.
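Steps B to E can be sketched in a few lines of dependency-free Python. This is only an illustration of the general SOM training loop, not the Matlab toolbox named in the text: the Gaussian neighborhood, the linear decay schedules, the default parameters and the function names `train_som` and `best_matching_unit` are assumptions.

```python
import math
import random


def best_matching_unit(weights, x):
    """Step C: index of the node whose weight vector is closest to x."""
    return min(
        range(len(weights)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)),
    )


def train_som(data, rows, cols, n_iter=500, lr0=0.5, seed=0):
    """Train a rows x cols SOM on a list of equal-length numeric vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Step B: small random initial weights, one vector per map node.
    weights = [[rng.uniform(0.0, 0.1) for _ in range(dim)]
               for _ in range(rows * cols)]
    sigma0 = max(rows, cols) / 2.0
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # decays linearly toward 0
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighborhood radius
        x = rng.choice(data)
        bmu = best_matching_unit(weights, x)
        b_row, b_col = divmod(bmu, cols)
        # Step D: pull the BMU and its grid neighbors toward the sample.
        for idx, w in enumerate(weights):
            row, col = divmod(idx, cols)
            grid_d2 = (row - b_row) ** 2 + (col - b_col) ** 2
            h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
            for k in range(dim):
                w[k] += lr * h * (x[k] - w[k])
    # Step E: the loop stops after n_iter steps, when lr has decayed to ~0.
    return weights
```

The returned weight vectors are what a U-matrix and the component planes would be computed from: the U-matrix from distances between neighboring nodes, and each component plane from one coordinate of every node.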
Optionally, in step (3), the Davies-Bouldin index (DBI) used to determine the optimal number of clusters is calculated as follows:
$$\mathrm{DBI} = \frac{1}{N}\sum_{i=1}^{N}\max_{j \neq i}\frac{\sigma_i + \sigma_j}{d(c_i, c_j)}$$
where $N$ is the number of clusters; $\sigma_i$ and $\sigma_j$ are the average distances of all patterns in the i-th and j-th clusters to their centroids $c_i$ and $c_j$, respectively; and $d(c_i, c_j)$ is the distance between $c_i$ and $c_j$.
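The index can be computed directly from the definitions of the cluster centroids, the average within-cluster distances and the centroid separations. The sketch below assumes Euclidean distance and at least two clusters with distinct centroids; the function names are illustrative.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of clusters (each a list of points)."""
    cents = [centroid(c) for c in clusters]
    # sigma_i: mean distance of the members of cluster i to its centroid.
    sig = [sum(math.dist(p, ci) for p in c) / len(c)
           for c, ci in zip(clusters, cents)]
    n = len(clusters)
    total = 0.0
    for i in range(n):
        # Worst-case similarity of cluster i to any other cluster.
        total += max((sig[i] + sig[j]) / math.dist(cents[i], cents[j])
                     for j in range(n) if j != i)
    return total / n
```

A lower value indicates compact, well-separated clusters, which is why the minimum of the index over candidate cluster numbers is taken as the optimum.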
Optionally, in step (3), clustering the preprocessed water chemistry element data specifically comprises:
tracking how the Davies-Bouldin index changes with the number of clusters and magnifying the interval around its lowest point, where the minimum Davies-Bouldin index value corresponds to the optimal number of clusters and thus determines clear cluster boundaries; finally, projecting the labels of the preprocessed water chemistry element data onto the neurons to determine the position of each sample within the clusters.
Optionally, in step (3), normalizing the clustering result makes it possible to draw a radar chart of the water chemistry element concentrations in each cluster and an ion ratio diagram.
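One way to prepare the radar chart values is to average each element within each cluster and then rescale every element to a common range, so that elements with very different concentration levels can share one set of axes. The text does not specify the normalization, so the min-max scaling and the function names below are assumptions.

```python
def cluster_means(samples, labels):
    """Mean concentration of each element within each cluster."""
    groups = {}
    for row, lab in zip(samples, labels):
        groups.setdefault(lab, []).append(row)
    return {lab: [sum(col) / len(rows) for col in zip(*rows)]
            for lab, rows in groups.items()}


def min_max_by_element(means):
    """Rescale each element's cluster means to [0, 1] across clusters."""
    labs = sorted(means)
    columns = list(zip(*(means[lab] for lab in labs)))
    out = {lab: [] for lab in labs}
    for col in columns:
        low, span = min(col), max(col) - min(col)
        for lab, value in zip(labs, col):
            # A constant element carries no contrast; map it to 0.
            out[lab].append((value - low) / span if span else 0.0)
    return out
```

Each list in the result gives one cluster's polygon on the radar chart, with one axis per water chemistry element.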
Optionally, in step (4), applying positive matrix factorization (PMF) to the clustering result specifically comprises:
Step a: apply PMF to the preprocessed water chemistry element data to decompose the sample concentration matrix $X$, with entries $X_{ij}$, into a factor contribution matrix $g_{ik}$, a factor distribution matrix $f_{kj}$ and a residual matrix $e_{ij}$:
$$X_{ij} = \sum_{k=1}^{p} g_{ik} f_{kj} + e_{ij}$$
where $X_{ij}$ is the concentration of the j-th water chemistry element in the i-th sample; $p$ is the number of pollution sources (factors); $g_{ik}$ is the contribution of the k-th source to the i-th sample; and $f_{kj}$ is the concentration of the j-th water chemistry element in the k-th source.
Step b: calculate the uncertainty $u_{ij}$ of the j-th water chemistry element in the i-th sample from the element concentration, the method detection limit (MDL) and the measurement uncertainty:
If the concentration is greater than the MDL, $u_{ij}$ is calculated as:
$$u_{ij} = \sqrt{(\text{error fraction} \times X_{ij})^2 + (0.5 \times \mathrm{MDL})^2}$$
where the error fraction is the error coefficient.
If the concentration is less than or equal to the MDL, $u_{ij}$ is calculated as:
$$u_{ij} = \frac{5}{6} \times \mathrm{MDL}$$
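A minimal sketch of the two-branch uncertainty calculation follows. It assumes the convention commonly used with PMF: five sixths of the MDL at or below the detection limit, and a quadrature sum of the proportional error and half the MDL above it. The function name and argument names are illustrative.

```python
import math


def pmf_uncertainty(conc, mdl, error_fraction):
    """Uncertainty of one concentration value (assumed common PMF convention)."""
    if conc <= mdl:
        # At or below the detection limit: fixed fraction of the MDL.
        return 5.0 / 6.0 * mdl
    # Above the detection limit: proportional error combined with 0.5 * MDL.
    return math.sqrt((error_fraction * conc) ** 2 + (0.5 * mdl) ** 2)
```

Applying the function to every entry of the concentration matrix gives the uncertainty matrix that weights the residuals in the objective function of step c.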
Step c: derive the factor contributions and distributions by minimizing the objective function Q:
$$Q = \sum_{i=1}^{n}\sum_{j=1}^{m}\left(\frac{e_{ij}}{u_{ij}}\right)^2$$
where $m$ is the number of water chemistry elements and $n$ is the number of samples.
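The minimization of Q under non-negativity can be illustrated with multiplicative updates for weighted non-negative matrix factorization. This is a stand-in chosen for brevity, not the solver used by dedicated PMF software, and the function names `q_value` and `pmf_fit`, the random initialization and the iteration count are assumptions. Here `x` is the n x m concentration matrix, `u` the matching uncertainty matrix, and `p` the number of factors.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    return [list(r) for r in zip(*a)]


def q_value(x, u, g, f):
    """Objective Q: sum of squared residuals scaled by the uncertainties."""
    gf = matmul(g, f)
    return sum(((x[i][j] - gf[i][j]) / u[i][j]) ** 2
               for i in range(len(x)) for j in range(len(x[0])))


def pmf_fit(x, u, p, n_iter=300, seed=0, eps=1e-12):
    """Fit non-negative G (n x p) and F (p x m) by weighted multiplicative updates."""
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    g = [[rng.uniform(0.1, 1.0) for _ in range(p)] for _ in range(n)]
    f = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(p)]
    w = [[1.0 / u[i][j] ** 2 for j in range(m)] for i in range(n)]
    wx = [[w[i][j] * x[i][j] for j in range(m)] for i in range(n)]
    for _ in range(n_iter):
        # Update G with F fixed.
        gf = matmul(g, f)
        wgf = [[w[i][j] * gf[i][j] for j in range(m)] for i in range(n)]
        ft = transpose(f)
        num, den = matmul(wx, ft), matmul(wgf, ft)
        g = [[g[i][k] * num[i][k] / (den[i][k] + eps) for k in range(p)]
             for i in range(n)]
        # Update F with the new G fixed.
        gf = matmul(g, f)
        wgf = [[w[i][j] * gf[i][j] for j in range(m)] for i in range(n)]
        gt = transpose(g)
        num, den = matmul(gt, wx), matmul(gt, wgf)
        f = [[f[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(p)]
    return g, f
```

Because every update multiplies a non-negative entry by a non-negative ratio, the fitted contributions and distributions stay non-negative, which is the property that makes the factors interpretable as sources.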
Optionally, in step (4), the qualitative and quantitative analysis of the groundwater chemistry control factors in the alpine watershed by combining correlation analysis and the ion ratio method specifically comprises:
obtaining, from PMF, quantitative information on the contribution of each factor and on the distribution of each factor over the water chemistry elements; then analyzing the water chemistry contribution of each factor together with the radar chart based on the SOM classification result, the correlation analysis, and the ion ratio relationships (TDS versus Na⁺/(Na⁺ + Ca²⁺); Mg²⁺/Na⁺ versus Ca²⁺/Na⁺; CAI-I = (Cl⁻ - (Na⁺ + K⁺))/Cl⁻ and CAI-II = (Cl⁻ - (Na⁺ + K⁺))/(SO₄²⁻ + HCO₃⁻ + CO₃²⁻ + NO₃⁻); NO₃⁻/Na⁺ versus Cl⁻/Na⁺), so that each factor can be matched to a groundwater chemistry control factor and its contribution rate determined.
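The two chloro-alkaline indices and the sodium ratio plotted against TDS are simple arithmetic on ion concentrations, sketched below. The function names are illustrative, and the assumption that the chloro-alkaline inputs are expressed in meq/L is not stated in the text.

```python
def chloro_alkaline_indices(cl, na, k, so4, hco3, co3, no3):
    """Return (CAI-I, CAI-II); inputs assumed to be in meq/L."""
    excess = cl - (na + k)
    cai_1 = excess / cl
    cai_2 = excess / (so4 + hco3 + co3 + no3)
    return cai_1, cai_2


def sodium_ratio(na, ca):
    """Na / (Na + Ca), the ratio plotted against TDS."""
    return na / (na + ca)
```

Both indices share the same numerator, so they always have the same sign; the sign is what is usually read as the direction of cation exchange between the groundwater and the aquifer material.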
According to the above technical solution, compared with the prior art, the invention provides a machine-learning-based method for analyzing groundwater chemistry control factors in an alpine watershed. SOM is applied to visualize complex high-dimensional data in a low-dimensional space and thereby identify the chemical components of the groundwater and their potential control factors. Positive matrix factorization (PMF) is applied for quantitative source apportionment, which yields non-negative source contributions for each chemical species and for the main factors identified qualitatively. By combining SOM, PMF, correlation analysis and the ion ratio method, the method can describe complex groundwater data, identify the hydrogeochemical processes and sources of each cluster, quantify the control factors for different areas and seasons of the watershed at the same time, and analyze the groundwater chemistry control factors both qualitatively and quantitatively. This is significant for understanding the evolution of groundwater quality, prioritizing the treatment of pollution sources, and scientifically guiding groundwater resource management.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
FIG. 1 is a schematic flow chart of the present invention.
FIG. 2 is a schematic view of a water chemistry visualization topology of the present invention.
FIG. 3 is a diagram illustrating SOM clustering results according to the present invention.
FIG. 4 is a schematic radar chart of the present invention.
FIG. 5 is a schematic ion ratio diagram of the present invention.
FIG. 6 is a schematic diagram of the contributions of different factors to the groundwater chemistry element data, as determined with the PMF model of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Embodiment 1 of the invention discloses a machine-learning-based method for analyzing groundwater chemistry control factors in an alpine watershed, which comprises the following steps:
Step (1): acquire water chemistry element data of groundwater samples in the alpine watershed and preprocess the acquired data. The preprocessing includes naming and arranging the sample data according to their seasonal and regional attributes, as follows:
each water chemistry sample is given a short name as its label, in which a number denotes the regional position, an English month abbreviation denotes the season, SGW denotes phreatic water and DGW denotes confined water; the data are then arranged with sample names along the vertical axis and water chemistry elements along the horizontal axis; when the preprocessed sample data contain abnormal or missing values, the abnormal values are removed and the missing values are filled by interpolation.
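The naming scheme can be made concrete with a small parser. The text gives the ingredients of a label (water type code, region number, month abbreviation) but not their exact layout, so the format `SGW3-Jul` used here, along with the names `parse_label` and `sort_labels`, is purely hypothetical.

```python
import re

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
WATER_TYPES = {"SGW": "phreatic", "DGW": "confined"}
# Hypothetical layout: <type code><region number>-<month abbreviation>.
_LABEL = re.compile(r"^(SGW|DGW)(\d+)-([A-Z][a-z]{2})$")


def parse_label(label):
    """Split a sample label into water type, region number and month."""
    match = _LABEL.match(label)
    if match is None or match.group(3) not in MONTHS:
        raise ValueError(f"unrecognized sample label: {label!r}")
    code, region, month = match.groups()
    return {"water_type": WATER_TYPES[code], "region": int(region), "month": month}


def sort_labels(labels):
    """Order labels by region, then month, then water type."""
    def key(label):
        info = parse_label(label)
        return (info["region"], MONTHS.index(info["month"]), info["water_type"])
    return sorted(labels, key=key)
```

Sorting the rows this way keeps samples from the same region and season adjacent, which makes the later projection of labels onto the SOM easier to read.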
Step (2): construct a SOM self-organizing neural network from the preprocessed water chemistry element data to obtain a water chemistry visualization topological graph, specifically:
inputting the preprocessed water chemistry element data into the SOM self-organizing neural network toolbox in Matlab;
Step A: determine the optimal number of neurons as a function of n, the number of groundwater samples.
Step B: initialize the weight space with small random values, and set the map size, the initial winner neuron and the initial learning rate.
Step C: find the best matching unit (BMU), that is, the neuron whose weight vector is most similar to the input vector.
Step D: update the weight vectors of the BMU and its neighboring neurons.
Step E: iterate the search process until it converges on the optimal self-organizing map.
When the preset number of iterations is reached or the learning rate approaches 0, training ends and the water chemistry visualization topological graph is output; otherwise, return to step C.
As shown in FIG. 2, the water chemistry visualization topological graph is obtained by outputting the SOM calculation result in the form of a unified distance matrix (U-matrix) and component planes.
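For step A, a rule of thumb often used with SOM toolboxes sets the number of neurons to roughly five times the square root of the number of samples. Whether this embodiment uses exactly that rule is an assumption, as is the choice of a square grid; the helper name `som_grid` is illustrative.

```python
import math


def som_grid(n_samples):
    """Suggest a square map size from the 5 * sqrt(n) heuristic (assumed)."""
    target = 5.0 * math.sqrt(n_samples)   # approximate number of neurons
    side = math.ceil(math.sqrt(target))   # round up to a full square grid
    return side, side
```

For 100 samples the heuristic asks for about 50 neurons, which the helper rounds up to an 8 x 8 map.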
Step (3): determine the optimal number of clusters on the water chemistry visualization topological graph by combining the Davies-Bouldin index with the K-means algorithm, cluster the preprocessed water chemistry element data, normalize the clustering result, and draw a radar chart and an ion ratio diagram that reflect the spatial and seasonal variation of groundwater chemical concentrations.
The Davies-Bouldin index (DBI) used to determine the optimal number of clusters is calculated as follows:
$$\mathrm{DBI} = \frac{1}{N}\sum_{i=1}^{N}\max_{j \neq i}\frac{\sigma_i + \sigma_j}{d(c_i, c_j)}$$
where $N$ is the number of clusters; $\sigma_i$ and $\sigma_j$ are the average distances of all patterns in the i-th and j-th clusters to their centroids $c_i$ and $c_j$, respectively; and $d(c_i, c_j)$ is the distance between $c_i$ and $c_j$.
Clustering the preprocessed water chemistry element data, as shown in FIG. 3, specifically comprises:
tracking how the Davies-Bouldin index changes with the number of clusters and magnifying the interval around its lowest point, where the minimum Davies-Bouldin index value corresponds to the optimal number of clusters and thus determines clear cluster boundaries; finally, projecting the labels of the preprocessed water chemistry element data onto the neurons to determine the position of each sample within the clusters.
Finally, normalizing the clustering result makes it possible to draw a radar chart of the water chemistry element concentrations in each cluster, as shown in FIG. 4, and an ion ratio diagram, as shown in FIG. 5; the spatial and seasonal variation of the water chemistry concentrations in each cluster can be analyzed from these charts.
Step (4): based on the clustering result, apply positive matrix factorization (PMF) together with correlation analysis and the ion ratio method to analyze the groundwater chemistry control factors in the alpine watershed qualitatively and quantitatively.
Applying PMF to the clustering result specifically comprises:
Step a: apply PMF to the preprocessed water chemistry element data to decompose the sample concentration matrix $X$, with entries $X_{ij}$, into a factor contribution matrix $g_{ik}$, a factor distribution matrix $f_{kj}$ and a residual matrix $e_{ij}$:
$$X_{ij} = \sum_{k=1}^{p} g_{ik} f_{kj} + e_{ij}$$
where $X_{ij}$ is the concentration of the j-th water chemistry element in the i-th sample; $p$ is the number of pollution sources (factors); $g_{ik}$ is the contribution of the k-th source to the i-th sample; and $f_{kj}$ is the concentration of the j-th water chemistry element in the k-th source.
Step b: calculate the uncertainty $u_{ij}$ of the j-th water chemistry element in the i-th sample from the element concentration, the method detection limit (MDL) and the measurement uncertainty:
If the concentration is greater than the MDL, $u_{ij}$ is calculated as:
$$u_{ij} = \sqrt{(\text{error fraction} \times X_{ij})^2 + (0.5 \times \mathrm{MDL})^2}$$
where the error fraction is the error coefficient.
If the concentration is less than or equal to the MDL, $u_{ij}$ is calculated as:
$$u_{ij} = \frac{5}{6} \times \mathrm{MDL}$$
Step c: derive the factor contributions and distributions by minimizing the objective function Q:
$$Q = \sum_{i=1}^{n}\sum_{j=1}^{m}\left(\frac{e_{ij}}{u_{ij}}\right)^2$$
where $m$ is the number of water chemistry elements and $n$ is the number of samples.
As shown in FIG. 6, FIG. 6a is the PMF factor fingerprint diagram, FIG. 6b is the PMF factor contribution diagram, and FIG. 6c is the correlation diagram of the PMF factor contributions and the groundwater chemical components. The qualitative and quantitative analysis of the groundwater chemistry control factors in the alpine watershed by combining correlation analysis and the ion ratio method specifically comprises:
obtaining, from PMF, quantitative information on the contribution of each factor and on the distribution of each factor over the water chemistry elements; then analyzing the water chemistry contribution of each factor together with the radar chart based on the SOM classification result, the correlation analysis, and the ion ratio relationships (TDS versus Na⁺/(Na⁺ + Ca²⁺); Mg²⁺/Na⁺ versus Ca²⁺/Na⁺; CAI-I = (Cl⁻ - (Na⁺ + K⁺))/Cl⁻ and CAI-II = (Cl⁻ - (Na⁺ + K⁺))/(SO₄²⁻ + HCO₃⁻ + CO₃²⁻ + NO₃⁻); NO₃⁻/Na⁺ versus Cl⁻/Na⁺), so that each factor can be matched to a groundwater chemistry control factor and its contribution rate determined.
The embodiment of the invention discloses a machine-learning-based method for analyzing groundwater chemistry control factors in an alpine watershed. SOM is applied to visualize complex high-dimensional data in a low-dimensional space and thereby identify the chemical components of the groundwater and their potential control factors. Positive matrix factorization (PMF) is applied for quantitative source apportionment, which yields non-negative source contributions for each chemical species and for the main factors identified qualitatively. By combining SOM, PMF, correlation analysis and the ion ratio method, the method can describe complex groundwater data, identify the hydrogeochemical processes and sources of each cluster, quantify the control factors for different areas and seasons of the watershed at the same time, and analyze the groundwater chemistry control factors both qualitatively and quantitatively. This is significant for understanding the evolution of groundwater quality, prioritizing the treatment of pollution sources, and scientifically guiding groundwater resource management.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A machine-learning-based method for analyzing groundwater chemistry control factors in an alpine watershed, characterized by comprising the following steps:
step (1): acquiring water chemistry element data of groundwater samples in the alpine watershed, and preprocessing the water chemistry element data;
step (2): constructing a SOM self-organizing neural network based on the preprocessed water chemistry element data to obtain a water chemistry visualization topological graph;
step (3): determining the optimal number of clusters on the water chemistry visualization topological graph by combining the Davies-Bouldin index with the K-means algorithm, clustering the preprocessed water chemistry element data, normalizing the clustering result, and drawing a radar chart and an ion ratio diagram;
step (4): based on the clustering result, applying positive matrix factorization (PMF) together with correlation analysis and the ion ratio method to analyze the groundwater chemistry control factors in the alpine watershed qualitatively and quantitatively.
2. The machine-learning-based method for analyzing groundwater chemistry control factors in an alpine-cold watershed according to claim 1, wherein the preprocessing of the water chemistry element data in step (1) comprises: naming and arranging the sample data according to their seasonal and regional attributes on the basis of the water chemistry element data.
3. The machine-learning-based method for analyzing groundwater chemistry control factors in an alpine-cold watershed according to claim 1, wherein step (1) further comprises: when the preprocessed sample data contain abnormal values or missing values, removing the abnormal values and filling in the missing values by interpolation.
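As an illustration of the cleaning described in claim 3, the following is a minimal Python sketch under stated assumptions. The claims do not specify how abnormal values are detected or which interpolation is used; the modified z-score rule (median and median absolute deviation, threshold 3.5), the linear interpolation, and the function names `remove_outliers` and `interpolate_missing` are all hypothetical choices made for this example, and the series is assumed to be ordered (for instance, repeated measurements at one well).

```python
from statistics import median

def remove_outliers(values, threshold=3.5):
    """Replace abnormal values with None using a modified z-score (assumed rule)."""
    present = [v for v in values if v is not None]
    med = median(present)
    mad = median(abs(v - med) for v in present)  # median absolute deviation
    if mad == 0:
        return list(values)  # no spread: nothing can be flagged
    return [v if v is None or 0.6745 * abs(v - med) / mad <= threshold else None
            for v in values]

def interpolate_missing(values):
    """Fill None entries by linear interpolation between the nearest known
    neighbours; gaps at either end take the nearest known value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no known values to interpolate from")
    out = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out[i] = values[right]
        elif right is None:
            out[i] = values[left]
        else:
            t = (i - left) / (right - left)
            out[i] = values[left] + t * (values[right] - values[left])
    return out

# Hypothetical chloride series (mg/L) with one missing and one abnormal value.
raw = [10.0, 11.0, None, 13.0, 12.0, 11.5, 10.5, 12.5, 11.0, 500.0]
cleaned = interpolate_missing(remove_outliers(raw))
```

In this sketch the abnormal value is first turned into a gap and then filled like any other missing value, so a single interpolation routine handles both cases.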
4. The machine-learning-based method for analyzing groundwater chemistry control factors in an alpine-cold watershed according to claim 1, wherein in step (2) the SOM self-organizing neural network is constructed to obtain the water chemistry visualization topological graph, specifically:
step A: inputting the preprocessed water chemistry element data into the SOM self-organizing neural network toolbox in Matlab;
step B: initializing the weight space with small random values, and setting the map size, the initial winner neuron, and the initial learning rate;
step C: finding the best matching unit (BMU), i.e. the neuron whose weight vector is most similar to the input vector;
step D: updating the weight vectors of the BMU and its neighboring neurons;
step E: iterating the search process so that it converges on the optimal self-organizing map;
when the preset number of iterations is reached or the learning rate tends to 0, the procedure ends and the water chemistry visualization topological graph is output; otherwise, the procedure returns to step C.
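The training loop of claim 4 can be sketched in plain Python as follows. This is not the Matlab SOM toolbox named in the claim but a simplified stand-in written for illustration: the Gaussian neighbourhood function, the linear decay of the learning rate and radius, the default parameter values, and the names `train_som` and `find_bmu` are assumptions, and the input is assumed to be already scaled to comparable ranges.

```python
import math
import random

def find_bmu(weights, x):
    """Step C: grid position of the neuron whose weight vector is closest to x."""
    return min(weights, key=lambda node: math.dist(weights[node], x))

def train_som(data, rows, cols, n_iter=1000, lr0=0.5, seed=0):
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = max(rows, cols) / 2.0  # initial neighbourhood radius (assumed)
    # Step B: initialise the weight space with small random values.
    weights = {(r, c): [rng.uniform(0.0, 0.1) for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    # Step E: iterate until the preset iteration count is reached;
    # the learning rate decays linearly toward 0 over the run.
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)
        sigma = max(sigma0 * (1.0 - frac), 0.5)
        x = rng.choice(data)
        bmu = find_bmu(weights, x)
        # Step D: pull the BMU and its grid neighbours toward the sample.
        for (r, c), w in weights.items():
            grid_d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
            h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
            for k in range(dim):
                w[k] += lr * h * (x[k] - w[k])
    return weights

# Two hypothetical, well-separated groups of normalised samples.
data = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
        [1.0, 1.0], [0.9, 1.0], [1.0, 0.9]]
som = train_som(data, rows=3, cols=3)
bmu_low = find_bmu(som, [0.0, 0.0])
bmu_high = find_bmu(som, [1.0, 1.0])
```

After training, samples from the two groups map to different neurons, which is the property the later clustering step relies on.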
5. The machine-learning-based method for analyzing groundwater chemistry control factors in an alpine-cold watershed according to claim 1, wherein in step (2) the water chemistry visualization topological graph obtained from the calculation result of the SOM self-organizing neural network is output in the form of a unified distance matrix (U-matrix) and component planes.
6. The machine-learning-based method for analyzing groundwater chemistry control factors in an alpine-cold watershed according to claim 1, wherein in step (3) the optimal number of clusters is determined from the Davies-Bouldin index (DBI), calculated according to the following formula:
$$\mathrm{DBI} = \frac{1}{N}\sum_{i=1}^{N}\max_{j \neq i}\frac{\sigma_i + \sigma_j}{d(c_i, c_j)}$$
wherein N is the number of clusters; $\sigma_i$ and $\sigma_j$ are the average distances of all patterns in the i-th and j-th clusters to their centroids $c_i$ and $c_j$, respectively; and $d(c_i, c_j)$ is the distance between $c_i$ and $c_j$.
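For readers who want to check the index numerically, here is a small Python sketch of the Davies-Bouldin index as it is conventionally defined (the mean, over clusters, of the worst ratio of summed within-cluster scatter to centroid separation). The function names `centroid` and `davies_bouldin` and the example points are hypothetical; Euclidean distance is assumed.

```python
import math

def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(len(points[0]))]

def davies_bouldin(clusters):
    """clusters: list of at least two clusters, each a non-empty list of vectors."""
    if len(clusters) < 2:
        raise ValueError("the index needs at least two clusters")
    cents = [centroid(pts) for pts in clusters]
    # sigma_i: average distance of the members of cluster i to its centroid
    sigmas = [sum(math.dist(p, c) for p in pts) / len(pts)
              for pts, c in zip(clusters, cents)]
    n = len(clusters)
    total = 0.0
    for i in range(n):
        total += max((sigmas[i] + sigmas[j]) / math.dist(cents[i], cents[j])
                     for j in range(n) if j != i)
    return total / n

# Two compact clusters ten units apart: sigma = 1 each, so DBI = (1 + 1) / 10.
tight = davies_bouldin([[[0.0, 0.0], [0.0, 2.0]],
                        [[10.0, 0.0], [10.0, 2.0]]])
# The same clusters moved closer together give a larger (worse) index.
close = davies_bouldin([[[0.0, 0.0], [0.0, 2.0]],
                        [[4.0, 0.0], [4.0, 2.0]]])
```

A lower value indicates compact, well-separated clusters, which is why the minimum of the index is used to pick the number of clusters.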
7. The machine-learning-based method for analyzing groundwater chemistry control factors in an alpine-cold watershed according to claim 1, wherein in step (3) the preprocessed water chemistry element data are clustered, specifically:
examining how the Davies-Bouldin index changes with the number of clusters and enlarging the interval around its lowest point, wherein the minimum Davies-Bouldin index value corresponds to the optimal number of clusters, so as to determine clear cluster boundaries; finally, projecting the labels of the preprocessed water chemistry element data onto the neurons, and determining the position of the preprocessed water chemistry element data within each cluster.
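The selection of the cluster number in claim 7 can be illustrated with the sketch below, which runs K-means for several candidate values of k and keeps the one with the smallest Davies-Bouldin index. It is an assumption-laden example: the deterministic farthest-point initialisation, the function names `kmeans`, `davies_bouldin`, and `best_k`, and the toy points are hypothetical, and the input points are assumed to be distinct (in practice they would typically be the SOM neuron weight vectors).

```python
import math

def kmeans(points, k, n_iter=100):
    """Plain K-means with a deterministic farthest-point initialisation."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(list(far))
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable: converged
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

def davies_bouldin(points, labels, centers):
    k = len(centers)
    sigmas = []
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        sigmas.append(sum(math.dist(p, centers[j]) for p in members) / len(members)
                      if members else 0.0)
    return sum(max((sigmas[i] + sigmas[j]) / math.dist(centers[i], centers[j])
                   for j in range(k) if j != i)
               for i in range(k)) / k

def best_k(points, candidates):
    """Return the candidate k with the lowest index, plus all scores."""
    scores = {}
    for k in candidates:
        labels, centers = kmeans(points, k)
        scores[k] = davies_bouldin(points, labels, centers)
    return min(scores, key=scores.get), scores

# Three hypothetical groups of prototype vectors.
points = [[0, 0], [0, 1], [1, 0], [1, 1],
          [10, 0], [10, 1], [11, 0], [11, 1],
          [5, 10], [5, 11], [6, 10], [6, 11]]
k_best, scores = best_k(points, range(2, 6))
```

Plotting `scores` against k and zooming in around the minimum corresponds to the inspection of the index curve described in the claim.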
8. The machine-learning-based method for analyzing groundwater chemistry control factors in an alpine-cold watershed according to claim 1, wherein in step (3) the clustering result is normalized so that a radar map and ion ratio diagrams of the water chemistry element concentrations in each cluster can be drawn.
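One way to prepare the per-cluster values for a radar map is sketched below. The claim does not state which normalization is used; this example assumes min-max scaling of the cluster mean concentrations, applied separately to each ion across clusters, and the function names and sample values are hypothetical.

```python
def cluster_mean_profiles(samples, labels):
    """Mean concentration of each ion within each cluster."""
    sums, counts = {}, {}
    for sample, label in zip(samples, labels):
        counts[label] = counts.get(label, 0) + 1
        acc = sums.setdefault(label, {})
        for ion, value in sample.items():
            acc[ion] = acc.get(ion, 0.0) + value
    return {label: {ion: total / counts[label] for ion, total in acc.items()}
            for label, acc in sums.items()}

def min_max_normalize(profiles):
    """Scale each ion to [0, 1] across clusters (assumed normalization)."""
    ions = list(next(iter(profiles.values())).keys())
    out = {label: {} for label in profiles}
    for ion in ions:
        column = [profile[ion] for profile in profiles.values()]
        lo, hi = min(column), max(column)
        for label, profile in profiles.items():
            out[label][ion] = 0.0 if hi == lo else (profile[ion] - lo) / (hi - lo)
    return out

# Hypothetical concentrations (mg/L) and cluster labels.
samples = [{"Na": 10.0, "Ca": 40.0}, {"Na": 20.0, "Ca": 60.0},
           {"Na": 50.0, "Ca": 20.0}, {"Na": 70.0, "Ca": 20.0},
           {"Na": 37.5, "Ca": 35.0}]
labels = [0, 0, 1, 1, 2]
radar_values = min_max_normalize(cluster_mean_profiles(samples, labels))
```

Each entry of `radar_values` is then one polygon on the radar map, with one axis per ion.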
9. The machine-learning-based method for analyzing groundwater chemistry control factors in an alpine-cold watershed according to claim 1, wherein in step (4) the clustering result is fused with positive matrix factorization (PMF), specifically comprising:
step a: applying PMF to the preprocessed water chemistry element data to decompose the sample concentration matrix $X_{ij}$ into a factor contribution matrix $g_{ik}$, a factor distribution matrix $f_{kj}$, and a residual matrix $e_{ij}$, as follows:
$$X_{ij} = \sum_{k=1}^{p} g_{ik} f_{kj} + e_{ij}$$
wherein $X_{ij}$, an entry of the sample concentration matrix X, is the concentration of the j-th water chemistry element in the i-th sample; p is the number of pollution sources (factors); $g_{ik}$ is the contribution of the k-th source to the i-th sample; $f_{kj}$ is the concentration of the j-th water chemistry element in the k-th source;
step b: calculating the uncertainty $u_{ij}$, where $u_{ij}$ is the uncertainty of the j-th water chemistry element in the i-th sample, calculated from the water chemistry element content, the method detection limit (MDL), and the measurement uncertainty:
if the water chemistry element content is greater than the MDL, $u_{ij}$ is calculated as:
$$u_{ij} = \sqrt{(\text{error fraction} \times X_{ij})^2 + (0.5 \times \mathrm{MDL})^2}$$
wherein the error fraction is an error coefficient;
if the water chemistry element content is less than or equal to the MDL, $u_{ij}$ is calculated as:
$$u_{ij} = \frac{5}{6} \times \mathrm{MDL}$$
step c: deriving the factor contributions and distributions by minimizing the objective function Q:
$$Q = \sum_{i=1}^{n}\sum_{j=1}^{m}\left(\frac{e_{ij}}{u_{ij}}\right)^2$$
where m is the number of water chemistry elements and n is the number of samples.
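The quantities in steps b and c of claim 9 can be computed as in the following Python sketch. It assumes the uncertainty convention commonly used with PMF (five sixths of the MDL at or below the detection limit; otherwise the root sum of squares of the error-fraction term and half the MDL) and the usual form of Q as a sum of squared, uncertainty-scaled residuals. The function names and numbers are hypothetical, and the sketch only evaluates Q for given matrices; it does not perform the constrained minimization itself.

```python
import math

def uncertainty(conc, mdl, error_fraction):
    """Uncertainty u_ij of one measurement (assumed PMF convention)."""
    if conc <= mdl:
        return 5.0 / 6.0 * mdl
    return math.sqrt((error_fraction * conc) ** 2 + (0.5 * mdl) ** 2)

def q_value(X, G, F, U):
    """Objective Q: sum over samples i and elements j of (e_ij / u_ij)^2,
    where e_ij = X_ij - sum_k G_ik * F_kj."""
    n, m, p = len(X), len(X[0]), len(F)
    q = 0.0
    for i in range(n):
        for j in range(m):
            modelled = sum(G[i][k] * F[k][j] for k in range(p))
            residual = X[i][j] - modelled
            q += (residual / U[i][j]) ** 2
    return q

# Hypothetical values: one sample, two elements, two factors.
u_below = uncertainty(conc=0.5, mdl=1.0, error_fraction=0.1)
u_above = uncertainty(conc=3.0, mdl=0.8, error_fraction=0.1)
X = [[5.0, 2.0]]
G = [[1.0, 2.0]]                 # factor contributions to the sample
F = [[1.0, 0.0], [1.5, 1.0]]     # factor distributions over the elements
U = [[0.5, 1.0]]
q = q_value(X, G, F, U)
```

In a full PMF run, G and F would be adjusted iteratively, subject to non-negativity, until Q stops decreasing.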
10. The machine-learning-based method for analyzing groundwater chemistry control factors in an alpine-cold watershed according to claim 9, wherein in step (4) the groundwater chemistry control factors in the alpine-cold watershed are qualitatively and quantitatively analyzed in combination with correlation analysis and the ion ratio method, specifically:
obtaining, from the positive matrix factorization (PMF), quantitative information on each factor's contribution and on each factor's distribution over the water chemistry elements; and, based on the SOM classification results, jointly analyzing the water chemistry contribution of each factor shown in the radar map with the correlation analysis and the ion ratio relationships (TDS vs. Na⁺/(Na⁺ + Ca²⁺); Mg²⁺/Na⁺ vs. Ca²⁺/Na⁺; CAI-I = (Cl⁻ - (Na⁺ + K⁺))/Cl⁻ and CAI-II = (Cl⁻ - (Na⁺ + K⁺))/(SO₄²⁻ + HCO₃⁻ + CO₃²⁻ + NO₃⁻); NO₃⁻/Na⁺ vs. Cl⁻/Na⁺), so that each factor can be matched to a groundwater chemistry control factor and its contribution rate reflected.
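The ion ratios listed in claim 10 are simple to compute once concentrations are in consistent units. The sketch below is illustrative only: the function name `ion_ratios`, the dictionary keys, and the sample values are hypothetical, and concentrations are assumed to be in meq/L, which is the customary unit for the chloro-alkaline indices.

```python
def ion_ratios(c):
    """c: dict of ion concentrations in consistent units (meq/L assumed)."""
    na_k = c["Na"] + c["K"]
    return {
        # Ratio plotted against TDS
        "Na/(Na+Ca)": c["Na"] / (c["Na"] + c["Ca"]),
        # End-member mixing diagram axes
        "Mg/Na": c["Mg"] / c["Na"],
        "Ca/Na": c["Ca"] / c["Na"],
        # Chloro-alkaline indices (cation exchange indicators)
        "CAI-I": (c["Cl"] - na_k) / c["Cl"],
        "CAI-II": (c["Cl"] - na_k) / (c["SO4"] + c["HCO3"] + c["CO3"] + c["NO3"]),
        # Ratios plotted against each other for anthropogenic inputs
        "NO3/Na": c["NO3"] / c["Na"],
        "Cl/Na": c["Cl"] / c["Na"],
    }

# Hypothetical sample (meq/L).
sample = {"Na": 2.0, "K": 0.5, "Ca": 2.0, "Mg": 1.0,
          "Cl": 1.0, "SO4": 1.0, "HCO3": 3.0, "CO3": 0.0, "NO3": 0.5}
ratios = ion_ratios(sample)
```

Computing these ratios per sample and colouring the points by SOM cluster gives the ion ratio diagrams that are compared with the PMF factors.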
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210939068.0A CN115293577B (en) | 2022-08-05 | 2022-08-05 | Machine learning-based high-cold-flow-domain groundwater chemical control factor analysis method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115293577A (en) | 2022-11-04 |
CN115293577B (en) | 2023-07-21 |
Family
ID=83828207
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210939068.0A Active CN115293577B (en) | 2022-08-05 | 2022-08-05 | Machine learning-based high-cold-flow-domain groundwater chemical control factor analysis method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115293577B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103942841A (en) * | 2013-08-15 | 2014-07-23 | 中国地质科学院矿产资源研究所 | Mineral resource multivariate information processing method and system based on GIS |
CN106355011A (en) * | 2016-08-30 | 2017-01-25 | 有色金属矿产地质调查中心 | Geochemical data element sequence structure analysis method and device |
CN113706354A (en) * | 2021-09-02 | 2021-11-26 | 浙江索思科技有限公司 | Marine integrated service management system based on big data technology |
CN113780465A (en) * | 2021-09-27 | 2021-12-10 | 中国水利水电科学研究院 | Underground water chemistry seasonal change analysis method based on self-organizing neural network |
CN113887635A (en) * | 2021-10-08 | 2022-01-04 | 河海大学 | Basin similarity classification method and classification device |
Non-Patent Citations (1)
Title |
---|
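张妹; 刘启蒙; 刘凯旋: "Analysis of the hydrochemical characteristics and genesis of groundwater in the Pansan mining area" (潘三矿区地下水化学特征及成因分析), 煤矿开采, no. 02, pages 272 - 274 * |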
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116030900A (en) * | 2023-03-24 | 2023-04-28 | 安徽瑞邦数科科技服务有限公司 | Method, device, equipment and storage medium for controlling component content of chemical product |
CN117524347A (en) * | 2023-11-20 | 2024-02-06 | 中南大学 | First principle prediction method for acid radical anion hydration structure accelerated by machine learning |
CN117524347B (en) * | 2023-11-20 | 2024-04-16 | 中南大学 | First principle prediction method for acid radical anion hydration structure accelerated by machine learning |
Also Published As
Publication number | Publication date |
---|---|
CN115293577B (en) | 2023-07-21 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||