CN115293577A - Method for analyzing groundwater chemistry control factors in alpine cold watersheds based on machine learning

Publication number
CN115293577A
CN115293577A CN202210939068.0A CN202210939068A CN115293577A CN 115293577 A CN115293577 A CN 115293577A CN 202210939068 A CN202210939068 A CN 202210939068A CN 115293577 A CN115293577 A CN 115293577A
Authority
CN
China
Prior art keywords
water chemistry
water
alpine
control factors
analyzing
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210939068.0A
Other languages
Chinese (zh)
Other versions
CN115293577B (en)
Inventor
张海发
王巍
张旭
曾祥云
邵世鹏
王宝强
宋东东
郭舟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Pearl River Water Resources Commission Technical Consulting Guangzhou Co ltd
Original Assignee
Pearl River Water Resources Commission Technical Consulting Guangzhou Co ltd
Application filed by Pearl River Water Resources Commission Technical Consulting Guangzhou Co ltd
Priority to CN202210939068.0A
Publication of CN115293577A
Application granted
Publication of CN115293577B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70 Machine learning, data mining or chemometrics
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A20/00 Water conservation; Efficient water supply; Efficient water use
    • Y02A20/152 Water filtration

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Economics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Artificial Intelligence (AREA)
  • Tourism & Hospitality (AREA)
  • Software Systems (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Educational Administration (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Water Supply & Treatment (AREA)
  • Data Mining & Analysis (AREA)
  • Quality & Reliability (AREA)
  • Public Health (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Chemical & Material Sciences (AREA)
  • Primary Health Care (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computing Systems (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • General Engineering & Computer Science (AREA)
  • Geophysics And Detection Of Objects (AREA)

Abstract

The invention discloses a machine-learning-based method for analyzing groundwater chemistry control factors in alpine cold watersheds, applied to the technical field of groundwater environment management in alpine cold watersheds. The method comprises the following steps: constructing a self-organizing map (SOM) neural network from preprocessed water chemistry element data to obtain a water chemistry visualization topological graph; determining the optimal number of clusters on the water chemistry visualization topological graph by combining the Davies-Bouldin index with the K-means algorithm, clustering the preprocessed water chemistry element data, normalizing the clustering result, and drawing a radar chart and an ion ratio diagram; and, according to the clustering result, applying positive matrix factorization (PMF) together with correlation analysis and the ion ratio method to analyze the groundwater chemistry control factors in the alpine cold watershed qualitatively and quantitatively. By combining SOM, PMF, correlation analysis and the ion ratio method, the method achieves both qualitative and quantitative analysis of groundwater chemistry control factors in alpine cold watersheds.

Description

Method for analyzing groundwater chemistry control factors in alpine cold watersheds based on machine learning
Technical Field
The invention relates to the technical field of groundwater environment management in alpine cold watersheds, and in particular to a machine-learning-based method for analyzing groundwater chemistry control factors in alpine cold watersheds.
Background
Groundwater is a key component of water resources in northwest China, where most drinking water, industrial water, agricultural irrigation and ecological water demand depends on groundwater. Because precipitation is low and evaporation is high, drought and water shortage in the northwest have degraded groundwater quality, which makes the water supply situation in the region even more severe. Research on groundwater chemistry control factors in alpine cold, water-scarce areas helps to understand how groundwater quality evolves in these areas and to guide groundwater resource management scientifically.
The factors controlling groundwater chemistry have been studied extensively by scholars in China and abroad in recent years. Current research focuses mainly on hydrogeological surveys, qualitative source analysis, short-term water chemistry characteristics of local areas, and the like. However, environmental conditions in alpine cold watersheds are harsh and the chemical characteristics of the groundwater are very complex, so a single method cannot systematically reveal the groundwater chemistry control factors of a whole watershed across seasons and sub-regions. Large, complex water chemistry data sets that vary in space and time are difficult to visualize, which makes simultaneous qualitative and quantitative analysis of groundwater control factors in alpine cold watersheds even harder.
Therefore, a problem that those skilled in the art urgently need to solve is how to describe complex groundwater data, identify the hydrogeochemical processes and sources of each cluster, quantify the control factors for different parts of the watershed and different seasons at the same time, and thereby analyze the groundwater chemistry control factors in alpine cold watersheds both qualitatively and quantitatively.
Disclosure of Invention
In view of the above, the invention provides a machine-learning-based method for analyzing groundwater chemistry control factors in alpine cold watersheds. The method applies a SOM to project complex high-dimensional data onto a low-dimensional space, so that the chemical components of the groundwater and their potential control factors can be identified. It applies positive matrix factorization (PMF) for quantitative source apportionment, which yields non-negative source contributions for each chemical constituent and for the main factors identified qualitatively. By combining SOM, PMF, correlation analysis and the ion ratio method, the method can describe complex groundwater data, identify the hydrogeochemical processes and sources of each cluster, and quantify the control factors for different parts of the watershed and different seasons at the same time. It thereby achieves qualitative and quantitative analysis of groundwater chemistry control factors in alpine cold watersheds, which is important for understanding the evolution of groundwater quality, prioritizing the treatment of pollution sources, and guiding groundwater resource management scientifically.
In order to achieve this purpose, the invention adopts the following technical scheme:
The method for analyzing groundwater chemistry control factors in alpine cold watersheds based on machine learning comprises the following steps:
Step (1): acquire water chemistry element data of groundwater samples from the alpine cold watershed and preprocess the acquired data.
Step (2): construct a SOM self-organizing neural network from the preprocessed water chemistry element data to obtain a water chemistry visualization topological graph.
Step (3): determine the optimal number of clusters on the water chemistry visualization topological graph by combining the Davies-Bouldin index with the K-means algorithm, cluster the preprocessed water chemistry element data, normalize the clustering result, and draw a radar chart and an ion ratio diagram.
Step (4): according to the clustering result, apply positive matrix factorization (PMF) together with correlation analysis and the ion ratio method to analyze the groundwater chemistry control factors in the alpine cold watershed qualitatively and quantitatively.
Optionally, in step (1), preprocessing the water chemistry element data comprises: naming and arranging the sample data according to their seasonal and regional attributes.
Optionally, step (1) further comprises: when the preprocessed sample data contain abnormal values or missing values, removing the abnormal values and filling the missing values by interpolation.
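The outlier removal and interpolation in step (1) could look like the following minimal Python sketch. The source does not say which outlier criterion or interpolation scheme is used, so the z-score threshold `z_max` and the linear interpolation along an ordered series are assumptions made only for illustration.

```python
from statistics import mean, stdev

def remove_outliers(values, z_max=3.0):
    """Replace values more than z_max sample standard deviations from the mean with None.

    The z-score rule is an assumed criterion; the source only says that
    abnormal values are removed.
    """
    present = [v for v in values if v is not None]
    if len(present) < 3:
        return list(values)
    mu, sd = mean(present), stdev(present)
    if sd == 0:
        return list(values)
    return [None if v is not None and abs(v - mu) > z_max * sd else v
            for v in values]

def interpolate_missing(values):
    """Fill None gaps by linear interpolation; gaps at the edges copy the nearest value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no observed values to interpolate from")
    out = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out[i] = values[right]
        elif right is None:
            out[i] = values[left]
        else:
            w = (i - left) / (right - left)
            out[i] = values[left] + w * (values[right] - values[left])
    return out
```

Running `remove_outliers` first and `interpolate_missing` second on each parameter column treats a removed outlier like any other missing value.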
Optionally, in step (2), constructing the SOM self-organizing neural network to obtain the water chemistry visualization topological graph specifically comprises:
inputting the preprocessed water chemistry element data into the SOM self-organizing neural network toolbox in Matlab;
Step A: determine the optimal number of neurons from the number of groundwater samples n; the formula is given in equation image BDA0003784799520000031.
Step B: initialize the weight space with small random values, and set the map size, the initial winning neuron and the initial learning rate.
Step C: find the best matching unit (BMU), that is, the neuron whose weight vector is most similar to the input vector.
Step D: update the weight vectors of the BMU and its neighboring neurons.
Step E: iterate the search process until it converges on the optimal self-organizing map.
When the preset number of iterations is reached or the learning rate tends to 0, training ends and the water chemistry visualization topological graph is output; otherwise, return to Step C.
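Steps B to E can be sketched in plain Python as follows. This is an illustration, not the Matlab SOM toolbox named above: the linear decay schedules, the Gaussian neighborhood and the rectangular grid are assumptions. `suggested_neurons` uses the widely cited 5·√n rule of thumb as a stand-in for Step A, since the source's own formula image is not reproduced here.

```python
import math
import random

def suggested_neurons(n_samples):
    """Rule-of-thumb map size (assumption: about 5 * sqrt(n) neurons)."""
    return round(5 * math.sqrt(n_samples))

def bmu_index(weights, x):
    """Step C: index of the neuron whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda j: math.dist(weights[j], x))

def train_som(data, rows, cols, n_iter=500, lr0=0.5, seed=0):
    """Train a rows x cols SOM; returns one weight vector per neuron (row-major)."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Step B: small random initial weights
    weights = [[rng.uniform(0.0, 0.1) for _ in range(dim)]
               for _ in range(rows * cols)]
    sigma0 = max(rows, cols) / 2.0
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # learning rate decays towards 0
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighborhood radius shrinks
        x = rng.choice(data)
        br, bc = divmod(bmu_index(weights, x), cols)
        # Step D: pull the BMU and its grid neighbors towards the sample
        for j, w in enumerate(weights):
            r, c = divmod(j, cols)
            h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * sigma ** 2))
            for d in range(dim):
                w[d] += lr * h * (x[d] - w[d])
    # Step E: the loop stops after the preset number of iterations
    return weights
```

Input columns would normally be scaled to a comparable range before training so that no single ion dominates the distance computation.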
Optionally, in step (2), the water chemistry visualization topological graph obtained from the SOM self-organizing neural network is output in the form of a unified distance matrix (U-matrix) and component planes.
Optionally, in step (3), the Davies-Bouldin index (DBI) used to determine the optimal number of clusters is calculated as follows:
$$\mathrm{DBI} = \frac{1}{N}\sum_{i=1}^{N}\max_{j \neq i}\frac{\sigma_i + \sigma_j}{d(c_i, c_j)}$$
where N is the number of clusters; $\sigma_i$ and $\sigma_j$ are the average distances from all patterns in the ith and jth clusters to their centroids $c_i$ and $c_j$, respectively; and $d(c_i, c_j)$ is the distance between $c_i$ and $c_j$.
Optionally, in step (3), clustering the preprocessed water chemistry element data specifically comprises:
plotting the Davies-Bouldin index against the number of clusters and enlarging the interval around its lowest point; the minimum Davies-Bouldin index value corresponds to the optimal number of clusters, which fixes clear cluster boundaries. Finally, the labels of the preprocessed water chemistry element data are projected onto the neurons to determine which cluster each sample falls in.
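The selection of the cluster number can be sketched as below: run K-means for several candidate values of k, compute the Davies-Bouldin index for each partition, and keep the k with the smallest index. In the method described here the `points` would be the SOM neuron weight vectors. The random initialization and fixed iteration count are assumptions, and the sketch does not guard against coincident centroids.

```python
import math
import random

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def davies_bouldin(points, labels, centroids):
    """Mean over clusters of the worst (sigma_i + sigma_j) / d(c_i, c_j)."""
    k = len(centroids)
    sigma = []
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        sigma.append(sum(math.dist(p, centroids[c]) for p in members) / len(members)
                     if members else 0.0)
    total = 0.0
    for i in range(k):
        total += max((sigma[i] + sigma[j]) / math.dist(centroids[i], centroids[j])
                     for j in range(k) if j != i)
    return total / k

def best_k(points, k_values, seed=0):
    """Return the k with the smallest Davies-Bouldin index and all scores."""
    scores = {}
    for k in k_values:
        labels, centroids = kmeans(points, k, seed=seed)
        scores[k] = davies_bouldin(points, labels, centroids)
    return min(scores, key=scores.get), scores
```

Because K-means depends on its starting centroids, repeating the run with several seeds and keeping the lowest index per k is a common precaution.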
Optionally, in step (3), the clustering result is normalized so that a radar chart of the water chemistry element concentrations in each cluster and an ion ratio diagram can be drawn.
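As a rough illustration of the normalization that precedes the radar chart, the sketch below averages each parameter within a cluster and rescales every parameter to the range 0 to 1 across clusters. The source does not name the normalization, so min-max scaling is an assumption, and the function names are hypothetical.

```python
def cluster_means(samples, labels):
    """Mean concentration of every parameter within each cluster."""
    groups = {}
    for row, lab in zip(samples, labels):
        groups.setdefault(lab, []).append(row)
    return {lab: [sum(col) / len(rows) for col in zip(*rows)]
            for lab, rows in groups.items()}

def min_max_by_parameter(means):
    """Rescale each parameter to [0, 1] across clusters so radar axes are comparable."""
    labs = sorted(means)
    n_params = len(means[labs[0]])
    scaled = {lab: [] for lab in labs}
    for j in range(n_params):
        col = [means[lab][j] for lab in labs]
        lo, hi = min(col), max(col)
        for lab in labs:
            scaled[lab].append(0.0 if hi == lo else (means[lab][j] - lo) / (hi - lo))
    return scaled
```

Each list in the result gives the radial values of one cluster's polygon, one value per water chemistry parameter.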
Optionally, in step (4), applying positive matrix factorization (PMF) according to the clustering result specifically comprises:
Step a: apply PMF to the preprocessed water chemistry element data, decomposing the sample concentration matrix X (entries $x_{ij}$) into a factor contribution matrix ($g_{ik}$), a factor profile matrix ($f_{kj}$) and a residual matrix ($e_{ij}$), as follows:
$$x_{ij} = \sum_{k=1}^{p} g_{ik} f_{kj} + e_{ij}$$
where $x_{ij}$, an entry of the sample concentration matrix X, is the concentration of the jth water chemistry element in the ith sample; p is the number of pollution sources (factors); $g_{ik}$ is the contribution of the kth source to the ith sample; and $f_{kj}$ is the concentration of the jth water chemistry element in the kth source.
Step b: calculate the uncertainty $u_{ij}$, the uncertainty of the jth water chemistry element in the ith sample, from the water chemistry element content, the method detection limit (MDL) and the measurement uncertainty:
if the water chemistry element content is greater than the MDL, $u_{ij}$ is calculated as:
$$u_{ij} = \sqrt{(\text{error fraction} \times x_{ij})^2 + (0.5 \times \mathrm{MDL})^2}$$
where the error fraction is an error coefficient.
If the water chemistry element content is less than or equal to the MDL, $u_{ij}$ is calculated as:
$$u_{ij} = \frac{5}{6} \times \mathrm{MDL}$$
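The two uncertainty cases can be written as one small function. The sketch assumes the convention documented for the EPA PMF software (five sixths of the MDL at or below the detection limit; otherwise the root sum of squares of the error-fraction term and half the MDL), because the source's formula images are not available as text. The function name is hypothetical.

```python
import math

def pmf_uncertainty(conc, mdl, error_fraction):
    """Uncertainty u_ij for one measurement (EPA PMF convention, assumed here).

    conc           -- measured concentration x_ij
    mdl            -- method detection limit for this parameter
    error_fraction -- relative measurement error, e.g. 0.1 for 10 %
    """
    if conc <= mdl:
        return 5.0 / 6.0 * mdl
    return math.sqrt((error_fraction * conc) ** 2 + (0.5 * mdl) ** 2)
```

For example, a concentration of 30 with an MDL of 8 and an error fraction of 0.1 gives sqrt(3² + 4²) = 5.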
Step c: derive the factor contributions and profiles by minimizing the objective function Q:
$$Q = \sum_{i=1}^{n} \sum_{j=1}^{m} \left( \frac{e_{ij}}{u_{ij}} \right)^2$$
where m is the number of water chemistry elements and n is the number of samples.
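To make the minimization concrete, the sketch below evaluates Q and reduces it with uncertainty-weighted multiplicative updates, a weighted non-negative matrix factorization that keeps every entry of G and F non-negative. This is a swapped-in illustrative solver, not the algorithm used by dedicated PMF software; the random initialization, iteration count and `eps` guard are assumptions.

```python
import random

def q_value(X, G, F, U):
    """Objective Q: sum over samples i and elements j of (e_ij / u_ij)^2."""
    p = len(F)
    q = 0.0
    for i, row in enumerate(X):
        for j, x in enumerate(row):
            fit = sum(G[i][k] * F[k][j] for k in range(p))
            q += ((x - fit) / U[i][j]) ** 2
    return q

def _fit(G, F, n, m, p):
    return [[sum(G[i][k] * F[k][j] for k in range(p)) for j in range(m)]
            for i in range(n)]

def weighted_nmf(X, U, p, n_iter=200, seed=0, eps=1e-12):
    """Factor X (n x m) into G (n x p) and F (p x m), weighting residuals by 1/u^2."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    W = [[1.0 / U[i][j] ** 2 for j in range(m)] for i in range(n)]
    G = [[rng.uniform(0.1, 1.0) for _ in range(p)] for _ in range(n)]
    F = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(p)]
    for _ in range(n_iter):
        fit = _fit(G, F, n, m, p)
        for i in range(n):          # update factor contributions g_ik
            for k in range(p):
                num = sum(W[i][j] * X[i][j] * F[k][j] for j in range(m))
                den = sum(W[i][j] * fit[i][j] * F[k][j] for j in range(m))
                G[i][k] *= num / (den + eps)
        fit = _fit(G, F, n, m, p)
        for k in range(p):          # update factor profiles f_kj
            for j in range(m):
                num = sum(W[i][j] * X[i][j] * G[i][k] for i in range(n))
                den = sum(W[i][j] * fit[i][j] * G[i][k] for i in range(n))
                F[k][j] *= num / (den + eps)
    return G, F
```

In practice the number of factors p is varied and the solution is chosen by inspecting how Q and the interpretability of the factor profiles change.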
Optionally, in step (4), the qualitative and quantitative analysis of the groundwater chemistry control factors in the alpine cold watershed by combining correlation analysis and the ion ratio method specifically comprises:
obtaining from PMF the quantitative contribution of each factor and the quantitative profile of each factor over the water chemistry elements. The water chemistry contribution of each factor is then analyzed together with the radar chart based on the SOM classification results, the correlation analysis, and the ion ratio relationships (TDS vs. Na⁺/(Na⁺+Ca²⁺); Mg²⁺/Na⁺ vs. Ca²⁺/Na⁺; CAI-I (= (Cl⁻ − (Na⁺+K⁺))/Cl⁻) and CAI-II (= (Cl⁻ − (Na⁺+K⁺))/(SO₄²⁻+HCO₃⁻+CO₃²⁻+NO₃⁻)); NO₃⁻/Na⁺ vs. Cl⁻/Na⁺), so that each factor can be matched to a groundwater chemistry control factor and its contribution rate determined.
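The ion ratios listed above are simple arithmetic on the measured concentrations. A minimal sketch follows; the function and argument names are hypothetical, and the chloro-alkaline indices assume concentrations expressed in meq/L, which the source does not state.

```python
def chloro_alkaline_indices(cl, na, k, so4, hco3, co3, no3):
    """Return (CAI-I, CAI-II) from ion concentrations (assumed to be in meq/L)."""
    exchange = cl - (na + k)
    cai_1 = exchange / cl
    cai_2 = exchange / (so4 + hco3 + co3 + no3)
    return cai_1, cai_2

def sodium_calcium_ratio(na, ca):
    """Na/(Na + Ca), the cation ratio plotted against TDS."""
    return na / (na + ca)

def ratio_pair(numerator_a, numerator_b, na):
    """Two Na-normalized ratios, e.g. (Mg/Na, Ca/Na) or (NO3/Na, Cl/Na)."""
    return numerator_a / na, numerator_b / na
```

For instance, with Cl⁻ = 2, Na⁺ = 1, K⁺ = 0.5 and an anion sum of 5, CAI-I is 0.25 and CAI-II is 0.1.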
As the above technical scheme shows, compared with the prior art, the invention provides a machine-learning-based method for analyzing groundwater chemistry control factors in alpine cold watersheds. The method applies a SOM to project complex high-dimensional data onto a low-dimensional space, so that the chemical components of the groundwater and their potential control factors can be identified. It applies positive matrix factorization (PMF) for quantitative source apportionment, which yields non-negative source contributions for each chemical constituent and for the main factors identified qualitatively. By combining SOM, PMF, correlation analysis and the ion ratio method, the method can describe complex groundwater data, identify the hydrogeochemical processes and sources of each cluster, and quantify the control factors for different parts of the watershed and different seasons at the same time. It thereby achieves qualitative and quantitative analysis of groundwater chemistry control factors in alpine cold watersheds, which is important for understanding the evolution of groundwater quality, prioritizing the treatment of pollution sources, and guiding groundwater resource management scientifically.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
FIG. 1 is a schematic flow chart of the present invention.
FIG. 2 is a schematic diagram of the water chemistry visualization topological graph of the present invention.
FIG. 3 is a schematic diagram of the SOM clustering results of the present invention.
FIG. 4 is a schematic diagram of the radar chart of the present invention.
FIG. 5 is a schematic diagram of the ion ratio diagram of the present invention.
FIG. 6 is a schematic diagram of the contributions of different factors to the groundwater chemistry element data determined with the PMF model of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Embodiment 1 of the invention discloses a machine-learning-based method for analyzing groundwater chemistry control factors in alpine cold watersheds, which comprises the following steps:
Step (1): acquire water chemistry element data of groundwater samples from the alpine cold watershed and preprocess the acquired data, wherein the preprocessing comprises naming and arranging the sample data according to their seasonal and regional attributes, as follows:
Each sample is given a short name that serves as its label: the number denotes the regional position, the English month abbreviation denotes the season, SGW denotes phreatic water, and DGW denotes confined water. The data are then arranged with sample names along the ordinate and water chemistry elements along the abscissa. When the preprocessed sample data contain abnormal values or missing values, the abnormal values are removed and the missing values are filled by interpolation.
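The labelling convention can be illustrated with a small helper. The exact label layout is not given in the source, so the format used here (aquifer code, site number, hyphen, month abbreviation, for example `SGW12-Jul`) is purely hypothetical.

```python
from typing import NamedTuple

class SampleLabel(NamedTuple):
    aquifer: str  # "SGW" (phreatic water) or "DGW" (confined water)
    site: int     # number denoting the regional position
    month: str    # English month abbreviation denoting the season

def make_label(aquifer, site, month):
    """Build a label such as 'SGW12-Jul' (hypothetical layout)."""
    if aquifer not in ("SGW", "DGW"):
        raise ValueError("aquifer must be 'SGW' or 'DGW'")
    return f"{aquifer}{site}-{month}"

def parse_label(label):
    """Split a label produced by make_label back into its parts."""
    head, month = label.split("-")
    return SampleLabel(head[:3], int(head[3:]), month)
```

Keeping the attributes recoverable from the label makes it straightforward to read season and location off the SOM map once the labels are projected onto the neurons.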
Step (2): construct a SOM self-organizing neural network from the preprocessed water chemistry element data to obtain a water chemistry visualization topological graph, which specifically comprises:
inputting the preprocessed water chemistry element data into the SOM self-organizing neural network toolbox in Matlab;
Step A: determine the optimal number of neurons from the number of groundwater samples n; the formula is given in equation image BDA0003784799520000061.
Step B: initialize the weight space with small random values, and set the map size, the initial winning neuron and the initial learning rate.
Step C: find the best matching unit (BMU), that is, the neuron whose weight vector is most similar to the input vector.
Step D: update the weight vectors of the BMU and its neighboring neurons.
Step E: iterate the search process until it converges on the optimal self-organizing map.
When the preset number of iterations is reached or the learning rate tends to 0, training ends and the water chemistry visualization topological graph is output; otherwise, return to Step C.
As shown in FIG. 2, the water chemistry visualization topological graph is obtained by outputting the result of the SOM self-organizing neural network in the form of a unified distance matrix (U-matrix) and component planes.
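Both outputs can be derived directly from the trained weight vectors. The sketch below assumes a rectangular grid stored in row-major order and uses the mean distance to the four direct neighbors as the U-matrix value; these are simplifying assumptions, and SOM toolboxes often use hexagonal grids instead.

```python
import math

def u_matrix(weights, rows, cols):
    """Mean distance from each neuron's weight vector to its 4-connected neighbors."""
    umat = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    dists.append(math.dist(weights[r * cols + c],
                                           weights[rr * cols + cc]))
            umat[r][c] = sum(dists) / len(dists) if dists else 0.0
    return umat

def component_plane(weights, rows, cols, j):
    """Value of input variable j on every neuron, laid out on the map grid."""
    return [[weights[r * cols + c][j] for c in range(cols)] for r in range(rows)]
```

High U-matrix values mark boundaries between groups of similar neurons, while each component plane shows how one water chemistry parameter varies across the map.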
Step (3): determine the optimal number of clusters on the water chemistry visualization topological graph by combining the Davies-Bouldin index with the K-means algorithm, cluster the preprocessed water chemistry element data, normalize the clustering result, and draw a radar chart and an ion ratio diagram that reflect the spatial and seasonal changes in groundwater chemical concentrations.
The Davies-Bouldin index (DBI) used to determine the optimal number of clusters is calculated as follows:
$$\mathrm{DBI} = \frac{1}{N}\sum_{i=1}^{N}\max_{j \neq i}\frac{\sigma_i + \sigma_j}{d(c_i, c_j)}$$
where N is the number of clusters; $\sigma_i$ and $\sigma_j$ are the average distances from all patterns in the ith and jth clusters to their centroids $c_i$ and $c_j$, respectively; and $d(c_i, c_j)$ is the distance between $c_i$ and $c_j$.
Clustering the preprocessed water chemistry element data, as shown in FIG. 3, specifically comprises:
plotting the Davies-Bouldin index against the number of clusters and enlarging the interval around its lowest point; the minimum Davies-Bouldin index value corresponds to the optimal number of clusters, which fixes clear cluster boundaries. Finally, the labels of the preprocessed water chemistry element data are projected onto the neurons to determine which cluster each sample falls in.
Finally, the clustering result is normalized so that a radar chart of the water chemistry element concentrations in each cluster (FIG. 4) and an ion ratio diagram (FIG. 5) can be drawn, from which the spatial and seasonal variation of the water chemistry concentrations in each cluster can be analyzed.
Step (4): according to the clustering result, apply positive matrix factorization (PMF) together with correlation analysis and the ion ratio method to analyze the groundwater chemistry control factors in the alpine cold watershed qualitatively and quantitatively.
Applying PMF according to the clustering result specifically comprises:
Step a: apply PMF to the preprocessed water chemistry element data, decomposing the sample concentration matrix X (entries $x_{ij}$) into a factor contribution matrix ($g_{ik}$), a factor profile matrix ($f_{kj}$) and a residual matrix ($e_{ij}$), as follows:
$$x_{ij} = \sum_{k=1}^{p} g_{ik} f_{kj} + e_{ij}$$
where $x_{ij}$, an entry of the sample concentration matrix X, is the concentration of the jth water chemistry element in the ith sample; p is the number of pollution sources (factors); $g_{ik}$ is the contribution of the kth source to the ith sample; and $f_{kj}$ is the concentration of the jth water chemistry element in the kth source.
Step b: calculate the uncertainty $u_{ij}$, the uncertainty of the jth water chemistry element in the ith sample, from the water chemistry element content, the method detection limit (MDL) and the measurement uncertainty:
if the water chemistry element content is greater than the MDL, $u_{ij}$ is calculated as:
$$u_{ij} = \sqrt{(\text{error fraction} \times x_{ij})^2 + (0.5 \times \mathrm{MDL})^2}$$
where the error fraction is an error coefficient.
If the water chemistry element content is less than or equal to the MDL, $u_{ij}$ is calculated as:
$$u_{ij} = \frac{5}{6} \times \mathrm{MDL}$$
Step c: derive the factor contributions and profiles by minimizing the objective function Q:
$$Q = \sum_{i=1}^{n} \sum_{j=1}^{m} \left( \frac{e_{ij}}{u_{ij}} \right)^2$$
where m is the number of water chemistry elements and n is the number of samples.
As shown in FIG. 6, FIG. 6a is the PMF factor fingerprint, FIG. 6b is the PMF factor contribution diagram, and FIG. 6c shows the correlation between the PMF factor contributions and the groundwater chemical components. Combining correlation analysis and the ion ratio method to analyze the groundwater chemistry control factors in the alpine cold watershed qualitatively and quantitatively specifically comprises:
obtaining from PMF the quantitative contribution of each factor and the quantitative profile of each factor over the water chemistry elements. The water chemistry contribution of each factor is then analyzed together with the radar chart based on the SOM classification results, the correlation analysis, and the ion ratio relationships (TDS vs. Na⁺/(Na⁺+Ca²⁺); Mg²⁺/Na⁺ vs. Ca²⁺/Na⁺; CAI-I (= (Cl⁻ − (Na⁺+K⁺))/Cl⁻) and CAI-II (= (Cl⁻ − (Na⁺+K⁺))/(SO₄²⁻+HCO₃⁻+CO₃²⁻+NO₃⁻)); NO₃⁻/Na⁺ vs. Cl⁻/Na⁺), so that each factor can be matched to a groundwater chemistry control factor and its contribution rate determined.
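The correlation between factor contributions and measured components, as in FIG. 6c, can be computed as sketched below. The source does not specify the correlation coefficient, so Pearson's r is an assumption, and `factor_correlations` is a hypothetical helper name.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def factor_correlations(G, X):
    """Correlate every factor contribution column of G with every parameter column of X.

    G is samples x factors, X is samples x parameters; the result is a
    factors x parameters table of correlation coefficients.
    """
    g_cols = list(zip(*G))
    x_cols = list(zip(*X))
    return [[pearson(g, x) for x in x_cols] for g in g_cols]
```

A factor whose contribution correlates strongly with, say, nitrate and chloride would then be examined against the corresponding ion ratio plots before being attributed to a control factor.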
The embodiment of the invention discloses a machine-learning-based method for analyzing groundwater chemistry control factors in alpine cold watersheds. The method applies a SOM to project complex high-dimensional data onto a low-dimensional space, so that the chemical components of the groundwater and their potential control factors can be identified. It applies positive matrix factorization (PMF) for quantitative source apportionment, which yields non-negative source contributions for each chemical constituent and for the main factors identified qualitatively. By combining SOM, PMF, correlation analysis and the ion ratio method, the method can describe complex groundwater data, identify the hydrogeochemical processes and sources of each cluster, and quantify the control factors for different parts of the watershed and different seasons at the same time. It thereby achieves qualitative and quantitative analysis of groundwater chemistry control factors in alpine cold watersheds, which is important for understanding the evolution of groundwater quality, prioritizing the treatment of pollution sources, and guiding groundwater resource management scientifically.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method for analyzing groundwater chemistry control factors in alpine cold watersheds based on machine learning, characterized by comprising the following steps:
step (1): acquiring water chemistry element data of groundwater samples from the alpine cold watershed, and preprocessing the water chemistry element data;
step (2): constructing a SOM self-organizing neural network based on the preprocessed water chemistry element data to obtain a water chemistry visualization topological graph;
step (3): determining the optimal number of clusters on the water chemistry visualization topological graph by combining the Davies-Bouldin index with the K-means algorithm, clustering the preprocessed water chemistry element data, normalizing the clustering result, and drawing a radar chart and an ion ratio diagram;
step (4): according to the clustering result, applying positive matrix factorization (PMF) together with correlation analysis and the ion ratio method to analyze the groundwater chemistry control factors in the alpine cold watershed qualitatively and quantitatively.
2. The method for analyzing groundwater chemistry control factors in alpine cold watersheds based on machine learning according to claim 1, wherein preprocessing the water chemistry element data in step (1) comprises: naming and arranging the sample data according to their seasonal and regional attributes.
3. The method for analyzing groundwater chemistry control factors in alpine cold watersheds based on machine learning according to claim 1, wherein step (1) further comprises: when the preprocessed sample data contain abnormal values or missing values, removing the abnormal values and filling the missing values by interpolation.
4. The machine learning-based method for analyzing groundwater chemistry control factors in an alpine-cold watershed according to claim 1, wherein in step (2), constructing the SOM neural network to obtain the water chemistry visualization topological graph specifically comprises:
inputting the preprocessed water chemistry element data into the SOM neural network toolbox in Matlab;
step A: determining the optimal number of neurons M:
$$M = 5\sqrt{n}$$
where n is the number of groundwater samples;
step B: initializing the weight space with small random values, and setting the map size, the initial winner neuron, and the initial learning rate;
step C: finding the best matching unit (BMU), i.e., the neuron whose weight vector is most similar to the input vector;
step D: updating the weight vectors of the BMU and its neighboring neurons;
step E: iterating the search process until it converges on the optimal self-organizing map;
when the preset number of iterations is reached or the learning rate tends to 0, outputting the water chemistry visualization topological graph and ending; otherwise, returning to step C.
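Steps A to E can be sketched in plain Python as follows. The claim itself relies on the SOM toolbox in Matlab, so this is only an illustrative stand-in, not that toolbox. The Gaussian neighborhood, the linear decay schedules, the 5·sqrt(n) neuron-count heuristic, and the names `train_som`, `find_bmu`, and `suggested_neurons` are assumptions.

```python
import math
import random

def suggested_neurons(n_samples):
    """Step A: commonly used heuristic of about 5*sqrt(n) neurons (assumed)."""
    return round(5 * math.sqrt(n_samples))

def find_bmu(weights, x):
    """Step C: grid position of the neuron whose weight vector is closest to x."""
    return min(weights,
               key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)))

def train_som(data, rows, cols, n_iter=500, lr0=0.5, seed=0):
    rng = random.Random(seed)
    dim = len(data[0])
    # Step B: initialize the weight space with small random values.
    weights = {(r, c): [rng.uniform(0.0, 0.1) for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    sigma0 = max(rows, cols) / 2.0
    for t in range(n_iter):                    # Step E: iterate
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                # learning rate decays toward 0
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighborhood radius
        x = rng.choice(data)
        bmu = find_bmu(weights, x)
        # Step D: pull the BMU and its grid neighbors toward the input vector.
        for node, w in weights.items():
            grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
            h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
            for k in range(dim):
                w[k] += lr * h * (x[k] - w[k])
    return weights
```

The returned dictionary maps each grid position to its trained weight vector. Those prototype vectors are what a component-plane or distance-matrix plot would be drawn from.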
5. The machine learning-based method for analyzing groundwater chemistry control factors in an alpine-cold watershed according to claim 1, wherein in step (2), the water chemistry visualization topological graph obtained from the calculation result of the SOM neural network is output in the form of a unified distance matrix (U-matrix) and component planes.
6. The machine learning-based method for analyzing groundwater chemistry control factors in an alpine-cold watershed according to claim 1, wherein in step (3), the Davies-Bouldin index (DBI) used to determine the optimal clustering number is calculated according to the following formula:
$$\mathrm{DBI} = \frac{1}{N}\sum_{i=1}^{N}\max_{j \neq i}\left(\frac{\sigma_i + \sigma_j}{d(c_i, c_j)}\right)$$
where N is the number of clusters; $\sigma_i$ and $\sigma_j$ are the average distances from all patterns in the i-th and j-th clusters to their centroids $c_i$ and $c_j$, respectively; and $d(c_i, c_j)$ is the distance between $c_i$ and $c_j$.
7. The machine learning-based method for analyzing groundwater chemistry control factors in an alpine-cold watershed according to claim 1, wherein in step (3), clustering the preprocessed water chemistry element data specifically comprises:
examining the variation of the Davies-Bouldin index with the number of clusters and enlarging the interval around its lowest point, wherein the minimum Davies-Bouldin index value corresponds to the optimal clustering number, thereby determining clear cluster boundaries; finally, projecting the labels of the preprocessed water chemistry element data onto the neurons to determine the position of each preprocessed sample within the clusters.
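The selection rule of claims 6 and 7 (run K-means for several cluster counts and keep the count with the lowest Davies-Bouldin index) can be sketched as below. The input `points` would typically be the SOM prototype vectors. The deterministic farthest-point initialization and the names `kmeans`, `davies_bouldin`, and `best_cluster_count` are assumptions, and the sketch assumes no two cluster centers coincide.

```python
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    return [sum(col) / len(points) for col in zip(*points)]

def kmeans(points, k, n_iter=100):
    # Farthest-point initialization keeps the sketch deterministic (assumed choice).
    centers = [list(points[0])]
    while len(centers) < k:
        centers.append(list(max(points,
                                key=lambda p: min(dist(p, c) for c in centers))))
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the previous center if a cluster empties
                centers[j] = centroid(members)
    return labels, centers

def davies_bouldin(points, labels, centers):
    k = len(centers)
    sigma = []
    for j in range(k):
        members = [p for p, l in zip(points, labels) if l == j]
        # sigma_j: average distance of cluster members to their centroid
        sigma.append(sum(dist(p, centers[j]) for p in members) / len(members)
                     if members else 0.0)
    total = 0.0
    for i in range(k):
        total += max((sigma[i] + sigma[j]) / dist(centers[i], centers[j])
                     for j in range(k) if j != i)
    return total / k

def best_cluster_count(points, k_values):
    scores = {}
    for k in k_values:
        labels, centers = kmeans(points, k)
        scores[k] = davies_bouldin(points, labels, centers)
    return min(scores, key=scores.get), scores
```

Plotting `scores` against the cluster count gives the curve whose lowest point marks the optimal number of clusters.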
8. The machine learning-based method for analyzing groundwater chemistry control factors in an alpine-cold watershed according to claim 1, wherein in step (3), the clustering result is normalized so that a radar map and an ion ratio map of the water chemistry element concentrations in each cluster can be drawn.
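One plausible reading of the normalization in claim 8 is sketched below: average each element within a cluster, then rescale each element to [0, 1] across clusters so that all radar axes share a common range. The claim does not name the normalization, so min-max scaling and the function names `cluster_means` and `min_max_normalize` are assumptions.

```python
def cluster_means(samples, labels):
    """Average each element's concentration over the samples in each cluster."""
    sums, counts = {}, {}
    for sample, label in zip(samples, labels):
        counts[label] = counts.get(label, 0) + 1
        acc = sums.setdefault(label, {})
        for element, value in sample.items():
            acc[element] = acc.get(element, 0.0) + value
    return {label: {e: s / counts[label] for e, s in acc.items()}
            for label, acc in sums.items()}

def min_max_normalize(means):
    """Scale each element to [0, 1] across clusters (assumed normalization)."""
    elements = list(next(iter(means.values())).keys())
    normalized = {label: {} for label in means}
    for e in elements:
        column = [means[label][e] for label in means]
        lo, hi = min(column), max(column)
        for label in means:
            # An element that is constant across clusters shows no contrast; map it to 0.
            normalized[label][e] = ((means[label][e] - lo) / (hi - lo)
                                    if hi > lo else 0.0)
    return normalized
```

Each inner dictionary of the result holds the axis values for one cluster's radar polygon.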
9. The machine learning-based method for analyzing groundwater chemistry control factors in an alpine-cold watershed according to claim 1, wherein in step (4), applying positive matrix factorization (PMF) according to the clustering result specifically comprises:
step a: applying positive matrix factorization (PMF) to the preprocessed water chemistry element data, so that the sample concentration matrix $X_{ij}$ is decomposed into a factor contribution matrix $g_{ik}$, a factor distribution matrix $f_{kj}$, and a residual matrix $e_{ij}$, as follows:
$$X_{ij} = \sum_{k=1}^{p} g_{ik} f_{kj} + e_{ij}$$
where $X_{ij}$, an entry of the sample concentration matrix X, is the concentration of the j-th water chemistry element in the i-th sample; p is the number of pollution sources (factors); $g_{ik}$ is the contribution of the k-th source to the i-th sample; and $f_{kj}$ is the concentration of the j-th water chemistry element in the k-th source;
step b: calculating the uncertainty $u_{ij}$, where $u_{ij}$ is the uncertainty of the j-th water chemistry element in the i-th sample, calculated from the water chemistry element content, the method detection limit (MDL), and the measurement uncertainty:
if the water chemistry element content is greater than the method detection limit MDL, $u_{ij}$ is calculated as:
$$u_{ij} = \sqrt{(\text{error fraction} \times X_{ij})^2 + (0.5 \times \mathrm{MDL})^2}$$
where the error fraction is an error coefficient;
if the water chemistry element content is less than or equal to the method detection limit MDL, $u_{ij}$ is calculated as:
$$u_{ij} = \frac{5}{6} \times \mathrm{MDL}$$
step c: the factor contributions and distributions are derived by minimizing the objective function Q:
$$Q = \sum_{i=1}^{n}\sum_{j=1}^{m}\left(\frac{e_{ij}}{u_{ij}}\right)^2$$
where m is the number of water chemistry elements and n is the number of samples.
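Steps a to c can be illustrated with a small Python sketch. It assumes the uncertainty convention commonly used with PMF (5/6 of the MDL at or below the detection limit, otherwise the root sum of squares of the error-fraction term and half the MDL). It minimizes Q with weighted multiplicative non-negative updates, a simple stand-in that is not the solver used by dedicated PMF software. The names `uncertainty`, `objective_q`, and `pmf_multiplicative` are hypothetical.

```python
import math
import random

def uncertainty(conc, mdl, error_fraction):
    """Step b: per-value uncertainty under the assumed convention."""
    if conc <= mdl:
        return 5.0 / 6.0 * mdl
    return math.sqrt((error_fraction * conc) ** 2 + (0.5 * mdl) ** 2)

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def objective_q(x, g, f, u):
    """Step c: Q = sum over samples i and elements j of (e_ij / u_ij)^2."""
    model = matmul(g, f)
    return sum(((x[i][j] - model[i][j]) / u[i][j]) ** 2
               for i in range(len(x)) for j in range(len(x[0])))

def pmf_multiplicative(x, u, p, n_iter=200, seed=0, eps=1e-12):
    """Step a: factor X (n x m) into non-negative G (n x p) and F (p x m)."""
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    w = [[1.0 / u[i][j] ** 2 for j in range(m)] for i in range(n)]  # weights 1/u^2
    g = [[rng.uniform(0.1, 1.0) for _ in range(p)] for _ in range(n)]
    f = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(p)]
    for _ in range(n_iter):
        model = matmul(g, f)
        for i in range(n):          # update contributions G with F fixed
            for k in range(p):
                num = sum(w[i][j] * x[i][j] * f[k][j] for j in range(m))
                den = sum(w[i][j] * model[i][j] * f[k][j] for j in range(m))
                g[i][k] *= num / (den + eps)
        model = matmul(g, f)
        for k in range(p):          # update distributions F with G fixed
            for j in range(m):
                num = sum(w[i][j] * x[i][j] * g[i][k] for i in range(n))
                den = sum(w[i][j] * model[i][j] * g[i][k] for i in range(n))
                f[k][j] *= num / (den + eps)
    return g, f
```

Because every update multiplies by a non-negative ratio, G and F stay non-negative, which is the constraint that distinguishes PMF from an unconstrained factorization.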
10. The machine learning-based method for analyzing groundwater chemistry control factors in an alpine-cold watershed according to claim 9, wherein in step (4), qualitatively and quantitatively analyzing the groundwater chemistry control factors in the alpine-cold watershed in combination with correlation analysis and the ion ratio method specifically comprises:
obtaining, by positive matrix factorization (PMF), quantitative information on the contribution of each factor and on the distribution of each factor over the water chemistry elements; and analyzing the water chemistry contribution of each factor jointly with the radar map based on the SOM classification results, the correlation analysis, and the ion ratio relationships (TDS vs. Na⁺/(Na⁺ + Ca²⁺); Mg²⁺/Na⁺ vs. Ca²⁺/Na⁺; CAI-I (= (Cl⁻ - (Na⁺ + K⁺))/Cl⁻) vs. CAI-II (= (Cl⁻ - (Na⁺ + K⁺))/(SO₄²⁻ + HCO₃⁻ + CO₃²⁻ + NO₃⁻)); and NO₃⁻/Na⁺ vs. Cl⁻/Na⁺), so that each factor is matched to a groundwater chemistry control factor and its contribution rate is reflected.
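The ion ratios listed in claim 10 reduce to simple arithmetic on one sample's ion concentrations, as the following sketch shows. It assumes all concentrations are given in meq/L and are non-zero where they appear in a denominator. The function name `ion_ratios` and the dictionary keys are hypothetical.

```python
def ion_ratios(c):
    """Compute the ratios named in claim 10 from one sample's ions (meq/L assumed)."""
    alkali = c["Na"] + c["K"]
    anions = c["SO4"] + c["HCO3"] + c["CO3"] + c["NO3"]
    return {
        "Na/(Na+Ca)": c["Na"] / (c["Na"] + c["Ca"]),  # plotted against TDS
        "Mg/Na": c["Mg"] / c["Na"],
        "Ca/Na": c["Ca"] / c["Na"],
        "CAI-I": (c["Cl"] - alkali) / c["Cl"],        # chloro-alkaline index I
        "CAI-II": (c["Cl"] - alkali) / anions,        # chloro-alkaline index II
        "NO3/Na": c["NO3"] / c["Na"],
        "Cl/Na": c["Cl"] / c["Na"],
    }
```

Computing these ratios for every sample in a cluster gives the point clouds for the ion ratio maps, which can then be compared with the PMF factor distributions.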
CN202210939068.0A 2022-08-05 2022-08-05 Machine learning-based high-cold-flow-domain groundwater chemical control factor analysis method Active CN115293577B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210939068.0A CN115293577B (en) 2022-08-05 2022-08-05 Machine learning-based high-cold-flow-domain groundwater chemical control factor analysis method

Publications (2)

Publication Number Publication Date
CN115293577A true CN115293577A (en) 2022-11-04
CN115293577B CN115293577B (en) 2023-07-21

Family

ID=83828207

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210939068.0A Active CN115293577B (en) 2022-08-05 2022-08-05 Machine learning-based high-cold-flow-domain groundwater chemical control factor analysis method

Country Status (1)

Country Link
CN (1) CN115293577B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103942841A (en) * 2013-08-15 2014-07-23 中国地质科学院矿产资源研究所 Mineral resource multivariate information processing method and system based on GIS
CN106355011A (en) * 2016-08-30 2017-01-25 有色金属矿产地质调查中心 Geochemical data element sequence structure analysis method and device
CN113706354A (en) * 2021-09-02 2021-11-26 浙江索思科技有限公司 Marine integrated service management system based on big data technology
CN113780465A (en) * 2021-09-27 2021-12-10 中国水利水电科学研究院 Underground water chemistry seasonal change analysis method based on self-organizing neural network
CN113887635A (en) * 2021-10-08 2022-01-04 河海大学 Basin similarity classification method and classification device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
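Zhang Mei; Liu Qimeng; Liu Kaixuan: "Hydrochemical characteristics and genesis analysis of groundwater in the Pansan mining area" (潘三矿区地下水化学特征及成因分析), 煤矿开采 (Coal Mining), no. 02, pages 272-274 *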

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116030900A (en) * 2023-03-24 2023-04-28 安徽瑞邦数科科技服务有限公司 Method, device, equipment and storage medium for controlling component content of chemical product
CN117524347A (en) * 2023-11-20 2024-02-06 中南大学 First principle prediction method for acid radical anion hydration structure accelerated by machine learning
CN117524347B (en) * 2023-11-20 2024-04-16 中南大学 First principle prediction method for acid radical anion hydration structure accelerated by machine learning

Also Published As

Publication number Publication date
CN115293577B (en) 2023-07-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant