CN115081310A - Method for predicting biological accessibility of mining and metallurgy sites - Google Patents
- Publication number
- CN115081310A
- Authority
- CN
- China
- Prior art keywords
- heavy metal
- soil
- accessibility
- biological
- mining
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/27—Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/26—Government or public services
Abstract
The invention discloses a method for predicting the biological accessibility of heavy metals at mining and smelting sites, and relates to the technical field of mining and smelting. The method comprises the following steps: S1, collecting soil characteristics, heavy metal components and heavy metal biological accessibility data in a research area through a collection module, and determining the biological accessibility of the soil at each sampling point; S2, establishing a random forest regression model from the soil characteristic data, the heavy metal components and the heavy metal biological accessibility data, with the biological accessibility as the response variable and the soil characteristics and heavy metal components as the prediction variables. The method can accurately capture the non-linear relationship between soil heavy metal bioavailability and the soil characteristics and heavy metal components. The constructed random forest model has good generalization capability, can predict the biological accessibility of cadmium, lead and arsenic at different sites, and reduces the influence of site differences on model prediction.
Description
Technical Field
The invention relates to the technical field of mining and smelting, in particular to a method for predicting the biological accessibility of a mining and smelting site.
Background
With the rapid development of industrial activities such as mining and smelting, various heavy metals are released into soil, resulting in heavy metal pollution of the soil environment. According to the first national soil pollution survey in China, soil environmental pollution is serious, and 34.9% of samples from industrial and mining waste land exceed the national soil environmental quality standard. Among these heavy metals, cadmium, lead and arsenic are associated with cardiovascular diseases, nervous system injuries, cancers and the like, and have therefore received much attention. In order to better manage and control heavy metal pollution in soil environments, several countries have promulgated general soil environmental assessment standards covering various chemicals such as cadmium, lead and arsenic. In most cases, these standards consider only the total concentration of heavy metals, which means that the bioavailability of heavy metals in soil is not taken into account. Soil environmental standards based on bioavailability are very important for the remediation and reuse of industrial and mining sites.
The bioavailability of cadmium, lead and arsenic is determined by comparing the accumulation of metals in animal tissues or urine with the accumulation of soluble reference substances such as sodium arsenate (NaH2AsO4), lead acetate (Pb(Ac)2) and cadmium chloride (CdCl2). Because bioavailability is determined through expensive and ethically disputed animal experiments, a number of in vitro gastrointestinal phase simulations are widely used as alternative methods for assessing the bioavailability of metals, and these have been validated by animal experiments. The ratio of the heavy metal extracted in an in vitro simulated gastrointestinal phase experiment to the total content is defined as the biological accessibility. The biological accessibility of heavy metals determined by the different in vitro gastrointestinal phase simulation methods is correlated with animal experiments, so the biological accessibility of heavy metals in soil needs to be evaluated using an appropriate in vitro gastrointestinal phase simulation method. At present, most studies have created accurate biological accessibility prediction models for specific sites, but model predictions differ from site to site, and a model designed for one site is usually difficult to apply to another. Because soil properties can differ greatly between sites, the feasibility of using one model at two or more different sites is greatly limited.
Disclosure of Invention
The present invention aims to solve at least one of the problems of the prior art by providing a method for predicting the biological accessibility of a mining and smelting site, which addresses the problem that large differences in soil properties between sites greatly limit the feasibility of using one model at two or more different sites.
In order to achieve this purpose, the invention provides the following technical scheme: a method for predicting the biological accessibility of a mining and smelting site, comprising the following steps: S1, collecting soil characteristics, heavy metal components and heavy metal biological accessibility data in the research area through a collection module, and determining the biological accessibility of the soil at each sampling point;
S2, establishing, through a modeling module, a random forest regression model from the soil characteristic data, the heavy metal components and the heavy metal biological accessibility data, with the biological accessibility as the response variable and the soil characteristics and heavy metal components as the prediction variables;
S3, according to the measured data, taking the biological accessibility of heavy metals in soil as the response variable and the soil characteristics and heavy metal components as the prediction variables through a calculation module, and constructing decision trees by random bootstrap sampling to establish the random forest regression model; wherein the biological accessibility refers to the ratio of the heavy metal content extracted in an in vitro gastrointestinal phase simulation experiment to the heavy metal content in the soil;
S4, determining the number of prediction variables randomly sampled by each decision tree and the number of decision trees in the random forest regression model;
S5, according to the random forest regression model, calculating the contribution rate of each prediction factor to the biological accessibility of heavy metals based on SHAP values.
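The response variable defined in S3 is a simple ratio. The following minimal Python sketch shows that ratio expressed as a percentage. The function name, the mg/kg units and the sample concentrations are hypothetical illustrations and are not taken from the patent.

```python
def bioaccessibility_percent(extracted_mg_per_kg: float, total_mg_per_kg: float) -> float:
    """Heavy metal extracted in the in vitro test divided by total soil content, in percent."""
    if total_mg_per_kg <= 0:
        raise ValueError("total heavy metal content must be positive")
    return 100.0 * extracted_mg_per_kg / total_mg_per_kg


# Hypothetical (extracted, total) concentrations for one soil sample.
sample = {"Cd": (1.2, 4.0), "Pb": (150.0, 600.0), "As": (9.0, 90.0)}

result = {
    metal: bioaccessibility_percent(extracted, total)
    for metal, (extracted, total) in sample.items()
}
print(result)
```

Each value in `result` would serve as one observation of the response variable for the corresponding heavy metal.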
Preferably, in S2 the accuracy of the random forest regression model is estimated by five-fold cross-validation.
Preferably, the contribution rate of each prediction factor to heavy metal bioavailability is calculated from the random forest regression model based on SHAP values.
Preferably, the random forest regression model is established through the modeling module in S2 from the soil characteristics, heavy metal components and heavy metal biological accessibility data, with the heavy metal biological accessibility as the response variable and the soil characteristics and heavy metal components as the prediction variables; wherein the biological accessibility refers to the ratio of the heavy metal content extracted in an in vitro gastrointestinal phase simulation experiment to the heavy metal content in the soil.
Preferably, the calculation module in S3 is used to calculate the contribution rate of geochemical factors to heavy metal bioavailability according to the random forest regression model.
Preferably, a processor is used to load a program to execute the method for estimating the bioavailability of soil heavy metals.
Compared with the prior art, the invention has the beneficial effects that:
(1) The method for predicting the biological accessibility of mining and smelting sites can accurately capture the nonlinear relationship between soil heavy metal bioavailability and the soil characteristics and heavy metal components. The constructed random forest model has good generalization capability and can predict the biological accessibility of cadmium, lead and arsenic at different sites, which reduces the influence of site differences on model prediction and expands the range of application.
Drawings
The invention is further illustrated with reference to the following figures and examples:
FIG. 1 is a schematic flow chart of a method for predicting the bio-accessibility of a mining and metallurgical site according to the present invention;
FIG. 2 is a schematic diagram of importance ranking calculated based on SHAP values of a random forest model using soil characteristics and heavy metal forms as prediction variables according to the present invention;
FIG. 3 is a schematic diagram of importance ranking calculated based on SHAP values for a random forest model using soil characteristics as predictor variables according to the present invention.
FIG. 4 is a schematic of the composition and in vitro parameters of the UBM, IVG and PBET assays used in the present invention.
FIG. 5 shows the biological accessibility of heavy metals in the gastric phase predicted by the Random Forest (RF) model of the present invention: (a) with soil characteristics and heavy metal forms as prediction variables; (b) with soil properties as prediction variables.
FIG. 6 is a diagram of the feature importance calculated from SHAP values, which characterizes the importance of the prediction variables to model performance, together with the analysis result.
Detailed Description
Reference will now be made in detail to the present preferred embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to like elements throughout.
In the description of the present invention, it should be understood that the orientation or positional relationship referred to in the description of the orientation, such as the upper, lower, front, rear, left, right, etc., is based on the orientation or positional relationship shown in the drawings, and is only for convenience of description and simplification of description, and does not indicate or imply that the device or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus, should not be construed as limiting the present invention.
In the description of the present invention, greater than, less than, exceeding, etc. are understood as excluding the present numbers, and the above, below, inside, etc. are understood as including the present numbers. If the first and second are described for the purpose of distinguishing technical features, they are not to be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated or implicitly indicating the precedence of the technical features indicated.
In the description of the present invention, unless otherwise explicitly limited, terms such as arrangement, installation, connection and the like should be understood in a broad sense, and those skilled in the art can reasonably determine the specific meanings of the above terms in the present invention in combination with the specific contents of the technical solutions.
Referring to FIGS. 1-6, the present invention provides a technical solution: a method for predicting the biological accessibility of a mining and smelting site, comprising the following steps: S1, collecting soil characteristics, heavy metal components and heavy metal biological accessibility data in a research area through a collection module, and determining the biological accessibility of the soil at each sampling point;
S2, establishing, through a modeling module, a random forest regression model from the soil characteristic data, the heavy metal components and the heavy metal biological accessibility data, with the biological accessibility as the response variable and the soil characteristics and heavy metal components as the prediction variables, and estimating the accuracy of the random forest regression model by five-fold cross-validation; wherein the biological accessibility refers to the ratio of the heavy metal content extracted in an in vitro gastrointestinal phase simulation experiment to the heavy metal content in the soil;
S3, according to the measured data, taking the biological accessibility of heavy metals in soil as the response variable and the soil characteristics and heavy metal components as the prediction variables through a calculation module, and constructing decision trees by random bootstrap sampling to establish the random forest regression model; the calculation module is also used to calculate the contribution rate of geochemical factors to heavy metal bioavailability according to the random forest regression model;
S4, determining the number of prediction variables randomly sampled by each decision tree and the number of decision trees in the random forest regression model;
S5, according to the random forest regression model, calculating the contribution rate of each prediction factor to the biological accessibility of heavy metals based on SHAP values.
A processor is used to load a program to execute the method for estimating the bioavailability of soil heavy metals.
First, sampling: 33 soils were collected from different contaminated sites in mining and smelting areas in eight provinces of China, each soil sample consisting of three cores (0-20 cm deep); details of the sampling sites can be found in Supplementary Table S1. All soils were air dried at room temperature and then passed through a 10-mesh screen to remove debris and pebbles before physical and chemical analysis of soil properties. A portion of each sample was ground in an agate mortar and sieved to less than 250 microns for total elemental analysis and biological accessibility assays.
The detailed physicochemical properties of the soil were then measured. Soil pH and electrical conductivity (EC) were measured with a pH meter and a conductivity meter in soil:water suspensions at mass ratios of 1:2.5 and 1:5, respectively. Soil total organic carbon (TOC) and total carbon (TC) were measured with a total organic carbon analyzer (TOC, ASI-5000A, Shimadzu). Air-dried soil samples were leached with 1.66 cmol/L Co(NH3)6Cl3 solution to determine the soil cation exchange capacity (CEC). Soil dissolved total nitrogen (TN) was extracted with 0.5 mol/L K2SO4 at 1:5 (mass:volume). Soil texture (sand, silt, clay; volume percent) was determined from the particle size distribution measured with a laser analyzer (Mastersizer 3000, Malvern, UK). To determine the total concentration of metals in the soil, samples were digested in a microwave wet-digestion system (Milestone MLS 1200 Mega) with a mixture of HNO3:HCl:HF at a ratio of 2:1:1 (v:v:v); the solution was dried on a hot plate to completely remove HF, and the residue was then dissolved in nitric acid. The resulting solution was analyzed for metal content using ICP-MS and ICP-OES.
Four heavy metal fractions were extracted from the soil at the different sampling points using an optimized BCR three-step sequential extraction method. The method yields four fractions: (1) exchangeable metals that are soluble in water or weakly bound to carbonates, extracted with 0.11 mol/L acetic acid; (2) metals bound to iron and manganese oxides, extracted with 0.1 mol/L hydroxylamine hydrochloride at pH 1.5; (3) metals bound to organic matter and sulfides, obtained by first oxidizing the residue of step 2 with hydrogen peroxide and then extracting with 1 mol/L ammonium acetate at pH 2; (4) the residual fraction, obtained by digesting the residue of step 3 with aqua regia and HF.
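The four BCR fractions can be turned into predictor variables by expressing each as a share of the summed fractions. The Python sketch below is one possible way to do this. The function name and the example concentrations are hypothetical, and treating F123 as the sum of the first three fractions is an assumption, since the text does not define F123 explicitly.

```python
def bcr_fraction_shares(f1: float, f2: float, f3: float, residual: float) -> dict:
    """Express the four BCR fractions as shares of their sum (values between 0 and 1)."""
    total = f1 + f2 + f3 + residual
    if total <= 0:
        raise ValueError("sum of fractions must be positive")
    shares = {
        "F1": f1 / total,        # exchangeable / carbonate-bound
        "F2": f2 / total,        # bound to Fe and Mn oxides
        "F3": f3 / total,        # bound to organic matter and sulfides
        "F4": residual / total,  # residual fraction
    }
    # Assumption: F123 denotes the sum of the three non-residual fractions.
    shares["F123"] = (f1 + f2 + f3) / total
    return shares


# Hypothetical fraction concentrations (mg/kg) for one soil sample.
shares = bcr_fraction_shares(2.0, 3.0, 1.0, 4.0)
print(shares)
```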
Three in vitro assays were used to assess the biological accessibility of heavy metals (HMs) in soil; each has been verified against animal models as a suitable in vitro method for predicting bioavailability. Their composition and analytical parameters in the gastric phase (GP) and intestinal phase (IP) are shown in FIG. 4.
Then, regression prediction: the biological accessibility values of the soil heavy metals (including soil lead and soil arsenic) measured by the PBET, UBM and IVG methods are used as the response variables of the random forest models, and the 15 soil characteristics and heavy metal components measured in the experiments above are used as the prediction variables. Individual regression trees are trained, with the number of trees set to 200; the 200 trained regression trees are combined and evaluated on test data, and the final combined random forest model is used for regression prediction. The contribution of the prediction variables to the biological accessibility of the different heavy metals and the interaction between the prediction variables and the response variables are calculated respectively; the analysis results are shown in FIG. 5.
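The tree-training and combination procedure described above can be sketched in plain Python as follows. This is a simplified illustration rather than the implementation used in the patent: the tree depth, minimum leaf size, number of predictors tried per split (`n_try`) and the synthetic data are all assumptions, and only the number of trees (200) comes from the text.

```python
import random


def avg(values):
    values = list(values)
    return sum(values) / len(values)


def build_tree(X, y, n_try, rng, depth=0, max_depth=3, min_leaf=2):
    """Grow one regression tree; each split examines n_try randomly chosen predictors."""
    if depth >= max_depth or len(y) < 2 * min_leaf or len(set(y)) == 1:
        return avg(y)  # leaf: mean response of the samples reaching this node
    best = None
    for j in rng.sample(range(len(X[0])), n_try):
        for t in sorted({row[j] for row in X})[:-1]:  # candidate thresholds
            left = [i for i, row in enumerate(X) if row[j] <= t]
            right = [i for i, row in enumerate(X) if row[j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            m_l, m_r = avg(y[i] for i in left), avg(y[i] for i in right)
            sse = (sum((y[i] - m_l) ** 2 for i in left)
                   + sum((y[i] - m_r) ** 2 for i in right))
            if best is None or sse < best[0]:
                best = (sse, j, t, left, right)
    if best is None:
        return avg(y)
    _, j, t, left, right = best

    def grow(idx):
        return build_tree([X[i] for i in idx], [y[i] for i in idx],
                          n_try, rng, depth + 1, max_depth, min_leaf)

    return (j, t, grow(left), grow(right))


def predict_tree(node, row):
    while isinstance(node, tuple):  # internal nodes are (feature, threshold, low, high)
        j, t, low, high = node
        node = low if row[j] <= t else high
    return node


def fit_forest(X, y, n_trees=200, n_try=2, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]  # bootstrap sample
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx], n_try, rng))
    return forest


def predict_forest(forest, row):
    return avg(predict_tree(tree, row) for tree in forest)  # average over all trees


# Synthetic stand-in data: 40 samples, 3 predictors, nonlinear response.
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(3)] for _ in range(40)]
y = [row[0] * row[1] + (0.5 if row[2] > 0.5 else 0.0) for row in X]

forest = fit_forest(X, y, n_trees=200, n_try=2, seed=0)
prediction = predict_forest(forest, X[0])
print(len(forest), prediction)
```

Averaging many trees grown on different bootstrap samples with random predictor subsets is what lets the combined model follow nonlinear relationships while limiting the variance of any single tree.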
After logarithmic transformation of the target pollutant dataset as a pre-treatment, the random forest models showed high fitting accuracy for all three heavy metals: Cd (R²CV = 0.92, RMSECV = 0.30), Pb (R²CV = 0.3, RMSECV = 0.39), and As (R²CV = 0.810, RMSECV = 0.23).
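The cross-validated metrics reported above can be computed along the following lines. The patent does not give the formulas, so this sketch assumes the standard definitions: R²CV as one minus the ratio of residual to total sum of squares over the pooled out-of-fold predictions, and RMSECV as the root mean squared out-of-fold error. The one-variable least-squares model and the synthetic data are stand-ins for the random forest and the measured data.

```python
import math
import random


def k_fold_indices(n, k=5, seed=0):
    """Shuffle sample indices and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(X, y, fit, predict, k=5, seed=0):
    """Return (R2_cv, RMSE_cv) from pooled out-of-fold predictions."""
    preds = [None] * len(y)
    for fold in k_fold_indices(len(y), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in fold:
            preds[i] = predict(model, X[i])
    y_bar = sum(y) / len(y)
    ss_res = sum((a - p) ** 2 for a, p in zip(y, preds))
    ss_tot = sum((a - y_bar) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / len(y))


# Stand-in model: least-squares line on a single predictor.
def fit_line(X, y):
    xs = [row[0] for row in X]
    mx, my = sum(xs) / len(xs), sum(y) / len(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(xs, y))
             / sum((a - mx) ** 2 for a in xs))
    return slope, my - slope * mx


def predict_line(model, row):
    return model[0] * row[0] + model[1]


# Synthetic data whose logarithm is linear in the predictor.
X = [[i / 10] for i in range(20)]
y_raw = [math.exp(2 * row[0]) for row in X]
y = [math.log(v) for v in y_raw]  # logarithmic pre-treatment of the response

r2_cv, rmse_cv = cross_validate(X, y, fit_line, predict_line, k=5)
print(r2_cv, rmse_cv)
```

Any model exposing the same `fit` and `predict` callables, including a random forest, could be passed to `cross_validate` in place of the line model.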
In order to understand more fully the features that significantly affect the biological accessibility of heavy metals, feature importance is calculated from SHAP values to characterize the importance of the prediction variables to model performance; the analysis result is shown in FIG. 6.
Feature importance calculated from the SHAP values shows that HMstotal, F1, F2 and F123 are consistently among the most important features in the random forest models, although their importance rankings differ slightly between the prediction models for different heavy metals. Specifically, in the Cd prediction model, Cdtotal and F2 are the first and second most important features, followed by F123 and F1. The important features in the Pb prediction model are the same as in the Cd prediction model but rank differently: F123 > F2 > Pbtotal > F1. In the As-GP prediction model, F123 and Astotal are the two most important features, followed by EC and F1. In the model that predicts Cd biological accessibility using only soil characteristics as prediction variables, the most important feature is Cdtotal, followed by EC, Eh and pH. F1, F2 and F123 among the heavy metal components are key factors influencing the bioavailability of heavy metals. In addition, soil electrical conductivity is also a key factor affecting the biological accessibility of arsenic in soil.
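A SHAP-based importance ranking of this kind can be illustrated with exact Shapley values on a toy model. The sketch below enumerates all predictor subsets, which is only practical for a handful of predictors; dedicated SHAP tooling for tree ensembles uses faster algorithms. The toy model, the feature names and the background data are hypothetical, and replacing absent predictors with background values is one common convention, not necessarily the one used in the patent.

```python
from itertools import combinations
from math import factorial


def shapley_values(f, x, background):
    """Exact Shapley values of model f at point x, relative to a background dataset."""
    n = len(x)

    def value(subset):
        # Predictors in `subset` take their values from x; the rest come from the background.
        total = 0.0
        for b in background:
            total += f([x[j] if j in subset else b[j] for j in range(n)])
        return total / len(background)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contribution = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                contribution += weight * (value(s | {i}) - value(s))
        phi.append(contribution)
    return phi


def mean_abs_shap(f, rows, background):
    """Global importance: mean absolute Shapley value of each predictor over rows."""
    all_phi = [shapley_values(f, row, background) for row in rows]
    return [sum(abs(phi[i]) for phi in all_phi) / len(all_phi)
            for i in range(len(rows[0]))]


# Hypothetical toy model with one additive term and one interaction term.
def toy_model(row):
    return 2.0 * row[0] + row[1] * row[2]


feature_names = ["total", "F1", "EC"]  # illustrative labels only
background = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
x = [1.0, 2.0, 3.0]

phi = shapley_values(toy_model, x, background)
importance = mean_abs_shap(toy_model, [x], background)
print(dict(zip(feature_names, phi)))
```

The Shapley values sum to the difference between the model output at `x` and the mean output over the background data, which is the property that makes them usable as per-predictor contribution rates.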
The method can accurately capture the nonlinear relationship between the bioavailability of heavy metals in soil and the soil characteristics and heavy metal components, and the constructed random forest model has good generalization capability and can predict the biological accessibility of cadmium, lead and arsenic at different sites.
The embodiments of the present invention have been described in detail with reference to the accompanying drawings, but the present invention is not limited to the above embodiments, and various changes can be made within the knowledge of those skilled in the art without departing from the gist of the present invention.
Claims (6)
1. A method for predicting the accessibility of organisms in a mining and smelting site is characterized by comprising the following steps: the method comprises the following steps: the S1 collects soil characteristics, heavy metal components and heavy metal biological accessibility data in the research area through the collection module, and determines the biological accessibility of the soil of the sampling point;
s2, establishing a random forest regression model by using the biological accessibility as a response variable and the soil characteristics and the heavy metal components as prediction variables according to the soil characteristic data, the heavy metal components and the heavy metal biological accessibility data through a modeling module;
s3, according to the measured data, using biological accessibility of heavy metals in soil as response variables and soil characteristics and heavy metal components as prediction variables through a calculation module, and constructing a decision tree by adopting a random bootstrap sampling method to establish a random forest regression model; wherein, the biological accessibility refers to the ratio of the content of heavy metal extracted in an in vitro gastrointestinal phase simulation experiment to the content of heavy metal in soil;
s4, determining the number of the prediction variables and the number of the decision trees randomly sampled by each decision tree in the random forest regression model;
s5, according to the random forest regression model, calculating the contribution rate of the prediction factors to the accessibility of the heavy metal organisms based on the shape value.
2. The method for predicting biological accessibility at a mining and smelting site according to claim 1, characterized in that: in S2, the accuracy of the random forest regression model is estimated by five-fold cross-validation.
3. The method for predicting biological accessibility at a mining and smelting site according to claim 1, characterized in that: the contribution rate of each predictor to heavy metal bioavailability is calculated from the random forest regression model based on SHAP values.
4. The method for predicting biological accessibility at a mining and smelting site according to claim 1, characterized in that: in S2, the modeling module establishes the random forest regression model from the soil characteristics, the heavy metal components and the heavy metal biological accessibility data, taking heavy metal biological accessibility as the response variable and the soil characteristics and heavy metal components as predictor variables; wherein the biological accessibility refers to the ratio of the content of heavy metal extracted in an in vitro gastrointestinal phase simulation experiment to the content of heavy metal in the soil.
5. The method for predicting biological accessibility at a mining and smelting site according to claim 1, characterized in that: in S3, the calculation module calculates the contribution rate of geochemical factors to heavy metal bioavailability according to the random forest regression model.
6. The method for predicting biological accessibility at a mining and smelting site according to claim 1, characterized in that: a memory is used to store a program, and a processor is used to load the program to execute the method for estimating the bioavailability of soil heavy metals.
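The modeling steps recited in the claims can be illustrated with a minimal, standard-library-only Python sketch. It is not the applicant's implementation: every function name (`bioaccessibility`, `grow_tree`, `fit_forest`, `five_fold_rmse`), every default value (`ntree`, `mtry`, `max_depth`, `min_leaf`) and the RMSE accuracy measure are assumptions made for illustration. It shows the bioaccessibility ratio, trees grown on bootstrap samples with a random subset of predictors tried at each split, and a five-fold cross-validation estimate of accuracy. The SHAP-based contribution step is not reproduced here.

```python
import random
from statistics import mean


def bioaccessibility(extracted, total):
    """Ratio of metal extracted in the in vitro gastrointestinal phase
    to the total metal content of the soil (both in the same units)."""
    if total <= 0:
        raise ValueError("total metal content must be positive")
    return extracted / total


def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def grow_tree(X, y, mtry, rng, depth=0, max_depth=5, min_leaf=2):
    """Grow one regression tree, trying `mtry` random predictors per split."""
    if depth >= max_depth or len(y) < 2 * min_leaf or len(set(y)) == 1:
        return mean(y)  # leaf: average response of the samples it holds
    best = None
    for j in rng.sample(range(len(X[0])), mtry):
        for t in sorted({row[j] for row in X})[:-1]:
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, j, t)
    if best is None:
        return mean(y)
    _, j, t = best
    lo = [(row, yi) for row, yi in zip(X, y) if row[j] <= t]
    hi = [(row, yi) for row, yi in zip(X, y) if row[j] > t]
    return (j, t,
            grow_tree([r for r, _ in lo], [v for _, v in lo],
                      mtry, rng, depth + 1, max_depth, min_leaf),
            grow_tree([r for r, _ in hi], [v for _, v in hi],
                      mtry, rng, depth + 1, max_depth, min_leaf))


def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(node, tuple):
        j, t, left, right = node
        node = left if row[j] <= t else right
    return node


def fit_forest(X, y, ntree=50, mtry=1, seed=0):
    """Fit `ntree` trees, each on a bootstrap sample of the training data."""
    rng = random.Random(seed)
    forest = []
    for _ in range(ntree):
        idx = [rng.randrange(len(y)) for _ in y]  # sample with replacement
        forest.append(grow_tree([X[i] for i in idx],
                                [y[i] for i in idx], mtry, rng))
    return forest


def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return mean(predict_tree(tree, row) for tree in forest)


def five_fold_rmse(X, y, ntree=50, mtry=1, seed=0):
    """Estimate prediction error (RMSE) by five-fold cross-validation."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::5] for k in range(5)]
    sq_err = []
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        forest = fit_forest([X[i] for i in train],
                            [y[i] for i in train], ntree, mtry, seed)
        sq_err += [(predict_forest(forest, X[i]) - y[i]) ** 2 for i in test]
    return mean(sq_err) ** 0.5
```

Under these assumptions, the two quantities named in S4 could be chosen by calling `five_fold_rmse` for several candidate `(ntree, mtry)` pairs and keeping the pair with the lowest error. In practice an established random forest library and a SHAP implementation would normally replace this hand-written code.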
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210489626.8A CN115081310A (en) | 2022-05-06 | 2022-05-06 | Method for predicting biological accessibility of mining and metallurgy sites |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115081310A (en) | 2022-09-20 |
Family
ID=83247902
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210489626.8A (Pending) CN115081310A (en) | 2022-05-06 | 2022-05-06 | Method for predicting biological accessibility of mining and metallurgy sites |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115081310A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117094123A (en) * | 2023-07-12 | 2023-11-21 | 广东省科学院生态环境与土壤研究所 | Soil carbon fixation driving force identification method, device and medium based on interpretable model |
CN117094123B (en) * | 2023-07-12 | 2024-06-11 | 广东省科学院生态环境与土壤研究所 | Soil carbon fixation driving force identification method, device and medium based on interpretable model |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Meers et al. | Comparison of cadmium extractability from soils by commonly used single extraction protocols | |
Reis et al. | Overview and challenges of mercury fractionation and speciation in soils | |
Degryse et al. | Soil solution concentration of Cd and Zn can be predicted with a CaCl2 soil extract | |
Sahuquillo et al. | Overview of the use of leaching/extraction tests for risk assessment of trace metals in contaminated soils and sediments | |
Kaninga et al. | mine tailings in an African tropical environment—mechanisms for the bioavailability of heavy metals in soils | |
Peijnenburg et al. | Monitoring metals in terrestrial environments within a bioavailability framework and a focus on soil extraction | |
Jamali et al. | Speciation of heavy metals in untreated domestic wastewater sludge by time saving BCR sequential extraction method | |
Rao et al. | A review of the different methods applied in environmental geochemistry for single and sequential extraction of trace elements in soils and related materials | |
Peng et al. | Predicting heavy metal partition equilibrium in soils: Roles of soil components and binding sites | |
Amery et al. | The UV‐absorbance of dissolved organic matter predicts the fivefold variation in its affinity for mobilizing Cu in an agricultural soil horizon | |
Bakircioglu et al. | Comparison of extraction procedures for assessing soil metal bioavailability of to wheat grains | |
Degryse et al. | Radio‐labile cadmium and zinc in soils as affected by pH and source of contamination | |
Yolcubal et al. | Adsorption and transport of arsenate in carbonate-rich soils: coupled effects of nonlinear and rate-limited sorption | |
CN110987909A (en) | Method and device for analyzing spatial distribution and source of heavy metals in farmland soil | |
Campos et al. | A study of the analytical parameters important for the sequential extraction procedure using microwave heating for Pb, Zn and Cu in calcareous soils | |
Meers et al. | Zn in the soil solution of unpolluted and polluted soils as affected by soil characteristics | |
Zan et al. | Prediction of the solubility of zinc, copper, nickel, cadmium, and lead in metal-contaminated soils | |
Rodrigues et al. | Evaluation of an approach for the characterization of reactive and available pools of 20 potentially toxic elements in soils: Part II–Solid-solution partition relationships and ion activity in soil solutions | |
Zhang et al. | Aging of zinc added to soils with a wide range of different properties: factors and modeling | |
Wang et al. | Distribution and integrated assessment of lead in an abandoned lead-acid battery site in Southwest China before redevelopment | |
Chang et al. | Evaluation of phytoavailability of heavy metals to Chinese cabbage (Brassica chinensis L.) in rural soils | |
Voegelin et al. | Zinc fractionation in contaminated soils by sequential and single extractions: influence of soil properties and zinc content | |
Zhai et al. | Leaching behaviors and chemical fraction distribution of exogenous selenium in three agricultural soils through simulated rainfall | |
CN115081310A (en) | Method for predicting biological accessibility of mining and metallurgy sites | |
Beaumelle et al. | Subcellular partitioning of metals in Aporrectodea caliginosa along a gradient of metal exposure in 31 field-contaminated soils |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||