CN115081310A - Method for predicting biological accessibility of mining and metallurgy sites - Google Patents
Method for predicting biological accessibility of mining and metallurgy sites
- Publication number
- CN115081310A
- Authority
- CN
- China
- Prior art keywords
- heavy metal
- soil
- accessibility
- mining
- random forest
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/27—Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/26—Government or public services
Landscapes
- Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Economics (AREA)
- Human Resources & Organizations (AREA)
- Strategic Management (AREA)
- General Physics & Mathematics (AREA)
- Tourism & Hospitality (AREA)
- Evolutionary Computation (AREA)
- Marketing (AREA)
- General Business, Economics & Management (AREA)
- Development Economics (AREA)
- Entrepreneurship & Innovation (AREA)
- Artificial Intelligence (AREA)
- Operations Research (AREA)
- Educational Administration (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Quality & Reliability (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Game Theory and Decision Science (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Computer Hardware Design (AREA)
- Geometry (AREA)
- General Engineering & Computer Science (AREA)
- Sampling And Sample Adjustment (AREA)
Abstract
Description
Technical Field
The invention relates to the technical field of mining and smelting, and in particular to a method for predicting bioaccessibility at mining and metallurgy sites.
Background
With the rapid development of industrial activities such as mining and smelting, various heavy metals have been released into the soil, causing heavy metal pollution of the soil environment. According to China's first national soil pollution survey, soil pollution is relatively serious: 34.9% of samples from abandoned industrial and mining land exceeded China's soil environmental quality standards. Cadmium, lead and arsenic have attracted particular attention because they are associated with cardiovascular disease, nervous system damage and cancer. To better manage and control heavy metal pollution in soil, some countries have issued generic soil assessment criteria that cover many chemicals, including cadmium, lead and arsenic. In most cases the models behind these generic criteria consider the total concentration of heavy metals, which means that these soil environmental standards do not take the bioavailability of heavy metals in soil into account. Soil environmental standards based on bioavailability are very important for the remediation and reuse of industrial and mining land.
The bioavailability of cadmium, lead and arsenic is determined by comparing the accumulation of the metal in animal tissue or urine with that of a soluble reference substance such as sodium arsenate (NaH2AsO4), lead acetate (Pb(Ac)2) or cadmium chloride (CdCl2). Because bioavailability is determined through expensive and ethically controversial animal experiments, many in vitro gastrointestinal simulations are widely used as alternative methods for assessing the bioavailability of metals, and they have been validated against animal experiments. The ratio of the heavy metal extracted in an in vitro gastrointestinal simulation to the total content is defined as bioaccessibility. The bioaccessibility of heavy metals determined by different in vitro gastrointestinal simulation methods correlates with animal experiments, so it is necessary to evaluate the bioaccessibility of heavy metals in soil with a suitable in vitro method. At present, most studies have built accurate bioaccessibility prediction models for specific sites, but model predictions vary from site to site, and a model designed for one site is usually difficult to apply to another. Soil properties can differ greatly between sites, which severely limits the feasibility of using one model at two or more different sites.
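The definition of bioaccessibility given above can be written as a one-line calculation. The following is a minimal sketch; the function name, the mg/kg units and the example numbers are illustrative assumptions, not values from the patent.

```python
def bioaccessibility_ratio(extracted_mg_per_kg, total_mg_per_kg):
    """Ratio of metal extracted in the in vitro gastrointestinal simulation
    to the total metal content of the soil (dimensionless, 0 to 1)."""
    if total_mg_per_kg <= 0:
        raise ValueError("total concentration must be positive")
    if extracted_mg_per_kg < 0:
        raise ValueError("extracted concentration cannot be negative")
    return extracted_mg_per_kg / total_mg_per_kg


# Hypothetical example: 12.5 mg/kg extracted from a soil holding 50 mg/kg.
example_ratio = bioaccessibility_ratio(12.5, 50.0)
```

Multiplying the ratio by 100 would express it as a percentage, if that convention is preferred.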
Summary of the Invention
The object of the present invention is to solve at least one of the technical problems of the prior art by providing a method for predicting bioaccessibility at mining and metallurgy sites. The method addresses the problem that soil properties can differ greatly between sites, which severely limits the feasibility of using one model at two or more different sites.
To achieve the above object, the present invention provides the following technical solution: a method for predicting bioaccessibility at mining and metallurgy sites, comprising the following steps: S1, a collection module collects soil property data, heavy metal fraction data and heavy metal bioaccessibility data in the study area, and the bioaccessibility of the soil at each sampling point is determined;
S2, a modeling module establishes a random forest regression model from the soil property data, heavy metal fraction data and heavy metal bioaccessibility data, with bioaccessibility as the response variable and the soil properties and heavy metal fractions as predictor variables;
S3, a calculation module uses the measured data above, with the bioaccessibility of heavy metals in the soil as the response variable and the soil properties and heavy metal fractions as predictor variables, and builds the decision trees by random bootstrap sampling to establish the random forest regression model, where bioaccessibility is the ratio of the heavy metal content extracted in the in vitro gastrointestinal-phase simulation experiment to the heavy metal content of the soil;
S4, the number of predictor variables randomly sampled by each decision tree and the number of decision trees in the random forest regression model are determined;
S5, the contribution rate of each predictor to heavy metal bioaccessibility is calculated from the random forest regression model on the basis of SHAP values.
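Step S3 builds each decision tree from a random bootstrap sample. The sketch below shows what such sampling usually looks like: rows are drawn with replacement, and the rows that were never drawn form the out-of-bag set. The function name and the fixed seed are assumptions; the sample count of 33 soils and the tree count of 200 are the figures given later in the embodiment.

```python
import random


def bootstrap_indices(n_samples, rng):
    """Draw n_samples row indices with replacement and list the out-of-bag rows."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag


rng = random.Random(0)   # fixed seed (an assumption) so the sketch is repeatable
N_SOILS = 33             # soil samples described in the embodiment
N_TREES = 200            # number of trees stated in the embodiment

# One (in-bag, out-of-bag) pair per tree; each tree would be trained on its in-bag rows.
bags = [bootstrap_indices(N_SOILS, rng) for _ in range(N_TREES)]
```

Because every tree sees a different resample of the soils, the averaged forest is less sensitive to any single sample than one tree trained on the full set.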
Preferably, in S2 the accuracy of the random forest regression model is estimated by five-fold cross-validation.
Preferably, the contribution rate of each predictor to heavy metal bioavailability is calculated from the random forest regression model on the basis of SHAP values.
Preferably, the modeling module in S2 establishes the random forest regression model from the soil property, heavy metal fraction and heavy metal bioaccessibility data, with heavy metal bioaccessibility as the response variable and the soil properties and heavy metal fractions as predictors, where bioaccessibility is the ratio of the heavy metal content extracted in the in vitro gastrointestinal-phase simulation experiment to the heavy metal content of the soil.
Preferably, the calculation module in S3 calculates the contribution rate of geochemical factors to heavy metal bioavailability from the random forest regression model.
Preferably, a memory stores a program, and a processor loads the program to execute the method for estimating the bioavailability of heavy metals in soil.
Compared with the prior art, the beneficial effects of the present invention are:
(1) The method for predicting bioaccessibility at mining and metallurgy sites captures the nonlinear relationship between soil heavy metal bioavailability, soil properties and heavy metal fractions relatively accurately. The resulting random forest model generalizes well and can predict the bioaccessibility of cadmium, lead and arsenic at different sites, which reduces the effect of site-to-site variation on prediction and widens the range of application.
Brief Description of the Drawings
The present invention is further described below with reference to the drawings and embodiments:
Fig. 1 is a schematic flow chart of the method for predicting bioaccessibility at mining and metallurgy sites according to the present invention;
Fig. 2 shows the SHAP-based importance ranking for the random forest model that uses soil properties and heavy metal speciation as predictor variables;
Fig. 3 shows the SHAP-based importance ranking for the random forest model that uses soil properties as predictor variables;
Fig. 4 shows the composition and in vitro parameters of the UBM, IVG and PBET bioaccessibility assays;
Fig. 5 shows the random forest (RF) model results for heavy metal bioaccessibility in the gastric phase: (a) soil properties and heavy metal speciation as predictor variables; (b) soil properties as predictor variables;
Fig. 6 shows the analysis results in which feature importance calculated from SHAP values characterizes the importance of the predictor variables to model performance.
Detailed Description
This section describes specific embodiments of the present invention in detail. Preferred embodiments are shown in the drawings, whose purpose is to supplement the written description graphically so that each technical feature and the overall technical solution can be understood intuitively and visually; they are not to be construed as limiting the scope of protection of the invention.
In the description of the present invention, it should be understood that orientation terms such as up, down, front, rear, left and right refer to orientations or positional relationships based on those shown in the drawings. They are used only to facilitate and simplify the description and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation; they are therefore not to be construed as limiting the invention.
In the description of the present invention, terms such as greater than, less than and exceeding are understood to exclude the stated number, while terms such as above, below and within are understood to include it. Where first and second are used, they serve only to distinguish technical features and are not to be understood as indicating or implying relative importance, the number of the technical features indicated, or their order.
In the description of the present invention, unless expressly limited otherwise, terms such as arrange, install and connect are to be understood broadly, and a person skilled in the art can reasonably determine their specific meaning in the present invention in light of the specific content of the technical solution.
Referring to Figs. 1-6, the present invention provides a technical solution: a method for predicting bioaccessibility at mining and metallurgy sites, comprising the following steps: S1, a collection module collects soil property data, heavy metal fraction data and heavy metal bioaccessibility data in the study area, and the bioaccessibility of the soil at each sampling point is determined;
S2, a modeling module establishes a random forest regression model from the soil property data, heavy metal fraction data and heavy metal bioaccessibility data, with heavy metal bioaccessibility as the response variable and the soil properties and heavy metal fractions as predictor variables, and the accuracy of the model is estimated by five-fold cross-validation, where bioaccessibility is the ratio of the heavy metal content extracted in the in vitro gastrointestinal-phase simulation experiment to the heavy metal content of the soil;
S3, a calculation module uses the measured data above, with the bioaccessibility of heavy metals in the soil as the response variable and the soil properties and heavy metal fractions as predictor variables, and builds the decision trees by random bootstrap sampling to establish the random forest regression model; the calculation module also calculates the contribution rate of geochemical factors to heavy metal bioavailability from the random forest regression model;
S4, the number of predictor variables randomly sampled by each decision tree and the number of decision trees in the random forest regression model are determined;
S5, the contribution rate of each predictor to heavy metal bioaccessibility is calculated from the random forest regression model on the basis of SHAP values.
A memory stores a program, and a processor loads the program to execute the method for estimating the bioavailability of heavy metals in soil.
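Steps S2 and S4 rely on five-fold cross-validation and on choosing two model settings: the number of predictors sampled and the number of trees. The sketch below shows one plausible way to prepare both. The function name, the seed and the candidate grid values are assumptions for illustration; only the five folds, the 33 soils and the 200-tree setting come from the text.

```python
import random
from itertools import product


def k_fold_splits(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs so that every sample is tested exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]           # k nearly equal folds
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


splits = list(k_fold_splits(33, k=5))

# Hypothetical search grid: (predictors sampled per split, number of trees).
grid = list(product([3, 5, 8], [100, 200, 500]))
```

Each candidate in `grid` would be trained on the training rows of every split and scored on the held-out rows; the setting with the lowest average error would then be kept.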
First, sampling: 33 soils were collected from contaminated locations in mining and smelting areas across eight provinces of China. Each soil sample consisted of three soil cores (0-20 cm depth). Details of the sampling locations can be found in Table S1 of the supplementary material. All soils were air-dried at room temperature and passed through a 10-mesh sieve to remove debris and pebbles before physical and chemical analysis of soil properties. A portion of each sample was ground in an agate mortar and sieved to less than 250 μm for total element and bioaccessibility analysis.
The physicochemical properties of the soils were then measured in detail. Soil pH and electrical conductivity (EC) were measured with a pH meter and a conductivity meter in soil-water suspensions of 1:2.5 and 1:5 (mass:volume), respectively. Total organic carbon (TOC) and total carbon (TC) were measured with a total organic carbon analyzer (TOC, ASI-5000A). Cation exchange capacity (CEC) was determined by leaching air-dried soil with 1.66 cmol/L Co(NH3)6Cl3 solution. Dissolved total nitrogen (TN) was extracted with 0.5 mol/L K2SO4 at 1:5 (mass:volume) and measured with the total organic carbon analyzer (TOC, ASI-5000A, Shimadzu). Soil texture (sand, silt and clay, as volume percentages) was determined with a laser particle size analyzer (Mastersizer 3000, Malvern, UK) to obtain the particle size distribution. To determine total metal concentrations, soils were digested in a microwave wet-digestion system (Milestone MLS 1200 Mega) with a 2:1:1 (v:v:v) mixture of HNO3:HCl:HF; the digest was dried on a hot plate to remove HF completely, and the residue was dissolved in nitric acid. The metal content of the resulting solutions was analyzed by ICP-MS and ICP-OES.
Four fractions of the heavy metals were extracted from the soils of the different sampling points with the optimized BCR three-step sequential extraction method. The procedure distinguishes four stages: (1) exchangeable metals that are water-soluble or weakly bound to carbonates, obtained with 0.11 mol/L acetic acid; (2) metals bound to iron and manganese oxides, obtained with 0.1 mol/L hydroxylamine hydrochloride at pH 1.5; (3) metals bound to organic matter and sulfides, obtained by first oxidizing the residue from step 2 with hydrogen peroxide and then extracting with 1 mol/L ammonium acetate at pH 2; and (4) the residual fraction, obtained by digesting the residue from step 3 with aqua regia and HF.
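Once the four fractions have been measured, a common way to compare soils is to express each fraction as a share of the four-fraction sum. The sketch below assumes that convention; the patent does not state how the fractions are scaled before modeling. The labels F3 and F4, the function name and the example concentrations are hypothetical.

```python
def fraction_shares(f1, f2, f3, f4):
    """Return each sequential-extraction fraction as a percentage of the four-fraction sum."""
    fractions = {"F1": f1, "F2": f2, "F3": f3, "F4": f4}
    if any(value < 0 for value in fractions.values()):
        raise ValueError("fraction concentrations cannot be negative")
    total = sum(fractions.values())
    if total <= 0:
        raise ValueError("at least one fraction must be positive")
    return {name: 100.0 * value / total for name, value in fractions.items()}


# Hypothetical concentrations in mg/kg for stages (1) to (4).
shares = fraction_shares(10.0, 20.0, 30.0, 40.0)
```

Comparing the four-fraction sum with the total concentration from the acid digestion is also a simple recovery check on the extraction.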
Three in vitro assays were used to evaluate the bioaccessibility of heavy metals (HMs) in soil; each has been validated with animal models as the best in vitro method for predicting bioavailability. The composition and analytical parameters of the gastric phase (GP) and intestinal phase (IP) are shown in Fig. 4.
Next, regression prediction: the bioaccessibility of soil cadmium measured by the PBET method, of soil lead measured by the UBM method and of soil arsenic measured by the IVG method served as the response variables of the random forest models, and the 15 soil properties and heavy metal fractions measured in the experiments above served as the predictor variables. Individual regression trees were trained with the number of trees set to 200, the 200 trained trees were combined, and the model was tested on test data. The resulting random forest model was used for regression prediction, and the importance of the predictor variables for the bioaccessibility of each heavy metal, together with the interaction between predictor and response variables, was calculated. The results are shown in Fig. 5.
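The training procedure described above (bootstrap a sample, grow a regression tree, repeat 200 times, average the trees) can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the patented implementation: predictors are sampled at each split, as in the usual random forest algorithm, the depth and leaf-size limits are arbitrary, and the data are synthetic with three predictors instead of the 15 measured ones.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def build_tree(rows, targets, n_try, rng, min_leaf=2, max_depth=5, depth=0):
    """Grow one regression tree; each split examines n_try randomly chosen predictors."""
    if depth >= max_depth or len(rows) < 2 * min_leaf or len(set(targets)) == 1:
        return _mean(targets)                        # leaf stores the mean response
    best = None                                      # (squared error, feature, threshold)
    for feat in rng.sample(range(len(rows[0])), n_try):
        for thr in sorted({r[feat] for r in rows})[:-1]:
            left = [t for r, t in zip(rows, targets) if r[feat] <= thr]
            right = [t for r, t in zip(rows, targets) if r[feat] > thr]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            ml, mr = _mean(left), _mean(right)
            sse = sum((t - ml) ** 2 for t in left) + sum((t - mr) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, feat, thr)
    if best is None:
        return _mean(targets)
    _, feat, thr = best
    go_left = [r[feat] <= thr for r in rows]

    def subset(flag):
        keep = [i for i, g in enumerate(go_left) if g == flag]
        return [rows[i] for i in keep], [targets[i] for i in keep]

    return (feat, thr,
            build_tree(*subset(True), n_try, rng, min_leaf, max_depth, depth + 1),
            build_tree(*subset(False), n_try, rng, min_leaf, max_depth, depth + 1))


def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf mean."""
    while isinstance(node, tuple):
        feat, thr, left, right = node
        node = left if row[feat] <= thr else right
    return node


def fit_forest(rows, targets, n_trees=200, n_try=1, seed=0):
    """Train n_trees trees, each on its own bootstrap sample of the rows."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        bag = [rng.randrange(n) for _ in range(n)]   # sample rows with replacement
        forest.append(build_tree([rows[i] for i in bag],
                                 [targets[i] for i in bag], n_try, rng))
    return forest


def predict_forest(forest, row):
    """The forest prediction is the average of the individual tree predictions."""
    return _mean([predict_tree(tree, row) for tree in forest])


# Synthetic toy data (not from the patent): the response depends on predictor 0 only.
data_rng = random.Random(1)
X = [[i / 32, data_rng.random(), data_rng.random()] for i in range(33)]
y = [2.0 * row[0] + data_rng.gauss(0, 0.05) for row in X]

forest = fit_forest(X, y, n_trees=200, n_try=2)
low = predict_forest(forest, [0.05, 0.5, 0.5])
high = predict_forest(forest, [0.95, 0.5, 0.5])
```

In this toy setup the forest predicts a smaller response at a small value of predictor 0 than at a large one, which is the pattern built into the synthetic data. A production workflow would more likely use an established random forest library than this hand-written version.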
After logarithmic transformation of the target pollutant data set during preprocessing, the random forest model was found to fit all three heavy metals with high accuracy: Cd (R²cv = 0.92, RMSEcv = 0.30), Pb (R²cv = 0.3, RMSEcv = 0.39) and As (R²cv = 0.810, RMSEcv = 0.23).
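The two accuracy figures quoted above can be computed as follows. This is a minimal sketch with several assumptions: the text does not give the base of the logarithm, so base 10 is used here; the scores are taken over pooled held-out predictions; and the example numbers are invented to keep the arithmetic easy to follow.

```python
import math


def log10_transform(values):
    """Base-10 log transform; the base is an assumption, the text only says logarithmic."""
    return [math.log10(v) for v in values]


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean squared error on the same (transformed) scale as the inputs."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


# Hypothetical held-out values, not data from the patent.
obs_log = log10_transform([1.0, 10.0, 100.0])      # 0, 1, 2
pred_log = log10_transform([1.0, 10.0, 1000.0])    # 0, 1, 3

r2_cv = r_squared(obs_log, pred_log)       # 1 - 1/2
rmse_cv = rmse(obs_log, pred_log)          # sqrt(1/3)
```

Because both metrics are computed on the log scale, an RMSE value describes an error in log units rather than in the original concentration or ratio units.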
To understand more fully which features significantly affect heavy metal bioaccessibility, feature importance calculated from SHAP values was used to characterize the importance of the predictor variables to model performance. The results are shown in Fig. 6.
The SHAP-based feature importance shows that HMStotal, F1, F2 and F123 are consistently among the more important features in the random forest models, although their ranking differs slightly between metals. Specifically, in the Cd prediction model, Cdtotal and F2 rank first and second, followed by F123 and F1. The lead prediction model shares the same important features but in a different order: F123 > F2 > Pbtotal > F1. In the As-GP prediction model, F123 and Astotal are the two most important features, followed by EC and F1. In the model that predicts Cd bioaccessibility from soil properties alone, the most important feature is Cdtotal, and EC, Eh and pH are the next most important. This indicates that the heavy metal fractions F1, F2 and F123 are key factors affecting heavy metal bioavailability. In addition, soil electrical conductivity is a key factor affecting the bioaccessibility of arsenic in soil.
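The SHAP-based ranking discussed above rests on Shapley values, which average the marginal contribution of each feature over all orders in which features can be revealed. The brute-force sketch below is an illustrative simplification: it replaces hidden features with a fixed baseline value and enumerates every ordering, which is only practical for a handful of features. The toy linear model, the baseline and the sample rows are hypothetical, and the contribution rate is assumed to be the normalized mean absolute Shapley value.

```python
from itertools import permutations


def shapley_values(predict, row, baseline):
    """Exact Shapley values: average marginal contribution over all feature orders."""
    n = len(row)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)          # start with every feature at its baseline
        previous = predict(current)
        for j in order:
            current[j] = row[j]           # reveal feature j
            new = predict(current)
            phi[j] += new - previous
            previous = new
    return [p / len(orders) for p in phi]


def contribution_rates(predict, rows, baseline):
    """Mean absolute Shapley value per feature, normalised so the rates sum to 1."""
    totals = [0.0] * len(baseline)
    for row in rows:
        for j, value in enumerate(shapley_values(predict, row, baseline)):
            totals[j] += abs(value)
    grand_total = sum(totals)
    return [t / grand_total for t in totals]


def toy_model(x):
    """Hypothetical stand-in for a fitted model: feature 2 has no influence."""
    return 3.0 * x[0] + 1.0 * x[1] + 0.0 * x[2]


baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, [1.0, 1.0, 1.0], baseline)
rates = contribution_rates(toy_model, [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]], baseline)
```

For this linear toy model the Shapley values equal each coefficient times the feature's distance from the baseline, and they sum to the difference between the prediction and the baseline prediction. For tree ensembles with many predictors, dedicated SHAP implementations use much faster tree-specific algorithms than this enumeration.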
Data on the nonlinear relationships between soil heavy metal bioavailability, soil properties, and heavy metal fractions can be obtained with good accuracy. In addition, the constructed random forest model generalizes well and can predict the bioaccessibility of cadmium, lead, and arsenic at different sites.
The embodiments of the present invention have been described in detail above with reference to the accompanying drawings, but the present invention is not limited to these embodiments. Within the scope of knowledge possessed by a person of ordinary skill in the art, various changes can be made without departing from the spirit of the present invention.
Claims (6)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210489626.8A CN115081310A (en) | 2022-05-06 | 2022-05-06 | Method for predicting biological accessibility of mining and metallurgy sites |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210489626.8A CN115081310A (en) | 2022-05-06 | 2022-05-06 | Method for predicting biological accessibility of mining and metallurgy sites |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115081310A (en) | 2022-09-20 |
Family
ID=83247902
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210489626.8A Pending CN115081310A (en) | 2022-05-06 | 2022-05-06 | Method for predicting biological accessibility of mining and metallurgy sites |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115081310A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117094123A (en) * | 2023-07-12 | 2023-11-21 | 广东省科学院生态环境与土壤研究所 | Soil carbon fixation driving force identification method, device and medium based on interpretable model |
CN118980668A (en) * | 2024-09-12 | 2024-11-19 | 东北大学 | Soil heavy metal estimation method, device and equipment based on multi-angle reflectivity |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117094123A (en) * | 2023-07-12 | 2023-11-21 | 广东省科学院生态环境与土壤研究所 | Soil carbon fixation driving force identification method, device and medium based on interpretable model |
CN117094123B (en) * | 2023-07-12 | 2024-06-11 | 广东省科学院生态环境与土壤研究所 | Soil carbon fixation driving force identification method, device and medium based on interpretable model |
CN118980668A (en) * | 2024-09-12 | 2024-11-19 | 东北大学 | Soil heavy metal estimation method, device and equipment based on multi-angle reflectivity |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Kicińska et al. | Changes in soil pH and mobility of heavy metals in contaminated soils | |
Sahuquillo et al. | Overview of the use of leaching/extraction tests for risk assessment of trace metals in contaminated soils and sediments | |
Reis et al. | Overview and challenges of mercury fractionation and speciation in soils | |
Degryse et al. | Soil solution concentration of Cd and Zn can be predicted with a CaCl2 soil extract | |
Meers et al. | Comparison of cadmium extractability from soils by commonly used single extraction protocols | |
Peng et al. | Predicting heavy metal partition equilibrium in soils: Roles of soil components and binding sites | |
Desaules | Critical evaluation of soil contamination assessment methods for trace metals | |
Tokalioǧlu et al. | Determination of heavy metals and their speciation in lake sediments by flame atomic absorption spectrometry after a four-stage sequential extraction procedure | |
Middleton | Identifying chemical activity residues on prehistoric house floors: A methodology and rationale for multi‐elemental characterization of a mild acid extract of anthropogenic sediments | |
Amery et al. | The UV‐absorbance of dissolved organic matter predicts the fivefold variation in its affinity for mobilizing Cu in an agricultural soil horizon | |
Black et al. | Evaluation of soil metal bioavailability estimates using two plant species (L. perenne and T. aestivum) grown in a range of agricultural soils treated with biosolids and metal salts | |
CN115081310A (en) | Method for predicting biological accessibility of mining and metallurgy sites | |
De Vries et al. | Transfer functions for solid–solution partitioning of cadmium for Australian soils | |
Zan et al. | Prediction of the solubility of zinc, copper, nickel, cadmium, and lead in metal-contaminated soils | |
Zhai et al. | Leaching behaviors and chemical fraction distribution of exogenous selenium in three agricultural soils through simulated rainfall | |
Zhang et al. | Aging of zinc added to soils with a wide range of different properties: factors and modeling | |
Dari et al. | Estimation of phosphorus isotherm parameters: a simple and cost-effective procedure | |
Li et al. | Prediction of the uptake of Cd by rice (Oryza sativa) in paddy soils by a multi-surface model | |
Voegelin et al. | Zinc fractionation in contaminated soils by sequential and single extractions: influence of soil properties and zinc content | |
Kalkhajeh et al. | DGT technique to assess P mobilization from greenhouse vegetable soils in China: a novel approach | |
Degryse et al. | Mobility of Cd and Zn in polluted and unpolluted Spodosols | |
Golia et al. | Distribution of heavy metals of agricultural soils of central Greece using the modified BCR sequential extraction method | |
Nel et al. | Comparison of five methods to determine the cation exchange capacity of soil | |
Sun et al. | Does freeze-thaw action affect the extractability and bioavailability of Pb and As in contaminated soils? | |
Jiang et al. | Prediction of soil copper phytotoxicity to barley root elongation by an EDTA extraction method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||