CN113916822A - Infrared spectroscopic analysis method for total nitrogen content of water-containing soil - Google Patents
- Publication number
- CN113916822A (application CN202110998389.3A)
- Authority
- CN
- China
- Prior art keywords
- near infrared
- infrared spectrum
- soil
- data
- total nitrogen
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N21/00—Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
- G01N21/17—Systems in which incident light is modified in accordance with the properties of the material investigated
- G01N21/25—Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
- G01N21/31—Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry
- G01N21/35—Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light
- G01N21/3563—Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light for analysing solids; Preparation of samples therefor
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N21/00—Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
- G01N21/01—Arrangements or apparatus for facilitating the optical investigation
Abstract
The invention belongs to the technical field of soil total nitrogen content analysis, and particularly relates to an infrared spectrum analysis method for the total nitrogen content of water-containing soil. The method comprises the following steps: step 1, collecting a near infrared spectrum of a soil sample to obtain near infrared spectrum raw data; step 2, converting the near infrared spectrum raw data obtained in step 1 into near infrared spectrum data of dry soil by a direct spectrum conversion algorithm; and step 3, predicting the total nitrogen content of the soil sample from the near infrared spectrum data of dry soil obtained in step 2. The method predicts the total nitrogen content of water-containing soil more accurately, reduces the consumption of manpower and resources, and offers high efficiency and low cost, so it has good application prospects in agriculture, environmental protection, biological research and related fields.
Description
Technical Field
The invention belongs to the technical field of soil total nitrogen content analysis, and particularly relates to a total nitrogen content infrared spectrum analysis method of water-containing soil.
Background
The content of nutrient elements in soil not only affects the growth of vegetation and crops, but also affects the regional ecological quality and the distribution of animal and plant populations. Therefore, the determination of nutrient elements in soil has become an urgent need in the fields of modern environmental science, agricultural science, ecology and the like.
Nitrogen is a basic element required for biochemical reactions in many organisms; it is one of the four basic elements constituting biomolecules such as DNA and RNA, and is also a constituent element of proteins. Nitrogen also plays a role in plant photosynthesis, where it is used to produce chlorophyll molecules. Soil nitrogen (total nitrogen) promotes the growth and development of the leaves, roots and stems of crops, so total nitrogen influences crop growth quality. Therefore, determining the total nitrogen content of soil is very important for crop growth, related scientific research, environmental monitoring and the like.
At present, the national standard method (HJ 717-.
Therefore, other techniques for analyzing total nitrogen in soil have been developed. Near infrared spectroscopy is a physical analysis method based on spectroscopic technology. It offers high analysis speed, low cost, no reagent consumption, simultaneous measurement of multiple components, and small, portable instruments suitable for field analysis, and has therefore attracted wide attention. Chinese patent application CN108982406A discloses a method for analyzing the total nitrogen content of soil from near infrared spectrum data, in which backward interval partial least squares (BiPLS) and competitive adaptive reweighted sampling (CARS) are used to select, respectively, a near infrared spectrum characteristic interval and characteristic variables of the soil; the results of the two algorithms are optimized and fused to determine the near infrared spectrum characteristic interval of the soil; and a prediction model between the characteristic band spectrum and the soil nitrogen content is then established with the PLS algorithm.
However, soil usually contains water, and its water content varies greatly with time, region and other factors. According to relevant research results in the field, water in a soil sample may affect the infrared spectrum in the following ways: (1) soil moisture can affect the chemical properties of the soil, which changes the near infrared spectrum values; (2) an increase in moisture content lowers the reflectivity of the soil, which changes the intensity of the collected infrared spectrum data. Existing soil nitrogen content prediction models do not take the influence of water content on the infrared spectrum into account. Consequently, when an existing prediction model is used to predict soil nitrogen content, prediction accuracy inevitably suffers for soil samples whose water content deviates substantially from that of the modeling samples.
Disclosure of Invention
To address the shortcomings of the prior art, the invention provides an infrared spectrum analysis method for the total nitrogen content of water-containing soil, with the aim of accurately analyzing the total nitrogen content of water-containing soil.
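An infrared spectrum analysis method for the total nitrogen content of water-containing soil comprises the following steps:
step 1, collecting a near infrared spectrum of a soil sample to obtain near infrared spectrum raw data;
step 2, converting the near infrared spectrum raw data obtained in step 1 into near infrared spectrum data of dry soil by a direct spectrum conversion algorithm;
step 3, predicting the total nitrogen content of the soil sample from the near infrared spectrum data of dry soil obtained in step 2.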
Preferably, the water content of the soil sample is 5%-35%, and/or the soil sample is collected from the Chengdu Plain.
Preferably, in step 2, the direct spectrum conversion algorithm relates the near infrared spectrum raw data to the near infrared spectrum data of dry soil as follows:
$$S_1 = S_2 F + E$$
wherein $S_1$ is the near infrared spectrum data of dry soil and $S_2$ is the near infrared spectrum raw data; $S_1$ and $S_2$ are both m × p matrices, where m is the number of soil samples and p is the number of data points in the near infrared spectrum raw data; F is the spectrum transfer matrix and E is the residual matrix.
Preferably, the spectrum transfer matrix and the residual matrix are determined by the following method:
step a, collecting the near infrared spectrum of a soil sample to obtain near infrared spectrum raw data, denoted $S_1$, and mean-centering $S_1$;
step b, drying the soil sample to obtain a dry soil sample, collecting the near infrared spectrum of the dry soil sample to obtain near infrared spectrum raw data of the dry soil, denoted $S_2'$, and mean-centering $S_2'$;
step c, calculating the spectrum transfer matrix F and the residual matrix E, wherein the calculation formula is as follows:
wherein the mean vectors of $S_1$ and $S_2'$ are the row vectors formed by averaging the elements in each column of $S_1$ and $S_2'$, respectively; k is a column vector whose elements are all 1; and $d_s$ is the environmental baseline correction matrix.
Preferably, the value of m is 50; when more than 50 soil samples are collected, 50 samples are selected from all the soil samples with the Kennard-Stone algorithm.
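The Kennard-Stone selection mentioned above can be sketched as follows. This is a minimal illustration of the commonly described form of the algorithm (start from the two most distant spectra, then repeatedly add the spectrum farthest from those already chosen). The function name `kennard_stone` and the use of Euclidean distance on raw spectra are assumptions made for the sketch, not details given in the text.

```python
import numpy as np

def kennard_stone(X, n_select):
    """Return indices of n_select rows of X chosen by the Kennard-Stone rule.

    Assumes n_select >= 2; each row of X is one spectrum.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if n_select >= n:
        return list(range(n))
    # Pairwise Euclidean distances between all spectra.
    sq = np.sum(X**2, axis=1)
    dist = np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0))
    # Start with the two most distant samples.
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(n) if k not in selected]
    while len(selected) < n_select:
        # Distance from each candidate to its nearest already-selected sample.
        min_dist = dist[np.ix_(remaining, selected)].min(axis=1)
        # Add the candidate that is farthest from the selected set.
        pick = remaining[int(np.argmax(min_dist))]
        selected.append(pick)
        remaining.remove(pick)
    return selected
```

With m = 50, a call such as `kennard_stone(spectra, 50)` would return the row indices of the 50 spectra used to compute the spectrum transfer matrix.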
Preferably, step 3 comprises the steps of:
step 3A, preprocessing the near infrared spectrum data of dry soil with the SG smoothing algorithm;
step 3B, predicting from the near infrared spectrum data preprocessed in step 3A with a prediction model to obtain the total nitrogen content.
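As an illustration of step 3A, the sketch below implements Savitzky-Golay smoothing with NumPy only: a low-order polynomial is fitted by least squares inside a sliding window and evaluated at the window centre. The window length (11) and polynomial order (2) are assumptions chosen for the example; the text does not state which settings were used. `scipy.signal.savgol_filter` offers the same kind of filter where SciPy is available.

```python
import numpy as np

def savgol_smooth(spectra, window=11, polyorder=2):
    """Savitzky-Golay smoothing of each row (one spectrum per row)."""
    if window % 2 == 0 or polyorder >= window:
        raise ValueError("window must be odd and larger than polyorder")
    spectra = np.atleast_2d(np.asarray(spectra, dtype=float))
    half = window // 2
    # Least-squares polynomial fit over the window: row 0 of the pseudo-inverse
    # holds the weights that give the fitted value at the window centre.
    offsets = np.arange(-half, half + 1)
    vander = np.vander(offsets, polyorder + 1, increasing=True)
    weights = np.linalg.pinv(vander)[0]
    # Reflect the ends so the output keeps the input length.
    padded = np.pad(spectra, ((0, 0), (half, half)), mode="reflect")
    return np.array([np.convolve(row, weights[::-1], mode="valid") for row in padded])
```

Away from the edges, a spectrum that is exactly a polynomial of degree at most `polyorder` passes through unchanged, which is a convenient sanity check for the filter.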
Preferably, step 3B includes the steps of: predicting the preprocessed near infrared spectrum data by adopting a full-wavelength prediction model to obtain a total nitrogen content result; the full-wavelength prediction model is obtained through modeling by a PLSR algorithm or an ANN algorithm.
The invention also provides computer equipment for infrared spectrum analysis of soil total nitrogen, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, and is characterized in that the processor realizes the infrared spectrum analysis method of the total nitrogen content of the water-containing soil when executing the program.
The invention also provides an infrared spectroscopic analysis system for soil total nitrogen, which is characterized by comprising the following components:
infrared spectrum acquisition and/or input means for acquiring and/or inputting near infrared spectrum data;
the computer equipment is used for analyzing the near infrared spectrum data to obtain a total nitrogen content result.
The present invention also provides a computer-readable storage medium characterized in that: on which a computer program is stored for implementing the above-mentioned method for the infrared spectroscopic analysis of the total nitrogen content of hydrous soil.
The analysis method can transform the infrared spectrum of the water-containing soil into the infrared spectrum of the dry soil in a mathematical conversion mode through a direct spectrum conversion algorithm, and can obtain a more accurate analysis result of the total nitrogen content of the soil by utilizing the infrared spectrum of the dry soil obtained after the conversion.
In general, when a laboratory needs to analyze the spectrum data of dry soil, experimental steps such as air drying or oven drying are required. With the method of the present invention, however, the infrared spectrum of dry soil can be obtained without these drying steps. This saves a large amount of manpower and resources, improves the efficiency of the related research work, and reduces cost. Therefore, the method has good application prospects in research on the total nitrogen content of soil.
Obviously, many modifications, substitutions, and variations are possible in light of the above teachings of the invention, without departing from the basic technical spirit of the invention, as defined by the following claims.
The present invention will be described in further detail with reference to the following examples. This should not be understood as limiting the scope of the above-described subject matter of the present invention to the following examples. All the technologies realized based on the above contents of the present invention belong to the scope of the present invention.
Drawings
FIG. 1 is a comparison of different NIR spectra data for example 1;
FIG. 2 shows the results of the spectral similarity coefficient in Experimental example 2;
FIG. 3 is the result of the prediction of the total nitrogen content of the soil sample in example 1;
FIG. 4 is the result of predicting the total nitrogen content of the soil sample in comparative example 1;
FIG. 5 is the near infrared spectrum raw data collected after the soil sample was dried in Experimental example 4;
FIG. 6 is near infrared spectrum data after processing by SG smoothing algorithm in Experimental example 4;
FIG. 7 is data of near infrared spectrum after the processing by moving average method in Experimental example 4;
FIG. 8 is a graph showing data of a near infrared spectrum after being processed by a first derivative method in Experimental example 4;
FIG. 9 shows the data of the near infrared spectrum after the second derivative method in Experimental example 4;
FIG. 10 shows the near infrared spectrum data after standard normal variate (SNV) processing in Experimental example 4;
FIG. 11 is a schematic structural diagram of the ANN network in Experimental example 5;
FIG. 12 shows the prediction result when the modeling algorithm is MLR in Experimental example 5;
FIG. 13 shows the prediction result when the modeling algorithm is PCR in Experimental example 5;
FIG. 14 shows the prediction result when the modeling algorithm is PLSR in Experimental example 5;
FIG. 15 shows the prediction result when the modeling algorithm is SVR in Experimental example 5;
FIG. 16 shows the prediction result when the modeling algorithm is ANN in Experimental example 5.
Detailed Description
It should be noted that, in the embodiments, the algorithms of the steps of data acquisition, transmission, storage, processing, etc. which are not specifically described, and the hardware structures, circuit connections, etc. which are not specifically described, can be implemented by the contents disclosed in the prior art.
Example 1 Total Nitrogen content Infrared Spectroscopy method of Water-containing soil
An infrared spectrum analysis method for the total nitrogen content of water-containing soil comprises the following steps:
Step 1, collecting a near infrared spectrum of a soil sample to obtain near infrared spectrum raw data.
The specific method is as follows: potassium bromide is ground into powder in a mortar, the soil sample is added to the mortar and ground together with it, and the ground mixture is pressed into a tablet in a tablet press. After tableting, the tablet is measured in a Fourier transform infrared spectrometer (the instrument is first switched on and preheated for 10 minutes). Each sample is measured 3 times and the average is taken. Following this method, the near infrared spectra of a batch of samples collected from the Chengdu Plain (farmland in the Chongzhou demonstration area) were measured.
Step 2, converting the near infrared spectrum raw data obtained in step 1 into near infrared spectrum data of dry soil by a direct spectrum conversion algorithm.
Direct spectrum conversion (DS) is a method for converting infrared spectrum data by mathematical means. Specifically:
the conversion relation between the near infrared spectrum raw data and the near infrared spectrum data of dry soil is as follows:
$$S_1 = S_2 F + E$$
wherein $S_1$ is the near infrared spectrum data of dry soil and $S_2$ is the near infrared spectrum raw data; $S_1$ and $S_2$ are both m × p matrices, where m is the number of soil samples and p is the number of data points in the near infrared spectrum raw data;
F (p × p) is the spectrum transfer matrix, which describes the relationship between the spectrum data before and after conversion, and E (m × p) is the residual matrix, introduced as a correction error.
The spectrum transfer matrix and the residual matrix are determined as follows:
Step a, collecting the near infrared spectrum of the soil sample in the manner of step 1 to obtain near infrared spectrum raw data, denoted $S_1$, and mean-centering $S_1$.
Step b, drying the soil sample to obtain a dry soil sample, collecting the near infrared spectrum of the dry soil sample to obtain near infrared spectrum raw data of the dry soil, denoted $S_2'$, and mean-centering $S_2'$.
Step c, calculating the spectrum transfer matrix F and the residual matrix E, wherein the calculation formula is as follows:
wherein the mean vectors of $S_1$ and $S_2'$ are the row vectors formed by averaging the elements in each column of $S_1$ and $S_2'$, respectively; k (m × 1) is a column vector whose elements are all 1; and $d_s$ (p × 1) is the environmental baseline correction matrix.
The value of m is 50; when more than 50 soil samples are collected, 50 samples are selected from all the soil samples with the Kennard-Stone algorithm.
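The conversion in step 2 can be sketched as below. Because the calculation formula for F and E is not reproduced in this text, the sketch falls back on the commonly used closed form for this kind of spectrum transfer, which is an assumption: F is the pseudo-inverse of the mean-centered wet spectra multiplied by the mean-centered dry spectra, and the baseline term is the difference between the dry mean and the transformed wet mean. The names `wet`, `dry`, `fit_spectrum_transfer` and `convert_to_dry` are hypothetical.

```python
import numpy as np

def fit_spectrum_transfer(wet, dry):
    """Estimate the transfer matrix F (p x p) and baseline vector d_s (p,).

    wet, dry: m x p arrays of spectra measured on the same m soil samples
    before and after drying. The pseudo-inverse solution is an assumed,
    commonly used estimate, not necessarily the patent's own formula.
    """
    wet = np.asarray(wet, dtype=float)
    dry = np.asarray(dry, dtype=float)
    wet_mean = wet.mean(axis=0)
    dry_mean = dry.mean(axis=0)
    # Least-squares map from centered wet spectra to centered dry spectra.
    F = np.linalg.pinv(wet - wet_mean) @ (dry - dry_mean)
    # Baseline correction: what the linear map leaves unexplained in the means.
    d_s = dry_mean - wet_mean @ F
    return F, d_s

def convert_to_dry(wet_new, F, d_s):
    """Convert new wet-soil spectra to estimated dry-soil spectra."""
    return np.asarray(wet_new, dtype=float) @ F + d_s
```

In practice p is usually much larger than m, in which case the pseudo-inverse returns the minimum-norm solution; the conversion is then only as good as the m calibration samples are representative of new samples.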
Step 3, predicting the total nitrogen content of the soil sample from the near infrared spectrum data of dry soil obtained in step 2.
Specifically, step 3 comprises the following steps:
Step 3A, preprocessing the near infrared spectrum data of dry soil with the SG smoothing algorithm; the SG smoothing algorithm, namely Savitzky-Golay convolution smoothing, can be implemented with the existing software MATLAB.
Step 3B, predicting from the preprocessed near infrared spectrum data with a full-wavelength prediction model to obtain the total nitrogen content; the full-wavelength prediction model is built with the PLSR algorithm.
The number of principal components of the PLSR algorithm is set to 8. The PLSR algorithm, namely partial least squares regression, belongs to the prior art. During modeling, the samples are divided into a training set and a verification set in a 7:3 ratio, that is, 70% of the soil samples form the training set and 30% form the verification set. The near infrared spectra of the soil samples used for modeling are collected by the method of step 1 and preprocessed by the method of step 2, and their total nitrogen content is determined by the existing national standard method (HJ 717-2014).
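The modeling step can be sketched as a random 7:3 split followed by a single-response PLS regression with 8 components. The NIPALS-style PLS1 routine below is a generic textbook formulation written with NumPy only; it stands in for whatever PLSR implementation was actually used, and the function names and random seed are assumptions.

```python
import numpy as np

def split_train_validation(n_samples, train_fraction=0.7, seed=0):
    """Randomly split sample indices into training and verification sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(train_fraction * n_samples))
    return order[:n_train], order[n_train:]

def pls1_fit(X, y, n_components=8):
    """Fit a PLS1 model; returns (coefficients, x_mean, y_mean)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                 # weight: direction of max covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                    # scores
        tt = t @ t
        p = Xk.T @ t / tt             # X loadings
        c = (yk @ t) / tt             # y loading
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - c * t               # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, x_mean, y_mean

def pls1_predict(model, X):
    """Predict the response for new spectra X."""
    coef, x_mean, y_mean = model
    return (np.asarray(X, dtype=float) - x_mean) @ coef + y_mean
```

A typical use would be `train, val = split_train_validation(len(y))`, then `model = pls1_fit(X[train], y[train], n_components=8)` and `pls1_predict(model, X[val])`, where `X` holds the converted and smoothed spectra and `y` the reference total nitrogen values.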
Comparative example 1 Total Nitrogen content Infrared Spectroscopy method of Water-containing soil
An infrared spectrum analysis method for soil total nitrogen comprises the following steps:
Step 1, collecting a near infrared spectrum of a soil sample to obtain near infrared spectrum raw data.
The specific method is as follows: potassium bromide is ground into powder in a mortar, the soil sample is added to the mortar and ground together with it, and the ground mixture is pressed into a tablet in a tablet press. After tableting, the tablet is measured in a Fourier transform infrared spectrometer (the instrument is first switched on and preheated for 10 minutes). Each sample is measured 3 times and the average is taken.
Step 2, preprocessing the near infrared spectrum data with the SG smoothing algorithm.
Step 3, predicting from the preprocessed near infrared spectrum data with a full-wavelength prediction model to obtain the total nitrogen content.
The full-wavelength prediction model is built with the PLSR algorithm, and the number of principal components of the PLSR algorithm is set to 8. The PLSR algorithm, namely partial least squares regression, belongs to the prior art. During modeling, the samples are divided into a training set and a verification set in a 7:3 ratio, that is, 70% of the soil samples form the training set and 30% form the verification set. The near infrared spectra of the soil samples used for modeling are collected by the method of step 1 and preprocessed by the method of step 2, and their total nitrogen content is determined by the existing national standard method (HJ 717-2014).
In order to explain the technical effects of the present invention, the following further describes the technical solution of the present invention by experimental examples.
In the following experimental examples, the main statistical parameters used to compare the different methods are: the ratio of performance to deviation (RPD), the coefficient of determination $R^2$, the mean square error (MSE), the root mean square error (RMSE) and the correlation coefficient R.
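These parameters can be computed as in the sketch below. The definitions are the usual ones; in particular, RPD is taken here as the sample standard deviation of the reference values divided by the RMSE, which is an assumption because the text does not spell out its formula.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return R2, MSE, RMSE, R and RPD for predicted vs. reference values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_true - y_pred
    mse = float(np.mean(residual**2))
    rmse = float(np.sqrt(mse))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - float(np.sum(residual**2)) / ss_tot
    r = float(np.corrcoef(y_true, y_pred)[0, 1])
    # Assumed definition: SD of the reference values (ddof=1) over RMSE.
    rpd = float(np.std(y_true, ddof=1)) / rmse
    return {"R2": r2, "MSE": mse, "RMSE": rmse, "R": r, "RPD": rpd}
```

For reference values `[1, 2, 3, 4]` and predictions `[1.1, 1.9, 3.2, 3.8]`, the squared residuals sum to 0.1, so the MSE is 0.025 and $R^2$ is 1 - 0.1/5 = 0.98.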
Experimental example 1 comparison of Infrared Spectroscopy data
This experimental example examines the conversion effect of the DS algorithm. For 30 soil samples, near infrared spectrum raw data were collected according to step 1 of Example 1 (black curve in FIG. 1); near infrared spectrum raw data of the dry soil were collected according to step b of Example 1 (red curve in FIG. 1); and near infrared spectrum data of dry soil were obtained by conversion according to step 2 of Example 1 (gray curve in FIG. 1).
Comparing the red and gray curves in fig. 1, the two were found to be very close, indicating that conversion of infrared spectral data is feasible according to the method of example 1.
Experimental example 2 optimization of the value of m
The experimental example optimizes the value of the number m of the conversion sample sets, and the specific steps are as follows:
1. The data of 100 soil samples are divided into a training set and a test set in a 7:3 ratio by random division (specifically, using the randperm function in MATLAB).
2. Determine the optimal number of conversion samples.
The value of m is selected from [10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70]. For each value, the soil samples used to calculate F are selected from the training set with the Kennard-Stone algorithm, and the corresponding spectrum transfer matrix F is calculated with the DS algorithm described in Example 1. The near infrared spectrum data of the test set are then converted with F, and the spectral similarity coefficient $d_{ccsm}$ is calculated between the converted near infrared spectrum data of dry soil and the near infrared spectrum raw data of dry soil collected according to step b of Example 1 (that is, between the gray curve and the red curve in FIG. 1). In this example the number of soil samples is 100, and the final spectral similarity coefficient $d_{ccsm}$ is the mean of the spectral similarity coefficients of the 100 soil samples.
The spectral similarity coefficient $d_{ccsm}(s_1, s_2, n)$ belongs to the prior art and is calculated by the following formula:
wherein $\mathrm{cov}(s_1, s_2)$ is the covariance of the spectral vectors $s_1$ and $s_2$, and $\sigma(s_1)$ and $\sigma(s_2)$ are the standard deviations of $s_1$ and $s_2$, respectively. $s_1$ and $s_2$ denote the near infrared spectrum data of dry soil obtained after conversion and the near infrared spectrum raw data collected according to step b of Example 1, respectively; n is the number of data points; and $d_{ccsm}(s_1, s_2, n) \in [0, 1]$. The larger $d_{ccsm}(s_1, s_2, n)$ is, the better the two spectral curves fit each other; the smaller it is, the worse the fit.
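A similarity measure built from the covariance and standard deviations described above can be sketched as follows. The exact formula is not reproduced in this text, so the sketch computes the Pearson correlation $\mathrm{cov}(s_1, s_2) / (\sigma(s_1)\,\sigma(s_2))$ and clips negative values to 0 so that the result lies in [0, 1]; both the clipping and the function name `spectral_similarity` are assumptions, and the original coefficient may map to [0, 1] differently.

```python
import numpy as np

def spectral_similarity(s1, s2):
    """Correlation-based similarity of two spectra, clipped to [0, 1]."""
    s1 = np.asarray(s1, dtype=float)
    s2 = np.asarray(s2, dtype=float)
    # Population covariance and standard deviations over the n data points.
    cov = np.mean((s1 - s1.mean()) * (s2 - s2.mean()))
    r = cov / (s1.std() * s2.std())
    # Assumed mapping to [0, 1]: anti-correlated spectra count as dissimilar.
    return float(min(max(r, 0.0), 1.0))
```

Averaging this value over all test spectra gives one number per candidate m, which is how a curve such as the one in FIG. 2 could be produced.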
FIG. 2 shows the spectral similarity coefficients obtained for the different numbers of conversion samples. When the number of conversion samples is small, the spectral similarity coefficient $d_{ccsm}$ is small. As the number increases, $d_{ccsm}$ increases as well. When the number of conversion samples reaches 50, $d_{ccsm}$ reaches a peak; it then begins to fluctuate, dropping at 55 and rising slowly from 60 to 70. Based on these results, m = 50 is the preferred number of conversion samples.
Experimental example 3 Effect of near Infrared Spectroscopy data conversion on prediction accuracy
The experimental example discusses the influence of near infrared spectrum data conversion on the prediction accuracy, and compares the prediction accuracy of the example 1 and the comparative example 1 on the total nitrogen content of the soil sample, wherein the prediction result of the example 1 is shown in fig. 3, and the prediction result of the comparative example 1 is shown in fig. 4.
In Example 1, the coefficient of determination $R^2$ is 0.702, the correlation coefficient R is 0.838 (a high value), the RMSE is 200.89 mg/kg, and the RPD is 1.83; an RPD between 1.8 and 2.0 indicates that the model has potential for quantitative prediction.
In Comparative example 1, the coefficient of determination $R^2$ is 0.443, the correlation coefficient R is 0.666, the RMSE is 500.69 mg/kg, and the RPD is 1.05; an RPD between 1.0 and 1.4 indicates that the model predicts poorly.
From the comparison, in the example 1, the accuracy of predicting the total nitrogen content in the water-containing soil can be effectively improved by converting the near infrared spectrum original data of the soil sample.
Experimental example 4 comparison of preprocessing methods and selection of the number of principal components in PLSR Algorithm modeling
1. Specific procedure of this experimental example
An infrared spectrum analysis method of soil total nitrogen comprises the following steps:
Step 1, collecting the near infrared spectrum of the soil sample. The specific method is as follows: the soil sample is dried at 105 °C for 10 hours. Potassium bromide is ground to powder in a mortar, the dried soil sample is added and ground together with it, and the mixture is pressed into a pellet in a tablet press. The pellet is then measured in a Fourier transform infrared spectrometer (the instrument must be switched on and preheated for 10 minutes beforehand). Each sample is measured 3 times and the results are averaged. The near infrared spectra of a batch of samples collected from the Chengdu Plain (farmland in the Chongzhou demonstration area) and measured by this method are shown in Fig. 5.
Step 2, preprocessing the near infrared spectrum data.
Step 3, predicting from the preprocessed near infrared spectrum data with a full-wavelength prediction model to obtain the total nitrogen content result.
The full-wavelength prediction model is built with the PLSR algorithm. During modeling, the samples are divided into a training set and a validation set in a 7:3 ratio, that is, 70% of the soil samples form the training set and 30% form the validation set. The near infrared spectra of the soil samples used for modeling are collected by the method of step 1 and preprocessed by the method of step 2, and the total nitrogen content of these soil samples is measured by the existing national standard method (HJ 717-2014).
2. Selection of the preprocessing method
In this experimental example, the following preprocessing methods were compared for step 2: Savitzky-Golay (SG) smoothing, moving average, first derivative, second derivative and standard normal variate (SNV). The preprocessed spectral data are shown in Figs. 6-10. These preprocessing methods are prior art and can be implemented in MATLAB.
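The five preprocessing methods can be sketched as follows, with one spectrum per row. This is an illustrative NumPy version rather than the MATLAB routines used in the experiment; the window length of 5 and polynomial order of 2 are assumed values that the text does not state, and all function names are hypothetical.

```python
import numpy as np

def _filter_rows(spectra, weights):
    """Convolve every row with the given weights (zero padding at the edges)."""
    spectra = np.asarray(spectra, dtype=float)
    return np.apply_along_axis(
        lambda s: np.convolve(s, weights, mode="same"), 1, spectra)

def sg_smooth(spectra, window=5, polyorder=2):
    """Savitzky-Golay smoothing: local least-squares polynomial fit."""
    half = window // 2  # window is assumed to be odd
    z = np.arange(-half, half + 1, dtype=float)
    A = np.vander(z, polyorder + 1, increasing=True)
    # Row 0 of the pseudo-inverse gives the fitted value at the window center.
    weights = np.linalg.pinv(A)[0]
    return _filter_rows(spectra, weights)

def moving_average(spectra, window=5):
    """Centered moving average over `window` points."""
    return _filter_rows(spectra, np.ones(window) / window)

def derivative(spectra, order=1):
    """First (order=1) or second (order=2) derivative by finite differences."""
    out = np.asarray(spectra, dtype=float)
    for _ in range(order):
        out = np.gradient(out, axis=1)
    return out

def snv(spectra):
    """Standard normal variate: center and scale each spectrum individually."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std
```

Because the filters pad with zeros, the first and last `window // 2` points of each smoothed spectrum are distorted and are usually trimmed or handled separately.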
3. Optimization of the number of principal components in PLSR Algorithm modeling
The PLSR modeling procedure requires the number of principal components to be determined, which is a critical step of the algorithm. According to matrix theory, the order of the components reflects how much of the data's information each extracted score factor carries: the earlier a component, the larger its share of the information. Choosing the number of components is therefore important. Too many components defeat the purpose of dimensionality reduction and introduce useless noise, which degrades the prediction model and causes overfitting. Too few components discard much of the useful information in the spectrum and seriously impair the prediction performance and accuracy of the model.
In this experimental example, the number of components is selected by plotting the percentage of variance explained in the variables as a function of the number of components. The preliminary preferred range for the number of principal components is 6 to 10, and this range applies to each of the preprocessing methods.
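The variance-explained curve used for this selection can be sketched with a single-response PLS (PLS1) computed by the NIPALS iteration. This is an assumption about the computation: the experiment used MATLAB, the sketch only centers the data rather than standardizing it, and the function name `pls1_x_variance` is hypothetical.

```python
import numpy as np

def pls1_x_variance(X, y, n_components):
    """Cumulative fraction of X variance explained by each PLS1 component."""
    Xk = np.asarray(X, dtype=float)
    Xk = Xk - Xk.mean(axis=0)          # center the spectra
    yk = np.asarray(y, dtype=float)
    yk = yk - yk.mean()                # center the total nitrogen values
    total = np.sum(Xk ** 2)
    cumulative = []
    for _ in range(n_components):
        w = Xk.T @ yk                  # weight vector: covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                     # score vector of this component
        tt = t @ t
        p = Xk.T @ t / tt              # X loading
        q = (yk @ t) / tt              # y loading
        Xk = Xk - np.outer(t, p)       # deflate X
        yk = yk - q * t                # deflate y
        cumulative.append(1.0 - np.sum(Xk ** 2) / total)
    return np.array(cumulative)
```

Plotting the returned values against the component index gives the curve from which a range such as 6 to 10 components would be read off.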
4. Comparison of results
The RMSE and $R^2$ of the soil total nitrogen predictions of the best model built with each preprocessing method are shown in Table 1. The number of principal components listed in the table is the one that gave the best result for each preprocessing method after comparing RMSE and $R^2$.
TABLE 1 Total nitrogen modeling results of soil by different pretreatment methods
As can be seen from the above table, when SG smoothing is used for preprocessing and the number of principal components is set to 10, the validation-set RMSE of the resulting model is lower than that of the models built with the other preprocessing methods, and its validation-set $R^2$ is closer to 1. This indicates that this prediction model is more accurate. SG smoothing with 10 principal components is therefore the preferred choice.
Experimental example 5 comparison of full wavelength prediction model modeling algorithms
1. Modeling algorithm
In this experimental example, following the specific steps of Experimental example 4 with SG smoothing as the preprocessing method, the modeling algorithm in step 3 is varied and the prediction accuracy of the resulting models is compared. The algorithms compared are multiple linear regression (MLR), principal component regression (PCR), partial least squares regression (PLSR), support vector regression (SVR) and an artificial neural network (ANN). All of these algorithms are prior art and can be implemented in MATLAB.
Specifically, the method comprises the following steps:
(1) multiple linear regression algorithm (MLR)
The spectral data are used as the input variables and the total nitrogen content as the output. The samples are divided into a training set and a test set in a 7:3 ratio. The multiple linear regression model is implemented by calling a regression function in MATLAB.
(2) Principal component regression modeling analysis (PCR)
Following the steps of the principal component regression algorithm, the data are first standardized; the number of principal components is then determined; finally, regression analysis is performed with the chosen number of principal components. These steps are carried out with the zscore function, the pca function and a regression function in MATLAB. According to the plot of variance percentage against the number of components, the explained variance approaches 100% once the number of components exceeds 9, and the mean square error levels off and stabilizes after 9 components. Taking both into account, the number of principal components is set to 10.
(3) Partial least squares regression algorithm (PLSR)
Following the steps of the partial least squares algorithm, the first step is to standardize the data. The second step is to determine the number of components to extract; because the Y matrix contains only one variable, the total nitrogen content, no variable extraction is needed for Y. The third step is to perform regression analysis with the chosen number of components. According to the results of Experimental example 1, the number of principal components is set to 10. The calculations are carried out in MATLAB using the zscore function, the plsregress function and others.
(4) Support vector machine regression modeling analysis (SVR)
The algorithm is implemented with the LIBSVM library in MATLAB, using a radial basis function kernel, a penalty factor of 9.38 and a gamma value of 5.
(5) Artificial neural network Algorithm (ANN)
The algorithm is implemented in MATLAB. The sample set is divided randomly into a training set, a validation set and a test set in a ratio of 7:1.5:1.5. The number of hidden layers is the default value of 10, followed by an output layer; the specific network structure is shown in Fig. 11. In the figure, input denotes the input variables; the number of variables, 1272, corresponds to the near infrared spectrum matrix and equals the number of spectral wavelengths in this experiment. w is the weight vector, b is the bias coefficient, and output is the predicted nitrogen content of the soil sample. The optimal number of training epochs is 10.
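Of the five algorithms above, the PCR procedure (standardize, choose the number of components, regress on the scores) is sketched below in NumPy as a stand-in for the MATLAB zscore, pca and regression calls. The function names `pcr_fit` and `pcr_predict` are hypothetical, and the sketch assumes no wavelength has zero variance.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Fit a principal component regression model."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    mu = X.mean(axis=0)
    sigma = X.std(axis=0, ddof=1)
    Z = (X - mu) / sigma                       # z-score standardization
    # PCA of the standardized data through the singular value decomposition.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    V = Vt[:n_components].T                    # loadings of retained components
    scores = Z @ V
    # Ordinary least squares of y on the scores, with an intercept column.
    A = np.column_stack([np.ones(len(y)), scores])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return {"mu": mu, "sigma": sigma, "V": V, "coef": coef,
            "explained": explained}

def pcr_predict(model, X):
    """Predict total nitrogen content for new spectra."""
    Z = (np.asarray(X, dtype=float) - model["mu"]) / model["sigma"]
    return model["coef"][0] + (Z @ model["V"]) @ model["coef"][1:]
```

The `explained` array is the cumulative variance fraction per component, which is the quantity inspected in the text when settling on 10 components.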
2. Modeling results
The predicted results for MLR, PCR, PLSR, SVR, and ANN are shown in FIGS. 12-16, respectively.
The prediction results of the specific modeling algorithms on the test set are shown in table 2.
TABLE 2 modeling results of different spectral modeling algorithms
The evaluation indexes in the table are compared: the correlation coefficient R, the coefficient of determination $R^2$, the root mean square error RMSE and the relative analysis error RPD. The PLSR and ANN algorithms outperform the other three algorithms on these indexes and can therefore be considered the better choices. Although the correlation coefficient of the ANN algorithm (0.933) is higher than that of PLSR (0.883), the RMSE of PLSR (55.97 mg/kg) is significantly smaller than that of ANN (85.13 mg/kg); the RMSE represents the deviation between the predicted and measured values of the model. The relative analysis error of PLSR is also higher than that of ANN. In conclusion, the model built with the PLSR algorithm has the best prediction performance.
As the above examples and experimental examples show, the infrared spectroscopic analysis method for total nitrogen content of water-containing soil adds a step of mathematically transforming the near infrared spectrum data of the water-containing soil. Using the near infrared spectrum data of dry soil obtained from this conversion, the total nitrogen content of water-containing soil can be predicted more accurately. When used to analyze the total nitrogen content of water-containing soil, the method reduces the consumption of manpower and resources and offers high efficiency and low cost, so it has good application prospects in agriculture, environmental protection, biological research and other fields.
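The conversion step summarized above, written in claim 3 as S1 = S2F + E, can be sketched as follows. This assumes the least-squares (pseudo-inverse) estimate of F on mean-centered paired spectra that is typical of direct standardization; the exact formula used for F and E may differ. The names `fit_direct_conversion`, `convert_to_dry`, `S_wet` (for S2) and `S_dry` (for S1) are hypothetical.

```python
import numpy as np

def fit_direct_conversion(S_wet, S_dry):
    """Estimate the transfer matrix F from m paired wet and dry spectra."""
    S_wet = np.asarray(S_wet, dtype=float)   # raw spectra, shape (m, p)
    S_dry = np.asarray(S_dry, dtype=float)   # dry-soil spectra, shape (m, p)
    wet_mean = S_wet.mean(axis=0)
    dry_mean = S_dry.mean(axis=0)
    wet_c = S_wet - wet_mean                 # centering pretreatment
    dry_c = S_dry - dry_mean
    # Assumption: least-squares solution of dry_c = wet_c @ F.
    F = np.linalg.pinv(wet_c) @ dry_c
    # Residual matrix left after the conversion of the paired samples.
    E = dry_c - wet_c @ F
    return F, E, wet_mean, dry_mean

def convert_to_dry(S_new, F, wet_mean, dry_mean):
    """Convert new water-containing soil spectra to dry-soil spectra."""
    return (np.asarray(S_new, dtype=float) - wet_mean) @ F + dry_mean
```

When the number of wavelengths p exceeds the number of paired samples m, as with 1272 wavelengths and 50 samples, the pseudo-inverse returns the minimum-norm solution, so F is not unique and the quality of the conversion depends on how representative the paired samples are.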
Claims (10)
1. An infrared spectrum analysis method for total nitrogen content of water-containing soil is characterized by comprising the following steps:
step 1, collecting a near infrared spectrum of a soil sample to obtain original data of the near infrared spectrum;
step 2, converting the near infrared spectrum original data obtained in the step 1 into near infrared spectrum data of dry soil by adopting a direct spectrum conversion algorithm;
step 3, predicting the total nitrogen content of the soil sample from the near infrared spectrum data of dry soil obtained in step 2.
2. The analytical method of claim 1, wherein: the water content of the soil sample is 5%-35%, and/or the soil sample is collected from the Chengdu Plain.
3. The analytical method of claim 1, wherein: in step 2, in the direct spectrum conversion algorithm, a conversion relationship between the near infrared spectrum original data and the near infrared spectrum data of the dry soil is as follows:
$$S_1 = S_2 F + E$$
wherein $S_1$ is the near infrared spectrum data of dry soil and $S_2$ is the near infrared spectrum raw data; $S_1$ and $S_2$ are both matrices of dimension $m \times p$, where $m$ is the number of soil samples and $p$ is the number of data points of the near infrared spectrum raw data; $F$ is the spectrum transfer matrix and $E$ is the residual matrix.
4. The analytical method of claim 3, wherein: the spectrum transfer matrix and the residual error matrix are determined by the following method:
step a, collecting the near infrared spectrum of a soil sample to obtain near infrared spectrum raw data, denoted $S_1$, and centering $S_1$ to obtain the centered data;
step b, drying the soil sample to obtain a dry soil sample, collecting the near infrared spectrum of the dry soil sample to obtain near infrared spectrum raw data of the dry soil, denoted $S_2'$, and centering $S_2'$ to obtain the centered data;
Step c, calculating to obtain a spectrum transfer matrix F and a residual error matrix E, wherein the calculation formula is as follows:
5. The analytical method of claim 3, wherein: the value of m is 50, and when more than 50 soil samples are collected, 50 samples are selected from all the soil samples using the Kennard-Stone algorithm.
6. The analytical method of claim 1, wherein step 3 comprises the steps of:
step 3A, preprocessing the near infrared spectrum data of the dry soil through an SG smoothing algorithm;
step 3B, predicting from the near infrared spectrum data preprocessed in step 3A with a prediction model to obtain the total nitrogen content result.
7. The analytical method of claim 6, wherein step 3B comprises the steps of: predicting the preprocessed near infrared spectrum data by adopting a full-wavelength prediction model to obtain a total nitrogen content result; the full-wavelength prediction model is obtained through modeling by a PLSR algorithm or an ANN algorithm.
8. A computer apparatus for infrared spectroscopic analysis of total nitrogen in soil comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program implements a method for infrared spectroscopic analysis of total nitrogen content in hydrous soil as claimed in any one of claims 1 to 7.
9. An infrared spectroscopic analysis system for total nitrogen in soil, comprising:
infrared spectrum acquisition and/or input means for acquiring and/or inputting near infrared spectrum data;
the computer apparatus of claim 8, configured to analyze the near infrared spectral data to obtain a total nitrogen content result.
10. A computer-readable storage medium characterized by: stored thereon is a computer program for carrying out the method for the infrared spectroscopic analysis of the total nitrogen content of aqueous soils according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110998389.3A CN113916822A (en) | 2021-08-27 | 2021-08-27 | Infrared spectroscopic analysis method for total nitrogen content of water-containing soil |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113916822A true CN113916822A (en) | 2022-01-11 |
Family
ID=79233355
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113916822A (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103712923A (en) * | 2013-12-23 | 2014-04-09 | 浙江大学 | Method for eliminating moisture influence factor in field in-situ soil measurement spectrums |
CN103884661A (en) * | 2014-02-21 | 2014-06-25 | 浙江大学 | Soil total nitrogen real-time detection method based on soil visible-near infrared spectrum library |
CN106990056A (en) * | 2017-04-20 | 2017-07-28 | 武汉大学 | A kind of total soil nitrogen spectrum appraising model calibration samples collection construction method |
CN107421911A (en) * | 2017-05-10 | 2017-12-01 | 浙江大学 | A kind of preprocess method of the soil nitrogen detection based on portable near infrared spectrometer |
CN107505179A (en) * | 2017-09-01 | 2017-12-22 | 浙江大学 | A kind of soil pretreatment and nutrient near infrared spectrum detection method |
CN108982406A (en) * | 2018-07-06 | 2018-12-11 | 浙江大学 | A kind of soil nitrogen near-infrared spectral characteristic band choosing method based on algorithm fusion |
CN108982407A (en) * | 2018-07-06 | 2018-12-11 | 浙江大学 | A method of probing into the soil optimum moisture content of detection soil nitrogen using near infrared spectrum |
CN109374860A (en) * | 2018-11-13 | 2019-02-22 | 西北大学 | A kind of soil nutrient prediction and integrated evaluating method based on machine learning algorithm |
CN110455726A (en) * | 2019-07-30 | 2019-11-15 | 北京安赛博技术有限公司 | A kind of method of real-time Forecasting Soil Moisture and total nitrogen content |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||