CN113916822A - Infrared spectroscopic analysis method for total nitrogen content of water-containing soil

Publication number
CN113916822A
Authority
CN
China
Prior art keywords
near infrared
infrared spectrum
soil
data
total nitrogen
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110998389.3A
Other languages
Chinese (zh)
Inventor
冷庚
刘哲
许文波
贾海涛
罗欣
常乐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yangtze River Delta Research Institute of UESTC Huzhou
Original Assignee
Yangtze River Delta Research Institute of UESTC Huzhou
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yangtze River Delta Research Institute of UESTC Huzhou
Priority to CN202110998389.3A
Publication of CN113916822A
Legal status: Pending

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/17 Systems in which incident light is modified in accordance with the properties of the material investigated
    • G01N21/25 Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
    • G01N21/31 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry
    • G01N21/35 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light
    • G01N21/3563 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light for analysing solids; Preparation of samples therefor
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/01 Arrangements or apparatus for facilitating the optical investigation

Landscapes

  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Investigating Or Analysing Materials By Optical Means (AREA)

Abstract

The invention belongs to the technical field of soil total nitrogen content analysis, and in particular relates to an infrared spectroscopic analysis method for the total nitrogen content of water-containing soil. The method comprises the following steps: step 1, collecting a near-infrared spectrum of a soil sample to obtain raw near-infrared spectral data; step 2, converting the raw near-infrared spectral data obtained in step 1 into near-infrared spectral data of dry soil using a direct spectrum conversion algorithm; and step 3, predicting the total nitrogen content of the soil sample from the near-infrared spectral data of dry soil obtained in step 2. The method predicts the total nitrogen content of water-containing soil more accurately, reduces the consumption of manpower and resources, and is efficient and low in cost, so it has good application prospects in agriculture, environmental protection, biological research and other fields.

Description

Infrared spectroscopic analysis method for total nitrogen content of water-containing soil
Technical Field
The invention belongs to the technical field of soil total nitrogen content analysis, and in particular relates to an infrared spectroscopic analysis method for the total nitrogen content of water-containing soil.
Background
The content of nutrient elements in soil not only affects the growth of vegetation and crops, but also affects the regional ecological quality and the distribution of animal and plant populations. Therefore, the determination of nutrient elements in soil has become an urgent need in the fields of modern environmental science, agricultural science, ecology and the like.
Nitrogen is a basic element required for biochemical reactions in many organisms. It is one of the four basic elements constituting biomolecules such as DNA and RNA, and is also a constituent element of proteins. Nitrogen also plays a role in plant photosynthesis, where it is used to produce chlorophyll molecules. Soil nitrogen (total nitrogen) promotes the growth and development of the leaves, roots and stems of crops, so total nitrogen influences crop growth quality. Determining the total nitrogen content of soil is therefore very important for crop growth, related scientific research, environmental monitoring, and the like.
At present, the national standard method (HJ 717-.
Therefore, other techniques for analyzing total nitrogen in soil have been developed. Near-infrared spectroscopy is a physical analysis method based on spectroscopic technology. It offers high analysis speed, low cost, no reagent consumption, simultaneous measurement of multiple components, and small, portable instruments suitable for field analysis, and has therefore attracted wide attention. Chinese patent application CN108982406A discloses a method for analyzing the total nitrogen content of soil using near-infrared spectral data, in which backward interval partial least squares (BiPLS) and competitive adaptive reweighted sampling (CARS) are used to select, respectively, the characteristic intervals and the characteristic variables of the soil near-infrared spectrum; the results of the two algorithms are optimized and fused to determine the characteristic interval of the soil near-infrared spectrum; and the PLS algorithm is then used again to establish a prediction model between the characteristic-band spectrum and the soil nitrogen content.
However, soil usually contains water, and its water content varies greatly with time, region, and other factors. According to research results in the field, the presence of water in a soil sample may affect the infrared spectrum in the following ways: (1) soil moisture can affect the chemical properties of the soil, which changes the near-infrared spectral values; (2) an increase in moisture content reduces the reflectivity of the soil, which changes the intensity of the collected infrared spectral data. Existing soil nitrogen content prediction models, however, do not consider the influence of water content on the infrared spectrum. Therefore, when an existing prediction model is used to predict the nitrogen content of soil, prediction accuracy inevitably suffers for soil samples whose water content deviates substantially from that of the modeling samples.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides an infrared spectrum analysis method for the total nitrogen content of water-containing soil, aiming at accurately analyzing the total nitrogen content of the water-containing soil.
An infrared spectrum analysis method for total nitrogen content of hydrous soil comprises the following steps:
step 1, collecting a near infrared spectrum of a soil sample to obtain original data of the near infrared spectrum;
step 2, converting the near infrared spectrum original data obtained in the step 1 into near infrared spectrum data of dry soil by adopting a direct spectrum conversion algorithm;
step 3, predicting the total nitrogen content of the soil sample from the near-infrared spectral data of dry soil obtained in step 2.
Preferably, the water content of the soil sample is 5%-35%, and/or the soil sample is collected from the Chengdu Plain.
Preferably, in step 2, the direct spectrum conversion algorithm uses the following conversion relationship between the raw near-infrared spectral data and the near-infrared spectral data of dry soil:
S_1 = S_2 F + E
where S_1 is the near-infrared spectral data of dry soil and S_2 is the raw near-infrared spectral data; S_1 and S_2 are m × p matrices, where m is the number of soil samples and p is the number of data points in the raw near-infrared spectral data; F is the spectrum transfer matrix and E is the residual matrix.
Preferably, the spectrum transfer matrix and the residual matrix are determined by the following method:
Step a, collecting the near-infrared spectrum of the soil sample to obtain raw near-infrared spectral data, denoted S_1; S_1 is mean-centered to obtain
Figure BDA0003234606920000021
Step b, drying the soil sample to obtain a dry soil sample, and collecting the near-infrared spectrum of the dry soil sample to obtain raw near-infrared spectral data of dry soil, denoted S_2'; S_2' is mean-centered to obtain
Figure BDA0003234606920000022
Step c, calculating to obtain a spectrum transfer matrix F and a residual error matrix E, wherein the calculation formula is as follows:
Figure BDA0003234606920000023
Figure BDA0003234606920000024
Figure BDA0003234606920000025
where
Figure BDA0003234606920000031
and
Figure BDA0003234606920000032
are, respectively, the row vectors formed by averaging the elements in each column of S_1 and S_2'; k is a column vector whose elements are all 1; and d_s is the environmental baseline correction matrix.
Preferably, the value of m is 50, and when the number of the collected soil samples is more than 50, 50 samples are selected from all the soil samples by using a Kennard-Stone algorithm.
Preferably, step 3 comprises the steps of:
step 3A, preprocessing the near-infrared spectral data of dry soil with the Savitzky-Golay (SG) smoothing algorithm;
step 3B, feeding the near-infrared spectral data preprocessed in step 3A into a prediction model to obtain the total nitrogen content.
Preferably, step 3B includes the steps of: predicting the preprocessed near infrared spectrum data by adopting a full-wavelength prediction model to obtain a total nitrogen content result; the full-wavelength prediction model is obtained through modeling by a PLSR algorithm or an ANN algorithm.
The invention also provides computer equipment for infrared spectrum analysis of soil total nitrogen, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, and is characterized in that the processor realizes the infrared spectrum analysis method of the total nitrogen content of the water-containing soil when executing the program.
The invention also provides an infrared spectroscopic analysis system for soil total nitrogen, which is characterized by comprising the following components:
infrared spectrum acquisition and/or input means for acquiring and/or inputting near infrared spectrum data;
the computer equipment is used for analyzing the near infrared spectrum data to obtain a total nitrogen content result.
The present invention also provides a computer-readable storage medium characterized in that: on which a computer program is stored for implementing the above-mentioned method for the infrared spectroscopic analysis of the total nitrogen content of hydrous soil.
The analysis method can transform the infrared spectrum of the water-containing soil into the infrared spectrum of the dry soil in a mathematical conversion mode through a direct spectrum conversion algorithm, and can obtain a more accurate analysis result of the total nitrogen content of the soil by utilizing the infrared spectrum of the dry soil obtained after the conversion.
In general, when a laboratory needs to analyze spectral data of dry soil, experimental steps such as air drying or oven drying are required. With the method of the invention, however, infrared spectroscopic analysis of dry soil can be carried out without these drying steps. This saves considerable manpower and resources, improves the efficiency of the related research work, and reduces cost. The method therefore has good application prospects in research on the total nitrogen content of soil.
Obviously, many modifications, substitutions, and variations are possible in light of the above teachings of the invention, without departing from the basic technical spirit of the invention, as defined by the following claims.
The present invention will be described in further detail with reference to the following examples. This should not be understood as limiting the scope of the above-described subject matter of the present invention to the following examples. All the technologies realized based on the above contents of the present invention belong to the scope of the present invention.
Drawings
FIG. 1 is a comparison of the different near-infrared spectral data in Example 1;
FIG. 2 shows the spectral similarity coefficient results in Experimental example 2;
FIG. 3 shows the predicted total nitrogen content of the soil samples in Example 1;
FIG. 4 shows the predicted total nitrogen content of the soil samples in Comparative example 1;
FIG. 5 shows the raw near-infrared spectral data collected after drying the soil samples in Experimental example 4;
FIG. 6 shows the near-infrared spectral data after SG smoothing in Experimental example 4;
FIG. 7 shows the near-infrared spectral data after moving-average processing in Experimental example 4;
FIG. 8 shows the near-infrared spectral data after first-derivative processing in Experimental example 4;
FIG. 9 shows the near-infrared spectral data after second-derivative processing in Experimental example 4;
FIG. 10 shows the near-infrared spectral data after standard normal variate (SNV) processing in Experimental example 4;
FIG. 11 is a schematic structural diagram of the ANN network in Experimental example 5;
FIG. 12 shows the prediction results with MLR as the modeling algorithm in Experimental example 5;
FIG. 13 shows the prediction results with PCR as the modeling algorithm in Experimental example 5;
FIG. 14 shows the prediction results with PLSR as the modeling algorithm in Experimental example 5;
FIG. 15 shows the prediction results with SVR as the modeling algorithm in Experimental example 5;
FIG. 16 shows the prediction results with ANN as the modeling algorithm in Experimental example 5.
Detailed Description
It should be noted that, in the embodiments, the algorithms of the steps of data acquisition, transmission, storage, processing, etc. which are not specifically described, and the hardware structures, circuit connections, etc. which are not specifically described, can be implemented by the contents disclosed in the prior art.
Example 1 Infrared spectroscopic analysis method for total nitrogen content of water-containing soil
An infrared spectrum analysis method for total nitrogen content of hydrous soil comprises the following steps:
step 1, collecting a near infrared spectrum of a soil sample to obtain original data of the near infrared spectrum;
The specific method is as follows: potassium bromide is ground into powder in a mortar, the soil sample is added to the mortar and ground together with it, and the ground mixture is pressed into a pellet in a tablet press. After pressing, the pellet is measured in a Fourier transform infrared spectrometer (the instrument must first be switched on and preheated for 10 minutes). Each sample was measured 3 times and the average was taken. The near-infrared spectra of a batch of samples collected from the Chengdu Plain (farmland in the Chongzhou demonstration area) were measured by this method.
Step 2, converting the raw near-infrared spectral data obtained in step 1 into near-infrared spectral data of dry soil using a direct spectrum conversion algorithm.
Direct spectrum conversion (DS) is a method for converting infrared spectral data by mathematical means. Specifically:
the conversion relationship between the raw near-infrared spectral data and the near-infrared spectral data of dry soil is as follows:
S_1 = S_2 F + E
where S_1 is the near-infrared spectral data of dry soil and S_2 is the raw near-infrared spectral data; S_1 and S_2 are m × p matrices, where m is the number of soil samples and p is the number of data points in the raw near-infrared spectral data;
F (p × p) is the spectrum transfer matrix, which describes the relationship between the spectral data before and after conversion, and E (m × p) is the residual matrix, introduced as a correction error.
The spectrum transfer matrix and the residual error matrix are determined by the following method:
Step a, collecting the near-infrared spectrum of the soil sample in the manner of step 1 to obtain raw near-infrared spectral data, denoted S_1; S_1 is mean-centered to obtain
Figure BDA0003234606920000051
Step b, drying the soil sample to obtain a dry soil sample, and collecting the near-infrared spectrum of the dry soil sample to obtain raw near-infrared spectral data of dry soil, denoted S_2'; S_2' is mean-centered to obtain
Figure BDA0003234606920000052
Step c, calculating to obtain a spectrum transfer matrix F and a residual error matrix E, wherein the calculation formula is as follows:
Figure BDA0003234606920000053
Figure BDA0003234606920000054
Figure BDA0003234606920000055
where
Figure BDA0003234606920000056
and
Figure BDA0003234606920000057
are, respectively, the row vectors formed by averaging the elements in each column of S_1 and S_2'; k (m × 1) is a column vector whose elements are all 1; and d_s (p × 1) is the environmental baseline correction matrix.
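The formula images for step c are not reproduced in this text, so the computation below is only a minimal sketch of the direct standardization procedure as it is commonly described in the chemometrics literature. It assumes that F is the least-squares solution relating the mean-centered wet-soil spectra to the mean-centered dry-soil spectra, and that the baseline term is the part of the dry-soil mean not explained by the transferred wet-soil mean. The names `fit_ds`, `apply_ds`, `wet` and `dry` are illustrative and do not come from the patent, and the data are synthetic.

```python
import numpy as np

def fit_ds(wet, dry):
    """Fit a transfer from wet-soil spectra to dry-soil spectra.

    wet, dry: (m, p) arrays holding paired spectra, one sample per row.
    Returns the transfer matrix F (p, p) and a baseline row vector d (p,).
    """
    wet_mean = wet.mean(axis=0)      # column means of the wet spectra
    dry_mean = dry.mean(axis=0)      # column means of the dry spectra
    wet_c = wet - wet_mean           # mean-centering
    dry_c = dry - dry_mean
    # Assumption: least-squares estimate via the pseudo-inverse
    F = np.linalg.pinv(wet_c) @ dry_c
    # Assumption: baseline chosen so that the column means map onto each other
    d = dry_mean - wet_mean @ F
    return F, d

def apply_ds(spectra, F, d):
    """Convert wet-soil spectra into estimated dry-soil spectra."""
    # Broadcasting d over the rows plays the role of the all-ones vector k
    return spectra @ F + d

# Synthetic demonstration: wet spectra are an exact affine distortion of dry ones
rng = np.random.default_rng(0)
dry = rng.normal(size=(50, 8))                  # m = 50 samples, p = 8 points
mix = rng.normal(size=(8, 8)) + 3.0 * np.eye(8)
wet = dry @ mix + rng.normal(size=8)

F, d = fit_ds(wet, dry)
converted = apply_ds(wet, F, d)
print(np.abs(converted - dry).max())
```

Because the synthetic distortion is exactly affine, the converted spectra reproduce the dry spectra up to rounding error; real soil spectra would leave a non-zero residual, which is what the residual matrix E accounts for.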
The value of m is 50; when more than 50 soil samples are collected, 50 samples are selected from all the soil samples using the Kennard-Stone algorithm.
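The text names the Kennard-Stone algorithm without describing it. A minimal sketch of the usual formulation is given below, assuming Euclidean distance between spectra: the two most distant samples are chosen first, and each further sample is the one whose nearest already-selected neighbour is farthest away. The function name `kennard_stone` and the toy data are illustrative only.

```python
import math

def kennard_stone(samples, n_select):
    """Return the indices of n_select samples chosen by the Kennard-Stone rule."""
    n = len(samples)
    if n_select >= n:
        return list(range(n))
    # Pairwise Euclidean distances (assumed metric)
    dist = [[math.dist(a, b) for b in samples] for a in samples]
    # Start from the two mutually most distant samples
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = set(range(n)) - set(selected)
    while len(selected) < n_select:
        # Add the candidate whose nearest selected sample is farthest away
        nxt = max(remaining, key=lambda c: min(dist[c][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected[:n_select]

# Toy one-dimensional "spectra"
points = [[0.0], [1.0], [2.0], [10.0], [4.0]]
print(kennard_stone(points, 4))
```

In the setting of this example, `samples` would be the rows of the raw spectral matrix and `n_select` would be 50.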
Step 3, predicting the total nitrogen content of the soil sample from the near-infrared spectral data of dry soil obtained in step 2.
Specifically, the step 3 comprises the following steps:
Step 3A, preprocessing the near-infrared spectral data of dry soil with the SG smoothing algorithm; the SG smoothing algorithm, i.e. Savitzky-Golay convolution smoothing, can be implemented with the existing software MATLAB.
Step 3B, adopting a full-wavelength prediction model to predict the preprocessed near infrared spectrum data to obtain a total nitrogen content result; the full-wavelength prediction model is obtained through modeling by a PLSR algorithm.
The number of principal components of the PLSR algorithm is set to 8. The PLSR algorithm, i.e. partial least squares regression, belongs to the prior art. During modeling, the samples are divided into a training set and a validation set at a ratio of 7:3, that is, 70% of the soil samples are used as the training set and 30% as the validation set. The near-infrared spectra of the soil samples required for modeling are collected by the method of step 1 and preprocessed by the method of step 2, and the total nitrogen content of the soil samples required for modeling is determined by the existing national standard method (HJ 717-2014).
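The 7:3 division can be sketched as follows. This is an assumption-laden illustration: the example does not state how the samples are assigned, so a random permutation is used here, in line with the random division mentioned in Experimental example 2. The function name `split_7_3` and the fixed seed are hypothetical.

```python
import random

def split_7_3(n_samples, seed=0):
    """Randomly split sample indices into 70% training and 30% validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)   # reproducible random permutation
    n_train = (7 * n_samples) // 10        # integer arithmetic avoids rounding issues
    return indices[:n_train], indices[n_train:]

train_idx, valid_idx = split_7_3(100)
print(len(train_idx), len(valid_idx))
```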
Comparative example 1 Infrared spectroscopic analysis method for total nitrogen content of water-containing soil
An infrared spectrum analysis method of soil total nitrogen comprises the following steps:
step 1, collecting a near infrared spectrum of a soil sample to obtain near infrared spectrum data.
The specific method is as follows: potassium bromide is ground into powder in a mortar, the soil sample is added to the mortar and ground together with it, and the ground mixture is pressed into a pellet in a tablet press. After pressing, the pellet is measured in a Fourier transform infrared spectrometer (the instrument must first be switched on and preheated for 10 minutes). Each sample was measured 3 times and the average was taken.
Step 2, preprocessing the near-infrared spectral data with the SG smoothing algorithm.
Step 3, feeding the preprocessed near-infrared spectral data into a full-wavelength prediction model to obtain the total nitrogen content.
The full-wavelength prediction model is obtained by modeling with the PLSR algorithm, with the number of principal components set to 8. The PLSR algorithm, i.e. partial least squares regression, belongs to the prior art. During modeling, the samples are divided into a training set and a validation set at a ratio of 7:3, that is, 70% of the soil samples are used as the training set and 30% as the validation set. The near-infrared spectra of the soil samples required for modeling are collected by the method of step 1 and preprocessed by the method of step 2, and the total nitrogen content of the soil samples required for modeling is determined by the existing national standard method (HJ 717-2014).
In order to explain the technical effects of the present invention, the following further describes the technical solution of the present invention by experimental examples.
In the following experimental examples, the main statistical parameters used to compare the different methods are: the residual predictive deviation (RPD), the coefficient of determination R^2, the mean square error (MSE), the root mean square error (RMSE), and the correlation coefficient R.
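The text does not give formulas for these parameters. The sketch below uses common definitions as assumptions: R^2 is computed as 1 - SSE/SST, R is the Pearson correlation between measured and predicted values, and RPD is the sample standard deviation of the measured values divided by the RMSE. The function name `regression_metrics` and the toy numbers are illustrative.

```python
import math
import statistics

def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, R, R^2 and RPD under the definitions assumed above."""
    n = len(y_true)
    mean_t = statistics.mean(y_true)
    mean_p = statistics.mean(y_pred)
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    sst = sum((t - mean_t) ** 2 for t in y_true)             # total sum of squares
    spp = sum((p - mean_p) ** 2 for p in y_pred)
    stp = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    mse = sse / n
    rmse = math.sqrt(mse)
    return {
        "MSE": mse,
        "RMSE": rmse,
        "R": stp / math.sqrt(sst * spp),           # Pearson correlation
        "R2": 1.0 - sse / sst,                     # coefficient of determination
        "RPD": statistics.stdev(y_true) / rmse,    # SD of reference values / RMSE
    }

# Toy values: every prediction is too high by exactly 1
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
print(metrics)
```

With these definitions R^2 need not equal the square of R; in the toy case the predictions are perfectly correlated with the measurements but biased, so R is 1 while R^2 is much lower.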
Experimental example 1 comparison of Infrared Spectroscopy data
This experimental example examines the conversion effect of the DS algorithm. Raw near-infrared spectral data were acquired for 30 soil samples according to step 1 of Example 1, and the result is shown as the black curve in FIG. 1; raw near-infrared spectral data of the dry soil were collected according to step b of Example 1, and the result is shown as the red curve in FIG. 1; near-infrared spectral data of dry soil were obtained by conversion according to step 2 of Example 1, and the result is shown as the gray curve in FIG. 1.
Comparing the red and gray curves in fig. 1, the two were found to be very close, indicating that conversion of infrared spectral data is feasible according to the method of example 1.
Experimental example 2 optimization of the value of m
The experimental example optimizes the value of the number m of the conversion sample sets, and the specific steps are as follows:
1. The data for 100 soil samples are divided into a training set and a test set at a ratio of 7:3 by random division (specifically, using the randperm function in MATLAB).
2. Determine the optimal number of samples in the conversion set.
The value of m is selected from m = [10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70]. The soil samples used to calculate F are selected from the training set using the Kennard-Stone algorithm, the corresponding spectrum transfer matrix F is calculated using the DS algorithm described in Example 1, and the near-infrared spectral data of the test set are converted using F. The spectral similarity coefficient d_ccsm is then calculated between the converted near-infrared spectral data of dry soil and the raw near-infrared spectral data of dry soil acquired according to step b of Example 1 (i.e. the gray curve and the red curve in FIG. 1). In this example the number of soil samples was 100, and the final spectral similarity coefficient d_ccsm is the mean of the spectral similarity coefficients of the 100 soil samples.
The spectral similarity coefficient d_ccsm(s_1, s_2, n) belongs to the prior art and is calculated by the following formulas:
Figure BDA0003234606920000071
Figure BDA0003234606920000081
where cov(s_1, s_2) is the covariance of the spectral vectors s_1 and s_2, and σ(s_1) and σ(s_2) are the standard deviations of s_1 and s_2, respectively. s_1 and s_2 denote, respectively, the near-infrared spectral data of dry soil obtained after conversion and the raw near-infrared spectral data acquired according to step b of Example 1; n is the number of data points; d_ccsm(s_1, s_2, n) ∈ [0, 1]. The larger d_ccsm(s_1, s_2, n) is, the better the two spectral curves fit; the smaller it is, the worse the fit.
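The formula images for d_ccsm are not reproduced in this text. The sketch below therefore assumes, from the quantities defined above, that the coefficient is the covariance of the two spectra divided by the product of their standard deviations, i.e. a correlation coefficient. Under that assumption the value lies in [-1, 1]; the range [0, 1] stated above would then hold for positively correlated spectra, or the original formula may include a rescaling that cannot be recovered here. The function name `spectral_similarity` is illustrative.

```python
import math

def spectral_similarity(s1, s2):
    """Assumed similarity: cov(s1, s2) / (sigma(s1) * sigma(s2))."""
    n = len(s1)
    mean1 = sum(s1) / n
    mean2 = sum(s2) / n
    # Sample covariance and sample standard deviations over the n data points
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(s1, s2)) / (n - 1)
    sd1 = math.sqrt(sum((a - mean1) ** 2 for a in s1) / (n - 1))
    sd2 = math.sqrt(sum((b - mean2) ** 2 for b in s2) / (n - 1))
    return cov / (sd1 * sd2)

# Two toy spectra with identical shape but different scale
print(spectral_similarity([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]))
```

Averaging this value over all test samples would give the mean coefficient that the experiment compares across different values of m.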
The spectral similarity coefficients obtained for different conversion set sizes are shown in FIG. 2. It can be seen that d_ccsm is small when the conversion set is small and increases as the conversion set grows. When the conversion set size reaches 50, d_ccsm peaks; thereafter it fluctuates, with a dip at 55. From 60 to 70, d_ccsm rises slowly. Taking these results together, m = 50 is the better choice for the conversion set size.
Experimental example 3 Effect of near Infrared Spectroscopy data conversion on prediction accuracy
The experimental example discusses the influence of near infrared spectrum data conversion on the prediction accuracy, and compares the prediction accuracy of the example 1 and the comparative example 1 on the total nitrogen content of the soil sample, wherein the prediction result of the example 1 is shown in fig. 3, and the prediction result of the comparative example 1 is shown in fig. 4.
In Example 1, the coefficient of determination R^2 is 0.702, the correlation coefficient R is 0.838 (a high value), the RMSE is 200.89 mg/kg, and the RPD is 1.83; an RPD between 1.8 and 2.0 indicates that the model has potential as a quantitative predictor.
In Comparative example 1, the coefficient of determination R^2 is 0.443, the correlation coefficient R is 0.666, the RMSE is 500.69 mg/kg, and the RPD is 1.05; an RPD between 1.0 and 1.4 indicates that the model predicts poorly.
The comparison shows that converting the raw near-infrared spectral data of the soil sample, as in Example 1, effectively improves the accuracy of predicting the total nitrogen content of water-containing soil.
Experimental example 4 comparison of preprocessing methods and selection of the number of principal components in PLSR Algorithm modeling
1. Specific procedure of this experimental example
An infrared spectrum analysis method of soil total nitrogen comprises the following steps:
step 1, collecting a near infrared spectrum of a soil sample to obtain near infrared spectrum data.
The specific method is as follows: the soil sample is dried at 105 °C for 10 hours. Potassium bromide is ground into powder in a mortar, the dried soil sample is added to the mortar and ground together with it, and the ground mixture is pressed into a pellet in a tablet press. After pressing, the pellet is measured in a Fourier transform infrared spectrometer (the instrument must first be switched on and preheated for 10 minutes). Each sample was measured 3 times and the average was taken. The near-infrared spectra of a batch of samples collected from the Chengdu Plain (farmland in the Chongzhou demonstration area), measured by this method, are shown in FIG. 5.
Step 2, preprocessing the near-infrared spectral data.
Step 3, feeding the preprocessed near-infrared spectral data into a full-wavelength prediction model to obtain the total nitrogen content.
The full-wavelength prediction model is obtained by modeling with the PLSR algorithm. During modeling, the samples are divided into a training set and a validation set at a ratio of 7:3, that is, 70% of the soil samples are used as the training set and 30% as the validation set. The near-infrared spectra of the soil samples required for modeling are collected by the method of step 1 and preprocessed by the method of step 2, and the total nitrogen content of the soil samples required for modeling is determined by the existing national standard method (HJ 717-2014).
2. Selection of the pretreatment method
In this experimental example, the pretreatment method in step 2 is selected from the following methods: SG smoothing, the moving average method, the first derivative method, the second derivative method, and standard normal variate (SNV) transformation. The preprocessed spectral data are shown in FIGS. 6-10. These pretreatment methods belong to the prior art and can be implemented with the software MATLAB.
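The five pretreatment methods can be sketched on a single spectrum as shown below. The settings are assumptions, since the text does not state them: SG smoothing uses the classic 5-point quadratic weights (-3, 12, 17, 12, -3)/35 and leaves the two points at each end unchanged, the moving average uses a window of 3, the derivatives are plain finite differences, and SNV uses the sample standard deviation. All function names are illustrative.

```python
import statistics

# 5-point quadratic Savitzky-Golay weights; the common divisor is 35
SG5_WEIGHTS = (-3, 12, 17, 12, -3)

def sg_smooth5(y):
    """Savitzky-Golay smoothing with a 5-point window; end points are kept as-is."""
    out = list(y)
    for i in range(2, len(y) - 2):
        out[i] = sum(w * y[i + k - 2] for k, w in enumerate(SG5_WEIGHTS)) / 35.0
    return out

def moving_average(y, window=3):
    """Unweighted moving average; the output is shorter by window - 1 points."""
    return [sum(y[i:i + window]) / window for i in range(len(y) - window + 1)]

def first_derivative(y):
    """First-order finite difference between neighbouring points."""
    return [b - a for a, b in zip(y, y[1:])]

def second_derivative(y):
    """Second-order finite difference."""
    return first_derivative(first_derivative(y))

def snv(y):
    """Standard normal variate: centre and scale one spectrum by its own statistics."""
    mean = statistics.mean(y)
    sd = statistics.stdev(y)
    return [(v - mean) / sd for v in y]

spectrum = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0]
print(sg_smooth5(spectrum))
print(snv([1.0, 2.0, 3.0]))
```

A quadratic curve passes through the 5-point quadratic SG filter unchanged, which is why the first printed list matches the input; noisy spectra would be smoothed instead.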
3. Optimization of the number of principal components in PLSR modeling
In PLSR modeling, the number of principal components must be determined, which is a very important step of the algorithm. According to matrix theory, the order of the components reflects how much data information each extracted score factor carries: the earlier a component, the larger its share of the data information. Choosing the number of components is therefore important. Too many components defeat the purpose of dimension reduction and bring in much useless noise, which affects the prediction model and causes overfitting. Too few components lose much useful information in the spectrum and seriously degrade the prediction performance and accuracy of the model.
In this experimental example, the number of components is selected by plotting the percentage of variance explained in the variables as a function of the number of components; the preliminarily selected range of the number of principal components is 6-10. For each of the pretreatment methods, the number of principal components is therefore chosen within the range of 6 to 10.
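The idea of reading the number of components off an explained-variance curve can be illustrated with the sketch below. It is a stand-in rather than the procedure used in the experiment: it computes the cumulative percentage of variance explained by the principal components of the spectral matrix (a PCA-style curve obtained from a singular value decomposition), whereas the experiment refers to the components of the PLSR model. The function names and the 95% threshold are hypothetical.

```python
import numpy as np

def cumulative_explained_variance(X):
    """Cumulative percentage of variance in X explained by successive components."""
    X_centered = X - X.mean(axis=0)
    singular_values = np.linalg.svd(X_centered, compute_uv=False)
    variances = singular_values ** 2
    return 100.0 * np.cumsum(variances) / variances.sum()

def smallest_n_components(X, threshold=95.0):
    """Smallest number of components whose cumulative share reaches the threshold."""
    cumulative = cumulative_explained_variance(X)
    return int(np.argmax(cumulative >= threshold)) + 1

# Toy matrix whose rows all follow one pattern, so one component suffices
X = np.outer(np.arange(10.0), np.array([1.0, 2.0, 3.0]))
print(cumulative_explained_variance(X))
print(smallest_n_components(X))
```

Plotting the cumulative curve against the component index gives the kind of graph from which a candidate range such as 6 to 10 can be read.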
4. Comparison of results
The RMSE and R² of the soil total nitrogen predictions of the optimal model established with each pretreatment method are shown in Table 1. The number of principal components shown in the table is, for each pretreatment method, the best value found after comparing RMSE and R².
TABLE 1 Total nitrogen modeling results of soil by different pretreatment methods
[Table 1 is provided as image RE-GDA0003301846030000101 in the original publication.]
As can be seen from Table 1, when the SG smoothing algorithm is used for preprocessing and the number of principal components is set to 10, the validation-set RMSE of the resulting model is lower, and its validation-set R² is closer to 1, than for the models established with the other preprocessing methods. This indicates that this prediction model is the most accurate. Therefore, SG smoothing with 10 principal components is preferred.
Experimental example 5 comparison of full wavelength prediction model modeling algorithms
1. Modeling algorithm
In this experimental example, on the basis of the specific steps of Experimental Example 4 and with the SG smoothing algorithm selected for preprocessing, the modeling algorithm in step 3 is varied and the prediction accuracy of the resulting models is compared. The modeling algorithms compared are multiple linear regression (MLR), principal component regression (PCR), partial least squares regression (PLSR), support vector machine regression (SVR) and an artificial neural network (ANN). All of these algorithms belong to the prior art and can be implemented in MATLAB.
Specifically, the method comprises the following steps:
(1) multiple linear regression algorithm (MLR)
The spectral data are used as the input variables and the total nitrogen content as the output. The samples are divided into a training set and a test set in a ratio of 7:3. The multiple linear regression model is implemented by calling the regression function in MATLAB.
(2) Principal component regression modeling analysis (PCR)
Following the steps of the principal component regression algorithm, the data are first standardized, the number of principal components is then determined, and finally regression analysis is performed with the determined number of principal components. These steps are carried out by calling the zscore function, the pca function and a regression function in MATLAB. According to the plot of explained variance percentage against the number of components, the explained variance approaches 100% once the number of components exceeds 9, and the mean square error levels off after 9 components. Taking both into account, the number of principal components is preferably 10.
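The three PCR steps described above (standardization, choice of the number of components, regression on the component scores) can be sketched in Python as follows. This is an illustrative NumPy sketch rather than the MATLAB code of the example; the function name `fit_pcr` and the synthetic data are assumptions, and the returned cumulative explained-variance curve corresponds to the kind of plot used above to choose the number of components.

```python
import numpy as np

def fit_pcr(X, y, n_components):
    """Fit principal component regression.

    Returns a predict function and the cumulative percentage of the
    variance of X explained by the first 1, 2, ... components.
    """
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0                       # guard against constant columns
    Z = (X - mean) / std                      # step 1: z-score standardization
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = 100.0 * np.cumsum(s ** 2) / np.sum(s ** 2)
    V = Vt[:n_components].T                   # step 2: keep the leading loadings
    scores = Z @ V
    design = np.column_stack([np.ones(len(y)), scores])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)   # step 3: regression

    def predict(X_new):
        scores_new = ((X_new - mean) / std) @ V
        return coef[0] + scores_new @ coef[1:]

    return predict, explained

# Example on synthetic data (60 samples, 200 wavelengths, 10 components)
rng = np.random.default_rng(0)
latent = rng.normal(size=(60, 4))
X_demo = latent @ rng.normal(size=(4, 200)) + 0.01 * rng.normal(size=(60, 200))
y_demo = latent @ np.array([2.0, -1.0, 0.5, 0.0]) + 10.0
pcr_predict, pcr_explained = fit_pcr(X_demo, y_demo, n_components=10)
```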
(3) Partial least squares regression algorithm (PLSR)
Following the steps of the partial least squares algorithm, the data are first standardized. The number of components to be extracted is then determined; because the Y matrix contains only one variable, the total nitrogen content, no component extraction is needed for Y. Finally, regression analysis is performed with the determined number of components. According to the results of Experimental Example 1, 10 principal components are used. The calculations are carried out in MATLAB with the zscore function, the plsregress function and related functions.
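For a single response variable such as total nitrogen content, partial least squares regression can be sketched with the NIPALS form of the PLS1 algorithm. The Python code below is an illustrative sketch only: the example itself uses MATLAB's plsregress, whose internal algorithm may differ, and the function name `fit_pls1` and the synthetic data are assumptions.

```python
import numpy as np

def fit_pls1(X, y, n_components):
    """Fit PLS regression for one response (PLS1, NIPALS) and return a predict function."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    x_std = X.std(axis=0, ddof=1)
    x_std[x_std == 0] = 1.0                   # guard against constant columns
    E = (X - x_mean) / x_std                  # standardized predictors
    f = y - y_mean                            # centred response
    n_vars = X.shape[1]
    W = np.zeros((n_vars, n_components))      # weight vectors
    P = np.zeros((n_vars, n_components))      # X loadings
    q = np.zeros(n_components)                # y loadings
    for a in range(n_components):
        w = E.T @ f
        w /= np.linalg.norm(w)
        t = E @ w                             # scores of component a
        tt = t @ t
        p = E.T @ t / tt
        q[a] = f @ t / tt
        E = E - np.outer(t, p)                # deflate X
        f = f - q[a] * t                      # deflate y
        W[:, a], P[:, a] = w, p
    beta = W @ np.linalg.solve(P.T @ W, q)    # coefficients for standardized X

    def predict(X_new):
        return y_mean + ((X_new - x_mean) / x_std) @ beta

    return predict

# Example on synthetic data (60 samples, 200 wavelengths, 10 components)
rng = np.random.default_rng(0)
latent = rng.normal(size=(60, 4))
X_demo = latent @ rng.normal(size=(4, 200)) + 0.01 * rng.normal(size=(60, 200))
y_demo = latent @ np.array([2.0, -1.0, 0.5, 0.0]) + 10.0
pls_predict = fit_pls1(X_demo, y_demo, n_components=10)
y_fitted = pls_predict(X_demo)
```

Unlike PCR, the components here are chosen using the response as well as the spectra, which is why fewer components are often sufficient for a comparable fit.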
(4) Support vector machine regression modeling analysis (SVR)
The algorithm is implemented with the LIBSVM library for MATLAB. The kernel is a radial basis function, the penalty factor is 9.38 and gamma is 5.
(5) Artificial neural network Algorithm (ANN)
The algorithm is implemented in MATLAB. The sample set is divided randomly into a training set, a validation set and a test set in a ratio of 7:1.5:1.5. The hidden-layer size takes the default value of 10 and the hidden layer is followed by an output layer; the specific network structure is shown in FIG. 11. In the figure, input denotes the input variables, and the number 1272 is the number of input variables, that is, the number of spectral wavelength points in the near infrared spectrum matrix of this experiment; w is a weight vector, b is a bias coefficient, and output is the predicted nitrogen content of the soil sample. The optimal number of training iterations is 10.
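The network structure described above can be illustrated by the forward pass of a small feed-forward network with 1272 inputs and one output. The sketch below reads the value 10 as the number of units in a single hidden layer and uses a tanh hidden activation with a linear output; both are assumptions, as are the function names and the random initial weights. Training is not shown.

```python
import numpy as np

def init_network(n_inputs=1272, n_hidden=10, seed=0):
    """Create randomly initialized weights w and biases b (untrained)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(scale=0.01, size=(n_hidden, n_inputs)),  # input -> hidden
        "b1": np.zeros(n_hidden),
        "w2": rng.normal(scale=0.01, size=n_hidden),              # hidden -> output
        "b2": 0.0,
    }

def forward(params, X):
    """Predict one value per spectrum (row of X)."""
    hidden = np.tanh(X @ params["W1"].T + params["b1"])   # assumed tanh activation
    return hidden @ params["w2"] + params["b2"]           # linear output layer

# Example: five hypothetical spectra with 1272 wavelength points each
network = init_network()
predictions = forward(network, np.ones((5, 1272)))
```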
2. Modeling results
The predicted results for MLR, PCR, PLSR, SVR, and ANN are shown in FIGS. 12-16, respectively.
The prediction results of the specific modeling algorithms on the test set are shown in table 2.
TABLE 2 modeling results of different spectral modeling algorithms
[Table 2 is provided as image BDA0003234606920000111 in the original publication.]
The evaluation indexes compared in the table are the correlation coefficient R, the coefficient of determination R², the root mean square error (RMSE) and the relative analysis error (RPD). The indexes of the PLSR and ANN algorithms are superior to those of the other three algorithms, so these two can be considered the better choices. Although the correlation coefficient of the ANN algorithm (0.933) is higher than that of PLSR (0.883), the RMSE of PLSR (55.97 mg/kg) is significantly smaller than that of ANN (85.13 mg/kg); the RMSE represents the deviation between the model's predicted and measured values. The relative analysis error of PLSR is also higher than that of ANN. In conclusion, the model established by the PLSR algorithm has the best prediction performance.
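The four evaluation indexes can be computed as in the following Python sketch. The formulas are the definitions commonly used in chemometrics (RPD taken as the standard deviation of the measured values divided by the RMSE); the text does not state the exact formulas it used, so these definitions and the function name are assumptions.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return R, R2, RMSE and RPD for measured and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_true - y_pred
    rmse = np.sqrt(np.mean(residual ** 2))
    r = np.corrcoef(y_true, y_pred)[0, 1]                 # Pearson correlation
    r2 = 1.0 - np.sum(residual ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    rpd = np.std(y_true, ddof=1) / rmse                   # larger is better
    return {"R": r, "R2": r2, "RMSE": rmse, "RPD": rpd}

# Example with made-up values
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0, 5.0], [1.1, 1.9, 3.2, 3.8, 5.0])
```

Because RMSE is expressed in the units of the measured quantity (mg/kg here) while R is unitless, a model can have a higher R and still a larger RMSE, as in the comparison of ANN and PLSR above.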
As can be seen from the above examples and experimental examples, the infrared spectroscopic analysis method for the total nitrogen content of water-containing soil adds a step of mathematically transforming the near infrared spectral data of the water-containing soil. Using the near infrared spectral data of dry soil obtained by this conversion, the total nitrogen content of the water-containing soil can be predicted more accurately. When used to analyze the total nitrogen content of water-containing soil, the method reduces the consumption of manpower and resources and offers high efficiency and low cost, so it has good application prospects in agriculture, environmental protection, biological research and other fields.
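The conversion of water-containing soil spectra into dry-soil spectra can be sketched with the usual least-squares form of direct standardization, in which a transfer matrix F and a baseline term are estimated from paired spectra of the same samples measured wet and dry. This Python sketch is an illustration under assumptions: the pseudo-inverse estimate and the function names are not taken from the patent, whose own formulas may differ in detail. Here `S_wet` and `S_dry` play the roles of S2 and S1 in the relationship S1 = S2 F + E given in the claims.

```python
import numpy as np

def fit_direct_standardization(S_wet, S_dry):
    """Estimate the transfer matrix F and baseline from paired spectra (rows = samples)."""
    wet_mean = S_wet.mean(axis=0)
    dry_mean = S_dry.mean(axis=0)
    # Least-squares solution of (S_dry - dry_mean) = (S_wet - wet_mean) F
    F = np.linalg.pinv(S_wet - wet_mean) @ (S_dry - dry_mean)
    baseline = dry_mean - wet_mean @ F
    return F, baseline

def to_dry_spectrum(S_wet_new, F, baseline):
    """Convert spectra of water-containing soil to estimated dry-soil spectra."""
    return S_wet_new @ F + baseline

# Example with synthetic paired spectra (50 samples, 8 data points)
rng = np.random.default_rng(0)
wet_demo = rng.normal(size=(50, 8))
dry_demo = wet_demo @ rng.normal(size=(8, 8)) + 0.3
F_demo, baseline_demo = fit_direct_standardization(wet_demo, dry_demo)
dry_estimate = to_dry_spectrum(wet_demo, F_demo, baseline_demo)
```

With real spectra the number of data points is much larger than the number of samples, so the pseudo-inverse returns the minimum-norm solution rather than a unique one.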

Claims (10)

1. An infrared spectrum analysis method for total nitrogen content of water-containing soil is characterized by comprising the following steps:
step 1, collecting a near infrared spectrum of a soil sample to obtain original data of the near infrared spectrum;
step 2, converting the near infrared spectrum original data obtained in the step 1 into near infrared spectrum data of dry soil by adopting a direct spectrum conversion algorithm;
step 3, predicting the total nitrogen content of the soil sample according to the near infrared spectrum data of the dry soil obtained in step 2.
2. The analytical method of claim 1, wherein: the water content of the soil sample is 5%-35%, and/or the soil sample is collected from the Chengdu Plain.
3. The analytical method of claim 1, wherein: in step 2, the direct spectrum conversion algorithm relates the near infrared spectrum original data to the near infrared spectrum data of the dry soil as follows:
$S_1 = S_2 F + E$
wherein $S_1$ is the near infrared spectrum data of the dry soil and $S_2$ is the near infrared spectrum original data; $S_1$ and $S_2$ are matrices of dimension $m \times p$, where m is the number of soil samples and p is the number of data points in the near infrared spectrum original data; F is the spectrum transfer matrix and E is the residual matrix.
4. The analytical method of claim 3, wherein: the spectrum transfer matrix and the residual error matrix are determined by the following method:
step a, collecting the near infrared spectrum of a soil sample to obtain near infrared spectrum original data, denoted $S_1$; $S_1$ is mean-centered to obtain
[formula image FDA0003234606910000011]
step b, drying the soil sample to obtain a dry soil sample, and collecting the near infrared spectrum of the dry soil sample to obtain near infrared spectrum original data of the dry soil, denoted $S_2'$; $S_2'$ is mean-centered to obtain
[formula image FDA0003234606910000012]
step c, calculating the spectrum transfer matrix F and the residual matrix E by the following formulas:
[formula image FDA0003234606910000013]
[formula image FDA0003234606910000014]
[formula image FDA0003234606910000015]
wherein
[formula image FDA0003234606910000016]
and
[formula image FDA0003234606910000017]
are the row vectors formed by averaging the elements in each column of $S_1$ and $S_2'$, respectively; k is a column vector whose elements are all 1; and $d_s$ is the environmental baseline correction matrix.
5. The analytical method of claim 3, wherein: the value of m is 50, and when more than 50 soil samples are collected, 50 samples are selected from all the soil samples by the Kennard-Stone algorithm.
6. The analytical method of claim 1, wherein step 3 comprises the steps of:
step 3A, preprocessing the near infrared spectrum data of the dry soil through an SG smoothing algorithm;
step 3B, predicting from the near infrared spectrum data preprocessed in step 3A by a prediction model to obtain a total nitrogen content result.
7. The analytical method of claim 6, wherein step 3B comprises the steps of: predicting the preprocessed near infrared spectrum data by adopting a full-wavelength prediction model to obtain a total nitrogen content result; the full-wavelength prediction model is obtained through modeling by a PLSR algorithm or an ANN algorithm.
8. A computer apparatus for infrared spectroscopic analysis of total nitrogen in soil comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program implements a method for infrared spectroscopic analysis of total nitrogen content in hydrous soil as claimed in any one of claims 1 to 7.
9. An infrared spectroscopic analysis system for total nitrogen in soil, comprising:
infrared spectrum acquisition and/or input means for acquiring and/or inputting near infrared spectrum data;
the computer apparatus of claim 8, configured to analyze the near infrared spectral data to obtain a total nitrogen content result.
10. A computer-readable storage medium characterized by: stored thereon is a computer program for carrying out the method for the infrared spectroscopic analysis of the total nitrogen content of aqueous soils according to any one of claims 1 to 7.
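For illustration of the sample selection named in claim 5, the Kennard-Stone algorithm can be sketched in Python as follows. It starts from the two mutually most distant spectra and then repeatedly adds the sample whose distance to its nearest already-selected sample is largest. The use of Euclidean distance on the raw spectra and the function names are assumptions; the claim does not specify the distance measure.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distances between all rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.sqrt(np.maximum(d2, 0.0))       # clip tiny negatives from rounding

def kennard_stone(X, n_select):
    """Return the indices of n_select rows of X chosen by Kennard-Stone."""
    if not 2 <= n_select <= len(X):
        raise ValueError("n_select must be between 2 and the number of samples")
    dist = pairwise_distances(X)
    first, second = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(first), int(second)]      # the two most distant samples
    remaining = [i for i in range(len(X)) if i not in selected]
    while len(selected) < n_select:
        # distance from each remaining sample to its nearest selected sample
        nearest = dist[np.ix_(remaining, selected)].min(axis=1)
        pick = remaining[int(np.argmax(nearest))]
        selected.append(pick)
        remaining.remove(pick)
    return selected

# Example: choose 50 of 120 synthetic spectra with 30 data points each
demo_spectra = np.random.default_rng(0).normal(size=(120, 30))
chosen = kennard_stone(demo_spectra, 50)
```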
CN202110998389.3A 2021-08-27 2021-08-27 Infrared spectroscopic analysis method for total nitrogen content of water-containing soil Pending CN113916822A (en)

Publications (1)

Publication Number Publication Date
CN113916822A (en) 2022-01-11

Family

ID=79233355

Country Status (1)

Country Link
CN (1) CN113916822A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination