CN113670837A - Method for detecting total sugar content in longan pulp based on hyperspectrum and deep learning - Google Patents

Publication number
CN113670837A
Authority
CN
China
Prior art keywords
longan pulp
sugar content
total sugar
sample
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110884882.2A
Other languages
Chinese (zh)
Inventor
田兴国
岑俏媛
郑曼妮
徐小艳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China Agricultural University
Original Assignee
South China Agricultural University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China Agricultural University
Priority to CN202110884882.2A
Publication of CN113670837A

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/17 Systems in which incident light is modified in accordance with the properties of the material investigated
    • G01N21/25 Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
    • G01N21/31 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N1/00 Sampling; Preparing specimens for investigation
    • G01N1/28 Preparing specimens for investigation including physical details of (bio-)chemical methods covered elsewhere, e.g. G01N33/50, C12Q
    • G01N1/38 Diluting, dispersing or mixing samples
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Analytical Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Pathology (AREA)
  • Immunology (AREA)
  • Theoretical Computer Science (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Biochemistry (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Investigating Or Analysing Materials By Optical Means (AREA)

Abstract

The invention provides a method for detecting the total sugar content in longan pulp based on hyperspectrum and deep learning. It establishes, for the first time, a high-accuracy, rapid, nondestructive detection method for longan pulp based on hyperspectrum and machine learning, and solves the problems of traditional longan pulp quality detection methods, which are time-consuming, complicated to operate, and unable to detect the quality of longan pulp rapidly and accurately. The method is simple to operate, rapid, efficient and nondestructive, requires no sample pretreatment and no chemical reagents, is low in cost and gives accurate results, and makes longan pulp quality grading and product standardization possible.

Description

Method for detecting total sugar content in longan pulp based on hyperspectrum and deep learning
Technical Field
The invention relates to the field of nondestructive detection of agricultural product quality, in particular to a method for detecting total sugar content in longan pulp based on hyperspectrum and deep learning.
Background
Longan pulp is a nourishing material used as both medicine and food. In traditional Chinese medicine it is sweet and warm in nature, enters the heart and spleen channels, and is used to tonify the heart and spleen, nourish the blood and calm the nerves; its medicinal effects were recorded as early as the Han Dynasty in the Shennong Bencao Jing. Longan pulp contains large amounts of monosaccharides and oligosaccharides, seven polysaccharides composed of monosaccharides, seventeen amino acids (including all essential amino acids), and high potassium and low sodium, and has anti-stress, anti-anxiety, antioxidant, antibacterial, anti-aging, anti-tumor, sleep-improving, endocrine-regulating and immunity-enhancing effects. Longan pulp is produced in Vietnam and Thailand; in China, Gaozhou in Guangdong, Bobai in Guangxi and Putian in Fujian are the three major producing areas, and Gaozhou longan pulp and Bobai longan pulp are recognized as national geographical indication products.
At present there is no general national or industry standard for evaluating the quality of longan pulp, a characteristic agricultural product. Only DB44/T1046-2012 (geographical indication product Gaozhou longan pulp) and DB 45/T848-2012 (geographical indication product Bobai longan pulp) set sensory, physicochemical and microbiological requirements for longan pulp from the two regions. In these standards, the two most important physicochemical indexes are moisture content and total sugar content. Sugars are the most abundant substances in longan pulp; besides influencing the viscoelasticity of the product, they are the main flavor substances and nutrients and are important participants in the Maillard reaction. The method for determining total sugar content specified in DB44/T1046-2012 is complex. The longan pulp is first crushed, all sugars are hydrolyzed into reducing sugars, and the reducing sugar content is then measured; the operation is cumbersome, and crushing cannot release the sugar completely, so the result is biased. Several methods exist for determining reducing sugar, of which direct titration, the potassium permanganate method and spectrophotometry are commonly used. Direct titration and the potassium permanganate method are both titrimetric: direct titration must be performed under boiling conditions, the potassium permanganate method requires boiling the sample solution before titration, and both are prone to errors in judging the titration end point.
In contrast, spectrophotometry is more sensitive and simpler to operate, but it still requires destructive testing of the sample. Conventional determination of the total sugar content of longan pulp is cumbersome, labor-intensive and expensive. Because the pulp is sticky, soft and hard to process, the results are prone to large errors, so conventional methods cannot meet the quality-testing needs of longan pulp processing or support quality control of longan pulp products.
Disclosure of Invention
The invention aims to provide a method for detecting the total sugar content in longan pulp based on hyperspectrum and deep learning.
The concept of the invention is as follows: taking Gaozhou longan pulp, a representative geographical indication product, as the research object, hyperspectral data of the test samples are collected and the total sugar content, an important physicochemical quality index of longan pulp, is measured; machine learning models are then built on the hyperspectral information and the total sugar content to obtain a prediction model of the water and total sugar content in longan pulp, thereby establishing a rapid, nondestructive, hyperspectrum-based method for detecting the water content and total sugar content in longan pulp.
In order to realize the purpose of the invention, the invention provides a method for detecting the total sugar content in longan pulp based on hyperspectrum and deep learning, which comprises the following steps:
A. collecting hyperspectral data of a longan pulp sample in the range of 400-1000 nm, and extracting a Region of Interest (ROI) for subsequent mathematical modeling;
B. determining the total sugar content of the longan pulp sample by a colorimetric method;
C. dividing the collected spectral data into a correction set and a verification set, taking the spectral data of the correction set as the independent variable and the total sugar content as the dependent variable, establishing a long short-term memory (LSTM) network prediction model in combination with spectral data preprocessing and spectral data dimension reduction, and verifying the established model with the verification set;
D. evaluating the model built in step C, judging its effectiveness, and obtaining an effective mathematical model;
E. collecting hyperspectral data of the longan pulp sample to be tested under the same experimental conditions, and calculating its total sugar content with the effective mathematical model obtained in step D.
Preferably, the hyperspectral data collected in step a is hyperspectral reflectivity data.
In the foregoing method, the region of interest in step A is extracted as follows: open the software ENVI + IDL 4.8 and correct the hyperspectral data of the longan pulp sample with the hyperspectral data of a white board with a reflectivity of 93% to obtain corrected longan pulp hyperspectral data; then select the whole longan pulp in the hyperspectral data with the ROI Tool, the selected area being the region of interest. The average spectrum of the whole longan pulp in the selected region of interest is obtained through the Stats option, and this spectrum is the hyperspectral reflectivity data of the longan pulp.
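The white-board correction and ROI averaging described above can be sketched in NumPy. This only illustrates the arithmetic and is not the ENVI workflow itself: the function name, the (rows, columns, bands) array layout and the boolean ROI mask are assumptions, and the sketch assumes that only a white-reference frame is used, scaled by the 93% board reflectivity stated in the text.

```python
import numpy as np

WHITE_PANEL_REFLECTANCE = 0.93  # reflectivity of the white board stated in the text

def roi_mean_reflectance(raw_cube, white_ref, roi_mask):
    """Return the mean reflectance spectrum of the pixels selected by roi_mask.

    raw_cube : (rows, cols, bands) raw sample intensities (assumed layout)
    white_ref: (bands,) mean raw intensity of the white board per band
    roi_mask : (rows, cols) boolean array, True where the pulp is
    """
    raw_cube = np.asarray(raw_cube, dtype=float)
    white_ref = np.asarray(white_ref, dtype=float)
    # Scale each band by the white reference, then by the board's known reflectivity.
    reflectance = raw_cube / white_ref * WHITE_PANEL_REFLECTANCE
    # Boolean indexing keeps only ROI pixels: result shape (n_pixels, bands).
    roi_pixels = reflectance[np.asarray(roi_mask, dtype=bool)]
    return roi_pixels.mean(axis=0)

# Tiny synthetic example: 2 x 2 pixels, 3 bands (hypothetical values).
cube = np.full((2, 2, 3), 50.0)
white = np.full(3, 100.0)
mask = np.array([[True, False], [True, True]])
spectrum = roi_mean_reflectance(cube, white, mask)
```

The result is one reflectance value per band, which is the form of input the later preprocessing and modeling steps operate on.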
The method for determining the total sugar content in step B comprises the following steps:
B1, grinding 0.5-1.0 g of a longan pulp sample into powder and placing it in a centrifuge tube; adding 2 mL of water to soak the powder, adding a further 2 mL of water, mixing well on a vortex oscillator, and extracting in a 37 ℃ water bath for 30 min; after extraction, centrifuging at 10000 rpm for 10 min and collecting the supernatant; adding 4 mL of water again, extracting in a 37 ℃ water bath for 30 min, centrifuging at 10000 rpm for 10 min, collecting the supernatant, filtering, and making up to a volume of 10 mL with water; then diluting the sample solution 200-fold and determining the total sugar content by a colorimetric method: adding 0.5 mL of the diluted sample solution to 0.5 mL of 5% phenol solution, rapidly adding 2.5 mL of concentrated sulfuric acid (98%), standing in an ice water bath for 10 min, mixing well on a vortex oscillator, heating in a 90 ℃ water bath for 20 min, cooling rapidly in an ice water bath, and measuring the absorbance at a wavelength of 490 nm;
B2, pipetting 0.5 mL each of glucose standard working solutions with concentrations of 0 mg/L, 20 mg/L, 40 mg/L, 60 mg/L, 80 mg/L and 100 mg/L, reacting them by the same method as in B1, measuring the absorbance, and establishing a glucose standard curve;
B3, substituting the measured absorbance of each longan pulp sample into the standard curve to obtain the total sugar content of each longan pulp sample.
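Steps B2 and B3 amount to fitting a straight line through the glucose standards and inverting it for each sample. The following is a minimal sketch of that calculation. The absorbance readings are hypothetical, and the conversion from concentration to a mass percentage (using the 10 mL extract volume and 200-fold dilution from B1) is an assumed reading of the procedure, since the text does not spell out the conversion formula.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def total_sugar_percent(absorbance, slope, intercept, sample_mass_g,
                        extract_volume_ml=10.0, dilution=200.0):
    """Convert an absorbance at 490 nm into total sugar as % of sample mass.

    The conversion below is an assumption, not a formula from the text.
    """
    # Glucose-equivalent concentration in the diluted solution (mg/L).
    conc_mg_per_l = (absorbance - intercept) / slope
    # Undo the dilution and scale by the extract volume (mL -> L) to get mg of sugar.
    sugar_mg = conc_mg_per_l * dilution * extract_volume_ml / 1000.0
    return sugar_mg / (sample_mass_g * 1000.0) * 100.0

standard_conc = [0, 20, 40, 60, 80, 100]              # mg/L, from step B2
standard_abs = [0.00, 0.20, 0.40, 0.60, 0.80, 1.00]   # hypothetical readings
slope, intercept = fit_line(standard_conc, standard_abs)
example = total_sugar_percent(1.00, slope, intercept, sample_mass_g=0.5)
```

With these made-up readings, an absorbance of 1.00 for a 0.5 g sample corresponds to 100 mg/L in the diluted solution, i.e. 200 mg of sugar in the extract.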
Preferably, the sample ratio of the correction set and the verification set in step C is 5: 1.
In the aforementioned method, the spectral data preprocessing in step C may adopt the Savitzky-Golay (SG) smoothing algorithm or the moving average (MA) smoothing algorithm, preferably the SG smoothing algorithm.
Preferably, the spectral data dimension reduction in step C adopts the variable importance in projection (VIP) algorithm.
In the aforementioned method, step D uses R² (R Square), RMSEC (Root Mean Square Error of Calibration) and RMSEP (Root Mean Square Error of Prediction) as evaluation indexes to judge the effectiveness of the model.
The longan pulp refers to dry pulp.
Preferably, the longan pulp is Gaozhou longan pulp. The longan variety is selected from the group consisting of champignon, native longan, euryale, and corn.
Through the above technical scheme, the invention has at least the following advantages and beneficial effects:
(I) The detection method provided by the invention is simple and rapid to operate and highly efficient; it does not damage the sample, requires no sample pretreatment, uses no chemical reagents, is low in cost and gives accurate results, achieving rapid nondestructive detection of the total sugar content in longan pulp.
(II) The invention solves the problems of traditional longan pulp quality detection methods, which are time-consuming, complicated to operate, and unable to detect the quality of longan pulp rapidly and accurately. The established rapid nondestructive method for detecting the total sugar content in longan pulp, based on hyperspectral technology and deep learning, greatly improves detection efficiency, reduces detection cost, and improves product quality and economic benefit.
(III) The invention establishes, for the first time, a high-accuracy, rapid, nondestructive detection method for longan pulp based on hyperspectrum and machine learning, making longan pulp quality grading and product standardization possible.
(IV) A long short-term memory (LSTM) artificial neural network prediction model with strong capability for processing nonlinear and high-dimensional data is constructed, which improves model accuracy and generalization and makes the successful application of LSTM networks to hyperspectral modeling possible.
Drawings
Fig. 1 is a technical route diagram of the method for detecting the total sugar content in the longan pulp based on hyperspectrum and deep learning.
FIG. 2 is a graph of a glucose standard curve plotted in accordance with a preferred embodiment of the present invention.
FIG. 3 is an average spectrum of 400-1000nm band for determining total sugar content of Gaozhou longan pulp in the preferred embodiment of the present invention.
FIG. 4 is the 900-1700nm band average spectrum for measuring the total sugar content of the Gaozhou longan pulp in the preferred embodiment of the present invention.
FIG. 5 is a graph showing the comparison of the total sugar content measurement effect of the hyperspectral MA smoothing treatment at the wavelength of 400-1000nm in the preferred embodiment of the present invention.
FIG. 6 is a graph showing the comparison of the effect of the hyperspectral MA smoothing treatment at the wavelength band of 900-1700nm in the determination of the total sugar content in the preferred embodiment of the invention.
FIG. 7 is a comparison of the effect of hyperspectral SG smoothing treatment at the wavelength of 400-1000nm for total sugar content determination in the preferred embodiment of the invention.
FIG. 8 is a comparison of the effect of the hyperspectral SG smoothing treatment at the wavelength band of 900-1700nm for the total sugar content determination in the preferred embodiment of the invention.
FIG. 9 is a graph showing the measurement of the total sugar content of the longan pulp in Gaozhou in the preferred embodiment of the invention, the score of the modeling wavelength VIP of the hyperspectral PLSR at the wavelength of 400-.
FIG. 10 is a graph showing the measurement of the total sugar content of the longan pulp in Gaozhou in the preferred embodiment of the invention, and the VIP score of the modeling wavelength of the hyperspectral PLSR in the 900-1700nm band.
FIG. 11 is a scatter plot of hyperspectral feature selection in the 400-1000nm band for determining the total sugar content of the Gaozhou longan pulp in the preferred embodiment of the invention.
FIG. 12 is a scattering diagram for selecting hyperspectral features at the wavelength band of 900-.
Detailed Description
The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention. Unless otherwise specified, the technical means used in the examples are conventional means well known to those skilled in the art, and the raw materials used are commercially available products.
The SPECIM industrial hyperspectral cameras (FX10, FX17) used in the following examples were purchased from SPECIM, Finland.
Example 1 method for detecting total sugar content in longan pulp based on hyperspectrum and deep learning
The method takes Gaozhou longan pulp (dry pulp) as the research object. First, Gaozhou longan pulp of 5 varieties (good storax, native longan, Guangyuan, Shixia and corn) is selected as the experimental samples, and hyperspectral data are collected in two band ranges, 400-1000 nm and 900-1700 nm. Second, the total sugar content of the longan pulp samples is determined. Third, the modeling effects of different spectral preprocessing and dimension reduction methods are compared, different machine learning models are built with the best preprocessing and dimension reduction methods, and the models are evaluated after parameter tuning to obtain the optimal machine learning model. A rapid, nondestructive, hyperspectrum-based method for detecting longan pulp components is thereby established for on-site testing in longan pulp processing enterprises. The technical route of the invention is shown in Fig. 1.
1. Experimental materials and instruments
(1) Experimental sample
Longan pulp was purchased from the village town farmers in Gaozhou city in September 2020 and is from the latest batch.
Longan aril was purchased from Zhendangdan lip food science and technology limited in Gaozhou city in September 2020; the production date is August 2020.
Longan aril was purchased from Tougezhi Touchengsheng food Co., Ltd in Guangzhou in September 2020; the production date is August 2020.
Shixia longan pulp was purchased from Gaozhou Jingzizhusheng food Co Ltd in September 2020; the production date is August 2020.
Corn longan pulp was purchased from the boundary town farmers of Gaozhou city in September 2020 and is from the latest batch.
(2) Reagents:
Reagent                           Grade               Supplier
Liquid nitrogen                   Analytical grade    Guangzhou Yingxin gas Co Ltd
Redistilled phenol                Analytical grade    Guangzhou Chemical Reagent Factory
98% concentrated sulfuric acid    Analytical grade    Guangzhou Chemical Reagent Factory
Anhydrous glucose                 Analytical grade    ALADDIN REAGENT (SHANGHAI) Co.,Ltd.
(3) Equipment:
Instrument                                                         Manufacturer
MIKRO 220R high-speed refrigerated centrifuge                      Hettich, Germany
SN-HWS-24 electrothermal digital display constant temperature water bath    Shanghai Shang apparatus and Equipment Co Ltd
752N ultraviolet visible spectrophotometer                         SHANGHAI INESA SCIENTIFIC INSTRUMENT Co.,Ltd.
NP-28S inching speed-regulating vortex oscillator                  Changzhou Enpei Instrument manufacturing Co Ltd
FA1004 electronic analytical balance                               Shanghainengchen Instrument science and technology Co Ltd
2. Experimental methods
(1) Hyperspectral image collection
Open the Lumo Scanner software on the computer and mount the FX10 camera at a height of 25 cm. Place a longan pulp sample on the stage and scan it; during scanning the stage moves at 9.8 mm/s. After scanning, immediately return the sample to its sealed bag to keep it from absorbing moisture, then collect the hyperspectral image of the next sample. Once the 400-1000 nm hyperspectral images of all samples have been collected, remove the FX10 camera, mount the FX17 camera, and, following the same steps with the stage moving at 7.5 mm/s, collect the 900-1700 nm hyperspectral images of the samples one by one. After all images are collected, process the spectral data uniformly with ENVI 4.8 and extract the Region of Interest (ROI) for subsequent mathematical modeling.
(2) Determination of Total sugar content
Longan pulp is sticky and hard to process directly, so a liquid nitrogen pretreatment step is added on the basis of GB5009.3-2016 (determination of moisture in food), with the longan pulp dried at low temperature to avoid damaging its sugars. The longan pulp sample is ground into powder and quickly transferred to a centrifuge tube; 2 mL of grade 1 water is added to soak the sample, a further 2 mL of water is added, and the mixture is mixed well on a vortex oscillator and extracted in a 37 ℃ water bath for 30 min. After extraction, the mixture is centrifuged at 10000 rpm for 10 min and the supernatant is collected. A further 4 mL of grade 1 water is added, followed by extraction in a 37 ℃ water bath for 30 min and centrifugation at 10000 rpm for 10 min; the supernatant is collected, filtered, and made up to a volume of 10 mL with water. Because longan pulp has a high sugar content, the sample solution is diluted 200-fold before the total sugar content is measured by a colorimetric method. 0.5 mL of the diluted sample solution is added to 0.5 mL of 5% phenol solution, 2.5 mL of concentrated sulfuric acid is added rapidly, and the mixture is left in an ice water bath for 10 min, mixed well on a vortex oscillator, heated in a 90 ℃ water bath for 20 min and cooled rapidly in an ice water bath, after which the absorbance N is measured at a wavelength of 490 nm. 0.5 mL each of glucose standard working solutions with concentrations of 0 mg/L, 20 mg/L, 40 mg/L, 60 mg/L, 80 mg/L and 100 mg/L is pipetted and reacted by the same method, the absorbance is measured, and a glucose standard curve is drawn. Substituting N into the standard curve equation gives, after conversion, the total sugar content of each longan pulp sample. The standard curve is shown in Fig. 2.
(3) Experimental sample partitioning
The samples are divided into a training set (correction set) and a test set (verification set) at a ratio of 5:1: 10 of the 60 samples used for total sugar modeling are randomly drawn as test samples for hyperspectral mathematical modeling, using stratified sampling in proportion to the numbers of the five longan pulp varieties, and the remaining samples form the training set.
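A stratified 5:1 split of this kind can be sketched with the standard library alone. The function name and seed are illustrative, and the layout of 12 samples per variety is an assumption made only so the example runs; the text does not give the per-variety counts.

```python
import random
from collections import defaultdict

def stratified_split(labels, n_test, seed=0):
    """Return (train_idx, test_idx), drawing test samples from each variety
    in proportion to that variety's share of the data.

    Per-variety counts are rounded, so the total may differ slightly
    from n_test when the proportions do not divide evenly.
    """
    rng = random.Random(seed)
    by_variety = defaultdict(list)
    for idx, label in enumerate(labels):
        by_variety[label].append(idx)
    test_idx = []
    for members in by_variety.values():
        k = round(n_test * len(members) / len(labels))
        test_idx.extend(rng.sample(members, k))
    test_set = set(test_idx)
    train_idx = [i for i in range(len(labels)) if i not in test_set]
    return train_idx, sorted(test_idx)

# Hypothetical layout: 60 samples, 12 for each of five varieties A-E.
labels = [variety for variety in "ABCDE" for _ in range(12)]
train_idx, test_idx = stratified_split(labels, n_test=10)
```

Here each variety contributes 2 test samples, leaving 50 samples for training.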
(4) Spectral preprocessing
a. Hyperspectral moving average smoothing process
Moving average (MA) smoothing, i.e. processing the data with a uniform sliding window, can effectively eliminate the influence of random fluctuations. The hyperspectral reflectivity data at each wavelength are processed one by one with a sliding window of fixed length; the window length used in the experiment is 400.
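A centered moving average is short enough to write out directly. The sketch below is one possible reading of the step: the handling of the two ends (a shrinking window) is an assumption, and the text does not say how its window length of 400 relates to the number of bands, so the example uses a small window on made-up values.

```python
def moving_average(spectrum, window):
    """Centered moving average; the window shrinks near the two ends.

    For an even `window` the effective width is window + 1, because the
    window is taken symmetrically around each point.
    """
    half = window // 2
    smoothed = []
    for i in range(len(spectrum)):
        lo = max(0, i - half)
        hi = min(len(spectrum), i + half + 1)
        segment = spectrum[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed

noisy = [1.0, 3.0, 2.0, 4.0, 3.0, 5.0]   # hypothetical reflectance values
smooth = moving_average(noisy, window=3)
```

The zig-zag in `noisy` is flattened into a steadily rising curve, which also shows why a wide window can erase genuine spectral features.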
b. Hyperspectral polynomial convolution smoothing
Polynomial convolution smoothing, the Savitzky-Golay (SG) method, is an algorithm based on the least-squares principle: within each window, a polynomial is fitted by least squares to the reflectivity values at successive wavelengths, and the window slides along the spectrum until all spectra have been traversed (traversal here means that the hyperspectral reflectivity in each spectrum is passed through the sliding polynomial smoothing exactly once). In the experiment, a 10th-degree polynomial and a smoothing window length of 53 are used.
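The least-squares idea behind SG smoothing can be made concrete in NumPy: fitting a polynomial inside each window and reading off its value at the window center is equivalent to convolving the spectrum with a fixed set of weights. The sketch below derives those weights. The function names and the edge rule (repeating the end values) are assumptions; the defaults mirror the window length of 53 and polynomial degree of 10 given in the text.

```python
import numpy as np

def savgol_coefficients(window_length, polyorder):
    """Weights giving the least-squares polynomial's value at the window center.

    window_length must be odd and larger than polyorder.
    """
    half = window_length // 2
    # Scale positions to [-1, 1] to keep the Vandermonde matrix well conditioned.
    x = np.arange(-half, half + 1) / half
    design = np.vander(x, polyorder + 1, increasing=True)
    # Row 0 of the pseudo-inverse maps window values to the constant term,
    # which is the fitted value at x = 0 (the window center).
    return np.linalg.pinv(design)[0]

def savgol_smooth(spectrum, window_length=53, polyorder=10):
    """Smooth one spectrum; output has the same length as the input."""
    spectrum = np.asarray(spectrum, dtype=float)
    half = window_length // 2
    weights = savgol_coefficients(window_length, polyorder)
    # Repeat the end values so every point has a full window (assumed edge rule).
    padded = np.pad(spectrum, half, mode="edge")
    return np.convolve(padded, weights[::-1], mode="valid")

# A quadratic signal is reproduced exactly away from the edges by a degree-2 fit.
x = np.arange(20, dtype=float)
signal = 0.5 * x**2 - x + 2.0
smoothed = savgol_smooth(signal, window_length=5, polyorder=2)
```

Because a high polynomial degree follows the data closely, SG smoothing preserves peak shapes better than a plain moving average of the same width.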
(5) Data dimension reduction processing
a. Variable importance in projection
Variable Importance in Projection (VIP) is a method for evaluating and screening independent variables. Variables with a VIP score greater than 1 have an important influence on the predictive performance of the model, whereas variables with a score less than 1 carry redundant information that overlaps with other features and can be ignored. The method removes redundant variables and thereby reduces the dimension.
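VIP scores are computed from a fitted PLS model. As an illustration only, the sketch below fits a single-response PLS model with the NIPALS algorithm and applies the commonly used VIP formula; it is not the MATLAB implementation used in the experiments, and the function name and synthetic data are assumptions.

```python
import numpy as np

def pls_vip(X, y, n_components):
    """VIP score of each column of X from a NIPALS PLS1 fit."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xk = X - X.mean(axis=0)
    yk = y - y.mean()
    n_features = X.shape[1]
    W = np.zeros((n_features, n_components))
    explained = np.zeros(n_components)
    for a in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)           # unit-length weight vector
        t = Xk @ w                       # scores
        tt = t @ t
        p = Xk.T @ t / tt                # X loadings
        q = yk @ t / tt                  # y loading
        Xk = Xk - np.outer(t, p)         # deflate X
        yk = yk - q * t                  # deflate y
        W[:, a] = w
        explained[a] = q**2 * tt         # y variance explained by component a
    # Weighted average of squared weights, scaled by the number of features.
    return np.sqrt(n_features * (W**2 @ explained) / explained.sum())

# Synthetic check: only the first of six variables drives the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6))
y = 2.0 * X[:, 0] + 0.01 * rng.normal(size=40)
vip = pls_vip(X, y, n_components=2)
selected = np.flatnonzero(vip > 1.0)     # keep variables with VIP > 1
```

The squared VIP scores always sum to the number of variables, which is why 1 is the natural cut-off: it marks a variable of average importance.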
b. Filtering feature selection
Filter-based feature selection with a variance threshold (VT) selects features according to their variance; features with larger variance better reflect differences in the index. The variances of the features are sorted in descending order, the threshold is set to 1, and only features with variance greater than 1 are used to build the mapping to the target information, which reduces the dimension.
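The variance-threshold filter described above reduces to a few array operations. This is a minimal sketch with an illustrative function name and made-up data; the text does not state how the spectra are scaled before the threshold of 1 is applied, so the numbers here are chosen only to show the mechanics.

```python
import numpy as np

def variance_threshold_select(X, threshold=1.0):
    """Return (kept column indices, reduced matrix).

    Columns are ordered by descending variance, and only those whose
    variance exceeds the threshold are kept.
    """
    X = np.asarray(X, dtype=float)
    variances = X.var(axis=0)
    order = np.argsort(variances)[::-1]          # largest variance first
    kept = order[variances[order] > threshold]
    return kept, X[:, kept]

# Three samples, three features (hypothetical values).
X = np.array([[0.0, 10.0, 1.0],
              [0.1, 14.0, 1.0],
              [0.2, 18.0, 1.0]])
kept, X_reduced = variance_threshold_select(X, threshold=1.0)
```

Only the second column varies by more than the threshold, so it is the only feature retained.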
c. Principal component analysis
Principal Component Analysis (PCA) uses the variance, mean and correlation coefficient matrix of the spectral data, guided by eigenvalues and eigenvectors, to decompose the data into a set of new variables that are linearly uncorrelated with one another and carry different amounts of information. The variables are sorted by eigenvalue in descending order: the larger the eigenvalue, the better the variable represents the main directions of the features, and the smaller the eigenvalue, the less the variable contributes to modeling. To reduce dimensionality and improve efficiency, only a few representative variables, the principal components, are used for modeling, which is enough to give a good prediction.
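One compact way to obtain principal components is the singular value decomposition of the mean-centered data. The sketch below takes that route, which works from the covariance structure rather than the correlation coefficient matrix mentioned in the text (standardizing the columns first would give the correlation-based variant). The function name and random data are assumptions.

```python
import numpy as np

def pca_scores(X, n_components):
    """Project mean-centered data onto its leading principal components."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, already sorted by
    # decreasing singular value (i.e. decreasing explained variance).
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained_ratio = s**2 / np.sum(s**2)
    scores = centered @ vt[:n_components].T
    return scores, explained_ratio[:n_components]

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))             # 20 hypothetical spectra, 5 bands
scores, ratio = pca_scores(X, n_components=2)
```

The columns of `scores` are uncorrelated with one another, and `ratio` reports how much of the total variance each retained component carries.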
(6) Optimal prediction model selection
Mathematical models are established with three methods, Partial Least Squares Regression (PLSR), BP neural network (BPNN) and long short-term memory network (LSTM), corresponding to two hyperspectral maps and physicochemical index data of 400-. The test set data are substituted into the established models for verification, and the modeling effect of each method is evaluated using the R² (R Square) of the training set and of the test set, RMSEC (Root Mean Square Error of Calibration) and RMSEP (Root Mean Square Error of Prediction) as indexes. R² is the goodness of fit and the most important evaluation index; the closer its value is to 1, the better the model fits. RMSEC and RMSEP are the root mean square errors of the training set and the test set respectively and are also important indicators of model fit; the smaller the value, the better the fit. A further index, ACCURACY, is the prediction accuracy and is used when evaluating and selecting the number of principal components in PCA. Excel and Origin were used for data preprocessing, and the programs were written in MATLAB (M language) and Python. The PLSR model was implemented in MATLAB, and the BPNN and LSTM models were implemented with PyCharm and Anaconda.
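The evaluation indexes named above follow standard definitions and can be written in a few lines. In this sketch the measured and predicted values are hypothetical; RMSEC is the root mean square error computed on the training (correction) set and RMSEP the same quantity computed on the test (verification) set.

```python
import math

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean square error (RMSEC on training data, RMSEP on test data)."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))

measured = [40.0, 50.0, 60.0, 70.0]     # hypothetical total sugar contents (%)
predicted = [42.0, 48.0, 62.0, 68.0]    # hypothetical model outputs (%)
r2 = r_squared(measured, predicted)
error = rmse(measured, predicted)
```

Comparing RMSEC with RMSEP is a quick overfitting check: a model whose RMSEP is much larger than its RMSEC fits the training set better than it generalizes.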
a. Partial Least Squares Regression (PLSR)
The smoothed, dimension-reduced and standardized spectral data and the normalized chemical measurement values are set as X and Y respectively. Based on the idea of maximizing correlation, the correlation coefficient matrix is used to solve for the principal component pairs, calculate their contribution rates, and establish a many-to-many regression equation between X and Y.
b. BP neural network (BPNN)
The model is built as a 3-layer neural network consisting of an input layer, a hidden layer and an output layer. According to the Kolmogorov theorem, a three-layer neural network is sufficient to represent essentially all mathematical models. The 224-dimensional reflectance values at the sampled wavelengths are fed into 224 input-layer neurons; the number of hidden-layer neurons is set to 447, i.e. twice the number of input-layer neurons minus one; the chemical measurement data are fed to the output-layer neurons, whose number corresponds to the number of samples. The layers are fully connected to form the BP neural network. Optimization uses mini-batch stochastic gradient descent, one of the most common optimization algorithms in deep learning, which splits the training samples into small batches and improves the convergence efficiency of the model. The network is activated with the sigmoid function, which maps variables into (0, 1), allowing the model to learn and to filter and extract features. Each data set is trained for 1000 iterations.
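The layer sizes described above can be illustrated with a forward pass in NumPy. This is a sketch of the architecture only, not the trained model: the weight initialization, the batch of random inputs and the choice of a single output neuron (one predicted value per spectrum) are assumptions, since the text's description of the output layer is not specific.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def init_network(n_inputs, n_outputs=1, seed=0):
    """Create weights for a fully connected input -> hidden -> output network."""
    n_hidden = 2 * n_inputs - 1   # rule stated in the text: 224 inputs -> 447 hidden
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.05, size=(n_inputs, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.05, size=(n_hidden, n_outputs)),
        "b2": np.zeros(n_outputs),
    }

def forward(params, X):
    """Forward pass with sigmoid activations in both layers."""
    hidden = sigmoid(X @ params["W1"] + params["b1"])
    return sigmoid(hidden @ params["W2"] + params["b2"])

params = init_network(n_inputs=224)
batch = np.random.default_rng(1).random((8, 224))   # mini-batch of 8 hypothetical spectra
output = forward(params, batch)
```

Because the output passes through a sigmoid, predictions lie in (0, 1), which is consistent with training against normalized chemical values.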
c. Long short-term memory network (LSTM)
A five-layer neural network, more complex than the BP neural network, is used; the data undergo nonlinear processing and threshold analysis through the sigmoid and tanh functions. Sigmoid is used for back-propagation learning because of its fast learning speed, and tanh is used to learn and update the weight information so as to increase the weight of useful features. To improve the convergence efficiency of the model, a threshold of 0.27 is set for the 400-1000 nm hyperspectral data used for total sugar content determination. To simplify training, pruning is applied: some features in the hidden layer are dropped with a certain probability, which reduces complex co-adaptation among neurons, enhances generalization, and also reduces the dimension to some extent. Each data set is trained for 15000 iterations.
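The roles of sigmoid and tanh inside an LSTM can be seen in a single cell update, sketched below in NumPy. This is the textbook LSTM cell rather than the five-layer network of the experiments, whose exact architecture is not given. Treating each band as one time step, the hidden size of 16, and reading the probabilistic deletion of hidden features as dropout are all assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step. Shapes: W (4H, D), U (4H, H), b (4H,)."""
    H = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i = sigmoid(z[0:H])            # input gate: how much new information to admit
    f = sigmoid(z[H:2 * H])        # forget gate: how much old memory to keep
    o = sigmoid(z[2 * H:3 * H])    # output gate: how much memory to expose
    g = np.tanh(z[3 * H:4 * H])    # candidate cell values
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c

def dropout(h, rate, rng):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    keep = rng.random(h.shape) >= rate
    return h * keep / (1.0 - rate)

rng = np.random.default_rng(0)
D, H = 1, 16                                   # hypothetical input and hidden sizes
W = rng.normal(0.0, 0.1, (4 * H, D))
U = rng.normal(0.0, 0.1, (4 * H, H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
spectrum = rng.random(224)                     # hypothetical 224-band spectrum
for value in spectrum:                         # one band per time step (assumed)
    h, c = lstm_step(np.array([value]), h, c, W, U, b)
h_train = dropout(h, rate=0.2, rng=rng)        # applied during training only
```

The sigmoid gates act as soft switches between 0 and 1, while tanh keeps the cell contents and the output bounded, which is why the final hidden state always lies in (-1, 1).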
3. Results of the experiment
3.1 Total sugar content measurement results
The results of measuring the total sugar content of Gaozhou longan pulp are shown in Tables 1 and 2.
TABLE 1 analysis of the total sugar content of different species of Gaozhou longan pulp
Figure BDA0003192992720000081
TABLE 2 analysis of variance of total sugar content of different species of Gaozhou longan pulp
Figure BDA0003192992720000082
As can be seen from the data in Table 1, the total sugar content of Gaozhou longan pulp also spans a wide range, from 39.7% to 75.2%, but the differences between varieties are small. In contrast to the pattern for moisture content, the stock variety has the highest average total sugar content and the corn variety the lowest; this is related to variety characteristics on the one hand, and on the other hand higher moisture lowers the proportion of sugar. The product quality of Shixia fluctuates the most: it has both the largest range of moisture content and the widest range of total sugar content. The differences among the longan and wide-eye samples are small, so product standardization is easier to achieve for them.
3.2 Average spectra
The average spectra of Gaozhou longan pulp are shown in Fig. 3 and Fig. 4. As can be seen from Figs. 3 and 4, the spectra of Gaozhou longan pulp with different total sugar contents follow the same trend. The spectrum measured for total sugar content has higher reflectivity in the band of 760-. However, the average spectra of Gaozhou longan pulp with different total sugar contents overlap too much for differences in total sugar content to be read directly from the spectra. On the one hand, the Gaozhou longan pulp used to make the average spectra differs in variety as well as in total sugar content, which interferes with identifying the total sugar content; on the other hand, a hyperspectrum contains a massive amount of information, only a small part of which is related to the total sugar content, and it also contains redundant information, such as overlapping features, that is unrelated to total sugar content determination. The subsequent data preprocessing and dimension reduction steps are therefore very important: they reduce feature differences and extract useful information, thereby improving the modeling effect and efficiency.
3.3 Hyperspectral preprocessing results
3.3.1 MA smoothing
Figs. 5 and 6 compare the 400-1000 nm and 900-1700 nm hyperspectra for total sugar content determination of Gaozhou longan pulp before and after MA smoothing. As can be seen from Figs. 5 and 6, MA smoothing changes the spectral curves substantially: many important features of the original spectra are lost, and although the characteristics at each wavelength are still implicitly reflected, they are no longer distinct. MA smoothing is therefore not recommended as the spectral preprocessing method for hyperspectral modeling of the total sugar content of Gaozhou longan pulp.
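Why moving-average (MA) smoothing can erase narrow spectral features is shown by the following minimal sketch. The centered window, its width of 3 and the reflectance values are illustrative assumptions; the window actually used is not stated in the text.

```python
def moving_average(spectrum, window=5):
    """Centered moving average; the window shrinks at the two ends."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(spectrum)):
        lo = max(0, i - half)
        hi = min(len(spectrum), i + half + 1)
        segment = spectrum[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


# Made-up reflectance values containing one narrow peak at index 2.
reflectance = [0.10, 0.12, 0.50, 0.13, 0.11, 0.12, 0.10]
smoothed = moving_average(reflectance, window=3)
print("peak before:", reflectance[2], "peak after:", round(smoothed[2], 3))
```

Because every point is replaced by the plain mean of its neighbours, the narrow peak is spread over the adjacent bands and its height drops sharply, which is the kind of feature loss noted above.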
3.3.2 SG smoothing
Figs. 7 and 8 compare the 400-1000 nm and 900-1700 nm hyperspectra for total sugar content determination of Gaozhou longan pulp before and after SG smoothing. As can be seen from Figs. 7 and 8, the spectral curves become smooth after SG smoothing as noise and background interference are reduced. No important information is lost, and because the variance is reduced the features become more concentrated and represent the specific trend of the spectrum. SG smoothing is therefore very suitable as the spectral preprocessing method for hyperspectral modeling of the total sugar content of Gaozhou longan pulp.
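SG (Savitzky-Golay) smoothing fits a low-order polynomial inside each window by least squares and keeps the fitted value at the window center, which is why curved features survive. The sketch below uses NumPy; the window width (5), the polynomial order (2), the edge handling and the function name `savgol_smooth` are assumptions, not parameters given in the text.

```python
import numpy as np


def savgol_smooth(y, window=5, order=2):
    """Savitzky-Golay smoothing by local least-squares polynomial fits."""
    y = np.asarray(y, dtype=float)
    if window < 3 or window % 2 == 0 or window <= order or window > y.size:
        raise ValueError("window must be odd, >= 3, > order and <= len(y)")
    half = window // 2
    offsets = np.arange(-half, half + 1, dtype=float)
    # Columns are offsets**0, offsets**1, ...; row 0 of the pseudo-inverse
    # gives the weights that return the fitted value at the window center.
    design = np.vander(offsets, order + 1, increasing=True)
    center_weights = np.linalg.pinv(design)[0]
    out = np.empty_like(y)
    for i in range(half, y.size - half):
        out[i] = center_weights @ y[i - half:i + half + 1]
    # Edges: evaluate the polynomial fitted to the first and last window.
    positions = np.arange(window, dtype=float)
    head = np.polyfit(positions, y[:window], order)
    out[:half] = np.polyval(head, positions[:half])
    tail = np.polyfit(positions, y[-window:], order)
    out[-half:] = np.polyval(tail, positions[-half:])
    return out


x = np.arange(10.0)
curved = 0.5 * x**2 - x + 3.0      # hypothetical smooth spectral feature
print(np.allclose(savgol_smooth(curved), curved))
```

A second-order fit reproduces a quadratic curve without distortion, whereas a plain moving average would flatten it.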
3.4 Dimension reduction results
3.4.1 Principal component analysis
Tables 3 and 4 show the modeling effect of selecting the first 1, 3, 5 and 7 principal components in PLSR modeling of the 400-1000 nm and 900-1700 nm hyperspectra for total sugar content determination of Gaozhou longan pulp.
TABLE 3 PLSR modeling effect of the 400-1000 nm hyperspectrum for total sugar content determination of Gaozhou longan pulp
Figure BDA0003192992720000091
TABLE 4 PLSR modeling effect of the 900-1700 nm hyperspectrum for total sugar content determination of Gaozhou longan pulp
Figure BDA0003192992720000092
Figure BDA0003192992720000101
Because slightly fewer data are available for total sugar determination, the hyperspectral PLSR model for the total sugar content of Gaozhou longan pulp is built with the first 1, 3, 5 and 7 principal components. As is clear from Tables 3 and 4, the PLSR model for the total sugar content of Gaozhou longan pulp improves as the number of principal components increases. For the 400-1000 nm hyperspectral model, the modeling effects of the first 5 and the first 7 principal components differ little; for the 900-1700 nm hyperspectral model, the modeling effect of the first 7 principal components is clearly better than that of the first 1, 3 and 5. It is therefore most reasonable to use the first 7 principal components consistently for hyperspectral modeling of the total sugar content of Gaozhou longan pulp in both the 400-1000 nm and 900-1700 nm bands.
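How much information the first 1, 3, 5 and 7 components retain can be checked through the cumulative explained variance. The sketch below uses plain PCA computed by singular value decomposition on synthetic data; the sample count (30), the three hidden factors and the noise level are assumptions, and the PLSR components used in the text need not coincide with PCA components.

```python
import numpy as np


def cumulative_explained_variance(X):
    """Cumulative fraction of variance captured by successive components."""
    centered = X - X.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variance = singular_values ** 2
    return np.cumsum(variance) / variance.sum()


rng = np.random.default_rng(0)
# Synthetic stand-in for spectra: 30 samples x 224 bands driven by
# 3 hidden factors plus a small amount of noise (all values made up).
latent = rng.normal(size=(30, 3))
loadings = rng.normal(size=(3, 224))
spectra = latent @ loadings + 0.01 * rng.normal(size=(30, 224))

ratios = cumulative_explained_variance(spectra)
for k in (1, 3, 5, 7):
    print(f"first {k} components: {ratios[k - 1]:.4f}")
```

Once the curve flattens, adding further components brings little gain, which mirrors the small difference reported between 5 and 7 components in the 400-1000 nm band.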
3.4.2 VIP score
The VIP score corresponding to each wavelength in PLSR modeling of the 400-1000 nm and 900-1700 nm hyperspectra for total sugar content determination of Gaozhou longan pulp is shown in Figs. 9 and 10. As can be seen from Figs. 9 and 10, the 400-1000 nm hyperspectral PLSR model has two concentrated regions with VIP scores greater than 1, at dimensions of about 0-35 and 135-180; that is, the hyperspectrum in the bands of about 400-500 nm and 750-880 nm plays a large role in determining the total sugar content of Gaozhou longan pulp. The 900-1700 nm hyperspectral PLSR model also has two concentrated regions with VIP scores greater than 1, at dimensions of about 0-60 and 135-145, but the latter is far smaller than the former. The hyperspectrum in the 900-1150 nm band therefore plays a large role in determining the total sugar content of Gaozhou longan pulp, and the hyperspectrum in the 1400-1450 nm band also has some influence. This result is also consistent with the conclusion of the earlier spectral analysis that the response in the short-wave near-infrared region is higher.
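The VIP score of a wavelength can be computed from the weight vectors of a single-response PLS model, and wavelengths with a score above 1 are kept. The sketch below implements a small NIPALS PLS1 fit in NumPy on synthetic data; the function name `pls1_vip`, the two components and the data are assumptions for illustration and are not the model used in the text.

```python
import numpy as np


def pls1_vip(X, y, n_components):
    """VIP score of every column of X from a NIPALS PLS1 fit."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    X = X - X.mean(axis=0)
    y = y - y.mean()
    n_features = X.shape[1]
    weights = np.zeros((n_features, n_components))
    explained = np.zeros(n_components)     # y variance explained per component
    for a in range(n_components):
        w = X.T @ y
        w /= np.linalg.norm(w)             # unit-length weight vector
        t = X @ w                          # scores
        tt = t @ t
        p = X.T @ t / tt                   # X loadings
        q = (y @ t) / tt                   # y loading
        X = X - np.outer(t, p)             # deflate X
        y = y - q * t                      # deflate y
        weights[:, a] = w
        explained[a] = q ** 2 * tt
    return np.sqrt(n_features * (weights ** 2 @ explained) / explained.sum())


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 8))          # 60 samples x 8 hypothetical bands
y_demo = 3.0 * X_demo[:, 0] + 0.1 * rng.normal(size=60)
vip = pls1_vip(X_demo, y_demo, n_components=2)
selected = np.flatnonzero(vip > 1.0)       # bands with VIP score above 1
print("selected band indices:", selected)
```

With unit-length weight vectors the mean of the squared VIP scores equals 1, which is why 1 is the customary cut-off for an above-average contribution.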
3.4.3 Filter feature selection
The variance-based feature selection for the 400-1000 nm and 900-1700 nm hyperspectra for total sugar content determination of Gaozhou longan pulp is shown in Figs. 11 and 12. As can be seen from Figs. 11 and 12, the thresholds set for the 400-1000 nm and 900-1700 nm hyperspectral data sets are 0.52 and 0.50, respectively. Selection reduces the original 224 dimensions of the two hyperspectral bands to 62 and 49, respectively; that is, 62 characteristic wavelengths are selected from the hyperspectrum in the band of 400-.
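Variance-based filtering keeps only the bands whose variance across samples exceeds the threshold. The sketch below applies the 0.52 threshold quoted above to a tiny made-up data set; the population variance, the data values and the function name `variance_threshold` are assumptions, and the scaling applied to the spectra before thresholding is not stated in the text.

```python
from statistics import pvariance


def variance_threshold(samples, threshold):
    """Return indices of features whose variance exceeds the threshold."""
    n_features = len(samples[0])
    variances = [pvariance([row[j] for row in samples])
                 for j in range(n_features)]
    kept = [j for j, v in enumerate(variances) if v > threshold]
    return kept, variances


# Four hypothetical samples with three bands each (values are made up).
samples = [
    [0.0, 1.0, 0.0],
    [2.0, 1.0, 1.0],
    [0.0, 1.0, 0.0],
    [2.0, 1.0, 1.0],
]
kept, variances = variance_threshold(samples, threshold=0.52)
print("variances:", variances, "kept bands:", kept)
```

Only the first band varies enough to pass the threshold; the constant band and the weakly varying band are discarded.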
3.4.4 Comparison of the effects of the three dimension reduction methods
Table 5 shows the PLSR modeling effect of the different dimension reduction treatments in hyperspectral modeling of the total sugar content of Gaozhou longan pulp.
TABLE 5 PLSR modeling effect of different dimension reduction treatments in hyperspectral modeling of the total sugar content of Gaozhou longan pulp
Figure BDA0003192992720000102
Figure BDA0003192992720000111
As can be seen from Table 5, among the three dimension reduction methods for hyperspectral modeling of the total sugar content of Gaozhou longan pulp at 400-1000 nm and 900-1700 nm, VIP and PCA, which take multiple factors into account, perform better; the variance-based VT is the least effective, with the smallest Rc² and Rp² and the largest RMSEC and RMSEP. Of the two, VIP reduces the dimension better than PCA in total sugar content modeling and gives larger Rc² and Rp², which shows that the influence of the independent variables on the correlation is more important in the hyperspectral model for total sugar content determination and should be given priority during dimension reduction. VIP is therefore the most effective dimension reduction method for hyperspectral modeling of the total sugar content of Gaozhou longan pulp in both bands, with the highest goodness of fit and the smallest root mean square error, and the spectra are processed uniformly with this method when the other models are built subsequently.
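The evaluation indices used in the comparison can be computed as sketched below. R² is taken here as the coefficient of determination and RMSE as the root mean square error; applying the functions to the correction set gives Rc² and RMSEC, and applying them to the verification set gives Rp² and RMSEP. The measured and predicted values are made up, and the exact R² definition used in the text is an assumption.

```python
import math


def r_squared(measured, predicted):
    """Coefficient of determination between measured and predicted values."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    squared = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    return math.sqrt(squared / len(measured))


# Hypothetical total sugar contents (%) for four verification samples.
measured = [40.0, 50.0, 60.0, 70.0]
predicted = [42.0, 49.0, 61.0, 68.0]
print("Rp2 =", round(r_squared(measured, predicted), 4))
print("RMSEP =", round(rmse(measured, predicted), 4))
```

A better model has R² closer to 1 and a smaller root mean square error, which is the basis on which the methods are ranked.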
3.5 Comparison of prediction models
Table 6 shows the modeling effect of the three mathematical models in hyperspectral modeling of the total sugar content of Gaozhou longan pulp in the 400-1000 nm and 900-1700 nm bands.
TABLE 6 Modeling effect of different mathematical models in hyperspectral modeling of the total sugar content of Gaozhou longan pulp
Figure BDA0003192992720000112
As can be seen from Table 6, among the three mathematical models for hyperspectral modeling of the total sugar content of Gaozhou longan pulp at 400-1000 nm and 900-1700 nm, PLSR has the worst modeling effect, with the smallest Rc² and Rp² and the largest RMSEC and RMSEP; BPNN is second; LSTM has the best modeling effect, with the largest Rc² and Rp² of the three and with RMSEC and RMSEP greatly reduced compared with the other two models.
Therefore, among the three models of partial least squares regression, BP neural network and long short-term memory network, the long short-term memory network is the most suitable for modeling the total sugar content of Gaozhou longan pulp. Comparing the modeling effects of the hyperspectral LSTM in the 400-1000 nm and 900-1700 nm bands on the basis of smaller RMSEC and RMSEP and higher Rc² and Rp², the 400-1000 nm LSTM model is finally selected for nondestructive rapid detection of the total sugar content of Gaozhou longan pulp.
Among the three models used in the invention, the LSTM and BPNN models predict the total sugar content of Gaozhou longan pulp clearly better than the PLSR model, and the LSTM has the highest goodness of fit and the smallest root mean square error; this difference in capability is related to some extent to the number of samples used in modeling. The three models also perform better on the 400-1000 nm hyperspectral data than on the 900-1700 nm data. In the LSTM model, the root mean square error of modeling on the 400-1000 nm hyperspectral data is larger than that on the 900-1700 nm data, but it is clearly smaller than that of the BPNN and PLSR models. Therefore, when sample data are limited, LSTM modeling is preferred, since it extracts more useful information and improves the prediction effect.
The invention analyzes the chemical experiment and the hyperspectral modeling for determining the total sugar content of Gaozhou longan pulp, and the results show that: (1) The spectra were smoothed using MA and SG, of which SG works best, smoothing the spectral curves without losing important features. (2) The spectral dimension was reduced using VIP, VT and PCA, and the methods were compared through the Rc², Rp², RMSEC and RMSEP of PLSR models; VIP reduces the dimension best. (3) PLSR, BPNN and LSTM models were established from the preprocessed spectra and the chemically measured values, and the nonlinear LSTM model, with its long short-term memory, has the best modeling effect in the 400-1000 nm band. (4) When modeling data are limited, LSTM should be preferred in order to extract more information and improve prediction accuracy.
Although the invention has been described in detail hereinabove with respect to a general description and specific embodiments thereof, it will be apparent to those skilled in the art that modifications or improvements may be made thereto based on the invention. Accordingly, such modifications and improvements are intended to be within the scope of the invention as claimed.

Claims (10)

1. A method for detecting the total sugar content in longan pulp based on hyperspectrum and deep learning is characterized by comprising the following steps:
A. collecting hyperspectral data of a longan pulp sample in a range of 400-;
B. determining the total sugar content of the longan pulp sample by a colorimetric method;
C. dividing the collected spectral data into a correction set and a verification set, taking the spectral data of the correction set as the independent variable and the total sugar content as the dependent variable, combining spectral data preprocessing and spectral data dimension reduction processing to establish a long short-term memory network prediction model, and verifying the established model with the verification set;
D. evaluating the model built in step C, judging the effectiveness of the model and obtaining an effective mathematical model;
E. collecting hyperspectral data of the longan pulp sample to be detected under the same experimental conditions, and calculating the total sugar content of the longan pulp sample to be detected using the effective mathematical model obtained in step D.
2. The method of claim 1, wherein the hyperspectral data collected in step a is hyperspectral reflectance data.
3. The method according to claim 1, wherein the determination of the total sugar content in step B comprises:
B1, grinding a longan pulp sample of 0.5-1.0 g into powder, putting the powder into a centrifuge tube, adding 2 mL of water to soak the powder, adding 2 mL of water, shaking and uniformly mixing with a vortex oscillator, and extracting in a water bath at 37 ℃ for 30 min; centrifuging at 10000 rpm for 10 min after extraction is finished, and taking the supernatant; adding 4 mL of water again, extracting in a water bath at 37 ℃ for 30 min, centrifuging at 10000 rpm for 10 min, collecting the supernatant, filtering, and making up to a constant volume of 10 mL with water; then diluting the sample solution 200 times, and determining the total sugar content by a colorimetric method; adding 0.5 mL of the diluted sample solution to 0.5 mL of 5% phenol solution, rapidly adding 2.5 mL of concentrated sulfuric acid, standing in an ice water bath for 10 min, shaking and uniformly mixing with a vortex oscillator, heating in a water bath at 90 ℃ for 20 min, rapidly cooling in an ice water bath, and measuring the light absorption value at 490 nm;
B2, respectively taking 0.5 mL of glucose standard working solutions with concentrations of 0 mg/L, 20 mg/L, 40 mg/L, 60 mg/L, 80 mg/L and 100 mg/L, carrying out the reaction by the same method as in B1, measuring the light absorption values, and establishing a glucose standard curve;
B3, substituting the measured light absorption value of each longan pulp sample into the standard curve to obtain the total sugar content of each longan pulp sample.
4. The method of claim 1, wherein the sample ratio of the calibration set to the verification set in step C is 5: 1.
5. The method of claim 1, wherein the preprocessing of the spectral data in step C employs an SG smoothing algorithm or a moving average model algorithm.
6. The method of claim 1, wherein the spectral data dimension reduction in step C employs a variable importance in projection (VIP) algorithm.
7. The method of claim 1, wherein R², RMSEC and RMSEP are used in step D as evaluation indices to determine the effectiveness of the model.
8. The method of any one of claims 1 to 7, wherein the longan pulp is dry pulp.
9. The method of claim 8, wherein the longan pulp is Gaozhou longan pulp.
10. The method of claim 8, wherein the longan variety is selected from the group consisting of sorrel, native longan, euryale, gorge, and corn.
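The standard-curve calculation of steps B2 and B3 in claim 3 can be sketched as follows. The glucose concentrations, the 200-fold dilution and the 10 mL volume are taken from claim 3; the absorbance readings, the straight-line least-squares fit, the conversion to a mass percentage and the function names are assumptions made for illustration.

```python
def fit_line(x, y):
    """Least-squares straight line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def total_sugar_percent(absorbance, slope, intercept, sample_mass_g,
                        dilution=200, volume_ml=10.0):
    """Total sugar as a mass percentage of the sample (assumed conversion)."""
    conc_mg_per_l = (absorbance - intercept) / slope   # diluted solution
    sugar_mg = conc_mg_per_l * dilution * volume_ml / 1000.0
    return 100.0 * sugar_mg / (sample_mass_g * 1000.0)


standards_mg_per_l = [0, 20, 40, 60, 80, 100]          # from claim 3
absorbances = [0.00, 0.16, 0.32, 0.48, 0.64, 0.80]     # made-up readings
slope, intercept = fit_line(standards_mg_per_l, absorbances)

# Hypothetical sample: 0.5 g of pulp giving an absorbance of 0.40 at 490 nm.
percent = total_sugar_percent(0.40, slope, intercept, sample_mass_g=0.5)
print(f"total sugar content: {percent:.1f}%")
```

The sample absorbance should fall within the range covered by the standards; otherwise the dilution would need to be adjusted before reading the curve.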
CN202110884882.2A 2021-08-02 2021-08-02 Method for detecting total sugar content in longan pulp based on hyperspectrum and deep learning Pending CN113670837A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110884882.2A CN113670837A (en) 2021-08-02 2021-08-02 Method for detecting total sugar content in longan pulp based on hyperspectrum and deep learning

Publications (1)

Publication Number Publication Date
CN113670837A true CN113670837A (en) 2021-11-19

Family

ID=78541661

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110884882.2A Pending CN113670837A (en) 2021-08-02 2021-08-02 Method for detecting total sugar content in longan pulp based on hyperspectrum and deep learning

Country Status (1)

Country Link
CN (1) CN113670837A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20211119
