CN112924410B - Terahertz spectrum rapid identification method for germinated sunflower seeds - Google Patents
- Publication number
- CN112924410B
- Authority
- CN
- China
- Prior art keywords
- sunflower seeds
- germinated
- terahertz
- sample
- sunflower
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N21/00—Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
- G01N21/17—Systems in which incident light is modified in accordance with the properties of the material investigated
- G01N21/25—Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
- G01N21/31—Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry
- G01N21/35—Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light
- G01N21/3581—Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light using far infrared light; using Terahertz radiation
- G01N21/3586—Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light using far infrared light; using Terahertz radiation by Terahertz time domain spectroscopy [THz-TDS]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Abstract
The invention discloses a rapid terahertz-spectrum identification method for germinated sunflower seeds. Germinated seeds are detected with the terahertz attenuated total reflection (ATR) technique, so the sunflower seed sample does not need to be ground and pressed into tablets. A spectral preprocessing method is combined with a machine learning algorithm to establish a refractive-index-based extreme learning machine (ELM) qualitative model of normal and germinated sunflower seeds, enabling rapid, accurate and pollution-free detection of sunflower seed quality.
Description
Technical Field
The invention provides a rapid terahertz-spectrum identification method for germinated sunflower seeds. In particular, it relates to the rapid, accurate and pollution-free detection of germinated sunflower seeds using the terahertz attenuated total reflection technique, and belongs to the technical field of oil crop quality detection.
Background
As concern about the safety of edible oil grows, the quality and safety of oil crops, the main source of edible vegetable oil, have also drawn attention. Raw material selection is the most important link in actual production, and the quality of the oil crop directly affects the quality of the oil produced. Sunflower seeds are one of the five major oil crops in the world, so detecting their quality is particularly important. When, through careless management, the safe moisture content of the crop (generally below 11%) is ignored and harvested sunflower seeds with a higher moisture content are stored directly, the seeds germinate once the ambient temperature is suitable. Germination reduces the content of nutrients such as protein and fat in the seeds, which lowers the subsequent oil yield and the quality of the oil. In addition, if germinated seeds that appear during storage are left untreated, they may even become mouldy and threaten human health. A rapid and effective method for detecting germinated seeds is therefore necessary.
According to the inspection regulations of national standard GBT 5494-. With improvements in optical instruments and communication technologies in recent years, methods that use spectroscopic techniques to rapidly detect and evaluate agricultural product quality have emerged. Spectroscopy is now widely used in agricultural product safety testing: it is rapid, simple, pollution-free and suitable for online monitoring, and it has great practical value for compound identification, structural analysis of unknown compounds, and quantitative analysis of compounds. However, classical spectroscopic techniques also have shortcomings. Several are disturbed by the colour and shape of the sample, have insufficient spatial resolution or sensitivity, or cannot detect trace samples or measure through sealed packaging. These hard-to-avoid problems have hindered the use of classical spectroscopy in oil crop quality detection, so the safety problem of oil crops has not been solved well.
The emerging terahertz spectroscopy technology can overcome these shortcomings and is expected to solve the problem better. It comprises terahertz time-domain spectroscopy and terahertz imaging. Terahertz time-domain spectroscopy uses femtosecond laser technology to obtain broadband terahertz pulses. Compared with other spectroscopic techniques, it carries more information, has low energy and does not damage the quality attributes of agricultural products, which makes it promising for the quality and safety problems of oil crops and other agricultural products.
Disclosure of Invention
To overcome the shortcomings of the prior art, the invention provides a method for detecting germinated sunflower seeds using the terahertz attenuated total reflection technique. A qualitative model of germinated sunflower seeds is established from selected characteristic bands, so that germinated seeds can be detected rapidly, accurately and without pollution, with no complex sample preparation and with high accuracy.
The technical scheme provided by the invention is as follows:
The invention provides a rapid terahertz-spectrum identification method for germinated sunflower seeds. Germinated seeds are detected with the terahertz attenuated total reflection technique, so the sunflower seed sample does not need to be ground and pressed into tablets. On this basis, a spectral preprocessing method is combined with a machine learning algorithm to establish a refractive-index-based Extreme Learning Machine (ELM) qualitative model of normal and germinated sunflower seeds, so that the quality of oil crops is detected rapidly, accurately and without pollution. The method comprises the following steps:
1) Obtaining the sunflower seed samples to be identified from the same batch of sunflower seeds;
2) Purging the oil seeds (sunflower seed samples) clean to prepare the samples to be detected;
3) Collecting terahertz attenuated total reflection spectral data of the samples to be detected with a terahertz time-domain spectrometer in a dry environment at room temperature (about 22 ℃);
In a specific implementation of the invention, the sample to be detected is shelled, and its terahertz attenuated total reflection time-domain spectral information is then collected directly.
4) Processing the terahertz spectral data; the collected terahertz time-domain spectral information undergoes a series of processing operations, including conversion from the time domain to the frequency domain, signal denoising, optical constant extraction, and preprocessing of the spectral data.
5) Analyzing the preprocessed optical constant data and selecting the characteristic bands;
6) Establishing a qualitative identification model of germinated sunflower seeds based on the terahertz attenuated total reflection technique from the selected characteristic bands;
7) Using the qualitative identification model established in step 6) to achieve rapid and nondestructive detection of sunflower seed quality.
In step 1), in a specific implementation, the experimental samples are commercially available oil sunflower seeds purchased in a single batch. Seeds of uniform size, shape and colour are selected so that their physical and chemical properties are essentially consistent, and they are stored in a dry, ventilated room-temperature environment to keep them stable.
In step 2), in the experiments carried out for the invention, the sunflower seed samples may contain impurities such as dust that would affect the experiment, so they are simply purged clean and dried in a dry, ventilated room-temperature environment. A number of normal sunflower seeds are then set aside, and the remaining seeds are placed in a warm, humid and dark semi-closed environment for germination culture.
In step 3), in the experiments carried out for the invention, the shell of each sunflower seed is removed, and the terahertz time-domain spectral information of the sample is then collected directly using the attenuated total reflection technique.
In step 4), the terahertz signal differs from common infrared and Raman spectra in that it contains richer sample information, so extracting more useful information from the measured signal is of great importance. The experiments carried out for the invention include the analysis and extraction of effective information from the time-domain spectral data. Specifically, for the collected time-domain spectral data: first, a Fast Fourier Transform (FFT) converts the time-domain signal of the sample into the frequency domain; at the same time, a suitable apodization function is applied to window the signal, suppressing the noise that signal truncation introduces in the time-to-frequency transformation. Preferably, the Happ-Genzel function is selected, which balances signal-to-noise ratio and resolution. Optical constants such as absorbance, absorption coefficient and refractive index are then obtained from the amplitude and phase of the frequency-domain signal. Preferably, the refractive index is selected as the optical constant. Finally, a suitable preprocessing method is selected by analyzing the characteristics of the spectral data. The raw refractive index data of germinated and normal sunflower seeds overlap heavily and show no obvious characteristic peaks, so they cannot be used directly to identify germinated seeds. Because normalization amplifies the features of the data and improves its performance, normalization is the preferred preprocessing method.
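The apodization and FFT stage described in step 4) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the choice to centre the window on the pulse peak, and the synthetic pulse are all assumptions. The Happ-Genzel window is taken in its usual form 0.54 + 0.46·cos(πx/L), and the window is applied to the time-domain trace before the transform. Extracting the refractive index from ATR amplitude and phase depends on the prism geometry and is not shown.

```python
import numpy as np

C_CM_PER_S = 2.99792458e10  # speed of light in cm/s, used for Hz -> cm^-1


def happ_genzel_window(n_points, peak_index):
    """Happ-Genzel apodization 0.54 + 0.46*cos(pi*x/L), centred on peak_index.

    Centring the window on the pulse peak is an assumption of this sketch.
    """
    offsets = np.arange(n_points) - peak_index
    half_width = max(int(np.abs(offsets).max()), 1)
    return 0.54 + 0.46 * np.cos(np.pi * offsets / half_width)


def apodized_spectrum(signal, dt):
    """Window a time-domain trace and return (wavenumber, amplitude, phase).

    signal : 1-D time-domain trace
    dt     : sampling interval in seconds
    """
    signal = np.asarray(signal, dtype=float)
    peak = int(np.argmax(np.abs(signal)))
    windowed = signal * happ_genzel_window(signal.size, peak)
    spectrum = np.fft.rfft(windowed)
    freq_hz = np.fft.rfftfreq(signal.size, d=dt)
    wavenumber = freq_hz / C_CM_PER_S  # cm^-1
    return wavenumber, np.abs(spectrum), np.unwrap(np.angle(spectrum))


# Synthetic Gaussian pulse standing in for a measured THz waveform.
dt = 0.05e-12
t = np.arange(1024) * dt
pulse = np.exp(-((t - 10e-12) / 0.3e-12) ** 2)
wavenumber, amplitude, phase = apodized_spectrum(pulse, dt)
```

The same routine would be applied to the reference trace and to each sample trace, so that amplitude ratios and phase differences can be formed afterwards.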
In step 5), the refractive index data after normalization preprocessing are analyzed, and the characteristic bands are selected as the model input data. Specifically, the refractive index data of germinated sunflower seeds first rise and then gradually fall within 10-40 cm⁻¹, and overall first fall and then rise within 60-80 cm⁻¹; the refractive index data of normal sunflower seeds overall first fall and then gradually rise within 10-40 cm⁻¹, and first rise and then fall within 60-80 cm⁻¹. The refractive index data of both types of sunflower seeds have characteristic peaks within 10-40 cm⁻¹ and 60-80 cm⁻¹, so the invention selects the refractive index data in 10-40 cm⁻¹ and 60-80 cm⁻¹ to build the Extreme Learning Machine (ELM) model.
In step 6), in the experiments carried out for the invention, the selected characteristic bands (10-40 cm⁻¹ and 60-80 cm⁻¹) are used as model input data to establish an Extreme Learning Machine (ELM) qualitative identification model of germinated sunflower seeds, which comprises the following operations:
61) First, an ELM qualitative model of germinated sunflower seeds is established. The collected time-domain spectral information of normal and germinated sunflower seeds is preprocessed as detailed in step 4), and the samples are then randomly divided into a training set and a test set at a ratio of 2:1. The numbers of input-layer, output-layer and hidden-layer nodes of the ELM network are then determined, and a refractive-index-based ELM qualitative model of germinated sunflower seeds is established;
In a specific implementation, a Fourier transform is first applied to the terahertz time-domain spectral information of the sunflower seed samples collected in step 3) to obtain frequency-domain information, from which the refractive index data are calculated. The refractive index data are normalized, the normalized data are analyzed, the characteristic bands are determined from the positions of the characteristic peaks in the refractive index spectrum, and the refractive index data in the characteristic bands are used as modeling data. Category label values are assigned to germinated and normal sunflower seeds respectively. The samples are randomly divided into a training set and a test set at a ratio of 2:1. The training set data are the input of the ELM model, and the number of input-layer nodes is determined by the number of features in the input data; the training set label values are the model output, and the number of output-layer nodes is determined by the number of label categories. The ELM qualitative model of germinated sunflower seeds is thereby established.
62) The number of hidden nodes in the ELM network in step 61) is calculated with the empirical formula used for the number of hidden nodes m in a Back Propagation (BP) neural network, $m = \sqrt{n + l} + a$, where n is the number of input-layer nodes, l is the number of output-layer nodes, and a is an integer from 1 to 10.
The ELM model structure comprises an input layer, a hidden layer and an output layer. The numbers of input-layer, output-layer and hidden-layer nodes are the parameters of the ELM neural network model and determine its structure. In a specific implementation, the ELM network structure is as follows: the number of input-layer nodes is 52 (determined by the number of characteristic wavenumber points in the characteristic bands), the number of output-layer nodes is 2 (determined by the sample label values for germinated and normal sunflower seeds), the number of hidden-layer nodes is 13 according to the empirical formula used to determine the number of hidden-layer nodes in a BP neural network, and the sigmoid ("sig") function is selected as the activation function.
63) In step 61), the sample refractive index data are normalized to the range [0, 1].
64) In step 61), the characteristic bands are 10-40 cm⁻¹ and 60-80 cm⁻¹, containing 52 characteristic wavenumber points in total.
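The labelling and 2:1 random split described in step 61) can be sketched as below. The function name, the label convention (0 for normal, 1 for germinated), the fixed seed and the placeholder feature matrix are assumptions made for illustration; the patent does not state whether the split is stratified, and this sketch is not.

```python
import numpy as np


def split_two_to_one(features, labels, seed=0):
    """Randomly split samples into training and test sets at a 2:1 ratio."""
    features = np.asarray(features)
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(labels))      # shuffled sample indices
    n_train = (2 * len(labels)) // 3          # two thirds go to training
    train_idx, test_idx = order[:n_train], order[n_train:]
    return (features[train_idx], labels[train_idx],
            features[test_idx], labels[test_idx])


# Placeholder data: 30 normal + 30 germinated seeds, 52 band features each.
X = np.random.default_rng(1).random((60, 52))
y = np.array([0] * 30 + [1] * 30)             # 0 = normal, 1 = germinated (assumed)
X_train, y_train, X_test, y_test = split_two_to_one(X, y)
```

With 60 samples this yields 40 training and 20 test spectra; the class balance inside each set varies with the random seed.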
In step 7), the refractive index data of the test set samples are input into the ELM qualitative model of germinated sunflower seeds established in step 6), and the model outputs a predicted category label value for each test sample. The accuracy of the established ELM model is tested by comparing the predicted label values with the actual label values of the test set samples: a prediction is correct when the predicted category label equals the actual category label. The established ELM model enables rapid and accurate identification of germinated sunflower seeds.
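A minimal extreme learning machine in the standard formulation (random, fixed input weights; sigmoid hidden layer; output weights solved with the Moore-Penrose pseudoinverse) illustrates the train-and-test procedure of steps 6) and 7). The class name, the uniform weight initialization, the one-hot label encoding and the synthetic data are assumptions of this sketch and are not taken from the patent.

```python
import numpy as np


class SimpleELM:
    """Single-hidden-layer ELM classifier (illustrative sketch)."""

    def __init__(self, n_hidden=13, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activation of the randomly projected inputs.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # One-hot targets: one output node per class.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        # Input weights and biases are drawn once and never trained.
        self.W = self.rng.uniform(-1.0, 1.0, (X.shape[1], self.n_hidden))
        self.b = self.rng.uniform(-1.0, 1.0, self.n_hidden)
        # Output weights: least-squares solution of H @ beta = T.
        self.beta = np.linalg.pinv(self._hidden(X)) @ T
        return self

    def predict(self, X):
        scores = self._hidden(np.asarray(X, dtype=float)) @ self.beta
        return self.classes_[np.argmax(scores, axis=1)]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the actual label."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))


# Synthetic stand-in for normalized 52-point band features.
rng = np.random.default_rng(42)
X_train = np.vstack([rng.normal(0.3, 0.05, (20, 52)),
                     rng.normal(0.7, 0.05, (20, 52))])
y_train = np.array([0] * 20 + [1] * 20)
X_test = np.vstack([rng.normal(0.3, 0.05, (10, 52)),
                    rng.normal(0.7, 0.05, (10, 52))])
y_test = np.array([0] * 10 + [1] * 10)

model = SimpleELM(n_hidden=13, seed=0).fit(X_train, y_train)
y_pred = model.predict(X_test)
print("test accuracy:", accuracy(y_test, y_pred))
```

Because only the output weights are solved for, training is a single linear-algebra step, which is the source of the fast learning attributed to ELM.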
Compared with the prior art, the invention has the following beneficial effects:
The invention provides a rapid terahertz-spectrum identification method for germinated sunflower seeds. Detection and identification are based on terahertz time-domain spectroscopy and achieve rapid and accurate identification of germinated sunflower seeds. The method uses highly sensitive terahertz spectroscopy and requires no complicated preparation steps: the sample only needs to be shelled before measurement, so rapid, accurate and pollution-free detection of oil seed quality is achieved.
In the invention, the modeling data are not the time-domain signal of the sample but the calculated refractive index data. After analysis, the sample data are normalized and characteristic bands are selected for modeling. In a specific implementation, building the qualitative model of germinated sunflower seeds from the characteristic bands reduces data redundancy and improves modeling accuracy. The ELM algorithm learns quickly and generalizes well, and the established model has high identification accuracy. Using the empirical formula for the number of hidden nodes in a BP neural network allows the number of hidden nodes of the ELM model to be determined more quickly and efficiently. The technical scheme of the invention requires no complex sample preparation and is highly accurate, so it has good practical value and can be extended to the rapid, nondestructive quality detection of other foods.
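The hidden-node count can be checked numerically. The sketch below assumes the widely used BP-network rule of thumb m = sqrt(n + l) + a with a an integer from 1 to 10, and assumes the result is rounded to the nearest integer; neither the exact formula nor the rounding is confirmed by the text as extracted. It asks which value of a reproduces the 13 hidden nodes reported for n = 52 and l = 2.

```python
import math


def hidden_nodes(n_input, n_output, a):
    """Rule-of-thumb hidden-node count m = sqrt(n + l) + a, rounded (assumed)."""
    if not 1 <= a <= 10:
        raise ValueError("a must be an integer between 1 and 10")
    return round(math.sqrt(n_input + n_output) + a)


# Node counts for every admissible a, with 52 inputs and 2 outputs.
candidates = {a: hidden_nodes(52, 2, a) for a in range(1, 11)}
matching = [a for a, m in candidates.items() if m == 13]
print(candidates)
print("values of a giving 13 hidden nodes:", matching)
```

Under these assumptions only a = 6 gives 13 hidden nodes, since sqrt(54) is about 7.35.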
Drawings
FIG. 1 is a schematic structural diagram of the ELM network model of germinated sunflower seeds according to the present invention;
wherein n is the number of input-layer nodes, m is the number of hidden-layer nodes, and l is the number of output-layer nodes; ω is the weight vector between the input layer and the hidden layer; β is the weight vector between the hidden layer and the output layer; (x_i, t_i) is the i-th training sample, where x_i is the model input data and t_i is the model output value.
FIG. 2 shows the sunflower seed samples in different states in the example;
wherein, (a) is normal sunflower seeds; (b) is germinated sunflower seed.
FIG. 3 is an ATR refractive index profile of a sunflower seed sample from an example;
wherein the solid line is normal grains and the dotted line is germinated grains.
FIG. 4 is the refractive index data of the sunflower seed sample after normalized preprocessing in the example;
wherein the solid line represents a normal sunflower seed refractive index normalized spectrogram, and the dotted line represents a germinated sunflower seed refractive index normalized spectrogram;
FIG. 5 is the ELM qualitative model prediction results of the germinated sunflower seed sample based on the characteristic band refractive index of the example;
wherein "○" represents the actual class of a test set sample and "+" represents the predicted class of the sample.
Detailed Description
The invention will be further described by way of examples, without in any way limiting the scope of the invention, with reference to the accompanying drawings.
The invention provides a rapid terahertz-spectrum identification and detection method for germinated sunflower seeds that requires no grinding or tableting. The method is based on terahertz time-domain spectroscopy combined with a chemometric preprocessing method and a modeling algorithm. It identifies germinated sunflower seeds rapidly and accurately, with a high identification rate and without complex sample preparation.
Example: Terahertz attenuated total reflection rapid detection method for germinated sunflower seeds
1. Experimental Material
To eliminate interference from other factors, 500 g of sunflower seeds were purchased in a single batch for the experiment. After the few broken seeds were removed by hand, the seeds were stored in a dry, ventilated room-temperature environment.
2. Sample preparation
Sixty undamaged sunflower seeds of uniform size and colour were selected and purged clean. Thirty seeds were kept as normal sunflower seed samples, and the remaining 30 were used for germination culture. A preliminary investigation found that oil crops germinate most readily in a hot, humid, semi-closed ventilated environment. To promote rapid germination, the 30 seeds to be cultured were therefore soaked for half an hour in warm water at 30-40 ℃, then taken out and placed in a culture dish lined with clean, moistened absorbent cotton, and the seeds were covered with a layer of moistened gauze. The culture dish lid was left in a semi-closed, ventilated position, and the dish was placed in a biochemical incubator at 26 ℃. After 1-2 days all the seeds had germinated: the kernel broke through the seed coat, the tip of the shell turned whitish, and slight cracks appeared. The germinated seeds were air-dried for 3-4 days in a dry, room-temperature environment and, once completely dry, were used as the germinated sunflower seed samples. FIG. 2 shows the sunflower seed samples in the different states.
3. Spectrum acquisition
The spectra were collected in attenuated total reflection (ATR) mode. ATR spectra of the sunflower seeds were collected in the experiment, starting with the normal samples that had not undergone germination culture. For each acquisition, a reference signal was first collected with no sample on the ATR crystal; the sunflower seed was then placed at the ATR collection position and the pressure screw was tightened to ensure good contact between the sample and the ATR crystal. The ATR spectra of all normal sunflower seeds were collected one by one in this way. To improve experimental accuracy, the ATR acquisition parameters were set as follows: a resolution of 0.94 cm⁻¹ and an average of 450 fast scans per acquisition. After the normal samples had been measured, ATR spectra of the 30 germinated samples were collected with exactly the same method and parameters.
To keep the instrument system stable, the ambient temperature was held at 22 ℃ throughout the experiment. After each sample was measured, the ATR stage was wiped with an alcohol cotton ball to prevent residual oil from affecting the next sample. In addition, to prevent the kernels from oxidizing once the husks were removed, the experimental procedure had to be carried out as quickly and accurately as possible.
4. Spectral data processing
The experiment ultimately yields the terahertz time-domain waveform of each sample. The signal is converted to a frequency-domain spectrum by Fast Fourier Transform (FFT), and the optical constants required for later analysis, such as the refractive index, are then calculated. The refractive index spectra of the experimental samples are shown in Fig. 3, where the solid line is normal kernels and the dotted line is germinated kernels. When obtaining the frequency-domain spectrum, the signal must be apodized (windowed) to reduce the error caused by truncation of the time-domain signal. Many apodization functions exist; this study uses the Happ-Genzel function, which balances signal-to-noise ratio and resolution.
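The apodization and FFT step can be sketched in Python with NumPy as follows. This is a minimal illustration, not the instrument software: the function names, the sampling interval `dt_ps`, and the choice to centre the window on the middle of the record are assumptions made here. The last function only forms the complex sample-to-reference ratio; converting that ratio into a refractive index depends on the ATR prism geometry, which the text does not specify, so it is left out.

```python
import numpy as np

def happ_genzel(n_points: int) -> np.ndarray:
    """Happ-Genzel window, 0.54 + 0.46*cos(pi*x) for x in [-1, 1].

    Assumption: the window is centred on the middle of the record.
    """
    x = np.linspace(-1.0, 1.0, n_points)
    return 0.54 + 0.46 * np.cos(np.pi * x)

def to_frequency_domain(signal, dt_ps):
    """Apodize a time-domain trace and return (frequencies in THz, complex spectrum)."""
    signal = np.asarray(signal, dtype=float)
    windowed = signal * happ_genzel(signal.size)
    spectrum = np.fft.rfft(windowed)
    # With the sampling interval in picoseconds, cycles per ps equal THz.
    freqs_thz = np.fft.rfftfreq(signal.size, d=dt_ps)
    return freqs_thz, spectrum

def sample_to_reference_ratio(sample, reference, dt_ps):
    """Amplitude ratio and phase difference between sample and reference traces."""
    freqs_thz, s = to_frequency_domain(sample, dt_ps)
    _, r = to_frequency_domain(reference, dt_ps)
    ratio = s / r
    return freqs_thz, np.abs(ratio), np.unwrap(np.angle(ratio))
```

The amplitude ratio and phase difference returned here are the quantities from which optical constants are derived once the measurement geometry is known.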
The working range of the ATR module is 10-120 cm⁻¹ (0.3-3.6 THz) of the electromagnetic spectrum. Inspection of the experimental data shows that in the 0-85 cm⁻¹ range the refractive index spectra of all samples first fall, then rise and finally level off, but the curves of the two classes overlap heavily and are difficult to distinguish; beyond 85 cm⁻¹ the sample data are affected by noise and are treated as invalid. After all samples of each category were further normalized, the difference between the normal and germinated samples was found to be significant within a certain band. Fig. 4 shows the normalized refractive index spectra in the 10-85 cm⁻¹ range: the solid line is the normalized spectrum of normal sunflower seeds and the dotted line is that of germinated sunflower seeds. The difference between the two curves is very obvious, which makes later model building possible.
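A short Python sketch of the two numerical steps in this paragraph, the wavenumber-to-frequency conversion and the normalization, is given below. The function names are illustrative. The text does not say whether normalization is applied per spectrum or per wavenumber point, so scaling each spectrum to [0, 1] independently is an assumption.

```python
import numpy as np

C_CM_PER_S = 2.99792458e10  # speed of light in cm/s

def wavenumber_to_thz(wavenumber_cm):
    """Convert wavenumber in cm^-1 to frequency in THz (f = c * wavenumber)."""
    return np.asarray(wavenumber_cm, dtype=float) * C_CM_PER_S / 1e12

def minmax_normalize(spectra):
    """Scale each spectrum (one per row) to the range [0, 1].

    Assumption: min-max scaling is done per spectrum.
    A flat spectrum is mapped to zeros to avoid division by zero.
    """
    spectra = np.atleast_2d(np.asarray(spectra, dtype=float))
    lo = spectra.min(axis=1, keepdims=True)
    hi = spectra.max(axis=1, keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)
    return (spectra - lo) / span
```

With this conversion, 10 cm⁻¹ and 120 cm⁻¹ come out at roughly 0.3 THz and 3.6 THz, matching the working range quoted above.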
5. Characteristic band selection
After normalization preprocessing, the refractive index data of germinated and normal sunflower seeds differ greatly: the normal seeds show an overall trend of falling first, then rising and finally falling slowly, whereas the germinated seeds show an overall trend of rising first and then falling. In addition, both the normal and the germinated sunflower seeds have characteristic peaks in the 10-40 cm⁻¹ and 60-80 cm⁻¹ ranges, so the data in these two characteristic bands (52 characteristic wave number points in total) are selected for modeling.
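Selecting the two characteristic bands amounts to masking the wavenumber axis. The sketch below assumes the spectra are stored as a 2-D array with one row per sample and share a common wavenumber axis; the names `BANDS_CM` and `select_bands` are illustrative, and the number of points actually returned depends on the instrument's wavenumber grid.

```python
import numpy as np

# Characteristic bands from the text, in cm^-1 (inclusive bounds assumed).
BANDS_CM = ((10.0, 40.0), (60.0, 80.0))

def select_bands(wavenumbers, spectra, bands=BANDS_CM):
    """Keep only the columns whose wavenumber falls inside one of the bands."""
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    spectra = np.atleast_2d(np.asarray(spectra, dtype=float))
    mask = np.zeros(wavenumbers.shape, dtype=bool)
    for low, high in bands:
        mask |= (wavenumbers >= low) & (wavenumbers <= high)
    return wavenumbers[mask], spectra[:, mask]
```

The returned feature matrix has one column per retained wavenumber point and can be passed directly to a classifier.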
6. Model building
An ELM qualitative model of germinated sunflower seeds is established. The samples are divided into a training set and a test set according to a proportion, and a germinated sunflower seed ELM model based on the characteristic-band refractive index is established. Fig. 1 is a schematic of the ELM network model of germinated sunflower seeds according to the invention. The ELM network structure is as follows: the input layer has 52 nodes (determined by the number of wave number points in the characteristic bands), the output layer has 2 nodes, the hidden layer has 13 nodes according to the BP neural network empirical formula, and the sigmoid (sig) function is used as the activation function. The prediction results of the model on the test set are shown in Fig. 5, where "○" marks the actual category of each test sample and "+" marks its predicted category. Both types of test samples are classified into their correct categories, and the model prediction accuracy is 100%.
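For readers unfamiliar with extreme learning machines, a minimal NumPy sketch of a single-hidden-layer ELM classifier follows. It is a generic textbook formulation, not the patent's code: the class name `ELMClassifier`, the uniform initialization range, and the seed handling are assumptions. The hidden weights are drawn at random and never trained; only the output weights are solved, via the Moore-Penrose pseudo-inverse.

```python
import numpy as np

class ELMClassifier:
    """Single-hidden-layer extreme learning machine with sigmoid activation."""

    def __init__(self, n_hidden=13, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid of the random affine projection.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # One-hot targets: one output node per class.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        # Random input weights and biases (range [-1, 1] is an assumption).
        self.W = self.rng.uniform(-1.0, 1.0, size=(X.shape[1], self.n_hidden))
        self.b = self.rng.uniform(-1.0, 1.0, size=self.n_hidden)
        # Output weights by least squares through the pseudo-inverse.
        self.beta = np.linalg.pinv(self._hidden(X)) @ T
        return self

    def predict(self, X):
        H = self._hidden(np.asarray(X, dtype=float))
        return self.classes_[np.argmax(H @ self.beta, axis=1)]
```

With a feature matrix of 52 columns and labels for the two classes, `ELMClassifier(n_hidden=13).fit(X_train, y_train).predict(X_test)` mirrors the 52-13-2 structure described above (`X_train`, `y_train`, and `X_test` are placeholder names). The hidden-layer size of 13 is taken from the text. If the empirical rule referred to is the commonly used m = sqrt(n + l) + a with a between 1 and 10, then sqrt(52 + 2) ≈ 7.35, and a = 6 gives about 13.35, which is consistent with 13.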
It is noted that the disclosed embodiments are intended to aid in further understanding of the invention, but those skilled in the art will appreciate that: various substitutions and modifications are possible without departing from the spirit and scope of the invention and appended claims. Therefore, the invention should not be limited to the embodiments disclosed, but the scope of the invention is defined by the appended claims.
Claims (7)
1. A terahertz spectrum rapid identification method for germinated sunflower seeds, characterized in that the germinated sunflower seeds are detected by a terahertz attenuated total reflection technology without grinding and tableting the sunflower seed sample, an Extreme Learning Machine (ELM) qualitative model based on the refractive indices of normal sunflower seeds and germinated sunflower seeds is established by combining a spectrum preprocessing method and a machine learning algorithm, and rapid, accurate and pollution-free detection of sunflower seed quality is realized; the method comprises the following steps:
1) obtaining sunflower seed samples to be identified from the same batch of sunflower seeds;
2) purging a sunflower seed sample to prepare a sample to be detected;
3) collecting terahertz attenuated total reflection spectral data of a sample to be detected by using a terahertz time-domain spectrometer in a dry room temperature environment;
4) processing the terahertz spectrum data, and analyzing and extracting effective information from the time-domain spectrum data, including conversion between the signal time domain and frequency domain, signal denoising, optical constant extraction and signal preprocessing, to obtain optical constant data; the method comprises the following steps:
41) performing fast Fourier transform on a time domain signal of time domain spectral data acquired by a sample, and converting the time domain signal into a frequency domain;
42) selecting an apodization function to carry out windowing operation on the frequency domain signal, and eliminating a noise signal caused by signal truncation appearing in time-frequency domain transformation;
43) then obtaining an optical constant according to the amplitude and the phase of the frequency domain signal; the optical constants include absorbance, absorption coefficient and/or refractive index;
44) optimizing the spectral data by analyzing the spectral data and adopting a normalization preprocessing method;
5) analyzing the obtained optical constants, and selecting the refractive index data in the characteristic wave bands of 10-40 cm⁻¹ and 60-80 cm⁻¹ as input data of the germinated sunflower seed extreme learning machine (ELM) qualitative identification model based on the terahertz attenuated total reflection technology;
6) establishing the germinated sunflower seed ELM qualitative identification model based on the terahertz attenuated total reflection technology by using the selected characteristic wave bands; the method comprises the following steps:
61) firstly, establishing the germinated sunflower seed ELM qualitative identification model;
preprocessing the acquired time-domain spectral information of normal sunflower seeds and germinated sunflower seeds, selecting the refractive index of the characteristic wave bands as sample data, using 52 characteristic wave number points, setting category label values for the germinated sunflower seeds and the normal sunflower seeds respectively, and then randomly dividing a training set and a test set according to a proportion; the normalization range of the normalization preprocessing of the sample refractive index data is [0, 1];
the ELM model structure comprises an input layer, a hidden layer and an output layer; taking the training set data as input of the ELM model, determining the number of input layer nodes according to the number of features of the input data, taking the category label values of the training set as model output, determining the number of output layer nodes according to the number of categories of the category label values, and establishing the germinated sunflower seed ELM qualitative identification model;
62) the number of hidden layer nodes in the germinated sunflower seed ELM qualitative identification model is determined according to an empirical formula for the number of hidden layer nodes of a back propagation (BP) neural network; specifically, the number of hidden layer nodes is determined to be 13;
7) inputting the test set data into the germinated sunflower seed ELM qualitative identification model based on the terahertz attenuated total reflection technology established in step 6), outputting the label values of the test set samples, and comparing them with the actual label values of the samples, thereby realizing rapid and nondestructive sunflower seed quality detection.
2. The method for rapidly identifying the germinated sunflower seeds by the terahertz spectrum as claimed in claim 1, wherein the step 2) comprises the following steps: purging sunflower seeds, and airing the sunflower seeds in a dry and ventilated room-temperature environment; then reserving a plurality of normal sunflower seeds, using the rest sunflower seeds as germination cultivation objects, and performing germination cultivation in a warm, humid and dark semi-closed environment.
3. The method for rapidly identifying the terahertz spectrum of the germinated sunflower seed as claimed in claim 1, wherein in the step 3), the shell of the sunflower seed is stripped, and then the terahertz time-domain spectrum information of the sunflower seed sample is directly collected by using an attenuated total reflection technology.
4. The method for rapidly identifying the terahertz spectrum of the germinated sunflower seeds as claimed in claim 1, wherein in the step 4), the apodization function is a Happ-Genzel function, which balances signal-to-noise ratio and resolution.
5. The method for rapidly identifying the germinated sunflower seeds by the terahertz spectrum as claimed in claim 1, wherein in the step 62), the empirical formula for the number m of hidden layer nodes of the back propagation BP neural network is represented as follows:
$$m = \sqrt{n + l} + a$$
wherein n is the number of input layer nodes, l is the number of output layer nodes, and a is an integer from 1 to 10.
6. The method for rapidly identifying the germinated sunflower seeds by the terahertz spectrum according to claim 5, wherein in the step 62), the number of the characteristic wave number points contained in the characteristic wave band determines that the number of the input layer nodes of the ELM network is 52; and determining that the number of output layer nodes is 2 for the germinated sunflower seeds and the normal sunflower seeds according to the sample label value.
7. The method for rapidly identifying the germinated sunflower seeds through the terahertz spectrum as claimed in claim 5, wherein a sig function is adopted as an excitation function of the ELM network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110125211.8A CN112924410B (en) | 2021-01-29 | 2021-01-29 | Terahertz spectrum rapid identification method for germinated sunflower seeds |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110125211.8A CN112924410B (en) | 2021-01-29 | 2021-01-29 | Terahertz spectrum rapid identification method for germinated sunflower seeds |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112924410A (en) | 2021-06-08 |
CN112924410B (en) | 2022-06-17 |
Family
ID=76168545
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110125211.8A Active CN112924410B (en) | 2021-01-29 | 2021-01-29 | Terahertz spectrum rapid identification method for germinated sunflower seeds |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112924410B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114931115A (en) * | 2022-05-27 | 2022-08-23 | 浙江大学 | Poultry hatching egg gender rapid identification method based on flexible metamaterial |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE102010010285B4 (en) * | 2010-03-04 | 2012-03-22 | Technische Universität Carolo-Wilhelmina Zu Braunschweig | Sample analysis using terahertz spectroscopy |
CN106841083A (en) * | 2016-11-02 | 2017-06-13 | 北京工商大学 | Sesame oil quality detecting method based on near-infrared spectrum technique |
CN110108649A (en) * | 2019-05-06 | 2019-08-09 | 北京工商大学 | The fast non-destructive detection method of oil crops quality based on terahertz light spectral technology |
Also Published As
Publication number | Publication date |
---|---|
CN112924410A (en) | 2021-06-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
McGlone et al. | Vis/NIR estimation at harvest of pre-and post-storage quality indices for ‘Royal Gala’apple | |
Jha et al. | Non-destructive prediction of sweetness of intact mango using near infrared spectroscopy | |
Zude et al. | Non-destructive tests on the prediction of apple fruit flesh firmness and soluble solids content on tree and in shelf life | |
Li et al. | Discriminating varieties of tea plant based on Vis/NIR spectral characteristics and using artificial neural networks | |
CN105158186B (en) | A kind of method detected based on high spectrum image to ternip evil mind | |
CN111157511B (en) | Egg freshness nondestructive testing method based on Raman spectrum technology | |
CN110779875B (en) | Method for detecting moisture content of winter wheat ear based on hyperspectral technology | |
CN112129709A (en) | Apple tree canopy scale nitrogen content diagnosis method | |
CN112924410B (en) | Terahertz spectrum rapid identification method for germinated sunflower seeds | |
CN107202784B (en) | Method for detecting process nodes in rice seed soaking and germination accelerating process | |
CN116385784A (en) | Method and system for measuring and calculating chlorophyll content of rice under cadmium stress | |
CN104255118A (en) | Rapid lossless testing method based on near infrared spectroscopy technology for paddy rice seed germination percentage | |
CN108732137A (en) | The model and method of Species Diversity in Plant are estimated based on high-spectrum remote sensing data | |
Yang | Nondestructive prediction of optimal harvest time of cherry tomatoes using VIS-NIR spectroscopy and PLSR calibration | |
Qi et al. | Rapid and non-destructive determination of soluble solid content of crown pear by visible/near-infrared spectroscopy with deep learning regression | |
CN109142238B (en) | Cotton phosphorus nutrition rapid diagnosis method | |
Kumar et al. | Reflectance based non-destructive determination of colour and ripeness of tomato fruits | |
Huang et al. | Fusion models for detection of soluble solids content in mandarin by Vis/NIR transmission spectroscopy combined external factors | |
CN110231302A (en) | A kind of method of the odd sub- seed crude fat content of quick measurement | |
Yuan et al. | In-field and non-destructive determination of comprehensive maturity index and maturity stages of Camellia oleifera fruits using a portable hyperspectral imager | |
CN113267458A (en) | Method for establishing quantitative prediction model of soluble protein content of sweet potatoes | |
XIA et al. | Application of wavelet transform in the prediction of navel orange vitamin C content by near-infrared spectroscopy | |
Wang et al. | Monitoring model for predicting maize grain moisture at the filling stage using NIRS and a small sample size | |
CN113049526B (en) | Corn seed moisture content determination method based on terahertz attenuated total reflection | |
Yuan et al. | Application of hyperspectral imaging to discriminate waxy corn seed vigour after aging. |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |