CN113324940A - Spectrum grading method for super-high-quality milk, high-protein special milk, high-milk-fat special milk and common milk - Google Patents


Info

Publication number
CN113324940A
Authority
CN
China
Prior art keywords
milk
model
mid
infrared spectrum
special
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110503811.3A
Other languages
Chinese (zh)
Inventor
张淑君
肖仕杰
王巧华
李春芳
王海童
马亚宾
倪俊卿
张依
罗雪路
樊懿楷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong Agricultural University
Original Assignee
Huazhong Agricultural University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Huazhong Agricultural University
Priority to CN202110503811.3A
Publication of CN113324940A

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/17 Systems in which incident light is modified in accordance with the properties of the material investigated
    • G01N21/25 Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
    • G01N21/31 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry
    • G01N21/35 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light
    • G01N21/3577 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light for analysing liquids, e.g. polluted water
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/24155 Bayesian classification
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/17 Systems in which incident light is modified in accordance with the properties of the material investigated
    • G01N21/25 Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
    • G01N21/31 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry
    • G01N21/35 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light
    • G01N2021/3595 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light using FTIR
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/02 Preprocessing
    • G06F2218/04 Denoising
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/08 Feature extraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/12 Classification; Matching

Abstract

The invention belongs to the technical field of milk product analysis, and particularly relates to a spectral grading method for super-high-quality milk, high-protein special milk, high-milk-fat special milk and common milk, in the field of mid-infrared spectroscopy. The invention is characterized in that the preferred modeling bands are 925-1597 cm⁻¹ and 1712-3024 cm⁻¹. The invention comprises the following steps: (1) taking a milk sample to be detected as a detection sample; (2) collecting mid-infrared spectral data; (3) selecting spectral bands; (4) preprocessing the original mid-infrared spectral data; (5) extracting the characteristic wavelengths of the mid-infrared spectrum; (6) predicting the samples in the test set using the model; (7) comparing and evaluating the models; (8) selecting the optimal grading model through comparative analysis. The invention builds the model from a combination of characteristic variables, which simplifies the model and improves its precision and detection speed.

Description

Spectrum grading method for super-high-quality milk, high-protein special milk, high-milk-fat special milk and common milk
Technical Field
The invention belongs to the technical field of milk product detection, and particularly relates to a spectral grading method for super-high-quality milk, high-protein special milk, high-milk-fat special milk and common milk. The invention relates to the field of mid-infrared spectroscopy.
Background
Special milks such as high-protein milk and high-milk-fat milk are common on the market. Furthermore, the somatic cell count of milk has a major influence on its milk protein and milk fat content [1]. The somatic cell count is therefore also an important index for evaluating milk quality. If milk quality could be graded accurately in batches, the production efficiency and economic benefit of dairy enterprises would be greatly improved.
Traditional chemical analysis of milk protein, milk fat and somatic cell count is time-consuming and pollutes the environment. If instruments are used instead, a milk component analyzer and a somatic cell counter must measure milk protein, milk fat and somatic cell count separately, and the acridine orange used in somatic cell counting is listed as a Group 3 carcinogen by the International Agency for Research on Cancer of the World Health Organization. Mid-infrared spectroscopy is rapid, non-destructive and simple to operate. In China, mid-infrared spectroscopy is mainly used in research on adulteration [2-4]. Abroad, mid-infrared spectroscopy is widely used to predict the content of milk components (such as protein fractions and fatty acids) [5-7]. However, direct rapid grading of the quality of milk with different milk fat content, milk protein content and somatic cell count has not been investigated.
In addition, these studies have problems such as a large amount of redundant information of the spectrum and unclear characteristic wavelength of the spectrum.
Disclosure of Invention
The invention aims to overcome the shortcomings of the traditional methods and of the prior art. Based on mid-infrared spectral analysis, the invention identifies the characteristic mid-infrared wavelengths of milk quality and provides a spectral grading method for identifying super-high-quality milk, high-protein special milk, high-milk-fat special milk and common milk.
The invention is realized by the following technical scheme:
The method selects the characteristic wavelengths of milk quality. Because the number of characteristic wavelengths is small, a mid-infrared milk quality model built on them predicts faster at the same prediction accuracy.
A spectral grading method for four kinds of milk samples, the four kinds being commercial super-high-quality milk, high-protein special milk, high-milk-fat special milk and common milk, comprises the following steps:
(1) acquiring the 4 types of milk samples to be detected, namely super-high-quality milk, high-protein special milk, high-milk-fat special milk and common milk;
(2) collecting mid-infrared spectrum data: using a milk component analyzer (a Fourier transform mid-infrared spectrometer), scanning all milk samples over the wavenumber range 925-4000 cm⁻¹, and outputting the sample transmittance through a computer connected to the instrument;
(3) selecting a spectral band: removing wave bands with more noise and less effective information;
(4) dividing original mid-infrared spectrum data into a training set and a testing set, and preprocessing;
(5) extracting the characteristic wavelength of the mid-infrared spectrum;
(6) establishing a model: a Naive Bayes (NB) model is adopted to construct a grading model on a training set, and the established model is utilized to predict samples in a testing set;
(7) screening and model determination: comparing and evaluating the model according to the training set accuracy and the test set accuracy;
(8) selecting the optimal grading model through comparative analysis.
The milk samples of the invention come from 10 different pastures in Hebei Province.
The Fourier transform mid-infrared spectrometer is a MilkoScan™ 7 RM milk component analyzer from FOSS.
When the mid-infrared spectrum is collected, the sample is poured into a cylindrical sample tube 3.5 cm in diameter and 9 cm high and heated in a 42 °C water bath for 15-20 min, after which a solid optical fiber probe is inserted into the liquid.
The spectral bands are selected as follows: the noisy 1597-1712 cm⁻¹ and 3024-3680 cm⁻¹ wavenumber ranges are removed, the 3680-4000 cm⁻¹ wavenumber range, which carries little effective information and contributes little to modeling, is also removed, and the bands finally selected for modeling are 925-1597 cm⁻¹ and 1712-3024 cm⁻¹.
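The band selection described above can be sketched as a simple wavenumber mask. This is a minimal illustration, not the inventors' code: the function name `select_bands`, the 25 cm⁻¹ grid and the dummy transmittance values are hypothetical, and treating the band limits as inclusive is an assumption.

```python
# Wavenumber windows (cm^-1) retained for modeling, as stated in the text.
KEEP_BANDS = [(925.0, 1597.0), (1712.0, 3024.0)]

def select_bands(wavenumbers, spectrum, bands=KEEP_BANDS):
    """Return (wavenumbers, values) restricted to the retained bands.

    Band limits are treated as inclusive (an assumption).
    """
    kept_w, kept_v = [], []
    for w, v in zip(wavenumbers, spectrum):
        if any(lo <= w <= hi for lo, hi in bands):
            kept_w.append(w)
            kept_v.append(v)
    return kept_w, kept_v

# Hypothetical acquisition grid: 925 to 4000 cm^-1 in 25 cm^-1 steps,
# with dummy transmittance values.
grid = [925.0 + 25.0 * i for i in range(124)]
values = [1.0] * len(grid)
w, v = select_bands(grid, values)
```

On this hypothetical grid, points such as 1650 cm⁻¹ (in the noisy 1597-1712 cm⁻¹ range) and everything above 3024 cm⁻¹ are dropped before preprocessing.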
The samples are randomly divided into a training set and a test set at a ratio of 7:3, and 6 preprocessing methods are applied to the data: standard normal variate (SNV), multiplicative scatter correction (MSC), first derivative, second derivative, first-order difference and second-order difference.
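A random 7:3 split of this kind might look like the following sketch. The function name `split_train_test`, the fixed seed and the rounding rule are assumptions made for illustration; the text does not state how the random division was implemented. The sample count 5121 is the one reported in Example 1.

```python
import random

def split_train_test(n_samples, train_fraction=0.7, seed=0):
    """Randomly partition sample indices into training and test sets."""
    indices = list(range(n_samples))
    rng = random.Random(seed)          # fixed seed only for reproducibility
    rng.shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return sorted(indices[:n_train]), sorted(indices[n_train:])

# 5121 samples, as in Example 1.
train_idx, test_idx = split_train_test(5121)
```

Under this rounding rule the 5121 samples divide into 3585 training and 1536 test samples; a different rounding convention would shift one sample between the sets.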
The characteristic wavelengths of the mid-infrared spectrum are extracted by uninformative variable elimination (UVE), competitive adaptive reweighted sampling (CARS) and stability competitive adaptive reweighted sampling (SCARS).
The characteristic wavelengths extracted by the three algorithms are respectively as follows:
(1) UVE (229 wavelengths, in cm⁻¹): 3.858, 7.716, 11.574, 23.148, 46.296, 50.154, 54.012, 65.586, 69.444, 73.302, 77.160, 88.734, 92.592, 158.178, 162.036, 177.468, 181.326, 185.184, 189.042, 200.616, 204.474, 208.332, 212.190, 216.048, 223.764, 227.622, 231.48, 239.19, 243.054, 246.912, 250.770, 258.486, 262.344, 270.060, 273.918, 277.776, 281.634, 285.492, 289.350, 293.208, 297.066, 300.924, 312.498, 316.356, 320.214, 324.072, 327.930, 331.788, 362.652, 366.510, 370.368, 381.942, 385.80, 389.658, 412.806, 416.664, 420.522, 432.096, 435.954, 439.812, 443.670, 447.528, 451.386, 455.244, 459.102, 462.960, 474.534, 478.392, 482.250, 486.108, 497.682, 501.540, 520.830, 524.688, 528.546, 532.404, 536.262, 543.978, 547.836, 551.694, 555.552, 559.410, 563.268, 567.126, 570.984, 574.842, 578.700, 586.416, 590.274, 594.132, 597.990, 601.848, 667.434, 794.748, 806.322, 814.038, 817.896, 821.754, 829.470, 833.328, 837.186, 841.044, 844.902, 848.760, 852.618, 856.476, 860.334, 875.766, 879.624, 883.482, 887.340, 922.062, 941.352, 960.642, 991.506, 995.364, 999.222, 1003.080, 1033.944, 1037.802, 1041.660, 1045.518, 1049.376, 1064.808, 1068.666, 1072.524, 1076.382, 1103.388, 1107.246, 1111.104, 1126.536, 1130.394, 1180.548, 1184.406, 1188.264, 1230.702, 1234.560, 1238.418, 1242.276, 1257.708, 1276.998, 1280.856, 1284.714, 1304.004, 1307.862, 1311.720, 1315.578, 1319.436, 1354.158, 1358.016, 1485.330, 1489.188, 1493.046, 1496.904, 1512.336, 1516.194, 1520.052, 1523.910, 1527.768, 1531.626, 1535.484, 1539.342, 1543.200, 1558.632, 1562.490, 1566.348, 1570.206, 1574.064, 1577.922, 1581.780, 1597.212, 1601.070, 1604.928, 1608.786, 1612.644, 1616.502, 1620.360, 1624.218, 1635.792, 1639.650, 1643.508, 1647.366, 1651.224, 1655.082, 1658.940, 1662.798, 1678.230, 1682.088, 1685.946, 1689.804, 1693.662, 1697.5200, 1701.378, 1716.810, 1720.668, 1724.526, 1728.384, 1732.242, 1736.100, 1739.958, 1751.532, 1755.390, 1759.248, 1763.106, 1766.964, 1770.822, 1774.680, 1778.538, 1782.396, 1793.970, 1797.828, 1801.686, 1805.544, 1840.266, 1844.124, 1847.982, 1851.840, 1855.698, 1890.420, 1894.278, 1898.136, 1929, 1932.858, 1948.290, 1952.148, 1956.006, 1959.864, 2025.450, 2029.308
(2) CARS (37 wavelengths, in cm⁻¹): 73.302, 177.468, 204.474, 208.332, 212.190, 243.054, 246.912, 270.060, 273.918, 300.924, 335.646, 432.096, 435.954, 478.392, 482.250, 486.108, 532.404, 574.842, 586.416, 590.274, 594.132, 605.706, 609.564, 640.428, 667.434, 794.748, 798.606, 806.322, 814.038, 837.186, 841.044, 844.902, 848.760, 852.618, 856.476, 860.334, 2064.030
(3) SCARS (20 wavelengths, in cm⁻¹): 73.302, 142.746, 208.332, 243.054, 270.060, 273.918, 327.930, 331.788, 335.646, 412.806, 416.664, 435.954, 478.392, 482.250, 486.108, 532.404, 594.132, 605.706, 636.570, 729.162
The mid-infrared spectral data were preprocessed, and the models were constructed and validated, in the commercial mathematical software MATLAB R2016b (MathWorks, USA).
The invention has the advantages that:
(1) The optimal modeling bands are selected. As can be seen from FIG. 2 and Example 1, the preferred modeling bands are the two bands 925-1597 cm⁻¹ and 1712-3024 cm⁻¹.
(2) Instrument cost is saved and harm to the human body is avoided. When the model is applied, the predicted category is output simply by feeding the mid-infrared spectral data of the milk, obtained with the milk component analyzer, into the model. The conventional instrumental method requires two instruments, a milk component analyzer and a somatic cell counter, and uses acridine orange, which is listed as a Group 3 carcinogen by the International Agency for Research on Cancer of the World Health Organization.
(3) The characteristic wavelengths of super-high-quality milk, high-protein special milk, high-milk-fat special milk and common milk are determined, and the model is built on these characteristic wavelengths, which simplifies the model and improves its precision.
(4) Batch detection of super-high-quality milk, high-protein special milk, high-milk-fat special milk and common milk is realized, and detecting each milk sample takes only 2-3 milliseconds. The method offers high identification speed, high precision, low cost, simple operation and strong practicability.
Drawings
FIG. 1: Average spectra of the 4 milk types in Example 1 (super-high-quality milk, high-protein special milk, high-milk-fat special milk and common milk).
FIG. 2: Full spectrum formed by combining the modeling bands 925-1597 cm⁻¹ and 1712-3024 cm⁻¹ selected in Example 1.
FIG. 3: Full spectrum after second-order difference preprocessing in Example 1.
FIG. 4: UVE characteristic wavelength screening plot in Example 1.
Note on FIG. 4: When UVE extracts the characteristic wavelengths, 90% of the maximum absolute stability value of the noise matrix is set as the rejection threshold. The left curve in the figure represents the spectral variables of the milk, and the right curve represents the added random noise variables, equal in number to the milk samples. The two horizontal dotted lines are the variable selection thresholds: variables between the dotted lines carry useless information and are eliminated, while variables outside the dotted lines carry useful information and are used for subsequent modeling. UVE selects 229 characteristic wavelengths.
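The thresholding step in this note can be illustrated with a short sketch. It covers only the cutoff rule (keep a spectral variable when its absolute stability exceeds 90% of the largest absolute stability among the noise variables). It assumes the stability values have already been computed, for example as the mean of each variable's PLS regression coefficient divided by its standard deviation over cross-validation runs, which is the usual UVE formulation but is not spelled out in the text. The function name and the toy numbers are hypothetical.

```python
def uve_select(real_stability, noise_stability, fraction=0.9):
    """Return indices of spectral variables kept by the UVE cutoff.

    real_stability:  stability values of the real spectral variables
    noise_stability: stability values of the appended random-noise variables
    fraction:        share of the maximum absolute noise stability used as cutoff
    """
    cutoff = fraction * max(abs(s) for s in noise_stability)
    keep = [i for i, s in enumerate(real_stability) if abs(s) > cutoff]
    return keep, cutoff

# Toy stability values (hypothetical, for illustration only).
real = [5.0, -0.4, 1.2, -3.1, 0.8]
noise = [0.5, -1.0, 0.7]
keep, cutoff = uve_select(real, noise)
```

With these toy values the cutoff is 0.9, so variables 0, 2 and 3 lie outside the two threshold lines and are kept, while variables 1 and 4 fall between them and are discarded.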
FIG. 5: CARS characteristic wavelength screening graph of example 1.
Note on FIG. 5: Because the CARS and SCARS algorithms run similarly, the extraction of characteristic variables is explained using CARS as the example. The number of Monte Carlo sampling runs is set to 100, and 5-fold cross-validation is used. As can be seen from plot A in FIG. 5, the RMSECV value first decreases and then increases as the number of sampling runs increases: while RMSECV decreases, useless information in the spectral data is being removed; when RMSECV increases, useful information in the spectral data is being eliminated. Therefore, the variable subset corresponding to the minimum RMSECV among the PLSR models built over the 100 sampling runs is taken as the optimal result. As can be seen from plot B in FIG. 5, when RMSECV reaches its minimum (sampling run 48), the regression coefficients of the variables are those on the vertical line in plot C of FIG. 5.
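The final selection rule in this note (take the variable subset from the sampling run with the lowest RMSECV) can be sketched as follows. Only that last step is shown; the Monte Carlo sampling, PLSR fitting and reweighting that produce the per-run RMSECV values are assumed to have been done already. The function name and the toy values are hypothetical.

```python
def best_cars_run(rmsecv_per_run, subsets_per_run):
    """Return (1-based run number, variable subset) with the lowest RMSECV."""
    best = min(range(len(rmsecv_per_run)), key=lambda i: rmsecv_per_run[i])
    return best + 1, subsets_per_run[best]

# Toy example with 5 runs: RMSECV first falls, then rises again as
# useful variables start to be discarded.
rmsecv = [0.52, 0.41, 0.33, 0.36, 0.47]
subsets = [[0, 1, 2, 3, 4, 5], [0, 1, 3, 4, 5], [1, 3, 4], [1, 4], [4]]
run, variables = best_cars_run(rmsecv, subsets)
```

In this toy case the third run has the lowest RMSECV, so its three retained variables would be used for modeling.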
Detailed Description
Unless otherwise stated, the technical schemes of the invention are conventional in the field, and the reagents and materials are commercially available.
Example 1: establishment of rapid spectrum grading method for excellent milk, high-protein special milk, high-milk-fat special milk and common milk
(1) Test materials and methods
A total of 5121 milk samples from 10 different pastures in Hebei Province, China, were selected and numbered. Each sample was poured into a cylindrical sample tube 3.5 cm in diameter and 9 cm high and heated in a 42 °C water bath for 15-20 min. Using a MilkoScan™ 7 RM milk component analyzer from FOSS, a solid optical fiber probe was inserted into the liquid and the sample was scanned to obtain the milk protein content, milk fat content and mid-infrared spectrum of the milk sample (see FIG. 1). The somatic cell count of the milk was measured with a Fossomatic™ 7 somatic cell counter from FOSS. The differences among the 4 milk types are shown in Table 1.
Table 1 Differences in milk composition
[Table 1 is an image in the original publication and is not reproduced here.]
(2) Selection of modeling bands
The mid-infrared spectra were collected over 925-4000 cm⁻¹. Spectral information in the 1597-1712 cm⁻¹ and 3024-3680 cm⁻¹ wavenumber ranges fluctuates strongly and contains a large amount of noise; using it for subsequent analysis and modeling would weaken the generalization ability of the model. The 3680-4000 cm⁻¹ wavenumber range carries little effective information and contributes little to model construction; its effect on the model is shown in Example 2. Therefore, the combination of 925-1597 cm⁻¹ and 1712-3024 cm⁻¹ was finally selected as the full spectrum for modeling.
(3) Mid-infrared spectrum pretreatment
The full spectrum was preprocessed with 6 methods in total: standard normal variate (SNV), multiplicative scatter correction (MSC) [8], first derivative, second derivative [9], first-order difference [10] and second-order difference [11]. Both derivatives were computed by Savitzky-Golay convolution; the influence of the number of Savitzky-Golay smoothing points on the accuracy of the derivative models is shown in Example 3, and the effect of second-order difference is shown in Example 4. The full spectrum and the spectrum after second-order difference preprocessing are shown in FIG. 2 and FIG. 3.
As can be seen from FIG. 3, the effective information in the spectral curve is enhanced after second-order difference preprocessing compared with the full-spectrum curve.
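Two of the six preprocessing methods are simple enough to illustrate directly. The sketch below shows SNV (each spectrum centered on its own mean and scaled by its own standard deviation) and the n-th order difference (repeated subtraction of neighboring points). It is an illustration, not the MATLAB code used in the invention: the function names are hypothetical, and the use of the sample standard deviation in SNV is an assumption.

```python
import statistics

def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own statistics."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)    # sample standard deviation (assumption)
    return [(x - mean) / sd for x in spectrum]

def difference(spectrum, order=1):
    """n-th order difference between neighboring points; output is `order` points shorter."""
    out = list(spectrum)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out

# Toy spectrum (hypothetical values): a quadratic trend.
spectrum = [1.0, 4.0, 9.0, 16.0, 25.0]
second_diff = difference(spectrum, order=2)   # [2.0, 2.0, 2.0]
snv_out = snv(spectrum)
```

The toy example shows why differencing sharpens spectral features: the smooth quadratic baseline collapses to a constant after two differences, so what remains in a real spectrum is dominated by local changes in shape.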
(4) Screening of characteristic wavelength of mid-infrared spectrum
The characteristic wavelengths of the mid-infrared spectrum were extracted by uninformative variable elimination (UVE) [12], competitive adaptive reweighted sampling (CARS) and stability competitive adaptive reweighted sampling (SCARS) [13]. The influence of the UVE threshold on model accuracy during UVE extraction is shown in Example 5. The influence of the number of sampling runs on model accuracy during CARS and SCARS extraction is shown in Example 6. The feature extraction processes of UVE and CARS are shown in FIG. 4 and FIG. 5.
(5) Comparison of different models
The characteristic wavelengths of the mid-infrared spectrum extracted by uninformative variable elimination (UVE), competitive adaptive reweighted sampling (CARS) and stability competitive adaptive reweighted sampling (SCARS) were each fed into the NB model, and the accuracies of the resulting models were compared; the results are shown in Table 2A and Table 2B. The finally obtained optimal grading model is a second-order difference SCARS-NB model.
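For readers unfamiliar with the classifier, the following is a minimal sketch of a Naive Bayes model on continuous features. It assumes Gaussian class-conditional densities, which the text does not specify, and it is not the MATLAB implementation used in the invention. The function names, the variance floor and the two-feature toy data are hypothetical; the labels B, C and D follow the grade letters used in Example 7.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y, var_floor=1e-9):
    """Estimate a log-prior, per-feature means and variances for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        variances = [sum((v - m) ** 2 for v in col) / n + var_floor
                     for col, m in zip(columns, means)]
        model[label] = (math.log(n / len(X)), means, variances)
    return model

def predict(model, row):
    """Return the class with the highest log-posterior for one sample."""
    def log_posterior(label):
        log_prior, means, variances = model[label]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
            for x, m, var in zip(row, means, variances))
    return max(model, key=log_posterior)

# Toy training data: two made-up feature values per sample.
X = [[3.0, 3.1], [3.2, 3.0], [3.9, 3.1], [4.1, 3.2], [3.1, 4.4], [3.0, 4.6]]
y = ["D", "D", "B", "B", "C", "C"]
model = fit_gaussian_nb(X, y)
pred = predict(model, [4.0, 3.1])
```

Because the features are treated as independent given the class, prediction cost grows linearly with the number of features, which is consistent with the text's point that fewer characteristic wavelengths give faster prediction.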
Table 2A Comparison of training set accuracies for different models
[Table 2A is an image in the original publication and is not reproduced here.]
Table 2B Comparison of training set accuracies for different models
[Table 2B is an image in the original publication and is not reproduced here.]
Example 2: influence of wavenumber on model accuracy
Mid-infrared spectra of the milk were collected as in Example 1. After the 1597-1712 cm⁻¹ and 3024-3680 cm⁻¹ wavenumber ranges were removed, the influence of the 3680-4000 cm⁻¹ wavenumber range on the model was examined. Table 3 shows that after the 3680-4000 cm⁻¹ range is removed, the training set accuracy and the test set accuracy of the model improve to 84.4996% and 84.2243%, respectively. Therefore, the combination of the 925-1597 cm⁻¹ and 1712-3024 cm⁻¹ bands was finally selected as the full spectrum for modeling.
Table 3 Influence of wavenumber range on model accuracy
[Table 3 is an image in the original publication and is not reproduced here.]
Example 3: effect of derivative processing on model accuracy
Mid-infrared spectra of the milk were collected as in Example 1, and the spectral values were preprocessed with the first derivative and the second derivative respectively. The number of smoothing points was set to 7, 9 and 11, and the preprocessed spectra were fed into the NB model; the results are shown in Table 4. Table 4 shows that the NB model built after 9-point smoothed derivative preprocessing performs best: the training set and test set accuracies are 90.6886% and 88.8527% for the first derivative, and 93.7552% and 92.0469% for the second derivative. Therefore, 9 is selected as the optimal number of smoothing points, and the 9-point smoothed first and second derivatives are used as the optimal derivative preprocessing.
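A 9-point Savitzky-Golay first derivative can be written as a fixed convolution. The sketch below assumes a local polynomial of order 1 or 2, for which the first-derivative weights are i / Σi² with i = -4, ..., 4 (denominator 60); the polynomial order actually used in the invention is not stated, so this is an illustrative assumption, and the function name is hypothetical. Edge points without a full window are simply dropped.

```python
def savgol_first_derivative(y, half_window=4, spacing=1.0):
    """Savitzky-Golay first derivative for a linear or quadratic local fit.

    A window of 2*half_window + 1 points is used (9 points by default).
    Output is shorter than the input by 2*half_window points.
    """
    m = half_window
    norm = sum(i * i for i in range(-m, m + 1)) * spacing   # 60 * spacing for 9 points
    return [sum(i * y[k + i] for i in range(-m, m + 1)) / norm
            for k in range(m, len(y) - m)]

# Check on signals whose derivatives are known exactly.
line = [2.0 * x + 1.0 for x in range(12)]       # slope 2 everywhere
parabola = [float(x * x) for x in range(12)]    # derivative 2x
d_line = savgol_first_derivative(line)
d_parabola = savgol_first_derivative(parabola)
```

Changing `half_window` to 3 or 5 gives the 7-point and 11-point variants compared in this example; a wider window smooths more noise but also flattens narrow spectral features.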
Table 4 Effect of derivative processing on model accuracy
[Table 4 is an image in the original publication and is not reproduced here.]
Example 4: effect of preprocessing on model accuracy
Mid-infrared spectra of the milk were collected as in Example 1. The spectral values were preprocessed with standard normal variate (SNV), multiplicative scatter correction (MSC), first derivative, second derivative, first-order difference and second-order difference respectively, and the preprocessed spectra were fed into the NB model.
Table 5 shows that the NB model built on the unprocessed full-spectrum data reaches training set and test set accuracies of only 84.4996% and 84.2243%. Compared with the unprocessed full spectrum, the NB models built on all of the preprocessed data show markedly higher training set and test set accuracies, which indicates that preprocessing improves the predictive performance of the NB model. The NB model built after second-order difference preprocessing performs best, with training set and test set accuracies of 94.3128% and 92.1121%, respectively. Therefore, second-order difference is selected as the best preprocessing method and used for subsequent modeling analysis.
Table 5 Effect of preprocessing on model accuracy
[Table 5 is an image in the original publication and is not reproduced here.]
Example 5: effect of UVE threshold on model accuracy
Mid-infrared spectra of the milk were collected as in Example 1. After the spectral values were preprocessed by second-order difference, the characteristic wavelengths were extracted by UVE. The UVE threshold was set to 0.8, 0.9 and 0.99, and the extracted characteristic wavelengths were each fed into the NB model. As can be seen from Table 6, the NB model built on the 229 characteristic wavelengths obtained at a threshold of 0.9 performs best, with training set and test set accuracies of 94.1734% and 92.6988%, respectively.
Table 6 Influence of UVE threshold on model accuracy
[Table 6 is an image in the original publication and is not reproduced here.]
Example 6: influence of CARS and SCARS sampling times on model accuracy
Mid-infrared spectra of the milk were collected as in Example 1. After the spectral values were preprocessed by second-order difference, the characteristic wavelengths were extracted by CARS and SCARS, with the number of sampling runs set to 50, 100, 150 and 200 for each algorithm; the other methods are the same as in Example 1. The number of characteristic wavelengths obtained at each number of sampling runs is shown in Table 7.
The extracted characteristic wavelengths were each fed into the NB model. Table 7 shows that the NB models built on the 37 (CARS) and 20 (SCARS) characteristic wavelengths obtained with 100 sampling runs perform best: the training set and test set accuracies are 93.7273% and 92.5033% for the CARS model, and 94.4522% and 93.9374% for the SCARS model.
Table 7 Influence of the number of CARS and SCARS sampling runs on model accuracy
[Table 7 is an image in the original publication and is not reproduced here.]
Example 7: application of spectrum rapid grading method for super-high-quality milk, high-protein special milk, high-milk-fat special milk and common milk
80 milk samples were taken for model prediction, and the predicted class was compared with the true class, in order to verify the practical performance of the model and determine its reliability.
The specific steps are as follows: the spectra of the samples were collected as in Example 1, and the spectral data were fed directly into the second-order difference-UVE-NB model constructed in Example 1 to output the prediction results, which are shown in Tables 8A and 8B.
The overall accuracy of the model is 92.5%, and the prediction accuracies for grade A, grade B, grade C and grade D milk are 100%, 95%, 90% and 85%, respectively. Furthermore, the prediction time for the 80 milk samples was 0.17 seconds, so the prediction time for each milk sample was only 2.125 milliseconds.
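The reported figures can be cross-checked with a short calculation. The sketch assumes the 80 validation samples are split evenly, 20 per grade; the text does not state the per-grade counts, but the per-grade percentages are consistent with that split.

```python
# Per-grade accuracies reported in the text.
per_class_accuracy = {"A": 1.00, "B": 0.95, "C": 0.90, "D": 0.85}
samples_per_class = 20    # assumption: 80 samples divided evenly among 4 grades

# Number of correctly graded samples in each grade, then overall accuracy.
correct = sum(round(acc * samples_per_class) for acc in per_class_accuracy.values())
total = samples_per_class * len(per_class_accuracy)
overall = correct / total            # 74 / 80 = 0.925

# Average prediction time per sample, from 0.17 s for 80 samples.
ms_per_sample = 0.17 / 80 * 1000     # 2.125 ms
```

Under the even-split assumption, 20 + 19 + 18 + 17 = 74 of 80 samples are graded correctly, which reproduces the stated 92.5% overall accuracy, and 0.17 s over 80 samples reproduces the stated 2.125 ms per sample.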
As can be seen by combining table 3A and table 3B, the training set accuracy, the test set accuracy, and the external verification accuracy of the second-order difference-UVE-NB model are all greater than 92%, so that the model has good practical application effect and high reliability, and can meet the production requirements.
Table 8A External verification results of the model
[Table 8A is an image in the original publication and is not reproduced here.]
Table 8B External verification results of the model
[Table 8B is an image in the original publication and is not reproduced here.]
Note: Grade A: super-high-quality milk; Grade B: high-protein special milk; Grade C: high-milk-fat special milk; Grade D: common milk.
The invention finally selects the second-order difference-UVE-NB model as the best model for grading milk quality, and has the following advantages:
(1) Batch detection of super-high-quality milk, high-protein special milk, high-milk-fat special milk and common milk is realized, and detecting each milk sample takes only 2-3 milliseconds. The method offers high identification speed, high precision, low cost, simple operation and strong practicability.
(2) The optimal modeling bands are selected. As can be seen from FIG. 2 and Example 1, the preferred modeling bands are the two bands 925-1597 cm⁻¹ and 1712-3024 cm⁻¹.
(3) The characteristic wavelengths of super-high-quality milk, high-protein special milk, high-milk-fat special milk and common milk are determined, and the model is built on these characteristic wavelengths, which simplifies the model and improves its precision. The training set accuracy, the test set accuracy and the external verification accuracy of the model are all above 92%.
(4) Instrument cost is saved and harm to the human body is avoided. When the model is applied, the predicted category is output simply by feeding the mid-infrared spectral data of the milk, obtained with the milk component analyzer, into the model. The conventional instrumental method requires two instruments, a milk component analyzer and a somatic cell counter, and uses acridine orange, which is listed as a Group 3 carcinogen by the International Agency for Research on Cancer of the World Health Organization.

Claims (1)

1. A spectrum grading method for four milks, wherein the four milks are commercial super-high-quality milk, high-protein special milk, high-milk-fat special milk, and common milk, characterized by comprising the following steps:
(1) acquiring samples of the four milks to be detected, namely super-high-quality milk, high-protein special milk, high-milk-fat special milk, and common milk, as detection samples;
(2) collecting mid-infrared spectrum data: scanning the detection samples of step (1) with a milk component analyzer over the wavenumber range 925-4000cm-1, and outputting the sample transmittance through a computer connected to the analyzer;
(3) selecting spectral bands: removing bands with much noise and little effective information;
(4) dividing the original mid-infrared spectrum data into a training set and a test set, and preprocessing the data;
(5) extracting representative characteristic wavelengths of the mid-infrared spectrum;
(6) establishing a model: constructing a grading model on the training set with a naive Bayes model, and using the established model to predict the samples in the test set;
(7) screening and determining the model: comparing and evaluating the models according to training set accuracy and test set accuracy;
(8) selecting the optimal grading model through comparative analysis;
wherein:
when the mid-infrared spectrum is collected in step (2), the detection sample of step (1) is poured into a cylindrical sample tube with a diameter of 3.5cm and a height of 9cm and held in a 42 ℃ water bath for 15-20min, after which the solid optical fiber probe is inserted into the liquid;
in step (3), the spectral bands are selected by removing the noisy wavenumber ranges 1597-1712cm-1 and 3024-3680cm-1, and by removing the wavenumber range 3680-4000cm-1, which carries little effective information and contributes little to modeling, so that the bands 925-1597cm-1 and 1712-3024cm-1 are selected for modeling;
in step (4), a random algorithm is used to divide the samples randomly into a training set and a test set at a ratio of 7:3, and the data are preprocessed separately with standard normal variate (SNV), multiplicative scatter correction (MSC), first derivative, second derivative, first difference, and second difference preprocessing methods; wherein the number of smoothing points for the first derivative and the second derivative is set to 9;
in step (5), three characteristic wavelength extraction algorithms are used to extract the mid-infrared spectrum characteristic wavelengths: uninformative variable elimination (UVE), competitive adaptive reweighted sampling (CARS), and stability competitive adaptive reweighted sampling (SCARS); wherein the threshold of UVE is set to 0.9, and the number of sampling runs of CARS and SCARS is set to 100;
the characteristic wavelengths extracted by the three algorithms are respectively as follows:
UVE (229) 3.858cm-1、7.716cm-1、11.574cm-1、23.148cm-1、46.296cm-1、50.154cm-1、54.012cm-1、65.586cm-1、69.444cm-1、73.302cm-1、77.160cm-1、88.734cm-1、92.592cm-1、158.178cm-1、162.036cm-1、177.468cm-1、181.326cm-1、185.184cm-1、189.042cm-1、200.616cm-1、204.474cm-1、208.332cm-1、212.190cm-1、216.048cm-1、223.764cm-1、227.622cm-1、231.48cm-1、239.19cm-1、243.054cm-1、246.912cm-1、250.770cm-1、258.486cm-1、262.344cm-1、270.060cm-1、273.918cm-1、277.776cm-1、281.634cm-1、285.492cm-1、289.350cm-1、293.208cm-1、297.066cm-1、300.924cm-1、312.498cm-1、316.356cm-1、320.214cm-1、324.072cm-1、327.930cm-1、331.788cm-1、362.652cm-1、366.510cm-1、370.368cm-1、381.942cm-1、385.80cm-1、389.658cm-1、412.806cm-1、416.664cm-1、420.522cm-1、432.096cm-1、435.954cm-1、439.812cm-1、443.670cm-1、447.528cm-1、451.386cm-1、455.244cm-1、459.102cm-1、462.960cm-1、474.534cm-1、478.392cm-1、482.250cm-1、486.108cm-1、497.682cm-1、501.540cm-1、520.830cm-1、524.688cm-1、528.546cm-1、532.404cm-1、536.262cm-1、543.978cm-1、547.836cm-1、551.694cm-1、555.552cm-1、559.410cm-1、563.268cm-1、567.126cm-1、570.984cm-1、574.842cm-1、578.700cm-1、586.416cm-1、590.274cm-1、594.132cm-1、597.990cm-1、601.848cm-1、667.434cm-1、794.748cm-1、806.322cm-1、814.038cm-1、817.896cm-1、821.754cm-1、829.470cm-1、833.328cm-1、837.186cm-1、841.044cm-1、844.902cm-1、848.760cm-1、852.618cm-1、856.476cm-1、860.334cm-1、875.766cm-1、879.624cm-1、883.482cm-1、887.340cm-1、922.062cm-1、941.352cm-1、960.642cm-1、991.506cm-1、995.364cm-1、999.222cm-1、1003.080cm-1、1033.944cm-1、1037.802cm-1、1041.660cm-1、1045.518cm-1、1049.376cm-1、1064.808cm-1、1068.666cm-1、1072.524cm-1、1076.382cm-1、1103.388cm-1、1107.246cm-1、1111.104cm-1、1126.536cm-1、1130.394cm-1、1180.548cm-1、1184.406cm-1、1188.264cm-1、1230.702cm-1、1234.560cm-1、1238.418cm-1、1242.276cm-1、1257.708cm-1、1276.998cm-1、1280.856cm-1、1284.714cm-1、1304.004cm-1、1307.862cm-1、1311.720cm-1、1315.578cm-1、1319.436cm-1、1354.158cm-1、1358.016cm-1、1485.330cm-1、1489.188cm-1、1493.046cm-1、1496.904cm-1、1512.336cm-1、1516.194cm-1、1520.052cm-1、1523.910cm-1、1527.768cm-1、1531.626cm-1、1535.484cm-1、1539.342cm-1、1543.200cm-1、1558.632cm-1、1562.490cm-1、1566.348cm-1、1570.206cm-1、1574.064cm-1、1577.922cm-1、1581.780cm-1、1597.212cm-1、1601.070cm-1、1604.928cm-1、1608.786cm-1、1612.644cm-1、1616.502cm-1、1620.360cm-1、1624.218cm-1、1635.792cm-1、1639.650cm-1、1643.508cm-1、1647.366cm-1、1651.224cm-1、1655.082cm-1、1658.940cm-1、1662.798cm-1、1678.230cm-1、1682.088cm-1、1685.946cm-1、1689.804cm-1、1693.662cm-1、1697.5200cm-1、1701.378cm-1、1716.810cm-1、1720.668cm-1、1724.526cm-1、1728.384cm-1、1732.242cm-1、1736.100cm-1、1739.958cm-1、1751.532cm-1、1755.390cm-1、1759.248cm-1、1763.106cm-1、1766.964cm-1、1770.822cm-1、1774.680cm-1、1778.538cm-1、1782.396cm-1、1793.970cm-1、1797.828cm-1、1801.686cm-1、1805.544cm-1、1840.266cm-1、1844.124cm-1、1847.982cm-1、1851.840cm-1、1855.698cm-1、1890.420cm-1、1894.278cm-1、1898.136cm-1、1929cm-1、1932.858cm-1、1948.290cm-1、1952.148cm-1、1956.006cm-1、1959.864cm-1、2025.450cm-1、2029.308cm-1
CARS (37) 73.302cm-1、177.468cm-1、204.474cm-1、208.332cm-1、212.190cm-1、243.054cm-1、246.912cm-1、270.060cm-1、273.918cm-1、300.924cm-1、335.646cm-1、432.096cm-1、435.954cm-1、478.392cm-1、482.250cm-1、486.108cm-1、532.404cm-1、574.842cm-1、586.416cm-1、590.274cm-1、594.132cm-1、605.706cm-1、609.564cm-1、640.428cm-1、667.434cm-1、794.748cm-1、798.606cm-1、806.322cm-1、814.038cm-1、837.186cm-1、841.044cm-1、844.902cm-1、848.760cm-1、852.618cm-1、856.476cm-1、860.334cm-1、2064.030cm-1
SCARS (20) 73.302cm-1、142.746cm-1、208.332cm-1、243.054cm-1、270.060cm-1、273.918cm-1、327.930cm-1、331.788cm-1、335.646cm-1、412.806cm-1、416.664cm-1、435.954cm-1、478.392cm-1、482.250cm-1、486.108cm-1、532.404cm-1、594.132cm-1、605.706cm-1、636.570cm-1、729.162cm-1
the mid-infrared spectrum data are preprocessed, and the models are constructed and verified, using MATLAB 2016b software.
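The preprocessing options named in step (4) of the claim can be sketched as follows. This is an illustrative sketch only: the patent states that MATLAB 2016b was used, so the Python functions below, their names, and the choice of the set mean as the MSC reference spectrum are all assumptions. The derivative options with 9 smoothing points and the feature-selection algorithms are not covered here.

```python
import random
import statistics


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)
    return [(v - mean) / sd for v in spectrum]


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum of the set.

    Each spectrum s is regressed on the reference r (s = intercept + slope * r)
    and corrected as (s - intercept) / slope.
    """
    reference = [statistics.fmean(col) for col in zip(*spectra)]
    ref_mean = statistics.fmean(reference)
    ref_ss = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for s in spectra:
        s_mean = statistics.fmean(s)
        slope = sum((r - ref_mean) * (v - s_mean)
                    for r, v in zip(reference, s)) / ref_ss
        intercept = s_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in s])
    return corrected


def first_difference(spectrum):
    """First-order difference between neighbouring spectral points."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


def split_7_3(n_samples, seed=0):
    """Randomly split sample indices into training and test sets at 7:3."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = round(0.7 * n_samples)
    return indices[:cut], indices[cut:]


# Tiny demonstration on made-up numbers (hypothetical, not measured spectra).
base = [1.0, 2.0, 4.0, 8.0]
scaled = [2 * v + 3 for v in base]  # same shape, different offset and gain
print(snv([1, 2, 3]))
print(msc([base, scaled]))
print(first_difference([1, 4, 9]))
print(split_7_3(10))
```

In a real workflow the MSC reference would normally be computed on the training set only and then reused for the test and external verification samples, so that no information leaks from the held-out data.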
CN202110503811.3A 2021-05-10 2021-05-10 Spectrum grading method for super-high-quality milk, high-protein special milk, high-milk-fat special milk and common milk Pending CN113324940A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110503811.3A CN113324940A (en) 2021-05-10 2021-05-10 Spectrum grading method for super-high-quality milk, high-protein special milk, high-milk-fat special milk and common milk

Publications (1)

Publication Number Publication Date
CN113324940A (en) 2021-08-31

Family

ID=77415095

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110503811.3A Pending CN113324940A (en) 2021-05-10 2021-05-10 Spectrum grading method for super-high-quality milk, high-protein special milk, high-milk-fat special milk and common milk

Country Status (1)

Country Link
CN (1) CN113324940A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105973816A (en) * 2016-05-06 2016-09-28 中国农业大学 Visible light/near infrared spectroscopy-based fowl egg hatching capability determination method
CN112525850A (en) * 2020-10-01 2021-03-19 华中农业大学 Spectral fingerprint identification method for milk, mare, camel, goat and buffalo milk

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
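Xiong Qin et al.: "Building robust near-infrared spectroscopy analysis models using least angle regression (LAR) combined with sampling error profile analysis (SEPA)", Journal of Instrumental Analysis, 18 July 2018 (2018-07-18), pages 778 - 783 *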

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113916824A (en) * 2021-11-01 2022-01-11 华中农业大学 Mid-infrared rapid batch detection method for αs1-casein in milk
CN114166785A (en) * 2021-11-16 2022-03-11 华中农业大学 Mid-infrared rapid batch detection method for fat content in buffalo milk and application
CN114166785B (en) * 2021-11-16 2024-02-13 华中农业大学 Mid-infrared rapid batch detection method for fat content in buffalo milk and application thereof

Similar Documents

Publication Publication Date Title
Wang et al. Quality analysis, classification, and authentication of liquid foods by near-infrared spectroscopy: A review of recent research developments
Cozzolino et al. Feasibility study on the use of visible and near-infrared spectroscopy together with chemometrics to discriminate between commercial white wines of different varietal origins
Urbano-Cuadrado et al. Near infrared reflectance spectroscopy and multivariate analysis in enology: Determination or screening of fifteen parameters in different types of wines
Vitale et al. A rapid and non-invasive method for authenticating the origin of pistachio samples by NIR spectroscopy and chemometrics
Tsenkova et al. Near infrared spectroscopy for biomonitoring: cow milk composition measurement in a spectral region from 1,100 to 2,400 nanometers
KR101574895B1 (en) Method for predicting sugar contents and acidity of citrus using ft-ir fingerprinting combined by multivariate analysis
CN113324940A (en) Spectrum grading method for super-high-quality milk, high-protein special milk, high-milk-fat special milk and common milk
US20210247318A1 (en) Non-Destructive Detection of Egg Freshness Based on Raman Spectroscopy
CN112666111A (en) Method for quickly identifying milk and mare milk
Martín-Tornero et al. Comparative quantification of chlorophyll and polyphenol levels in grapevine leaves sampled from different geographical locations
Cayuela et al. NIR prediction of fruit moisture, free acidity and oil content in intact olives
Porep et al. Implementation of an on‐line near infrared/visible (NIR/VIS) spectrometer for rapid quality assessment of grapes upon receival at wineries
Tarkosova et al. Determination of carbohydrate content in bananas during ripening and storage by near infrared spectroscopy
Niimi et al. Prediction of wine sensory properties using mid-infrared spectra of Cabernet Sauvignon and Chardonnay grape berries and wines
CN112666114A (en) Method for identifying buffalo milk and mare milk by using spectrum
CN110231302A (en) A kind of method of the odd sub- seed crude fat content of quick measurement
CN113310929A (en) Soybean powder doped in high-temperature sterilized milk and spectral identification method of doping proportion thereof
CN112213281A (en) Comprehensive evaluation method for rapidly determining freshness of freshwater fish based on transmission near infrared spectrum
NL2029012B1 (en) Method for quickly identification of cow milk and goat milk
CN110609011A (en) Near-infrared hyperspectral detection method and system for starch content of single-kernel corn seeds
CN113310933A (en) Spectrum identification method for number of days for storing raw buffalo milk
Li et al. Rapid analysis of alcohol content during the green jujube wine fermentation by FT-NIR
Liu et al. Application of hyperspectral imaging for cocoa bean grading with machine learning approaches
CN113324941A (en) Method for rapidly identifying preservation time of raw milk
CN114166779A (en) Intermediate infrared rapid batch detection method for beta-casein in milk

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210831