CN117871459A - Mutton crude fat content determination method and system - Google Patents
- Publication number
- CN117871459A (application CN202410068013.6A)
- Authority
- CN
- China
- Prior art keywords
- crude fat
- fat content
- data
- mutton
- characteristic wavelength
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Investigating Or Analysing Materials By Optical Means (AREA)
Abstract
The invention discloses a method and a system for determining the crude fat content of mutton, and relates to the technical field of meat quality research. The method comprises: obtaining raw near-infrared spectrum data and measured crude fat content values for a plurality of mutton samples; preprocessing the raw near-infrared spectrum data to obtain processed spectrum data; extracting characteristic wavelengths from the processed spectrum data with the successive projections algorithm (SPA) to obtain characteristic wavelength data; determining a final crude fat content prediction model from the characteristic wavelength data by partial least squares (PLS); and obtaining target characteristic wavelength data of the target mutton to be tested and inputting it into the final crude fat content prediction model to obtain a predicted crude fat content for the target mutton. By modeling on the characteristic wavelength data extracted with SPA, the invention predicts the crude fat content of mutton, improves prediction accuracy, reduces labor and time costs, and avoids environmental pollution.
Description
Technical Field
The invention relates to the technical field of meat quality research, in particular to a method and a system for determining the content of mutton crude fat.
Background
China is a major producer and consumer of mutton. Mutton accounts for about 5.9% of meat consumption, and data published in 2022 show that China accounts for about one third of the world's mutton, with import and export volumes up 2.5% year on year. Enterprises are encouraged to strengthen technical research and development and to improve the quality and efficiency of livestock and poultry farming. At the same time, supervision of enterprise registration, inspection and quarantine, product quality, and food safety in the mutton industry is being increased, and the relevant quality and food safety standards are being formulated and improved, so as to ensure the quality and safety of mutton products. The nutritional value of mutton is usually measured by wet chemical methods, which are costly, time-consuming, and labor-intensive, pollute the environment, and are prone to large accidental errors.
Disclosure of Invention
The invention aims to provide a method and a system for determining the content of mutton crude fat, which can improve the prediction accuracy of the mutton crude fat content, reduce the labor and time cost and avoid environmental pollution.
In order to achieve the above object, the present invention provides the following solutions:
The invention provides a method for determining the crude fat content of mutton, comprising the following steps:
Obtaining raw near-infrared spectrum data and measured crude fat content values for a plurality of mutton samples.
Preprocessing the raw near-infrared spectrum data to obtain processed spectrum data.
Extracting characteristic wavelengths from the processed spectrum data with the successive projections algorithm (SPA) to obtain characteristic wavelength data.
Determining a final crude fat content prediction model from the characteristic wavelength data by partial least squares (PLS).
Obtaining target characteristic wavelength data of the target mutton to be tested, and inputting the target characteristic wavelength data into the final crude fat content prediction model to obtain a predicted crude fat content for the target mutton.
The invention also provides a system for determining the crude fat content of mutton, comprising:
A raw data acquisition module, used for obtaining raw near-infrared spectrum data and measured crude fat content values for a plurality of mutton samples.
A preprocessing module, used for preprocessing the raw near-infrared spectrum data to obtain processed spectrum data.
A characteristic wavelength extraction module, used for extracting characteristic wavelengths from the processed spectrum data with the successive projections algorithm (SPA) to obtain characteristic wavelength data.
A final crude fat content prediction model determining module, used for determining a final crude fat content prediction model from the characteristic wavelength data by partial least squares (PLS).
A prediction module, used for obtaining target characteristic wavelength data of the target mutton to be tested, and inputting the target characteristic wavelength data into the final crude fat content prediction model to obtain a predicted crude fat content for the target mutton.
According to the specific embodiments provided by the invention, the invention achieves the following technical effects. The method and system first obtain raw near-infrared spectrum data and measured crude fat content values for a plurality of mutton samples; preprocess the raw near-infrared spectrum data to obtain processed spectrum data; extract characteristic wavelengths from the processed spectrum data with the successive projections algorithm (SPA) to obtain characteristic wavelength data; determine a final crude fat content prediction model from the characteristic wavelength data by partial least squares (PLS); and obtain target characteristic wavelength data of the target mutton to be tested and input it into the final crude fat content prediction model to obtain a predicted crude fat content for the target mutton. By modeling on the characteristic wavelength data extracted with SPA, the invention predicts the crude fat content of mutton, improves prediction accuracy, reduces labor and time costs, and avoids environmental pollution.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the drawings that are needed in the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a method for determining the content of mutton crude fat;
FIG. 2 is an original near infrared spectrum of a mutton sample provided in example 1 of the present invention;
FIG. 3 is a spectrum of the standard normal variate (SNV) transformed data of the mutton samples provided in example 2 of the present invention;
FIG. 4 is a spectrum of the multiplicative scatter correction (MSC) data of the mutton samples provided in example 3 of the present invention;
FIG. 5 is a spectrum of the SNV characteristic wavelengths of the mutton samples provided in example 4 of the present invention;
FIG. 6 is a spectrum of the MSC characteristic wavelengths of the mutton samples provided in example 5 of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention aims to provide a method and a system for determining the crude fat content of mutton. The crude fat content is predicted by modeling on the characteristic wavelength data extracted with the successive projections algorithm (SPA), which improves prediction accuracy, reduces labor and time costs, and avoids environmental pollution.
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.
As shown in fig. 1, the embodiment provides a method for determining the content of mutton crude fat, which includes:
S1: Obtaining raw near-infrared spectrum data and measured crude fat content values for a plurality of mutton samples.
S2: Preprocessing the raw near-infrared spectrum data to obtain processed spectrum data.
S3: Extracting characteristic wavelengths from the processed spectrum data with the successive projections algorithm (Successive Projections Algorithm, SPA) to obtain characteristic wavelength data.
S4: Determining a final crude fat content prediction model from the characteristic wavelength data by partial least squares (PLS).
S5: Obtaining target characteristic wavelength data of the target mutton to be tested, and inputting the target characteristic wavelength data into the final crude fat content prediction model to obtain a predicted crude fat content for the target mutton.
The 190 samples collected in this example comprise 88 Hu sheep hind leg samples, 52 Tan sheep longissimus dorsi samples, 12 Hu sheep (Hu sheep) hind leg samples, and 38 Dupoachi sheep hind leg samples.
Fascia and fat are removed from the leg muscle of the mutton, the meat is minced to a paste with a meat grinder, and the paste is packed into a sample cup to a thickness of 1 cm so that it is compact and free of gaps. The raw near-infrared spectrum data in step S1 are obtained by scanning the mutton samples with a near-infrared analyzer. In this embodiment, the near-infrared spectra of the mutton are collected with a FoodScan 2 near-infrared analyzer (FOSS); the wavelength range is 400-1100 nm and the data interval is 0.5 nm. Each sample is packed into the cup 3 times and scanned 3 times, and the average spectrum is taken. The operating environment during testing is kept at a room temperature of 20-25 °C.
The content of crude fat (EE) in the mutton sample was determined according to the method GB 5009.6-2016.
In step S2, preprocessing the raw near-infrared spectrum data to obtain processed spectrum data specifically includes: preprocessing the raw near-infrared spectrum data with the standard normal variate (SNV) transformation to obtain SNV spectrum data; and preprocessing the raw near-infrared spectrum data with multiplicative scatter correction (MSC) to obtain MSC spectrum data. Either the SNV spectrum data or the MSC spectrum data serves as the processed spectrum data.
Specifically, the raw near-infrared spectrum data of the mutton samples are preprocessed separately with the standard normal variate transformation (Standard Normal Variate transform, SNV) and with multiplicative scatter correction (Multiplicative Scatter Correction, MSC). This prepares the preprocessed spectrum data for characteristic wavelength extraction with the successive projections algorithm and for building the partial least squares models, and makes the subsequent comparison of models easier.
The standard normal variate transformation corrects spectral errors caused by scattering between samples. In essence, SNV subtracts the mean of the spectrum from the original spectrum and divides the result by the standard deviation of the spectrum, so that each full spectrum is standardized. The formula is:
$$x_{SNV} = \frac{x_k - \bar{x}}{\sqrt{\dfrac{\sum_{k=1}^{m} (x_k - \bar{x})^2}{m - 1}}}$$
In the above, $x_{SNV}$ is the spectral value after SNV preprocessing, $m$ is the number of wavelength points, $x_k$ is the spectral value at the $k$-th wavelength point of the raw near-infrared spectrum, and $\bar{x}$ is the mean of the full spectrum.
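To make the SNV step concrete, the following is a minimal Python sketch that applies the transformation to a single spectrum. The function name `snv` and the use of the sample standard deviation (divisor m - 1) are assumptions made for illustration; the embodiment itself was built in MATLAB 2020a and its code is not reproduced here.

```python
import math


def snv(spectrum):
    """Standard normal variate transform of one spectrum (a list of floats).

    Illustrative sketch: subtract the mean of the spectrum and divide by its
    standard deviation (assumed here to use the m - 1 divisor).
    """
    m = len(spectrum)
    mean = sum(spectrum) / m
    std = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (m - 1))
    return [(x - mean) / std for x in spectrum]


if __name__ == "__main__":
    # Toy three-point "spectrum", not real measurement data.
    print(snv([1.0, 2.0, 3.0]))
```

Each spectrum is transformed independently of the others, so the same function can be applied to calibration, validation, and unknown samples alike.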
Multiplicative scatter correction (MSC) is calculated from the spectral matrix of a set of samples. The basic idea is to assume that the scattering coefficient is the same at all wavelengths. MSC requires that the change in the spectrum and the content of the component in the sample satisfy a direct linear relation. The mean spectrum of all samples is taken as the standard, and the near-infrared spectrum of every sample is corrected against it; the correction covers baseline translation and offset. The formulas are:
$$\bar{A}_j = \frac{1}{n} \sum_{i=1}^{n} A_{i,j}$$
$$A_i = m_i \bar{A} + b_i$$
$$A_{i(MSC)} = \frac{A_i - b_i}{m_i}$$
In the above, $A_{i,j}$ is the element in row $i$ and column $j$ of the $n \times p$ calibration spectrum data matrix, where $n$ is the number of samples and $p$ is the number of wavelength points used for spectrum acquisition; $i = 1, \dots, n$ indexes the samples and $j = 1, \dots, p$ indexes the wavelength points. $\bar{A}$ is the mean of the raw near-infrared spectra of all samples at each wavelength point. $A_i$ is a $1 \times p$ matrix representing the spectral vector of a single sample. $m_i$ and $b_i$ are the relative offset coefficient and the translation amount obtained by univariate linear regression of each sample's near-infrared spectrum $A_i$ on the mean spectrum $\bar{A}$. $A_{i(MSC)}$ is the spectrum after multiplicative scatter correction, that is, the MSC spectrum data.
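The MSC procedure described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `msc` and the list-of-lists data layout are assumptions, and the regression is an ordinary univariate least-squares fit of each spectrum on the mean spectrum.

```python
def msc(spectra):
    """Multiplicative scatter correction for an n x p list of spectra.

    Illustrative sketch: each spectrum A_i is regressed on the mean spectrum
    (A_i ~ m_i * mean + b_i) and corrected as (A_i - b_i) / m_i.
    """
    n, p = len(spectra), len(spectra[0])
    # Mean spectrum over all samples at each wavelength point.
    mean = [sum(s[j] for s in spectra) / n for j in range(p)]
    mean_avg = sum(mean) / p
    sxx = sum((v - mean_avg) ** 2 for v in mean)
    corrected = []
    for s in spectra:
        s_avg = sum(s) / p
        # Slope m_i and intercept b_i of the univariate least-squares fit.
        m_i = sum((a - mean_avg) * (b - s_avg) for a, b in zip(mean, s)) / sxx
        b_i = s_avg - m_i * mean_avg
        corrected.append([(v - b_i) / m_i for v in s])
    return corrected


if __name__ == "__main__":
    # Two toy spectra that differ only by a scale and an offset.
    print(msc([[1.0, 2.0, 3.0], [3.0, 5.0, 7.0]]))
```

Because the two toy spectra are exact linear functions of their mean, both are mapped onto the mean spectrum, which is the behaviour MSC is designed to produce when the differences between spectra are pure scatter effects.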
In step S3, extracting characteristic wavelengths from the processed spectrum data with the successive projections algorithm to obtain characteristic wavelength data specifically includes:
extracting characteristic wavelengths from the SNV spectrum data with the successive projections algorithm to obtain the SNV characteristic wavelengths;
extracting characteristic wavelengths from the MSC spectrum data with the successive projections algorithm to obtain the MSC characteristic wavelengths; the SNV characteristic wavelengths and the MSC characteristic wavelengths constitute the characteristic wavelength data.
In step S4, determining a final crude fat content prediction model from the characteristic wavelength data by partial least squares specifically includes:
taking the SNV characteristic wavelengths as input and the measured crude fat content values as output, and modeling by partial least squares to obtain a first crude fat content prediction model;
taking the MSC characteristic wavelengths as input and the measured crude fat content values as output, and modeling by partial least squares to obtain a second crude fat content prediction model;
determining a final crude fat content prediction model from the first crude fat content prediction model and the second crude fat content prediction model, which specifically includes:
according to the evaluation indices of the first crude fat content prediction model and of the second crude fat content prediction model, taking the model whose evaluation indices satisfy the set conditions as the final crude fat content prediction model. The evaluation indices comprise the root mean square error (Root Mean Square Error, RMSE), the coefficient of determination $R^2$, and the relative analysis error (Relative Percent Deviation, RPD). The set conditions are that the difference between the root mean square error and its set value lies within a first threshold range, the difference between the coefficient of determination and its set value lies within a second threshold range, and the difference between the relative analysis error and its set value lies within a third threshold range.
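The three evaluation indices can be computed as in the following Python sketch. The function names are hypothetical, and the RPD is taken here as the standard deviation of the measured values divided by the RMSE, which is a common convention but an assumption as far as this document is concerned, since the text does not give its formula.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


def rpd(y_true, y_pred):
    """Relative analysis error, assumed here to be SD(y_true) / RMSE."""
    n = len(y_true)
    mean = sum(y_true) / n
    sd = math.sqrt(sum((a - mean) ** 2 for a in y_true) / (n - 1))
    return sd / rmse(y_true, y_pred)


if __name__ == "__main__":
    # Toy numbers only, not results from the embodiment.
    measured = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.1, 1.9, 3.2, 3.8]
    print(rmse(measured, predicted), r_squared(measured, predicted),
          rpd(measured, predicted))
```

A lower RMSE together with a higher coefficient of determination and a higher RPD indicates a better model, which is how the first and second models would be ranked before the threshold conditions are checked.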
Besides the two modeling approaches above, the invention further comprises the following:
taking the raw near-infrared spectrum data as input and the measured crude fat content values as output, and modeling by partial least squares to obtain a third crude fat content prediction model;
taking the SNV spectrum data as input and the measured crude fat content values as output, and modeling by partial least squares to obtain a fourth crude fat content prediction model;
taking the MSC spectrum data as input and the measured crude fat content values as output, and modeling by partial least squares to obtain a fifth crude fat content prediction model.
The 190 samples were partitioned with the SPXY sample partitioning method at a ratio of 4:1 into 151 mutton samples as the calibration set and 39 mutton samples as the validation set.
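The partitioning step can be illustrated with the following Python sketch of an SPXY-style split (sample set partitioning based on joint x-y distances). It is a simplified sketch under stated assumptions: the function name `spxy_split` is hypothetical, spectra are plain lists of floats, distances in x and y are each normalized by their maximum and summed, and samples are selected in the Kennard-Stone manner. It also assumes that not all samples are identical, so the maximum distances are non-zero.

```python
import math


def spxy_split(X, y, n_cal):
    """Split sample indices into calibration and validation sets (sketch)."""
    n = len(X)
    dx = [[math.dist(X[i], X[j]) for j in range(n)] for i in range(n)]
    dy = [[abs(y[i] - y[j]) for j in range(n)] for i in range(n)]
    max_dx = max(max(row) for row in dx)
    max_dy = max(max(row) for row in dy)
    # Joint x-y distance: each part normalized by its maximum.
    d = [[dx[i][j] / max_dx + dy[i][j] / max_dy for j in range(n)]
         for i in range(n)]
    # Start with the two samples that are farthest apart.
    pairs = ((i, j) for i in range(n) for j in range(i + 1, n))
    first, second = max(pairs, key=lambda ij: d[ij[0]][ij[1]])
    cal = [first, second]
    while len(cal) < n_cal:
        remaining = [k for k in range(n) if k not in cal]
        # Add the sample whose nearest selected neighbour is farthest away.
        cal.append(max(remaining, key=lambda k: min(d[k][c] for c in cal)))
    val = [k for k in range(n) if k not in cal]
    return cal, val


if __name__ == "__main__":
    # Toy data: five one-point "spectra" and their reference values.
    X = [[0.0], [1.0], [2.0], [3.0], [10.0]]
    y = [0.0, 1.0, 2.0, 3.0, 10.0]
    print(spxy_split(X, y, 4))
```

For the split described in the text, `n_cal` would be 151 out of 190 samples, leaving 39 for validation.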
Near-infrared prediction models are built with SNV and MSC, respectively, as the preprocessing method. Each model is fitted on the near-infrared spectra and the chemical values of the calibration set samples and externally validated on the validation set samples, and the predicted values are compared with the known contents of those samples. The evaluation indices of the two models (root mean square error, coefficient of determination, and relative analysis error) are then compared to select the better preprocessing method for modeling.
Characteristic wavelengths are then extracted with the successive projections algorithm from the spectrum data preprocessed by SNV and by MSC, respectively, and a partial least squares model is built on the characteristic wavelengths in order to predict mutton samples of unknown content. First, a partial least squares model is built from the near-infrared spectra of the calibration set samples and their chemical values (the crude fat content of the samples); the input and output of the model are the near-infrared spectra of the known samples and the contents of the known samples, respectively. The spectrum data of the validation set samples are then fed into the prediction model to obtain the predicted crude fat content of each validation sample, and the predictions are compared with the known contents (the measured crude fat content of the validation set samples). The evaluation indices of the two models (root mean square error, coefficient of determination, and relative analysis error) are compared to find the best model.
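The calibrate-then-predict workflow described above can be sketched with a small single-response partial least squares (PLS1) implementation in the NIPALS style. This is a minimal sketch and not the MATLAB workflow of the embodiment: the function names `pls1_fit` and `pls1_predict` are hypothetical, the number of latent components is chosen by the caller, and the toy data are illustrative only.

```python
import math


def pls1_fit(X, y, n_components):
    """Fit a PLS1 model (NIPALS-style sketch) on an n x p list X and list y."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                # centered y
    comps = []
    for _ in range(n_components):
        # Weight vector: direction of maximum covariance between X and y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, load, q))
    return x_mean, y_mean, comps


def pls1_predict(model, x):
    """Predict the response for one new sample x (list of p floats)."""
    x_mean, y_mean, comps = model
    e = [a - b for a, b in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in comps:
        t = sum(a * b for a, b in zip(e, w))
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, load)]
    return y_hat


if __name__ == "__main__":
    # Toy calibration data with an exact linear relation y = 1 + 2*x1 - x2.
    X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0]]
    y = [1 + 2 * a - b for a, b in X]
    model = pls1_fit(X, y, 2)
    print(pls1_predict(model, [6.0, 5.0]))
```

In the setting of the text, the rows of `X` would be the spectral values of the calibration samples at the selected characteristic wavelengths, `y` the measured crude fat contents, and `pls1_predict` would be applied to each validation sample before computing the evaluation indices.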
Five specific embodiments are provided below, respectively, for further details of the establishment and evaluation of the above five prediction models.
Example 1: construction of full spectrum partial least square model
1.1 mutton sample collection and preparation:
The 190 samples collected in this example comprise 88 Hu sheep hind leg samples, 52 Tan sheep longissimus dorsi samples, 12 Hu sheep (Hu sheep) hind leg samples, and 38 Dupoachi sheep hind leg samples.
1.2 data processing:
A data processing workflow was built in MATLAB 2020a. The sample spectrum data and the measured chemical values of the samples were imported, partial least squares modeling was performed on the raw near-infrared spectrum data without any preprocessing, and the resulting third crude fat content prediction model was evaluated against the evaluation indices.
1.3 results:
The 190 samples were partitioned with the SPXY sample partitioning method at a ratio of 4:1 into 151 mutton data samples as the calibration set and 39 mutton data samples as the validation set. The mean, maximum, minimum, and standard deviation of the component content for the calibration set and the validation set are shown in Table 1.
Table 1 Statistics of the sample calibration set and validation set
Partial least squares modeling was performed directly on the full-spectrum wavelengths (FS-PLS). The root mean square error of the calibration set (RMSEcal) is 0.7627 and its coefficient of determination $R_c^2$ is 0.6445; the root mean square error of the prediction set (RMSEp) is 0.7641 and its coefficient of determination $R_p^2$ is 0.7641; the relative analysis error (RPD) is 1.26. FIG. 2 shows the raw near-infrared spectra of the 190 mutton samples of different breeds.
Example 2: construction of partial least squares model for SNV pretreatment
1.1 mutton sample collection and preparation:
The 190 samples collected in this example comprise 88 Hu sheep hind leg samples, 52 Tan sheep longissimus dorsi samples, 12 Hu sheep (Hu sheep) hind leg samples, and 38 Dupoachi sheep hind leg samples.
1.2 data processing:
A data processing workflow was built in MATLAB 2020a. The sample spectrum data and the measured chemical values of the samples were imported, the raw near-infrared spectrum data were preprocessed with the standard normal variate (SNV) transformation, partial least squares modeling was performed, and the resulting fourth crude fat content prediction model was evaluated against the evaluation indices.
1.3 results:
The 190 samples were partitioned with the SPXY sample partitioning method at a ratio of 4:1 into 151 mutton data samples as the calibration set and 39 mutton data samples as the validation set. The mean, maximum, minimum, and standard deviation of the component content for the calibration set and the validation set are shown in Table 2.
Table 2 Statistics of the sample calibration set and validation set
The full spectrum was preprocessed with the standard normal variate (SNV) transformation and then modeled by partial least squares (FS-SNV-PLS). The root mean square error of the calibration set (RMSEcal) is 0.7602 and its coefficient of determination $R_c^2$ is 0.6514; the root mean square error of the prediction set (RMSEp) is 0.7651 and its coefficient of determination $R_p^2$ is 0.6421; the relative analysis error (RPD) is 1.467. The SNV spectrum data, that is, the spectra after SNV preprocessing, are shown in FIG. 3, where each line represents the SNV-processed spectrum of a different sample.
Example 3: construction of MSC pretreatment partial least square model
1.1 mutton sample collection and preparation:
The 190 samples collected in this example comprise 88 Hu sheep hind leg samples, 52 Tan sheep longissimus dorsi samples, 12 Hu sheep (Hu sheep) hind leg samples, and 38 Dupoachi sheep hind leg samples.
1.2 data processing:
A data processing workflow was built in MATLAB 2020a. The sample spectrum data and the measured chemical values of the samples were imported, the raw near-infrared spectrum data were preprocessed with multiplicative scatter correction (MSC) to obtain the processed spectrum data, partial least squares modeling was performed, and the resulting fifth crude fat content prediction model was evaluated against the evaluation indices.
1.3 results:
The 190 samples were partitioned with the SPXY sample partitioning method at a ratio of 4:1 into 151 mutton data samples as the calibration set and 39 mutton data samples as the validation set. The mean, maximum, minimum, and standard deviation of the component content for the calibration set and the validation set are shown in Table 3.
Table 3 Statistics of the sample calibration set and validation set
The full spectrum was preprocessed with multiplicative scatter correction (MSC) and then modeled by partial least squares (FS-MSC-PLS). The root mean square error of the calibration set (RMSEcal) is 0.7739 and its coefficient of determination $R_c^2$ is 0.7508; the root mean square error of the prediction set (RMSEp) is 0.7821 and its coefficient of determination $R_p^2$ is 0.7501; the relative analysis error (RPD) is 1.44. The MSC spectrum data, that is, the spectra after MSC preprocessing, are shown in FIG. 4.
Example 4: construction of SNV+SPA partial least squares model
1.1 mutton sample collection and preparation:
The 190 samples collected in this example comprise 88 Hu sheep hind leg samples, 52 Tan sheep longissimus dorsi samples, 12 Hu sheep (Hu sheep) hind leg samples, and 38 Dupoachi sheep hind leg samples.
1.2 data processing:
A data processing workflow was built in MATLAB 2020a. The sample spectrum data and the measured chemical values of the samples were imported, the raw near-infrared spectrum data were preprocessed with the standard normal variate (SNV) transformation to obtain the processed spectrum data, and characteristic wavelengths were extracted with the successive projections algorithm.
SPA maps the original high-dimensional data to a low-dimensional space by selecting a series of projection vectors while retaining the class information of the data. SPA is based on vector projection analysis: each wavelength is projected onto the other wavelengths, the magnitudes of the projection vectors are compared, and the wavelength with the largest projection vector is taken as a candidate wavelength. The final characteristic wavelengths are then chosen on the basis of the calibration model. SPA thus selects the combination of variables that contains the least redundant information and the least collinearity.
One wavelength is first taken at random as the initial wavelength variable, and the remaining wavelength variables are projected onto it. The wavelength with the largest projection in that pass is taken as a candidate wavelength and becomes the new reference wavelength variable, and the operation is repeated so that the maximum projections of the wavelength variables onto the other wavelengths are obtained in sequence. This projection analysis reduces the collinearity of the selected wavelengths and removes much of the redundant information, which improves the accuracy and stability of the model. The calculation steps of the successive projections algorithm (SPA) are as follows:
Let the target spectral matrix be $X$ with initial column $x_{k(0)}$; the initial band is $k(0)$ and the number of wavelength variables to be selected is $N$.
(1) Initialize variables: $n = 1$, $x_j \in X_{*,j}$, $j = 1, \dots, J$; any column (wavelength) of the matrix is selected and assigned to $x_j$; $X_{*,j}$ is the initial set of variables.
(2) Determine the unselected variables: $S$ is the set of wavelength variables that have not yet been selected.
(3) Calculate the projection $P x_j$ of each unselected wavelength onto the initial variable wavelength.
(4) Determine the maximum wavelength projection: $k(n) = \arg\max(\lVert P x_j \rVert),\ x_j \in S$; then $x_j = P x_j$, $j = 1, \dots, J$.
(5) Loop: $n = n + 1$; when $n < N$, return to step (2) and continue.
(6) Determine the selected bands: once $n$ reaches $N$, the selected bands are output.
A linear regression model is established for the subset obtained in each iteration, and the wavelength variable combination with the smallest root mean square error is selected as the preferred band.
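The projection loop described above can be sketched as follows. This is a plain Python illustration, not the MATLAB function used in this example. It follows the commonly published formulation of the algorithm (often called the successive projections algorithm), in which each unselected column is projected onto the subspace orthogonal to the most recently selected column. The function name `spa_select`, its arguments, and the small matrix are hypothetical. The final step, in which regression models are compared by root mean square error to choose the starting wavelength and the number of wavelengths, is omitted.

```python
def spa_select(X, k0, n_select):
    """Return up to n_select column indices chosen by successive projections.

    X is a list of rows (samples x wavelengths); k0 is the starting column.
    """
    n_cols = len(X[0])
    if not 0 <= k0 < n_cols or not 1 <= n_select <= n_cols:
        raise ValueError("invalid starting column or number of wavelengths")
    # Work on column vectors, because every projection acts on a whole column.
    cols = [[row[j] for row in X] for j in range(n_cols)]
    selected = [k0]
    while len(selected) < n_select:
        ref = cols[selected[-1]]
        ref_sq = sum(v * v for v in ref)
        if ref_sq == 0.0:
            break  # no independent direction is left to project against
        best_j, best_norm = -1, -1.0
        for j in range(n_cols):
            if j in selected:
                continue
            # Remove the component of column j that lies along the reference.
            coef = sum(a * b for a, b in zip(cols[j], ref)) / ref_sq
            cols[j] = [a - coef * b for a, b in zip(cols[j], ref)]
            norm = sum(v * v for v in cols[j]) ** 0.5
            if norm > best_norm:
                best_j, best_norm = j, norm
        selected.append(best_j)
    return selected


# Hypothetical 4 x 4 matrix: column 1 is almost a multiple of column 0.
X = [
    [1.0, 2.0, 0.0, 1.0],
    [2.0, 4.1, 1.0, 0.0],
    [3.0, 6.0, 0.0, 2.0],
    [4.0, 8.1, 1.0, 1.0],
]
chosen = spa_select(X, 0, 3)
print(chosen)
```

Starting from column 0, the nearly collinear column 1 is left for last, which is the behavior the text describes as reducing collinearity among the selected wavelengths.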
Implementation of the SPA function
Parameter description:
spectrum: the input spectral data, a vector containing the spectral information.
n: the number of principal components used in the principal component analysis. It determines how many of the largest singular values are selected, i.e. the number of characteristic wavelengths.
spa_result: the SPA result at each location. In this implementation it is the norm of the largest singular value selected in each sub-spectrum, used to determine the most informative wavelength.
feature_wavelengths: the indices of the characteristic wavelengths selected at each location.
1.3 results:
The 190 samples were divided by the SPXY sample partitioning method at a ratio of 4:1 into 151 mutton data samples as the calibration set and 39 mutton data samples as the validation set. Statistics such as the mean, maximum, minimum, and standard deviation of the component content for the calibration set and the validation set are shown in Table 4.
Table 4. Statistics of the sample calibration set and validation set
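The SPXY partitioning mentioned above can be sketched in Python as follows. The sketch assumes the commonly published form of SPXY, a Kennard-Stone style selection whose distance is the sum of the spectral distance and the reference-value distance, each divided by its maximum. The function name `spxy_split` and the toy data are hypothetical, and the calibration set size is supplied by the caller.

```python
import math


def spxy_split(X, y, n_cal):
    """Split sample indices into calibration and validation lists (SPXY)."""
    n = len(X)
    if len(y) != n or not 2 <= n_cal <= n:
        raise ValueError("need matching X and y and 2 <= n_cal <= len(X)")
    pairs = [(p, q) for p in range(n) for q in range(p + 1, n)]
    dx = {pq: math.dist(X[pq[0]], X[pq[1]]) for pq in pairs}
    dy = {pq: abs(y[pq[0]] - y[pq[1]]) for pq in pairs}
    max_dx = max(dx.values()) or 1.0
    max_dy = max(dy.values()) or 1.0

    def dist(p, q):
        key = (p, q) if p < q else (q, p)
        return dx[key] / max_dx + dy[key] / max_dy

    # Start from the two samples that are farthest apart.
    first, second = max(pairs, key=lambda pq: dist(*pq))
    cal = [first, second]
    remaining = [i for i in range(n) if i not in cal]
    while len(cal) < n_cal:
        # Add the sample whose nearest calibration sample is farthest away.
        nxt = max(remaining, key=lambda i: min(dist(i, c) for c in cal))
        cal.append(nxt)
        remaining.remove(nxt)
    return cal, remaining


# Hypothetical one-wavelength spectra and reference values, split 4:1.
spectra = [[0.0], [1.0], [2.5], [4.0], [10.0]]
fat = [0.0, 1.0, 2.5, 4.0, 10.0]
cal_idx, val_idx = spxy_split(spectra, fat, 4)
print(cal_idx, val_idx)
```

The all-pairs distance table is adequate for a few hundred samples such as the 190 used here; a larger data set would call for a matrix-based implementation.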
After standard normal transformation of the full spectrum, the continuous projection algorithm (SPA) was used to extract characteristic wavelengths, and partial least squares modeling was then performed (SPA-SNV-PLS). The root mean square error of the calibration set, RMSEcal, is 0.3025; the root mean square error of the prediction set, RMSEp, is 0.3011, the determination coefficient $R_p^2$ is 0.9695, and the relative analysis error RPD is 3.60.
1.4 characteristic spectrum extraction:
The bands selected by SPA under SNV preprocessing are shown in Fig. 5.
Example 5: construction of MSC+SPA partial least squares model
1.1 mutton sample collection and preparation:
The samples collected in this example comprise 88 Hu sheep hind leg samples, 52 Tan sheep longissimus dorsi samples, 12 Hu sheep hind leg samples, and 38 Dupoachi sheep hind leg samples.
1.2 data processing:
A data processing workflow was built in MATLAB 2020a. The sample spectral data and the chemically measured sample values were imported, and multivariate scattering correction (MSC) preprocessing was applied to the raw near infrared spectrum data to obtain the processed spectral data. Characteristic wavelengths were then extracted with the continuous projection algorithm (SPA).
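As an illustration of the MSC step, each spectrum can be regressed on a reference spectrum and then corrected with the fitted intercept and slope. The Python sketch below assumes the mean spectrum as the reference, which is the usual choice but is not stated in this example; the function name `msc` and the spectra are hypothetical.

```python
def msc(spectra):
    """Multivariate scattering correction against the mean spectrum."""
    n_points = len(spectra[0])
    # Assumed reference: the mean of all spectra at each wavelength.
    reference = [sum(s[k] for s in spectra) / len(spectra) for k in range(n_points)]
    ref_mean = sum(reference) / n_points
    ref_ss = sum((r - ref_mean) ** 2 for r in reference)
    if ref_ss == 0.0:
        raise ValueError("the reference spectrum is flat")
    corrected = []
    for s in spectra:
        s_mean = sum(s) / n_points
        # Least-squares fit: s = intercept + slope * reference.
        slope = sum((r - ref_mean) * (a - s_mean) for r, a in zip(reference, s)) / ref_ss
        intercept = s_mean - slope * ref_mean
        corrected.append([(a - intercept) / slope for a in s])
    return corrected


# Hypothetical spectra: one base shape with different offsets and scales.
base = [0.10, 0.20, 0.40, 0.30]
spectra = [base, [1.5 * a + 0.2 for a in base], [0.5 * a - 0.05 for a in base]]
corrected = msc(spectra)
for row in corrected:
    print([round(v, 4) for v in row])
```

Because the three hypothetical spectra differ only by offset and scale, all corrected rows coincide; real spectra would retain their chemical differences after the scatter-related offset and slope are removed.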
SPA maps the original high-dimensional data to a low-dimensional space by selecting a series of projection vectors while retaining the class information of the data. SPA relies on vector projection analysis: each wavelength is projected onto the other wavelengths, the magnitudes of the projection vectors are compared, and the wavelength with the largest projection vector is taken as the candidate wavelength; the final characteristic wavelengths are then selected on the basis of the calibration model. SPA thereby selects the variable combination that contains the least redundant information and the least collinearity.
The procedure starts by randomly taking one wavelength as the initial wavelength variable. The remaining wavelength variables are projected onto the initial wavelength to obtain the corresponding projections, and the wavelength with the largest projection in this traversal is taken as a candidate wavelength. A new wavelength is then chosen as the new initial wavelength variable and the operation is repeated, so that the maximum projection of every wavelength variable with respect to the other wavelengths is obtained in turn. This projection analysis of the wavelength variables reduces the collinearity among the selected wavelengths and removes much of the redundant information, which improves the accuracy and stability of the model. The SPA calculation steps are as follows:
Let the target spectral matrix be $X$ with initial column $x_{k(0)}$; the initial band is $k(0)$ and the number of wavelength variables to be selected is $N$.
(1) Initialize variables: $n = 1$, $x_j \in X_{*,j}$, $j = 1, \dots, J$; any column (wavelength) of the matrix is selected and assigned to $x_j$.
(2) Determine the unselected variables: $S$ is the set of wavelength variables that have not yet been selected.
(3) Calculate the projection $P x_j$ of each unselected wavelength onto the initial variable wavelength.
(4) Determine the maximum wavelength projection: $k(n) = \arg\max(\lVert P x_j \rVert),\ x_j \in S$; then $x_j = P x_j$, $j = 1, \dots, J$.
(5) Loop: $n = n + 1$; when $n < N$, return to step (2) and continue.
(6) Determine the selected bands: once $n$ reaches $N$, the selected bands are output.
A linear regression model is established for the subset obtained in each iteration, and the wavelength variable combination with the smallest root mean square error is selected as the preferred band.
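Steps (2) to (4) above are given largely in words. For reference, the commonly published formulation of the algorithm (usually called the successive projections algorithm) writes them as shown below, using the same symbols $S$, $x_j$, $P x_j$, $k(n)$, $J$ and $N$. This is offered as a reference form under that assumption, not as a reproduction of the formulas in the original filing.

$$S = \{\, j : 1 \le j \le J,\ j \notin \{k(0), \dots, k(n-1)\} \,\}$$

$$P x_j = x_j - \left(x_j^{T} x_{k(n-1)}\right) x_{k(n-1)} \left(x_{k(n-1)}^{T} x_{k(n-1)}\right)^{-1}, \quad j \in S$$

$$k(n) = \arg\max_{j \in S} \lVert P x_j \rVert, \qquad x_j \leftarrow P x_j,\ j \in S$$

In this form each unselected column is projected onto the subspace orthogonal to the most recently selected column $x_{k(n-1)}$, so the column with the largest remaining norm is the one least explained by the wavelengths already chosen.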
1.3 results:
The 190 samples were divided by the SPXY sample partitioning method at a ratio of 4:1 into 151 mutton data samples as the calibration set and 39 mutton data samples as the validation set. Statistics such as the mean, maximum, minimum, and standard deviation of the component content for the calibration set and the validation set are shown in Table 5.
Table 5. Statistics of the sample calibration set and validation set
After multivariate scattering correction of the full spectrum, the continuous projection algorithm (SPA) was used to extract characteristic wavelengths, and partial least squares modeling was then performed (SPA-MSC-PLS). The root mean square error of the calibration set, RMSEcal, is 0.3215; the root mean square error of the prediction set, RMSEp, is 0.3155, the determination coefficient $R_p^2$ is 0.9699, and the relative analysis error RPD is 3.48.
1.4 characteristic spectrum extraction:
The resulting spectral band ranges are shown in Fig. 6.
The invention aims to provide a new, more efficient and accurate method that, based on near infrared spectroscopy, extracts characteristic wavelengths after different preprocessing steps so as to improve the accuracy of the near infrared prediction model for the crude fat content of mutton. Mutton leg muscle is collected and scanned by near infrared spectroscopy, and the raw near infrared spectrum data are preprocessed in different ways. Characteristic wavelengths are then extracted from the differently preprocessed data with the continuous projection algorithm (SPA), and partial least squares models are established.
1. For the model without preprocessing, the prediction capability indexes are an RMSE of 0.7641, an $R^2$ of 0.6321, and an RPD of 1.26.
2. For the model with SNV preprocessing, the prediction capability indexes are an RMSE of 0.7651, an $R^2$ of 0.6421, and an RPD of 1.42.
3. For the model with MSC preprocessing, the prediction capability indexes are an RMSE of 0.180, an $R^2$ of 0.463, and an RPD of 1.444.
4. For the model with SNV preprocessing combined with the SPA algorithm, the prediction capability indexes are an RMSE of 0.3011, an $R^2$ of 0.9695, and an RPD of 3.60.
5. For the model with MSC preprocessing combined with the SPA algorithm, the prediction capability indexes are an RMSE of 0.3155, an $R^2$ of 0.9699, and an RPD of 3.48.
6. Among the five modeling approaches, the fourth and fifth, which combine a preprocessing method (SNV or MSC) with the continuous projection algorithm, give the best prediction capability. After cross-validation, the root mean square errors (RMSE) of the two models are 0.3011 and 0.3155, the determination coefficients $R^2$ are 0.9695 and 0.9699, and the relative analysis errors (RPD) are 3.60 and 3.48, respectively.
Taken together, the five examples above show that the models combining one of the two preprocessing methods (SNV, MSC) with the continuous projection algorithm (SPA) have the best prediction capability. After cross-validation, the root mean square errors (RMSE) of the models are 0.3011 and 0.3155, the coefficients of determination $R^2$ are 0.9695 and 0.9699, and the relative analysis errors RPD (ratio of performance to deviation) are 3.60 and 3.48.
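One way to turn this comparison into a selection rule is to check each evaluation index against a set value within a tolerance. The Python sketch below illustrates the idea with the indexes reported for the two SPA-based models. The set values and tolerances are hypothetical, since none are stated in the text, and the function name `meets_set_condition` is likewise an assumption.

```python
def meets_set_condition(indexes, set_values, tolerances):
    """True when every index lies within its tolerance of the set value."""
    return all(
        abs(indexes[name] - set_values[name]) <= tolerances[name]
        for name in ("rmse", "r2", "rpd")
    )


# Prediction indexes reported for the two SPA-based models.
models = {
    "SPA-SNV-PLS": {"rmse": 0.3011, "r2": 0.9695, "rpd": 3.60},
    "SPA-MSC-PLS": {"rmse": 0.3155, "r2": 0.9699, "rpd": 3.48},
}
# Hypothetical set values and tolerances, chosen only for illustration.
set_values = {"rmse": 0.30, "r2": 0.97, "rpd": 3.50}
tolerances = {"rmse": 0.01, "r2": 0.01, "rpd": 0.15}

accepted = [
    name for name, idx in models.items()
    if meets_set_condition(idx, set_values, tolerances)
]
print(accepted)
```

With these illustrative thresholds only the SNV-based model passes, because its RMSE is closer to the set value; different set values would change the outcome.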
Near infrared spectroscopy is a nondestructive technique for analyzing materials and substances. With the rapid development and spread of metrology and computer science, spectral detection instruments suited to many fields have been developed and put to use, and the advantages of near infrared spectroscopy have become increasingly prominent, mainly in the following respects:
Nondestructive: compared with traditional physical or chemical detection methods, near infrared spectroscopy is a nondestructive analysis method that does not need to destroy or change the shape or structure of the sample, which reduces damage to and loss of the sample.
Fast and efficient: near infrared spectroscopy acquires the spectral data of a sample in a short time. Compared with some of the more time-consuming analysis procedures in traditional methods, it saves time and increases analysis efficiency.
Multivariate analysis: near infrared spectroscopy provides several kinds of information about a sample. Different chemical bonds and functional groups show specific absorption and scattering characteristics within the spectral range, so information about the composition, structure, and properties of the sample can be obtained by analyzing these characteristics. This capacity for multivariate analysis gives near infrared spectroscopy a distinct advantage in the analysis of complex samples.
Non-contact: near infrared spectroscopy can observe and analyze a sample through a remote sensor or probe without direct contact with the sample surface. This is an advantage for sensitive, liquid, or high-temperature samples and avoids contamination of or damage to the sample.
Low cost and no pollution: the determination does not require large amounts of chemical reagents, so the analysis cost is greatly reduced; the method is environmentally friendly, causes almost no pollution, and is a green analysis technique.
Diverse applications: near infrared spectroscopy is widely used in many fields, including quality inspection of food and agricultural products, quality control of medicines, chemical analysis, life science research, and environmental monitoring.
Because near infrared spectroscopy adapts flexibly to the sample, it allows on-line, real-time analysis, which makes it valuable in many application scenarios. To reduce or eliminate the influence of non-target factors on the spectrum and to purify the spectral information, the raw spectrum must be preprocessed and characteristic wavelengths extracted with suitable algorithms. The continuous projection algorithm (SPA) is a method for feature selection and data dimension reduction. SPA maps the original high-dimensional data to a low-dimensional space by selecting a series of projection vectors while retaining the class information of the data. SPA relies on vector projection analysis: each wavelength is projected onto the other wavelengths, the magnitudes of the projection vectors are compared, and the wavelength with the largest projection vector is taken as the candidate wavelength; the final characteristic wavelengths are then selected on the basis of the calibration model. SPA thereby selects the variable combination that contains the least redundant information and the least collinearity. Applying the continuous projection algorithm finds, among the feature vectors, the subset with the highest consistency and correlation, removes most of the redundant information, reduces the complexity and training time of the model, effectively reduces the dimensionality of the spectral information, and lowers the risk of overfitting.
The invention applies two different preprocessing methods (SNV and MSC) combined with the continuous projection algorithm (SPA) to extract characteristic wavelengths and establish partial least squares models of mutton crude fat content. After cross-validation, the root mean square errors (RMSE) of the models are 0.3011 and 0.3155, the determination coefficients $R^2$ are 0.9695 and 0.9699, and the relative analysis errors (RPD) are 3.60 and 3.48. Compared with full-spectrum partial least squares modeling (FS-PLS), full-spectrum standard normal transformation partial least squares modeling (FS-SNV-PLS), and full-spectrum multivariate scattering correction partial least squares modeling (FS-MSC-PLS), these models perform best; the detection performance and prediction capability of the near infrared model for crude fat content in mutton are greatly improved and reach a practical level.
The invention preprocesses the spectral data with two methods, SNV and MSC, and improves model performance with the SPA algorithm, so that the crude fat content of mutton can be detected rapidly and accurately with simple operation and no pollution. In addition, the invention provides a new technical solution for establishing near infrared models of other nutritional components in mutton.
Example 6:
To carry out the method of Embodiment 1 above and achieve the corresponding functions and technical effects, a mutton crude fat content determination system is provided below, comprising:
A raw data acquisition module, configured to acquire raw near infrared spectrum data and measured crude fat content values of a plurality of mutton samples.
A preprocessing module, configured to preprocess the raw near infrared spectrum data to obtain processed spectral data.
A characteristic wavelength extraction module, configured to extract characteristic wavelengths from the processed spectral data by using the continuous projection algorithm to obtain characteristic wavelength data.
A final crude fat content prediction model determining module, configured to determine a final crude fat content prediction model from the characteristic wavelength data by using the partial least squares method.
A prediction module, configured to acquire target characteristic wavelength data of the target mutton to be tested and to input the target characteristic wavelength data into the final crude fat content prediction model to obtain a predicted crude fat content value for the target mutton.
In this specification, the embodiments are described progressively; each embodiment focuses on its differences from the others, and for identical or similar parts the embodiments may be referred to one another. Because the system disclosed in the embodiment corresponds to the method disclosed in the embodiment, its description is relatively brief, and the relevant points can be found in the description of the method.
Specific examples are used herein to explain the principles and embodiments of the invention; the description of the above embodiments is intended only to help in understanding the method of the invention and its core ideas. Modifications made by those of ordinary skill in the art in light of these teachings also fall within the scope of the invention. In view of the foregoing, the contents of this specification should not be construed as limiting the invention.
Claims (8)
1. A method for determining the crude fat content of mutton, comprising:
acquiring original near infrared spectrum data and crude fat content actual measurement values of a plurality of mutton samples;
preprocessing the original near infrared spectrum data to obtain processed spectrum data;
extracting characteristic wavelengths of the processing spectrum data by using a continuous projection algorithm to obtain characteristic wavelength data;
determining a final crude fat content prediction model according to the characteristic wavelength data by using a partial least square method;
and obtaining target characteristic wavelength data of the target mutton to be detected, and inputting the target characteristic wavelength data into the final crude fat content prediction model to obtain a crude fat content prediction value of the target mutton to be detected.
2. The method for determining the content of the crude fat in mutton according to claim 1, wherein the method for preprocessing the original near infrared spectrum data to obtain processed spectrum data specifically comprises the following steps:
preprocessing the original near infrared spectrum data by using a standard normal transformation algorithm to obtain standard normal transformation spectrum data;
preprocessing the original near infrared spectrum data by utilizing a multi-element scattering correction algorithm to obtain multi-element scattering correction spectrum data; the standard normal transformation spectral data or the multivariate scattering correction spectral data is the processing spectral data.
3. The method for determining the content of the crude fat in mutton according to claim 2, wherein the characteristic wavelength extraction is performed on the processed spectrum data by using a continuous projection algorithm to obtain characteristic wavelength data, and the method specifically comprises the following steps:
extracting characteristic wavelengths of the standard normal transformation spectrum data by using a continuous projection algorithm to obtain standard normal transformation characteristic wavelengths;
extracting characteristic wavelengths of the multi-element scattering correction spectrum data by using a continuous projection algorithm to obtain multi-element scattering correction characteristic wavelengths; the standard normal transformation characteristic wavelength and the multivariate scattering correction characteristic wavelength constitute the characteristic wavelength data.
4. A method for determining the crude fat content of mutton as claimed in claim 3, wherein the determining the final crude fat content prediction model from the characteristic wavelength data by using a partial least square method comprises:
taking the standard normal transformation characteristic wavelength as input, taking the actual measurement value of the crude fat content as output, and modeling by using a partial least square method to obtain a first crude fat content prediction model;
taking the multi-element scattering correction characteristic wavelength as input, taking the actual measurement value of the crude fat content as output, and modeling by using a partial least square method to obtain a second crude fat content prediction model;
and determining a final crude fat content prediction model according to the first crude fat content prediction model and the second crude fat content prediction model.
5. A method for determining the crude fat content of mutton as claimed in claim 4, wherein determining a final crude fat content prediction model based on the first crude fat content prediction model and the second crude fat content prediction model comprises:
according to the evaluation index of the first crude fat content prediction model and the evaluation index of the second crude fat content prediction model, determining a crude fat content prediction model with the prediction capability index meeting a set condition as a final crude fat content prediction model; the evaluation index includes root mean square error, decision coefficient, and relative analysis error.
6. The method of claim 4, wherein the setting condition is that a difference between the root mean square error and the root mean square error setting is within a first threshold range, a difference between the determination coefficient and the determination coefficient setting is within a second threshold range, and a difference between the relative analysis error and the relative analysis error setting is within a third threshold range.
7. The method of claim 1, wherein the raw near infrared spectrum data is obtained by scanning a mutton sample with a near infrared analyzer.
8. A mutton fat content determination system, comprising:
the raw data acquisition module is used for acquiring raw near infrared spectrum data and actual measurement values of crude fat content of a plurality of mutton samples;
the preprocessing module is used for preprocessing the original near infrared spectrum data to obtain processing spectrum data;
the characteristic wavelength extraction module is used for extracting characteristic wavelengths of the processing spectrum data by utilizing a continuous projection algorithm to obtain characteristic wavelength data;
the final crude fat content prediction model determining module is used for determining a final crude fat content prediction model according to the characteristic wavelength data by using a partial least square method;
the predicting module is used for acquiring target characteristic wavelength data of the target mutton to be detected, and inputting the target characteristic wavelength data into the final crude fat content predicting model to obtain a crude fat content predicting value of the target mutton to be detected.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410068013.6A CN117871459A (en) | 2024-01-17 | 2024-01-17 | Mutton crude fat content determination method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117871459A true CN117871459A (en) | 2024-04-12 |
Family
ID=90586387
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410068013.6A Pending CN117871459A (en) | 2024-01-17 | 2024-01-17 | Mutton crude fat content determination method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117871459A (en) |
-
2024
- 2024-01-17 CN CN202410068013.6A patent/CN117871459A/en active Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||