CN117871459A - Mutton crude fat content determination method and system - Google Patents


Info

Publication number
CN117871459A
Authority
CN
China
Prior art keywords
crude fat
fat content
data
mutton
characteristic wavelength
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410068013.6A
Other languages
Chinese (zh)
Inventor
年芳
尹成诚
康景
李飞
唐德富
马德全
马自军
彭桢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Gansu Agricultural University
Original Assignee
Gansu Agricultural University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Gansu Agricultural University
Priority to CN202410068013.6A
Publication of CN117871459A


Landscapes

  • Investigating Or Analysing Materials By Optical Means (AREA)

Abstract

The invention discloses a method and a system for determining the crude fat content of mutton, and relates to the technical field of meat quality research. The method comprises: obtaining original near-infrared spectral data and measured crude fat content values of a plurality of mutton samples; preprocessing the original near-infrared spectral data to obtain processed spectral data; extracting characteristic wavelengths from the processed spectral data with the successive projections algorithm (SPA) to obtain characteristic wavelength data; determining a final crude fat content prediction model from the characteristic wavelength data by partial least squares (PLS); and obtaining target characteristic wavelength data of the target mutton to be tested and inputting them into the final crude fat content prediction model to obtain a predicted crude fat content value for the target mutton. Because the model is built on characteristic wavelength data extracted with SPA, the invention improves the prediction accuracy of the crude fat content of mutton, reduces labor and time costs, and avoids environmental pollution.

Description

Mutton crude fat content determination method and system
Technical Field
The invention relates to the technical field of meat quality research, in particular to a method and a system for determining the content of mutton crude fat.
Background
China is a major producer and consumer of mutton. Mutton accounts for about 5.9 percent of meat consumer products, and data published in 2022 show that China's mutton accounts for one third of the world total, with import and export volumes up 2.5 percent year on year. Enterprises are encouraged to strengthen technical research and development and to improve the quality and efficiency of livestock and poultry farming; supervision of enterprise registration, inspection and quarantine, product quality, and food safety in the mutton industry is being increased; and the relevant quality and food safety standards are being formulated and improved to ensure the quality and safety of mutton products. The nutritional value of mutton is usually measured by wet chemical methods, which are costly, time-consuming, and labor-intensive, pollute the environment, and are prone to large accidental errors.
Disclosure of Invention
The invention aims to provide a method and a system for determining the content of mutton crude fat, which can improve the prediction accuracy of the mutton crude fat content, reduce the labor and time cost and avoid environmental pollution.
In order to achieve the above object, the present invention provides the following solutions:
The invention provides a method for determining the crude fat content of mutton, comprising:
obtaining original near-infrared spectral data and measured crude fat content values of a plurality of mutton samples;
preprocessing the original near-infrared spectral data to obtain processed spectral data;
extracting characteristic wavelengths from the processed spectral data with the successive projections algorithm (SPA) to obtain characteristic wavelength data;
determining a final crude fat content prediction model from the characteristic wavelength data by partial least squares (PLS); and
obtaining target characteristic wavelength data of the target mutton to be tested, and inputting the target characteristic wavelength data into the final crude fat content prediction model to obtain a predicted crude fat content value for the target mutton.
The invention also provides a system for determining the crude fat content of mutton, comprising:
a raw data acquisition module for acquiring original near-infrared spectral data and measured crude fat content values of a plurality of mutton samples;
a preprocessing module for preprocessing the original near-infrared spectral data to obtain processed spectral data;
a characteristic wavelength extraction module for extracting characteristic wavelengths from the processed spectral data with the successive projections algorithm (SPA) to obtain characteristic wavelength data;
a final crude fat content prediction model determination module for determining a final crude fat content prediction model from the characteristic wavelength data by partial least squares (PLS); and
a prediction module for acquiring target characteristic wavelength data of the target mutton to be tested and inputting the target characteristic wavelength data into the final crude fat content prediction model to obtain a predicted crude fat content value for the target mutton.
According to the specific embodiments provided by the invention, the invention discloses the following technical effects. The method and system first obtain original near-infrared spectral data and measured crude fat content values of a plurality of mutton samples; preprocess the original near-infrared spectral data to obtain processed spectral data; extract characteristic wavelengths from the processed spectral data with the successive projections algorithm (SPA) to obtain characteristic wavelength data; determine a final crude fat content prediction model from the characteristic wavelength data by partial least squares (PLS); and obtain target characteristic wavelength data of the target mutton to be tested and input them into the final crude fat content prediction model to obtain a predicted crude fat content value. Because the model is built on characteristic wavelength data extracted with SPA, the prediction accuracy of the crude fat content of mutton is improved, labor and time costs are reduced, and environmental pollution is avoided.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the drawings that are needed in the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a method for determining the content of mutton crude fat;
FIG. 2 is an original near infrared spectrum of a mutton sample provided in example 1 of the present invention;
FIG. 3 is a spectrum chart of standard normal transformation data of mutton samples provided in example 2 of the present invention;
FIG. 4 is a spectrum chart of the multivariate scattering correction data of the mutton sample provided in example 3 of the present invention;
FIG. 5 is a spectrum of the standard normal transformation characteristic wavelength of the mutton sample provided in example 4 of the present invention;
FIG. 6 is a spectrum of characteristic wavelengths of the multi-component scattering correction of a mutton sample provided in example 5 of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention aims to provide a method and a system for determining the content of mutton crude fat, which are used for predicting the content of mutton crude fat by modeling characteristic wavelength data obtained by extracting characteristic wavelengths through a continuous projection algorithm, so that the prediction accuracy of the mutton crude fat content is improved, the labor and time cost is reduced, and the environmental pollution is avoided.
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to the appended drawings and appended detailed description.
As shown in fig. 1, the embodiment provides a method for determining the content of mutton crude fat, which includes:
S1: obtaining original near-infrared spectral data and measured crude fat content values of a plurality of mutton samples.
S2: preprocessing the original near-infrared spectral data to obtain processed spectral data.
S3: extracting characteristic wavelengths from the processed spectral data with the successive projections algorithm (SPA) to obtain characteristic wavelength data.
S4: determining a final crude fat content prediction model from the characteristic wavelength data by partial least squares (PLS).
S5: obtaining target characteristic wavelength data of the target mutton to be tested, and inputting the target characteristic wavelength data into the final crude fat content prediction model to obtain a predicted crude fat content value for the target mutton.
The samples collected in this example comprise 88 Hu sheep hind-leg samples, 52 Tan sheep longissimus dorsi samples, 12 Hu sheep (Hu sheep) hind-leg samples, and 38 Dupoachi sheep hind-leg samples, 190 samples in total.
Fascia and fat are removed from the muscle, the meat is minced to a paste with a meat grinder, and the paste is packed into a sample cup to a thickness of 1 cm so that it is compact and free of gaps. The original near-infrared spectral data in step S1 are obtained by scanning the mutton samples with a near-infrared analyzer. In this embodiment, the spectra are collected with a FoodScan 2 near-infrared analyzer (FOSS) over a wavelength range of 400-1100 nm at a data interval of 0.5 nm. Each sample is loaded into the cup 3 times and each loading is scanned 3 times, and the average spectrum is then taken. The operating environment is kept at a room temperature of 20-25 °C.
The content of crude fat (EE) in the mutton sample was determined according to the method GB 5009.6-2016.
In step S2, preprocessing the original near-infrared spectral data to obtain processed spectral data specifically includes: preprocessing the original near-infrared spectral data with the standard normal variate (SNV) transformation to obtain SNV-transformed spectral data; and preprocessing the original near-infrared spectral data with multiplicative scatter correction (MSC) to obtain MSC-corrected spectral data. Either the SNV-transformed spectral data or the MSC-corrected spectral data serve as the processed spectral data.
Specifically, the original near-infrared spectral data of the mutton samples are preprocessed separately by the standard normal variate transformation (SNV) and by multiplicative scatter correction (MSC). This prepares the preprocessed spectral data for characteristic wavelength extraction with the successive projections algorithm and for partial least squares modeling, and makes the subsequent models easier to compare.
The standard normal variate (SNV) transformation corrects spectral errors caused by scattering between samples. In essence, SNV subtracts the mean of a spectrum from the original spectrum and divides the result by the standard deviation of that spectrum, so that each whole spectrum is standardized. The formula is:

$$x_{SNV} = \frac{x_k - \bar{x}}{\sqrt{\dfrac{\sum_{k=1}^{m} (x_k - \bar{x})^2}{m - 1}}}$$

where $x_{SNV}$ is the spectral value after SNV preprocessing, $m$ is the number of wavelength points, $x_k$ is the spectral value at the k-th wavelength point of the original near-infrared spectrum, and $\bar{x}$ is the mean of the full spectrum.
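The SNV step described above (subtract the spectrum mean, divide by the spectrum standard deviation) can be sketched in a few lines. This is only an illustration in plain Python, not the MATLAB workflow used in the examples; the function name `snv` and the sample values are hypothetical, and the sample standard deviation (divisor m - 1) is an assumption.

```python
from statistics import mean, stdev

def snv(spectrum):
    """Standard normal variate transform of one spectrum (a list of spectral values)."""
    centre = mean(spectrum)
    spread = stdev(spectrum)  # sample standard deviation, divisor m - 1 (assumption)
    if spread == 0:
        raise ValueError("flat spectrum: standard deviation is zero")
    return [(value - centre) / spread for value in spectrum]

# Hypothetical five-point spectrum, for illustration only.
raw = [0.52, 0.55, 0.61, 0.58, 0.54]
corrected = snv(raw)
```

Because each spectrum is scaled by its own mean and standard deviation, a spectrum that differs from `raw` only by a constant offset and a positive gain gives the same SNV output.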
Multiplicative scatter correction (MSC) is calculated from the spectral matrix of a set of samples. The basic idea of the method is to assume that the scattering coefficient is the same at all wavelengths. When MSC is used, the change in the spectrum and the content of the component in the sample must satisfy a direct linear relationship. A reference spectrum (here the mean spectrum of all samples) is taken as the standard, and the near-infrared spectra of all samples are corrected against it; the correction comprises a baseline translation and an offset correction. The formulas are:

$$\bar{A}_j = \frac{1}{n} \sum_{i=1}^{n} A_{i,j}$$

$$A_i = m_i \bar{A} + b_i$$

$$A_{i(MSC)} = \frac{A_i - b_i}{m_i}$$

where $A_{i,j}$ is the element in row i and column j of the n × p calibration spectral data matrix, n is the number of samples, p is the number of wavelength points used for spectrum acquisition, and i and j index the samples and the wavelength points, respectively. $\bar{A}$ is the mean spectrum, obtained by averaging the original near-infrared spectra of all samples at each wavelength point. $A_i$ is a 1 × p matrix representing the spectral vector of a single sample, and $m_i$ and $b_i$ are the relative offset coefficient and the translation amount obtained by univariate linear regression of the near-infrared spectrum $A_i$ of each sample on the mean spectrum $\bar{A}$. $A_{i(MSC)}$ is the spectrum after multiplicative scatter correction, i.e. the MSC-corrected spectral data.
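As a hedged illustration of the MSC procedure described above, the sketch below regresses each spectrum on the mean spectrum and then removes the fitted offset and slope. It is plain Python rather than the MATLAB workflow of the examples; the function name `msc` and the synthetic spectra are hypothetical.

```python
def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum.

    spectra: list of n spectra, each a list of p spectral values.
    Returns (corrected_spectra, mean_spectrum).
    """
    n, p = len(spectra), len(spectra[0])
    mean_spec = [sum(s[j] for s in spectra) / n for j in range(p)]
    ref_mean = sum(mean_spec) / p
    ref_ss = sum((r - ref_mean) ** 2 for r in mean_spec)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / p
        # Univariate least-squares fit: s = m * mean_spec + b
        m = sum((r - ref_mean) * (x - s_mean) for r, x in zip(mean_spec, s)) / ref_ss
        b = s_mean - m * ref_mean
        corrected.append([(x - b) / m for x in s])
    return corrected, mean_spec

# Synthetic data: one base spectrum distorted by different gains and offsets.
base = [0.50, 0.62, 0.80, 0.71, 0.55]
spectra = [[a * x + c for x in base] for a, c in ((1.0, 0.0), (1.2, 0.05), (0.8, -0.03))]
corrected, mean_spec = msc(spectra)
```

In this synthetic case every spectrum is an exact linear function of the mean spectrum, so all corrected spectra collapse onto the mean spectrum. Real spectra would keep the chemical differences that the linear fit does not explain.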
In step S3, extracting characteristic wavelengths from the processed spectral data with the successive projections algorithm (SPA) to obtain characteristic wavelength data specifically includes:
extracting characteristic wavelengths from the SNV-transformed spectral data with SPA to obtain SNV characteristic wavelengths; and
extracting characteristic wavelengths from the MSC-corrected spectral data with SPA to obtain MSC characteristic wavelengths. The SNV characteristic wavelengths and the MSC characteristic wavelengths constitute the characteristic wavelength data.
In step S4, determining a final crude fat content prediction model from the characteristic wavelength data by partial least squares (PLS) specifically includes:
taking the SNV characteristic wavelengths as input and the measured crude fat content values as output, and modeling by partial least squares to obtain a first crude fat content prediction model;
taking the MSC characteristic wavelengths as input and the measured crude fat content values as output, and modeling by partial least squares to obtain a second crude fat content prediction model; and
determining the final crude fat content prediction model from the first and second crude fat content prediction models, which specifically includes:
according to the evaluation indices of the first crude fat content prediction model and those of the second crude fat content prediction model, determining the model whose prediction capability indices meet a set condition as the final crude fat content prediction model. The evaluation indices comprise the root mean square error (Root Mean Square Error, RMSE), the coefficient of determination R², and the relative analysis error (Relative Percent Deviation, RPD). The set condition is that the difference between the root mean square error and its set value lies within a first threshold range, the difference between the coefficient of determination and its set value lies within a second threshold range, and the difference between the relative analysis error and its set value lies within a third threshold range.
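The text names the three evaluation indices but does not give their formulas. The sketch below uses commonly applied definitions as an assumption: RMSE as the root of the mean squared residual, R² as one minus the ratio of residual to total sum of squares, and RPD as the standard deviation of the measured values divided by the RMSE. The function name `evaluate` and the numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def evaluate(measured, predicted):
    """Return (RMSE, R2, RPD) for paired measured and predicted values."""
    centre = mean(measured)
    residual_ss = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    total_ss = sum((y - centre) ** 2 for y in measured)
    rmse = sqrt(residual_ss / len(measured))
    r2 = 1 - residual_ss / total_ss
    rpd = stdev(measured) / rmse  # assumed definition: SD of reference values / RMSE
    return rmse, r2, rpd

# Hypothetical crude fat values (percent), for illustration only.
measured = [2.0, 3.0, 4.0, 5.0]
predicted = [2.1, 2.9, 4.2, 4.8]
rmse, r2, rpd = evaluate(measured, predicted)
```

Under these definitions a lower RMSE and higher R² and RPD indicate a better model; the set values and threshold ranges of the set condition are not specified in the text and are therefore left out of the sketch.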
In addition to the two modeling modes above, the invention further includes:
taking the original near-infrared spectral data as input and the measured crude fat content values as output, and modeling by partial least squares to obtain a third crude fat content prediction model;
taking the SNV-transformed spectral data as input and the measured crude fat content values as output, and modeling by partial least squares to obtain a fourth crude fat content prediction model; and
taking the MSC-corrected spectral data as input and the measured crude fat content values as output, and modeling by partial least squares to obtain a fifth crude fat content prediction model.
The 190 samples were partitioned with the SPXY sample partitioning method at a ratio of 4:1, giving 151 mutton samples as the calibration set and 39 mutton samples as the validation set.
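The text does not spell out the SPXY procedure. The sketch below follows the usual description of SPXY as an assumption: a Kennard-Stone style selection on a joint distance that adds the normalized spectral distance and the normalized difference in reference values. The function name `spxy_split` and the toy data are hypothetical, and the calibration set size is passed in explicitly.

```python
from math import dist

def spxy_split(X, y, n_cal):
    """Split sample indices into (calibration, validation) with an SPXY-style rule."""
    n = len(X)
    dx = [[dist(X[i], X[j]) for j in range(n)] for i in range(n)]
    dy = [[abs(y[i] - y[j]) for j in range(n)] for i in range(n)]
    max_dx = max(max(row) for row in dx)
    max_dy = max(max(row) for row in dy)
    # Joint distance: normalised spectral distance plus normalised reference distance.
    d = [[dx[i][j] / max_dx + dy[i][j] / max_dy for j in range(n)] for i in range(n)]
    # Start from the two samples that are farthest apart.
    pairs = ((i, j) for i in range(n) for j in range(i + 1, n))
    first, second = max(pairs, key=lambda ij: d[ij[0]][ij[1]])
    selected = [first, second]
    remaining = [i for i in range(n) if i not in selected]
    # Repeatedly add the sample whose nearest selected neighbour is farthest away.
    while len(selected) < n_cal:
        nxt = max(remaining, key=lambda i: min(d[i][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected, remaining

# Toy data: five samples with two spectral variables each, split 4:1.
X = [[0.0, 0.0], [1.0, 0.0], [0.1, 0.0], [0.5, 0.5], [0.9, 0.1]]
y = [1.0, 5.0, 1.2, 3.0, 4.8]
cal, val = spxy_split(X, y, n_cal=4)
```

Samples that sit close to an already selected sample in both spectrum and reference value are picked last, so they end up in the validation set.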
Near-infrared prediction models are constructed with SNV and MSC as the respective preprocessing methods. Each model is built from the near-infrared spectra of the calibration-set samples and their chemical values, externally validated with the validation-set samples, and the predicted values are compared by correlation with the known sample contents. The evaluation indices of the two models (root mean square error, coefficient of determination, and relative analysis error) are compared at the same time, and the better preprocessing method is selected for modeling.
Characteristic wavelengths are then extracted with the successive projections algorithm (SPA) from the spectral data preprocessed by SNV and by MSC, respectively, and a partial least squares model is established on the characteristic wavelengths in order to predict mutton samples of unknown content. First, a partial least squares model is built from the near-infrared spectra of the calibration-set samples and their chemical values (the crude fat content of the samples); the input and output of the model are the near-infrared spectra and the contents of the known samples, respectively. The spectral data of the validation-set samples are then fed into the prediction model to obtain their predicted crude fat contents, and the predicted values are compared by correlation with the known crude fat contents of the validation-set samples. The evaluation indices of the two models (root mean square error, coefficient of determination, and relative analysis error) are compared to find the optimal model.
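To make the calibrate-then-predict flow concrete, here is a minimal single-response PLS (PLS1) sketch in plain Python using NIPALS-style deflation. It is not the MATLAB implementation used in the examples; the function names `pls1_fit` and `pls1_predict`, the toy data, and the number of latent variables are all assumptions.

```python
def pls1_fit(X, y, n_components):
    """Fit a PLS1 model; returns (x_mean, y_mean, [(weights, loadings, q), ...])."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centred spectra
    f = [v - y_mean for v in y]                                # centred reference values
    components = []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        w_norm = sum(v * v for v in w) ** 0.5
        w = [v / w_norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]      # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate spectra and reference values before the next component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components

def pls1_predict(model, x):
    """Predict the response for one new spectrum x."""
    x_mean, y_mean, components = model
    e = [v - m for v, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = sum(a * b for a, b in zip(e, w))
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, load)]
    return y_hat

# Toy calibration data in which y = 2*x1 + 3*x2 + 1 exactly.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0]]
y = [9.0, 8.0, 19.0, 15.0]
model = pls1_fit(X, y, n_components=2)
prediction = pls1_predict(model, [1.5, 2.5])
```

In practice the calibration-set spectra (or the selected characteristic wavelengths) would take the place of `X`, the measured crude fat contents the place of `y`, and the validation-set spectra would be passed to `pls1_predict`. The number of components would normally be chosen by validation rather than fixed.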
Five specific embodiments are provided below, respectively, for further details of the establishment and evaluation of the above five prediction models.
Example 1: construction of full spectrum partial least square model
1.1 mutton sample collection and preparation:
The samples collected in this example comprise 88 Hu sheep hind-leg samples, 52 Tan sheep longissimus dorsi samples, 12 Hu sheep (Hu sheep) hind-leg samples, and 38 Dupoachi sheep hind-leg samples.
1.2 data processing:
A data processing workflow is built in MATLAB 2020a: the sample spectral data and the measured chemical values of the samples are imported, partial least squares modeling is performed on the original near-infrared spectral data without any preprocessing, and the resulting third crude fat content prediction model is evaluated against the evaluation criteria.
1.3 results:
The 190 samples were partitioned with the SPXY sample partitioning method at a ratio of 4:1, giving 151 mutton data samples as the calibration set and 39 mutton data samples as the validation set. The mean, maximum, and minimum component content, the standard deviation, and other statistics of the calibration and validation sets are shown in Table 1.
Table 1 sample correction set and validation set data statistics table
Partial least squares modeling is performed directly on the full-spectrum wavelengths (FS-PLS). The root mean square error of the calibration set (RMSEcal) is 0.7627 and its coefficient of determination Rc² is 0.6445; the root mean square error of the prediction set (RMSEp) is 0.7641, its coefficient of determination Rp² is 0.7641, and the relative analysis error (RPD) is 1.26. FIG. 2 shows the original near-infrared spectra of the 190 mutton samples of different breeds.
Example 2: construction of partial least squares model for SNV pretreatment
1.1 mutton sample collection and preparation:
The samples collected in this example comprise 88 Hu sheep hind-leg samples, 52 Tan sheep longissimus dorsi samples, 12 Hu sheep (Hu sheep) hind-leg samples, and 38 Dupoachi sheep hind-leg samples.
1.2 data processing:
A data processing workflow is built in MATLAB 2020a: the sample spectral data and the measured chemical values of the samples are imported, the original near-infrared spectral data are preprocessed with the standard normal variate (SNV) transformation, partial least squares modeling is performed, and the resulting fourth crude fat content prediction model is evaluated against the evaluation criteria.
1.3 results:
The 190 samples were partitioned with the SPXY sample partitioning method at a ratio of 4:1, giving 151 mutton data samples as the calibration set and 39 mutton data samples as the validation set. The mean, maximum, and minimum component content, the standard deviation, and other statistics of the calibration and validation sets are shown in Table 2.
Table 2 sample correction set and validation set data statistics table
The full spectral band is preprocessed with the standard normal variate (SNV) transformation and partial least squares modeling is then performed (FS-SNV-PLS). The root mean square error of the calibration set (RMSEcal) is 0.7602 and its coefficient of determination Rc² is 0.6514; the root mean square error of the prediction set (RMSEp) is 0.7651, its coefficient of determination Rp² is 0.6421, and the relative analysis error (RPD) is 1.467. The SNV-transformed spectral data, i.e. the spectra after SNV preprocessing, are shown in FIG. 3, where each line represents the SNV-treated spectrum of a different sample.
Example 3: construction of MSC pretreatment partial least square model
1.1 mutton sample collection and preparation:
The samples collected in this example comprise 88 Hu sheep hind-leg samples, 52 Tan sheep longissimus dorsi samples, 12 Hu sheep (Hu sheep) hind-leg samples, and 38 Dupoachi sheep hind-leg samples.
1.2 data processing:
A data processing workflow is built in MATLAB 2020a: the sample spectral data and the measured chemical values of the samples are imported, the original near-infrared spectral data are preprocessed with multiplicative scatter correction (MSC) to obtain the processed spectral data, partial least squares modeling is performed, and the resulting fifth crude fat content prediction model is evaluated against the evaluation criteria.
1.3 results:
The 190 samples were partitioned with the SPXY sample partitioning method at a ratio of 4:1, giving 151 mutton data samples as the calibration set and 39 mutton data samples as the validation set. The mean, maximum, and minimum component content, the standard deviation, and other statistics of the calibration and validation sets are shown in Table 3.
Table 3 sample correction set and validation set data statistics table
The full spectral band is preprocessed with multiplicative scatter correction (MSC) and partial least squares modeling is then performed (FS-MSC-PLS). The root mean square error of the calibration set (RMSEcal) is 0.7739 and its coefficient of determination Rc² is 0.7508; the root mean square error of the prediction set (RMSEp) is 0.7821, its coefficient of determination Rp² is 0.7501, and the relative analysis error (RPD) is 1.44. The MSC-corrected spectral data, i.e. the spectra after MSC preprocessing, are shown in FIG. 4.
Example 4: construction of SNV+SPA partial least squares model
1.1 mutton sample collection and preparation:
The samples collected in this example comprise 88 Hu sheep hind-leg samples, 52 Tan sheep longissimus dorsi samples, 12 Hu sheep (Hu sheep) hind-leg samples, and 38 Dupoachi sheep hind-leg samples.
1.2 data processing:
A data processing workflow is built in MATLAB 2020a: the sample spectral data and the measured chemical values of the samples are imported, the original near-infrared spectral data are preprocessed with the standard normal variate (SNV) transformation to obtain the processed spectral data, and characteristic wavelengths are extracted with the successive projections algorithm (SPA).
SPA maps the original high-dimensional data to a low-dimensional space by selecting a series of projection vectors while preserving the class information of the data. SPA uses vector projection analysis: each wavelength is projected onto the other wavelengths, the sizes of the projection vectors are compared, and the wavelength with the largest projection vector is taken as a candidate wavelength. The final characteristic wavelengths are then selected on the basis of the calibration model. SPA thus selects the combination of variables that contains the least redundant information and the least collinearity.
The procedure starts by randomly taking one wavelength as the initial wavelength variable. The remaining wavelength variables are projected onto the initial wavelength, and the wavelength with the largest projection in this pass is taken as a candidate wavelength. A new wavelength is then chosen as the new initial wavelength variable and the operation is repeated, so that the maximum projections of all wavelength variables with respect to the other wavelengths are obtained in turn. Projection analysis of the wavelength variables reduces the collinearity of the selected wavelengths and removes much of the redundant information, which improves the accuracy and stability of the model. The calculation steps of SPA are as follows:
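Let $x_{k(0)}$ be the initial column vector of the target spectral matrix $X$, where $k(0)$ is the initial band, let $N$ be the number of wavelength variables to be selected, and let $J$ be the total number of wavelength variables.
(1) Initialize the variables: $n = 1$; each column of the matrix is assigned to $x_j$, $j = 1, \dots, J$, and any one column is chosen as the starting vector $x_{k(0)}$; the columns $x_j$ form the initial set of variables.
(2) Determine the unselected variables: $S = \{\, j : 1 \le j \le J,\ j \notin \{k(0), \dots, k(n-1)\} \,\}$, where $S$ is the set of unselected wavelength variables.
(3) Calculate the projection of each unselected wavelength with respect to the most recently selected wavelength: $P x_j = x_j - \left(x_j^{T} x_{k(n-1)}\right) x_{k(n-1)} \left(x_{k(n-1)}^{T} x_{k(n-1)}\right)^{-1}$, $j \in S$.
(4) Determine the maximum wavelength projection: $k(n) = \arg\max_{j \in S} \lVert P x_j \rVert$, and set $x_j = P x_j$ for $j \in S$.
(5) Loop: $n = n + 1$; when $n < N$, return to step (2) and continue.
(6) Determine the selected bands: when $n \ge N$, the selected bands $\{k(n) : n = 0, \dots, N-1\}$ are output.
A linear regression model is established for the subset obtained in each iteration, and the combination of wavelength variables with the smallest root mean square error is selected as the preferred band set.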
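The projection phase of the steps above can be sketched as follows. This is a hedged illustration in plain Python, not the MATLAB function used in the examples; the name `spa_select` and the toy matrix are hypothetical, and the sketch covers only the projection-based selection for one starting column, not the later choice of the subset with the smallest root mean square error.

```python
def spa_select(X, start, n_select):
    """Return indices of n_select columns chosen by successive projections.

    X: list of samples (rows), each a list of spectral values (columns).
    start: index of the initial wavelength column.
    Assumes n_select does not exceed the rank of X.
    """
    n_cols = len(X[0])
    cols = [[row[j] for row in X] for j in range(n_cols)]  # column vectors
    selected = [start]
    for _ in range(n_select - 1):
        ref = cols[selected[-1]]
        ref_norm2 = sum(v * v for v in ref)
        best, best_norm = None, -1.0
        for j in range(n_cols):
            if j in selected:
                continue
            # Remove from column j its component along the last selected column.
            coef = sum(a * b for a, b in zip(cols[j], ref)) / ref_norm2
            cols[j] = [a - coef * b for a, b in zip(cols[j], ref)]
            norm = sum(v * v for v in cols[j]) ** 0.5
            if norm > best_norm:
                best, best_norm = j, norm
        selected.append(best)
    return selected

# Toy matrix: column 1 is collinear with column 0 and should never be chosen.
X = [[1.0, 2.0, 0.0, 1.0],
     [0.0, 0.0, 3.0, 1.0],
     [0.0, 0.0, 0.0, 1.0]]
selected = spa_select(X, start=0, n_select=3)
```

Running the selection once per candidate starting column and per subset size, and then scoring each subset with a regression model on the calibration data, would correspond to the final selection step described above.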
Implementation of the SPA function
Parameter description:
spectrum: the input spectral data, a vector containing the spectral information.
n: the number of principal components used in the principal component analysis. It determines the number of largest singular values selected, i.e. the number of characteristic wavelengths.
spa_result: the SPA result at each position. In this implementation it is the norm of the largest singular value selected in each sub-spectrum and is used to determine the most informative wavelength.
feature_wavelengths: the index of the characteristic wavelength selected at each position.
1.3 results:
The 190 samples were partitioned with the SPXY sample partitioning method at a ratio of 4:1, giving 151 mutton data samples as the calibration set and 39 mutton data samples as the validation set. The mean, maximum, and minimum component content, the standard deviation, and other statistics of the calibration and validation sets are shown in Table 4.
Table 4 sample correction set and validation set data statistics table
After the standard normal variate (SNV) transformation of the full spectrum, the successive projections algorithm (SPA) is used to extract characteristic wavelengths, and partial least squares modeling is then performed (SPA-SNV-PLS). The root mean square error of the calibration set (RMSEcal) is 0.3025; the root mean square error of the prediction set (RMSEp) is 0.3011, its coefficient of determination Rp² is 0.9695, and the relative analysis error (RPD) is 3.60.
1.4 characteristic spectrum extraction:
The bands selected by SPA after SNV preprocessing are shown in FIG. 5.
Example 5: construction of MSC+SPA partial least squares model
1.1 mutton sample collection and preparation:
The samples collected in this example comprise 88 Hu sheep hind leg samples, 52 longissimus dorsi samples from Tan sheep, 12 Hu sheep (Hu sheep) hind leg samples, and 38 Dupoachi sheep hind leg samples.
1.2 data processing:
A data processing workflow was built in MATLAB 2020a. The sample spectral data and the measured chemical values of the samples were imported, and the original near infrared spectrum data were preprocessed by multi-element scattering correction (MSC) to obtain the processed spectrum data. Characteristic wavelengths were then extracted with the continuous projection algorithm (SPA).
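The MSC step of this example, together with the SNV step of the preceding example for comparison, might look like the following sketch. It is written in Python rather than the MATLAB used in the source, and it assumes the usual textbook definitions: SNV centers and scales each spectrum by its own mean and standard deviation, and MSC regresses each spectrum on a reference spectrum (by default the mean spectrum) and removes the fitted offset and slope. The function names are hypothetical.

```python
import numpy as np

def snv(spectra):
    """Standard normal transformation: row-wise centering and scaling."""
    X = np.asarray(spectra, dtype=float)
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, ddof=1, keepdims=True)
    return (X - mean) / std

def msc(spectra, reference=None):
    """Scatter correction of each row against a reference spectrum."""
    X = np.asarray(spectra, dtype=float)
    # Assumption: use the mean spectrum when no reference is supplied.
    ref = X.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(X)
    for i, row in enumerate(X):
        # Fit row ~ intercept + slope * ref, then undo both effects.
        slope, intercept = np.polyfit(ref, row, 1)
        corrected[i] = (row - intercept) / slope
    return corrected
```

When a model built on MSC-corrected spectra is later applied to new samples, the same reference spectrum (for example the calibration-set mean) would normally be passed through the `reference` argument so that calibration and prediction data are corrected consistently.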
SPA maps the original high-dimensional data to a lower-dimensional space by selecting a series of projection vectors while retaining the class information of the data. It is based on vector projection analysis: each wavelength is projected onto the other wavelengths, the sizes of the projection vectors are compared, and the wavelength with the largest projection vector is taken as a candidate wavelength. The final characteristic wavelengths are then chosen on the basis of the calibration model, so that SPA selects the variable combination with the least redundant information and the least collinearity.
The procedure starts by randomly taking one wavelength as the initial wavelength variable. The remaining wavelength variables are projected onto it, and the wavelength with the largest projection in this pass is taken as a candidate wavelength. A new wavelength is then taken as the new initial wavelength variable and the operation is repeated, so that the maximum projections of all wavelength variables with respect to the other wavelengths are obtained in turn. This projection analysis reduces the collinearity among the selected wavelengths and removes much of the redundant information, which improves the accuracy and stability of the model. The SPA calculation steps are as follows:
Let the spectral matrix be $X$ with $J$ wavelength columns, let $k(0)$ be the initial band with column vector $x_{k(0)}$, and let $N$ be the number of wavelength variables to be selected.
(1) Initialize variables: $n = 1$; select any column of the matrix as the initial wavelength, and let $x_j$ denote the $j$-th column, $j = 1, \dots, J$;
(2) Determine the unselected variables: $S = \{\, j : 1 \le j \le J,\ j \notin \{k(0), \dots, k(n-1)\} \,\}$, where $S$ is the set of unselected wavelength variables;
(3) Calculate the projection of each unselected wavelength on the subspace orthogonal to $x_{k(n-1)}$: $P x_j = x_j - \left(x_j^{T} x_{k(n-1)}\right) x_{k(n-1)} \left(x_{k(n-1)}^{T} x_{k(n-1)}\right)^{-1}$, $j \in S$;
(4) Determine the wavelength with the maximum projection: $k(n) = \arg\max_{j \in S} \lVert P x_j \rVert$; then set $x_j = P x_j$, $j \in S$;
(5) Loop: set $n = n + 1$; if $n < N$, return to step (2) and continue;
(6) Determine the selected bands: once $n$ reaches $N$, output the selected bands.
A linear regression model is established for the subset produced at each iteration, and the wavelength variable combination with the minimum root mean square error is selected as the preferred band.
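The steps above can be sketched in Python as follows. This is an illustrative reading of steps (1) to (6) and of the final selection by minimum root mean square error, not the patent's own implementation. It assumes that the projection in step (3) removes the component along the most recently selected column, and that the RMSE used for the final choice is computed on a separate validation set (the source does not say which set is used). The names `spa_chain` and `spa_select` are hypothetical.

```python
import numpy as np

def spa_chain(X, start, n_select):
    """Steps (1)-(6): return n_select column indices, beginning with `start`."""
    P = np.array(X, dtype=float)         # working copy of the spectral matrix
    selected = [start]
    for _ in range(1, n_select):
        ref = P[:, selected[-1]].copy()  # most recently selected column
        # Step (3): remove from every column its component along ref.
        P = P - np.outer(ref, ref @ P) / (ref @ ref)
        # Step (2): already selected columns are excluded from the search.
        norms = np.linalg.norm(P, axis=0)
        norms[selected] = -1.0
        # Step (4): the unselected column with the largest projection is next.
        selected.append(int(np.argmax(norms)))
    return selected

def spa_select(X_cal, y_cal, X_val, y_val, n_max):
    """Try every starting column and subset size; keep the lowest validation RMSE."""
    best_rmse, best_cols = np.inf, None
    ones_cal = np.ones(len(X_cal))
    ones_val = np.ones(len(X_val))
    for start in range(X_cal.shape[1]):
        chain = spa_chain(X_cal, start, n_max)
        for m in range(1, n_max + 1):
            cols = chain[:m]
            # Ordinary least squares with an intercept on the candidate subset.
            A = np.column_stack([ones_cal, X_cal[:, cols]])
            coef, *_ = np.linalg.lstsq(A, y_cal, rcond=None)
            pred = np.column_stack([ones_val, X_val[:, cols]]) @ coef
            rmse = float(np.sqrt(np.mean((y_val - pred) ** 2)))
            if rmse < best_rmse:
                best_rmse, best_cols = rmse, cols
    return best_rmse, best_cols
```

Trying every starting column is a brute-force choice made here for clarity; for spectra with many wavelengths it would be slow, and `n_max` must not exceed the number of linearly independent columns.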
1.3 results:
The 190 samples were divided by the SPXY sample division method, at a ratio of 4:1, into a calibration set of 151 mutton data samples and a validation set of 39 mutton data samples. The mean, maximum, minimum, and standard deviation of the component content for the calibration set and the validation set are shown in Table 5.
Table 5. Statistics of the sample calibration set and validation set
After multi-element scattering correction (MSC) of the full spectrum, characteristic wavelengths were extracted with the continuous projection algorithm (SPA) and a partial least squares model was then built (SPA-MSC-PLS). The root mean square error of the calibration set, RMSEcal, is 0.3215; the root mean square error of the prediction set, RMSEp, is 0.3155; the coefficient of determination $R_p^2$ is 0.9699; and the relative analysis error RPD is 3.48.
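For readers unfamiliar with the partial least squares step, the sketch below fits a single-response PLS model (PLS1) with the NIPALS-style deflation scheme, using only NumPy. It is an assumption-laden illustration: the source does not state which PLS variant, software routine, or number of latent variables was used, and the function name `pls1_fit` and the argument `n_components` are hypothetical.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit PLS1 and return (beta, intercept) so that y ~ X @ beta + intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean        # centered predictors and response
    W, P, q = [], [], []
    for _ in range(n_components):
        w = E.T @ f                      # weight vector: covariance with the response
        w = w / np.linalg.norm(w)
        t = E @ w                        # scores of the current latent variable
        tt = t @ t
        p = E.T @ t / tt                 # predictor loadings
        c = (f @ t) / tt                 # response loading
        E = E - np.outer(t, p)           # deflate predictors
        f = f - c * t                    # deflate response
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)
    intercept = y_mean - x_mean @ beta
    return beta, intercept
```

In the workflow described here, `X` would hold only the columns at the characteristic wavelengths chosen by SPA, and `y` the measured crude fat contents of the calibration set; predictions for new spectra are then `X_new @ beta + intercept`.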
1.4 characteristic spectrum extraction:
The resulting spectral regions are shown in Fig. 6.
The invention aims to provide a more efficient and accurate method, based on near infrared spectroscopy, that extracts characteristic wavelengths after different pretreatments in order to improve the accuracy of the near infrared prediction model for the crude fat content of mutton. Leg muscle samples of mutton were collected and scanned by near infrared spectroscopy, and the original near infrared spectrum data were subjected to different pretreatments. Characteristic wavelengths were then extracted from the differently pretreated data with the continuous projection algorithm (SPA), and partial least squares models were established.
1. For the model without pretreatment, the prediction capability indices are an RMSE of 0.7641, an $R^2$ of 0.6321, and an RPD of 1.26.
2. For the model with SNV pretreatment, the prediction capability indices are an RMSE of 0.7651, an $R^2$ of 0.6421, and an RPD of 1.42.
3. For the model with MSC pretreatment, the prediction capability indices are an RMSE of 0.180, an $R^2$ of 0.463, and an RPD of 1.444.
4. For the model with SNV pretreatment combined with the SPA algorithm, the prediction capability indices are an RMSE of 0.3011, an $R^2$ of 0.9695, and an RPD of 3.60.
5. For the model with MSC pretreatment combined with the SPA algorithm, the prediction capability indices are an RMSE of 0.3155, an $R^2$ of 0.9699, and an RPD of 3.48.
6. Of the five modeling modes, the fourth and fifth, in which SNV or MSC pretreatment is combined with the continuous projection algorithm, have the best prediction capability. After cross validation, the root mean square error RMSE of these two models is 0.3011 and 0.3155, the coefficient of determination $R^2$ is 0.9695 and 0.9699, and the relative analysis error RPD is 3.60 and 3.48, respectively.
Combining the above 5 examples: the models that combine either of the two pretreatment methods (SNV, MSC) with the continuous projection algorithm (SPA) have the best prediction capability. After cross validation, the root mean square error (RMSE) of the models is 0.3011 and 0.3155, the coefficient of determination $R^2$ is 0.9695 and 0.9699, and the relative analysis error RPD is 3.60 and 3.48, respectively.
Near infrared spectroscopy is a nondestructive technique for analyzing materials and substances. With the rapid development and spread of metrology and computer science, spectral detection instruments suited to many fields have been developed and put to use, and the advantages of near infrared spectroscopy have become increasingly prominent, mainly in the following aspects:
Nondestructive: compared with traditional physical or chemical detection methods, near infrared spectroscopy does not need to destroy or change the shape or structure of the sample, which reduces damage to and loss of the sample.
Fast and efficient: near infrared spectroscopy can acquire the spectral data of a sample in a short time. Compared with some of the more time-consuming analysis processes in conventional methods, it saves time and increases analysis efficiency.
Multivariate analysis: near infrared spectroscopy can provide several kinds of information about a sample. Different chemical bonds and functional groups show specific absorption and scattering characteristics within the spectral range, so information about the composition, structure, and properties of the sample can be obtained by analyzing these characteristics. This capacity for multivariate analysis gives near infrared spectroscopy a unique advantage in the analysis of complex samples.
Non-contact: near infrared spectroscopy can observe and analyze a sample through a remote sensor or probe without direct contact with the sample surface. This is advantageous for sensitive, liquid, or high-temperature samples and avoids contaminating or damaging the sample.
Low cost and no pollution: the measurement does not require large quantities of chemical reagents, which greatly reduces the analysis cost; it is environmentally friendly, causes almost no pollution, and is a green analytical technique.
Diverse applications: near infrared spectroscopy is widely applied in many fields. It can be used for quality detection of food and agricultural products, quality control of medicines, chemical analysis, life science research, environmental monitoring, and so on.
Because near infrared spectroscopy is highly adaptable and flexible with respect to the sample, it can support on-line and real-time analysis, which makes it valuable in many application scenarios. To reduce or eliminate the influence of non-target factors on the spectrum and to purify the spectral information, the original spectrum must be preprocessed and characteristic wavelengths must be extracted with suitable algorithms. The continuous projection algorithm (SPA) is a method for feature selection and data dimensionality reduction. SPA maps the original high-dimensional data to a lower-dimensional space by selecting a series of projection vectors while retaining the class information of the data. It is based on vector projection analysis: each wavelength is projected onto the other wavelengths, the sizes of the projection vectors are compared, and the wavelength with the largest projection vector is taken as a candidate wavelength; the final characteristic wavelengths are then chosen on the basis of the calibration model, so that SPA selects the variable combination with the least redundant information and the least collinearity. Applying the continuous projection algorithm finds, among the feature vectors, the subset with the highest consistency and correlation, removes most redundant information, reduces the complexity and training time of the model, lowers the dimensionality of the spectral information, and reduces the risk of overfitting.
The invention applies two different pretreatment methods (SNV and MSC) combined with the continuous projection algorithm (SPA) to extract characteristic wavelengths and establish partial least squares models of the crude fat content of mutton. After cross validation, the root mean square error (RMSE) of the models is 0.3011 and 0.3155, the coefficient of determination $R^2$ is 0.9695 and 0.9699, and the relative analysis error RPD is 3.60 and 3.48, respectively. Compared with full-spectrum partial least squares modeling (FS-PLS), full-spectrum partial least squares modeling with standard normal transformation (FS-SNV-PLS), and full-spectrum partial least squares modeling with multi-element scattering correction (FS-MSC-PLS), these models perform best; the detection performance and prediction capability of the near infrared model for the crude fat content of mutton are greatly improved and reach a practical level.
The invention preprocesses the spectral data with two methods, SNV and MSC, and improves model performance with the SPA algorithm, so that the crude fat content of mutton can be detected quickly and accurately, with simple operation and no pollution. In addition, the invention provides a new technical solution for establishing near infrared models of other nutritional components of mutton.
Example 6:
To carry out the method of embodiment 1 above and achieve the corresponding functions and technical effects, a mutton crude fat content determination system is provided below, including:
The raw data acquisition module is used for acquiring original near infrared spectrum data and crude fat content actual measurement values of a plurality of mutton samples.
The preprocessing module is used for preprocessing the original near infrared spectrum data to obtain processed spectrum data.
The characteristic wavelength extraction module is used for extracting characteristic wavelengths of the processed spectrum data by using a continuous projection algorithm to obtain characteristic wavelength data.
The final crude fat content prediction model determining module is used for determining a final crude fat content prediction model according to the characteristic wavelength data by using a partial least square method.
The predicting module is used for acquiring target characteristic wavelength data of the target mutton to be detected, and inputting the target characteristic wavelength data into the final crude fat content prediction model to obtain a crude fat content prediction value of the target mutton to be detected.
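One way to picture how the modules fit together at prediction time is the small sketch below. It is only an illustration of the data flow, under the assumption that the final model is linear in the selected wavelengths (as a partial least squares model is); the class name `CrudeFatPipeline` and all of its fields are hypothetical and do not come from the source.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

import numpy as np

@dataclass
class CrudeFatPipeline:
    # Preprocessing module: a function such as an SNV or MSC transform.
    preprocess: Callable[[np.ndarray], np.ndarray]
    # Output of the characteristic wavelength extraction module (column indices).
    wavelengths: Sequence[int]
    # Output of the final model determining module: linear coefficients and offset.
    coef: np.ndarray
    intercept: float

    def predict(self, raw_spectra):
        """Predicting module: raw spectra in, crude fat content estimates out."""
        processed = self.preprocess(np.asarray(raw_spectra, dtype=float))
        features = processed[:, list(self.wavelengths)]
        return features @ np.asarray(self.coef, dtype=float) + self.intercept
```

Keeping the preprocessing function and the wavelength indices inside the same object helps ensure that a target sample is treated exactly as the calibration samples were before the model is applied.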
In this specification, the embodiments are described progressively: each embodiment focuses on its differences from the others, and for identical or similar parts the embodiments may be referred to one another. Since the system disclosed in the embodiment corresponds to the method disclosed in the embodiment, its description is relatively brief; for the relevant points, refer to the description of the method.
The principles and embodiments of the present invention have been described herein with reference to specific examples; the description of these examples is intended only to assist in understanding the method of the present invention and its core ideas. Likewise, those of ordinary skill in the art may make modifications in light of the present teachings, and such modifications are within the scope of the present invention. In view of the foregoing, this description should not be construed as limiting the invention.

Claims (8)

1. A method for determining the crude fat content of mutton, comprising:
acquiring original near infrared spectrum data and crude fat content actual measurement values of a plurality of mutton samples;
preprocessing the original near infrared spectrum data to obtain processed spectrum data;
extracting characteristic wavelengths of the processing spectrum data by using a continuous projection algorithm to obtain characteristic wavelength data;
determining a final crude fat content prediction model according to the characteristic wavelength data by using a partial least square method;
and obtaining target characteristic wavelength data of the target mutton to be detected, and inputting the target characteristic wavelength data into the final crude fat content prediction model to obtain a crude fat content prediction value of the target mutton to be detected.
2. The method for determining the content of the crude fat in mutton according to claim 1, wherein the method for preprocessing the original near infrared spectrum data to obtain processed spectrum data specifically comprises the following steps:
preprocessing the original near infrared spectrum data by using a standard normal transformation algorithm to obtain standard normal transformation spectrum data;
preprocessing the original near infrared spectrum data by utilizing a multi-element scattering correction algorithm to obtain multi-element scattering correction spectrum data; the standard normal transformation spectral data or the multivariate scattering correction spectral data is the processing spectral data.
3. The method for determining the content of the crude fat in mutton according to claim 2, wherein the characteristic wavelength extraction is performed on the processed spectrum data by using a continuous projection algorithm to obtain characteristic wavelength data, and the method specifically comprises the following steps:
extracting characteristic wavelengths of the standard normal transformation spectrum data by using a continuous projection algorithm to obtain standard normal transformation characteristic wavelengths;
extracting characteristic wavelengths of the multi-element scattering correction spectrum data by using a continuous projection algorithm to obtain multi-element scattering correction characteristic wavelengths; the standard normal transformation characteristic wavelength and the multivariate scattering correction characteristic wavelength constitute the characteristic wavelength data.
4. A method for determining the crude fat content of mutton as claimed in claim 3, wherein the determining the final crude fat content prediction model from the characteristic wavelength data by using a partial least square method comprises:
taking the standard normal transformation characteristic wavelength as input, taking the actual measurement value of the crude fat content as output, and modeling by using a partial least square method to obtain a first crude fat content prediction model;
taking the multi-element scattering correction characteristic wavelength as input, taking the actual measurement value of the crude fat content as output, and modeling by using a partial least square method to obtain a second crude fat content prediction model;
and determining a final crude fat content prediction model according to the first crude fat content prediction model and the second crude fat content prediction model.
5. A method for determining the crude fat content of mutton as claimed in claim 4, wherein determining a final crude fat content prediction model based on the first crude fat content prediction model and the second crude fat content prediction model comprises:
according to the evaluation index of the first crude fat content prediction model and the evaluation index of the second crude fat content prediction model, determining a crude fat content prediction model with the prediction capability index meeting a set condition as a final crude fat content prediction model; the evaluation index includes root mean square error, decision coefficient, and relative analysis error.
6. The method of claim 4, wherein the set condition is that the difference between the root mean square error and a root mean square error set value is within a first threshold range, the difference between the determination coefficient and a determination coefficient set value is within a second threshold range, and the difference between the relative analysis error and a relative analysis error set value is within a third threshold range.
7. The method of claim 1, wherein the raw near infrared spectrum data is obtained by scanning a mutton sample with a near infrared analyzer.
8. A mutton fat content determination system, comprising:
the raw data acquisition module is used for acquiring raw near infrared spectrum data and actual measurement values of crude fat content of a plurality of mutton samples;
the preprocessing module is used for preprocessing the original near infrared spectrum data to obtain processing spectrum data;
the characteristic wavelength extraction module is used for extracting characteristic wavelengths of the processing spectrum data by utilizing a continuous projection algorithm to obtain characteristic wavelength data;
the final crude fat content prediction model determining module is used for determining a final crude fat content prediction model according to the characteristic wavelength data by using a partial least square method;
the predicting module is used for acquiring target characteristic wavelength data of the target mutton to be detected, and inputting the target characteristic wavelength data into the final crude fat content predicting model to obtain a crude fat content predicting value of the target mutton to be detected.
CN202410068013.6A 2024-01-17 2024-01-17 Mutton crude fat content determination method and system Pending CN117871459A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410068013.6A CN117871459A (en) 2024-01-17 2024-01-17 Mutton crude fat content determination method and system


Publications (1)

Publication Number Publication Date
CN117871459A true CN117871459A (en) 2024-04-12

Family

ID=90586387

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410068013.6A Pending CN117871459A (en) 2024-01-17 2024-01-17 Mutton crude fat content determination method and system

Country Status (1)

Country Link
CN (1) CN117871459A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination