CN110672552A - Confidence coefficient estimation method for vehicle fuel oil near infrared spectrum detection result

Publication number
CN110672552A
Authority
CN
China
Legal status
Granted
Application number
CN201910910971.2A
Other languages
Chinese (zh)
Other versions
CN110672552B (en)
Inventor
熊智新
张肖雪
杨冲
赵静远
Current Assignee
Nanjing Forestry University
Original Assignee
Nanjing Forestry University

Classifications

    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01N: INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00: Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/17: Systems in which incident light is modified in accordance with the properties of the material investigated
    • G01N21/25: Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
    • G01N21/31: Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry
    • G01N21/35: Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light
    • G01N21/359: Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light using near infrared light
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00: Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10: Complex mathematical operations
    • G06F17/18: Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis

Abstract

The invention provides a confidence estimation method for near-infrared spectral detection results of vehicle fuel. The method calculates the Mahalanobis distance on the basis of principal component analysis, derives a significance level from the F distribution, and thereby estimates the confidence of the detection result for a sample under test. Because the statistic obtained from the Mahalanobis distance follows an F distribution, the method gives practical near-infrared spectral analysis a confidence estimate of the reliability of each detection result, and provides a quantitative basis for subsequent qualitative diagnosis of the analyzed object or for evaluating the validity of quantitative analysis results.

Description

Confidence coefficient estimation method for vehicle fuel oil near infrared spectrum detection result
Technical Field
The invention relates to near infrared spectrum anomaly detection of vehicle fuel oil, in particular to a near infrared detection result confidence degree estimation method based on Mahalanobis distance.
Background
Near-infrared spectroscopy is based on the combination-band and overtone absorptions of the stretching vibrations of hydrogen-containing groups in organic molecules. Using chemometric methods, it can establish a linear or nonlinear relationship between a spectrum and a quality index, complete qualitative and quantitative analysis of a sample quickly and efficiently, and overcome the complicated procedures, high cost, and low efficiency of traditional oil analysis techniques.
In recent years, near-infrared spectroscopy has been applied ever more widely and maturely to measuring the content of various components of oil products, improving the level of production management and quality supervision. During acquisition of near-infrared spectra, abnormal spectral data can arise from factors such as changes in sample properties, changes in experimental conditions, instrument measurement errors, and human measurement errors; abnormal spectra distort the characteristics of the data and thereby reduce the reliability of the spectral detection result. Identifying and removing abnormal samples is therefore a necessary condition for building a reliable near-infrared analysis model. Common methods for removing abnormal sample points include the Mahalanobis distance (MD), the leverage method, and Monte Carlo cross-validation. In actual rapid detection of oil products, however, differences among the spectra of the same oil product are often unavoidable, given that refineries use different production processes and that adulteration is possible. Simply rejecting abnormal samples is therefore often undesirable; enterprises need a suitable judgment standard (for example, a confidence of not less than 80%) for judging and screening samples, which is important for rapid detection of oil quality indexes and for further diagnostic analysis.
However, the field of near-infrared spectroscopy has no mature method for estimating the confidence of detection results obtained from spectral data. In the field of process control, because the squared Mahalanobis distance calculated from principal component analysis (PCA) scores (PCA-MD) is equivalent to Hotelling's $T^2$ statistic, PCA-MD is commonly used for the $T^2$ test: comparing the squared PCA-MD value with the $T^2$ control limit shows whether a sample under test is in a normal state. On this basis, because the $T^2$ statistic follows an F distribution, the significance level of the sample can be calculated. Borrowing this idea of data distribution, the significance level of a sample can be obtained from the squared PCA-MD value of its near-infrared spectral data, and the confidence of the detection result can then be estimated.
Disclosure of Invention
Aiming at the rapid detection of oil products, the invention provides a near-infrared detection result confidence degree estimation method based on the Mahalanobis distance, so that whether a sample is qualified or not can be conveniently and effectively judged according to the confidence degree in practical application, and the reliability of an analysis result can be ensured.
The method completes the calculation of the Mahalanobis distance based on the principal component analysis, obtains the significance level according to the F distribution on the basis, and completes the confidence estimation of the detection result of the sample to be detected.
The implementation of the process specifically comprises the following steps:
S1. Standardize the spectral data to obtain the calibration set spectral data $X$;
S2. Apply the PCA-MD $T^2$ test to detect and remove abnormal samples from the calibration set, so that the calibration set spectral data $X$ contains only normal samples;
S3. Perform PCA decomposition on the calibration set spectral data $X$ and, together with the test set spectral data $X_{\text{test}}$, calculate the squared Mahalanobis distance $MD_{\text{test}}^2$ between each sample under test and the calibration set;
S4. On the basis that $MD_{\text{test}}^2$ follows an F distribution, calculate the significance level $\alpha_{\text{test}}$ and then obtain the confidence $c_{\text{test}}$ of the near-infrared detection result.
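Step S1 does not state which standardization is applied. As a minimal sketch, assuming column-wise autoscaling (zero mean and unit variance at each wavelength point) with parameters estimated on the calibration set and reused for the test set, the step could look like this in Python. The names `fit_standardizer` and `standardize` and all numbers are illustrative and do not come from the patent.

```python
import math


def fit_standardizer(calibration):
    """Return per-wavelength means and sample standard deviations of the calibration spectra."""
    n = len(calibration)
    columns = list(zip(*calibration))
    means = [sum(col) / n for col in columns]
    stds = [
        math.sqrt(sum((v - mu) ** 2 for v in col) / (n - 1))
        for col, mu in zip(columns, means)
    ]
    return means, stds


def standardize(spectra, means, stds):
    """Autoscale spectra with calibration-set parameters (constant columns map to 0)."""
    return [
        [(v - mu) / sd if sd > 0 else 0.0 for v, mu, sd in zip(row, means, stds)]
        for row in spectra
    ]


# Toy data: 3 calibration spectra with 2 wavelength points each (made-up numbers).
calibration = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
means, stds = fit_standardizer(calibration)
X = standardize(calibration, means, stds)          # calibration set spectral data X
X_test = standardize([[4.0, 20.0]], means, stds)   # test set reuses the same parameters
```

Reusing the calibration means and standard deviations for the test spectra keeps both sets in the same coordinate system, which the later projection onto the calibration loadings relies on.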
Step S2 includes:
S21: The PCA decomposition of the calibration set spectral data can be expressed as:
$$X = T P^{\mathsf{T}} \tag{1}$$
where $T \in \mathbb{R}^{n \times p}$ is the score matrix, $n$ is the number of samples, $p$ is the number of principal components, $P \in \mathbb{R}^{m \times p}$ is the loading matrix, and $m$ is the number of variables;
S22: The squared PCA-MD value of the $i$-th sample of the calibration set spectral data can be expressed as:
$$MD_i^2 = t_i \Sigma^{-1} t_i^{\mathsf{T}} \tag{2}$$
where $t_i$ is the $i$-th row vector of the score matrix $T$ and $\Sigma$ is the covariance matrix of $T$;
S23: The $T^2$ control limit can be expressed as:
[Equation (3): formula image not reproduced]
where $\alpha$ is the significance level and the confidence of the control limit is $1-\alpha$. If $MD_i^2$ is less than the control limit, the sample is judged to be a normal sample; if $MD_i^2$ is greater than the control limit, it is judged to be an abnormal sample.
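The distance calculation in S22 and the decision rule in S23 can be sketched as follows. The sketch assumes mean-centred PCA scores, for which the score covariance matrix is diagonal, so the squared distance reduces to a sum of squared scores divided by their variances. The control limit is passed in as a plain number because its formula is not reproduced in this text. The names `score_variances`, `squared_md`, and `flag_abnormal` and the toy scores are hypothetical.

```python
def score_variances(T):
    """Sample variance of each score column (the diagonal of the score covariance matrix)."""
    n = len(T)
    columns = list(zip(*T))
    means = [sum(col) / n for col in columns]
    return [
        sum((v - mu) ** 2 for v in col) / (n - 1)
        for col, mu in zip(columns, means)
    ]


def squared_md(t, variances):
    """Squared Mahalanobis distance of one centred score vector, assuming a diagonal covariance."""
    return sum(v * v / var for v, var in zip(t, variances))


def flag_abnormal(T, control_limit):
    """Indices of calibration samples whose squared distance exceeds the control limit."""
    variances = score_variances(T)
    return [i for i, t in enumerate(T) if squared_md(t, variances) > control_limit]


# Toy score matrix: 5 samples, 2 principal components (made-up, already mean-centred).
T = [[1.0, 0.0], [-1.0, 0.0], [0.0, 2.0], [0.0, -2.0], [0.0, 0.0]]
variances = score_variances(T)
md2 = [squared_md(t, variances) for t in T]
```

With a hypothetical control limit of 2.5 no toy sample is flagged, while a limit of 1.5 flags the first four.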
Step S3 includes:
S31: After the abnormal samples are removed, perform PCA decomposition on the calibration set spectral data $X$ according to formula (1), and update the loading matrix $P$ and the covariance matrix $\Sigma$ of the score matrix $T$;
S32: Calculate the score matrix of the spectral data of the samples under test:
$$T_{\text{test}} = X_{\text{test}} P \tag{4}$$
where $X_{\text{test}}$ is the set of samples under test and $P$ is the calibration set loading matrix;
S33: The squared Mahalanobis distance $MD_{\text{test-}i}^2$ between the $i$-th sample under test and the calibration set can be expressed as:
$$MD_{\text{test-}i}^2 = t_{\text{test-}i} \Sigma^{-1} t_{\text{test-}i}^{\mathsf{T}} \tag{5}$$
where $t_{\text{test-}i}$ is the $i$-th row vector of the score matrix $T_{\text{test}}$ of the samples under test.
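Steps S32 and S33 amount to projecting the standardized test spectra onto the calibration loadings and measuring the distance in score space. The following is a minimal sketch under two assumptions: the test spectra were standardized with the calibration set parameters, and the score covariance matrix is diagonal, as it is for PCA scores. The function names, loading matrix, and variances are made up for illustration.

```python
def project(X_test, P):
    """Score matrix T_test = X_test P for row-wise spectra and an m x p loading matrix."""
    m, p = len(P), len(P[0])
    return [[sum(x[k] * P[k][j] for k in range(m)) for j in range(p)] for x in X_test]


def squared_md_test(X_test, P, score_variances):
    """Squared Mahalanobis distance of each test sample to the calibration set."""
    return [
        sum(t * t / var for t, var in zip(row, score_variances))
        for row in project(X_test, P)
    ]


# Toy loading matrix: m = 3 variables, p = 2 principal components (made-up numbers).
P = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
score_variances = [4.0, 9.0]  # variances of the calibration scores (diagonal of the covariance)
X_test = [[2.0, 3.0, 5.0], [0.0, 0.0, 1.0]]

T_test = project(X_test, P)
md2_test = squared_md_test(X_test, P, score_variances)
```

The second toy sample differs from the calibration mean only along a direction that the two retained components do not capture, so its distance in score space is zero; this is a general property of a distance computed on retained scores only.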
Step S4 includes:
S41: The significance level $\alpha_{\text{test}}$ is obtained from the F distribution with degrees of freedom $p$ and $n-p$. For the $i$-th test set sample, the significance level $\alpha_{\text{test-}i}$ can be obtained from the following formula:
[Equation (6): formula image not reproduced]
S42: The confidence $c_{\text{test-}i}$ of the $i$-th test set sample can be expressed as:
$$c_{\text{test-}i} = 1 - \alpha_{\text{test-}i} \tag{7}$$
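Step S4 can be sketched in pure Python as below, under two labelled assumptions, since the formula for the significance level is not reproduced in this text. First, the squared distance is converted to an F statistic with the scaling commonly used for Hotelling's $T^2$ of a new observation, $F = MD^2 \, n(n-p) / (p(n^2-1))$. Second, the significance level of a test sample is read as the cumulative F probability of that statistic, which is the reading consistent with the reported behaviour that confidence falls as the distance grows. The helper names are hypothetical; the F cumulative distribution is evaluated with the standard continued-fraction form of the regularized incomplete beta function.

```python
import math


def _beta_cf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_cdf(f, d1, d2):
    """Cumulative distribution function of the F distribution with (d1, d2) degrees of freedom."""
    if f <= 0.0:
        return 0.0
    return reg_inc_beta(d1 / 2.0, d2 / 2.0, d1 * f / (d1 * f + d2))


def confidence(md2, n, p):
    """Confidence c = 1 - alpha_test; the scaling to an F statistic is an assumption."""
    f_stat = md2 * n * (n - p) / (p * (n * n - 1))
    alpha_test = f_cdf(f_stat, p, n - p)
    return 1.0 - alpha_test


# n = 79 calibration samples and p = 3 components give degrees of freedom 3 and 76.
values = [confidence(d, n=79, p=3) for d in (1.0, 5.0, 20.0)]
```

A sample sitting exactly at the calibration centre gets a confidence of 1, and the confidence falls monotonically as the squared distance increases.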
The advantage of the method is that, because the statistic obtained from the Mahalanobis distance follows an F distribution, it gives practical near-infrared spectral analysis a confidence estimate of the reliability of each detection result, and provides a quantitative basis for subsequent qualitative diagnosis of the analyzed object or for evaluating the validity of quantitative analysis results.
Drawings
FIG. 1 is a flow chart of the PCA-MD based confidence quantification method for abnormal near-infrared spectra;
FIG. 2 is a line graph of the PCA-MD $T^2$ test used to remove abnormal samples from the calibration set;
FIG. 3 is a line graph of the sample confidence estimates for the near-infrared spectra of diesel-blended gasoline;
FIG. 4 is a line graph of the sample confidence estimates for simulation case 1;
FIG. 5 is a line graph of the sample confidence estimates for simulation case 2.
Detailed description of the preferred embodiments
The technical scheme adopted by the method for performing confidence estimation on the oil product near infrared spectrum detection result is as follows:
S1. Standardize the spectral data to obtain the calibration set spectral data $X$;
S2. Apply the PCA-MD $T^2$ test to detect and remove abnormal samples from the calibration set, so that the calibration set spectral data $X$ contains only normal samples;
S3. Perform PCA decomposition on the calibration set spectral data $X$ and, together with the test set spectral data $X_{\text{test}}$, calculate the squared Mahalanobis distance $MD_{\text{test}}^2$ between each sample under test and the calibration set;
S4. On the basis that $MD_{\text{test}}^2$ follows an F distribution, calculate the significance level $\alpha_{\text{test}}$ and then obtain the confidence $c_{\text{test}}$ of the near-infrared detection result.
Step S2 includes:
S21: The PCA decomposition of the calibration set spectral data can be expressed as:
$$X = T P^{\mathsf{T}} \tag{1}$$
where $T \in \mathbb{R}^{n \times p}$ is the score matrix, $n$ is the number of samples, $p$ is the number of principal components, $P \in \mathbb{R}^{m \times p}$ is the loading matrix, and $m$ is the number of variables;
S22: The squared PCA-MD value of the $i$-th sample of the calibration set spectral data can be expressed as:
$$MD_i^2 = t_i \Sigma^{-1} t_i^{\mathsf{T}} \tag{2}$$
where $t_i$ is the $i$-th row vector of the score matrix $T$ and $\Sigma$ is the covariance matrix of $T$;
S23: The $T^2$ control limit can be expressed as:
[Equation (3): formula image not reproduced]
where $\alpha$ is the significance level (typically set at 0.01 or 0.05) and the confidence of the control limit is $1-\alpha$. If $MD_i^2$ is less than the control limit, the sample is judged to be a normal sample; if $MD_i^2$ is greater than the control limit, it is judged to be an abnormal sample.
Step S3 includes:
S31: After the abnormal samples are removed, perform PCA decomposition on the calibration set spectral data $X$ according to formula (1), and update the loading matrix $P$ and the covariance matrix $\Sigma$ of the score matrix $T$;
S32: Calculate the score matrix of the spectral data of the samples under test:
$$T_{\text{test}} = X_{\text{test}} P \tag{4}$$
where $X_{\text{test}}$ is the set of samples under test and $P$ is the calibration set loading matrix;
S33: The squared Mahalanobis distance $MD_{\text{test-}i}^2$ between the $i$-th sample under test and the calibration set can be expressed as:
$$MD_{\text{test-}i}^2 = t_{\text{test-}i} \Sigma^{-1} t_{\text{test-}i}^{\mathsf{T}} \tag{5}$$
where $t_{\text{test-}i}$ is the $i$-th row vector of the score matrix $T_{\text{test}}$ of the samples under test.
Step S4 includes:
S41: The significance level $\alpha_{\text{test}}$ is obtained from the F distribution with degrees of freedom $p$ and $n-p$. For the $i$-th test set sample, the significance level $\alpha_{\text{test-}i}$ can be obtained from the following formula:
[Equation (6): formula image not reproduced]
S42: The confidence $c_{\text{test-}i}$ of the $i$-th test set sample can be expressed as:
$$c_{\text{test-}i} = 1 - \alpha_{\text{test-}i} \tag{7}$$
Example 1:
This example takes the detection of gasoline samples blended with given percentages of diesel. Spectra of diesel and gasoline samples supplied by major oil refineries in Jinan, Shandong, were collected with a Thermo Fisher Antaris II near-infrared spectrometer and used to form the calibration set. The near-infrared spectra of the blended gasoline were collected at the same time as the test set.
S1. Standardize the spectral data to obtain the calibration set spectral data $X$;
S2. Apply the PCA-MD $T^2$ test to detect and remove abnormal samples from the calibration set, so that the calibration set spectral data $X$ contains only normal samples;
S3. Perform PCA decomposition on the calibration set spectral data $X$ and, together with the test set spectral data $X_{\text{test}}$, calculate the squared Mahalanobis distance $MD_{\text{test}}^2$ between each sample under test and the calibration set;
S4. On the basis that $MD_{\text{test}}^2$ follows an F distribution, calculate the significance level $\alpha_{\text{test}}$ and then obtain the confidence $c_{\text{test}}$ of the near-infrared detection result.
The method is simulated in MATLAB and described in further detail below with reference to FIG. 1:
the first step is as follows: and completing the sample division and data standardization processing of the correction set and the test set. The calibration set comprises 81 pure gasoline near infrared spectrum samples, the test set comprises 1 pure gasoline spectrum sample and 10 gasoline spectrum samples respectively doped with diesel oil with different contents, and the diesel oil contents respectively account for 5.26%, 5.88%, 8.33%, 9.09%, 10%, 11.11%, 12.5%, 14.29%, 16.67% and 20%.
The second step is that: carrying out PCA model decomposition on the spectrum data of the correction set so as to calculate the square value of the Mahalanobis distanceThen according to T2The control limit (α set to 0.05) determines the presence or absence of an abnormal sample point. Since the gasoline near infrared spectrum data has too many variables (wavelength points), the first 6 principal components are selected here to analyze the differential contribution rate. As can be seen from Table 1, the cumulative variance contribution ratio did not increase significantly after the number of principal components exceeded 3 by PCA decomposition, and 3 principal components were selectedAnd calculating the MD value and eliminating abnormal values. As can be seen from fig. 2, the red dotted line represents the 95% confidence control limit, and the samples 53 and 54 are significantly out of the control limit range, and therefore are determined to be abnormal samples. After the abnormal samples are eliminated, the number of the samples in the correction set is 79.
TABLE 1 influence of the number of principal components of the PCA model on the contribution rate and the cumulative contribution rate
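The choice of the number of principal components described in the second step can be illustrated with a short sketch. The eigenvalues below are made up, since the values of Table 1 are not available in this text, and the 1% gain threshold is a hypothetical stand-in for the criterion that the cumulative variance contribution rate no longer rises significantly.

```python
def contribution_rates(eigenvalues):
    """Variance contribution rate of each component and the running cumulative rate."""
    total = sum(eigenvalues)
    rates = [ev / total for ev in eigenvalues]
    cumulative, running = [], 0.0
    for rate in rates:
        running += rate
        cumulative.append(running)
    return rates, cumulative


def choose_components(rates, min_gain=0.01):
    """Smallest k such that adding component k + 1 raises the cumulative rate by less than min_gain."""
    for k in range(1, len(rates)):
        if rates[k] < min_gain:
            return k
    return len(rates)


# Hypothetical eigenvalues of the first 6 principal components (not the patent's data).
eigenvalues = [6.0, 2.5, 1.0, 0.05, 0.03, 0.02]
rates, cumulative = contribution_rates(eigenvalues)
k = choose_components(rates)
```

For simplicity the rates here are taken relative to the six listed eigenvalues; with real spectra the total variance over all components would be used as the denominator.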
The third step: perform PCA decomposition on the calibration set with the abnormal samples removed and update the loading matrix $P$ and the covariance matrix $\Sigma$ of the score matrix $T$; then, using the test spectral data $X_{\text{test}}$, calculate the test set score matrix $T_{\text{test}}$ and the squared Mahalanobis distance $MD_{\text{test}}^2$ between the test set samples and the calibration set.
As Table 2 shows, after the abnormal samples in the calibration set are removed, the variance contribution rates and cumulative variance contribution rates of the principal components of the PCA model change slightly. By the criterion that the cumulative variance contribution rate no longer rises significantly, 3 principal components are still selected to update the loading matrix $P$ and the covariance matrix $\Sigma$ of the score matrix $T$.
TABLE 2 Influence of the number of principal components of the PCA model on the contribution rate and the cumulative contribution rate
[Table 2: table image not reproduced]
The fourth step: on the basis of $MD_{\text{test}}^2$ and the F distribution with degrees of freedom 3 and 76, calculate the significance level $\alpha_{\text{test}}$ and the confidence $c_{\text{test}}$. As shown in FIG. 3, the confidence of the pure gasoline near-infrared spectrum sample in the test set is above 90%; with 5.26% diesel blended in, the sample confidence drops sharply to about 20%; with 5.88% to 11.11% diesel, the confidence shows no clear downward trend but stays within 20% to 35% in all cases, which may be caused by uneven mixing of the sample or by evaporation of part of the gasoline during actual operation; when the blended diesel content exceeds 11.11%, the sample confidence falls gradually from about 30% to about 1%.
Example 2:
This example simulates the blending of diesel into gasoline by combining a single pair of spectra in set proportions. Spectra of diesel and gasoline samples supplied by major oil refineries in Jinan, Shandong, were collected with a Thermo Fisher Antaris II near-infrared spectrometer. The calibration set is the 81 pure gasoline near-infrared spectra of Example 1. Each of the 11 test set samples is formed by adding 1 gasoline spectrum and 1 diesel spectrum in a specific proportion, with diesel contents of 0% (pure gasoline), 2%, 4%, 6%, 8%, 10%, 12%, 14%, 16%, 18% and 20%.
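The construction of the simulated test set can be sketched as follows, assuming that "adding in a specific proportion" means a linear weighted sum of one gasoline spectrum and one diesel spectrum; the example does not state the exact weighting. The spectra are made-up three-point vectors and `blend` is a hypothetical helper.

```python
def blend(gasoline, diesel, diesel_fraction):
    """Linear mixture of two spectra; diesel_fraction is the assumed diesel share (0 to 1)."""
    return [
        (1.0 - diesel_fraction) * g + diesel_fraction * d
        for g, d in zip(gasoline, diesel)
    ]


# Made-up absorbance values at 3 wavelength points.
gasoline = [0.10, 0.40, 0.25]
diesel = [0.30, 0.20, 0.65]

# Diesel contents of 0%, 2%, ..., 20%, as listed in this example.
fractions = [i * 0.02 for i in range(11)]
test_set = [blend(gasoline, diesel, r) for r in fractions]
```

The first simulated sample is the unmodified gasoline spectrum, which matches the 0% (pure gasoline) entry of the test set.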
The method is simulated in MATLAB and described in further detail below with reference to FIG. 1:
The first step: standardize the data of the calibration set and the test set.
The second step: perform PCA decomposition on the calibration set spectral data, calculate the squared Mahalanobis distance $MD_i^2$ of each sample, and then use the $T^2$ control limit ($\alpha$ set to 0.05) to determine whether abnormal sample points are present. Because the calibration set is the same as in Example 1, 3 principal components are again selected to calculate the MD values of the calibration set and remove abnormal values; after the abnormal samples are removed, the calibration set contains 79 samples.
The third step: perform PCA decomposition on the calibration set with the abnormal samples removed and update the loading matrix $P$ and the covariance matrix $\Sigma$ of the score matrix $T$; then, using the test spectral data $X_{\text{test}}$, calculate the test set score matrix $T_{\text{test}}$ and the squared Mahalanobis distance $MD_{\text{test}}^2$ between the test set samples and the calibration set.
As Table 2 shows, after the abnormal samples in the calibration set are removed, the variance contribution rates and cumulative variance contribution rates of the principal components of the PCA model change slightly. By the criterion that the cumulative variance contribution rate no longer rises significantly, 3 principal components are still selected to update the loading matrix $P$ and the covariance matrix $\Sigma$ of the score matrix $T$.
The fourth step: on the basis of $MD_{\text{test}}^2$ and the F distribution with degrees of freedom 3 and 76, calculate the significance level $\alpha_{\text{test}}$ and the confidence $c_{\text{test}}$. As shown in FIG. 4, the sample confidence of the simulated test set falls along a smooth curve as the proportion of blended diesel increases; when the proportion of blended diesel is between 2% and 10%, the confidence falls quickly, dropping below 50% at a proportion of 6%; as the proportion continues to increase, the rate of decline in confidence gradually slows.
Example 3:
This example simulates the blending of diesel into gasoline by combining multiple spectra in set proportions. Spectra of diesel and gasoline samples supplied by major oil refineries in Jinan, Shandong, were first collected with a Thermo Fisher Antaris II near-infrared spectrometer. The calibration set is the 81 pure gasoline near-infrared spectra of Example 1. The 11 test set samples are obtained by adding several gasoline spectra and diesel spectra and averaging the result, with diesel contents of 0%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45% and 50%.
The method is simulated in MATLAB and described in further detail below with reference to FIG. 1:
The first step: standardize the data of the calibration set and the test set.
The second step: perform PCA decomposition on the calibration set spectral data, calculate the squared Mahalanobis distance $MD_i^2$ of each sample, and then use the $T^2$ control limit ($\alpha$ set to 0.05) to determine whether abnormal sample points are present. Because the calibration set is the same as in Example 1, 3 principal components are again selected to calculate the MD values of the calibration set and remove abnormal values; after the abnormal samples are removed, the calibration set contains 79 samples.
The third step: perform PCA decomposition on the calibration set with the abnormal samples removed and update the loading matrix $P$ and the covariance matrix $\Sigma$ of the score matrix $T$; then, using the test spectral data $X_{\text{test}}$, calculate the test set score matrix $T_{\text{test}}$ and the squared Mahalanobis distance $MD_{\text{test}}^2$ between the test set samples and the calibration set.
As Table 2 shows, after the abnormal samples in the calibration set are removed, the variance contribution rates and cumulative variance contribution rates of the principal components of the PCA model change slightly. By the criterion that the cumulative variance contribution rate no longer rises significantly, 3 principal components are still selected to update the loading matrix $P$ and the covariance matrix $\Sigma$ of the score matrix $T$.
The fourth step: on the basis of $MD_{\text{test}}^2$ and the F distribution with degrees of freedom 3 and 76, calculate the significance level $\alpha_{\text{test}}$ and the confidence $c_{\text{test}}$. As FIG. 5 shows, the confidence of the test set samples falls overall as the proportion of blended diesel increases. When the proportion of blended diesel is 5% to 15%, the confidence falls fastest; when it is 15% to 25%, the decline gradually slows; when it is 25% to 50%, the confidence is close to 0 and shows no obvious change.
The three examples show that, as the proportion of diesel blended into gasoline increases, the confidence of the detected samples falls overall, which demonstrates that the data-distribution approach of the method is effective for judging whether a near-infrared detection result is abnormal. The confidence estimated from the Mahalanobis distance and the F distribution can be compared with an enterprise's judgment standard: if the confidence is not less than the standard, the near-infrared detection result of the sample is considered normal; if it is below the standard, the sample is considered suspicious and its quality index needs to be determined further. The method thus effectively safeguards the reliability of near-infrared spectral detection results.
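The screening rule described above can be written as a short sketch. The default threshold of 0.80 reuses the 80% figure given as an example of an enterprise standard in the Background; the function name `judge` and the sample confidences are hypothetical.

```python
def judge(confidences, threshold=0.80):
    """Label each sample 'normal' if its confidence is not less than the threshold, else 'suspicious'."""
    return ["normal" if c >= threshold else "suspicious" for c in confidences]


# Made-up confidences for three samples under test.
labels = judge([0.93, 0.80, 0.21])
```

A confidence exactly equal to the threshold counts as normal, matching the "not less than" wording of the judgment standard.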
The above embodiments only illustrate the present invention and do not limit it; those skilled in the art may make various changes or modifications without departing from the spirit and scope of the present invention.
The foregoing shows and describes the general principles, essential features, and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are described in the foregoing description only for the purpose of illustrating the principles of the present invention, but that various changes and modifications may be made therein without departing from the spirit and scope of the invention as defined by the appended claims, specification and equivalents thereof.

Claims (5)

1. A confidence degree estimation method for a vehicle fuel near infrared spectrum detection result is characterized by comprising the following steps:
s1, carrying out standardization processing on spectral data to obtain correction set spectral data X;
s2, adopting PCA-MD T2Detecting and removing abnormal samples in the correction set spectral data X to ensure that the correction set spectral data X are all normal samples;
s3, carrying out PCA decomposition on the spectrum data X of the correction set, and combining the spectrum data X of the test settestCalculating the squared value of the Mahalanobis distance between the sample to be measured and the sample in the correction set
Figure FDA0002214692230000011
S4, according to
Figure FDA0002214692230000012
Calculating the significance level alpha by following the F distributiontestThen obtaining confidence coefficient c of the near infrared spectrum detection resulttest
2. The method for estimating the confidence of the detection result of the near infrared spectrum of the vehicle fuel according to claim 1, wherein the step S2 includes:
s21: the PCA decomposition of the calibration set spectral data can be expressed as:
Figure FDA0002214692230000013
in the formula, T is belonged to Rn×pFor the score matrix, n represents the number of samples, P represents the number of principal components, and P belongs to Rm×pM represents the number of variables as a load matrix;
s22: the square of the PCA-MD value of the ith sample of the calibration set spectral data can be expressed as:
Figure FDA0002214692230000014
wherein, tiRepresenting the ith row vector of the scoring matrix T, and sigma being the covariance matrix of T;
S23:T2the control limit is expressed as:
wherein, alpha is a significance level, and the confidence coefficient of the control limit is 1-alpha; if it isIf the value is less than the control limit, judging the sample to be a normal sample; if it is
Figure FDA0002214692230000017
If the value is larger than the control limit, the abnormal sample is judged.
3. The method for estimating the confidence of the detection result of the near infrared spectrum of the vehicle fuel according to claim 2, wherein the step S3 includes:
s31: after the abnormal samples are removed, PCA decomposition is carried out on the spectrum data X of the correction set according to a formula (1), and the covariance matrix sigma of the load matrix P and the scoring matrix T is updated;
s32: calculating a score matrix of the spectral data of the sample set to be detected:
Ttest=XtestP (4)
in the formula, XtestA sample set to be detected and a correction set load matrix P are obtained;
S33: the squared Mahalanobis distance $MD_{test-i}^2$ between the ith sample to be tested and the calibration set is expressed as:
$MD_{test-i}^2 = t_{test-i} \Sigma^{-1} t_{test-i}^{T}$ (5)
where $t_{test-i}$ is the ith row vector of the score matrix $T_{test}$ of the sample set to be tested.
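Steps S32 and S33 amount to a projection followed by a row-wise quadratic form. The sketch below is illustrative only: `md2_to_calibration` is a hypothetical name, and subtracting the calibration mean before projecting is an assumption, since formula (4) as written applies P directly to $X_{test}$ (pass a zero mean to reproduce that literally).

```python
import numpy as np


def md2_to_calibration(X_test, mean, P, sigma):
    """Squared Mahalanobis distance of each test spectrum to the calibration set.

    X_test : (k x m) test spectra
    mean   : (m,) calibration column mean (assumed centering; use zeros to skip)
    P      : (m x p) calibration loading matrix
    sigma  : (p x p) covariance matrix of the calibration scores
    """
    T_test = (X_test - mean) @ P  # formula (4), with assumed centering
    sigma_inv = np.linalg.inv(np.atleast_2d(sigma))
    # one value per test sample: t * sigma^-1 * t^T
    return np.einsum("ij,jk,ik->i", T_test, sigma_inv, T_test)
```

The loadings and covariance passed in should be the ones refitted after outlier removal in S31, so that abnormal calibration spectra do not distort the distances.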
4. The method for estimating the confidence of the detection result of the near infrared spectrum of the vehicle fuel according to claim 3, wherein the step S4 includes:
S41: the significance level $\alpha_{test}$ is obtained from the F distribution with p and n-p degrees of freedom; for the ith test set sample, the significance level $\alpha_{test-i}$ is obtained according to the following formula:
[Formula (6): image FDA0002214692230000023, not reproduced]
S42: the confidence $c_{test-i}$ for the ith test set sample can be expressed as:
$c_{test-i} = 1 - \alpha_{test-i}$ (7).
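Steps S41 and S42 can be sketched as below, with two loudly stated assumptions because formula (6) is not reproduced in this text: the squared distance is scaled by $\frac{n(n-p)}{p(n^2-1)}$ before being compared with the F distribution, and $\alpha_{test-i}$ is taken as the upper-tail probability of that scaled statistic. The function name `confidence_from_md2` and the `scale` parameter are hypothetical; only formula (7) is taken directly from the claim.

```python
import numpy as np
from scipy.stats import f


def confidence_from_md2(md2, n, p, scale=None):
    """Return (alpha_test, c_test) for squared Mahalanobis distances md2.

    n : number of calibration samples, p : number of principal components.
    scale : factor turning MD^2 into an F(p, n - p) statistic (assumed form
            if omitted; the patent's formula (6) may use a different one).
    """
    md2 = np.asarray(md2, dtype=float)
    if scale is None:
        scale = n * (n - p) / (p * (n**2 - 1))  # assumption, see lead-in
    alpha = f.sf(scale * md2, p, n - p)  # assumed upper-tail probability
    return alpha, 1.0 - alpha  # formula (7): c = 1 - alpha
```

Passing an explicit `scale` lets the sketch be aligned with the exact expression in the granted claims once that expression is available.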
5. The method for estimating the confidence of the detection result of the near infrared spectrum of the vehicle fuel according to any one of claims 2 to 4, wherein the significance level $\alpha$ is set to 0.01 or 0.05.
CN201910910971.2A 2019-09-25 2019-09-25 Confidence coefficient estimation method for vehicle fuel oil near infrared spectrum detection result Active CN110672552B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910910971.2A CN110672552B (en) 2019-09-25 2019-09-25 Confidence coefficient estimation method for vehicle fuel oil near infrared spectrum detection result

Publications (2)

Publication Number Publication Date
CN110672552A true CN110672552A (en) 2020-01-10
CN110672552B CN110672552B (en) 2022-06-07

Family

ID=69079401

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910910971.2A Active CN110672552B (en) 2019-09-25 2019-09-25 Confidence coefficient estimation method for vehicle fuel oil near infrared spectrum detection result

Country Status (1)

Country Link
CN (1) CN110672552B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5606164A (en) * 1996-01-16 1997-02-25 Boehringer Mannheim Corporation Method and apparatus for biological fluid analyte concentration measurement using generalized distance outlier detection
CN104713846A (en) * 2015-02-03 2015-06-17 贵州省烟草科学研究院 Modeling method for rapidly detecting content of starch in tobacco by using near infrared spectroscopy
CN107748146A (en) * 2017-10-20 2018-03-02 华东理工大学 A kind of crude oil attribute method for quick predicting based near infrared spectrum detection
CN109060711A (en) * 2018-08-28 2018-12-21 中蓝晨光成都检测技术有限公司 A method of calculating white oil content in white oil-doped organosilicon product

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
DE MAESSCHALCK, R: "On-line monitoring of powder blending with near-infrared spectroscopy", 《APPLIED SPECTROSCOPY》 *
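冯新泸: "Near-Infrared Spectroscopy and Its Application in Petroleum Product Analysis", 31 December 2002, China Petrochemical Press *
孟丹蕊: "A method for identifying abnormal near-infrared spectra based on simplified orthogonal distance", Spectroscopy and Spectral Analysis *
熊智新: "Study on model transfer for near-infrared analysis of lignin content in pulpwood", Transactions of China Pulp and Paper *
赵振英: "Study on methods for identifying and eliminating abnormal samples in near-infrared spectroscopic analysis of the oil content of oil shale", Spectroscopy and Spectral Analysis *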

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113740294A (en) * 2021-07-29 2021-12-03 北京易兴元石化科技有限公司 Gasoline/diesel oil detection and analysis method and device based on near infrared modeling
CN113740294B (en) * 2021-07-29 2024-03-08 北京易兴元石化科技有限公司 Near infrared modeling-based gasoline/diesel oil detection and analysis method and device

Also Published As

Publication number Publication date
CN110672552B (en) 2022-06-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant