CN117783032A - Method and device for determining mixture components based on infrared spectrum

Publication number
CN117783032A
CN117783032A CN202311694381.3A CN202311694381A CN117783032A CN 117783032 A CN117783032 A CN 117783032A CN 202311694381 A CN202311694381 A CN 202311694381A CN 117783032 A CN117783032 A CN 117783032A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311694381.3A
Other languages
Chinese (zh)
Inventor
周志明
隋峰
青万均
吴雪瑞
郑子为
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Csic Anpel Instrument Co ltd Hubei
Original Assignee
Csic Anpel Instrument Co ltd Hubei
Application filed by Csic Anpel Instrument Co ltd Hubei
Priority to CN202311694381.3A
Publication of CN117783032A

Landscapes

  • Investigating Or Analysing Materials By Optical Means (AREA)

Abstract

The application discloses a method and a device for determining mixture components based on infrared spectrum, belonging to the field of infrared spectrum analysis. The method comprises the following steps: acquiring a first spectrum of the mixture and a second spectrum of each candidate substance; selecting, from the second spectra, each third spectrum whose largest characteristic peak position coincides with any candidate characteristic peak position of the first spectrum; extracting a fourth spectrum from the first spectrum and a fifth spectrum from each third spectrum, wherein the wavenumber range of each fifth spectrum is the same as that of the fourth spectrum; performing multiple linear regression analysis based on the fourth spectrum and the fifth spectra, and selecting target spectra from the fifth spectra, wherein the regression coefficient of each target spectrum is greater than or equal to a first preset threshold; and determining the candidate substances corresponding to the target spectra as components of the mixture. By using multiple linear regression, the method can quickly determine the components of the mixture and thereby achieves qualitative analysis of the mixture.

Description

Method and device for determining mixture components based on infrared spectrum
Technical Field
The application relates to the technical field of infrared spectrum analysis, in particular to a method and a device for determining mixture components based on infrared spectrum.
Background
Fourier-transform infrared (FTIR) absorption spectrometers are commonly used to measure the infrared spectra of solid, liquid, and gas samples, and are widely applied in environmental protection, industrial processes, security, scientific research, and other fields. Because an FTIR spectrum serves as a fingerprint that reflects the structural information of the measured substance, users generally place high demands on the instrument's ability to analyze mixtures qualitatively. However, complex mixtures often contain multiple substances, most of which have several characteristic peaks, and these peaks may overlap, so qualitative analysis by FTIR spectroscopy is very challenging.
Disclosure of Invention
The embodiment of the application provides a method and a device for determining components of a mixture based on infrared spectrum, which are used for solving the technical problem that the prior art cannot determine each component in the mixture by utilizing Fourier infrared spectrum.
In order to solve the technical problems, the embodiment of the application discloses the following technical scheme:
in a first aspect, there is provided an infrared spectrum-based method for determining the components of a mixture, the method comprising:
acquiring a first spectrum of the mixture and acquiring a second spectrum of each candidate substance;
selecting, from the second spectra, each third spectrum whose largest characteristic peak position coincides with any candidate characteristic peak position of the first spectrum;
extracting a fourth spectrum from the first spectrum and a fifth spectrum from each third spectrum, the wavenumber range of each fifth spectrum being the same as that of the fourth spectrum;
performing multiple linear regression analysis based on the fourth spectrum and the fifth spectra, and selecting target spectra from the fifth spectra, wherein the regression coefficient of each target spectrum is greater than or equal to a first preset threshold;
and determining the candidate substances corresponding to the target spectra as components of the mixture.
With reference to the first aspect, performing multiple linear regression analysis based on the fourth spectrum and each of the fifth spectrums, and obtaining a target spectrum from a plurality of the fifth spectrums includes:
constructing a multiple linear regression model based on the fourth spectrum and each of the fifth spectrums;
obtaining regression coefficients corresponding to the fifth spectrums through a least square method;
and if the regression coefficient corresponding to each fifth spectrum is greater than or equal to the first preset threshold value, determining each fifth spectrum as each target spectrum.
With reference to the first aspect, the method further includes:
if a fifth spectrum with the regression coefficient smaller than the first preset threshold exists, removing the fifth spectrum with the regression coefficient smaller than the first preset threshold from a plurality of fifth spectrums, constructing a multiple linear regression model based on the fourth spectrum and the rest of the fifth spectrums, and acquiring the regression coefficient corresponding to the rest of the fifth spectrums;
and if the regression coefficient corresponding to each remaining fifth spectrum is greater than or equal to the first preset threshold value, determining each remaining fifth spectrum as each target spectrum.
With reference to the first aspect, the method further includes:
obtaining fitting degree of a multiple linear regression model constructed by the fourth spectrum and each target spectrum;
acquiring a contribution value of each target spectrum to the fitting degree, wherein the contribution value is used for reflecting the existence possibility of the candidate substance corresponding to the target spectrum;
removing a target spectrum with a contribution value smaller than a fourth preset threshold value;
removing a target spectrum with the regression coefficient smaller than a fifth preset threshold value;
and determining the candidate substances corresponding to the residual target spectrums as components of the mixture, and sequencing and outputting the candidate substances corresponding to the residual target spectrums according to the order of the contribution values from large to small.
With reference to the first aspect, extracting a fourth spectrum from the first spectrum and extracting a fifth spectrum from the third spectrum includes:
removing a spectrum in the interference wave number range in the first spectrum to obtain a fourth spectrum;
and removing the spectrum in the interference wave number range from the third spectrum to obtain the fifth spectrum.
With reference to the first aspect, the interference wavenumber range includes a first wavenumber range, in which the absorbance of the first spectrum exceeds a second preset threshold, and a second wavenumber range, in which the first spectrum is subject to water interference.
With reference to the first aspect, the obtaining, from each of the second spectrums, a third spectrum with a maximum characteristic peak position overlapping with any one of the candidate characteristic peak positions of the first spectrum includes:
acquiring a first wave number corresponding to each alternative characteristic peak of the first spectrum;
acquiring a second wave number corresponding to the maximum characteristic peak of each second spectrum;
comparing each of said second wavenumbers with a respective one of said first wavenumbers;
and if the deviation between the second wave number and any one of the first wave numbers is smaller than a third preset threshold value, determining a second spectrum corresponding to the second wave number as the third spectrum.
With reference to the first aspect, the acquiring a second spectrum of each candidate substance includes:
acquiring a preset substance library, wherein spectra of a plurality of substances are stored in the substance library;
selecting a spectrum of each substance having a predetermined correlation with the mixture from the library of substances;
and determining the spectrum of each substance as each second spectrum.
With reference to the first aspect, before the step of acquiring a third spectrum having a maximum characteristic peak position overlapping with any of the alternative characteristic peak positions of the first spectrum from each of the second spectrums, the method further includes:
and respectively carrying out data interception processing on the first spectrum and each second spectrum so as to enable the wave number ranges of the first spectrum and each second spectrum to be the same.
In a second aspect, there is provided an infrared spectrum-based apparatus for determining the components of a mixture, the apparatus comprising:
a spectrum acquisition module for acquiring a first spectrum of the mixture and acquiring a second spectrum of each candidate substance;
the first screening module is used for acquiring a third spectrum with the largest characteristic peak position overlapped with any alternative characteristic peak position of the first spectrum from each second spectrum;
a spectrum processing module, configured to extract a fourth spectrum from the first spectrum and extract a fifth spectrum from the third spectrum, where a wave number range of each fifth spectrum is the same as a wave number range of the fourth spectrum;
the second screening module is used for performing multiple linear regression analysis based on the fourth spectrum and each fifth spectrum, obtaining target spectrums from a plurality of fifth spectrums, and enabling regression coefficients of each target spectrum to be larger than or equal to a first preset threshold value;
and the component determining module is used for determining the candidate substances corresponding to the target spectrums as components of the mixture.
Compared with the prior art, the method for determining the mixture components based on infrared spectrum comprises the following steps: acquiring a first spectrum of the mixture and acquiring a second spectrum of each candidate substance; acquiring a third spectrum with the largest characteristic peak position overlapped with any alternative characteristic peak position of the first spectrum from each second spectrum; extracting a fourth spectrum from the first spectrum and a fifth spectrum from the third spectrum, the wave number range of each fifth spectrum being the same as the wave number range of the fourth spectrum; performing multiple linear regression analysis based on the fourth spectrum and each fifth spectrum, and acquiring target spectrums from the plurality of fifth spectrums, wherein the regression coefficient of each target spectrum is larger than or equal to a first preset threshold value; the candidate substances corresponding to the respective target spectra are determined as components of the mixture. The method for determining the components of the mixture based on the infrared spectrum can determine a plurality of alternative substances from a large number of substances by utilizing a multiple linear regression method, so that the linear combination of the second spectrums of the alternative substances can be fitted with the first spectrums of the unknown mixture to the greatest extent, thereby being capable of determining the components of the mixture more quickly and realizing qualitative analysis of the mixture.
According to the mixture component determining device based on the infrared spectrum, a plurality of candidate substances can be determined from a large number of substances, so that the linear combination of the second spectrums of the candidate substances can be fitted with the first spectrums of unknown mixtures to the greatest extent, and therefore each component of the mixture can be well determined, and qualitative analysis of the mixture is achieved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic overall flow diagram of an infrared spectrum-based mixture component determination method according to an embodiment of the present application;
FIG. 2 is a schematic flow chart of a specific example of an infrared spectrum based mixture component determination method according to an embodiment of the present application;
FIG. 3 is an exemplary schematic diagram of a fourth spectrum and respective target spectrums in an embodiment of the present application;
FIG. 4 is a schematic structural view of an infrared spectrum-based mixture component determining apparatus according to an embodiment of the present application.
Reference numerals:
401-a spectrum acquisition module; 402-a first screening module; 403-a spectral processing module; 404-a second screening module; 405-component determination module.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It will be apparent that the described embodiments are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
In the description of the present application, it should be understood that the terms "upper," "lower," "front," "rear," "left," "right," "top," "bottom," "inner," "outer," and the like indicate an orientation or a positional relationship based on that shown in the drawings, merely for convenience of description and to simplify the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the present application. Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more features. In the description of the present application, the meaning of "a plurality" is two or more, and at least one means may be one, two or more, unless explicitly defined otherwise.
In this application, a mixture may include at least two substances.
In order to solve the problem that qualitative analysis of each component in a mixture cannot be performed by using a fourier infrared absorption spectrometer, the embodiment of the application provides a method for determining the components of the mixture based on infrared spectra, and based on the spectra of the mixture and each candidate substance, a plurality of candidate substances are determined from a large number of substances by using a multiple linear regression method, so that each component of the mixture can be determined more quickly, and qualitative analysis of the mixture is realized, thereby at least part of the technical problems can be solved.
Referring to fig. 1, fig. 1 illustrates an overall flow of a method for determining a mixture component based on infrared spectroscopy according to an embodiment of the present application. The method for determining the components of the mixture based on infrared spectrum is used for determining the components of the mixture and comprises the following steps:
step 101: a first spectrum of the mixture is acquired, and a second spectrum of each candidate substance is acquired.
Specifically, in the embodiments of the application, a spectrum means an infrared spectrum stored as scatter-point data: two columns of equal length, one holding the wavenumber, which is monotonic and non-repeating, and the other holding the absorbance. In other words, a spectrum is a one-to-one mapping from wavenumber to absorbance, with wavenumber on the abscissa and absorbance on the ordinate.
In some embodiments, a second spectrum of each candidate substance may be obtained by:
Step one: obtain a preset substance library in which the spectra of a plurality of substances are stored.
Specifically, the substance library may store hundreds of pre-collected spectra of various substances, and may be continuously extended, updated, or classified.
Step two: select, from the substance library, the spectrum of each substance that has a preset correlation with the mixture.
The preset correlation may be, for example, a correlation by field or by use.
Step three: determine the spectrum of each such substance as a second spectrum.
By the method, the range of the substances to be searched can be reduced, so that the processing efficiency is further improved.
In other embodiments, the spectrum of all the substances stored in the substance library may be determined as the second spectrum, which is not particularly limited.
In some embodiments, after performing step 101 and before performing step 102, the method of embodiments of the present application further includes:
and respectively carrying out data interception processing on the first spectrum and each second spectrum so as to make the wave number ranges of the first spectrum and each second spectrum identical.
For example, the minimum abscissa value (wavenumber) of the first spectrum and of each second spectrum may be recorded, and the largest of these minima taken as $\mathrm{val}_1$; likewise, the maximum abscissa value (wavenumber) of each of these spectra may be recorded, and the smallest of these maxima taken as $\mathrm{val}_2$. The portion between $\mathrm{val}_1$ and $\mathrm{val}_2$ is then cut from the first spectrum and from each second spectrum, giving the first and second spectra that take part in the analysis.
In this way, the first spectrum and the second spectra share the same wavenumber range, and that range is the widest one common to all of them, which keeps the data consistent and avoids losing key features as far as possible.
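As an illustration only, the interception described above can be sketched in Python with NumPy. The function names `common_range` and `crop`, the representation of a spectrum as a pair of arrays (wavenumber, absorbance), and the sample grids are assumptions made for this sketch and are not part of the application.

```python
import numpy as np

def common_range(spectra):
    """Return (val1, val2) for a list of (wavenumber, absorbance) pairs.

    val1 is the largest of the per-spectrum minimum wavenumbers and
    val2 is the smallest of the per-spectrum maximum wavenumbers.
    """
    val1 = max(float(np.min(wn)) for wn, _ in spectra)
    val2 = min(float(np.max(wn)) for wn, _ in spectra)
    if val1 > val2:
        raise ValueError("the spectra share no common wavenumber range")
    return val1, val2

def crop(spectrum, val1, val2):
    """Keep only the points whose wavenumber lies in [val1, val2]."""
    wn, ab = spectrum
    keep = (wn >= val1) & (wn <= val2)
    return wn[keep], ab[keep]

# Hypothetical data: one mixture spectrum and two library spectra,
# all sampled every 2 cm^-1 but covering different ranges.
wn_mix = np.arange(600.0, 4001.0, 2.0)
wn_a = np.arange(628.0, 3951.0, 2.0)
wn_b = np.arange(400.0, 3899.0, 2.0)
spectra = [(w, np.zeros_like(w)) for w in (wn_mix, wn_a, wn_b)]

val1, val2 = common_range(spectra)
cropped = [crop(s, val1, val2) for s in spectra]
```

The sketch assumes that all spectra are sampled on the same wavenumber grid. Spectra recorded at different resolutions would additionally need resampling before regression, a point the text does not discuss.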
Step 102: from each second spectrum, a third spectrum is acquired in which the maximum characteristic peak position overlaps with any of the alternative characteristic peak positions of the first spectrum.
In some embodiments, step 102 may be performed by:
Step one: obtain the first wavenumber corresponding to each candidate characteristic peak of the first spectrum.
Specifically, the first spectrum contains a plurality of characteristic peaks; each of them is taken as a candidate characteristic peak, and the abscissa of its peak value is a first wavenumber.
Step two: obtain the second wavenumber corresponding to the largest characteristic peak of each second spectrum.
Specifically, each second spectrum contains a plurality of characteristic peaks; the one with the highest peak value is taken as the largest characteristic peak, and the abscissa of its peak value is the second wavenumber.
Step three: compare each second wavenumber with each first wavenumber.
Step four: if the deviation between a second wavenumber and any one of the first wavenumbers is smaller than a third preset threshold, determine the second spectrum corresponding to that second wavenumber as a third spectrum.
Step five: if the deviation between a second wavenumber and every first wavenumber is greater than or equal to the third preset threshold, remove the corresponding second spectrum.
Specifically, the third preset threshold may be set according to actual needs: whenever the wavenumber deviation between the largest characteristic peak position and a candidate characteristic peak position is smaller than the third preset threshold, the two peaks are considered to coincide. The third preset threshold may be, for example, 10 cm⁻¹ or less; for instance, it may be any one of 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10 cm⁻¹, or a value in a range bounded by any two of them.
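The screening of step 102 might be sketched as follows. The application does not say how characteristic peaks are detected, so the simple local-maximum test used here is an assumption, as are the function names, the Gaussian test bands, and the substance labels "A" and "B".

```python
import numpy as np

def peak_wavenumbers(wn, ab, min_height=0.0):
    """Wavenumbers of local maxima (assumed stand-in for peak detection)."""
    inner = ab[1:-1]
    is_peak = (inner > ab[:-2]) & (inner >= ab[2:]) & (inner >= min_height)
    return wn[1:-1][is_peak]

def strongest_peak(wn, ab):
    """Wavenumber at which the absorbance is highest."""
    return float(wn[np.argmax(ab)])

def screen_by_peak(mixture, candidates, tol=10.0):
    """Keep candidates whose strongest peak lies within tol of a mixture peak."""
    first_wns = peak_wavenumbers(*mixture)
    kept = {}
    for name, (wn, ab) in candidates.items():
        second_wn = strongest_peak(wn, ab)
        if first_wns.size and np.min(np.abs(first_wns - second_wn)) < tol:
            kept[name] = (wn, ab)
    return kept

# Hypothetical spectra built from Gaussian bands on a shared grid.
wn = np.arange(600.0, 3000.0, 2.0)

def band(center):
    return np.exp(-((wn - center) / 15.0) ** 2)

mixture = (wn, band(1000.0) + 0.5 * band(1700.0))
candidates = {"A": (wn, band(1004.0)), "B": (wn, band(2200.0))}
third = screen_by_peak(mixture, candidates, tol=10.0)
```

Here candidate "A" survives because its strongest peak is 4 cm⁻¹ from a mixture peak, while "B" is removed because its strongest peak has no counterpart in the mixture spectrum.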
Step 103: a fourth spectrum is extracted from the first spectrum and fifth spectra are extracted from the third spectrum, the wave number range of each fifth spectrum being the same as the wave number range of the fourth spectrum.
In some embodiments, step 103 may be performed by:
Step one: remove the part of the first spectrum that lies in the interference wavenumber range to obtain the fourth spectrum.
Illustratively, the interference wavenumber range may include a first wavenumber range, in which the absorbance of the first spectrum exceeds a second preset threshold, and a second wavenumber range, in which the first spectrum is subject to water interference. Illustratively, the second preset threshold may be set to 1.5.
Specifically, the saturated range, in which the absorbance (the ordinate y) is too high, is excluded from the first spectrum, and so is the wavenumber range subject to water interference, typically 3000 cm⁻¹ and the wavenumbers beyond it (cm⁻¹ is the wavenumber unit, the number of waves per centimetre).
Step two: remove the same interference wavenumber range from each third spectrum to obtain the fifth spectra.
Specifically, the part of each third spectrum that lies in the same interference wavenumber range as in the first spectrum is removed, and what remains is the fifth spectrum.
It will be appreciated that if the first wavenumber range lies in the middle of the spectrum, removing it splits the first spectrum and each third spectrum into two parts; the fourth spectrum is then obtained by splicing the two remaining parts of the first spectrum, and each fifth spectrum by splicing the two remaining parts of the corresponding third spectrum.
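A minimal sketch of step 103, assuming that all spectra share one wavenumber grid and that water interference is treated as everything at or above 3000 cm⁻¹. The name `keep_mask`, its parameters, and the sample values are hypothetical.

```python
import numpy as np

def keep_mask(wn, mixture_ab, saturation=1.5, water_from=3000.0):
    """True where a point lies outside both interference ranges.

    saturation: second preset threshold on the mixture absorbance.
    water_from: assumed lower edge of the water-interference range.
    """
    not_saturated = mixture_ab <= saturation
    not_water = wn < water_from
    return not_saturated & not_water

# Hypothetical first spectrum with a saturated band in the middle.
wn = np.array([1000.0, 1100.0, 1200.0, 1300.0, 3000.0, 3100.0])
first = np.array([0.2, 1.8, 1.6, 0.4, 0.3, 0.1])
third = {"substance_a": np.array([0.1, 0.9, 0.8, 0.2, 0.0, 0.0])}

# The mask is computed once from the first spectrum and reused, so the
# fourth and fifth spectra keep exactly the same wavenumbers.
keep = keep_mask(wn, first)
fourth = first[keep]
fifth = {name: ab[keep] for name, ab in third.items()}
```

Boolean indexing drops the saturated middle section and joins the two remaining parts, which corresponds to the splicing described above.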
Step 104: and performing multiple linear regression analysis based on the fourth spectrum and each fifth spectrum, and acquiring target spectrums from the plurality of fifth spectrums, wherein the regression coefficient of each target spectrum is larger than or equal to a first preset threshold value.
Illustratively, the first preset threshold may be set to 0.05.
In some embodiments, step 104 may be performed by:
Step one: construct a multiple linear regression model based on the fourth spectrum and the fifth spectra.
Specifically, the multiple linear regression model takes the fourth spectrum as the dependent variable and the fifth spectra as the independent variables, and can be expressed by the following formula (1):
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n \tag{1}$$
In formula (1), $Y$ is the fourth spectrum, $\beta_0$ to $\beta_n$ are the regression coefficients, and $X_1$ to $X_n$ are the n fifth spectra.
Step two: obtain the regression coefficient corresponding to each fifth spectrum by the least squares method.
Step three: if the regression coefficient of every fifth spectrum is greater than or equal to the first preset threshold, determine the fifth spectra as the target spectra.
Step four: if any fifth spectrum has a regression coefficient smaller than the first preset threshold, remove every such fifth spectrum, construct a multiple linear regression model based on the fourth spectrum and the remaining fifth spectra, and obtain the regression coefficients of the remaining fifth spectra.
Step five: if the regression coefficients of the remaining fifth spectra are all greater than or equal to the first preset threshold, determine the remaining fifth spectra as the target spectra.
Specifically, the first preset threshold may be set according to actual requirements.
That is, all fifth spectra are first fitted to the fourth spectrum by multiple linear regression to obtain their regression coefficients, and every fifth spectrum whose coefficient is smaller than the first preset threshold is removed. The remaining fifth spectra are then fitted to the fourth spectrum again, and those whose coefficients fall below the threshold are again removed. The iteration repeats until the regression coefficients of all fifth spectra taking part in the regression are greater than or equal to the first preset threshold.
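The iteration of step 104 can be sketched with ordinary least squares from NumPy (`numpy.linalg.lstsq`). The function names, the dictionary layout, and the synthetic Gaussian spectra are assumptions for illustration; the application only specifies the model of formula (1), the least squares method, and the threshold rule.

```python
import numpy as np

def regression_coefficients(y, columns):
    """Fit Y = b0 + b1*X1 + ... + bn*Xn and return (b1, ..., bn)."""
    design = np.column_stack([np.ones(len(y)), *columns])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1:]  # beta[0] is the intercept b0

def select_targets(fourth, fifth, threshold=0.05):
    """Refit repeatedly, dropping spectra whose coefficient is too small.

    fourth: absorbance array of the mixture.
    fifth: dict mapping substance name to absorbance array.
    Returns a dict of surviving names and their final coefficients.
    """
    names = list(fifth)
    while names:
        coefs = regression_coefficients(fourth, [fifth[n] for n in names])
        kept = [n for n, c in zip(names, coefs) if c >= threshold]
        if len(kept) == len(names):
            return dict(zip(names, coefs))
        names = kept
    return {}

# Hypothetical fifth spectra: one Gaussian band per substance.
wn = np.arange(600.0, 3000.0, 2.0)

def band(center):
    return np.exp(-((wn - center) / 15.0) ** 2)

fifth = {"A": band(1000.0), "B": band(1700.0),
         "C": band(2200.0), "D": band(1300.0)}
fourth = 0.9 * fifth["A"] + 1.0 * fifth["B"] + 0.01 * fifth["C"]

targets = select_targets(fourth, fifth, threshold=0.05)
```

In this synthetic case, "C" (coefficient about 0.01) and "D" (coefficient about 0) are removed in the first pass, and the second pass keeps "A" and "B" with coefficients close to 0.9 and 1.0.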
Step 105: the candidate substances corresponding to the respective target spectra are determined as components of the mixture.
Because the regression coefficients are computed by the least squares method within multiple linear regression, the approach is mature and reliable. Compared with other qualitative identification algorithms, such as spectrum matching (one-by-one matching against a standard spectral library), characteristic peak matching, artificial neural networks, and partial least squares, it is simpler and faster to implement and easy to apply.
In some embodiments, after performing step 105, the method of embodiments of the present application may further include the steps of:
Step one: obtain the goodness of fit of the multiple linear regression model constructed from the fourth spectrum and the target spectra.
Illustratively, the goodness of fit ($R^2$) may be used as the evaluation index of how well the multiple linear regression model fits. Specifically, it may be determined by the following formula (2):
$$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2} \tag{2}$$
In formula (2), $y_i$ is the i-th sample of the dependent variable (the i-th point of the fourth spectrum), $\hat{y}_i$ is the value predicted for the i-th sample by the linear regression (the i-th point after the multiple linear regression fit), and $\bar{y}$ is the mean of the dependent variable (that is, of the spectrum being fitted).
Step two: obtain the contribution value of each target spectrum to the goodness of fit, the contribution value reflecting how likely the candidate substance corresponding to the target spectrum is to be present.
Specifically, the contribution value of each target spectrum to the goodness of fit may be determined by the following formula (3):
$$R^2_{k} = R^2_{\mathrm{all}} - R^2_{-k} \tag{3}$$
In formula (3), $R^2_{k}$ is the $R^2$ contribution value of a single substance k, $R^2_{\mathrm{all}}$ is the overall $R^2$ calculated in the previous step, and $R^2_{-k}$ is the $R^2$ value calculated according to formula (2) from the fourth spectrum and the spectrum fitted by multiple linear regression on the other substances, with the substance under calculation excluded.
Step three: remove every target spectrum whose contribution value is smaller than a fourth preset threshold.
Specifically, the fourth preset threshold may be set as required, for example to 0.2.
Step four: remove every target spectrum whose regression coefficient is smaller than a fifth preset threshold.
Specifically, the fifth preset threshold may be set as required, for example to 0.05.
Step five: determine the candidate substances corresponding to the remaining target spectra as components of the mixture, and output them sorted by contribution value in descending order.
Through the mode, the target spectrum can be further screened, the possibility of existence of each alternative substance is intuitively displayed, and a more accurate identification effect is achieved.
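The post-screening steps might look as follows in Python. This is a sketch under two assumptions that the text does not state in recoverable form: that the goodness of fit is the usual coefficient of determination, and that a substance's contribution value is the drop in $R^2$ when that substance is left out of the regression. All names and the synthetic spectra are hypothetical.

```python
import numpy as np

def fit_r2(y, columns):
    """Least-squares fit with intercept; return (R^2, slope coefficients)."""
    design = np.column_stack([np.ones(len(y)), *columns])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    total = np.sum((y - y.mean()) ** 2)
    return 1.0 - (resid @ resid) / total, beta[1:]

def rank_components(fourth, targets, min_contribution=0.2, min_coef=0.05):
    """Return (name, contribution, coefficient) tuples, largest contribution first."""
    names = list(targets)
    r2_all, coefs = fit_r2(fourth, [targets[n] for n in names])
    rows = []
    for k, name in enumerate(names):
        others = [targets[m] for m in names if m != name]
        r2_without, _ = fit_r2(fourth, others)  # assumed leave-one-out R^2
        rows.append((name, r2_all - r2_without, coefs[k]))
    kept = [r for r in rows if r[1] >= min_contribution and r[2] >= min_coef]
    return sorted(kept, key=lambda r: r[1], reverse=True)

# Hypothetical target spectra: one Gaussian band per substance.
wn = np.arange(600.0, 3000.0, 2.0)

def band(center):
    return np.exp(-((wn - center) / 15.0) ** 2)

targets = {"A": band(1000.0), "B": band(1700.0), "C": band(2200.0)}
fourth = 0.9 * targets["A"] + 1.0 * targets["B"] + 0.1 * targets["C"]

ranked = rank_components(fourth, targets)
```

In this synthetic case "C" passes the coefficient threshold but explains too little of the variance to pass the contribution threshold, so only "B" and "A" are reported, in that order.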
In order to more clearly demonstrate the method of the embodiments of the present application, the following description is given by way of specific examples.
Referring to fig. 2, fig. 2 illustrates a specific exemplary flow of a method for determining the composition of an infrared spectrum-based mixture in accordance with an embodiment of the present application. The method specifically comprises inputting a first spectrum and a plurality of second spectrums; determining an optimal analysis range; then processing the first spectrum and the second spectrum to make the data length consistent; recording the largest characteristic peak of the second spectrum; recording each alternative characteristic peak of the first spectrum; screening out a third spectrum similar to each alternative characteristic peak of the first spectrum; further processing the analysis range, namely removing the interference range to obtain a fourth spectrum and a fifth spectrum; jointly importing the screened fifth spectrum and fourth spectrum into a multiple linear regression model to obtain regression coefficients of the fifth spectrums; if all regression coefficients meet the conditions, a preliminary result is obtained, otherwise, a fifth spectrum which does not meet the conditions is removed. After the preliminary result is obtained, the contribution value of each target spectrum can be calculated, the target spectrum with the contribution value meeting the condition is screened, the target spectrum with the regression coefficient meeting the condition is screened, and finally the rest target spectrums are ranked according to the order of the contribution values from large to small, so that the final result is output.
Referring to fig. 3, fig. 3 illustrates an example of the fourth spectrum and the target spectra in an embodiment of the present application. Illustratively, a first spectrum of the mixture to be analyzed and 343 second spectra are obtained, and the task is to determine which components the mixture contains. The optimal range is first determined to be 628.0 cm⁻¹ to 3898.0 cm⁻¹. Then 170 substances are preliminarily screened out according to the method of the embodiment of the present application, including trichlorosilane, trimethyl borate, tribenzylamine, triphenylmethyl mercaptan, triphenylmethane, and so on. The analysis range is then narrowed further to 628.0 cm⁻¹ to 3060.0 cm⁻¹. Iterative screening then leaves 8 substances in total: 1,3,5-triphenyl, 2,4-pentanediol, acrylonitrile, diethyl mercury, diethyl chlorophosphite, cyclohexanol, benzyl alcohol, and flavone. Finally, the contribution value and regression coefficient of each substance are calculated, and the final result is screened out according to the set thresholds: benzyl alcohol and acrylonitrile. The regression coefficient of benzyl alcohol is 0.8812 and its contribution value is 0.1934; the regression coefficient of acrylonitrile is 1.0089 and its contribution value is 0.1675. The fourth spectrum of the mixture is shown as curve a, the fifth spectrum of benzyl alcohol as curve b, the fifth spectrum of acrylonitrile as curve c, and the data obtained by fitting benzyl alcohol and acrylonitrile as curve d.
It can be appreciated that the infrared-spectrum-based method for determining mixture components in the embodiment of the application uses multiple linear regression to select, from a large number of substances, a plurality of candidate substances such that a linear combination of their second spectra fits the first spectrum of the unknown mixture as closely as possible, thereby determining the components of the mixture quickly and realizing qualitative analysis of the mixture.
Accordingly, referring to fig. 4, fig. 4 illustrates a block diagram of an infrared spectrum-based mixture component determination apparatus according to an embodiment of the present application. The device for determining the mixture components based on infrared spectrum provided by the embodiment of the application comprises: a spectrum acquisition module 401, a first screening module 402, a spectrum processing module 403, a second screening module 404, and a component determination module 405.
A spectrum acquisition module 401 for acquiring a first spectrum of the mixture and acquiring a second spectrum of each of the candidate substances.
A first screening module 402, configured to obtain, from the second spectra, each third spectrum, namely a second spectrum whose maximum characteristic peak position overlaps with any alternative characteristic peak position of the first spectrum.
The spectrum processing module 403 is configured to extract a fourth spectrum from the first spectrum and extract fifth spectrums from the third spectrum, where a wave number range of each fifth spectrum is the same as a wave number range of the fourth spectrum.
The second screening module 404 is configured to perform multiple linear regression analysis based on the fourth spectrum and each fifth spectrum, and obtain a target spectrum from the multiple fifth spectrums, where a regression coefficient of each target spectrum is greater than or equal to a first preset threshold.
A component determining module 405, configured to determine candidate substances corresponding to the respective target spectrums as components of the mixture.
In some embodiments, the second screening module 404 is specifically configured to:
construct a multiple linear regression model based on the fourth spectrum and each of the fifth spectra;
obtain the regression coefficient corresponding to each fifth spectrum by the least squares method; and
if the regression coefficient corresponding to every fifth spectrum is greater than or equal to the first preset threshold, determine the fifth spectra as the target spectra.
In some embodiments, the second screening module 404 is specifically further configured to:
if there are fifth spectra whose regression coefficients are smaller than the first preset threshold, remove those fifth spectra from the plurality of fifth spectra, construct a multiple linear regression model based on the fourth spectrum and the remaining fifth spectra, and obtain the regression coefficients corresponding to the remaining fifth spectra; and
if the regression coefficients corresponding to the remaining fifth spectra are all greater than or equal to the first preset threshold, determine the remaining fifth spectra as the target spectra.
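The fit-and-remove loop attributed to the second screening module can be sketched as follows. This is a minimal illustration, not the applicant's implementation: it assumes a regression without an intercept term, spectra already sampled on a shared wavenumber grid, and an arbitrary threshold of 0.05; the function name `screen_by_regression` and the synthetic Gaussian bands are hypothetical.

```python
import numpy as np


def screen_by_regression(fourth, fifths, threshold):
    """Repeat the least-squares fit, dropping candidates below the threshold.

    fourth    : 1-D array, mixture spectrum (the "fourth spectrum")
    fifths    : dict name -> 1-D array on the same grid (the "fifth spectra")
    threshold : stand-in for the "first preset threshold" (value is an assumption)
    Returns a dict name -> regression coefficient for the surviving spectra.
    """
    names = list(fifths)
    while names:
        # One column per candidate spectrum; no intercept column (assumption).
        design = np.column_stack([fifths[name] for name in names])
        coefs, *_ = np.linalg.lstsq(design, fourth, rcond=None)
        kept = [n for n, c in zip(names, coefs) if c >= threshold]
        if len(kept) == len(names):
            # Every coefficient meets the condition: these are the targets.
            return {n: float(c) for n, c in zip(names, coefs)}
        names = kept  # otherwise refit with the remaining candidates
    return {}


# Synthetic demonstration data (not taken from the patent).
x = np.linspace(0.0, 1.0, 200)


def band(center):
    """A narrow Gaussian absorption band centered at `center`."""
    return np.exp(-((x - center) / 0.03) ** 2)


candidates = {"A": band(0.3), "B": band(0.5), "C": band(0.7)}
mixture = 0.9 * candidates["A"] + 1.0 * candidates["C"]
targets = screen_by_regression(mixture, candidates, threshold=0.05)
print(targets)
```

In this synthetic case candidate "B" does not contribute to the mixture, so its coefficient falls below the threshold in the first pass and the second pass fits only "A" and "C".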
In some embodiments, the apparatus further comprises:
a third screening module, configured to: acquire the fitting degree of the multiple linear regression model constructed from the fourth spectrum and the target spectra; obtain a contribution value of each target spectrum to the fitting degree, where the contribution value reflects the likelihood that the candidate substance corresponding to the target spectrum is present; remove target spectra whose contribution value is smaller than a fourth preset threshold; remove target spectra whose regression coefficient is smaller than a fifth preset threshold; and determine the candidate substances corresponding to the remaining target spectra as components of the mixture, and output them sorted in descending order of contribution value.
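The contribution-based screening can be illustrated with the sketch below. This passage does not state how the fitting degree or the contribution value is computed, so the sketch makes two labeled assumptions: the fitting degree is taken as the coefficient of determination R², and the contribution of a target spectrum is taken as the drop in R² when that spectrum is left out of the model. The function names, thresholds, and synthetic data are hypothetical.

```python
import numpy as np


def r_squared(y, design):
    """Coefficient of determination of a no-intercept least-squares fit."""
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    residual = y - design @ coefs
    ss_res = float(residual @ residual)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def rank_by_contribution(fourth, targets, coefs, contribution_min, coefficient_min):
    """Filter targets by contribution and coefficient, then sort descending.

    ASSUMPTION: contribution = R^2(all targets) - R^2(all targets except one).
    contribution_min / coefficient_min stand in for the fourth and fifth
    preset thresholds; their values are not given in the source.
    """
    names = list(targets)
    design = np.column_stack([targets[n] for n in names])
    full_fit = r_squared(fourth, design)
    contribution = {}
    for i, name in enumerate(names):
        reduced = np.delete(design, i, axis=1)  # model without this target
        contribution[name] = full_fit - r_squared(fourth, reduced)
    kept = [n for n in names
            if contribution[n] >= contribution_min and coefs[n] >= coefficient_min]
    return sorted(((n, contribution[n]) for n in kept),
                  key=lambda item: item[1], reverse=True)


# Synthetic demonstration data (not taken from the patent).
x = np.linspace(0.0, 1.0, 200)
spectra = {
    "A": np.exp(-((x - 0.3) / 0.03) ** 2),
    "B": np.exp(-((x - 0.7) / 0.03) ** 2),
    "C": np.exp(-((x - 0.5) / 0.03) ** 2),
}
true_coefs = {"A": 0.9, "B": 1.0, "C": 0.01}
mixture = sum(true_coefs[n] * spectra[n] for n in spectra)
ranked = rank_by_contribution(mixture, spectra, true_coefs,
                              contribution_min=0.05, coefficient_min=0.1)
print([name for name, _ in ranked])
```

With these synthetic inputs the weak candidate "C" is removed and the remaining two are output with the larger contributor first.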
In some embodiments, the spectrum processing module 403 is specifically configured to:
remove the portion of the first spectrum that lies within the interference wavenumber range to obtain the fourth spectrum; and
remove the portion of each third spectrum that lies within the interference wavenumber range to obtain the corresponding fifth spectrum.
In some embodiments, the interference wavenumber range includes a first wavenumber range in which absorbance in the first spectrum exceeds a second preset threshold, and a second wavenumber range in which water in the first spectrum interferes.
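Removal of the interference wavenumber range can be sketched as a boolean mask that is built from the first spectrum and then applied to every spectrum. This is a simplified illustration: the absorbance limit of 2.0 and the water interval of 3200 to 3800 cm⁻¹ are placeholder values chosen for the demonstration, not values from the source, and the function name is hypothetical.

```python
import numpy as np


def interference_mask(wavenumbers, first, absorbance_max, water_ranges):
    """Return True where a data point is kept, False where it is removed.

    absorbance_max : stand-in for the "second preset threshold" (assumed value)
    water_ranges   : list of (low, high) wavenumber intervals treated as
                     water interference (assumed values)
    """
    keep = first <= absorbance_max  # drop points where absorbance is too high
    for low, high in water_ranges:
        keep &= ~((wavenumbers >= low) & (wavenumbers <= high))
    return keep


# Tiny synthetic example (not taken from the patent).
wavenumbers = np.array([1000.0, 1500.0, 2000.0, 3500.0])
first = np.array([0.2, 2.5, 0.4, 0.3])   # mixture spectrum
third = np.array([0.1, 0.9, 0.3, 0.2])   # one screened candidate spectrum

keep = interference_mask(wavenumbers, first, absorbance_max=2.0,
                         water_ranges=[(3200.0, 3800.0)])
fourth = first[keep]  # mixture spectrum with interference removed
fifth = third[keep]   # candidate spectrum cut with the same mask
print(keep.tolist(), fourth.tolist(), fifth.tolist())
```

Applying one mask to both spectra keeps the fourth and fifth spectra on identical wavenumber points, which the regression step relies on.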
In some embodiments, the first screening module 402 is specifically configured to:
acquire a first wavenumber corresponding to each alternative characteristic peak of the first spectrum;
acquire a second wavenumber corresponding to the maximum characteristic peak of each second spectrum;
compare each second wavenumber with each first wavenumber; and
if the deviation between a second wavenumber and any one of the first wavenumbers is smaller than a third preset threshold, determine the second spectrum corresponding to that second wavenumber as a third spectrum.
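The peak-position comparison can be sketched as follows, assuming the peak wavenumbers have already been extracted by some peak-finding step that is not shown. The candidate names, peak positions, and the 5 cm⁻¹ tolerance are hypothetical placeholders, as is the function name.

```python
import numpy as np


def screen_by_peak(first_peak_wavenumbers, second_max_peak, max_deviation):
    """Keep candidates whose strongest peak lies near any mixture peak.

    first_peak_wavenumbers : wavenumbers of the mixture's alternative peaks
    second_max_peak        : dict name -> wavenumber of the strongest peak
    max_deviation          : stand-in for the "third preset threshold"
    """
    first = np.asarray(first_peak_wavenumbers, dtype=float)
    return [name for name, wavenumber in second_max_peak.items()
            if np.any(np.abs(first - wavenumber) < max_deviation)]


# Hypothetical peak positions in cm^-1 (not taken from the patent).
mixture_peaks = [1030.0, 2230.0]
library_peaks = {"candidate_1": 1032.0, "candidate_2": 2229.0, "candidate_3": 1700.0}
third_spectra = screen_by_peak(mixture_peaks, library_peaks, max_deviation=5.0)
print(third_spectra)
```

Only the strongest peak of each library spectrum is compared, so this step is a cheap pre-filter that shrinks the candidate set before the regression.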
In some embodiments, the spectrum acquisition module 401 is specifically configured to:
acquire a preset substance library in which spectra of a plurality of substances are stored;
select from the substance library the spectra of the substances having a preset correlation with the mixture; and
determine the selected spectra as the second spectra.
In some embodiments, before obtaining, from the second spectra, the third spectra whose maximum characteristic peak position overlaps with any alternative characteristic peak position of the first spectrum, the first screening module 402 is further configured to:
perform data interception (truncation) processing on the first spectrum and each second spectrum respectively, so that the wavenumber ranges of the first spectrum and each second spectrum are the same.
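The data interception step can be sketched as cropping every spectrum to the overlap of all wavenumber ranges. The sketch assumes that all spectra share the same sampling interval and grid alignment, so that cropping alone yields equal data lengths; otherwise an interpolation step, not shown here, would be needed. The grids and names below are synthetic.

```python
import numpy as np


def truncate_to_common_range(spectra):
    """Crop each (wavenumbers, absorbance) pair to the shared wavenumber range.

    spectra : dict name -> (wavenumbers, absorbance), both 1-D arrays
    ASSUMPTION: all spectra use the same sampling interval and alignment.
    """
    low = max(wn.min() for wn, _ in spectra.values())
    high = min(wn.max() for wn, _ in spectra.values())
    cropped = {}
    for name, (wn, absorbance) in spectra.items():
        inside = (wn >= low) & (wn <= high)
        cropped[name] = (wn[inside], absorbance[inside])
    return cropped


# Synthetic grids with 2 cm^-1 spacing (not taken from the patent).
wn_mixture = np.arange(600.0, 4001.0, 2.0)
wn_candidate = np.arange(628.0, 3900.0, 2.0)
spectra = {
    "mixture": (wn_mixture, np.zeros_like(wn_mixture)),
    "candidate": (wn_candidate, np.zeros_like(wn_candidate)),
}
common = truncate_to_common_range(spectra)
print(common["mixture"][0][0], common["mixture"][0][-1], len(common["mixture"][0]))
```

After cropping, both spectra cover the same range with the same number of points, so they can be stacked directly into a regression design matrix.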
It will be appreciated that the infrared-spectrum-based device for determining mixture components according to the embodiments of the present application can select a plurality of candidate substances from a large number of substances such that a linear combination of their second spectra fits the first spectrum of an unknown mixture as closely as possible, and can thereby better determine the respective components of the mixture and realize qualitative analysis of the mixture.
The method and device for determining mixture components based on infrared spectrum provided by the present application have been described in detail above. Specific examples are used herein to illustrate the principles and embodiments of the present application, and the description of the above embodiments is only intended to help understand the technical solution and core idea of the present application. Those of ordinary skill in the art will appreciate that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents; such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.

Claims (10)

1. A method for determining the composition of a mixture based on infrared spectroscopy, the method comprising:
acquiring a first spectrum of the mixture and acquiring a second spectrum of each candidate substance;
obtaining, from each second spectrum, a third spectrum whose maximum characteristic peak position overlaps with any alternative characteristic peak position of the first spectrum;
extracting a fourth spectrum from the first spectrum and a fifth spectrum from the third spectrum, each of the fifth spectrum having the same wavenumber range as the wavenumber range of the fourth spectrum;
performing multiple linear regression analysis based on the fourth spectrum and each fifth spectrum, and acquiring target spectrums from a plurality of fifth spectrums, wherein the regression coefficient of each target spectrum is larger than or equal to a first preset threshold value;
and determining the candidate substances corresponding to the target spectrums as components of the mixture.
2. The method of determining the composition of an infrared spectrum-based mixture according to claim 1, wherein performing a multiple linear regression analysis based on the fourth spectrum and each of the fifth spectrums, obtaining a target spectrum from a plurality of the fifth spectrums, comprises:
constructing a multiple linear regression model based on the fourth spectrum and each of the fifth spectrums;
obtaining regression coefficients corresponding to the fifth spectrums through a least square method;
and if the regression coefficient corresponding to each fifth spectrum is greater than or equal to the first preset threshold value, determining each fifth spectrum as each target spectrum.
3. The method of determining the composition of an infrared spectrum based mixture of claim 2, further comprising:
if a fifth spectrum with the regression coefficient smaller than the first preset threshold exists, removing the fifth spectrum with the regression coefficient smaller than the first preset threshold from a plurality of fifth spectrums, constructing a multiple linear regression model based on the fourth spectrum and the rest of the fifth spectrums, and acquiring the regression coefficient corresponding to the rest of the fifth spectrums;
and if the regression coefficient corresponding to each remaining fifth spectrum is greater than or equal to the first preset threshold value, determining each remaining fifth spectrum as each target spectrum.
4. A method of determining the composition of an infrared spectrum based mixture according to claim 2 or 3, wherein the method further comprises:
obtaining fitting degree of a multiple linear regression model constructed by the fourth spectrum and each target spectrum;
acquiring a contribution value of each target spectrum to the fitting degree, wherein the contribution value is used for reflecting the existence possibility of the candidate substance corresponding to the target spectrum;
removing a target spectrum with a contribution value smaller than a fourth preset threshold value;
removing a target spectrum with the regression coefficient smaller than a fifth preset threshold value;
and determining the candidate substances corresponding to the residual target spectrums as components of the mixture, and sequencing and outputting the candidate substances corresponding to the residual target spectrums according to the order of the contribution values from large to small.
5. The method of determining the composition of an infrared spectrum-based mixture of claim 1, wherein extracting a fourth spectrum from the first spectrum and extracting a fifth spectrum from the third spectrum comprises:
removing a spectrum in the interference wave number range in the first spectrum to obtain a fourth spectrum;
and removing the spectrum in the interference wave number range from the third spectrum to obtain the fifth spectrum.
6. The method of claim 5, wherein the range of interference wavenumbers comprises a first range of wavenumbers in the first spectrum where absorbance exceeds a second predetermined threshold, and a second range of wavenumbers in the first spectrum where water interferes.
7. The method of determining the composition of an infrared spectrum-based mixture according to claim 1, wherein said obtaining a third spectrum having a maximum characteristic peak position overlapping any of the alternative characteristic peak positions of the first spectrum from each of the second spectra comprises:
acquiring a first wave number corresponding to each alternative characteristic peak of the first spectrum;
acquiring a second wave number corresponding to the maximum characteristic peak of each second spectrum;
comparing each of said second wavenumbers with a respective one of said first wavenumbers;
and if the deviation between the second wave number and any one of the first wave numbers is smaller than a third preset threshold value, determining a second spectrum corresponding to the second wave number as the third spectrum.
8. The method of determining the composition of an infrared spectrum-based mixture as set forth in claim 1, wherein said obtaining a second spectrum of each candidate substance includes:
acquiring a preset substance library, wherein spectra of a plurality of substances are stored in the substance library;
selecting a spectrum of each substance having a predetermined correlation with the mixture from the library of substances;
and determining the spectrum of each substance as each second spectrum.
9. The method of determining the composition of an infrared spectrum-based mixture of claim 1, wherein prior to the step of obtaining a third spectrum from each of the second spectra having a maximum characteristic peak position that overlaps any of the alternative characteristic peak positions of the first spectrum, the method further comprises:
and respectively carrying out data interception processing on the first spectrum and each second spectrum so as to enable the wave number ranges of the first spectrum and each second spectrum to be the same.
10. An infrared spectrum based mixture component determining apparatus for determining the components of the mixture, the apparatus comprising:
a spectrum acquisition module for acquiring a first spectrum of the mixture and acquiring a second spectrum of each candidate substance;
a first screening module, configured to obtain, from each second spectrum, a third spectrum whose maximum characteristic peak position overlaps with any alternative characteristic peak position of the first spectrum;
a spectrum processing module, configured to extract a fourth spectrum from the first spectrum and extract a fifth spectrum from the third spectrum, where a wave number range of each fifth spectrum is the same as a wave number range of the fourth spectrum;
the second screening module is used for performing multiple linear regression analysis based on the fourth spectrum and each fifth spectrum, obtaining target spectrums from a plurality of fifth spectrums, and enabling regression coefficients of each target spectrum to be larger than or equal to a first preset threshold value;
and the component determining module is used for determining the candidate substances corresponding to the target spectrums as components of the mixture.
CN202311694381.3A 2023-12-08 2023-12-08 Method and device for determining mixture components based on infrared spectrum Pending CN117783032A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311694381.3A CN117783032A (en) 2023-12-08 2023-12-08 Method and device for determining mixture components based on infrared spectrum

Publications (1)

Publication Number Publication Date
CN117783032A true CN117783032A (en) 2024-03-29

Family

ID=90384543

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311694381.3A Pending CN117783032A (en) 2023-12-08 2023-12-08 Method and device for determining mixture components based on infrared spectrum

Country Status (1)

Country Link
CN (1) CN117783032A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination