CN108827904B - Substance identification method, device and equipment based on terahertz spectrum and storage medium - Google Patents

Publication number
CN108827904B
Authority
CN
China
Prior art keywords
substance
absorption peak
characteristic absorption
detected
standard
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810628198.6A
Other languages
Chinese (zh)
Other versions
CN108827904A (en)
Inventor
程良伦
何伟健
罗鉴鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/17 Systems in which incident light is modified in accordance with the properties of the material investigated
    • G01N21/25 Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
    • G01N21/31 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry
    • G01N21/35 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light
    • G01N21/3581 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light using far infrared light; using Terahertz radiation
    • G01N21/3586 Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light using far infrared light; using Terahertz radiation by Terahertz time domain spectroscopy [THz-TDS]

Abstract

The application discloses a substance identification method, device, equipment and storage medium based on terahertz spectrum. The method comprises the following steps: extracting absorption peaks from the terahertz spectrum curve obtained in a single measurement of a substance to be detected; repeating the measurement multiple times, extracting the absorption peaks common to the multiple measurements as characteristic absorption peaks of the substance to be detected through a gravity algorithm, and calculating the confidence of each characteristic absorption peak to obtain a characteristic absorption peak data set; acquiring from a spectral database a standard characteristic absorption peak data set of at least one standard substance that may include the substance to be detected; matching the characteristic absorption peak data set against the standard characteristic absorption peak data set through an inverse gravitation algorithm, and calculating the similarity of the two substances; and comparing the similarity with a set similarity threshold to identify the substance to be detected. Because the characteristic absorption peaks are extracted through the gravity algorithm, their confidence is calculated, and identification is performed by matching absorption peak data sets, the influence of amplitude fluctuation on feature extraction is effectively avoided and the identification rate is improved.

Description

Substance identification method, device and equipment based on terahertz spectrum and storage medium
Technical Field
The invention relates to the field of substance identification, in particular to a substance identification method, a substance identification device, substance identification equipment and a storage medium based on terahertz spectrum.
Background
In recent years, problems such as terrorist threats and food safety have become increasingly prominent and serious, and a method for rapidly and effectively identifying and detecting substances such as explosives, foods and medicines is urgently needed.
Terahertz waves offer strong penetration, low energy and fingerprint-like spectra, and therefore have important application prospects in security inspection and nondestructive testing. The molecular vibrations and intermolecular interactions of many organic substances absorb terahertz waves of specific frequencies and produce corresponding absorption peaks, whereas infrared spectroscopy can only detect the rotation and stretching vibration of chemical bonds within molecules. Terahertz time-domain spectroscopy can capture intermolecular skeletal vibrations over a larger wavelength range and thus effectively supplements the information obtained by infrared spectroscopy. Because each substance is composed of different groups, the terahertz spectra obtained under irradiation by terahertz waves of different frequencies differ considerably between substances, mainly in the frequencies of the characteristic absorption peaks on their spectral curves. Once the features of each substance's terahertz spectrum have been extracted, substances can be distinguished by those features.
At present, the commonly used terahertz feature extraction methods are partial least squares and principal component analysis; after features are extracted by these methods, a support vector machine or another machine learning method is used to identify the substance. Alternatively, relations such as the distance or included angle between two spectral curves are calculated directly to judge whether they belong to the same substance. However, actual terahertz spectrum measurements show that under different concentrations (thicknesses), irradiation positions, environmental conditions, degrees of uniformity and so on, the spectral amplitude measured for the same substance still fluctuates greatly, while the characteristic absorption peak frequencies change only slightly. The above methods extract features from both the frequency and the amplitude of the characteristic absorption peaks, so the extracted features are strongly affected by amplitude, and the recognition rate for substances measured under different conditions is low.
Disclosure of Invention
In view of the above, the present invention provides a method, an apparatus, a device and a storage medium for identifying a substance based on terahertz spectroscopy, which can improve the identification rate of substance identification. The specific scheme is as follows:
a substance identification method based on terahertz spectrum comprises the following steps:
extracting an absorption peak of a terahertz spectrum curve measured by a substance to be detected at a single time;
repeating multiple measurements, extracting an absorption peak common to the multiple measurements as a characteristic absorption peak of the substance to be detected through a gravity algorithm, and calculating the confidence coefficient of each characteristic absorption peak to obtain a characteristic absorption peak data set;
acquiring a standard characteristic absorption peak data set of at least one standard substance possibly containing the substance to be detected from a spectral database;
matching the characteristic absorption peak data set of the substance to be detected and the standard characteristic absorption peak data set of the standard substance through an inverse gravitation algorithm, and calculating the similarity of the substance to be detected and the standard substance;
and comparing the calculated similarity with a set similarity threshold, and identifying the substance to be detected according to the comparison result.
Preferably, in the method for identifying a substance based on a terahertz spectrum provided in an embodiment of the present invention, the extracting an absorption peak from a terahertz spectrum curve obtained by measuring a substance to be detected at a single time specifically includes:
preprocessing a terahertz spectrum curve measured by a substance to be measured at a single time, and finding the positions of a plurality of absorption peaks through monotonicity;
and circularly expanding the interval of each absorption peak through a peak interval expansion algorithm, calculating the confidence coefficient of each absorption peak, and reserving the absorption peak corresponding to the confidence coefficient which is greater than a set confidence coefficient threshold value.
Preferably, in the method for identifying a substance based on a terahertz spectrum according to an embodiment of the present invention, after finding the positions of the plurality of absorption peaks by monotonicity, before cyclically extending the interval of each absorption peak by a peak interval extension algorithm, the method further includes:
removing the interference part with dense wave peak points and wave valley points in the terahertz spectrum curve by using a window sliding method;
by intercepting valley points around a peak point as a fitting interval, fitting and judging whether the slope of the peak point exceeds a slope threshold value;
if so, removing the peak point; if not, the peak point is reserved.
Preferably, in the method for identifying a substance based on a terahertz spectrum according to an embodiment of the present invention, the extracting an absorption peak common to multiple measurements as a characteristic absorption peak of the substance to be detected by using a gravity algorithm specifically includes:
in a one-dimensional coordinate system space only with gravitation, each absorption peak is regarded as an object, the position of the absorption peak on a one-dimensional coordinate axis is the frequency of the absorption peak, and the mass is the confidence coefficient of the absorption peak;
respectively placing absorption peaks with fixed frequency and confidence at each point of a one-dimensional coordinate axis, and calculating the resultant force of the absorption peaks to obtain a resultant force curve of the attraction;
and extracting a proper maximum value point on the gravitational resultant curve to serve as a characteristic absorption peak of the substance to be detected.
Preferably, in the method for identifying a substance based on a terahertz spectrum according to an embodiment of the present invention, before acquiring a standard characteristic absorption peak data set of at least one standard substance that may include the substance to be detected from a spectrum database, the method further includes:
extracting absorption peaks of terahertz spectrum curves obtained by measuring a plurality of standard substances at a single time;
repeating the multiple measurements, extracting the common absorption peak in the multiple measurements as the standard characteristic absorption peak of the plurality of standard substances through a gravity algorithm, calculating the confidence coefficient of each standard characteristic absorption peak, obtaining the data set of the standard characteristic absorption peaks of the plurality of standard substances, and storing the data set into a spectrum database.
The embodiment of the invention also provides a substance identification device based on the terahertz spectrum, which comprises:
the absorption peak extraction module is used for extracting an absorption peak of a terahertz spectrum curve measured by a substance to be detected at a single time;
the confidence coefficient calculation module is used for repeating multiple measurements, extracting an absorption peak common to the multiple measurements as a characteristic absorption peak of the substance to be detected through a gravity algorithm, and calculating the confidence coefficient of each characteristic absorption peak to obtain a characteristic absorption peak data set;
a data set acquisition module for acquiring a data set of standard characteristic absorption peaks of at least one standard substance possibly containing the substance to be detected from a spectral database;
the data set matching module is used for matching the characteristic absorption peak data set of the substance to be detected and the standard characteristic absorption peak data set of the standard substance through an inverse gravitation algorithm and calculating the similarity of the substance to be detected and the standard substance;
and the substance identification module is used for comparing the calculated similarity with a set similarity threshold value and identifying the substance to be detected according to the comparison result.
The embodiment of the invention also provides substance identification equipment based on the terahertz spectrum, which comprises a processor and a memory, wherein the processor executes a computer program stored in the memory to implement the substance identification method based on the terahertz spectrum.
Embodiments of the present invention further provide a computer-readable storage medium for storing a computer program, where the computer program is executed by a processor to implement the method for identifying a substance based on terahertz spectroscopy as provided in an embodiment of the present invention.
The invention provides a substance identification method, a device, equipment and a storage medium based on a terahertz spectrum, wherein the method comprises the following steps: extracting an absorption peak of a terahertz spectrum curve measured by a substance to be detected at a single time; repeating multiple measurements, extracting an absorption peak common to the multiple measurements as a characteristic absorption peak of the substance to be detected through a gravity algorithm, and calculating the confidence coefficient of each characteristic absorption peak to obtain a characteristic absorption peak data set; acquiring a standard characteristic absorption peak data set of at least one standard substance possibly containing a substance to be detected from a spectral database; matching the characteristic absorption peak data set of the substance to be detected and the standard characteristic absorption peak data set of the standard substance through an inverse gravitation algorithm, and calculating the similarity between the substance to be detected and the standard substance; and comparing the calculated similarity with a set similarity threshold, and identifying the substance to be detected according to the comparison result.
The method provided by the invention repeats the measurement multiple times, extracts the characteristic absorption peaks through the gravity algorithm, calculates their confidence, combines the characteristic absorption peaks and confidences into a characteristic absorption peak data set, and completes identification by matching characteristic absorption peak data sets. Throughout the process, the amplitude is used only to calculate auxiliary parameters such as the confidence, so the influence of amplitude fluctuation on feature extraction is effectively avoided, the terahertz spectrum identification technique can be applied effectively under a variety of environmental conditions and physical parameters of substances, and the identification rate of terahertz spectrum substance identification is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a flowchart of a substance identification method based on terahertz spectroscopy according to an embodiment of the present invention;
Fig. 2 shows terahertz spectrum curves of a hydrogen peroxide solution measured 5 times according to an embodiment of the present invention;
Fig. 3 is a gravitational curve of hydrogen peroxide provided by an embodiment of the present invention;
Fig. 4 is a diagram showing how the characteristic absorption peaks match the source spectral curves, provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a substance identification device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention provides a substance identification method based on terahertz spectrum, as shown in figure 1, comprising the following steps:
S101, extracting an absorption peak of a terahertz spectrum curve measured by a substance to be detected at a single time;
S102, repeating multiple measurements, extracting an absorption peak shared in the multiple measurements as a characteristic absorption peak of the substance to be detected through a gravity algorithm, and calculating the confidence coefficient of each characteristic absorption peak to obtain a characteristic absorption peak data set;
S103, acquiring a standard characteristic absorption peak data set of at least one standard substance possibly containing the substance to be detected from a spectrum database;
S104, matching the characteristic absorption peak data set of the substance to be detected and the standard characteristic absorption peak data set of the standard substance through an inverse gravitation algorithm, and calculating the similarity between the substance to be detected and the standard substance;
and S105, comparing the calculated similarity with a set similarity threshold, and identifying the substance to be detected according to the comparison result.
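Step S105 can be illustrated with a minimal sketch. The text only states that the similarity is compared with a set threshold; choosing the best-scoring standard substance, the function name `identify` and the threshold value 0.8 are all assumptions made here for illustration.

```python
def identify(similarities, threshold=0.8):
    """Return the name of the most similar standard substance, or None.

    similarities: dict mapping a standard substance name to its similarity
    with the substance to be detected (assumed to lie in [0, 1]).
    threshold: hypothetical similarity threshold, not a value from the text.
    """
    if not similarities:
        return None
    # Pick the candidate with the highest similarity (an assumed tie-break rule).
    name, score = max(similarities.items(), key=lambda kv: kv[1])
    return name if score >= threshold else None


match = identify({"substance A": 0.91, "substance B": 0.40})
no_match = identify({"substance A": 0.50})
```

Returning None when no candidate reaches the threshold keeps "not identified" distinct from a low-confidence guess.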
It should be noted that the hardware system used may be a TAS7500 terahertz spectrometer, which can measure terahertz spectra from 0.1 THz to 5 THz with a frequency resolution of $f_r = 0.0019$ THz and can provide various material parameters such as amplitude, phase, transmittance, reflectance, refractive index and dielectric constant. Preferably, the substance to be detected and the standard substance may both be pure substances.
Further, in specific implementation, the step S101 of extracting an absorption peak of the terahertz spectrum curve obtained by measuring the substance to be measured at a single time may specifically include:
firstly, preprocessing a terahertz spectrum curve measured by a substance to be detected at a single time, and finding the positions of a plurality of absorption peaks through monotonicity;
specifically, the background curve measured without the substance to be detected in place is subtracted from the terahertz spectrum curve measured with the substance in place, giving a substance spectrum curve with the background filtered out; wavelet denoising and interference filtering are then applied to this curve to obtain a smooth terahertz spectrum curve with less oscillation. All spectral curve points $(f, y)$ form a point set G, with a spectral curve frequency resolution of $f_r = 0.0019$ THz. For each spectral curve point $(f_i, y_i)$, the following conditions are checked in order:
$y_i > y_{i+1}$ and $y_i > y_{i-1}$  (1)
$y_i < y_{i+1}$ and $y_i < y_{i-1}$  (2)
All points satisfying condition (1) are peak points, and all peak points $(f_P, y_P)$ form a point set P; points satisfying condition (2) are valley points, and all valley points $(f_V, y_V)$ form a point set V.
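Conditions (1) and (2) amount to a strict comparison of each interior point with its two neighbours. A minimal sketch follows, assuming the spectrum has already been preprocessed and is held as an amplitude array; the function name `find_peaks_and_valleys` is illustrative.

```python
import numpy as np


def find_peaks_and_valleys(amps):
    """Return index arrays of peak points (condition 1) and valley points (condition 2)."""
    y = np.asarray(amps, dtype=float)
    i = np.arange(1, len(y) - 1)  # interior points only: both neighbours must exist
    peaks = i[(y[i] > y[i + 1]) & (y[i] > y[i - 1])]
    valleys = i[(y[i] < y[i + 1]) & (y[i] < y[i - 1])]
    return peaks, valleys


amps = [1.0, 3.0, 2.0, 0.5, 2.5, 4.0, 1.5]
peaks, valleys = find_peaks_and_valleys(amps)
```

The returned indices can be used to look up the frequencies $f_P$ and $f_V$ in the matching frequency array.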
Next, after finding the positions of a plurality of absorption peaks (i.e. peaks) by monotonicity in step one, the method may further include:
removing interference parts with intensive wave peak points and wave valley points in the terahertz spectrum curve by using a window sliding method;
specifically, for each peak point $(f_{Pi}, y_{Pi})$ in the point set P, a two-dimensional window is established with the peak point frequency $f_{Pi}$ as its midpoint, a fixed frequency range of size $f_T$ (e.g. $f_T = 0.025$ THz) and a fixed amplitude range of size $y_{max} - y_{min}$, where $y_{max}$ is the maximum y value among the elements of the point set G and $y_{min}$ is the minimum y value among the elements of the point set G. The two-dimensional window formed by the i-th peak point is denoted $W_i$; then
Figure GDA0002730572110000061
Determine each two-dimensional window WiWhether the following conditions are satisfied:
Figure GDA0002730572110000062
where $W_i \cap P$ is the set of peak points inside window $W_i$, $W_i \cap V$ is the set of valley points inside window $W_i$, the function number(A) is defined as the number of elements in the set A, and $\rho_T$ (e.g. $\rho_T = 5$) is the set peak-to-valley density threshold. The peak point $(f_{Pi}, y_{Pi})$ corresponding to any two-dimensional window that does not satisfy the above condition is removed from the set P;
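The window test can be sketched as follows. The exact density condition appears only as an image in the source, so this sketch assumes it has the form number($W_i \cap P$) + number($W_i \cap V$) ≤ $\rho_T$; the function name `filter_dense_peaks` is illustrative.

```python
import numpy as np


def filter_dense_peaks(peak_f, valley_f, f_T=0.025, rho_T=5):
    """Keep peaks whose window [f - f_T/2, f + f_T/2] holds at most rho_T extrema.

    Assumed condition: count of peaks plus valleys in the window <= rho_T.
    """
    peak_f = np.asarray(peak_f, dtype=float)
    valley_f = np.asarray(valley_f, dtype=float)
    extrema = np.concatenate([peak_f, valley_f])
    kept = []
    for f in peak_f:
        # The amplitude range spans y_min..y_max, so only frequency limits the window.
        count = np.count_nonzero(np.abs(extrema - f) <= f_T / 2)
        if count <= rho_T:
            kept.append(f)
    return np.array(kept)


# One isolated peak at 1.000 THz and a dense ripple around 2.0 THz.
peak_f = [1.000, 2.000, 2.004, 2.008, 2.012]
valley_f = [2.002, 2.006, 2.010]
kept = filter_dense_peaks(peak_f, valley_f)
```

In this example every peak of the ripple sees seven extrema in its window and is dropped, while the isolated peak survives.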
Thirdly, fitting and judging whether the slope of the peak point exceeds a slope threshold value or not by intercepting the valley points around the peak point as a fitting interval; if yes, removing the peak point; if not, keeping the peak point;
specifically, for each peak point $(f_{Pi}, y_{Pi})$ in the point set P, find two points $(f_{vleft}, y_{vleft})$ and $(f_{vright}, y_{vright})$ that satisfy the following conditions:
$\min |f_{vleft} - f_{Pi}|$ and $(f_{vleft}, y_{vleft}) \in V$, $f_{Pi} > f_{vleft}$
$\min |f_{vright} - f_{Pi}|$ and $(f_{vright}, y_{vright}) \in V$, $f_{Pi} < f_{vright}$
When no point $(f_{vleft}, y_{vleft})$ [or $(f_{vright}, y_{vright})$] satisfying the above condition exists, the left end point $(f_{min}, y_1)$ [or the right end point $(f_{max}, y_2)$] of the spectral curve is taken instead, i.e. $(f_{vleft}, y_{vleft}) = (f_{min}, y_1)$ [or $(f_{vright}, y_{vright}) = (f_{max}, y_2)$].
Let the point set $Q_1 = \{(x, y) \mid x \in [f_{vleft}, f_{vright}], (x, y) \in G\}$. A least squares regression is performed on all points in the two-dimensional point set $Q_1$ to fit the following function:
$y = C_1 \cdot (f - f_{Pi})^2 + y_{Pi}$
It is then judged whether the fitted function satisfies the following condition:
$C_1 \le T_a$
where $T_a$ (e.g. $T_a = -300$) is the set parabolic slope threshold. A peak point that does not satisfy the above condition is a pseudo peak, and such a peak point $(f_{Pi}, y_{Pi})$ is removed from the set P.
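The pseudo-peak test can be sketched as a one-parameter least squares fit. This assumes the fitted function is the parabola $y = C_1 (f - f_{Pi})^2 + y_{Pi}$ with its vertex pinned at the peak point, which is how the later text describes it; the function name `passes_slope_test` is illustrative.

```python
import numpy as np


def passes_slope_test(f, y, f_peak, y_peak, T_a=-300.0):
    """Fit y = C1*(f - f_peak)**2 + y_peak and test C1 <= T_a.

    Returns (is_genuine_peak, C1). With the vertex fixed, the least squares
    solution is C1 = sum(u*v) / sum(u*u), where u = (f - f_peak)**2 and
    v = y - y_peak.
    """
    f = np.asarray(f, dtype=float)
    y = np.asarray(y, dtype=float)
    u = (f - f_peak) ** 2
    v = y - y_peak
    denom = np.dot(u, u)
    if denom == 0.0:  # interval holds only the peak itself: nothing to fit
        return False, 0.0
    C1 = float(np.dot(u, v) / denom)
    return bool(C1 <= T_a), C1


f = 1.0 + 0.0019 * np.arange(-5, 6)
sharp = 2.0 - 1000.0 * (f - 1.0) ** 2  # strongly curved peak
flat = 2.0 - 50.0 * (f - 1.0) ** 2     # shallow bump
ok_sharp, c_sharp = passes_slope_test(f, sharp, 1.0, 2.0)
ok_flat, c_flat = passes_slope_test(f, flat, 1.0, 2.0)
```

A more negative $C_1$ means a more sharply curved peak, so the shallow bump fails the test and would be treated as a pseudo peak.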
After the overall waveform fitting of step three has determined whether each peak point meets the basic condition, the next step is carried out:
circularly expanding the interval of each absorption peak through a peak interval expansion algorithm, calculating the confidence coefficient of each absorption peak, and reserving the absorption peak corresponding to the confidence coefficient which is greater than a set confidence coefficient threshold;
specifically, for each peak point $(f_{Pi}, y_{Pi})$ in the point set P, define the point set $Q_2 = \{(x, y) \mid x \in [f_{Pi} - f_0, f_{Pi} + f_0], (x, y) \in G\}$, where $f_0$ (e.g. $f_0 = 0.0038$) is the set frequency difference constant. The following operations are carried out cyclically:
$Q_{2k} = \{(x, y) \mid x \in [f_{Pi} - f_0 - k \cdot f_r, f_{Pi} + f_0 + k \cdot f_r], (x, y) \in G, k \in \mathbb{N}\}$
A least squares regression is performed on the point set $Q_{2k}$, fitting the following function:
$y = C_1 \cdot (f - f_{Pi})^2 + C_2$  (3)
Let the coefficient of determination obtained by the fit be
Figure GDA0002730572110000071
The loop continues until the coefficient of determination of the expanded interval reaches a local maximum, that is, until the following condition is satisfied:
Figure GDA0002730572110000072
where k takes the values 1, 2, 3, … in turn during the loop. Let the expansion result at the end of the loop be
$Q_3 = \{(x, y) \mid x \in [f'_{left}, f'_{right}], (x, y) \in G\}$
Then, similarly to the above steps, the interval is expanded to the left and to the right separately, that is, the following expansions are each carried out cyclically:
$Q_{3k} = \{(x, y) \mid x \in [f'_{left} - k \cdot f_r, f'_{right}], (x, y) \in G, k \in \mathbb{N}\}$
$Q_{3k} = \{(x, y) \mid x \in [f'_{left}, f'_{right} + k \cdot f_r], (x, y) \in G, k \in \mathbb{N}\}$
until the coefficient of determination of the expanded interval
Figure GDA0002730572110000081
reaches a local maximum, at which point the left and right expansion loops end.
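The symmetric part of the peak interval expansion can be sketched as below. It assumes equation (3) is the parabola $y = C_1 (f - f_{Pi})^2 + C_2$, that the fit quality is the ordinary coefficient of determination $R^2$, and that the loop stops at the first decrease of $R^2$; the stopping conditions themselves appear only as images in the source. The names `fit_parabola_r2` and `expand_symmetric` and the tolerance `tol` are illustrative.

```python
import numpy as np


def fit_parabola_r2(f, y, f_peak):
    """Least squares fit of y = C1*(f - f_peak)**2 + C2; returns (C1, C2, R^2)."""
    f = np.asarray(f, dtype=float)
    y = np.asarray(y, dtype=float)
    u = (f - f_peak) ** 2
    A = np.column_stack([u, np.ones_like(u)])
    (C1, C2), *_ = np.linalg.lstsq(A, y, rcond=None)
    ss_res = np.sum((y - (C1 * u + C2)) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return float(C1), float(C2), float(r2)


def expand_symmetric(f, y, i_peak, half0=2, max_k=200, tol=1e-9):
    """Grow [i_peak-half0-k, i_peak+half0+k] until R^2 stops increasing.

    half0=2 samples corresponds to f_0 = 0.0038 at f_r = 0.0019 THz.
    Returns (lo_index, hi_index, R^2) of the last interval before R^2 dropped.
    """
    best = None
    for k in range(max_k):
        lo, hi = i_peak - half0 - k, i_peak + half0 + k
        if lo < 0 or hi >= len(f):
            break
        r2 = fit_parabola_r2(f[lo:hi + 1], y[lo:hi + 1], f[i_peak])[2]
        if best is not None and r2 < best[2] - tol:
            break  # the previous interval was a local maximum of R^2
        best = (lo, hi, r2)
    return best


# Synthetic peak: a parabola within 10 samples of the centre, flat shoulders outside.
f = 1.0 + 0.0019 * np.arange(61)
i_peak = 30
y = 5.0 - 1000.0 * (f - f[i_peak]) ** 2
y[:20] = y[20]
y[41:] = y[40]
lo, hi, r2 = expand_symmetric(f, y, i_peak)
```

The separate left and right expansions described above would follow the same pattern, moving only one end of the interval per iteration.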
Let the final expansion result be $Q_{end} = \{(x, y) \mid x \in [f_{left}, f_{right}], (x, y) \in G\}$, and let the coefficient of determination obtained by fitting over this interval be
Figure GDA0002730572110000082
The peak amplitude after fitting is $C_{2end}$, the coefficient of the quadratic term of the parabola is $C_{1end}$, and the left and right end points are $(f_{left}, y_{left})$ and $(f_{right}, y_{right})$ respectively. According to the curvature formula
Figure GDA0002730572110000083
substituting the parabola equation (3) into this formula gives the curvature of the parabola at the peak point:
$k = |2 \cdot C_{1end}|$
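For reference, the step from the curvature formula to this result can be written out as follows. The curvature formula itself appears only as an image in the source, so the standard plane-curve curvature formula is assumed here, together with the parabola $y = C_1 (f - f_{Pi})^2 + C_2$ for equation (3).

```latex
\kappa(f) = \frac{|y''(f)|}{\left(1 + y'(f)^2\right)^{3/2}}, \qquad
y'(f) = 2 C_{1end}\,(f - f_{Pi}), \qquad y''(f) = 2 C_{1end}

\kappa(f_{Pi}) = \frac{|2 C_{1end}|}{\left(1 + 0\right)^{3/2}} = |2 C_{1end}|
```

The first derivative vanishes at the vertex $f = f_{Pi}$, so the denominator equals 1 and only the quadratic coefficient remains.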
From this, the basic parameters of the absorption peak such as wave width, wave height, confidence level, etc. can be calculated as follows:
Figure GDA0002730572110000084
where $C_1$ (e.g. $C_1 = 300$), $C_2$ (e.g. $C_2 = 1$) and $C_3$ (e.g. $C_3 = 1$) are all constants to be determined.
A sigmoid function is applied to the confidence to obtain the final confidence $\beta$ ($0 \le \beta \le 100\%$), namely
$\beta = f_S(\beta_0) \times 100\%$
where $f_S(\cdot)$ is a given sigmoid function, with
Figure GDA0002730572110000085
The peak points whose confidence satisfies $\beta < \beta_T$ are removed from the point set P, where $\beta_T$ (e.g. $\beta_T = 20\%$) is a given confidence threshold. The points remaining in the point set P are all characteristic absorption peaks.
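The mapping and thresholding can be sketched as follows. The specific sigmoid function is given only as an image in the source, so the standard logistic function is assumed here; `final_confidence` and `keep_confident` are illustrative names, and the raw confidence values in the example are made up.

```python
import math


def final_confidence(beta0):
    """Map a raw confidence to a percentage in (0, 100) with a logistic function (assumed form)."""
    # Two branches keep math.exp from overflowing for large |beta0|.
    if beta0 >= 0:
        s = 1.0 / (1.0 + math.exp(-beta0))
    else:
        e = math.exp(beta0)
        s = e / (1.0 + e)
    return 100.0 * s


def keep_confident(peaks, beta_T=20.0):
    """peaks: list of (frequency, raw_confidence). Keep those with beta >= beta_T (in percent)."""
    kept = []
    for f, beta0 in peaks:
        beta = final_confidence(beta0)
        if beta >= beta_T:
            kept.append((f, beta))
    return kept


kept = keep_confident([(0.5, 2.0), (1.2, -3.0), (2.0, 0.0)])
kept_freqs = [f for f, _ in kept]
```

Here the peak at 1.2 THz maps to roughly 4.7% and is removed, while the other two stay above the 20% threshold.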
Next, in practical application, the step S102 repeats the multiple measurements and extracts an absorption peak common to the multiple measurements as a characteristic absorption peak of the substance to be detected by using a gravity algorithm, and calculates a confidence of each characteristic absorption peak to obtain a characteristic absorption peak data set, which may specifically include the following steps:
in a specific implementation, steps one to four are repeated n times (e.g. n = 5) on the substance to be detected, giving the absorption peaks extracted from the n measurement results. The 5 spectral curves are plotted together in Fig. 2, where the horizontal axis is the terahertz frequency (THz) and the vertical axis is the amplitude (dB). Let the two-dimensional set obtained by the i-th extraction be $P_i$; then $P_i = \{(f_1, \beta_1), (f_2, \beta_2), \ldots, (f_m, \beta_m)\}$, where $f_m$ is the m-th extracted absorption peak and $\beta_m$ is the confidence corresponding to the m-th absorption peak. Let $P_W$ be the set of all absorption peaks extracted from the substance to be detected; then $P_W = P_1 + P_2 + P_3 + \ldots + P_n$.
It is assumed that in a one-dimensional coordinate system space where only gravity exists, each absorption peak is an object, the position of the absorption peak on the one-dimensional coordinate axis is the frequency f thereof, and the mass is the confidence degree β thereof. The calculation formula of the magnitude of the gravity is set as
Figure GDA0002730572110000091
Figure GDA0002730572110000092
where F is the magnitude of the gravitational force, $f_1$ and $f_2$ are the frequencies of the two absorption peaks, r is the frequency distance between the two absorption peaks, and y(r) is a given monotonically decreasing function of the frequency distance r.
Specifically, first, an absorption peak whose frequency is set to a standard frequency $f_0$ is placed in turn at each point of the one-dimensional coordinate axis (resolution $f_r = 0.0019$ THz), and the resultant force acting on it is calculated to obtain the gravitational resultant force curve $F_G = f_G(f)$; then, the resultant force in the gravitational curve is normalized, and interference is removed through wavelet denoising and filtering; finally, all peak points and valley points of the gravitational resultant force curve are found by traversal, the overall waveform fitting method of step three is used to judge whether each peak point meets the condition, and the peak points that do not meet the condition are discarded.
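The construction of the gravitational resultant force curve can be sketched as below. The force formula and the function y(r) appear only as images in the source, so this sketch assumes a force of the form $\beta_1 \beta_2\, y(r)$ with a Gaussian y(r), a probe peak of unit mass, and a resultant taken as the sum of force magnitudes; the wavelet denoising step is omitted. The name `gravity_curve` and the width `sigma` are illustrative.

```python
import numpy as np


def gravity_curve(peak_f, peak_beta, grid, sigma=0.01):
    """Normalized sum of attraction magnitudes felt by a unit-mass probe at each grid frequency.

    Assumed force law: F = beta_probe * beta_peak * exp(-(r / sigma)**2), beta_probe = 1.
    """
    peak_f = np.asarray(peak_f, dtype=float)
    peak_beta = np.asarray(peak_beta, dtype=float)
    grid = np.asarray(grid, dtype=float)
    r = np.abs(grid[:, None] - peak_f[None, :])              # frequency distances
    pull = peak_beta[None, :] * np.exp(-(r / sigma) ** 2)    # monotonically decreasing in r
    F = pull.sum(axis=1)
    return F / F.max() if F.max() > 0 else F


# Pooled peaks P_W from several measurements: three near 1.0 THz, one stray at 1.5 THz.
pooled_f = [0.999, 1.000, 1.002, 1.500]
pooled_beta = [90.0, 80.0, 75.0, 30.0]
grid = np.arange(0.9, 1.6, 0.0019)
curve = gravity_curve(pooled_f, pooled_beta, grid)
f_star = grid[np.argmax(curve)]
```

Peaks that recur across measurements reinforce one another, so the curve has its highest maximum near 1.0 THz, while the stray peak at 1.5 THz produces only a small bump.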
Let the obtained characteristic absorption peak set be $P_Z$, with $P_Z = \{f_1, f_2, \ldots, f_m\}$, where $f_m$ is the m-th characteristic absorption peak obtained by this method. Let the final fitting interval in step three be $[f_{left}, f_{right}]$, let $y_m$, $y_{left}$, $y_{right}$ be the amplitudes of the corresponding points, and let the gravitational force of the absorption peak on the normalized gravitational curve be $F_m$. The hydrogen peroxide gravitational curve is shown in Fig. 3, where the horizontal axis is the terahertz wave frequency (THz), the vertical axis is the normalized gravitational magnitude, and the peaks marked by circles are the characteristic absorption peaks. The confidence $\beta_m$ of the m-th characteristic absorption peak is calculated as follows:
$\beta_m = F_m \cdot g(\min(y_m - y_{left}, y_m - y_{right}))$
Figure GDA0002730572110000093
where $g(\cdot)$ is a given monotonically decreasing attenuation coefficient function, $\beta_m$ is the confidence of the m-th characteristic absorption peak, $F_m$ is the magnitude of the gravitational force of the absorption peak on the normalized gravitational curve, and $y_m$, $y_{left}$, $y_{right}$ are the amplitudes of the corresponding points in the fitting interval. The characteristic absorption peak data set $P_{final}$ of the substance to be detected is finally obtained:
$P_{final} = \{(f_1, \beta_1), (f_2, \beta_2), \ldots, (f_m, \beta_m)\}$
The extracted characteristic absorption peaks are then plotted on the original spectral curves, as shown in Fig. 4. It can be seen that the method effectively extracts the absorption peaks shared by the multiple spectral curves.
Further, before the step S103 of obtaining the standard characteristic absorption peak data set of at least one standard substance possibly containing the substance to be tested from the spectrum database, the method may further include: extracting absorption peaks of terahertz spectrum curves obtained by measuring a plurality of standard substances at a single time; repeating the multiple measurements, extracting the common absorption peak in the multiple measurements as the standard characteristic absorption peak of the plurality of standard substances through a gravity algorithm, calculating the confidence coefficient of each standard characteristic absorption peak, obtaining the data set of the standard characteristic absorption peaks of the plurality of standard substances, and storing the data set into a spectrum database.
It should be noted that the process of obtaining the standard characteristic absorption peak data set in the spectrum database is substantially the same as that of steps S101 to S102, that is, terahertz spectrum curve data of a plurality of standard substances can be measured in advance by the feature extraction method of steps S101 to S102 of the present invention, and the standard characteristic absorption peak data thereof is extracted and stored in the spectrum database, and the specific process is not described herein again.
In a specific implementation, h standard substances that may include the substance to be detected can be selected from the spectrum database according to physical properties, and their characteristic absorption peak data sets are obtained and respectively recorded as
Figure GDA0002730572110000101
Figure GDA0002730572110000102
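As a rough illustration of selecting candidate standard substances by physical properties, the sketch below models the spectrum database as a list of records. The record layout, the property key `state`, the substance names and the `select_candidates` helper are all hypothetical, and the peak values are made up; the text does not specify how the database is organized.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Peak = Tuple[float, float]  # (frequency in THz, confidence)


@dataclass
class StandardEntry:
    """One standard substance stored in the (hypothetical) spectrum database."""
    name: str
    properties: Dict[str, str]
    peaks: List[Peak] = field(default_factory=list)


def select_candidates(database: List[StandardEntry],
                      **required: str) -> List[StandardEntry]:
    """Keep the entries whose properties match every required key/value pair."""
    return [entry for entry in database
            if all(entry.properties.get(k) == v for k, v in required.items())]


# Illustrative database contents only.
database = [
    StandardEntry("standard_A", {"state": "solid"}, [(1.00, 0.9), (1.45, 0.7)]),
    StandardEntry("standard_B", {"state": "liquid"}, [(0.80, 0.6)]),
    StandardEntry("standard_C", {"state": "solid"}, [(2.10, 0.8)]),
]
candidates = select_candidates(database, state="solid")
print([entry.name for entry in candidates])
```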
Further, in step S104, the characteristic absorption peak data set of the substance to be detected and the standard characteristic absorption peak data set of the standard substance are matched through an inverse gravitation algorithm, and the similarity between the substance to be detected and the standard substance is calculated. This may specifically include the following steps:
The standard characteristic absorption peak data set of a possible standard substance is matched against the characteristic absorption peaks of the substance to be detected by using the inverse gravitation algorithm, which is described as follows:
For each absorption peak (Figure GDA0002730572110000104) in the standard characteristic absorption peak data set (Figure GDA0002730572110000103), the absorption peak frequency (Figure GDA0002730572110000105) is taken in turn as the midpoint, the frequency range is fixed to (Figure GDA0002730572110000106) (e.g. Figure GDA0002730572110000107), and the amplitude range is fixed to [0, 100%], so as to establish a two-dimensional window. The two-dimensional window formed by the jth standard characteristic absorption peak is recorded as (Figure GDA0002730572110000108), so that (Figure GDA0002730572110000109). The set of absorption peaks of the substance to be detected that fall within the two-dimensional window can then be written as (Figure GDA0002730572110000111).
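The windowing step can be sketched in Python as follows. Because the exact window formulas are not recoverable from the text, this sketch assumes a symmetric frequency window of half-width `half_width` around the standard peak and treats the amplitude axis as the confidence, normalized to [0, 1]; the function name and the numbers are illustrative only.

```python
from typing import List, Sequence, Tuple

Peak = Tuple[float, float]  # (frequency in THz, confidence in [0, 1])


def window_peaks(standard_peak: Peak, sample_peaks: Sequence[Peak],
                 half_width: float) -> List[Peak]:
    """Return the sample peaks inside the window centered on the standard peak.

    Assumption: the window spans [f_s - half_width, f_s + half_width] in
    frequency and [0, 1] (that is, 0 to 100%) on the second axis.
    """
    f_s, _ = standard_peak
    return [(f, b) for f, b in sample_peaks
            if abs(f - f_s) <= half_width and 0.0 <= b <= 1.0]


# Illustrative values only.
standard = (1.00, 0.9)
sample = [(0.97, 0.6), (1.02, 0.8), (1.40, 0.7)]
W_j = window_peaks(standard, sample, half_width=0.05)
print(W_j)
```

Only the sample peaks that fall inside the window take part in the later matching for this standard peak.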
Denote each element of $W_j$ as $(f_j, \beta_j)$; the mean variance $s^2$ is then calculated as follows:
Figure GDA0002730572110000112
where $D(x)$ is the variance of all data in $x$, and $y(r)$ is a given monotonically decreasing function of the frequency distance $r$, the same as $y(r)$ in step S102. The attenuation coefficient $\alpha$ is obtained from $s^2$ by a functional mapping, i.e.
$$\alpha = t(s^2)$$
where $t(s^2)$ is a given monotonically increasing function of $s^2$.
Then find an element within $W_j$ such that:
Figure GDA0002730572110000113
The matching coefficient $C_{Mj}$ of the jth characteristic absorption peak is calculated as follows:
Figure GDA0002730572110000114
Figure GDA0002730572110000115
where $z(r)$, like $y(r)$, is a given monotonically decreasing function of the frequency distance $r$.
The matching coefficients of all characteristic absorption peaks are then weighted and averaged to obtain the similarity between the standard substance and the substance to be detected:
Figure GDA0002730572110000116
where k is the number of characteristic absorption peaks of the standard substance.
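The weighted averaging can be illustrated with a short Python sketch. The weights are not recoverable from the text, so this sketch assumes one non-negative weight per standard characteristic absorption peak (for example its confidence); the function name and the numbers are hypothetical.

```python
from typing import Sequence


def weighted_similarity(matching_coeffs: Sequence[float],
                        weights: Sequence[float]) -> float:
    """Weighted average of the k matching coefficients of one standard substance."""
    if len(matching_coeffs) != len(weights) or not weights:
        raise ValueError("need one weight per matching coefficient")
    total = sum(weights)
    if total == 0:
        return 0.0  # no usable weight: treat as no similarity
    return sum(c * w for c, w in zip(matching_coeffs, weights)) / total


# Illustrative values: k = 3 standard peaks, the third one found no match.
sim = weighted_similarity([0.9, 0.6, 0.0], [1.0, 0.5, 0.5])
print(sim)
```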
Finally, the above steps are repeated until all possible standard substances have been compared. Denoting the similarity of the ith possible substance as $\mathrm{similarity}_i$, judge whether the following inequality holds:
$$\mathrm{similarity}_{max} = \max(\mathrm{similarity}_1, \mathrm{similarity}_2, \ldots, \mathrm{similarity}_i) > T_{sim}$$
where $T_{sim}$ is the set similarity threshold (e.g. $T_{sim}$ of 60% or more). When the inequality holds, the possible standard substances include one suspected of being the substance to be detected, and the substance to be detected is considered to be the standard substance for which $\mathrm{similarity}_i = \mathrm{similarity}_{max}$. If the inequality does not hold, the possible standard substances are considered not to include the substance to be detected. That is, if the maximum of the calculated similarities is greater than or equal to the set similarity threshold $T_{sim}$, the standard substances include the substance to be detected; if the maximum of the calculated similarities is less than the set similarity threshold $T_{sim}$, none of the standard substances is the substance to be detected.
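The final decision can be sketched in a few lines of Python. The sketch follows the strict comparison of the inequality (the restatement in the text also mentions "greater than or equal to", so the boundary case is a choice made here); the function name, substance names and similarity values are hypothetical.

```python
from typing import Dict, Optional


def identify(similarities: Dict[str, float], t_sim: float = 0.6) -> Optional[str]:
    """Return the best-matching standard substance, or None if none passes.

    Assumption: the strict comparison max(similarity) > t_sim is used.
    """
    if not similarities:
        return None
    name, best = max(similarities.items(), key=lambda item: item[1])
    return name if best > t_sim else None


# Illustrative values only.
print(identify({"standard_A": 0.82, "standard_B": 0.41}))  # passes the threshold
print(identify({"standard_A": 0.30, "standard_B": 0.41}))  # no candidate passes
```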
Based on the same inventive concept, embodiments of the present invention further provide a substance identification device, and as the principle of solving the problem of the substance identification device is similar to that of the aforementioned substance identification method based on the terahertz spectrum, the implementation of the substance identification device can refer to the implementation of the substance identification method based on the terahertz spectrum, and repeated details are omitted.
In specific implementation, the substance identification device provided in the embodiment of the present invention, as shown in fig. 5, specifically includes:
the absorption peak extraction module 11 is used for extracting an absorption peak of a terahertz spectrum curve measured by a substance to be detected at a single time;
the confidence coefficient calculation module 12 is used for repeating multiple measurements, extracting an absorption peak common to the multiple measurements as a characteristic absorption peak of the substance to be detected through a gravity algorithm, and calculating the confidence coefficient of each characteristic absorption peak to obtain a characteristic absorption peak data set;
a data set acquisition module 13, configured to acquire a standard characteristic absorption peak data set of at least one standard substance that may include a substance to be detected from a spectrum database;
the data set matching module 14 is used for matching the characteristic absorption peak data set of the substance to be detected and the standard characteristic absorption peak data set of the standard substance through an inverse gravitation algorithm, and calculating the similarity between the substance to be detected and the standard substance;
and the substance identification module 15 is used for comparing the calculated similarity with a set similarity threshold value and identifying the substance to be detected according to the comparison result.
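The interaction of the five modules can be pictured with the following Python sketch, in which each module is a callable supplied by the caller. The class name, field names and signatures are hypothetical and serve only to show the data flow from measured curves to an identification result; the internals of each module are left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Peak = Tuple[float, float]  # (frequency in THz, confidence)


@dataclass
class SubstanceIdentifier:
    # Module 11: absorption peak frequencies of one measured curve.
    extract_peaks: Callable[[Sequence[float]], List[float]]
    # Module 12: common peaks and confidences over repeated measurements.
    build_feature_set: Callable[[List[List[float]]], List[Peak]]
    # Module 13: standard data sets loaded from the spectrum database.
    load_standards: Callable[[], Dict[str, List[Peak]]]
    # Module 14: similarity between the sample and one standard data set.
    match: Callable[[List[Peak], List[Peak]], float]
    # Module 15: similarity threshold used for the final comparison.
    threshold: float = 0.6

    def identify(self, measurements: Sequence[Sequence[float]]) -> Optional[str]:
        """Run the five modules in order and return the identified name, if any."""
        per_scan = [self.extract_peaks(curve) for curve in measurements]
        features = self.build_feature_set(per_scan)
        scores = {name: self.match(features, standard)
                  for name, standard in self.load_standards().items()}
        if not scores:
            return None
        best = max(scores, key=scores.get)
        return best if scores[best] > self.threshold else None
```

Keeping the modules as separate callables mirrors the module boundaries listed above, so each stage can be replaced or tested on its own.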
In the substance identification device provided by the embodiment of the invention, through the interaction of the five modules, the influence of amplitude fluctuation on feature extraction can be effectively avoided, the terahertz spectrum identification technology can be effectively applied to various environmental conditions and physical parameters of substances, and the identification rate of terahertz spectrum substance identification is improved.
For more specific working processes of the modules, reference may be made to corresponding contents disclosed in the foregoing embodiments, and details are not repeated here.
Correspondingly, the embodiment of the invention also discloses a substance identification device, which comprises a processor and a memory; wherein the processor implements the terahertz spectrum based substance identification method disclosed in the foregoing embodiments when executing the computer program stored in the memory.
For more specific processes of the above method, reference may be made to corresponding contents disclosed in the foregoing embodiments, and details are not repeated here.
Further, the present invention also discloses a computer readable storage medium for storing a computer program; the computer program is executed by a processor to realize the terahertz spectrum-based substance identification method disclosed in the foregoing.
For more specific processes of the above method, reference may be made to corresponding contents disclosed in the foregoing embodiments, and details are not repeated here.
The embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same or similar parts among the embodiments are referred to each other. The device, the equipment and the storage medium disclosed by the embodiment correspond to the method disclosed by the embodiment, so that the description is relatively simple, and the relevant points can be referred to the method part for description.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The embodiment of the invention provides a substance identification method, a substance identification device, substance identification equipment and a storage medium based on terahertz spectrum, wherein the method comprises the following steps: extracting an absorption peak of a terahertz spectrum curve measured by a substance to be detected at a single time; repeating multiple measurements, extracting an absorption peak common to the multiple measurements as a characteristic absorption peak of the substance to be detected through a gravity algorithm, and calculating the confidence coefficient of each characteristic absorption peak to obtain a characteristic absorption peak data set; acquiring a standard characteristic absorption peak data set of at least one standard substance possibly containing a substance to be detected from a spectral database; matching the characteristic absorption peak data set of the substance to be detected and the standard characteristic absorption peak data set of the standard substance through an inverse gravitation algorithm, and calculating the similarity between the substance to be detected and the standard substance; and comparing the calculated similarity with a set similarity threshold, and identifying the substance to be detected according to the comparison result. 
By repeating the measurement multiple times, extracting the characteristic absorption peaks through the gravity algorithm, calculating their confidences, combining the characteristic absorption peaks and confidences into a characteristic absorption peak data set, and completing identification by matching characteristic absorption peak data sets, the method uses the amplitude only to calculate auxiliary parameters such as the confidence. The influence of amplitude fluctuation on feature extraction can thus be effectively avoided, the terahertz spectrum identification technology can be effectively applied under various environmental conditions and physical parameters of substances, and the identification rate of terahertz spectrum substance identification is improved.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The method, the device, the apparatus and the storage medium for identifying a substance based on terahertz spectrum provided by the present invention are described in detail above, and a specific example is applied in the present document to illustrate the principle and the implementation of the present invention, and the description of the above embodiment is only used to help understanding the method and the core idea of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (6)

1. A substance identification method based on terahertz spectrum is characterized by comprising the following steps:
preprocessing a terahertz spectrum curve measured by a substance to be measured at a single time, and finding the positions of a plurality of absorption peaks through monotonicity;
removing the interference part with dense wave peak points and wave valley points in the terahertz spectrum curve by using a window sliding method;
by intercepting valley points around a peak point as a fitting interval, fitting and judging whether the slope of the peak point exceeds a slope threshold value;
if so, removing the peak point; if not, the peak point is reserved;
circularly expanding the interval of each absorption peak through a peak interval expansion algorithm, calculating the confidence coefficient of each absorption peak, and reserving the absorption peak corresponding to the confidence coefficient which is greater than a set confidence coefficient threshold;
repeating multiple measurements, extracting an absorption peak common to the multiple measurements as a characteristic absorption peak of the substance to be detected through a gravity algorithm, and calculating the confidence coefficient of each characteristic absorption peak to obtain a characteristic absorption peak data set;
acquiring a standard characteristic absorption peak data set of at least one standard substance possibly containing the substance to be detected from a spectral database;
matching the characteristic absorption peak data set of the substance to be detected and the standard characteristic absorption peak data set of the standard substance through an inverse gravitation algorithm, and calculating the similarity of the substance to be detected and the standard substance;
and comparing the calculated similarity with a set similarity threshold, and identifying the substance to be detected according to the comparison result.
2. The terahertz spectrum-based substance identification method as claimed in claim 1, wherein an absorption peak common to a plurality of measurements is extracted as a characteristic absorption peak of the substance to be detected by a gravity algorithm, and specifically comprises:
in a one-dimensional coordinate system space only with gravitation, each absorption peak is regarded as an object, the position of the absorption peak on a one-dimensional coordinate axis is the frequency of the absorption peak, and the mass is the confidence coefficient of the absorption peak;
respectively placing absorption peaks with fixed frequency and confidence at each point of a one-dimensional coordinate axis, and calculating the resultant force of the absorption peaks to obtain a resultant force curve of the attraction;
and extracting a proper maximum value point on the gravitational resultant curve to serve as a characteristic absorption peak of the substance to be detected.
3. The method for identifying a substance based on terahertz spectrum according to claim 1, further comprising, before acquiring a standard characteristic absorption peak data set of at least one standard substance possibly containing the substance to be detected from a spectrum database, the following steps:
extracting absorption peaks of terahertz spectrum curves obtained by measuring a plurality of standard substances at a single time;
repeating the multiple measurements, extracting the common absorption peak in the multiple measurements as the standard characteristic absorption peak of the plurality of standard substances through a gravity algorithm, calculating the confidence coefficient of each standard characteristic absorption peak, obtaining the data set of the standard characteristic absorption peaks of the plurality of standard substances, and storing the data set into a spectrum database.
4. A substance identification device based on terahertz spectroscopy, comprising:
the absorption peak extraction module is used for preprocessing a terahertz spectrum curve measured by a substance to be detected at a single time and finding the positions of a plurality of absorption peaks through monotonicity; removing the interference part with dense wave peak points and wave valley points in the terahertz spectrum curve by using a window sliding method; by intercepting valley points around a peak point as a fitting interval, fitting and judging whether the slope of the peak point exceeds a slope threshold value; if so, removing the peak point; if not, the peak point is reserved; circularly expanding the interval of each absorption peak through a peak interval expansion algorithm, calculating the confidence coefficient of each absorption peak, and reserving the absorption peak corresponding to the confidence coefficient which is greater than a set confidence coefficient threshold;
the confidence coefficient calculation module is used for repeating multiple measurements, extracting an absorption peak common to the multiple measurements as a characteristic absorption peak of the substance to be detected through a gravity algorithm, and calculating the confidence coefficient of each characteristic absorption peak to obtain a characteristic absorption peak data set;
a data set acquisition module for acquiring a data set of standard characteristic absorption peaks of at least one standard substance possibly containing the substance to be detected from a spectral database;
the data set matching module is used for matching the characteristic absorption peak data set of the substance to be detected and the standard characteristic absorption peak data set of the standard substance through an inverse gravitation algorithm and calculating the similarity of the substance to be detected and the standard substance;
and the substance identification module is used for comparing the calculated similarity with a set similarity threshold value and identifying the substance to be detected according to the comparison result.
5. A terahertz spectrum-based substance identification apparatus comprising a processor and a memory, wherein the processor implements the terahertz spectrum-based substance identification method according to any one of claims 1 to 3 when executing a computer program stored in the memory.
6. A computer-readable storage medium for storing a computer program, wherein the computer program, when executed by a processor, implements the method for substance identification based on terahertz spectroscopy of any one of claims 1 to 3.
CN201810628198.6A 2018-06-19 2018-06-19 Substance identification method, device and equipment based on terahertz spectrum and storage medium Active CN108827904B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810628198.6A CN108827904B (en) 2018-06-19 2018-06-19 Substance identification method, device and equipment based on terahertz spectrum and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810628198.6A CN108827904B (en) 2018-06-19 2018-06-19 Substance identification method, device and equipment based on terahertz spectrum and storage medium

Publications (2)

Publication Number Publication Date
CN108827904A CN108827904A (en) 2018-11-16
CN108827904B true CN108827904B (en) 2021-01-26

Family

ID=64142712

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810628198.6A Active CN108827904B (en) 2018-06-19 2018-06-19 Substance identification method, device and equipment based on terahertz spectrum and storage medium

Country Status (1)

Country Link
CN (1) CN108827904B (en)

Also Published As

Publication number Publication date
CN108827904A (en) 2018-11-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant