WO2018121121A1 - Method for subtracting spectrum background, method for identifying a substance by Raman spectrum, and electronic device - Google Patents

Method for subtracting spectrum background, method for identifying a substance by Raman spectrum, and electronic device

Info

Publication number
WO2018121121A1
Authority
WO
WIPO (PCT)
Prior art keywords
peak
spectrum
standard
measured
background
Prior art date
Application number
PCT/CN2017/111588
Other languages
English (en)
French (fr)
Inventor
苟巍
王红球
奉华成
Original Assignee
同方威视技术股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 同方威视技术股份有限公司
Priority to US16/473,495 (US11493447B2)
Priority to EP17886286.8A (EP3561696A4)
Publication of WO2018121121A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/62 Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light
    • G01N21/63 Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light optically excited
    • G01N21/65 Raman scattering
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01J MEASUREMENT OF INTENSITY, VELOCITY, SPECTRAL CONTENT, POLARISATION, PHASE OR PULSE CHARACTERISTICS OF INFRARED, VISIBLE OR ULTRAVIOLET LIGHT; COLORIMETRY; RADIATION PYROMETRY
    • G01J3/00 Spectrometry; Spectrophotometry; Monochromators; Measuring colours
    • G01J3/28 Investigating the spectrum
    • G01J3/44 Raman spectrometry; Scattering spectrometry; Fluorescence spectrometry
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N23/00 Investigating or analysing materials by the use of wave or particle radiation, e.g. X-rays or neutrons, not covered by groups G01N3/00 – G01N17/00, G01N21/00 or G01N22/00
    • G01N23/22 Investigating or analysing materials by the use of wave or particle radiation, e.g. X-rays or neutrons, not covered by groups G01N3/00 – G01N17/00, G01N21/00 or G01N22/00 by measuring secondary emission from the material
    • G01N23/223 Investigating or analysing materials by the use of wave or particle radiation, e.g. X-rays or neutrons, not covered by groups G01N3/00 – G01N17/00, G01N21/00 or G01N22/00 by measuring secondary emission from the material by irradiating the sample with X-rays or gamma-rays and by measuring X-ray fluorescence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/0059 Measuring for diagnostic purposes; Identification of persons using light, e.g. diagnosis by transillumination, diascopy, fluorescence
    • A61B5/0075 Measuring for diagnostic purposes; Identification of persons using light, e.g. diagnosis by transillumination, diascopy, fluorescence by spectroscopy, i.e. measuring spectra, e.g. Raman spectroscopy, infrared absorption spectroscopy
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01J MEASUREMENT OF INTENSITY, VELOCITY, SPECTRAL CONTENT, POLARISATION, PHASE OR PULSE CHARACTERISTICS OF INFRARED, VISIBLE OR ULTRAVIOLET LIGHT; COLORIMETRY; RADIATION PYROMETRY
    • G01J3/00 Spectrometry; Spectrophotometry; Monochromators; Measuring colours
    • G01J3/28 Investigating the spectrum
    • G01J3/44 Raman spectrometry; Scattering spectrometry; Fluorescence spectrometry
    • G01J2003/4424 Fluorescence correction for Raman spectrometry
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01J MEASUREMENT OF INTENSITY, VELOCITY, SPECTRAL CONTENT, POLARISATION, PHASE OR PULSE CHARACTERISTICS OF INFRARED, VISIBLE OR ULTRAVIOLET LIGHT; COLORIMETRY; RADIATION PYROMETRY
    • G01J3/00 Spectrometry; Spectrophotometry; Monochromators; Measuring colours
    • G01J3/02 Details
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/17 Systems in which incident light is modified in accordance with the properties of the material investigated
    • G01N21/25 Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
    • G01N21/27 Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands using photo-electric detection; circuits for computing concentration
    • G01N21/274 Calibration, base line adjustment, drift correction
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2201/00 Features of devices classified in G01N21/00
    • G01N2201/12 Circuits of general importance; Signal processing
    • G01N2201/129 Using chemometrical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/08 Feature extraction
    • G06F2218/10 Feature extraction by analysing the shape of a waveform, e.g. extracting parameters relating to peaks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/12 Classification; Matching
    • G06F2218/14 Classification; Matching by matching peak patterns

Definitions

  • the present invention relates generally to the field of spectroscopic analysis processing techniques, and more particularly to a method for subtracting a background of a spectrum, a method of identifying a substance by a Raman spectrum, and an electronic device.
  • Raman spectroscopy is a molecular vibrational spectroscopy that reflects the fingerprint characteristics of molecules and can be used to detect substances. Raman spectroscopy detects and identifies a substance by detecting the Raman spectrum produced by the Raman scattering of the excitation light by the analyte. Raman spectroscopy has been widely used in liquid security inspection, jewelry testing, explosives detection, narcotics detection, pharmaceutical testing, pesticide residue testing and other fields.
  • a commonly used method of subtracting the background processes the spectrum with a least squares method that has a penalty function.
  • subtracting the background with the penalized least squares method takes a long time and requires complicated parameter setting; unreasonable parameter settings lead to poor background subtraction and can even affect the final background-subtracted result.
  • the present invention has been made in order to overcome or eliminate at least one of the problems and disadvantages of the prior art.
  • a method for subtracting a background of a spectrum comprising the steps of:
  • the background data obtained by the SNIP method is used to replace the data of the original spectrum to fit to form a background spectrum;
  • the smoothed background spectrum is subtracted from the original spectrum to obtain a background subtracted spectrum.
  • the step of obtaining background data within each peak region comprises:
  • in each peak region, the intensity value at each wave number in the peak region is transformed using the transformation formula:
  • $v_p(i) = \min\{\, v_{p-1}(i),\ [v_{p-1}(i+p) + v_{p-1}(i-p)]/2 \,\}$, where
  • i is the wave number of the original spectrum;
  • y(i) is the intensity value corresponding to the i-th wave number in the original spectrum;
  • v(i) is the operation result of y(i);
  • m is the predetermined number of iterations;
  • p is the current number of iterations;
  • $v_p(i)$ represents v(i) calculated in the p-th iteration; $v_{p-1}(i)$, $v_{p-1}(i+p)$ and $v_{p-1}(i-p)$ represent v(i), v(i+p) and v(i-p), respectively, calculated in the (p-1)-th iteration; and
  • v(i+p) and v(i-p) represent the operation results of the intensity values corresponding to the (i+p)-th and (i-p)-th wave numbers, respectively.
  • the predetermined number of iterations m satisfies the following relationship:
  • $m = (w - 1)/2$, where w is the peak width of the peak region.
  • Standard spectrum library building step: measuring the Raman spectra of a plurality of samples to obtain standard spectra of the samples, pre-processing the standard spectra and extracting peak information of the standard spectra, the peak information including peak intensity, peak position, peak region and peak width, and storing the pre-processed standard spectra and the extracted peak information in a database to establish a standard spectrum library;
  • Measured spectrum obtaining step: measuring the Raman spectrum of the substance to be measured to obtain a measured spectrum;
  • Measured spectrum pre-processing and peak information extraction step: pre-processing the measured spectrum and extracting peak information of the measured spectrum, the peak information including the peak intensity, peak position, peak region and peak width of the measured spectrum;
  • Peak matching step: comparing the peak information of the measured spectrum with the peak information of the standard spectra to select a standard spectrum having peak information matching the peak information of the measured spectrum;
  • Identifying step: correlating the data of the measured spectrum with the data of the standard spectrum selected in the peak matching step to select the standard spectrum most closely correlated with the measured spectrum, thereby identifying the substance to be measured,
  • wherein pre-processing the measured spectrum in the measured spectrum pre-processing and peak information extraction step comprises: subtracting the background of the measured spectrum using the method for subtracting a spectrum background according to any embodiment of the present invention.
  • pre-processing the standard spectrum in the standard spectrum library building step comprises: subtracting the background of the standard spectrum using the method for subtracting a spectrum background according to any embodiment of the present invention.
  • the peak matching step comprises:
  • Sorting step: sorting the peaks of the measured spectrum and the peaks of the standard spectrum in descending order of peak intensity, and selecting the top N peaks of the measured spectrum and of the standard spectrum;
  • First matching step: comparing the peak positions of the top N peaks of the measured spectrum and of the standard spectrum to select a standard spectrum having peak information matching the peak information of the measured spectrum.
  • the first matching step specifically includes:
  • the condition (1) requires pD to be within p2[j].fWidth/3 and within p1[i].fWidth/3, where
  • N is a predetermined number of comparative peaks, and N is a natural number greater than or equal to 3;
  • i, j respectively represent the sequence numbers of the sorted peaks in the standard spectrum and the measured spectrum, and i and j are integers greater than or equal to 0 and less than or equal to N;
  • p1[i].fPos represents the peak position of the i-th peak after sorting in the standard spectrum
  • p2[j].fWidth represents the peak width of the j-th peak after sorting in the measured spectrum
  • pD represents the absolute difference of the peak positions.
  • the peak matching step further comprises:
  • Peak matching weight calculation step: establishing a penalty function according to the following formula (2) to calculate a peak matching weight;
  • Second matching step: when the peak matching weight is greater than or equal to a preset weight threshold, determining that the peak information of the measured spectrum matches that of the standard spectrum; and when the peak matching weight is less than the weight threshold, determining that the peak information of the measured spectrum does not match that of the standard spectrum.
  • the peak matching weight calculation step and the second matching step are performed only when the absolute difference pD is determined in the first matching step to satisfy condition (1).
  • N is a natural number greater than or equal to 3 and less than or equal to 5.
  • the step of correlating the data of the measured spectrum with the data of the standard spectrum selected in the peak matching step is performed over the union interval of the peak regions of all peaks of the measured spectrum and the standard spectrum.
  • an electronic device including:
  • a memory for storing executable instructions
  • a processor for executing executable instructions stored in a memory to perform the methods described in any aspect or embodiment of the present invention.
  • any one of the above technical solutions of the present invention can adaptively subtract the background; by using the segmented SNIP method with the number of iterations set equal to (w-1)/2, the background can be fitted as closely as possible while the peak shape is maintained and the calculation speed is increased, which facilitates subsequent spectral processing.
  • FIG. 1 schematically shows a flow chart of a method for subtracting a background of a spectrum according to an embodiment of the present invention
  • Figure 2 is a schematic representation of a peak of a Raman spectrum of a substance
  • FIG. 3 is a flow chart of removing the background of a Raman spectrum using a method for subtracting a spectral background according to an embodiment of the present invention
  • FIG. 5 schematically illustrates a method of identifying a substance by a Raman spectrum according to an embodiment of the present invention.
  • FIG. 6 shows a flow chart for identifying a matched measured spectrum and a standard spectrum using a method in accordance with an embodiment of the present invention.
  • FIG. 7 is a block diagram showing an example hardware arrangement of an electronic device for performing a method in accordance with an embodiment of the present invention.
  • SNIP refers to Statistics-sensitive Nonlinear Iterative Peak-clipping.
  • FIG. 1 schematically illustrates a method for subtracting a spectral background in accordance with an exemplary embodiment of the present invention. As shown in FIG. 1, the method may include the following steps:
  • Peak information searching step: searching for peak information of the original spectrum, the peak information including the peak position, the start and end points of the peak, and the peak width w;
  • Background data obtaining step: processing each peak of the original spectrum using the SNIP method in each peak region defined by the start point and the end point of each peak of the original spectrum to obtain background data in each peak region;
  • Background spectrum forming step: in each peak region, using the background data obtained by the SNIP method to replace the data of the original spectrum, so as to form a background spectrum by fitting;
  • Smoothing processing step: smoothing the formed background spectrum;
  • Background subtracting step: subtracting the smoothed background spectrum from the original spectrum to obtain a background-subtracted spectrum.
  • Figure 2 shows schematically a peak of a Raman spectrum of a substance.
  • the peak information of the Raman spectrum may include a peak position, a peak start and end point, a peak width w, and a peak intensity.
  • the abscissa of the Raman spectrum represents the Raman shift or wave number (in cm⁻¹) and the ordinate represents the intensity of the Raman spectrum (dimensionless or expressed in a.u.).
  • the Raman spectrum can be regarded as a set of discrete data points, as shown by the black dots in Figure 2; the abscissa of each data point can be called the wave number, and the ordinate can be called the intensity or intensity value.
  • the peak information finding step described above can perform peak finding using a simple comparison method. Specifically, when the intensity value at a certain wave number is much larger than the intensity values at several adjacent wave numbers, a peak is considered to exist at that wave number.
  • the above-described peak information finding step may perform peak finding using a derivative method. Specifically, if the spectrum is considered to be a continuous curve, the first, second, and third derivatives of the spectrum can be calculated. Usually, the first derivative has a positive to negative zero crossing at the peak position, the second derivative has a negative local minimum at the peak position, and the third derivative has a negative to positive zero crossing near the peak.
  • the peak information such as the peak position can be accurately determined from the change in the slope and curvature of the spectral curve.
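As a rough illustration of the simple comparison method described above, the following Python sketch flags an index as a peak when its intensity is above every nearby point and exceeds the lowest of them by a margin. The function name `find_peaks_simple` and the parameters `neighbors` and `min_rise` are hypothetical choices for this sketch; the text does not say how many adjacent wave numbers are compared or how much larger the peak must be.

```python
def find_peaks_simple(intensity, neighbors=2, min_rise=5.0):
    """Return indices that stand out from `neighbors` points on each side.

    `neighbors` and `min_rise` are illustrative parameters, not values
    taken from the text.
    """
    peaks = []
    n = len(intensity)
    for i in range(neighbors, n - neighbors):
        window = intensity[i - neighbors:i] + intensity[i + 1:i + 1 + neighbors]
        # "Much larger than its neighbours" is read here as: above every
        # neighbour, and above the lowest neighbour by at least min_rise.
        higher_than_all = all(intensity[i] > v for v in window)
        if higher_than_all and intensity[i] - min(window) >= min_rise:
            peaks.append(i)
    return peaks


spectrum = [1.0, 1.2, 1.1, 3.0, 9.0, 3.2, 1.0, 1.1, 1.3, 6.0, 1.2, 1.0]
print(find_peaks_simple(spectrum))  # → [4, 9]
```

The start point, end point and width of each peak would still have to be determined separately, for example from the slope changes used by the derivative method.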
  • the background data obtaining step may further include the following steps:
  • in each peak region, the intensity value at each wave number in the peak region is transformed using the transformation formula:
  • $v_p(i) = \min\{\, v_{p-1}(i),\ [v_{p-1}(i+p) + v_{p-1}(i-p)]/2 \,\}$;
  • i is the wave number of the original spectrum;
  • y(i) is the intensity value corresponding to the i-th wave number in the original spectrum;
  • v(i) is the operation result of y(i);
  • m is the predetermined number of iterations;
  • p is the current number of iterations;
  • $v_p(i)$ represents v(i) calculated in the p-th iteration; $v_{p-1}(i)$, $v_{p-1}(i+p)$ and $v_{p-1}(i-p)$ represent v(i), v(i+p) and v(i-p), respectively, calculated in the (p-1)-th iteration; and v(i+p) and v(i-p) represent
  • the operation results of the intensity values corresponding to the (i+p)-th and (i-p)-th wave numbers, respectively.
  • the predetermined number of iterations satisfies $m = (w - 1)/2$, where w is the peak width of the corresponding peak region. Because the peak widths of the individual peak regions may differ, the predetermined number of iterations m for each peak region may also differ.
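The iteration above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the operation that maps y(i) to v(i) is not given in this text, so the sketch clips the intensities directly; points whose neighbours i-p or i+p fall outside the spectrum are left unchanged; and integer division is used for m in case w is even. The name `snip_region` and its arguments are hypothetical.

```python
def snip_region(values, start, end, width):
    """Clip the peak region [start, end] of `values` (indices inclusive).

    Runs m = (width - 1) // 2 iterations of
        v_p(i) = min(v_{p-1}(i), (v_{p-1}(i + p) + v_{p-1}(i - p)) / 2).
    Assumption: v(i) is taken to be y(i) itself.
    """
    v = list(values)
    m = (width - 1) // 2
    for p in range(1, m + 1):
        prev = list(v)  # snapshot of v_{p-1}
        for i in range(start, end + 1):
            lo, hi = i - p, i + p
            if lo < 0 or hi >= len(v):
                continue  # assumption: keep points lacking both neighbours
            v[i] = min(prev[i], (prev[lo] + prev[hi]) / 2)
    return v


y = [1.0, 1.0, 1.0, 2.0, 4.0, 2.0, 1.0, 1.0, 1.0]
background = snip_region(y, start=1, end=7, width=7)
print(background)  # the peak is clipped down to the flat level of 1.0
```

Only indices inside the peak region are modified, so neighbours outside the region are read from the unmodified spectrum and anchor the clipped values to the surrounding background.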
  • the smoothing processing step may include smoothing using a least squares moving smoothing method, a Gaussian filter smoothing method, a median filter, or a mean filter smoothing method.
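For illustration, the sketch below smooths a background spectrum with a mean filter, one of the options named above, and then subtracts it from the original spectrum. The window size of 3 and the shrinking of the window at the two ends of the spectrum are assumptions of this sketch, and the function names are hypothetical.

```python
def mean_filter(values, window=3):
    """Moving-average smoothing; the window shrinks at the spectrum ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def subtract_background(original, background, window=3):
    """Smooth the background, then subtract it point by point."""
    smooth = mean_filter(background, window)
    return [y - b for y, b in zip(original, smooth)]


original = [1.0, 1.0, 1.0, 2.0, 4.0, 2.0, 1.0, 1.0, 1.0]
background = [1.0] * 9
print(subtract_background(original, background))
# → [0.0, 0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0]
```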
  • FIG. 3 shows a flow chart for removing the background of a Raman spectrum using the method for subtracting a spectrum background according to an embodiment of the present invention, and FIG. 4 schematically shows a Raman spectrum processed with that method.
  • In FIG. 4, the abscissa indicates the Raman shift or wave number (Raman Shift, in cm⁻¹), and the ordinate indicates the intensity of the Raman spectrum (Raman Intensity, dimensionless or expressed in a.u.).
  • the original Raman spectrum generally includes a plurality of peaks, which may be denoted peak 1, peak 2, ... peak n. Each peak has its own peak information; in particular, each peak has its own peak width, which can be written as $w_1, w_2, \ldots, w_n$.
  • SNIP processing is performed on each peak (i.e., peak 1, peak 2, ... peak n).
  • because the peak widths differ and $m = (w - 1)/2$, the number of iterations m also differs from peak to peak.
  • the background data obtained by the SNIP method is used to replace the peak data of the original spectrum to fit the background spectrum, and the background spectrum thus formed is then smoothed; the resulting background spectrum is shown in the figure.
  • the smoothed background spectrum is subtracted from the original spectrum to obtain the background-subtracted spectrum, which is shown in FIG. 4.
  • in this way the background can be adaptively subtracted, and by using the segmented SNIP method with the number of iterations set equal to (w-1)/2, the background can be fitted as closely as possible while the peak shape is maintained and the calculation speed is increased, which facilitates subsequent spectral processing, mixture identification and quantitative analysis.
  • the above method of subtracting the background of a spectrum may be used in a method of identifying a substance by a Raman spectrum, to improve the speed and accuracy of identifying the substance.
  • a method of identifying a substance by a Raman spectrum according to an embodiment of the present invention is described in detail below with reference to the drawings; it may include the following steps:
  • Standard spectrum library building step: measuring the Raman spectra of a plurality of samples to obtain standard spectra of the samples, pre-processing the standard spectra and extracting peak information of the standard spectra, the peak information including peak intensity, peak position, peak region and peak width, and storing the pre-processed standard spectra and the extracted peak information in a database to establish a standard spectrum library;
  • Measured spectrum obtaining step: measuring the Raman spectrum of the substance to be measured to obtain a measured spectrum;
  • Measured spectrum pre-processing and peak information extraction step: pre-processing the measured spectrum and extracting peak information of the measured spectrum, the peak information including the peak intensity, peak position, peak region and peak width of the measured spectrum;
  • Peak matching step: comparing the peak information of the measured spectrum with the peak information of the standard spectra to select a standard spectrum having peak information matching the peak information of the measured spectrum;
  • Identifying step: correlating the data of the measured spectrum with the data of the standard spectrum selected in the peak matching step to select the standard spectrum most closely correlated with the measured spectrum, thereby identifying the substance to be measured.
  • pre-processing the measured spectrum in the measured spectrum pre-processing and peak information extraction step comprises: subtracting the background of the measured spectrum using the method for subtracting a spectrum background according to the above embodiment.
  • pre-processing the standard spectrum in the standard spectrum library building step includes subtracting the background of the standard spectrum using the method for subtracting a spectrum background according to the above embodiment.
  • the standard spectrum library stores the pre-processed standard spectra and the peak information extracted from them, for example standard spectrum 1, standard spectrum 2, ... standard spectrum n and, correspondingly, peak information 1, peak information 2, ... peak information n.
  • the peak matching step may further include a "peak position matching" step, which may include the following steps:
  • Sorting step: sorting the peaks of the measured spectrum and the peaks of the standard spectrum in descending order of peak intensity, and selecting the top N peaks of the measured spectrum and of the standard spectrum;
  • First matching step: comparing the peak positions of the top N peaks of the measured spectrum and of the standard spectrum to select a standard spectrum having peak information matching the peak information of the measured spectrum.
  • the first matching step may specifically include:
  • the condition (1) requires pD to be within p2[j].fWidth/3 and within p1[i].fWidth/3, where
  • N is a predetermined number of comparative peaks, and N is a natural number greater than or equal to 3;
  • i, j respectively represent the sequence numbers of the sorted peaks in the standard spectrum and the measured spectrum, and i and j are integers greater than or equal to 0 and less than or equal to N;
  • p1[i].fPos represents the peak position of the i-th peak after sorting in the standard spectrum
  • p2[j].fPos represents the peak position of the j-th peak after sorting in the measured spectrum
  • p1[i].fWidth represents the peak width of the i-th peak after sorting in the standard spectrum
  • p2[j].fWidth represents the peak width of the j-th peak after sorting in the measured spectrum
  • pD represents the absolute difference of the peak positions.
  • N is a natural number greater than or equal to 3 and less than or equal to 5.
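The sorting and first matching steps might be sketched as below. Several points are assumptions of this sketch rather than statements from the text: the comparison in condition (1) is taken as non-strict (`<=`), since the operator is not legible here; how the matching pairs are turned into a match decision is not specified, so the sketch only lists the index pairs (i, j) that satisfy the condition; and `fIntensity`, `top_n`, `satisfies_condition_1` and `matched_pairs` are hypothetical names. Only `fPos` and `fWidth` come from the text.

```python
from dataclasses import dataclass


@dataclass
class Peak:
    fPos: float        # peak position (wave number)
    fWidth: float      # peak width
    fIntensity: float  # peak intensity (hypothetical field name)


def top_n(peaks, n):
    """Sort by intensity, largest first, and keep the top n peaks."""
    return sorted(peaks, key=lambda p: p.fIntensity, reverse=True)[:n]


def satisfies_condition_1(p1, p2):
    """Condition (1): pD is bounded by one third of each peak width."""
    pD = abs(p1.fPos - p2.fPos)  # absolute difference of the peak positions
    # Assumption: non-strict comparison.
    return pD <= p2.fWidth / 3 and pD <= p1.fWidth / 3


def matched_pairs(standard, measured, n=3):
    """Index pairs (i, j) of sorted peaks that satisfy condition (1)."""
    p1, p2 = top_n(standard, n), top_n(measured, n)
    return [(i, j) for i, a in enumerate(p1) for j, b in enumerate(p2)
            if satisfies_condition_1(a, b)]


standard = [Peak(1000, 12, 50), Peak(1450, 15, 80), Peak(800, 9, 30), Peak(600, 9, 5)]
measured = [Peak(1002, 12, 45), Peak(1449, 15, 90), Peak(830, 9, 20)]
print(matched_pairs(standard, measured))  # → [(0, 0), (1, 1)]
```

Comparing only positions and widths is cheap, which is why it is a natural screen before any more expensive weighting.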
  • the peak matching step may further include a “filter” step, and the “filter” step may specifically include:
  • Peak matching weight calculation step: establishing a penalty function according to the following formula (2) to calculate a peak matching weight;
  • Second matching step: when the peak matching weight is greater than or equal to a preset weight threshold, determining that the peak information of the measured spectrum matches that of the standard spectrum; and when the peak matching weight is less than the weight threshold, determining that the peak information of the measured spectrum does not match that of the standard spectrum.
  • min(p1[i].fWidth,p2[j].fWidth) means taking the smaller of p1[i].fWidth and p2[j].fWidth;
  • the peak matching weight calculation step and the second matching step are performed only when the absolute difference of the peak positions is determined in the first matching step to satisfy condition (1).
  • because calculating the absolute difference of the peak positions takes far less computation than calculating the penalty function, screening by the absolute difference before calculating the penalty function greatly reduces the amount of calculation, thereby improving recognition speed and recognition accuracy.
  • the identifying step may include correlating the data of the measured spectrum with the data of the pre-stored standard spectrum, that is, the "correlation calculation" step shown in the flow chart.
  • the step of correlating the data of the measured spectrum with the data of the pre-stored standard spectrum comprises:
  • calculating a correlation coefficient between the data of the measured spectrum and the data of the pre-stored standard spectrum; when the calculated correlation coefficient is greater than or equal to a preset correlation threshold, determining that the measured spectrum matches the standard spectrum; and when the correlation coefficient is less than the correlation threshold, determining that the measured spectrum does not match the standard spectrum.
  • the correlation coefficient measures the degree of linear correlation between variables and is a measure of the relationship between vectors.
  • given feature vectors X(x1, x2, ..., xn) and Y(y1, y2, ..., yn), the correlation coefficient r of the two can be defined as follows:
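The defining formula does not survive in this text. Assuming the usual Pearson correlation coefficient is intended, which is an assumption rather than something the text confirms, r for the vectors X and Y above would read:

```latex
r = \frac{\sum_{k=1}^{n} (x_k - \bar{x})(y_k - \bar{y})}
         {\sqrt{\sum_{k=1}^{n} (x_k - \bar{x})^2}\,\sqrt{\sum_{k=1}^{n} (y_k - \bar{y})^2}},
\qquad
\bar{x} = \frac{1}{n}\sum_{k=1}^{n} x_k,
\quad
\bar{y} = \frac{1}{n}\sum_{k=1}^{n} y_k
```

Under this definition r lies between -1 and 1, and values close to 1 indicate that the two spectra rise and fall together.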
  • the correlation between the data of the measured spectrum and the data of the pre-stored standard spectrum is calculated over the union interval of the peak regions of all peaks of the measured spectrum and the standard spectrum.
  • the union interval is the interval composed of the peak regions of all the peaks of the measured spectrum and of the standard spectrum.
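A minimal sketch of restricting the correlation to the union interval is given below. It assumes that both spectra are sampled on the same wave-number grid, that peak regions are given as inclusive index pairs, and that the correlation coefficient is the Pearson coefficient; none of these details is confirmed by the text, and all function names are hypothetical.

```python
import math


def union_mask(length, regions):
    """Mark every index covered by at least one (start, end) peak region."""
    mask = [False] * length
    for start, end in regions:
        for i in range(max(0, start), min(length, end + 1)):
            mask[i] = True
    return mask


def pearson(xs, ys):
    """Pearson correlation coefficient (assumed definition of r)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)  # undefined if either input is constant


def correlation_on_union(measured, standard, measured_regions, standard_regions):
    """Correlate only the points inside the union of all peak regions."""
    mask = union_mask(len(measured), measured_regions + standard_regions)
    xs = [v for v, keep in zip(measured, mask) if keep]
    ys = [v for v, keep in zip(standard, mask) if keep]
    return pearson(xs, ys)


measured = [0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 2.0, 4.0, 2.0, 0.0, 0.0]
standard = [2 * v for v in measured]  # same shape, different scale
r = correlation_on_union(measured, standard, [(2, 4)], [(7, 9)])
print(round(r, 6))
```

The resulting r would then be compared with the preset correlation threshold to decide whether the measured spectrum matches the standard spectrum.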
  • the peak information of the spectra is first used to compare the local "peak" features and so preliminarily screen the measured spectrum against the standard spectrum; the global comparison of the spectral data is performed only after the preliminary screening is passed. This not only greatly shortens the matching and identification time but also improves the accuracy of the matching and identification.
  • when the peak information does not match, the matching process can be terminated immediately without the subsequent identification, which greatly improves the calculation speed when the two spectra do not match. Tests have shown that the matching recognition time is shortened to about 5%, and the accuracy of matching recognition is increased by about 10%.
  • FIG. 7 is a block diagram showing an example hardware arrangement of the electronic device 700.
  • Electronic device 700 includes a processor 706 (e.g., a microprocessor (μP), a digital signal processor (DSP), etc.).
  • processor 706 can be a single processing unit or a plurality of processing units for performing different acts of the method steps described herein.
  • the electronic device 700 may also include an input unit 702 for receiving signals from other entities, and an output unit 704 for providing signals to other entities.
  • Input unit 702 and output unit 704 can be arranged as a single entity or as separate entities.
  • electronic device 700 can include at least one computer readable storage medium 707 in the form of a non-volatile or volatile memory, such as an electrically erasable programmable read only memory (EEPROM), flash memory, and/or a hard drive.
  • the computer readable storage medium 707 includes a computer program 710 that includes code/computer readable instructions that, when executed by the processor 706 in the electronic device 700, cause the electronic device 700 to perform, for example, the processes described above in connection with the above-described embodiments and any variations thereof.
  • Computer program 710 can be configured as computer program code having an architecture, such as computer program modules 710A-710C.
  • the computer program module can substantially perform the various actions in the flow described in the above embodiments to simulate the device. In other words, when different computer program modules are executed in processor 706, they may correspond to the different units described above in the device.
  • although the code means in the embodiment disclosed above in connection with FIG. 7 is implemented as computer program modules that, when executed in processor 706, cause electronic device 700 to perform the actions described above in connection with the above-described embodiments, in an alternative embodiment at least one of the code means can be implemented at least partially as a hardware circuit.
  • the processor may be a single CPU (Central Processing Unit), but may also include two or more processing units.
  • a processor can include a general purpose microprocessor, an instruction set processor, and/or a related chipset and/or a special purpose microprocessor (e.g., an application specific integrated circuit (ASIC)).
  • the processor may also include an onboard memory for caching purposes.
  • the computer program can be carried by a computer program product connected to the processor.
  • the computer program product can comprise a computer readable medium having stored thereon a computer program.
  • the computer program product can be flash memory, random access memory (RAM), read only memory (ROM), or EEPROM, and in alternative embodiments the computer program modules described above may be distributed among different computer program products in the form of memory.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Chemical & Material Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Analytical Chemistry (AREA)
  • Biochemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Investigating, Analyzing Materials By Fluorescence Or Luminescence (AREA)
  • Spectrometry And Color Measurement (AREA)

Abstract

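A method for removing the background from a spectrum, a method for identifying a substance from a Raman spectrum, and an electronic device. The background removal method includes the following steps: finding the peak information of an original spectrum, the peak information including the peak position, the start point and end point of each peak, and the peak width w; within each peak region bounded by the start point and end point of a peak of the original spectrum, processing the peak with the SNIP method to obtain background data for that peak region; within each peak region, replacing the data of the original spectrum with the background data obtained by the SNIP processing, so as to fit a background spectrum; smoothing the resulting background spectrum; and subtracting the smoothed background spectrum from the original spectrum to obtain a background-subtracted spectrum.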

Description

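Method for Removing the Background from a Spectrum, Method for Identifying a Substance from a Raman Spectrum, and Electronic Device
Technical Field
The present invention relates generally to the field of spectrum analysis and processing, and in particular to a method for removing the background from a spectrum, a method for identifying a substance from a Raman spectrum, and an electronic device.
Background
Raman spectroscopy is a form of molecular vibrational spectroscopy. It reflects the fingerprint characteristics of molecules and can be used to detect substances. Raman detection identifies a substance by measuring the Raman spectrum produced when the object under test Raman-scatters the excitation light. Raman detection is widely used in liquid security inspection, jewelry inspection, and the detection of explosives, narcotics, pharmaceuticals, and pesticide residues.
When a Raman spectrum is analyzed, a recurring problem is how to remove its background effectively and quickly so as to obtain the peak data that corresponds to the composition of the substance and thereby facilitate subsequent processing.
A commonly used way of removing the background is to process the spectrum with penalized least squares. However, penalized least squares is time-consuming and its parameters are complicated to set, and unreasonable parameter settings degrade, and can even spoil, the final background removal result.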
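Summary of the Invention
The present invention is proposed to overcome or eliminate at least one of the problems and defects of the prior art.
One object of the invention is to provide at least a method for removing the background from a spectrum that removes the background effectively and quickly, facilitating subsequent spectrum processing, mixture identification, and quantitative analysis.
Another object of the invention is to provide a method for identifying a substance from a Raman spectrum which, by using the above background removal method, can identify the substance under test accurately and quickly.
According to one aspect of the invention, a method for removing the background from a spectrum is provided, including the following steps:
finding the peak information of an original spectrum, the peak information including the peak position, the start point and end point of each peak, and the peak width;
within each peak region bounded by the start point and end point of a peak of the original spectrum, processing the peak with the SNIP method to obtain background data for that peak region;
within each peak region, replacing the data of the original spectrum with the background data obtained by the SNIP processing, so as to fit a background spectrum;
smoothing the resulting background spectrum; and
subtracting the smoothed background spectrum from the original spectrum to obtain a background-subtracted spectrum.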
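According to some embodiments, the step of obtaining the background data in each peak region includes:
within each peak region, transforming the intensity value at each wavenumber in the region using a transform formula, the transform formula being:
[Formula image PCTCN2017111588-appb-000001: transform of y(i) into v(i)]
iterating according to the SNIP formula to compute v_1(i), v_2(i), ..., up to v_m(i) in turn, the SNIP formula being: v_p(i) = min{ v_{p-1}(i), [v_{p-1}(i+p) + v_{p-1}(i-p)]/2 }; and
once v_m(i) has been computed, applying the inverse of the above transform formula to compute the y(i) corresponding to v_m(i), thereby obtaining the background data in the peak region,
where i is a wavenumber of the original spectrum, y(i) is the intensity value at the i-th wavenumber of the original spectrum, and v(i) is the result of transforming y(i);
m is the predetermined number of iterations and p is the current iteration number, i.e. 1<p≤m; v_p(i) denotes the v(i) computed in the p-th iteration; v_{p-1}(i), v_{p-1}(i+p), and v_{p-1}(i-p) denote the v(i), v(i+p), and v(i-p) computed in the (p-1)-th iteration; and v(i+p) and v(i-p) denote the transformed intensity values at the (i+p)-th and (i-p)-th wavenumbers, respectively.
According to some embodiments, for each peak region the predetermined number of iterations m satisfies the following relationship:
m=(w-1)/2, where w is the peak width of that peak region.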
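According to another aspect of the invention, a method for identifying a substance from a Raman spectrum is also provided, including the following steps:
Standard spectrum library building step: measuring the Raman spectra of a plurality of samples to obtain standard spectra of the samples, preprocessing the standard spectra and extracting their peak information, including peak intensity, peak position, peak region, and peak width, and storing the preprocessed standard spectra and the extracted peak information in a database to build a standard spectrum library;
Measured spectrum obtaining step: measuring the Raman spectrum of the substance under test to obtain a measured spectrum;
Measured spectrum preprocessing and peak information extraction step: preprocessing the measured spectrum and extracting its peak information, the peak information including the peak intensity, peak position, peak region, and peak width of the measured spectrum;
Peak matching step: comparing the peak information of the measured spectrum with the peak information of the standard spectra to select the standard spectra whose peak information matches that of the measured spectrum; and
Identification step: comparing the correlation between the data of the measured spectrum and the data of the standard spectra selected in the peak matching step to select the standard spectrum most correlated with the measured spectrum, thereby identifying the substance under test,
wherein preprocessing the measured spectrum in the measured spectrum preprocessing and peak information extraction step includes: removing the background of the measured spectrum using the background removal method according to any embodiment of the invention.
According to some embodiments, preprocessing the standard spectra in the standard spectrum library building step includes: removing the background of the standard spectra using the background removal method according to any embodiment of the invention.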
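According to some embodiments, the peak matching step includes:
Sorting step: sorting the peaks of the measured spectrum and the peaks of the standard spectrum separately in descending order of peak intensity, and selecting the top N peaks of each; and
First matching step: comparing the peak position information of the top N peaks of the measured spectrum and of the standard spectrum to select the standard spectra whose peak information matches that of the measured spectrum.
According to some embodiments, the first matching step specifically includes:
computing in turn, according to formula (1) below, the absolute difference of the peak positions of the top N peaks; and
when the computed absolute difference satisfies condition (1) below, determining that the peak information of the measured spectrum matches that of the standard spectrum; when it does not satisfy condition (1), determining that the peak information does not match,
where:
formula (1) is: pD=|p2[j].fPos-p1[i].fPos|,
condition (1) is: pD<p2[j].fWidth/3 and pD<p1[i].fWidth/3,
N is the predetermined number of peaks to compare and is a natural number greater than or equal to 3;
i and j are the indices of the sorted peaks in the standard spectrum and the measured spectrum, respectively, and are both integers greater than or equal to 0 and less than or equal to N;
p1[i].fPos is the peak position of the i-th sorted peak of the standard spectrum;
p2[j].fPos is the peak position of the j-th sorted peak of the measured spectrum;
p1[i].fWidth is the peak width of the i-th sorted peak of the standard spectrum;
p2[j].fWidth is the peak width of the j-th sorted peak of the measured spectrum;
pD is the absolute difference of the peak positions.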
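According to some embodiments, the peak matching step further includes:
Peak matching weight calculation step: building a penalty function according to formula (2) below to compute a peak matching weight; and
Second matching step: when the peak matching weight is greater than or equal to a preset weight threshold, determining that the peak information of the measured spectrum matches that of the standard spectrum; when the peak matching weight is less than the weight threshold, determining that the peak information does not match,
where formula (2) is:
h=(1-2*|j-i|/10)*(0.5/(i+1))*exp(-pD*2/min(p1[i].fWidth,p2[j].fWidth)),
and h is the peak matching weight.
According to some embodiments, the peak matching weight calculation step and the second matching step are performed when the first matching step determines that the peak information of the measured spectrum matches that of the standard spectrum.
According to some embodiments, N is a natural number greater than or equal to 3 and less than or equal to 5.
According to some embodiments, the step of comparing the correlation between the data of the measured spectrum and the data of the standard spectra selected in the peak matching step is performed within the union of the peak regions of all peaks of the measured spectrum and the standard spectrum.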
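According to yet another aspect of the invention, an electronic device is also provided, including:
a memory for storing executable instructions; and
a processor for executing the executable instructions stored in the memory so as to perform the method according to any aspect or embodiment of the invention.
Any of the above technical solutions of the invention can remove the background adaptively. By applying the SNIP method piecewise and setting the number of iterations to (w-1)/2, the background is fitted as closely as possible, the peak shape is preserved, and computation is faster, which facilitates subsequent spectrum processing.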
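Brief Description of the Drawings
FIG. 1 schematically shows a flowchart of a method for removing the background from a spectrum according to an embodiment of the invention;
FIG. 2 schematically shows one peak of the Raman spectrum of a substance;
FIG. 3 shows a flowchart of removing the background of a Raman spectrum using the background removal method according to an embodiment of the invention;
FIG. 4 schematically shows a Raman spectrum to which the background removal method according to an embodiment of the invention is applied;
FIG. 5 schematically shows a flowchart of a method for identifying a substance from a Raman spectrum according to an embodiment of the invention;
FIG. 6 shows a flowchart of identifying and matching a measured spectrum against standard spectra using the method according to an embodiment of the invention; and
FIG. 7 is a block diagram showing an example hardware arrangement of an electronic device for performing the method according to an embodiment of the invention.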
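Detailed Description
The technical solutions of the invention are described in further detail below through embodiments and with reference to the drawings. In the specification, the same or similar reference numerals denote the same or similar components. The following description of embodiments with reference to the drawings is intended to explain the general inventive concept and should not be construed as limiting the invention.
For convenience of description, expressions such as "first, second" and "A, B, C" are used herein to describe the steps of a method. Unless otherwise stated, such expressions should not be understood as limiting the order in which the steps are performed.
Herein, "SNIP" refers to the Statistics-sensitive Nonlinear Iterative Peak-clipping algorithm.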
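FIG. 1 schematically shows a method for removing the background from a spectrum according to an exemplary embodiment of the invention. As shown in FIG. 1, the method may include the following steps:
Peak information finding step: finding the peak information of the original spectrum, the peak information including the peak position, the start point and end point of each peak, and the peak width w;
Background data obtaining step: within each peak region bounded by the start point and end point of a peak of the original spectrum, processing the peak with the SNIP method to obtain background data for that peak region;
Background spectrum forming step: within each peak region, replacing the data of the original spectrum with the background data obtained by the SNIP processing, so as to fit a background spectrum;
Smoothing step: smoothing the resulting background spectrum; and
Background subtraction step: subtracting the smoothed background spectrum from the original spectrum to obtain a background-subtracted spectrum.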
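The background removal method according to embodiments of the invention is described in more detail below with reference to the drawings, taking a Raman spectrum as an example.
FIG. 2 schematically shows one peak of the Raman spectrum of a substance. The peak information of a Raman spectrum may include the peak position, the start point and end point of the peak, the peak width w, and the peak intensity. Typically, the abscissa of a Raman spectrum is the Raman shift or wavenumber (in cm⁻¹) and the ordinate is the Raman intensity (dimensionless or in a.u.). For numerical purposes a Raman spectrum can be treated as a set of discrete data points, shown as black dots in FIG. 2; the abscissa of a data point is called its wavenumber and its ordinate its intensity or intensity value. Thus, as shown in FIG. 2, the peak position may be the position of the highest point P of the peak, i.e. the wavenumber of P; the start point and end point of the peak may be the wavenumbers of the start point S and the end point E; and the peak width w may be the width bounded by S and E, i.e. the wavenumber of E minus the wavenumber of S.
In one example, the peak information finding step may find peaks by simple comparison. Specifically, when the intensity value at some wavenumber is much larger than the intensity values at several neighboring wavenumbers, a peak is deemed to exist at that wavenumber. Alternatively, the peak information finding step may use a derivative method. Specifically, if the spectrum is regarded as a continuous curve, its first, second, and third derivatives can be computed. Typically, the first derivative crosses zero from positive to negative at the peak position, the second derivative has a negative local minimum at the peak position, and the third derivative crosses zero from negative to positive near the peak position. Using these shape characteristics of the curve near a peak, the peak position and other peak information can be determined accurately from the changes in the slope and curvature of the curve.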
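As an illustration of the simple comparison method, the following Python sketch marks a point as a peak when its intensity exceeds every neighbor within a small window by a margin, and then walks downhill on each side to estimate the start and end points. The function names, the window size `half_window`, the margin `min_excess`, and the downhill rule for locating S and E are assumptions made for this example; the text does not specify them.

```python
def find_peaks_simple(y, half_window=2, min_excess=3.0):
    """Return indices whose intensity exceeds every neighbour within
    half_window points by at least min_excess (both values are assumed)."""
    peaks = []
    for i in range(half_window, len(y) - half_window):
        neighbours = y[i - half_window:i] + y[i + 1:i + half_window + 1]
        if all(y[i] - n >= min_excess for n in neighbours):
            peaks.append(i)
    return peaks


def peak_bounds(y, p):
    """Walk downhill from peak index p in both directions (assumed rule)
    and return the (start, end) indices where the descent stops."""
    start = p
    while start > 0 and y[start - 1] < y[start]:
        start -= 1
    end = p
    while end < len(y) - 1 and y[end + 1] < y[end]:
        end += 1
    return start, end


spectrum = [0, 1, 5, 9, 5, 1, 0, 0, 2, 8, 2, 0]
for p in find_peaks_simple(spectrum):
    s, e = peak_bounds(spectrum, p)
    # The width here is counted in data points, end index minus start index.
    print("peak at", p, "start", s, "end", e, "width", e - s)
```

On real data the margin would normally be tied to the noise level, and the derivative method described above is an alternative when peaks overlap.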
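According to embodiments of the invention, the background data obtaining step may further include the following steps:
within each peak region, transforming the intensity value at each wavenumber in the region using a transform formula, the transform formula being:
[Formula image PCTCN2017111588-appb-000002: transform of y(i) into v(i)]
iterating according to the SNIP formula to compute v_1(i), v_2(i), ..., up to v_m(i) in turn, the SNIP formula being: v_p(i) = min{ v_{p-1}(i), [v_{p-1}(i+p) + v_{p-1}(i-p)]/2 }; and
once v_m(i) has been computed, applying the inverse of the above transform formula to compute the y(i) corresponding to v_m(i), which may be denoted y′(i), thereby obtaining the background data in the peak region,
where i is a wavenumber of the original spectrum, y(i) is the intensity value at the i-th wavenumber of the original spectrum, and v(i) is the result of transforming y(i);
m is the predetermined number of iterations and p is the current iteration number, i.e. 1<p≤m; v_p(i) denotes the v(i) computed in the p-th iteration; v_{p-1}(i), v_{p-1}(i+p), and v_{p-1}(i-p) denote the v(i), v(i+p), and v(i-p) computed in the (p-1)-th iteration; and v(i+p) and v(i-p) denote the transformed intensity values at the (i+p)-th and (i-p)-th wavenumbers, respectively.
According to embodiments of the invention, for each peak region the predetermined number of iterations m satisfies m=(w-1)/2, where w is the peak width of the corresponding peak region. Because the peak widths of different peak regions may differ, the predetermined number of iterations m may also differ from region to region. Determining the number of iterations from the width of each peak region allows the background of each peak to be fitted adaptively while also increasing computation speed.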
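A minimal sketch of the per-region SNIP iteration follows. Because the transform formula is available only as an image, the sketch assumes the LLS (log-log-square-root) operator v = log(log(sqrt(y+1)+1)+1) that is commonly paired with SNIP. It also assumes that the peak width is counted in data points, that m is rounded down with a floor of 1, that p runs from 1 to m, and that points closer than p to the region boundary are left unchanged. None of these choices is confirmed by the text, and the function names are hypothetical.

```python
import math


def lls(y):
    """Assumed forward transform (LLS operator); requires y >= 0."""
    return math.log(math.log(math.sqrt(y + 1.0) + 1.0) + 1.0)


def lls_inverse(v):
    """Inverse of the assumed LLS transform."""
    return (math.exp(math.exp(v) - 1.0) - 1.0) ** 2 - 1.0


def snip_region(y, start, end):
    """Estimate the background of y[start..end] (inclusive) with SNIP,
    using m = (w - 1) / 2 iterations for a region of width w."""
    w = end - start                  # width in data points (assumption)
    m = max(1, (w - 1) // 2)         # integer rounding is an assumption
    v = [lls(val) for val in y[start:end + 1]]
    n = len(v)
    for p in range(1, m + 1):
        prev = v[:]                  # values from iteration p - 1
        for i in range(p, n - p):    # skip points without both neighbours
            v[i] = min(prev[i], (prev[i + p] + prev[i - p]) / 2.0)
    return [lls_inverse(val) for val in v]


peak = [10, 10, 12, 20, 40, 20, 12, 10, 10]
background = snip_region(peak, 0, len(peak) - 1)
print([round(b, 2) for b in background])
```

Copying the previous iteration into `prev` keeps each pass dependent only on the values of iteration p-1, as the SNIP formula requires.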
In one example, the smoothing step may use least-squares moving smoothing, Gaussian filter smoothing, median filtering, mean filtering, or a similar method.
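The background removal method according to embodiments of the invention is described in more detail below with reference to FIG. 3 and FIG. 4. FIG. 3 shows a flowchart of removing the background of a Raman spectrum using the method, and FIG. 4 schematically shows a Raman spectrum to which the method is applied. In FIG. 4, the abscissa is the Raman shift or wavenumber (Raman Shift, in cm⁻¹) and the ordinate is the Raman intensity (Raman Intensity, dimensionless or in a.u.).
As shown in FIG. 3 and FIG. 4, an original Raman spectrum usually contains multiple peaks, which may be denoted peak 1, peak 2, ..., peak n. Each peak has its own peak information and, in particular, its own peak width, which may be denoted w1, w2, ..., wn.
As shown in FIG. 3, every peak (peak 1, peak 2, ..., peak n) is processed with the SNIP method. Because the peak widths w differ and m=(w-1)/2, the number of iterations m used in the SNIP processing also differs from peak to peak.
Further, as shown in FIG. 3, within each peak region the peak data of the original spectrum is replaced with the background data obtained by the SNIP processing so as to fit a background spectrum, which is then smoothed; the resulting background spectrum is shown in FIG. 4. Finally, the smoothed background spectrum is subtracted from the original spectrum to obtain the background-subtracted spectrum, which is also shown in FIG. 4.
As FIG. 4 shows, the background removal method according to embodiments of the invention removes the background adaptively. By applying the SNIP method piecewise and setting the number of iterations to (w-1)/2, it fits the background as closely as possible, preserves the peak shape, and increases computation speed, which facilitates subsequent spectrum processing, mixture identification, and quantitative analysis.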
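The replace, smooth, and subtract steps described above can be sketched as follows. The sketch takes the per-region background values as input, assuming they were already produced by the SNIP step, and uses mean filtering, one of the smoothing options named in the text. The function names and the window size `half_window` are hypothetical.

```python
def moving_average(values, half_window=1):
    """Mean filter; the window is truncated at the ends of the data."""
    n = len(values)
    out = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def remove_background(y, regions, region_backgrounds, half_window=1):
    """regions: list of (start, end) index pairs, inclusive.
    region_backgrounds: one list of background values per region.
    Returns (background-subtracted spectrum, smoothed background)."""
    background = list(y)             # outside peak regions the data is kept
    for (start, end), bg in zip(regions, region_backgrounds):
        background[start:end + 1] = bg   # replace peak data with background
    smoothed = moving_average(background, half_window)
    corrected = [a - b for a, b in zip(y, smoothed)]
    return corrected, smoothed


y = [5, 5, 5, 9, 5, 5, 5]
corrected, smoothed = remove_background(y, [(2, 4)], [[5, 5, 5]])
print(corrected)
```

Outside the peak regions the original data itself serves as the background, so after subtraction only the peaks (plus whatever the smoothing leaves behind) remain.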
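According to some embodiments of the invention, the above background removal method can be used in a method for identifying a substance from a Raman spectrum to increase the speed and accuracy of identification. A method for identifying a substance from a Raman spectrum according to one embodiment of the invention is described in detail below with reference to FIG. 5. The method may include the following steps:
Standard spectrum library building step: measuring the Raman spectra of a plurality of samples to obtain standard spectra of the samples, preprocessing the standard spectra and extracting their peak information, including peak intensity, peak position, peak region, and peak width, and storing the preprocessed standard spectra and the extracted peak information in a database to build a standard spectrum library;
Measured spectrum obtaining step: measuring the Raman spectrum of the substance under test to obtain a measured spectrum;
Measured spectrum preprocessing and peak information extraction step: preprocessing the measured spectrum and extracting its peak information, the peak information including the peak intensity, peak position, peak region, and peak width of the measured spectrum;
Peak matching step: comparing the peak information of the measured spectrum with the peak information of the standard spectra to select the standard spectra whose peak information matches that of the measured spectrum; and
Identification step: comparing the correlation between the data of the measured spectrum and the data of the standard spectra selected in the peak matching step to select the standard spectrum most correlated with the measured spectrum, thereby identifying the substance under test.
In some embodiments, preprocessing the measured spectrum in the measured spectrum preprocessing and peak information extraction step includes: removing the background of the measured spectrum using the background removal method according to the above embodiments.
In some embodiments, preprocessing the standard spectra in the standard spectrum library building step includes: removing the background of the standard spectra using the background removal method according to the above embodiments.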
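Specifically, as shown in FIG. 6, the standard spectrum library stores the preprocessed standard spectra and the peak information extracted from them, for example standard spectrum 1, standard spectrum 2, ..., standard spectrum n and, correspondingly, peak information 1, peak information 2, ..., peak information n. According to one embodiment of the invention, as shown in FIG. 6, the peak matching step may further include a "peak position matching" step, which may include the following steps:
Sorting step: sorting the peaks of the measured spectrum and the peaks of the standard spectrum separately in descending order of peak intensity, and selecting the top N peaks of each; and
First matching step: comparing the peak positions of the top N peaks of the measured spectrum and of the standard spectrum to select the standard spectra whose peak information matches that of the measured spectrum.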
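According to a further embodiment of the invention, the first matching step may specifically include:
computing in turn, according to formula (1) below, the absolute difference of the peak positions of the top N peaks; and
when the computed absolute difference satisfies condition (1) below, determining that the peak information of the measured spectrum matches that of the standard spectrum; when it does not satisfy condition (1), determining that the peak information does not match,
where:
formula (1) is: pD=|p2[j].fPos-p1[i].fPos|,
condition (1) is: pD<p2[j].fWidth/3 and pD<p1[i].fWidth/3,
N is the predetermined number of peaks to compare and is a natural number greater than or equal to 3;
i and j are the indices of the sorted peaks in the standard spectrum and the measured spectrum, respectively, and are both integers greater than or equal to 0 and less than or equal to N;
p1[i].fPos is the peak position of the i-th sorted peak of the standard spectrum;
p2[j].fPos is the peak position of the j-th sorted peak of the measured spectrum;
p1[i].fWidth is the peak width of the i-th sorted peak of the standard spectrum;
p2[j].fWidth is the peak width of the j-th sorted peak of the measured spectrum;
pD is the absolute difference of the peak positions.
In one embodiment, N is a natural number greater than or equal to 3 and less than or equal to 5. When N is small, for example less than 3, too few peaks are compared, which makes it harder to select the standard spectra that match the measured spectrum and so reduces the effectiveness of identification. When N is too large, comparing the peak information requires more computation, which may slow the comparison. With N between 3 and 5 inclusive, identification from peak information is both effective and fast.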
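The sorting step and the first matching step can be sketched as below. The field names `fPos` and `fWidth` follow formula (1) and condition (1); `fIntensity`, the `Peak` class, and the function names are hypothetical. The text does not say which (i, j) pairs are compared or how the per-pair results are combined, so the sketch assumes that a standard spectrum passes when each of its top N peaks has at least one partner among the top N measured peaks that satisfies condition (1).

```python
from dataclasses import dataclass


@dataclass
class Peak:
    fPos: float        # peak position (wavenumber)
    fWidth: float      # peak width
    fIntensity: float  # peak intensity (hypothetical field name)


def top_peaks(peaks, n):
    """Sorting step: keep the n strongest peaks, strongest first."""
    return sorted(peaks, key=lambda p: p.fIntensity, reverse=True)[:n]


def satisfies_condition_1(p1_i, p2_j):
    """Formula (1) and condition (1) for one standard/measured peak pair."""
    pD = abs(p2_j.fPos - p1_i.fPos)
    return pD < p2_j.fWidth / 3 and pD < p1_i.fWidth / 3


def first_match(standard, measured, n=3):
    """Assumed combination rule: every top-n standard peak needs a partner."""
    p1 = top_peaks(standard, n)
    p2 = top_peaks(measured, n)
    return all(any(satisfies_condition_1(a, b) for b in p2) for a in p1)


standard = [Peak(1000, 30, 100), Peak(1200, 24, 80), Peak(800, 30, 50)]
measured = [Peak(1002, 30, 90), Peak(1195, 24, 85), Peak(803, 30, 40)]
print(first_match(standard, measured))
```

A stricter variant could compare peaks rank by rank (i equal to j) instead of searching for any partner; which rule applies is left open by the text.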
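Further, as shown in FIG. 6, the peak matching step may also include a "filter" step, which may specifically include:
Peak matching weight calculation step: building a penalty function according to formula (2) below to compute a peak matching weight; and
Second matching step: when the peak matching weight is greater than or equal to a preset weight threshold, determining that the peak information of the measured spectrum matches that of the standard spectrum; when the peak matching weight is less than the weight threshold, determining that the peak information does not match,
where formula (2) is:
h=(1-2*|j-i|/10)*(0.5/(i+1))*exp(-pD*2/min(p1[i].fWidth,p2[j].fWidth)),
where h is the peak matching weight;
"min(p1[i].fWidth,p2[j].fWidth)" denotes the smaller of p1[i].fWidth and p2[j].fWidth;
"exp" denotes the exponential function whose base is e, the base of the natural logarithm.
In embodiments of the invention, the peak matching weight calculation step and the second matching step are performed only when the first matching step determines that the peak information of the measured spectrum matches that of the standard spectrum. In other words, the penalty function is computed and the peak matching weight compared only when the first matching step finds that the absolute difference of the peak positions meets the requirement. Because computing the absolute difference of the peak positions is cheaper than computing the penalty function, screening with the absolute difference before computing the penalty function greatly reduces the amount of computation, which increases identification speed and improves identification accuracy.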
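Formula (2) and the gating by condition (1) can be sketched as follows. The function names and the threshold value used in the example are hypothetical. The text does not state how the weights of several peak pairs are combined, so the sketch thresholds one pair at a time.

```python
import math


def passes_condition_1(pD, width1, width2):
    """Condition (1): cheap screen on the peak position difference."""
    return pD < width2 / 3 and pD < width1 / 3


def peak_match_weight(i, j, pD, width1, width2):
    """Formula (2): penalises rank mismatch |j - i|, low rank i,
    and position difference pD relative to the narrower peak."""
    return ((1 - 2 * abs(j - i) / 10)
            * (0.5 / (i + 1))
            * math.exp(-pD * 2 / min(width1, width2)))


def screen_pair(i, j, pos1, width1, pos2, width2, weight_threshold):
    """Evaluate formula (2) only if condition (1) already holds."""
    pD = abs(pos2 - pos1)                      # formula (1)
    if not passes_condition_1(pD, width1, width2):
        return False                           # skip the costlier weight
    return peak_match_weight(i, j, pD, width1, width2) >= weight_threshold


# Strongest peaks of both spectra (i = j = 0), 2 cm^-1 apart, width 30.
print(peak_match_weight(0, 0, 2.0, 30.0, 30.0))
print(screen_pair(0, 0, 1000.0, 30.0, 1002.0, 30.0, weight_threshold=0.4))
```

Note that the factor 0.5/(i+1) caps the weight of the i-th ranked peak at 0.5/(i+1), so a single fixed threshold treats lower-ranked peaks more strictly.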
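According to embodiments of the invention, comparing the data of the measured spectrum with the data of a pre-stored standard spectrum in the identification step may include: comparing the correlation between the data of the measured spectrum and the data of the pre-stored standard spectrum, i.e. the "correlation calculation" step shown in FIG. 6.
In one embodiment, the step of comparing the correlation between the data of the measured spectrum and the data of the pre-stored standard spectrum includes:
computing the correlation coefficient between the data of the measured spectrum and the data of the pre-stored standard spectrum; when the computed correlation coefficient is greater than or equal to a preset correlation threshold, determining that the measured spectrum matches the standard spectrum; when it is less than the correlation threshold, determining that the measured spectrum does not match the standard spectrum.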
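Specifically, the correlation coefficient quantifies the degree of linear correlation between variables and is one way of measuring the relationship between vectors. For example, given feature vectors X(x1, x2, ..., xn) and Y(y1, y2, ..., yn), their correlation coefficient r can be defined as follows:
r = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2 \, \sum_{i=1}^{n}(y_i-\bar{y})^2}}
where \bar{x} and \bar{y} denote the means of the vectors X and Y, respectively, and i denotes the i-th element of a vector.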
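The correlation comparison can be sketched as below, assuming the coefficient is the usual Pearson form built from the means of X and Y. The function names, the zero return value for a constant vector, and the threshold used in the example are hypothetical.

```python
import math


def correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    if denom == 0.0:
        return 0.0   # a constant vector has no defined correlation (assumed)
    return sxy / denom


def spectra_match(measured, standard, correlation_threshold):
    """Match when r is at least the preset correlation threshold."""
    return correlation(measured, standard) >= correlation_threshold


print(correlation([1, 2, 3], [2, 4, 6]))
print(spectra_match([0, 5, 1, 0], [0, 4, 1, 0], correlation_threshold=0.9))
```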
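According to embodiments of the invention, the step of comparing the correlation between the data of the measured spectrum and the data of the pre-stored standard spectrum is performed within the union of the peak regions of all peaks of the measured spectrum and the standard spectrum. That is, the comparison is not made over the full range of the spectrum but only over that union. Here, "the union of the peak regions of all peaks of the measured spectrum and the standard spectrum" means the interval made up of the peak regions of all peaks of both spectra. This further reduces the amount of data that has to be compared, which further increases computation speed while maintaining the accuracy of the calculation.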
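Restricting the comparison to the union of the peak regions can be sketched as an index selection, assuming both spectra are sampled on the same wavenumber grid and that peak regions are given as inclusive (start, end) index pairs. The function names are hypothetical.

```python
def union_indices(regions_a, regions_b):
    """Sorted indices covered by any peak region of either spectrum."""
    covered = set()
    for start, end in list(regions_a) + list(regions_b):
        covered.update(range(start, end + 1))
    return sorted(covered)


def select(values, indices):
    """Keep only the data points that fall inside the union."""
    return [values[k] for k in indices]


measured = [0, 0, 1, 6, 1, 0, 0, 0, 0, 2, 0]
standard = [0, 0, 0, 5, 2, 1, 0, 0, 0, 3, 1]
idx = union_indices([(2, 4)], [(3, 6), (9, 10)])
print(idx)
print(select(measured, idx), select(standard, idx))
```

The two selected lists have equal length and can be passed directly to whatever correlation routine is used for the identification step.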
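In embodiments of the invention, the peak information of the spectra is used first to compare the local feature "peak" and thereby screen the measured spectrum against the standard spectra; the global comparison of spectrum data is carried out only after this preliminary screening has passed. This not only greatly shortens the matching time but also improves matching accuracy. Moreover, if the "peak position matching" or "filter" step shown in FIG. 6 finds that the peak information of the measured spectrum does not match that of a standard spectrum, the matching process can be terminated immediately without performing the subsequent steps, which greatly speeds up the determination that the two do not match. Tests showed that the matching time was reduced to about 5% of the original and that matching accuracy improved by about 10%.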
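According to yet another embodiment of the invention, an electronic device is also provided. FIG. 7 is a block diagram showing an example hardware arrangement of the electronic device 700. The electronic device 700 includes a processor 706 (for example, a microprocessor (μP) or a digital signal processor (DSP)). The processor 706 may be a single processing unit or multiple processing units for performing the different actions of the method steps described herein. The electronic device 700 may also include an input unit 702 for receiving signals from other entities and an output unit 704 for providing signals to other entities. The input unit 702 and the output unit 704 may be arranged as a single entity or as separate entities.
In addition, the electronic device 700 may include at least one computer-readable storage medium 707 in the form of non-volatile or volatile memory, for example electrically erasable programmable read-only memory (EEPROM), flash memory, and/or a hard disk drive. The computer-readable storage medium 707 includes a computer program 710 comprising code/computer-readable instructions that, when executed by the processor 706 in the electronic device 700, cause the electronic device 700 to perform, for example, the processes described above in connection with the foregoing embodiments and any variations thereof.
The computer program 710 can be configured as computer program code having an architecture of, for example, computer program modules 710A-710C. The computer program modules can substantially perform the various actions in the flows described in the above embodiments so as to emulate a device. In other words, when the different computer program modules are executed in the processor 706, they may correspond to the different units of the device described above.
Although the code means in the embodiment disclosed above in connection with FIG. 7 are implemented as computer program modules that, when executed in the processor 706, cause the electronic device 700 to perform the actions described above in connection with the foregoing embodiments, in alternative embodiments at least one of the code means can be implemented at least partially as a hardware circuit.
The processor may be a single CPU (central processing unit), but may also include two or more processing units. For example, the processor can include a general-purpose microprocessor, an instruction set processor and/or a related chipset, and/or a special-purpose microprocessor (for example, an application-specific integrated circuit (ASIC)). The processor may also include onboard memory for caching purposes. The computer program can be carried by a computer program product connected to the processor. The computer program product can comprise a computer-readable medium on which the computer program is stored. For example, the computer program product can be flash memory, random access memory (RAM), read-only memory (ROM), or EEPROM, and in alternative embodiments the computer program modules described above may be distributed across different computer program products in the form of memories.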
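Those skilled in the art will understand that, although some exemplary embodiments describe the technical concept of the invention in detail using a Raman spectrum as an example, the invention is not limited to the analysis and processing of Raman spectra.
Although the invention has been described with reference to the drawings, the embodiments disclosed in the drawings are intended to illustrate preferred embodiments of the invention and should not be construed as limiting the invention.
Although some embodiments of the general inventive concept have been shown and described, those of ordinary skill in the art will understand that changes may be made to these embodiments without departing from the principles and spirit of the general inventive concept, the scope of the invention being defined by the claims and their equivalents.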

Claims (12)

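  1. A method for removing the background from a spectrum, comprising the following steps:
    finding the peak information of an original spectrum, the peak information including the peak position, the start point and end point of each peak, and the peak width;
    within each peak region bounded by the start point and end point of a peak of the original spectrum, processing the peak with the SNIP method to obtain background data for that peak region;
    within each peak region, replacing the data of the original spectrum with the background data obtained by the SNIP processing, so as to fit a background spectrum;
    smoothing the resulting background spectrum; and
    subtracting the smoothed background spectrum from the original spectrum to obtain a background-subtracted spectrum.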
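  2. The method according to claim 1, wherein the step of obtaining the background data in each peak region comprises:
    within each peak region, transforming the intensity value at each wavenumber in the region using a transform formula, the transform formula being:
    [Formula image PCTCN2017111588-appb-100001: transform of y(i) into v(i)]
    iterating according to the SNIP formula to compute v_1(i), v_2(i), ..., up to v_m(i) in turn, the SNIP formula being: v_p(i) = min{ v_{p-1}(i), [v_{p-1}(i+p) + v_{p-1}(i-p)]/2 }; and
    once v_m(i) has been computed, applying the inverse of the above transform formula to compute the y(i) corresponding to v_m(i), thereby obtaining the background data in the peak region,
    where i is a wavenumber of the original spectrum, y(i) is the intensity value at the i-th wavenumber of the original spectrum, and v(i) is the result of transforming y(i);
    m is the predetermined number of iterations and p is the current iteration number, i.e. 1<p≤m; v_p(i) denotes the v(i) computed in the p-th iteration; v_{p-1}(i), v_{p-1}(i+p), and v_{p-1}(i-p) denote the v(i), v(i+p), and v(i-p) computed in the (p-1)-th iteration; and v(i+p) and v(i-p) denote the transformed intensity values at the (i+p)-th and (i-p)-th wavenumbers, respectively.
  3. The method according to claim 2, wherein, for each peak region, the predetermined number of iterations m satisfies the following relationship:
    m=(w-1)/2, where w is the peak width of that peak region.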
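  4. A method for identifying a substance from a Raman spectrum, comprising the following steps:
    a standard spectrum library building step: measuring the Raman spectra of a plurality of samples to obtain standard spectra of the samples, preprocessing the standard spectra and extracting their peak information, including peak intensity, peak position, peak region, and peak width, and storing the preprocessed standard spectra and the extracted peak information in a database to build a standard spectrum library;
    a measured spectrum obtaining step: measuring the Raman spectrum of the substance under test to obtain a measured spectrum;
    a measured spectrum preprocessing and peak information extraction step: preprocessing the measured spectrum and extracting its peak information, the peak information including the peak intensity, peak position, peak region, and peak width of the measured spectrum;
    a peak matching step: comparing the peak information of the measured spectrum with the peak information of the standard spectra to select the standard spectra whose peak information matches that of the measured spectrum; and
    an identification step: comparing the correlation between the data of the measured spectrum and the data of the standard spectra selected in the peak matching step to select the standard spectrum most correlated with the measured spectrum, thereby identifying the substance under test,
    wherein preprocessing the measured spectrum in the measured spectrum preprocessing and peak information extraction step comprises: removing the background of the measured spectrum using the method according to any one of claims 1-3.
  5. The method according to claim 4, wherein preprocessing the standard spectra in the standard spectrum library building step comprises: removing the background of the standard spectra using the method according to any one of claims 1-3.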
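  6. The method according to claim 4 or 5, wherein the peak matching step comprises:
    a sorting step: sorting the peaks of the measured spectrum and the peaks of the standard spectrum separately in descending order of peak intensity, and selecting the top N peaks of each; and
    a first matching step: comparing the peak position information of the top N peaks of the measured spectrum and of the standard spectrum to select the standard spectra whose peak information matches that of the measured spectrum.
  7. The method according to claim 6, wherein the first matching step specifically comprises:
    computing in turn, according to formula (1) below, the absolute difference of the peak positions of the top N peaks; and
    when the computed absolute difference satisfies condition (1) below, determining that the peak information of the measured spectrum matches that of the standard spectrum; when it does not satisfy condition (1), determining that the peak information does not match,
    where:
    formula (1) is: pD=|p2[j].fPos-p1[i].fPos|,
    condition (1) is: pD<p2[j].fWidth/3 and pD<p1[i].fWidth/3,
    N is the predetermined number of peaks to compare and is a natural number greater than or equal to 3;
    i and j are the indices of the sorted peaks in the standard spectrum and the measured spectrum, respectively, and are both integers greater than or equal to 0 and less than or equal to N;
    p1[i].fPos is the peak position of the i-th sorted peak of the standard spectrum;
    p2[j].fPos is the peak position of the j-th sorted peak of the measured spectrum;
    p1[i].fWidth is the peak width of the i-th sorted peak of the standard spectrum;
    p2[j].fWidth is the peak width of the j-th sorted peak of the measured spectrum;
    pD is the absolute difference of the peak positions.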
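  8. The method according to claim 7, wherein the peak matching step further comprises:
    a peak matching weight calculation step: building a penalty function according to formula (2) below to compute a peak matching weight; and
    a second matching step: when the peak matching weight is greater than or equal to a preset weight threshold, determining that the peak information of the measured spectrum matches that of the standard spectrum; when the peak matching weight is less than the weight threshold, determining that the peak information does not match,
    where formula (2) is:
    h=(1-2*|j-i|/10)*(0.5/(i+1))*exp(-pD*2/min(p1[i].fWidth,p2[j].fWidth)),
    and h is the peak matching weight.
  9. The method according to claim 8, wherein the peak matching weight calculation step and the second matching step are performed when the first matching step determines that the peak information of the measured spectrum matches that of the standard spectrum.
  10. The method according to any one of claims 6-9, wherein N is a natural number greater than or equal to 3 and less than or equal to 5.
  11. The method according to any one of claims 6-10, wherein the step of comparing the correlation between the data of the measured spectrum and the data of the standard spectra selected in the peak matching step is performed within the union of the peak regions of all peaks of the measured spectrum and the standard spectrum.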
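  12. An electronic device, comprising:
    a memory for storing executable instructions; and
    a processor for executing the executable instructions stored in the memory so as to perform the method according to any one of claims 1-11.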
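PCT/CN2017/111588 2016-12-26 2017-11-17 Method for removing the background from a spectrum, method for identifying a substance from a Raman spectrum, and electronic device WO2018121121A1 (zh)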

Priority Applications (2)

Application Number Priority Date Filing Date Title
US16/473,495 US11493447B2 (en) 2016-12-26 2017-11-17 Method for removing background from spectrogram, method of identifying substances through Raman spectrogram, and electronic apparatus
EP17886286.8A EP3561696A4 (en) 2016-12-26 2017-11-17 METHOD FOR USE IN A SUBTRACTING SPECTROGRAM BACKGROUND, METHOD FOR IDENTIFYING SUBSTANCES BY MEANS OF A RAMAN SPECTRUM AND ELECTRONIC DEVICE

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201611222587.6 2016-12-26
CN201611222587.6A CN108241845B (zh) 2016-12-26 2016-12-26 Method for removing the background from a spectrum and method for identifying a substance from a Raman spectrum

Publications (1)

Publication Number Publication Date
WO2018121121A1 true WO2018121121A1 (zh) 2018-07-05

Family

ID=62702457

Family Applications (1)

Application Number Title Priority Date Filing Date
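PCT/CN2017/111588 WO2018121121A1 (zh) 2016-12-26 2017-11-17 Method for removing the background from a spectrum, method for identifying a substance from a Raman spectrum, and electronic device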

Country Status (4)

Country Link
US (1) US11493447B2 (zh)
EP (1) EP3561696A4 (zh)
CN (1) CN108241845B (zh)
WO (1) WO2018121121A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
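CN110108695A (zh) * 2019-05-17 2019-08-09 广西科技大学 Background dark noise subtraction method for a Raman spectrometer using a grating array detector
CN114002155A (zh) * 2021-11-30 2022-02-01 北京鉴知技术有限公司 Fluorescence spectrum detection method, apparatus, device and storage medium
CN115508335A (zh) * 2022-10-21 2022-12-23 哈尔滨工业大学(威海) Fourier-transform-based data augmentation method for Raman spectral curves
CN116878407A (zh) * 2023-09-08 2023-10-13 法博思(宁波)半导体设备有限公司 Epitaxial wafer thickness measurement method and device based on infrared interference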

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
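CN109557071B (zh) * 2018-11-14 2021-12-17 公安部第一研究所 Qualitative and quantitative Raman spectral identification method for hazardous liquid mixtures
CN109632761B (zh) * 2018-12-14 2021-11-09 广东环凯微生物科技有限公司 Method and system for processing Raman spectral data
CN110658178A (zh) * 2019-09-29 2020-01-07 江苏拉曼医疗设备有限公司 Fluorescence background subtraction method for Raman spectra
CN111089856B (zh) * 2019-12-26 2021-05-14 厦门大学 Post-processing method for extracting weak signals from Raman spectra
CN112711991B (zh) * 2020-12-17 2024-03-15 华东理工大学 Method for automatically extracting characteristic peak information from X-ray diffraction patterns
CN113989578B (zh) * 2021-12-27 2022-04-26 季华实验室 Peak position analysis method, system, terminal device and medium for Raman spectra
CN114624271B (zh) * 2022-03-25 2023-08-25 电子科技大学 X-ray fluorescence spectrum background subtraction method based on variational mode decomposition
CN115078616B (zh) * 2022-05-07 2024-06-07 天津国科医疗科技发展有限公司 Signal-to-noise-ratio-based multi-window spectral peak identification method, device, medium and product
CN116399996A (zh) * 2023-06-02 2023-07-07 海能未来技术集团股份有限公司 Organic element analysis method and apparatus based on thermal conductivity detection, and electronic device
CN118015038A (zh) * 2024-02-07 2024-05-10 北京霍里思特科技有限公司 Method, system, device and storage medium for background subtraction of spectral data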

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014094039A1 (en) * 2012-12-19 2014-06-26 Rmit University A background correction method for a spectrum of a target sample
CN103955518A (zh) * 2014-05-06 2014-07-30 北京华泰诺安科技有限公司 一种检测物谱图与数据库谱图的匹配方法

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3711207B2 (ja) * 1998-12-25 2005-11-02 三菱化学株式会社 Thermoplastic polyester elastomer
US7962199B2 (en) * 2005-06-30 2011-06-14 University Of Wyoming Method and apparatus for determination of bone fracture risk using raman spectroscopy
CN101017143A (zh) * 2007-02-07 2007-08-15 盐城师范学院 Detection of the components of traditional Chinese medicine injections by micro laser Raman spectroscopy, and method thereof

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014094039A1 (en) * 2012-12-19 2014-06-26 Rmit University A background correction method for a spectrum of a target sample
CN103955518A (zh) * 2014-05-06 2014-07-30 北京华泰诺安科技有限公司 一种检测物谱图与数据库谱图的匹配方法

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
LONG, BIN ET AL.: "A Self-Adaptive Method for the Clipping of Scatter Background of γ Spectrum", NUCLEAR ELECTRONICS & DETECTION TECHNOLOGY, vol. 33, no. 10, 20 October 2013 (2013-10-20), pages 1293 - 1296, XP009515332, ISSN: 0258-0934 *
See also references of EP3561696A4 *
YIN, WANGMING ET AL.: "Discussion and Application of Eliminating the Background in γ-ray Spectrum by SNIP Algorithm", JOURNAL OF EAST CHINA INSTITUTE OF TECHNOLOGY (NATURAL SCIENCE EDITION), vol. 32, no. 3, 30 September 2009 (2009-09-30), pages 245, XP009515334, ISSN: 1674-3504 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
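CN110108695A (zh) * 2019-05-17 2019-08-09 广西科技大学 Background dark noise subtraction method for a Raman spectrometer using a grating array detector
CN110108695B (zh) * 2019-05-17 2021-11-02 广西科技大学 Background dark noise subtraction method for a Raman spectrometer using a grating array detector
CN114002155A (zh) * 2021-11-30 2022-02-01 北京鉴知技术有限公司 Fluorescence spectrum detection method, apparatus, device and storage medium
CN114002155B (zh) * 2021-11-30 2024-03-26 北京鉴知技术有限公司 Fluorescence spectrum detection method, apparatus, device and storage medium
CN115508335A (zh) * 2022-10-21 2022-12-23 哈尔滨工业大学(威海) Fourier-transform-based data augmentation method for Raman spectral curves
CN116878407A (zh) * 2023-09-08 2023-10-13 法博思(宁波)半导体设备有限公司 Epitaxial wafer thickness measurement method and device based on infrared interference
CN116878407B (zh) * 2023-09-08 2023-12-01 法博思(宁波)半导体设备有限公司 Epitaxial wafer thickness measurement method and device based on infrared interference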

Also Published As

Publication number Publication date
CN108241845A (zh) 2018-07-03
EP3561696A4 (en) 2020-07-22
US20190339205A1 (en) 2019-11-07
EP3561696A1 (en) 2019-10-30
CN108241845B (zh) 2021-04-02
US11493447B2 (en) 2022-11-08

Similar Documents

Publication Publication Date Title
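WO2018121121A1 (zh) Method for removing the background from a spectrum, method for identifying a substance from a Raman spectrum, and electronic device
WO2018121122A1 (zh) Raman spectrum detection method for article inspection, and electronic device
Feilhauer et al. Multi-method ensemble selection of spectral bands related to leaf biochemistry
Zhang et al. Wood defect detection method with PCA feature fusion and compressed sensing
JP6091493B2 (ja) Spectroscopic apparatus and spectroscopy for determining components present in a sample
Liu et al. Joint baseline-correction and denoising for Raman spectra
WO2016177002A1 (zh) Raman-spectroscopy-based method for detecting whether Western medicines have been added to health products
CN109253985B (zh) Neural-network-based method for identifying the grade of wood for guzheng soundboards by near-infrared spectroscopy
CN107179310B (zh) Raman spectrum characteristic peak identification method based on robust noise variance estimation
WO2018121082A1 (zh) Self-learning qualitative analysis method based on Raman spectroscopy
WO2015096779A1 (zh) Raman spectrum detection method
CN109858477A (zh) Raman spectral analysis method for identifying target objects in complex environments using deep forest
CN108169201B (zh) Raman spectrum detection method for removing packaging interference
WO2018103541A1 (zh) Raman spectrum detection method for removing solvent interference, and electronic device
CN115618282B (zh) Method, device and storage medium for identifying synthetic gemstones
CN114611582A (zh) Method and system for analyzing substance concentration based on near-infrared spectroscopy
Barburiceanu et al. An improved feature extraction method for texture classification with increased noise robustness
WO2018121151A1 (zh) Method for identifying a Raman spectrum, and electronic device
CN116858822A (zh) Quantitative analysis method for sulfadiazine in water based on machine learning and Raman spectroscopy
CN111220565B (zh) CPLS-based calibration transfer method for infrared spectrum measuring instruments
CN109145887B (zh) Threshold analysis method based on confusion discrimination of spectral latent variables
Pavelka et al. Complex evaluation of Raman spectra using morphological filtering: Algorithms, software implementation, and experimental verification of baseline correction, peak recognition, and cosmic ray removal in SERS spectra of designer drugs
CN110632024A (zh) Infrared-spectrum-based quantitative analysis method, apparatus, device and storage medium
CN114694771A (zh) Sample classification method, classifier training method, device and medium
CN110672584B (zh) Method for constructing a PLS-SVM model for detecting adulteration of edible gelatin, and detection method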

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17886286

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2017886286

Country of ref document: EP