CN115078616A - Multi-window spectral peak identification method, device, medium and product based on signal-to-noise ratio - Google Patents


Info

Publication number
CN115078616A
Authority
CN
China
Prior art keywords
peak
point
signal
data
noise ratio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210494928.4A
Other languages
Chinese (zh)
Other versions
CN115078616B (en)
Inventor
吴梦
贾明正
庞嘉
黄琪
凌星
程文播
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Guoke Medical Technology Development Co ltd
Suzhou Institute of Biomedical Engineering and Technology of CAS
Original Assignee
Tianjin Guoke Medical Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Guoke Medical Technology Development Co Ltd filed Critical Tianjin Guoke Medical Technology Development Co Ltd
Priority to CN202210494928.4A priority Critical patent/CN115078616B/en
Publication of CN115078616A publication Critical patent/CN115078616A/en
Application granted granted Critical
Publication of CN115078616B publication Critical patent/CN115078616B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01N - INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00 - Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02 - Column chromatography
    • G01N30/86 - Signal analysis
    • G01N30/8624 - Detection of slopes or peaks; baseline correction
    • G01N30/8631 - Peaks
    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01N - INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00 - Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02 - Column chromatography
    • G01N30/86 - Signal analysis
    • G01N30/8603 - Signal analysis with integration or differentiation
    • G01N30/8617 - Filtering, e.g. Fourier filtering
    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01N - INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00 - Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02 - Column chromatography
    • G01N30/86 - Signal analysis
    • G01N30/8624 - Detection of slopes or peaks; baseline correction
    • G01N30/8631 - Peaks
    • G01N30/8637 - Peak shape
    • G - PHYSICS
    • G01 - MEASURING; TESTING
    • G01N - INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N30/00 - Investigating or analysing materials by separation into components using adsorption, absorption or similar phenomena or using ion-exchange, e.g. chromatography or field flow fractionation
    • G01N30/02 - Column chromatography
    • G01N30/86 - Signal analysis
    • G01N30/8624 - Detection of slopes or peaks; baseline correction
    • G01N30/8644 - Data segmentation, e.g. time windows

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Analytical Chemistry (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Mathematical Physics (AREA)
  • Algebra (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Spectrometry And Color Measurement (AREA)

Abstract

The invention relates to a multi-window spectral peak identification method, device, medium and product based on signal-to-noise ratio. The method comprises the following steps: preprocessing the raw spectrogram data; identifying peaks, troughs and gently fluctuating non-peak data with a multi-window spectral peak identification algorithm, and finding the peak tops; and calculating the signal-to-noise ratio of each data point, correcting the peak boundaries using the signal-to-noise ratio, and identifying the peak boundaries according to the signal-to-noise ratio and peak width settings to obtain all peak information. Working from the spectral signal, the multi-window spectral peak identification algorithm quickly identifies peaks, troughs and normal data, divides the spectral peak regions and accurately locates the peak tops; the signal-to-noise ratio of each data point is then calculated to correct the peak boundaries, so that the boundaries can be identified accurately according to the signal-to-noise ratio and peak width settings. Because the peak boundaries are corrected with a histogram-based signal-to-noise ratio estimation algorithm, the corrected peak regions are smaller and the peaks are essentially symmetric, which facilitates subsequent data processing and improves the accuracy of the results.

Description

Multi-window spectral peak identification method, device, medium and product based on signal-to-noise ratio
Technical Field
The invention relates to the technical field of spectral peak identification, in particular to a multi-window spectral peak identification method, equipment, medium and product based on signal-to-noise ratio.
Background
Liquid chromatography-triple quadrupole mass spectrometry (LC-MS/MS) combines the separation capability of chromatography with the high specificity of mass spectrometry. It can separate target compounds from a complex sample matrix and quantify them with high precision and sensitivity, which meets the requirements of clinical analysis.
Spectral peak identification is the basis of both qualitative analysis and quantitative calculation. The peak and peak area information it produces is used for quantification and affects the final determination of component identity and concentration, so it matters greatly in applications with high precision requirements. During sample measurement, however, ions produced by column bleed or by a complex sample matrix interfere with the spectrogram of the target compound, making the peak shapes irregular and peak identification harder. An effective peak identification algorithm is therefore needed to process mass spectrometry data.
Conventional peak detection methods mainly include the amplitude method, the first derivative method, the second derivative method, and various similar methods derived from them. These traditional algorithms identify peaks from the trend and slope of the signal; they are simple but empirical and locally arbitrary, and cannot accurately distinguish the various peaks in a mass spectrum. The amplitude method determines a chromatographic peak from the pattern of amplitude changes in the chromatographic data; it is the simplest and most intuitive method but is easily disturbed by noise. The first derivative method detects the characteristic points of a peak from the first derivative of the curve; it works well for a single peak but cannot identify more complex shoulder peaks. The second derivative method detects the characteristic points from the second derivative, but it is far more sensitive to noise than the first derivative: even weak noise can completely bury the characteristic points on the second-derivative curve, which limits its use. When the first and second derivative methods are combined, the first derivative still has difficulty detecting shoulder peaks, which in turn affects the second-derivative step, so spectral peaks cannot be identified accurately and the algorithm has limitations. Deconvolution algorithms compare each peak with a model peak and combine fragment ions with the same peak shape into a compound mass spectrum, but their parameter selection is complex.
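The shoulder-peak limitation of the first derivative method can be illustrated with a minimal sketch. The function name and the sample signal below are hypothetical and are not taken from the patent; the sketch only reports points where the first difference changes from positive to non-positive, so a shoulder on a rising flank, where the slope flattens but never changes sign, goes undetected.

```python
def first_derivative_peaks(y):
    """Return indices where the first difference switches from positive to non-positive."""
    d = [y[i + 1] - y[i] for i in range(len(y) - 1)]
    return [i + 1 for i in range(len(d) - 1) if d[i] > 0 and d[i + 1] <= 0]

# Hypothetical signal: a shoulder around indices 3 to 5 on the rising flank, apex at index 6.
signal = [0, 1, 3, 6, 7, 8, 12, 8, 3, 1, 0]
apexes = first_derivative_peaks(signal)
print(apexes)  # only the main apex is reported; the shoulder is missed
```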
Because peak shapes in chromatographic signals are complex, and single peaks, shoulder peaks, tailing peaks and the like can all occur, an effective spectral peak identification method is urgently needed to divide the spectral peak regions accurately and lay the foundation for subsequent quantitative calculation.
Disclosure of Invention
To achieve the above objects and other advantages and in accordance with the purpose of the invention, a first object of the present invention is to provide a signal-to-noise ratio based multi-window spectral peak identification method, comprising the steps of:
preprocessing original spectrogram data;
identifying peaks, troughs and gently fluctuating non-peak data with a multi-window spectral peak identification algorithm, and finding the peak tops;
and calculating the signal-to-noise ratio of each data point, correcting the peak boundaries using the signal-to-noise ratio, and identifying the peak boundaries according to the signal-to-noise ratio and peak width settings to obtain all peak information.
Further, identifying the peaks, troughs and other data with the multi-window spectral peak identification algorithm and finding the peak tops comprises the following steps:
calculating the mean and standard deviation of the preprocessed spectrogram data using a plurality of fixed-size sliding windows, and finding the highest point of each peak region;
and superimposing sliding windows of different lengths and repeating the single-window identification step to identify the spectral peaks and obtain the peak top set Peaks.
Further, calculating the mean and standard deviation of the preprocessed spectrogram data using a plurality of fixed-size sliding windows comprises the following steps:
inputting the preprocessed spectrogram data, setting the sliding window length to n, and calculating the data mean avg and standard deviation std within the window;
setting a threshold m, so that the signal intensity fluctuation range is m × std, determining the signal type according to this range, and setting a signal label flag;
classifying the signal into three categories according to the difference between the signal intensity value and the mean, $\Delta x = \log(X_i) - \mathrm{avg}$: if $\Delta x$ is above the signal intensity fluctuation range, the data point is identified as part of a peak region and labeled flag = 1; if $\Delta x$ is below the signal intensity fluctuation range, the data point is identified as part of a trough; if $\Delta x$ lies within the signal intensity fluctuation range, the data point is identified as a gently fluctuating non-peak signal and labeled flag = 0;
and if flag is not equal to 1, sliding the window forward and recalculating the data mean avg and standard deviation std within the window; the above steps are repeated until all data have been processed.
Further, finding the peak top of each peak region comprises the following steps:
each data point $X_i$ has a corresponding label $\mathrm{flag}_i$, and the peak flag bit peakflag is initialized to 0; all labels are traversed; if $\mathrm{flag}_i = 1$ and peakflag = 0, the point preceding the data point is determined to be the peak start point and peakflag is set to 1; if the labels of the data point and of the next data point are both not equal to 1 and peakflag = 1, the data point is determined to be the peak end point and peakflag is set to 0; between the peak start point and the peak end point, the data point with the highest intensity is located, which is the peak top; the intensity of the peak top is compared with the peak threshold, and the peak information is output if the intensity is greater than the threshold and is not output if it is smaller; the above steps are repeated until all peaks have been found.
Further, the step of calculating the signal-to-noise ratio of each data point comprises the steps of:
dividing the preprocessed original spectrogram data into a plurality of data units, and calculating the signal-to-noise ratio estimation value of all data points in each data unit based on histogram statistics.
Further, dividing the preprocessed spectrogram data into a plurality of data units and calculating the signal-to-noise ratio estimates of all data points in each data unit based on histogram statistics comprises the following steps:
setting the number of histogram bins to $N_{bin}$, an empirical value, so that the histogram is divided into $N_{bin}$ segments, each with a range length of:
$INS_{SIZE} = INS_{MAX} / N_{bin}$
where $INS_{MAX}$ is the data selection threshold, $INS_{MAX} = E(X) + \eta \cdot \mathrm{stdev}(X)$, $\eta$ is a constant, and $X$ denotes the vector formed by the data points;
removing the data that exceed $INS_{MAX}$ and counting the data points of the first data unit that are smaller than $INS_{MAX}$ into a histogram whose segment intervals are:
$[0, INS_{SIZE}), [INS_{SIZE}, 2\,INS_{SIZE}), \ldots, [(N_{bin}-1)\,INS_{SIZE}, N_{bin}\,INS_{SIZE})$;
assigning all data points smaller than $INS_{MAX}$ to the segment intervals and counting the frequency in each interval;
sorting the $N_{bin}$ segment intervals by the number of data points falling into them and selecting the interval corresponding to the median count, $[(N_m-1)\,INS_{SIZE}, N_m\,INS_{SIZE})$; the initial noise estimate $n_0$ is: $n_0 = (N_m - 0.5)\,INS_{SIZE}$;
the noise estimate is corrected to: $n = \max\{1, (N_m - 0.5)\,INS_{SIZE}\}$;
the signal-to-noise ratio estimate for the data points of the data unit is: $yn_t = y(t)/n$;
And repeating the steps to calculate the signal-to-noise ratio of all the data points.
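As a worked illustration of the noise estimate, the arithmetic below uses hypothetical values that do not come from the patent: $E(X) = 30$, $\mathrm{stdev}(X) = 10$, $\eta = 3$, $N_{bin} = 30$, median segment index $N_m = 3$, and a data point with intensity $y(t) = 50$. It also assumes that the segment length is $INS_{MAX}/N_{bin}$, which is inferred from the list of segment intervals.

```latex
\begin{align*}
INS_{MAX}  &= E(X) + \eta\,\mathrm{stdev}(X) = 30 + 3 \times 10 = 60,\\
INS_{SIZE} &= INS_{MAX} / N_{bin} = 60 / 30 = 2,\\
n_0        &= (N_m - 0.5)\,INS_{SIZE} = (3 - 0.5) \times 2 = 5,\\
n          &= \max\{1,\ n_0\} = 5,\\
yn_t       &= y(t)/n = 50 / 5 = 10.
\end{align*}
```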
Further, correcting the peak boundary using the signal-to-noise ratio comprises the following steps:
for a peak top pi in the peak top set Peaks, traversing all points to the left of pi up to the right boundary of the previous spectral peak; when a point pi_s satisfies the following conditions: the signal intensity of pi_s is lower than the signal intensity of the point to the right of the peak top pi; the absolute value of the retention time difference between pi_s and pi is smaller than the preset peak width W; and the signal-to-noise ratio of pi_s is greater than the input signal-to-noise ratio threshold T1; the point pi_s is taken as the peak start point corresponding to the peak top pi;
for the peak top pi in the peak top set Peaks, traversing all points to the right of pi; when a point pi_d satisfies the following conditions: the signal intensity of pi_d is lower than the signal intensity of the point to the left of the peak top pi; the absolute value of the retention time difference between pi and pi_d is smaller than the preset peak width W; and the signal-to-noise ratio of pi_d is greater than the input signal-to-noise ratio threshold T1; the point pi_d is taken as the peak end point corresponding to the peak top pi;
traversing all peak tops in the peak top set Peaks and repeating the above steps to obtain the peak start point and peak end point corresponding to each peak top, which gives the corrected peak boundaries.
A second object of the present invention is to provide an electronic device, comprising: a memory storing program code; and a processor coupled to the memory, wherein the program code, when executed by the processor, implements the signal-to-noise ratio based multi-window spectral peak identification method.
It is a third object of the present invention to provide a computer readable storage medium having stored thereon program instructions that, when executed, implement a signal-to-noise ratio based multi-window spectral peak identification method.
It is a fourth object of the invention to provide a computer program product comprising computer programs/instructions which, when executed by a processor, implement a multi-window spectral peak identification method based on signal-to-noise ratio.
Compared with the prior art, the invention has the beneficial effects that:
The invention provides a multi-window spectral peak identification method based on signal-to-noise ratio. Working from the spectral signal, a multi-window spectral peak identification algorithm quickly identifies peaks, troughs and normal data, divides the spectral peak regions and accurately locates the peak tops. The signal-to-noise ratio of each data point is then calculated to correct the peak boundaries, so that the boundaries can be identified accurately according to the signal-to-noise ratio and peak width settings.
The invention corrects the peak boundary based on the signal-to-noise ratio estimation algorithm of the histogram, the corrected spectral peak area is smaller, and the spectral peak is basically symmetrical, thereby being beneficial to the subsequent data processing and improving the accuracy of the result.
The foregoing description is only an overview of the technical solutions of the present invention, and in order to make the technical solutions of the present invention more clearly understood and to be implemented according to the content of the description, the following detailed description is given with reference to the preferred embodiments of the present invention and the accompanying drawings. The detailed description of the present invention is given in detail by the following examples and the accompanying drawings.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and constitute a part of this application, illustrate embodiments of the invention and, together with the description, serve to explain the invention without unduly limiting it. In the drawings:
FIG. 1 is a flow chart of a multi-window spectral peak identification method based on signal-to-noise ratio in example 1;
FIG. 2 is a flowchart of the multi-window spectral peak identification algorithm of example 1;
FIG. 3 is a chart of mass spectral signal classification;
FIG. 4 is a spectrum corresponding to a mass spectrum signal;
FIG. 5 is a diagram of spectral peak information obtained by a multi-window spectral peak identification algorithm;
FIG. 6 is a plot of spectral peak information corrected based on signal-to-noise ratio;
FIG. 7 is a diagram of the peak boundaries corrected with an added screening condition on the peak start point;
FIG. 8 is a schematic view of the electronic device of example 2.
Detailed Description
The present invention will be further described with reference to the accompanying drawings and the detailed description, and it should be noted that any combination of the embodiments or technical features described below can be used to form a new embodiment without conflict.
Existing spectral peak identification methods can identify spectral peak regions, but they are accurate only for the peak tops and the number of peaks. Peak boundary points are identified poorly, which leads to overly large peak regions and asymmetric peaks and hinders subsequent quantitative analysis. The invention therefore builds on the accurate identification of peak tops and corrects the peak boundaries with a histogram-based signal-to-noise ratio estimation algorithm. The corrected peak regions are smaller and the peaks are essentially symmetric, which facilitates subsequent data processing and improves the accuracy of the results.
Example 1
A multi-window spectral peak identification method based on signal-to-noise ratio, as shown in fig. 1, includes the following steps:
preprocessing the raw spectrogram data; specifically, the raw spectrogram data RData are passed through a Savitzky-Golay (S-G) filter to remove high-frequency noise, giving the preprocessed signal X.
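The preprocessing step can be sketched as follows. The patent does not state the window length or polynomial order of the S-G filter, so this minimal example assumes a 5-point quadratic filter with the standard convolution coefficients (-3, 12, 17, 12, -3)/35; the function name `savgol5`, the edge handling and the sample data are likewise hypothetical.

```python
def savgol5(rdata):
    """5-point quadratic Savitzky-Golay smoothing.

    Assumption: the two points at each edge are copied unchanged,
    because the patent does not describe edge handling.
    """
    coeffs = (-3, 12, 17, 12, -3)
    x = list(rdata)
    for i in range(2, len(rdata) - 2):
        x[i] = sum(c * rdata[i + k - 2] for k, c in enumerate(coeffs)) / 35.0
    return x

# Hypothetical raw data: a smooth quadratic trend with one noisy point at index 4.
RData = [float(i * i) for i in range(8)]
RData[4] += 3.5
X = savgol5(RData)
```

A quadratic trend passes through this filter unchanged, so only the added disturbance at index 4 is attenuated.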
As shown in fig. 2, the peak, the trough and the peak-free data with smooth fluctuation are identified by a multi-window spectral peak identification algorithm, and the peak top is found.
The multi-window spectral peak recognition algorithm calculates the mean value and the standard deviation of the preprocessed original spectrogram data by using a plurality of sliding windows with fixed sizes, and searches the peak highest point of each peak area.
Single window peak identification includes the steps of:
inputting the preprocessed original spectrogram data X, setting the length of a sliding window to be n, and calculating a data mean value avg and a standard deviation std in the window width;
setting a threshold m, so that the signal intensity fluctuation range is m × std, determining the signal type according to this range, and setting the signal label flag according to the following rule:
the signal is classified into three categories according to the difference between the signal intensity value and the mean, $\Delta x = \log(X_i) - \mathrm{avg}$: if $\Delta x$ is above the signal intensity fluctuation range, the data point is identified as part of a peak region and labeled flag = 1; if $\Delta x$ is below the signal intensity fluctuation range, the data point is identified as part of a trough; if $\Delta x$ lies within the signal intensity fluctuation range, the data point is regarded as a common signal, that is, a gently fluctuating non-peak signal;
if flag is not equal to 1, the window slides forward and the data mean avg and standard deviation std within the window are recalculated; the above steps are repeated until all data have been processed.
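The single-window labeling can be sketched as below. This is only one reading of the steps above, under several assumptions that the patent does not spell out: the window statistics are taken over the log-intensities, the first n points seed the window and keep the label 0, "above" and "below" the range mean $\Delta x > m \cdot \mathrm{std}$ and $\Delta x < -m \cdot \mathrm{std}$, the trough label is -1, and the population standard deviation is used. The function name and sample data are hypothetical.

```python
import math
import statistics

def classify_single_window(x, n, m):
    """Label each point: 1 = peak region, -1 = trough (assumed value), 0 = flat non-peak."""
    logs = [math.log(v) for v in x]        # assumes strictly positive intensities
    flags = [0] * len(x)                   # assumption: the first n points only seed the window
    window = logs[:n]
    for i in range(n, len(x)):
        avg = statistics.mean(window)
        std = statistics.pstdev(window)
        dx = logs[i] - avg                 # delta x = log(X_i) - avg
        if dx > m * std:
            flags[i] = 1
        elif dx < -m * std:
            flags[i] = -1
        if flags[i] != 1:                  # the window only advances past non-peak points
            window = window[1:] + [logs[i]]
    return flags

# Hypothetical preprocessed signal with one peak at indices 8 to 10.
x = [10, 11, 10, 9, 10, 11, 10, 9, 50, 200, 60, 10, 11, 9, 10]
flags = classify_single_window(x, n=5, m=3)
```

Freezing the window while points are labeled 1 keeps the peak itself from inflating avg and std, which is presumably why the text slides the window only when flag is not 1.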
The searching of the peak highest point of each peak area comprises the following steps:
After the above steps, each data point $X_i$ has a corresponding label $\mathrm{flag}_i$. The peak flag bit peakflag is initialized to 0 and a peak threshold is set, so that only mass spectrum peaks whose peak value exceeds the threshold are retained. All labels are traversed. If $\mathrm{flag}_i = 1$ and peakflag = 0, the point preceding the data point is determined to be the start of the peak region, that is, the peak start point, and peakflag is set to 1. If the labels of the data point and of the next data point are both not equal to 1 and peakflag = 1, the data point is determined to be the peak end point, and peakflag is set to 0. Within the region from the peak start point to the peak end point, the data point with the highest intensity is located; this is the peak top. The intensity of the peak top is compared with the peak threshold: if it is greater than the threshold, the peak information is output; if it is smaller, the peak information is not output. The above steps are repeated until all peaks have been found.
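The traversal described above can be sketched as a small state machine. The function name, the tuple layout of the output and the sample data are hypothetical; a peak region that is still open when the data end is simply dropped here, which is an assumption because the text does not cover that case.

```python
def find_peak_tops(x, flags, peak_threshold):
    """Return (start, top, end) index triples for regions whose apex exceeds peak_threshold."""
    peaks = []
    peakflag = 0
    start = 0
    for i, f in enumerate(flags):
        if f == 1 and peakflag == 0:
            start = max(i - 1, 0)          # the point before the first flagged point
            peakflag = 1
        elif peakflag == 1 and f != 1 and (i + 1 >= len(flags) or flags[i + 1] != 1):
            end = i                        # this point and the next one are both not flagged
            peakflag = 0
            top = max(range(start, end + 1), key=lambda k: x[k])
            if x[top] > peak_threshold:
                peaks.append((start, top, end))
    return peaks

# Hypothetical signal and labels (1 marks the peak region at indices 8 to 10).
x = [10, 11, 10, 9, 10, 11, 10, 9, 50, 200, 60, 10, 11, 9, 10]
flags = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0]
peaks = find_peak_tops(x, flags, peak_threshold=100)
```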
On the basis of the single window, sliding windows of different lengths are superimposed and the single-window identification step is repeated, so that the spectral peaks can be identified accurately and the peak top set Peaks is obtained.
In this embodiment, two sliding windows of different lengths, a large sliding window (window width > 100) and a small sliding window (window width < 50), are combined to identify the mass spectrum peaks and obtain the peak top set Peaks and the peak boundary set. FIG. 3 shows the signal classification for part of the mass spectrum data, and FIG. 4 shows the corresponding spectrum.
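The text does not say how the results of the individual windows are combined. As one possible reading, the sketch below takes the union of the peak tops found with each window length and treats tops that lie closer together than a minimum separation as the same peak, keeping the more intense one. The function name, the `min_separation` parameter and the sample values are all assumptions made for illustration.

```python
def merge_peak_tops(x, tops_per_window, min_separation):
    """Union the apex indices from each window length.

    Apexes closer than min_separation points are treated as one peak,
    and the more intense of the two is kept (assumed merge rule).
    """
    merged = []
    for top in sorted(set(t for tops in tops_per_window for t in tops)):
        if merged and top - merged[-1] < min_separation:
            if x[top] > x[merged[-1]]:
                merged[-1] = top
        else:
            merged.append(top)
    return merged

# Hypothetical intensities and per-window results.
x = [1, 2, 9, 3, 1, 1, 4, 20, 21, 5, 1]
small_window_tops = [2, 7]   # e.g. from a window width < 50
large_window_tops = [8]      # e.g. from a window width > 100
Peaks = merge_peak_tops(x, [small_window_tops, large_window_tops], min_separation=3)
```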
Although multi-window peak identification can identify peak boundaries, its main strength is the accurate identification of peak tops. For some mass spectrum peaks the identified boundaries are too wide, which affects the peak area calculation and the quantitative results in subsequent analysis, so the peak boundaries need further optimization. A signal-to-noise ratio estimation method based on histogram statistics is therefore used to optimize the peak boundaries and make the spectral peak information more accurate.
And calculating the signal-to-noise ratio of each data point, correcting the peak boundary through the signal-to-noise ratio, and setting and identifying the peak boundary according to the signal-to-noise ratio and the peak width to obtain all peak information.
The spectrogram data X are divided into a plurality of data units, and for each data unit the signal-to-noise ratio estimates of all data points in the unit are calculated as follows, where $X$ denotes the vector formed by the data points and the data selection threshold is denoted $INS_{MAX}$, with $INS_{MAX} = E(X) + \eta \cdot \mathrm{stdev}(X)$ and the constant $\eta = 3$.
The number of histogram bins is set to $N_{bin}$, with the empirical value $N_{bin} = 30$, so that the histogram is divided into $N_{bin}$ segments, each with a range length of:
$INS_{SIZE} = INS_{MAX} / N_{bin}$
where $INS_{MAX}$ is the data selection threshold defined above;
the data that exceed $INS_{MAX}$ are removed, and the data points of the first data unit R1 that are smaller than $INS_{MAX}$ are counted into a histogram whose segment intervals are:
$[0, INS_{SIZE}), [INS_{SIZE}, 2\,INS_{SIZE}), \ldots, [(N_{bin}-1)\,INS_{SIZE}, N_{bin}\,INS_{SIZE})$;
all data points smaller than $INS_{MAX}$ are assigned to the segment intervals, and the frequency in each interval is counted;
the $N_{bin}$ segment intervals are sorted by the number of data points falling into them, and the interval corresponding to the median count, $[(N_m-1)\,INS_{SIZE}, N_m\,INS_{SIZE})$, is selected; the initial noise estimate $n_0$ is: $n_0 = (N_m - 0.5)\,INS_{SIZE}$
Since the noise should in principle be at least 1, the noise estimate is corrected to: $n = \max\{1, (N_m - 0.5)\,INS_{SIZE}\}$;
The signal-to-noise ratio estimate for the data points of the data unit is therefore: $yn_t = y(t)/n$;
And repeating the steps to calculate the signal-to-noise ratio of all the data points.
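The histogram-based estimate for a single data unit can be sketched as follows. Several details are assumptions rather than statements from the patent: the segment length is taken as $INS_{MAX}/N_{bin}$ (inferred from the interval list), the population standard deviation is used, intensities are assumed non-negative, and with an even number of bins the "median" bin is taken as the upper of the two middle bins after sorting by count (ties keep their original order). The function name and sample data are hypothetical.

```python
import statistics

def snr_for_unit(y, eta=3.0, n_bin=30):
    """Return (per-point SNR estimates, noise estimate n) for one data unit."""
    ins_max = statistics.mean(y) + eta * statistics.pstdev(y)  # INS_MAX = E(X) + eta * stdev(X)
    ins_size = ins_max / n_bin                                 # assumed segment length
    if ins_size <= 0:                                          # degenerate unit: fall back to n = 1
        return [v / 1.0 for v in y], 1.0
    counts = [0] * n_bin
    for v in y:
        if 0 <= v < ins_max:                                   # values at or above INS_MAX are dropped
            counts[min(int(v / ins_size), n_bin - 1)] += 1
    order = sorted(range(n_bin), key=lambda b: counts[b])      # bins arranged by their counts
    n_m = order[n_bin // 2] + 1                                # 1-based index of the median bin
    n0 = (n_m - 0.5) * ins_size                                # initial estimate: centre of that bin
    n = max(1.0, n0)                                           # noise is at least 1
    return [v / n for v in y], n

# Hypothetical data unit with a single intense point.
y = [5.0, 6.0, 7.0, 5.0, 6.0, 500.0, 6.0, 5.0]
snr, n = snr_for_unit(y)
```

With only a handful of points most bins are empty and the choice of median bin is dominated by ties, so a realistic data unit would need many more points than this toy input.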
Correcting the peak boundaries comprises the steps of:
for a peak top pi in the peak top set Peaks, all points to the left of pi are traversed up to the right boundary of the previous spectral peak; when a point pi_s satisfies the following conditions: the signal intensity of pi_s is lower than the signal intensity of the point to the right of the peak top pi; the absolute value of the retention time difference between pi_s and pi is smaller than the preset peak width W; and the signal-to-noise ratio of pi_s is greater than the input signal-to-noise ratio threshold T1; the point pi_s is taken as the peak start point corresponding to the peak top pi;
all peak tops in the peak top set Peaks are traversed, and the peak start point corresponding to each peak top in Peaks is obtained in this way;
finding the end point of each peak comprises the following steps:
for the peak top pi in the peak top set Peaks, all points to the right of pi are traversed; when a point pi_d satisfies the following conditions: the signal intensity of pi_d is lower than the signal intensity of the point to the left of the peak top pi; the absolute value of the retention time difference between pi and pi_d is smaller than the preset peak width W; and the signal-to-noise ratio of pi_d is greater than the input signal-to-noise ratio threshold T1; the point pi_d is taken as the peak end point corresponding to the peak top pi;
all peak tops in the peak top set Peaks are traversed, and the peak end point corresponding to each peak top in Peaks is obtained in this way.
And finally obtaining the corrected peak boundary.
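The boundary correction can be sketched as below, with the caveat that the translated text leaves two points open, so the sketch fixes them by assumption: the intensity condition is read as "lower than the neighbouring point on the apex side", and the boundary is taken as the outermost point reached by walking outward from the apex while all three conditions hold. The function name, argument names and sample data are hypothetical.

```python
def correct_boundaries(rt, y, snr, tops, width, t1):
    """Return (start, top, end) index triples with SNR-corrected boundaries.

    rt: retention times, y: intensities, snr: per-point SNR estimates,
    tops: sorted apex indices, width: preset peak width W, t1: SNR threshold T1.
    """
    bounds = []
    prev_end = 0                                   # right boundary of the previous peak
    for top in tops:
        start = top
        for s in range(top - 1, prev_end - 1, -1): # walk left, no further than the previous peak
            if not (y[s] < y[s + 1] and abs(rt[top] - rt[s]) < width and snr[s] > t1):
                break
            start = s
        end = top
        for d in range(top + 1, len(y)):           # walk right
            if not (y[d] < y[d - 1] and abs(rt[d] - rt[top]) < width and snr[d] > t1):
                break
            end = d
        bounds.append((start, top, end))
        prev_end = end
    return bounds

# Hypothetical peak sampled every 0.1 time units; the SNR equals the intensity (noise n = 1).
rt = [i / 10 for i in range(11)]
y = [1, 1, 2, 5, 20, 50, 20, 5, 2, 1, 1]
bounds = correct_boundaries(rt, y, snr=y, tops=[5], width=0.35, t1=3)
```

Here the walk stops at indices 3 and 7, where the next point outward falls below the SNR threshold, so the corrected region is symmetric about the apex.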
FIG. 5 shows the spectral peak information (peak top, peak start point, peak end point) obtained by the multi-window spectral peak identification algorithm, and FIG. 6 shows the spectral peak information after correction based on the signal-to-noise ratio. As the figures show, the multi-window algorithm can identify the peak boundaries, but the resulting peak regions are large and mainly the peak tops are identified accurately; after the boundaries are corrected using the signal-to-noise ratio, the peak regions shrink and the peaks are identified more accurately, which benefits subsequent quantitative analysis.
In one embodiment, to address the inaccurate peak boundaries produced by multi-window spectral peak identification, a screening condition on the peak start point is added during the peak search: the intensity of the peak start point must be greater than 0.1 × the intensity of the corresponding peak top. The peak boundaries corrected in this way are shown in FIG. 7. Compared with the uncorrected boundaries, the improvement is clear and the peak regions are smaller, but the result is not as good as multi-window peak identification with signal-to-noise ratio correction, and the peak symmetry is slightly worse.
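The intensity-fraction screening of this embodiment can be sketched as follows. The text only states the condition (start intensity greater than 0.1 × the peak top intensity); moving the start point toward the apex until the condition holds is an assumed way of applying it, and the function name and sample data are hypothetical.

```python
def screen_peak_start(y, start, top, fraction=0.1):
    """Move the start index toward the apex until y[start] > fraction * y[top]."""
    s = start
    while s < top and y[s] <= fraction * y[top]:
        s += 1
    return s

# Hypothetical peak with an overly wide start at index 0 and apex at index 5.
y = [1, 1, 2, 5, 20, 50, 20, 5, 2, 1, 1]
new_start = screen_peak_start(y, start=0, top=5)
```

Because this rule depends only on the apex height, it is applied here to the start point alone, which is consistent with the slightly poorer symmetry reported for FIG. 7.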
The invention provides a multi-window spectral peak identification method based on signal-to-noise ratio. The spectral peak regions are first divided with the multi-window spectral peak identification algorithm, and the peak tops are stored in the set Peaks. The boundaries of these regions are not particularly accurate, however: the peak regions are somewhat too large and the peak start points lie somewhat too far from the peak tops, so the boundaries need to be corrected. The correction proceeds as follows: the signal is divided into a plurality of data units and the signal-to-noise ratio of each data point is calculated based on histogram statistics; the peak tops in Peaks are then traversed, a peak start point is selected on the left side of each peak top and a peak end point on its right side using the signal intensity and signal-to-noise ratio conditions, and the corrected spectral peak regions are obtained. The corrected regions are small and essentially symmetric, which benefits further quantitative analysis.
Example 2
An electronic device 200, as shown in FIG. 8, includes but is not limited to: a memory 201 having program code stored thereon; and a processor 202 coupled to the memory, wherein the signal-to-noise-ratio-based multi-window spectral peak identification method is implemented when the program code is executed by the processor. For the detailed description of the method, reference may be made to the corresponding description in the above method embodiments, and details are not repeated here.
Example 3
A computer readable storage medium having stored thereon program instructions that when executed implement a signal-to-noise ratio based multi-window spectral peak identification method. For the detailed description of the method, reference may be made to the corresponding description in the above method embodiments, which is not repeated herein.
Example 4
A computer program product comprising a computer program/instructions that when executed by a processor implement a signal-to-noise ratio based multi-window spectral peak identification method. For the detailed description of the method, reference may be made to the corresponding description in the above method embodiments, which is not repeated herein.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on different points from other embodiments.
The foregoing is merely an example of the present specification and is not intended to limit one or more embodiments of the present specification. Various modifications and alterations to one or more embodiments described herein will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of one or more embodiments of the present specification should be included in the scope of the claims of one or more embodiments of the present specification.

Claims (10)

1. The multi-window spectral peak identification method based on the signal-to-noise ratio is characterized by comprising the following steps of:
preprocessing raw spectrogram data;
identifying wave crests, wave troughs and smoothly fluctuating peak-free data through a multi-window spectral peak identification algorithm, and finding the peak tops;
and calculating the signal-to-noise ratio of each data point, correcting the peak boundary through the signal-to-noise ratio, and setting and identifying the peak boundary according to the signal-to-noise ratio and the peak width to obtain all peak information.
2. The signal-to-noise ratio-based multi-window spectral peak identification method according to claim 1, wherein the identifying the peaks, valleys and other data by the multi-window spectral peak identification algorithm, and the finding the peak top comprises the following steps:
calculating the mean value and the standard deviation of the preprocessed original spectrogram data by using a plurality of sliding windows with fixed sizes, and searching the peak value highest point of each peak area;
and overlapping sliding windows with different lengths, repeating the single-window identification step, and identifying the spectrum peak to obtain a peak set Peaks.
3. The signal-to-noise ratio-based multi-window spectral peak identification method according to claim 2, wherein the calculating the mean and standard deviation of the preprocessed raw spectral data using a plurality of sliding windows of fixed size comprises the steps of:
inputting the preprocessed original spectrogram data, setting the length of a sliding window to be n, and calculating a data mean value avg and a standard deviation std in the window width;
setting a threshold value as m, setting a signal intensity fluctuation range value as m × std, judging the type of a signal according to the signal intensity fluctuation range value, and setting a signal label flag;
classifying the signal into three categories according to the difference $\Delta x$ between the signal intensity value and the mean value, $\Delta x = \log(X_i) - avg$: if $\Delta x$ is greater than the signal intensity fluctuation range, flag = 1 and the data point is identified as part of a peak region; if $\Delta x$ is less than the signal intensity fluctuation range, flag = -1 and the data point is identified as part of a trough; if $\Delta x$ is within the signal intensity fluctuation range, the data point is considered a flat, peak-free signal;
and if the flag is not equal to 1, the window starts to slide forwards, the data mean value avg and the standard deviation std in the window are recalculated, and the steps are repeated until all the data are processed.
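The single-window labelling step of claim 3 can be sketched roughly as follows. Several details are assumptions not fixed by the claim: the flat label is taken to be 0, "less than the fluctuation range" is read as Δx < -m × std, the window statistics are computed on log-intensities, the first n points seed the window, and the population standard deviation is used. The name `classify_points` is hypothetical.

```python
import math
from statistics import mean, pstdev


def classify_points(x, n=5, m=3.0):
    """Label each point 1 (peak region), -1 (trough) or 0 (flat, assumed value)."""
    logx = [math.log(v) for v in x]      # requires strictly positive intensities
    flags = [0] * len(x)
    window = logx[:n]                    # assumed: first n points seed the window
    for i in range(n, len(logx)):
        avg, std = mean(window), pstdev(window)
        dx = logx[i] - avg
        if dx > m * std:
            flags[i] = 1
        elif dx < -m * std:
            flags[i] = -1
        else:
            flags[i] = 0
        # The window only slides forward when the point is not in a peak region,
        # so peak points do not inflate the baseline statistics.
        if flags[i] != 1:
            window = window[1:] + [logx[i]]
    return flags


trace = [10, 11, 10, 9, 10, 10, 200, 500, 200, 10, 10]
print(classify_points(trace))
```

Freezing the window while flag = 1 is one reading of "if the flag is not equal to 1, the window starts to slide forwards"; the multi-window variant of claim 2 would repeat this for several values of n and combine the results.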
4. The signal-to-noise ratio based multi-window spectral peak identification method according to claim 3, wherein the step of finding the peak of each peak region comprises the steps of:
traversing all labels, where label flag_i is the label corresponding to each data point X_i; if flag_i = 1 and the peak flag bit peakflag = 0, the data point is judged to be a peak start point and the peak flag bit is set; if the label of a data point is not 1 and the label of the next data point is also not 1, the data point is judged to be a peak end point and peakflag is set to 0; the position of the data point with the highest intensity between the peak start point and the peak end point, namely the peak top, is then searched, and the intensity of the peak top is compared with the peak threshold; if the intensity of the peak top is greater than the peak threshold, the peak information is output; if it is less than the peak threshold, the peak information is not output; and the above steps are repeated until all peaks are found.
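The peak-top search of claim 4 can be sketched as a single pass over the labels. This is an illustrative reading only: the function name `find_peak_tops`, the tuple layout of the result, and the handling of a peak that is still open at the end of the data (it is simply dropped) are assumptions.

```python
def find_peak_tops(x, flags, peak_threshold):
    """Return (start, top, end) index triples for peaks above peak_threshold."""
    peaks = []
    in_peak = False          # plays the role of the peak flag bit
    start = 0
    for i, f in enumerate(flags):
        if f == 1 and not in_peak:
            start, in_peak = i, True            # peak start point
        elif in_peak and f != 1:
            nxt = flags[i + 1] if i + 1 < len(flags) else 0
            if nxt != 1:                        # this and the next label are not 1
                in_peak = False                 # peak end point
                top = max(range(start, i + 1), key=lambda k: x[k])
                if x[top] > peak_threshold:
                    peaks.append((start, top, i))
    return peaks


trace = [10, 11, 10, 9, 10, 10, 200, 500, 200, 10, 10]
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0]
print(find_peak_tops(trace, labels, peak_threshold=100))
```

Requiring two consecutive non-peak labels before closing a peak keeps a single noisy point from splitting one peak into two.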
5. The signal-to-noise ratio based multi-window spectral peak identification method of claim 1, wherein the step of calculating the signal-to-noise ratio of each data point comprises the steps of:
dividing the preprocessed original spectrogram data into a plurality of data units, and calculating the signal-to-noise ratio estimation value of all data points in each data unit based on histogram statistics.
6. The method as claimed in claim 5, wherein the step of dividing the preprocessed original spectrogram data into a plurality of data units, and the step of calculating the SNR estimates for all data points in each data unit based on histogram statistics comprises the steps of:
setting the number of bins of the histogram to $N_{bin}$, where $N_{bin}$ is an empirical value, and dividing the histogram into $N_{bin}$ segments, each segment having a range length of:
[Formula image RE-FDA0003784959180000021: definition of the segment length $INS_{SIZE}$]
where $INS_{MAX}$ is the data selection threshold, $INS_{MAX} = E(X) + \eta \cdot \mathrm{stdev}(X)$, $\eta$ is a constant, and $X$ represents the vector formed by the data points;
removing the data exceeding $INS_{MAX}$, and counting the data points smaller than $INS_{MAX}$ in the first data unit into a histogram, the segment intervals of the histogram being:
$[0, INS_{SIZE}), [INS_{SIZE}, 2\,INS_{SIZE}), \ldots, [(N_{bin}-1)\,INS_{SIZE}, N_{bin}\,INS_{SIZE})$;
assigning all data points smaller than $INS_{MAX}$ to the segment intervals, and counting the frequency in each segment interval;
sorting the $N_{bin}$ segment intervals by the number of data points falling into each of them, and selecting the segment interval $[(N_m-1)\,INS_{SIZE}, N_m\,INS_{SIZE})$ corresponding to the median of the numbers of data points, the initial noise estimate $n_0$ being: $n_0 = (N_m - 0.5)\,INS_{SIZE}$;
the noise estimate is corrected to: $n = \max\{1, (N_m - 0.5)\,INS_{SIZE}\}$;
the signal-to-noise ratio estimate for the data points of the data unit is: $y_n(t) = y(t)/n$;
And repeating the steps to calculate the signal-to-noise ratio of all the data points.
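A minimal sketch of the histogram-based noise estimate for one data unit is given below. The segment length formula is only available as an image in the source, so the sketch assumes INS_SIZE = INS_MAX / N_bin, which is consistent with the listed intervals but is not confirmed by the text. The default values of `n_bin` and `eta`, the use of the population standard deviation, the tie-breaking when sorting bins, and the function names are likewise assumptions.

```python
from statistics import mean, pstdev


def estimate_noise(unit, n_bin=10, eta=3.0):
    """Histogram-based noise estimate n for one data unit."""
    ins_max = mean(unit) + eta * pstdev(unit)     # INS_MAX = E(X) + eta * stdev(X)
    if ins_max <= 0:
        return 1.0                                # degenerate unit: fall back to the floor
    ins_size = ins_max / n_bin                    # ASSUMED definition of INS_SIZE
    counts = [0] * n_bin
    for v in unit:
        if 0 <= v < ins_max:                      # data at or above INS_MAX is discarded
            counts[min(int(v // ins_size), n_bin - 1)] += 1
    # Sort the bins by their counts and take the bin holding the median count.
    order = sorted(range(n_bin), key=lambda b: counts[b])
    n_m = order[n_bin // 2] + 1                   # 1-based bin index N_m
    n0 = (n_m - 0.5) * ins_size                   # initial estimate: bin centre
    return max(1.0, n0)                           # corrected estimate n


def snr(unit, **kwargs):
    """Signal-to-noise ratio estimate y(t) / n for every point of the unit."""
    n = estimate_noise(unit, **kwargs)
    return [y / n for y in unit]


unit = [float(v) for v in range(100)]
print(estimate_noise(unit), snr(unit)[:3])
```

The floor of 1 in the corrected estimate prevents very small noise values from producing arbitrarily large signal-to-noise ratios.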
7. The signal-to-noise ratio based multi-window spectral peak identification method of claim 2, wherein the step of correcting the peak boundary by the signal-to-noise ratio comprises the steps of:
for a peak top pi in the peak top set Peaks, all points on the left side of pi are traversed up to the right boundary of the previous spectral peak; when the signal intensity of a point pi_s is lower than the signal intensity of the point to the right of the peak top pi, the absolute value of the difference in retention time between point pi_s and point pi is less than the preset peak width W, and the signal-to-noise ratio of point pi_s is greater than the input signal-to-noise threshold T1, point pi_s is taken as the peak start point corresponding to the peak top pi;
for the peak top pi in the peak top set Peaks, all points on the right side of pi are traversed; when the signal intensity of a point pi_d is lower than the signal intensity of the point on the left of the peak top pi, the absolute value of the difference in retention time between point pi and point pi_d is less than the preset peak width W, and the signal-to-noise ratio of point pi_d is greater than the input signal-to-noise threshold T1, point pi_d is taken as the peak end point corresponding to the peak top pi;
traversing all peak tops in the peak top set Peaks, repeating the steps to obtain a peak starting point and a peak ending point corresponding to each peak top in the peak top set Peaks, and obtaining a modified peak boundary.
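The boundary correction of claim 7 can be sketched as a walk outward from each peak top. The claim does not say which of several qualifying points is selected, so this sketch assumes the walk continues while all three conditions hold and keeps the last qualifying point; the intensity condition is read as "lower than the neighbouring point on the peak-top side". The function name `correct_boundaries` and its argument names are hypothetical, and `tops` is assumed to be sorted in ascending order.

```python
def correct_boundaries(rt, intensity, snr, tops, width, t1):
    """Return (start, top, end) index triples with SNR-corrected boundaries.

    rt:        retention times
    intensity: signal intensities
    snr:       per-point signal-to-noise ratio estimates
    tops:      peak top indices, ascending
    width:     preset peak width W (same unit as rt)
    t1:        signal-to-noise threshold T1
    """
    bounds = []
    prev_end = 0                      # right boundary of the previous peak
    for top in tops:
        start, j = top, top - 1
        while (j >= prev_end and intensity[j] < intensity[j + 1]
               and abs(rt[top] - rt[j]) < width and snr[j] > t1):
            start = j
            j -= 1
        end, j = top, top + 1
        while (j < len(intensity) and intensity[j] < intensity[j - 1]
               and abs(rt[j] - rt[top]) < width and snr[j] > t1):
            end = j
            j += 1
        bounds.append((start, top, end))
        prev_end = end
    return bounds


rt = list(range(9))
inten = [1, 1, 2, 5, 20, 100, 30, 4, 1]
ratio = [0.5, 0.5, 1.0, 2.5, 10.0, 50.0, 15.0, 2.0, 0.5]
print(correct_boundaries(rt, inten, ratio, tops=[5], width=4, t1=1.5))
```

In this toy trace the low-SNR tails on both sides are excluded, which gives a narrower and more symmetrical peak region than the uncorrected boundary.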
8. An electronic device, comprising: a memory having program code stored thereon; a processor coupled with the memory and implementing the method of any of claims 1 to 7 when the program code is executed by the processor.
9. A computer-readable storage medium, having stored thereon program instructions which, when executed, implement the method of any one of claims 1 to 7.
10. A computer program product comprising computer programs/instructions, characterized in that the computer programs/instructions, when executed by a processor, implement the method according to any of claims 1 to 7.
CN202210494928.4A 2022-05-07 2022-05-07 Multi-window spectrum peak identification method, equipment, medium and product based on signal to noise ratio Active CN115078616B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210494928.4A CN115078616B (en) 2022-05-07 2022-05-07 Multi-window spectrum peak identification method, equipment, medium and product based on signal to noise ratio

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210494928.4A CN115078616B (en) 2022-05-07 2022-05-07 Multi-window spectrum peak identification method, equipment, medium and product based on signal to noise ratio

Publications (2)

Publication Number Publication Date
CN115078616A true CN115078616A (en) 2022-09-20
CN115078616B CN115078616B (en) 2024-06-07

Family

ID=83248080

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210494928.4A Active CN115078616B (en) 2022-05-07 2022-05-07 Multi-window spectrum peak identification method, equipment, medium and product based on signal to noise ratio

Country Status (1)

Country Link
CN (1) CN115078616B (en)

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5896463A (en) * 1996-09-30 1999-04-20 Siemens Corporate Research, Inc. Method and apparatus for automatically locating a region of interest in a radiograph
US20050267689A1 (en) * 2003-07-07 2005-12-01 Maxim Tsypin Method to automatically identify peak and monoisotopic peaks in mass spectral data for biomolecular applications
WO2014094039A1 (en) * 2012-12-19 2014-06-26 Rmit University A background correction method for a spectrum of a target sample
CN105518455A (en) * 2013-09-09 2016-04-20 株式会社岛津制作所 Peak detection method
JP2016173346A (en) * 2015-03-18 2016-09-29 株式会社島津製作所 Mass spectrum data processing device
US20190339205A1 (en) * 2016-12-26 2019-11-07 Nuctech Company Limited Method for removing background from spectrogram, method of identifying substances through raman spectrogram, and electronic apparatus
CN109001354A (en) * 2018-05-30 2018-12-14 迈克医疗电子有限公司 Wave crest recognition methods and device, chromatograph and storage medium
CN109993155A (en) * 2019-04-23 2019-07-09 北京理工大学 For the characteristic peak extracting method of low signal-to-noise ratio uv raman spectroscopy
CN111595992A (en) * 2020-06-30 2020-08-28 浙江三青环保科技有限公司 Rapid peak searching method for online gas chromatographic peak
CN112415078A (en) * 2020-11-18 2021-02-26 深圳市步锐生物科技有限公司 Mass spectrum data spectrogram signal calibration method and device
CN114166814A (en) * 2021-05-25 2022-03-11 北京理工大学 Raman spectrum peak identification method based on dual-scale correlation operation
CN113567603A (en) * 2021-07-22 2021-10-29 华谱科仪(大连)科技有限公司 Detection and analysis method of chromatographic spectrogram and electronic equipment
CN114444542A (en) * 2022-01-10 2022-05-06 中国科学院苏州生物医学工程技术研究所 Liquid chromatography peak noise estimation method, device, storage medium and system
CN114186596A (en) * 2022-02-17 2022-03-15 天津国科医工科技发展有限公司 Multi-window identification method and device for spectrogram peaks and electronic equipment

Non-Patent Citations (14)

* Cited by examiner, † Cited by third party
Title
CAKRAK, F AND LOUGHLIN, PJ: "2 Instantaneous frequency estimation of polynomial phase signals", PROCEEDINGS OF THE IEEE-SP INTERNATIONAL SYMPOSIUM ON TIME-FREQUENCY AND TIME-SCALE ANALYSIS, 31 December 1998 (1998-12-31) *
KRISTIN H. JARMAN等: "A new approach to automated peak detection", 《CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS》, vol. 69 *
MINGZHENG JIA等: "Quantitative Method for Liquid Chromatography-Mass Spectrometry Based on Multi-Sliding Window and Noise Estimation", PROCESSES, 1 June 2022 (2022-06-01) *
THOMAS P.BRONEZ等: "Alternate windows for multi-window spectral analysis", 《IEEE》 *
WANG YAN等: "Quantitative Analysis Algorithm of Liquid Chromatography-Triple Quadrupole Mass Spectrometry", JOURNAL OF TIANJIN UNIVERSITY (SCIENCE AND TECHNOLOGY), vol. 53, no. 6, 15 June 2020 (2020-06-15) *
余贵水;李晓龙;魏钟记: "多窗口自适应的面目标滤波算法研究", 电子设计工程, no. 008 *
姜承志;孙强;刘英;梁静秋;刘兵: "基于多尺度局部信噪比的拉曼谱峰识别算法", 光学学报, no. 006, 31 December 2014 (2014-12-31) *
季江;高鹏飞;贾南南;杨蕊;郭汉明;瑚琦;庄松林;: "自适应多尺度窗口平均光谱平滑", 光谱学与光谱分析, no. 05, 15 May 2015 (2015-05-15) *
李艳萍;宁跃飞;杨伟;: "基于限峰分离模糊直方图均衡化的图像增强算法", 西南师范大学学报(自然科学版), no. 03, 20 March 2018 (2018-03-20) *
范贤光;王秀芬;王昕;许英杰;阙靖;王小东;何浩;李韦;左勇;: "基于特征提取的低信噪比拉曼光谱去噪方法研究", 光谱学与光谱分析, no. 12, 15 December 2016 (2016-12-15) *
贾明正等: "LC-MS / MS 软件系统设计及定量分析研究", 质谱学报, vol. 41, no. 2 *
郑学书;郑舟;陈君;: "一种基于滑动窗口的离子迁移谱谱峰识别算法", 电子世界, no. 011, 8 June 2018 (2018-06-08) *
陈淑珍;麻红昭;: "一种色谱谱峰识别算法的实现", 计算机应用与软件, no. 11, 15 November 2013 (2013-11-15) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117332258A (en) * 2023-12-01 2024-01-02 奥谱天成(成都)信息科技有限公司 Near infrared absorption peak identification method, system and medium based on multi-scale Lorentz
CN117332258B (en) * 2023-12-01 2024-01-30 奥谱天成(成都)信息科技有限公司 Near infrared absorption peak identification method, system and medium based on multi-scale Lorentz

Also Published As

Publication number Publication date
CN115078616B (en) 2024-06-07

Similar Documents

Publication Publication Date Title
US10198630B2 (en) Peak detection method
CN107179310B (en) Raman spectrum characteristic peak recognition methods based on robust noise variance evaluation
CN107818298B (en) General Raman spectrum feature extraction method for machine learning substance identification algorithm
WO2017185963A1 (en) Big data-based method and terminal for matching trend curve local characteristics
CN117132778B (en) Spectrum measurement correction calculation method and system
CN109164450B (en) Downburst prediction method based on Doppler radar data
CN111089856B (en) Post-processing method for extracting Raman spectrum weak signal
CN111598827A (en) Appearance flaw detection method, electronic device and storage medium
CN105447506B (en) A kind of gesture identification method based on section distribution probability feature
CN115311507B (en) Building board classification method based on data processing
CN115078616B (en) Multi-window spectrum peak identification method, equipment, medium and product based on signal to noise ratio
CN112732748A (en) Non-invasive household appliance load identification method based on adaptive feature selection
CN110659374A (en) Method for searching images by images based on neural network extraction of vehicle characteristic values and attributes
CN109829902B (en) Lung CT image nodule screening method based on generalized S transformation and Teager attribute
CN114186596B (en) Multi-window identification method and device for spectrogram peaks and electronic equipment
CN114609319A (en) Spectral peak identification method and system based on noise estimation
CN110378417B (en) Method for acquiring construction boundary
CN105718723B (en) Spectrum peak position detection method in a kind of mass spectrometric data processing
CN109271902B (en) Infrared weak and small target detection method based on time domain empirical mode decomposition under complex background
CN115166120A (en) Spectral peak identification method, device, medium and product
CN116129187A (en) Quick target detection method and system based on local stable characteristic points
CN116242954A (en) Automated analysis method and system for expiratory molecular analysis gas chromatography data
CN115078519A (en) Spectral peak identification method, device, medium and product based on iterative algorithm
CN113705672A (en) Threshold value selection method, system and device for image target detection and storage medium
CN108932491B (en) Method for identifying and removing cosmic rays by utilizing five-point three-time smoothing method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Country or region after: China

Address after: Building 4, No.16 Wujing Road, development zone, Dongli District, Tianjin

Applicant after: Tianjin Guoke Medical Technology Development Co.,Ltd.

Address before: Building 4, No.16 Wujing Road, development zone, Dongli District, Tianjin

Applicant before: TIANJIN GUOKE YIGONG TECHNOLOGY DEVELOPMENT Co.,Ltd.

Country or region before: China

TA01 Transfer of patent application right

Effective date of registration: 20240326

Address after: Building 4, No.16 Wujing Road, development zone, Dongli District, Tianjin

Applicant after: Tianjin Guoke Medical Technology Development Co.,Ltd.

Country or region after: China

Applicant after: Suzhou Institute of Biomedical Engineering and Technology Chinese Academy of Sciences

Address before: Building 4, No.16 Wujing Road, development zone, Dongli District, Tianjin

Applicant before: Tianjin Guoke Medical Technology Development Co.,Ltd.

Country or region before: China

GR01 Patent grant