CN117095771B

CN117095771B - High-precision spectrum measurement data optimization processing method

Info

Publication number: CN117095771B
Application number: CN202311346592.8A
Authority: CN
Inventors: 李延磊; 周春卿
Original assignee: Kunshan Shangrui Intelligent Technology Co ltd
Current assignee: Kunshan Shangrui Intelligent Technology Co ltd
Priority date: 2023-10-18
Filing date: 2023-10-18
Publication date: 2024-02-06
Anticipated expiration: 2043-10-18
Also published as: CN117095771A

Abstract

The invention relates to the technical field of near-infrared spectrum analysis, and in particular to a high-precision spectrum measurement data optimization processing method. The method comprises the following steps: acquiring spectrum measurement data and constructing initial isolated trees; constructing a depth sequence for each data point; determining the structural similarity of any two data points; determining the similarity consistency degree of each frequency to be measured; and dividing the spectrum measurement data into frequency intervals according to the similarity consistency degree of different frequencies to obtain characteristic wave bands. The method can effectively improve the detection precision of the spectrum measurement data, realize the optimization processing of high-precision spectrum measurement data, improve the reliability of the optimized spectrum data and enhance the optimization effect on the spectrum measurement data.

Description

High-precision spectrum measurement data optimization processing method

Technical Field

The invention relates to the technical field of near infrared spectrum analysis, in particular to a high-precision spectrum measurement data optimization processing method.

Background

Near-infrared spectrometry is a technique for determining the chemical composition of a substance. A near-infrared spectrum carries features such as wavelength, frequency and amplitude, from which the properties of an object can be detected. The technique has been widely used in many fields, such as chemistry, biology, geology and astronomy. However, during spectrum detection, interference in the measurement scene, such as temperature, humidity, vibration, dust, noise and spectrometer performance, lowers the accuracy of the obtained spectrum measurement data.

To improve the accuracy of spectrum measurement data, the data need to be optimized. In the related art, abnormal data are analyzed by comparing sample data with standard data.

Disclosure of Invention

In order to solve the technical problems of insufficient detection precision and reliability and poor optimization effect of spectrum measurement data in the related art, the invention provides a high-precision spectrum measurement data optimization processing method, which adopts the following specific technical scheme:

the invention provides a high-precision spectrum measurement data optimization processing method, which comprises the following steps:

periodically acquiring spectrum measurement data of a sample to be measured at different time points, and determining initial isolated trees of the spectrum measurement data at different dimensions;

constructing a depth sequence of each data point according to the depth information of the data point in different initial isolated trees and the frequency of different depth information in the spectrum measurement data; determining the structural similarity of two data points according to the depth sequence of any two data points, the amplitude difference and the frequency difference of the two data points;

taking any data point as a data point to be measured, and clustering the structural similarities between the data point to be measured and all other data points to obtain clusters for the data point to be measured; taking the frequency of the data point to be measured as the frequency to be measured; taking, among the clusters corresponding to all data points, the clusters that contain the frequency to be measured as clusters to be measured; determining the similarity consistency degree of the frequency to be measured according to the structural similarity values in all the clusters to be measured; and dividing the frequency interval of the spectrum measurement data according to the similarity consistency degree of different frequencies to obtain characteristic wave bands;

according to the difference of frequencies contained in characteristic wave bands of different time points, determining an isolated tree splitting frequency, carrying out isolated tree analysis on the spectrum measurement data based on the values of the isolated tree splitting frequency in different dimensions, determining abnormal data points, and carrying out data optimization on the spectrum measurement data according to the abnormal data points to obtain optimized spectrum data.

Further, the constructing a depth sequence of each data point according to the depth information of the data point in different initial isolated trees and the frequency of different depth information in the spectrum measurement data comprises:

taking the depth value of the data point in the initial isolated tree as depth information, and taking the frequency combination of the depth value and the data point under the same depth value as a depth vector;

and sequencing the depth vectors corresponding to all the depth values according to the sequence from the small depth value to the large depth value to obtain a depth sequence of the data points.

Further, the structural similarity of two data points is determined according to the depth sequence of any two data points, the amplitude difference and the frequency difference of the two data points, and the corresponding calculation formula is as follows:

$S_{ij} = \dfrac{1}{\mathrm{DTW}(D_i, D_j) \times \Delta f_{ij} \times \Delta A_{ij} + x}$, where $S_{ij}$ represents the structural similarity of the i-th data point and the j-th data point, $D_i$ represents the depth sequence of the i-th data point, $D_j$ represents the depth sequence of the j-th data point, $\mathrm{DTW}(D_i, D_j)$ represents the DTW distance between the depth sequence of the i-th data point and the depth sequence of the j-th data point, $\Delta f_{ij}$ represents the frequency difference between the i-th data point and the j-th data point, $\Delta A_{ij}$ represents the amplitude difference between the i-th data point and the j-th data point, and $x$ represents a preset constant coefficient.

Further, the determining the similarity consistency degree of the frequencies to be measured according to the values of the structural similarity in all the clusters to be measured includes:

calculating the average value of the structural similarity values of all data points in each cluster to be detected as a cluster average value;

and calculating the sum value of cluster mean values of all the clusters to be tested, and carrying out normalization processing on the sum value to obtain the similarity consistency degree of the frequencies to be tested.

Further, the dividing the frequency interval of the spectrum measurement data according to the similarity consistency degree of different frequencies to obtain a characteristic wave band includes:

combining adjacent frequencies whose similarity consistency degree is greater than a preset consistency threshold to obtain a characteristic wave band.

Further, the determining the isolated tree splitting frequency according to the difference of the frequencies contained in the characteristic wave bands at different time points comprises the following steps:

determining the number of occurrences, across all time points, of any frequency in the characteristic wave bands as the characteristic frequency;

performing inverse proportion normalization processing on the characteristic frequency to obtain an isolation coefficient;

and when the isolation coefficient is larger than a preset isolation threshold value, taking the corresponding frequency as the isolation tree splitting frequency.

Further, the performing an isolated tree analysis on the spectrum measurement data based on the values of the isolated tree splitting frequency in different dimensions, and determining abnormal data points, includes:

based on an isolated tree algorithm, characteristic points of different dimensions corresponding to the isolated tree splitting frequency are used as splitting points to be analyzed, and outliers obtained through isolated tree analysis are used as abnormal data points.

Further, the performing data optimization on the spectrum measurement data according to the abnormal data points to obtain optimized spectrum data includes:

abnormal data points are deleted from the spectral measurement data, and the remaining data points are formed into optimized spectral data.

Further, the determining initial isolated trees of the spectrum measurement data in different dimensions includes:

based on an isolated tree algorithm, the spectrum measurement data of any time point is randomly selected and analyzed at any dimension to obtain an initial isolated tree of the spectrum measurement data in different dimensions.

Further, the clustering of the structural similarity between the data point to be measured and all other data points to obtain a cluster of the data points to be measured includes:

and clustering the structural similarity of the data points to be detected and all other data points by using a k-means clustering algorithm to obtain a cluster of the data points to be detected.

The invention has the following beneficial effects:

According to the method, spectrum measurement data of the sample to be measured are periodically acquired at different time points, and initial isolated trees of the spectrum measurement data in different dimensions are determined. A depth sequence is then constructed from the depth information of each data point in the different initial isolated trees and the frequency of that depth information; the depth sequence allows the distribution of each leaf node in the initial isolated trees to be analyzed accurately. The structural similarity between data points is then determined by combining the depth sequence, the amplitude difference and the frequency difference, so that the structural similarity effectively represents how similar the corresponding data points are. Clustering is carried out according to the structural similarity, and the similarity consistency degree is calculated. The frequency interval of the spectrum measurement data is divided according to the similarity consistency degree of different frequencies to obtain characteristic wave bands. Because the similarity consistency degree serves as the basis for dividing the characteristic wave bands, the spectrum measurement data of all time points can be analyzed, and the characteristic wave bands with the most stable characteristics can be screened out according to how the spectrum measurement data change across time points, which ensures the recognition effect of the characteristic wave bands. This facilitates the subsequent determination of the isolated tree splitting frequency from the characteristic wave bands and the determination of abnormal data points. The abnormal data points thus integrate the data characteristics of multiple dimensions and multiple time points, which ensures that they are obtained reliably and accurately. Finally, the spectrum measurement data are optimized according to these abnormal data points to obtain optimized spectrum data, which improves the detection accuracy and reliability of the optimized spectrum data. In conclusion, the invention can effectively improve the detection precision of the spectrum measurement data, realize the optimization processing of high-precision spectrum measurement data, improve the reliability of the optimized spectrum data and enhance the optimization effect on the spectrum measurement data.

Drawings

In order to more clearly illustrate the embodiments of the invention or the technical solutions and advantages of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are only some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.

Fig. 1 is a flowchart of a method for optimizing high-precision spectral measurement data according to an embodiment of the present invention.

Detailed Description

In order to further describe the technical means and effects adopted by the present invention to achieve the preset purpose, the following detailed description refers to specific implementation, structure, characteristics and effects of a high-precision spectrum measurement data optimization processing method according to the present invention with reference to the accompanying drawings and preferred embodiments. In the following description, different "one embodiment" or "another embodiment" means that the embodiments are not necessarily the same. Furthermore, the particular features, structures, or characteristics of one or more embodiments may be combined in any suitable manner.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

The following specifically describes a specific scheme of the high-precision spectrum measurement data optimization processing method provided by the invention with reference to the accompanying drawings.

Referring to fig. 1, a flowchart of a method for optimizing high-precision spectrum measurement data according to an embodiment of the present invention is shown, where the method includes:

S101: Periodically acquiring spectrum measurement data of the sample to be measured at different time points, and determining initial isolated trees of the spectrum measurement data in different dimensions.

A specific application scenario of the present invention may be, for example, using a high-precision infrared spectrometer to obtain a plurality of high-precision spectrum detection data for the same sample to be measured within a detection period. The sample to be measured may be, for example, a water sample, a food sample, a metal sample, or any of a variety of solid samples that can be measured with a spectrometer.

It should be noted that the detection environment conditions are kept consistent during the detection process, so as to avoid the influence of the external environment on the high-precision spectrum measurement data. The detection period can be set to 25 minutes, with a time point taken every 30 seconds within the 25 minutes to obtain high-precision spectrum detection data at different time points. Of course, the detection period and the sampling interval can be set according to the specific implementation scenario, which is not limited here.

The high-precision spectrum detection data are sequences of amplitudes at different frequencies, arranged in ascending order, at the different time points. They can be represented as a spectrogram with time on the horizontal axis and frequency on the vertical axis; that is, the high-precision spectrum detection data correspond to this spectrogram. At this point, high-precision spectrum measurement data of the same sample to be measured at a plurality of different time points have been obtained, which facilitates the subsequent analysis of how the spectrum measurement data change, the extraction of abnormal data and the optimization processing of the high-precision spectrum measurement data.

In the embodiment of the invention, a plurality of different dimensions of the spectrum measurement data can be determined, wherein the dimensions are characteristic dimensions of the spectrum measurement data of the sample to be measured, such as amplitude dimensions, frequency dimensions and the like.

Further, in some embodiments of the present invention, determining initial isolated trees of the spectrum measurement data in different dimensions includes: based on an isolated tree algorithm, the spectrum measurement data of any time point is randomly selected and analyzed in any dimension to obtain initial isolated trees of the spectrum measurement data in different dimensions.

In the embodiment of the invention, the amplitude dimension is taken as a specific dimension for analysis. A data point is selected as a split point and the spectrum measurement data are cut at that point, the two resulting subsequences forming two leaf nodes. Each subsequence is then cut again according to the amount and distribution of the data in its leaf node to obtain the leaf nodes of the next layer, and cutting stops when each bottom-layer leaf node contains only one data point, which yields the initial isolated tree. See the examples that follow for the analysis and selection process.
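The cutting procedure described above can be sketched for a single dimension as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the function name `isolation_depths`, the uniformly random choice of the cut value, and the convention that the root sits at layer 0 are all assumptions introduced here.

```python
import random


def isolation_depths(values, rng):
    """Build one random isolated tree over 1-D values.

    Returns a dict mapping each data point index to the layer (depth)
    of the leaf node that finally contains it. Root layer is 0 (assumed).
    """
    depths = {}

    def split(indices, depth):
        lo = min(values[i] for i in indices)
        hi = max(values[i] for i in indices)
        # Stop when the node holds one point or only identical values.
        if len(indices) == 1 or lo == hi:
            for i in indices:
                depths[i] = depth
            return
        # Draw cut values until both sides are non-empty.
        while True:
            cut = rng.uniform(lo, hi)
            left = [i for i in indices if values[i] < cut]
            right = [i for i in indices if values[i] >= cut]
            if left and right:
                break
        split(left, depth + 1)
        split(right, depth + 1)

    split(list(range(len(values))), 0)
    return depths


# Hypothetical amplitudes at one time point; the last value is an outlier.
amplitudes = [1.0, 1.1, 0.9, 1.05, 5.0]
print(isolation_depths(amplitudes, random.Random(0)))
```

Repeating the same construction on the frequency dimension, or on any other feature dimension, would give one depth per data point and per dimension.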

S102: according to the depth information of the data points in different initial isolated trees and the frequency of different depth information in the spectrum measurement data, constructing a depth sequence of each data point; and determining the structural similarity of the two data points according to the depth sequence of any two data points, the amplitude difference and the frequency difference of the two data points.

In the embodiment of the invention, each dimension corresponds to one initial isolated tree. Because the spectrum measurement data comprise a plurality of dimensions, selecting abnormal data according to only one dimension is unreliable; the invention therefore combines all the dimensions for an overall analysis.

It will be appreciated that since different samples to be tested have a decisive effect on the distribution characteristics of each band of their spectra, the similarity characteristics of the data points and adjacent data points over each spectral band are related to their band positions. Therefore, the scheme constructs an isolated tree for the data points according to an isolated tree algorithm, and analyzes the structural similarity of any two data points.

Further, in some embodiments of the present invention, constructing a depth sequence for each data point based on the depth information of the data point in different initial isolated trees and the frequency of the different depth information in the spectrum measurement data comprises: taking the depth value of the data point in the initial isolated tree as depth information, and taking the combination of the depth value and the frequency of the data point under the same depth value as a depth vector; and sorting the depth vectors corresponding to all the depth values in ascending order of depth value to obtain the depth sequence of the data point.

It can be understood that, in the embodiment of the present invention, the number of the layer in which a leaf node is located may be taken as the corresponding depth value: the closer the leaf node is to the root node, the smaller its layer number and its depth value; the larger the depth value, the farther the leaf node containing the data point is from the root node. The depth value is taken as the depth information in the embodiment of the present invention. During isolated tree analysis, a leaf node that contains only one data point indicates that the data point has been completely separated, so deeper leaf nodes will not contain that data point. In other words, the greater the depth information, the more normal the data point is in the corresponding dimension. To analyze the similarity of data points, the number of dimensions in which a data point appears at a given depth value is determined as its frequency, and the combination of the depth value and the frequency of the data point at that depth value is taken as a depth vector of the data point.

For example, take a data point p whose dimensions are frequency, amplitude and amplitude change rate. Suppose the depth value of p in the frequency dimension is 3; in the amplitude dimension, the leaf node with depth value 3 also contains p; and in the amplitude change rate dimension, the leaf node with depth value 3 does not contain p. The frequency of depth value 3 is then 2, and the corresponding depth vector is (3, 2). Data point p is analyzed in the same way at the other depth values, and the depth vectors are then sorted in ascending order of depth value to obtain the corresponding depth sequence.
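The example of data point p can be reproduced with a few lines of code. This is only a sketch: the function name `depth_sequence` and the depth value 5 in the amplitude change rate dimension are hypothetical, and the sketch assumes that only depth values at which the point actually occurs contribute a depth vector.

```python
from collections import Counter


def depth_sequence(depth_by_dimension):
    """Turn per-dimension depth values of one data point into a depth sequence.

    Each depth vector is (depth value, number of dimensions in which the
    point sits at that depth); vectors are sorted by ascending depth value.
    """
    counts = Counter(depth_by_dimension.values())
    return sorted(counts.items())


# Data point p: depth 3 in the frequency and amplitude trees,
# and a hypothetical depth 5 in the amplitude change rate tree.
p_depths = {"frequency": 3, "amplitude": 3, "amplitude_change_rate": 5}
print(depth_sequence(p_depths))
```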

The structural similarity represents how similar the structural distribution of two data points is across all the isolated trees. Because of the physical characteristics of wave bands in spectrum data, data points within a certain frequency interval correspond to the same substance, so the similarity of depth sequences is of practical significance only for data points within such a frequency interval. Thus, the more similar the depth sequences of two data points, the closer their frequencies and the closer their amplitudes, the greater the structural similarity of the two data points.

Further, in some embodiments of the present invention, the structural similarity of two data points is determined according to the depth sequence of any two data points, the amplitude difference and the frequency difference of the two data points, and the corresponding calculation formula is:

$S_{ij} = \dfrac{1}{\mathrm{DTW}(D_i, D_j) \times \Delta f_{ij} \times \Delta A_{ij} + x}$, where $S_{ij}$ represents the structural similarity of the i-th data point and the j-th data point, $D_i$ represents the depth sequence of the i-th data point, $D_j$ represents the depth sequence of the j-th data point, $\mathrm{DTW}(D_i, D_j)$ represents the DTW distance between the depth sequence of the i-th data point and the depth sequence of the j-th data point, $\Delta f_{ij}$ represents the frequency difference between the i-th data point and the j-th data point, $\Delta A_{ij}$ represents the amplitude difference between the i-th data point and the j-th data point, and $x$ represents a preset constant coefficient, which is a safety value set to prevent the denominator from being 0 and may optionally be 0.01.

It will be appreciated that the depth sequence may be used as the overall distribution information of the corresponding data point in all dimensions, so in the embodiment of the present invention the DTW distance between the depth sequences of any two data points is calculated, where the DTW distance is the distance between two sequences calculated based on the dynamic time warping (Dynamic Time Warping, DTW) algorithm. The smaller the DTW distance, the higher the similarity of the two corresponding data points; that is, the DTW distance $\mathrm{DTW}(D_i, D_j)$ is negatively correlated with the structural similarity $S_{ij}$. Likewise, the smaller the frequency difference and the amplitude difference, the higher the similarity of the two corresponding data points; that is, the frequency difference $\Delta f_{ij}$ and the amplitude difference $\Delta A_{ij}$ are both negatively correlated with $S_{ij}$. The product of the DTW distance, the frequency difference and the amplitude difference is therefore calculated, and a negative correlation mapping is applied to the product to obtain the structural similarity of the two data points.
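The calculation described above, namely the reciprocal of the product of the DTW distance, the frequency difference and the amplitude difference, with the constant x added to keep the denominator away from 0, can be sketched as follows. The function names, the use of absolute differences and the Euclidean distance between depth vectors as the local DTW cost are assumptions of this sketch; the text does not fix them.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Classic dynamic time warping distance between two depth sequences.

    Local cost is the Euclidean distance between depth vectors (assumed).
    """
    n, m = len(seq_a), len(seq_b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # repeat element of seq_b
                                    cost[i][j - 1],      # repeat element of seq_a
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def structural_similarity(seq_i, seq_j, freq_i, freq_j, amp_i, amp_j, x=0.01):
    """Reciprocal of (DTW distance * frequency difference * amplitude difference + x)."""
    product = (dtw_distance(seq_i, seq_j)
               * abs(freq_i - freq_j)
               * abs(amp_i - amp_j))
    return 1.0 / (product + x)


# Hypothetical depth sequences, frequencies and amplitudes of two data points.
seq_p = [(3, 2), (5, 1)]
seq_q = [(3, 2), (6, 1)]
print(structural_similarity(seq_p, seq_q, 100.0, 102.0, 1.0, 1.5))
```

With x = 0.01, two data points whose product term is 0 reach the maximum similarity of 100, which is why the later steps normalize values derived from it.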

S103: Taking any data point as a data point to be measured, and clustering the structural similarities between the data point to be measured and all other data points to obtain clusters for the data point to be measured; taking the frequency of the data point to be measured as the frequency to be measured; taking, among the clusters corresponding to all data points, the clusters that contain the frequency to be measured as clusters to be measured; determining the similarity consistency degree of the frequency to be measured according to the structural similarity values in all the clusters to be measured; and dividing the frequency interval of the spectrum measurement data according to the similarity consistency degree of different frequencies to obtain characteristic wave bands.

Further, in some embodiments of the present invention, clustering the structural similarity between the data point to be measured and all other data points to obtain a cluster of data points to be measured includes: and clustering the structural similarity of the data points to be detected and all other data points by using a k-means clustering algorithm to obtain a cluster of the data points to be detected.

In the embodiment of the invention, a preset k value can be used as the number of cluster centroids; the preset k value can be set according to practical detection experience, or calculated by, for example, the elbow method. It is understood that the k-means clustering algorithm is a distance-based clustering algorithm, so when the structural similarities between the data point to be measured and all other data points are clustered with it, each resulting cluster groups data points whose distributions are relatively similar.
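As an illustration of this clustering step, the following sketch runs a plain one-dimensional k-means over the structural similarity values of one data point to be measured. It is an assumption-laden sketch: the function name `kmeans_1d` and the deterministic centroid initialization are choices made here for reproducibility, and a library implementation of k-means could be used instead.

```python
def kmeans_1d(values, k, iterations=100):
    """Lloyd's k-means on scalar values; returns (labels, centroids)."""
    ordered = sorted(values)
    # Deterministic initialization (assumed): spread centroids over the sorted values.
    centroids = [ordered[(2 * c + 1) * len(ordered) // (2 * k)] for c in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Hypothetical similarities between one data point to be measured and six others.
similarities = [0.9, 1.0, 1.1, 9.8, 10.0, 10.2]
labels, centroids = kmeans_1d(similarities, k=2)
print(labels, centroids)
```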

It can be understood that the obtained clusters are a plurality of frequency intervals with high similarity as seen from the perspective of the data point to be measured. The similarity consistency degree of any frequency is analyzed from the multiple clustering results obtained with different data points as the perspective: the frequency of the data point to be measured is taken as the frequency to be measured, and the clusters, among those corresponding to all data points, that contain the frequency to be measured are taken as the clusters to be measured.

Further, in some embodiments of the present invention, determining the degree of similarity consistency of the frequencies under test according to the values of the structural similarity in all clusters under test includes: calculating the average value of the structural similarity values of all data points in each cluster to be detected as a cluster average value; and calculating the sum value of cluster mean values of all the clusters to be tested, and carrying out normalization processing on the sum value to obtain the similarity consistency degree of the frequencies to be tested.

In the embodiment of the invention, each cluster to be measured is analyzed, that is, the average of the structural similarity values of all data points in the cluster to be measured is calculated as the cluster mean. It can be understood that the clusters to be measured are sets obtained with different data points as the data point to be measured, all of which contain the same fixed frequency. Analyzing the clusters to be measured therefore amounts to an overall analysis of the spectrum measurement data, so the resulting similarity consistency degree is more representative.
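The two steps above, cluster means followed by a normalized sum, can be sketched as follows. The text does not state which normalization is used; min-max normalization across all frequencies is an assumption of this sketch, as are the function names and the sample values.

```python
def raw_consistency(clusters_to_measure):
    """Sum of the cluster means over all clusters to be measured for one frequency."""
    cluster_means = [sum(c) / len(c) for c in clusters_to_measure]
    return sum(cluster_means)


def normalize_min_max(raw_by_frequency):
    """Min-max normalization across frequencies (assumed normalization)."""
    lo = min(raw_by_frequency.values())
    hi = max(raw_by_frequency.values())
    span = hi - lo
    return {f: ((v - lo) / span if span else 0.0)
            for f, v in raw_by_frequency.items()}


# Hypothetical structural similarity values grouped by cluster to be measured.
clusters_by_frequency = {
    100: [[2.0, 4.0], [6.0]],
    101: [[1.0, 1.0]],
    102: [[5.0]],
}
raw = {f: raw_consistency(c) for f, c in clusters_by_frequency.items()}
print(normalize_min_max(raw))
```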

Further, in some embodiments of the present invention, dividing the frequency interval of the spectrum measurement data according to the similarity consistency degree of different frequencies to obtain characteristic wave bands includes: combining adjacent frequencies whose similarity consistency degree is greater than a preset consistency threshold to obtain a characteristic wave band.

In the embodiment of the invention, adjacent frequencies whose similarity consistency degree is greater than the preset consistency threshold are combined, and all frequencies are traversed, to obtain characteristic wave bands with a high similarity consistency degree.

The preset consistency threshold is a threshold of similarity consistency degree, and in the embodiment of the present invention, the preset consistency threshold may be set to 0.89, which is not limited.
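The merging of adjacent frequencies can be sketched as a single pass over the frequencies in ascending order. The function name `characteristic_bands`, the sample values, and the decision to keep a band that contains only one frequency are assumptions of this sketch; the threshold 0.89 is the value given in the text.

```python
def characteristic_bands(frequencies, consistency, threshold=0.89):
    """Group consecutive frequencies whose consistency exceeds the threshold.

    `frequencies` must be in ascending order, with `consistency` aligned to it.
    """
    bands, current = [], []
    for freq, degree in zip(frequencies, consistency):
        if degree > threshold:
            current.append(freq)
        else:
            # A frequency at or below the threshold closes the open band.
            if current:
                bands.append(current)
            current = []
    if current:
        bands.append(current)
    return bands


freqs = [100, 101, 102, 103, 104, 105]
degrees = [0.95, 0.92, 0.50, 0.90, 0.91, 0.89]
print(characteristic_bands(freqs, degrees))
```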

S104: according to the difference of frequencies contained in characteristic wave bands of different time points, determining the splitting frequency of an isolated tree, carrying out isolated tree analysis on spectrum measurement data based on the values of the splitting frequency of the isolated tree in different dimensions, determining abnormal data points, and carrying out data optimization on the spectrum measurement data according to the abnormal data points to obtain optimized spectrum data.

In the embodiment of the invention, abnormal data points can be obtained by analyzing the similarity consistency degree of each frequency together with all the characteristic wave bands. A characteristic wave band is a frequency band of the spectrum measurement data obtained at one time point, and the spectrum data change to some extent over time. The stability of the characteristic wave bands is therefore analyzed by combining the band fluctuation characteristics of multiple time points, so as to determine how preferable the data of each frequency are in the corresponding isolated tree analysis.

Further, in some embodiments of the present invention, determining the isolated tree splitting frequency according to the difference of the frequencies contained in the characteristic wave bands at different time points includes: determining the number of occurrences, across all time points, of any frequency in the characteristic wave bands as the characteristic frequency; performing inverse proportion normalization processing on the characteristic frequency to obtain an isolation coefficient; and when the isolation coefficient is larger than a preset isolation threshold, taking the corresponding frequency as the isolated tree splitting frequency.

In the embodiment of the invention, each frequency is analyzed individually: the number of time points at which the frequency appears in a characteristic wave band is taken as its characteristic frequency. The larger the characteristic frequency, the more common the corresponding frequency is among all data points, that is, the more normal it is; using such a frequency as a split point for isolated tree splitting gives a poorer splitting effect and causes redundant computation. The characteristic frequency is therefore subjected to inverse proportion normalization to obtain the isolation coefficient, so that a larger isolation coefficient indicates a frequency better suited to serve as a split point.

In the embodiment of the present invention, the preset isolation threshold may be, for example, 0.85, and may be adjusted according to actual detection requirements, which is not further limited here. A frequency whose isolation coefficient is larger than the preset isolation threshold is taken as an isolated tree splitting frequency, and the isolated tree is then constructed and analyzed based on the isolated tree splitting frequency.
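The selection of splitting frequencies can be sketched as follows. The text does not give the exact form of the inverse proportion normalization; this sketch assumes the isolation coefficient is 1 minus the share of time points at which the frequency falls inside a characteristic wave band. The function name and the sample data are hypothetical; the threshold 0.85 is the value given in the text.

```python
def splitting_frequencies(bands_per_time_point, candidate_frequencies, threshold=0.85):
    """Return the candidate frequencies whose isolation coefficient exceeds the threshold.

    `bands_per_time_point` holds, for every time point, the list of
    characteristic wave bands (each band is a list of frequencies).
    """
    total = len(bands_per_time_point)
    selected = []
    for freq in candidate_frequencies:
        # Characteristic frequency: number of time points whose bands contain freq.
        count = sum(any(freq in band for band in bands)
                    for bands in bands_per_time_point)
        # Assumed inverse proportion normalization into [0, 1].
        coefficient = 1.0 - count / total
        if coefficient > threshold:
            selected.append(freq)
    return selected


# Ten hypothetical time points: 100 is always in a band, 101 once, 102 twice.
bands = [[[100, 101, 102]], [[100, 102]]] + [[[100]]] * 8
print(splitting_frequencies(bands, [100, 101, 102]))
```

Under this assumption a frequency that rarely belongs to a stable band gets a coefficient close to 1 and is kept as a split point, which matches the stated aim of not splitting at normal data.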

Further, in some embodiments of the present invention, performing an isolated tree analysis on the spectrum measurement data based on the values of the isolated tree splitting frequency in different dimensions, and determining abnormal data points, includes: based on an isolated tree algorithm, characteristic points of different dimensions corresponding to the isolated tree splitting frequency are used as splitting points to be analyzed, and outliers obtained through isolated tree analysis are used as abnormal data points.

In the embodiment of the invention, the feature points of different dimensions corresponding to the splitting frequency of the isolated tree can be used as the splitting points based on the isolated tree algorithm, the isolated tree is constructed, the outliers are directly obtained according to the distribution characteristics of the isolated tree, and the outliers are used as the abnormal data points.

In this way, the problem that the whole isolated tree becomes structurally complex and computationally cumbersome because its splitting points are selected from normal data points is avoided.
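As a rough illustration of restricting the splits of an isolated tree, the sketch below estimates how quickly each data point is isolated when only selected feature columns may be split on. It is a simplified stand-in, not the patented procedure: the mapping from isolated tree splitting frequencies to column indices (`split_columns`), the uniformly random cut value, the depth threshold, and all function names are assumptions made for this example.

```python
import random

def path_length(rows, target, split_columns, rng, depth=0):
    """Depth at which `target` is isolated along one randomly grown branch."""
    if len(rows) <= 1:
        return depth
    # Only the allowed columns may be split on, and only while they still vary.
    usable = [c for c in split_columns
              if min(r[c] for r in rows) < max(r[c] for r in rows)]
    if not usable:
        return depth
    col = rng.choice(usable)
    lo = min(r[col] for r in rows)
    hi = max(r[col] for r in rows)
    cut = rng.uniform(lo, hi)
    # Keep only the rows that fall on the same side of the cut as the target.
    same_side = [r for r in rows if (r[col] < cut) == (target[col] < cut)]
    return path_length(same_side, target, split_columns, rng, depth + 1)

def average_depths(rows, split_columns, n_trees=200, seed=0):
    """Average isolation depth of every row over n_trees random branches."""
    rng = random.Random(seed)
    return [sum(path_length(rows, row, split_columns, rng)
                for _ in range(n_trees)) / n_trees
            for row in rows]

def outlier_indices(rows, split_columns, depth_threshold, n_trees=200, seed=0):
    """Rows isolated at a shallow average depth are reported as outliers."""
    depths = average_depths(rows, split_columns, n_trees, seed)
    return [i for i, d in enumerate(depths) if d < depth_threshold]

# Columns 0 and 2 stand for the splitting dimensions; column 1 is never split.
rows = [(1.00, 5.0, 2.00), (1.02, 9.0, 2.01), (0.98, 1.0, 1.99),
        (1.01, 7.0, 2.02), (0.99, 3.0, 1.98), (1.03, 2.0, 2.03),
        (8.00, 4.0, 9.00)]
print(outlier_indices(rows, split_columns=[0, 2], depth_threshold=1.5))
```

Because column 1 is excluded from splitting, its wide spread does not make any row look abnormal; only the last row, which is far from the others in both allowed columns, is isolated after roughly one split on average.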

Further, in some embodiments of the present invention, performing data optimization on the spectral measurement data according to the abnormal data points to obtain optimized spectral data includes: deleting the abnormal data points from the spectral measurement data, and forming the remaining data points into the optimized spectral data.

In the embodiment of the invention, after the abnormal data point is obtained by detection, the corresponding abnormal data point can be deleted from the spectrum measurement data, or the abnormal data point can be smoothed according to the characteristics of other data points in the local range where the abnormal data point is positioned, so that the influence of the abnormal data point on the whole spectrum measurement data is eliminated, and the optimized spectrum data with better quality is obtained.
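The two options mentioned above, deletion and local smoothing, might look like the following minimal sketch. The local mean over non-abnormal neighbors and the `half_window` parameter are assumptions chosen for illustration; the text does not specify which local characteristic is used for smoothing.

```python
def drop_outliers(values, outlier_idx):
    """Return the measurement series with the abnormal points removed."""
    bad = set(outlier_idx)
    return [v for i, v in enumerate(values) if i not in bad]

def smooth_outliers(values, outlier_idx, half_window=2):
    """Replace each abnormal point by the mean of nearby normal points."""
    bad = set(outlier_idx)
    smoothed = list(values)
    for i in bad:
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        # Other abnormal points are excluded so they cannot distort the mean.
        neighbors = [values[j] for j in range(lo, hi) if j not in bad]
        if neighbors:  # leave the value unchanged if no normal neighbor exists
            smoothed[i] = sum(neighbors) / len(neighbors)
    return smoothed

amplitudes = [1.0, 2.0, 50.0, 4.0, 5.0]
print(drop_outliers(amplitudes, [2]))    # [1.0, 2.0, 4.0, 5.0]
print(smooth_outliers(amplitudes, [2]))  # [1.0, 2.0, 3.0, 4.0, 5.0]
```

Deletion shortens the series, whereas smoothing keeps its length and sampling positions, which may matter if later processing expects evenly spaced data.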

It should be noted that the order of the embodiments of the present invention is for description only and does not indicate their relative merits. The processes depicted in the accompanying drawings do not necessarily require the particular order shown, or a sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.

In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments.

Claims

1. A method for optimizing high-precision spectral measurement data, the method comprising:

determining an isolated tree splitting frequency according to the difference of frequencies contained in characteristic wave bands of different time points, performing isolated tree analysis on the spectrum measurement data based on the values of the isolated tree splitting frequency in different dimensions, determining abnormal data points, and performing data optimization on the spectrum measurement data according to the abnormal data points to obtain optimized spectrum data;

wherein performing data optimization on the spectrum measurement data according to the abnormal data points to obtain optimized spectrum data comprises:

deleting abnormal data points from the spectrum measurement data, and forming the rest data points into optimized spectrum data;

wherein the dimension is a characteristic dimension of the spectrum measurement data, and the determining the initial isolated tree of the spectrum measurement data in different dimensions comprises:

2. The method for optimizing high-precision spectral measurement data according to claim 1, wherein said constructing a depth sequence for each data point based on depth information of the data point in different initial isolated trees and frequency of different depth information in the spectral measurement data comprises:

3. The method for optimizing high-precision spectrum measurement data according to claim 1, wherein the structural similarity of two data points is determined according to a depth sequence of any two data points, an amplitude difference and a frequency difference of the two data points, and the corresponding calculation formula is as follows:

4. The method for optimizing high-precision spectrum measurement data according to claim 1, wherein said determining the degree of similarity consistency of the frequencies to be measured according to the values of the structural similarity in all the clusters to be measured comprises:

5. The method for optimizing high-precision spectrum measurement data according to claim 1, wherein the dividing the frequency interval of the spectrum measurement data according to the similarity consistency degree of different frequencies to obtain the characteristic wave band comprises:

6. The method for optimizing high-precision spectral measurement data according to claim 1, wherein said determining the isolated tree splitting frequency based on the difference of frequencies included in the characteristic bands at different time points comprises:

7. The method for optimizing high-precision spectrum measurement data according to claim 1, wherein said performing isolated tree analysis on said spectrum measurement data based on the values of said isolated tree splitting frequency in different dimensions, determining abnormal data points, comprises:

8. The method for optimizing high-precision spectrum measurement data according to claim 1, wherein the clustering of the structural similarity between the data point to be measured and all other data points to obtain a cluster of data points to be measured comprises: