CN107784192A - Fingerprint similarity calculation method, device, and sample quality evaluation system
- Publication number
- CN107784192A (application CN201710832461.9A)
- Authority
- CN
- China
- Prior art keywords
- finger
- sample
- similarity
- testing sample
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/20—Identification of molecular entities, parts thereof or of chemical compositions
Landscapes
- Physics & Mathematics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Chemical & Material Sciences (AREA)
- Crystallography & Structural Chemistry (AREA)
- Life Sciences & Earth Sciences (AREA)
- Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computing Systems (AREA)
- Theoretical Computer Science (AREA)
- Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
Abstract
The invention discloses a fingerprint similarity calculation method, a device, and a sample quality evaluation system. The method comprises: establishing a standard reference fingerprint; obtaining the fingerprint of a test sample; calculating the correlation coefficient R1 between the standard reference fingerprint and the test sample fingerprint; calculating the difference similarity coefficient R2 between the standard reference fingerprint and the test sample fingerprint; and calculating a comprehensive similarity from the correlation coefficient and the difference similarity coefficient. The method reflects the degree of similarity between samples in terms of both chemical composition and content difference, and is suited to the needs of different types of complex sample systems.
Description
Technical field
The invention belongs to the field of chemical substance quality evaluation, and in particular relates to a fingerprint similarity calculation method, a device, and a sample quality evaluation system.
Background technology
A fingerprint is a chromatogram or spectrum that characterizes the chemical features of a complex material system (for example a traditional Chinese medicine, or the DNA or proteins of an organism, tissue, or cell), obtained with a suitable analytical technique after appropriate sample treatment. Fingerprints are widely used in the quality control and evaluation of traditional Chinese medicine, mainly to assess the authenticity, quality, and stability of Chinese medicines, their preparations, and semi-finished products. Fingerprint analysis targets multi-component complex systems and relies on modern instrumental analysis and detection methods. The measured profile carries rich characteristic information about the sample and reflects its chemical composition and content fairly comprehensively, which allows a holistic description and evaluation of sample quality. Many analytical techniques can be used to measure fingerprints; they fall broadly into chromatographic and spectroscopic methods. Among chromatographic methods, high performance liquid chromatography (HPLC) is the usual first choice; among spectroscopic methods, infrared spectroscopy (IR) is widely applied. When fingerprints are used to evaluate sample quality, a key step is to calculate the similarity between the test sample and the standard reference sample. The similarity result is an important basis for fingerprint-based quality evaluation.
Different similarity calculation methods evaluate the similarity of sample fingerprints from different angles, so their results often differ. A fingerprint similarity result should reflect, both qualitatively and quantitatively, changes in the kinds and the contents of the chemical components of the samples being compared. Current calculation methods fall into two main classes. The first starts from the overall shape of the fingerprint and compares information such as the number of characteristic peaks, their relative intensity ratios, and their positions, in order to judge qualitatively how similar or different the chemical compositions of the samples are. Examples are the correlation coefficient method, the included-angle cosine method, and the Nei coefficient method. Methods of this class are not sensitive enough to fluctuations in the amounts of the fingerprint peaks (peak area, peak height, or peak intensity), particularly for small peaks. The second class compares differences in characteristic peak area, peak height, or intensity to judge quantitatively how the contents of the chemical components differ; distance-based similarity algorithms are an example. Methods of this class are more sensitive to changes in peak amounts, but they cannot easily tell whether a change in similarity is caused by a change in chemical composition or by fluctuation in the amounts of the characteristic peaks.
How to evaluate similarity both qualitatively and quantitatively, so as to reflect the similarities and differences in both the chemical composition and the content of samples, is a technical problem that those skilled in the art urgently need to solve.
Summary of the invention
To overcome the above shortcomings of the prior art, the invention provides a fingerprint similarity calculation method, a device, and a sample quality evaluation system. The method combines the two classes of similarity calculation described above into a comprehensive similarity that reflects similarities and differences in both the chemical composition and the content of samples at once. It evaluates sample similarity from both the qualitative and the quantitative angle and thereby evaluates sample quality.
To achieve the above object, the invention adopts the following technical scheme:
A fingerprint similarity calculation method, comprising the following steps:
Step 1: Establish a standard reference fingerprint;
Step 2: Obtain the fingerprint of the test sample;
Step 3: Calculate the correlation coefficient R1 between the standard reference fingerprint and the test sample fingerprint;
Step 4: Calculate the difference similarity coefficient R2 between the standard reference fingerprint and the test sample fingerprint;
Step 5: Calculate the comprehensive similarity from the correlation coefficient and the difference similarity coefficient:
where δ is a weight factor with range 0 ≤ δ ≤ 1, and u and v are the sensitivity factors of R1 and R2, with ranges 0 ≤ u ≤ +∞ and 0 ≤ v ≤ +∞, respectively.
The correlation coefficient is calculated as follows:
where x_i and s_i are the peak area, peak height, or spectral intensity of the i-th chromatographic peak in the fingerprints of the test sample and the standard reference sample, respectively; n is the number of chromatographic peaks or wavelength points; and x̄ and s̄ are the average peak area, average peak height, or average spectral intensity of the test sample fingerprint and the standard reference sample fingerprint, respectively.
The difference similarity coefficient is calculated as follows:
where x_i and s_i are the peak area, peak height, or spectral intensity of the i-th chromatographic peak in the fingerprints of the test sample and the standard reference sample, respectively; n is the number of chromatographic peaks or wavelength points; and x̄ and s̄ are the average peak area, average peak height, or average spectral intensity of the test sample fingerprint and the standard reference sample fingerprint, respectively.
The default values of δ, u, and v are 0.5, 1, and 1, respectively.
The standard reference fingerprint is either the fingerprint of a selected standard reference sample or a fingerprint generated from the fingerprints of multiple sample batches measured according to the relevant technical requirements.
The calculation method can be used for quality testing of samples and for detecting quality differences between samples of the same or different types.
By adjusting the parameters δ, u, and v, the differences in similarity between samples of different types can be increased.
According to a second object of the invention, a device for fingerprint similarity calculation is also provided, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the program, the following steps are carried out:
Step 1: Establish a standard reference fingerprint;
Step 2: Obtain the fingerprint of the test sample;
Step 3: Calculate the correlation coefficient R1 between the standard reference fingerprint and the test sample fingerprint;
Step 4: Calculate the difference similarity coefficient R2 between the standard reference fingerprint and the test sample fingerprint;
Step 5: Calculate the comprehensive similarity from the correlation coefficient and the difference similarity coefficient:
where δ is a weight factor with range 0 ≤ δ ≤ 1, and u and v are the sensitivity factors of R1 and R2, with ranges 0 ≤ u ≤ +∞ and 0 ≤ v ≤ +∞, respectively.
According to a third object of the invention, a computer-readable storage medium is also provided, on which a computer program for fingerprint similarity calculation is stored. When executed by a processor, the program performs the following steps:
Step 1: Establish a standard reference fingerprint;
Step 2: Obtain the fingerprint of the test sample;
Step 3: Calculate the correlation coefficient R1 between the standard reference fingerprint and the test sample fingerprint;
Step 4: Calculate the difference similarity coefficient R2 between the standard reference fingerprint and the test sample fingerprint;
Step 5: Calculate the comprehensive similarity from the correlation coefficient and the difference similarity coefficient:
where δ is a weight factor with range 0 ≤ δ ≤ 1, and u and v are the sensitivity factors of R1 and R2, with ranges 0 ≤ u ≤ +∞ and 0 ≤ v ≤ +∞, respectively.
According to a fourth object of the invention, a sample quality evaluation system is also provided, comprising a detector and a computing device;
The detector measures the fingerprints of the standard reference sample and the test sample under the same conditions and transmits them to the computing device;
The computing device comprises a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the program, the following steps are carried out:
Receive the fingerprint of the standard reference sample as the standard reference fingerprint;
Receive the test sample fingerprint;
Calculate the correlation coefficient R1 between the standard reference fingerprint and the test sample fingerprint;
Calculate the difference similarity coefficient R2 between the standard reference fingerprint and the test sample fingerprint;
Using default parameter values, or parameter values for the comprehensive similarity received from the user, calculate the comprehensive similarity from the correlation coefficient and the difference similarity coefficient:
where δ is a weight factor with range 0 ≤ δ ≤ 1, and u and v are the sensitivity factors of R1 and R2, with ranges 0 ≤ u ≤ +∞ and 0 ≤ v ≤ +∞, respectively;
Using a default threshold, or a threshold received from the user, compare the comprehensive similarity with the threshold. If the comprehensive similarity is not less than the threshold, the sample is qualified; otherwise it is unqualified.
Beneficial effects of the present invention
1. Similarity evaluation methods for fingerprints fall broadly into two classes: one reflects the similarity of samples through the differences between their profiles; the other reflects it through the vector angle or the degree of correlation of the profile data. The two classes interpret sample similarity from different angles, and each has its own strengths and weaknesses. Combining them, so that each contributes its strengths and offsets the other's weaknesses, yields a more comprehensive and reasonable evaluation of sample similarity.
2. The invention combines the two classes of similarity evaluation and proposes a fingerprint similarity calculation method based on comprehensive similarity. The method reflects the similarity of samples in terms of both chemical composition and content difference: it provides not only the comprehensive similarity but also separate similarity information for chemical composition (R1) and content difference (R2). Introducing the three parameters δ, u, and v adapts the method to the needs of different types of complex sample systems.
Brief description of the drawings
The accompanying drawings, which form part of this application, provide a further understanding of the application. The illustrative embodiments and their descriptions explain the application and do not unduly limit it.
Fig. 1 is a flow chart of the fingerprint similarity calculation method of the invention;
Fig. 2 compares the peak areas, peak heights, or intensities of the chromatographic peaks of two fingerprints;
Fig. 3 shows the HPLC fingerprints of samples from different batches;
Fig. 4 is a bar chart of the chromatographic peak areas in the fingerprints of samples from different batches;
Fig. 5 is the infrared spectrum of a cigarette housing material.
Detailed description
It should be noted that the following detailed description is exemplary and is intended to further explain the application. Unless otherwise indicated, all technical and scientific terms used herein have the meanings commonly understood by a person of ordinary skill in the art to which this application belongs.
It should also be noted that the terminology used here serves only to describe specific embodiments and is not intended to limit the exemplary embodiments of this application. As used herein, unless the context clearly indicates otherwise, singular forms are intended to include plural forms. In addition, the terms "comprising" and/or "including", when used in this specification, indicate the presence of features, steps, operations, devices, components, and/or combinations thereof.
Where no conflict arises, the embodiments of this application and the features within them may be combined with one another.
General idea of the invention: the correlation coefficient (R1) is sensitive to the chemical composition and the relative proportions reflected by the fingerprint, while the difference similarity coefficient (R2) directly reflects the content difference represented by the compared fingerprint peaks as a proportion of the total content. So that the difference similarity coefficient agrees with the usual convention that a larger similarity value means greater similarity, its calculation is modified so that a larger R2 corresponds to a smaller content difference between samples and hence a higher similarity. Combining R1 and R2 to express the similarity between samples captures both the qualitative aspect (chemical composition) and the quantitative aspect (content of the chemical components), and so gives a more comprehensive similarity evaluation.
Embodiment one
This embodiment discloses a method for calculating fingerprint similarity, comprising the following steps:
Step 1: Select a standard reference sample and measure its fingerprint under given conditions; this fingerprint serves as the standard reference fingerprint. Alternatively, measure the fingerprints of multiple sample batches according to the relevant technical requirements and generate the standard reference fingerprint from them;
Step 2: Measure the fingerprint of the test sample under the same conditions as the standard sample to obtain the test sample fingerprint;
Step 3: Calculate the correlation coefficient of the test sample and the standard reference sample by the following formula:
Step 4: Calculate the difference similarity coefficient of the test sample and standard reference sample fingerprints by the following formula:
In the two formulas above, (1) and (2), x_i and s_i are the peak area or peak height of the i-th chromatographic peak, or the spectral intensity (for example the absorbance or transmittance of a spectral signal), in the fingerprints of the test sample and the standard reference sample, respectively. n is the number of chromatographic peaks or wavelength points; x̄ and s̄ are the average peak area, average peak height, or average spectral intensity of the test sample and the standard reference sample, respectively;
Step 5: Calculate the comprehensive similarity of the test sample and the standard reference sample as follows:
where δ is a weight factor with range 0 ≤ δ ≤ 1, and u and v are the sensitivity factors of R1 and R2, with ranges 0 ≤ u ≤ +∞ and 0 ≤ v ≤ +∞, respectively.
The method can be used to evaluate the quality of samples of different types and to assess quality differences between samples of the same or different types.
Formula (3) for the comprehensive similarity R introduces the weight factor δ and the sensitivity factors u and v so that it can accommodate the data types of different sample fingerprints and serve the purpose of the similarity evaluation. Varying δ adjusts the contributions of R1 and R2 to R, to suit different sample analysis systems. Varying u and v adjusts the sensitivity of R1 and R2: the larger u or v is, the higher the sensitivity. For real samples, suitable values of δ, u, and v are chosen to meet the similarity evaluation requirements. In general the default values can be used, namely δ = 0.5, u = 1, and v = 1. For different types of sample data (for example traditional Chinese medicine samples or tobacco samples), δ, u, and v may take different values to suit each type. Within any one similarity comparison, however, such as a comparison of different samples of the same type, the values of the three parameters should be kept fixed.
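The role of δ, u, and v can be sketched in a few lines of Python. Formula (3) itself is not reproduced in this text, so the weighted form used below, R = δ·R1^u + (1 − δ)·R2^v, is an assumption. It was chosen because it reproduces the figures reported in Example 1 (R1 = 100%, R2 = 83.66%, R = 91.83% at the default parameters). The function name and the input range checks are illustrative only.

```python
def comprehensive_similarity(r1, r2, delta=0.5, u=1.0, v=1.0):
    """Combine R1 and R2 into one score (assumed form of formula (3))."""
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    if u < 0 or v < 0:
        raise ValueError("u and v must be non-negative")
    if not (0.0 <= r1 <= 1.0 and 0.0 <= r2 <= 1.0):
        # Assumption: both coefficients are already scaled to [0, 1].
        raise ValueError("r1 and r2 are assumed to lie in [0, 1]")
    return delta * r1 ** u + (1.0 - delta) * r2 ** v


# Default parameters (delta = 0.5, u = v = 1): a plain average of R1 and R2.
print(round(comprehensive_similarity(1.0, 0.8366), 4))  # 0.9183

# A larger v penalizes a shortfall in R2 more strongly.
print(comprehensive_similarity(1.0, 0.8366, v=3.0))
```

Under this assumed form, raising u or v above 1 shrinks any coefficient below 1 more sharply, which is one way to read the statement that larger values give higher sensitivity.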
Embodiment two
The purpose of this embodiment is to provide a computing device.
A device for fingerprint similarity calculation, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the program, the following steps are carried out:
Step 1: Establish a standard reference fingerprint;
Step 2: Obtain the fingerprint of the test sample;
Step 3: Calculate the correlation coefficient R1 between the standard reference fingerprint and the test sample fingerprint;
Step 4: Calculate the difference similarity coefficient R2 between the standard reference fingerprint and the test sample fingerprint;
Step 5: Calculate the comprehensive similarity from the correlation coefficient and the difference similarity coefficient:
where δ is a weight factor with range 0 ≤ δ ≤ 1, and u and v are the sensitivity factors of R1 and R2, with ranges 0 ≤ u ≤ +∞ and 0 ≤ v ≤ +∞, respectively.
Embodiment three
The purpose of this embodiment is to provide a computer-readable storage medium.
A computer-readable storage medium on which a computer program for fingerprint similarity calculation is stored. When executed by a processor, the program performs the following steps:
Step 1: Establish a standard reference fingerprint;
Step 2: Obtain the fingerprint of the test sample;
Step 3: Calculate the correlation coefficient R1 between the standard reference fingerprint and the test sample fingerprint;
Step 4: Calculate the difference similarity coefficient R2 between the standard reference fingerprint and the test sample fingerprint;
Step 5: Calculate the comprehensive similarity from the correlation coefficient and the difference similarity coefficient:
where δ is a weight factor with range 0 ≤ δ ≤ 1, and u and v are the sensitivity factors of R1 and R2, with ranges 0 ≤ u ≤ +∞ and 0 ≤ v ≤ +∞, respectively.
Embodiment four
The purpose of this embodiment is to provide a sample quality evaluation system.
To achieve this purpose, the invention adopts the following technical scheme:
This embodiment provides a sample quality evaluation system comprising a detector and a computing device;
The detector measures the fingerprints of the standard reference sample and the test sample under the same conditions and transmits them to the computing device;
The computing device comprises a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the program, the following steps are carried out:
Receive the fingerprint of the standard reference sample as the standard reference fingerprint;
Receive the test sample fingerprint;
Calculate the correlation coefficient R1 between the standard reference fingerprint and the test sample fingerprint;
Calculate the difference similarity coefficient R2 between the standard reference fingerprint and the test sample fingerprint;
Receive the user's parameter settings for the comprehensive similarity, and calculate the comprehensive similarity from the correlation coefficient and the difference similarity coefficient:
where δ is a weight factor with range 0 ≤ δ ≤ 1, and u and v are the sensitivity factors of R1 and R2, with ranges 0 ≤ u ≤ +∞ and 0 ≤ v ≤ +∞, respectively;
Compare the comprehensive similarity with the set threshold. If the comprehensive similarity is not less than the threshold, the sample is qualified; otherwise it is unqualified.
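The final decision step can be sketched as a simple comparison. The function name `is_qualified` and the default threshold of 0.90 are illustrative assumptions; the text leaves the threshold to a preset default or to the user, and only mentions 0.900 as a commonly used cut-off in Example 3.

```python
def is_qualified(similarity, threshold=0.90):
    """Return True when the comprehensive similarity is not less than the threshold."""
    return similarity >= threshold


# Hypothetical comprehensive similarity values for three test samples.
results = {"sample A": 0.9183, "sample B": 0.90, "sample C": 0.8540}
for name, score in results.items():
    verdict = "qualified" if is_qualified(score) else "unqualified"
    print(f"{name}: {score:.4f} -> {verdict}")
```

Note that a sample whose similarity equals the threshold passes, since the criterion is "not less than".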
The steps involved in the devices of embodiments two, three, and four correspond to those of method embodiment one; for specific implementations, see the relevant description in embodiment one. The term "computer-readable storage medium" should be understood to include a single medium or multiple media holding one or more instruction sets. It should also be understood to include any medium that can store, encode, or carry an instruction set for execution by a processor and that causes the processor to perform any of the methods of the invention.
Experimental result
Example 1
In the published literature, the main methods that combine qualitative and quantitative similarity are the index of composite information, i.e. an improved Nei coefficient method (Meng Qinghua, Liu Yongsuo, Jiang Shumin, Hu Yuzhu, Application study of the chromatographic fingerprint index of composite information in traditional Chinese medicine quality control, Chinese natural drug, 2004, 11(2): 359-364), and a similarity method that combines the correlation coefficient with the relative Euclidean distance (CN105651875A). The index of composite information is calculated as:
In formula (4) above, n1 and n2 are the numbers of chromatographic peaks in the two fingerprints being compared, and n is the number of shared peaks. h1t(i) and h2t(i) are the area or height of the i-th shared peak in the first and second fingerprints, respectively. As formula (4) shows, the calculation of S has two parts. The first part is the Nei coefficient. The second part expresses, for the i-th shared peak, the difference in area or height between the two fingerprints relative to their sum, and so reflects the content difference between the compared samples. Formula (4) has the following shortcomings, however. The second part ignores the effect of the areas or heights of non-shared peaks on similarity. The first part considers only the ratio of shared peaks to the total number of peaks in the fingerprints, not the relative sizes of their areas or heights. For spectral fingerprint data, derivative preprocessing can produce negative values, which breaks the second part: a change of sign there can push S outside the range 0-1. To illustrate, we simulated the data of two fingerprints, shown in Table 1:
Table 1. Simulated spectral data
Wavelength (nm) | Fingerprint 1 | Fingerprint 2 |
201 | -1.4927 | 0.43799 |
202 | -1.4443 | 1.325 |
203 | 0.51217 | 0.8079 |
204 | -0.85071 | 1.3743 |
205 | 0.9012 | 0.15665 |
206 | 0.97494 | 0.28882 |
207 | -0.64072 | 0.6114 |
208 | -1.7011 | -2.5206 |
209 | 2.0643 | 1.5019 |
210 | 1.392 | -0.79828 |
The index of composite information calculated by formula (4) is 6.8376, whereas the comprehensive similarity calculated by the method proposed here is 26.289%, which is a more reasonable result.
The similarity method that combines the correlation coefficient with the relative Euclidean distance is calculated as:
$S = R^{j} D^{k}$ (5)
Formula (5) has two parts, R (the correlation coefficient) and D (the relative Euclidean distance); j and k are their respective sensitivity coefficients.
R in formula (5) is calculated by formula (6):
D in formula (5) is calculated by formula (7) or formula (8), depending on which of two conditions holds:
In formulas (6), (7), and (8), n is the number of fingerprint peak areas or fingerprint peak heights. X_i is the fingerprint peak area or peak height of the control sample, and X̄ is the average fingerprint peak area or average peak height of the comparison sample. Y_i is the fingerprint peak area or peak height of the standard sample; if the corresponding fingerprint peak is absent from the comparison sample, its peak area or peak height is recorded as 0. Ȳ is the average fingerprint peak area or average peak height of the standard sample.
The correlation coefficient in formula (6) of this method differs slightly from the usual correlation coefficient (see formula (1)). As a result, the value of the first part, R, can exceed 1, which is hard to interpret. Consider, for example, the simulated data in Table 2:
Table 2. Simulated fingerprint data
The similarity calculated by formula (6) is R = 1.2786, because the numerator of formula (6) (28887) is larger than its denominator (22593).
If formula (6) instead uses the ordinary correlation coefficient (formula (1)), the analysis of spectral fingerprint data can still fall short in some cases. Normalization is a common preprocessing step for spectral data and is expressed as follows:
After normalization by formula (9), the norm (vector length) of each spectral fingerprint is 1, or the norms of the two fingerprints are so close that their ratio is 1. When sample fingerprints are then compared by formula (5), S = R^j · D^k = R^j, because D = 1. Table 3 shows two simulated fingerprint data sets.
Table 3. Simulated fingerprint data
Because the ratio of the norms of the two fingerprints is 1, formula (5) gives S = R^j · D^k = R^j · 1^k = R^j. For these two fingerprints S = R^j = 1, which indicates that the two data sets are identical and fails to reveal the difference between them. The similarity calculation method proposed here gives a comprehensive similarity R = 91.83%, with R1 = 100% and R2 = 83.66% (the three parameters δ, u, and v at their default values of 0.5, 1, and 1), and so does distinguish the two fingerprints. This agrees with the differences visible in the data of Table 3 and in Fig. 2, so the comprehensive similarity result is the more reasonable one.
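The normalization argument above can be illustrated with a short sketch. Formula (9) is not reproduced in this text, so the code assumes Euclidean (L2) normalization, which matches the statement that the norm of each fingerprint becomes 1. The function names and the sample values are made up for illustration.

```python
import math


def l2_norm(values):
    """Euclidean length of a spectrum treated as a vector."""
    return math.sqrt(sum(v * v for v in values))


def normalize(values):
    """Scale a spectrum to unit Euclidean length (assumed form of formula (9))."""
    norm = l2_norm(values)
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero spectrum")
    return [v / norm for v in values]


# Two made-up spectra with very different overall intensity.
a = normalize([3.0, 4.0, 12.0])
b = normalize([1.0, 2.0, 2.0])

# After normalization both norms are 1, so their ratio carries no information.
print(math.isclose(l2_norm(a), 1.0), math.isclose(l2_norm(b), 1.0))
print(math.isclose(l2_norm(a) / l2_norm(b), 1.0))
```

Any term built only on the ratio of norms therefore becomes a constant after this preprocessing, which is the weakness the passage attributes to formula (5).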
Example 2
Anscombe data analysis
The Anscombe data are an unusual data set constructed by the statistician F. J. Anscombe, shown in Table 4:
Table 4. Anscombe data
Y | 10 | 8 | 13 | 9 | 11 | 14 | 6 | 4 | 12 | 7 | 5 |
X1 | 8.04 | 6.95 | 7.58 | 8.81 | 8.33 | 9.96 | 7.24 | 4.26 | 10.84 | 4.82 | 5.68 |
X2 | 9.14 | 8.14 | 8.74 | 8.77 | 9.26 | 8.1 | 6.13 | 3.1 | 9.13 | 7.26 | 4.74 |
X3 | 7.46 | 6.77 | 12.74 | 7.11 | 7.81 | 8.84 | 6.08 | 5.39 | 8.15 | 6.42 | 5.73 |
With Y taken as the standard reference fingerprint and X1, X2, and X3 as three sample fingerprints, the similarity results are listed in Table 5.
Table 5. Similarity calculation results
Table 5 shows that, to three significant figures, the cosine method and the correlation coefficient method each give the same result for all three samples (0.981 and 0.816, respectively) and so cannot distinguish them. The method of the invention gives similarities of 85.4%, 86.6%, and 85.3% for the three samples and does distinguish them. To increase the differences in similarity between samples, the three parameters δ, u, and v can be adjusted. Because the correlation coefficient part varies very little here, its weight in the comprehensive similarity is reduced and the sensitivity of the intensity difference similarity coefficient is increased. The results, also given in Table 5, show that the comprehensive similarity values of the samples then differ. The similarity calculation method proposed by the invention therefore has good applicability.
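The statement that the cosine and correlation coefficient methods cannot separate the three samples can be checked directly from the Table 4 data. The sketch below uses the textbook definitions of cosine similarity and the Pearson correlation coefficient; treating these as the methods behind Table 5 is an assumption, since Table 5 and formula (1) are not reproduced in this text. The function names are illustrative.

```python
import math

# Table 4 data: Y is the reference; X1, X2, and X3 are the three samples.
Y = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
X1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
X2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.1, 6.13, 3.1, 9.13, 7.26, 4.74]
X3 = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]


def cosine(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    return dot / math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))


def pearson(a, b):
    """Pearson correlation: the cosine of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine([p - mean_a for p in a], [q - mean_b for q in b])


for name, x in (("X1", X1), ("X2", X2), ("X3", X3)):
    # Each line prints 0.981 and 0.816, matching the values quoted above.
    print(name, round(cosine(Y, x), 3), round(pearson(Y, x), 3))
```

Both measures agree to three decimals across the samples even though the point-by-point values differ visibly, which is why a separate difference-based coefficient is needed to tell them apart.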
Example 3
Ginseng branch tuckahoe oral liquid finger-print is determined using HPLC methods, determines 12 batches altogether under identical chromatographic condition
Sample.After (2004 editions, A versions) processing of similarity evaluation, Auto-matching result is obtained, and
Reference fingerprint is generated using averaging method, different batches sample similarity analysis is carried out (such as Fig. 3 institutes as canonical reference collection of illustrative plates
Show).
It is as shown in Figure 4 that different batches sample finger-print chromatographic peak area is depicted as bar graph.
As Fig. 3 shows, the overall profiles of the batches are fairly similar, but a comparison of the peak areas in Fig. 4 shows that the peak areas of different batches still differ considerably. The similarity results obtained by the cosine method and the correlation coefficient method (Table 6) show that the fingerprints of samples 1-9 have high similarity to the standard reference fingerprint (>0.970), indicating that the chemical composition of these batches is close to that of the standard reference sample, whereas samples 10-12 have lower similarity values, indicating that their chemical composition differs from that of the standard reference sample. In fact, batches 10-12 are expired samples; they had been stored for a long time, so their chemical composition may have changed, which explains the lower similarity values. In general, the cosine method and the correlation coefficient method have difficulty reflecting how changes in the amounts of chemical components affect similarity. As Fig. 4 shows, the areas of some chromatographic peaks fluctuate considerably between batches, particularly peaks with small areas. It is generally accepted that a sample differs substantially from the reference when the similarity between its fingerprint and the standard reference fingerprint is below 0.900.
To obtain reasonable similarity evaluation results, we divided the 12 batches into two data sets: a calibration set (samples 1, 3, 5, 7, 9 and 11) and a validation set (samples 2, 4, 6, 8, 10 and 12). The comprehensive similarity between each calibration-set fingerprint and the standard reference fingerprint was calculated by formula (1), and the three parameters δ, u and v were adjusted so that the similarity values of the calibration set fell in a reasonable range, that is, comprehensive similarity >0.900 (90.0%) for non-expired samples and <0.900 (90.0%) for expired samples. The adjusted parameters were then fixed and used to predict the comprehensive similarity of the validation-set samples, to check whether those values were reasonable. Because the expired samples had been stored longer and their chemical composition may have changed, the sensitivity to changes in chemical composition was increased appropriately and the sensitivity to changes in component amounts was reduced, in order to separate non-expired from expired samples. With δ, u and v set to 0.5, 2 and 0.1, the comprehensive similarity results are given in Table 6. As Table 6 shows, the comprehensive similarity of every non-expired sample is above 94.0% and that of every expired sample is below 90.0%; the calculated similarities agree with the actual situation and are reasonable.
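The calibration and validation procedure described in this example can be sketched as follows. The sketch covers only the batch split and the acceptance check on already computed comprehensive similarities; it does not implement formula (1). The function names and the similarity values in `calibration_scores` are hypothetical, while the odd/even split, the expired batches 10-12 and the 0.900 cut-off come from the text.

```python
def split_batches(batch_numbers):
    """Odd-numbered batches form the calibration set, even-numbered the validation set."""
    calibration = [b for b in batch_numbers if b % 2 == 1]
    validation = [b for b in batch_numbers if b % 2 == 0]
    return calibration, validation

def parameters_acceptable(similarity, expired, threshold=0.900):
    """Check that non-expired batches score above the threshold and expired ones below it."""
    for batch, value in similarity.items():
        if batch in expired and not value < threshold:
            return False
        if batch not in expired and not value > threshold:
            return False
    return True

calibration, validation = split_batches(range(1, 13))
print(calibration)  # [1, 3, 5, 7, 9, 11]
print(validation)   # [2, 4, 6, 8, 10, 12]

expired = {10, 11, 12}
# Hypothetical comprehensive similarities for the calibration set only.
calibration_scores = {1: 0.95, 3: 0.96, 5: 0.94, 7: 0.97, 9: 0.95, 11: 0.85}
print(parameters_acceptable(calibration_scores, expired))
```

In practice the check would be repeated for each candidate setting of δ, u and v on the calibration set, and the accepted setting would then be applied unchanged to the validation set.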
Table 6. Similarity evaluation results for the different batches
Example 4
Infrared spectroscopy can be used to distinguish genuine from counterfeit cigarettes. Using infrared reflectance spectroscopy, we measured the outer packaging materials of genuine and counterfeit Double Happiness cigarettes under identical conditions; the resulting infrared absorption spectra are shown in Fig. 5. The similarity evaluation results for the genuine and counterfeit samples are given in Table 7.
Table 7. Similarity results for Double Happiness cigarette outer packaging materials
As Fig. 5 shows, the differences between genuine and counterfeit samples come mainly from differences in absorption intensity (T%), while the overall profiles are very similar. The correlation coefficient and cosine methods therefore give very high similarity values (Table 7), which makes it difficult to distinguish genuine from counterfeit products. With the comprehensive similarity calculation of the invention, appropriately increasing the sensitivity factor of the difference similarity coefficient in light of Fig. 5 gives a satisfactory result. As Table 7 shows, the comprehensive similarity between genuine samples is above 92.0%, whereas the highest comprehensive similarity with a counterfeit is 88.6% (<90.0%). The similarity results of the invention therefore agree with the actual situation and are more reasonable.
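The reason profile-based measures fail in this situation can be shown with a small sketch: both the cosine of the included angle and the Pearson correlation coefficient are unchanged when a spectrum is multiplied by a positive constant, so a pure intensity difference is invisible to them. The spectra below are hypothetical numbers chosen for illustration, not measured data from the patent.

```python
import numpy as np

def cosine_similarity(x, s):
    """Cosine of the angle between two intensity vectors."""
    x, s = np.asarray(x, dtype=float), np.asarray(s, dtype=float)
    return float(np.dot(x, s) / (np.linalg.norm(x) * np.linalg.norm(s)))

def pearson_correlation(x, s):
    """Pearson correlation coefficient between two intensity vectors."""
    return float(np.corrcoef(x, s)[0, 1])

# Hypothetical spectrum and a copy with the same profile at half the intensity.
reference = np.array([80.0, 60.0, 40.0, 70.0, 90.0])
weaker = 0.5 * reference

print(round(cosine_similarity(weaker, reference), 6))
print(round(pearson_correlation(weaker, reference), 6))
```

Both values come out as 1 (up to rounding) even though every intensity has been halved, which is the behaviour that a separate intensity-difference term is meant to correct.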
The calculation method of the invention reflects the degree of similarity of samples in two respects: chemical composition and content difference. It provides not only a comprehensive similarity but also separate similarity information for chemical composition (R1) and content difference (R2). Introducing the three parameters δ, u and v makes the similarity calculation method suitable for different types of complex sample systems.
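As an illustration of the chemical-composition term, and assuming that R1 is the standard Pearson correlation coefficient (an assumption consistent with the symbols x_i, s_i, n and the two means defined in claim 2, not a formula quoted from the patent), R1 can be written as:

```latex
R_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(s_i - \bar{s})}
           {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (s_i - \bar{s})^2}},
\qquad
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,\quad
\bar{s} = \frac{1}{n}\sum_{i=1}^{n} s_i
```

Under this assumption, multiplying every x_i by the same positive constant leaves R1 unchanged, which is why a separate content-difference term such as R2 is needed to capture intensity differences.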
Those skilled in the art will understand that the modules or steps of the invention described above can be implemented with a general-purpose computing device. Alternatively, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; or they can each be made into separate integrated circuit modules, or several of the modules or steps can be made into a single integrated circuit module. The invention is not limited to any specific combination of hardware and software.
Although specific embodiments of the invention have been described above with reference to the accompanying drawings, they do not limit the scope of protection of the invention. Those of ordinary skill in the art should understand that various modifications or variations that can be made on the basis of the technical solution of the invention without creative effort remain within its scope of protection.
Claims (10)
1. A fingerprint similarity calculation method, characterized by comprising the following steps:
Step 1: establishing a standard reference fingerprint;
Step 2: obtaining the fingerprint of a test sample;
Step 3: calculating the correlation coefficient R1 between the standard reference fingerprint and the test sample fingerprint;
Step 4: calculating the difference similarity coefficient R2 between the standard reference fingerprint and the test sample fingerprint;
Step 5: calculating a comprehensive similarity based on the correlation coefficient and the difference similarity coefficient:
where δ is a weight factor with 0 ≤ δ ≤ 1, and u and v are the sensitivity factors of R1 and R2, respectively, with 0 ≤ u ≤ +∞ and 0 ≤ v ≤ +∞.
2. The fingerprint similarity calculation method according to claim 1, characterized in that the correlation coefficient is calculated as follows:
where x_i and s_i are the peak area, peak height or spectral intensity of the i-th chromatographic peak in the fingerprints of the test sample and the standard reference sample, respectively; n is the number of chromatographic peaks or wavelength points; and x̄ and s̄ are the mean peak area, mean peak height or mean spectral intensity of the fingerprints of the test sample and the standard reference sample, respectively.
3. The fingerprint similarity calculation method according to claim 1, characterized in that the difference similarity coefficient is calculated as follows:
where x_i and s_i are the peak area, peak height or spectral intensity of the i-th chromatographic peak in the fingerprints of the test sample and the standard reference sample, respectively; n is the number of chromatographic peaks or wavelength points; and x̄ and s̄ are the mean peak area, mean peak height or mean spectral intensity of the fingerprints of the test sample and the standard reference sample, respectively.
4. The fingerprint similarity calculation method according to claim 1, characterized in that the default values of δ, u and v are 0.5, 1 and 1, respectively.
5. The fingerprint similarity calculation method according to claim 1, characterized in that the standard reference fingerprint is the fingerprint of a selected standard reference sample, or is generated, in accordance with the relevant technical requirements, from the measured fingerprints of multiple batches of samples.
6. The fingerprint similarity calculation method according to claim 1, characterized in that the calculation method can be used for quality testing of samples and for detecting quality differences between samples of the same or different types.
7. The fingerprint similarity calculation method according to claim 6, characterized in that the difference in similarity between samples of different types can be increased by adjusting the parameters δ, u and v.
8. A device for fingerprint similarity calculation, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the program, implements the following steps:
Step 1: establishing a standard reference fingerprint;
Step 2: obtaining the fingerprint of a test sample;
Step 3: calculating the correlation coefficient R1 between the standard reference fingerprint and the test sample fingerprint;
Step 4: calculating the difference similarity coefficient R2 between the standard reference fingerprint and the test sample fingerprint;
Step 5: calculating a comprehensive similarity based on the correlation coefficient and the difference similarity coefficient:
where δ is a weight factor with 0 ≤ δ ≤ 1, and u and v are the sensitivity factors of R1 and R2, respectively, with 0 ≤ u ≤ +∞ and 0 ≤ v ≤ +∞.
9. A computer-readable storage medium storing a computer program for fingerprint similarity calculation, characterized in that the program, when executed by a processor, performs the following steps:
Step 1: establishing a standard reference fingerprint;
Step 2: obtaining the fingerprint of a test sample;
Step 3: calculating the correlation coefficient R1 between the standard reference fingerprint and the test sample fingerprint;
Step 4: calculating the difference similarity coefficient R2 between the standard reference fingerprint and the test sample fingerprint;
Step 5: calculating a comprehensive similarity based on the correlation coefficient and the difference similarity coefficient:
where δ is a weight factor with 0 ≤ δ ≤ 1, and u and v are the sensitivity factors of R1 and R2, respectively, with 0 ≤ u ≤ +∞ and 0 ≤ v ≤ +∞.
10. A sample quality evaluation system, characterized by comprising a detector and a computing device;
the detector being configured to measure the fingerprints of a standard reference sample and a test sample under identical conditions and to transmit them to the computing device;
the computing device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, the processor implementing the following steps when executing the program:
receiving the fingerprint of the standard reference sample as the standard reference fingerprint;
receiving the fingerprint of the test sample;
calculating the correlation coefficient R1 between the standard reference fingerprint and the test sample fingerprint;
calculating the difference similarity coefficient R2 between the standard reference fingerprint and the test sample fingerprint;
calculating a comprehensive similarity from the correlation coefficient and the difference similarity coefficient, based on default parameter values or on parameter values for the comprehensive similarity received from a user:
where δ is a weight factor with 0 ≤ δ ≤ 1, and u and v are the sensitivity factors of R1 and R2, respectively, with 0 ≤ u ≤ +∞ and 0 ≤ v ≤ +∞;
comparing the comprehensive similarity with a set threshold, based on a default threshold or on a threshold received from a user; if the comprehensive similarity is not less than the threshold, the sample is qualified; otherwise it is unqualified.
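The parameter handling and the final pass/fail decision in the claims can be sketched as follows. This is a minimal illustration that takes an already computed comprehensive similarity as input and does not implement the similarity formula itself. The class and function names are hypothetical; the defaults 0.5, 1 and 1 follow claim 4, and the 0.900 default threshold is an assumption borrowed from Example 3 rather than a value fixed by the claims.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SimilarityParameters:
    """Weight factor delta and sensitivity factors u and v (defaults from claim 4)."""
    delta: float = 0.5
    u: float = 1.0
    v: float = 1.0

    def __post_init__(self):
        # Enforce the ranges stated in the claims.
        if not 0.0 <= self.delta <= 1.0:
            raise ValueError("delta must satisfy 0 <= delta <= 1")
        if self.u < 0.0 or self.v < 0.0:
            raise ValueError("u and v must be non-negative")

def judge_sample(comprehensive_similarity, threshold=0.900):
    """Return 'qualified' if the similarity is not less than the threshold."""
    return "qualified" if comprehensive_similarity >= threshold else "unqualified"

params = SimilarityParameters()      # default parameter values
tuned = SimilarityParameters(0.5, 2.0, 0.1)  # values used in Example 3
print(params, tuned)
print(judge_sample(0.940))
print(judge_sample(0.886))
```

A user-supplied threshold is passed as the second argument, for example `judge_sample(0.886, threshold=0.85)`.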
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710832461.9A CN107784192A (en) | 2017-09-15 | 2017-09-15 | Fingerprint similarity computational methods, device and sample quality evaluation system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107784192A true CN107784192A (en) | 2018-03-09 |
Family
ID=61437615
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710832461.9A Pending CN107784192A (en) | 2017-09-15 | 2017-09-15 | Fingerprint similarity computational methods, device and sample quality evaluation system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107784192A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101256177A (en) * | 2007-11-21 | 2008-09-03 | 皖南医学院 | System for evaluation of Chinese medicine numeralization color spectrum dactylogram similarity |
CN103018382A (en) * | 2012-12-07 | 2013-04-03 | 南京中医药大学 | Detection method of fingerprint spectrum similarity |
CN103278591A (en) * | 2013-05-16 | 2013-09-04 | 江苏师范大学 | Evaluation method for chromatographic fingerprint similarity |
CN105651875A (en) * | 2015-12-31 | 2016-06-08 | 河北中烟工业有限责任公司 | Similarity evaluating algorithm of fingerprint spectrum |
Non-Patent Citations (3)
Title |
---|
朱训生: "《工程管理的模糊分析》", 31 October 2004 * |
詹雪艳等: "色谱指纹图谱相似度方法的研究进展", 《中国实验方剂学杂志》 * |
赖何季: "中药色谱指纹图谱相似度分析的研究与应用", 《万方学位论文》 * |
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111971555A (en) * | 2018-03-28 | 2020-11-20 | 科思创知识产权两合公司 | Quality inspection method for polyurethane sample, electronic nose device, and storage medium |
CN109444071A (en) * | 2018-12-14 | 2019-03-08 | 江苏东交工程检测股份有限公司 | Pitch infrared spectroscopy quality determining method and device based on subrane |
CN110532308A (en) * | 2019-07-11 | 2019-12-03 | 北京嘉元文博科技有限公司 | Substance discrimination method and device and computer readable storage medium |
CN110426505A (en) * | 2019-08-22 | 2019-11-08 | 重庆壤科农业数据服务有限公司 | Purple soil fingerprint databases and its method for building up |
CN110632024A (en) * | 2019-10-29 | 2019-12-31 | 五邑大学 | Quantitative analysis method, device and equipment based on infrared spectrum and storage medium |
CN110632024B (en) * | 2019-10-29 | 2022-06-24 | 五邑大学 | Quantitative analysis method, device and equipment based on infrared spectrum and storage medium |
CN110838343B (en) * | 2019-11-15 | 2022-03-01 | 山东中医药大学 | Traditional Chinese medicine property identification method and system based on multi-modal fingerprint spectrum |
CN110838343A (en) * | 2019-11-15 | 2020-02-25 | 山东中医药大学 | Traditional Chinese medicine property identification method and system based on multi-modal fingerprint spectrum |
CN111426648A (en) * | 2020-03-19 | 2020-07-17 | 甘肃省交通规划勘察设计院股份有限公司 | Method and system for determining similarity of infrared spectrogram |
CN111855929A (en) * | 2020-07-06 | 2020-10-30 | 浙江工商大学 | Method for evaluating similarity of fruit powder raw materials |
CN111855929B (en) * | 2020-07-06 | 2021-04-30 | 浙江工商大学 | Method for evaluating similarity of fruit powder raw materials |
CN114646715A (en) * | 2020-12-21 | 2022-06-21 | 株式会社岛津制作所 | Waveform processing support device and waveform processing support method |
CN114646715B (en) * | 2020-12-21 | 2023-08-04 | 株式会社岛津制作所 | Waveform processing support device and waveform processing support method |
CN113030007A (en) * | 2021-02-10 | 2021-06-25 | 河南中烟工业有限责任公司 | Method for rapidly testing quality stability of tobacco essence based on similarity learning algorithm |
CN113029979A (en) * | 2021-02-10 | 2021-06-25 | 河南中烟工业有限责任公司 | Method for testing quality stability of cigarette paper |
CN113076812A (en) * | 2021-03-12 | 2021-07-06 | 药都(本溪)一致科技有限公司 | Processing method, system, medium and application of spectrum quantization fingerprint |
CN113076812B (en) * | 2021-03-12 | 2024-05-10 | 药都(本溪)一致科技有限公司 | Processing method, system, medium and application of quantized fingerprint spectrum of spectrum |
CN115828115A (en) * | 2023-02-16 | 2023-03-21 | 北京圣芯诺科技有限公司 | Data consistency evaluation method, device, electronic equipment and program product |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107784192A (en) | Fingerprint similarity computational methods, device and sample quality evaluation system | |
CN104897607B (en) | Portable near infrared spectrum food modeling and quick detection integral method and system | |
Ye et al. | Non-destructive prediction of protein content in wheat using NIRS | |
CN108362662A (en) | Near infrared spectrum similarity calculating method, device and substance qualitative analytic systems | |
CN104792652B (en) | A kind of Milkvetch Root multiple index quick detecting method | |
CN104990895B (en) | A kind of near infrared spectrum signal standards normal state bearing calibration based on regional area | |
CN108872132A (en) | A method of fresh tea leaves kind is differentiated using near infrared spectrum | |
CN106918572B (en) | The assay method of potato content in potato compounding staple food | |
CN108489929A (en) | Ginseng, Radix Notoginseng and the legal base source Panax polysaccharide of three kinds of American Ginseng discrimination method | |
CN108760647A (en) | A kind of wheat content of molds line detecting method based on Vis/NIR technology | |
Yun et al. | Identification of tea based on CARS‐SWR variable optimization of visible/near‐infrared spectrum | |
CN107402192A (en) | A kind of method of quick analysis essence and flavoring agent quality stability | |
Innamorato et al. | Tracing the geographical origin of lentils (Lens culinaris Medik.) by infrared spectroscopy and chemometrics | |
Zhang et al. | Spectral and chromatographic overall analysis: An insight into chemical equivalence assessment of traditional Chinese medicine | |
CN106770607A (en) | A kind of method that utilization HS-IMR-MS differentiates genuine-fake cigarette | |
CN109358022A (en) | A kind of method of the quick-fried pearl type of quick discrimination cigarette | |
Wang et al. | SVM classification method of waxy corn seeds with different vitality levels based on hyperspectral imaging | |
Luo et al. | Rapid quantification of multi-components in alcohol precipitation liquid of Codonopsis Radix using near infrared spectroscopy (NIRS) | |
CN106872398A (en) | A kind of HMX explosives moisture method for fast measuring | |
CN109932335A (en) | It is a kind of for the method for natural rubber assay in plant and measurement use LED near infrared spectrometer | |
CN109984725A (en) | Contact pressure disturbance restraining method, device and measurement method in diffusing reflection measurement | |
CN106568740A (en) | Method for rapid judging of varieties of fresh tea leaves by near infrared spectroscopy | |
Yang et al. | Rapid authentication of variants of Gastrodia elata Blume using near-infrared spectroscopy combined with chemometric methods | |
CN112782116A (en) | Method for detecting moisture content of large traditional Chinese medicine honeyed pill by utilizing near infrared spectrum and application | |
CN112801173A (en) | Lettuce near infrared spectrum classification method based on QR fuzzy discrimination analysis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180309 |