CN107784192A - Fingerprint similarity calculation method, device, and sample quality evaluation system

Publication number
CN107784192A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710832461.9A
Other languages
Chinese (zh)
Inventor
姜红
聂磊
姜文文
刘肖雁
苏美
田进国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Application filed by Shandong University
Priority to CN201710832461.9A
Publication of CN107784192A

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/20 Identification of molecular entities, parts thereof or of chemical compositions

Abstract

The invention discloses a fingerprint similarity calculation method, a device, and a sample quality evaluation system. The method comprises: establishing a standard reference fingerprint; obtaining the fingerprint of a test sample; calculating the correlation coefficient R1 between the standard reference fingerprint and the test sample fingerprint; calculating the difference similarity coefficient R2 between the standard reference fingerprint and the test sample fingerprint; and calculating a comprehensive similarity from the correlation coefficient and the difference similarity coefficient. The method reflects the similarity of samples in two respects, chemical composition and content difference, and can be adapted to different types of complex sample systems.

Description

Fingerprint similarity calculation method, device, and sample quality evaluation system
Technical field
The invention belongs to the field of chemical substance quality evaluation, and in particular relates to a fingerprint similarity calculation method, a device, and a sample quality evaluation system.
Background
A fingerprint is a chromatogram or spectrum that characterizes the chemical features of a complex material system (for example a traditional Chinese medicine, or the DNA or proteins of an organism, tissue, or cell), obtained by applying a suitable analytical technique after appropriate sample treatment. Fingerprints are widely used in the quality control and evaluation of traditional Chinese medicine, mainly to assess the authenticity, quality, and stability of Chinese medicinal materials and of semi-finished and finished Chinese medicine preparations. Fingerprinting targets multi-component complex systems and relies on modern instrumental analysis and detection methods. The measured profile carries rich characteristic information and reflects the chemical composition and content of a sample fairly completely, so that the quality of the sample can be described and evaluated as a whole. Many analytical techniques are used to measure fingerprints; they fall broadly into chromatographic and spectroscopic methods. Among chromatographic methods, high performance liquid chromatography (HPLC) is the usual first choice; among spectroscopic methods, infrared spectroscopy (IR) is the most widely applied. When fingerprints are used for sample quality evaluation, a key step is calculating the similarity between the test sample and the standard reference sample. The similarity result is an important basis for fingerprint-based quality evaluation.
Different similarity calculation methods evaluate the similarity of sample fingerprints from different angles, so their results often differ. A fingerprint similarity result should reflect, both qualitatively and quantitatively, changes in the kinds and contents of the chemical components of the samples being compared. Current calculation methods fall into two main classes. The first class starts from the overall shape of the fingerprint and judges qualitatively whether the chemical composition of samples is similar or different by comparing information such as the number of fingerprint peaks, their relative intensity ratios, and the positions of characteristic peaks; examples are the correlation coefficient method, the included-angle cosine method, and the Nei coefficient method. Methods of this class are not sensitive enough to fluctuations in fingerprint peak quantities (peak area, peak height, or peak intensity), particularly for small peaks. The second class judges quantitatively whether the contents of the chemical components are similar or different by comparing differences in the area, height, or intensity of characteristic peaks; an example is the family of distance-based similarity algorithms. Methods of this class are more sensitive to changes in peak quantities, but they cannot readily distinguish whether a change in similarity is caused by a change in chemical composition or by fluctuation of the characteristic peak quantities.
How to evaluate similarity both qualitatively and quantitatively, reflecting the similarities and differences in both the chemical composition and the content of samples, is a technical problem that those skilled in the art urgently need to solve.
Summary of the invention
To overcome the above deficiencies of the prior art, the invention provides a fingerprint similarity calculation method, a device, and a sample quality evaluation system. The method combines the two classes of similarity calculation described above into a comprehensive similarity, which reflects the similarities and differences in both the chemical composition and the content of samples, evaluates sample similarity from both the qualitative and the quantitative angle, and thereby evaluates sample quality.
To achieve the above object, the invention adopts the following technical scheme:
A fingerprint similarity calculation method, comprising the following steps:
Step 1: Establish a standard reference fingerprint;
Step 2: Obtain the test sample fingerprint;
Step 3: Calculate the correlation coefficient R1 between the standard reference fingerprint and the test sample fingerprint;
Step 4: Calculate the difference similarity coefficient R2 between the standard reference fingerprint and the test sample fingerprint;
Step 5: Calculate the comprehensive similarity from the correlation coefficient and the difference similarity coefficient:
where δ is the weight factor, with 0 ≤ δ ≤ 1; u and v are the sensitivity factors of R1 and R2 respectively, with 0 ≤ u ≤ +∞ and 0 ≤ v ≤ +∞.
The correlation coefficient is calculated as follows:
where x_i and s_i denote the peak area, peak height, or spectral intensity of the i-th chromatographic peak of the test sample fingerprint and the standard reference fingerprint respectively; n is the number of chromatographic peaks or wavelength points; x̄ and s̄ are the average peak area, average peak height, or mean spectral intensity of the test sample fingerprint and the standard reference fingerprint respectively.
The difference similarity coefficient is calculated as follows:
where x_i and s_i denote the peak area, peak height, or spectral intensity of the i-th chromatographic peak of the test sample fingerprint and the standard reference fingerprint respectively; n is the number of chromatographic peaks or wavelength points; x̄ and s̄ are the average peak area, average peak height, or mean spectral intensity of the test sample fingerprint and the standard reference fingerprint respectively.
The default values of δ, u, and v are 0.5, 1, and 1 respectively.
The standard reference fingerprint is either the fingerprint of a selected standard reference sample, or is generated from the fingerprints of multiple sample batches measured according to the relevant technical requirements.
The calculation method can be used for sample quality testing and for detecting quality differences between samples of the same or different types.
By adjusting the parameters δ, u, and v, the differences in similarity between samples of different types can be increased.
According to a second object of the invention, the invention further provides a device for fingerprint similarity calculation, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the program:
Step 1: Establish a standard reference fingerprint;
Step 2: Obtain the test sample fingerprint;
Step 3: Calculate the correlation coefficient R1 between the standard reference fingerprint and the test sample fingerprint;
Step 4: Calculate the difference similarity coefficient R2 between the standard reference fingerprint and the test sample fingerprint;
Step 5: Calculate the comprehensive similarity from the correlation coefficient and the difference similarity coefficient:
where δ is the weight factor, with 0 ≤ δ ≤ 1; u and v are the sensitivity factors of R1 and R2 respectively, with 0 ≤ u ≤ +∞ and 0 ≤ v ≤ +∞.
According to a third object of the invention, the invention further provides a computer-readable storage medium storing a computer program for fingerprint similarity calculation, wherein the program performs the following steps when executed by a processor:
Step 1: Establish a standard reference fingerprint;
Step 2: Obtain the test sample fingerprint;
Step 3: Calculate the correlation coefficient R1 between the standard reference fingerprint and the test sample fingerprint;
Step 4: Calculate the difference similarity coefficient R2 between the standard reference fingerprint and the test sample fingerprint;
Step 5: Calculate the comprehensive similarity from the correlation coefficient and the difference similarity coefficient:
where δ is the weight factor, with 0 ≤ δ ≤ 1; u and v are the sensitivity factors of R1 and R2 respectively, with 0 ≤ u ≤ +∞ and 0 ≤ v ≤ +∞.
According to a fourth object of the invention, the invention further provides a sample quality evaluation system comprising a detector and a computing device;
The detector measures the fingerprints of the standard reference sample and the test sample under the same conditions and transmits them to the computing device;
The computing device comprises a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the program:
Receive the fingerprint of the standard reference sample as the standard reference fingerprint;
Receive the test sample fingerprint;
Calculate the correlation coefficient R1 between the standard reference fingerprint and the test sample fingerprint;
Calculate the difference similarity coefficient R2 between the standard reference fingerprint and the test sample fingerprint;
Using default parameter values, or parameter values for the comprehensive similarity set by the user, calculate the comprehensive similarity from the correlation coefficient and the difference similarity coefficient:
where δ is the weight factor, with 0 ≤ δ ≤ 1; u and v are the sensitivity factors of R1 and R2 respectively, with 0 ≤ u ≤ +∞ and 0 ≤ v ≤ +∞;
Using a default threshold, or a threshold set by the user, compare the comprehensive similarity with the threshold; if the comprehensive similarity is not less than the threshold, the sample is qualified, otherwise it is unqualified.
Beneficial effects of the invention
1. Fingerprint similarity evaluation methods fall broadly into two classes: one reflects the similarity of samples through differences between the profiles; the other reflects it through the vector angle or the degree of correlation of the profile data. The two classes interpret sample similarity from different angles, and each has its own strengths and weaknesses. Combining them, so that each contributes its strengths and offsets the other's weaknesses, gives a more comprehensive and reasonable evaluation of sample similarity.
2. The invention combines the two classes of similarity evaluation methods and proposes a fingerprint similarity calculation method based on a comprehensive similarity. The method reflects the similarity of samples in terms of both chemical composition and content difference. It provides not only the comprehensive similarity but also separate similarity information for chemical composition (R1) and content difference (R2). Introducing the three parameters δ, u, and v makes the method adaptable to different types of complex sample systems.
Brief description of the drawings
The accompanying drawings, which form part of this application, provide a further understanding of the application. The illustrative embodiments and their descriptions explain the application and do not unduly limit it.
Fig. 1 is a flow chart of the fingerprint similarity calculation method of the invention;
Fig. 2 compares the peak areas, peak heights, or intensities of two fingerprints;
Fig. 3 shows the HPLC fingerprints of different sample batches;
Fig. 4 is a bar chart of the chromatographic peak areas of the fingerprints of different sample batches;
Fig. 5 is cigarette housing material infrared spectrogram.
Detailed description
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the application. Unless otherwise indicated, all technical and scientific terms used herein have the meanings commonly understood by a person of ordinary skill in the art to which this application belongs.
It should also be noted that the terminology used here is only for describing specific embodiments and is not intended to limit the exemplary embodiments of the application. As used herein, unless the context clearly indicates otherwise, the singular form is intended to include the plural form. In addition, when the terms "comprise" and/or "include" are used in this specification, they indicate the presence of features, steps, operations, devices, components, and/or combinations thereof.
Where no conflict arises, the embodiments of this application and the features in the embodiments may be combined with one another.
General idea of the invention: the correlation coefficient (R1) is sensitive to the chemical composition and relative proportions reflected by the fingerprint, whereas the difference similarity coefficient (R2) directly reflects the content difference represented by the compared fingerprint peaks as a proportion of the total content. So that the difference similarity coefficient agrees with the usual convention that a larger similarity value means greater similarity, its calculation is modified such that the larger R2 is, the smaller the content difference between samples and the higher the similarity. Combining R1 and R2 to express the similarity between samples better reflects similarity in both the qualitative (chemical composition) and quantitative (component content) respects, and therefore gives a more comprehensive similarity evaluation.
Embodiment one
This embodiment discloses a method for calculating fingerprint similarity, comprising the following steps:
Step 1: Select a standard reference sample and measure its fingerprint under defined conditions; this fingerprint serves as the standard reference fingerprint. Alternatively, measure the fingerprints of multiple sample batches according to the relevant technical requirements and generate the standard reference fingerprint from them;
Step 2: Measure the fingerprint of the test sample under the same conditions as the standard sample to obtain the test sample fingerprint;
Step 3: Calculate the correlation coefficient between the test sample and the standard reference sample using formula (1):
Step 4: Calculate the difference similarity coefficient between the fingerprints of the test sample and the standard reference sample using formula (2):
In formulas (1) and (2), x_i and s_i denote the peak area, peak height, or spectral intensity (for example the absorbance or transmittance of a spectral signal) of the i-th chromatographic peak of the test sample fingerprint and the standard reference fingerprint respectively; n is the number of chromatographic peaks or wavelength points; x̄ and s̄ are the average peak area, average peak height, or mean spectral intensity of the test sample and the standard reference sample respectively;
Step 5: Calculate the comprehensive similarity of the test sample and the standard reference sample using formula (3):
where δ is the weight factor, with 0 ≤ δ ≤ 1; u and v are the sensitivity factors of R1 and R2 respectively, with 0 ≤ u ≤ +∞ and 0 ≤ v ≤ +∞.
The method can be used for the quality evaluation of different types of samples and for assessing quality differences between samples of the same or different types.
Formula (3) for the comprehensive similarity R introduces the weight factor δ and the sensitivity factors u and v so that the method can adapt to the data types of different sample fingerprints and serve the purpose of the similarity evaluation. Varying δ adjusts the relative contributions of R1 and R2 to R, to suit different sample analysis systems. Varying u and v adjusts the sensitivity of R1 and R2; the larger u or v is, the higher the sensitivity. For the analysis of real samples, δ, u, and v are chosen to meet the similarity evaluation requirements of those samples. In general the default values δ = 0.5, u = 1, and v = 1 can be used. When comparing the similarity of different samples of the same type, the values of the three parameters should be held fixed. For data from different types of samples (for example traditional Chinese medicine samples or tobacco samples), δ, u, and v may take different values to suit each type, but within any one similarity comparison the three values should remain fixed.
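A minimal Python sketch of Step 5 may help make the roles of δ, u, and v concrete. The image of formula (3) is not reproduced in this text, so the additive form R = δ·R1^u + (1 - δ)·R2^v used below is an assumption; it is consistent with the values reported in Example 1 (R1 = 100%, R2 = 83.66%, R = 91.83% at the default parameters). The function and argument names are illustrative, and R1 and R2 are taken as already computed values between 0 and 1.

```python
def comprehensive_similarity(r1, r2, delta=0.5, u=1.0, v=1.0):
    """Combine R1 and R2 into a single similarity score.

    ASSUMPTION: R = delta * R1**u + (1 - delta) * R2**v.
    This form is inferred from the figures quoted in Example 1;
    it is not copied from the patent's formula (3).
    """
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    if u < 0 or v < 0:
        raise ValueError("u and v must be non-negative")
    if r1 < 0 or r2 < 0:
        # A negative base with a fractional exponent is not a real number.
        raise ValueError("this sketch expects r1 and r2 in [0, 1]")
    return delta * r1 ** u + (1.0 - delta) * r2 ** v


# Default parameters with the Example 1 inputs: 0.5 * 1.0 + 0.5 * 0.8366
print(round(comprehensive_similarity(1.0, 0.8366), 4))  # prints 0.9183

# A larger exponent penalizes the same shortfall in R1 more strongly.
baseline = comprehensive_similarity(0.95, 0.95)
sharper = comprehensive_similarity(0.95, 0.95, u=2.0)
print(sharper < baseline)  # prints True
```

Under this assumed form, δ shifts weight between the two coefficients, while u and v act as exponents, which is why larger values make the score react more strongly to the same deviation from 1.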
Embodiment two
The purpose of this embodiment is to provide a computing device.
A device for fingerprint similarity calculation, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the program:
Step 1: Establish a standard reference fingerprint;
Step 2: Obtain the test sample fingerprint;
Step 3: Calculate the correlation coefficient R1 between the standard reference fingerprint and the test sample fingerprint;
Step 4: Calculate the difference similarity coefficient R2 between the standard reference fingerprint and the test sample fingerprint;
Step 5: Calculate the comprehensive similarity from the correlation coefficient and the difference similarity coefficient:
where δ is the weight factor, with 0 ≤ δ ≤ 1; u and v are the sensitivity factors of R1 and R2 respectively, with 0 ≤ u ≤ +∞ and 0 ≤ v ≤ +∞.
Embodiment three
The purpose of this embodiment is to provide a computer-readable storage medium.
A computer-readable storage medium storing a computer program for fingerprint similarity calculation, wherein the program performs the following steps when executed by a processor:
Step 1: Establish a standard reference fingerprint;
Step 2: Obtain the test sample fingerprint;
Step 3: Calculate the correlation coefficient R1 between the standard reference fingerprint and the test sample fingerprint;
Step 4: Calculate the difference similarity coefficient R2 between the standard reference fingerprint and the test sample fingerprint;
Step 5: Calculate the comprehensive similarity from the correlation coefficient and the difference similarity coefficient:
where δ is the weight factor, with 0 ≤ δ ≤ 1; u and v are the sensitivity factors of R1 and R2 respectively, with 0 ≤ u ≤ +∞ and 0 ≤ v ≤ +∞.
Embodiment four
The purpose of this embodiment is to provide a sample quality evaluation system.
To achieve this purpose, the invention adopts the following technical scheme:
This embodiment provides a sample quality evaluation system comprising a detector and a computing device;
The detector measures the fingerprints of the standard reference sample and the test sample under the same conditions and transmits them to the computing device;
The computing device comprises a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the program:
Receive the fingerprint of the standard reference sample as the standard reference fingerprint;
Receive the test sample fingerprint;
Calculate the correlation coefficient R1 between the standard reference fingerprint and the test sample fingerprint;
Calculate the difference similarity coefficient R2 between the standard reference fingerprint and the test sample fingerprint;
Receive the user's parameter settings for the comprehensive similarity and calculate the comprehensive similarity from the correlation coefficient and the difference similarity coefficient:
where δ is the weight factor, with 0 ≤ δ ≤ 1; u and v are the sensitivity factors of R1 and R2 respectively, with 0 ≤ u ≤ +∞ and 0 ≤ v ≤ +∞;
Compare the comprehensive similarity with the set threshold; if the comprehensive similarity is not less than the threshold, the sample is qualified, otherwise it is unqualified.
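The final comparison step can be sketched in a few lines of Python. The function name and the 0.90 default are assumptions made for illustration; the text leaves the threshold to a preset value or a user setting, and 0.900 is only the value discussed in Example 3.

```python
def is_qualified(similarity, threshold=0.90):
    """Return True when the comprehensive similarity is not less than the threshold.

    The default threshold is a hypothetical choice for this sketch.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must lie in [0, 1]")
    return similarity >= threshold


for value in (0.9183, 0.90, 0.85):
    verdict = "qualified" if is_qualified(value) else "unqualified"
    print(value, verdict)
```

A similarity exactly equal to the threshold passes, which matches the "not less than" wording of the step.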
Each step being related in the device of above example two, three and four is corresponding with embodiment of the method one, specific implementation Mode can be found in the related description part of embodiment one.Term " computer-readable recording medium " be construed as including one or The single medium or multiple media of multiple instruction collection;Any medium is should also be understood as including, any medium can be deposited Store up, encode or carry for the instruction set by computing device and make the either method in the computing device present invention.
Experimental result
Example 1
In the published literature, the main methods that combine qualitative and quantitative similarity are the comprehensive information index, i.e. an improved Nei coefficient method (Meng Qinghua, Liu Yongsuo, Jiang Shumin, Hu Yuzhu, Application study of the chromatographic fingerprint comprehensive information index in traditional Chinese medicine quality control, Chinese natural drug, 2004, 11 (2): 359-364), and a similarity method that combines the correlation coefficient with the relative Euclidean distance (CN105651875A). The comprehensive information index is calculated as:
In formula (4), n1 and n2 are the numbers of chromatographic peaks in the two fingerprints being compared, and n is the number of shared peaks. h1t(i) and h2t(i) are the area or height of the i-th shared peak in the first and second fingerprints respectively. As formula (4) shows, the calculation of S has two parts: the first part is the Nei coefficient; the second part expresses the content difference between the compared samples as the difference in area or height of the i-th shared peak of the two fingerprints relative to their sum. Formula (4) has the following shortcomings. The second part does not account for the effect of the areas or heights of non-shared peaks on similarity. The first part considers only the proportion of shared peaks among the total peaks of the fingerprints, not the relative sizes of their areas or heights. For spectral fingerprint data, derivative processing can produce negative values, which breaks the calculation of the second part: the sign of the second part may change, so that S falls outside the range 0 to 1. We simulated two fingerprints, shown in Table 1:
Table 1. Simulated spectral data
Wavelength (nm)    Fingerprint 1    Fingerprint 2
201                -1.4927          0.43799
202                -1.4443          1.325
203                0.51217          0.8079
204                -0.85071         1.3743
205                0.9012           0.15665
206                0.97494          0.28882
207                -0.64072         0.6114
208                -1.7011          -2.5206
209                2.0643           1.5019
210                1.392            -0.79828
The comprehensive information index calculated by formula (4) is 6.8376, whereas the comprehensive similarity calculated by the method proposed here is 26.289%, which is the more reasonable result.
The similarity method that combines the correlation coefficient with the relative Euclidean distance is calculated as:
S = R^j · D^k (5)
Formula (5) has two parts, R (the correlation coefficient) and D (the relative Euclidean distance); j and k are their respective sensitivity coefficients. R in formula (5) is calculated as:
D in formula (5) is calculated as:
When
When
In formulas (6), (7), and (8), n is the number of fingerprint peak areas or peak heights. X_i is the fingerprint peak area or peak height of the control sample, and X̄ is the average fingerprint peak area or peak height of the comparison sample; Y_i is the fingerprint peak area or peak height of the standard sample, and if the comparison sample has no corresponding peak, its peak area or peak height is recorded as 0. Ȳ is the average fingerprint peak area or peak height of the standard sample.
The correlation coefficient of formula (6) in that method differs somewhat from the ordinary correlation coefficient (see formula (1)), so the first part R can exceed 1, which is hard to interpret. Consider, for example, the simulated data in Table 2:
Table 2. Simulated fingerprint data
The value calculated by formula (6) is R = 1.2786, because the numerator of formula (6) (28887) is larger than its denominator (22593).
If formula (6) is replaced by the ordinary correlation coefficient (formula (1)), the method can still fall short when analyzing spectral fingerprint data. Normalization is a common preprocessing step for spectral data, expressed as:
After normalization by formula (9), the norm of the spectral fingerprint data is 1, or the norms of the two fingerprints are so close that their ratio is 1. When sample fingerprints are then compared using formula (5), S = R^j · D^k = R^j, because D = 1. Table 3 shows two simulated fingerprints.
Table 3. Simulated fingerprint data
Because the ratio of the norms of the two fingerprints is 1, formula (5) gives S = R^j · D^k = R^j · 1^k = R^j. For these two fingerprints S = R^j = 1, which indicates that the two data sets are identical and fails to distinguish the difference between them. By contrast, the similarity calculation proposed here gives a comprehensive similarity R = 91.83%, with R1 = 100% and R2 = 83.66% (δ, u, and v at their default values of 0.5, 1, and 1). It distinguishes the two fingerprints, which agrees with the differences visible in the data of Table 3 and Fig. 2, so the comprehensive similarity result is the more reasonable one.
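The normalization effect described above can be checked with a short Python sketch. It assumes that formula (9), which is not reproduced in this text, divides each spectrum by its Euclidean norm; the two spectra are made-up values, not the data of Table 3. After normalization both norms equal 1, so a term built on the ratio of the norms carries no information, even though the spectra still differ point by point.

```python
import math


def euclidean_norm(spectrum):
    """Length of the spectrum treated as a vector."""
    return math.sqrt(sum(value * value for value in spectrum))


def normalize(spectrum):
    """Scale a spectrum to unit Euclidean norm (assumed form of formula (9))."""
    norm = euclidean_norm(spectrum)
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero spectrum")
    return [value / norm for value in spectrum]


# Hypothetical intensities chosen only for illustration.
a = normalize([1.0, 2.0, 3.0, 4.0])
b = normalize([2.0, 2.0, 3.0, 3.0])

norm_ratio = euclidean_norm(a) / euclidean_norm(b)
largest_gap = max(abs(p - q) for p, q in zip(a, b))

print(round(norm_ratio, 6))   # the ratio of the norms is 1 after normalization
print(round(largest_gap, 3))  # yet individual points still differ
```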
Example 2
Anscombe data analyses
The Anscombe data are an unusual data set constructed by the statistician F. J. Anscombe, shown in Table 4:
Table 4. Anscombe data
Y     10     8      13     9      11     14     6      4      12     7      5
X1    8.04   6.95   7.58   8.81   8.33   9.96   7.24   4.26   10.84  4.82   5.68
X2    9.14   8.14   8.74   8.77   9.26   8.1    6.13   3.1    9.13   7.26   4.74
X3    7.46   6.77   12.74  7.11   7.81   8.84   6.08   5.39   8.15   6.42   5.73
With Y as the standard reference fingerprint and X1, X2, and X3 as three sample fingerprints, the similarity results are given in Table 5.
Table 5. Similarity calculation results
As Table 5 shows, to three significant figures the included-angle cosine method and the correlation coefficient method each give the same result for all three samples (0.981 and 0.816 respectively) and cannot distinguish them, whereas the method of the invention gives similarities of 85.4%, 86.6%, and 85.3%, which do distinguish the samples. To increase the differences in similarity between samples, the three parameters δ, u, and v can be adjusted. Because the correlation coefficient varies very little here, its weight in the comprehensive similarity is reduced while the sensitivity of the intensity difference similarity coefficient is increased. The results, also shown in Table 5, show clearer differences between the comprehensive similarity values of the samples. The similarity calculation method proposed by the invention therefore has good applicability.
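The statement that the cosine and correlation coefficient methods cannot separate the three samples can be reproduced from the Table 4 data. The sketch below uses the ordinary Pearson correlation coefficient and the ordinary included-angle cosine; the function names are illustrative. It does not compute the difference similarity coefficient, because formula (2) is not reproduced in this text.

```python
import math


def pearson(x, s):
    """Ordinary Pearson correlation coefficient between two profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_s = sum(s) / n
    cov = sum((a - mean_x) * (b - mean_s) for a, b in zip(x, s))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_s = sum((b - mean_s) ** 2 for b in s)
    return cov / math.sqrt(var_x * var_s)


def cosine(x, s):
    """Included-angle cosine between two profiles treated as vectors."""
    dot = sum(a * b for a, b in zip(x, s))
    return dot / math.sqrt(sum(a * a for a in x) * sum(b * b for b in s))


Y = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
samples = {
    "X1": [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68],
    "X2": [9.14, 8.14, 8.74, 8.77, 9.26, 8.1, 6.13, 3.1, 9.13, 7.26, 4.74],
    "X3": [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73],
}

for name, profile in samples.items():
    # Each line prints the sample name followed by 0.816 0.981
    print(name, round(pearson(profile, Y), 3), round(cosine(profile, Y), 3))
```

All three samples give the same two values at three decimal places, which is the behaviour the example describes.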
Example 3
The fingerprints of ginseng branch tuckahoe oral liquid were measured by HPLC; 12 batches of samples were measured under identical chromatographic conditions. The data were processed with a similarity evaluation system (2004 edition, version A) to obtain automatic peak matching, a reference fingerprint was generated by the averaging method, and this was used as the standard reference fingerprint for the similarity analysis of the different batches (see Fig. 3).
The chromatographic peak areas of the fingerprints of the different batches are plotted as a bar chart in Fig. 4.
As can be seen from Fig. 3, although the overall profiles of the batches are fairly similar, the peak-area comparison in Fig. 2 shows that the peak areas of the different batches still differ considerably. The similarity results obtained by the cosine method and the correlation coefficient method in Table 6 show that the fingerprints of samples 1-9 have high similarity to the standard reference fingerprint (>0.970), indicating that the chemical composition of these batches is close to that of the standard reference sample, whereas the similarity values of samples 10-12 are lower, indicating that the chemical composition of these samples differs from that of the standard reference sample. In fact, batches 10-12 are expired samples; they had been stored for a long time and their chemical composition may have changed, hence the lower similarity values. In general, the cosine method and the correlation coefficient method have difficulty reflecting the effect on similarity of changes in the amounts of the chemical components in a sample fingerprint. As Fig. 4 shows, the areas of some fingerprint peaks fluctuate considerably between batches, particularly peaks with small areas. It is generally considered that a larger difference exists when the similarity between the test sample fingerprint and the standard reference fingerprint is below 0.900.
To obtain a reasonable similarity evaluation, we divided the 12 batches into two data sets: a calibration set (samples 1, 3, 5, 7, 9 and 11) and a validation set (samples 2, 4, 6, 8, 10 and 12). The comprehensive similarity between each calibration-set fingerprint and the standard reference fingerprint was calculated by formula (1), and the three parameters δ, u and v were then adjusted so that the similarity values of the calibration set fell within a reasonable range, i.e. a comprehensive similarity >0.900 (90.0%) for non-expired samples and <0.900 (90.0%) for expired samples. The adjusted parameters were then fixed and used to predict the comprehensive similarity of the validation-set samples, to check whether those values are reasonable. Because the expired samples had been stored for a long time and their chemical composition may have changed, the sensitivity to changes in chemical composition was increased appropriately and the sensitivity to changes in the amounts of the chemical components was reduced, so as to distinguish non-expired from expired samples. δ, u and v were set to 0.5, 2 and 0.1; the resulting comprehensive similarities are given in Table 6. As Table 6 shows, the comprehensive similarity of every non-expired sample is above 94.0%, while that of every expired sample is below 90.0%; the similarity results agree with the actual situation and are reasonable.
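The calibration and validation procedure described above can be sketched as a small parameter search. This is an illustration under several assumptions, not the patent's implementation: formula (1) is not reproduced in this text, so `stand_in_combine` is a hypothetical placeholder for it, the (R1, R2) pairs are made-up inputs, and the search grid is arbitrary. Only the odd/even split, the expired batches 10-12 and the 0.900 limit are taken from the text.

```python
from itertools import product

THRESHOLD = 0.900  # acceptance limit quoted in the text


def split_by_parity(sample_ids):
    """Odd-numbered samples form the calibration set, even-numbered the validation set."""
    calibration = [s for s in sample_ids if s % 2 == 1]
    validation = [s for s in sample_ids if s % 2 == 0]
    return calibration, validation


def separates(params, samples, factors, expired, combine):
    """True if all non-expired samples score above THRESHOLD and all expired ones below."""
    for s in samples:
        r1, r2 = factors[s]
        score = combine(r1, r2, *params)
        if s in expired and score >= THRESHOLD:
            return False
        if s not in expired and score <= THRESHOLD:
            return False
    return True


def tune(calibration, factors, expired, combine, grid):
    """Return the first (delta, u, v) on the grid that separates the calibration set."""
    for params in product(*grid):
        if separates(params, calibration, factors, expired, combine):
            return params
    return None


def stand_in_combine(r1, r2, delta, u, v):
    # Hypothetical placeholder, NOT formula (1) of the patent.
    return delta * r1 ** u + (1 - delta) * r2 ** v


sample_ids = list(range(1, 13))
expired = {10, 11, 12}  # batches described as expired in the text
# Made-up (R1, R2) pairs, for illustration only.
factors = {s: ((0.90, 0.80) if s in expired else (0.99, 0.95)) for s in sample_ids}

calibration, validation = split_by_parity(sample_ids)
grid = ([0.3, 0.5, 0.7], [1, 2], [0.1, 1])  # candidate values for delta, u, v
params = tune(calibration, factors, expired, stand_in_combine, grid)
# The fixed parameters are then checked on the validation set.
validated = params is not None and separates(
    params, validation, factors, expired, stand_in_combine
)
print(params, validated)
```

Passing the combining function in as an argument keeps the search logic independent of the exact form of formula (1), so the real formula can be substituted without changing the rest of the sketch.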
Table 6. Similarity evaluation results for the different sample batches
Example 4
Infrared spectroscopy can be used to distinguish genuine from counterfeit cigarettes. Using infrared reflectance spectroscopy, we measured the outer packaging material of genuine and counterfeit Double Happiness cigarettes under identical conditions; the infrared absorption spectra obtained are shown in Figure 4. The similarity evaluation results for the genuine and counterfeit samples are given in Table 7.
Table 7. Similarity results for the outer packaging material of Double Happiness cigarettes
As Figure 5 shows, the difference between the genuine and counterfeit samples lies mainly in absorption intensity (T%), while the overall profiles are very similar. The correlation coefficient and cosine methods therefore give very high similarity values (see Table 7), which makes it difficult to distinguish genuine products from counterfeits. With the comprehensive similarity calculation method of the invention, appropriately increasing the sensitivity factor of the difference similarity factor, in line with Figure 5, gives a satisfactory result. As Table 7 shows, the comprehensive similarity between genuine samples is above 92.0%, while the highest comprehensive similarity with a counterfeit is 88.6% (<90.0%); the similarity results of the invention therefore agree with the actual situation and are more reasonable.
The calculation method of the invention reflects sample similarity in two respects: the chemical composition of the sample and differences in content. It provides not only a comprehensive similarity but also, separately, similarity information reflecting chemical composition (R1) or content differences (R2). Introducing the three parameters δ, u and v allows the similarity calculation method to meet the needs of different types of complex sample systems.
Those skilled in the art will understand that the modules or steps of the invention described above can be implemented with a general-purpose computing device. Alternatively, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; or they can each be fabricated as individual integrated circuit modules, or several of the modules or steps can be fabricated as a single integrated circuit module. The invention is not limited to any specific combination of hardware and software.
Although specific embodiments of the invention have been described above with reference to the accompanying drawings, they do not limit the scope of protection of the invention. Those of ordinary skill in the art should understand that modifications or variations that can be made on the basis of the technical solution of the invention without creative effort remain within its scope of protection.

Claims (10)

1. A fingerprint similarity calculation method, characterized by comprising the following steps:
Step 1: establish a standard reference fingerprint;
Step 2: obtain a test sample fingerprint;
Step 3: calculate the correlation coefficient R1 between the standard reference fingerprint and the test sample fingerprint;
Step 4: calculate the difference similarity factor R2 between the standard reference fingerprint and the test sample fingerprint;
Step 5: calculate the comprehensive similarity based on the correlation coefficient and the difference similarity factor:
where δ is a weight factor with range 0≤δ≤1, and u and v are the sensitivity factors of R1 and R2 respectively, with ranges 0≤u≤+∞ and 0≤v≤+∞.
2. The fingerprint similarity calculation method of claim 1, characterized in that the correlation coefficient is calculated by the following formula:
where xi and si are the peak area, peak height or spectral intensity of the i-th chromatographic peak of the test sample fingerprint and the standard reference sample fingerprint, respectively; n is the number of chromatographic peaks or wavelength points; and x̄ and s̄ are the average peak area, average peak height or mean spectral intensity of the test sample fingerprint and the standard reference sample fingerprint, respectively.
3. The fingerprint similarity calculation method of claim 1, characterized in that the difference similarity factor is calculated by the following formula:
where xi and si are the peak area, peak height or spectral intensity of the i-th chromatographic peak of the test sample fingerprint and the standard reference sample fingerprint, respectively; n is the number of chromatographic peaks or wavelength points; and x̄ and s̄ are the average peak area, average peak height or mean spectral intensity of the test sample fingerprint and the standard reference sample fingerprint, respectively.
4. The fingerprint similarity calculation method of claim 1, characterized in that the default values of δ, u and v are 0.5, 1 and 1.
5. The fingerprint similarity calculation method of claim 1, characterized in that the standard reference fingerprint is the fingerprint of a selected standard reference sample, or is generated from the measured fingerprints of multiple sample batches in accordance with the relevant technical requirements.
6. The fingerprint similarity calculation method of claim 1, characterized in that the calculation method can be used for quality testing of samples and for detecting quality differences between samples of the same or different types.
7. The fingerprint similarity calculation method of claim 6, characterized in that the difference in similarity between samples of different types can be increased by adjusting the parameters δ, u and v.
8. A device for fingerprint similarity calculation, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the following steps when executing the program:
Step 1: establish a standard reference fingerprint;
Step 2: obtain a test sample fingerprint;
Step 3: calculate the correlation coefficient R1 between the standard reference fingerprint and the test sample fingerprint;
Step 4: calculate the difference similarity factor R2 between the standard reference fingerprint and the test sample fingerprint;
Step 5: calculate the comprehensive similarity based on the correlation coefficient and the difference similarity factor:
where δ is a weight factor with range 0≤δ≤1, and u and v are the sensitivity factors of R1 and R2 respectively, with ranges 0≤u≤+∞ and 0≤v≤+∞.
9. A computer-readable storage medium storing a computer program for fingerprint similarity calculation, characterized in that the program performs the following steps when executed by a processor:
Step 1: establish a standard reference fingerprint;
Step 2: obtain a test sample fingerprint;
Step 3: calculate the correlation coefficient R1 between the standard reference fingerprint and the test sample fingerprint;
Step 4: calculate the difference similarity factor R2 between the standard reference fingerprint and the test sample fingerprint;
Step 5: calculate the comprehensive similarity based on the correlation coefficient and the difference similarity factor:
where δ is a weight factor with range 0≤δ≤1, and u and v are the sensitivity factors of R1 and R2 respectively, with ranges 0≤u≤+∞ and 0≤v≤+∞.
10. A sample quality evaluation system, characterized by comprising a detector and a computing device;
the detector is configured to measure the fingerprints of a standard reference sample and a test sample under identical conditions and transmit them to the computing device;
the computing device comprises a memory, a processor, and a computer program stored on the memory and executable on the processor, and the processor implements the following steps when executing the program:
receiving the fingerprint of the standard reference sample as the standard reference fingerprint;
receiving the test sample fingerprint;
calculating the correlation coefficient R1 between the standard reference fingerprint and the test sample fingerprint;
calculating the difference similarity factor R2 between the standard reference fingerprint and the test sample fingerprint;
calculating the comprehensive similarity from the correlation coefficient and the difference similarity factor, based on default parameter values or on parameter values for the comprehensive similarity received from the user:
where δ is a weight factor with range 0≤δ≤1, and u and v are the sensitivity factors of R1 and R2 respectively, with ranges 0≤u≤+∞ and 0≤v≤+∞;
comparing the comprehensive similarity with a set threshold, based on a default threshold or on a threshold received from the user; if the comprehensive similarity is not less than the threshold, the sample is qualified, otherwise it is unqualified.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710832461.9A CN107784192A (en) 2017-09-15 2017-09-15 Fingerprint similarity computational methods, device and sample quality evaluation system

Publications (1)

Publication Number Publication Date
CN107784192A true CN107784192A (en) 2018-03-09

Family

ID=61437615

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710832461.9A Pending CN107784192A (en) 2017-09-15 2017-09-15 Fingerprint similarity computational methods, device and sample quality evaluation system

Country Status (1)

Country Link
CN (1) CN107784192A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20180309)