CN105512415A - Batch fitting method of nonlinear dose-response curve - Google Patents

Batch fitting method of nonlinear dose-response curve

Publication number
CN105512415A
Authority
CN
China
Prior art keywords
data
dose
effect
monotonic
parameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510945122.2A
Other languages
Chinese (zh)
Other versions
CN105512415B (en)
Inventor
朱祥伟
曹煜彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao Agricultural University
Original Assignee
Qingdao Agricultural University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao Agricultural University filed Critical Qingdao Agricultural University
Priority to CN201510945122.2A priority Critical patent/CN105512415B/en
Publication of CN105512415A publication Critical patent/CN105512415A/en
Application granted granted Critical
Publication of CN105512415B publication Critical patent/CN105512415B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/20Design optimisation, verification or simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Evolutionary Computation (AREA)
  • Geometry (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention relates to a batch fitting method of a nonlinear dose-response curve. The method comprises steps as follows: establishing of a basic dose-response database, establishing of a fitting parameter database, reading of dose-response data, monotonicity judgment of dose-response data, high-frequency trial and error fitting of monotonic dose-response data, high-frequency trial and error fitting of non-monotonic dose-response data, selection fitting of the optimum fitting function and other processes. The method is based on fitting parameter database SVCF (starting values for curve fitting) of a nonlinear function, a high-frequency trial and error technology is adopted for batch and rapid fitting of dose-response curve data, and mass data can be processed in a short time. The method is very suitable for large batches of dose-response curve data produced during toxicity testing on chemical substances with a quantitative high-throughput screening technology at present.

Description

A batch fitting method for nonlinear dose-response curves
Technical field
The invention belongs to the field of dose-response curve analysis in chemical toxicity testing, and specifically relates to a batch fitting method that applies nonlinear monotonic and nonlinear non-monotonic dose-response functions to large dose-response data sets.
Background art
The dose (concentration)-response relationship is a key concept in toxicology. It means that as the concentration of a xenobiotic increases, the degree of the toxic effect on the organism increases, or the proportion of individuals in a population that show a given effect increases. The relationship reflects the distribution of susceptibility of humans or test animals to the toxic action of exogenous chemicals. It can be represented by a concentration-effect curve, obtained by plotting effect intensity on the vertical axis against toxicant concentration on the horizontal axis as a scatter plot. Because different toxicants cause different types of effect under different conditions, the relationship between concentration and effect is not uniform, and different types of concentration-effect curve arise. Nonlinear dose-response relationships are widespread in nature. Depending on whether dose and effect are monotonically related, they are divided into monotonic S-shaped curves and non-monotonic inverted-U-shaped and J-shaped curves. In a monotonic dose-response curve, the effect increases slowly with concentration in the low concentration range, then increases rapidly as the concentration rises further, and finally increases slowly again as the concentration continues to rise. The curve starts flat, becomes steep, and then flattens again, giving an S shape. A non-monotonic dose-response curve is characterized by low-dose stimulation and high-dose inhibition, that is, the hormesis effect.
Traditional chemical toxicity test methods have long cycles and yield little data. During data analysis, researchers usually enter or copy the dose-response data into specialized statistical software such as SPSS, Origin, GraphPad or DPS for fitting. Nonlinear fitting requires: 1) selecting a functional form according to the trend of the dose-response data; 2) providing initial values for the parameters of the function. Monotonic functions for fitting dose-response data include more than ten groups of equations, such as the commonly used Hill, Weibull and Logit functions [Scholze M, Boedeker W, Faust M, Backhaus T, Altenburger R, Grimme LH. 2001. A general best-fit method for concentration-response curves and the estimation of low-effect concentrations. Environ. Toxicol. Chem. 20:448–457] [Spiess A-N, Neumeyer N. An evaluation of R2 as an inadequate measure for nonlinear models in pharmacological and biochemical research: A Monte Carlo approach. BMC Pharmacol. 2010, 10:11.]. There are also several groups of functions for fitting non-monotonic dose-response data, such as the Brain_Consens and Biphasic functions [Zhu X-W, Liu S-S, Qin L-T, Chen F, Liu H-L. 2013. Modeling non-monotonic dose–response relationships: Model evaluation and hormetic quantities exploration. Ecotoxicol. Environ. Saf. 89:130–136] and multiphasic functions that describe non-monotonic data by superimposing monotonic Hill functions in pairs [Di Veroli GY, Fornari C, Goldlust I, Mills G, Koh SB, Bramhall JL, et al. 2015. An automated fitting procedure and software for dose-response curves with multiphasic features. Sci. Rep. 5:14701.]. At present, no commercial or open-source software covers so many functions. In Origin, for example, the selected function must be entered by hand and compiled before fitting, and the next step can proceed only after compilation succeeds. Several groups of functions often have to be fitted before the best one is chosen. Selecting initial values for the function parameters before fitting is a further difficulty. If the initial values are close to the final fitted values, a suitable iterative algorithm can find the optimal parameter values. If they differ greatly, the nonlinear least squares method may produce a singular matrix while searching for the optimal parameter values, and the fit fails. Researchers usually choose initial values from earlier research experience or by repeated trial and error, but for new dose-response data arising in experiments, the initial values they provide often lead to failed fits. Much previous work has concentrated on using machine learning methods such as genetic algorithms to find suitable initial values for the parameters of the fitting equation [Ren Bozhi, Long Tengrui. 2005. An improved hybrid genetic curve-fitting method for parameter estimation of the P-III distribution. Journal of Chongqing University (Natural Science Edition) 28:82–85.] [Watkins P, Puxty G. 2006. A hybrid genetic algorithm for estimating the equilibrium potential of an ion-selective electrode. Talanta 68:1336–1342.] [Niazi A, Leardi R. 2012. Genetic algorithms in chemometrics. J. Chemom. 26:345–351]. In these studies genetic algorithms found initial values for the fitting parameters well; their only drawback is the heavy computational load.
With the launch by the US EPA of the ToxCast computational toxicology program for the risk assessment of environmental chemicals such as pharmaceuticals and personal care products, industrial chemicals and pesticides, quantitative high-throughput screening has measured dose-response data for thousands of chemicals in more than 800 experimental systems [Dix DJ, Houck KA, Martin MT, Richard AM, Setzer RW, Kavlock RJ. 2007. The ToxCast program for prioritizing toxicity testing of environmental chemicals. Toxicol. Sci. 95:5–12] [Judson RS, Houck KA, Kavlock RJ, Knudsen TB, Martin MT, Mortensen HM, et al. 2010. In Vitro Screening of Environmental Chemicals for Targeted Testing Prioritization: The ToxCast Project. Environ. Health Perspect. 118:485–492]. For example, by December 2013 more than 2 million dose-response curves had been obtained. Other companies and laboratories are likewise using quantitative high-throughput screening to measure dose-response data for large numbers of specialty chemicals. Fitting such a mass of dose-response curves requires choosing functions and supplying suitable initial parameter values, so fitting the curves one by one in the traditional manual way is clearly impractical.
A large body of literature addresses function fitting for a single dose-response curve [Di Veroli GY, Fornari C, Goldlust I, Mills G, Koh SB, Bramhall JL, et al. 2015. An automated fitting procedure and software for dose-response curves with multiphasic features. Sci. Rep. 5:14701]. For high-throughput data in which monotonic and non-monotonic dose-response curves are mixed, there is still no batch, automated solution that fits the curves rapidly with a large number of candidate functions and selects the best one. Most research concentrates on improving nonlinear least squares algorithms. For example, many studies use the steepest descent method, the Newton descent method, line search, trust region algorithms and the like to improve the search for optimal fitting parameters [Madsen K, Nielsen HB, Tingleff O. 2004. Methods for non-linear least squares problems. 2nd ed. Informatics and Mathematical Modelling, Technical University of Denmark, DTU, Lyngby], so that poor initial values can be tolerated. These methods, however, do not fundamentally solve the problem of selecting initial values for the fitting parameters.
To address this problem, many possible combinations of initial values usually have to be tried by hand, and guessing possible initial values from experience is inefficient and often wrong. The invention uses a computer program to read an initial value from a prebuilt database and attempt a fit. If the fit fails, the program quickly reads the next initial value and tries again, and it continues until the dose-response data are fitted successfully. We call this computer-controlled fitting procedure the high-frequency trial-and-error technique. It enables batch, automated function fitting of high-throughput dose-response data.
Summary of the invention
To solve the problem that large batches of dose-response data obtained by quantitative high-throughput screening cannot be fitted quickly and accurately, the invention provides a new fitting method.
A batch fitting method for nonlinear dose-response curves comprises the following steps:
(1) Construction of the basic dose-response database (dose-response data, DRD): based on dose-response data from published literature and/or public compound toxicity databases, dose-response data points covering different dose ranges are selected to build the database, which contains both monotonic and non-monotonic dose-response data;
(2) Construction of the fitting parameter database (starting values for curve fitting, SVCF): several groups of nonlinear functions are fitted to the data in the basic dose-response database to obtain the optimal fitted parameter values of the different functions, and these fitted values serve as initial values for fitting new dose-response curves; the database contains the fitted values of 13 groups of monotonic functions and 6 groups of non-monotonic functions for the basic dose-response curve database;
(3) Reading of dose-response data: a data set containing a large batch of dose-response curve data is fitted curve by curve with the high-frequency trial-and-error fitting method; the computer reads the data of one dose-response curve at a time, the number of doses read equals the number of effects, and doses and effects correspond one to one;
(4) Monotonicity determination of dose-response data: the Mann-Kendall trend test is used to identify monotonic dose-response data, and a nonparametric rank test is used to identify non-monotonic dose-response curve data; a curve identified as monotonic is given the monotonic label, and a curve identified as non-monotonic is given the non-monotonic label;
(5) High-frequency trial-and-error fitting with monotonic functions: for the data given the monotonic label in step (4), one or more of the 13 groups of monotonic functions are selected, and the initial values of the corresponding function in the fitting parameter database are called for fitting. The procedure is as follows: a. select any one of the 13 groups of monotonic functions and, taking the corresponding function parameter values in the fitting parameter database as initial values, fit the monotonic dose-response curve data by nonlinear least squares; b. if the fit succeeds, record the fitted parameters, the effective concentrations and their confidence intervals, and compute the goodness-of-fit information, including the adjusted coefficient of determination R²adj, the root mean square error RMSE, the Akaike information criterion AIC, the bias-corrected Akaike information criterion AICc and the Bayesian information criterion BIC; then exit the fitting process for this function and either select and call another monotonic function for fitting or go directly to the next step; c. repeat the previous step until all data with the monotonic label have been fitted, recording the fitting information of every successfully fitted function;
(6) High-frequency trial-and-error fitting with non-monotonic functions: for the data given the non-monotonic label in step (4), one or more of the 6 groups of non-monotonic functions are selected, and the initial values of the corresponding function in the fitting parameter database are called for fitting. The procedure is as follows: a. select any one of the 6 groups of non-monotonic functions and, taking the corresponding function parameter values in the fitting parameter database as initial values, fit the non-monotonic dose-response curve data by nonlinear least squares; b. if the fit succeeds, record the goodness-of-fit information, including the adjusted coefficient of determination R²adj, the root mean square error RMSE, the Akaike information criterion AIC, the bias-corrected Akaike information criterion AICc and the Bayesian information criterion BIC; then exit the fitting process for this function and either select and call another non-monotonic function for fitting or go directly to the next step; c. repeat the previous step until all data with the non-monotonic label have been fitted, recording the fitting information of every successfully fitted function;
(7) Selection of the optimal fitting function: based on the goodness-of-fit information of the monotonic functions fitted in step (5), the optimal function for fitting the monotonic dose-response curve is chosen; based on the goodness-of-fit information of the non-monotonic functions fitted in step (6), the optimal function for fitting the non-monotonic dose-response curve is chosen;
(8) Batch fitting: steps (3)-(7) are repeated until all dose-response curve data in the data set have been fitted.
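The trial-and-error loop of steps (5) and (6) can be sketched as follows. This is a minimal illustration in Python, not the patented implementation (which the description says was written in R). It assumes a two-parameter Hill curve on a log10 dose axis, a hand-written Gauss-Newton solver with step halving, and two made-up starting values standing in for SVCF entries. The names `FitError`, `gauss_newton` and `fit_by_trial_and_error` are hypothetical.

```python
import math


class FitError(Exception):
    """Raised when a least-squares fit fails from a given starting point."""


def hill(log_x, log_ec50, h):
    """Two-parameter Hill curve on a log10 dose axis; effect lies in (0, 1)."""
    return 1.0 / (1.0 + 10.0 ** (h * (log_ec50 - log_x)))


def sse(log_doses, effects, a, h):
    """Sum of squared residuals for parameters (a, h)."""
    return sum((y - hill(lx, a, h)) ** 2 for lx, y in zip(log_doses, effects))


def gauss_newton(log_doses, effects, start, max_iter=100, tol=1e-8):
    """Gauss-Newton with step halving; raises FitError on a singular system."""
    a, h = start
    for _ in range(max_iter):
        saa = sah = shh = ga = gh = 0.0
        for lx, y in zip(log_doses, effects):
            f = hill(lx, a, h)
            g = -f * (1.0 - f) * math.log(10.0)  # df/dz with z = h * (a - lx)
            ja, jh = g * h, g * (a - lx)          # df/da and df/dh
            r = y - f
            saa += ja * ja
            sah += ja * jh
            shh += jh * jh
            ga += ja * r
            gh += jh * r
        det = saa * shh - sah * sah
        if abs(det) < 1e-12:
            raise FitError("singular normal equations")
        da = (shh * ga - sah * gh) / det
        dh = (saa * gh - sah * ga) / det
        if abs(da) + abs(dh) < tol:
            return a, h
        old, step = sse(log_doses, effects, a, h), 1.0
        while sse(log_doses, effects, a + step * da, h + step * dh) > old:
            step /= 2.0
            if step < 1e-6:
                raise FitError("no descent step found")
        a, h = a + step * da, h + step * dh
    raise FitError("did not converge")


def fit_by_trial_and_error(log_doses, effects, starting_values):
    """Try each stored starting value in turn until one fit succeeds."""
    for attempt, start in enumerate(starting_values, 1):
        try:
            return gauss_newton(log_doses, effects, start), attempt
        except (FitError, OverflowError):
            continue  # read the next starting value and try again
    return None, len(starting_values)


# Synthetic, noise-free curve with log10(EC50) = -5 and Hill slope 1.5.
log_doses = [-7.0 + 0.5 * i for i in range(9)]
effects = [hill(lx, -5.0, 1.5) for lx in log_doses]
# Placeholder starting values (not taken from the SVCF tables).
svcf_hill = [(10.0, 1.0), (-5.2, 1.2)]
params, attempts = fit_by_trial_and_error(log_doses, effects, svcf_hill)
print(params, attempts)
```

In this sketch the first starting value lies far from the data, so the normal equations are numerically singular and the attempt is abandoned, which mirrors the singular-matrix failure described in the background. The second starting value is close enough for the iteration to converge.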
In step (1), the dose range is 1 × 10^-12 mol/L to 1 mol/L; the effect in step (1) may be a processed percentage effect or raw experimental data.
In step (1), the dose-response data include data for different chemical substances in prokaryotic organism, eukaryotic organism, various cell, reporter gene and biological enzyme experimental systems.
In step (1), each dose-response data record comprises the name of the experimental test system, the Chemical Abstracts Service Registry Number (CASRN) of the chemical substance, the name of the chemical substance, the test concentrations and the corresponding effects.
The purpose of building the basic dose-response curve database in step (1) is to cover many different types of dose-response curve, which lays the foundation for constructing the fitting parameter database of nonlinear functions. The minimum number of doses is 5 for monotonic dose-response data and 8 for non-monotonic dose-response data.
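A DRD record with the fields and limits listed above might be modelled as in the following sketch. The class name, field names, the `validate` helper and the example values are all hypothetical; only the five fields, the dose range and the minimum point counts come from the text.

```python
from dataclasses import dataclass
from typing import List

MIN_POINTS = {"monotonic": 5, "non-monotonic": 8}  # minimum doses per curve
DOSE_MIN, DOSE_MAX = 1e-12, 1.0                    # mol/L, range from step (1)


@dataclass
class DoseResponseRecord:
    assay: str                   # experimental test system name
    casrn: str                   # CAS Registry Number
    chemical: str                # chemical name
    concentrations: List[float]  # test concentrations in mol/L
    effects: List[float]         # percentage effect or raw response


def validate(record: DoseResponseRecord, kind: str) -> List[str]:
    """Return a list of problems; an empty list means the record is usable."""
    problems = []
    if len(record.concentrations) != len(record.effects):
        problems.append("dose and effect counts differ")
    if len(record.concentrations) < MIN_POINTS[kind]:
        problems.append(f"fewer than {MIN_POINTS[kind]} doses for {kind} data")
    if any(not (DOSE_MIN <= c <= DOSE_MAX) for c in record.concentrations):
        problems.append("dose outside 1e-12 to 1 mol/L")
    return problems


# Made-up example values, for illustration only.
example = DoseResponseRecord(
    assay="example-assay",
    casrn="000-00-0",
    chemical="example chemical",
    concentrations=[1e-7, 1e-6, 1e-5, 1e-4, 1e-3],
    effects=[2.0, 10.0, 48.0, 90.0, 99.0],
)
print(validate(example, "monotonic"))      # no problems expected
print(validate(example, "non-monotonic"))  # too few doses for this label
```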
In step (2), the 13 groups of monotonic functions are: the Weibull function (four-parameter) and its three-parameter and two-parameter variants, 3 groups denoted Weibull_four, Weibull_three and Weibull; the Logit function (four-parameter) and its three-parameter and two-parameter variants, 3 groups denoted Logit_four, Logit_three and Logit; the Hill function (four-parameter) and its variants, namely one three-parameter Hill function and two two-parameter Hill functions (Hill coefficient equal to 1 or not equal to 1), 4 groups denoted Hill_four, Hill_three, Hill_two and Hill; the three-parameter Box-Cox-Weibull function, denoted BCW; the three-parameter Box-Cox-Logit function, denoted BCL; and the three-parameter Generalised Logit (GL) function, denoted GL.
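The patent text names these functions without printing their equations. As an illustration only, the sketch below writes the two-parameter Logit and Weibull functions in the forms commonly used in the ecotoxicology literature, on a log10 concentration scale. These exact forms are an assumption and may differ from the parameterization stored in the SVCF.

```python
import math


def logit(x, alpha, beta):
    """Two-parameter Logit: E = 1 / (1 + exp(-alpha - beta * log10(x)))."""
    return 1.0 / (1.0 + math.exp(-alpha - beta * math.log10(x)))


def weibull(x, alpha, beta):
    """Two-parameter Weibull: E = 1 - exp(-exp(alpha + beta * log10(x)))."""
    return 1.0 - math.exp(-math.exp(alpha + beta * math.log10(x)))


# Both curves rise monotonically from 0 towards 1 when beta > 0.
doses = [10.0 ** k for k in range(-3, 4)]
logit_curve = [logit(x, 0.0, 1.0) for x in doses]
weibull_curve = [weibull(x, 0.0, 1.0) for x in doses]
print([round(e, 3) for e in logit_curve])
print([round(e, 3) for e in weibull_curve])
```

With alpha = 0 and beta = 1 the Logit curve passes through 0.5 at x = 1, while the Weibull curve is asymmetric around its midpoint, which is one reason to keep both in the candidate set.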
In step (2), the 6 groups of non-monotonic functions are: the three-parameter Brain_Consens function; the four-parameter Brain_Consens function as modified by Van Ewijk, denoted BCV; the four-parameter Cedergreen function; the five-parameter Beckon function; the five-parameter Biphasic function used in pharmacology; and the six-parameter superimposed Hill function, denoted Hill_six.
The fitting parameter database of nonlinear functions built in step (2) covers the optimal parameter values obtained by fitting different types of dose-response curve with different functions. It is an important reference for fitting newly generated dose-response data and is the basis of high-frequency trial-and-error fitting.
In step (3), for dose-response data from repeated experiments, the effect data are first averaged and then arranged in a computer-readable form in which doses and effects correspond one to one.
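The replicate averaging in step (3) can be sketched as below. The function name `average_replicates` and the sample numbers are made up for illustration; the sketch assumes replicates share exactly the same dose value.

```python
from collections import defaultdict
from statistics import mean


def average_replicates(doses, effects):
    """Average the effects measured at the same dose; return paired lists."""
    grouped = defaultdict(list)
    for dose, effect in zip(doses, effects):
        grouped[dose].append(effect)
    unique_doses = sorted(grouped)
    return unique_doses, [mean(grouped[d]) for d in unique_doses]


# Two replicates at each of two doses (illustrative values).
doses = [1e-6, 1e-6, 1e-5, 1e-5]
effects = [0.1, 0.3, 0.5, 0.7]
mean_doses, mean_effects = average_replicates(doses, effects)
print(mean_doses, mean_effects)
```

The output lists have equal length, so the one-to-one correspondence between doses and effects required by step (3) holds by construction.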
In step (4), the significance level of the nonparametric rank test is at most 0.10 and is usually set to 0.01 or 0.05.
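The Mann-Kendall trend test named in step (4) can be sketched with the standard library alone. The sketch uses the usual normal approximation with a tie correction and continuity correction; the function names and the `alpha` default are assumptions, and the second, non-monotonic rank test is not specified in enough detail in the text to be reproduced here.

```python
import math
from collections import Counter


def mann_kendall(effects):
    """Return (S, two-sided p-value) for effects ordered by increasing dose."""
    n = len(effects)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            s += (effects[j] > effects[i]) - (effects[j] < effects[i])
    # Variance of S with the standard correction for tied values.
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in Counter(effects).values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    if var_s <= 0:
        return s, 1.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, math.erfc(abs(z) / math.sqrt(2.0))


def is_monotonic(effects, alpha=0.01):
    """Label a curve monotonic when the trend is significant at level alpha."""
    return mann_kendall(effects)[1] < alpha


rising = [0.01, 0.02, 0.05, 0.1, 0.2, 0.4, 0.6, 0.8, 0.9, 0.95]
hump = [0, 1, 2, 3, 4, 3, 2, 1, 0]
print(mann_kendall(rising), is_monotonic(rising))
print(mann_kendall(hump), is_monotonic(hump))
```

A strictly rising curve gives the maximum statistic S = n(n-1)/2, while a symmetric hump gives S = 0 and is therefore not labelled monotonic.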
In step (7), the criterion for selecting the optimal fitting function is the highest R²adj, or the lowest value of any one of RMSE, AIC, AICc and BIC.
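The selection rule of step (7) can be sketched as below. The patent does not state which formula variants it uses, so the least-squares forms here are assumptions: AIC = n ln(SSE/n) + 2k, AICc adds 2k(k+1)/(n-k-1), BIC = n ln(SSE/n) + k ln n, RMSE = sqrt(SSE/n), and R²adj uses n-k residual degrees of freedom. The function names are hypothetical.

```python
import math


def goodness_of_fit(observed, predicted, n_params):
    """Compute the five criteria; requires n > k + 1 and a non-zero SSE."""
    n, k = len(observed), n_params
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    mean_obs = sum(observed) / n
    sst = sum((o - mean_obs) ** 2 for o in observed)
    aic = n * math.log(sse / n) + 2 * k
    return {
        "R2adj": 1.0 - (sse / (n - k)) / (sst / (n - 1)),
        "RMSE": math.sqrt(sse / n),
        "AIC": aic,
        "AICc": aic + 2.0 * k * (k + 1) / (n - k - 1),
        "BIC": n * math.log(sse / n) + k * math.log(n),
    }


def best_function(fits, criterion="AICc"):
    """Pick the function with the highest R2adj or the lowest other criterion."""
    pick = max if criterion == "R2adj" else min
    return pick(fits, key=lambda name: fits[name][criterion])


observed = [1.0, 2.0, 3.0, 4.0, 5.0]
stats = goodness_of_fit(observed, [1.1, 1.9, 3.1, 3.9, 5.1], n_params=2)
print(stats)

# Illustrative criteria for two candidate functions (made-up numbers).
fits = {
    "Hill_three": {"R2adj": 0.90, "AICc": -10.0},
    "Weibull_three": {"R2adj": 0.80, "AICc": -12.0},
}
print(best_function(fits, "AICc"), best_function(fits, "R2adj"))
```

The made-up example is chosen so that the two criteria disagree, which shows why the criterion has to be fixed before the batch run.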
Building on research into conventional curve fitting and optimal function selection, the method of the invention establishes a batch, automated function fitting technique for high-throughput dose-response data that is simple in principle, efficient, convenient and computationally light. The invention borrows the high-frequency trial-and-error technique used in cryptography to achieve rapid fitting of dose-response curves. The invention constructs a basic dose-response curve database (DRD) and an optimal fitting parameter database of nonlinear functions (SVCF). The DRD covers many types of dose-response curve data, and the SVCF is built on the basis of the DRD. On the one hand, the SVCF allows dose-response curve data to be fitted rapidly in batches with the high-frequency trial-and-error technique, so that massive data can be processed in a short time and mixed monotonic and non-monotonic high-throughput dose-response curves can be classified quickly and fitted accurately. The method is therefore well suited to the large batches of dose-response curve data currently generated by toxicity testing of chemicals with quantitative high-throughput screening. On the other hand, the SVCF can also provide suitable fitting functions and parameters for a single dose-response curve that is difficult to fit. In short, the method helps obtain effective concentration information for chemicals quickly and accurately.
Description of the drawings
Fig. 1. Technical flow chart of the batch fitting method for nonlinear dose-response curves of the invention.
Fig. 2. Scatter plot of monotonic dose-response data identified by the Mann-Kendall trend test, using Palmatine chloride hydrate (CASRN: 171869-95-7) as an example.
Fig. 3. Scatter plot of non-monotonic dose-response data identified by the nonparametric rank test, using PharmaGSID_47315 (CASRN: 444610-91-7) as an example.
Fig. 4. Weibull_three function fit of the monotonic dose-response data for Palmatine chloride hydrate (CASRN: 171869-95-7); the dashed lines show the 95% confidence interval based on the observed values.
Fig. 5. Biphasic function fit of the non-monotonic dose-response data for PharmaGSID_47315 (CASRN: 444610-91-7); the dashed lines show the 95% confidence interval based on the observed values.
Detailed description of the embodiments
The invention is described in further detail below with reference to a specific embodiment and the accompanying drawings.
Nonlinear dose-response curves are fitted in batches according to the steps shown in Fig. 1.
(1) Construction of the basic dose-response database (DRD):
The data sources are published literature [Ge H-L, Liu S-S, Zhu X-W, Liu H-L, Wang L-J. Predicting hormetic effects of ionic liquid mixtures on luciferase activity using the concentration addition model. Environ Sci Technol. 2011; 45:1623–1629] [Zhu X-W, Liu S-S, Qin L-T, Chen F, Liu H-L. Modeling non-monotonic dose–response relationships: Model evaluation and hormetic quantities exploration. Ecotoxicol Environ Saf. 2013; 89:130–136] [Zhu Xiangwei, Liu Shushen, Zhang Qiong, Liu Yan. Short-term and long-term toxicity of pesticides and antibiotics to luminescent bacteria. Research of Environmental Sciences. 2009; 22:589–594.] [Ge Huilin, Liu Shushen, Zhu Xiangwei, Wang Lijuan. Determination of the inhibitory toxicity of 9 pesticides to Scenedesmus obliquus by a microplate absorbance method. Journal of Ecotoxicology. 2008; 3:606–612] and online data [http://www2.epa.gov/chemical-research/toxicity-forecaster-toxcasttm-data]. Dose-response data points covering different dose ranges are used to build the database, which contains both monotonic and non-monotonic dose-response data. The database contains 1500 dose-response curves in total, covering dose-response data for different chemical substances in prokaryotic organism, eukaryotic organism, various cell, reporter gene and biological enzyme experimental systems. Each dose-response curve record consists of the name of the experimental test system, the Chemical Abstracts Service Registry Number (CASRN) of the chemical substance, the name of the chemical substance, the test concentrations and the corresponding effects. The minimum number of doses is 5 for monotonic dose-response data and 8 for non-monotonic dose-response data. Examples of the concentration-effect data in the DRD are shown in Table 1.
(2) Construction of the fitting parameter database (SVCF):
Nonlinear functions are fitted to the data in the basic dose-response curve database (DRD) to obtain the fitted parameter values of the different functions, and these fitted values serve as initial values for fitting new dose-response curves. The database contains the fitted parameter values of 13 groups of monotonic functions and 6 groups of non-monotonic functions for the basic dose-response curve database. The 13 groups of monotonic function data are as follows: 20 parameter sets each for Weibull, Weibull_three, Weibull_four, Logit, Logit_three, Logit_four, Hill, Hill_two, Hill_three, Hill_four, BCW, BCL and GL. The fitted parameter values of the 6 groups of non-monotonic functions are as follows: 20 parameter sets each for Brain_Consens, BCV, Cedergreen, Beckon, Biphasic and Hill_six. Some of the function parameters in the SVCF are shown in Tables 2-19.
(3) Reading of dose-response data:
The test object is the quantitative high-throughput screening data for chemical substances from the National Chemical Genomics Center (NCGC), released by the US EPA in December 2013. The data web page is http://www2.epa.gov/chemical-research/toxicity-forecaster-toxcasttm-data. After entering the page, select Previously Published ToxCast Data in the Downloads list to reach the data download link, and select Dec_2013_Data_Release to download. The compressed zip file is 623.587 MB in size and contains dose-response data for 3000 chemical substances in more than 800 test systems, amounting to millions of dose-response curves. As an example, the invention analyzes the mitochondrial toxicity ratio data for compounds in the NCGC reporter gene test system, in the file ToxCast_c_resp_Tox21_MitochondrialToxicity_ratio_2013_12_10.txt. This file contains 3780 dose-response curves. Each curve consists of 15 doses and their corresponding effects, and the dilution factor between successive doses is 0.446. The 3780 dose-response data sets are fitted one by one with the high-frequency trial-and-error fitting method. The computer reads the data of one dose-response curve at a time; after reading, the number of doses equals the number of effects, and doses and effects correspond one to one.
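The dose layout described above, 15 doses per curve with a dilution factor of 0.446, can be reproduced with a short sketch. The top concentration used here is a hypothetical placeholder, since the text does not give it, and the function names are made up.

```python
def dilution_series(top_dose, factor=0.446, n_doses=15):
    """Doses from highest to lowest, each `factor` times the previous one."""
    return [top_dose * factor ** i for i in range(n_doses)]


def check_curve(doses, effects):
    """Step (3) requires equal counts so doses and effects pair one to one."""
    return len(doses) == len(effects)


doses = dilution_series(top_dose=1e-4)  # 1e-4 mol/L is an assumed value
span = doses[-1] / doses[0]             # equals 0.446 ** 14
print(len(doses), span)
```

Fourteen dilution steps at a factor of 0.446 span a little under five orders of magnitude in concentration, whatever the top dose is.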
(4) Monotonicity determination of dose-response data:
The Mann-Kendall trend test was applied to the 3780 dose-response curves from step (3), and 632 valid monotonic dose-response data sets were identified in total. Fig. 2 shows the monotonic data for Palmatine chloride hydrate (CASRN: 171869-95-7). The nonparametric rank test was also applied, and 185 valid non-monotonic dose-response curve data sets were identified in total. Fig. 3 shows the non-monotonic data for PharmaGSID_47315 (CASRN: 444610-91-7). The remaining 2963 dose-response data sets are random data.
The significance level of the Mann-Kendall trend test and of the nonparametric rank test is 0.01.
(5) High-frequency trial-and-error fitting of monotonic function parameters:
For the monotonic dose-response curve data labelled in step (4), the Hill_three and Weibull_three functions were selected from the 13 groups of monotonic functions in the SVCF and each was fitted to the 632 monotonic dose-response data sets. The procedure is as follows: a. choose the Hill_three function and fit the monotonic dose-response curve data using the Hill_three function parameters in the SVCF database as initial values; b. if the fit succeeds, record the fitted parameters and effective concentrations and compute the goodness-of-fit information, including the adjusted coefficient of determination (R²adj), the root mean square error (RMSE), the Akaike information criterion (AIC), the bias-corrected Akaike information criterion (AICc) and the Bayesian information criterion (BIC); then exit the fitting of this dose-response curve and move on to the next data set; c. repeat the previous step until all data have been fitted, recording the fitting information of every successfully fitted function; d. choose the Weibull_three function and repeat the above process until all data have been fitted.
For the 632 dose-response data sets, using the initial values in the SVCF database, the Hill_three function fitted 382 sets successfully and the Weibull_three function fitted 512 sets successfully, accounting for 60.4% and 81.0% of the total respectively. Combining the fitting information of both functions, 81.2% of the monotonic dose-response data were fitted successfully. Fig. 4 shows the Weibull_three function fit of the monotonic dose-response data for Palmatine chloride hydrate (CASRN: 171869-95-7).
(6) High-frequency trial-and-error fitting of non-monotonic function parameters:
For the non-monotonic dose-response curve data labelled in step (4), the Biphasic function was selected from the 6 groups of non-monotonic functions in the SVCF and fitted to the 185 non-monotonic dose-response data sets. The procedure is as follows: a. choose the Biphasic function and fit the non-monotonic dose-response curve data using the Biphasic function parameters in the SVCF database as initial values; b. if the fit succeeds, record the fitted parameters and effective concentrations and compute the goodness-of-fit information, including the adjusted coefficient of determination (R²adj), the root mean square error (RMSE), the Akaike information criterion (AIC), the bias-corrected Akaike information criterion (AICc) and the Bayesian information criterion (BIC); then exit the fitting of this dose-response curve and move on to the next data set; c. repeat the previous step until all data have been fitted, recording the fitting information of every successfully fitted function.
For the 185 non-monotonic dose-response data sets, using the initial values in the SVCF database, the Biphasic function fitted 128 sets successfully, accounting for 70.3% of the total. Fig. 5 shows the Biphasic function fit of the non-monotonic dose-response data for PharmaGSID_47315 (CASRN: 444610-91-7).
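The counts reported in steps (4) to (6) can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the text; the variable names are made up.

```python
total, monotonic, non_monotonic = 3780, 632, 185
random_like = total - monotonic - non_monotonic  # curves with neither label

success = {
    "Hill_three": 382 / monotonic,
    "Weibull_three": 512 / monotonic,
    "Biphasic": 128 / non_monotonic,
}
percent = {name: round(100 * rate, 1) for name, rate in success.items()}
print(random_like, percent)
```

The classification counts and the two monotonic success rates agree with the text (2963 random curves, 60.4% and 81.0%). For the Biphasic function, 128 of 185 works out to about 69.2%, whereas the text states 70.3%, which would correspond to 130 curves. The source figures do not settle which of the two numbers is intended.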
In the specific implementation, the core algorithm and the various functions were written in the R language. On a Dell Inspiron series computer running Ubuntu 14.04, the running time did not exceed 30 minutes for the monotonic and non-monotonic classification of the 3780 dose-response curves and the fitting of the 632 monotonic and 185 non-monotonic dose-response curves. With the high-frequency trial-and-error fitting technique, each monotonic dose-response curve took 496 trials on average to complete, for a total of 405232 trial fits. Done by hand in the traditional way, and assuming a very skilled operator who completes one manual trial in 30 seconds, the whole fitting would take 2612 hours. Each non-monotonic dose-response curve took 722 trials on average to complete, for a total of 133570 trial fits. Done by hand in the traditional way, and assuming a very skilled operator who completes one manual trial in 30 seconds, the whole fitting would take 1113 hours. Compared with traditional manual fitting, the high-frequency trial-and-error fitting technique improves fitting speed and accuracy by several orders of magnitude.
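The manual-effort estimates above follow from multiplying curves by trials per curve and by 30 seconds per trial. The sketch below reproduces that arithmetic from the stated figures; the function name is made up.

```python
SECONDS_PER_MANUAL_TRIAL = 30


def manual_effort(curves, trials_per_curve):
    """Return (total trials, hours of manual work at 30 s per trial)."""
    trials = curves * trials_per_curve
    return trials, trials * SECONDS_PER_MANUAL_TRIAL / 3600


mono_trials, mono_hours = manual_effort(632, 496)
non_trials, non_hours = manual_effort(185, 722)
print(mono_trials, int(mono_hours))
print(non_trials, int(non_hours))
```

The non-monotonic figures match the text exactly (133570 trials, 1113 hours). For the monotonic curves, 632 × 496 gives 313472 trials and 2612 hours, which matches the stated hours. The stated total of 405232 trials equals 496 × 817, that is, 496 trials over all 632 + 185 labelled curves, so the two monotonic figures in the text appear to rest on different curve counts.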
Table 2. Hill, Hill_two, Hill_three and Hill_four parameters in the fitting parameter database SVCF
Table 3. Logit, Logit_three, Weibull_three and Logit_four parameters in the fitting parameter database SVCF
Table 4. Weibull, BCL, BCW and BCV parameters in the fitting parameter database SVCF
Table 5. Brain_Consens, Weibull_four and Beckon parameters in the fitting parameter database SVCF
Table 6. Cedergreen and Biphasic parameters in the fitting parameter database SVCF
Table 7. Hill_six parameters in the fitting parameter database SVCF

Claims (10)

1. A batch fitting method for nonlinear dose-effect curves, characterized by comprising the following steps:
(1) Construction of a basic dose-effect curve database: based on dose-effect data from published literature and/or public compound toxicity databases, select different dose ranges and dose-effect data points to build the database, which contains both monotonic and non-monotonic dose-effect data;
(2) Construction of a fitting parameter database: fit the data in the basic dose-effect database with nonlinear functions to obtain the optimal fitted parameter values of the different functions, and use these fitted values as initial values for fitting new dose-effect curves; the database comprises the fitted values of 13 monotonic functions and 6 non-monotonic functions for the basic dose-effect curve database;
(3) Reading of dose-effect data: for a data set containing a large batch of dose-effect curve data, fit the curves one by one with the high-frequency trial-and-error fitting method; the computer reads the dose-effect data of one curve at a time, the number of doses read equals the number of effects, and doses and effects correspond one to one;
(4) Monotonicity discrimination of dose-effect data: use the Mann-Kendall trend test to identify monotonic dose-effect data and a nonparametric rank test to identify non-monotonic dose-effect curve data; a curve with monotonic data is given the monotonic label, and a curve with non-monotonic data is given the non-monotonic label;
(5) High-frequency trial-and-error fitting of monotonic functions: for the data given the monotonic label in step (4), select one or more of the 13 monotonic functions and call the initial values of the corresponding function in the fitting parameter database for fitting; the procedure is as follows: a. select any one of the 13 monotonic functions and, using the corresponding function parameter values in the fitting parameter database as initial values, fit the monotonic dose-effect curve data by nonlinear least squares; b. if a fit succeeds, record the corresponding fitting parameters, the effective concentrations and their confidence intervals, and calculate the goodness-of-fit information, including the adjusted coefficient of determination R²adj, root mean square error RMSE, Akaike information criterion AIC, bias-corrected Akaike information criterion AICc and Bayesian information criterion BIC, then exit the fitting process for this function and either select and call another monotonic function for fitting or proceed directly to the next step; c. repeat the previous step until all data with the monotonic label have been fitted, and record the fitting information of all successfully fitted functions;
(6) High-frequency trial-and-error fitting of non-monotonic functions: for the data given the non-monotonic label in step (4), select one or more of the 6 non-monotonic functions and call the initial values of the corresponding function in the fitting parameter database for fitting; the procedure is as follows: a. select any one of the 6 non-monotonic functions and, using the corresponding function parameter values in the fitting parameter database as initial values, fit the non-monotonic dose-effect curve data by nonlinear least squares; b. if a fit succeeds, record the corresponding goodness-of-fit information, including the adjusted coefficient of determination R²adj, root mean square error RMSE, Akaike information criterion AIC, bias-corrected Akaike information criterion AICc and Bayesian information criterion BIC, then exit the fitting process for this function and either select and call another non-monotonic function for fitting or proceed directly to the next step; c. repeat the previous step until all data with the non-monotonic label have been fitted, and record the fitting information of all successfully fitted functions;
(7) Selection of the optimal fitting function: according to the goodness-of-fit information of the monotonic functions fitted in step (5), choose the optimal function for fitting the monotonic dose-effect curve; according to the goodness-of-fit information of the non-monotonic functions fitted in step (6), choose the optimal function for fitting the non-monotonic dose-effect curve;
(8) Batch fitting: repeat steps (3)-(7) until all dose-effect curve data in the data set have been fitted.
2. The batch fitting method for nonlinear dose-effect curves according to claim 1, characterized in that, in step (1), the different dose ranges lie between 1 × 10⁻¹² mol/L and 1 mol/L, and the effect in step (1) may be either the processed percentage effect or the raw experimental data.
3. The batch fitting method for nonlinear dose-effect curves according to claim 1 or 2, characterized in that, in step (1), the dose-effect data comprise dose-effect data of different chemical substances on prokaryotic organisms, eukaryotic organisms, various cells, reporter genes and biological protease experimental systems.
4. The batch fitting method for nonlinear dose-effect curves according to claim 1 or 2, characterized in that, in step (1), the minimum number of doses is 5 for the monotonic dose-effect data and 8 for the non-monotonic dose-effect data.
5. The batch fitting method for nonlinear dose-effect curves according to claim 1 or 2, characterized in that, in step (2), the 13 monotonic functions are: the four-parameter Weibull function and its three-parameter and two-parameter variants, the 3 functions being denoted Weibull_four, Weibull_three and Weibull; the four-parameter Logit function and its three-parameter and two-parameter variants, the 3 functions being denoted Logit_four, Logit_three and Logit; the four-parameter Hill function, one three-parameter variant and two two-parameter variants in which the Hill coefficient is either equal to 1 or not equal to 1, the 4 functions being denoted Hill_four, Hill_three, Hill_two and Hill; the three-parameter Box-Cox-Weibull function, denoted BCW; the three-parameter Box-Cox-Logit function, denoted BCL; and the three-parameter Generalised Logit function, denoted GL.
6. The batch fitting method for nonlinear dose-effect curves according to claim 1 or 2, characterized in that, in step (2), the 6 non-monotonic functions are: the three-parameter Brain_Consens function; the four-parameter Brain_Consens function as modified by Vanewijk, denoted BCV; the four-parameter Cedergreen function; the five-parameter Beckon function; the five-parameter Biphasic function used in pharmacology; and the six-parameter superimposed Hill function, denoted Hill_six.
7. The batch fitting method for nonlinear dose-effect curves according to claim 1 or 2, characterized in that, in step (3), for dose-effect data from replicate experiments, the effect data are first averaged and then arranged in a computer-readable form in which doses and effects correspond one to one.
8. The batch fitting method for nonlinear dose-effect curves according to claim 1 or 2, characterized in that, in step (4), the maximum significance level of the Mann-Kendall trend test and the nonparametric rank test is 0.10.
9. The batch fitting method for nonlinear dose-effect curves according to claim 1 or 2, characterized in that, in step (4), the significance level of the Mann-Kendall trend test and the nonparametric rank test is 0.01 or 0.05.
10. The batch fitting method for nonlinear dose-effect curves according to claim 1 or 2, characterized in that, in step (7), the criterion for selecting the optimal fitting function is: the highest R²adj, or the lowest value of any one of RMSE, AIC, AICc and BIC.
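The selection rule of claim 10 can be illustrated with a short Python sketch. The function name `select_best_function`, the dictionary layout and the numeric values are hypothetical and serve only to show the rule: the highest value wins for R²adj, and the lowest value wins for RMSE, AIC, AICc or BIC.

```python
def select_best_function(fits, criterion="AICc"):
    """Pick the best fitted function name from {name: {criterion: value}}."""
    if criterion == "R2adj":
        # Larger adjusted coefficient of determination is better.
        return max(fits, key=lambda name: fits[name][criterion])
    if criterion in ("RMSE", "AIC", "AICc", "BIC"):
        # For error and information criteria, smaller is better.
        return min(fits, key=lambda name: fits[name][criterion])
    raise ValueError("unknown criterion: " + criterion)

# Invented goodness-of-fit values for three successfully fitted functions.
fits = {
    "Hill_three": {"R2adj": 0.981, "RMSE": 4.2, "AIC": 31.5, "AICc": 37.5, "BIC": 32.1},
    "Weibull_three": {"R2adj": 0.992, "RMSE": 2.9, "AIC": 25.0, "AICc": 31.0, "BIC": 25.6},
    "Logit_four": {"R2adj": 0.993, "RMSE": 2.8, "AIC": 26.4, "AICc": 38.4, "BIC": 27.2},
}
best_by_aicc = select_best_function(fits, "AICc")
best_by_r2adj = select_best_function(fits, "R2adj")
```

In this invented example the two criteria disagree, which is why the criterion is passed explicitly rather than fixed inside the function.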
CN201510945122.2A 2015-12-17 2015-12-17 A kind of batch approximating method of non-linear dose-effect curve Active CN105512415B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510945122.2A CN105512415B (en) 2015-12-17 2015-12-17 A kind of batch approximating method of non-linear dose-effect curve

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510945122.2A CN105512415B (en) 2015-12-17 2015-12-17 A kind of batch approximating method of non-linear dose-effect curve

Publications (2)

Publication Number Publication Date
CN105512415A true CN105512415A (en) 2016-04-20
CN105512415B CN105512415B (en) 2018-07-10

Family

ID=55720395

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510945122.2A Active CN105512415B (en) 2015-12-17 2015-12-17 A kind of batch approximating method of non-linear dose-effect curve

Country Status (1)

Country Link
CN (1) CN105512415B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111882130A (en) * 2020-07-30 2020-11-03 浙江大学 Online dioxin emission prediction method based on generation path clustering and Box-Cox transformation
CN117672411A (en) * 2023-12-05 2024-03-08 首都医科大学附属北京世纪坛医院 Hormesis effect research method of traditional Chinese medicine component and application of berberine as traditional Chinese medicine component

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103014116A (en) * 2012-11-22 2013-04-03 同济大学 Method of establishing Hormesis dose-effect fitting model
CN103267972B (en) * 2013-05-06 2015-09-09 南京航空航天大学 A kind of irradiating biological dose conversion method based on serum levels of iron/serum copper

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111882130A (en) * 2020-07-30 2020-11-03 浙江大学 Online dioxin emission prediction method based on generation path clustering and Box-Cox transformation
CN111882130B (en) * 2020-07-30 2022-01-11 浙江大学 Online dioxin emission prediction method based on generation path clustering and Box-Cox transformation
CN117672411A (en) * 2023-12-05 2024-03-08 首都医科大学附属北京世纪坛医院 Hormesis effect research method of traditional Chinese medicine component and application of berberine as traditional Chinese medicine component
CN117672411B (en) * 2023-12-05 2024-05-14 首都医科大学附属北京世纪坛医院 Hormesis effect research method of traditional Chinese medicine component and application of berberine as traditional Chinese medicine component

Also Published As

Publication number Publication date
CN105512415B (en) 2018-07-10

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant