CN101894215A - Likelihood ratio test error detection method - Google Patents


Publication number
CN101894215A
Authority
CN
China
Prior art keywords
probability
distribution
prediction
value
test
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2010102231750A
Other languages
Chinese (zh)
Inventor
陈彤生
李绍滋
周昌乐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen University
Original Assignee
Xiamen University
Application filed by Xiamen University
Priority to CN2010102231750A
Publication of CN101894215A

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method for detecting likelihood ratio test error and relates to the application of artificial intelligence technology in traditional Chinese medicine. The method is based on the probability integral transform and provides an accurate and efficient analysis tool for prediction. Simulation experiments show that the method can be used with the small samples commonly encountered. If the P-value conclusion obtained by the likelihood ratio test is inconsistent with the profile of a one-way ordinal contingency table, the P-value error is corrected according to the 0.074 calibration parameter of the probability integral transform, so that the P-value conclusion becomes consistent with the profile of the one-way ordinal contingency table. The invention also provides a tail prediction test that is widely applicable to assessing predictions and that evaluates the whole predictive distribution rather than a scalar or an interval. Combining the information content of the predictive distribution with ex post knowledge is sufficient to build a powerful test, which still meets the needs of a prediction procedure when the sample size is as small as 100.

Description

Detection method for likelihood ratio test error
Technical field
The present invention relates to the application of an artificial intelligence technology in traditional Chinese medicine, and in particular to a detection method for likelihood ratio test error.
Background art
As early as the Eastern Han Dynasty, Zhang Zhongjing attached great importance to the role of syndrome-type theory. Syndrome-type theory refers to the standards for syndrome differentiation and the rules for treatment determination formulated by ancient physicians. The principle of "diagnosis and treatment" indicates that syndrome differentiation and treatment determination are used for clinical diagnosis and treatment.
In traditional Chinese medicine, the theory of syndrome differentiation describes the relation between syndrome types and symptoms, the theory of treatment determination describes the relation between syndrome types and prescriptions, and prescription theory links syndrome types with Chinese medicines.
Four entities (syndrome type, symptom, prescription and Chinese medicine) and three relations (syndrome type to symptom, syndrome type to prescription, and syndrome type to Chinese medicine) form the core of traditional Chinese medicine. A syndrome type can have many symptoms, and a symptom can belong to many syndrome types. A syndrome type must include at least one symptom, but a symptom does not necessarily have a syndrome type. The syndrome type to symptom relation expresses the syndrome differentiation of one or more specific symptoms.
At present, the only TCM syndrome-type model is that of Zhang et al. ([1] Zhang, N.L., Yuan, S., Chen, T. and Wang, Y. Latent tree models and diagnosis in traditional Chinese medicine. Artificial Intelligence in Medicine, 2008, 42(3): 229-245), who use latent tree models to analyse a data set and find natural clusters in the data that correspond well to TCM syndrome types. Their work provides statistical evidence for validating TCM syndrome types and suggests building syndrome-type models on the basis of syndrome differentiation. However, the method assumes that a single symptom belongs to a specific syndrome type and uses Bayesian network analysis to find the natural groups in the data set. These assumptions do not agree with practice, and models built in this way seldom perform well.
In real life, including in traditional Chinese medicine, scientific prediction is needed, and prediction accuracy is the primary goal of research. At present, likelihood ratio tests for small-sample studies fall into three classes: interval prediction, density prediction and tail prediction. Traditional prediction methods focus on evaluating interval and density predictions.
Johansen ([2] Johansen, S. A small sample correction for tests of hypotheses on the cointegrating vectors [J]. Journal of Econometrics, 2002, 111(2): 195-221) studied inference on the cointegrating relations in the cointegrated vector autoregressive (VAR) model and concluded that asymptotic inference is not accurate enough in small samples, so a correction factor should be obtained from the sample size and the parameters.
McSorley et al. ([3] McSorley, E.O., Lu, J.C., and Li, C.S. Performance of Parameter-Estimates in Step-Stress Accelerated Life-Tests With Various Sample-Sizes [J]. IEEE Transactions on Reliability, 2002, 51(3): 271-277) used simulation to investigate the large-sample Gaussian-approximation confidence intervals and to estimate the sample size required for maximum likelihood (ML) estimation with finite samples under different fitted models.
Wong et al. ([4] Wong, H., Liu, F., Chen, M. and Cheung, W. Empirical likelihood based diagnostics for heteroscedasticity in partial linear models [J]. Computational Statistics and Data Analysis, 2009, 53: 3466-3477) used a bootstrap simulation of the empirical likelihood to overcome small-sample distortion. Because the likelihood ratio test is an asymptotic test with a limiting chi-square distribution, the distortion of the empirical likelihood ratio test in small samples can be overcome by using the empirical likelihood bootstrap critical value (EL bootstrap critical value).
Most TCM institutions have no historical data for modelling the syndrome types and symptoms of diseases. In addition, TCM syndrome-type models need longer accumulation times than other medical models. These facts show that prediction techniques suited to small samples are needed. If a TCM model is estimated from a small sample, the performance of the model has to be assessed from a broad perspective. Density forecast evaluation, however, is affected by the interior of the density: many small disturbances in the interior may significantly reduce the attention that TCM managers pay to the tail. Tail prediction measures satisfy a set of intuitively reasonable axioms, such as monotonicity and subadditivity, and tail prediction leads to lower loss than interval-based prediction. From a statistical point of view, a tail prediction measure clearly contains more information than an interval. Compared with current tests that require more parameters and basic assumptions, the tail prediction test is relatively powerful and has the best detection performance.
Summary of the invention
The object of the present invention is to provide a detection method for likelihood ratio test error based on the probability integral transform.
The detection method for likelihood ratio test error based on the probability integral transform according to the present invention comprises the following steps:
Step 1) Set up the mathematical model: let the population random variable X take the values 0 and 1; the probability function of X is then
$$f(X = x;\, p) = \left(\tfrac{1}{2}\right)^{x} \left(\tfrac{1}{2}\right)^{1-x}, \quad x = 0, 1 \tag{12}$$
A random sample of size n, $X_1, X_2, \ldots, X_n$ ($X_i = 0, 1$; $i = 1, 2, \ldots, n$), is drawn from this population. In formula (12), x denotes the possible outcome, 0 or 1, and p denotes the probability that the outcome is 0 (the same in every trial).
Step 2) Let the random variable $Y = X_1 + X_2 + \cdots + X_n$. The probability function of Y is
$$h(Y = y;\, n, p) = \binom{n}{y} \left(\tfrac{1}{2}\right)^{y} \left(\tfrac{1}{2}\right)^{n-y}, \quad y = 1, 2, \ldots, n \tag{13}$$
In formula (13), h denotes the probability that, in a binomial experiment of n trials, the outcome 0 occurs y times; y denotes the number of times the outcome is 0, n the number of trials, and p the probability of 0 in each trial.
The exact value is therefore obtained from
$$P(Y \ge S) = \sum_{y=S}^{n} \binom{n}{y} \left(\tfrac{1}{2}\right)^{y} \left(\tfrac{1}{2}\right)^{n-y} \tag{14}$$
In formula (14), P denotes the probability that the outcome 0 occurs at least S times (S < n) in n trials, n denotes the number of trials, and y the number of times the outcome is 0.
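As a rough illustration of formula (14), the exact tail probability can be computed directly from binomial coefficients. The sketch below assumes p = 1/2, as in formulas (12) to (14); the function name `exact_tail` is hypothetical and is not part of the patent.

```python
from math import comb


def exact_tail(n: int, s: int) -> float:
    """P(Y >= s) for Y ~ Binomial(n, 1/2), summed term by term as in formula (14)."""
    if not 0 <= s <= n:
        raise ValueError("s must satisfy 0 <= s <= n")
    # Each term is C(n, y) * (1/2)^y * (1/2)^(n - y) = C(n, y) / 2^n.
    favourable = sum(comb(n, y) for y in range(s, n + 1))
    return favourable / 2 ** n


print(exact_tail(10, 5))
```

Summing the integer coefficients first and dividing once by 2^n avoids accumulating floating-point error, which matters when n is in the hundreds or thousands.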
Step 3) To relate this to the Gaussian likelihood, the approximate value can be obtained from the central limit theorem. Let
$$\bar{X} = \frac{1}{n} \sum_{k=1}^{n} X_k,$$
where n denotes the sample size.
Step 4) By the central limit theorem, the distribution of $\bar{X}$ is approximately
Figure BSA00000183714700032
where N denotes the normal distribution and n the sample size.
Because
Figure BSA00000183714700033
Figure BSA00000183714700034
(p is the probability of 0 and q is the probability of 1), the statistic
$$t = \frac{\bar{X} - \frac{1}{2}}{\sqrt{\frac{1}{4n}}} \tag{15}$$
is approximately distributed as N(0, 1), where t denotes the test statistic, $\bar{X}$ denotes the mean of the random variable X over a sample of size n, and n denotes the sample size.
Because
$$\{Y \ge S\} \cong \left\{\bar{X} \ge \frac{S}{n}\right\} \cong \left\{t \ge \frac{2S - n}{\sqrt{n}}\right\} \tag{16}$$
where, in formula (16), Y is the number of times the outcome is 0, S is the minimum number of times the outcome is 0 (S < n) and n is the sample size, it follows that
$$P\{Y \ge S\} = P\left(t \ge \frac{2S - n}{\sqrt{n}}\right) \approx \frac{1}{\sqrt{2\pi}} \int_{\frac{2S-n}{\sqrt{n}}}^{\infty} e^{-\frac{t^2}{2}}\, dt = 0.5 - \frac{1}{\sqrt{2\pi}} \int_{0}^{\frac{2S-n}{\sqrt{n}}} e^{-\frac{t^2}{2}}\, dt = 0.5 - \Phi\left(\frac{2S - n}{\sqrt{n}}\right) \tag{17}$$
where the standard univariate normal integral is defined as $\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{0}^{x} e^{-\frac{t^2}{2}}\, dt$.
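Formula (17) can be checked numerically. The following sketch evaluates the normal approximation with the text's own convention that Φ(x) is the integral of the standard normal density from 0 to x (not the usual cumulative distribution function); the helper names `phi0` and `normal_tail` are hypothetical.

```python
from math import erf, sqrt


def phi0(x: float) -> float:
    """Integral of the standard normal density from 0 to x (the Phi of formula (17))."""
    return 0.5 * erf(x / sqrt(2.0))


def normal_tail(n: int, s: int) -> float:
    """Approximate P(Y >= s) by 0.5 - Phi((2s - n) / sqrt(n)), as in formula (17)."""
    return 0.5 - phi0((2 * s - n) / sqrt(n))


for s in (500, 536, 537):
    print(s, normal_tail(1000, s))
```

With n = 1000 the approximation drops below 0.01 between S = 536 and S = 537, which is the threshold the embodiment arrives at from the Gaussian table.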
Step 5) When n is large (for example, n = 1000), from
Figure BSA000001837147000311
the Gaussian distribution table (the table of the most important continuous probability distribution in statistics) gives
$$\frac{2S - 1000}{31.6} = 2.33 \tag{20}$$
After S has been solved for, it is compared with n to obtain the calibration parameter.
Step 6) When the P-value conclusion obtained by the likelihood ratio test is inconsistent with the profile of the one-way ordinal contingency table, the P-value error is corrected according to the calibration parameter based on the probability integral transform, so that the P-value conclusion becomes consistent with the profile of the one-way ordinal contingency table.
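Step 5 can be sketched in a few lines. The code below solves for the smallest integer S that pushes the normal-approximation tail below 1%, then compares S with n. The text does not give a general formula for the comparison; the ratio (2S − n)/n used here is an assumption taken from the vaccine example in the embodiment, and both function names are hypothetical.

```python
from math import floor, sqrt
from statistics import NormalDist


def stock_level(n: int, tail: float = 0.01) -> int:
    """Smallest integer S with (2S - n) / sqrt(n) above the upper `tail` normal quantile."""
    z = NormalDist().inv_cdf(1.0 - tail)  # about 2.33 for tail = 0.01
    return floor((n + z * sqrt(n)) / 2.0) + 1


def calibration(n: int, s: int) -> float:
    """Assumed reading: surplus 2S - n of two equal stocks, relative to n."""
    return (2 * s - n) / n


s = stock_level(1000)
print(s, calibration(1000, s))
```

For n = 1000 this gives S = 537 and a ratio of 0.074, matching the figures quoted in the text.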
The outstanding advantages of the present invention are as follows:
The present invention proposes a detection method for likelihood ratio test error for use in predicting one-way ordinal contingency tables with small samples. Its purpose is to provide an accurate and efficient analysis tool for prediction, together with a more general and powerful prediction test. Simulation shows that the method can be used with common small samples. If the P-value conclusion obtained by the likelihood ratio test is inconsistent with the profile of the one-way ordinal contingency table, the P-value error should be corrected according to the 0.074 calibration parameter based on the probability integral transform, so that the P-value conclusion becomes consistent with the profile of the table. Applying the detection method to the analysis of stomach disease gives stomach-ache severity parameter results for the syndrome types that are consistent with the prediction from the contingency table profile.
Description of drawings
Fig. 1 shows the profiles of syndrome type and stomach-ache severity in an embodiment of the invention. In Fig. 1, the abscissa is the syndrome type and the ordinate is the probability; curve 1 is liver-stomach disharmony; curve 2 is weakness of the spleen and stomach; curve 3 is damp-heat in the spleen and stomach; curve 4 is the marginal probability.
Fig. 2 shows the profiles of syndrome type and stomach-ache severity in an embodiment of the invention. In Fig. 2, the abscissa is the syndrome type and the ordinate is the probability; curve 1 is liver-stomach disharmony; curve 2 is weakness of the spleen and stomach; curve 3 is damp-heat in the spleen and stomach; curve 4 is the marginal probability.
Embodiment
1. Independent and identically distributed variables
In the setting of Bernoulli trials, calculation is inconvenient when the number of trials n is large. Poisson's theorem tells us that when p ≤ 0.1 the Poisson distribution can be used for approximate calculation, whereas approximation by the normal distribution is not subject to the restriction p ≤ 0.1, so the central limit theorem makes accurate calculation possible in the Bernoulli setting. In short, the central limit theorem is being applied more and more widely in the Bernoulli setting.
This section covers three parts: the law of large numbers, the central limit theorem and the Rosenblatt transformation.
1.1 Law of large numbers
Khinchin's law of large numbers for independent and identically distributed (iid) variables: let $x_1, x_2, \ldots, x_n$ be a sequence of iid random variables with expectation and variance $E(x_i) = \mu$, $D(x_i) = \sigma^2$ ($i = 1, 2, \ldots, n$). Then for any given ε > 0,
$$\lim_{n \to \infty} P\left\{ \left| \frac{1}{n} \sum_{i=1}^{n} x_i - \mu \right| < \varepsilon \right\} = 1 \tag{1}$$
Khinchin's law of large numbers states that the arithmetic mean of iid random variables converges in probability to their expectation; it provides the theoretical basis for estimating the expectation by the arithmetic mean in practical applications.
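A small simulation makes the statement of formula (1) concrete. This is only a sketch under assumed settings (Bernoulli trials with p = 0.5 and a fixed random seed); the function name `mean_error` is hypothetical.

```python
import random


def mean_error(n: int, p: float = 0.5, seed: int = 0) -> float:
    """|sample mean - p| for n simulated Bernoulli(p) trials."""
    rng = random.Random(seed)
    successes = sum(1 for _ in range(n) if rng.random() < p)
    return abs(successes / n - p)


for n in (100, 10_000, 100_000):
    print(n, mean_error(n))
```

The printed errors depend on the seed, but they shrink roughly like 1/√n, which is the behaviour the law of large numbers describes.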
1.2 Central limit theorem
Central limit theorem for independent and identically distributed (iid) variables: let $x_1, x_2, \ldots, x_n$ be a sequence of iid random variables with expectation and variance $E(x_i) = \mu$, $D(x_i) = \sigma^2 \ne 0$ ($i = 1, 2, \ldots, n$). Then for any real number x,
$$\lim_{n \to \infty} P\left\{ \frac{\sum_{i=1}^{n} x_i - n\mu}{\sqrt{n}\,\sigma} \le x \right\} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{t^2}{2}}\, dt = \Phi(x) \tag{2}$$
The iid central limit theorem expresses the special status of the normal distribution in probability theory: although the distribution of the $x_i$ is arbitrary, as long as n is sufficiently large the random variable
$$\frac{\sum_{i=1}^{n} x_i - n\mu}{\sqrt{n}\,\sigma} \tag{3}$$
approximately follows the standard normal distribution N(0, 1). In other words, when n is large, the sum of the iid random variables $x_i$,
$$\sum_{i=1}^{n} x_i,$$
approximately follows the normal distribution $N(n\mu, n\sigma^2)$. This is the basic reason why random variables that are the overall result of many small, independent random factors approximately follow the normal distribution, and why the normal distribution is so important in both theory and application. If x ~ B(n, p), then when n is large, with q = 1 − p,
$$P(a \le x \le b) = P\left( \frac{a - np}{\sqrt{npq}} \le \frac{x - np}{\sqrt{npq}} \le \frac{b - np}{\sqrt{npq}} \right) \approx \Phi\left( \frac{b - np}{\sqrt{np(1-p)}} \right) - \Phi\left( \frac{a - np}{\sqrt{np(1-p)}} \right) \tag{4}$$
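The quality of the approximation in formula (4) can be inspected by comparing it with the exact binomial sum. The sketch below uses n = 1000, p = 0.5 and the interval 480 to 520 purely as illustrative values; the function names are hypothetical, and no continuity correction is applied because formula (4) does not use one.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_interval_exact(n: int, p: float, a: int, b: int) -> float:
    """P(a <= x <= b) for x ~ B(n, p), summed exactly (fine for moderate n)."""
    q = 1.0 - p
    return sum(comb(n, k) * p ** k * q ** (n - k) for k in range(a, b + 1))


def binom_interval_normal(n: int, p: float, a: int, b: int) -> float:
    """Normal approximation of formula (4), without continuity correction."""
    sd = sqrt(n * p * (1.0 - p))
    cdf = NormalDist().cdf
    return cdf((b - n * p) / sd) - cdf((a - n * p) / sd)


print(binom_interval_exact(1000, 0.5, 480, 520))
print(binom_interval_normal(1000, 0.5, 480, 520))
```

The two values agree to roughly one percentage point here; the gap comes mainly from treating the discrete end points 480 and 520 as continuous limits.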
1.3 Rosenblatt transformation
Consider a stochastic process $y_t$ that is forecast at time t − 1, with forecast probability density $f(y_t)$ and associated distribution function
Figure BSA00000183714700056
Interval prediction is based on the inverse distribution function,
$$\bar{y}_t = F^{-1}(\alpha) \tag{5}$$
For example, at the 99% level over a two-week horizon, it is the quantity
Figure BSA00000183714700058
such that
Figure BSA00000183714700059
Christoffersen (1998) pointed out one way of verifying an interval forecast: the interval should be exceeded, or violated, α% of the time, and the violations should be uncorrelated over time. Combining these properties, the variable is defined as
$I_t = 1$ if a violation occurs,
$I_t = 0$ if no violation occurs,
which should be an iid Bernoulli sequence with parameter α. Because violations are rare (by design), testing whether the violations form a Bernoulli sequence requires at least several hundred observations. The key problem is that a Bernoulli variable takes only two values (0 and 1) and takes the value 1 only rarely. Density evaluation methods use the whole distribution of outcomes and therefore extract a greater quantity of information from the available data.
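The violation indicator can be sketched as follows. The text does not say on which side of the interval a violation is counted; the code assumes, for illustration only, that a violation means the realization falls below a lower forecast bound. The data, the bound of -2.33 and the function names are all hypothetical.

```python
def violation_indicators(y, lower_bounds):
    """I_t = 1 when y_t falls below its forecast bound, else 0 (assumed convention)."""
    return [1 if y_t < bound else 0 for y_t, bound in zip(y, lower_bounds)]


def violation_rate(indicators):
    """Observed share of violations, to be compared with the nominal alpha."""
    return sum(indicators) / len(indicators)


y = [0.3, -2.6, 1.1, -0.4, -3.0, 0.8]   # hypothetical realizations
bounds = [-2.33] * len(y)               # hypothetical constant lower bound
flags = violation_indicators(y, bounds)
print(flags, violation_rate(flags))
```

Because the indicator discards everything except whether the bound was crossed, most observations contribute a zero, which is why the text says several hundred observations are needed.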
Rather than confining attention to rare violations, all realizations can be transformed into a series of independent and identically distributed random variables.
Specifically, Rosenblatt ([5] Rosenblatt, M. Remarks on a Multivariate Transformation [J]. The Annals of Mathematical Statistics, 1952, 23: 470-472) defined the transformation
$$x_t = \int_{-\infty}^{y_t} \hat{f}(u)\, du = \hat{F}(y_t) \tag{6}$$
where $y_t$ is the ex post realization and $\hat{f}(\cdot)$ is the ex ante forecast loss density. Rosenblatt showed that $x_t$ is iid and uniformly distributed on (0, 1). Therefore, if enterprises must regularly report forecast distributions, a regulator can use this probability integral transform to test whether independence and/or uniformity is violated. Moreover, the result holds whatever the underlying distribution of the realizations $y_t$, even if the forecast model
Figure BSA00000183714700064
changes over time.
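The effect of formula (6) can be seen in a short simulation. The sketch assumes Gaussian data and Gaussian forecast distributions, which the text does not require; the parameter values, the seed and the names `pit` and `share_below` are hypothetical.

```python
import random
from statistics import NormalDist


def pit(y, forecast: NormalDist):
    """x_t = F_hat(y_t): probability integral transform under the forecast."""
    return [forecast.cdf(v) for v in y]


def share_below(xs, level: float) -> float:
    """Fraction of transformed values below `level`; about `level` if xs is U(0, 1)."""
    return sum(1 for v in xs if v < level) / len(xs)


rng = random.Random(1)
truth = NormalDist(mu=2.0, sigma=3.0)          # hypothetical data-generating model
y = [rng.gauss(truth.mean, truth.stdev) for _ in range(20_000)]

x_good = pit(y, truth)                         # forecast equals the true model
x_bad = pit(y, NormalDist(mu=2.0, sigma=1.5))  # forecast that is too narrow

print(share_below(x_good, 0.05), share_below(x_bad, 0.05))
```

With the correct forecast about 5% of the transformed values fall below 0.05, as uniformity requires; with the too-narrow forecast the share is several times larger, so the misspecification shows up in the tail.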
2. Likelihood ratio test framework
An extension of the Rosenblatt transformation is introduced below that delivers iid N(0, 1) variables under the null hypothesis. This allows the Gaussian likelihood to be estimated conveniently and flexibly, and likelihood-based test statistics with good finite-sample properties to be constructed.
It is difficult to test uniformity with small data samples; such tests are the nonparametric tests established in statistics, which exploit the fact that the uniform density is a straight line. It is also difficult to devise parametric tests when the null hypothesis is a U(0, 1) random variable. Nesting iid U(0, 1) $x_t$ in a more general model requires letting the support depend on unknown parameters, and the likelihood ratio (LR) and other statistics then do not have their usual asymptotic distributions because the objective function is discontinuous.
Berkowitz advocates a simple transformation to normality. First, the transformation is simple to compute, and after it the Gaussian likelihood can be computed directly and LR tests constructed. For some classes of model failure, the LR test is uniformly most powerful (UMP); that is, the LR procedure has higher power than any other test of the same fixed confidence level for every value of the unknown parameters. Finally, even where it cannot be shown to be uniformly most powerful, the LR test often has desirable statistical properties and good finite-sample behaviour (see [6] Hogg, R.V., and Craig, A.T. Mathematical Statistics [M]. New York: Macmillan, 1965).
Another attractive feature of the likelihood test framework is that the researcher has wide latitude in deciding which restrictions to test and how many. With small samples, we probably want tightly parameterized tests.
Although the mean, variance, skewness and so on of iid U(0, 1) data can be tested, the performance of such procedures usually depends on the sample size available.
Let $\Phi^{-1}(\cdot)$ be the inverse of the standard normal distribution function. The following result then holds for any sequence of forecasts.
Proposition 1: if the series
Figure BSA00000183714700065
is distributed as iid U(0, 1), then
$$z_t = \Phi^{-1}\left[ \int_{-\infty}^{y_t} \hat{f}(u)\, du \right] \tag{7}$$
is iid N(0, 1).
The transformation in Proposition 1 is of the kind used to simulate random variables and is a simple extension of the Rosenblatt transformation. The observed portfolio returns are transformed to create a series
Figure BSA00000183714700072
that should be iid standard normal. What makes this so useful is that, under the null hypothesis, the data follow a normal distribution, which gives us a convenient tool linked to the Gaussian likelihood.
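The transformation of formula (7) is a one-line composition of the forecast distribution function and the inverse standard normal distribution function. The sketch below assumes a Gaussian forecast model chosen only for illustration, in which case the transform reduces to ordinary standardization; `to_normal` is a hypothetical name.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def to_normal(y, forecast: NormalDist):
    """z_t = Phi^{-1}(F_hat(y_t)), formula (7); Phi is the standard normal CDF."""
    return [STANDARD_NORMAL.inv_cdf(forecast.cdf(v)) for v in y]


forecast = NormalDist(mu=2.0, sigma=3.0)   # hypothetical forecast model
y = [-1.0, 2.0, 3.5, 8.0]
print(to_normal(y, forecast))
```

For this Gaussian forecast each z value equals (y − 2)/3 up to rounding. With a non-Gaussian forecast the same two-step composition applies, provided a distribution function for the forecast is available.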
In addition, the transformed data can in some respects be interpreted in the same way as the untransformed raw data. The following proposition formalizes this notion.
Proposition 2: let $h(z_t)$ be the density of $z_t$ and $\phi(z_t)$ the standard normal density. Then
$$\log\left[ f(y_t) / \hat{f}(y_t) \right] = \log\left[ h(z_t) / \phi(z_t) \right] \tag{8}$$
Proof: the data transformed by $\Phi^{-1}$ can be written as the composite function
Figure BSA00000183714700074
where
Figure BSA00000183714700075
is the model forecast and $\Phi^{-1}$ is the inverse normal distribution function. Using the Jacobian of the transformation, the distribution of $z_t$ is given by
Figure BSA00000183714700076
Taking logarithms and rearranging gives the required result.
Proposition 2 establishes that inaccuracies in the density forecast are preserved in the transformed data. For example, if $f(y_t) > \hat{f}(y_t)$ over some range, then $h(z_t) > \phi(z_t)$ in the corresponding region of the standard normal.
Neither the Rosenblatt transformation nor the further transformation to normality imposes any distributional assumption on the underlying data; rather, a correct density forecast implies that the transformed variable is normal.
Suppose a given model generates the sequence
Figure BSA00000183714700078
Because the $z_t$ should be independent across observations and standard normal, a wide variety of tests can be constructed. In particular, the null hypothesis can be tested against a first-order autoregressive alternative whose mean and variance may differ from (0, 1), which can be written
$$z_t - \mu = \rho\,(z_{t-1} - \mu) + \varepsilon_t \tag{9}$$
The null hypothesis described by Proposition 1 is μ = 0, ρ = 0 and var($\varepsilon_t$) = 1. The exact log-likelihood function associated with equation (9) is well known and is reproduced here for convenience:
$$-\frac{1}{2}\log(2\pi) - \frac{1}{2}\log\left[\sigma^2/(1-\rho^2)\right] - \frac{\left(z_1 - \mu/(1-\rho)\right)^2}{2\sigma^2/(1-\rho^2)} - \frac{T-1}{2}\log(2\pi) - \frac{T-1}{2}\log(\sigma^2) - \sum_{t=2}^{T}\left(\frac{(z_t - \mu - \rho z_{t-1})^2}{2\sigma^2}\right) \tag{10}$$
where $\sigma^2$ is the variance of $\varepsilon_t$. For brevity, Berkowitz writes the likelihood as a function of the unknown parameters of the model only, $L(\mu, \sigma^2, \rho)$.
The LR test of independence across observations reduces to
$$LR_{ind} = -2\left( L(\hat{\mu}, \hat{\sigma}^2, 0) - L(\hat{\mu}, \hat{\sigma}^2, \hat{\rho}) \right), \tag{11}$$
where hats denote estimated values. This test statistic measures the degree to which the data support a non-zero persistence parameter. Under the null hypothesis the test statistic is distributed χ²(1), chi-square with one degree of freedom, so inference can be carried out in the usual way.
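Formulas (10) and (11) translate almost directly into code. The sketch below evaluates the log-likelihood exactly as printed in formula (10) and forms the LR statistic of formula (11). Parameter estimates must be supplied by the caller, because the text does not describe the estimation step (numerical maximization is one option, not shown here). The function names and the sample values are hypothetical.

```python
from math import erfc, log, pi, sqrt


def ar1_loglik(z, mu, sigma2, rho):
    """Gaussian AR(1) log-likelihood, term by term as printed in formula (10)."""
    t = len(z)
    var1 = sigma2 / (1.0 - rho ** 2)  # variance term of the first observation
    first = (-0.5 * log(2.0 * pi) - 0.5 * log(var1)
             - (z[0] - mu / (1.0 - rho)) ** 2 / (2.0 * var1))
    resid = sum((z[i] - mu - rho * z[i - 1]) ** 2 for i in range(1, t))
    rest = (-(t - 1) / 2.0 * log(2.0 * pi) - (t - 1) / 2.0 * log(sigma2)
            - resid / (2.0 * sigma2))
    return first + rest


def lr_ind(z, mu_hat, sigma2_hat, rho_hat):
    """LR statistic of formula (11) and its chi-square(1) upper-tail probability."""
    stat = -2.0 * (ar1_loglik(z, mu_hat, sigma2_hat, 0.0)
                   - ar1_loglik(z, mu_hat, sigma2_hat, rho_hat))
    # For one degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value


z = [0.2, -0.5, 1.3, 0.1, -0.7, 0.4]      # hypothetical transformed series
print(lr_ind(z, 0.0, 1.0, 0.3))           # hypothetical parameter estimates
```

When ρ is set to zero the function collapses to the ordinary iid normal log-likelihood, which is a convenient sanity check. The statistic is only guaranteed to be non-negative when the supplied values are maximum likelihood estimates, hence the guard before the square root.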
A shortcoming of the LR test has been emphasized by West ([7] West, K.D. Asymptotic Inference About Predictive Ability [J]. Econometrica, 1996, 64: 1067-1084): forecasts produced by models estimated on small samples may sometimes be affected by parameter uncertainty.
A specific embodiment of the detection method for likelihood ratio test error is given below.
Two (influenza vaccine) inoculation points, A and B, in a certain city compete for business. Suppose that there are n clients in total each day and that these n clients choose the influenza vaccine of either inoculation point independently and at random. Consider the vaccine stocking problem of each inoculation point.
This problem is a
Figure BSA00000183714700082
n-fold Bernoulli trial problem, so its mathematical model is: let the population random variable X take the values 0 and 1, where X = 1 denotes inoculation with the influenza vaccine of point A and X = 0 denotes inoculation with the influenza vaccine of point B. The probability function of X is then
$$f(X = x;\, p) = \left(\tfrac{1}{2}\right)^{x} \left(\tfrac{1}{2}\right)^{1-x}, \quad x = 0, 1 \tag{12}$$
A random sample of size n, $X_1, X_2, \ldots, X_n$ ($X_i = 0, 1$; $i = 1, 2, \ldots, n$), is drawn from this population.
Let the random variable $Y = X_1 + X_2 + \cdots + X_n$ denote the number of the n clients who are inoculated at point A. If inoculation point A prepares S doses of influenza vaccine (S < n), the probability that S or more people are inoculated at point A is P(Y ≥ S).
The probability function of Y is
$$h(Y = y;\, n, p) = \binom{n}{y} \left(\tfrac{1}{2}\right)^{y} \left(\tfrac{1}{2}\right)^{n-y}, \quad y = 1, 2, \ldots, n \tag{13}$$
so the exact value can be obtained from
$$P(Y \ge S) = \sum_{y=S}^{n} \binom{n}{y} \left(\tfrac{1}{2}\right)^{y} \left(\tfrac{1}{2}\right)^{n-y} \tag{14}$$
However, to relate this to the Gaussian likelihood, the approximate value can be obtained from the central limit theorem.
Let
Figure BSA00000183714700086
By the central limit theorem, its distribution is approximately
Figure BSA00000183714700087
Because
Figure BSA00000183714700088
the statistic
$$t = \frac{\bar{X} - \frac{1}{2}}{\sqrt{\frac{1}{4n}}} \tag{15}$$
is approximately distributed as N(0, 1). Because
$$\{Y \ge S\} \cong \left\{\bar{X} \ge \frac{S}{n}\right\} \cong \left\{t \ge \frac{2S - n}{\sqrt{n}}\right\} \tag{16}$$
it follows that
$$P\{Y \ge S\} = P\left(t \ge \frac{2S - n}{\sqrt{n}}\right) \approx \frac{1}{\sqrt{2\pi}} \int_{\frac{2S-n}{\sqrt{n}}}^{\infty} e^{-\frac{t^2}{2}}\, dt = 0.5 - \frac{1}{\sqrt{2\pi}} \int_{0}^{\frac{2S-n}{\sqrt{n}}} e^{-\frac{t^2}{2}}\, dt = 0.5 - \Phi\left(\frac{2S - n}{\sqrt{n}}\right) \tag{17}$$
(where the standard univariate normal integral is defined as $\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{0}^{x} e^{-\frac{t^2}{2}}\, dt$).
Choosing S suitably so that
Figure BSA00000183714700095
means that at most 1 person in 100 cannot be inoculated, or equivalently that at least 99 people in 100 receive the influenza vaccine.
For example, when the daily number of inoculations is n = 1000 (the total for the two inoculation points),
$$0.5 - \Phi\left(\frac{2S - 1000}{31.6}\right) < 0.01 \tag{18}$$
and hence
$$\Phi\left(\frac{2S - 1000}{31.6}\right) > 0.49 \tag{19}$$
From
Figure BSA00000183714700098
the Gaussian distribution table gives
$$\frac{2S - 1000}{31.6} = 2.33 \tag{20}$$
The S satisfying formula (19) therefore satisfies
$$\frac{2S - 1000}{31.6} > 2.33, \quad S > 536.$$
Hence, if the daily number of inoculations is n = 1000 (the total for the two points) and inoculation point A wants at least 99 people in 100 to receive the influenza vaccine, it must stock at least 537 doses (so S = 537). Likewise, if inoculation point B wants at least 99 people in 100 to receive the influenza vaccine, it must also stock at least 537 doses. The two inoculation points together therefore need 537 + 537 = 1074 doses. Because the two points have only 1000 clients between them, business competition causes 74 doses (a fraction of 0.074) to be lost; this is called the 0.074 calibration parameter.
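The arithmetic behind the 0.074 figure can be written out in one line. This restates the numbers of the paragraph above under the reading (an assumption, since the text gives no general formula) that the calibration parameter is the surplus stock of the two points divided by the number of clients n:

```latex
\frac{2S - n}{n} = \frac{2 \times 537 - 1000}{1000} = \frac{74}{1000} = 0.074
```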
Experimental results and analysis are given below.
If the P-value conclusion obtained by the likelihood ratio test is inconsistent with the profile of the one-way ordinal contingency table, the P-value error should be corrected according to the 0.074 calibration parameter described in Section 3, so that the P-value conclusion becomes consistent with the profile of the one-way ordinal contingency table. To examine whether stomach-ache severity is related to syndrome type, 227 sample points were measured, giving the data of Table 1, which is a one-way ordinal contingency table. Table 1 contains the frequencies of J = 4 levels of stomach-ache severity (0-3: normal, mild, moderate, severe) found in I = 3 syndrome types. When all cells of the table are greater than 6, large-sample theory applies to the score test for the contingency table. Dividing the frequencies in each row (syndrome type) by the corresponding row total gives Table 2, a 3 × 4 contingency table of relative probabilities.
Plotting these data, $p_{ij}/p_{i\cdot}$ and $p_{\cdot j}$, gives the profiles of symptom severity. The profiles of the different syndrome types (rows) are shown in the line chart of Fig. 1, which indicates the marginal and conditional probabilities of stomach-ache severity; row profiles that lie far apart show that the row classification is related to the column classification.
Table 1 Stomach-ache severity frequencies
Figure BSA00000183714700101
Table 2 Stomach-ache severity relative probabilities
Figure BSA00000183714700102
At the level "moderate" in Fig. 1, the gap between the dotted line and the solid line (DAMPH) is largest, almost reaching 0.2, from which it is inferred that DEFSS has a smaller probability of stomach-ache severity ≥ 2 than DAMPH. Stomach-ache severity is therefore related to syndrome type. A chi-square test of independence was used to check whether syndrome type is associated with stomach-ache severity: the χ² value is 25.90 with P = 0.0002, indicating that stomach-ache severity is related to syndrome type. Using the proportional odds model, $L_1$ equals -297.9847; if $b_1 = b_2 = 0$, $L_0$ = -300.9095, and $-2(L_0 - L_1)$ gives D2 equal to 5.8496. Consulting the relevant χ² critical value to test $H_0: \beta_1 = \beta_2 = 0$ gives a P value of 0.0536, so the null hypothesis is not rejected and the conclusion is that $\beta_1$ and $\beta_2$ are both zero. This conclusion is inconsistent with the composition-ratio profiles of Fig. 1. After the error is corrected according to the 0.074 calibration parameter described above, the P value is 0.0497, which is consistent with the composition-ratio profiles of Fig. 1. In addition, whether $\beta_1 = 0$, $\beta_2 = 0$ or neither is zero is analysed. Table 3 gives the results of maximum likelihood estimation (MLE).
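The deviance and P value quoted above can be reproduced with the closed-form upper tail of the chi-square distribution with two degrees of freedom, exp(-D/2). The text does not state how the 0.074 parameter is applied to the P value; the last step below uses one reading, scaling P by (1 − 0.074), purely as an assumption that happens to reproduce the quoted 0.0497. Function names are hypothetical.

```python
from math import exp


def deviance(l0: float, l1: float) -> float:
    """D = -2 (L0 - L1) for nested log-likelihoods L0 (restricted) and L1."""
    return -2.0 * (l0 - l1)


def chi2_sf_2df(d: float) -> float:
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return exp(-d / 2.0)


d = deviance(-300.9095, -297.9847)
p = chi2_sf_2df(d)
p_adjusted = p * (1.0 - 0.074)   # assumed reading of the calibration step
print(round(d, 4), round(p, 4), round(p_adjusted, 4))
```

The computed deviance is 5.8496 and the P value is about 0.0537, which agrees with the 0.0536 in the text up to rounding. The adjusted value falls just below the 0.05 threshold, which is what changes the test conclusion.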
From Table 3, $\beta_1 = 0$ and $\beta_2 = 0$ are rejected, with P values of 0.021 and 0.038 respectively, which indicates that syndrome type is related to stomach-ache severity. INCRD and DEFSS are coded by the indicator variable x as (1, 0) and (0, 1) respectively, therefore
$$L_j(1, 0) = \theta_j + \beta_1 \tag{21}$$
Table 3 Parameters of syndrome type and stomach-ache severity
Figure BSA00000183714700111
and
$$L_j(0, 1) = \theta_j + \beta_2 \tag{22}$$
are the log odds of INCRD and DEFSS respectively, and the log odds of DAMPH is
$$L_j(0, 0) = \theta_j \tag{23}$$
Figure BSA00000183714700112
indicates that the log odds of severity ≤ j for INCRD is greater than the log odds for DAMPH. In other words, the stomach-ache symptoms of INCRD patients are usually less severe;
Figure BSA00000183714700113
indicates that the log odds of severity ≤ j for DEFSS is greater than the log odds for DAMPH. In other words, the stomach-ache symptoms of DEFSS patients are usually less severe than those of DAMPH patients. These conclusions are consistent with the composition-ratio profiles of Fig. 1.
But, be revised as 22 and 5 (table 4) respectively as the patient that will go up the severity grade weakness of the spleen and the stomach in the example 2,3, then adopt χ 2Check is analyzed, and whether is associated χ with the stomachache order of severity with true authentication-type 2Value is 27.31, P=0.0001, and expression stomachache severity is relevant with the card type.
Table 4. Stomachache severity frequencies
Table 5. Parameters for syndrome type and stomachache severity
Figure BSA00000183714700121
Using the proportional odds model, $L_1 = -297.0911$; with $b_1 = b_2 = 0$, $L_0 = -300.0885$, so $D2 = -2(L_0 - L_1) = 5.9948$. Comparing this with the $\chi^2$ critical value to test $H_0: \beta_1 = \beta_2 = 0$, we find
Figure BSA00000183714700122
so the P value is 0.0499, the null hypothesis is rejected, and the conclusion is that $\beta_1$ and $\beta_2$ are not both zero. In addition, we test whether $\beta_1 = 0$, $\beta_2 = 0$, or neither is zero. Table 5 gives the results of maximum likelihood estimation (MLE).
From Table 5, $\beta_1 = 0$ and $\beta_2 = 0$ are both rejected, with P values of 0.021 and 0.033 respectively, which indicates that syndrome type is related to stomachache severity. These conclusions are consistent with the constituent-ratio pattern of Fig. 2.
Prediction is extremely important in every science, including traditional Chinese medicine. For the likelihood ratio test, predictive efficiency is inseparable from the adequacy of the model, and failure indicates a deficient prediction model. The invention develops an inference procedure for the prediction error between the contingency-table appearance and the likelihood ratio test, for use with one-way ordinal contingency tables and small samples. Its purpose is to provide an analysis tool with accurate and efficient prediction and more general predictive test power. Simulation shows that the procedure can be used with common small samples.
The LR test framework appears flexible and intuitive, and the test has very good power properties. Its shortcoming is that predictions produced by a model estimated from a small sample may be affected by parameter uncertainty. The invention proposes a new calibration-parameter method to assess such predictions: if the P-value conclusion obtained from the likelihood ratio test is inconsistent with the appearance of the one-way ordinal contingency table, the P-value error should be corrected according to the 0.074 calibration parameter described above, so that the P-value conclusion is consistent with the appearance of the one-way ordinal contingency table.
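The text does not state the formula by which the 0.074 calibration parameter is applied to the P value. The sketch below shows one reading that happens to reproduce the reported numbers: a multiplicative shrinkage p × (1 − 0.074). This rule, the 2-degree-of-freedom P value, and the name `corrected_p` are all assumptions made for illustration, not the patent's stated procedure.

```python
import math

CALIBRATION = 0.074  # calibration parameter quoted in the text

def corrected_p(p_value, calibration=CALIBRATION):
    # ASSUMPTION: multiplicative correction; the source gives no explicit formula
    return p_value * (1.0 - calibration)

# First example: D2 = 5.8496, P value assuming 2 degrees of freedom
p1 = math.exp(-5.8496 / 2.0)
p1_corrected = corrected_p(p1)

# Second example: D2 = 5.9948, already below 0.05, so no correction is needed
p2 = math.exp(-5.9948 / 2.0)

print(f"example 1: P = {p1:.4f}, corrected P = {p1_corrected:.4f}")
print(f"example 2: P = {p2:.4f}")
```

Under this assumed rule the first example moves from just above 0.05 to about 0.0497, matching the corrected value reported in the description, while the second example is already below 0.05 at about 0.0499.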

Claims (1)

1. A likelihood ratio test error detection method, characterized by comprising the following steps:
Step 1) Set up the mathematical model: let the population random variable X take the values 0 and 1; then the probability function of X is:
$$f(X = x; p) = \left(\frac{1}{2}\right)^{x} \left(\frac{1}{2}\right)^{1-x}, \quad x = 0, 1 \tag{12}$$
From this population, draw n random samples $X_1, X_2, \ldots, X_n$ ($X_i = 0, 1$; $i = 1, 2, \ldots, n$). In formula (12), x denotes the possible outcome, 0 or 1, and p denotes the probability that the outcome is 0 (the same in every trial);
Step 2) Let the random variable $Y = X_1 + X_2 + \cdots + X_n$. The distribution function of Y is
$$h(Y = y; n, p) = \binom{n}{y} \left(\frac{1}{2}\right)^{y} \left(\frac{1}{2}\right)^{n-y}, \quad y = 1, 2, \ldots, n \tag{13}$$
In formula (13), h denotes the probability, in a binomial experiment, that y of the n trials of the random variable X give 0; y is the number of 0 outcomes, n is the number of trials, and p is the probability of 0 in each trial;
so the exact value of the tail probability can be obtained from
$$P(Y \ge S) = \sum_{y=S}^{n} \binom{n}{y} \left(\frac{1}{2}\right)^{y} \left(\frac{1}{2}\right)^{n-y} \tag{14}$$
In formula (14), P denotes the probability that at least S (S < n) of the n trials give 0; S is that minimum number of 0 outcomes, n is the number of trials, and y is the number of 0 outcomes;
Step 3) To relate this to the Gaussian likelihood, the central limit theorem can be used to obtain an approximate value, with
$$\bar{X} = \frac{1}{n} \sum_{k=1}^{n} X_k,$$
where n is the sample size;
Step 4) By the central limit theorem, its distribution is approximately
Figure FSA00000183714600015
where N denotes the normal distribution and n the sample size.
Because
Figure FSA00000183714600016
Figure FSA00000183714600017
(p is the probability of 0 and q is the probability of 1), therefore
$$t = \frac{\bar{X} - \frac{1}{2}}{\sqrt{\frac{1}{4n}}} \tag{15}$$
is approximately distributed as $N(0,1)$, where t denotes the t distribution, $\bar{X}$ denotes the mean of the random variable X over a sample of size n, and n denotes the sample size.
Because
$$Y \ge S \;\cong\; \bar{X} \ge \frac{S}{n} \;\cong\; t \ge \frac{2S - n}{\sqrt{n}} \tag{16}$$
In formula (16), Y is the number of 0 outcomes, S is the minimum number of 0 outcomes (S < n), and n is the sample size,
so
$$P\{Y \ge S\} = P\left(t \ge \frac{2S - n}{\sqrt{n}}\right) \sim \frac{1}{\sqrt{2\pi}} \int_{\frac{2S - n}{\sqrt{n}}}^{\infty} e^{-\frac{t^2}{2}}\,dt = 0.5 - \frac{1}{\sqrt{2\pi}} \int_{0}^{\frac{2S - n}{\sqrt{n}}} e^{-\frac{t^2}{2}}\,dt = 0.5 - \Phi\left(\frac{2S - n}{\sqrt{n}}\right) \tag{17}$$
where the standard univariate normal integral is taken as $\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{0}^{x} e^{-\frac{t^2}{2}}\,dt$;
Step 5) When n → ∞ (taking n = 1000), from
Figure FSA00000183714600025
the Gaussian distribution table (the table of the most important continuous probability distribution in statistics) gives
$$\frac{2S - 1000}{31.6} = 2.33 \tag{20}$$
Solving for S and then comparing it with n yields the calibration parameter;
Step 6) When the P-value conclusion obtained from the likelihood ratio test is inconsistent with the appearance of the one-way ordinal contingency table, correct the P-value error according to the calibration parameter based on the probability integral transform, so that the P-value conclusion is consistent with the appearance of the one-way ordinal contingency table.
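Steps 2) to 5) can be sketched numerically as follows. The block evaluates the exact binomial tail of formula (14), the normal approximation of formula (17), and the solution of formula (20). Reading "comparing S with n" as the ratio (2S − n)/n is an assumption; it is adopted here because it rounds to the 0.074 calibration parameter quoted in the description. The function names are hypothetical.

```python
import math

def exact_tail(n, s):
    """Formula (14): P(Y >= s) for Y ~ Binomial(n, 1/2)."""
    return sum(math.comb(n, y) for y in range(s, n + 1)) / 2 ** n

def normal_tail(n, s):
    """Formula (17): 0.5 - Phi((2s - n) / sqrt(n)),
    where Phi(x) is the standard normal integral from 0 to x."""
    z = (2 * s - n) / math.sqrt(n)
    phi_0_to_z = 0.5 * math.erf(z / math.sqrt(2.0))
    return 0.5 - phi_0_to_z

n = 1000
# Formula (20): (2S - 1000) / 31.6 = 2.33, with 31.6 approximating sqrt(1000)
s_star = (n + 2.33 * 31.6) / 2.0
# ASSUMED reading of "compare S with n": the relative excess (2S - n) / n
ratio = (2.0 * s_star - n) / n

print(f"S = {s_star:.3f}, (2S - n)/n = {ratio:.4f}")
print(f"exact tail at S = 537:  {exact_tail(n, 537):.4f}")
print(f"normal tail at S = 537: {normal_tail(n, 537):.4f}")
```

The value 2.33 is close to the upper 1% point of the standard normal distribution, so both tail probabilities at S = 537 come out near 0.01, and the exact and approximate values can be compared directly for other n and S.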

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010102231750A CN101894215A (en) 2010-07-06 2010-07-06 Likelihood ratio test error detection method

Publications (1)

Publication Number Publication Date
CN101894215A true CN101894215A (en) 2010-11-24

Family

ID=43103405

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010102231750A Pending CN101894215A (en) 2010-07-06 2010-07-06 Likelihood ratio test error detection method

Country Status (1)

Country Link
CN (1) CN101894215A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108062767A (en) * 2018-02-11 2018-05-22 河海大学 Statistics based on sequential SAR image is the same as distribution space pixel selecting method
CN108242268A (en) * 2017-07-18 2018-07-03 嘉兴太美医疗科技有限公司 A kind of clinical investigation subject randomized grouping and the wrong identification and correcting method of therapy distribution
CN108875303A (en) * 2017-05-11 2018-11-23 北京蓝标成科技有限公司 A kind of foundation, judgment criteria and the judgment method of the method judging the purebred phase recency of Dendrobidium huoshanness
CN108875309A (en) * 2017-05-11 2018-11-23 北京蓝标成科技有限公司 A kind of foundation, judgment criteria and the judgment method of the method judging the purebred phase recency of Dendrobium loddigesii
CN108875304A (en) * 2017-05-11 2018-11-23 北京蓝标成科技有限公司 A kind of foundation, judgment criteria and the judgment method of the method judging the purebred phase recency of Herba Dendrobii

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Open date: 20101124