CN102103715A - Negative binomial regression-based maritime traffic accident investigation analysis and prediction method - Google Patents


Info

Publication number
CN102103715A
Authority
CN
China
Prior art keywords
model
distribution
variable
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2010105491463A
Other languages
Chinese (zh)
Inventor
张�浩
肖英杰
白响恩
杨小军
李松
郑剑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Maritime University
Original Assignee
Shanghai Maritime University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Maritime University
Priority to CN2010105491463A
Publication of CN102103715A
Legal status: Pending

Landscapes

  • Complex Calculations (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a negative binomial regression-based method for investigating, analyzing and predicting maritime traffic accidents. The method comprises the following steps: first, determining the most basic and most important factors that may influence an accident, determining a number of independent variables, and selecting a suitable model type according to the characteristics of the data; then, on that basis, stating the model assumptions and determining a basic model that describes the problem; and finally, estimating and testing the model parameters against a series of criteria and correcting the model to obtain a more accurate one. The resulting analysis and prediction model has relatively high prediction accuracy and provides a good approach to analyzing and predicting maritime traffic accidents.

Description

Water traffic accident investigation analysis and prediction method based on negative binomial regression
Technical Field
The invention relates to a negative binomial regression-based water accident investigation analysis and prediction method.
Background
The accuracy of accident prediction rests on two basic preconditions. The first is available information: governments and organizations around the world have recognized the importance of establishing disaster databases and are actively working on them. The second is a correct accident prediction method.
Technologies for the analysis, prediction, evaluation and preventive decision control of ship accidents have become the core of modern ship safety management. How much danger marine traffic actually faces, how much loss it may cause to society, what level of risk is acceptable, and how marine traffic safety measures can bring the degree of danger of the system within the safety index: all of these questions rely on accident investigation and analysis research.
Exploring the vulnerability, infectivity and stability of disaster-breeding bodies, disaster-causing bodies and disaster-bearing bodies in the process of catastrophe transmission, and distilling the underlying mechanism, is the most central problem of safety science. The shift in the focus of safety science research from exploring the causes of accidents to exploring how they develop indicates that accident prediction technology is becoming one of the most basic scientific problems in current safety assurance research. Selecting a proper analysis method is the key to investigating and analyzing an accident effectively after it happens and to preventing similar accidents from happening again.
An analysis of the current state of research and the development trends at home and abroad shows that accident prediction currently falls mainly into the following aspects:
1) Qualitative prediction method
Qualitative prediction means that the forecaster relies on personnel and experts who are familiar with the business, have abundant experience and have strong comprehensive analysis skills. Using their personal experience and judgment, they assess the future development of things from the historical data and first-hand materials at hand, and their opinions are then combined in some form to serve as the main basis for predicting the future.
2) Quantitative prediction method
Quantitative prediction refers to a method of calculating and finally obtaining a specific numerical value by using a certain mathematical model and using historical and existing specific data. The accident prediction mathematical model and method has various forms, and the common and feasible mathematical methods include regression prediction, time series prediction, Markov prediction, grey prediction, Bayesian network prediction, artificial neural network prediction, support vector machine prediction, etc.
3) System safety analysis and evaluation methods have been widely promoted, but method research is carried out according to the characteristics of each system and industry, with relevant standards established and relevant regulations and methods issued, so the limitations are considerable. The safety evaluation methods under study and development mainly adopt qualitative prediction: experts make judgments from work experience based on the existing information and data, and then quantify those judgments. The weights of some quantitative indicators need to be refined step by step.
4) At present, accident statistical analysis and prediction mainly adopt intuitive prediction and comparative analysis, which allow only short-term prediction or comparison with the past. Using mathematical theory to establish evaluation index systems and evaluation synthesis techniques, and building models for accident statistical analysis, is still at the research and trial stage; no systematic accident prediction and safety analysis and evaluation method grounded in safety accident statistics and mathematics is available yet.
Accident investigation and analysis techniques and methods have become a focus of research in recent years. Modern accident investigation and analysis methods often develop through the cross-use and interpenetration of multiple methods, so an absolute classification is difficult. Current studies by scholars include the following:
Based on the accident cause-consequence model, the process model, the energy model, the logic tree model and the SHE management model, accident investigation and analysis techniques are summarized into 5 categories. Combining multiple investigation techniques suits the investigation of complex accidents; an investigation method that considers the sequence of events and their influencing factors makes it easier to propose measures that prevent recurrence and reduce risk.
Yanjiaxuan proposes establishing a water traffic accident information system using electronic chart technology. The content characteristics of the electronic chart-based water traffic accident management system are determined from the attributes and spatial characteristics of accident data, and the functional structure and technical development process of the system are designed, making it an advanced and effective technical means for marine traffic management and accident analysis.
Huangzhi proposes using the correlation analysis principle of grey system theory to establish a grey correlation matrix of accident type, accident cause, ship size, geographic position of the accident and time of the accident. Applying statistical data on ship marine casualties in the Taiwan Strait, the characteristics and patterns of accidents there are analyzed, providing a new method for analyzing marine traffic accidents.
Xu nationality proposes using a grey correlation system to analyze marine accidents of all ships, and of 300 or more ships, in the Taiwan Strait and nearby waters respectively, performing grey correlation matrix operations and analysis on six types of marine accident: collision, striking a reef or grounding, contact, fire or explosion, mechanical failure, and listing or capsizing.
The classification and statistical methods for water traffic accidents used by China, the IMO and other countries are analyzed to identify the shortcomings of China's methods: its accident classification and statistics are non-standard and incomplete, and lack comparability and accuracy.
In another study, 10 major fishery-related collision accidents in the Ningbo-Zhoushan sea area are classified, counted and analyzed case by case, a structural model of the frequent causes of fishery-related collisions is constructed, and targeted early-warning and prevention measures are proposed.
The army-league-and-military-oriented technology proposes applying data mining technology to comprehensively organize and analyze inland ship traffic accidents, so as to overcome the adverse effects of multidimensional, sparse and incomplete data in the accident database and to effectively identify and discover new patterns and internal regularities in accident data.
On the basis of 100 collected collision accident investigation reports, Liu Zheng Jiang uses association rules from data mining to mine the relationship between human errors and their influencing factors, and preliminarily determines the correspondence between human errors and their triggering factors in the ship collision avoidance process.
The king wind force proposes applying the correlation analysis method of grey system theory to marine casualties caused by wind and waves, establishes a correlation matrix between such casualties and their causes, performs grey correlation analysis on the accident causes, and quantitatively identifies the main causes of these casualties, namely ship unseaworthiness and human factors.
Weihong mining analyzes the significance of building a marine casualty data warehouse and proposes a snowflake model for it. The results show that deep analysis of historical marine casualty data with data mining technology can uncover a large amount of knowledge and provide a reference for navigation safety.
The above studies explain accident causes in considerable detail, but quantitative analysis and prediction of accident casualties remain unstudied.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a method for investigating, analyzing and predicting water traffic accidents.
In order to achieve the purpose, the invention adopts the following technical scheme:
A negative binomial regression-based method for investigating, analyzing and predicting water traffic accidents comprises the following steps:
a) Variable and data selection: before modeling, a series of descriptive statistics and correlation analyses are carried out to determine the most basic and most important factors that may influence accident occurrence, and finally to determine the independent variables that can enter the model. A suitable model type is then selected according to the characteristics of the dependent and independent variables. When the dependent variable is a count variable, i.e. the number of times an event occurs, a count data model should be considered. Such a model applies when the dependent variable takes discrete integer values that are small and include many zeros, and when most independent variables are nominal variables representing attributes; a model form that reflects these characteristics is needed to improve on ordinary least squares. EViews provides several estimation methods for count data: besides standard Poisson and negative binomial maximum likelihood (ML), quasi-maximum likelihood (QML) is also available. When the explanatory variables are qualitative variables, the test or observation results are called count data; examples from water traffic accident investigations include the number of accidents, the number of ships wrecked, the number of dead and missing, the number of injured, and direct economic loss.
b) Model assumptions: common distribution types for qualitative variables include the binomial, multinomial, Poisson and negative binomial distributions. Because the number of accidents, the number of dead and missing persons and the number of injured persons are arbitrary non-negative integers, they are typical count data; they do not follow a normal distribution but may follow a Poisson or negative binomial distribution, so a count model is more suitable than a linear model for the econometric analysis. Assume that the discrete values of the explained variable follow a Poisson distribution with the probability function:
$$P(Y = y_i) = \frac{\exp(-\lambda)\,\lambda^{y_i}}{y_i!}, \qquad y_i = 0, 1, 2, \ldots$$
where $E(y_i) = \lambda$ and $\mathrm{Var}(y_i) = \lambda$, i.e. the mean and variance of the random variable $y$ are both $\lambda$. If $X = (x_1, x_2, \ldots, x_m)$ denotes the explanatory variables, the Poisson regression model describes the relationship between the mean $\lambda$ of the Poisson-distributed target variable $y$ and the explanatory variables $X$, and can be expressed as:
$$\log \lambda = X\beta$$
where $\beta$ is the parameter vector to be estimated; it can be estimated by iteratively reweighted least squares or by maximum likelihood. Given $x_i$, the conditional density of $y_i$ is:
$$f(y_i \mid x_i, \beta) = \frac{e^{-\lambda(x_i, \beta)}\,\lambda(x_i, \beta)^{y_i}}{y_i!}$$
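As a minimal sketch of how $\beta$ in $\log\lambda = X\beta$ might be estimated by maximum likelihood, the Python below runs Newton-Raphson iterations on the Poisson log-likelihood. The function names (`poisson_pmf`, `solve`, `fit_poisson`) and the toy data are illustrative assumptions and are not taken from the patent, which performs its fits in EViews.

```python
import math

def poisson_pmf(y, lam):
    """P(Y = y) = exp(-lam) * lam**y / y! for a non-negative integer y."""
    return math.exp(-lam) * lam ** y / math.factorial(y)

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_poisson(X, y, iters=100, tol=1e-10):
    """Maximum likelihood estimate of beta in log(lambda_i) = x_i . beta."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        lam = [math.exp(sum(b * v for b, v in zip(beta, row))) for row in X]
        # Score vector: sum_i (y_i - lambda_i) * x_i
        grad = [sum((yi - li) * row[j] for row, yi, li in zip(X, y, lam))
                for j in range(k)]
        # Information matrix: sum_i lambda_i * x_i x_i'
        info = [[sum(li * row[j] * row[l] for row, li in zip(X, lam))
                 for l in range(k)] for j in range(k)]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

# Hypothetical toy data: an intercept plus one 0/1 dummy variable.
X = [[1.0, 0.0]] * 5 + [[1.0, 1.0]] * 5
y = [0, 1, 0, 2, 1, 3, 2, 4, 1, 5]
beta = fit_poisson(X, y)
print("beta:", beta)
print("fitted group means:", math.exp(beta[0]), math.exp(beta[0] + beta[1]))
```

With a single dummy variable the fitted means coincide with the two group averages, which gives a quick sanity check on the iteration.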
If the mean of the random variable $y_i$ equals its variance, the Poisson maximum likelihood estimate is consistent and efficient. Actual accident count data, however, are often over-dispersed: the variance of $y_i$ is larger than its mean. In that case, if the Poisson regression model is still used, the standard errors of the parameters may be underestimated and their significance overestimated, so that redundant explanatory variables are kept in the model, ultimately leading to unreasonable results. To eliminate this adverse effect, a negative binomial regression model is used in place of the Poisson regression model. The negative binomial distribution is constructed by introducing a gamma-distributed error term: the negative binomial regression model introduces an independent random effect $u$ into the conditional mean $\mu$, thereby extending the Poisson regression model, namely $\log \mu_i = \log \lambda_i + \log u_i$. The regression form of the negative binomial regression model is then:
$$\log \mu_i = x_i \beta + e_i$$
In the above formula, $e_i$ is a random error and $\exp(e_i)$ follows a gamma distribution. In the negative binomial regression model, the conditional distribution of $y_i$ given $x_i$ and $u_i$ is still a Poisson distribution:
$$f(y_i \mid x_i, u_i) = \frac{\exp(-\lambda_i u_i)\,(\lambda_i u_i)^{y_i}}{y_i!}$$
In this case, the mean and variance of the random variable $y_i$ are $\lambda$ and $\lambda(1 + \eta^2 \lambda)$ respectively, where $\eta^2 = 1/y_i$ measures the degree to which the conditional variance exceeds the conditional mean, i.e. the degree of dispersion.
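The Poisson-gamma construction can be checked numerically. The sketch below assumes that $u_i$ is gamma-distributed with mean 1 and variance $\eta^2$ (an assumption of this example, chosen so that the marginal variance is $\lambda(1 + \eta^2\lambda)$); the function names, the seed and the values $\lambda = 2$, $\eta^2 = 0.5$ are hypothetical.

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw one Poisson variate by Knuth's multiplication method (fine for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def simulate_negbin(lam, eta2, n, seed=0):
    """Draw y_i ~ Poisson(lam * u_i) with u_i ~ Gamma(mean 1, variance eta2)."""
    rng = random.Random(seed)
    shape, scale = 1.0 / eta2, eta2  # mean = shape * scale = 1, variance = eta2
    return [sample_poisson(lam * rng.gammavariate(shape, scale), rng)
            for _ in range(n)]

lam, eta2 = 2.0, 0.5
draws = simulate_negbin(lam, eta2, n=20000)
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
print("sample mean:", mean, "theory:", lam)
print("sample variance:", var, "theory:", lam * (1 + eta2 * lam))
```

The sample variance should sit well above the sample mean, which is the over-dispersion that a plain Poisson model cannot represent.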
c) Parameter estimation and verification:
1) Parameter estimation by quasi-maximum likelihood (QML):
The quasi-maximum likelihood method can be applied under a range of distributional assumptions, and its estimates are more robust: even if the distribution is misspecified, it still yields consistent estimates of the parameters of a correctly specified conditional mean. This robustness is similar to that of ordinary regression, where the ML estimate is consistent even if the residual distribution is non-normal. In ordinary least squares, consistency requires the conditional mean $m(x, \beta) = x'\beta$; in QML, consistency requires $m(x, \beta) = \exp(x'\beta)$.
Standard errors are ordinarily computed from the inverse of the information matrix, but these are not consistent unless the conditional distribution of $y$ is correctly specified. Even if the specification is wrong, however, the standard errors can still be estimated in a robust way.
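To illustrate the difference between information-matrix and robust standard errors, the sketch below uses the simplest possible case, an intercept-only Poisson QML model, where both have closed forms. The "sandwich" form $A^{-1} B A^{-1}$ used for the robust version is a common choice and is an assumption here, since the text does not name a specific robust estimator; the function name and the data are hypothetical.

```python
import math

def poisson_qml_intercept(y):
    """Intercept-only Poisson QML: log(lambda) = beta.

    Returns beta, the information-matrix standard error, and a
    sandwich (robust) standard error.
    """
    n = len(y)
    mean = sum(y) / n
    beta = math.log(mean)                       # QML estimate
    hessian = n * mean                          # A = sum_i lambda_i
    outer = sum((yi - mean) ** 2 for yi in y)   # B = sum_i (y_i - lambda_i)^2
    se_info = math.sqrt(1.0 / hessian)          # sqrt(A^-1)
    se_robust = math.sqrt(outer) / hessian      # sqrt(A^-1 B A^-1)
    return beta, se_info, se_robust

# Hypothetical over-dispersed counts (many zeros, a few large values).
y = [0, 0, 0, 1, 0, 7, 0, 2, 0, 10]
beta, se_info, se_robust = poisson_qml_intercept(y)
print("beta:", beta)
print("information-matrix se:", se_info)
print("robust se:", se_robust)
```

On over-dispersed data like this the robust standard error is noticeably larger, which matches the warning that Poisson standard errors can be underestimated.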
2) Parameter estimation and testing:
The parameters of the discrete count data model are estimated by maximum likelihood, and the estimated parameters are tested mainly with the Wald test. Parametric tests help make inferences about the mean of the sampled population; the Wald test is similar to the t test in a linear regression model and is therefore often called the generalized t test. The null hypothesis of the Wald test is $H_0: \beta_j = 0$. The t statistic is constructed as:
$$t_{\beta_j} = \frac{\hat{\beta}_j}{se(\hat{\beta}_j)}$$
where $\hat{\beta}_j$ is the estimated value of the parameter under test and $se(\hat{\beta}_j)$ is the standard error of that parameter estimate. In large samples, the t statistic approximately follows a standard normal distribution.
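A small sketch of the test above: given an estimate and its standard error, compute the t statistic and a two-sided p-value from the standard normal approximation. The function name `wald_test` and the numbers 0.84 and 0.30 are hypothetical and serve only as an illustration.

```python
import math

def wald_test(beta_hat, se):
    """Wald (generalized t) test of H0: beta_j = 0 under a normal approximation."""
    t = beta_hat / se
    # Two-sided p-value: 2 * (1 - Phi(|t|)) = erfc(|t| / sqrt(2))
    p_value = math.erfc(abs(t) / math.sqrt(2.0))
    return t, p_value

# Hypothetical coefficient estimate and standard error.
t, p = wald_test(0.84, 0.30)
print("t statistic:", t)
print("p-value:", p)
print("significant at 5%:", p < 0.05)
```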
3) Goodness-of-fit testing, verification and variable-inclusion decisions are made according to the following criteria:
(1) the Pseudo $R^2$ statistic tests the goodness of fit of the model; a larger $R^2$ indicates a better fit;
(2) the log likelihood (LL), i.e. the maximized value of the log-likelihood function, is a statistic obtained from maximum likelihood estimation and describes the accuracy of the model; the larger the log likelihood, the more accurate the model;
(3) the t test of each estimated parameter is significant at the 5% level;
(4) the ratio of the Pearson chi-square value to the degrees of freedom lies between 0.8 and 1.2;
(5) Akaike's Information Criterion (AIC) is used to evaluate the quality of the model; in general, the smaller the AIC value, the better.
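The criteria above can be computed from the observed counts and the fitted means. The sketch below assumes a Poisson likelihood, McFadden's definition of pseudo $R^2$ ($1 - LL/LL_0$, with $LL_0$ from an intercept-only model) and the unscaled AIC $-2LL + 2k$; the text does not say which variants are used, and some packages report a per-observation AIC instead. The function names and the data are hypothetical.

```python
import math

def poisson_loglik(y, mu):
    """Poisson log-likelihood of counts y given fitted means mu."""
    return sum(yi * math.log(mi) - mi - math.lgamma(yi + 1)
               for yi, mi in zip(y, mu))

def fit_criteria(y, mu, k):
    """Goodness-of-fit criteria for a count model with k estimated parameters."""
    n = len(y)
    ll = poisson_loglik(y, mu)
    ll0 = poisson_loglik(y, [sum(y) / n] * n)  # intercept-only model
    pearson = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return {
        "loglik": ll,
        "pseudo_r2": 1.0 - ll / ll0,           # McFadden (assumed variant)
        "aic": -2.0 * ll + 2.0 * k,
        "pearson_over_df": pearson / (n - k),
    }

# Hypothetical data: two groups of five observations with fitted
# group means 0.8 and 3.0 (intercept plus one dummy, so k = 2).
y = [0, 1, 0, 2, 1, 3, 2, 4, 1, 5]
mu = [0.8] * 5 + [3.0] * 5
crit = fit_criteria(y, mu, k=2)
for name, value in crit.items():
    print(name, round(value, 4))
```

For this toy data the Pearson ratio falls inside the 0.8 to 1.2 band, so criterion (4) would be satisfied.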
Because rules for goodness-of-fit testing, verification and variable inclusion are applied during modeling, the prediction model obtained with the above technical scheme fits better and predicts more accurately, providing a good method for analyzing and predicting water traffic accidents.
Detailed Description
In order to make the technical means, creative features, objectives and effects of the invention easy to understand, the invention is further described below with reference to a specific embodiment.
Following the principles of generality, simplicity and practicability, and on the basis of literature review, data sorting and analysis, and related research, the invention determines the most basic and most important factors that may influence accidents. A water traffic accident results from the combined action of many factors; the influencing factors are interrelated, and strongly correlated independent variables cannot enter the model together. Therefore, a series of descriptive statistics and correlation analyses were performed before modeling, and 11 mutually independent variables were finally determined, as shown in Table 1. The number of casualties is selected as the output variable, and parameters affecting accident occurrence, such as ship registration place, accident type, ship type and accident location, are selected as explanatory variables. The 5 explanatory variables have 2, 3 and 2 risk levels respectively, 36 risk levels in total, and the data are fitted by risk level using EViews software.
TABLE 1
Figure BDA0000032930140000061
First, regression was carried out in negative binomial form with all independent variables entered into the model. The regression results showed that some variables were not statistically significant, so the hypothesis that their coefficients are 0 could not be rejected, and that the regression coefficients of some variables were contrary to expectation. Multicollinearity was also found, caused by the large number of qualitative indicators. Stepwise regression was used to eliminate it: univariate regressions of the explained variable on each explanatory variable were fitted separately, and the regression equations were sorted by their goodness of fit $R^2$. The explanatory variables were then added to the model in descending order of $R^2$; after each addition, a t test was carried out on the parameter estimate, and the variable was retained if the t test was significant and removed otherwise. This process was repeated until all significant variables had been added. In the end, ship registration place a1 was retained and a2 and a3 were removed; 2 accident type variables (b1 and b3), 2 ship type variables (c2 and c3) and 2 accident water area variables (d1 and d2) were retained; and only e1 was kept for the time of accident occurrence. The model was then re-established and the data fitted with EViews software; the fitting results are shown in Table 2.
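The stepwise procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the patent's EViews workflow: it fits a Poisson model rather than a negative binomial one to stay short, ranks candidates by the log-likelihood of their single-variable fits (which gives the same ordering as a likelihood-based pseudo $R^2$), and keeps a variable when its Wald statistic exceeds 1.96. All names and the toy data are hypothetical.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def means_and_info(X, beta):
    """Fitted means lambda_i and the information matrix sum_i lambda_i x_i x_i'."""
    k = len(beta)
    lam = [math.exp(sum(b * v for b, v in zip(beta, row))) for row in X]
    info = [[sum(li * row[j] * row[l] for row, li in zip(X, lam))
             for l in range(k)] for j in range(k)]
    return lam, info

def fit_poisson(X, y, iters=100, tol=1e-10):
    """Newton-Raphson ML fit; returns (beta, standard errors, log-likelihood)."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        lam, info = means_and_info(X, beta)
        grad = [sum((yi - li) * row[j] for row, yi, li in zip(X, y, lam))
                for j in range(k)]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    lam, info = means_and_info(X, beta)
    se = [math.sqrt(solve(info, [1.0 if i == j else 0.0 for i in range(k)])[j])
          for j in range(k)]
    ll = sum(yi * math.log(li) - li - math.lgamma(yi + 1) for yi, li in zip(y, lam))
    return beta, se, ll

def design(columns, names, n):
    """Design matrix with an intercept followed by the named columns."""
    return [[1.0] + [columns[c][i] for c in names] for i in range(n)]

def forward_stepwise(columns, y, z_crit=1.96):
    """Add variables in order of single-variable fit; keep those with |t| > z_crit."""
    n = len(y)
    single_ll = {c: fit_poisson(design(columns, [c], n), y)[2] for c in columns}
    ranked = sorted(columns, key=lambda c: single_ll[c], reverse=True)
    selected = []
    for name in ranked:
        trial = selected + [name]
        beta, se, _ = fit_poisson(design(columns, trial, n), y)
        if abs(beta[-1] / se[-1]) > z_crit:  # Wald test on the newest variable
            selected = trial
    return selected

# Hypothetical data: x1 shifts the mean strongly, x2 has no effect.
columns = {
    "x1": [0] * 8 + [1] * 8,
    "x2": [0, 0, 0, 0, 1, 1, 1, 1] * 2,
}
y = [0, 1, 2, 1, 1, 1, 0, 2, 5, 7, 6, 6, 4, 8, 6, 6]
selected = forward_stepwise(columns, y)
print("selected variables:", selected)
```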
TABLE 2
Figure BDA0000032930140000071
Alpha is the regression parameter of the negative binomial distribution and represents the degree of over-dispersion of the data: the larger alpha is, the more dispersed the data (the variance exceeds the mean), and when alpha is 0 the data follow a Poisson distribution. The models are compared using the AIC statistic and the log likelihood as criteria. Comparing the regression indicators of the 2 distribution models shown in Table 2, the prediction model in negative binomial form is better; comparing the fit of the two models likewise shows that the negative binomial regression model fits better than the Poisson regression model.
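The comparison of the two distributional forms can be sketched for the simplest case, a model with only an intercept. The code assumes the common NB2 parameterization, in which the variance is $\mu(1 + \alpha\mu)$, and estimates $\alpha$ by a plain grid search; the data, the grid and the function names are hypothetical and do not reproduce Table 2.

```python
import math

def poisson_loglik(y, mu):
    """Poisson log-likelihood with a common mean mu."""
    return sum(yi * math.log(mu) - mu - math.lgamma(yi + 1) for yi in y)

def negbin_loglik(y, mu, alpha):
    """NB2 log-likelihood with mean mu and variance mu * (1 + alpha * mu)."""
    r = 1.0 / alpha
    return sum(
        math.lgamma(yi + r) - math.lgamma(r) - math.lgamma(yi + 1)
        + yi * math.log(alpha * mu / (1.0 + alpha * mu))
        - r * math.log(1.0 + alpha * mu)
        for yi in y
    )

# Hypothetical over-dispersed counts.
y = [0, 0, 0, 1, 0, 7, 0, 2, 0, 10]
mu = sum(y) / len(y)  # ML estimate of the mean in both intercept-only models

# Grid search for alpha over 0.01, 0.02, ..., 10.00.
grid = [i / 100.0 for i in range(1, 1001)]
alpha_hat = max(grid, key=lambda a: negbin_loglik(y, mu, a))

aic_poisson = -2.0 * poisson_loglik(y, mu) + 2 * 1            # one parameter
aic_negbin = -2.0 * negbin_loglik(y, mu, alpha_hat) + 2 * 2   # mean and alpha
print("alpha_hat:", alpha_hat)
print("AIC Poisson:", aic_poisson)
print("AIC negative binomial:", aic_negbin)
```

A clearly positive `alpha_hat` together with a lower AIC for the negative binomial form is the pattern the text reports for the accident data.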
The foregoing shows and describes the general principles, essential features, and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are given by way of illustration of the principles of the present invention, and that various changes and modifications may be made without departing from the spirit and scope of the invention as defined by the appended claims. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (1)

1. A negative binomial regression-based method for investigating, analyzing and predicting water traffic accidents, carried out according to the following steps:
a) variable and data selection: before modeling, a series of descriptive statistics and correlation analyses are carried out to determine the most basic and most important factors that may influence accident occurrence, and finally to determine the independent variables that can enter the model; a suitable model type is selected according to the characteristics of the dependent and independent variables, and a count data model is considered when the dependent variable is a count variable, i.e. the number of times an event occurs;
b) model assumptions: common distribution types for qualitative variables include the binomial, multinomial, Poisson and negative binomial distributions; because the number of accidents, the number of dead and missing persons and the number of injured persons are arbitrary non-negative integers, they are typical count data, which do not follow a normal distribution but may follow a Poisson or negative binomial distribution, so a count model is more suitable than a linear model for the econometric analysis; the discrete values of the explained variable are assumed to follow a Poisson distribution with the probability function:
$$P(Y = y_i) = \frac{\exp(-\lambda)\,\lambda^{y_i}}{y_i!}, \qquad y_i = 0, 1, 2, \ldots$$
where $E(y_i) = \lambda$ and $\mathrm{Var}(y_i) = \lambda$, i.e. the mean and variance of the random variable $y$ are both $\lambda$; if $X = (x_1, x_2, \ldots, x_m)$ denotes the explanatory variables, the Poisson regression model describes the relationship between the mean $\lambda$ of the Poisson-distributed target variable $y$ and the explanatory variables $X$, and can be expressed as:
$$\log \lambda = X\beta$$
where $\beta$ is the parameter vector to be estimated, which can be estimated by iteratively reweighted least squares or by maximum likelihood; given $x_i$, the conditional density of $y_i$ is:
$$f(y_i \mid x_i, \beta) = \frac{e^{-\lambda(x_i, \beta)}\,\lambda(x_i, \beta)^{y_i}}{y_i!}$$
if the mean of the random variable $y_i$ equals its variance, the Poisson maximum likelihood estimate is consistent and efficient, whereas actual accident count data are often over-dispersed; under such circumstances, if the Poisson regression model is still used, the standard errors of the parameters may be underestimated and their significance overestimated, so that redundant explanatory variables remain in the model, ultimately leading to unreasonable results; to eliminate this adverse effect, a negative binomial regression model is used in place of the Poisson regression model, the negative binomial distribution being constructed by introducing a gamma-distributed error term; the negative binomial regression model introduces an independent random effect $u$ into the conditional mean $\mu$, thereby extending the Poisson regression model, namely $\log \mu_i = \log \lambda_i + \log u_i$, and the regression form of the negative binomial regression model is then:
$$\log \mu_i = x_i \beta + e_i$$
in the above formula, $e_i$ is a random error and $\exp(e_i)$ follows a gamma distribution; in the negative binomial regression model, the conditional distribution of $y_i$ given $x_i$ and $u_i$ is still a Poisson distribution:
$$f(y_i \mid x_i, u_i) = \frac{\exp(-\lambda_i u_i)\,(\lambda_i u_i)^{y_i}}{y_i!}$$
in this case, the mean and variance of the random variable $y_i$ are $\lambda$ and $\lambda(1 + \eta^2 \lambda)$ respectively, where $\eta^2 = 1/y_i$ measures the degree to which the conditional variance exceeds the conditional mean, i.e. the degree of dispersion;
c) parameter estimation and verification:
1) parameter estimation using quasi-maximum likelihood function (QML): the quasi-maximum likelihood function method can be realized under a series of distribution assumptions, the estimation is more robust, even if the distribution assignment is wrong, the consistent estimation of the correctly defined condition mean value parameter can be generated, and the robustness of the result is similar to the common regression: the ML estimation is consistent even if the residual distribution is not normal, and in the ordinary least square method, the consistency requirement is a conditional mean value m (x, β) ═ x 'β, whereas in QML, the consistency requirement is m (x, β) ═ exp (x' β);
2) and (3) parameter estimation and verification: the parameter estimation of the discrete data counting model is realized by maximum likelihood estimation, the test for estimating parameters is mainly completed by Wald test, which helps to make some inferences on the mean value of the sampling population, and is similar to t test in the linear regression model, so it is often called generalized t test, and the hypothesis of Wald test is: h0:βjWhen 0, the t statistic is established as:
<math><mrow><msub><mi>t</mi><msub><mi>&beta;</mi><mi>j</mi></msub></msub><mo>=</mo><mfrac><msub><mover><mi>&beta;</mi><mo>^</mo></mover><mi>j</mi></msub><mrow><mi>se</mi><mrow><mo>(</mo><msub><mover><mi>&beta;</mi><mo>^</mo></mover><mi>j</mi></msub><mo>)</mo></mrow></mrow></mfrac></mrow></math>
where β̂ⱼ is the estimated value of the parameter under test and se(β̂ⱼ) is the standard error of that estimate. In large samples the t statistic approximately follows the standard normal distribution. The t test is applied to each parameter estimate from the fitted model: if the test is significant the variable is retained, otherwise it is eliminated, and the process is repeated until all significant variables have been included;
3) the goodness of fit of the model is checked, and the decision whether to introduce a variable is made, according to the following criteria:
(1) the pseudo-R² statistic tests the goodness of fit of the model; a larger R² value indicates a better fit;
(2) the log-likelihood (LL) value, i.e. the maximized value of the log-likelihood function, is a statistic obtained from maximum likelihood estimation; it indicates the accuracy of the model, and a larger log-likelihood value indicates a more accurate model;
(3) the t test of each estimated parameter is significant at the 5% level;
(4) the ratio of the Pearson chi-square value to the degrees of freedom lies between 0.8 and 1.2;
(5) Akaike's Information Criterion (AIC) is used to evaluate model quality; in general, the smaller the AIC value, the better.
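The Wald test and the acceptance criteria listed above can be illustrated with a short Python sketch. It is a minimal sketch under stated assumptions, not the patented implementation: all function names are hypothetical, the pseudo-R² is assumed to be McFadden's variant 1 − LL/LL₀ (the text does not say which variant is meant), the AIC is taken as 2k − 2·LL, the Pearson chi-square is assumed to use the negative binomial conditional variance λ(1 + η²λ), and the p-value of the t statistic relies on its large-sample standard normal approximation. The numbers in the usage part are made up for illustration only.

```python
import math


def nb_variance(mu, eta2):
    """Negative binomial conditional variance: mu * (1 + eta2 * mu)."""
    return mu * (1.0 + eta2 * mu)


def wald_t(beta_hat, se):
    """Wald t statistic and two-sided p-value (standard normal approximation)."""
    t = beta_hat / se
    # erfc(|t| / sqrt(2)) equals 2 * (1 - Phi(|t|)) for the standard normal CDF Phi
    p = math.erfc(abs(t) / math.sqrt(2.0))
    return t, p


def pseudo_r2(ll_model, ll_null):
    """McFadden pseudo-R^2 (assumed variant): 1 - LL_model / LL_null."""
    return 1.0 - ll_model / ll_null


def aic(ll_model, n_params):
    """Akaike's Information Criterion: 2k - 2 * LL (smaller is better)."""
    return 2.0 * n_params - 2.0 * ll_model


def pearson_ratio(y, mu, eta2, n_params):
    """Pearson chi-square divided by the degrees of freedom (n - k)."""
    chi2 = sum((yi - mi) ** 2 / nb_variance(mi, eta2) for yi, mi in zip(y, mu))
    return chi2 / (len(y) - n_params)


def passes_criteria(p_values, ratio, alpha=0.05, low=0.8, high=1.2):
    """Check criteria (3) and (4): all t tests significant, ratio within range."""
    return all(p < alpha for p in p_values) and low <= ratio <= high


# Illustrative (made-up) values: observed counts, fitted means, dispersion
y = [0, 2, 1, 4, 3]
mu = [1.0, 1.0, 2.0, 3.0, 3.0]
eta2 = 0.5
t_stat, p_val = wald_t(0.42, 0.15)  # hypothetical estimate and standard error
ratio = pearson_ratio(y, mu, eta2, n_params=2)
print("t =", t_stat, "p =", p_val)
print("pseudo R2 =", pseudo_r2(-80.0, -100.0), "AIC =", aic(-80.0, 3))
print("Pearson chi2 / df =", ratio, "accepted:", passes_criteria([p_val], ratio))
```

In practice the log-likelihoods, parameter estimates and standard errors would come from the fitted model rather than being entered by hand; keeping each criterion in its own function makes it easy to re-evaluate them each time a variable is added or removed.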
CN2010105491463A 2010-11-18 2010-11-18 Negative binomial regression-based maritime traffic accident investigation analysis and prediction method Pending CN102103715A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010105491463A CN102103715A (en) 2010-11-18 2010-11-18 Negative binomial regression-based maritime traffic accident investigation analysis and prediction method

Publications (1)

Publication Number Publication Date
CN102103715A true CN102103715A (en) 2011-06-22

Family

ID=44156464

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010105491463A Pending CN102103715A (en) 2010-11-18 2010-11-18 Negative binomial regression-based maritime traffic accident investigation analysis and prediction method

Country Status (1)

Country Link
CN (1) CN102103715A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103646533A (en) * 2013-11-22 2014-03-19 江苏大学 A traffic accident modeling and control method based on sparse multi-output regression
CN106897948A (en) * 2017-01-04 2017-06-27 天津职业技术师范大学 Riding and pushing traffic accident identification method
CN108009692A (en) * 2017-12-26 2018-05-08 东软集团股份有限公司 Maintenance of equipment information processing method, device, computer equipment and storage medium
CN108664451A (en) * 2018-04-10 2018-10-16 中国林业科学研究院资源信息研究所 Nonlinear mixed-effect model unified standard form and application
CN108922168A (en) * 2018-05-29 2018-11-30 同济大学 Method for identifying accident-prone roads at the mesoscopic level
CN109447306A (en) * 2018-08-13 2019-03-08 上海海事大学 Subway accident delay time prediction method based on maximum likelihood regression tree
CN111144677A (en) * 2018-11-06 2020-05-12 北京京东振世信息技术有限公司 Efficiency evaluation method and efficiency evaluation system
CN111680022A (en) * 2020-05-15 2020-09-18 河海大学 Beach tourist safety accident database establishing and predicting method
CN113919144A (en) * 2021-09-27 2022-01-11 同济大学 Mountain area highway accident cause analysis model modeling method and storage medium
CN114493150A (en) * 2021-12-30 2022-05-13 广西交通设计集团有限公司 Road outburst accident influence factor identification method based on negative binomial regression

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5798949A (en) * 1995-01-13 1998-08-25 Kaub; Alan Richard Traffic safety prediction model
WO2002084446A2 (en) * 2001-04-16 2002-10-24 Jacobs John M Safety management system and method
CN101634851A (en) * 2009-08-25 2010-01-27 西安交通大学 Method based on cause-and-effect relation of variables for diagnosing failures in process industry
CN101826258A (en) * 2010-04-09 2010-09-08 北京工业大学 Method for predicting simple accidents on freeways

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103646533B (en) * 2013-11-22 2016-05-25 江苏大学 Traffic accident modeling and control method based on sparse multi-output regression
CN103646533A (en) * 2013-11-22 2014-03-19 江苏大学 A traffic accident modeling and control method based on sparse multi-output regression
CN106897948B (en) * 2017-01-04 2021-01-01 北京工业大学 Riding and pushing traffic accident identification method
CN106897948A (en) * 2017-01-04 2017-06-27 天津职业技术师范大学 Riding and pushing traffic accident identification method
CN108009692A (en) * 2017-12-26 2018-05-08 东软集团股份有限公司 Maintenance of equipment information processing method, device, computer equipment and storage medium
CN108664451A (en) * 2018-04-10 2018-10-16 中国林业科学研究院资源信息研究所 Nonlinear mixed-effect model unified standard form and application
CN108922168A (en) * 2018-05-29 2018-11-30 同济大学 Method for identifying accident-prone roads at the mesoscopic level
CN108922168B (en) * 2018-05-29 2019-10-18 同济大学 Method for identifying accident-prone roads at the mesoscopic level
CN109447306A (en) * 2018-08-13 2019-03-08 上海海事大学 Subway accident delay time prediction method based on maximum likelihood regression tree
CN109447306B (en) * 2018-08-13 2021-07-02 上海海事大学 Subway accident delay time prediction method based on maximum likelihood regression tree
CN111144677A (en) * 2018-11-06 2020-05-12 北京京东振世信息技术有限公司 Efficiency evaluation method and efficiency evaluation system
CN111144677B (en) * 2018-11-06 2023-11-07 北京京东振世信息技术有限公司 Efficiency evaluation method and efficiency evaluation system
CN111680022A (en) * 2020-05-15 2020-09-18 河海大学 Beach tourist safety accident database establishing and predicting method
CN113919144A (en) * 2021-09-27 2022-01-11 同济大学 Mountain area highway accident cause analysis model modeling method and storage medium
CN114493150A (en) * 2021-12-30 2022-05-13 广西交通设计集团有限公司 Road outburst accident influence factor identification method based on negative binomial regression

Similar Documents

Publication Publication Date Title
CN102103715A (en) Negative binomial regression-based maritime traffic accident investigation analysis and prediction method
CN106951984B (en) Dynamic analysis and prediction method and device for system health degree
WO2022147853A1 (en) Complex equipment power pack fault prediction method based on hybrid prediction model
Akyuz et al. The role of human factor in maritime environment risk assessment: A practical application on Ballast Water Treatment (BWT) system in ship
CN104636449A (en) Distributed type big data system risk recognition method based on LSA-GCC
CN114997607A (en) Anomaly assessment early warning method and system based on engineering detection data
CN116842527A (en) Data security risk assessment method
CN116934262B (en) Construction safety supervision system and method based on artificial intelligence
CN114492980B (en) Intelligent prediction method for corrosion risk of urban gas buried pipeline
CN110929224A (en) Safety index system establishing method based on bus driving safety
Fakher et al. New insights into development of an environmental–economic model based on a composite environmental quality index: a comparative analysis of economic growth and environmental quality trend
CN116245367A (en) Dangerous truck transportation risk assessment method and system based on hierarchical fuzzy neural network
CN112699467B (en) Method for monitoring and checking allocation of ships and inspectors in port countries
CN115345414A (en) Method and system for evaluating information security of oil and gas pipeline industrial control network
CN114764682A (en) Rice safety risk assessment method based on multi-machine learning algorithm fusion
Qiao et al. Dynamic assessment method for human factor risk of manned deep submergence operation system based on SPAR-H and SD
CN112907067A (en) Concrete dam safety monitoring method and system
CN116401601B (en) Power failure sensitive user handling method based on logistic regression model
CN117493759A (en) Gas methane distinguishing method and device based on principal component analysis and vector machine
CN116777224A (en) Foundation pit construction risk evaluation method based on combined weighting-nonlinear FAHP
CN113807743A (en) Power grid dispatching automation software reliability assessment method and system
CN111724053A (en) Aviation network risk propagation identification method
Bonato et al. Application of the Polynomial Function in the Analysis of Statistical Indicators of Risk and Safety in Shipping
CN117541071B (en) Site soil heavy metal damage baseline calculation method and device
CN114046179B (en) Method for intelligently identifying and predicting underground safety accidents based on CO monitoring data

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20110622