CN108717786A - A traffic accident causation mining method based on universal meta-rules - Google Patents

A traffic accident causation mining method based on universal meta-rules

Info

Publication number
CN108717786A
Authority
CN
China
Prior art keywords
rule
accident
meta
data
traffic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810781739.9A
Other languages
Chinese (zh)
Other versions
CN108717786B (en
Inventor
曾维理
赵子瑜
李娟�
任禹蒙
孙煜时
羊钊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics
Priority to CN201810781739.9A
Publication of CN108717786A
Application granted
Publication of CN108717786B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/0104 Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0125 Traffic data processing
    • G08G1/0129 Traffic data processing for creating historical data or processing based on historical data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services

Landscapes

  • Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Educational Administration (AREA)
  • Analytical Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Chemical & Material Sciences (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Traffic Control Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a traffic accident causation mining method based on universal meta-rules. The method reads in historical traffic accident information, preprocesses the data, and grades each accident record according to the road traffic accident classification standard. On this basis, association rule analysis is applied: reasonable thresholds are set for the minimum support, the minimum confidence, and the frequent index; association rules are mined from multiple data sets under consistent thresholds; a binary data set relating each data set to its strong association rules is built; and a meta-rule set is extracted. The meta-rule set is then combined with the data sets for secondary mining, which integrates the meta-rules found across the data sets and yields output rules, expressed in a cell pattern, composed of polynary rules with universal features. The invention can uncover association information hidden in traditional association rules, screen out valuable rules, reject association rules that lack universal features across multiple districts, and provide decision support for traffic safety managers.

Description

A traffic accident causation mining method based on universal meta-rules
Technical field
The invention belongs to the field of traffic safety technology, and in particular relates to a traffic accident causation mining method based on universal meta-rules.
Background
In recent years, urban road traffic has developed rapidly, and the scale of urban road networks and total road mileage have grown substantially. Food delivery services, shared bicycles, shared cars, and online car-hailing have sprung up. Behind this prosperity lies heavy pressure on urban transport, and traffic accidents are also on the rise. Now that traffic accident data recording is largely complete, the main problem at this stage is how to use such data effectively and find the crux of the matter in large volumes of accident data. Analyzing the causes of accidents and finding the inherent laws governing the relationships among accident attributes gives decision-makers a basis for targeted action: by intervening in and controlling the conditions under which accidents occur, the probability of traffic accidents can be reduced.
With the development of artificial intelligence and big data technology, the theory and methods of data mining have come into wide use in the traffic field. Association rule mining, which analyzes the associations among different transaction attributes in a data set, is currently the mainstream means of mining traffic accident data. As one of the main representatives of unsupervised learning, association rule mining suits the strong randomness and uneven distribution of traffic accident data: it reveals potential associations among transaction attributes, so that valuable association rules can be analyzed and sound decisions made.
The prior art has the following problems. First, to keep association rules mined from traffic data readable, thresholds are generally set high, yet accidents of different severity occur with different probabilities; if a uniform threshold is applied directly to an entire traffic data set, some hidden associations may not surface. Second, in traffic data sets with many attributes, the antecedents and consequents of association rules contain too many attributes with no logical relationship shown among them, which makes analysis difficult for decision-makers. Third, association rules obtained by mining a single data set in isolation often ignore the universality of the rules: some of the resulting rules apply only to the current data set and reflect the accident causes of a single district, and cannot reflect causation features shared across multiple districts.
Summary of the invention
To solve the technical problems raised in the background above, the invention aims to provide a traffic accident causation mining method based on universal meta-rules. It extracts meta-rules with universal features and outputs rules in a cell pattern built from those meta-rules, making the rules both universal and easy to interpret.
To achieve the above technical purpose, the technical solution of the invention is as follows:
A traffic accident causation mining method based on universal meta-rules, comprising the following steps:
Step 1: Data preparation
Step 1.1: Read historical traffic accident information and classify it into five classes of traffic accident causation information: basic accident information, involved-driver information, accident vehicle information, road condition information, and environmental information. Each class is described by multiple attributes;
Step 1.2: Perform data quality analysis on the traffic accident information that was read, and retain the attribute variables of acceptable quality;
Step 1.3: Perform attribute selection on the screened traffic accident information, removing attributes that are irrelevant to the mining task or redundant. The goal of attribute selection is to find a minimal attribute set while keeping the probability distribution of the data set as close as possible to the original distribution obtained with all attributes;
Step 1.4: Clean the traffic accident information obtained in step 1.3, including missing-value handling and noise filtering. Missing values are handled by elimination: records whose attribute missing rate across the five classes of causation information exceeds a preset missing threshold are rejected. Noise filtering uses an outlier detection algorithm based on statistical methods: outliers in the data are identified and deleted;
Step 1.5: Cluster the attributes with continuous distributions and, at the same time, classify each accident according to the road traffic accident classification standard;
Step 2: Parameter selection
Step 2.1: Compute the support and confidence of a rule as follows:
The support of rule R: X ⇒ Y in traffic data set T is:
$$S(X \Rightarrow Y) = \frac{\sigma(X \cup Y)}{|T|} \tag{1}$$
where $\sigma(X \cup Y)$ is the number of accident records in $T$ that contain both $X$ and $Y$, and $|T|$ is the total number of records in $T$.
The confidence of rule R: X ⇒ Y in traffic data set T is:
$$C(X \Rightarrow Y) = \frac{\sigma(X \cup Y)}{\sigma(X)} \tag{2}$$
where $\sigma(X)$ is the number of accident records in $T$ that contain $X$.
For rule R: X ⇒ Y, X is called the antecedent of the rule and Y the consequent. The support $S(X \Rightarrow Y)$ represents the probability that accident cause X and accident cause Y occur together, and the confidence $C(X \Rightarrow Y)$ represents the conditional probability that accident cause Y also occurs when accident cause X occurs. When the confidence of rule R exceeds a preset threshold, the occurrence of X is considered to induce the occurrence of Y; the greater the confidence, the closer the connection between the two;
Step 2.2: Select the minimum support threshold S: after separating the accidents of different classes in the traffic data sets of different administrative regions, compute the support of the rules in each region according to formula (1), and obtain the relationship between the number of association rules meeting the minimum support threshold and the threshold itself. By trying different minimum support thresholds, plot the threshold on the abscissa and the number of association rules meeting it on the ordinate to obtain a support threshold selection trend chart for each accident class in each region, and select the minimum support threshold from it;
Step 2.3: Select the minimum confidence threshold C: for the traffic accident data sets of each type, compute the support and confidence of the rules in each region according to formulas (1) and (2), and compare the results under different support and confidence thresholds. This yields a bubble chart relating the distribution of rules that meet the threshold conditions to the threshold settings, which is used to weigh the selection ranges of the support and confidence thresholds. The abscissa is the support threshold, the ordinate is the confidence threshold, and a larger bubble indicates a larger number of association rules;
Step 2.4: Select the frequent index threshold F: across different traffic data sets, the index used to screen universal meta-rules is the frequent index. From the association rules mined separately from each data set, build a frequent index table of association rules over multiple data sets; association rules that meet the frequent index threshold are taken as universal meta-rules. Each data set uses the same support and confidence thresholds when mining association rules, and the Boolean values 1 and 0 indicate that a rule is present or absent, respectively. The frequent index of rule $R_i$ is defined as:
$$F(R_i) = \frac{1}{n}\sum_{j=1}^{n} p_{ij} \tag{3}$$
where $p_{ij}$ is the indicator of rule $R_i$ in data set $T_j$: $p_{ij} = 1$ if $R_i$ is present in $T_j$ and $p_{ij} = 0$ otherwise, and $n$ is the number of data sets;
To obtain meta-rules that are universal across multiple regions, while ensuring that the resulting universal meta-rules are meaningful for analysis, perform association screening on the traffic accident data sets of each type in each region and pick out the association rules that recur across regions. Plot the frequent index threshold on the abscissa and the number of strong association rules on the ordinate to obtain a regional association trend chart of the association rules for each accident class, and select the frequent index threshold from it;
Step 3: Association rule mining based on meta-rules
Step 3.1: For multiple data sets T1, T2, …, Ti of the same format, set a consistent minimum support S and minimum confidence C and perform a first round of association rule mining to obtain the corresponding association rules R1, R2, …, Ri;
Step 3.2: According to the frequent index of each association rule across the data sets, screen the rules with the frequent index threshold F, extract the meta-rules, and build the meta-rule set;
Step 3.3: Combine the meta-rule set with the data sets T1, T2, …, Ti for secondary mining, integrating the meta-rules across the data sets;
Step 3.4: Output strong association rules according to the minimum support S and minimum confidence C, and output the meta-rules of the associated factors leading to each traffic accident type, obtaining output rules, expressed in a cell pattern, composed of polynary rules with universal features;
Step 4: Rule analysis
Based on the strong association rules composed of the polynary rules generated for each accident type, analyze the associations among accident causes qualitatively and quantitatively to provide a reference for decision-makers.
Further, in step 1.1, the attributes contained in each of the five classes of traffic accident causation information are as follows:
Basic accident information includes: accident type, accident severity level, number of casualties, direct property loss, time of the accident, and accident location;
Involved-driver information includes: the driver's gender, age, occupation, and years of driving experience, the driver's physical and psychological condition when the accident occurred, and the driver's operating behavior before and after the accident;
Accident vehicle information includes: vehicle type, vehicle safety condition, and vehicle driving performance;
Road condition information includes: road class, road surface type, pavement condition, safety facilities, and alignment design;
Environmental information includes: road traffic conditions at the time of the accident, driving sight distance, weather conditions, traffic signs and markings, and lighting.
Further, in step 1.2, null-value analysis is used to retain the attribute variables whose non-null ratio exceeds 70%.
Further, in step 1.4, the preset missing threshold is 30%.
Further, in step 1.5, accidents are classified as follows:
Property loss accident: causes damage to vehicles, cargo, or other property, possibly accompanied by very slight injuries to persons;
Injury accident: causes serious or minor injury to a party, possibly accompanied by property loss;
Fatal accident: causes the death of a party, possibly accompanied by injuries and property loss.
Further, in steps 2.2, 2.3, and 2.4, when choosing each threshold, the readability of the rules should be ensured first, so that the number of rules stays within a readable range, while the validity of the rules should also be ensured, so that the rules contain as many attributes as possible.
Beneficial effects of the above technical solution:
The invention takes the association rules obtained from a first round of mining on different data sets, analyzes their universality, and extracts the rules with universal features as meta-rules. Secondary mining then integrates the meta-rules across the data sets and outputs them in a cell pattern, yielding output rules composed of polynary rules with universal features. Compared with association rules obtained by traditional mining, the invention has the following advantages: (1) universal meta-rules reveal the association information hidden in traditional association rules and integrate it into the mining results; (2) association rules lacking universal features across multiple districts are rejected, and valuable universal rules are screened out; (3) the drawback of traditional association rules, whose large number of attributes hinders analysis by decision-makers, is avoided: the logical relationships among attributes in the antecedent and consequent are emphasized, and association rules are expressed in a more readable way. The invention not only mines the causes of each type of traffic accident but also finds the associations among those causes, helping traffic management departments identify the most important and most critical intervention factors and improve the effectiveness of accident prevention.
Description of the drawings
Fig. 1 is a flow chart of the method of the invention.
Detailed description
The technical solution of the invention is described in detail below with reference to the drawing.
The invention proposes a traffic accident causation mining method based on universal meta-rules; as shown in Fig. 1, the specific steps are as follows.
Step 1: Data preparation
Step 1.1: Read in historical traffic accident information and divide it into five aspects of traffic accident causation information (basic accident information, involved-driver information, accident vehicle information, road condition information, and environmental information), so that accident attributes are described from multiple angles:
1) Basic accident information, including accident type, accident severity level, casualties (number injured, number killed, number seriously injured, number with minor injuries), direct property loss, time of the accident, and accident location. This information constitutes the basic description of the road traffic accident itself.
2) Involved-driver information. The driver is one of the important factors in an accident. This class includes basic information such as the driver's gender, age, occupation, and years of driving experience, as well as ancillary information such as the driver's physical and psychological condition when the accident occurred and operating behavior before and after the accident.
3) Accident vehicle information, including vehicle type, vehicle safety condition, vehicle driving performance, and so on.
4) Road condition information. Road traffic accidents occur on both urban roads and highways; road condition information includes road class, road surface type, pavement condition, safety facilities, alignment design, and so on.
5) Environmental information, including road traffic conditions at the time of the accident, driving sight distance, weather conditions, traffic signs and markings, lighting, and so on, all of which directly or indirectly affect traffic accidents.
Step 1.2: Perform data quality analysis on the traffic accident data. Using null-value analysis, include the attribute variables whose non-null ratio exceeds 70% in the variable system of the next round's data sample.
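The screening in step 1.2 can be illustrated with a minimal Python sketch. It assumes each accident record is a dictionary whose missing values are stored as `None`; the function name `retain_attributes` and the sample attribute names are hypothetical and are not taken from the patent.

```python
def retain_attributes(records, min_non_null_ratio=0.70):
    """Return the attribute names whose non-null share exceeds the threshold."""
    if not records:
        return []
    # Collect every attribute name that appears in at least one record.
    attributes = sorted({key for record in records for key in record})
    kept = []
    for attribute in attributes:
        non_null = sum(1 for record in records if record.get(attribute) is not None)
        if non_null / len(records) > min_non_null_ratio:
            kept.append(attribute)
    return kept


# Hypothetical sample: "sight_distance" is mostly empty and is screened out.
sample = [
    {"weather": "rain", "age": 25, "sight_distance": None},
    {"weather": "sunny", "age": 31, "sight_distance": None},
    {"weather": "rain", "age": None, "sight_distance": 120},
    {"weather": "sunny", "age": 40, "sight_distance": None},
]
print(retain_attributes(sample))
```

The comparison is strict (`>`), matching the wording "exceeds 70%"; a deployment might prefer `>=`.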
Step 1.3: Perform attribute selection on the screened traffic accident data, removing attributes that may be irrelevant to the mining task or redundant. The goal of attribute selection is to find a minimal attribute set while keeping the probability distribution of the data set close to the original distribution obtained with all attributes. The advantage is that fewer attributes appear in the discovered patterns, which makes the patterns easier to understand.
Step 1.4: Clean the retained accident causation information, mainly by missing-value handling and noise filtering. Missing values are handled first. Because the missing data describe independent individual accidents, and there is no obvious correlation between accidents, such missing data cannot in theory be filled in by later analysis. Elimination is therefore used for data cleaning: accident records whose attribute missing rate across the five classes of causation information exceeds 30% are rejected, improving data quality and mining value. Noise filtering follows, using an outlier detection algorithm based on statistical methods. Because traffic accident records are independent individual information with no obvious correlation between accidents and a high degree of randomness, such data cannot in theory be smoothed by regression; the outliers identified are therefore deleted.
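A minimal Python sketch of the cleaning in step 1.4 follows. The patent names only "an outlier detection algorithm based on statistical methods"; the z-score rule with a limit of 3 used here is an assumption chosen for illustration, as are the function names and the record layout (dictionaries with `None` for missing values).

```python
from statistics import mean, pstdev


def drop_sparse_records(records, attributes, max_missing_ratio=0.30):
    """Keep records whose share of missing attributes does not exceed the threshold."""
    if not attributes:
        return list(records)
    kept = []
    for record in records:
        missing = sum(1 for name in attributes if record.get(name) is None)
        if missing / len(attributes) <= max_missing_ratio:
            kept.append(record)
    return kept


def drop_outliers(records, attribute, z_limit=3.0):
    """Delete records whose value lies more than z_limit standard deviations
    from the mean (assumed z-score rule; the patent does not fix the test)."""
    values = [r[attribute] for r in records if r.get(attribute) is not None]
    if len(values) < 2:
        return list(records)
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return list(records)
    return [
        r for r in records
        if r.get(attribute) is None or abs(r[attribute] - mu) / sigma <= z_limit
    ]
```

Records are removed rather than imputed or smoothed, which follows the reasoning in step 1.4 that accidents are independent and cannot be reconstructed from one another.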
Step 1.5: Perform data reduction on continuous attributes. So that subsequent data mining can summarize the characteristics of each class of traffic accident and focus on specific classes for further analysis, attributes with continuous distributions are clustered. In addition, so that the mining results present the associated factors leading to each type of traffic accident intuitively, each accident is classified according to the road traffic accident classification standard, based on the numbers of deaths, minor injuries, and serious injuries and the property loss in each accident record:
a) Property loss accident. Causes damage to vehicles, cargo, or other property, possibly accompanied by very slight injuries to persons;
b) Injury accident. Causes serious or minor injury to a party, possibly accompanied by property loss;
c) Fatal accident. Causes the death of a party, possibly accompanied by injuries and property loss.
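The three-way classification above can be sketched as a small Python function. This is an illustrative reading, not the patent's implementation: it assumes each record carries counts of deaths, serious injuries, and minor injuries, and that very slight injuries (which may accompany a property loss accident) are not counted as minor injuries. The function name and label strings are hypothetical.

```python
def classify_accident(deaths, serious_injuries, minor_injuries):
    """Assign one of the three accident classes from casualty counts.

    Assumption: very slight injuries are not recorded in minor_injuries,
    so a record with all-zero counts is a property loss accident.
    """
    if deaths > 0:
        return "fatal"
    if serious_injuries > 0 or minor_injuries > 0:
        return "injury"
    return "property_loss"


print(classify_accident(deaths=0, serious_injuries=0, minor_injuries=2))
```

The checks run from most to least severe, so an accident with both deaths and injuries is labeled by its most severe outcome.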
Step 2: Parameter selection
Step 2.1: Compute the support and confidence of a rule with the following formulas.
The support of rule R: X ⇒ Y in traffic data set T is:
$$S(X \Rightarrow Y) = \frac{\sigma(X \cup Y)}{|T|} \tag{1}$$
where $\sigma(X \cup Y)$ is the number of accident records in $T$ that contain both $X$ and $Y$, and $|T|$ is the total number of records in $T$.
The confidence of rule R: X ⇒ Y in traffic data set T is:
$$C(X \Rightarrow Y) = \frac{\sigma(X \cup Y)}{\sigma(X)} \tag{2}$$
where $\sigma(X)$ is the number of accident records in $T$ that contain $X$.
For rule R: X ⇒ Y, X is called the antecedent of the rule and Y the consequent. The support $S(X \Rightarrow Y)$ represents the probability that accident cause X and accident cause Y occur together, and the confidence $C(X \Rightarrow Y)$ represents the conditional probability that accident cause Y also occurs when accident cause X occurs. When the confidence of rule R exceeds a preset threshold, the occurrence of X is considered to induce the occurrence of Y; the greater the confidence, the closer the connection between the two.
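As an illustration of step 2.1, the following Python sketch computes support as the share of records containing both X and Y, and confidence as the conditional share of records containing Y among those containing X. It assumes each accident record has already been encoded as a set of attribute items; the item names and function names are hypothetical.

```python
def support(transactions, itemset):
    """Share of records that contain every item in itemset."""
    if not transactions:
        return 0.0
    items = frozenset(itemset)
    hits = sum(1 for record in transactions if items <= set(record))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Share of records containing the antecedent that also contain the consequent."""
    support_x = support(transactions, antecedent)
    if support_x == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support_x


# Hypothetical encoded accident records.
accidents = [
    {"rain", "tailgating", "evening"},
    {"rain", "tailgating"},
    {"rain", "lane_change"},
    {"sunny", "lane_change"},
]
print(support(accidents, {"rain", "tailgating"}))
print(confidence(accidents, {"rain"}, {"tailgating"}))
```

In this sample, "rain" and "tailgating" co-occur in two of four records, and "rain" appears in three, so the rule rain ⇒ tailgating has support 2/4 and confidence 2/3.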
Step 2.2: Select the minimum support threshold S: after separating the accidents of different classes in the traffic data sets of different administrative regions, compute the support of the rules in each region according to formula (1), and obtain the relationship between the number of association rules meeting the minimum support threshold and the threshold itself. By trying different minimum support thresholds, plot the threshold on the abscissa and the number of association rules meeting it on the ordinate to obtain a support threshold selection trend chart for each accident class in each region, and select the minimum support threshold from it.
Step 2.3: Select the minimum confidence threshold C: for the traffic accident data sets of each type, compute the support and confidence of the rules in each region according to formulas (1) and (2), and compare the results under different support and confidence thresholds. This yields a bubble chart relating the distribution of rules that meet the threshold conditions to the threshold settings, which is used to weigh the selection ranges of the support and confidence thresholds. The abscissa is the support threshold, the ordinate is the confidence threshold, and a larger bubble indicates a larger number of association rules.
Step 2.4: Select the frequent index threshold F: across different traffic data sets, the index used to screen meta-rules with universal features is the frequent index. From the association rules mined separately from each data set, build a frequent index table of association rules over multiple data sets; association rules that meet the frequent index threshold are taken as universal meta-rules. Each data set uses the same support and confidence thresholds when mining association rules, and the Boolean values 1 and 0 indicate that a rule is present or absent, respectively. The frequent index of rule $R_i$ is defined as:
$$F(R_i) = \frac{1}{n}\sum_{j=1}^{n} p_{ij} \tag{3}$$
where $p_{ij}$ is the indicator of rule $R_i$ in data set $T_j$: $p_{ij} = 1$ if $R_i$ is present in $T_j$ and $p_{ij} = 0$ otherwise, and $n$ is the number of data sets.
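A minimal Python sketch of the frequent index follows, assuming it is the share of data sets in which a rule is present (the mean of the 0/1 indicators). Each rule is represented here as a hashable pair (antecedent, consequent) of frozensets, and each data set contributes the set of rules mined from it; these representations and the function names are assumptions made for illustration.

```python
def frequent_index(rule, rule_sets):
    """Share of data sets whose mined rules include the given rule."""
    if not rule_sets:
        return 0.0
    present = sum(1 for rules in rule_sets if rule in rules)
    return present / len(rule_sets)


def extract_meta_rules(rule_sets, min_frequent_index):
    """Return the rules whose frequent index meets the threshold F."""
    all_rules = set().union(*rule_sets)
    return {
        rule for rule in all_rules
        if frequent_index(rule, rule_sets) >= min_frequent_index
    }


# Hypothetical rules mined from three regional data sets.
rule_a = (frozenset({"rain"}), frozenset({"tailgating"}))
rule_b = (frozenset({"sunny"}), frozenset({"lane_change"}))
rule_c = (frozenset({"night"}), frozenset({"speeding"}))
mined = [{rule_a, rule_b, rule_c}, {rule_a, rule_b}, {rule_a}]
meta_rules = extract_meta_rules(mined, min_frequent_index=0.55)
print(len(meta_rules))
```

With three data sets and a threshold of 55%, a rule must appear in at least two of them to qualify as a universal meta-rule.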
To obtain meta-rules that are universal across multiple regions, while ensuring that the resulting universal meta-rules are meaningful for analysis, perform association screening on the traffic accident data sets of each type in each region and pick out the association rules that recur across regions. Plot the frequent index threshold on the abscissa and the number of strong association rules on the ordinate to obtain a regional association trend chart of the association rules for each accident class, and select the frequent index threshold from it.
When choosing each threshold, the readability of the rules should be ensured first, so that the number of rules stays within a readable range (generally 200 or fewer), while the validity of the rules should also be ensured, so that the rules contain as many attributes as possible.
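One way to read the threshold selection described above is sketched below in Python: count the rules that survive each candidate threshold (the data behind a trend chart), then take the lowest candidate that keeps the rule count within the readable range. The selection rule, the function names, and the input format (a mapping from rule label to its support, confidence, or frequent index) are assumptions; the patent selects thresholds by inspecting the charts.

```python
def rule_count_curve(rule_scores, thresholds):
    """Return (threshold, number of rules meeting it) pairs in ascending order."""
    return [
        (threshold, sum(1 for score in rule_scores.values() if score >= threshold))
        for threshold in sorted(thresholds)
    ]


def pick_threshold(rule_scores, thresholds, max_rules=200):
    """Lowest candidate threshold that leaves at most max_rules rules,
    or None if no candidate is strict enough."""
    for threshold, count in rule_count_curve(rule_scores, thresholds):
        if count <= max_rules:
            return threshold
    return None


# Hypothetical supports for four rules, with a tiny readable limit for the demo.
scores = {"r1": 0.1, "r2": 0.3, "r3": 0.5, "r4": 0.7}
print(rule_count_curve(scores, [0.2, 0.4, 0.6]))
print(pick_threshold(scores, [0.2, 0.4, 0.6], max_rules=2))
```

Choosing the lowest admissible threshold keeps as many rules, and therefore as many attributes, as readability allows.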
Step 3: Association rule mining based on meta-rules
Step 3.1: For multiple data sets T1, T2, …, Ti of the same format, set a consistent minimum support S and minimum confidence C and perform a first round of association rule mining to obtain the corresponding association rules R1, R2, …, Ri.
Step 3.2: According to the frequent index of each association rule across the data sets, screen the rules with the frequent index threshold F, extract the meta-rules, and build the meta-rule set.
Step 3.3: Combine the meta-rule set with the data sets T1, T2, …, Ti for secondary mining, integrating the meta-rules across the data sets.
Step 3.4: Association rule generation. Output strong association rules according to the minimum support S and minimum confidence C, and output the meta-rules of the associated factors leading to each traffic accident type, obtaining output rules, expressed in a cell pattern, composed of polynary rules with universal features. The rule template has the form $(P_1 \wedge \dots \wedge P_i \Rightarrow P_j \wedge \dots \wedge P_k) \Rightarrow Q(Y)$, which indicates that the occurrence of accident causes P1, ..., Pi, Pj, ..., Pk induces the occurrence of accident cause Q(Y), and that among P1, ..., Pi, Pj, ..., Pk the occurrence of P1, ..., Pi induces the occurrence of Pj, ..., Pk. The cell pattern is chosen as the output mode mainly for its packaging property: an output rule can contain both meta-rules and single attributes, which together form the output rule, so that the output rule carries more complete information and is easier to analyze. The association rules mined from different data sets are screened and then presented as cell groups in the antecedent and consequent of the association rule, covering three forms: attribute with attribute, attribute with rule, and rule with rule. In practice, by taking the association rules among influencing factors into account, traffic accidents can be prevented by controlling only a small number of influencing factors.
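The secondary mining of step 3.3 and the nested output of step 3.4 are not spelled out in detail, so the following Python sketch shows only one possible reading: each meta-rule is turned into a derived item that is added to every record satisfying it, after which ordinary association rule mining on the augmented records can produce rules whose antecedent contains both single attributes and meta-rules. The function names, the label format, and the item names are hypothetical.

```python
def rule_label(antecedent, consequent):
    """Render a meta-rule as a single derived item, e.g. '(rain => tailgating)'."""
    left = " & ".join(sorted(antecedent))
    right = " & ".join(sorted(consequent))
    return f"({left} => {right})"


def augment_with_meta_rules(transactions, meta_rules):
    """Add a derived item to each record that contains a meta-rule's
    antecedent and consequent (assumed form of the binary rule data set)."""
    augmented = []
    for record in transactions:
        original = set(record)
        items = set(original)
        for antecedent, consequent in meta_rules:
            if set(antecedent) | set(consequent) <= original:
                items.add(rule_label(antecedent, consequent))
        augmented.append(items)
    return augmented


# Hypothetical records and one meta-rule.
accidents = [{"rain", "tailgating", "evening"}, {"rain", "evening"}]
meta_rules = [({"rain"}, {"tailgating"})]
for record in augment_with_meta_rules(accidents, meta_rules):
    print(sorted(record))
```

A rule mined from the augmented records such as "(rain => tailgating)" together with "evening" implying an accident class would then match the attribute-with-rule form described in step 3.4.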
Step 4: Rule analysis
Based on the strong association rules composed of the polynary rules generated in step 3 for each accident type, analyze the associations among accident causes qualitatively and quantitatively to provide a reference for decision-makers.
For example, taking the 2014-2016 traffic accident data of Shenzhen as the research object, the causes of each traffic accident type in Shenzhen are mined and analyzed. In this embodiment, the thresholds for the minimum support S, minimum confidence C, and frequent index F of the association rules are set to S ≥ 30%, C ≥ 70%, and F ≥ 55%, respectively. The association rule results for each traffic accident type are shown in the table below:
Table 1. Association rule results for Shenzhen traffic accidents (partial)
Analysis of the association rule results supports the following suggestions. Regarding weather: in sunny weather, drivers are more likely to cause accidents by changing lanes at will, whereas in rainy weather, failing to keep a safe distance from the vehicle ahead and dangerous driving become the main behaviors leading to accidents; the period of frequent accidents is 17:00-19:59 and the accident area is Bao'an District. It is therefore advisable to start from weather, time, and place, and broadcast targeted traffic reminders during that period under the specific weather conditions to strengthen drivers' safety awareness. Regarding drivers: drivers aged 19 to 23 and 30 to 35 are the high-incidence groups for class 1 accidents (property loss accidents), but their associated features differ. Drivers aged 19 to 23 are most likely to exhibit the driving behavior coded 1225, that is, other behavior that interferes with safe driving while driving; this kind of behavior is most likely to lead to a class 1 accident, and imperfect traffic signs and markings are its main cause. For drivers aged 30 to 35, the main violation in class 1 accidents is code 1094, that is, failing to keep a safe distance from the vehicle ahead, and the accidents mainly occur on ordinary urban roads. Since drivers aged 19 to 23 are mostly newly licensed drivers who have just graduated from driving school, it is recommended to strengthen training in traffic safety awareness and behavior for students during driving school training, to pay attention to age-group differences among drivers in management, and to focus on managing novices. Drivers aged 24 to 29, most of whom have just accumulated 4 to 6 years of driving experience, are in the period when safety awareness is weakest. It is therefore recommended to provide traffic safety education when the first driver's license expires and is renewed, reinforced with real cases. Because the audience is very large, safety education can be delivered through online quizzes, online videos, and similar means, and passing the safety education can be made a condition for renewing the first driver's license.
The embodiments merely illustrate the technical idea of the present invention and do not limit its scope of protection; any modification made on the basis of the technical solution in accordance with the technical idea proposed by the present invention falls within the scope of protection of the present invention.

Claims (6)

1. A traffic accident cause mining method based on universality meta-rules, characterized by comprising the following steps:
Step 1: Data preparation
Step 1.1: Read historical traffic accident information and divide it into five classes of traffic accident cause information: basic accident information, information on the drivers involved, accident vehicle information, road condition information, and environmental information, each class being described by multiple attributes;
Step 1.2: Perform data quality analysis on the traffic accident information read, and screen it to retain the attribute variables of acceptable quality;
Step 1.3: Perform attribute selection on the screened traffic accident information, removing attributes that are irrelevant or redundant to the mining task; the goal of attribute selection is to find the minimum attribute subset while keeping the probability distribution of the data set as close as possible to the original distribution obtained using all attributes;
Step 1.4: Perform data cleaning, including missing-value processing and noise filtering, on the traffic accident information obtained in Step 1.3; missing-value processing uses the elimination method, removing information whose degree of attribute missingness in the five classes of traffic accident cause information exceeds a preset missing threshold; noise filtering uses an outlier detection algorithm based on statistical methods to identify the outliers in the data and delete them;
Step 1.5: Cluster the continuously distributed attributes and, at the same time, classify each accident according to the road traffic accident classification standard;
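As a rough illustration of the screening and cleaning in Steps 1.2 and 1.4, the following Python sketch keeps attributes whose non-empty proportion exceeds a threshold, drops records that are missing too many attributes, and removes statistical outliers. The function names, the record layout (dictionaries with `None` for a missing value), the reading of the missing threshold as a per-record limit, and the z-score test are all assumptions made for illustration; the claims only specify an elimination method and a statistics-based outlier detection algorithm. The default thresholds of 70% and 30% are the values given in claims 3 and 4.

```python
from statistics import mean, pstdev
from typing import Dict, List, Optional

Record = Dict[str, Optional[object]]  # hypothetical layout: None marks a missing value


def keep_qualified_attributes(records: List[Record], min_non_empty: float = 0.70) -> List[Record]:
    """Keep only attributes whose non-empty proportion exceeds min_non_empty."""
    attrs = sorted({key for rec in records for key in rec})
    total = len(records)
    kept = [
        a for a in attrs
        if sum(rec.get(a) is not None for rec in records) / total > min_non_empty
    ]
    return [{a: rec.get(a) for a in kept} for rec in records]


def drop_sparse_records(records: List[Record], max_missing: float = 0.30) -> List[Record]:
    """Drop records whose share of missing attributes exceeds max_missing."""
    result = []
    for rec in records:
        if rec and sum(v is None for v in rec.values()) / len(rec) <= max_missing:
            result.append(rec)
    return result


def drop_outliers(records: List[Record], attr: str, z_max: float = 3.0) -> List[Record]:
    """Remove records whose numeric attribute lies more than z_max standard deviations from the mean."""
    values = [rec[attr] for rec in records if rec.get(attr) is not None]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return list(records)
    return [
        rec for rec in records
        if rec.get(attr) is None or abs(rec[attr] - mu) / sigma <= z_max
    ]


# Hypothetical records: "note" is mostly empty and the age 250 is an obvious outlier.
ages = [25, 26, 27, 28, 29, 30, 31, 32, 33, 250]
raw = [
    {"age": age, "weather": "rain", "note": "x" if i < 2 else None}
    for i, age in enumerate(ages)
]
cleaned = drop_outliers(drop_sparse_records(keep_qualified_attributes(raw)), "age", z_max=2.0)
```

With only ten records a z-score can never exceed 3, which is why the example passes a lower `z_max`; on a full accident data set the cut-off would need to be chosen from the data.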
Step 2: Parameter selection
Step 2.1: Calculate the support and confidence of a rule as follows:
The support of rule $R: X \Rightarrow Y$ in traffic data set T is:
$$\mathrm{Support}(X \Rightarrow Y) = P(X \cup Y) \tag{1}$$
where $P(X \cup Y)$ is the proportion of accident records in T that contain both X and Y.
The confidence of rule $R: X \Rightarrow Y$ in traffic data set T is:
$$\mathrm{Confidence}(X \Rightarrow Y) = P(Y \mid X) = \frac{P(X \cup Y)}{P(X)} \tag{2}$$
where $P(X)$ is the proportion of accident records in T that contain X.
For rule $R: X \Rightarrow Y$, X is called the antecedent of the rule and Y is called the consequent of the rule; the support of rule R, $\mathrm{Support}(X \Rightarrow Y)$, indicates the probability that accident cause X and accident cause Y occur together, and the confidence of rule R, $\mathrm{Confidence}(X \Rightarrow Y)$, indicates the conditional probability that accident cause Y also occurs when accident cause X occurs; when the confidence of rule R exceeds the preset threshold, the occurrence of event X is considered to induce the occurrence of event Y, and the greater the confidence, the closer the connection between the two;
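The support and confidence described in Step 2.1 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: it assumes each accident record is encoded as a set of `attribute=value` strings, and the function names and sample records are hypothetical.

```python
from typing import FrozenSet, List

Record = FrozenSet[str]  # hypothetical encoding: one accident as a set of attribute=value items


def support(records: List[Record], itemset: FrozenSet[str]) -> float:
    """Proportion of records that contain every item in itemset."""
    if not records:
        return 0.0
    return sum(1 for rec in records if itemset <= rec) / len(records)


def confidence(records: List[Record], antecedent: FrozenSet[str], consequent: FrozenSet[str]) -> float:
    """Conditional probability of the consequent given the antecedent."""
    support_x = support(records, antecedent)
    if support_x == 0:
        return 0.0
    return support(records, antecedent | consequent) / support_x


# Hypothetical accident records.
records = [
    frozenset({"weather=rain", "behavior=tailgating", "class=property"}),
    frozenset({"weather=rain", "behavior=tailgating", "class=injury"}),
    frozenset({"weather=sunny", "behavior=lane_change", "class=property"}),
    frozenset({"weather=rain", "behavior=speeding", "class=property"}),
]
X = frozenset({"weather=rain"})
Y = frozenset({"behavior=tailgating"})

rule_support = support(records, X | Y)        # 2 of 4 records contain both X and Y
rule_confidence = confidence(records, X, Y)   # 2 of the 3 rainy records also contain Y
```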
Step 2.2: Select the minimum support threshold S: after the traffic data sets of the different administrative regions are separated by accident class, calculate the support of the rules of the different regions according to formula (1) to obtain the relationship between the number of association rules meeting the minimum support threshold and the minimum support threshold; by choosing different minimum support thresholds, with the minimum support threshold as the abscissa and the number of association rules meeting the minimum support threshold as the ordinate, obtain the support threshold selection trend chart for each class of accident in each region, and select the minimum support threshold;
Step 2.3: Select the minimum confidence threshold C: for the traffic accident data sets of different types, calculate the support and confidence of the rules of the different regions according to formulas (1) and (2); set different support and confidence thresholds for comparative analysis to obtain a bubble chart relating the distribution of the rules meeting the threshold conditions to the threshold settings, so as to weigh the selection range of the support and confidence thresholds, wherein the abscissa corresponds to the support threshold, the ordinate corresponds to the confidence threshold, and a larger bubble indicates a larger number of association rules;
Step 2.4: Select the frequent index threshold F: across different traffic data sets, the index used to screen universality meta-rules is the frequent index; according to the association rules mined from each of the different data sets, establish an association rule frequent index table based on multiple data sets, and take the association rules meeting the frequent index threshold as universality meta-rules, wherein every data set uses the same support and confidence thresholds when mining association rules, the Boolean values 1 and 0 indicate respectively that a rule exists and that it does not exist, and the frequent index of rule $R_i$ is defined as follows:
where $p_{ij}$ is the judgment value of rule $R_i$ in data set $T_j$: $p_{ij}$ takes 1 if rule $R_i$ exists in data set $T_j$ and 0 otherwise; n is the number of data sets;
In order to obtain meta-rules that are universal across multiple regions while ensuring that the universality meta-rules obtained are meaningful for analysis, perform association screening on the different types of traffic accident data sets of each region to select the association rules that recur across the regions, and, with the frequent index threshold as the abscissa and the number of strong association rules as the ordinate, obtain the regional association trend chart of the association rules for each class of accident, and select the frequent index threshold;
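The frequent index table of Step 2.4 can be sketched in Python as follows. The formula for the frequent index is not reproduced in this text, so the sketch assumes it is the count of data sets in which a rule exists, i.e. the sum of the judgment values $p_{ij}$ over the n data sets; a normalized variant (dividing by n) would work the same way. The rule encoding, function names, and sample rules are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Rule = Tuple[str, str]  # hypothetical encoding: (antecedent, consequent)


def frequent_index_table(rules_per_dataset: List[Set[Rule]]) -> Dict[Rule, List[int]]:
    """Build the 0/1 judgment values p_ij: one row per rule, one column per data set."""
    all_rules = set().union(*rules_per_dataset)
    return {
        rule: [1 if rule in rules else 0 for rules in rules_per_dataset]
        for rule in all_rules
    }


def frequent_index(judgment_row: List[int]) -> int:
    """Assumed definition: number of data sets in which the rule exists."""
    return sum(judgment_row)


def rule_count_by_threshold(table: Dict[Rule, List[int]]) -> Dict[int, int]:
    """Number of rules meeting each candidate threshold F (the data behind the trend chart)."""
    n = max((len(row) for row in table.values()), default=0)
    return {
        f: sum(1 for row in table.values() if frequent_index(row) >= f)
        for f in range(1, n + 1)
    }


# Hypothetical rules mined from three regional data sets with identical S and C.
A = ("weather=rain", "behavior=tailgating")
B = ("age=19-23", "behavior=1225")
C = ("weather=sunny", "behavior=lane_change")
district_rules = [{A, B, C}, {A, B}, {A}]

table = frequent_index_table(district_rules)
counts = rule_count_by_threshold(table)
meta_rules = {rule for rule, row in table.items() if frequent_index(row) >= 2}
```

Plotting `counts` with the threshold on the horizontal axis gives the kind of trend chart the step describes for choosing F.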
Step 3: Association rule mining based on meta-rules
Step 3.1: For multiple data sets $T_1, T_2, \ldots, T_i$ of the same format, set a consistent minimum support S and minimum confidence C, and perform a first round of association rule mining to obtain the corresponding association rules $R_1, R_2, \ldots, R_i$;
Step 3.2: According to the frequent index of each association rule across the different data sets, screen by the frequent index threshold F, extract the meta-rules, and establish a meta-rule set;
Step 3.3: Combine the meta-rule set with data sets $T_1, T_2, \ldots, T_i$ and mine again, integrating the meta-rules of the multiple data sets;
Step 3.4: Output the strong association rules according to the minimum support S and minimum confidence C, and output the meta-rules based on the associated factors leading to each traffic accident type, obtaining output rules that are output in cell form and composed of multivariate rules with universal features;
Step 4: Rule analysis
According to the strong association rules formed by the multivariate rules generated for the different accident types, analyze qualitatively and quantitatively the associations among the accident causes, providing a reference basis for decision-makers.
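Steps 3.1 and 3.2 can be illustrated with the following Python sketch, which mines each data set with the same thresholds and keeps the rules whose frequent index meets F. It makes several simplifying assumptions that are not in the claims: rules have a single item on each side and are found by brute-force enumeration rather than by a dedicated algorithm such as Apriori, the frequent index is taken to be the number of data sets containing the rule, and the re-mining of Step 3.3 is not reproduced. All names and sample data are hypothetical.

```python
from itertools import permutations
from typing import Dict, FrozenSet, List, Set, Tuple

Record = FrozenSet[str]
Rule = Tuple[str, str]  # hypothetical encoding: (antecedent item, consequent item)


def support(records: List[Record], items: FrozenSet[str]) -> float:
    """Proportion of records that contain every item in items."""
    if not records:
        return 0.0
    return sum(1 for rec in records if items <= rec) / len(records)


def mine_rules(records: List[Record], min_support: float, min_confidence: float) -> Set[Rule]:
    """Brute-force mining of single-item rules x => y that meet both thresholds."""
    items = sorted(set().union(*records)) if records else []
    rules: Set[Rule] = set()
    for x, y in permutations(items, 2):
        s_xy = support(records, frozenset({x, y}))
        s_x = support(records, frozenset({x}))
        if s_xy >= min_support and s_x > 0 and s_xy / s_x >= min_confidence:
            rules.add((x, y))
    return rules


def mine_meta_rules(
    datasets: Dict[str, List[Record]],
    min_support: float,
    min_confidence: float,
    min_frequent_index: int,
) -> Dict[Rule, int]:
    """Mine every data set with the same S and C, then keep rules found in at least F data sets."""
    per_dataset = {
        name: mine_rules(records, min_support, min_confidence)
        for name, records in datasets.items()
    }
    candidates = set().union(*per_dataset.values()) if per_dataset else set()
    meta: Dict[Rule, int] = {}
    for rule in sorted(candidates):
        index = sum(1 for rules in per_dataset.values() if rule in rules)
        if index >= min_frequent_index:
            meta[rule] = index
    return meta


rain = frozenset({"weather=rain", "behavior=tailgating"})
lane = frozenset({"weather=sunny", "behavior=lane_change"})
speed = frozenset({"weather=sunny", "behavior=speeding"})

# Hypothetical regional data sets in the same format.
datasets = {
    "district_a": [rain, rain, lane, lane, rain],
    "district_b": [rain, rain, speed, lane],
}
meta = mine_meta_rules(datasets, min_support=0.4, min_confidence=0.8, min_frequent_index=2)
```

In this example the rain and tailgating rules are strong in both districts and survive as meta-rules, while the sunny and lane-change rule is strong only in `district_a` and is screened out.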
2. The traffic accident cause mining method based on universality meta-rules according to claim 1, characterized in that, in Step 1.1, the attributes contained in each of the five classes of traffic accident cause information are as follows:
Basic accident information includes: accident type, accident severity level, number of casualties, direct property loss, accident time, and accident location;
Information on the drivers involved includes: the driver's gender, age, occupation, and driving experience, physical and psychological condition at the time of the accident, and driving operations before and after the accident;
Accident vehicle information includes: vehicle type, vehicle safety condition, and vehicle driving performance;
Road condition information includes: road class, road surface type, road surface condition, safety facilities, and alignment design;
Environmental information includes: road traffic conditions at the time of the accident, driving sight distance, weather conditions, signs and markings, and lighting.
3. The traffic accident cause mining method based on universality meta-rules according to claim 1, characterized in that, in Step 1.2, attribute variables whose non-empty proportion exceeds 70% are retained by means of value analysis.
4. The traffic accident cause mining method based on universality meta-rules according to claim 1, characterized in that, in Step 1.4, the preset missing threshold is 30%.
5. The traffic accident cause mining method based on universality meta-rules according to claim 1, characterized in that, in Step 1.5, accidents are classified as follows:
Property loss accident: causes damage to vehicles, cargo, or other property, possibly accompanied by slight injuries to persons;
Injury accident: causes serious or minor injury to a party, possibly accompanied by property loss;
Fatal accident: causes the death of a party, possibly accompanied by personal injury and property loss.
6. The traffic accident cause mining method based on universality meta-rules according to claim 1, characterized in that, in Steps 2.2, 2.3, and 2.4, when each threshold is selected, the readability of the rules should be ensured first, so that the number of rules remains within a readable range, while the validity of the rules should also be ensured, so that the rules contain as many attributes as possible.
CN201810781739.9A 2018-07-17 2018-07-17 Traffic accident cause mining method based on universality meta-rule Active CN108717786B (en)

Publications (2)

Publication Number Publication Date
CN108717786A true CN108717786A (en) 2018-10-30
CN108717786B CN108717786B (en) 2022-06-17

Family

ID=63914019


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant