CN108717786A - Traffic accident causation mining method based on universal meta-rules - Google Patents
- Publication number
- CN108717786A (application CN201810781739.9A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G08—SIGNALLING
- G08G—TRAFFIC CONTROL SYSTEMS
- G08G1/00—Traffic control systems for road vehicles
- G08G1/01—Detecting movement of traffic to be counted or controlled
- G08G1/0104—Measuring and analyzing of parameters relative to traffic conditions
- G08G1/0125—Traffic data processing
- G08G1/0129—Traffic data processing for creating historical data or processing based on historical data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/26—Government or public services
Landscapes
- Business, Economics & Management (AREA)
- Tourism & Hospitality (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Human Resources & Organizations (AREA)
- Educational Administration (AREA)
- Analytical Chemistry (AREA)
- Health & Medical Sciences (AREA)
- Economics (AREA)
- Chemical & Material Sciences (AREA)
- Development Economics (AREA)
- Marketing (AREA)
- Primary Health Care (AREA)
- Strategic Management (AREA)
- General Business, Economics & Management (AREA)
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Traffic Control Systems (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a traffic accident causation mining method based on universal meta-rules. The method reads in historical traffic accident information, preprocesses the data, and grades every accident record according to the road traffic accident classification standard. On this basis it applies association rule analysis: reasonable thresholds are set for the minimum support, the minimum confidence and the frequent index; association rules are mined from multiple data sets under identical thresholds; a binary data set relating each data set to the strong association rules is built; and a meta-rule set is extracted from it. The meta-rule set and the data sets are then combined for a second round of mining, which integrates the meta-rules across the data sets and yields output rules that are expressed in a cell pattern and composed of multi-element rules with universal characteristics. The invention can uncover association information hidden in conventional association rules, screen out valuable rules, and reject association rules that are not universal across multiple districts, thereby providing decision support for traffic safety managers.
Description
Technical field
The invention belongs to the technical field of traffic safety, and in particular relates to a traffic accident causation mining method based on universal meta-rules.
Background art
In recent years urban road traffic has developed rapidly, and both the scale of urban road networks and total road mileage have grown substantially. Food delivery services, shared bicycles, shared cars and online car-hailing have sprung up like mushrooms after rain. Behind this prosperity lies heavy pressure on urban traffic, and the number of traffic accidents is also rising. Now that traffic accident records are largely complete, the main problem at this stage is how to use such data effectively and identify the crux of the problem from large volumes of accident data. Analyzing the causes of accidents and finding the inherent relationships among the attributes of a traffic accident gives decision makers a basis for targeted action: by intervening in and controlling the conditions under which accidents occur, the probability of traffic accidents can be reduced.
With the development of artificial intelligence and big data technology, the theory and methods of data mining have come into wide use in the traffic field. Association rule mining, which analyzes the associations between different transaction attributes in a data set, is currently the mainstream approach to mining traffic accident data. As one of the main representatives of unsupervised learning, association rule mining suits traffic accident data well, since accidents are highly random and the data are unevenly distributed. It exposes potential associations between transaction attributes, so that valuable association rules can be analyzed and sound decisions made.
The prior art has the following problems. First, when association rules are mined from traffic data, the thresholds are usually set high to keep the rules readable, yet accidents of different severity occur with different probabilities; applying a uniform threshold directly to the entire traffic data set can therefore leave some hidden associations undiscovered. Second, in traffic data sets with many attributes, the antecedent and consequent of a rule contain too many attributes with no visible logical relationship among them, which hinders analysis by decision makers. Third, mining a single data set in isolation usually ignores the universality of the rules: some of the resulting rules apply only to the current data set and reflect accident causes in a single district, not the causation features shared across multiple districts.
Summary of the invention
To solve the technical problems raised in the background art above, the invention aims to provide a traffic accident causation mining method based on universal meta-rules, which extracts meta-rules with universal characteristics and outputs rules in a cell pattern built from those meta-rules, so that the rules are both universal and easy to interpret.
To achieve this technical purpose, the technical solution of the invention is as follows:
A traffic accident causation mining method based on universal meta-rules, comprising the following steps:
Step 1: Data preparation
Step 1.1: Read historical traffic accident information and classify it into five classes of traffic accident causation information: basic accident information, information on the drivers involved, accident vehicle information, road condition information and environmental information, each class being described by multiple attributes;
Step 1.2: Perform data quality analysis on the traffic accident information read, and retain the attribute variables of acceptable quality;
Step 1.3: Perform attribute selection on the screened traffic accident information, removing attributes that are irrelevant to the mining task or redundant; the goal of attribute selection is to find the minimal attribute set while keeping the probability distribution of the data set as close as possible to the original distribution obtained with all attributes;
Step 1.4: Clean the traffic accident information obtained in step 1.3, including missing-value handling and noise filtering; missing-value handling uses the elimination method, rejecting records whose degree of attribute missingness across the five classes of causation information exceeds a preset missing threshold; noise filtering uses an outlier detection algorithm based on statistical methods to identify the outliers in the data and delete them;
Step 1.5: Cluster the continuously distributed attributes and, at the same time, classify every accident according to the road traffic accident classification standard;
Step 2: Parameter selection
Step 2.1: Compute the support and confidence of a rule as follows:
The support of rule $R: X \Rightarrow Y$ in traffic data set $T$ is:
$$\mathrm{Support}(X \Rightarrow Y) = P(X \cup Y) \tag{1}$$
where $P(X \cup Y)$ is the proportion of accident records in $T$ that contain both $X$ and $Y$;
The confidence of rule $R: X \Rightarrow Y$ in traffic data set $T$ is:
$$\mathrm{Confidence}(X \Rightarrow Y) = P(Y \mid X) = \frac{P(X \cup Y)}{P(X)} \tag{2}$$
where $P(X)$ is the proportion of accident records in $T$ that contain $X$;
For rule $R: X \Rightarrow Y$, $X$ is called the antecedent and $Y$ the consequent of the rule. The support $\mathrm{Support}(X \Rightarrow Y)$ is the probability that accident cause $X$ and accident cause $Y$ occur together, and the confidence $\mathrm{Confidence}(X \Rightarrow Y)$ is the conditional probability that accident cause $Y$ occurs given that accident cause $X$ occurs. When the confidence of rule $R$ exceeds the preset threshold, the occurrence of $X$ is considered to induce the occurrence of $Y$; the higher the confidence, the closer the link between the two;
Step 2.2: Select the minimum support threshold S: after separating the traffic data sets of the different administrative regions by accident class, compute the support of the rules in each region according to formula (1) and relate the number of association rules that meet the minimum support threshold to the threshold itself; by trying different minimum support thresholds, with the threshold on the abscissa and the number of rules that meet it on the ordinate, obtain a support-threshold trend chart for each accident class in each region and choose the minimum support threshold from it;
Step 2.3: Select the minimum confidence threshold C: for the traffic accident data sets of each accident type, compute the support and confidence of the rules in each region according to formulas (1) and (2), and compare the results under different support and confidence thresholds to obtain a bubble chart relating the distribution of rules that meet the thresholds to the threshold settings, so as to weigh the selection range of the support and confidence thresholds; the abscissa is the support threshold, the ordinate is the confidence threshold, and a larger bubble indicates more association rules;
Step 2.4: Select the frequent index threshold F: across different traffic data sets, the index used to screen universal meta-rules is the frequent index. From the association rules mined separately from each data set, build a frequent index table of association rules over the multiple data sets; the association rules that meet the frequent index threshold are the universal meta-rules. Each data set is mined with the same support and confidence thresholds, and the Boolean values 1 and 0 indicate that a rule is present or absent, respectively. The frequent index of rule $R_i$ is defined as:
$$F(R_i) = \frac{1}{n}\sum_{j=1}^{n} p_{ij}$$
where $p_{ij}$ is the indicator of rule $R_i$ in data set $T_j$: $p_{ij}$ is 1 if rule $R_i$ is present in data set $T_j$ and 0 otherwise, and $n$ is the number of data sets;
To obtain meta-rules that are universal across multiple regions while ensuring that they remain meaningful for analysis, screen the association rules of the different types of traffic accident data sets in each region and pick out the rules that recur across regions; then, with the frequent index threshold on the abscissa and the number of strong association rules on the ordinate, obtain a regional association trend chart of the rules for each accident class and choose the frequent index threshold from it;
Step 3: Association rule mining based on meta-rules
Step 3.1: For multiple data sets T1, T2, …, Ti of identical format, set the same minimum support S and minimum confidence C and perform a first round of association rule mining to obtain the corresponding association rules R1, R2, …, Ri;
Step 3.2: According to the frequent index of each association rule across the data sets, screen by the frequent index threshold F, extract the meta-rules and build the meta-rule set;
Step 3.3: Combine the meta-rule set with the data sets T1, T2, …, Ti for a second round of mining, integrating the meta-rules across the data sets;
Step 3.4: Output the strong association rules according to the minimum support S and minimum confidence C, and output the meta-rules of the factors associated with each traffic accident type, obtaining output rules that are expressed in a cell pattern and composed of multi-element rules with universal characteristics;
Step 4: Rule analysis
Based on the strong association rules composed of multi-element rules generated for the different accident types, analyze qualitatively and quantitatively the associations among the accident causes to provide a reference for decision makers.
Further, in step 1.1, the five classes of traffic accident causation information contain the following attributes:
Basic accident information includes: accident type, accident severity level, number of casualties, direct property loss, time of the accident and location of the accident;
Information on the drivers involved includes: the driver's gender, age, occupation and driving experience, the driver's physical and psychological condition when the accident occurred, and the driver's operating behavior before and after the accident;
Accident vehicle information includes: vehicle type, vehicle safety condition and vehicle driving performance;
Road condition information includes: road grade, pavement type, pavement condition, safety facilities and alignment design;
Environmental information includes: road traffic conditions at the time of the accident, driving sight distance, weather conditions, signs and markings, and lighting.
Further, in step 1.2, value analysis is used to retain the attribute variables whose non-empty share exceeds 70%.
Further, in step 1.4, the preset missing threshold is 30%.
Further, in step 1.5, accidents are classified as follows:
Property damage accident: causes damage to vehicles, goods or other property, possibly accompanied by slight injuries;
Injury accident: causes serious or minor injury to the parties, possibly accompanied by property loss;
Fatal accident: causes the death of a party, possibly accompanied by injuries and property loss.
Further, in steps 2.2, 2.3 and 2.4, when choosing each threshold, the readability of the rules should be ensured first, so that the number of rules stays within a readable range, while the validity of the rules should also be ensured, so that each rule contains as many attributes as possible.
Beneficial effects of the above technical solution:
The invention performs a universality analysis on the association rules obtained by a first round of mining on different data sets, extracts the rules with universal characteristics as meta-rules, and then, through a second round of mining, integrates the meta-rules across the data sets and outputs them in a cell pattern, yielding output rules composed of multi-element rules with universal characteristics. Compared with association rules obtained by conventional mining, the invention has the following advantages: (1) universal meta-rules reveal the association information hidden in conventional association rules and integrate it into the mining results; (2) association rules that are not universal across multiple districts are rejected, and valuable universal rules are retained; (3) the drawback of conventional association rules, whose large number of attributes makes them hard for decision makers to interpret, is avoided by emphasizing the logical relationships among the attributes in the antecedent and consequent, so that rules are expressed in a more readable way. The invention not only mines the causes of each type of traffic accident but also finds the associations among those causes, helping traffic management departments identify the most important and most critical factors to intervene on and improving the effectiveness of accident prevention.
Description of the drawings
Fig. 1 is a flow chart of the method of the invention.
Detailed description of the embodiments
The technical solution of the invention is described in detail below with reference to the accompanying drawing.
The invention proposes a traffic accident causation mining method based on universal meta-rules. As shown in Fig. 1, the specific steps are as follows.
Step 1: Data preparation
Step 1.1: Read in historical traffic accident information and divide it into five aspects of traffic accident causation information, namely basic accident information, information on the drivers involved, accident vehicle information, road condition information and environmental information, so that accident attributes are described from multiple angles:
1) Basic accident information, including accident type, accident severity level, casualties (number injured, number killed, number seriously injured, number with minor injuries), direct property loss, time of the accident and location of the accident. This information forms the basic description of the road traffic accident itself.
2) Information on the drivers involved. The driver is one of the important factors in an accident; this class includes basic information such as the driver's gender, age, occupation and driving experience, and ancillary information such as the driver's physical and psychological condition when the accident occurred and operating behavior before and after it.
3) Accident vehicle information, including vehicle type, vehicle safety condition and vehicle driving performance.
4) Road condition information. Road traffic accidents occur on both urban roads and highways; road condition information includes road grade, pavement type, pavement condition, safety facilities and alignment design.
5) Environmental information, including road traffic conditions at the time of the accident, driving sight distance, weather conditions, signs and markings, and lighting, all of which directly or indirectly affect traffic accidents.
Step 1.2: Perform data quality analysis on the traffic accident data. Using value analysis, include the attribute variables whose non-empty share exceeds 70% in the variable set of the data sample for the next round.
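The 70% screening rule in step 1.2 can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the record layout (a list of dictionaries), the attribute names, and the treatment of `None` and empty strings as empty values are all assumptions made for the example.

```python
def non_empty_share(records, attribute):
    """Fraction of records whose value for `attribute` is present."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(attribute) not in (None, ""))
    return filled / len(records)


def screen_attributes(records, attributes, min_share=0.70):
    """Keep the attributes whose non-empty share exceeds `min_share`."""
    return [a for a in attributes if non_empty_share(records, a) > min_share]


# Hypothetical accident records; attribute names are illustrative only.
records = [
    {"weather": "sunny", "road_grade": "urban", "driver_mood": None},
    {"weather": "rain", "road_grade": "urban", "driver_mood": ""},
    {"weather": "sunny", "road_grade": None, "driver_mood": "calm"},
    {"weather": "rain", "road_grade": "highway", "driver_mood": None},
]
kept = screen_attributes(records, ["weather", "road_grade", "driver_mood"])
print(kept)
```

Here `weather` (100% filled) and `road_grade` (75% filled) pass the screen, while `driver_mood` (25% filled) is dropped.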
Step 1.3: Perform attribute selection on the screened traffic accident data, removing attributes that may be irrelevant to the mining task or redundant. The goal of attribute selection is to find the minimal attribute set while keeping the probability distribution of the data set as close as possible to the original distribution obtained with all attributes. The benefit is that fewer attributes appear in the discovered patterns, which makes the patterns easier to understand.
Step 1.4: Clean the retained accident causation information, mainly by missing-value handling and noise filtering. Missing-value handling is performed first. Because the missing data describe individual accidents and there is no obvious correlation between accidents, such missing data cannot, in theory, be filled in by later analysis. The elimination method is therefore used: accident records in which more than 30% of the attributes of the five classes of causation information are missing are rejected, which improves data quality and mining value. Noise filtering is performed next, using an outlier detection algorithm based on statistical methods. Because traffic accident data describe independent individual events, with no obvious correlation between accidents and a high degree of randomness, such noisy data cannot, in theory, be smoothed by regression; the outliers that are detected are therefore deleted.
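The two cleaning passes of step 1.4 might look like the sketch below. The source only says that the outlier detector is "based on statistical methods"; the z-score test used here, the `z_max` cut-off, and all attribute names are assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def missing_degree(record, attributes):
    """Share of the listed attributes that are empty in one record."""
    missing = sum(1 for a in attributes if record.get(a) in (None, ""))
    return missing / len(attributes)


def drop_sparse_records(records, attributes, max_missing=0.30):
    """Elimination method: reject records missing more than `max_missing`."""
    return [r for r in records if missing_degree(r, attributes) <= max_missing]


def drop_outliers(records, attribute, z_max=3.0):
    """Delete records whose numeric value lies more than `z_max` standard
    deviations from the mean (an assumed statistical outlier test)."""
    values = [r[attribute] for r in records if r.get(attribute) is not None]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return list(records)
    return [
        r for r in records
        if r.get(attribute) is None or abs(r[attribute] - mu) / sigma <= z_max
    ]


# Hypothetical data: nine ordinary records, one extreme loss, one sparse record.
base = {"weather": "sunny", "road_grade": "urban", "vehicle_type": "car"}
records = [dict(base, loss=1000) for _ in range(9)]
records.append(dict(base, loss=50000))
records.append({"weather": None, "road_grade": None,
                "vehicle_type": "car", "loss": 1200})

attributes = ["weather", "road_grade", "vehicle_type", "loss"]
complete = drop_sparse_records(records, attributes)        # drops the sparse record
cleaned = drop_outliers(complete, "loss", z_max=2.5)       # drops the extreme loss
print(len(records), len(complete), len(cleaned))
```

Deleting rather than imputing or smoothing follows the text's argument that individual accidents are independent, so neighbouring records carry no information about a missing or anomalous value.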
Step 1.5: Transform the continuous attributes. So that classes in the subsequent mining process summarize the characteristics of each type of traffic accident and attention can be focused on particular classes for further analysis, the continuously distributed attributes are clustered. In addition, so that the mining results present the factors associated with each type of traffic accident intuitively, every accident is classified according to the road traffic accident classification standard, based on the number of deaths, minor injuries and serious injuries and the property loss in its record:
A) Property damage accident: causes damage to vehicles, goods or other property, and may be accompanied by slight injuries;
B) Injury accident: causes serious or minor injury to the parties, and may be accompanied by property loss;
C) Fatal accident: causes the death of a party, and may be accompanied by injuries and property loss.
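The three-way classification above can be expressed as a small function. This sketch assumes that each record carries separate counts of deaths, serious injuries and minor injuries, and that the minor-injury count excludes the slight injuries a property damage accident may involve; the function and label names are hypothetical.

```python
def classify_accident(deaths, serious_injuries, minor_injuries):
    """Assign an accident class from casualty counts (most severe class wins)."""
    if deaths > 0:
        return "fatal"
    if serious_injuries > 0 or minor_injuries > 0:
        return "injury"
    return "property_damage"


print(classify_accident(0, 0, 0))
print(classify_accident(0, 1, 2))
print(classify_accident(1, 0, 3))
```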
Step 2: Parameter selection
Step 2.1: Compute the support and confidence of a rule according to the following formulas.
The support of rule $R: X \Rightarrow Y$ in traffic data set $T$ is:
$$\mathrm{Support}(X \Rightarrow Y) = P(X \cup Y) \tag{1}$$
where $P(X \cup Y)$ is the proportion of accident records in $T$ that contain both $X$ and $Y$.
The confidence of rule $R: X \Rightarrow Y$ in traffic data set $T$ is:
$$\mathrm{Confidence}(X \Rightarrow Y) = P(Y \mid X) = \frac{P(X \cup Y)}{P(X)} \tag{2}$$
where $P(X)$ is the proportion of accident records in $T$ that contain $X$.
For rule $R: X \Rightarrow Y$, $X$ is called the antecedent and $Y$ the consequent of the rule. The support $\mathrm{Support}(X \Rightarrow Y)$ is the probability that accident cause $X$ and accident cause $Y$ occur together, and the confidence $\mathrm{Confidence}(X \Rightarrow Y)$ is the conditional probability that accident cause $Y$ occurs given that accident cause $X$ occurs. When the confidence of rule $R$ exceeds the preset threshold, the occurrence of $X$ is considered to induce the occurrence of $Y$; the higher the confidence, the closer the link between the two.
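Support and confidence as defined in step 2.1 can be computed directly from record counts. The sketch below assumes each accident record is represented as a set of attribute values; the item names are made up for the example.

```python
def count(transactions, itemset):
    """Number of records that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t)


def support(transactions, antecedent, consequent):
    """Share of records containing both X and Y, as in formula (1)."""
    both = set(antecedent) | set(consequent)
    return count(transactions, both) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Share of records containing X that also contain Y, as in formula (2)."""
    n_x = count(transactions, antecedent)
    if n_x == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return count(transactions, both) / n_x


# Hypothetical accident records as sets of attribute values.
T = [
    frozenset({"rain", "no_safe_distance", "property_damage"}),
    frozenset({"rain", "no_safe_distance", "injury"}),
    frozenset({"rain", "lane_change", "property_damage"}),
    frozenset({"sunny", "lane_change", "property_damage"}),
    frozenset({"rain", "no_safe_distance", "property_damage"}),
]
s = support(T, {"rain"}, {"no_safe_distance"})
c = confidence(T, {"rain"}, {"no_safe_distance"})
print(s, c)
```

In this toy data set, three of five records contain both items (support 0.6) and three of the four rainy records involve an unsafe following distance (confidence 0.75).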
Step 2.2: Select the minimum support threshold S: after separating the traffic data sets of the different administrative regions by accident class, compute the support of the rules in each region according to formula (1) and relate the number of association rules that meet the minimum support threshold to the threshold itself. By trying different minimum support thresholds, with the threshold on the abscissa and the number of rules that meet it on the ordinate, obtain a support-threshold trend chart for each accident class in each region and choose the minimum support threshold from it.
Step 2.3: Select the minimum confidence threshold C: for the traffic accident data sets of each accident type, compute the support and confidence of the rules in each region according to formulas (1) and (2), and compare the results under different support and confidence thresholds to obtain a bubble chart relating the distribution of rules that meet the thresholds to the threshold settings, so as to weigh the selection range of the support and confidence thresholds. The abscissa is the support threshold, the ordinate is the confidence threshold, and a larger bubble indicates more association rules.
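The data behind the trend chart of step 2.2 and the bubble chart of step 2.3 are simply rule counts per threshold setting. A minimal sketch, assuming the rules of one region have already been mined and are available as `(name, support, confidence)` tuples (a hypothetical layout with invented values):

```python
def count_rules(rules, min_support, min_confidence=0.0):
    """Number of rules that meet both thresholds."""
    return sum(1 for _, s, c in rules
               if s >= min_support and c >= min_confidence)


def threshold_grid(rules, supports, confidences):
    """Rule count for every (support, confidence) threshold pair;
    each count would be the bubble size at that grid point."""
    return {(s, c): count_rules(rules, s, c)
            for s in supports for c in confidences}


# Hypothetical mined rules: (name, support, confidence).
rules = [
    ("r1", 0.45, 0.90),
    ("r2", 0.35, 0.75),
    ("r3", 0.32, 0.65),
    ("r4", 0.20, 0.95),
    ("r5", 0.10, 0.50),
]
supports = [0.1, 0.2, 0.3]
confidences = [0.6, 0.7, 0.8]

trend = [count_rules(rules, s) for s in supports]   # step 2.2 curve
grid = threshold_grid(rules, supports, confidences)  # step 2.3 bubbles
print(trend)
print(grid[(0.3, 0.7)])
```

Plotting `trend` against `supports`, or `grid` as bubbles, is left out to keep the sketch free of plotting dependencies.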
Step 2.4: Select the frequent index threshold F: across different traffic data sets, the index used to screen meta-rules with universal characteristics is the frequent index. From the association rules mined separately from each data set, build a frequent index table of association rules over the multiple data sets; the association rules that meet the frequent index threshold are the universal meta-rules. Each data set is mined with the same support and confidence thresholds, and the Boolean values 1 and 0 indicate that a rule is present or absent, respectively. The frequent index of rule $R_i$ is defined as:
$$F(R_i) = \frac{1}{n}\sum_{j=1}^{n} p_{ij}$$
where $p_{ij}$ is the indicator of rule $R_i$ in data set $T_j$: $p_{ij}$ is 1 if rule $R_i$ is present in data set $T_j$ and 0 otherwise, and $n$ is the number of data sets.
To obtain meta-rules that are universal across multiple regions while ensuring that they remain meaningful for analysis, screen the association rules of the different types of traffic accident data sets in each region and pick out the rules that recur across regions; then, with the frequent index threshold on the abscissa and the number of strong association rules on the ordinate, obtain a regional association trend chart of the rules for each accident class and choose the frequent index threshold from it.
When choosing each threshold, the readability of the rules should be ensured first, so that the number of rules stays within a readable range (generally 200 or fewer), while the validity of the rules should also be ensured, so that each rule contains as many attributes as possible.
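The frequent index of step 2.4 is the share of data sets in which a rule appears, so screening for universal meta-rules reduces to a membership count. The sketch below assumes each data set's mined rules are held as a set of `(antecedent, consequent)` pairs; the rules and district names are invented for illustration.

```python
def frequent_index(rule, rule_sets):
    """Share of data sets whose mined rule set contains `rule`
    (the mean of the Boolean indicators p_ij over the n data sets)."""
    return sum(1 for rs in rule_sets if rule in rs) / len(rule_sets)


def universal_meta_rules(rule_sets, min_frequent_index):
    """Rules whose frequent index meets the threshold F."""
    all_rules = set().union(*rule_sets)
    return {r for r in all_rules
            if frequent_index(r, rule_sets) >= min_frequent_index}


# Hypothetical rules mined from three districts with identical thresholds.
r_a = (frozenset({"rain"}), frozenset({"no_safe_distance"}))
r_b = (frozenset({"sunny"}), frozenset({"lane_change"}))
r_c = (frozenset({"night"}), frozenset({"speeding"}))
district_1 = {r_a, r_b}
district_2 = {r_a, r_c}
district_3 = {r_a, r_b}
rule_sets = [district_1, district_2, district_3]

meta_rules = universal_meta_rules(rule_sets, min_frequent_index=0.55)
print(len(meta_rules))
```

With F ≥ 55% and three data sets, a rule must appear in at least two of them, so `r_a` (3 of 3) and `r_b` (2 of 3) are kept and `r_c` (1 of 3) is rejected.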
Step 3: Association rule mining based on meta-rules
Step 3.1: For multiple data sets T1, T2, …, Ti of identical format, set the same minimum support S and minimum confidence C and perform a first round of association rule mining to obtain the corresponding association rules R1, R2, …, Ri.
Step 3.2: According to the frequent index of each association rule across the data sets, screen by the frequent index threshold F, extract the meta-rules and build the meta-rule set.
Step 3.3: Combine the meta-rule set with the data sets T1, T2, …, Ti for a second round of mining, integrating the meta-rules across the data sets.
Step 3.4: Association rule generation. Output the strong association rules according to the minimum support S and minimum confidence C, and output the meta-rules of the factors associated with each traffic accident type, obtaining output rules that are expressed in a cell pattern and composed of multi-element rules with universal characteristics. The rules follow a template of the form $[P_1 \wedge \dots \wedge P_i \Rightarrow P_j \wedge \dots \wedge P_k] \Rightarrow Q(Y)$, which states that the occurrence of accident causes $P_1, \dots, P_i, P_j, \dots, P_k$ induces the occurrence of accident cause $Q(Y)$ and that, among these causes, the occurrence of $P_1, \dots, P_i$ induces the occurrence of $P_j, \dots, P_k$. The cell pattern is chosen as the output pattern mainly for its wrapping character: an output rule can contain both meta-rules and single attributes, which together form the rule, so that the output rule carries more complete information and is easier to analyze. The association rules mined from the different data sets are screened and then presented, as groups of cells, in the antecedent and consequent of a rule, covering three forms: attribute with attribute, attribute with rule, and rule with rule. In practice, by taking the association rules among influencing factors into account, controlling only a few influencing factors is enough to help prevent traffic accidents.
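One way to picture the second round of mining and the cell pattern is to treat each meta-rule as a composite item (a "cell") that is added to every record satisfying it, and then to mine rules whose antecedent may contain cells. The source does not spell out this mechanism, so the sketch below is only one possible reading; the `("cell", antecedent, consequent)` encoding and all item names are assumptions.

```python
def add_cells(transactions, meta_rules):
    """Add a composite 'cell' item to each record that contains
    every item of a meta-rule's antecedent and consequent."""
    enriched = []
    for t in transactions:
        cells = {("cell", a, c) for a, c in meta_rules if (a | c) <= t}
        enriched.append(frozenset(t) | cells)
    return enriched


# Hypothetical records and one meta-rule: rain => no_safe_distance.
T = [
    frozenset({"rain", "no_safe_distance", "property_damage"}),
    frozenset({"rain", "no_safe_distance", "property_damage"}),
    frozenset({"rain", "no_safe_distance", "injury"}),
    frozenset({"sunny", "lane_change", "property_damage"}),
]
meta = (frozenset({"rain"}), frozenset({"no_safe_distance"}))
cell = ("cell", meta[0], meta[1])

enriched = add_cells(T, [meta])

# Output rule in cell form: [rain => no_safe_distance] => property_damage
n_cell = sum(1 for t in enriched if cell in t)
n_both = sum(1 for t in enriched if cell in t and "property_damage" in t)
rule_support = n_both / len(enriched)
rule_confidence = n_both / n_cell
print(rule_support, rule_confidence)
```

Because the cell is an ordinary hashable item, the same support and confidence thresholds S and C can be applied to rules that mix cells with single attributes.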
Step 4:Rule analysis
Based on the strong association rules, composed of multi-element rules, that step 3 generates for the different accident types, the relevance between accident causes is analyzed qualitatively and quantitatively to provide a reference basis for decision makers.
For example, taking Shenzhen traffic accident data for 2014-2016 as the research object, the causes of each traffic accident type in Shenzhen are mined and analyzed. In this embodiment, the thresholds for the minimum support S, the minimum confidence C, and the frequent index F of the association rules are set to S ≥ 30%, C ≥ 70%, and F ≥ 55%, respectively, giving the association rule results for each traffic accident type shown in the table below:
Table 1. Association rule results for Shenzhen traffic accidents (partial)
Analysis of the association rule results supports the following suggestions. In terms of weather, when the weather is sunny, drivers are more likely to cause accidents by changing lanes at will; when it is raining, failing to keep a safe distance from the vehicle in front and dangerous driving become the main traffic behaviors leading to accidents, with accidents concentrated in the 17:00-19:59 period and in Bao'an District. It is therefore advisable to start from weather, time, and place, and to broadcast traffic reminders during the relevant periods under specific weather conditions, so as to strengthen drivers' safety awareness. In terms of drivers, those aged 19 to 23 and 30 to 35 are the high-incidence groups for class 1 accidents (property loss accidents), but their associated features differ. Drivers aged 19 to 23 are highly likely to engage in the driving behavior numbered 1225, that is, other behavior that interferes with safe driving; this kind of behavior is highly likely to lead to a class 1 accident, and imperfect signs and markings are its main cause. For drivers aged 30 to 35, the main violation behind class 1 accidents is number 1094, that is, failing to keep a safe distance from the vehicle in front, and the accidents occur mainly on ordinary urban roads. Since drivers aged 19 to 23 are mostly novices who have just graduated from driving school, it is suggested that driving schools strengthen trainees' education in traffic safety awareness and behavior, and that management pay attention to differences between driver age groups, with emphasis on novice drivers. Drivers aged 24 to 29 have mostly accumulated only 4 to 6 years of driving experience, the period in which safety awareness is weakest. It is therefore suggested that traffic safety education, reinforced with real cases, be given when the first driver's license expires and is renewed. Because the audience is large, the education can be delivered through online quizzes, online videos, and similar means, and passing the safety education can be made a condition for renewing the first driver's license.
The embodiment merely illustrates the technical idea of the present invention and does not limit its scope of protection; any change made on the basis of the technical solution in accordance with the technical idea proposed by the present invention falls within the scope of protection of the present invention.
Claims (6)
1. A traffic accident cause mining method based on universality meta-rules, characterized by comprising the following steps:
Step 1: data preparation
Step 1.1: Read historical traffic accident information and classify it into 5 categories of traffic accident cause information, namely basic accident information, information on the drivers involved, accident vehicle information, road condition information, and environmental information, each category being described by multiple attributes;
Step 1.2: Perform data quality analysis on the traffic accident information read, and retain the attribute variables of acceptable quality;
Step 1.3: Perform attribute selection on the screened traffic accident information, removing attributes that are irrelevant to the mining task or redundant; the goal of attribute selection is to find the minimal attribute set while keeping the probability distribution of the data set as close as possible to the original distribution obtained with all attributes;
Step 1.4: Perform data cleansing, comprising missing-value processing and noise filtering, on the traffic accident information obtained in step 1.3; missing-value processing uses the elimination method, removing information whose degree of attribute missingness across the 5 categories of traffic accident cause information exceeds a preset missing threshold; noise filtering uses an outlier detection algorithm based on statistical methods to identify outliers in the data and delete them;
Step 1.5: Perform clustering on continuously distributed attributes and, at the same time, classify each accident according to the road traffic accident classification standard;
Step 2: parameter selection
Step 2.1: Compute the support and confidence of a rule as follows:
The support of rule $R: X \Rightarrow Y$ in traffic data set $T$ is
$$\mathrm{Support}(R) = P(X \cup Y) = \frac{\mathrm{count}(X \cup Y)}{|T|} \tag{1}$$
where $\mathrm{count}(X \cup Y)$ is the number of records in $T$ that contain both $X$ and $Y$, and $|T|$ is the total number of records in $T$;
The confidence of rule $R: X \Rightarrow Y$ in traffic data set $T$ is
$$\mathrm{Confidence}(R) = P(Y \mid X) = \frac{\mathrm{count}(X \cup Y)}{\mathrm{count}(X)} \tag{2}$$
where $\mathrm{count}(X)$ is the number of records in $T$ that contain $X$;
For rule $R: X \Rightarrow Y$, X is called the antecedent of the rule and Y the consequent; the support $\mathrm{Support}(R)$ of rule R is the probability that accident cause X and accident cause Y occur together, and the confidence $\mathrm{Confidence}(R)$ of rule R is the conditional probability that accident cause Y occurs given that accident cause X occurs; when the confidence of rule R exceeds a preset threshold, the occurrence of event X is considered to induce the occurrence of event Y, and the greater the confidence, the closer the relation between the two;
Step 2.2: Select the minimum support threshold S: after the traffic data sets of the different administrative regions are partitioned by accident class, compute the support of the rules of each region according to formula (1) to obtain the relationship between the number of association rules meeting the minimum support threshold and that threshold; by choosing different minimum support thresholds, with the minimum support threshold as abscissa and the number of association rules meeting it as ordinate, obtain a support threshold selection trend chart for each accident class in each region, and select the minimum support threshold from it;
Step 2.3: Select the minimum confidence threshold C: for the traffic accident data sets of the different types, compute the support and confidence of the rules of the different regions according to formulas (1) and (2), and set different support and confidence thresholds for comparative analysis, obtaining a bubble chart that relates the distribution of rules meeting the threshold conditions to the threshold settings, so as to weigh the selection range of the support and confidence thresholds, where the abscissa corresponds to the support threshold, the ordinate corresponds to the confidence threshold, and a larger bubble value indicates a larger number of association rules;
Step 2.4: Select the frequent index threshold F: across different traffic data sets, the index used to screen universality meta-rules is the frequent index; from the association rules mined separately from the different data sets, a multi-data-set association rule frequent index table is built, and the association rules meeting the frequent index threshold are taken as universality meta-rules, where every data set uses the same support and confidence thresholds when mining association rules, and the Boolean values 1 and 0 denote that a rule exists and does not exist, respectively; the frequent index of rule $R_i$ is defined as
$$F(R_i) = \frac{1}{n} \sum_{j=1}^{n} p_{ij}$$
where $p_{ij}$ is the indicator value of rule $R_i$ in data set $T_j$: $p_{ij} = 1$ if rule $R_i$ exists in data set $T_j$, otherwise $p_{ij} = 0$; $n$ is the number of data sets;
To obtain meta-rules that are universal across multiple regions while ensuring that the resulting universality meta-rules are meaningful for analysis, association screening is performed on the traffic accident data sets of the different types in each region to filter out the association rules that recur across regions; with the frequent index threshold as abscissa and the number of strong association rules as ordinate, a regional association trend chart of the association rules for each accident class is obtained, and the frequent index threshold is selected from it;
Step 3: association rule mining based on meta-rules
Step 3.1: Set a consistent minimum support S and minimum confidence C for multiple data sets T1, T2, …, Ti of the same format, and perform association rule mining once to obtain the corresponding association rules R1, R2, …, Ri;
Step 3.2: Compute the frequent index of each association rule across the different data sets, screen the rules against the frequent index threshold F, extract the meta-rules, and build a meta-rule set;
Step 3.3: Combine the meta-rule set with the data sets T1, T2, …, Ti and mine again, integrating the meta-rules found across the multiple data sets;
Step 3.4: Output strong association rules according to the minimum support S and the minimum confidence C, and output the meta-rules built on the factors associated with each traffic accident type, obtaining output rules that are produced in cell mode and composed of multi-element rules with universal features;
Step 4: rule analysis
Based on the strong association rules, composed of multi-element rules, generated for the different accident types, analyze the relevance between accident causes qualitatively and quantitatively to provide a reference basis for decision makers.
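Formulas (1) and (2) and the threshold sweep of step 2.2 can be illustrated with a short Python sketch. It rests on several assumptions: the accident records are toy data with hypothetical attribute labels, only rules with a single attribute on each side are enumerated, and the brute-force enumeration stands in for whatever mining algorithm is actually used, which the text does not name.

```python
from itertools import permutations
from typing import Dict, FrozenSet, List, Tuple

Record = FrozenSet[str]


def support(itemset: FrozenSet[str], records: List[Record]) -> float:
    """Formula (1): share of records that contain every item in `itemset`."""
    return sum(itemset <= r for r in records) / len(records)


def confidence(x: FrozenSet[str], y: FrozenSet[str], records: List[Record]) -> float:
    """Formula (2): support of X and Y together divided by the support of X."""
    sx = support(x, records)
    return support(x | y, records) / sx if sx else 0.0


def mine_pair_rules(records: List[Record], s_min: float, c_min: float) -> List[Tuple[str, str]]:
    """Enumerate single-attribute rules a => b that meet both thresholds."""
    items = sorted(set().union(*records))
    rules = []
    for a, b in permutations(items, 2):
        x, y = frozenset([a]), frozenset([b])
        if support(x | y, records) >= s_min and confidence(x, y, records) >= c_min:
            rules.append((a, b))
    return rules


def rule_count_by_support(records: List[Record], thresholds: List[float], c_min: float) -> Dict[float, int]:
    """Number of qualifying rules per support threshold (data for the trend chart)."""
    return {s: len(mine_pair_rules(records, s, c_min)) for s in thresholds}


# Toy accident records; every label is hypothetical.
records = [
    frozenset({"rain", "tailgating", "evening"}),
    frozenset({"rain", "tailgating"}),
    frozenset({"sunny", "lane_change"}),
    frozenset({"rain", "evening"}),
    frozenset({"sunny", "lane_change", "evening"}),
]

counts = rule_count_by_support(records, [0.2, 0.4, 0.6], c_min=0.7)
```

Plotting `counts` with the threshold on the horizontal axis and the rule count on the vertical axis gives the kind of trend chart step 2.2 describes; repeating the sweep over confidence thresholds as well would give the grid behind the bubble chart of step 2.3.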
2. The traffic accident cause mining method based on universality meta-rules according to claim 1, characterized in that, in step 1.1, the attributes contained in each of the 5 categories of traffic accident cause information are as follows:
Basic accident information includes: accident type, accident severity level, number of casualties, direct property loss, accident time, and accident location;
Information on the drivers involved includes: the driver's gender, age, occupation, and driving experience, the driver's physical and psychological condition at the time of the accident, and the driver's operating behavior before and after the accident;
Accident vehicle information includes: vehicle type, vehicle safety condition, and vehicle driving performance;
Road condition information includes: road class, road surface type, pavement condition, safety facilities, and alignment design;
Environmental information includes: road traffic conditions at the time of the accident, driving sight distance, weather conditions, signs and markings, and lighting.
3. The traffic accident cause mining method based on universality meta-rules according to claim 1, characterized in that, in step 1.2, a value analysis method is used to retain the attribute variables whose non-null share exceeds 70%.
4. The traffic accident cause mining method based on universality meta-rules according to claim 1, characterized in that, in step 1.4, the preset missing threshold is 30%.
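The 70% and 30% thresholds of claims 3 and 4, together with the noise filtering of step 1.4, can be sketched in Python as below. Several points are assumptions rather than statements of the claimed method: the missing threshold is applied per record, the statistical outlier test is a z-score test with a cutoff of 3, and the function names and sample fields are hypothetical.

```python
from statistics import mean, pstdev
from typing import Any, Dict, List

Record = Dict[str, Any]


def keep_attributes(records: List[Record], min_non_null: float = 0.70) -> List[str]:
    """Attributes whose share of non-null values exceeds min_non_null."""
    attrs = {key for r in records for key in r}
    n = len(records)
    return sorted(a for a in attrs
                  if sum(r.get(a) is not None for r in records) / n > min_non_null)


def drop_sparse_records(records: List[Record], attrs: List[str],
                        max_missing: float = 0.30) -> List[Record]:
    """Remove records whose share of missing attributes exceeds max_missing."""
    return [r for r in records
            if sum(r.get(a) is None for a in attrs) / len(attrs) <= max_missing]


def drop_outliers(records: List[Record], attr: str, z_max: float = 3.0) -> List[Record]:
    """Remove records whose value of `attr` lies more than z_max deviations from the mean."""
    values = [r[attr] for r in records if r.get(attr) is not None]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return list(records)
    return [r for r in records
            if r.get(attr) is None or abs(r[attr] - mu) / sigma <= z_max]


# Hypothetical sample: "note" is mostly empty, and the third record is mostly empty.
sample = [
    {"age": 25, "weather": "rain", "road": "urban", "note": None},
    {"age": 31, "weather": "sunny", "road": "urban", "note": None},
    {"age": 22, "weather": None, "road": None, "note": None},
    {"age": 40, "weather": "rain", "road": "urban", "note": "x"},
    {"age": 35, "weather": "sunny", "road": "highway", "note": None},
]

kept_attrs = keep_attributes(sample)
cleaned = drop_sparse_records(sample, kept_attrs)
```

Filtering attributes before records matters here: once the sparse `note` column is gone, only the third record still exceeds the missing threshold.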
5. The traffic accident cause mining method based on universality meta-rules according to claim 1, characterized in that, in step 1.5, accidents are classified as follows:
Property loss accident: causes damage to vehicles, cargo, or other property, possibly accompanied by minor injuries to persons;
Injury accident: causes serious or light injury to a party, possibly accompanied by property loss;
Fatal accident: causes the death of a party, possibly accompanied by injuries to persons or property loss.
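The three-way classification of claim 5 can be expressed as a small Python function. This is only a sketch: the function name and its two count parameters are hypothetical, and the precedence (death first, then serious or light injury, then property loss) is an assumption read off the wording that each class may be accompanied by the less severe outcomes.

```python
def classify_accident(deaths: int, serious_or_light_injuries: int) -> str:
    """Return the accident class; minor injuries alone do not change the class."""
    if deaths > 0:
        return "fatal accident"
    if serious_or_light_injuries > 0:
        return "injury accident"
    return "property loss accident"


# A crash with one light injury and no deaths is labelled as an injury accident.
label = classify_accident(deaths=0, serious_or_light_injuries=1)
```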
6. The traffic accident cause mining method based on universality meta-rules according to claim 1, characterized in that, in steps 2.2, 2.3, and 2.4, when each threshold is selected, the readability of the rules should be ensured first, so that the number of rules stays within a readable range, while the validity of the rules should also be ensured, so that the rules contain as many attributes as possible.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810781739.9A CN108717786B (en) | 2018-07-17 | 2018-07-17 | Traffic accident cause mining method based on universality meta-rule |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108717786A true CN108717786A (en) | 2018-10-30 |
CN108717786B CN108717786B (en) | 2022-06-17 |
Family
ID=63914019
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810781739.9A Active CN108717786B (en) | 2018-07-17 | 2018-07-17 | Traffic accident cause mining method based on universality meta-rule |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108717786B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN108717786B (en) | 2022-06-17 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||