CN114677254A - Truck accident identification method, device, storage medium and program product
- Publication number
- CN114677254A, CN202210262059.2A
- Authority
- CN
- China
- Prior art keywords
- vehicle
- information
- truck
- accident
- parking
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/40—Business processes related to the transportation industry
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/24323—Tree-organised classifiers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Abstract
The application relates to a truck accident identification method, device, storage medium and program product. The method comprises: acquiring overall data, wherein the overall data comprises basic attribute information of a vehicle, driving behavior information of a driver, vehicle operation track information, road information, holiday information and weather information; filtering, associating and stratified-sampling the overall data to obtain a sample data set; integrating the sample data set, determining a plurality of topics, and calculating feature dimensions under the topics; training an isolation forest model using the feature dimensions under the plurality of topics; and inputting truck data to be identified into the trained isolation forest model to determine whether the truck has had an accident. The method takes into account the extreme imbalance of truck accident data and does not require labeled samples, which reduces the cost of data set construction; stratified sampling keeps the sample distribution consistent with that of the overall data, and the trained isolation forest model achieves a good recognition effect.
Description
Technical Field
The present application relates to the field of intelligent identification technologies, and more particularly, to a truck accident identification method, apparatus, storage medium, and program product.
Background
Existing accident recognition technologies mostly train recognition models with supervised learning algorithms. However, the probability of an accident, particularly a major accident, during truck operation is extremely low, so the resulting data are extremely imbalanced and do not satisfy the conditions under which those supervised learning algorithms work well. This leads to overfitting and inaccurate predictions.
In addition, the supervised learning algorithms used in the prior art depend on labeled data, but collecting a sufficient number of truck accident samples is difficult, and constructing the data set is costly.
Disclosure of Invention
In view of the above technical problems, the invention aims to acquire overall data; filter, associate and stratified-sample the overall data to obtain a sample data set; and train an isolation forest model using the sample data set, so that truck data to be identified can be identified by the isolation forest model.
A first aspect of the invention provides a truck accident identification method, which comprises the following steps:
acquiring overall data, wherein the overall data comprises basic attribute information of a vehicle, driving behavior information of a driver, vehicle operation track information, road information, holiday information and weather information;
filtering, associating and stratified-sampling the overall data to obtain a sample data set;
integrating the sample data set, determining a plurality of topics, and calculating feature dimensions under the topics;
training an isolation forest model using the feature dimensions under the plurality of topics;
and inputting truck data to be identified into the trained isolation forest model to determine whether the truck has had an accident.
In some embodiments of the invention, the plurality of topics comprises: basic attribute information of the vehicle, driving behavior information of the driver, collision attribute information of the vehicle, parking attribute information of the vehicle, alarm attribute information of the vehicle, attribute information of the road associated with the vehicle, information on vehicles parked near the vehicle, holiday information for the vehicle's place of operation, and weather information for the vehicle's place of operation.
In some embodiments of the present invention, filtering, associating and stratified-sampling the overall data to obtain a sample data set includes:
sorting the vehicle operation track information by time, and filtering out parking behaviors that may involve an accident based on adjacent track point information;
associating the basic attribute information of the vehicle with the vehicle operation track information through the vehicle ID, and obtaining the proportion of each vehicle type by grouping and aggregation;
and sampling, without replacement, the parking behaviors that may involve an accident according to the proportion of each vehicle type, to obtain the sample data set.
In some embodiments of the present invention, sorting the vehicle operation track information by time and filtering out parking behaviors that may involve an accident based on adjacent track point information includes:
sorting the vehicle operation track information by time;
for any vehicle, if the speeds at two adjacent track points are both 0 and the time difference between the two adjacent track points is within a preset time range, taking the earlier of the two adjacent track points as a starting stop point;
if the subsequent stop points of the vehicle are within a first preset distance of the starting stop point and this state lasts longer than a preset duration, determining whether the vehicle is stopped at, or within a second preset distance of, a gas station, a service area, an expressway parking strip, an inspection station or a parking lot;
and if the vehicle is neither stopped at nor within the second preset distance of a gas station, a service area, an expressway parking strip, an inspection station or a parking lot, determining that the vehicle exhibits a parking behavior that may involve an accident.
In some embodiments of the invention, calculating feature dimensions under the plurality of topics comprises: calculating the acceleration of the vehicle, the average speed of the vehicle, the speed standard deviation of the vehicle, the number of sudden accelerations per hundred kilometers, the number of sudden decelerations per hundred kilometers, the speeding mileage per hundred kilometers, the daily fatigue driving duration, the number of dangerous road sections passed per hundred kilometers, the number of turn-signal uses per hundred kilometers, the number of braking events per hundred kilometers, the number of vehicles parked nearby before the stop, the number of vehicles parked nearby after the stop, the average speed of nearby vehicles before the stop and the average speed of nearby vehicles after the stop.
In some embodiments of the invention, the expression of the isolation forest model is:
s(x, n) = 2^{-h(x)/c(n)}
where s(x, n) represents the anomaly score of record x with respect to iTrees built from training data of n samples, h(x) represents the average path length between the root node and the leaf node that holds x, and c(n) represents the average path length for a given number of samples n.
In some embodiments of the invention, the method further comprises: comparing the ratio of the number of abnormal samples to the total number of samples with the prior probability of an accident, and selecting the anomaly rate closest to the prior accident probability as the output result of the isolation forest model.
A second aspect of the present invention provides a truck accident recognition apparatus, the apparatus comprising:
an acquisition module, used for acquiring overall data, wherein the overall data comprises basic attribute information of a vehicle, driving behavior information of a driver, vehicle operation track information, road information, holiday information and weather information;
a sample obtaining module, used for filtering, associating and stratified-sampling the overall data to obtain a sample data set;
an integration module, used for integrating the sample data set, determining a plurality of topics and calculating feature dimensions under the topics;
a training module, used for training an isolation forest model using the feature dimensions under the plurality of topics;
and an identification module, used for inputting truck data to be identified into the trained isolation forest model to determine whether the truck has had an accident.
A third aspect of the invention provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of:
acquiring overall data, wherein the overall data comprises basic attribute information of a vehicle, driving behavior information of a driver, vehicle operation track information, road information, holiday information and weather information;
filtering, associating and stratified-sampling the overall data to obtain a sample data set;
integrating the sample data set, determining a plurality of topics, and calculating feature dimensions under the topics;
training an isolation forest model using the feature dimensions under the plurality of topics;
and inputting truck data to be identified into the trained isolation forest model to determine whether the truck has had an accident.
A fourth aspect of the invention provides a computer program product comprising a computer program which, when executed by a processor, performs the steps of:
acquiring overall data, wherein the overall data comprises basic attribute information of a vehicle, driving behavior information of a driver, vehicle operation track information, road information, holiday information and weather information;
filtering, associating and stratified-sampling the overall data to obtain a sample data set;
integrating the sample data set, determining a plurality of topics, and calculating feature dimensions under the topics;
training an isolation forest model using the feature dimensions under the plurality of topics;
and inputting truck data to be identified into the trained isolation forest model to determine whether the truck has had an accident.
The beneficial effects of this application are as follows: the application acquires overall data; filters, associates and stratified-samples the overall data to obtain a sample data set; trains an isolation forest model using feature dimensions under a plurality of topics; and inputs truck data to be identified into the trained isolation forest model to determine whether the truck has had an accident. The extreme imbalance of truck accident data is fully taken into account, and no labeled samples need to be collected, which reduces the cost of data set construction. Samples are drawn from the overall data set by stratified sampling, which keeps their distribution consistent with that of the overall data. The trained isolation forest model achieves a good recognition effect without requiring a large amount of sample data, which improves the efficiency of truck accident identification.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description, serve to explain the principles of the application.
The present application may be more clearly understood from the following detailed description with reference to the accompanying drawings, in which:
FIG. 1 illustrates a schematic diagram of method steps of an exemplary embodiment of the present application;
FIG. 2 illustrates a schematic diagram of an apparatus according to an exemplary embodiment of the present application;
FIG. 3 illustrates a schematic structural diagram of a computer device provided by an exemplary embodiment of the present application;
FIG. 4 illustrates a schematic diagram of a storage medium provided by an exemplary embodiment of the present application.
Detailed Description
Hereinafter, embodiments of the present application will be described with reference to the accompanying drawings. It should be understood that the description is intended to be exemplary only, and is not intended to limit the scope of the present application. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present application. It will be apparent to one skilled in the art that the present application may be practiced without one or more of these details. In other instances, well-known features of the art have not been described in order to avoid obscuring the present application.
It should be noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments in accordance with the application. As used herein, the singular is intended to include the plural unless the context clearly dictates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Exemplary embodiments according to the present application will now be described in more detail with reference to the accompanying drawings. These exemplary embodiments may, however, be embodied in many different forms and should not be construed as limited to only the embodiments set forth herein. The figures are not drawn to scale, wherein certain details may be exaggerated and omitted for clarity. The shapes of various regions, layers, and relative sizes and positional relationships therebetween shown in the drawings are merely exemplary, and deviations may occur in practice due to manufacturing tolerances or technical limitations, and a person skilled in the art may additionally design regions/layers having different shapes, sizes, relative positions, as actually required.
Several examples are given below in conjunction with the description of figures 1-4 to describe exemplary embodiments according to the present application. It should be noted that the following application scenarios are merely illustrated for the convenience of understanding the spirit and principles of the present application, and the embodiments of the present application are not limited in this respect. Rather, embodiments of the present application may be applied to any scenario where applicable.
At present, the prior art mostly trains recognition models with supervised learning algorithms, which raises several problems. First, the probability of an accident, especially a major accident, during truck operation is extremely low, so the resulting data are extremely imbalanced and do not satisfy the conditions under which those supervised learning algorithms work well; this leads to overfitting and inaccurate predictions. Second, the supervised learning algorithms used in the prior art depend on labeled data, but collecting a sufficient number of truck accident samples is difficult, and constructing the data set is costly. Third, the amount of collected sample data is small, so the data are biased and their distribution is inconsistent with that of the overall data; the data set is therefore poorly representative, which harms the final accuracy of the model. Finally, the prior art considers only the dynamic attribute information of the vehicle and ignores the influence of other relevant factors on the model, such as the static attribute information of the vehicle, the driving behavior information of the driver, and holiday and weather information.
Accordingly, in some exemplary embodiments of the present application, there is provided a truck accident identification method implemented based on an isolation forest model. As shown in FIG. 1, the method includes:
S1, acquiring overall data, wherein the overall data comprises basic attribute information of the vehicle, driving behavior information of the driver, vehicle operation track information, road information, holiday information and weather information;
S2, filtering, associating and stratified-sampling the overall data to obtain a sample data set;
S3, integrating the sample data set, determining a plurality of topics, and calculating feature dimensions under the topics;
S4, training an isolation forest model using the feature dimensions under the plurality of topics;
and S5, inputting truck data to be identified into the trained isolation forest model to determine whether the truck has had an accident.
In a specific implementation, filtering, associating and stratified-sampling the overall data in S2 to obtain a sample data set includes: sorting the vehicle operation track information by time, and filtering out parking behaviors that may involve an accident based on adjacent track point information; associating the basic attribute information of the vehicle with the vehicle operation track information through the vehicle ID, and obtaining the proportion of each vehicle type by grouping and aggregation; and sampling, without replacement, the parking behaviors that may involve an accident according to the proportion of each vehicle type, to obtain the sample data set.
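The association and proportional sampling in S2 can be sketched as follows. This is a minimal illustration and not the applicant's implementation: the function name `stratified_sample`, the dictionary layout of a stop record, and the rounding rule for each stratum are assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_sample(stops, vehicle_types, sample_size, seed=0):
    """Sample candidate stops without replacement, keeping vehicle-type proportions.

    stops: list of dicts, each with a "vehicle_id" key (hypothetical layout).
    vehicle_types: dict mapping vehicle_id -> vehicle type (basic attribute data).
    """
    rng = random.Random(seed)

    # Associate each stop with its vehicle type through the vehicle ID.
    groups = defaultdict(list)
    for stop in stops:
        vtype = vehicle_types.get(stop["vehicle_id"])
        if vtype is not None:
            groups[vtype].append(stop)

    # Group-and-aggregate: the share of each vehicle type among the stops.
    total = sum(len(members) for members in groups.values())

    sample = []
    for vtype, members in groups.items():
        share = len(members) / total
        # Rounding per stratum is an assumption; the final size may differ
        # slightly from sample_size when shares do not divide evenly.
        k = min(len(members), round(share * sample_size))
        sample.extend(rng.sample(members, k))  # sampling without replacement
    return sample
```

Because each stratum is sampled with `random.Random.sample`, no stop can appear twice, which matches the sampling-without-replacement requirement.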
In some preferred embodiments, sorting the vehicle operation track information by time and filtering out parking behaviors that may involve an accident based on adjacent track point information includes: sorting the vehicle operation track information by time; for any vehicle, if the speeds at two adjacent track points are both 0 and the time difference between the two adjacent track points is within a preset time range, taking the earlier of the two adjacent track points as a starting stop point; if the subsequent stop points of the vehicle are within a first preset distance of the starting stop point and this state lasts longer than a preset duration, determining whether the vehicle is stopped at, or within a second preset distance of, a gas station, a service area, an expressway parking strip, an inspection station or a parking lot; and if the vehicle is neither stopped at nor within the second preset distance of any such place, determining that the vehicle exhibits a parking behavior that may involve an accident.
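The stop-filtering rule above can be sketched as follows. The function names, the tuple layout of a track point, and all default thresholds (`max_gap_s`, `stay_radius_m`, `min_stop_s`, `poi_radius_m`) are hypothetical; the application only states that such preset values exist. Places such as gas stations and service areas are represented simply as a list of coordinates.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def find_suspicious_stops(points, pois, max_gap_s=300, stay_radius_m=50.0,
                          min_stop_s=1800, poi_radius_m=200.0):
    """Return stops of one vehicle that may involve an accident.

    points: list of (t_seconds, lat, lon, speed) tuples for a single vehicle.
    pois: list of (lat, lon) for gas stations, service areas, parking lots, etc.
    """
    points = sorted(points)  # sort track points by time
    n = len(points)
    stops = []
    i = 0
    while i < n - 1:
        t0, lat0, lon0, v0 = points[i]
        t1, _, _, v1 = points[i + 1]
        j = i
        # Two adjacent zero-speed points close in time: points[i] starts a stop.
        if v0 == 0 and v1 == 0 and t1 - t0 <= max_gap_s:
            # Extend the stop while the vehicle stays near the starting point.
            while (j + 1 < n and points[j + 1][3] == 0 and
                   haversine_m(lat0, lon0, points[j + 1][1], points[j + 1][2])
                   <= stay_radius_m):
                j += 1
            if points[j][0] - t0 > min_stop_s:
                near_poi = any(
                    haversine_m(lat0, lon0, plat, plon) <= poi_radius_m
                    for plat, plon in pois
                )
                if not near_poi:
                    stops.append({"start": t0, "end": points[j][0],
                                  "lat": lat0, "lon": lon0})
        i = j + 1
    return stops
```

A long stop next to a known parking place is discarded, while the same stop on an open road section is kept as a candidate for the sample data set.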
In some preferred embodiments, integrating the sample data set comprises: integrating the stop information of the sample data set with the other data collected in S1. For example, the sample data set is associated with the vehicle information and the driving behavior information of the driver through the vehicle ID; with the geographic information and road information through the longitude and latitude of the starting stop point; with the holiday information through the starting stop time point; and with the weather information through the longitude and latitude together with the starting stop time point. The null-value rate of each field is then counted, and outliers are identified, for example through the upper and lower bounds of a box plot. Fields whose null-value rate exceeds a certain threshold are deleted, and fields with a low null-value rate are filled with the field median; fields with many outliers are deleted, and fields with few outliers are binned to improve their distribution.
In some preferred embodiments, a plurality of topics are determined for the integrated data, the plurality of topics including: basic attribute information of the vehicle, driving behavior information of the driver, collision attribute information of the vehicle, parking attribute information of the vehicle, alarm attribute information of the vehicle, attribute information of the road associated with the vehicle, information on vehicles parked near the vehicle, holiday information for the vehicle's place of operation, and weather information for the vehicle's place of operation.
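The field-cleaning rules in this embodiment can be sketched as follows. The thresholds `max_null_rate` and `max_outlier_rate`, the 1.5 x IQR box-plot bounds, and the quantile-based binning are assumptions chosen for illustration; the application does not fix any of them.

```python
import bisect
import statistics


def null_rate(values):
    """Fraction of missing (None) entries in a field."""
    return sum(v is None for v in values) / len(values)


def box_plot_bounds(values):
    """Lower and upper box-plot bounds using the common 1.5 * IQR rule."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def clean_field(values, max_null_rate=0.5, max_outlier_rate=0.1, n_bins=4):
    """Return the cleaned field, or None if the field should be deleted."""
    if null_rate(values) > max_null_rate:
        return None  # too many nulls: delete the field
    present = [v for v in values if v is not None]
    median = statistics.median(present)
    filled = [median if v is None else v for v in values]  # median fill

    lo, hi = box_plot_bounds(present)
    outliers = sum(v < lo or v > hi for v in filled)
    if outliers / len(filled) > max_outlier_rate:
        return None  # too many outliers: delete the field
    if outliers == 0:
        return filled
    # Few outliers: replace each value by its quantile bin index.
    cuts = statistics.quantiles(filled, n=n_bins)
    return [bisect.bisect_right(cuts, v) for v in filled]
```

Binning replaces a rare extreme value with the index of the top bin, so a single outlier no longer dominates the range of the field.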
The basic attribute information of the vehicle includes the vehicle type, vehicle brand, vehicle age, whether the vehicle is domestically produced, total vehicle mass, and the like. The feature dimensions to be calculated under the driving behavior information of the driver include the number of sudden accelerations per hundred kilometers, the number of sudden decelerations per hundred kilometers, the speeding mileage per hundred kilometers, the daily fatigue driving duration, the number of dangerous road sections passed per hundred kilometers, the number of turn-signal uses per hundred kilometers, the number of braking events per hundred kilometers, and the like. The feature dimensions to be calculated under the collision attribute information of the vehicle include the parking information of the vehicle, the acceleration of the vehicle, the average speed of the vehicle, the speed standard deviation of the vehicle, and the like. The acceleration can be calculated by dividing the speed difference over a period of time by the time difference; the average speed of the vehicle equals the sum of the speeds at the vehicle's track points divided by the number of track points; and the speed standard deviation of the vehicle equals the arithmetic square root of the variance of the vehicle's speed. The feature dimensions to be calculated under the information on vehicles parked near the vehicle include: the number of vehicles parked nearby before the stop, the number of vehicles parked nearby after the stop, the average speed of nearby vehicles before the stop, the average speed of nearby vehicles after the stop, and the like.
The road attribute information associated with the vehicle may be obtained, for example, by association through spatio-temporal information, and includes the type of the associated road, such as expressway, national road, provincial road, county road, and the like. In addition, feature dimensions for the holiday information and the weather information of the vehicle's place of operation can be obtained from the track location. Of course, data accumulated during truck operation and maintenance, and quantities computed from such accumulated data, can also be treated as feature dimensions in the present application; details are not repeated here.
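The speed-related feature dimensions described above reduce to a few short formulas, sketched below. The function names are hypothetical, units are left to the caller, and the use of the population variance (dividing by the number of track points) is an assumption, since the text only says "the variance".

```python
import math


def acceleration(v_start, v_end, t_start, t_end):
    """Speed difference over a period divided by the time difference."""
    return (v_end - v_start) / (t_end - t_start)


def mean_speed(speeds):
    """Sum of track-point speeds divided by the number of track points."""
    return sum(speeds) / len(speeds)


def speed_std(speeds):
    """Arithmetic square root of the (population) variance of the speeds."""
    mu = mean_speed(speeds)
    return math.sqrt(sum((v - mu) ** 2 for v in speeds) / len(speeds))


def per_hundred_km(event_count, distance_km):
    """Normalize an event count (e.g. sudden brakings) to a per-100-km rate."""
    return event_count * 100.0 / distance_km
```

For example, 3 sudden decelerations over 150 km give `per_hundred_km(3, 150)`, that is, 2 events per hundred kilometers.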
It should be noted that the isolation forest (Isolation Forest, abbreviated iForest) is composed of a plurality of binary trees. A tree in an iForest is called an isolation tree, abbreviated iTree. Assuming a data set has Z records, constructing one iTree begins by uniformly sampling a number of records from the Z records (generally without replacement) to serve as the training sample of that tree. Within the sample, a feature is selected at random and a value is selected at random within the feature's range (between its minimum and maximum); this value splits the sample into two branches: records whose feature value is smaller than the value go to the left of the node, and records whose feature value is greater than or equal to the value go to the right. This yields a split condition and left and right data sets, and the process is repeated on the left and right data sets until a data set contains only one record or the tree reaches a defined height limit.
In a preferred embodiment, the isolation forest model expression is:
s(x, n) = 2^{-h(x)/c(n)}
where s(x, n) represents the anomaly score of record x with respect to iTrees built from training data of n samples, h(x) represents the average path length between the root node and the leaf node that holds x, and c(n) represents the average path length for a given number of samples n.
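The iTree construction just described can be sketched in a few lines. This is an illustrative sketch rather than a production implementation: the dictionary-based node layout, the names `build_itree` and `path_length`, and the default `max_height` are assumptions, and a node whose chosen feature has a single value is simply turned into a leaf.

```python
import random


def build_itree(rows, rng, height=0, max_height=8):
    """Recursively build one isolation tree from a list of feature tuples."""
    # Stop when a single record remains or the height limit is reached.
    if len(rows) <= 1 or height >= max_height:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))      # random feature
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:                               # cannot split on this feature
        return {"size": len(rows)}
    split = rng.uniform(lo, hi)                # random value in [min, max]
    left = [r for r in rows if r[feature] < split]
    right = [r for r in rows if r[feature] >= split]
    return {
        "feature": feature,
        "split": split,
        "left": build_itree(left, rng, height + 1, max_height),
        "right": build_itree(right, rng, height + 1, max_height),
    }


def path_length(tree, x):
    """Number of edges traversed from the root to the leaf that receives x."""
    depth = 0
    while "feature" in tree:
        branch = "left" if x[tree["feature"]] < tree["split"] else "right"
        tree = tree[branch]
        depth += 1
    return depth
```

A record far from the bulk of the data tends to be separated by one of the first random splits, so its path is short; records inside a dense cluster need many more splits before they are isolated.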
Further preferably, c (n) can be represented as:
c(n)=2H(n-1)-(2(n-1)/n)
H(k)=ln(k)+ζ,ζ=0.5772156649
where H(k) represents the harmonic number, approximated by ln(k)+ζ, and ζ represents the Euler constant. The value of s(x, n) lies in the range [0, 1]; a value closer to 1 indicates a higher likelihood that the record is an abnormal point, and a value closer to 0 indicates a higher likelihood that it is a normal point.
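As a worked example, the normalization term and the score can be computed directly. The sketch below uses the c(n) and H(k) expressions given above together with the score form s(x, n) = 2^{-h(x)/c(n)} from the isolation forest literature; the function names are hypothetical, and returning 0 for n <= 1 is an assumption made to avoid taking the logarithm of 0.

```python
import math

EULER_CONSTANT = 0.5772156649  # the constant written as ζ in the text


def harmonic(k):
    """H(k) approximated as ln(k) + ζ."""
    return math.log(k) + EULER_CONSTANT


def c(n):
    """Average path length for n samples: c(n) = 2H(n-1) - 2(n-1)/n."""
    if n <= 1:
        return 0.0  # assumption: no path to normalize for a single sample
    return 2.0 * harmonic(n - 1) - 2.0 * (n - 1) / n


def anomaly_score(avg_path_length, n):
    """s(x, n) = 2^(-h(x)/c(n)); requires n > 1."""
    return 2.0 ** (-avg_path_length / c(n))
```

With n = 256, c(n) is about 10.24. A record whose average path length equals c(n) scores exactly 0.5, and shorter paths push the score toward 1.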
In the specific training process, the method further comprises: training the isolation forest model using grid search over hyperparameters. Grid search enumerates the possible values of each parameter, for example the number of trees, the maximum number of samples drawn per tree, and the proportion of features selected; trains a model for every combination of those values; and selects the parameter combination with the best evaluation result as the final parameters.
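The enumeration step of grid search can be sketched as follows. The `evaluate` callback stands in for "train a model with these parameters and return its evaluation result"; its form, the parameter names, and the convention that a higher score is better are all assumptions, since the application does not specify the evaluation metric.

```python
from itertools import product


def grid_search(param_grid, evaluate):
    """Try every combination in param_grid and keep the best-scoring one.

    param_grid: dict mapping parameter name -> list of candidate values.
    evaluate: callable taking a params dict and returning a score
              (higher is better; hypothetical stand-in for train + evaluate).
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

A grid with 3 tree counts, 2 sample sizes and 2 feature proportions yields 12 combinations, each of which is trained and evaluated once.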
In another preferred embodiment, the method further comprises: comparing the ratio of the number of abnormal samples to the total number of samples with the prior probability of accident occurrence, and selecting the abnormal rate closest to that prior probability as the output result of the isolated forest model. After a preset number of training iterations, the model is considered trained, and the trained model can be used in a specific application scenario to identify target data. The terms "well-trained" and "trained" have the same meaning in this application. The isolated forest model may be trained according to the prior art, and the training method is not specifically limited.
In summary, the application acquires the overall data; filters, associates, and hierarchically samples the overall data to obtain a sample data set; trains an isolated forest model using the feature dimensions under a plurality of topics; and inputs the data to be identified of a truck into the trained isolated forest model to judge whether the truck has had an accident. The approach takes full account of the extreme imbalance of truck accident data and does not require labeled samples, which reduces the cost of constructing the data set. Hierarchical sampling from the overall data set keeps the sample distribution consistent with that of the overall data, so the trained isolated forest model achieves a relatively ideal recognition effect without a large amount of sample data, which in turn improves the efficiency of truck accident identification.
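The comparison between the abnormal rate and the prior accident probability could be sketched as follows. The text does not state what is varied to produce the candidate abnormal rates; this sketch assumes the candidates are thresholds on the anomaly index, and `pick_threshold` is a hypothetical name.

```python
def pick_threshold(scores, thresholds, prior_rate):
    """Return the threshold whose abnormal rate is closest to prior_rate.

    scores: anomaly indices in [0, 1], one per sample.
    thresholds: candidate cut-offs; a sample is abnormal if score >= cut-off.
    prior_rate: prior probability of an accident (assumed to be known).
    """
    def abnormal_rate(threshold):
        # Ratio of abnormal samples to the total number of samples.
        return sum(s >= threshold for s in scores) / len(scores)

    best = min(thresholds, key=lambda t: abs(abnormal_rate(t) - prior_rate))
    return best, abnormal_rate(best)


if __name__ == "__main__":
    scores = [0.3, 0.4, 0.45, 0.5, 0.55, 0.6, 0.62, 0.7, 0.8, 0.9]
    print(pick_threshold(scores, [0.5, 0.6, 0.7, 0.85], prior_rate=0.1))
```

Because no labels are needed, the prior probability acts as the only external calibration signal in this sketch.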
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
In some exemplary embodiments of the present application, there is also provided a truck accident recognition apparatus, as shown in fig. 2, including:
the acquiring module 301 is configured to acquire overall data, where the overall data includes basic attribute information of a vehicle, driving behavior information of a driver, vehicle operation track information, road information, holiday information, and weather information;
a sample obtaining module 302, configured to filter, associate, and sample the whole data in a hierarchical manner to obtain a sample data set;
an integrating module 303, configured to integrate the sample data set, determine multiple topics, and calculate feature dimensions under the multiple topics;
a training module 304, configured to train an isolated forest model using the feature dimensions under the multiple topics;
and the identification module 305 is configured to input data to be identified of the truck into the trained isolated forest model to determine whether an accident occurs in the truck.
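As an illustration of the hierarchical (stratified) sampling performed by the sample obtaining module, the sketch below draws records without replacement from each vehicle type in proportion to that type's share of the data. The field name `vehicle_type`, the function name, and the rounding rule are assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, total, rng=random):
    """Sample about `total` records, keeping each group's proportion.

    records: list of dicts; key: field that defines the strata.
    Rounding means the result may differ slightly from `total`.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec)
    sample = []
    for members in groups.values():
        # Proportional allocation for this stratum.
        k = round(total * len(members) / len(records))
        k = min(k, len(members))
        # random.sample draws without replacement.
        sample.extend(rng.sample(members, k))
    return sample


if __name__ == "__main__":
    records = (
        [{"vehicle_type": "heavy", "id": i} for i in range(60)]
        + [{"vehicle_type": "medium", "id": i} for i in range(60, 90)]
        + [{"vehicle_type": "light", "id": i} for i in range(90, 100)]
    )
    picked = stratified_sample(records, "vehicle_type", 20, random.Random(1))
    print(len(picked))
```

Keeping the per-type proportions is what preserves consistency between the sample distribution and the overall data.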
In a preferred embodiment, the isolated forest model expression is:
s(x,n)=2^(-h(x)/c(n))
where s(x,n) represents the anomaly index of record x in the iTrees built from training data of n samples, h(x) represents the average path length between the root node and the leaf node that contains x, and c(n) represents the average path length given n samples.
Further preferably, c (n) can be represented as:
c(n)=2H(n-1)-(2(n-1)/n)
H(k)=ln(k)+ζ,ζ=0.5772156649
where H(k) represents the harmonic number and ζ represents Euler's constant. The value of s(x,n) lies in the range [0,1]; a value closer to 1 indicates a higher probability that the record is an abnormal point, and a value closer to 0 indicates a higher probability that it is a normal point.
The truck accident recognition device can reduce the cost of data set construction and can achieve a relatively ideal recognition effect.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
It is further emphasized that the system provided in the embodiments of the present application may obtain and process the relevant data based on artificial intelligence techniques. Artificial Intelligence (AI) is the theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, sense the environment, acquire knowledge, and use that knowledge to obtain the best result. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly includes computer vision, robotics, biometric recognition, speech processing, natural language processing, and machine learning/deep learning.
Reference is now made to fig. 3, which is a schematic diagram illustrating a computer device provided in some embodiments of the present application. As shown in fig. 3, the computer device 2 includes: a processor 200, a memory 201, a bus 202 and a communication interface 203, wherein the processor 200, the communication interface 203 and the memory 201 are connected through the bus 202; the memory 201 stores a computer program operable on the processor 200, and the processor 200 executes the computer program to perform the truck accident identification method provided in any one of the foregoing embodiments.
The memory 201 may include a high-speed Random Access Memory (RAM) and may further include a non-volatile memory, such as at least one disk memory. The communication connection between a network element of the system and at least one other network element is realized through at least one communication interface 203 (which may be wired or wireless), and the Internet, a wide area network, a local area network, a metropolitan area network, or the like may be used.
The processor 200 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor 200 or by instructions in the form of software. The processor 200 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor may implement or perform the various methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in connection with the embodiments of the present application may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, erasable programmable read-only memory, or registers. The storage medium is located in the memory 201; the processor 200 reads the information in the memory 201 and completes the steps of the method in combination with its hardware.
Referring to fig. 4, the computer-readable storage medium shown in fig. 4 is an optical disc 30, and a computer program (i.e., a program product) is stored on the optical disc 30, and when the computer program is executed by a processor, the computer program may perform the truck accident recognition method according to any of the foregoing embodiments.
In addition, examples of the computer-readable storage medium may also include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory, or other optical and magnetic storage media, which are not described in detail herein.
The computer-readable storage medium provided by the above embodiment of the present application and the truck accident identification method provided by the embodiments of the present application share the same inventive concept, and the storage medium has the same beneficial effects as the method adopted, run, or implemented by the application program stored on it.
The present application further provides a computer program product including a computer program which, when executed by a processor, implements the steps of the truck accident identification method provided in any of the foregoing embodiments, the method including: acquiring overall data, wherein the overall data comprises basic attribute information of a vehicle, driving behavior information of a driver, vehicle operation track information, road information, holiday information and weather information; filtering, associating and hierarchically sampling the overall data to obtain a sample data set; integrating the sample data set, determining a plurality of topics, and calculating feature dimensions under the plurality of topics; training an isolated forest model using the feature dimensions under the plurality of topics; and inputting the data to be identified of the truck into the trained isolated forest model to judge whether the truck has had an accident.
It should be noted that: the algorithms and displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general purpose devices may be used with the teachings herein. The required structure for constructing such a device will be apparent from the description above. In addition, this application is not directed to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the present application as described herein, and any descriptions of specific languages are provided above to disclose the best modes of the present application. In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the application may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the application, various features of the application are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the application and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that the claimed application requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this application.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification, and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except that at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification may be replaced by an alternative feature serving the same, equivalent or similar purpose, unless expressly stated otherwise.
The various component embodiments of the present application may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the truck accident recognition apparatus according to embodiments of the present application. The present application may also be embodied as an apparatus or device program for carrying out a portion or all of the methods described herein. A program implementing the application may be stored on a computer-readable medium or may be in the form of one or more signals. Such a signal may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
The above description is only for the preferred embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (10)
1. A truck accident identification method, characterized in that the method comprises:
acquiring overall data, wherein the overall data comprises basic attribute information of a vehicle, driving behavior information of a driver, vehicle operation track information, road information, holiday information and weather information;
filtering, associating and sampling the whole data in a layered mode to obtain a sample data set;
integrating the sample data set, determining a plurality of subjects, and calculating feature dimensions under the subjects;
training an isolated forest model using the feature dimensions under the plurality of topics;
and inputting the data to be identified of the truck into the trained isolated forest model to judge whether the truck has an accident or not.
2. A truck accident identification method according to claim 1, wherein the plurality of topics comprises: basic attribute information of the vehicle, driving behavior information of a driver, collision attribute information of the vehicle, parking attribute information of the vehicle, alarm attribute information of the vehicle, attribute information of a road associated with the vehicle, peripheral parking vehicle information of the vehicle, holiday information of the vehicle operation and weather information of the vehicle operation.
3. The truck accident recognition method of claim 1, wherein the filtering, correlating, and hierarchically sampling the overall data to obtain a sample data set comprises:
sequencing the vehicle operation track information according to time, and filtering out parking behaviors which are possible to have accidents on the basis of adjacent track point information;
the basic attribute information of the vehicle is associated with the vehicle operation track information through the vehicle ID, and the proportion of each vehicle type is obtained based on grouping and aggregation;
and sampling, without replacement, the parking behaviors in which an accident may have occurred according to the proportion of each vehicle type, to obtain a sample data set.
4. A truck accident identification method according to claim 3, wherein the sorting of the vehicle operation track information by time and the filtering of possible accident stopping behaviors based on the adjacent track point information comprises:
sequencing the vehicle operation track information according to time;
for any vehicle, if the speeds at two adjacent track points are both 0 and the time difference between the two adjacent track points is within a preset time range, taking the earlier of the two adjacent track points as a starting stop point;
if the subsequent stop point of the vehicle is within a first preset distance of the starting stop point and the stopped state lasts longer than a preset duration, judging whether the vehicle is stopped at a gas station, a service area, an expressway parking strip, an inspection station or a parking lot, or whether the distance between the vehicle and the gas station, the service area, the expressway parking strip, the inspection station or the parking lot is within a second preset distance;
if the vehicle is not stopped at a gas station, a service area, an expressway parking strip, an inspection station or a parking lot, or the distance between the vehicle and the gas station, the service area, the expressway parking strip, the inspection station or the parking lot exceeds the second preset distance, determining that the vehicle has a parking behavior in which an accident may have occurred.
5. The truck accident recognition method of claim 2, wherein calculating the feature dimensions under the plurality of topics comprises: calculating the acceleration of the vehicle, the average speed of the vehicle, the standard deviation of the vehicle speed, the number of sudden accelerations per hundred kilometers, the number of sudden decelerations per hundred kilometers, the overspeed mileage per hundred kilometers, the daily fatigue driving time, the number of dangerous road sections passed per hundred kilometers, the number of turn signal uses per hundred kilometers, the number of braking events per hundred kilometers, the number of vehicles parked nearby before parking, the number of vehicles parked nearby after parking, the average speed of nearby vehicles before parking and the average speed of nearby vehicles after parking.
6. A truck accident recognition method according to any one of claims 1 to 5, wherein the expression of the isolated forest model is:
s(x,n)=2^(-h(x)/c(n))
wherein s(x,n) represents the anomaly index of record x in the iTrees built from training data of n samples, h(x) represents the average path length between the root node and the leaf node that contains x, and c(n) represents the average path length given n samples.
7. A truck accident identification method according to claim 6, wherein the method further comprises: and comparing the ratio of the number of the abnormal samples to the total number of the samples with the prior accident occurrence probability, and selecting the abnormal rate closest to the prior accident occurrence probability as an output result of the isolated forest model.
8. A truck accident recognition apparatus, comprising:
an acquisition module, configured to acquire overall data, wherein the overall data comprises basic attribute information of a vehicle, driving behavior information of a driver, vehicle operation track information, road information, holiday information and weather information;
the sample obtaining module is used for filtering, correlating and sampling the whole data in a layered manner to obtain a sample data set;
the integration module is used for integrating the sample data set, determining a plurality of subjects and calculating characteristic dimensions under the subjects;
the training module is used for training the isolated forest model by using the feature dimensions under the plurality of subjects;
and the identification module is used for inputting the data to be identified of the truck into the trained isolated forest model so as to judge whether the truck has an accident or not.
9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
10. A computer program product comprising a computer program, characterized in that the computer program realizes the steps of the method of any one of claims 1 to 7 when executed by a processor.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210262059.2A CN114677254A (en) | 2022-03-17 | 2022-03-17 | Truck accident identification method, device, storage medium and program product |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114677254A true CN114677254A (en) | 2022-06-28 |
Family
ID=82074269
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210262059.2A Pending CN114677254A (en) | 2022-03-17 | 2022-03-17 | Truck accident identification method, device, storage medium and program product |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114677254A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116543557A (en) * | 2023-05-05 | 2023-08-04 | 重庆邮电大学 | Real-time automobile electronic data extraction and fixing method based on accident detection model |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110149258A (en) * | 2019-04-12 | 2019-08-20 | 北京航空航天大学 | A kind of automobile CAN-bus network data method for detecting abnormality based on isolated forest |
CN112505549A (en) * | 2020-11-26 | 2021-03-16 | 西安电子科技大学 | New energy automobile battery abnormity detection method based on isolated forest algorithm |
CN112633395A (en) * | 2020-12-29 | 2021-04-09 | 平安科技(深圳)有限公司 | Abnormal data detection method and device, computer equipment and storage medium |
CN112884480A (en) * | 2021-03-31 | 2021-06-01 | 中国工商银行股份有限公司 | Method and device for constructing abnormal transaction identification model, computer equipment and medium |
WO2021135653A1 (en) * | 2019-12-31 | 2021-07-08 | 北京嘀嘀无限科技发展有限公司 | Method and system for identifying abnormal stay of vehicle |
CN113743815A (en) * | 2021-09-13 | 2021-12-03 | 一汽出行科技有限公司 | Risk monitoring method and device for operating vehicle, storage medium and computer equipment |
CN113792782A (en) * | 2021-09-13 | 2021-12-14 | 一汽出行科技有限公司 | Track monitoring method and device for operating vehicle, storage medium and computer equipment |
Non-Patent Citations (3)
Title |
---|
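Liu Benmin; Yan Han: "Analysis of factors influencing chain rear-end collisions based on SVM accident classification", Journal of Transport Information and Safety (交通信息与安全), no. 01, 31 December 2020 (2020-12-31), pages 49 - 57 * |
Xue Qingwen; Jiang Yuming; Lu Jian: "Method for identifying dangerous driving behavior based on trajectory data", China Journal of Highway and Transport (中国公路学报), no. 06, 31 December 2020 (2020-12-31), pages 88 - 98 * |
Heng Hongjun; Liu Jing: "Detection of driving anomalies in multidimensional time series based on a hybrid method", Computer Engineering (计算机工程), no. 03, 31 December 2020 (2020-12-31), pages 105 - 110 * |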
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||