CN114677254A - Truck accident identification method, device, storage medium and program product - Google Patents

Truck accident identification method, device, storage medium and program product

Info

Publication number
CN114677254A
Authority
CN
China
Prior art keywords
vehicle
information
truck
accident
parking
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210262059.2A
Other languages
Chinese (zh)
Inventor
杨俊京
赵岩
夏曙东
蔡抒扬
孙智彬
张志平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Transwiseway Information Technology Co Ltd
Original Assignee
Beijing Transwiseway Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Transwiseway Information Technology Co Ltd
Priority to CN202210262059.2A
Publication of CN114677254A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40 Business processes related to the transportation industry
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Software Systems (AREA)
  • Business, Economics & Management (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Traffic Control Systems (AREA)

Abstract

The application relates to a truck accident identification method, device, storage medium and program product. The method comprises: acquiring overall data, wherein the overall data comprises basic attribute information of a vehicle, driving behavior information of a driver, vehicle operation track information, road information, holiday information and weather information; filtering, associating and stratified-sampling the overall data to obtain a sample data set; integrating the sample data set, determining a plurality of topics, and calculating feature dimensions under the topics; training an isolation forest model using the feature dimensions under the plurality of topics; and inputting the truck data to be identified into the trained isolation forest model to judge whether the truck has had an accident. The method takes into account the extreme imbalance of truck accident data and requires no labeled samples, which reduces the cost of constructing the data set; stratified sampling keeps the sample distribution consistent with that of the overall data, and the trained isolation forest model achieves a good recognition effect.

Description

Truck accident identification method, device, storage medium and program product
Technical Field
The present application relates to the field of intelligent identification technologies, and more particularly, to a truck accident identification method, apparatus, storage medium, and program product.
Background
Existing accident recognition technology mostly trains recognition models with supervised learning algorithms. However, the probability of an accident, particularly a major accident, during truck operation is extremely low, so the resulting data are extremely imbalanced and do not satisfy the conditions under which those supervised learning algorithms work well. The models therefore overfit and their predictions are inaccurate.
In addition, the supervised learning algorithms used in the prior art depend on labeled data, but collecting a sufficient number of truck accident samples is difficult, and constructing the data set is costly.
Disclosure of Invention
In view of the above technical problems, the invention aims to acquire overall data, filter, associate and stratified-sample the overall data to obtain a sample data set, and train an isolation forest model on that sample data set, so that truck data to be identified can be identified with the isolation forest model.
A first aspect of the invention provides a truck accident identification method, comprising:
acquiring overall data, wherein the overall data comprises basic attribute information of a vehicle, driving behavior information of a driver, vehicle operation track information, road information, holiday information and weather information;
filtering, associating and stratified-sampling the overall data to obtain a sample data set;
integrating the sample data set, determining a plurality of topics, and calculating feature dimensions under the plurality of topics;
training an isolation forest model using the feature dimensions under the plurality of topics;
and inputting the truck data to be identified into the trained isolation forest model to judge whether the truck has had an accident.
In some embodiments of the invention, the plurality of topics comprises: basic attribute information of the vehicle, driving behavior information of the driver, collision attribute information of the vehicle, parking attribute information of the vehicle, alarm attribute information of the vehicle, attribute information of the road associated with the vehicle, information on vehicles parked around the vehicle, holiday information for the place where the vehicle operates, and weather information for the place where the vehicle operates.
In some embodiments of the invention, filtering, associating and stratified-sampling the overall data to obtain a sample data set comprises:
sorting the vehicle operation track information by time, and filtering out parking behaviors that may involve an accident based on adjacent track point information;
associating the basic attribute information of the vehicle with the vehicle operation track information through the vehicle ID, and obtaining the proportion of each vehicle type by grouping and aggregation;
and sampling the parking behaviors that may involve an accident without replacement, according to the proportion of each vehicle type, to obtain the sample data set.
In some embodiments of the invention, sorting the vehicle operation track information by time and filtering out parking behaviors that may involve an accident based on adjacent track point information comprises:
sorting the vehicle operation track information by time;
for any vehicle, if the speeds at two adjacent track points are both 0 and the time difference between the two track points is within a preset time range, taking the earlier of the two track points as a starting stop point;
if the subsequent stop points of the vehicle are within a first preset distance of the starting stop point and the stopped state lasts longer than a preset duration, judging whether the vehicle is stopped at, or within a second preset distance of, a gas station, a service area, an expressway parking strip, an inspection station or a parking lot;
and if the vehicle is not stopped at a gas station, a service area, an expressway parking strip, an inspection station or a parking lot, and its distance from each of them exceeds the second preset distance, determining that the vehicle exhibits a parking behavior that may involve an accident.
In some embodiments of the invention, calculating feature dimensions under the plurality of topics comprises calculating: the acceleration of the vehicle, the average speed of the vehicle, the standard deviation of the vehicle speed, the number of sudden accelerations per 100 km, the number of sudden decelerations per 100 km, the speeding mileage per 100 km, the daily fatigue driving duration, the number of dangerous road sections passed per 100 km, the number of turn-signal uses per 100 km, the number of braking events per 100 km, the number of vehicles parked nearby before the stop, the number of vehicles parked nearby after the stop, the average speed of nearby vehicles before the stop, and the average speed of nearby vehicles after the stop.
In some embodiments of the invention, the expression of the isolation forest model is:
s(x,n)=2^(-E(h(x))/c(n))
where s(x,n) is the anomaly score of record x given training data of n samples, h(x) is the path length of x in an iTree, measured from the root node to the leaf node that holds x, E(h(x)) is the average of h(x) over the iTrees, and c(n) is the average path length for a given number of samples n.
In some embodiments of the invention, the method further comprises: comparing the ratio of the number of abnormal samples to the total number of samples with the prior probability of accident occurrence, and selecting the anomaly rate closest to that prior probability as the output result of the isolation forest model.
A second aspect of the invention provides a truck accident identification device, the device comprising:
an acquisition module, configured to acquire overall data, wherein the overall data comprises basic attribute information of a vehicle, driving behavior information of a driver, vehicle operation track information, road information, holiday information and weather information;
a sample obtaining module, configured to filter, associate and stratified-sample the overall data to obtain a sample data set;
an integration module, configured to integrate the sample data set, determine a plurality of topics, and calculate feature dimensions under the plurality of topics;
a training module, configured to train an isolation forest model using the feature dimensions under the plurality of topics;
and an identification module, configured to input the truck data to be identified into the trained isolation forest model to judge whether the truck has had an accident.
A third aspect of the invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of:
acquiring overall data, wherein the overall data comprises basic attribute information of a vehicle, driving behavior information of a driver, vehicle operation track information, road information, holiday information and weather information;
filtering, associating and stratified-sampling the overall data to obtain a sample data set;
integrating the sample data set, determining a plurality of topics, and calculating feature dimensions under the plurality of topics;
training an isolation forest model using the feature dimensions under the plurality of topics;
and inputting the truck data to be identified into the trained isolation forest model to judge whether the truck has had an accident.
A fourth aspect of the invention provides a computer program product comprising a computer program which, when executed by a processor, performs the steps of:
acquiring overall data, wherein the overall data comprises basic attribute information of a vehicle, driving behavior information of a driver, vehicle operation track information, road information, holiday information and weather information;
filtering, associating and stratified-sampling the overall data to obtain a sample data set;
integrating the sample data set, determining a plurality of topics, and calculating feature dimensions under the plurality of topics;
training an isolation forest model using the feature dimensions under the plurality of topics;
and inputting the truck data to be identified into the trained isolation forest model to judge whether the truck has had an accident.
The beneficial effects of the application are as follows. The application acquires overall data; filters, associates and stratified-samples the overall data to obtain a sample data set; trains an isolation forest model using the feature dimensions under a plurality of topics; and inputs the truck data to be identified into the trained isolation forest model to judge whether the truck has had an accident. The approach takes into account the extreme imbalance of truck accident data and requires no labeled samples, which reduces the cost of constructing the data set. Drawing the samples from the overall data set by stratified sampling keeps the sample distribution consistent with that of the overall data. The trained isolation forest model achieves a good recognition effect without requiring a large amount of sample data, which improves the efficiency of truck accident identification.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description, serve to explain the principles of the application.
The present application may be more clearly understood from the following detailed description with reference to the accompanying drawings, in which:
FIG. 1 illustrates a schematic diagram of method steps of an exemplary embodiment of the present application;
FIG. 2 illustrates a schematic diagram of an apparatus according to an exemplary embodiment of the present application;
FIG. 3 illustrates a schematic structural diagram of a computer device provided by an exemplary embodiment of the present application;
FIG. 4 illustrates a schematic diagram of a storage medium provided by an exemplary embodiment of the present application.
Detailed Description
Hereinafter, embodiments of the present application will be described with reference to the accompanying drawings. It should be understood that the description is intended to be exemplary only, and is not intended to limit the scope of the present application. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present application. It will be apparent to one skilled in the art that the present application may be practiced without one or more of these details. In other instances, well-known features of the art have not been described in order to avoid obscuring the present application.
It should be noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments in accordance with the application. As used herein, the singular is intended to include the plural unless the context clearly dictates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Exemplary embodiments according to the present application will now be described in more detail with reference to the accompanying drawings. These exemplary embodiments may, however, be embodied in many different forms and should not be construed as limited to only the embodiments set forth herein. The figures are not drawn to scale, wherein certain details may be exaggerated and omitted for clarity. The shapes of various regions, layers, and relative sizes and positional relationships therebetween shown in the drawings are merely exemplary, and deviations may occur in practice due to manufacturing tolerances or technical limitations, and a person skilled in the art may additionally design regions/layers having different shapes, sizes, relative positions, as actually required.
Several examples are given below in conjunction with the description of figures 1-4 to describe exemplary embodiments according to the present application. It should be noted that the following application scenarios are merely illustrated for the convenience of understanding the spirit and principles of the present application, and the embodiments of the present application are not limited in this respect. Rather, embodiments of the present application may be applied to any scenario where applicable.
At present, the prior art mostly trains recognition models with supervised learning algorithms, which raises several problems. First, the probability of an accident, especially a major accident, during truck operation is extremely low, so the resulting data are extremely imbalanced and do not satisfy the conditions under which those supervised learning algorithms work well; the models overfit and their predictions are inaccurate. Second, supervised learning algorithms depend on labeled data, but collecting a sufficient number of truck accident samples is difficult, and constructing the data set is costly. Third, because few samples are collected, the data are biased and their distribution is inconsistent with that of the overall data; the data set is therefore poorly representative, which degrades the final accuracy of the model. Finally, the prior art considers only the dynamic attribute information of the vehicle and ignores the influence on the model of related factors such as the static attribute information of the vehicle, the driving behavior information of the driver, and holiday and weather information.
Accordingly, some exemplary embodiments of the present application provide a truck accident identification method implemented on the basis of an isolation forest model. As shown in FIG. 1, the method includes:
S1, acquiring overall data, wherein the overall data comprises basic attribute information of a vehicle, driving behavior information of a driver, vehicle operation track information, road information, holiday information and weather information;
S2, filtering, associating and stratified-sampling the overall data to obtain a sample data set;
S3, integrating the sample data set, determining a plurality of topics, and calculating feature dimensions under the plurality of topics;
S4, training an isolation forest model using the feature dimensions under the plurality of topics;
and S5, inputting the truck data to be identified into the trained isolation forest model to judge whether the truck has had an accident.
In a specific implementation, filtering, associating and stratified-sampling the overall data in S2 to obtain a sample data set includes: sorting the vehicle operation track information by time, and filtering out parking behaviors that may involve an accident based on adjacent track point information; associating the basic attribute information of the vehicle with the vehicle operation track information through the vehicle ID, and obtaining the proportion of each vehicle type by grouping and aggregation; and sampling the parking behaviors that may involve an accident without replacement, according to the proportion of each vehicle type, to obtain the sample data set.
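The association and sampling steps of S2 can be sketched as follows. This is a minimal illustration rather than the applicant's implementation: the record layout (`vehicle_id`, `vehicle_type`), the function name `stratified_sample` and the choice to compute the vehicle-type proportions over the joined candidate stops are all assumptions.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(stops, vehicle_types, sample_size, seed=0):
    """Sample candidate stops without replacement, stratified by vehicle type.

    stops:         list of dicts with a "vehicle_id" key (hypothetical schema)
    vehicle_types: dict mapping vehicle_id -> vehicle type (basic attributes)
    """
    rng = random.Random(seed)

    # Associate each stop with its vehicle type through the vehicle ID.
    joined = [
        dict(stop, vehicle_type=vehicle_types[stop["vehicle_id"]])
        for stop in stops
        if stop["vehicle_id"] in vehicle_types
    ]

    # Group and aggregate to get the share of each vehicle type.
    counts = Counter(stop["vehicle_type"] for stop in joined)
    total = sum(counts.values())

    strata = defaultdict(list)
    for stop in joined:
        strata[stop["vehicle_type"]].append(stop)

    sample = []
    for vtype, count in counts.items():
        # Allocate the sample proportionally; never ask for more than exists.
        k = min(count, round(sample_size * count / total))
        sample.extend(rng.sample(strata[vtype], k))  # draws without replacement
    return sample
```

Because each stratum is rounded independently, the returned sample can differ from `sample_size` by a few records when the proportions do not divide evenly.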
In some preferred embodiments, sorting the vehicle operation track information by time and filtering out parking behaviors that may involve an accident based on adjacent track point information includes: sorting the vehicle operation track information by time; for any vehicle, if the speeds at two adjacent track points are both 0 and the time difference between the two track points is within a preset time range, taking the earlier of the two track points as a starting stop point; if the subsequent stop points of the vehicle are within a first preset distance of the starting stop point and the stopped state lasts longer than a preset duration, judging whether the vehicle is stopped at, or within a second preset distance of, a gas station, a service area, an expressway parking strip, an inspection station or a parking lot; and if the vehicle is not stopped at any of these places and its distance from each of them exceeds the second preset distance, determining that the vehicle exhibits a parking behavior that may involve an accident.
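The stop filter described above can be sketched for a single vehicle as follows. The tuple layout, the function names and every threshold value (`max_gap_s`, `max_drift_m`, `min_duration_s`, `poi_radius_m`) are illustrative assumptions; the application only calls them preset values. Points of interest such as gas stations and service areas are reduced here to a list of coordinates.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def suspicious_stops(points, pois, max_gap_s=300, max_drift_m=50,
                     min_duration_s=1800, poi_radius_m=200):
    """points: list of (timestamp_s, lat, lon, speed) for one vehicle.
    pois:   list of (lat, lon) for gas stations, service areas, etc.
    """
    points = sorted(points)  # sort track points by time
    stops = []
    i = 0
    while i < len(points):
        t0, lat0, lon0, v0 = points[i]
        j = i
        # Extend the stop while the next point is stationary, close in time,
        # and within the first preset distance of the starting stop point.
        while (v0 == 0 and j + 1 < len(points)
               and points[j + 1][3] == 0
               and points[j + 1][0] - points[j][0] <= max_gap_s
               and haversine_m(lat0, lon0, points[j + 1][1], points[j + 1][2]) <= max_drift_m):
            j += 1
        if j > i:
            duration = points[j][0] - t0
            near_poi = any(haversine_m(lat0, lon0, plat, plon) <= poi_radius_m
                           for plat, plon in pois)
            # Keep long stops that are not at or near a legitimate stopping place.
            if duration > min_duration_s and not near_poi:
                stops.append({"start": t0, "lat": lat0, "lon": lon0,
                              "duration_s": duration})
        i = j + 1
    return stops
```

A 40-minute stationary stretch far from any listed place is returned as one candidate stop, while the same stretch inside a service area is discarded.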
In some preferred embodiments, integrating the sample data set comprises integrating the stop information of the sample data set with the other data in the overall data collected in S1. For example, the sample data set is joined with the vehicle information and the driving behavior information of the driver by vehicle ID; with the geographic information and road information by the longitude and latitude of the starting stop point; with the holiday information by the start time of the stop; and with the weather information by the longitude and latitude together with the start time of the stop. The null rate of each field is then counted, and outliers are identified by, for example, the upper and lower bounds of a box plot. Fields whose null rate exceeds a certain threshold are deleted, and fields with a low null rate are filled with the field median; fields with many outliers are deleted, and fields with few outliers are binned to improve their distribution.
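The field-cleaning rules above can be sketched as follows. The thresholds `max_null_rate` and `max_outlier_rate`, the 1.5 × IQR box-plot rule and the column-oriented layout are assumptions, since the application only refers to "a certain threshold"; the binning step is omitted for brevity, and each column is assumed to be non-empty.

```python
from statistics import median, quantiles


def null_rate(values):
    """Fraction of missing (None) entries in a field."""
    return sum(v is None for v in values) / len(values)


def box_plot_bounds(values):
    """Lower and upper whisker bounds using the common 1.5 * IQR rule."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def clean_columns(columns, max_null_rate=0.5, max_outlier_rate=0.1):
    """columns: dict mapping field name -> list of numbers or None."""
    cleaned = {}
    for name, values in columns.items():
        if null_rate(values) > max_null_rate:
            continue  # too many nulls: delete the field
        fill = median(v for v in values if v is not None)
        filled = [fill if v is None else v for v in values]  # median fill
        low, high = box_plot_bounds(filled)
        outlier_rate = sum(v < low or v > high for v in filled) / len(filled)
        if outlier_rate > max_outlier_rate:
            continue  # too many outliers: delete the field
        cleaned[name] = filled
    return cleaned
```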
In some preferred embodiments, a plurality of topics is determined for the integrated data, the plurality of topics including: basic attribute information of the vehicle, driving behavior information of the driver, collision attribute information of the vehicle, parking attribute information of the vehicle, alarm attribute information of the vehicle, attribute information of the road associated with the vehicle, information on vehicles parked around the vehicle, holiday information for the place where the vehicle operates, and weather information for the place where the vehicle operates.
The basic attribute information of the vehicle includes the vehicle type, vehicle brand, vehicle age, domestic-or-imported attribute, total vehicle mass, and the like. The feature dimensions to be calculated under the driving behavior information of the driver include the number of sudden accelerations per 100 km, the number of sudden decelerations per 100 km, the speeding mileage per 100 km, the daily fatigue driving duration, the number of dangerous road sections passed per 100 km, the number of turn-signal uses per 100 km, the number of braking events per 100 km, and the like. The feature dimensions to be calculated under the collision attribute information of the vehicle include the parking information of the vehicle, the acceleration of the vehicle, the average speed of the vehicle, the standard deviation of the vehicle speed, and the like. The acceleration can be calculated as the difference between the later and earlier speeds over a period of time divided by the time difference; the average speed of the vehicle is the sum of the speeds at its track points divided by the number of track points; and the standard deviation of the vehicle speed is the arithmetic square root of the variance of the vehicle speed. The feature dimensions to be calculated under the information on vehicles parked around the vehicle include: the number of vehicles parked nearby before the stop, the number of vehicles parked nearby after the stop, the average speed of nearby vehicles before the stop, the average speed of nearby vehicles after the stop, and the like.
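A few of the feature dimensions defined above can be computed as in the following sketch. The units (seconds, km/h, km), the function names and the use of the population standard deviation are assumptions; the application does not say whether the variance is the population or the sample variance.

```python
from statistics import mean, pstdev


def accelerations(track):
    """track: list of (timestamp_s, speed_kmh) sorted by time.
    Returns the acceleration in m/s^2 between consecutive track points.
    """
    return [
        ((v1 - v0) / 3.6) / (t1 - t0)  # speed difference (m/s) over time difference
        for (t0, v0), (t1, v1) in zip(track, track[1:])
        if t1 > t0
    ]


def speed_features(track):
    """Average speed and speed standard deviation over the track points."""
    speeds = [v for _, v in track]
    return {
        "mean_speed": mean(speeds),   # sum of speeds / number of track points
        "speed_std": pstdev(speeds),  # square root of the (population) variance
    }


def per_100_km(event_count, distance_km):
    """Normalise an event count (e.g. sudden accelerations) to events per 100 km."""
    return event_count / distance_km * 100 if distance_km > 0 else 0.0
```

For instance, five sudden accelerations over 250 km give `per_100_km(5, 250)`, that is, 2.0 events per 100 km.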
The attribute information of the road associated with the vehicle may be obtained, for example, by associating spatio-temporal information, and includes the type of the associated road, such as expressway, national road, provincial road or county road. In addition, the feature dimensions for the holiday information and the weather information of the place where the vehicle operates can be obtained from the track location. Of course, any data accumulated during truck operation and maintenance, or calculated from such accumulated data, can also be treated as a feature dimension in the sense of the present application, and is not described further here.
It should be noted that the isolation forest (Isolation Forest, abbreviated iForest) is composed of a plurality of binary trees; a tree in an iForest is called an isolation tree, abbreviated iTree. Suppose a data set contains Z records. To construct one iTree, a number of samples are drawn uniformly from the Z records (generally without replacement) and used as the training samples of that tree. Within these samples, a feature is selected at random, and a value is selected at random within the range of that feature (between its minimum and maximum). The samples are then split into two branches: records whose value for the feature is smaller than the selected value go to the left of the node, and records whose value is greater than or equal to it go to the right. This yields a splitting condition and left and right data sets, and the process is repeated on the left and right data sets respectively until a data set contains only one record or the tree reaches a defined height.
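The iTree construction described above can be sketched in plain Python as follows. This is an illustrative sketch, not a production implementation: the dictionary-based node layout and the function names are assumptions, the split feature is drawn only from features that still vary within the node, and the leaf-size adjustment used by some isolation forest implementations when computing path lengths is omitted.

```python
import random


def build_itree(rows, height_limit, rng, depth=0):
    """Recursively build one isolation tree from a list of numeric tuples."""
    if len(rows) <= 1 or depth >= height_limit:
        return {"size": len(rows)}  # leaf: single record or height limit reached
    # Assumption: only features that are not constant in this node are candidates.
    candidates = [f for f in range(len(rows[0]))
                  if min(r[f] for r in rows) < max(r[f] for r in rows)]
    if not candidates:
        return {"size": len(rows)}  # all remaining records are identical
    feature = rng.choice(candidates)
    low = min(r[feature] for r in rows)
    high = max(r[feature] for r in rows)
    split = rng.uniform(low, high)  # random value between the minimum and maximum
    left = [r for r in rows if r[feature] < split]
    right = [r for r in rows if r[feature] >= split]
    return {"feature": feature, "split": split,
            "left": build_itree(left, height_limit, rng, depth + 1),
            "right": build_itree(right, height_limit, rng, depth + 1)}


def path_length(tree, row, depth=0):
    """Number of edges from the root to the leaf that `row` falls into."""
    if "size" in tree:
        return depth
    branch = "left" if row[tree["feature"]] < tree["split"] else "right"
    return path_length(tree[branch], row, depth + 1)


def build_forest(data, n_trees, sample_size, height_limit, seed=0):
    """Each tree is trained on a sub-sample drawn without replacement."""
    rng = random.Random(seed)
    k = min(sample_size, len(data))
    return [build_itree(rng.sample(data, k), height_limit, rng) for _ in range(n_trees)]


def mean_path_length(forest, row):
    """Average path length of `row` over all trees in the forest."""
    return sum(path_length(tree, row) for tree in forest) / len(forest)
```

A record that lies far from the bulk of the data tends to be separated by the first few random splits, so its average path length is shorter than that of a typical record.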
In a preferred embodiment, the expression of the isolation forest model is:
s(x,n)=2^(-E(h(x))/c(n))
where s(x,n) is the anomaly score of record x given training data of n samples, h(x) is the path length of x in an iTree, measured from the root node to the leaf node that holds x, E(h(x)) is the average of h(x) over the iTrees, and c(n) is the average path length for a given number of samples n.
Further preferably, c(n) can be expressed as:
c(n)=2H(n-1)-(2(n-1)/n)
H(k)=ln(k)+ζ, ζ=0.5772156649
where H(k) is the harmonic number, approximated by ln(k)+ζ, and ζ is Euler's constant (the Euler-Mascheroni constant). The resulting s(x,n) takes values in the range [0,1]; a value closer to 1 indicates that the record is more likely an abnormal point, and a value closer to 0 indicates that it is more likely a normal point.
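As a worked illustration of the normalization term and the score, the sketch below evaluates c(n) and s(x,n) exactly as written above. The function names are assumptions, and the score is computed as 2^(-E(h(x))/c(n)), the standard isolation forest form, with E(h(x)) passed in as `mean_path`.

```python
import math

EULER_CONSTANT = 0.5772156649  # the constant written as ζ in the text


def harmonic(k):
    """Approximate harmonic number H(k) = ln(k) + ζ."""
    return math.log(k) + EULER_CONSTANT


def c(n):
    """Average path length for n samples: c(n) = 2H(n-1) - 2(n-1)/n."""
    if n < 2:
        raise ValueError("c(n) as written requires n >= 2")
    return 2 * harmonic(n - 1) - 2 * (n - 1) / n


def anomaly_score(mean_path, n):
    """s(x, n) = 2^(-E(h(x)) / c(n)), where mean_path is E(h(x))."""
    return 2 ** (-mean_path / c(n))
```

For n = 256, c(n) is about 10.24. A record whose average path length equals c(n) scores exactly 0.5, and the score moves towards 1 as the average path length shrinks towards 0.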
In a specific training process, the method further comprises tuning the hyperparameters of the isolation forest model by grid search. Grid search enumerates the possible values of each hyperparameter, such as the number of trees, the maximum number of samples drawn per tree and the proportion of features selected, trains a model for every combination of these values, and finally selects the combination with the best evaluation result as the final hyperparameters.
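The grid search described above can be sketched generically as follows. The parameter names in the usage note and the `evaluate` callable are hypothetical; the application does not specify the evaluation metric, so the caller supplies a function that trains a model with the given hyperparameters and returns a score where higher is better.

```python
from itertools import product


def grid_search(param_grid, evaluate):
    """Try every combination in param_grid and return (best_params, best_score).

    param_grid: dict mapping hyperparameter name -> list of candidate values
    evaluate:   callable taking a dict of hyperparameters, returning a score
    """
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)  # e.g. train a model and evaluate it
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

A grid such as `{"n_trees": [50, 100, 200], "max_samples": [128, 256], "feature_fraction": [0.5, 1.0]}` yields 3 × 2 × 2 = 12 combinations, each of which is trained and evaluated once.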
In another preferred embodiment, the method further comprises: comparing the ratio of the number of abnormal samples to the total number of samples with the prior probability of accident occurrence, and selecting the anomaly rate closest to that prior probability as the output result of the isolation forest model. After a preset number of training rounds, the model is considered trained, and the trained model can be used in a specific application scenario to identify target data. In the present application, "well trained" and "trained" have the same meaning; the isolation forest model can be trained according to the prior art, and the training method is not specifically limited. The application acquires overall data; filters, associates and stratified-samples the overall data to obtain a sample data set; trains an isolation forest model using the feature dimensions under a plurality of topics; and inputs the truck data to be identified into the trained isolation forest model to judge whether the truck has had an accident. The approach takes into account the extreme imbalance of truck accident data and requires no labeled samples, which reduces the cost of constructing the data set. Drawing the samples from the overall data set by stratified sampling keeps the sample distribution consistent with that of the overall data. The trained isolation forest model achieves a good recognition effect without requiring a large amount of sample data, which improves the efficiency of truck accident identification.
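The comparison of the anomaly rate with the prior accident probability can be sketched as follows. How the candidate anomaly rates are produced is not specified in the application; this sketch assumes they come from a set of candidate score thresholds, and the function names are hypothetical.

```python
def anomaly_rate(scores, threshold):
    """Ratio of abnormal samples (score above threshold) to all samples."""
    return sum(s > threshold for s in scores) / len(scores)


def pick_threshold(scores, thresholds, prior_accident_rate):
    """Choose the threshold whose anomaly rate is closest to the prior rate."""
    return min(
        thresholds,
        key=lambda t: abs(anomaly_rate(scores, t) - prior_accident_rate),
    )
```

With 100 scored stops and an assumed prior accident probability of 0.02, the threshold that flags about two stops as abnormal is the one selected.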
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
In some exemplary embodiments of the present application, there is also provided a truck accident recognition apparatus, as shown in fig. 2, including:
the acquiring module 301 is configured to acquire overall data, where the overall data includes basic attribute information of a vehicle, driving behavior information of a driver, vehicle operation track information, road information, holiday information, and weather information;
a sample obtaining module 302, configured to filter, associate, and perform stratified sampling on the overall data to obtain a sample data set;
an integrating module 303, configured to integrate the sample data set, determine multiple topics, and calculate feature dimensions under the multiple topics;
a training module 304, configured to train an isolated forest model using the feature dimensions under the multiple topics;
and the identification module 305 is configured to input data to be identified of the truck into the trained isolated forest model to determine whether an accident occurs in the truck.
In a preferred embodiment, the isolated forest model expression is:
s(x,n)=2^{-E(h(x))/c(n)}
where s(x, n) denotes the anomaly index of record x with respect to the iTrees built from training data of n samples, h(x) denotes the path length of x from the root node to the leaf node that holds it, E(h(x)) denotes the average of h(x) over the iTrees, and c(n) denotes the average path length given the sample number n.
Further preferably, c (n) can be represented as:
c(n)=2H(n-1)-(2(n-1)/n)
H(k)=ln(k)+ζ,ζ=0.5772156649
where H(k) denotes the harmonic number, approximated here by ln(k) + ζ, and ζ denotes Euler's constant. The value of s(x, n) lies in the range [0, 1]: the closer it is to 1, the more likely the record is an abnormal point, and the closer it is to 0, the more likely it is a normal point.
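As a worked illustration of these formulas, the following sketch computes c(n) and the anomaly index from a list of path lengths. It assumes the standard isolation forest score 2^(-E(h(x))/c(n)), with E(h(x)) taken as the mean path length over the trees; the function names are illustrative and not taken from the application.

```python
import math

EULER_CONSTANT = 0.5772156649  # zeta in the text


def harmonic(k):
    """H(k), approximated as ln(k) + zeta as in the text."""
    return math.log(k) + EULER_CONSTANT


def c(n):
    """Average path length for n samples: 2H(n-1) - 2(n-1)/n."""
    if n < 2:
        raise ValueError("c(n) is only defined here for n >= 2")
    return 2.0 * harmonic(n - 1) - 2.0 * (n - 1) / n


def anomaly_score(path_lengths, n):
    """s(x, n) = 2 ** (-E(h(x)) / c(n)); path_lengths holds h(x) for each tree."""
    mean_path_length = sum(path_lengths) / len(path_lengths)
    return 2.0 ** (-mean_path_length / c(n))
```

A record that is isolated after only a few splits has a small mean path length and therefore a score near 1, while a record whose mean path length equals c(n) scores exactly 0.5. The logarithmic approximation of H(k) is rough for very small n.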
The truck accident recognition device can reduce the cost of data set construction and can achieve a relatively ideal recognition effect.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
It is further emphasized that the system provided in the embodiments of the present application may obtain and process relevant data based on artificial intelligence techniques. Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, sense the environment, acquire knowledge, and use that knowledge to obtain the best result. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly comprises computer vision, robotics, biometric recognition, speech processing, natural language processing, and machine learning/deep learning.
Reference is now made to fig. 3, which is a schematic diagram illustrating a computer device provided in some embodiments of the present application. As shown in fig. 3, the computer device 2 includes: a processor 200, a memory 201, a bus 202 and a communication interface 203, wherein the processor 200, the communication interface 203 and the memory 201 are connected through the bus 202; the memory 201 stores a computer program operable on the processor 200, and the processor 200 executes the computer program to perform the truck accident identification method provided in any one of the foregoing embodiments.
The Memory 201 may include a high-speed Random Access Memory (RAM) and may further include a non-volatile Memory (non-volatile Memory), such as at least one disk Memory. The communication connection between the network element of the system and at least one other network element is realized through at least one communication interface 203 (which may be wired or wireless), and the internet, a wide area network, a local network, a metropolitan area network, and the like can be used.
Bus 202 can be an ISA bus, PCI bus, EISA bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. The memory 201 is used for storing a program, and the processor 200 executes the program after receiving an execution instruction. The truck accident identification method disclosed by any of the foregoing embodiments of the present application may be applied to the processor 200 or implemented by the processor 200.
The processor 200 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or by instructions in the form of software in the processor 200. The processor 200 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor may implement or perform the various methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. The storage medium is located in the memory 201, and the processor 200 reads the information in the memory 201 and completes the steps of the method in combination with its hardware.
Referring to fig. 4, the computer-readable storage medium shown in fig. 4 is an optical disc 30, and a computer program (i.e., a program product) is stored on the optical disc 30, and when the computer program is executed by a processor, the computer program may perform the truck accident recognition method according to any of the foregoing embodiments.
In addition, examples of the computer-readable storage medium may also include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory, or other optical and magnetic storage media, which are not described in detail herein.
The computer-readable storage medium provided by the above embodiment of the present application is based on the same inventive concept as the truck accident identification method provided by the embodiments of the present application, and has the same beneficial effects as the method adopted, run, or implemented by the application program stored on it.
The present application further provides a computer program product, including a computer program which, when executed by a processor, implements the steps of the truck accident identification method provided in any of the foregoing embodiments. The method includes: acquiring overall data, wherein the overall data comprises basic attribute information of a vehicle, driving behavior information of a driver, vehicle operation track information, road information, holiday information and weather information; filtering, associating and performing stratified sampling on the overall data to obtain a sample data set; integrating the sample data set, determining a plurality of topics, and calculating feature dimensions under the plurality of topics; training an isolated forest model using the feature dimensions under the plurality of topics; and inputting the data to be identified of the truck into the trained isolated forest model to judge whether the truck has had an accident.
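The stratified sampling step can be sketched as below, assuming that the strata are vehicle types, that each stratum is sampled in proportion to its share of the records, and that sampling is without replacement. The field names `id` and `vehicle_type` and the fixed random seed are illustrative assumptions.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, sample_size, seed=0):
    """Draw about sample_size records, keeping each stratum's share of the whole."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    total = len(records)
    sample = []
    for members in groups.values():
        # Per-stratum quota, rounded; it can never exceed the stratum size.
        quota = min(round(sample_size * len(members) / total), len(members))
        # random.Random.sample draws without replacement.
        sample.extend(rng.sample(members, quota))
    return sample


# Hypothetical records: 80 heavy trucks and 20 light trucks.
records = [
    {"id": i, "vehicle_type": "heavy" if i < 80 else "light"}
    for i in range(100)
]
sample = stratified_sample(records, "vehicle_type", sample_size=10)
```

Because each quota is rounded independently, the total drawn can differ slightly from `sample_size` when the proportions do not divide evenly.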
It should be noted that: the algorithms and displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general purpose devices may be used with the teachings herein. The required structure for constructing such a device will be apparent from the description above. In addition, this application is not directed to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the present application as described herein, and any descriptions of specific languages are provided above to disclose the best modes of the present application. In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the application may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the application, various features of the application are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the application and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that the claimed application requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this application.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification, and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except that at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification may be replaced by an alternative feature serving the same, equivalent or similar purpose, unless expressly stated otherwise.
The various component embodiments of the present application may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components in the truck accident recognition apparatus according to embodiments of the present application. The present application may also be embodied as an apparatus or device program for carrying out a portion or all of the methods described herein. A program implementing the application may be stored on a computer readable medium or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.
The above description is only for the preferred embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A truck accident identification method, characterized in that the method comprises:
acquiring overall data, wherein the overall data comprises basic attribute information of a vehicle, driving behavior information of a driver, vehicle operation track information, road information, holiday information and weather information;
filtering, associating and performing stratified sampling on the overall data to obtain a sample data set;
integrating the sample data set, determining a plurality of topics, and calculating feature dimensions under the plurality of topics;
training an isolated forest model using the feature dimensions under the plurality of topics;
and inputting the data to be identified of the truck into the trained isolated forest model to judge whether the truck has an accident or not.
2. A truck accident identification method according to claim 1, wherein the plurality of topics comprises: basic attribute information of the vehicle, driving behavior information of a driver, collision attribute information of the vehicle, parking attribute information of the vehicle, alarm attribute information of the vehicle, attribute information of a road associated with the vehicle, peripheral parking vehicle information of the vehicle, holiday information of the vehicle operation and weather information of the vehicle operation.
3. The truck accident recognition method of claim 1, wherein the filtering, associating, and performing stratified sampling on the overall data to obtain a sample data set comprises:
sorting the vehicle operation track information by time, and screening out, based on adjacent track point information, parking behaviors in which an accident may have occurred;
associating the basic attribute information of the vehicle with the vehicle operation track information through the vehicle ID, and obtaining the proportion of each vehicle type based on grouping and aggregation;
and sampling without replacement, according to the proportion of each vehicle type, from the parking behaviors in which an accident may have occurred, to obtain a sample data set.
4. A truck accident identification method according to claim 3, wherein the sorting of the vehicle operation track information by time and the screening out of parking behaviors in which an accident may have occurred based on the adjacent track point information comprises:
sorting the vehicle operation track information by time;
for any vehicle, if the speeds of two adjacent track points are both 0 and the time difference between the two adjacent track points is within a preset time range, taking the earlier of the two adjacent track points as a starting stop point;
if the subsequent stop points of the vehicle are within a first preset distance of the starting stop point and the stopped state lasts longer than a preset duration, judging whether the vehicle is stopped at a gas station, a service area, an expressway parking strip, an inspection station or a parking lot, or whether the distance between the vehicle and a gas station, a service area, an expressway parking strip, an inspection station or a parking lot is within a second preset distance;
if the vehicle is not stopped at a gas station, a service area, an expressway parking strip, an inspection station or a parking lot, or the distance between the vehicle and the gas station, the service area, the expressway parking strip, the inspection station or the parking lot exceeds the second preset distance, determining that the vehicle has a parking behavior in which an accident may have occurred.
5. The truck accident recognition method of claim 2, wherein calculating the feature dimensions under the plurality of topics comprises: calculating the acceleration of the vehicle, the average speed of the vehicle, the speed standard deviation of the vehicle, the number of sudden accelerations per hundred kilometers, the number of sudden decelerations per hundred kilometers, the overspeed mileage per hundred kilometers, the daily fatigue driving time, the number of dangerous road sections passed per hundred kilometers, the number of turn signal uses per hundred kilometers, the number of braking events per hundred kilometers, the number of parked vehicles nearby before parking, the number of parked vehicles nearby after parking, the average speed of nearby vehicles before parking, and the average speed of nearby vehicles after parking.
6. A truck accident recognition method according to any one of claims 1 to 5, wherein the expression of the isolated forest model is:
s(x,n)=2^{-E(h(x))/c(n)}
wherein s(x, n) denotes the anomaly index of record x with respect to the iTrees built from training data of n samples, h(x) denotes the path length of x from the root node to the leaf node that holds it, E(h(x)) denotes the average of h(x) over the iTrees, and c(n) denotes the average path length given the sample number n.
7. A truck accident identification method according to claim 6, wherein the method further comprises: and comparing the ratio of the number of the abnormal samples to the total number of the samples with the prior accident occurrence probability, and selecting the abnormal rate closest to the prior accident occurrence probability as an output result of the isolated forest model.
8. A truck accident recognition apparatus, comprising:
an acquisition module, configured to acquire overall data, wherein the overall data comprises basic attribute information of a vehicle, driving behavior information of a driver, vehicle operation track information, road information, holiday information and weather information;
a sample obtaining module, configured to filter, associate and perform stratified sampling on the overall data to obtain a sample data set;
an integration module, configured to integrate the sample data set, determine a plurality of topics, and calculate feature dimensions under the plurality of topics;
a training module, configured to train an isolated forest model using the feature dimensions under the plurality of topics;
and an identification module, configured to input data to be identified of the truck into the trained isolated forest model to judge whether the truck has had an accident.
9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
10. A computer program product comprising a computer program, characterized in that the computer program realizes the steps of the method of any one of claims 1 to 7 when executed by a processor.
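The parking-behavior screening recited in claim 4 can be illustrated with a minimal sketch. All threshold values, the `TrackPoint` fields, and the representation of gas stations, service areas and similar places as a list of (latitude, longitude) pairs are assumptions for illustration; "within a preset time range" is read here as "no more than `max_gap_s` seconds apart".

```python
import math
from dataclasses import dataclass


@dataclass
class TrackPoint:
    t: float      # timestamp in seconds
    lat: float
    lon: float
    speed: float  # km/h


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    radius = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi, dlmb = p2 - p1, math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def suspicious_stops(points, pois, max_gap_s=300, first_distance_m=50,
                     min_duration_s=1800, second_distance_m=200):
    """Return (starting stop point, duration) pairs for long stops away from known places."""
    pts = sorted(points, key=lambda p: p.t)  # sort track points by time
    stops, i = [], 0
    while i < len(pts) - 1:
        a, b = pts[i], pts[i + 1]
        if a.speed == 0 and b.speed == 0 and b.t - a.t <= max_gap_s:
            start, j = a, i + 1
            # Extend the stop while the vehicle stays stationary near the start.
            while (j + 1 < len(pts) and pts[j + 1].speed == 0 and
                   haversine_m(start.lat, start.lon,
                               pts[j + 1].lat, pts[j + 1].lon) <= first_distance_m):
                j += 1
            duration = pts[j].t - start.t
            if duration > min_duration_s:
                near_poi = any(
                    haversine_m(start.lat, start.lon, lat, lon) <= second_distance_m
                    for lat, lon in pois)
                if not near_poi:
                    stops.append((start, duration))
            i = j + 1
        else:
            i += 1
    return stops
```

A stop is reported only if it lasts longer than `min_duration_s` and its starting point is farther than `second_distance_m` from every listed place.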
CN202210262059.2A 2022-03-17 2022-03-17 Truck accident identification method, device, storage medium and program product Pending CN114677254A (en)

Publications (1)

Publication Number Publication Date
CN114677254A true CN114677254A (en) 2022-06-28

Family

ID=82074269

Country Status (1)

Country Link
CN (1) CN114677254A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116543557A (en) * 2023-05-05 2023-08-04 重庆邮电大学 Real-time automobile electronic data extraction and fixing method based on accident detection model

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110149258A (en) * 2019-04-12 2019-08-20 北京航空航天大学 A kind of automobile CAN-bus network data method for detecting abnormality based on isolated forest
CN112505549A (en) * 2020-11-26 2021-03-16 西安电子科技大学 New energy automobile battery abnormity detection method based on isolated forest algorithm
CN112633395A (en) * 2020-12-29 2021-04-09 平安科技(深圳)有限公司 Abnormal data detection method and device, computer equipment and storage medium
CN112884480A (en) * 2021-03-31 2021-06-01 中国工商银行股份有限公司 Method and device for constructing abnormal transaction identification model, computer equipment and medium
WO2021135653A1 (en) * 2019-12-31 2021-07-08 北京嘀嘀无限科技发展有限公司 Method and system for identifying abnormal stay of vehicle
CN113743815A (en) * 2021-09-13 2021-12-03 一汽出行科技有限公司 Risk monitoring method and device for operating vehicle, storage medium and computer equipment
CN113792782A (en) * 2021-09-13 2021-12-14 一汽出行科技有限公司 Track monitoring method and device for operating vehicle, storage medium and computer equipment

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
柳本民;闫寒;: "基于SVM事故分类的连环追尾事故影响因素分析", 交通信息与安全, no. 01, 31 December 2020 (2020-12-31), pages 49 - 57 *
薛清文;蒋愚明;陆键;: "基于轨迹数据的危险驾驶行为识别方法", 中国公路学报, no. 06, 31 December 2020 (2020-12-31), pages 88 - 98 *
衡红军;刘静;: "基于混合方法的多维时间序列驾驶异常点检测", 计算机工程, no. 03, 31 December 2020 (2020-12-31), pages 105 - 110 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination