CN106095963B - Vehicle driving behavior analysis big data public service platform under internet + era - Google Patents
- Publication number
- CN106095963B CN201610430864.6A
- Authority
- CN
- China
- Prior art keywords
- data
- driving
- vehicle
- big data
- platform
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/904—Browsing; Visualisation therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2216/00—Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
- G06F2216/03—Data mining
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Databases & Information Systems (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Traffic Control Systems (AREA)
Abstract
A big data public service platform for vehicle driving behavior analysis in the Internet+ era analyzes vehicle driving behavior and vehicle driving risk probability in three steps. Step one stores the driving data of a vehicle over a period of time on a big data platform and processes and analyzes the data using the MapReduce framework. Step two, based on the data processed on the big data platform, counts the abnormal driving events of the vehicle in a given area and time period, where abnormal driving includes speeding, overloading, rapid acceleration and rapid deceleration, fatigue driving, and night driving. Step three, after counting the abnormal driving behaviors, calculates the vehicle risk probability using a self-organizing map neural network algorithm and predicts the subsequent driving risk of the vehicle. The invention collects the Beidou/GPS navigation data of vehicles, preprocesses the collected data using the big data platform, and, after further data mining, can analyze and judge the driving behavior of the driver.
Description
Technical field:
The invention relates to the field of physics, in particular to data storage and processing technology, and specifically to a big data public service platform for vehicle driving behavior analysis in the Internet+ era.
Background art:
With the growing degree of informatization, daily life is filled with massive amounts of information, much of it closely tied to people's lives. Analyzing and mining this information can improve daily life, serve the public, raise the level of social informatization, and open new opportunities for smart cities and dynamic information-based management.
As the number of vehicles on the road increases, so does the risk of personal injury and property loss caused by illegal driving. The risk of traffic violations differs from region to region, so it cannot simply be averaged. Because the driving environment must be taken into account, the driving area is first divided into regions and the risk probability within each region is then analyzed. The prior art cannot analyze vehicle driving behavior from the massive information generated in daily life.
Summary of the invention:
The invention aims to provide a big data public service platform for vehicle driving behavior analysis in the Internet+ era, in order to solve the technical problem that the prior art cannot analyze vehicle driving behavior from the massive information in daily life.
The platform analyzes vehicle driving behavior and vehicle driving risk probability through the following steps:
Step one: store the driving data of the vehicle over a period of time on a big data platform, and process and analyze the data using the MapReduce framework.
Step two: based on the data processed on the big data platform, count the abnormal driving events of the vehicle in a given area and time period, where abnormal driving includes speeding, overloading, rapid acceleration and rapid deceleration, fatigue driving, and night driving.
Step three: after counting the abnormal driving behaviors, calculate the vehicle risk probability using a self-organizing map neural network algorithm, and predict the subsequent driving risk of the vehicle.
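The counting in step two can be illustrated with a minimal Python sketch. The record layout (plate, region, timestamp, event type), the event-type labels, and the sample values are all hypothetical; the patent does not specify a schema, and on the real platform this aggregation would run as a MapReduce job rather than in memory.

```python
from collections import Counter
from datetime import datetime

# Hypothetical labels for the abnormal driving categories named in step two.
ABNORMAL_TYPES = {"overspeed", "overload", "rapid_accel_decel", "fatigue", "night"}

def count_abnormal(events, region, start, end):
    """Count abnormal driving events per type for one region and time window."""
    counts = Counter()
    for _plate, ev_region, ts, ev_type in events:
        when = datetime.fromisoformat(ts)
        in_window = start <= when < end  # end of the window is exclusive
        if ev_region == region and in_window and ev_type in ABNORMAL_TYPES:
            counts[ev_type] += 1
    return counts

# Hypothetical, already-classified events: (plate, region, timestamp, type).
events = [
    ("A123", "district_1", "2016-06-01T08:15:00", "overspeed"),
    ("A123", "district_1", "2016-06-01T23:40:00", "night"),
    ("B456", "district_1", "2016-06-01T09:05:00", "overspeed"),
    ("B456", "district_2", "2016-06-01T09:30:00", "overload"),
    ("C789", "district_1", "2016-06-02T10:00:00", "fatigue"),
]
daily = count_abnormal(events, "district_1",
                       datetime(2016, 6, 1), datetime(2016, 6, 2))
print(dict(daily))
```

The per-region, per-period counts produced this way are the kind of input that step three would turn into a risk probability.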
Further, processing and analyzing the vehicle driving data comprises the following. First, a data storage layer is established, whose main schemes are a storage scheme based on the Hadoop Distributed File System (HDFS) and a storage scheme based on HDFS plus the distributed, column-oriented open-source database HBase. Second, a data processing layer is established, in which raw data is processed using the massively parallel computing framework MapReduce (MR for short). Third, a data analysis layer is established to further analyze, model, and mine the data. Fourth, data visualization is established. Fifth, a security layer is established to provide a security protection system for the big data platform. The platform exposes a unified service interface to the outside, and users access the big data service platform through a unified data access interface or a unified portal page.
Furthermore, the road network segment on which the vehicle is traveling is found using a hidden Markov algorithm, the information for that segment is looked up in the map data, and finally the driving speed from the positioning data is input into a spatio-temporal speed limit model to determine whether the vehicle is speeding.
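As a rough illustration of this speeding check, the sketch below matches GPS points to road segments with a small Viterbi decoder and then compares each reported speed with the limit of the matched segment. Everything concrete here is an assumption made for the example: the three-segment network, the Gaussian distance score, the unnormalized transition scores of 0.9 and 0.1, and the use of a fixed per-segment limit in place of the spatio-temporal speed limit model, which the patent does not detail.

```python
import math

# Hypothetical network: segment id -> (midpoint x, midpoint y, speed limit km/h).
SEGMENTS = {"s1": (0.0, 0.0, 60.0), "s2": (100.0, 0.0, 60.0), "s3": (100.0, 100.0, 40.0)}
# Hypothetical connectivity: segments reachable from each segment in one step.
ADJACENT = {"s1": {"s1", "s2"}, "s2": {"s1", "s2", "s3"}, "s3": {"s2", "s3"}}
SIGMA = 30.0  # assumed GPS noise in metres

def log_emission(seg, x, y):
    """Log score of observing (x, y) on a segment (Gaussian in the distance)."""
    sx, sy, _limit = SEGMENTS[seg]
    return -((x - sx) ** 2 + (y - sy) ** 2) / (2 * SIGMA ** 2)

def log_transition(a, b):
    """Unnormalized log score for moving from segment a to segment b."""
    return math.log(0.9) if b in ADJACENT[a] else math.log(0.1)

def viterbi(points):
    """Return the most likely segment sequence for a list of (x, y, speed)."""
    segs = list(SEGMENTS)
    score = {s: log_emission(s, *points[0][:2]) for s in segs}
    back = []
    for x, y, _speed in points[1:]:
        prev, score, step = score, {}, {}
        for s in segs:
            best = max(segs, key=lambda p: prev[p] + log_transition(p, s))
            score[s] = prev[best] + log_transition(best, s) + log_emission(s, x, y)
            step[s] = best
        back.append(step)
    path = [max(segs, key=score.get)]
    for step in reversed(back):
        path.append(step[path[-1]])
    return path[::-1]

def overspeed_flags(points):
    """Pair each matched segment with True if the speed exceeds its limit."""
    path = viterbi(points)
    return [(seg, speed > SEGMENTS[seg][2]) for seg, (_, _, speed) in zip(path, points)]

# Hypothetical track: (x, y, speed in km/h).
track = [(5.0, 3.0, 55.0), (90.0, 8.0, 70.0), (104.0, 95.0, 45.0)]
flags = overspeed_flags(track)
print(flags)
```

A time-dependent limit could be modeled by replacing the fixed limit in `SEGMENTS` with a lookup keyed on segment and time of day.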
Further, a support vector machine algorithm is implemented on the big data platform as follows. First, the raw data is standardized to eliminate the influence of differing units and scales. Second, the average speed, average acceleration, and starting acceleration are calculated from the time, speed, and mileage in the raw data and used as training samples. Third, the large set of training samples is uploaded to HDFS and split into blocks; each node starts a Map task that runs sequential minimal optimization (SMO) training on the data in its block to compute the support vectors of that block, and the support vectors trained at each node are then passed to a Reduce task and submitted to SMO training again to obtain the final support vectors. Fourth, a classifier is built from the support vectors obtained in the third step and used to judge whether the vehicle is overloaded.
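The feature preparation in the first two steps can be sketched as follows. This covers only the training-sample construction, not the distributed SMO training. Two points are assumptions of the sketch rather than statements from the patent: "starting acceleration" is taken to be the acceleration over the first sampling interval, and standardization is applied to the derived feature columns (z-scores) rather than to the raw records.

```python
from statistics import mean, pstdev

def trip_features(samples):
    """Derive [average speed, average acceleration, starting acceleration].

    samples: list of (time_s, speed_mps, mileage_m) tuples ordered by time.
    """
    t0, _v0, m0 = samples[0]
    t1, _v1, m1 = samples[-1]
    avg_speed = (m1 - m0) / (t1 - t0)
    # Acceleration over each consecutive pair of samples.
    accels = [(b[1] - a[1]) / (b[0] - a[0]) for a, b in zip(samples, samples[1:])]
    return [avg_speed, mean(accels), accels[0]]

def standardize(rows):
    """Z-score each feature column so that units no longer dominate."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]

# Two hypothetical trips sampled every 10 seconds.
trips = [
    [(0, 0.0, 0.0), (10, 10.0, 50.0), (20, 20.0, 200.0)],
    [(0, 0.0, 0.0), (10, 5.0, 25.0), (20, 10.0, 100.0)],
]
features = [trip_features(t) for t in trips]
scaled = standardize(features)
print(features)
print(scaled)
```

The rows of `scaled`, paired with overloaded or not-overloaded labels, would form the training samples handed to the classifier.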
Further, a linear regression model is established on the big data platform as follows. First, the big data platform is used to find the driving data of all vehicles on the road segment and to compute the speed difference between each pair of adjacent data points. Second, a statistical linear regression algorithm is applied over the data of all vehicles that drove on the segment, taking as input the speed change Δv of a vehicle from an initial speed v0 to the next sampling time and as output the distance traveled in that time. Third, the regression equation is solved to obtain the model S(Δv, v0) relating speed change and driving distance on the segment. Finally, the equation is used to judge the probability of rapid acceleration and rapid deceleration: the larger the error e = ||S - Si|| between the distance of the i-th vehicle and the model distance, the higher the probability that the vehicle accelerated or decelerated rapidly, and the smaller the error, the lower the probability.
Specifically, the big data platform extracts from the raw data the driving data of all vehicles at all times on a given road segment; the data is processed to obtain the speed difference between adjacent points, and a regression model is then fitted by regression analysis. The regression model yields the probability of rapid acceleration and rapid deceleration on the segment. Let Δv be the speed difference between two time points on the same road segment and v0 the initial speed of the vehicle at each time point. Since the distance traveled by the vehicle over a period of time is linear in the speed difference and the initial speed, the regression model can be written as
$$ S = b_0 + b_1\,\Delta v + b_2\,v_0 $$
where b0, b1, and b2 are the partial regression coefficients for the variables Δv and v0.
The values of the partial regression coefficients are obtained from training data by the least squares principle, and the resulting model is tested using analysis of variance. Finally, the probability of rapid acceleration and rapid deceleration is obtained from the size of the deviation between the actual driving distance of the vehicle and the driving distance given by the regression model.
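A least-squares fit of this kind can be sketched in plain Python by solving the normal equations. The sample values are hypothetical, and the sketch stops at the residual between the observed and predicted distance; how the patent converts that residual into a probability is not specified, so no such mapping is attempted here.

```python
def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit(samples):
    """Fit S = b0 + b1*dv + b2*v0; samples are (dv, v0, distance) tuples."""
    rows = [[1.0, dv, v0] for dv, v0, _s in samples]
    ys = [s for _dv, _v0, s in samples]
    # Normal equations: (X^T X) b = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve(xtx, xty)

def residual(coef, dv, v0, s):
    """Absolute gap between an observed distance and the model prediction."""
    return abs(s - (coef[0] + coef[1] * dv + coef[2] * v0))

# Hypothetical segment data generated from S = 2 + 0.5*dv + 1.0*v0.
samples = [(0.0, 10.0, 12.0), (2.0, 10.0, 13.0), (0.0, 20.0, 22.0),
           (4.0, 15.0, 19.0), (-2.0, 12.0, 13.0)]
coef = fit(samples)
gap = residual(coef, 1.0, 10.0, 20.0)  # a vehicle far from the fitted model
print(coef, gap)
```

A vehicle whose residual is large relative to the others on the same segment would be the one judged more likely to have accelerated or decelerated rapidly.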
Further, calculating the vehicle risk probability comprises the following. First, the risk probability is measured. Second, a risk probability model is established: the historical data is preprocessed and clustered to obtain l clusters, Cn = {I1, I2, ..., Il}, where Ii is one of the clusters. A prediction model is then constructed from the clustering result by calculating the average risk probability of each cluster and the similarity coefficient between each feature vector and the average vector, and the risk probability is predicted.
Specifically, the risk probability is the likelihood of a risk occurring, expressed as a percentage; it measures how probable a given state of affairs is. Driving data from vehicles in the same area is highly similar, while driving data from different areas differs considerably. Taking each area as one cluster, the vehicle driving data is clustered by feature points to obtain the driving data of the different areas. Each feature point (feature vector) corresponds to a point in a cluster. Assuming that the i-th feature vector in the cluster is Xi (i ∈ N), the risk probability of Xi can be expressed as the ratio
$$ \frac{U(N_r(X_i))}{\lVert N_r(X_i) \rVert} $$
where Nr(Xi) is the neighborhood of radius r around the point Xi, ||Nr(Xi)|| is the number of all points in that neighborhood, and U(Nr(Xi)) is the number of risk events occurring in the cluster. It can be seen from equation (2) that the risk probability is the average probability over all points in the cluster. It can also be expressed in terms of the similarity between the feature point Xi and the average vector of all points.
Determining the risk probability prediction model comprises three parts:
Preprocessing of historical data
(1) Data attributes are abstracted from the historical data. The driving behavior data of a vehicle comprises the license plate number, longitude and latitude, driving speed, driving direction, and driving time.
(2) The historical data is clustered to obtain l clusters, Cn = {I1, I2, ..., Il}, where Ii is one of the clusters. Assume cluster Ii contains k feature vectors, where Xij is the j-th feature vector in the cluster.
Construction of the prediction model
After clustering, the risk probability of the feature vectors in each cluster is calculated as follows:
(1) Calculate the average risk probability in cluster Ii, that is, the sum of the risk indicators of its points divided by |Ii|,
where |Ii| is the number of elements in cluster Ii and the indicator of a point is 1 if the point represents a risk occurrence and 0 otherwise.
(2) Calculate the average feature vector of all points in cluster Ii.
(3) Calculate the similarity coefficient of each feature vector in the cluster.
(4) Calculate the risk probability of each feature vector.
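The four model-building steps above can be sketched as follows. The equations for the similarity coefficient and the per-vector risk probability did not survive in this text, so two stand-ins are used and should be read as assumptions: similarity is taken as 1 / (1 + Euclidean distance), which is largest (equal to 1) for identical vectors, and the risk probability of a vector is taken as its similarity to the cluster center multiplied by the cluster's average risk probability.

```python
import math

def mean_vector(vectors):
    """Step (2): component-wise average of the feature vectors in a cluster."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def similarity(x, center):
    """Step (3). ASSUMED form: 1 / (1 + Euclidean distance), maximum 1."""
    return 1.0 / (1.0 + math.dist(x, center))

def build_cluster_model(vectors, risk_flags):
    """Steps (1)-(2): risk_flags[j] is 1 if point j is a risk occurrence, else 0."""
    return {
        "avg_risk": sum(risk_flags) / len(vectors),
        "center": mean_vector(vectors),
    }

def risk_probability(x, model):
    """Step (4). ASSUMED form: similarity to the center times average risk."""
    return similarity(x, model["center"]) * model["avg_risk"]

# Hypothetical cluster of four feature vectors, two of them risk occurrences.
cluster = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
flags = [1, 0, 1, 0]
model = build_cluster_model(cluster, flags)
p_center = risk_probability([1.0, 1.0], model)
p_far = risk_probability([4.0, 5.0], model)
print(model, p_center, p_far)
```

Under these assumptions, a new point that has been assigned to the cluster is scored with the same `risk_probability` call, and points far from the cluster center receive a lower probability than points near it.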
Risk probability prediction
After the risk probability model has been calculated, predictions can be made for new vehicle data as follows:
(1) Perform cluster analysis on the new vehicle driving data Xp.
(2) Clustering generally assigns the new data to a particular cluster; then calculate the similarity coefficient between the data and the average vector of that cluster.
(3) Calculate the risk probability of the new data point.
Further, in the step of clustering the data, a Hadoop-based self-organizing map neural network clustering algorithm is adopted, and the MapReduce parallel computing framework is used to perform the computation.
Further, in the step of computing with the MapReduce parallel computing framework, the winning neuron and the weight modification are calculated in the Map stage; in the Reduce stage, the total update amount of each weight connected to the same winning neuron is accumulated, and the weights are then updated.
Further, in the algorithm implementing the support vector machine, the similarity between two vectors is measured using the Euclidean distance.
Specifically, the above Hadoop-based self-organizing map (SOM) neural network clustering algorithm uses competitive learning to adapt and reorganize its internal structure: after receiving external input, the network automatically divides into different regions, each of which responds differently to the input, and this process completes automatically. The steps of the algorithm are as follows:
(1) Initialization and normalization. Normalize the current input vector and the neuron weight vectors Wi in the competition layer, establish the initial winning neighborhood, and assign the initial gain μ(0).
(2) Find the winning neuron. Compare the input vector for similarity against all of the weight vectors in the competition layer, and take the neuron with the most similar weight vector as the winner. Mathematically, the two most similar normalized vectors are those with the largest dot product, so the winning neuron is the one whose weight vector has the largest dot product with the input vector.
(3) Adjust the weights of the nodes in the winning neighborhood,
where μ(t) is the gain, which decreases toward 0 over time.
(4) If the gain has dropped to the set threshold μmin, the process ends; otherwise, processing continues with new input data from the first step.
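The four steps can be sketched as a small single-machine training loop. Several details are assumptions of this sketch: the winning neighborhood is reduced to the winner alone, the weight update is the common form W ← W + μ(t)(X − W) followed by renormalization, and the gain decays geometrically; the patent's own update equation is not reproduced in this text.

```python
import math
import random

def normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def winner(weights, x):
    """Step (2): index of the neuron whose weights have the largest dot product with x."""
    return max(range(len(weights)),
               key=lambda j: sum(w * xi for w, xi in zip(weights[j], x)))

def train_som(samples, n_neurons, gain=0.5, decay=0.9, gain_min=0.01, seed=0):
    """Steps (1)-(4) with a winner-only neighborhood (an assumption)."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # Step (1): normalized random initial weights and normalized inputs.
    weights = [normalize([rng.random() for _ in range(dim)]) for _ in range(n_neurons)]
    data = [normalize(s) for s in samples]
    while gain > gain_min:                      # step (4): stop at the gain threshold
        for x in data:
            j = winner(weights, x)
            # Step (3): pull the winner toward the input, then renormalize.
            moved = [w + gain * (xi - w) for w, xi in zip(weights[j], x)]
            weights[j] = normalize(moved)
        gain *= decay                           # assumed geometric gain decay
    return weights

# Hypothetical two-feature samples forming two obvious groups.
samples = [[1.0, 0.1], [1.0, 0.05], [0.1, 1.0], [0.05, 1.0]]
weights = train_som(samples, n_neurons=2)
labels = [winner(weights, normalize(s)) for s in samples]
print(labels)
```

After training, each weight vector sits near one group of samples, so the winner index serves as the cluster label.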
The algorithm runs on a Hadoop platform, and the implementation steps are as follows:
Map phase
Input the key-value pair <key, value>, where key is the sample sequence number and value is the sample feature set.
Calculate the output values of the competition layer according to the competitive learning principle.
Determine the winning neuron, setting its state to 1 and the states of all other neurons to 0. Set flag to the position of the winning neuron in the competition layer.
Calculate the update amount of each weight connected to the winning neuron, leaving all other weights unchanged.
Output the key-value pair, where key is flag and value is the update amount of the weights connected to the winning neuron.
Reduce phase
Accept the output of the map function as the input of the reduce function, where the input key is the position of the winning neuron and the values are the collection of weight update amounts connected to that neuron.
Accumulate to obtain the total update amount of each weight, and calculate the average update amount.
Adjust the corresponding network weights.
Output the key-value pair, where key is the weight update state and value is the updated network weight.
Update the global vector, and repeat the MapReduce process until the error between the weight vectors and the winning samples falls below a set threshold.
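The Map and Reduce phases can be imitated in a single Python process to show the data flow. This is only an in-memory emulation, not Hadoop code: the function names, the fixed gain, the stopping rule (largest weight change below a tolerance), and the sample values are assumptions, and the inputs are assumed to be normalized already.

```python
from collections import defaultdict

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def map_phase(samples, weights, gain):
    """Emit (flag, update) per sample: flag is the winning neuron's position."""
    for _key, x in samples:
        flag = max(range(len(weights)), key=lambda j: dot(weights[j], x))
        yield flag, [gain * (xi - w) for xi, w in zip(x, weights[flag])]

def reduce_phase(pairs, weights):
    """Average the updates emitted for each winning neuron and apply them."""
    grouped = defaultdict(list)
    for flag, delta in pairs:
        grouped[flag].append(delta)
    new_weights = [w[:] for w in weights]       # untouched neurons keep their weights
    for flag, deltas in grouped.items():
        avg = [sum(col) / len(deltas) for col in zip(*deltas)]
        new_weights[flag] = [w + d for w, d in zip(weights[flag], avg)]
    return new_weights

def train(samples, weights, gain=0.5, tol=1e-6, max_rounds=100):
    """Repeat map/reduce rounds until the weights stop changing (assumed rule)."""
    for _ in range(max_rounds):
        updated = reduce_phase(map_phase(samples, weights, gain), weights)
        change = max(abs(a - b)
                     for wa, wb in zip(updated, weights) for a, b in zip(wa, wb))
        weights = updated
        if change < tol:
            break
    return weights

# Hypothetical unit-length samples keyed by sequence number.
samples = [(0, [1.0, 0.0]), (1, [0.8, 0.6]), (2, [0.0, 1.0]), (3, [0.6, 0.8])]
final = train(samples, [[0.8, 0.2], [0.2, 0.8]])
print(final)
```

Because each round averages the updates of all samples that share a winner, the weights of a neuron move toward the mean of its samples, which matches the role of the weight vectors as cluster centers.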
It can be seen from the above steps that SOM learning clusters adaptively according to the intrinsic relationships among the samples: the output neuron weight vectors Wi gradually approach the winning samples, so the set of weight vectors {Wi | i = 1, 2, ..., l} can be regarded as the cluster centers of all samples. The similarity measure is described as follows:
Let a feature vector X = {x1, x2, ..., xn} belong to a cluster Ii in the feature space. The similarity coefficient between the feature vector and the average vector (cluster center) is given by equation (9).
It can be seen from equation (9) that the similarity between vectors is measured using the Euclidean distance: the more similar the two vectors, the larger the value, up to a maximum of 1.
Compared with the prior art, the invention has a positive and evident effect. The invention collects the Beidou/GPS navigation data of vehicles, preprocesses the collected data using the big data platform, and, after further data mining, can analyze and judge the driving behavior of the driver.
Description of the drawings:
Fig. 1 is a framework diagram of the big data public service platform for vehicle driving behavior analysis in the Internet+ era according to the present invention.
Fig. 2 is a flow chart of the self-organizing map neural network algorithm used in the invention.
Detailed description:
Example 1:
As shown in Fig. 1 and Fig. 2, the big data public service platform for vehicle driving behavior analysis in the Internet+ era according to the present invention includes four parts: a data storage layer, a data processing layer, a data analysis layer, and a data visualization layer.
To judge whether a vehicle is speeding: first, the optimal road segment on which the vehicle is traveling is found using a hidden Markov algorithm; second, the speed limit corresponding to the road type is found from the map data; third, the driving speed in the acquired data is compared with the speed limit of the road segment to determine whether the vehicle is speeding.
To judge whether a vehicle is overloaded: first, the raw data is normalized to eliminate the influence of differing units and scales; second, the average speed, average acceleration, and starting acceleration are calculated from the time, speed, and mileage in the raw data and used as training samples; third, a classifier is trained on the training samples from the second step; fourth, the classifier is used to judge whether the vehicle is overloaded.
To judge rapid acceleration and rapid deceleration, the big data platform first extracts from the raw data the driving data of all vehicles at all times on a given road segment; the data is processed to obtain the speed difference between adjacent points, and a regression model is then fitted by regression analysis. The regression model yields the probability of rapid acceleration and rapid deceleration on the segment. Let Δv be the speed difference between two time points on the same road segment and v0 the initial speed of the vehicle at each time point. Because the distance traveled by the vehicle over a period of time is linear in the speed difference and the initial speed, the regression model is:
$$ S = b_0 + b_1\,\Delta v + b_2\,v_0 $$
where b0, b1, and b2 are the partial regression coefficients for the variables Δv and v0.
The values of the partial regression coefficients are obtained from training data by the least squares principle, and the resulting model is tested using analysis of variance. Finally, the probability of rapid acceleration and rapid deceleration is obtained from the size of the deviation between the actual driving distance of the vehicle and the driving distance given by the regression model.
The risk probability is the likelihood of a risk occurring, expressed as a percentage; it measures how probable a given state of affairs is. Driving data from vehicles in the same area is highly similar, while driving data from different areas differs considerably. Taking each area as one cluster, the vehicle driving data is clustered by feature points to obtain the driving data of the different areas. Each feature point (feature vector) corresponds to a point in a cluster. Assuming that the i-th feature vector in the cluster is Xi (i ∈ N), the risk probability of Xi can be expressed as the ratio
$$ \frac{U(N_r(X_i))}{\lVert N_r(X_i) \rVert} $$
where Nr(Xi) is the neighborhood of radius r around the point Xi, ||Nr(Xi)|| is the number of all points in that neighborhood, and U(Nr(Xi)) is the number of risk events occurring in the cluster. It can be seen from equation (1) that the risk probability is the average probability over all points in the cluster. It can also be expressed in terms of the similarity between the feature point Xi and the average vector of all points.
The risk probability prediction model is divided into three parts:
Preprocessing of historical data
Data attributes are abstracted from the historical data. The driving behavior data of a vehicle comprises the license plate number, longitude and latitude, driving speed, driving direction, and driving time.
The historical data is clustered to obtain l clusters, Cn = {I1, I2, ..., Il}, where Ii is one of the clusters. Assume cluster Ii contains k feature vectors, where Xij is the j-th feature vector in the cluster.
Construction of the prediction model
After clustering, the risk probability of the feature vectors in each cluster is calculated as follows:
(1) Calculate the average risk probability in cluster Ii, that is, the sum of the risk indicators of its points divided by |Ii|,
where |Ii| is the number of elements in cluster Ii and the indicator of a point is 1 if the point represents a risk occurrence and 0 otherwise.
(2) Calculate the average feature vector of all points in cluster Ii.
(3) Calculate the similarity coefficient of each feature vector in the cluster.
(4) Calculate the risk probability of each feature vector.
Risk probability prediction
After the risk probability model has been calculated, predictions can be made for new vehicle data as follows:
Perform cluster analysis on the new vehicle driving data Xp.
Clustering generally assigns the new data to a particular cluster; then calculate the similarity coefficient between the data and the average vector of that cluster.
Calculate the risk probability of the new data point.
The key to solving the risk probability model for the data points is how to cluster the historical data with a clustering algorithm. A self-organizing map (SOM) neural network clustering algorithm is employed here. The algorithm uses competitive learning to adapt and reorganize its internal structure: after receiving external input, the network automatically divides into different regions, each of which responds differently to the input, and this process completes automatically. The steps of the algorithm are as follows:
Initialization and normalization. Normalize the current input vector and the neuron weight vectors Wi in the competition layer, establish the initial winning neighborhood Ni(0), and assign the initial gain μ(0).
Find the winning neuron. Compare the input vector for similarity against all of the weight vectors in the competition layer, and take the neuron with the most similar weight vector as the winner. Mathematically, the two most similar normalized vectors are those with the largest dot product, so the winning neuron is the one whose weight vector has the largest dot product with the input vector.
Adjust the weights of the nodes in the winning neighborhood,
where μ(t) is the gain, which decreases toward 0 over time.
If the gain has dropped to the set threshold μmin, the process ends; otherwise, processing continues with new input data from the first step.
The algorithm runs on a Hadoop platform, and the implementation steps are as follows:
Map phase
Input the key-value pair <key, value>, where key is the sample sequence number and value is the sample feature set.
Calculate the output values of the competition layer according to the competitive learning principle.
Determine the winning neuron, setting its state to 1 and the states of all other neurons to 0. Set flag to the position of the winning neuron in the competition layer.
Calculate the update amount of each weight connected to the winning neuron, leaving all other weights unchanged.
Output the key-value pair, where key is flag and value is the update amount of the weights connected to the winning neuron.
Reduce phase
Accept the output of the map function as the input of the reduce function, where the input key is the position of the winning neuron and the values are the collection of weight update amounts connected to that neuron.
Accumulate to obtain the total update amount of each weight, and calculate the average update amount.
Adjust the corresponding network weights.
Output the key-value pair, where key is the weight update state and value is the updated network weight.
Update the global vector, and repeat the MapReduce process until the error between the weight vectors and the winning samples falls below a set threshold.
It can be seen from the above steps that SOM learning clusters adaptively according to the intrinsic relationships among the samples: the output neuron weight vectors Wi gradually approach the winning samples, so the set of weight vectors {Wi | i = 1, 2, ..., l} can be regarded as the cluster centers of all samples. The similarity measure is as follows:
Let a feature vector X = {x1, x2, ..., xn} belong to a cluster Ii in the feature space. The similarity coefficient between the feature vector and the average vector (cluster center) is given by equation (9).
It can be seen from equation (9) that the similarity between vectors is measured using the Euclidean distance: the more similar the two vectors, the larger the value, up to a maximum of 1.
Claims (4)
1. the utility model provides a vehicle driving behavior analysis big data public service platform under internet + time which characterized in that: the method for analyzing the driving behavior of the vehicle and the driving risk probability of the vehicle comprises the following steps:
Step one, storing driving data of the automobile within a period of time through a big data platform, processing and analyzing the data by adopting a MapReduce framework,
Step two, counting the abnormal driving times of the automobile in a certain area and a certain time period based on the processed data under the big data platform, wherein the abnormal driving times comprise overspeed, overload, rapid acceleration and rapid deceleration, fatigue driving and night driving,
Thirdly, calculating the risk probability of the vehicle by utilizing a self-organizing mapping neural network algorithm after counting the abnormal driving behaviors, and predicting the subsequent vehicle running risk;
The method comprises the steps of establishing a data storage layer in the first step, and establishing a Hadoop distributed file system based storage scheme and a Hadoop distributed file system + distributed column-oriented source database technology based storage scheme; establishing a data processing layer, processing original data by adopting a large-scale parallel computing frame in the data processing layer, establishing a data analysis layer in the third step, further analyzing, modeling and mining the data, establishing data visualization in the fourth step, establishing a security layer in the fifth step, providing a security protection system for the big data platform, providing a uniform service interface for the outside, and accessing the big data service platform by a user through a uniform data access interface or a uniform portal page;
Finding road network sections where vehicles run by using a hidden Markov algorithm, finding information of the road sections where the vehicles run by using map data, and finally inputting the running speed in positioning data into a space-time speed-limiting model to finally obtain the judgment of whether the vehicles overspeed;
an algorithm of a support vector machine is realized by utilizing a big data platform, wherein the algorithm comprises the following steps of firstly, standardizing original data to eliminate the influence of dimensions; secondly, calculating average speed, average acceleration and starting acceleration through time, speed and mileage in the original data, and taking the average speed, the average acceleration and the starting acceleration as training samples; thirdly, uploading a large number of training samples to a Hadoop distributed file system and partitioning, starting a Map task of a Hadoop distributed processing platform by each node to perform sequence minimum optimization training on data in the block to calculate a support vector of each block of data, and finally inputting the support vector of each node training position into a Reduce task of the Hadoop distributed processing platform and submitting the support vector to an SMO algorithm training to obtain a final support vector; fourthly, obtaining a classifier by using the support vector in the third step, and judging whether the vehicle is overloaded by using the classifier;
Establishing a linear regression model on the big data platform, comprising: firstly, finding the driving data of all vehicles on the road section by using the big data platform, and calculating the speed difference between each two adjacent data records; secondly, applying a statistical linear regression algorithm in which the input is the speed variation Δv of a vehicle on a given road from an initial speed v0 to the next driving moment and the output is the driving distance S of the vehicle during that time, the data of all vehicles that have driven on the road being counted;
The vehicle risk probability calculation comprises: in a first step, measuring the risk probability; in a second step, establishing a risk probability model, which includes preprocessing historical data, clustering the data to obtain l clusters Cn = {I1, I2, ..., Il}, where Ii is one of the clusters, constructing a prediction model from the clustering results, calculating the average risk probability of each cluster and the similarity coefficient between a feature vector and the cluster's average vector, and predicting the risk probability.
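The prediction step at the end of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not define the similarity coefficient or how the cluster averages are combined, so the formula 1 / (1 + Euclidean distance) and the similarity-weighted average used here are assumptions, and the names `mean_vector`, `similarity` and `predict_risk` are hypothetical.

```python
import math

def mean_vector(vectors):
    """Average vector of one cluster (component-wise mean)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def similarity(a, b):
    """Assumed similarity coefficient: 1 / (1 + Euclidean distance)."""
    distance = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 / (1.0 + distance)

def predict_risk(feature, clusters):
    """Predict a risk probability for one feature vector.

    clusters is a list of (member_vectors, member_risks) pairs, one per
    cluster Ii. The prediction is the similarity-weighted average of the
    clusters' average risk probabilities (an assumed combination rule).
    """
    weights, mean_risks = [], []
    for vectors, risks in clusters:
        weights.append(similarity(feature, mean_vector(vectors)))
        mean_risks.append(sum(risks) / len(risks))
    return sum(w * r for w, r in zip(weights, mean_risks)) / sum(weights)

# Two hypothetical clusters of historical feature vectors with known risks.
history = [
    ([[0.0, 0.0], [0.0, 2.0]], [0.1, 0.3]),      # low-risk cluster
    ([[10.0, 10.0], [10.0, 12.0]], [0.7, 0.9]),  # high-risk cluster
]
risk_near_low_cluster = predict_risk([0.0, 1.0], history)
```

A feature vector that coincides with the average vector of the low-risk cluster yields a prediction close to that cluster's average risk, while a vector equidistant from both cluster averages yields the midpoint of the two average risks.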
2. The vehicle driving behavior analysis big data public service platform under the internet + era according to claim 1, wherein: in the step of clustering the data, a self-organizing map (SOM) neural network clustering algorithm is adopted, and the MapReduce large-scale parallel computing framework is used to compute the data.
3. The vehicle driving behavior analysis big data public service platform under the internet + era according to claim 2, wherein: in the step of computing the data with the MapReduce large-scale parallel computing framework, the winning neuron and the weight correction are calculated in the Map stage, the total update amount of each weight connected to the same winning neuron is accumulated in the Reduce stage, and the weights are then updated.
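The division of work between the Map and Reduce stages described in claims 2 and 3 can be sketched in plain Python as follows. This is a single-process simulation under stated assumptions, not Hadoop code and not the patented implementation: only the winning neuron is updated (no neighbourhood function), the learning rate `lr` is a hypothetical parameter, and the function names are illustrative.

```python
from collections import defaultdict

def map_stage(sample, weights, lr):
    """Map: find the winning neuron and compute its weight correction."""
    winner = min(
        range(len(weights)),
        key=lambda j: sum((s - w) ** 2 for s, w in zip(sample, weights[j])),
    )
    correction = [lr * (s - w) for s, w in zip(sample, weights[winner])]
    return winner, correction

def reduce_stage(pairs, dim):
    """Reduce: sum the corrections emitted for the same winning neuron."""
    totals = defaultdict(lambda: [0.0] * dim)
    for winner, correction in pairs:
        totals[winner] = [t + c for t, c in zip(totals[winner], correction)]
    return dict(totals)

def som_epoch(samples, weights, lr=0.1):
    """One batch pass: map every sample, reduce, then update the weights."""
    pairs = [map_stage(sample, weights, lr) for sample in samples]
    totals = reduce_stage(pairs, len(weights[0]))
    return [
        [w + t for w, t in zip(weights[j], totals[j])] if j in totals
        else list(weights[j])
        for j in range(len(weights))
    ]

# Two neurons and three hypothetical two-dimensional samples.
updated = som_epoch(
    [[1.0, 1.0], [2.0, 0.0], [9.0, 11.0]],
    [[0.0, 0.0], [10.0, 10.0]],
    lr=0.1,
)
```

Because the corrections for one neuron are summed rather than averaged, the learning rate should be kept small when many samples map to the same winner; otherwise the accumulated update can overshoot.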
4. The vehicle driving behavior analysis big data public service platform under the internet + era according to claim 1, wherein: in the algorithm implementing the support vector machine, the similarity between two vectors is measured by the Euclidean distance.
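As a rough illustration of why claim 1 standardizes the data before claim 4 measures similarity by Euclidean distance, the following Python sketch standardizes each feature column to zero mean and unit variance and then computes the distance between two samples. The feature order (average speed, average acceleration, starting acceleration) follows claim 1; the numeric values and function names are hypothetical, and z-score scaling is an assumed choice of standardization.

```python
import math

def standardize(rows):
    """Scale each column to zero mean and unit (population) variance."""
    columns = list(zip(*rows))
    means = [sum(c) / len(c) for c in columns]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0  # guard constant columns
        for c, m in zip(columns, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical samples: [average speed, average acceleration, starting acceleration]
samples = [
    [60.0, 0.5, 1.0],
    [80.0, 0.7, 1.4],
    [100.0, 0.9, 1.8],
]
scaled = standardize(samples)
raw_distance = euclidean(samples[0], samples[1])
scaled_distance = euclidean(scaled[0], scaled[1])
```

On the raw values the distance is dominated by the speed column because of its larger magnitude; after standardization all three features contribute equally.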
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610430864.6A CN106095963B (en) | 2016-06-17 | 2016-06-17 | Vehicle driving behavior analysis big data public service platform under internet + era |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610430864.6A CN106095963B (en) | 2016-06-17 | 2016-06-17 | Vehicle driving behavior analysis big data public service platform under internet + era |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106095963A (en) | 2016-11-09 |
CN106095963B (en) | 2019-12-06 |
Family
ID=57235694
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610430864.6A Active CN106095963B (en) | 2016-06-17 | 2016-06-17 | Vehicle driving behavior analysis big data public service platform under internet + era |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106095963B (en) |
Families Citing this family (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106682394A (en) * | 2016-11-30 | 2017-05-17 | 北京拓明科技有限公司 | Big data analyzing method and system of survival risk |
CN108268678B (en) * | 2016-12-30 | 2021-07-23 | 上海擎感智能科技有限公司 | Driving behavior analysis method, device and system |
CN107153916A (en) * | 2017-04-30 | 2017-09-12 | 安徽中科美络信息技术有限公司 | A kind of driving behavior evaluation method clustered based on FCM with BP neural network |
CN108944799B (en) * | 2017-05-18 | 2022-06-03 | 腾讯科技(深圳)有限公司 | Vehicle driving behavior abnormity processing method and device |
CN109754595B (en) * | 2017-11-01 | 2022-02-01 | 阿里巴巴集团控股有限公司 | Vehicle risk assessment method and device and interface equipment |
CN107742193B (en) * | 2017-11-28 | 2019-08-27 | 江苏大学 | A kind of driving Risk Forecast Method based on time-varying state transition probability Markov chain |
CN109840660B (en) * | 2017-11-29 | 2021-07-30 | 北京四维图新科技股份有限公司 | Vehicle characteristic data processing method and vehicle risk prediction model training method |
CN107993017A (en) * | 2017-12-12 | 2018-05-04 | 中国矿业大学(北京) | A kind of worker's unsafe acts analysis method and system |
CN109993966A (en) * | 2018-01-02 | 2019-07-09 | 中国移动通信有限公司研究院 | A kind of method and device of building user portrait |
CN108764111B (en) * | 2018-05-23 | 2022-03-01 | 长安大学 | Method for detecting abnormal driving behavior of vehicle |
CN109377046A (en) * | 2018-10-18 | 2019-02-22 | 上海经达信息科技股份有限公司 | Overload of vehicle method of discrimination, system and device based on BP neural network |
CN111105110A (en) * | 2018-10-25 | 2020-05-05 | 北京嘀嘀无限科技发展有限公司 | Driving risk determination method, device, medium and computing equipment |
CN109542081B (en) * | 2018-11-14 | 2021-06-22 | 江苏大学 | Online driving danger estimation method based on offline vehicle deceleration curve |
CN109493566B (en) * | 2018-12-18 | 2020-09-08 | 长安大学 | Fatigue driving prediction method based on GPS data |
CN109710245A (en) * | 2019-01-16 | 2019-05-03 | 北京百度网讯科技有限公司 | Information demonstrating method and device |
CN110060370B (en) * | 2019-04-15 | 2021-04-27 | 上海经达信息科技股份有限公司 | Equivalent statistical method for times of rapid acceleration and rapid deceleration of vehicle |
CN113095713B (en) * | 2021-04-28 | 2023-10-20 | 扬州大学 | Driver space risk early warning method based on public transport history alarm data |
CN113838277B (en) * | 2021-09-26 | 2022-07-26 | 广州文远知行科技有限公司 | Method, device and equipment for determining abnormal occurrence time point of vehicle and storage medium |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103106793B (en) * | 2013-01-11 | 2015-02-04 | 福州大学 | Traffic state discriminated method based on real-time driving direction and transit time quantum information |
CN103646534B (en) * | 2013-11-22 | 2015-12-02 | 江苏大学 | A kind of road real-time traffic accident risk control method |
US9600541B2 (en) * | 2014-05-02 | 2017-03-21 | Kookmin University Industry Academy Corporation Foundation | Method of processing and analysing vehicle driving big data and system thereof |
CN104504531B (en) * | 2014-12-30 | 2018-08-31 | 大连楼兰科技股份有限公司 | A kind of vehicle drive methods of risk assessment |
CN104484725A (en) * | 2014-12-30 | 2015-04-01 | 大连楼兰科技股份有限公司 | Driving risk assessment method based on principal component linear weighting |
CN104636449A (en) * | 2015-01-27 | 2015-05-20 | 厦门大学 | Distributed type big data system risk recognition method based on LSA-GCC |
CN105448108B (en) * | 2015-12-30 | 2017-10-03 | 上海经达信息科技股份有限公司 | A kind of hypervelocity method of discrimination based on Hidden Markov road network |
Also Published As
Publication number | Publication date |
---|---|
CN106095963A (en) | 2016-11-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106095963B (en) | Vehicle driving behavior analysis big data public service platform under internet + era | |
Zheng et al. | Deep and embedded learning approach for traffic flow prediction in urban informatics | |
Carfora et al. | A “pay-how-you-drive” car insurance approach through cluster analysis | |
US20220332335A1 (en) | Vehicle-data analytics | |
Dong et al. | Characterizing driving styles with deep learning | |
CN112700072B (en) | Traffic condition prediction method, electronic device, and storage medium | |
Zhao et al. | Highway travel time prediction using sparse tensor completion tactics and k‐nearest neighbor pattern matching method | |
CN111582559B (en) | Arrival time estimation method and device | |
US20230289663A1 (en) | Vehicle resiliency, driving feedback and risk assessment using machine learning-based vehicle wear scoring | |
CN111126868B (en) | Road traffic accident occurrence risk determination method and system | |
CN113205698A (en) | Navigation reminding method based on IGWO-LSTM short-time traffic flow prediction | |
CN116432810A (en) | Traffic flow prediction model determination method, device, apparatus and readable storage medium | |
Li et al. | Macroscopic big data analysis and prediction of driving behavior with an adaptive fuzzy recurrent neural network on the internet of vehicles | |
CN113033899A (en) | Unmanned adjacent vehicle track prediction method | |
Matarazzo et al. | Crowdsourcing bridge vital signs with smartphone vehicle trips | |
CN103106329B (en) | A kind of training sample constructed in groups method for SVR short-term load forecasting | |
Zhang et al. | Long‐term bridge performance assessment using clustering and Bayesian linear regression for vehicle load and strain mapping model | |
CN114863170A (en) | Deep learning-based new energy vehicle battery spontaneous combustion early warning method and device | |
Xia et al. | A distributed EMDN-GRU model on Spark for passenger waiting time forecasting | |
Rui et al. | High-accuracy transient fuel consumption model based on distance correlation analysis | |
CN116663742A (en) | Regional capacity prediction method based on multi-factor and model fusion | |
Lu et al. | A bi-level distribution mixture framework for unsupervised driving performance evaluation from naturalistic truck driving data | |
Xue et al. | Traffic status evolution trend prediction based on congestion propagation effects under rainy weather | |
Marciniuk et al. | Machine learning applied to acoustic-based road traffic monitoring | |
Moghadam et al. | Application of smartphones in pavement deterioration identification using artificial neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||