CN111832225B - Method for constructing driving condition of automobile

Info

Publication number: CN111832225B
Application number: CN202010644339.0A
Authority: CN
Inventors: 白明泽; 邓川; 覃春园; 葛丝雨
Original assignee: Chongqing University of Post and Telecommunications
Current assignee: Guangzhou Dayu Chuangfu Technology Co ltd
Priority date: 2020-07-07
Filing date: 2020-07-07
Publication date: 2023-01-31
Anticipated expiration: 2040-07-07
Also published as: CN111832225A

Abstract

The invention relates to the field of automobile working condition data construction, and in particular to a method for constructing automobile driving working conditions, comprising the following steps: acquiring raw GPS data of automobile driving and preprocessing it; dividing the preprocessed data into kinematic segments; performing feature calculation on the kinematic segments to obtain their feature parameters; dividing the kinematic segments into four segment libraries by K-Means clustering; constructing a training data set; inputting the training data set into a long short-term memory (LSTM) neural network model for training; using the trained LSTM model to predict a time-speed curve for each of the four segment libraries; and combining the curves of the four speed intervals into one working condition curve. Through the LSTM network, the method effectively identifies the implicit characteristics in the driving data of a specific region, and thereby constructs an automobile driving condition curve that matches the driving characteristics of that region.

Description

Method for constructing driving condition of automobile

Technical Field

The invention relates to the field of automobile working condition data construction, in particular to a method for constructing automobile running working conditions.

Background

The driving cycle (also called the vehicle test cycle) is an important and widely used basic technology in the automobile industry. It is the basis of vehicle energy consumption/emission test methods and limit standards, and the main reference for calibrating and optimizing the performance indexes of an automobile. A driving cycle describes automobile driving as a speed-time curve; its total duration is generally within 1800 seconds, although no standard fixes this limit, and it reflects the kinematic characteristics of automobiles driving on the road. At present, countries and regions with developed automobile industries, such as Europe, the United States and Japan, each adopt standards suited to their own driving conditions for vehicle performance calibration and optimization and for energy consumption/emission certification.

At the beginning of this century, China directly adopted the European NEDC driving cycle to certify the energy consumption and emissions of automobile products. However, the two most important characteristics of that cycle, the idle time ratio and the average speed, differ greatly from actual driving conditions in China. Because the driving cycle is the most basic reference for vehicle development and evaluation, it is increasingly important to carry out in-depth research and formulate test conditions that reflect actual road driving in China. Moreover, China is geographically vast, and cities differ in development level, climate and traffic conditions, so the driving condition characteristics of automobiles differ markedly from city to city. It is therefore increasingly urgent, and necessary, to construct each city's driving conditions from that city's own driving data and road conditions, so that the constructed driving conditions match, and ideally represent, the actual driving conditions of automobiles in that city.

Current conventional construction methods include fuzzy clustering analysis and Markov methods. Fuzzy clustering analysis essentially searches a kinematic segment library for representative segments and combines them into a working condition curve close to actual conditions; whether that curve is representative depends on finding the optimal segments. Furthermore, such methods cannot efficiently estimate and extract all the implicit information in the data, so they end up selecting a single generic segment to represent all routes. The Markov method derives the final road driving condition from state transition probabilities, which describe the influence of current data on future data. Although it can randomly generate a representative condition of a specified duration, the result depends on the accuracy of the state transition probabilities, and a new state retains little information from earlier data, so the newly predicted data generalize poorly to future situations.

Disclosure of Invention

In order to solve the above problems, the present invention provides a method for constructing a driving condition of an automobile. The method uses deep learning on driving data collected in a specific region to learn the characteristics implicit in the data and extract the effective information, predicts the working condition curve of each given speed interval separately, and finally constructs a complete automobile driving condition curve that matches the characteristics of the region.

A method for constructing a driving condition of an automobile comprises the following steps:

acquiring original GPS data of automobile driving, and preprocessing the original GPS data of the automobile driving;

dividing the preprocessed data into kinematic fragments by a short-stroke dividing method;

performing feature calculation on the kinematics segments to obtain feature parameters of the kinematics segments, and filtering irrelevant features by adopting a principal component analysis method to obtain effective feature parameters;

dividing the kinematic fragments into four fragment libraries by adopting K-Means clustering, wherein the four fragment libraries are respectively: a low-speed interval fragment library, a medium-speed interval fragment library, a high-speed interval fragment library and an extremely high-speed interval fragment library;

constructing a training data set: splicing all the kinematic fragments in each fragment library to obtain four long fragments, and taking the four long fragments as a training data set;

inputting the training data set into a long-short term memory neural network model for training to obtain a trained long-short term memory neural network model;

predicting with the trained long-short term memory neural network model to obtain the time-speed prediction curves corresponding to the four segment libraries respectively, wherein the specific process comprises the following steps: inputting the last sample of the training data set into the trained long-short term memory neural network model as the first input element, and outputting a first predicted value; deleting the first input element, taking the first predicted value as the second input element, and inputting it into the model to obtain a second predicted value; and so on, until the prediction sequence of one fragment library is obtained, and in the same way the time-speed prediction curves corresponding to the four fragment libraries are obtained;

after time-speed prediction curves corresponding to the four segment libraries are obtained, determining the time of the four segment libraries in the final working condition synthesis according to the time proportion of the four segment libraries in the whole kinematic segment, and combining the curves of the four speed segments into a working condition curve;

and sending the working condition curve to a control device, and evaluating the vehicle exhaust emission and evaluating the environmental protection grade by the control device according to the working condition curve.
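The time-proportion rule in the synthesis step above can be illustrated with a minimal sketch. The library durations, the 1200-second target length, the rounding rule and the function names `allocate_durations` and `combine_curves` are illustrative assumptions, not values or names taken from the invention.

```python
from typing import Dict, List

ORDER = ["L", "M", "H", "EH"]  # low, medium, high, extremely high speed libraries

def allocate_durations(library_seconds: Dict[str, int], cycle_seconds: int) -> Dict[str, int]:
    """Share the target cycle length among libraries in proportion to their total time."""
    total = sum(library_seconds.values())
    alloc = {name: int(cycle_seconds * sec / total) for name, sec in library_seconds.items()}
    # Assumption: any rounding remainder goes to the library with the most data.
    remainder = cycle_seconds - sum(alloc.values())
    alloc[max(library_seconds, key=library_seconds.get)] += remainder
    return alloc

def combine_curves(predicted: Dict[str, List[float]], alloc: Dict[str, int]) -> List[float]:
    """Concatenate the first alloc[name] seconds of each predicted curve (1 Hz samples)."""
    curve: List[float] = []
    for name in ORDER:
        curve.extend(predicted[name][:alloc[name]])
    return curve

# Illustrative numbers only: total seconds of kinematic segments per library.
library_seconds = {"L": 5000, "M": 3000, "H": 1500, "EH": 500}
alloc = allocate_durations(library_seconds, cycle_seconds=1200)

# Placeholder constant-speed curves stand in for the LSTM predictions.
predicted = {"L": [20.0] * 2000, "M": [50.0] * 2000, "H": [80.0] * 2000, "EH": [110.0] * 2000}
cycle = combine_curves(predicted, alloc)
```

In this sketch each library contributes a share of the final curve equal to its share of the total kinematic-segment time; how the curve pieces are joined at their boundaries is left open here.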

Further, the preprocessing of the raw GPS data of the vehicle driving includes:

traversing and searching original GPS data of automobile driving from the beginning, searching a first intermittent point, dividing the original GPS data into different driving segments from the first intermittent point, wherein the first intermittent point is an area with a time interval of more than 55 seconds in the original GPS data of the automobile driving;

judging whether a second time breakpoint exists in the obtained driving segment, if so, fitting a series of new speed data points by adopting an improved polynomial fitting method according to speed data before and after the second time breakpoint to supplement the second time breakpoint in the driving segment, wherein the second time breakpoint is a region with a time interval of more than 2 seconds and less than or equal to 55 seconds in original GPS data of automobile driving;

after the data fitting and supplementing are completed, calculating the acceleration of each driving segment at each time point, and removing the driving segments with abnormal acceleration from the data according to an acceleration abnormal screening rule;

for abnormal data of long-term idling for more than 180 seconds, sliding the time and the vehicle speed of each segment by using a sliding window with the size of 180 seconds, wherein the sliding step length is 1s, and in the window sliding process, if all data in the window are idling data, screening out the first data of the window; when the tail of the window slides to the tail of the driving segment, if the data in the window is idle speed data, all the data in the window are screened out, and the data are screened out on all the driving segments in the same way to obtain the preprocessed data.

Further, dividing the preprocessed data into kinematic segments by the short-stroke dividing method comprises the following steps: first, judging whether the running time of each driving segment is greater than 20 s; if the running time is less than 20 s, rejecting the driving segment; if the running time is greater than 20 s, searching for kinematic segments in the driving segment according to the search rule of the kinematic segment, wherein the search rule of the kinematic segment comprises the following steps:

(1) Searching downwards from the start time of the driving segment for the first point at which the GPS vehicle speed is 0, namely the idling starting point, and recording its position if it is found; then continuing to search downwards for the first point at which the GPS vehicle speed is not 0, namely the middle point, and recording its position;

(2) Calculating the time difference from the intermediate point to the idling starting point, if the time difference is greater than 20s, moving the position of the idling starting point downwards for 20s, and then judging the time difference from the intermediate point to the idling starting point until the time difference is less than 20 s; searching a next point with the GPS vehicle speed of 0, namely an idle speed terminal point of the kinematic fragment, and recording the position of the idle speed terminal point;

(3) Screening the kinematics segment according to a kinematics segment screening rule, and if the kinematics segment screening rule is met, extracting the kinematics segment from the driving segment according to the recorded positions of the idling starting point and the idling ending point;

the kinematic fragment screening rule comprises:

(1) The duration of the kinematic segment is not less than 20 seconds, namely the time from one idle starting point to the next idle starting point is at least 20 seconds;

(2) The kinematic segment comprises at least one acceleration state and one deceleration state; that is, it must contain at least one continuous stretch in which the vehicle's acceleration is greater than 0.1 m/s² and one in which it is less than -0.1 m/s²;

(3) The idle time of the kinematic segment does not exceed 20 seconds.

Further, the characteristic parameters of the kinematic segment include time characteristic parameters, speed characteristic parameters and acceleration characteristic parameters, wherein the time characteristic parameters include: running time $t$ (s), constant-speed time $t_c$ (s), idle time $t_i$ (s), acceleration time $t_a$ (s) and deceleration time $t_d$ (s); the speed characteristic parameters include: average speed $v_m$ (km/h), average running speed $v_{mr}$ (km/h), maximum speed $v_{max}$ (km/h) and speed standard deviation $v_{std}$ (km/h); the acceleration characteristic parameters include: average acceleration $a_{ma}$ (m/s²), average deceleration $a_{md}$ (m/s²), acceleration standard deviation $a_{std}$ (m/s²), constant-speed time ratio $P_c$ (%), idle time ratio $P_i$ (%), acceleration time ratio $P_a$ (%) and deceleration time ratio $P_d$ (%).

Further, the step of dividing the kinematic fragments into four fragment libraries by adopting K-Means clustering specifically comprises the following steps: first, randomly selecting 4 kinematic segments from all kinematic segments as the initial clustering centers; then performing the cluster assignment operation: calculating the Euclidean distance from each kinematic segment to each of the 4 clustering centers, and assigning each kinematic segment to the clustering center with the nearest Euclidean distance to form 4 clusters; after the 4 clusters are obtained, recalculating the clustering center of each cluster and repeating the cluster assignment operation until the composition of the kinematic segments of each cluster no longer changes, finally obtaining four segment libraries of kinematic segments.

Further, the Euclidean distance from a kinematic segment to a cluster center is calculated as:

$$d_{ij} = \sqrt{\sum_{m} \left(x'_{im} - c'_{jm}\right)^{2}}$$

where $d_{ij}$ is the Euclidean distance from the i-th kinematic segment to cluster center j, $x'_{im}$ is the m-th feature element of the i-th kinematic segment, and $c'_{jm}$ is the m-th feature element of cluster center j.

Further, the structure of the long-short term memory neural network model comprises: an input layer, an LSTM layer, a fully connected layer, and an output layer.

Further, the training data set needs to be preprocessed before being input into the model, and the preprocessing of the training data set includes: discarding the time dimension of each long segment and keeping its speed dimension; setting a sliding window whose length equals the time step, and sliding the window backwards along the long segment from the initial position with a step of 1 second; taking the speed sequence covered by the window at the current moment as the speed-time sequence of the current moment, and taking the initial speed value of the area covered by the window at the next moment as the label of the area covered by the window at the current moment; and, by analogy, preprocessing the four long segments respectively to obtain the four preprocessed training data sets $D_L$, $D_M$, $D_H$ and $D_{EH}$.
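The sliding-window preprocessing and the recursive prediction loop can be sketched as follows. Two assumptions are made: the label of a window is read as the first speed value that follows the window (one-step-ahead forecasting), and a hypothetical linear-extrapolation function `naive_predictor` stands in for the trained LSTM model, which is not reproduced here.

```python
from typing import Callable, List, Sequence, Tuple

def make_windows(speeds: Sequence[float], time_step: int) -> Tuple[List[List[float]], List[float]]:
    """Slide a window of length time_step one sample at a time over a long segment."""
    inputs, labels = [], []
    for k in range(len(speeds) - time_step):
        inputs.append(list(speeds[k:k + time_step]))
        labels.append(speeds[k + time_step])  # assumption: label = next value after the window
    return inputs, labels

def recursive_forecast(predict_next: Callable[[Sequence[float]], float],
                       seed_window: Sequence[float], horizon: int) -> List[float]:
    """Feed each prediction back in: drop the oldest input, append the newest prediction."""
    window = list(seed_window)
    predictions = []
    for _ in range(horizon):
        nxt = predict_next(window)
        predictions.append(nxt)
        window = window[1:] + [nxt]
    return predictions

def naive_predictor(window: Sequence[float]) -> float:
    """Hypothetical stand-in for the trained model: linear extrapolation of the last two values."""
    return window[-1] + (window[-1] - window[-2])

long_segment = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
X, y = make_windows(long_segment, time_step=3)
forecast = recursive_forecast(naive_predictor, seed_window=long_segment[-3:], horizon=3)
```

Replacing `naive_predictor` with a call to a trained network would give the loop described above; the window bookkeeping stays the same.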

The invention has the beneficial effects that:

The invention provides a convenient and rapid method for constructing automobile driving conditions. Through a long short-term memory (LSTM) neural network model, it effectively identifies the hidden characteristics in the automobile driving data of a specific region, and thereby constructs an automobile driving condition curve that matches the driving characteristics of that region.

Drawings

The invention will be described in further detail below with reference to the drawings and the detailed description, which are provided for the purpose of illustrating preferred embodiments only and are not to be construed as limiting the invention.

FIG. 1 is a flowchart of a method for constructing a driving condition of an automobile according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a network structure of an LSTM prediction model;

FIG. 3 is a flow chart of data preprocessing;

FIG. 4 is a comparison of the data before and after preprocessing;

FIG. 5 is a schematic diagram of two short-stroke segmentation methods;

FIG. 6 is a flow chart of a kinematic fragment screening;

FIG. 7 is a schematic diagram of an input data generation strategy for the LSTM model;

FIG. 8 is a schematic diagram of the kinematics segment training and prediction results for the intermediate velocity segment;

FIG. 9 is a graphical illustration of the results of kinematics segment training and prediction for the high velocity segment;

FIG. 10 is a graph of results of a sample set-up of a vehicle driving profile.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

As shown in FIG. 1, a method for constructing a driving condition of an automobile includes, but is not limited to, the following steps:

Raw GPS data of automobile driving are obtained, comprising the acquisition time, the GPS speed and the GPS acceleration. Owing to various external factors and to the vehicle itself, the raw GPS data often contain bad values, and the reliability and accuracy of the raw data directly determine the validity of the working condition constructed from them. The raw GPS data are therefore preprocessed: data screening rules are established according to the types of bad data present, invalid data are deleted, and valid, reasonable data are retained. This facilitates the subsequent analysis, and the quality of the constructed working condition is improved by optimizing the characteristic parameters.

In one embodiment, the bad data types in the raw data and the data filtering rules include:

(1) Segment screening for time discontinuities: when the automobile is shielded by high-rise buildings or passes through a tunnel, the GPS signal is lost, so the time stamps in the raw GPS driving data are discontinuous. To ensure that the raw data remain valid for prediction, the time breaks need to be preprocessed as follows: search for time discontinuities; if there is no missing value, the time is continuous and no processing is needed; if there is a missing value, the time is discontinuous. If the break lasts 55 seconds or less, the data are completed; if the break lasts more than 55 seconds, the data are not completed, because breaks of more than 55 seconds do not greatly affect the data as a whole, and leaving them untouched also preserves the idle time ratio.

(2) Acceleration anomaly screening rule: an ordinary automobile generally takes more than 7 seconds to accelerate from 0 to 100 km/h, so the maximum acceleration of the automobile is 3.968 m/s²; the maximum deceleration under emergency braking is 7.5-8 m/s². In the raw driving data, if an acceleration falls outside the maximum acceleration range and/or the maximum deceleration range, the data are considered to have abnormal acceleration or deceleration. Because such abnormal data strongly affect the prediction of the overall driving condition, the whole kinematic segment containing the abnormal acceleration or deceleration is removed together with its connected idling segment, and the kinematic segments before and after the abnormal segment are joined after the removal.

(3) Maximum idle duration rule: the raw GPS driving data may contain long periods of parking, for example while waiting for passengers, or long traffic jams, during which the automobile is stationary but the data acquisition equipment is not switched off. The equipment then keeps working and records data sections in which the GPS speed stays at 0 for a long time. Such sections do not reflect actual road driving and are treated as idling. In addition, the vehicle may travel intermittently at low speed, with a maximum speed below 10 km/h; such data are regarded as spike data, and these data points are usually reset to idling points. A data section whose idle time is too long does not meet the requirements of a kinematic segment and is trimmed to reduce its influence on the working condition curve: an idle time of more than 180 seconds is considered too long, the idle data points within 180 seconds are retained as the idle period, and the remainder is deleted.

In one embodiment, the order of pre-processing includes, but is not limited to:

As shown in FIG. 3, a traversal search is first performed on the three raw data files collected by the same vehicle in different time periods. Starting from the beginning of each file, all data in the file are divided into different driving segments at each first break point (a region of the raw GPS driving data in which the time interval is greater than 55 seconds).
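A minimal sketch of this splitting step, assuming each record is a `(timestamp in seconds, speed in km/h)` pair sorted by time; the record layout and the function name `split_driving_segments` are illustrative assumptions.

```python
from typing import List, Tuple

Sample = Tuple[int, float]  # (timestamp in seconds, GPS speed in km/h)

def split_driving_segments(samples: List[Sample], max_gap_s: int = 55) -> List[List[Sample]]:
    """Start a new driving segment wherever consecutive timestamps differ by more than max_gap_s."""
    segments: List[List[Sample]] = []
    current: List[Sample] = []
    for sample in samples:
        if current and sample[0] - current[-1][0] > max_gap_s:
            segments.append(current)
            current = []
        current.append(sample)
    if current:
        segments.append(current)
    return segments

# Gaps of 58 s and 139 s split this record into three driving segments.
record = [(0, 0.0), (1, 5.0), (2, 9.0), (60, 12.0), (61, 14.0), (200, 0.0)]
segments = split_driving_segments(record)
```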

Then, each driving segment obtained is checked for second time break points (a region of the raw GPS driving data in which the time interval is greater than 2 seconds and less than or equal to 55 seconds is regarded as a second time break point). Each second time break point is filled using an improved polynomial fitting method: a series of new speed data points, covering the length of the time gap, is fitted from the speed data before and after the break point and inserted at the break point. Because the actual GPS vehicle speed is never less than 0, all negative values among the data points fitted by the polynomial function are replaced by 0.
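The gap-filling idea can be sketched with an ordinary least-squares polynomial fit. The "improved" fitting method itself is not specified in the text, so plain `numpy.polyfit` is swapped in here; the polynomial degree, the number of neighbouring samples used (`context`) and the function name `fill_gaps` are assumptions.

```python
import numpy as np

def fill_gaps(times, speeds, degree=3, context=5):
    """Fill gaps with 2 s < dt <= 55 s at 1 Hz using a polynomial fitted to nearby samples."""
    times = np.asarray(times, dtype=float)
    speeds = np.asarray(speeds, dtype=float)
    out_t, out_v = [times[0]], [speeds[0]]
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        if 2 < dt <= 55:
            lo, hi = max(0, k - context), min(len(times), k + context)
            t_ref = times[k - 1]  # centre the time axis to keep the fit well conditioned
            deg = min(degree, hi - lo - 1)
            coeffs = np.polyfit(times[lo:hi] - t_ref, speeds[lo:hi], deg)
            new_t = np.arange(times[k - 1] + 1, times[k])
            # GPS speed cannot be negative, so negative fitted values become 0.
            new_v = np.clip(np.polyval(coeffs, new_t - t_ref), 0.0, None)
            out_t.extend(new_t)
            out_v.extend(new_v)
        out_t.append(times[k])
        out_v.append(speeds[k])
    return np.array(out_t), np.array(out_v)

# A 6-second hole in a steadily accelerating trace (speed = 10 + 2 * t).
t_raw = [0, 1, 2, 3, 4, 10, 11, 12, 13, 14]
v_raw = [10 + 2 * t for t in t_raw]
t_filled, v_filled = fill_gaps(t_raw, v_raw)
```

Here the five missing seconds are regenerated from the five samples on each side of the hole; a production version would need to decide how much context to use and how to treat gaps at the edges of a segment.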

After the data fitting and supplementing are completed, the acceleration of each driving segment at each time point is calculated, the driving segment with abnormal acceleration and an idling segment connected with the abnormal segment are removed from the data according to an acceleration abnormal screening rule, and the motion segments before and after the abnormal segment are connected after the abnormal segment is removed.
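A minimal sketch of the acceleration check, assuming acceleration is derived from consecutive 1 Hz speed samples rather than read from the GPS record, and taking 8 m/s² (the upper end of the stated 7.5-8 m/s² range) as the deceleration limit. The sketch drops whole segments and does not perform the joining of neighbouring segments.

```python
A_MAX = 100 / 3.6 / 7   # 0 to 100 km/h in 7 s, about 3.968 m/s^2
D_MAX = 8.0             # assumed limit: upper end of the 7.5-8 m/s^2 braking range

def accelerations(speeds_kmh, dt_s=1.0):
    """Acceleration in m/s^2 between consecutive speed samples given in km/h."""
    return [(b - a) / 3.6 / dt_s for a, b in zip(speeds_kmh, speeds_kmh[1:])]

def has_abnormal_acceleration(speeds_kmh):
    """True if any step exceeds the acceleration or deceleration limit."""
    return any(a > A_MAX or a < -D_MAX for a in accelerations(speeds_kmh))

# The second trace jumps from 10 to 40 km/h in one second (about 8.3 m/s^2).
candidate_segments = [[0, 5, 12, 20, 15, 0], [0, 10, 40, 45, 30, 0]]
kept = [s for s in candidate_segments if not has_abnormal_acceleration(s)]
```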

And for abnormal data of long-term idling for more than 180 seconds, sliding is carried out on the time and the vehicle speed of each segment by using a sliding window with the size of 180 seconds, and the step length of the sliding is 1s. In the window sliding process, if all data in the window are idle speed data, screening out the first piece of data of the window; when the tail part of the window slides to the tail part of the driving segment, if the data in the window at the moment are idle speed data, all the data in the window are screened out. And in this way, screening out data of all driving segments to obtain preprocessed data.
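The sliding-window rule can be sketched as below. The function works on the speed column only (the timestamps would be filtered with the same mask), treats a speed of exactly 0 as idling, and exposes the window length as a parameter so the example can use a short window; these choices and the name `drop_long_idle` are assumptions.

```python
def drop_long_idle(speeds, window=180):
    """Apply the sliding idle window with a step of one sample (1 s at 1 Hz)."""
    n = len(speeds)
    keep = [True] * n
    # Prefix sums of the idle flags make each all-idle window check O(1).
    prefix = [0]
    for v in speeds:
        prefix.append(prefix[-1] + (1 if v == 0 else 0))
    for start in range(0, n - window + 1):
        end = start + window
        if prefix[end] - prefix[start] == window:   # every sample in the window is idle
            if end == n:
                # Window has reached the end of the segment: drop the whole window.
                for k in range(start, end):
                    keep[k] = False
            else:
                keep[start] = False                 # otherwise drop only its first sample
    return [v for v, flag in zip(speeds, keep) if flag]

# With a 3-sample window, a 5-sample idle run is shortened and a trailing idle run is removed.
trace = [5, 0, 0, 0, 0, 0, 7, 0, 0, 0]
cleaned = drop_long_idle(trace, window=3)
```

With this literal reading of the rule, an idle run inside a segment is shortened to one sample less than the window length, while an idle run that reaches the end of the segment is removed entirely.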

The preprocessed data are then divided into kinematic segments using the short-stroke division method.

As shown in fig. 5, the kinematic segment includes an idle time period and a driving time period, and the time length M of each kinematic segment does not exceed 600s, and wherein the idle time period does not exceed 20s at most, and the portion exceeding the idle time is deleted. Each kinematic segment begins at zero speed and ends at zero speed, and the interval between the beginning and the ending segment may include an idle segment.

In an alternative embodiment, method one may be used to intercept the idle time period and the driving time period of the kinematic segment: the kinematic segment begins at speed zero and ends at speed zero, with the idle period preceding the travel period.

In an alternative embodiment, method two may be used to intercept the idle time period and the driving time period of the kinematic segment: the kinematic segment begins at speed zero and ends at speed zero, with the idle period following the travel period.

In one embodiment, a kinematic fragment screening rule is first established:

(1) The duration of the kinematic segment is not less than 20 seconds, i.e. the time from one idle start to the next is at least 20 seconds.

(2) The kinematic segment comprises at least one acceleration state and one deceleration state; that is, it must contain at least one continuous stretch in which the vehicle's acceleration is greater than 0.1 m/s² and one in which it is less than -0.1 m/s².

(3) The idle time of the kinematic segment does not exceed 20 seconds.

As shown in fig. 6, according to the principle of kinematics segment screening, a short-stroke segmentation method is adopted to find out a kinematics segment from the pre-processed driving segment data, which specifically includes: firstly, judging whether the running time of each running section is more than 20s, and if the running time of each running section is less than 20s, rejecting the running section; if the time is more than 20s, searching the kinematic segment from the driving segment according to the searching rule of the kinematic segment.

In one embodiment, the search rule of the kinematic segment includes:

(1) Search downwards from the start time of the driving segment for an idle starting point (a point at which the GPS vehicle speed is 0), and record its position if it is found. Continue searching downwards for the first point at which the GPS vehicle speed is not 0, record it as the middle point together with its position, and determine the time difference from the middle point to the idle starting point.

(2) If the time difference is greater than 20 s, move the idle starting point downwards by 20 s and determine the time difference again, repeating until it is less than 20 s. Then search for the next idle point, which is the idle end point of the kinematic segment (a point at which the GPS vehicle speed is 0), and record its position if it is found.

(3) Screen the kinematic segment according to the kinematic segment screening rule; if the rule is met, extract the kinematic segment from the driving segment according to the recorded positions of the idle starting point and the idle end point.
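The search and screening rules can be sketched for a single 1 Hz speed trace as follows. Two simplifications are assumed: the repeated 20-second shift of the idle starting point is replaced by clamping the leading idle period to at most 20 s, and the acceleration and deceleration conditions are checked on single samples rather than on continuous stretches. The function names are hypothetical.

```python
def accel_ms2(speeds_kmh):
    """Acceleration in m/s^2 between consecutive 1 Hz speed samples given in km/h."""
    return [(b - a) / 3.6 for a, b in zip(speeds_kmh, speeds_kmh[1:])]

def passes_screening(candidate, min_len_s=20, thr=0.1):
    """Duration of at least min_len_s plus at least one acceleration and one deceleration."""
    acc = accel_ms2(candidate)
    long_enough = len(candidate) - 1 >= min_len_s
    return long_enough and any(a > thr for a in acc) and any(a < -thr for a in acc)

def find_kinematic_segments(speeds_kmh, max_idle_s=20):
    """Return (start, end) index pairs of kinematic segments: idle start to next zero speed."""
    found, n, i = [], len(speeds_kmh), 0
    while i < n:
        while i < n and speeds_kmh[i] != 0:        # idle starting point: first zero speed
            i += 1
        if i >= n:
            break
        start, mid = i, i
        while mid < n and speeds_kmh[mid] == 0:    # middle point: first non-zero speed
            mid += 1
        if mid >= n:
            break
        start = max(start, mid - max_idle_s)       # keep at most max_idle_s of leading idle
        end = mid
        while end < n and speeds_kmh[end] != 0:    # idle end point: next zero speed
            end += 1
        if end >= n:
            break
        if passes_screening(speeds_kmh[start:end + 1]):
            found.append((start, end))
        i = end                                    # the end point may start the next segment
    return found

# 30 s idle, ramp up, cruise, ramp down, 5 s idle, then a short blip that is too brief to keep.
trace = [0] * 30 + list(range(5, 55, 5)) + [50] * 10 + list(range(45, 0, -5)) + [0] * 5 + [3, 0]
kinematic = find_kinematic_segments(trace)
```

Because the leading idle period is clamped before the candidate is cut out, the 20-second idle limit holds by construction and does not need a separate check in `passes_screening`.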

Feature calculation is then performed on each kinematic segment according to the characteristic parameter calculation formulas to obtain its characteristic parameters. There are 16 characteristic parameters, which fall into three types: time characteristic parameters, speed characteristic parameters and acceleration characteristic parameters. The time characteristic parameters include: running time $t$ (s), constant-speed time $t_c$ (s), idle time $t_i$ (s), acceleration time $t_a$ (s) and deceleration time $t_d$ (s). The speed characteristic parameters include: average speed $v_m$ (km/h), average running speed $v_{mr}$ (km/h), maximum speed $v_{max}$ (km/h) and speed standard deviation $v_{std}$ (km/h). The acceleration characteristic parameters include: average acceleration $a_{ma}$ (m/s²), average deceleration $a_{md}$ (m/s²), acceleration standard deviation $a_{std}$ (m/s²), constant-speed time ratio $P_c$ (%), idle time ratio $P_i$ (%), acceleration time ratio $P_a$ (%) and deceleration time ratio $P_d$ (%).

In one embodiment, the feature parameter calculation formula of the kinematic segment includes:

(1) Running time $t$ (s): since the sampling frequency is 1 Hz, the running time is calculated as:

$t = n$

where n is the number of collected driving data points.

(2) Idle time $t_i$: the number of data points in the kinematic segment whose speed is 0.

(3) Acceleration time $t_a$: the total number of points whose acceleration is greater than 0.1 m/s².

(4) Deceleration time $t_d$: the total number of points whose acceleration is less than -0.1 m/s².

(5) Constant-speed time $t_c$ is calculated as: $t_c = t - t_i - t_a - t_d$

(6) Average speed $v_m$ is calculated as:

$v_m = \frac{1}{q} \sum_{p=1}^{q} v_p$

where q is the total number of data points of one kinematic segment and $v_p$ is the speed value of data point p.

(7) Average running speed v _mr The calculation method is as follows:

where q is the total number of data points for one kinematic segment.

(8) Maximum velocity v _max The calculation method of (A) is as follows:

v _max ＝max{v _p ,p＝1,2,3...q}

(9) Standard deviation of velocity v _std The calculation method is as follows:

(10) Average acceleration a _ma The calculation method is as follows:

(11) Average deceleration a _md The calculation method is as follows:

where $a_p$ is the acceleration value, measured by the GPS, corresponding to data point p.

(12) Standard deviation of acceleration a _std The calculation method is as follows:

(13) Idle time ratio $P_i$ is calculated as: $P_i = t_i / t$

(14) Constant-speed time ratio $P_c$ is calculated as: $P_c = t_c / t$

(15) Acceleration time ratio $P_a$ is calculated as: $P_a = t_a / t$

(16) Deceleration time ratio $P_d$ is calculated as: $P_d = t_d / t$
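The 16 parameters can be computed for one kinematic segment along the following lines. The counting rules and the ratios follow the formulas stated above. Several definitions are assumptions, because their formulas are not reproduced in this text: the average running speed is taken as the speed sum divided by the non-idle time, the standard deviations use the population form, and the average acceleration and deceleration are means over the points beyond the ±0.1 m/s² thresholds.

```python
import numpy as np

def segment_features(v_kmh, a_ms2, thr=0.1):
    """Feature parameters of one kinematic segment sampled at 1 Hz."""
    v = np.asarray(v_kmh, dtype=float)
    a = np.asarray(a_ms2, dtype=float)
    t = len(v)                                   # running time, t = n at 1 Hz
    t_i = int(np.sum(v == 0))                    # idle time
    t_a = int(np.sum(a > thr))                   # acceleration time
    t_d = int(np.sum(a < -thr))                  # deceleration time
    t_c = t - t_i - t_a - t_d                    # constant-speed time
    return {
        "t": t, "t_i": t_i, "t_a": t_a, "t_d": t_d, "t_c": t_c,
        "v_m": float(v.mean()),
        "v_mr": float(v.sum() / (t - t_i)) if t > t_i else 0.0,   # assumed definition
        "v_max": float(v.max()),
        "v_std": float(v.std()),                                  # assumed population form
        "a_ma": float(a[a > thr].mean()) if t_a else 0.0,         # assumed definition
        "a_md": float(a[a < -thr].mean()) if t_d else 0.0,        # assumed definition
        "a_std": float(a.std()),                                  # assumed population form
        "P_i": t_i / t, "P_c": t_c / t, "P_a": t_a / t, "P_d": t_d / t,
    }

# Small illustrative segment: 2 s idle, 2 s accelerating, 2 s cruising, 1 s braking, stop.
speeds = [0, 0, 10, 20, 20, 20, 10, 0]
accels = [0, 0, 1.0, 1.0, 0, 0, -1.0, 0]
features = segment_features(speeds, accels)
```

The ratios are returned as fractions, as in the formulas above; multiplying by 100 gives the percentages listed among the characteristic parameters.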

Further, in a preferred embodiment, after the features of the kinematic segments are obtained, irrelevant features are filtered out by principal component analysis, leaving the feature parameters that are most strongly related to the driving characteristics.

Since the filtering of the irrelevant features by the principal component analysis method is not an innovative part of the invention and is not a key point of the invention, the description of the specification does not give excessive details, and the specific process of filtering the irrelevant features by the principal component analysis method can refer to the corresponding principal component analysis step in the research of the urban road automobile driving condition construction method based on the K-means clustering analysis in the prior art.
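For orientation only, a generic principal component analysis on a feature matrix can be sketched as below. This is a textbook formulation and not the procedure of the cited prior-art study; the standardization step, the 85% cumulative-variance cut-off and the function name `pca_reduce` are all assumptions.

```python
import numpy as np

def pca_reduce(X, var_target=0.85):
    """Project standardized features onto the leading principal components."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)          # standardize each feature column
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)  # rows of Vt are principal directions
    ratio = S ** 2 / np.sum(S ** 2)                   # explained-variance ratio per component
    k = int(np.searchsorted(np.cumsum(ratio), var_target)) + 1
    k = min(k, len(S))
    return Z @ Vt[:k].T, ratio[:k]

# Synthetic example: the third feature is almost a copy of the first, so two components suffice.
rng = np.random.default_rng(0)
f1 = rng.normal(size=200)
f2 = rng.normal(size=200)
f3 = 2 * f1 + 0.01 * rng.normal(size=200)
scores, ratios = pca_reduce(np.column_stack([f1, f2, f3]))
```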

According to the features of the kinematic segments, K-Means clustering is used to divide the kinematic segments into four segment libraries: a low-speed interval segment library L, a medium-speed interval segment library M, a high-speed interval segment library H and an extremely high-speed interval segment library EH. The maximum speed of the kinematic segments does not exceed 60 km/h in the low-speed library, 80 km/h in the medium-speed library, 100 km/h in the high-speed library and 130 km/h in the extremely high-speed library.

In one embodiment, a K-Means clustering algorithm is used to divide the kinematic segments, with the Euclidean distance as the measure of distance between categories. The procedure is as follows:

(1) Randomly select 4 kinematic segments from all kinematic segments as the initial cluster centers;

(2) Perform cluster assignment: calculate the Euclidean distance from each kinematic segment to each of the 4 cluster centers, and assign each kinematic segment to the cluster center with the smallest Euclidean distance, forming 4 clusters;

(3) After the 4 clusters are obtained, recalculate the center of each cluster and repeat step (2) until the membership of every cluster no longer changes, finally obtaining the four segment libraries of kinematic segments.

The new cluster center of each cluster is calculated as follows:

$C_i = \mathrm{mean}(x_1, x_2, x_3, \ldots, x_n)$

where $C_i$ is the new cluster center of cluster $i$, $x_1, \ldots, x_n$ are the kinematic segments assigned to cluster $i$, $n$ is the number of kinematic segments belonging to cluster $i$, and mean is the averaging operation.

Further, in one embodiment, the criterion function used by the K-means clustering algorithm to measure clustering quality is the sum of squared errors: the smaller the sum of squared errors, the higher the clustering quality. The criterion function of the K-means clustering algorithm is:

$SSE = \sum_{i=1}^{k} \sum_{x \in \text{cluster } i} \mathrm{dist}(x, C_i)^2$

where SSE is the sum of squared errors; $k$ is the number of cluster centers, that is, the final clustering result has $k$ classes, with $k = 4$ here; $x$ is a data point to be clustered; $C_i$ is the center of cluster $i$; and dist is the Euclidean metric.

Further, in one embodiment, the K-means clustering algorithm uses euclidean metric to compute the similarity, and the euclidean distance between the kinematic segment and the cluster center is:

where $d_{ij}$ is the Euclidean distance from the $i$-th kinematic segment to cluster center $j$, $x'_{im}$ is the $m$-th feature element of the $i$-th kinematic segment,

is the mth characteristic element of the cluster center j.

It should be noted that the Euclidean metric is not the only way to calculate the degree of similarity. Other measures may be used in this specification instead, for example the Minkowski distance, the Chebyshev distance or the city-block distance; any similarity measure achievable in the prior art may be used.
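The clustering steps (1) to (3), the cluster-center update and the SSE criterion described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions rather than the patented implementation: the function names, the `seed` and `max_iter` parameters and the toy feature vectors are hypothetical, ties between equally near centers are simply resolved in favor of the first center, and a cluster that becomes empty keeps its previous center.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_vector(vectors):
    """Component-wise mean of the vectors, i.e. the new cluster center."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def k_means(segments, k=4, seed=0, max_iter=100):
    """Cluster feature vectors; returns (labels, centers, sse)."""
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(segments, k)]   # step (1)
    labels = None
    for _ in range(max_iter):
        # Step (2): assign every segment to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: euclidean(s, centers[j]))
            for s in segments
        ]
        if new_labels == labels:        # step (3): membership unchanged, stop
            break
        labels = new_labels
        for j in range(k):
            members = [s for s, lab in zip(segments, labels) if lab == j]
            if members:                 # an empty cluster keeps its old center
                centers[j] = mean_vector(members)
    # Sum of squared errors of the final assignment.
    sse = sum(euclidean(s, centers[lab]) ** 2 for s, lab in zip(segments, labels))
    return labels, centers, sse


# Hypothetical feature vectors: (average speed in km/h, average acceleration in m/s^2).
toy = [[5.0, 0.4], [6.0, 0.5], [30.0, 0.6], [31.0, 0.5],
       [60.0, 0.7], [62.0, 0.6], [110.0, 0.8], [112.0, 0.9]]
labels, centers, sse = k_means(toy, k=4, seed=1)
```

Since the initial centers are drawn at random, different seeds can converge to different partitions; comparing the returned `sse` across several seeds is one simple way to pick the better clustering.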

Further, in an embodiment, if a kinematic segment has equal minimum Euclidean distances to several cluster centers, so that its category cannot be determined from the distance alone, a proximity criterion is used to make the decision. The formula of the proximity criterion is as follows:

where

representing the velocity-time vector of any kinematic segment i that needs to be subjected to a closeness decision,

is the velocity-time vector of the cluster center m,

representing the proximity value of the kinematic segment i to the cluster center m.

The proximity values between the kinematic segment and all cluster centers are calculated according to the formula of the proximity criterion, and the maximum among them is found. The kinematic segment is assigned to the cluster center with the maximum proximity value, and that cluster is the category of the kinematic segment.

All the kinematic segments in each segment library are spliced together to obtain four long segments, which are used as the training data set.

The training data set is constructed as follows: for each segment library, all kinematic segments in that library are spliced to obtain a speed-time sequence $S_i$ (that is, a long segment). For the four segment libraries, four long segments $S_L$, $S_M$, $S_H$ and $S_{EH}$ are obtained.

The training data set is input into a long short-term memory (LSTM) neural network model for training. During training, the LSTM model continuously learns the variation characteristics of the speed-time sequences contained in the different segment libraries, and four corresponding time-speed sequence prediction models are trained, giving the trained LSTM models. It should be noted that the time-speed sequence prediction model and the long short-term memory neural network model in this specification are the same model.

The LSTM model can reconstruct a time series according to the spatio-temporal correlation of the different segment libraries; model training identifies and strengthens the temporal correlation features, so that kinematic segments with representative characteristics are obtained for each class by prediction. Because the network structure of the LSTM model is fixed, it requires input vectors of a specific dimension, whereas the data in the training data set vary in length and do not meet this requirement. The training data set therefore needs to be preprocessed into supervised learning data usable by the time-speed sequence prediction model before it is input into the model.

Preprocessing the training data set includes:

1. discarding the time dimension of the long segment and keeping the speed dimension of the long segment;

2. setting a sliding window, wherein the length of the window is the size of a time step, sliding the window backwards from an initial position on a long segment, the sliding step length is 1 second each time, taking a speed-time sequence segment of an area covered by the window at the current moment as a speed-time sequence of the current moment, and taking an initial speed value of an area covered by the window at the next moment as a label of the area covered by the window at the current moment;

Specifically, a sliding window $W$ of length $n$ slides over a speed-time sequence $S$ (a long segment) of length $L$, starting at time $t_0$. The area covered by the sliding window at time $t_0$ is

The speed-time sequence segment of the covered area is taken as the speed-time sequence $s_0$ at time $t_0$, and the speed value at time $t_{n+1}$

is taken as the label of $s_0$, forming the data segment at time $t_0$

By analogy, the sliding window continues to slide backwards, yielding the data segment at time $t_1$

until the data segment at time $t_{L-n}$ is generated

A total of $L-n$ data segments are obtained, and the preprocessed training data set of the long segment is constructed from them: $D = \{d_m \mid 0 < m < L-n\}$, where $L$ is the data length of the long segment $S$, $d_m$ is the $m$-th data segment, $s_1$ is the speed-time sequence at time $t_1$,

is the speed value at time $t_{n+2}$, and

is the speed value at time $t_L$.

3. By analogy, the four long segments are each preprocessed to obtain the four preprocessed training data sets $D_L$, $D_M$, $D_H$ and $D_{EH}$.
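The sliding-window preprocessing described above can be sketched as follows. The sketch assumes one particular indexing convention, which the specification does not fix unambiguously: each window covers $n$ consecutive speed samples and its label is the single sample immediately after the window, so a sequence of length $L$ yields $L-n$ (window, label) pairs. The function name `make_supervised` and the sample values are illustrative.

```python
def make_supervised(speeds, n):
    """Split a 1 Hz speed series into (window, label) training pairs.

    speeds : speed values of one long segment (time column already dropped)
    n      : window length, i.e. the time step of the model
    """
    if n <= 0 or n >= len(speeds):
        raise ValueError("window length must be between 1 and len(speeds) - 1")
    samples = []
    for start in range(len(speeds) - n):     # slide by one sample (1 second)
        window = speeds[start:start + n]     # input speed sequence
        label = speeds[start + n]            # speed right after the window
        samples.append((window, label))
    return samples


# Hypothetical 7-second speed trace (km/h) with a window of 3 seconds.
pairs = make_supervised([0, 3, 7, 12, 15, 14, 9], n=3)
```

Applying the same function to each of the four long segments would give the four preprocessed data sets, one per segment library.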

The invention uses the LSTM recurrent neural network as the prediction core to construct a time-speed sequence prediction model and generate the automobile driving condition curve. The time-speed sequence prediction model is shown in figure 2 and comprises an input layer, an LSTM layer, fully-connected layers and an output layer: the input layer is followed by the LSTM layer, which is connected to the output layer through two fully-connected layers. The input layer receives the four classes of kinematic segment libraries obtained by K-Means clustering, so the input data have a latent periodicity that benefits training. The LSTM layer contains 3 hidden layers and uses tanh as the activation function to learn the latent temporal relations and abstract features of the input sequence. The fully-connected layers reduce the output dimension of the LSTM layer and map the abstract features it has learned; the output dimension of the fully-connected layers is 1. Finally, the output layer outputs the predicted speed data, and the prediction results are spliced in sequence to obtain the automobile driving condition curve.

After the training data set has been preprocessed, the preprocessed training data sets $D_L$, $D_M$, $D_H$ and $D_{EH}$ are input into the long short-term memory neural network model to train the time-speed sequence prediction model. In the LSTM layer, the input dimension of the first hidden layer is (1, n), that of the second hidden layer is (1, 6) and that of the third hidden layer is (1, 8). Data flow through the LSTM layer as follows: the LSTM gate structure produces an intermediate output, which is then passed through the tanh activation function to give the output of the layer. Finally, the output of the LSTM layer is input into the fully-connected layer, whose input dimension is 4 and output dimension is 1; its output is the predicted value of the speed point following the current input vector.

The model is trained for 100 iteration rounds, and each round consists of a forward propagation process and an error back-propagation process. In forward propagation, the speed vector data $D_i$, $i \in \{L, M, H, EH\}$, of the preprocessed training set $X$ are taken as the input of the time-speed sequence prediction model and passed through the LSTM layer and the fully-connected layer, with the weights of each layer randomly deactivated with a probability of 0.2, to obtain the output. The mean square error is used as the loss function to obtain the error, which is then back-propagated. Adam is used as the optimization function to update the weights of each layer, with the learning rate set to 0.001 and the batch size of network training set to 1. All input data of the four segment libraries $D_L$, $D_M$, $D_H$ and $D_{EH}$ in training set $X$ are trained separately to obtain four corresponding time-speed sequence prediction models $M_L$, $M_M$, $M_H$ and $M_{EH}$. The four models have the same internal structure but different training data sets, so their internal weights differ; they respectively predict the low-speed, medium-speed, high-speed and extremely high-speed portions of the automobile driving condition.

Preferably, the basic gate structure in the LSTM employed is:

Forget gate: $f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$

Input gate: $i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$

Output gate: $o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$

Candidate cell state: $\tilde{c}_t = \tanh(W_c \cdot [h_{t-1}, x_t] + b_c)$

Cell state: $c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$

Cell output: $h_t = o_t \odot \tanh(c_t)$

where the symbol "$\odot$" denotes the multiplication of elements at corresponding positions of two vectors; the symbol "$\sigma$" denotes the sigmoid function, which is used as the activation function; $c$ denotes the cell state; $\tilde{c}_t$ denotes the cell state of the current input; $x_t$ is the cell input; $h_t$ is the cell output; and $W$ and $b$ are the weight and bias of each gate, respectively. Specifically: $f_t$ is the forget vector, which determines which information is forgotten from the cell state at the previous time; $W_f$ is the weight of the forget gate; $h_{t-1}$ is the output of the previous gate structure; $x_t$ is the input at time $t$; $b_f$ is the bias of the forget gate; $i_t$ is the input gate vector, which determines the information to be retained in the cell state; $W_i$ and $b_i$ are the weight and bias of the input gate; $o_t$ is the output vector of the output gate; $W_o$ and $b_o$ are the weight and bias of the output gate; $\tilde{c}_t$ is the update value of the cell state; $W_c$ and $b_c$ are the weight and bias of the cell unit network; and $c_{t-1}$ is the cell state at the previous time $(t-1)$.
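To make the gate structure concrete, the following sketch computes a single LSTM time step for a scalar input and a scalar hidden state. The forget, input and output gates and the cell output follow the formulas given above; the candidate state and the cell-state update use the standard LSTM form, which is an assumption here. The function names, the `params` layout and the weight values are hypothetical and chosen only to exercise the code.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, used as the gate activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step with scalar input and scalar hidden state.

    params maps each of "f", "i", "o", "c" to a tuple (w_h, w_x, b):
    the weight on h_{t-1}, the weight on x_t, and the bias.
    """
    def affine(name):
        w_h, w_x, b = params[name]
        return w_h * h_prev + w_x * x_t + b

    f_t = sigmoid(affine("f"))           # forget gate
    i_t = sigmoid(affine("i"))           # input gate
    o_t = sigmoid(affine("o"))           # output gate
    c_hat = math.tanh(affine("c"))       # candidate cell state (standard form)
    c_t = f_t * c_prev + i_t * c_hat     # cell-state update (standard form)
    h_t = o_t * math.tanh(c_t)           # cell output
    return h_t, c_t


# Hypothetical weights; a real model would learn these during training.
params = {name: (0.5, 0.5, 0.0) for name in ("f", "i", "o", "c")}
h, c = 0.0, 0.0
for x in [0.1, 0.4, 0.8]:                # three consecutive (scaled) speed values
    h, c = lstm_step(x, h, c, params)
```

In a vector-valued layer the scalar products become matrix-vector products over the concatenation of $h_{t-1}$ and $x_t$, and the gate multiplications become element-wise.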

And predicting by using the trained long and short term memory neural network model to obtain time-speed prediction curves corresponding to the four segment libraries respectively.

And for a certain segment library, predicting the speed at a certain time in the future according to the kinematic data in the segment library to obtain the automobile working condition driving curve.

The prediction process is as follows: for each model $M_i$, $i \in \{L, M, H, EH\}$, obtained above, the last data item of the corresponding training data $D_i$ in training set $X$,

is taken as input to predict the speed value at time $t_{L+1}$,

from which

the speed value at time $t_{L+2}$ is predicted,

and so on until the speed value at time $t_{L+M}$ is predicted.

The resulting speed values

are taken as the time-speed prediction curve $P_i$ extracted from the interval, which represents the speed variation characteristics of that interval. In this way, the time-speed prediction curves $P_L$, $P_M$, $P_H$ and $P_{EH}$ corresponding to the low-speed interval segment library L, the medium-speed interval segment library M, the high-speed interval segment library H and the extremely high-speed interval segment library EH are finally obtained.

After the time-speed prediction curves corresponding to the four segment libraries are obtained, the time that each segment library occupies in the final synthesized working condition is determined according to the proportion of time that the library occupies among all kinematic segments, and the time-speed prediction curves $P_L$, $P_M$, $P_H$ and $P_{EH}$ of the four speed intervals are spliced in sequence into one working condition curve, giving the final automobile driving condition curve $P$.
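A minimal sketch of this synthesis step is given below. It assumes that each library contributes the leading portion of its predicted curve, that the curves are sampled at one value per second, and that the desired total length of the final curve is supplied by the caller; none of these details is fixed by the specification. The function name `combine_curves`, its parameters and the example numbers are hypothetical.

```python
def combine_curves(curves, durations, total_seconds):
    """Splice per-library prediction curves into one driving-cycle curve.

    curves        : dict, library name -> predicted speeds (one value per second)
    durations     : dict, library name -> total time of that library's segments (s)
    total_seconds : desired length of the final curve (s)
    """
    total = sum(durations.values())
    cycle = []
    for name in ("L", "M", "H", "EH"):             # splice in this fixed order
        share = durations[name] / total            # time proportion of the library
        seconds = round(share * total_seconds)     # time allotted in the final curve
        if seconds > len(curves[name]):
            raise ValueError(f"curve {name} is shorter than {seconds} s")
        cycle.extend(curves[name][:seconds])
    return cycle


# Hypothetical constant-speed curves and library durations, for illustration only.
curves = {"L": [20] * 30, "M": [50] * 30, "H": [80] * 30, "EH": [120] * 30}
durations = {"L": 500, "M": 300, "H": 150, "EH": 50}
cycle = combine_curves(curves, durations, total_seconds=20)
```

Because each share is rounded to whole seconds, the length of the result can differ from `total_seconds` by a few seconds when the proportions do not divide evenly.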

According to the technical scheme of the method for constructing the automobile driving condition curve provided above, speed changes in a complex way while an automobile is being driven, and the time-speed curve generated by driving contains all the information about the driving: acceleration, maximum deceleration, average acceleration and deceleration, average speed, and idling intervals. These data do not follow strict rules; they are uncertain and unstable, nonlinear in time and space, and large in volume. The long short-term memory (LSTM) recurrent neural network has a complex structure in which the output at each time step depends on the input, the memory content of the cell unit and the previous output. The LSTM model also mitigates the vanishing-gradient and exploding-gradient problems and the insufficient long-term memory of typical models such as the plain recurrent neural network (RNN), so that, compared with traditional fitting methods, it can make effective use of long-range time-series information. An automobile driving condition curve is typical time-series data, so good results can be expected from using the LSTM model to predict it. In the LSTM model, the new cell state is accumulated from the previous state, which allows nonlinear fitting of the automobile driving kinematic curve segments, takes the temporal order of the input data into account, and realizes the encoding and decoding of a time series.

In order to make the present specification clearer and more complete, the following further description is made with specific data.

In one embodiment, the raw data collected by vehicle-mounted GPS equipment for Problem D of the 2019 China Graduate Mathematical Modeling Competition are selected for analysis. The data were collected from a light-duty automobile driving on actual roads in a certain city at a sampling frequency of 1 Hz, and include the collection time (in seconds) and the GPS speed.

The raw data collected by the vehicle-mounted GPS equipment are preprocessed to eliminate bad data values and invalid data and to retain valid data.

The raw data collected by the vehicle-mounted GPS equipment consist of 3 files. They are processed according to the preprocessing method for the raw data, and the final preprocessing result is shown in table 1 below: after processing, the number of motion segment records is 186255 for file 1, 149032 for file 2 and 170808 for file 3.

TABLE 1 number of records before and after preprocessing of each file data

To illustrate the preprocessing result, data segments from file 2 processed according to the 3 screening principles are compared with the corresponding raw data segments in fig. 4. The upper 3 plots are the raw data segments, which exhibit data loss, abnormal acceleration/deceleration and excessively long idling, respectively; the lower 3 plots are the results of preprocessing these abnormal segments. It can be clearly seen that the preprocessed speed-time curve is closer to the working condition curve of an actual automobile.

The preprocessed data are divided by the short-stroke method to obtain kinematic segments, and the screening rules for kinematic segments are established:

(3) The idle time of the kinematic segment does not exceed 20 seconds.

As shown in fig. 6, the short-stroke division method is applied to find the kinematic segments in the preprocessed driving segment data.

Feature calculation is performed on the kinematic segments according to the feature parameter calculation formulas to obtain the feature parameters of the kinematic segments.

Calculated values of the feature parameters for some of the kinematic segments are shown, for example, in table 2 below.

TABLE 2 kinematic fragment eigenvalues

The kinematic segments are divided into four segment libraries by K-Means clustering according to their features. By clustering the feature matrix of the segments, the original 386 kinematic segments are divided into four segment libraries: a low-speed interval segment library L, a medium-speed interval segment library M, a high-speed interval segment library H and an extremely high-speed interval segment library EH. To further analyze the driving characteristics represented by the kinematic segments in each speed-interval segment library, the feature parameters related to speed and acceleration are selected from the 16 feature parameters and calculated for each of the four classes, giving the comprehensive feature values of all kinematic segments in each segment library, as shown in table 3.

As can be seen from table 3, the medium-speed interval segment library M (first class) contains 203 kinematic segments, with an average speed of 12.15 km/h and a maximum speed not exceeding 106.74 km/h; the low-speed interval segment library L (second class) contains 39 kinematic segments, with an average speed of 6.02 km/h and a maximum speed not exceeding 149.88 km/h; the extremely high-speed interval segment library EH (third class) contains 10 kinematic segments, with an average speed of 37.54 km/h and a maximum speed not exceeding 76.60 km/h; the high-speed interval segment library H (fourth class) contains 134 kinematic segments, with an average speed of 23.88 km/h and a maximum speed not exceeding 139.61 km/h.

TABLE 3 values of the comprehensive characteristic parameters of the various classes of fragment libraries

Feature parameter	First class	Second class	Third class	Fourth class
Total number of segments	203	39	10	134
$v_m$ (km/h)	12.15	6.02	37.54	23.88
$v_{max}$ (km/h)	106.74	149.88	76.60	139.61
$v_{std}$ (km/h)	9.657	14.786	14.027	14.672
$a_{ma}$ (m/s²)	0.5970	0.4356	0.3387	0.4856
$a_{md}$ (m/s²)	-0.7155	-0.5479	-0.4549	-0.6179
$a_{std}$ (m/s²)	0.6023	0.5509	0.4236	0.5810

The input of a typical LSTM model has three dimensions: samples, time steps and features. The input of an LSTM must be a sequence, which matches the requirement here: the speed at the next moment is predicted from an input speed sequence.

(2) Preprocessing of the training data set: the obtained long segment is a time-speed sequence. After the time dimension is discarded, the actual dimension of the long segment is dim = (s, 1), where s is the length of the segment. For model training, however, the data need to be processed into supervised learning data usable by the model.

The processing method is as follows: a sliding window is set on the speed sequence of the long segment, with a window length equal to the time step. The window slides backwards from the initial speed with a step of 1 second each time, and the value immediately following the window is taken as the label $h_t$ of the segment $x_t$ covered by the window. By analogy, an input data set $X$ whose length is $s$ minus the time step is obtained. The input data generation strategy of the long short-term memory neural network model is shown in fig. 7.

(3) After preprocessing the training data set, inputting the preprocessed training data set into a long-short term memory neural network model (LSTM) for training:

The data set X is input into the model for training with the following hyperparameters: the batch size is 5 and the time step is 300 seconds. The structure of the model is: one LSTM layer containing 3 hidden layers with tanh as the activation function, followed by a fully-connected layer that processes the output of the LSTM layer and has an output dimension of 1. The model predicts the speed at the next moment (second) from the speed sequence currently input.

(4) Prediction of fragments using the model:

The input of the model is a time-period sequence of length len, for which the corresponding speed sequence $Y_{len}$ is predicted. The last sample of the training data set is used as the seed input to generate the first predicted value. The first input element is then deleted and the first predicted value is appended as the newest input element, and the model outputs the second predicted value. These steps are repeated to obtain the whole predicted sequence Y. Prediction is carried out separately for the four segment libraries to obtain their time-speed prediction curves, which are spliced to obtain the complete working condition data. Data on the loss function during model training are given in table 4.
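The rolling prediction described above, in which each predicted value is fed back as the newest input while the oldest input is dropped, can be sketched as follows. The trained model is represented by an arbitrary callable `predict_next`, which is a stand-in and not an API of any particular library; the placeholder `mean_model` exists only so that the loop can be run.

```python
def rollout(predict_next, seed_window, steps):
    """Generate `steps` future speed values by feeding predictions back in.

    predict_next : callable taking a list of speeds and returning the next speed
                   (stands in for the trained LSTM model)
    seed_window  : last input window of the training data
    steps        : number of future values to generate
    """
    window = list(seed_window)
    predicted = []
    for _ in range(steps):
        y = predict_next(window)
        predicted.append(y)
        window = window[1:] + [y]    # drop the oldest value, append the prediction
    return predicted


# Placeholder model for demonstration only: predicts the mean of the window.
def mean_model(window):
    return sum(window) / len(window)


curve = rollout(mean_model, seed_window=[10.0, 20.0, 30.0], steps=3)
```

Keeping the window length constant means the model always receives an input of the time-step length it was trained on, regardless of how many values have already been generated.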

TABLE 4 loss function correlation data for model training procedure

Prediction with the trained long short-term memory neural network model finally gives the time-speed prediction curves corresponding to the four kinematic segment libraries. Fig. 8 is a schematic diagram of the training and prediction results for the kinematic segments of the medium-speed interval: the "+" data points (the True curve in fig. 8) are the training data; the solid curve (the Train curve in fig. 8) is the model's prediction on the training data set after training, which coincides closely with the training data; and the Predict curve in fig. 8 is the predicted working condition curve. Fig. 9 is the corresponding schematic diagram for the kinematic segments of the high-speed interval, with the same conventions. As can be seen from figs. 8-9, the LSTM model fits the training data well, so an accurate result can be obtained when it is used to construct the working condition curve.

After the time-speed prediction curves corresponding to the four kinematics segment libraries are obtained, the time occupied by each type of segment in the final working condition synthesis is determined according to the time proportion occupied by each type in the whole kinematics segment, and the curves of the four speed segments are combined into a working condition curve according to the prediction result obtained by the training of the LSTM model, as shown in figure 10.

And after the working condition curve is obtained, the working condition curve is sent to the control equipment, the control equipment controls the tested vehicle to carry out simulated driving according to the working condition curve, so that data such as oil consumption data, tail gas emission and the like of the tested vehicle in the driving process are obtained, and vehicle emission evaluation and environmental protection grade evaluation are carried out according to the obtained data such as the oil consumption data, the tail gas emission and the like.

As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises," "comprising," "includes" and/or "including," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It should be understood that the term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.

It should be noted that, a person skilled in the art can understand that all or part of the processes in the above method embodiments can be implemented by a computer program to instruct related hardware, where the program can be stored in a computer readable storage medium, and when executed, the program can include the processes in the above method embodiments. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like. The foregoing is directed to embodiments of the present invention and it will be appreciated by those skilled in the art that changes may be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims

1. A method for constructing a driving condition of an automobile is characterized by comprising the following steps:

acquiring original GPS data of automobile driving, and preprocessing the original GPS data of the automobile driving; the preprocessing of the raw GPS data of the automobile driving comprises the following steps:

traversing and searching original GPS data of automobile driving from the beginning, searching a first time break point, and dividing the original GPS data into different driving segments from the first time break point;

judging whether a second time breakpoint exists in the obtained driving segment, if so, fitting a series of new speed data points by adopting an improved polynomial fitting method according to speed data before and after the second time breakpoint, and supplementing the second time breakpoint in the driving segment;

for abnormal data of long-term idling for more than 180 seconds, sliding the time and the vehicle speed of each segment by using a sliding window with the size of 180 seconds, wherein the sliding step length is 1s, and in the window sliding process, if all data in the window are idling data, screening out the first piece of data of the window; when the tail of the window slides to the tail of the driving segment, if the data in the window is idle speed data, all the data in the window are screened out, and the data are screened out on all the driving segments in the same way to obtain preprocessed data;

the division of the kinematic fragments of the preprocessed data by adopting a short-stroke division method comprises the following steps: firstly, judging whether the running time of each running section is more than 20s, and if the running time of each running section is less than 20s, rejecting the running section; if the time is more than 20s, searching a kinematic segment from the driving segment according to the searching rule of the kinematic segment;

the search rule of the kinematic segment includes:

(3) Screening the kinematics segment according to a kinematics segment screening rule, and if the kinematics segment screening rule is met, extracting the kinematics segment from the driving segment according to the recorded positions of an idling starting point and an idling end point;

the kinematic fragment screening rule comprises:

(2) The kinematic segment comprises at least one acceleration state and one deceleration state, that is, the kinematic segment contains at least a continuous portion in which the acceleration of the vehicle is greater than 0.1 m/s² and a continuous portion in which the deceleration is less than -0.1 m/s²;

(3) The idle time of the kinematic segment does not exceed 20 seconds;

constructing a training data set: splicing all the kinematic segments in each segment library to obtain four long segments, and taking the four long segments as a training data set;

predicting by using the trained long-short term memory neural network model to obtain time-speed prediction curves corresponding to the four segment libraries respectively, wherein the specific process comprises the following steps: taking the last sample data of the training data set as a first input element, inputting the first input element into the trained long-short term memory neural network model, and outputting a first prediction sequence; deleting the first input element, taking the first predicted value as a second input element, and inputting it into the model to obtain a second prediction sequence; and so on, until the prediction sequence of one segment library is obtained, the same process being applied to each segment library to obtain the time-speed prediction curves corresponding to the four segment libraries;
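The iterative prediction described above resembles ordinary recursive (autoregressive) forecasting. The sketch below is one possible reading: the input is a fixed-length window of speeds, the oldest value is dropped after each step, and the newest prediction is appended. `predict_next` is a hypothetical stand-in for the trained model, and the windowed interpretation is an assumption rather than something the claim spells out.

```python
from collections import deque


def recursive_forecast(predict_next, seed_window, horizon):
    """Roll a one-step predictor forward for `horizon` steps.

    predict_next: hypothetical callable mapping a list of speeds to the next speed.
    seed_window:  non-empty list, e.g. the last sample of the training data set.
    """
    # A deque with maxlen discards the oldest value whenever a new one is appended.
    window = deque(seed_window, maxlen=len(seed_window))
    predictions = []
    for _ in range(horizon):
        next_value = predict_next(list(window))
        predictions.append(next_value)
        window.append(next_value)
    return predictions
```

For example, with the toy predictor `lambda w: w[-1] + 1` and the seed `[1, 2, 3]`, three steps yield `[4, 5, 6]`.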

after the time-speed prediction curves corresponding to the four segment libraries are obtained, determining the time occupied by each of the four segment libraries in the final working condition synthesis according to the proportion of time that each segment library occupies among all kinematic segments, and combining the curves of the four segment libraries into a working condition curve;
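The time allocation can be sketched as a proportional split. The example assumes the total duration of each segment library is known in whole seconds, uses the library labels L, M, H and EH, and takes 1800 s as a default target length; the rounding scheme (largest remainder) and the function name are illustrative choices, not taken from the claim.

```python
def allocate_durations(library_seconds, total_seconds=1800):
    """Split total_seconds across libraries in proportion to their durations."""
    grand_total = sum(library_seconds.values())

    # Integer arithmetic avoids floating-point rounding surprises.
    allocation = {name: total_seconds * secs // grand_total
                  for name, secs in library_seconds.items()}
    remainders = {name: total_seconds * secs % grand_total
                  for name, secs in library_seconds.items()}

    # Hand leftover seconds to the libraries with the largest remainders.
    leftover = total_seconds - sum(allocation.values())
    for name in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        allocation[name] += 1
    return allocation
```

The first `allocation[name]` seconds of each library's predicted curve could then be concatenated to form the combined curve.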

2. The method for constructing the driving condition of the automobile according to claim 1, wherein the characteristic parameters of the kinematic segment comprise time characteristic parameters, speed characteristic parameters and acceleration characteristic parameters, wherein the time characteristic parameters comprise: running time t (s), constant speed time t_i (s), idle time t_c (s), acceleration time t_a (s) and deceleration time t_d (s); the speed characteristic parameters comprise: average speed v_m (km/h), average running speed v_mr (km/h), maximum speed v_max (km/h) and speed standard deviation v_std (km/h); the acceleration characteristic parameters comprise: average acceleration a_ma (m/s²), average deceleration a_md (m/s²), acceleration standard deviation a_std (m/s²), constant speed time ratio P_c (%), idle time ratio P_i (%), acceleration time ratio P_a (%) and deceleration time ratio P_d (%).
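A minimal sketch of how such characteristic parameters might be computed for one kinematic segment follows. It assumes 1 Hz speed samples in km/h with at least two samples, classifies each 1 s interval as idle, accelerating, decelerating or constant speed using a 0.1 m/s² threshold, and takes the average running speed as the mean over non-zero speeds; the dictionary keys, the threshold and these definitions are assumptions for illustration.

```python
from statistics import mean, pstdev


def segment_features(speeds_kmh, accel_threshold=0.1):
    """Compute time, speed and acceleration features of one kinematic segment."""
    # Acceleration in m/s^2 over each 1 s interval (km/h to m/s is a division by 3.6).
    accels = [(b - a) / 3.6 for a, b in zip(speeds_kmh, speeds_kmh[1:])]

    counts = {"idle": 0, "accel": 0, "decel": 0, "constant": 0}
    for v0, v1, a in zip(speeds_kmh, speeds_kmh[1:], accels):
        if v0 == 0 and v1 == 0:
            counts["idle"] += 1
        elif a > accel_threshold:
            counts["accel"] += 1
        elif a < -accel_threshold:
            counts["decel"] += 1
        else:
            counts["constant"] += 1

    total = len(accels)  # running time in seconds
    moving = [v for v in speeds_kmh if v > 0]
    positive = [a for a in accels if a > accel_threshold]
    negative = [a for a in accels if a < -accel_threshold]

    features = {
        "run_time_s": total,
        "mean_speed_kmh": mean(speeds_kmh),
        "mean_running_speed_kmh": mean(moving) if moving else 0.0,
        "max_speed_kmh": max(speeds_kmh),
        "speed_std_kmh": pstdev(speeds_kmh),
        "mean_accel_ms2": mean(positive) if positive else 0.0,
        "mean_decel_ms2": mean(negative) if negative else 0.0,
        "accel_std_ms2": pstdev(accels),
    }
    # Time spent in each state and its share of the running time.
    for state, n in counts.items():
        features[state + "_time_s"] = n
        features[state + "_ratio_pct"] = 100.0 * n / total
    return features
```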

3. The method for constructing the driving conditions of the automobile according to claim 1, wherein the step of dividing the kinematic segments into four segment libraries by adopting K-Means clustering specifically comprises the following steps:

S41, firstly, randomly selecting 4 kinematic segments from all the kinematic segments as initial clustering centers;

S42, carrying out a cluster assignment operation: calculating the Euclidean distance from each kinematic segment to each of the 4 clustering centers, and assigning each kinematic segment to the clustering center with the smallest Euclidean distance to form 4 clusters;

S43, after the 4 clusters are obtained, recalculating the clustering center of each cluster and executing step S42 again, until the kinematic segments composing each cluster no longer change, thereby finally obtaining four segment libraries of kinematic segments.
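Steps S41 to S43 correspond to the standard K-Means loop, which can be sketched in plain Python as follows. Each kinematic segment is assumed to be represented by a feature vector (a tuple of numbers, ideally standardized beforehand); the function name, the `seed` argument and the `max_iter` safety cap are illustrative additions.

```python
import math
import random


def kmeans(points, k=4, seed=0, max_iter=100):
    """Cluster feature vectors into k groups; returns (labels, centers)."""
    rng = random.Random(seed)
    # S41: pick k distinct feature vectors as the initial cluster centers.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # S42: assign every vector to its nearest center (Euclidean distance).
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # S43 stop condition: cluster membership no longer changes
        labels = new_labels
        # S43: recompute each center as the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers
```

Because the initial centers are random, different seeds may yield different partitions; running several seeds and keeping the partition with the smallest total within-cluster distance is a common safeguard.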

4. The method for constructing the driving condition of the automobile according to claim 3, wherein the Euclidean distance from the kinematic segment to the cluster center is calculated as:

$$d_{ij} = \sqrt{\sum_{m} \left(x'_{im} - c'_{jm}\right)^{2}}$$

wherein d_ij is the Euclidean distance from the i-th kinematic segment to the cluster center j, x'_im is the m-th feature element of the i-th kinematic segment, and c'_jm is the m-th feature element of the cluster center j.

5. The method for constructing the driving condition of the automobile according to claim 1, wherein the structure of the long-short term memory neural network model comprises: an input layer, an LSTM layer, a fully connected layer, and an output layer.

6. The method for constructing the driving condition of the automobile according to claim 1, wherein the training data set needs to be preprocessed before being input into the model, and the preprocessing of the training data set comprises:

discarding the time dimension of the long segment and keeping the speed dimension of the long segment;

setting a sliding window, wherein the length of the window is the size of a time step, sliding the window backwards from an initial position on a long segment, the sliding step length is 1 second each time, taking a speed-time sequence segment of an area covered by the window at the current moment as a speed-time sequence of the current moment, and taking an initial speed value of the area covered by the window at the next moment as a label of the area covered by the window at the current moment;

by analogy, the four long segments are respectively preprocessed to obtain the preprocessed training data sets D_L, D_M, D_H and D_EH of the four long segments.
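The sliding-window preprocessing can be sketched as below. The example assumes a long segment reduced to a list of 1 Hz speeds and takes, as the label of each window, the speed value immediately following that window (the usual next-step target); the claim wording may admit other readings, and the function name is hypothetical.

```python
def make_windows(speeds, time_step):
    """Turn one long speed sequence into (window, next value) training pairs."""
    samples, labels = [], []
    # Slide the window forward one second at a time.
    for start in range(len(speeds) - time_step):
        samples.append(speeds[start:start + time_step])
        labels.append(speeds[start + time_step])  # speed right after the window
    return samples, labels
```

Applying the same function to each of the four long segments would give four separate training sets.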