CN115798239A - Vehicle operation road area type identification method

Info

Publication number: CN115798239A
Application number: CN202211439582.4A
Authority: CN
Inventors: 赵轩; 袁晓磊; 杨涛; 魏玉超; 谢鹏辉
Original assignee: Changan University
Current assignee: Changan University
Priority date: 2022-11-17
Filing date: 2022-11-17
Publication date: 2023-03-14
Anticipated expiration: 2042-11-17
Also published as: CN115798239B

Abstract

The invention discloses a method for identifying the area type of the road on which a vehicle operates, comprising the following steps. Step one: dividing a target area into a mountain area type, an urban area type, a suburban area type and an expressway area type, and representing the extent of each area type as discrete longitude and latitude points. Step two: determining the area type to which each short trip belongs according to the vehicle GPS positioning information and the area type extent information. Step three: adding a label for the area type in which the vehicle operates to each short-trip characteristic parameter database, thereby obtaining a machine learning database; training on it with a machine learning algorithm to construct a machine-learning-based vehicle operating road area classification model, which is used to identify the road area in which a vehicle operates. The method obtains the area type labels of user data before driving cycles are synthesized, can be used to construct driving cycles for different area types within a city, and allows technical parameters to be calibrated promptly and accurately to evaluate whole-vehicle performance.

Description

Vehicle operation road area type identification method

Technical field

The present invention relates to the technical field of identifying the area type of the road on which a vehicle operates, and in particular to a vehicle operation road area type identification method.

Background art

Global energy shortages and environmental pollution are becoming increasingly serious. Because the automobile is one of the essential means of travel, its energy consumption and emissions weigh heavily on the overall situation. To meet the increasingly stringent emission regulations of various countries and to conform to the strategic requirements of environmental friendliness and sustainable development, automobile manufacturers are paying more attention to the fuel economy of fuel vehicles and the energy consumption of electric vehicles.

A driving cycle is a speed-versus-time curve that describes how a vehicle is driven. It reflects the most representative driving characteristics of a given vehicle model in a given region or on a given road type. It is mainly used to determine vehicle pollutant emissions and fuel consumption and is an important basis for vehicle energy consumption and emission tests. It can also be used to analyze the damage and service life of a vehicle's drive motor, providing a reference for the technical development and evaluation of new models. Until now, most vehicle test cycles used in the industry have been international or national standard cycles, yet the social, economic and geographical characteristics of individual cities differ, sometimes greatly. Typical standard cycles are therefore insufficient to represent actual road driving conditions in different regions or cities. Moreover, road traffic conditions vary considerably between different geographical areas of the same city; simply pooling the data does not yield a well-matched typical cycle and cannot accurately reflect the driving characteristics of a specific study object. To make the constructed driving cycles more fine-grained and representative, constructing driving cycles for different area types within a city based on geographic location information has become a relatively novel solution.

Driving cycles by area type are the basis for powertrain parameter matching and control strategy optimization in vehicle research, development and design, because the driving cycles of different area types have different characteristics, and the optimal vehicle control parameters and forms suited to them differ accordingly.

To construct driving cycles for the area types of a typical city, to adjust whole-vehicle control parameters in real time according to the type of driving condition the vehicle is actually in, and to calibrate technical parameters promptly and accurately for evaluating whole-vehicle performance, a method is needed for identifying the type of area in which a vehicle operates.

Summary of the invention

The object of the present invention is to provide a method for identifying the area type of the road on which a vehicle operates. The method can be used not only in the preparatory stage of typical driving cycle development but also for real-time driving condition recognition, improving vehicle operation and providing data for big-data analysis in vehicle operation cloud services.

To achieve the above object, the present invention provides the following technical solution:

A method for identifying the area type of the road on which a vehicle operates, characterized in that it comprises:

Step one: according to geographic location information, delimit the mountain area type within the target area; according to road type, divide the target area into the urban area type, the suburban area type and the expressway area type; and represent the extent of each area type as discrete longitude and latitude points;

Step two: take user data as input, the user data comprising vehicle operation data and vehicle GPS positioning information; after preprocessing the data, divide the raw data into a number of kinematic segments (short trips) using the short-trip method, and calculate the characteristic parameters of each short trip to obtain a short-trip characteristic parameter database; determine the area type to which each short trip belongs according to the vehicle GPS positioning information and the area type extent information;

Step three: add to each short-trip characteristic parameter database a label for the area type in which the vehicle operates, thereby obtaining a machine learning database; train on it with a machine learning algorithm to construct a machine-learning-based vehicle operating road area classification model, which is used to identify the road area in which a vehicle operates.

Optionally, step one specifically comprises:

determining the boundary line between the urban area and the suburban area within the city's administrative division; classifying the area inside this boundary line as the urban area, and obtaining GPS point information for the urban area boundary;

classifying all expressways within the city's administrative area as the expressway area, and obtaining GPS point information for the expressway area;

calculating terrain relief from DEM elevation data by neighborhood analysis, reclassifying the terrain relief to obtain the extent of the mountain area, and obtaining GPS point information for the mountain area boundary;

classifying everything outside the urban, expressway and mountain areas as the suburban area, and obtaining GPS point information for the suburban area boundary.

Optionally, the boundary line between the urban area and the suburban area is the city's ring expressway or ring road.

Optionally, step two specifically comprises:

classifying the user's short-trip data by area type: after preprocessing the raw data, which comprise vehicle operation data and vehicle GPS positioning information, dividing the raw data into a number of kinematic segments using the short-trip method to obtain a user short-trip database, and calculating the characteristic parameters of each short trip for subsequent use;

constructing a function that identifies the area type to which a short trip belongs, the function taking the GPS positioning information of the short trip and the GPS information of the four area types as input and outputting the area type label of the short trip.
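The labeling function described in step two could be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: each area type is assumed to be available as a closed polygon of (longitude, latitude) vertices, the standard ray-casting point-in-polygon test is used, and a short trip receives the area type that contains most of its GPS fixes. Points outside every polygon fall back to the suburban type, mirroring the rule that the suburban area is whatever remains. The expressway area, which the text stores as points along roads, is left out of this simplified sketch. The function names and the toy coordinates are hypothetical.

```python
from collections import Counter

def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) lies inside the closed polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray cast to the right of the point cross this edge?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

def label_point(lon, lat, areas, default="suburban"):
    """Return the first area type (in dict order) whose polygon contains the point."""
    for area_type, polygon in areas.items():
        if point_in_polygon(lon, lat, polygon):
            return area_type
    return default

def label_short_trip(gps_points, areas, default="suburban"):
    """Label a short trip with the area type containing most of its GPS fixes."""
    if not gps_points:
        raise ValueError("a short trip needs at least one GPS fix")
    votes = Counter(label_point(lon, lat, areas, default) for lon, lat in gps_points)
    return votes.most_common(1)[0][0]

# Toy rectangles with hypothetical coordinates, not real boundaries.
areas = {
    "urban": [(104.0, 30.6), (104.2, 30.6), (104.2, 30.8), (104.0, 30.8)],
    "mountain": [(103.4, 30.9), (103.6, 30.9), (103.6, 31.1), (103.4, 31.1)],
}
trip = [(104.05, 30.65), (104.10, 30.70), (104.25, 30.70)]
print(label_short_trip(trip, areas))  # two of three fixes are inside "urban"
```

Majority voting is only one possible rule; a trip could equally be labeled by its first fix or by the share of its mileage in each area.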

Optionally, step three specifically comprises:

constructing the vehicle operating road area classification model with the random forest algorithm: the machine learning database obtained contains the user short-trip data, the characteristic parameters and the labels; the random forest algorithm is trained on this database to obtain a random forest model; the proportion of the database used as the training set and the test set is adjusted appropriately, the model hyperparameters are tuned according to the results achieved, and the machine-learning-based vehicle operating road area classification model is finally constructed.

Optionally, the characteristic parameters comprise:

running time, running distance, running speed, acceleration time, deceleration time, constant-speed time, idle time, acceleration proportion, deceleration proportion, constant-speed proportion, idle proportion, maximum speed, average speed, speed standard deviation, maximum acceleration, average acceleration over acceleration segments, standard deviation of positive acceleration, maximum deceleration, average deceleration over deceleration segments, standard deviation of deceleration, average acceleration, acceleration standard deviation, relative positive acceleration, proportion of time in each speed interval, number of accelerations, number of decelerations, number of constant-speed phases, number of stops, torque standard deviation, average positive torque, average negative torque, maximum positive torque, maximum negative torque, average torque over idle segments, average torque over running segments, average positive torque over running segments, average negative torque over running segments, positive-torque time proportion, negative-torque time proportion, maximum fluctuation while torque increases, maximum fluctuation while torque decreases, standard deviation of torque fluctuation, average fluctuation while torque increases, average fluctuation while torque decreases, time proportion in the high-torque interval, time proportion in the medium-torque interval, time proportion in the low-torque interval, absolute maximum yaw rate, absolute maximum lateral acceleration, number of braking events, duration of acceleration segments, energy consumption per 100 km, total motor damage, total front-motor damage, total rear-motor damage, motor damage per unit mileage, front-motor damage per unit mileage, rear-motor damage per unit mileage, and the speed and mileage information of adjacent short trips.
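To make the short-trip division and a few of the listed parameters concrete, here is a small sketch. It rests on assumptions the text does not state: speed is sampled at 1 Hz in km/h, a short trip begins at the start of an idle period and ends just before the next idle period begins, and "idle" means a speed at or below `idle_threshold`. Only six of the listed parameters are computed, and all names are hypothetical.

```python
from statistics import mean, pstdev

def split_short_trips(speeds_kmh, idle_threshold=0.0):
    """Split a speed trace into short trips (idle period followed by motion)."""
    trips, current = [], []
    was_moving = False
    for v in speeds_kmh:
        idle = v <= idle_threshold
        if idle and was_moving:
            trips.append(current)   # a moving-to-idle transition closes a trip
            current = []
        current.append(v)
        was_moving = not idle
    if any(v > idle_threshold for v in current):
        trips.append(current)       # keep a trailing trip that contains motion
    return trips

def trip_features(trip_kmh, dt=1.0, idle_threshold=0.0):
    """Compute a handful of characteristic parameters for one short trip."""
    running_time = len(trip_kmh) * dt                      # seconds
    distance_km = sum(v * dt / 3600.0 for v in trip_kmh)   # km/h * s -> km
    idle_time = sum(dt for v in trip_kmh if v <= idle_threshold)
    return {
        "running_time_s": running_time,
        "distance_km": distance_km,
        "max_speed_kmh": max(trip_kmh),
        "mean_speed_kmh": mean(trip_kmh),
        "speed_std_kmh": pstdev(trip_kmh),
        "idle_proportion": idle_time / running_time,
    }

# Hypothetical 1 Hz speed trace in km/h.
trace = [0, 0, 10, 20, 10, 0, 0, 30, 40, 0]
trips = split_short_trips(trace)
features = [trip_features(t) for t in trips]
print(len(trips), features[0]["max_speed_kmh"])
```

The remaining parameters (acceleration, torque, yaw rate and so on) would be added to the same dictionary from the corresponding signals.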

Optionally, constructing the vehicle operating road area classification model with the random forest algorithm comprises:

importing the machine learning database; randomly dividing the sample data into a training set and a test set at a ratio of 7:3; and computing the importance of the characteristic parameters and ranking them;

setting the main parameters of the classifier as follows: the number of decision trees in the random forest is 178, the maximum tree depth is 9, the maximum number of leaf nodes is 41, the maximum number of features is 31, the minimum number of samples per leaf node is 3, and the minimum number of samples required to split a node is 5;

tuning the model parameters to make the accuracy as high as possible, and finally constructing the vehicle operating road area classification model based on the machine learning algorithm.
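The 7:3 split and the stated classifier settings can be written down as a short sketch. The text names no software library, so the dictionary keys below are an assumption: they are the scikit-learn argument names that most plausibly correspond to the six settings. The split helper uses only the standard library, and the sample records are hypothetical placeholders.

```python
import random

# Hyperparameter values taken from the text; the scikit-learn style
# argument names are an assumed mapping, not stated in the source.
RF_PARAMS = {
    "n_estimators": 178,       # number of decision trees
    "max_depth": 9,            # maximum tree depth
    "max_leaf_nodes": 41,      # maximum number of leaf nodes
    "max_features": 31,        # maximum number of features
    "min_samples_leaf": 3,     # minimum samples per leaf node
    "min_samples_split": 5,    # minimum samples required to split a node
}

def split_train_test(samples, train_fraction=0.7, seed=0):
    """Shuffle the samples and split them 7:3 into training and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical (features, label) records standing in for the database.
samples = [({"mean_speed_kmh": float(i)}, "urban") for i in range(10)]
train, test = split_train_test(samples)
print(len(train), len(test))
```

If scikit-learn is available, `RandomForestClassifier(**RF_PARAMS)` accepts these arguments, and the fitted model's `feature_importances_` attribute can be sorted to rank the characteristic parameters. Note that `max_features=31` requires at least 31 feature columns at fit time.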

The beneficial effects of the present invention are as follows:

Constructing driving cycles for different area types within a city according to geographic location information and road information makes the constructed driving cycles more fine-grained and representative. The operating area type label of the short-trip data in the present invention is an important condition for determining the vehicle's driving cycle and the main basis for the later synthesis of characteristic cycles. Based on a short-trip database containing labels for the area type in which the vehicle operates, driving cycles can be synthesized either separately for each area type of a typical city, or by merging and splicing the four areas in proportion to the mileage of the four area types to construct the overall driving cycle of the typical city. It is therefore necessary to obtain the operating area type labels of the user's short-trip data before the driving cycles are synthesized. While the vehicle is running, the vehicle operating road area classification model of the present invention recognizes the vehicle's driving condition in real time, which facilitates real-time adjustment of whole-vehicle control parameters and prompt, accurate calibration of technical parameters for evaluating whole-vehicle performance. The method of the present invention can therefore be used not only in the preparatory stage of typical driving cycle development but also for real-time driving condition recognition, improving vehicle operation and providing data for big-data analysis in vehicle operation cloud services.

Description of the drawings

Fig. 1 is a flow chart of the vehicle operation road area type identification method of the present invention;

Fig. 2 is a schematic diagram of the GPS boundaries of the road area classification for a typical city in an embodiment of the present invention.

Detailed description of the embodiments

The present invention is described in further detail below with reference to specific embodiments, which explain the invention rather than limit it.

Previously, the vehicle test cycles used in the industry were essentially all international standard cycles, whereas the social, economic and geographical characteristics of individual cities differ, sometimes greatly. Typical standard cycles are therefore insufficient to represent actual road driving conditions in different regions or cities. Moreover, road traffic conditions vary considerably between different geographical areas of the same city; simply pooling the data does not yield a well-matched typical cycle and cannot accurately reflect the driving characteristics of a specific study object. Constructing driving cycles for different area types within a city according to geographic location information and road information therefore makes the constructed driving cycles more fine-grained and representative.

First, the operating area type label of the short-trip data in the present invention is an important condition for determining the vehicle's driving cycle and the main basis for the later synthesis of characteristic cycles. Based on a short-trip database containing labels for the area type in which the vehicle operates, driving cycles can be synthesized either separately for each area type of a typical city, or by merging and splicing the four areas in proportion to the mileage of the four area types to construct the overall driving cycle of the typical city. It is therefore necessary to obtain the operating area type labels of the user's short-trip data before the driving cycles are synthesized.

Second, while the vehicle is running, the vehicle operating road area classification model of the present invention recognizes the vehicle's driving condition in real time, which facilitates real-time adjustment of whole-vehicle control parameters and prompt, accurate calibration of technical parameters for evaluating whole-vehicle performance.

The method of the present invention can therefore be used not only in the preparatory stage of typical driving cycle development but also for real-time driving condition recognition, improving vehicle operation and providing data for big-data analysis in vehicle operation cloud services.

When constructing vehicle driving cycles, the present invention obtains the raw data by the self-directed (free) driving method. However, road traffic conditions differ greatly between the geographical areas of a typical city; simply pooling the data does not yield a well-matched typical cycle and cannot accurately reflect the driving characteristics of a specific study object. Therefore, according to the geographical composition of the typical city, ArcGIS software is used to divide the city into several areas: urban, mountain, suburban and expressway. Unlike other existing approaches to driving cycle development, these areas are delimited according to geographic information and road information. The "urban" and "suburban" categories named in some existing approaches are types of driving cycle obtained by clustering the short trips of vehicles driven freely throughout the typical city. In such approaches, for example, the driving condition of a road geographically located in the suburbs may be assigned to the "urban" category, because that so-called "urban" category actually denotes a low-speed condition with modest vehicle speeds and frequent starts and stops. The driving conditions of vehicles in the areas delimited by the present invention differ markedly from one another; for example, the low-speed driving conditions in the suburban area category are significantly higher than those in the urban area category. Therefore, when developing driving cycles for a typical city, first dividing the city's geographic information into several categories and then developing the driving cycles makes the classification of cycles more fine-grained, more typical and more representative.

The present invention assigns the short-trip data to different geographical area types. Based on the short-trip characteristic parameter database containing labels for the area type in which the vehicle operates, driving cycles can be synthesized in either of two ways: (1) separately for each area type of the typical city, or (2) by merging and splicing the four areas in proportion to the mileage of the four area types to construct the overall driving cycle of the typical city. This is the work done in steps 1 and 2.
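The mileage proportions used in option (2) can be illustrated with a short sketch. The text states only that the four areas are merged in proportion to their mileage; splitting a fixed total cycle duration according to those proportions is an assumption made here for illustration, as are the function names, the 1800 s duration and the sample distances.

```python
def mileage_shares(labelled_trips):
    """Sum distance per area type and return each type's share of the total."""
    totals = {}
    for area_type, distance_km in labelled_trips:
        totals[area_type] = totals.get(area_type, 0.0) + distance_km
    grand_total = sum(totals.values())
    return {area: dist / grand_total for area, dist in totals.items()}

def allocate_duration(shares, total_seconds):
    """Split a target cycle duration across area types by mileage share."""
    return {area: round(share * total_seconds) for area, share in shares.items()}

# Hypothetical (area type label, short-trip distance in km) records.
labelled_trips = [
    ("urban", 2.0), ("urban", 3.0),
    ("suburban", 3.0),
    ("expressway", 10.0),
    ("mountain", 2.0),
]
shares = mileage_shares(labelled_trips)
durations = allocate_duration(shares, total_seconds=1800)
print(durations)
```

Each area type's sub-cycle would then be synthesized to fill its allotted share before the four are spliced together.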

Step 3 constructs the vehicle operating road area classification model using the random forest algorithm. The training database for machine learning consists of the kinematic characteristic parameters of the user's short trips plus the area type labels. The model learns the correlation between the kinematic characteristic parameters of the short trips and the area type; after training, the area type can be determined from the user's kinematic characteristic parameters alone. The vehicle's driving condition can thus be recognized in real time, which facilitates real-time adjustment of whole-vehicle control parameters and prompt, accurate calibration of technical parameters for evaluating whole-vehicle performance.

Novel choices of characteristic parameters for the machine learning database include: the absolute maximum yaw rate, the speed and mileage information of adjacent short trips, the absolute maximum lateral acceleration, the number of braking events, and the energy consumption per 100 km.

The specific embodiment of the vehicle operation road area type identification method proposed by the present invention mainly comprises the following steps:

Step 1: creation of the GPS boundary database for the typical road classification. Nationwide map data, including provincial administrative regions, county-level administrative regions, the national road network and DEM elevation data, are imported into the GIS platform ArcGIS. Because every city has a ring expressway or ring road that separates the urban area from the suburbs, the ring expressway or ring road of the typical city is taken as the boundary between city and suburbs and is intersected with the national road library to obtain the urban and suburban road libraries. The expressway library is the collection of the expressway portions of the roads within the province, including suburban expressways and the ring expressway. For mountain roads, the DEM data are used in ArcGIS to derive the mountainous area within the typical city, from which the mountain road library is obtained. Ordinary roads are classified by administrative grade, with the prefixes G, S, X, Y, C and Z distinguishing national, provincial, county, township, village and special-purpose roads; they mainly cover the suburban roads other than expressways and mountain roads.

Specifically, in step 1.1, it is determined for each typical city whether the urban-suburban boundary is a ring expressway or a ring road. Taking Chengdu as an example, the Fourth Ring Road serves as the boundary between the city and the suburbs. The national provincial administrative regions, county-level administrative regions and national road data are imported into ArcGIS, and all provinces other than the one containing the typical city are deleted, leaving the vector data of that province. The province data are then intersected with the national county-level administrative region data to obtain the province as composed of county-level administrative regions. Surplus counties and districts are deleted according to the current standard administrative divisions, yielding the typical city's administrative area composed of county-level administrative regions.

A polygon-to-line conversion is applied to the typical city's administrative area to obtain its outer contour line, points are placed along the contour line at 100 m intervals, and the GPS point information of the outer contour line is computed.
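Placing points at 100 m intervals along a line can be sketched outside a GIS as follows. This is an illustration under assumptions, not the ArcGIS procedure used in the embodiment: distances are measured with the haversine formula on a spherical Earth, points are interpolated linearly in longitude and latitude within each segment (adequate for segments of city scale), and the function names and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation

def haversine_m(p, q):
    """Great-circle distance in metres between two (lon, lat) points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def points_along(polyline, spacing_m=100.0):
    """Return points spaced about spacing_m apart along a (lon, lat) polyline."""
    points = [polyline[0]]
    to_next = spacing_m            # distance left until the next point is due
    for start, end in zip(polyline, polyline[1:]):
        seg_len = haversine_m(start, end)
        pos = 0.0                  # distance already covered on this segment
        while seg_len - pos >= to_next:
            pos += to_next
            t = pos / seg_len      # linear interpolation in lon/lat
            points.append((start[0] + t * (end[0] - start[0]),
                           start[1] + t * (end[1] - start[1])))
            to_next = spacing_m
        to_next -= seg_len - pos   # carry the remainder into the next segment
    return points

# Hypothetical contour segment: 0.01 degrees of latitude, roughly 1.1 km.
contour = [(104.0, 30.0), (104.0, 30.01)]
pts = points_along(contour)
print(len(pts))
```

The remainder carried from one segment into the next keeps the spacing uniform across vertices of the contour.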

In step 1.2, building on step 1.1, the typical city's administrative area is intersected with the national road data to obtain all road information within the typical city. A copy of this road information is made for deriving the urban-suburban boundary. In the attribute table of the road layer, the name attribute gives the road name; the Chengdu Fourth Ring Road is selected and all unselected roads are deleted, leaving the urban-suburban boundary line. The ring is inspected for gaps, and any gap is closed so that the ring forms a complete loop. Points are placed along the urban-suburban boundary line at 100 m intervals, and the GPS point information of the boundary is computed. With the boundary line obtained, the line feature is converted to a polygon to obtain the typical city's urban area; the urban area is then copied and clipped to obtain the suburban and mountain area.

In step 1.3, the expressway information of the typical city is obtained. Building on steps 1.1 and 1.2, the suburban and mountain area of the typical city is intersected with the national road data to obtain the typical city's suburban roads; the resulting layer is renamed as the expressway layer for convenience in subsequent operations. In the attribute table of the road layer, features are selected by attribute with "fclass" = 'motorway' OR "fclass" = 'motorway_link', which yields expressways (including the ring expressway, ring-line expressways and ring roads), expressway interchanges and ramps. The selection is switched and the remaining roads are deleted to obtain the expressway library. Points are placed along the expressway library at 100 m intervals, and the GPS point information of the expressways is computed. The expressways are also deleted from the urban and suburban road libraries.
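The attribute selection in step 1.3 amounts to a simple filter. The sketch below applies the same condition to plain dictionaries; the two `fclass` values come from the expression in the text, while the record layout, function names and sample roads are hypothetical.

```python
EXPRESSWAY_CLASSES = {"motorway", "motorway_link"}

def select_expressways(roads):
    """Keep road features whose fclass is 'motorway' or 'motorway_link'."""
    return [road for road in roads if road.get("fclass") in EXPRESSWAY_CLASSES]

def remove_expressways(roads):
    """Complementary selection: drop expressway features from a road library."""
    return [road for road in roads if road.get("fclass") not in EXPRESSWAY_CLASSES]

# Hypothetical road records.
roads = [
    {"name": "Ring Expressway", "fclass": "motorway"},
    {"name": "Interchange ramp", "fclass": "motorway_link"},
    {"name": "County road", "fclass": "secondary"},
]
expressway_library = select_expressways(roads)
other_roads = remove_expressways(roads)
print(len(expressway_library), len(other_roads))
```

`remove_expressways` corresponds to the final sentence of the step, in which expressways are taken out of the urban and suburban road libraries.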

In step 1.4, the extent of the mountain area is obtained. DEM data provide a digital representation of ground terrain from a finite set of terrain elevation values, and the mountain extent is delimited from the national DEM data. First, according to the longitude and latitude of the typical city, several DEM tiles around the city are obtained and imported into the same layer in ArcGIS, and a single merged DEM is produced by raster mosaicking, which unifies the value range of the mountain elevation data and simplifies processing. The raster data structure is the simplest and most intuitive spatial data structure, also called the grid structure or pixel structure: the region is divided into a closely packed array of equal-sized grid cells, each cell being a pixel that is addressed by its row and column numbers and holds a code representing the cell's attribute type or value.

Then the terrain relief is calculated. The 1:2,500,000 International Geomorphological Map of Europe and the guide to a unified international legend for medium-scale geomorphological maps, both compiled by the Commission on Geomorphological Survey and Mapping of the International Geographical Union, as well as China's 1:4,000,000 Geomorphological Map of China, the Cartographic Specification for 1:1,000,000 Geomorphological Maps and the Development of the Complete Geomorphological Map of China, all take terrain relief as the main basis for distinguishing basic landform types. Terrain relief, also called relief amplitude, local relief or relative height, is the elevation difference between the highest and lowest points within a given area, as expressed in formula (1-1):

$R = H_{\max} - H_{\min}$ (1-1);

where $R$ is the terrain relief within the analysis area, and $H_{\max}$ and $H_{\min}$ are the maximum and minimum elevation values within the analysis area, respectively.

Focal statistics with a rectangular neighborhood are applied to the merged DEM; neighborhood analysis computes a statistic over the area surrounding each cell. A square neighborhood 23 cells high and 23 cells wide is defined, and focal statistics are run to obtain the MAXIMUM and the MINIMUM: MAXIMUM is the largest cell value in the neighborhood, and MINIMUM is the smallest. Using Map Algebra and the Raster Calculator in the Spatial Analyst tools, "maximum" - "minimum" is computed, and the output is the terrain relief map. The relief values are then reclassified with the Reclassify tool. The Identify function is used to read the approximate stretched value along the mountain boundary, and the values above and below that boundary value are reclassified into two levels, which gives the extent map of the mountainous area. The "natural breaks" classification is based on natural groupings inherent in the data: class breaks are chosen so that similar values are grouped together and the differences between classes are maximized.
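The relief calculation described above (a moving-window maximum minus minimum, followed by a two-level reclassification) can be sketched in plain Python. This is a minimal illustration and not the ArcGIS workflow itself: the function names, the edge handling (windows are clipped at the grid border), and the `threshold` argument are assumptions.

```python
def terrain_relief(dem, size=23):
    """Return max - min elevation in a size x size window around each cell.

    dem is a list of rows of elevation values. Windows are clipped at the
    grid edges (an assumption; the GIS tool may treat edges differently).
    """
    half = size // 2
    rows, cols = len(dem), len(dem[0])
    relief = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                dem[i][j]
                for i in range(max(0, r - half), min(rows, r + half + 1))
                for j in range(max(0, c - half), min(cols, c + half + 1))
            ]
            relief[r][c] = max(window) - min(window)  # formula (1-1)
    return relief


def reclassify(relief, threshold):
    """Two-level reclassification: 1 = mountainous, 0 = other.

    threshold is a hypothetical boundary value read off the relief map.
    """
    return [[1 if v >= threshold else 0 for v in row] for row in relief]
```

With a 23 x 23 window on a large DEM this brute-force loop is slow; it is only meant to make the definition concrete.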

The reclassified raster is first converted from raster to polygon to obtain vector data. The vector data are then intersected with the vector data of the suburban extent, the low-relief polygons are deleted, and the mountainous extent of the typical city is clipped out.

Processing the patches. Judging from the resulting mountainous area and the map of China, the eastern mountainous part, although its relief is broken up by rivers, should as a whole belong to the mountainous area, so GIS tools are used to merge it into a single region.

With the editor open, select the patch areas in the mountain vector data and choose Cartography Tools → Generalization → Aggregate Polygons. The input is the mountainous area; for the output, create a new file geodatabase (gdb) in the folder and save the result as the mountain aggregated polygon. For the remaining parts, use the auto-complete polygon option under Create Features in the editor to connect the holes inside the large block and the small surrounding areas into one contiguous area and merge them; then clip and copy the aggregated polygon and the auto-completed polygon. The resulting area is the mountain vector data. The mountain extent polygon is converted to a line to obtain the mountain boundary line, and points are generated along it at 100-meter intervals to obtain the GPS points of the mountain extent. In the attribute table, the patches are named by block, and the selected parts and the excluded parts are marked with 0 and 1.

Using the Symmetrical Difference tool, the mountainous area is removed from the suburban area to obtain the ordinary-road area. The ordinary-road area is converted from polygon to line, points are generated along its outline at 100-meter intervals, and the GPS point information of the outer boundary of the typical city's ordinary-road area is obtained by calculation. These points are likewise named in the attribute table, with 0 and 1 indicating whether a ring is a selected extent or a removed extent.
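Generating boundary points at 100-meter intervals can be illustrated as follows. This is a sketch under stated assumptions: the Earth is treated as a sphere, points are linearly interpolated in latitude and longitude within each segment (adequate for short segments), and `haversine_m` and `points_along` are hypothetical helper names rather than anything from the GIS tool.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; spherical model (assumption)


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def points_along(line, spacing_m=100.0):
    """Place a point every spacing_m metres along a polyline of (lat, lon) vertices."""
    out = [line[0]]
    carried = 0.0  # distance travelled since the last emitted point
    for a, b in zip(line, line[1:]):
        seg = haversine_m(a, b)
        if seg == 0:
            continue
        d = spacing_m - carried  # offset of the next point within this segment
        while d <= seg:
            t = d / seg
            out.append((a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1])))
            d += spacing_m
        carried = seg - (d - spacing_m)
    return out
```

For example, `points_along([(0.0, 0.0), (0.0, 0.01)])` walks roughly 1.1 km along the equator and returns the start point plus one point per 100 m.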

At this point, the GPS information libraries for the four area types (urban, mountainous, suburban, and highway) have all been built, as shown in Figure 2.

Step 2: Classify the user short-trip data by area type. A large volume of user data serves as the raw input; it contains vehicle operating information such as GPS positioning information, speed, acceleration, SOC, the high-voltage power-on signal, and motor torque and speed. After the raw data are preprocessed, the user data are divided into a number of valid short-trip segments based on the high-voltage power-on signal, following the usual preliminary steps for constructing driving cycles. This yields the user short-trip database, and the characteristic parameters of each short trip are calculated for later use.
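As a rough illustration of this segmentation, the sketch below groups consecutive samples recorded while the high-voltage power-on signal is active into short trips and computes a few of the characteristic parameters. The record layout (`power_on`, `speed_kmh`), the fixed sampling interval `dt_s`, and the function names are all assumptions made for the example; the patent does not specify a data format.

```python
from itertools import groupby


def split_short_trips(records):
    """Group consecutive samples with power_on == True into short trips."""
    return [
        list(group)
        for on, group in groupby(records, key=lambda r: r["power_on"])
        if on
    ]


def basic_features(trip, dt_s=1.0):
    """Compute a small subset of the characteristic parameters for one trip."""
    speeds = [r["speed_kmh"] for r in trip]
    return {
        "run_time_s": len(speeds) * dt_s,
        "max_speed_kmh": max(speeds),
        "mean_speed_kmh": sum(speeds) / len(speeds),
        "idle_ratio": sum(1 for v in speeds if v == 0) / len(speeds),
    }
```

The full feature set listed later in the description would be computed the same way, one dictionary per short trip.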

A function is constructed to identify the area type to which a short trip belongs. It takes the GPS positioning information of the short trip and the GPS information of each area of the city (urban, highway, mountainous, and suburban) as input, and outputs the area type label of the short trip. The basic procedure is as follows:

Because a user short trip is a segment of vehicle operation, it may span several area types. The proportion of its latitude/longitude points falling in each area type is therefore determined, and if more than a specified proportion of the points fall within one area type, the short trip is assigned to that area type.

First, the short trips that lie within the administrative area of the typical city are selected. Urban short trips are identified first, because a short trip located in the urban area cannot also belong to the highway area, which reduces the amount of computation. The inputs are the GPS points of the short trip and the GPS points of the urban-suburban boundary. In the first step, the boundary point sequence is checked for closure and is closed if it is not. In the second step, MATLAB's built-in inpolygon() function is used to find the GPS points that lie within the urban area, and the proportion of the short trip's GPS points that lie in the urban area is calculated. If more than 80% of the GPS points lie in the urban area, the short-trip segment is assigned to the urban category.
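The two steps above (close the boundary, then test each GPS point and apply the 80% rule) can be sketched in Python. The source uses MATLAB's inpolygon(); the even-odd ray-casting test below is a stand-in for it, not a reimplementation, and it may treat points lying exactly on an edge differently. All function names are hypothetical.

```python
def close_ring(ring):
    """Append the first vertex if the boundary is not already closed."""
    return ring if ring[0] == ring[-1] else ring + [ring[0]]


def point_in_polygon(pt, ring):
    """Even-odd ray casting; ring is a closed list of (lon, lat) vertices."""
    x, y = pt
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def fraction_inside(trip_pts, boundary):
    """Share of a short trip's GPS points that fall inside the boundary."""
    ring = close_ring(list(boundary))
    hits = sum(point_in_polygon(p, ring) for p in trip_pts)
    return hits / len(trip_pts)


def is_in_region(trip_pts, boundary, min_fraction=0.8):
    """Apply the 'more than 80% of points' rule from the text."""
    return fraction_inside(trip_pts, boundary) > min_fraction
```

The same helper applies to any region whose boundary is a single closed ring; regions with excluded inner rings need an extra check against those rings.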

Highway short trips are identified differently. For each GPS point of the short trip, the nearest point in the highway GPS point set is found, and the distance between the two is calculated from their latitudes and longitudes. If the distance is less than 50 meters, the GPS point is considered to lie in the highway area. The proportion of the short trip's GPS points that lie in the highway area is then calculated; if more than 60% do, the short-trip segment is assigned to the highway category.
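A minimal sketch of this nearest-point test follows. It assumes a spherical Earth and the haversine formula for distance, and it searches the highway point set by brute force; the function names are hypothetical, and for large point sets a spatial index would be the natural replacement for the linear scan.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; spherical model (assumption)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def on_highway(pt, highway_pts, max_dist_m=50.0):
    """True if the nearest highway point is closer than max_dist_m."""
    return min(haversine_m(*pt, *h) for h in highway_pts) < max_dist_m


def is_highway_trip(trip_pts, highway_pts, min_fraction=0.6):
    """Apply the 'more than 60% of points' rule from the text."""
    hits = sum(on_highway(p, highway_pts) for p in trip_pts)
    return hits / len(trip_pts) > min_fraction
```

Because the highway library is sampled at a fixed spacing, the 50-meter tolerance also absorbs the gap between a GPS fix and the nearest sampled point.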

The suburban area differs from the urban area in that it is not a convex region, so the non-suburban areas enclosed by the outer suburban outline must be taken into account and the suburban and non-suburban parts separated. The inputs are the GPS points of the short trip and the GPS points of the suburban outline. In the first step, the outline point sequence is checked for closure and is closed if it is not. In the second step, MATLAB's built-in inpolygon() function is used to find the GPS points that lie within the suburban area, and the proportion of the short trip's GPS points that lie in the suburban area is calculated. If more than 80% of the GPS points lie in the suburban area, the short-trip segment is assigned to the suburban category.

For the mountainous area, the inputs are the GPS points of the short trip and the GPS points of the mountain outline. In the first step, the outline point sequence is checked for closure and is closed if it is not. In the second step, MATLAB's built-in inpolygon() function is used to find the GPS points that lie within the mountainous area, and the proportion of the short trip's GPS points that lie in the mountainous area is calculated. If more than 80% of the GPS points lie in the mountainous area, the short-trip segment is assigned to the mountainous category.

This gives the area category traversed by the user in each short trip. A label for the area type in which the vehicle operated is added to each short-trip characteristic parameter database for use in the subsequent driving cycle synthesis, and the machine learning database used in step 3 is obtained at the same time.

Step 3: Build the vehicle operation road area classification model with the random forest machine learning algorithm. Given the strengths of random forests for classification, the random forest algorithm is used to classify the kinematic characteristic parameter data of vehicle short trips on the basis of the characteristic parameter indicators.

The database is trained with the random forest algorithm to obtain a random forest model. The random forest model consists of multiple decision trees and serves as the machine learning classification model for the user's operating road area type; the decision trees are independent of one another. A decision tree is a tree-shaped structure that may be binary or non-binary. In the embodiment of the present invention, the random forest model uses CART decision trees with a non-binary tree structure (a decision tree algorithm whose splitting criterion is the reduction of the Gini coefficient) as weak learners, and each decision tree is generated by the classification and regression tree algorithm. The Gini coefficient expresses how often a randomly selected data point in the data set would be misclassified, and it is calculated as follows:

1. Let $p_k$ denote the probability that a selected sample belongs to class $k$; the probability that this sample is misclassified is then $1 - p_k$.

2. The sample set contains $K$ classes, and a randomly selected sample may belong to any one of them, so the index is obtained by summing over the classes: $\mathrm{Gini}(p) = \sum_{k=1}^{K} p_k (1 - p_k) = 1 - \sum_{k=1}^{K} p_k^2$. For example, if the sample set $D$ contains $K$ classes and the $k$-th class contains $|C_k|$ samples, the Gini index of $D$ is $\mathrm{Gini}(D) = 1 - \sum_{k=1}^{K} \left( |C_k| / |D| \right)^2$.
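For a concrete sense of the Gini index of a labeled sample set, the standard form 1 minus the sum of squared class proportions can be computed as below. The function name and the example labels are illustrative assumptions.

```python
from collections import Counter


def gini_index(labels):
    """Gini index of a list of class labels: 1 - sum of squared class shares."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())
```

A node containing only one class has an index of 0, and a node split evenly across the four area types has an index of 0.75, so a split that lowers the index makes the child nodes purer.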

Building on decision trees, the random forest randomly selects a subset of the sample features at each node, say Nsub of them, and then chooses the best of these Nsub features to split the node into its left and right subtrees. This further improves the generalization ability of the model. The specific process of building the vehicle operation road area classification model with the random forest algorithm is as follows:

First, the machine learning database is imported. The machine learning database from step 2 contains the short-trip data of multiple users. Its characteristic parameters include running time, running distance, running speed, acceleration time, deceleration time, constant-speed time, idle time, acceleration proportion, deceleration proportion, constant-speed proportion, idle proportion, maximum speed, average speed, standard deviation of speed, maximum acceleration, average acceleration of acceleration segments, standard deviation of positive acceleration, maximum deceleration, average deceleration of deceleration segments, standard deviation of deceleration, average acceleration, standard deviation of acceleration, relative positive acceleration, proportions of different speed intervals, number of accelerations, number of decelerations, number of constant-speed segments, number of stops, standard deviation of torque, average positive torque, average negative torque, maximum positive torque, maximum negative torque, average torque of idle segments, average torque of running segments, average positive torque of running segments, average negative torque of running segments, positive-torque time proportion, negative-torque time proportion, maximum fluctuation during torque increase, maximum fluctuation during torque decrease, standard deviation of torque fluctuation, average fluctuation during torque increase, average fluctuation during torque decrease, time proportion of the high-torque interval, time proportion of the medium-torque interval, time proportion of the low-torque interval, absolute maximum yaw rate, absolute maximum lateral acceleration, number of braking events, duration of acceleration segments, energy consumption per 100 km, total motor damage, total front-motor damage, total rear-motor damage, motor damage per unit mileage, front-motor damage per unit mileage, rear-motor damage per unit mileage, speed and mileage information of adjacent short trips, and the label of the area type to which the short trip belongs.

Among these, characteristic parameters such as the absolute maximum yaw rate, the speed and mileage information of adjacent short trips, the absolute maximum lateral acceleration, the number of braking events, and the energy consumption per 100 km help improve the recognition accuracy of the resulting classification model.

The sample data are randomly divided into a training set and a test set at a ratio of 7:3, and the importance of each characteristic parameter is computed and ranked. The main parameters of the classifier are as follows:

The number of decision trees in the random forest is set to 178, the maximum tree depth to 9, the maximum number of leaf nodes to 41, the maximum number of features to 31, the minimum number of samples per leaf node to 3, and the minimum number of samples required to split a node to 5.
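The 7:3 split and the classifier settings can be written down as a small configuration sketch. The patent does not name a library; the dictionary keys below follow the parameter names of scikit-learn's RandomForestClassifier purely as an assumption, and `split_7_3` is a hypothetical helper.

```python
import random

# Hyperparameters from the text, keyed with scikit-learn-style names (assumption).
RF_PARAMS = {
    "n_estimators": 178,      # number of decision trees
    "max_depth": 9,           # maximum tree depth
    "max_leaf_nodes": 41,     # maximum number of leaf nodes
    "max_features": 31,       # features considered at each split
    "min_samples_leaf": 3,    # minimum samples per leaf node
    "min_samples_split": 5,   # minimum samples required to split a node
}


def split_7_3(samples, seed=0):
    """Randomly split samples into a 70% training set and a 30% test set."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 7 // 10
    return shuffled[:cut], shuffled[cut:]
```

If scikit-learn were used, these settings could be passed as `RandomForestClassifier(**RF_PARAMS)`; with another library the names would need to be mapped accordingly.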

The random forest is constructed as follows:

1. From a sample set of size N, draw N times with replacement, one sample per draw, to form N samples. These N samples are used to train one decision tree and serve as the samples at its root node.

2. If each sample has M attributes, then whenever a node of the decision tree needs to be split, m attributes are selected at random from the M attributes, with m ≤ M. One of these m attributes is then chosen as the splitting attribute of the node according to some criterion (for example, information gain).

3. During the growth of the decision tree, every node is split according to step 2 until no further split is possible (if the attribute selected for a node is the one just used to split its parent node, the node has become a leaf node and need not be split further). No pruning is performed while the tree is grown.

4. A large number of decision trees are built according to steps 1 to 3, which together form the random forest.
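The two sources of randomness in steps 1 and 2 can be made concrete with a short sketch. It shows only the sampling, not tree growth or split selection, and the function names are hypothetical.

```python
import random


def bootstrap_sample(samples, rng):
    """Step 1: draw len(samples) items with replacement for one tree."""
    n = len(samples)
    return [samples[rng.randrange(n)] for _ in range(n)]


def candidate_features(n_features, m, rng):
    """Step 2: pick m distinct feature indices out of n_features (m <= n_features)."""
    return rng.sample(range(n_features), m)
```

Each tree receives its own bootstrap sample, and `candidate_features` is called afresh at every node that is split, so different trees and different nodes see different subsets of the data and of the characteristic parameters.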

The proportion of the machine learning database used for the training set and the test set is adjusted as appropriate, and the model parameters are tuned to make the accuracy as high as possible. The result is the vehicle operation road area classification model based on a machine learning algorithm.

Validation shows that the resulting vehicle operation road area classification model achieves test-set accuracy and recall above 90% for every area type; the results are given in Table 1.

Table 1

                     Urban     Highway   Suburban  Mountainous
Test-set accuracy    92.85%    94.01%    93.90%    92.21%
Test-set recall      93.97%    94.53%    94.96%    90.63%

In the embodiment of the present invention, steps 1 and 2 build a short-trip characteristic parameter database that contains the label of the area type in which the vehicle operated. On this basis, when driving cycles are synthesized, a driving cycle can be synthesized separately for each area category of the typical city, or the four areas can be merged and spliced in proportion to the mileage in each of the four area categories to build an overall driving cycle for the typical city. Step 3 builds the machine-learning-based vehicle operation road area classification model: given the kinematic characteristic parameters and operating data of a vehicle, it returns the area type in which the vehicle is operating. On this basis, vehicle control parameters can be adjusted and technical parameters calibrated promptly and accurately to evaluate vehicle performance.

Matters not described in the present invention are known in the art.

The above embodiments merely illustrate the technical concept and features of the present invention. Their purpose is to enable those skilled in the art to understand and implement the invention, and they do not limit its scope of protection. All equivalent changes or modifications made in accordance with the spirit of the present invention shall fall within its scope of protection.

Claims

1. A vehicle operation road area type identification method, characterized by comprising:

Step 1: Based on geographic location information, delineate the mountainous area type within the target area; based on road type, divide the target area into the urban area type, the suburban area type, and the highway area type; and represent the extent of each area type as discrete latitude/longitude point information;

Step 2: Take user data as input, the user data including vehicle operating data and vehicle GPS positioning information; after preprocessing the data, divide the raw data into a number of kinematic segments by the short-trip method and calculate the characteristic parameters of each short trip to obtain a short-trip characteristic parameter database; and determine the area type to which each short trip belongs from the vehicle GPS positioning information and the area type extent information;

Step 3: Add the label of the area type in which the vehicle operated to each short-trip characteristic parameter database, thereby also obtaining a machine learning database; train on it with a machine learning algorithm to build a machine-learning-based vehicle operation road area classification model, which is used to identify the road area in which a vehicle is operating.

2. The vehicle operation road area type identification method according to claim 1, wherein said step 1 specifically includes:

According to the extent of the city's administrative divisions, determine the boundary line between the urban area and the suburban area; classify the area enclosed by the urban-suburban boundary line as the urban area, and obtain the GPS point information of the urban area boundary;

Classify all expressways within the city's administrative area as the highway area, and obtain the GPS point information of the highway area;

From the DEM elevation data, calculate the terrain relief, perform neighborhood analysis, and reclassify the terrain relief to obtain the extent of the mountainous area and the GPS point information of its boundary;

Classify the areas other than the urban, highway, and mountainous areas as the suburban area, and obtain the GPS point information of the suburban area boundary.

3. The vehicle operation road area type identification method according to claim 2, wherein the boundary line between the urban area and the suburban area is a ring expressway or a ring road.

4. The vehicle operation road area type identification method according to claim 1, 2 or 3, characterized in that, said step 2 specifically includes:

The user short-trip data are classified by area type; after the raw data, which include vehicle operating data and vehicle GPS positioning information, are preprocessed, the raw data are divided into a number of kinematic segments by the short-trip method to obtain a user short-trip database, and the characteristic parameters of each short trip are calculated for later use;

Construct a function to identify the type of area to which the short trip belongs. The function takes the GPS positioning information of the short trip and the GPS information of the four types of areas as input, and outputs the label of the area type to which the short trip belongs.

5. The vehicle operation road area type identification method according to claim 1, 2 or 3, characterized in that said step 3 specifically includes:

Build the vehicle operation road area classification model with the random forest machine learning algorithm: obtain the machine learning database, which includes the user short-trip data, characteristic parameters, and labels; train on the database with the random forest algorithm to obtain a random forest model; adjust as appropriate the proportion of the machine learning database used for the training set and the test set; tune the model parameters according to the results achieved; and finally build the machine-learning-based vehicle operation road area classification model.

6. The vehicle operation road area type identification method according to claim 5, wherein the characteristic parameters include:

Running time, running distance, running speed, acceleration time, deceleration time, constant-speed time, idle time, acceleration proportion, deceleration proportion, constant-speed proportion, idle proportion, maximum speed, average speed, standard deviation of speed, maximum acceleration, average acceleration of acceleration segments, standard deviation of positive acceleration, maximum deceleration, average deceleration of deceleration segments, standard deviation of deceleration, average acceleration, standard deviation of acceleration, relative positive acceleration, proportions of different speed intervals, number of accelerations, number of decelerations, number of constant-speed segments, number of stops, standard deviation of torque, average positive torque, average negative torque, maximum positive torque, maximum negative torque, average torque of idle segments, average torque of running segments, average positive torque of running segments, average negative torque of running segments, positive-torque time proportion, negative-torque time proportion, maximum fluctuation during torque increase, maximum fluctuation during torque decrease, standard deviation of torque fluctuation, average fluctuation during torque increase, average fluctuation during torque decrease, time proportion of the high-torque interval, time proportion of the medium-torque interval, time proportion of the low-torque interval, absolute maximum yaw rate, absolute maximum lateral acceleration, number of braking events, duration of acceleration segments, energy consumption per 100 km, total motor damage, total front-motor damage, total rear-motor damage, motor damage per unit mileage, front-motor damage per unit mileage, rear-motor damage per unit mileage, and speed and mileage information of adjacent short trips.

7. The vehicle operation road area type identification method according to claim 5, wherein said adopting the random forest algorithm to construct the vehicle operation road area classification model comprises:

Import the machine learning database; randomly divide the sample data into a training set and a test set at a ratio of 7:3, compute the importance of the characteristic parameters, and rank them;

The classifier parameters are set as follows: the number of decision trees in the random forest is set to 178, the maximum tree depth to 9, the maximum number of leaf nodes to 41, the maximum number of features to 31, the minimum number of samples per leaf node to 3, and the minimum number of samples required to split a node to 5;

Adjust the parameters of the model to make the accuracy rate as high as possible, and finally build a classification model of vehicle operating road areas based on machine learning algorithms.