WO2015062540A9

WO2015062540A9 - Driving amount model event-based storage and index methods and system

Info

Publication number: WO2015062540A9
Application number: PCT/CN2014/090016
Authority: WO
Inventors: 黄晓庆; 饶佳; 刘祎; 杨景
Original assignee: 中国移动通信集团公司
Priority date: 2013-10-31
Filing date: 2014-10-31
Publication date: 2015-06-18
Also published as: WO2015062540A1; CN104598475B; CN104598475A

Abstract

Disclosed are driving amount model event-based storage and index methods and a system. A driving amount model event is set, and the driving amount model event comprises driving amount association rules corresponding to different projects of different information main bodies. Data for the Internet of vehicles comprises raw data and historical data provided by each information main body, wherein the driving amount model event and a coarse-grained index of the sub-spaces under the driving amount model event are used for the raw data, and a fine-grained, record-level index is set for the historical data in the driving amount model event. Therefore, when storing and indexing the data for the Internet of vehicles, the methods and the system provided by the present invention reduce the number of index updates and distribute the data evenly.

Description

Storage and indexing method and system based on driving usage model event

Cross-reference to related applications

The present application claims priority to Chinese Patent Application No. 201310532545.2, filed on Oct. 31, 2013, the entire content of which is incorporated herein by reference.

Technical field

The invention relates to the field of the Internet of Vehicles, and in particular to a storage and indexing method and system based on a traffic usage model event.

Background technique

As the technologies underlying the Internet of Vehicles mature, sensor technology, mobile communication technology, big data technology and intelligent computing technology have begun to integrate deeply with the Internet of Vehicles industry. Driven by market demand, Telematics terminal equipment for the Internet of Vehicles is expected to see explosive growth. Here, Telematics refers to on-board computer systems that use wireless communication technology, which will bring considerable value-added revenue and continuing growth opportunities to operators' data service models. Unlike the traditional Intelligent Transportation System (ITS), the Internet of Vehicles pays more attention to the interaction between vehicles, between vehicles and roads, and between vehicles and people. It can be said that the emergence of the Internet of Vehicles redefines the way vehicles operate.

Storing and indexing the raw Internet of Vehicles data provided by the information subjects is an important basis and prerequisite for optimizing vehicle traffic operation and making effective use of resources. In the Internet of Vehicles environment, millions of information subjects periodically generate raw data, which exposes the scalability bottleneck of the traditional relational database: the throughput of the Internet of Vehicles system falls short of the required tens of thousands or even hundreds of thousands of concurrent operations. A new storage and indexing method is therefore needed to meet the management needs of the raw Internet of Vehicles data.

Existing cloud data management systems offer high scalability, high fault tolerance and high availability, and support high concurrency. They are therefore often chosen to store and index the raw Internet of Vehicles data. Some cloud data management systems also support the MapReduce model to improve query performance and efficiency, and use a double-layer index to cope with massive data volumes and system scalability.

Currently, there are two main ways to store and index data:

The first way is a data management system based on distributed storage. Unlike the common centralized storage method, the distributed storage method does not store data on one or more specific nodes. Instead, it uses a limited amount of storage space on different machines across the network, so that these storage spaces together form a virtual storage device and the data is scattered throughout the network. The distributed storage method adopts a Key-Value storage mode, which supports efficient point queries and range queries on the row key (rowkey) but requires a full table scan for queries on anything other than the row key. Although the MapReduce model can be used to improve query efficiency, performance remains poor for queries with a low selection rate;
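The contrast between row-key queries and non-key queries can be illustrated with a small sketch. It is only an illustration under stated assumptions, not a real distributed store: `SortedKV` and its methods are hypothetical names, and all rows live in one process. Rows are kept sorted by row key, so point and range queries on the row key use binary search, while a query on any other column has to examine every row.

```python
import bisect


class SortedKV:
    """Toy in-memory table sorted by row key (hypothetical, single process)."""

    def __init__(self):
        self._keys = []   # row keys kept in sorted order
        self._rows = {}   # row key -> dict of column values

    def put(self, rowkey, row):
        if rowkey not in self._rows:
            bisect.insort(self._keys, rowkey)
        self._rows[rowkey] = row

    def get(self, rowkey):
        # Point query on the row key: direct lookup.
        return self._rows.get(rowkey)

    def range(self, start, end):
        # Range query on the row key over [start, end): binary search, then a slice.
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._rows[k]) for k in self._keys[lo:hi]]

    def scan_where(self, column, value):
        # Query on a non-key column: every row is examined (full table scan).
        return [(k, self._rows[k]) for k in self._keys
                if self._rows[k].get(column) == value]


table = SortedKV()
table.put("veh001#0900", {"speed": 42})
table.put("veh001#0905", {"speed": 60})
table.put("veh002#0900", {"speed": 60})

by_key = table.range("veh001", "veh002")   # touches only the matching key range
by_value = table.scan_where("speed", 60)   # touches all rows
```

The row-key format `vehicle#time` used here is an assumption chosen for the example; the source does not specify a key layout.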

The second way is a double-layer indexing method based on cloud storage. In the double-layer index mode, a local index is established for the data of each computer node in the network, and the local index is only responsible for the data of its own node. In addition to the local index, each computer node contributes part of its storage space to hold the global index. The global index is composed of a subset of the local indexes: because of limits on storage space and query efficiency, it is impossible to publish all local indexes to the global index, so some local indexes are selected according to set rules. For the selected local indexes, the global index can be organized in different ways.

Although the above two methods can store and index data, storing and indexing the raw Internet of Vehicles data in a way that optimizes resource storage and management remains a problem. Applying the two methods to the raw Internet of Vehicles data raises the following problems. First, when a data management system based on distributed storage is used, the distributed architecture of the system makes queries with a relatively low selection rate perform relatively poorly. Second, when the cloud-storage-based double-layer indexing method uses the R-Tree as both the local index and the global index, nodes must be continuously split and adjusted while the raw data is indexed, so the maintenance cost of the index is too high and the throughput of the Internet of Vehicles system suffers greatly. Most importantly, neither method fully considers the relationship between the main "people-car-road" information subjects of the Internet of Vehicles; they lack pertinence and do not facilitate subsequent analysis and processing based on traffic events.

Summary of the invention

In view of this, the present invention provides a storage and indexing method based on a traffic usage model event. When the method is used to store and index Internet of Vehicles data, the number of index updates is small and the data is evenly distributed.

The invention also provides a storage and indexing system based on a traffic usage model event. When the system is used to store and index Internet of Vehicles data, the number of index updates is small and the raw data is evenly distributed.

To achieve the above objective, the technical solution of the present invention is implemented as follows:

A storage and indexing method based on a traffic usage model event, the method comprising:

Establishing a traffic usage model event, wherein the traffic usage model event includes traffic usage association rules corresponding to different items of different information subjects;

Obtaining the raw Internet of Vehicles data, dividing it into raw data blocks according to the traffic usage model event, and dividing the raw data block corresponding to the traffic usage model event into multiple subspace data segments for storage;

Indexing the traffic usage model event with a multi-path search tree (B+tree), wherein each leaf node of the B+tree is an n-ary tree (R-tree) that indexes the multiple subspace data segments into which the raw data block of the corresponding traffic usage model event is divided;

Storing the historical data of the corresponding traffic usage model event in a set area, and establishing a record-level index for the set area.

The multiple subspace data segments are obtained by dividing with a K-dimensional index tree (K-dimension Tree) or a bucket point-region quadtree (Bucket PR Quadtree); the division yields multiple mutually non-overlapping rectangular subspace data segments, which are stored in the corresponding storage areas indexed by the R-tree.

The record-level index is a local index, and the local index uses an R-tree or a grid index.

After the raw data block corresponding to the traffic usage model event is divided into multiple subspace data segments, the method further includes:

Determining, according to the subspace data segment sizes and the tree depth of the raw data block of the corresponding traffic usage model event, whether the division strategy is reasonable; if not, adjusting the division strategy and re-dividing the raw data block of the corresponding traffic usage model event into multiple subspace data segments for storage according to the adjusted strategy.

Determining whether the division strategy is reasonable comprises:

Calculating the subspace data variance from the subspace data segment sizes; when the calculated variance is greater than or equal to a set first threshold and the tree depth is greater than or equal to a set second threshold, adjusting the division strategy to shrink the subspace data segments; when the calculated variance is less than the set first threshold and the tree depth is less than the set second threshold, adjusting the division strategy to expand the subspace data segments.
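The two-threshold rule above can be written as a small decision function. This is a minimal sketch, not the claimed implementation: the function name, the return labels and the threshold values in the example are hypothetical. The text only specifies the two cases in which both conditions point the same way; returning "keep" for the mixed cases is an assumption made here.

```python
def adjust_partition(variance, depth, var_threshold, depth_threshold):
    """Return 'shrink', 'expand' or 'keep' for the subspace data segments."""
    if variance >= var_threshold and depth >= depth_threshold:
        return "shrink"   # uneven and deep: reduce the subspace data segments
    if variance < var_threshold and depth < depth_threshold:
        return "expand"   # even and shallow: expand the subspace data segments
    return "keep"         # mixed case: not specified in the text, assumed unchanged


# Example with arbitrary, illustrative thresholds.
decision_uneven = adjust_partition(variance=900.0, depth=8, var_threshold=500.0, depth_threshold=6)
decision_even = adjust_partition(variance=20.0, depth=2, var_threshold=500.0, depth_threshold=6)
```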

A storage and indexing system based on a traffic usage model event, the system comprising a model building module, a storage indication module and an indexing module, wherein

The model building module is configured to establish a traffic usage model event, where the traffic usage model event includes traffic usage association rules corresponding to different items of different information subjects;

The storage indication module is configured to divide the raw Internet of Vehicles data into raw data blocks according to the traffic usage model event, to divide the raw data block corresponding to the traffic usage model event into a plurality of subspace data segments for storage, and to store the historical data of the corresponding traffic usage model event in a set area;

The indexing module is configured to index the traffic usage model event with a B+tree, where each leaf node of the B+tree is an R-tree that indexes the plurality of subspace data segments into which the raw data block of the corresponding traffic usage model event is divided, and to establish a record-level index for the set area.

The storage indication module is further configured, when dividing the raw data block corresponding to the traffic usage model event into a plurality of subspace data segments for storage, to divide with a K-dimension Tree or a Bucket PR Quadtree, obtaining a plurality of mutually non-overlapping rectangular subspace data segments that are stored in the corresponding storage areas indexed by the R-tree.

The system further includes an update partitioning module, configured to determine, according to the subspace data segment sizes and the tree depth of the raw data block of the corresponding traffic usage model event, whether the division strategy is reasonable and, if not, to adjust the division strategy;

The storage indication module is further configured to divide the raw data block corresponding to the traffic usage model event into multiple subspace data segments for storage according to the adjusted division strategy.

The update partitioning module is further configured to calculate the subspace data variance from the subspace data segment sizes; when the calculated variance is greater than or equal to the set first threshold and the tree depth is greater than or equal to the set second threshold, to adjust the division strategy to shrink the subspace data segments; and when the calculated variance is less than the set first threshold and the tree depth is less than the set second threshold, to adjust the division strategy to expand the subspace data segments.

It can be seen from the above solution that the present invention sets a traffic usage model event, which includes traffic usage association rules corresponding to different items of different information subjects. The Internet of Vehicles data includes the raw data and the historical data provided by each information subject. The raw data is indexed at a coarse-grained level, by traffic usage model event and by the subspaces under each event, while the historical data in a traffic usage model event is given a fine-grained, record-level index. Because raw data related to a traffic usage model event is covered by the existing event index, the index does not need to be updated as that data arrives; and because the raw data is distributed evenly over a bounded set of subspaces under the event, the maintenance cost of the index is kept within a reasonable range without affecting storage performance. Therefore, when the method and system provided by the present invention store and index Internet of Vehicles data, the number of index updates is small and the data is evenly distributed.

DRAWINGS

FIG. 1 is a schematic structural diagram of the association relationship between "person-vehicle-road" information subjects according to an embodiment of the present invention;

FIG. 2 is a flowchart of a method for indexing raw Internet of Vehicles data based on a traffic usage model according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of a coarse-grained indexing process for data related to a traffic usage model according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of a process for storing and indexing car network original data from an indexing level and a storage layer according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of a storage and indexing system based on a traffic usage model event according to an embodiment of the present invention.

Detailed description

The present invention will be further described in detail below with reference to the accompanying drawings.

To present the solution provided by the present invention, a model of the traffic usage among the three "human-vehicle-road" information subjects is first provided and the concept of "traffic usage" is proposed, as described in detail below.

Traffic usage. "Usage" is a measure of the use of resources, and usage management rules govern how usage is managed. In a single dimension, the most familiar example of usage management is an electricity meter measuring the consumption of electricity. If the measurement of electricity use is extended into the time dimension, then by grasping the relationship between power usage and time, and adjusting pricing against market supply and demand, a strategy that optimizes the supply and use of power resources can be achieved. It can be seen that, under a usage management model, the relationship between resource supply and resource usage can be described by a multidimensional space. The higher the dimension of the space, the more variables are available for resource allocation and the greater the potential benefit. Here, traffic usage is an important data concept for realizing industrial synergy through multi-party contractual relationships on the Internet of Vehicles platform. Its resources involve multiple actors, such as car owners, vehicle manufacturers, traffic management authorities and insurers. For owners, resource usage includes depreciation of the car, losses from traffic accidents, auto insurance premiums, fines and penalties, and so on. As resources are added, new semantics and new functions can be brought to traffic usage, which in turn enables more beneficial services in the Internet of Vehicles industry.

Traffic usage model. Different industry entities in the Internet of Vehicles industry have different business objectives and therefore care about different parameters of the driving process. Providing the required traffic usage to these different subjects means projecting the usage data, through a certain data processing, into the demand space of the corresponding subject; the data model that does this is the usage model. For example, the main responsibilities of the Public Security Traffic Management Bureau include road traffic management and control and traffic safety, so the traffic usage of road traffic accidents is the demand projection with the Bureau as the subject. For insurance companies, which aim to reduce the accident compensation rate, reduce insurance risk and earn profits, the traffic usage of driving evaluations for insured vehicles is the demand projection with the insurance company as the subject. For the owner, driving safely and avoiding road congestion are the main demands, so the traffic usage of traffic capacity is the demand projection with the owner as the subject.

The relationship between the "people-car-road" information subject

FIG. 1 is a schematic structural diagram of the relationship between the "person-vehicle-road" information subjects according to an embodiment of the present invention. The diagram is a closed-loop relationship structure comprising four interfaces that influence one another, wherein

The human-vehicle interface, that is, driving behavior synergy, involves the person and the vehicle as information subjects: the driver controls the vehicle by operating the accelerator pedal, the brake and the steering wheel to steer and to control the driving speed;

The human-road interface, that is, traffic information matching synergy, involves the person and the road as information subjects: during driving, the driver continuously judges and responds to the characteristics of the vehicle, the road and changes in traffic in order to adapt to changes in the road environment;

The vehicle-road interface, that is, vehicle driving synergy, involves the vehicle and the road as information subjects: it covers the interaction and sharing between the vehicle and the road, and achieves coordination and cooperation between the vehicle and the road infrastructure;

The human-vehicle-road interface, that is, traffic behavior coordination, involves the person, the vehicle and the road as information subjects: it is the dynamic process in which the driver controls the vehicle toward a predetermined target and in accordance with the traffic rules while the vehicle is also affected by road and environmental conditions, so that together they complete a traffic behavior event.

In the present invention, the established driving usage model includes the driving usage association rules corresponding to different items of different information subjects.

So that the number of index updates is small and the data is evenly distributed when Internet of Vehicles data is stored and indexed, the present invention sets a traffic usage model event, which includes the traffic usage association rules corresponding to different items of different information subjects. The Internet of Vehicles data includes the raw data and the historical data provided by each information subject. The raw data is indexed at a coarse-grained level, by traffic usage model event and by the subspaces under each event, while the historical data in a traffic usage model event is given a fine-grained, record-level index. Furthermore, because of the spatio-temporal characteristics of the raw data, an adaptive data partitioning method is adopted in the indexing process so that the indexed raw data is distributed relatively uniformly.

Because raw data related to a traffic usage model event is covered by the existing event index, the index does not need to be updated as that data arrives; and because the raw data is distributed evenly over a bounded set of subspaces under the event, the maintenance cost of the index is kept within a reasonable range without affecting storage performance or the number of index updates.

FIG. 2 is a flowchart of a method for indexing raw Internet of Vehicles data based on a traffic usage model according to an embodiment of the present invention. The specific steps are as follows:

Step 201: Establish a traffic usage model event, where the traffic usage model event includes traffic usage association rules corresponding to different items of different information subjects;

Step 202: After obtaining the raw Internet of Vehicles data, divide it into raw data blocks according to the traffic usage model event, and divide the raw data block corresponding to the traffic usage model event into a plurality of subspace data segments for storage;

In this step, the subspace data segments are obtained by dividing with a K-dimension Tree or a Bucket PR Quadtree; the division yields a plurality of mutually non-overlapping rectangular subspace data segments, which are stored in the corresponding storage areas indexed by an R-tree;

Step 203: Index the traffic usage model event with a multi-path search tree (B+tree), where each leaf node of the B+tree is an n-ary tree (R-tree) that indexes the plurality of subspace data segments into which the raw data block of the corresponding traffic usage model event is divided;

Step 204: Store the historical data of the corresponding traffic usage model event in a set area, and establish a record-level index for the set area;

In this step, the record-level index may be a local index, and the local index may be an R-tree or a grid index.

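In step 202, after the raw data block corresponding to the traffic usage model event is divided into a plurality of subspace data segments, the method further includes: determining, according to the subspace data segment sizes and the tree depth of the raw data block of the corresponding traffic usage model event, whether the division strategy is reasonable; if not, adjusting the division strategy and re-dividing the raw data block into a plurality of subspace data segments for storage according to the adjusted strategy.

Whether the division strategy is reasonable is determined as follows: the subspace data variance is calculated from the subspace data segment sizes; when the calculated variance is greater than or equal to the set first threshold and the tree depth is greater than or equal to the set second threshold, the division strategy is adjusted to shrink the subspace data segments; when the calculated variance is less than the set first threshold and the tree depth is less than the set second threshold, the division strategy is adjusted to expand the subspace data segments.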

FIG. 3 is a schematic diagram of the coarse-grained indexing process for data related to a traffic usage model according to an embodiment of the present invention. As shown in FIG. 3, the raw Internet of Vehicles data is first partitioned according to the traffic usage model event to obtain the data related to the traffic usage model event. The traffic usage model event is then updated with this event-related data, and subspaces are created for the updated event: the event-related data is divided into multiple data segments, which are placed in the subspaces. Each subspace corresponds to a storage area in the Internet of Vehicles database, which is a distributed database, and the subspaces are indexed with an R-tree.

The process shown in Figure 3 will be described in detail below.

First, the original data of the car network is divided according to the traffic usage model event.

Because the raw data provided by the three "people-car-road" information subjects is related in a closed loop in which each subject influences the others, traffic usage model events can be formed according to the demand projections of the traffic usage of the different information subjects. Each traffic usage model event includes the traffic usage association rules for different items of different information subjects.

The raw Internet of Vehicles data is distributed along the traffic usage model events, so it can be divided, according to the anchors at which a traffic usage model event occurs and ends, into several data blocks related to traffic usage model events (Event Data Blocks). With an anchor denoted by A, the raw data is DBS = {[A_s1, A_e1), [A_s2, A_e2), ..., [A_si, A_ei), ...}, where [A_si, A_ei) is a left-closed, right-open data interval representing the raw data block of the i-th traffic usage model event. These intervals do not overlap.
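The division into event data blocks can be sketched as follows. This is a hedged illustration: it assumes that anchors are numeric timestamps and that each record is a `(timestamp, payload)` pair, neither of which the text specifies, and the function names are hypothetical. Each block is a half-open interval [start, end), and records that fall in no block are treated as independent of any event.

```python
import bisect


def build_blocks(anchors):
    """Sort (start, end) half-open intervals and check that they do not overlap."""
    blocks = sorted(anchors)
    for (s1, e1), (s2, e2) in zip(blocks, blocks[1:]):
        if e1 > s2:
            raise ValueError("event data blocks must not overlap")
    return blocks


def find_block(blocks, t):
    """Return the index of the block whose interval [start, end) contains t, or None."""
    starts = [s for s, _ in blocks]
    i = bisect.bisect_right(starts, t) - 1
    if i >= 0 and blocks[i][0] <= t < blocks[i][1]:
        return i
    return None


def split_by_event(records, blocks):
    """Group (timestamp, payload) records by event block; the rest are event-independent."""
    grouped = {i: [] for i in range(len(blocks))}
    unrelated = []
    for rec in records:
        i = find_block(blocks, rec[0])
        if i is None:
            unrelated.append(rec)
        else:
            grouped[i].append(rec)
    return grouped, unrelated


blocks = build_blocks([(30, 45), (10, 20)])
records = [(10, "a"), (19, "b"), (20, "c"), (31, "d")]
grouped, unrelated = split_by_event(records, blocks)
```

Because the intervals are right-open, the record at timestamp 20 belongs to neither block in this example.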

In a specific implementation, the raw Internet of Vehicles data is first divided into several blocks in the event dimension according to the traffic usage model events. Each block is then divided in two-dimensional space into several data segments, which are stored in several subspaces respectively.

To ensure that the subspace division of the stored data segments is reasonable, the size of each subspace must be monitored, and the depth and deviation of the subspaces calculated. The calculation result determines whether the division is reasonable; if it is not, for example because a set data segment threshold is exceeded, the subspace segmentation strategy is adjusted.

Store the vehicle network raw data on the storage node of the corresponding driving usage model event

After the raw Internet of Vehicles data is divided, the raw data block [A_s1, A_e1) between the start anchor and the end anchor of the traffic usage model event is stored on the storage node of the corresponding traffic usage model event. If the corresponding traffic usage model event is stored by a cloud storage system, the storage node of the event is determined through the interface of the cloud storage system, and the raw data block [A_s1, A_e1) is written to that storage node.

Update the index of the corresponding driving usage model event

To speed up point queries and range queries on traffic usage model events, the traffic usage model events are indexed with a B+Tree. Each leaf node of the B+Tree corresponds to an R-Tree, which indexes the subspaces into which the raw data block [A_s1, A_e1) of the traffic usage model event is divided. When the raw data block [A_s1, A_e1) is stored to the storage node of the corresponding traffic usage model event, the corresponding B+Tree index is updated.
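The shape of this two-level index can be sketched with simple stand-ins. The sketch does not implement a real B+Tree or R-Tree: a sorted list searched with `bisect` stands in for the B+Tree over event keys, and a plain list of rectangles scanned linearly stands in for the R-Tree under each event. The class name, the method names and the use of string event keys and region identifiers are all assumptions made for illustration.

```python
import bisect


class EventIndex:
    """Sorted event keys (B+Tree stand-in); each key maps to a list of
    (rectangle, region_id) entries (R-Tree stand-in)."""

    def __init__(self):
        self._keys = []
        self._subspaces = {}

    def add_subspace(self, event_key, rect, region_id):
        # rect = (xmin, ymin, xmax, ymax)
        if event_key not in self._subspaces:
            bisect.insort(self._keys, event_key)
            self._subspaces[event_key] = []
        self._subspaces[event_key].append((rect, region_id))

    def events_in_range(self, lo, hi):
        # Range query on the event key over the closed range [lo, hi].
        i = bisect.bisect_left(self._keys, lo)
        j = bisect.bisect_right(self._keys, hi)
        return self._keys[i:j]

    def regions_for(self, event_key, query_rect):
        # Storage regions whose subspace rectangle intersects query_rect.
        qx0, qy0, qx1, qy1 = query_rect
        hits = []
        for (x0, y0, x1, y1), region_id in self._subspaces.get(event_key, []):
            if x0 <= qx1 and qx0 <= x1 and y0 <= qy1 and qy0 <= y1:
                hits.append(region_id)
        return hits


idx = EventIndex()
idx.add_subspace("E1", (0, 0, 5, 5), "region-a")
idx.add_subspace("E1", (6, 0, 10, 5), "region-b")
idx.add_subspace("E2", (0, 0, 10, 10), "region-c")

hit_regions = idx.regions_for("E1", (1, 1, 2, 2))
events = idx.events_in_range("E1", "E1")
```

In this arrangement the index holds only events and subspaces, so storing a new record inside a subspace that is already indexed does not require touching `idx` at all.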

Create a subspace index of the raw Internet of Vehicles data block [A_s1, A_e1)

For most Internet of Vehicles applications, the storage nodes are mostly spatially fixed. The subspaces within a raw data block [A_s1, A_e1) can therefore be divided with a K-dimension Tree or a Bucket PR Quadtree, which finally yields a plurality of mutually non-overlapping rectangular subspace regions; these rectangular subspace regions are indexed with an R-tree.
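A simplified version of such a division can be sketched as below. It is written in the spirit of a bucket point-region k-d tree and is not the patented procedure: the rectangle is halved at its midpoint along alternating axes until each leaf holds at most `capacity` points. The function name, the `capacity` and `max_depth` parameters and the sample points are hypothetical.

```python
def partition(points, rect, capacity, depth=0, max_depth=16):
    """Halve rect (xmin, ymin, xmax, ymax) along alternating axes until every
    leaf holds at most `capacity` points. Returns (leaves, tree_depth), where
    leaves is a list of (rect, points) pairs with non-overlapping rectangles."""
    if len(points) <= capacity or depth >= max_depth:
        return [(rect, points)], depth
    xmin, ymin, xmax, ymax = rect
    if depth % 2 == 0:
        # Even depth: split on x at the midpoint of the rectangle.
        mid = (xmin + xmax) / 2
        low = [p for p in points if p[0] < mid]
        high = [p for p in points if p[0] >= mid]
        rects = ((xmin, ymin, mid, ymax), (mid, ymin, xmax, ymax))
    else:
        # Odd depth: split on y at the midpoint of the rectangle.
        mid = (ymin + ymax) / 2
        low = [p for p in points if p[1] < mid]
        high = [p for p in points if p[1] >= mid]
        rects = ((xmin, ymin, xmax, mid), (xmin, mid, xmax, ymax))
    leaves_low, d_low = partition(low, rects[0], capacity, depth + 1, max_depth)
    leaves_high, d_high = partition(high, rects[1], capacity, depth + 1, max_depth)
    return leaves_low + leaves_high, max(d_low, d_high)


points = [(1, 1), (2, 8), (6, 3), (7, 7), (9, 9)]
leaves, tree_depth = partition(points, (0, 0, 10, 10), capacity=2)
sizes = [len(pts) for _, pts in leaves]
```

The leaf sizes and the tree depth returned here are the quantities that the adaptive adjustment described later in this document monitors.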

Establish a record-level index [A_s2, A_e2) for the historical data of the raw data block [A_s1, A_e1)

When the raw data block [A_s1, A_e1) updates the corresponding traffic usage model event, the raw data block [A_s1, A_e1) previously held in that traffic usage model event becomes historical data. To further speed up queries on historical data, a record-level local index can be established for each region; the local index uses an R-tree or a grid index to index the historical data in each traffic usage model event.
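A record-level grid index of the kind mentioned above can be sketched as follows. This is one minimal way to build a uniform grid over two-dimensional points; the class name, the `cell_size` parameter and the sample records are assumptions, and the text does not prescribe this layout.

```python
from collections import defaultdict


class GridIndex:
    """Uniform grid over 2D points; each cell lists the records that fall in it."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)

    def _cell(self, x, y):
        return (int(x // self.cell_size), int(y // self.cell_size))

    def insert(self, x, y, record):
        self.cells[self._cell(x, y)].append((x, y, record))

    def query(self, xmin, ymin, xmax, ymax):
        # Visit only the cells overlapping the query rectangle, then filter exactly.
        cx0, cy0 = self._cell(xmin, ymin)
        cx1, cy1 = self._cell(xmax, ymax)
        out = []
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                for x, y, rec in self.cells.get((cx, cy), []):
                    if xmin <= x <= xmax and ymin <= y <= ymax:
                        out.append(rec)
        return out


grid = GridIndex(cell_size=10)
grid.insert(1, 1, "r1")
grid.insert(12, 3, "r2")
grid.insert(25, 25, "r3")
found = grid.query(0, 0, 15, 5)
```

Since historical data no longer changes, an index like this can be built once per batch and then only queried.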

For ease of description, the process of storing and indexing the raw Internet of Vehicles data based on the traffic usage model is described below from the index level and the storage level. FIG. 4 is a schematic diagram of this process, viewed from the index level and the storage level, according to an embodiment of the present invention.

As can be seen from FIG. 4, at the storage level the raw Internet of Vehicles data is first divided, in the traffic usage model event dimension, into data related to traffic usage model events and data independent of them. The event-related data is then divided, according to the occurrence anchor and the end anchor of each traffic usage model event, into the data blocks corresponding to the events. Finally, each data block is divided in two dimensions into several subspaces, and each subspace data segment of the block is stored in an area of the distributed storage system. The subspace data segments of the data block corresponding to one traffic usage model event are kept in the same area as far as possible, which reduces the number of areas scanned during a query and improves query efficiency.

At the index level, there are three main layers: the traffic usage model event index and the subspace index serve the current raw Internet of Vehicles data, and the grid index serves the historical data of the corresponding traffic usage model event.

Specifically, at indexing time the data is divided, in the traffic usage model event dimension, into current raw Internet of Vehicles data and historical data. For the current raw data, only the data segment and the subspace in which it is located are indexed, not the data record itself, which greatly reduces the number of index updates while current raw data is being stored. The traffic usage model event index uses a B+tree; because the multiple subspace data segments of a traffic usage model event are stored in different regions, they are indexed with an R-tree. Once a traffic usage model event has been updated, its historical data no longer changes, so the historical data can be stored in batches and indexed at the record level, for example with an R-tree index or a grid index. In this way, the cost of maintaining the index is relatively low and its impact on the storage of raw data is relatively small, ensuring that the Internet of Vehicles system can support large-scale, frequent updates.

The following describes in detail how subspace data segments are divided and optimized.

In practical applications, the raw Internet of Vehicles data grows monotonically in the time dimension, and the traffic usage model can also change over time. The raw data stream is therefore divided into several feedback cycles, and within each feedback cycle the traffic usage model events and the subspace partitioning strategy are optimized adaptively, step by step:

In the first step, according to the specific application scenario, the total number N of traffic usage model event records in a feedback period is set; if each subspace holds at most S records, the data in each event segment is divided evenly into R subspaces. The traffic usage data blocks corresponding to the traffic usage model events in the first feedback period are E11, E12, E13...E1k.

In the second step, subspace partitioning is performed on E11, E12, E13...E1k using a Bucket PR KD-tree; the depth Dep_i of the tree is recorded, the size of each subspace data segment is monitored, and the variance of the data amounts of the subspace data segments in the traffic usage data block corresponding to each traffic usage model event is calculated according to formula (1):

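$$D_i = \frac{1}{N_i}\sum_{m=1}^{N_i}\left(x_m - \bar{x}_i\right)^2, \qquad \bar{x}_i = \frac{1}{N_i}\sum_{m=1}^{N_i} x_m \tag{1}$$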

where N_i is the number of subspaces within Ei, x_m is the size of the m-th subspace of Ei, and D_i is the variance of the subspace sizes within Ei. The magnitude of D_i reflects how evenly the data is partitioned among the subspace data segments.
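As a minimal sketch of the second step, the following code partitions two-dimensional points with a bucket PR k-d tree (each cell is halved at its spatial midpoint, alternating axes, until it holds at most `capacity` points), records the tree depth, and computes the variance of the leaf sizes. It assumes that formula (1) is the ordinary population variance and that all points lie inside the initial rectangle. The function names and the `max_depth` guard against duplicate points are hypothetical additions.

```python
from statistics import pvariance
from typing import List, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)
Leaf = Tuple[Rect, List[Point]]


def bucket_pr_kdtree(
    points: List[Point], rect: Rect, capacity: int,
    depth: int = 0, max_depth: int = 32,
) -> Tuple[List[Leaf], int]:
    """Return (leaves, tree_depth); each leaf is (rectangle, points inside)."""
    if len(points) <= capacity or depth >= max_depth:
        return [(rect, points)], depth
    x0, y0, x1, y1 = rect
    axis = depth % 2  # alternate between the two dimensions
    if axis == 0:
        mid = (x0 + x1) / 2
        lo_rect, hi_rect = (x0, y0, mid, y1), (mid, y0, x1, y1)
    else:
        mid = (y0 + y1) / 2
        lo_rect, hi_rect = (x0, y0, x1, mid), (x0, mid, x1, y1)
    # PR splitting: the cut depends on the cell geometry, not on the data.
    lo = [p for p in points if p[axis] < mid]
    hi = [p for p in points if p[axis] >= mid]
    lo_leaves, lo_depth = bucket_pr_kdtree(lo, lo_rect, capacity, depth + 1, max_depth)
    hi_leaves, hi_depth = bucket_pr_kdtree(hi, hi_rect, capacity, depth + 1, max_depth)
    return lo_leaves + hi_leaves, max(lo_depth, hi_depth)


def subspace_variance(leaves: List[Leaf]) -> float:
    """Population variance of the subspace sizes (assumed form of formula (1))."""
    return pvariance([len(pts) for _, pts in leaves])
```

Because the cells are produced by repeated halving, the resulting subspace rectangles never overlap, and a skewed data distribution shows up as both a deeper tree and a larger variance.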

In the third step, the division of the data segments is adjusted according to the variance D_i of the data amounts of the subspace data segments in the traffic usage data block of the corresponding traffic usage model event and the depth Dep_i of the data division. If D_i is greater than or equal to the set first threshold, the subspace data segments in the data block are unevenly distributed; when Dep_i is also greater than or equal to the set second threshold, the subspace data segments need to be reduced. If D_i is less than the first threshold, the subspace data segments in the data block are relatively uniform; when Dep_i is also less than the second threshold, the data volume is too small, and two subspace data segments are merged.
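The decision rule of the third step can be sketched as follows. The two conditions are taken from the text; the behaviour in the remaining mixed cases (for example, high variance with a shallow tree) is not stated in the source, and keeping the current partitioning in those cases is an assumption of this sketch. The names `Action` and `adjust` are hypothetical.

```python
from enum import Enum


class Action(Enum):
    SHRINK = "reduce the subspace data segments"
    MERGE = "merge subspace data segments"
    KEEP = "keep the current partitioning"  # assumed default, not in the source


def adjust(variance: float, depth: int,
           first_threshold: float, second_threshold: int) -> Action:
    """Choose an adjustment from the variance D_i and the tree depth Dep_i."""
    if variance >= first_threshold and depth >= second_threshold:
        # Uneven distribution in a deep tree: make the segments smaller.
        return Action.SHRINK
    if variance < first_threshold and depth < second_threshold:
        # Even distribution in a shallow tree: segments hold too little data.
        return Action.MERGE
    return Action.KEEP
```

The threshold values themselves would be chosen per application scenario; the source does not specify them.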

In the fourth step, the partitioning strategy is monitored: if the division of the data segments and the partitioning strategy of the regions remain unchanged over consecutive feedback periods, the partitioning strategy can be fixed and dynamic partitioning is no longer performed. The partitioning scheme is then determined in advance, and the Internet of Vehicles raw data does not need to be dynamically partitioned when stored, which further improves storage performance.

Further, while the system is running, the data distribution still needs to be monitored. Once the data distribution is found to be unbalanced, dynamic partitioning is resumed.
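The fixing and resuming of the partitioning strategy described above can be sketched as a small state machine. The source does not say how many unchanged feedback periods are required or how imbalance is detected; this sketch assumes a configurable count `stable_periods` and treats a variance at or above a threshold as imbalance. The class `PartitionMonitor` is hypothetical.

```python
from typing import Hashable, Optional


class PartitionMonitor:
    """Freeze the strategy after stable periods; resume on imbalance."""

    def __init__(self, stable_periods: int, variance_threshold: float) -> None:
        self.stable_periods = stable_periods          # assumed parameter
        self.variance_threshold = variance_threshold  # assumed imbalance test
        self.unchanged = 0
        self.frozen = False
        self._last: Optional[Hashable] = None

    def end_of_period(self, strategy: Hashable, variance: float) -> bool:
        """Record one feedback period; return True while the strategy is fixed."""
        if self.frozen:
            if variance >= self.variance_threshold:
                # Distribution became unbalanced: resume dynamic partitioning.
                self.frozen = False
                self.unchanged = 0
                self._last = None
            return self.frozen
        # Count consecutive periods in which the strategy did not change.
        self.unchanged = self.unchanged + 1 if strategy == self._last else 0
        self._last = strategy
        if self.unchanged >= self.stable_periods:
            self.frozen = True
        return self.frozen
```

While `end_of_period` returns True, incoming raw data can be written using the predetermined scheme without re-running the partitioning.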

Fig. 5 is a schematic structural diagram of a storage and indexing system based on a traffic usage model event according to an embodiment of the present invention. As shown in the figure, the system includes a model establishing module, a storage indication module, and an index module.

In the present invention, the storage indication module is further configured to divide the Internet of Vehicles raw data block corresponding to the traffic usage model event into a plurality of subspace data segments for storage, using a K-dimension Tree or a Bucket PR Quadtree for the division; the division yields a plurality of mutually non-overlapping rectangular subspace data segments, which are correspondingly stored in storage areas indexed by an R-tree.

In the embodiment of the present invention, the system further includes an update partitioning module, configured to determine, according to the subspace data segment sizes and the tree depth of the division of the raw data block corresponding to the traffic usage model event, whether the partitioning strategy is reasonable and, if not, to adjust the partitioning strategy.

In the embodiment of the present invention, the update partitioning module is further configured to calculate the subspace data variance from the subspace data segment sizes; when the calculated variance is greater than or equal to the set first threshold and the tree depth is greater than or equal to the set second threshold, the partitioning strategy is adjusted to reduce the subspace data segments; when the calculated variance is less than the set first threshold and the tree depth is less than the set second threshold, the partitioning strategy is adjusted to expand the subspace data segments.

The method and system provided by the invention fully consider the requirement for effective metering of traffic usage, and realize optimized storage and utilization of Internet of Vehicles resource data. They also take into account that the Internet of Vehicles raw data is generated continuously, that historical data generally does not change once generated, that the distribution of the raw data corresponding to traffic usage model events is often skewed, and that traffic usage model events themselves change over time. When the subspace data segments are divided, the imbalance of the data distribution is taken into account while the demand for metering traffic usage resources is met, which is of practical guiding significance for Internet of Vehicles solutions.

The applicable scenarios and examples of the methods and systems provided by the present invention include, but are not limited to, the following vehicle networking applications: intelligent transportation systems, mass data storage and indexing, and resource usage metering, etc., which can meet the needs of existing vehicle network data storage applications.

The present invention has been described in detail with reference to the preferred embodiments of the present invention. All modifications, equivalent substitutions and improvements made within the spirit and scope of the invention are intended to be included within the scope of the invention.

Claims

1. A storage and indexing method based on a traffic usage model event, the method comprising:

establishing a traffic usage model event, wherein the traffic usage model event includes traffic usage association rules corresponding to different items of different information subjects;

obtaining Internet of Vehicles raw data, dividing the raw data into Internet of Vehicles raw data blocks according to the traffic usage model event, and dividing the raw data block corresponding to the traffic usage model event into a plurality of subspace data segments for storage;

indexing the traffic usage model event with a multi-way search tree (B+tree), wherein each leaf node of the B+tree holds an N-ary tree (R-tree) that indexes the plurality of subspace data segments into which the raw data block corresponding to the traffic usage model event is divided; and

storing historical data corresponding to the traffic usage model event in a set area, and establishing a record-level index for the set area.
2. The method according to claim 1, wherein the plurality of subspace data segments are obtained by division using a K-dimensional index tree (K-dimension Tree) or a Bucket PR Quadtree, the division yielding a plurality of mutually non-overlapping rectangular subspace data segments that are correspondingly stored in storage areas indexed by an R-tree.
3. The method according to claim 1, wherein the record-level index is a local index, and the local index is an R-tree index or a grid index.
4. The method according to claim 1, wherein after the Internet of Vehicles raw data block corresponding to the traffic usage model event is divided into the plurality of subspace data segments, the method further comprises:

determining, according to the subspace data segment sizes and the tree depth of the division of the raw data block corresponding to the traffic usage model event, whether the partitioning strategy is reasonable; and if not, adjusting the partitioning strategy and re-dividing the raw data block corresponding to the traffic usage model event into a plurality of subspace data segments for storage according to the adjusted partitioning strategy.
5. The method according to claim 4, wherein determining whether the partitioning strategy is reasonable comprises:

calculating the subspace data variance from the subspace data segment sizes; when the calculated subspace data variance is greater than or equal to the set first threshold and the tree depth is greater than or equal to the set second threshold, adjusting the partitioning strategy to reduce the subspace data segments; and when the calculated subspace data variance is less than the set first threshold and the tree depth is less than the set second threshold, adjusting the partitioning strategy to expand the subspace data segments.
6. A storage and indexing system based on a traffic usage model event, wherein the system comprises a model establishing module, a storage indication module, and an index module, wherein:

the model establishing module is configured to establish a traffic usage model event, the traffic usage model event including traffic usage association rules corresponding to different items of different information subjects;

the storage indication module is configured to divide the Internet of Vehicles raw data into Internet of Vehicles raw data blocks according to the traffic usage model event, to divide the raw data block corresponding to the traffic usage model event into a plurality of subspace data segments for storage, and to store the historical data corresponding to the traffic usage model event in a set area; and

the index module is configured to index the traffic usage model event with a B+tree, wherein each leaf node of the B+tree holds an R-tree that indexes the plurality of subspace data segments into which the raw data block corresponding to the traffic usage model event is divided, and to establish a record-level index for the set area.
7. The system according to claim 6, wherein the storage indication module is further configured to divide the Internet of Vehicles raw data block corresponding to the traffic usage model event into a plurality of subspace data segments for storage using a K-dimension Tree or a Bucket PR Quadtree, the division yielding a plurality of mutually non-overlapping rectangular subspace data segments that are correspondingly stored in storage areas indexed by an R-tree.
8. The system according to claim 6, wherein the system further comprises an update partitioning module, configured to determine, according to the subspace data segment sizes and the tree depth of the division of the raw data block corresponding to the traffic usage model event, whether the partitioning strategy is reasonable and, if not, to adjust the partitioning strategy;

the storage indication module is further configured to divide the Internet of Vehicles raw data block corresponding to the traffic usage model event into a plurality of subspace data segments for storage according to the adjusted partitioning strategy.
9. The system according to claim 8, wherein the update partitioning module is further configured to calculate the subspace data variance from the subspace data segment sizes; when the calculated subspace data variance is greater than or equal to the set first threshold and the tree depth is greater than or equal to the set second threshold, to adjust the partitioning strategy to reduce the subspace data segments; and when the calculated subspace data variance is less than the set first threshold and the tree depth is less than the set second threshold, to adjust the partitioning strategy to expand the subspace data segments.