CN104464344A - Vehicle driving path prediction method and system - Google Patents

Vehicle driving path prediction method and system

Info

Publication number
CN104464344A
Authority
CN
China
Prior art keywords
path
path sequence
sequence
sequence pattern
pattern
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410628190.1A
Other languages
Chinese (zh)
Other versions
CN104464344B (en
Inventor
马传香
王时绘
余啸
曾诚
陈昊
吕顺营
宋建华
吴思尧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hubei University
Original Assignee
Hubei University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hubei University
Priority to CN201410628190.1A
Publication of CN104464344A
Application granted
Publication of CN104464344B
Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/09 Arrangements for giving variable traffic instructions
    • G08G1/0962 Arrangements for giving variable traffic instructions having an indicator mounted inside the vehicle, e.g. giving voice messages
    • G08G1/0968 Systems involving transmission of navigation instructions to the vehicle
    • G08G1/096805 Systems involving transmission of navigation instructions to the vehicle where the transmitted instructions are used to compute a route
    • G08G1/096811 Systems involving transmission of navigation instructions to the vehicle where the transmitted instructions are used to compute a route where the route is computed offboard
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24552 Database cache management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 Sequence data queries, e.g. querying versioned data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases

Abstract

The invention provides a vehicle driving path prediction method and system. Based on a Hadoop platform, the method determines the minimum memory among the nodes, scans the original path sequence database to find the longest path sequence, and divides the original path sequence database evenly into n disjoint sub-path sequence databases. The original path sequence database and the n sub-path sequence databases are each uploaded to HDFS. The master node dispatches the n sub-path sequence databases to different Map nodes; each Map node runs an improved GSP algorithm, scanning the sub-path sequence database held in its memory against a preset minimum support ξ to compute local path sequence patterns, and the Reduce nodes merge these into global candidate sequence patterns. The original path sequence database is then scanned again to obtain the global path sequence patterns. Path association rules are generated from the global path sequence patterns and their confidences are calculated, which yields the vehicle driving path prediction result.

Description

Vehicle driving path prediction method and system
Technical field
The invention belongs to the technical field of intelligent transportation systems, and in particular relates to a vehicle driving path prediction method and system.
Background art
(1) Intelligent transportation systems
With the development and maturing of geographic positioning technology and the rise of mobile computing, applications based on paths and geographic locations have become a shared focus of academia, industry and even government. Path information and geographic location, as important attributes of moving objects, can provide important support for improving many services and application systems. Using the paths and locations of moving objects as system input has given rise to numerous emerging applications, of which the intelligent transportation system is a well-known one. Its predecessor was the intelligent vehicle-highway system. An intelligent transportation system effectively integrates advanced information technology, data communication and transmission technology, electronic sensing technology, electronic control technology and computer processing technology into the whole traffic management system, establishing an integrated transportation and management system that works over a wide area, in all respects, and in real time, accurately and efficiently. An intelligent transportation system is a complex, comprehensive system that, in terms of its composition, can be divided into the following subsystems:
1) Advanced traffic information service system (ATIS)
ATIS is built on a well-developed information network. Through sensors and transmission equipment installed on roads, in vehicles, at transfer stations, in parking lots and at forecast centers, traffic participants supply real-time traffic information from each location to the traffic information center. ATIS processes this information and provides traffic participants, in real time, with road traffic information, public transport information, transfer information, traffic weather information, parking information and other travel-related information. Travelers use this information to decide on their mode of travel and route. Furthermore, when a vehicle is equipped with an automatic positioning and navigation system, the system can help the driver select a route automatically.
2) Advanced traffic management system (ATMS)
ATMS shares part of its information collection, processing and transmission systems with ATIS, but ATMS is used mainly by traffic managers to monitor, control and manage road traffic, and it provides communication links among roads, vehicles and drivers. It monitors traffic conditions, traffic accidents, weather and the traffic environment in the road system in real time, relies on advanced vehicle detection technology and computer information processing technology to obtain information about traffic conditions, and controls traffic according to the collected information, for example through signal lights, published guidance information, road control, and accident handling and rescue.
3) Advanced public transportation system (APTS)
The main purpose of APTS is to use various intelligent technologies to promote the development of the public transport industry, so that public transport systems become safe, convenient, economical and high-capacity. For example, the public can be given advice on travel modes and events, routes, and service selection via personal computers, closed-circuit television and the like, and displays at bus stops can give waiting passengers real-time vehicle operation information. The public transport vehicle management center can schedule departures and returns sensibly according to the real-time status of vehicles, improving working efficiency and service quality.
4) Advanced vehicle control system (AVCS)
The purpose of AVCS is to develop technologies that help the driver control the vehicle, making driving safe and efficient. AVCS includes driver warning and assistance as well as automated driving technologies such as obstacle avoidance.
5) Transportation management system
This refers to an intelligent logistics management system that is based on the expressway network and an information management system and is managed using logistics theory. It combines satellite positioning, geographic information systems, logistics information and network technology to organize freight transport effectively and improve freight efficiency.
6) Electronic toll collection system (ETC)
ETC is currently the most advanced road and bridge toll collection method in the world. Using dedicated short-range microwave communication between an on-board unit mounted on the vehicle windshield and a microwave antenna in the ETC lane of the toll station, together with computer networking and back-office settlement with banks, vehicles can pay road and bridge tolls without stopping at the toll station, and after back-office processing the tolls paid are distributed to the relevant revenue owners. Installing a non-stop electronic toll collection system in an existing lane can raise the lane's traffic capacity by a factor of 3 to 5.
7) Emergency rescue system (EMS)
EMS is a special system built on ATIS, ATMS and the relevant rescue organizations and facilities. Through ATIS and ATMS, it links the traffic monitoring center with professional rescue organizations into an organic whole, providing road users with services such as on-site emergency handling of vehicle breakdowns, towing, on-site rescue and removal of accident vehicles.
(2) Path prediction technology
Path prediction methods fall mainly into the following two classes:
1) Path prediction methods based on Markov models. Document [1] (Simmons R, Browning B, Zhang Y, et al. Learning to predict driver route and destination intent [C]. Proceedings of the Intelligent Transportation Systems Conference, 2006: 127-132) proposes that, even when a better path exists, people habitually choose a familiar route they have taken before. On this premise, by observing a driver's historical driving path data, a Markov probability model is built and a Markov probability tree is generated, from which the vehicle's route choice at the next moment can be predicted from its current state. Document [2] (Research on ETC toll data mining based on a mixed Markov model [J]. Journal of Transportation Systems Engineering and Information Technology, 2012, 12(4)) builds a path sequence transaction database from historical ETC data and proposes a method for predicting the paths of expressway vehicles based on a mixed Markov path prediction model, which it uses to predict the future state of expressway ETC vehicles from their current state. However, the prediction range of this approach is short: it can only predict the road section the vehicle will reach at the next moment.
2) Path prediction methods based on sequential pattern mining. Document [3] (Yang J, Hu M. TrajPattern: mining sequential patterns from imprecise trajectories of mobile objects [C]. Proceedings of the International Conference on Extending Database Technology, 2006: 664-681) addresses the problem of predicting the position of moving targets in a mobile computing environment and proposes a method for mining target movement rules from historical trajectory data: the movement region is first divided into grid cells of equal area, each target trajectory is then converted into an ordered sequence of the grid cell edges it crosses, and the standard GSP algorithm is then used to mine the frequent sequential patterns and generate inference rules. Document [4] (Giannotti F, Nanni M, Pedreschi D. Trajectory pattern mining [C]. Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2007: 330-339) proposes a frequent sequential pattern mining algorithm for data supplied by GPS devices, which adds a dwell-time-in-cell parameter to the algorithm of document [3]. However, the computing capacity of this approach falls far short of what is required when processing massive data. The latest advances in computer software and hardware must therefore be fully exploited to improve computational efficiency.
At present, intelligent transportation systems use large numbers of advanced sensing devices, network technology, camera equipment and high-speed computer systems, and can monitor and collect large volumes of traffic data in real time. Suppose that the intersections at which electronic eyes are installed are taken as nodes forming a transportation network; a vehicle driving path sequence (hereinafter, path sequence) can then be represented as an ordered arrangement of nodes. Let $I = \{i_k \mid k = 1, 2, \ldots, n\}$ be a set of items, where item $i_k$ represents an intersection node at which an electronic eye is installed and $n$ is the number of intersections. A path sequence is an ordered arrangement of distinct items, and a path sequence $S$ can be written as $S = \langle s_1, s_2, \ldots, s_j, \ldots, s_n \rangle$, where each $s_j$ is an item of $I$. Any run of consecutive items in a path sequence is called a sub-path sequence of that path sequence. If path sequence α is a sub-path sequence of path sequence β, then β is said to contain α. The support count of a path sequence $S$ in a path sequence database is the number of path sequences in the database that contain $S$. The support of $S$ in the path sequence database, written Support($S$), is the percentage of path sequences in the database that contain $S$. Given a minimum support ξ, if the support of $S$ in the path sequence database is not less than ξ, then $S$ is called a path sequence pattern. Path sequences have the following property (hereinafter, property 1): every two adjacent items in a path sequence are two adjacent nodes of the road network.
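The definitions of containment, support count and support can be illustrated with a minimal Python sketch. The function names and the four-row toy database are hypothetical and chosen only for illustration; the one point taken from the text is that a sub-path sequence must be a run of consecutive items.

```python
from typing import List, Sequence, Tuple

Path = Tuple[str, ...]


def contains(path: Sequence[str], sub: Sequence[str]) -> bool:
    """True if `sub` occurs in `path` as a run of consecutive items."""
    k = len(sub)
    return any(tuple(path[i:i + k]) == tuple(sub)
               for i in range(len(path) - k + 1))


def support_count(db: List[Path], sub: Sequence[str]) -> int:
    """Number of path sequences in `db` that contain `sub`."""
    return sum(1 for path in db if contains(path, sub))


def support(db: List[Path], sub: Sequence[str]) -> float:
    """Fraction of path sequences in `db` that contain `sub`."""
    return support_count(db, sub) / len(db)


# Hypothetical toy database (not the embodiment's data).
toy_db: List[Path] = [
    ("A", "B", "D", "F"),
    ("A", "B", "D", "G"),
    ("C", "E", "F"),
    ("B", "D", "F", "H"),
]

min_support = 0.5  # plays the role of the minimum support ξ
is_pattern = support(toy_db, ("B", "D")) >= min_support
```

Here `("B", "D")` is contained in three of the four sequences, so it is a path sequence pattern at ξ = 0.5, whereas `("A", "D")` is contained in none because A and D are never consecutive.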
(3) Map-Reduce programming framework
Map-Reduce is a programming framework built on the concepts of "Map" and "Reduce" for parallel computation over large data sets (larger than 1 TB). It was proposed in the related document [3]: Jeffrey Dean and Sanjay Ghemawat. MapReduce: Simplified data processing on large clusters [C]. Communications of the ACM, 2008, 51(1): 107-113. The user only needs to write two functions, called Map and Reduce; the system manages the execution of the parallel Map and Reduce tasks and the coordination between tasks, handles the failure of individual tasks, and at the same time provides fault tolerance against hardware failures.
The Map-Reduce computation proceeds as follows:
1) The Map-Reduce library in the user program first splits the input files into M data splits, each typically 16 MB to 64 MB in size (the user can control the split size through an optional parameter), and then starts many copies of the program on a cluster of machines.
2) One of these program copies is special: the master. The other copies are workers, which are assigned tasks by the master. There are M Map tasks and R Reduce tasks to assign, and the master assigns a Map task or a Reduce task to each idle worker.
3) A worker assigned a Map task reads the corresponding input split, parses key-value (key, value) pairs from it, and passes each pair to the user-defined Map function. The intermediate key-value pairs produced by the Map function are buffered in local memory.
4) The buffered key-value pairs are divided into R regions by the partitioning function and are periodically written to local disk. The locations of these pairs on local disk are reported back to the master, which is responsible for forwarding them to the workers assigned Reduce tasks.
5) When a worker assigned a Reduce task receives these storage locations from the master, it uses remote procedure calls to read the buffered data from the local disks of the hosts that ran the Map tasks. Once it has read all the intermediate data, it sorts the data by key so that all data with the same key are grouped together. Sorting is necessary because many different keys can map to the same Reduce task. If the intermediate data are too large to sort in memory, an external sort is used.
6) The Reduce worker iterates over the sorted intermediate data and, for each unique intermediate key, passes the key and the set of intermediate values associated with it to the user-defined Reduce function. The output of the Reduce function is appended to the output file for that partition.
7) When all Map and Reduce tasks have completed, the master wakes up the user program, and at that point the Map-Reduce call in the user program returns.
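The flow above can be mimicked in a single process to make the roles of the Map function, the partitioning into R regions and the Reduce function concrete. This is only a sketch under simplifying assumptions: it is not the Hadoop API, there is no master, no disk and no remote procedure call, and `run_map_reduce`, `map_fn` and `reduce_fn` are hypothetical names.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def run_map_reduce(records: Iterable[str],
                   map_fn: Callable[[str], List[Tuple[str, int]]],
                   reduce_fn: Callable[[str, List[int]], int],
                   num_reducers: int = 2) -> Dict[str, int]:
    """Single-process imitation of steps 3) to 6)."""
    # Step 3: each record is passed to the user-defined Map function.
    intermediate: List[Tuple[str, int]] = []
    for record in records:
        intermediate.extend(map_fn(record))

    # Step 4: the partitioning function divides the pairs into R regions.
    regions = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in intermediate:
        regions[hash(key) % num_reducers][key].append(value)

    # Steps 5 and 6: each reducer visits its keys in sorted order and hands
    # every key with all of its values to the user-defined Reduce function.
    output: Dict[str, int] = {}
    for region in regions:
        for key in sorted(region):
            output[key] = reduce_fn(key, region[key])
    return output


def map_fn(record: str) -> List[Tuple[str, int]]:
    """Emit (intersection, 1) for every intersection on a path."""
    return [(node, 1) for node in record.split()]


def reduce_fn(key: str, values: List[int]) -> int:
    """Sum the counts collected for one intersection."""
    return sum(values)


# Hypothetical input: three short paths, one per record.
visits = run_map_reduce(["A B D", "A C E", "B D F"], map_fn, reduce_fn)
```

With this input, `visits` maps A, B and D to 2 and C, E and F to 1, regardless of which region each key happens to land in.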
(4) Hadoop cloud computing platform
Hadoop is an open-source software project for reliable, scalable, distributed computing developed by the Apache Software Foundation. Users can develop distributed programs without understanding the low-level details of distribution, making full use of the power of a cluster for high-speed computation and storage. Hadoop implements a distributed file system, the Hadoop Distributed File System (HDFS). HDFS is highly fault-tolerant and is designed to be deployed on low-cost hardware; it provides high-throughput access to application data and suits applications with very large data sets. HDFS relaxes some POSIX requirements so that data in the file system can be accessed as a stream.
The core of the Hadoop framework design is HDFS and Map-Reduce: HDFS provides storage for massive data, and Map-Reduce provides computation over massive data.
However, for a concrete technical problem, it remains necessary to work out how to design the technical scheme so that it can be implemented in parallel with Map-Reduce. No technical scheme with satisfactory results has yet appeared in this field.
Summary of the invention
Existing path prediction methods based on Markov models have a short prediction range and can only predict the road section a vehicle will reach at the next moment, while existing path prediction methods based on sequential pattern mining are computationally inefficient on massive and high-dimensional data. To address these problems, and exploiting property 1 of vehicle driving path sequences, the present invention improves the candidate sequence pattern generation process of the original GSP algorithm to raise its computational performance, parallelizes the improved GSP algorithm with the Map-Reduce programming framework, and designs a sequence database decomposition strategy that meets the requirements of parallel computation and reduces I/O overhead. On this basis, the large-scale parallel computing capability of the Hadoop cloud computing platform is fully used to improve the efficiency of sequential pattern mining on massive data and shorten the running time.
The technical scheme provided by the invention is a vehicle driving path prediction method that performs the following steps on a Hadoop platform:
Step 1: according to the memory of every computer in the Hadoop platform, determine the minimum memory among all nodes, denoted Q, in GB;
Step 2: scan the original path sequence database that stores the vehicle driving path sequences; the number of path sequences in the original path sequence database is denoted m, each path sequence contains more than one intersection, and the actual storage size of the longest path sequence in the original path sequence database is denoted P, in bytes (B);
Step 3: divide the original path sequence database evenly, by horizontal partitioning, into n disjoint sub-path sequence databases, where $P \times (m/n) \le Q \times 10^9$;
Step 4: upload the original path sequence database to a designated folder in HDFS;
Step 5: upload the n sub-path sequence databases to another designated folder in HDFS;
Step 6: the master node of the Hadoop platform dispatches the n sub-path sequence databases uploaded in step 5 to different Map nodes; each Map node runs the improved GSP algorithm, scanning the sub-path sequence database held in its memory against the preset minimum support ξ to compute the local path sequence patterns, and passes them to the Reduce nodes as <key, value> pairs, where key is a local path sequence pattern and value is its support count;
Each Map node runs the improved GSP algorithm as follows:
Operation a: scan the sub-path sequence database assigned to this Map node to obtain the 1-path sequence patterns $L_1$, and set k = 1;
Operation b: generate the candidate (k+1)-path sequences $C_{k+1}$ from the k-path sequence patterns $L_k$, scan the sequence database again to compute the support of each candidate path sequence, and obtain the (k+1)-path sequence patterns $L_{k+1}$. The candidates $C_{k+1}$ are generated in one of the following two ways:
(1) To generate candidate 2-path sequence patterns from the 1-path sequence patterns, scan the adjacency list that stores the road network information, look up the adjacent nodes of each path sequence pattern $s_1$ in $L_1$, and append each item adjacent to $s_1$ to $s_1$;
(2) To generate candidate (k+1)-path sequence patterns from the k-path sequence patterns, with k > 1:
First, for any two path sequence patterns $s_1$ and $s_2$ among the k-path sequence patterns, if the path sequence obtained by removing the first item of $s_1$ is identical to the path sequence obtained by removing the last item of $s_2$, join $s_1$ with $s_2$. Then prune: if any sub-path sequence of a candidate path sequence pattern is not a path sequence pattern, delete that candidate from the candidate path sequence patterns;
Operation c: set k = k + 1 and repeat operation b until no new candidate path sequences are produced;
Step 7: the Reduce nodes merge the <key, value> pairs passed from the Map nodes to obtain the global candidate sequence patterns;
Step 8: scan again the original path sequence database stored in HDFS in step 4 to count the global candidate sequence patterns, and keep the sequence patterns whose support is not less than the preset minimum support ξ, obtaining the global path sequence patterns;
Step 9: generate path association rules from the global path sequence patterns obtained in step 8 and calculate the confidence of each path association rule, obtaining the vehicle driving path prediction result.
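Operations a to c of the improved GSP algorithm run by each Map node can be sketched in Python as follows. This is an illustrative reading of the text, not the patented implementation: `improved_gsp`, `count_support` and `SUB_DB` are hypothetical names, the adjacency table is the one the embodiment gives for its simulated road network, and treating the first four rows of the embodiment's table as one sub-path sequence database is an assumption made only to have input data.

```python
from typing import Dict, List, Set, Tuple

Path = Tuple[str, ...]

# Adjacency list of the embodiment's simulated road network (A..N).
# Each value is a string whose characters are the neighbouring intersections.
ADJACENCY: Dict[str, str] = {
    "A": "BC", "B": "AD", "C": "AE", "D": "BGF", "E": "CFH",
    "F": "DGJHE", "G": "DIF", "H": "FKE", "I": "GL", "J": "FN",
    "K": "HM", "L": "IN", "M": "KN", "N": "JLM",
}


def count_support(db: List[Path], candidate: Path) -> int:
    """Number of paths containing `candidate` as consecutive items."""
    k = len(candidate)
    return sum(
        any(path[i:i + k] == candidate for i in range(len(path) - k + 1))
        for path in db
    )


def improved_gsp(db: List[Path], adjacency: Dict[str, str],
                 min_support: float) -> Dict[Path, int]:
    """Return {local path sequence pattern: support count} for one sub-database."""
    min_count = min_support * len(db)

    def frequent(candidates: Set[Path]) -> Dict[Path, int]:
        counted = {c: count_support(db, c) for c in candidates}
        return {c: n for c, n in counted.items() if n >= min_count}

    # Operation a: 1-path sequence patterns L_1.
    level = frequent({(node,) for path in db for node in path})
    patterns = dict(level)
    k = 1
    while level:  # Operation c: stop once no (k+1)-pattern survives.
        if k == 1:
            # Case (1): extend each frequent node only by its road neighbours.
            candidates = {s + (nb,) for s in level
                          for nb in adjacency.get(s[0], "")}
        else:
            # Case (2): join s1 and s2 when s1 minus its first item equals
            # s2 minus its last item, then prune.
            candidates = set()
            for s1 in level:
                for s2 in level:
                    if s1[1:] == s2[:-1]:
                        cand = s1 + (s2[-1],)
                        # With consecutive sub-paths, the two length-k
                        # sub-paths of cand are s1 and s2 themselves, so this
                        # check passes by construction; kept to mirror the text.
                        if all(cand[i:i + k] in level for i in range(2)):
                            candidates.add(cand)
        # Operation b: rescan the database to count the candidates.
        level = frequent(candidates)
        patterns.update(level)
        k += 1
    return patterns


# Assumed sub-database: the first four rows of the embodiment's table.
SUB_DB: List[Path] = [tuple(r.split()) for r in (
    "A B D F H K", "A C E F G I L", "A B D F H K M N", "C E F G I L N")]
local_patterns = improved_gsp(SUB_DB, ADJACENCY, min_support=0.5)
```

Because 2-candidates come only from the adjacency list, pairs of intersections that are not neighbours on the road network are never counted, which is where property 1 saves work compared with the standard GSP candidate generation. In a Map task, each entry of `local_patterns` would be emitted as one <key, value> pair.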
The invention correspondingly provides a vehicle driving path prediction system in which the following modules are arranged on a Hadoop platform:
Memory confirmation module, for determining, according to the memory of every computer in the Hadoop platform, the memory of the machine with the least memory among all nodes, denoted Q, in GB;
Longest path sequence confirmation module, for scanning the original path sequence database that stores the vehicle driving path sequences, where the number of path sequences in the original path sequence database is denoted m, each path sequence contains more than one intersection, and the actual storage size of the longest path sequence in the original path sequence database is denoted P, in bytes (B);
Sub-path sequence database partitioning module, for dividing the original path sequence database evenly, by horizontal partitioning, into n disjoint sub-path sequence databases, where $P \times (m/n) \le Q \times 10^9$;
Original database upload module, for uploading the original path sequence database to a designated folder in HDFS;
Sub-database upload module, for uploading the n sub-path sequence databases to another designated folder in HDFS;
Local path sequence pattern module, for having the master node of the Hadoop platform dispatch the n sub-path sequence databases uploaded by the sub-database upload module to different Map nodes; each Map node runs the improved GSP algorithm, scanning the sub-path sequence database held in its memory against the preset minimum support ξ to compute the local path sequence patterns, and passes them to the Reduce nodes as <key, value> pairs, where key is a local path sequence pattern and value is its support count;
Each Map node runs the improved GSP algorithm as follows:
Operation a: scan the sub-path sequence database assigned to this Map node to obtain the 1-path sequence patterns $L_1$, and set k = 1;
Operation b: generate the candidate (k+1)-path sequences $C_{k+1}$ from the k-path sequence patterns $L_k$, scan the sequence database again to compute the support of each candidate path sequence, and obtain the (k+1)-path sequence patterns $L_{k+1}$. The candidates $C_{k+1}$ are generated in one of the following two ways:
(1) To generate candidate 2-path sequence patterns from the 1-path sequence patterns, scan the adjacency list that stores the road network information, look up the adjacent nodes of each path sequence pattern $s_1$ in $L_1$, and append each item adjacent to $s_1$ to $s_1$;
(2) To generate candidate (k+1)-path sequence patterns from the k-path sequence patterns, with k > 1:
First, for any two path sequence patterns $s_1$ and $s_2$ among the k-path sequence patterns, if the path sequence obtained by removing the first item of $s_1$ is identical to the path sequence obtained by removing the last item of $s_2$, join $s_1$ with $s_2$. Then prune: if any sub-path sequence of a candidate path sequence pattern is not a path sequence pattern, delete that candidate from the candidate path sequence patterns;
Operation c: set k = k + 1 and repeat operation b until no new candidate path sequences are produced;
Global candidate sequence pattern module, for having the Reduce nodes merge the <key, value> pairs passed from the Map nodes to obtain the global candidate sequence patterns;
Global path sequence pattern module, for scanning again the original path sequence database stored in HDFS by the original database upload module to count the global candidate sequence patterns, and keeping the sequence patterns whose support is not less than the preset minimum support ξ, obtaining the global path sequence patterns;
Prediction result module, for generating path association rules from the global path sequence patterns produced by the global path sequence pattern module and calculating the confidence of each path association rule, obtaining the vehicle driving path prediction result.
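The work of the global candidate sequence pattern module and the global path sequence pattern module (steps 7 and 8 of the method) can be sketched as below. The function names, the toy database and the two local result dictionaries are hypothetical. The sketch assumes that merging means taking the union of the keys reported by the Map nodes; the second scan is what supplies the exact counts, since a Map node reports nothing for a pattern that is infrequent in its own sub-database.

```python
from typing import Dict, Iterable, List, Set, Tuple

Path = Tuple[str, ...]


def count_support(db: List[Path], candidate: Path) -> int:
    """Number of paths containing `candidate` as consecutive items."""
    k = len(candidate)
    return sum(
        any(path[i:i + k] == candidate for i in range(len(path) - k + 1))
        for path in db
    )


def merge_local_patterns(local_results: Iterable[Dict[Path, int]]) -> Set[Path]:
    """Step 7 (assumed reading): union of all local patterns."""
    candidates: Set[Path] = set()
    for local in local_results:
        candidates.update(local)
    return candidates


def global_patterns(db: List[Path], candidates: Set[Path],
                    min_support: float) -> Dict[Path, int]:
    """Step 8: recount every candidate against the original database."""
    min_count = min_support * len(db)
    counted = {c: count_support(db, c) for c in candidates}
    return {c: n for c, n in counted.items() if n >= min_count}


# Hypothetical original database and local results from two Map nodes.
original_db: List[Path] = [
    ("A", "B", "D"), ("A", "B", "D"), ("A", "C", "E"), ("B", "D", "F"),
]
local_1 = {("A", "B"): 2, ("B", "D"): 2, ("A", "B", "D"): 2}
local_2 = {("A", "C"): 1, ("B", "D"): 1, ("B", "D", "F"): 1}

candidates = merge_local_patterns([local_1, local_2])
result = global_patterns(original_db, candidates, min_support=0.5)
```

In this toy run the candidates `("A", "C")` and `("B", "D", "F")` are locally frequent on the second node but occur only once in the original database, so the recount discards them.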
Compared with existing vehicle driving path prediction methods at home and abroad, the present invention redesigns, in line with the basic requirements of the Map-Reduce programming framework, the process of mining sequential patterns from vehicle driving path sequences and generating path association rules. The invention also uses property 1 of vehicle driving path sequences to improve the candidate sequence pattern generation process of the original GSP algorithm, and designs a sound sequence database decomposition strategy that parallelizes the improved GSP algorithm, reduces I/O overhead, makes full use of the processing power of a shared-storage computer cluster, and raises efficiency. The technical scheme is simple and fast, and can considerably improve the efficiency of sequential pattern mining and path association rule generation on vehicle driving path sequences.
Brief description of the drawings
Fig. 1 is a flowchart of the embodiment of the invention;
Fig. 2 is a schematic diagram of the simulated road network of the embodiment of the invention;
Fig. 3 is the adjacency list storing the simulated road network of the embodiment of the invention;
Fig. 4 is a schematic diagram of the partitioning of the original path sequence database in the embodiment of the invention;
Fig. 5 is a schematic diagram of the Map task executed on sub-path sequence database 1 in the embodiment of the invention;
Fig. 6 is a schematic diagram of the Map task executed on sub-path sequence database 2 in the embodiment of the invention;
Fig. 7 is a schematic diagram of the Map task executed on sub-path sequence database 3 in the embodiment of the invention.
Detailed description of the embodiments
The technical scheme of the invention is described in detail below with reference to the drawings and the embodiment.
The embodiment uses the simulated road network shown in Fig. 2, in which electronic eyes collect data at all 14 intersections A to N. Because the invention makes use of road network information, an adjacency list is used to store it. The adjacency list for this road network is shown in Fig. 3: intersection A is adjacent to B and C; B to A and D; C to A and E; D to B, G and F; E to C, F and H; F to D, G, J, H and E; G to D, I and F; H to F, K and E; I to G and L; J to F and N; K to H and M; L to I and N; M to K and N; and N to J, L and M. The vehicle trips recorded by the electronic eyes are stored as path sequences in the vehicle driving path sequence database; each path sequence contains more than one intersection, as in the table below.
Path sequence
<A B D F H K>
<A C E F G I L>
<A B D F H K M N>
<C E F G I L N>
<A B D F H K>
<C E F G I L N>
<A B D G I L N>
<A B D F H K M>
<A B D F H K>
<E F G I L N>
<A B D G I L N>
<A B D F H K M N>
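The adjacency list and the example path sequences above can be written down directly as data. The following is a minimal sketch, not part of the original disclosure: the names ADJACENCY, PATHS, is_symmetric and is_valid_path are illustrative assumptions. It checks that the adjacency list is symmetric and that every recorded path sequence only moves between adjoining intersections.

```python
# Adjacency list of the simulated road network, transcribed from the text above.
ADJACENCY = {
    "A": ["B", "C"], "B": ["A", "D"], "C": ["A", "E"],
    "D": ["B", "G", "F"], "E": ["C", "F", "H"],
    "F": ["D", "G", "J", "H", "E"], "G": ["D", "I", "F"],
    "H": ["F", "K", "E"], "I": ["G", "L"], "J": ["F", "N"],
    "K": ["H", "M"], "L": ["I", "N"], "M": ["K", "N"],
    "N": ["J", "L", "M"],
}

# The 12 path sequences of the example database, in table order.
PATHS = [
    "A B D F H K", "A C E F G I L", "A B D F H K M N", "C E F G I L N",
    "A B D F H K", "C E F G I L N", "A B D G I L N", "A B D F H K M",
    "A B D F H K", "E F G I L N", "A B D G I L N", "A B D F H K M N",
]


def is_symmetric(adjacency):
    """True if every edge u-v is also listed as v-u."""
    return all(u in adjacency[v] for u, nbrs in adjacency.items() for v in nbrs)


def is_valid_path(path, adjacency):
    """True if each pair of consecutive intersections is adjacent."""
    nodes = path.split()
    return all(b in adjacency[a] for a, b in zip(nodes, nodes[1:]))


print(is_symmetric(ADJACENCY))
print(all(is_valid_path(p, ADJACENCY) for p in PATHS))
```

A dictionary of neighbour lists is used here because candidate generation later only needs to look up the neighbours of a single intersection.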
A path sequence pattern reflects the regularity of vehicles' route choices. Directed path association rules are generated from path sequence patterns: the antecedent of a rule represents the path sequence the vehicle has already travelled, and the consequent represents the path sequence the vehicle is about to travel. For example, the confidence conf(<A B D> → <F H K>) of the path association rule <A B D> → <F H K> is defined as the ratio of the number of path sequences in the path sequence database that contain <A B D F H K> to the number that contain <A B D>. That is, a vehicle that has passed through the three nodes A, B and D will next pass through nodes F, H and K with probability conf(<A B D> → <F H K>).
Based on the original path sequence database generated in advance as described above, the flow of the vehicle driving path prediction method designed by the present invention on the Map-Reduce programming framework is shown in Fig. 1. Those skilled in the art can implement all of the steps with computer software so that the flow runs automatically. The specific implementation process of the embodiment is as follows:
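The confidence definition can be illustrated on the 12 example path sequences. This is a sketch under one stated assumption: a path sequence "contains" a pattern only when the pattern appears as a run of consecutive intersections, which the text does not spell out. The function names contains, support_count and confidence are illustrative.

```python
# The 12 path sequences of the example database, in table order.
PATHS = [
    "A B D F H K", "A C E F G I L", "A B D F H K M N", "C E F G I L N",
    "A B D F H K", "C E F G I L N", "A B D G I L N", "A B D F H K M",
    "A B D F H K", "E F G I L N", "A B D G I L N", "A B D F H K M N",
]
DATABASE = [tuple(p.split()) for p in PATHS]


def contains(sequence, pattern):
    """Assumption: the pattern must occur as consecutive items of the sequence."""
    k = len(pattern)
    return any(sequence[i:i + k] == pattern
               for i in range(len(sequence) - k + 1))


def support_count(database, pattern):
    """Number of path sequences in the database that contain the pattern."""
    return sum(contains(seq, pattern) for seq in database)


def confidence(database, antecedent, consequent):
    """Support of antecedent + consequent divided by support of the antecedent."""
    base = support_count(database, antecedent)
    return support_count(database, antecedent + consequent) / base if base else 0.0


conf = confidence(DATABASE, ("A", "B", "D"), ("F", "H", "K"))
print(f"{conf:.2%}")  # 75.00%
```

Six sequences contain <A B D F H K> and eight contain <A B D>, so the ratio is 6/8.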
Step 1: according to the memory of each computer in the Hadoop platform, determine the memory of the node with the least memory among all nodes, and denote it Q (unit: GB). In the embodiment, Q = 2 GB.
In step 3 the original path sequence database is evenly divided into n disjoint sub-path sequence databases, and each sub-path sequence database is loaded into node memory. To prevent a computer with less memory from becoming the bottleneck of the computation, it is recommended that, in a specific implementation, every computer in the Hadoop platform have the same memory and computing performance.
Step 2: scan the original path sequence database once (the original path sequence database may be stored as a text document, which makes it easy to import into HDFS) to obtain the number of path sequences in the database, denoted m, and the actual storage size of the longest path sequence in the database, denoted P (unit: B). In the embodiment, the database contains 12 path sequences, and since one character occupies 1 B, the longest sequence occupies 17 B (including spaces and angle brackets). Hence m = 12 and P = 17 B.
Step 3: evenly divide the original path sequence database into n disjoint sub-path sequence databases by horizontal partitioning (the n sub-path sequence databases may also be stored as text documents). In general m is divisible by n, so that each sub-path sequence database contains m/n path sequences: the first sub-path sequence database contains path sequences 1 to m/n of the original path sequence database, the k-th sub-path sequence database (1<k<n) contains path sequences (k-1) × (m/n) + 1 to k × (m/n), and the n-th sub-path sequence database contains path sequences (n-1) × (m/n) + 1 to m. So that the original path sequence database in external storage need not be scanned when candidate path sequence patterns are counted, which reduces I/O overhead, each sub-path sequence database should fit in memory. That is, P × (m/n) ≤ Q × 10^9 should be satisfied. When P and Q are expressed in other units, the corresponding condition should likewise be satisfied, and this also falls within the protection scope of the present invention.
As shown in Fig. 4, the embodiment divides the original path sequence database into n = 3 sub-path sequence databases. In the embodiment 17 × (12/3) < 2 × 10^9, which satisfies the requirement that each sub-path sequence database fit in memory.
The sub-path sequence databases 1, 2 and 3 obtained by dividing the original path sequence database are as follows:
Path sequence table of sub-path sequence database 1
Path sequence
<A B D F H K>
<A C E F G I L>
<A B D F H K M N>
<C E F G I L N>
Path sequence table of sub-path sequence database 2
Path sequence
<A B D F H K>
<C E F G I L N>
<A B D G I L N>
<A B D F H K M>
Path sequence table of sub-path sequence database 3
Path sequence
<A B D F H K>
<E F G I L N>
<A B D G I L N>
<A B D F H K M N>
Each path sequence contains some of the items in the item set {A, B, C, D, E, F, G, H, I, J, K, L, M, N}. Sub-path sequence database 1 contains path sequences 1 to 4 of the original path sequence database, sub-path sequence database 2 contains path sequences 5 to 8, and sub-path sequence database 3 contains path sequences 9 to 12.
Let q be the number of Map nodes in the Hadoop platform. It is recommended that the number of sub-path sequence databases equal the number of Map nodes, i.e. n = q. If n < q, then when the method runs without task failures, (q - n) Map nodes are left unused and node utilization is low. If n > q, then when the method runs without task failures, the remaining n - q sub-path sequence databases can only be processed after the q Map nodes have finished the first q sub-path sequence databases, and processing efficiency is low. Setting n = q therefore achieves both good utilization and good processing efficiency.
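Steps 2 and 3 can be sketched as follows. This is an illustrative sketch only: the helper names longest_record_bytes, fits_in_memory and horizontal_partition are assumptions, and each record is assumed to be stored as one text line with one byte per character, as in the embodiment.

```python
# The original path sequence database as text records, in table order.
RECORDS = [
    "<A B D F H K>", "<A C E F G I L>", "<A B D F H K M N>", "<C E F G I L N>",
    "<A B D F H K>", "<C E F G I L N>", "<A B D G I L N>", "<A B D F H K M>",
    "<A B D F H K>", "<E F G I L N>", "<A B D G I L N>", "<A B D F H K M N>",
]


def longest_record_bytes(records):
    """P: storage size in bytes of the longest path sequence."""
    return max(len(r.encode("utf-8")) for r in records)


def fits_in_memory(p_bytes, m, n, q_gb):
    """Condition of step 3: P * (m / n) <= Q * 10^9, with P in B and Q in GB."""
    return p_bytes * (m / n) <= q_gb * 10**9


def horizontal_partition(records, n):
    """Split the records, in order, into n equal, disjoint sub-databases."""
    if len(records) % n != 0:
        raise ValueError("this sketch assumes m is divisible by n")
    size = len(records) // n
    return [records[i * size:(i + 1) * size] for i in range(n)]


m = len(RECORDS)                   # number of path sequences
P = longest_record_bytes(RECORDS)  # bytes of the longest record
print(m, P, fits_in_memory(P, m, n=3, q_gb=2))
for number, part in enumerate(horizontal_partition(RECORDS, 3), start=1):
    print(number, part)
```

With the embodiment's values (m = 12, P = 17 B, n = 3, Q = 2 GB) the condition holds by a wide margin; it only becomes restrictive for databases whose size is comparable to node memory.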
Step 4: upload the original path sequence database to a specified folder in HDFS; step 8 scans the path sequence database stored in this folder.
Step 5: upload the n sub-path sequence databases to another specified folder in HDFS; the n sub-path sequence databases in this folder are the input files processed in step 6.
Step 6: the master node (the computer node running the master program) assigns the n sub-path sequence databases uploaded in step 5 to different Map nodes (computer nodes performing Map tasks). Each Map node executes the improved GSP algorithm: according to the preset minimum support ξ, it scans the sub-path sequence database held in its memory, computes the local path sequence patterns, and passes them to the Reduce node (the computer node performing the Reduce task) as <key, value> pairs, where key is a local path sequence pattern and value is its support count.
Each Map node executes the improved GSP algorithm as follows:
Operation a: for the sub-path sequence database assigned to this Map node, first scan the sub-path sequence database to obtain the 1-path sequence patterns L_1, i.e. the set of path sequences of length 1 whose support in the sub-path sequence database is not less than ξ. In general, the set of path sequences of length k whose support in the sub-path sequence database is not less than ξ is the k-path sequence patterns L_k. Let k = 1.
Operation b: generate the candidate (k+1)-path sequences C_{k+1} from the k-path sequence patterns L_k, scan the sequence database again to compute the support of each candidate path sequence, and obtain the (k+1)-path sequence patterns L_{k+1}.
Operation c: let k = k + 1 and repeat operation b until no new candidate path sequence is produced. The resulting 1-path sequence patterns L_1, 2-path sequence patterns L_2, and so on are all local path sequence patterns. The number of database scans equals the maximum length of the path sequence patterns produced.
Candidate path sequence patterns are generated in one of the following two cases:
(1) When candidate 2-path sequence patterns are generated from the 1-path sequence patterns: scan the adjacency list and check the adjacent nodes of each path sequence pattern s_1 in L_1. If an adjacent node of s_1 is also in L_1, connect s_1 with that adjacent node, i.e. append the item of the adjacent node to s_1.
(2) When candidate (k+1)-path sequence patterns are generated from the k-path sequence patterns (k>1), generation proceeds in two steps:
First, join: for any two path sequence patterns s_1 and s_2 among the k-path sequence patterns, if the path sequence obtained by removing the first item of s_1 is identical to the path sequence obtained by removing the last item of s_2, then s_1 and s_2 can be connected by appending the last item of s_2 to s_1. Then, prune: if some sub-path sequence of a candidate path sequence pattern is not a path sequence pattern, the candidate cannot be a path sequence pattern and is deleted from the candidate path sequence patterns.
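Operations a to c and the two candidate-generation cases can be sketched in a few lines. This is a hedged illustration, not the patented implementation: the function names are assumptions, a pattern is assumed to be supported only by consecutive intersections, the pruning step is assumed to check the contiguous sub-paths of length k, and a minimum support given as a fraction is converted to a count with ceil.

```python
from math import ceil

# Adjacency list of the simulated road network (transcribed from the description).
ADJACENCY = {
    "A": ["B", "C"], "B": ["A", "D"], "C": ["A", "E"],
    "D": ["B", "G", "F"], "E": ["C", "F", "H"],
    "F": ["D", "G", "J", "H", "E"], "G": ["D", "I", "F"],
    "H": ["F", "K", "E"], "I": ["G", "L"], "J": ["F", "N"],
    "K": ["H", "M"], "L": ["I", "N"], "M": ["K", "N"],
    "N": ["J", "L", "M"],
}

# Sub-path sequence database 1 of the embodiment.
SUB_DB_1 = [tuple(p.split()) for p in (
    "A B D F H K", "A C E F G I L", "A B D F H K M N", "C E F G I L N")]


def contains(sequence, pattern):
    """Assumption: the pattern must occur as consecutive items of the sequence."""
    k = len(pattern)
    return any(sequence[i:i + k] == pattern
               for i in range(len(sequence) - k + 1))


def improved_gsp(sub_db, adjacency, min_support):
    """Return {pattern: support count} for all local path sequence patterns."""
    min_count = ceil(min_support * len(sub_db))

    def frequent(candidates):
        # One scan of the in-memory sub-database per level.
        counts = {c: sum(contains(s, c) for s in sub_db) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_count}

    # Operation a: 1-path sequence patterns L_1.
    items = sorted({x for s in sub_db for x in s})
    level = frequent([(x,) for x in items])
    result = dict(level)
    k = 1
    while level:
        if k == 1:
            # Case (1): extend each frequent node by its frequent neighbours.
            candidates = [(a, b) for (a,) in level
                          for b in adjacency[a] if (b,) in level]
        else:
            # Case (2), join: s1 without its first item == s2 without its last.
            joined = [s1 + s2[-1:] for s1 in level for s2 in level
                      if s1[1:] == s2[:-1]]
            # Case (2), prune: every contiguous k-sub-path must be in L_k.
            candidates = [c for c in joined
                          if all(c[i:i + k] in level
                                 for i in range(len(c) - k + 1))]
        # Operation b: count candidates, keep those meeting the minimum support.
        level = frequent(candidates)
        result.update(level)
        k += 1  # Operation c
    return result


local = improved_gsp(SUB_DB_1, ADJACENCY, min_support=0.5)
print(len(local), local[("A", "B", "D", "F", "H", "K")])
```

Restricting the candidate 2-path sequences to adjoining intersections is what distinguishes this procedure from plain GSP: pairs of intersections that no vehicle can travel between directly are never counted.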
The embodiment sets the minimum support to 50%; the specific steps of the improved GSP algorithm are shown in Figs. 5, 6 and 7. The Map node assigned sub-path sequence database 1 scans it to obtain the 1-path sequence patterns L_1, generates the candidate 2-path sequence patterns C_2 from L_1, scans the sequence database again to compute the support of each candidate path sequence pattern, and obtains the 2-path sequence patterns L_2. The operation is then repeated until no new candidate path sequence pattern is produced. Sub-path sequence databases 2 and 3 are processed in the same way by the Map nodes to which they are assigned.
Referring to Fig. 5, the results obtained while processing sub-path sequence database 1 are shown in the following tables:
L_1 (1-path sequence patterns)
Path sequence    Support count
<A> 3
<B> 2
<C> 2
<D> 2
<E> 2
<F> 4
<G> 2
<H> 2
<I> 2
<K> 2
<L> 2
<N> 2
C_2 (candidate 2-path sequence patterns)
Path sequence
<A B>
<A C>
<B A>
<B D>
<C A>
<C E>
<D B>
<D G>
<D F>
<E F>
<E H>
<E C>
<F D>
<F G>
<F H>
<F E>
<G D>
<G I>
<G F>
<H E>
<H F>
<H K>
<I G>
<I L>
<K H>
<L I>
<L N>
<N L>
L_2 (2-path sequence patterns)
Path sequence    Support count
<A B> 2
<B D> 2
<C E> 2
<D F> 2
<E F> 2
<F G> 2
<F H> 2
<G I> 2
<H K> 2
<I L> 2
C_3 (candidate 3-path sequence patterns)
Path sequence
<A B D>
<B D F>
<C E F>
<D F G>
<D F H>
<E F G>
<E F H>
<F G I>
<F H K>
<G I L>
L_3 (3-path sequence patterns)
Path sequence    Support count
<A B D> 2
<B D F> 2
<C E F> 2
<D F H> 2
<E F G> 2
<F G I> 2
<F H K> 2
<G I L> 2
C_4 (candidate 4-path sequence patterns)
Path sequence
<A B D F>
<B D F H>
<C E F G>
<D F H K>
<E F G I>
<F G I L>
L_4 (4-path sequence patterns)
Path sequence    Support count
<A B D F> 2
<B D F H> 2
<C E F G> 2
<D F H K> 2
<E F G I> 2
<F G I L> 2
C_5 (candidate 5-path sequence patterns)
Path sequence
<A B D F H>
<B D F H K>
<C E F G I>
<E F G I L>
L_5 (5-path sequence patterns)
Path sequence    Support count
<A B D F H> 2
<B D F H K> 2
<C E F G I> 2
<E F G I L> 2
C_6 (candidate 6-path sequence patterns)
Path sequence
<A B D F H K>
<C E F G I L>
L_6 (6-path sequence patterns)
Path sequence    Support count
<A B D F H K> 2
<C E F G I L> 2
Referring to Fig. 6, the results obtained while processing sub-path sequence database 2 are shown in the following tables:
L_1 (1-path sequence patterns)
Path sequence    Support count
<A> 3
<B> 3
<D> 3
<F> 3
<G> 2
<H> 2
<I> 2
<K> 2
<L> 2
<N> 2
C_2 (candidate 2-path sequence patterns)
Path sequence
<A B>
<B A>
<B D>
<D B>
<D G>
<D F>
<F D>
<F G>
<F H>
<G D>
<G I>
<G F>
<H E>
<H F>
<H K>
<I G>
<I L>
<K H>
<L I>
<L N>
<N L>
L_2 (2-path sequence patterns)
Path sequence    Support count
<A B> 3
<B D> 3
<D F> 2
<F H> 2
<G I> 2
<H K> 2
<I L> 2
<L N> 2
C_3 (candidate 3-path sequence patterns)
Path sequence
<A B D>
<B D F>
<D F H>
<F H K>
<G I L>
<I L N>
L_3 (3-path sequence patterns)
Path sequence    Support count
<A B D> 3
<B D F> 2
<D F H> 2
<F H K> 2
<G I L> 2
<I L N> 2
C_4 (candidate 4-path sequence patterns)
Path sequence
<A B D F>
<B D F H>
<D F H K>
<G I L N>
L_4 (4-path sequence patterns)
Path sequence    Support count
<A B D F> 2
<B D F H> 2
<D F H K> 2
<G I L N> 2
C_5 (candidate 5-path sequence patterns)
Path sequence
<A B D F H>
<B D F H K>
L_5 (5-path sequence patterns)
Path sequence    Support count
<A B D F H> 2
<B D F H K> 2
C_6 (candidate 6-path sequence patterns)
Path sequence
<A B D F H K>
L_6 (6-path sequence patterns)
Path sequence    Support count
<A B D F H K> 2
Referring to Fig. 7, the results obtained while processing sub-path sequence database 3 are shown in the following tables:
L_1 (1-path sequence patterns)
Path sequence    Support count
<A> 3
<B> 3
<D> 3
<F> 3
<G> 2
<H> 2
<I> 2
<K> 2
<L> 2
<N> 3
C_2 (candidate 2-path sequence patterns)
Path sequence
<A B>
<B A>
<B D>
<D B>
<D G>
<D F>
<F D>
<F G>
<F H>
<G D>
<G I>
<G F>
<H E>
<H F>
<H K>
<I G>
<I L>
<K H>
<L I>
<L N>
<N L>
L_2 (2-path sequence patterns)
Path sequence    Support count
<A B> 3
<B D> 3
<D F> 2
<F H> 2
<G I> 2
<H K> 2
<I L> 2
<L N> 2
C_3 (candidate 3-path sequence patterns)
Path sequence
<A B D>
<B D F>
<D F H>
<F H K>
<G I L>
<I L N>
L_3 (3-path sequence patterns)
Path sequence    Support count
<A B D> 3
<B D F> 2
<D F H> 2
<F H K> 2
<G I L> 2
<I L N> 2
C_4 (candidate 4-path sequence patterns)
Path sequence
<A B D F>
<B D F H>
<D F H K>
<G I L N>
L_4 (4-path sequence patterns)
Path sequence    Support count
<A B D F> 2
<B D F H> 2
<D F H K> 2
<G I L N> 2
C_5 (candidate 5-path sequence patterns)
Path sequence
<A B D F H>
<B D F H K>
L_5 (5-path sequence patterns)
Path sequence    Support count
<A B D F H> 2
<B D F H K> 2
C_6 (candidate 6-path sequence patterns)
Path sequence
<A B D F H K>
L_6 (6-path sequence patterns)
Path sequence    Support count
<A B D F H K> 2
The <key, value> pairs that the Map worker nodes pass to the Reduce worker node are shown in the following table:
key value
<A> 3
<B> 2
<C> 2
<D> 2
<E> 2
<F> 4
<G> 2
<H> 2
<I> 2
<K> 2
<L> 2
<N> 2
<A B> 2
<B D> 2
<C E> 2
<D F> 2
<E F> 2
<F G> 2
<F H> 2
<G I> 2
<H K> 2
<I L> 2
<A B D> 2
<B D F> 2
<C E F> 2
<D F H> 2
<E F G> 2
<F G I> 2
<F H K> 2
<G I L> 2
<A B D F> 2
<B D F H> 2
<C E F G> 2
<D F H K> 2
<E F G I> 2
<F G I L> 2
<A B D F H> 2
<B D F H K> 2
<C E F G I> 2
<E F G I L> 2
<A B D F H K> 2
<C E F G I L> 2
<A> 3
<B> 3
<D> 3
<F> 3
<G> 2
<H> 2
<I> 2
<K> 2
<L> 2
<N> 2
<A B> 3
<B D> 3
<D F> 2
<F H> 2
<G I> 2
<H K> 2
<I L> 2
<L N> 2
<A B D> 3
<B D F> 2
<D F H> 2
<F H K> 2
<G I L> 2
<I L N> 2
<A B D F> 2
<B D F H> 2
<D F H K> 2
<G I L N> 2
<A B D F H> 2
<B D F H K> 2
<A B D F H K> 2
<A> 3
<B> 3
<D> 3
<F> 3
<G> 2
<H> 2
<I> 2
<K> 2
<L> 2
<N> 3
<A B> 3
<B D> 3
<D F> 2
<F H> 2
<G I> 2
<H K> 2
<I L> 2
<L N> 2
<A B D> 3
<B D F> 2
<D F H> 2
<F H K> 2
<G I L> 2
<I L N> 2
<A B D F> 2
<B D F H> 2
<D F H K> 2
<G I L N> 2
<A B D F H> 2
<B D F H K> 2
<A B D F H K> 2
Hadoop's Master node automatically assigns the n sub-path sequence databases to different Map worker nodes, manages the execution of the parallel Map tasks and the coordination between tasks, and handles the task failures mentioned above. This makes the implementation relatively simple and fast.
Step 7: the Reduce node merges the <key, value> pairs passed from the Map nodes to obtain the global candidate sequence patterns. That is, <key, value> pairs with the same key are combined, converting the <key, value> pairs into <key, set of values associated with this key>. The global candidate sequence patterns produced in the embodiment are shown in the following table.
key    Set of values
<A> {3,3,3}
<B> {2,3,3}
<C> {2}
<D> {2,3,3}
<E> {2}
<F> {4,3,3}
<G> {2,2,2}
<H> {2,2,2}
<I> {2,2,2}
<K> {2,2,2}
<L> {2,2,2}
<N> {2,2,3}
<A B> {2,3,3}
<B D> {2,3,3}
<C E> {2}
<D F> {2,2,2}
<E F> {2}
<F G> {2}
<F H> {2,2,2}
<G I> {2,2,2}
<H K> {2,2,2}
<I L> {2,2,2}
<A B D> {2,3,3}
<B D F> {2,2,2}
<C E F> {2}
<D F H> {2,2,2}
<E F G> {2}
<F G I> {2}
<F H K> {2,2,2}
<G I L> {2,2,2}
<A B D F> {2,2,2}
<B D F H> {2,2,2}
<C E F G> {2}
<D F H K> {2,2,2}
<E F G I> {2}
<F G I L> {2}
<A B D F H> {2,2,2}
<B D F H K> {2,2,2}
<C E F G I> {2}
<E F G I L> {2}
<A B D F H K> {2,2,2}
<C E F G I L> {2}
<L N> {2,2}
<I L N> {2,2}
<G I L N> {2,2}
The merge is completed automatically by Hadoop; its purpose is to avoid counting identical local sequence patterns more than once.
Step 8: scan the original path sequence database stored in HDFS in step 4 again to count the global candidate sequence patterns, and find the sequence patterns whose support is not less than the preset minimum support ξ. The <key, value> pairs output by the embodiment are shown in the following table. The Map tasks produce only local sequence patterns, which do not necessarily satisfy the global minimum support, so the original sequence database is scanned again to obtain the global path sequence patterns: the keys of the global candidate path sequence patterns obtained in step 7 are counted against the original sequence database, yielding the global path sequence patterns, i.e. the keys in the following table.
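The grouping that Hadoop performs between the Map and Reduce phases can be imitated in plain Python. The sketch below is only an illustration of the data transformation, not Hadoop code: merge_pairs and MAP_OUTPUT are assumed names, and MAP_OUTPUT holds a small excerpt of the <key, value> pairs listed above.

```python
from collections import defaultdict


def merge_pairs(pairs):
    """Group <key, value> pairs into <key, list of values for that key>."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


# Excerpt of the Map output of the embodiment.
MAP_OUTPUT = [
    ("<A>", 3), ("<B>", 2), ("<F G>", 2),  # from sub-path sequence database 1
    ("<A>", 3), ("<B>", 3), ("<L N>", 2),  # from sub-path sequence database 2
    ("<A>", 3), ("<B>", 3), ("<L N>", 2),  # from sub-path sequence database 3
]

merged = merge_pairs(MAP_OUTPUT)
for key, values in merged.items():
    print(key, values)
```

Each distinct pattern appears once as a key after the merge, so the subsequent global count visits it only once regardless of how many Map nodes reported it.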
key value
<A> 9
<B> 8
<D> 8
<F> 10
<G> 6
<H> 6
<I> 6
<K> 6
<L> 6
<N> 7
<A B> 8
<B D> 8
<D F> 6
<F H> 6
<G I> 6
<H K> 6
<I L> 6
<A B D> 8
<B D F> 6
<D F H> 6
<F H K> 6
<G I L> 6
<A B D F> 6
<B D F H> 6
<D F H K> 6
<A B D F H> 6
<B D F H K> 6
<A B D F H K> 6
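The global count of step 8 can be sketched as a single pass over the original database for each candidate key. The names below are illustrative assumptions, the candidate list is only an excerpt of the step 7 keys, and a pattern is again assumed to be supported only by consecutive intersections.

```python
from math import ceil

# The 12 path sequences of the original database, in table order.
PATHS = [
    "A B D F H K", "A C E F G I L", "A B D F H K M N", "C E F G I L N",
    "A B D F H K", "C E F G I L N", "A B D G I L N", "A B D F H K M",
    "A B D F H K", "E F G I L N", "A B D G I L N", "A B D F H K M N",
]
DATABASE = [tuple(p.split()) for p in PATHS]

# Excerpt of the global candidate keys produced in step 7.
CANDIDATES = [
    ("A",), ("F", "G"), ("L", "N"),
    ("A", "B", "D", "F", "H", "K"), ("C", "E", "F", "G", "I", "L"),
]


def contains(sequence, pattern):
    """Assumption: the pattern must occur as consecutive items of the sequence."""
    k = len(pattern)
    return any(sequence[i:i + k] == pattern
               for i in range(len(sequence) - k + 1))


def global_patterns(database, candidates, min_support):
    """Count each candidate in the full database and keep the frequent ones."""
    min_count = ceil(min_support * len(database))
    counts = {c: sum(contains(s, c) for s in database) for c in candidates}
    return {c: n for c, n in counts.items() if n >= min_count}


result = global_patterns(DATABASE, CANDIDATES, min_support=0.5)
for pattern, count in result.items():
    print("<" + " ".join(pattern) + ">", count)
```

In this excerpt <F G>, <L N> and <C E F G I L> are locally frequent in at least one sub-database but fall below 6 of 12 sequences globally, which is why the recount against the original database is needed.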
Step 9: generate path association rules from the global path sequence patterns produced in step 8 and compute the confidence of each rule to obtain the vehicle driving path prediction result. Specifically, for each L-path sequence pattern (L>1), the first n items (1≤n<L) are taken as the antecedent of a rule and the remaining L-n items as the consequent; the confidence of the rule is the ratio of the support of the whole path sequence pattern to the support of the antecedent. The path association rules produced and their confidences are shown in the following table:
Path association rule    Confidence
<A>→<B> 88.89%
<B>→<D> 100%
<D>→<F> 75%
<F>→<H> 60%
<G>→<I> 100%
<H>→<K> 100%
<I>→<L> 100%
<A>→<B D> 88.89%
<A B>→<D> 100%
<B>→<D F> 75%
<B D>→<F> 75%
<D>→<F H> 75%
<D F>→<H> 100%
<F>→<H K> 60%
<F H>→<K> 100%
<G>→<I L> 100%
<G I>→<L> 100%
<A>→<B D F> 66.67%
<A B>→<D F> 75%
<A B D>→<F> 75%
<B>→<D F H> 75%
<B D>→<F H> 75%
<B D F>→<H> 100%
<D>→<F H K> 75%
<D F>→<H K> 100%
<D F H>→<K> 100%
<A>→<B D F H> 66.67%
<A B>→<D F H> 75%
<A B D>→<F H> 75%
<A B D F>→<H> 100%
<B>→<D F H K> 75%
<B D>→<F H K> 75%
<B D F>→<H K> 100%
<B D F H>→<K> 100%
<A>→<B D F H K> 66.67%
<A B>→<D F H K> 75%
<A B D>→<F H K> 75%
<A B D F>→<H K> 100%
<A B D F H>→<K> 100%
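Rule generation in step 9 amounts to splitting every global pattern of length greater than 1 at each position. The sketch below uses assumed names (generate_rules, GLOBAL_PATTERNS) and only an excerpt of the global path sequence patterns with their support counts from step 8.

```python
def generate_rules(patterns):
    """patterns: {pattern tuple: support count}.

    Returns {(antecedent, consequent): confidence}, where the antecedent is the
    first n items of a pattern and the consequent is the remaining items.
    """
    rules = {}
    for pattern, support in patterns.items():
        for n in range(1, len(pattern)):
            antecedent, consequent = pattern[:n], pattern[n:]
            # Every prefix of a global pattern is itself a global pattern,
            # so its support count is expected to be present in `patterns`.
            rules[(antecedent, consequent)] = support / patterns[antecedent]
    return rules


# Excerpt of the global path sequence patterns and support counts of step 8.
GLOBAL_PATTERNS = {
    ("A",): 9, ("B",): 8,
    ("A", "B"): 8, ("B", "D"): 8,
    ("A", "B", "D"): 8,
}

rules = generate_rules(GLOBAL_PATTERNS)
for (antecedent, consequent), conf in rules.items():
    print("<" + " ".join(antecedent) + ">", "->",
          "<" + " ".join(consequent) + ">", f"{conf:.2%}")
```

At prediction time, the rules whose antecedent matches the intersections a vehicle has just passed can be ranked by confidence to suggest the most likely continuation.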
In a specific implementation, steps 1 to 5 can be performed by the master node of the Hadoop platform, step 6 by the Map nodes to which the master node assigns the tasks, and steps 7, 8 and 9 by the Reduce node of the Hadoop platform.
The present invention correspondingly provides a vehicle driving path prediction system, in which the following modules are arranged on a Hadoop platform:
Memory confirmation module, for determining, according to the memory of each computer in the Hadoop platform, the memory of the node with the least memory among all nodes, denoted Q;
Longest path sequence confirmation module, for scanning the original path sequence database storing vehicle driving path sequences to obtain the number of path sequences in the original path sequence database, denoted m, where each path sequence contains more than one intersection, and the actual storage size of the longest path sequence in the original path sequence database, denoted P;
Sub-path sequence database partitioning module, for evenly dividing the original path sequence database into n disjoint sub-path sequence databases by horizontal partitioning;
Original database upload module, for uploading the original path sequence database to a specified folder in HDFS;
Sub-database upload module, for uploading the n sub-path sequence databases to another specified folder in HDFS;
Local path sequence pattern module, for causing the master node of the Hadoop platform to assign the n sub-path sequence databases uploaded by the sub-database upload module to different Map nodes, each Map node executing the improved GSP algorithm: according to the preset minimum support ξ, scanning the sub-path sequence database held in the memory of the Map node, computing the local path sequence patterns, and passing them to the Reduce node as <key, value> pairs, where key is a local path sequence pattern and value is its support count;
Each Map node executes the improved GSP algorithm as follows:
Operation a: for the sub-path sequence database assigned to this Map node, scan the sub-path sequence database to obtain the 1-path sequence patterns L_1, and let k = 1;
Operation b: generate the candidate (k+1)-path sequences C_{k+1} from the k-path sequence patterns L_k, scan the sequence database again to compute the support of each candidate path sequence, and obtain the (k+1)-path sequence patterns L_{k+1}, where the candidate (k+1)-path sequences C_{k+1} are generated in one of the following two cases:
(1) when candidate 2-path sequence patterns are generated from the 1-path sequence patterns, scan the adjacency list storing the traffic network information and check the adjacent nodes of each path sequence pattern s_1 in the 1-path sequence patterns L_1; if an adjacent node of s_1 is also in L_1, append the item of that adjacent node to s_1;
(2) when candidate (k+1)-path sequence patterns are generated from the k-path sequence patterns, k>1:
first, for any two path sequence patterns s_1 and s_2 among the k-path sequence patterns, if the path sequence obtained by removing the first item of s_1 is identical to the path sequence obtained by removing the last item of s_2, connect s_1 with s_2; then, prune: if some sub-path sequence contained in a candidate path sequence pattern is not a path sequence pattern, delete that candidate from the candidate path sequence patterns;
Operation c: let k = k + 1 and repeat operation b until no new candidate path sequence is produced;
Global candidate sequence pattern module, for causing the Reduce node to merge the <key, value> pairs passed from the Map nodes to obtain the global candidate sequence patterns;
Global path sequence pattern module, for scanning again the original path sequence database stored in HDFS by the original database upload module to count the global candidate sequence patterns, and finding the sequence patterns whose support is not less than the preset minimum support ξ, to obtain the global path sequence patterns;
Prediction result module, for generating path association rules from the global path sequence patterns produced by the global path sequence pattern module and computing the confidence of each rule, to obtain the vehicle driving path prediction result.
The specific embodiment described herein merely illustrates the spirit of the present invention. Those skilled in the art to which the present invention belongs may make various modifications or supplements to the described specific embodiment, or substitute it in similar ways, without departing from the spirit of the present invention or exceeding the scope defined by the appended claims.

Claims (2)

1. A vehicle driving path prediction method, characterized in that the following steps are performed on a Hadoop platform:
Step 1: according to the memory of each computer in the Hadoop platform, determine the minimum memory among all nodes, denoted Q, in GB;
Step 2: scan the original path sequence database storing vehicle driving path sequences to obtain the number of path sequences in the original path sequence database, denoted m, where each path sequence contains more than one intersection, and the actual storage size of the longest path sequence in the original path sequence database, denoted P, in B;
Step 3: evenly divide the original path sequence database into n disjoint sub-path sequence databases by horizontal partitioning, where P × (m/n) ≤ Q × 10^9;
Step 4: upload the original path sequence database to a specified folder in HDFS;
Step 5: upload the n sub-path sequence databases to another specified folder in HDFS;
Step 6: the master node of the Hadoop platform assigns the n sub-path sequence databases uploaded in step 5 to different Map nodes; each Map node executes the improved GSP algorithm, scanning, according to the preset minimum support ξ, the sub-path sequence database held in the memory of the Map node, computing the local path sequence patterns, and passing them to the Reduce node as <key, value> pairs, where key is a local path sequence pattern and value is its support count;
Each Map node executes the improved GSP algorithm as follows:
Operation a: for the sub-path sequence database assigned to this Map node, scan the sub-path sequence database to obtain the 1-path sequence patterns L_1, and let k = 1;
Operation b: generate the candidate (k+1)-path sequences C_{k+1} from the k-path sequence patterns L_k, scan the sequence database again to compute the support of each candidate path sequence, and obtain the (k+1)-path sequence patterns L_{k+1}, where the candidate (k+1)-path sequences C_{k+1} are generated in one of the following two cases:
(1) when candidate 2-path sequence patterns are generated from the 1-path sequence patterns, scan the adjacency list storing the traffic network information and check the adjacent nodes of each path sequence pattern s_1 in the 1-path sequence patterns L_1; if an adjacent node of s_1 is also in L_1, append the item of that adjacent node to s_1;
(2) when candidate (k+1)-path sequence patterns are generated from the k-path sequence patterns, k>1:
first, for any two path sequence patterns s_1 and s_2 among the k-path sequence patterns, if the path sequence obtained by removing the first item of s_1 is identical to the path sequence obtained by removing the last item of s_2, connect s_1 with s_2;
then, prune: if some sub-path sequence contained in a candidate path sequence pattern is not a path sequence pattern, delete that candidate from the candidate path sequence patterns;
Operation c: let k = k + 1 and repeat operation b until no new candidate path sequence is produced;
Step 7: the Reduce node merges the <key, value> pairs passed from the Map nodes to obtain the global candidate sequence patterns;
Step 8: scan the original path sequence database stored in HDFS in step 4 again to count the global candidate sequence patterns, and find the sequence patterns whose support is not less than the preset minimum support ξ, to obtain the global path sequence patterns;
Step 9: generate path association rules from the global path sequence patterns produced in step 8 and compute the confidence of each rule, to obtain the vehicle driving path prediction result.
2. A vehicle driving path prediction system, characterized in that the following modules are arranged on a Hadoop platform:
Memory confirmation module, for determining, according to the memory of each computer in the Hadoop platform, the memory of the node with the least memory among all nodes, denoted Q, in GB;
Longest path sequence confirmation module, for scanning the original path sequence database storing vehicle driving path sequences to obtain the number of path sequences in the original path sequence database, denoted m, where each path sequence contains more than one intersection, and the actual storage size of the longest path sequence in the original path sequence database, denoted P, in B;
Sub-path sequence database partitioning module, for evenly dividing the original path sequence database into n disjoint sub-path sequence databases by horizontal partitioning, where P × (m/n) ≤ Q × 10^9;
Original database upload module, for uploading the original path sequence database to a specified folder in HDFS;
Sub-database upload module, for uploading the n sub-path sequence databases to another specified folder in HDFS;
Local path sequence pattern module, for causing the master node of the Hadoop platform to assign the n sub-path sequence databases uploaded by the sub-database upload module to different Map nodes, each Map node executing the improved GSP algorithm: according to the preset minimum support ξ, scanning the sub-path sequence database held in the memory of the Map node, computing the local path sequence patterns, and passing them to the Reduce node as <key, value> pairs, where key is a local path sequence pattern and value is its support count;
Each Map node executes the improved GSP algorithm as follows:
Operation a: for the sub-path sequence database assigned to this Map node, scan the sub-path sequence database to obtain the 1-path sequence patterns L_1, and let k = 1;
Operation b: generate the candidate (k+1)-path sequences C_{k+1} from the k-path sequence patterns L_k, scan the sequence database again to compute the support of each candidate path sequence, and obtain the (k+1)-path sequence patterns L_{k+1}, where the candidate (k+1)-path sequences C_{k+1} are generated in one of the following two cases:
(1) when candidate 2-path sequence patterns are generated from the 1-path sequence patterns, scan the adjacency list storing the traffic network information and check the adjacent nodes of each path sequence pattern s_1 in the 1-path sequence patterns L_1; if an adjacent node of s_1 is also in L_1, append the item of that adjacent node to s_1;
(2) when candidate (k+1)-path sequence patterns are generated from the k-path sequence patterns, k>1:
first, for any two path sequence patterns s_1 and s_2 among the k-path sequence patterns, if the path sequence obtained by removing the first item of s_1 is identical to the path sequence obtained by removing the last item of s_2, connect s_1 with s_2;
then, prune: if some sub-path sequence contained in a candidate path sequence pattern is not a path sequence pattern, delete that candidate from the candidate path sequence patterns;
Operation c: let k = k + 1 and repeat operation b until no new candidate path sequence is produced;
Global candidate sequence pattern module, for causing the Reduce node to merge the <key, value> pairs passed from the Map nodes to obtain the global candidate sequence patterns;
Global path sequence pattern module, for scanning again the original path sequence database stored in HDFS by the original database upload module to count the global candidate sequence patterns, and finding the sequence patterns whose support is not less than the preset minimum support ξ, to obtain the global path sequence patterns;
Prediction result module, for generating path association rules from the global path sequence patterns produced by the global path sequence pattern module and computing the confidence of each rule, to obtain the vehicle driving path prediction result.
CN201410628190.1A 2014-11-07 2014-11-07 A kind of vehicle running path Forecasting Methodology and system Expired - Fee Related CN104464344B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410628190.1A CN104464344B (en) 2014-11-07 2014-11-07 A kind of vehicle running path Forecasting Methodology and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410628190.1A CN104464344B (en) 2014-11-07 2014-11-07 A kind of vehicle running path Forecasting Methodology and system

Publications (2)

Publication Number Publication Date
CN104464344A true CN104464344A (en) 2015-03-25
CN104464344B CN104464344B (en) 2016-09-14

Family

ID=52910319

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410628190.1A Expired - Fee Related CN104464344B (en) 2014-11-07 2014-11-07 A kind of vehicle running path Forecasting Methodology and system

Country Status (1)

Country Link
CN (1) CN104464344B (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105716620A (en) * 2016-03-16 2016-06-29 沈阳建筑大学 Navigation method based on cloud computing and big data
CN106652440A (en) * 2015-10-30 2017-05-10 杭州海康威视数字技术股份有限公司 Method and apparatus for determining frequent activity area of vehicle
CN106652557A (en) * 2015-10-28 2017-05-10 现代自动车株式会社 Method and system for predicting driving path of neighboring vehicle
CN107316016A (en) * 2017-06-19 2017-11-03 桂林电子科技大学 Vehicle track statistical method based on Hadoop and monitoring video stream
CN107862868A (en) * 2017-11-09 2018-03-30 泰华智慧产业集团股份有限公司 Method for vehicle track prediction based on big data
CN108292475A (en) * 2015-11-30 2018-07-17 日产自动车株式会社 Method and device for generating predicted vehicle information used for traveling of vehicle road network
CN108717786A (en) * 2018-07-17 2018-10-30 南京航空航天大学 Traffic accident causation mining method based on universality meta-rule
CN110084402A (en) * 2019-03-25 2019-08-02 广东工业大学 Bus self-adaptive scheduling method based on station optimization and ant tracing
CN113468245A (en) * 2021-07-19 2021-10-01 金陵科技学院 Dynamic minimum support degree calculation method for rail transit application

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007218777A (en) * 2006-02-17 2007-08-30 Aisin Aw Co Ltd Route search method, route information system, navigation system and statistical processing server
JP2007230454A (en) * 2006-03-02 2007-09-13 Toyota Motor Corp Course setting method, device, and automatic driving system
CN102509170A (en) * 2011-10-10 2012-06-20 浙江鸿程计算机系统有限公司 Location prediction system and method based on historical track data mining
CN103298059A (en) * 2013-05-13 2013-09-11 西安电子科技大学 Connectivity sensing routing method on basis of location prediction in vehicle ad hoc network
CN103366566A (en) * 2013-06-25 2013-10-23 中国科学院信息工程研究所 Running track prediction method aiming at specific vehicle potential group
CN103929804A (en) * 2014-03-20 2014-07-16 南京邮电大学 Position predicting method based on user moving rule

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李伟亮 等: "基于MAPREDUCE并行处理的轨迹模式挖掘算法的研究", 《智能处理与应用》 *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10793162B2 (en) 2015-10-28 2020-10-06 Hyundai Motor Company Method and system for predicting driving path of neighboring vehicle
CN106652557B (en) * 2015-10-28 2021-03-26 现代自动车株式会社 Method and system for predicting driving path of neighboring vehicle
CN106652557A (en) * 2015-10-28 2017-05-10 现代自动车株式会社 Method and system for predicting driving path of neighboring vehicle
CN106652440B (en) * 2015-10-30 2019-05-21 杭州海康威视数字技术股份有限公司 Method and apparatus for determining frequent activity area of vehicle
CN106652440A (en) * 2015-10-30 2017-05-10 杭州海康威视数字技术股份有限公司 Method and apparatus for determining frequent activity area of vehicle
CN108292475B (en) * 2015-11-30 2021-11-02 日产自动车株式会社 Method and device for generating predicted vehicle information used for traveling of vehicle road network
CN108292475A (en) * 2015-11-30 2018-07-17 日产自动车株式会社 Method and device for generating predicted vehicle information used for traveling of vehicle road network
CN105716620A (en) * 2016-03-16 2016-06-29 沈阳建筑大学 Navigation method based on cloud computing and big data
CN105716620B (en) * 2016-03-16 2018-03-23 沈阳建筑大学 Navigation method based on cloud computing and big data
CN107316016B (en) * 2017-06-19 2020-06-23 桂林电子科技大学 Vehicle track statistical method based on Hadoop and monitoring video stream
CN107316016A (en) * 2017-06-19 2017-11-03 桂林电子科技大学 Vehicle track statistical method based on Hadoop and monitoring video stream
CN107862868B (en) * 2017-11-09 2019-08-20 泰华智慧产业集团股份有限公司 Method for vehicle track prediction based on big data
CN107862868A (en) * 2017-11-09 2018-03-30 泰华智慧产业集团股份有限公司 Method for vehicle track prediction based on big data
CN108717786A (en) * 2018-07-17 2018-10-30 南京航空航天大学 Traffic accident causation mining method based on universality meta-rule
CN110084402A (en) * 2019-03-25 2019-08-02 广东工业大学 Bus self-adaptive scheduling method based on station optimization and ant tracing
CN110084402B (en) * 2019-03-25 2022-03-11 广东工业大学 Bus self-adaptive scheduling method based on station optimization and ant tracing
CN113468245A (en) * 2021-07-19 2021-10-01 金陵科技学院 Dynamic minimum support degree calculation method for rail transit application
CN113468245B (en) * 2021-07-19 2023-05-05 金陵科技学院 Dynamic minimum support calculation method for rail transit application

Also Published As

Publication number Publication date
CN104464344B (en) 2016-09-14

Similar Documents

Publication Publication Date Title
CN104464344B (en) Vehicle driving path prediction method and system
Ma et al. Large-scale demand driven design of a customized bus network: A methodological framework and Beijing case study
CN101325004B (en) Method for compensating real time traffic information data
Zhang et al. Network‐wide traffic speed forecasting: 3D convolutional neural network with ensemble empirical mode decomposition
CN110866649A (en) Method and system for predicting short-term subway passenger flow and electronic equipment
CN113763700B (en) Information processing method, information processing device, computer equipment and storage medium
Ma et al. Evolution regularity mining and gating control method of urban recurrent traffic congestion: a literature review
Xia et al. A parallel grid-search-based SVM optimization algorithm on Spark for passenger hotspot prediction
Sahoo et al. Study and analysis of smart applications in smart city context
Yedavalli et al. Microsimulation analysis for network traffic assignment (MANTA) at metropolitan-scale for agile transportation planning
Wang et al. A unified framework with multi-source data for predicting passenger demands of ride services
Nemtinov et al. Information support of decision making in urban passenger transport management
CN113312760A (en) Traffic simulation-based networked motor vehicle right turn trajectory planning method and device
Wei et al. Bi-level programming model for multi-modal regional bus timetable and vehicle dispatch with stochastic travel time
Hsueh et al. A short-term traffic speed prediction model based on LSTM networks
Kee et al. Multi-label classification of estimated time of arrival with ensemble neural networks in bus transportation network
CN105139328B (en) Real-time travel time prediction method and device for license plate recognition data
CN111199247A (en) Bus operation simulation method
Fafoutellis et al. Dilated LSTM networks for short-term traffic forecasting using network-wide vehicle trajectory data
CN112669604B (en) Urban traffic scheduling method and device
Asimakopoulos et al. Towards a dynamic waste collection management system using real-time and forecasted data
Xia et al. A distributed EMDN-GRU model on Spark for passenger waiting time forecasting
Asimakopoulos et al. Architecture and Implementation Issues, Towards a Dynamic Waste Collection Management System
Jain et al. Enhance traffic flow prediction with real-time vehicle data integration
Liu et al. An intelligent urban traffic data fusion analysis method based on improved artificial neural network

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160914

Termination date: 20171107
