CN108764555B - A method for location selection of shared bicycle parking spots based on Hadoop - Google Patents

Publication number
CN108764555B
Authority
CN
China
Prior art keywords
demand
points
point
parking
bicycles
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810493379.2A
Other languages
Chinese (zh)
Other versions
CN108764555A (en
Inventor
陈观林
史豫坤
徐煌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou City University
Original Assignee
Zhejiang University City College ZUCC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University City College ZUCC filed Critical Zhejiang University City College ZUCC
Priority to CN201810493379.2A priority Critical patent/CN108764555B/en
Publication of CN108764555A publication Critical patent/CN108764555A/en
Application granted granted Critical
Publication of CN108764555B publication Critical patent/CN108764555B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0631 Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G06Q10/06315 Needs-based resource requirements planning or analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40 Business processes related to the transportation industry

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a Hadoop-based site selection method for shared bicycle parking points, comprising: 1) shared bicycle demand point prediction based on a distributed clustering algorithm; 2) a shared bicycle parking point site selection model based on multi-objective optimization; 3) a model solving algorithm based on the NSGA-II algorithm. Hadoop is used to implement the algorithm: after the population is initialized, the evolution of each generation is completed by one MapReduce task. The beneficial effects of the invention are as follows: the proposed Hadoop-based site selection method is expected to improve the rationality and accuracy of site selection for shared bicycle parking points and to make the management of shared bicycles more standardized; for the large amount of shared bicycle travel data already generated, the patent establishes a demand prediction model based on travel data and predicts shared bicycle demand points through the Hadoop framework and a clustering algorithm.

Description

Shared bicycle parking point site selection method based on Hadoop
Technical Field
The invention relates to the technical field of computer information processing, and in particular to a Hadoop-based site selection method for shared bicycle parking points.
Background
In recent years, with social development and rising living standards, attitudes toward travel have changed, and low-carbon travel has become a central theme. The shared bicycle, a product combining modern technology with the public bicycle, emerged after the public bicycle and quickly took a core position in the market. It overcomes the inherent drawbacks of public bicycles, such as borrowing and returning at fixed points and inconvenient deposit refunds, fits people's travel routes better, and is genuinely convenient. Because it can be parked anywhere, a large number of users choose shared bicycles for travel.
However, the rapid growth of shared bicycles has also created many problems. Because shared bicycles lack reasonable layout and scientific planning, disorderly parking, serious vehicle damage, and local road congestion caused by bicycles not being cleared in time seriously affect people's lives. Reasonable planning of shared bicycle parking points is therefore particularly important: if parking points are chosen poorly, the same problems that affected traditional public bicycles will arise and many users will give up riding. Against the background of smart city construction and the big data era, how to lay out and plan shared bicycle parking points reasonably is of great significance.
Patent CN201710764773.0, "A method and an apparatus for determining a shared parking spot", provides a method comprising: acquiring walking track data in a preset area; clustering the walking tracks with a preset clustering algorithm based on the position coordinates of the track points they contain; and determining shared bicycle parking points according to the distribution of the real street paths corresponding to the walking tracks in each category. By using walking track data, this method determines which real street paths carry users with riding needs and places shared bicycle parking points on those paths. It therefore gives guidance for determining parking points, serves more users with riding needs, and balances the allocation of shared bicycle resources.
Patent 201710517669.1, "A method and apparatus for determining a shared parking spot", provides a method comprising: determining, from a preset area, the sub-areas whose function matches the functional area information with parking demand for the current time period; classifying those sub-areas with a preset classification algorithm; and taking the central position of each category as a shared bicycle parking point. This method associates the time factor with the sub-areas that have parking demand, classifies the sub-areas with parking demand in the current time period, and uses the center of each class as a parking point. Shared bicycle managers can thus be guided in scheduling bicycles, users' demand in the preset area during the current time period is met, cases where supply falls short of demand in some region are reduced as far as possible, and user experience improves.
However, these methods only predict shared bicycle parking points. They do not make full use of the historical travel data of shared bicycles, cannot judge whether a parking point is reasonable, and do not link parking points to the areas that government departments can actually plan, so the accuracy and operability of the resulting parking points are poor.
In summary, the key to managing shared bicycles is to match user demand with the number of bicycles, so that shared bicycles deliver their full value and the problems they cause are resolved. At present, the low accuracy of parking point site selection leads to an unreasonable supply and demand relationship, unreasonable resource distribution, and disordered management.
Disclosure of Invention
The invention aims to provide a Hadoop-based site selection method for shared bicycle parking points, addressing the disordered management of shared bicycles caused by unreasonable parking point site selection. The invention is realized by the following technical scheme:
The Hadoop-based site selection method for shared bicycle parking points consists of three parts: shared bicycle demand point prediction based on a distributed clustering algorithm, a shared bicycle parking point site selection model based on multi-objective optimization, and a model solving algorithm based on the NSGA-II algorithm.
(1) Shared bicycle demand point prediction based on a distributed clustering algorithm: the accuracy of demand point prediction is critical to the site selection of shared bicycle parking points. Traditional demand prediction relies mainly on experience and small-scale data statistics, so it is not accurate enough and the planning of rental points is not reasonable enough. Because GPS-equipped shared bicycles have generated a large amount of real user travel data, the invention proposes a more reasonable demand point prediction model to predict shared bicycle demand points.
(2) Shared bicycle parking point site selection model based on multi-objective optimization: after the demand points are obtained, site selection and allocation are carried out between the demand points and the plannable points, and a parking point site selection model is established with the objectives of the shortest total user travel distance and the smallest total cost of shared bicycle parking points.
(3) Model solving algorithm based on the NSGA-II algorithm: the model is a classical bi-objective optimization problem. Its two objective functions cannot be optimized simultaneously, so the model has multiple feasible solutions. The method solves the model on the basis of NSGA-II, a mature multi-objective evolutionary algorithm, adapts NSGA-II to the model, and gives it a distributed implementation to address problems such as long running time.
The general structure of the method is shown in FIG. 1, and the specific implementation steps are as follows:
Step one: shared bicycle demand point prediction based on a distributed clustering algorithm
Since their launch, shared bicycles have generated massive user data, and the demand points of shared bicycles are predicted from these data. The travel data include time, bicycle number, bicycle type, GPS location information, and so on. A sample of the data is shown in FIG. 2. The bicycle data at a given moment are clustered to form several demand areas within a certain range; the cluster center of each demand area is taken as a demand point, and the number of shared bicycles in each demand area is taken as the demand of that demand point. The demand point prediction model framework provided by the invention is shown in FIG. 3, and the specific flow of the model is as follows:
1) According to the actual situation of the demand points, set the two Canopy thresholds: T1 is the maximum distance between demand points and T2 is the maximum range of each demand point.
2) Execute the Canopy algorithm to obtain the number and positions of the demand points.
3) Screen the generated demand points and delete isolated points with little demand to obtain a new data set.
4) Take the number of remaining demand points as the K value and their positions as the initial cluster centers, then iterate the K-means algorithm to obtain the final clustering result.
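The four steps above can be sketched in plain Python on a single machine. This is only an illustration under stated assumptions, not the patented Hadoop implementation: the function names, the `min_demand` cut-off for isolated points, and the Euclidean distance on 2-D coordinates are all hypothetical, and the sketch follows the usual Canopy convention that the loose threshold `t1` is larger than the tight threshold `t2`.

```python
import math

def dist(a, b):
    """Euclidean distance between two 2-D points (assumed metric)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def canopy(points, t1, t2):
    """Canopy pass: returns (center, members) pairs; assumes t1 > t2."""
    remaining = list(points)
    canopies = []
    while remaining:
        center = remaining.pop(0)
        members = [center] + [p for p in remaining if dist(center, p) < t1]
        # Points inside the tight threshold can no longer seed a canopy.
        remaining = [p for p in remaining if dist(center, p) >= t2]
        canopies.append((center, members))
    return canopies

def kmeans(points, centers, iterations=20):
    """Plain K-means started from the given centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda k: dist(p, centers[k]))
            clusters[nearest].append(p)
        new_centers = []
        for k, members in enumerate(clusters):
            if members:
                new_centers.append((sum(x for x, _ in members) / len(members),
                                    sum(y for _, y in members) / len(members)))
            else:
                new_centers.append(centers[k])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters

def predict_demand_points(points, t1, t2, min_demand):
    """Steps 1-4: Canopy, drop low-demand canopies, then seed K-means."""
    seeds = [c for c, members in canopy(points, t1, t2) if len(members) >= min_demand]
    if not seeds:
        return []
    kept = [p for p in points if any(dist(p, s) < t1 for s in seeds)]
    centers, clusters = kmeans(kept, seeds)
    return [(c, len(members)) for c, members in zip(centers, clusters)]
```

Each returned pair is a demand point position and its demand (the number of bicycles in the cluster), which matches the role the text assigns to the clustering output.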
Step two: shared bicycle parking point site selection model based on multi-objective optimization
Put simply, the site selection and planning problem for shared bicycle parking points is to find an optimized allocation between the demand of each demand point and each candidate planned parking point, that is, to obtain the number of bicycles that each shared bicycle demand point allocates to each planned parking point.
The model takes the minimum total construction cost of the shared bicycle parking points and the minimum total user travel distance as its optimization objectives. The specific mathematical model is expressed as:
[Equations (1) to (6) appear as images in the original publication (BDA0001668501310000031 to BDA0001668501310000036) and are not reproduced in this text.]
In the formulas:
i: index over the set of demand points {1, 2, 3, ..., I};
j: index over the set of planned parking points {1, 2, 3, ..., J};
n_i: the bicycle demand of demand point i;
d_ij: the distance from demand point i to candidate planned parking point j;
x_ij: the number of bicycles of demand point i allocated to candidate planned parking point j;
c_j: the total number of bicycles allocated to parking point j;
m: the capital construction cost of each candidate planned parking point;
c: the basic number of bicycles planned for each candidate planned parking point; each bicycle beyond this number adds a construction and management cost Y;
y_j: whether candidate planned parking point j is established;
a_j: the number of bicycles at candidate planned parking point j in excess of the basic number.
Objective function (1) minimizes the total distance from the bicycles at the demand points to the candidate planned parking points; objective function (2) minimizes the total cost of the parking points. Equation (3) states that all shared bicycles at the demand points are allocated to parking points. Equation (4) computes the number of bicycles at each parking point after allocation. Equation (5) states that if the number of bicycles at a parking point after allocation is 0, that planned parking point is not established. Equation (6) gives the number of bicycles at a planned parking point in excess of the basic number.
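Since the equation images are not available in this text, the following is one formulation consistent with the symbol definitions and the descriptions of (1) to (6). It is a reconstruction offered as an assumption, not the original equations; the objective names Z_1 and Z_2 are introduced here for illustration only.

```latex
\begin{align}
\min Z_1 &= \sum_{i}\sum_{j} d_{ij}\,x_{ij} \tag{1}\\
\min Z_2 &= \sum_{j} \left( m\,y_j + Y\,a_j \right) \tag{2}\\
\text{s.t.}\quad \sum_{j} x_{ij} &= n_i \quad \forall i \tag{3}\\
c_j &= \sum_{i} x_{ij} \quad \forall j \tag{4}\\
y_j &= \begin{cases} 0, & c_j = 0\\ 1, & c_j > 0 \end{cases} \quad \forall j \tag{5}\\
a_j &= \max\left(0,\; c_j - c\right) \quad \forall j \tag{6}
\end{align}
```

Under this reading, the two objectives pull in opposite directions: opening more parking points shortens travel distance but raises the fixed cost term m y_j, which is why the model yields a set of trade-off solutions rather than a single optimum.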
Step three: model solving algorithm based on the NSGA-II algorithm
The model is solved as follows:
Step 1: Read the original data: the demand point set, the candidate facility point set, the bicycle demand of each demand point, the distance from each demand point to each candidate planned parking point, the capital construction cost of each candidate planned parking point, and so on.
Step 2: Encode the individuals with matrix encoding, initialize them within the admissible range of the variables, and generate a population of N individuals.
Step 3: Compute the two objective function values of each individual and perform fast non-dominated sorting on the individuals according to their fitness values.
Step 4: Compute the crowding distance of each individual in the population using the crowding distance calculation method.
Step 5: Using the improved adaptive crossover and mutation operators, compute the crossover probability and mutation probability of each individual, then apply selection, crossover, and mutation to the population to generate a new offspring population.
Step 6: Merge the parent and offspring populations under the elitist strategy into a combined population of size 2N.
Step 7: Perform fast non-dominated sorting and crowding distance calculation on the merged population and keep the best N individuals as the new parent population.
Step 8: Repeat Step 5.
Step 9: Decide whether to perform recombination crossover according to the adaptive adjustment of feasible and infeasible solutions.
Step 10: Repeat Step 6 to obtain a new offspring population.
Step 11: Check whether the number of generations exceeds the maximum number of iterations or a termination condition is met. If so, end the program; otherwise go to Step 7 and continue.
The algorithm flow is shown in FIG. 4. The algorithm yields the optimal solution set of the model, from which parking points can be selected and planned.
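Steps 3, 4, and 7 rely on the standard NSGA-II building blocks of fast non-dominated sorting and crowding distance. The sketch below shows those textbook procedures for minimization problems; it does not include the patent's improved operators or constraint handling, and the function names are chosen here for illustration.

```python
def dominates(a, b):
    """True if objective vector a dominates b (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def fast_non_dominated_sort(objs):
    """Return a list of fronts, each a list of indices into objs."""
    n = len(objs)
    dominated = [[] for _ in range(n)]  # indices that solution p dominates
    counts = [0] * n                    # number of solutions dominating p
    fronts = [[]]
    for p in range(n):
        for q in range(n):
            if dominates(objs[p], objs[q]):
                dominated[p].append(q)
            elif dominates(objs[q], objs[p]):
                counts[p] += 1
        if counts[p] == 0:
            fronts[0].append(p)
    while fronts[-1]:
        nxt = []
        for p in fronts[-1]:
            for q in dominated[p]:
                counts[q] -= 1
                if counts[q] == 0:
                    nxt.append(q)
        fronts.append(nxt)
    return fronts[:-1]  # drop the trailing empty front

def crowding_distance(objs, front):
    """Crowding distance of each index in one front."""
    distance = {i: 0.0 for i in front}
    for m in range(len(objs[0])):
        ordered = sorted(front, key=lambda i: objs[i][m])
        lo, hi = objs[ordered[0]][m], objs[ordered[-1]][m]
        distance[ordered[0]] = distance[ordered[-1]] = float("inf")
        if hi == lo:
            continue
        for k in range(1, len(ordered) - 1):
            gap = objs[ordered[k + 1]][m] - objs[ordered[k - 1]][m]
            distance[ordered[k]] += gap / (hi - lo)
    return distance

def select_next_generation(objs, n):
    """Step 7: keep the best n indices from the merged population."""
    chosen = []
    for front in fast_non_dominated_sort(objs):
        if len(chosen) + len(front) <= n:
            chosen.extend(front)
        else:
            d = crowding_distance(objs, front)
            ranked = sorted(front, key=lambda i: -d[i])
            chosen.extend(ranked[: n - len(chosen)])
            break
    return chosen
```

Here `objs` would hold one (total distance, total cost) pair per individual of the merged 2N population, and `select_next_generation(objs, N)` returns the indices of the N survivors.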
Hadoop is used to implement the algorithm: after population initialization, the evolution of each generation is completed by one MapReduce task. The Map stage computes individual fitness, using the subgroup number of the node as the key and the individual together with its fitness as the value. Because this computation is time-consuming, it is run in parallel. The Reduce stage groups the values that share the same key and then applies selection, crossover, mutation, and related operations to the subgroup on each node, which keeps the evolution of each subgroup relatively independent. Because the populations on different nodes do not affect one another, the evolution operations run in parallel across multiple Reduce nodes. The parallelization flow is shown in FIG. 5.
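The key and value layout described above can be imitated in a few lines of single-process Python. This is a sketch of the data flow only, not Hadoop code: `run_generation`, `evaluate`, and `evolve` are hypothetical names, and the shuffle step stands in for the grouping that the MapReduce framework performs between Map and Reduce.

```python
from collections import defaultdict

def map_phase(records, evaluate):
    """Map: emit (subgroup id, (individual, fitness)) for each input record."""
    for subgroup_id, individual in records:
        yield subgroup_id, (individual, evaluate(individual))

def shuffle(pairs):
    """Group values by key, as the framework does between Map and Reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped, evolve):
    """Reduce: evolve each subgroup independently of the others."""
    return {key: evolve(values) for key, values in grouped.items()}

def run_generation(records, evaluate, evolve):
    """One generation: fitness in Map, per-subgroup evolution in Reduce."""
    return reduce_phase(shuffle(map_phase(records, evaluate)), evolve)

def keep_best_half(values):
    """Stand-in for selection, crossover and mutation: keep the fitter half."""
    ranked = sorted(values, key=lambda v: v[1])
    return [individual for individual, _ in ranked[: max(1, len(ranked) // 2)]]
```

In a real deployment `evolve` would contain the selection, crossover, and mutation operators; `keep_best_half` is only a placeholder that makes the flow observable.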
The beneficial effects of the invention are as follows. The Hadoop-based site selection method for shared bicycle parking points is expected to improve the rationality and accuracy of parking point site selection, making the management of shared bicycles more standardized. For the large amount of shared bicycle travel data already generated, a demand prediction model based on travel data is established, and shared bicycle demand points are predicted through the Hadoop framework and a clustering algorithm. Because demand points cannot always be used as parking points, a multi-objective parking point site selection model is established with the objectives of the shortest total travel distance and the smallest total construction cost; the model gives the positions of the shared bicycle parking points and the number of bicycles each can hold. Finally, the model is solved with the improved NSGA-II, and the computation is implemented on the Hadoop framework. The site selection problem for shared bicycle parking points is thereby solved to a certain extent, making site selection more scientific and rational.
Drawings
FIG. 1 is the general structure diagram of the Hadoop-based site selection method for shared bicycle parking points according to the invention;
FIG. 2 is a partial view of the shared bicycle riding data used in the invention;
FIG. 3 is the demand point prediction model framework of the invention;
FIG. 4 is a flow chart of solving the model with the improved NSGA-II algorithm;
FIG. 5 is a flow chart of solving the model on the Hadoop framework;
FIG. 6 is the parallelized Canopy-Kmeans flow adopted by the model in step one of the invention;
FIG. 7 is the flow of the MapReduce-based parallel K-means algorithm of the invention.
Detailed Description
The present invention will be further described with reference to the following examples. The following examples are set forth merely to aid in the understanding of the invention. It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.
As one embodiment, the Hadoop-based site selection method for shared bicycle parking points comprises the following steps:
Step one: shared bicycle demand point prediction based on a distributed clustering algorithm
The Canopy-Kmeans algorithm is implemented in parallel on Hadoop to solve the model of FIG. 3. Under the MapReduce framework, the solution can be split into several subtasks; the flow is shown in FIG. 6, where each dashed box contains an independent MapReduce task. Using Canopy together with K-means removes the uncertainty of choosing the K value manually, mitigates the local optima and instability caused by randomly chosen initial cluster centers, reduces the influence of isolated points on the clustering result, and greatly improves the clustering performance of K-means.
First, the collected GPS data of shared bicycles for a given time period are organized into files and stored in HDFS. In the first stage, Hadoop runs the Canopy algorithm in parallel and writes the result to files. The cluster centers in these files are visualized on a map, isolated points are filtered out, and the processed file is stored in HDFS. In the next stage, Hadoop runs K-means clustering on this file and outputs the result, namely the position, demand quantity, and vehicle information of each demand point. This output serves as the input data for the third stage.
Step two: shared bicycle parking point site selection model based on multi-objective optimization
When the shared bicycle parking point model is established, certain assumptions are made to improve its feasibility. Analysis shows that the site selection problem for shared bicycle parking points has the following characteristics:
(1) The land chosen for shared bicycle parking points is part of urban construction and development, and such projects are large and long-term. In addition, designing the electronic fences for the parking points is expensive, and land planning and daily operation require substantial funds. The cost constraint is therefore an important part of the site selection model.
(2) The area in which shared bicycle parking points are to be built is divided into several electronic fence areas according to land use and geographic conditions. Because travelers' willingness to walk is limited by distance, the number of bicycles parked in each electronic fence area has an upper limit, and construction and management costs increase when the number of shared bicycles exceeds the basic parking capacity.
(3) Besides cost, how convenient travel becomes is also a key factor in judging the quality of a site selection. It is assumed here that a traveler using a shared bicycle always chooses the nearest parking point. To meet the needs of different travelers, the distance from demand points to parking points should be as short as possible, so that the parking point locations are widely accepted and random parking of shared bicycles is reduced. To simplify the model, the locations of all shared bicycles within a demand area are treated as the location of the demand point, and the center of each electronic fence is treated as the location of the parking point. So that all bicycles in each demand area are as close as possible to a parking point, the model takes the total distance from the bicycles in all demand areas to the planned parking points as an optimization objective, minimizing the distance from all demand points to the parking points.
Step three, model solving algorithm based on improved NSGA-II
1 Encoding scheme
The standard NSGA-II algorithm uses real-number or binary encoding, and one-dimensional real or binary encoding cannot adequately reflect the various allocation combinations of the individuals in this model. For the shared bicycle parking point model, individuals are therefore encoded as real matrices. The specific form is given by formula (3-1):
[Formula (3-1) appears as an image in the original publication (BDA0001668501310000071) and is not reproduced in this text.]
In the formula, P_k is the k-th individual in the population; X_{i,j}, the element in row i and column j of the encoding matrix, is the number of bicycles that the i-th demand point assigns to the j-th parking point; V_i is the allocation of the i-th demand point across the parking points; R_j is the allocation that the j-th parking point receives from each demand point.
With matrix encoding, the allocation scheme carried by each individual is represented directly, the diversity of the population is preserved during crossover and mutation, and premature local convergence in the early stage is avoided.
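A small sketch may help make the matrix encoding concrete. It builds a random individual whose row i sums to the demand n_i, then evaluates the two objectives. The objective formulas used here (distance-weighted allocation, plus a fixed cost m per opened point and a cost Y per bicycle above the basic number c) are an assumed reading of the symbol definitions, since the original equation images are unavailable; all function and parameter names are hypothetical.

```python
import random

def random_individual(demands, n_parking, rng=random):
    """Return a matrix x where x[i][j] bicycles go from demand point i
    to parking point j and row i sums to demands[i]."""
    matrix = []
    for n_i in demands:
        row = [0] * n_parking
        for _ in range(n_i):
            row[rng.randrange(n_parking)] += 1
        matrix.append(row)
    return matrix

def objectives(x, d, m, c, y_cost):
    """Return (total distance, total cost) for one individual.

    Assumed cost model: m for every parking point that receives bicycles,
    plus y_cost for each bicycle above the basic number c.
    """
    n_rows, n_cols = len(x), len(x[0])
    total_distance = sum(d[i][j] * x[i][j]
                         for i in range(n_rows) for j in range(n_cols))
    loads = [sum(row[j] for row in x) for j in range(n_cols)]  # c_j
    total_cost = sum((m if load > 0 else 0) + y_cost * max(0, load - c)
                     for load in loads)
    return total_distance, total_cost
```

For example, with x = [[2, 0], [1, 3]], d = [[1, 2], [3, 1]], m = 10, c = 2, and a per-bicycle excess cost of 1, the total distance is 8 and the total cost is 22 under this assumed cost model.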
2 Crossover operator and mutation operator
Because individuals are encoded as real matrices, the crossover and mutation operators of NSGA-II are redesigned. The standard NSGA-II algorithm uses fixed crossover and mutation operators, so the crossover probability P_c and the mutation probability P_m are fixed values and cannot meet the changing needs of the population during evolution. New crossover and mutation operators are proposed to address this.
1) Crossover operator
Traditional crossover operators generally use single-point or two-point crossover, which gives insufficient gene exchange between individuals. Here, crossover is applied to a chosen column of the matrix. The two individuals to be crossed are as follows:
[The matrix forms of the parents P_1 and P_2 appear as images in the original publication (BDA0001668501310000081, BDA0001668501310000082) and are not reproduced in this text.]
The individuals produced by the crossover operation are C_1 and C_2, expressed as follows:
[The matrix forms of the offspring C_1 and C_2 appear as images in the original publication (BDA0001668501310000083, BDA0001668501310000084) and are not reproduced in this text.]
Here i is a randomly generated crossover point with i between 1 and N, and λ is defined by:
[The definition of λ appears as an image in the original publication (BDA0001668501310000085) and is not reproduced in this text.]
P_1.rank denotes the non-dominated sorting level of individual P_1, and P_2.rank denotes that of individual P_2. The crossover operator is thus tied to the Pareto non-dominated sorting level of each individual. Early in the run, individuals with a small Pareto non-dominated sorting value contribute a larger share to the offspring, so λ is larger; as the algorithm proceeds, individuals converge toward the same Pareto front and λ gradually tends to 0.5. With this crossover strategy, the better genes of the parents are inherited and the diversity of the population is improved.
2) Mutation operator
A traditional mutation operator selects a single node of an individual and mutates it. Because real matrix encoding is used, mutating a single node is not suitable. Given the model and the encoding scheme, mutation is instead applied to an entire column, as described below. The individual to be mutated is:
P = [R_1^P, R_2^P, ..., R_N^P]   (3-7)
After mutation, individual P becomes:
Q = [R_1^P, R_2^P, ..., R_i, ..., R_N^P]   (3-8)
where R_i is a randomly generated column that replaces the original i-th column.
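The column-replacement mutation of (3-7) and (3-8) can be sketched as follows. The way the random column is drawn (uniform integers between 0 and a hypothetical `max_value`) is an assumption, since the text only says the column is generated randomly. A column drawn this way can break the row-sum constraint, which is consistent with the text treating infeasible solutions separately.

```python
import random

def mutate_column(individual, max_value, rng=random):
    """Return (mutated copy, column index) with one column regenerated.

    individual: list of rows; column j plays the role of R_j in (3-7).
    max_value:  hypothetical upper bound for each new entry.
    """
    col = rng.randrange(len(individual[0]))
    mutated = [row[:] for row in individual]  # leave the parent untouched
    for row in mutated:
        row[col] = rng.randint(0, max_value)
    return mutated, col
```

Returning the column index is a convenience for inspection and testing; an evolutionary loop would normally keep only the mutated matrix.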
The description above gives the specific processes of the crossover and mutation operators. Among the parameters of a genetic algorithm, the choice of the crossover probability P_c and the mutation probability P_m is the key to its performance. The larger P_c is, the faster new individuals are produced; if P_m is too large, good genetic patterns are more likely to be destroyed; if P_c is too small, the search becomes slow. For different optimization problems, repeated experiments are needed to determine P_c and P_m, and it is difficult to find values that suit every problem. Because the standard NSGA-II algorithm uses fixed crossover and mutation probabilities, the adaptive genetic algorithm proposed by M. Srinivas et al. [44] is introduced here.
Under this strategy, when an individual's fitness is below the population average, the individual is judged to perform poorly and is given a larger crossover rate and mutation rate, which promotes the generation of individuals with new patterns. When the individual's fitness is greater than or equal to the average, the individual is judged to carry better pattern genes and is given a smaller crossover rate and mutation rate, so that the better pattern genes in the population are not destroyed. The corresponding model is given below: formula (3-9) is the crossover rate adjustment function and formula (3-10) is the mutation rate adjustment function.
Figure BDA0001668501310000091
Figure BDA0001668501310000092
Here, P_c is the crossover rate of the individuals to be crossed, P_m is the mutation rate of the individual to be mutated, f_max is the maximum fitness in the population, f_avg is the average fitness of the population, f' is the larger fitness of the two individuals to be crossed, f is the fitness of the individual to be mutated, k_1 and k_2 are the parameters of the crossover-rate adjustment function, and k_3 and k_4 are the parameters of the mutation-rate adjustment function. In general, k_1 = k_2 and k_3 = k_4.
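The adaptive rule described above can be illustrated with a short sketch. The formula images for (3-9) and (3-10) are not available in this text, so the piecewise-linear form below is an assumption modeled on the commonly cited adaptive genetic algorithm of Srinivas et al.: a fixed rate for below-average individuals and a rate that shrinks linearly to zero as fitness approaches f_max. The function name and the parameter values in the usage lines are hypothetical, and fitness is assumed to be "larger is better".

```python
def adaptive_rate(fitness, f_max, f_avg, k_scaled, k_fixed):
    """Adaptive crossover or mutation rate (assumed form, see lead-in).

    fitness  : f' (larger fitness of the two parents) for crossover,
               or f (fitness of the individual) for mutation
    k_scaled : coefficient applied when fitness >= f_avg (k_1 or k_3)
    k_fixed  : constant rate used when fitness < f_avg (k_2 or k_4)
    """
    if fitness < f_avg:
        return k_fixed                    # poor individual: keep the larger, fixed rate
    if f_max == f_avg:
        return k_scaled                   # all fitness values equal: avoid division by zero
    return k_scaled * (f_max - fitness) / (f_max - f_avg)


# Hypothetical parameter choice with k_1 = k_2 and k_3 = k_4.
p_c = adaptive_rate(8.0, f_max=10.0, f_avg=6.0, k_scaled=0.8, k_fixed=0.8)
p_m = adaptive_rate(5.0, f_max=10.0, f_avg=6.0, k_scaled=0.1, k_fixed=0.1)
```

With k_scaled equal to k_fixed, the rate is continuous at f_avg, which is consistent with the remark that k_1 = k_2 and k_3 = k_4 in general.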
3 Improvement of the constrained optimization process
In practical applications, the true optimal solutions of many constrained multi-objective optimization problems lie near the constraint boundaries, and infeasible solutions located near those boundaries often have better objective function values than some feasible solutions inside the feasible region. Exploiting these high-quality infeasible solutions increases the search speed toward the feasible region. Because the model has constraints, the population produces infeasible solutions during evolution. To take the influence of infeasible solutions on the population fully into account, feasible and infeasible solutions are considered at the same time: every several generations, a better set of feasible solutions and a better set of infeasible solutions are selected for genetic operations.
In those generations, the infeasible and feasible solutions are recombined and crossed, and an adaptive strategy determines in which generations this is done. Because evolution moves toward the feasible region and the optimal solutions, the number of feasible solutions keeps increasing. Performing too many genetic operations between feasible and infeasible solutions in the later stage of evolution may instead impair the search within the feasible region, so the number of direct crossovers between infeasible and feasible solutions should decrease gradually as evolution proceeds. To address this, while feasible and infeasible solutions undergo genetic operations separately, the generations in which they are crossed with each other are adjusted adaptively; that is, recombination crossover between feasible and infeasible solutions is performed when the population evolution generation is k:
Figure BDA0001668501310000101
In formula (3-11), T is the total number of generations of population evolution. The formula shows that, as the generation number increases, operations between feasible and infeasible solutions become gradually less frequent.
The NSGA-II algorithm is coded according to the above requirements, with initial population size N = 100, initial crossover probability P_c = 0.8, mutation probability P_m = 0.1, and maximum number of iterations max = 100. The algorithm is executed in parallel on Hadoop as shown in Figure 5, and the Pareto-optimal solution set of plannable parking spots is finally output for the decision maker to choose from.

Claims (3)

1. A Hadoop-based method for selecting locations of shared bicycle parking spots, characterized by comprising the following steps:
Step 1: prediction of shared bicycle demand points based on a distributed clustering algorithm:
the trip data comprise the time, bicycle number, bicycle type, and GPS position information; the collected GPS data of shared bicycles for a given time period are organized into files and stored in the HDFS file system; the bicycle data at a given moment are analyzed by clustering to form a number of demand areas of limited extent, the cluster center of each demand area is taken as a demand point, and the number of shared bicycles within the demand area is taken as the demand of that demand point; the specific process of demand point prediction based on the distributed clustering algorithm is as follows:
1) according to the actual situation of the demand points, the two Canopy thresholds are set, where T1 is the maximum distance between demand points and T2 is the maximum range of each demand point;
2) Hadoop executes the Canopy algorithm in parallel to obtain the number and the positions of the demand points; the resulting data are output as a file;
3) the cluster centers in the file are visualized on a map, the generated demand points are screened, and isolated points with low demand are deleted to obtain a new data set; the processed data set file is stored in the HDFS file system;
4) the number of remaining demand points is taken as the K value and the positions of the demand points as the initial cluster centers; Hadoop executes the K-means algorithm iteratively to obtain the final clustering result, and the processed file is output;
Step 2: location model for shared bicycle parking spots based on multi-objective optimization:
the parking spot location planning problem is to optimize the allocation of the demand of each demand point among the candidate plannable parking spots, yielding the number of bicycles that each shared bicycle demand point allocates to each planned parking spot;
the model takes the minimum total construction cost of the shared bicycle parking spots and the shortest total travel distance of users as its optimization objectives; the mathematical model is expressed as:
Figure FDA0003142969250000011
Figure FDA0003142969250000012
Figure FDA0003142969250000013
Figure FDA0003142969250000014
Figure FDA0003142969250000021
Figure FDA0003142969250000022
where:
I: the set of demand points {1, 2, 3, ..., i};
J: the set of planned parking spots {1, 2, 3, ..., j};
n_i: the bicycle demand of demand point i;
d_ij: the distance from demand point i to candidate planned parking spot j;
x_ij: the number of bicycles that demand point i allocates to candidate planned parking spot j;
c_j: the total number of bicycles allocated to parking spot j after allocation;
M: the basic construction cost of each candidate planned parking spot;
c: the basic number of bicycles planned for each candidate planned parking spot; each bicycle beyond the basic number adds a construction and management cost Y;
y_j: whether the candidate planned parking spot is built;
a_j: the number of bicycles by which the candidate planned parking spot exceeds the basic number of bicycles;
wherein objective function (1) minimizes the total distance from the bicycles at the demand points to the candidate planned parking spots; objective function (2) minimizes the total cost required for the parking spots; formula (3) states that all shared bicycles at each demand point are allocated to parking spots; formula (4) computes the number of bicycles at each parking spot after allocation; formula (5) states that a planned parking spot is not built if its number of bicycles after allocation is 0; formula (6) gives the number of bicycles exceeding the basic number of the planned parking spot;
Step 3: model solving algorithm based on the NSGA-II algorithm.
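As an illustration of the model in claim 1, the following sketch evaluates the two objectives for a given allocation matrix. The formula images (1) to (6) are not reproduced in this text, so the expressions below are reconstructed from the symbol definitions and the verbal descriptions only and should be read as assumptions; the function names `evaluate` and `is_feasible` are hypothetical.

```python
def evaluate(x, d, M, c, Y):
    """Return (total_distance, total_cost) for allocation matrix x.

    x[i][j] : bicycles allocated from demand point i to candidate spot j
    d[i][j] : distance from demand point i to candidate spot j
    M       : basic construction cost of a parking spot
    c       : basic number of bicycles per parking spot
    Y       : extra cost per bicycle beyond the basic number
    """
    n_points = len(x)
    n_spots = len(x[0])
    # Objective (1), assumed form: distance weighted by the number of bicycles.
    total_distance = sum(d[i][j] * x[i][j]
                         for i in range(n_points) for j in range(n_spots))
    total_cost = 0
    for j in range(n_spots):
        c_j = sum(x[i][j] for i in range(n_points))  # (4): bicycles at spot j
        y_j = 1 if c_j > 0 else 0                    # (5): build only if used
        a_j = max(0, c_j - c)                        # (6): bicycles over the basic number
        total_cost += M * y_j + Y * a_j              # objective (2), assumed form
    return total_distance, total_cost


def is_feasible(x, n):
    """Constraint (3): every demand point allocates exactly its demand n[i]."""
    return all(sum(row) == n_i for row, n_i in zip(x, n))
```

For example, with `x = [[3, 0], [1, 2]]`, `d = [[1, 5], [2, 1]]`, `M = 100`, `c = 3`, and `Y = 10`, the sketch returns a total distance of 7 and a total cost of 210.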
2. The Hadoop-based method for selecting locations of shared bicycle parking spots according to claim 1, characterized in that, in Step 3, the algorithm for solving the model comprises the following steps:
Step 1: read the original data: the set of demand points, the set of candidate facility points, the bicycle demand of each demand point, the distance from each demand point to each candidate planned parking spot, and the basic construction cost of each candidate planned parking spot;
Step 2: encode the individuals of the population using matrix coding, initialize the individuals within the admissible range of the variables, and generate a population containing N individuals;
Step 3: compute the two objective function values of each individual in the population, and perform fast non-dominated sorting of the individuals according to their fitness values;
Step 4: compute the crowding degree of the individuals in the population according to the crowding degree calculation method;
Step 5: according to the improved adaptive crossover operator and mutation operator, obtain the crossover probability and the mutation probability of each individual, and then perform selection, crossover, and mutation on the population to generate a new offspring population;
Step 6: using the elitist strategy, merge the parent and offspring populations into a large population of 2N individuals;
Step 7: perform fast non-dominated sorting and crowding degree calculation on the merged population, and select the better N individuals to form the new parent population;
Step 8: repeat Step 5;
Step 9: according to the adaptive adjustment for feasible and infeasible solutions, determine whether to perform recombination crossover;
Step 10: repeat Step 6 to obtain the new offspring population;
Step 11: determine whether the evolution generation exceeds the maximum number of iterations or the termination condition is satisfied; if so, the program ends; otherwise, t = t + 1 and execution continues from Step 7.
3. The Hadoop-based method for selecting locations of shared bicycle parking spots according to claim 2, characterized in that Step 3 is implemented with Hadoop: after the population is initialized, each generation of evolution is completed by one MapReduce job; the Map stage computes the individual fitness, with the node sub-population number as the key and the individual together with its fitness as the value; because these operations are usually time-consuming, they are executed in parallel; the Reduce stage groups the values that share the same key and then performs selection, crossover, and mutation on the sub-population of each node, which keeps the evolution of the sub-populations relatively independent; because the populations on different nodes do not affect one another, the evolution operations on the population are likewise carried out in parallel across multiple Reduce nodes.
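The fast non-dominated sorting used in Steps 3 and 7 of claim 2 can be sketched as below. This follows the standard NSGA-II procedure for objectives that are all minimized, as the two objectives of the model are; it is a single-machine illustration, not the Hadoop implementation of claim 3, and the function names are chosen for this example.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def fast_non_dominated_sort(objs):
    """Return a list of fronts; each front is a list of indices into objs."""
    n = len(objs)
    dominated_set = [[] for _ in range(n)]  # indices that individual p dominates
    counts = [0] * n                        # how many individuals dominate p
    fronts = [[]]
    for p in range(n):
        for q in range(n):
            if dominates(objs[p], objs[q]):
                dominated_set[p].append(q)
            elif dominates(objs[q], objs[p]):
                counts[p] += 1
        if counts[p] == 0:
            fronts[0].append(p)             # nobody dominates p: first front
    k = 0
    while fronts[k]:
        next_front = []
        for p in fronts[k]:
            for q in dominated_set[p]:
                counts[q] -= 1
                if counts[q] == 0:          # q is dominated only by earlier fronts
                    next_front.append(q)
        fronts.append(next_front)
        k += 1
    return fronts[:-1]                      # drop the trailing empty front
```

The index of the front that contains an individual is its non-dominated sorting level, which is the quantity that Step 7 combines with the crowding degree when it selects the better N individuals.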
CN201810493379.2A 2018-05-22 2018-05-22 A method for location selection of shared bicycle parking spots based on Hadoop Active CN108764555B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810493379.2A CN108764555B (en) 2018-05-22 2018-05-22 A method for location selection of shared bicycle parking spots based on Hadoop

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810493379.2A CN108764555B (en) 2018-05-22 2018-05-22 A method for location selection of shared bicycle parking spots based on Hadoop

Publications (2)

Publication Number Publication Date
CN108764555A CN108764555A (en) 2018-11-06
CN108764555B true CN108764555B (en) 2021-08-31

Family

ID=64008547

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810493379.2A Active CN108764555B (en) 2018-05-22 2018-05-22 A method for location selection of shared bicycle parking spots based on Hadoop

Country Status (1)

Country Link
CN (1) CN108764555B (en)

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109902969A (en) * 2019-03-13 2019-06-18 武汉大学 A planning method of shared bicycle placement point based on OD data
CN110147885B (en) * 2019-05-07 2021-01-05 浙江大学城市学院 Shared bicycle parking point distribution method for improving genetic algorithm
CN110175704B (en) * 2019-05-07 2022-12-27 南京师范大学 Method for dividing standard parking area of shared vehicles
CN110059986B (en) * 2019-05-08 2021-02-19 武汉大学 A kind of shared bicycle dynamic delivery method and system
CN110458309B (en) * 2019-06-29 2023-07-11 东南大学 Network about splicing station point location method based on actual road network environment
CN112185014A (en) * 2020-04-07 2021-01-05 江苏智途科技股份有限公司 Method for judging rationality of parking points of shared bicycle
CN111551958B (en) * 2020-04-28 2022-04-01 北京踏歌智行科技有限公司 Mining area unmanned high-precision map manufacturing method
CN111881939B (en) * 2020-06-24 2021-03-09 东南大学 A method for the layout of shared bicycle parking areas based on clustering algorithm
CN112131330B (en) * 2020-09-16 2024-01-26 上海交通大学 Method for selecting and laying out operation area of shared automobile in free flow mode
CN112163788B (en) * 2020-10-21 2024-05-31 深圳市规划国土发展研究中心 Scheduling method of Internet pile-free bicycle based on real-time data
CN113095670B (en) * 2021-04-08 2022-07-15 上海市城市建设设计研究总院(集团)有限公司 Planning and site selection method for shared bicycle storage points
CN113095406B (en) * 2021-04-14 2022-04-26 国能智慧科技发展(江苏)有限公司 Electronic fence effective time period management and control method based on intelligent Internet of things
CN113850310B (en) * 2021-09-16 2024-11-29 杭州电子科技大学 Shared bicycle electronic fence planning method based on land block subdivision and regional maximum coverage
CN113920713B (en) * 2021-09-29 2024-01-23 广州时空位置网科学技术研究院有限公司 Urban traffic intelligent information management system based on Beidou positioning
CN114327859B (en) * 2021-11-18 2024-06-07 西安电子科技大学 Source model clustering selection method for proxy optimization of large-scale problems in cloud computing environment
CN114386500A (en) * 2021-12-31 2022-04-22 苏州市公安局 Infectious disease sampling point selection method, device, equipment and storage medium
CN114971728A (en) * 2022-06-01 2022-08-30 中国银行股份有限公司 Bank outlet layout method and device
CN114971328B (en) * 2022-06-02 2024-09-24 郑州轻工业大学 A regional scheduling method for shared cars based on PSO-DE
CN116824833B (en) * 2023-03-02 2024-09-17 四川国蓝中天环境科技集团有限公司 Method for optimizing position of shared bicycle electronic fence based on grid division
CN116702954B (en) * 2023-05-16 2024-08-20 苏州大学 Long-term dynamic site selection method, device and readable storage medium for movable service facilities
CN116665455B (en) * 2023-07-18 2023-12-01 北京阿帕科蓝科技有限公司 Vehicle station selection method, device and computer equipment
CN118072502B (en) * 2024-04-17 2024-06-21 北京工业大学 Planning method and device for electronic fence, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105095266A (en) * 2014-05-08 2015-11-25 中国科学院声学研究所 Method and system for clustering optimization based on Canopy algorithm
CN107392239A (en) * 2017-07-11 2017-11-24 南京邮电大学 A kind of K Means algorithm optimization methods based on Spark computation models
CN107463620A (en) * 2017-07-05 2017-12-12 洛川闰土农牧科技有限责任公司 A kind of elevator accident early-warning and predicting system based on data mining
CN107871184A (en) * 2017-11-16 2018-04-03 南京邮电大学 A site selection method for electric vehicle charging stations for regional charging facilities
CN108038575A (en) * 2017-12-20 2018-05-15 广西大学 Waypoint location planing method based on modified NSGA II

Also Published As

Publication number Publication date
CN108764555A (en) 2018-11-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220719

Address after: 310015 No. 51, Huzhou street, Hangzhou, Zhejiang

Patentee after: HANGZHOU City University

Address before: 310015 No. 50 Huzhou Street, Hangzhou City, Zhejiang Province

Patentee before: Zhejiang University City College
