CN108647360A - A multithreaded method for accessing and processing taxi big data - Google Patents

A multithreaded method for accessing and processing taxi big data

Info

Publication number
CN108647360A
Authority
CN
China
Prior art keywords
data
multithreading
taxi
spark
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810480023.5A
Other languages
Chinese (zh)
Other versions
CN108647360B (en)
Inventor
孙玲
张琨
施佺
陆俊天
吕心钰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nantong University
Original Assignee
Nantong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nantong University
Priority to CN201810480023.5A
Publication of CN108647360A
Application granted
Publication of CN108647360B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40 Business processes related to the transportation industry

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Educational Administration (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Game Theory and Decision Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The multithreaded method of the present invention for accessing and processing taxi big data comprises the following steps. Step 1) Convert the longitude/latitude coordinates: coordinates in the international WGS-84 standard are converted twice, first to GCJ-02 and then to BD-09, to obtain coordinates that display accurately on Baidu Maps; using Spark under the Hadoop parallel computing framework, the coordinate conversion is applied to every longitude/latitude record through the map operation of a resilient distributed dataset (RDD). Step 2) Clean the data, calculate indicators, divide regions, and convert the data; within each operation, Spark performs multithreaded parallel computation on every record. Step 3) Store the data in the distributed file system HDFS. Advantageous effects: with taxi big data as the background, and in addition to the existing data processing tools of Spark combined with HDFS, a multithreaded parallel processing mechanism is added to further improve processing efficiency; Python's decorator feature and multithreading module are used to improve the synchronization of the multiple processing procedures.

Description

A multithreaded method for accessing and processing taxi big data
Technical field
The present invention relates to the field of big data, and in particular to a Spark-based multithreaded method for accessing and processing taxi big data.
Background
The traditional way of accessing data on a Hadoop cluster is MapReduce, Hadoop's parallel computing framework. Because MapReduce processes data step by step, its performance is limited and its disk I/O overhead is high: each pass reads data from the cluster, performs part of the processing, writes the result back to the cluster, and then repeats until processing is complete. Spark, by contrast, can complete all data analysis in memory in close to "real time": it reads the data from the cluster, completes all analysis and processing, and only then writes the result back to the cluster. Spark's batch processing is therefore nearly 10 times faster than MapReduce, and its in-memory data analysis is nearly 100 times faster. The traditional way of accessing data through a Hadoop cluster thus struggles to meet current requirements for accessing, processing, and analyzing massive data, and is even less suited to real-time data operations.
Summary of the invention
The present invention aims to solve the problems of inefficient access to massive data, multi-task cooperation, and single-task data processing in existing systems. Using historical and real-time traffic data obtained through a data interface, it processes the data in parallel with Spark's efficient in-memory computing and resilient distributed datasets (RDDs), and sets up multithreaded operation so that the results of several kinds of data processing are generated simultaneously and stored in the distributed file system HDFS. It thereby provides a Spark-based multithreaded method for accessing and processing taxi big data, implemented by the following technical scheme:
The multithreaded method for accessing and processing taxi big data comprises the following steps:
Step 1) Convert the longitude/latitude coordinates: coordinates in the international WGS-84 standard are converted twice, first to GCJ-02 and then to BD-09, to obtain coordinates that display accurately on Baidu Maps; using Spark under the Hadoop parallel computing framework, the coordinate conversion is applied to every longitude/latitude record through the map operation of a resilient distributed dataset (RDD);
Step 2) Clean the data, calculate indicators, divide regions, and convert the data; within each operation, Spark performs multithreaded parallel computation on every record;
Step 3) Store the data in the distributed file system HDFS.
In a further design of the method, Spark comprises SparkContext, the Cluster Manager, and Executors, and Spark's workflow comprises the following steps:
Step a) After the application is submitted with spark-submit, SparkContext is initialized at the corresponding location according to the parameters set at submission, and the DAG Scheduler and Task Scheduler are created. The Driver executes the application code and divides the whole program into multiple jobs according to its action operators; a DAG is built within each job; the DAG Scheduler divides the DAG into multiple stages and divides each stage into multiple tasks that form a taskSet; the DAG Scheduler passes the taskSet to the Task Scheduler, which is responsible for scheduling the tasks on the cluster;
Step b) The Driver requests resources from the Cluster Manager according to the resource requirements in SparkContext; the resources include the number of Executors and memory;
Step c) After receiving the request, the resource manager creates Executor processes on worker nodes that meet the conditions;
Step d) Once created, each Executor process registers back with the Driver so that it can receive the tasks the Driver assigns;
Step e) After the program has finished, the Driver releases the requested resources through the ResourceManager.
In a further design of the method, the data cleaning in step 2) comprises the following steps:
First, remove data that fall outside the actual coordinate range;
Second, check the features for outliers. If the data follow a normal distribution, apply the 3σ rule, where σ denotes the standard deviation: data not within 3 standard deviations of the mean are judged to be outliers. If the data do not follow a normal distribution, apply the box-plot rule: find the first, second, and third quartiles and compute the interquartile range; data inside the set range are retained, and data outside it are judged to be outliers and rejected;
Third, remove redundant data.
In a further design of the method, in the second step of data cleaning the first quartile is denoted Q1, the second quartile is the median, the third quartile is denoted Q3, and the interquartile range is IQR = Q3 - Q1; the set range is (Q1 - 1.5*IQR, Q3 + 1.5*IQR).
In a further design of the method, the region division in step 2) cuts the map into a regular grid by coordinates; the longitude and latitude steps are adjusted in combination with the transport-capacity indicator data, and the cell size is finally set to 0.03° of longitude by 0.02° of latitude.
In a further design of the method, the data conversion in step 2) converts the data format along the horizontal time dimension, the vertical dimension, feature correlation, and spatial distribution.
In a further design of the method, the indicator calculation in step 2) computes the transport-capacity indicators, which include inflow, outflow, retention, and empty capacity, and the operating indicators, which include the empty-running rate, passenger volume, boardings, and alightings.
In a further design of the method, the multithreaded parallel computation in step 2) is as follows: first, the metaclass and its constructor are initialized; then subclasses are defined for the individual operations (data cleaning, region division, indicator calculation, and data conversion), each inheriting the methods of the metaclass, and a Python decorator is added to the subclasses so that calling the metaclass interface invokes the method of every subclass; finally, the runner of each subclass is executed in parallel.
The advantages of the present invention are as follows:
Using historical and real-time traffic data obtained through a data interface, the present invention processes the data in parallel with Spark's efficient in-memory computing and resilient distributed datasets (RDDs), and sets up multithreaded operation so that the results of several kinds of data processing are generated simultaneously and stored in the distributed file system HDFS. With taxi big data as the background, and in addition to the existing data processing tools of Spark combined with HDFS, the Spark-based multithreaded method of the present invention adds a multithreaded parallel processing mechanism to further improve processing efficiency, using Python's decorator feature and multithreading module to improve the synchronization of the multiple processing procedures.
Brief description of the drawings
Fig. 1 is a flowchart of how Spark works.
Fig. 2 is a flowchart of the HDFS write process.
Fig. 3 is a flowchart of the multithreaded data processing example.
Fig. 4 is an example of regular region division.
Fig. 5 is a flowchart of the multithreading implementation.
Detailed description
The technical scheme of the present invention is described in detail below with reference to the drawings.
As shown in Fig. 3, the multithreaded method of this embodiment for accessing and processing taxi big data comprises the following steps:
Step 1) Convert the longitude/latitude coordinates: coordinates in the international WGS-84 standard are converted twice, first to GCJ-02 and then to BD-09, to obtain coordinates that display accurately on Baidu Maps. Using Spark under the Hadoop parallel computing framework, the coordinate conversion is applied to every longitude/latitude record through the map operation of a resilient distributed dataset (RDD).
Step 2) Clean the data, calculate indicators, divide regions, and convert the data; within each operation, Spark performs multithreaded parallel computation on every record.
Step 3) Store the data in the distributed file system HDFS.
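The two-stage conversion in step 1) can be sketched in plain Python as follows. The patent does not give the conversion formulas; the constants and polynomial terms below are those of the widely circulated open-source approximation for WGS-84 to GCJ-02 and GCJ-02 to BD-09, and are an assumption here, as are the function names. The built-in map stands in for the RDD map operation.

```python
import math

# Constants of the commonly circulated approximation (assumed, not from the patent).
X_PI = math.pi * 3000.0 / 180.0
A = 6378245.0                  # semi-major axis of the Krasovsky 1940 ellipsoid
EE = 0.00669342162296594323    # its first eccentricity squared


def _offset_lat(x, y):
    ret = -100.0 + 2.0 * x + 3.0 * y + 0.2 * y * y + 0.1 * x * y + 0.2 * math.sqrt(abs(x))
    ret += (20.0 * math.sin(6.0 * x * math.pi) + 20.0 * math.sin(2.0 * x * math.pi)) * 2.0 / 3.0
    ret += (20.0 * math.sin(y * math.pi) + 40.0 * math.sin(y / 3.0 * math.pi)) * 2.0 / 3.0
    ret += (160.0 * math.sin(y / 12.0 * math.pi) + 320.0 * math.sin(y * math.pi / 30.0)) * 2.0 / 3.0
    return ret


def _offset_lng(x, y):
    ret = 300.0 + x + 2.0 * y + 0.1 * x * x + 0.1 * x * y + 0.1 * math.sqrt(abs(x))
    ret += (20.0 * math.sin(6.0 * x * math.pi) + 20.0 * math.sin(2.0 * x * math.pi)) * 2.0 / 3.0
    ret += (20.0 * math.sin(x * math.pi) + 40.0 * math.sin(x / 3.0 * math.pi)) * 2.0 / 3.0
    ret += (150.0 * math.sin(x / 12.0 * math.pi) + 300.0 * math.sin(x / 30.0 * math.pi)) * 2.0 / 3.0
    return ret


def wgs84_to_gcj02(lng, lat):
    """First conversion: WGS-84 -> GCJ-02 (approximate)."""
    dlat = _offset_lat(lng - 105.0, lat - 35.0)
    dlng = _offset_lng(lng - 105.0, lat - 35.0)
    radlat = lat / 180.0 * math.pi
    magic = 1 - EE * math.sin(radlat) ** 2
    sqrt_magic = math.sqrt(magic)
    dlat = (dlat * 180.0) / ((A * (1 - EE)) / (magic * sqrt_magic) * math.pi)
    dlng = (dlng * 180.0) / (A / sqrt_magic * math.cos(radlat) * math.pi)
    return lng + dlng, lat + dlat


def gcj02_to_bd09(lng, lat):
    """Second conversion: GCJ-02 -> BD-09 (Baidu Maps coordinates)."""
    z = math.sqrt(lng * lng + lat * lat) + 0.00002 * math.sin(lat * X_PI)
    theta = math.atan2(lat, lng) + 0.000003 * math.cos(lng * X_PI)
    return z * math.cos(theta) + 0.0065, z * math.sin(theta) + 0.006


def wgs84_to_bd09(lng, lat):
    """Apply both conversions in sequence."""
    return gcj02_to_bd09(*wgs84_to_gcj02(lng, lat))


# Hypothetical sample points; with PySpark the same function could be
# applied per record, e.g. rdd.map(lambda p: wgs84_to_bd09(*p)).
points = [(116.404, 39.915), (120.894, 31.980)]
converted = list(map(lambda p: wgs84_to_bd09(*p), points))
```

Because the conversion is a pure function of one record, it needs no shared state, which is what makes it a natural fit for a per-record map.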
As shown in Fig. 1, Spark comprises SparkContext, the Cluster Manager, and Executors. The Cluster Manager is the external service that acquires resources on the cluster. There are currently three types: Standalone, Apache Mesos, and Hadoop YARN. Standalone is Spark's native resource manager, in which the Master is responsible for allocating resources. Apache Mesos is a resource scheduling framework that is well compatible with Hadoop MapReduce. Hadoop YARN mainly refers to the ResourceManager in YARN. This embodiment uses YARN as the resource manager; the essence of YARN's layered structure is the ResourceManager, the entity that controls the whole cluster and manages the allocation of basic computing resources to applications. An Executor is a process that a Spark application runs on a worker node. SparkContext is Spark's runtime environment. Spark's workflow comprises the following steps:
Step a) After the application is submitted with spark-submit, SparkContext is initialized at the corresponding location according to the parameters set at submission, and the DAG Scheduler and Task Scheduler are created. The Driver executes the application code and divides the whole program into multiple jobs according to its action operators; a DAG is built within each job; the DAG Scheduler divides the DAG into multiple stages and divides each stage into multiple tasks that form a taskSet; the DAG Scheduler passes the taskSet to the Task Scheduler, which is responsible for scheduling the tasks on the cluster.
Step b) The Driver requests resources from YARN's ResourceManager according to the resource requirements in SparkContext; the resources include the number of Executors and memory. In Spark, the Driver runs the main function of the Spark application and creates SparkContext. SparkContext is created to prepare the runtime environment of the Spark application; it is responsible for communicating with the Cluster Manager, requesting resources, and assigning and monitoring tasks. After the Executors have finished running, the Driver is also responsible for closing SparkContext. SparkContext is commonly used to stand for the Driver.
Step c) After receiving the request, the resource manager creates Executor processes on worker nodes that meet the conditions.
Step d) Once created, each Executor process registers back with the Driver so that it can receive the tasks the Driver assigns.
Step e) After the program has finished, the Driver releases the requested resources through the ResourceManager.
The DAG Scheduler is a high-level scheduling layer that implements stage-based scheduling: it divides each job into stages, divides each stage into multiple tasks, and submits each stage as a taskSet to the underlying Task Scheduler for execution. The Task Scheduler is the other important scheduler in SparkContext besides the DAG Scheduler; it is responsible for dispatching the tasks generated by the DAG Scheduler to Executors for execution.
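To make the submission in step a) and the resource request in step b) concrete, the sketch below assembles a spark-submit command line for a YARN cluster. The helper name, the application file taxi_pipeline.py, and the values of 5 Executors and 4g of memory are hypothetical; only the spark-submit options themselves are standard. The command is built but not executed.

```python
import shlex


def build_submit_command(app_path, num_executors, executor_memory, master="yarn"):
    """Assemble a spark-submit argument list (hypothetical helper).

    --num-executors and --executor-memory carry the Executor count and
    memory that the Driver later requests from the resource manager.
    """
    return [
        "spark-submit",
        "--master", master,
        "--deploy-mode", "cluster",
        "--num-executors", str(num_executors),
        "--executor-memory", executor_memory,
        app_path,
    ]


# Hypothetical application file and resource values.
cmd = build_submit_command("taxi_pipeline.py", num_executors=5, executor_memory="4g")
print(shlex.join(cmd))
```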
As shown in Fig. 2, the workflow of the distributed file system (hereinafter HDFS) is as follows. In this embodiment HDFS is deployed on 2 Namenodes and 5 Datanodes in the Hadoop cluster, and data are stored as files. The Namenodes store the metadata files, which include the timestamp, path, and count information of the stored data; the Datanodes store the actual file data and replicate each file in at least two copies kept on different Datanodes. When HDFS data files are read, Spark's caching mechanism is used to cache the metadata files that are needed over the long term, so the whole file information no longer has to be traversed: querying the metadata held by the Namenode is enough to locate the actual data. Accessing massive data through the Hadoop cluster, and processing it in parallel with efficient Spark in-memory computing combined with multithreading, greatly improves performance over a single node and a traditional relational database. The system is also highly reliable and highly scalable: nodes can be added as needed to meet the storage demands of the actual data.
Taking the HDFS write process as an example: the Namenode is responsible for the metadata of all files stored on HDFS. It first confirms the client's request, then records the file name and the set of Datanodes that store the file, and keeps this information in a file allocation table in memory. For example, the client sends a request to the Namenode saying that it wants to write the file "gcwk.csv" to HDFS; this file holds data such as taxi longitude/latitude and passenger-carrying status. The flow, shown in Fig. 2, is then:
Step 1: The client sends a request to the Namenode to write the file "gcwk.csv". (①)
Step 2: The Namenode replies, telling the client to write the file to Datanodes A, B, and D and to contact Datanode B directly. (②)
Step 3: The client sends a request to Datanode B, asking it to save a copy of "gcwk.csv" and to send two replicas, one to Datanode A and one to Datanode D. (③)
Step 4: Datanode B sends a request to Datanode A, asking it to save a copy of "gcwk.csv" and to send a replica to Datanode D. (④)
Step 5: Datanode A sends a request to Datanode D, asking it to save a copy of "gcwk.csv". (⑤)
Step 6: Datanode D receives the data and returns an acknowledgment to Datanode A. (⑤)
Step 7: Datanode A receives the acknowledgment and returns an acknowledgment to Datanode B. (④)
Step 8: Datanode B receives the acknowledgment and returns an acknowledgment to the client, which marks the end of the whole write process. (⑥)
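The eight steps above form a replication pipeline (client to B, B to A, A to D) with acknowledgments flowing back in reverse. The toy in-memory simulation below illustrates only that ordering; the classes are hypothetical stand-ins, not the HDFS API, and the pipeline order is passed in by hand rather than chosen by a placement policy.

```python
class DataNode:
    """Toy stand-in for a Datanode that keeps file replicas in a dict."""

    def __init__(self, name):
        self.name = name
        self.files = {}

    def write(self, filename, data, downstream, acks):
        # Save the local replica, then forward it along the pipeline.
        self.files[filename] = data
        if downstream:
            downstream[0].write(filename, data, downstream[1:], acks)
        # Acknowledge only after every downstream node has acknowledged.
        acks.append(self.name)


class NameNode:
    """Toy stand-in for the Namenode: records which nodes hold each file."""

    def __init__(self, datanodes):
        self.datanodes = datanodes
        self.metadata = {}

    def allocate(self, filename, pipeline_names):
        self.metadata[filename] = list(pipeline_names)
        return [self.datanodes[name] for name in pipeline_names]


nodes = {name: DataNode(name) for name in "ABD"}
namenode = NameNode(nodes)

# Steps 1-2: the client asks the Namenode and is told to contact B first.
pipeline = namenode.allocate("gcwk.csv", ["B", "A", "D"])

# Steps 3-8: write through the pipeline and collect acknowledgments.
acks = []
pipeline[0].write("gcwk.csv", b"lng,lat,occupied\n", pipeline[1:], acks)
print(acks)  # order in which the Datanodes acknowledged
```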
The data cleaning in step 2) comprises the following steps:
First, remove data that fall outside the actual coordinate range.
Second, check the features for outliers. If the data follow a normal distribution, apply the 3σ rule, where σ denotes the standard deviation: data not within 3 standard deviations of the mean are judged to be outliers. If the data do not follow a normal distribution, apply the box-plot rule: find the first, second, and third quartiles and compute the interquartile range; data inside the set range are retained, and data outside it are judged to be outliers and rejected.
Third, remove redundant data.
Further, in the second step of data cleaning the first quartile is denoted Q1, the second quartile is the median, the third quartile is denoted Q3, and the interquartile range is IQR = Q3 - Q1; the set range is (Q1 - 1.5*IQR, Q3 + 1.5*IQR).
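The two outlier rules of the second cleaning step can be sketched as follows. The patent does not say how normality is tested, so the is_normal flag is an assumed input; the function names are illustrative, and the quartiles use the default method of Python's statistics.quantiles, which may differ slightly from other quartile conventions.

```python
import statistics


def three_sigma_filter(values):
    """Keep values within 3 standard deviations of the mean."""
    mean = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return [v for v in values if abs(v - mean) <= 3 * sigma]


def iqr_filter(values, k=1.5):
    """Keep values inside the open range (Q1 - k*IQR, Q3 + k*IQR)."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low < v < high]


def remove_outliers(values, is_normal):
    """Choose the rule by the (externally supplied) normality decision."""
    return three_sigma_filter(values) if is_normal else iqr_filter(values)


# Hypothetical feature values with one obvious outlier.
speeds = [1, 2, 3, 4, 5, 6, 7, 8, 100]
print(remove_outliers(speeds, is_normal=False))
```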
As shown in Fig. 4, the region division in step 2) cuts the map into a regular grid by coordinates; the longitude and latitude steps are adjusted in combination with the transport-capacity indicator data, and the cell size is finally set to 0.03° of longitude by 0.02° of latitude.
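A regular grid of this cell size maps a coordinate to a cell index with a floor division, as in the minimal sketch below. The grid origin and the sample point are hypothetical; only the 0.03° and 0.02° steps come from the text.

```python
import math

LNG_STEP = 0.03  # longitude width of one cell, in degrees
LAT_STEP = 0.02  # latitude height of one cell, in degrees


def grid_cell(lng, lat, origin_lng, origin_lat):
    """Return the (column, row) of the grid cell containing the point."""
    col = math.floor((lng - origin_lng) / LNG_STEP)
    row = math.floor((lat - origin_lat) / LAT_STEP)
    return col, row


# Hypothetical origin (south-west corner of the study area) and point.
cell = grid_cell(120.90, 32.05, origin_lng=120.80, origin_lat=31.90)
print(cell)
```

Points that share a (column, row) pair fall in the same region, so the pair can serve directly as a grouping key for the indicator calculation.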
The data conversion in step 2) converts the data format along the horizontal time dimension, the vertical dimension, feature correlation, and spatial distribution.
The indicator calculation in step 2) computes the transport-capacity indicators, which include inflow, outflow, retention, and empty capacity, and the operating indicators, which include the empty-running rate, passenger volume, boardings, and alightings.
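The patent names these indicators without defining them. Purely as an illustration, the sketch below derives three of the operating indicators under assumed definitions: a boarding is a change from empty to occupied in one vehicle's time-ordered records, an alighting is the reverse change, and the empty-running rate is the share of records in the empty state. The record layout (vehicle_id, timestamp, occupied) is also an assumption.

```python
from itertools import groupby
from operator import itemgetter


def operating_indicators(records):
    """Count boardings and alightings and compute a record-based empty rate.

    records: iterable of (vehicle_id, timestamp, occupied) tuples
    (assumed layout; the definitions used here are illustrative only).
    """
    ordered = sorted(records, key=itemgetter(0, 1))
    boardings = alightings = 0
    for _vehicle, group in groupby(ordered, key=itemgetter(0)):
        states = [occupied for _v, _t, occupied in group]
        for prev, cur in zip(states, states[1:]):
            if not prev and cur:
                boardings += 1      # empty -> occupied
            elif prev and not cur:
                alightings += 1     # occupied -> empty
    empty = sum(1 for _v, _t, occupied in ordered if not occupied)
    empty_rate = empty / len(ordered) if ordered else 0.0
    return {"boardings": boardings, "alightings": alightings, "empty_rate": empty_rate}


# Hypothetical records for two vehicles.
sample = [
    ("a", 1, False), ("a", 2, True), ("a", 3, True), ("a", 4, False),
    ("b", 1, True), ("b", 2, False),
]
print(operating_indicators(sample))
```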
As shown in Fig. 5, the multithreaded parallel computation in step 2) is as follows: first, the metaclass and its constructor are initialized; then subclasses are defined for the individual operations (data cleaning, region division, indicator calculation, and data conversion), each inheriting the methods of the metaclass, and a Python decorator is added to the subclasses so that calling the metaclass interface invokes the method of every subclass; finally, the runner of each subclass is executed in parallel.
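One possible reading of this structure is sketched below, assuming that "metaclass" here denotes a common base class and that the decorator is what moves each subclass's run method onto its own thread. The class names, the decorator name, and the toy operations are all hypothetical.

```python
import threading
from functools import wraps


def threaded(method):
    """Decorator: start the wrapped method in a new thread and return the Thread."""
    @wraps(method)
    def wrapper(self, *args, **kwargs):
        thread = threading.Thread(
            target=method, args=(self, *args), kwargs=kwargs,
            name=type(self).__name__,
        )
        thread.start()
        return thread
    return wrapper


class Operation:
    """Common base class: the constructor stores the input, run() is the interface."""

    def __init__(self, records):
        self.records = records
        self.result = None

    def run(self):
        raise NotImplementedError


class Cleaning(Operation):
    @threaded
    def run(self):
        # Toy cleaning: drop missing records.
        self.result = [r for r in self.records if r is not None]


class Counting(Operation):
    @threaded
    def run(self):
        # Toy indicator: number of records.
        self.result = len(self.records)


records = [3, None, 5]
operations = [Cleaning(records), Counting(records)]

# Calling the shared interface starts every subclass runner in parallel.
threads = [op.run() for op in operations]
for thread in threads:
    thread.join()  # wait until all operations have finished
```

Each operation writes only to its own result attribute, so no locking is needed in this sketch; in CPython, threads of this kind mainly help when the work is I/O-bound.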
With taxi big data as the background, and in addition to the existing data processing tools of Spark combined with HDFS, this embodiment adds a multithreaded parallel processing mechanism to further improve processing efficiency, using Python's decorator feature and multithreading module to improve the synchronization of the multiple processing procedures.
The above is only a preferred embodiment of the present invention, and the scope of protection of the present invention is not limited to it. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. The scope of protection of the present invention shall therefore be determined by the scope of the claims.

Claims (8)

1. A multithreaded method for accessing and processing taxi big data, characterized by comprising the following steps:
Step 1) Convert the longitude/latitude coordinates: coordinates in the international WGS-84 standard are converted twice, first to GCJ-02 and then to BD-09, to obtain coordinates that display accurately on Baidu Maps; using Spark under the Hadoop parallel computing framework, the coordinate conversion is applied to every longitude/latitude record through the map operation of a resilient distributed dataset;
Step 2) Clean the data, calculate indicators, divide regions, and convert the data; within each operation, Spark performs multithreaded parallel computation on every record;
Step 3) Store the data in the distributed file system HDFS.
2. The multithreaded method for accessing and processing taxi big data according to claim 1, characterized in that Spark comprises SparkContext, the Cluster Manager, and Executors, and Spark's workflow comprises the following steps:
Step a) After the application is submitted with spark-submit, SparkContext is initialized at the corresponding location according to the parameters set at submission, and the DAG Scheduler and Task Scheduler are created; the Driver executes the application code and divides the whole program into multiple jobs according to its action operators; a DAG is built within each job; the DAG Scheduler divides the DAG into multiple stages and divides each stage into multiple tasks that form a taskSet; the DAG Scheduler passes the taskSet to the Task Scheduler, which is responsible for scheduling the tasks on the cluster;
Step b) The Driver requests resources from the Cluster Manager according to the resource requirements in SparkContext; the resources include the number of Executors and memory;
Step c) After receiving the request, the resource manager creates Executor processes on worker nodes that meet the conditions;
Step d) Once created, each Executor process registers back with the Driver so that it can receive the tasks the Driver assigns;
Step e) After the program has finished, the Driver releases the requested resources through the ResourceManager.
3. The multithreaded method for accessing and processing taxi big data according to claim 1, characterized in that the data cleaning in step 2) comprises the following steps:
First, remove data that fall outside the actual coordinate range;
Second, check the features for outliers: if the data follow a normal distribution, apply the 3σ rule, where σ denotes the standard deviation, and judge data not within 3 standard deviations of the mean to be outliers; if the data do not follow a normal distribution, apply the box-plot rule: find the first, second, and third quartiles and compute the interquartile range; data inside the set range are retained, and data outside it are judged to be outliers and rejected;
Third, remove redundant data.
4. The multithreaded method for accessing and processing taxi big data according to claim 3, characterized in that in the second step of data cleaning the first quartile is denoted Q1, the second quartile is the median, the third quartile is denoted Q3, and the interquartile range is IQR = Q3 - Q1; the set range is (Q1 - 1.5*IQR, Q3 + 1.5*IQR).
5. The multithreaded method for accessing and processing taxi big data according to claim 1, characterized in that the region division in step 2) cuts the map into a regular grid by coordinates; the longitude and latitude steps are adjusted in combination with the transport-capacity indicator data, and the cell size is finally set to 0.03° of longitude by 0.02° of latitude.
6. The multithreaded method for accessing and processing taxi big data according to claim 1, characterized in that the data conversion in step 2) converts the data format along the horizontal time dimension, the vertical dimension, feature correlation, and spatial distribution.
7. The multithreaded method for accessing and processing taxi big data according to claim 1, characterized in that the indicator calculation in step 2) computes the transport-capacity indicators, which include inflow, outflow, retention, and empty capacity, and the operating indicators, which include the empty-running rate, passenger volume, boardings, and alightings.
8. The multithreaded method for accessing and processing taxi big data according to claim 1, characterized in that the multithreaded parallel computation in step 2) is as follows: first, the metaclass and its constructor are initialized; then subclasses are defined for the individual operations (data cleaning, region division, indicator calculation, and data conversion), each inheriting the methods of the metaclass, and a Python decorator is added to the subclasses so that calling the metaclass interface invokes the method of every subclass; finally, the runner of each subclass is executed in parallel.
CN201810480023.5A 2018-05-18 2018-05-18 Multithreading taxi big data access and processing method Active CN108647360B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810480023.5A CN108647360B (en) 2018-05-18 2018-05-18 Multithreading taxi big data access and processing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810480023.5A CN108647360B (en) 2018-05-18 2018-05-18 Multithreading taxi big data access and processing method

Publications (2)

Publication Number Publication Date
CN108647360A (en) 2018-10-12
CN108647360B (en) 2020-04-28

Family

ID=63756840

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810480023.5A Active CN108647360B (en) 2018-05-18 2018-05-18 Multithreading taxi big data access and processing method

Country Status (1)

Country Link
CN (1) CN108647360B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103823933A (en) * 2014-02-26 2014-05-28 大连理工大学 Method for processing metal cutting simulation data
CN103971509A (en) * 2013-01-31 2014-08-06 上海飞田通信技术有限公司 Taxi scheduling system, scheduling server and vehicle-mounted navigation terminal
CN105118287A (en) * 2015-09-01 2015-12-02 南京理工大学 General investigation system of road traffic sign information
CN106815452A (en) * 2015-11-27 2017-06-09 苏宁云商集团股份有限公司 A kind of cheat detection method and device
CN107341328A (en) * 2017-09-05 2017-11-10 中交第公路勘察设计研究院有限公司 Based on the Ground Settlement method for improving Verhulst curves

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103971509A (en) * 2013-01-31 2014-08-06 上海飞田通信技术有限公司 Taxi scheduling system, scheduling server and vehicle-mounted navigation terminal
CN103823933A (en) * 2014-02-26 2014-05-28 大连理工大学 Method for processing metal cutting simulation data
CN105118287A (en) * 2015-09-01 2015-12-02 南京理工大学 General investigation system of road traffic sign information
CN106815452A (en) * 2015-11-27 2017-06-09 苏宁云商集团股份有限公司 Cheat detection method and device
CN107341328A (en) * 2017-09-05 2017-11-10 中交第公路勘察设计研究院有限公司 Ground settlement method based on improved Verhulst curves

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
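Chen Xinxin: "Research on taxi service strategies based on large-scale GPS trajectory data", China Master's Theses Full-text Database, Engineering Science and Technology II *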

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109977306A (en) * 2019-03-14 2019-07-05 北京达佳互联信息技术有限公司 Implementation method, system, server and medium for an advertisement engine
CN109977306B (en) * 2019-03-14 2021-08-20 北京达佳互联信息技术有限公司 Method, system, server and medium for implementing advertisement engine
CN110376290A (en) * 2019-07-19 2019-10-25 中南大学 Acoustic emission source locating method based on multidimensional kernel density estimation
CN111400299A (en) * 2020-06-04 2020-07-10 成都四方伟业软件股份有限公司 Method and system for testing fusion quality of multiple data
CN113051279A (en) * 2021-03-05 2021-06-29 北京顺达同行科技有限公司 Data message storage method, storage device, electronic equipment and storage medium
CN113051279B (en) * 2021-03-05 2024-05-10 北京顺达同行科技有限公司 Storage method, storage device, electronic equipment and storage medium for data message
CN114297395A (en) * 2021-06-18 2022-04-08 北京大学 Knowledge graph distributed mass data importing method based on load balancing

Also Published As

Publication number Publication date
CN108647360B (en) 2020-04-28

Similar Documents

Publication Publication Date Title
CN108647360A (en) Method for multithreaded access and processing of taxi big data
CN109491790B (en) Container-based industrial Internet of things edge computing resource allocation method and system
Larsen et al. Susceptibility of optimal train schedules to stochastic disturbances of process times
DE112014004794B4 (en) Allocating map matching tasks by cluster servers in the Internet of Vehicles
US20190207869A1 (en) Intelligent Placement within a Data Center
CN106547882A (en) Real-time processing method and system for marketing big data in a smart grid
CN108335075A (en) Logistics-oriented big data processing system and method
Josyula et al. A parallel algorithm for train rescheduling
CN105491329B (en) Large-scale surveillance video stream aggregation method based on stream computing
CN103218233A (en) Data allocation strategy in Hadoop heterogeneous cluster
US20190057611A1 (en) System and method to analyze data based on air traffic volume
CN111966289A (en) Partition optimization method and system based on Kafka cluster
CN104536804A (en) Virtual resource dispatching system for related task requests and dispatching and distributing method for related task requests
Plehn-Dujowich et al. The dynamic relationship between entrepreneurship, unemployment, and growth: evidence from US industries
CN109087030A (en) Method for realizing C2C general mobile crowdsourcing, general mobile crowdsourcing server and system
CN111506413B (en) Intelligent task scheduling method and system based on business efficiency optimization
CN106569892A (en) Resource scheduling method and device
CN113095781A (en) Temperature control equipment control method, equipment and medium based on edge calculation
Buettner et al. Migration Projections: The Economic Case
CN115169634A (en) Task allocation optimization processing method and device
US20090144011A1 (en) One-pass sampling of hierarchically organized sensors
Shtern et al. Towards a multi-cluster analytical engine for transportation data
Yu et al. Dissecting GeoSparkSim: a scalable microscopic road network traffic simulator in Apache Spark
CN104410868B (en) Method for rapid aggregation and reading of multiple files in a shared file system
Mian et al. A data platform for the highway traffic data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant