CN108647360A - Multithreading taxi big data access and processing method - Google Patents
Multithreading taxi big data access and processing method
- Publication number
- CN108647360A (application number CN201810480023.5A)
- Authority
- CN
- China
- Prior art keywords
- data
- multithreading
- taxi
- spark
- processing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/40—Business processes related to the transportation industry
Landscapes
- Business, Economics & Management (AREA)
- Human Resources & Organizations (AREA)
- Engineering & Computer Science (AREA)
- Economics (AREA)
- Strategic Management (AREA)
- General Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- Theoretical Computer Science (AREA)
- Educational Administration (AREA)
- Marketing (AREA)
- Entrepreneurship & Innovation (AREA)
- Development Economics (AREA)
- Tourism & Hospitality (AREA)
- Physics & Mathematics (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Game Theory and Decision Science (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The multithreaded taxi big data access and processing method of the invention comprises the following steps. Step 1) Convert the longitude and latitude coordinates: coordinates in the international WGS-84 standard coordinate system are converted twice, through GCJ-02 and BD-09, to obtain coordinates that display accurately on Baidu Maps; Spark, running under the Hadoop parallel computing framework, applies the coordinate conversion to every longitude and latitude record through the map operation of a resilient distributed dataset. Step 2) Perform data cleaning, index calculation, region division, and data conversion; within each operation, Spark is used to process every record in parallel with multiple threads. Step 3) Store the data in the distributed file system HDFS. Advantageous effects: set against the background of taxi big data, the method uses the existing Spark-plus-HDFS data processing tools and, to raise processing efficiency further, adds a multithreaded parallel processing mechanism that uses Python's decorator feature and multithreading module, improving the synchronization of the multiple processing procedures.
Description
Technical field
The invention relates to the field of big data, and in particular to a Spark-based multithreaded method for accessing and processing taxi big data.
Background art
Traditionally, data in a Hadoop cluster is accessed with MapReduce, the parallel computing framework of Hadoop. Because MapReduce processes data step by step, its performance is limited and its disk I/O overhead is very large: each round reads data from the cluster, performs part of the processing, and writes the result back to the cluster, and this is repeated until processing is complete. Spark, by contrast, can complete all data analysis in memory in close to real time: it reads the data from the cluster and writes the result back only after all analysis and processing is finished. As a result, Spark's batch processing is nearly 10 times faster than MapReduce, and its in-memory data analysis is nearly 100 times faster. The traditional way of accessing data through a Hadoop cluster can therefore hardly meet current requirements for accessing, processing, and analyzing massive data, and is even less suited to real-time operations on data.
Summary of the invention
The invention aims to solve the poor access efficiency of existing approaches to massive data, the problem of coordinating multiple tasks, and the limitations of single-task data processing. Using historical and real-time traffic data obtained through a data interface, it processes the data in parallel with Spark's efficient in-memory computing and resilient distributed datasets (RDD), sets up multithreaded operation so that the results of several kinds of data processing are generated synchronously, and stores them in the distributed file system HDFS. It thus provides a Spark-based multithreaded taxi big data access and processing method, realized by the following technical scheme:
The multithreaded taxi big data access and processing method comprises the following steps:
Step 1) Convert the longitude and latitude coordinates: coordinates in the international WGS-84 standard coordinate system are converted twice, through GCJ-02 and BD-09, to obtain coordinates that display accurately on Baidu Maps; Spark, running under the Hadoop parallel computing framework, applies the coordinate conversion to every longitude and latitude record through the map operation of a resilient distributed dataset (RDD);
Step 2) Perform data cleaning, index calculation, region division, and data conversion; within each operation, Spark is used to process every record in parallel with multiple threads;
Step 3) Store the data in the distributed file system HDFS.
In a further design of the multithreaded taxi big data access and processing method, Spark comprises the SparkContext, the Cluster Manager, and Executors, and the working process of Spark includes the following steps:
Step a) After the application is submitted with spark-submit, the SparkContext is initialized at the corresponding location according to the parameters set at submission, and the DAG Scheduler and Task Scheduler are created. As the Driver executes the application code, the whole program is divided into multiple jobs according to its action operators, and a DAG is built inside each job. The DAG Scheduler divides the DAG into multiple stages, and each stage is in turn divided into multiple tasks that form a TaskSet. The DAG Scheduler passes the TaskSet to the Task Scheduler, which is responsible for scheduling the tasks on the cluster;
Step b) The Driver requests resources from the Cluster Manager according to the resource requirements in the SparkContext; the resources include the number of Executors and memory;
Step c) On receiving the request, the resource manager creates Executor processes on worker nodes that satisfy the requirements;
Step d) Once created, each Executor process registers back with the Driver so that it can receive the tasks the Driver assigns;
Step e) After the program has finished executing, the Driver deregisters with the ResourceManager and releases the resources it requested.
In a further design of the multithreaded taxi big data access and processing method, the data cleaning in step 2) includes the following steps:
The first step is to remove data that lies outside the actual coordinate range;
The second step is to check for outliers in the features. If the data follows a normal distribution, the 3σ rule is used, where σ denotes the standard deviation: data that does not lie within 3 standard deviations of the mean is judged to be an outlier. If the data does not follow a normal distribution, the box-plot rule is used: the first, second, and third quartiles are found and the interquartile range is calculated; data inside the set range is retained, and data outside it is judged to be an outlier and removed;
The third step is to remove redundant data.
In a further design of the multithreaded taxi big data access and processing method, in the second step of data cleaning the first quartile is denoted Q1, the second quartile is the median, and the third quartile is denoted Q3; the interquartile range is IQR = Q3 - Q1, and the set range is (Q1 - 1.5 × IQR, Q3 + 1.5 × IQR).
In a further design of the multithreaded taxi big data access and processing method, the region division in step 2) includes cutting the map into a regular grid by coordinates; the longitude span and latitude span are adjusted in combination with the transport capacity index data, and the longitude span of each region is finally set to 0.03° and the latitude span to 0.02°.
In a further design of the multithreaded taxi big data access and processing method, the data conversion in step 2) is: converting the data format from the horizontal and vertical time perspectives, feature correlation, and spatial distribution.
In a further design of the multithreaded taxi big data access and processing method, the index calculation in step 2) is: calculating the transport capacity indices, which include inflow, outflow, retention, and vacant capacity; and calculating the operating indices, which include the vacancy rate, the passenger volume, the number of pick-ups, and the number of drop-offs.
In a further design of the multithreaded taxi big data access and processing method, the multithreaded parallel computation in step 2) is: first, the base class is initialized and its constructor is defined; then subclasses for the data cleaning, region division, index calculation, and data conversion operations are defined, each inheriting the methods of the base class, and a Python decorator is added to the subclasses so that every subclass can be run by calling the base-class interface; finally, the runner of each subclass is executed in parallel.
The advantages of the invention are as follows:
Using historical and real-time traffic data obtained through a data interface, the invention processes the data in parallel with Spark's efficient in-memory computing and resilient distributed datasets (RDD), sets up multithreaded operation so that the results of several kinds of data processing are generated synchronously, and stores them in the distributed file system HDFS. Set against the background of taxi big data, the Spark-based multithreaded taxi big data access and processing method uses the existing Spark-plus-HDFS data processing tools and, to raise processing efficiency further, adds a multithreaded parallel processing mechanism that uses Python's decorator feature and multithreading module, improving the synchronization of the multiple processing procedures.
Description of the drawings
Fig. 1 is a flow chart of the Spark working principle.
Fig. 2 is a flow chart of the HDFS write process.
Fig. 3 is a flow chart of the multithreaded data processing example.
Fig. 4 is an example of regular region division.
Fig. 5 is a flow chart of the multithreading implementation.
Detailed description
The technical scheme of the invention is described in detail below with reference to the drawings.
As shown in Fig. 3, the multithreaded taxi big data access and processing method of this embodiment includes the following steps:
Step 1) Convert the longitude and latitude coordinates: coordinates in the international WGS-84 standard coordinate system are converted twice, through GCJ-02 and BD-09, to obtain coordinates that display accurately on Baidu Maps; Spark, running under the Hadoop parallel computing framework, applies the coordinate conversion to every longitude and latitude record through the map operation of a resilient distributed dataset (RDD).
Step 2) Perform data cleaning, index calculation, region division, and data conversion; within each operation, Spark is used to process every record in parallel with multiple threads.
Step 3) Store the data in the distributed file system HDFS.
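The two-stage conversion of step 1 can be sketched in plain Python. The patent does not give the conversion formulas, so the functions below use the approximation widely circulated in open-source coordinate libraries; the constants and the names `wgs84_to_gcj02`, `gcj02_to_bd09`, and `wgs84_to_bd09` are assumptions, not the authors' code. With PySpark the same per-record function would be passed to `rdd.map`; the built-in `map` stands in for it here so the sketch runs without a cluster.

```python
import math

# Ellipsoid constants used by the commonly circulated GCJ-02 approximation (assumed).
A = 6378245.0
EE = 0.00669342162296594323
X_PI = math.pi * 3000.0 / 180.0


def _transform_lat(x, y):
    ret = -100.0 + 2.0 * x + 3.0 * y + 0.2 * y * y + 0.1 * x * y + 0.2 * math.sqrt(abs(x))
    ret += (20.0 * math.sin(6.0 * x * math.pi) + 20.0 * math.sin(2.0 * x * math.pi)) * 2.0 / 3.0
    ret += (20.0 * math.sin(y * math.pi) + 40.0 * math.sin(y / 3.0 * math.pi)) * 2.0 / 3.0
    ret += (160.0 * math.sin(y / 12.0 * math.pi) + 320.0 * math.sin(y * math.pi / 30.0)) * 2.0 / 3.0
    return ret


def _transform_lon(x, y):
    ret = 300.0 + x + 2.0 * y + 0.1 * x * x + 0.1 * x * y + 0.1 * math.sqrt(abs(x))
    ret += (20.0 * math.sin(6.0 * x * math.pi) + 20.0 * math.sin(2.0 * x * math.pi)) * 2.0 / 3.0
    ret += (20.0 * math.sin(x * math.pi) + 40.0 * math.sin(x / 3.0 * math.pi)) * 2.0 / 3.0
    ret += (150.0 * math.sin(x / 12.0 * math.pi) + 300.0 * math.sin(x / 30.0 * math.pi)) * 2.0 / 3.0
    return ret


def wgs84_to_gcj02(lon, lat):
    """First conversion: WGS-84 to GCJ-02."""
    dlat = _transform_lat(lon - 105.0, lat - 35.0)
    dlon = _transform_lon(lon - 105.0, lat - 35.0)
    radlat = lat / 180.0 * math.pi
    magic = 1 - EE * math.sin(radlat) ** 2
    sqrtmagic = math.sqrt(magic)
    dlat = (dlat * 180.0) / ((A * (1 - EE)) / (magic * sqrtmagic) * math.pi)
    dlon = (dlon * 180.0) / (A / sqrtmagic * math.cos(radlat) * math.pi)
    return lon + dlon, lat + dlat


def gcj02_to_bd09(lon, lat):
    """Second conversion: GCJ-02 to BD-09 (the Baidu Maps coordinate system)."""
    z = math.sqrt(lon * lon + lat * lat) + 0.00002 * math.sin(lat * X_PI)
    theta = math.atan2(lat, lon) + 0.000003 * math.cos(lon * X_PI)
    return z * math.cos(theta) + 0.0065, z * math.sin(theta) + 0.006


def wgs84_to_bd09(point):
    """Per-record function: takes (lon, lat) in WGS-84, returns (lon, lat) in BD-09."""
    lon, lat = point
    return gcj02_to_bd09(*wgs84_to_gcj02(lon, lat))


# Hypothetical sample records; with PySpark this would be rdd.map(wgs84_to_bd09).
records = [(118.7969, 32.0603), (118.78, 32.04)]
converted = list(map(wgs84_to_bd09, records))
```

Keeping the conversion as a pure function of one record is what lets it be distributed with a map operation: no record depends on any other.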
As shown in Fig. 1, Spark comprises the SparkContext, the Cluster Manager, and Executors. The Cluster Manager is the external service that acquires resources on the cluster. There are currently three types: Standalone, Apache Mesos, and Hadoop YARN. Standalone is Spark's native resource manager, in which the Master is responsible for allocating resources. Apache Mesos is a resource scheduling framework with good compatibility with Hadoop MapReduce. Hadoop YARN mainly refers to the ResourceManager in YARN. This embodiment uses YARN as the resource manager. The core of YARN's layered architecture is the ResourceManager, the entity that controls the whole cluster and manages the allocation of basic computing resources to applications. An Executor is a process that a Spark application runs on a worker node. The SparkContext is the runtime environment of Spark. The working process of Spark includes the following steps:
Step a) After the application is submitted with spark-submit, the SparkContext is initialized at the corresponding location according to the parameters set at submission, and the DAG Scheduler and Task Scheduler are created. As the Driver executes the application code, the whole program is divided into multiple jobs according to its action operators, and a DAG is built inside each job. The DAG Scheduler divides the DAG into multiple stages, and each stage is in turn divided into multiple tasks that form a TaskSet. The DAG Scheduler passes the TaskSet to the Task Scheduler, which is responsible for scheduling the tasks on the cluster.
Step b) The Driver requests resources from YARN's ResourceManager according to the resource requirements in the SparkContext; the resources include the number of Executors and memory. The Driver in Spark runs the main function of the Spark application and creates the SparkContext, whose purpose is to prepare the runtime environment of the application. In Spark, the SparkContext is responsible for communicating with the Cluster Manager to request resources and to assign and monitor tasks. When the Executors have finished running, the Driver is also responsible for closing the SparkContext. The SparkContext is commonly used to represent the Driver.
Step c) On receiving the request, the resource manager creates Executor processes on worker nodes that satisfy the requirements.
Step d) Once created, each Executor process registers back with the Driver so that it can receive the tasks the Driver assigns.
Step e) After the program has finished executing, the Driver deregisters with the ResourceManager and releases the resources it requested.
The DAG Scheduler is a high-level scheduling layer that implements stage-based scheduling: it divides each job into stages, divides each stage into multiple tasks, and submits each stage as a TaskSet to the underlying Task Scheduler for execution. The Task Scheduler is the other important scheduler in the SparkContext besides the DAG Scheduler; it is responsible for dispatching the tasks generated by the DAG Scheduler to Executors for execution.
As shown in Fig. 2, the distributed file system (hereinafter HDFS) works as follows. In this embodiment HDFS is deployed on 2 NameNodes and 5 DataNodes in the Hadoop cluster, and data is stored as files. The NameNode stores the metadata files, which include the timestamp, path, and quantity information of the stored data. The DataNodes store the actual file data and back the files up, keeping at least two copies on different DataNodes. When HDFS data files are read, Spark's caching mechanism is used to cache the metadata files that are needed over the long term, so the whole file information no longer has to be traversed: querying the metadata held by the NameNode is enough to locate the actual data. Accessing massive data through the Hadoop cluster, and processing it in parallel with efficient Spark in-memory computing and a multithreaded setup, greatly improves performance compared with a single node or a traditional relational database. The approach also offers high reliability and high scalability: nodes can be added as the actual data requires to meet storage needs.
Taking the HDFS write process as an example: the NameNode is responsible for the metadata of all files stored on HDFS. It first confirms the client's request, records the name of the file and the set of DataNodes that store it, and keeps this information in the file allocation table in memory. For example, the client sends a request to the NameNode saying that it wants to write the file "gcwk.csv" to HDFS; this file holds data such as the longitude and latitude of taxis and their passenger status. The process then runs as follows (see Fig. 2):
Step 1: The client sends a request to the NameNode to write the file "gcwk.csv". (①)
Step 2: The NameNode replies to the client, telling it to write the file to DataNodes A, B, and D and to contact DataNode B directly. (②)
Step 3: The client sends a request to DataNode B, asking it to save a copy of "gcwk.csv" and to have two further copies sent to DataNode A and DataNode D. (③)
Step 4: DataNode B sends a request to DataNode A, asking it to save a copy of "gcwk.csv" and to send a copy to DataNode D. (④)
Step 5: DataNode A sends a request to DataNode D, asking it to save a copy of "gcwk.csv". (⑤)
Step 6: DataNode D receives the data and returns an acknowledgement to DataNode A. (⑤)
Step 7: DataNode A receives the acknowledgement and returns an acknowledgement to DataNode B. (④)
Step 8: DataNode B receives the acknowledgement and returns an acknowledgement to the client, which marks the end of the whole write process. (⑥)
The data cleaning in step 2) includes the following steps:
The first step is to remove data that lies outside the actual coordinate range.
The second step is to check for outliers in the features. If the data follows a normal distribution, the 3σ rule is used, where σ denotes the standard deviation: data that does not lie within 3 standard deviations of the mean is judged to be an outlier. If the data does not follow a normal distribution, the box-plot rule is used: the first, second, and third quartiles are found and the interquartile range is calculated; data inside the set range is retained, and data outside it is judged to be an outlier and removed.
The third step is to remove redundant data.
Further, in the second step of data cleaning the first quartile is denoted Q1, the second quartile is the median, and the third quartile is denoted Q3; the interquartile range is IQR = Q3 - Q1, and the set range is (Q1 - 1.5 × IQR, Q3 + 1.5 × IQR).
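Both outlier rules of the second cleaning step can be sketched with the Python standard library. The function names, the use of the population standard deviation, and the sample speeds are assumptions made for illustration; the patent does not say how the quartiles are interpolated, so the default method of `statistics.quantiles` is used here.

```python
import statistics


def three_sigma_filter(values):
    """Keep values within 3 standard deviations of the mean (normal data)."""
    mean = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population standard deviation (assumed)
    return [v for v in values if abs(v - mean) <= 3 * sigma]


def iqr_filter(values):
    """Keep values inside (Q1 - 1.5*IQR, Q3 + 1.5*IQR) (box-plot rule)."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if low < v < high]


# Hypothetical taxi speeds in km/h with one implausible reading.
speeds = [30, 32, 31, 29, 33, 30, 31, 32, 28, 300]
cleaned = iqr_filter(speeds)
```

On very small samples the 3σ rule can fail to flag even an extreme value, because the outlier itself inflates σ; the box-plot rule is less sensitive to this.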
As shown in Fig. 4, the region division in step 2) includes cutting the map into a regular grid by coordinates; the longitude span and latitude span are adjusted in combination with the transport capacity index data, and the longitude span of each region is finally set to 0.03° and the latitude span to 0.02°.
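A regular grid with these spans reduces to integer division of the offset from a grid origin. The sketch below assumes a hypothetical origin at the south-west corner of the study area; the patent does not state the origin, and `grid_cell` is an illustrative name.

```python
import math

LON_STEP = 0.03  # longitude span of one region, in degrees (from the text)
LAT_STEP = 0.02  # latitude span of one region, in degrees (from the text)


def grid_cell(lon, lat, origin_lon, origin_lat):
    """Return the (row, col) of the grid cell that contains the point.

    origin_lon/origin_lat mark the south-west corner of the grid (assumed)."""
    col = math.floor((lon - origin_lon) / LON_STEP)
    row = math.floor((lat - origin_lat) / LAT_STEP)
    return row, col


# Hypothetical origin and point.
cell = grid_cell(118.05, 31.05, origin_lon=118.0, origin_lat=31.0)
```

Points that fall exactly on a cell boundary are sensitive to floating-point rounding, so a production version might scale coordinates to integers before dividing.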
The data conversion in step 2) is: converting the data format from the horizontal and vertical time perspectives, feature correlation, and spatial distribution.
The index calculation in step 2) is: calculating the transport capacity indices, which include inflow, outflow, retention, and vacant capacity; and calculating the operating indices, which include the vacancy rate, the passenger volume, the number of pick-ups, and the number of drop-offs.
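The patent names these indices without defining them, so the following is only one plausible reading: inflow, outflow, and retention are counted per region by comparing each taxi's region in two consecutive time slices, and the vacancy rate is the share of records without a passenger. The function names and input shapes are assumptions.

```python
from collections import Counter


def flow_indices(before, after):
    """before/after map taxi id -> region id in two consecutive time slices.

    Returns per-region counters (inflow, outflow, retention) under the
    assumed definitions described in the lead-in."""
    inflow, outflow, retention = Counter(), Counter(), Counter()
    for taxi, src in before.items():
        dst = after.get(taxi)
        if dst is None:
            continue  # taxi not observed in the second slice
        if dst == src:
            retention[src] += 1
        else:
            outflow[src] += 1
            inflow[dst] += 1
    return inflow, outflow, retention


def vacancy_rate(occupied_flags):
    """Share of records in which the taxi carries no passenger (assumed definition)."""
    flags = list(occupied_flags)
    return sum(1 for occupied in flags if not occupied) / len(flags)


# Hypothetical snapshots: taxi t2 moves from region A to region B.
before = {"t1": "A", "t2": "A", "t3": "B"}
after = {"t1": "A", "t2": "B", "t3": "B"}
inflow, outflow, retention = flow_indices(before, after)
```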
As shown in Fig. 5, the multithreaded parallel computation in step 2) is: first, the base class is initialized and its constructor is defined; then subclasses for the data cleaning, region division, index calculation, and data conversion operations are defined, each inheriting the methods of the base class, and a Python decorator is added to the subclasses so that every subclass can be run by calling the base-class interface; finally, the runner of each subclass is executed in parallel.
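The pattern described here can be sketched with the standard `threading` module, assuming the shared parent is an ordinary Python base class. The decorator `threaded`, the class names, and the sample records are hypothetical; for brevity the decorator is applied once to the inherited `runner` method rather than to each subclass separately.

```python
import threading
from functools import wraps


def threaded(method):
    """Decorator: run the wrapped method in a new thread and return the Thread."""
    @wraps(method)
    def wrapper(*args, **kwargs):
        thread = threading.Thread(target=method, args=args, kwargs=kwargs)
        thread.start()
        return thread
    return wrapper


class Operation:
    """Base class: the constructor stores the shared input and the result store."""

    def __init__(self, records, results):
        self.records = records
        self.results = results

    def process(self):
        raise NotImplementedError  # each subclass supplies its own processing

    @threaded
    def runner(self):
        # Each operation writes under its own key, so the threads do not collide.
        self.results[type(self).__name__] = self.process()


class Cleaning(Operation):
    def process(self):
        return [r for r in self.records if r is not None]


class IndexCalculation(Operation):
    def process(self):
        valid = [r for r in self.records if r is not None]
        return sum(valid) / len(valid)


records = [1.0, None, 3.0]
results = {}
# Start every subclass runner through the common base-class interface, then wait.
threads = [cls(records, results).runner() for cls in (Cleaning, IndexCalculation)]
for thread in threads:
    thread.join()
```

Because every operation is started through the same `runner` interface, adding region division or data conversion only requires another subclass with its own `process` method.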
Set against the background of taxi big data, this embodiment uses the existing Spark-plus-HDFS data processing tools and, to raise processing efficiency further, adds a multithreaded parallel processing mechanism that uses Python's decorator feature and multithreading module, improving the synchronization of the multiple processing procedures.
The above is only a preferred embodiment of the invention, and the scope of protection of the invention is not limited to it. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall fall within the scope of protection of the invention. The scope of protection of the invention shall therefore be determined by the scope of the claims.
Claims (8)
1. A multithreaded taxi big data access and processing method, characterized by comprising the following steps:
Step 1) Convert the longitude and latitude coordinates: coordinates in the international WGS-84 standard coordinate system are converted twice, through GCJ-02 and BD-09, to obtain coordinates that display accurately on Baidu Maps; Spark, running under the Hadoop parallel computing framework, applies the coordinate conversion to every longitude and latitude record through the map operation of a resilient distributed dataset;
Step 2) Perform data cleaning, index calculation, region division, and data conversion; within each operation, Spark is used to process every record in parallel with multiple threads;
Step 3) Store the data in the distributed file system HDFS.
2. The multithreaded taxi big data access and processing method according to claim 1, characterized in that Spark comprises the SparkContext, the Cluster Manager, and Executors, and the working process of Spark includes the following steps:
Step a) After the application is submitted with spark-submit, the SparkContext is initialized at the corresponding location according to the parameters set at submission, and the DAG Scheduler and Task Scheduler are created. As the Driver executes the application code, the whole program is divided into multiple jobs according to its action operators, and a DAG is built inside each job. The DAG Scheduler divides the DAG into multiple stages, and each stage is in turn divided into multiple tasks that form a TaskSet. The DAG Scheduler passes the TaskSet to the Task Scheduler, which is responsible for scheduling the tasks on the cluster;
Step b) The Driver requests resources from the Cluster Manager according to the resource requirements in the SparkContext; the resources include the number of Executors and memory;
Step c) On receiving the request, the resource manager creates Executor processes on worker nodes that satisfy the requirements;
Step d) Once created, each Executor process registers back with the Driver so that it can receive the tasks the Driver assigns;
Step e) After the program has finished executing, the Driver deregisters with the ResourceManager and releases the resources it requested.
3. The multithreaded taxi big data access and processing method according to claim 1, characterized in that the data cleaning in step 2) includes the following steps:
The first step is to remove data that lies outside the actual coordinate range;
The second step is to check for outliers in the features. If the data follows a normal distribution, the 3σ rule is used, where σ denotes the standard deviation: data that does not lie within 3 standard deviations of the mean is judged to be an outlier. If the data does not follow a normal distribution, the box-plot rule is used: the first, second, and third quartiles are found and the interquartile range is calculated; data inside the set range is retained, and data outside it is judged to be an outlier and removed;
The third step is to remove redundant data.
4. The multithreaded taxi big data access and processing method according to claim 3, characterized in that in the second step of data cleaning the first quartile is denoted Q1, the second quartile is the median, and the third quartile is denoted Q3; the interquartile range is IQR = Q3 - Q1, and the set range is (Q1 - 1.5 × IQR, Q3 + 1.5 × IQR).
5. The multithreaded taxi big data access and processing method according to claim 1, characterized in that the region division in step 2) includes cutting the map into a regular grid by coordinates; the longitude span and latitude span are adjusted in combination with the transport capacity index data, and the longitude span of each region is finally set to 0.03° and the latitude span to 0.02°.
6. The multithreaded taxi big data access and processing method according to claim 1, characterized in that the data conversion in step 2) is: converting the data format from the horizontal and vertical time perspectives, feature correlation, and spatial distribution.
7. The multithreaded taxi big data access and processing method according to claim 1, characterized in that the index calculation in step 2) is: calculating the transport capacity indices, which include inflow, outflow, retention, and vacant capacity; and calculating the operating indices, which include the vacancy rate, the passenger volume, the number of pick-ups, and the number of drop-offs.
8. The multithreaded taxi big data access and processing method according to claim 1, characterized in that the multithreaded parallel computation in step 2) is: first, the base class is initialized and its constructor is defined; then subclasses for the data cleaning, region division, index calculation, and data conversion operations are defined, each inheriting the methods of the base class, and a Python decorator is added to the subclasses so that every subclass can be run by calling the base-class interface; finally, the runner of each subclass is executed in parallel.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810480023.5A CN108647360B (en) | 2018-05-18 | 2018-05-18 | Multithreading taxi big data access and processing method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810480023.5A CN108647360B (en) | 2018-05-18 | 2018-05-18 | Multithreading taxi big data access and processing method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108647360A true CN108647360A (en) | 2018-10-12 |
CN108647360B CN108647360B (en) | 2020-04-28 |
Family
ID=63756840
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810480023.5A Active CN108647360B (en) | 2018-05-18 | 2018-05-18 | Multithreading taxi big data access and processing method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108647360B (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103971509A (en) * | 2013-01-31 | 2014-08-06 | 上海飞田通信技术有限公司 | Taxi scheduling system, scheduling server and vehicle-mounted navigation terminal |
CN103823933A (en) * | 2014-02-26 | 2014-05-28 | 大连理工大学 | Method for processing metal cutting simulation data |
CN105118287A (en) * | 2015-09-01 | 2015-12-02 | 南京理工大学 | General investigation system of road traffic sign information |
CN106815452A (en) * | 2015-11-27 | 2017-06-09 | 苏宁云商集团股份有限公司 | A kind of cheat detection method and device |
CN107341328A (en) * | 2017-09-05 | 2017-11-10 | 中交第公路勘察设计研究院有限公司 | Based on the Ground Settlement method for improving Verhulst curves |
Non-Patent Citations (1)
Title |
---|
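Chen Xinxin: "Research on taxi service strategies based on large-scale GPS trajectory data" (基于大规模 GPS 轨迹数据的出租车服务策略研究), China Master's Theses Full-text Database, Engineering Science and Technology II * |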
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109977306A (en) * | 2019-03-14 | 2019-07-05 | 北京达佳互联信息技术有限公司 | Implementation method, system, server and the medium of advertisement engine |
CN109977306B (en) * | 2019-03-14 | 2021-08-20 | 北京达佳互联信息技术有限公司 | Method, system, server and medium for implementing advertisement engine |
CN110376290A (en) * | 2019-07-19 | 2019-10-25 | 中南大学 | Acoustic emission source locating method based on multidimensional Density Estimator |
CN111400299A (en) * | 2020-06-04 | 2020-07-10 | 成都四方伟业软件股份有限公司 | Method and system for testing fusion quality of multiple data |
CN113051279A (en) * | 2021-03-05 | 2021-06-29 | 北京顺达同行科技有限公司 | Data message storage method, storage device, electronic equipment and storage medium |
CN113051279B (en) * | 2021-03-05 | 2024-05-10 | 北京顺达同行科技有限公司 | Storage method, storage device, electronic equipment and storage medium for data message |
CN114297395A (en) * | 2021-06-18 | 2022-04-08 | 北京大学 | Knowledge graph distributed mass data importing method based on load balancing |
Also Published As
Publication number | Publication date |
---|---|
CN108647360B (en) | 2020-04-28 |
Similar Documents
Publication | Title |
---|---|
CN108647360A (en) | A kind of method of the access of taxi big data and the processing of multithreading |
CN109491790B (en) | Container-based industrial Internet of things edge computing resource allocation method and system |
Larsen et al. | Susceptibility of optimal train schedules to stochastic disturbances of process times |
DE112014004794B4 (en) | Allocating map matching tasks through cluster servers on the vehicles' Internet |
US20190207869A1 (en) | Intelligent Placement within a Data Center |
CN106547882A (en) | A kind of real-time processing method and system of big data of marketing in intelligent grid |
CN108335075A (en) | A kind of processing system and method for Logistics Oriented big data |
Josyula et al. | A parallel algorithm for train rescheduling |
CN105491329B (en) | A kind of extensive monitoring video flow assemblage method based on streaming computing |
CN103218233A (en) | Data allocation strategy in hadoop heterogeneous cluster |
US20190057611A1 (en) | System and method to analyze data based on air traffic volume |
CN111966289A (en) | Partition optimization method and system based on Kafka cluster |
CN104536804A (en) | Virtual resource dispatching system for related task requests and dispatching and distributing method for related task requests |
Plehn-Dujowich et al. | The dynamic relationship between entrepreneurship, unemployment, and growth: evidence from US industries |
CN109087030A (en) | Realize method, General Mobile crowdsourcing server and the system of the crowdsourcing of C2C General Mobile |
CN111506413B (en) | Intelligent task scheduling method and system based on business efficiency optimization |
CN106569892A (en) | Resource scheduling method and device |
CN113095781A (en) | Temperature control equipment control method, equipment and medium based on edge calculation |
Buettner et al. | Migration Projections: The Economic Case |
CN115169634A (en) | Task allocation optimization processing method and device |
US20090144011A1 (en) | One-pass sampling of hierarchically organized sensors |
Shtern et al. | Towards a multi-cluster analytical engine for transportation data |
Yu et al. | Dissecting geosparksim: a scalable microscopic road network traffic simulator in apache spark |
CN104410868B (en) | A kind of shared-file system multifile rapid polymerization and the method read |
Mian et al. | A data platform for the highway traffic data |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||