CN110990367A - Method for realizing GPS positioning cluster calculation performance optimization based on graph group clustering - Google Patents

Method for realizing GPS positioning cluster calculation performance optimization based on graph group clustering

Publication number
CN110990367A
Authority
CN
China
Prior art keywords
longitude
user
latitude
clustering
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911137142.1A
Other languages
Chinese (zh)
Inventor
陈亮
邓翠珠
戴传智
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Group Guangdong Co Ltd
Original Assignee
China Mobile Group Guangdong Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Group Guangdong Co Ltd filed Critical China Mobile Group Guangdong Co Ltd
Priority to CN201911137142.1A priority Critical patent/CN110990367A/en
Publication of CN110990367A publication Critical patent/CN110990367A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/182Distributed file systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • G06F16/217Database tuning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2282Tablespace storage structures; Management thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/284Relational databases
    • G06F16/285Clustering or classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29Geographical information databases
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W4/00Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02Services making use of location information
    • H04W4/025Services making use of location information using location based information parameters
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00Reducing energy consumption in communication networks
    • Y02D30/70Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Remote Sensing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Position Fixing By Use Of Radio Waves (AREA)

Abstract

The invention discloses a method for realizing GPS positioning cluster calculation performance optimization based on graph group clustering, which comprises the following steps: deploying and building the computing environment; putting a five-minute position snapshot table and a base station longitude and latitude table of a mobile operator into an HDFS distributed file system through external Kafka pushing and internal Flume receiving; reading the five-minute position snapshot table and the base station longitude and latitude table, and joining the two tables to obtain user position information data containing longitude and latitude information; deduplicating the user position data for one day in a set city from the user position information data to obtain a user longitude and latitude position statistical table; clustering the user longitude and latitude position information in the user longitude and latitude position statistical table by using a graph group clustering method; and submitting the Spark program to a YARN cluster to run, and storing the obtained analysis result in the HDFS distributed file system. The invention reduces the amount of calculation and the energy consumed by calculation, and improves running performance.

Description

Method for realizing GPS positioning cluster calculation performance optimization based on graph group clustering
Technical Field
The invention relates to the field of big data processing, in particular to a method for realizing calculation performance optimization of a GPS positioning cluster based on graph group clustering.
Background
With the continuous development of mobile intelligent terminals, a variety of location-based services are offered to users, and such services are expected to become a mainstream trend in future mobile user services. GPS positioning from mobile signaling data is feasible with a graph community clustering algorithm, and its running performance can be optimized on a GPU cluster.
Although the prior art can apply clustering methods to computational analysis, in practical application the existing schemes still have inconveniences and shortcomings. The prior art has the following disadvantages:
Application No. 201410360455.4 implements its program code in CUDA, a superset of the C language, and does not provide a distributed implementation of the K-means clustering method, so it is limited by the video memory of a single-machine GPU, and the clustering of high-dimensional matrices is likely to fail. Application No. 201811589386.9 uses a distributed framework to process high-dimensional big data, but does not combine Hadoop with the GPU for acceleration.
The existing clustering method used for GPS positioning based on mobile signaling data needs to traverse all clusters, and has high energy consumption and poor operation performance.
Disclosure of Invention
The invention provides a method for realizing the calculation performance optimization of a GPS positioning cluster based on graph group clustering, aiming at overcoming the defects that the clustering method used for GPS positioning based on mobile signaling data in the prior art needs to traverse all clusters, and has high energy consumption and poor operation performance.
The primary objective of the present invention is to solve the above technical problems, and the technical solution of the present invention is as follows:
A method for realizing GPS positioning cluster calculation performance optimization based on graph group clustering comprises the following steps:
S1: respectively building a GPU environment, a Spark cluster and a Hadoop cluster, and building a GPU calculation analysis framework on a plurality of nodes provided with the GPU environment;
S2: putting a five-minute position snapshot table and a base station longitude and latitude table of a mobile operator into an HDFS distributed file system through external Kafka pushing and internal Flume receiving;
S3: reading the five-minute position snapshot table and the base station longitude and latitude table in the HDFS distributed file system, and joining the two tables to obtain user position information data containing longitude and latitude information;
S4: deduplicating the user position data for one day in a set city from the user position information data to obtain a user longitude and latitude position statistical table;
S5: performing a mapPartitions operator operation on the user longitude and latitude position information in the user longitude and latitude position statistical table, and then clustering the user longitude and latitude pairs processed by the mapPartitions operator by using a graph group clustering method;
S6: submitting the Spark program to a YARN cluster to run, and storing the obtained analysis result in the HDFS distributed file system, wherein the analysis result is the base station to which each user is attributed.
Further, the fields in the five-minute position snapshot table of the mobile operator in step S2 include the user terminal number, the time of occurrence, and the base station CGI, and the fields in the base station longitude and latitude table are the base station CGI, the longitude, and the latitude.
Further, joining the two tables in step S3 refers to performing an inner join between the fields user terminal number, time of occurrence and base station CGI in the five-minute position snapshot table and the fields base station CGI, longitude and latitude in the base station longitude and latitude table.
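The inner join of step S3 can be illustrated with a minimal sketch in plain Python rather than Spark. The dictionary keys, sample rows and the function name `inner_join_on_cgi` are hypothetical and chosen only for illustration; the sketch also assumes that the base station CGI is the join key, since it is the only field the two tables share.

```python
# Hypothetical sample rows; field names are illustrative, not the operator's schema.
snapshots = [
    {"user_terminal_number": "u1", "time_of_occurrence": "2019-11-19 08:00", "cgi": "460-00-1-1"},
    {"user_terminal_number": "u2", "time_of_occurrence": "2019-11-19 08:05", "cgi": "460-00-1-2"},
    {"user_terminal_number": "u3", "time_of_occurrence": "2019-11-19 08:05", "cgi": "460-00-9-9"},
]
base_stations = [
    {"cgi": "460-00-1-1", "longitude": 113.26, "latitude": 23.13},
    {"cgi": "460-00-1-2", "longitude": 113.32, "latitude": 23.12},
]


def inner_join_on_cgi(snapshots, base_stations):
    """Attach longitude and latitude to each snapshot row whose CGI is known."""
    by_cgi = {row["cgi"]: row for row in base_stations}
    joined = []
    for snap in snapshots:
        station = by_cgi.get(snap["cgi"])
        if station is None:
            continue  # inner join: rows without a matching base station are dropped
        joined.append({**snap,
                       "longitude": station["longitude"],
                       "latitude": station["latitude"]})
    return joined


user_positions = inner_join_on_cgi(snapshots, base_stations)
```

In this sample the row for `u3` is dropped because its CGI has no entry in the base station table, which is the behaviour that distinguishes an inner join from an outer join.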
Further, in step S4, the user position data for one day in a set city is deduplicated from the user position information data to obtain a user longitude and latitude position statistical table, which specifically includes:
S401: screening out the user longitude and latitude position information for the set city and date according to the fields longitude, latitude and time of occurrence of the full user longitude and latitude position information data;
S402: deduplicating the user longitude and latitude position information screened out in step S401, keeping the first piece of position information of each user with the user terminal number as the unique identifier.
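Steps S401 and S402 can be sketched as follows, again in plain Python. Two points are assumptions rather than statements from the patent: the set city is modelled as a longitude and latitude bounding box, and the "first" record of a user is taken to be the earliest by time of occurrence. The function name, field names and sample values are hypothetical.

```python
def first_position_per_user(rows, bbox, date):
    """S401: keep rows inside the city bounding box on the given day.
    S402: keep the earliest remaining row for each user terminal number."""
    lon_min, lon_max, lat_min, lat_max = bbox
    selected = [
        r for r in rows
        if lon_min <= r["longitude"] <= lon_max
        and lat_min <= r["latitude"] <= lat_max
        and r["time_of_occurrence"].startswith(date)
    ]
    # Timestamps are "YYYY-MM-DD HH:MM" strings, so string order is time order.
    selected.sort(key=lambda r: r["time_of_occurrence"])
    first = {}
    for r in selected:
        first.setdefault(r["user_terminal_number"], r)
    return list(first.values())


rows = [
    {"user_terminal_number": "u1", "time_of_occurrence": "2019-11-19 08:05",
     "longitude": 113.30, "latitude": 23.10},
    {"user_terminal_number": "u1", "time_of_occurrence": "2019-11-19 08:00",
     "longitude": 113.26, "latitude": 23.13},
    {"user_terminal_number": "u2", "time_of_occurrence": "2019-11-18 23:55",
     "longitude": 113.32, "latitude": 23.12},
    {"user_terminal_number": "u3", "time_of_occurrence": "2019-11-19 09:00",
     "longitude": 116.40, "latitude": 39.90},
]
stats = first_position_per_user(rows, bbox=(112.9, 114.1, 22.5, 24.0), date="2019-11-19")
```

Here only the 08:00 record of `u1` survives: `u2` falls on a different day and `u3` lies outside the bounding box.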
Further, step S5 performs a mapPartitions operator operation on the user longitude and latitude position information in the user longitude and latitude position statistical table, and then clusters the user longitude and latitude pairs processed by the mapPartitions operator by using a graph group clustering method; the specific steps are as follows:
S501: randomly dividing the user longitude and latitude position information in the user longitude and latitude position statistical table into a plurality of Partitions, performing a map function operation on each Partition, and extracting the data required for clustering, namely the user longitude and the user latitude, from the result of the map function operation; the user longitude and latitude position information in the user longitude and latitude position statistical table is of the RDD data type;
S502: according to the obtained user longitude and user latitude information, initializing the N longitude and latitude positions as N vertices, each vertex forming a cluster on its own, and calculating the modularity M of the cluster network, where the calculation formula is as follows:
[Formula image BDA0002279883380000031: modularity M]
where L represents the number of edges in the graph, N represents the number of vertices, k_i denotes the degree of vertex i, A_ij is the corresponding entry of the adjacency matrix, c_i denotes the cluster of vertex i, and δ is the Kronecker function: δ(c_i, c_j) returns 1 if vertices i and j belong to the same cluster, and 0 otherwise;
S503: randomly selecting two clusters for fusion, and calculating the modularity change ΔM caused by the fusion;
S504: selecting the two clusters with the largest increase in ΔM for fusion, calculating the new modularity of the fused clustering, and recording it;
S505: repeating steps S503 to S504, fusing one pair of clusters each time, calculating ΔM, and recording the new clustering and its corresponding modularity, until all vertices are merged into a single cluster;
S506: examining all records of the clustering process, finding the clustering at which the modularity is largest, and taking that clustering as the final clustering structure.
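The modularity formula of S502 survives in this text only as an image reference. The symbol definitions given for it (L edges, N vertices, degrees k_i, adjacency entries A_ij, Kronecker function δ) match the standard Newman modularity, so, as an assumption rather than a transcription of the patent's figure, the formula would read:

```latex
M = \frac{1}{2L} \sum_{i=1}^{N} \sum_{j=1}^{N}
    \left[ A_{ij} - \frac{k_i k_j}{2L} \right] \delta(c_i, c_j)
```

Under this form, and assuming a graph without self-loops, the initial state of S502 (every vertex in its own cluster) keeps only the i = j terms, so the starting value is M_0 = -\sum_i k_i^2 / (4L^2), which is negative; fusions that join densely connected vertices then raise M.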
Further, step S505 specifically includes:
S5051: converting all RDD data in the Partition into NumPy data, specifically each pair of user longitude and user latitude, the modularity change ΔM and the new modularity M, and using the converted data as the data input, wherein the output has the same length as the user longitude and latitude pair data and is of a 3-dimensional NumPy type, in which the 1st dimension represents the cluster group identifier and the 2nd and 3rd dimensions represent the re-clustered data;
S5052: copying the input data from the host to the device, wherein the host is the CPU and its memory, and the device is the GPU and its memory;
S5053: setting the grid and blocks for the GPU kernel function, wherein the grid is the set of all threads started by one GPU kernel function, the grid comprises a plurality of blocks, and each block comprises a plurality of threads;
S5054: writing the GPU kernel with the assignment of each pair of vertices to the same class as its algorithm logic, and running the GPU kernel on the GPU and its memory.
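The grid and block layout of S5053 can be illustrated with a small CPU-side emulation in Python. It assumes a one-dimensional launch with 256 threads per block (the block size stated in the embodiment) and the common ceiling-division rule for the number of blocks; the patent's own grid formula is only available as an image, so that rule is an assumption. The function names are hypothetical and no GPU library is used.

```python
THREADS_PER_BLOCK = 256  # block size stated in the embodiment


def blocks_per_grid(n_items, threads_per_block=THREADS_PER_BLOCK):
    """Smallest number of blocks whose threads cover n_items (ceiling division).
    This rule is an assumption; the patent's grid formula is not reproduced."""
    return (n_items + threads_per_block - 1) // threads_per_block


def emulate_launch(n_items, threads_per_block=THREADS_PER_BLOCK):
    """Walk every (block, thread) pair of a 1-D launch and count the threads
    that would do real work after the usual bounds check."""
    grid = blocks_per_grid(n_items, threads_per_block)
    covered = 0
    for block_idx in range(grid):
        for thread_idx in range(threads_per_block):
            global_idx = block_idx * threads_per_block + thread_idx
            if global_idx < n_items:  # threads past the end of the data stay idle
                covered += 1
    return grid, covered


grid, covered = emulate_launch(1000)
```

For 1000 items this gives 4 blocks, of which the last is only partly used; the bounds check is what keeps the surplus threads from touching memory outside the input.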
Compared with the prior art, the technical scheme of the invention has the following beneficial effects:
Through the graph group clustering method, the invention overcomes the defect that traditional clustering methods must traverse all clusters by brute force, and so reduces the amount of calculation and the energy consumption; it also realizes distributed cluster calculation on GPUs within a Hadoop/Spark framework, thereby improving running performance.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
FIG. 2 is a flow chart of the community clustering algorithm.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those specifically described herein, and therefore the scope of the present invention is not limited by the specific embodiments disclosed below.
Example 1
Fig. 1 shows a flowchart of a method for optimizing the computation performance of a GPS positioning cluster based on graph community clustering.
A method for realizing GPS positioning cluster calculation performance optimization based on graph group clustering comprises the following steps:
S1: respectively building a GPU environment, a Spark cluster and a Hadoop cluster, and building a GPU calculation analysis framework on a plurality of nodes provided with the GPU environment;
In a specific embodiment, the environment is deployed on 3 servers equipped with GeForce GTX 1080 Ti GPUs, including building the GPU environment, the Spark cluster and the Hadoop cluster, and a GPU calculation analysis framework is built on the multiple nodes provided with the GPU environment. Building the GPU environment comprises installing the NVIDIA driver and CUDA and performing the corresponding environment configuration.
S2: through external Kafka pushing and internal FLUME receiving, a five-minute position snapshot table and a base station longitude and latitude table of a mobile operator are put into an HDFS distributed file system;
It should be noted that the fields of the five-minute position snapshot table include the user terminal number, the time of occurrence, and the base station CGI; the main fields in the base station longitude and latitude table are the base station CGI, longitude and latitude.
S3: reading a five-minute position snapshot table and a base station longitude and latitude table in the HDFS distributed file system, and correlating the two tables to obtain user position information data containing longitude and latitude information;
In a specific embodiment, the five-minute position snapshot table and the base station longitude and latitude table in the HDFS distributed file system are read, and an inner join is performed between the fields of the five-minute position snapshot table (user terminal number, time of occurrence, base station CGI) and the fields of the base station longitude and latitude table (base station CGI, longitude, latitude), finally yielding the user position information data containing longitude and latitude information;
S4: deduplicating the user position data for one day in a set city from the user position information data to obtain a user longitude and latitude position statistical table;
In a specific embodiment, the specific steps are as follows:
S401: screening out the user longitude and latitude position information for the set city and date according to the fields longitude, latitude and time of occurrence of the full user longitude and latitude position information data;
S402: deduplicating the user longitude and latitude position information screened out in step S401, keeping the first piece of position information (base station CGI, longitude and latitude) of each user with the user terminal number as the unique identifier.
S5: carrying out mapPartitions operator operation on the longitude and latitude position information of the users in the longitude and latitude position statistical table of the users, and then clustering the longitude and latitude position information pairs of the users after the mapPartitions operator processing by using a graph group clustering method;
In a specific embodiment, a mapPartitions operator is first used to randomly divide the RDD data to be processed into a plurality of Partitions, and a map function operation is then performed on each Partition, which helps improve the efficiency of the algorithm. Clustering the user longitude and latitude information in the user longitude and latitude position statistical table with the graph group clustering method comprises six steps, as shown in Fig. 2.
S501: randomly dividing the user longitude and latitude position information in the user longitude and latitude position statistical table into a plurality of partitions, carrying out map function operation on each Partition, and extracting data required by clustering from the data subjected to the map function operation, wherein the data comprises user longitude and user latitude; the data type of the user longitude and latitude position information in the user longitude and latitude position statistical table is RDD data;
S502: according to the obtained user longitude and user latitude information, initializing the N longitude and latitude positions as N vertices, each vertex forming a cluster on its own, and calculating the modularity M of the cluster network, where the calculation formula is as follows:
[Formula image BDA0002279883380000051: modularity M]
where L represents the number of edges in the graph, N represents the number of vertices, k_i denotes the degree of vertex i, A_ij is the corresponding entry of the adjacency matrix, c_i denotes the cluster of vertex i, and δ is the Kronecker function: δ(c_i, c_j) returns 1 if vertices i and j belong to the same cluster, and 0 otherwise;
S503: randomly selecting two clusters for fusion, and calculating the modularity change ΔM caused by the fusion;
S504: selecting the two clusters with the largest increase in ΔM for fusion, calculating the new modularity of the fused clustering, and recording it;
S505: repeating steps S503 to S504, fusing one pair of clusters each time, calculating ΔM, and recording the new clustering and its corresponding modularity, until all vertices are merged into a single cluster;
In step S505, the mapPartitions operator calculation and GPU acceleration are performed, and the specific steps include:
S5051: converting all RDD data in the Partition into NumPy data, specifically each pair of user longitude and user latitude, the modularity change ΔM and the new modularity M, and using the converted data as the data input, wherein the output has the same length as the user longitude and latitude pair data and is of a 3-dimensional NumPy type, in which the 1st dimension represents the cluster group identifier and the 2nd and 3rd dimensions represent the re-clustered data;
It should be noted that the modularity is a measure of the quality of a graph community partition: the larger the value, the better the partition.
S5052: copying the input data from the CPU to the GPU;
S5053: setting the grid and blocks for the GPU kernel function, wherein the grid is the set of all threads started by one GPU kernel function, the grid comprises a plurality of blocks, and each block comprises a plurality of threads;
In a specific embodiment, block is set to 256,
Figure BDA0002279883380000061
S5054: writing the GPU kernel with the assignment of each pair of vertices to the same class as its algorithm logic, and running the GPU kernel on the GPU and its memory.
S506: and detecting all records of the clustering process, inquiring the corresponding clustering mode when the modularity value is maximum, and taking the clustering mode as a final clustering structure.
S6: submitting the Spark program to a YARN cluster to run, and storing the obtained analysis result in the HDFS distributed file system, wherein the analysis result is the base station to which each user is attributed.
In this embodiment, 1,000,000 user longitude and latitude position records are used for a clustering test. The Spark program is submitted to a YARN cluster to run, and the running times with and without the GPU are measured. The position clustering algorithm takes 3.6 s with the GPU and 27.4 s without it, so GPU acceleration speeds up the graph group clustering algorithm by more than 6 times (27.4 s / 3.6 s ≈ 7.6).
It should be understood that the above-described embodiments of the present invention are merely examples given to illustrate the invention clearly, and are not intended to limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. It is neither necessary nor possible to list all embodiments exhaustively here. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the claims of the present invention.
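For reference, the merge loop of steps S502 to S506 can be sketched on a CPU in plain Python. Several choices here are assumptions, not the patent's method: the patent does not state how edges are built from longitude and latitude pairs, so `build_edges` links points closer than a hypothetical distance threshold; each round evaluates every pair of clusters rather than a random pair; and the modularity is computed in the equivalent per-cluster form of the standard Newman modularity. All names are illustrative.

```python
from itertools import combinations


def build_edges(points, threshold):
    """Hypothetical graph construction: link points closer than `threshold` degrees."""
    edges = set()
    for i, j in combinations(range(len(points)), 2):
        dx = points[i][0] - points[j][0]
        dy = points[i][1] - points[j][1]
        if (dx * dx + dy * dy) ** 0.5 < threshold:
            edges.add((i, j))
    return edges


def modularity(clusters, edges):
    """M = sum over clusters of (intra_edges / L - (degree_sum / (2 L)) ** 2)."""
    total = len(edges)
    if total == 0:
        return 0.0
    label = {v: c for c, members in enumerate(clusters) for v in members}
    intra = [0] * len(clusters)
    degree = [0] * len(clusters)
    for i, j in edges:
        degree[label[i]] += 1
        degree[label[j]] += 1
        if label[i] == label[j]:
            intra[label[i]] += 1
    return sum(intra[c] / total - (degree[c] / (2 * total)) ** 2
               for c in range(len(clusters)))


def greedy_merge(n_vertices, edges):
    """Start from singletons (S502), repeatedly fuse the pair giving the highest
    modularity (S503 to S505), and return the best recorded clustering (S506)."""
    clusters = [frozenset([v]) for v in range(n_vertices)]
    history = [(modularity(clusters, edges), clusters)]
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            merged = [c for k, c in enumerate(clusters) if k not in (a, b)]
            merged.append(clusters[a] | clusters[b])
            score = modularity(merged, edges)
            if best is None or score > best[0]:
                best = (score, merged)
        clusters = best[1]
        history.append(best)
    return max(history, key=lambda record: record[0])


# Two hypothetical groups of three nearby positions each (longitude, latitude).
points = [(113.26, 23.13), (113.27, 23.13), (113.26, 23.14),
          (113.40, 23.00), (113.41, 23.00), (113.40, 23.01)]
edges = build_edges(points, threshold=0.05)
best_m, best_clusters = greedy_merge(len(points), edges)
```

On this sample the recorded modularity peaks when the two groups of three points are kept as separate clusters and drops again at the final fusion into a single cluster, which is why S506 looks back over the whole record instead of taking the last state. The exhaustive pair search recomputes the modularity for every candidate and is only practical for tiny inputs.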

Claims (6)

1. A method for realizing GPS positioning cluster calculation performance optimization based on graph group clustering is characterized by comprising the following steps:
S1: respectively building a GPU environment, a Spark cluster and a Hadoop cluster, and building a GPU calculation analysis framework on a plurality of nodes provided with the GPU environment;
S2: putting a five-minute position snapshot table and a base station longitude and latitude table of a mobile operator into an HDFS distributed file system through external Kafka pushing and internal Flume receiving;
S3: reading the five-minute position snapshot table and the base station longitude and latitude table in the HDFS distributed file system, and joining the two tables to obtain user position information data containing longitude and latitude information;
S4: deduplicating the user position data for one day in a set city from the user position information data to obtain a user longitude and latitude position statistical table;
S5: performing a mapPartitions operator operation on the user longitude and latitude position information in the user longitude and latitude position statistical table, and then clustering the user longitude and latitude pairs processed by the mapPartitions operator by using a graph group clustering method;
S6: submitting the Spark program to a YARN cluster to run, and storing the obtained analysis result in the HDFS distributed file system, wherein the analysis result is the base station to which each user is attributed.
2. The method for realizing GPS positioning cluster calculation performance optimization based on graph group clustering according to claim 1, wherein the fields in the five-minute position snapshot table of the mobile operator in step S2 include the user terminal number, the time of occurrence, and the base station CGI, and the fields in the base station longitude and latitude table are the base station CGI, longitude, and latitude.
3. The method according to claim 2, wherein joining the two tables in step S3 means performing an inner join between the fields user terminal number, time of occurrence and base station CGI in the five-minute position snapshot table and the fields base station CGI, longitude and latitude in the base station longitude and latitude table.
4. The method according to claim 1, wherein in step S4, the user position data for one day in a set city is deduplicated from the user position information data to obtain a user longitude and latitude position statistical table, and the specific steps are as follows:
S401: screening out the user longitude and latitude position information for the set city and date according to the fields longitude, latitude and time of occurrence of the full user longitude and latitude position information data;
S402: deduplicating the user longitude and latitude position information screened out in step S401, keeping the first piece of position information of each user with the user terminal number as the unique identifier.
5. The method according to claim 1, wherein in step S5, a mapPartitions operator operation is performed on the user longitude and latitude position information in the user longitude and latitude position statistical table, and the user longitude and latitude pairs processed by the mapPartitions operator are then clustered by using a graph group clustering method; the specific steps are as follows:
S501: randomly dividing the user longitude and latitude position information in the user longitude and latitude position statistical table into a plurality of Partitions, performing a map function operation on each Partition, and extracting the data required for clustering, namely the user longitude and the user latitude, from the result of the map function operation; the user longitude and latitude position information in the user longitude and latitude position statistical table is of the RDD data type;
S502: according to the obtained user longitude and user latitude information, initializing the N longitude and latitude positions as N vertices, each vertex forming a cluster on its own, and calculating the modularity M of the cluster network, where the calculation formula is as follows:
[Formula image FDA0002279883370000021: modularity M]
where L represents the number of edges in the graph, N represents the number of vertices, k_i denotes the degree of vertex i, A_ij is the corresponding entry of the adjacency matrix, c_i denotes the cluster of vertex i, and δ is the Kronecker function: δ(c_i, c_j) returns 1 if vertices i and j belong to the same cluster, and 0 otherwise;
S503: randomly selecting two clusters for fusion, and calculating the modularity change ΔM caused by the fusion;
S504: selecting the two clusters with the largest increase in ΔM for fusion, calculating the new modularity of the fused clustering, and recording it;
S505: repeating steps S503 to S504, fusing one pair of clusters each time, calculating ΔM, and recording the new clustering and its corresponding modularity, until all vertices are merged into a single cluster;
S506: examining all records of the clustering process, finding the clustering at which the modularity is largest, and taking that clustering as the final clustering structure.
6. The method for optimizing the calculation performance of the GPS positioning cluster based on the graph community clustering according to claim 5, wherein the specific process of step S505 is as follows:
S5051: converting all RDD data in the Partition into NumPy-type data, the data specifically comprising each pair of user longitude and user latitude, the modularity change value ΔM and the new modularity M, and taking the converted data as the data input, wherein the length of the data output is the same as the length of the user longitude and user latitude pair data, and the data output type is a 3-dimensional NumPy type, wherein the 1st dimension represents the cluster group identifier, and the 2nd and 3rd dimensions represent the re-clustered data;
S5052: copying the input data from the host to the device, wherein the host is the CPU and its memory, and the device is the GPU and its memory;
S5053: setting the grid and blocks for the GPU kernel function, wherein the grid is the set of all threads launched by one GPU kernel function, the grid comprises a plurality of blocks, and each block comprises a plurality of threads;
S5054: writing the GPU kernel with the assignment of each pair of vertices to the same class as its algorithm logic, and running the GPU kernel on the GPU and its memory.
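The grid and block arrangement of S5053 and the per-thread logic of S5054 can be illustrated without a GPU. The following pure-Python sketch simulates a kernel launch on the CPU; it is an illustration under stated assumptions, not the claimed kernel. The claim names no GPU library, so none is used here and the host-to-device copy of S5052 is not shown. The kernel body (each thread relabels one vertex so that the two merged clusters share one identifier) is one plausible reading of "dividing each pair of vertices into the same class". The function names, the block size of 4, and the sample coordinates are all hypothetical.

```python
import math


def launch_config(n_items, threads_per_block=256):
    # S5053: choose enough blocks that blocks * threads >= n_items.
    blocks_per_grid = math.ceil(n_items / threads_per_block)
    return blocks_per_grid, threads_per_block


def relabel_kernel(block_idx, thread_idx, block_dim,
                   points, labels, keep, drop, out):
    # Global index of this thread within the grid.
    i = block_idx * block_dim + thread_idx
    if i >= len(points):  # surplus threads in the last block do nothing
        return
    lon, lat = points[i]
    # S5054 (assumed logic): vertices of the dropped cluster join the kept one.
    label = keep if labels[i] == drop else labels[i]
    # Output row: cluster identifier, then the longitude/latitude pair.
    out[i] = (label, lon, lat)


def simulate_launch(kernel, grid_dim, block_dim, *args):
    # Sequential stand-in for a launch: on a GPU these iterations
    # would run as parallel threads.
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, thread_idx, block_dim, *args)


# Illustrative input only: (longitude, latitude) pairs and current labels.
points = [(113.26, 23.13), (113.27, 23.13), (113.27, 23.14),
          (114.06, 22.54), (114.07, 22.55)]
labels = [0, 1, 1, 3, 4]
out = [None] * len(points)

grid_dim, block_dim = launch_config(len(points), threads_per_block=4)
simulate_launch(relabel_kernel, grid_dim, block_dim,
                points, labels, 0, 1, out)
```

Because each thread writes only its own output row, this particular kernel body needs no synchronization between threads; the bounds check handles the case where the grid holds more threads than there are data rows.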
CN201911137142.1A 2019-11-19 2019-11-19 Method for realizing GPS positioning cluster calculation performance optimization based on graph group clustering Pending CN110990367A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911137142.1A CN110990367A (en) 2019-11-19 2019-11-19 Method for realizing GPS positioning cluster calculation performance optimization based on graph group clustering

Publications (1)

Publication Number Publication Date
CN110990367A true CN110990367A (en) 2020-04-10

Family

ID=70085353

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911137142.1A Pending CN110990367A (en) 2019-11-19 2019-11-19 Method for realizing GPS positioning cluster calculation performance optimization based on graph group clustering

Country Status (1)

Country Link
CN (1) CN110990367A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110213583A1 (en) * 2010-03-01 2011-09-01 Qualcomm Incorporated Fast clustering of position data for user profiling
US8792909B1 (en) * 2013-12-04 2014-07-29 4 Info, Inc. Systems and methods for statistically associating mobile devices to households
US20150262397A1 (en) * 2014-03-14 2015-09-17 Under Armour, Inc. System and method for generating a map from activity data
CN106604297A (en) * 2015-10-20 2017-04-26 中国电信股份有限公司 Method and equipment for optimizing longitude and latitude data in center of sector of base station
CN109213940A (en) * 2017-06-30 2019-01-15 武汉斗鱼网络科技有限公司 Method, storage medium, equipment and system that user location calculates are realized under big data

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112990744A (en) * 2021-03-30 2021-06-18 杭州东方通信软件技术有限公司 Automatic operation and maintenance method and device for massive million-level cloud equipment
CN112990744B (en) * 2021-03-30 2022-07-12 杭州东方通信软件技术有限公司 Automatic operation and maintenance method and device for massive million-level cloud equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned

Effective date of abandoning: 20230818