WO2021232585A1 - Artificial intelligence-based positioning data processing method and related device

Info

Publication number
WO2021232585A1
Authority
WO
WIPO (PCT)
Prior art keywords
points
stay
point
candidate
positioning
Prior art date
Application number
PCT/CN2020/104604
Other languages
French (fr)
Chinese (zh)
Inventor
朱海胜
许华杰
Original Assignee
平安国际智慧城市科技股份有限公司
Application filed by 平安国际智慧城市科技股份有限公司

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24147 Distances to closest patterns, e.g. nearest neighbour classification

Definitions

  • Origin-destination (OD) analysis of traffic travel volume obtains users' daily travel data. Analyzing this data reveals the characteristics and distribution of users' demand for citywide traffic and other urban functions, and provides information and decision support for urban traffic planning and construction. The OD matrix is a key input to this analysis.
  • the OD matrix is the matrix of trip start and end points. Building it requires knowing the start point and end point of every trip that users make in the city during the time period in question, that is, the travel stay points. A user's travel positioning data contains many positioning points. The inventors realized that identifying the user's travel stay points from these positioning points is a technical problem in urgent need of a solution.
  • the first aspect of the present application provides an artificial intelligence-based positioning data processing method.
  • the artificial intelligence-based positioning data processing method includes:
  • the clustering algorithm DBSCAN and the K-nearest neighbor classification algorithm KNN are used to process the processed data on the spatial layer to obtain candidate regions of multiple categories, wherein each candidate region includes multiple candidate stay points belonging to the same category;
  • the multiple candidate stay points are subdivided on the time layer according to the cluster identifiers of the multiple positioning points corresponding to the candidate region to obtain a stay point set, wherein
  • the stay point set includes the final stay points of the user's travel;
  • a second aspect of the present application provides an electronic device, wherein the electronic device includes a processor and a memory, and the processor is configured to execute at least one computer-readable instruction stored in the memory to implement the following steps:
  • the clustering algorithm DBSCAN and the K-nearest neighbor classification algorithm KNN are used to process the processed data on the spatial layer to obtain candidate regions of multiple categories, wherein each candidate region includes multiple candidate stay points belonging to the same category;
  • the multiple candidate stay points are subdivided on the time layer according to the cluster identifiers of the multiple positioning points corresponding to the candidate region to obtain a stay point set, wherein
  • the stay point set includes the final stay points of the user's travel;
  • a third aspect of the present application provides a computer-readable storage medium on which at least one computer-readable instruction is stored, wherein the at least one computer-readable instruction implements the following steps when executed by a processor:
  • the clustering algorithm DBSCAN and the K-nearest neighbor classification algorithm KNN are used to process the processed data on the spatial layer to obtain candidate regions of multiple categories, wherein each candidate region includes multiple candidate stay points belonging to the same category;
  • the multiple candidate stay points are subdivided on the time layer according to the cluster identifiers of the multiple positioning points corresponding to the candidate region to obtain a stay point set, wherein
  • the stay point set includes the final stay points of the user's travel;
  • a fourth aspect of the present application provides a positioning data processing device, the positioning data processing device includes:
  • the obtaining module is used to obtain multiple positioning data representing the user's travel recorded by the user terminal;
  • the first processing module is configured to preprocess the positioning data to obtain processed data;
  • the second processing module is configured to use the clustering algorithm DBSCAN and the K-nearest neighbor classification algorithm KNN to process the processed data on the spatial layer to obtain candidate regions of multiple categories, where each candidate region includes multiple candidate stay points belonging to the same category;
  • the third processing module is configured to subdivide the multiple candidate stay points on the time layer according to the cluster identifiers of the multiple positioning points corresponding to the candidate regions of each category to obtain a stay point set, wherein the stay point set includes the final stay points of the user's travel;
  • the upload module is used to upload the set of stay points to the blockchain.
  • this application can be applied to areas that require positioning data processing, such as smart city management, smart security, smart logistics, and smart transportation, so as to promote the development of smart cities.
  • the DBSCAN algorithm and the KNN algorithm are used to cluster the positioning points on the spatial layer and optimize the clustering results, and the candidate stay points are then divided and selected on the time layer. This two-layer space-time clustering effectively identifies stay points, removes part of the positioning drift points, and improves the recognition accuracy of stay points.
  • FIG. 3 is a schematic structural diagram of an electronic device implementing a preferred embodiment of the artificial intelligence-based positioning data processing method according to the present application.
  • FIG. 1 is a flowchart of a preferred embodiment of an artificial intelligence-based positioning data processing method disclosed in the present application. Depending on requirements, the order of the steps in the flowchart can be changed, and some steps can be omitted.
  • the positioning data includes the geographic location of the positioning point and the arrival time of each positioning point.
  • preprocessing the positioning data to obtain processed data includes:
  • the drift data in the intermediate data is deleted to obtain processed data.
  • the preprocessing can include equal-time-interval processing, such as dividing the positioning data for a period of time according to a preset time interval (for example, 10 min); the preprocessing also includes deleting the drift data, where the drift data includes drift points.
  • a positioning point whose moving speed is greater than the maximum speed of urban traffic is usually a drift point. Drift points greatly degrade the recognition of stay points, so they need to be deleted.
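The preprocessing described above can be illustrated with a minimal Python sketch. It assumes each positioning point is a `(t_seconds, lat, lon)` tuple sorted by time. The function names, the averaging used to produce one point per interval, the comparison against the last kept point, and the 33.3 m/s speed limit are all illustrative assumptions; the source gives only the 10-minute interval as an example.

```python
import math
from collections import defaultdict

MAX_URBAN_SPEED_MPS = 33.3  # assumed limit (about 120 km/h); not given in the source


def resample_equal_interval(points, interval_s=600):
    """Bucket (t, lat, lon) points into fixed intervals and average each bucket.

    Averaging is an assumption; the source only says the data is divided
    according to a preset time interval such as 10 minutes.
    """
    buckets = defaultdict(list)
    for t, lat, lon in points:
        buckets[int(t // interval_s)].append((lat, lon))
    resampled = []
    for b in sorted(buckets):
        pts = buckets[b]
        lat = sum(p[0] for p in pts) / len(pts)
        lon = sum(p[1] for p in pts) / len(pts)
        resampled.append((b * interval_s, lat, lon))
    return resampled


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude pairs."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def remove_drift_points(points, max_speed=MAX_URBAN_SPEED_MPS):
    """Drop points whose speed relative to the last kept point exceeds max_speed."""
    kept = []
    for t, lat, lon in points:
        if kept:
            t0, lat0, lon0 = kept[-1]
            dt = t - t0
            if dt <= 0:
                continue  # duplicate or out-of-order timestamp
            if haversine_m(lat0, lon0, lat, lon) / dt > max_speed:
                continue  # implausibly fast jump: treat as a drift point
        kept.append((t, lat, lon))
    return kept
```

A speed check of this kind only catches drift that shows up between consecutive samples; drift across a long sampling gap can still pass.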
  • DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
  • the clustering algorithm can not only identify clusters of any shape and size, but also has strong resistance to noise.
  • the main idea of the DBSCAN clustering algorithm is: select an unprocessed sample point P from the sample point set D, detect the sample points in the Eps-neighborhood of point P to search for clusters that meet the requirements, if the Eps.
  • the length of time the user stays at a place can be inferred from the spatial density of the positioning points. Therefore, the density-based DBSCAN algorithm can be used to cluster the positioning points on the spatial layer, which preliminarily obtains the candidate stay points of the user's travel.
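As a rough illustration of density-based clustering on the spatial layer, the following is a minimal textbook DBSCAN in plain Python. It is a sketch under stated assumptions, not the applicant's implementation: points are taken to be planar `(x, y)` coordinates in meters, and `eps`, `min_pts`, and the `NOISE` label are names chosen here for illustration.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    xi, yi = points[i]
    return [j for j, (x, y) in enumerate(points)
            if math.hypot(x - xi, y - yi) <= eps]


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks unclustered points."""
    labels = [None] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already processed
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: join, do not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
        cluster_id += 1
    return labels


sample = [(0, 0), (1, 0), (0, 1), (1, 1),
          (100, 100), (101, 100), (100, 101),
          (50, 50)]
sample_labels = dbscan(sample, eps=2, min_pts=3)
```

In this sample the two dense groups receive separate cluster labels and the isolated point at `(50, 50)` is marked as noise, which is the behavior that lets dense runs of positioning points stand out as candidate stays.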
  • KNN (K-Nearest Neighbor) classification algorithm
  • the positioning data of the user terminal is collected at non-uniform time intervals, and the interval between two consecutive positioning data collections is sometimes large. As a result, even if the displacement speed between a positioning point and the previous positioning point is less than the maximum speed of urban traffic, the positioning point may still be a drift point. Therefore, the displacement speed between two positioning points alone cannot identify all positioning drift points, and the remaining drift points greatly degrade the recognition of stay points: they may split one long stay into multiple short stays, and may even cause some stay points to go unrecognized. Therefore, to identify the stay points of a user's travel more accurately, the idea of the KNN algorithm is used to optimize the clustering results.
  • the use of the clustering algorithm DBSCAN and the K nearest neighbor classification algorithm KNN to process the processed data on the spatial layer to obtain candidate regions including multiple categories includes:
  • the DBSCAN algorithm is used to process the processed data on the spatial layer to obtain a plurality of first stay points of the user's travel;
  • a candidate area is constructed.
  • the DBSCAN algorithm is used to obtain a plurality of first stay points that are initially identified. Since there are unavoidable drift points in the processed data, it is also necessary to use the KNN algorithm to further optimize the classification of the plurality of first stay points.
  • the use of the DBSCAN algorithm to process the processed data on the spatial layer to obtain multiple first stay points for the user to travel includes:
  • the positioning result will change to a certain extent even if the user’s position does not move.
  • travel refers to an activity that covers more than 500 meters by means of transportation or takes more than 5 minutes on foot; therefore, a user's position movement within a small area, as reflected in the positioning data, does not mean that the user has made a trip.
  • based on this definition, thresholds can be set for the stay point discrimination distance and the stay point discrimination time; for example, the time threshold can be set to 5 minutes.
  • the method further includes:
  • constructing the candidate area includes:
  • a candidate area is constructed according to the first stay points belonging to the same category.
  • the KNN algorithm is used to classify the multiple first stay points, so that the drift points can be grouped into one category.
  • the categories that include the drift points can be deleted, so that the remaining categories are further optimized.
  • the use of the KNN algorithm to classify the plurality of first stay points, and obtain the first stay points of the plurality of categories includes:
  • For each of the first stay points obtain a plurality of positioning point sets corresponding to the first stay points;
  • the KNN algorithm is used to change the cluster identifier of any anchor point in the plurality of anchor point sets
  • the first stay points corresponding to the anchor points with the same cluster identifier after the change are classified into the same category.
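One way to picture the KNN-based relabeling is a majority vote: each positioning point takes the cluster identifier that is most common among its k nearest neighbors. The source does not specify k, the distance measure, or the voting rule, so everything in this sketch (planar `(x, y)` points, Euclidean distance, `k=3`, the `knn_relabel` name) is an assumption made for illustration.

```python
import math
from collections import Counter


def knn_relabel(points, labels, k=3):
    """Give each point the majority cluster identifier of its k nearest neighbors.

    Votes are read from the original labels, so the result does not depend
    on the order in which points are visited.
    """
    new_labels = []
    for i, (xi, yi) in enumerate(points):
        neighbors = sorted(
            (math.hypot(x - xi, y - yi), j)
            for j, (x, y) in enumerate(points) if j != i
        )[:k]
        if not neighbors:
            new_labels.append(labels[i])  # a lone point keeps its label
            continue
        votes = Counter(labels[j] for _, j in neighbors)
        new_labels.append(votes.most_common(1)[0][0])
    return new_labels


pts = [(0, 0), (1, 0), (0, 1), (0.5, 0.5), (10, 10), (11, 10), (10, 11)]
old = [0, 0, 0, -1, 1, 1, 1]  # -1: a point the spatial clustering left unassigned
new = knn_relabel(pts, old, k=3)
```

Here the unassigned point in the middle of the first group is pulled into that group because all three of its nearest neighbors carry identifier 0. Stay points whose positioning points end up with the same identifier can then be grouped into one category.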
  • the set of staying points includes the final staying point of the user when traveling.
  • the candidate stay points of the user's travel are extracted from the positioning data of the user terminal. Positioning points that are close in the spatial dimension may be far apart in the time dimension. For example, a user works at location A in the morning, leaves location A for lunch at noon, and continues to work at location A in the afternoon. Clustering the user's trajectory points only on the spatial layer puts the morning and afternoon work at location A into the same cluster, so only one candidate stay point is identified, although the user actually stayed at location A twice and there should be two stay points. The candidate stay points therefore need to be further processed in the time dimension to obtain the final stay points of the user's travel.
  • clustering is performed on the time layer to divide and select candidate stay points.
  • anchor points whose first cluster identifier equals the initialized cluster identifier are added to the new cluster until the first cluster identifier of the currently read anchor point is not equal to the initialized cluster identifier, and it is then determined whether the number of anchor points in the new cluster is greater than or equal to the preset stay point discrimination time threshold;
  • if the number of anchor points in the new cluster is greater than or equal to the preset stay point discrimination time threshold, obtain the target candidate stay point corresponding to all the anchor points read so far, and add the target candidate stay point to the stay point set.
  • the method further includes:
  • the initialized cluster identifier is updated to the cluster identifier of the currently read anchor point, and the iteration continues.
  • the method further includes:
  • step 5 If the number N of anchor points in the new cluster C is greater than or equal to the stay point discrimination time threshold, obtain the target candidate stay point corresponding to all the anchor points that have been read before, that is, the geometric center point of the anchor points {p_j, p_{j+1}, ..., p_{i-1}}, where the arrival time of the target candidate stay point is the positioning time of p_i, and the stay time is N (minutes). Add the target candidate stay point to the stay point set SP, then go to step 5); if the number N of anchor points in the new cluster C is less than the stay point discrimination time threshold, go directly to step 5);
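The time-layer subdivision can be sketched as a single pass over the time-ordered anchor points that groups consecutive points sharing a cluster identifier and keeps only runs that are long enough. This is a minimal sketch, assuming points are `(t_minutes, x, y, cluster_id)` tuples with one point per unit time; the `split_on_time_layer` name, the `NOISE` label, and the dictionary layout are hypothetical. The arrival time is taken here as the time of the first point in the run, following the stay point definition given elsewhere in the description.

```python
from itertools import groupby

NOISE = -1  # assumed label for points outside every spatial cluster


def split_on_time_layer(points, min_count):
    """Split spatial clusters into time-contiguous stays.

    points: (t_minutes, x, y, cluster_id) tuples sorted by ascending t.
    min_count: stay point discrimination time threshold, in unit-time points.
    """
    stay_points = []
    for cluster_id, run in groupby(points, key=lambda p: p[3]):
        run = list(run)
        if cluster_id == NOISE or len(run) < min_count:
            continue  # too short to count as a stay
        n = len(run)
        stay_points.append({
            "cluster_id": cluster_id,
            "x": sum(p[1] for p in run) / n,  # geometric center of the run
            "y": sum(p[2] for p in run) / n,
            "arrival_time": run[0][0],
            "stay_time": n,
        })
    return stay_points


# Morning at location A (cluster 0), a short lunch elsewhere (cluster 1),
# then the afternoon back at location A.
track = [(0, 0.0, 0.0, 0), (1, 1.0, 0.0, 0), (2, 0.0, 1.0, 0),
         (3, 50.0, 50.0, 1), (4, 51.0, 50.0, 1),
         (5, 0.0, 0.0, 0), (6, 1.0, 0.0, 0), (7, 0.0, 1.0, 0)]
stays = split_on_time_layer(track, min_count=3)
```

With this track, location A yields two separate stay points even though both visits share one spatial cluster, and the two-point lunch run falls below the threshold.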
  • the method further includes:
  • the above steps 1)-5) are iterated in ascending order of the arrival time of the anchor points. Therefore, the final stay points in the obtained stay point set SP are also arranged in ascending order of time, and the final stay points included in the stay point set can be connected directly to obtain the travel chain of the user's travel.
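Because the stay point set is already in time order, connecting neighboring stay points gives the travel chain, and counting the resulting origin-destination pairs gives entries of an OD matrix. The sketch below assumes each stay point has been reduced to a place label; the labels and the function names are hypothetical.

```python
from collections import Counter


def travel_chain(stay_points):
    """Return consecutive (origin, destination) pairs from time-ordered stay points."""
    return list(zip(stay_points, stay_points[1:]))


def od_counts(chain):
    """Count trips per (origin, destination) pair, i.e. sparse OD matrix entries."""
    return Counter(chain)


chain = travel_chain(["home", "office", "restaurant", "office", "home"])
od = od_counts(chain)
```

Each pair in `chain` is one trip, so five stay points produce four trips.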
  • the DBSCAN algorithm and the KNN algorithm are used to cluster the positioning points on the spatial layer and optimize the clustering results, and the candidate stay points are then divided and selected on the time layer. This two-layer space-time clustering effectively identifies stay points, removes part of the positioning drift points, and improves the recognition accuracy of stay points.
  • this application can be applied to areas that require processing of positioning data, such as smart city management, smart security, smart logistics, and smart transportation, so as to promote the development of smart cities.
  • Fig. 2 is a functional module diagram of a preferred embodiment of a positioning data processing device disclosed in the present application.
  • the positioning data processing device runs in an electronic device.
  • the positioning data processing device may include multiple functional modules composed of program code segments.
  • the program code of each program segment in the positioning data processing device may be stored in a memory and executed by at least one processor to execute part or all of the steps in the artificial intelligence-based positioning data processing method described in FIG. 1 .
  • the positioning data processing device may be divided into multiple functional modules according to the functions it performs.
  • the functional modules may include: an acquisition module 201, a first processing module 202, a second processing module 203, a third processing module 204, and an uploading module 205.
  • the module referred to in this application refers to a series of computer program segments that can be executed by at least one processor and can complete fixed functions, and are stored in a memory.
  • the positioning data includes the geographic location of the positioning point and the arrival time of each positioning point.
  • the first processing module 202 is configured to preprocess the positioning data to obtain processed data.
  • the drift data in the intermediate data is deleted to obtain processed data.
  • the preprocessing can include equal-time-interval processing, such as dividing the positioning data for a period of time according to a preset time interval (for example, 10 min); the preprocessing also includes deleting the drift data, where the drift data includes drift points.
  • a positioning point whose moving speed is greater than the maximum speed of urban traffic is usually a drift point. Drift points greatly degrade the recognition of stay points, so they need to be deleted.
  • DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
  • the clustering algorithm can not only identify clusters of any shape and size, but also has strong resistance to noise.
  • the main idea of the DBSCAN clustering algorithm is: select an unprocessed sample point P from the sample point set D, detect the sample points in the Eps-neighborhood of point P to search for clusters that meet the requirements, if the Eps.
  • KNN (K-Nearest Neighbor) classification algorithm
  • the positioning data of the user terminal is collected at non-uniform time intervals, and the interval between two consecutive positioning data collections is sometimes large. As a result, even if the displacement speed between a positioning point and the previous positioning point is less than the maximum speed of urban traffic, the positioning point may still be a drift point. Therefore, the displacement speed between two positioning points alone cannot identify all positioning drift points, and the remaining drift points greatly degrade the recognition of stay points: they may split one long stay into multiple short stays, and may even cause some stay points to go unrecognized. Therefore, to identify the stay points of a user's travel more accurately, the idea of the KNN algorithm is used to optimize the clustering results.
  • the use of the clustering algorithm DBSCAN and the K nearest neighbor classification algorithm KNN to process the processed data on the spatial layer to obtain candidate regions including multiple categories includes:
  • the DBSCAN algorithm is used to process the processed data on the spatial layer to obtain multiple first stay points of the user's travel;
  • a candidate area is constructed.
  • the DBSCAN algorithm is used to obtain a plurality of first stay points that are initially identified. Since there are unavoidable drift points in the processed data, it is also necessary to use the KNN algorithm to further optimize the classification of the plurality of first stay points.
  • the use of the DBSCAN algorithm to process the processed data on the spatial layer to obtain multiple first stay points for the user to travel includes:
  • the geometric center point of all the positioning points in the neighborhood is calculated, and the geometric center point is determined as a first stay point of the user's travel.
  • a stay point can be defined as follows: for any positioning point P, if the number N of positioning points in the neighborhood centered on P with radius R (the stay point discrimination distance threshold) is greater than or equal to the stay point discrimination time threshold, then the geometric center point of all positioning points in the R-neighborhood is called a stay point, the arrival time at the stay point is the arrival time of the first positioning point in the R-neighborhood, and the stay time is N (unit time). Each positioning point has the same length in the time dimension, namely one unit time, so the number of positioning points in the R-neighborhood of point P is the stay time of the stay point.
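The stay point definition above translates fairly directly into code. The following is a minimal sketch, assuming positioning points are `(t, x, y)` tuples with planar coordinates in meters and equal time spacing; the `StayPoint` class and the `stay_point_at` function are names introduced here for illustration only.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class StayPoint:
    x: float             # geometric center, x coordinate
    y: float             # geometric center, y coordinate
    arrival_time: float  # arrival time of the first point in the neighborhood
    stay_time: int       # number of points N, in unit time


def stay_point_at(points: List[Tuple[float, float, float]],
                  center_idx: int,
                  radius: float,
                  min_count: int) -> Optional[StayPoint]:
    """Apply the stay point definition to the R-neighborhood of one point.

    radius is the stay point discrimination distance threshold R and
    min_count is the stay point discrimination time threshold.
    """
    _, cx, cy = points[center_idx]
    hood = [(t, x, y) for t, x, y in points
            if math.hypot(x - cx, y - cy) <= radius]
    n = len(hood)
    if n < min_count:
        return None  # not dense enough to be a stay point
    return StayPoint(
        x=sum(p[1] for p in hood) / n,
        y=sum(p[2] for p in hood) / n,
        arrival_time=min(p[0] for p in hood),
        stay_time=n,
    )


pts = [(0, 0.0, 0.0), (1, 2.0, 0.0), (2, 0.0, 2.0), (3, 2.0, 2.0), (4, 500.0, 500.0)]
sp = stay_point_at(pts, center_idx=0, radius=5.0, min_count=4)
```

Because every point represents one unit of time, the neighbor count doubles as the stay duration, which is why the same threshold appears as both a count and a time.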
  • the KNN algorithm is used to classify the multiple first stay points, so that the drift points can be grouped into one category.
  • the categories that include the drift points can be deleted, so that the remaining categories are further optimized.
  • the use of the KNN algorithm to classify the multiple first stay points, and obtain the first stay points of the multiple categories includes:
  • the KNN algorithm is used to change the cluster identifier of any anchor point in the plurality of anchor point sets
  • each anchor point data contains a cluster ID field.
  • the third processing module 204 is configured to subdivide the multiple candidate stay points on the time layer according to the cluster identifiers of the multiple positioning points corresponding to the candidate areas of each category to obtain a stay point set, where the stay point set includes the final stay points of the user's travel.
  • clustering is performed on the time layer to divide and select candidate stay points.
  • anchor points whose first cluster identifier equals the initialized cluster identifier are added to the new cluster until the first cluster identifier of the currently read anchor point is not equal to the initialized cluster identifier, and it is then determined whether the number of anchor points in the new cluster is greater than or equal to the preset stay point discrimination time threshold;
  • the upload module 205 is configured to upload the stay point set to the blockchain.
  • the DBSCAN algorithm and the KNN algorithm are used to cluster the positioning points on the spatial layer and optimize the clustering results, and the candidate stay points are then divided and selected on the time layer. This two-layer space-time clustering effectively identifies stay points, removes part of the positioning drift points, and improves the recognition accuracy of stay points.
  • the electronic device 3 is a device that can automatically perform numerical calculation and/or information processing in accordance with preset or stored instructions. Its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), an embedded device, etc.
  • the electronic equipment may also include network equipment and/or user equipment.
  • the network device includes, but is not limited to, a single network server, a server group composed of multiple network servers, or a cloud composed of a large number of hosts or network servers based on Cloud Computing.
  • the user equipment includes, but is not limited to, any electronic product that can interact with the user through a keyboard, a mouse, a remote control, a touch panel, or a voice control device, for example, a personal computer, a tablet computer, a smartphone, a personal digital assistant (PDA), etc.
  • the at least one processor 32 may be a central processing unit (CPU), or other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs) or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, etc.
  • the processor 32 can be a microprocessor, or the processor 32 can also be any conventional processor, etc.
  • the processor 32 is the control center of the electronic device 3 and connects the various parts of the entire electronic device 3 through various interfaces and lines.
  • the memory 31 may be used to store the computer program 33 and/or modules/units.
  • the processor 32 runs or executes the computer programs and/or modules/units stored in the memory 31 and calls the data stored in the memory 31 to realize the various functions of the electronic device 3.
  • the memory 31 may mainly include a storage program area and a storage data area.
  • the storage program area may store an operating system, an application program required by at least one function (such as a sound playback function, an image playback function, etc.), etc.; the storage data area may store data (such as audio data) created according to the use of the electronic device 3, and the like.
  • the memory 31 may include volatile and non-volatile memory, such as random access memory (RAM), hard disk, memory, plug-in hard disk, smart media card (SMC), Secure Digital (SD) card, flash card, at least one magnetic disk storage device, a flash memory device, or other computer-readable storage media that can be used to carry or store data.
  • the computer-readable storage medium may be non-volatile or volatile.
  • the memory 31 in the electronic device 3 stores multiple instructions to implement an artificial intelligence-based positioning data processing method, and the processor 32 can execute the multiple instructions to achieve:
  • the clustering algorithm DBSCAN and the K-nearest neighbor classification algorithm KNN are used to process the processed data on the spatial layer to obtain candidate regions of multiple categories, wherein each candidate region includes multiple candidate stay points belonging to the same category;
  • the multiple candidate stay points are subdivided on the time layer according to the cluster identifiers of the multiple positioning points corresponding to the candidate region to obtain a stay point set, wherein
  • the stay point set includes the final stay points of the user's travel;
  • the positioning points are clustered on the spatial layer using the DBSCAN algorithm and the KNN algorithm, and the clustering results are optimized; the candidate stay points are then divided and selected on the time layer. This two-layer space-time clustering effectively identifies stay points, removes part of the positioning drift points, and improves the recognition accuracy of stay points.
  • if the integrated module/unit of the electronic device 3 is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the computer program can be stored in a computer-readable storage medium.
  • the computer program includes computer program code, and the computer program code may be in the form of source code, object code, executable file, or some intermediate forms.
  • the computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, U disk, mobile hard disk, magnetic disk, optical disk, computer memory, and read-only memory (ROM).
  • the computer-readable storage medium may mainly include a storage program area and a storage data area, where the storage program area may store an operating system, an application program required by at least one function, etc.; the storage data area may store Data created by the use of nodes, etc.
  • the blockchain referred to in this application is a new application mode of computer technology such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm.
  • a blockchain is essentially a decentralized database: a series of data blocks linked to one another by cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of the information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
  • This application can be used in many general-purpose or special-purpose computer system environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multi-processor systems, microprocessor-based systems, set-top boxes, programmable consumer electronic devices, network PCs, small computers, large computers, distributed computing environments that include any of the above systems or devices, etc.
  • This application may be described in the general context of computer-executable instructions executed by a computer, such as a program module.
  • program modules include routines, programs, objects, components, data structures, etc. that perform specific tasks or implement specific abstract data types.
  • This application can also be practiced in distributed computing environments. In these distributed computing environments, tasks are performed by remote processing devices connected through a communication network.
  • program modules can be located in local and remote computer storage media including storage devices.
  • modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional modules in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit may be implemented in the form of hardware, or may be implemented in the form of hardware plus software functional modules.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Remote Sensing (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An artificial intelligence-based positioning data processing method, comprising: acquiring multiple pieces of positioning data indicating user travel and recorded by a user terminal (S11); preprocessing the positioning data to obtain processed data (S12); processing the processed data on a spatial layer by using a clustering algorithm DBSCAN and a K-nearest neighbor classification algorithm KNN to obtain candidate regions of multiple categories, wherein each candidate region comprises a plurality of candidate stay points belonging to the same category (S13); for each category of candidate regions, according to cluster identifiers of a plurality of positioning points corresponding to the candidate regions, subdividing the plurality of candidate stay points on a time layer to obtain a stay point set, wherein the stay point set comprises the final stay points of the user travel (S14); and uploading the stay point set to a blockchain (S15). The method may be applied to smart transportation scenarios so as to promote the construction of smart cities.

Description

Artificial intelligence-based positioning data processing method and related device
This application claims priority to Chinese patent application No. 202010438159.7, filed with the Chinese Patent Office on May 21, 2020 and entitled "Artificial intelligence-based positioning data processing method and related device", the entire content of which is incorporated herein by reference.
Technical Field
This application relates to the field of artificial intelligence technology, and in particular to an artificial intelligence-based positioning data processing method and related device.
Background
Origin-destination (OD) analysis of travel volume collects users' daily travel data. Analyzing this data reveals the characteristics and distribution of users' demand for city-wide transportation and other urban functions, and provides information and decision support for urban transportation planning and construction. The OD matrix is a key piece of data in this analysis.
The OD matrix is the origin-destination matrix. Building it requires knowing the origins and destinations, that is, the travel stay points, of all trips made by a city's users within a given time period. A user's travel positioning data contains many positioning points, and the inventors realized that identifying the user's travel stay points from these positioning points is a technical problem in urgent need of a solution.
Summary
In view of the above, it is necessary to provide an artificial intelligence-based positioning data processing method and related device that can identify a user's travel stay points from positioning points.
A first aspect of this application provides an artificial intelligence-based positioning data processing method, the method including:
acquiring multiple pieces of positioning data recorded by a user terminal and indicating the user's travel;
preprocessing the positioning data to obtain processed data;
processing the processed data on a spatial layer using the clustering algorithm DBSCAN and the K-nearest neighbor classification algorithm KNN to obtain candidate regions of multiple categories, where each candidate region includes multiple candidate stay points belonging to the same category;
for the candidate region of each category, subdividing the multiple candidate stay points on a time layer according to the cluster identifiers of the multiple positioning points corresponding to the candidate region to obtain a stay point set, where the stay point set includes the final stay points of the user's travel;
uploading the stay point set to a blockchain.
A second aspect of this application provides an electronic device, where the electronic device includes a processor and a memory, and the processor is configured to execute at least one computer-readable instruction stored in the memory to implement the following steps:
acquiring multiple pieces of positioning data recorded by a user terminal and indicating the user's travel;
preprocessing the positioning data to obtain processed data;
processing the processed data on a spatial layer using the clustering algorithm DBSCAN and the K-nearest neighbor classification algorithm KNN to obtain candidate regions of multiple categories, where each candidate region includes multiple candidate stay points belonging to the same category;
for the candidate region of each category, subdividing the multiple candidate stay points on a time layer according to the cluster identifiers of the multiple positioning points corresponding to the candidate region to obtain a stay point set, where the stay point set includes the final stay points of the user's travel;
uploading the stay point set to a blockchain.
A third aspect of this application provides a computer-readable storage medium storing at least one computer-readable instruction, where the at least one computer-readable instruction, when executed by a processor, implements the following steps:
acquiring multiple pieces of positioning data recorded by a user terminal and indicating the user's travel;
preprocessing the positioning data to obtain processed data;
processing the processed data on a spatial layer using the clustering algorithm DBSCAN and the K-nearest neighbor classification algorithm KNN to obtain candidate regions of multiple categories, where each candidate region includes multiple candidate stay points belonging to the same category;
for the candidate region of each category, subdividing the multiple candidate stay points on a time layer according to the cluster identifiers of the multiple positioning points corresponding to the candidate region to obtain a stay point set, where the stay point set includes the final stay points of the user's travel;
uploading the stay point set to a blockchain.
A fourth aspect of this application provides a positioning data processing apparatus, the apparatus including:
an acquisition module, configured to acquire multiple pieces of positioning data recorded by a user terminal and indicating the user's travel;
a first processing module, configured to preprocess the positioning data to obtain processed data;
a second processing module, configured to process the processed data on a spatial layer using the clustering algorithm DBSCAN and the K-nearest neighbor classification algorithm KNN to obtain candidate regions of multiple categories, where each candidate region includes multiple candidate stay points belonging to the same category;
a third processing module, configured to, for the candidate region of each category, subdivide the multiple candidate stay points on a time layer according to the cluster identifiers of the multiple positioning points corresponding to the candidate region to obtain a stay point set, where the stay point set includes the final stay points of the user's travel;
an upload module, configured to upload the stay point set to a blockchain.
With the above technical solutions, this application can be applied in fields that require positioning data processing, such as smart city management, smart security, smart logistics, and smart transportation, thereby promoting the development of smart cities. In this application, the DBSCAN algorithm and the KNN algorithm are used to cluster the positioning points on the spatial layer and optimize the clustering result, and the candidate stay points are then divided and filtered on the time layer. This two-layer space-time clustering identifies stay points effectively, and removing some of the positioning drift points also improves the accuracy of stay point identification.
Brief Description of the Drawings
Fig. 1 is a flowchart of a preferred embodiment of an artificial intelligence-based positioning data processing method disclosed in this application.
Fig. 2 is a functional module diagram of a preferred embodiment of a positioning data processing apparatus disclosed in this application.
Fig. 3 is a schematic structural diagram of an electronic device of a preferred embodiment that implements the artificial intelligence-based positioning data processing method of this application.
Detailed Description
To make the above objectives, features, and advantages of this application easier to understand, this application is described in detail below with reference to the accompanying drawings and specific embodiments. It should be noted that, provided there is no conflict, the embodiments of this application and the features in the embodiments can be combined with each other.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the technical field of this application. The terms used in the specification of this application are only for the purpose of describing specific embodiments and are not intended to limit this application.
Refer to Fig. 1, which is a flowchart of a preferred embodiment of an artificial intelligence-based positioning data processing method disclosed in this application. Depending on requirements, the order of the steps in the flowchart can be changed, and some steps can be omitted.
S11. Acquire multiple pieces of positioning data recorded by a user terminal and indicating the user's travel.
The positioning data includes the geographic location of each positioning point and the arrival time at each positioning point.
S12. Preprocess the positioning data to obtain processed data.
Specifically, preprocessing the positioning data to obtain processed data includes:
performing equal-time-interval processing on the positioning data to obtain intermediate data;
deleting drift data from the intermediate data to obtain the processed data.
Preprocessing may include equal-time-interval processing, for example dividing the positioning data of a period of time according to a preset time interval (such as 10 min). Preprocessing also includes deleting drift data, which consists of drift points. A positioning point whose movement speed exceeds the maximum speed of urban traffic is usually a drift point. Drift points greatly degrade stay point identification and therefore need to be deleted.
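The two preprocessing operations above can be sketched as follows. This is a minimal illustration only: it assumes each record is a `(timestamp_seconds, latitude, longitude)` tuple, that "equal-time-interval processing" keeps one fix per interval, and that 33 m/s stands in for the maximum urban traffic speed. The function names and these values are hypothetical and are not specified in this application.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) pairs in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def to_equal_intervals(points, interval_s=600):
    """Keep the earliest fix in each interval_s-wide time bucket (assumed policy)."""
    buckets = {}
    for t, lat, lon in sorted(points):
        buckets.setdefault(int(t // interval_s), (t, lat, lon))
    return [buckets[k] for k in sorted(buckets)]

def drop_drift_points(points, max_speed_mps=33.0):
    """Drop fixes whose speed relative to the last kept fix exceeds max_speed_mps."""
    kept = []
    for t, lat, lon in points:
        if kept:
            t0, lat0, lon0 = kept[-1]
            dt = t - t0
            # Skip non-increasing timestamps and implausibly fast jumps.
            if dt <= 0 or haversine_m(lat0, lon0, lat, lon) / dt > max_speed_mps:
                continue
        kept.append((t, lat, lon))
    return kept
```

Comparing each fix with the last kept fix, rather than with its immediate predecessor, prevents a single drift point from also causing the next valid fix to be discarded. The sketch does assume the first fix is valid.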
S13. Process the processed data on a spatial layer using the clustering algorithm DBSCAN and the K-nearest neighbor classification algorithm KNN to obtain candidate regions of multiple categories, where each candidate region includes multiple candidate stay points belonging to the same category.
DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a representative density-based clustering algorithm. It filters out low-density regions and identifies high-density sample points. It can identify clusters of arbitrary shape and size and is robust to noise. The main idea of DBSCAN is as follows: select an unprocessed sample point P from the sample set D and examine the sample points in the Eps-neighborhood of P to search for a qualifying cluster. If the number of points in the Eps-neighborhood of P is greater than or equal to MinPts, P is determined to be a core point and a new cluster C is created from it; all points in D that are directly density-reachable from P are then searched for. Clustering ends when all points have been processed.
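As a rough illustration of the idea described above, the following is a minimal pure-Python sketch of DBSCAN. It assumes planar coordinates (for example, projected meters) and a brute-force neighborhood search. The function names `region_query` and `dbscan` and the noise label `-1` are illustrative choices, not taken from this application.

```python
import math

NOISE = -1  # label for points that belong to no cluster (assumed convention)

def region_query(points, idx, eps):
    """Indices of all points within eps of points[idx], including idx itself."""
    px, py = points[idx]
    return [j for j, (x, y) in enumerate(points)
            if math.hypot(x - px, y - py) <= eps]

def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks low-density points."""
    labels = [None] * len(points)  # None means "not yet processed"
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster_id  # i is a core point: start a new cluster
        seeds = list(neighbors)
        k = 0
        while k < len(seeds):
            j = seeds[k]
            k += 1
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # j is also a core point: keep expanding
        cluster_id += 1
    return labels
```

For example, `dbscan([(0, 0), (0, 1), (1, 0), (1, 1), (50, 50)], eps=1.5, min_pts=3)` groups the first four points into one cluster and marks the last point as noise. The brute-force search is quadratic in the number of points; a spatial index would be the usual replacement for larger inputs.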
In the embodiments of this application, how long a user stays at a place can be derived from the spatial density of the positioning points. The density-based DBSCAN algorithm can therefore be used to cluster the positioning points on the spatial layer and obtain a preliminary set of candidate stay points for the user's travel.
The K-nearest neighbor (k-Nearest Neighbor, kNN) classification algorithm is one of the simplest methods in data mining classification. Its main idea is: for a given sample, if most of the K instances closest to the sample belong to a certain category, the sample is determined to belong to that category as well.
In the embodiments of this application, the positioning data of the user terminal is collected at irregular, uneven intervals, and the interval between two consecutive records is sometimes large. As a result, a positioning point may still be a drift point even if the displacement speed between it and the previous positioning point is below the maximum speed of urban traffic. The displacement speed between two positioning points alone therefore cannot identify all positioning drift points. Such drift points greatly degrade stay point identification: they may split one long stay into several short stays, or even prevent some stay points from being identified at all. To identify the stay points of a user's travel more accurately, the idea of the KNN algorithm is used to optimize the clustering result.
Specifically, processing the processed data on the spatial layer using the clustering algorithm DBSCAN and the K-nearest neighbor classification algorithm KNN to obtain candidate regions of multiple categories includes:
processing the processed data on the spatial layer using the DBSCAN algorithm to obtain multiple first stay points of the user's travel;
classifying the multiple first stay points using the KNN algorithm to obtain first stay points of multiple categories;
constructing a candidate region from the first stay points belonging to the same category.
The DBSCAN algorithm yields multiple preliminarily identified first stay points. Because some drift points inevitably remain in the processed data, the KNN algorithm is also needed to further refine the classification of the first stay points.
Specifically, processing the processed data on the spatial layer using the DBSCAN algorithm to obtain multiple first stay points of the user's travel includes:
using the DBSCAN algorithm, on the spatial layer, for any positioning point in the processed data, constructing a neighborhood centered on that positioning point whose radius is a preset stay point discrimination distance threshold;
determining whether the number of positioning points in the neighborhood is greater than or equal to a preset stay point discrimination time threshold, where each positioning point has a length of one unit of time in the time dimension;
if the number of positioning points in the neighborhood is greater than or equal to the preset stay point discrimination time threshold, calculating the geometric center of all positioning points in the neighborhood and determining the geometric center as a first stay point of the user's travel.
User terminal positioning has a certain degree of error, so the positioning result changes somewhat even when the user does not move. According to the definition of a trip in transportation, a trip is an activity in which a person travels more than 500 meters by a means of transport or walks for more than 5 minutes. A small movement of the user's position reflected in the positioning data therefore does not mean that the user has made a trip. Based on these considerations, the stay point discrimination distance threshold can be set to 500 meters and the stay point discrimination time threshold to 5 minutes.
A user's stay point can be defined as follows: for any positioning point P, if the number N of positioning points in the neighborhood centered on P with radius R (the stay point discrimination distance threshold) is greater than or equal to the stay point discrimination time threshold, the geometric center of all positioning points in the R-neighborhood is called a stay point. The arrival time at the stay point is the arrival time of the first positioning point in the R-neighborhood, and the stay duration is N (units of time). Because every positioning point has the same length in the time dimension, namely one unit of time, the number N of positioning points in the R-neighborhood of P is also the stay duration at the stay point.
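The stay point definition above can be sketched as a small function. This is an illustration under stated assumptions: points are `(arrival_time, x, y)` tuples in planar meters, one point represents one unit of time, and the function name `first_stay_point` and the returned dictionary layout are hypothetical.

```python
import math

def first_stay_point(points, idx, radius_m=500.0, min_count=5):
    """Return the stay point derived from the R-neighborhood of points[idx], or None.

    points    : list of (arrival_time, x, y), one point per unit of time
    radius_m  : stay point discrimination distance threshold R
    min_count : stay point discrimination time threshold, in units of time
    """
    _, px, py = points[idx]
    hood = [p for p in points if math.hypot(p[1] - px, p[2] - py) <= radius_m]
    n = len(hood)
    if n < min_count:
        return None  # too few points nearby: not a stay
    return {
        "x": sum(p[1] for p in hood) / n,     # geometric center of the neighborhood
        "y": sum(p[2] for p in hood) / n,
        "arrival": min(p[0] for p in hood),   # arrival time of the earliest point
        "duration_units": n,                  # N points correspond to N units of time
    }
```

With the thresholds mentioned in the text (500 meters, 5 minutes) and one positioning point per minute, five points within 500 meters of P are enough for P's neighborhood to produce a stay point.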
Optionally, after classifying the multiple first stay points using the KNN algorithm to obtain first stay points of multiple categories, the method further includes:
determining whether the multiple categories include a category of drift points;
if the multiple categories include a category of drift points, deleting the category of drift points;
constructing the candidate region from the first stay points belonging to the same category then includes:
for the first stay points of the categories remaining after the category of drift points is deleted, constructing a candidate region from the first stay points belonging to the same category.
When the KNN algorithm is used to classify the multiple first stay points, the drift points can be grouped into one category. To reduce the effect of drift points on stay point identification, the category containing drift points can be deleted, and the remaining categories are then further optimized.
Specifically, classifying the multiple first stay points using the KNN algorithm to obtain first stay points of multiple categories includes:
for each first stay point, obtaining the multiple sets of positioning points corresponding to the first stay point;
using the KNN algorithm to change the cluster identifier of any positioning point in the multiple sets of positioning points;
classifying the first stay points corresponding to positioning points that have the same cluster identifier after the change into the same category.
The specific steps for optimizing the clustering result based on the idea of the KNN algorithm are as follows:
1) Let the set D = {p_1, p_2, …, p_n} denote the positioning points after spatial-layer clustering; initialize the KNN parameter K to 4 and initialize i = 3;
2) Select positioning point p_i for optimization: check the cluster IDs of the K/2 positioning points before and the K/2 positioning points after p_i. If some cluster ID occurs more than K/2 times, change the cluster ID of p_i to that cluster ID and change the longitude and latitude of p_i to those of the center of that cluster; otherwise, the cluster ID of p_i remains unchanged;
3) Set i = i + 1; if i = n - 2, the optimization ends; otherwise, go to step 2).
Each positioning point record contains a cluster ID field. Through the above steps, the first stay points corresponding to positioning points with the same cluster ID can be classified into the same category, which further optimizes the clustering result of the DBSCAN algorithm.
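The optimization steps above can be sketched as a sliding majority vote over the time-ordered positioning points. The sketch makes several assumptions that are not stated in this application: points are dictionaries with `cid`, `lat`, and `lon` keys, `centers` maps each cluster ID to its center coordinates, a label with no entry in `centers` (such as a noise label) never wins the vote, and every point with K/2 neighbors on each side is processed, which may differ by one index from the stopping rule as literally written. The name `knn_smooth` is hypothetical.

```python
from collections import Counter

def knn_smooth(points, centers, k=4):
    """Relabel each point by the majority cluster ID of its k/2 predecessors and successors.

    points  : time-ordered list of dicts with keys "cid", "lat", "lon"
    centers : dict mapping cluster ID -> (lat, lon) of the cluster center
    Returns a new list; the input list is left untouched.
    """
    half = k // 2
    out = [dict(p) for p in points]  # work on copies
    for i in range(half, len(out) - half):
        # Neighbors before i already reflect earlier relabeling (sequential update).
        window = out[i - half:i] + out[i + 1:i + 1 + half]
        cid, count = Counter(p["cid"] for p in window).most_common(1)[0]
        if count > half and cid in centers:
            out[i]["cid"] = cid
            out[i]["lat"], out[i]["lon"] = centers[cid]  # snap to the cluster center
    return out
```

With K = 4, a point is relabeled only when at least three of its four temporal neighbors agree, so an isolated drift point inside a long stay is pulled back into the surrounding cluster while a genuine transition between two clusters is left alone.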
S14. For the candidate region of each category, subdivide the multiple candidate stay points on a time layer according to the cluster identifiers of the multiple positioning points corresponding to the candidate region to obtain a stay point set, where the stay point set includes the final stay points of the user's travel.
After the user's travel positioning points are clustered and optimized, the candidate stay points of the user's travel have been extracted from the positioning data of the user terminal. Positioning points that are close in the spatial dimension may be far apart in the time dimension. For example, a user works at location A in the morning, leaves location A for lunch at noon, and returns to location A to work in the afternoon. Clustering this user's trajectory points on the spatial layer alone groups the morning and afternoon visits to location A into the same cluster, so only one candidate stay point is identified. In fact the user stayed at location A twice, so there should be two stay points. The candidate stay points therefore need further processing in the time dimension to obtain the final stay points of the user's travel.
In the embodiments of this application, starting from the candidate stay points extracted after spatial-layer clustering and optimization, clustering is performed on the time layer to divide and filter the candidate stay points.
Specifically, subdividing the multiple candidate stay points on the time layer according to the cluster identifiers of the multiple positioning points corresponding to the candidate region to obtain a stay point set includes:
reading, in sequence, the first cluster identifier of each positioning point corresponding to the candidate region;
determining whether the first cluster identifier is equal to an initialized cluster identifier, where the initialized cluster identifier is the cluster identifier of the positioning point with the earliest arrival time;
if the first cluster identifier is equal to the initialized cluster identifier, adding the first cluster identifier to a new cluster, until the first cluster identifier of the currently read positioning point is not equal to the initialized cluster identifier, and then determining whether the number of positioning points in the new cluster is greater than or equal to a preset stay point discrimination time threshold;
if the number of positioning points in the new cluster is greater than or equal to the preset stay point discrimination time threshold, obtaining the target candidate stay point corresponding to all positioning points read before the current one and adding the target candidate stay point to the stay point set.
Optionally, the method further includes:
if the currently read positioning point is not the last positioning point, setting the initialized cluster identifier to the cluster identifier of the currently read positioning point and iterating.
Optionally, the method further includes:
if the number of positioning points in the new cluster is less than the preset stay point discrimination time threshold and the currently read positioning point is not the last positioning point, setting the initialized cluster identifier to the cluster identifier of the currently read positioning point and iterating.
其中,针对每个候选区域,在时间层上的具体实现步骤如下:Among them, for each candidate area, the specific implementation steps in the time layer are as follows:
1)初始化参数i=0、j=i、num=0,初始化簇标识cid,令cid表示第一个定位点(即到达时间最早的定位点)的簇ID,停留点集合SP初始为空集;1) Initialization parameters i = 0, j = i, num = 0, initialize the cluster identifier cid, let cid represent the cluster ID of the first anchor point (ie the anchor point with the earliest arrival time), and the set of stay points SP is initially an empty set ;
2)创建一个新簇C;2) Create a new cluster C;
3)令i=i+1,读取一条定位数据p i,判断p i的簇ID是否等于cid。如果相等,则将点p i加入簇C中,转入步骤3);如果不等,转入步骤4); 3) Let i=i+1, read a piece of positioning data p i , and judge whether the cluster ID of p i is equal to cid. If they are equal, add point p i to cluster C and go to step 3); if not, go to step 4);
4)如果新簇C中的定位点的数量N大于等于停留点判别时间阈值,获取当前之前已读取的所有定位点对应的目标候选停留点,即预先已经计算得到的定位点{p j,p j+1,…,p i-1}的几何中心点,其中,目标候选停留点的到达时间为p i的定位时间,停留时长为N(分钟),将该目标候选停留点添加到停留点集合SP中,然后转入步骤5);如果新簇C中的定位点的数量N小于停留点判别时间阈值,则直接转入步骤5); 4) If the number N of anchor points in the new cluster C is greater than or equal to the stay point discrimination time threshold, obtain the target candidate stay points corresponding to all the anchor points that have been read before, that is, the anchor points {p j , The geometric center point of p j+1 ,...,pi -1 }, where the arrival time of the target candidate stay point is the positioning time of p i , and the stay time is N (minutes). Add the target candidate stay point to the stay In the point set SP, then go to step 5); if the number N of anchor points in the new cluster C is less than the stay point discrimination time threshold, go directly to step 5);
5)如果p i不是最后一个点,则令cid=p i的簇ID,j=i,然后转入步骤2);否则,时间层聚类结束。 5) If p i is not the last point, set cid = the cluster ID of p i and j = i, and then go to step 2); otherwise, the time-level clustering ends.
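Steps 1) to 5) amount to splitting the time-ordered positioning points into runs of equal cluster ID and keeping only the runs that are long enough. The following Python sketch illustrates that reading; it is not the applicant's implementation. The `Point` type, the function names and the `min_count` parameter are hypothetical, the geometric center is assumed to be the mean latitude and longitude, and the arrival time is taken from the first point of each run, following the stay point definition given elsewhere in the description (step 4 as written names the positioning time of p_i instead).

```python
from dataclasses import dataclass


@dataclass
class Point:
    t: int        # arrival time in unit-time ticks (hypothetical field)
    lat: float
    lon: float
    cid: int      # cluster ID assigned on the spatial layer


def _flush(run, min_count, out):
    """Turn a finished run into a stay point if it is long enough."""
    if len(run) >= min_count:
        out.append({
            "cid": run[0].cid,
            "arrival": run[0].t,              # assumption: first point of the run
            "duration": len(run),             # N unit times
            "lat": sum(p.lat for p in run) / len(run),
            "lon": sum(p.lon for p in run) / len(run),
        })


def time_layer_stay_points(points, min_count):
    """Split time-ordered points into runs of equal cluster ID; keep long runs."""
    stay_points, run = [], []
    for p in sorted(points, key=lambda q: q.t):
        if run and p.cid != run[0].cid:
            _flush(run, min_count, stay_points)   # cluster ID changed: close the run
            run = []
        run.append(p)
    _flush(run, min_count, stay_points)           # close the final run
    return stay_points
```

Because the runs are produced in time order, a place visited in the morning and again in the afternoon yields two separate stay points even though both visits share one spatial cluster ID.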
Optionally, the method further includes:
connecting the final stay points included in the stay point set in order of their arrival times to obtain the travel chain of the user's travel.
Steps 1) to 5) above iterate over the positioning points in ascending order of arrival time, so the final stay points in the resulting stay point set SP are also arranged in ascending order of time. The final stay points included in the stay point set can therefore be connected directly to obtain the travel chain of the user's travel.
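As a small illustration of connecting stay points into a travel chain, the sketch below pairs each stay point with the next one in arrival order. The dictionary layout and the `arrival` key are assumptions made for the example; the explicit sort is defensive, since the description notes that SP is already in ascending time order.

```python
def build_travel_chain(stay_points):
    """Return the legs (origin, destination) between consecutive stay points."""
    ordered = sorted(stay_points, key=lambda s: s["arrival"])
    # Pair each stay point with its successor; n stay points give n - 1 legs.
    return list(zip(ordered, ordered[1:]))
```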
S15. Upload the stay point set to the blockchain.
The final stay point set can be uploaded to the blockchain. Storing the data of the stay point set on the blockchain can ensure the privacy and security of the data.
In the method flow described in FIG. 1, the DBSCAN and KNN algorithms are used to cluster the positioning points on the spatial layer and optimize the clustering result, and the candidate stay points are then divided and filtered on the time layer. This two-layer space-time clustering identifies stay points effectively; removing some of the positioning drift points also improves the accuracy of stay point identification.
As the above embodiments show, this application can be applied in fields that need to process positioning data, such as smart city management, smart security, smart logistics and smart transportation, thereby promoting the development of smart cities.
FIG. 2 is a functional module diagram of a preferred embodiment of a positioning data processing device disclosed in this application.
In some embodiments, the positioning data processing device runs in an electronic device. The device may include multiple functional modules composed of program code segments. The program code of each segment may be stored in a memory and executed by at least one processor to perform some or all of the steps of the artificial intelligence-based positioning data processing method described in FIG. 1.
In this embodiment, the positioning data processing device may be divided into multiple functional modules according to the functions it performs. The functional modules may include an acquisition module 201, a first processing module 202, a second processing module 203, a third processing module 204 and an upload module 205. A module, as referred to in this application, is a series of computer program segments that can be executed by at least one processor to perform a fixed function and that are stored in a memory.
The acquisition module 201 is configured to obtain multiple pieces of positioning data recorded by a user terminal and representing the user's travel.
The positioning data includes the geographic location of each positioning point and its arrival time.
The first processing module 202 is configured to preprocess the positioning data to obtain processed data.
Specifically, preprocessing the positioning data to obtain processed data includes:
performing equal-time-interval processing on the positioning data to obtain intermediate data;
deleting drift data from the intermediate data to obtain the processed data.
Preprocessing may include equal-time-interval processing, for example dividing the positioning data for a period of time according to a preset time interval (such as 10 min). Preprocessing also includes deleting drift data, which includes drift points; a positioning point whose moving speed exceeds the maximum speed of urban traffic is usually a drift point. Drift points greatly degrade stay point identification and therefore need to be deleted.
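The two preprocessing steps can be sketched as follows. This is an illustrative sketch only: the `(t, lat, lon)` tuple layout, the choice to keep the first fix in each time bucket, the haversine distance and the example speed limit are all assumptions, since the description does not fix how the interval division or the speed is computed.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude pairs."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def resample(fixes, interval_s):
    """Keep the first fix in each interval_s-wide bucket, stamped with the bucket start."""
    out, seen = [], set()
    for t, lat, lon in sorted(fixes):
        bucket = t // interval_s
        if bucket not in seen:
            seen.add(bucket)
            out.append((bucket * interval_s, lat, lon))
    return out


def drop_drift(fixes, max_speed_mps):
    """Drop any fix whose speed from the last kept fix exceeds max_speed_mps."""
    kept = []
    for t, lat, lon in fixes:
        if kept:
            t0, lat0, lon0 = kept[-1]
            dt = t - t0
            if dt <= 0 or haversine_m(lat0, lon0, lat, lon) / dt > max_speed_mps:
                continue  # treated as a drift point
        kept.append((t, lat, lon))
    return kept
```

A call such as `drop_drift(resample(fixes, 600), 40.0)` would apply both steps with a 10-minute interval; the 40 m/s limit is a placeholder, not a value from the application. As the description itself notes later, a speed check alone cannot catch every drift point when sampling gaps are large.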
The second processing module 203 is configured to process the processed data on the spatial layer using the clustering algorithm DBSCAN and the K-nearest-neighbor classification algorithm KNN to obtain candidate areas of multiple categories, where each candidate area includes multiple candidate stay points belonging to the same category.
DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a representative density-based clustering algorithm. Its purpose is to filter out low-density regions and identify high-density sample points; it can identify clusters of arbitrary shape and size and is robust to noise. The main idea of DBSCAN is: select an unprocessed sample point P from the sample point set D and examine the sample points in the Eps-neighborhood of P to search for a qualifying cluster. If the number of points in the Eps-neighborhood of P is greater than or equal to MinPts, P is judged to be a core point and a new cluster C is created from it; all points in D that are directly density-reachable from P are then searched. Clustering ends when all points have been processed.
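For readers unfamiliar with DBSCAN, a compact pure-Python sketch of the standard algorithm is given here. It assumes points are planar `(x, y)` coordinates in a common unit (for example, projected meters) and uses a brute-force neighborhood search; the function names are illustrative. In the usual formulation shown, a cluster is grown by repeatedly expanding from every core point it reaches, not only from the first core point P.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, NOISE for outliers."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE             # may later become a border point
            continue
        labels[i] = cluster               # i is a core point: start a new cluster
        queue = list(neighbours)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster       # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = region_query(points, j, eps)
            if len(nb) >= min_pts:        # j is also a core point: keep expanding
                queue.extend(nb)
        cluster += 1
    return labels
```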
In this embodiment, the length of time a user stays at a place can be derived from the spatial density of the positioning points there. The density-based DBSCAN algorithm can therefore be used to cluster the positioning points on the spatial layer, giving a preliminary set of candidate stay points for the user's travel.
The K-nearest-neighbor (kNN) classification algorithm is one of the simplest methods in data mining classification. Its main idea is: for a given sample, if most of the K instances nearest to it belong to a certain category, the sample is judged to belong to that category as well.
In this embodiment, the positioning data of the user terminal is collected at irregular, uneven intervals, and the time between two consecutive positioning records is sometimes large. As a result, a positioning point may still be a drift point even if the displacement speed between it and the previous positioning point is below the maximum speed of urban traffic, so the displacement speed between two positioning points alone cannot identify every drift point. The remaining drift points greatly degrade stay point identification: they may split one long stay into several short stays, or even prevent some stay points from being identified at all. To identify the stay points of the user's travel more accurately, the clustering result therefore needs to be optimized using the idea of the KNN algorithm.
Specifically, processing the processed data on the spatial layer using the clustering algorithm DBSCAN and the K-nearest-neighbor classification algorithm KNN to obtain candidate areas of multiple categories includes:
processing the processed data on the spatial layer using the DBSCAN algorithm to obtain multiple first stay points of the user's travel;
classifying the multiple first stay points using the KNN algorithm to obtain first stay points of multiple categories;
constructing a candidate area from the first stay points belonging to the same category.
The DBSCAN algorithm yields multiple first stay points that are only preliminarily identified. Because drift points inevitably remain in the processed data, the KNN algorithm is also needed to further refine the classification of the first stay points.
Specifically, processing the processed data on the spatial layer using the DBSCAN algorithm to obtain multiple first stay points of the user's travel includes:
using the DBSCAN algorithm, on the spatial layer, for any positioning point in the processed data, constructing a neighborhood centered on that positioning point with a radius equal to the preset stay point discrimination distance threshold;
judging whether the number of positioning points in the neighborhood is greater than or equal to the preset stay point discrimination time threshold, where each positioning point has a length of one unit time in the time dimension;
if the number of positioning points in the neighborhood is greater than or equal to the preset stay point discrimination time threshold, calculating the geometric center of all the positioning points in the neighborhood and determining that geometric center as a first stay point of the user's travel.
User terminal positioning has a certain degree of error, so the positioning result changes somewhat even when the user does not move. According to the definition of a trip used in transportation, a trip is an activity that covers more than 500 meters by vehicle or takes more than 5 minutes on foot. A small movement of the user's position reflected in the positioning data therefore does not mean that the user has made a trip. Based on these considerations, the stay point discrimination distance threshold can be set to 500 meters and the stay point discrimination time threshold to 5 minutes.
A user's stay point can be defined as follows: for any positioning point P, if the number N of positioning points in the neighborhood centered on P with radius R (the stay point discrimination distance threshold) is greater than or equal to the stay point discrimination time threshold, the geometric center of all positioning points in the R-neighborhood is called a stay point; the arrival time of the stay point is the arrival time of the first positioning point in the R-neighborhood, and the stay duration is N (unit times). Because every positioning point has the same length of one unit time in the time dimension, the number N of positioning points in the R-neighborhood of P is also the stay duration of the stay point.
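The stay point definition above can be checked for a single positioning point with a few lines of Python. This is a sketch under stated assumptions: fixes are `(t, lat, lon)` tuples sampled at one fix per unit time, distances use the haversine formula, and the geometric center is taken as the mean latitude and longitude. The function and parameter names are hypothetical; the default thresholds are the 500 m and 5 unit times mentioned in the description.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    a = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(math.radians(lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def stay_point_at(fixes, idx, radius_m=500.0, min_units=5):
    """Return the stay point defined by the R-neighborhood of fixes[idx], or None."""
    _, clat, clon = fixes[idx]
    near = [f for f in fixes if haversine_m(clat, clon, f[1], f[2]) <= radius_m]
    n = len(near)
    if n < min_units:
        return None                           # too few points: not a stay point
    return {
        "lat": sum(f[1] for f in near) / n,   # geometric center (mean, an assumption)
        "lon": sum(f[2] for f in near) / n,
        "arrival": min(f[0] for f in near),   # first positioning point in the neighborhood
        "duration": n,                        # N unit times
    }
```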
The KNN algorithm is used to classify the multiple first stay points, so that those that include drift points can be grouped into one category. To reduce the impact of drift points on stay point identification, the categories that include drift points can be deleted, and further optimization is then performed on the remaining categories.
Specifically, classifying the multiple first stay points using the KNN algorithm to obtain first stay points of multiple categories includes:
for each first stay point, obtaining the multiple positioning point sets corresponding to that first stay point;
using the KNN algorithm to change the cluster identifier of any positioning point in the multiple positioning point sets;
dividing the first stay points corresponding to positioning points that have the same cluster identifier after the change into the same category.
The specific steps for optimizing the clustering result based on the idea of the KNN algorithm are as follows:
1) Let the set D = {p_1, p_2, …, p_n} denote the positioning points after spatial-layer clustering; initialize the KNN parameter K to 4 and initialize i = 3;
2) Select positioning point p_i for optimization: examine the cluster IDs of the K/2 positioning points before p_i and the K/2 positioning points after it. If some cluster ID occurs more than K/2 times, change the cluster ID of p_i to that cluster ID and change the latitude and longitude of p_i to those of the center of that cluster; otherwise, the cluster ID of p_i remains unchanged;
3) Let i = i + 1; if i = n - 2, the optimization ends; otherwise, go to step 2).
Each piece of positioning point data contains a cluster ID field. Through the above steps, the first stay points corresponding to positioning points with the same cluster ID can be divided into the same category, which further optimizes the clustering result of the DBSCAN algorithm.
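The KNN-style optimization steps can be sketched as a majority vote over the temporal neighbors of each point. The sketch follows the steps as written, including the stopping rule (the loop ends once the 1-based index reaches n - 2, so p_{n-2} itself is not revisited). The function name, the `centres` mapping from cluster ID to center coordinates, and the choice to skip IDs that have no known center (such as a noise label) are assumptions made for the example.

```python
from collections import Counter


def smooth_cluster_ids(cids, coords, centres, k=4):
    """Reassign each point to the cluster ID held by more than k/2 of its k temporal neighbors."""
    half = k // 2
    cids, coords = list(cids), list(coords)   # work on copies
    n = len(cids)
    # 0-based indices half .. n-half-2 correspond to the 1-based p_3 .. p_{n-3} when k = 4.
    for i in range(half, n - half - 1):
        window = cids[i - half:i] + cids[i + 1:i + 1 + half]
        top, count = Counter(window).most_common(1)[0]
        if count > k / 2 and top in centres:
            cids[i] = top                      # adopt the majority cluster ID
            coords[i] = centres[top]           # snap to that cluster's center
    return cids, coords
```

Updates are applied in place as the index advances, so a corrected point can influence the vote for the points that follow it, which matches the sequential wording of the steps.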
The third processing module 204 is configured to, for the candidate area of each category, subdivide the multiple candidate stay points on the time layer according to the cluster identifiers of the multiple positioning points corresponding to the candidate area to obtain a stay point set, where the stay point set includes the final stay points of the user's travel.
After the user's travel positioning points have been clustered and optimized, the candidate stay points of the user's travel have been extracted from the positioning data of the user terminal. However, positioning points that are close in the spatial dimension may be far apart in the time dimension. For example, a user works at location A in the morning, leaves location A for lunch at noon, and returns to location A to work in the afternoon. Clustering the user's trajectory points on the spatial layer alone groups the morning and afternoon work at location A into the same cluster, so only one candidate stay point is identified. In fact the user stayed at location A twice, and in principle there should be two stay points. The candidate stay points therefore need further processing in the time dimension to obtain the final stay points of the user's travel.
In this embodiment, on the basis of the candidate stay points extracted after spatial-layer clustering and optimization, clustering is performed on the time layer to divide and filter the candidate stay points.
Specifically, subdividing the multiple candidate stay points on the time layer according to the cluster identifiers of the multiple positioning points corresponding to the candidate area to obtain a stay point set includes:
reading, in sequence, the first cluster identifier of any positioning point corresponding to the candidate area;
judging whether the first cluster identifier is equal to the initialized cluster identifier, where the initialized cluster identifier is the cluster identifier of the positioning point with the earliest arrival time;
if the first cluster identifier is equal to the initialized cluster identifier, adding the first cluster identifier to a new cluster, until the first cluster identifier of the currently read positioning point is not equal to the initialized cluster identifier, and then judging whether the number of positioning points in the new cluster is greater than or equal to the preset stay point discrimination time threshold;
if the number of positioning points in the new cluster is greater than or equal to the preset stay point discrimination time threshold, obtaining the target candidate stay point corresponding to all the positioning points read so far, and adding the target candidate stay point to the stay point set.
The upload module 205 is configured to upload the stay point set to the blockchain.
In the positioning data processing device described in FIG. 2, the DBSCAN and KNN algorithms are used to cluster the positioning points on the spatial layer and optimize the clustering result, and the candidate stay points are then divided and filtered on the time layer. This two-layer space-time clustering identifies stay points effectively; removing some of the positioning drift points also improves the accuracy of stay point identification.
FIG. 3 is a schematic structural diagram of an electronic device of a preferred embodiment that implements the artificial intelligence-based positioning data processing method of this application. The electronic device 3 includes a memory 31, at least one processor 32, a computer program 33 stored in the memory 31 and executable on the at least one processor 32, and at least one communication bus 34.
Those skilled in the art will understand that the schematic diagram in FIG. 3 is only an example of the electronic device 3 and does not limit it. The electronic device 3 may include more or fewer components than shown, may combine certain components, or may use different components; for example, it may also include input/output devices, network access devices and the like.
The electronic device 3 is a device that can automatically perform numerical calculation and/or information processing according to preset or stored instructions. Its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), an embedded device and the like. The electronic device may also include network equipment and/or user equipment. The network equipment includes, but is not limited to, a single network server, a server group composed of multiple network servers, or a cloud composed of a large number of hosts or network servers based on cloud computing. The user equipment includes, but is not limited to, any electronic product that can interact with a user through a keyboard, mouse, remote control, touch panel, voice control device or the like, for example a personal computer, a tablet computer, a smartphone or a personal digital assistant (PDA).
The at least one processor 32 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component and the like. The processor 32 may be a microprocessor or any conventional processor. The processor 32 is the control center of the electronic device 3 and connects the parts of the entire electronic device 3 through various interfaces and lines.
The memory 31 may be used to store the computer program 33 and/or modules/units. The processor 32 implements the various functions of the electronic device 3 by running or executing the computer programs and/or modules/units stored in the memory 31 and by calling the data stored in the memory 31. The memory 31 may mainly include a program storage area and a data storage area. The program storage area may store an operating system and the application programs required by at least one function (such as a sound playback function or an image playback function); the data storage area may store data created through the use of the electronic device 3 (such as audio data). In addition, the memory 31 may include volatile and non-volatile memory, for example random access memory (RAM), a hard disk, internal memory, a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or another computer-readable storage medium that can be used to carry or store data. The computer-readable storage medium may be non-volatile or volatile.
With reference to FIG. 1, the memory 31 in the electronic device 3 stores multiple instructions to implement an artificial intelligence-based positioning data processing method, and the processor 32 can execute the multiple instructions to:
obtain multiple pieces of positioning data recorded by a user terminal and representing the user's travel;
preprocess the positioning data to obtain processed data;
process the processed data on the spatial layer using the clustering algorithm DBSCAN and the K-nearest-neighbor classification algorithm KNN to obtain candidate areas of multiple categories, where each candidate area includes multiple candidate stay points belonging to the same category;
for the candidate area of each category, subdivide the multiple candidate stay points on the time layer according to the cluster identifiers of the multiple positioning points corresponding to the candidate area to obtain a stay point set, where the stay point set includes the final stay points of the user's travel;
upload the stay point set to the blockchain.
Specifically, for how the processor 32 implements the above instructions, reference may be made to the description of the relevant steps in the embodiment corresponding to FIG. 1, which is not repeated here.
In the electronic device 3 described in FIG. 3, the DBSCAN and KNN algorithms are used to cluster the positioning points on the spatial layer and optimize the clustering result, and the candidate stay points are then divided and filtered on the time layer. This two-layer space-time clustering identifies stay points effectively; removing some of the positioning drift points also improves the accuracy of stay point identification.
If the modules/units integrated in the electronic device 3 are implemented as software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. On this understanding, all or part of the processes of the above method embodiments may also be completed by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium, and when executed by a processor it can implement the steps of each of the above method embodiments. The computer program includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, computer memory, and read-only memory (ROM).
Further, the computer-readable storage medium may mainly include a program storage area and a data storage area. The program storage area may store an operating system and the application programs required by at least one function; the data storage area may store data created through the use of the blockchain node.
The blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked to one another by cryptographic methods. Each data block contains the information of a batch of network transactions, which is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. A blockchain may include an underlying blockchain platform, a platform product service layer and an application service layer.
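The application does not name a blockchain platform or an upload interface, so the following is only a toy illustration of what "data blocks linked by cryptographic methods" means. It chains blocks with SHA-256 hashes from the Python standard library; the block layout and all function names are hypothetical, and it omits consensus, networking and everything else a real blockchain provides.

```python
import hashlib
import json

GENESIS = "0" * 64


def make_block(prev_hash, payload):
    """Build a block whose hash covers both the payload and the previous block's hash."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return {"prev": prev_hash, "payload": payload,
            "hash": hashlib.sha256(body.encode("utf-8")).hexdigest()}


def append_block(chain, payload):
    """Append a payload (for example, a stay point set) to the chain."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append(make_block(prev, payload))
    return chain


def verify(chain):
    """Check that every block links to its predecessor and that its hash is intact."""
    prev = GENESIS
    for block in chain:
        recomputed = make_block(block["prev"], block["payload"])["hash"]
        if block["prev"] != prev or recomputed != block["hash"]:
            return False
        prev = block["hash"]
    return True
```

Because each hash depends on the previous one, altering a stored stay point set changes its block's recomputed hash and makes `verify` fail, which is the tamper-evidence property the description relies on.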
本申请可用于众多通用或专用的计算机系统环境或配置中。例如:个人计算机、服务器计算机、手持设备或便携式设备、平板型设备、多处理器系统、基于微处理器的系统、置顶盒、可编程的消费电子设备、网络PC、小型计算机、大型计算机、包括以上任何系统或设备的分布式计算环境等等。本申请可以在由计算机执行的计算机可执行指令的一般上下文中描述,例如程序模块。一般地,程序模块包括执行特定任务或实现特定抽象数据类型的例程、程序、对象、组件、数据结构等等。也可以在分布式计算环境中实践本申请,在这些分布式计算环境中,由通过通信网络而被连接的远程处理设备来执行任务。在分布式计算环境中,程序模块可以位于包括存储设备在内的本地和远程计算机存储介质中。This application can be used in many general or special computer system environments or configurations. For example: personal computers, server computers, handheld devices or portable devices, tablet devices, multi-processor systems, microprocessor-based systems, set-top boxes, programmable consumer electronic devices, network PCs, small computers, large computers, including Distributed computing environment for any of the above systems or equipment, etc. This application may be described in the general context of computer-executable instructions executed by a computer, such as a program module. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform specific tasks or implement specific abstract data types. This application can also be practiced in distributed computing environments. In these distributed computing environments, tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules can be located in local and remote computer storage media including storage devices.
在本申请所提供的几个实施例中,应该理解到,所揭露的系统,装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述模块的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。In the several embodiments provided in this application, it should be understood that the disclosed system, device, and method can be implemented in other ways. For example, the device embodiments described above are merely illustrative. For example, the division of the modules is only a logical function division, and there may be other division methods in actual implementation.
The modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the embodiments.
In addition, the functional modules in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware, or in hardware combined with software functional modules.
For those skilled in the art, it is evident that this application is not limited to the details of the foregoing exemplary embodiments, and that this application can be implemented in other specific forms without departing from its spirit or essential characteristics. The embodiments should therefore be regarded in every respect as exemplary and non-limiting. The scope of this application is defined by the appended claims rather than by the above description, and all changes that fall within the meaning and range of equivalents of the claims are therefore intended to be embraced by this application. No reference sign in a claim should be regarded as limiting that claim. Multiple units or devices stated in a system claim may also be implemented by software or hardware.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that the technical solutions of this application can be modified or replaced with equivalents without departing from the spirit and scope of those technical solutions.

Claims (20)

  1. An artificial intelligence-based positioning data processing method, wherein the artificial intelligence-based positioning data processing method comprises:
    acquiring a plurality of pieces of positioning data, recorded by a user terminal, that represent a user's travel;
    preprocessing the positioning data to obtain processed data;
    processing the processed data on a spatial layer using the DBSCAN clustering algorithm and the K-nearest neighbor (KNN) classification algorithm to obtain candidate regions of a plurality of categories, wherein each candidate region comprises a plurality of candidate stay points belonging to the same category;
    for the candidate region of each category, subdividing the plurality of candidate stay points on a temporal layer according to the cluster identifiers of a plurality of positioning points corresponding to the candidate region to obtain a stay point set, wherein the stay point set comprises the final stay points of the user's travel;
    uploading the stay point set to a blockchain.
  2. The artificial intelligence-based positioning data processing method according to claim 1, wherein the processing the processed data on a spatial layer using the DBSCAN clustering algorithm and the KNN classification algorithm to obtain candidate regions of a plurality of categories comprises:
    processing the processed data on the spatial layer using the DBSCAN algorithm to obtain a plurality of first stay points of the user's travel;
    classifying the plurality of first stay points using the KNN algorithm to obtain first stay points of a plurality of categories;
    constructing a candidate region from the first stay points belonging to the same category.
  3. The artificial intelligence-based positioning data processing method according to claim 2, wherein the processing the processed data on the spatial layer using the DBSCAN algorithm to obtain a plurality of first stay points of the user's travel comprises:
    using the DBSCAN algorithm, on the spatial layer, constructing for any positioning point in the processed data a neighborhood centered on that positioning point whose radius is a preset stay-point discrimination distance threshold;
    determining whether the number of positioning points in the neighborhood is greater than or equal to a preset stay-point discrimination time threshold, wherein each positioning point has a length of one unit of time in the time dimension;
    if the number of positioning points in the neighborhood is greater than or equal to the preset stay-point discrimination time threshold, calculating the geometric center of all positioning points in the neighborhood and determining the geometric center as a first stay point of the user's travel.
  4. The artificial intelligence-based positioning data processing method according to claim 2, wherein, after the classifying the plurality of first stay points using the KNN algorithm to obtain first stay points of a plurality of categories, the artificial intelligence-based positioning data processing method further comprises:
    determining whether the plurality of categories include a category of drift points;
    if the plurality of categories include a category of drift points, deleting the category that includes drift points;
    wherein the constructing a candidate region from the first stay points belonging to the same category comprises:
    for the first stay points of the categories remaining after the category including drift points is deleted, constructing a candidate region from the first stay points belonging to the same category.
  5. The artificial intelligence-based positioning data processing method according to claim 2, wherein the classifying the plurality of first stay points using the KNN algorithm to obtain first stay points of a plurality of categories comprises:
    for each first stay point, acquiring a plurality of positioning point sets corresponding to the first stay point;
    changing, using the KNN algorithm, the cluster identifier of any positioning point in the plurality of positioning point sets;
    classifying the first stay points corresponding to positioning points that have the same cluster identifier after the change into the same category.
  6. The artificial intelligence-based positioning data processing method according to claim 1, wherein the subdividing the plurality of candidate stay points on a temporal layer according to the cluster identifiers of a plurality of positioning points corresponding to the candidate region to obtain a stay point set comprises:
    sequentially reading a first cluster identifier of any positioning point corresponding to the candidate region;
    determining whether the first cluster identifier is equal to an initialized cluster identifier, wherein the initialized cluster identifier is the cluster identifier of the positioning point with the earliest arrival time;
    if the first cluster identifier is equal to the initialized cluster identifier, adding the first cluster identifier to a new cluster until the first cluster identifier of the currently read positioning point is not equal to the initialized cluster identifier, and then determining whether the number of positioning points in the new cluster is greater than or equal to a preset stay-point discrimination time threshold;
    if the number of positioning points in the new cluster is greater than or equal to the preset stay-point discrimination time threshold, acquiring the target candidate stay point corresponding to all positioning points read before the current one, and adding the target candidate stay point to the stay point set.
  7. The artificial intelligence-based positioning data processing method according to claim 6, wherein the artificial intelligence-based positioning data processing method further comprises:
    if the currently read positioning point is not the last positioning point, setting the initialized cluster identifier to the cluster identifier of the currently read positioning point and iterating.
  8. An electronic device, wherein the electronic device comprises a processor and a memory, and the processor is configured to execute at least one computer-readable instruction stored in the memory to implement the following steps:
    acquiring a plurality of pieces of positioning data, recorded by a user terminal, that represent a user's travel;
    preprocessing the positioning data to obtain processed data;
    processing the processed data on a spatial layer using the DBSCAN clustering algorithm and the K-nearest neighbor (KNN) classification algorithm to obtain candidate regions of a plurality of categories, wherein each candidate region comprises a plurality of candidate stay points belonging to the same category;
    for the candidate region of each category, subdividing the plurality of candidate stay points on a temporal layer according to the cluster identifiers of a plurality of positioning points corresponding to the candidate region to obtain a stay point set, wherein the stay point set comprises the final stay points of the user's travel;
    uploading the stay point set to a blockchain.
  9. The electronic device according to claim 8, wherein, when the processor executes the at least one computer-readable instruction to process the processed data on the spatial layer using the DBSCAN clustering algorithm and the KNN classification algorithm to obtain candidate regions of a plurality of categories, the processing specifically comprises:
    processing the processed data on the spatial layer using the DBSCAN algorithm to obtain a plurality of first stay points of the user's travel;
    classifying the plurality of first stay points using the KNN algorithm to obtain first stay points of a plurality of categories;
    constructing a candidate region from the first stay points belonging to the same category.
  10. The electronic device according to claim 9, wherein, when the processor executes the at least one computer-readable instruction to process the processed data on the spatial layer using the DBSCAN algorithm to obtain a plurality of first stay points of the user's travel, the processing specifically comprises:
    using the DBSCAN algorithm, on the spatial layer, constructing for any positioning point in the processed data a neighborhood centered on that positioning point whose radius is a preset stay-point discrimination distance threshold;
    determining whether the number of positioning points in the neighborhood is greater than or equal to a preset stay-point discrimination time threshold, wherein each positioning point has a length of one unit of time in the time dimension;
    if the number of positioning points in the neighborhood is greater than or equal to the preset stay-point discrimination time threshold, calculating the geometric center of all positioning points in the neighborhood and determining the geometric center as a first stay point of the user's travel.
  11. The electronic device according to claim 9, wherein, after the plurality of first stay points are classified using the KNN algorithm to obtain first stay points of a plurality of categories, the processor further executes the at least one computer-readable instruction to implement the following steps:
    determining whether the plurality of categories include a category of drift points;
    if the plurality of categories include a category of drift points, deleting the category that includes drift points;
    wherein the constructing a candidate region from the first stay points belonging to the same category comprises:
    for the first stay points of the categories remaining after the category including drift points is deleted, constructing a candidate region from the first stay points belonging to the same category.
  12. The electronic device according to claim 9, wherein, when the processor executes the at least one computer-readable instruction to classify the plurality of first stay points using the KNN algorithm to obtain first stay points of a plurality of categories, the classifying specifically comprises:
    for each first stay point, acquiring a plurality of positioning point sets corresponding to the first stay point;
    changing, using the KNN algorithm, the cluster identifier of any positioning point in the plurality of positioning point sets;
    classifying the first stay points corresponding to positioning points that have the same cluster identifier after the change into the same category.
  13. The electronic device according to claim 8, wherein, when the processor executes the at least one computer-readable instruction to subdivide the plurality of candidate stay points on the temporal layer according to the cluster identifiers of the plurality of positioning points corresponding to the candidate region to obtain the stay point set, the subdividing specifically comprises:
    sequentially reading a first cluster identifier of any positioning point corresponding to the candidate region;
    determining whether the first cluster identifier is equal to an initialized cluster identifier, wherein the initialized cluster identifier is the cluster identifier of the positioning point with the earliest arrival time;
    if the first cluster identifier is equal to the initialized cluster identifier, adding the first cluster identifier to a new cluster until the first cluster identifier of the currently read positioning point is not equal to the initialized cluster identifier, and then determining whether the number of positioning points in the new cluster is greater than or equal to a preset stay-point discrimination time threshold;
    if the number of positioning points in the new cluster is greater than or equal to the preset stay-point discrimination time threshold, acquiring the target candidate stay point corresponding to all positioning points read before the current one, and adding the target candidate stay point to the stay point set.
  14. The electronic device according to claim 13, wherein the processor further executes the at least one computer-readable instruction to implement the following step:
    if the currently read positioning point is not the last positioning point, setting the initialized cluster identifier to the cluster identifier of the currently read positioning point and iterating.
  15. A computer-readable storage medium having at least one computer-readable instruction stored thereon, wherein the at least one computer-readable instruction, when executed by a processor, implements the following steps:
    acquiring a plurality of pieces of positioning data, recorded by a user terminal, that represent a user's travel;
    preprocessing the positioning data to obtain processed data;
    processing the processed data on a spatial layer using the DBSCAN clustering algorithm and the K-nearest neighbor (KNN) classification algorithm to obtain candidate regions of a plurality of categories, wherein each candidate region comprises a plurality of candidate stay points belonging to the same category;
    for the candidate region of each category, subdividing the plurality of candidate stay points on a temporal layer according to the cluster identifiers of a plurality of positioning points corresponding to the candidate region to obtain a stay point set, wherein the stay point set comprises the final stay points of the user's travel;
    uploading the stay point set to a blockchain.
  16. The storage medium according to claim 15, wherein, when the at least one computer-readable instruction is executed by the processor to process the processed data on the spatial layer using the DBSCAN clustering algorithm and the KNN classification algorithm to obtain candidate regions of a plurality of categories, the processing specifically comprises:
    processing the processed data on the spatial layer using the DBSCAN algorithm to obtain a plurality of first stay points of the user's travel;
    classifying the plurality of first stay points using the KNN algorithm to obtain first stay points of a plurality of categories;
    constructing a candidate region from the first stay points belonging to the same category.
  17. The storage medium according to claim 16, wherein, when the at least one computer-readable instruction is executed by the processor to process the processed data on the spatial layer using the DBSCAN algorithm to obtain a plurality of first stay points of the user's travel, the processing comprises:
    using the DBSCAN algorithm, on the spatial layer, constructing for any positioning point in the processed data a neighborhood centered on that positioning point whose radius is a preset stay-point discrimination distance threshold;
    determining whether the number of positioning points in the neighborhood is greater than or equal to a preset stay-point discrimination time threshold, wherein each positioning point has a length of one unit of time in the time dimension;
    if the number of positioning points in the neighborhood is greater than or equal to the preset stay-point discrimination time threshold, calculating the geometric center of all positioning points in the neighborhood and determining the geometric center as a first stay point of the user's travel.
  18. The storage medium according to claim 16, wherein, after the plurality of first stay points are classified using the KNN algorithm to obtain first stay points of a plurality of categories, the at least one computer-readable instruction, when executed by the processor, further implements the following steps:
    determining whether the plurality of categories include a category of drift points;
    if the plurality of categories include a category of drift points, deleting the category that includes drift points;
    wherein the constructing a candidate region from the first stay points belonging to the same category comprises:
    for the first stay points of the categories remaining after the category including drift points is deleted, constructing a candidate region from the first stay points belonging to the same category.
  19. The storage medium according to claim 16, wherein, when the at least one computer-readable instruction is executed by the processor to classify the plurality of first stay points using the KNN algorithm to obtain first stay points of a plurality of categories, the classifying specifically comprises:
    for each first stay point, acquiring a plurality of positioning point sets corresponding to the first stay point;
    changing, using the KNN algorithm, the cluster identifier of any positioning point in the plurality of positioning point sets;
    classifying the first stay points corresponding to positioning points that have the same cluster identifier after the change into the same category.
  20. A positioning data processing device, wherein the positioning data processing device comprises:
    an acquisition module, configured to acquire a plurality of pieces of positioning data, recorded by a user terminal, that represent a user's travel;
    a first processing module, configured to preprocess the positioning data to obtain processed data;
    a second processing module, configured to process the processed data on a spatial layer using the DBSCAN clustering algorithm and the K-nearest neighbor (KNN) classification algorithm to obtain candidate regions of a plurality of categories, wherein each candidate region comprises a plurality of candidate stay points belonging to the same category;
    a third processing module, configured to, for the candidate region of each category, subdivide the plurality of candidate stay points on a temporal layer according to the cluster identifiers of a plurality of positioning points corresponding to the candidate region to obtain a stay point set, wherein the stay point set comprises the final stay points of the user's travel;
    an upload module, configured to upload the stay point set to a blockchain.
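
The spatial test of claim 3 and the temporal subdivision of claims 6 and 7 can be sketched in Python as follows. This is one possible reading of the claim language, not the applicant's implementation. It assumes that positioning points are planar `(x, y)` tuples sampled once per unit of time, so that a point count stands in for a duration, and the function names `neighborhood`, `first_stay_points`, and `split_by_time` are hypothetical. The KNN relabelling and drift-point removal of claims 4 and 5 are omitted.

```python
import math


def neighborhood(points, center, radius):
    """Return the points within `radius` of `center` (planar distance)."""
    return [p for p in points if math.dist(p, center) <= radius]


def first_stay_points(points, radius, min_points):
    """Spatial step: geometric center of each sufficiently dense neighborhood.

    `radius` plays the role of the distance threshold and `min_points`
    the role of the time threshold (one point = one unit of time).
    """
    stays = []
    for center in points:
        near = neighborhood(points, center, radius)
        if len(near) >= min_points:
            cx = sum(x for x, _ in near) / len(near)
            cy = sum(y for _, y in near) / len(near)
            stays.append((cx, cy))
    return stays


def split_by_time(cluster_ids, min_points):
    """Temporal step: split time-ordered cluster ids into consecutive runs
    and keep the runs that last at least `min_points` samples.

    Returns (cluster_id, start_index, end_index_exclusive) tuples.
    """
    runs = []
    start = 0
    for i in range(1, len(cluster_ids) + 1):
        # A run ends at the last point or when the cluster id changes.
        if i == len(cluster_ids) or cluster_ids[i] != cluster_ids[start]:
            if i - start >= min_points:
                runs.append((cluster_ids[start], start, i))
            start = i  # re-initialize with the current point's id
    return runs
```

In this sketch, two visits to the same place separated by time spent elsewhere yield two separate runs, which is the purpose of subdividing on the temporal layer. Neighborhoods centered on nearby points produce near-identical centers; merging them is left out here, and real latitude/longitude data would need a geodesic distance instead of `math.dist`.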
PCT/CN2020/104604 2020-05-21 2020-07-24 Artificial intelligence-based positioning data processing method and related device WO2021232585A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010438159.7 2020-05-21
CN202010438159.7A CN111680102B (en) 2020-05-21 2020-05-21 Positioning data processing method based on artificial intelligence and related equipment

Publications (1)

Publication Number Publication Date
WO2021232585A1 true WO2021232585A1 (en) 2021-11-25

Family

ID=72452901

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/104604 WO2021232585A1 (en) 2020-05-21 2020-07-24 Artificial intelligence-based positioning data processing method and related device

Country Status (2)

Country Link
CN (1) CN111680102B (en)
WO (1) WO2021232585A1 (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114509076A (en) * 2022-02-16 2022-05-17 平安科技(深圳)有限公司 Moving track data processing method, device and equipment and storage medium
CN115424367A (en) * 2022-08-03 2022-12-02 洛阳智能农业装备研究院有限公司 GPS-based operation state judgment method, device, equipment and readable storage medium
CN115422480A (en) * 2022-10-31 2022-12-02 荣耀终端有限公司 Method, apparatus and storage medium for determining event occurrence area
CN115936817A (en) * 2022-12-30 2023-04-07 北京白驹易行科技有限公司 Passenger order starting point aggregation method and device and computer equipment
CN116027367A (en) * 2022-08-31 2023-04-28 荣耀终端有限公司 Stay point identification method, apparatus, device and storage medium
CN116434529A (en) * 2022-12-12 2023-07-14 交通运输部规划研究院 Inter-city highway freight characteristic analysis method and device and electronic equipment
CN117992870A (en) * 2024-04-03 2024-05-07 山东铁鹰建设工程有限公司 Bias early warning method for intelligent lining trolley

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112201047B (en) * 2020-10-10 2022-02-18 武汉中科通达高新技术股份有限公司 Suspected vehicle foothold analysis method and device based on Flink framework
CN116738073B (en) * 2022-09-21 2024-03-22 荣耀终端有限公司 Method, equipment and storage medium for identifying residence

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102749631A (en) * 2012-07-26 2012-10-24 海华电子企业(中国)有限公司 Method for reducing positioning drift of Big Dipper satellite navigating and positioning device
CN106912015A (en) * 2017-01-10 2017-06-30 上海云砥信息科技有限公司 A kind of personnel's Trip chain recognition methods based on mobile network data
CN107589435A (en) * 2017-09-05 2018-01-16 成都新橙北斗智联有限公司 A kind of Big Dipper GPS track stops analysis method
CN108170793A (en) * 2017-12-27 2018-06-15 厦门市美亚柏科信息股份有限公司 Dwell point analysis method and its system based on vehicle semanteme track data
CN109104694A (en) * 2018-06-26 2018-12-28 重庆市交通规划研究院 A kind of user stop place discovery method and system based on mobile phone signaling
US10691133B1 (en) * 2019-11-26 2020-06-23 Apex Artificial Intelligence Industries, Inc. Adaptive and interchangeable neural networks

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107273508B (en) * 2017-06-20 2020-07-10 北京百度网讯科技有限公司 Information processing method and device based on artificial intelligence

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114509076A (en) * 2022-02-16 2022-05-17 平安科技(深圳)有限公司 Moving track data processing method, device and equipment and storage medium
CN114509076B (en) * 2022-02-16 2023-10-20 平安科技(深圳)有限公司 Method, device, equipment and storage medium for processing movement track data
CN115424367A (en) * 2022-08-03 2022-12-02 洛阳智能农业装备研究院有限公司 GPS-based operation state judgment method, device, equipment and readable storage medium
CN116027367A (en) * 2022-08-31 2023-04-28 荣耀终端有限公司 Stay point identification method, apparatus, device and storage medium
CN116027367B (en) * 2022-08-31 2023-10-20 荣耀终端有限公司 Stay point identification method, apparatus, device and storage medium
CN115422480A (en) * 2022-10-31 2022-12-02 荣耀终端有限公司 Method, apparatus and storage medium for determining event occurrence area
CN116434529A (en) * 2022-12-12 2023-07-14 交通运输部规划研究院 Inter-city highway freight characteristic analysis method and device and electronic equipment
CN116434529B (en) * 2022-12-12 2023-10-24 交通运输部规划研究院 Inter-city highway freight characteristic analysis method and device and electronic equipment
CN115936817A (en) * 2022-12-30 2023-04-07 北京白驹易行科技有限公司 Passenger order starting point aggregation method and device and computer equipment
CN115936817B (en) * 2022-12-30 2024-02-20 北京白驹易行科技有限公司 Passenger order starting point aggregation method and device and computer equipment
CN117992870A (en) * 2024-04-03 2024-05-07 山东铁鹰建设工程有限公司 Bias early warning method for intelligent lining trolley

Also Published As

Publication number Publication date
CN111680102B (en) 2023-12-26
CN111680102A (en) 2020-09-18

Similar Documents

Publication Publication Date Title
WO2021232585A1 (en) Artificial intelligence-based positioning data processing method and related device
Zhao et al. A trajectory clustering approach based on decision graph and data field for detecting hotspots
WO2020228706A1 (en) Fence address-based coordinate data processing method and apparatus, and computer device
US20190012914A1 (en) Parking Identification and Availability Prediction
WO2022227303A1 (en) Information processing method and apparatus, computer device, and storage medium
Suma et al. Automatic detection and validation of smart city events using hpc and apache spark platforms
CN114428828A (en) Method and device for digging new road based on driving track and electronic equipment
US11829455B2 (en) AI governance using tamper proof model metrics
WO2021191685A2 (en) System and method for vehicle event data processing for identifying parking areas
Chen et al. An analysis of movement patterns between zones using taxi GPS data
Zhang et al. Automatic latent street type discovery from web open data
Namdarpour et al. Using genetic programming on GPS trajectories for travel mode detection
Belcastro et al. Parallel extraction of Regions‐of‐Interest from social media data
CN110598122B (en) Social group mining method, device, equipment and storage medium
CN114692978A (en) Social media user behavior prediction method and system based on big data
Lei Geospatial data conflation: A formal approach based on optimization and relational databases
Nguyen et al. A method for efficient clustering of spatial data in network space
Zhang et al. Clustering with implicit constraints: A novel approach to housing market segmentation
Sun et al. Predicting future locations with semantic trajectories
Boroumand et al. FLCSS: A fuzzy‐based longest common subsequence method for uncertainty management in trajectory similarity measures
CN111339446B (en) Interest point mining method and device, electronic equipment and storage medium
Li et al. gsstSIM: A high‐performance and synchronized similarity analysis method of spatiotemporal trajectory based on grid model representation
Zhao et al. Efficient semantic enrichment process for spatiotemporal trajectories
Hsu et al. Common sub-trajectory clustering via hypercubes in spatiotemporal space
Liu et al. Behavior identification based on geotagged photo data set

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 20936920

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN EP: public notification in the EP bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 13.03.2023)

122 EP: PCT application non-entry in European phase

Ref document number: 20936920

Country of ref document: EP

Kind code of ref document: A1