CN109299747B - Method and device for determining cluster center, computer equipment and storage medium - Google Patents

Publication number
CN109299747B
Authority
CN
China
Prior art keywords
position information
tree
geographical position
node
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811246206.7A
Other languages
Chinese (zh)
Other versions
CN109299747A (en)
Inventor
于晓杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/23: Clustering techniques
    • G06F18/232: Non-hierarchical techniques

Abstract

The embodiments of the disclosure disclose a method, a device, computer equipment and a storage medium for determining a cluster center. The method comprises the following steps: converting each piece of two-dimensional geographical position information in a geographical position information set to be processed into one-dimensional position coding information by adopting a geographical position coding technology; generating a dictionary tree according to the position coding information, wherein one tree node in the dictionary tree corresponds to a set geographical position area, and the geographical position area corresponding to a child node falls within the geographical position area corresponding to the parent node of that child node; calculating a density value for each piece of geographical position information according to the position relation between each tree node in the dictionary tree and each piece of geographical position information and the quantity value of the geographical position information associated with each tree node; and determining at least one cluster center in the geographical position information set according to the density values. The technical scheme of the embodiments of the disclosure can reduce the computational complexity of determining cluster centers in a clustering algorithm.

Description

Method and device for determining cluster center, computer equipment and storage medium
Technical Field
The embodiment of the disclosure relates to the technical field of data processing, and in particular, to a method and an apparatus for determining a cluster center, a computer device, and a storage medium.
Background
The objective of a density-based clustering algorithm is to find high-density regions separated by low-density regions. Informally, the densely packed points are grouped together, and the regions where points are sparse serve as the boundaries between groups.
The core idea of a density-based clustering algorithm is to find the points with higher density in the acquired position data, then connect neighbouring high-density points step by step into contiguous regions, thereby generating clusters, each of which is matched with a cluster center.
In the process of implementing the present disclosure, the inventors found that existing density-based clustering algorithms have the following defect: the position data involved in the algorithm is usually processed directly in the form of longitude and latitude information, so the computational complexity is high.
Disclosure of Invention
The embodiment of the disclosure provides a method and a device for determining a cluster center, computer equipment and a storage medium, which are used for reducing the computation complexity of the cluster center in a clustering algorithm.
In a first aspect, an embodiment of the present disclosure provides a method for determining a cluster center, including:
converting each two-dimensional geographical position information in the geographical position information set to be processed into one-dimensional position coding information by adopting a geographical position coding technology;
generating a dictionary tree according to the position coding information, wherein one tree node in the dictionary tree corresponds to a set geographical position area, and the geographical position area corresponding to one child node belongs to the geographical position area range corresponding to the parent node of the child node;
calculating density values respectively corresponding to the geographical position information according to the position relation between each tree node in the dictionary tree and the geographical position information and the quantity value of the geographical position information associated with each tree node;
determining at least one cluster center in the geographical position information set according to the density values.
Optionally, calculating density values respectively corresponding to the geographical location information according to a location relationship between each tree node in the dictionary tree and each geographical location information and a quantity value of the geographical location information associated with each tree node, includes:
calculating a position error value and central geographical position information respectively corresponding to each tree node according to the geographical position area corresponding to each tree node in the dictionary tree;
and calculating a numerical relationship between the difference value between the geographical position information and the central geographical position information of the tree nodes in the dictionary tree and the position error value of the tree nodes, and determining a traversal form of the dictionary tree and an updating mode of the density value of the geographical position information according to the numerical relationship until the dictionary tree is traversed to obtain the density value corresponding to each geographical position information.
Optionally, calculating a numerical relationship between a difference between the geographic position information and central geographic position information of the tree nodes in the dictionary tree and the position error value of the tree node, and determining a traversal form of the dictionary tree and an update mode of the density value of the geographic position information according to the numerical relationship until the dictionary tree is traversed, including:
acquiring one geographical position information in the geographical position information set as current position information, and setting an initial density value of the current position information;
sequentially acquiring an unprocessed tree node in the dictionary tree as a current comparison node according to the top-down sequence, and calculating a distance value between the current position information and the central geographical position information of the current comparison node;
if the distance value is smaller than or equal to a first threshold value, updating the density value of the current position information by adding to it the quantity value of the geographical position information associated with the current comparison node, and marking the current comparison node and all child nodes under the current comparison node as processed nodes; the first threshold is the difference between a set density distance threshold and the position error value of the current comparison node;
if the distance value is larger than or equal to a second threshold value, keeping the density value of the current position information unchanged, and marking the current comparison node and all sub-nodes corresponding to the current comparison node as processed nodes; the second threshold is a sum of the set density distance threshold and the position error value of the current comparison node;
if the distance value is larger than the first threshold and smaller than the second threshold, keeping the density value of the current position information unchanged, and marking the current comparison node as a processed node;
returning to execute the operation of sequentially acquiring an unprocessed tree node in the dictionary tree as a current comparison node according to the top-down sequence until the processing of all the tree nodes in the dictionary tree is completed to obtain a density value corresponding to the current position information;
and returning to execute the operation of acquiring one geographical position information in the geographical position information set as the current position information until the processing of all the geographical position information is completed.
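For illustration only, the three-way comparison described above can be sketched in Python as follows. The `Node` class, the `density` function, the explicit stack, and the use of planar Euclidean distance on (latitude, longitude) pairs are assumptions introduced for this sketch and are not taken from the disclosure; marking a node and all of its child nodes as processed is modelled here by simply not descending into that node.

```python
from dataclasses import dataclass, field
from math import hypot


@dataclass
class Node:
    """Hypothetical tree node: area center, position error value, point count."""
    center: tuple
    error: float
    count: int
    children: list = field(default_factory=list)


def density(point, root, radius):
    """Density value of `point` for a set density distance threshold `radius`."""
    total = 0              # initial density value, assumed to be 0
    stack = [root]         # nodes still to be compared, visited top-down
    while stack:
        node = stack.pop()
        d = hypot(point[0] - node.center[0], point[1] - node.center[1])
        if d <= radius - node.error:
            # First threshold: the whole area lies within the radius, so add
            # the node's count and skip all of its descendants.
            total += node.count
        elif d >= radius + node.error:
            # Second threshold: the whole area lies outside the radius, so
            # skip the node and all of its descendants.
            continue
        else:
            # Between the thresholds: leave the density unchanged and let the
            # child nodes decide.
            stack.extend(node.children)
    return total


# Small hand-built tree: one nearby area with 3 points, one distant area with 2.
near = Node(center=(0.0, 0.0), error=0.5, count=3)
far = Node(center=(8.0, 8.0), error=0.5, count=2)
root = Node(center=(0.0, 0.0), error=10.0, count=5, children=[near, far])
result = density((0.0, 0.0), root, radius=2.0)
```

In this sketch a leaf node that falls between the two thresholds contributes nothing, which follows the wording of the steps above; a finer dictionary tree narrows that band.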
Optionally, calculating a position error value and central geographical position information corresponding to each tree node according to a geographical position region corresponding to each tree node in the dictionary tree, includes:
calculating position error values respectively corresponding to the tree nodes according to height values between upper boundaries and lower boundaries of the geographic position areas corresponding to the tree nodes in the dictionary tree;
calculating central geographical position information corresponding to each tree node according to the average value of the geographical position information of the geographical position area corresponding to each tree node in the dictionary tree; the geographic position information comprises longitude and latitude information.
Optionally, determining at least one cluster center in the geographic location information set according to the density value includes:
calculating a cluster distance of each piece of geographical position information according to the density value corresponding to each piece of geographical position information;
calculating a cluster weight of each piece of geographical position information according to the density value and the cluster distance respectively corresponding to each piece of geographical position information;
and determining at least one cluster center in the geographical position information set according to the cluster weights.
Optionally, calculating the cluster distance of each piece of geographical position information according to the density value corresponding to each piece of geographical position information includes:
sorting the geographical position information by density value according to a set rule;
sequentially calculating, according to the sorting result, the distance values between each piece of geographical position information and the pieces of geographical position information ranked ahead of it, as cluster distances to be screened;
and taking the cluster distance to be screened that meets a cluster distance judgment condition as the cluster distance of the geographical position information.
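As a hedged sketch of the steps above, the following Python code sorts the points by density, computes candidate distances to the points ranked ahead, and keeps one of them as the cluster distance. Several details are assumptions and are not stated in this passage: the sort is in descending order of density, the judgment condition is taken to be "the minimum candidate distance", the top-ranked point is given its maximum distance to any other point, the cluster weight is taken to be the product of density and cluster distance, and distances are planar Euclidean. The function names are hypothetical.

```python
from math import hypot


def _dist(a, b):
    """Planar Euclidean distance between two (lat, lon) pairs (assumption)."""
    return hypot(a[0] - b[0], a[1] - b[1])


def cluster_distances(points, densities):
    """Cluster distance per point: distance to the nearest higher-ranked point."""
    # Assumed set rule: sort indices by density, highest first.
    order = sorted(range(len(points)), key=lambda i: densities[i], reverse=True)
    delta = [0.0] * len(points)
    for rank, i in enumerate(order):
        # Cluster distances to be screened: distances to points ranked ahead.
        candidates = [_dist(points[i], points[j]) for j in order[:rank]]
        if candidates:
            # Assumed judgment condition: keep the smallest candidate.
            delta[i] = min(candidates)
    if len(order) > 1:
        # The top-ranked point has no point ahead of it; assume it receives
        # its largest distance to any other point.
        top = order[0]
        delta[top] = max(_dist(points[top], points[j]) for j in order[1:])
    return delta


def cluster_weights(densities, delta):
    """Assumed cluster weight: density value multiplied by cluster distance."""
    return [rho * d for rho, d in zip(densities, delta)]


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
densities = [5, 3, 4, 2]
delta = cluster_distances(points, densities)
weights = cluster_weights(densities, delta)
# Indices of the two points with the largest cluster weight.
centers = sorted(range(len(points)), key=lambda i: weights[i], reverse=True)[:2]
```

Under these assumptions, points that are both dense and far from any denser point receive the largest weights, so selecting the highest-weight points gives one candidate center per group.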
Optionally, the geographical location information set to be processed is track data of the target user within a set time interval;
after determining at least one cluster center in the geographical position information set according to the density values, the method further comprises:
and taking the at least one cluster center as a resident point of the target user.
In a second aspect, an embodiment of the present disclosure further provides an apparatus for determining a cluster center, including:
the information conversion module is used for converting each two-dimensional geographic position information in the geographic position information set to be processed into one-dimensional position coding information by adopting a geographic position coding technology;
the dictionary tree generating module is used for generating a dictionary tree according to the position coding information, wherein one tree node in the dictionary tree corresponds to a set geographical position area, and the geographical position area corresponding to one child node belongs to the geographical position area range corresponding to the parent node of the child node;
the density value calculation module is used for calculating density values corresponding to the geographical position information according to the position relation between the geographical position information and each tree node in the dictionary tree and the quantity value of the geographical position information associated with each tree node;
and the cluster center determining module is used for determining at least one cluster center in the geographic position information set according to the density value.
Optionally, the density value calculating module includes: the central geographical position information calculation unit is used for calculating a position error value and central geographical position information which respectively correspond to each tree node according to a geographical position area corresponding to each tree node in the dictionary tree;
and the density value calculation unit is used for calculating a numerical relationship between the difference value between the geographical position information and the central geographical position information of the tree nodes in the dictionary tree and the position error value of the tree nodes, determining a traversal form of the dictionary tree and an updating mode of the density value of the geographical position information according to the numerical relationship until the dictionary tree is traversed so as to obtain the density value corresponding to each geographical position information.
Optionally, the density value calculating unit is specifically configured to obtain one geographic location information in the geographic location information set as current location information, and set an initial density value of the current location information;
sequentially acquiring an unprocessed tree node in the dictionary tree as a current comparison node according to the top-down sequence, and calculating a distance value between the current position information and the central geographical position information of the current comparison node;
if the distance value is smaller than or equal to a first threshold value, updating the density value of the current position information by adding to it the quantity value of the geographical position information associated with the current comparison node, and marking the current comparison node and all child nodes under the current comparison node as processed nodes; the first threshold is the difference between a set density distance threshold and the position error value of the current comparison node;
if the distance value is larger than or equal to a second threshold value, keeping the density value of the current position information unchanged, and marking the current comparison node and all sub-nodes corresponding to the current comparison node as processed nodes; the second threshold is a sum of the set density distance threshold and the position error value of the current comparison node;
if the distance value is larger than the first threshold and smaller than the second threshold, keeping the density value of the current position information unchanged, and marking the current comparison node as a processed node;
returning to execute the operation of sequentially acquiring an unprocessed tree node in the dictionary tree as a current comparison node according to the top-down sequence until the processing of all the tree nodes in the dictionary tree is completed to obtain a density value corresponding to the current position information;
and returning to execute the operation of acquiring one geographical position information in the geographical position information set as the current position information until the processing of all the geographical position information is completed.
Optionally, the central geographic position information calculating unit is specifically configured to calculate, according to a height value between an upper boundary and a lower boundary of a geographic position region corresponding to each tree node in the dictionary tree, a position error value corresponding to each tree node;
calculating central geographical position information corresponding to each tree node according to the average value of the geographical position information of the geographical position area corresponding to each tree node in the dictionary tree; the geographic position information comprises longitude and latitude information.
Optionally, the cluster center determining module includes: the cluster distance calculating unit, used for calculating the cluster distance of each piece of geographical position information according to the density value corresponding to each piece of geographical position information;
the cluster weight calculation unit is used for calculating the cluster weight of each geographic position information according to the density value and the cluster distance which are respectively corresponding to each geographic position information;
and the cluster center determining unit is used for determining at least one cluster center in the geographical position information set according to the cluster weights.
Optionally, the cluster distance calculating unit is specifically configured to sort the geographical position information by density value according to a set rule;
sequentially calculating, according to the sorting result, the distance values between each piece of geographical position information and the pieces of geographical position information ranked ahead of it, as cluster distances to be screened;
and taking the cluster distance to be screened which meets the cluster distance judgment condition as the cluster distance of the geographic position information.
Optionally, the geographical position information set to be processed is track data of the target user within a set time interval; the device further comprises: the resident point determining module, used for taking the at least one cluster center as a resident point of the target user.
In a third aspect, an embodiment of the present disclosure further provides a computer device, where the computer device includes:
one or more processors;
storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method for determining a cluster center provided by any of the embodiments of the present disclosure.
In a fourth aspect, the embodiments of the present disclosure further provide a computer storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method for determining a cluster center provided in any embodiment of the present disclosure.
The embodiment of the disclosure converts each two-dimensional geographical position information in the geographical position information set to be processed into one-dimensional position coding information by adopting a geographical position coding technology to generate a dictionary tree, calculates density values respectively corresponding to each geographical position information according to the position relation between each tree node and each geographical position information in the dictionary tree and the quantity value of the geographical position information associated with each tree node, and finally determines at least one cluster center in the geographical position information set according to the density values.
Drawings
Fig. 1a is a flowchart of a method for determining a cluster center according to an embodiment of the present disclosure;
Fig. 1b is a schematic structural diagram of a dictionary tree generated according to position coding information according to an embodiment of the present disclosure;
Fig. 2a is a flowchart of a method for determining a cluster center according to a second embodiment of the present disclosure;
Fig. 2b is a flowchart of a method for calculating density values corresponding to geographical position information according to a second embodiment of the present disclosure;
Fig. 3 is a schematic diagram of a cluster center determining apparatus according to a third embodiment of the present disclosure;
Fig. 4 is a schematic hardware structure diagram of a computer device according to a fourth embodiment of the present disclosure.
Detailed Description
The present disclosure is described in further detail below with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the disclosure and are not limiting of the disclosure.
It is also noted that, for the sake of convenience in description, only some but not all of the matters related to the present disclosure are shown in the drawings. Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the operations (or steps) as a sequential process, many of the operations can be performed in parallel, concurrently or simultaneously. In addition, the order of the operations may be re-arranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like.
Embodiment one
Fig. 1a is a flowchart of a method for determining a cluster center according to an embodiment of the present disclosure, where the present embodiment is applicable to a case of quickly determining a cluster center, and the method may be performed by a device for determining a cluster center, where the device may be implemented by software and/or hardware, and may be generally integrated in a computer device. Accordingly, as shown in fig. 1a, the method comprises the following operations:
S110, converting each piece of two-dimensional geographical position information in the geographical position information set to be processed into one-dimensional position coding information by adopting a geographical position coding technology.
The geographical position coding technology may be any technique for encoding geographical position data; for example, it may be the GeoHash algorithm. The two-dimensional geographical position information may be information corresponding to point data collected at different geographical positions. The position coding information is the information obtained by encoding the geographical position information.
In the embodiment of the present disclosure, when a density-based clustering algorithm is used to calculate cluster centers, the geographical position information set used for the calculation needs to be acquired first. The geographical position information set may include a plurality of pieces of two-dimensional geographical position information. Optionally, the geographical position information may be longitude and latitude information. After the geographical position information set to be processed is obtained, the geographical position coding technology can be adopted to convert each piece of two-dimensional geographical position information in the set into one-dimensional position coding information.
Optionally, if the geographical position coding technology is the GeoHash algorithm, the algorithm encodes each piece of two-dimensional geographical position information approximately, converting it into a corresponding character string that serves as the one-dimensional position coding information.
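As a minimal sketch of this conversion, the following Python function implements the publicly documented GeoHash encoding: it alternately bisects the longitude and latitude ranges and packs every five resulting bits into one base-32 character. The function name `geohash_encode` and the `precision` parameter are illustrative choices, not names from the disclosure.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # GeoHash base-32 alphabet


def geohash_encode(lat, lon, precision=6):
    """Encode a (latitude, longitude) pair as a GeoHash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0          # bits collected for the current character
    bit_count = 0
    use_lon = True    # GeoHash starts with a longitude bit
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:            # five bits make one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


code = geohash_encode(42.6, -5.6, precision=5)
```

A shorter code is always a prefix of a longer code for the same point, which is the property that lets the codes be organised into a dictionary tree.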
And S120, generating a dictionary tree according to each position coding information, wherein one tree node in the dictionary tree corresponds to a set geographical position area, and the geographical position area corresponding to one child node belongs to the geographical position area range corresponding to the parent node of the child node.
In the embodiment of the present disclosure, after each piece of two-dimensional geographical position information in the geographical position information set is converted into one-dimensional position coding information, a dictionary tree may be constructed from the converted position coding information. The dictionary tree is formed by multiple levels of tree nodes; the characters along the path from the root to a tree node form a piece of position coding information, each piece of position coding information corresponds to a rectangular geographical position area, and each geographical position area may include one or more pieces of geographical position information. The geographical position areas corresponding to tree nodes of different levels differ in size, and the geographical position area corresponding to a child node falls within the geographical position area corresponding to the parent node of that child node. The root node of the dictionary tree is empty.
Fig. 1b is a schematic structural diagram of a dictionary tree generated according to position coding information according to an embodiment of the present disclosure. Illustratively, as shown in Fig. 1b, the second-level nodes of the dictionary tree include two nodes, A and C, and the third-level nodes include N, L, B and S. Assuming that the geographical position area corresponding to node A is the United States and the geographical position area corresponding to node C is China, the child node N under node A may represent New York and the child node L may represent Los Angeles; similarly, the child node B under node C may represent Beijing and the child node S may represent Shanghai. The geographical position area corresponding to Beijing may include 10 pieces of two-dimensional geographical position information, such as 10 pieces of longitude and latitude information corresponding to point data acquired in the same urban district or in different urban districts.
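The structure in this example can be sketched as a small Python dictionary tree in which every node records how many pieces of geographical position information pass through it. The `TrieNode` class, the `build_trie` function and the two-character codes such as "AN" are hypothetical and mirror the illustrative labels above rather than real position codes.

```python
class TrieNode:
    """One tree node: child nodes keyed by character, plus a point count."""

    def __init__(self):
        self.children = {}
        self.count = 0   # number of codes whose path passes through this node


def build_trie(codes):
    """Build a dictionary tree from an iterable of position code strings."""
    root = TrieNode()            # the root node itself carries no character
    for code in codes:
        node = root
        node.count += 1
        for ch in code:
            node = node.children.setdefault(ch, TrieNode())
            node.count += 1
    return root


# Two points in "New York", one in "Los Angeles", one each in "Beijing" and
# "Shanghai", using the illustrative labels from the example above.
root = build_trie(["AN", "AN", "AL", "CB", "CS"])
node_a = root.children["A"]
```

Each node's count is the quantity value of the geographical position information associated with that node, so a parent's count equals the sum of its children's counts when all codes have the same length.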
S130, calculating density values corresponding to the geographical position information according to the position relation between the tree nodes and the geographical position information in the dictionary tree and the quantity value of the geographical position information relevant to the tree nodes.
The density value of a piece of geographical position information reflects how densely other pieces of geographical position information are distributed around it.
In the embodiment of the present disclosure, the density value corresponding to each piece of geographical position information may be calculated according to the position relation between each tree node in the dictionary tree and each piece of geographical position information and the quantity value of the geographical position information associated with each tree node.
S140, determining at least one cluster center in the geographic position information set according to the density value.
Correspondingly, at least one cluster center can be determined for all data included in the geographical location information set according to the density values respectively corresponding to the geographical location information.
The embodiment of the disclosure converts each two-dimensional geographical position information in the geographical position information set to be processed into one-dimensional position coding information by adopting a geographical position coding technology to generate a dictionary tree, calculates density values respectively corresponding to each geographical position information according to the position relation between each tree node and each geographical position information in the dictionary tree and the quantity value of the geographical position information associated with each tree node, and finally determines at least one cluster center in the geographical position information set according to the density values.
Embodiment two
Fig. 2a is a flowchart of a method for determining a cluster center according to a second embodiment of the present disclosure, and Fig. 2b is a flowchart of a method for calculating density values corresponding to respective pieces of geographical position information according to the second embodiment of the present disclosure. Accordingly, as shown in Fig. 2a, the method of the present embodiment may include:
S210, converting each piece of two-dimensional geographical position information in the geographical position information set to be processed into one-dimensional position coding information by adopting a geographical position coding technology.
S220, generating a dictionary tree according to the position coding information, wherein one tree node in the dictionary tree corresponds to a set geographical position area, and the geographical position area corresponding to one child node belongs to the geographical position area range corresponding to the parent node of the child node.
S230, calculating a position error value and central geographical position information respectively corresponding to each tree node according to the geographical position area corresponding to each tree node in the dictionary tree.
In the embodiment of the present disclosure, when calculating the density values corresponding to the geographic location information, the location error value and the central geographic location information corresponding to each tree node may be first calculated as intermediate quantities.
Optionally, in an embodiment, S230 may include the following operations:
S231, calculating position error values respectively corresponding to the tree nodes according to height values between upper boundaries and lower boundaries of the geographical position areas corresponding to the tree nodes in the dictionary tree.
Specifically, the height value between the upper boundary and the lower boundary of the geographical position area corresponding to a tree node may be used as the position error value of that tree node. Optionally, the difference between the latitude of the upper boundary and the latitude of the lower boundary of the geographical position area may be used as the height value.
S232, calculating central geographical position information corresponding to each tree node according to the average value of the geographical position information of the geographical position area corresponding to each tree node in the dictionary tree.
Correspondingly, the central geographical location information corresponding to each tree node may be geographical location information corresponding to a longitude average value and a latitude average value of the geographical location area corresponding to each tree node.
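A minimal sketch of S231 and S232 follows, assuming that the position codes are GeoHash strings so that the rectangular area of a tree node can be recovered from its code prefix. It further assumes that the "average value" is the midpoint of the area's latitude and longitude bounds and that the position error value is expressed in degrees of latitude. The function names `cell_bounds` and `node_error_and_center` are hypothetical.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # GeoHash base-32 alphabet


def cell_bounds(prefix):
    """Return (lat_lo, lat_hi, lon_lo, lon_hi) of the area for a GeoHash prefix."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    use_lon = True                       # bits alternate, longitude first
    for ch in prefix:
        value = BASE32.index(ch)
        for shift in range(4, -1, -1):   # five bits per character, high bit first
            bit = (value >> shift) & 1
            if use_lon:
                mid = (lon_lo + lon_hi) / 2
                if bit:
                    lon_lo = mid
                else:
                    lon_hi = mid
            else:
                mid = (lat_lo + lat_hi) / 2
                if bit:
                    lat_lo = mid
                else:
                    lat_hi = mid
            use_lon = not use_lon
    return lat_lo, lat_hi, lon_lo, lon_hi


def node_error_and_center(prefix):
    """Position error value (area height) and central position of a tree node."""
    lat_lo, lat_hi, lon_lo, lon_hi = cell_bounds(prefix)
    error = lat_hi - lat_lo                                   # upper minus lower latitude
    center = ((lat_lo + lat_hi) / 2, (lon_lo + lon_hi) / 2)   # (lat, lon) midpoint
    return error, center


error, center = node_error_and_center("e")
```

Because both quantities depend only on the code prefix, they can be computed once per tree node and reused for every piece of geographical position information.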
S240, calculating a numerical relationship between the difference value between the geographic position information and the central geographic position information of the tree nodes in the dictionary tree and the position error value of the tree nodes, determining a traversal form of the dictionary tree and an updating mode of the density value of the geographic position information according to the numerical relationship until the dictionary tree is traversed, and obtaining the density value corresponding to each geographic position information.
In the embodiment of the present disclosure, after the position error value and the central geographical position information respectively corresponding to each tree node are obtained, the difference between the geographical position information and the central geographical position information of each tree node in the dictionary tree may be calculated, and the contribution of each node in the dictionary tree to the density value of the geographical position information is then calculated by traversal, according to the numerical relationship between that difference and the position error value of the tree node. Because the density values corresponding to the geographic position information are calculated through the constructed dictionary tree, a large amount of unnecessary calculation can be avoided and the calculation complexity is reduced.
Optionally, in an embodiment, as shown in fig. 2b, S240 may include the following operations:
S241, acquiring one piece of geographical position information in the geographical position information set as the current position information, and setting the initial density value of the current position information.
In the embodiment of the present disclosure, when determining the cluster center of the geographic location information set, each geographic location information in the geographic location information set may be calculated separately. During calculation, the initial density value of the current location information may be set, and optionally, the initial density value of the current location information may be set to 0.
S242, sequentially acquiring an unprocessed tree node in the dictionary tree as a current comparison node according to the top-down sequence, and calculating the distance value between the current position information and the central geographic position information of the current comparison node.
In the embodiment of the disclosure, when the density value of the geographic position information is calculated, the calculation may be performed by means of the constructed dictionary tree. Specifically, an unprocessed tree node may be sequentially obtained from top to bottom in the dictionary tree as a current comparison node, and then a distance value between the current location information and the central geographic location information of the current comparison node is calculated.
S243, judging whether the distance value is smaller than or equal to a first threshold value, if so, executing S244; otherwise, S245 is executed.
The first threshold is a difference value between a set density distance threshold and a position error value of the current comparison node.
The first threshold value may be set by comprehensively considering distribution characteristics of the geographic location information and an actual demand. Optionally, the first threshold may be a difference between the set density distance threshold and a position error value of the current comparison node. The set density distance threshold may be an actually selected distance value, such as 300 m.
Correspondingly, after the distance value between the current position information and the central geographical position information of the current comparison node is obtained, the distance value can be compared with the first threshold value in order to calculate the density value of the current position information.
S244, updating the density value of the current location information to an accumulated sum of the quantity values of the geographic location information associated with the current comparison node, and marking the current comparison node and all sub-nodes corresponding to the current comparison node as processed nodes.
Specifically, if the distance value is less than or equal to the first threshold, the distances between the current location information and all the nodes under the current comparison node are less than the set density distance threshold. In this case, the number of pieces of geographic location information included in the geographic location area corresponding to the current comparison node may be added to the density value of the current location information, and the current comparison node and all of its child nodes are then marked as processed nodes, so that the distance between the current location information and each child node of the current comparison node does not need to be calculated.
S245, judging whether the distance value is larger than or equal to a second threshold value, if so, executing S246; otherwise, S247 is executed.
The second threshold is a sum of the set density distance threshold and the position error value of the current comparison node.
Accordingly, in the embodiment of the present disclosure, the density value of the current location information may also be calculated according to a magnitude relationship between the distance value and the second threshold.
S246, keeping the density value of the current position information unchanged, and marking the current comparison node and all sub-nodes corresponding to the current comparison node as processed nodes.
Specifically, if the distance value is greater than or equal to the second threshold, which indicates that the distances between all the nodes under the current comparison node and the current location information are greater than the set density distance threshold, the density value of the current location information may be directly kept unchanged, and then the current comparison node and all the child nodes corresponding to the current comparison node are marked as processed nodes without calculating the distance between the current location information and each child node included in the current comparison node.
S247, if the distance value is larger than the first threshold value and smaller than the second threshold value, keeping the density value of the current position information unchanged, and marking the current comparison node as a processed node.
Correspondingly, if the distance value is greater than the first threshold and less than the second threshold, the density contribution of each child node of the current comparison node to the current position information needs to be examined. In this case, the density value of the current position information may be kept unchanged and the current comparison node marked as a processed node, and the density contribution of each child node of the current comparison node to the current position information is then calculated in turn, in the same manner as the operations of S243-S245.
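The reasoning behind the two thresholds can be written out with the triangle inequality. The symbols below are introduced only for this illustration: $p$ is the current position, $c_n$ and $e_n$ are the central position and the position error value of the current comparison node $n$, $d$ is the set density distance threshold, and $q$ is any position inside the area of node $n$. The derivation assumes that the position error value bounds the distance from the center to any position in the area, i.e. $\operatorname{dist}(c_n, q) \le e_n$, which the disclosure does not state explicitly.

```latex
\begin{align*}
\operatorname{dist}(p,q) &\le \operatorname{dist}(p,c_n) + \operatorname{dist}(c_n,q)
                          \le \operatorname{dist}(p,c_n) + e_n, \\
\operatorname{dist}(p,q) &\ge \operatorname{dist}(p,c_n) - \operatorname{dist}(c_n,q)
                          \ge \operatorname{dist}(p,c_n) - e_n. \\[4pt]
\operatorname{dist}(p,c_n) \le d - e_n &\;\Longrightarrow\; \operatorname{dist}(p,q) \le d
  \quad\text{(S244: every point under $n$ is counted)}, \\
\operatorname{dist}(p,c_n) \ge d + e_n &\;\Longrightarrow\; \operatorname{dist}(p,q) \ge d
  \quad\text{(S246: no point under $n$ is counted)}, \\
d - e_n < \operatorname{dist}(p,c_n) < d + e_n &\;\Longrightarrow\; \text{undecided at node $n$}
  \quad\text{(S247: examine the child nodes)}.
\end{align*}
```

Because $e_n$ shrinks as the traversal descends to smaller areas, the undecided band $(d - e_n,\, d + e_n)$ narrows at each level.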
S248, judging whether the processing of all tree nodes in the dictionary tree is finished or not, if so, executing S249; otherwise, execution returns to S242.
Specifically, if the distance value is greater than the first threshold and less than the second threshold, the density contribution of each child node included in the current comparison node to the current position information is calculated in turn in top-down order until the processing of each child node included in the current comparison node is completed. Then the operation of sequentially acquiring an unprocessed tree node in the dictionary tree as the current comparison node in top-down order is executed again, until the processing of all the tree nodes in the dictionary tree is completed.
S249, judging whether the processing of all the geographical position information is finished or not, and if so, ending the operation; otherwise, return to execute S241.
Correspondingly, after the density value of the current position information is calculated, whether all the geographical position information is processed is judged, if so, the operation is ended, otherwise, the operation of acquiring one geographical position information in the geographical position information set as the current position information is returned to be executed until all the geographical position information is processed.
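Operations S241 to S249 can be sketched as the pruned tree traversal below. This is a minimal illustration, not the implementation of the disclosure: it assumes planar coordinates with Euclidean distance instead of latitude and longitude, uses an explicit stack in place of "processed" marks (a subtree that is never pushed is equivalent to a subtree marked as processed), and the names `Node`, `density` and `all_densities` are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    center: tuple   # central position (x, y) of the node's area
    error: float    # position error value of the node
    count: int      # number of points associated with the node
    children: list = field(default_factory=list)


def density(point, root, d):
    """Density value of one point for the set density distance threshold d."""
    total = 0                       # S241: initial density value is 0
    stack = [root]
    while stack:                    # S248: loop until no unprocessed node is left
        node = stack.pop()          # S242: next comparison node
        dist = math.dist(point, node.center)
        if dist <= d - node.error:  # S243/S244: whole subtree lies within d
            total += node.count
        elif dist >= d + node.error:  # S245/S246: whole subtree lies outside d
            continue
        else:                       # S247: undecided, examine the child nodes
            stack.extend(node.children)
    return total


def all_densities(points, root, d):
    """S249: repeat the traversal for every point in the set."""
    return [density(p, root, d) for p in points]
```

As in the described operations, a leaf node that falls in the undecided band contributes nothing, so the result is approximate at the resolution of the deepest tree level.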
S250, determining at least one cluster center in the geographic position information set according to the density value.
Optionally, in an embodiment, as shown in fig. 2a, S250 may include the following operations:
S251, calculating the cluster-like distance of each piece of geographic position information according to the density value corresponding to each piece of geographic position information.
The cluster-like distance of a piece of geographic location information may be its distance to another piece of geographic location information having a larger density value.
In this embodiment of the disclosure, when the cluster center is determined according to the density values corresponding to the geographic location information, the cluster-like distance of each piece of geographic location information may first be calculated according to those density values.
In an optional embodiment of the present disclosure, calculating the cluster-like distance of each piece of geographic location information according to the density value corresponding to each piece of geographic location information may include: sorting the geographic location information by density value according to a set rule; sequentially calculating, according to the sorting result, the distance value between each piece of geographic location information and each piece of geographic location information ranked before it, as cluster-like distances to be screened; and taking the cluster-like distance to be screened that satisfies the cluster-like distance determination condition as the cluster-like distance of the geographic location information.
The set rule may be descending order of density value. The cluster-like distance determination condition may be that the value of the cluster-like distance to be screened is the minimum.
Specifically, when the cluster-like distance of each piece of geographic location information is calculated, the geographic location information may be sorted in descending order of the corresponding density values. The distance value between each piece of geographic location information and each piece ranked before it is then calculated in turn according to the sorting result, and the minimum cluster-like distance to be screened is taken as the cluster-like distance of that geographic location information. Meanwhile, the geographic location information with which this minimum distance is obtained may be taken as its parent node.
For example, assume that the geographic location information set includes 5 pieces of geographic location information, and that sorting them in descending order of density value gives [5,4,3,2,1], where each number represents the point data corresponding to one piece of geographic location information. When calculating the cluster-like distance of the geographic location information corresponding to the numeral 3, its distances to the geographic location information corresponding to the numerals 5 and 4 may be calculated in turn. Assuming that its distance to the geographic location information corresponding to the numeral 5 is 200 and its distance to the geographic location information corresponding to the numeral 4 is 100, 100 is taken as the cluster-like distance of the geographic location information corresponding to the numeral 3, and the point data corresponding to the numeral 4 may be taken as the parent node of the point data corresponding to the numeral 3.
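The calculation of S251 can be sketched as follows. The sketch assumes planar coordinates with Euclidean distance, and it assigns an infinite cluster-like distance and no parent to the point with the largest density value, a case the description above does not specify. The name `cluster_distances` is hypothetical.

```python
import math


def cluster_distances(points, densities):
    """Return (distance, parent) dicts keyed by point index.

    For each point, the cluster-like distance is the minimum distance to any
    point ranked before it when sorting by density value in descending order.
    """
    order = sorted(range(len(points)), key=lambda i: densities[i], reverse=True)
    distance, parent = {}, {}
    for rank, i in enumerate(order):
        best, best_j = math.inf, None   # assumed values for the top-ranked point
        for j in order[:rank]:          # only points ranked before point i
            dij = math.dist(points[i], points[j])
            if dij < best:
                best, best_j = dij, j
        distance[i] = best
        parent[i] = best_j
    return distance, parent


# Three points placed so that the third is 200 from the first and 100 from the
# second, matching the distances used in the example above.
points = [(0.0, 0.0), (300.0, 0.0), (200.0, 0.0)]
densities = [5, 4, 3]
distance, parent = cluster_distances(points, densities)
```

With these inputs the third point obtains a cluster-like distance of 100 with the second point as its parent node, as in the worked example.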
S252, calculating the cluster-like weight of each piece of geographic position information according to the density value and the cluster-like distance respectively corresponding to each piece of geographic position information.
In the embodiment of the present disclosure, after the cluster-like distance of each geographic location information is obtained, the cluster-like weight of each geographic location information may be calculated according to the density value and the cluster-like distance respectively corresponding to each geographic location information. Optionally, the product of the density value corresponding to the geographic location information and the cluster-like distance may be used as the cluster-like weight of the geographic location information. That is, the larger the density value and the cluster-like distance are, the larger the corresponding cluster-like weight is.
S253, determining at least one cluster center in the geographic location information set according to the cluster-like weight.
Finally, at least one cluster center is determined according to the cluster-like weight of each piece of geographic location information. Specifically, the geographic location information may be traversed in descending order of cluster-like weight, and the density value and the cluster-like distance of each piece examined in turn. When the density value of the geographic location information is greater than or equal to a first set value and the cluster-like distance is greater than or equal to a second set value, the geographic location information can be determined as a cluster center. Optionally, the first set value may be set to 2, and the second set value may be set to 500.
Correspondingly, after the cluster centers are determined, the cluster to which each piece of geographic location information belongs can be determined. Specifically, if a piece of geographic location information is a cluster center, it belongs to its own cluster; otherwise, it belongs to the same cluster as its parent node. In order to avoid the influence of discrete geographic location information on the clusters, the cluster-like distance of a piece of geographic location information that joins the cluster of its parent node may be limited to be smaller than the set density distance threshold.
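S252, S253 and the cluster attribution step can be sketched as follows. The inputs are per-point lists of density values, cluster-like distances and parent indices. The function names and the `max_link` parameter (standing in for the limit on the cluster-like distance of non-center points) are hypothetical, and density values are assumed to be positive so that an infinite cluster-like distance still yields a well-defined weight.

```python
import math


def pick_centers(densities, distances, min_density=2, min_distance=500):
    """Cluster centers, visited in descending order of weight = density * distance."""
    weights = [rho * delta for rho, delta in zip(densities, distances)]
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return [i for i in order
            if densities[i] >= min_density and distances[i] >= min_distance]


def assign_clusters(densities, distances, parents, centers, max_link=None):
    """Label each point with the index of its cluster center, or None."""
    center_set = set(centers)
    label = {}
    # Visit points in descending density so a parent is labelled before its children.
    for i in sorted(range(len(densities)), key=lambda k: densities[k], reverse=True):
        if i in center_set:
            label[i] = i                      # a center belongs to its own cluster
        elif parents[i] is None:
            label[i] = None                   # no parent and not a center
        elif max_link is not None and distances[i] >= max_link:
            label[i] = None                   # treated as a discrete point
        else:
            label[i] = label.get(parents[i])  # inherit the parent's cluster
    return label


densities = [5, 4, 3, 1]
distances = [math.inf, 600.0, 100.0, 50.0]
parents = [None, 0, 1, 2]
centers = pick_centers(densities, distances)
labels = assign_clusters(densities, distances, parents, centers)
```

The default thresholds 2 and 500 are the optional first and second set values mentioned above.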
In an optional embodiment of the present disclosure, the set of geographical location information to be processed is trajectory data of the target user within a set time interval; after determining at least one cluster center in the set of geographic location information according to the density value, the method may further include: taking the at least one cluster center as a resident point of the target user.
The setting time interval can be set according to actual requirements, such as one month or two months. The trajectory data may be latitude and longitude information data of the user.
The method for determining the cluster center provided by the embodiment of the disclosure can be applied to the field of user resident point mining. When mining user resident points, the trajectory data reported by a target user within a period of time (e.g., 60 days) is usually collected, and, based on the locations that the user frequently visits, the cluster centers determined by the method provided by the embodiment of the present disclosure are used as the resident points of the target user.
By adopting the technical scheme, the dictionary tree is constructed through the converted one-dimensional position coding information, the density values corresponding to the geographic position information are calculated according to the dictionary tree, and at least one cluster center is further determined according to the density values, so that a large amount of unnecessary calculation amount can be reduced, and the calculation complexity of the cluster center in a clustering algorithm is reduced.
Example three
Fig. 3 is a schematic diagram of an apparatus for determining a cluster center according to a third embodiment of the present disclosure. As shown in fig. 3, the apparatus includes: an information conversion module 310, a dictionary tree generation module 320, a density value calculation module 330, and a cluster center determination module 340, wherein:
an information conversion module 310, configured to convert, by using a geographic position coding technique, each two-dimensional geographic position information in a geographic position information set to be processed into one-dimensional position coding information;
the dictionary tree generating module 320 is configured to generate a dictionary tree according to each of the position coding information, where one tree node in the dictionary tree corresponds to a set geographic location area, and a geographic location area corresponding to one child node belongs to a geographic location area range corresponding to a parent node of the child node;
a density value calculating module 330, configured to calculate, according to a location relationship between each tree node in the dictionary tree and each piece of geographic location information, and a quantity value of the piece of geographic location information associated with each tree node, a density value corresponding to each piece of geographic location information;
a cluster center determining module 340, configured to determine at least one cluster center in the geographic location information set according to the density value.
The embodiment of the disclosure converts each two-dimensional geographical position information in the geographical position information set to be processed into one-dimensional position coding information by adopting a geographical position coding technology to generate a dictionary tree, calculates density values respectively corresponding to each geographical position information according to the position relation between each tree node and each geographical position information in the dictionary tree and the quantity value of the geographical position information associated with each tree node, and finally determines at least one cluster center in the geographical position information set according to the density values.
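The way the four modules feed one another can be sketched as a simple composition. This is only an architectural illustration: each module is modelled as a plain callable supplied by the caller, and the class name `ClusterCenterDevice`, its field names and the `run` method are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ClusterCenterDevice:
    convert: Callable          # information conversion module 310
    build_tree: Callable       # dictionary tree generation module 320
    compute_density: Callable  # density value calculation module 330
    pick_centers: Callable     # cluster center determination module 340

    def run(self, points):
        """Chain the modules: points -> codes -> tree -> densities -> centers."""
        codes = [self.convert(p) for p in points]
        tree = self.build_tree(codes)
        densities = self.compute_density(tree, points)
        return self.pick_centers(points, densities)
```

Keeping each module behind a callable lets the coding scheme or the density rule be replaced without changing the rest of the pipeline.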
Optionally, the density value calculating module 330 includes: the central geographical position information calculation unit is used for calculating a position error value and central geographical position information which respectively correspond to each tree node according to a geographical position area corresponding to each tree node in the dictionary tree;
and the density value calculation unit is used for calculating a numerical relationship between the difference value between the geographical position information and the central geographical position information of the tree nodes in the dictionary tree and the position error value of the tree nodes, determining a traversal form of the dictionary tree and an updating mode of the density value of the geographical position information according to the numerical relationship until the dictionary tree is traversed so as to obtain the density value corresponding to each geographical position information.
Optionally, the density value calculating unit is specifically configured to obtain one geographic location information in the geographic location information set as current location information, and set an initial density value of the current location information;
sequentially acquiring an unprocessed tree node in the dictionary tree as a current comparison node according to the top-down sequence, and calculating a distance value between the current position information and the central geographical position information of the current comparison node;
if the distance value is smaller than or equal to a first threshold value, updating the density value of the current position information into the accumulated sum of the quantity values of the geographic position information associated with the current comparison node, and marking the current comparison node and all sub-nodes corresponding to the current comparison node as processed nodes; the first threshold is a difference value between a set density distance threshold and a position error value of the current comparison node;
if the distance value is larger than or equal to a second threshold value, keeping the density value of the current position information unchanged, and marking the current comparison node and all sub-nodes corresponding to the current comparison node as processed nodes; the second threshold is a sum of the set density distance threshold and the position error value of the current comparison node;
if the distance value is larger than the first threshold and smaller than the second threshold, keeping the density value of the current position information unchanged, and marking the current comparison node as a processed node;
returning to execute the operation of sequentially acquiring an unprocessed tree node in the dictionary tree as a current comparison node according to the top-down sequence until the processing of all the tree nodes in the dictionary tree is completed to obtain a density value corresponding to the current position information;
and returning to execute the operation of acquiring one geographical position information in the geographical position information set as the current position information until the processing of all the geographical position information is completed.
Optionally, the central geographic position information calculating unit is specifically configured to calculate, according to a height value between an upper boundary and a lower boundary of a geographic position region corresponding to each tree node in the dictionary tree, a position error value corresponding to each tree node;
calculating central geographical position information corresponding to each tree node according to the average value of the geographical position information of the geographical position area corresponding to each tree node in the dictionary tree; the geographic position information comprises longitude and latitude information.
Optionally, the cluster center determining module 340 includes: the cluster-like distance calculating unit is used for calculating the cluster-like distance of each geographic position information according to the density value corresponding to each geographic position information;
the cluster weight calculation unit is used for calculating the cluster weight of each geographic position information according to the density value and the cluster distance which are respectively corresponding to each geographic position information;
and the cluster center determining unit is used for determining at least one cluster center in the geographical position information set according to the cluster weight.
Optionally, the cluster-like distance calculating unit is specifically configured to sort the geographic location information according to a set rule according to the density value;
sequentially calculating a distance value between each piece of geographical position information and the geographical position information which is ranked in front according to the ranking result to serve as the cluster distance to be screened;
and taking the cluster distance to be screened which meets the cluster distance judgment condition as the cluster distance of the geographic position information.
Optionally, the geographical location information set to be processed is track data of the target user within a set time interval; the device further comprises: a resident point determining module, used for taking the at least one cluster center as a resident point of the target user.
The determination device for the cluster center can execute the determination method for the cluster center provided by any embodiment of the disclosure, and has corresponding functional modules and beneficial effects of the execution method. For the technical details not described in detail in this embodiment, reference may be made to the method for determining the cluster center provided in any embodiment of the present disclosure.
Example four
Fig. 4 is a schematic diagram illustrating a hardware structure of a computer device according to a fourth embodiment of the present disclosure. The computer device may be implemented in various forms, and the computer device in the present disclosure may include, but is not limited to, mobile computer devices such as a mobile phone, a smart phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a navigation apparatus, a vehicle terminal device, a vehicle display terminal, a vehicle electronic rear view mirror, and the like, and stationary computer devices such as a digital TV, a desktop computer, and the like.
As shown in fig. 4, the computer device 0 may include a wireless communication unit 41, an a/V (audio/video) input unit 42, a user input unit 43, a sensing unit 44, an output unit 45, a memory 46, an interface unit 47, a processor 48, a power supply unit 49, and the like. Fig. 4 shows computer device 0 having various components, but it is understood that not all of the illustrated components are required to be implemented. More or fewer components may alternatively be implemented.
Among them, the wireless communication unit 41 allows radio communication between the computer device 0 and a wireless communication system or a network. The a/V input unit 42 is for receiving an audio or video signal. The user input unit 43 may generate key input data to control various operations of the computer device 0 according to commands input by the user. The sensing unit 44 detects a current state of the computer device 0, a position of the computer device 0, presence or absence of a touch input by a user to the computer device 0, an orientation of the computer device 0, an acceleration or deceleration movement and direction of the computer device 0, and the like, and generates a command or signal for controlling an operation of the computer device 0. The interface unit 47 serves as an interface through which at least one external device can be connected to the computer apparatus 0. The output unit 45 is configured to provide output signals in a visual, audio and/or tactile manner. The memory 46 may store software programs or the like for processing and controlling operations performed by the processor 48, or may temporarily store data that has been or will be output. The memory 46 may include at least one type of storage medium. Also, the computer apparatus 0 may cooperate with a network storage device that performs a storage function of the memory 46 through a network connection. Processor 48 generally controls the overall operation of computer device 0. In addition, the processor 48 may include a multimedia module for reproducing or playing back multimedia data. The processor 48 may perform a pattern recognition process to recognize a handwriting input or a picture drawing input performed on the touch screen as a character or an image. The power supply unit 49 receives external power or internal power and supplies appropriate power required to operate the respective elements and components under the control of the processor 48.
The processor 48 executes programs stored in the memory 46 to execute various functional applications and data processing, for example, to implement the method for determining a cluster center provided by the embodiment of the present disclosure, including: converting each two-dimensional geographical position information in the geographical position information set to be processed into one-dimensional position coding information by adopting a geographical position coding technology; generating a dictionary tree according to the position coding information, wherein one tree node in the dictionary tree corresponds to a set geographical position area, and the geographical position area corresponding to one child node belongs to the geographical position area range corresponding to the parent node of the child node; calculating density values respectively corresponding to the geographical position information according to the position relation between each tree node in the dictionary tree and the geographical position information and the quantity value of the geographical position information associated with each tree node; and determining at least one cluster center in the set of geographic location information based on the density value.
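Example five
A fifth embodiment of the present disclosure further provides a computer storage medium storing a computer program, where the computer program, when executed by a computer processor, is used to execute the method for determining a cluster center according to any one of the above embodiments of the present disclosure: converting each two-dimensional geographical position information in the geographical position information set to be processed into one-dimensional position coding information by adopting a geographical position coding technology; generating a dictionary tree according to the position coding information, wherein one tree node in the dictionary tree corresponds to a set geographical position area, and the geographical position area corresponding to one child node belongs to the geographical position area range corresponding to the parent node of the child node; calculating density values respectively corresponding to the geographical position information according to the position relation between each tree node in the dictionary tree and the geographical position information and the quantity value of the geographical position information associated with each tree node; and determining at least one cluster center in the geographical position information set according to the density value.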
The computer storage media of the disclosed embodiments may take any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM or flash memory), an optical fiber, a portable compact disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, Radio Frequency (RF), etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including object oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present disclosure and the technical principles employed. Those skilled in the art will appreciate that the present disclosure is not limited to the particular embodiments described herein, and that various obvious changes, adaptations, and substitutions are possible, without departing from the scope of the present disclosure. Therefore, although the present disclosure has been described in greater detail with reference to the above embodiments, the present disclosure is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present disclosure, the scope of which is determined by the scope of the appended claims.

Claims (9)

1. A method for determining a cluster center, comprising:
converting each two-dimensional geographical position information in the geographical position information set to be processed into one-dimensional position coding information by adopting a geographical position coding technology;
generating a dictionary tree according to the position coding information, wherein one tree node in the dictionary tree corresponds to a set geographical position area, and the geographical position area corresponding to one child node belongs to the geographical position area range corresponding to the parent node of the child node;
calculating density values respectively corresponding to the geographical position information according to the position relation between each tree node in the dictionary tree and the geographical position information and the quantity value of the geographical position information associated with each tree node;
determining at least one cluster center in the geographical position information set according to the density values;
wherein calculating the density values respectively corresponding to the geographical position information, according to the position relationship between each tree node in the dictionary tree and each piece of geographical position information and the quantity value of the geographical position information associated with each tree node, comprises:
calculating a position error value and central geographical position information respectively corresponding to each tree node according to the geographical position area corresponding to each tree node in the dictionary tree;
and calculating a numerical relationship between the difference value between the geographical position information and the central geographical position information of the tree nodes in the dictionary tree and the position error value of the tree nodes, and determining a traversal form of the dictionary tree and an updating mode of the density value of the geographical position information according to the numerical relationship until the dictionary tree is traversed to obtain the density value corresponding to each geographical position information.
2. The method of claim 1, wherein calculating a numerical relationship between a difference between the geographic location information and the central geographic location information of the tree nodes in the trie and the location error value of the tree node, and determining a traversal form of the trie and an update manner of the density value of the geographic location information according to the numerical relationship until traversing the trie, comprises:
acquiring one geographical position information in the geographical position information set as current position information, and setting an initial density value of the current position information;
sequentially acquiring an unprocessed tree node in the dictionary tree as a current comparison node according to the top-down sequence, and calculating a distance value between the current position information and the central geographical position information of the current comparison node;
if the distance value is smaller than or equal to a first threshold value, updating the density value of the current position information into the accumulated sum of the quantity values of the geographic position information associated with the current comparison node, and marking the current comparison node and all sub-nodes corresponding to the current comparison node as processed nodes; the first threshold is a difference value between a set density distance threshold and a position error value of the current comparison node;
if the distance value is larger than or equal to a second threshold value, keeping the density value of the current position information unchanged, and marking the current comparison node and all sub-nodes corresponding to the current comparison node as processed nodes; the second threshold is a sum of the set density distance threshold and the position error value of the current comparison node;
if the distance value is larger than the first threshold and smaller than the second threshold, keeping the density value of the current position information unchanged, and marking the current comparison node as a processed node;
returning to execute the operation of sequentially acquiring an unprocessed tree node in the dictionary tree as a current comparison node according to the top-down sequence until the processing of all the tree nodes in the dictionary tree is completed to obtain a density value corresponding to the current position information;
and returning to execute the operation of acquiring one geographical position information in the geographical position information set as the current position information until the processing of all the geographical position information is completed.
3. The method of claim 1, wherein calculating a location error value and a central geographic location information corresponding to each of the tree nodes based on the geographic location area corresponding to each of the tree nodes in the dictionary tree comprises:
calculating position error values respectively corresponding to the tree nodes according to height values between upper boundaries and lower boundaries of the geographic position areas corresponding to the tree nodes in the dictionary tree;
calculating central geographical position information corresponding to each tree node according to the average value of the geographical position information of the geographical position area corresponding to each tree node in the dictionary tree; the geographic position information comprises longitude and latitude information.
4. The method of claim 1, wherein determining at least one cluster center in the set of geographic location information based on the density value comprises:
calculating a cluster distance of each geographic position information according to the density value corresponding to each geographic position information;
calculating a cluster weight of each geographic position information according to the density value and the cluster distance respectively corresponding to each geographic position information;
and determining at least one cluster center in the geographic position information set according to the cluster weight.
5. The method of claim 4, wherein calculating the cluster distance of each geographic location information according to the density value corresponding to each geographic location information comprises:
sorting the geographic position information by density value according to a set rule;
sequentially calculating, according to the sorting result, a distance value between each piece of geographical position information and the geographical position information ranked ahead of it, to serve as the cluster distance to be screened;
and taking the cluster distance to be screened which meets the cluster distance judgment condition as the cluster distance of the geographic position information.
6. The method according to any one of claims 1 to 5, wherein the set of geographical location information to be processed is trajectory data of a target user within a set time interval;
after determining at least one cluster center in the set of geographic location information according to the density value, the method further comprises:
and taking the at least one cluster center as a resident point of the target user.
7. An apparatus for determining a cluster center, comprising:
the information conversion module is used for converting each two-dimensional geographic position information in the geographic position information set to be processed into one-dimensional position coding information by adopting a geographic position coding technology;
the dictionary tree generating module is used for generating a dictionary tree according to the position coding information, wherein one tree node in the dictionary tree corresponds to a set geographical position area, and the geographical position area corresponding to one child node belongs to the geographical position area range corresponding to the parent node of the child node;
the density value calculation module is used for calculating density values corresponding to the geographical position information according to the position relation between the geographical position information and each tree node in the dictionary tree and the quantity value of the geographical position information associated with each tree node;
a cluster center determination module, configured to determine at least one cluster center in the geographic location information set according to the density value;
wherein the density value calculation module comprises:
the central geographical position information calculation unit is used for calculating a position error value and central geographical position information which respectively correspond to each tree node according to a geographical position area corresponding to each tree node in the dictionary tree;
and the density value calculation unit is used for calculating a numerical relationship between the difference value between the geographical position information and the central geographical position information of the tree nodes in the dictionary tree and the position error value of the tree nodes, determining a traversal form of the dictionary tree and an updating mode of the density value of the geographical position information according to the numerical relationship until the dictionary tree is traversed so as to obtain the density value corresponding to each geographical position information.
8. A computer device, the device comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method for determining a cluster center as recited in any one of claims 1-6.
9. A computer storage medium on which a computer program is stored which, when being executed by a processor, carries out the method for cluster center determination according to any one of claims 1 to 6.
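For illustration only, the density computation recited in claims 1 to 3 can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: Geohash is assumed as the geographical position coding technology, the position error of a tree node is taken as the largest distance from the cell center to a cell corner (the claims only state that it is derived from the cell height), the central geographical position is taken as the cell midpoint, and all names (`geohash_encode`, `geohash_bounds`, `build_trie`, `density_value`) are hypothetical.

```python
import math
from dataclasses import dataclass, field

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet
EARTH_RADIUS_M = 6371000.0

def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))

def geohash_encode(lat, lon, precision=7):
    """Encode a 2-D position as a 1-D geohash string (5 bits per character)."""
    lat_rng, lon_rng = [-90.0, 90.0], [-180.0, 180.0]
    chars, bits, nbits, even = [], 0, 0, True
    while len(chars) < precision:
        rng, val = (lon_rng, lon) if even else (lat_rng, lat)
        mid = (rng[0] + rng[1]) / 2
        bit = 1 if val >= mid else 0
        rng[1 - bit] = mid  # bit 1 raises the lower bound, bit 0 lowers the upper bound
        bits, nbits, even = (bits << 1) | bit, nbits + 1, not even
        if nbits == 5:
            chars.append(BASE32[bits])
            bits, nbits = 0, 0
    return "".join(chars)

def geohash_bounds(code):
    """Return (south, north, west, east) of the cell addressed by a geohash prefix."""
    lat_rng, lon_rng = [-90.0, 90.0], [-180.0, 180.0]
    even = True
    for ch in code:
        idx = BASE32.index(ch)
        for shift in range(4, -1, -1):
            rng = lon_rng if even else lat_rng
            bit = (idx >> shift) & 1
            rng[1 - bit] = (rng[0] + rng[1]) / 2
            even = not even
    return lat_rng[0], lat_rng[1], lon_rng[0], lon_rng[1]

@dataclass
class Node:
    center: tuple = (0.0, 0.0)   # central geographical position of the cell
    error: float = math.inf      # assumed position error: max center-to-corner distance
    count: int = 0               # number of positions associated with this node
    children: dict = field(default_factory=dict)

def build_trie(points, precision=7):
    """Insert every geohash into a trie; a child cell always lies inside its parent cell."""
    root = Node()
    for lat, lon in points:
        code, node = geohash_encode(lat, lon, precision), root
        for i, ch in enumerate(code, start=1):
            child = node.children.get(ch)
            if child is None:
                s, n, w, e = geohash_bounds(code[:i])
                center = ((s + n) / 2, (w + e) / 2)
                corners = ((s, w), (s, e), (n, w), (n, e))
                error = max(haversine_m(center, c) for c in corners)
                child = node.children[ch] = Node(center, error)
            child.count += 1
            node = child
    return root

def density_value(point, root, eps_m):
    """Top-down traversal with the three threshold cases of claim 2."""
    total, stack = 0, list(root.children.values())  # the root spans the globe, start below it
    while stack:
        node = stack.pop()
        d = haversine_m(point, node.center)
        if d <= eps_m - node.error:    # first threshold: add the count, skip the subtree
            total += node.count
        elif d < eps_m + node.error:   # between thresholds: only this node is done, visit children
            stack.extend(node.children.values())
        # otherwise d >= second threshold: node and all descendants are skipped
    return total
```

In this sketch, not descending into a subtree plays the role of marking a node and all of its child nodes as processed. As in claim 2, a leaf cell that falls between the two thresholds contributes nothing, so the result approximates an exact neighbor count within `eps_m`.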
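The selection of cluster centers recited in claims 4 and 5 can likewise be sketched. The claims leave the sorting rule, the cluster distance judgment condition, and the weight formula open, so this sketch borrows the conventions of density-peaks clustering as assumptions: positions are sorted by descending density, the cluster distance is the minimum distance to any position ranked ahead, the top-ranked position receives its maximum distance to any other position, and the cluster weight is the product of density and cluster distance. The function name `pick_cluster_centers` and the parameter `k` are hypothetical.

```python
import math

def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(min(1.0, math.sqrt(a)))

def pick_cluster_centers(points, densities, k=1):
    """Return the k positions with the largest assumed weight density * cluster distance."""
    n = len(points)
    # Assumed sorting rule: descending density (stable, so ties keep input order).
    order = sorted(range(n), key=lambda i: -densities[i])
    cluster_dist = [0.0] * n
    for rank, i in enumerate(order):
        ahead = order[:rank]  # positions ranked ahead of position i
        if ahead:
            # Assumed judgment condition: keep the smallest candidate distance.
            cluster_dist[i] = min(haversine_m(points[i], points[j]) for j in ahead)
    if order:
        # The top-ranked position has nothing ahead of it; use its largest distance instead.
        top = order[0]
        cluster_dist[top] = max(haversine_m(points[top], p) for p in points)
    # Assumed cluster weight: product of density value and cluster distance.
    weights = [densities[i] * cluster_dist[i] for i in range(n)]
    best = sorted(range(n), key=lambda i: -weights[i])[:k]
    return [points[i] for i in best]

# Hypothetical trajectory: three fixes in one area, two in another, one isolated fix.
track = [(39.9042, 116.4074), (39.9050, 116.4080), (39.9035, 116.4069),
         (31.2304, 121.4737), (31.2310, 121.4740), (23.1291, 113.2644)]
centers = pick_cluster_centers(track, densities=[3, 3, 3, 2, 2, 1], k=2)
```

With trajectory data of one user as input, the returned positions correspond to the resident points of claim 6. The isolated fix is far from everything else but has low density, so its weight stays below those of the two dense areas.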
CN201811246206.7A 2018-10-24 2018-10-24 Method and device for determining cluster center, computer equipment and storage medium Active CN109299747B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811246206.7A CN109299747B (en) 2018-10-24 2018-10-24 Method and device for determining cluster center, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN109299747A (en) 2019-02-01
CN109299747B (en) 2020-12-15

Family

ID=65157705

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811246206.7A Active CN109299747B (en) 2018-10-24 2018-10-24 Method and device for determining cluster center, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN109299747B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110287426B (en) * 2019-05-23 2021-12-31 北京百度网讯科技有限公司 Method and device for establishing parent-child relationship of interest points, storage medium and processor
CN112330332B (en) * 2021-01-05 2021-05-07 南京智闪萤科技有限公司 Methods, computing devices, and media for identifying fraud risk with respect to node tasks

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103744861A (en) * 2013-12-12 2014-04-23 深圳先进技术研究院 Lookup method and device for frequency sub-trajectories in trajectory data
CN107330466A (en) * 2017-06-30 2017-11-07 上海连尚网络科技有限公司 Very fast geographical GeoHash clustering methods
CN107547633A (en) * 2017-07-27 2018-01-05 腾讯科技(深圳)有限公司 Processing method, device and the storage medium of a kind of resident point of user
CN108011987A (en) * 2017-10-11 2018-05-08 北京三快在线科技有限公司 IP address localization method and device, electronic equipment and storage medium
CN108304502A (en) * 2018-01-17 2018-07-20 中国科学院自动化研究所 Quick hot spot detecting method and system based on magnanimity news data
CN108536695A (en) * 2017-03-02 2018-09-14 北京嘀嘀无限科技发展有限公司 A kind of polymerization and device of geographical location information point

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9754328B2 (en) * 2013-08-08 2017-09-05 Academia Sinica Social activity planning system and method
US9710485B2 (en) * 2014-03-14 2017-07-18 Twitter, Inc. Density-based dynamic geohash
CN104199860B (en) * 2014-08-15 2017-05-10 浙江大学 Dataset fragmentation method based on two-dimensional geographic position information
CN107273471A (en) * 2017-06-07 2017-10-20 国网上海市电力公司 A kind of binary electric power time series data index structuring method based on Geohash
CN107911293A (en) * 2017-10-31 2018-04-13 天津大学 A kind of flow route tree constructing method based on geographical location

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
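Research on Methods for Extracting and Analyzing Urban Hotspot Regions Based on Trajectory Clustering; Zhao Pengxiang; China Doctoral Dissertations Full-text Database, Basic Sciences; 20170315; pp. 60-62, Section 4.3 *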

Also Published As

Publication number Publication date
CN109299747A (en) 2019-02-01

Similar Documents

Publication Publication Date Title
CN107402955B (en) Method and apparatus for determining index grid of geo-fence
US8352483B1 (en) Scalable tree-based search of content descriptors
US20130054647A1 (en) Information processing apparatus, information processing method, and program
CN108701149A (en) A kind of intelligent recommendation method and terminal
CN110972261A (en) Base station fingerprint database establishing method, device, server and storage medium
CN107092623B (en) Interest point query method and device
CN109299747B (en) Method and device for determining cluster center, computer equipment and storage medium
US11860846B2 (en) Methods, systems and apparatus to improve spatial-temporal data management
US10306583B2 (en) Systems and methods to evaluate accuracy of locations of mobile devices
US20140370920A1 (en) Systems and methods for generating and employing an index associating geographic locations with geographic objects
US9111213B2 (en) Method for constructing a tree of linear classifiers to predict a quantitative variable
CN116958267A (en) Pose processing method and device, electronic equipment and storage medium
CN115204318B (en) Event automatic hierarchical classification method and electronic equipment
CN111125183A (en) Tuple measurement method and system based on CFI-Apriori algorithm in fog environment
CN111582456B (en) Method, apparatus, device and medium for generating network model information
EP4216073A1 (en) Data management method, data management apparatus, and storage medium
CN116227467A (en) Model training method, text processing method and device
CN112235723B (en) Positioning method, positioning device, electronic equipment and computer readable storage medium
CN110704679B (en) Video classification method and device and electronic equipment
CN114564516A (en) Business object classification method, device, equipment and storage medium
US20220138371A1 (en) Behavior model of photodetectors with a built-in lookup table
CN112118592A (en) Region generation method and device, electronic equipment and storage medium
CN113609631B (en) Event network topology diagram-based creation method and device and electronic equipment
CN111582482B (en) Method, apparatus, device and medium for generating network model information
CN111724002B (en) Method for predicting bus to be taken by user and reminding arrival information based on kNN

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant