CN106294485B - Method and device for determining salient locations - Google Patents
- Publication number
- CN106294485B, CN201510307160.5A
- Authority
- CN
- China
- Prior art keywords
- attribute
- significant place
- location
- salient
- place
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9537—Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
Abstract
The invention provides a method and a device for determining salient locations and relates to the technical field of the mobile Internet. It addresses the low accuracy with which the prior art identifies salient locations. The method comprises step 1, determining a salient-location determining attribute from a sample location trajectory, and step 2, determining the salient locations in a location trajectory to be processed according to that determining attribute. Step 1 specifically includes: determining the potential salient locations on the sample location trajectory; obtaining, for each potential salient location, the values of a first attribute set to produce an attribute information table; selecting the salient-location determining attribute from the first attribute set by applying a preset feature selection algorithm to the attribute information table; and obtaining the threshold range of the determining attribute. Step 2 specifically includes: determining at least one potential salient location on the location trajectory to be processed; and determining as salient locations those potential salient locations whose determining-attribute value falls within the threshold range.
Description
Technical field
The present invention relates to the technical field of the mobile Internet, and in particular to a method and a device for determining salient locations.
Background
With the spread of the mobile Internet, analyzing the behavior of mobile terminal users has become a research focus. Within user behavior analysis, location analysis is especially important: from a mobile terminal user's positions one can obtain the locations the user visits most often, that is, the salient locations, and then offer more intelligent services when the user is at one of them.
The prior art provides a method for determining salient locations. The method first preprocesses the acquired location trajectory to discard points that are obviously not salient locations, then applies a clustering algorithm to the remaining points and labels each of them as either a noise point or a salient location; the points labeled as salient locations are the result. This method may misjudge non-salient locations as salient ones, so its recognition accuracy is low.
Summary of the invention
The present invention provides a method and a device for determining salient locations, to solve the problem of low recognition accuracy for salient locations in the prior art.
To achieve the above object, the present invention adopts the following technical solutions:
In a first aspect, the present invention provides a method for determining salient locations, the method comprising: determining a salient-location determining attribute from a sample location trajectory, and determining the salient locations in a location trajectory to be processed according to the salient-location determining attribute;
wherein determining the salient-location determining attribute from the sample location trajectory specifically includes:
obtaining at least one potential salient location on the sample location trajectory through a preset clustering algorithm;
obtaining the values of a first attribute set for each potential salient location to produce an attribute information table, the first attribute set including a preset number of condition attributes and one decision attribute;
selecting, according to the attribute information table, the salient-location determining attribute from the first attribute set using a preset feature selection algorithm;
obtaining the threshold range of the salient-location determining attribute;
and wherein determining the salient locations in the location trajectory to be processed according to the salient-location determining attribute specifically includes:
obtaining at least one potential salient location on the location trajectory to be processed through the preset clustering algorithm;
determining as salient locations those potential salient locations whose value of the salient-location determining attribute is within the threshold range.
With reference to the first aspect, in a first implementation of the first aspect, selecting the salient-location determining attribute from the first attribute set using the preset feature selection algorithm according to the attribute information table specifically includes:
using the preset feature selection algorithm to calculate, for each condition attribute in the first attribute set, the value of the tolerance-capable P-norm;
determining the condition attribute that corresponds to the maximum of all calculated tolerance-capable P-norm values as the salient-location determining attribute.
With reference to the first implementation of the first aspect, in a second implementation of the first aspect, calculating the value of the tolerance-capable P-norm for each condition attribute in the first attribute set using the preset feature selection algorithm specifically includes:
taking a first condition attribute in the first attribute set as the attribute to be evaluated and determining the set formed by the remaining condition attributes as a second attribute set, the first condition attribute being any one of the condition attributes;
using a fuzzy-entropy-based user attribute selection game algorithm to judge in turn whether the attribute to be evaluated wins the cooperative (co-tenancy) game of each target subset of the second attribute set, obtaining a judgment result for each target subset, each target subset including at least two condition attributes;
calculating the value of the tolerance-capable P-norm of the attribute to be evaluated from all of the judgment results.
With reference to the first aspect, in a third implementation of the first aspect, the preset clustering algorithm has three parameters: a distance threshold, a point-count threshold and a time threshold;
obtaining at least one potential salient location on the sample location trajectory or on the location trajectory to be processed through the preset clustering algorithm specifically includes:
taking an unmarked location in the location trajectory as the starting point, where a mark is either "potential salient location" or "noise point";
finding the target points whose distance from the starting point is less than or equal to the distance threshold;
if the number of target points found is greater than or equal to the point-count threshold, and the time interval between each target point and the starting point is greater than the time threshold, determining the starting point and all of the target points as one cluster and marking the starting point as a potential salient location;
after all points in the cluster have been marked, marking the other unmarked locations in turn.
With reference to the first aspect or any one of its first, second and third implementations, in a fourth implementation of the first aspect, the first attribute set includes speed, acceleration, the number of records of the potential salient location in the sample location trajectory, the number of days on which the potential salient location appears in the sample location trajectory, the dwell time interval at the potential salient location, and the standard deviation of the bearing change.
In a second aspect, the present invention further provides a device for determining salient locations, the device comprising:
a processing module, configured to obtain at least one potential salient location on the sample location trajectory through a preset clustering algorithm;
an acquisition module, configured to obtain the values of a first attribute set for each potential salient location to produce an attribute information table, the first attribute set including a preset number of condition attributes and one decision attribute;
the processing module being further configured to select, according to the attribute information table, the salient-location determining attribute from the first attribute set using a preset feature selection algorithm;
the acquisition module being further configured to obtain the threshold range of the salient-location determining attribute;
the processing module being further configured to obtain at least one potential salient location on the location trajectory to be processed through the preset clustering algorithm;
the processing module being further configured to determine as salient locations those potential salient locations whose value of the salient-location determining attribute is within the threshold range.
With reference to the second aspect, in a first implementation of the second aspect, the acquisition module is specifically configured to use the preset feature selection algorithm to calculate, for each condition attribute in the first attribute set, the value of the tolerance-capable P-norm;
and to determine the condition attribute that corresponds to the maximum of all calculated tolerance-capable P-norm values as the salient-location determining attribute.
With reference to the first implementation of the second aspect, in a second implementation of the second aspect, the acquisition module is further specifically configured to:
take a first condition attribute in the first attribute set as the attribute to be evaluated and determine the set formed by the remaining condition attributes as a second attribute set, the first condition attribute being any one of the condition attributes;
use a fuzzy-entropy-based user attribute selection game algorithm to judge in turn whether the attribute to be evaluated wins the cooperative (co-tenancy) game of each target subset of the second attribute set, obtaining a judgment result for each target subset, each target subset including at least two condition attributes;
calculate the value of the tolerance-capable P-norm of the attribute to be evaluated from all of the judgment results.
With reference to the second aspect, in a third implementation of the second aspect, the preset clustering algorithm has three parameters: a distance threshold, a point-count threshold and a time threshold;
the processing module is specifically configured to:
take an unmarked location in the location trajectory as the starting point, where a mark is either "potential salient location" or "noise point";
find the target points whose distance from the starting point is less than or equal to the distance threshold;
if the number of target points found is greater than or equal to the point-count threshold, and the time interval between each target point and the starting point is greater than the time threshold, determine the starting point and all of the target points as one cluster and mark the starting point as a potential salient location;
after all points in the cluster have been marked, mark the other unmarked locations in turn.
With reference to the second aspect or any one of its first, second and third implementations, in a fourth implementation of the second aspect, the first attribute set obtained by the acquisition module includes speed, acceleration, the number of records of the potential salient location in the sample location trajectory, the number of days on which the potential salient location appears in the sample location trajectory, the dwell time interval at the potential salient location, and the standard deviation of the bearing change.
The method and device for determining salient locations provided by the embodiments of the present invention apply a clustering algorithm to the sample location trajectory to obtain potential salient locations, then use a feature selection algorithm to select the determining attribute of salient locations and its threshold range. The location trajectory to be processed is then handled in the same way, and those of its potential salient locations whose determining-attribute value falls within the threshold range are determined as salient locations. Whereas the prior art treats the points selected by the clustering algorithm directly as salient locations, the present invention applies a feature selection algorithm after clustering, which reduces the rate at which salient locations are misjudged and thereby improves the accuracy of determining them.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are merely some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a method for determining salient locations provided by an embodiment of the present invention;
FIG. 2 is a schematic flowchart of a method, provided by an embodiment of the present invention, for determining the potential salient locations on a location trajectory through a preset clustering algorithm;
FIG. 3 is a schematic flowchart of a method, provided by an embodiment of the present invention, for selecting the salient-location determining attribute from the attribute information table using a preset feature selection algorithm;
FIG. 4 is a schematic flowchart of a method, provided by an embodiment of the present invention, for calculating the value of the tolerance-capable P-norm of each condition attribute;
FIG. 5 is a schematic structural diagram of a device for determining salient locations provided by an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of another device for determining salient locations provided by an embodiment of the present invention.
Detailed description
The technical solutions of this embodiment are described clearly and completely below with reference to the drawings of this embodiment. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
An embodiment of the present invention provides a method for determining salient locations, the method comprising: determining a salient-location determining attribute from a sample location trajectory, and determining the salient locations in a location trajectory to be processed according to the salient-location determining attribute.
The sample location trajectory is a location trajectory for which it is already known which locations are salient. The location trajectory to be processed is one whose salient locations have not yet been determined. Both are location trajectories in the same sense: each contains the positions of the mobile terminal at successive points in time over a period, in other words the route it travelled. In general, existing mobile terminals have a GPS sensor that can receive and record the consecutive GPS latitude and longitude coordinates of the user's terminal, forming a GPS track; that track is the location trajectory referred to in the embodiments of the present invention.
"Determining a salient-location determining attribute from a sample location trajectory" specifically includes steps 101 to 104 below; "determining the salient locations in a location trajectory to be processed according to the salient-location determining attribute" specifically includes steps 105 and 106 below.
As shown in FIG. 1, the method for determining salient locations provided by this embodiment specifically includes:
101: Obtain at least one potential salient location on the sample location trajectory through a preset clustering algorithm.
The basic purpose of a clustering algorithm is to group points that satisfy specific conditions into one class. Common clustering algorithms include Density-Based Spatial Clustering of Applications with Noise (DBSCAN).
The points on a location trajectory can be divided, by their probability of being salient locations, into noise points and potential salient locations; a potential salient location is a point with a relatively high probability of being a salient location.
102: Obtain the values of the first attribute set for each potential salient location to produce an attribute information table, the first attribute set including a preset number of condition attributes and one decision attribute.
The values of the condition attributes and the decision attribute in the first attribute set can either be read directly from the sample location trajectory or be derived from it by some computation. For example, when the number of records of a potential salient location in the location trajectory is a condition attribute, it can be obtained by counting how many times that location occurs. As another example, when speed is a condition attribute, then once the potential salient locations have been determined, the speed at a given potential salient location can be obtained by taking a very short path segment that contains the location together with the time spent on that segment, applying the speed formula to get the mobile terminal's speed over that short interval, and using the result as the speed of the potential salient location.
As an optional implementation, the first attribute set may include six condition attributes: speed, acceleration, the number of records of the potential salient location in the sample location trajectory, the number of days on which the potential salient location appears in the sample location trajectory, the dwell time interval at the potential salient location, and the standard deviation of the bearing change. The decision attribute indicates whether the potential salient location is in fact a salient location.
Taking as an example a sample location trajectory that contains 10 potential salient locations and a first attribute set that contains the 6 condition attributes above and 1 decision attribute, Table 1 below gives the attribute information table for the potential salient locations.
Table 1: Attribute information table
In the column indicating whether a location is salient, "1" means the potential salient location is a salient location and "0" means it is not.
Table 1 only illustrates the form of the attribute information table; the value of each condition attribute is obtained according to the actual situation.
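The speed example in step 102 can be sketched as follows. This is a minimal illustration rather than the patent's implementation: it assumes trajectory points are `(t_seconds, x_m, y_m)` tuples in planar coordinates, and the function name `segment_speed` and the `half_window` parameter are hypothetical.

```python
from math import hypot


def segment_speed(points, i, half_window=1):
    """Average speed (m/s) over a short path segment containing points[i].

    points: list of (t_seconds, x_m, y_m) tuples in recording order
            (an assumed representation of the location trajectory).
    half_window: how many neighbouring samples to include on each side
                 (a hypothetical parameter for "a very short segment").
    """
    lo = max(0, i - half_window)
    hi = min(len(points) - 1, i + half_window)
    # Path length of the short segment: sum of consecutive straight-line hops.
    length = sum(
        hypot(points[k + 1][1] - points[k][1], points[k + 1][2] - points[k][2])
        for k in range(lo, hi)
    )
    elapsed = points[hi][0] - points[lo][0]
    # Speed formula: distance divided by time; guard against a zero interval.
    return length / elapsed if elapsed > 0 else 0.0


track = [(0, 0.0, 0.0), (10, 30.0, 40.0), (20, 60.0, 80.0)]
speed_at_middle = segment_speed(track, 1)
```

The other condition attributes (record count, number of days, dwell time interval, and so on) would each get a similar helper, and one row of the attribute information table would collect their values together with the 0/1 decision attribute.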
103: According to the attribute information table, select the salient-location determining attribute from the first attribute set using a preset feature selection algorithm.
The preset feature selection algorithm may be the tolerance-capable P-norm.
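As stated in the first implementation of the first aspect, the condition attribute with the largest tolerance-capable P-norm value becomes the determining attribute. The selection rule alone can be sketched as below; how the P-norm values are computed is not shown, and both the function name and the scores are hypothetical placeholders, not figures from the patent.

```python
def pick_determining_attribute(p_norm_by_attribute):
    """Return the condition attribute whose score is the largest.

    p_norm_by_attribute: dict mapping attribute name to its
    tolerance-capable P-norm value (assumed to be computed elsewhere).
    """
    if not p_norm_by_attribute:
        raise ValueError("no condition attributes were scored")
    return max(p_norm_by_attribute, key=p_norm_by_attribute.get)


# Placeholder scores for illustration only; they are not results from the patent.
example_scores = {"speed": 0.62, "acceleration": 0.31, "record_count": 0.48}
best = pick_determining_attribute(example_scores)
```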
104: Obtain the threshold range of the salient-location determining attribute.
In one specific implementation of this step, the values of the determining attribute are obtained for multiple locations (both non-salient and salient); the interval of attribute values that contains the largest number of salient locations is determined as the threshold range of the determining attribute. Taking the condition attribute speed as the determining attribute, for example, the procedure is:
obtain the speed values of multiple locations on the sample location trajectory;
determine the interval of speed values that contains the largest number of salient locations as the threshold range of the condition attribute speed.
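One way to read step 104 is as a histogram over the attribute values. The sketch below assumes fixed-width intervals, which the text does not specify; `threshold_range` and `bin_width` are hypothetical names, and ties are broken in favour of the lowest interval purely for determinism.

```python
from collections import Counter


def threshold_range(values, is_salient, bin_width):
    """Return the (low, high) interval holding the most salient locations.

    values: determining-attribute value of each sample location.
    is_salient: matching 0/1 decision attribute values.
    bin_width: width of the candidate intervals (an assumption; the
               source does not say how the intervals are formed).
    """
    counts = Counter(
        int(v // bin_width) for v, s in zip(values, is_salient) if s
    )
    if not counts:
        raise ValueError("no salient locations in the sample")
    # Most populated interval; on a tie prefer the lowest interval.
    best_bin = max(counts, key=lambda b: (counts[b], -b))
    return best_bin * bin_width, (best_bin + 1) * bin_width


speeds = [0.2, 0.4, 0.7, 1.3, 5.0, 6.2]
salient = [1, 1, 1, 1, 0, 0]
speed_range = threshold_range(speeds, salient, bin_width=1.0)
```

With these illustrative inputs three of the four salient locations fall in the first interval, so that interval is returned as the threshold range.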
105: Obtain at least one potential salient location on the location trajectory to be processed through the preset clustering algorithm.
106: Determine as salient locations those potential salient locations whose value of the salient-location determining attribute is within the threshold range.
The method for determining salient locations provided by this embodiment applies a clustering algorithm to the sample location trajectory to obtain potential salient locations, then uses a feature selection algorithm to select the determining attribute of salient locations and its threshold range. The location trajectory to be processed is then handled in the same way, and those of its potential salient locations whose determining-attribute value falls within the threshold range are determined as salient locations. Whereas the prior art treats the points selected by the clustering algorithm directly as salient locations, the present invention applies a feature selection algorithm after clustering, which reduces the rate at which salient locations are misjudged and thereby improves the accuracy of determining them.
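Step 106 amounts to a filter over the potential salient locations of the trajectory to be processed. The sketch below assumes each potential salient location is a dict of attribute values and treats both ends of the threshold range as inclusive; both choices, and the name `select_salient`, are assumptions made for illustration.

```python
def select_salient(candidates, attribute, value_range):
    """Keep the potential salient locations whose attribute is in range.

    candidates: list of dicts, one per potential salient location
                (an assumed representation).
    attribute: name of the salient-location determining attribute.
    value_range: (low, high) threshold range; inclusive bounds are assumed.
    """
    low, high = value_range
    return [c for c in candidates if low <= c[attribute] <= high]


candidates = [
    {"id": 1, "speed": 0.3},
    {"id": 2, "speed": 4.0},
    {"id": 3, "speed": 1.0},
]
salient_ids = [c["id"] for c in select_salient(candidates, "speed", (0.0, 1.0))]
```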
It should be noted that, to make the selection of the salient-location determining attribute and of its threshold range more accurate, multiple sample location trajectories can be obtained and each processed by the same method as steps 101 to 104, giving one determining attribute and threshold range per sample location trajectory; these results are then analyzed to obtain the final salient-location determining attribute and its threshold range.
作为步骤101和步骤105所述的获取位置轨迹上的潜在显著地点的一种实现方式:所指的预设聚类算法为DBSCAN算法的变体,该预设聚类算法包括距离阈值、点数阈值和时间阈值三个参数。As an implementation of the acquisition of potentially significant locations on the location trajectory described in step 101 and step 105: the referred preset clustering algorithm is a variant of the DBSCAN algorithm, and the preset clustering algorithm includes a distance threshold and a point threshold and time threshold three parameters.
所述通过预设聚类算法得到所述样本位置轨迹或所述待处理位置轨迹上的至少一个潜在显著地点,如图2所示,具体包括:The obtaining of at least one potential salient location on the sample position track or the position track to be processed by the preset clustering algorithm, as shown in FIG. 2 , specifically includes:
201: Take an unmarked place in the location trajectory as the starting point.
Here, a mark is either "potential significant place" or "noise point".
202: Search for target points whose distance to the starting point is less than or equal to the distance threshold.
For example, the distance threshold may be 100 m.
203: If the number of target points found is greater than or equal to the point-count threshold, and the time interval between each target point and the starting point is greater than the time threshold, determine the starting point and all the target points to be one cluster, and mark the starting point as a potential significant place.
For example, the point-count threshold may be 5 and the time threshold may be 10 minutes.
If these conditions are not met, mark the starting point as a noise point.
204: After all points in the cluster have been marked, mark the other unmarked places in turn.
Steps 201 to 203 are repeated to process every point in the cluster in turn. Once all points in the cluster have been marked (the cluster is then said to be fully expanded), the remaining unmarked points are processed.
The clustering algorithm used in this embodiment of the present invention is a variant of the DBSCAN algorithm. It considers the distance between points of the trajectory, and it measures that distance along the trajectory rather than as the straight-line distance between the two points, which is closer to real-life conditions. In addition, the clustering process uses a minimum time instead of a minimum number of regions to decide whether a cluster is formed, which prevents a significant place from being misjudged merely because the user equipment had to stay somewhere briefly for some special reason.
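Steps 201 to 204 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the `(x_m, y_m, t_s)` point layout (planar coordinates in metres, time in seconds), and the label strings are assumptions made for illustration. The distance between two points is taken along the trajectory, and the time condition is read literally from step 203. The loop visits every point once, which is a simplification of the cluster-expansion order in step 204.

```python
import math

CANDIDATE = "potential significant place"  # hypothetical label strings
NOISE = "noise"

def path_positions(points):
    """Cumulative path length of each point along the trajectory."""
    pos = [0.0]
    for (x0, y0, _), (x1, y1, _) in zip(points, points[1:]):
        pos.append(pos[-1] + math.hypot(x1 - x0, y1 - y0))
    return pos

def mark_points(points, dist_thr=100.0, min_pts=5, time_thr=600.0):
    """points: list of (x_m, y_m, t_s). Returns (labels, clusters)."""
    pos = path_positions(points)
    labels, clusters = [None] * len(points), []
    for i in range(len(points)):
        # Step 202: target points within the distance threshold,
        # measured along the trajectory (an assumption about the metric).
        targets = [j for j in range(len(points))
                   if j != i and abs(pos[j] - pos[i]) <= dist_thr]
        # Step 203: enough target points, each separated in time from the
        # starting point by more than the time threshold (literal reading).
        enough = len(targets) >= min_pts
        long_enough = all(abs(points[j][2] - points[i][2]) > time_thr
                          for j in targets)
        if enough and long_enough:
            labels[i] = CANDIDATE
            clusters.append([i] + targets)
        else:
            labels[i] = NOISE
    return labels, clusters
```

The defaults mirror the example values in the text (100 m, 5 points, 10 minutes). A production version would work on latitude/longitude and would likely index the points spatially instead of scanning all pairs.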
As one specific implementation of step 103, "selecting the significant-place determining attribute from the first attribute set by using a preset feature selection algorithm according to the attribute information table", as shown in FIG. 3, the process specifically includes:
301: Use the preset feature selection algorithm to calculate, for each condition attribute in the first attribute set, the value of the P-norm with tolerance capability.
302: Determine the condition attribute corresponding to the maximum of all the calculated values of the P-norm with tolerance capability as the significant-place determining attribute.
As shown in FIG. 4, the specific process of calculating the value of the P-norm with tolerance capability for each condition attribute in step 301 is:
401: Take a first condition attribute in the first attribute set as the attribute to be evaluated, and determine the set formed by the remaining condition attributes as the second attribute set, where the first condition attribute is any condition attribute.
402: Using a fuzzy-entropy-based user attribute selection game algorithm, judge in turn whether the attribute to be evaluated wins the coalition game of each target subset of the second attribute set, obtaining a judgment result for each target subset, where each target subset includes at least two condition attributes.
The basic principle of this step is to use the fuzzy-entropy-based user attribute selection game method to calculate the conditional mutual information and thereby judge whether the attribute to be evaluated wins the coalition game of an attribute subset.
403: According to all the judgment results, calculate the value of the P-norm with tolerance capability of the attribute to be evaluated.
Repeating steps 401 to 403 yields the value of the P-norm with tolerance capability for every condition attribute.
Suppose the first attribute set contains N-1 condition attributes Ni and one decision attribute D. Any condition attribute Ni in the potential-place information table is selected as the attribute to be evaluated, and the remaining N-2 condition attributes form the second attribute set. The second attribute set has multiple subsets; any attribute subset Si of it that contains at least two condition attributes is selected in order to evaluate the degree of information sharing between Si and the final decision attribute when the condition attribute Ni is known. It is calculated as follows:
MI(Si; D | Ni) = E(Si | Ni) - E(Si | D, Ni)    (1)
Here the conditional entropy E(Si | Ni) is the entropy of the attribute subset Si when Ni is known, and E(Si | D, Ni) is the entropy of Si when both the condition attribute Ni to be evaluated and the decision attribute D are known.
If MI(Si; D | Ni) in formula (1) is greater than 0, the presence of the attribute Ni increases the information shared between the attribute subset Si and the decision attribute D, and Ni is said to win the coalition game of the attribute subset Si.
Using formula (1), it can be determined in turn whether the attribute Ni to be evaluated wins the coalition game of each target attribute subset Si.
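The win test built on formula (1) can be illustrated as follows. This sketch swaps in ordinary Shannon entropy over discrete attribute values in place of the fuzzy entropy used by the embodiment, so it shows only the shape of the computation MI(Si; D | Ni) = E(Si | Ni) - E(Si | D, Ni). The function names, the row-as-dictionary layout, and the tolerance `eps` are assumptions.

```python
import math
from collections import Counter

def entropy(rows, columns):
    """Shannon entropy in bits of the joint distribution of `columns`."""
    counts = Counter(tuple(row[c] for c in columns) for row in rows)
    total = len(rows)
    return -sum((k / total) * math.log2(k / total) for k in counts.values())

def conditional_entropy(rows, target, given):
    """H(target | given) = H(target, given) - H(given)."""
    return entropy(rows, target + given) - entropy(rows, given)

def conditional_mutual_information(rows, subset, decision, evaluated):
    """MI(S; D | N) = H(S | N) - H(S | D, N), following formula (1)."""
    return (conditional_entropy(rows, subset, [evaluated])
            - conditional_entropy(rows, subset, [decision, evaluated]))

def wins(rows, subset, decision, evaluated, eps=1e-12):
    """True when the evaluated attribute wins the game of `subset`."""
    return conditional_mutual_information(rows, subset, decision, evaluated) > eps
```

Here `rows` is a list of dictionaries, one per potential significant place; `subset` is a list of at least two column names, and `decision` and `evaluated` are single column names.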
In addition, the conditional entropy E(Si | Ni) is calculated as:
E(Si | Ni) = -(Σ log2(|[xi]| / n)) / n    (2)
where |[xi]| = Σ rij, rij is the similarity relation between any two condition attribute values xi and xj in the set Si, and n is the number of condition attributes in Si. rij is calculated as shown in formula (3) below:
In formula (3), ||xi - xj|| is the P-norm with tolerance capability.
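Formula (2) can be evaluated directly once the similarity values rij are available. The sketch below assumes they are supplied as an n-by-n matrix `r` with entries in [0, 1] and rii = 1; how rij is derived from the P-norm with tolerance capability is not reproduced here. The function name is hypothetical.

```python
import math

def fuzzy_entropy(r):
    """Formula (2): -(1/n) * sum_i log2(|[x_i]| / n), with |[x_i]| = sum_j r[i][j]."""
    n = len(r)
    return -sum(math.log2(sum(row) / n) for row in r) / n
```

Two boundary cases show the behaviour: when every pair is fully similar (all rij = 1) the entropy is 0, and when each value is similar only to itself (identity matrix) the entropy is log2(n).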
The value of the P-norm with tolerance capability (the Banzhaf value) of the attribute to be evaluated is calculated according to whether the attribute wins the coalition game of each target attribute subset.
In formula (1), when MI(Si; D | Ni) is greater than 0, that is, when the condition attribute Ni wins the coalition game of the attribute subset Si, Δi(Si) = 1; otherwise Δi(Si) = 0.
The Banzhaf value of the condition attribute Ni is then calculated as follows:
Finally, the Banzhaf value of every condition attribute is obtained. A higher Banzhaf value indicates a greater contribution to the attribute subsets. The condition attribute with the highest Banzhaf value is selected as the most important attribute for deciding a significant place, that is, the significant-place determining attribute.
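The selection in steps 301 and 302 can be sketched as below. The Banzhaf formula itself is not reproduced in this text, so the normalization used here (the fraction of target subsets in which the attribute wins, that is, the mean of Δi(Si)) is an assumption. The callback `wins(attribute, subset)` is a hypothetical stand-in for the test based on formula (1).

```python
from itertools import combinations

def target_subsets(attributes, evaluated):
    """All subsets of the remaining attributes with at least two members."""
    rest = [a for a in attributes if a != evaluated]
    for size in range(2, len(rest) + 1):
        yield from combinations(rest, size)

def banzhaf_value(attributes, evaluated, wins):
    """Mean of the win indicator over all target subsets (assumed normalization)."""
    subsets = list(target_subsets(attributes, evaluated))
    if not subsets:
        return 0.0
    return sum(1 for s in subsets if wins(evaluated, s)) / len(subsets)

def select_determining_attribute(attributes, wins):
    """Step 302: the condition attribute with the highest value."""
    return max(attributes, key=lambda a: banzhaf_value(attributes, a, wins))
```

The number of target subsets grows exponentially with the number of condition attributes, which is acceptable for a handful of attributes such as those listed in this embodiment but would need sampling for larger sets.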
Because the potential significant places are obtained by a clustering algorithm, the samples may be sparse. The P-norm with tolerance capability used in this feature extraction copes with sparse samples better than the traditional P-norm, and it also avoids the information loss caused by discretizing samples under an equivalence relation.
In addition, before steps 101 and 105, because a location trajectory acquired from a GPS sensor drifts from time to time, drift points that are obviously noise need to be removed. For example, a jump from Beijing to Shanghai in an instant is clearly wrong, and such GPS latitude/longitude points must be discarded.
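One simple way to discard such drift points is a speed check between consecutive fixes. The text does not specify the removal method, so the sketch below is an assumption: the `(lat, lon, t_s)` layout, the 70 m/s ceiling, and the rule of comparing each fix with the last kept fix are illustrative choices, and the first fix is assumed to be valid.

```python
import math

EARTH_RADIUS_M = 6_371_000.0

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def drop_drift_points(fixes, max_speed_mps=70.0):
    """fixes: list of (lat, lon, t_s) in time order. Returns the kept fixes."""
    kept = []
    for lat, lon, t in fixes:
        if kept:
            plat, plon, pt = kept[-1]
            dt = t - pt
            # Drop fixes with non-increasing time or an implausible speed.
            if dt <= 0 or haversine_m(plat, plon, lat, lon) / dt > max_speed_mps:
                continue
        kept.append((lat, lon, t))
    return kept
```

With this rule, a fix in Shanghai recorded one second after a fix in Beijing is dropped, and the trajectory resumes from the next plausible Beijing fix.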
As an implementation of the above method, an embodiment of the present invention further provides an apparatus for determining a significant place. As shown in FIG. 5, the apparatus includes:
a processing module 501, configured to obtain at least one potential significant place on the sample location trajectory by the preset clustering algorithm;
an acquisition module 502, configured to acquire the values of the first attribute set corresponding to each potential significant place to obtain an attribute information table, where the first attribute set includes a preset number of condition attributes and one decision attribute;
the processing module 501 is further configured to select the significant-place determining attribute from the first attribute set by using the preset feature selection algorithm according to the attribute information table;
the acquisition module 502 is further configured to acquire the threshold range of the significant-place determining attribute;
the processing module 501 is further configured to obtain at least one potential significant place on the to-be-processed location trajectory by the preset clustering algorithm;
the processing module 501 is further configured to determine, as a significant place, each potential significant place whose value of the significant-place determining attribute is within the threshold range.
Further, the acquisition module 502 is specifically configured to calculate, by using the preset feature selection algorithm, the value of the P-norm with tolerance capability for each condition attribute in the first attribute set;
and to determine the condition attribute corresponding to the maximum of all the calculated values of the P-norm with tolerance capability as the significant-place determining attribute.
Further, the acquisition module 502 is also specifically configured to:
take a first condition attribute in the first attribute set as the attribute to be evaluated, and determine the set formed by the remaining condition attributes as the second attribute set, where the first condition attribute is any condition attribute;
judge in turn, by using the fuzzy-entropy-based user attribute selection game algorithm, whether the attribute to be evaluated wins the coalition game of each target subset of the second attribute set, obtaining a judgment result for each target subset, where each target subset includes at least two condition attributes;
calculate, according to all the judgment results, the value of the P-norm with tolerance capability of the attribute to be evaluated.
Further, the preset clustering algorithm takes three parameters: a distance threshold, a point-count threshold, and a time threshold; the processing module 501 is then specifically configured to:
take an unmarked place in the location trajectory as the starting point, where a mark is either "potential significant place" or "noise point";
search for target points whose distance to the starting point is less than or equal to the distance threshold;
if the number of target points found is greater than or equal to the point-count threshold, and the time interval between each target point and the starting point is greater than the time threshold, determine the starting point and all the target points to be one cluster, and mark the starting point as a potential significant place;
after all points in the cluster have been marked, mark the other unmarked places in turn.
Further, the first attribute set acquired by the acquisition module 502 includes speed, acceleration, the number of records of the potential significant place in the sample location trajectory, the number of days on which the potential significant place appears in the sample location trajectory, the stay time interval at the potential significant place, and the standard deviation of the bearing change.
The apparatus for determining a significant place provided by this embodiment of the present invention derives potential significant places from a sample location trajectory by a clustering algorithm, then selects the determining attribute of significant places and its threshold range by a feature selection algorithm, and then processes the to-be-processed location trajectory, determining as significant places those potential significant places on it whose determining-attribute value falls within the threshold range. Compared with the prior art, in which the points selected by the clustering algorithm are used directly as significant places, the present invention applies a feature selection algorithm after the clustering algorithm, which reduces the misjudgment rate of significant places and thus improves the accuracy of determining them.
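The final filtering step, keeping only the potential significant places whose determining-attribute value lies in the threshold range, can be sketched in a few lines. The dictionary layout, the attribute name `stay_minutes`, and the inclusive bounds are assumptions for illustration; the text does not state whether the range endpoints are included.

```python
def select_significant_places(candidates, attribute, low, high):
    """Keep candidates whose value of `attribute` lies in [low, high]."""
    return [c for c in candidates if low <= c[attribute] <= high]

# Hypothetical candidates produced by the clustering step.
candidates = [
    {"id": "A", "stay_minutes": 45.0},
    {"id": "B", "stay_minutes": 3.0},
    {"id": "C", "stay_minutes": 120.0},
]
significant = select_significant_places(candidates, "stay_minutes", 30.0, 180.0)
```

In this toy input, candidate B is a place the clustering step produced but the threshold on the determining attribute rejects, which is the misjudgment the feature selection stage is meant to reduce.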
As an implementation of the above method, an embodiment of the present invention further provides an apparatus for determining a significant place. As shown in FIG. 6, the apparatus includes a processor 601, a memory 602, and a bus 603, where the processor 601 and the memory 602 are connected by the bus 603. Wherein:
the processor 601 is configured to obtain at least one potential significant place on the sample location trajectory by the preset clustering algorithm;
acquire the values of the first attribute set corresponding to each potential significant place to obtain an attribute information table, where the first attribute set includes a preset number of condition attributes and one decision attribute;
select the significant-place determining attribute from the first attribute set by using the preset feature selection algorithm according to the attribute information table;
acquire the threshold range of the significant-place determining attribute;
obtain at least one potential significant place on the to-be-processed location trajectory by the preset clustering algorithm;
determine, as a significant place, each potential significant place whose value of the significant-place determining attribute is within the threshold range.
Further, the processor 601 is specifically configured to calculate, by using the preset feature selection algorithm, the value of the P-norm with tolerance capability for each condition attribute in the first attribute set;
and to determine the condition attribute corresponding to the maximum of all the calculated values of the P-norm with tolerance capability as the significant-place determining attribute.
Further, the processor 601 is also specifically configured to:
take a first condition attribute in the first attribute set as the attribute to be evaluated, and determine the set formed by the remaining condition attributes as the second attribute set, where the first condition attribute is any condition attribute;
judge in turn, by using the fuzzy-entropy-based user attribute selection game algorithm, whether the attribute to be evaluated wins the coalition game of each target subset of the second attribute set, obtaining a judgment result for each target subset, where each target subset includes at least two condition attributes;
calculate, according to all the judgment results, the value of the P-norm with tolerance capability of the attribute to be evaluated.
Further, the processor 601 is specifically configured to:
take an unmarked place in the location trajectory as the starting point, where a mark is either "potential significant place" or "noise point";
search for target points whose distance to the starting point is less than or equal to the distance threshold;
if the number of target points found is greater than or equal to the point-count threshold, and the time interval between each target point and the starting point is greater than the time threshold, determine the starting point and all the target points to be one cluster, and mark the starting point as a potential significant place;
after all points in the cluster have been marked, mark the other unmarked places in turn.
Further, the first attribute set acquired by the processor 601 includes speed, acceleration, the number of records of the potential significant place in the sample location trajectory, the number of days on which the potential significant place appears in the sample location trajectory, the stay time interval at the potential significant place, and the standard deviation of the bearing change.
The memory 602 is configured to store the programs used by the processor 601 during execution.
The apparatus for determining a significant place provided by this embodiment of the present invention derives potential significant places from a sample location trajectory by a clustering algorithm, then selects the determining attribute of significant places and its threshold range by a feature selection algorithm, and then processes the to-be-processed location trajectory, determining as significant places those potential significant places on it whose determining-attribute value falls within the threshold range. Compared with the prior art, in which the points selected by the clustering algorithm are used directly as significant places, the present invention applies a feature selection algorithm after the clustering algorithm, which reduces the misjudgment rate of significant places and thus improves the accuracy of determining them.
It should be noted that the processor 601 described in this embodiment of the present invention may be a single processor or a collective term for multiple processing elements. For example, the processor 601 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention, for example one or more digital signal processors (DSP) or one or more field programmable gate arrays (FPGA).
The memory 602 may be a single storage device or a collective term for multiple storage elements, and is used to store executable program code and the like. The memory 602 may include random access memory (RAM) and may also include non-volatile memory, such as disk storage or flash memory.
The bus 603 may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus 603 may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, it is drawn as a single thick line in FIG. 6, but this does not mean that there is only one bus or only one type of bus.
From the description of the above embodiments, those skilled in the art can clearly understand that the present invention may be implemented by software together with the necessary general-purpose hardware, or by hardware alone, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, may be embodied as a software product. The computer software product is stored in a readable storage medium, such as a computer floppy disk, hard disk, or optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present invention.
The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention.
Claims (6)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510307160.5A CN106294485B (en) | 2015-06-05 | 2015-06-05 | Determine the method and device in significant place |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510307160.5A CN106294485B (en) | 2015-06-05 | 2015-06-05 | Determine the method and device in significant place |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106294485A CN106294485A (en) | 2017-01-04 |
CN106294485B true CN106294485B (en) | 2019-11-01 |
Family
ID=57659648
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510307160.5A Active CN106294485B (en) | 2015-06-05 | 2015-06-05 | Determine the method and device in significant place |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106294485B (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107491895A (en) * | 2017-08-30 | 2017-12-19 | 国信优易数据有限公司 | A kind of shared bicycle parks determination method and device a little |
CN112230253B (en) * | 2020-10-13 | 2021-07-09 | 电子科技大学 | Track characteristic anomaly detection method based on public slice subsequence |
CN113254570B (en) * | 2021-07-15 | 2021-10-01 | 浙江网商银行股份有限公司 | Data identification method and device |
CN114995628B (en) * | 2021-10-13 | 2023-08-11 | 荣耀终端有限公司 | Space gesture recognition method and related equipment thereof |
CN114064834A (en) * | 2021-11-16 | 2022-02-18 | 深圳市中科明望通信软件有限公司 | Target location determination method and device, storage medium and electronic device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103218442A (en) * | 2013-04-22 | 2013-07-24 | 中山大学 | Method and system for life mode analysis based on mobile device sensor data |
CN103577547A (en) * | 2013-10-12 | 2014-02-12 | 优视科技有限公司 | Webpage type identification method and device |
CN104252527A (en) * | 2014-09-02 | 2014-12-31 | 百度在线网络技术(北京)有限公司 | Method and device for determining resident point information of mobile subscriber |
CN104636354A (en) * | 2013-11-07 | 2015-05-20 | 华为技术有限公司 | Position point of interest clustering method and related device |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103218442A (en) * | 2013-04-22 | 2013-07-24 | 中山大学 | Method and system for life mode analysis based on mobile device sensor data |
CN103577547A (en) * | 2013-10-12 | 2014-02-12 | 优视科技有限公司 | Webpage type identification method and device |
CN104636354A (en) * | 2013-11-07 | 2015-05-20 | 华为技术有限公司 | Position point of interest clustering method and related device |
CN104252527A (en) * | 2014-09-02 | 2014-12-31 | 百度在线网络技术(北京)有限公司 | Method and device for determining resident point information of mobile subscriber |
Also Published As
Publication number | Publication date |
---|---|
CN106294485A (en) | 2017-01-04 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||