CN108764951B - User similarity obtaining method and device, equipment and storage medium - Google Patents
- Publication number
- CN108764951B (application CN201810244918.9A)
- Authority
- CN
- China
- Prior art keywords
- user
- similarity
- track
- obtaining
- time
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/40—Business processes related to the transportation industry
Landscapes
- Business, Economics & Management (AREA)
- Engineering & Computer Science (AREA)
- Strategic Management (AREA)
- Finance (AREA)
- Accounting & Taxation (AREA)
- Development Economics (AREA)
- Marketing (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Economics (AREA)
- Entrepreneurship & Innovation (AREA)
- Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- Data Mining & Analysis (AREA)
- Game Theory and Decision Science (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Human Resources & Organizations (AREA)
- Primary Health Care (AREA)
- Tourism & Hospitality (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method, apparatus, device, and storage medium for obtaining user similarity based on user movement trajectories. The method includes: obtaining a first user movement trajectory and a second user movement trajectory, where the first user movement trajectory contains at least one piece of first time information and the second user movement trajectory contains at least one piece of second time information; obtaining a time similarity from each piece of first time information and each piece of second time information; and obtaining the user similarity between the first user and the second user from the first user movement trajectory, the second user movement trajectory, and the time similarity. The invention can improve the accuracy of the obtained user similarity.
Description
Technical Field
The present invention relates to the field of computer technology, and in particular to a method, apparatus, device, and storage medium for obtaining user similarity.
Background
When a telecom operator's user changes communication number, the records associated with the user's old and new numbers often need to be merged. Identifying whether two numbers belong to the same user, that is, judging whether a new network user and an old network user are the same person, has therefore become a core technical problem in telecom operations.
In the prior art, whether two users are the same user is usually judged by calculating the similarity between them. Specifically, whenever a new network user is detected, the user similarity between that user and every old network user in the communication network is calculated automatically. If any calculated user similarity is greater than a preset threshold, the network is considered to contain an old network user similar to the new network user, and the new network user is judged to be a re-entry user. The old network user with the highest user similarity is then judged to be the same user as the new network user, and the records of the two are merged. Existing methods usually derive the user similarity from the similarity of the users' communication records; because they rely on this single criterion, their accuracy is low.
Summary of the Invention
Embodiments of the present invention provide a method, apparatus, device, and storage medium for obtaining user similarity, which can improve the accuracy of the obtained user similarity.
An embodiment of the present invention provides a method for obtaining user similarity based on user movement trajectories, which specifically includes:
obtaining a first user movement trajectory and a second user movement trajectory, where the first user movement trajectory contains at least one piece of first time information and the second user movement trajectory contains at least one piece of second time information;
obtaining a time similarity from each piece of first time information and each piece of second time information; and
obtaining the user similarity between the first user and the second user from the first user movement trajectory, the second user movement trajectory, and the time similarity.
Further, the first user movement trajectory contains at least one first trajectory point and the second user movement trajectory contains at least one second trajectory point; the pieces of first time information correspond one-to-one with the first trajectory points, and the pieces of second time information correspond one-to-one with the second trajectory points.
Further, obtaining the time similarity from each piece of first time information and each piece of second time information specifically includes:
calculating the time similarity COL from each piece of first time information T_i(u), each piece of second time information T_j(v), and a preset time similarity calculation model (formula not reproduced in this text), where u denotes the first user; v denotes the second user; n(u) is the total number of first trajectory points in the first user movement trajectory; n(v) is the total number of second trajectory points in the second user movement trajectory; ΔT is a preset time precision; L_i(u) denotes the i-th first trajectory point in the first user movement trajectory; L_j(v) denotes the j-th second trajectory point in the second user movement trajectory; and δ(L_i(u), L_j(v)) is the degree of coincidence between the i-th first trajectory point and the j-th second trajectory point.
Further, obtaining the user similarity between the first user and the second user from the first user movement trajectory, the second user movement trajectory, and the time similarity specifically includes:
calculating the trajectory similarity D_LCSS from the first user movement trajectory, the second user movement trajectory, and a preset trajectory similarity calculation model (formula not reproduced in this text), where LCSS(u, v) denotes the longest common subsequence between the first user movement trajectory and the second user movement trajectory; len_u denotes the total number of first trajectory points in the first user movement trajectory; and len_v denotes the total number of second trajectory points in the second user movement trajectory; and
obtaining the user similarity from the trajectory similarity D_LCSS and the time similarity.
Further, obtaining the user similarity from the trajectory similarity D_LCSS and the time similarity specifically includes:
calculating the user similarity sim(u, v) from the trajectory similarity D_LCSS, the time similarity COL, and the user similarity calculation model sim(u, v) = D_LCSS * COL.
Further, the first time information is the time at which the first user arrives at the corresponding first trajectory point, and the second time information is the time at which the second user arrives at the corresponding second trajectory point.
Further, after obtaining the user similarity between the first user and the second user from the first user movement trajectory, the second user movement trajectory, and the time similarity, the method further includes:
judging whether the user similarity is greater than a preset threshold;
if so, determining that the first user and the second user are the same user;
if not, determining that the first user and the second user are not the same user.
Correspondingly, an embodiment of the present invention also provides an apparatus for obtaining user similarity based on user movement trajectories, which specifically includes:
a user movement trajectory obtaining module, configured to obtain a first user movement trajectory and a second user movement trajectory, where the first user movement trajectory contains at least one piece of first time information and the second user movement trajectory contains at least one piece of second time information;
a time similarity obtaining module, configured to obtain a time similarity from each piece of first time information and each piece of second time information; and
a user similarity obtaining module, configured to obtain the user similarity between the first user and the second user from the first user movement trajectory, the second user movement trajectory, and the time similarity.
An embodiment of the present invention also provides a device, which specifically includes a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor; when the processor executes the computer program, it implements the method for obtaining user similarity based on user movement trajectories described above.
An embodiment of the present invention also provides a computer-readable storage medium, which specifically includes a stored computer program; when the computer program runs, it controls the device on which the computer-readable storage medium is located to perform the method for obtaining user similarity based on user movement trajectories described above.
Implementing the embodiments of the present invention has the following beneficial effects:
The method, apparatus, device, and storage medium provided by the embodiments of the present invention obtain the user similarity between the first user and the second user by calculating the similarity between the first user movement trajectory and the second user movement trajectory, so that the calculated user similarity closely matches reality. In addition, introducing the time similarity into the calculation diversifies the basis on which the user similarity is obtained, which improves its accuracy.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a preferred embodiment of the method for obtaining user similarity based on user movement trajectories provided by the present invention;
FIG. 2 is a schematic structural diagram of a preferred embodiment of the apparatus for obtaining user similarity based on user movement trajectories provided by the present invention;
FIG. 3 is a schematic structural diagram of a preferred embodiment of the device provided by the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
FIG. 1 is a schematic flowchart of a preferred embodiment of the method for obtaining user similarity based on user movement trajectories provided by the present invention. The method includes steps S11 to S13, as follows:
S11: Obtain a first user movement trajectory and a second user movement trajectory, where the first user movement trajectory contains at least one piece of first time information and the second user movement trajectory contains at least one piece of second time information.
Note that the embodiments of the present invention are executed by a system, which may be a system in a telecom operator's server.
The telecom operator deploys a number of base stations in various places, and the system obtains a user's movement trajectory from the user's communication data. Specifically, the system monitors the first user in real time and obtains the first user's communication data, such as WeChat, SMS, and QQ data. By analyzing this data, it determines which base stations the first user passed during the period and records them, thereby obtaining the first user's movement trajectory. The second user's movement trajectory is obtained in the same way.
Further, the first user movement trajectory contains at least one first trajectory point and the second user movement trajectory contains at least one second trajectory point; the pieces of first time information correspond one-to-one with the first trajectory points, and the pieces of second time information correspond one-to-one with the second trajectory points.
More preferably, the first time information is the time at which the first user arrives at the corresponding first trajectory point, and the second time information is the time at which the second user arrives at the corresponding second trajectory point.
Note that in this embodiment, whenever the first user is detected moving into the vicinity of a base station, that base station is recorded as a trajectory point together with the time at which the first user arrived there. Recording the base stations the first user passes within a preset time period, along with the arrival time at each, yields the first user movement trajectory. The second user movement trajectory is obtained in the same way.
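The recording procedure described above can be sketched as a small data structure. This is a minimal illustration rather than the patent's implementation: the `TrackPoint` class, the `build_trajectory` function, the `station_id` field, and the rule that consecutive sightings at the same base station count as a single arrival are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Tuple


@dataclass(frozen=True)
class TrackPoint:
    station_id: str       # hypothetical base-station identifier
    arrived_at: datetime  # time the user arrived at that base station


def build_trajectory(events: List[Tuple[str, datetime]],
                     start: datetime,
                     end: datetime) -> List[TrackPoint]:
    """Build a trajectory from (station_id, timestamp) sightings.

    Keeps sightings inside the preset period [start, end], orders them
    by time, and records a trajectory point each time the user shows up
    at a different base station than the previous one (an assumption).
    """
    trajectory: List[TrackPoint] = []
    for station_id, ts in sorted(events, key=lambda e: e[1]):
        if not (start <= ts <= end):
            continue  # outside the preset time period
        if trajectory and trajectory[-1].station_id == station_id:
            continue  # still at the same base station: no new arrival
        trajectory.append(TrackPoint(station_id, ts))
    return trajectory
```

Each `TrackPoint` pairs one trajectory point with exactly one arrival time, which mirrors the one-to-one correspondence between time information and trajectory points.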
S12: Obtain a time similarity from each piece of first time information and each piece of second time information.
Note that the time similarity is the similarity of the first user and the second user in time, that is, the likelihood that the two users appear at the same geographic location within the same time period. It is obtained from each piece of first time information and each piece of second time information.
S13: Obtain the user similarity between the first user and the second user from the first user movement trajectory, the second user movement trajectory, and the time similarity.
Note that in this embodiment, the similarity between the first user movement trajectory and the second user movement trajectory is calculated in combination with the time similarity, which yields the user similarity between the first user and the second user.
This embodiment obtains the user similarity between the first user and the second user by calculating the similarity between the first user movement trajectory and the second user movement trajectory, so that the calculated user similarity closely matches reality. In addition, introducing the time similarity into the calculation diversifies the basis on which the user similarity is obtained, which improves its accuracy.
In another preferred embodiment, step S12 further includes step S1201, as follows:
S1201: Calculate the time similarity COL from each piece of first time information T_i(u), each piece of second time information T_j(v), and a preset time similarity calculation model (formula not reproduced in this text), where u denotes the first user; v denotes the second user; n(u) is the total number of first trajectory points in the first user movement trajectory; n(v) is the total number of second trajectory points in the second user movement trajectory; ΔT is a preset time precision; L_i(u) denotes the i-th first trajectory point in the first user movement trajectory; L_j(v) denotes the j-th second trajectory point in the second user movement trajectory; and δ(L_i(u), L_j(v)) is the degree of coincidence between the i-th first trajectory point and the j-th second trajectory point.
Note that ΔT is more preferably set to 1 hour. δ(L_i(u), L_j(v)) = 1 when the i-th first trajectory point in the first user movement trajectory coincides with the j-th second trajectory point in the second user movement trajectory; otherwise δ(L_i(u), L_j(v)) = 0.
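Because the time similarity formula itself is missing from this text, the following sketch assumes one common co-location form that uses the same symbols: among all pairs (i, j) whose arrival times differ by at most ΔT, the fraction whose trajectory points coincide (δ = 1). The function name `time_similarity`, the `Track` alias, and the inclusive comparison `<= delta_t` are assumptions for illustration, and the patent's actual model may differ.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

# A trajectory as (base-station id, arrival time) pairs; layout is assumed.
Track = List[Tuple[str, datetime]]


def time_similarity(track_u: Track, track_v: Track,
                    delta_t: timedelta = timedelta(hours=1)) -> float:
    """Assumed co-location rate.

    Counts the pairs (i, j) with |T_i(u) - T_j(v)| <= delta_t, and
    returns the share of those pairs whose locations coincide.
    Returns 0.0 when no pair is close enough in time.
    """
    close_in_time = 0
    co_located = 0
    for loc_u, t_u in track_u:
        for loc_v, t_v in track_v:
            if abs(t_u - t_v) <= delta_t:
                close_in_time += 1
                if loc_u == loc_v:  # delta(L_i(u), L_j(v)) = 1
                    co_located += 1
    return co_located / close_in_time if close_in_time else 0.0
```

The double loop visits all n(u) × n(v) pairs, which is acceptable for short trajectories; sorting both trajectories by time would allow a sliding-window variant for longer ones.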
In yet another preferred embodiment, step S13 further includes steps S1301 to S1302, as follows:
S1301: Calculate the trajectory similarity D_LCSS from the first user movement trajectory, the second user movement trajectory, and a preset trajectory similarity calculation model (formula not reproduced in this text), where LCSS(u, v) denotes the longest common subsequence between the first user movement trajectory and the second user movement trajectory; len_u denotes the total number of first trajectory points in the first user movement trajectory; and len_v denotes the total number of second trajectory points in the second user movement trajectory.
Note that LCSS(u, v) is defined by a further formula (not reproduced in this text), in which u_i denotes the i-th first trajectory point in the first user movement trajectory; v_j denotes the j-th second trajectory point in the second user movement trajectory; γ is a preset similarity threshold; and dist(u_i, v_j) is the Euclidean distance between trajectory point u_i and trajectory point v_j.
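The LCSS definition is not reproduced in this text, so the following is a sketch of the standard dynamic-programming form of the longest common subsequence for trajectories, in which two points match when their Euclidean distance is below γ. The function name `lcss_length`, the use of planar (x, y) coordinates for trajectory points, and the strict comparison `dist < gamma` are assumptions for illustration.

```python
from math import hypot
from typing import List, Tuple

Point = Tuple[float, float]  # planar coordinates of a trajectory point (assumed)


def lcss_length(u: List[Point], v: List[Point], gamma: float) -> int:
    """Length of the longest common subsequence of two trajectories.

    table[i][j] holds the LCSS length of the first i points of u and
    the first j points of v. Points u_i and v_j are treated as a match
    when dist(u_i, v_j) < gamma.
    """
    table = [[0] * (len(v) + 1) for _ in range(len(u) + 1)]
    for i in range(1, len(u) + 1):
        for j in range(1, len(v) + 1):
            dist = hypot(u[i - 1][0] - v[j - 1][0], u[i - 1][1] - v[j - 1][1])
            if dist < gamma:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[len(u)][len(v)]
```

The table costs O(len_u × len_v) time and memory. If trajectory points are base stations rather than coordinates, the distance test could be replaced by an identity check on the station.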
S1302: Obtain the user similarity from the trajectory similarity D_LCSS and the time similarity.
Further, step S1302 includes step S1302_1, as follows:
S1302_1: Calculate the user similarity sim(u, v) from the trajectory similarity D_LCSS, the time similarity COL, and the user similarity calculation model sim(u, v) = D_LCSS * COL.
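The product sim(u, v) = D_LCSS * COL is given in the text, but the formula for D_LCSS is not. The sketch below therefore assumes a common normalization, the LCSS length divided by the length of the shorter trajectory, purely to make the arithmetic concrete. The function names and that normalization are assumptions, not the patent's stated model.

```python
def trajectory_similarity(lcss_len: int, len_u: int, len_v: int) -> float:
    """Assumed normalization: LCSS length over the shorter trajectory length."""
    shorter = min(len_u, len_v)
    return lcss_len / shorter if shorter else 0.0


def user_similarity(d_lcss: float, col: float) -> float:
    """sim(u, v) = D_LCSS * COL, as stated in step S1302_1."""
    return d_lcss * col


# Worked example: an LCSS of length 2 over trajectories with 3 and 4 points,
# combined with a time similarity of 0.5.
d_lcss = trajectory_similarity(2, 3, 4)  # 2 / 3 under the assumed normalization
sim = user_similarity(d_lcss, 0.5)       # (2 / 3) * 0.5 = 1 / 3
```

Because the two factors are multiplied, a pair of users scores high only when their trajectories are similar in shape and they also appear at the same places at nearby times; either factor being zero drives the user similarity to zero.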
In yet another preferred embodiment, steps S14 to S16 follow step S13, as follows:
S14: Judge whether the user similarity is greater than a preset threshold; if so, go to S15; if not, go to S16.
S15: Determine that the first user and the second user are the same user.
S16: Determine that the first user and the second user are not the same user.
Note that in this embodiment, the similarity between the first user and the second user is judged by calculating the similarity between their movement trajectories, so that during telecom operation it can be judged whether a new network user is a re-entry user. For example, suppose user A previously used the number 159******** and deactivated it after two months, and two months later user B activates a new number 186******** with the same telecom operator. The operator analyzes the communication data of user A and user B to obtain each user's movement trajectory, then calculates the similarity between the two trajectories to obtain the user similarity between user A and user B. If the user similarity is greater than a preset threshold, user A and user B are considered the same person, and user B is judged to be a re-entry user. If the user similarity is less than or equal to the preset threshold, user A and user B are considered different people, and user B is judged to be a new network user.
Because the user similarity obtained according to the above embodiments is highly accurate, the accuracy of judging re-entry users is improved.
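The threshold decision in steps S14 to S16, applied to the re-entry scenario in which a new network user is compared against every old network user, can be sketched as follows. The function `match_reentry_user`, the dictionary of precomputed similarities, and the user identifiers are hypothetical; the strict `>` comparison follows the text, which treats a similarity equal to the threshold as "not the same user".

```python
from typing import Dict, Optional


def match_reentry_user(similarities: Dict[str, float],
                       threshold: float) -> Optional[str]:
    """Pick the old network user judged to be the same person, if any.

    `similarities` maps each old user's id to sim(new_user, old_user).
    Returns the id with the highest similarity when that similarity is
    strictly greater than the threshold; otherwise returns None, meaning
    the new user is treated as a genuinely new network user.
    """
    if not similarities:
        return None
    best_id = max(similarities, key=similarities.get)
    return best_id if similarities[best_id] > threshold else None
```

For example, with similarities `{"old_a": 0.2, "old_b": 0.8}` and a threshold of 0.6, the function returns `"old_b"`, whose records would then be merged with those of the new number.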
Note further that the step labels above only distinguish the steps and do not limit the order in which they are executed.
The method for obtaining user similarity based on user movement trajectories provided by the embodiments of the present invention obtains the user similarity between the first user and the second user by calculating the similarity between the first user movement trajectory and the second user movement trajectory, so that the calculated user similarity closely matches reality. In addition, introducing the time similarity into the calculation diversifies the basis on which the user similarity is obtained, which improves its accuracy.
Correspondingly, the present invention also provides an apparatus for obtaining user similarity based on user movement trajectories, which can implement all the processes of the method in the above embodiments.
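FIG. 2 is a schematic structural diagram of a preferred embodiment of the apparatus for obtaining user similarity based on user movement trajectories provided by the present invention. The apparatus includes:
a user movement trajectory obtaining module 21, configured to obtain a first user movement trajectory and a second user movement trajectory, where the first user movement trajectory contains at least one piece of first time information and the second user movement trajectory contains at least one piece of second time information;
a time similarity obtaining module 22, configured to obtain a time similarity from each piece of first time information and each piece of second time information; and
a user similarity obtaining module 23, configured to obtain the user similarity between the first user and the second user from the first user movement trajectory, the second user movement trajectory, and the time similarity.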
进一步地,所述第一用户移动轨迹中包含至少一个第一轨迹点;所述第二用户移动轨迹中包含至少一个第二轨迹点;每个所述第一时刻信息与每个所述第一轨迹点具有一一对应关系;每个所述第二时刻信息与每个所述第二轨迹点具有一一对应关系。Further, the first user movement track includes at least one first track point; the second user movement track includes at least one second track point; each of the first time information is associated with each of the first The track points have a one-to-one correspondence; each of the second moment information has a one-to-one correspondence with each of the second track points.
进一步地,所述时间相似度获得模块,具体包括:Further, the temporal similarity obtaining module specifically includes:
时间相似度计算单元,用于根据每个所述第一时刻信息Ti(u)、每个所述第二时刻信息Tj(v)以及预设的时间相似度计算模型计算获得所述时间相似度COL;其中,u表示所述第一用户;v表示所述第二用户;n(u)为所述第一用户移动轨迹中的第一轨迹点的总个数;n(v)为所述第二用户移动轨迹中的第二轨迹点的总个数;ΔT为预设的时间精度;Li(u)表示所述第一用户移动轨迹中的第i个第一轨迹点;Lj(v)表示所述第二用户移动轨迹中的第j个第二轨迹点;δ(Li(u),Lj(v))为所述第一用户移动轨迹中的第i个第一轨迹点与所述第二用户移动轨迹中的第j个第二轨迹点的重合度。A time similarity calculation unit, configured to calculate a model according to each of the first time information T i (u), each of the second time information T j (v) and a preset time similarity Calculate and obtain the temporal similarity COL; wherein, u represents the first user; v represents the second user; n(u) is the total number of first trajectory points in the movement trajectory of the first user; n(v) is the total number of second trajectory points in the second user's movement trajectory; ΔT is the preset time precision; Li (u) represents the i -th point in the first user's movement trajectory A trajectory point; L j (v) represents the j-th second trajectory point in the second user movement trajectory; δ(L i (u), L j (v)) is the first user movement trajectory in the The degree of coincidence between the i-th first trajectory point of and the j-th second trajectory point in the second user movement trajectory.
进一步地,所述用户相似度获得模块,具体包括:Further, the user similarity obtaining module specifically includes:
轨迹相似度计算单元,用于根据所述第一用户移动轨迹、所述第二用户移动轨迹和预设的轨迹相似度计算模型计算获得轨迹相似度DLCSS;其中,LCSS(u,v)表示所述第一用户移动轨迹与所述第二用户移动轨迹之间的最长公共子序列;lenu表示所述第一用户移动轨迹中的第一轨迹点的总个数;lenv表示所述第二用户移动轨迹中的第二轨迹点的总个数;以及,A trajectory similarity calculation unit, configured to calculate a model according to the first user movement trajectory, the second user movement trajectory and a preset trajectory similarity Calculate and obtain the track similarity D LCSS ; wherein, LCSS(u, v) represents the longest common subsequence between the first user's movement track and the second user's movement track; len u represents the first user's movement the total number of first trajectory points in the trajectory; len v represents the total number of second trajectory points in the second user movement trajectory; and,
用户相似度计算单元,用于根据所述轨迹相似度DLCSS和所述时间相似度,获得所述用户相似度。A user similarity calculation unit, configured to obtain the user similarity according to the trajectory similarity D LCSS and the temporal similarity.
进一步地,所述用户相似度计算单元,具体包括;Further, the user similarity calculation unit specifically includes;
用户相似度计算子单元,用于根据所述轨迹相似度DLCSS、所述时间相似度COL和用户相似度计算模型sim(u,v)=DLCSS*COL,计算获得所述用户相似度sim(u,v)。A user similarity calculation subunit, configured to calculate and obtain the user similarity sim according to the trajectory similarity D LCSS , the temporal similarity COL and the user similarity calculation model sim(u,v)=D LCSS *COL (u, v).
Further, the first time information is the time at which the first user arrives at the corresponding first trajectory point, and the second time information is the time at which the second user arrives at the corresponding second trajectory point.
Further, the apparatus for obtaining user similarity based on user movement trajectories also includes:
a user similarity judging module, configured to judge whether the user similarity is greater than a preset threshold; and
a first processing module, configured to determine that the first user and the second user are the same user when the user similarity is judged to be greater than the preset threshold; or
a second processing module, configured to determine that the first user and the second user are not the same user when the user similarity is judged to be not greater than the preset threshold.
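The judging and processing modules above amount to a strict greater-than comparison against the preset threshold. The Python sketch below illustrates that decision; the function name and the threshold value 0.3 are hypothetical, since the text does not specify a threshold.

```python
def is_same_user(similarity: float, threshold: float) -> bool:
    """True only when the similarity is strictly greater than the threshold.

    A similarity equal to the threshold counts as "not greater", so the two
    users are treated as different users in that case.
    """
    return similarity > threshold


ASSUMED_THRESHOLD = 0.3  # illustrative value, not taken from the text
decision_above = is_same_user(0.375, ASSUMED_THRESHOLD)
decision_equal = is_same_user(0.3, ASSUMED_THRESHOLD)
```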
The apparatus for obtaining user similarity based on user movement trajectories provided by the embodiment of the present invention obtains the user similarity between the first user and the second user by calculating the similarity between the first user's movement trajectory and the second user's movement trajectory, so that the calculated user similarity closely matches the real-world situation. In addition, introducing the time similarity into the user similarity calculation diversifies the basis on which the user similarity is obtained, which improves its accuracy.
The present invention also provides a device.
FIG. 3 is a schematic structural diagram of a preferred embodiment of the device provided by the present invention. The device includes a processor 31, a memory 32, and a computer program stored in the memory 32 and configured to be executed by the processor 31. When the processor 31 executes the computer program, it implements the method for obtaining user similarity based on user movement trajectories described in any of the above embodiments.
It should be noted that FIG. 3 shows only one memory connected to one processor as an example. In some specific embodiments, the device may include multiple memories and/or multiple processors, and their number and connection mode may be set and adapted according to actual needs.
The device provided by the embodiment of the present invention obtains the user similarity between the first user and the second user by calculating the similarity between the first user's movement trajectory and the second user's movement trajectory, so that the calculated user similarity closely matches the real-world situation. In addition, introducing the time similarity into the user similarity calculation diversifies the basis on which the user similarity is obtained, which improves its accuracy.
The present invention also provides a computer-readable storage medium that includes a stored computer program. When the computer program runs, it controls the device on which the computer-readable storage medium is located to execute the method for obtaining user similarity based on user movement trajectories described in any of the above embodiments.
It should be noted that all or part of the processes in the methods of the above embodiments may also be carried out by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of each of the above method embodiments. The computer program includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include any entity or apparatus capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, computer memory, read-only memory (ROM), random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and so on. It should further be noted that the content covered by the computer-readable medium may be expanded or reduced as required by legislation and patent practice in a given jurisdiction. For example, in some jurisdictions, legislation and patent practice exclude electrical carrier signals and telecommunication signals from computer-readable media.
The computer-readable storage medium provided by the embodiment of the present invention obtains the user similarity between the first user and the second user by calculating the similarity between the first user's movement trajectory and the second user's movement trajectory, so that the calculated user similarity closely matches the real-world situation. In addition, introducing the time similarity into the user similarity calculation diversifies the basis on which the user similarity is obtained, which improves its accuracy.
The above are preferred embodiments of the present invention. It should be noted that a person of ordinary skill in the art may make improvements and refinements without departing from the principles of the present invention, and such improvements and refinements also fall within the scope of protection of the present invention.
Claims (8)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810244918.9A CN108764951B (en) | 2018-03-23 | 2018-03-23 | User similarity obtaining method and device, equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108764951A (en) | 2018-11-06 |
CN108764951B (en) | 2021-01-12 |
Family
ID=63980280
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810244918.9A Expired - Fee Related CN108764951B (en) | 2018-03-23 | 2018-03-23 | User similarity obtaining method and device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108764951B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111694875B (en) * | 2019-03-14 | 2023-04-25 | 百度在线网络技术(北京)有限公司 | Method and device for outputting information |
CN112653995B (en) * | 2019-10-12 | 2023-03-28 | 中国移动通信有限公司研究院 | User identity recognition method and device and computer readable storage medium |
CN111182465A (en) * | 2019-12-12 | 2020-05-19 | 中国联合网络通信集团有限公司 | Method and device for determining terminal belonging |
CN111294742B (en) * | 2020-02-10 | 2020-11-10 | 邑客得(上海)信息技术有限公司 | Method and system for identifying accompanying mobile phone number based on signaling CDR data |
CN112465869B (en) * | 2020-11-30 | 2023-09-05 | 杭州海康威视数字技术股份有限公司 | Track association method and device, electronic equipment and storage medium |
CN114584657A (en) * | 2022-02-28 | 2022-06-03 | 天翼安全科技有限公司 | Telephone number identification method, device, equipment and medium for abnormal communication |
CN116056067B (en) * | 2023-01-09 | 2024-04-19 | 中国联合网络通信集团有限公司 | Terminal identification method, device, server and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101141633A (en) * | 2007-08-28 | 2008-03-12 | 湖南大学 | A Moving Object Detection and Tracking Method in Complex Scenes |
CN103942310A (en) * | 2014-04-18 | 2014-07-23 | 厦门雅迅网络股份有限公司 | User behavior similarity mining method based on space-time mode |
WO2017133636A1 (en) * | 2016-02-05 | 2017-08-10 | 中兴通讯股份有限公司 | Method and apparatus for predicting base station switching of a mobile terminal |
CN107145796A (en) * | 2017-04-24 | 2017-09-08 | 公安海警学院 | Track data k anonymities method for secret protection under a kind of uncertain environment |
CN107665289A (en) * | 2017-11-17 | 2018-02-06 | 广州汇智通信技术有限公司 | The processing method and system of a kind of carrier data |
Also Published As
Publication number | Publication date |
---|---|
CN108764951A (en) | 2018-11-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108764951B (en) | User similarity obtaining method and device, equipment and storage medium | |
CN112417439B (en) | Account detection method, device, server and storage medium | |
WO2018086489A1 (en) | Method and device for processing incoming call, and terminal | |
CN109951289B (en) | Identification method, device, equipment and readable storage medium | |
CN110213714B (en) | Method and device for terminal positioning | |
CN105336097B (en) | The traffic prewarning method and device of movement of population track | |
CN110210604A (en) | A kind of terminal device movement pattern method and device | |
CN112770265B (en) | Pedestrian identity information acquisition method, system, server and storage medium | |
CN108377204B (en) | User off-network prediction method and device | |
CN113891240B (en) | Geographic fence generation method and device, positioning method and device, medium and equipment | |
CN108834077A (en) | Tracking area division method, device and electronic equipment based on user mobility characteristics | |
CN112528716A (en) | Event information acquisition method and device | |
CN112131225B (en) | Method and device for determining application installation source and tracing system | |
CN108600961A (en) | Preparation method and device, equipment, the storage medium of user's similarity | |
CN110995687B (en) | Cat pool equipment identification method, device, equipment and storage medium | |
CN108900975A (en) | The detection method and device of user's motion track, equipment, storage medium | |
CN105578395B (en) | A kind of method and device updating terminal attribute in end message library | |
WO2017020748A1 (en) | Method and device for processing signalling tracking task | |
CN113741930A (en) | Application upgrading method and device, electronic equipment and computer readable storage medium | |
CN107241219B (en) | Users to trust degree prediction technique and device | |
CN108509560B (en) | User similarity acquisition method and device, device, and storage medium | |
CN112637888A (en) | Coverage hole area identification method, device, equipment and readable storage medium | |
CN109379704B (en) | Method, device and equipment for correcting regional information of short message and storage medium | |
WO2021080332A3 (en) | System for predicting vehicle trajectory by using extended kalman filter in vehicle software defined networking and method therefor, and computer-readable recording medium on which program for performing method is recorded | |
CN114970495A (en) | Name disambiguation method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20210112 |