CN107563402A - Social relationship inference method and system - Google Patents
- Publication number
- CN107563402A (application number CN201710552281.5A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
Description
Technical Field
The present invention relates to the field of information technology, and more specifically to a social relationship inference method and system.
Background Art
With the widespread adoption of smart mobile devices equipped with multiple sensors, such as mobile phones and tablet computers, and the rapid development of social networks, more and more spatio-temporal data can be collected, for example personal communication data, GPS trajectories, and check-in data provided by various location-based service providers. With the development of the Internet of Things, systems originally built for other purposes, such as smart transit card systems, video surveillance systems, and ATM systems, have further increased the capacity to collect spatio-temporal data. Mining the social relationships between people from such data is crucial for a number of popular applications, such as friend recommendation, advertisement placement, analysis of the spread of epidemics, and identification of members of criminal or terrorist organizations.
To infer the social relationship between two people from spatio-temporal data, the prior art uses their co-occurrence features to infer how close the relationship is. A co-occurrence is the event of two people appearing at the same place at the same time. In general, the closer two people are, the more often they co-occur, so the closeness of their relationship can be judged directly from the number of co-occurrences. Whether a co-occurrence location is a public or a private place can be determined from the entropy of the location, which is computed as follows:
$$H_l = -\sum_{u} P_{u,l} \log P_{u,l}$$
where $P_{u,l}$ is the number of times user $u$ appears at the $l$-th location as a proportion of the total number of times all users appear at the $l$-th location. A larger entropy means that many people have visited the location, i.e. that it is a public place. A co-occurrence at a high-entropy location therefore contributes less to the relationship between two people, whereas a co-occurrence at a low-entropy location contributes more. The entropy value can thus be used to assess how much the observed co-occurrences count, and from that to infer the relationship between the two people.
However, two closely related people also co-occur at high-entropy places such as large shopping malls and supermarkets. Moreover, the vast majority of existing spatio-temporal data comes from public places, so spatio-temporal data for low-entropy locations is hard to obtain. In practice, therefore, the existing methods that infer social relationships from entropy values are not accurate.
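The notion of co-occurrence used by the prior art can be illustrated with a minimal sketch. It assumes records of the form (user, place, timestamp in seconds) and treats "the same time" as falling into the same fixed-width time bucket; the function name `count_cooccurrences` and the one-hour `bucket_seconds` default are hypothetical choices, not taken from the patent.

```python
from collections import defaultdict
from itertools import combinations

def count_cooccurrences(records, bucket_seconds=3600):
    """Count how often each pair of users shares a (place, time bucket).

    records: iterable of (user, place, timestamp_in_seconds) tuples.
    bucket_seconds: hypothetical window that defines "the same time".
    """
    present = defaultdict(set)  # (place, bucket) -> users seen there
    for user, place, ts in records:
        present[(place, ts // bucket_seconds)].add(user)

    counts = defaultdict(int)  # (user_a, user_b) with user_a < user_b -> count
    for users in present.values():
        for a, b in combinations(sorted(users), 2):
            counts[(a, b)] += 1
    return dict(counts)

records = [
    ("alice", "stop_1", 100), ("bob", "stop_1", 200),
    ("alice", "stop_2", 4000), ("bob", "stop_2", 4100),
    ("carol", "stop_2", 9000),
]
pair_counts = count_cooccurrences(records)
```

In this toy input, alice and bob share two (place, bucket) cells, while carol visits stop_2 in a later bucket and co-occurs with nobody.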
Summary of the Invention
The present invention provides a social relationship inference method and system that overcome the above problems or at least partially solve them.
According to a first aspect, the present invention provides a social relationship inference method, the method comprising:
S1. based on a tensor decomposition method, obtaining from spatio-temporal data a plurality of movement intentions of a first user and a plurality of movement intentions of a second user within the same time period;
S2. based on a preset multi-class classifier, obtaining, from the plurality of movement intentions of the first user and the plurality of movement intentions of the second user, the co-occurring movement intentions of the first user and the second user;
S3. based on an SVM binary classification model and the co-occurring movement intentions, inferring the social relationship between the first user and the second user.
Step S1 comprises:
dividing the spatio-temporal data into a plurality of spatio-temporal sub-data according to preset time periods;
computing the tensor element values of the spatio-temporal sub-data in each time period;
extracting the plurality of movement intentions of the first user and the plurality of movement intentions of the second user based on the feature vectors of the tensor element values.
Before step S1, the method further comprises:
acquiring the spatio-temporal data, and training the multi-class classifier using part of the spatio-temporal data as a training data set.
Acquiring the spatio-temporal data and training the multi-class classifier using part of the spatio-temporal data as a training data set comprises:
determining the classification features of the multi-class classifier based on the tensor element values;
training the multi-class classifier on the training data set based on the classification features.
Step S2 comprises:
computing the classification feature values of the first user based on the plurality of movement intentions of the first user;
computing the classification feature values of the second user based on the plurality of movement intentions of the second user;
based on the multi-class classifier, comparing the classification feature values of the first user with those of the second user, and extracting the co-occurring movement intentions of the first user and the second user.
The classification features include:
one or more of spatial features, time features, and date features.
Step S3 comprises:
converting the co-occurring movement intentions of the first user and the second user into a multi-dimensional feature vector;
inferring the social relationship between the first user and the second user based on the SVM binary classification model and the multi-dimensional feature vector.
According to a second aspect of the present invention, a social relationship inference system is provided, comprising:
a tensor decomposition module, configured to obtain from spatio-temporal data, based on a tensor decomposition method, a plurality of movement intentions of a first user and a plurality of movement intentions of a second user within the same time period;
a multi-class classifier module, configured to obtain, based on a preset multi-class classifier, the co-occurring movement intentions of the first user and the second user from the plurality of movement intentions of the first user and the plurality of movement intentions of the second user;
an inference module, configured to infer the social relationship between the first user and the second user based on an SVM binary classification model and the co-occurring movement intentions.
According to a third aspect of the present invention, a computer program product is provided, comprising program code for executing the social relationship inference method described above.
According to a fourth aspect of the present invention, a non-transitory computer-readable storage medium is provided for storing the computer program described above.
The social relationship inference method and system provided by the present invention use tensor decomposition to extract, from users' spatio-temporal data, movement intentions that share the same spatio-temporal characteristics, thereby establishing a mapping between the spatio-temporal data and the movement intentions, and then infer the social relationships between users, which improves the accuracy of the inference.
Brief Description of the Drawings
Fig. 1 is a flowchart of a social relationship inference method provided by an embodiment of the present invention;
Fig. 2 is a comparison chart of the experimental results of the social relationship inference method provided by an embodiment of the present invention;
Fig. 3 is a structural diagram of a social relationship inference system provided by an embodiment of the present invention.
Detailed Description
Specific implementations of the present invention are described in further detail below with reference to the drawings and embodiments. The following embodiments illustrate the present invention but do not limit its scope.
Fig. 1 is a flowchart of a social relationship inference method provided by an embodiment of the present invention. As shown in Fig. 1, the method comprises:
S1. based on a tensor decomposition method, obtaining from spatio-temporal data a plurality of movement intentions of a first user and a plurality of movement intentions of a second user within the same time period;
S2. based on a preset multi-class classifier, obtaining, from the plurality of movement intentions of the first user and the plurality of movement intentions of the second user, the co-occurring movement intentions of the first user and the second user;
S3. based on an SVM binary classification model and the co-occurring movement intentions, inferring the social relationship between the first user and the second user.
In S1, the tensor decomposition method is CP decomposition, which expresses a tensor as a sum of a finite number of rank-one tensors. CP decomposition can effectively decompose the spatio-temporal characteristics present in spatio-temporal data. Based on the tensor decomposition, users' movement patterns, i.e. their movement intentions in different time periods, can be extracted from the spatio-temporal data; the movement intention is the movement purpose.
In S2, it can be understood that the multi-class classifier establishes a mapping function between user records and movement purposes. When there are multiple movement purposes, classifying the movement purpose can be treated as a multi-class classification problem, so a multi-class classifier is set up to determine the required movement-purpose features.
In S3, the SVM binary classification model is a conventional classifier and a very effective linear classifier. From the co-occurring destination information, the linear classifier can infer whether the first user and the second user belong to the stranger category or the acquaintance category.
The social relationship inference method and system provided by the present invention use tensor decomposition to extract, from users' spatio-temporal data, movement destinations that share the same spatio-temporal characteristics, thereby establishing a mapping between the spatio-temporal data and the movement destinations, and then infer the social relationships between users, which improves the accuracy of the inference.
On the basis of the above embodiment, step S1 comprises:
dividing the spatio-temporal data into a plurality of spatio-temporal sub-data according to preset time periods;
computing the tensor element values of the spatio-temporal sub-data in each time period;
extracting the plurality of movement intentions of the first user and the plurality of movement intentions of the second user based on the feature vectors of the tensor element values.
Specifically, for example, the data used comes from the Beijing public transport smart card and consists of card-swipe records on the major bus lines in urban Beijing from October 1, 2014 to October 31, 2014.
A tensor decomposition algorithm is used to extract the movement purposes in the data set. The value of each tensor element is computed as follows.
Every unique bus stop in the data is treated as a location, corresponding to one $r_i$. Two bus stops that are no more than 300 meters apart are treated as a single stop. Each day is divided into the following eight time periods:
00:00:00-04:59:59, 05:00:00-06:59:59, 07:00:00-08:59:59, 09:00:00-10:59:59, 11:00:00-13:59:59, 14:00:00-16:59:59, 17:00:00-19:59:59, 20:00:00-23:59:59.
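The eight time periods can be turned into a slot index with a simple lookup. The sketch below is only an illustration; the names `SLOT_START_HOURS` and `time_slot` are hypothetical, and it assumes the hour of the card swipe has already been extracted from the record.

```python
from bisect import bisect_right

# Start hour of each of the eight time periods of the day.
SLOT_START_HOURS = [0, 5, 7, 9, 11, 14, 17, 20]

def time_slot(hour):
    """Return the index (0-7) of the time period containing the given hour (0-23)."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    # bisect_right counts the start hours that are <= hour.
    return bisect_right(SLOT_START_HOURS, hour) - 1
```

For example, a swipe at 08:30 belongs to the third period (index 2) and a swipe at 23:10 to the last one (index 7).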
For each day from October 1 to October 31, the total number of passengers in each of the above time periods is counted, together with the number of passengers $Count(r_i, t_j, d_k)$ at each bus stop $r_i$; the value of each tensor element $Y_{ijk}$ is then obtained from these two counts.
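As a rough sketch of how such a tensor might be assembled, the code below assumes that each element is the stop's share of all passengers observed in the same time period on the same day. That normalization is an assumption made for illustration, since the exact formula is not reproduced in this text; `build_tensor` and its arguments are hypothetical names.

```python
from collections import Counter

def build_tensor(records, n_stops, n_slots, n_days):
    """Build Y[i][j][k] from (stop_index, slot_index, day_index) records.

    Assumption: each element is the stop's share of all passengers
    observed in that time slot on that day.
    """
    count = Counter(records)                        # Count(r_i, t_j, d_k)
    total = Counter((j, k) for _, j, k in records)  # all passengers in slot j, day k
    return [[[count[(i, j, k)] / total[(j, k)] if total[(j, k)] else 0.0
              for k in range(n_days)]
             for j in range(n_slots)]
            for i in range(n_stops)]

records = [(0, 0, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0)]
Y = build_tensor(records, n_stops=2, n_slots=2, n_days=1)
```

With the Beijing data, the three axes would have one entry per merged bus stop, eight time periods, and 31 days.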
After the values of all tensor elements have been obtained, a CP decomposition routine is applied to the tensor $Y$:
$$Y \approx \sum_{r=1}^{R} \lambda_r Y_r, \qquad Y_r = a_r \circ b_r \circ c_r$$
where $\lambda_r$ is a coefficient and $Y_r$ is a resulting rank-one tensor, which can be written as the outer product of three vectors $a_r$, $b_r$, $c_r$.
Here $a_r$, $b_r$, $c_r$ are the feature vectors over location, time, and date respectively, and each $Y_r$ is one movement purpose extracted from the data set. On the Beijing public transport smart card data set used here, seven movement purposes were extracted.
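To make the structure of the CP model concrete, the sketch below rebuilds a tensor from given coefficients and factor vectors, element by element. It does not fit the decomposition (a fitting routine such as alternating least squares would be needed for that); the function name `cp_reconstruct` and the toy factors are hypothetical.

```python
def cp_reconstruct(lams, A, B, C):
    """Rebuild Y[i][j][k] = sum_r lams[r] * A[r][i] * B[r][j] * C[r][k].

    A[r], B[r], C[r] are the location, time and date vectors of component r.
    """
    n_loc, n_time, n_date = len(A[0]), len(B[0]), len(C[0])
    return [[[sum(lam * a[i] * b[j] * c[k]
                  for lam, a, b, c in zip(lams, A, B, C))
              for k in range(n_date)]
             for j in range(n_time)]
            for i in range(n_loc)]

# Two hypothetical components over 2 locations, 2 time periods, 1 day.
lams = [2.0, 1.0]
A = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0, 0.0], [0.0, 1.0]]
C = [[1.0], [1.0]]
Y = cp_reconstruct(lams, A, B, C)
```

Each component here is active at exactly one location and one time period, which is the kind of pattern that would be read as a single movement purpose.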
It should be noted that the seven movement purposes extracted from the card-swipe records of the major bus lines in urban Beijing from October 1, 2014 to October 31, 2014 are merely the movement purposes inherent in this example; the embodiments of the present invention do not specifically limit the number of movement purposes present in other data sets.
The embodiment uses tensor decomposition to extract the users' movement purposes from the spatio-temporal data set and uses them as discriminative features. These features are easy to obtain and distinctive, and they provide the basis for the subsequent inference of social relationships between users.
On the basis of the above embodiment, before step S1 the method further comprises:
acquiring the spatio-temporal data, and training the multi-class classifier using part of the spatio-temporal data as a training data set.
Specifically, in the spatio-temporal data set obtained, 412 card-holding users were identified from the recorded times and places, and 2,796 pairs of acquainted users were determined through a survey. An existing algorithm was used to extract all co-occurrences among the 412 users, covering co-occurrences between both acquainted and unacquainted users. From the 2,796 confirmed acquainted pairs, 80% were randomly selected as training data to train a multi-class classifier that establishes a mapping function between user records and movement purposes.
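The random 80% selection can be sketched as follows. The fixed seed, the function name `split_pairs`, and the idea of keeping the remaining 20% as a held-out set are assumptions for illustration; the placeholder pair identifiers stand in for the real user pairs.

```python
import random

def split_pairs(pairs, train_fraction=0.8, seed=0):
    """Randomly split user pairs into a training part and a held-out part."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Placeholder identifiers for the 2,796 acquainted pairs.
pairs = [(f"u{i}", f"v{i}") for i in range(2796)]
train, held_out = split_pairs(pairs)
```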
On the basis of the above embodiment, acquiring the spatio-temporal data and training the multi-class classifier using part of the spatio-temporal data as a training data set comprises:
determining the classification features of the multi-class classifier based on the tensor element values;
training the multi-class classifier on the training data set based on the classification features.
Based on the tensor element values obtained in the tensor decomposition of S1, the tensor information can be divided into several classes of features. Determining which movement purpose each record corresponds to then amounts to determining which type of movement purpose the record maps to, and the record is classified according to these features.
On the basis of the above embodiment, step S2 comprises:
computing the classification feature values of the first user based on the plurality of movement intentions of the first user;
computing the classification feature values of the second user based on the plurality of movement intentions of the second user;
based on the multi-class classifier, comparing the classification feature values of the first user with those of the second user, and extracting the co-occurring movement intentions of the first user and the second user.
Specifically, the embodiment trains an Adaboost model using the existing Adaboost algorithm and the training data set. The trained Adaboost classification model maps the co-occurring destination information to movement-purpose pairs. With the seven movement purposes extracted in this embodiment, there are $C_7^2 = 21$ distinct movement-purpose pairs, and these pairs are the extracted co-occurrence features.
On the basis of the above embodiment, the classification features include:
one or more of spatial features, time features, and date features.
Based on the information given by the previously decomposed rank-one tensors and on some feature-engineering experiments, three broad classes of features were identified: spatial features, time features, and date features. The classification features provided by the embodiments may include one or more of these three classes, which are specified as follows.
Spatial features: the spatial features include location entropy, which measures how popular a location is and is computed as:
$$H(loc_k) = -\sum_{i} p_i \log p_i$$
where $p_i$ is the ratio of the number of times the $i$-th user appears at $loc_k$ to the number of times all users appear at $loc_k$.
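The location entropy can be computed directly from the list of visits to one location. The following is a minimal sketch; the natural logarithm is an assumption (the base is not stated), and `location_entropy` is a hypothetical name.

```python
import math
from collections import Counter

def location_entropy(visitors):
    """Shannon entropy of the visit distribution over users at one location.

    visitors: list with one user id per recorded visit to the location.
    """
    total = len(visitors)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log(n / total)
                for n in Counter(visitors).values())

private_place = location_entropy(["u1", "u1", "u1", "u1"])
public_place = location_entropy(["u1", "u2", "u3", "u4"])
```

A location visited by a single user has zero entropy, while a location visited equally by four users reaches the maximum for four users, log 4.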
The spatial features also include the location type, which indicates the category the location belongs to, such as bar or supermarket. The location type can be obtained through the application programming interfaces of location-based services.
The spatial features also include distance, which is the distance between the location in the current record and the location in the previous record.
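The distance feature could be computed as below, assuming every location comes with latitude and longitude in degrees (the patent does not say how locations are encoded). The haversine great-circle formula is one common choice; `haversine_m` and `distance_features` are hypothetical names.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(h))

def distance_features(points):
    """Distance from each record's location to the previous record's location.

    points: chronologically ordered (lat, lon) pairs of one user's records.
    The first record has no predecessor, so its distance is None.
    """
    if not points:
        return []
    out = [None]
    for (la1, lo1), (la2, lo2) in zip(points, points[1:]):
        out.append(haversine_m(la1, lo1, la2, lo2))
    return out
```

The same distance function could also serve to decide whether two bus stops lie within 300 meters of each other.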
Time features: the time features include the hour, which is the hour of the record's time and takes values from 0 to 23.
The time features also include the pseudo stay time, which is the difference between the time of the current record and the time of the previous record.
The time features also include the time interval, which measures a user's visiting habits at a given location. It equals the difference between the time of the current record and the time of the previous record at the same location.
The time features also include the second-order time interval, which likewise measures a user's visiting habits at a given location. It equals the difference between the time of the current record and the time of the second-to-last earlier record at the same location.
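The four time features can be derived in one pass over a user's records. The sketch assumes timestamps are integer seconds counted from a local midnight and that records are sorted by time; the function name and dictionary keys are hypothetical.

```python
def time_features(records):
    """Per-record time features for one user's chronologically sorted records.

    records: list of (timestamp_seconds, place) tuples, with timestamps
    counted from a local midnight (an assumption of this sketch).
    A feature is None when the required earlier record does not exist.
    """
    features = []
    history = {}   # place -> timestamps of earlier records at that place
    prev_ts = None
    for ts, place in records:
        seen = history.setdefault(place, [])
        features.append({
            "hour": (ts // 3600) % 24,
            "pseudo_stay": ts - prev_ts if prev_ts is not None else None,
            "interval": ts - seen[-1] if len(seen) >= 1 else None,
            "second_interval": ts - seen[-2] if len(seen) >= 2 else None,
        })
        seen.append(ts)
        prev_ts = ts
    return features

records = [(7 * 3600, "A"), (8 * 3600, "B"), (31 * 3600, "A"), (55 * 3600, "A")]
feats = time_features(records)
```

In this toy input the user visits place A at 07:00 on three consecutive days, so the last record has a time interval of one day and a second-order time interval of two days.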
Date features: the date features include the day of the week, with 0 to 6 denoting Sunday and Monday through Saturday.
The date features also include the date, with 0 to 31 denoting the day of the month.
The date features also include the date type, which indicates the category of the date. The embodiment provides three date categories: weekend, short holiday, and long holiday, corresponding respectively to the two-day weekend, holidays of three to five days such as the Mid-Autumn Festival, and holidays of more than five days.
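A possible encoding of the date features is sketched below. The `holiday_length` argument is a hypothetical input that would have to come from a holiday calendar, and the extra "workday" label for ordinary days is an assumption added so that every date receives a value.

```python
from datetime import date

def date_features(d, holiday_length=0):
    """Date features for a calendar date.

    holiday_length: hypothetical input giving the length in days of the
    holiday period containing d (0 if d is an ordinary working day).
    """
    if holiday_length == 0:
        date_type = "workday"        # assumption: not one of the three listed types
    elif holiday_length <= 2:
        date_type = "weekend"
    elif holiday_length <= 5:
        date_type = "short_holiday"
    else:
        date_type = "long_holiday"
    return {
        "weekday": (d.weekday() + 1) % 7,  # 0 = Sunday, 1 = Monday, ..., 6 = Saturday
        "day_of_month": d.day,
        "date_type": date_type,
    }
```

Python's `date.weekday()` numbers Monday as 0, so the result is shifted by one to match the Sunday-first convention described above.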
By dividing the classification features into spatial features, time features, and date features, the embodiment refines the classification and makes it more accurate.
On the basis of the above embodiment, step S3 comprises:
converting the co-occurring movement intentions of the first user and the second user into a multi-dimensional feature vector;
inferring the social relationship between the first user and the second user based on the SVM binary classification model and the multi-dimensional feature vector.
Specifically, for example, when the co-occurring movement intentions of the first user and the second user are obtained from the data of the above embodiment and seven movement purposes have been resolved, there are $C_7^2 = 21$ distinct movement-purpose pairs. All movement-purpose pairs of the two users are counted to form a 21-dimensional feature vector. Such 21-dimensional feature vectors are used to train the SVM binary classification model, and the 21-dimensional feature vector of a user pair is then used to judge whether the first user and the second user are strangers or acquaintances.
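One reading that is consistent with a 21-dimensional vector is to count, over all co-occurrences of two users, how often each unordered pair of distinct purposes occurs. The sketch below follows that reading; it is an assumption, and under it co-occurrences in which both users have the same purpose have no slot. The names `PAIR_INDEX` and `pair_feature_vector` are hypothetical.

```python
from itertools import combinations

N_PURPOSES = 7
# Index of every unordered pair of distinct purposes: C(7, 2) = 21 entries.
PAIR_INDEX = {pair: n for n, pair in enumerate(combinations(range(N_PURPOSES), 2))}

def pair_feature_vector(cooccurrences):
    """Count movement-purpose pairs over all co-occurrences of two users.

    cooccurrences: list of (purpose_of_user_1, purpose_of_user_2) tuples,
    each purpose an integer in 0..6.
    """
    vec = [0] * len(PAIR_INDEX)
    for p, q in cooccurrences:
        key = (min(p, q), max(p, q))
        if key in PAIR_INDEX:  # same-purpose pairs have no slot under this reading
            vec[PAIR_INDEX[key]] += 1
    return vec

vec = pair_feature_vector([(0, 1), (1, 0), (5, 6), (3, 3)])
```

The resulting vectors, one per user pair and labeled as acquaintance or stranger, would be the input to the binary SVM.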
Fig. 2 compares the inference results of the social relationship inference method provided by an embodiment of the present invention. As shown in Fig. 2, the solid black line is the inference result of the embodiment and the dashed black line is the inference result of the prior art. Fig. 2 shows that, in terms of inference accuracy, the embodiment is clearly better than the prior-art method based on location-entropy co-occurrence.
The social relationship inference method and system provided by the present invention use tensor decomposition to extract, from users' spatio-temporal data, movement destinations that share the same spatio-temporal characteristics, thereby establishing a mapping between the spatio-temporal data and the movement destinations, and then infer the social relationships between users, which improves the accuracy of the inference.
Fig. 3 is a structural diagram of a social relationship inference system provided by an embodiment of the present invention, comprising a tensor decomposition module 1, a multi-class classifier module 2, and an inference module 3, wherein:
the tensor decomposition module 1 is configured to obtain from spatio-temporal data, based on a tensor decomposition method, a plurality of movement intentions of a first user and a plurality of movement intentions of a second user within the same time period;
the multi-class classifier module 2 is configured to obtain, based on a preset multi-class classifier, the co-occurring movement intentions of the first user and the second user from the plurality of movement intentions of the first user and the plurality of movement intentions of the second user;
the inference module 3 is configured to infer the social relationship between the first user and the second user based on an SVM binary classification model and the co-occurring movement intentions.
For the specific social relationship inference method, refer to the above embodiments; it is not repeated here.
本实施例提供一种社交关系推断系统,包括:至少一个处理器;以及与所述处理器通信连接的至少一个存储器,其中:This embodiment provides a social relationship inference system, including: at least one processor; and at least one memory communicated with the processor, wherein:
所述存储器存储有可被所述处理器执行的程序指令,所述处理器调用所述程序指令以执行上述各方法实施例所提供的方法,例如包括:S1、基于张量分解法,从时空数据中获取同一时间段内,第一用户的多个移动意图以及第二用户的多个移动意图;S2、基于预设的多类分类器,从所述第一用户的多个移动意图以及第二用户的多个移动意图中,获取所述第一用户和所述第二用户共现的移动意图;S3、基于SVM二分类模型和所述共现的移动意图,推断所述第一用户和所述第二用户的社交关系。The memory stores program instructions that can be executed by the processor, and the processor calls the program instructions to execute the methods provided by the above method embodiments, for example, including: S1, based on the tensor decomposition method, from the space-time Obtain multiple mobile intentions of the first user and multiple mobile intentions of the second user within the same time period from the data; S2, based on a preset multi-class classifier, from the multiple mobile intentions of the first user and the second Among the multiple mobile intentions of two users, obtain the co-occurring mobile intentions of the first user and the second user; S3, based on the SVM binary classification model and the co-occurring mobile intentions, infer the first user and the co-occurring mobile intentions The social relationship of the second user.
本实施例公开一种计算机程序产品,所述计算机程序产品包括存储在非暂态计算机可读存储介质上的计算机程序,所述计算机程序包括程序指令,当所述程序指令被计算机执行时,计算机能够执行上述各方法实施例所提供的方法,例如包括:S1、基于张量分解法,从时空数据中获取同一时间段内,第一用户的多个移动意图以及第二用户的多个移动意图;S2、基于预设的多类分类器,从所述第一用户的多个移动意图以及第二用户的多个移动意图中,获取所述第一用户和所述第二用户共现的移动意图;S3、基于SVM二分类模型和所述共现的移动意图,推断所述第一用户和所述第二用户的社交关系。This embodiment discloses a computer program product, the computer program product includes a computer program stored on a non-transitory computer-readable storage medium, the computer program includes program instructions, and when the program instructions are executed by the computer, the computer The methods provided by the above-mentioned method embodiments can be executed, for example, including: S1. Based on the tensor decomposition method, obtain multiple mobile intentions of the first user and multiple mobile intentions of the second user within the same time period from the spatio-temporal data ; S2, based on a preset multi-class classifier, from the multiple mobile intentions of the first user and the multiple mobile intentions of the second user, obtain the co-occurring movement of the first user and the second user Intent; S3. Based on the SVM binary classification model and the co-occurring mobile intention, infer the social relationship between the first user and the second user.
This embodiment provides a non-transitory computer-readable storage medium. The non-transitory computer-readable storage medium stores computer instructions, and the computer instructions cause a computer to perform the methods provided by the foregoing method embodiments, for example: S1, based on tensor decomposition, obtaining from spatio-temporal data multiple movement intentions of a first user and multiple movement intentions of a second user within the same time period; S2, based on a preset multi-class classifier, obtaining, from the multiple movement intentions of the first user and the multiple movement intentions of the second user, the movement intentions that co-occur for the first user and the second user; S3, based on an SVM binary classification model and the co-occurring movement intentions, inferring the social relationship between the first user and the second user.
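As a rough illustration of how steps S2 and S3 might fit together, the following Python sketch counts the movement intentions two users share per time slot and feeds those counts to a linear SVM. It is only a sketch under several assumptions that do not come from the application: the intention categories in `CATEGORIES` are hypothetical, the per-slot intention labels are taken as given (standing in for the output of the tensor decomposition in S1 and the multi-class classifier in S2), the per-category count feature is one possible encoding of "co-occurring movement intentions", and the SVM is fitted with plain subgradient descent on the hinge loss because the text does not name a solver. The training pairs are made-up toy values.

```python
from collections import Counter

# Hypothetical intention categories; the application does not list them here.
CATEGORIES = ["work", "dining", "leisure"]


def cooccurrence_features(intents_a, intents_b):
    """Count, per category, the time slots where both users share an intention.

    intents_a and intents_b map a time slot to that user's intention label.
    These labels stand in for the output of steps S1 and S2.
    """
    counts = Counter(
        label for slot, label in intents_a.items()
        if intents_b.get(slot) == label
    )
    return [float(counts[c]) for c in CATEGORIES]


def train_linear_svm(samples, labels, epochs=200, lr=0.1, lam=0.01):
    """Fit a linear SVM by subgradient descent on the L2-regularised hinge loss.

    Labels are +1 (socially related) or -1 (not related).
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Hinge loss is active: move towards the sample and shrink w.
                w = [wi - lr * (lam * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Only the regulariser contributes to the subgradient.
                w = [wi - lr * lam * wi for wi in w]
    return w, b


def predict(w, b, x):
    """Return +1 if the pair is classified as related, otherwise -1."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Toy training pairs (illustrative values only): co-occurrence counts and labels.
toy_samples = [[3.0, 2.0, 1.0], [0.0, 0.0, 1.0], [2.0, 3.0, 2.0], [0.0, 1.0, 0.0]]
toy_labels = [1, -1, 1, -1]
w, b = train_linear_svm(toy_samples, toy_labels)

# Feature vector for a new user pair, built from per-slot intention labels.
pair = cooccurrence_features(
    {0: "work", 1: "dining", 2: "leisure", 3: "work"},
    {0: "work", 1: "leisure", 2: "leisure", 3: "work"},
)
related = predict(w, b, pair)
```

In practice a library SVM with a tuned kernel and regularisation constant would likely replace the hand-written trainer; the point here is only the data flow from per-user intentions to a pairwise feature vector to a binary decision.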
Those of ordinary skill in the art will understand that all or some of the steps of the foregoing method embodiments may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the foregoing method embodiments. The storage medium includes any medium that can store program code, such as ROM, RAM, a magnetic disk, or an optical disc.
From the description of the above implementations, those skilled in the art will clearly understand that each implementation may be realized by software together with a necessary general-purpose hardware platform, or alternatively by hardware. On this understanding, the essence of the above technical solution, or the part of it that contributes over the prior art, may be embodied as a software product. The computer software product may be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments or in parts of the embodiments.
Finally, the method of the present application is merely a preferred embodiment and is not intended to limit the scope of protection of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710552281.5A CN107563402A (en) | 2017-07-07 | 2017-07-07 | A kind of social networks estimating method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107563402A (en) | 2018-01-09 |
Family
ID=60973071
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710552281.5A Pending CN107563402A (en) | 2017-07-07 | 2017-07-07 | A kind of social networks estimating method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107563402A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110675192A (en) * | 2019-09-27 | 2020-01-10 | 深圳市掌众信息技术有限公司 | Intimacy mining method, advertisement pushing method and system |
CN111553386A (en) * | 2020-04-07 | 2020-08-18 | 哈尔滨工程大学 | AdaBoost and CNN-based intrusion detection method |
CN113115200A (en) * | 2019-12-24 | 2021-07-13 | 中国移动通信集团浙江有限公司 | User relationship identification method and device and computing equipment |
CN114611016A (en) * | 2020-12-08 | 2022-06-10 | 中移(苏州)软件技术有限公司 | Relationship determination method and device, electronic equipment and storage medium |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110675192A (en) * | 2019-09-27 | 2020-01-10 | 深圳市掌众信息技术有限公司 | Intimacy mining method, advertisement pushing method and system |
CN113115200A (en) * | 2019-12-24 | 2021-07-13 | 中国移动通信集团浙江有限公司 | User relationship identification method and device and computing equipment |
CN113115200B (en) * | 2019-12-24 | 2023-04-18 | 中国移动通信集团浙江有限公司 | User relationship identification method and device and computing equipment |
CN111553386A (en) * | 2020-04-07 | 2020-08-18 | 哈尔滨工程大学 | AdaBoost and CNN-based intrusion detection method |
CN111553386B (en) * | 2020-04-07 | 2022-05-20 | 哈尔滨工程大学 | An Intrusion Detection Method Based on AdaBoost and CNN |
CN114611016A (en) * | 2020-12-08 | 2022-06-10 | 中移(苏州)软件技术有限公司 | Relationship determination method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180109 |