Detailed Description of the Embodiments
To help those skilled in the art better understand the technical solutions in one or more embodiments of this specification, those solutions are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of this specification. All other embodiments obtained by a person of ordinary skill in the art from one or more embodiments of this specification without creative effort shall fall within the scope of protection of this disclosure.
At least one embodiment of this specification provides a social circle mining method, which can be used to mine the social circles of a given user. For example, a social circle may be the user's family-and-friends circle, colleague circle, or classmate circle. The embodiments of this specification do not limit the scenario in which a mined social circle is used. For example, the social circle may be applied to credit evaluation of the user, or to product recommendation or friend recommendation for the user.
Fig. 1 illustrates a social circle mining method provided by at least one embodiment of this specification. The method may include the following steps.
In step 100, an interpersonal relationship network is constructed.
In this step, the interpersonal relationship network is the network formed by establishing association relationships between users. Association relationships between users are very broad, and the basis for establishing a relationship may differ across companies and business scenarios. For example, an association relationship may be established when users transfer money to each other, send each other red envelopes, or add each other as friends. Likewise, if user A tops up the mobile phone credit of user B, an association relationship between A and B may be established.
When constructing the interpersonal relationship network, the users between whom a relationship is established should, as far as possible, actually know each other. How to determine that users know each other may vary with the business scenario. For example, for users connected by transfers, two users may be considered to know each other if the number of transfers between them is greater than a certain threshold.
In addition, the interpersonal relationship network constructed in this step includes multiple users that have association relationships with one another. Each user can serve as a network node in the network, and if two network nodes are joined by a connecting edge, an association relationship exists between the users corresponding to those nodes.
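The construction described in step 100 can be sketched as follows. This is a minimal illustration, assuming that the only relationship source is a list of transfer records and that an edge is created when the number of transfers between two users exceeds a threshold. The function name `build_relationship_network`, the record format, and the threshold value are assumptions made for illustration and do not come from this specification.

```python
from collections import defaultdict

def build_relationship_network(transfers, threshold=2):
    """Build an undirected adjacency map from (payer, payee) transfer records.

    Two users are linked only when the number of transfers between them,
    in either direction, is greater than `threshold` (an assumed value).
    """
    pair_counts = defaultdict(int)
    for payer, payee in transfers:
        if payer != payee:
            pair_counts[frozenset((payer, payee))] += 1

    network = defaultdict(set)
    for pair, count in pair_counts.items():
        if count > threshold:
            a, b = tuple(pair)
            network[a].add(b)  # each user is a network node
            network[b].add(a)  # the edge is stored in both directions
    return dict(network)

# Hypothetical records: A and B transfer three times, A and C only once.
transfers = [("A", "B"), ("B", "A"), ("A", "B"), ("A", "C")]
network = build_relationship_network(transfers, threshold=2)
```

With these hypothetical records, only A and B end up connected, because the single transfer between A and C does not exceed the threshold. Other relationship sources mentioned above, such as red envelopes or phone top-ups, could feed the same edge-counting step.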
The interpersonal relationship network is equivalent to a database of association relationships. To mine the social circles of a user, it is sufficient that the user is a network node in the network; the social circles can then be mined from it. For example, assume that the network constructed in this step includes user A, user B, and user C. When the social circles of user A are to be mined as one basis for the credit evaluation of user A, they can be mined from this network; similarly, when the social circles of user B are to be mined, they can also be mined from this network.
Assume that the social circles of user C are currently to be mined. User C is then referred to as the target user, and steps 102 to 108 below are performed to mine the social circles of user C from the interpersonal relationship network.
In addition, the interpersonal relationship network of this step, as the basis for social circle mining, can be continuously updated and improved. For example, incorrect user association relationships in the network can be corrected, and newly discovered user association relationships can be added to the network.
In step 102, the local network to which the target user belongs is extracted from the interpersonal relationship network.
In this step, the target user is one of the network nodes of the interpersonal relationship network.
The interpersonal relationship network constructed in step 100 may be very large, for example on the order of a hundred million nodes and ten billion edges. If community division were performed directly on the entire network, resource consumption would be high and the results might also be poor. Therefore, this step extracts from the interpersonal relationship network the local network to which the target user belongs. The local network is a network that contains the target user and is a part of the interpersonal relationship network, and every network node in it is directly or indirectly associated with the network node corresponding to the target user.
For example, the local network may be the N-degree local network of the target user, where N is a natural number. In the N-degree local network, an edge node is connected to the start node through N consecutive connecting edges, and the start node is the network node corresponding to the target user.
The following examples illustrate the local network for N = 1 and N = 2; the same approach extends to N greater than 2.
Fig. 2 illustrates the first-degree local network of network node 21, which corresponds to the target user. To extract it, network node 21 is taken as the start node, and each neighbor node directly connected to the start node is obtained. For example, the neighbor nodes may include node 22, node 23, and node 24; these are the first-degree neighbor nodes of the start node. Each first-degree neighbor node has a connecting edge to the start node: for example, connecting edge L1 lies between node 22 and node 21, and connecting edge L2 lies between node 23 and node 21. Connecting edges may also exist between first-degree neighbor nodes; for example, connecting edge L3 lies between node 22 and node 23. The network shown in Fig. 2, formed by the first-degree neighbor nodes and the start node, can serve as the first-degree local network of the target user.
Fig. 3 illustrates the second-degree local network of network node 21, which corresponds to the target user. Taking each first-degree neighbor node as a start node, each neighbor node directly connected to that first-degree neighbor node is obtained as a second-degree neighbor node of the target user. For example, node 25 is a neighbor node directly connected to first-degree neighbor node 26 and can be called a second-degree neighbor node. Node 27 is a neighbor node directly connected to first-degree neighbor node 23 and is likewise a second-degree neighbor node. The network formed by network node 21, its first-degree neighbor nodes, and its second-degree neighbor nodes can be called the second-degree local network of the target user.
When N is a natural number greater than 2, the N-degree local network is extracted in a way similar to the first-degree and second-degree cases above. For example, when N = 3, the neighbor nodes directly connected to each second-degree neighbor node are obtained on the basis of the second-degree local network, giving the third-degree neighbor nodes. Adding these third-degree neighbor nodes, and the connecting edges between nodes, to the second-degree local network yields the third-degree local network. Local networks for other values of N are extracted similarly and are not described again.
As described above, an N-degree local network with N greater than 1 can be generated as follows. When N = i, where i is a natural number greater than 1, after the (i-1)-degree local network of the target user is obtained, the method further includes: taking each (i-1)-degree neighbor node as a start node, obtaining each neighbor node directly connected to that (i-1)-degree neighbor node as an i-degree neighbor node of the target user; and taking the network formed by the (i-1)-degree local network and the i-degree neighbor nodes as the i-degree local network of the target user.
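The degree-by-degree expansion above amounts to a breadth-first search that stops after N hops. The following is a minimal sketch, assuming the network is held in memory as a symmetric adjacency map (a dictionary from node to a set of neighbors). The function name `extract_local_network` and the toy network are illustrative assumptions.

```python
from collections import deque

def extract_local_network(network, target, n):
    """Return (local_network, degree) for the N-degree local network of `target`.

    `degree[node]` is the number of connecting edges on the shortest path
    from the target to `node`; nodes further than `n` edges away are left out.
    """
    degree = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        if degree[node] == n:
            continue  # edge node: do not expand beyond N degrees
        for neighbor in network.get(node, ()):
            if neighbor not in degree:
                degree[neighbor] = degree[node] + 1
                queue.append(neighbor)
    # Keep every connecting edge whose two endpoints are both in the local network.
    local = {node: {m for m in network.get(node, ()) if m in degree}
             for node in degree}
    return local, degree

# Hypothetical network: T is the target user, d is three edges away from T.
network = {
    "T": {"a", "b"},
    "a": {"T", "b"},
    "b": {"T", "a", "c"},
    "c": {"b", "d"},
    "d": {"c"},
}
local1, degree1 = extract_local_network(network, "T", 1)
local2, degree2 = extract_local_network(network, "T", 2)
```

Here `local1` contains T, a, and b together with the edge between a and b, and `local2` additionally contains c. For a network with billions of edges the adjacency map would not fit in one process, so a production system would need a distributed graph store; that aspect is outside this sketch.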
In step 104, community division is performed on the local network to obtain at least one community network.
Each local network is in fact the network formed by the connections among one user's contacts. Every user gets to know different people at different stages of life, and there is a pattern here: people met during the same stage usually also know each other. For example, most people met during middle school are classmates, and classmates mostly know one another. People met during different stages seldom know each other; for example, middle-school classmates and colleagues from working life are mostly strangers to one another. In the network structure, this pattern appears as follows: contacts from the same stage tend to form one community, and contacts from different stages belong to different communities.
In this step, a community discovery algorithm can be used to identify at least one community included in the local network. Each community can correspond to one user relationship group: for example, the users of one community may be family and friends, the users of another may be colleagues, and the users of a third may be classmates. A community discovery (Community Detection) algorithm finds the community structure in a network and can also be regarded as a kind of clustering algorithm. Many such algorithms exist, for example the Louvain clustering algorithm and the Fast Unfolding algorithm.
Common experience also suggests that some contacts recur across adjacent stages of life. For example, B, a high-school classmate of user A, is admitted to the same university as A, so B is in both A's high-school classmate community and A's university classmate community. In the network structure, this appears as one node belonging to multiple communities. Therefore, for the at least one community obtained by identification, an overlapping community discovery algorithm can also be used to identify overlapping network nodes that appear in multiple communities and to assign each such node to all the communities it belongs to, thereby obtaining at least one community network.
For example, referring to Fig. 4, node 41 is located in both community S1 and community S2, and can be called an overlapping network node. An overlapping community discovery algorithm can identify such overlapping network nodes and assign each of them to the multiple communities to which it belongs. For example, the finally obtained community network S1 includes node 41, and community network S2 also includes node 41.
There are also many overlapping community discovery algorithms, for example the COPRA algorithm (a label-propagation-based community discovery algorithm proposed by Gregory in 2010). One overlapping community discovery algorithm is described below:
1) For each network node u in the local network G, a connected-graph partitioning algorithm can be used to cut the local network G, forming multiple sub-networks. For each sub-network egonet_i, a new node u_i is created and connected to all nodes in egonet_i, and the original node u is deleted.
2) The newly generated network is cut using the connected-graph partitioning algorithm, forming multiple connected subgraphs. "Connected-graph partitioning" means separating a graph G into multiple connected subgraphs such that any two nodes i and j are in the same subgraph if and only if there is a path between i and j.
3) Each node u_i in each connected subgraph is mapped back to the original node u from before step 1). Each connected subgraph is then one community network.
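The three steps above can be sketched in code under one reading of step 1), which is an assumption rather than something the text states outright: the name egonet_i is taken to mean that the sub-networks are the connected components of the ego network of u (the neighbors of u, with u itself left out), and each original edge u-v is re-attached to the copy of u whose sub-network contains v and to the copy of v whose sub-network contains u. The function names and the toy graph are hypothetical.

```python
from collections import defaultdict

def connected_components(nodes, adjacency):
    """Split `nodes` into connected components, following only edges inside `nodes`."""
    nodes = set(nodes)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        component, stack = set(), [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in adjacency.get(node, ()):
                if neighbor in nodes and neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components

def overlapping_communities(graph):
    """`graph` is a symmetric adjacency map; nodes without neighbors are ignored."""
    # Step 1: replace each node u by one copy (u, i) per component of its ego network.
    copy_of = {}  # (u, v) -> the copy of u whose ego-network component contains v
    for u, neighbors in graph.items():
        for i, part in enumerate(connected_components(neighbors, graph)):
            for v in part:
                copy_of[(u, v)] = (u, i)
    split_graph = defaultdict(set)
    for (u, v), u_copy in copy_of.items():
        split_graph[u_copy].add(copy_of[(v, u)])
    # Step 2: connected components of the newly generated network.
    # Step 3: map every copy (u, i) back to its original node u.
    return [{u for u, _ in component}
            for component in connected_components(split_graph, split_graph)]

# Hypothetical local network: two triangles that share node "u".
graph = {
    "u": {"a", "b", "c", "d"},
    "a": {"u", "b"}, "b": {"u", "a"},
    "c": {"u", "d"}, "d": {"u", "c"},
}
communities = overlapping_communities(graph)
```

In this toy graph the result contains two communities, {u, a, b} and {u, c, d}, so node u is an overlapping network node that is kept in both. Plain connected components are a coarse partitioner; on dense real-world networks a finer algorithm would probably be needed at either step.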
After each community network has been identified, its type can be determined.
In this example, the type is identified using a type decision model. The input of the model is the community aggregation feature of a community network, and its output is the social circle type to which that community network belongs.
The training of the type decision model is described first. The type decision model is a multi-class classification model. For example, the social circle types to which a community network may belong can include the family-and-friends circle, the classmate circle, and the colleague circle; identifying which social circle type a community network belongs to is a multi-class classification problem. The type decision model may use, for example, logistic regression or a random forest; this example uses a random forest model.
Model training first requires determining the samples and the sample features to be used; the samples and sample features can then be used to train the model and obtain the type decision model. The following therefore mainly describes how to obtain the samples and how to obtain the sample features.
1) Sample generation:
Sample generation means obtaining social circles, for example classmate circles or family-and-friends circles.
For example, samples can be labeled manually by means of questionnaire surveys. A questionnaire survey can reveal the relationships among multiple users, which may be family and friends or may be classmates.
As another example, samples can also be obtained on the basis of an assumption. The assumption may be: if a user and most of the users in a social circle are colleagues, the social circle is the user's colleague circle, and likewise for family-and-friends circles and classmate circles. Under this assumption, only pair-level family-and-friends, colleague, or classmate relationships need to be obtained in order to generate samples at the granularity of relationship circles. Such pair-level relationship type data can come from many different sources. Taking Alipay as an example, the intimate-payment service contains many family-and-friends relationships, and the campus card top-up service contains many classmate relationships.
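The majority assumption above can be sketched as a small labeling routine. This is only an illustration: the function name `label_circle`, the `pair_relations` lookup keyed by user pairs, and the reading of "most" as a share greater than 0.5 are all assumptions, not details given in this specification.

```python
from collections import Counter

def label_circle(user, members, pair_relations, min_share=0.5):
    """Label a circle with the relationship type `user` has with most members.

    `pair_relations` maps frozenset({u, v}) to a known pair-level type such as
    "family", "colleague", or "classmate". Returns None when no single type
    covers more than `min_share` of the other members.
    """
    others = [m for m in members if m != user]
    if not others:
        return None
    counts = Counter(
        pair_relations[frozenset((user, m))]
        for m in others
        if frozenset((user, m)) in pair_relations
    )
    if not counts:
        return None
    relation, count = counts.most_common(1)[0]
    return relation if count / len(others) > min_share else None

# Hypothetical pair-level data for user A.
relations = {
    frozenset(("A", "B")): "colleague",
    frozenset(("A", "C")): "colleague",
    frozenset(("A", "D")): "classmate",
}
label = label_circle("A", ["A", "B", "C", "D"], relations)
unlabeled = label_circle("A", ["A", "B", "E", "F"], relations)
```

In the first call two of the three other members are colleagues of A, so the circle is labeled as a colleague circle; in the second call only one of three members has a known relationship, so no label is produced and the circle would not be used as a sample.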
2) Sample features:
After the social circle samples are obtained, the sample features can be obtained as follows:
First, at least one basic feature is determined as the basis for identifying the social circle type.
For example, the basic features of a user may include age, gender, surname, school, address, and registered household location. Each of these is called a basic feature, and every user in a social circle may have each basic feature, for example the user's age.
Second, for each basic feature, the values of that basic feature for all users in the community network are aggregated to obtain the user aggregation features corresponding to the basic feature.
For example, in determining the social circle type, group features at the granularity of the social circle can be extracted.
For example, the basic features of the users in a community network can be aggregated to obtain user aggregation features. Taking age as an example, the ages of all users can be aggregated, and the aggregation methods include but are not limited to the mode, the variance, and the information entropy.
Mode: the mode (Mode) is a statistical term for the value at the point of clear central tendency in a statistical distribution; it represents the general level of the data.
Variance: the variance measures the degree of dispersion of a random variable or of a set of data.
Information entropy: information entropy measures how disordered a system is. In general, which symbol a source will emit is uncertain, and that uncertainty can be measured by the symbol's probability of occurrence. A high probability means the symbol occurs often and the uncertainty is small; conversely, the uncertainty is large.
Taking age as an example, the ages of all users in a social circle form a set:
$$S = \{age_1, age_2, \ldots, age_N\};$$
The mode of the age (Mode) is the age that occurs most often in the set S;
The variance of the age (Variance) is the mean of the squared differences between each age value and the mean age;
The information entropy of the age (Entropy) is the degree of disorder of the ages, and is calculated as follows:
$$\mathrm{Entropy}(S) = -\sum_{i} p_i \log p_i$$
where $p_i$ is the proportion of occurrences of each age value in the set S.
Each of the mode of the age, the variance of the age, and the information entropy of the age can be called a "user aggregation feature" corresponding to the basic feature "age". One basic feature may correspond to at least one user aggregation feature; in the example above, the basic feature "age" corresponds to three: the mode of the age, the variance of the age, and the information entropy of the age. Similarly, every other basic feature can also correspond to at least one user aggregation feature.
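The three aggregations for the basic feature "age" can be sketched as follows. The sketch assumes the population variance (dividing by N) and a base-2 logarithm for the entropy, neither of which is fixed by the text above; the function name `aggregate_feature` and the sample ages are illustrative.

```python
import math
from collections import Counter

def aggregate_feature(values):
    """Return the mode, variance, and information entropy of one basic feature."""
    counts = Counter(values)
    n = len(values)
    mode = counts.most_common(1)[0][0]  # on ties, the value seen first wins
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n  # population variance (assumed)
    # p_i is the share of each distinct value; log base 2 is an assumption.
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {"mode": mode, "variance": variance, "entropy": entropy}

ages = [30, 30, 32, 34]  # hypothetical ages of the users in one community network
age_features = aggregate_feature(ages)
print(age_features)  # {'mode': 30, 'variance': 2.75, 'entropy': 1.5}
```

A circle of classmates would be expected to show a small age variance and low age entropy, whereas a family circle would typically spread across generations; this is the kind of signal the aggregated features are meant to expose to the type decision model.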
Third, the set of the user aggregation features corresponding to all basic features is taken as the community aggregation feature. If the basic features include "age" and "surname", the set consisting of the mode, variance, and information entropy of the age and the mode, variance, and information entropy of the surname can be called the community aggregation feature.
The community aggregation feature can serve as the sample feature of a social circle and as the model input in social circle type identification, and the model is trained accordingly.
The type decision model can be trained in advance, and the trained model can then be applied when mining the social circles of a target user. After each community network has been identified, the pre-trained type decision model can be used to identify the social circle type to which the community network belongs.
In step 106, for each community network, the basic features of the users in the community network are aggregated to obtain the community aggregation feature corresponding to the community network.
For example, at least one basic feature may be determined as the basis for identifying the social circle type. For each basic feature, the values of that basic feature for all users in the community network are aggregated to obtain the user aggregation features corresponding to the basic feature, where one basic feature corresponds to at least one user aggregation feature. The set of the user aggregation features corresponding to all basic features is taken as the community aggregation feature.
In step 108, the community aggregation feature is input, as the input parameter, into the pre-trained type decision model to obtain the social circle type to which the community network belongs.
For example, assume that three community networks are identified in the local network of the target user. The type decision model can identify the type of each community network, which may be a family-and-friends circle, a classmate circle, and so on.
The social circle mining method described above combines network structure features with techniques such as community discovery and group feature mining, and identifies the various types of social circles of a user, such as family-and-friends circles and colleague circles, from the group features of each community. As a result, social relationship mining becomes deeper and more specific, and the mining of user social circles becomes more accurate.
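How steps 102 to 108 fit together can be sketched as a small driver function. This is only an outline under stated assumptions: the four callables are hypothetical placeholders supplied by the caller, and the lambdas in the usage example are toy stand-ins, not a real community discovery algorithm or a trained type decision model.

```python
def mine_social_circles(network, target, extract, divide, aggregate, classify):
    """Run steps 102 to 108 for one target user.

    extract(network, target) -> local network            (step 102)
    divide(local_network)    -> list of community sets   (step 104)
    aggregate(community)     -> community aggregation feature (step 106)
    classify(features)       -> social circle type       (step 108)
    """
    local_network = extract(network, target)
    communities = divide(local_network)
    results = []
    for community in communities:
        features = aggregate(community)
        results.append((community, classify(features)))
    return results

# Toy stand-ins, used only to show the data flow between the steps.
result = mine_social_circles(
    network={"C": {"X", "Y"}, "X": {"C"}, "Y": {"C"}},
    target="C",
    extract=lambda net, user: net,
    divide=lambda local: [set(local)],
    aggregate=lambda community: {"size": len(community)},
    classify=lambda features: "classmate circle" if features["size"] >= 3 else "unknown",
)
```

Passing the steps in as callables keeps the driver independent of any particular extraction depth, community discovery algorithm, or classifier, which mirrors the way the text leaves those choices open.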
Corresponding to the social circle mining method described above, Fig. 5 is a schematic structural diagram of a social circle mining apparatus provided by at least one embodiment of this specification. The apparatus is used to mine the social circles of a target user from an interpersonal relationship network, where the target user is one of the network nodes of the interpersonal relationship network. The apparatus may include a network extraction module 51, a community division module 52, a feature obtaining module 53, and a type identification module 54.
The network extraction module 51 is configured to extract, from the interpersonal relationship network, the local network to which the target user belongs;
The community division module 52 is configured to perform community division on the local network to obtain at least one community network;
The feature obtaining module 53 is configured to, for each community network, aggregate the basic features of the users in the community network to obtain the community aggregation feature corresponding to the community network;
The type identification module 54 is configured to input the community aggregation feature, as the input parameter, into the pre-trained type decision model to obtain the social circle type to which the community network belongs.
In one example, the network extraction module 51 is specifically configured to extract the N-degree local network of the target user from the interpersonal relationship network, where N is a natural number. In the N-degree local network, an edge node is connected to the start node through N consecutive connecting edges, and the start node is the network node corresponding to the target user.
In one example, the community division module 52 is specifically configured to: identify, using a community discovery algorithm, at least one community included in the local network; for the at least one community obtained by identification, identify, using an overlapping community discovery algorithm, overlapping network nodes that appear in multiple communities; and assign each overlapping network node to the multiple communities to which it belongs, obtaining at least one community network.
In one example, the feature obtaining module 53 is specifically configured to: determine at least one basic feature as the basis for identifying the social circle type; for each basic feature, aggregate the values of that basic feature for all users in the community network to obtain the user aggregation features corresponding to the basic feature, where one basic feature corresponds to at least one user aggregation feature; and take the set of the user aggregation features corresponding to all basic features as the community aggregation feature.
The apparatuses or modules described in the above embodiments may be implemented by a computer chip or entity, or by a product having a certain function. A typical implementation device is a computer, which may take the form of a personal computer, a laptop computer, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an e-mail sending and receiving device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above apparatus is described in terms of separate modules divided by function. Of course, when one or more embodiments of this specification are implemented, the functions of the modules may be realized in one or more pieces of software and/or hardware.
The steps of the processes shown in the drawings are not limited to the order given in the flowcharts. In addition, each step may be implemented as software, hardware, or a combination of the two. For example, a person skilled in the art may implement a step as software code, that is, as computer-executable instructions that realize the logical function corresponding to the step. When implemented in software, the executable instructions may be stored in a memory and executed by a processor in the device.
For example, corresponding to the above method, one or more embodiments of this specification also provide a social circle mining device. The device may include a processor, a memory, and computer instructions stored in the memory and executable on the processor. By executing the instructions, the processor implements the following steps:
extracting, from an interpersonal relationship network, the local network to which a target user belongs, where the target user is one of the network nodes of the interpersonal relationship network;
performing community division on the local network to obtain at least one community network;
for each community network, aggregating the basic features of the users in the community network to obtain the community aggregation feature corresponding to the community network;
inputting the community aggregation feature, as the input parameter, into a pre-trained type decision model to obtain the social circle type to which the community network belongs.
It should also be noted that the terms "include", "comprise", and any other variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, commodity, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, commodity, or device. Without further limitation, an element preceded by "including a ..." does not exclude the existence of additional identical elements in the process, method, commodity, or device that includes the element.
Those skilled in the art will understand that one or more embodiments of this specification may be provided as a method, a system, or a computer program product. Accordingly, one or more embodiments of this specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, one or more embodiments of this specification may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
One or more embodiments of this specification may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. One or more embodiments of this specification may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
The embodiments in this specification are described in a progressive manner. For the same or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the apparatus and device embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the description of the method embodiments.
Specific embodiments of this specification have been described above. Other embodiments fall within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The foregoing describes only preferred embodiments of one or more embodiments of this specification and is not intended to limit them. Any modification, equivalent replacement, or improvement made within the spirit and principles of one or more embodiments of this specification shall fall within the scope of protection of one or more embodiments of this specification.