CN106874931A - User portrait grouping method and device - Google Patents
User portrait grouping method and device
- Publication number: CN106874931A
- Application number: CN201611259956.9A
- Authority: CN (China)
- Prior art keywords: network, label, vertex, line graph, user
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Business, Economics & Management (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Probability & Statistics with Applications (AREA)
- Economics (AREA)
- General Health & Medical Sciences (AREA)
- Human Resources & Organizations (AREA)
- Marketing (AREA)
- Primary Health Care (AREA)
- Strategic Management (AREA)
- Tourism & Hospitality (AREA)
- General Business, Economics & Management (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
Embodiments of the present disclosure relate to a user portrait grouping method and device in the field of data analysis, which can make user portrait grouping more stable and accurate. The method includes: building a label network G(V, E) based on the labels of each user portrait, where V denotes the vertex set of the label network, each vertex represents one user portrait, E denotes the edge set of the label network, and each edge connects the two vertices corresponding to two user portraits that share at least one label; building a line graph network G'(V', E', W') based on the label network, where V' denotes the vertex set of the line graph network, each vertex represents one edge of the label network, E' denotes the edge set of the line graph network, each edge connects the two vertices of the line graph network that correspond to two edges of the label network sharing a common vertex, and W' denotes the set of edge weights of the line graph network; performing community division on the vertices of the line graph network; and converting the community division result into a user portrait grouping result.
Description
Technical field
Embodiments of the present disclosure relate to the field of data analysis, and in particular to a user portrait grouping method and device.
Background
The biggest change that user portrait grouping brings to the network marketing environment is that it breaks down data silos and makes it possible to truly understand users: isolated users in a social network can be associated with one another and grouped. Current user portrait grouping methods depend heavily on parameters. For example, the K-means algorithm depends strongly on the value of k and on the choice of the initial center vectors, which makes the user portrait grouping result unstable.
Summary of the invention
The purpose of the embodiments of the present disclosure is to provide a user portrait grouping method and device that yield a stable user portrait grouping result.
To achieve this purpose, an embodiment of the present disclosure provides a user portrait grouping method, the method including:
building a label network G(V, E) based on the labels of each user portrait, where V denotes the vertex set of the label network, each vertex represents one user portrait, E denotes the edge set of the label network, and each edge connects the two vertices corresponding to two user portraits that share at least one label;
building a line graph network G'(V', E', W') based on the label network, where V' denotes the vertex set of the line graph network, each vertex of the line graph network represents one edge of the label network, E' denotes the edge set of the line graph network, each edge of the line graph network connects the two vertices of the line graph network that correspond to two edges of the label network sharing a common vertex, and W' denotes the set of edge weights of the line graph network;
performing, based on the line graph network, community division on the vertices of the line graph network; and
converting the community division result into a user portrait grouping result.
Optionally, the weights of the edges in the line graph network are computed by the following steps:
computing the weights of the edges in the label network;
based on the computed weights of the edges in the label network, computing the similarity between every two edges of the label network that share a common vertex; and
setting the weight of each edge in the line graph network equal to the similarity between the two label-network edges that correspond to the two vertices of that edge.
Optionally, the weights of the edges in the label network are computed by the following formula:
where i and j denote two vertices in the label network, e_ij denotes the edge between vertices i and j, and w_ij denotes the weight of edge e_ij.
Optionally, the similarity is computed by the following formula:
where i, j, k and m denote vertices in the label network; e_ik denotes the edge between vertices i and k; e_jk denotes the edge between vertices j and k; edges e_ik and e_jk share the common vertex k; N_i denotes the set consisting of all neighbor vertices of vertex i, with i ∈ N_i; N_j denotes the set consisting of all neighbor vertices of vertex j; w_im denotes the weight of the edge e_im between vertices i and m; and w_jm denotes the weight of the edge e_jm between vertices j and m.
Optionally, converting the community division result into a user portrait grouping result includes:
assigning the label-network vertices that correspond to line-graph-network vertices of the same community to the same user portrait group.
An embodiment of the present disclosure also provides a user portrait grouping device, the device including:
a label network building module, configured to build a label network G(V, E) based on the labels of each user portrait, where V denotes the vertex set of the label network, each vertex represents one user portrait, E denotes the edge set of the label network, and each edge connects the two vertices corresponding to two user portraits that share at least one label;
a line graph network building module, configured to build a line graph network G'(V', E', W') based on the label network, where V' denotes the vertex set of the line graph network, each vertex of the line graph network represents one edge of the label network, E' denotes the edge set of the line graph network, each edge of the line graph network connects the two vertices of the line graph network that correspond to two edges of the label network sharing a common vertex, and W' denotes the set of edge weights of the line graph network;
a community division module, configured to perform community division on the vertices of the line graph network based on the line graph network; and
a conversion module, configured to convert the community division result into a user portrait grouping result.
Optionally, the line graph network building module computes the weights of the edges in the line graph network as follows:
computing the weights of the edges in the label network;
based on the computed weights of the edges in the label network, computing the similarity between every two edges of the label network that share a common vertex; and
setting the weight of each edge in the line graph network equal to the similarity between the two label-network edges that correspond to the two vertices of that edge.
Optionally, the line graph network building module computes the weights of the edges in the label network by the following formula:
where i and j denote two vertices in the label network, e_ij denotes the edge between vertices i and j, and w_ij denotes the weight of edge e_ij.
Optionally, the line graph network building module computes the similarity by the following formula:
where i, j, k and m denote vertices in the label network; e_ik denotes the edge between vertices i and k; e_jk denotes the edge between vertices j and k; edges e_ik and e_jk share the common vertex k; N_i denotes the set consisting of all neighbor vertices of vertex i, with i ∈ N_i; N_j denotes the set consisting of all neighbor vertices of vertex j; w_im denotes the weight of the edge e_im between vertices i and m; and w_jm denotes the weight of the edge e_jm between vertices j and m.
Optionally, the conversion module is configured to assign the label-network vertices that correspond to line-graph-network vertices of the same community to the same user portrait group.
With the above technical solution, the user portrait grouping process requires no parameters and can be carried out based solely on the labels of the user portraits. The grouping result therefore does not depend on parameters, and a stable user portrait grouping result is obtained for a given set of labeled users. Compared with existing methods that divide user groups manually according to labels alone, labor cost is greatly reduced. In addition, because community division is performed on the line graph network, the same user portrait may be assigned to several different user portrait groups, which makes the user portrait grouping more accurate.
Other features and advantages of the embodiments of the present disclosure are described in detail in the detailed description that follows.
Brief description of the drawings
The accompanying drawings provide a further understanding of the embodiments of the present disclosure and form part of the specification. Together with the detailed description below, they serve to explain the embodiments of the present disclosure but do not limit them. In the drawings:
Fig. 1 is a flowchart of a user portrait grouping method according to an embodiment of the present disclosure.
Fig. 2 is a schematic diagram of a label network built according to an embodiment of the present disclosure.
Fig. 3 is a schematic diagram of a line graph network built according to an embodiment of the present disclosure.
Fig. 4 is a flowchart of computing the weights of the edges in the line graph network according to an embodiment of the present disclosure.
Fig. 5 is a schematic block diagram of a user portrait grouping device according to an embodiment of the present disclosure.
Detailed description
Specific implementations of the embodiments of the present disclosure are described in detail below with reference to the accompanying drawings. It should be understood that the specific implementations described here only illustrate and explain the embodiments of the present disclosure and do not limit them.
According to an embodiment of the present disclosure, a user portrait grouping method is provided. As shown in Fig. 1, the method may include the following steps S101 to S104.
In step S101, a label network G(V, E) is built based on the labels of each user portrait, where V denotes the vertex set of the label network, each vertex represents one user portrait, E denotes the edge set of the label network, and each edge connects the two vertices corresponding to two user portraits that share at least one label.
In marketing networks, social networks and the like, it is often desirable to analyze what characteristics the users behind a given metric possess, such as their demographic attributes and their behavioral characteristics. More importantly, such analysis helps find the causes behind product problems and, from them, opportunities or directions for effective product improvement. This requires user portrait grouping.
The core work of building a "portrait" of a user group is to attach "labels" to that group. A label is typically a manually defined, highly refined feature identifier, such as age, gender, region, or user preference (for example, liking basketball). Taking all the labels of a user group together then outlines a three-dimensional "portrait" of that group.
Step S101 is illustrated below using four users A, B, C and D as an example.
User A is marked with labels a, b and c. User B is marked with labels a, b, c and d. User C is marked with labels b and c. User D is marked with label d. The label network built from the labels of users A, B, C and D is shown in Fig. 2. In Fig. 2, users A, B, C and D form the four vertices of the label network, which make up the vertex set V. Because users A and C, users B and C, users A and B, and users B and D each share at least one label, an edge connects each of these pairs, and these edges form the edge set E.
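The construction of the label network in this example can be sketched in a few lines of Python. This is a minimal illustration rather than the patented implementation; the function name `build_label_network` and the dictionary layout are assumptions made for the sketch.

```python
from itertools import combinations


def build_label_network(labels_by_user):
    """Return (vertices, edges) of the label network.

    Hypothetical helper: an edge joins two users whenever their
    label sets have at least one label in common.
    """
    vertices = sorted(labels_by_user)
    edges = {
        (u, v)
        for u, v in combinations(vertices, 2)
        if labels_by_user[u] & labels_by_user[v]
    }
    return vertices, edges


# Labels of the four users from the example in the text.
labels = {
    "A": {"a", "b", "c"},
    "B": {"a", "b", "c", "d"},
    "C": {"b", "c"},
    "D": {"d"},
}
vertices, edges = build_label_network(labels)
print(sorted(edges))
```

For the four example users this yields the edges AB, AC, BC and BD. Users A and D, and users C and D, share no label and so stay unconnected.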
In step S102, a line graph network G'(V', E', W') is built based on the label network, where V' denotes the vertex set of the line graph network, each vertex of the line graph network represents one edge of the label network, E' denotes the edge set of the line graph network, each edge of the line graph network connects the two vertices of the line graph network that correspond to two edges of the label network sharing a common vertex, and W' denotes the set of edge weights of the line graph network.
Taking the label network of Fig. 2 again as an example, the line graph network built from it is shown in Fig. 3.
Vertices AC, BC, AB and BD in Fig. 3 make up the vertex set V'. Vertex AC corresponds to edge e_AC of the label network in Fig. 2, vertex BC to edge e_BC, vertex AB to edge e_AB, and vertex BD to edge e_BD.
The edges of Fig. 3 are obtained as follows:
- Edges e_AC and e_BC in Fig. 2 share the common vertex C, so vertices AC and BC in Fig. 3 are connected by an edge.
- Edges e_AC and e_AB in Fig. 2 share the common vertex A, so vertices AC and AB in Fig. 3 are connected by an edge.
- Edges e_BC and e_AB in Fig. 2 share the common vertex B, so vertices BC and AB in Fig. 3 are connected by an edge.
- Edges e_BC and e_BD in Fig. 2 share the common vertex B, so vertices BC and BD in Fig. 3 are connected by an edge.
- Edges e_AB and e_BD in Fig. 2 share the common vertex B, so vertices AB and BD in Fig. 3 are connected by an edge.
These edges form the edge set E' of the line graph network.
In addition, the number shown on each edge in Fig. 3 is the weight of that edge, and these edge weights make up the edge weight set W'. Those skilled in the art will understand that the edge weights shown in Fig. 3 are only examples.
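The step from label network to line graph network (without weights) can be sketched as follows. The function name `build_line_graph` is hypothetical, and the edge list is the one from the Fig. 2 example.

```python
from itertools import combinations


def build_line_graph(edges):
    """Return (V', E') of the line graph of an undirected graph.

    Each vertex of the line graph is an edge of the input graph; two
    such vertices are joined when the underlying edges share an endpoint.
    """
    line_vertices = sorted(edges)
    line_edges = {
        (e1, e2)
        for e1, e2 in combinations(line_vertices, 2)
        if set(e1) & set(e2)
    }
    return line_vertices, line_edges


# Edges of the label network from the Fig. 2 example.
label_edges = {("A", "C"), ("B", "C"), ("A", "B"), ("B", "D")}
line_vertices, line_edges = build_line_graph(label_edges)
for e1, e2 in sorted(line_edges):
    print("".join(e1), "-", "".join(e2))
```

The sketch produces the five connections described for Fig. 3. Vertices AC and BD are not connected, because edges e_AC and e_BD have no common vertex.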
In step S103, community division is performed on the vertices of the line graph network based on the line graph network.
In step S104, the community division result is converted into a user portrait grouping result.
With the above technical solution, the user portrait grouping process requires no parameters and can be carried out based solely on the labels of the user portraits. The grouping result therefore does not depend on parameters, and a stable user portrait grouping result is obtained for a given set of labeled users. Compared with existing methods that divide user groups manually according to labels alone, labor cost is greatly reduced. In addition, because community division is performed on the line graph network, the same user portrait may be assigned to several different user portrait groups, which makes the user portrait grouping more accurate.
In one possible implementation, as shown in Fig. 4, the weights of the edges in the line graph network can be computed by the following steps S401 to S403.
In step S401, the weights of the edges in the label network are computed.
For example, the weights of the edges in the label network can be computed by the following formula:
where i and j denote two vertices in the label network, e_ij denotes the edge between vertices i and j, and w_ij denotes the weight of edge e_ij.
Computing the weight of the edge between two vertices of the label network from the number of labels the two vertices share makes the community division of the line graph network vertices in step S103 more accurate and stable. However, those skilled in the art will understand that the weights of the edges in the label network can be computed with any other algorithm (for example, a weight computation algorithm from the prior art), and the present disclosure places no limitation on this.
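As a rough illustration of a weight derived from shared labels, the sketch below uses the raw count of shared labels as the edge weight. This is an assumption made for the example: formula (1) itself is not reproduced in this text, so the count only stands in for it, and the function name `edge_weights` is hypothetical.

```python
def edge_weights(labels_by_user, edges):
    """Map each label-network edge (u, v) to the number of labels u and v share.

    Assumption: the raw shared-label count stands in for formula (1),
    which is not reproduced in the text.
    """
    return {
        (u, v): len(labels_by_user[u] & labels_by_user[v])
        for u, v in edges
    }


# Labels and edges of the Fig. 2 example.
labels = {
    "A": {"a", "b", "c"},
    "B": {"a", "b", "c", "d"},
    "C": {"b", "c"},
    "D": {"d"},
}
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D")]
weights = edge_weights(labels, edges)
print(weights)
```

Under this assumed weighting, users A and B (three shared labels) are tied more strongly than users B and D (one shared label). Any normalised variant could be substituted, since the text allows other weight computations.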
In step S402, based on the computed weights of the edges in the label network, the similarity S between every two edges of the label network that share a common vertex is computed.
Take the label network of Fig. 2 again as an example. Edges e_AC and e_BC share the common vertex C, edges e_AC and e_AB share the common vertex A, edges e_BC and e_AB share the common vertex B, edges e_BC and e_BD share the common vertex B, and edges e_AB and e_BD share the common vertex B. It is therefore necessary to compute the similarity S(AC, BC) between e_AC and e_BC, the similarity S(AC, AB) between e_AC and e_AB, the similarity S(BC, AB) between e_BC and e_AB, the similarity S(BC, BD) between e_BC and e_BD, and the similarity S(AB, BD) between e_AB and e_BD.
In addition, the similarity S between every two edges of the label network that share a common vertex can be computed by the following formula:
where i, j, k and m denote vertices in the label network; e_ik denotes the edge between vertices i and k; e_jk denotes the edge between vertices j and k; edges e_ik and e_jk share the common vertex k; N_i denotes the set consisting of all neighbor vertices of vertex i, with i ∈ N_i; N_j denotes the set consisting of all neighbor vertices of vertex j; w_im denotes the weight of the edge e_im between vertices i and m; and w_jm denotes the weight of the edge e_jm between vertices j and m.
Here, w_im and w_jm can be computed using formula (1) above. Computing the similarity between every two edges of the label network that share a common vertex from the number of labels shared by pairs of vertices makes the community division of the line graph network vertices in step S103 more accurate and stable.
Taking the label network of Fig. 2 again as an example, consider the two edges AB and AC incident to vertex A. The set of all neighbor vertices of vertex B is N_B = {A, B, C, D}, and the set of all neighbor vertices of vertex C is N_C = {A, B, C}. According to formula (2), the similarity between edges e_AC and e_AB is then S(AB, AC) = 15/16.
Those skilled in the art will understand that the above formula for edge similarity is only an example, and any other similarity algorithm can be applied to compute the similarity between edges in the embodiments of the present disclosure. The embodiments of the present disclosure place no limitation on the algorithm used to compute edge similarity.
In step S403, the weight of each edge in the line graph network is set equal to the similarity between the two label-network edges that correspond to the two vertices of that edge.
This is again illustrated with Fig. 2 and Fig. 3. Because the similarity between edges e_AC and e_AB of the label network computed with formula (2) is S(AB, AC) = 15/16, the weight of the edge between vertices AC and AB in the line graph network of Fig. 3 is 15/16.
Through steps S401 to S403, the weight of an edge in the line graph network is tied to the number of labels shared by pairs of vertices in the label network. This makes the community division of the line graph network vertices in step S103 more accurate and stable, which in turn makes the user portrait grouping result more accurate and stable.
In one possible implementation, performing community division on the vertices of the line graph network based on the line graph network in step S103 may include: performing community division on the vertices of the line graph network using a community detection algorithm.
The embodiments of the present disclosure place no limitation on the community detection algorithm. For example, it can be an agglomerative method, that is, an algorithm that adds edges, or a divisive method, that is, an algorithm that removes edges. For example, the community detection algorithm used in the embodiments of the present disclosure can be the GN algorithm proposed by Girvan and Newman, the Label Propagation Algorithm (LPA), the Fast Unfolding algorithm, the Kernighan-Lin algorithm, spectral bisection based on the eigenvalues of the graph Laplacian, the K-means algorithm, the Ternary Community Merging Algorithm based on Similarity (STCMA), the Label Propagation Algorithm based on Ternary Community (TCLPA), and so on.
The following describes how community division is carried out, taking as an example the use of the label propagation algorithm to divide the vertices of the line graph network into communities.
First, in the initial stage of the label propagation algorithm, each vertex in the line graph network is assigned a unique identifier L, which is the initial identifier value of that vertex (for example, a value of string type). Then, over multiple rounds of iteration, the identifier of each vertex is propagated to other vertices along the social relations (that is, the edges of the line graph network). In each round, each vertex decides which identifier to adopt in that round according to the identifiers received from its neighbor vertices (the vertices connected to it by an edge). The basic principle is as follows: each received identifier corresponds to one edge of the vertex; the weights of the edges that correspond to the same identifier are summed, and the vertex adopts the identifier with the largest weight sum. If two or more identifiers tie for the largest weight sum, one of them is chosen at random. Because each vertex retains only one identifier, every vertex in the line graph network re-determines its own identifier in each round. If, after multiple rounds of iteration, the identifiers of most vertices no longer change, the iteration ends. Finally, the vertices that hold the same identifier are placed in the same community (group).
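The procedure described above can be sketched as a weighted label propagation routine. This is a minimal illustration under several assumptions not stated in the text: vertices are updated one at a time in a shuffled order, iteration stops when no identifier changes (a stricter test than "most vertices no longer change") or after a hypothetical cap `max_rounds`, and the random generator is seeded for repeatability. Apart from the 15/16 weight taken from the text, the edge weights in the usage example are placeholders, not the values of Fig. 3.

```python
import random
from collections import defaultdict


def label_propagation(vertices, weights, max_rounds=100, seed=0):
    """Weighted label propagation over an undirected graph.

    vertices: iterable of hashable, sortable vertex names.
    weights:  dict mapping (u, v) pairs to a positive edge weight.
    Returns a dict mapping each vertex to its final identifier.
    """
    rng = random.Random(seed)
    neighbors = defaultdict(dict)
    for (u, v), w in weights.items():
        neighbors[u][v] = w
        neighbors[v][u] = w

    # Initial stage: every vertex gets a unique identifier (its own name).
    ident = {v: v for v in vertices}

    for _ in range(max_rounds):
        changed = False
        order = list(ident)
        rng.shuffle(order)
        for v in order:
            if not neighbors[v]:
                continue  # an isolated vertex keeps its identifier
            # Sum the edge weights per identifier held by the neighbors.
            totals = defaultdict(float)
            for n, w in neighbors[v].items():
                totals[ident[n]] += w
            best = max(totals.values())
            # Ties for the largest sum are broken at random.
            tied = sorted(i for i, t in totals.items() if t == best)
            choice = rng.choice(tied)
            if choice != ident[v]:
                ident[v] = choice
                changed = True
        if not changed:
            break
    return ident


# Line graph topology of Fig. 3; only the AB-AC weight comes from the text.
line_weights = {
    ("AB", "AC"): 15 / 16,
    ("AC", "BC"): 0.8,   # placeholder
    ("AB", "BC"): 0.7,   # placeholder
    ("BC", "BD"): 0.3,   # placeholder
    ("AB", "BD"): 0.4,   # placeholder
}
ident = label_propagation(["AB", "AC", "BC", "BD"], line_weights)
communities = defaultdict(set)
for vertex, identifier in ident.items():
    communities[identifier].add(vertex)
print(dict(communities))
```

Because ties are resolved at random, different seeds can give different divisions on the same graph. The particular communities found here depend on the placeholder weights and the seed.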
In one possible implementation, converting the community division result into a user portrait grouping result in step S104 may include: assigning the label-network vertices that correspond to line-graph-network vertices of the same community to the same user portrait group. In other words, the community division of the vertices of the line graph network can be mapped to a community division of the edges of the label network. For example, if vertices ij and mn of the line graph network are assigned to the same user portrait group, then edges e_ij and e_mn of the label network are assigned to the same user portrait group, and the vertices (that is, user portraits) i, j, m and n of the label network are therefore assigned to the same user portrait group. This achieves stable and accurate user portrait grouping. Because the several edges incident to a single vertex of the label network may be assigned to different user portrait groups, the same user portrait can be assigned to different user portrait groups.
This is again illustrated with the label network of Fig. 2 and the line graph network of Fig. 3. Suppose that, after community division of the line graph network of Fig. 3, vertices AC and BC of the line graph network (that is, edges e_AC and e_BC of the label network) are assigned to one user portrait group X, and vertices AB and BD of the line graph network (that is, edges e_AB and e_BD of the label network) are assigned to another user portrait group Y. After the community division result is converted into a user portrait grouping result, users A, B and C belong to user portrait group X, and users A, B and D belong to user portrait group Y. Users A and B thus belong to the two different user portrait groups X and Y at the same time.
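The conversion in this example can be sketched as a mapping from each community of label-network edges to the users those edges touch. The function name `communities_to_user_groups` and the input layout are assumptions made for the illustration.

```python
def communities_to_user_groups(edge_communities):
    """Map each community of label-network edges to the set of users it touches."""
    return {
        name: {user for edge in members for user in edge}
        for name, members in edge_communities.items()
    }


# Community division assumed in the text for the Fig. 3 line graph.
division = {
    "X": [("A", "C"), ("B", "C")],
    "Y": [("A", "B"), ("B", "D")],
}
groups = communities_to_user_groups(division)
overlap = groups["X"] & groups["Y"]
print(groups, overlap)
```

Group X then contains users A, B and C, group Y contains users A, B and D, and the overlap consists of users A and B. This is the overlapping membership that grouping on the line graph makes possible.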
According to another embodiment of the present disclosure, a user portrait grouping device is provided. As shown in Fig. 5, the device may include:
a label network building module 501, configured to build a label network G(V, E) based on the labels of each user portrait, where V denotes the vertex set of the label network, each vertex represents one user portrait, E denotes the edge set of the label network, and each edge connects the two vertices corresponding to two user portraits that share at least one label;
a line graph network building module 502, configured to build a line graph network G'(V', E', W') based on the label network, where V' denotes the vertex set of the line graph network, each vertex of the line graph network represents one edge of the label network, E' denotes the edge set of the line graph network, each edge of the line graph network connects the two vertices of the line graph network that correspond to two edges of the label network sharing a common vertex, and W' denotes the set of edge weights of the line graph network;
a community division module 503, configured to perform community division on the vertices of the line graph network based on the line graph network; and
a conversion module 504, configured to convert the community division result into a user portrait grouping result.
With the above technical solution, the user portrait grouping process requires no parameters and can be carried out based solely on the labels of the user portraits. The grouping result therefore does not depend on parameters, and a stable user portrait grouping result is obtained for a given set of labeled users. Compared with existing methods that divide user groups manually according to labels alone, labor cost is greatly reduced. In addition, because community division is performed on the line graph network, the same user portrait may be assigned to several different user portrait groups, which makes the user portrait grouping more accurate.
In a kind of possible implementation method, the line chart network struction module 502 can in the following manner calculate institute
State the weights on the side in line chart network:Calculate the weights on the side in the label network;Based on the label network for being calculated
In side weights, calculate every two similarities having between the side of public vertex in the label network;By the line chart net
The weights of each edge in network be equal to this while two summits at corresponding two in the label network while between
Similarity.
Wherein, the line chart network struction module 502 can calculate the side in the label network by below equation
Weights:
Wherein, i and j represent two summits in the label network, eijRepresent the side between summit i and j, wijRepresent
Side eijWeights.
Wherein, the line chart network struction module 502 can calculate the similarity by below equation:
Wherein, i, j, k and m represent the summit in the label network, eikRepresent the side between summit i and k, ejkRepresent
Side between summit j and k, side eikAnd ejkConnection identical summit k, NiThe collection that expression is made up of all neighbours summits of summit i
Close, and i ∈ Ni, NjThe set that expression is made up of all neighbours summits of summit j, wimRepresent the side e between summit i and mimPower
Value, wjmRepresent the side e between summit j and mjmWeights.
In one possible implementation, the conversion module 504 may be configured to assign the vertices in the label network that correspond to line graph vertices divided into the same community to the same user portrait group.
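The conversion step can be illustrated with a short Python sketch. It assumes that the community division has already produced a mapping from each line graph vertex (a label-network edge, written here as a pair of user ids) to a community id; the function name `communities_to_groups` and the sample data are hypothetical.

```python
from collections import defaultdict


def communities_to_groups(edge_community):
    """Place both endpoints of every label-network edge in its community's group."""
    groups = defaultdict(set)
    for (u, v), community in edge_community.items():
        groups[community].update((u, v))
    return dict(groups)


# Each key is a label-network edge, i.e. one vertex of the line graph network.
edge_community = {
    ("u1", "u2"): 0,
    ("u2", "u3"): 0,
    ("u3", "u4"): 1,
}
groups = communities_to_groups(edge_community)
```

Because u3 is an endpoint of edges in both communities, it appears in both groups, which is how the same user portrait can end up in more than one user portrait group.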
The specific manner in which each module of the user portrait grouping device according to the embodiments of the present disclosure performs its operations has been described in detail in the user portrait grouping method according to the embodiments of the present disclosure, and is not repeated here.
The preferred implementations of the embodiments of the present disclosure have been described in detail above with reference to the accompanying drawings. However, the embodiments of the present disclosure are not limited to the specific details of the above implementations. Within the scope of the technical concept of the embodiments of the present disclosure, various simple variations may be made to the technical solutions of the embodiments of the present disclosure, and these simple variations all fall within the protection scope of the embodiments of the present disclosure.
It should further be noted that the specific technical features described in the above specific implementations may, where no contradiction arises, be combined in any suitable manner. To avoid unnecessary repetition, the embodiments of the present disclosure do not separately describe the various possible combinations.
In addition, the various implementations of the embodiments of the present disclosure may also be combined with one another; as long as such a combination does not depart from the idea of the embodiments of the present disclosure, it should likewise be regarded as content disclosed by the embodiments of the present disclosure.
Claims (10)
1. A user portrait grouping method, characterized in that the method comprises:
building a label network G (V, E) based on the labels of each user portrait, wherein V denotes the vertex set of the label network, each vertex represents one user portrait, E denotes the edge set of the label network, and each edge represents a connection between the two vertices corresponding to two user portraits that share at least one common label;
building a line graph network G' (V', E', W') based on the label network, wherein V' denotes the vertex set of the line graph network, each vertex in the line graph network represents one edge in the label network, E' denotes the edge set of the line graph network, each edge in the line graph network represents a connection between the two line graph vertices that correspond to two edges of the label network sharing a common vertex, and W' denotes the set of weights of the edges in the line graph network;
performing community division on the vertices in the line graph network based on the line graph network; and
converting the community division result into a user portrait grouping result.
2. The method according to claim 1, characterized in that the weights of the edges in the line graph network are calculated by the following steps:
calculating the weights of the edges in the label network;
based on the calculated weights of the edges in the label network, calculating the similarity between every two edges in the label network that share a common vertex; and
setting the weight of each edge in the line graph network equal to the similarity between the two label-network edges that correspond to the two vertices of that edge.
3. The method according to claim 2, characterized in that the weights of the edges in the label network are calculated by the following formula:
[Formula image not reproduced in this text.]
where i and j denote two vertices in the label network, e_ij denotes the edge between vertices i and j, and w_ij denotes the weight of edge e_ij.
4. The method according to claim 2 or 3, characterized in that the similarity is calculated by the following formula:
[Formula image not reproduced in this text.]
where i, j, k and m denote vertices in the label network, e_ik denotes the edge between vertices i and k, e_jk denotes the edge between vertices j and k, edges e_ik and e_jk connect to the same vertex k, N_i denotes the set of all neighbor vertices of vertex i, with i ∈ N_i, N_j denotes the set of all neighbor vertices of vertex j, w_im denotes the weight of the edge e_im between vertices i and m, and w_jm denotes the weight of the edge e_jm between vertices j and m.
5. The method according to claim 1, characterized in that converting the community division result into a user portrait grouping result comprises:
assigning the vertices in the label network that correspond to line graph vertices divided into the same community to the same user portrait group.
6. A user portrait grouping device, characterized in that the device comprises:
a label network construction module, configured to build a label network G (V, E) based on the labels of each user portrait, wherein V denotes the vertex set of the label network, each vertex represents one user portrait, E denotes the edge set of the label network, and each edge represents a connection between the two vertices corresponding to two user portraits that share at least one common label;
a line graph network construction module, configured to build a line graph network G' (V', E', W') based on the label network, wherein V' denotes the vertex set of the line graph network, each vertex in the line graph network represents one edge in the label network, E' denotes the edge set of the line graph network, each edge in the line graph network represents a connection between the two line graph vertices that correspond to two edges of the label network sharing a common vertex, and W' denotes the set of weights of the edges in the line graph network;
a community division module, configured to perform community division on the vertices in the line graph network based on the line graph network; and
a conversion module, configured to convert the community division result into a user portrait grouping result.
7. The device according to claim 6, characterized in that the line graph network construction module calculates the weights of the edges in the line graph network as follows:
calculating the weights of the edges in the label network;
based on the calculated weights of the edges in the label network, calculating the similarity between every two edges in the label network that share a common vertex; and
setting the weight of each edge in the line graph network equal to the similarity between the two label-network edges that correspond to the two vertices of that edge.
8. The device according to claim 7, characterized in that the line graph network construction module calculates the weights of the edges in the label network by the following formula:
[Formula image not reproduced in this text.]
where i and j denote two vertices in the label network, e_ij denotes the edge between vertices i and j, and w_ij denotes the weight of edge e_ij.
9. The device according to claim 7 or 8, characterized in that the line graph network construction module calculates the similarity by the following formula:
[Formula image not reproduced in this text.]
where i, j, k and m denote vertices in the label network, e_ik denotes the edge between vertices i and k, e_jk denotes the edge between vertices j and k, edges e_ik and e_jk connect to the same vertex k, N_i denotes the set of all neighbor vertices of vertex i, with i ∈ N_i, N_j denotes the set of all neighbor vertices of vertex j, w_im denotes the weight of the edge e_im between vertices i and m, and w_jm denotes the weight of the edge e_jm between vertices j and m.
10. The device according to claim 6, characterized in that the conversion module is configured to assign the vertices in the label network that correspond to line graph vertices divided into the same community to the same user portrait group.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611259956.9A CN106874931B (en) | 2016-12-30 | 2016-12-30 | User portrait clustering method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611259956.9A CN106874931B (en) | 2016-12-30 | 2016-12-30 | User portrait clustering method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106874931A (en) | 2017-06-20 |
CN106874931B (en) | 2021-01-22 |
Family
ID=59165184
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611259956.9A, Active, CN106874931B (en) | 2016-12-30 | 2016-12-30 | User portrait clustering method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106874931B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108304482A (en) * | 2017-12-29 | 2018-07-20 | 北京城市网邻信息技术有限公司 | The recognition methods and device of broker, electronic equipment and readable storage medium storing program for executing |
CN108376095A (en) * | 2018-02-27 | 2018-08-07 | 北京金堤科技有限公司 | A kind of icon arrangement method and apparatus |
CN108537586A (en) * | 2018-03-30 | 2018-09-14 | 杭州米趣网络科技有限公司 | Data processing method and device based on user's portrait |
CN109189936A (en) * | 2018-08-13 | 2019-01-11 | 天津科技大学 | A kind of label semanteme learning method measured based on network structure and semantic dependency |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101840543A (en) * | 2010-05-07 | 2010-09-22 | 南京大学 | Combo discovering method based on vertex difference |
CN102760149A (en) * | 2012-04-05 | 2012-10-31 | 中国人民解放军国防科学技术大学 | Automatic annotating method for subjects of open source software |
CN103327075A (en) * | 2013-05-27 | 2013-09-25 | 电子科技大学 | Distributed mass organization realizing method based on label interaction |
CN103810288A (en) * | 2014-02-25 | 2014-05-21 | 西安电子科技大学 | Method for carrying out community detection on heterogeneous social network on basis of clustering algorithm |
CN103838803A (en) * | 2013-04-28 | 2014-06-04 | 电子科技大学 | Social network community discovery method based on node Jaccard similarity |
CN104102745A (en) * | 2014-07-31 | 2014-10-15 | 上海交通大学 | Complex network community mining method based on local minimum edges |
CN104933624A (en) * | 2015-06-29 | 2015-09-23 | 电子科技大学 | Community discovery method of complex network and important node discovery method of community |
CN105279187A (en) * | 2014-07-15 | 2016-01-27 | 天津科技大学 | Edge clustering coefficient-based social network group division method |
CN105678626A (en) * | 2015-12-30 | 2016-06-15 | 南京理工大学 | Overlapped community excavation method and apparatus |
-
2016
- 2016-12-30 CN CN201611259956.9A patent/CN106874931B/en active Active
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101840543A (en) * | 2010-05-07 | 2010-09-22 | 南京大学 | Combo discovering method based on vertex difference |
CN102760149A (en) * | 2012-04-05 | 2012-10-31 | 中国人民解放军国防科学技术大学 | Automatic annotating method for subjects of open source software |
CN103838803A (en) * | 2013-04-28 | 2014-06-04 | 电子科技大学 | Social network community discovery method based on node Jaccard similarity |
CN103327075A (en) * | 2013-05-27 | 2013-09-25 | 电子科技大学 | Distributed mass organization realizing method based on label interaction |
CN103810288A (en) * | 2014-02-25 | 2014-05-21 | 西安电子科技大学 | Method for carrying out community detection on heterogeneous social network on basis of clustering algorithm |
CN105279187A (en) * | 2014-07-15 | 2016-01-27 | 天津科技大学 | Edge clustering coefficient-based social network group division method |
CN104102745A (en) * | 2014-07-31 | 2014-10-15 | 上海交通大学 | Complex network community mining method based on local minimum edges |
CN104933624A (en) * | 2015-06-29 | 2015-09-23 | 电子科技大学 | Community discovery method of complex network and important node discovery method of community |
CN105678626A (en) * | 2015-12-30 | 2016-06-15 | 南京理工大学 | Overlapped community excavation method and apparatus |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108304482A (en) * | 2017-12-29 | 2018-07-20 | 北京城市网邻信息技术有限公司 | The recognition methods and device of broker, electronic equipment and readable storage medium storing program for executing |
CN108376095A (en) * | 2018-02-27 | 2018-08-07 | 北京金堤科技有限公司 | A kind of icon arrangement method and apparatus |
CN108537586A (en) * | 2018-03-30 | 2018-09-14 | 杭州米趣网络科技有限公司 | Data processing method and device based on user's portrait |
CN109189936A (en) * | 2018-08-13 | 2019-01-11 | 天津科技大学 | A kind of label semanteme learning method measured based on network structure and semantic dependency |
CN109189936B (en) * | 2018-08-13 | 2021-07-27 | 天津科技大学 | Label semantic learning method based on network structure and semantic correlation measurement |
Also Published As
Publication number | Publication date |
---|---|
CN106874931B (en) | 2021-01-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108920527A (en) | A kind of personalized recommendation method of knowledge based map | |
Kim et al. | Latent multi-group membership graph model | |
CN106874931A (en) | User portrait grouping method and device | |
CN106817251B (en) | Link prediction method and device based on node similarity | |
CN104102745B (en) | Complex network community method for digging based on Local Minimum side | |
CN104346476B (en) | Personalized item recommendation method based on article similarity and network structure | |
Shang et al. | Epidemic spreading on complex networks with overlapping and non-overlapping community structure | |
CN109165692A (en) | A kind of user's personality prediction meanss and method based on Weakly supervised study | |
CN106708953A (en) | Discrete particle swarm optimization based local community detection collaborative filtering recommendation method | |
CN107506617B (en) | Half-local social information miRNA-disease association prediction method | |
KR20140067697A (en) | System and method for supplying collaboration partner search service | |
CN106789338B (en) | Method for discovering key people in dynamic large-scale social network | |
CN104199838B (en) | A kind of user model constructing method based on label disambiguation | |
CN107944485A (en) | The commending system and method, personalized recommendation system found based on cluster group | |
Cooper et al. | Computing hypermatrix spectra with the Poisson product formula | |
Alzahrani et al. | Community detection in bipartite networks using random walks | |
Zhang et al. | Multi-view clustering of microbiome samples by robust similarity network fusion and spectral clustering | |
CN105138684B (en) | A kind of information processing method and information processing unit | |
CN106844743B (en) | Emotion classification method and device for Uygur language text | |
Mattioli et al. | Quivers, words and fundamentals | |
Xu | Community detection based on network communicability distance | |
CN107133218A (en) | Trade name intelligent Matching method, system and computer-readable recording medium | |
CN104462480B (en) | Comment big data method for digging based on typicalness | |
Zhang et al. | Dynamic structure evolution of time-dependent network | |
JP2015109024A (en) | Image dictionary generation device, image dictionary generation method and computer program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||