CN106874931A - User portrait grouping method and device - Google Patents


Info

Publication number: CN106874931A
Authority: CN (China)
Prior art keywords: network, label, vertex, line graph, user
Legal status: Granted, active (as listed by Google Patents; an assumption, not a legal conclusion)
Application number: CN201611259956.9A
Other languages: Chinese (zh)
Other versions: CN106874931B (en)
Inventor: 王阳
Current Assignee: Neusoft Corp (as listed by Google Patents; may be inaccurate)
Original Assignee: Neusoft Corp

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques

Abstract

Embodiments of the present disclosure relate to a user portrait grouping method and device in the field of data analysis, which make user portrait grouping more stable and accurate. The method includes: building a label network G(V, E) based on the labels of each user portrait, where V is the vertex set of the label network, each vertex represents one user portrait, E is the edge set of the label network, and each edge connects the two vertices corresponding to two user portraits that share at least one label; building a line graph network G'(V', E', W') based on the label network, where V' is the vertex set of the line graph network, each vertex represents one edge of the label network, E' is the edge set of the line graph network, each edge connects the two line graph vertices corresponding to two label network edges that share a vertex, and W' is the set of edge weights of the line graph network; performing community division on the vertices of the line graph network; and converting the community division result into a user portrait grouping result.

Description

User portrait grouping method and device
Technical field
Embodiments of the present disclosure relate to the field of data analysis, and in particular to a user portrait grouping method and device.
Background
User portrait grouping is one of the biggest changes to the online marketing environment: it breaks down data silos and makes it possible to genuinely understand users, by associating the otherwise isolated users of a social network with one another and grouping them. Current user portrait grouping methods depend heavily on parameters. The K-means algorithm, for example, depends strongly on the value of k and on the choice of the initial center vectors, which makes the user portrait grouping result unstable.
Summary of the invention
The purpose of the embodiments of the present disclosure is to provide a user portrait grouping method and device that yield a stable user portrait grouping result.
To achieve this purpose, an embodiment of the present disclosure provides a user portrait grouping method, which includes:
building a label network G(V, E) based on the labels of each user portrait, where V is the vertex set of the label network, each vertex represents one user portrait, E is the edge set of the label network, and each edge connects the two vertices corresponding to two user portraits that share at least one label;
building a line graph network G'(V', E', W') based on the label network, where V' is the vertex set of the line graph network, each vertex of the line graph network represents one edge of the label network, E' is the edge set of the line graph network, each edge of the line graph network connects the two line graph vertices corresponding to two label network edges that share a vertex, and W' is the set of edge weights of the line graph network;
performing community division on the vertices of the line graph network based on the line graph network;
converting the community division result into a user portrait grouping result.
Optionally, the weights of the edges in the line graph network are calculated by the following steps:
calculating the weights of the edges in the label network;
based on the calculated weights of the edges in the label network, calculating the similarity between every two edges of the label network that share a vertex;
setting the weight of each edge in the line graph network equal to the similarity between the two label network edges that correspond to the two endpoint vertices of that edge.
Optionally, the weight of an edge in the label network is calculated by the following formula:
[formula missing from this text]
where i and j denote two vertices of the label network, e_ij denotes the edge between vertices i and j, and w_ij denotes the weight of edge e_ij.
Optionally, the similarity is calculated by the following formula:
[formula missing from this text]
where i, j, k and m denote vertices of the label network, e_ik denotes the edge between vertices i and k, e_jk denotes the edge between vertices j and k, edges e_ik and e_jk share the vertex k, N_i denotes the set of all neighbor vertices of vertex i, with i ∈ N_i, N_j denotes the set of all neighbor vertices of vertex j, w_im denotes the weight of the edge e_im between vertices i and m, and w_jm denotes the weight of the edge e_jm between vertices j and m.
Optionally, converting the community division result into a user portrait grouping result includes:
assigning the label network vertices that correspond to line graph vertices divided into the same community to the same user portrait group.
An embodiment of the present disclosure also provides a user portrait grouping device, which includes:
a label network building module, configured to build a label network G(V, E) based on the labels of each user portrait, where V is the vertex set of the label network, each vertex represents one user portrait, E is the edge set of the label network, and each edge connects the two vertices corresponding to two user portraits that share at least one label;
a line graph network building module, configured to build a line graph network G'(V', E', W') based on the label network, where V' is the vertex set of the line graph network, each vertex of the line graph network represents one edge of the label network, E' is the edge set of the line graph network, each edge of the line graph network connects the two line graph vertices corresponding to two label network edges that share a vertex, and W' is the set of edge weights of the line graph network;
a community division module, configured to perform community division on the vertices of the line graph network based on the line graph network;
a conversion module, configured to convert the community division result into a user portrait grouping result.
Optionally, the line graph network building module calculates the weights of the edges in the line graph network as follows:
calculating the weights of the edges in the label network;
based on the calculated weights of the edges in the label network, calculating the similarity between every two edges of the label network that share a vertex;
setting the weight of each edge in the line graph network equal to the similarity between the two label network edges that correspond to the two endpoint vertices of that edge.
Optionally, the line graph network building module calculates the weight of an edge in the label network by the following formula:
[formula missing from this text]
where i and j denote two vertices of the label network, e_ij denotes the edge between vertices i and j, and w_ij denotes the weight of edge e_ij.
Optionally, the line graph network building module calculates the similarity by the following formula:
[formula missing from this text]
where i, j, k and m denote vertices of the label network, e_ik denotes the edge between vertices i and k, e_jk denotes the edge between vertices j and k, edges e_ik and e_jk share the vertex k, N_i denotes the set of all neighbor vertices of vertex i, with i ∈ N_i, N_j denotes the set of all neighbor vertices of vertex j, w_im denotes the weight of the edge e_im between vertices i and m, and w_jm denotes the weight of the edge e_jm between vertices j and m.
Optionally, the conversion module is configured to assign the label network vertices that correspond to line graph vertices divided into the same community to the same user portrait group.
With the above technical solution, user portrait grouping requires no parameters and can be carried out from the labels of the user portraits alone. This avoids any dependence of the grouping result on parameters, so a given set of labeled users yields a stable user portrait grouping result. Compared with existing methods that divide user groups manually based only on labels, labor costs can be greatly reduced. In addition, because community division is performed on the line graph network, the same user portrait can be assigned to several different user portrait groups, which makes user portrait grouping more accurate.
Other features and advantages of the embodiments of the present disclosure are described in detail in the detailed description that follows.
Brief description of the drawings
The accompanying drawings provide a further understanding of the embodiments of the present disclosure and form part of the specification. Together with the detailed description below they serve to explain the embodiments of the present disclosure, but they do not limit them. In the drawings:
Fig. 1 is a flow chart of a user portrait grouping method according to an embodiment of the present disclosure.
Fig. 2 is a schematic diagram of a label network built according to an embodiment of the present disclosure.
Fig. 3 is a schematic diagram of a line graph network built according to an embodiment of the present disclosure.
Fig. 4 is a flow chart of calculating the weights of the edges in the line graph network according to an embodiment of the present disclosure.
Fig. 5 is a schematic block diagram of a user portrait grouping device according to an embodiment of the present disclosure.
Detailed description
Specific implementations of the embodiments of the present disclosure are described in detail below with reference to the accompanying drawings. It should be understood that the specific implementations described here serve only to illustrate and explain the embodiments of the present disclosure and do not limit them.
According to an embodiment of the present disclosure, a user portrait grouping method is provided. As shown in Fig. 1, the method may include the following steps S101 to S104.
In step S101, a label network G(V, E) is built based on the labels of each user portrait, where V is the vertex set of the label network, each vertex represents one user portrait, E is the edge set of the label network, and each edge connects the two vertices corresponding to two user portraits that share at least one label.
In marketing networks, social networks and the like, it is often desirable to analyze what characteristics the users behind a given metric have, such as their demographic attributes and behavioral traits. More importantly, such analysis helps find the causes behind product problems and reveals opportunities or directions for effectively improving the product. This requires user portrait grouping.
Building the "portrait" of a user group centers on attaching "labels" to that group. A label is typically a manually defined, highly refined feature identifier, such as age, gender, region or user preference (for example, liking to play basketball). Taking all the labels of a user group together outlines a three-dimensional "portrait" of that group.
Step S101 is illustrated below using four users A, B, C and D as an example.
User A is tagged with labels a, b and c. User B is tagged with labels a, b, c and d. User C is tagged with labels b and c. User D is tagged with label d. The label network built from the labels of users A, B, C and D is shown in Fig. 2. In Fig. 2, users A, B, C and D form the four vertices of the label network, which make up the vertex set V. Because users A and C, users B and C, users A and B, and users B and D each share at least one label, each of these pairs is connected by an edge, and these edges make up the edge set E.
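The construction in this example can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation; the names `user_tags` and `build_label_network` are hypothetical.

```python
from itertools import combinations

# Labels of the four example users, as given in the text above.
user_tags = {
    "A": {"a", "b", "c"},
    "B": {"a", "b", "c", "d"},
    "C": {"b", "c"},
    "D": {"d"},
}

def build_label_network(user_tags):
    """Return (V, E): one vertex per user, one edge per pair sharing a label."""
    vertices = sorted(user_tags)
    edges = {
        (u, v)
        for u, v in combinations(vertices, 2)
        if user_tags[u] & user_tags[v]  # non-empty intersection => edge
    }
    return vertices, edges

V, E = build_label_network(user_tags)
print(sorted(E))  # [('A', 'B'), ('A', 'C'), ('B', 'C'), ('B', 'D')]
```

The four edges match the pairs AC, BC, AB and BD listed for Fig. 2; A and D, and C and D, share no label and stay unconnected.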
In step S102, a line graph network G'(V', E', W') is built based on the label network, where V' is the vertex set of the line graph network, each vertex of the line graph network represents one edge of the label network, E' is the edge set of the line graph network, each edge of the line graph network connects the two line graph vertices corresponding to two label network edges that share a vertex, and W' is the set of edge weights of the line graph network.
Still taking the label network of Fig. 2 as an example, the line graph network built from it is shown in Fig. 3.
Vertices AC, BC, AB and BD in Fig. 3 make up the vertex set V'. Vertex AC corresponds to edge e_AC in the label network of Fig. 2, vertex BC corresponds to edge e_BC, vertex AB corresponds to edge e_AB, and vertex BD corresponds to edge e_BD.
Because edges e_AC and e_BC in the label network of Fig. 2 share vertex C, vertices AC and BC in Fig. 3 are connected by an edge. Because edges e_AC and e_AB share vertex A, vertices AC and AB in Fig. 3 are connected by an edge. Because edges e_BC and e_AB share vertex B, vertices BC and AB in Fig. 3 are connected by an edge. Because edges e_BC and e_BD share vertex B, vertices BC and BD in Fig. 3 are connected by an edge. Because edges e_AB and e_BD share vertex B, vertices AB and BD in Fig. 3 are connected by an edge. These five edges make up the edge set E' of the line graph network.
In addition, the number shown on each edge in Fig. 3 is the weight of that edge, and these edge weights make up the edge weight set W'. Those skilled in the art will understand that the edge weights shown in Fig. 3 are only examples.
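The unweighted part of the line graph construction can be sketched as follows. This is a minimal illustration under the assumption that each line graph vertex is simply named after the two endpoints of its label network edge; `build_line_graph` is a hypothetical helper, and the weights W' are left out here.

```python
from itertools import combinations

# Edges of the example label network (Fig. 2).
label_edges = [("A", "C"), ("B", "C"), ("A", "B"), ("B", "D")]

def build_line_graph(edges):
    """Line graph: one vertex per input edge; two such vertices are
    adjacent when the underlying edges share an endpoint."""
    lg_vertices = ["".join(e) for e in edges]
    lg_edges = {}  # (vertex, vertex) -> shared endpoint in the label network
    for e1, e2 in combinations(edges, 2):
        shared = set(e1) & set(e2)
        if shared:
            lg_edges[("".join(e1), "".join(e2))] = shared.pop()
    return lg_vertices, lg_edges

V_prime, E_prime = build_line_graph(label_edges)
for (p, q), shared in E_prime.items():
    print(p, "-", q, "via shared vertex", shared)
```

Edges e_AC and e_BD have no common endpoint, so AC and BD are the only pair of line graph vertices left unconnected, which gives the five edges described for Fig. 3.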
In step S103, community division is performed on the vertices of the line graph network based on the line graph network.
In step S104, the community division result is converted into a user portrait grouping result.
With the above technical solution, user portrait grouping requires no parameters and can be carried out from the labels of the user portraits alone. This avoids any dependence of the grouping result on parameters, so a given set of labeled users yields a stable user portrait grouping result. Compared with existing methods that divide user groups manually based only on labels, labor costs can be greatly reduced. In addition, because community division is performed on the line graph network, the same user portrait can be assigned to several different user portrait groups, which makes user portrait grouping more accurate.
In one possible implementation, as shown in Fig. 4, the weights of the edges in the line graph network can be calculated by the following steps S401 to S403.
In step S401, the weights of the edges in the label network are calculated.
For example, the weight of an edge in the label network can be calculated by the following formula (1):
[Formula (1) is missing from this text]
where i and j denote two vertices of the label network, e_ij denotes the edge between vertices i and j, and w_ij denotes the weight of edge e_ij.
Calculating the weight of the edge between two vertices from the number of labels that the two vertices have in common makes the result of the community division performed on the line graph vertices in step S103 more accurate and stable. However, those skilled in the art will understand that the weights of the edges in the label network may be calculated with any other algorithm (for example a weight calculation algorithm from the prior art); the present disclosure places no limitation on this.
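The text says that the weight depends on the number of labels two vertices have in common, but formula (1) itself does not appear in this text. The sketch below therefore assumes the simplest reading, namely that the weight is that count itself; `user_tags` and `edge_weights` are hypothetical names.

```python
from itertools import combinations

user_tags = {
    "A": {"a", "b", "c"},
    "B": {"a", "b", "c", "d"},
    "C": {"b", "c"},
    "D": {"d"},
}

def edge_weights(user_tags):
    """ASSUMPTION: w_ij = number of labels shared by users i and j."""
    weights = {}
    for u, v in combinations(sorted(user_tags), 2):
        common = len(user_tags[u] & user_tags[v])
        if common > 0:  # an edge exists only when a label is shared
            weights[(u, v)] = common
    return weights

w = edge_weights(user_tags)
print(w)  # {('A', 'B'): 3, ('A', 'C'): 2, ('B', 'C'): 2, ('B', 'D'): 1}
```

Any other weighting, such as a normalized overlap, could be substituted, since the text explicitly allows other weight calculation algorithms.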
In step S402, based on the calculated weights of the edges in the label network, the similarity S between every two edges of the label network that share a vertex is calculated.
Still taking the label network of Fig. 2 as an example: edges e_AC and e_BC share vertex C, edges e_AC and e_AB share vertex A, edges e_BC and e_AB share vertex B, edges e_BC and e_BD share vertex B, and edges e_AB and e_BD share vertex B. It is therefore necessary to calculate the similarity S(AC, BC) between e_AC and e_BC, the similarity S(AC, AB) between e_AC and e_AB, the similarity S(BC, AB) between e_BC and e_AB, the similarity S(BC, BD) between e_BC and e_BD, and the similarity S(AB, BD) between e_AB and e_BD.
In addition, the similarity S between every two edges of the label network that share a vertex can be calculated by the following formula (2):
[Formula (2) is missing from this text]
where i, j, k and m denote vertices of the label network, e_ik denotes the edge between vertices i and k, e_jk denotes the edge between vertices j and k, edges e_ik and e_jk share the vertex k, N_i denotes the set of all neighbor vertices of vertex i, with i ∈ N_i, N_j denotes the set of all neighbor vertices of vertex j, w_im denotes the weight of the edge e_im between vertices i and m, and w_jm denotes the weight of the edge e_jm between vertices j and m.
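The similarity formula is missing from this text. As a hedged reconstruction only, the following form uses exactly the symbols defined above and reproduces the value S(AB, AC) = 15/16 from the document's worked example. It assumes that w_ij is the number of labels shared by i and j, and that the self term w_ii is the number of labels of user i. T_i, the label set of user i, is a symbol introduced here for illustration. The original formula may differ.

```latex
% Assumed reconstruction, not taken from the source text.
w_{ij} = \lvert T_i \cap T_j \rvert ,
\qquad
S(e_{ik}, e_{jk}) =
  \frac{\sum_{m \in N_i \cap N_j} \left( w_{im} + w_{jm} \right)}
       {\sum_{m \in N_i} w_{im} + \sum_{m \in N_j} w_{jm}}
```

Under this reading, two edges meeting at k are most similar (S = 1) when their other endpoints i and j have identical neighborhoods, and the similarity drops as weight falls on neighbors that i and j do not share.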
Here, w_im and w_jm can be calculated using formula (1) above. Calculating the similarity between every two label network edges that share a vertex from the number of labels that vertices have in common makes the result of the community division performed on the line graph vertices in step S103 more accurate and stable.
Still taking the label network of Fig. 2 as an example, consider the two edges AB and AC that meet at vertex A. The set of all neighbor vertices of vertex B is N_B = {A, B, C, D}, and the set of all neighbor vertices of vertex C is N_C = {A, B, C}. Formula (2) then gives the similarity between edges e_AC and e_AB as S(AB, AC) = 15/16.
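The worked example can be checked numerically. Because the similarity formula is missing from this text, the sketch below rests on an assumption: the similarity is the summed weight over the shared neighbors of i and j divided by the summed weight over all their neighbors, with w counting shared labels (and a user's own labels for the self term). The helpers `w`, `neighbours` and `edge_similarity` are hypothetical.

```python
from fractions import Fraction

user_tags = {
    "A": {"a", "b", "c"},
    "B": {"a", "b", "c", "d"},
    "C": {"b", "c"},
    "D": {"d"},
}

def w(u, v):
    """ASSUMPTION: weight = number of shared labels (own label count if u == v)."""
    return len(user_tags[u] & user_tags[v])

def neighbours(u):
    """N_u: every vertex sharing a label with u, including u itself."""
    return {v for v in user_tags if w(u, v) > 0}

def edge_similarity(i, j):
    """ASSUMED similarity of edges e_ik and e_jk that meet at a vertex k."""
    n_i, n_j = neighbours(i), neighbours(j)
    num = sum(w(i, m) + w(j, m) for m in n_i & n_j)
    den = sum(w(i, m) for m in n_i) + sum(w(j, m) for m in n_j)
    return Fraction(num, den)

# Edges AB and AC meet at A, so the other endpoints are i = B and j = C.
print(sorted(neighbours("B")), sorted(neighbours("C")))
print(edge_similarity("B", "C"))  # 15/16
```

The neighbor sets come out as {A, B, C, D} and {A, B, C}, and the result is 15/16, both matching the example in the text. This agreement supports the assumed form but does not prove it is the patent's formula.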
Those skilled in the art will understand that the above formula for edge similarity is only an example; any other similarity algorithm can be used to calculate the similarity between edges in the embodiments of the present disclosure, which place no limitation on the edge similarity algorithm.
In step S403, the weight of each edge in the line graph network is set equal to the similarity between the two label network edges that correspond to the two endpoint vertices of that edge.
This is again illustrated with Fig. 2 and Fig. 3. Because formula (2) gives the similarity between edges e_AC and e_AB in the label network as S(AB, AC) = 15/16, the weight of the edge between vertices AC and AB in the line graph network of Fig. 3 is 15/16.
Through steps S401 to S403, the weight of an edge in the line graph network is tied to the number of labels shared by vertices in the label network. This makes the result of the community division performed on the line graph vertices in step S103 more accurate and stable, and in turn makes the user portrait grouping result more accurate and stable.
In one possible implementation, performing community division on the vertices of the line graph network based on the line graph network in step S103 may include: performing community division on the vertices of the line graph network using a community detection algorithm.
The embodiments of the present disclosure place no limitation on the community detection algorithm. It may be an agglomerative method, that is, an algorithm that adds edges, or a divisive method, that is, an algorithm that removes edges. For example, the community detection algorithm may be the GN algorithm proposed by Girvan and Newman, the Label Propagation Algorithm (LPA), the Fast Unfolding algorithm, the Kernighan-Lin algorithm, spectral bisection based on the eigenvalues of the graph Laplacian, the K-means algorithm, the Ternary Community Merging Algorithm based on Similarity (STCMA), the Label Propagation Algorithm based on Ternary Community (TCLPA), and so on.
The following describes how community division is carried out, taking the label propagation algorithm applied to the vertices of the line graph network as an example.
First, in the initial stage of the label propagation algorithm, each vertex of the line graph network is given a unique mark L, which is the initial mark value of that vertex (for example a string value). Then, over multiple rounds of iteration, the mark of each vertex is propagated to other vertices along the relationships in the network, that is, along the edges of the line graph network. In each round, every vertex decides which mark to adopt in that round based on the marks received from its neighbor vertices (the vertices connected to it by an edge). The basic rule is as follows: each mark arrives over one edge of the vertex; the weights of the edges carrying the same mark are summed; and the vertex adopts the mark with the largest weight sum. If two or more marks tie for the largest weight sum, one of them is chosen at random. Because each vertex keeps only one mark, every vertex of the line graph network has to re-confirm its own mark in each round. If, after several rounds of iteration, the marks of most vertices no longer change, the iteration ends. Finally, the vertices holding the same mark are placed in the same community (group).
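The propagation rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation. Vertices are updated one at a time in random order, a vertex keeps its current mark when that mark is among the tied best ones (a common refinement that the text does not mention), and iteration stops when a whole round changes nothing, which is stricter than the "most vertices" criterion in the text. The function name and the example graph are hypothetical.

```python
import random
from collections import defaultdict

def label_propagation(weights, max_rounds=100, seed=0):
    """Weighted label propagation on an undirected graph.

    weights maps an edge (u, v) to its weight; returns {vertex: mark}.
    """
    rng = random.Random(seed)
    adj = defaultdict(dict)
    for (u, v), wt in weights.items():
        adj[u][v] = wt
        adj[v][u] = wt

    marks = {v: v for v in adj}           # unique initial mark per vertex
    for _ in range(max_rounds):
        changed = False
        order = list(adj)
        rng.shuffle(order)                # visit vertices in random order
        for v in order:
            totals = defaultdict(float)   # mark -> summed weight of its edges
            for nb, wt in adj[v].items():
                totals[marks[nb]] += wt
            best = max(totals.values())
            candidates = sorted(m for m, t in totals.items() if t == best)
            if marks[v] in candidates:
                continue                  # ASSUMPTION: keep current mark on a tie
            marks[v] = rng.choice(candidates)  # random choice among tied marks
            changed = True
        if not changed:                   # a full round without changes: stop
            break
    return marks

# Hypothetical weighted graph: two tight triangles joined by one weak edge.
example = {
    ("p", "q"): 1.0, ("q", "r"): 1.0, ("p", "r"): 1.0,
    ("s", "t"): 1.0, ("t", "u"): 1.0, ("s", "u"): 1.0,
    ("r", "s"): 0.1,
}
marks = label_propagation(example)
communities = defaultdict(set)
for vertex, mark in marks.items():
    communities[mark].add(vertex)
print(sorted(sorted(c) for c in communities.values()))
```

In the method described here the input would be the weighted line graph network, with the edge similarities serving as `weights`. The toy graph only shows that heavily weighted edges pull vertices into the same community while a weak edge does not.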
In one possible implementation, converting the community division result into a user portrait grouping result in step S104 may include: assigning the label network vertices that correspond to line graph vertices divided into the same community to the same user portrait group. In other words, the community division of the vertices of the line graph network can be mapped to a community division of the edges of the label network. For example, if vertices ij and mn of the line graph network are divided into the same community, then edges e_ij and e_mn of the label network belong to the same user portrait group, and so the vertices (that is, user portraits) i, j, m and n of the label network are assigned to the same user portrait group. This yields stable and accurate user portrait grouping. Because the several edges incident to one vertex of the label network may be assigned to different user portrait groups, the same user portrait can belong to several different user portrait groups.
This is again illustrated with the label network of Fig. 2 and the line graph network of Fig. 3. Suppose that community division of the line graph network of Fig. 3 places vertices AC and BC (that is, edges e_AC and e_BC of the label network) in one user portrait group X, and places vertices AB and BD (that is, edges e_AB and e_BD of the label network) in another user portrait group Y. After the community division result is converted into a user portrait grouping result, users A, B and C are assigned to user portrait group X, and users A, B and D are assigned to user portrait group Y. Users A and B thus belong to both user portrait groups X and Y at the same time.
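The conversion in this example amounts to collecting the endpoints of every label network edge in a community. A minimal sketch, with the community assignment taken from the assumed outcome in the example above and hypothetical variable names:

```python
from collections import defaultdict

# Assumed outcome of community division on the line graph (from the example).
line_vertex_community = {"AC": "X", "BC": "X", "AB": "Y", "BD": "Y"}
# The label-network edge that each line-graph vertex stands for.
line_vertex_to_edge = {
    "AC": ("A", "C"), "BC": ("B", "C"),
    "AB": ("A", "B"), "BD": ("B", "D"),
}

def to_user_groups(line_vertex_community, line_vertex_to_edge):
    """Put both endpoints of every edge into the group of that edge."""
    groups = defaultdict(set)
    for lv, community in line_vertex_community.items():
        groups[community].update(line_vertex_to_edge[lv])
    return dict(groups)

groups = to_user_groups(line_vertex_community, line_vertex_to_edge)
print({g: sorted(members) for g, members in groups.items()})
# {'X': ['A', 'B', 'C'], 'Y': ['A', 'B', 'D']}
```

Users A and B appear in both groups, which is the overlapping membership that partitioning the label network vertices directly could not produce.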
According to another embodiment of the present disclosure, a user portrait grouping device is provided. As shown in Fig. 5, the device may include:
a label network building module 501, configured to build a label network G(V, E) based on the labels of each user portrait, where V is the vertex set of the label network, each vertex represents one user portrait, E is the edge set of the label network, and each edge connects the two vertices corresponding to two user portraits that share at least one label;
a line graph network building module 502, configured to build a line graph network G'(V', E', W') based on the label network, where V' is the vertex set of the line graph network, each vertex of the line graph network represents one edge of the label network, E' is the edge set of the line graph network, each edge of the line graph network connects the two line graph vertices corresponding to two label network edges that share a vertex, and W' is the set of edge weights of the line graph network;
a community division module 503, configured to perform community division on the vertices of the line graph network based on the line graph network;
a conversion module 504, configured to convert the community division result into a user portrait grouping result.
With the above technical solution, user portrait grouping requires no parameters and can be carried out from the labels of the user portraits alone. This avoids any dependence of the grouping result on parameters, so a given set of labeled users yields a stable user portrait grouping result. Compared with existing methods that divide user groups manually based only on labels, labor costs can be greatly reduced. In addition, because community division is performed on the line graph network, the same user portrait can be assigned to several different user portrait groups, which makes user portrait grouping more accurate.
In a kind of possible implementation method, the line chart network struction module 502 can in the following manner calculate institute State the weights on the side in line chart network:Calculate the weights on the side in the label network;Based on the label network for being calculated In side weights, calculate every two similarities having between the side of public vertex in the label network;By the line chart net The weights of each edge in network be equal to this while two summits at corresponding two in the label network while between Similarity.
The line graph network construction module 502 may calculate the weights of the edges in the label network by the following formula:
where $i$ and $j$ denote two vertices in the label network, $e_{ij}$ denotes the edge between vertices $i$ and $j$, and $w_{ij}$ denotes the weight of edge $e_{ij}$.
The line graph network construction module 502 may calculate the similarity by the following formula:
$$S(e_{ik}, e_{jk}) = \frac{\sum_{m \in N_i \cap N_j} (w_{im} + w_{jm})}{\sum_{m \in N_i \cup N_j} (w_{im} + w_{jm})}$$
where $i$, $j$, $k$ and $m$ denote vertices in the label network; $e_{ik}$ denotes the edge between vertices $i$ and $k$; $e_{jk}$ denotes the edge between vertices $j$ and $k$; edges $e_{ik}$ and $e_{jk}$ share the same vertex $k$; $N_i$ denotes the set of all neighbor vertices of vertex $i$, with $i \in N_i$; $N_j$ denotes the set of all neighbor vertices of vertex $j$; $w_{im}$ denotes the weight of the edge $e_{im}$ between vertices $i$ and $m$; and $w_{jm}$ denotes the weight of the edge $e_{jm}$ between vertices $j$ and $m$.
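The similarity can be illustrated with a short Python sketch. It takes the similarity to be the ratio given in claim 4: the sum of w_im + w_jm over the vertices m common to N_i and N_j, divided by the same sum over all vertices in N_i or N_j. Several points are assumptions of this sketch rather than statements of the patent: the function name `edge_similarity`, the nested-dictionary representation of the weights, the inclusion of j in N_j (the text states only that i belongs to N_i), and the treatment of any weight that is not defined (including a vertex paired with itself) as 0.

```python
def edge_similarity(w, i, j):
    """Similarity of two label network edges e_ik and e_jk sharing a vertex k.

    w: dict mapping each vertex to a dict {neighbor: edge weight}
       (hypothetical representation chosen for this sketch).
    i, j: the two endpoints that the edges do not share.
    """
    # Neighbor sets; each vertex is included in its own set (assumed for j).
    n_i = set(w[i]) | {i}
    n_j = set(w[j]) | {j}

    def pair_weight(m):
        # w_im + w_jm, with undefined weights treated as 0 (assumption).
        return w[i].get(m, 0.0) + w[j].get(m, 0.0)

    numerator = sum(pair_weight(m) for m in n_i & n_j)
    denominator = sum(pair_weight(m) for m in n_i | n_j)
    return numerator / denominator if denominator else 0.0


# Example label network: a-b (1), a-c (2), a-d (3), b-c (1).
w = {
    "a": {"b": 1.0, "c": 2.0, "d": 3.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"a": 2.0, "b": 1.0},
    "d": {"a": 3.0},
}
s = edge_similarity(w, "a", "b")  # similarity of edges a-c and b-c
```

In this example the common vertices a, b and c contribute 1 + 1 + 3 = 5 to the numerator, and vertex d adds 3 to the denominator, so the similarity is 5/8 = 0.625. This value would become the weight of the G' edge joining the vertices for a-c and b-c.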
In a possible implementation, the conversion module 504 may be configured to assign the label network vertices that correspond to line graph network vertices divided into the same community to the same user portrait group.
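The conversion step can be sketched as follows. Each vertex of G' stands for one edge of the label network, so a community of G' vertices maps to the set of user portraits at the endpoints of those edges. The function name `communities_to_user_groups` and the input format (a list of communities, each a list of `(u, v)` label network edges) are assumptions made for this sketch; the community division itself is taken as given.

```python
def communities_to_user_groups(edge_communities):
    """Convert communities of G' vertices into user portrait groups.

    edge_communities: list of communities, each an iterable of (u, v)
    label network edges (hypothetical input format).
    Returns one set of user portraits per community.
    """
    groups = []
    for community in edge_communities:
        users = set()
        for u, v in community:
            # Both endpoints of a label network edge join the group.
            users.update((u, v))
        groups.append(users)
    return groups


# Two communities of label network edges; user "c" lies on edges in both.
groups = communities_to_user_groups(
    [[("a", "b"), ("b", "c")], [("c", "d")]]
)
```

Here the groups are {a, b, c} and {c, d}: user c belongs to both, which is how grouping on the edges allows one user portrait to fall into several groups.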
The specific implementation of the operations performed by each module of the user portrait grouping device according to the embodiments of the present disclosure has been described in detail in the user portrait grouping method according to the embodiments of the present disclosure, and is not repeated here.
The preferred implementations of the embodiments of the present disclosure have been described in detail above with reference to the accompanying drawings. However, the embodiments of the present disclosure are not limited to the specific details of the above implementations. Within the scope of the technical concept of the embodiments of the present disclosure, various simple variations may be made to the technical solutions of the embodiments, and these simple variations all fall within the protection scope of the embodiments of the present disclosure.
It should further be noted that the specific technical features described in the above specific implementations may be combined in any suitable manner, provided there is no contradiction. To avoid unnecessary repetition, the various possible combinations are not described separately.
In addition, the various implementations of the embodiments of the present disclosure may also be combined with one another; as long as a combination does not depart from the idea of the embodiments of the present disclosure, it should likewise be regarded as content disclosed by the embodiments of the present disclosure.

Claims (10)

1. A user portrait grouping method, characterized in that the method comprises:
constructing a label network G(V, E) based on the labels of each user portrait, where V denotes the vertex set of the label network, each vertex representing one user portrait, and E denotes the edge set of the label network, each edge connecting the two vertices that correspond to two user portraits having at least one common label;
constructing a line graph network G'(V', E', W') based on the label network, where V' denotes the vertex set of the line graph network, each vertex of the line graph network representing one edge of the label network; E' denotes the edge set of the line graph network, each edge of the line graph network connecting the two vertices that correspond to two edges of the label network sharing a common vertex; and W' denotes the set of weights of the edges of the line graph network;
performing community division on the vertices of the line graph network based on the line graph network; and
converting the community division result into a user portrait grouping result.
2. The method according to claim 1, characterized in that the weights of the edges in the line graph network are calculated by the following steps:
calculating the weights of the edges in the label network;
based on the calculated weights of the edges in the label network, calculating the similarity between every two edges of the label network that share a common vertex; and
setting the weight of each edge in the line graph network equal to the similarity between the two label network edges that correspond to the two vertices of that edge.
3. The method according to claim 2, characterized in that the weights of the edges in the label network are calculated by the following formula:
where $i$ and $j$ denote two vertices in the label network, $e_{ij}$ denotes the edge between vertices $i$ and $j$, and $w_{ij}$ denotes the weight of edge $e_{ij}$.
4. The method according to claim 2 or 3, characterized in that the similarity is calculated by the following formula:
$$S(e_{ik}, e_{jk}) = \frac{\sum_{m \in N_i \cap N_j} (w_{im} + w_{jm})}{\sum_{m \in N_i \cup N_j} (w_{im} + w_{jm})}$$
where $i$, $j$, $k$ and $m$ denote vertices in the label network; $e_{ik}$ denotes the edge between vertices $i$ and $k$; $e_{jk}$ denotes the edge between vertices $j$ and $k$; edges $e_{ik}$ and $e_{jk}$ share the same vertex $k$; $N_i$ denotes the set of all neighbor vertices of vertex $i$, with $i \in N_i$; $N_j$ denotes the set of all neighbor vertices of vertex $j$; $w_{im}$ denotes the weight of the edge $e_{im}$ between vertices $i$ and $m$; and $w_{jm}$ denotes the weight of the edge $e_{jm}$ between vertices $j$ and $m$.
5. The method according to claim 1, characterized in that converting the community division result into a user portrait grouping result comprises:
assigning the label network vertices that correspond to line graph network vertices divided into the same community to the same user portrait group.
6. A user portrait grouping device, characterized in that the device comprises:
a label network construction module, configured to construct a label network G(V, E) based on the labels of each user portrait, where V denotes the vertex set of the label network, each vertex representing one user portrait, and E denotes the edge set of the label network, each edge connecting the two vertices that correspond to two user portraits having at least one common label;
a line graph network construction module, configured to construct a line graph network G'(V', E', W') based on the label network, where V' denotes the vertex set of the line graph network, each vertex of the line graph network representing one edge of the label network; E' denotes the edge set of the line graph network, each edge of the line graph network connecting the two vertices that correspond to two edges of the label network sharing a common vertex; and W' denotes the set of weights of the edges of the line graph network;
a community division module, configured to perform community division on the vertices of the line graph network based on the line graph network; and
a conversion module, configured to convert the community division result into a user portrait grouping result.
7. The device according to claim 6, characterized in that the line graph network construction module calculates the weights of the edges in the line graph network as follows:
calculating the weights of the edges in the label network;
based on the calculated weights of the edges in the label network, calculating the similarity between every two edges of the label network that share a common vertex; and
setting the weight of each edge in the line graph network equal to the similarity between the two label network edges that correspond to the two vertices of that edge.
8. The device according to claim 7, characterized in that the line graph network construction module calculates the weights of the edges in the label network by the following formula:
where $i$ and $j$ denote two vertices in the label network, $e_{ij}$ denotes the edge between vertices $i$ and $j$, and $w_{ij}$ denotes the weight of edge $e_{ij}$.
9. The device according to claim 7 or 8, characterized in that the line graph network construction module calculates the similarity by the following formula:
$$S(e_{ik}, e_{jk}) = \frac{\sum_{m \in N_i \cap N_j} (w_{im} + w_{jm})}{\sum_{m \in N_i \cup N_j} (w_{im} + w_{jm})}$$
where $i$, $j$, $k$ and $m$ denote vertices in the label network; $e_{ik}$ denotes the edge between vertices $i$ and $k$; $e_{jk}$ denotes the edge between vertices $j$ and $k$; edges $e_{ik}$ and $e_{jk}$ share the same vertex $k$; $N_i$ denotes the set of all neighbor vertices of vertex $i$, with $i \in N_i$; $N_j$ denotes the set of all neighbor vertices of vertex $j$; $w_{im}$ denotes the weight of the edge $e_{im}$ between vertices $i$ and $m$; and $w_{jm}$ denotes the weight of the edge $e_{jm}$ between vertices $j$ and $m$.
10. The device according to claim 6, characterized in that the conversion module is configured to assign the label network vertices that correspond to line graph network vertices divided into the same community to the same user portrait group.
Priority Applications (1)

Application Number Priority Date Filing Date Title Status
CN201611259956.9A 2016-12-30 2016-12-30 User portrait clustering method and device Active (CN106874931B)

Publications (2)

Publication Number Publication Date
CN106874931A 2017-06-20
CN106874931B 2021-01-22

Family

ID=59165184

Country Status (1)

Country Link
CN (1) CN106874931B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant