CN108399551A - Method and system for determining user labels and pushing information - Google Patents

Method and system for determining user labels and pushing information

Info

Publication number
CN108399551A
Authority
CN
China
Prior art keywords
user
label
similarity
probability
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710069800.2A
Other languages
Chinese (zh)
Inventor
沈珑斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201710069800.2A
Publication of CN108399551A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0251 Targeted advertisements
    • G06Q30/0254 Targeted advertisements based on statistics

Abstract

The application relates to the field of Internet technology, and in particular to a method and system for determining user labels, in order to solve the prior-art problem that existing ways of determining user labels are inefficient. In the embodiments of the application, the probability of each label for a non-seed user in a similar-user set is determined from the similarities between users in that set and from the labels of the seed users in that set; a label whose probability exceeds a probability threshold is taken as a target label of the non-seed user. Because the label probabilities of non-seed users are derived from the similarity between users, and the target labels are selected by probability, no large volume of data has to be collected to train a model, which improves the efficiency of determining user labels.

Description

Method and system for determining user labels and pushing information
Technical field
This application relates to the field of Internet technology, and in particular to a method and system for determining user labels and pushing information.
Background
With the rapid development of the Internet, and of e-commerce in particular, more and more people have become used to reading news, watching films, and shopping online.
At present, Internet platforms, online merchants, and the like can push information to users as needed. For example, an online merchant can push advertisements to users.
There are two main ways of pushing information to users: targeted and non-targeted.
Non-targeted pushing places the information to be pushed at positions such as web pages; any user who opens the page is shown the preset information.
Targeted pushing determines a user's labels from the user's network behavior and pushes different information to users with different labels.
Targeted pushing is generally implemented with a method based on a user classification model.
A method based on a user classification model labels each website manually, samples users on the basis of those manual labels to form a training population, extracts features from the users' access data, and trains a multi-class classification model; the multi-class model then assigns labels to users according to the data they access.
This approach requires two models, a user classification model and a multi-class model. Training them takes a great deal of time and requires collecting a large volume of data, so determining user labels is inefficient.
Summary of the invention
The application provides a method and system for determining user labels and pushing information, to solve the prior-art problem that existing ways of determining user labels are inefficient.
A method for determining user labels provided by an embodiment of the application includes:
determining a similar-user set, in which any user is associated with at least one user in the set;
determining, from the similarities between users in the similar-user set and the labels of the seed users in the set, the probability of each label for the non-seed users in the set;
taking each label whose probability exceeds a probability threshold as a target label of the non-seed user.
A system for determining user labels provided by an embodiment of the application includes:
a set determining module, configured to determine a similar-user set, in which any user is associated with at least one user in the set;
a processing module, configured to determine, from the similarities between users in the similar-user set and the labels of the seed users in the set, the probability of each label for the non-seed users in the set;
a label determining module, configured to take each label whose probability exceeds a probability threshold as a target label of the non-seed user.
A method for pushing information provided by an embodiment of the application includes:
determining, from the binding relationship between labels and information, the label corresponding to the information to be pushed;
pushing the information to the users corresponding to the determined label;
where the label corresponding to a user is determined as follows:
a similar-user set is determined, in which any user is associated with at least one user in the set; the probability of each label for the non-seed users in the set is determined from the similarities between users in the set and the labels of the seed users in the set; and each label whose probability exceeds a probability threshold is taken as a target label of the non-seed user.
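The pushing method above can be pictured with a minimal sketch. The names `info_labels`, `user_labels`, and `users_to_push`, and all the data, are hypothetical and used only for illustration; the application does not specify how the binding relationship is stored.

```python
def users_to_push(info_id, info_labels, user_labels):
    """Return the users whose labels overlap the labels bound to one piece of information."""
    wanted = info_labels[info_id]  # labels bound to the information to be pushed
    return sorted(user for user, labels in user_labels.items() if wanted & labels)

# Hypothetical binding of information to labels, and hypothetical target labels per user.
info_labels = {"ad_42": {"baby products"}}
user_labels = {
    "C": {"baby products"},
    "F": {"electronic products"},
    "G": {"baby products", "electronic products"},
}

recipients = users_to_push("ad_42", info_labels, user_labels)
print(recipients)
```

A user carrying several labels, such as G here, receives any information bound to at least one of them.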
A system for pushing information provided by an embodiment of the application includes:
a label module, configured to determine, from the binding relationship between labels and information, the label corresponding to the information to be pushed;
a pushing module, configured to push the information to the users corresponding to the determined label;
where the label corresponding to a user is determined as follows:
a similar-user set is determined, in which any user is associated with at least one user in the set; the probability of each label for the non-seed users in the set is determined from the similarities between users in the set and the labels of the seed users in the set; and each label whose probability exceeds a probability threshold is taken as a target label of the non-seed user.
In the embodiments of the application, the probability of each label for the non-seed users in a similar-user set is determined from the similarities between users in the set and the labels of the seed users in the set, and each label whose probability exceeds a probability threshold is taken as a target label of the non-seed user. Because the label probabilities of non-seed users are derived from the similarity between users, and the target labels are selected by probability, no large volume of data has to be collected to train a model, which improves the efficiency of determining user labels.
Description of the drawings
To explain the technical solutions in the embodiments of the application more clearly, the drawings needed to describe the embodiments are briefly introduced below. The drawings described below show only some embodiments of the application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of the method for determining user labels in an embodiment of the application;
Fig. 2 is a schematic diagram of the target users associated with the seed users in an embodiment of the application;
Fig. 3 is a schematic diagram of label propagation in an embodiment of the application;
Fig. 4 is a complete flowchart of the method for determining user labels in an embodiment of the application;
Fig. 5 is a schematic structural diagram of the system for determining user labels in an embodiment of the application;
Fig. 6 is a schematic flowchart of the method for pushing information in an embodiment of the application;
Fig. 7 is a schematic structural diagram of the system for pushing information in an embodiment of the application.
Detailed description
The embodiments of the application can be applied to any scenario in which user labels need to be determined, that is, channel-type scenarios for reaching an audience, such as e-commerce advertising, game advertising, and app distribution.
Labels can be set according to the application scenario and requirements. In an e-commerce scenario, for example, labels can be product categories such as electronic products or baby products; alternatively, labels can be defined as a positive-sample label and a negative-sample label according to whether the user has bought the seller's goods. Label content is described in detail later.
To make the purpose, technical solutions, and advantages of the application clearer, the application is described in further detail below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the application. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the application.
As shown in Fig. 1, the method for determining user labels in an embodiment of the application includes:
Step 100: determine a similar-user set, in which any user is associated with at least one user in the set;
Step 101: determine, from the similarities between users in the similar-user set and the labels of the seed users in the set, the probability of each label for the non-seed users in the set;
Step 102: take each label whose probability exceeds a probability threshold as a target label of the non-seed user.
In the embodiments of the application, the probability of each label for the non-seed users in a similar-user set is determined from the similarities between users in the set and the labels of the seed users in the set, and each label whose probability exceeds a probability threshold is taken as a target label of the non-seed user. Because the label probabilities of non-seed users are derived from the similarity between users, and the target labels are selected by probability, no large volume of data has to be collected to train a model, which improves the efficiency of determining user labels.
The seed users can be determined according to the requesting party. For example, in an e-commerce scenario where user labels are determined in order to push advertisements, the requesting party is the advertiser, and its seed users can be the users who have bought the requesting party's goods.
Once the target labels of the non-seed users have been determined, the requesting party's information can be sent to the non-seed users who carry a label corresponding to the requesting party.
For example, in an e-commerce scenario where the requesting party's information is an advertisement, the advertisement can be sent to a non-seed user carrying the requesting party's label after that user clicks on a page that carries advertising.
If the requesting party is an advertiser, the application can effectively help the advertiser broaden its audience and find potential customers.
The embodiments of the application require the seed users to be determined. Different requesting parties have different seed users.
For example, if the requesting party is an advertiser, the advertiser's customers can serve as the seed population, and labels are propagated through the similar-user network. The network is built from users' access data for content: the more the content accessed by two users overlaps, the closer their relationship.
Taking an e-commerce scenario as an example, suppose requesting party A sells baby products and requesting party B sells electronic products. Users who have bought A's goods can serve as A's seed users, labeled "baby products"; users who have bought B's goods can serve as B's seed users, labeled "electronic products".
In practice, one requesting party can correspond to several labels, for example when it sells several categories of goods; likewise, one seed user can correspond to several labels, for example after buying goods of different categories.
The target labels of the non-seed users can be determined for one requesting party at a time or for several requesting parties at a time.
If they are determined for one requesting party at a time, the seed users are the seed users of that requesting party;
if they are determined for several requesting parties at a time, the seed users are the seed users of all of those requesting parties.
In practice, any user in the similar-user set is associated with at least one user in the set, and this association can be determined by similarity.
Optionally, the similarity between any user in the similar-user set and at least one user in the set meets a similarity condition.
The similarity in the embodiments of the application is a numerical value. The similarity between users can be determined from their network behavior. The calculation can also account for popular pages and crawler bots: the former are visited by a great many users and the latter visit a great many pages, so the similarity calculation penalizes popular pages and users who access a large number of pages. If similarity were measured by the number of commonly accessed pages alone, many users would be highly similar to crawlers, and many dissimilar users would appear highly similar merely because they all visit entertainment sites heavily. In other words, the similarity is proportional to the number of commonly accessed pages, and inversely proportional to each user's own access count and to the popularity of the pages accessed.
Users who share more accessed pages therefore have a higher similarity, while popular pages (such as portal home pages) are down-weighted to improve accuracy.
For example, users A and B each access 10 pages but share only one, and that page belongs to a site that most users visit, such as a portal or an entertainment site; the similarity between A and B is then relatively small.
As another example, users A and B each access 10 pages but share only one, and that page belongs to a vertical site, such as a steel-industry information forum; the similarity between A and B is then relatively large.
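The two examples can be illustrated with a minimal sketch of one possible scoring rule that matches the stated proportionality. The exact formula is not given in the text: the logarithmic page weight, the square-root normalization, and all visitor counts below are assumptions made for illustration.

```python
import math

def similarity(pages_u, pages_v, page_visitors):
    """Shared pages raise the score; popular pages and heavy browsers lower it."""
    common = pages_u & pages_v
    # Assumed weighting: a shared page counts less the more users visit it.
    shared = sum(1.0 / math.log(1 + page_visitors[page]) for page in common)
    # Assumed normalization: divide by the users' own page counts to penalize crawlers.
    return shared / math.sqrt(len(pages_u) * len(pages_v))

# Hypothetical visitor counts: a portal home page versus a niche steel forum.
page_visitors = {"portal": 1_000_000, "steel_forum": 50}
page_visitors.update({f"a{i}": 10 for i in range(9)})
page_visitors.update({f"b{i}": 10 for i in range(9)})

a_other = {f"a{i}" for i in range(9)}  # nine pages visited only by A
b_other = {f"b{i}" for i in range(9)}  # nine pages visited only by B

sim_portal = similarity(a_other | {"portal"}, b_other | {"portal"}, page_visitors)
sim_forum = similarity(a_other | {"steel_forum"}, b_other | {"steel_forum"}, page_visitors)
print(sim_portal, sim_forum)
```

Under these assumptions the pair that shares the niche forum scores higher than the pair that shares the portal page, in line with the two examples.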
Network behavior here is also called the tracking log of user behavior: users' visits across the network are tracked, and when a visitor browses a page, information such as the user, the page address, and the dwell time is recorded.
In practice, a cookie can be used to uniquely identify a user in the network behavior data.
After the set of content pages accessed by every user has been obtained, the similarity between any two users can be determined from their network behavior.
"All users" can mean all users on the network; if the embodiment is applied to a particular platform, it can mean all users of that platform.
Because data collection is limited (for example, not every user's network behavior data can be collected), the behavior data of some users may not be linked to that of any other user, so some users may have no association with anyone; such users are removed.
In the embodiments of the application, a similar-user network is obtained with a similarity algorithm, as shown in Fig. 2.
The similarity algorithm can be the Adamic-Adar algorithm; other collaborative-filtering algorithms, such as ItemCF or SimRank, can also be used.
In Fig. 2, A and B are assumed to be seed users, and C to K are non-seed users.
1, 2, 3 and 4 are vertical websites.
Website 1: accessed by user A, user B, user C, user D and user E;
Website 2: accessed by user B, user D, user F and user G;
Website 3: accessed by user F, user H and user I;
Website 4: accessed by user G, user I, user J and user K.
The similarity algorithm yields the following similar-user network:
User A is associated with user B, user C, user D and user E;
User B is associated with user A, user C, user D, user E, user F and user G;
User C is associated with user A, user B, user D and user E;
User D is associated with user A, user B, user C, user E, user F and user G;
User E is associated with user A, user B, user C and user D;
User F is associated with user B, user D and user G;
User G is associated with user B, user D, user J and user K;
User H is associated with user F and user I;
User I is associated with user F, user G, user H, user J and user K;
User J is associated with user G, user I and user K;
User K is associated with user G, user I and user J.
The degree of association between two associated users can be calculated with any of the algorithms above.
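As a minimal sketch, the website visits listed for Fig. 2 can be turned into a co-visit network scored with the Adamic-Adar index, in which each shared website contributes 1 / log(number of visitors). Linking every pair of co-visitors is an assumption made here; the edge set it produces may differ from the listing wherever the figure applies a further condition.

```python
import math
from collections import defaultdict
from itertools import combinations

# Website visits as listed for Fig. 2.
site_visitors = {
    1: {"A", "B", "C", "D", "E"},
    2: {"B", "D", "F", "G"},
    3: {"F", "H", "I"},
    4: {"G", "I", "J", "K"},
}

scores = defaultdict(float)
for visitors in site_visitors.values():
    if len(visitors) < 2:
        continue  # a site with a single visitor links nobody (and log(1) is 0)
    weight = 1.0 / math.log(len(visitors))  # popular sites contribute less
    for u, v in combinations(sorted(visitors), 2):
        scores[(u, v)] += weight
        scores[(v, u)] += weight

# Users are treated as associated when they share at least one website (assumption).
neighbours = defaultdict(set)
for u, v in scores:
    neighbours[u].add(v)

print(sorted(neighbours["A"]), round(scores[("B", "D")], 3))
```

Users B and D share two websites, so their score is higher than that of A and B, who share only website 1.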
Once the degrees of association between users are known, it can be determined which users are placed in the similar-user set.
The similarity condition can include, but is not limited to, some or all of the following:
the degree of association is greater than a threshold;
the degree of association is among the N largest.
If the similarity condition includes the degree of association being greater than a threshold, users whose degree of association exceeds the threshold can be placed in the similar-user set. The threshold can be set as needed; if it is set to 0, every user with a non-zero degree of association can be placed in the set.
If the similarity condition includes the N largest degrees of association, the degrees of association can be sorted in descending order, and the users corresponding to the first N placed in the similar-user set.
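The two similarity conditions can be sketched as a small filter. The function name `select_associates` and the scores are hypothetical, and applying both conditions together (threshold first, then top N) is only one possible reading of "some or all".

```python
def select_associates(scores, threshold=None, top_n=None):
    """Pick associated users by a score threshold, a top-N cut, or both."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if threshold is not None:
        ranked = [(user, score) for user, score in ranked if score > threshold]
    if top_n is not None:
        ranked = ranked[:top_n]
    return [user for user, _ in ranked]

# Hypothetical degrees of association between one user and three others.
scores_for_a = {"B": 0.2, "C": 0.8, "D": 0.0}

print(select_associates(scores_for_a, threshold=0))  # non-zero associations only
print(select_associates(scores_for_a, top_n=1))      # the single closest user
```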
After the similar-user set has been determined, the probability of each label for the non-seed users in the set is determined from the similarities between users in the set and the labels of the seed users in the set.
Optionally, when the probability of each label for the non-seed users in the similar-user set is determined, a user similarity matrix is determined from the similarities between users in the set, and a seed user label matrix is determined from the labels of the seed users;
a probability transfer matrix is determined from the user similarity matrix, and a pending label probability matrix is determined from the seed user label matrix;
label propagation is performed on the pending label probability matrix according to the probability transfer matrix to obtain a label probability matrix;
the probability of each label for the non-seed users in the similar-user set is determined from the label probability matrix.
In practice, determining the user similarity matrix from the similarities between users in the similar-user set simply means arranging the similarity values in matrix form.
The similarity between any two users comprises two values: the similarity from user A to user B, and the similarity from user B to user A.
In practice, the similarity can be symmetric, meaning that the similarity from user A to user B equals the similarity from user B to user A; it can also be asymmetric, meaning that the two values differ.
Take Fig. 3 as an example. The similarity in the figure is asymmetric, and there are four users: user A, user B, user C and user D.
The similarity from user A to user B is 0.2;
the similarity from user A to user C is 0.8;
the similarity from user B to user C is 1.0;
the similarity from user D to user B is 1.0.
The similarity matrix obtained from the above is:
In the similarity matrix, both axes are users, and each value is the similarity from the user on the vertical axis (row) to the user on the horizontal axis (column). A user's similarity to itself is 1.
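A minimal sketch of how the four similarities listed for Fig. 3 can be arranged as a matrix, with rows as the source user and columns as the target user. Setting every pair that is not listed to 0 is an assumption made for this sketch.

```python
users = ["A", "B", "C", "D"]
index = {user: position for position, user in enumerate(users)}

# Directed similarities as listed for Fig. 3: (from, to) -> value.
listed = {("A", "B"): 0.2, ("A", "C"): 0.8, ("B", "C"): 1.0, ("D", "B"): 1.0}

# Start from the identity (self-similarity is 1); unlisted pairs are assumed to be 0.
W = [[1.0 if row == col else 0.0 for col in range(len(users))] for row in range(len(users))]
for (source, target), value in listed.items():
    W[index[source]][index[target]] = value

for user, row in zip(users, W):
    print(user, row)
```

The matrix is asymmetric: the entry from A to B is 0.2, while the entry from B to A stays at 0.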
In practice, labels can be set in many ways; two are described below.
Mode one: set labels according to whether the user is related to the requesting party.
Only two labels need to be set here: 1. the positive-sample label, meaning related to the requesting party; 2. the negative-sample label, meaning unrelated to the requesting party.
The criterion for judging whether a user is related to the requesting party differs between application scenarios.
Taking an e-commerce scenario as an example, the requesting party is seller A. If a user has bought the seller's goods, the user is determined to be related to the requesting party;
if the seller's advertisement was shown to the user but the user did not click on it, the user is determined to be unrelated to the requesting party.
All positive-sample users and negative-sample users of seller A can serve as the seed users of seller A.
Mode two: set labels according to the scenario.
Labels are set according to the requesting party; requesting parties differ between scenarios, and so do the corresponding labels.
Taking an e-commerce scenario as an example, the requesting party is a seller, so the labels can include the categories of goods the seller sells, such as baby products, electronic products, and bedding.
Optionally, in mode two the labels can also include a negative-sample label, meaning unrelated to the requesting party.
The criterion for judging whether a user is related to the requesting party differs between application scenarios.
Taking an e-commerce scenario as an example, if the seller's advertisement was shown to the user but the user did not click on it, the user is determined to be unrelated to the requesting party.
After the user similarity matrix and the seed user label matrix have been determined, the probability transfer matrix is determined from the user similarity matrix, and the pending label probability matrix is determined from the seed user label matrix.
How to determine the probability transfer matrix and the label probability matrix is described below.
1. Determine the probability transfer matrix from the user similarity matrix.
One possible expression for the probability transfer matrix P is given by the following formula:
where $P_{ij}$ denotes the probability of transferring from node (that is, user) j to node i, $\omega_{ji}$ denotes the similarity from node j to node i, and n denotes the number of nodes adjacent to node i. Which nodes are adjacent to node i can be seen in Fig. 2.
In practice, the probability transfer matrix P can be asymmetric, that is, $\omega_{ij} \neq \omega_{ji}$. It can of course also be symmetric.
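A minimal sketch of one normalization that is consistent with the symbol definitions above: the similarity into node i is divided by the total similarity flowing into node i, so each row of P sums to 1. This reading, the inclusion of self-similarity in the sum, and the 0 entries for pairs not listed in the Fig. 3 example are all assumptions rather than the formula of the application.

```python
def transfer_matrix(W):
    """P[i][j] = W[j][i] / sum_k W[k][i], where W[j][i] is the similarity from j to i."""
    size = len(W)
    P = [[0.0] * size for _ in range(size)]
    for i in range(size):
        incoming = sum(W[k][i] for k in range(size))  # total similarity flowing into node i
        for j in range(size):
            P[i][j] = W[j][i] / incoming if incoming else 0.0
    return P

# Similarity matrix for users A, B, C, D from the Fig. 3 example (unlisted pairs assumed 0).
W = [
    [1.0, 0.2, 0.8, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
]

P = transfer_matrix(W)
for row in P:
    print([round(value, 3) for value in row])
```

Because every row of P sums to 1, multiplying P by a matrix whose rows are probability distributions again yields rows that are probability distributions.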
2. Determine the pending label probability matrix from the seed user label matrix.
Optionally, the pending label probability matrix can be determined by the following formula:
where $Y_L$ denotes the probability distribution of the L seed users (that is, the seed user label matrix) and $Y_U$ denotes the probability distribution of the U non-seed users. The rows of the matrix $F^{\circ}$ are the user dimension (L+U) and the columns are the label dimension (C).
In practice, the initial value of $Y_U$ can be set randomly, for example by drawing C random numbers and normalizing them so that they sum to 1. Optionally, when the initial value of $Y_U$ is set, the probabilities are kept normalized, that is, the probabilities of each user (the row sum) add up to 1.
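A minimal sketch of this initialization, assuming the pending matrix stacks the seed rows $Y_L$ on top of randomly initialized non-seed rows $Y_U$ (as the description of the row and column dimensions suggests). The function name and the two-label seed matrix are hypothetical.

```python
import random

def initial_label_matrix(seed_labels, n_non_seed, n_labels, rng):
    """Stack the seed rows Y_L on top of random, row-normalized non-seed rows Y_U."""
    rows = [list(row) for row in seed_labels]
    for _ in range(n_non_seed):
        draws = [rng.random() for _ in range(n_labels)]  # C random numbers
        total = sum(draws)
        rows.append([value / total for value in draws])  # normalize so the row sums to 1
    return rows

# Hypothetical seed label matrix: two seed users, two labels.
Y_L = [[1.0, 0.0], [0.0, 1.0]]
F0 = initial_label_matrix(Y_L, n_non_seed=2, n_labels=2, rng=random.Random(0))

for row in F0:
    print([round(value, 3) for value in row])
```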
Suppose the labels are the positive-sample label (denoted X) and the negative-sample label (denoted Y).
Taking Fig. 3 as an example, one feasible seed user label matrix is:
In the seed user label matrix, the horizontal axis is the label and the vertical axis is the user; each value is the probability that the user has the label of that column.
Suppose the labels are baby products (denoted X), electronic products (denoted Y), and the negative-sample label (denoted Z).
One feasible seed user label matrix is:
For the different ways of setting labels, two examples are given:
Suppose an e-commerce platform has 10 advertisers, two of which are in the same line of business, so there are 9 product categories in total.
1. User labels are determined separately for each advertiser.
Take advertiser $G_i$ as an example. Seed users (both positive and negative samples) are selected: users converted by advertiser $G_i$ (that is, users who bought $G_i$'s goods) are positive samples, and users who viewed $G_i$'s advertisement but did not convert (or showed no other behavior) are negative samples. The labels are Y and X respectively, so P(Y) = 1 for a positive sample and P(X) = 1 for a negative sample.
2. User labels are determined for all advertisers together.
The labels are Y1 ... Y9 (the 9 product categories) and the negative-sample label X (users who viewed an advertiser's advertisement but did not convert, or showed no other behavior).
Users who converted in any of these 9 categories are positive samples, for which P(X) = 0. Negative-sample users have P(X) = 1.
The probability distribution of the non-seed users (that is, the non-seed user label matrix) is:
In this label matrix, the horizontal axis is the label and the vertical axis is the user; each value is the probability that the user has the label of that column.
When label propagation is performed on the pending label probability matrix according to the probability transfer matrix, the label probability matrix can be obtained by multiplying the probability transfer matrix by the pending label probability matrix, as in the following formula:
$F = P F^{\circ}$
where $F$ is the label probability matrix, $F^{\circ}$ is the pending label probability matrix, and $P$ is the probability transfer matrix.
Expanding the formula above element by element gives:
$F_{ic} = \sum_{k} P_{ik} \times F^{\circ}_{kc}$
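A minimal sketch of a single propagation step, written directly from the element-wise form (row i of P times column c of the pending matrix). The two-user matrices are hypothetical values chosen so that the result is easy to check by hand.

```python
def propagate(P, F0):
    """One propagation step: F[i][c] = sum over k of P[i][k] * F0[k][c]."""
    n_users, n_labels = len(F0), len(F0[0])
    return [
        [sum(P[i][k] * F0[k][c] for k in range(n_users)) for c in range(n_labels)]
        for i in range(len(P))
    ]

# Hypothetical two-user example; each row of P sums to 1.
P = [[0.5, 0.5], [0.25, 0.75]]
F0 = [[1.0, 0.0], [0.0, 1.0]]

F = propagate(P, F0)
print(F)
```

With the identity-like pending matrix used here, the result equals P itself, which makes the indexing easy to verify.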
Because each round of the label propagation algorithm only moves closer to the optimal value, the probabilities of the labels of non-seed users determined from a single label probability matrix are not very accurate. The label probability matrix can therefore be determined repeatedly and, once the label propagation termination condition is met, the probabilities of the labels corresponding to non-seed users are determined from the label probability matrix obtained last.
Optionally, after label propagation is performed on the pending label probability matrix according to the probability transfer matrix, it is judged whether the label propagation termination condition is met;
If it is met, the probabilities of the labels corresponding to non-seed users are determined from the label probability matrix obtained last;
Otherwise, the seed user label matrix is reset, and the process returns to the step of determining the pending label probability matrix according to the seed user label matrix.
After returning to that step, the pending label probability matrix is again determined according to the seed user label matrix, and label propagation is performed on it according to the probability transfer matrix to obtain a label probability matrix.
Because the seed user label matrix can change after label propagation, the seed user label matrix may be reset each time (that is, the seed user label matrix used the first time is used again).
Because $F = P F^{\circ}$, the $F$ obtained after an iteration also consists of an upper part and a lower part: the upper part is the seed user label matrix and the lower part is the non-seed user label matrix. After each iteration, the seed user label matrix within the matrix obtained from that iteration is therefore replaced with the reset seed user label matrix to obtain the pending label probability matrix, and the next iteration is carried out.
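The reset step can be sketched as follows. This is an illustrative sketch, not the patent's implementation; `clamp_seed_rows` and the `seed_rows` mapping (row index to original seed label row) are hypothetical names.

```python
def clamp_seed_rows(F, seed_rows):
    """Return a copy of F whose seed rows are replaced by the original seed labels."""
    clamped = [row[:] for row in F]
    for index, original_row in seed_rows.items():
        clamped[index] = original_row[:]
    return clamped


# After one iteration the seed row (index 0) has drifted away from its known label.
F_after_iteration = [[0.6, 0.4], [0.3, 0.7]]
pending = clamp_seed_rows(F_after_iteration, {0: [1.0, 0.0]})
print(pending)
```

The non-seed rows are carried over unchanged, so only the known labels are restored before the next iteration.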
The label propagation termination condition includes, but is not limited to, some or all of the following:
The number of label propagation rounds performed equals the set number of iterations (for example, 30);
The label probability matrix obtained last has converged.
There are many ways to judge whether the label probability matrix obtained last has converged, for example:
If the difference between the norm of the current label probability matrix and the norm of the label probability matrix obtained after the previous iteration is less than a threshold (for example, 0.000001), convergence is determined.
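A minimal sketch of these two termination checks follows. The source only says "norm"; the Frobenius norm used here is an assumption, as are the names `frobenius_norm` and `should_stop`. The defaults of 30 rounds and 0.000001 are the example values given above.

```python
import math


def frobenius_norm(M):
    """Square root of the sum of squared entries (assumed choice of norm)."""
    return math.sqrt(sum(value * value for row in M for value in row))


def should_stop(F_new, F_old, rounds_done, max_rounds=30, tol=1e-6):
    """True if the iteration budget is used up or the norm difference is below tol."""
    if rounds_done >= max_rounds:
        return True
    return abs(frobenius_norm(F_new) - frobenius_norm(F_old)) < tol
```

Comparing norms, as the text describes, is cheap but coarse; comparing the matrices entry by entry would be a stricter alternative.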
For example:
Similarity matrix:
The probability transfer matrix P is obtained from the similarity matrix:
Initial label probability matrix F° (the rows for users B and C are random):
The result of the first iteration is obtained:
Iteration continues until convergence.
The label probability matrix obtained after multiple iterations is:
Assuming that the probability threshold is 0.5, the label of user B is Y and the label of user C is X.
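The thresholding step can be sketched as below. The probabilities in `F` are hypothetical values chosen to reproduce the stated outcome; they are not the matrix from the original example, and treating A and D as the seed users is likewise an assumption.

```python
def pick_target_labels(F, users, labels, seed_users, threshold=0.5):
    """Map each non-seed user to the labels whose probability exceeds the threshold."""
    targets = {}
    for user, row in zip(users, F):
        if user in seed_users:
            continue  # seed users already carry known labels
        targets[user] = [label for label, p in zip(labels, row) if p > threshold]
    return targets


# Hypothetical converged probabilities (rows: users A-D, columns: labels X, Y).
users = ["A", "B", "C", "D"]
labels = ["X", "Y"]
F = [[1.0, 0.0], [0.2, 0.8], [0.7, 0.3], [0.0, 1.0]]
targets = pick_target_labels(F, users, labels, seed_users={"A", "D"})
print(targets)
```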
In implementation, because users' network behavior keeps changing, a period can be set, such as one day or one week, and the scheme of the embodiment of the present application executed once per period, which helps ensure the accuracy of the determined user labels.
As shown in Fig. 4, the complete method for determining user tags in the embodiment of the present application includes:
Step 400: determine the network behavior of users.
Step 401: determine a similarity user set according to the network behavior of the users.
Step 402: determine a user similarity matrix according to the similarities between users in the similarity user set, and determine a seed user label matrix according to the labels corresponding to the seed users.
Step 403: determine a probability transfer matrix according to the user similarity matrix.
Step 404: determine a pending label probability matrix according to the seed user label matrix.
Step 405: perform label propagation on the pending label probability matrix according to the probability transfer matrix to obtain a label probability matrix.
Step 406: judge whether the label propagation termination condition is met; if so, execute step 408; otherwise, execute step 407.
Step 407: reset the seed user label matrix, and return to step 403.
Step 408: determine, according to the label probability matrix obtained last, the probabilities of the labels corresponding to the non-seed users in the similarity user set.
Step 409: take a label whose probability is greater than the probability threshold as a target label corresponding to the non-seed user.
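Steps 404 to 408 can be sketched as a single loop. This is a sketch under stated assumptions rather than the patent's implementation: the probability transfer matrix `P` is taken as given, the Frobenius norm is assumed for the convergence test, non-seed rows are initialised to zero instead of randomly, and the names `run_label_propagation` and `seed_rows` are hypothetical.

```python
import math


def run_label_propagation(P, seed_rows, n_labels, max_rounds=30, tol=1e-6):
    """Iterate F = P x pending, resetting seed rows between rounds."""
    n = len(P)
    # Step 404: pending matrix built from the seed labels (zeros elsewhere, assumed).
    pending = [seed_rows.get(i, [0.0] * n_labels)[:] for i in range(n)]
    previous_norm = None
    F = pending
    for _ in range(max_rounds):
        # Step 405: one propagation round.
        F = [[sum(P[i][k] * pending[k][c] for k in range(n)) for c in range(n_labels)]
             for i in range(n)]
        # Step 406: convergence test on the difference of norms.
        norm = math.sqrt(sum(v * v for row in F for v in row))
        if previous_norm is not None and abs(norm - previous_norm) < tol:
            break
        previous_norm = norm
        # Step 407: reset the seed rows before the next round.
        pending = [seed_rows.get(i, F[i])[:] for i in range(n)]
    return F  # Step 408 reads the non-seed rows of this matrix.


# Illustrative data: users 0 and 2 are seeds, user 1 is the non-seed user.
P = [[0.0, 1.0, 0.0], [0.8, 0.0, 0.2], [0.0, 1.0, 0.0]]
F = run_label_propagation(P, {0: [1.0, 0.0], 2: [0.0, 1.0]}, n_labels=2)
print(F[1])
```

In this toy example the non-seed user's row settles at roughly 0.8 for the first label and 0.2 for the second, mirroring its transfer weights towards the two seeds.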
Based on the same inventive concept, an embodiment of the present application further provides a system for determining user tags. Because the principle by which the system solves the problem is similar to that of the method for determining user tags in the embodiment of the present application, the implementation of the system may refer to the implementation of the method, and repeated details are not described again.
As shown in Fig. 5, the system for determining user tags in the embodiment of the present application includes:
Set determining module 500, configured to determine a similarity user set, wherein any user in the similarity user set is associated with at least one user in the similarity user set;
Processing module 501, configured to determine, according to the similarities between users in the similarity user set and the labels corresponding to the seed users in the similarity user set, the probabilities of the labels corresponding to the non-seed users in the similarity user set;
Label determining module 502, configured to take a label whose probability is greater than the probability threshold as a target label corresponding to the non-seed user.
The embodiment of the present application can determine the corresponding seed users according to the requesting party. For example, in an e-commerce scenario where user labels are determined in order to deliver advertisements to users, the requesting party is the advertiser, and the seed users corresponding to the requesting party can be users who have purchased the requesting party's goods.
After the target labels corresponding to non-seed users are determined, the requesting party's information can be sent to the non-seed users that have the label corresponding to the requesting party.
For example, when the embodiment of the present application is applied to an e-commerce scenario and the requesting party's information is an advertisement, the requesting party's advertisement can be sent to a non-seed user having the label corresponding to the requesting party after that user clicks a page containing advertisements.
If the requesting party of the present application is an advertiser, the present application can effectively help the advertiser expand its audience and find potential customers.
The embodiment of the present application needs to determine seed users. The seed users corresponding to different requesting parties also differ.
For example, if the requesting party is an advertiser, the advertiser's customers can be used as the seed population and their labels propagated through the similar-user network. The similar-user network is built from users' content access data: users whose accessed content overlaps more have a closer neighbor relationship.
Taking application of the embodiment to an e-commerce scenario as an example, assume that requesting party A sells baby products and requesting party B sells electronic products. Users who have bought requesting party A's goods can be used as the seed users corresponding to requesting party A, with the seed user label marked as baby products; users who have bought requesting party B's goods can be used as the seed users corresponding to requesting party B, with the seed user label marked as electronic products.
In implementation, one requesting party may correspond to multiple labels, for example when it sells several different categories of goods; similarly, one seed user may also correspond to multiple labels, for example when the user has bought different categories of goods.
The embodiment of the present application may determine the target labels corresponding to non-seed users for one requesting party at a time, or for multiple requesting parties at a time.
If the target labels corresponding to non-seed users are determined for one requesting party at a time, the seed users here are the seed users corresponding to that one requesting party;
If the target labels corresponding to non-seed users are determined for multiple requesting parties at a time, the seed users here are the seed users corresponding to the multiple requesting parties.
In implementation, any user in the similarity user set is associated with at least one user in the similarity user set, and the association here can be determined by similarity.
Optionally, the similarity between any user in the similarity user set and at least one user in the similarity user set meets a similarity condition.
The similarity condition may include, but is not limited to, some or all of the following:
The degree of association is greater than a threshold;
The similarity is among the N largest.
If the similarity condition includes the degree of association being greater than a threshold, users whose degree of association is greater than the threshold can be placed in the similarity user set. The threshold can be set as needed; if it is set to 0, every user with a non-zero degree of association can be placed in the similarity user set.
If the similarity condition includes being among the N largest similarities, users can be sorted by degree of association from largest to smallest, and the users corresponding to the first N degrees of association placed in the similarity user set.
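Both similarity conditions can be sketched in one helper. The name `select_similar_users` and its parameters are hypothetical, and the sample degrees of association are made up for illustration.

```python
def select_similar_users(associations, threshold=None, top_n=None):
    """Pick users by degree of association: above a threshold and/or the top N."""
    ranked = sorted(associations.items(), key=lambda item: item[1], reverse=True)
    if threshold is not None:
        ranked = [(user, score) for user, score in ranked if score > threshold]
    if top_n is not None:
        ranked = ranked[:top_n]
    return [user for user, _ in ranked]


# Hypothetical degrees of association between one user and three candidates.
associations = {"B": 0.2, "C": 0.8, "D": 0.0}
print(select_similar_users(associations, threshold=0))
print(select_similar_users(associations, top_n=1))
```

With a threshold of 0, every candidate with a non-zero degree of association is kept, which matches the case described above.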
Optionally, the processing module 501 is specifically configured to:
Determine a user similarity matrix according to the similarities between users in the similarity user set, and determine a seed user label matrix according to the labels corresponding to the seed users;
Determine a probability transfer matrix according to the user similarity matrix, and determine a pending label probability matrix according to the seed user label matrix;
Perform label propagation on the pending label probability matrix according to the probability transfer matrix to obtain a label probability matrix;
Determine, according to the label probability matrix, the probabilities of the labels corresponding to the non-seed users in the similarity user set.
In implementation, determining the user similarity matrix according to the similarities between users in the similarity user set means arranging the similarity values in matrix form.
The similarity between any two users comprises two similarities, namely the similarity of user A to user B and the similarity of user B to user A.
In implementation, the similarity in the embodiment of the present application may be symmetric, that is, the similarity of user A to user B equals the similarity of user B to user A; it may also be asymmetric, that is, the similarity of user A to user B does not equal the similarity of user B to user A.
Take Fig. 3 as an example. The similarities in the figure are asymmetric, and the figure includes four users: user A, user B, user C and user D.
The similarity of user A to user B is 0.2;
The similarity of user A to user C is 0.8;
The similarity of user B to user C is 1.0;
The similarity of user D to user B is 1.0.
From the above, the similarity matrix is obtained as:
In the similarity matrix, both the abscissa and the ordinate are users, and each entry represents the similarity of the ordinate's user to the abscissa's user.
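The four similarities listed for Fig. 3 can be arranged into such a matrix as sketched below. Rows are the ordinate (source) users and columns the abscissa (target) users; setting every pair not listed above to 0.0, including each user's similarity to itself, is an assumption.

```python
users = ["A", "B", "C", "D"]
# Directed similarities from Fig. 3: (from user, to user) -> similarity.
edges = {("A", "B"): 0.2, ("A", "C"): 0.8, ("B", "C"): 1.0, ("D", "B"): 1.0}

index = {user: position for position, user in enumerate(users)}
# Assumption: pairs that are not listed have similarity 0.0.
S = [[0.0] * len(users) for _ in users]
for (source, target), value in edges.items():
    S[index[source]][index[target]] = value

for user, row in zip(users, S):
    print(user, row)
```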
In implementation, there are many ways to set the labels of the embodiment of the present application; two are listed below.
Mode one: set labels according to whether the user is related to the requesting party.
Only two labels need to be set here: 1. the positive sample label, meaning related to the requesting party; 2. the negative sample label, meaning unrelated to the requesting party.
When judging whether a user is related to the requesting party, the criteria differ across application scenarios.
Taking application of the embodiment to an e-commerce scenario as an example, the requesting party is seller A. If a user has bought the seller's goods, the user is determined to be related to the requesting party;
If the seller's advertisement was once shown to the user but the user did not click it, the user is determined to be unrelated to the requesting party.
All positive sample users and negative sample users corresponding to seller A can serve as seed users of seller A.
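Mode one can be sketched as a small labelling helper. The function name, the label strings and the rule that a purchase takes precedence over an unclicked impression are all assumptions made for illustration.

```python
def label_seed_users(purchasers, shown_not_clicked):
    """Assign 'positive' to buyers and 'negative' to users who ignored the advertisement."""
    labels = {}
    for user in shown_not_clicked:
        labels[user] = "negative"
    for user in purchasers:
        # Assumption: a purchase outweighs an earlier unclicked impression.
        labels[user] = "positive"
    return labels


# Hypothetical users of seller A.
seed_labels = label_seed_users(purchasers={"u1", "u2"}, shown_not_clicked={"u2", "u3"})
print(sorted(seed_labels.items()))
```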
Mode two: set labels according to the scenario.
Labels are set according to the requesting party; requesting parties differ across scenarios, and so do the corresponding labels.
Taking application of the embodiment to an e-commerce scenario as an example, the requesting party is a seller, and the labels may include the categories of goods the seller sells, such as baby products, electronic products and bedding.
Optionally, for mode two, the labels may also include a negative sample label, meaning unrelated to the requesting party.
When judging whether a user is related to the requesting party, the criteria differ across application scenarios.
Taking application of the embodiment to an e-commerce scenario as an example, if the seller's advertisement was once shown to the user but the user did not click it, the user is determined to be unrelated to the requesting party.
After the user similarity matrix and the seed user label matrix are determined, the probability transfer matrix is determined according to the user similarity matrix, and the pending label probability matrix is determined according to the seed user label matrix.
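One common way to derive a probability transfer matrix from a similarity matrix is to normalise each row so that it sums to 1. This passage does not say which normalisation is used, so the sketch below is an assumption, and `row_normalize` is a hypothetical name.

```python
def row_normalize(S):
    """Divide each row by its sum; rows that sum to zero are left as zeros."""
    P = []
    for row in S:
        total = sum(row)
        P.append([value / total if total else 0.0 for value in row])
    return P


# Illustrative similarity rows; the second user has no similar users at all.
P = row_normalize([[2.0, 6.0], [0.0, 0.0]])
print(P)
```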
Optionally, the processing module 501 is further configured to:
After label propagation is performed on the pending label probability matrix according to the probability transfer matrix to obtain the label probability matrix, and after it is determined that the label propagation termination condition is met, determine the probabilities of the labels corresponding to the target users according to the label probability matrix.
Optionally, the processing module 501 is further configured to:
After label propagation is performed on the pending label probability matrix according to the probability transfer matrix to obtain the label probability matrix, if the label propagation termination condition is not met, reset the seed user label matrix and return to the step of determining the pending label probability matrix according to the seed user label matrix.
Optionally, the label propagation termination condition is some or all of the following:
The number of label propagation rounds performed equals the set number of iterations;
The label probability matrix obtained last has converged.
As shown in Fig. 6, the method for pushing information in the embodiment of the present application includes:
Step 600: determine, according to the binding relationship between labels and information, the label corresponding to the information to be pushed;
Step 601: push the information to be pushed to the users corresponding to the determined label;
The label corresponding to a user is determined as follows:
Determine a similarity user set, wherein any user in the similarity user set is associated with at least one user in the similarity user set; determine, according to the similarities between users in the similarity user set and the labels corresponding to the seed users in the similarity user set, the probabilities of the labels corresponding to the non-seed users in the similarity user set; and take a label whose probability is greater than the probability threshold as a target label corresponding to the non-seed user.
The users corresponding to the label in step 601 include seed users and/or non-seed users.
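Steps 600 and 601 can be sketched as a lookup followed by a filter. The data structures `info_to_label` and `user_labels`, the function name `users_to_push` and the sample identifiers are hypothetical.

```python
def users_to_push(info_id, info_to_label, user_labels):
    """Return the users whose labels include the label bound to the given information."""
    label = info_to_label[info_id]  # step 600: label bound to the information
    # Step 601: every user (seed or non-seed) carrying that label is a recipient.
    return sorted(user for user, labels in user_labels.items() if label in labels)


# Hypothetical binding relationship and user labels.
info_to_label = {"ad-1": "baby products", "ad-2": "electronic products"}
user_labels = {"u1": {"baby products"}, "u2": {"electronic products"}, "u3": {"baby products"}}
recipients = users_to_push("ad-1", info_to_label, user_labels)
print(recipients)
```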
The way the label corresponding to a user is determined here is the same as the way described above in the embodiment of the present application, and is not repeated here.
Optionally, the information in the embodiment of the present application is advertising information.
As shown in Fig. 7, the system for pushing information in the embodiment of the present application includes:
Label module 700, configured to determine, according to the binding relationship between labels and information, the label corresponding to the information to be pushed;
Pushing module 701, configured to push the information to be pushed to the users corresponding to the determined label;
The label corresponding to a user is determined as follows:
Determine a similarity user set, wherein any user in the similarity user set is associated with at least one user in the similarity user set; determine, according to the similarities between users in the similarity user set and the labels corresponding to the seed users in the similarity user set, the probabilities of the labels corresponding to the non-seed users in the similarity user set; and take a label whose probability is greater than the probability threshold as a target label corresponding to the non-seed user.
The way the label corresponding to a user is determined here is the same as the way described above in the embodiment of the present application, and is not repeated here.
It should be noted that the system for determining the label corresponding to a user and the system for pushing information to users may be the same system or different systems.
Optionally, the information is advertising information.
The present application is described above with reference to block diagrams and/or flowcharts of methods, apparatuses (systems) and/or computer program products according to embodiments of the present application. It should be understood that one block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer and/or other programmable data processing apparatus to produce a machine, such that the instructions, executed via the computer processor and/or other programmable data processing apparatus, create means for implementing the functions/acts specified in the block diagram and/or flowchart blocks.
Accordingly, the present application may also be implemented in hardware and/or software (including firmware, resident software, microcode and the like). Furthermore, the present application may take the form of a computer program product on a computer-usable or computer-readable storage medium, having computer-usable or computer-readable program code embodied in the medium for use by or in connection with an instruction execution system. In the context of the present application, a computer-usable or computer-readable medium may be any medium that can contain, store, communicate, propagate or transport the program for use by or in connection with an instruction execution system, apparatus or device.
Obviously, those skilled in the art can make various modifications and variations to the present application without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the present application and their equivalent technologies, the present application is also intended to include them.

Claims (16)

1. A method for determining user tags, characterized in that the method comprises:
Determining a similarity user set, wherein any user in the similarity user set is associated with at least one user in the similarity user set;
Determining, according to the similarities between users in the similarity user set and the labels corresponding to the seed users in the similarity user set, the probabilities of the labels corresponding to the non-seed users in the similarity user set;
Taking a label whose probability is greater than a probability threshold as a target label corresponding to the non-seed user.
2. The method according to claim 1, characterized in that the similarity between any user in the similarity user set and at least one user in the similarity user set meets a similarity condition.
3. The method according to claim 1, characterized in that determining, according to the similarities between users in the similarity user set and the labels corresponding to the seed users in the similarity user set, the probabilities of the labels corresponding to the non-seed users in the similarity user set comprises:
Determining a user similarity matrix according to the similarities between users in the similarity user set, and determining a seed user label matrix according to the labels corresponding to the seed users;
Determining a probability transfer matrix according to the user similarity matrix, and determining a pending label probability matrix according to the seed user label matrix;
Performing label propagation on the pending label probability matrix according to the probability transfer matrix to obtain a label probability matrix;
Determining, according to the label probability matrix, the probabilities of the labels corresponding to the non-seed users in the similarity user set.
4. The method according to claim 3, characterized in that, after performing label propagation on the pending label probability matrix according to the probability transfer matrix to obtain the label probability matrix, and before determining the probabilities of the labels corresponding to the target users according to the label probability matrix, the method further comprises:
Determining that a label propagation termination condition is met.
5. The method according to claim 4, characterized in that, after performing label propagation on the pending label probability matrix according to the probability transfer matrix to obtain the label probability matrix, the method further comprises:
If the label propagation termination condition is not met, resetting the seed user label matrix and returning to the step of determining the pending label probability matrix according to the seed user label matrix.
6. The method according to claim 4 or 5, characterized in that the label propagation termination condition is some or all of the following:
The number of label propagation rounds performed equals the set number of iterations;
The label probability matrix obtained last has converged.
7. A system for determining user tags, characterized in that the system comprises:
A set determining module, configured to determine a similarity user set, wherein any user in the similarity user set is associated with at least one user in the similarity user set;
A processing module, configured to determine, according to the similarities between users in the similarity user set and the labels corresponding to the seed users in the similarity user set, the probabilities of the labels corresponding to the non-seed users in the similarity user set;
A label determining module, configured to take a label whose probability is greater than a probability threshold as a target label corresponding to the non-seed user.
8. The system according to claim 7, characterized in that the similarity between any user in the similarity user set and at least one user in the similarity user set meets a similarity condition.
9. The system according to claim 7, characterized in that the processing module is specifically configured to:
Determine a user similarity matrix according to the similarities between users in the similarity user set, and determine a seed user label matrix according to the labels corresponding to the seed users;
Determine a probability transfer matrix according to the user similarity matrix, and determine a pending label probability matrix according to the seed user label matrix;
Perform label propagation on the pending label probability matrix according to the probability transfer matrix to obtain a label probability matrix;
Determine, according to the label probability matrix, the probabilities of the labels corresponding to the non-seed users in the similarity user set.
10. The system according to claim 9, characterized in that the processing module is further configured to:
After label propagation is performed on the pending label probability matrix according to the probability transfer matrix to obtain the label probability matrix, and after it is determined that a label propagation termination condition is met, determine the probabilities of the labels corresponding to the target users according to the label probability matrix.
11. The system according to claim 10, characterized in that the processing module is further configured to:
After label propagation is performed on the pending label probability matrix according to the probability transfer matrix to obtain the label probability matrix, if the label propagation termination condition is not met, reset the seed user label matrix and return to the step of determining the pending label probability matrix according to the seed user label matrix.
12. The system according to claim 10 or 11, characterized in that the label propagation termination condition is some or all of the following:
The number of label propagation rounds performed equals the set number of iterations;
The label probability matrix obtained last has converged.
13. A method for pushing information, characterized in that the method comprises:
Determining, according to the binding relationship between labels and information, the label corresponding to the information to be pushed;
Pushing the information to be pushed to the users corresponding to the determined label;
Wherein the label corresponding to a user is determined as follows:
Determining a similarity user set, wherein any user in the similarity user set is associated with at least one user in the similarity user set; determining, according to the similarities between users in the similarity user set and the labels corresponding to the seed users in the similarity user set, the probabilities of the labels corresponding to the non-seed users in the similarity user set; and taking a label whose probability is greater than a probability threshold as a target label corresponding to the non-seed user.
14. The method according to claim 13, characterized in that the information is advertising information.
15. A system for pushing information, characterized in that the system comprises:
A label module, configured to determine, according to the binding relationship between labels and information, the label corresponding to the information to be pushed;
A pushing module, configured to push the information to be pushed to the users corresponding to the determined label;
Wherein the label corresponding to a user is determined as follows:
Determining a similarity user set, wherein any user in the similarity user set is associated with at least one user in the similarity user set; determining, according to the similarities between users in the similarity user set and the labels corresponding to the seed users in the similarity user set, the probabilities of the labels corresponding to the non-seed users in the similarity user set; and taking a label whose probability is greater than a probability threshold as a target label corresponding to the non-seed user.
16. The system according to claim 15, characterized in that the information is advertising information.
CN201710069800.2A 2017-02-08 2017-02-08 A kind of method and system of determining user tag and pushed information Pending CN108399551A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710069800.2A CN108399551A (en) 2017-02-08 2017-02-08 A kind of method and system of determining user tag and pushed information

Publications (1)

Publication Number Publication Date
CN108399551A true CN108399551A (en) 2018-08-14

Family

ID=63094387

Country Status (1)

Country Link
CN (1) CN108399551A (en)

Similar Documents

Publication Publication Date Title
CN104965890B (en) The method and apparatus that advertisement is recommended
Dou et al. Engineering optimal network effects via social media features and seeding in markets for digital goods and services
Wei et al. What drives Malaysian m‐commerce adoption? An empirical analysis
Jaradat et al. Investigating the moderating effects of gender and self-efficacy in the context of mobile payment adoption: A developing country perspective
TW201937428A (en) Marketing product recommendation method
US20080313040A1 (en) Content distribution system including cost-per-engagement based advertising
CN106169140A (en) Advertisement placement method and system
CN107341679A (en) Obtain the method and device of user's portrait
CN105095267A (en) User involving project recommendation method and apparatus
CN108399551A (en) A kind of method and system of determining user tag and pushed information
Man et al. The impact of cosmetics industry social media marketing on brand loyalty: Evidence from chinese college students
US20170046734A1 (en) Enhancing touchpoint attribution accuracy using offline data onboarding
CN109325179A (en) A kind of method and device that content is promoted
CN108305181A (en) The determination of social influence power, information distribution method and device, equipment and storage medium
CN106415637A (en) Commission allocation method and system
CN109034867A (en) click traffic detection method, device and storage medium
Zhang et al. Mining target users for mobile advertising based on telecom big data
CN106257507A (en) The methods of risk assessment of user behavior and device
CN110035053A (en) For detecting user-content provider couple method and system of fraudulent
CN109190040A (en) Personalized recommendation method and device based on coevolution
CN104199843A (en) Service sorting and recommending method and system based on social network interactive data
CN107545453A (en) A kind of information distribution method and device
CN113393270B (en) Determination method and device for advertisement point contribution value, electronic equipment and storage medium
CN109064228A (en) Interconnection method, device, equipment/terminal/server and the computer readable storage medium of material
Shao et al. What promotes customers’ trust in the mobile payment platform: an empirical study of alipay in China

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 1259216; Country of ref document: HK)

RJ01 Rejection of invention patent application after publication (Application publication date: 20180814)