CN103390008A - Method and system for acquiring personalized features of user - Google Patents

Info

Publication number
CN103390008A
Authority
CN
China
Prior art keywords
user
parameter vector
document
vector
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012101520841A
Other languages
Chinese (zh)
Other versions
CN103390008B (en
Inventor
祁勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Boao Zongheng Network Technology Co ltd
Six or six fish Information technologies(Shanghai)Co., Ltd.
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to CN201810879449.8A priority Critical patent/CN109344321B/en
Priority to CN201210152084.1A priority patent/CN103390008B/en
Publication of CN103390008A publication Critical patent/CN103390008A/en
Application granted granted Critical
Publication of CN103390008B publication Critical patent/CN103390008B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention provides a method and a system for acquiring personalized features of a user. The method automatically updates the personalized features of a user from signals indicating that the user accesses other users and signals indicating that the user accesses documents, so that the user's personalized features can be updated according to both the personalized features of the other users accessed and the personalized features of the documents accessed. A personalized ranking value is calculated for every user in a group of users according to a query condition submitted by a querying user and the personalized features of every user in the group, and the identification of at least one user in the group is sent to the querying user according to the magnitude of the personalized ranking values. The invention also provides a system for acquiring the personalized features of a user. The disclosed method can be used to look up users or user groups with specific features in a social network.

Description

A method and system for acquiring personalized features of a user
Technical Field
The present invention relates to the Internet field, and in particular to a method and system for acquiring personalized features of a user.
Background
The main entrance to the early Internet was the portal website: users entered the online world through web pages edited by website staff. With the development of search technology, people then began to use search engines as the main entrance to the Internet, and a user could easily retrieve information online simply by entering search keywords. In recent years, with the rapid development of applications such as Facebook, Twitter, Google+ and microblogs, social networks have gradually been becoming the main entrance to the Internet.
On a social network, users filter and screen Internet information mainly through the relationship network they have built themselves. For example, a user adds a follow to receive information published or forwarded by the followed person, or adds friends to receive information published by those friends. Research shows that the relationship network a user builds on a social network consists mainly of "strong ties" such as family, friends and colleagues, which is essentially an online copy of the user's offline relationship network. The real potential of social networks, however, lies in developing the "weak tie" resources the user does not yet know, such as domain experts, masters and talented people on the network. At present, the means of building a "weak tie" network on a social network are rather limited. A user can easily find the few people who are very well known or have many followers, but can hardly find the much larger number of people who are skilled in a particular professional field or interested in a particular topic. This is a shortcoming of existing social network technology.
One improvement is to set personalized features for each user on the social network, then find users or user groups with specific features according to those personalized features and add them to the user's own relationship network. Acquiring the personalized features of users on a social network is difficult, however, mainly for the following reasons.
The first is the automatic acquisition of personalized information. Facebook currently has 900 million users and Sina Weibo has 140 million users, and it is unrealistic to require most users to maintain their personalized features manually on a social network. How to acquire users' personalized features automatically is a hard problem. The second is the updating of personalized information. Over time, personal information such as a user's hobbies, workplace, industry and education level changes, but it is difficult to require most users to update their personalized information in real time. The third is semantic difference in personalized information. Among the personalized features that users set, features that use different terms but have the same meaning are hard to classify together. The fourth is the completeness of personalized information. The personal information that users provide on a website is usually brief. For example, a user typically describes his or her hobbies as liking music, playing baseball or reading, while requiring the user to describe all fields of interest carefully and comprehensively would be tedious and difficult.
In summary, how to acquire users' personalized features effectively, and how to look up users and user groups with specific features in a social network according to those personalized features, is a problem that needs to be solved.
Summary of the Invention
In view of the above problems in the prior art, the object of the present invention is to provide a method and system for acquiring personalized features of a user, so as to acquire users' personalized features automatically and then, according to those personalized features, look up users and user groups with specific features in a social network.
To this end, the present invention proposes a method for acquiring personalized features of a user, characterized by:
obtaining a plurality of users on the Internet, and storing the user set $U = \{1, 2, \ldots, M\}$ composed of the plurality of users; setting a plurality of features, and storing the feature set $K = \{1, 2, \ldots, L\}$ composed of the plurality of features;
setting parameter vector initial values for a plurality of users in the user set U;
repeatedly performing the following steps:
receiving a signal that any user i ($i \in U$) accesses any user j ($j \in U$);
reading the parameter vector of user i, $K_u(i) = (uw_{i1}, uw_{i2}, \ldots, uw_{ik}, \ldots, uw_{iL})$, where $uw_{ik}$ represents the degree of correlation between user i and feature k ($k \in K$);
reading the parameter vector of user j, $K_u(j) = (uw_{j1}, uw_{j2}, \ldots, uw_{jk}, \ldots, uw_{jL})$, where $uw_{jk}$ represents the degree of correlation between user j and feature k ($k \in K$);
updating the parameter vectors of user i and user j with the following parameter vector update algorithm:
$K_u^*(i) = \mathrm{function1}[K_u(i), K_u(j)]$;
$K_u^*(j) = \mathrm{function2}[K_u(i), K_u(j)]$;
where $K_u(i)$ and $K_u^*(i)$ are the parameter vectors of user i before and after the update respectively, and $K_u(j)$ and $K_u^*(j)$ are the parameter vectors of user j before and after the update respectively.
Compared with the prior art, the method of the invention represents a user's personalized features by the user's parameter vector, uses the parameter vector update algorithm to solve problems such as the automatic acquisition and real-time updating of personalized features, and can look up users and user groups with specific features in a social network according to the users' parameter vectors, thereby providing a new way for users to build a "weak tie" network.
Description of the Drawings
Fig. 1 shows the parameter vector representation of each user in the user set U;
Fig. 2 shows the parameter vector representation of each document in the document set D;
Fig. 3 shows the method for acquiring personalized features of a user based on user-accesses-user signals;
Fig. 4 shows the method for acquiring personalized features of a user based on user-accesses-document signals;
Fig. 5 is a flow chart of the method for querying users with specific features;
Fig. 6 is a flow chart of the method for querying user groups with specific features;
Fig. 7 is a structural diagram of a system for acquiring personalized features of a user.
Detailed Description
The method of the invention is described in further detail below with reference to the accompanying drawings.
The description of specific embodiments of the method comprises the following parts. First, the numbering of users, documents and features is described, together with the parameter vector representation of users and documents. Next, the parameter vector update algorithm based on user-accesses-user signals and the parameter vector update algorithm based on user-accesses-document signals are described. After that, the method of looking up users or user groups on a social network according to given features is described. Finally, a system for acquiring personalized features of a user is described.
The numbering of users, documents and features is described first.
A plurality of users are obtained on the Internet. Each user has at least one user identifier, which is one of a user account, a mobile phone number, a cookie identifier, an IP address, an e-mail address and an instant messaging number. The users obtained are numbered uniformly, and the user numbers are gathered to form the user set $U = \{1, 2, \ldots, M\}$, where M is the number of users and each user has a unique user code. Likewise, a plurality of documents are obtained on the Internet, for example a plurality of Web pages obtained by a web crawler (spider). A document on the Internet has a unique identifier, for example the URL of a Web page. The documents obtained are numbered uniformly, and the document numbers are gathered to form the document set $D = \{1, 2, \ldots, N\}$, where N is the number of documents and each document has a unique document code.
The features possessed by the elements of the user set U and the document set D are numbered uniformly to form the feature set $K = \{1, 2, \ldots, L\}$, where L is the number of features. A feature represents an attribute of users and documents, for example news, finance, science, music, military affairs or sports.
The parameter vector representation of users and documents is introduced below. It is similar to the vector representation of the vector space model, in that feature items are used as the basic units of user features or document features. The parameter vector of a user is the set of degrees of correlation between the user and each feature, and the parameter vector of a document is the set of degrees of correlation between the document and each feature. If a user or document does not have a certain feature, the degree of correlation between that user or document and the feature is zero.
Fig. 1 shows the parameter vector representation of each user in the user set U. The parameter vector of any user i ($i \in U$) in the user set U is set to $K_u(i) = (uw_{i1}, uw_{i2}, \ldots, uw_{ik}, \ldots, uw_{iL})$, where $uw_{ik}$ represents the degree of correlation between user i and feature k ($k \in K$), $uw_{ik} \in [a, b]$, and a and b are non-negative constants. In addition, the degrees of correlation between every user in U and the k-th feature of the feature set K are gathered into a vector, called the k-th user column vector of the user set U, $(uw_{1k}, uw_{2k}, \ldots, uw_{Mk})$.
Fig. 2 shows the parameter vector representation of each document in the document set D. The parameter vector of any document n ($n \in D$) in the document set D is set to $K_d(n) = (dw_{n1}, dw_{n2}, \ldots, dw_{nk}, \ldots, dw_{nL})$, where $dw_{nk}$ represents the degree of correlation between document n and feature k ($k \in K$), $dw_{nk} \in [a, b]$, and a and b are non-negative constants. In addition, the degrees of correlation between every document in D and the k-th feature of the feature set K are gathered into a vector, called the k-th document column vector of the document set D, $(dw_{1k}, dw_{2k}, \ldots, dw_{Nk})$.
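The representation in Fig. 1 and Fig. 2 can be pictured with a minimal Python sketch. The dictionary layout, the feature labels and the helper names `zero_vector` and `column_vector` are illustrative assumptions, not part of the patent text.

```python
# Assumed labels for the feature set K; the patent only requires K = {1, ..., L}.
FEATURES = ["science", "finance", "education", "music", "sports"]
L = len(FEATURES)


def zero_vector(size):
    """Default parameter vector: every degree of correlation is zero."""
    return [0.0] * size


# user_vectors[i] plays the role of K_u(i); doc_vectors[n] plays the role of K_d(n).
user_vectors = {i: zero_vector(L) for i in (1, 2, 3)}  # U = {1, 2, 3}
doc_vectors = {n: zero_vector(L) for n in (1, 2)}      # D = {1, 2}


def column_vector(vectors, k):
    """Return the k-th column vector (w_1k, ..., w_Mk); k is 1-based as in the text."""
    return [vectors[key][k - 1] for key in sorted(vectors)]


# User 2 is related to feature 4 ("music" in the assumed labels).
user_vectors[2][3] = 0.5
music_column = column_vector(user_vectors, 4)
```

Storing one list per user keeps a row lookup cheap when a signal arrives, while `column_vector` gathers the per-feature view that the normalization step later operates on.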
The degree of correlation is a real number that expresses how closely a document or user is related to a feature in the feature set K. If a document or user is more closely related to the music feature and less closely related to the sports feature, we say that its degree of correlation with the music feature is high and its degree of correlation with the sports feature is low. In addition, some features are correlated with one another, so the dimension of the feature set K can be reduced during feature selection by reducing the correlation between features, which lowers the demand for server storage space and improves the efficiency of the algorithm. Some features need not be listed directly in the feature set, because their degrees of correlation can be calculated from the degrees of correlation of one or several other features in K.
The method of setting the initial value of the parameter vector of a user or document is described below through three examples. If no initial value is set for the parameter vector of a user or document, the initial value defaults to the zero vector.
Example 1 is a method of manually setting the initial value of the parameter vector of user i ($i \in U$) or document n ($n \in D$). For example, set the total number of features L = 5 and the feature set K = (science, finance, education, music, sports), and set $K_u(i) = (uw_{i1}, uw_{i2}, uw_{i3}, uw_{i4}, uw_{i5}) = (0, 0.00032, 0, 0.00059, 0)$. That is, the degree of correlation between user i and the "finance" feature is 0.00032, the degree of correlation with the "music" feature is 0.00059, and the degree of correlation with every other feature is zero. The initial value of the parameter vector $K_d(n) = (dw_{n1}, dw_{n2}, \ldots, dw_{nk}, \ldots, dw_{nL})$ of any document n can be set in a similar way.
Example 2 is a method of setting the initial value of the parameter vector of user i ($i \in U$). User i submits a set of documents $H = \{\ldots, r, \ldots\}$ (Figure BSA00000718347000051). The parameter vector of document r ($r \in H$) is $K_d(r) = (dw_{r1}, dw_{r2}, \ldots, dw_{rL})$. Therefore, for each $k \in K$, set $uw_{ik} = (\sigma_1 / s) \sum_{r \in H} [dw_{rk} / (\sum_{k \in K} dw_{rk})]$, where s is the number of elements in the set H and $\sigma_1$ is a set constant. In a similar way, user i can also select a group of users in the user set U to calculate the initial value of the parameter vector of user i.
Example 3 is a method of setting the initial value of the parameter vector of a document. A directory is a special kind of document and has a corresponding document code. A portal website, for example, usually includes sub-directories such as news, music, sports, finance and technology. We assume that the documents under the same directory share some features; for example, the documents under the sports directory are all related to sports. If document n ($n \in D$) is a document under directory h ($h \in D$), the initial value of the parameter vector of document n is determined by the parameter vector of directory h. For example, for each $k \in K$, set $dw_{nk} = \sigma_2 \cdot dw_{hk}$, where $\sigma_2$ is a set constant.
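Examples 2 and 3 can be sketched in Python as follows. The function names, the default values of `sigma1` and `sigma2`, and the choice to skip documents whose vector sums to zero are assumptions made for illustration; the patent does not say how an all-zero document vector should be handled.

```python
def init_user_from_documents(doc_vectors, H, sigma1=1.0):
    """Example 2: uw_ik = (sigma1 / s) * sum over r in H of dw_rk / sum_k dw_rk."""
    s = len(H)
    size = len(next(iter(doc_vectors.values())))
    if s == 0:
        return [0.0] * size  # assumption: an empty H leaves the default zero vector
    acc = [0.0] * size
    for r in H:
        total = sum(doc_vectors[r])
        if total == 0:
            continue  # assumption: an all-zero document contributes nothing
        for k in range(size):
            acc[k] += doc_vectors[r][k] / total
    return [(sigma1 / s) * value for value in acc]


def init_document_from_directory(directory_vector, sigma2=1.0):
    """Example 3: dw_nk = sigma2 * dw_hk for a document n under directory h."""
    return [sigma2 * value for value in directory_vector]


# Two documents with two features each (made-up numbers).
docs = {1: [1.0, 3.0], 2: [2.0, 2.0]}
user_init = init_user_from_documents(docs, [1, 2])
doc_init = init_document_from_directory([0.5, 0.0], sigma2=2.0)
```

Dividing each document vector by its own sum before averaging means a document with large raw values does not dominate the user's initial vector.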
Fig. 3 shows the method for acquiring personalized features of a user based on user-accesses-user signals. It comprises the following steps:
S11. Obtain a plurality of users on the Internet, and store the user set $U = \{1, 2, \ldots, M\}$ composed of the plurality of users; set a plurality of features, and store the feature set $K = \{1, 2, \ldots, L\}$ composed of the plurality of features;
S12. Set parameter vector initial values for a plurality of users in the user set U;
S13. Receive a signal that any user i ($i \in U$) accesses any user j ($j \in U$);
S14. Read the parameter vector of user i, $K_u(i) = (uw_{i1}, uw_{i2}, \ldots, uw_{ik}, \ldots, uw_{iL})$, where $uw_{ik}$ represents the degree of correlation between user i and feature k ($k \in K$);
S15. Read the parameter vector of user j, $K_u(j) = (uw_{j1}, uw_{j2}, \ldots, uw_{jk}, \ldots, uw_{jL})$, where $uw_{jk}$ represents the degree of correlation between user j and feature k ($k \in K$);
S16. Update the parameter vectors of user i and user j with the following parameter vector update algorithm,
$K_u^*(i) = \mathrm{function1}[K_u(i), K_u(j)]$,
$K_u^*(j) = \mathrm{function2}[K_u(i), K_u(j)]$;
Return to step S13.
Here $K_u(i)$ and $K_u^*(i)$ are the parameter vectors of user i before and after the update respectively, and $K_u(j)$ and $K_u^*(j)$ are the parameter vectors of user j before and after the update respectively; $K_u^*(i) = (uw_{i1}^*, uw_{i2}^*, \ldots, uw_{ik}^*, \ldots, uw_{iL}^*)$ and $K_u^*(j) = (uw_{j1}^*, uw_{j2}^*, \ldots, uw_{jk}^*, \ldots, uw_{jL}^*)$; function1 expresses that $K_u^*(i)$ is a function of $K_u(i)$ and $K_u(j)$, and function2 expresses that $K_u^*(j)$ is a function of $K_u(i)$ and $K_u(j)$. User i and user j stand for any two users in the user set U and do not refer to two particular users. For example, when step S13 is performed for the n-th time the signal may carry i = 1023 and j = 29328, and when step S13 is performed for the (n+1)-th time the signal may carry i = 737443 and j = 837487.
In the method of Fig. 3, after step S16 is performed there is a further step of updating $K_u(i)$ and $K_u(j)$, namely performing the assignments $K_u(i) = K_u^*(i)$ and $K_u(j) = K_u^*(j)$.
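Steps S13 to S16 and the closing assignment can be sketched as a small Python loop. The patent leaves function1 and function2 abstract, so the `add_scaled` rule used here is only a placeholder assumption, as are the function and variable names.

```python
def process_user_access_signals(user_vectors, signals, function1, function2):
    """Apply steps S13-S16 to each (i, j) signal, then assign K_u = K_u*."""
    for i, j in signals:                  # S13: user i accesses user j
        k_i = user_vectors[i]             # S14: read K_u(i)
        k_j = user_vectors[j]             # S15: read K_u(j)
        new_i = function1(k_i, k_j)       # S16: both results are computed
        new_j = function2(k_i, k_j)       #      from the pre-update vectors
        user_vectors[i] = new_i           # K_u(i) = K_u*(i)
        user_vectors[j] = new_j           # K_u(j) = K_u*(j)
    return user_vectors


def add_scaled(target, source, rate=0.1):
    """Placeholder update rule (an assumption): target + rate * source."""
    return [t + rate * s for t, s in zip(target, source)]


vectors = {1: [1.0, 0.0], 2: [0.0, 1.0]}
process_user_access_signals(
    vectors,
    signals=[(1, 2)],
    function1=lambda k_i, k_j: add_scaled(k_i, k_j),
    function2=lambda k_i, k_j: add_scaled(k_j, k_i),
)
```

Computing both new vectors before either assignment matters: function2 must see the value of $K_u(i)$ from before the update, as the definitions above require.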
In one application example of the method of Fig. 3, the type T of the signal is one of the following:
T=1: user i follows user j;
T=2: user i adds user j as a friend;
T=3: user i forwards information from user j;
T=4: user i comments on information published by user j;
T=5: user i bookmarks information from user j;
T=6: user i sends a private message to user j;
T=7: user i tags user j;
T=8: user i marks information published by user j as liked.
In one application example of the method of Fig. 3, the signal is collected from system logs.
In one application example of the method of Fig. 3, the method satisfies $K_u^*(i) \geq K_u(i)$ and $K_u^*(j) \geq K_u(j)$. The inequality $K_u^*(i) \geq K_u(i)$ means that $uw_{ik}^* \geq uw_{ik}$ for each $k \in K$, and the inequality $K_u^*(j) \geq K_u(j)$ means that $uw_{jk}^* \geq uw_{jk}$ for each $k \in K$.
In one application example of the method of Fig. 3, for each $k \in K$, $uw_{ik}^*$ is an increasing function of $uw_{jk}$; for each $k \in K$, $uw_{jk}^*$ is an increasing function of $uw_{ik}$.
In one application example of the method of Fig. 3, for each $k \in K$, $uw_{ik}^*$ is a decreasing function of $\sum_{k \in K} uw_{jk}$; for each $k \in K$, $uw_{jk}^*$ is a decreasing function of $\sum_{k \in K} uw_{ik}$.
In one application example of the method of Fig. 3, the signal contains the user identifiers of user i and user j, and each user identifier corresponds to a unique user code. This application example reads the user code of user i according to the user identifier of user i and reads the parameter vector of user i according to that user code; it reads the user code of user j according to the user identifier of user j and reads the parameter vector of user j according to that user code.
In the method of Fig. 3, after the parameter vector update algorithm has been executed a set number of times, the k-th user column vector $(uw_{1k}, uw_{2k}, \ldots, uw_{Mk})$ of the user set U needs to be normalized for each feature $k \in K$. Executing the parameter vector update algorithm once means substituting $K_u(i)$ and $K_u(j)$ into function1 and function2 to obtain $K_u^*(i)$ and $K_u^*(j)$. Specific application examples of the normalization method are as follows:
Example 1: The method of normalizing the k-th user column vector $(uw_{1k}, uw_{2k}, \ldots, uw_{Mk})$ of the user set U comprises setting $temp = \sum_{t \in U} uw_{tk}$ and, for each $i \in U$, setting $uw_{ik} = uw_{ik} / temp$.
Example 2: The method of normalizing the k-th user column vector $(uw_{1k}, uw_{2k}, \ldots, uw_{Mk})$ of the user set U is as follows. First calculate $temp = \sum_{t \in U} uw_{tk}$ and, for each $i \in U$, calculate $uw_{ik} = uw_{ik} / temp$. Then sort $uw_{1k}, uw_{2k}, \ldots, uw_{Mk}$, divide them into r groups according to the sorted order, and take the smallest value in each group to form the set $\{s_1, s_2, \ldots, s_r\}$ with $s_1 < s_2 < \ldots < s_r$. Finally, process $uw_{1k}, uw_{2k}, \ldots, uw_{Mk}$ as follows: for each $i \in U$, if $uw_{ik} < s_1$, set $uw_{ik} = a$; if $s_m \leq uw_{ik} \leq s_{m+1}$, set $uw_{ik} = g(s_m)$; if $uw_{ik} > s_r$, set $uw_{ik} = b$. Here $g(s_m)$ is an increasing function with $g(s_m) \in (a, b)$, $1 \leq m < r$, a and b are non-negative constants, and r is a set parameter.
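The simpler of the two normalization methods, Example 1, can be sketched in Python as below. The function name and the rule of leaving an all-zero column untouched are assumptions; the text does not cover the case temp = 0. Example 2 is not implemented here.

```python
def normalize_columns(user_vectors):
    """Divide every user column vector (uw_1k, ..., uw_Mk) by its sum (Example 1)."""
    users = list(user_vectors)
    size = len(user_vectors[users[0]])
    for k in range(size):
        temp = sum(user_vectors[t][k] for t in users)
        if temp == 0:
            continue  # assumption: a column with no correlation at all is left as-is
        for i in users:
            user_vectors[i][k] = user_vectors[i][k] / temp
    return user_vectors


vectors = {1: [1.0, 2.0], 2: [3.0, 2.0]}
normalize_columns(vectors)
```

After this step each non-zero column sums to 1, so the repeated additive updates cannot make one feature's values grow without bound relative to the others.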
Application example 1
This is an application example of the method of Fig. 3. In the method of Fig. 3, the parameter vector update algorithm updates the parameter vectors of user i and user j through the following specific algorithm:
$uw_{ik}^* = uw_{ik} + \lambda_1(j, i, T) \cdot f_1[K_u(j)]$ (for each $k \in UK_j$)
$uw_{jk}^* = uw_{jk} + \lambda_2(i, j, T) \cdot f_2[K_u(i)]$ (for each $k \in UK_i$)
Here $uw_{ik}$ and $uw_{ik}^*$ are the k-th components of the parameter vector of user i before and after the update respectively, and $uw_{jk}$ and $uw_{jk}^*$ are the k-th components of the parameter vector of user j before and after the update respectively. $\lambda_1(j, i, T)$ is the influence coefficient of user j on user i under signal type T, and $\lambda_2(i, j, T)$ is the influence coefficient of user i on user j under signal type T. $UK_i$ is the set of features corresponding to the $P_i$ largest components of the parameter vector $K_u(i) = (uw_{i1}, uw_{i2}, \ldots, uw_{ik}, \ldots, uw_{iL})$ of user i, and $UK_j$ is the set of features corresponding to the $P_j$ largest components of the parameter vector $K_u(j) = (uw_{j1}, uw_{j2}, \ldots, uw_{jk}, \ldots, uw_{jL})$ of user j. $P_i$ and $P_j$ are set parameters with $P_i \leq L$ and $P_j \leq L$. For example, i = 30, $P_{30} = 3$, $UK_{30}$ = {literature, computer, biology}; j = 265, $P_{265} = 2$, $UK_{265}$ = {music, history}. After the specific algorithm is executed, the following assignments are made: for each $k \in UK_i$, set $uw_{jk} = uw_{jk}^*$; for each $k \in UK_j$, set $uw_{ik} = uw_{ik}^*$.
In application example 1, the specific algorithm can be further restricted so that $uw_{jk}^* \geq uw_{jk}$ for each $k \in UK_i$ and $uw_{ik}^* \geq uw_{ik}$ for each $k \in UK_j$.
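A hedged Python sketch of this specific algorithm follows. It assumes that user i is updated only on the features in $UK_j$ and user j only on the features in $UK_i$, as the assignments above indicate. It also assumes constant influence coefficients and the normalized form $\sigma \cdot uw_{k} / \sum_{k} uw_{k}$ for $f_1$ and $f_2$, one of the options the text lists for those functions. All names are hypothetical.

```python
def top_features(vector, p):
    """Indices of the p largest components (the set UK); ties broken by index."""
    order = sorted(range(len(vector)), key=lambda k: (-vector[k], k))
    return set(order[:p])


def update_pair(k_i, k_j, lam1, lam2, p_i, p_j, sigma3=1.0, sigma4=1.0):
    """Return (K_u*(i), K_u*(j)) for one signal that user i accesses user j."""
    uk_i = top_features(k_i, p_i)
    uk_j = top_features(k_j, p_j)
    sum_i, sum_j = sum(k_i), sum(k_j)
    new_i, new_j = list(k_i), list(k_j)
    if sum_j > 0:  # assumption: a zero vector exerts no influence
        for k in uk_j:
            new_i[k] = k_i[k] + lam1 * sigma3 * k_j[k] / sum_j
    if sum_i > 0:
        for k in uk_i:
            new_j[k] = k_j[k] + lam2 * sigma4 * k_i[k] / sum_i
    return new_i, new_j


new_i, new_j = update_pair(
    [3.0, 1.0, 0.0], [0.0, 1.0, 3.0], lam1=1.0, lam2=1.0, p_i=1, p_j=1
)
```

Restricting the update to the top $P$ features of the other user means only that user's most characteristic features are propagated, and because every added term is non-negative the restriction $uw^* \geq uw$ holds automatically for non-negative coefficients.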
In application example 1, $f_1[K_u(j)]$ is a function of the parameter vector $K_u(j)$ of user j, and $f_2[K_u(i)]$ is a function of the parameter vector $K_u(i)$ of user i. Specific implementations of $f_1[K_u(j)]$ and $f_2[K_u(i)]$ include the following examples:
Example 1: $f_1[K_u(j)]$ is an increasing function of $uw_{jk}$ and a decreasing function of $\sum_{k \in K} uw_{jk}$; $f_2[K_u(i)]$ is an increasing function of $uw_{ik}$ and a decreasing function of $\sum_{k \in K} uw_{ik}$.
Example 2: $f_1[K_u(j)] = \sigma_3 \cdot uw_{jk} / (\sum_{k \in K} uw_{jk})$, $f_2[K_u(i)] = \sigma_4 \cdot uw_{ik} / (\sum_{k \in K} uw_{ik})$, where $\sigma_3$ and $\sigma_4$ are set constants.
Example 3: $f_1[K_u(j)] = \sigma_5 \cdot uw_{jk}$, $f_2[K_u(i)] = \sigma_6 \cdot uw_{ik}$, where $\sigma_5$ and $\sigma_6$ are set constants.
Example 4: $f_1[K_u(j)] = \sigma_7 \cdot \{1 / [1 + \exp(uw_{jk})]\}$, $f_2[K_u(i)] = \sigma_8 \cdot \{1 / [1 + \exp(uw_{ik})]\}$, where $\sigma_7$ and $\sigma_8$ are set constants.
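Examples 2 to 4 translate directly into small Python functions. The names are hypothetical, `k` is a 0-based index here, and the guard for an all-zero vector in `f_example2` is an assumption. `f_example4` follows the formula exactly as printed in the text, including the sign of the exponent.

```python
import math


def f_example2(vector, k, sigma=1.0):
    """sigma * w_k / sum_k w_k (Example 2)."""
    total = sum(vector)
    return sigma * vector[k] / total if total else 0.0


def f_example3(vector, k, sigma=1.0):
    """sigma * w_k (Example 3)."""
    return sigma * vector[k]


def f_example4(vector, k, sigma=1.0):
    """sigma * 1 / (1 + exp(w_k)) (Example 4, as printed)."""
    return sigma * (1.0 / (1.0 + math.exp(vector[k])))


k_j = [1.0, 3.0]
values = (f_example2(k_j, 1), f_example3(k_j, 1, sigma=2.0), f_example4([0.0], 0))
```

Example 2 depends on the share of a feature within the user's vector, whereas Example 3 depends on the raw value, so the two behave differently for users whose vectors have grown large.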
In application example 1, specific implementations of $\lambda_1(j, i, T)$ and $\lambda_2(i, j, T)$ include:
Example 1: $\lambda_1(j, i, T)$ and $\lambda_2(i, j, T)$ are each a function of the similarity sim(i, j) between the parameter vectors of user i and user j. For example, $\lambda_1(j, i, T) = c_1 \cdot sim(i, j)$, $\lambda_2(i, j, T) = c_2 \cdot sim(i, j)$, and $sim(i, j) = \|K_u(i), K_u(j)\| = [\sum_{k \in K} (uw_{ik} \cdot uw_{jk})] / \{[\sum_{k \in K} (uw_{ik})^2]^{1/2} \cdot [\sum_{k \in K} (uw_{jk})^2]^{1/2}\}$, where $c_1$ and $c_2$ are set constants. The meaning of this example is that the higher the similarity between the parameter vectors of user i and user j, the larger the proportionality coefficient with which they "vote" for each other.
Example 2: $\lambda_1(j, i, T) = u_1(j) \cdot u_2(i)$ and $\lambda_2(i, j, T) = u_1(i) \cdot u_2(j)$, where $u_1(j)$ indicates whether the parameter vector of user j may be used to update the parameter vectors of other users in the user set U, and $u_1(i)$ indicates whether the parameter vector of user i may be used to update the parameter vectors of other users in U; $u_2(i)$ indicates whether the parameter vector of user i may be updated by the parameter vectors of other users in U, and $u_2(j)$ indicates whether the parameter vector of user j may be updated by the parameter vectors of other users in U. $u_1(j)$, $u_2(j)$, $u_1(i)$ and $u_2(i)$ are set parameters whose value is 0 or 1, where 1 means yes and 0 means no. The purpose of this example is to prevent malicious attacks: the parameter vector of a user who has not passed reliability certification cannot update the parameter vectors of other users, and the parameter vectors of certain special users cannot be updated by the parameter vectors of other users.
Example 3: $\lambda_1(j, i, T) = s_1(T)$ and $\lambda_2(i, j, T) = s_2(T)$, where T is the type of the user-accesses-user signal and $s_1(T)$ and $s_2(T)$ are functions of T.
Example 4: $\lambda_1(j, i, T)$ and $\lambda_2(i, j, T)$ are generated by combining the methods of Examples 1 to 3 above. For example,
$\lambda_1(j, i, T) = \{c_1 \cdot sim(i, j)\} \cdot \{u_1(j) \cdot u_2(i)\} \cdot s_1(T)$
$\lambda_2(i, j, T) = \{c_2 \cdot sim(i, j)\} \cdot \{u_1(i) \cdot u_2(j)\} \cdot s_2(T)$.
Example 5: $\lambda_1(j, i, T)$ is a function of the number of users in the relationship network of user j, and $\lambda_2(i, j, T)$ is a function of the number of users in the relationship network of user i.
Example 6: $\lambda_1(j, i, T)$ and $\lambda_2(i, j, T)$ are set constants.
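The cosine similarity of Example 1 and the combined coefficient of Example 4 can be sketched in Python as follows. The function names, the argument order and the convention that a zero vector has similarity 0 are assumptions for illustration.

```python
import math


def sim(k_i, k_j):
    """Cosine similarity between two parameter vectors (Example 1)."""
    dot = sum(a * b for a, b in zip(k_i, k_j))
    norm_i = math.sqrt(sum(a * a for a in k_i))
    norm_j = math.sqrt(sum(b * b for b in k_j))
    if norm_i == 0 or norm_j == 0:
        return 0.0  # assumption: similarity with a zero vector is taken as 0
    return dot / (norm_i * norm_j)


def influence(k_source, k_target, c, u1_source, u2_target, s_T):
    """Example 4: {c * sim} * {u1(source) * u2(target)} * s(T).

    For lambda_1(j, i, T) the source is user j and the target is user i.
    """
    return c * sim(k_source, k_target) * u1_source * u2_target * s_T


same_direction = sim([3.0, 4.0], [3.0, 4.0])
orthogonal = sim([1.0, 0.0], [0.0, 1.0])
lam = influence([1.0, 0.0], [1.0, 0.0], c=2.0, u1_source=1, u2_target=1, s_T=3.0)
blocked = influence([1.0, 0.0], [1.0, 0.0], c=2.0, u1_source=0, u2_target=1, s_T=3.0)
```

Because the factors are multiplied, setting either switch $u_1$ or $u_2$ to 0 disables the influence entirely, which is how the anti-abuse rule of Example 2 takes effect in the combined form.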
In application example 1, after the specific algorithm has been executed a set number of times, the k-th user column vector $(uw_{1k}, uw_{2k}, \ldots, uw_{Mk})$ needs to be normalized for each feature $k \in K$.
Application example 2
This is an illustration of the method of application example 1. For convenience, suppose there are three users on the Internet and each user has two features, that is, the user set U = {1, 2, 3} and the feature set K = {1, 2}. The parameter vectors of user 1, user 2 and user 3 are $(uw_{11}, uw_{12})$, $(uw_{21}, uw_{22})$ and $(uw_{31}, uw_{32})$ respectively, where $uw_{ik}$ ($i \in U$, $k \in K$) represents the degree of correlation between user i and feature k.
If a signal that user 2 accesses user 3 is received and the signal type is T = 1, the parameter vectors of user 2 and user 3 are updated according to the following parameter vector update algorithm:
$uw_{21}^* = uw_{21} + \lambda_1(3, 2, 1) \cdot \{uw_{31} / (uw_{31} + uw_{32})\}$
$uw_{22}^* = uw_{22} + \lambda_1(3, 2, 1) \cdot \{uw_{32} / (uw_{31} + uw_{32})\}$
$uw_{31}^* = uw_{31} + \lambda_2(2, 3, 1) \cdot \{uw_{21} / (uw_{21} + uw_{22})\}$
$uw_{32}^* = uw_{32} + \lambda_2(2, 3, 1) \cdot \{uw_{22} / (uw_{21} + uw_{22})\}$
Here $\lambda_1(3, 2, 1)$ is the influence coefficient of user 3 on user 2 under signal type T = 1, and $\lambda_2(2, 3, 1)$ is the influence coefficient of user 2 on user 3 under signal type T = 1. Let $\lambda_1(3, 2, 1) = c_1 \cdot sim(2, 3) \cdot u_1(3) \cdot u_2(2) \cdot s_1(1)$ and $\lambda_2(2, 3, 1) = c_2 \cdot sim(2, 3) \cdot u_1(2) \cdot u_2(3) \cdot s_2(1)$, with $s_1(1) = 3$ and $s_2(1) = 1.5$; $c_1$ and $c_2$ are set constants. $u_1(3)$ indicates whether the parameter vector of user 3 may be used to update the parameter vectors of other users in the user set U, $u_1(2)$ indicates whether the parameter vector of user 2 may be used to update the parameter vectors of other users in U, $u_2(2)$ indicates whether the parameter vector of user 2 may be updated by the parameter vectors of other users in U, and $u_2(3)$ indicates whether the parameter vector of user 3 may be updated by the parameter vectors of other users in U, with $u_1(2) = u_2(2) = u_1(3) = u_2(3) = 1$. sim(2, 3) is the similarity between the parameter vectors of user 2 and user 3, namely
$sim(2, 3) = (uw_{21} \cdot uw_{31} + uw_{22} \cdot uw_{32}) / \{[(uw_{21})^2 + (uw_{22})^2]^{1/2} \cdot [(uw_{31})^2 + (uw_{32})^2]^{1/2}\}$.
After the above algorithm is executed, the parameter vectors of user 2 and user 3 are updated, that is, set $uw_{31} = uw_{31}^*$, $uw_{32} = uw_{32}^*$, $uw_{21} = uw_{21}^*$ and $uw_{22} = uw_{22}^*$.
After the above algorithm is executed, the user column vectors $(uw_{11}, uw_{21}, uw_{31})$ and $(uw_{12}, uw_{22}, uw_{32})$ are normalized as follows. Let $temp1 = uw_{11} + uw_{21} + uw_{31}$; for feature k = 1, set $uw_{11} = uw_{11} / temp1$, $uw_{21} = uw_{21} / temp1$ and $uw_{31} = uw_{31} / temp1$. Let $temp2 = uw_{12} + uw_{22} + uw_{32}$; for feature k = 2, set $uw_{12} = uw_{12} / temp2$, $uw_{22} = uw_{22} / temp2$ and $uw_{32} = uw_{32} / temp2$.
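Application example 2 can be traced numerically with a short Python script. The starting vectors and $c_1 = c_2 = 1$ are made-up assumptions chosen so that sim(2, 3) comes out as 0.96; $s_1(1) = 3$, $s_2(1) = 1.5$ and the switches equal to 1 are taken from the text.

```python
import math


def sim(a, b):
    """Cosine similarity between two parameter vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


# Assumed starting values for users 1, 2 and 3 over two features.
uw = {1: [1.0, 1.0], 2: [3.0, 4.0], 3: [4.0, 3.0]}
c1 = c2 = 1.0                  # assumed constants
s1, s2 = 3.0, 1.5              # s_1(1) and s_2(1) from the text
u1 = {2: 1, 3: 1}              # may update others
u2 = {2: 1, 3: 1}              # may be updated by others

# Signal: user 2 accesses user 3 with type T = 1.
similarity = sim(uw[2], uw[3])
lam1 = c1 * similarity * u1[3] * u2[2] * s1   # lambda_1(3, 2, 1)
lam2 = c2 * similarity * u1[2] * u2[3] * s2   # lambda_2(2, 3, 1)

sum2, sum3 = sum(uw[2]), sum(uw[3])
new2 = [uw[2][k] + lam1 * uw[3][k] / sum3 for k in range(2)]
new3 = [uw[3][k] + lam2 * uw[2][k] / sum2 for k in range(2)]
uw[2], uw[3] = new2, new3
updated = {i: list(v) for i, v in uw.items()}  # snapshot before normalization

# Normalize each user column vector so that it sums to 1.
for k in range(2):
    temp = sum(uw[i][k] for i in uw)
    for i in uw:
        uw[i][k] = uw[i][k] / temp
```

With these inputs user 2's first component grows from 3 to about 4.646 before normalization, while user 1, who took no part in the signal, changes only through the normalization step.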
Fig. 4 shows the method for obtaining the personalized features of a user based on signals of users accessing documents. It comprises the following steps:
S21. Obtain a plurality of users on the Internet and store the user set U={1, 2, ..., M} formed by these users; obtain a plurality of documents on the Internet and store the document set D={1, 2, ..., N} formed by these documents; set a plurality of features and store the feature set K={1, 2, ..., L} formed by these features;
S22. Set initial parameter vectors for a plurality of users in the user set U, and set initial parameter vectors for a plurality of documents in the document set D;
S23. Receive a signal that any user m (m ∈ U) has accessed any document n (n ∈ D);
S24. Read the parameter vector of user m, $K_u(m)=(uw_{m1}, uw_{m2}, \ldots, uw_{mk}, \ldots, uw_{mL})$, where $uw_{mk}$ is the degree of correlation between user m and feature k (k ∈ K);
S25. Read the parameter vector of document n, $K_d(n)=(dw_{n1}, dw_{n2}, \ldots, dw_{nk}, \ldots, dw_{nL})$, where $dw_{nk}$ is the degree of correlation between document n and feature k (k ∈ K);
S26. Update the parameter vectors of user m and document n using the following parameter vector update algorithm 2:
$K_u^*(m)=\mathrm{function3}[K_u(m),K_d(n)]$,
$K_d^*(n)=\mathrm{function4}[K_u(m),K_d(n)]$;
After step S26 has been executed, return to step S23.
Here $K_u(m)$ and $K_u^*(m)$ are the parameter vectors of user m before and after the update, and $K_d(n)$ and $K_d^*(n)$ are the parameter vectors of document n before and after the update; $K_u^*(m)=(uw_{m1}^*, uw_{m2}^*, \ldots, uw_{mk}^*, \ldots, uw_{mL}^*)$ and $K_d^*(n)=(dw_{n1}^*, dw_{n2}^*, \ldots, dw_{nk}^*, \ldots, dw_{nL}^*)$. function3 expresses that $K_u^*(m)$ is a function of $K_u(m)$ and $K_d(n)$, and function4 expresses that $K_d^*(n)$ is a function of $K_u(m)$ and $K_d(n)$. User m stands for any user in the user set U rather than one particular user, and document n stands for any document in the document set D rather than one particular document. For example, on one execution of step S23 the signal may carry m=1023, n=3428, and on the next execution of step S23 it may carry m=33456, n=28477.
In the method of Fig. 4, executing step S26 is followed by a step that updates $K_u(m)$ and $K_d(n)$, i.e. the assignments $K_d(n)=K_d^*(n)$ and $K_u(m)=K_u^*(m)$.
In the method of Fig. 4, the type of the signal is at least one of the following: T=9, user m clicks a link to document n; T=10, user m types in the address of document n; T=11, user m attaches a tag to document n; T=12, user m bookmarks document n; T=13, user m marks document n as liked (such as Facebook's Like and Google's +1); T=14, user m forwards document n; T=15, user m comments on document n; T=16, user m adds document n to a collection (favorites). In an application example of the method of Fig. 4, the signal is collected from Web logs. The Web logs include the server log, the error log, the cookie log and so on.
In an application example of the method of Fig. 4, the method satisfies $K_u^*(m)\ge K_u(m)$ and $K_d^*(n)\ge K_d(n)$. The inequality $K_u^*(m)\ge K_u(m)$ means that $uw_{mk}^*\ge uw_{mk}$ for each k ∈ K; the inequality $K_d^*(n)\ge K_d(n)$ means that $dw_{nk}^*\ge dw_{nk}$ for each k ∈ K.
In an application example of the method of Fig. 4, for each k ∈ K, $uw_{mk}^*$ is an increasing function of $dw_{nk}$ and a decreasing function of $\sum_{k\in K}dw_{nk}$; for each k ∈ K, $dw_{nk}^*$ is an increasing function of $uw_{mk}$ and a decreasing function of $\sum_{k\in K}uw_{mk}$.
In an application example of the method of Fig. 4, the signal contains the user identifier of user m and the document identifier of document n; each user identifier corresponds to a unique user code, and each document identifier corresponds to a unique document code. In this application example, the user code of user m is looked up from the user identifier of user m, and the parameter vector of user m is read according to that user code; the document code of document n is looked up from the document identifier of document n, and the parameter vector of document n is read according to that document code.
In the method of Fig. 4, after the parameter vector update algorithm 2 has been executed a set number of times $t_1$, the k-th user column vector $(uw_{1k}, uw_{2k}, \ldots, uw_{Mk})$ is normalized for each feature k ∈ K; after the parameter vector update algorithm 2 has been executed a set number of times $t_2$, the k-th document column vector $(dw_{1k}, dw_{2k}, \ldots, dw_{Nk})$ is normalized for each feature k ∈ K; $t_1$ and $t_2$ are positive integers. Executing the parameter vector update algorithm 2 once means substituting $K_u(m)$ and $K_d(n)$ into function3 and function4 to obtain $K_u^*(m)$ and $K_d^*(n)$. Concrete application examples of the normalization method are as follows:
Example 1: To normalize the k-th user column vector $(uw_{1k}, uw_{2k}, \ldots, uw_{Mk})$ in the user set U, set $temp=\sum_{t\in U}uw_{tk}$ and, for each m ∈ U, set $uw_{mk}=uw_{mk}/temp$. To normalize the k-th document column vector $(dw_{1k}, dw_{2k}, \ldots, dw_{Nk})$ in the document set D, set $temp=\sum_{t\in D}dw_{tk}$ and, for each n ∈ D, set $dw_{nk}=dw_{nk}/temp$.
Example 2: The k-th document column vector $(dw_{1k}, dw_{2k}, \ldots, dw_{Nk})$ in the document set D is normalized as follows. First compute $temp=\sum_{t\in D}dw_{tk}$ and, for each n ∈ D, compute $dw_{nk}=dw_{nk}/temp$. Then sort $dw_{1k}, dw_{2k}, \ldots, dw_{Nk}$, divide them into r groups according to the sorted order, and take the smallest value of each group to form the set $\{s_1, s_2, \ldots, s_r\}$ with $s_1 < s_2 < \ldots < s_r$. Finally process $dw_{1k}, dw_{2k}, \ldots, dw_{Nk}$ as follows: for each n ∈ D, if $dw_{nk} < s_1$, set $dw_{nk}=a$; if $s_m\le dw_{nk}\le s_{m+1}$, set $dw_{nk}=g(s_m)$; if $dw_{nk} > s_r$, set $dw_{nk}=b$. Here $g(s_m)$ is an increasing function with $g(s_m)\in(a,b)$ and $1\le m < r$; a and b are non-negative constants, and r is a set parameter. The same method can be used to normalize the k-th user column vector in the user set U.
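The grouped normalization of Example 2 can be sketched in Python as below. The text leaves several details open, so the sketch makes explicit assumptions: groups are of near-equal size, $g$ is chosen as a linear ramp $g(s_m)=a+(b-a)\cdot m/r$, a value lying on a boundary $s_{m+1}$ is assigned to the higher interval (except $s_r$ itself, which uses $g(s_{r-1})$), all values are distinct, and $2\le r\le N$. The name `bucket_normalize` is hypothetical.

```python
from bisect import bisect_right


def bucket_normalize(values, r, a=0.0, b=1.0):
    """Sum-normalize one column, then map each value to a per-group level."""
    total = sum(values)
    scaled = [v / total for v in values]
    ordered = sorted(scaled)
    n = len(ordered)
    # s[i] is the smallest value of group i+1 (near-equal group sizes assumed).
    s = [ordered[(i * n) // r] for i in range(r)]

    def g(m):
        # Assumed increasing function with values strictly between a and b.
        return a + (b - a) * m / r

    result = []
    for v in scaled:
        if v < s[0]:
            result.append(a)
        elif v > s[-1]:
            result.append(b)
        else:
            # m is the 1-based index of the largest s_m <= v, capped at r-1.
            m = min(bisect_right(s, v), r - 1)
            result.append(g(m))
    return result


levels = bucket_normalize([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], r=3)
```

With the group minima taken from the same data, no value can fall below $s_1$, so the first branch only matters if the thresholds are computed on a different sample than the one being mapped.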
Application example 3
This is an application example of the method of Fig. 4. The parameter vector update algorithm 2 updates the parameter vectors of user m and document n as follows:
$uw_{mk}^*=uw_{mk}+\lambda_3(n,m,T)\cdot f_3[K_d(n)]$ (for each $k\in DK_n$)
$dw_{nk}^*=dw_{nk}+\lambda_4(m,n,T)\cdot f_4[K_u(m)]$ (for each $k\in UK_m$)
Here $uw_{mk}$ and $uw_{mk}^*$ are the k-th components of the parameter vector of user m before and after the update, and $dw_{nk}$ and $dw_{nk}^*$ are the k-th components of the parameter vector of document n before and after the update; $\lambda_3(n,m,T)$ is the influence coefficient of document n on user m under signal type T, and $\lambda_4(m,n,T)$ is the influence coefficient of user m on document n under signal type T. $UK_m$ is the set of features corresponding to the $P_m$ largest components of the parameter vector $K_u(m)=(uw_{m1}, uw_{m2}, \ldots, uw_{mk}, \ldots, uw_{mL})$ of user m, and $DK_n$ is the set of features corresponding to the $Q_n$ largest components of the parameter vector $K_d(n)=(dw_{n1}, dw_{n2}, \ldots, dw_{nk}, \ldots, dw_{nL})$ of document n; $P_m$ and $Q_n$ are set parameters with $P_m\le L$ and $Q_n\le L$. For example, for m=30, $P_{30}=3$ and $UK_{30}$={music, sports, finance}; for n=265, $Q_{265}=2$ and $DK_{265}$={music, architecture}. In addition, after the above algorithm has been executed, the following assignments are made: for each $k\in UK_m$, set $dw_{nk}=dw_{nk}^*$; for each $k\in DK_n$, set $uw_{mk}=uw_{mk}^*$.
In application example 3, the algorithm can be further restricted so that $uw_{mk}^*\ge uw_{mk}$ for each $k\in DK_n$ and $dw_{nk}^*\ge dw_{nk}$ for each $k\in UK_m$.
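A minimal Python sketch of one update step of application example 3 follows. It assumes, purely for illustration, that $f_3$ and $f_4$ return the share of component k in the sum of the vector, and that the influence coefficients are passed in as plain numbers; the helper names `top_features`, `share` and `update_pair` are hypothetical.

```python
def top_features(vec, count):
    """Indices of the `count` largest components (ties go to the lower index)."""
    order = sorted(range(len(vec)), key=lambda k: vec[k], reverse=True)
    return set(order[:count])


def share(vec, k):
    """Assumed form of f3/f4: component k as a fraction of the vector's sum."""
    return vec[k] / sum(vec)


def update_pair(uw_m, dw_n, lam3, lam4, p_m, q_n):
    """Return the updated (user, document) vectors for one access signal."""
    uk_m = top_features(uw_m, p_m)  # UK_m: strongest features of user m
    dk_n = top_features(dw_n, q_n)  # DK_n: strongest features of document n
    new_uw = list(uw_m)
    new_dw = list(dw_n)
    for k in dk_n:  # the document "votes" on its own top features
        new_uw[k] = uw_m[k] + lam3 * share(dw_n, k)
    for k in uk_m:  # the user "votes" on its own top features
        new_dw[k] = dw_n[k] + lam4 * share(uw_m, k)
    return new_uw, new_dw


new_uw, new_dw = update_pair([5.0, 3.0, 2.0], [1.0, 6.0, 3.0],
                             lam3=1.0, lam4=1.0, p_m=1, q_n=1)
```

Both updated vectors are computed from the values before the update, which matches the text's separation between computing the starred values and assigning them afterwards.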
In application example 3, $f_3[K_d(n)]$ is a function of the parameter vector $K_d(n)$ of document n, and $f_4[K_u(m)]$ is a function of the parameter vector $K_u(m)$ of user m. Concrete implementations of $f_3[K_d(n)]$ and $f_4[K_u(m)]$ include:
Example 1: $f_3[K_d(n)]$ is an increasing function of $dw_{nk}$ and a decreasing function of $\sum_{k\in K}dw_{nk}$; $f_4[K_u(m)]$ is an increasing function of $uw_{mk}$ and a decreasing function of $\sum_{k\in K}uw_{mk}$.
Example 2: $f_3[K_d(n)]=\sigma_3\cdot dw_{nk}/(\sum_{k\in K}dw_{nk})$, $f_4[K_u(m)]=\sigma_4\cdot uw_{mk}/(\sum_{k\in K}uw_{mk})$, where $\sigma_3$ and $\sigma_4$ are preset constants.
Example 3: $f_3[K_d(n)]=\sigma_5\cdot dw_{nk}$, $f_4[K_u(m)]=\sigma_6\cdot uw_{mk}$, where $\sigma_5$ and $\sigma_6$ are preset constants.
Example 4: $f_3[K_d(n)]=\sigma_7\cdot\{1/[1+\exp(dw_{nk})]\}$, $f_4[K_u(m)]=\sigma_8\cdot\{1/[1+\exp(uw_{mk})]\}$, where $\sigma_7$ and $\sigma_8$ are preset constants.
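Examples 2 to 4 can be written as small Python functions, shown here as a sketch. Each function takes a parameter vector and a feature index k, so the same code serves for $f_3$ (applied to a document vector) and $f_4$ (applied to a user vector). The function names and the default `sigma=1.0` are assumptions; the logistic form is transcribed as printed in Example 4.

```python
import math


def f_share(vec, k, sigma=1.0):
    """Example 2: sigma * w_k / sum of all components."""
    return sigma * vec[k] / sum(vec)


def f_linear(vec, k, sigma=1.0):
    """Example 3: sigma * w_k."""
    return sigma * vec[k]


def f_logistic(vec, k, sigma=1.0):
    """Example 4 as printed: sigma / (1 + exp(w_k))."""
    return sigma / (1.0 + math.exp(vec[k]))


dw_n = [1.0, 3.0]  # hypothetical document vector
values = (f_share(dw_n, 1), f_linear(dw_n, 1, sigma=2.0), f_logistic([0.0], 0))
```

Only Example 2 is bounded by `sigma` regardless of the scale of the vector, which is one reason it pairs naturally with the periodic column normalization described for Fig. 4.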
In application example 3, concrete implementations of $\lambda_3(n,m,T)$ and $\lambda_4(m,n,T)$ include the following examples:
Example 1: $\lambda_3(n,m,T)$ and $\lambda_4(m,n,T)$ are each a function of the similarity $\mathrm{sim}(m,n)$ between the parameter vectors of user m and document n. For example, $\lambda_3(n,m,T)=c_1\cdot\mathrm{sim}(m,n)$ and $\lambda_4(m,n,T)=c_2\cdot\mathrm{sim}(m,n)$, with $\mathrm{sim}(m,n)=\|K_u(m),K_d(n)\|=[\sum_{k\in K}(uw_{mk}\cdot dw_{nk})]/\{[\sum_{k\in K}(uw_{mk})^2]^{1/2}\cdot[\sum_{k\in K}(dw_{nk})^2]^{1/2}\}$, where $c_1$ and $c_2$ are preset constants. The meaning of this example is that the higher the similarity between the parameter vectors of a user and a document, the larger the proportional coefficient with which they "vote" for each other.
Example 2: $\lambda_3(n,m,T)=u_2(m)\cdot d_1(n)$ and $\lambda_4(m,n,T)=u_1(m)\cdot d_2(n)$, where $d_1(n)$ indicates whether the parameter vector of document n may be used to update the parameter vectors of users in the user set U, $u_2(m)$ indicates whether the parameter vector of user m may be updated by the parameter vectors of documents in the document set D, $u_1(m)$ indicates whether the parameter vector of user m may be used to update the parameter vectors of documents in the document set D, and $d_2(n)$ indicates whether the parameter vector of document n may be updated by the parameter vectors of users in the user set U. $u_1(m)$, $u_2(m)$, $d_1(n)$ and $d_2(n)$ are set parameters whose value is 0 or 1, where 1 means yes and 0 means no. The purpose of this example is to guard against malicious attacks: the parameter vector of a document (or user) that has not passed reliability certification may not update the parameter vectors of other users (or documents), and the parameter vector of certain special documents (or users) may not be updated by the parameter vectors of other users (or documents).
Example 3: $\lambda_3(n,m,T)=s_1(T)$ and $\lambda_4(m,n,T)=s_2(T)$, where T is the type of the user-accesses-document signal and $s_1(T)$ and $s_2(T)$ are functions of T.
Example 4: $\lambda_3(n,m,T)$ and $\lambda_4(m,n,T)$ are generated by combining the methods of Examples 1 to 3 above, namely
$\lambda_3(n,m,T)=\{c_1\cdot\mathrm{sim}(m,n)\}\cdot\{u_2(m)\cdot d_1(n)\}\cdot s_1(T)$
$\lambda_4(m,n,T)=\{c_2\cdot\mathrm{sim}(m,n)\}\cdot\{u_1(m)\cdot d_2(n)\}\cdot s_2(T)$.
Example 5: $\lambda_3(n,m,T)$ is a function of the number of times document n has been accessed or of its PageRank value, and $\lambda_4(m,n,T)$ is a function of the number of users in the relationship network of user m.
Example 6: $\lambda_3(n,m,T)$ and $\lambda_4(m,n,T)$ are preset constants.
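The combined form of Example 4 is a plain product of three factors, which the following Python sketch makes explicit. The function name `influence_coefficient`, its parameter names, and the values `c=1.0` and `sim=0.8` are assumptions for illustration; the signal weights 3 and 1.5 for T=9 are the ones used in application example 4 of the text.

```python
def influence_coefficient(c, sim, source_may_update, target_may_be_updated,
                          signal_weight):
    """lambda = {c * sim} * {source flag * target flag} * s(T)."""
    return c * sim * (source_may_update * target_may_be_updated) * signal_weight


S1 = {9: 3.0}  # s_1(T): weight of the document's influence on the user
S2 = {9: 1.5}  # s_2(T): weight of the user's influence on the document

# lambda_3(n, m, T): document n is the source, user m is the target.
lam3 = influence_coefficient(1.0, 0.8, source_may_update=1,
                             target_may_be_updated=1, signal_weight=S1[9])
# lambda_4(m, n, T): user m is the source, document n is the target.
lam4 = influence_coefficient(1.0, 0.8, source_may_update=1,
                             target_may_be_updated=1, signal_weight=S2[9])
# An uncertified source (flag 0) contributes nothing, whatever the similarity.
blocked = influence_coefficient(1.0, 0.8, source_may_update=0,
                                target_may_be_updated=1, signal_weight=S1[9])
```

Because the 0/1 flags multiply the whole expression, a single flag set to 0 switches the update off in that direction without affecting the opposite direction.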
In application example 3, after the concrete parameter vector update algorithm has been executed a set number of times, the k-th document column vector $(dw_{1k}, dw_{2k}, \ldots, dw_{Nk})$ and the k-th user column vector $(uw_{1k}, uw_{2k}, \ldots, uw_{Mk})$ must each be normalized for each feature k ∈ K.
Application example 4
This is an application example of the method of application example 3. For ease of illustration, suppose that there are two users and three documents on the Internet and that every user and every document has two features, i.e. the user set U={1, 2}, the document set D={1, 2, 3} and the feature set K={1, 2}. The parameter vectors of user 1 and user 2 are $(uw_{11}, uw_{12})$ and $(uw_{21}, uw_{22})$, and the parameter vectors of document 1, document 2 and document 3 are $(dw_{11}, dw_{12})$, $(dw_{21}, dw_{22})$ and $(dw_{31}, dw_{32})$. Here $uw_{mk}$ (m ∈ U, k ∈ K) is the degree of correlation between user m and feature k, and $dw_{nk}$ (n ∈ D, k ∈ K) is the degree of correlation between document n and feature k.
Suppose the server receives a signal that user 2 has accessed document 3, with signal type T=9. The parameter vectors of user 2 and document 3 are then updated according to the following algorithm:
$uw_{21}^*=uw_{21}+\lambda_3(3,2,9)\cdot\{dw_{31}/(dw_{31}+dw_{32})\}$
$uw_{22}^*=uw_{22}+\lambda_3(3,2,9)\cdot\{dw_{32}/(dw_{31}+dw_{32})\}$
$dw_{31}^*=dw_{31}+\lambda_4(2,3,9)\cdot\{uw_{21}/(uw_{21}+uw_{22})\}$
$dw_{32}^*=dw_{32}+\lambda_4(2,3,9)\cdot\{uw_{22}/(uw_{21}+uw_{22})\}$
where $\lambda_3(3,2,9)$ is the influence coefficient of document 3 on user 2 under signal type T=9, and $\lambda_4(2,3,9)$ is the influence coefficient of user 2 on document 3 under signal type T=9. For example, let $\lambda_3(3,2,9)=c_1\cdot\mathrm{sim}(2,3)\cdot s_1(9)$ and $\lambda_4(2,3,9)=c_2\cdot\mathrm{sim}(2,3)\cdot s_2(9)$, with $s_1(9)=3$ and $s_2(9)=1.5$; $c_1$ and $c_2$ are preset constants; $\mathrm{sim}(2,3)$ is the similarity between the parameter vectors of user 2 and document 3, namely:
$$\mathrm{sim}(2,3)=\frac{uw_{21}\cdot dw_{31}+uw_{22}\cdot dw_{32}}{\sqrt{uw_{21}^2+uw_{22}^2}\cdot\sqrt{dw_{31}^2+dw_{32}^2}}.$$
After the above algorithm has been executed, the parameter vectors of user 2 and document 3 are updated, i.e. $uw_{21}=uw_{21}^*$, $uw_{22}=uw_{22}^*$, $dw_{31}=dw_{31}^*$ and $dw_{32}=dw_{32}^*$.
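The update of application example 4 can be traced numerically with the Python sketch below. The starting vectors and the choice $c_1=c_2=1$ are hypothetical; $s_1(9)=3$ and $s_2(9)=1.5$ are the values given in the text. The function names `cosine` and `visit_update` are assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity of two vectors, as in the sim(2,3) formula."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def visit_update(uw, dw, c1=1.0, c2=1.0, s1=3.0, s2=1.5):
    """One 'user accesses document' update with share-of-sum contributions."""
    sim = cosine(uw, dw)
    lam3 = c1 * sim * s1  # influence of the document on the user
    lam4 = c2 * sim * s2  # influence of the user on the document
    new_uw = [u + lam3 * d / sum(dw) for u, d in zip(uw, dw)]
    new_dw = [d + lam4 * u / sum(uw) for u, d in zip(uw, dw)]
    return new_uw, new_dw


uw2 = [3.0, 4.0]  # hypothetical parameter vector of user 2
dw3 = [4.0, 3.0]  # hypothetical parameter vector of document 3
uw2_new, dw3_new = visit_update(uw2, dw3)
```

Since the shares of a vector sum to 1, the total weight added to user 2 equals $\lambda_3(3,2,9)$ and the total weight added to document 3 equals $\lambda_4(2,3,9)$; the similarity only scales how much is added, not how it is split across features.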
After the above algorithm has been executed, the user column vectors $(uw_{11}, uw_{21})$ and $(uw_{12}, uw_{22})$ are normalized, and the document column vectors $(dw_{11}, dw_{21}, dw_{31})$ and $(dw_{12}, dw_{22}, dw_{32})$ are normalized.
The user column vectors are normalized as follows. Let $temp1=uw_{11}+uw_{21}$; for feature k=1, set $uw_{11}=uw_{11}/temp1$ and $uw_{21}=uw_{21}/temp1$. Let $temp2=uw_{12}+uw_{22}$; for feature k=2, set $uw_{12}=uw_{12}/temp2$ and $uw_{22}=uw_{22}/temp2$.
The document column vectors are normalized as follows. Let $temp1=dw_{11}+dw_{21}+dw_{31}$; for feature k=1, set $dw_{11}=dw_{11}/temp1$, $dw_{21}=dw_{21}/temp1$ and $dw_{31}=dw_{31}/temp1$. Let $temp2=dw_{12}+dw_{22}+dw_{32}$; for feature k=2, set $dw_{12}=dw_{12}/temp2$, $dw_{22}=dw_{22}/temp2$ and $dw_{32}=dw_{32}/temp2$.
Fig. 5 is a flowchart of the method for querying users having specific features. The method comprises executing the following steps in a server:
A11. Receive the query vector set by any querying user e (e ∈ U);
A12. The querying user e selects a group of users Q ⊆ U from the user set U, for example a group of users in a social network whose age lies within a given interval, or all users located within a set geographic area; if the user makes no such selection, the default is Q=U;
A13. According to the query vector and the parameter vector of each user in the group of users Q, calculate the personalized ranking value UR(e, m) of each user m (m ∈ Q) in the group of users Q; UR(e, m) is the personalized ranking value of user m based on the query vector of user e;
A14. Send to the querying user e the identifiers of the several users in the group of users Q with the largest personalized ranking values.
In the method of Fig. 5, the query vector set by user e is $K_s(e)=(sw_{e1}, sw_{e2}, \ldots, sw_{ek}, \ldots, sw_{eL})$, where $sw_{ek}$ is the degree of correlation between feature k (k ∈ K) and the users that user e wishes to find, $sw_{ek}\in[a,b]$, and a and b are preset non-negative constants. The query vector $K_s(e)$ can be set in the following ways.
The first is that user e selects features from the feature set K and sets a degree of correlation for each of them, for example $sw_{e2}=0.00023$ and $sw_{e6}=0.00061$, with the degree of correlation for all other features being 0.
The second is to assign the parameter vector $K_u(e)$ of user e to the query vector $K_s(e)$.
The third is that user e submits a group of users or documents $S_e=\{\ldots, r, \ldots\}$. When $S_e\subseteq U$, the parameter vector of user r ($r\in S_e$) is $(uw_{r1}, uw_{r2}, \ldots, uw_{rL})$, and the query vector of user e is therefore set, for each feature k ∈ K, to $sw_{ek}=(\sigma_9/s)\cdot\sum_{r\in S_e}[uw_{rk}/(\sum_{k\in K}uw_{rk})]$; when $S_e\subseteq D$, the parameter vector of document r ($r\in S_e$) is $(dw_{r1}, dw_{r2}, \ldots, dw_{rL})$, and the query vector of user e is therefore set, for each feature k ∈ K, to $sw_{ek}=(\sigma_{10}/s)\cdot\sum_{r\in S_e}[dw_{rk}/(\sum_{k\in K}dw_{rk})]$.
In an application example of the method of Fig. 5, the personalized ranking value UR(e, m) is calculated from the query vector $K_s(e)=(sw_{e1}, sw_{e2}, \ldots, sw_{ek}, \ldots, sw_{eL})$ of user e and the parameter vector $K_u(m)=(uw_{m1}, uw_{m2}, \ldots, uw_{mk}, \ldots, uw_{mL})$ of user m (m ∈ Q), for example
$$UR(e,m)=\sum_{k=1}^{L}uw_{mk}\cdot sw_{ek}$$
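Steps A13 and A14 with the example ranking value above amount to a dot product followed by a top-n selection. The Python sketch below illustrates this; the function name `rank_users`, the dictionary layout and all numeric values are assumptions.

```python
def rank_users(query, user_vectors, top_n):
    """Return the ids of the top_n users by UR(e, m) = sum_k uw_mk * sw_ek."""
    scores = {
        m: sum(uw * sw for uw, sw in zip(vec, query))
        for m, vec in user_vectors.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Group Q of candidate users with two features each (hypothetical values).
Q = {1: [0.2, 0.9], 2: [0.7, 0.1], 3: [0.5, 0.5]}
query_vector = [1.0, 0.0]  # user e is only interested in feature 1
best = rank_users(query_vector, Q, top_n=2)
```

Restricting the candidates to the group Q before scoring, as in step A12, keeps the cost proportional to the size of Q rather than to the whole user set U.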
Fig. 6 is a flowchart of the method for querying user groups having specific features. A plurality of user groups is obtained, forming the user group set G={1, 2, ..., E}, where E is the number of user groups. The parameter vector of user group i (i ∈ G) is $(gw_{i1}, gw_{i2}, \ldots, gw_{ik}, \ldots, gw_{iL})$, where $gw_{ik}$ is the degree of correlation between user group i and feature k (k ∈ K). The method for querying user groups having specific features is then as follows:
A21. Calculate the parameter vector of each user group in the user group set G; the parameter vector of a user group is calculated from the parameter vectors of the users that the group contains. For example, let all users in user group i form the user set $B_i$; the parameter vector of user group i is then computed, for each feature k ∈ K, as $gw_{ik}=(\sigma_{11}/s)\cdot\sum_{t\in B_i}[uw_{tk}/(\sum_{k\in K}uw_{tk})]$, where s is the number of elements of the user set $B_i$ and $\sigma_{11}$ is a preset constant.
A22. Receive the query vector set by any querying user e (e ∈ U);
A23. According to the query vector and the parameter vector of each user group in the user group set G, calculate the personalized ranking value GR(e, i) of each user group i (i ∈ G) in the user group set G; GR(e, i) is the personalized ranking value of user group i based on the query vector of user e;
A24. Send to the querying user e the identifiers of the several user groups in the user group set G with the largest personalized ranking values.
In the method of Fig. 6, the query vector set by user e is $K_s(e)=(sw_{e1}, sw_{e2}, \ldots, sw_{ek}, \ldots, sw_{eL})$, where $sw_{ek}$ is the degree of correlation between feature k (k ∈ K) and the user groups that user e wishes to find, $sw_{ek}\in[a,b]$, and a and b are preset non-negative constants. The query vector $K_s(e)$ can be set in four ways; the first three are the same as the three ways described for Fig. 5. The fourth is that user e submits a user group, and the parameter vector of that user group is assigned to $K_s(e)$.
In the method for finding user groups, the personalized ranking value GR(e, i) is calculated from the query vector $K_s(e)=(sw_{e1}, sw_{e2}, \ldots, sw_{ek}, \ldots, sw_{eL})$ of user e and the parameter vector $K_u(i)=(gw_{i1}, gw_{i2}, \ldots, gw_{ik}, \ldots, gw_{iL})$ of user group i (i ∈ G), for example
$$GR(e,i)=\sum_{k=1}^{L}gw_{ik}\cdot sw_{ek}$$
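The group query can be sketched in Python by first aggregating member vectors as in step A21 and then ranking with the example GR(e, i). The function names `group_vector` and `rank_groups`, the default `sigma=1.0` and the sample data are assumptions; member vectors are assumed to have a non-zero sum.

```python
def group_vector(member_vectors, sigma=1.0):
    """gw_ik = (sigma / s) * sum over members t of uw_tk / sum_k uw_tk."""
    s = len(member_vectors)  # number of users in B_i
    n_features = len(member_vectors[0])
    gw = [0.0] * n_features
    for vec in member_vectors:
        total = sum(vec)
        for k in range(n_features):
            gw[k] += vec[k] / total
    return [sigma / s * value for value in gw]


def rank_groups(query, groups, top_n):
    """Rank group ids by GR(e, i) = sum_k gw_ik * sw_ek."""
    scores = {}
    for i, members in groups.items():
        gw = group_vector(members)
        scores[i] = sum(g * sw for g, sw in zip(gw, query))
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Two hypothetical groups; each entry is a member's parameter vector.
groups = {1: [[1.0, 1.0], [3.0, 1.0]], 2: [[1.0, 3.0]]}
gw1 = group_vector(groups[1])
order = rank_groups([1.0, 0.0], groups, top_n=2)
```

Dividing each member vector by its own sum before averaging gives every member the same weight in the group profile, regardless of how large that member's raw components have grown.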
Fig. 7 is a structural diagram of a system for obtaining the personalized features of a user. The system 200 comprises the following functional modules:
User, document and feature setting module 211: obtains a plurality of users on the Internet to form the user set U={1, 2, ..., M} and stores the user set U in the user database 220; obtains a plurality of documents on the Internet to form the document set D={1, 2, ..., N} and stores the document set D in the document database 230; sets a plurality of features to form the feature set K={1, 2, ..., L} and stores the feature set in the feature database 240;
User and document parameter vector initial value setting module 212: sets initial parameter vectors for a plurality of users in the user set U and stores them in the user database 220; sets initial parameter vectors for a plurality of documents in the document set D and stores them in the document database 230; for users and documents that have not been given an initial parameter vector, the initial parameter vector defaults to the zero vector;
User-accesses-user signal acquisition module 213: collects signal 1, indicating that any user i (i ∈ U) has accessed any user j (j ∈ U), and stores signal 1 in the Web log database 250; the signal of user i (101) accessing user j (102) is sent to the social network server 302;
User-accesses-document signal acquisition module 214: collects signal 2, indicating that any user m (m ∈ U) (103) has accessed any document n (n ∈ D), and stores signal 2 in the Web log database 250; the signal of user m (103) accessing document n is sent to at least one application server, the application servers comprising the portal site server 301, the social network server 302, the search engine server 303 and the instant messaging server 304;
User and document parameter vector update module 215: according to signal 1, reads the parameter vectors of user i (101) and user j (102) from the user database 220, updates the parameter vectors of user i (101) and user j (102) by the parameter vector update algorithm, and stores the updated parameter vectors of user i and user j in the user database 220; according to signal 2, reads the parameter vector of user m (103) from the user database 220 and the parameter vector of document n from the document database 230, updates the parameter vectors of user m (103) and document n by the parameter vector update algorithm 2, and stores the updated parameter vector of user m in the user database 220 and the updated parameter vector of document n in the document database 230;
User query module 216: this module provides query functions for users and for user groups. The user query function comprises: receiving the query vector set by the querying user and obtaining a group of users Q ⊆ U; then, according to the query vector and the parameter vector of each user in the group of users Q, calculating the personalized ranking value of each user in the group of users Q, and sending the identifier of at least one user in the group of users Q to the querying user according to the size of the personalized ranking values (see steps A11 to A14). The user group query function comprises: receiving the query vector set by the querying user; then, according to the query vector and the parameter vector of each user group in the user group set G, calculating the personalized ranking value of each user group in the user group set G, and sending the identifier of at least one user group in the user group set G to the querying user according to the size of the personalized ranking values (see steps A21 to A24).
The above application examples are merely preferred application examples of the present invention and are not intended to limit the scope of protection of the present invention.

Claims (14)

1. A method for obtaining the personalized features of a user, characterized in that:
a plurality of users is obtained on the Internet, and the user set U={1, 2, ..., M} formed by these users is stored; a plurality of features is set, and the feature set K={1, 2, ..., L} formed by these features is stored;
initial parameter vectors are set for a plurality of users in the user set U;
the following steps are executed repeatedly:
receiving a signal that any user i (i ∈ U) has accessed any user j (j ∈ U);
reading the parameter vector of user i, $K_u(i)=(uw_{i1}, uw_{i2}, \ldots, uw_{ik}, \ldots, uw_{iL})$, where $uw_{ik}$ is the degree of correlation between user i and feature k (k ∈ K);
reading the parameter vector of user j, $K_u(j)=(uw_{j1}, uw_{j2}, \ldots, uw_{jk}, \ldots, uw_{jL})$, where $uw_{jk}$ is the degree of correlation between user j and feature k (k ∈ K);
updating the parameter vectors of user i and user j using the following parameter vector update algorithm, namely
$K_u^*(i)=\mathrm{function1}[K_u(i),K_u(j)]$;
$K_u^*(j)=\mathrm{function2}[K_u(i),K_u(j)]$;
where $K_u(i)$ and $K_u^*(i)$ are the parameter vectors of user i before and after the update, and $K_u(j)$ and $K_u^*(j)$ are the parameter vectors of user j before and after the update.
2. The method according to claim 1, characterized in that, with $K_u^*(i)=(uw_{i1}^*, uw_{i2}^*, \ldots, uw_{ik}^*, \ldots, uw_{iL}^*)$ and $K_u^*(j)=(uw_{j1}^*, uw_{j2}^*, \ldots, uw_{jk}^*, \ldots, uw_{jL}^*)$, for each k ∈ K, $uw_{ik}^*$ is an increasing function of $uw_{jk}$ and a decreasing function of $\sum_{k\in K}uw_{jk}$; for each k ∈ K, $uw_{jk}^*$ is an increasing function of $uw_{ik}$ and a decreasing function of $\sum_{k\in K}uw_{ik}$.
3. The method according to claim 1, characterized in that the parameter vector update algorithm updates the parameter vectors of user i and user j as follows:
$uw_{ik}^*=uw_{ik}+\lambda_1(j,i,T)\cdot f_1[K_u(j)]$ (for each $k\in UK_j$)
$uw_{jk}^*=uw_{jk}+\lambda_2(i,j,T)\cdot f_2[K_u(i)]$ (for each $k\in UK_i$)
where $uw_{ik}$ and $uw_{ik}^*$ are the k-th components of the parameter vector of user i before and after the update, and $uw_{jk}$ and $uw_{jk}^*$ are the k-th components of the parameter vector of user j before and after the update; $\lambda_1(j,i,T)$ is the influence coefficient of user j on user i under signal type T, and $\lambda_2(i,j,T)$ is the influence coefficient of user i on user j under signal type T; $UK_i$ is the set of features corresponding to the $P_i$ largest components of the parameter vector $K_u(i)=(uw_{i1}, uw_{i2}, \ldots, uw_{ik}, \ldots, uw_{iL})$ of user i, and $UK_j$ is the set of features corresponding to the $P_j$ largest components of the parameter vector $K_u(j)=(uw_{j1}, uw_{j2}, \ldots, uw_{jk}, \ldots, uw_{jL})$ of user j; $P_i$ and $P_j$ are set parameters with $P_i\le L$ and $P_j\le L$.
4. The method according to claim 3, characterized in that $\lambda_1(j,i,T)$ and $\lambda_2(i,j,T)$ are each a function of the similarity between the parameter vector of user i and the parameter vector of user j.
5. The method according to claim 3, characterized in that $f_1[K_u(j)]$ is an increasing function of $uw_{jk}$ and a decreasing function of $\sum_{k\in K}uw_{jk}$; $f_2[K_u(i)]$ is an increasing function of $uw_{ik}$ and a decreasing function of $\sum_{k\in K}uw_{ik}$.
6. The method according to claim 1, characterized in that, after the parameter vector update algorithm has been executed a set number of times, the k-th user column vector $(uw_{1k}, uw_{2k}, \ldots, uw_{Mk})$ is normalized for each feature k ∈ K.
7. method according to claim 1, is characterized in that, in described user collected U, user's parameter vector can also upgrade in the following manner:
Obtain on the internet a plurality of documents, the document sets D={1 that storage is comprised of described a plurality of documents, 2 ..., N}, and be a plurality of document setup parameter vector initial values in described document sets D;
Repeatedly carry out following steps:
Receive the signal that any one user m (m ∈ U) accesses any one document n (n ∈ D);
Read the parameter vector K of described user m u(m)=(uw m1, uw m2..., uw mk..., uw mL), described uw mkThe degree of correlation that represents described user m and feature k (k ∈ K);
Read the parameter vector K of described document n d(n)=(dw n1, dw n2..., dw nk..., dw nL), described dw nkThe degree of correlation that represents described document n and feature k (k ∈ K);
With following parameter vector update algorithm 2, upgrade the parameter vector of described user m and described document n,
K u *(m)=function3[K u(m),K d(n)];
K d *(n)=function4[K u(m),K d(n)];
Wherein said K u(m) and described K u *(m) represent respectively to upgrade parameter vector front and the rear described user m of renewal, described K d(n) and described K d *(n) represent respectively to upgrade parameter vector front and the rear described document n of renewal.
8. method according to claim 7, is characterized in that, establishes described K u *(m)=(uw m1 *, uw m2 *..., uw mk *..., uw mL *), described K d *(n)=(dw n1 *, dw n2 *..., dw nk *..., dw nK *), for each k ∈ K, described uw mk *Described dw nkIncreasing function, be ∑ (k ∈ K)dw nkSubtraction function; For each k ∈ K, described dw nk *Described uw mkIncreasing function, be ∑ (k ∈ K)uw mkSubtraction function.
9. method according to claim 7, is characterized in that, described parameter vector update algorithm 2, by following concrete application example, is upgraded the parameter vector of described user m and described document n:
uw mk *=uw mk+ λ 3(n, m, T) f 3[K d(n)] (for each
Figure FSA00000718346900031
)
dw nk *=dw nk+ λ 4(m, n, T) f 4[K u(m)] (for each )
Wherein, described uw mkWith described uw mk *Represent to upgrade respectively k component of front parameter vector with upgrading rear described user m, described dw nkWith described dw nk *Represent to upgrade respectively k component of front parameter vector with upgrading rear described document n; Described λ 3(n, m, T) is under the type T of described signal, the influence coefficient of described document n to described user m, described λ 4(m, n, T) is under the type T of described signal, the influence coefficient of described user m to described document n; Described UK mThe parameter vector K by described user m u(m)=(uw m1, uw m2..., uw mk..., uw mL) in the P of numerical value maximum mThe set that the corresponding feature of individual component forms, described DK nThe parameter vector K by described document n d(n)=(dw n1, dw n2..., dw nk..., dw nL) in the Q of numerical value maximum nThe set that the corresponding feature of individual component forms, P mAnd Q nFor setup parameter, and P m≤ L, Q n≤ L.
10. method according to claim 9, is characterized in that, described f 3[K d(n)] be described dw nkIncreasing function, be ∑ (k ∈ K)dw nkSubtraction function; Described f 4[K u(m)] be described uw mkIncreasing function, be ∑ (k ∈ K)uw mkSubtraction function.
11. The method according to claim 7, characterized in that, after the parameter vector update algorithm 2 has been executed a set number of times $t_1$, for each feature $k \in K$, the k-th user column vector $(uw_{1k}, uw_{2k}, \ldots, uw_{Mk})$ is normalized; and after the parameter vector update algorithm 2 has been executed a set number of times $t_2$, for each feature $k \in K$, the k-th document column vector $(dw_{1k}, dw_{2k}, \ldots, dw_{Nk})$ is normalized, wherein $t_1$ and $t_2$ are preset parameters.
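The column normalization of claim 11 might look like the following sketch. The claim does not say which normalization is used; dividing each feature column by its sum is an assumption made here for illustration, and the function name `normalize_columns` is hypothetical.

```python
from typing import List


def normalize_columns(matrix: List[List[float]]) -> List[List[float]]:
    """Scale each feature column so its entries sum to 1.

    `matrix[m][k]` is the weight of feature k for user (or document) m.
    Columns whose sum is zero are left unchanged.
    """
    if not matrix:
        return []
    n_features = len(matrix[0])
    col_sums = [sum(row[k] for row in matrix) for k in range(n_features)]
    return [
        [row[k] / col_sums[k] if col_sums[k] > 0 else row[k]
         for k in range(n_features)]
        for row in matrix
    ]
```

In a running system this would be called once every $t_1$ (or $t_2$) executions of the update, which keeps any single feature column from growing without bound.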
12. The method according to claim 1, characterized in that the method comprises an application example of searching for a group of users having specific features according to a query vector set by a querying user, comprising performing the following steps on a server:
A11. receiving the query vector set by a querying user e (e ∈ U);
A12. the querying user e selecting a group of users Q from the user set U [formula image FSA00000718346900033];
A13. calculating, according to the query vector and the parameter vector of each user in the group of users Q, a personalized ranking value UR(e, m) of each user m (m ∈ Q) in the group of users Q, where UR(e, m) denotes the personalized ranking value of the user m based on the query vector of the user e;
A14. sending the identifiers of the several users in the group of users Q having the largest personalized ranking values to the querying user e.
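Steps A11 to A14 can be sketched in a few lines of Python. This claim does not give a formula for UR(e, m); the dot product of the query vector with the user's parameter vector is used below purely as an assumed scoring function, and `rank_users`, `user_vecs`, and `top_n` are hypothetical names.

```python
from typing import Dict, List


def rank_users(
    query_vec: List[float],
    user_vecs: Dict[int, List[float]],
    candidates: List[int],
    top_n: int,
) -> List[int]:
    """Return the ids of the `top_n` candidates with the highest score."""
    scores = {
        m: sum(q * w for q, w in zip(query_vec, user_vecs[m]))
        for m in candidates  # the group Q chosen by the querying user
    }
    return sorted(scores, key=lambda m: scores[m], reverse=True)[:top_n]
```

For example, with a query vector that weights only the first feature, users whose first component is largest come first in the returned list.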
13. The method according to claim 1, characterized in that a plurality of user groups are obtained to form a user group set $G = \{1, 2, \ldots, E\}$; the parameter vector of a user group i (i ∈ G) is set to $(gw_{i1}, gw_{i2}, \ldots, gw_{ik}, \ldots, gw_{iL})$, where $gw_{ik}$ denotes the degree of correlation between the user group i and the feature k (k ∈ K); the method for querying user groups having specific features is then as follows:
A21. calculating the parameter vector of each user group in the user group set G, the parameter vector of a user group being calculated from the parameter vectors of the users that the user group contains;
A22. receiving the query vector set by any querying user e (e ∈ U);
A23. calculating, according to the query vector and the parameter vector of each user group in the user group set G, a personalized ranking value GR(e, i) of each user group i (i ∈ G) in the user group set G, where GR(e, i) denotes the personalized ranking value of the user group i based on the query vector of the user e;
A24. sending the identifiers of the several user groups in the user group set G having the largest personalized ranking values to the querying user e.
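A rough Python sketch of steps A21 to A24 follows. The claim says only that a group's parameter vector is calculated from its members' vectors and does not define GR(e, i); averaging the member vectors and scoring by dot product are both assumptions made here, and the function names are hypothetical.

```python
from typing import Dict, List


def group_vector(member_vecs: List[List[float]]) -> List[float]:
    """Assumed aggregation: component-wise mean of the member vectors."""
    count = len(member_vecs)
    return [sum(column) / count for column in zip(*member_vecs)]


def rank_groups(
    query_vec: List[float],
    groups: Dict[int, List[List[float]]],
    top_n: int,
) -> List[int]:
    """Return the ids of the `top_n` groups with the highest assumed score."""
    scores = {}
    for group_id, member_vecs in groups.items():
        gvec = group_vector(member_vecs)
        scores[group_id] = sum(q * g for q, g in zip(query_vec, gvec))
    return sorted(scores, key=lambda i: scores[i], reverse=True)[:top_n]
```

A mean keeps large groups from outranking small ones simply by size; a sum or a maximum would be equally compatible with the wording of step A21.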
14. A system for acquiring personalized features of a user, characterized in that the system comprises:
a user, document and feature setting module, which obtains a plurality of users on the Internet to form a user set $U = \{1, 2, \ldots, M\}$ and stores the user set U in a user database; obtains a plurality of documents on the Internet to form a document set $D = \{1, 2, \ldots, N\}$ and stores the document set D in a document database; and sets a plurality of features to form a feature set $K = \{1, 2, \ldots, L\}$ and stores the feature set K in a feature database;
a user and document parameter vector initial value setting module, which sets parameter vector initial values for a plurality of users in the user set U and stores them in the user database, and sets parameter vector initial values for a plurality of documents in the document set D and stores them in the document database; for any user or document whose parameter vector initial value has not been set, the initial value defaults to the zero vector;
a user-accesses-user signal acquisition module, used to collect a signal 1 indicating that any user i (i ∈ U) accesses any user j (j ∈ U), and to store the signal 1 in a Web log database;
a user-accesses-document signal acquisition module, used to collect a signal 2 indicating that any user m (m ∈ U) accesses any document n (n ∈ D), and to store the signal 2 in the Web log database;
a user parameter vector update module, which, according to the signal 1, reads the parameter vectors of the user i and the user j from the user database and then updates the parameter vectors of the user i and the user j by the parameter vector update algorithm; and which, according to the signal 2, reads the parameter vector of the user m from the user database and the parameter vector of the document n from the document database and then updates the parameter vectors of the user m and the document n by the parameter vector update algorithm 2;
a user query module, which provides a user query function and a user group query function; the user query function comprises: receiving the query vector set by a querying user and obtaining a group of users Q [formula image FSA00000718346900051], then calculating, according to the query vector and the parameter vector of each user in the group of users Q, the personalized ranking value of each user in the group of users Q, and sending the identifiers of the several users having the largest personalized ranking values to the querying user e; the user group query function comprises: receiving the query vector set by a querying user, then calculating, according to the query vector and the parameter vector of each user group in the user group set G, the personalized ranking value of each user group in the user group set G, and sending the identifiers of the several user groups having the largest personalized ranking values to the querying user e.
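One detail of the system claim, the zero-vector default for users and documents without an initial value, can be sketched as a small in-memory store. This is only an illustration: the class name `VectorStore` and its methods are hypothetical, and the claim itself speaks of user and document databases rather than a Python dictionary.

```python
from typing import Dict, List


class VectorStore:
    """Holds parameter vectors; unset entries read back as the zero vector."""

    def __init__(self, n_features: int) -> None:
        self.n_features = n_features  # L, the size of the feature set K
        self._vectors: Dict[int, List[float]] = {}

    def get(self, key: int) -> List[float]:
        """Return a copy of the stored vector, or a zero vector if none is set."""
        return list(self._vectors.get(key, [0.0] * self.n_features))

    def set(self, key: int, vec: List[float]) -> None:
        """Store a vector after checking that it has one weight per feature."""
        if len(vec) != self.n_features:
            raise ValueError("vector length must equal the number of features")
        self._vectors[key] = list(vec)
```

An update module built on such a store would call `get` for both parties of a signal, compute the new vectors, and write them back with `set`.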
CN201210152084.1A 2012-05-08 2012-05-08 Method and system for acquiring personalized features of user Expired - Fee Related CN103390008B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201810879449.8A CN109344321B (en) 2012-05-08 2012-05-08 System for obtaining user personalized features
CN201210152084.1A CN103390008B (en) 2012-05-08 2012-05-08 Method and system for acquiring personalized features of user

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210152084.1A CN103390008B (en) 2012-05-08 2012-05-08 Method and system for acquiring personalized features of user

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN201810879449.8A Division CN109344321B (en) 2012-05-08 2012-05-08 System for obtaining user personalized features

Publications (2)

Publication Number Publication Date
CN103390008A true CN103390008A (en) 2013-11-13
CN103390008B CN103390008B (en) 2018-09-28

Family

ID=49534284

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201810879449.8A Active CN109344321B (en) 2012-05-08 2012-05-08 System for obtaining user personalized features
CN201210152084.1A Expired - Fee Related CN103390008B (en) 2012-05-08 2012-05-08 Method and system for acquiring personalized features of user

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN201810879449.8A Active CN109344321B (en) 2012-05-08 2012-05-08 System for obtaining user personalized features

Country Status (1)

Country Link
CN (2) CN109344321B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050086215A1 (en) * 2002-06-14 2005-04-21 Igor Perisic System and method for harmonizing content relevancy across structured and unstructured data
CN1667607A (en) * 2004-03-11 2005-09-14 国际商业机器公司 Personalized category treatment method and system for document browsing
CN101071445A (en) * 2007-06-22 2007-11-14 腾讯科技(深圳)有限公司 Classified sample set optimizing method and content-related advertising server
CN101770520A (en) * 2010-03-05 2010-07-07 南京邮电大学 User interest modeling method based on user browsing behavior
CN102999540A (en) * 2011-09-10 2013-03-27 祁勇 Method and system for determining user features on Internet
CN103309900A (en) * 2012-03-06 2013-09-18 祁勇 Personalized multidimensional document sequencing method and system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020103798A1 (en) * 2001-02-01 2002-08-01 Abrol Mani S. Adaptive document ranking method based on user behavior
CN101079064B (en) * 2007-06-25 2011-11-30 腾讯科技(深圳)有限公司 Web page sequencing method and device
CN101334773B (en) * 2007-06-28 2014-07-30 联想(北京)有限公司 Method for filtrating search engine searching result
TWI443534B (en) * 2009-08-18 2014-07-01 Ind Tech Res Inst Video search method and apparatus using motion vectors

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Jiang Xinwei et al.: "Topic PageRank: A Topic-Based Search Engine", Computer Technology and Development *

Also Published As

Publication number Publication date
CN103390008B (en) 2018-09-28
CN109344321B (en) 2021-11-02
CN109344321A (en) 2019-02-15

Similar Documents

Publication Publication Date Title
US20220237145A1 (en) Method of and system for enhanced local-device content discovery
US10311478B2 (en) Recommending content based on user profiles clustered by subscription data
CN103106285B (en) Recommendation algorithm based on information security professional social network platform
US20190018904A1 (en) Method and system for identifying and discovering relationships between disparate datasets from multiple sources
US10157232B2 (en) Personalizing deep search results using subscription data
TWI493367B (en) Progressive filtering search results
US20160179816A1 (en) Near Real Time Auto-Suggest Search Results
US8688702B1 (en) Techniques for using dynamic data sources with static search mechanisms
US11836778B2 (en) Product and content association
US20170154116A1 (en) Method and system for recommending contents based on social network
US9864768B2 (en) Surfacing actions from social data
KR20100094021A (en) Customized and intellectual symbol, icon internet information searching system utilizing a mobile communication terminal and ip-based information terminal
CN106383887A (en) Environment-friendly news data acquisition and recommendation display method and system
CN105095335A (en) Ranking system for search results on network
CN104598604A (en) Browsing method of website navigation applied in various browsers
US10474670B1 (en) Category predictions with browse node probabilities
JP4031264B2 (en) Filtering management method, filtering management program, filtering management method for filtering device, and filtering management program for filtering device
Huang et al. On the understanding of interdependency of mobile app usage
US10387934B1 (en) Method medium and system for category prediction for a changed shopping mission
CN103309900A (en) Personalized multidimensional document sequencing method and system
CN103514237B (en) A kind of method and system obtaining user and Document personalization feature
Lu et al. Genderpredictor: a method to predict gender of customers from e-commerce website
CN101788981A (en) Deep web mobile search method, server and system
CN103390008A (en) Method and system for acquiring personalized features of user
CN103870517A (en) Method and system for acquiring personalized features of user

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20180523

Address after: 226661 No. 123 South Street, Qu Tang Town, Haian, Nantong, Jiangsu

Applicant after: Jing Zhuqiang

Address before: 518053 Guangdong Shenzhen Nanshan District overseas Chinese town beautiful Fairview garden 20E

Applicant before: Qi Yong

TA01 Transfer of patent application right

Effective date of registration: 20180816

Address after: 200082 room 104-2, Yin Road, Yangpu District gate, Shanghai, 104-2.

Applicant after: Six or six fish Information technologies(Shanghai)Co., Ltd.

Address before: 510000 B1B2, one, two, three and four floors of the podium building 231 and 233, science Avenue, Guangzhou, Guangdong.

Applicant before: BOAO ZONGHENG NETWORK TECHNOLOGY Co.,Ltd.

Effective date of registration: 20180816

Address after: 510000 B1B2, one, two, three and four floors of the podium building 231 and 233, science Avenue, Guangzhou, Guangdong.

Applicant after: BOAO ZONGHENG NETWORK TECHNOLOGY Co.,Ltd.

Address before: 226661 No. 123 South Street, Qu Tang Town, Haian, Nantong, Jiangsu

Applicant before: Jing Zhuqiang

GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20180928

Termination date: 20200508
