CN109753993A - User portrait method, apparatus, computer-readable storage medium and electronic device

Info

Publication number: CN109753993A
Application number: CN201811513512.2A
Authority: CN
Inventors: 于福超; 王菊
Original assignee: Neusoft Corp
Current assignee: Neusoft Corp
Priority date: 2018-12-11
Filing date: 2018-12-11
Publication date: 2019-05-14

Abstract

The present disclosure relates to a user portrait method, apparatus, computer-readable storage medium and electronic device. The method comprises: determining at least one similar user and at least one non-similar user corresponding to a target user; determining a matching degree between a target sample and the target user according to a first matching degree between the target sample and the at least one similar user and a second matching degree between the target sample and the at least one non-similar user; and, when the matching degree between the target sample and the target user exceeds a preset matching degree threshold, labeling the target user according to the labels associated with the target sample. This technical solution on the one hand ensures the correlation between the target sample and the target user and, on the other hand, effectively widens the quantity and range of target samples, thereby improving the accuracy and richness of the user portrait and the user experience.

Description

User portrait method, apparatus, computer-readable storage medium and electronic device

Technical field

The present disclosure relates to the field of data analysis, and in particular to a user portrait method, apparatus, computer-readable storage medium and electronic device.

Background art

At present, user portraits have become a capability that every industry is racing to build. A user portrait helps a merchant or client accurately understand a user's needs, preferences, interests and other aspects. In the prior art, a user's portrait is usually enriched from items related to the current user, such as items the user has bought, by passing the labels of those items on to the user. However, when a user is profiled directly from all samples associated with that user, labels of samples that do not actually belong to the user are also applied, which makes the user portrait chaotic.

Summary of the invention

To solve the above problem, an object of the present disclosure is to provide an accurate user portrait method, apparatus, computer-readable storage medium and electronic device.

To achieve the above object, according to a first aspect of the present disclosure, a user portrait method is provided, the method comprising:

determining at least one similar user and at least one non-similar user corresponding to a target user;

determining a matching degree between a target sample and the target user according to a first matching degree between the target sample and the at least one similar user and a second matching degree between the target sample and the at least one non-similar user;

when the matching degree between the target sample and the target user exceeds a preset matching degree threshold, labeling the target user according to the labels associated with the target sample.

Optionally, the first matching degree between the target sample and the at least one similar user is determined by the following formula:

Wherein the result of the formula denotes the first matching degree;

M denotes the number of similar users;

P denotes the label vector of the target sample;

X_i denotes the label vector of the i-th similar user.

Optionally, the second matching degree between the target sample and the at least one non-similar user is determined by the following formula:

Wherein the result of the formula denotes the second matching degree;

N denotes the number of non-similar users;

P denotes the label vector of the target sample;

Y_i denotes the label vector of the i-th non-similar user.

Optionally, determining the matching degree between the target sample and the target user according to the first matching degree between the target sample and the at least one similar user and the second matching degree between the target sample and the at least one non-similar user comprises:

determining the difference between the weighted value of the first matching degree and the weighted value of the second matching degree as the matching degree between the target sample and the target user.

Optionally, labeling the target user according to the labels associated with the target sample comprises:

determining, among the labels corresponding to the target sample, at least one label that the target user does not currently have as a candidate label;

determining the match parameter between each candidate label and the target user;

determining, from the candidate labels and according to the match parameter between each candidate label and the target user, target labels for labeling the target user, and applying the target labels to the target user.

Optionally, determining the match parameter between each candidate label and the target user comprises:

for each candidate label, determining the share of the candidate label among the labels of each similar user, and determining the share corresponding to each similar user as the first weight of the candidate label for that similar user;

determining the similarity between the label vector formed by all candidate labels and the label vector of each similar user, and determining the average of these similarities as the second weight corresponding to each candidate label;

determining the match parameter between the candidate label and the target user from the first weight and the second weight of the candidate label by the following formula:

Wherein Fit denotes the match parameter between the candidate label and the target user;

M denotes the number of similar users;

w_whole denotes the second weight of candidate label w;

w_i denotes the first weight of candidate label w for the i-th similar user.

Optionally, determining, from the candidate labels and according to the match parameter between each candidate label and the target user, the target labels for labeling the target user comprises any one of the following:

determining the candidate labels whose match parameter with the target user is less than a preset matching threshold as the target labels;

determining the top S candidate labels, ranked by match parameter with the target user in ascending order, as the target labels, where S is a positive integer.

Optionally, the target sample is determined as follows:

determining at least one sample in which the at least one similar user is interested as a first sample;

clustering the samples to be clustered with each first sample as a cluster center to obtain the same number of sample clusters as first samples, wherein the samples to be clustered are the samples in the sample set other than the samples associated with the similar users and the target user;

determining, from each sample cluster, multiple second samples that can represent the sample cluster, and determining each second sample as a target sample.

Optionally, clustering the samples to be clustered with each first sample as a cluster center comprises:

for each sample to be clustered, determining the distance between the sample to be clustered and each first sample according to the label vectors of the sample to be clustered and of the user to which it belongs, and assigning the sample to be clustered to the sample cluster corresponding to the first sample at the shortest distance.
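The assignment step above can be sketched in Python as follows. This is a minimal illustration, not the disclosed implementation: the function name `assign_to_clusters` is hypothetical, and `distance` is a caller-supplied placeholder for the distance between a sample to be clustered and a first sample. The example call uses plain numbers and an absolute difference purely as a stand-in.

```python
def assign_to_clusters(samples_to_cluster, first_samples, distance):
    """Assign each sample to the cluster of its nearest first sample.

    `distance(sample, center)` is an assumed, caller-supplied function;
    first samples must be hashable because they are used as dict keys.
    """
    # One cluster per first sample, so the cluster count equals the
    # number of first samples.
    clusters = {center: [] for center in first_samples}
    for sample in samples_to_cluster:
        # Pick the first sample at the shortest distance.
        nearest = min(first_samples, key=lambda center: distance(sample, center))
        clusters[nearest].append(sample)
    return clusters


# Stand-in example: numeric "samples" and absolute difference as distance.
example = assign_to_clusters([1, 2, 9], [0, 10], lambda s, c: abs(s - c))
print(example)
```

Because the cluster centers are fixed in advance, this is a single assignment pass rather than an iterative procedure such as k-means.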

Optionally, the distance between the sample to be clustered and each first sample is determined from the label vectors of the sample to be clustered and of the user to which it belongs by the following formula:

Wherein D(X, Y) denotes the distance between sample X to be clustered and first sample Y;

the similarity term in the formula denotes the similarity between user U_x, to which the sample to be clustered belongs, and user U_y, to which the first sample belongs;

K denotes the total number of labels associated with sample X to be clustered and first sample Y after deduplication;

x_i denotes the weight of the label corresponding to the i-th dimension in the label vector of sample X to be clustered;

y_i denotes the weight of the label corresponding to the i-th dimension in the label vector of first sample Y.

Optionally, determining, from each sample cluster, multiple second samples that can represent the sample cluster comprises any one of the following:

in the corresponding sample cluster, determining the first sample and the top T samples to be clustered, ranked by distance to the first sample in descending order, as the second samples, where T is a positive integer;

in the corresponding sample cluster, determining the first sample and the samples whose distance to the first sample is less than a preset distance threshold as the second samples.
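The threshold-based option above can be sketched in Python as follows. The function name `select_second_samples` and the caller-supplied `distance` function are assumptions made for this illustration; only the distance-threshold variant is shown.

```python
def select_second_samples(first_sample, cluster_samples, distance, threshold):
    """Return the first sample plus every cluster member closer than `threshold`.

    `distance(sample, first_sample)` is an assumed, caller-supplied function.
    """
    close_samples = [
        sample for sample in cluster_samples
        if distance(sample, first_sample) < threshold
    ]
    # The first sample (the cluster center) is always kept as a second sample.
    return [first_sample] + close_samples


# Stand-in example with numeric samples and absolute difference as distance.
print(select_second_samples(0, [1, 2, 5], lambda s, c: abs(s - c), threshold=3))
```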

According to a second aspect of the present disclosure, a user portrait apparatus is provided, the apparatus comprising:

a first determining module, configured to determine at least one similar user and at least one non-similar user corresponding to a target user;

a second determining module, configured to determine a matching degree between a target sample and the target user according to a first matching degree between the target sample and the at least one similar user and a second matching degree between the target sample and the at least one non-similar user;

a labeling module, configured to label the target user according to the labels associated with the target sample when the target matching degree exceeds a preset first matching degree threshold.

Optionally, the second determining module is configured to determine the first matching degree between the target sample and the at least one similar user by the following formula:

Wherein the result of the formula denotes the first matching degree;

M denotes the number of similar users;

P denotes the label vector of the target sample;

X_i denotes the label vector of the i-th similar user.

Optionally, the second determining module is configured to determine the second matching degree between the target sample and the at least one non-similar user by the following formula:

Wherein the result of the formula denotes the second matching degree;

N denotes the number of non-similar users;

P denotes the label vector of the target sample;

Y_i denotes the label vector of the i-th non-similar user.

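Optionally, the second determining module is configured to determine the difference between the weighted value of the first matching degree and the weighted value of the second matching degree as the matching degree between the target sample and the target user.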

Optionally, the labeling module comprises:

a first determining submodule, configured to determine, among the labels corresponding to the target sample, at least one label that the target user does not currently have as a candidate label;

a second determining submodule, configured to determine the match parameter between each candidate label and the target user;

a labeling submodule, configured to determine, from the candidate labels and according to the match parameter between each candidate label and the target user, target labels for labeling the target user, and to apply the target labels to the target user.

Optionally, the second determining submodule comprises:

a third determining submodule, configured to determine, for each candidate label, the share of the candidate label among the labels of each similar user, and to determine the share corresponding to each similar user as the first weight of the candidate label for that similar user;

a fourth determining submodule, configured to determine the similarity between the label vector formed by all candidate labels and the label vector of each similar user, and to determine the average of these similarities as the second weight corresponding to each candidate label;

a fifth determining submodule, configured to determine the match parameter between the candidate label and the target user from the first weight and the second weight of the candidate label by the following formula:

M denotes the number of similar users;

w_whole denotes the second weight of candidate label w.

Optionally, the labeling submodule comprises any one of the following:

a sixth determining submodule, configured to determine the candidate labels whose match parameter with the target user is less than a preset matching threshold as the target labels;

a seventh determining submodule, configured to determine the top S candidate labels, ranked by match parameter with the target user in ascending order, as the target labels, where S is a positive integer.

Optionally, the apparatus further comprises a target sample determining module.

The target sample determining module comprises:

an eighth determining submodule, configured to determine at least one sample in which the at least one similar user is interested as a first sample;

a clustering submodule, configured to cluster the samples to be clustered with each first sample as a cluster center to obtain the same number of sample clusters as first samples, wherein the samples to be clustered are the samples in the sample set other than the samples associated with the similar users and the target user;

a ninth determining submodule, configured to determine, from each sample cluster, multiple second samples that can represent the sample cluster, and to determine each second sample as a target sample.

Optionally, the clustering submodule is configured to determine, for each sample to be clustered, the distance between the sample to be clustered and each first sample according to the label vectors of the sample to be clustered and of the user to which it belongs, and to assign the sample to be clustered to the sample cluster corresponding to the first sample at the shortest distance.

Optionally, the clustering submodule is configured to determine the distance between the sample to be clustered and each first sample from the label vectors of the sample to be clustered and of the user to which it belongs by the following formula:

Optionally, the ninth determining submodule comprises any one of the following:

a tenth determining submodule, configured to determine, in the corresponding sample cluster, the first sample and the top T samples to be clustered, ranked by distance to the first sample in descending order, as the second samples, where T is a positive integer;

an eleventh determining submodule, configured to determine, in the corresponding sample cluster, the first sample and the samples whose distance to the first sample is less than a preset distance threshold as the second samples.

According to a third aspect of the present disclosure, a computer-readable storage medium is provided, on which a computer program is stored, wherein the program, when executed by a processor, implements the steps of any of the methods of the first aspect.

According to a fourth aspect of the present disclosure, an electronic device is provided, comprising:

a memory on which a computer program is stored;

a processor, configured to execute the computer program in the memory so as to implement the steps of any of the methods of the first aspect.

In the above technical solution, the samples in which the similar users of the target user are interested are determined, and the other samples are clustered based on these samples of interest to obtain sample clusters centered on each sample of interest. Target samples are then determined from each sample cluster, and the target user is labeled according to the labels associated with the target samples. This technical solution on the one hand ensures the correlation between the target samples and the target user and, on the other hand, effectively widens the quantity and range of target samples, thereby improving the accuracy and richness of the user portrait and the user experience.

Other features and advantages of the present disclosure are described in detail in the detailed description that follows.

Brief description of the drawings

The accompanying drawings are provided for a further understanding of the present disclosure and constitute a part of the specification. Together with the following detailed description they serve to explain the present disclosure, but they do not limit it. In the drawings:

Fig. 1 is a flowchart of a user portrait method provided according to an embodiment of the present disclosure;

Fig. 2 is a flowchart of a user portrait method provided according to another embodiment of the present disclosure;

Fig. 3 is a flowchart of a user portrait method provided according to another embodiment of the present disclosure;

Fig. 4 is a flowchart of a user portrait method provided according to another embodiment of the present disclosure;

Fig. 5 is a flowchart of a user portrait method provided according to another embodiment of the present disclosure;

Fig. 6 is a flowchart of a user portrait method provided according to another embodiment of the present disclosure;

Fig. 7 is a flowchart of a user portrait method provided according to another embodiment of the present disclosure;

Fig. 8 is a block diagram of a user portrait apparatus provided according to an embodiment of the present disclosure;

Fig. 9 is a block diagram of an electronic device according to an exemplary embodiment;

Fig. 10 is a block diagram of an electronic device according to an exemplary embodiment.

Detailed description of the embodiments

Specific embodiments of the present disclosure are described in detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described here only serve to describe and explain the present disclosure and do not limit it.

To address the problem described in the background art, the present disclosure provides a user portrait method which, as shown in Fig. 1, comprises:

In S11, at least one similar user and at least one non-similar user corresponding to the target user are determined.

The similarity between users can be determined from the label vector of each user, and the label vector of a user can in turn be determined from the labels associated with that user. For example, when calculating the similarity between user X and user Y, the labels obtained by deduplicating the labels associated with user X and user Y are taken as the vector dimensions. For any dimension, if the labels associated with a user include the label corresponding to that dimension, the value of that dimension in the user's label vector is 1; otherwise the value is 0. The user's label vector is determined in this way. For example, the labels associated with user U1 are {a, d, f, g, h} and the labels associated with user U2 are {a, b, c, f}. When user U1 and user U2 are vectorized, deduplicating their labels gives the label vector dimensions {a, b, c, d, f, g, h}. Taking user U1 as an example, the dimensions corresponding to labels a, d, f, g and h are 1 and the dimensions corresponding to labels b and c are 0, so the label vector of user U1 is {1, 0, 0, 1, 1, 1, 1}; likewise, the label vector of user U2 is {1, 1, 1, 0, 1, 0, 0}.
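The vectorization in the U1/U2 example can be reproduced with a short Python sketch. The function name `build_label_vectors` is hypothetical, and sorting the deduplicated labels is only an assumption made here to obtain a stable dimension order.

```python
def build_label_vectors(labels_a, labels_b):
    """Return (dimensions, vector_a, vector_b) for two users' label sets."""
    # Deduplicated union of both label sets; sorted only for a stable order.
    dimensions = sorted(set(labels_a) | set(labels_b))
    # A dimension is 1 if the user has the corresponding label, otherwise 0.
    vector_a = [1 if label in labels_a else 0 for label in dimensions]
    vector_b = [1 if label in labels_b else 0 for label in dimensions]
    return dimensions, vector_a, vector_b


u1_labels = {"a", "d", "f", "g", "h"}
u2_labels = {"a", "b", "c", "f"}
dimensions, u1_vector, u2_vector = build_label_vectors(u1_labels, u2_labels)
print(dimensions)  # ['a', 'b', 'c', 'd', 'f', 'g', 'h']
print(u1_vector)   # [1, 0, 0, 1, 1, 1, 1]
print(u2_vector)   # [1, 1, 1, 0, 1, 0, 0]
```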

After the label vectors of the users are determined, the similarity between users can be determined by calculating the similarity between their label vectors. When the similarity between a user and the target user indicates that the user is similar to the target user, that user is determined to be a similar user of the target user. The similarity can be characterized by the distance between vectors: for example, two users are determined to be similar when the distance between their label vectors is less than a first threshold. It can also be characterized by the cosine of the angle between vectors: for example, two users are determined to be similar when the cosine of the angle between their label vectors is greater than a second threshold. The first threshold and the second threshold can be set according to the actual usage scenario, and the present disclosure does not limit them.

Similarly, the non-similar users of the target user can be determined in the same way. After the similarity between each other user and the target user is determined, a user is determined to be a non-similar user of the target user when the similarity indicates that the user is dissimilar to the target user. For example, two users are determined to be dissimilar when the distance between their label vectors is greater than a third threshold; as another example, two users are determined to be dissimilar when the cosine of the angle between their label vectors is less than a fourth threshold. The third threshold and the fourth threshold can be set according to the actual usage situation, and the present disclosure does not limit them.
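The cosine-based variant of S11 can be sketched in Python as follows. The function names and the threshold values 0.7 and 0.3 are assumptions chosen for illustration; the disclosure leaves the thresholds to the usage scenario. All label vectors are assumed to share the same dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def split_users(target_vector, other_vectors, similar_threshold=0.7, dissimilar_threshold=0.3):
    """Split other users into similar and non-similar users of the target user."""
    similar, non_similar = [], []
    for user_id, vector in other_vectors.items():
        score = cosine_similarity(target_vector, vector)
        if score > similar_threshold:        # plays the role of the second threshold
            similar.append(user_id)
        elif score < dissimilar_threshold:   # plays the role of the fourth threshold
            non_similar.append(user_id)
    return similar, non_similar


target = [1, 0, 0, 1, 1, 1, 1]
others = {
    "u2": [1, 1, 1, 0, 1, 0, 0],
    "u3": [1, 0, 0, 1, 1, 1, 0],
    "u4": [0, 1, 1, 0, 0, 0, 0],
}
print(split_users(target, others))
```

With two separate thresholds, a user whose score falls between them belongs to neither group.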

In S12, the matching degree between the target sample and the target user is determined according to the first matching degree between the target sample and the at least one similar user and the second matching degree between the target sample and the at least one non-similar user.

The specific content of a sample can be determined according to the actual usage scenario. For example, when a shopping website profiles its users, a sample can be an item; when a library website profiles its users, a sample can be a text. In this embodiment, the matching degree between the target sample and the target user is determined by considering not only the similar users of the target user but also the non-similar users of the target user, so that the determined matching degree between the target sample and the target user is more comprehensive and accurate.

In S13, when the matching degree between the target sample and the target user exceeds a preset matching degree threshold, the target user is labeled according to the labels associated with the target sample.

For example, when the matching degree between the target sample and the target user exceeds the preset matching degree threshold, the target sample matches the target user, and the labels of the target sample can be applied as labels of the user.

Optionally, when the matching degree between the target sample and the target user is less than the matching degree threshold, the target sample can simply be ignored, that is, the labels of the target sample are not applied to the target user.
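The threshold decision of S13 can be sketched in Python as follows. The function name `label_target_user` and the representation of labels as sets are assumptions made for this illustration.

```python
def label_target_user(user_labels, sample_labels, matching_degree, threshold):
    """Return the user's labels after S13.

    The sample's labels are added only when the matching degree exceeds
    the threshold; otherwise the sample is ignored.
    """
    if matching_degree > threshold:
        return set(user_labels) | set(sample_labels)
    return set(user_labels)


print(label_target_user({"a", "b"}, {"b", "c"}, matching_degree=0.5, threshold=0.3))
print(label_target_user({"a", "b"}, {"b", "c"}, matching_degree=0.1, threshold=0.3))
```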

In the prior art, when labeling a user, all samples associated with the user's account are usually treated as samples related to the user. In practice, however, it is fairly common for multiple users to operate the same account. For example, user B may buy a book through user A's account; the information about this book actually corresponds to user B, and labeling user A directly with the labels of this book may make user A's portrait chaotic. As another example, for an item that user A bought and then returned, labeling user A directly with the labels of that item may also make user A's portrait chaotic.

Therefore, in the above technical solution, the matching degree between the target sample and the target user is determined based on the similar users and the non-similar users of the target user, and the target user is labeled according to the labels associated with the target sample only when that matching degree exceeds the preset matching degree threshold. Because the target user is labeled according to the target sample only when the target sample matches the target user, the solution effectively avoids labeling the user with the labels of samples unrelated to the user, improves the accuracy of the user portrait, and improves the user experience.

Optionally, the first matching degree between the target sample and the at least one similar user is determined by a formula whose result denotes the first matching degree;

M denotes the number of similar users;

P denotes the label vector of the target sample;

X_i denotes the label vector of the i-th similar user, where the way the label vector is determined has been described in detail above and is not repeated here.

Optionally, the second matching degree between the target sample and the at least one non-similar user is determined by a formula whose result denotes the second matching degree;

N denotes the number of non-similar users;

P denotes the label vector of the target sample;

Y_i denotes the label vector of the i-th non-similar user.

The above formulas give the first matching degree between the target sample and the similar users and the second matching degree between the target sample and the non-similar users. The higher the first matching degree, the higher the matching degree between the target sample and the target user. The lower the second matching degree, the lower the matching degree between the target sample and the non-similar users, and hence the higher the matching degree between the target sample and the target user. The matching degree between the target sample and the target user can thus be measured accurately, providing accurate data support for determining target samples that match the target user.

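Optionally, determining the matching degree between the target sample and the target user according to the first matching degree between the target sample and the at least one similar user and the second matching degree between the target sample and the at least one non-similar user comprises determining the difference between the weighted value of the first matching degree and the weighted value of the second matching degree as the matching degree between the target sample and the target user.

For example, the matching degree between the target sample and the target user can be determined according to the following formula:

Wherein the result of the formula denotes the matching degree between the target sample and the target user;

α denotes the weighting weight; for example, α can be 0.6.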

For example, when multiple users operate the same account, the matching degree between the target sample and the target user can also be assessed from the side of the target user's non-similar users. The present disclosure relies on the idea that a sample with a higher matching degree with the non-similar users of the target user has a lower matching degree with the target user. Accordingly, when the matching degree between the target sample and the target user is determined, both the matching degree between the target sample and the similar users and the matching degree between the target sample and the non-similar users are taken into account. The matching degree between the target sample and the target user can therefore be determined accurately, which provides accurate data support for the accuracy of the user portrait, meets the user's needs, and further improves the user experience.
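The weighted-difference idea of S12 can be sketched in Python under stated assumptions. The formulas themselves are not reproduced in this text, so the sketch assumes that each matching degree is the mean cosine similarity between the sample's label vector P and the users' label vectors, and that the two weights are α and 1 − α. Both choices, and all function names, are illustrative guesses rather than the disclosed formulas.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_similarity(sample_vector, user_vectors):
    """ASSUMPTION: a matching degree is the mean cosine similarity to the users."""
    if not user_vectors:
        return 0.0
    total = sum(cosine_similarity(sample_vector, x) for x in user_vectors)
    return total / len(user_vectors)


def matching_degree(sample_vector, similar_vectors, non_similar_vectors, alpha=0.6):
    """Weighted first matching degree minus weighted second matching degree.

    ASSUMPTION: the weights are alpha and (1 - alpha); alpha = 0.6 is the
    example value given in the text.
    """
    first = mean_similarity(sample_vector, similar_vectors)       # similar users
    second = mean_similarity(sample_vector, non_similar_vectors)  # non-similar users
    return alpha * first - (1 - alpha) * second


print(matching_degree([1, 1, 0], [[1, 1, 0]], [[0, 0, 1]]))
```

Under these assumptions, a sample that resembles the non-similar users is penalized, which mirrors the idea described above.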

In addition, by analyzing the user profiling process, the inventors found that when labels are added to a user according to the labels of a sample, all labels of the sample are usually added to the user directly, so that labels that do not fit the user's portrait are also applied, making the user portrait inaccurate. To address this problem, the present disclosure also provides a user portrait method which, as shown in Fig. 2, comprises:

In S21, among the labels corresponding to the target sample, at least one label that the target user does not currently have is determined as a candidate label. The target sample can be any sample used to label the target user.

When the user is labeled according to the labels corresponding to the target sample, labels that the user already has do not need to be applied again, so the labels that both the target sample and the target user have can be ignored; only the labels that the target sample has but the target user does not have need to be determined.
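S21 amounts to a set difference, as the following minimal Python sketch shows. The function name `candidate_labels` and the use of sets are assumptions made for illustration.

```python
def candidate_labels(sample_labels, user_labels):
    """Labels the target sample has but the target user does not currently have."""
    return set(sample_labels) - set(user_labels)


print(candidate_labels({"a", "b", "c"}, {"b", "d"}))
```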

In S22, the match parameter between each candidate label and the target user is determined. The match parameter between a candidate label and the target user characterizes whether the candidate label matches the target user, and thus determines whether the candidate label is applied to the target user.

In S23, target labels for labeling the target user are determined from the candidate labels according to the match parameter between each candidate label and the target user, and the target labels are applied to the target user.

For example, according to the match parameter between each candidate label and the target user, the candidate labels that match the target user are determined as target labels, so that the target user is labeled with those labels of the target sample that match the target user.

In the above technical solution, when the target user is labeled according to the target sample, the candidate labels are determined first, which avoids applying the same label to the user repeatedly and improves labeling efficiency. The match parameter between each candidate label and the target user is then determined in order to decide which candidate labels are applied to the target user. In this way, only the candidate labels that match the target user are applied to the user, which ensures the accuracy of the applied labels and effectively avoids the prior-art problem that adding all labels of the target sample to the user makes the user portrait chaotic or even wrong. At the same time, the user's labels are effectively enriched, making the user portrait richer and more accurate and improving the user experience.

Optionally, an exemplary implementation of determining the match parameter between each candidate label and the target user in S22 is as follows:

For each candidate label, the share of the candidate label among the labels of each of the at least one similar user of the target user is determined, and the share corresponding to each similar user is determined as the first weight of the candidate label for that similar user.

The way the similar users of the target user are determined has been described in detail above and is not repeated here. After the similar users of the target user are determined, the share of the candidate label among the labels of each similar user is determined. For example, if similar user A does not have candidate label a, the first weight of candidate label a for similar user A is 0; if similar user B has 10 labels including candidate label a, the first weight of candidate label a for similar user B is 0.1 (that is, 1/10).
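The first-weight computation can be sketched in Python as follows. It assumes that each similar user's labels form a set without duplicates, so the share of a label the user has is 1 divided by the number of that user's labels, as in the example above. The function name `first_weights` is hypothetical.

```python
def first_weights(candidate, similar_user_labels):
    """First weight of `candidate` for each similar user, in input order.

    ASSUMPTION: each user's labels are a set, so the share of a label the
    user has is 1 / (number of labels of that user).
    """
    weights = []
    for labels in similar_user_labels:
        if candidate in labels:
            weights.append(1 / len(labels))
        else:
            weights.append(0.0)
    return weights


user_a = {"b", "c"}                                            # no label "a"
user_b = {"a", "b", "c", "d", "e", "f", "g", "h", "i", "j"}    # 10 labels incl. "a"
print(first_weights("a", [user_a, user_b]))  # [0.0, 0.1]
```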

The label vector formed from all candidate labels is generated in the same way as the label vectors described above, which is not repeated here. In this embodiment, the similarity between the label vector formed by all candidate labels and the label vector of each similar user can be calculated, and the average of these similarities is then determined as the second weight of the candidate label.
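The second weight can be sketched in Python as follows. The sketch assumes cosine similarity as the similarity measure. For 0/1 label vectors built over the deduplicated union of two label sets, the cosine equals the size of the intersection divided by the square root of the product of the set sizes, so the sketch works on label sets directly. The function names are hypothetical.

```python
import math


def binary_cosine(labels_a, labels_b):
    """Cosine similarity of the 0/1 label vectors of two label sets."""
    if not labels_a or not labels_b:
        return 0.0
    return len(labels_a & labels_b) / math.sqrt(len(labels_a) * len(labels_b))


def second_weight(candidates, similar_user_labels):
    """Mean similarity between the all-candidates vector and each similar user.

    ASSUMPTION: cosine similarity is used as the similarity measure.
    """
    if not similar_user_labels:
        return 0.0
    similarities = [
        binary_cosine(set(candidates), set(labels))
        for labels in similar_user_labels
    ]
    return sum(similarities) / len(similarities)


print(second_weight({"a", "b"}, [{"a", "b"}, {"c", "d"}]))  # 0.5
```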

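The match parameter between the candidate label and the target user is then determined from the first weight and the second weight of the candidate label by a formula in which Fit denotes the match parameter between the candidate label and the target user;

M denotes the number of similar users;

w_whole denotes the second weight of candidate label w;

w_i denotes the first weight of candidate label w for the i-th similar user.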

The smaller the match parameter between a candidate label and the target user, the better the candidate label matches the target user. The first weight of a candidate label for each similar user characterizes the similarity between the candidate label itself and that similar user of the target user, while the second weight characterizes the similarity between all candidate labels and the similar users. When the match parameter between a candidate label and the target user is determined, it is therefore determined against the background of all candidate labels, which effectively avoids the influence on the match parameter of labels that both the target sample and the target user have, ensures the accuracy of the determined match parameter, and provides accurate data support for the subsequent determination of target labels.

Optionally, determining, from the candidate labels and according to the matching parameter between each candidate label and the target user, the target labels to be marked for the target user includes any one of the following:

determining the candidate labels whose matching parameter with the target user is less than a preset matching threshold as target labels;

determining the top S candidate labels, ranked in ascending order of matching parameter with the target user, as target labels, where S is a positive integer.

The matching threshold can be set according to actual usage, and the disclosure does not limit it. In one embodiment, the smaller the matching parameter between a candidate label and the target user, the higher the degree of matching between the candidate label and the target user. Therefore, the candidate labels whose matching parameter with the target user is less than the preset matching threshold can be determined directly as target labels, which improves the efficiency of determining labels and ensures the processing efficiency of the user portrait. In another embodiment, target labels can be determined in ascending order of matching parameter with the target user, which bounds the number of target labels, reduces the amount of data processed during user portrait construction, and improves labeling efficiency.
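The two selection options can be sketched as follows, assuming the matching parameters have already been computed and are held in a dictionary from candidate label to matching parameter. The function names and the dictionary representation are illustrative assumptions.

```python
def select_by_threshold(fit, matching_threshold):
    """Option 1: keep candidate labels whose matching parameter is below the threshold."""
    return [label for label, value in fit.items() if value < matching_threshold]


def select_top_s(fit, s):
    """Option 2: keep the first S candidate labels in ascending order of matching parameter."""
    return sorted(fit, key=fit.get)[:s]


# Hypothetical matching parameters; smaller means a better match.
fit = {"sports": 0.2, "finance": 0.9, "travel": 0.5}

print(select_by_threshold(fit, 0.6))  # ['sports', 'travel']
print(select_top_s(fit, 2))           # ['sports', 'travel']
```

The threshold variant leaves the number of target labels open, whereas the top-S variant fixes an upper bound on it, which matches the trade-off described in the text.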

In another embodiment, to further ensure the accuracy of the user portrait, the disclosure also provides the following embodiment. This embodiment takes into account both problems described above: the loss of portrait accuracy caused by building the portrait directly from the samples associated with the user, and that caused by directly adding all labels of a sample to the user. Specifically, as shown in Fig. 3, the method includes:

In S31, at least one similar user and at least one non-similar user corresponding to the target user are determined.

In S32, the matching degree between the target sample and the target user is determined according to a first matching degree between the target sample and the at least one similar user and a second matching degree between the target sample and the at least one non-similar user.

In S33, when the matching degree between the target sample and the target user exceeds a preset matching degree threshold, at least one label that corresponds to the target sample and that the target user does not currently have is determined as a candidate label.

In S34, the matching parameter between each candidate label and the target user is determined.

In S35, according to the matching parameter between each candidate label and the target user, the target labels to be marked for the target user are determined from the candidate labels, and the target labels are marked for the target user.

The specific implementations of S31 and S32 are the same as those of S11 and S12, and the specific implementations of S33, S34 and S35 are similar to those of S21, S22 and S23; they are not repeated here. It should be noted that in S33 it is first determined whether the matching degree between the target sample and the target user exceeds the preset matching degree threshold. Only when it does is at least one label that corresponds to the target sample and that the target user does not currently have determined as a candidate label. The specific implementation of determining such a label as a candidate label has been described in detail for S21 and is not repeated here.
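The gating performed in S33 can be sketched as below. This is a minimal illustration in which every label of the target sample that the target user lacks becomes a candidate; the disclosure only requires at least one such label, so taking all of them is an assumption, as are the function and parameter names.

```python
def candidate_labels(sample_labels, user_labels, matching_degree, degree_threshold):
    """Return candidate labels only if the sample matches the user well enough.

    matching_degree is assumed to have been computed already in S32.
    """
    if matching_degree <= degree_threshold:
        return []  # sample does not match the target user: contribute nothing
    owned = set(user_labels)
    # Labels of the target sample that the target user does not currently have.
    return [label for label in sample_labels if label not in owned]


print(candidate_labels(["a", "b", "c"], ["a"], matching_degree=0.8, degree_threshold=0.5))
# ['b', 'c']
print(candidate_labels(["a", "b", "c"], ["a"], matching_degree=0.3, degree_threshold=0.5))
# []
```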

Therefore, in the above technical solution, it is first determined whether the target sample matches the target user. Only when the matching degree between the target sample and the target user exceeds the preset matching degree threshold is the target user profiled according to the labels associated with the target sample. This ensures the accuracy of the basis of the user portrait and effectively avoids marking the user with labels of samples that do not fit the user. Then, when the target user is labeled according to the labels of the target sample, the matching parameter between each label and the user is determined so as to assess how well the label matches the user. This further ensures the accuracy of the labels marked for the user, which both enriches the user portrait and ensures its accuracy, further improving the user experience.

In addition, by analyzing the user portrait process, the inventors also found that when few articles are related to a user, labeling the user only according to those articles yields a thin portrait, and it is difficult to profile the user accurately. To solve this problem, the disclosure also provides a user portrait method. As shown in Fig. 4, the method includes:

In S41, at least one sample of interest to at least one similar user of the target user is determined as a first sample.

The manner of determining the similar users of the target user has been described in detail above and is not repeated here. Once the similar users are determined, the first samples can be determined according to them. In this embodiment, a sample of interest to a similar user can be a sample that the similar user has browsed many times in total within a recent period. For example, if the total number of first samples is preset, say to 10, the samples associated with the similar users can be sorted by total number of browses in descending order, and the top 10 samples are determined as first samples. As another example, a browsing threshold can be set, and the samples whose total number of browses by the similar users exceeds the browsing threshold are determined as first samples.
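Both ways of picking first samples can be sketched as follows, assuming the total number of browses per sample over the recent period is available as a dictionary. The names `browse_counts`, `first_samples_top_n` and `first_samples_over_threshold` are hypothetical.

```python
def first_samples_top_n(browse_counts, n=10):
    """Samples sorted by total number of browses, descending; keep the top n."""
    return sorted(browse_counts, key=browse_counts.get, reverse=True)[:n]


def first_samples_over_threshold(browse_counts, browse_threshold):
    """Samples whose total number of browses exceeds the browsing threshold."""
    return [sample for sample, count in browse_counts.items() if count > browse_threshold]


# Hypothetical totals for samples browsed by the similar users.
browse_counts = {"item1": 12, "item2": 3, "item3": 7, "item4": 1}

print(first_samples_top_n(browse_counts, n=2))         # ['item1', 'item3']
print(first_samples_over_threshold(browse_counts, 5))  # ['item1', 'item3']
```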

In S42, the samples to be clustered are clustered with each first sample as a cluster center, yielding as many sample clusters as there are first samples. The samples to be clustered are the samples in the sample set other than the samples associated with the similar users and the target user. The sample set can be determined according to the actual usage scenario. For example, when users are profiled on a shopping website, the sample set can be the set formed by all articles on that shopping website.

In S43, multiple second samples that can represent the sample cluster are determined from each sample cluster, and each second sample is determined as a target sample.

For example, after the first samples are determined, the samples in the sample set can be clustered according to the first samples, and the second samples determined in each sample cluster are taken as target samples. As described above, the first samples are determined from the samples of interest to the similar users of the target user. The samples associated with the similar users other than these samples of interest therefore have a low correlation with the target user, and the samples of interest have already been used as cluster centers. Hence, when the samples in the sample set are clustered, the samples associated with the similar users and the target user can be ignored directly, which improves clustering efficiency and reduces the amount of computation.
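The clustering around fixed centers can be sketched as a single nearest-center assignment pass. This is an assumption: the disclosure does not state here whether the assignment is iterated, and its own distance measure is not reproduced, so `distance` is a caller-supplied placeholder and all names are illustrative.

```python
def cluster_around_first_samples(first_samples, sample_set, associated_samples, distance):
    """Assign each sample to be clustered to its nearest first sample.

    Samples associated with the similar users or the target user are skipped,
    so the number of clusters equals the number of first samples.
    """
    associated = set(associated_samples)
    to_cluster = [s for s in sample_set if s not in associated]
    clusters = {center: [] for center in first_samples}
    for sample in to_cluster:
        nearest = min(first_samples, key=lambda center: distance(sample, center))
        clusters[nearest].append(sample)
    return clusters


# Toy example: samples are points on a line, distance is the absolute difference.
sample_set = [0, 1, 2, 3, 9, 10]
associated = {0, 3, 10}   # samples tied to similar users or the target user
first_samples = [0, 10]   # samples of interest, used as cluster centers

print(cluster_around_first_samples(first_samples, sample_set, associated,
                                   lambda a, b: abs(a - b)))
# {0: [1, 2], 10: [9]}
```

Filtering out the associated samples before the assignment loop is what keeps the number of distance evaluations down, in line with the efficiency argument in the text.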

In S44, labels are marked for the target user according to the labels associated with the target sample.

In one embodiment, all labels associated with the target sample can be marked for the target user, so that the portrait of the target user is richer.

In another embodiment, part of the labels can be selected for marking the target user according to the marking time of the labels associated with the target sample. For example, the labels marked on the target sample within the last three months can be marked for the target user.

A sample can be associated with multiple users. When the user to whom a sample belongs is determined, the user who best matches the sample among the users associated with the sample can be determined as the owning user of the sample. The matching degree between the sample and a user can be determined according to the similarity between the label vector of the sample and the label vector of each user associated with the sample. The manner of determining the label vector of the sample and the label vector of the user is similar to the manner of determining label vectors described above and is not repeated here. When the matching degree is characterized by distance, the user whose label vector has the smallest distance to the label vector of the sample is determined as the owning user of the sample; when the matching degree is characterized by cosine value, the user whose label vector forms the angle with the largest cosine value with the label vector of the sample is determined as the owning user of the sample.
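The cosine variant of choosing the owning user can be sketched as follows, assuming the label vectors have already been built with equal dimensions. `cosine_value` and `owning_user` are hypothetical helper names.

```python
import math


def cosine_value(u, v):
    """Cosine of the angle between two label vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def owning_user(sample_vector, user_vectors):
    """Among the users associated with a sample, pick the one with the largest cosine value."""
    return max(user_vectors, key=lambda user: cosine_value(sample_vector, user_vectors[user]))


sample_vector = [1.0, 1.0, 0.0]
user_vectors = {
    "u1": [1.0, 0.0, 0.0],
    "u2": [1.0, 1.0, 0.0],
    "u3": [0.0, 0.0, 1.0],
}
print(owning_user(sample_vector, user_vectors))  # u2
```

For the distance variant, `max` over cosine values would be replaced by `min` over a distance function.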

The user-similarity term denotes the similarity between the user U_x to whom the sample to be clustered belongs and the user U_y to whom the first sample belongs. If the similarity is characterized by distance, the distance between the user U_x to whom the sample to be clustered belongs and the user U_y to whom the first sample belongs can be determined directly as the similarity; if the similarity is characterized by cosine value, the reciprocal of the cosine value can be determined as the similarity.

When the similarity between the user to whom the sample to be clustered belongs and the user to whom the first sample belongs is calculated, the dimension of their label vectors can be determined according to the labels associated with both users, so as to ensure that the two label vectors have the same dimension. The manner of determining label vectors has been described in detail above and is not repeated here.

Optionally, determining, from each sample cluster, multiple second samples that can represent the sample cluster includes any one of the following:

1) In the corresponding sample cluster, the top T samples to be clustered, ranked by distance to the first sample from large to small, together with the first sample, are determined as the second samples, where T is a positive integer.

The samples that can represent a sample cluster are the samples near the virtual center of that sample cluster. For example, for any sample cluster, the first sample is the center of its sample cluster and can be determined directly as a second sample. The samples in the sample cluster other than the first sample can be sorted by their distance to the first sample from large to small, and the top-ranked samples to be clustered are selected so as to determine the samples near the virtual center of the sample cluster, thereby determining each second sample.

2) In the corresponding sample cluster, the samples whose distance to the first sample is less than a preset distance threshold, together with the first sample, are determined as the second samples.

In another embodiment, each second sample can be determined by setting a distance threshold, where the distance threshold can be set according to actual usage.
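The distance-threshold option 2) can be sketched as follows. The distance function is again a caller-supplied placeholder, since the disclosure's own distance measure is not reproduced here, and the function and parameter names are assumptions.

```python
def second_samples_by_threshold(first_sample, cluster_members, distance, distance_threshold):
    """The first sample plus every cluster member closer to it than the threshold."""
    near = [sample for sample in cluster_members
            if distance(sample, first_sample) < distance_threshold]
    return [first_sample] + near


# Toy cluster on a line: center 0, members 1, 2 and 7, threshold 3.
print(second_samples_by_threshold(0, [1, 2, 7], lambda a, b: abs(a - b), 3))
# [0, 1, 2]
```

The first sample is always kept because it is the center of its own sample cluster.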

Therefore, through the above technical solution, multiple second samples that can represent each sample cluster can be determined accurately from that sample cluster. This provides an accurate data basis for the user portrait and ensures the accuracy of the user portrait built from the target samples.

To further ensure the accuracy of the user portrait, the disclosure also provides the following embodiment. This embodiment takes into account both problems described above: few articles are related to the user, and directly adding all labels of a sample to the user degrades portrait accuracy. The specific flow is shown in Fig. 5 and is as follows:

In S51, at least one sample of interest to at least one similar user of the target user is determined as a first sample.

In S52, the samples to be clustered are clustered with each first sample as a cluster center, yielding as many sample clusters as there are first samples, where the samples to be clustered are the samples in the sample set other than the samples associated with the similar users and the target user.

In S53, multiple second samples that can represent the sample cluster are determined from each sample cluster, and each second sample is determined as a target sample.

In S54, at least one label that corresponds to the target sample and that the target user does not currently have is determined as a candidate label.

In S55, the matching parameter between each candidate label and the target user is determined. The matching parameter between a candidate label and the target user characterizes whether the candidate label matches the target user, and is used to decide whether to mark the candidate label for the target user.

In S56, according to the matching parameter between each candidate label and the target user, the target labels to be marked for the target user are determined from the candidate labels, and the target labels are marked for the target user.

The specific implementations of S51, S52 and S53 are the same as those of S41, S42 and S43 above, and the specific implementations of S54, S55 and S56 are the same as those of S21, S22 and S23 above; they are not repeated here.

Through the above technical solution, on the one hand the number and range of target samples can be effectively widened, providing a data basis for the user portrait; on the other hand the accuracy of the marked labels can be ensured, which avoids a confused user portrait and improves the user experience.

Optionally, to address both problems described above, namely that few articles are related to the user and that building the portrait directly from the samples associated with the user degrades portrait accuracy, the disclosure provides the following embodiment. Specifically, as shown in Fig. 6, the method includes:

In S61, at least one sample of interest to at least one similar user of the target user is determined as a first sample.

In S62, the samples to be clustered are clustered with each first sample as a cluster center, yielding as many sample clusters as there are first samples, where the samples to be clustered are the samples in the sample set other than the samples associated with the similar users and the target user.

In S63, multiple second samples that can represent the sample cluster are determined from each sample cluster, and each second sample is determined as a target sample.

In S64, at least one similar user and at least one non-similar user corresponding to the target user are determined.

In S65, the matching degree between the target sample and the target user is determined according to a first matching degree between the target sample and the at least one similar user and a second matching degree between the target sample and the at least one non-similar user.

In S66, when the matching degree between the target sample and the target user exceeds a preset matching degree threshold, labels are marked for the target user according to the labels associated with the target sample.

The specific implementations of S61, S62 and S63 are the same as those of S41, S42 and S43, and the specific implementations of S64, S65 and S66 are the same as those of S11, S12 and S13; they are not repeated here.

It should be noted that the order shown in the above flow chart is only an example implementation and does not limit the disclosure. For example, step S64 of determining the similar users and non-similar users of the target user can be executed before S61, or S64 and S61 can be executed simultaneously; the disclosure does not limit this.

In the above technical solution, the target samples can be determined by clustering, which increases the number and range of target samples and provides a sample basis for enriching the user portrait. When the target user is profiled according to the target samples, the matching degree between each target sample and the target user is also taken into account, so that the accuracy of the target samples is ensured while their range is expanded, thereby ensuring the accuracy of the user portrait.

Optionally, to address all the problems described above, namely that few articles are related to the user, that building the portrait directly from the samples associated with the user degrades portrait accuracy, and that directly adding all labels of a sample to the user degrades portrait accuracy, the disclosure provides the following embodiment. Specifically, as shown in Fig. 7, the method includes:

In S71, at least one sample of interest to at least one similar user of the target user is determined as a first sample.

In S72, the samples to be clustered are clustered with each first sample as a cluster center, yielding as many sample clusters as there are first samples, where the samples to be clustered are the samples in the sample set other than the samples associated with the similar users and the target user.

In S73, multiple second samples that can represent the sample cluster are determined from each sample cluster, and each second sample is determined as a target sample.

In S74, at least one similar user and at least one non-similar user corresponding to the target user are determined.

In S75, the matching degree between the target sample and the target user is determined according to a first matching degree between the target sample and the at least one similar user and a second matching degree between the target sample and the at least one non-similar user.

In S76, when the matching degree between the target sample and the target user exceeds a preset matching degree threshold, at least one label that corresponds to the target sample and that the target user does not currently have is determined as a candidate label.

In S77, the matching parameter between each candidate label and the target user is determined.

In S78, according to the matching parameter between each candidate label and the target user, the target labels to be marked for the target user are determined from the candidate labels, and the target labels are marked for the target user.

The specific implementations of S71, S72 and S73 are the same as those of S41, S42 and S43, the specific implementations of S74 and S75 are the same as those of S11 and S12, and the specific implementations of S76, S77 and S78 are similar to those of S21, S22 and S23; they are not repeated here. It should be noted that in S76 it is first determined whether the matching degree between the target sample and the target user exceeds the preset matching degree threshold. Only when it does is at least one label that corresponds to the target sample and that the target user does not currently have determined as a candidate label. The specific implementation of determining such a label as a candidate label has been described in detail for S21 and is not repeated here.

Through the above technical solution, on the one hand the number and range of samples used for labeling the user portrait can be effectively widened, providing data support for enriching the user portrait. At the same time, the matching degree between the samples used for labeling and the user can be ensured, which further ensures the accuracy of the labels marked for the user portrait and makes the portrait more accurate. In addition, when labels are marked for the user based on the labels associated with a sample, the matching degree between each label and the user is considered, which effectively avoids the confused portrait caused by marking the user with labels that do not match the user, and further improves the user experience.

The disclosure also provides a user portrait device. As shown in Fig. 8, the device 10 includes:

a first determining module 100, configured to determine at least one similar user and at least one non-similar user corresponding to a target user;

a second determining module 200, configured to determine the matching degree between a target sample and the target user according to a first matching degree between the target sample and the at least one similar user and a second matching degree between the target sample and the at least one non-similar user;

a labeling module 300, configured to mark labels for the target user according to the labels associated with the target sample when the target matching degree exceeds a preset first matching degree threshold.

Optionally, the second determining module 200 is configured to determine the first matching degree between the target sample and the at least one similar user by the following formula:

where the value given by the formula denotes the first matching degree;

M denotes the number of similar users;

P denotes the label vector of the target sample;

X_i denotes the label vector of the i-th similar user.

Optionally, the second determining module 200 is configured to determine the second matching degree between the target sample and the at least one non-similar user by the following formula:

where the value given by the formula denotes the second matching degree;

N denotes the number of non-similar users;

P denotes the label vector of the target sample;

Y_i denotes the label vector of the i-th non-similar user.

Optionally, the second determining module 200 is configured to:

Optionally, the labeling module 300 includes:

Optionally, the second determining submodule includes:

M denotes the number of similar users;

w_whole denotes the second weight of the candidate label w;

Optionally, the labeling submodule includes any one of the following:

Optionally, the device further includes a target sample determining module,

The target sample determining module includes:

As for the device in the above embodiments, the specific manner in which each module performs its operations has been described in detail in the embodiments of the method and is not elaborated here.

Fig. 9 is a block diagram of an electronic device 700 according to an exemplary embodiment. As shown in Fig. 9, the electronic device 700 may include a processor 701 and a memory 702. The electronic device 700 may further include one or more of a multimedia component 703, an input/output (I/O) interface 704 and a communication component 705.

The processor 701 controls the overall operation of the electronic device 700 so as to complete all or part of the steps of the above user portrait method. The memory 702 stores various types of data to support operation on the electronic device 700. Such data may include, for example, instructions of any application program or method operated on the electronic device 700, as well as application-related data such as contact data, sent and received messages, pictures, audio and video. The memory 702 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk or optical disc. The multimedia component 703 may include a screen and an audio component. The screen can be, for example, a touch screen, and the audio component is used to output and/or input audio signals. For example, the audio component may include a microphone for receiving external audio signals. The received audio signals can be further stored in the memory 702 or sent through the communication component 705. The audio component further includes at least one loudspeaker for outputting audio signals. The I/O interface 704 provides an interface between the processor 701 and other interface modules, which can be a keyboard, a mouse, buttons and the like. The buttons can be virtual buttons or physical buttons. The communication component 705 is used for wired or wireless communication between the electronic device 700 and other devices. The wireless communication can be, for example, Wi-Fi, Bluetooth, near field communication (NFC), 2G, 3G or 4G, or a combination of one or more of them; accordingly, the communication component 705 may include a Wi-Fi module, a Bluetooth module and an NFC module.

In an exemplary embodiment, the electronic device 700 can be implemented by one or more application specific integrated circuits (ASIC), digital signal processors (DSP), digital signal processing devices (DSPD), programmable logic devices (PLD), field programmable gate arrays (FPGA), controllers, microcontrollers, microprocessors or other electronic components, for executing the above user portrait method.

In another exemplary embodiment, a computer readable storage medium including program instructions is also provided; the steps of the above user portrait method are implemented when the program instructions are executed by a processor. For example, the computer readable storage medium can be the above memory 702 including program instructions, and the program instructions can be executed by the processor 701 of the electronic device 700 to complete the above user portrait method.

Fig. 10 is a block diagram of an electronic device 1900 according to an exemplary embodiment. For example, the electronic device 1900 may be provided as a server. Referring to Fig. 10, the electronic device 1900 includes one or more processors 1922 and a memory 1932 for storing computer programs executable by the processor 1922. The computer program stored in the memory 1932 may include one or more modules, each corresponding to a set of instructions. In addition, the processor 1922 can be configured to execute the computer program so as to execute the above user portrait method.

In addition, the electronic device 1900 may further include a power supply component 1926 and a communication component 1950. The power supply component 1926 can be configured to perform power management of the electronic device 1900, and the communication component 1950 can be configured to implement communication of the electronic device 1900, for example wired or wireless communication. The electronic device 1900 may further include an input/output (I/O) interface 1958. The electronic device 1900 can operate based on an operating system stored in the memory 1932, such as Windows Server™, Mac OS X™, Unix™ or Linux™.

In another exemplary embodiment, a computer readable storage medium including program instructions is also provided; the steps of the above user portrait method are implemented when the program instructions are executed by a processor. For example, the computer readable storage medium can be the above memory 1932 including program instructions, and the program instructions can be executed by the processor 1922 of the electronic device 1900 to complete the above user portrait method.

The preferred embodiments of the disclosure have been described in detail above with reference to the accompanying drawings. However, the disclosure is not limited to the specific details of the above embodiments. Within the scope of the technical concept of the disclosure, various simple variations can be made to the technical solution of the disclosure, and these simple variations all fall within the protection scope of the disclosure.

It should further be noted that the specific technical features described in the above specific embodiments can be combined in any suitable manner provided there is no contradiction. To avoid unnecessary repetition, the disclosure does not further describe the various possible combinations.

In addition, the various embodiments of the disclosure can also be combined arbitrarily; as long as such a combination does not depart from the idea of the disclosure, it should likewise be regarded as content disclosed by the disclosure.

Claims

1. A user portrait method, characterized in that the method comprises:

determining at least one similar user and at least one non-similar user corresponding to a target user;

determining the matching degree between a target sample and the target user according to a first matching degree between the target sample and the at least one similar user and a second matching degree between the target sample and the at least one non-similar user;

when the matching degree between the target sample and the target user exceeds a preset matching degree threshold, marking labels for the target user according to the labels associated with the target sample.
2. The method according to claim 1, characterized in that the first matching degree between the target sample and the at least one similar user is determined by the following formula:

where the value given by the formula denotes the first matching degree;

M denotes the number of similar users;

P denotes the label vector of the target sample;

X_i denotes the label vector of the i-th similar user.
3. The method according to claim 1, characterized in that the second matching degree between the target sample and the at least one non-similar user is determined by the following formula:

where the value given by the formula denotes the second matching degree;

N denotes the number of non-similar users;

P denotes the label vector of the target sample;

Y_i denotes the label vector of the i-th non-similar user.
4. The method according to claim 1, characterized in that determining the matching degree between the target sample and the target user according to the first matching degree between the target sample and the at least one similar user and the second matching degree between the target sample and the at least one non-similar user comprises:

determining the difference between the weighted value of the first matching degree and the weighted value of the second matching degree as the matching degree between the target sample and the target user.
5. The method according to claim 1, characterized in that marking labels for the target user according to the labels associated with the target sample comprises:

determining at least one label that corresponds to the target sample and that the target user does not currently have as a candidate label;

determining the matching parameter between each candidate label and the target user;

determining, from the candidate labels and according to the matching parameter between each candidate label and the target user, the target labels to be marked for the target user, and marking the target labels for the target user.
6. The method according to claim 5, wherein determining the match parameter between each candidate label and the target user comprises:

For each candidate label, determining the proportion that the candidate label accounts for among the labels possessed by each similar user, and determining the proportion corresponding to each similar user as the first weight of the candidate label corresponding to that similar user;

Determining the similarity between the label vector formed by all the candidate labels and the label vector of each similar user, and determining the average of the similarities as the second weight corresponding to each candidate label;

Determining, according to the first weight and the second weight of the candidate label, the match parameter between the candidate label and the target user by the following formula:

Wherein Fit indicates the match parameter between the candidate label and the target user;

M indicates the number of the similar users;

w_whole indicates the second weight of the candidate label w;

w_i indicates the first weight of the candidate label w corresponding to the i-th similar user.
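The two weights described in claim 6 can be sketched as below. This is a minimal illustration under stated assumptions: each similar user's labels are given as a list in which a label may repeat, cosine similarity is used as the similarity measure (the claim does not name one), and all function names are hypothetical. How the weights are combined into Fit is given by the formula of the claim and is not reproduced here.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def first_weights(label, similar_user_labels):
    """Proportion of `label` among each similar user's labels: one w_i per user."""
    weights = []
    for labels in similar_user_labels:
        counts = Counter(labels)
        total = sum(counts.values())
        weights.append(counts[label] / total if total else 0.0)
    return weights


def second_weight(candidate_vector, similar_user_vectors):
    """Mean similarity between the candidate-label vector and each
    similar user's label vector: w_whole."""
    sims = [cosine(candidate_vector, x) for x in similar_user_vectors]
    return sum(sims) / len(sims) if sims else 0.0
```

For example, a candidate label that makes up half of one similar user's labels and is absent from another's yields first weights of 0.5 and 0.0 for those two users.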
7. The method according to any one of claims 1 to 6, wherein the target sample is determined as follows:

Determining at least one sample that the at least one similar user is interested in as a first sample;

Clustering the samples to be clustered with each first sample as a class center, to obtain sample clusters equal in number to the first samples, wherein the samples to be clustered are the samples in the sample set other than the samples associated with the similar users and the target user;

Determining, from each sample cluster, a plurality of second samples capable of representing the sample cluster, and determining each second sample as the target sample.
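The sample selection of claim 7 can be sketched as follows, purely as an illustration. The sketch assumes samples are numeric feature vectors, uses Euclidean distance, performs a single nearest-center assignment with the first samples held fixed as centers, and takes the members closest to each center as the representative second samples. None of these choices is stated in the claim, and the function names and the parameter `k` are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cluster_around(first_samples, to_cluster):
    """Assign each sample to its nearest first sample, giving exactly one
    cluster per first sample (some clusters may be empty)."""
    clusters = [[] for _ in first_samples]
    for sample in to_cluster:
        nearest = min(range(len(first_samples)),
                      key=lambda i: euclidean(sample, first_samples[i]))
        clusters[nearest].append(sample)
    return clusters


def representatives(center, cluster, k=2):
    """Pick up to k cluster members closest to the center as second samples."""
    return sorted(cluster, key=lambda s: euclidean(s, center))[:k]
```

Because the samples to be clustered exclude those already associated with the similar users and the target user, the second samples returned this way are new to both groups, which is how the claim widens the range of target samples.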
8. A user portrait apparatus, wherein the apparatus comprises:

A first determining module, configured to determine at least one similar user and at least one non-similar user corresponding to a target user;

A second determining module, configured to determine the matching degree between a target sample and the target user according to a first matching degree between the target sample and the at least one similar user and a second matching degree between the target sample and the at least one non-similar user;

A marking module, configured to label the target user according to the label associated with the target sample when the target matching degree exceeds a preset first matching degree threshold.
9. A computer readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the steps of the method according to any one of claims 1-7.
10. An electronic device, comprising:

A memory having a computer program stored thereon;

A processor, configured to execute the computer program in the memory to implement the steps of the method according to any one of claims 1-7.