CN101266620B

CN101266620B - Method and equipment for providing target information for user

Info

Publication number: CN101266620B
Application number: CN200810103480.9A
Authority: CN
Inventors: 吴定明; 赵东岩
Original assignee: Peking University; Peking University Founder Group Co Ltd; Beijing Founder Apabi Technology Co Ltd
Current assignee: New Founder Holdings Development Co ltd; Peking University; Founder Apabi Technology Ltd
Priority date: 2008-04-07
Filing date: 2008-04-07
Publication date: 2021-04-16
Anticipated expiration: 2028-04-07
Also published as: CN101266620A

Abstract

The invention discloses a method for providing target information for a user, which comprises the following steps: acquiring historical behavior data of information browsed by a user, wherein the historical behavior data comprises the content of the information and browsing time; classifying the content of the information, and determining the probability that the information belongs to one or more categories to which the content belongs according to the classification result; determining interest values of the user in the one or more categories at a set moment according to the probability and the browsing moment; and providing target information of the category corresponding to the interest value which is not less than the set threshold value to the user at the set moment according to the interest value. The invention also discloses a device for providing the target information for the user. The invention can quantify and describe the interest of the user to the information in a multi-granularity way according to the user requirement, dynamically reflect the change process of the user interest and simplify the user operation.

Description

Method and equipment for providing target information for user

Technical Field

The present invention relates to the field of network technologies, and in particular, to a method and a device for providing target information to a user.

Background

The internet has massive information, rich content and various forms. Network users want to obtain information from vast sources of information that meets personal needs. Search engines, meta search engines, and some other search tools can help us extract information from vast networks of information. When a user searches, a small number of search keywords are generally provided, and a huge number of search results are returned by a search engine. If there is a lack of interest analysis for the user, i.e., if an accurate user model is not built, the user will be overwhelmed with the sea of information.

Establishing an accurate user model is therefore essential for obtaining the information a user is looking for. User modeling has become a core research topic in network data mining, with applications such as personalized search, locating target customers for advertisements, information recommendation, business decision-making, and customer relationship management.

At present, two user modeling methods mainly exist, one is static user modeling, and the other is dynamic user modeling.

Static user modeling analyzes personal information provided by a user, such as registration information and questionnaires. Through analysis, the inventor found that static user modeling can only roughly describe the user's interests and has two problems in practice. First, because the user model is static, the interests it describes are valid only for a certain period and cannot reflect how the user's interests change in the future. Second, the personal information submitted by the user is subjective input and cannot objectively reflect the characteristics of the user's interests.

Dynamic user modeling analyzes the activities and behaviors of users on a website. The inventor found that such models describe the user only at a fine granularity: each model targets a specific interest point and gives no general description of the user's interests.

In addition, there are other methods for modeling users that require the user to provide feedback information, which complicates the user's operation and affects the user's normal behavior.

Disclosure of Invention

The embodiment of the invention provides a method for providing target information for a user, which is used for quantifying and describing the interest of the user to the information in a multi-granularity mode according to the requirement of the user, dynamically reflecting the change process of the interest of the user and simplifying the operation of the user, and comprises the following steps:

acquiring historical behavior data of information browsed by a user, wherein the historical behavior data comprises the content of the information and browsing time;

classifying the content of the information, determining the number of categories to which the information belongs, and determining the probability that the information belongs to one or more categories to which the content of the information belongs according to the number of categories to which the information belongs;

determining interest values of the user in the one or more categories at a set moment according to the probability and the browsing moment;

and providing target information of the category corresponding to the interest value which is not less than the set threshold value to the user at the set moment according to the interest value.

Preferably, the historical behavior data further comprises a user identification;

obtaining interest values of different users at set time according to the user identification, the probability and the browsing time; and providing target information to the corresponding user at the set moment according to the interest value and the received user identification.

Preferably, the user identifier is a registration name or an IP address of the user.

Preferably, determining the interest value of the user in a category at a set time according to the probability and the browsing time includes:

determining the interest value of the user in the category at the browsing time according to the following formula:

where k is the forgetting factor, k_α is the memory factor, weight₀ is the probability corresponding to the category at time t, and t is the browsing time;

determining the interest value of the user in the category at the set moment according to the following formula:

where t is the browsing time, t + τ is the set time, time_n ≤ t ≤ time_{n+1}, τ = time_n − time_{n−1}, and weight_n is the probability corresponding to the category at the set time t + τ.

Preferably, the information is a website, a webpage, or an object in a webpage.

Preferably, according to the interest value, providing the target information of the category corresponding to the interest value not less than the set threshold to the user at the set time includes:

comparing the interest value with a set threshold;

and providing the information of the category corresponding to the interest value which is not less than the threshold value to the user at the set moment.

The embodiment of the present invention further provides a device for providing target information to a user, so as to quantify and describe the interest of the user in the information in a multi-granularity manner according to the user requirement, dynamically reflect the change process of the user interest, and simplify the user operation, wherein the device comprises:

the acquisition module is used for acquiring historical behavior data of information browsed by a user, wherein the historical behavior data comprises the content of the information and browsing time;

the classification module comprises a first classification unit and a second classification unit;

the first classification unit is used for classifying the content of the information and determining the number of categories to which the information belongs;

the second classification unit is used for determining the probability that the information belongs to one or more categories to which the content of the information belongs according to the number of the categories to which the information belongs;

the processing module is used for determining interest values of the user in the one or more categories at a set moment according to the probability and the browsing moment;

and the providing module is used for providing the information of the category corresponding to the interest value which is not less than the set threshold value to the user at the set moment according to the interest value.

Preferably, the historical behavior data further comprises a user identification; the processing module is further used for obtaining interest values of different users at set time according to the user identification, the probability and the browsing time;

the providing module includes:

a receiving unit, configured to receive a user identifier;

and the first providing unit is used for providing target information to the corresponding user at the set moment according to the interest value and the received user identification.

Preferably, the processing module includes:

a first processing unit, configured to determine a value of interest of the user in a category at the browsing time according to the following formula:

the second processing unit is used for determining the interest value of the user in the category at the set moment according to the following formula:

Preferably, the providing module includes:

a comparison unit for comparing the interest value with a threshold value;

and the second providing unit is used for providing the information of the category corresponding to the interest value which is not less than the threshold value to the user at the set moment.

In the embodiment of the invention, historical behavior data of information browsed by a user is acquired, wherein the historical behavior data comprises the content of the information and the browsing time; classifying the content of the information to obtain the probability that the information belongs to a set category; obtaining an interest value of a user at a set moment according to the probability and the browsing moment; according to the interest value, the target information is provided for the user at the set moment, so that the interest of the user to the information can be quantized and described in a multi-granularity mode according to the user requirement, the change process of the user interest is dynamically reflected, the future interest trend of the user is predicted, the user does not need to provide feedback information during implementation, and the user operation is relatively simplified.

Drawings

FIG. 1 is a flow chart of providing targeted information to a user in an embodiment of the present invention;

FIG. 2 is a graph illustrating probability that information browsed by a user belongs to a predetermined category according to an embodiment of the present invention;

FIG. 3 is a schematic structural diagram of an apparatus for providing target information to a user according to an embodiment of the present invention;

FIG. 4 and FIG. 7 are schematic structural diagrams of a providing module according to an embodiment of the present invention;

FIG. 5 is a block diagram of a classification module according to an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of a processing module according to an embodiment of the present invention.

Detailed Description

The following detailed description of embodiments of the invention refers to the accompanying drawings.

As shown in fig. 1, in the embodiment of the present invention, a process of providing target information to a user is as follows:

Step 11, acquiring historical behavior data of information browsed by the user, wherein the historical behavior data comprises the content of the information and the browsing time.

Step 12, classifying the content of the information to obtain the probability that the information belongs to the set category.

Step 13, obtaining the interest value of the user at the set time according to the obtained probability and the browsing time.

Step 14, providing target information to the user at the set time according to the interest value.

In the flow shown in fig. 1, the information browsed by the user may be a website, a webpage, or an object in the webpage, which may be specifically set according to the user requirement. Those skilled in the art can easily understand that, in different application occasions, the above processing can be performed on target information required by a user, and the method provided by the embodiment of the invention can be applied to a product which is as large as a website and as small as a webpage, so that the information interest degree of the user can be described in a multi-granularity manner according to the user requirement.

In one embodiment, when there are multiple users, the historical behavior data in step 11 may further include a user identifier that uniquely identifies the user. The user identifier may be the user's registration name, IP address, or the like. Taking web pages as the information browsed by the user, Table 1 illustrates the user's historical behavior data:

TABLE 1 historical behavioral data of users

If there are multiple users, the interest values of different users at the set time can be obtained in step 13 according to the user identifier, the probability, and the browsing time. According to the requirements of different users, different classification systems can be preset; a classification system may be single-layer, multi-layer, or organized in another way. Whichever classification method is adopted, the goal is to assign the content of the information browsed by the user to one or more categories and to obtain the probability p that the information belongs to each category. For example, in Table 1, the probability p that the content of http:// idocan/page 1.html belongs to the news category is 0.8, the probability p that the content of http:// idocan/2008. mp3 belongs to the music category is 1.0, and the probability p that the content of http:// idocan/page 2.html belongs to the sports category is 0.6.

In one embodiment, the content of the information may first be classified to obtain the number of categories to which the information belongs, and the probability that the information belongs to a set category may then be obtained from that number. For example, if a piece of information includes news, music, and sports content, it can be assigned to the 3 categories news, music, and sports, and the probability p for each category is 0.33. As another example, if a piece of information includes only news content, it belongs to 1 category (news), and the probability p is 1. Alternatively, if 80% of the content of a piece of information belongs to the news category and 20% belongs to the music category, the probability p that the information belongs to the news category may be 0.8, and the probability p that it belongs to the music category may be 0.2.
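The two ways of deriving the probability p described above can be sketched as follows. This is a minimal illustration, not the patented classifier: the function names and the idea of representing content as a list of per-segment category labels are assumptions made for the example.

```python
from collections import Counter


def uniform_probabilities(categories):
    """Equal share per distinct category: p = 1 / (number of categories)."""
    distinct = sorted(set(categories))
    if not distinct:
        return {}
    p = 1.0 / len(distinct)
    return {category: p for category in distinct}


def share_probabilities(segment_labels):
    """p for a category = fraction of content segments labelled with it.

    `segment_labels` is a hypothetical representation: one category label
    per unit of content (for example, per paragraph).
    """
    counts = Counter(segment_labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: n / total for category, n in counts.items()}


if __name__ == "__main__":
    print(uniform_probabilities(["news", "music", "sports"]))
    print(share_probabilities(["news"] * 8 + ["music"] * 2))
```

The first function corresponds to the "number of categories" case (three categories give about 0.33 each); the second corresponds to the 80% / 20% case.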

In an embodiment, after the content of the information browsed by the user is classified according to the classification system, the probabilities may be sorted in each category according to a time sequence, and after the data processing is completed, the classified historical behavior data shown in table 2 may be obtained. Wherein, Topic is a certain category in the classification system, and weight is the probability that the information browsed by the user belongs to the set category.

TABLE 2 categorized historical behavior data

Applying the example of table 1 to table 2, table 3 can be obtained:

TABLE 3 specific examples of categorized historical behavior data
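As a rough illustration of how classified browsing records might be arranged into per-topic time series of the kind Table 2 describes, consider the sketch below. The `Visit` record layout, the user identifier `"alice"`, and the numeric timestamps are hypothetical; only the category probabilities (news 0.8, music 1.0, sports 0.6) come from the example in the text.

```python
from collections import defaultdict
from typing import NamedTuple


class Visit(NamedTuple):
    """One classified browsing record (hypothetical layout)."""
    user_id: str
    time: float            # browsing time, in arbitrary units
    probabilities: dict    # category (Topic) -> probability (weight)


def build_topic_series(visits):
    """Group weights by (user, topic) and sort each group by browsing time."""
    series = defaultdict(list)
    for visit in visits:
        for topic, weight in visit.probabilities.items():
            series[(visit.user_id, topic)].append((visit.time, weight))
    for points in series.values():
        points.sort()
    return dict(series)


if __name__ == "__main__":
    visits = [
        Visit("alice", 3.0, {"sports": 0.6}),
        Visit("alice", 1.0, {"news": 0.8}),
        Visit("alice", 2.0, {"music": 1.0}),
    ]
    for key, points in build_topic_series(visits).items():
        print(key, points)
```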

In one embodiment, following the Ebbinghaus law of memory and forgetting, step 13 may obtain the interest value of the user at the browsing time according to formula (I):

where k is the forgetting factor, for example k = 0.1; k_α is the memory factor, for example k_α = 0.9; weight₀ is the probability corresponding to time t; and t is the browsing time. Formula (I) thus describes the user's degree of interest in a category at the browsing time: the larger the value, the greater the interest, so the actual interest is expressed quantitatively.

Deriving the iterative relationship gives formula (II) for the interest value of the user at the set time:

where t is the browsing time, t + τ is the set time, time_n ≤ t ≤ time_{n+1}, τ = time_n − time_{n−1}, and weight_n is the probability corresponding to the set time t + τ. Thus, at time t + τ, the user's interest value is the superposition of the interest value remaining from the previous time and the newly added interest value.
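The superposition of a decaying residual interest and a newly added interest can be sketched as below. The exact expressions of the two formulas are not given in this text, so the sketch assumes an exponential forgetting curve, Z(t + τ) = Z(t)·exp(−k·τ) + k_α·weight_n, with Z = k_α·weight₀ at the first browsing time. This is one common reading of the Ebbinghaus law and may differ from the patent's formulas; the function names are hypothetical, and only the example values k = 0.1 and k_α = 0.9 are taken from the text.

```python
import math


def initial_interest(weight0, k_alpha=0.9):
    """Assumed form of the interest value at the first browsing time."""
    return k_alpha * weight0


def updated_interest(previous, tau, weight_n, k=0.1, k_alpha=0.9):
    """Assumed iterative update: decayed residual plus newly added interest."""
    residual = previous * math.exp(-k * tau)
    return residual + k_alpha * weight_n


def interest_at(series, set_time, k=0.1, k_alpha=0.9):
    """Interest value at `set_time` for one topic.

    `series` is a list of (browsing_time, weight) pairs sorted by time.
    Visits after `set_time` are ignored; after the last visit the value
    simply decays until `set_time`.
    """
    z = 0.0
    last_time = None
    for time, weight in series:
        if time > set_time:
            break
        if last_time is None:
            z = initial_interest(weight, k_alpha)
        else:
            z = updated_interest(z, time - last_time, weight, k, k_alpha)
        last_time = time
    if last_time is None:
        return 0.0
    return z * math.exp(-k * (set_time - last_time))


if __name__ == "__main__":
    news = [(0.0, 0.8), (5.0, 0.8)]
    for t in (0.0, 5.0, 10.0, 30.0):
        print(t, round(interest_at(news, t), 4))
```

Under these assumptions, repeated visits to a category push its interest value up, while a category that is no longer visited fades toward zero.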

Substituting the time series of each topic of each user in Table 2 into the formula yields an interest model for each user, in which each topic corresponds to an interest value Z(t); the change in the user's interest at a future time can be predicted from Z(t).

As can be seen from the graph of formula (II) shown in FIG. 2, the interest value in the embodiment of the present invention dynamically reflects changes in the user's interest, such as the emergence of a new interest or the fading or strengthening of an existing one. The user's interest value for a given category of information can be calculated at any time. FIG. 2 is only an example; in it, the user's interest in the food (eating) and education categories increases, while interest in the family (home) category decreases.

According to the user model in the embodiment of the invention, the interest values of the user for different information can be calculated, and content with a large interest value can be recommended when the user browses information on the website. Specifically, a threshold can be set and the calculated interest value compared with it; information of the categories whose interest value is not less than the threshold is then provided to the user at the set time.
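The threshold comparison can be illustrated with a short sketch. The function name and the decision to return categories ordered by descending interest are assumptions for the example; the only rule taken from the text is that a category qualifies when its interest value is not less than the threshold.

```python
def select_categories(interest_values, threshold):
    """Return categories whose interest value is >= threshold, highest first."""
    chosen = [
        (category, value)
        for category, value in interest_values.items()
        if value >= threshold
    ]
    chosen.sort(key=lambda item: item[1], reverse=True)
    return [category for category, _ in chosen]


if __name__ == "__main__":
    values = {"news": 0.7, "music": 0.3, "sports": 0.5}
    print(select_categories(values, threshold=0.5))
```

A value exactly equal to the threshold is kept, matching the "not less than" wording.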

If there are multiple users, each user needs to submit a user identifier; in this embodiment, step 14 may then provide target information to the corresponding user at the set time according to the obtained interest values and the received user identifier.
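For the multi-user case, a per-user lookup might look like the following sketch. The function names, the nested-dictionary layout, the sample identifiers, and the choice to prefer the registration name over the IP address are all assumptions made for illustration.

```python
def identify_user(registration_name=None, ip_address=None):
    """Pick a user identifier; preferring the registration name is an assumption."""
    user_id = registration_name or ip_address
    if user_id is None:
        raise ValueError("a registration name or an IP address is required")
    return user_id


def categories_for_user(user_id, interest_by_user, threshold):
    """Look up one user's interest values and keep categories >= threshold."""
    interests = interest_by_user.get(user_id, {})
    return sorted(c for c, value in interests.items() if value >= threshold)


if __name__ == "__main__":
    interest_by_user = {
        "alice": {"news": 0.7, "music": 0.2},
        "192.0.2.10": {"sports": 0.6},
    }
    uid = identify_user(ip_address="192.0.2.10")
    print(uid, categories_for_user(uid, interest_by_user, threshold=0.5))
```

An unknown identifier yields an empty list, so a first-time visitor simply receives no category-based recommendations in this sketch.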

Based on the same inventive concept, the present invention further provides an apparatus for providing target information to a user, which has a structure as shown in fig. 3, and comprises: an acquisition module 31, a classification module 32, a processing module 33, and a providing module 34; the acquiring module 31 is configured to acquire historical behavior data of information browsed by a user, where the historical behavior data includes content of the information and browsing time; a classification module 32, configured to classify content of the information to obtain a probability that the information belongs to a set category; the processing module 33 is configured to obtain an interest value of the user at a set time according to the probability and the browsing time; and a providing module 34 for providing the target information to the user at the set time according to the interest value.
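The four-module structure can be pictured as a simple composition of callables. This is only a structural sketch: the class name, the callable signatures, and the stub behaviors in the usage example are hypothetical, and the reference numerals in the comments merely map each field to the module it stands for.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TargetInfoDevice:
    acquire: Callable[[], list]             # acquisition module 31
    classify: Callable[[str], dict]         # classification module 32
    process: Callable[[list, float], dict]  # processing module 33
    provide: Callable[[dict], list]         # providing module 34

    def run(self, set_time):
        """Chain the modules: acquire -> classify -> process -> provide."""
        records = self.acquire()  # list of (content, browsing_time)
        classified = [(time, self.classify(content)) for content, time in records]
        interest = self.process(classified, set_time)
        return self.provide(interest)


if __name__ == "__main__":
    device = TargetInfoDevice(
        acquire=lambda: [("match report", 1.0), ("league table", 2.0)],
        classify=lambda content: {"sports": 1.0},            # stub classifier
        process=lambda classified, set_time: {
            "sports": sum(p.get("sports", 0.0) for _, p in classified)
        },                                                    # stub, no decay
        provide=lambda interest: [c for c, v in interest.items() if v >= 0.5],
    )
    print(device.run(set_time=3.0))
```

Keeping each module behind a plain callable makes it easy to swap in a different classifier or interest formula without touching the rest of the chain.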

In one embodiment, the historical behavior data further includes a user identification; at this time, the processing module 33 may also be configured to obtain interest values of different users at the set time according to the user identifier, the obtained probability, and the browsing time; as shown in fig. 4, the providing module 34 includes: a receiving unit 341, configured to receive a user identifier; the first providing unit 342 is configured to provide the target information to the corresponding user at a set time according to the interest value and the received user identifier.

As shown in FIG. 5, in one embodiment, classification module 32 may include: a first classification unit 321, configured to classify content of the information, and obtain the number of categories to which the information belongs; and a second classification unit 322, configured to obtain a probability that the information belongs to the set category according to the number of categories to which the information belongs.

As shown in fig. 6, in one embodiment, the processing module 33 includes: the first processing unit 331 is configured to obtain an interest value of the user at the browsing time according to the following formula:

where k is the forgetting factor, k_α is the memory factor, weight₀ is the probability corresponding to time t, and t is the browsing time;

a second processing unit 332, configured to obtain an interest value of the user at a set time according to the following formula:

where t is the browsing time, t + τ is the set time, time_n ≤ t ≤ time_{n+1}, τ = time_n − time_{n−1}, and weight_n is the probability at the set time t + τ.

As shown in fig. 7, in one embodiment, providing module 34 may include: a comparing unit 343 configured to compare the interest value with a threshold value; a second providing unit 344, configured to provide the user with information of the category corresponding to the interest value that is not less than the threshold at the set time.

It will be understood by those skilled in the art that all or part of the steps of the methods in the above embodiments may be implemented by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, such as a ROM, a RAM, a magnetic disk, or an optical disk.

In the embodiment of the invention, historical behavior data of information browsed by a user is acquired, wherein the historical behavior data comprises the content of the information and the browsing time; classifying the content of the information to obtain the probability that the information belongs to a set category; obtaining an interest value of a user at a set moment according to the obtained probability and the browsing moment; according to the interest value, the target information is provided for the user at the set moment, so that the interest of the user to the information can be quantized and described in a multi-granularity mode according to the user requirement, the change process of the user interest is dynamically reflected, the future interest trend of the user is predicted, the user does not need to provide feedback information during implementation, and the user operation is relatively simplified.

It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is intended to include such modifications and variations.

Claims

1. A method for providing targeted information to a user, the method comprising:

2. The method of claim 1, wherein the historical behavior data further comprises a user identification;

obtaining interest values of different users at set time according to the user identification, the probability and the browsing time;

and providing target information to the corresponding user at the set moment according to the interest value and the received user identification.

3. The method of claim 2, wherein the user identification is a registration name or an IP address of the user.

4. The method of claim 1, wherein determining a user's interest value in a category at a set time based on the probability and the browsing time comprises:

5. The method of claim 1, wherein the information is a website, a web page, or an object in a web page.

6. The method according to any one of claims 1 to 5, wherein providing the target information of the category corresponding to the interest value not less than the set threshold to the user at the set time according to the interest value comprises:

comparing the interest value with a set threshold value in size;

7. An apparatus for providing targeted information to a user, comprising:

the classification module comprises a first classification unit and a second classification unit; the first classification unit is used for classifying the content of the information and determining the number of categories to which the information belongs; the second classification unit is used for determining the probability that the information belongs to one or more categories to which the content of the information belongs according to the number of the categories to which the information belongs;

and the providing module is used for providing the target information of the category corresponding to the interest value which is not less than the set threshold value to the user at the set moment according to the interest value.

8. The apparatus of claim 7, wherein the historical behavior data further comprises a user identification;

the processing module is further used for obtaining interest values of different users at set time according to the user identification, the probability and the browsing time;

the providing module includes:

a receiving unit, configured to receive a user identifier;

9. The device of claim 7, wherein the processing module comprises:

10. The apparatus of any of claims 7 to 9, wherein the providing module comprises:

a comparison unit for comparing the interest value with a threshold value;