CN103870512A - Method and device for generating user interest label - Google Patents

Publication number
CN103870512A
Authority
CN
China
Prior art keywords
domain name
user
preset time
time section
categorize interests
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201210552046.5A
Other languages
Chinese (zh)
Inventor
曾鹏云
程小梅
苏小康
范世青
张凯
穆裔坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201210552046.5A
Publication of CN103870512A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/954 Navigation, e.g. using categorised browsing

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention is suitable for the information processing field, and provides a method and a device for generating a user interest label. The method comprises the following steps: creating an interest classification; obtaining a website domain name, and matching the website domain name into the corresponding interest classification to obtain the mapping relationship between the website domain name and the interest classification; collecting the webpage browsing information of the user within a first preset time period; according to the webpage browsing information of the user within the first preset time period and the mapping relationship between the website domain name and the interest classification, generating the user interest label. Because the user interest label is generated according to the webpage browsing information of the user within the preset time period, the user interest can be more truly, accurately and objectively reflected by the generated user interest label.

Description

Method and device for generating a user interest label
Technical field
The invention belongs to the field of information processing, and in particular relates to a method and a device for generating a user interest label.
Background
With the development of network technology, many services that traditionally required face-to-face contact to spread information, such as advertisement placement and the dissemination and pushing of information, can now be carried out over the network. This not only gives traditional services a new channel of dissemination; compared with traditional channels, network-based dissemination is also more targeted and reaches a wider audience. However, because existing network-based dissemination cannot collect a user's interest information truthfully, accurately, and objectively, it still suffers from poor targeting and a high degree of blindness.
Summary of the invention
An object of the embodiments of the present invention is to provide a method for generating a user interest label, so as to solve the problem of how to generate a true, accurate, and objective user interest label.
The embodiments of the present invention are implemented as follows: a method for generating a user interest label, the method comprising:
creating interest classifications;
obtaining website domain names and matching each website domain name to the corresponding interest classification, to obtain the mapping relationship between website domain names and interest classifications;
collecting the webpage browsing information of a user within a first preset time period, and generating the user interest label according to the user's webpage browsing information within the first preset time period and the mapping relationship between website domain names and interest classifications.
Another object of the embodiments of the present invention is to provide a device for generating a user interest label, the device comprising:
a classification creating unit, configured to create interest classifications;
a mapping unit, configured to obtain website domain names and match each website domain name to the corresponding interest classification, to obtain the mapping relationship between website domain names and interest classifications;
an interest label generating unit, configured to collect the webpage browsing information of a user within a first preset time period, and to generate the user interest label according to the user's webpage browsing information within the first preset time period and the mapping relationship between website domain names and interest classifications.
In the embodiments of the present invention, website domain names are matched to the corresponding pre-created interest classifications to obtain the mapping relationship between website domain names and interest classifications, and the user interest label is generated according to the user's webpage browsing information within a preset time period and that mapping relationship. Because the user interest label is generated from the user's webpage browsing information within the preset time period, the generated label reflects the user's interests more truthfully, accurately, and objectively.
Brief description of the drawings
Fig. 1 is a flowchart of the method for generating a user interest label provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of the multi-level interest classification provided by an embodiment of the present invention;
Fig. 3 is a structural block diagram of the device for generating a user interest label provided by an embodiment of the present invention;
Fig. 4 is a structural block diagram of the device for generating a user interest label provided by another embodiment of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and are not intended to limit it.
In the embodiments of the present invention, interest classifications are created, website domain names are matched to the corresponding interest classifications to obtain the mapping relationship between website domain names and interest classifications, the webpage browsing information of a user within a preset time period is collected, and a user interest label is generated according to the user's webpage browsing information and the mapping relationship between website domain names and interest classifications.
The technical solutions of the invention are illustrated below by specific embodiments.
Fig. 1 shows the flow of the method for generating a user interest label provided by an embodiment of the present invention, detailed as follows:
S101, create interest classifications.
In this embodiment, the interest classification may be single-level or multi-level. A single-level interest classification consists of several independent, parallel classifications, for example a military classification, a science and technology classification, and a women classification. A multi-level interest classification consists of several first-level classifications, with several second-level classifications created under each first-level classification, and so on; the number of levels can be set freely as needed.
When an interest classification is created, it is given a classification identifier that uniquely identifies it. In a multi-level interest classification, the classification identifier of the parent classification can be used as part of the classification identifier of the child classification, so that the hierarchy of the interest classifications is expressed more clearly and explicitly.
To explain the above interest classifications more concretely, a specific example follows:
When a single-level interest classification is created, the classifications created include but are not limited to military, science and technology, and women, and a classification identifier is set for each of them; for example, the classification identifier of military is 1, that of science and technology is 2, and that of women is 3. It can be understood that different interest classifications can be created for different needs, and that other forms of classification identifier can be set for each interest classification; these are neither limited nor listed exhaustively here.
When a multi-level interest classification is created, the first-level interest classifications created include but are not limited to military, science and technology, and women. For the military classification, the second-level interest classifications created include but are not limited to army and navy; for the science and technology classification, they include but are not limited to IT and biology; for the women classification, they include but are not limited to clothing and jewelry. A classification identifier is also set for each of these interest classifications: among the first-level classifications, military is 1, science and technology is 2, and women is 3; among the second-level classifications, army is 1.1, navy is 1.2, IT is 2.1, biology is 2.2, clothing is 3.1, and jewelry is 3.2. To illustrate the multi-level interest classification more clearly, refer to Fig. 2, a schematic diagram of the multi-level interest classification provided by an embodiment of the present invention; the multi-level interest classification is not limited to this diagram. The root node does not represent any interest classification.
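The identifier scheme described above, in which a child identifier embeds its parent identifier, can be illustrated with a minimal Python sketch. The class name `InterestCategory`, the helper names, and the English category names are assumptions made for illustration; the source only specifies identifiers such as 1, 1.1, and 1.2.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InterestCategory:
    """One node of the interest classification tree (hypothetical type)."""
    category_id: str          # e.g. "1" or "1.2"
    name: str
    children: List["InterestCategory"] = field(default_factory=list)

    def add_child(self, name: str) -> "InterestCategory":
        # The child identifier is the parent identifier plus a running index,
        # so the identifier itself records the parent-child relationship.
        child_id = f"{self.category_id}.{len(self.children) + 1}"
        child = InterestCategory(child_id, name)
        self.children.append(child)
        return child


def build_example_tree() -> List[InterestCategory]:
    """Build the two-level example from the text; the root node is implicit."""
    top_level = []
    example = [
        ("Military", ["Army", "Navy"]),
        ("Science and technology", ["IT", "Biology"]),
        ("Women", ["Clothing", "Jewelry"]),
    ]
    for index, (name, subs) in enumerate(example, start=1):
        node = InterestCategory(str(index), name)
        for sub in subs:
            node.add_child(sub)
        top_level.append(node)
    return top_level


def parent_id(category_id: str) -> Optional[str]:
    """Return the parent identifier, or None for a first-level category."""
    head, sep, _ = category_id.rpartition(".")
    return head if sep else None
```

Because the parent identifier is a prefix of the child identifier, the parent of any classification can be recovered from the identifier alone, without walking the tree.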
S102, obtain website domain names and match each website domain name to the corresponding interest classification, to obtain the mapping relationship between website domain names and interest classifications.
In this embodiment, website domain names can be obtained from the recorded webpage browsing information of users; alternatively, the directory information of each website can be collected and the website domain names obtained from it. The user's webpage browsing information includes but is not limited to the web addresses the user has visited, where a web address can be the Uniform Resource Locator (URL) of a webpage. The process of obtaining website domain names from the recorded browsing information of users is as follows:
According to the number of levels of the created interest classification, the website directory of the corresponding level is extracted from the user's webpage browsing information, and the extracted website directory is used as the obtained website domain name. In detail:
A website may contain several levels of website directories, such as a first-level directory and a second-level directory. Therefore, in this embodiment, when a website domain name is extracted from the URL of a webpage, the directory of the corresponding level can be extracted from the URL of the webpage the user visited, according to the number of levels of the created interest classification, and the extracted directory is used as the obtained website domain name. The number of levels of the interest classification and the level of the extracted website directory need not be identical; for instance, extraction is not restricted to the first-level directory when the interest classification has one level. The correspondence between the number of levels of the interest classification and the level of the extracted website directory can be set freely according to how the website is organized.
To make the above process of obtaining a website domain name clearer, an example follows:
When the number of levels of the interest classification is 1, that is, when a single-level interest classification has been created, the first-level directory is extracted from the URL of the webpage and used as the obtained website domain name. When the number of levels of the interest classification is 2 (as shown in Fig. 2), the second-level directory is extracted from the URL of the webpage and used as the obtained website domain name. For example:
Suppose the URL of a webpage is as follows:
http://tech.163.com/IT/1129/19/8HGJMVOH0001124J.html
The first-level directory of this webpage is http://tech.163.com, the second-level directory is http://tech.163.com/IT, the third-level directory is http://tech.163.com/IT/1129, and the fourth-level directory is http://tech.163.com/IT/1129/19.
When the created interest classification is single-level, the first-level directory http://tech.163.com is extracted from the URL of this webpage and used as the website domain name obtained from this webpage.
When the created interest classification is multi-level (for example, with 2 levels), the second-level directory http://tech.163.com/IT is extracted from the URL of this webpage and used as the website domain name obtained from this webpage.
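The directory extraction described above can be sketched in Python with the standard `urllib.parse` module. The function name `extract_directory` is hypothetical, and two behaviours are assumptions rather than anything the source states: the last path segment is treated as the page itself unless the path ends with a slash, and a URL shallower than the requested level yields its deepest directory.

```python
from urllib.parse import urlsplit


def extract_directory(url: str, level: int) -> str:
    """Return the level-N directory of a URL (level 1 is scheme plus host)."""
    if level < 1:
        raise ValueError("level must be at least 1")
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # Assumption: the final segment names the page unless the path ends in "/".
    if segments and not parts.path.endswith("/"):
        segments = segments[:-1]
    # Level 1 keeps no path segments, level 2 keeps one, and so on.
    kept = segments[: level - 1]
    return "/".join([f"{parts.scheme}://{parts.netloc}"] + kept)


if __name__ == "__main__":
    page = "http://tech.163.com/IT/1129/19/8HGJMVOH0001124J.html"
    for n in range(1, 5):
        print(n, extract_directory(page, n))
```

Run against the example URL, levels 1 to 4 reproduce the four directories listed in the text.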
The process of matching website domain names to the corresponding interest classifications and obtaining the mapping relationship between website domain names and interest classifications can use any method provided by the prior art, such as pre-configuration, or the following method provided by the embodiments of the present invention:
A. Create in advance the mapping relationship between domain name keywords and interest classifications.
A domain name keyword is a keyword that can be used to distinguish different domain names by interest. Domain name keywords are generally set according to the directory naming rules of each website. For example, if one website uses tech to denote science and technology and another uses science, both tech and science can be used as domain name keywords. Following this principle, the directory naming rules of various websites can be collected and domain name keywords set accordingly, yielding a fairly complete set of domain name keywords. For ease of understanding, Table 1 shows an example of the mapping relationship between domain name keywords and the created interest classifications; the mapping relationship is not limited to Table 1 and can take many other forms.
Table 1
Domain name keyword        Interest classification
Military, military, mil    Military
Tech, tech, science        Science and technology
Women, women, wom          Women
......                     ......
B. Match the website domain name against the domain name keywords; when a match succeeds, obtain the corresponding interest classification from the matched domain name keyword and establish the mapping relationship between the website domain name and the interest classification. For example:
Suppose the website domain name is http://tech.163.com. Matching this website domain name against the domain name keywords yields the matching keyword tech. According to the mapping relationship between domain name keywords and interest classifications, the interest classification corresponding to the matched keyword is science and technology, so the mapping relationship between the website domain name http://tech.163.com and the interest classification "science and technology" can be established. For ease of understanding, Table 2 shows an example of the mapping relationship between website domain names and interest classifications provided by an embodiment of the present invention; the mapping relationship is not limited to the example in Table 2.
Table 2
Website domain name     Interest classification
http://tech.163.com     Science and technology
http://tech.sina.com    Science and technology
http://Tech.sohu.com    Science and technology
http://mil.163.com      Military
http://Mil.sina.com     Military
......                  ......
In this embodiment, website domain names can be obtained automatically from the recorded browsing information of users, and once obtained they can be matched automatically to interest classifications to yield the mapping relationship between website domain names and interest classifications, so the whole process requires no manual involvement.
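Steps A and B above can be sketched in Python as follows. The source does not say how a domain name is compared with a keyword, so this sketch assumes a case-insensitive match on whole tokens of the host and path; the keyword table only mirrors the lower-case entries of Table 1, and the function names are hypothetical.

```python
import re
from typing import Dict, Iterable, Optional

# Keyword table modelled on Table 1; the entries are illustrative only.
KEYWORD_TO_CATEGORY: Dict[str, str] = {
    "military": "Military",
    "mil": "Military",
    "tech": "Science and technology",
    "science": "Science and technology",
    "women": "Women",
    "wom": "Women",
}


def match_category(domain: str) -> Optional[str]:
    """Return the category of the first token that is a known keyword."""
    # Lower-case the input, drop the scheme, then split on dots and slashes.
    stripped = re.sub(r"^[a-z][a-z0-9+.-]*://", "", domain.lower())
    for token in re.split(r"[./]", stripped):
        if token in KEYWORD_TO_CATEGORY:
            return KEYWORD_TO_CATEGORY[token]
    return None


def build_domain_mapping(domains: Iterable[str]) -> Dict[str, str]:
    """Map every domain that matches a keyword to its interest category."""
    mapping: Dict[str, str] = {}
    for domain in domains:
        category = match_category(domain)
        if category is not None:
            mapping[domain] = category
    return mapping
```

Matching whole tokens rather than substrings avoids, for example, treating a host that merely contains the letters "mil" as military; a real deployment might need looser or stricter rules.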
S103, collect the webpage browsing information of the user within the first preset time period, and generate the user interest label according to the user's webpage browsing information within the first preset time period and the mapping relationship between website domain names and interest classifications.
The first preset time period is generally the most recent period of time, for example the period obtained by counting back a first preset duration from the current time.
The process of generating the user interest label according to the user's webpage browsing information within the first preset time period and the mapping relationship between website domain names and interest classifications can use any method provided by the prior art, or the following method provided by the invention:
A1. According to the user's webpage browsing information within the first preset time period, obtain the website domain names the user browsed within that period and the number of visits to each website domain name. The first preset time period can be one week, two weeks, one month, or another period. For example:
Suppose the user visited 20 webpages within the first preset time period. From this browsing information, the corresponding website domain names can be obtained from the URLs of the 20 webpages, following the process for obtaining website domain names described above. The website domain names obtained may be all identical, partly identical, or all different, and the number of visits to each website domain name can be obtained at the same time. For instance, if 4 of the 20 webpages yield the same website domain name, the number of visits to that website domain name is 4. In this way, the website domain names the user browsed within the first preset time period and the number of visits to each can be obtained. For ease of understanding, Table 3 shows an example of the website domain names browsed by the user within the first preset time period and their numbers of visits, as provided by an embodiment of the present invention; the result is not limited to this example.
Table 3
Website domain name    Number of visits
host1                  3
host2                  5
host3                  2
host4                  10
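Step A1 amounts to counting how many visited URLs yield each website domain name. A minimal Python sketch, assuming a single-level interest classification so that the domain name is simply scheme plus host; the function name and the `.example` hosts used in any call are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable
from urllib.parse import urlsplit


def count_domain_visits(visited_urls: Iterable[str]) -> Dict[str, int]:
    """Count how often each first-level directory (scheme plus host) occurs."""
    counts: Counter = Counter()
    for url in visited_urls:
        parts = urlsplit(url)
        counts[f"{parts.scheme}://{parts.netloc}"] += 1
    return dict(counts)
```

For a multi-level classification, the key would instead be the directory of the corresponding level extracted from each URL.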
A2. Obtain the interest classification corresponding to each website domain name according to the mapping relationship between website domain names and interest classifications. The process is as described above and is not repeated here. For ease of understanding, Table 4 shows the result after the interest classification of each website domain name has been obtained, as provided by an embodiment of the present invention; the result is not limited to Table 4.
Table 4
Website domain name    Number of visits    Interest classification
host1                  3                   Military
host2                  5                   Military
host3                  2                   Science and technology
host4                  10                  Women
A3. Obtain the weight of each of the user's interest classifications within the first preset time period according to the website domain names within that period, their numbers of visits, and their interest classifications. The process is as follows:
The weight of each of the user's interest classifications within the first preset time period can be obtained in several ways, for example by adding, averaging, or multiplying the numbers of visits of the website domain names that belong to the same interest classification within that period. Summation of the numbers of visits is used as the example here.
Taking Table 4 as an example, website domain names host1 and host2 both belong to the military classification, so the total number of visits to the website domain names of the user's military classification (host1 and host2) is 3+5=8, and the weight of the user's military classification within the first preset time period is 8. By the same principle, the weight of the user's science and technology classification within the first preset time period is 2, and the weight of the user's women classification is 10. The result is shown in Table 5, but is not limited to Table 5.
Table 5
Interest classification    Weight of the interest classification
Military                   8
Science and technology     2
Women                      10
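Steps A2 and A3, using summation, can be sketched in Python with the Table 4 data. The function name is hypothetical, and skipping domain names that have no mapping is an assumption; the source does not say how unmapped domain names are handled.

```python
from collections import defaultdict
from typing import Dict


def category_weights(
    visits: Dict[str, int], domain_to_category: Dict[str, str]
) -> Dict[str, int]:
    """Sum the visit counts of all domains that share an interest category."""
    weights: Dict[str, int] = defaultdict(int)
    for domain, count in visits.items():
        category = domain_to_category.get(domain)
        if category is None:
            continue  # assumption: unmapped domains contribute nothing
        weights[category] += count
    return dict(weights)


if __name__ == "__main__":
    table4_visits = {"host1": 3, "host2": 5, "host3": 2, "host4": 10}
    table4_mapping = {
        "host1": "Military",
        "host2": "Military",
        "host3": "Science and technology",
        "host4": "Women",
    }
    print(category_weights(table4_visits, table4_mapping))
```

With the Table 4 inputs, the result matches Table 5: Military 8, Science and technology 2, Women 10.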
A4. Generate the user interest label according to the weight of each of the user's interest classifications within the first preset time period. The process is as follows:
Each interest classification whose weight within the first preset time period is greater than a preset first weight threshold is used as an interest label of the user. The preset first weight threshold is the reference value for judging whether an interest classification can serve as an interest label of the user, and can be set from experience or according to actual needs. For example:
Suppose the preset first weight threshold is 5. According to the weights of the user's interest classifications within the first preset time period shown in Table 5, the generated user interest labels are military and women.
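Step A4 is a simple threshold filter. A minimal Python sketch, with a hypothetical function name; the comparison is strictly greater than the threshold, as the text states.

```python
from typing import Dict, List


def interest_labels(weights: Dict[str, float], threshold: float) -> List[str]:
    """Keep the categories whose weight is strictly greater than the threshold."""
    return [category for category, weight in weights.items() if weight > threshold]


if __name__ == "__main__":
    table5 = {"Military": 8, "Science and technology": 2, "Women": 10}
    print(interest_labels(table5, threshold=5))
```

With the Table 5 weights and a threshold of 5, the sketch returns Military and Women, in agreement with the example.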
In this embodiment, the weight of each of the user's interest classifications within the first preset time period is obtained from the user's webpage browsing information in that period and the mapping relationship between website domain names and interest classifications, so each weight reflects how frequently the user browsed the website domain names belonging to that classification. The larger the weight of an interest classification within the first preset time period, the more often the user browsed website domain names belonging to it, so the weights objectively reflect the tendency of the user's interests. Generating the user interest label from these weights therefore brings the generated label close to the user's real interests.
In another embodiment of the present invention, the first preset time period is divided into several second preset time periods. For example, when the first preset time period is 1 month, it can be divided into 30 second preset time periods of one day each. The first and second preset time periods can of course be set freely as needed. In this case, the process of generating the user interest label according to the user's webpage browsing information within the first preset time period and the mapping relationship between website domain names and interest classifications is as follows:
B1. According to the user's webpage browsing information within each second preset time period, obtain the website domain names the user browsed within each second preset time period and the number of visits to each website domain name. The process is as described above and is not repeated here. The website domain names browsed by the user within each second preset time period and their numbers of visits are as shown in Table 6, but are not limited to Table 6.
Table 6
Second preset time period    Website domain name    Number of visits
Day0                         host1                  3
Day0                         host2                  5
Day0                         host3                  2
Day0                         host4                  10
Day1                         Host5                  1
Day1                         host2                  5
Day1                         host3                  6
Day1                         host4                  7
Day2                         Host6                  7
Day2                         host2                  3
Day2                         host3                  3
Day2                         Host5                  3
B2. Obtain the interest classification corresponding to each website domain name according to the mapping relationship between website domain names and interest classifications. The process is as described above and is not repeated here. The correspondence among the website domain names within each second preset time period, their numbers of visits, and their interest classifications is as shown in Table 7, but is not limited to Table 7.
Table 7
Second preset time period    Website domain name    Number of visits    Interest classification
Day0                         host1                  3                   Military
Day0                         host2                  5                   Military
Day0                         host3                  2                   Science and technology
Day0                         host4                  10                  Women
Day1                         host2                  5                   Women
Day1                         host3                  6                   Women
Day1                         host4                  7                   Science and technology
Day2                         Host6                  7                   Military
Day2                         host2                  3                   Science and technology
Day2                         host3                  3                   Science and technology
Day2                         Host5                  3                   Women
B3. Obtain the initial weight of each of the user's interest classifications within each second preset time period according to the website domain names within each second preset time period, their numbers of visits, and their interest classifications. The process is as described above and is not repeated here. The initial weight of each of the user's interest classifications within each second preset time period is as shown in Table 8, but is not limited to Table 8.
Table 8
Interest classification    Day0    Day1    Day2
Military                   8       0       7
Science and technology     2       7       6
Women                      10      11      3
Table 8 indicates the following. In the second preset time period Day0, the initial weight of the interest classification "military" is 8, that of "science and technology" is 2, and that of "women" is 10. In the second preset time period Day1, the initial weight of "military" is 0, that of "science and technology" is 7, and that of "women" is 11. In the second preset time period Day2, the initial weight of "military" is 7, that of "science and technology" is 6, and that of "women" is 3.
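Step B3 can be sketched in Python by grouping the Table 7 rows by second preset time period and summing the numbers of visits per interest classification. The function name and the row layout (period, domain, visits, category) are assumptions for illustration; every period starts with a zero entry for each known category, so that a category with no visits on a given day gets weight 0, as Military does on Day1.

```python
from typing import Dict, Iterable, Sequence, Tuple

Row = Tuple[str, str, int, str]  # (period, domain, visits, category)


def initial_weights(
    rows: Iterable[Row], categories: Sequence[str]
) -> Dict[str, Dict[str, int]]:
    """Sum visits per category within each second preset time period."""
    table: Dict[str, Dict[str, int]] = {}
    for period, _domain, visits, category in rows:
        # Start each period with zero for every known category.
        per_period = table.setdefault(period, {c: 0 for c in categories})
        per_period[category] = per_period.get(category, 0) + visits
    return table


TABLE7_ROWS = [
    ("Day0", "host1", 3, "Military"),
    ("Day0", "host2", 5, "Military"),
    ("Day0", "host3", 2, "Science and technology"),
    ("Day0", "host4", 10, "Women"),
    ("Day1", "host2", 5, "Women"),
    ("Day1", "host3", 6, "Women"),
    ("Day1", "host4", 7, "Science and technology"),
    ("Day2", "Host6", 7, "Military"),
    ("Day2", "host2", 3, "Science and technology"),
    ("Day2", "host3", 3, "Science and technology"),
    ("Day2", "Host5", 3, "Women"),
]
CATEGORIES = ["Military", "Science and technology", "Women"]
```

Calling `initial_weights(TABLE7_ROWS, CATEGORIES)` reproduces the values of Table 8.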
B4. Obtain the weight of each of the user's interest classifications within the first preset time period according to the initial weight of each of the user's interest classifications within each second preset time period. The process is as follows:
B41. Set a time attenuation coefficient for each second preset time period. In this embodiment, the user's webpage browsing information from times closer to the current time reflects the user's current interests more faithfully than browsing information from times further from the current time. To capture this difference, a time attenuation coefficient can be set for each second preset time period. For example, when the second preset time periods are Day0, Day1, and Day2, the time attenuation coefficient of Day0 can be set to a0, that of Day1 to a1, and that of Day2 to a2.
B42. Obtain the weight of each of the user's interest classifications within the first preset time period according to the initial weight of each of the user's interest classifications within each second preset time period, combined with the time attenuation coefficient of each second preset time period. Here the vector sum, the vector mean, or the vector product of the initial weights of each of the user's interest classifications within the second preset time periods can be used as the weight of that interest classification within the first preset time period. When the vector sum is used, the weight can be computed as follows:
M=M0*a0+…+Mn*an
Wherein M is the weight of certain categorize interests of this user in the first Preset Time section, and M0 is the initial weight of this categorize interests of this user in n the second Preset Time section to Mn, and a0 is the time attenuation coefficient of each the second Preset Time section to an.Take the initial weight of the each categorize interests of this user in each the second Preset Time section shown in table 8 as example, details are as follows:
Assume the time decay coefficient of Day0 is 1, that of Day1 is a, and that of Day2 is b. Then:
In the first preset time period, the weight of the user's interest category "military" is: 8*1 + 0*a + 7*b;
In the first preset time period, the weight of the user's interest category "technology" is: 2*1 + 7*a + 6*b;
In the first preset time period, the weight of the user's interest category "women" is: 10*1 + 11*a + 3*b.
B5. Generate the user interest label according to the weight of each of the user's interest categories in the first preset time period. The detailed process is as described above and is not repeated here.
In this embodiment, a corresponding time decay coefficient is set for each second preset time period, so that the resulting weight of each of the user's interest categories in the first preset time period reflects the user's real interests more faithfully. The user interest label generated from those weights is therefore closer to the user's real interests.
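The Day0, Day1, Day2 example above can be checked numerically. The following is a minimal Python sketch, assuming decay values of 0.5 for a and 0.25 for b (the text leaves a and b unspecified) and a hypothetical helper name `decayed_weight` that does not appear in the original disclosure.

```python
# Initial weights per interest category for Day0, Day1, Day2,
# taken from the worked example in the text.
initial_weights = {
    "military": [8, 0, 7],
    "technology": [2, 7, 6],
    "women": [10, 11, 3],
}

# Time decay coefficients for Day0, Day1, Day2.
# Day0 = 1 follows the text; 0.5 and 0.25 are assumed stand-ins for a and b.
decay = [1.0, 0.5, 0.25]


def decayed_weight(weights, coefficients):
    """Return M = M0*a0 + ... + Mn*an for one interest category."""
    if len(weights) != len(coefficients):
        raise ValueError("one coefficient is required per time period")
    return sum(m * a for m, a in zip(weights, coefficients))


final_weights = {
    category: decayed_weight(weights, decay)
    for category, weights in initial_weights.items()
}
print(final_weights)
```

With these assumed coefficients, older days contribute less: the 7 military visits on Day2 add only 1.75 to the final weight, while the 8 visits on Day0 count in full.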
In another embodiment of the present invention, the method further comprises the following step:
Recommend advertising information and/or content information according to the generated user interest label. For example, if the generated user interest labels include women, technology and so on, advertising or content information in the women or technology categories can be recommended to the user, so that advertisements or content are matched to users intelligently.
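One simple way to read this recommendation step is as a filter over candidate items by category. The sketch below is illustrative only: the `recommend` function, the item dictionaries and their `id` and `category` fields are assumptions, since the text does not specify how advertisements or content are stored or ranked.

```python
def recommend(interest_labels, candidates):
    """Return the candidate items whose category is one of the user's interest labels."""
    labels = set(interest_labels)
    return [item for item in candidates if item["category"] in labels]


# Hypothetical advertisement and content items, each tagged with one category.
candidates = [
    {"id": "ad-1", "category": "women"},
    {"id": "ad-2", "category": "military"},
    {"id": "article-7", "category": "technology"},
]

recommended = recommend(["women", "technology"], candidates)
print([item["id"] for item in recommended])
```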
Fig. 3 shows the structure of the user interest label generating device provided by an embodiment of the present invention; for ease of explanation, only the parts relevant to the embodiment are shown.
The device can be used in a browser. It can be a software unit, a hardware unit, or a combined software and hardware unit running in the browser, or it can be integrated as a standalone plug-in into the browser or into the application system in which the browser runs. In the device:
The category creating unit 1 creates interest categories. An interest category can be single-level or multi-level. When an interest category is created, it is assigned a category identifier that uniquely identifies it. When multi-level interest categories are created, the category identifier of a higher-level interest category can be used as part of the category identifier of its lower-level interest categories, so that the hierarchy among interest categories is expressed more clearly.
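The identifier scheme described for the category creating unit can be sketched as follows. This is one possible encoding, not the one the disclosure prescribes: the `CategoryRegistry` class and the two-digits-per-level format are assumptions chosen so that a parent's identifier is a prefix of its children's identifiers.

```python
class CategoryRegistry:
    """Creates interest categories whose identifiers embed the parent's identifier."""

    def __init__(self):
        self._names = {}        # category identifier -> category name
        self._child_count = {}  # parent identifier ("" for top level) -> number of children

    def create(self, name, parent_id=""):
        """Create a category and return its unique identifier."""
        if parent_id and parent_id not in self._names:
            raise KeyError(f"unknown parent identifier: {parent_id}")
        index = self._child_count.get(parent_id, 0) + 1
        if index > 99:
            raise ValueError("this sketch allows at most 99 categories per parent")
        self._child_count[parent_id] = index
        # Assumed scheme: two digits per level, prefixed by the parent's identifier.
        category_id = f"{parent_id}{index:02d}"
        self._names[category_id] = name
        return category_id

    def ancestors(self, category_id):
        """List the identifiers of all higher-level categories, top level first."""
        return [category_id[:i] for i in range(2, len(category_id), 2)]


registry = CategoryRegistry()
sports_id = registry.create("sports")
football_id = registry.create("football", sports_id)
technology_id = registry.create("technology")
print(sports_id, football_id, technology_id)
```

Because the parent identifier is a prefix, the hierarchy can be recovered from an identifier alone, without a separate parent lookup.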
The mapping unit 2 obtains website domain names and matches them to the corresponding interest categories, obtaining a mapping between website domain names and interest categories.
In this embodiment, website domain names can be obtained from the recorded web browsing information of users; alternatively, directory information of websites can be collected and website domain names obtained from the collected directory information. A user's web browsing information includes, but is not limited to, the web addresses the user has visited. A web address can be the uniform resource locator (Uniform/Universal Resource Locator, URL) of a web page.
In another embodiment of the present invention, the mapping unit 2 comprises a domain name acquisition module 21. The domain name acquisition module 21 obtains website domain names from the recorded browsing information of users, or collects directory information of websites and obtains website domain names from the collected directory information.
In another embodiment of the present invention, the domain name acquisition module 21 is specifically configured to extract, from the user's web browsing information, the site directory at the level corresponding to the number of levels of the created interest categories, and to use the extracted site directory as the obtained website domain name.
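A possible reading of level-dependent extraction is that a deeper category hierarchy keeps more of the URL path. The sketch below assumes that interpretation: level 1 keeps only the host, and each further level keeps one more path segment. The function name `extract_site_key` and the example URL are hypothetical.

```python
from urllib.parse import urlsplit


def extract_site_key(url, levels):
    """Return the host plus the first (levels - 1) path segments of a URL."""
    if levels < 1:
        raise ValueError("levels must be at least 1")
    parts = urlsplit(url)
    # hostname is None when the URL has no scheme or network location.
    host = parts.hostname or ""
    segments = [segment for segment in parts.path.split("/") if segment]
    return "/".join([host] + segments[: levels - 1])


url = "http://sports.example.com/nba/lakers/news.html"
print(extract_site_key(url, 1))
print(extract_site_key(url, 2))
print(extract_site_key(url, 3))
```

With a single-level hierarchy only the host is used for matching; with two or three levels, the `nba` and `lakers` directories become part of the key and can be matched against lower-level categories.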
In another embodiment of the present invention, the mapping unit 2 further comprises a mapping establishment module 22. The mapping establishment module 22 creates in advance a mapping between domain-name keywords and interest categories, matches each website domain name against the domain-name keywords, and, when a match succeeds, obtains the corresponding interest category from the matched keyword and establishes the mapping between the website domain name and that interest category.
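The keyword matching performed by the mapping establishment module can be sketched as below. The matching rule is an assumption: the text does not say how a domain name is compared with a keyword, so this sketch treats a keyword as matched when it equals one of the dot-separated labels of the domain. The keyword table and the domain names are made up for illustration.

```python
# Hypothetical, pre-created mapping from domain-name keywords to interest categories.
KEYWORD_TO_CATEGORY = {
    "sports": "sports",
    "mil": "military",
    "tech": "technology",
}


def map_domains(domains, keyword_to_category):
    """Map each domain to the category of the first keyword found among its labels."""
    mapping = {}
    for domain in domains:
        labels = domain.lower().split(".")
        for keyword, category in keyword_to_category.items():
            if keyword in labels:
                mapping[domain] = category
                break  # first successful match wins
    return mapping


domain_to_category = map_domains(
    ["sports.example.com", "mil.example.com", "www.example.org"],
    KEYWORD_TO_CATEGORY,
)
print(domain_to_category)
```

Domains that match no keyword, such as `www.example.org` here, are simply left out of the mapping.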
The interest label generation unit 3 collects the user's web browsing information in the first preset time period, and generates the user interest label according to that browsing information and the mapping between website domain names and interest categories.
In another embodiment of the present invention, the interest label generation unit 3 specifically comprises:
The first domain name extraction module 31, which obtains, from the user's web browsing information in the first preset time period, the website domain names the user browsed in that period and the visit count of each website domain name;
The first interest category acquisition module 32, which obtains the interest category corresponding to each website domain name according to the mapping between website domain names and interest categories;
The first weight acquisition module 33, which obtains the weight of each of the user's interest categories in the first preset time period according to the website domain names in that period, their visit counts, and their interest categories. The detailed process is as follows:
The weight of each of the user's interest categories in the first preset time period is obtained by combining the visit counts of all website domain names that belong to the same interest category in that period, for example by summing, averaging, or multiplying the visit counts.
The first interest label generation module 34, which generates the user interest label according to the weight of each of the user's interest categories in the first preset time period. The detailed process is as follows:
An interest category whose weight in the first preset time period is greater than a preset first weight threshold is used as an interest label of the user.
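The work of modules 31 to 34 can be illustrated with a short Python sketch. It uses summation, one of the combination modes the text names, and assumes hypothetical function names, domain names, visit counts and a threshold of 4; none of these values come from the original disclosure.

```python
from collections import defaultdict


def category_weights(visit_counts, domain_to_category):
    """Sum the visit counts of all domains that map to the same interest category."""
    weights = defaultdict(int)
    for domain, count in visit_counts.items():
        category = domain_to_category.get(domain)
        if category is not None:  # domains without a mapping are ignored
            weights[category] += count
    return dict(weights)


def interest_labels(weights, threshold):
    """Keep the categories whose weight is strictly greater than the threshold."""
    return sorted(category for category, weight in weights.items() if weight > threshold)


# Hypothetical visit counts for one user in the first preset time period.
visit_counts = {
    "mil.example.com": 5,
    "army.example.net": 3,
    "tech.example.com": 2,
    "unknown.example.org": 9,
}
domain_to_category = {
    "mil.example.com": "military",
    "army.example.net": "military",
    "tech.example.com": "technology",
}

weights = category_weights(visit_counts, domain_to_category)
labels = interest_labels(weights, threshold=4)
print(weights, labels)
```

Here the two military sites combine to a weight of 8, which exceeds the assumed threshold, while technology at 2 does not, so only "military" becomes a label.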
Fig. 4 shows the structure of the user interest label generating device provided by another embodiment of the present invention; for ease of explanation, only the parts relevant to the embodiment are shown. It differs from the user interest label generating device shown in Fig. 3 only in the concrete structure of the interest label generation unit 3. In this embodiment, when the first preset time period is divided into multiple second preset time periods, the interest label generation unit 3 specifically comprises:
The second domain name extraction module 35, which obtains, from the user's web browsing information in each second preset time period, the website domain names the user browsed in that period and the visit count of each website domain name;
The second interest category acquisition module 36, which obtains the interest category corresponding to each website domain name according to the mapping between website domain names and interest categories;
The initial weight acquisition module 37, which obtains the initial weight of each of the user's interest categories in each second preset time period according to the website domain names in that period, their visit counts, and their interest categories;
The second weight acquisition module 38, which obtains the weight of each of the user's interest categories in the first preset time period according to the initial weight of each interest category in each second preset time period;
In another embodiment of the present invention, the second weight acquisition module 38 is specifically configured to set a time decay coefficient for each second preset time period and to obtain the weight of each of the user's interest categories in the first preset time period according to the initial weight of each interest category in each second preset time period and the time decay coefficient of each second preset time period.
The weight of an interest category in the first preset time period can be the vector sum, the vector mean, or the vector product of the user's initial weights for that category across the second preset time periods, combined with the time decay coefficients. When the vector sum is used, the weight can be computed as follows:
M = M0*a0 + … + Mn*an
where M is the weight of a given interest category of the user in the first preset time period, M0 to Mn are the initial weights of that interest category in the respective second preset time periods, and a0 to an are the time decay coefficients of those periods.
The second interest label generation module 39, which generates the user interest label according to the weight of each of the user's interest categories in the first preset time period.
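The path through modules 35 to 38 can be sketched end to end: per-period initial weights are computed first and then combined with decay coefficients. The sketch below is an illustration under stated assumptions. The function names, the domain names and the decay values 1.0, 0.5 and 0.25 are hypothetical, and applying the decay coefficient to each term before taking the mean or product is one interpretation of the text, which only spells out the sum.

```python
import math


def period_weights(visits, domain_to_category):
    """Initial weight of each category in one second preset time period (summed visits)."""
    weights = {}
    for domain, count in visits.items():
        category = domain_to_category.get(domain)
        if category is not None:
            weights[category] = weights.get(category, 0) + count
    return weights


def combine(per_period, decay, mode="sum"):
    """Combine decayed per-period weights into one weight per category."""
    if len(per_period) != len(decay):
        raise ValueError("one decay coefficient is required per period")
    categories = set()
    for weights in per_period:
        categories.update(weights)
    result = {}
    for category in categories:
        # A category not seen in a period contributes an initial weight of 0.
        terms = [w.get(category, 0) * a for w, a in zip(per_period, decay)]
        if mode == "sum":
            result[category] = sum(terms)
        elif mode == "mean":
            result[category] = sum(terms) / len(terms)
        elif mode == "product":
            result[category] = math.prod(terms)
        else:
            raise ValueError(f"unknown mode: {mode}")
    return result


mapping = {"mil.example.com": "military", "tech.example.com": "technology"}
days = [  # hypothetical visit counts for Day0, Day1, Day2
    {"mil.example.com": 8, "tech.example.com": 2},
    {"tech.example.com": 7},
    {"mil.example.com": 7, "tech.example.com": 6},
]
per_period = [period_weights(day, mapping) for day in days]
decay = [1.0, 0.5, 0.25]  # assumed coefficients, most recent period first
totals = combine(per_period, decay)
print(totals)
```

Note that in product mode a single period with no visits drives the category weight to zero, which may be why the text gives the sum as its worked case.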
In another embodiment of the present invention, the device further comprises an information recommendation unit (not shown in the figures). The information recommendation unit recommends advertising information and/or content information according to the generated user interest label. For example, if the generated user interest labels include women, technology and so on, advertising or content information in the women or technology categories can be recommended to the user, so that advertisements or content are matched to users intelligently.
It should be noted that the units included in the above system are divided only according to functional logic; the division is not limited to the one described, as long as the corresponding functions can be realized. In addition, the specific names of the functional units are only for ease of distinguishing them from one another and are not intended to limit the protection scope of the present invention.
A person of ordinary skill in the art will understand that all or part of the steps of the methods in the above embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc.
In the embodiments of the present invention, website domain names are matched to the corresponding pre-created interest categories to obtain a mapping between website domain names and interest categories, and the user interest label is generated according to the user's web browsing information in a preset time period and that mapping. Because the user interest label is generated from the user's own web browsing information in the preset time period, it reflects the user's interests more truthfully, accurately and objectively. In addition, a corresponding time decay coefficient is set for each second preset time period, so that the resulting weight of each of the user's interest categories in the first preset time period reflects the user's real interests more faithfully, and the user interest label generated from those weights is closer to the user's real interests.
The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (15)

1. A method for generating a user interest label, characterized in that the method comprises:
Creating interest categories;
Obtaining website domain names, and matching the website domain names to corresponding interest categories to obtain a mapping between website domain names and interest categories;
Collecting web browsing information of a user in a first preset time period, and generating the user interest label according to the user's web browsing information in the first preset time period and the mapping between website domain names and interest categories.
2. The method of claim 1, characterized in that the interest categories are single-level interest categories or multi-level interest categories.
3. The method of claim 1, characterized in that obtaining website domain names specifically comprises:
Obtaining website domain names from recorded browsing information of users; or
Collecting directory information of websites, and obtaining website domain names from the collected directory information.
4. The method of claim 3, characterized in that obtaining website domain names from the recorded browsing information of users specifically comprises:
According to the number of levels of the created interest categories, extracting the site directory at the corresponding level from the user's web browsing information, and using the extracted site directory as the obtained website domain name.
5. The method of any one of claims 1 to 4, characterized in that matching the website domain names to corresponding interest categories to obtain the mapping between website domain names and interest categories specifically comprises:
Creating in advance a mapping between domain-name keywords and interest categories;
Matching a website domain name against the domain-name keywords and, when the match succeeds, obtaining the corresponding interest category from the matched domain-name keyword and establishing the mapping between the website domain name and that interest category.
6. The method of claim 1, characterized in that generating the user interest label according to the user's web browsing information in the first preset time period and the mapping between website domain names and interest categories specifically comprises:
Obtaining, from the user's web browsing information in the first preset time period, the website domain names the user browsed in that period and the visit count of each website domain name;
Obtaining the interest category corresponding to each website domain name according to the mapping between website domain names and interest categories;
Obtaining the weight of each of the user's interest categories in the first preset time period according to the website domain names in the first preset time period, their visit counts, and their interest categories;
Generating the user interest label according to the weight of each of the user's interest categories in the first preset time period.
7. The method of claim 1, characterized in that, when the first preset time period is divided into multiple second preset time periods, generating the user interest label according to the user's web browsing information in the first preset time period and the mapping between website domain names and interest categories specifically comprises:
Obtaining, from the user's web browsing information in each second preset time period, the website domain names the user browsed in that period and the visit count of each website domain name;
Obtaining the interest category corresponding to each website domain name according to the mapping between website domain names and interest categories;
Obtaining the initial weight of each of the user's interest categories in each second preset time period according to the website domain names in that period, their visit counts, and their interest categories;
Obtaining the weight of each of the user's interest categories in the first preset time period according to the initial weight of each interest category in each second preset time period;
Generating the user interest label according to the weight of each of the user's interest categories in the first preset time period.
8. The method of claim 7, characterized in that obtaining the weight of each of the user's interest categories in the first preset time period according to the initial weight of each interest category in each second preset time period specifically comprises:
Setting a time decay coefficient for each second preset time period;
Obtaining the weight of each of the user's interest categories in the first preset time period according to the initial weight of each interest category in each second preset time period and the time decay coefficient of each second preset time period.
9. A device for generating a user interest label, characterized in that the device comprises:
A category creating unit, configured to create interest categories;
A mapping unit, configured to obtain website domain names and match the website domain names to corresponding interest categories to obtain a mapping between website domain names and interest categories;
An interest label generation unit, configured to collect web browsing information of a user in a first preset time period and to generate the user interest label according to the user's web browsing information in the first preset time period and the mapping between website domain names and interest categories.
10. The device of claim 9, characterized in that the mapping unit comprises:
A domain name acquisition module, configured to obtain website domain names from recorded browsing information of users, or to collect directory information of websites and obtain website domain names from the collected directory information.
11. The device of claim 10, characterized in that the domain name acquisition module is specifically configured to extract, from the user's web browsing information, the site directory at the level corresponding to the number of levels of the created interest categories, and to use the extracted site directory as the obtained website domain name.
12. The device of any one of claims 9 to 11, characterized in that the mapping unit further comprises:
A mapping establishment module, configured to create in advance a mapping between domain-name keywords and interest categories, to match a website domain name against the domain-name keywords and, when the match succeeds, to obtain the corresponding interest category from the matched domain-name keyword and establish the mapping between the website domain name and that interest category.
13. The device of claim 9, characterized in that the interest label generation unit comprises:
A first domain name extraction module, configured to obtain, from the user's web browsing information in the first preset time period, the website domain names the user browsed in that period and the visit count of each website domain name;
A first interest category acquisition module, configured to obtain the interest category corresponding to each website domain name according to the mapping between website domain names and interest categories;
A first weight acquisition module, configured to obtain the weight of each of the user's interest categories in the first preset time period according to the website domain names in the first preset time period, their visit counts, and their interest categories;
A first interest label generation module, configured to generate the user interest label according to the weight of each of the user's interest categories in the first preset time period.
14. The device of claim 9, characterized in that, when the first preset time period is divided into multiple second preset time periods, the interest label generation unit comprises:
A second domain name extraction module, configured to obtain, from the user's web browsing information in each second preset time period, the website domain names the user browsed in that period and the visit count of each website domain name;
A second interest category acquisition module, configured to obtain the interest category corresponding to each website domain name according to the mapping between website domain names and interest categories;
An initial weight acquisition module, configured to obtain the initial weight of each of the user's interest categories in each second preset time period according to the website domain names in that period, their visit counts, and their interest categories;
A second weight acquisition module, configured to obtain the weight of each of the user's interest categories in the first preset time period according to the initial weight of each interest category in each second preset time period;
A second interest label generation module, configured to generate the user interest label according to the weight of each of the user's interest categories in the first preset time period.
15. The device of claim 14, characterized in that the second weight acquisition module is specifically configured to set a time decay coefficient for each second preset time period, and to obtain the weight of each of the user's interest categories in the first preset time period according to the initial weight of each interest category in each second preset time period and the time decay coefficient of each second preset time period.
CN201210552046.5A 2012-12-18 2012-12-18 Method and device for generating user interest label Pending CN103870512A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210552046.5A CN103870512A (en) 2012-12-18 2012-12-18 Method and device for generating user interest label

Publications (1)

Publication Number Publication Date
CN103870512A true CN103870512A (en) 2014-06-18

Family

ID=50909053

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210552046.5A Pending CN103870512A (en) 2012-12-18 2012-12-18 Method and device for generating user interest label

Country Status (1)

Country Link
CN (1) CN103870512A (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102799662A (en) * 2012-07-10 2012-11-28 北京奇虎科技有限公司 Method, device and system for recommending website

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20140618