CN103607496A - Method and apparatus for inferring the interests and hobbies of mobile phone users, and mobile phone terminal

Info

Publication number: CN103607496A
Authority: CN
Grant status: Application
Prior art keywords: search, keywords, mobile phone, interests, website
Application number: CN 201310573351
Other languages: Chinese (zh)
Other versions: CN103607496B (en)
Inventor: 李翔宇, 张潇
Original Assignee: 中国科学院深圳先进技术研究院
Priority date: 2013-11-15 (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date: 2013-11-15
Publication date: 2014-02-26
Abstract

The invention belongs to the technical field of communication and provides a method and an apparatus for inferring the interests and hobbies of mobile phone users, and a mobile phone terminal. The method comprises the following steps: reading the browsing history files in the mobile browser; parsing the browsing history files to obtain the keywords the mobile phone user has searched for and the URLs the user has browsed; classifying the searched keywords and the browsed URLs separately; and counting the search frequency of the keywords and browsed URLs in each category, so as to infer the user's interests and hobbies from how high or low those frequencies are. By reading the browsing history files, obtaining the keywords and browsed URLs, and counting the per-category search frequencies, the method, the apparatus, and the mobile phone terminal of the invention can effectively infer the interests and hobbies of mobile phone users.

Description

Method and apparatus for inferring the interests and hobbies of mobile phone users, and mobile phone terminal

Technical Field

[0001] The present invention belongs to the field of communication technology, and in particular relates to a method and an apparatus for inferring the interests and hobbies of mobile phone users, and a mobile phone terminal.

Background

[0002] As mobile phones have spread, the number of mobile phone users has kept growing, and so has the number of mobile Internet users. According to published statistics, the total number of domestic mobile phone users reached 930 million in 2011, and the number of mobile Internet users exceeded 390 million. In addition, the DCCI Internet Data Center predicted that China would have 720 million mobile Internet users by 2013, surpassing the number of PC Internet users. The mobile browser, as the tool with which users browse web pages on a phone, has very good prospects for development.

[0003] In the prior art, ordinary mobile phone terminals do not have the function of inferring the interests and hobbies of mobile phone users.

Summary of the Invention

[0004] An object of the embodiments of the present invention is to provide a method for inferring the interests and hobbies of mobile phone users, so as to solve the problem that ordinary mobile phone terminals lack this function.

[0005] The embodiments of the present invention are implemented as follows. A method for inferring the interests and hobbies of mobile phone users comprises:

[0006] reading the browsing history files in the mobile browser;

[0007] parsing the browsing history files to obtain the keywords the mobile phone user has searched for and the URLs the user has browsed;

[0008] classifying the searched keywords and the browsed URLs separately;

[0009] counting the search frequency of the keywords and browsed URLs in each category, so as to infer the interests and hobbies of the mobile phone user from how high or low those frequencies are.

[0010] An embodiment of the present invention further provides an apparatus for inferring the interests and hobbies of mobile phone users, the apparatus comprising:

[0011] a reading unit, configured to read the browsing history files in the mobile browser;

[0012] an obtaining unit, configured to parse the browsing history files and obtain the keywords the mobile phone user has searched for and the URLs the user has browsed;

[0013] a classification unit, configured to classify the searched keywords and the browsed URLs separately;

[0014] a statistical inference unit, configured to count the search frequency of the keywords and browsed URLs in each category, so as to infer the interests and hobbies of the mobile phone user from how high or low those frequencies are.

[0015] An embodiment of the present invention further provides a mobile phone terminal comprising the above apparatus.

[0016] Compared with the prior art, the beneficial effect of the embodiments of the present invention is that, by reading the browsing history files, obtaining the keywords and browsed URLs, and counting the search frequency of the keywords and browsed URLs in each category, the interests and hobbies of the mobile phone user can be effectively inferred from how high or low the frequencies are.

Brief Description of the Drawings

[0017] FIG. 1 is a flowchart of the method for inferring the interests and hobbies of mobile phone users provided by an embodiment of the present invention;

[0018] FIG. 2 is a first logical schematic diagram of the apparatus for inferring the interests and hobbies of mobile phone users provided by an embodiment of the present invention;

[0019] FIG. 3 is a second logical schematic diagram of the apparatus for inferring the interests and hobbies of mobile phone users provided by an embodiment of the present invention;

[0020] FIG. 4 is a third logical schematic diagram of the apparatus for inferring the interests and hobbies of mobile phone users provided by an embodiment of the present invention.

Detailed Description

[0021] To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.

[0022] The embodiments of the present invention are as follows:

[0023] To aid understanding, several elements used in the description of the embodiments are introduced first:

[0024] Domain name:

[0025] A domain name is the name of a computer or group of computers on the Internet, made up of a string of names separated by dots, and is used to identify the electronic location of a computer during data transmission (it sometimes also refers to a geographical location; a geographical domain denotes a local region with administrative autonomy). A domain name is a "mask" over an IP address. Its purpose is to give a group of servers (websites, e-mail, FTP, and so on) an address that is easy to remember and communicate. Domain names serve as memorable names for Internet participants such as computers, mobile phone terminals, networks, and services.

[0026] A complete domain name consists of two or more parts separated by the period ".", for example: yahoo.com, yahoo.ca.us, yahoo.co.uk. The first of these consists of two parts; the second and third consist of three. In a complete domain name, the part to the right of the last "." is called the top-level domain (TLD), also known as the first-level domain; in the examples above, com, us, and uk are top-level domains. The part immediately to the left of the last "." is called the second-level domain (SLD): in yahoo.com the second-level domain is yahoo, in yahoo.ca.us it is ca, and in yahoo.co.uk it is co. The part to the left of the second-level domain is called the third-level domain, the part to the left of that the fourth-level domain, and so on. For example, in yahoo.ca.us and yahoo.co.uk, yahoo is the third-level domain.
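The level naming described above can be sketched in a few lines of Python. This is only an illustration; the function name `domain_levels` is an assumption introduced here and does not appear in the patent.

```python
from typing import List


def domain_levels(domain: str) -> List[str]:
    """Split a domain name into its labels, ordered from the top-level domain down.

    Index 0 is the top-level domain, index 1 the second-level domain,
    index 2 the third-level domain, and so on.
    """
    labels = domain.strip(".").lower().split(".")
    return labels[::-1]


levels = domain_levels("yahoo.co.uk")
print("TLD:", levels[0])          # uk
print("SLD:", levels[1])          # co
print("third level:", levels[2])  # yahoo
```

Reversing the labels puts the top-level domain first, which matches the "from high to low" ordering the description uses later when it classifies URLs by domain level.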

[0027] Definition and characteristics of the B+ tree:

[0028] 1. A B+ tree of order m is defined as follows:

[0029] (1) each node can have at most m elements;

[0030] (2) except for the root node, each node has at least (m/2) elements;

[0031] (3) if the root node is not a leaf node, it has at least two child nodes;

[0032] (4) all leaf nodes are on the same level;

[0033] (5) a non-leaf node with k child nodes has (k-1) elements, arranged in ascending order;

[0034] (6) the elements in the left subtree of an element are all smaller than it, and the elements in its right subtree are all greater than or equal to it;

[0035] (7) non-leaf nodes store only keys and the indexes pointing to child nodes; records are stored only in leaf nodes;

[0036] (8) adjacent leaf nodes are linked by pointers.

[0037] 2. The B+ tree has the following characteristics:

[0038] (1) all keys appear in the linked list of leaf nodes (a dense index), and the keys in that list are in sorted order;

[0039] (2) a search can never end with a hit in a non-leaf node;

[0040] (3) the non-leaf nodes act as an index over the leaf nodes (a sparse index), and the leaf nodes act as the data layer that stores the (key) data;

[0041] (4) it is better suited to file index systems.
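To make the structural rules above concrete, the following Python sketch builds a small B+ tree in one pass from sorted data and then searches it. It is a simplified illustration under stated assumptions: the class and function names (`Leaf`, `Inner`, `bulk_load`, `search`, `scan`) are introduced here, keys are assumed unique, and the one-pass loader does not enforce the minimum-occupancy rule (2), so the last node on a level may be underfull. It does show rules (4) to (8): records live only in leaves, inner nodes hold separator keys only, and leaves are chained by pointers.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Optional


@dataclass
class Leaf:
    keys: list                      # sorted keys held in this leaf
    values: list                    # the record stored for each key
    next: Optional["Leaf"] = None   # pointer to the adjacent leaf


@dataclass
class Inner:
    keys: list       # separator keys only, no records
    children: list   # always one more child than keys


def bulk_load(items, order=4):
    """Build a tree from (key, value) pairs with at most `order` elements per node."""
    items = sorted(items, key=lambda kv: kv[0])
    # Data layer: pack the sorted records into leaves and chain the leaves.
    leaves = [Leaf([k for k, _ in items[i:i + order]],
                   [v for _, v in items[i:i + order]])
              for i in range(0, len(items), order)] or [Leaf([], [])]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    # Index layers: group nodes under parents until a single root remains.
    level = [(leaf.keys[0] if leaf.keys else None, leaf) for leaf in leaves]
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), order + 1):
            group = level[i:i + order + 1]
            # The smallest key of every child but the first becomes a separator.
            node = Inner([k for k, _ in group[1:]], [n for _, n in group])
            parents.append((group[0][0], node))
        level = parents
    return level[0][1], leaves[0]


def search(root, key):
    """Descend to a leaf; a lookup never stops at an inner node."""
    node = root
    while isinstance(node, Inner):
        # Keys equal to or above a separator go right, smaller keys go left.
        node = node.children[bisect_right(node.keys, key)]
    if key in node.keys:
        return node.values[node.keys.index(key)]
    return None


def scan(first_leaf):
    """Walk the leaf chain, yielding (key, value) pairs in sorted order."""
    leaf = first_leaf
    while leaf is not None:
        yield from zip(leaf.keys, leaf.values)
        leaf = leaf.next
```

`search` corresponds to the tree query and `scan` to the linked-list query; a production B+ tree would add insertion with node splitting and deletion with merging, which are omitted here.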

[0042] Referring to FIG. 1, an embodiment of the present invention provides a method for inferring the interests and hobbies of mobile phone users, the method comprising:

[0043] 101. Read the browsing history files in the mobile browser.

[0044] The browsing history files in the mobile browser include log files and cache files.

[0045] Mobile phone users go online through the phone's built-in browser or through other installed mobile browser software, and all browsing history is recorded in the log files and cache files of the corresponding browser.

[0046] Different browsers store their log files and cache files in different locations on the phone. When reading, the files can be told apart by file extension, because the log files and cache files of different browsers have different extensions.
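Step 101 could be sketched as a directory walk that groups files by extension. The patent does not name any concrete extensions or browsers, so the `SUFFIX_TO_BROWSER` table below is purely hypothetical, as is the function name `find_history_files`.

```python
from pathlib import Path

# Hypothetical extension-to-browser table, for illustration only.
# Real browsers use their own file names and formats.
SUFFIX_TO_BROWSER = {
    ".histlog": "browser_a",
    ".cachedb": "browser_b",
}


def find_history_files(root):
    """Walk `root` and group candidate history files by the browser they belong to."""
    found = {}
    for path in Path(root).rglob("*"):
        browser = SUFFIX_TO_BROWSER.get(path.suffix.lower())
        if browser and path.is_file():
            found.setdefault(browser, []).append(path)
    return found
```

On a real device the search root and the extension table would have to come from knowledge of each installed browser; the sketch only shows the "distinguish by extension" idea.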

[0047] 102. Parse the browsing history files to obtain the keywords the mobile phone user has searched for and the URLs the user has browsed.

[0048] By parsing the log files and cache files, the browsing history of the mobile phone user can be obtained. It includes the keywords the user previously entered in searches and the URLs the user previously browsed. These keywords and URLs reflect the areas the user pays attention to and the user's interests and hobbies.

[0049] Obtaining the searched keywords and browsed URLs by parsing the browsing history files is simple and easy to carry out.
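As a minimal sketch of step 102, assume the history files have already been decoded into a list of visited URL strings (the file formats themselves are browser-specific and not described in the patent). A search keyword can then often be recovered from the URL's query string. The parameter names in `SEARCH_PARAMS` are assumptions about common search engines, and `extract` is a name introduced here.

```python
from urllib.parse import parse_qs, urlsplit

# Assumed query-parameter names that carry the search term on common engines.
SEARCH_PARAMS = ("wd", "word", "q", "query")


def extract(url):
    """Return (host, keyword) for one visited URL; keyword is None if absent."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)   # also decodes %XX escapes and '+'
    for name in SEARCH_PARAMS:
        if name in query:
            return parts.hostname, query[name][0]
    return parts.hostname, None


print(extract("https://www.baidu.com/s?wd=B%2B+tree"))
# ('www.baidu.com', 'B+ tree')
```

The host feeds the URL classification in step 103 and the keyword feeds the keyword classification, so a single pass over the history yields both inputs.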

[0050] 103. Classify the searched keywords and the browsed URLs separately.

[0051] In this embodiment, the step of classifying the searched keywords and the browsed URLs separately comprises:

[0052] classifying the searched keywords by semantics;

[0053] classifying the browsed URLs by domain name level, from high to low.

[0054] In this embodiment, preferably, the method further comprises:

[0055] storing the keywords in each category, together with the search frequency of each keyword, in an array, with different keywords in the same category distinguished by an index built on the array subscripts;

[0056] storing the browsed URLs, classified by domain name level from high to low, together with their access frequencies, in a B+ tree.

[0057] In this embodiment, keywords and their search frequencies are stored in arrays, and an index can be built from the array subscripts; browsed URLs and their access frequencies are stored in a B+ tree, which can be queried either through the leaf linked list or through the tree. Both structures make it easy to build query indexes, help with sorting and lookup, and execute efficiently.
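The array storage with a subscript index might look like the sketch below. The class name `KeywordTable` and its methods are assumptions made for illustration; the patent only states that an array is used and that keywords are distinguished by subscript.

```python
class KeywordTable:
    """Per-category arrays of [keyword, frequency], indexed by array subscript."""

    def __init__(self):
        self.entries = {}   # category -> list of [keyword, frequency]
        self.index = {}     # category -> {keyword: subscript into that list}

    def record(self, category, keyword):
        """Count one search for `keyword` in `category`."""
        rows = self.entries.setdefault(category, [])
        slots = self.index.setdefault(category, {})
        if keyword not in slots:
            slots[keyword] = len(rows)   # new keyword gets the next subscript
            rows.append([keyword, 0])
        rows[slots[keyword]][1] += 1

    def frequency(self, category, keyword):
        """Look the keyword up through its subscript; unknown keywords count 0."""
        slot = self.index.get(category, {}).get(keyword)
        return 0 if slot is None else self.entries[category][slot][1]
```

Keeping the subscript index beside the array makes each update a constant-time operation, while the array itself can still be sorted by frequency when a ranking is needed (the index would then have to be rebuilt).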

[0058] In this embodiment, the keywords the user has used are classified by semantic similarity, and the most frequent keyword in each category is taken; when the user opens the browser again and is about to enter a keyword, these keywords can be recommended to the user.

[0059] In this embodiment, the following step precedes step 103:

[0060] presetting the keyword categories.

[0061] Preferably, a settings interface can be provided so that users can set commonly used categories according to their own needs. In practice, classification can be restricted to the categories the user has set. Through this interaction with the user, the user's interests can be inferred better; put differently, users supply their own interests, which is more direct and convenient. Categories the user has not set are ignored.

[0062] When classifying keywords, the categories can be set in advance; by semantics, for example, they can be divided into entertainment, study, office, leisure, and so on. First, all keywords are sorted into these broad categories, with synonymous keywords assigned to the same category. Then the keywords within each broad category are ordered by frequency from high to low. Next, the most frequent keyword of each broad category is taken out, and these keywords are in turn ordered by frequency; the user's interests are determined from this ordering. How frequently a keyword is used represents the user's specific interests.
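The classification and ranking procedure just described can be sketched as follows. The patent does not say how semantic classification is performed, so this sketch swaps in a small hand-written lookup table; `LEXICON`, its entries, and the function name `rank_interests` are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical lexicon: keyword -> (category, canonical form shared by synonyms).
LEXICON = {
    "movie": ("entertainment", "film"),
    "film": ("entertainment", "film"),
    "song": ("entertainment", "music"),
    "python tutorial": ("study", "python tutorial"),
    "spreadsheet": ("office", "spreadsheet"),
}


def rank_interests(searches, enabled_categories):
    """Return (category, top keyword, frequency) tuples, most frequent first."""
    per_category = defaultdict(Counter)
    for keyword in searches:
        category, canonical = LEXICON.get(keyword.lower(), (None, None))
        if category in enabled_categories:   # categories the user did not set are ignored
            per_category[category][canonical] += 1
    # Take the most frequent keyword of each category, then order the categories.
    tops = [(cat, *counts.most_common(1)[0]) for cat, counts in per_category.items()]
    return sorted(tops, key=lambda t: t[2], reverse=True)


history = ["movie", "film", "song", "film", "python tutorial", "spreadsheet"]
print(rank_interests(history, {"entertainment", "study"}))
# [('entertainment', 'film', 3), ('study', 'python tutorial', 1)]
```

Here "movie" and "film" are merged as synonyms, and "office" is dropped because it is not among the enabled categories, mirroring the rule that unset categories are ignored.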

[0063] The number of visits to each website the user has accessed is counted using a B+ tree data structure, with URLs classified and counted by domain name level from high to low. For example, when the user visits Baidu, Baidu comprises Baidu Baike, Baidu Images, Baidu News, and so on; the website that holds the specific Baidu Baike content is the URL at the highest domain level, and its content is the final result of the user's search. The frequency with which the user browses websites is counted, and the category of the website at the highest domain level that the user browses is taken as a category of the user's interests. The websites at the highest domain level are ordered by frequency from high to low, and the user's interests are inferred from the ordering.
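The hierarchical counting can be illustrated with a plain dictionary keyed by the reversed domain labels. This is a stand-in, not a B+ tree: sorting the keys merely imitates the ordered leaf chain a B+ tree would provide. The function names and the host names in the usage example are assumptions, and the sketch ignores multi-part public suffixes such as co.uk.

```python
from collections import Counter


def domain_path(host):
    """'baike.baidu.com' -> ('com', 'baidu', 'baike')"""
    return tuple(reversed(host.lower().strip(".").split(".")))


def count_visits(hosts):
    """Count each visit at every domain level from the second level down."""
    counts = Counter()
    for host in hosts:
        path = domain_path(host)
        for depth in range(2, len(path) + 1):
            counts[path[:depth]] += 1
    return counts


def subtree(counts, prefix):
    """All entries under one domain, in sorted key order (like a leaf range scan)."""
    return [(".".join(reversed(key)), n)
            for key, n in sorted(counts.items())
            if key[:len(prefix)] == prefix]


visits = ["baike.baidu.com", "baike.baidu.com", "image.baidu.com", "news.baidu.com"]
print(subtree(count_visits(visits), ("com", "baidu")))
# [('baidu.com', 4), ('baike.baidu.com', 2), ('image.baidu.com', 1), ('news.baidu.com', 1)]
```

Because keys sharing a domain prefix sort next to each other, all sites under one parent domain form a contiguous range, which is the property that makes an ordered index attractive for this kind of roll-up.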

[0064] 104. Count the search frequency of the keywords and browsed URLs in each category, so as to infer the interests and hobbies of the mobile phone user from how high or low those frequencies are.

[0065] In this embodiment, step 104 specifically comprises the following steps:

[0066] counting the search frequency of the keywords in each category;

[0067] sorting the most frequently searched keyword of each category by frequency from high to low;

[0068] counting the search frequency of the browsed URLs in each category;

[0069] sorting the URL at the highest domain level of each category by frequency from high to low;

[0070] inferring the interests and hobbies of the mobile phone user from the two orderings.

[0071] In this embodiment, the keywords the user has used and the URLs the user has browsed are assigned to their categories, and the keywords and URLs in each category are then ordered by frequency from high to low. This yields the categories the user pays the most and the least attention to, from which the user's interests are inferred.
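The patent does not specify how the two orderings are combined into one inference. Purely as an assumed illustration, the sketch below merges them with a simple positional score (a Borda-style count): a category earns more points the higher it appears in either list. The function name `infer_interests` and the scoring rule are not from the patent.

```python
def infer_interests(keyword_ranking, url_ranking):
    """Merge two category rankings (most frequent first) into one ordering."""
    score = {}
    for ranking in (keyword_ranking, url_ranking):
        for position, category in enumerate(ranking):
            # First place earns len(ranking) points, last place earns 1.
            score[category] = score.get(category, 0) + len(ranking) - position
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(score, key=lambda c: (-score[c], c))


by_keyword = ["entertainment", "study", "office"]
by_url = ["study", "news", "entertainment"]
print(infer_interests(by_keyword, by_url))
# ['study', 'entertainment', 'news', 'office']
```

Other combinations, such as weighting URL visits more heavily than searches, would fit the description equally well; the choice is left open by the text.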

[0072] In this embodiment, preferably, the method further comprises:

[0073] recommending keywords, websites, or applications to the mobile phone user according to the user's interests and hobbies.

[0074] Keywords, related websites, or applications can be recommended to mobile phone users according to their interests. This is convenient for mobile browser developers: with the method of this embodiment they can readily recommend commonly used search keywords to users, and can recommend interest-related websites within the pages being browsed. This adds browser functionality, improves the experience of the phone's operating system and software, and benefits mobile phone users; browser developers can also earn advertising referral fees, bringing economic benefit.

[0075] In this embodiment, preferably, the method further comprises:

[0076] recommending to the mobile phone user the most frequently searched keyword in each category;

[0077] recommending to the mobile phone user the most frequently visited URL at the highest domain level in each category.

[0078] This brings convenience to mobile phone users who often enter a particular keyword or visit a particular website in the mobile browser.

[0079] Referring to FIG. 2, an embodiment of the present invention further provides an apparatus for inferring the interests and hobbies of mobile phone users, the apparatus comprising:

[0080] a reading unit 201, configured to read the browsing history files in the mobile browser;

[0081] an obtaining unit 202, configured to parse the browsing history files and obtain the keywords the mobile phone user has searched for and the URLs the user has browsed;

[0082] a classification unit 203, configured to classify the searched keywords and the browsed URLs separately;

[0083] Preferably, the classification unit 203 further includes a category module, configured to preset the keyword categories.

[0084] Preferably, a settings interface can be provided so that users can set commonly used categories according to their own needs. In practice, classification can be restricted to the categories the user has set. Through this interaction with the user, the user's interests can be inferred better; put differently, users supply their own interests, which is more direct and convenient. Categories the user has not set are ignored.

[0085] a statistical inference unit 204, configured to count the search frequency of the keywords and browsed URLs in each category, so as to infer the interests and hobbies of the mobile phone user from how high or low those frequencies are.

[0086] Preferably, the classification unit is specifically configured to classify the searched keywords by semantics and to classify the browsed URLs by domain name level from high to low.

[0087] Referring to FIG. 3, in this embodiment, preferably, the apparatus further comprises:

[0088] an array storage unit 301, configured to store the keywords in each category, together with the search frequency of each keyword, in an array, with different keywords in the same category distinguished by an index built on the array subscripts;

[0089] a B+ tree storage unit 302, configured to store the browsed URLs, classified by domain name level from high to low, together with their access frequencies, in a B+ tree.

[0090] Referring to FIG. 4, the apparatus further comprises:

[0091] a recommendation unit 401, configured to recommend keywords, websites, or applications to the mobile phone user according to the user's interests and hobbies.

[0092] The details of the apparatus have been described in the method and are not repeated here.

[0093] An embodiment of the present invention further provides a mobile phone terminal comprising the above apparatus.

[0094] With the method, the apparatus, and the mobile phone terminal for inferring the interests and hobbies of mobile phone users of the present invention, by reading the browsing history files, obtaining the keywords and browsed URLs, and counting the search frequency of the keywords and browsed URLs in each category, the interests and hobbies of the mobile phone user can be effectively inferred from how high or low the frequencies are.

[0095] The above are merely preferred embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (10)

  1. A method for inferring the interests and hobbies of mobile phone users, characterized in that the method comprises: reading the browsing history files in the mobile browser; parsing the browsing history files to obtain the keywords the mobile phone user has searched for and the URLs the user has browsed; classifying the searched keywords and the browsed URLs separately; and counting the search frequency of the keywords and browsed URLs in each category, so as to infer the interests and hobbies of the mobile phone user from how high or low those frequencies are.
  2. The method according to claim 1, characterized in that the step of classifying the searched keywords and the browsed URLs separately comprises: classifying the searched keywords by semantics; and classifying the browsed URLs by domain name level, from high to low.
  3. The method according to claim 2, characterized in that the method further comprises: storing the keywords in each category, together with the search frequency of each keyword, in an array, with different keywords in the same category distinguished by an index built on the array subscripts; and storing the browsed URLs, classified by domain name level from high to low, together with their access frequencies, in a B+ tree.
  4. The method according to claim 1, 2, or 3, characterized in that the method further comprises: recommending keywords, websites, or applications to the mobile phone user according to the user's interests and hobbies.
  5. The method according to claim 1, characterized in that the browsing history files include log files and cache files.
  6. An apparatus for inferring the interests and hobbies of mobile phone users, characterized in that the apparatus comprises: a reading unit, configured to read the browsing history files in the mobile browser; an obtaining unit, configured to parse the browsing history files and obtain the keywords the mobile phone user has searched for and the URLs the user has browsed; a classification unit, configured to classify the searched keywords and the browsed URLs separately; and a statistical inference unit, configured to count the search frequency of the keywords and browsed URLs in each category, so as to infer the interests and hobbies of the mobile phone user from how high or low those frequencies are.
  7. The apparatus according to claim 6, characterized in that the classification unit is specifically configured to classify the searched keywords by semantics and to classify the browsed URLs by domain name level from high to low.
  8. The apparatus according to claim 7, characterized in that the apparatus further comprises: an array storage unit, configured to store the keywords in each category, together with the search frequency of each keyword, in an array, with different keywords in the same category distinguished by an index built on the array subscripts; and a B+ tree storage unit, configured to store the browsed URLs, classified by domain name level from high to low, together with their access frequencies, in a B+ tree.
  9. The apparatus according to claim 6, 7, or 8, characterized in that the apparatus further comprises: a recommendation unit, configured to recommend keywords, websites, or applications to the mobile phone user according to the user's interests and hobbies.
  10. A mobile phone terminal, characterized in that the mobile phone terminal comprises the apparatus according to claim 6.
CN 201310573351 2013-11-15 2013-11-15 Method and apparatus for inferring the interests and hobbies of mobile phone users, and mobile phone terminal CN103607496B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201310573351 CN103607496B (en) 2013-11-15 2013-11-15 Method and apparatus for inferring the interests and hobbies of mobile phone users, and mobile phone terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201310573351 CN103607496B (en) 2013-11-15 2013-11-15 Method and apparatus for inferring the interests and hobbies of mobile phone users, and mobile phone terminal

Publications (2)

Publication Number Publication Date
CN103607496A 2014-02-26
CN103607496B 2017-04-19

Family

ID=50125696

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201310573351 CN103607496B (en) 2013-11-15 2013-11-15 Method and apparatus for inferring the interests and hobbies of mobile phone users, and mobile phone terminal

Country Status (1)

Country Link
CN (1) CN103607496B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7631032B1 (en) * 1998-01-30 2009-12-08 Net-Express, Ltd. Personalized internet interaction by adapting a page format to a user record
CN102831199A (en) * 2012-08-07 2012-12-19 北京奇虎科技有限公司 Method and device for establishing interest model

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103810295A (en) * 2014-03-06 2014-05-21 北京邮电大学 Method and device for extracting internet data
CN103955464A (en) * 2014-03-25 2014-07-30 南京邮电大学 Recommendation method based on situation fusion sensing
CN103955464B (en) * 2014-03-25 2017-10-03 南京邮电大学 Recommendation method based on situation fusion sensing
CN103970891A (en) * 2014-05-23 2014-08-06 三星电子(中国)研发中心 Method for inquiring user interest information based on context
CN103970891B (en) * 2014-05-23 2017-08-25 三星电子(中国)研发中心 Method for inquiring user interest information based on context
CN105095363A (en) * 2015-06-26 2015-11-25 百度在线网络技术(北京)有限公司 Invitation commenting method and device for sites
WO2018023684A1 (en) * 2016-08-05 2018-02-08 吴晓敏 Information pushing method during recognition of user's interests and recognition system
WO2018023683A1 (en) * 2016-08-05 2018-02-08 吴晓敏 Usage data statistical method for point of interest capturing technology and recognition system

Also Published As

Publication number Publication date Type
CN103607496B (en) 2017-04-19 grant

Legal Events

Date Code Title Description
C06 Publication
C10 Entry into substantive examination
GR01