WO2007051397A1 - An information retrieval system and information retrieval method - Google Patents

An information retrieval system and information retrieval method

Info

Publication number
WO2007051397A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
search
information
feature
behavior information
Prior art date
Application number
PCT/CN2006/002804
Other languages
French (fr)
Chinese (zh)
Inventor
Wei Wang
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2007051397A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/335: Filtering based on additional data, e.g. user or group profiles

Definitions

  • The invention relates to the field of information retrieval technology, and in particular to an information retrieval system and a retrieval method.
  • Background of the invention
  • A search engine is a system that can obtain the web page data of websites, build a database, and provide queries. By working principle, search engines fall into two basic categories: the full-text search engine (FullText Search Engine) and the directory (Directory).
  • The database of a full-text search engine is built by software called a "web robot" (Spider) or "web spider" (Crawlers), which automatically retrieves a large amount of web page content through the various links on the network and analyzes and organizes it according to certain rules. Google and Baidu are both typical full-text search engine systems. A query to a full-text search engine is often referred to as searching "all sites" or "all websites", as in Google's full-text search (http://www.google.com/intl/zh-CN/).
  • A directory, by contrast, is a database formed by manually collecting and organizing website data, such as the directories of Yahoo China, Sohu, Sina, and NetEase. In addition, some navigation sites on the Internet can also be regarded as primitive directories, such as "Website Home" (http://www.haol23.com/).
  • A query to a directory is usually called searching the "category directory" or searching "category websites", as in "Sina search" (http://dir.sina.com.cn/) and "Yahoo China search" (http://cn.search.yahoo.com/dirsrch/).
  • Full-text search engines and directories each have strengths and weaknesses in use.
  • Because a full-text search engine relies on software, its database is very large, but its query results are often not accurate enough.
  • A directory relies on manual collection and organization of websites, so it can provide more accurate query results, but the content it collects is very limited.
  • To combine the strengths of both, many search engines now offer both types of query.
  • Integrating these two types of search engine has also produced other search services, which are also called search engines here. They fall mainly into the following two categories:
  • 1. Meta search engine (META Search Engine).
  • These search engines generally have no web robot or database of their own; their search results are obtained by calling, controlling, and optimizing the search results of several other independent search engines, and are displayed together in a uniform format in the same interface.
  • Although a meta search engine has no "web robot" or "web spider" and no independent index database, it has its own distinctive meta-search technology for search request submission, retrieval interface proxying, and search result display.
  • For example, the "metaFisher meta search engine" (http:AVww.hsfz.net/fish/) calls and integrates data from multiple search engines such as Google, Yahoo AlltheWeb, Baidu, and OpenFind.
  • 2. Integrated search engine (All-in-One Search Page). An integrated search engine uses web technology to link many independent search engines on one web page. When querying, the user selects or specifies the search engines and enters the query once; the search engines are then queried simultaneously, and each engine's results are displayed on a separate page. An example is "Internet Swiss Army Knife" (http://free.okey.net/%7Efree/searchl.htm).
  • The "web robot" or "web spider" of a full-text search engine is a piece of software on the network. It traverses the web space, can scan websites within a certain IP address range, and follows links from one web page to another and from one website to another to collect web page data. To keep the collected data up to date, it also revisits pages it has already crawled. Web pages collected by a web robot or web spider must then be analyzed by other programs; a large amount of computation is performed according to a certain relevance algorithm to build a web page index, which is then added to the content index database.
  • The full-text search engine that we usually see is actually only the search interface of a search engine system.
  • When a keyword is entered for a query, the search engine finds, in the huge content index database, the index entries of all relevant web pages that match the keyword and presents them according to certain ranking rules. Different search engines have different content index databases and different ranking rules, so querying different search engines with the same keyword yields different search results.
  • The file record kept for each file typically includes: a. a URL (Uniform Resource Locator, through which a web browser can access the corresponding file); b. the different content words in the file and, in some engines, the relative address of each such content word with respect to the other content words of the file; c. a short summary of the file, usually only a few lines or the first few lines of the file; d. possibly, a description of the file as provided in its HTML description section.
  • When a user uses a search engine, the user supplies a keyword-based query, and the engine attempts to find files containing as many of the keywords as possible, applying, when requested, an operator or other specification (e.g., a logical operation such as and/or/not) to narrow the range. For each such file it finds, the engine retrieves its file record and provides the records to the user sorted by the number of keyword matches in the file relative to other such files.
  • Such a search engine simply responds to the keyword query provided by the user. However, a user may have different behavior habits at different times and on different machines, and therefore different needs, so the content the user wants to retrieve may differ. Existing search methods do not take these situations into account when classifying search engine results.
  • Summary of the invention
  • The present invention provides an information retrieval system and method.
  • An embodiment of the present invention provides an information retrieval system, including a search engine and a content index database provided to the search engine for searching, and further including:
  • a user feature database, which stores the feature behavior information that the user has in different time periods; and
  • a content analysis system, which is connected to the user feature database, to the search engine, and to the communication network connected to the user terminal, and which is used for determining the current time, receiving the user identifier transmitted by the user terminal, and querying the user feature database to obtain the feature behavior information of that user identifier for the current time; and for searching and sorting the result information found by the search engine again according to the obtained feature behavior information, and sending the re-retrieved search results to the user terminal for display.
  • An embodiment of the present invention further provides an information retrieval method, which includes the following steps: saving the feature behavior information corresponding to a user in different time periods;
  • performing a secondary search, according to the feature behavior information, on the original search result found by the search engine, and displaying the search results that contain the feature behavior information to the user.
  • An embodiment of the present invention further provides an information retrieval method, comprising the following steps:
  • saving the feature behavior information corresponding to the user under different conditions, and performing an original retrieval according to the retrieval keyword input by the user;
  • performing a second retrieval on the original retrieval result according to the feature behavior information.
  • The condition identifier is time, machine model, or a combination of the two.
  • An embodiment of the present invention further provides an information retrieval system, which includes a search engine and a content index database provided to the search engine for searching, connected to each other, and further includes:
  • a user feature database, which stores the feature behavior information of the user under different conditions; and
  • a content analysis system, which is connected to the user feature database, to the search engine, and to the communication network connected to the user terminal, and which is used for acquiring the current condition identifier and the user identifier transmitted from the user terminal, and for querying the user feature database according to the condition identifier and the user identifier to obtain the feature behavior information of that user identifier under the corresponding condition; and for searching and sorting the result information found by the search engine again based on the obtained feature behavior information, and sending the re-retrieved and sorted search results to the user terminal for display.
  • The condition identifier can be time, machine model, or a combination of the two.
  • With the solution provided by the embodiments of the present invention, the original set of file records found by the search engine for the keywords input by the user can be screened and filtered a second time according to the user's personalized feature behavior for the corresponding time and machine model. The file records that the user is really interested in are displayed to the user first, which improves the accuracy and efficiency with which the user retrieves relevant information.
  • FIG. 1 shows the structure of a traditional search engine system.
  • FIG. 2 is a system framework diagram of an information retrieval system according to an embodiment of the present invention.
  • FIG. 3 is a framework diagram of a user feature database according to an embodiment of the present invention.
  • FIG. 4 is a framework diagram of a content analysis system according to an embodiment of the present invention.
  • The "web spider" crawls web pages from the Internet and sends them to the "web database", repeating the loop until all web pages have been crawled.
  • The system obtains text information from the "web database" and sends it to the "text index" module, which builds an index to form the "index database".
  • In response to a query, the "query server" searches the "index database" for relevant web pages, sorts them by relevance, extracts content summaries for the keyword, and finally assembles a result page that is returned to the "user".
  • One embodiment of the present invention takes into account that a user has different feature behavior information in different time periods. Therefore, after the search engine obtains the retrieval results, they are processed according to the feature behavior information of the user for the current time period, and the results that match that feature behavior information are displayed to the user first. This improves the accuracy of search engine retrieval and brings the results provided to the user closer to the user's needs.
  • FIG. 2 shows an information retrieval system according to an embodiment of the present invention, which includes a content analysis system 23, a user feature database 24, a search engine 22, and a content index database 21, wherein:
  • The content analysis system 23 is configured to receive the user identifier and the input search keyword transmitted by the user terminal, to obtain the current time from the local server, and to query the user feature database 24 for the feature behavior that matches the user in that time period. It then retrieves and filters the pages found by the search engine again, so that the pages are presented to the user in the priority order of the feature behaviors the user exhibits during that time period.
  • The machine model, or the combination of time and machine model, can also be used as the condition identifier, which allows the information retrieval system to provide users with higher retrieval accuracy and search efficiency.
  • The user feature database 24 is used to store the feature behavior information of the user, in particular the feature behavior information of the user in different time periods. The database is described in detail later.
  • The search engine 22 is a text- and keyword-based search tool. After searching the existing content index database 21, it returns a list of the required file pointers with file titles, usually together with some descriptive text extracted from the body of each file.
  • The content index database 21 is built by starting an automated software program (such as a "web spider") that automatically visits websites, follows the hypertext links in them in turn, extracts so-called "keywords" from each file encountered, and stores them in the database, which is provided to the search engine 22 for access.
  • FIG. 3 shows an embodiment of the user feature database 24, which may store, but is not limited to, the following information: a personal user information table, a time period information table, a feature behavior table, and a matching table, each described in detail below.
  • When the machine model, or the combination of time and machine model, is used as the condition identifier, the corresponding feature behavior information is likewise saved in the user feature database.
  • The personal user information table is used to store the personal information of the user, which may be the information entered when the user registers.
  • Table 1 below shows a user information table:
  • The time period information table is used to store the time period numbers corresponding to different time periods. The time period number makes the database easier to search and makes the setting of time periods more flexible, as shown in the table below.
  • When the machine model is used as the condition identifier, the corresponding table is a machine model information table, which is similar to the time period information table and has two fields: machine model number and machine model.
  • When the combination of time and machine model is used as the condition identifier, a condition table is used; an example is given in Table 3 below. Its use is similar to that of the time period information table (Table 2), with just one more item.
  • The feature behavior table is used to store the feature behavior numbers corresponding to the user's different feature behavior keywords. A feature behavior keyword may also have dependent keywords; all of these are feature behavior information.
  • Each record in Table 5 above is a complete item of feature behavior information. It also includes a feature priority field, which identifies the priority of the user's different feature behaviors within a given time period.
  • The example in Table 5 shows that, in time period T001, user U001 has feature priority 9 for feature behavior number C001 and feature priority 8 for feature behavior number C002, indicating that user U001 is more inclined to exhibit the feature behavior numbered C001 in time period T001.
  • The data stored in the user feature database 24 may be provided by a system for collecting user behavior features. For such a system, refer to the applicant's invention application "System and method for collecting user service behavior features".
  • FIG. 4 is a block diagram of the content analysis system 23, which includes a data transceiver unit 231, a search engine interface 232, an analysis unit 233, a retrieval analysis unit 234, and a retrieval data storage unit 235. In this embodiment, time is used as the condition identifier.
  • The data transceiver unit 231 is configured to interact with the user terminal. It receives the search keyword that the user inputs through the user terminal and passes it to the search engine interface 232, and it sends the obtained user identifier to the analysis unit 233.
  • The search engine interface 232 is configured to interact with the search engine 22. It sends the search keywords received from the data transceiver unit 231 to the search engine 22, receives the search results of the search engine 22, and forwards them to the retrieval data storage unit 235.
  • The retrieval data storage unit 235 saves the search results of the search engine 22 sent from the search engine interface 232 for analysis by the retrieval analysis unit 234.
  • The analysis unit 233 is configured to receive the user identifier sent by the data transceiver unit 231 and to obtain the current search time, which may be the time obtained by the system or the time reported by the machine. Using the user identifier and the current search time, it searches the user feature database 24, obtains the feature behavior information corresponding to that user identifier and search time, and provides it to the retrieval analysis unit 234.
  • The feature behavior information may include, but is not limited to, a feature behavior keyword and feature behavior dependent keywords. When the machine model is used as the condition identifier, the analysis unit 233 is instead a machine model analysis unit (not shown), which performs processing similar to that of the analysis unit 233 and correspondingly acquires and processes the machine model information.
  • Whereas the analysis unit 233 obtains the current time directly from the system side or from the machine side, the machine model analysis unit determines the corresponding machine model number (i.e., the machine model identifier) from the machine model reported by the machine. After obtaining the user identifier and the machine model identifier, it searches the database 24 to obtain the corresponding feature behavior information.
  • When the combination of time and machine model is used as the condition identifier, the corresponding steps are similar to those above, except that the machine model is replaced by the combination of time and machine model, which makes the condition more specific.
  • The retrieval analysis unit 234 is configured to receive the feature behavior keyword information sent by the analysis unit 233, to perform secondary search filtering and/or sorting on the search results stored in the retrieval data storage unit 235, and to send the filtered and/or sorted search results to the data transceiver unit 231, which returns them to the user terminal for display to the user.
  • With reference to FIG. 4 and to the flow chart of the retrieval process of the information retrieval system of the embodiment of the present invention, the information retrieval method of the present invention is described in detail below. The steps up to step 505 are also an embodiment of the condition-identifier-based retrieval method of the present invention. This embodiment uses time as the condition identifier and includes the following parts:
  • Step 501: First, the user inputs search keywords into the search engine provided by the user terminal according to the information to be queried, and may place a Boolean operator (for example, "and" or "or"), or another operator that the search engine can recognize, between consecutive keywords.
  • For example, the user inputs the search keyword "game" at the user terminal, requesting related information.
  • Step 502: The information is transmitted to the content analysis system 23 through the network, and the data transceiver unit 231 of the content analysis system 23 obtains the keyword information of the user query. The data transceiver unit 231 also obtains the identifier of the user, which may be input by the user through the user terminal or may be entered when the user logs in to the information retrieval system of the embodiment of the present invention.
  • Step 503: The data transceiver unit 231 sends the obtained keyword to the search engine interface 232 and sends the user identifier to the analysis unit 233.
  • For example, the data transceiver unit 231 sends the keyword "game" input by the user to the search engine interface 232 and sends the obtained user identifier U001 to the analysis unit 233.
  • Step 504: The search engine interface 232 sends the obtained keyword of the user query to the search engine 22. The search engine 22 retrieves the related information from the content index database 21 according to the keyword and returns the search result to the search engine interface 232, which then sends it to the retrieval data storage unit 235 for storage.
  • Step 505: The analysis unit 233 finds the matching feature behavior data in the user feature database 24 according to the obtained user identifier and the current time information, and sends the matching feature behavior data to the retrieval analysis unit 234.
  • The time information may be provided by the local server on which the content analysis system is loaded or by any computer device within the network; preferably it is provided by the local server.
  • When the machine model, or the combination of time and machine model, is used as the condition identifier, the analysis unit 233 is a machine model analysis unit or a combined unit having the functions of the above units (not shown). These units perform processing similar to that of the analysis unit 233 and correspondingly acquire and process information such as the machine model.
  • After obtaining the user identifier and the condition identifier (time, machine model, or a combination of the two), the analysis unit searches the user feature database to obtain the corresponding feature behavior information. The retrieval process is:
  • the intermediate feature behavior information of the user is retrieved from the user feature database by using the user identifier;
  • the corresponding feature behavior information is retrieved from the intermediate feature behavior information by using the condition identifier. The corresponding feature behavior information includes a feature behavior keyword and/or feature behavior dependent keywords.
  • The two steps may be performed in either order; that is, the intermediate feature behavior information may first be retrieved from the user feature database by using the condition identifier, and the user identifier may then be used to retrieve the corresponding feature behavior information from that intermediate feature behavior information.
  • For example, the user's feature behavior keywords and feature behavior dependent keywords at that moment are: games, video games, computer games, ...; music, classical, orchestral, ...
  • The user's feature behavior keywords and the related feature priorities are sent to the retrieval analysis unit 234.
  • When the machine model is used as the condition identifier, the corresponding machine model number can be obtained from the database 24 according to the machine model information reported by the machine, and the user's behavior preferences and priorities are then retrieved from the tables of the database 24 according to the user identifier and the machine model number. These feature behavior keywords and the related feature priorities of the user are then sent to the retrieval analysis unit 234.
  • The intermediate feature behavior information and the corresponding feature behavior information are described below.
  • The condition identifier can be time, machine model, or a combination of time and machine model.
  • Take matching Table 5, which shows individual items of feature behavior information, as an example.
  • When the condition identifier is time, the intermediate feature behavior information may be retrieved according to the time period number (i.e., the condition identifier), and the corresponding feature behavior information is then obtained from the intermediate feature behavior information according to the user number (i.e., the user identifier).
  • When the condition identifier is the machine model or the combination of time and machine model, the step of retrieving the corresponding feature behavior information is similar to the step used when the condition identifier is time; only the condition identifier changes.
  • Step 506: Using the user identifier, the retrieval analysis unit 234 obtains from the retrieval data storage unit 235 the relevant retrieval results (such as page information) of the user's search. It then performs a secondary search on this result information using the received feature behavior keywords and related feature priorities, and reorders it, so that the page information the user really wants is displayed to the user first.
  • For example, the high-priority feature behavior keywords (game, electronic game, computer game, ...) are searched first, and the file information retrieved is listed first; then the low-priority feature behavior keywords (music, classical, orchestral, ...) are searched, and the file information retrieved is listed after it; finally, the original search results that do not contain any of the feature behavior keywords in the secondary search are listed last.
  • Step 507: The retrieval analysis unit 234 sends the search results sorted by the secondary search to the data transceiver unit 231, and the data transceiver unit 231 sends them (such as page information) to the user terminal for display to the user.
  • The above search scheme can be used in almost any information retrieval system to increase the search accuracy of the search engine, whether or not the engine is a conventional engine.
  • Embodiments of the present invention also improve the accuracy of retrieving information from a mass database, regardless of the language of the textual information, such as Chinese, English, French, or German.
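The lookup of step 505 and the secondary search of step 506 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the table contents, the time period boundaries, the names `FeatureBehavior`, `TIME_PERIODS`, `MATCH_TABLE`, `period_for`, `lookup_behaviors` and `secondary_search`, and the use of plain substring matching are all assumptions made for illustration. Only the identifiers U001 and T001, the priorities 9 and 8, and the example keywords come from the text above.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class FeatureBehavior:
    user_id: str      # user number, e.g. "U001"
    period_id: str    # time period number, e.g. "T001"
    keywords: tuple   # feature behavior keyword plus its dependent keywords
    priority: int     # feature priority; a higher value is shown first


# Hypothetical time period information table: number -> (start, end).
TIME_PERIODS = {
    "T001": (time(8, 0), time(12, 0)),
    "T002": (time(12, 0), time(18, 0)),
}

# Hypothetical matching table, loosely modelled on the Table 5 example.
MATCH_TABLE = [
    FeatureBehavior("U001", "T001", ("game", "video game", "computer game"), 9),
    FeatureBehavior("U001", "T001", ("music", "classical", "orchestral"), 8),
    FeatureBehavior("U001", "T002", ("news",), 5),
]


def period_for(now):
    """Map a clock time to a time period number (the condition identifier)."""
    for period_id, (start, end) in TIME_PERIODS.items():
        if start <= now < end:
            return period_id
    return None


def lookup_behaviors(user_id, now):
    """Step 505: filter by user identifier, then by condition identifier."""
    period_id = period_for(now)
    intermediate = [row for row in MATCH_TABLE if row.user_id == user_id]
    matched = [row for row in intermediate if row.period_id == period_id]
    return sorted(matched, key=lambda row: row.priority, reverse=True)


def secondary_search(results, behaviors):
    """Step 506: reorder results so higher-priority matches come first."""
    def rank(text):
        lowered = text.lower()
        for position, behavior in enumerate(behaviors):
            if any(keyword in lowered for keyword in behavior.keywords):
                return position
        return len(behaviors)  # no feature behavior keyword: listed last
    # sorted() is stable, so the engine's original order is kept within a tier.
    return sorted(results, key=rank)


original_results = [
    "Weather forecast",
    "Orchestral concert tickets",
    "Computer game reviews",
    "Classical music radio",
    "Video game news",
]
behaviors = lookup_behaviors("U001", time(9, 30))
reordered = secondary_search(original_results, behaviors)
for title in reordered:
    print(title)
```

Because the sort is stable, results within the same priority tier keep the order in which the search engine returned them, which is one reasonable reading of "listed first", "listed after it" and "listed last" in step 506.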

Abstract

An information retrieval system is disclosed, which includes a search engine, a content index database provided to the search engine for performing the search, a user feature database, and a content analysis system. A corresponding information retrieval method is also provided, which includes the following steps: the feature behavior information corresponding to the user identifier in various time intervals is stored; the retrieval keywords input by the user are obtained, and the search engine performs retrieval based on the keywords to obtain the original retrieval result; the user identifier and the current time information are obtained, and the feature behavior keywords of the corresponding user are retrieved based on them; and a second retrieval is performed, based on the feature behavior information, on the original retrieval result obtained by the search engine, and the retrieval results containing the keywords are displayed to the user with first priority. Another information retrieval method and another information retrieval system are also provided. The information retrieval system can filter the user's search results according to the user's different feature behaviors, improving the accuracy and efficiency of the user's search for relevant information.

Description

Information retrieval system and retrieval method
Technical field
The invention relates to the field of information retrieval technology, and in particular to an information retrieval system and a retrieval method.
Background of the invention
A search engine is a system that can obtain the web page data of websites, build a database, and provide queries. By working principle, search engines fall into two basic categories: the full-text search engine (FullText Search Engine) and the directory (Directory).
The database of a full-text search engine is built by software called a "web robot" (Spider) or "web spider" (Crawlers), which automatically retrieves a large amount of web page content through the various links on the network and analyzes and organizes it according to certain rules. Google and Baidu are both typical full-text search engine systems. A query to a full-text search engine is often referred to as searching "all sites" or "all websites", as in Google's full-text search (http://www.google.com/intl/zh-CN/).
A directory, by contrast, is a database formed by manually collecting and organizing website data, such as the directories of Yahoo China, Sohu, Sina, and NetEase. In addition, some navigation sites on the Internet can also be regarded as primitive directories, such as "Website Home" (http://www.haol23.com/). A query to a directory is usually called searching the "category directory" or searching "category websites", as in "Sina search" (http://dir.sina.com.cn/) and "Yahoo China search" (http://cn.search.yahoo.com/dirsrch/).
Full-text search engines and directories each have strengths and weaknesses in use. Because a full-text search engine relies on software, its database is very large, but its query results are often not accurate enough. A directory relies on manual collection and organization of websites, so it can provide more accurate query results, but the content it collects is very limited. To combine the strengths of both, many search engines now offer both types of query. Integrating these two types of search engine has also produced other search services, which are also called search engines here. They fall mainly into the following two categories:
1、 元搜索引擎 (META Search Engine)., 这类搜索引擎一般都没有自己网 絡机器人及数据库, 它们的搜索结果是通过调用、 控制和优化其它多个独立 搜索引擎的搜索结果并以统一的格式在同一界面集中显示。 元搜索引擎虽没 有 "网络机器人"或 "网络蜘蛛", 也无独立的索引数据库, 但在检索请求提 交、检索接口代理和检索结果显示等方面,均有自己研发的特色元搜索技术。 比如 "metaFisher元搜索引擎" (http:AVww.hsfz.net/fish/ ), 它就调用和整合 了 Google、 Yahoo AlltheWeb、 百度和 OpenFind等多家搜索引擎的数据。  1, meta search engine (META Search Engine)., these search engines generally do not have their own network robots and databases, their search results are by calling, controlling and optimizing the search results of other independent search engines and in a uniform format Displayed in the same interface. Although the meta search engine does not have "web robot" or "web spider" and does not have an independent index database, it has its own unique meta-search technology in terms of search request submission, retrieval interface proxy and search result display. For example, "metaFisher meta search engine" (http:AVww.hsfz.net/fish/ ), it calls and integrates data from multiple search engines such as Google, Yahoo AlltheWeb, Baidu and OpenFind.
2、 集成搜索引擎( All - in - One Search Page )。 集成搜索引擎是通过网 络技术, 在一个网页上链接很多个独立搜索引擎, 查询时, 点选或指定搜索 引擎, 一次输入, 多个搜索引擎同时查询, 搜索结果由各搜索引擎分别以不 同页面显示, 如 "网际瑞士军刀,, ( http:〃 free.okey.net/%7Efree/searchl .htm )。  2. Integrated Search Engine (All - in - One Search Page). The integrated search engine uses web technology to link multiple independent search engines on a web page. When querying, click or specify a search engine, input once, multiple search engines simultaneously query, and search results are displayed by different search engines by different pages. Such as "Internet Swiss Army Knife, (http: 〃 free.okey.net/%7Efree/searchl.htm).
The working principle of a search engine is as follows. The "spider" or "crawler" of a full-text search engine is a program that traverses the Web. It can scan the websites within a given range of IP addresses and follows links from one web page to another and from one website to another, collecting page data as it goes. To keep the collected data current, it also revisits pages it has already fetched. The pages collected by the crawler are then analyzed by other programs, which perform extensive computation according to a relevance algorithm to build a web page index; only then are the pages added to the content index database. The full-text search engine that users normally see is really just the search interface of a search engine system. When a keyword is entered, the search engine finds the index entries of all relevant pages matching the keyword in its large content index database and presents them according to its ranking rules. Different search engines have different content index databases and different ranking rules, so the same keyword yields different results on different search engines.
Conventional search engines today use software to visit websites automatically, follow the hypertext links in them in turn, extract each file encountered by means of so-called "keywords", and index each file in a large database for later access.
Specifically, this extraction reduces each file: all semantic and syntactic information is stripped out, and only the words that carry content are kept. These content words may appear in the file itself or only in the description section of its Hypertext Markup Language (HTML). In either case, the engine creates one entry, that is, one file record, for each such file. The content words of each file are indexed in a searchable data structure, with a link pointing back to the file record. The file record usually contains:
a. a web address, that is, a URL (Uniform Resource Locator), through which a web browser can access the file;
b. the distinct content words in the file and, in some engines, the position of each content word relative to the other content words of the file;
c. a short summary of the file, usually just a few lines or the first few lines of the file;
d. possibly a description of the file provided in its HTML description section.
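By way of illustration only, the file record and the searchable structure that points back to it might be sketched as follows in Python. The names FileRecord and build_index, the field names, and the example URLs are assumptions made for this sketch; the background description does not prescribe any particular data layout.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class FileRecord:
    url: str              # (a) address a browser can use to fetch the file
    content_words: dict   # (b) content word -> positions within the file
    summary: str          # (c) a few lines summarizing the file
    description: str = "" # (d) optional text from the HTML description section


def build_index(records):
    """Map each content word to the URLs of the records that contain it."""
    index = defaultdict(set)
    for record in records:
        for word in record.content_words:
            index[word].add(record.url)
    return index


# Hypothetical records, used only to exercise the sketch.
records = [
    FileRecord("http://example.com/a", {"game": [0, 7], "review": [3]}, "A game review."),
    FileRecord("http://example.com/b", {"music": [1]}, "Notes on music."),
]
index = build_index(records)
```

Looking up a word in `index` yields the addresses of the matching file records, which is the "link pointing back" role described above.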
When using a search engine, the user supplies a keyword-based query. The search engine tries to find files that contain as many of the keywords as possible and, when requested, restricts the search according to operators or other constraints (for example logical operations such as AND, OR, and NOT). For each file it finds, the engine retrieves the file record and presents the records to the user ordered by the number of keyword matches in each file relative to the other files found.
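The ordering by number of keyword matches can be sketched as below. This is a simplified illustration, not the ranking rule of any real engine: the function name rank_by_keyword_matches, the require_all flag (standing in for an AND operator), and the sample documents are all assumptions.

```python
def rank_by_keyword_matches(query_words, documents, require_all=False):
    """Return document ids ordered by how many query words each contains."""
    scored = []
    for doc_id, words in documents.items():
        matches = sum(1 for w in query_words if w in words)
        # Skip files with no match; with require_all, skip partial matches too.
        if matches == 0 or (require_all and matches < len(query_words)):
            continue
        scored.append((matches, doc_id))
    # More matches first; ties are broken by document id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


# Hypothetical files, each reduced to its set of content words.
documents = {
    "d1": {"computer", "game", "review"},
    "d2": {"game"},
    "d3": {"music"},
}
ranked_any = rank_by_keyword_matches(["computer", "game"], documents)
ranked_all = rank_by_keyword_matches(["computer", "game"], documents, require_all=True)
```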
At present, a search engine merely responds to the keyword query supplied by the user. However, a user may have different habits at different times and on different machines, and therefore different needs, so the content the user hopes to retrieve may differ. Existing retrieval methods do not take these circumstances into account when classifying search results.
Summary of the Invention
In view of this, the present invention provides an information retrieval system and method.
An embodiment of the present invention provides an information retrieval system that includes a search engine and a content index database searched by the search engine, and further includes:
a user feature database, which stores the feature behavior information that a user exhibits in different time periods; and
a content analysis system, which is connected to the user feature database, the search engine, and the communication network connecting the user terminal. The content analysis system determines the current time and receives the user identifier sent by the user terminal, and queries the user feature database accordingly to obtain the feature behavior information for that user identifier at the current time. It then searches and sorts the results returned by the search engine again according to the obtained feature behavior information, and sends the re-searched and re-sorted results to the user terminal for display.
An embodiment of the present invention further provides an information retrieval method, which includes the following steps:
storing the feature behavior information that corresponds to a user in different time periods;
obtaining the search keyword entered by the user, the search engine searching according to the keyword to obtain original search results;
obtaining the user identifier and the current time, and retrieving the corresponding feature behavior information according to them;
performing a second search on the original search results returned by the search engine according to the feature behavior information, and displaying to the user the search results that contain the feature behavior information.
An embodiment of the present invention further provides an information retrieval method, characterized by including the following steps:
storing the feature behavior information that corresponds to a user under different conditions, and performing an original search according to the search keyword entered by the user;
obtaining a user identifier and a condition identifier;
retrieving the corresponding feature behavior information by means of the user identifier and the condition identifier;
performing a second search on the original search results according to the feature behavior information.
Here, the condition identifier is a time, a machine model, or a combination of the two.
An embodiment of the present invention further provides an information retrieval system that includes a search engine and a content index database searched by the search engine, the two being connected to each other, characterized in that the system further includes:
a user feature database, which stores the user's feature behavior information under different conditions; and
a content analysis system, which is connected to the user feature database, the search engine, and the communication network connecting the user terminal. The content analysis system obtains the current condition identifier and the user identifier sent by the user terminal and, according to both, queries the user feature database to obtain the feature behavior information for that user identifier under the corresponding condition. It then searches and sorts the results returned by the search engine again according to the obtained feature behavior information, and sends the re-searched and re-sorted results to the user terminal for display.
In this system, the condition identifier may be a time, a machine model, or a combination of the two.
As can be seen from the above, the solution provided by the embodiments of the present invention can use the user's personalized feature behavior associated with a time period or machine model to perform a second round of filtering on the original results that the search engine found for the user's keywords. File records that the user is genuinely interested in are displayed first, which improves the accuracy and efficiency of the user's search for relevant information.
Brief Description of the Drawings
Figure 1 is a structural diagram of a conventional search engine system.
Figure 2 is a system block diagram of the information retrieval system of an embodiment of the present invention.
Figure 3 is a block diagram of the user feature database of an embodiment of the present invention.
Figure 4 is a block diagram of the content analysis system of an embodiment of the present invention.
Figure 5 is a flowchart of the retrieval process of an embodiment of the present invention.
Mode for Carrying Out the Invention
The structure of a conventional search engine system is shown in Figure 1:
The "web spider" fetches web pages from the Internet and stores them in the "web page database", repeating the cycle until all pages have been fetched.
The system takes the text from the "web page database" and passes it to the "text index" module, which builds the index and forms the "index database".
The "user" submits a query request to the "query server". The server looks up the relevant pages in the "index database", sorts them by relevance, extracts content summaries around the keywords, and assembles the final page that is returned to the "user".
One embodiment of the present invention takes into account that a user exhibits different feature behavior information in different time periods. After the search engine returns its results, the results are therefore processed according to the user's feature behavior information for the current time period, and the results that match this feature behavior information are displayed to the user first. This improves the precision of the search and brings the results presented to the user closer to the user's needs.
Embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Figure 2 shows the information retrieval system of an embodiment of the present invention, which includes a content analysis system 23, a user feature database 24, a search engine 22, and a content index database 21, where:
The content analysis system 23 receives the user identifier and the search keyword sent by the user terminal and obtains the current time from the local server. Using these, it queries the user feature database 24 to find the user's feature behavior for that time period, then searches and filters the pages returned by the search engine 22 again, so that the pages are presented to the user in order of the priority of the feature behavior preferences the user exhibits in that time period. Besides the current time, the machine model, or the combination of time and machine model, can also serve as the condition identifier, which allows the information retrieval system to offer users higher retrieval accuracy and search efficiency.
The user feature database 24 stores the user's feature behavior information, in particular the feature behavior information the user exhibits in different time periods. The database is described in detail below.
The search engine 22 is a text- and keyword-based search tool. After searching the existing content index database 21, it returns a list of pointers to the matching files, together with the file titles and usually some descriptive text excerpted from the file bodies.
The content index database 21 is populated by an automatic software program (such as a "web spider") that visits websites, follows the hypertext links in them in turn, and extracts each file encountered by means of so-called "keywords". The extracted files are stored in the database, which the search engine 22 accesses.
Figure 3 shows one embodiment of the user feature database 24, which may store the information by means of, but is not limited to, the tables described in detail below: a personal user information table, a time period information table, a feature behavior table, and a matching table. Correspondingly, the user feature database also stores the feature behavior information that corresponds to the machine model, or to the combination of time and machine model, when these are used as the condition identifier.
The personal user information table stores the user's personal information, which may be the information the user entered at registration. Table 1 below shows a user information table:
Figure imgf000009_0001
Table 1
The time period information table stores the time period numbers assigned to different time periods. Numbering the time periods makes the database easier to search and makes the configuration of time periods more flexible. Table 2 below shows a time period information table:
Figure imgf000009_0002
Table 2
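The lookup from a clock time to a time period number might be sketched as follows. The contents of Table 2 are not reproduced in this text, so the period boundaries below are purely hypothetical; only the numbering style (T001 and so on) follows the examples used elsewhere in the description. The names TIME_PERIODS and period_number are likewise assumptions.

```python
from datetime import time

# Hypothetical time period table: (period number, start, end), end exclusive.
TIME_PERIODS = [
    ("T001", time(8, 0), time(12, 0)),
    ("T002", time(12, 0), time(18, 0)),
    ("T003", time(18, 0), time(23, 0)),
]


def period_number(now):
    """Return the number of the period containing `now`, or None if none matches."""
    for number, start, end in TIME_PERIODS:
        if start <= now < end:
            return number
    return None
```

Because the rest of the system refers only to the period number, the boundaries can be changed in this one table without touching the matching table, which is the flexibility the paragraph above refers to.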
Of course, when the machine model is used as the condition identifier, the table is a machine model information table, which is similar to the time period information table and has two columns: machine model number and machine model. When the combination of time and machine model is used as the condition identifier, a condition table is used; an example is given in Table 3 below. It is used in the same way as the time period information table (Table 2), except that it has one more column.
Figure imgf000009_0003
Table 3
The feature behavior table stores the feature behavior numbers assigned to the user's different feature behavior keywords. A feature behavior keyword may also have subordinate keywords; all of these count as feature behavior information. Table 4 below shows a feature behavior table:
Figure imgf000010_0001
Table 4
The matching table stores the feature behavior numbers that correspond to the user numbers and time period numbers. This table links Table 1, Table 2, and Table 4, that is, it relates the different time periods to the feature behavior keywords and their subordinate keywords. Table 5 below shows a matching table:
Figure imgf000010_0002
Table 5
Each row of Table 5 is one complete item of feature behavior information. It also includes a feature priority field, which indicates the priority of the user's different feature behaviors within a given time period. In the example of Table 5, for user U001 in time period T001, feature behavior C001 has feature priority 9, which is higher than the feature priority 8 of feature behavior C002. This means that in time period T001 user U001 is more inclined to exhibit the feature behavior numbered C001.
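A lookup against the matching table can be sketched as follows. The first two rows repeat the U001/T001 example given in the text; the third row, the tuple layout, and the names MATCH_TABLE and feature_behaviors are assumptions added for illustration.

```python
# Rows: (user number, time period number, feature behavior number, feature priority).
MATCH_TABLE = [
    ("U001", "T001", "C001", 9),
    ("U001", "T001", "C002", 8),
    ("U001", "T002", "C002", 5),  # invented row, present only to show filtering
]


def feature_behaviors(user_id, period_id, table=MATCH_TABLE):
    """Return (feature behavior number, priority) pairs, highest priority first."""
    rows = [(c, p) for u, t, c, p in table if u == user_id and t == period_id]
    return sorted(rows, key=lambda row: -row[1])


current = feature_behaviors("U001", "T001")
```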
The data stored in the user feature database 24 may be supplied by a system for collecting user service behavior features. For the implementation of such a system, see the applicant's invention application "System and method for collecting user service behavior features".
Figure 4 shows a block diagram of the content analysis system 23, which includes a data transceiving unit 231, a search engine interface 232, an analysis unit 233, a retrieval analysis unit 234, and a retrieval data storage unit 235. This embodiment uses time as the condition identifier.
The data transceiving unit 231 handles interaction with the user terminal. It receives the search keyword that the user enters through the user terminal and sends it to the search engine interface 232, and sends the obtained user identifier to the analysis unit 233.
The search engine interface 232 handles interaction with the search engine 22. It sends the search keyword received from the data transceiving unit 231 to the search engine 22, and forwards the search results received from the search engine 22 to the retrieval data storage unit 235.
The retrieval data storage unit 235 stores the search results of the search engine 22 forwarded by the search engine interface 232 and makes them available to the retrieval analysis unit 234 for analysis.
The analysis unit 233 receives the user identifier sent by the data transceiving unit 231 and obtains the current search time, which may be the time obtained on the system side or the time reported by the machine. Using the user identifier and the current search time, it searches the user feature database 24, obtains the feature behavior information corresponding to that user identifier and search time, and supplies it to the retrieval analysis unit 234. The feature behavior information may include, but is not limited to, feature behavior keywords and subordinate keywords. Correspondingly, when the machine model is used as the condition identifier, the analysis unit 233 is a machine model analysis unit (not shown), which performs processing similar to that of the analysis unit 233 and obtains and processes the machine model information. This unit determines the machine model number (that is, the machine model identifier) from the machine model reported by the machine, whereas the analysis unit 233 can obtain the current time either directly on the system side or from the machine. After obtaining the user identifier and the machine model identifier, the unit searches the database 24 to obtain the corresponding feature behavior information. When the combination of time and machine model is used as the condition identifier, the steps are similar, except that the machine model is replaced by the combination of time and machine model, which makes the condition more specific.
The retrieval analysis unit 234 receives the feature behavior keyword information sent by the analysis unit 233 and uses it to perform a second round of filtering and/or sorting on the search results stored in the retrieval data storage unit 235. It sends the filtered and/or sorted results to the data transceiving unit 231, which returns them to the user terminal for display to the user.
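The secondary sorting performed by the retrieval analysis unit 234 might look like the following sketch. The description does not specify how a result is matched against the feature behavior keywords or how priorities are combined, so two assumptions are made here: a keyword matches when it occurs as a substring of the result text, and a result is ranked by the highest priority among the keywords it matches. The function name rerank and the sample data are also hypothetical.

```python
def rerank(results, weighted_keywords):
    """Order results by the highest priority among the feature keywords they mention.

    `results` is a list of (title, text) pairs in the engine's original order;
    `weighted_keywords` maps a feature behavior keyword to its feature priority.
    """
    def score(item):
        _, text = item
        hits = [prio for kw, prio in weighted_keywords.items() if kw in text]
        return max(hits, default=0)

    # sorted() is stable, so results with equal scores keep the engine's order.
    return sorted(results, key=score, reverse=True)


results = [
    ("r1", "orchestral music concert"),
    ("r2", "computer game news"),
    ("r3", "weather report"),
]
reranked = rerank(results, {"game": 9, "music": 8})
titles = [title for title, _ in reranked]
```

This variant only sorts; a filtering variant would additionally drop the results whose score is 0.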
The information retrieval method of the present invention is described in detail below with reference to Figure 4 and to Figure 5, which shows a flowchart of the retrieval process carried out by the information retrieval system of an embodiment of the present invention. Steps 505 and 506 also constitute an embodiment of the condition-identifier-based retrieval method of the present invention. This embodiment uses time as the condition identifier and includes the following steps:
Step 501: The user enters search keywords for the information to be queried into the search engine provided by the user terminal. The input may include Boolean operators (such as "and" or "or") between consecutive keywords, or other operators that the search engine can recognize.
In this example, assume that the user enters the search keyword "game" at the user terminal to request related information.
Step 502: This information is transmitted over the network to the content analysis system 23, whose data transceiving unit 231 obtains the keyword of the user's query. The data transceiving unit 231 also obtains the user's identifier, which the user may enter through the user terminal or which may be recorded when the user logs in to the information retrieval system of this embodiment.
Step 503: The data transceiving unit 231 sends the obtained keyword to the search engine interface 232 and sends the user identifier to the analysis unit 233.
In this example, the data transceiving unit 231 sends the keyword "game" entered by the user to the search engine interface 232 and sends the obtained user identifier U001 to the analysis unit 233.
Step 504: The search engine interface 232 sends the keyword of the user's query to the search engine 22. The search engine 22 searches the content index database 21 for related information according to the keyword and returns the results to the search engine interface 232, which forwards them to the retrieval data storage unit 235 for storage.
Step 505: The analysis unit 233 uses the obtained user identifier and the current time to find the matching feature behavior data in the user feature database 24 and sends it to the retrieval analysis unit 234. The time information may be supplied by the local server that hosts the content analysis system or by any computer in the network; the local server is preferred here.
Correspondingly, when the machine model, or the combination of time and machine model, is used as the condition identifier, the analysis unit 233 is a machine model analysis unit, or a combined unit that has the functions of the above units (not shown). These units perform processing similar to that of the analysis unit 233 and obtain and process the machine model and related information. After obtaining the user identifier and the condition identifier (time, machine model, or a combination of the two), such an analysis unit searches the user feature database to obtain the corresponding feature behavior information. The search proceeds as follows:
retrieving the user's intermediate feature behavior information from the user feature database by means of the user identifier;
retrieving the corresponding feature behavior information from the user's intermediate feature behavior information by means of the condition identifier. The intermediate feature behavior information and the corresponding feature behavior information include feature behavior keywords and/or subordinate keywords.
These two steps need not be performed in this order: the intermediate feature behavior information may first be retrieved from the user feature database by means of the condition identifier, and the corresponding feature behavior information then retrieved from it by means of the user identifier.
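That the two retrieval steps can be applied in either order follows from each step being a plain filter on one column. The sketch below illustrates this with invented rows; the names ROWS, by_user, and by_condition and the row layout are assumptions.

```python
# Hypothetical rows of the matching table, one dict per row.
ROWS = [
    {"user": "U001", "condition": "T001", "feature": "C001"},
    {"user": "U001", "condition": "T002", "feature": "C002"},
    {"user": "U002", "condition": "T001", "feature": "C003"},
]


def by_user(rows, user_id):
    """Keep only the rows that belong to the given user identifier."""
    return [row for row in rows if row["user"] == user_id]


def by_condition(rows, condition_id):
    """Keep only the rows that belong to the given condition identifier."""
    return [row for row in rows if row["condition"] == condition_id]


# User first: the intermediate result is everything stored for U001.
user_first = by_condition(by_user(ROWS, "U001"), "T001")
# Condition first: the intermediate result is everything stored for T001.
condition_first = by_user(by_condition(ROWS, "T001"), "U001")
```

Both orders end with the same rows; only the intermediate feature behavior information differs.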
In this example, the time period number T001 is obtained from the time information. Using the user identifier U001 and the time period number, the user's current behavior preferences and priorities are retrieved from Table 5 of the user feature database 24 as (C001, 9), (C002, 8), and so on. From Table 4, the user's current feature behavior keywords and subordinate keywords are: game, video game, computer game, ...; music, classical, orchestral, .... These feature behavior keywords and their feature priorities are sent to the retrieval analysis unit 234.
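The expansion from feature behavior numbers to prioritized keywords in this example might be sketched as follows. The pairing of C001 with "game" and C002 with "music" is inferred from the order in which the example lists them and is an assumption, as are the names FEATURE_TABLE and weighted_keywords and the rule that a subordinate keyword inherits the priority of its parent keyword.

```python
# Feature behavior table: number -> (keyword, subordinate keywords).
# The number-to-keyword pairing is inferred from the example, not stated outright.
FEATURE_TABLE = {
    "C001": ("game", ["video game", "computer game"]),
    "C002": ("music", ["classical", "orchestral"]),
}


def weighted_keywords(matches, table=FEATURE_TABLE):
    """Expand (feature behavior number, priority) pairs into keyword -> priority."""
    weights = {}
    for number, priority in matches:
        keyword, subordinates = table[number]
        for word in [keyword, *subordinates]:
            # Keep the highest priority if a word appears under several features.
            weights[word] = max(priority, weights.get(word, 0))
    return weights


weights = weighted_keywords([("C001", 9), ("C002", 8)])
```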
相应地,也可以根据机器上报的型号信息从数据库中 24获得对应的机器 型号编号(即机器型号标识); 再根据用户标识和机器型号编号从数据库 24 的表中检索到该用户在该条件下的用户行为偏好和优先级; 再将用户的这些 特征行为关键词和相关特征优先级发送给检索分析单元 234。 这些步骤与上 述的以时间作为条件标识的实施方式的步骤相似。 在机器标识与时间的组合 作为条件时, 与上述步骤相似。  Correspondingly, the corresponding model number (ie, the model number of the machine) can also be obtained from the database 24 according to the model information reported by the machine; and the user is retrieved from the table of the database 24 according to the user identifier and the model number of the machine. User behavior preferences and priorities; these feature behavior keywords and related feature priorities of the user are then sent to the retrieval analysis unit 234. These steps are similar to the steps described above with time as a conditional identification. The combination of machine identification and time is similar to the above steps.
The intermediate feature behavior information and the corresponding feature behavior information are illustrated below.
Before the feature information is obtained, the feature behavior information corresponding to each user identifier under different conditions must be stored in the database. The condition identifier may be a time, a machine model, or a combination of time and machine model.
Take matching Table 5, which lists the feature behavior information, as an example. The user feature information matching table in the database is searched with a specific user number (that is, the user identifier), and the feature behavior information obtained is the intermediate feature behavior information. The result of that search is then searched again by time period number (that is, the condition identifier) to retrieve the feature behavior information that matches the specific time period (that is, the condition, which may also be a machine model or a combination of time and machine model); this is the corresponding feature behavior information. Alternatively, the search may first be performed by time period number (that is, the condition identifier) to obtain the intermediate feature behavior information, and the corresponding feature behavior information may then be obtained from the intermediate feature behavior information by user number (that is, the user identifier).
When the condition is a machine model or a combination of time and machine model, the steps for retrieving the corresponding feature behavior information are similar to those used when the condition is time; only the condition identifier changes.
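The two retrieval orders described above can be sketched as two successive filters over the matching table. This is only an illustrative sketch: the row layout, the function names, and the rows for T002, C003 and U002 are assumptions added to make the intermediate results visible; they are not taken from Table 5.

```python
# Hypothetical rows of the matching table; only the U001/T001 rows mirror the example.
ROWS = [
    {"user": "U001", "condition": "T001", "behavior": "C001", "priority": 9},
    {"user": "U001", "condition": "T001", "behavior": "C002", "priority": 8},
    {"user": "U001", "condition": "T002", "behavior": "C003", "priority": 7},
    {"user": "U002", "condition": "T001", "behavior": "C002", "priority": 6},
]


def by_user_then_condition(rows, user_id, condition_id):
    # First pass: intermediate feature behavior information for the user.
    intermediate = [row for row in rows if row["user"] == user_id]
    # Second pass: narrow the intermediate result by the condition identifier.
    return [row for row in intermediate if row["condition"] == condition_id]


def by_condition_then_user(rows, user_id, condition_id):
    # First pass: intermediate feature behavior information for the condition.
    intermediate = [row for row in rows if row["condition"] == condition_id]
    # Second pass: narrow the intermediate result by the user identifier.
    return [row for row in intermediate if row["user"] == user_id]
```

Both functions return the same rows for the same user identifier and condition identifier, which is why the order of the two steps does not matter. The `condition` value is compared only for equality, so under this sketch a machine model number or a (time period, machine model) pair could be used in its place.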
Step 506: The retrieval analysis unit 234 uses the user identifier to obtain, from the retrieval data storage unit 235, the search results already found for the user (such as page information). Using the received feature behavior keywords and the associated feature priorities, it then performs a secondary search on the search result information and reorders it, so that the page information most relevant to the user is displayed first.
In this example, when the secondary search and sorting are performed on the search results, the high-priority feature behavior keywords (games, video games, computer games, ...) are searched first, and the file information found is listed at the front. The low-priority feature behavior keywords (music, classical, orchestral, ...) are searched next, and the file information found is listed after that. Finally, the original search results that do not contain any of the feature behavior keywords in the secondary search are listed last. The process of searching for these keywords is not described in detail in the embodiments of the present invention, since such techniques are included in every text retrieval system.
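The ordering described above can be sketched as follows. Because the keyword search itself is not detailed here, the sketch assumes a simple case-insensitive substring match on the result text; the function name `rerank` and the sample results are hypothetical.

```python
def rerank(results, keyword_groups):
    """Order results by the highest-priority keyword group each one matches.

    results: list of result texts, in the search engine's original order.
    keyword_groups: list of (keywords, priority) pairs; higher priority ranks first.
    Matching is an assumed case-insensitive substring test.
    """
    groups = sorted(keyword_groups, key=lambda group: group[1], reverse=True)

    def rank(text):
        lowered = text.lower()
        for position, (keywords, _priority) in enumerate(groups):
            if any(keyword.lower() in lowered for keyword in keywords):
                return position
        # Results matching no feature behavior keyword are listed last.
        return len(groups)

    # sorted() is stable, so the original order is kept within each group.
    return sorted(results, key=rank)


original = [
    "Orchestral concert schedule",
    "Weather forecast",
    "New computer games released",
    "Classical music and video games",
]
groups = [
    (["music", "classical", "orchestral"], 8),
    (["games", "video games", "computer games"], 9),
]
reordered = rerank(original, groups)
```

A result that matches several groups is placed with the highest-priority group it matches, which is one reasonable reading of the ordering described above.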
Step 507: The retrieval analysis unit 234 sends the search results, after the secondary search and sorting, to the data transceiver unit 231, which sends them (for example, as page information) to the user terminal for display to the user.
The retrieval scheme described above can be used in almost any information retrieval system to increase the search accuracy of its search engine, regardless of whether that engine is a conventional engine. In addition, the embodiments of the present invention improve the accuracy of retrieving information from a massive database regardless of the language of the text information, for example Chinese, English, French, or German.
The above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. An information retrieval system, comprising a search engine and a content index database that are connected to each other, the content index database being provided for the search engine to search, characterized in that the system further comprises:
a user feature database, which stores the feature behavior information that a user has in different time periods; and
a content analysis system, which is connected to the user feature database, the search engine, and a communication network connecting a user terminal, and which is configured to determine the current time and receive a user identifier transmitted by the user terminal, and accordingly to query the user feature database to obtain the feature behavior information of the user identifier for the current time; and to search and sort again, according to the obtained feature behavior information, the search result information found by the search engine, and to send the re-searched and sorted search results to the user terminal for display.
2. The system according to claim 1, characterized in that the user feature database comprises:
a time period information table, for storing the different time period numbers corresponding to different time periods;
[...] keyword and/or subordinate keyword information of the feature behaviors;
3. The system according to claim 2, characterized in that the user feature database further comprises: a personal user information table, for storing the personal information of the user.
4. The system according to claim 1, characterized in that the content analysis system comprises:
a data transceiver unit, configured to interact with the user terminal, to receive the search keyword entered at the user terminal and send it to the search engine interface, and to send the user identifier to the analysis unit;
a search engine interface, configured to send the search keyword received from the data transceiver unit to the search engine, and to receive the search results of the search engine and send them to the retrieval data storage unit;
a retrieval data storage unit, configured to store the search results of the search engine received from the search engine interface, so as to provide them to the retrieval analysis unit;
an analysis unit, configured to receive the user identifier sent by the data transceiver unit and determine the current time, and accordingly to search the user feature database, obtain the feature behavior information corresponding to the user identifier for the current time, and provide it to the retrieval analysis unit; and
a retrieval analysis unit, configured to receive the feature behavior information sent by the analysis unit, and accordingly to perform secondary search filtering and/or sorting on the search results stored in the retrieval data storage unit, and to send the filtered and/or sorted search results to the data transceiver unit for return to the user terminal.
5. An information retrieval method, characterized by comprising:
storing the feature behavior information corresponding to a user in different time periods;
obtaining a search keyword entered by the user, the search engine searching according to the search keyword to obtain original search results;
obtaining a user identifier and the current time, and retrieving the corresponding feature behavior information according to the obtained user identifier and current time; and
performing a secondary search, according to the feature behavior information, on the original search results found by the search engine, and displaying to the user the search results that contain the feature behavior information.
6. The method according to claim 5, characterized in that the step of obtaining the user identifier comprises: receiving a user identifier entered by the user through the user terminal; or receiving a user identifier entered when the user logs in to the system.
7. The method according to claim 5, characterized in that the step of obtaining the current time comprises: obtaining the current time provided by a local server or by any computer device on the network.
8. The method according to claim 5, characterized in that different feature behavior information is assigned different priorities, and performing the secondary search further comprises:
performing a secondary search on the original search results found by the search engine according to each of the different feature behavior information; and
sorting the corresponding search results of the secondary search according to the priority of the feature behavior information.
9. The method according to claim 5, characterized in that the feature behavior information comprises: feature behavior keywords and/or feature behavior subordinate keywords.
10. An information retrieval method, characterized by comprising:
storing the feature behavior information corresponding to a user under different conditions, and performing an original search according to a search keyword entered by the user;
obtaining a user identifier and a condition identifier;
retrieving the corresponding feature behavior information by the user identifier and the condition identifier; and
performing a second search on the original search results according to the feature behavior information.
11. The method according to claim 10, characterized in that the condition identifier is a time, a machine model, or a combination of the two.
12. The method according to claim 10, characterized in that retrieving the corresponding feature behavior information by the user identifier and the condition identifier comprises:
obtaining the user's intermediate feature behavior information by the user identifier, and
obtaining the corresponding feature behavior information from the user's intermediate feature behavior information by the condition identifier; or
obtaining the user's intermediate feature behavior information by the condition identifier, and
obtaining the corresponding feature behavior information from the user's intermediate feature behavior information by the user identifier.
13. An information retrieval system, comprising a search engine and a content index database that are connected to each other, the content index database being provided for the search engine to search, characterized in that the system further comprises:
a user feature database, which stores the feature behavior information of a user under different conditions; and
a content analysis system, which is connected to the user feature database, the search engine, and a communication network connecting a user terminal, and which is configured to obtain the current condition identifier and the user identifier transmitted by the user terminal, and, according to the condition identifier and the user identifier, to query the user feature database to obtain the feature behavior information of the user identifier under the corresponding condition identifier; and to search and sort again, according to the obtained feature behavior information, the search result information found by the search engine, and to send the re-searched and sorted search results to the user terminal for display.
14. The system according to claim 13, characterized in that the user feature database comprises:
a condition identifier table, for storing different condition identifiers and the corresponding condition numbers;
a feature behavior table, for storing the user's different feature behavior numbers [...] keyword and/or subordinate keyword information of the feature behaviors;
a matching table, for storing the user's different conditions [...]
15. The system according to claim 13, characterized in that the content analysis system comprises:
a data transceiver unit, configured to interact with the user terminal, to receive the search keyword entered at the user terminal and send it to the search engine interface, and to send the user identifier to the analysis unit;
a search engine interface, configured to send the search keyword received from the data transceiver unit to the search engine, and to receive the search results of the search engine and send them to the retrieval data storage unit;
a retrieval data storage unit, configured to store the search results of the search engine received from the search engine interface, so as to provide them to the retrieval analysis unit;
an analysis unit, configured to receive the user identifier sent by the data transceiver unit and to obtain the condition identifier, and, according to the user identifier and the condition identifier, to search the user feature database, obtain the feature behavior information of the user identifier under the corresponding condition identifier, and provide it to the retrieval analysis unit; and
a retrieval analysis unit, configured to receive the feature behavior information sent by the analysis unit, and accordingly to perform secondary search filtering and/or sorting on the search results stored in the retrieval data storage unit, and to send the filtered and/or sorted search results to the data transceiver unit for return to the user terminal.
16. The system according to claim 13, characterized in that the condition identifier may be a time, a machine model, or a combination of the two.
PCT/CN2006/002804 2005-11-01 2006-10-20 An information retrieval system and information retrieval method WO2007051397A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN200510117147XA CN1858733B (en) 2005-11-01 2005-11-01 Information searching system and searching method
CN200510117147.X 2005-11-01

Publications (1)

Publication Number Publication Date
WO2007051397A1 true WO2007051397A1 (en) 2007-05-10

Family

ID=37297642

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2006/002804 WO2007051397A1 (en) 2005-11-01 2006-10-20 An information retrieval system and information retrieval method

Country Status (2)

Country Link
CN (1) CN1858733B (en)
WO (1) WO2007051397A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102779193A (en) * 2012-07-16 2012-11-14 哈尔滨工业大学 Self-adaptive personalized information retrieval system and method
US9740996B2 (en) 2012-03-27 2017-08-22 Alibaba Group Holding Limited Sending recommendation information associated with a business object
CN111143460A (en) * 2019-12-30 2020-05-12 智慧神州(北京)科技有限公司 Big data-based economic field data retrieval method and device and processor
CN112445830A (en) * 2020-11-26 2021-03-05 湖南智慧政务区块链科技有限公司 Data analysis system based on block chain technology
CN116186078A (en) * 2023-03-15 2023-05-30 中国华能集团有限公司北京招标分公司 Data retrieval method and system
CN116578677A (en) * 2023-07-14 2023-08-11 高密市中医院 Retrieval system and method for medical examination information

Families Citing this family (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100555283C (en) * 2006-12-12 2009-10-28 北京搜狗科技发展有限公司 A kind of directly at the dissemination method and the system of user's relevant information
CN101374044B (en) * 2007-08-21 2010-12-15 中国电信股份有限公司 Method and system for making business engine to obtain user identification
CN101996200B (en) * 2009-08-19 2014-03-12 华为技术有限公司 Method and device for searching file
US20110225139A1 (en) * 2010-03-11 2011-09-15 Microsoft Corporation User role based customizable semantic search
CN102207943A (en) * 2010-03-29 2011-10-05 上海博泰悦臻电子设备制造有限公司 Identification information matching-based search method and device
CN102207942A (en) * 2010-03-29 2011-10-05 上海博泰悦臻电子设备制造有限公司 Identification information matching-based search method and device
CN102253936B (en) * 2010-05-18 2013-07-24 阿里巴巴集团控股有限公司 Method for recording access of user to merchandise information, search method and server
TWI547888B (en) * 2010-08-27 2016-09-01 Alibaba Group Holding Ltd A method of recording user information and a search method and a server
CN101916295B (en) * 2010-08-27 2011-12-14 董方 Internet search system and method based on point-to-point network
CN101996246B (en) * 2010-11-09 2012-11-14 中国电信股份有限公司 Method and system for instant indexing
CN102117332A (en) * 2011-03-10 2011-07-06 辜进荣 Given time-based searching method
CN102184224A (en) * 2011-05-09 2011-09-14 李郁文 System and method for screening search results
CN102902695A (en) * 2011-07-29 2013-01-30 上海博泰悦臻电子设备制造有限公司 Navigation system as well as interest point searching method and device
CN102270243A (en) * 2011-08-25 2011-12-07 北京思博途信息技术有限公司 Information search method and system
CN102385636A (en) * 2011-12-22 2012-03-21 陈伟 Intelligent searching method and device
CN102663048B (en) * 2012-03-29 2017-04-12 天津奇思科技有限公司 Method and device for providing search result
CN103577049B (en) * 2012-07-24 2019-04-12 百度在线网络技术(北京)有限公司 A kind of method, apparatus and equipment for suggesting object for providing downloading
CN102880633A (en) * 2012-07-27 2013-01-16 四川长虹电器股份有限公司 Content pushing method based on characteristic word
CN103324675A (en) * 2013-05-24 2013-09-25 崔吉平 Internet individuation accurate information search and algorithm
CN103970848B (en) * 2014-05-01 2016-05-11 刘莎 A kind of universal internet information data digging method
CN104036003B (en) * 2014-06-16 2018-12-14 北京奇虎科技有限公司 search result integration method and device
CN104765867A (en) * 2015-04-23 2015-07-08 宁波市科技信息研究院 Collaborative manufacturing information sharing system
CN105045883B (en) * 2015-07-21 2020-12-25 惠州Tcl移动通信有限公司 Mobile terminal and searching method thereof
CN107885889A (en) * 2017-12-13 2018-04-06 聚好看科技股份有限公司 Feedback method, methods of exhibiting and the device of search result
CN108073726B (en) * 2018-01-29 2019-07-16 百度在线网络技术(北京)有限公司 Method, apparatus, storage medium and the terminal device of information retrieval push
CN109271577A (en) * 2018-09-13 2019-01-25 江苏站企动网络科技有限公司 A kind of network-based information retrieval method
CN110502692B (en) * 2019-07-10 2023-02-03 平安普惠企业管理有限公司 Information retrieval method, device, equipment and storage medium based on search engine
CN111444377A (en) * 2020-04-15 2020-07-24 厦门快商通科技股份有限公司 Voiceprint identification authentication method, device and equipment
CN111914142B (en) * 2020-07-30 2023-07-04 重庆电子工程职业学院 Time-division memory information retrieval system
CN112104910B (en) * 2020-08-05 2023-02-03 苏宁智能终端有限公司 Video searching method, device and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1319815A (en) * 1999-09-22 2001-10-31 Lg电子株式会社 Multimedia search and browse method using multimedia user simple document information structure
CN1460373A (en) * 2001-04-03 2003-12-03 皇家菲利浦电子有限公司 Method and apparatus for generating recommendations based on user preferences and environmental characteristics
WO2004090755A2 (en) * 2003-03-31 2004-10-21 Google Inc. System and method for providing preferred language ordering of search results

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9740996B2 (en) 2012-03-27 2017-08-22 Alibaba Group Holding Limited Sending recommendation information associated with a business object
CN102779193A (en) * 2012-07-16 2012-11-14 哈尔滨工业大学 Self-adaptive personalized information retrieval system and method
CN111143460A (en) * 2019-12-30 2020-05-12 智慧神州(北京)科技有限公司 Big data-based economic field data retrieval method and device and processor
CN112445830A (en) * 2020-11-26 2021-03-05 湖南智慧政务区块链科技有限公司 Data analysis system based on block chain technology
CN116186078A (en) * 2023-03-15 2023-05-30 中国华能集团有限公司北京招标分公司 Data retrieval method and system
CN116578677A (en) * 2023-07-14 2023-08-11 高密市中医院 Retrieval system and method for medical examination information
CN116578677B (en) * 2023-07-14 2023-09-15 高密市中医院 Retrieval system and method for medical examination information

Also Published As

Publication number Publication date
CN1858733B (en) 2012-04-04
CN1858733A (en) 2006-11-08


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 06805013

Country of ref document: EP

Kind code of ref document: A1