CN1841377A - Crawling databases for information - Google Patents
Crawling databases for information
- Publication number
- CN1841377A
- Authority
- CN
- China
- Prior art keywords
- database
- information
- pieces
- filter
- data structure
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/40—Data acquisition and logging
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US11/096,429 | 2005-03-29 | | |
| US11/096,429 US7801880B2 (en) | 2005-03-29 | 2005-03-29 | Crawling databases for information |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN1841377A (zh) | 2006-10-04 |
Family
ID=36581869
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CNA2006100515554A Pending CN1841377A (zh) | 2005-03-29 | 2006-02-28 | Crawling databases for information |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US7801880B2 |
| EP (1) | EP1708104A1 |
| JP (1) | JP5048956B2 |
| KR (1) | KR101224800B1 |
| CN (1) | CN1841377A |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN105074696A (zh) * | 2013-01-16 | 2015-11-18 | Google Inc. | Unified searchable storage for resource-constrained and other devices |
| US11755386B2 (en) | 2019-03-11 | 2023-09-12 | Coupang Corp. | Systems and methods for managing application programming interface information |
Families Citing this family (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
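| KR100848264B1 (ko) * | 2006-11-23 | 2008-07-25 | Yonsei University Industry-Academic Cooperation Foundation | Method for building a database of steel bridges |
| JP4868245B2 (ja) * | 2007-08-17 | 2012-02-01 | Yahoo Japan Corporation | Search system, search device, and search method |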
| EP2463785A1 (en) * | 2010-12-13 | 2012-06-13 | Fujitsu Limited | Database and search-engine query system |
| US8620897B2 (en) * | 2011-03-11 | 2013-12-31 | Microsoft Corporation | Indexing and searching features including using reusable index fields |
| JP5578137B2 (ja) * | 2011-05-25 | 2014-08-27 | Fujitsu Limited | Search program, device, and method |
| RU2568276C2 (ru) * | 2014-01-24 | 2015-11-20 | Закрытое акционерное общество "РИВВ" | Method of extracting useful content from installation files of mobile applications for further machine processing of data, in particular search |
| US10803087B2 (en) * | 2018-10-19 | 2020-10-13 | Oracle International Corporation | Language interoperable runtime adaptable data collections |
| US11366862B2 (en) * | 2019-11-08 | 2022-06-21 | Gap Intelligence, Inc. | Automated web page accessing |
Family Cites Families (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US7370004B1 (en) * | 1999-11-15 | 2008-05-06 | The Chase Manhattan Bank | Personalized interactive network architecture |
| US6876997B1 (en) * | 2000-05-22 | 2005-04-05 | Overture Services, Inc. | Method and apparatus for indentifying related searches in a database search system |
| JP2002049637A (ja) * | 2000-08-04 | 2002-02-15 | Hitachi Ltd | Database management method and apparatus, and recording medium |
| US7630959B2 (en) * | 2000-09-06 | 2009-12-08 | Imagitas, Inc. | System and method for processing database queries |
| US20020042789A1 (en) * | 2000-10-04 | 2002-04-11 | Zbigniew Michalewicz | Internet search engine with interactive search criteria construction |
| US6636854B2 (en) * | 2000-12-07 | 2003-10-21 | International Business Machines Corporation | Method and system for augmenting web-indexed search engine results with peer-to-peer search results |
| US7299219B2 (en) * | 2001-05-08 | 2007-11-20 | The Johns Hopkins University | High refresh-rate retrieval of freshly published content using distributed crawling |
| US20040230572A1 (en) * | 2001-06-22 | 2004-11-18 | Nosa Omoigui | System and method for semantic knowledge retrieval, management, capture, sharing, discovery, delivery and presentation |
| US6763362B2 (en) * | 2001-11-30 | 2004-07-13 | Micron Technology, Inc. | Method and system for updating a search engine |
| US20040117376A1 (en) * | 2002-07-12 | 2004-06-17 | Optimalhome, Inc. | Method for distributed acquisition of data from computer-based network data sources |
| JP2005071050A (ja) * | 2003-08-22 | 2005-03-17 | Nippon Hoso Kyokai <Nhk> | Information presentation system, information presentation device, and information presentation program |
| US8224872B2 (en) * | 2004-06-25 | 2012-07-17 | International Business Machines Corporation | Automated data model extension through data crawler approach |
2005
- 2005-03-29: US, application US11/096,429, published as US7801880B2 (en); not active, Expired - Fee Related
2006
- 2006-02-09: KR, application KR1020060012550A, published as KR101224800B1 (ko); not active, Expired - Fee Related
- 2006-02-28: CN, application CNA2006100515554A, published as CN1841377A (zh); active, Pending
- 2006-03-01: JP, application JP2006055312A, published as JP5048956B2 (ja); not active, Expired - Fee Related
- 2006-03-22: EP, application EP06111548A, published as EP1708104A1 (en); not active, Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| US7801880B2 (en) | 2010-09-21 |
| KR20060105438A (ko) | 2006-10-11 |
| JP2006277732A (ja) | 2006-10-12 |
| US20060224592A1 (en) | 2006-10-05 |
| KR101224800B1 (ko) | 2013-01-21 |
| JP5048956B2 (ja) | 2012-10-17 |
| EP1708104A1 (en) | 2006-10-04 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US9558186B2 (en) | | Unsupervised extraction of facts |
| EP2321745B1 (en) | | Providing posts to discussion threads in response to a search query |
| US7707161B2 (en) | | Method and system for creating a concept-object database |
| US6757678B2 (en) | | Generalized method and system of merging and pruning of data trees |
| US20020065857A1 (en) | | System and method for analysis and clustering of documents for search engine |
| US7383299B1 (en) | | System and method for providing service for searching web site addresses |
| US20020042789A1 (en) | | Internet search engine with interactive search criteria construction |
| US20030225811A1 (en) | | Automatically deriving an application specification from a web-based application |
| CA2657418A1 (en) | | Joint optimization of wrapper generation and template detection |
| US7464090B2 (en) | | Object categorization for information extraction |
| KR20190131778A (ko) | | Web crawler system for collecting structured and unstructured data contained in hidden URLs |
| CN1841377A (zh) | | Crawling databases for information |
| Fernandez et al. | | Data preprocessing and cleansing in web log on ontology for enhanced decision making |
| JP2002534741A (ja) | | Method and apparatus for processing semi-structured text data |
| KR100296500B1 (ko) | | Intelligent Internet shopping mall product comparison search engine |
| US20120109965A1 (en) | | System for automatic semantic-based mining |
| US20060136381A1 (en) | | Method and system for a text based search of a self-contained document |
| EP1484694A1 (en) | | Converting object structures for search engines |
| Hernández et al. | | An architecture for efficient web crawling |
| JP2003331089A (ja) | | Service site usage analysis apparatus |
| Handschuh et al. | | Deep Annotation for Information Integration. |
| Kamath et al. | | Change propagation based incremental data handling in a Web service discovery framework |
| Fiala | | Web mining methods for the detection of authoritative sources |
| Peng | | Web mining with jMap technology |
| Campi | | Exploiting the Search Computing paradigm in e-government |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | C12 | Rejection of a patent application after its publication | |
| | RJ01 | Rejection of invention patent application after publication | Open date: 20061004 |