WO2001024046A3 - Authoring, altering, indexing, storing and retrieving electronic documents embedded with contextual markup

Info

Publication number
WO2001024046A3
Authority
WO
WIPO (PCT)
Prior art keywords
contextual
character data
html
context sensitive
hybrid
Prior art date
Application number
PCT/CA2000/000861
Other languages
French (fr)
Other versions
WO2001024046A2 (en)
Inventor
Duane Allan Nickull
Chad Matthew Mackenzie
Jamie Michael Thomas Hoglund
Original Assignee
Xml Global Technologies Inc
Duane Allan Nickull
Chad Matthew Mackenzie
Jamie Michael Thomas Hoglund
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xml Global Technologies Inc, Duane Allan Nickull, Chad Matthew Mackenzie, Jamie Michael Thomas Hoglund
Priority to AU69739/00A (AU6973900A)
Publication of WO2001024046A2
Publication of WO2001024046A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/80 Information retrieval; Database structures therefor; File system structures therefor of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML
    • G06F16/81 Indexing, e.g. XML tags; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/14 Tree-structured documents
    • G06F40/143 Markup, e.g. Standard Generalized Markup Language [SGML] or Document Type Definition [DTD]

Abstract

The invention provides a method, computer-readable instructions and a system for generating and indexing context sensitive HTML documents. A context sensitive HTML document is generated by inserting an opening contextual markup tag before an item of character data (or other computer-readable data) within an HTML document and a closing contextual markup tag after that item. Both the opening and the closing tag include a predefined prefix that identifies contextual information and at least one contextual term that identifies the context in which the item of character data is used. Each contextual markup tag is marked with HTML delimiters. Documents generated in this way are hybrid HTML documents: they remain compatible with HTML, and they can be processed and incorporated into a context sensitive database that users may search. A hybrid HTML document is processed by scanning it for character data marked with the contextual markup tags. The character data and its associated contextual terms are retrieved from the hybrid HTML document and added to the context sensitive database. When the character data is added to the database, it is linked to its associated contextual terms and to an address identifying the location of the hybrid HTML document from which the character data and contextual terms originated.
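The two halves of the abstract (tagging an item of character data, then scanning the hybrid document into a database) can be illustrated with a minimal Python sketch. The abstract does not give a concrete tag syntax, so everything specific here is an assumption made for illustration: the `ctx` prefix, the `<prefix:term>` layout, the function names `tag_item`, `scan` and `index_document`, the in-memory dictionary standing in for the context sensitive database, and the example URL. The sketch handles one contextual term per tag and does not handle nested contextual tags.

```python
import re
from collections import defaultdict

PREFIX = "ctx"  # hypothetical prefix; the abstract does not name one


def tag_item(html: str, item: str, term: str, prefix: str = PREFIX) -> str:
    """Wrap the first occurrence of `item` in opening and closing contextual tags."""
    opening = f"<{prefix}:{term}>"   # HTML delimiters + prefix + contextual term
    closing = f"</{prefix}:{term}>"
    return html.replace(item, f"{opening}{item}{closing}", 1)


def scan(html: str, prefix: str = PREFIX):
    """Yield (contextual term, character data) pairs found in a hybrid document."""
    pattern = re.compile(
        rf"<{re.escape(prefix)}:([\w-]+)>(.*?)</{re.escape(prefix)}:\1>",
        re.DOTALL,
    )
    for match in pattern.finditer(html):
        yield match.group(1), match.group(2)


def index_document(db, url: str, html: str, prefix: str = PREFIX) -> None:
    """Link each contextual term to its character data and to the source address."""
    for term, data in scan(html, prefix):
        db[term][data].add(url)


# Generate a hybrid document from ordinary HTML.
page = "<html><body><p>Written by Jane Doe in Vancouver.</p></body></html>"
hybrid = tag_item(page, "Jane Doe", "author")
hybrid = tag_item(hybrid, "Vancouver", "city")

# Index it: term -> character data -> set of document addresses.
database = defaultdict(lambda: defaultdict(set))
index_document(database, "http://example.com/page.html", hybrid)
```

Looking up `database["author"]` then returns the character data tagged as an author together with the addresses of the documents it came from. A regular expression is used only to keep the sketch short; a real scanner would more likely use an HTML parser.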
Application PCT/CA2000/000861 (priority date 1999-09-29, filing date 2000-07-21): Authoring, altering, indexing, storing and retrieving electronic documents embedded with contextual markup, published as WO2001024046A2 (en)

Priority Applications (1)

Application Number | Publication | Priority Date | Filing Date | Title
AU69739/00A | AU6973900A (en) | 1999-09-29 | 2000-07-21 | Authoring, altering, indexing, storing and retrieving electronic documents embedded with contextual markup

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US40733699A 1999-09-29 1999-09-29
US09/407,336 1999-09-29

Publications (2)

Publication Number | Publication Date
WO2001024046A2 (en) | 2001-04-05
WO2001024046A3 (en) | 2002-05-02

Family

ID=23611600

Family Applications (2)

Application Number | Publication | Priority Date | Filing Date | Title
PCT/CA2000/000861 | WO2001024046A2 (en) | 1999-09-29 | 2000-07-21 | Authoring, altering, indexing, storing and retrieving electronic documents embedded with contextual markup
PCT/CA2000/001042 | WO2001024045A2 (en) | 1999-09-29 | 2000-09-08 | Method, system, signals and media for indexing, searching and retrieving data based on context

Family Applications After (1)

Application Number | Publication | Priority Date | Filing Date | Title
PCT/CA2000/001042 | WO2001024045A2 (en) | 1999-09-29 | 2000-09-08 | Method, system, signals and media for indexing, searching and retrieving data based on context

Country Status (2)

Country | Link
AU (2) | AU6973900A (en)
WO (2) | WO2001024046A2 (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20040172415A1 (en) | 1999-09-20 | 2004-09-02 | Messina Christopher P. | Methods, systems, and software for automated growth of intelligent on-line communities
US6927027B2 (en) * | 1999-12-21 | 2005-08-09 | Ingeneus Corporation | Nucleic acid multiplex formation
CA2455693A1 (en) * | 2000-09-20 | 2002-12-05 | Body1, Inc. | Methods, systems, and software for automated growth of intelligent on-line communities
EP1244008A1 (en) | 2001-03-20 | 2002-09-25 | Sap Ag | Method, computer program, and computer for automatically selecting application services for communicating data from a server to a client depending on the type of the client device
AU2002300674B2 (en) * | 2001-08-31 | 2007-09-20 | Trusted Board Ltd | Electronic approval of documents
EP2202648A1 (en) | 2002-04-12 | 2010-06-30 | Mitsubishi Denki Kabushiki Kaisha | Hint information describing method for manipulating metadata
US7020667B2 (en) | 2002-07-18 | 2006-03-28 | International Business Machines Corporation | System and method for data retrieval and collection in a structured format
US7664727B2 (en) | 2003-11-28 | 2010-02-16 | Canon Kabushiki Kaisha | Method of constructing preferred views of hierarchical data
JP2008507792A (en) * | 2004-07-26 | 2008-03-13 | パンセン インフォマティクス インコーポレイテッド | A search engine that uses the background situation placed on the network
US7689910B2 (en) | 2005-01-31 | 2010-03-30 | International Business Machines Corporation | Processing semantic subjects that occur as terms within document content
US8635691B2 (en) * | 2007-03-02 | 2014-01-21 | 403 Labs, Llc | Sensitive data scanner
WO2009003281A1 (en) * | 2007-07-03 | 2009-01-08 | Tlg Partnership | System, method, and data structure for providing access to interrelated sources of information
US8442982B2 (en) | 2010-11-05 | 2013-05-14 | Apple Inc. | Extended database search

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
DAO T: "AN INDEXING MODEL FOR STRUCTURED DOCUMENTS TO SUPPORT QUERIES ON CONTENT, STRUCTURE AND ATTRIBUTES", PROCEEDINGS OF THE FORUM ON RESEARCH AND TECHNOLOGY ADVANCES IN DIGITAL LIBRARIES, April 1998 (1998-04-01), pages 88 - 97, XP002925486 *
DELLA MEA V ET AL: "HTML generation and semantic markup for telepathology", COMPUTER NETWORKS AND ISDN SYSTEMS, NORTH HOLLAND PUBLISHING, vol. 28, no. 11, 1 May 1996 (1996-05-01), AMSTERDAM, NL, pages 1085 - 1094, XP004018210, ISSN: 0169-7552 *
DOBSON S A ET AL: "Lightweight databases", COMPUTER NETWORKS AND ISDN SYSTEMS, NORTH HOLLAND PUBLISHING, vol. 27, no. 6, 1 April 1995 (1995-04-01), AMSTERDAM, NL, pages 1009 - 1015, XP004013202, ISSN: 0169-7552 *
LUKE S ET AL: "ONTOLOGY-BASED WEB AGENTS", PROCEEDINGS OF THE FIRST INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS, MARINA DEL REY, CA, 5 February 1997 (1997-02-05) - 8 February 1997 (1997-02-08), ACM, NEW YORK, NY, US, pages 59 - 66, XP000775144, ISBN: 0-89791-877-0 *

Also Published As

Publication number Publication date
AU6976600A (en) 2001-04-30
WO2001024045A2 (en) 2001-04-05
WO2001024046A2 (en) 2001-04-05
AU6973900A (en) 2001-04-30
WO2001024045A3 (en) 2002-05-10

Similar Documents

Publication Publication Date Title
US10796074B2 (en) Linking sources to copied text
WO2001024046A3 (en) Authoring, altering, indexing, storing and retrieving electronic documents embedded with contextual markup
EP0827088A3 (en) Finding and modifying strings of a regular language in a text
US7546352B1 (en) Method to automatically merge e-mail replies
US8321396B2 (en) Automatically extracting by-line information
US7953592B2 (en) Semantic analysis apparatus, semantic analysis method and semantic analysis program
US20050004909A1 (en) Method and system for augmenting web content
WO2001082114A3 (en) System for fulfilling an information need
WO2000043918A3 (en) System for inserting hyperlinks into documents
EP1045322A3 (en) Information providing method, information providing system, terminal apparatus, and storage medium storing information providing program
EP1109390A3 (en) System and method for browsing and searching through voicemail using automatic speech recognition
EP2270744A1 (en) Method and apparatus for facilitating directed reading of document portions based on information-sharing relevance
CN101473322A (en) Search early warning
CN103500194A (en) Method, device and browser for loading webpage
EP1615154A3 (en) Method and software for extracting chemical data
WO2005052737A3 (en) System and method of virtualizing physical locations
MXPA02000185A (en) Method and system for searching classified advertising.
WO2006005001A3 (en) Method and system for automated intelligent electronic advertising
EP1107128A1 (en) Apparatus and method for checking the validity of links in a computer network
US7219298B2 (en) Method, system, and program for verifying network addresses included in a file
CN104090869B (en) A kind of method and translation system for translating the network information
JP2006293573A (en) Electronic mail processor, electronic mail filtering method and electronic mail filtering program
WO2004095432A3 (en) Generation and presentation of search results using addressing information
EP1128290A3 (en) A method and system for summarizing and presenting information from results of a search in very large full-text databases
JP2007052737A (en) Information processor and computer program

Legal Events

Date Code Title Description
AK Designated states
    Kind code of ref document: A2
    Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW
AL Designated countries for regional patents
    Kind code of ref document: A2
    Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG
121 EP: the EPO has been informed by WIPO that EP was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
REG Reference to national code
    Ref country code: DE
    Ref legal event code: 8642
122 EP: PCT application non-entry in European phase
NENP Non-entry into the national phase
    Ref country code: JP