EP1159689A2 - Search and navigation device for hypertext documents - Google Patents

Search and navigation device for hypertext documents

Info

Publication number
EP1159689A2
EP1159689A2 EP00916782A EP00916782A EP1159689A2 EP 1159689 A2 EP1159689 A2 EP 1159689A2 EP 00916782 A EP00916782 A EP 00916782A EP 00916782 A EP00916782 A EP 00916782A EP 1159689 A2 EP1159689 A2 EP 1159689A2
Authority
EP
European Patent Office
Prior art keywords
documents
document
similarity
symbols
references
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP00916782A
Other languages
German (de)
English (en)
Inventor
Heiko Holzheuer
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Siemens AG
Original Assignee
Siemens AG
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Siemens AG
Publication of EP1159689A2

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/93 Document management systems
    • G06F16/94 Hypermedia

Definitions

  • The invention relates to navigation and search in documents linked to one another by references, which are commonly referred to as hypertext documents.
  • HTML: Hypertext Markup Language
  • Another example of hypertext documents are the help files contained in the graphic user interface "WINDOWS" distributed by Microsoft.
  • HTML pages should be viewed as representative of all hypertext documents.
  • Search engines are known as a further aid. They are called up with one or more keywords, which are applied to a previously created, continuously updated, usually very extensive index that is not directly visible, and they display references to a number of documents in which those keywords are mentioned. When these indexes are prepared for HTML documents, either the keywords assignable via the META tag are used, or the text content of other tags, especially the "TITLE" tags, or in addition the entire text content. Which of these is used is primarily a question of the size of the database to be indexed in relation to the available resources.
  • Hypertext pages represent a tree in their basic structure, because each page appears as a node with references to subordinate nodes. Back references and cross references, however, disturb this structure. Nevertheless, it is known as a navigation aid to display a structure tree of hypertext documents, which is also referred to as a 'site map'.
  • a tree is built up, with all references that contradict the tree structure being completely suppressed or only weakly displayed.
  • a number of mostly two-dimensional graphic representation forms are known. Recently, three-dimensional images have been chosen that the user can rotate interactively in space, with a corresponding projection being displayed on a two-dimensional surface.
  • US Pat. No. 5,847,708 shows a device in which documents, in particular hypertext documents, are displayed spatially arranged according to a similarity measure. A content reference is thus given; however, the chaining of the documents is no longer visible.
  • the object of the invention is to provide a device which, based on a known hypertext document, automatically displays other documents without the user having to extract search words from the content of the original document in order to initiate an index or full-text search.
  • the invention uses the knowledge that a page of similar content is required in many cases.
  • the invention therefore provides a device with which a symbolic representation of an original document and the documents linked to it is displayed, with each symbol simultaneously indicating the degree of similarity to the selected original document.
  • Fig. 1: an image displayed by the device.
  • the device consists of a computer with a graphic display and the well-known input devices such as mouse and keyboard.
  • the graphic display is preferably operated with the program packages X/Windows, JAVA, an operating system ending in -ix, etc. It is also possible to use the Microsoft programs, often shortened to 'Windows'.
  • This display shows a document that is a hypertext document that is preferably saved in HTML format.
  • a program called a browser is used for the display, which evaluates the HTML format instructions for display.
  • the hypertext references are of particular importance for the present invention, hereinafter referred to briefly as references or 'links'.
  • a JAVA application provides the user with an additional function, described in more detail below, on top of the functions already provided by the browser. It is equally possible to do this with a separate program in JAVA or another suitable programming language, to which the address of the source document, referred to as the URL, is given as a parameter.
  • this program first uses the references contained in the original document to reach the documents they designate, for which the procedure is in turn repeated recursively. Since the reference structures of hypertext documents do not necessarily represent a tree, the search depth must be restricted. This is done either by specifying the recursion depth, e.g. four, or the number of documents visited, or the time spent, or a combination thereof. It can also be specified that only addresses of a certain domain are followed.
  • document A1 contains two references to documents B1 and B2; B1 refers to C1, C2, C3 and D4; B2 refers to C3 and C4; C1 refers to D1, D2 and D3; C2 refers to D3, D4 and D5; C3 refers to D5 and D6; C4 refers to D7 and D8.
  • the first approximation of the frequency is simply the number of occurrences in the document.
  • An improved variant takes into account how the location of the occurrence is marked. For example, words in the title or keyword list could be rated with greater weight, so that the frequency appears as a fraction. Normalization to the total number is also possible. This results in a matrix that grows with the number of documents examined, in whose rows the documents and in whose columns the words are indexed.
  • a distance between two documents can be determined by multiplying two row vectors element by element and summing the products. This distance measure is larger the more similar the two documents are, because the sum is particularly large if the documents have words in common that also occur equally frequently in both documents.
  • the first proposals in this direction were made by H. Luhn in the article "The Automatic Creation of Literature Abstracts", IBM Journal of Research and Development 2, 158-165, 1958.
  • Other functions are likewise possible that use the matrix, or that derive from it a square symmetric matrix of the distances of the documents from one another by determining distances in pairs and thus eliminating the words.
  • the distance measure does not meet the criteria of a distance in the topological sense, since the triangle inequality does not have to be satisfied and the distance of a document to itself yields a maximum value instead of zero.
  • The use of word frequency vectors is advantageous in that the matrix of the weighted word occurrences can be built up dynamically during the recursive search, and each document only has to be transmitted and analyzed once. However, this does not preclude the device from being operated in such a way that a distance measure is determined anew each time by loading and evaluating the documents concerned at that moment.
  • a combination is also possible, in which the determination via word frequencies yields a preselection of documents, for which the distance measure is then determined in pairs by other methods that require the document text itself. As indicated above, this could apply to languages in which reducing words to their stems requires extensive syntactic and semantic analysis.
  • the reference structure is preferably displayed.
  • a variety of forms are known for this, ranging from an indented list through a tree-like graphic representation to elaborate 3D/2D representations. In all the usual forms of representation, the focus is on the tree structure that arises canonically during recursive descent. The references that do not correspond to the tree structure are then either not shown or shown as additional lines, possibly in a weak form.
  • Various formats are known as 3D / 2D representations, in which the structure is initially built up as a graphic in a three-dimensional space and then projected onto a two-dimensional surface, as is known, for example, as a "fish-eye view".
  • the invention consists in the fact that the distance to the original document, determined via the matrix or otherwise, is indicated in the symbols of the structural representation.
  • Color is preferably used because it does not play a significant role in the known representations. For example, red could be used for the greatest similarity, green for the next closest, and yellow and blue through to black for relative dissimilarity.
  • Grayscale is another type of coloring; for a display with a light background, it is preferred to use white for low similarity and black for high similarity.
  • the size of the symbols is also equivalent to a color; therefore "color" also stands for grayscale as well as for other scalable representations such as the diameter of a circular area.
  • In 3D/2D representations in which perspective foreshortening due to the projection is desired in order to visualize the spatial position, size cannot be used as a "color".
  • the use of shape is also possible: a triangle is a considerably more striking representation and clearly distinguishable from a square, whereas the difference between a hexagon and a heptagon is hardly visible. Nevertheless, in this example the number of corners also represents a "color".
  • For users with a reduced ability to see chromatic colors, which is mostly compensated by a better ability to distinguish shapes, this option is important, and it can be combined with the chromatic color representation.
  • a symbol that is not yet the source document can be made the new source document using an input device (mouse).
  • the new color of the representation can then be quickly determined and displayed. In this case, preferably no new descent is carried out from the new position, but the already accumulated data are used. With appropriate equipment, however, it is advisable to add the missing documents that have been moved within reach of the new reference point; possibly as a background process, which then brings the display to the possibly changed state on request.
  • If the words are still available as a list, they can be made available to the user as a further means of selection. The list can be ordered alphabetically or by frequency. If the user chooses one or more words, the document that best fits them becomes the source document. Formally, the source document is then a virtual document that contains the selected words with the maximum word frequencies.
  • Another embodiment uses, preferably in addition to the color, the spacing of the symbols in 3D space as a "color feature".
  • 3D representations in particular still leave considerable room for maneuver in the relative spacing of the symbols.
  • Since the measures used, as stated above, do not represent a metric, such a mapping is not uniquely defined.
  • an iterative procedure can produce a deformation that clearly shows the relative proximity of different documents. It can be accepted that the display does not stand still but changes slightly all the time because of the opposing effects. Indeed, this "breathing" is better suited to conveying the relative uncertainty of the classification than a "frozen" image that simulates a final arrangement that is not stable at all.
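
The depth-limited descent through the reference structure described above can be sketched as follows. This is only a minimal illustration in Python, not the program of the device: the link table reproduces the example documents A1 to D8 from the description, while the function name `collect_documents` and its parameters `max_depth` and `max_documents` are assumptions introduced here. The sketch walks the references breadth-first instead of by literal recursion, so that each document is recorded once, at its shallowest depth.

```python
from collections import deque

# Reference structure of the example in the description (A1 is the original document).
LINKS = {
    "A1": ["B1", "B2"],
    "B1": ["C1", "C2", "C3", "D4"],
    "B2": ["C3", "C4"],
    "C1": ["D1", "D2", "D3"],
    "C2": ["D3", "D4", "D5"],
    "C3": ["D5", "D6"],
    "C4": ["D7", "D8"],
}


def collect_documents(start, links, max_depth=4, max_documents=None):
    """Follow references from `start`, visiting each document only once.

    Returns a dict mapping every reached document to the depth at which
    it was first found. Both limits are hypothetical parameters standing
    in for the restrictions named in the description.
    """
    depth_of = {start: 0}
    queue = deque([start])
    while queue:
        doc = queue.popleft()
        if depth_of[doc] >= max_depth:
            continue  # depth limit reached: do not follow further references
        for target in links.get(doc, []):
            if target in depth_of:
                continue  # back or cross reference to a known document
            if max_documents is not None and len(depth_of) >= max_documents:
                return depth_of  # limit on the number of visited documents
            depth_of[target] = depth_of[doc] + 1
            queue.append(target)
    return depth_of


if __name__ == "__main__":
    for name, depth in collect_documents("A1", LINKS).items():
        print(depth, name)
```

A time limit or a restriction to one domain, both also mentioned in the description, could be added as further checks inside the loop.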

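The similarity measure and its display as a "color" can be sketched in the same way. The code below is an illustrative assumption rather than the patented implementation: it uses plain occurrence counts (the "first approximation" in the description), multiplies two word vectors element by element and sums the products, and then scales the result to a shade between 0.0 (white, least similar) and 1.0 (black, most similar). The function names, the simple tokenizer and the sample texts are hypothetical.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count how often each word occurs in a document (unweighted)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def similarity(freq_a, freq_b):
    """Multiply two word vectors element by element and sum the products.

    Larger values mean more similar documents; the value of a document
    compared with itself is a maximum, not zero, so this is not a metric.
    """
    return sum(count * freq_b[word] for word, count in freq_a.items())


def similarity_shades(origin_text, other_texts):
    """Map each document's similarity to the origin onto a shade in [0, 1]."""
    origin = word_frequencies(origin_text)
    scores = {
        name: similarity(origin, word_frequencies(text))
        for name, text in other_texts.items()
    }
    top = max(scores.values(), default=0)
    # 0.0 = white (least similar), 1.0 = black (most similar)
    return {name: (score / top if top else 0.0) for name, score in scores.items()}


if __name__ == "__main__":
    origin = "hypertext documents and hypertext links"
    others = {
        "B1": "hypertext links between documents",
        "B2": "cooking recipes",
    }
    print(similarity_shades(origin, others))
```

Weighting words in the title or keyword list more heavily, as the improved variant suggests, would only change `word_frequencies`; the product sum and the shade mapping stay the same.
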
Abstract

This invention relates to a device for searching or navigating in documents that are linked to one another by references, said documents being represented symbolically on an output unit. Starting from an original document, the symbolic representations of the other documents are provided with a marking that indicates the degree of similarity to the original document according to a similarity measure.
EP00916782A 1999-03-09 2000-03-01 Search and navigation device for hypertext documents Withdrawn EP1159689A2 (fr)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
DE19910357 1999-03-09
DE19910357 1999-03-09
PCT/DE2000/000603 WO2000054167A2 (fr) 1999-03-09 2000-03-01 Search and navigation device for hypertext documents

Publications (1)

Publication Number Publication Date
EP1159689A2 (fr) 2001-12-05

Family

ID=7900263

Family Applications (1)

Application Number Title Priority Date Filing Date
EP00916782A Withdrawn EP1159689A2 (fr) 1999-03-09 2000-03-01 Search and navigation device for hypertext documents

Country Status (4)

Country Link
US (1) US7020847B1 (fr)
EP (1) EP1159689A2 (fr)
CA (1) CA2366762C (fr)
WO (1) WO2000054167A2 (fr)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7644367B2 (en) 2003-05-16 2010-01-05 Microsoft Corporation User interface automation framework classes and interfaces
US8127252B2 (en) * 2003-11-07 2012-02-28 Microsoft Corporation Method and system for presenting user interface (UI) information
US8577893B1 (en) * 2004-03-15 2013-11-05 Google Inc. Ranking based on reference contexts
US20070028189A1 (en) * 2005-07-27 2007-02-01 Microsoft Corporation Hierarchy highlighting
US20070276419A1 (en) 2006-05-26 2007-11-29 Fox Hollow Technologies, Inc. Methods and devices for rotating an active element and an energy emitter on a catheter
USD709901S1 (en) 2011-05-31 2014-07-29 Lifescan, Inc. Display screen with computer icon for blood glucose monitoring
USD682304S1 (en) 2012-01-06 2013-05-14 Path, Inc. Display screen with graphical user interface
US9727656B2 (en) * 2013-07-04 2017-08-08 Excalibur Ip, Llc Interactive sitemap with user footprints
USD776140S1 (en) 2014-04-30 2017-01-10 Yahoo! Inc. Display screen with graphical user interface for displaying search results as a stack of overlapping, actionable cards
US9830388B2 (en) * 2014-04-30 2017-11-28 Excalibur Ip, Llc Modular search object framework
US9335911B1 (en) * 2014-12-29 2016-05-10 Palantir Technologies Inc. Interactive user interface for dynamic data analysis exploration and query processing
CN106445321A (zh) * 2016-09-13 2017-02-22 宇龙计算机通信科技(深圳)有限公司 Method and terminal for displaying document content
US10885021B1 (en) 2018-05-02 2021-01-05 Palantir Technologies Inc. Interactive interpreter and graphical user interface

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5021976A (en) * 1988-11-14 1991-06-04 Microelectronics And Computer Technology Corporation Method and system for generating dynamic, interactive visual representations of information structures within a computer
US5295243A (en) * 1989-12-29 1994-03-15 Xerox Corporation Display of hierarchical three-dimensional structures with rotating substructures
US5911138A (en) * 1993-06-04 1999-06-08 International Business Machines Corporation Database search facility having improved user interface
US5544352A (en) * 1993-06-14 1996-08-06 Libertech, Inc. Method and apparatus for indexing, searching and displaying data
CA2127764A1 (fr) * 1993-08-24 1995-02-25 Stephen Gregory Eick Display of query results
US5515488A (en) * 1994-08-30 1996-05-07 Xerox Corporation Method and apparatus for concurrent graphical visualization of a database search and its search history
US5855015A (en) * 1995-03-20 1998-12-29 Interval Research Corporation System and method for retrieval of hyperlinked information resources
US6067552A (en) * 1995-08-21 2000-05-23 Cnet, Inc. User interface system and method for browsing a hypertext database
US6088032A (en) * 1996-10-04 2000-07-11 Xerox Corporation Computer controlled display system for displaying a three-dimensional document workspace having a means for prefetching linked documents
US5778363A (en) * 1996-12-30 1998-07-07 Intel Corporation Method for measuring thresholded relevance of a document to a specified topic
US5835905A (en) * 1997-04-09 1998-11-10 Xerox Corporation System for predicting documents relevant to focus documents by spreading activation through network representations of a linked collection of documents
US5895470A (en) * 1997-04-09 1999-04-20 Xerox Corporation System for categorizing documents in a linked collection of documents
US6216134B1 (en) * 1998-06-25 2001-04-10 Microsoft Corporation Method and system for visualization of clusters and classifications

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO0054167A2 *

Also Published As

Publication number Publication date
WO2000054167A2 (fr) 2000-09-14
CA2366762C (fr) 2009-07-14
WO2000054167A3 (fr) 2001-07-26
CA2366762A1 (fr) 2000-09-14
US7020847B1 (en) 2006-03-28

Similar Documents

Publication Publication Date Title
DE19960043B4 (de) Method for navigating in a tree structure
DE69534331T2 (de) Method and device for highlighting the detail of a tree structure
DE69434096T2 (de) Method and system for searching a database with a graphical user interface
DE60031664T2 (de) Computer method and device for creating visible graphics using graph algebra
EP0910829B1 (fr) Database system
DE69835753T2 (de) Method and apparatus for the graphical mapping of web parts
DE69731045T2 (de) Navigation and interaction in structured information spaces
DE19842688B4 (de) Method for filtering data originating from a data provider
EP1311989B1 (fr) Automatic search method
DE10135445A1 (de) Integrated method for creating an updatable network query
DE69909614T2 (de) Computing architecture using self-manipulating trees
DE60030735T2 (de) Prediction of the feasibility of a connection path
WO2000054167A2 (fr) Search and navigation device for hypertext documents
DE69719641T2 (de) A method for presenting information on display devices of various sizes
DE60310881T2 (de) Method and user interface for forming a representation of data with meta-morphing
DE10144390A1 (de) Visualization of a comparison result of at least two data structures organized in directory trees
DE10034694A1 (de) Method for comparing search profiles
DE10392386T5 (de) System and method for dynamically generating a text description for a pictorial data representation
DE19817583B4 (de) Method and system for data processing for three-dimensional objects
EP1224579A2 (fr) Method for processing data objects
DE60008201T2 (de) Translation of data with electronic images
DE10063514A1 (de) Use of a stored procedure for accessing index configuration data in a remote database management system
DE10031041A1 (de) Providing access to application data elements of an application program
DE10057634A1 (de) Method for processing text in a computer unit, and computer unit
DE19729911A1 (de) System for improving the organization of data in a documentation

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20010820

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

17Q First examination report despatched

Effective date: 20020705

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: SIEMENS AKTIENGESELLSCHAFT

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: SIEMENS AKTIENGESELLSCHAFT

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20150825