EP1159689A2 - Search and navigation device for hypertext documents - Google Patents
- Publication number
- EP1159689A2
- Authority
- EP
- European Patent Office
- Prior art keywords
- documents
- document
- similarity
- symbols
- references
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/93—Document management systems
- G06F16/94—Hypermedia
Definitions
- The invention relates to navigation and search in documents linked by references, which are commonly referred to as hypertext documents.
- HTML: Hypertext Markup Language
- Another example of hypertext documents is the set of help files included with the "WINDOWS" graphical user interface distributed by Microsoft.
- HTML pages should be viewed as representative of all hypertext documents.
- Search engines are known as a further aid. They are invoked with one or more keywords, which are applied to a previously created, continuously updated, usually very extensive index that is not directly visible, and they display references to a number of documents in which those keywords are mentioned. For HTML documents, the index is built either from the keywords that can be assigned via the META tag, or from the text content of other tags, especially the "TITLE" tag, or additionally from the entire text content. Which of these is used is primarily a question of the size of the database to be indexed in relation to the available resources.
- Hypertext pages represent a tree in their basic structure, because each page appears as a node with references to subordinate nodes. Back references and cross-references, however, disturb this structure. Nevertheless, it is known as a navigation aid to display a structure tree of hypertext documents, which is also referred to as a 'site map'.
- A tree is built up, with all references that contradict the tree structure either suppressed completely or displayed only faintly.
- A number of mostly two-dimensional graphical forms of representation are known. More recently, three-dimensional representations have been chosen that the user can rotate interactively in space, with a corresponding projection displayed on a two-dimensional surface.
- US Pat. No. 5,847,708 shows a device in which documents, in particular hypertext documents, are displayed spatially arranged according to a similarity measure. A content reference is thus given; however, the chaining of the documents is no longer visible.
- The object of the invention is to provide a device which, starting from a known hypertext document, automatically displays other documents without the user having to extract search words from the content of the original document in order to initiate an index or full-text search.
- The invention uses the insight that in many cases what is needed is a page with similar content.
- The invention therefore provides a device that displays a symbolic representation of an original document and the documents linked to it, with each symbol simultaneously indicating the degree of similarity to the selected original document.
- Fig. 1 shows an image displayed by the device.
- The device consists of a computer with a graphic display and the well-known input devices such as mouse and keyboard.
- The graphic display is preferably operated with program packages such as X/Windows, JAVA, an operating system whose name ends in -ix, etc. It is also possible to use the Microsoft programs often referred to in short as 'Windows'.
- The display shows a hypertext document, preferably stored in HTML format.
- A program called a browser is used for the display; it evaluates the HTML formatting instructions.
- The hypertext references, hereinafter referred to briefly as references or 'links', are of particular importance for the present invention.
- A JAVA application provides the user with an additional function, described in more detail below, on top of the functions already provided by the browser. It is equally possible to do this with a separate program in JAVA or another suitable programming language, to which the address of the source document, referred to as its URL, is passed as a parameter.
- This program first follows the references contained in the original document to reach the documents they designate, for each of which the procedure is repeated recursively. Since the reference structures of hypertext documents do not necessarily form a tree, the search depth must be restricted. This is done by specifying the recursion depth (e.g. four), the number of documents visited, the time spent, or a combination thereof. It can also be specified that only addresses within a certain domain are followed.
- Document A1 contains two references, to documents B1 and B2; B1 refers to C1, C2, C3 and D4; B2 to C3 and C4; C1 to D1, D2 and D3; C2 to D3, D4 and D5; C3 to D5 and D6; C4 to D7 and D8.
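The bounded descent can be illustrated on the example reference structure. The following is a minimal sketch in Python rather than the device's actual program: the `LINKS` table stands in for loading a page and extracting its references, the names `crawl`, `max_depth` and `max_docs` are assumptions, and a breadth-first queue replaces literal recursion so that each document is recorded at its shallowest depth. The time limit and the domain restriction are not shown.

```python
from collections import deque

# Reference structure of the example; a stand-in for fetching real pages.
LINKS = {
    "A1": ["B1", "B2"],
    "B1": ["C1", "C2", "C3", "D4"],
    "B2": ["C3", "C4"],
    "C1": ["D1", "D2", "D3"],
    "C2": ["D3", "D4", "D5"],
    "C3": ["D5", "D6"],
    "C4": ["D7", "D8"],
}


def crawl(start, max_depth=4, max_docs=100):
    """Visit the documents reachable from `start`, breadth first.

    Returns {document: (depth, parent)}. The parent entries form the
    structure tree, and each document is visited only once.
    """
    visited = {start: (0, None)}
    queue = deque([start])
    while queue and len(visited) < max_docs:
        doc = queue.popleft()
        depth = visited[doc][0]
        if depth >= max_depth:
            continue  # depth limit reached: do not follow further references
        for target in LINKS.get(doc, []):
            if target not in visited and len(visited) < max_docs:
                visited[target] = (depth + 1, doc)
                queue.append(target)
    return visited


tree = crawl("A1", max_depth=2)
```

With `max_depth=2` the sketch reaches A1, B1, B2, C1 to C4 and D4, which B1 references directly. References to documents that were already visited, such as the one from B2 to C3, are the ones that contradict the tree structure.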
- As a first approximation, the frequency of a word is simply the number of its occurrences in the document.
- An improved variant takes into account how the place of occurrence is marked up. For example, words in the title or keyword list could be given greater weight, so that the frequency becomes a fractional number. Normalization to the total number is also possible. The result is a matrix that grows with the number of documents examined, whose rows are indexed by the documents and whose columns are indexed by the words.
- A distance between two documents can be determined by multiplying two row vectors element by element and summing the products. This distance measure is larger the more similar the two documents are, because the sum is particularly large when the documents share words that also occur with similar frequency in both documents.
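As a rough illustration of the weighted frequencies and the sum-of-products measure, the following Python sketch builds one row of the matrix per document as a dictionary and compares two rows. The tokenizer, the title weight of 2.0, the sample texts and the function names are assumptions made for the example, not values from the patent.

```python
import re
from collections import Counter


def word_vector(title, body, title_weight=2.0):
    """Weighted word frequencies of one document, normalized to sum to 1."""
    counts = Counter()
    for word in re.findall(r"[a-z]+", title.lower()):
        counts[word] += title_weight  # words in the title count more
    for word in re.findall(r"[a-z]+", body.lower()):
        counts[word] += 1.0
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def similarity(u, v):
    """Sum of the products of matching entries; larger means more similar."""
    return sum(weight * v[word] for word, weight in u.items() if word in v)


a = word_vector("Hypertext navigation", "navigation in hypertext documents")
b = word_vector("Site maps", "a site map shows hypertext documents as a tree")
c = word_vector("Cooking", "recipes for bread and soup")
```

Here `similarity(a, b)` is positive because both texts contain "hypertext" and "documents", while `similarity(a, c)` is zero because the two texts share no words.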
- The first proposals in this direction were made by H. Luhn in the article "The automatic creation of literature abstracts", IBM Journal of Research and Development 2, 158-165, 1958.
- Other functions are likewise possible that use the matrix, or that derive from it a square symmetric matrix of the distances between the documents by determining the distances in pairs, thereby eliminating the words.
- The distance measure does not meet the criteria of a distance in the mathematical sense, since the triangle inequality need not be satisfied and the distance of a document to itself yields a maximum value instead of zero.
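The square symmetric matrix of pairwise values can be computed from the document-by-word matrix in a few lines. In this Python sketch the three rows and four columns of `M` are invented numbers, and `pairwise_similarity` is a hypothetical helper name.

```python
def pairwise_similarity(matrix):
    """Return S with S[i][j] = sum over k of matrix[i][k] * matrix[j][k]."""
    n = len(matrix)
    result = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            value = sum(a * b for a, b in zip(matrix[i], matrix[j]))
            result[i][j] = result[j][i] = value  # symmetric by construction
    return result


# Rows: three documents. Columns: weighted counts of four words (made up).
M = [
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 3.0],
]
S = pairwise_similarity(M)
```

In this example `S` is `[[5, 3, 0], [3, 3, 1], [0, 1, 10]]`. The words no longer appear, and the value of each document with itself (the diagonal) is not zero, which is one reason the measure is not a distance in the strict sense.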
- The use of word frequency vectors is advantageous in that the matrix of weighted word occurrences can be built up dynamically during the recursive search, and each document has to be transmitted and analyzed only once. However, this does not preclude operating the device in such a way that a distance measure is determined anew each time, with the documents concerned being loaded and evaluated at that moment.
- A combination is also possible in which word frequencies are used to preselect documents, for which the distance measure is then determined in pairs by other methods that require the document text itself. As indicated above, this could apply to languages in which reducing words to their stems requires extensive syntactic and semantic analysis.
- The reference structure is preferably displayed.
- A variety of forms are known for this, ranging from an indented list through a tree-like graphical representation to elaborate 3D/2D representations. In all the usual forms of representation, the focus is on a tree structure that arises canonically during recursive descent. References that do not fit the tree structure are then either not shown or shown as additional lines, possibly in a faint form.
- Various 3D/2D representations are known in which the structure is first built up as a graphic in three-dimensional space and then projected onto a two-dimensional surface, for example as a "fish-eye view".
- The invention consists in indicating, in the symbols of the structural representation, the distance to the original document, determined via the matrix or otherwise.
- Color is preferably used because it does not play a significant role in the known representations. For example, red could be used for the greatest similarity, green for the next closest documents, and yellow and blue through to black for increasing dissimilarity.
- Grayscale is another type of coloring; for a display with a light background, white is preferably used for low similarity and black for high similarity.
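One way to turn a similarity value into such a gray level is a linear mapping between the lowest and highest values currently displayed. The Python sketch below assumes 8-bit gray values and a light background; the function names and the linear scale are illustrative choices, not taken from the patent.

```python
def gray_level(similarity, lowest, highest):
    """Map a similarity to an 8-bit gray value.

    255 (white) marks the least similar document and 0 (black) the most
    similar one, which suits a display with a light background.
    """
    if highest <= lowest:
        return 0  # all documents equally similar: draw them all black
    fraction = (similarity - lowest) / (highest - lowest)
    fraction = min(1.0, max(0.0, fraction))  # clamp values outside the range
    return round(255 * (1.0 - fraction))


def gray_hex(similarity, lowest, highest):
    """Same mapping, formatted as an HTML-style color string."""
    level = gray_level(similarity, lowest, highest)
    return "#{0:02x}{0:02x}{0:02x}".format(level)
```

The same scalar could equally drive a circle diameter or a color ramp; only the final formatting step would change.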
- The size of the symbols is also equivalent to a color; "color" therefore also stands for grayscale as well as for other scalable representations such as the diameter of a circular area.
- In 3D/2D representations, where perspective foreshortening caused by the projection is wanted in order to visualize the spatial position, size cannot be used as a "color".
- Shape can also be used, since a triangle is a distinctive form and clearly distinguishable from a square, whereas the difference between a hexagon and a heptagon is hardly visible. Nevertheless, in this example the number of corners also represents a "color".
- For users with a reduced ability to see chromatic colors, which is mostly compensated by a better ability to distinguish shapes, this option is important; it can also be combined with the chromatic color representation.
- Using an input device (mouse), a symbol that does not yet represent the source document can be made the new source document.
- The new color of the representation can then be quickly determined and displayed. In this case, preferably no new descent is carried out from the new position; instead, the data already accumulated are used. With appropriate equipment, however, it is advisable to add the missing documents that have come within reach of the new reference point, possibly as a background process that brings the display to the possibly changed state on request.
- If the words are still available as a list, they can be offered to the user as a further means of selection, sorted alphabetically or by frequency. If the user chooses one or more words, the document that best fits them becomes the source document. Formally, the source document is then a virtual document that contains the selected words with the maximum word frequencies.
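The word-based selection can be sketched as follows. This Python example is an assumption-laden illustration: the word list, the document names, the matrix values and the helper names `virtual_document` and `best_match` are all invented, and the sum-of-products measure is reused to pick the best-fitting document.

```python
def virtual_document(matrix, words, selected):
    """Row vector that holds, for each selected word, its maximum frequency
    over all documents, and zero for every other word."""
    return [
        max(row[k] for row in matrix) if word in selected else 0.0
        for k, word in enumerate(words)
    ]


def best_match(matrix, names, query):
    """Name of the document whose row has the largest sum of products
    with `query`; the first one wins in case of a tie."""
    scores = [sum(a * b for a, b in zip(row, query)) for row in matrix]
    return names[scores.index(max(scores))]


words = ["tree", "link", "color", "metric"]  # column labels (made up)
names = ["A1", "B1", "B2"]                   # row labels (made up)
matrix = [
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 3.0],
]
query = virtual_document(matrix, words, {"link", "color"})
source = best_match(matrix, names, query)
```

With these numbers, choosing "link" and "color" yields the virtual row `[0.0, 1.0, 1.0, 0.0]` and selects B1 as the new source document.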
- Another embodiment uses, preferably in addition to the color, the spacing of the symbols in 3D space as a "color feature".
- 3D representations in particular still leave considerable freedom in the relative spacing of the symbols.
- Since the distance measures used do not, as stated above, constitute a metric, such a mapping is not uniquely defined.
- An iterative procedure can produce a deformation that clearly shows the relative proximity of different documents. It can be accepted that the display does not come to rest but keeps changing slightly because of the opposing effects. Indeed, this "breathing" conveys the relative uncertainty of the arrangement better than a "frozen" image that feigns a final arrangement which is not stable at all.
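An iterative deformation of this kind can be sketched with a simple spring-like relaxation. The Python example below is only one possible scheme and is not described in the patent: it works in 2D for brevity, takes the desired spacing of each pair as given (how a similarity is converted into a spacing is left open), and the names `relax`, `steps` and `rate` are assumptions.

```python
import math


def relax(positions, target, steps=200, rate=0.1):
    """Move 2D points so that pair distances approach target[(i, j)]."""
    pts = [list(p) for p in positions]
    for _ in range(steps):
        moves = [[0.0, 0.0] for _ in pts]
        for (i, j), wanted in target.items():
            dx = pts[j][0] - pts[i][0]
            dy = pts[j][1] - pts[i][1]
            dist = math.hypot(dx, dy)
            if dist == 0.0:
                continue  # coincident points give no direction to move in
            # Positive pull draws the pair together, negative pushes apart.
            pull = rate * (dist - wanted) / dist
            moves[i][0] += pull * dx
            moves[i][1] += pull * dy
            moves[j][0] -= pull * dx
            moves[j][1] -= pull * dy
        for point, move in zip(pts, moves):
            point[0] += move[0]
            point[1] += move[1]
    return [tuple(p) for p in pts]


def distance(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])


start = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]
target = {(0, 1): 1.0, (0, 2): 2.0, (1, 2): 2.0}  # desired spacings (made up)
layout = relax(start, target)
```

In this example the desired spacings are mutually consistent, so the points settle. When the spacings contradict one another, as a non-metric measure allows, each step is a compromise between opposing pulls; redrawing after every step instead of only at the end would show the gradual movement of the symbols.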
Abstract
The invention relates to a device for searching or navigating in documents that are linked to one another by references, the documents being represented symbolically on an output unit. Starting from an original document, the symbolic representations of the other documents are provided with a marking that indicates the degree of similarity to the original document according to a similarity measure.
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE19910357 | 1999-03-09 | ||
DE19910357 | 1999-03-09 | ||
PCT/DE2000/000603 WO2000054167A2 (fr) | 1999-03-09 | 2000-03-01 | Search and navigation device for hypertext documents |
Publications (1)
Publication Number | Publication Date |
---|---|
EP1159689A2 (fr) | 2001-12-05 |
Family
ID=7900263
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP00916782A Withdrawn EP1159689A2 (fr) | Search and navigation device for hypertext documents | 1999-03-09 | 2000-03-01 |
Country Status (4)
Country | Link |
---|---|
US (1) | US7020847B1 (fr) |
EP (1) | EP1159689A2 (fr) |
CA (1) | CA2366762C (fr) |
WO (1) | WO2000054167A2 (fr) |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7644367B2 (en) | 2003-05-16 | 2010-01-05 | Microsoft Corporation | User interface automation framework classes and interfaces |
US8127252B2 (en) * | 2003-11-07 | 2012-02-28 | Microsoft Corporation | Method and system for presenting user interface (UI) information |
US8577893B1 (en) * | 2004-03-15 | 2013-11-05 | Google Inc. | Ranking based on reference contexts |
US20070028189A1 (en) * | 2005-07-27 | 2007-02-01 | Microsoft Corporation | Hierarchy highlighting |
US20070276419A1 (en) | 2006-05-26 | 2007-11-29 | Fox Hollow Technologies, Inc. | Methods and devices for rotating an active element and an energy emitter on a catheter |
USD709901S1 (en) | 2011-05-31 | 2014-07-29 | Lifescan, Inc. | Display screen with computer icon for blood glucose monitoring |
USD682304S1 (en) | 2012-01-06 | 2013-05-14 | Path, Inc. | Display screen with graphical user interface |
US9727656B2 (en) * | 2013-07-04 | 2017-08-08 | Excalibur Ip, Llc | Interactive sitemap with user footprints |
USD776140S1 (en) | 2014-04-30 | 2017-01-10 | Yahoo! Inc. | Display screen with graphical user interface for displaying search results as a stack of overlapping, actionable cards |
US9830388B2 (en) * | 2014-04-30 | 2017-11-28 | Excalibur Ip, Llc | Modular search object framework |
US9335911B1 (en) * | 2014-12-29 | 2016-05-10 | Palantir Technologies Inc. | Interactive user interface for dynamic data analysis exploration and query processing |
CN106445321A (zh) * | 2016-09-13 | 2017-02-22 | 宇龙计算机通信科技(深圳)有限公司 | Method and terminal for displaying document content |
US10885021B1 (en) | 2018-05-02 | 2021-01-05 | Palantir Technologies Inc. | Interactive interpreter and graphical user interface |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5021976A (en) * | 1988-11-14 | 1991-06-04 | Microelectronics And Computer Technology Corporation | Method and system for generating dynamic, interactive visual representations of information structures within a computer |
US5295243A (en) * | 1989-12-29 | 1994-03-15 | Xerox Corporation | Display of hierarchical three-dimensional structures with rotating substructures |
US5911138A (en) * | 1993-06-04 | 1999-06-08 | International Business Machines Corporation | Database search facility having improved user interface |
US5544352A (en) * | 1993-06-14 | 1996-08-06 | Libertech, Inc. | Method and apparatus for indexing, searching and displaying data |
CA2127764A1 (fr) * | 1993-08-24 | 1995-02-25 | Stephen Gregory Eick | Display of query results |
US5515488A (en) * | 1994-08-30 | 1996-05-07 | Xerox Corporation | Method and apparatus for concurrent graphical visualization of a database search and its search history |
US5855015A (en) * | 1995-03-20 | 1998-12-29 | Interval Research Corporation | System and method for retrieval of hyperlinked information resources |
US6067552A (en) * | 1995-08-21 | 2000-05-23 | Cnet, Inc. | User interface system and method for browsing a hypertext database |
US6088032A (en) * | 1996-10-04 | 2000-07-11 | Xerox Corporation | Computer controlled display system for displaying a three-dimensional document workspace having a means for prefetching linked documents |
US5778363A (en) * | 1996-12-30 | 1998-07-07 | Intel Corporation | Method for measuring thresholded relevance of a document to a specified topic |
US5835905A (en) * | 1997-04-09 | 1998-11-10 | Xerox Corporation | System for predicting documents relevant to focus documents by spreading activation through network representations of a linked collection of documents |
US5895470A (en) * | 1997-04-09 | 1999-04-20 | Xerox Corporation | System for categorizing documents in a linked collection of documents |
US6216134B1 (en) * | 1998-06-25 | 2001-04-10 | Microsoft Corporation | Method and system for visualization of clusters and classifications |
- 1999-08-06: US application US09/369,360, patent US7020847B1 (en), not active, Expired - Lifetime
- 2000-03-01: CA application CA002366762A, patent CA2366762C (fr), not active, Expired - Fee Related
- 2000-03-01: EP application EP00916782A, patent EP1159689A2 (fr), not active, Withdrawn
- 2000-03-01: WO application PCT/DE2000/000603, patent WO2000054167A2 (fr), active, Search and Examination
Non-Patent Citations (1)
Title |
---|
See references of WO0054167A2 * |
Also Published As
Publication number | Publication date |
---|---|
WO2000054167A2 (fr) | 2000-09-14 |
CA2366762C (fr) | 2009-07-14 |
WO2000054167A3 (fr) | 2001-07-26 |
CA2366762A1 (fr) | 2000-09-14 |
US7020847B1 (en) | 2006-03-28 |
Legal Events
Code | Title | Description |
---|---|---|
PUAI | Public reference made under article 153(3) EPC to a published international application that has entered the European phase | Free format text: ORIGINAL CODE: 0009012 |
17P | Request for examination filed | Effective date: 20010820 |
AK | Designated contracting states | Kind code of ref document: A2; Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE |
17Q | First examination report despatched | Effective date: 20020705 |
RAP1 | Party data changed (applicant data changed or rights of an application transferred) | Owner name: SIEMENS AKTIENGESELLSCHAFT |
RAP1 | Party data changed (applicant data changed or rights of an application transferred) | Owner name: SIEMENS AKTIENGESELLSCHAFT |
STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
18D | Application deemed to be withdrawn | Effective date: 20150825 |