WO2000002142A2 - Method and arrangement for determining an information content of at least two electronic objects with respect to a predefined electronic reference object - Google Patents
- Publication number
- WO2000002142A2 (application PCT/DE1999/001841)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- information content
- objects
- electronic
- arrangement according
- similarity
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/3332—Query translation
Definitions
- a distributed computer network, for example the Internet or an intranet
- an electronic object is understood to mean any type of electronically stored information.
- An electronic object is, for example
- a user query is a term that is entered by a user and about which the user wants more information.
- a reference object is understood to mean a set of terms relating to a predefinable topic or a predefinable term.
- the reference object has a generic term and other terms that are assigned to the generic term.
- a weighting factor can be provided for each term, indicating how strongly the respective term is assigned to the generic term.
- the method from [2] requires an arrangement with a data source DQ through which electronic objects dj are fed to an acquisition component AK.
- the electronic objects dj are processed in the acquisition component AK in such a way that they can be further processed in the arrangement. These electronic objects converted into a format that can be further processed are stored in a database DB.
- a database DB is further to be understood as a structure in which information is stored.
- the Internet / Intranet also represents a distributed database.
- At least one reference object RO is stored in the arrangement.
- the reference object RO is compared with the electronic object dj using a processor P and a similarity measure is determined in the following way:
- Each electronic object dj, which is described below as an electronic text file without loss of generality, contains a large number of terms.
- Each object dj is described by a vector whose dimension equals the number of different terms in the object dj.
- the vector describing the object dj also contains the frequency with which each term occurs in the object dj. Assume that the object dj is a text file with the following content:
- the corresponding vector xj, which contains the frequency of each term, has the following structure:
- $x_j = \{2, 1, 1, 1, 1, 1\}$.
- the vector xj thus describes a word histogram of the object.
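The word-histogram description above can be sketched in Python. This is a minimal illustration, not the patent's implementation: the function name `word_histogram`, the tokenization rule, and the sample sentence are assumptions, since the original example text is not reproduced in this record. The sample was chosen only so that it yields a six-dimensional vector with one term occurring twice.

```python
import re
from collections import Counter


def word_histogram(text: str) -> tuple[list[str], list[int]]:
    """Return the distinct terms (in order of first occurrence) and their frequencies."""
    # Assumption: terms are lowercase alphanumeric tokens.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(tokens)   # keeps keys in first-insertion order
    vocabulary = list(counts)  # one vector dimension per distinct term
    return vocabulary, [counts[term] for term in vocabulary]


# Hypothetical file content, not taken from the patent.
terms, x_j = word_histogram("the cat sat on the red mat")
print(terms)  # ['the', 'cat', 'sat', 'on', 'red', 'mat']
print(x_j)    # [2, 1, 1, 1, 1, 1]
```

The vector dimension equals the number of distinct terms, and each entry is the frequency of the corresponding term.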
- the reference object RO contains key terms for a generic term.
- a reference object RO is described by a vector in the same way as described above for an object.
- the similarity measure s(x1, x2) is formed by projecting both document vectors, i.e. the vector xj describing the object dj and the vector describing the reference object RO, into a predeterminable common subspace.
- the similarity measure s(x1, x2) is defined as the cosine of the angle between the projected document vectors according to the following rule:
- the similarity measure s(x1, x2) intuitively describes the similarity between two objects to be compared.
- the similarity measure s(x1, x2) can of course also be determined for two objects dj. In this case, the similarity between the two objects dj is determined.
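A cosine similarity between two term vectors can be sketched as follows. The projection into a common subspace is not specified in this record, so the sketch simply compares the vectors over the terms they share; that simplification, the function name `cosine_similarity`, and the sample vectors are assumptions made for illustration.

```python
import math


def cosine_similarity(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine of the angle between two sparse term-weight vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for empty vectors
    return dot / (norm_a * norm_b)


# Hypothetical object vector and reference-object vector.
d_j = {"the": 2, "cat": 1, "sat": 1}
ro = {"cat": 1.0, "dog": 0.5}
s = cosine_similarity(d_j, ro)
print(round(s, 3))
```

For non-negative term weights the value lies between 0 (no shared terms) and 1 (same direction), which matches the intuitive reading of the measure as a similarity.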
- the invention is therefore based on the problem of specifying a method and an arrangement with which the disadvantages of known methods described above are avoided.
- a similarity measure is determined for each object with which the similarity of the respective object to at least one further object and / or the reference object is described.
- the information content is determined taking into account the similarity measure and the object information content of the respective object.
- a processor is provided which is set up in such a way that the following steps can be carried out:
- a similarity measure is determined for each object, with which the similarity of the respective object to at least one further object and / or to the reference object is described, and
- the information content is determined taking into account the similarity measure and the object information content of the respective object.
- the invention makes it possible for the first time not only to determine, for an electronically stored object, the similarity to a reference object or a hit probability with respect to a user request, but also to tell the user what information content an object has with respect to other objects and/or with respect to the reference object.
- information content is to be understood intuitively as an indication of how much the individual objects differ from one another or how relevant the respective object is with respect to the reference object.
- the creation date is a parameter of interest to the user: in this case, the older an object is, the less interesting it is and the lower its object information content.
- the object information content is to be understood for one object at a time.
- the information content is formed in accordance with the following rule:
- $G = \sum_j r_j \cdot f(p_j) \cdot G_j$
- the function f(pj), where pj describes a local spatial document density, is a weighting function expressing that the fewer similar objects exist, the more valuable the information is for the user and the greater the information content. This gives the user better information about the relevance of the information object with regard to the user request.
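Reading the rule above as a sum over objects of relevance times density weight times object information content, the calculation can be sketched as below. The concrete weighting `1 / (1 + p)` is a hypothetical choice: the text only requires that f decreases as the number of similar objects grows. All names and numbers are illustrative assumptions.

```python
def density_weight(p: int) -> float:
    """Hypothetical f(p): smaller when more similar objects exist."""
    return 1.0 / (1.0 + p)


def information_content(relevance, density, object_content) -> float:
    """Sum of r_j * f(p_j) * G_j over all objects j."""
    return sum(
        r * density_weight(p) * g
        for r, p, g in zip(relevance, density, object_content)
    )


# Hypothetical values for three objects.
r = [1.0, 0.5, 0.0]        # relevance r_j to the reference object
p = [0, 1, 3]              # local document density p_j
G_obj = [4.0, 8.0, 10.0]   # object information content G_j
G = information_content(r, p, G_obj)
print(G)  # 1*1*4 + 0.5*0.5*8 + 0*0.25*10 = 6.0
```

An object that is irrelevant (r = 0) contributes nothing, and an object with many near-duplicates contributes less than a unique one with the same relevance.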
- the invention can advantageously be used as the basis for billing costs for an information search.
- the user is no longer only billed for the amount of information that is transmitted to him, but rather an information content can be offered to him as a basis for calculation.
- FIG. 1 is a sketch with which the method is illustrated;
- FIG. 2 shows a computer network with a large number of computers;
- Figure 3 is a sketch of an arrangement with which the method can be carried out.
- FIG. 2 shows a computer network RN which has a multiplicity of computers R1, R2, R3, ... Ri, Ri + 1 ... Rn-1, Rn, which are coupled to one another.
- TCP/IP (Transmission Control Protocol / Internet Protocol)
- Electronic objects dj are stored in the computers Ri.
- the following procedure is carried out in each computer with regard to the search term received, which is contained in request A:
- Each computer Ri has the structure shown in FIG. 3 and described above.
- An object information content Gj is assigned to each object dj.
- the object information content Gj of an individual object dj depends on the type of the object.
- the object information content Gj is freely specified by the operator of the database.
- An information content G of the selected objects dj is determined in each computer Ri for at least some of the objects dj stored in the computer Ri.
- the information content G is formed in accordance with the following rule:
- f(pj) is a function whose value is lower the greater the number of objects whose similarity to the object j exceeds a predefinable threshold value
- a local document density pj indicates the number of similar or equivalent objects dj that lie in a local environment of predeterminable size around object j, i.e. the number of objects dj whose degree of similarity is greater than a predefinable threshold.
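Counting the objects whose similarity exceeds a threshold can be sketched as follows. The function name `local_density`, the similarity matrix, and the threshold value are hypothetical, and the sketch assumes that an object is not counted as its own neighbour, which the text does not state.

```python
def local_density(similarity: list[list[float]], threshold: float) -> list[int]:
    """For each object j, count the other objects whose similarity to j exceeds the threshold."""
    n = len(similarity)
    return [
        sum(1 for k in range(n) if k != j and similarity[j][k] > threshold)
        for j in range(n)
    ]


# Hypothetical pairwise similarities for four objects.
S = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.8, 0.1],
    [0.2, 0.8, 1.0, 0.3],
    [0.1, 0.1, 0.3, 1.0],
]
print(local_density(S, 0.7))  # [1, 2, 1, 0]
```

The last object has no close neighbours, so a decreasing weighting function would give it the largest weight.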
- Figure 1 shows four objects dj (d1, d2, d3, d4) and, symbolically, two reference objects RO1, RO2, with respect to which the relevance and information content are determined.
- Connections between the documents dj and the reference objects RO1, RO2 indicate a relevance of the respective object dj to the reference object RO1, RO2.
- a first reference object RO1 contains the following dimensions with the weight factors assigned to the dimensions in a first reference vector p1:
- a second reference object RO2 contains the following dimensions with the weight factors assigned to the dimensions in a second reference vector p2:
- the following Table 1 shows, for the individual objects dj, the relevance of the respective object to the individual reference objects RO1, RO2.
- Object d2 is relevant both for the first reference object RO1 and for the second reference object RO2.
- the relevance r2 of the object d2 with regard to the combination of the two reference objects RO1, RO2 is determined from the individual relevances r21 and r22 and the lengths of the vectors of the reference objects in accordance with the following rule:
- Table 2 shows the respective object information content Gj for each object dj.
- the objects contained in the environment V ⁇ j are counted. This gives a value for the density and a weighted density for the inventory of the existing objects.
- the information content G is determined from these factors in accordance with the following rule:
- Table 4 shows the calculation of the information content G from the individual object information contents Gj and the weight factors.
- the information content G determined is sent back to the first computer R1 as the result Ei (cf. FIG. 2).
- the result is displayed to the user in the first computer, for example in accordance with the method proposed in [2], i.e. such that the objects are symbolically represented according to the following metaphor:
- the information content G determined serves as the basis for a possible billing of costs that arise because the user actually loads the objects offered from the computers Ri onto the first computer R1. It is thus achieved that several object groups from different information spaces (different database operators) are presented to the user and the user can make a selection depending on the information content G of the individual objects dj.
- the information content G can also be formed, for example, in accordance with the following rule:
- $G = \sum_j r_j \cdot G_j$
- Another way of forming an information measure G can also be used without difficulty; it merely has to indicate how much new information an object or a group of objects contains for the user.
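As a worked illustration of the simpler variant, in which the information content is read as the relevance-weighted sum of the object information contents without a density weighting, assume three objects with hypothetical relevances $r = (0.5,\ 1.0,\ 0)$ and hypothetical object information contents $G_j = (4,\ 2,\ 10)$. These numbers are not taken from the patent's tables.

```latex
G = \sum_{j} r_j \cdot G_j
  = 0.5 \cdot 4 + 1.0 \cdot 2 + 0 \cdot 10
  = 4
```

The third object has the highest object information content but no relevance to the reference object, so it does not contribute to G.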
- the objects can either be stored in a computer R1 itself or in a distributed database structure, as is shown in the distributed computer network RN.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP99941380A EP1092200A2 (de) | 1998-06-30 | 1999-06-24 | Method and arrangement for determining an information content of at least two electronic objects with respect to a predefined electronic reference object |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE19829210.4 | 1998-06-30 | ||
DE19829210 | 1998-06-30 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2000002142A2 (de) | 2000-01-13 |
WO2000002142A3 (de) | 2000-04-20 |
Family
ID=7872527
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/DE1999/001841 WO2000002142A2 (de) | 1998-06-30 | 1999-06-24 | Method and arrangement for determining an information content of at least two electronic objects with respect to a predefined electronic reference object |
Country Status (2)
Country | Link |
---|---|
EP (1) | EP1092200A2 (de) |
WO (1) | WO2000002142A2 (de) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6858581B2 (en) | 2000-06-16 | 2005-02-22 | Arizona State University | Chemically-modified peptides, compositions, and methods of production and use |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0687987A1 (de) * | 1994-06-16 | 1995-12-20 | Xerox Corporation | Method and apparatus for retrieving relevant documents from a collection of documents |
US5647058A (en) * | 1993-05-24 | 1997-07-08 | International Business Machines Corporation | Method for high-dimensionality indexing in a multi-media database |
US5666442A (en) * | 1993-05-23 | 1997-09-09 | Infoglide Corporation | Comparison system for identifying the degree of similarity between objects by rendering a numeric measure of closeness, the system including all available information complete with errors and inaccuracies |
-
1999
- 1999-06-24 WO PCT/DE1999/001841 patent/WO2000002142A2/de not_active Application Discontinuation
- 1999-06-24 EP EP99941380A patent/EP1092200A2/de not_active Withdrawn
Also Published As
Publication number | Publication date |
---|---|
WO2000002142A3 (de) | 2000-04-20 |
EP1092200A2 (de) | 2001-04-18 |
Legal Events
Code | Title | Description |
---|---|---|
AK | Designated states | Kind code of ref document: A2; Designated state(s): US |
AL | Designated countries for regional patents | Kind code of ref document: A2; Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE |
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | |
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101) | |
AK | Designated states | Kind code of ref document: A3; Designated state(s): US |
AL | Designated countries for regional patents | Kind code of ref document: A3; Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE |
WWE | WIPO information: entry into national phase | Ref document number: 1999941380; Country of ref document: EP |
WWE | WIPO information: entry into national phase | Ref document number: 09720696; Country of ref document: US |
WWP | WIPO information: published in national office | Ref document number: 1999941380; Country of ref document: EP |
WWW | WIPO information: withdrawn in national office | Ref document number: 1999941380; Country of ref document: EP |