WO2002069184A2 - Method for searching for data in a distributed system, taking into account the period of their availability - Google Patents
Method for searching for data in a distributed system, taking into account the period of their availability
- Publication number
- WO2002069184A2 PCT/EP2002/001912 EP0201912W
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- resources
- time
- search
- stored
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
Definitions
- The present invention relates to a method for searching for data, or for resources containing data, that are currently or were previously stored in a distributed system, and to a method for accessing the resources of a distributed system and for receiving and/or displaying data currently or previously stored in these resources, taking into account the time at which the data were available in the system.
- the invention relates to a method for searching or accessing data from the Internet.
- the Internet in its current form offers the possibility to access extensive databases and information in a short time.
- Using search engines, for example, targeted searches can be carried out for data that meet specified search conditions.
- the available research options and the database that can be accessed are considerably more extensive than a classic library.
- a characteristic of the Internet is that the information available changes very quickly.
- the content of so-called websites is updated at regular intervals or even continuously, depending on the type of information it contains.
- The average lifespan of a website, i.e. the period in which its data remain unchanged, is estimated to be around 70 days. When data were updated, the originally available data have so far usually not been saved or archived, so that they were irretrievably lost.
- only the current state of knowledge can be called up when researching on the Internet. How this has developed over time cannot be found in the data made available on the Internet.
- the stored data is provided with information which provides information about when the data was stored. This makes it possible to determine the information content of a
- a method for creating a database is known from US Pat. No. 5,933,832, in which the stored data are provided with a time index which provides information about when the data was renewed.
- this method also does not offer the possibility of searching specifically for data or of accessing data that was available to the general public at a specific point in time or period.
- Another option is to extend the scope of proxy servers, which provide Internet users with access to the system, in such a way that they form a personal archive for the respective user (information about the AT&T iProxy project can be found at: http://www.research.att.com/~iproxy/archive/).
- the user has the option of storing a currently accessed website in the personal archive together with information about the time of storage.
- this archive is only limited to the information that is specifically selected and saved by the user, so that it does not provide a comprehensive overview of the level of knowledge in a particular area at a particular point in time.
- Neither the Internet archive nor the personal archive offers the option of searching specifically for information, since both are pure databases that do not support searching under specified search conditions.
- The present invention is therefore based on the object of specifying a concept for accessing and searching for data, or data-containing resources, that are currently or were previously stored in a branched system, the time at which the data were available being taken into account.
- The invention relates not only to the Internet but to all distributed or networked systems that provide data, for example intranets, extranets, LANs, WANs or MANs (metropolitan area networks).
- A first aspect of the invention relates to a method for searching for data currently or previously stored in a distributed system, or for resources that contain such data.
- resources are to be understood as all storage locations of data which can be clearly localized, in the case of the Internet, for example, the storage locations which can be localized by means of a URL (Uniform Resource Locator) or a corresponding standard.
- the data is then to be understood as the websites available under a resource, for example, including the files contained therein and / or the files associated therewith. Strictly speaking, if they are clearly addressable, they can also represent their own resource. For the sake of clarity, however, data will primarily be referred to below.
- the method according to the invention comprises several steps, with a query containing one or more search terms first being transmitted to a search unit.
- The distributed system is then searched for resources, data or information relating to these data that meet the condition(s) defined by the search terms, and in a final step the data found by the search and/or information relating to the resources that contain these data are output.
- As is usual with search engines on the Internet, the search can take place in such a way that the distributed system is not searched for every query; instead, the search engine is connected to a memory which stores images of, or references (“fingerprints”) to, the data in the distributed system.
- The search is then carried out only in this memory, and the search results refer to the respective data or resources in the distributed system.
- The data contain a time index relating to the point in time or period at which they were available in the system, and the search terms in turn may include a time parameter that limits the search to the time and/or period defined by that parameter.
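The time-restricted search described above can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the record layout (`IndexedDocument`), the substring matching and the half-open availability interval are assumptions introduced here.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class IndexedDocument:
    """Hypothetical entry in the search engine's memory (not from the patent)."""
    url: str
    text: str
    available_from: datetime
    available_until: Optional[datetime]  # None: still available


def search(index: Sequence[IndexedDocument],
           terms: Sequence[str],
           at: Optional[datetime] = None) -> List[IndexedDocument]:
    """Return entries containing all terms; if `at` is given, keep only
    entries whose availability period covers that point in time."""
    hits = []
    for doc in index:
        text = doc.text.lower()
        if not all(term.lower() in text for term in terms):
            continue
        if at is not None:
            if at < doc.available_from:
                continue
            if doc.available_until is not None and at >= doc.available_until:
                continue
        hits.append(doc)
    return hits
```

Without a time parameter the function behaves like an ordinary keyword search over all stored versions; with one, two versions of the same URL can be told apart by their time index.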
- the method according to the invention thus offers the possibility not only of searching for specific resources or for information on a specific subject area or on specific search terms, but also to restrict the search to specific periods or times. This opens up the possibility of getting to know the state of knowledge in a certain area at an earlier point in time and thus, for example, of following the development over time in this area.
- the method according to the invention thus offers the same possibilities as when searching in a classic library, the search being able to be carried out much more simply and efficiently on account of the computer-aided automated processing of the request.
- Developments of this method according to the invention for searching for data or data-containing resources are the subject of subclaims.
- the search unit is preferably implemented by a computer program, which is made available, for example, by certain resources of the system.
- This aspect of the invention relates to a search engine for searching for data, or data-containing resources, stored in a distributed system, the search engine being designed such that it carries out the search in the manner just described.
- Another aspect of the present invention relates to a method for accessing resources of a distributed system and for receiving and/or displaying data currently or previously stored in these resources, which also includes access to data archived in an archive or storage network.
- the data in turn contain a time index relating to the point in time or period at which they were available in the system, and the information contained in the time index can also be displayed when the retrieved data is displayed. This means that a user can see at any time when the data presented was available.
- This method is also preferably implemented using a computer program.
- This aspect of the invention relates in particular to a browser for accessing the resources of a distributed system, or to a representation of such access implemented in a browser. Further developments are the subject of further subclaims.
- In a third aspect of the invention, which likewise relates to a method for accessing the resources of a distributed system and for receiving and/or displaying data currently or previously stored in the resources, the data of the system are accessed as a function of a predeterminable time parameter, the data stored in the system again containing the time index relating to the point in time or period of their availability in the system.
- The data are now accessed in a targeted manner such that only the data that were available at a predeterminable, possibly earlier, point in time or period are accessed. It is therefore possible to determine the information content of resources at an earlier point in time. This also opens up the possibility of moving not only within the currently available distributed system but also along a temporal dimension. For example, the temporal development of a certain resource can be observed in a simple manner. Alternatively, one can move around in the distributed system such that the system behaves as it was available at a certain earlier point in time.
- This third aspect of the invention also relates in particular to a browser for accessing the resources of a distributed system, or to a representation of such access implemented in a browser, to which a time parameter can be specified, the data of the system being accessed as a function of this time parameter. Further developments of this aspect of the invention are also the subject of subclaims.
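Access as a function of a time parameter amounts to an "as of" lookup over the stored versions of a resource. The following minimal sketch assumes that each version is stored with the start of its availability and that the list is sorted by that start time; the function name `version_as_of` is hypothetical.

```python
from bisect import bisect_right
from datetime import datetime
from typing import List, Optional, Tuple


def version_as_of(versions: List[Tuple[datetime, str]],
                  when: datetime) -> Optional[str]:
    """Return the content that was available at `when`.

    `versions` holds (available_from, content) pairs sorted by available_from
    (an assumption of this sketch). Returns None if nothing was available yet.
    """
    starts = [start for start, _ in versions]
    position = bisect_right(starts, when)  # number of versions started by `when`
    if position == 0:
        return None
    return versions[position - 1][1]
```

A browser that fixes `when` once and resolves every link through such a lookup would behave like the system as it was available at that earlier point in time.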
- Another aspect of the invention relates to a method for archiving data stored in a distributed system.
- Data are first retrieved or received from the distributed system, then supplemented by a time index relating to the point in time or period at which the data were available in the system, provided the data do not yet have a time index, and finally archived in a data archive or at a depository in such a way that the data can be accessed by search engines, browsers or programs.
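The three archiving steps (retrieve, supplement a missing time index, store) can be sketched as below. The in-memory dictionary standing in for the data archive and the function name `archive_snapshot` are assumptions made for illustration only.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory archive: URL -> list of (time index, content).
Archive = Dict[str, List[Tuple[datetime, bytes]]]


def archive_snapshot(archive: Archive, url: str, content: bytes,
                     time_index: Optional[datetime] = None,
                     retrieved_at: Optional[datetime] = None) -> datetime:
    """Store `content` under `url` and return the time index that was used.

    If the data carry no time index of their own, the time of retrieval is
    used instead.
    """
    if retrieved_at is None:
        retrieved_at = datetime.now(timezone.utc)
    effective = time_index if time_index is not None else retrieved_at
    entries = archive.setdefault(url, [])
    entries.append((effective, content))
    entries.sort(key=lambda entry: entry[0])  # keep versions in time order
    return effective
```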
- the archiving can take place at any point in the distributed system, in which case verification information relating to the data can also be archived in a depository.
- the present invention thus offers a self-contained concept by which it is possible to use the complete information content of the data of a distributed system, taking into account the temporal development of the data. This provides comfortable and powerful display and research options.
- Figure 1 is a schematic representation of a distributed system for explaining the present invention.
- FIG. 2 shows the window of a browser according to the invention, which offers the possibility of taking into account the time or period of availability of the data when accessing and displaying them; and FIG. 3 shows a search engine according to the invention, which offers the possibility of taking temporal aspects into account when searching for data.
- The distributed system 1 contains a number of different resources 4 to 10 and 2b, i.e. clearly localizable storage locations that contain data.
- these resources 4 to 10, 2b can be localized by their URL, in the most general case by any corresponding standard. Strictly speaking, each component of a resource that can be clearly localized itself can represent its own resource.
- the resources 5 to 7 each contain retrievable data, for example websites present in the HTML or another hypertext standard, including the files associated therewith.
- the reference symbol 2b denotes a user terminal which can act as a resource, provided that the data stored there belong to a component of a storage network. The character of the storage network will be explained later.
- Reference number 8 denotes a further resource, which is a public depository. Data made available from resources 5 to 7 can be specifically selected and copied to this public depository 8 - also called a trust center - for data backup, or resource 8 can be instructed to copy this data. The function of this depository 8 will be explained in more detail later.
- a data archive 9 is part of the system 1, in which the data, for example the resources 6 and 7, are systematically stored for archiving.
- The system 1 contains the search engines 4a and 4b as further resources, which assist a user connected to the system 1, represented by a further user terminal 2a, or the user of the terminal 2b, in searching the resources 5-7, the archives 8, 9 or the data made available in the context of a storage network 2b or 10.
- The search engines 4a, 4b can be used by programs, represented, for example, by an intelligent agent 12, which automatically carries out searches for other resources, archives or users.
- The search unit 4c, as a mere interface, only supports research in the archives 8 and 9.
- User 2a can be connected to system 1 via a proxy server 10, or directly, as with user 2b.
- Reference numerals 11a-d denote private archives, which can be part of resources 2b, 8, 9 or 10.
- The function of these private archives 11a-d will also be explained in more detail later.
- The data 5₁ to 7₁ provided with the index 1 represent the latest data stock made available by the resources 5 to 7, i.e. the data that were last updated.
- Resource 5, for example, provides, in addition to the latest data 5₁, several data 5₂ and 5₃ that were published at earlier times and archived. In the case of the Internet, the archived data 5₂ and 5₃ correspond to websites in the form in which they were available at earlier times.
- The archived data 5₂ and 5₃ can be stored in the original format with all content and, where applicable, the data or resources linked by means of references (links), so that they can be read, for example, by a browser or an alternative playback program and displayed exactly as they were available earlier. This means that during archiving, for example, the download files reachable via the links behind the graphical user interface (e.g. PDF files, Word documents, etc.) are also saved. If the data also contain scripts, applets or content dynamically integrated from other resources, this content can also be archived.
- It is also possible to archive the data 5₂, 5₃ in compressed form or, if necessary, to exclude individual contents which are not essential for the information content. For example, the advertisements or advertising banners often displayed on websites could be excluded from archiving. If the data contain dynamic content or content that depends on the configuration or information of a user, they are preferably saved during archiving as they appear by default when they are called up for the first time.
- The time at which data are saved for archiving can vary depending on the type and content of the data. For example, it can be provided that the data are archived at regular intervals, e.g. every few days, weeks or months. Another option is to archive only if the content of the data has changed to a certain extent, which can be determined, for example, by a comparison between the most recently archived and the current data, if necessary with the aid of checksum methods or the like. In this case, to reduce the data volume, provision can also be made for only relative changes to be stored and for the data to be completely archived only in the event that the total of the changes is greater than a complete re-storage.
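The change-triggered archiving with relative storage can be illustrated by the sketch below. The text only speaks of "checksum methods or the like"; the use of SHA-256, of a line-based unified diff and of a plain length comparison are assumptions chosen for this example.

```python
import difflib
import hashlib
from typing import Optional, Tuple


def checksum(text: str) -> str:
    """SHA-256 checksum of the text (the concrete hash is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_snapshot(previous: Optional[str], current: str) -> Tuple[str, str]:
    """Decide what to archive: nothing, a relative change, or a full copy.

    Returns ("unchanged", ""), ("delta", diff) or ("full", current).
    """
    if previous is None:
        return ("full", current)
    if checksum(previous) == checksum(current):
        return ("unchanged", "")
    delta = "".join(difflib.unified_diff(
        previous.splitlines(keepends=True),
        current.splitlines(keepends=True)))
    # Store only the relative change while it is smaller than a full copy.
    if len(delta) < len(current):
        return ("delta", delta)
    return ("full", current)
```

A real archive would also need to cap the length of delta chains so that an old version can be reconstructed in bounded time; that is left out here.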
- Resource 5 completely archives its data 5₁ to 5₃ itself and thus makes a complete data record available.
- The same applies to resource 6, in which the resource's own data 6₁ to 6₃ are also archived over time, but not to resource 7.
- The archive 9 can aim to archive all of the data 5₁ to 5₃, 6₁ to 6₃ and 7 provided by the resources 5-7 in the distributed system 1. This applies regardless of whether the resources archive their data for general access themselves, as resources 5 and 6 do, or not, as in the case of resource 7. It is also conceivable that only the earlier data of certain resources are archived, for whatever reason: in the example, the earlier data 6ₜ and 7ₜ of resources 6 and 7, but not those of resource 5.
- Provision can also be made for this archive 9 to archive only the information relating to a specific subject area. If data relating to this subject area are published by resources 5-7, these are systematically archived in archive 9.
- The data can be backed up or copied into the archive 9 using, for example, automatic robot methods. Based on the addressing, cross-referencing, update frequency or relevance of the various resources, systematic querying and archiving is carried out with the help of these methods. It is possible to use so-called “self-learning” methods, in which the polling frequency is made dependent on how often the data are updated and on the extent of the changes. “Learning” can take place with the aid of mathematical methods, for example based on neural networks, whereby the query frequency is adjusted independently in order to achieve optimal archiving.
- The archiving frequency is increased if the data are updated more frequently, whereas archiving takes place only at long intervals if the data remain unchanged over a long period of time.
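The adaptation of the polling frequency can be illustrated with a deliberately simple rule. Note that this sketch swaps in a fixed halve-or-double heuristic in place of the neural-network-based learning mentioned in the text; the factors and the bounds of 1 and 90 days are assumptions.

```python
def next_poll_interval(current_days: float, changed: bool,
                       min_days: float = 1.0, max_days: float = 90.0) -> float:
    """Return the next polling interval in days.

    Poll twice as often after a detected change, half as often otherwise,
    and keep the interval within [min_days, max_days].
    """
    factor = 0.5 if changed else 2.0
    return max(min_days, min(max_days, current_days * factor))
```

Starting at 8 days, a resource that changes at every poll is soon polled daily, while a static one drifts towards the 90-day ceiling.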
- the nature of the changes in content can also be taken into account, for example only the content of texts contained in the data being taken into account for assessing whether archiving should take place or not.
- the resource 6 can initiate archiving in the archive 9 on its own at regular intervals or at times at which the data have been updated.
- This can be implemented using applets, scripts or other software solutions that are provided for setup on the corresponding resource.
- This is particularly advantageous in the case of resource 7, since, in contrast to resources 5 and 6, it does not itself archive the data made available by it. If the data of resource 7 is updated in the example shown, the data previously made available are copied into archive 9 so that it contains a complete set of data 7 that was available at earlier times.
- One of the users 2a or 2b can also request the archive 9 to archive particular data or a particular resource by entering that resource.
- the interface for the input can run on its own resource or can be integrated in software - for example in the user's browser.
- The archive 9 can also be the basis of an expert system which allows the targeted output of data on specific content, topics, categories, formats and times or intervals. Research in the archive can be carried out via a separate interface, for example a search unit 4c. Archive 9 can also be designed in such a way that only data specified in advance by content or other categories are archived.
- Provision can be made for the archived data to be accessible only against payment of a certain fee, whereby the original providers of the data, i.e. resources 6 and 7, from which the data originate, can share in the income, for example in the form of micropricing.
- archives 8 and 9 which are not directly publicly accessible in the system 1, but can only be reached via a further - possibly password-protected - interface.
- This so-called “invisible net” or “deep web” is an area of the Internet that is not directly accessible to users by controlling resources; instead, this area is available in the form of databases that can be queried on these resources via certain interfaces.
- archiving can include direct access to the databases behind the query interface for the purpose of archiving, if necessary after a corresponding agreement, which can also be automatically negotiated by a software solution between the resource and the archive / robot.
- the public depository or trust center 8 performs other tasks.
- a first task is to have the publication of certain data of resources 5-7 documented or verified.
- An interest in such archiving can exist, for example, if it is to be proven that certain information was already available at a certain point in time. For example, it can thus be clearly established whether information which would conflict with the patentability of an invention was already available to the public before the relevant priority date of the application. It is thus a matter of documenting and verifying the origin, time and content of data and resources and of protecting them from manipulation.
- The method provides that the depository 8 is instructed to archive, for example by the user 2a or 2b, who issues an instruction to query certain data from a resource 5-7 and to deposit them in the trust center 8 together with information on time and origin.
- Data can also be stored in the trust center 8 at the request of a resource. As described for storage in archive 9, both can be done either manually (i.e. on request) or automatically by a software solution.
- The deposit can also include archiving further levels of files that are connected by links to the data to be archived. How many levels are saved can be made dependent on the user configuration.
- Another task is to make certain content or resources citable when requested by user 2a, 2b or a virtual agent 12. To do this, it must be ensured that certain contents, characterized by origin and time, are stored permanently and unchangeably. For the storage of data, as well as for checking for possible changes to data during transmission to and from the trust center 8, the security criteria according to the Signature Act can be used. The procedure is as described above.
- a third function of the depository 8 can consist in the fact that the depository 8 documents or verifies, at a specific point in time, the level of knowledge gathered in an area, for example by means of an expert system, independently of a request for the specific storage of certain data or resources.
- the trust center 8 can therefore also archive data of the resources 5-7 itself, analogously to the method illustrated in relation to the archive 9. In particular, data of certain resources can be monitored at regular intervals and, if necessary, archived automatically for a fee.
- the trust center 8 ensures that the availability of the data is guaranteed at all times, but at the same time manipulation is excluded, so that the data queried from the trust center 8 at a later point in time is identical to the original data available in the distributed system.
- The corresponding data can, as described above, be completely archived in the trust center 8.
- It is also conceivable for the trust center 8 to create a digital verification stamp or “fingerprint”.
- The stamp contains coded information on the time, origin and content. A copy of the stamp is stored in the depository 8.
- The data or resources then need not be stored in the trust center 8; storage can also take place on the resource 5-7, in the archive 9 or in a personal archive 11a-b (i.e. also at a user, possibly in the storage network). By comparing the verification stamp or fingerprint, it can then be determined whether these data are identical to the data originally verified.
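A verification stamp of this kind can be sketched as a record of origin, time and a content digest. The text does not specify how the stamp is coded; SHA-256 and the dictionary layout below are assumptions, and a real trust center would additionally sign the stamp.

```python
import hashlib
import hmac
from datetime import datetime
from typing import Dict


def make_stamp(origin: str, verified_at: datetime, content: bytes) -> Dict[str, str]:
    """Create a stamp holding origin, time and a digest of the content."""
    return {
        "origin": origin,
        "time": verified_at.isoformat(),
        "digest": hashlib.sha256(content).hexdigest(),
    }


def matches_stamp(stamp: Dict[str, str], content: bytes) -> bool:
    """Check whether `content` is identical to the originally verified data."""
    candidate = hashlib.sha256(content).hexdigest()
    return hmac.compare_digest(stamp["digest"], candidate)
```

Only the stamp has to be kept in the depository; the data themselves can be held anywhere and checked against it later.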
- In parallel to the previously described methods for storing in archives 8 and 9, there is the possibility of setting up personal archives, to which only a specific user or a specific group of users has access.
- These can be designed as “virtual archives” such as 11c and 11d, in which information from archives 8 and 9 is filtered according to user specifications and, if necessary, processed. A section of the entire archive is thus visible in the personal archive.
- These private archives 11c and 11d display data which are stored in archives 8 and 9 but which are only intended for a specific group of users and not for the general public.
- The archives 11a and 11b represent actual storage locations in the sense that data are archived here directly, together with the time and origin.
- The personal archive 11b is part of the user terminal 2b.
- The user 2a also has the option of creating a personal archive 11a, to which only he, or a specific group of people, has access via a corresponding proxy server 10.
- Archiving in the personal archives 11a and 11b can, for example, take place automatically when the user 2a or 2b accesses certain data of the system 1. As with the trust center 8 and the archive 9, however, automatic archiving methods can also be provided. It is also possible for data and resources to be archived in the personal archives 11a and 11b when the user issues the corresponding command via an interface provided by a software solution, for example integrated as a button in the user's browser. Functional extensions of the personal archive 11c or 11d can include notifying the user when new data are added.
- The personal archive 11a or 11b has the same function as the archive 9, but only contains the data archived therein personally by the users 2a or 2b. In this way it is possible to make an entire network of personal archives available, i.e. to create a decentralized storage network, which overall can contain a large part of the data provided by the system 1 in the past.
- Archived data, regardless of whether they were archived by resources 5 and 6 themselves, the trust center 8, the archive 9 or the private archives 11a-b, contain a time index that provides information about the point in time or period at which the data were available in the system. Available means that the data are in principle accessible at that moment.
- The time index can have one, two or more dimensions. One-dimensional means that only a single time of availability is recorded. Two-dimensional means that two points in time define a time interval (continuum) in which the data were available. Accordingly, multidimensional means that several individual times and/or intervals of availability are recorded. Data in individual resources expediently contain one- or preferably two-dimensional time indices, archived data also multidimensional ones.
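The one-, two- and multidimensional time indices can be modelled with a small data structure. This is a sketch under assumptions: the class names are hypothetical, and treating intervals as half-open (start included, end excluded) is a choice made here, not something the text prescribes.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Tuple


@dataclass(frozen=True)
class Availability:
    """One recorded availability: a single time (end is None) or an interval."""
    start: datetime
    end: Optional[datetime] = None

    def covers(self, when: datetime) -> bool:
        if self.end is None:              # one-dimensional: single point in time
            return when == self.start
        return self.start <= when < self.end  # two-dimensional: interval


@dataclass(frozen=True)
class TimeIndex:
    """Multidimensional index: several individual times and/or intervals."""
    entries: Tuple[Availability, ...]

    def covers(self, when: datetime) -> bool:
        return any(entry.covers(when) for entry in self.entries)
```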
- the time or period of availability can be determined in various ways.
- The original resource 5-7 gives the data a time index. Usually, this will be the time at which the data were published for the first time, or the period from this time of publication to the current time or to the time of the first change.
- the time index can also contain an indication of the time measure used to determine it (local time, but usually GMT).
- the time assigned by the resources can then be transferred. If the resource itself does not give a time index, the time of retrieval or archiving can be used as a time index; with ongoing review, this can also be a period.
- Time indices can also be assigned during archiving. Especially when it comes to the verification of certain points in time or periods, i.e. when archiving in the trust center 8, it must be ensured that the data were actually accessible at the times recorded by the resource and that these data were not subsequently changed. In this case, the trust center will only be able to record specific points in time for the time index, for example the moment at which the data are called up (by a robot or manually). A period (i.e. a continuum of availability) can therefore only be recorded if accessibility or availability is checked continuously. This can also be handled by a software solution such that the resource regularly contacts the trust center as long as the data are available, or such that the trust center 8 or the archive 9 is automatically notified of changes.
- In order to enable verification, the verification stamp must be deposited at the exact time at which the data are received; in the case of verification, the time index of the data is automatically the time at which the verification stamp was created.
- the archived data can contain further notes, for example the references to identical data from other resources, which enables data that come from different resources but have identical contents to be linked.
- a possible form of such a reference is the reference to the URN (uniform resource name) of a document, that is to say a resource-independent identifier for data. All of this becomes important when it comes to finding identical data that can be found under different resources over time.
- the notes on identical data can also be supplemented by user input in a corresponding interface. This makes sense, for example, when the data changes to another resource. This can be noted by user input or automatically, and consequently a temporal continuity of the data is established, even if the resource has changed.
- the data can have blocking notes, which only make the availability possible from a certain point in time or against payment of a fee.
- The notes on indexing, time, availability, fee, confidentiality, etc. are stored in the resource together with the file name as further file properties. This would also allow direct access to these files using a correspondingly expanded locator. Additionally or alternatively, this information can also be saved in the file itself (for example in the header for HTML documents). However, it is also conceivable that all or part of the indexing information is stored centrally in a separate database file on the corresponding resource or on another resource in the distributed system. In this case, direct addressing (for example using an expanded locator) is only possible insofar as the access request for a specific file first has to be directed to the resource with the indexing information. This resource interprets the request accordingly and then forwards the access request so that the desired file is accessed directly.
- one way of addressing the data is to extend the URL standard to an extended locator, for example a uniform resource and time locator (URTL).
- this new locator for resources in distributed systems also contains a time address, so it has been expanded to include a time component or a time parameter.
- Different data, for example web pages, that were reachable under the same URL at different times can be addressed individually by means of the extended locator.
- the additional time is a further parameter in the addressing, which can be recognized as such when the data is accessed and processed directly. If addressing takes place according to the conventional standard, that is to say without a time, it can be provided that the most current data is accessed as standard.
- If the extended locator is not supported by the transmission protocols, the network infrastructure and/or individual resources of the distributed system, it can be simulated using the existing URL specification, so that two-dimensional addressing by resource and time is still possible. This presupposes that the resources can interpret the information encoded in this way in URL format using a suitable software solution.
- this new standard can be simulated by a software expansion of the proxy server 10, which converts the requests for data in connection with a specific point in time into corresponding access commands to resources 5-7 or archives 8, 9, 11a and 11b.
- the same can also be done by appropriately expanding the user terminal, for example the browser, in such a way that the two-dimensional input of resource and time is software-coded in the URL standard.
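One conceivable way to simulate the extended locator within the conventional URL format is to carry the time as an ordinary query parameter. The sketch below assumes exactly that; the parameter name `urtl_time`, the timestamp format, and the function names are hypothetical choices, not something the patent defines.

```python
from datetime import datetime
from typing import Optional, Tuple
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

TIME_PARAM = "urtl_time"      # hypothetical parameter name
TIME_FORMAT = "%Y%m%dT%H%M%S"  # hypothetical timestamp encoding


def encode_urtl(url: str, when: Optional[datetime]) -> str:
    """Encode resource and time in a conventional URL; no time means 'most current data'."""
    if when is None:
        return url
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((TIME_PARAM, when.strftime(TIME_FORMAT)))
    return urlunsplit(parts._replace(query=urlencode(query)))


def decode_urtl(url: str) -> Tuple[str, Optional[datetime]]:
    """Split a simulated locator back into the plain URL and the time parameter."""
    parts = urlsplit(url)
    when = None
    rest = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key == TIME_PARAM:
            when = datetime.strptime(value, TIME_FORMAT)
        else:
            rest.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(rest))), when
```

A browser extension or the proxy server could call `encode_urtl` before sending a request, and a resource or archive that understands the convention could call `decode_urtl` to recover the two-dimensional address.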
- Access takes place through a browser installed in the computer 2a or 2b, via which requests for data contained in certain resources - possibly via a proxy server 10 - are forwarded to the corresponding resources.
- Fig. 2 schematically shows a window of the browser displayed on the monitor 3 of the computer 2a.
- the address of the resource to be accessed is shown in an address field 20 in the upper area.
- a further time field 21 is arranged, which provides information about the time index attached to the data shown.
- The address of the desired resource is entered in the address field 20; at the same time, a time parameter can be specified in the time field 21, indicating the point in time or the period from which the desired data should come. If the time parameter is omitted, the latest version of the stored data can be requested as standard, as shown above. Of course, the time parameter does not have to be entered or displayed in its own time field; it can instead be entered or displayed within the address field as part of such an expanded address.
- the inputs of addresses and time parameters are then forwarded directly to the corresponding resource 5-7, possibly via the proxy server 10 and, if necessary, in the form of the simulated URTL.
- If this query does not produce a result (because the resource cannot be reached, because it does not support the standard, or because it has no data for this time parameter), the request is forwarded to one of the archives 8, 9 and/or 11a, 11b.
- The time index, or the information contained in it, for the data displayed in the browser window is simultaneously displayed in the time field 21, so that it can be seen at any time from which period the data shown originate.
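The forwarding of an unsuccessful request from the resource to the archives can be sketched as a simple fallback chain. The function and type names below are hypothetical, and the stores are modeled as plain callables purely for illustration.

```python
from datetime import datetime
from typing import Callable, Iterable, Optional

# A store takes (address, time parameter) and returns the data, or None if it has none.
Store = Callable[[str, Optional[datetime]], Optional[bytes]]


def fetch_with_fallback(address: str, when: Optional[datetime],
                        resource: Store, archives: Iterable[Store]) -> Optional[bytes]:
    """Ask the resource first; if it yields nothing, try the archives in the given order."""
    for store in (resource, *archives):
        try:
            data = store(address, when)
        except OSError:  # store unreachable (ConnectionError is a subclass of OSError)
            data = None
        if data is not None:
            return data
    return None
```

The order of `archives` corresponds to a user-configurable preference, for example personal archive first, then the general archive, then the trust center.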
- an alternative form of representation is also conceivable, either implicitly in the address field or graphically as a time bar.
- Reference number 26 denotes a link that represents a cross-reference to further data or resources. Since, depending on the scope of the archiving, the data to which the link 26 refers may also be archived, selecting this link 26 automatically leads to the display of the information on which the link is based as it existed at the same point in time. This makes it possible to navigate through the system at a predetermined time. However, if the data on which the link 26 is based were stored neither on the resource nor in one of the archives 8, 9, 11a or 11b, it can be provided that the information available closest to the predetermined point in time is accessed. Alternatively, it can be provided that a new point in time must be specified in order to carry out the access. If appropriate, an overview of the times for which data is available can also be shown (e.g. as a pop-up window).
- a time bar 22 is shown on one side of the browser window, which offers the possibility of navigating in the time dimension on the displayed website. Selecting the upper arrow 22a automatically leads to access to the data that was archived next after the data currently displayed in the window. In contrast, selecting the lower arrow 22b automatically leads to access to data that is older by one time step.
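Stepping with the two arrows of the time bar amounts to moving through the sorted archive times of one resource. The sketch below assumes that the upper arrow moves to the next newer snapshot and the lower arrow to the next older one; the function names are hypothetical, and `snapshots` is assumed to be sorted in ascending order.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime
from typing import Optional, Sequence


def step_newer(snapshots: Sequence[datetime], current: datetime) -> Optional[datetime]:
    """Upper arrow: the next snapshot archived after the one currently shown."""
    i = bisect_right(snapshots, current)
    return snapshots[i] if i < len(snapshots) else None


def step_older(snapshots: Sequence[datetime], current: datetime) -> Optional[datetime]:
    """Lower arrow: the snapshot that is older by one time step."""
    i = bisect_left(snapshots, current)
    return snapshots[i - 1] if i > 0 else None
```

Both functions return `None` at the ends of the time line, which a browser could use to disable the corresponding arrow.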
- Buttons can also be provided in the browser shown in FIG. 2, by means of which time tolerances can be specified with which the entered time parameter is to be treated. For example, this can be used to set the manner in which corresponding data from other periods should be accessed if data from a desired period are not available. With the help of another button, default settings can be made as to whether and in what order the various data stocks of the system are accessed; for example, resources 5-7 or the personal archive 11a-d first, then archive 9, and finally trust center 8.
- The time specified by the time field 21 can be activated or deactivated. Activation means that only data that meets the time condition specified in time field 21 should be accessed. This corresponds to the previously described navigation at a fixed point in the past. Due to the frequent updating of the data made available in distributed systems, however, it often happens that cross-references to other data lead to resources that are no longer accessible or that no longer provide data corresponding to the context at that time.
- In this case the request is automatically expanded: the search is extended to include the most recently archived data for the resource searched for, or the data closest to the time searched for. This ensures that the most recently available data can be displayed in any case.
- Deactivating the time specified by the time field 21, on the other hand, has the result that the current or at least the last available archived data of the corresponding resources is shown in principle.
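The combination of a requested time, a tolerance, and the fallback to the closest archived data can be sketched as follows. The function name and the way the tolerance is expressed are assumptions made for illustration only.

```python
from datetime import datetime, timedelta
from typing import Optional, Sequence


def closest_snapshot(snapshots: Sequence[datetime], wanted: datetime,
                     tolerance: Optional[timedelta] = None) -> Optional[datetime]:
    """Return the archive time closest to `wanted`, or None if it lies outside the tolerance."""
    if not snapshots:
        return None
    best = min(snapshots, key=lambda t: abs(t - wanted))
    if tolerance is not None and abs(best - wanted) > tolerance:
        return None
    return best
```

Passing `tolerance=None` models the automatically expanded request, where the closest available data is always shown; a finite tolerance models strict navigation at a fixed point in the past.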
- As an extension, a separate window can display information about similar or identical data from another resource. This information could indicate that the resource being sought can now be reached at a new address and that the data is updated only on this new resource. Furthermore, an additional window can show which cross-references the displayed data contain, or which other data contain cross-references to the data displayed in the browser window. The information required for this is based on the indexing or reference notes outlined above, or on search engines, which can also categorize content.
- the method according to the invention offers the possibility of navigating both between different resources and also in terms of time.
- appropriate extensions can be used to ensure that the most recently available data can be transferred to the archive 9 even when the operation of a resource is discontinued and can be displayed from the archive when requests are made to this resource.
- search engines 4a and 4b are provided, which offer the possibility of searching for specific information from the data provided by the various resources 5-9 and 11b and possibly 11a of system 1.
- the user 2a or 2b transmits an inquiry containing one or more search terms to the search engine 4a or 4b.
- The search engine searches the system 1 for resources or data that meet the condition(s) imposed by the search terms.
- the search can, as is usual with search engines on the Internet, take place in such a way that the distributed system (including the archives) is not searched for every query; instead, the search engine is connected to a memory that contains images of the notes ("fingerprints") on the resources and data present in the distributed system.
- Fig. 3 shows a window of such a search engine 4a or 4b as displayed on monitor 3 of user 2a. It usually has an input field 27 for entering search terms for which the available resources or data are to be searched.
- search terms can also be combined with the usual links (AND, OR etc.) or exclusion criteria.
- the search engine has one or more time parameter windows 28, 29, in which time information can be entered and thus one or more time intervals may be specified.
- the time specifications determine a time parameter, by means of which the search is limited to data that were available in the system in the specified period. It is therefore possible not only to search under the current data as before, but also under data available at an earlier point in time. In particular, there is the possibility, for example, of only retrieving information on a specific topic that was available in the past at a specific point in time.
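A time-restricted search of this kind can be sketched as a filter over an index whose entries record when the data was available. The `IndexEntry` structure, its field names, and the AND-only term matching are assumptions for illustration; the patent does not specify an index format.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional, Set


@dataclass
class IndexEntry:
    """Hypothetical index record for one version of the data on a resource."""
    resource: str
    terms: Set[str]
    available_from: datetime
    available_until: Optional[datetime]  # None: still available


def search(index: Iterable[IndexEntry], terms: Set[str],
           start: datetime, end: datetime) -> List[IndexEntry]:
    """Return entries containing all terms whose availability overlaps [start, end]."""
    hits = []
    for entry in index:
        still_there = entry.available_until is None or entry.available_until >= start
        if terms <= entry.terms and entry.available_from <= end and still_there:
            hits.append(entry)
    return hits
```

Setting `start` and `end` to the same instant restricts the search to data that was available at one specific point in time.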
- the data or the resources containing the data can then, for example, be displayed on the screen in the form of a table or list 30 or be prepared as a catalog or in some other way, for example graphically.
- The search engine 4a or 4b can also be accessed not in a browser but via an upstream input interface in the form of a separate software program.
- This interface can be implemented, for example, by an additional program or the like, which appears in the browser as a separate input window or as a browser extension.
- This extension additionally offers the possibility of automatically converting certain entries or error messages caused by the non-availability of data (in the sense of data of the "invisible net" behind the surface) or of resources ("broken link") into corresponding queries to the search engine. This results in a new search request or a new access to data, which is then automatically called up, possibly reconstructed, and displayed in the browser.
- this interface can be used to display a catalog for the selection of certain terms or resources, according to or in which research is to be carried out.
- this interface can be used to query stored user-specific parameters.
- the extensions offered by the interface can also be integrated into the browser.
- a corresponding interface can also be provided for the output of data obtained from the system.
- Depending on the search terms and/or resources or groups of resources and/or time or other parameters, this interface can automatically present the information found in a one- or multi-dimensional result list, sorted if necessary according to the parameters mentioned or other relevance criteria. It can be provided that, if a query leads to a unique result (for example when querying a resource at a specific time), the data is displayed directly in the original format, whereas if several data items meet the search criteria, they are presented in a result list or as a cataloged, categorized or graphically prepared output. To enable display in the original format, programs or extensions may have to be made available to users by the search engine or the resources.
- A graphic representation of a resource's life cycle (for example, the temporal development of the data stored on it, obtained by identifying the changes) or of its networking with other pages and resources over time can also be provided.
- references to other resources that are similar or identical or have a common origin can be displayed.
- the data found can be sorted, for example, using neural or evolutionary algorithms.
- the search results can be searched again if several data fulfilling the search criteria are found.
- the method according to the invention for searching for data and data-containing resources also makes it possible to search explicitly by the time parameter, for example for data that was available at a specific point in time or within a specific period, or that changed within a predetermined period. This also implies the ability to search for resources or groups of resources on which data changed within a certain period of time.
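Searching for resources whose data changed within a given period can be sketched by comparing fingerprints of consecutive archived versions. The archive layout, the use of SHA-256 as the fingerprint, and the function name are assumptions chosen for this illustration.

```python
import hashlib
from datetime import datetime
from typing import Dict, List, Set, Tuple

Snapshot = Tuple[datetime, bytes]  # (archive time, archived content)


def changed_resources(archive: Dict[str, List[Snapshot]],
                      start: datetime, end: datetime) -> Set[str]:
    """Resources whose content differs from the previous snapshot at some time in [start, end]."""
    changed = set()
    for resource, snapshots in archive.items():
        ordered = sorted(snapshots, key=lambda s: s[0])
        for (_, before), (when, after) in zip(ordered, ordered[1:]):
            differs = hashlib.sha256(before).digest() != hashlib.sha256(after).digest()
            if differs and start <= when <= end:
                changed.add(resource)
                break
    return changed
```

In practice the fingerprints would be stored with the index rather than recomputed, so that the archives themselves need not be read for every query.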
- the present invention thus offers the possibility of conveniently accessing the resources or data made available in a distributed system, or of searching for data with corresponding information and at the same time also taking into account the period of availability of this data. As a result, the information content of the available data material can be used extremely effectively.
- the methods according to the invention for searching for and for accessing the resources or data are preferably implemented by software programs.
- Existing search engines or browsers that do not yet support the method according to the invention can be retrofitted using additional programs or applets.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP02719901A EP1509856A2 (de) | 2001-02-22 | 2002-02-22 | Verfahren zur datensuche unter berücksichtigung ihres verfügbarkeitszeitraums in einem verteilten system |
AU2002250996A AU2002250996A1 (en) | 2001-02-22 | 2002-02-22 | Method for searching for data, taking into account the moment of availability of said data in a distributed system |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE10108564.8 | 2001-02-22 | ||
DE10108564A DE10108564A1 (de) | 2001-02-22 | 2001-02-22 | Verfahren zur Suche nach in einem verteilten System aktuell oder früher gespeicherten Daten oder Daten enthaltenden Ressourcen unter Berücksichtigung des Zeitpunkts ihrer Verfügbarkeit |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2002069184A2 (de) | 2002-09-06 |
WO2002069184A3 (de) | 2004-12-29 |
Family
ID=7675134
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/EP2002/001912 WO2002069184A2 (de) | 2001-02-22 | 2002-02-22 | Verfahren zur datensuche unter berücksichtigung ihres verfügbarkeitszeitraums in einem verteilten system |
Country Status (5)
Country | Link |
---|---|
US (1) | US20020116375A1 (de) |
EP (1) | EP1509856A2 (de) |
AU (1) | AU2002250996A1 (de) |
DE (1) | DE10108564A1 (de) |
WO (1) | WO2002069184A2 (de) |
2001
- 2001-02-22: DE, application DE10108564A, patent DE10108564A1 (de), status: not active (withdrawn)
2002
- 2002-02-22: WO, application PCT/EP2002/001912, patent WO2002069184A2 (de), status: not active (application discontinued)
- 2002-02-22: EP, application EP02719901A, patent EP1509856A2 (de), status: not active (withdrawn)
- 2002-02-22: US, application US10/080,894, patent US20020116375A1 (en), status: not active (abandoned)
- 2002-02-22: AU, application AU2002250996A, patent AU2002250996A1 (en), status: not active (abandoned)
Also Published As
Publication number | Publication date |
---|---|
DE10108564A1 (de) | 2002-09-12 |
EP1509856A2 (de) | 2005-03-02 |
AU2002250996A1 (en) | 2002-09-12 |
US20020116375A1 (en) | 2002-08-22 |
WO2002069184A3 (de) | 2004-12-29 |
Legal Events
Code | Title | Description |
---|---|---|
AK | Designated states | Kind code of ref document: A2. Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG UZ VN YU ZA ZM ZW |
AL | Designated countries for regional patents | Kind code of ref document: A2. Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | |
WWE | Wipo information: entry into national phase | Ref document number: 2002719901. Country of ref document: EP |
WWP | Wipo information: published in national office | Ref document number: 2002719901. Country of ref document: EP |
NENP | Non-entry into the national phase | Ref country code: JP |
WWW | Wipo information: withdrawn in national office | Country of ref document: JP |
WWW | Wipo information: withdrawn in national office | Ref document number: 2002719901. Country of ref document: EP |