WO1998044434A1 - Data processing system and method - Google Patents
- Publication number
- WO1998044434A1 (application PCT/GB1998/000804)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- data objects
- data structure
- objects
- data object
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/958—Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
- G06F16/972—Access to data in other repository systems, e.g. legacy data or dynamic Web page generation
Definitions
- the present invention relates to a data processing system and method, and to a computer program product for implementing said method. More particularly, the present invention relates to a system and method for processing data objects stored within or accessible from the data processing system.
- data stored on or retrieved from the Internet typically comprises HTML documents which can be accessed and rendered using one of the many available Web Browsers.
- data bases and associated data base query languages are used to manipulate data retrieved from the Internet and to provide a more user-friendly interface to the data.
- "Networked Biomedical Image Hyperbase", presented at the WWW '94 Conference workshop "Teaching and Learning with the Web" at the First International Conference on the WWW, discloses an image data base, based on the WWW system, which uses the HTML "forms" facility to access the data stored within the data base.
- knowledge of a data base language, such as SQL, is required in order to be able to access the data stored within a data base.
- the data base systems are tailored to the particular type of information stored within the data base. It is often the case that a data base can only manipulate data provided in a particular format and if the format or type of the data changes, corresponding modifications to the data base will have to be effected in order to render the new format or type of data accessible to the user.
- the activities of the Digital Audio-Visual Council (DAVIC), representing an association of over 200 companies, are directed towards unifying standards for networked multimedia services, applications and products.
- the DAVIC produces a significant volume of documentation covering a wide range of topics ranging from telecommunications to media content and computer networking. All of the documentation is available from an FTP site on a server currently located in Italy.
- the documentation produced by the DAVIC comprises many different formats and forms.
- the documentation typically comprises one of the following Microsoft Word Documents, Rich Text Format Documents, HTML pages and
- the present invention provides a method for processing data objects stored within or accessible from a data processing system comprising memory for storing a data structure having a plurality of data object storage locations for storing data objects, means for accessing and manipulating the data objects stored within said data structure, the method comprising the steps of establishing, within said memory, the data structure for storing the data objects; populating said data structure with data objects; creating directly from said data structure an access document for allowing access to said data objects.
- the present invention removes the requirement for a data base interface to the data of interest, together with the concomitant knowledge of an appropriate query language.
- the contents of the data structure can be displayed or rendered using a simple HTML or Web Browser.
- the data objects can take a plurality of forms and have a plurality of data types.
- the access document comprises an HTML document having entries therein relating to the data objects populating said data structure.
- the entries represent or are hyperlinks to the data objects.
- an entry may represent a plurality of documents which are identical in content but each have different formats.
- the structure of that document or the rendered document reflects the structure of the data structure storing the data objects.
- the present invention allows the creation of a single access document which provides access to data objects which have diverse formats, representations or structures.
- the access document comprises HTML
- the single document can be readily viewed or rendered by any one of a number of browsers as are well known within the art.
- an embodiment of the present invention provides a method for processing data objects wherein the step of creating the access document comprises the steps of traversing the data structure to access directly the data objects stored therein; collating data associated with or derived from the data objects stored in the data structure; and including in the access document an HTML instruction for presenting the collated data when said access document is rendered.
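The traverse, collate and present steps described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the nested-dictionary representation of the data structure and the function names `collate` and `build_access_document` are assumptions made for the example.

```python
from html import escape


def collate(node, path=""):
    """Depth-first walk returning (location, file name) for every data object.

    `node` is a hypothetical in-memory stand-in for one storage location:
    {"name": str, "objects": [file names], "children": [nodes]}.
    """
    here = f"{path}/{node['name']}" if path else node["name"]
    rows = [(here, name) for name in node.get("objects", [])]
    for child in node.get("children", []):
        rows.extend(collate(child, here))
    return rows


def build_access_document(tree):
    """Return an HTML page whose table has one hyperlinked row per data object."""
    lines = ["<html><body><table>", "<tr><th>Location</th><th>Name</th></tr>"]
    for location, name in collate(tree):
        href = escape(f"{location}/{name}", quote=True)
        lines.append(
            f"<tr><td>{escape(location)}</td>"
            f'<td><a href="{href}">{escape(name)}</a></td></tr>'
        )
    lines.append("</table></body></html>")
    return "\n".join(lines)
```

Because the rows are produced in traversal order, the rendered table mirrors the layout of the underlying structure, which is the property the description relies on.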
- the access document produced should preferably contain at least one entry for each of the data objects stored within the data structure.
- the data structure may comprise data objects having any one of a plurality of possible formats and that only data in one of those formats may be of interest to a user.
- an aspect of the present invention provides a method for processing data objects wherein the step of collating comprises the steps of specifying at least one type of a selectable plurality of types of data object to be collated; specifying a search key to be used in collating data objects having the at least one type; searching through all data objects of the at least one type to identify those data objects containing a match with said search key; and collating the identified data objects.
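One way to picture the type-restricted, keyed collation is the sketch below. It assumes the text of each data object has already been extracted (real Word files are binary and would need a converter), that the type of a data object is given by its file extension, and that matching is a case-insensitive substring test. The function name `collate_matches` is illustrative.

```python
from pathlib import PurePath


def collate_matches(objects, wanted_types, search_key):
    """Return the names of data objects of the wanted types that contain the key.

    objects: iterable of (file_name, text_content) pairs.
    wanted_types: file extensions such as "doc" or ".rtf".
    """
    wanted = {t.lower().lstrip(".") for t in wanted_types}
    key = search_key.lower()
    hits = []
    for name, text in objects:
        ext = PurePath(name).suffix.lower().lstrip(".")
        # Only objects of a selected type are searched at all.
        if ext in wanted and key in text.lower():
            hits.append(name)
    return hits
```

A user who can only open Rich Text Format files would pass `["rtf"]` and receive only the matching `.rtf` objects, even when a `.doc` copy with the same content exists.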
- a user can select documents or data objects according to the means at their disposal for examining or reading the content of those documents.
- data objects which relate to similar or identical subject-matter can be grouped within the same storage location, that is, the collated data objects are preferably stored within the same storage location of the data structure.
- Producing and subsequently displaying a summary of the data objects stored within a data structure can be a time consuming task, especially if the step of producing is performed in real-time.
- an aspect of the present invention provides a method for processing data objects wherein the step of populating the data structure comprises, for a data object, the steps of creating information data objects containing information associated with respective data objects; storing each information data object and respective data objects in the same storage location of the data structure.
- an information data object comprises data relating to the attributes of a respective data object such as at least one of either an indication, such as a file name, of a respective data object, the date of creation of a respective data object, an indication of the author or creator of the respective data object, a title associated with said data object, or an indication of the location of an associated data object.
- An embodiment of the present invention also provides a data processing system for processing data objects stored therein or accessible therefrom comprising memory for storing a data structure having a plurality of data object storage locations for storing data objects, means for accessing and manipulating the data objects stored within the data structure, the system comprising means for establishing, within said memory, the data structure for storing the data objects; means for populating the data structure with data objects; and means for creating directly from the data structure an access document comprising entries relating to the data objects populating said data structure.
- a still further aspect of the present invention provides a computer program product for processing data objects stored within or accessible from a data processing system comprising memory for storing a data structure having a plurality of data object storage locations for storing data objects, means for accessing and manipulating the data objects stored within said data structure, the product or data processing system comprising computer readable program code means for establishing, within said memory, the data structure for storing the data objects; computer readable program code means for populating said data structure with data objects; and computer readable program code means for creating directly from said data structure an access document comprising entries relating to the data objects populating said data structure.
- figure 1 illustrates schematically a computer system suitable for implementing embodiments of the present invention
- figure 2 depicts a hierarchical data structure for storing documents of interest
- figure 3 illustrates a flow chart of an embodiment of the present invention.
- a computer system 100 such as a Pentium Pro, suitable for implementing an embodiment of the present invention comprising a microprocessor 102 for executing instructions of an executable application, connected to address 104 and data 106 buses to allow the manipulation of data stored within a memory 108.
- a hard disc 110 is also accessible via a hard disc controller 112.
- the hard disc 110 is utilised to store data and programs for use by the microprocessor 102.
- Access to memory is typically controlled via a memory controller 114 and utilisation of the buses 104 and 106 is managed by a respective bus controller 116.
- the computer system 100 is operable in a client-server arrangement for accessing and retrieving information from the Internet using a Web Browser as is conventional. It will be appreciated that the computer system will typically contain other elements, such as a CD-ROM drive and associated controller, which are not depicted.
- a hierarchical data structure 200 in the form of a tree representative of, for example, a directory 202, such as the root directory, of the hard disc 110, together with a plurality of sub-directories 204 to 210 some of which also have respective sub-directories 212 to 216.
- Each level of the data structure 200 comprises a plurality of data objects.
- the first level of the data structure comprises data objects 218 and 220
- the first level of the sub-directories comprises data objects 222 and 224
- the second level of subdirectories comprises data objects 226 to 236.
- Each subdirectory of the data structure represents a storage location for data objects, that is, a part of the data structure in which a plurality of data objects can be stored.
- the data objects are grouped at each storage location or level within the data structure 200.
- the grouping of the data objects can be according to any predeterminable association such that all data objects within a particular sub-directory may relate to a particular aspect of the DAVIC standardisation discussions.
- all data objects in a subdirectory 212 may be or relate to all published documents concerning the A10 application programming interface (A10 API) such as the documents Ilb03rl0.doc 226 and Ilb03rl0.rtf 228.
- A10 API application programming interface
- the documents Ilb03rl0.doc 226 and Ilb03rl0.rtf 228 represent the same document in different formats.
- the first document 226 is a Microsoft Word document.
- the second document 228 is a Rich Text Format Document.
- the file extension "doc" will be taken to represent a Microsoft Word document and the file extension "rtf" will be taken to represent a Rich Text Format Document.
- Data objects can be grouped notwithstanding that they have different formats.
- the data objects which can be accommodated by the present embodiment are not limited to Microsoft Word or Rich Text Format Documents. Other data object types can equally well be accommodated such as HTML documents or JPEG files, as are well known within the art.
- the data objects stored within the data structure can be of any type and in any format.
- each level of the data structure 200 preferably comprises, in addition to the data objects, a plurality of information data objects 238 to 254.
- the information data objects comprise data relating to attributes of a respective data object.
- Those attributes may relate to at least one of the following: an indication, such as a DOS file name, of a respective data object, the date of creation of a respective data object, an indication of the author or creator of the respective data object, a title associated with the data object, or an indication of the location of an associated data object.
- the information data objects provide further information about a respective data object than can be identified readily from the data object per se.
- although figure 2 shows a one-to-one correspondence between the data objects and the information data objects, the present invention is not limited thereto.
- a sub-directory may instead contain a single information data object which describes, in general terms, all or some of the data objects stored within that sub-directory.
- the one to many relationship between the information data object and the associated data objects arises from the fact that the data objects are, or can be, grouped according to any predeterminable association.
- the information data objects may be represented using a DOS file comprising simple ASCII characters or, alternatively, an HTML document, which, when rendered, outputs in a formatted manner the attributes of an associated data object.
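A plain-ASCII information data object could be modelled along the following lines. The `key=value` line layout and the field names are assumptions for illustration; the format the patent shows in its Table 1 is not reproduced in this text.

```python
from dataclasses import asdict, dataclass


@dataclass
class InfoObject:
    """Attributes of one data object (field names are illustrative)."""

    name: str
    title: str = ""
    author: str = ""
    created: str = ""
    location: str = ""

    def to_text(self):
        # One "key=value" line per attribute, in field order.
        return "\n".join(f"{key}={value}" for key, value in asdict(self).items())

    @classmethod
    def from_text(cls, text):
        # Split on the first "=" only, so values may themselves contain "=".
        fields = dict(
            line.split("=", 1) for line in text.splitlines() if "=" in line
        )
        return cls(**fields)
```

The sketch does not handle values that contain line breaks; a real format would need an escaping rule for those.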
- An information data object is created for each data object stored within the data structure 200.
- a single information data object can be formed which comprises attribute data for all data objects within the data structure 200 or within a particular sub-directory of the data structure 200.
- the information data objects are preferably stored within the same sub-directory as the associated data objects.
- the information data objects can be created upon the addition of a new data object to the data structure.
- the data structure can be periodically traversed by a search engine to ensure that all data objects have corresponding information data objects.
- a user can be prompted to create such an information data object.
- the newly created information data object can then be incorporated into the data structure at the appropriate location.
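The periodic completeness check can be sketched as below. The naming rule used here (an information data object is the data object's file name plus a `.info` suffix) is a hypothetical convention chosen for clarity; it differs from the `.ini` and `.in2` names used in the patent's own example.

```python
def missing_info_objects(file_names, suffix=".info"):
    """Return the data objects in one sub-directory that lack an information data object.

    file_names: names of all files found in the sub-directory.
    suffix: assumed marker for information data objects (hypothetical).
    """
    names = set(file_names)
    return sorted(
        name
        for name in names
        if not name.endswith(suffix) and name + suffix not in names
    )
```

Each name returned is a candidate for prompting the user to create the missing information data object.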
- the information data objects associated with data objects Ilb03rl0.doc 226 and Ilb03rl0.rtf 228 are Ilb03rl0.ini 244 and Ilb03rl0.in2 246 respectively.
- the first of these information data objects may have a format as depicted in Table 1 below.
- a search engine is used to traverse the data structure in, for example, a depth first manner to determine or identify the data objects contained therein.
- the search engine can either examine and extract data directly from the data objects per se (or rather from the corresponding FAT table entries, in the event that the data objects are stored on the hard disc) or ignore the data objects and utilise only the information data objects, if present.
- the entries in the sub-directories can be obtained from the FAT table of the hard disc.
- the implementation of such a search engine which is able to traverse such a data structure is well known within the art and will not be described in further detail.
- the search engine also notes for each data object the location thereof within the data structure.
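For a data structure held as a directory tree on disc, a depth-first traversal that records each data object's location might look like this. It is a sketch using the standard `pathlib` module rather than FAT entries, and the function name `walk_depth_first` is illustrative.

```python
from pathlib import Path


def walk_depth_first(root):
    """Yield (location relative to root, file name) for every file under root.

    The files of a directory are reported first, then each sub-directory is
    fully explored before its siblings (pre-order, depth-first).
    """
    root = Path(root)

    def visit(directory):
        entries = sorted(directory.iterdir(), key=lambda p: p.name)
        for entry in entries:
            if entry.is_file():
                yield directory.relative_to(root).as_posix(), entry.name
        for entry in entries:
            if entry.is_dir():
                yield from visit(entry)

    yield from visit(root)
```

The recorded location is what later becomes the target of the hyperlink in the access document.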
- the search engine, after collating information either directly from the data objects or by extracting the information from the information data objects, creates an access document, typically an HTML document, for presenting the collated or extracted information to a user.
- the collated information is included within the HTML document using, for example, a "table" instruction, as is well known within the art.
- the location information is utilised to provide access to the documents in the form of, preferably, hypertext links to the data objects.
- the data objects can then be viewed by invoking an appropriate application via a suitably coded applet.
- the HTML document can then be rendered using a standard Web Browser, or if applets are utilised, a Java enabled Web Browser, to produce a table of the contents of or an orderly means for accessing the hierarchical data structure and the data objects stored in the hierarchical data structure.
- the location information contained within the rendered table can then be used in the conventional manner to create hyperlinks to the data objects associated therewith.
- the hyperlink is invoked in the conventional manner by "clicking" on the hyperlink.
- a data object may also represent a HyperText Clipping which is a simple text file containing a name, typically the DOS file name of the file, and a URL pointing to an HTML document on the Internet.
- the HyperText Clipping may have an extension of HTC.
- the search engine can be instructed to collate information relating only to the hyperlinks stored within the data structure.
- the hyperlinks contained within the HyperText Clipping data objects can be used to access data or HTML documents accessible from the Internet.
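Turning a HyperText Clipping into a hyperlink entry can be sketched as follows. The description only says that the clipping is a simple text file holding a name and a URL, so the two-line layout assumed here (name on the first non-blank line, URL on the second) is hypothetical, as are the function names.

```python
from html import escape


def parse_clipping(text):
    """Return (name, url) from a clipping laid out as a name line then a URL line."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) < 2:
        raise ValueError("expected a name line followed by a URL line")
    return lines[0], lines[1]


def clipping_link(text):
    """Render the clipping as an HTML anchor for inclusion in the access document."""
    name, url = parse_clipping(text)
    return f'<a href="{escape(url, quote=True)}">{escape(name)}</a>'
```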
- the data objects representing hyperlinks can be grouped within a sub-directory according to predeterminable criteria such as all hyperlinks pointing to DAVIC documents relating to the same or very similar aspects of a multimedia standard.
- the search engine is adapted to be able to effect searches for matches with selectable search parameters or a given search key without having to have regard to the types of the data objects stored within the data structure 200. It is likely, in the event that various data objects in the same sub-directory merely represent the same basic information, although in different formats, that the search engine will locate a plurality of matches for the given search key or search parameters.
- the search engine should preferably have the ability to search through all of the various types of documents stored within the data structure. Preferably, the search engine should search at least HTML documents, Microsoft Word documents (*.doc) and Rich Text Format documents (*.rtf).
- the access document or HTML document subsequently produced is arranged so that, when rendered, documents or data objects located as a result of a search which are merely different types of the same basic underlying document are grouped and displayed together.
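Grouping search hits that are different formats of one underlying document can be sketched as below. The sketch assumes, as in the Ilb03rl0.doc and Ilb03rl0.rtf example, that such files share a base name and differ only in extension; the function name `group_by_document` is illustrative.

```python
from collections import defaultdict
from pathlib import PurePath


def group_by_document(file_names):
    """Map each base name (lower-cased) to the sorted list of its format variants."""
    groups = defaultdict(list)
    for name in file_names:
        groups[PurePath(name).stem.lower()].append(name)
    return {stem: sorted(names) for stem, names in sorted(groups.items())}
```

Each group can then be emitted as a single entry of the access document with one hyperlink per format.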
- in Table 2 there is shown an example of a table 300 produced as a consequence of rendering an HTML document produced from a data structure.
- the table 300 comprises five columns headed "Name", "Title", "Description", "Author/Date" and "Comments". Each of these headings has a corresponding entry in an information data object associated with a respective data object.
- the "Name" column represents the file name of a data object within a sub-directory
- the "Title" column represents the title of that document
- the "Description" column comprises a meaningful description of the content of the associated data objects
- the "Author/Date" column comprises the name of the author of the data object and the date of creation of the data object
- the "Comments" column represents an opportunity to display any further comments in relation to the data objects.
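A table with these five headings could be generated from the information data objects roughly as follows. The sketch assumes each information data object has already been read into a dictionary keyed by the column headings; missing entries are rendered as empty cells. The function name `render_table` is illustrative.

```python
from html import escape

COLUMNS = ["Name", "Title", "Description", "Author/Date", "Comments"]


def render_table(records):
    """Return an HTML table with one row per record and the five fixed columns."""
    header = "".join(f"<th>{escape(column)}</th>" for column in COLUMNS)
    rows = [f"<tr>{header}</tr>"]
    for record in records:
        cells = "".join(
            f"<td>{escape(record.get(column, ''))}</td>" for column in COLUMNS
        )
        rows.append(f"<tr>{cells}</tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"
```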
- the documents or data objects listed correspond to those in the second level sub-directory 212 depicted in figure 2.
- the information displayed was extracted from one of the *.in? files associated with the data objects stored within the second level sub-directory 212.
- the information listed in the first column in parenthesis can be derived from the FAT table of the hard disc supporting the data structure or via a suitable program arranged to measure the number of bytes constituting a data object. Referring to Table 2 there is shown schematically the results of rendering an access document, such as an HTML document, produced according to the present invention.
- Figure 3 illustrates a flow chart of the processing performed by an embodiment.
- a data structure such as a disk file structure of a hard disk
- the data processing system gains access to or provides a data structure containing, in a number of storage locations or levels, a number of data objects at step 302.
- the data structure may be stored locally on the hard disk or within the memory of the data processing system.
- the data structure may be stored remotely on, for example, a file server which may be accessed via a communication network, such as the Internet or WWW.
- Data relating to the stored data objects is collated at step 304.
- the data may be used to identify and provide access to the data objects or may be used to invoke an application for providing access to or opening the data objects.
- the data structure may be traversed using any one of a plurality of known techniques. The technique selected should be appropriate to the structure of the data structure.
- An access document for providing access to or identifying the available data objects within the data structure is created at step 306. The access document is output for further processing or for display to an interested user at step 308.
- an embodiment of the present invention would include the steps of establishing a data structure within an appropriate memory or on a suitable storage medium of a local or remotely accessible data processing system. Any newly created data structure would then require to be populated with data objects. Therefore, an embodiment preferably comprises the step of populating the data structure or allowing the data structure to be populated with data objects.
- the data objects stored within the data structure can be accessed directly via the hyperlinks associated therewith which identify the location of the data objects within the data structure.
- the present invention therefore obviates the need for a data base interface in order to be able to manipulate or view the documents relating to a given subject.
- the data objects can be viewed or accessed using a simple HTML Web Browser.
- a storage location may comprise multiple data objects.
- the storage location of a hierarchical disc or network file structure can be a directory or a subdirectory of that structure. Each directory or subdirectory comprises several files which represent various types of documents, applications, etc.
- An advantage of using the disc or network file structure as the data structure of an embodiment is that the need to encapsulate the data objects is avoided. Encapsulation requires a significant investment in terms of defining the various class structures required to encapsulate the data objects. Therefore embodiments of the present invention advantageously reduce the time and expense required to produce a system for providing access to heterogeneous documents, which is typically incurred with prior art systems which rely upon encapsulation.
- a data object may represent, for example, a file having a particular format or a collation of, for example, files having different formats but relating to the same or similar subject-matter, that is, a data object may represent a folder which contains several other data objects.
- the access document or rendered access document may selectably comprise data relating to at least one of the information data objects or the data objects.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU68423/98A AU6842398A (en) | 1997-03-27 | 1998-03-27 | Data processing system and method |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB9706429.9 | 1997-03-27 | ||
GBGB9706429.9A GB9706429D0 (en) | 1997-03-27 | 1997-03-27 | Data processing system and method |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1998044434A1 (en) | 1998-10-08 |
Family
ID=10809998
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/GB1998/000804 WO1998044434A1 (en) | 1997-03-27 | 1998-03-27 | Data processing system and method |
Country Status (3)
Country | Link |
---|---|
AU (1) | AU6842398A (en) |
GB (1) | GB9706429D0 (en) |
WO (1) | WO1998044434A1 (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2366497A (en) * | 2000-08-25 | 2002-03-06 | Copyn Ltd | Database for storage and retrieval of bookmarks of portions of web-pages |
GB2366498A (en) * | 2000-08-25 | 2002-03-06 | Copyn Ltd | Method of bookmarking a section of a web-page and storing said bookmarks |
GB2366499A (en) * | 2000-08-25 | 2002-03-06 | Copyn Ltd | A method of storing a portion of a web-page |
US6405216B1 (en) | 1999-09-17 | 2002-06-11 | International Business Machines Corporation | Internet-based application program interface (API) documentation interface |
GB2373698A (en) * | 2001-03-20 | 2002-09-25 | Copyn Ltd | Storage of a portion of a web-page containing a link |
SG136805A1 (en) * | 2000-09-26 | 2007-11-29 | Volantis Systems Ltd | Web server |
- 1997-03-27 GB GBGB9706429.9A patent/GB9706429D0/en active Pending
- 1998-03-27 WO PCT/GB1998/000804 patent/WO1998044434A1/en active Application Filing
- 1998-03-27 AU AU68423/98A patent/AU6842398A/en not_active Abandoned
Non-Patent Citations (3)
Title |
---|
MANN K: "Novell gets net-wise (Internet Publisher)", NETWORK COMPUTING, MAY 1995, UK, vol. 4, no. 5, ISSN 0966-7873, pages 42 - 43, XP002044581 * |
REINHART G ET AL: "INTEGRATION VON ZULIEFERERN IN DAS DATEN- UND PROZESSMANAGEMENT INTEGRATION OF SUPPLIERS INTO THE DATA AND PROCESS MANAGEMENT", ZWF ZEITSCHRIFT FUER WIRTSCHAFTLICHEN FABRIKBETRIEB, vol. 91, no. 9, September 1996 (1996-09-01), pages 391 - 394, XP000640066 * |
SHKLAR L ET AL: "INFOHARNESS: USE OF AUTOMATICALLY GENERATED METADATA FOR SEARCH AND RETRIEVAL OF HETEROGENEOUS INFORMATION", ADVANCED INFORMATION SYSTEMS ENGINEERING. INTERNATIONAL CONFERENCE, CAISE, 12 June 1995 (1995-06-12), pages 217 - 230, XP000677759 * |
Also Published As
Publication number | Publication date |
---|---|
GB9706429D0 (en) | 1997-05-14 |
AU6842398A (en) | 1998-10-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6546406B1 (en) | Client-server computer system for large document retrieval on networked computer system | |
US6055544A (en) | Generation of chunks of a long document for an electronic book system | |
Hammer et al. | Semistructured data: The TSIMMIS experience | |
US6654737B1 (en) | Hypertext-based database architecture | |
US6003046A (en) | Automatic development and display of context information in structured documents on the world wide web | |
US8359550B2 (en) | Method for dynamically generating a “table of contents” view of the HTML-based information system | |
US5781785A (en) | Method and apparatus for providing an optimized document file of multiple pages | |
US7797337B2 (en) | Methods, systems, and computer program products for automatically associating data with a resource as metadata based on a characteristic of the resource | |
US7630999B2 (en) | Intelligent container index and search | |
US6983287B1 (en) | Database build for web delivery | |
US7617449B2 (en) | Method and system for mapping content between a starting template and a target template | |
US7428705B2 (en) | Web map tool | |
US20100077320A1 (en) | SGML/XML to HTML conversion system and method for frame-based viewer | |
EP2151769A1 (en) | A system for electronic publishing | |
US20140052778A1 (en) | Method and apparatus for mapping a site on a wide area network | |
AU2005225130A1 (en) | Management and use of data in a computer-generated document | |
US7275066B2 (en) | Link management of document structures | |
US6766329B1 (en) | Dynamic personalized information organizer | |
CN1127031C (en) | System and method for establishing personalized file in electronic form | |
WO1998044434A1 (en) | Data processing system and method | |
US7085772B1 (en) | Method for processing data objects | |
JP2503333B2 (en) | Full-text index search method | |
WO2000039713A1 (en) | A method and system for performing electronic data-gathering across multiple data sources | |
US20050235197A1 (en) | Efficient storage of XML in a directory | |
US6915483B1 (en) | Method for generating a group of page files formatted in a page markup language |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | WWE | Wipo information: entry into national phase | Ref document number: 09101959; Country of ref document: US |
| | AK | Designated states | Kind code of ref document: A1; Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM GW HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW |
| | AL | Designated countries for regional patents | Kind code of ref document: A1; Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG |
| | DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| | REG | Reference to national code | Ref country code: DE; Ref legal event code: 8642 |
| | NENP | Non-entry into the national phase | Ref country code: JP; Ref document number: 1998541277; Format of ref document f/p: F |
| | 122 | Ep: pct application non-entry in european phase | |
| | NENP | Non-entry into the national phase | Ref country code: CA |