GB2395300A - Providing access to component parts of a document - Google Patents


Info

Publication number
GB2395300A
Authority
GB
United Kingdom
Prior art keywords
document
application
client
user
content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB0225972A
Other versions
GB0225972D0 (en
Inventor
Joanne Elizabeth Clark
Peter Ward
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CHASSERAL Ltd
Original Assignee
CHASSERAL Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CHASSERAL Ltd filed Critical CHASSERAL Ltd
Priority to GB0225972A priority Critical patent/GB2395300A/en
Publication of GB0225972D0 publication Critical patent/GB0225972D0/en
Publication of GB2395300A publication Critical patent/GB2395300A/en
Withdrawn legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/957Browsing optimisation, e.g. caching or content distillation
    • G06F16/9574Browsing optimisation, e.g. caching or content distillation of access to content, e.g. by caching

Abstract

A client (27, 37) and an engine (20) provide for access by an application (21, 31) to a database datastructure, which stores a plurality of parts of a content document in corresponding nodes. The application (21, 31) is arranged to read said document from an application specific file, the client (27, 37) further comprising a proxy application specific file comprising control statements and means for causing said application (21, 31) to open said proxy file to create a proxy document within said application (21, 31). The control statements are arranged to be automatically executed by said application (21, 31) in order to load said content document parts from said database datastructure into the proxy document in order to provide said content document within said application. Improved concurrent access by more than one user (24, 34) may thereby be achieved.

Description

IMPROVED FILE ACCESS
Field of the Invention
The present invention relates to methods and apparatus for accessing content of a document conventionally stored in a file, including an improved data structure for storing the document; means for converting an application specific document file containing a content document into said data structure; means for displaying and editing said content document supplied from said datastructure from within a legacy application; and an architecture for providing concurrent network access to said content document whilst stored in said data structure.
Background of the Invention
Documents such as word processing files are stored as files on disk. The content of the document is in a proprietary (though not necessarily secret) format and can be fully understood only by the related application. The user invokes the application to read the entire content of the file into memory for the purpose of editing. The user manipulates the information in memory using the features of the application. The internal structure of the information whilst in memory is not exposed. Because the application is single-user and contains the entire document contents in local memory, it is not possible for another user to be editing the same document at the same time. The application uses the support of the underlying operating system to "write-lock" the file. When the user has completed a sequence of editing actions, the application writes the entire content of the document back to the file.
Another problem with this system is that when edits are saved to the file data structure, for example by user command or an autosave function, the entire document needs to be "re-written" to memory each time the document is saved.
US 6088702 discloses a collaborative publishing system especially for developing large text books or similar documents in which different chapters are written by different authors and in which one or more editors also has access to the document. The document is divided into sections such as chapters or parts of chapters which are stored as separate files. An author or editor accessing a particular chapter file can then lock this in order to edit it whilst still allowing read only access to others. Whilst this arrangement divides a large file into smaller files, it has the disadvantage mentioned above that only one user can have edit access to a "small" file at a time. In addition the decision about the appropriate division of the document must be made in advance.
EP 0550374 discloses a concurrent user access system for use with a word processing application in which a user defines a region in a shared document for locking, for example by highlighting this region with a cursor. Once this region is locked another user cannot then edit this region, although they do have read only access to it. The system uses tables containing reference data for the locked region such as page and line references for example. When a second user attempts to edit a part of the document the application first checks the tables to determine whether another user has already locked the region and if so prevents the second user from doing this. Whilst this does allow concurrent edit access to different parts of the same document, it is still slow to refresh the corresponding file with any changes made to the document. In addition, the user must manually request the lock and unlock; and the user must use a bespoke editor, not a legacy word processing program they are familiar with.
US 5892513 discloses a system in which documents are represented as trees of objects and in which a user can "check out" elements such as a paragraph for editing which then prevents others from editing the same paragraph. The system does however allow access to previous versions of this element which are not checked out by another user. Once a current version of an element is "checked back in" to the document the document file is re-written or saved to incorporate the editing.
Generally, systems that use a "check-out" usage style suffer the problem that the second user to access the information receives a read-only copy, and will create a local writeable copy to perform their updates. When the original becomes available they will then re-assert their version in place of the master copy. Without careful comparison and merging of changes, this can cause updates to be lost and can lead to "version wars". Also, it is often left to the user to check for any updates, so that it is not easy to know whether or not you are working on the latest information.
Summary of the Invention
In general terms in one aspect the present invention provides a data structure for storing a content document such as might be retrieved from a word processing file, the document being comprised of a plurality of parts such as paragraphs. The data structure comprises a database having a plurality of content nodes corresponding to the plurality of document parts, each said document part being stored in a separate database content node which has a unique identifier. The database may be a table or relational type having records and fields for example, an object oriented database having a plurality of objects corresponding to the parts, or it could be any other type of database in which each content node is capable of being independently accessed and updated.
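As an illustration only, the relational variant of such a data structure might be sketched as follows. The table name, column names and helper functions are hypothetical and do not come from the specification; SQLite is used purely because it ships with Python.

```python
import sqlite3
import uuid


def create_store():
    """Create an in-memory database with one row (content node) per document part."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE content_node ("
        " node_id TEXT PRIMARY KEY,"   # unique identifier of the node
        " doc_id TEXT NOT NULL,"       # document the part belongs to
        " position INTEGER NOT NULL,"  # order of the part within the document
        " content TEXT NOT NULL)"
    )
    return conn


def store_document(conn, doc_id, parts):
    """Store each part in its own node and return the node identifiers in order."""
    node_ids = []
    for position, content in enumerate(parts):
        node_id = uuid.uuid4().hex
        conn.execute(
            "INSERT INTO content_node VALUES (?, ?, ?, ?)",
            (node_id, doc_id, position, content),
        )
        node_ids.append(node_id)
    return node_ids


def update_part(conn, node_id, new_content):
    """Rewrite a single node; the other parts of the document are untouched."""
    conn.execute(
        "UPDATE content_node SET content = ? WHERE node_id = ?",
        (new_content, node_id),
    )


def read_document(conn, doc_id):
    """Reassemble the document by reading its nodes in position order."""
    rows = conn.execute(
        "SELECT content FROM content_node WHERE doc_id = ? ORDER BY position",
        (doc_id,),
    )
    return [row[0] for row in rows]
```

In this sketch an edit to one paragraph touches exactly one row, which is the property the data structure relies on.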
By breaking down a content document such as a word processing file into a number of smaller "chunks", "objects", or component parts of the document and storing these in an indexed datastructure such as a database, the document can be more easily, efficiently and quickly manipulated than a single large file. For example this allows the entire document to be viewed but only a single part of that document to be manipulated when a user is editing the document. In other words only the single document part, for example a paragraph, is re-written to the database. By contrast when a paragraph in a typical word processing file is edited, and the user asks the application to save the document, explicitly, by "autosave" or when closing the document, the entire document is saved to disk. The database datastructure of this aspect of the invention provides that each paragraph can be automatically and invisibly saved to the database at the moment the user navigates out of it.
An advantage of this is that even if the word processing application or hardware fails mid-way through editing a document, only the changes to the paragraph currently being edited will be lost, rather than all changes since the last save of the document. A further advantage is provided in concurrent access to a document by many users.
A database type data structure allows a single part of the document to be easily locked by a user wanting to edit this. This is achieved simply by locking the corresponding node in the database data structure, for example by setting a flag in a predetermined field of a relational database record or by the engine keeping track of locked nodes against user sessions in its memory. By contrast prior art methods require a complex mechanism for locking a part of an entire file, for example by recording particular page and line numbers in the document which are currently locked. This requires the use of special tables and checking protocols which adds complexity to the architecture and reduces speed.
Preferably the datastructure also comprises audit trail data for recording each change made to each part of the document. This might include information such as when changes were made, by whom, and what those changes were. This information can be stored in special fields of the corresponding document part record in a relational type database for example, or it could reference another object in an object oriented database or another type of datastructure containing this information.
This audit data allows an audit trail of changes to parts of a document to be generated, showing for example how a document has evolved over time. It can also conveniently be used to monitor multiple amendments over time, for example showing how a legal contract being negotiated has been amended over any desired period of time. This contrasts with features such as the Tracking facility on Word™ for example which only shows how any part (e.g. sentence or clause) of the document has been last amended. The Tracking facility on Word also relies on correct working practice for the user to enable this feature and control it correctly.
Auditing needs to be transparent, automatic, and reliable and preferably an inseparable part of the act of editing. In addition, because the prior versions are held within the same file as the current version of the document, it is easy to inadvertently transmit this information to a client or third party unless the document is explicitly "cleansed".
By contrast, aspects of the present invention provide that the audit information is stored "alongside" the document content, not within the same file. Hence sending the current document will not include any information contained within prior versions.
Furthermore, this provides an overview of the editing operations upon a document, ordered by time, by user or by other criteria and combinations thereof. It also provides a comparison of that part of the document affected by each editing operation as it was before and after the operation. It also provides the ability to reconstruct the document as it was at any arbitrary time during its life.
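One way to picture audit data held alongside the content is an append-only log of before and after values per part. The sketch below is an assumption-laden illustration: the class names, field names and the use of numeric timestamps are hypothetical and not taken from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditEntry:
    timestamp: float  # when the change was made
    user: str         # who made it
    node_id: str      # which part was changed
    before: str       # content before the operation
    after: str        # content after the operation


class AuditedDocument:
    def __init__(self, parts):
        # parts maps node_id -> initial content; insertion order is document order
        self._initial = dict(parts)
        self._current = dict(parts)
        self.audit_log = []  # kept separately from the content itself

    def edit(self, node_id, new_content, user, timestamp):
        """Record the change, then apply it to the single affected part."""
        entry = AuditEntry(timestamp, user, node_id, self._current[node_id], new_content)
        self.audit_log.append(entry)
        self._current[node_id] = new_content

    def current(self):
        """Current content only; no prior versions are included."""
        return list(self._current.values())

    def as_of(self, timestamp):
        """Reconstruct the document as it was at the given time."""
        state = dict(self._initial)
        for entry in sorted(self.audit_log, key=lambda e: e.timestamp):
            if entry.timestamp <= timestamp:
                state[entry.node_id] = entry.after
        return list(state.values())

    def history(self, user=None):
        """Overview of editing operations, optionally restricted to one user."""
        return [e for e in self.audit_log if user is None or e.user == user]
```

Because `current()` returns only the live content, sending the current document in this sketch cannot leak earlier versions, which stay in the separate log.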
In the case of a word processing file, a convenient unit for a node or part is a paragraph, although it could be other units such as a word, a clause or phrase, a sentence, a page or even a whole document. Other types of documents can also be divided up into suitable parts, for example cells in spreadsheet programs and display objects in presentation packages. The document part may be definable by the user or may be pre-configured.
For the purposes of this specification, the term database datastructure is intended to incorporate databases or similar data storage structures in which a number of discrete content objects, such as paragraphs for example, are each stored in a separate storage content "node" which is independently accessible. Preferably the objects are stored in a format that is transparent to a number of different applications (e.g. legacy applications such as Word™). Particular examples of database datastructures are relational databases and object oriented databases, although the term is not intended to be limited to such structures.
In general terms in another aspect the present invention provides a method of converting an application specific file comprising a document into a database data structure (DBD). The method comprises opening the file with the application in order to read the document; dividing the document into a number of parts; providing a database datastructure comprising a plurality of content storage nodes, each corresponding to a said part; and storing each said part in a different node.
This method allows a content document stored in a legacy format such as a Word™ .doc file to be opened and read, split into parts such as paragraphs, each part then being stored in a node of the DBD. This in turn allows easy access and manipulation to each part (e.g. paragraph) for faster editing or improved concurrent user access as outlined above. For a Word™ document, the Word application is used to open the legacy document, then access the individual paragraphs via Word's COM interface. A similar method can be used to access the individual cells in Excel or the objects in PowerPoint. However, for other applications, it may be more convenient to read the data directly from the legacy file without invoking the normally associated legacy application.
In general terms in another aspect the present invention provides a client for providing access by an application to a database datastructure comprising a plurality of parts of a content document, the application arranged to read said document from an application specific file; the client comprising: a proxy application specific file comprising control statements; means for causing said application to open said proxy file to create a proxy document within said application; wherein said control statements are arranged to be automatically executed by said application in order to load said content document parts from said database datastructure into the proxy document in order to provide said content document within said application.
Preferably the application includes an interface for providing partial control over its normal functions by the client. This allows for the control statements to simply trigger the client to take control of the application and document.
The client allows a legacy application such as Word™ to read and edit a document stored in a database datastructure. The client typically connects to the application's developer's interface, for example Word's COM interface. The client also "hooks" into a user's inputs such as keystrokes and cursor actions. The client provides the legacy application with the content document in a way which is completely transparent to the user of the application, but at the same time provides for improved access and edit write speed as well as facilitating improved concurrent user access to the document parts stored in the database. Each editing operation can be immediately saved in the database. In the event of a failure of Word, the operating system or the underlying hardware, these edit operations will not be lost.
In general terms in another aspect the present invention provides an engine for a database datastructure comprising a plurality of parts of a content document, the engine for providing access to said datastructure to one or more clients for interfacing with a user application; the engine comprising: means for providing a client specific cache for each said client, said cache being provided with at least some of said parts dependent on permissions attributed to the user of said client; means for communicating with said client in order to send and receive changes to said parts.
In the case of multiple client access to the same content document, the engine also comprises a lock manager function for locking a part which is being edited by a client. Preferably the lock manager function stores a current lock status of each part of said database datastructure, and receives lock request and release messages from said client and locks or releases a lock on a said part specified by a client.
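The lock manager function can be pictured as a small table of node identifiers against client sessions. This is a minimal sketch, assuming locks are tracked in the engine's memory; the class and method names are hypothetical.

```python
class LockManager:
    """Tracks which client session, if any, holds the lock on each part."""

    def __init__(self):
        self._locks = {}  # node_id -> session_id of the current holder

    def request_lock(self, session_id, node_id):
        """Handle a lock request; return (granted, holder)."""
        holder = self._locks.get(node_id)
        if holder is None or holder == session_id:
            self._locks[node_id] = session_id
            return True, session_id
        # Declined: report the holder so the client can tell the user who it is.
        return False, holder

    def release_lock(self, session_id, node_id):
        """Handle a release message; only the holder may release."""
        if self._locks.get(node_id) == session_id:
            del self._locks[node_id]
            return True
        return False

    def release_all(self, session_id):
        """Drop every lock held by a session, e.g. when a client disconnects."""
        for node_id in [n for n, s in self._locks.items() if s == session_id]:
            del self._locks[node_id]
```

Locking is per part, so two sessions can hold locks on different paragraphs of the same document at the same time.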
The engine may also be configured to allow edit audit information to be written to either the database datastructure containing the content document or to another datastructure.
Preferably the engine also provides the application with objects such as those shown in Figure 2a so that the application can display to the user in the normal way a list of available documents. Because this list is created from the objects computed by the engine, the engine can be configured to provide only certain folder, document (or even paragraph) objects to the application depending on the user's identity and permissions. By contrast a legacy application would show all the application specific folders, files and content, even those the user does not have read access to.
Preferably the communications means is arranged to communicate with said client over a network.
Preferably the client comprises a mirror cache for mirroring a corresponding client specific cache provided by the engine which is provided with at least some of said folders, documents and document content parts.
Preferably changes to the client specific engine cache are reflected in all client mirror caches in a timely fashion as they happen.
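The relationship between the engine's client specific caches and the client mirror caches might be sketched as below. This is an in-process illustration only: the network transport is omitted, and the class names, the `connect` and `write` methods and the per-user permission sets are hypothetical.

```python
class MirrorCache:
    """Client-side copy of the parts the engine has chosen to expose."""

    def __init__(self):
        self.parts = {}

    def apply_change(self, node_id, content):
        self.parts[node_id] = content


class Engine:
    def __init__(self, parts):
        self._parts = dict(parts)  # node_id -> content, stands in for the DBD
        self._clients = {}         # user -> (permitted node ids, mirror cache)

    def connect(self, user, permitted):
        """Build a mirror holding only the parts this user is permitted to see."""
        allowed = set(permitted)
        mirror = MirrorCache()
        for node_id in allowed:
            if node_id in self._parts:
                mirror.apply_change(node_id, self._parts[node_id])
        self._clients[user] = (allowed, mirror)
        return mirror

    def write(self, node_id, content):
        """Store a change and push it to every mirror permitted to see that part."""
        self._parts[node_id] = content
        for allowed, mirror in self._clients.values():
            if node_id in allowed:
                mirror.apply_change(node_id, content)
```

A real deployment would send these changes as messages over the network rather than calling the mirror directly.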
In a further aspect the present invention provides a system for providing concurrent access to a content document, the system comprising: a database datastructure for storing the document as a plurality of parts, each stored independently; a network; one or more clients as defined above which are coupled to said network; and an engine as defined above which is coupled to the network.
The present invention also provides methods corresponding to the above aspects, as well as computer program products corresponding to these methods. Further aspects and advantages will become apparent from the following description of embodiments.
Brief Description of the Drawings
Embodiments of the present invention will now be described in detail with reference to the following drawings, by way of example only and without intending to be limiting, in which:
Figure 1a is a schematic diagram showing an architecture for user access to a document through the Word™ legacy application;
Figure 1b is a schematic diagram showing an architecture according to an embodiment of the present invention for user access to a document through a client interface for the Word™ legacy application;
Figure 2a is an object diagram showing some typical objects representing a file system and document contents in accordance with an embodiment of the present invention;
Figure 2b is a class diagram showing the classes which represent the file system and document contents in a database in accordance with an embodiment of the present invention;
Figure 3 is a schematic diagram showing a method of converting a document from an application specific file to a database datastructure in accordance with an embodiment of the present invention;
Figure 4 is a schematic diagram showing an interface architecture for interfacing a database datastructure with a legacy application such as Word™;
Figures 5a and 5b are flowcharts showing a method of reading and editing a document stored in a database datastructure using a legacy word processing application;
Figure 6 shows an architecture according to another embodiment which provides concurrent access to a content document stored in a database datastructure over a network;
Figure 7 is a flowchart of a method for reading and editing documents in the architecture of Figure 6; and
Figures 8a and 8b show respectively file system objects stored in the database datastructure and those visible to a user after user permissions are applied.
Detailed Description
Embodiments of the invention are described with respect to the Word™ word processing application; however, with appropriate modifications, they could also be applied to other legacy word processing packages such as WordPerfect™ as well as other types of legacy applications, for example the Excel™ and Lotus™ spreadsheet programs and the PowerPoint™ presentations package.
Figure 1a is a schematic diagram showing the relevant prior art system architecture for editing a .doc file comprising a content document such as a patent specification for example. The Word™ application 1 reads the (.doc) document file 2 into local working memory (RAM) as a Word document 3. A user 4 then interacts with the application 1 to read and edit the Word document 3. When the user 4 saves the document, or "autosave" executes, or the document is closed, the application saves any changes made in the Word document 3 to the document file 2. The file 2 is an application specific file. Similarly, the way in which the application represents and manipulates the document 3 in working memory is proprietary and secret.
The Word application 1 does however provide the document content and the ability to control certain application functions and user interactions with the application 1 through a developer's interface known as the Component Object Model (COM) 5.
This interface 5 allows non-proprietary applications to control certain word processing functions and file and user interactions with the application 1. The COM interface 5 presents the content of the document 3 to external applications in a known and predetermined way, typically as a series of paragraphs.
A difficulty with this legacy approach is that the document is still stored as a single file datastructure 2. This means that the entire file 2 needs to be re-written to disk even if only a single word has been changed in a large document. A further difficulty with this approach is that it inhibits concurrent user access, since the translation between the internal structure 3 and the external file representation 2 has to happen "wholesale", and therefore the file 2 must be locked. This means that other users cannot access file 2 except to read (only) the document 3 whilst it is being edited by the user having edit access to the "locked" file.
The system architecture of an embodiment of the present invention is shown in Figure 1b, and shows two users having concurrent access to a common "document". The content document is stored in a database datastructure (DBD) 20 in accordance with an embodiment of the present invention. The DBD stores a number of parts of the document, for example paragraphs, wherein each document part is stored in a separate node of the DBD.
In the architecture of Figure 1b, two users 24 and 34 are each using a Word application 21 and 31 respectively in order to edit the same document "file", which is stored in the DBD 20. The document 23 or 33 which each user 24 or 34 is working on is supplied to the application 21 or 31 through its COM interface 25 or 35. Each user has an associated client 27 or 37 which is coupled to a respective COM interface 25 or 35 as well as to the common DBD 20. The connections between these entities may be by any means, for example an Internet connection.
The DBD 20 has an associated engine which allows each user 24 or 34 to request a lock on any part of a document in order to edit this. The other user then has read only access to that part, but can request a lock on any other part of the document. In this way, only a part or parts of a document are locked by any one user, leaving the remaining parts to be edited by other users. The mechanism by which the content document is supplied to each application 21 and 31 as well as the mechanism for editing parts of this are described in more detail below. It is preferably intended that these mechanisms should be transparent to the users 24 and 34 of the applications 21 and 31.
Figure 2a shows a content document stored in a database datastructure (DBD) in accordance with an embodiment of the present invention. The document is divided up into parts such as paragraphs, pictures, tables and text fragments; however, nodes may be defined in other ways such as by sentence, page, or even the entire document. Other types of document may be divided up in an analogous manner into parts, for example a spreadsheet document may be divided into cells and a presentation document may be divided up into slides.
Each object in the DBD may also contain or have associated with it audit information relating to changes made to the object, what these changes were, who made them, and when, in order to allow an audit trail of the object to be generated. Typically in legacy word processing applications, only the last amendment to a particular piece of text is available, for example as in the case of the Tracking Function in Word™.
Because the document is stored in a database datastructure, access to the individual paragraphs of the document is much improved over a standard file type datastructure such as that shown in Figure 1a. This is because each paragraph (part) is accessible independently of the other parts. This provides various advantages including allowing a particular paragraph to be easily locked, for example merely by assigning a flag field in the DBD. Further information about particular paragraphs can also be included in the DBD merely by adding extra information, for example the audit information.
Various other database types as are known in the art could alternatively be used for storage and independent access to the document parts; for example an object oriented database.
Figure 8a shows an object diagram according to an embodiment of the present invention showing some typical objects representing a file system which might be stored in the DBD 20. The engine applies the user's permissions to these objects, producing a filtered view of the folders and documents to which the user has access, as shown in Figure 8b. By contrast, Explorer/Word™ display to the user all available files, irrespective of whether the user has permission even to read them. In a similar fashion, a filtered view of the document's contents can be presented to the user.
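The filtering step can be illustrated with a small recursive function over a tree of folder and document objects. The `Node` class, its `readers` field and the rule that a hidden folder hides everything beneath it are assumptions made for this sketch, not details given in the specification.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    readers: frozenset  # users permitted to read this object (hypothetical field)
    children: list = field(default_factory=list)


def filtered_view(node, user):
    """Return a copy of the tree containing only objects the user may read."""
    if user not in node.readers:
        return None  # the object and everything beneath it stays hidden
    visible = []
    for child in node.children:
        view = filtered_view(child, user)
        if view is not None:
            visible.append(view)
    return Node(node.name, node.readers, visible)


def names(node):
    """Flatten a tree into a list of names, parents before children."""
    result = [node.name]
    for child in node.children:
        result.extend(names(child))
    return result
```

The same function could be applied one level further down, with paragraphs as leaf objects, to give a filtered view of a document's contents.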
Figure 2b shows a class diagram of an embodiment of the present invention showing classes which represent the file system and document contents in the database datastructure.
Figure 3 shows a conversion method for converting a document stored as an application specific file 2 such as a Word™ .doc file into a document stored in a database datastructure (DBD) 20. Referring also to Figure 1, a file 2 is read or opened by the application 1, and the content is provided to an external application via the COM interface 5, 15, or 25. In this case the external application may be one of the clients 27 or 37, or a specific conversion application. The relevant external application converts the contents of the document 3, 13, or 23 into parts suitable for storage in the DBD 20. Conveniently, Word's COM interface provides the contents of the document as separate paragraphs. However as other part types are possible, any conversion from parts supplied by the application to other part configurations is provided by the external application or client. The document parts are then written to separate nodes within the DBD 20 for later retrieval by clients or other applications.
Whilst this process has been described with respect to Word™, a conversion application could be provided to interface with the developer's interface on other proprietary applications. In addition, a conversion application could be provided which directly reads the content of a file where the formatting information for this file is not secret. For example the file might be a .txt file, the content being read directly into the conversion application and converted into suitable parts for storage in the DBD 20.
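For the plain text case, the conversion might look like the sketch below. Treating a blank line as the paragraph boundary and building node identifiers from the document name and part index are both assumptions made for illustration; the function names are hypothetical.

```python
import re


def split_into_parts(text):
    """Split plain text into paragraph parts, assuming blank lines separate paragraphs."""
    blocks = re.split(r"\n\s*\n", text.strip())
    # Collapse line wrapping inside each paragraph and drop empty blocks.
    return [" ".join(block.split()) for block in blocks if block.strip()]


def convert_to_nodes(text, doc_id):
    """Map each part to its own node, keyed by a hypothetical 'doc_id:index' identifier."""
    return {f"{doc_id}:{i}": part for i, part in enumerate(split_into_parts(text))}
```

A converter for a different part type, such as sentences, would only need a different `split_into_parts`.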
Figure 4 shows the architecture and operation of an application client in more detail. A DBD 40 has an associated Engine 40A which is shown separately. The Engine 40A communicates with a client application 47 which in turn interfaces with the Word™ application 41. The client interface also provides an application specific (.doc) proxy file 46, also known as a stub file. This stub file is written on disk or in the user's file system and contains control statements such as macros, but preferably no content. Through the COM interface 45, the client application 47 is able to add content to and retrieve content from the document 43 held by the Word application 41 in working memory.
The client application 47 intercepts the user's keyboard and mouse actions by a process known as "hooking". The client application is configured to filter these actions, thereby preventing some of them from reaching the application 41. The client determines whether each action will result in a modification to the document content. If so, it checks whether the user is permitted to modify that part of the document and blocks the action if the user is not. If the user is allowed to modify that part of the document, the client application requests a lock on it from the engine. If the lock is denied by the engine, the action is blocked.
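The filtering decision described above can be sketched as a pure function, leaving the operating system hook itself out of scope. The key names, the `Decision` values and the two callbacks (`can_edit`, `request_lock`) are hypothetical stand-ins for the client's permission check and its lock request to the engine.

```python
from enum import Enum


class Decision(Enum):
    PASS_THROUGH = "pass through"          # action does not modify content
    BLOCK_NO_PERMISSION = "no permission"  # user may not modify this part
    BLOCK_LOCKED = "locked"                # engine denied the lock
    ALLOW_EDIT = "allow edit"              # lock granted, edit may proceed


# Assumed set of keys that move the cursor without changing content.
NAVIGATION_KEYS = {"Up", "Down", "Left", "Right", "PageUp", "PageDown", "Home", "End"}


def modifies_content(key):
    return key not in NAVIGATION_KEYS


def filter_action(key, node_id, user, can_edit, request_lock):
    """Decide whether an intercepted key press may reach the application."""
    if not modifies_content(key):
        return Decision.PASS_THROUGH
    if not can_edit(user, node_id):
        return Decision.BLOCK_NO_PERMISSION
    if not request_lock(user, node_id):
        return Decision.BLOCK_LOCKED
    return Decision.ALLOW_EDIT
```

The permission check runs before the lock request, so a user without edit rights never causes a lock to be taken.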
User actions made via Word's commands, e.g. cut and paste, change style, etc. are intercepted by macros contained in the stub file 46. Each macro contacts the client application 47 and filters the action in the same way as described in the previous paragraph.
Figure 5a shows a flow chart for opening a "document" in a Word™ application where the document is stored in the DBD 40 as opposed to a standard .doc Word™ file. The user 44 first logs onto the Engine 40A. The Engine 40A returns folder and document objects which the user 44 is permitted to access. The application client 47 then creates corresponding folder and (macro only) document stub files on the user's file system which the Explorer/Word application sees and displays to the user 44. When the user 44 opens a stub file 46 in Word, a corresponding document 43 is created by Word in memory. This automatically runs the embedded macros which request the client interface to populate the Word™ document 43 with the contents of the appropriate document stored in the DBD 40. The Engine 40A may even return objects corresponding to only a section of a document, for example only one page of a confidential document. The client interface 47 then retrieves the permitted paragraphs or other parts of the document and drives the Word™ application 41 to add this content to the Word™ document 43. Word™ 41 then displays the document contents to the user 44.
If the user 44 attempts to modify some document content, the client 47 intercepts these inputs and checks with the Engine 40A as to whether the user has edit permission. If so, the client 47 drives the application 41 to implement the intercepted user commands. If not, the client 47 generates a suitable message which it drives the application 41 to display to the user 44. When the user 44 "moves" out of the edited paragraph using arrow or mouse commands for example, the modified content is written to the DBD 40 in the appropriate node(s). Similarly, certain edits may be invoked by user invoked macros (such as cut and paste) in the document 43, in which case these are intercepted and allowed or declined depending on the permission information provided by the Engine 40A.
A concurrent access implementation of this is shown in Figure 5b in which, when the user attempts to modify content, the client 47 requests a "lock" on that content from the Engine 40A. If the content is already locked by another user editing any part of the same content, then the Engine 40A returns a decline message and the client 47 drives the application 41 to display a message to the user 44, perhaps indicating which other user is currently working on that content. If the content is not already locked and the user has permission to edit it, then the Engine 40A locks the paragraph(s) and signals the client 47 to allow the attempted edits.
This process is normally transparent to the user who just sees a content document 43 which looks the same as if it had been derived from a file specific to the application 41.
Whilst the DBD 20, 40 has been described with respect to use with a legacy application 41 such as Word and a specific client interface 47 for that application 41, the DBD could also be used with an application specifically tailored to interfacing with it.
A schematic diagram of the architecture of another embodiment in accordance with the present invention is shown in Figure 6. This embodiment allows concurrent user access to a document stored in a DBD over a network.
The system comprises an engine 50A, a number of clients 70, 80, and a network 60 which connects the clients 70, 80 to the engine 50A. The engine 50A is associated with a central database datastructure 50. The network 60 may be a single machine, a local area network, a company intranet, or the internet for example. Clients 70, 80 communicate with the engine 50A across the network 60 using any suitable communications protocol such as CORBA for example.
The engine comprises, for each client: a client connection function 53 or 57; a client specific cache 52 or 56; and a permissions filter function 51 or 55. In addition the engine comprises a concurrent access or lock manager function 54. The lock manager function 54 may be for example a separate table indicating which nodes of the DBD 50 are locked, or it may be a specific flag within the DBD itself, etc. The client connection functions 53 or 57 communicate with a corresponding engine connection function 71 or 81 within a client (client 70 or client 80). Each client also comprises a mirror cache 72 or 82 which mirrors the corresponding engine cache 52 or 56 respectively. The engine cache 52 or 56 and client cache 72 or 82 respectively are kept mirrored by appropriate periodic or aperiodic signalling over the network using the connection functions 53 and 71 or 57 and 81. Each client provides an interface with a legacy application 74 or 84 such as Word. Each client writes proxy or stub files 73 or 83 which the application 74 or 84 uses to open a document 75 or 85 in a similar manner to that described with respect to Figures 4 and 5.
Each client may be associated with a permissions filter function 51 or 55 which restricts the client's access to parts of the DBD 50. These restrictions may take the form of limiting client access to particular "documents" or collections of documents within the DBD 50, or even to particular content within a "document" within the DBD 50. The permissions function 51 or 55 provides the client with objects which the client has permission for, such that the client 70 or 80 can drive the legacy application 74 or 84 to display available documents to the user 76 or 86 for selection.
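One way to picture the permissions filter is as a pruning pass over the folder, document and content hierarchy. The sketch below is illustrative only and assumes a simple tree in which a node is hidden, together with everything beneath it, whenever the user lacks permission for it. The function name `visible_objects`, the `"root"` node id and the dictionary layout are assumptions made for this example, not details taken from the embodiment.

```python
def visible_objects(tree, permissions, user):
    """Return the subset of the tree the user may see, as nested dicts.

    tree: dict mapping a node id to a list of child node ids.
    permissions: dict mapping a user to the set of node ids they may access.
    A child is kept only if it is permitted; its own children are then
    filtered in the same way.
    """
    allowed = permissions.get(user, set())

    def walk(node):
        return {child: walk(child) for child in tree.get(node, []) if child in allowed}

    return walk("root")
```

With this shape the same filter can restrict access to a folder, to a whole document, or to individual parts within a document, because all three are simply nodes in the tree.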
When a user of a particular client attempts to modify content in the document in memory 75 or 85, the client automatically sends a "lock request" message for that content to the engine via the connection link 71/53 or 57/81. The engine 50A checks the concurrent access function 54 to determine whether this content is already locked by another client and also determines from the permissions function 51 or 55 whether the requesting client has permission to edit the content. If either the appropriate permission is not available or the requested content is already locked by another user/client, an appropriate message is forwarded to the requesting client which can then display this to the user trying to edit the content.
If permission is available, and the content is not already locked, the engine locks the content in the DBD 50 and sends a "lock allowed" message to the client which then allows the user to edit the content. Once the user has finished editing the content, for example indicated by the cursor moving out of the content, the client sends a message to the engine which updates the appropriate DBD objects with amendments to the content and unlocks the content so that other clients can now access it for editing.
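The lock request, edit and release cycle on the engine side can be sketched as below. This is a simplified, single-process illustration: the `Engine` class, its method names and the wording of the denial messages are hypothetical, the lock manager is modelled as a separate table (one of the options mentioned for function 54), and the network messaging is left out.

```python
class Engine:
    """Toy engine: node store, per-user edit permissions, and a lock table."""

    def __init__(self, nodes, edit_permissions):
        self.nodes = dict(nodes)                   # node id -> content
        self.edit_permissions = edit_permissions   # user -> set of node ids
        self.locks = {}                            # node id -> user holding the lock

    def request_lock(self, user, node_id):
        """Answer a "lock request" message with an allow or deny message."""
        if node_id not in self.edit_permissions.get(user, set()):
            return "denied: no permission"
        holder = self.locks.get(node_id)
        if holder is not None and holder != user:
            return f"denied: locked by {holder}"
        self.locks[node_id] = user
        return "lock allowed"

    def commit_edit(self, user, node_id, new_content):
        """Store the amended content and release the lock."""
        if self.locks.get(node_id) != user:
            raise PermissionError("edit submitted without holding the lock")
        self.nodes[node_id] = new_content
        del self.locks[node_id]
```

Because the lock is released in the same step that stores the amendment, another client can request the same part as soon as the first user has moved out of it.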
Edits are not necessarily restricted to a single paragraph, and may include for example more general edits such as reformatting several paragraphs or replacing a number of whole or partial paragraphs with a different set of paragraphs.
Edits made by user 76 are stored in the appropriate DBD objects in persistent storage 50 and in cache 52 and in mirror cache 72. The edits are propagated to the other client caches 56. The client application 81 periodically or aperiodically retrieves these changes from the cache 56 and applies them to the mirror cache 82 and the document in memory, 85, such that user 86 can see changes made by user 76 shortly after they are made. Note that there may be any number of further users and clients viewing the same content.
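The propagation of an edit from one client to the others might be sketched as a per-client queue of pending changes that each client drains when it next synchronises. This is an assumption about one possible mechanism, not a description of the signalling actually used: the class names `EngineCaches` and `Client`, the queue structure and the client ids are all hypothetical.

```python
class EngineCaches:
    """Per-client engine caches plus a queue of changes each client has not yet pulled."""

    def __init__(self, client_ids):
        self.caches = {cid: {} for cid in client_ids}
        self.pending = {cid: [] for cid in client_ids}

    def apply_edit(self, source_client, node_id, content):
        # Update every engine-side cache; queue the change for the other clients.
        for cid, cache in self.caches.items():
            cache[node_id] = content
            if cid != source_client:
                self.pending[cid].append((node_id, content))

    def pull_changes(self, client_id):
        """Hand over and clear the changes queued for one client."""
        changes, self.pending[client_id] = self.pending[client_id], []
        return changes


class Client:
    def __init__(self, client_id, engine):
        self.client_id = client_id
        self.engine = engine
        self.mirror = {}     # mirror cache
        self.document = {}   # stands in for the document open in the application

    def sync(self):
        """Apply queued changes to the mirror cache and the open document."""
        for node_id, content in self.engine.pull_changes(self.client_id):
            self.mirror[node_id] = content
            self.document[node_id] = content
```

Calling `sync` on a timer gives the periodic case, while calling it in response to a notification from the engine gives the aperiodic case.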
Because only part of a document (for example one paragraph or a section of a document containing several paragraphs) is locked and edited by any one client/user at any one time, the changes made to that paragraph by the user can be written to the DBD 50 and distributed to other clients relatively quickly compared to prior art concurrent access systems.
A method of operating the architecture of Figure 6 is shown in Figure 7. A user first connects to the engine over the network using the connection functions 71 or 81. In doing so the client forwards a user identifier to the engine which associates that particular user with predetermined permissions to the nodes in the DBD 20.
The engine creates a client specific cache 52 and generates one or more folder and/or document objects depending on the user's permissions. An embodiment may load all accessible node objects, including document contents, into the cache at this point. Another embodiment may load the objects layer by layer in response to user actions. For example, a folder's child folders and documents may not be loaded until the user opens that folder. A document's content may not be loaded until the user opens that document in Word.
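The layer-by-layer loading option can be illustrated with a cache that fetches a node's children only the first time that node is opened. The `LazyCache` class and the `fetch_children` callable are hypothetical names for this sketch, and the `fetch_count` attribute exists only to make the behaviour observable.

```python
class LazyCache:
    """Client-specific cache that loads a node's children only on first open."""

    def __init__(self, fetch_children):
        self.fetch_children = fetch_children   # callable: node id -> list of child ids
        self.children = {}                     # node id -> loaded child ids
        self.fetch_count = 0                   # number of fetches from the datastructure

    def open(self, node_id):
        if node_id not in self.children:
            self.children[node_id] = self.fetch_children(node_id)
            self.fetch_count += 1
        return self.children[node_id]
```

The eager alternative would walk the whole permitted tree at connection time, trading a slower login for faster opening of folders and documents later.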
The client creates a mirror cache 72 on the user machine which mirrors the engine cache 52 provided by the engine. The client also creates application specific proxy files such as .doc stub files 73 for Word.
When a user opens a stub file 73, this is opened by Word 74 and macros in the document 75 instruct the client 70 to add content to the document. The client requests the content of the document opened by the user from the engine 50A which then populates the client's cache 52. These are mirrored in the mirror cache 72 and written to the document through Word's COM interface. Alternatively this content may already be available in the mirror cache 72 in which case no instruction to the engine is necessary.
Embodiments of the present invention also provide an audit trail facility which shows changes made to the folders, documents and document content over their life. The audit information stored includes what the changes were, who made them and when. When the engine 50A records amendments made to a folder(s), document(s) or a part(s) of a document in the corresponding DBD node(s), the corresponding audit information is also stored. Preferably this audit information is stored in the same database as its related node. The linked audit information allows all previous edits to be seen and previous versions of the file system and/or document recovered. Preferably the audit mechanism is an integral part of the engine and the information is stored in the same database. The audit information is written within the same database transaction as the edit and is linked to the information that was affected so that the audit information for a particular part may be readily and directly retrieved. It is also possible to store the audit information in a separate database.
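Writing the audit record in the same transaction as the edit can be sketched with an in-memory SQLite database. The table layout, column names and the functions `open_store`, `record_edit` and `history` are assumptions chosen for illustration; the description does not specify a schema or a particular database product.

```python
import sqlite3


def open_store():
    """Create an in-memory database holding nodes and their audit rows."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE node (id TEXT PRIMARY KEY, content TEXT)")
    db.execute(
        "CREATE TABLE audit ("
        " node_id TEXT REFERENCES node(id), user TEXT,"
        " changed_at TEXT, old_content TEXT, new_content TEXT)"
    )
    return db


def record_edit(db, node_id, user, new_content, changed_at):
    """Update a node and write its audit row in one transaction."""
    with db:  # commits on success, rolls back if an exception is raised
        row = db.execute("SELECT content FROM node WHERE id = ?", (node_id,)).fetchone()
        if row is None:
            raise KeyError(node_id)
        db.execute("UPDATE node SET content = ? WHERE id = ?", (new_content, node_id))
        db.execute(
            "INSERT INTO audit VALUES (?, ?, ?, ?, ?)",
            (node_id, user, changed_at, row[0], new_content),
        )


def history(db, node_id):
    """Return (user, changed_at, old, new) rows for a node, oldest first."""
    return db.execute(
        "SELECT user, changed_at, old_content, new_content"
        " FROM audit WHERE node_id = ? ORDER BY rowid",
        (node_id,),
    ).fetchall()
```

Keeping the old and new content in each audit row is what would allow a previous version of a part to be recovered by replaying or reversing the rows returned by `history`.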
A client (or a specific application) can be configured to read the audit information for a particular node to generate a log of the amendments made to the associated node and optionally its descendants; what these were, who made them and when. This might be particularly useful in the negotiation of legal documents where numerous amendments are made to clauses within the document over time. It is typically difficult to keep track of this, as legacy applications such as Word rely on correct user work practices and are not automatic, transparent, nor intuitive like the present embodiment.
Embodiments of the invention may be implemented on a programmable processing apparatus such as a computer, or a processor. Methods in accordance with embodiments of the invention may be embodied as computer readable code carried on a carrier medium which could be a transient carrier medium such as a signal or a storage carrier medium such as memory. These methods may also be embodied as a specifically configured apparatus such as an ASIC or a programmable logic gate array.
Embodiments of the invention may also be implemented as manufacturing instructions to process such a specifically configured apparatus. For example this may be a hardware description language code such as a Verilog code which when run on an appropriate manufacturing process specifically configures a configurable apparatus such as an ASIC or programmable logic gate array to implement methods in accordance with embodiments of the invention. Implementation may also take the form of code which dynamically configures re-configurable apparatus such as re-programmable logic gate arrays to implement these methods described above and covered by the scope of the appended claims.
The invention has been described with respect to numerous embodiments thereof. Alterations and modifications as would be obvious to those skilled in the art are intended to be incorporated within the scope hereof.

Claims (1)

1. A client for providing access by an application to a database datastructure comprising a plurality of parts of a content document in corresponding nodes, the application arranged to read said document from an application specific file; the client comprising: a proxy application specific file comprising control statements; means for causing said application to open said proxy file to create a proxy document within said application; wherein said control statements are arranged to be automatically executed by said application in order to load said content document parts from said database datastructure into the proxy document in order to provide said content document within said application.
2. An interface according to claim 1 further comprising means for requesting a lock on a said part of said content document in said database datastructure.
3. A client according to claim 1 or 2 further comprising means for determining user modifications to a said part of said content document from within said application, and means for updating the corresponding node in the database datastructure.
4. A client according to claim 3 wherein said updates are made automatically upon said user navigating out of said part from within said application.
5. A client according to claim 3 or 4 when dependent on claim 2 wherein said lock requesting means automatically requests a lock when said user modifications determination means determines that said user is attempting to modify said content document from within said application.
6. A client according to any preceding claim further comprising means for storing details of each said user content modification to said document part in order to create a content document modifications audit trail.
7. A client according to any one preceding claim further comprising means for providing document objects to said application for display and selection, said objects corresponding to one or more content documents or parts of said documents stored in said database datastructure.
8. A client according to claim 8 wherein a sub-set of said objects is provided depending on identity and permissions parameters of the user.
9. A client according to any one preceding claim wherein said application is Word.
10. An engine for a database datastructure comprising a plurality of parts of a content document, the engine for providing access to said datastructure to one or more clients for interfacing with a user application; the engine comprising: means for providing a client specific cache for each said client, said cache being provided with at least some of said parts dependent on a user identity and permissions parameter; and means for communicating with said client in order to send modifications of said parts in said datastructure to said client and to receive from said client modifications of said parts made by said user.
11. An engine according to claim 10 further comprising a lock manager function for locking a part of said content document which is being modified by a client.
12. An engine according to claim 11 wherein said lock manager is arranged to store a current lock status of each part of said database datastructure, and to receive lock request and release messages from said client, and to lock and release a lock on a said part specified by a client.
13. A system for providing concurrent access to a content document; the system comprising: a database datastructure for storing the document as a plurality of parts, each stored in a separate node; a network; one or more clients according to any one of claims 1 to 9 which are coupled to said network; and an engine according to any one of claims 10 to 12 which is coupled to the network.
14. A system according to claim 13 wherein a said client further comprises a mirror cache which is arranged such that the contents of said mirror cache and the contents of a corresponding said client specific cache at the engine are mirrored.
17. Apparatus for transferring a content document stored as an application specific file into a database datastructure comprising a plurality of content storage nodes; the apparatus comprising: means for opening said file in said application in order to access said document; means for dividing said document into a plurality of parts; means for storing each said part in a separate node in said database datastructure.
GB0225972A 2002-11-07 2002-11-07 Providing access to component parts of a document Withdrawn GB2395300A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB0225972A GB2395300A (en) 2002-11-07 2002-11-07 Providing access to component parts of a document

Publications (2)

Publication Number Publication Date
GB0225972D0 GB0225972D0 (en) 2002-12-11
GB2395300A true GB2395300A (en) 2004-05-19

Family

ID=9947378

Family Applications (1)

Application Number Title Priority Date Filing Date
GB0225972A Withdrawn GB2395300A (en) 2002-11-07 2002-11-07 Providing access to component parts of a document

Country Status (1)

Country Link
GB (1) GB2395300A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7818663B2 (en) 2003-12-23 2010-10-19 Onedoc Limited Editable information management system and method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0319232A2 (en) * 1987-12-02 1989-06-07 Xerox Corporation A multi-user data communication system
EP0473960A2 (en) * 1990-09-07 1992-03-11 Xerox Corporation Hierarchical shared books with database
EP0924629A2 (en) * 1997-12-22 1999-06-23 Adobe Systems Incorporated Virtual navigation
US20010042075A1 (en) * 1997-02-14 2001-11-15 Masahiro Tabuchi Document sharing management method for a distributed system
WO2003019414A1 (en) * 2001-08-22 2003-03-06 France Telecom Computer system and document management method

Also Published As

Publication number Publication date
GB0225972D0 (en) 2002-12-11

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)