GB2357348A - Using an abstract messaging interface and associated parsers to access standard document object models - Google Patents


Info

Publication number
GB2357348A
Authority
GB
United Kingdom
Prior art keywords
document
parsers
format
received
object model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB9929936A
Other versions
GB9929936D0 (en
Inventor
John Bryan Ibbotson
Stephen James Paul Todd
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Priority to GB9929936A
Publication of GB9929936D0
Publication of GB2357348A
Legal status: Withdrawn

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/12Use of codes for handling textual entities
    • G06F40/151Transformation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/12Use of codes for handling textual entities
    • G06F40/14Tree-structured documents
    • G06F40/143Markup, e.g. Standard Generalized Markup Language [SGML] or Document Type Definition [DTD]

Abstract

A data processing apparatus has: a memory unit 402; a system software unit 401 including a plurality of parsers 401A, 401B, for implementing a Document Object Model (DOM) application programming interface, including a unit for receiving documents from an originating application 11, 21 and a unit for processing received documents using one of the parsers in order to convert each received document into an object model; and a system software unit for storing each object model into the memory unit; where the unit for processing sends a received document to a particular one of the plurality of parsers depending on the format of the received document. This allows the DOM to accept documents in formats other than XML/HTML, e.g., in COBOL.

Description

USING AN ABSTRACT MESSAGING INTERFACE AND ASSOCIATED PARSERS TO ACCESS STANDARD DOCUMENT OBJECT MODELS
Field of the Invention
The present invention relates to the field of data processing, and more particularly to the art of computer software programming making use of a Document Object Model application programming interface.
Background of the Invention
The Document Object Model (DOM) is a platform-neutral and language-neutral application programming interface (API) that will allow programs and scripts to dynamically access and update the content, structure and style of HTML and XML documents. The World Wide Web Consortium (W3C) has developed published specifications for the DOM and an important objective of the W3C is to provide a standard programming interface that can be used in a wide variety of environments and applications. See, e.g., "What is the Document Object Model?" by J. Robie of Texcel Research, pp. 1-4, printed from the World Wide Web on November 4, 1999 at http://www.w3.org/TR/REC-DOM-Level-1/introduction.html; such document is herein incorporated by reference.
Fig. 1 is an example of the use of prior art DOM technology. A first data processing machine 1 (e.g., a personal computer running a Web browser) is running a program 11 and a second data processing machine 2 is running a script 21. A third data processing machine 4 is running the DOM software 41. The machines 1 and 2 are communicating with the machine 4 via the Internet 3 using normal World Wide Web technology (i.e., TCP/IP and HTTP). The DOM 41 receives XML documents from the program 11 and script 21 and converts each such XML document into a Java object model, which is a group of objects, which is then stored in memory 42 at the data processing machine 4 where the DOM software 41 is running. This Java object model is, in effect, a tree 421 of nodes containing the data and structure which was contained in the XML document that was originally received by the DOM. The XML documents which arrive at the DOM 41 are parsed by an XML parser 411 in order to convert them into respective Java object models which are then stored in memory 42. Once the Java object models are stored in memory 42, the program 11 and script 21 (or any other program or script that can access the DOM 41) use the DOM 41's API to access and modify the Java object models of the XML documents that have been previously received and stored in memory 42.
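As an illustration only, the prior art flow described above (an XML document parsed into an in-memory tree of nodes, then read and modified through the DOM API) can be sketched with Python's standard xml.dom.minidom module. The description itself speaks of Java object models; Python is used here purely for brevity, and the sample document and element names are invented for the example.

```python
from xml.dom.minidom import parseString

# An XML document as it might arrive from a program or script (invented sample).
xml_text = "<order><item qty='2'>widget</item></order>"

# The XML parser converts the document into a tree of nodes held in memory.
document = parseString(xml_text)

# The DOM API is then used to read and modify the stored tree.
item = document.getElementsByTagName("item")[0]
name = item.firstChild.data        # text content of <item>
item.setAttribute("qty", "3")      # update an attribute in place
updated = document.documentElement.toxml()
```

Note that the parser in this sketch accepts only XML input, which is the limitation the following paragraphs set out to address.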
While the DOM allows programs and scripts to dynamically access and update the content, structure and style of documents, the documents which the DOM will accept are quite limited in that they must be in XML (or in HTML). This has greatly limited the use of the DOM and it would be very advantageous for programs or scripts to be able to provide non-XML/HTML documents to a DOM. This would allow the DOM to be much more versatile.
However, according to the present state of the art, there has been no solution to this problem provided in the marketplace.
Summary of the Invention
According to a first aspect, the present invention provides a data processing apparatus having: a memory unit; a system software unit, including a plurality of parsers, for implementing a Document Object Model application programming interface, including a unit for receiving documents from an originating application and a unit for processing received documents using one of the plurality of parsers in order to convert each received document into an object model; and a system software unit for storing each object model into the memory unit; wherein the unit for processing sends a received document to a particular one of the plurality of parsers depending on a format of the received document.
According to a second aspect, the present invention provides a data processing method having steps of: receiving a document from an originating application via a Document Object Model application programming interface which includes a plurality of parsers; determining a format of the received document; selecting one of the plurality of parsers depending on the results of the determining step; converting the received document into an object model using the parser selected at the selecting step; and storing the object model into a memory unit.
According to a third aspect, the present invention provides a computer program product stored on a computer readable storage medium for, when run on a computer, instructing the computer to carry out the method steps of the second aspect.
According to a fourth aspect, the present invention provides a data processing apparatus having functional processing components for carrying out the respective steps of the method of the second aspect.
Thus, with the present invention, the usefulness of the DOM is greatly improved since documents of different formats can be accepted.
The DOM is no longer limited to dealing with only XML/HTML format documents. Preferably, through the use of an XSL processor placed after the memory unit, non-XML/HTML documents which were originally received by the DOM and stored in the memory unit can be transformed into an output form for display using a web browser.
Brief Description of the Drawings
The present invention will be better understood upon reading the below described detailed description of the preferred embodiments thereof, which will be presented with reference to the following drawings:
Fig. 1 is a block diagram showing three data processing machines running software according to the prior art;
Fig. 2 is a block diagram showing three data processing machines running software according to a preferred embodiment of the present invention;
Fig. 3 is a flowchart showing the operational steps performed by the system software of machine 400 of Fig. 2, according to a preferred embodiment of the present invention; and
Fig. 4 is a block diagram showing functional software blocks for carrying out the functionality of a preferred embodiment of the present invention.
Detailed Description of the Preferred Embodiments
In Fig. 2, the same basic architecture is illustrated as was shown and described above with respect to Fig. 1. However, the DOM 401 according to a preferred embodiment of the present invention includes more parsers than only the single XML parser 411 that was shown in Fig. 1. For example, in Fig. 2, a COBOL parser 401B is shown in addition to the XML parser 401A. The term "COBOL parser" is used here to mean a parser capable of parsing a structured message such as is typically described in a COBOL Copybook.
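A minimal sketch of what such a "COBOL parser" might do is given below, assuming a fixed-width record whose field names and widths have already been derived from a copybook. The copybook fragment, the LAYOUT table and the parse_record function are hypothetical; the description does not specify how the parser is implemented.

```python
from xml.dom.minidom import Document

# Hypothetical field layout, as might be derived from a copybook such as:
#   01 CUSTOMER-RECORD.
#      05 CUST-ID    PIC 9(5).
#      05 CUST-NAME  PIC X(10).
#      05 BALANCE    PIC 9(7).
LAYOUT = [("CUST-ID", 5), ("CUST-NAME", 10), ("BALANCE", 7)]

def parse_record(record: str, root_name: str = "CUSTOMER-RECORD") -> Document:
    """Convert one fixed-width record into a DOM tree, one element per field."""
    doc = Document()
    root = doc.createElement(root_name)
    doc.appendChild(root)
    offset = 0
    for field_name, width in LAYOUT:
        element = doc.createElement(field_name)
        value = record[offset:offset + width].strip()
        element.appendChild(doc.createTextNode(value))
        root.appendChild(element)
        offset += width
    return doc

record = "00042Jane Doe  0001250"
tree = parse_record(record)
```

The resulting tree has the same node types as one produced by an XML parser, so code written against the DOM API can read it without knowing that the input was not XML.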
If an XML format document is received over Internet 3 from program 11, the system software running in machine 400 which implements the DOM 401 recognizes the format of the document as the XML format and thus routes the received XML format document to the XML parser 401A. On the other hand, if a COBOL format document is received over Internet 3 from program 11, the system software running in machine 400 which implements the DOM 401 recognizes the format of the document as the COBOL format and thus routes the received COBOL format document to the COBOL parser 401B.
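The routing just described can be sketched as a lookup from a recognized format to a parser. How the format is recognized is not stated at this point in the description, so the detect_format heuristic below (a leading "<" means XML) is an assumption made only for the example, and the record parser is a stand-in that wraps the raw text in a single node.

```python
from xml.dom.minidom import Document, parseString

def parse_xml(text: str) -> Document:
    return parseString(text)

def parse_record(text: str) -> Document:
    """Stand-in for a COBOL parser: wraps the raw record in a single node."""
    doc = Document()
    root = doc.createElement("record")
    root.appendChild(doc.createTextNode(text))
    doc.appendChild(root)
    return doc

# One parser per supported format; further formats would add further entries.
PARSERS = {"xml": parse_xml, "cobol": parse_record}

def detect_format(text: str) -> str:
    # Assumption: anything that starts with "<" is treated as XML.
    return "xml" if text.lstrip().startswith("<") else "cobol"

def route(text: str) -> Document:
    """Send the document to the parser registered for its format."""
    return PARSERS[detect_format(text)](text)

xml_tree = route("<greeting>hello</greeting>")
record_tree = route("00042Jane Doe  0001250")
```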
Preferably, the system software code used to implement the DOM 401 having a plurality of parsers is the code which is being used in message brokers. Specifically, such a message broker receives messages in a variety of formats and then converts the received messages into an abstract message model (i.e., object model). The format of a received message is determined and then a particular parser of a group of parallel parsers is selected based on the determined format of the message for processing the received message in order to convert the received message into the abstract message model. For example, SAGA Software Inc. of Reston, Virginia, USA has such a message broker product called Sagavista (TM). See, e.g., the World Wide Web-published white paper "Sagavista (TM) Expanding the Reach of Your Enterprise" by David S. Linthicum, Chief Technology Officer, SAGA, such document being herein incorporated by reference.
In these message brokers, each received message has associated with it some meta-information describing the received message's structure, content and physical representation (referred to herein as the message's "format"). This format information is used to select an appropriate parser (from a plurality of parallel parsers) which converts the message to the common abstract message model. Once in this common abstract message model, each message can be processed using the same processing nodes within the message broker even though the messages have originated from different formats. The format also defines the mapping from the abstract message model to the physical representation (e.g., record data structure (COBOL or C), XML tagged structure, etc.).
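The two-way role of the format (selecting a parser into the abstract message model, and defining the mapping back to a physical representation) might be sketched as follows. This is not taken from any message broker product: the Format class, the FIELDS layout and the flat dictionary used as the abstract model are all simplifying assumptions for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict
import xml.etree.ElementTree as ET

Model = Dict[str, str]  # simplified abstract message model: field name -> value

FIELDS = [("id", 5), ("name", 10)]  # hypothetical fixed-width record layout

def parse_record(payload: str) -> Model:
    model, offset = {}, 0
    for name, width in FIELDS:
        model[name] = payload[offset:offset + width].strip()
        offset += width
    return model

def write_record(model: Model) -> str:
    # Pad or truncate each value to its field width.
    return "".join(model[name].ljust(width)[:width] for name, width in FIELDS)

def parse_xml(payload: str) -> Model:
    return {child.tag: (child.text or "") for child in ET.fromstring(payload)}

def write_xml(model: Model) -> str:
    root = ET.Element("message")
    for name, value in model.items():
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")

@dataclass(frozen=True)
class Format:
    parse: Callable[[str], Model]   # physical representation -> abstract model
    write: Callable[[Model], str]   # abstract model -> physical representation

FORMATS = {"record": Format(parse_record, write_record),
           "xml": Format(parse_xml, write_xml)}

def convert(payload: str, source: str, target: str) -> str:
    """Parse with the source format, then emit using the target format."""
    model = FORMATS[source].parse(payload)
    return FORMATS[target].write(model)

as_xml = convert("00042Jane Doe  ", "record", "xml")
```

Because every format meets at the same abstract model, any processing written against that model applies to messages regardless of the format in which they arrived.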
Steps explaining the operation of the system software code running in machine 400 will now be described with reference to the flowchart of Fig. 3 to illustrate the operation of a DOM that can handle documents in a plurality of formats.
At step 31, the system software code receives a document over the Internet 3 from program 11 running in machine 1. At step 32, the format of the document is determined. At step 33, a parser is selected from a group of parsers, with the selection being dependent on the determined format of the document. At step 34, the selected parser converts the document into an object model (i.e., a tree of nodes). Finally, at step 35, the object model is stored in memory 402 (an object model 402A is shown in memory 402 of Fig. 2). In accordance with existing parsing art, steps 34 and 35 may alternatively be performed in a "lazy" manner thus making them "lazy parsers". Specifically, only those parts of the received document that are requested by the DOM interface, or those parts of the document that must be parsed in order to access the requested parts, are parsed and converted to the object model in step 34.
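The "lazy" alternative mentioned for steps 34 and 35 can be illustrated with a record whose fields are converted only when first requested. The LazyRecord class, its layout table and the parse_count counter are hypothetical names introduced for this sketch; in a fixed-width layout no other part of the document needs to be parsed to reach a requested field, which keeps the example short.

```python
class LazyRecord:
    """Parses a fixed-width field only when it is first requested."""

    def __init__(self, raw: str, layout: dict):
        self._raw = raw
        self._layout = layout      # field name -> (offset, width), hypothetical
        self._parsed = {}          # fields converted so far
        self.parse_count = 0       # number of fields actually parsed

    def get(self, field: str) -> str:
        if field not in self._parsed:
            offset, width = self._layout[field]
            self._parsed[field] = self._raw[offset:offset + width].strip()
            self.parse_count += 1
        return self._parsed[field]

layout = {"id": (0, 5), "name": (5, 10), "balance": (15, 7)}
record = LazyRecord("00042Jane Doe  0001250", layout)
name = record.get("name")
name_again = record.get("name")   # served from the cache, no second parse
```

After these two calls only the "name" field has been parsed; "id" and "balance" remain as raw text until something asks for them.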
In Fig. 4, the system software, according to a preferred embodiment of the present invention, running on machine 400 of Fig. 2 includes the DOM 401 including the two parsers 401A and 401B as described above. The DOM 401 also includes a receiving unit 4011 which receives documents that were sent by, for example, program 11 on machine 1 over the Internet 3.
The received documents are sent to a format determining unit 4012 which determines the format of the received documents. The documents are then forwarded to the parser selecting unit 4013 which selects one of the parsers 401A or 401B depending on whether the format determining unit 4012 has determined that the format of the received document is XML or COBOL. It should be noted that while only two parsers are illustrated in this example, in practice there would probably be more than two used.
The parsers convert the document into the object model, as described above, and a storing unit 4014 then stores the object model into the memory 402. An XSL processor 403 then performs a transformation on the stored object models in order to render them in a form in which they can be displayed on a display screen, preferably using a Web browser.
XML (eXtensible Markup Language) emphasizes description of information structure and content as distinct from its presentation. The data structure and its syntax are defined in a DTD (Document Type Definition) specification, which is a derivative from SGML and defines a series of tags and their constraints. In contrast to information structure, the presentation issues are addressed by XSL (eXtensible Style Language), which is also a W3C emerging standard for defining how XML-based data should be expressed on a display screen (e.g., by a Web browser).
XSL processors (such as one developed by Lotus Development Corporation (TM)) typically use the DOM during the process of transforming an input XML description to an output form for display using a Web browser. The transformation is defined using a set of XSL rules which consist of two parts: a pattern matching description to identify structures within the input XML and a transformation which maps the input matched pattern to an output representation. The preferred embodiment of the present invention advantageously uses the XSL rules in order to transform documents from a non-XML format into an output form for display using a Web browser. This, in effect, extends the XML-based mechanism into a general purpose transformation engine. For example, this can be used to transform a COBOL input structure to an XML tagged output structure. In this latter example, the XSL rules used by the XSL processor 403 are written so that a COBOL document, which has been converted into an object model by COBOL parser 401B and stored into memory 402, is transformed into an XML tagged output structure by passing the output of the memory 402 through the XSL processor 403.
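The two-part rule structure described above (a pattern that identifies a structure in the input, and a transformation that maps it to an output) can be sketched without an XSL processor. The code below is not XSLT and does not use any XSL library; it is a simplified stand-in in which the pattern is just a tag name, and the COBOL-style field names and output tags are invented for the example.

```python
import xml.etree.ElementTree as ET

# Each rule pairs a pattern (here just a tag name) with a transformation
# that builds the output element for a matched input node.
def rename(new_tag):
    def transform(node, children):
        out = ET.Element(new_tag)
        out.text = node.text
        out.extend(children)
        return out
    return transform

RULES = {
    "CUSTOMER-RECORD": rename("customer"),
    "CUST-ID": rename("id"),
    "CUST-NAME": rename("name"),
}

def apply_rules(node):
    """Transform children first, then apply the rule matching this node."""
    children = [apply_rules(child) for child in node]
    transform = RULES.get(node.tag, rename(node.tag))  # default: copy as-is
    return transform(node, children)

# Object model as a COBOL-style parser might have produced it.
source = ET.Element("CUSTOMER-RECORD")
ET.SubElement(source, "CUST-ID").text = "00042"
ET.SubElement(source, "CUST-NAME").text = "Jane Doe"

result = ET.tostring(apply_rules(source), encoding="unicode")
```

The rules operate on the object model rather than on the original text, which is why the same rule set can be applied whether the tree came from an XML parser or from a parser for another format.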
Accordingly, as standard tools become available for creating and managing XSL rules, these tools may be used without change to transform other non-XML structured documents.
Besides being embodied in a data processing apparatus and a data processing method, examples of which are illustrated in Figs. 2-4, the present invention can also be embodied as a computer program product for use with a computer system. Such an implementation may comprise a series of computer readable instructions either fixed on a tangible medium, such as a computer readable media, e.g., diskette, CD-ROM, ROM, or hard disk, or transmittable to a computer system, via a modem or other interface device, over either a tangible medium, including but not limited to optical or analog communications lines, or intangibly using wireless techniques, including but not limited to microwave, infrared or other transmission techniques. The series of computer readable instructions embodies all or part of the functionality previously described herein.
Those skilled in the art will appreciate that such computer readable instructions can be written in a number of programming languages for use with many computer architectures or operating systems. Further, such instructions may be stored using any memory technology, present or future, including but not limited to, semiconductor, magnetic, or optical, or transmitted using any communications technology, present or future, including but not limited to optical, infrared, or microwave. It is contemplated that such a computer program product may be distributed as a removable media with accompanying printed or electronic documentation, e.g., shrink wrapped software, pre-loaded with a computer system, e.g., on a system ROM or fixed disk, or distributed from a server or electronic bulletin board over a network, e.g., the Internet or World Wide Web.

Claims (29)

1. A data processing apparatus comprising:
a memory unit; system software means, including a plurality of parsers, for implementing a Document Object Model application programming interface, including means for receiving documents from an originating application and processing means for processing received documents using one of the plurality of parsers in order to convert each received document into an object model; and system software means for storing each object model into the memory unit; wherein the processing means sends a received document to a particular one of the plurality of parsers depending on a format of the received document.
2. The apparatus of claim 1 wherein one of the parsers is an XML parser and a received XML format document is sent to the XML parser for processing.
3. The apparatus of claim 1 wherein one of the parsers is a COBOL parser and a received COBOL format document is sent to the COBOL parser for processing.
4. The apparatus of claim 1 wherein each document has data and structure associated therewith and the object model is represented as a tree of nodes containing the data and structure of the corresponding document.
5. The apparatus of claim 1 further comprising an XSL processor.
6. The apparatus of claim 5 wherein the XSL processor includes XSL rules for transforming a document from one format to another format.
7. The apparatus of claim 6 wherein the XSL rules are for transforming a document from COBOL format into XML format.
8. The apparatus of claim 1 wherein at least one of the parsers is a lazy parser.
9. A data processing method comprising steps of:
receiving a document from an originating application via a Document Object Model application programming interface which includes a plurality of parsers; determining a format of the received document; selecting one of the plurality of parsers depending on the results of the determining step; converting the received document into an object model using the parser selected at the selecting step; and storing the object model into a memory unit.
10. The method of claim 9 wherein one of the parsers is an XML parser and a received XML format document is sent to the XML parser for processing.
11. The method of claim 9 wherein one of the parsers is a COBOL parser and a received COBOL format document is sent to the COBOL parser for processing.
12. The method of claim 9 wherein each document has data and structure associated therewith and the object model is represented as a tree of nodes containing the data and structure of the corresponding document.
13. The method of claim 9 further comprising a step of outputting the stored object model from the memory unit to an XSL processor.
14. The method of claim 13 wherein the XSL processor includes XSL rules for transforming a document from one format to another format.
15. The method of claim 14 wherein the XSL rules are for transforming a document from COBOL format into XML format.
16. The method of claim 9 wherein at least one of the parsers is a lazy parser.
17. A computer program product stored on a computer readable carrier medium for, when run by a computer, carrying out a data processing method comprising steps of:
receiving a document from an originating application via a Document Object Model application programming interface which includes a plurality of parsers; determining a format of the received document; selecting one of the plurality of parsers depending on the results of the determining step; converting the received document into an object model using the parser selected at the selecting step; and storing the object model into a memory unit.
18. The computer program product of claim 17 wherein one of the parsers is an XML parser and a received XML format document is sent to the XML parser for processing.
19. The computer program product of claim 17 wherein one of the parsers is a COBOL parser and a received COBOL format document is sent to the COBOL parser for processing.
20. The computer program product of claim 17 wherein each document has data and structure associated therewith and the object model is represented as a tree of nodes containing the data and structure of the corresponding document.
21. The computer program product of claim 17 further comprising a step of outputting the stored object model from the memory unit to an XSL processor.
22. The computer program product of claim 21 wherein the XSL processor includes XSL rules for transforming a document from one format to another format.
23. The computer program product of claim 22 wherein the XSL rules are for transforming a document from COBOL format into XML format.
24. The computer program product of claim 17 wherein at least one of the parsers is a lazy parser.
25. A data processing apparatus comprising:
means for receiving a document from an originating application via a Document Object Model application programming interface which includes a plurality of parsers; means for determining a format of the received document; means for selecting one of the plurality of parsers depending on the results of the determining step; means for converting the received document into an object model using the parser selected at the selecting step; and means for storing the object model into a memory unit.
26. The apparatus of claim 25 wherein the document is received from the originating application over the Internet.
27. The apparatus of claim 1 wherein the document is received from the originating application over the Internet.
28. The method of claim 9 wherein the document is received from the originating application over the Internet.
29. The computer program product of claim 17 wherein the document is received from the originating application over the Internet.
GB9929936A 1999-12-18 1999-12-18 Using an abstract messaging interface and associated parsers to access standard document object models Withdrawn GB2357348A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB9929936A GB2357348A (en) 1999-12-18 1999-12-18 Using an abstract messaging interface and associated parsers to access standard document object models

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB9929936A GB2357348A (en) 1999-12-18 1999-12-18 Using an abstract messaging interface and associated parsers to access standard document object models

Publications (2)

Publication Number Publication Date
GB9929936D0 GB9929936D0 (en) 2000-02-09
GB2357348A true GB2357348A (en) 2001-06-20

Family

ID=10866573

Family Applications (1)

Application Number Title Priority Date Filing Date
GB9929936A Withdrawn GB2357348A (en) 1999-12-18 1999-12-18 Using an abstract messaging interface and associated parsers to access standard document object models

Country Status (1)

Country Link
GB (1) GB2357348A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001088840A2 (en) * 2000-05-17 2001-11-22 Ccp Systems Ag Method and system for the transformation of digital print data streams and corresponding printer and printer server
GB2367666A (en) * 2000-04-07 2002-04-10 Nec Corp A communication terminal device which discriminates content type and uses a relevant parser dependent on the markup language in use
WO2002091170A1 (en) * 2001-05-04 2002-11-14 International Business Machines Corporation Dedicated processor for efficient processing of documents encoded in a markup language
WO2003085885A1 (en) * 2002-04-08 2003-10-16 Kent Ridge Digital Labs An interactive messaging communication system
EP1396793A1 (en) * 2001-06-14 2004-03-10 Sharp Kabushiki Kaisha Data processing method, data processing program, and data processing apparatus
US7694284B2 (en) 2004-11-30 2010-04-06 International Business Machines Corporation Shareable, bidirectional mechanism for conversion between object model and XML
EP2530583A1 (en) * 2011-05-31 2012-12-05 Accenture Global Services Limited Computer-implemented method, system and computer program product for displaying a user interface component

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1998053393A1 (en) * 1997-05-23 1998-11-26 Adobe Systems Incorporated Data stream processing on networked computer system lacking format-specific data processing resources

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1998053393A1 (en) * 1997-05-23 1998-11-26 Adobe Systems Incorporated Data stream processing on networked computer system lacking format-specific data processing resources

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Class com.studiom.dom.DOM", 26 Aug '99, at www.studiom.com/sw/dom/docs/api/com.studiom.dom.DOM.html *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2367666A (en) * 2000-04-07 2002-04-10 Nec Corp A communication terminal device which discriminates content type and uses a relevant parser dependent on the markup language in use
GB2367666B (en) * 2000-04-07 2002-12-04 Nec Corp Communication terminal device
WO2001088840A2 (en) * 2000-05-17 2001-11-22 Ccp Systems Ag Method and system for the transformation of digital print data streams and corresponding printer and printer server
WO2001088840A3 (en) * 2000-05-17 2002-04-18 Ccp Systems Ag Method and system for the transformation of digital print data streams and corresponding printer and printer server
US6684789B2 (en) 2000-05-17 2004-02-03 Ccp Systems Ag Method and system for the transformation of digital print data streams and corresponding printer and printer server
WO2002091170A1 (en) * 2001-05-04 2002-11-14 International Business Machines Corporation Dedicated processor for efficient processing of documents encoded in a markup language
US7013424B2 (en) 2001-05-04 2006-03-14 International Business Machines Corporation Dedicated processor for efficient processing of documents encoded in a markup language
EP1396793A4 (en) * 2001-06-14 2006-02-22 Sharp Kk Data processing method, data processing program, and data processing apparatus
EP1396793A1 (en) * 2001-06-14 2004-03-10 Sharp Kabushiki Kaisha Data processing method, data processing program, and data processing apparatus
EP1770547A3 (en) * 2001-06-14 2007-04-11 Sharp Kabushiki Kaisha Data processing method, data processing program, and data processing apparatus
EP1770548A3 (en) * 2001-06-14 2007-04-11 Sharp Kabushiki Kaisha Data processing method, data processing program, and data processing apparatus
WO2003085885A1 (en) * 2002-04-08 2003-10-16 Kent Ridge Digital Labs An interactive messaging communication system
US7694284B2 (en) 2004-11-30 2010-04-06 International Business Machines Corporation Shareable, bidirectional mechanism for conversion between object model and XML
EP2530583A1 (en) * 2011-05-31 2012-12-05 Accenture Global Services Limited Computer-implemented method, system and computer program product for displaying a user interface component
AU2012203071B2 (en) * 2011-05-31 2013-12-12 Accenture Global Services Limited Computer-implemented method, system and computer program product for displaying a user interface component
US8694960B2 (en) 2011-05-31 2014-04-08 Accenture Global Services Limited Computer-implemented method, system and computer program product for displaying a user interface component
US9268539B2 (en) 2011-05-31 2016-02-23 Accenture Global Services Limited User interface component

Also Published As

Publication number Publication date
GB9929936D0 (en) 2000-02-09

Similar Documents

Publication Publication Date Title
US8326856B2 (en) Method and apparatus of automatic method signature adaptation for dynamic web service invocation
US7568205B2 (en) Providing remote processing services over a distributed communications network
US7533110B2 (en) File conversion
US7483940B2 (en) Dynamic agent with embedded web server and mark-up language support for e-commerce automation
US20030069881A1 (en) Apparatus and method for dynamic partitioning of structured documents
US10860391B2 (en) System and method for automatic generation of service-specific data conversion templates
US20040205577A1 (en) Selectable methods for generating robust Xpath expressions
US20060230057A1 (en) Method and apparatus for mapping web services definition language files to application specific business objects in an integrated application environment
US20070226612A1 (en) Server-side html customization based on style sheets and target device
US20040083453A1 (en) Architecture for dynamically monitoring computer application data
US20030074181A1 (en) Extensibility and usability of document and data representation languages
KR20070086019A (en) Form related data reduction
US20030084405A1 (en) Contents conversion system, automatic style sheet selection method and program thereof
KR20030094320A (en) Dedicated processor for efficient processing of documents encoded in a markup language
WO2002082311A2 (en) Method and apparatus for document markup language based document processing
US20130212121A1 (en) Client-side modification of electronic documents in a client-server environment
Langham et al. Cocoon: building XML applications
JP2008539487A (en) System and method for providing a data format
US8539340B2 (en) Method to serve real-time data in embedded web server
US6772395B1 (en) Self-modifying data flow execution architecture
US8756487B2 (en) System and method for context sensitive content management
US20020184370A1 (en) System and method for providing links to available services over a network
GB2357348A (en) Using an abstract messaging interface and associated parsers to access standard document object models
KR100427681B1 (en) A method and apparatus defining a component model for creating dynamic document in a distributed data processing system
US20040210631A1 (en) Method and apparatus for accessing legacy data in a standardized environment

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)