CN103559322B - Document format conversion method - Google Patents

Publication number
CN103559322B
Authority
CN
China
Prior art keywords
node
xml
station location
location marker
relation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310596651.7A
Other languages
Chinese (zh)
Other versions
CN103559322A (en)
Inventor
李祺
戴鑫波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University Medical Information Technology Co ltd
Original Assignee
Medical Information Technology Co Ltd Of Beijing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Medical Information Technology Co Ltd Of Beijing University
Priority to CN201310596651.7A
Publication of CN103559322A
Application granted
Publication of CN103559322B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/80: Information retrieval; Database structures therefor; File system structures therefor of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML
    • G06F16/84: Mapping; Conversion

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The invention provides a document format conversion method, comprising: recording the element information of each node in the DOM tree corresponding to an XML template file into a configuration table, and storing mapping relations in the configuration table; generating a corresponding position identifier for at least one node, and storing the position identifier in association with that node; and generating the target XML file according to the configuration table, wherein the corresponding nodes in the XML template file are addressed according to the position identifiers. With this technical solution, when a two-dimensional relation table is converted into an XML file, node addressing is accelerated by the position identifiers, which helps improve conversion efficiency, especially when the XML document has a large amount of content, deep nesting, and complex conditions.

Description

Document format conversion method
Technical field
The present invention relates to the technical field of format conversion, and in particular to a document format conversion method.
Background art
HL7 (Health Level Seven) is a medical information exchange protocol based on layer 7 (the application layer) of the Open Systems Interconnection (OSI) model published by the International Organization for Standardization (ISO). The HL7 protocol has now reached its third version, HL7V3.
When medical systems based on HL7V3 exchange medical information, a locally generated two-dimensional relation table often has to be converted into an XML file before it is sent to another medical system, and an XML file received from another medical system has to be converted into a two-dimensional relation table before it is stored locally. The related art mainly uses an XML mapping method, in which the mapping process is generally divided into two parts: configuring the mapping, and converting the XML using the mapping relations.
XML (Extensible Markup Language) is a language for describing structured data. Because it is open and extensible, it is now widely used for data exchange and data storage. An XML document consists mainly of elements such as tags, tag values, attributes, attribute values, special processing instructions, and comments. There are currently two main XML parsing technologies, SAX (Simple API for XML) and DOM (Document Object Model), which work on different principles. SAX parses in an event-driven manner. DOM, by contrast, uses a DOM parser to parse the XML document in one pass into an object tree held in memory, so that random operations on the XML become operations on the object tree. Because of this convenience, DOM has become the mainstream XML parsing approach.
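The contrast between the two parsing models can be illustrated with Python's standard library. This is only a sketch: the sample document, the `NameCounter` handler, and the variable names are hypothetical and do not come from the patent.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document, used only for illustration.
SAMPLE = "<patient><name>Li</name><name>Dai</name></patient>"


class NameCounter(xml.sax.ContentHandler):
    """SAX style: react to events while the parser streams through the text."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "name":
            self.count += 1


handler = NameCounter()
xml.sax.parseString(SAMPLE.encode("utf-8"), handler)

# DOM style: the whole document becomes an in-memory object tree,
# so any node can be reached in any order after parsing.
dom = minidom.parseString(SAMPLE)
names = dom.getElementsByTagName("name")
second_name = names[1].firstChild.data
```

With SAX nothing is retained unless the handler stores it, whereas the DOM tree allows random access such as `names[1]` after a single parse.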
When nodes of an XML document are located by operating on the DOM tree, XPath is the main technology used. XPath uses path expressions to select a node or node set in an XML document; these path expressions closely resemble the path expressions of a conventional operating system's file system. A path can be absolute or relative, and a path expression can contain predicates, wildcards, and operators. XPath also contains more than 100 built-in standard functions for string values, numeric values, date and time comparison, node processing, sequence processing, logical values, and so on.
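As a small illustration of path-based selection, the following sketch uses `xml.etree.ElementTree`, which implements only a limited subset of XPath (paths, wildcards, and simple predicates, not the full function library mentioned above). The document and element names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the element and attribute names are assumptions.
DOC = (
    "<message>"
    "<patient id='p1'><name>A</name></patient>"
    "<patient id='p2'><name>B</name></patient>"
    "</message>"
)
root = ET.fromstring(DOC)

# Relative path from the root element: every name under every patient.
all_names = [e.text for e in root.findall("./patient/name")]

# Predicate on an attribute value: the name of one specific patient.
p2_name = root.find("./patient[@id='p2']/name").text

# Descendant search: every name element at any depth.
any_depth = root.findall(".//name")
```

Each of these calls matches candidate nodes by comparing tag names and attribute strings, which is the kind of cost that the next paragraph describes.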
However, existing DOM parsing technology still has performance shortcomings. XPath locates nodes mainly through standard string expressions such as node paths and node attribute values, and when an XPath expression is evaluated, the location is determined mainly by traversing nodes and comparing the strings of element names and element values. As a result, especially when many XML documents are processed, the documents are large, the nesting is deep, and the conditions are complex, the frequent evaluation of XPath expressions and string comparison during node traversal directly shows up as slow parsing and locating performance.
How to speed up the addressing (locating) of nodes in an XML file, and thereby improve the parsing efficiency for XML documents, has therefore become a technical problem that urgently needs to be solved.
Summary of the invention
To address the above problem, the present invention proposes a new technical solution: when a two-dimensional relation table is converted into an XML file, position identifiers are used to speed up the addressing of nodes, which helps improve conversion efficiency, especially when the XML document has a large amount of content, deep nesting, and complex conditions.
In view of this, the present invention proposes a document format conversion method, comprising: obtaining an XML template file whose format is the same as that of the target XML file, and a standard two-dimensional relation table whose format is the same as that of the two-dimensional relation table to be processed; recording the element information of each node in the DOM tree corresponding to the XML template file into a configuration table, and storing in the configuration table the mapping relations between the nodes in the DOM tree and the parameters in the standard two-dimensional relation table; generating a corresponding position identifier for at least one node in the DOM tree, and storing the position identifier in association with the corresponding node; and, according to the configuration table, filling the parameters in the two-dimensional relation table to be processed into the XML template file to generate the target XML file, wherein the corresponding nodes in the XML template file are addressed according to the position identifiers.
In this technical solution, instead of the path information used when an XML file is parsed directly with XPath, the invention generates a separate position identifier for each node so that the node can be addressed by that identifier. This avoids the repeated traversal of nodes that path-based addressing requires. Particularly when many nodes share the same element name and even some of the same attribute values, it effectively increases node addressing speed, and thus improves both XML parsing efficiency and the efficiency of converting the two-dimensional relation table format.
In the above technical solution, preferably, generating the position identifier comprises: generating the position identifier according to the hierarchical relationship between any node in the DOM tree and the other nodes in the DOM tree, and the positional relationship between that node and the other nodes in its level, and storing the position identifier in the configuration table.
In this technical solution, parsing the XML file into a DOM tree structure lets each node be located by its hierarchical and positional relationships. Compared with XPath's approach of locating nodes by standard string expressions such as node paths and node attribute values, there is no need to traverse the nodes repeatedly, which effectively improves node locating speed and XML file parsing efficiency.
In any of the above technical solutions, preferably, the position identifier comprises a character string made up of at least one digit segment, representing the path from the root node to the node in question; the position of each digit segment within the string indicates the level in the DOM tree of the corresponding node on the path, and the value of each digit segment indicates that node's position within its level.
In this technical solution, the hierarchical and positional relationships of each node can be represented by a specific character string. For example, the first digit segment in the string corresponds to the first level of the DOM tree (the root node is not counted; the levels below it are the first level, the second level, and so on), and the value of that segment indicates where the current node falls when it is ordered with the other nodes of the first level. For instance, "06" indicates that the current node is the sixth node among all nodes of the first level, counting from a preset start node. This string-based notation makes each node's place in the DOM tree explicit and helps locate nodes quickly.
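The digit-segment scheme can be sketched as follows. The patent only gives "06" as an example, so the fixed two-digit segment width, the 1-based numbering among sibling elements, and the names `build_position_ids` and `resolve` are assumptions made for illustration, not the patented implementation.

```python
import xml.etree.ElementTree as ET

SEGMENT_WIDTH = 2  # assumption: two digits per level, as in the "06" example


def build_position_ids(root):
    """Map each position identifier to the tag of the node it addresses."""
    ids = {}

    def walk(node, prefix):
        # Children are numbered from 1 within their level; the root is not counted.
        for index, child in enumerate(list(node), start=1):
            pid = prefix + f"{index:0{SEGMENT_WIDTH}d}"
            ids[pid] = child.tag
            walk(child, pid)

    walk(root, "")
    return ids


def resolve(root, pid):
    """Follow one child index per digit segment; no names are compared."""
    node = root
    for i in range(0, len(pid), SEGMENT_WIDTH):
        node = node[int(pid[i:i + SEGMENT_WIDTH]) - 1]
    return node


# Hypothetical template, used only for this example.
TEMPLATE = "<message><id/><patient><name/><gender/></patient></message>"
root = ET.fromstring(TEMPLATE)
ids = build_position_ids(root)
gender = resolve(root, "0202")
```

Here "0202" reads as "second node of the first level, then second node of the second level", and resolving it costs one index lookup per level rather than a string comparison per visited node.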
In any of the above technical solutions, preferably, generating the position identifier further comprises: when a nesting relationship exists between nodes of multiple levels in the DOM tree, generating the position identifier for the parent node among those nodes, and generating corresponding relative position identifiers according to the relative hierarchical and positional relationships between the parent node and the other nodes among them, the relative position identifiers serving as the position identifiers of those other nodes.
In this technical solution, for multiple nodes in a nesting relationship, generating relative position identifiers means that when the nesting relationship is parsed, repeated addressing can be confined to those nodes on the basis of their relative positions, without starting from the root node every time, which helps speed up processing of the nesting relationship.
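A minimal sketch of relative addressing, under the same assumed two-digit segment format: the parent of the nested group is resolved once from the root, and its descendants are then resolved from the parent. The template and the helper name `resolve_from` are hypothetical.

```python
import xml.etree.ElementTree as ET

WIDTH = 2  # assumed digit-segment width


def resolve_from(start, pid):
    """Resolve a position identifier relative to any starting node."""
    node = start
    for i in range(0, len(pid), WIDTH):
        node = node[int(pid[i:i + WIDTH]) - 1]
    return node


# Hypothetical template with a nested group under <order>.
TEMPLATE = "<message><header/><order><item><code/><qty/></item></order></message>"
root = ET.fromstring(TEMPLATE)

# The parent node of the nested group gets an absolute identifier ...
item = resolve_from(root, "0201")

# ... and the nested nodes get identifiers relative to that parent,
# so repeated addressing never has to restart at the root.
code = resolve_from(item, "01")
qty = resolve_from(item, "02")
```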
In any of the above technical solutions, preferably, generating the position identifier further comprises: generating a unique annotation identifier for the at least one node to serve as the position identifier; and inserting the annotation identifier into the XML template file at the position corresponding to the at least one node, so as to associate it with that node.
In this technical solution, because each annotation identifier is unique, a lookup for a given annotation identifier during node addressing finds exactly one corresponding node directly. The result is not affected by the presence of multiple nodes with the same name or the same attribute values, and repeated parsing of the XML file is avoided, which speeds up node addressing and improves XML file parsing efficiency.
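One way to picture annotation identifiers is as XML comments placed next to the nodes they mark. The patent does not specify the comment format or where exactly the comment sits, so the sketch below assumes the comment is inserted immediately before its element and that identifiers look like `N0001`; both choices, and the helper names, are hypothetical.

```python
from xml.dom import minidom


def tag_with_comment(doc, element, marker):
    """Insert a comment carrying a unique marker just before the element."""
    comment = doc.createComment(marker)
    element.parentNode.insertBefore(comment, element)


def index_by_comment(doc):
    """Build a marker -> element index in a single pass over the tree."""
    index = {}

    def walk(node):
        for child in node.childNodes:
            if child.nodeType == child.COMMENT_NODE:
                # The marked element is the next element sibling of the comment.
                sibling = child.nextSibling
                while sibling is not None and sibling.nodeType != sibling.ELEMENT_NODE:
                    sibling = sibling.nextSibling
                if sibling is not None:
                    index[child.data] = sibling
            elif child.nodeType == child.ELEMENT_NODE:
                walk(child)

    walk(doc)
    return index


# Two nodes with the same name, which a name-based lookup cannot tell apart.
doc = minidom.parseString("<message><name/><name/></message>")
first, second = doc.documentElement.getElementsByTagName("name")
tag_with_comment(doc, first, "N0001")
tag_with_comment(doc, second, "N0002")

index = index_by_comment(doc)
serialized = doc.toxml()
```

Because each marker is unique, `index["N0002"]` returns exactly one node even though both elements are named `name`.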
In any of the above technical solutions, preferably, generating the mapping relations further comprises: when the XML template file contains a node that can repeat, setting a corresponding loop marker in the mapping relations.
In this technical solution, the loop marker makes the presence of a loop directly known when the format conversion is performed, which helps in parsing the XML template file.
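The effect of a loop marker can be sketched as a flag on a mapping entry: when it is set, the template node is cloned once per table row instead of being filled once. The mapping layout (`position`, `column`, `loop`, `children`), the two-digit identifiers, and the restriction to a repeatable node directly under the root are simplifying assumptions for this example only.

```python
import copy
import xml.etree.ElementTree as ET

WIDTH = 2  # assumed digit-segment width


def child_at(node, pid):
    """Resolve a digit-segment identifier relative to the given node."""
    for i in range(0, len(pid), WIDTH):
        node = node[int(pid[i:i + WIDTH]) - 1]
    return node


# Hypothetical template and mapping; <item> is the repeatable node.
TEMPLATE = "<order><id/><item><code/></item></order>"
MAPPING = [
    {"position": "01", "column": "order_id", "loop": False},
    {"position": "02", "loop": True,
     "children": [{"position": "01", "column": "item_code"}]},
]


def fill(template_xml, mapping, rows):
    root = ET.fromstring(template_xml)
    # Resolve every node before the tree is modified, so indexes stay valid.
    resolved = [(entry, child_at(root, entry["position"])) for entry in mapping]
    for entry, node in resolved:
        if not entry["loop"]:
            node.text = str(rows[0][entry["column"]])
            continue
        # Loop marker set: replace the prototype node with one clone per row.
        insert_at = list(root).index(node)
        root.remove(node)
        for offset, row in enumerate(rows):
            clone = copy.deepcopy(node)
            for sub in entry["children"]:
                child_at(clone, sub["position"]).text = str(row[sub["column"]])
            root.insert(insert_at + offset, clone)
    return root


ROWS = [{"order_id": 7, "item_code": "A"}, {"order_id": 7, "item_code": "B"}]
result = fill(TEMPLATE, MAPPING, ROWS)
codes = [c.text for c in result.iter("code")]
```

The converter reads the flag from the mapping rather than inferring repetition from the data, which is the role the text assigns to the loop marker.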
In any of the above technical solutions, preferably, generating the mapping relations further comprises: when a nesting relationship exists between nodes of multiple levels in the DOM tree, marking nesting condition information in the mapping relations.
In this technical solution, marking the nesting condition information helps in parsing the corresponding nested structure and speeds up the format conversion of the file.
In any of the above technical solutions, preferably, the method further comprises: generating the XML template file and the corresponding configuration table for each of at least one business type; and reading the business type marker in the two-dimensional relation table to be processed, and obtaining the XML template file and configuration table corresponding to that business type marker for use in converting the format of the two-dimensional relation table to be processed.
In this technical solution, when there are two-dimensional relation tables for several different business types, the parameters, parameter formats, and so on contained in the tables of different business types may differ, but the format of the table for any one business type is fixed. Generating the corresponding XML template files and configuration tables per business type therefore makes the method compatible with different business types.
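Selecting a template by business type amounts to a keyed lookup. In the sketch below the registry contents, the business type names, and the `business_type` field of the table are all hypothetical; the patent does not say how the marker is stored in the table.

```python
# Hypothetical registry: one template and one configuration table per business type.
REGISTRY = {
    "ADMISSION": {
        "template": "<admission><patient/></admission>",
        "config": {"01": "patient_name"},
    },
    "LAB_ORDER": {
        "template": "<labOrder><test/></labOrder>",
        "config": {"01": "test_code"},
    },
}


def select_template(table):
    """Return the (template, config) pair matching the table's business type marker."""
    business_type = table["business_type"]
    try:
        entry = REGISTRY[business_type]
    except KeyError:
        raise ValueError(f"no template configured for {business_type!r}") from None
    return entry["template"], entry["config"]


# A table to be processed, represented here as a dict with its rows.
table = {"business_type": "LAB_ORDER", "rows": [{"test_code": "GLU"}]}
template, config = select_template(table)
```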
With the above technical solution, when a two-dimensional relation table is converted into an XML file, node addressing is accelerated by the position identifiers, which helps improve conversion efficiency, especially when the XML document has a large amount of content, deep nesting, and complex conditions.
Brief description of the drawings
Fig. 1 shows a schematic block diagram of a document format conversion method according to an embodiment of the present invention;
Fig. 2 shows a schematic block diagram of a document format conversion system according to an embodiment of the present invention;
Fig. 3 shows a schematic structural diagram of an HL7V3 conversion engine according to an embodiment of the present invention;
Fig. 4 is a schematic flow diagram of the XML template configuration tool of the embodiment shown in Fig. 3 configuring an XML template;
Fig. 5 is a schematic flow diagram of the CDS-to-XML module of the embodiment shown in Fig. 3 converting the format of a CDS file;
Fig. 6 is a schematic flow diagram of the XML-to-CDS module of the embodiment shown in Fig. 3 converting the format of an XML file.
Detailed description of the embodiments
To make the above objects, features, and advantages of the present invention easier to understand, the invention is described in further detail below with reference to the accompanying drawings and specific embodiments. It should be noted that, where no conflict arises, the embodiments of the present application and the features in the embodiments may be combined with one another.
Many specific details are set out in the following description to give a thorough understanding of the present invention. However, the invention can also be implemented in ways other than those described here, and is therefore not limited to the specific embodiments disclosed below.
Fig. 1 shows a schematic block diagram of a document format conversion method according to an embodiment of the present invention.
As shown in Fig. 1, a document format conversion method according to an embodiment of the present invention comprises: Step 102, obtaining an XML template file whose format is the same as that of the target XML file, and a standard two-dimensional relation table whose format is the same as that of the two-dimensional relation table to be processed, recording the element information of each node in the DOM tree corresponding to the XML template file into a configuration table, and storing in the configuration table the mapping relations between the nodes in the DOM tree and the parameters in the standard two-dimensional relation table; Step 104, generating a corresponding position identifier for at least one node in the DOM tree, and storing the position identifier in association with the corresponding node; Step 106, according to the configuration table, filling the parameters in the two-dimensional relation table to be processed into the XML template file to generate the target XML file, wherein the corresponding nodes in the XML template file are addressed according to the position identifiers.
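Steps 102 to 106 can be sketched end to end as follows. This is a simplified illustration rather than the patented implementation: the configuration table is a list of dicts, position identifiers use an assumed two-digit segment per level, and the mapping is keyed by tag name only to keep the example short (a real configuration would map individual nodes, since tags may repeat). All names and sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

WIDTH = 2  # assumed digit-segment width


def build_config(template_xml, column_by_tag):
    """Steps 102 and 104: record element info, mapping, and position identifiers."""
    root = ET.fromstring(template_xml)
    config = []

    def walk(node, prefix):
        for i, child in enumerate(node, start=1):
            pid = f"{prefix}{i:0{WIDTH}d}"
            config.append({
                "tag": child.tag,                       # element information
                "position": pid,                        # position identifier
                "column": column_by_tag.get(child.tag), # mapped table column, if any
            })
            walk(child, pid)

    walk(root, "")
    return config


def convert(template_xml, config, row):
    """Step 106: fill table values into the template, addressing by identifier."""
    root = ET.fromstring(template_xml)
    for entry in config:
        if entry["column"] is None:
            continue
        node = root
        pid = entry["position"]
        for k in range(0, len(pid), WIDTH):
            node = node[int(pid[k:k + WIDTH]) - 1]
        node.text = str(row[entry["column"]])
    return ET.tostring(root, encoding="unicode")


TEMPLATE = "<message><patient><name/><gender/></patient></message>"
config = build_config(TEMPLATE, {"name": "patient_name", "gender": "sex"})
xml_out = convert(TEMPLATE, config, {"patient_name": "Li", "sex": "F"})
```

The configuration is built once per template and reused for every table in that format, so the per-conversion work is limited to index lookups and text assignment.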
Fig. 2 shows a schematic block diagram of a document format conversion system according to an embodiment of the present invention.
As shown in Fig. 2, a document format conversion system 200 according to an embodiment of the present invention comprises: a template configuration module 202, configured to obtain an XML template file whose format is the same as that of the target XML file, and a standard two-dimensional relation table whose format is the same as that of the two-dimensional relation table to be processed, to record the element information of each node in the DOM tree corresponding to the XML template file into a configuration table, and to store in the configuration table the mapping relations between the nodes in the DOM tree and the parameters in the standard two-dimensional relation table; an identifier generation module 204, configured to generate a corresponding position identifier for at least one node in the DOM tree, and to store the position identifier in association with the corresponding node; and a format conversion module 206, configured to fill, according to the configuration table, the parameters in the two-dimensional relation table to be processed into the XML template file to generate the target XML file, wherein the corresponding nodes in the XML template file are addressed according to the position identifiers.
In this technical solution, instead of the path information used when an XML file is parsed directly with XPath, the invention generates a separate position identifier for each node so that the node can be addressed by that identifier. This avoids the repeated traversal of nodes that path-based addressing requires. Particularly when many nodes share the same element name and even some of the same attribute values, it effectively increases node addressing speed, and thus improves both XML parsing efficiency and the efficiency of converting the two-dimensional relation table format.
In the above technical solution, preferably, the identifier generation module 204 is configured to: generate the position identifier according to the hierarchical relationship between any node in the DOM tree and the other nodes in the DOM tree, and the positional relationship between that node and the other nodes in its level, and store the position identifier in the configuration table.
In this technical solution, parsing the XML file into a DOM tree structure lets each node be located by its hierarchical and positional relationships. Compared with XPath's approach of locating nodes by standard string expressions such as node paths and node attribute values, there is no need to traverse the nodes repeatedly, which effectively improves node locating speed and XML file parsing efficiency.
In any of the above technical solutions, preferably, the position identifier comprises a character string made up of at least one digit segment, representing the path from the root node to the node in question; the position of each digit segment within the string indicates the level in the DOM tree of the corresponding node on the path, and the value of each digit segment indicates that node's position within its level.
In this technical solution, the hierarchical and positional relationships of each node can be represented by a specific character string. For example, the first digit segment in the string corresponds to the first level of the DOM tree (the root node is not counted; the levels below it are the first level, the second level, and so on), and the value of that segment indicates where the current node falls when it is ordered with the other nodes of the first level. For instance, "06" indicates that the current node is the sixth node among all nodes of the first level, counting from a preset start node. This string-based notation makes each node's place in the DOM tree explicit and helps locate nodes quickly.
In any of the above technical solutions, preferably, the identifier generation module 204 is further configured to: when a nesting relationship exists between nodes of multiple levels in the DOM tree, generate the position identifier for the parent node among those nodes, and generate corresponding relative position identifiers according to the relative hierarchical and positional relationships between the parent node and the other nodes among them, the relative position identifiers serving as the position identifiers of those other nodes.
In this technical solution, for multiple nodes in a nesting relationship, generating relative position identifiers means that when the nesting relationship is parsed, repeated addressing can be confined to those nodes on the basis of their relative positions, without starting from the root node every time, which helps speed up processing of the nesting relationship.
In any of the above technical solutions, preferably, the identifier generation module 204 is configured to: generate a unique annotation identifier for the at least one node to serve as the position identifier; and insert the annotation identifier into the XML template file at the position corresponding to the at least one node, so as to associate it with that node.
In this technical solution, because each annotation identifier is unique, a lookup for a given annotation identifier during node addressing finds exactly one corresponding node directly. The result is not affected by the presence of multiple nodes with the same name or the same attribute values, and repeated parsing of the XML file is avoided, which speeds up node addressing and improves XML file parsing efficiency.
In any of the above-described technical scheme, it is preferable that the template configuration module 202 is additionally operable to:Generation corresponds at least A kind of XML template files of type of service and corresponding allocation list;And the format converting module 106 is additionally operable to:Read Take the type of service in the pending two-dimentional relation table to mark, obtain the XML templates for corresponding to type of service mark File and allocation list, for entering row format conversion to the pending two-dimentional relation table.
In this technical solution, when two-dimensional relational tables for several different business types exist, the parameters, parameter formats, and so on contained in the tables may differ from one business type to another, but the table format for any one business type is fixed. Generating an XML template file and a configuration table for each business type therefore makes the conversion compatible with different business types.
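As a rough illustration of selecting a template and configuration table by business type, the sketch below keeps one profile per type in a dictionary. The business type names, file names, and the `business_type` key are all hypothetical; the text only states that the business type identifier is read from the table to be processed.

```python
# Hypothetical registry: one XML template and one configuration table per business type.
PROFILES = {
    "admission": {"template": "admission_template.xml", "config": "admission_config.csv"},
    "lab_order": {"template": "lab_order_template.xml", "config": "lab_order_config.csv"},
}


def select_profile(table):
    """Return the template/config pair that matches the table's business type."""
    business_type = table["business_type"]
    if business_type not in PROFILES:
        raise LookupError(f"no template configured for business type {business_type!r}")
    return PROFILES[business_type]


profile = select_profile({"business_type": "lab_order", "rows": []})
print(profile["template"])
```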
It should be noted that, as those skilled in the art will understand, the two-dimensional relational table referred to in the present invention means a table or file of any form that embodies a two-dimensional relationship, and it may differ depending on the development tools or technical means actually used. As a more specific example, when Delphi development technology is used, the two-dimensional relational table may be a CDS (ClientDataSet, client data set) file.
The technical solution of the present invention can be applied to application systems that need template-based XML conversion and fast XML positioning, in fields such as HL7 V3 format conversion engines, integration platforms, integration and interaction between heterogeneous systems, conversion of two-dimensional relational tables to XML, and conversion of XML to two-dimensional relational tables. The implementation of the present invention is further described below, taking CDS-to-XML conversion in an HL7 V3 conversion engine as an example.
In the HL7 V3 engine, the data in the business system is stored and used in the form of two-dimensional relational tables (that is, CDS files). To facilitate message exchange between heterogeneous data sources, the HL7 V3 engine must convert the two-dimensional relational data into messages in the corresponding HL7 V3 XML standard format, so that it can interact with the platform or with other systems. HL7 V3 defines a large number of standard message services, and each message service defines a standard format.
When the business system sends a message, it passes the CDS two-dimensional relational table data to be converted as a parameter and calls the CDS-to-XML function of the conversion engine. The conversion engine loads the configured XML configuration table and the standard XML empty template file (a complete, standard-structure XML file that contains no business data). The configuration table records the mapping and configuration information of each XML element. The conversion engine locates the elements of the empty template XML one by one according to the position codes or annotation identifiers recorded in the configuration table, reads the business data from the CDS according to the mapping configuration, writes it into the node elements, and finally returns the converted XML file.
When a received XML file is converted into a CDS file, the conversion is likewise based on the mapping relationship between the standard XML template file and the CDS file (for which a configuration table can be established), so that the format of the XML file is converted on the basis of that mapping relationship.
With reference to Fig. 3 to Fig. 6, the format conversion engine based on the present invention is described in detail below.
Fig. 3 shows a schematic structural diagram of an HL7 V3 conversion engine according to an embodiment of the present invention.
As shown in Fig. 3, the HL7 V3 conversion engine 300 according to an embodiment of the present invention includes:
XML template configuration tool 302: used to configure the standard XML template file and to generate the related supporting information, such as the mapping relationship between XML and CDS, a new XML template file containing annotation identifiers, and position codes, and to save this information as a comprehensive configuration information table (that is, the configuration table).
Further, when the configuration table is configured, the standard XML template file is traversed automatically in order, and a position code and an annotation identifier for fast positioning are generated for each element. The position codes are stored in the configuration table, and the annotation identifiers can be inserted into the XML template file, producing a new XML template file that does not affect the original structure.
When the mapping relationship is configured, the configuration tool is used to manually assign, to each node element that needs to store dynamic business data, the name of the corresponding CDS and the corresponding field name. The mapping is matched by name: in the subsequent conversion, a CDS instance with the corresponding name must be constructed and initialized, and the fields of that CDS instance must include the mapped field name; otherwise a configuration exception is reported during conversion.
When a data set node is configured, a loop mark of 0..n or 1..n can be set to constrain how many times the nodes in the loop body repeat. 0..n means that the node element is optional: it may be absent or may repeat several times, and if there is no data the node should be deleted. 1..n means that the node element cannot be empty: it must appear at least once and must have a value, and it may also repeat several times, which indicates that there are several business records.
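The name-based matching and the loop marks can be checked before conversion starts, as in the following sketch. The record layout (`table`, `field`, `loop`), the in-memory table representation, and the `ConfigError` name are hypothetical. Treating an empty table behind a 1..n node as an error is one reading of "must appear at least once"; the flow described for Fig. 5 may handle that case differently.

```python
class ConfigError(Exception):
    """Raised when the supplied tables do not match the configuration table."""


def validate(config_rows, tables):
    """Check that every mapped table and field exists and that 1..n nodes have data."""
    for rec in config_rows:
        table = tables.get(rec["table"])
        if table is None:
            raise ConfigError(f"no table named {rec['table']!r} was supplied")
        if rec["field"] not in table["fields"]:
            raise ConfigError(f"table {rec['table']!r} has no field {rec['field']!r}")
        if rec["loop"] == "1..n" and not table["rows"]:
            raise ConfigError(f"table {rec['table']!r} must contain at least one record")


CONFIG = [
    {"table": "Patient", "field": "NAME", "loop": "1..n"},
    {"table": "Allergy", "field": "SUBSTANCE", "loop": "0..n"},
]
TABLES = {
    "Patient": {"fields": ["NAME", "SEX"], "rows": [{"NAME": "Li", "SEX": "F"}]},
    "Allergy": {"fields": ["SUBSTANCE"], "rows": []},   # empty is allowed for 0..n
}
validate(CONFIG, TABLES)   # raises ConfigError on a mismatch, returns None otherwise
```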
CDS-to-XML module 304: used to convert, according to the configuration table, the business data held as two-dimensional relational tables in a CDS or CDS group of the business system into an XML file that conforms to the HL7 V3 standard template XML structure, and to output that file for use as an HL7 V3 format message.
Further, during CDS-to-XML conversion, a position code is used preferentially if the configuration table provides one. Every two digits of a position code form one segment, each segment corresponds to one level of the DOM tree, and the ordinal of the segment corresponds to the level number. For example, "010201" contains three segments: the first segment, 01, denotes the 1st node element of the first level; the second segment, 02, denotes the 2nd node element of the second level; and the third segment, 01, denotes the 1st node element of the third level. To locate a DOM node, the code is split into child-node levels by segment, the two digits of each segment are converted into a subscript into the child-node array of that level, and the node is located directly. For example, if the position code is 01020302, parsing starts from the root node, root, and locates the node directly as "root.item[01].item[02].item[03].item[02]". Then, according to the CDS name and field name mapped in the configuration table, the corresponding field value is read from the CDS of that name and assigned to the node element.
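The decoding of a position code can be sketched in a few lines of Python with the standard `xml.etree.ElementTree` module. Two assumptions are made here that the text leaves open: segments are treated as 1-based indices, and the first segment is taken to select the single top-level element of the document (ElementTree has no separate document node). The sample template is hypothetical.

```python
import xml.etree.ElementTree as ET


def decode_position_code(code):
    """Split a position code into two-digit segments, one per DOM level."""
    if not code or len(code) % 2 or not code.isdigit():
        raise ValueError(f"malformed position code: {code!r}")
    return [int(code[i:i + 2]) for i in range(0, len(code), 2)]


def locate(root, code):
    """Index straight down the tree; no searching or XPath evaluation is needed."""
    first, *rest = decode_position_code(code)
    if first != 1:
        raise LookupError("the first segment must select the single top-level element")
    node = root
    for index in rest:
        node = node[index - 1]   # segments are 1-based, Python indexing is 0-based
    return node


root = ET.fromstring(
    "<message><header/><body><patient><name/><sex/></patient></body></message>"
)
node = locate(root, "01020102")   # message -> 2nd child -> 1st child -> 2nd child
node.text = "F"
print(node.tag)
```

Each segment costs one array lookup, so the work is proportional to the depth of the node rather than to the size of the document.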
XML-to-CDS module 306: receives an incoming XML file of business data, reads the configuration table, reads the business data from the XML file according to the configuration, writes it into the corresponding fields of the CDS, and outputs the CDS file.
Further, during XML-to-CDS conversion, node elements in the XML are located according to the absolute paths, relative paths, and annotation identifiers in the configuration table. If the XML contains annotation identifiers, they are used preferentially, since they allow fast positioning. If the XML contains no annotation identifiers, only XPath paths can be used: a relative path is used preferentially if one exists, and otherwise the absolute path must be used, that is, the node is located by XPath parsing.
1. XML template configuration tool 302
Fig. 4 is a schematic flowchart of the XML template configuration tool of the embodiment shown in Fig. 3 configuring an XML template.
As shown in Fig. 4, the schematic flow in which the XML template configuration tool 302 configures an XML template includes:
Step 402: load the XML standard template file and initialize it as a DOM tree. Because the method of the present invention is based on XML template files, it is not applicable to dynamically structured XML that has no agreed standard structure.
Step 404: traverse the node elements of the DOM tree. On loading, the configuration tool automatically extracts the DOM structure and each node element, including tags, tag values, attributes, attribute values, processing instructions, and comments. The node elements are expanded level by level according to the tree structure, and each element is saved as one record. The tool automatically generates an absolute position code for each element according to the levels and element order of the DOM tree structure, and the level number and full XPath path of each element are extracted and stored in the record of the corresponding element.
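Step 404 can be illustrated with a pre-order traversal that emits one record per element. The sketch below is an assumption-laden simplification: it records only elements (not attributes, processing instructions, or comments), uses hypothetical record keys, numbers the top-level element as segment 01, and limits each level to 99 siblings because a segment has two digits.

```python
import xml.etree.ElementTree as ET


def build_records(root):
    """Return one record per element: absolute position code, level, and full path."""
    records = []

    def visit(node, code, path, level):
        records.append({"tag": node.tag, "code": code, "level": level, "xpath": path})
        seen = {}   # per-tag counters, so the path predicates are unambiguous
        for position, child in enumerate(node, start=1):
            if position > 99:
                raise ValueError("a two-digit segment cannot address more than 99 siblings")
            seen[child.tag] = seen.get(child.tag, 0) + 1
            child_path = f"{path}/{child.tag}[{seen[child.tag]}]"
            visit(child, f"{code}{position:02d}", child_path, level + 1)

    visit(root, "01", f"/{root.tag}[1]", 1)
    return records


root = ET.fromstring(
    "<message><header/><body><patient><name/><sex/></patient></body></message>"
)
records = build_records(root)
for rec in records:
    print(rec["code"], rec["level"], rec["xpath"])
```

The position code uses the child's index among all siblings, while the path predicate counts only siblings with the same tag, which is how XPath positional predicates work.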
Step 406: manually configure the XML data set nodes and field nodes, and configure the mapping relationship between the XML nodes and the CDS. If a data set node is set, a relative position code and a relative XPath path, both relative to that data set node, are generated automatically for every subordinate child node of the node and are also stored in the configuration table records.
Step 408: set the loop mark of the data set node. When a data set node has been set, its loop mark is marked: 0..n means that the node is optional and may repeat, and that it should be deleted if it has no content; 1..n means that the node is mandatory and may repeat. Duplicate tag names are common in XML, and especially in HL7 V3 standard XML: because tag names are rather abstract, different business meanings are often expressed through constraints on attribute values, and sometimes the business meaning of a tag node element also depends on the surrounding tags and their attribute values. For such complex expressions, an annotation identifier is added at the corresponding XML node. Annotation identifier names are required to be unique; adding them does not affect the semantic structure of the original standard XML, and they can indirectly simplify XPath and conditional positioning. After annotation identifiers have been added, the original position code order is affected, so the XML template file and the configuration table are refreshed and saved automatically.
Step 410: save the configuration table and the new XML template file.
2. CDS-to-XML module 304
Fig. 5 is a schematic flowchart of the CDS-to-XML module of the embodiment shown in Fig. 3 converting the format of a CDS file.
As shown in Fig. 5, the schematic flow in which the CDS-to-XML module 304 converts the format of a CDS file includes:
Step 502: initialize. Load the configuration table and cache it in memory as C; initialize the stacks DS and CS, the add list, and the delete list. Stack CS holds the pointers of nested nodes, and stack DS holds the record position labels of nested CDSs. The add list temporarily holds pointers to the XML node subtrees that need to be newly added when records are looped over, and the delete list temporarily holds pointers to the nodes that should be deleted from the XML. To avoid disturbing the structure and order of the original XML template during processing, the nodes in the two lists are handled only after all CDS-to-XML processing has finished and before the flow ends: the nodes in the delete list are deleted from the processed XML one by one, and the nodes in the add list are added to the XML one by one.
Step 504: receive the input parameter, CDS array B, read the business type from B, and filter C down to all configuration records related to that business type. The input parameter takes the form of a CDS array in order to support multiple two-dimensional relational tables (CDSs). The order of the CDS array is significant: the mapping relationships in the configuration table refer to the input CDSs by their ordinal number in the array.
Step 506: load the XML template file corresponding to the business type from the configuration, instantiate it as DOM tree D, and move configuration table C to its first record Cn (n=1).
Step 508: read the current record Cn of C.
Step 510: judge whether the rows of C have been exhausted. If so, perform step 538; otherwise go to step 512 or step 532.
Step 512: Cn is of the data set node type; if the CS stack is empty, push the position label of Cn onto CS.
Step 514: judge whether the DS stack is empty. If DS is empty, go to step 518; if DS is not empty, go to step 516.
Step 516: judge whether the level number of the current data set node is greater than or equal to the level number of the node referred to by the pointer at the top of the DS stack. If so, go to step 524; otherwise perform step 518.
Step 518: read the node position code and the CDS from Cn, locate the corresponding node element in D, and push the node element pointer onto CS; move the CDS to its first record, and push the CDS and its current record position label onto DS.
Step 520: judge whether the CDS is empty. If it is empty, go to step 522; if it is not empty, go to step 530 and move C to the next record.
Step 522: add the node pointer at the top of the CS stack to the delete list, pop DS and CS, and keep moving C to the next record until the next record whose level number is greater than or equal to the current level number is reached; then go to step 508.
Step 524: judge whether the records of the CDS at the top of the CS stack have been exhausted. If so, perform step 528; otherwise perform step 526.
Step 526: copy a new node from the object referred to by the pointer at the top of the DS stack, add the pointer of the new node to the add list, move C back to the record at the label position on top of the CS stack, and then go to step 530.
Step 528: if DS and CS are not empty, pop DS and pop CS, and then go to step 530, that is, move C to the next record.
Step 530: move C to the next record.
Step 532: Cn is of the record set node type.
Step 534: preferentially use the position code (relative position code, then absolute position code) to quickly locate the corresponding node element in D; if there is no position code, locate the node by XPath path (relative path, then absolute path).
Step 536: read the field value from the CDS according to the configuration in Cn, write it into the node element or attribute value located in D, and then go to step 530 to move C to the next record.
Step 538: empty the stacks DS and CS, and process the add list and the delete list. If the add list is not empty, traverse it and add the nodes that its pointers refer to into D one by one; if the delete list is not empty, delete the corresponding nodes from D one by one. Empty both lists, save D as the XML output, and exit.
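The reason for deferring additions and deletions can be seen in the following simplified sketch. It is not the stack-based flow of Fig. 5: nesting is left out, each data set is given directly as a position code plus a list of rows, and field names are assumed to equal child tag names. What it does preserve is the idea of steps 502 and 538, namely that nodes are only queued while the template is being filled, so that position codes stay valid, and the queued changes are applied at the end. All names and sample data are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET


def locate(root, code):
    """Resolve a position code; the first segment ("01") is the root element itself."""
    node = root
    for start in range(2, len(code), 2):
        node = node[int(code[start:start + 2]) - 1]
    return node


def fill(node, row):
    """Write row values into the child elements whose tags match the field names."""
    for child in node:
        if child.tag in row:
            child.text = row[child.tag]


def convert(root, datasets):
    parent_of = {child: parent for parent in root.iter() for child in parent}
    to_add, to_delete = [], []
    for code, rows in datasets:
        node = locate(root, code)        # still valid: the tree has not been modified
        if not rows:
            to_delete.append(node)       # no data for this node: queue it for removal
            continue
        previous = node
        for row in rows[1:]:
            clone = copy.deepcopy(node)  # copy the still-empty template node
            fill(clone, row)
            to_add.append((parent_of[node], previous, clone))
            previous = clone
        fill(node, rows[0])
    for parent, previous, clone in to_add:       # apply the queued changes at the end
        parent.insert(list(parent).index(previous) + 1, clone)
    for node in to_delete:
        parent_of[node].remove(node)
    return root


root = ET.fromstring("<message><allergy><name/></allergy><order><drug/></order></message>")
convert(root, [
    ("0101", []),                                            # empty table
    ("0102", [{"drug": "aspirin"}, {"drug": "ibuprofen"}]),  # two records
])
print(ET.tostring(root, encoding="unicode"))
```

Had `<allergy>` been removed as soon as its table was found to be empty, `<order>` would have moved to position 0101 and the code 0102 from the configuration would no longer have resolved.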
3. XML-to-CDS module 306
Fig. 6 is a schematic flowchart of the XML-to-CDS module of the embodiment shown in Fig. 3 converting the format of an XML file.
As shown in Fig. 6, the schematic flow in which the XML-to-CDS module 306 converts the format of an XML file includes:
Step 602: receive the incoming parameter XML and initialize it as a DOM tree; according to the business type of the XML, initialize the corresponding CDS.
Step 604: load the XML configuration table, filter out the configuration records for the business type, and, starting from the first record, read the configuration records one by one.
Step 606: judge whether the configuration table record contains an annotation identifier. If it does, perform step 608; if not, perform step 612.
Step 608: judge whether the annotation identifier is present in the DOM. If it is, perform step 610; if not, perform step 612.
Step 610: retrieve the DOM node element by the annotation identifier, read the value of the node element, write the value into the corresponding field of the corresponding CDS according to the mapping relationship in the configuration table record, and perform step 614.
Step 612: locate the DOM node element by XPath path, preferring the relative path if one is configured; read the value of the node element, write it into the corresponding field of the corresponding CDS, and perform step 614.
Step 614: save the CDS, output and return it, and end.
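Steps 604 to 614 can be approximated by the sketch below, which processes a single output row. Several things are assumed rather than taken from the text: the CDS is modelled as a plain dictionary, the annotation identifier is an XML comment placed directly before the element it marks, the configuration record keys are hypothetical, and paths are written in the limited XPath subset that Python's `xml.etree.ElementTree` supports (relative to the root element).

```python
import xml.etree.ElementTree as ET


def parse_keeping_comments(xml_text):
    """Parse XML while preserving comment nodes (Python 3.8 or later)."""
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    return ET.fromstring(xml_text, parser=parser)


def find_by_annotation(root, ident):
    """Return the element that directly follows the comment `ident`, or None."""
    for parent in root.iter():
        children = list(parent)
        for comment, following in zip(children, children[1:]):
            if (comment.tag is ET.Comment
                    and (comment.text or "").strip() == ident
                    and isinstance(following.tag, str)):
                return following
    return None


def xml_to_row(xml_text, config):
    root = parse_keeping_comments(xml_text)
    row = {}
    for record in config:
        node = None
        if record.get("annotation"):                 # steps 606 and 608
            node = find_by_annotation(root, record["annotation"])
        if node is None:                             # step 612: fall back to the path
            node = root.find(record["xpath"])
        if node is None:
            raise LookupError(f"cannot locate node for field {record['field']!r}")
        attribute = record.get("attribute")
        row[record["field"]] = node.get(attribute) if attribute else (node.text or "")
    return row


MESSAGE = """<message>
  <!--patient-name--><value>Li</value>
  <value code="F"/>
</message>"""
CONFIG = [
    {"field": "NAME", "annotation": "patient-name", "xpath": "./value[1]"},
    {"field": "SEX", "annotation": None, "xpath": "./value[2]", "attribute": "code"},
]
row = xml_to_row(MESSAGE, CONFIG)
print(row)
```

A record whose annotation identifier is configured but missing from the incoming XML falls through to the path lookup, which mirrors the branch from step 608 to step 612.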
The technical solution of the present invention has been described in detail above with reference to the accompanying drawings. The present invention proposes a document format conversion system that, when converting a two-dimensional relational table into an XML file, uses position identifiers to accelerate the addressing of nodes, which helps improve conversion efficiency, especially when the XML document has a large amount of content, deep nesting, and complex conditions.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent substitution, improvement, and the like made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (7)

  1. A document format conversion method, characterized by comprising:
    obtaining an XML template file having the same format as a target XML file, and a standard two-dimensional relational table having the same format as a two-dimensional relational table to be processed; recording the element information of each node in the DOM tree corresponding to the XML template file into a configuration table; and storing the mapping relationship between each node in the DOM tree and the parameters in the standard two-dimensional relational table into the configuration table;
    generating a corresponding position identifier for at least one node in the DOM tree, and storing the position identifier in association with the corresponding node;
    filling, according to the configuration table, the parameters in the two-dimensional relational table to be processed into the XML template file to generate the target XML file, wherein the corresponding node in the XML template file is addressed according to the position identifier;
    wherein the process of generating the position identifier comprises:
    generating the position identifier according to the hierarchical relationship between any node in the DOM tree and the other nodes in the DOM tree, and the positional relationship between that node and the other nodes in the level to which it belongs, and storing the position identifier into the configuration table.
  2. The document format conversion method according to claim 1, characterized in that the position identifier comprises a character string composed of at least one digit segment, to represent the path from the root node to the given node;
    wherein the position of each digit segment in the character string represents the level of the DOM tree at which the current node on the path is located, and the value of each digit segment represents the position of the current node within the level to which it belongs.
  3. The document format conversion method according to claim 1, characterized in that the process of generating the position identifier further comprises:
    when a nesting relationship exists among nodes at multiple levels of the DOM tree, generating the position identifier for the parent node among the nodes of the multiple levels, and, according to the relative hierarchical relationship and relative positional relationship between the other nodes among the nodes of the multiple levels and the parent node, generating corresponding relative position identifiers to serve as the position identifiers of the other nodes among the nodes of the multiple levels.
  4. The document format conversion method according to claim 1, characterized in that the process of generating the position identifier further comprises:
    generating a unique annotation identifier for at least one node, to serve as the position identifier; and
    inserting the annotation identifier into the XML template file at the position corresponding to the at least one node, so as to establish an association with the at least one node.
  5. The document format conversion method according to any one of claims 1 to 4, characterized in that the process of generating the mapping relationship further comprises:
    when a node that can repeat exists in the XML template file, setting a corresponding loop mark in the mapping relationship.
  6. The document format conversion method according to any one of claims 1 to 4, characterized in that the process of generating the mapping relationship further comprises:
    when a nesting relationship exists among nodes at multiple levels of the DOM tree, marking nesting condition information in the mapping relationship.
  7. The document format conversion method according to any one of claims 1 to 4, characterized by further comprising:
    generating an XML template file and a corresponding configuration table for each of at least one business type; and
    reading the business type identifier in the two-dimensional relational table to be processed, and obtaining the XML template file and configuration table corresponding to that business type identifier, for use in converting the format of the two-dimensional relational table to be processed.
CN201310596651.7A 2013-11-22 2013-11-22 Document format conversion method Active CN103559322B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310596651.7A CN103559322B (en) 2013-11-22 2013-11-22 Document format conversion method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310596651.7A CN103559322B (en) 2013-11-22 2013-11-22 Document format conversion method

Publications (2)

Publication Number Publication Date
CN103559322A CN103559322A (en) 2014-02-05
CN103559322B true CN103559322B (en) 2017-11-17

Family

ID=50013568

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310596651.7A Active CN103559322B (en) 2013-11-22 2013-11-22 Document format conversion method

Country Status (1)

Country Link
CN (1) CN103559322B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR3021787B1 (en) * 2014-05-30 2023-08-18 Amadeus Sas CONTENT MANAGEMENT SYSTEM
US10049329B2 (en) 2014-05-30 2018-08-14 Amadeus S.A.S. Content exchange with a travel management system
US10042871B2 (en) 2014-05-30 2018-08-07 Amadeaus S.A.S. Content management in a travel management system
CN104850591B (en) * 2015-04-24 2019-03-19 百度在线网络技术(北京)有限公司 A kind of the conversion storage method and device of data
CN107015949B (en) * 2016-12-31 2020-11-06 苏州市环亚数据技术有限公司 Medical data standard conversion method
CN106951399B (en) * 2017-03-23 2020-05-19 北京捷成世纪科技股份有限公司 Method and device for quickly generating ONIX standard file
CN107423322B (en) * 2017-03-31 2020-03-03 广州视源电子科技股份有限公司 Method and device for displaying label nesting hierarchy of webpage
CN108763546A (en) * 2018-05-31 2018-11-06 北京五八信息技术有限公司 A kind of conversion method of data format, device, storage medium and terminal
CN110222319A (en) * 2019-06-19 2019-09-10 北京百度网讯科技有限公司 Method and apparatus for mining data
CN110795444B (en) * 2019-10-25 2022-12-02 北京小米移动软件有限公司 DOM data updating method, page updating method and device
CN111950247A (en) * 2020-07-08 2020-11-17 北京明略软件系统有限公司 Configuration-based Word document generation method
CN111858472B (en) * 2020-08-03 2023-09-05 深圳赛安特技术服务有限公司 File format conversion method, device, computer equipment and storage medium
CN112100316A (en) * 2020-09-16 2020-12-18 北京天空卫士网络安全技术有限公司 Data management method and device
CN115221113A (en) * 2021-04-20 2022-10-21 华为技术有限公司 Data format conversion method, device, equipment and computer readable storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1883037A2 (en) * 2006-07-26 2008-01-30 Xerox Corporation Graphical syntax analysis of tables through tree rewriting
CN101266595A (en) * 2008-05-09 2008-09-17 北京泰得思达科技发展有限公司 Electronic bid applied system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4291999B2 (en) * 2002-01-18 2009-07-08 株式会社インターネットディスクロージャー Document creation system and creation management program

Also Published As

Publication number Publication date
CN103559322A (en) 2014-02-05

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
ASS Succession or assignment of patent right

Owner name: PKU HEALTHCARE IT CO., LTD.

Free format text: FORMER OWNER: FOUNDER INTERNATIONAL CO., LTD.

Effective date: 20150203

Free format text: FORMER OWNER: FOUNDER INTERNATIONAL (BEIJING) CO., LTD.

Effective date: 20150203

C41 Transfer of patent application or patent right or utility model
COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 215123 SUZHOU, JIANGSU PROVINCE TO: 100080 HAIDIAN, BEIJING

TA01 Transfer of patent application right

Effective date of registration: 20150203

Address after: 100080, No. 19, No. 52 West Fourth Ring Road, Beijing, Haidian District

Applicant after: Peking University Medical Information Technology Co.,Ltd.

Address before: Suzhou City, Jiangsu Province, Suzhou Industrial Park 215123 Xinghu Street No. 328 Creative Industry Park founder International Building

Applicant before: FOUNDER INTERNATIONAL Co.,Ltd.

Applicant before: Founder International Co.,Ltd. (Beijing)

GR01 Patent grant
GR01 Patent grant
PP01 Preservation of patent right

Effective date of registration: 20240202

Granted publication date: 20171117

PP01 Preservation of patent right