WO2007144853A2 - Method and device for performing customized parsing of an XML document according to an application
- Publication number
- WO2007144853A2 (application PCT/IB2007/052306)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- tag
- son
- parsing
- node
- character string
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/14—Tree-structured documents
- G06F40/143—Markup, e.g. Standard Generalized Markup Language [SGML] or Document Type Definition [DTD]
Definitions
- The present invention generally relates to the field of document data processing, and in particular to a novel XML document parsing method, different from the SAX (Simple API for XML) and DOM (Document Object Model) approaches widely used at present, and to a parsing apparatus thereof. The method parses according to customized application requirements (for example, parsing requirements customized by the user) and improves XML document parsing performance.
- SAX: Simple API for XML
- DOM: Document Object Model
- XML: Extensible Markup Language
- XML is a structured document markup language consisting mainly of tags, tag values, attributes, attribute values, processing instructions, comments, and the like.
- XML is now widely applied in fields such as data storage and data communication because of its openness, extensibility, and rigorous grammar.
- SAX performs parsing in an event-triggered way. The XML document 100 is input, and the SAX parser 101 reads data from it in sequence. When the parser finds an occurrence of specific symbols (such as the start or end of a tag), it notifies the application layer module 103 in the form of an event. When processing in the application layer is completed, control returns to the SAX parser, which continues with the subsequent processing until the end of the document is reached or the application layer requests termination of the parsing procedure.
- DOM parses the XML document 100 into an object tree 104 held in memory.
- Each node of the object tree maps to a certain part of the XML document, and the hierarchy of the nodes exactly reflects the structure of the XML document. Random operations on the XML document are thereby converted into operations on the DOM tree.
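The contrast between the two prior-art approaches can be sketched with Python's standard library, used here purely for illustration (the patent itself refers to C++/Java). The sample document `DOC`, the handler class `EventRecorder`, and the helper names are assumptions introduced for this sketch and do not come from the source.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document, not taken from the patent.
DOC = "<objectslist><book id='1'>XML</book><book id='2'>SAX</book></objectslist>"


class EventRecorder(xml.sax.ContentHandler):
    """SAX style: the parser calls back into the application for each tag."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))


def sax_events(text):
    """Stream through the document once and return the events that fired."""
    handler = EventRecorder()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.events


def dom_book_ids(text):
    """DOM style: build the whole tree first, then access it at random."""
    dom = minidom.parseString(text)
    return [book.getAttribute("id") for book in dom.getElementsByTagName("book")]
```

The SAX path keeps no tree at all, so the application must track any state it needs, while the DOM path materializes every node whether or not the application uses it. That trade-off is what the customized parsing described below aims to soften.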
- Advantages and disadvantages of the two technologies are listed in Table 1. It can be seen from Table 1 that DOM is superior to SAX apart from its somewhat lower performance. DOM has therefore become the mainstream XML parser at present and has been widely applied.
- An object of the present invention is to provide an extendable API which not only offers powerful XML data processing capability similar to DOM (such as random access to the XML document data, and inquiry, modification, and deletion of the XML document data), but also improves, as far as possible, the speed of parsing the XML document and of inquiring the XML data, thereby enhancing the data processing performance of the computer.
- A technical solution according to the present invention is as follows: the user is provided with a SpeedXMLParser interface in which the method ParseInstruction of the XML parsing process takes a user-customized parameter, instruction. This parameter indicates which tags, and which attribute values thereof, need to be parsed for the application. The parser parses only the part of the XML document specified by the application and ignores the other parts.
- Figure 2 illustrates differences between parsing trees of the SpeedXMLParser and the DOM.
- According to one aspect, a document parsing method performs customized parsing on an XML document based on customized application requirements, the method comprising the following steps: determining a parsing range of said XML document based on the customized application requirements; and parsing the XML document based on the determined parsing range to obtain, from the XML document, the information matching the application requirements.
- According to another aspect, a document parsing apparatus performs customized parsing on an XML document based on customized application requirements, the apparatus comprising: a parsing instruction tree generation unit for generating, based on the customized application requirements, the parsing instruction tree required to parse the XML document; and a document parsing unit for parsing the XML document based on the instruction tree generated by the parsing instruction tree generation unit to obtain, from the XML document, the information matching the application requirements.
- The parsing method and parsing apparatus parse a document that meets certain syntax rules according to the customized application requirements, thereby increasing parsing efficiency and improving the data processing performance of the computer.
- Figure 1 is a principle diagram showing the technical implementation of SAX and DOM in the prior art;
- Figure 2 is a diagram showing the differences between a parsing tree created by the parser SpeedXMLParser of a parsing method according to the present invention and a parsing tree created by DOM;
- Figure 3 is a schematic diagram showing the application program interface provided by the parser SpeedXMLParser of the parsing method according to the present invention;
- Figure 4 is a schematic diagram showing the process of parsing the XML document by using the parser SpeedXMLParser in the parsing method according to the present invention;
- Figure 5 is a diagram showing the definition of UserInstruction and the structure of the UserInstruction tree in the parsing method according to the present invention;
- Figure 6 is a flowchart showing the process of creating the UserInstruction tree in the parsing method according to the present invention;
- Figures 7 and 8 are detailed flowcharts showing the process of parsing the XML document by using the parser SpeedXMLParser in the parsing method according to the present invention;
- Figure 10 is a block diagram showing the schematic structure of the parsing apparatus for implementing the parsing method according to the present invention.
- As shown in Fig. 2, given that there are 18 kinds of tags in the input XML document, DOM parses them completely into a DOM tree structure 200 (one tag may of course have a plurality of values, which is not shown in the figure). In practice, however, a given application, or a given module of the application, only needs the data of three kinds of tags: 0, 4, and 6.
- By applying the parser SpeedXMLParser, which implements the parsing method according to the present invention, the input XML document can instead be parsed into a tree structure 201.
- In the method according to the present invention, the tags below tags 3, 5, and 7 of the initial DOM tree structure 200 are each parsed into a single node rather than being further parsed into a son tree, which significantly reduces the parsing workload and improves parsing efficiency.
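The saving can be made concrete by counting nodes. The sketch below is only an illustration: the tree `FULL` and the request `WANTED` are hypothetical and do not reproduce the 18-tag layout of Fig. 2, and `prune` models the outcome of customized parsing rather than the parser itself.

```python
def count_nodes(tree):
    """A tree is a pair (tag name, list of son trees)."""
    _name, sons = tree
    return 1 + sum(count_nodes(son) for son in sons)


def prune(tree, wanted):
    """Keep sons only below tags the application asked for.

    `wanted` is a nested dict of requested tag names; None means the tag was
    not requested, so everything below it collapses into this single node.
    """
    name, sons = tree
    if wanted is None:
        return (name, [])
    return (name, [prune(son, wanted.get(son[0])) for son in sons])


# Hypothetical document shape: 9 nodes in total.
FULL = ("root", [
    ("a", [("a1", []), ("a2", [("a21", [])])]),
    ("b", [("b1", []), ("b2", [])]),
    ("c", []),
])

# The application asks only for the sons of "root" and of "a".
WANTED = {"a": {}}
```

With these inputs the pruned tree has 6 nodes instead of 9: `b` and `a2` are still present, but each is a single node whose content is left unparsed.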
- Figure 3 illustrates the provided application program interface in UML (Unified Modeling Language), in which class String is a character string class, InputStringStream is a character string input stream, and ElementList is a linked list of Element, that is, a particular application of a linked list class. Similarly, AttributeList is a linked list of Attribute.
- The string class, linked list class, and character string input stream are supported by standard object-oriented programming languages such as C++ and Java.
- SpeedXMLParser is an entry of parsing XML document. Variables are defined as follows:
- Element is used to save a node of the SpeedXMLParser tree, which can be a leaf node or the root node of a son tree, and corresponds to a tag of the XML document.
- Definition of variable 'Element' is as follows:
- Attribute is used to save an attribute value of a certain tag.
- Definition of variable 'Attribute' is as follows:
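The variable definitions themselves did not survive in this text, so the following is a hedged reconstruction in Python rather than the patent's own declarations. The field names mirror data elements mentioned later in the description (Name, IfParseAttr, Attribute, IfParseValue, Value); the types, the use of a `dict` as the "fast index table", and the helper methods `add_son` and `sons` are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass
class Attribute:
    """One attribute of a tag (field names assumed)."""
    name: str
    value: str


@dataclass
class Element:
    """One node of the SpeedXMLParser tree, corresponding to one XML tag."""
    name: str
    if_parse_attr: bool = False
    attribute: Optional[List[Attribute]] = None
    if_parse_value: bool = False
    # Value is NULL, a raw character string, or an index table that maps a
    # son tag name to the list of son Elements carrying that name.
    value: Union[None, str, Dict[str, List["Element"]]] = None

    def add_son(self, son: "Element") -> None:
        """Register a parsed son node in this node's index table."""
        if not isinstance(self.value, dict):
            self.value = {}
        self.if_parse_value = True
        self.value.setdefault(son.name, []).append(son)

    def sons(self, tag: str) -> List["Element"]:
        """Look up all son nodes with the given tag name."""
        return self.value.get(tag, []) if isinstance(self.value, dict) else []
```

Keying the index table by tag name is what would make a lookup such as `root.sons("book")` a single hash access instead of a walk over all sons.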
- Pseudo-code with which an application operates on the XML document data is as follows:
- SpeedXMLParser* parser = new SpeedXMLParser(); parser->ParseInstruction(instream, "/objectslist/(book<a>),((computer<a>)/(configuration))", HASH_MODE);
- rootTag is the root tag.
- sonTag is a son tag, which can itself be recursive.
- The marker <a> requests the parser to parse the attribute values of the tag it follows.
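To make the meaning of the instruction string concrete, the sketch below writes out by hand the tree that "/objectslist/(book<a>),((computer<a>)/(configuration))" appears to denote and lists the paths it requests. The tuple layout and the function `requested_paths` are assumptions for illustration; they are not part of the SpeedXMLParser interface.

```python
# Each node: (tag name, parse attributes?, list of son nodes).
# Hand-written reading of "/objectslist/(book<a>),((computer<a>)/(configuration))".
INSTRUCTION_TREE = ("objectslist", False, [
    ("book", True, []),
    ("computer", True, [
        ("configuration", False, []),
    ]),
])


def requested_paths(node, prefix=""):
    """List every tag path the application asked the parser to descend into."""
    name, need_attr, sons = node
    path = f"{prefix}/{name}"
    paths = [path + (" [+attributes]" if need_attr else "")]
    for son in sons:
        paths.extend(requested_paths(son, path))
    return paths
```

Under this reading, `book` and `computer` are parsed together with their attributes, `configuration` is parsed without attributes, and any other tag under `objectslist` is left as an unparsed single node.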
- Figure 4 is a schematic diagram showing the process of parsing the XML document by using the parser SpeedXMLParser in the parsing method according to the present invention. As shown in Fig. 4, the parsing process includes two main steps:
- Step S400 parses the parameter 'instruction' of the interface and creates a tree 'UserInstruction';
- Step S401 performs the customized parsing process on the XML document according to the tree 'UserInstruction' created in step S400, and creates a memory tree structure 'SpeedXMLParser'.
- Figure 5 is a diagram showing the definition of UserInstruction and the structure of the UserInstruction tree in the parsing method according to the present invention.
- Block 500 illustrates the definition of 'UserInstruction':
- Block 501 illustrates the structure of the tree UserInstruction generated through parsing when the parameter instruction is equal to
- Following step S600, the value of the son character string read out is determined (step S601).
- If the value of the son character string read out is null, or is neither in the format "/tag" nor a character string beginning with "/tag/", the process exits with an error (step S602).
- If the son character string begins with "/tag/", a root node is created for the tree UserInstruction: the data element 'TagName' of the root node is set to "tag", IfNeedParseAttr is set to 'false', and Son is made to point to a newly created UserInstructionList. The remaining character string of the instruction is then used as the input parameter (step S604), and the son nodes of each further level of the tree UserInstruction are created one level at a time (step S605).
- A particular method of creating the son nodes of the tree UserInstruction one level at a time in step S605 is as follows:
- Matching of '(' and ')' means that: a) the separated son character string must begin with '(' and end with ')', and the character string between that '(' and ')' must contain the same number of '(' and ')'; b) the character string between each internal pair of '(' and ')' must itself be matched.
- If the son character string does not contain the character '/', the process proceeds to step 1.2.1); if the son character string contains the character '/', the process re-enters step 1);
- If the son character string does not begin with '(', it is determined whether it is in the format "tagString" or is a character string beginning with "tagString/";
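The creation of the UserInstruction tree can be sketched as a small recursive routine. Because several steps of Fig. 6 are missing from this text, the grammar below is an assumption inferred from the single example instruction: son specifications are separated by top-level commas, each may be wrapped in matched parentheses, and a "/" introduces the next level. The names `TagName`, `IfNeedParseAttr`, and `Son` come from the description; everything else is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UserInstruction:
    tag_name: str                      # TagName
    if_need_parse_attr: bool = False   # IfNeedParseAttr
    son: List["UserInstruction"] = field(default_factory=list)  # Son


def matching_paren(s: str, start: int) -> int:
    """Return the index of the ')' that closes the '(' at s[start]."""
    depth = 0
    for i in range(start, len(s)):
        if s[i] == "(":
            depth += 1
        elif s[i] == ")":
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unbalanced parentheses in instruction")


def split_top_level(s: str) -> List[str]:
    """Split on commas that are not nested inside parentheses."""
    parts, depth, begin = [], 0, 0
    for i, ch in enumerate(s):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced parentheses in instruction")
        elif ch == "," and depth == 0:
            parts.append(s[begin:i])
            begin = i + 1
    if depth != 0:
        raise ValueError("unbalanced parentheses in instruction")
    parts.append(s[begin:])
    return parts


def make_node(spec: str) -> UserInstruction:
    """Turn 'tag' or 'tag<a>' into a node; '<a>' requests attribute parsing."""
    need_attr = spec.endswith("<a>")
    name = spec[:-3] if need_attr else spec
    if not name or any(c in "()/,<>" for c in name):
        raise ValueError(f"bad tag specification: {spec!r}")
    return UserInstruction(name, need_attr)


def parse_sons(s: str) -> List[UserInstruction]:
    """Create the son nodes of one level, recursing for deeper levels."""
    nodes = []
    for part in split_top_level(s):
        # Drop parentheses that wrap the whole specification.
        while part.startswith("(") and matching_paren(part, 0) == len(part) - 1:
            part = part[1:-1]
        if part.startswith("("):          # form "(tag)/rest"
            close = matching_paren(part, 0)
            head, rest = part[1:close], part[close + 1:]
            if not rest.startswith("/"):
                raise ValueError(f"expected '/' after {head!r}")
            rest = rest[1:]
        else:                             # form "tag" or "tag/rest"
            head, _, rest = part.partition("/")
        node = make_node(head)
        if rest:
            node.son = parse_sons(rest)
        nodes.append(node)
    return nodes


def build_user_instruction(instruction: str) -> UserInstruction:
    """Accept "/tag" or "/tag/..." and reject anything else."""
    if not instruction or not instruction.startswith("/"):
        raise ValueError("instruction must start with '/tag'")
    root_name, _, rest = instruction[1:].partition("/")
    root = make_node(root_name)
    root.if_need_parse_attr = False       # the root is created with 'false'
    if rest:
        root.son = parse_sons(rest)
    return root
```

Calling `build_user_instruction("/objectslist/(book<a>),((computer<a>)/(configuration))")` yields a root `objectslist` with sons `book` and `computer`, both with attribute parsing requested, and `configuration` below `computer`.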
- The valid character string read out is analyzed (step S701). If it is not a start tag, the process exits with an error (step S703). If it is a start tag, it is further determined whether the data element TagName of the root node of the tree UserInstruction is consistent with the name of the start tag read out (step S702). If they are not consistent, the process exits with an error (step S703). If they are consistent, the member Root is made to point to a newly created root node, whose data element Name is set to the name of the start tag, IfParseAttr to false, and Attribute to NULL. It is then determined whether the start tag of this root node is an empty tag (step S704).
- The next valid son character string read out is then examined (step S707). If it is a tag value, meaning that the character stream begins with "<rootTag>rootValue", the data element IfParseValue of the root node is set to false and Value is made to point to this tag value character string. The next valid son character string is then read from the stream instream, and it is determined whether it is the end tag of the root tag. If it is, parsing ends and exits normally, and any subsequent character stream not yet processed from instream is ignored. If it is not the end tag of the root tag, the process exits with an error (step S708).
- If the valid son character string read out in step S707 is an end tag, meaning that the character stream begins with "<rootTag></rootTag>", it is determined whether it is the end tag of the root node. If it is, the data element IfParseValue of the root node is set to false and Value to NULL, parsing ends, and any subsequent character stream not yet processed from instream is ignored. If it is not the end tag of the root node, the process exits with an error (step S709).
- If the valid son character string read out in step S707 is a start tag, meaning that the character stream begins with "<rootTag><subTag>", the data element IfParseValue of the root node is set to true and Value is made to point to a newly created fast index table (the type of the fast index table is indicated by the variable Mode of SpeedXMLParser, the same hereafter). Valid son character strings are then read in a loop from the character stream of the parameter instream so as to generate the son nodes of each further level (step S710).
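The branching on the root tag can be sketched as a classifier over a simple token stream. This is a minimal illustration under stated assumptions: the regular-expression tokenizer ignores the XML prolog, comments, CDATA, and processing instructions; the function names are hypothetical; and the handling of an empty root tag is a guess, since the steps following S704 are not present in this text.

```python
import re

# A token is either a tag (start, end, or empty) or a run of text.
TOKEN = re.compile(r"<(/?)([A-Za-z_][\w.-]*)([^<>]*?)(/?)>|([^<]+)")


def tokens(text):
    """Yield ("start"|"end"|"empty", tag name) or ("value", text)."""
    for m in TOKEN.finditer(text):
        closing, name, _attrs, empty, value = m.groups()
        if value is not None:
            if value.strip():            # skip whitespace between tags
                yield ("value", value.strip())
        elif closing:
            yield ("end", name)
        elif empty:
            yield ("empty", name)
        else:
            yield ("start", name)


def root_shape(text, expected_root):
    """Classify the document root along the lines of steps S701 to S710."""
    stream = tokens(text)
    kind, name = next(stream, (None, None))
    if kind not in ("start", "empty"):
        raise ValueError("document does not begin with a start tag")   # S703
    if name != expected_root:
        raise ValueError("root tag does not match the instruction")    # S702, S703
    if kind == "empty":
        return "empty"                   # assumed outcome after S704
    kind, payload = next(stream, (None, None))
    if kind == "value":                  # "<rootTag>rootValue"
        if next(stream, None) != ("end", expected_root):
            raise ValueError("missing end tag of the root")            # S708
        return "value"
    if kind == "end":                    # "<rootTag></rootTag>"
        if payload != expected_root:
            raise ValueError("wrong end tag for the root")             # S709
        return "null"
    if kind in ("start", "empty"):       # "<rootTag><subTag>"
        return "subtree"                 # son nodes are generated next (S710)
    raise ValueError("unexpected end of input")
```

The four return values correspond to the four states the root node can end up in: no content, a plain character string in Value, NULL in Value, or a fast index table that is filled while the son nodes are generated.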
- Figure 8 is a flowchart showing a particular process of generating the son nodes of each level.
- Step S805 follows step S710 and is described in detail as follows: 1) a valid son character string is read sequentially from the character stream of the parameter instream (step S800);
- It is determined whether the son character string read out is an end tag (step S801). If it is, it is further determined in step S802 whether it is the end tag of the root tag. If it is the end tag of the root tag, parsing ends and exits normally in step S804, and any subsequent character stream not yet processed from instream is ignored. If it is not the end tag of the root tag, the process exits with an error in step S803.
- If the son character string read out is not an end tag (step S801), it is determined whether it is a start tag (step S805). If it is not a start tag, the process exits with an error (step S809). If it is a start tag, it is determined whether such a tag exists at the same level of the tree UserInstruction (step S806).
- If it is determined in step S806 that the tag does not exist at the same level of the tree UserInstruction, which means that the application does not require further parsing of the son tags of the present tag, it is further determined in step S807 whether the son character string read out is an empty tag.
- If it is, a son node is created whose attributes and son tags need not be parsed. The son node is added to the ElementList corresponding to the tag in the fast index table indicated by Value of the upper-level node (every newly created node goes through this process, the same hereafter), Value of this node is set to NULL, the parsing process of the present node ends (step S810), and the process returns to the parsing process of the next upper-level node (step S800). If it is determined in step S807 that the son character string read out is not an empty tag, the stream is scanned sequentially for an end tag corresponding to the start tag (step S808).
- If no end tag corresponding to the start tag is found, the process exits with an error (step S809). If an end tag corresponding to the start tag is found, a son node is created whose attributes and son tags need not be parsed, and Value of this node is set to the character string between the start tag and the end tag. For example, if the tag currently being parsed is "camera" and the character string currently to be parsed is "<camera><manufacturer>Kodak</manufacturer><model>DX6490</model></camera>", then Value of this node holds the character string "<manufacturer>Kodak</manufacturer><model>DX6490</model>". The parsing process of the present node then ends (step S811) and the process returns to the parsing process of the next upper-level node (step S800).
- If it is determined in step S806 that the tag exists at the same level of the tree UserInstruction, which means that the application requires further parsing of the son tags and attributes of this tag, the matched UserInstruction object is taken out.
- It is then determined whether the son character string read out is an empty tag (step S812). If it is an empty tag, a son node is created, it is determined according to the UserInstruction object whether the attributes of this tag are to be parsed, the parsing process of the present node ends (step S813), and the process returns to the parsing process of the next upper-level node (step S800).
- If it is determined in step S815 that the valid son character string read out is the value of the tag, Value of this node is set to point to the value of the tag, the next valid son character string is read from the stream instream, and it is determined whether it is the end tag of the present son node. If it is, the parsing process of the present node ends and the process returns to the parsing process of the next upper-level node (step S800). If it is not the end tag of the present son node, the process exits with an error (step S816).
- If it is determined in step S815 that the valid son character string read out is an end tag, which means that the value of the tag is NULL, it is determined whether it is the end tag of the present son node. If it is, the parsing process of the present node ends, Value of the present node is set to NULL, and the process returns to the parsing process of the next upper-level node. If it is not, the process exits with an error (step S817).
- If it is determined in step S815 that the valid son character string read out is a start tag, which means entering the parsing process of a son node, Value of the present son node is set to point to a newly created empty fast index table, and the parsing process of the next lower-level son node of the present node is performed (steps S818 and S806).
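The son-node generation described above can be sketched end to end. This is a simplified illustration, not the patented implementation: it works on an in-memory string rather than a stream, uses regular expressions that ignore comments, CDATA, the XML prolog, and single-quoted attributes, drops text mixed between son tags, and counts nesting when scanning for the matching end tag (the description only says the end tag is scanned for sequentially). A `dict` stands in for the fast index table; all function names and the sample document are assumptions.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union


@dataclass
class UserInstruction:                 # one node of the instruction tree
    tag_name: str
    if_need_parse_attr: bool = False
    son: List["UserInstruction"] = field(default_factory=list)


@dataclass
class Element:                         # one node of the resulting memory tree
    name: str
    attribute: Optional[Dict[str, str]] = None
    value: Union[None, str, Dict[str, List["Element"]]] = None


TAG = re.compile(r"<(/?)([A-Za-z_][\w.-]*)([^<>]*?)(/?)>")
ATTR = re.compile(r'([\w.:-]+)\s*=\s*"([^"]*)"')


def find_end(text, pos, name):
    """Scan forward for the end tag that closes <name> (compare S808)."""
    depth = 1
    for m in TAG.finditer(text, pos):
        if m.group(2) != name or m.group(4):
            continue                   # other tags and empty tags do not count
        depth += -1 if m.group(1) else 1
        if depth == 0:
            return m
    raise ValueError(f"no end tag for <{name}>")


def parse_element(text, start, instr):
    """Parse one element; `instr` is None when the tag was not requested."""
    name = start.group(2)
    elem = Element(name)
    if instr is not None and instr.if_need_parse_attr:
        elem.attribute = dict(ATTR.findall(start.group(3)))
    if start.group(4):                 # empty tag: Value stays NULL
        return elem, start.end()
    if instr is None:                  # not requested: keep the raw inner text
        end = find_end(text, start.end(), name)
        elem.value = text[start.end():end.start()]
        return elem, end.end()
    pos = start.end()
    nxt = TAG.search(text, pos)
    if nxt is None:
        raise ValueError(f"no end tag for <{name}>")
    if nxt.group(1):                   # end tag next: plain value or NULL
        if nxt.group(2) != name:
            raise ValueError(f"unexpected </{nxt.group(2)}>")
        elem.value = text[pos:nxt.start()].strip() or None
        return elem, nxt.end()
    table = {}                         # start tag next: build an index table
    elem.value = table
    wanted = {s.tag_name: s for s in instr.son}
    while True:
        nxt = TAG.search(text, pos)
        if nxt is None:
            raise ValueError(f"no end tag for <{name}>")
        if nxt.group(1):
            if nxt.group(2) != name:
                raise ValueError(f"unexpected </{nxt.group(2)}>")
            return elem, nxt.end()
        son, pos = parse_element(text, nxt, wanted.get(nxt.group(2)))
        table.setdefault(son.name, []).append(son)


def parse(text, root_instr):
    """Check the root tag against the instruction tree, then parse."""
    start = TAG.search(text)
    if start is None or start.group(1) or start.group(2) != root_instr.tag_name:
        raise ValueError("root tag does not match the instruction")
    return parse_element(text, start, root_instr)[0]


# Hypothetical document; the camera fragment is the example from the text.
DOC = ('<objectslist><book isbn="1">XML</book>'
       '<camera><manufacturer>Kodak</manufacturer><model>DX6490</model></camera>'
       '<computer id="c1"><configuration>8GB</configuration><price>900</price></computer>'
       '</objectslist>')
INSTRUCTION = UserInstruction("objectslist", son=[
    UserInstruction("book", True),
    UserInstruction("computer", True, [UserInstruction("configuration")]),
])
root = parse(DOC, INSTRUCTION)
```

In the resulting tree, `camera` was not requested, so its Value is the raw string between its start and end tags, exactly as in the "camera" example above, while `computer` was requested and therefore holds an index table with entries for `configuration` and `price`.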
- FIG. 10 shows a schematic structure block diagram of the parsing apparatus according to the present invention.
- the parsing apparatus 900 comprises a parsing instruction tree generation unit 901 and a document parsing unit 902.
- the customized document parsing requirement is inputted into the parsing instruction tree generation unit 901.
- the parsing instruction tree generation unit 901 performs the instruction tree generation method as described in conjunction with Fig.6, creates a customized parsing instruction tree, and outputs to the document parsing unit 902.
- a document to be parsed 903 is also inputted to the document parsing unit 902.
- the document parsing unit 902 performs the memory tree generation method as described in conjunction with Figs. 7 and 8 based on the customized parsing instructions so as to carry out the customized parsing of the inputted document 903.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Document Processing Apparatus (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention relates to a novel XML parser, SpeedXMLParser, and to a parsing method implemented by said parser. The parser performs customized parsing on the XML document according to the parameter instruction provided in the XML document parsing method ParseInstruction. The parser generates a UserInstruction tree from the parameter instruction, then performs the customized parsing process on the XML document according to the UserInstruction tree and creates an in-memory tree structure, SpeedXMLParser (similar to a DOM), to facilitate processing of the XML document data, such as random access to the data and searching, modifying, and deleting the data. The invention offers technical advantages such as a significant increase in XML document parsing performance and hence an improvement in the data processing performance of the computer.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB2006100925633A CN100458786C (zh) | 2006-06-15 | 2006-06-15 | Method and device for application-customized parsing of XML documents |
CN200610092563.3 | 2006-06-15 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2007144853A2 true WO2007144853A2 (fr) | 2007-12-21 |
WO2007144853A3 WO2007144853A3 (fr) | 2008-03-06 |
Family
ID=37609520
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/IB2007/052306 WO2007144853A2 (fr) | 2006-06-15 | 2007-06-15 | Method and device for performing customized parsing of an XML document according to an application |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN100458786C (fr) |
WO (1) | WO2007144853A2 (fr) |
Families Citing this family (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
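CN101650670B (zh) | 2008-08-14 | 2013-01-09 | 鸿富锦精密工业(深圳)有限公司 | Electronic system capable of sharing application configuration parameters and method thereof |
CN101650733B (zh) * | 2009-07-31 | 2012-10-31 | 金蝶软件(中国)有限公司 | Single sign-on system and method and device for importing its personalized data |
CN101739462B (zh) * | 2009-12-31 | 2012-11-28 | 中兴通讯股份有限公司 | Extensible Markup Language encoding method, decoding method and client |
CN103049536A (zh) * | 2012-11-01 | 2013-04-17 | 广州汇讯营销咨询有限公司 | Method and system for extracting the main text content of a web page |
CN104424334A (zh) * | 2013-09-11 | 2015-03-18 | 方正信息产业控股有限公司 | Method and device for constructing XML document nodes |
CN104753891B (zh) * | 2013-12-31 | 2019-04-05 | 中国移动通信集团湖南有限公司 | XML message parsing method and device |
CN108140026B (zh) * | 2015-05-20 | 2022-11-18 | 电子湾有限公司 | Multi-faceted entity identification in search |
CN105868257A (zh) * | 2015-12-28 | 2016-08-17 | 乐视网信息技术(北京)股份有限公司 | XML data parsing method, generation method and processing system |
CN106372042B (zh) * | 2016-08-31 | 2019-09-24 | 北京奇艺世纪科技有限公司 | Document content acquisition method and device |
CN106407679B (zh) * | 2016-09-13 | 2019-03-26 | 上海市徐汇区中心医院 | Mobile-internet cross-platform, cross-device remote diagnosis and treatment system |
CN108076010B (zh) * | 2016-11-10 | 2020-09-08 | 中国移动通信集团广东有限公司 | XML message parsing method and server |
CN108399084B (zh) * | 2017-02-08 | 2021-02-12 | 中科创达软件股份有限公司 | Method and system for running an application program |
CN108427676A (zh) * | 2017-02-13 | 2018-08-21 | 北京新云胜科技有限公司 | Method for fast locating and processing of XML tags |
CN110765163B (zh) * | 2019-10-17 | 2020-07-14 | 广州商品清算中心股份有限公司 | Execution plan generation method for a big data processing flow |
CN112148298A (zh) * | 2020-09-11 | 2020-12-29 | 杭州安恒信息技术股份有限公司 | HTML data parsing method and device, computer equipment and storage medium |
CN112182310B (zh) * | 2020-11-04 | 2023-11-17 | 上海德拓信息技术股份有限公司 | Method for implementing a general tree component with built-in real-time search |
CN113347196A (zh) * | 2021-06-21 | 2021-09-03 | 浙江理工大学 | Parsing method and device for parsing network data, electronic equipment and storage medium |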
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6763499B1 (en) * | 1999-07-26 | 2004-07-13 | Microsoft Corporation | Methods and apparatus for parsing extensible markup language (XML) data streams |
US7191186B1 (en) * | 2002-11-27 | 2007-03-13 | Microsoft Corporation | Method and computer-readable medium for importing and exporting hierarchically structured data |
CN1667610A (zh) * | 2005-03-24 | 2005-09-14 | 北京北方烽火科技有限公司 | Tag-based fast XML decoding method |
- 2006-06-15: CN CNB2006100925633A, patent CN100458786C (zh), not active (Expired - Fee Related)
- 2007-06-15: WO PCT/IB2007/052306, patent WO2007144853A2 (fr), active (Application Filing)
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2005064489A1 (fr) * | 2003-12-26 | 2005-07-14 | Electronics And Telecommunications Research Institute | XML processor and XML processing method in a system having an XML processor |
Non-Patent Citations (1)
Title |
---|
INT BUSINESS MACHINES CORP vol. 450, no. 89, 10 October 2001, * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
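CN110795915A (zh) * | 2018-07-31 | 2020-02-14 | 中兴通讯股份有限公司 | XML file batch modification method, system, device and computer-readable storage medium |
CN111881696A (zh) * | 2020-07-31 | 2020-11-03 | 兰州大学 | System and method for converting CML to chemical Braille |
CN111881696B (zh) * | 2020-07-31 | 2024-02-23 | 兰州大学 | System and method for converting CML to chemical Braille |
CN113591454A (zh) * | 2021-07-30 | 2021-11-02 | 中国银行股份有限公司 | Text parsing method and device |
CN117275651A (zh) * | 2023-09-01 | 2023-12-22 | 北京华益精点生物技术有限公司 | Medical report generation method and device, and electronic equipment |
CN116976286A (zh) * | 2023-09-22 | 2023-10-31 | 北京紫光芯能科技有限公司 | Method and device for text layout, electronic equipment, storage medium |
CN116976286B (zh) * | 2023-09-22 | 2024-02-27 | 北京紫光芯能科技有限公司 | Method and device for text layout, electronic equipment, storage medium |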
Also Published As
Publication number | Publication date |
---|---|
CN1896992A (zh) | 2007-01-17 |
WO2007144853A3 (fr) | 2008-03-06 |
CN100458786C (zh) | 2009-02-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2007144853A2 (fr) | | Method and device for performing customized parsing of an XML document according to an application |
JP6922538B2 (ja) | | API learning |
US9721017B2 (en) | Search and navigation to specific document content | |
KR100977352B1 (ko) | | System and method for supporting non-native XML in native XML of a word processor document |
US7366973B2 (en) | Item, relation, attribute: the IRA object model | |
US20070136698A1 (en) | Method, system and apparatus for a parser for use in the processing of structured documents | |
US7120869B2 (en) | Enhanced mechanism for automatically generating a transformation document | |
CN111209004A (zh) | | Code conversion method and device |
US7941417B2 (en) | Processing structured electronic document streams using look-ahead automata | |
US20080208830A1 (en) | Automated transformation of structured and unstructured content | |
CN109522018A (zh) | | Page processing method, device and storage medium |
KR20070086019A (ko) | | Form-related data reduction |
US7457812B2 (en) | System and method for managing structured document | |
US9311058B2 (en) | Jabba language | |
Szeredi et al. | The semantic web explained: The technology and mathematics behind web 3.0 | |
CN110851136A (zh) | | Data acquisition method and device, electronic equipment and storage medium |
Tekli et al. | Approximate XML structure validation based on document–grammar tree similarity | |
Käbisch et al. | Standardized and efficient RDF encoding for constrained embedded networks | |
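JP2005135199A (ja) | | Automaton creation method, XML data search method, XML data search device, XML data search program, and recording medium for the XML data search program |
KR101802051B1 (ko) | | Natural language processing schema and method and system for constructing its knowledge database |
CN104778232A (zh) | | Method and device for optimizing search results based on long queries |
CN116467047A (zh) | | Detection method and device for container configuration compliance, storage medium and terminal |
CN111046636A (zh) | | Method and device for screening PDF file information, computer equipment and storage medium |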
Sakamoto et al. | Extracting partial structures from HTML documents | |
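KR102407941B1 (ko) | | Method by which an electronic device that calls a function or procedure of an external device based on RPC generates a user interface, computer program thereof, and electronic device thereof |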
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | NENP | Non-entry into the national phase | Ref country code: RU |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 07789704; Country of ref document: EP; Kind code of ref document: A2 |