CN105786921A - Data module conversion method and device for non-structured document - Google Patents


Info

Publication number
CN105786921A
Authority
CN
China
Prior art keywords
structured document
label
data
data module
dmrl
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410829893.0A
Other languages
Chinese (zh)
Other versions
CN105786921B (en)
Inventor
刘剑
梁伟杰
连光耀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Aerospace Measurement and Control Technology Co Ltd
Original Assignee
Beijing Aerospace Measurement and Control Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date


Abstract

The invention discloses a data module conversion method and device for a non-structured document. The method comprises the following steps: selecting a non-structured document to be converted; pre-labeling the non-structured document to be converted and determining the conversion data object classification; generating, according to the conversion data object classification, a data module requirements list (DMRL) that conforms to the interactive electronic technical manual (IETM) standard; and converting the non-structured document to be converted into multiple data modules according to the DMRL. By inserting preset labels into the non-structured document, the invention extracts the content of each category in the document and converts it into multiple data modules, which increases IETM authoring efficiency and reduces the workload of writing an IETM manually.

Description

Data module conversion method and device for a non-structured document
Technical field
The present invention relates to the technical field of information data conversion, and in particular to a data module conversion method and device for a non-structured document.
Background
In the after-sales service, customer service and equipment support of large equipment such as aircraft, railways and ships, a large number of equipment maintenance documents are used. In use, these technical documents occupy a large amount of storage space, are difficult to carry, manage and search, are inconvenient to use, and have low usage efficiency. A reasonable way to solve these problems is to adopt a data organization model and a data description method that reorganize and reuse the data, integrate data in various forms (such as text, video, audio and three-dimensional data), and present the maintenance, servicing and fault diagnosis of the equipment in an integrated way. The technology used for this purpose is the interactive electronic technical manual (Interactive Electronic Technical Manual, IETM for short).
An IETM is applied to the digitization, standardization and integrated management of technical data, and to operating instructions, fault repair, training and examination, and maintenance record management for equipment. It improves the efficiency of equipment fault diagnosis while reducing support costs.
The IETM is an important tool for equipment support. However, because the IETM is relatively new, IETM production is often not scheduled during the equipment development stage, so the IETM has to be written again after development is complete. This entails considerable manpower and workload, and may also lead to inconsistent data. How to extract data from the large amount of unstructured data (such as WORD files) in the source documents of the equipment and generate the corresponding IETM data module content therefore affects the IETM production process.
Summary of the invention
In view of the above technical problem, the invention provides a data module conversion method and device for a non-structured document.
To solve the above technical problem, the invention adopts the following technical solutions.
The invention provides a data module conversion method for a non-structured document, comprising: selecting a non-structured document to be converted; pre-labeling the non-structured document to be converted and determining the conversion data object classification; generating, according to the conversion data object classification, a data module requirements list (DMRL) that conforms to the interactive electronic technical manual (IETM) standard; and converting the non-structured document to be converted into multiple data modules according to the DMRL.
Pre-labeling the non-structured document to be converted and determining the conversion data object classification comprises: inserting labels of preset types at the corresponding positions in the non-structured document to be converted, so that each label corresponds to a data module of the respective type.
The label includes one of the following: <system>, <descript>, <proced>, <fault>, <process>. After the label is inserted at the corresponding position in the non-structured document to be converted, the label includes a node type, a node name and node content.
Generating the DMRL that conforms to the IETM standard according to the conversion data object classification comprises: obtaining the node type and node name of each <system> label and configuring the corresponding SNS code; obtaining the node type and node name of each <descript>, <fault> and <process> label and configuring the corresponding type code for each; and, based on the correspondence between node types and data modules, automatically generating, in the DMRL data format, a DMRL containing the node type, node name and code of each label.
Converting the non-structured document to be converted into multiple data modules according to the DMRL comprises: extracting node content from the non-structured document according to the generated DMRL; and converting each extracted node content into a data module of the corresponding data format according to the IETM data format standard.
The method further comprises: searching the non-structured document for sensitive words and synonyms according to preset sensitive word rules and synonym rules, and inserting sensitive word labels at the positions of the sensitive words and synonyms. Converting the non-structured document to be converted into multiple data modules according to the DMRL then comprises: identifying the sensitive word labels and converting the node content of each sensitive word label into a data module of the corresponding data format.
The invention also provides a data module conversion device for a non-structured document, comprising: a selection module, configured to select a non-structured document to be converted; a processing module, configured to pre-label the non-structured document to be converted and determine the conversion data object classification; a generation module, configured to generate, according to the conversion data object classification, a data module requirements list (DMRL) that conforms to the interactive electronic technical manual (IETM) standard; and a conversion module, configured to convert the non-structured document to be converted into multiple data modules according to the DMRL.
The processing module is configured to insert labels of preset types at the corresponding positions in the non-structured document to be converted, so that each label corresponds to a data module of the respective type.
The label includes one of the following: <system>, <descript>, <proced>, <fault>, <process>. After the processing module inserts the label at the corresponding position in the non-structured document to be converted, the label includes a node type, a node name and node content.
The generation module is configured to: obtain the node type and node name of each <system> label and configure the corresponding SNS code; obtain the node type and node name of each <descript>, <fault> and <process> label and configure the corresponding type code for each; and, based on the correspondence between node types and data modules, automatically generate, in the DMRL data format, a DMRL containing the node type, node name and code of each label.
The conversion module is configured to: extract node content from the non-structured document according to the generated DMRL; and convert each extracted node content into a data module of the corresponding data format according to the IETM data format standard.
The beneficial effects of the invention are as follows:
By inserting preset labels into the non-structured document, the invention extracts the content of each category in the document and converts it into multiple data modules. The invention improves IETM authoring efficiency and reduces the workload of writing an IETM manually.
Brief description of the drawings
Fig. 1 is a flowchart of the data module conversion method for a non-structured document according to an embodiment of the invention;
Fig. 2 is a flowchart of the steps for generating the DMRL according to an embodiment of the invention;
Fig. 3 is a flowchart of the data module conversion steps according to an embodiment of the invention;
Fig. 4 is a structural diagram of the data module conversion device for a non-structured document according to an embodiment of the invention.
Detailed description
The invention provides a data module conversion method and device for a non-structured document. The invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it.
Fig. 1 is a flowchart of the data module conversion method for a non-structured document according to an embodiment of the invention.
Step S110: select the non-structured document to be converted.
The non-structured document to be converted is a source document of the equipment, namely an equipment maintenance document. The equipment maintenance document needs to be converted into data modules that conform to the IETM standard.
Step S120: pre-label the non-structured document to be converted and determine the conversion data object classification.
Pre-labeling means describing the non-structured document with labels, that is, inserting preset labels at the corresponding positions in the document. A label identifies a node in the non-structured document and delimits content of a particular category. For example, labels mark off the subject content, descriptive content, fault content and process content in the document. A subject is a main object described in the document. For example, if chapter 1 of an article describes a host and chapter 2 describes a display, the subjects are the host and the display respectively.
The conversion data object classification refers to the data content in the non-structured document that corresponds to the different data modules, that is, the document after the preset labels have been inserted and the content of each category has been marked off.
Step S130: according to the determined conversion data object classification, generate a data module requirements list (Data Module Requirements List, DMRL for short) that conforms to the IETM standard.
The DMRL contains the basic information of the data modules required by the user. This basic information includes the node type, node name and code of each label of the required data modules; see the description of Fig. 2 for details.
Step S140: according to the generated DMRL, convert the non-structured document to be converted into multiple data modules.
The data modules are divided into different types, for example subject data modules, descriptive data modules, fault data modules and process data modules. Each data module can be regarded as the result of converting the content of one category in the non-structured document.
Step S120 is described in detail as follows.
Labels of preset types are inserted at the corresponding positions in the non-structured document to be converted, forming multiple nodes (labels), so that the node type of each node corresponds to a data module of the respective type.
Non-structured documents include formats such as WORD, PDF and CAJ. In this embodiment, the non-structured document is preferably in WORD format.
The labels inserted into the non-structured document include, but are not limited to: <system>, <descript>, <proced>, <fault>, <process>. Here <system> is the subject label, <descript> is the descriptive label, <fault> is the fault label and <process> is the process label.
Labels need to be defined in advance. Specifically, labels can be defined according to the data modules so that each label maps to a data module; the data content of the non-structured document is then naturally mapped to the corresponding data module. Each label includes a node type and, depending on where the label is inserted, a node name and node content. The node type is set according to the type of the data module, the node name is set according to the document content at the position where the label is inserted, and the node content is the document content itself. The node contents of different labels may correspond to the same data module or to different data modules.
For example: <system> is the subject label; it is inserted where a subject (such as an equipment cabinet) appears and corresponds to a subject data module. <descript> is the descriptive label; it is inserted where the equipment description appears and corresponds to a descriptive data module. <fault> is the fault label; it is inserted where the equipment fault description appears and corresponds to a fault data module. <process> is the process label; it is inserted where the equipment usage process appears and corresponds to a process data module.
As an example of how labels are inserted, take the following excerpt of text from a non-structured document:
……
1.3.6 Safety
The driver task terminal has a self-destruct function, which ensures the safety of important information.
……
Labels can be set for the above text as follows:
<system name="overall">
<descript>
1.3.6 Safety
The driver task terminal has a self-destruct function, which ensures the safety of important information.
</descript>
</system>
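The label insertion in the example above can be sketched in code. The following is a minimal Python sketch under stated assumptions, not the implementation described by the invention: the helper name `wrap_with_label` and the way the `name` attribute is written are hypothetical and chosen only to reproduce the nesting of the example.

```python
from typing import Optional


def wrap_with_label(text: str, label: str, name: Optional[str] = None) -> str:
    """Wrap a block of document text in an opening and a closing label.

    `label` is one of the preset label types (for example "system" or
    "descript"); `name` becomes the node name attribute when it is given.
    """
    attr = f' name="{name}"' if name else ""
    return f"<{label}{attr}>\n{text}\n</{label}>"


# Hypothetical usage that reproduces the nesting of the example above.
section = (
    "1.3.6 Safety\n"
    "The driver task terminal has a self-destruct function."
)
labeled = wrap_with_label(
    wrap_with_label(section, "descript"), "system", name="overall"
)
print(labeled)
```

The inner call produces the second-level <descript> node and the outer call the first-level <system> node, so the node content of the outer label contains the inner label.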
As another example, a non-structured document after label insertion reads:
<system name="software">
4 Software subsystem design
<system name="technical information acquisition subsystem">
4.1 Technical information acquisition subsystem (information system transformation and upgrade)
<descript name="system structure">
4.1.1 System structure
<Para>
The information system adopts a C/S architecture to collect technical information during development, production, finalization and other stages, including parameters, pictures, design drawings, models of various kinds, design documents and technical manuals that supply technical data for support, training, maintenance, spare parts provisioning and so on.
</Para>
</descript>
</system>
As a further example, suppose chapter 1 and chapter 2 of an article describe different subjects, and each chapter contains section 1 (equipment description), section 2 (equipment operation) and section 3 (equipment faults). Labels are then inserted as follows: <system> for chapter 1 and <system> for chapter 2; within chapter 1, <descript> for section 1, <process> for section 2 and <fault> for section 3; within chapter 2, likewise <descript> for section 1, <process> for section 2 and <fault> for section 3.
Inserting labels at the corresponding positions turns the non-structured document into descriptive information content and achieves the purpose of splitting the document. The description format may follow the S1000D standard for IETMs, with suitable extensions.
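To make the notion of a node (node type, node name, node content) concrete, the labeled text can be read back into a tree. The following Python sketch is only an illustration under assumptions: the `Node` class, the `parse_labeled` function and the regular expression are hypothetical, and the sketch handles only the five label names listed in this embodiment with an optional `name` attribute.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

LABELS = ("system", "descript", "proced", "fault", "process")
TAG_RE = re.compile(r'<(/?)(%s)(?:\s+name="([^"]*)")?\s*>' % "|".join(LABELS))


@dataclass
class Node:
    node_type: str                      # label name, e.g. "system"
    name: Optional[str]                 # node name from the name attribute
    content: str = ""                   # text directly under this label
    children: List["Node"] = field(default_factory=list)


def parse_labeled(text: str) -> List[Node]:
    """Build a tree of nodes from text that contains the preset labels."""
    roots: List[Node] = []
    stack: List[Node] = []
    pos = 0
    for match in TAG_RE.finditer(text):
        if stack:
            # Text between two labels belongs to the innermost open node.
            stack[-1].content += text[pos:match.start()]
        pos = match.end()
        closing, label, name = match.group(1), match.group(2), match.group(3)
        if closing:
            if not stack or stack[-1].node_type != label:
                raise ValueError(f"unbalanced </{label}>")
            node = stack.pop()
            node.content = node.content.strip()
        else:
            node = Node(label, name)
            (stack[-1].children if stack else roots).append(node)
            stack.append(node)
    if stack:
        raise ValueError(f"unclosed <{stack[-1].node_type}>")
    return roots


doc = (
    '<system name="host">\n'
    '<descript name="overview">\nThe host is the main unit.\n</descript>\n'
    '<fault>\nNo power.\n</fault>\n'
    '</system>'
)
roots = parse_labeled(doc)
```

Labels that are not in `LABELS`, such as <Para>, are left in the node content untouched, and unbalanced labels raise an error rather than being repaired silently.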
If the non-structured document is in WORD format, it is accessed through the provided office document data access plug-in.
Step S130 is described in detail as follows.
Fig. 2 shows the flowchart of the specific steps for generating the DMRL.
Step S210: in the non-structured document, obtain the node type and node name of each <system> label and configure the corresponding system breakdown code (Standard Numbering System, SNS for short).
The SNS code is a code, based on a standardized coding scheme, that identifies the equipment and its hierarchical breakdown. In other words, the non-structured document is converted into an SNS structure. <system> is a first-level label.
Step S220: in the non-structured document, obtain the node type and node name of each <descript>, <fault> and <process> label and configure the corresponding type code for each.
<descript>, <fault> and <process> are second-level labels under the corresponding <system> label.
Step S230: based on the correspondence between node types and data modules, automatically generate, in the DMRL data format, a DMRL containing the node type, node name and code of each label.
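Steps S210 to S230 can be sketched as follows. This is a minimal Python sketch under stated assumptions rather than the actual DMRL data format: the `DmrlEntry` record, the sequential two-digit SNS codes and the type code values in `TYPE_CODES` are all hypothetical placeholders, since the coding values are not given in this description.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical type codes for the second-level labels (placeholders only).
TYPE_CODES: Dict[str, str] = {"descript": "D01", "fault": "F01", "process": "P01"}


@dataclass
class DmrlEntry:
    node_type: str
    node_name: str
    sns_code: str
    type_code: str  # empty for first-level <system> entries


def build_dmrl(labels: List[Tuple[str, str]]) -> List[DmrlEntry]:
    """Build DMRL entries from (node_type, node_name) pairs in document order."""
    entries: List[DmrlEntry] = []
    sns_counter = 0
    current_sns = ""
    for node_type, node_name in labels:
        if node_type == "system":
            # S210: each <system> label receives its own SNS code.
            sns_counter += 1
            current_sns = f"{sns_counter:02d}"
            entries.append(DmrlEntry(node_type, node_name, current_sns, ""))
        elif node_type in TYPE_CODES:
            # S220: second-level labels receive a type code and inherit
            # the SNS code of the enclosing <system> label.
            if not current_sns:
                raise ValueError("second-level label outside any <system>")
            entries.append(
                DmrlEntry(node_type, node_name, current_sns, TYPE_CODES[node_type])
            )
    return entries  # S230: the list of entries stands in for the DMRL


dmrl = build_dmrl([
    ("system", "host"),
    ("descript", "host description"),
    ("process", "host operation"),
    ("fault", "host faults"),
    ("system", "display"),
    ("descript", "display description"),
])
```

For simplicity the sketch treats the document as a flat sequence of labels; nested <system> labels, as in the software subsystem example, would need a hierarchical SNS code instead of a counter.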
Further, the data module codes are corrected while the DMRL is generated. Each data module has a unique data module code, which contains an initial SNS code and an initial type code (IC code). After SNS codes have been configured for the <system> labels and type codes for the <descript>, <fault> and <process> labels, the configured SNS code and type code may differ from the initial SNS code and initial type code. In that case, the initial SNS code and initial type code in the data module code need to be replaced with the corresponding configured SNS code and type code.
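The correction of a data module code can be illustrated with a short Python sketch. The two-field `DataModuleCode` record and the `correct_code` function are hypothetical simplifications; a real data module code has more components than the SNS code and type code shown here.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DataModuleCode:
    sns_code: str   # SNS part of the data module code
    type_code: str  # type code (IC code) part of the data module code


def correct_code(
    initial: DataModuleCode, configured_sns: str, configured_type: str
) -> DataModuleCode:
    """Replace the initial codes when they differ from the configured ones."""
    if (initial.sns_code, initial.type_code) == (configured_sns, configured_type):
        return initial  # nothing to correct
    return replace(initial, sns_code=configured_sns, type_code=configured_type)


# Hypothetical values: the initial code carries placeholder parts.
initial = DataModuleCode(sns_code="00", type_code="000")
corrected = correct_code(initial, "01", "D01")
```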
Step S140 is described in detail as follows.
According to the DMRL, the content under each label (the node content) in the non-structured document is extracted, the extracted content under each label is converted into a data module of the corresponding type that conforms to the IETM data format standard, and the data modules are managed in the form of a data module list.
Fig. 3 is a flowchart of the data module conversion steps according to an embodiment of the invention.
Step S310: according to the generated DMRL, extract node content from the non-structured document.
Node content is split off according to the <descript>, <fault> and <process> labels. In other words, the unit of splitting is the data module.
For example, suppose chapter 1 and chapter 2 of an article describe different subjects, and each chapter contains section 1 (equipment description), section 2 (equipment operation) and section 3 (equipment faults). When the content is split, the following are extracted according to the labels: node content A of chapter 1; node content B of chapter 2; node content C of section 1, node content D of section 2 and node content E of section 3 of chapter 1; and node content F of section 1, node content G of section 2 and node content H of section 3 of chapter 2.
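The order A to H in the example above is consistent with visiting the label tree level by level: first the chapters, then the sections of each chapter. The description does not state which traversal is used, so the following Python sketch is an assumption; the dictionary layout and the function name `extraction_order` are hypothetical.

```python
from collections import deque
from typing import Dict, List


def extraction_order(roots: List[Dict]) -> List[str]:
    """Return node contents level by level (breadth-first traversal)."""
    order: List[str] = []
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        order.append(node["content"])
        queue.extend(node.get("children", []))
    return order


chapters = [
    {"content": "A", "children": [{"content": "C"}, {"content": "D"}, {"content": "E"}]},
    {"content": "B", "children": [{"content": "F"}, {"content": "G"}, {"content": "H"}]},
]
order = extraction_order(chapters)
```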
Step S320: convert each extracted node content into a data module of the corresponding data format according to the IETM data format standard.
For example, text content is converted using the <para> data format, figures are converted using the <graphic> data format, and tables are converted using the <table> format.
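The mapping from content kind to data format can be sketched with the Python standard library. Only the element names `para`, `graphic` and `table` come from this description; the `src` attribute and the `row` and `entry` child elements are assumptions made for illustration and are not taken from the IETM data format standard.

```python
import xml.etree.ElementTree as ET
from typing import List, Union


def to_element(kind: str, payload: Union[str, List[List[str]]]) -> ET.Element:
    """Convert one piece of node content into an XML element by its kind."""
    if kind == "text":
        element = ET.Element("para")
        element.text = payload
    elif kind == "figure":
        # The attribute name "src" is hypothetical.
        element = ET.Element("graphic", {"src": payload})
    elif kind == "table":
        element = ET.Element("table")
        for row in payload:
            row_element = ET.SubElement(element, "row")      # hypothetical name
            for cell in row:
                ET.SubElement(row_element, "entry").text = cell  # hypothetical name
    else:
        raise ValueError(f"unsupported content kind: {kind}")
    return element


para = to_element("text", "The host is the main unit.")
figure = to_element("figure", "fig1.png")
table = to_element("table", [["a", "b"], ["c", "d"]])
print(ET.tostring(para, encoding="unicode"))
```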
Step S330: manage the resulting data modules in a data module list.
The data module list differs from the data module requirements list (DMRL) in that the data module list contains the basic information of all data modules, whereas the DMRL contains only the basic information of the data modules required by the user, such as the node type, node name and corresponding code of each label under a data module.
Step S320 is described in further detail as follows.
While the non-structured document is being converted into data modules, it can be labeled further in order to increase the accuracy of the data modules.
According to preset sensitive word rules and synonym rules, the non-structured document is searched for sensitive words and synonyms, and sensitive word labels are inserted at their positions. During conversion, the sensitive word labels are identified and the node content of each sensitive word label is converted into a data module of the corresponding data format; in other words, content containing sensitive words and synonyms is converted into data modules of the corresponding data format.
Specifically, the selected non-structured document is parsed and its purpose analyzed in order to determine the sensitive words in the document and the synonyms corresponding to each sensitive word, and the sensitive word rules and synonym rules are established. Table 1 shows an example of sensitive word rules and Table 2 an example of synonym rules; neither rule set is limited to the content of its table.
Table 1 Sensitive word rules
Document name | Sensitive word | Format | Purpose
Driver operation instructions.doc | Safety | <security>safety</security> |
Driver operation instructions.doc | Task terminal | <endItem>task terminal</endItem> |
Table 2 Synonym rules
Sensitive word | Synonym 1 | Synonym 2 | Synonym X
Safety | Safety | Security feature |
Task terminal | Terminal | Use terminal |
The sensitive word rules contain multiple sensitive words, the name of the non-structured document in which each sensitive word occurs, the format in which a sensitive word label is added for each sensitive word, and so on.
The synonym rules contain one or more synonyms for each sensitive word. When a sensitive word label is added for a synonym, the same format is used as for the sensitive word to which the synonym corresponds.
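The use of the two rule sets can be sketched as follows. This is a minimal Python sketch, assuming the rules are held in plain dictionaries; the names `SENSITIVE_RULES`, `SYNONYM_RULES` and `tag_sensitive_words` are hypothetical, and the entries are loosely based on Tables 1 and 2.

```python
import re
from typing import Dict, List

# Sensitive word -> label name used to mark it (based on Table 1).
SENSITIVE_RULES: Dict[str, str] = {"safety": "security", "task terminal": "endItem"}
# Sensitive word -> synonyms that receive the same label (based on Table 2).
SYNONYM_RULES: Dict[str, List[str]] = {
    "safety": ["security feature"],
    "task terminal": ["terminal", "use terminal"],
}


def build_term_map() -> Dict[str, str]:
    """Map every sensitive word and synonym to the label of its sensitive word."""
    term_to_tag: Dict[str, str] = {}
    for word, tag in SENSITIVE_RULES.items():
        term_to_tag[word] = tag
        for synonym in SYNONYM_RULES.get(word, []):
            term_to_tag[synonym] = tag
    return term_to_tag


def tag_sensitive_words(text: str) -> str:
    """Wrap each sensitive word or synonym found in the text in its label."""
    term_to_tag = build_term_map()
    # Longer terms first, so "task terminal" wins over "terminal".
    terms = sorted(term_to_tag, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b", re.IGNORECASE
    )

    def wrap(match: "re.Match[str]") -> str:
        term = match.group(1)
        tag = term_to_tag[term.lower()]
        return f"<{tag}>{term}</{tag}>"

    return pattern.sub(wrap, text)


tagged = tag_sensitive_words("The task terminal ensures the safety of information.")
```

Because the substitution runs in a single pass, text that has already been wrapped is not matched again, so labels are never nested inside each other.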
In one embodiment, the sensitive word rules can also be set so that sensitive word labels are added for sensitive words within a preset range. To this end, associated words and metadata are added to the sensitive word rules. An associated word is a relational expression that limits the range of the sensitive word, such as "greater than" or "less than"; metadata describes the value range of the sensitive word, such as MPa or min. For example, in "the air pressure is greater than 5 MPa", "air pressure" is the sensitive word, "greater than" is the associated word and "MPa" is the metadata.
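A range rule of this kind can be approximated with a regular expression. The following Python sketch is an assumption about how such a rule might be matched, not the rule format of the invention; `RangeMatch`, `RANGE_RE` and `find_range` are hypothetical names, and the pattern covers only the sensitive word, associated words and units mentioned in the example.

```python
import re
from typing import NamedTuple, Optional


class RangeMatch(NamedTuple):
    sensitive_word: str
    relation: str   # the associated word, e.g. "greater than"
    value: float
    unit: str       # the metadata, e.g. "MPa"


RANGE_RE = re.compile(
    r"(?P<word>air pressure)\s+(?:is\s+)?"
    r"(?P<rel>greater than|less than)\s+"
    r"(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>MPa|min)\b"
)


def find_range(text: str) -> Optional[RangeMatch]:
    """Return the first sensitive word range found in the text, if any."""
    match = RANGE_RE.search(text)
    if match is None:
        return None
    return RangeMatch(
        match["word"], match["rel"], float(match["value"]), match["unit"]
    )


found = find_range("Check that the air pressure is greater than 5MPa.")
```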
Further, the preset sensitive words and synonyms may or may not have a definition in the IETM standard. For sensitive words and synonyms that have no definition, the IETM standard needs to be extended so that a corresponding description exists. For example, if "air pressure" has no definition in the IETM standard, the IETM data content needs to be redefined so that "air pressure" has one.
By labeling and splitting the non-structured document, defining sensitive words and synonyms, generating the DMRL and performing the IETM conversion, this embodiment provides an initial solution to problems such as the heavy workload and inconsistent data that arise when an IETM is generated from non-structured documents.
The invention aims to convert non-structured data such as equipment maintenance documents into the maintenance support data modules of an IETM (such as descriptive data modules, procedural data modules, fault data modules and maintenance planning data modules), thereby improving IETM authoring efficiency and reducing the workload of writing an IETM manually. Specifically: a non-structured document (WORD document) to be analyzed is selected, the selected document content is given preliminary data labels, and the conversion data objects are classified by means of the labels; the text data in the document is examined in depth, the purpose of the data is analyzed, and the metadata, sensitive word and synonym rules for the data are defined; once the conversion data object classification has been determined, the DMRL (data module requirements list) required by the IETM is generated; according to the DMRL, the analysis process data and result data are converted into the data format that conforms to the IETM data format standard and are managed in the form of a data module list.
The invention also provides a data module conversion device for a non-structured document, as shown in Fig. 4.
The selection module 410 is configured to select a non-structured document to be converted.
The processing module 420 is configured to pre-label the non-structured document to be converted and determine the conversion data object classification. Further, the processing module 420 is configured to insert labels of preset types at the corresponding positions in the non-structured document to be converted, so that each label corresponds to a data module of the respective type. The label includes one of the following: <system>, <descript>, <proced>, <fault>, <process>. After the processing module 420 inserts the label at the corresponding position in the non-structured document to be converted, the label includes a node type, a node name and node content.
The generation module 430 is configured to generate, according to the conversion data object classification, a data module requirements list (DMRL) that conforms to the interactive electronic technical manual (IETM) standard. Further, the generation module 430 is configured to: obtain the node type and node name of each <system> label and configure the corresponding SNS code; obtain the node type and node name of each <descript>, <fault> and <process> label and configure the corresponding type code for each; and, based on the correspondence between node types and data modules, automatically generate, in the DMRL data format, a DMRL containing the node type, node name and code of each label.
The conversion module 440 is configured to convert the non-structured document to be converted into multiple data modules according to the DMRL. Further, the conversion module 440 is configured to: extract node content from the non-structured document according to the generated DMRL; and convert each extracted node content into a data module of the corresponding data format according to the IETM data format standard.
The functions of the device described in this embodiment have been described in the method embodiments shown in Figs. 1 to 3. For parts not detailed in the description of this embodiment, refer to the related description in the preceding embodiments; they are not repeated here.
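The cooperation of the four modules can be sketched as a simple chain. The following Python sketch only illustrates the order in which the modules 410 to 440 hand data to one another; the class name `ConversionDevice`, the callable signatures and the stand-in functions are hypothetical and do not reflect an actual implementation.

```python
from typing import Callable, List


class ConversionDevice:
    """Chains the four modules in the order shown in Fig. 4 (410 to 440)."""

    def __init__(
        self,
        select: Callable[[str], str],                  # selection module 410
        pre_label: Callable[[str], str],               # processing module 420
        generate_dmrl: Callable[[str], List[str]],     # generation module 430
        convert: Callable[[str, List[str]], List[str]],  # conversion module 440
    ) -> None:
        self.select = select
        self.pre_label = pre_label
        self.generate_dmrl = generate_dmrl
        self.convert = convert

    def run(self, source: str) -> List[str]:
        document = self.select(source)
        labeled = self.pre_label(document)
        dmrl = self.generate_dmrl(labeled)
        return self.convert(labeled, dmrl)


# Stand-in modules, purely illustrative.
device = ConversionDevice(
    select=lambda source: source.strip(),
    pre_label=lambda doc: f"<descript>{doc}</descript>",
    generate_dmrl=lambda labeled: ["descript"],
    convert=lambda labeled, dmrl: [f"{entry}: {labeled}" for entry in dmrl],
)
modules = device.run("  Host overview.  ")
```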
Although preferred embodiments of the invention have been disclosed for the purpose of illustration, those skilled in the art will recognize that various improvements, additions and substitutions are possible; the scope of the invention should therefore not be limited to the above embodiments.

Claims (10)

1. A data module conversion method for a non-structured document, characterized by comprising:
selecting a non-structured document to be converted;
pre-labeling the non-structured document to be converted and determining the conversion data object classification;
generating, according to the conversion data object classification, a data module requirements list (DMRL) that conforms to the interactive electronic technical manual (IETM) standard;
converting the non-structured document to be converted into multiple data modules according to the DMRL.
2. The method of claim 1, characterized in that pre-labeling the non-structured document to be converted and determining the conversion data object classification comprises:
inserting labels of preset types at the corresponding positions in the non-structured document to be converted, so that each label corresponds to a data module of the respective type.
3. The method of claim 2, characterized in that:
the label includes one of the following: <system>, <descript>, <proced>, <fault>, <process>;
after the label is inserted at the corresponding position in the non-structured document to be converted, the label includes a node type, a node name and node content.
4. The method of claim 3, characterized in that generating, according to the conversion data object classification, the DMRL that conforms to the IETM standard comprises:
obtaining the node type and node name of each <system> label and configuring the corresponding SNS code;
obtaining the node type and node name of each <descript>, <fault> and <process> label and configuring the corresponding type code for each;
based on the correspondence between node types and data modules, automatically generating, in the DMRL data format, a DMRL containing the node type, node name and code of each label.
5. The method of claim 1, characterized in that converting the non-structured document to be converted into multiple data modules according to the DMRL comprises:
extracting node content from the non-structured document according to the generated DMRL;
converting each extracted node content into a data module of the corresponding data format according to the IETM data format standard.
6. The method of claim 1, characterized in that the method further comprises:
searching the non-structured document for sensitive words and synonyms according to preset sensitive word rules and synonym rules, and inserting sensitive word labels at the positions of the sensitive words and synonyms;
wherein converting the non-structured document to be converted into multiple data modules according to the DMRL comprises: identifying the sensitive word labels and converting the node content of each sensitive word label into a data module of the corresponding data format.
7. A data module conversion device for a non-structured document, characterized by comprising:
a selection module, configured to select a non-structured document to be converted;
a processing module, configured to pre-label the non-structured document to be converted and to determine the conversion data object classification;
a generation module, configured to generate, according to the conversion data object classification, a data module requirements list (DMRL) conforming to the interactive electronic technical manual (IETM) standard;
a conversion module, configured to convert the non-structured document to be converted into multiple data modules according to the DMRL.
8. The device of claim 7, characterized in that the processing module is configured to:
insert labels of preset types at the corresponding positions in the non-structured document to be converted, so that each label corresponds to a data module of the respective type.
9. The device of claim 8, characterized in that
the label is one of the following: <system>, <descript>, <proced>, <fault>, <process>;
after the processing module inserts the label at the corresponding position in the non-structured document to be converted, the label comprises: a node type, a node name, and node content.
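As an illustration of how an inserted label can carry a node type, node name, and node content, the following Python sketch wraps a character range of the document in one of the five label types. The `name` attribute syntax and the `insert_label` function are assumptions made for this example; the claims do not specify how the node name is encoded.

```python
ALLOWED_LABELS = ("system", "descript", "proced", "fault", "process")


def insert_label(text, start, end, node_type, node_name):
    """Wrap text[start:end] in a label that records node type, name, and content."""
    if node_type not in ALLOWED_LABELS:
        raise ValueError(f"unsupported label type: {node_type}")
    if not 0 <= start <= end <= len(text):
        raise ValueError("label range lies outside the document")
    content = text[start:end]  # the wrapped span becomes the node content
    opening = f'<{node_type} name="{node_name}">'
    closing = f"</{node_type}>"
    return text[:start] + opening + content + closing + text[end:]
```

The tag name supplies the node type, the attribute supplies the node name, and the enclosed text is the node content.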
10. The device of claim 9, characterized in that
the generation module is configured to:
obtain the node type and node name of each label <system>, and configure the corresponding SNS code;
obtain the node type and node name of each label <descript>, <fault>, and <process>, and configure the corresponding type code for each;
automatically generate, according to the correspondence between node types and data modules and in the DMRL data format, a DMRL containing the node type, node name, and code of each label;
the conversion module is configured to:
extract node content from the non-structured document according to the generated DMRL;
convert each of the extracted node contents, according to the IETM data format standard, into a data module of the corresponding data format.
CN201410829893.0A 2014-12-26 2014-12-26 Data module conversion method and device for non-structured document Active CN105786921B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410829893.0A CN105786921B (en) 2014-12-26 2014-12-26 Data module conversion method and device for non-structured document

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410829893.0A CN105786921B (en) 2014-12-26 2014-12-26 Data module conversion method and device for non-structured document

Publications (2)

Publication Number Publication Date
CN105786921A true CN105786921A (en) 2016-07-20
CN105786921B CN105786921B (en) 2019-06-18

Family

ID=56388701

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410829893.0A Active CN105786921B (en) 2014-12-26 2014-12-26 Data module conversion method and device for non-structured document

Country Status (1)

Country Link
CN (1) CN105786921B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070011134A1 (en) * 2005-07-05 2007-01-11 Justin Langseth System and method of making unstructured data available to structured data analysis tools
CN101055578A (en) * 2006-04-12 2007-10-17 龙搜(北京)科技有限公司 File content dredger based on rule
CN102207975A (en) * 2011-06-24 2011-10-05 天津大学 Method for manufacturing and displaying extensible markup language (XML) data module based on IETM standard
CN102982027A (en) * 2011-09-02 2013-03-20 北大方正集团有限公司 Method and device for abstracting contents in document
CN103678625A (en) * 2013-12-18 2014-03-26 北京航天测控技术有限公司 Method and device for transforming interactive electronic technical manual data

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106294551A (en) * 2016-07-25 2017-01-04 中国商用飞机有限责任公司 System and comprehensive establishment management system is managed for the CIR of technical publications
CN108021632A (en) * 2017-11-23 2018-05-11 中国移动通信集团河南有限公司 Unstructured data and the mutual conversion process method of structural data
CN108021632B (en) * 2017-11-23 2020-07-07 中国移动通信集团河南有限公司 Mutual conversion processing method for unstructured data and structured data
CN110119984A (en) * 2018-02-07 2019-08-13 青岛农业大学 A kind of processing system for international trade tick financing
CN108710660A (en) * 2018-05-11 2018-10-26 上海核工程研究设计院有限公司 A kind of items property parameters modeling of database and storage method
CN110990636A (en) * 2019-12-18 2020-04-10 哈尔滨工程大学 Intelligent data module acquisition and conversion method for diesel engine interactive electronic technical manual
CN111666747A (en) * 2020-05-29 2020-09-15 中国工程物理研究院计算机应用研究所 Method for generating WORD document into description class data module conforming to S1000D standard
CN111859863A (en) * 2020-06-03 2020-10-30 远光软件股份有限公司 Document structure conversion method and device, storage medium and electronic equipment
CN112699641A (en) * 2021-03-25 2021-04-23 南京国睿信维软件有限公司 Method for quickly converting batch copy of WORD content to DM based on S1000D standard

Also Published As

Publication number Publication date
CN105786921B (en) 2019-06-18

Similar Documents

Publication Publication Date Title
CN105786921A (en) Data module conversion method and device for non-structured document
CN104408078B (en) A kind of bilingual Chinese-English parallel corpora base construction method based on keyword
CN110770735B (en) Transcoding of documents with embedded mathematical expressions
US20070033520A1 (en) System and method for web page localization
CN104391730A (en) Software source code language translation system and method
CN103914443A (en) Mixed typesetting method and device for plurilingual characters
CN103885942B (en) A kind of rapid translation device and method
CN102722479A (en) A method and device for realizing language translation
JP6090850B2 (en) Source program analysis system, source program analysis method and program
CN106776495B (en) Document logic structure reconstruction method
CN105786505A (en) Json based complex web page component self-defining method and device
CN101539910A (en) A sentence taking method for computer aided translation and system thereof
CN106372053B (en) Syntactic analysis method and device
CN110427187A (en) A kind of list visual layout method based on the parsing of HTML Custom Attributes
CN109783801B (en) Electronic device, multi-label classification method and storage medium
CN106528088A (en) Method and device for adding control in online form
CN111178088A (en) Configurable neural machine translation method oriented to XML document
JP2019032704A (en) Table data structuring system and table data structuring method
JP6952967B2 (en) Automatic translator
CN104298705A (en) Converting method of relational data and unstructured data
CN103761095A (en) Method for generating universal header data information of upgraded file
WO2013062550A1 (en) Aligning annotation of fields of documents
CN102629244B (en) Multi-language work card generating system and method
CN102609410B (en) Authority file auxiliary writing system and authority file generating method
CN111709221A (en) Document generation method and system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant